[KinoSearch] Field-specific terms vs. query filter?
Larry Leszczynski
larryl at emailplus.org
Sun Dec 2 10:03:17 PST 2007
On Sat, 1 Dec 2007, Marvin Humphrey wrote:
>> Is there any syntax to query an analyzed field for an exact match?
>> E.g. if I want to match "cream" but not "creamy"?
>
> What ends up in the index depends on the Analyzer assigned to that
> field. The choice the KS docs shunt you towards as the easiest is a
> PolyAnalyzer that's actually a series of three other analyzers.
> Here's the code from PolyAnalyzer's constructor:
>
>     # create a default set of analyzers if language was specified
>     if ( !defined $args->{analyzers} ) {
>         confess("Must specify either 'language' or 'analyzers'")
>             unless $language;
>         $args->{analyzers} = [
>             KinoSearch::Analysis::LCNormalizer->new,
>             KinoSearch::Analysis::Tokenizer->new,
>             KinoSearch::Analysis::Stemmer->new( language => $language ),
>         ];
>     }
>
> The element that reduces "creamy" to "cream" is the Stemmer. If you
> take it out of the loop, then searches for "creamy" will no longer
> return docs matching "cream" and vice versa. Whether that's desirable
> depends on your application. It's perfectly reasonable to index
> the same text twice, with and without stemming, so long as you have
> the resources to spare (you'd probably want only one of the fields
> to be "stored").
>
> Take the Tokenizer out of the loop too, and then searches for
> "creamy" will no longer match a field whose value is "Creamy
> Goodness". Take the LCNormalizer out of the loop too, and then
> there's no more analysis being performed -- so only exact matches
> will succeed.
Coolness, info much appreciated as usual!
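
For the archives, here's a rough sketch of how I understand each stage
to affect matching. It is plain Perl and does not use KinoSearch at
all: lc_normalize, tokenize, toy_stem, analyze and matches are all
made-up names, and toy_stem just strips a trailing "y" as a stand-in
for the real Stemmer, so treat it as an illustration only.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the three analysis stages. None of this is
# KinoSearch code; each stage takes and returns a list of strings.
sub lc_normalize {
    my @out;
    push @out, lc($_) for @_;
    return @out;
}

sub tokenize {
    my @out;
    push @out, ( $_ =~ /(\w+)/g ) for @_;
    return @out;
}

# Toy stemmer: strips one trailing "y". A real stemmer does far more.
sub toy_stem {
    my @out;
    for my $term (@_) {
        ( my $stem = $term ) =~ s/y\z//;
        push @out, $stem;
    }
    return @out;
}

# Run a value through a chain of stages (an array ref of code refs).
sub analyze {
    my ( $stages, $text ) = @_;
    my @terms = ($text);
    @terms = $_->(@terms) for @$stages;
    return @terms;
}

# A query matches when any analyzed query term equals an indexed term.
# The same chain is applied to the field value and to the query.
sub matches {
    my ( $stages, $field_value, $query ) = @_;
    my %indexed = map { ( $_ => 1 ) } analyze( $stages, $field_value );
    my @hits = grep { $indexed{$_} } analyze( $stages, $query );
    return scalar @hits;
}

my @full    = ( \&lc_normalize, \&tokenize, \&toy_stem );
my @no_stem = ( \&lc_normalize, \&tokenize );
my @lc_only = ( \&lc_normalize );
my @none    = ();

my $value = "Creamy Goodness";
print "full chain, 'cream':        ", matches( \@full,    $value, "cream" ),  "\n"; # 1
print "no stemmer, 'cream':        ", matches( \@no_stem, $value, "cream" ),  "\n"; # 0
print "no stemmer, 'creamy':       ", matches( \@no_stem, $value, "creamy" ), "\n"; # 1
print "no tokenizer, 'creamy':     ", matches( \@lc_only, $value, "creamy" ), "\n"; # 0
print "no analysis, exact value:   ", matches( \@none,    $value, $value ),   "\n"; # 1
```

With the toy stages, dropping the stemmer stops "cream" from matching
"Creamy Goodness", dropping the tokenizer stops "creamy" from matching
it, and with no stages left only the exact string matches.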