[KinoSearch] Field-specific terms vs. query filter?
Marvin Humphrey
marvin at rectangular.com
Sat Dec 1 13:37:18 PST 2007
On Dec 1, 2007, at 12:01 PM, Larry Leszczynski wrote:
> Is there any syntax to query an analyzed field for an exact match?
> E.g. if I want to match "cream" but not "creamy"?
What ends up in the index depends on the Analyzer assigned to that
field. The choice the KS docs shunt you towards as the easiest is a
PolyAnalyzer that's actually a series of three other analyzers.
Here's the code from PolyAnalyzer's constructor:
    # create a default set of analyzers if language was specified
    if ( !defined $args->{analyzers} ) {
        confess("Must specify either 'language' or 'analyzers'")
            unless $language;
        $args->{analyzers} = [
            KinoSearch::Analysis::LCNormalizer->new,
            KinoSearch::Analysis::Tokenizer->new,
            KinoSearch::Analysis::Stemmer->new( language => $language ),
        ];
    }
The element that reduces "creamy" to "cream" is the Stemmer. If you
take it out of the loop, then searches for "creamy" will no longer
match docs containing "cream" and vice versa. Whether that's desirable
depends on your application. It's perfectly reasonable to index the
same content twice, once in a stemmed field and once in an unstemmed
field, so long as you have the resources to spare (you'd probably
want only one of the two fields to be "stored").
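
As a rough illustration of indexing the same text under a stemmed and
an unstemmed field, here is a toy sketch in plain Perl. It does not use
the KinoSearch API: toy_stem, index_doc, search, and the field names
body_stemmed and body_exact are all hypothetical, and the "stemmer"
only strips a trailing "y".

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real stemmer: strips one trailing "y".
sub toy_stem {
    my $token = shift;
    $token =~ s/y\z//;
    return $token;
}

# Toy inverted index: field name => term => { doc_id => 1 }
my %index;

# Index the same text under two fields: one stemmed, one not.
sub index_doc {
    my ( $doc_id, $text ) = @_;
    for my $token ( lc($text) =~ /(\w+)/g ) {
        $index{body_exact}{$token}{$doc_id} = 1;
        $index{body_stemmed}{ toy_stem($token) }{$doc_id} = 1;
    }
}

# Return the sorted doc ids matching a single-word query in one field.
sub search {
    my ( $field, $query ) = @_;
    my $term = lc $query;
    $term = toy_stem($term) if $field eq 'body_stemmed';
    my @doc_ids = sort keys %{ $index{$field}{$term} || {} };
    return @doc_ids;
}

index_doc( 1, "Cream cheese" );
index_doc( 2, "Creamy Goodness" );

print join( ",", search( 'body_stemmed', 'cream' ) ), "\n";    # 1,2
print join( ",", search( 'body_exact',   'cream' ) ), "\n";    # 1
```

Querying the stemmed field gives the loose match; querying the
unstemmed field gives "cream" but not "creamy". The cost is that every
term is posted twice.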
Take the Tokenizer out of the loop too, and then searches for
"creamy" will no longer match a field whose value is "Creamy
Goodness". Take the LCNormalizer out of the loop too, and then
there's no more analysis being performed -- so only exact matches
will succeed.
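
The effect of removing each stage can be sketched with a toy analysis
chain in plain Perl. This is not KinoSearch code: analyze, matches,
toy_stem, and the lowercase/tokenize/stem switches are hypothetical
names, and the "stemmer" only strips a trailing "y".

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real stemmer: strips one trailing "y".
sub toy_stem {
    my $token = shift;
    $token =~ s/y\z//;
    return $token;
}

# Run text through whichever stages are enabled, in the same order
# as the default PolyAnalyzer: lowercase, tokenize, stem.
sub analyze {
    my ( $text, %on ) = @_;
    $text = lc $text if $on{lowercase};
    my @tokens = $on{tokenize} ? ( $text =~ /(\w+)/g ) : ($text);
    @tokens = map { toy_stem($_) } @tokens if $on{stem};
    return @tokens;
}

# Count the query terms that also occur among the field's terms.
sub matches {
    my ( $query, $field_value, %on ) = @_;
    my %terms = map { $_ => 1 } analyze( $field_value, %on );
    return scalar grep { $terms{$_} } analyze( $query, %on );
}

my $field = "Creamy Goodness";

# All three stages: "cream" matches via stemming.
printf "%d\n", matches( "cream", $field, lowercase => 1, tokenize => 1, stem => 1 );    # 1
# No stemmer: "cream" no longer matches "creamy".
printf "%d\n", matches( "cream", $field, lowercase => 1, tokenize => 1 );               # 0
# No tokenizer either: "creamy" no longer matches the whole value.
printf "%d\n", matches( "creamy", $field, lowercase => 1 );                             # 0
# No analysis at all: only the exact string matches.
printf "%d\n", matches( "Creamy Goodness", $field );                                    # 1
```

Note that the query goes through the same stages as the field value;
that is what keeps the two sides comparable.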
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/