[KinoSearch] Field-specific terms vs. query filter?

Marvin Humphrey marvin at rectangular.com
Sat Dec 1 13:37:18 PST 2007




On Dec 1, 2007, at 12:01 PM, Larry Leszczynski wrote:

> Is there any syntax to query an analyzed field for an exact match?   
> E.g. if I want to match "cream" but not "creamy"?

What ends up in the index depends on the Analyzer assigned to that  
field.  The choice the KS docs shunt you towards as the easiest is a  
PolyAnalyzer that's actually a series of three other analyzers.   
Here's the code from PolyAnalyzer's constructor:

     # create a default set of analyzers if language was specified
     if ( !defined $args->{analyzers} ) {
         confess("Must specify either 'language' or 'analyzers'")
             unless $language;
         $args->{analyzers} = [
             KinoSearch::Analysis::LCNormalizer->new,
             KinoSearch::Analysis::Tokenizer->new,
             KinoSearch::Analysis::Stemmer->new( language => $language ),
         ];
     }

The element that reduces "creamy" to "cream" is the Stemmer.  If you
take it out of the loop, then searches for "creamy" will no longer
return docs matching "cream" and vice versa.  Whether that's desirable
depends on your application.  It's perfectly reasonable to index the
same content twice, once in a stemmed field and once in an unstemmed
field, so long as you have the resources to spare (you'd probably
want only one of the fields to be "stored").

Take the Tokenizer out of the loop too, and then searches for  
"creamy" will no longer match a field whose value is "Creamy  
Goodness".  Take the LCNormalizer out of the loop too, and then  
there's no more analysis being performed -- so only exact matches  
will succeed.
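
To make the effect of each stage concrete, here is a rough pure-Perl
sketch.  It does not call KinoSearch at all: the three stages are toy
stand-ins, the "stemmer" is a hypothetical one-rule hack (strip a
trailing "y"), and the names analyze() and matches() are invented for
this illustration.  Real analyzers are considerably more involved.

```perl
use strict;
use warnings;

# Toy stand-ins for the three analysis stages. These are NOT the
# KinoSearch classes; they only mimic the general idea of each stage.
my %stage = (
    lc_normalize => sub { map { lc($_) } @_ },
    tokenize     => sub { map { $_ =~ /(\w+)/g } @_ },
    # Hypothetical one-rule "stemmer": strip a trailing "y" from
    # words longer than four characters.
    stem => sub {
        map { my $t = $_; $t =~ s/y\z// if length($t) > 4; $t } @_;
    },
);

# Run a value through the named stages, in order, and return the terms.
sub analyze {
    my ( $text, @stages ) = @_;
    my @terms = ($text);
    for my $name (@stages) {
        @terms = $stage{$name}->(@terms);
    }
    return @terms;
}

# A query matches when any analyzed query term equals an indexed term.
# The same stages are applied to the query and to the field value.
sub matches {
    my ( $query, $field_value, @stages ) = @_;
    my %indexed = map { $_ => 1 } analyze( $field_value, @stages );
    my $hits = grep { $indexed{$_} } analyze( $query, @stages );
    return $hits ? 1 : 0;
}

my @chains = (
    [ 'all three stages' => qw( lc_normalize tokenize stem ) ],
    [ 'no stemmer'       => qw( lc_normalize tokenize ) ],
    [ 'lowercasing only' => qw( lc_normalize ) ],
    [ 'no analysis' ],
);

for my $chain (@chains) {
    my ( $label, @stages ) = @$chain;
    printf "%-17s creamy/cream: %d  creamy/Creamy Goodness: %d\n",
        $label,
        matches( 'creamy', 'cream',           @stages ),
        matches( 'creamy', 'Creamy Goodness', @stages );
}
```

With the stemming stage removed, "creamy" stops matching "cream" but
still matches "Creamy Goodness"; with only lowercasing left, the query
has to be the whole field value, give or take case.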

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/



_______________________________________________
KinoSearch mailing list
KinoSearch at rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch



