[KinoSearch] Stemming and scoring
Eamon Daly
edaly at nextwavemedia.com
Wed Feb 14 15:30:59 PST 2007
Quoting the perldoc:
Stemming reduces words to a root form. For instance,
"horse", "horses", and "horsing" all become "hors" -- so
that a search for 'horse' will also match documents
containing 'horses' and 'horsing'.
Our search is having a lot of trouble with words such as
"intern" and "internal". Am I correct in assuming that only
the stem is stored at index time, so that searches on
"intern" and "internal" will return the same documents with
equal scores? If not, is there a way to boost the score of
exact matches? If so, does anyone know of alternative
stemmers we could try, for instance one that only reduces
plurals to singular? CPAN is failing me for once.
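
To make the plural-to-singular idea concrete, here is a rough
sketch of the kind of conservative stemmer I have in mind. It
is plain Python rather than KinoSearch code, the name
strip_plural and the suffix rules are made up for
illustration, and it makes no attempt to handle irregular
plurals. The point is only that "horses" would still fold
into "horse" while "intern" and "internal" stay distinct.

```python
def strip_plural(word: str) -> str:
    """Toy plural stripper (hypothetical, not part of KinoSearch).

    Applies the first matching suffix rule and leaves every other
    word untouched, so derivational suffixes such as "-al" or
    "-ing" are never removed.
    """
    w = word.lower()
    if len(w) <= 3:
        # Too short to strip safely ("is", "gas", "bus").
        return w
    if w.endswith("ies") and not w.endswith(("eies", "aies")):
        return w[:-3] + "y"      # ponies -> pony
    if w.endswith("es") and not w.endswith(("aes", "ees", "oes")):
        return w[:-1]            # horses -> horse
    if w.endswith("s") and not w.endswith(("us", "ss")):
        return w[:-1]            # interns -> intern
    return w


if __name__ == "__main__":
    for term in ["horse", "horses", "horsing", "intern", "interns", "internal"]:
        print(term, "->", strip_plural(term))
```

With rules this simple, "horsing" no longer matches "horse",
which is the recall we would be giving up in exchange for
keeping "intern" and "internal" apart.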
____________________________________________________________
Eamon Daly
NextWave Media Group
Tel: 773 975-1115
Fax: 773 913-0970