[KinoSearch] fuzzy searches
Marvin Humphrey
marvin at rectangular.com
Sun Mar 14 22:15:43 PDT 2010
On Sun, Mar 14, 2010 at 11:45:15PM -0500, Peter Karman wrote:
> fwiw, I like to offer 2 versions of an index, a 'strict' version and a 'fuzzy'
> version, with some kind of selector in my UI to toggle which one is used.
With some work, you can also structure your index and your queries so that
strict matches score higher, but fuzzy matches still show up in the results:
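
    # Parse the same query string against the fuzzy and the strict fields.
    my $fuzzy  = $fuzzy_fields_qparser->parse($query_string);
    my $strict = $strict_fields_qparser->parse($query_string);
    # Down-weight the fuzzy clause so that strict matches rank higher.
    $fuzzy->set_boost(0.1);
    # A document matches if either clause matches.
    my $or_query = KinoSearch::Search::ORQuery->new(
        children => [ $fuzzy, $strict ],
    );
    my $results = $searcher->search($or_query);
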
> I'm also very interested in experimenting with LSI[0] which I see on the KS
> BrainDump list[1].
LSI/LSA (Latent Semantic Indexing/Analysis; "LSA" seems to have become the
more common name) came off patent a couple of years ago. The matrix algebra
needed to perform the data reduction is heavy-duty math, beyond my
capabilities. But it sure is interesting to think about in terms of vector
space clustering.
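
For reference, the data reduction in question is usually stated as a
truncated singular value decomposition of the term-document matrix. A sketch
in LaTeX notation follows; the symbols A, U, \Sigma, V, k and q are the
textbook convention, not anything taken from KinoSearch:

    A = U \Sigma V^{T}
    A_k = U_k \Sigma_k V_k^{T}, \quad k \ll \operatorname{rank}(A)
    \hat{q} = \Sigma_k^{-1} U_k^{T} q

Here A is the m x n matrix of term weights (m terms, n documents), A_k keeps
only the k largest singular values, and \hat{q} is a query vector q folded
into the same k-dimensional space, where it can be compared against the
reduced document vectors (the rows of V_k), typically by cosine similarity.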
Marvin Humphrey