[KinoSearch] Typos in queries

Nathan Kurz nate at verse.com
Wed May 21 23:44:09 PDT 2008


On Thu, May 22, 2008 at 12:12 AM, Eriam Schaffter
<eriam at mediavirtuel.com> wrote:
> I was wondering if there is already a method to return results when the
> query contains typos.

So far as I know, there is not yet such a method.

Because an inverted index is keyed by the exact term you are searching
for, a misspelled term simply finds no entry, so using the Levenshtein
distance directly is difficult.  You would have to calculate the
distance from each word in the query to every distinct word in your
index.  This is only practical if your index has a limited vocabulary,
you have a lot of extra processing power, or you come up with an
elegant means of caching your results.
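
To make the cost concrete, here is a minimal sketch in Python (not
KinoSearch code) of the brute-force approach: compute the edit distance
from one query term to every term in the vocabulary.  The function
names, the sample vocabulary, and the max_distance cutoff are all
assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_terms(query_term, vocabulary, max_distance=2):
    """Score every vocabulary term; work grows with vocabulary size."""
    scored = [(levenshtein(query_term, term), term) for term in vocabulary]
    return sorted((d, t) for d, t in scored if d <= max_distance)


# Hypothetical vocabulary standing in for the terms of a real index.
vocabulary = ["search", "index", "query", "lucene", "perl"]
matches = closest_terms("serach", vocabulary)
```

Every query term triggers one distance computation per distinct term in
the index, which is why this only scales with a small vocabulary or
some form of caching.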

I would recommend an easier solution:  use an existing spell checker
like Text::Aspell to 'suggest' spelling corrections, and then include
these suggestions as weighted clauses in a Boolean query.  If I recall
correctly, you have to hack libaspell if you want it to return the
distance, but otherwise this should work well.
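
One way to sketch the weighting step, in Python rather than Perl and
without calling Aspell or KinoSearch at all: since the spell checker
may not report a distance, weight each suggestion by its position in
the suggestion list instead.  The function name, the decay factor, and
the sample suggestions below are assumptions for illustration, not
actual Aspell output; the resulting (term, weight) pairs would still
have to be turned into whatever weighted Boolean clauses your search
library supports.

```python
def weighted_terms(term, suggestions, base_weight=1.0, decay=0.5):
    """Return (term, weight) pairs: the original term at full weight,
    each distinct suggestion at a weight that shrinks with its rank."""
    clauses = [(term, base_weight)]
    seen = {term}
    weight = base_weight
    for suggestion in suggestions:
        if suggestion in seen:
            continue  # skip duplicates and the original term itself
        seen.add(suggestion)
        weight *= decay
        clauses.append((suggestion, weight))
    return clauses


# Hypothetical suggestion list, ordered best-first as a spell checker
# would typically return it.
clauses = weighted_terms("serach", ["search", "serac", "searcher"])
```

Keeping the original term at the highest weight means an exact match
on what the user typed still ranks above matches on the corrections.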

Good luck!

Nathan Kurz
nate at verse.com


