[KinoSearch] Re: KinoSearch feature suggestions
Peter Karman
peter at peknet.com
Wed Jan 23 06:59:30 PST 2008
On 01/23/2008 08:32 AM, Marvin Humphrey wrote:
> However, with this design we can get something working reasonably
> quickly, and then check out prior art from other IR projects. Peter,
> does SWISH do wildcards?
>
Yes, Swish-e supports '*' to match 1 or more characters at the end of a word, and
'?' to match exactly one character anywhere in the word. However, Swish-e does that by
means of a 256-byte wide lookup table (iirc), which works only because Swish-e supports
only single-byte encodings.
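
To make the single-byte point concrete, here is a rough sketch in Python. It is
not Swish-e's code and does not reproduce its lookup table; the function name
wildcard_match is made up for illustration. It only follows the semantics
described above ('*' as a trailing "1 or more", '?' as exactly one) and compares
byte by byte, which is why it is only correct when one byte is one character.

```python
def wildcard_match(pattern: bytes, word: bytes) -> bool:
    """Byte-wise wildcard match (hypothetical helper, not Swish-e's API).

    '?' matches exactly one byte anywhere in the word.
    A trailing '*' matches one or more bytes at the end of the word.
    A '*' anywhere else is treated as a literal byte.
    """
    if pattern.endswith(b"*"):
        prefix = pattern[:-1]
        # '*' must consume at least one byte beyond the prefix.
        if len(word) <= len(prefix):
            return False
        word = word[:len(prefix)]
    else:
        prefix = pattern
        # Without '*', lengths must agree byte for byte.
        if len(word) != len(prefix):
            return False
    # Iterating over bytes yields ints, so compare against ord("?").
    return all(p == ord("?") or p == w for p, w in zip(prefix, word))
```

With a single-byte encoding such as Latin-1, wildcard_match(b"caf?", "café".encode("latin-1"))
succeeds because 'é' is one byte. Encoded as UTF-8 the same word is five bytes
long, so the '?' no longer lines up with one character and the match fails.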
>>> How about if we outsource excerpting to subclasses of a new class,
>>> KinoSearch::Highlight::Excerpter?
>>
fwiw, Search::Tools offers highlighting and excerpting (snipping) via the building of
complex regular expressions. See
http://search.cpan.org/~karman/Search-Tools-0.16/lib/Search/Tools/Snipper.pm
http://search.cpan.org/~karman/Search-Tools-0.16/lib/Search/Tools/HiLiter.pm
The algorithm I use for snipping/excerpting is slow, and I would love to see how a
different approach could improve performance. I believe the primary reason my approach is
slow is that it uses a big regex.
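
For readers who have not seen the regex style of snipping, the following is a
minimal sketch of the general idea in Python. It is not the Search::Tools
implementation; the function name snip and the context parameter are assumptions
made for illustration. It builds one alternation over the query terms and lets
the regex engine pull out up to a few words of context on either side of the
first hit.

```python
import re

def snip(text: str, terms: list, context: int = 5) -> str:
    """Return the first hit for any term plus up to `context` words each side.

    Hypothetical helper for illustration only.
    """
    # One alternation over all query terms, longest first so that a longer
    # term is tried before a shorter term that is its prefix.
    alternation = "|".join(
        re.escape(t) for t in sorted(terms, key=len, reverse=True)
    )
    pattern = re.compile(
        r"(?:\S+\s+){0,%d}\S*(?:%s)\S*(?:\s+\S+){0,%d}"
        % (context, alternation, context),
        re.IGNORECASE,
    )
    m = pattern.search(text)
    return m.group(0) if m else ""

print(snip("the quick brown fox jumps over the lazy dog", ["fox"], context=2))
# prints: quick brown fox jumps over
```

Note that the engine has to retry the leading context group from each candidate
start position until a term finally matches, so the cost grows with both the
document length and the number of terms in the alternation.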
--
Peter Karman . peter at peknet.com . http://peknet.com/