[KinoSearch] Wildcards (was: Re: KinoSearch feature suggestions)
Father Chrysostomos
sprout at cpan.org
Fri Jan 25 12:57:29 PST 2008
On Jan 25, 2008, at 11:34 AM, Marvin Humphrey wrote:
> I've just committed some revised docs for Lexicon. Please let me
> know if this is sufficiently clear:
>
> KinoSearch::Index::Lexicon - Iterator for a field's Terms.
>
> =head1 SYNOPSIS
[...]
Perfect. :-)
>
> So here's my latest thought:
>
> my $seeked_lexicon = $reader->lexicon( 'content', 'foo' );
> my $unseeked_lexicon = $reader->lexicon( 'content', undef );
> my $seeked_posting_list = $reader->posting_list( 'content', 'foo' );
> my $unseeked_posting_list = $reader->posting_list( 'content', undef );
>
> Does that make sense?
Yes, it does.
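To make the seeked/unseeked distinction concrete outside of KinoSearch, here is a minimal sketch in plain Perl. It assumes a lexicon behaves like a cursor over a sorted term list, and that seeking lands on the first term greater than or equal to the target. The make_lexicon and drain helpers are hypothetical stand-ins, not KinoSearch API.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a lexicon: a cursor over a sorted term list.
# This is NOT the KinoSearch API, only an illustration of the idea.
sub make_lexicon {
    my ( $terms, $target ) = @_;
    my @sorted = sort @$terms;
    my $pos    = 0;
    if ( defined $target ) {
        # Assumed seek semantics: stop at the first term ge the target.
        $pos++ while $pos < @sorted && $sorted[$pos] lt $target;
    }
    # Return an iterator that yields the remaining terms one at a time.
    return sub { $pos < @sorted ? $sorted[ $pos++ ] : undef };
}

# Collect everything an iterator has left to give.
sub drain {
    my ($iter) = @_;
    my @out;
    while ( defined( my $term = $iter->() ) ) { push @out, $term }
    return @out;
}

my @terms      = qw( bar foo baz quux );
my @from_foo   = drain( make_lexicon( \@terms, 'foo' ) );
my @from_start = drain( make_lexicon( \@terms, undef ) );

print "seeked:   @from_foo\n";
print "unseeked: @from_start\n";
```

Under those assumptions, passing undef simply means "start from the first term", which is why a single method with an optional term argument can cover both cases.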
> 2. At some point, it would be nice to support non-text fields.
Since binary data can be stored in a string, it is already supported,
is it not?
> Hmm. Thinking over the second point, perhaps it would be best if
> Lexicons only stored field values rather than terms. In Lucene,
> that wouldn't work because TermEnum objects handle multiple fields,
> but in KS, the field is fixed.
Do you mean that the field contains the terms, which contain the field
name? This does seem redundant.
> Making such a change wouldn't be trivial, but it's probably
> worthwhile.
That would certainly make things simpler. Of course, it’s up to you.
> A RegexQuery class would be nice to have, but it would have some
> significant limitations. If it used the existing KS index data
> structures, it would not behave like a typical SQL regex or LIKE
> query, matching the regex against the non-tokenized contents for
> each field. If you did something like this...
>
> my $regex_query = KSx::Search::RegexQuery->new(
>     field => 'content',
>     regex => qr/three blind/,
> );
>
> ... and the 'content' field was tokenized, the regex wouldn't match
> against any of the values in the Lexicon, since e.g. "blind" doesn't
> match qr/three blind/.
I should have called it RegexpTermQuery. :-) I still like the idea,
though.
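The limitation is easy to reproduce in plain Perl, without any index at all. The sketch below assumes a naive lowercase-and-split tokenizer standing in for whatever analyzer the 'content' field actually uses; the sample text and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $content = 'Three blind mice';

# Assumed tokenizer: lowercase, then split on whitespace.
my @terms = split ' ', lc $content;

my $phrase_re = qr/three blind/;    # spans two tokens
my $term_re   = qr/^bl/;            # fits inside a single token

# Matching term by term, as a term-level regex query would.
my @phrase_hits = grep { $_ =~ $phrase_re } @terms;
my @term_hits   = grep { $_ =~ $term_re } @terms;

# Matching against the raw, non-tokenized value, as SQL LIKE would.
my $raw_hit = lc($content) =~ $phrase_re ? 1 : 0;

printf "phrase regex vs terms: %d hit(s)\n", scalar @phrase_hits;
printf "term regex vs terms:   %d hit(s)\n", scalar @term_hits;
printf "phrase regex vs raw:   %d hit(s)\n", $raw_hit;
```

A regex that spans a token boundary finds nothing among the individual terms, while one confined to a single term works fine, which is the behaviour the name RegexpTermQuery would advertise.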
_______________________________________________
KinoSearch mailing list
KinoSearch at rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch