[KinoSearch] using kinosearch without stemming
Marvin Humphrey
marvin at rectangular.com
Thu Jun 7 19:07:35 PDT 2007
On Jun 7, 2007, at 11:31 AM, Hans Dieter Pearcey wrote:
> I like using KS. It's fast, and though I sometimes get lost in the
> twisty maze of classes, the documentation is generally pretty good.
Thanks!
I'll have more to say about navigating the twisty maze later... maybe
over the weekend...
> I especially like using it for things that I might have previously
> used a database for -- log files and the like, where I want quick and
> flexible searching.
Thanks, it's good to know how people are using KS beyond the
archetypal setup of CGI search for a website.
> Assume a field called "action" that can have (among other values)
> "rejected". I don't want this to be stemmed, because rather than being
> ordinary speech it's effectively like an enum. So my first instinct is
> to make a FieldSpec subclass with
>
>     sub analyzed { 0 }
>
> However, this only seems to take effect while building the invindex.
You're right. It's a bug in QueryParser. Here's the code that's
been misbehaving:
    for my $field (@$fields) {
        # custom analyze for each field unless override
        my $analyzer = $supplied_analyzer;
        $analyzer = $schema->fetch_analyzer($field)
            unless defined $analyzer;
        my @token_texts = grep {length} $analyzer->analyze_raw($text);
        my $query
            = $self->_gen_single_field_query( $field, \@token_texts );
        push @queries, $query if defined $query;
    }
QueryParser was finding the "correct" analyzer for the field -- since
none was specified, fetch_analyzer() returns the main analyzer for
$schema. However, QueryParser wasn't obeying the field's analyzed()
property, as you discovered.
The problem is fixed as of Subversion repository revision 2465:
    svn co -r 2465 http://www.rectangular.com/svn/kinosearch/trunk ks
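
To illustrate the idea (this is not the actual patch, just a standalone sketch): before tokenizing, check the field's analyzed() property, and if it's false, pass the query text through untouched as a single token. DemoFieldSpec, DemoAnalyzer, %field_specs, and token_texts_for() are all made up for this example and stand in for the real KinoSearch classes; the "stemmer" is a crude suffix stripper.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a FieldSpec; NOT the real KinoSearch class.
package DemoFieldSpec;
sub new { my ( $class, %args ) = @_; return bless {%args}, $class }
sub analyzed { $_[0]->{analyzed} }

# Hypothetical stand-in for an analyzer: lowercases and strips a few
# suffixes, which is enough to mimic stemming for this demo.
package DemoAnalyzer;
sub new { bless {}, shift }

sub analyze_raw {
    my ( $self, $text ) = @_;
    return map { my $t = lc $_; $t =~ s/(?:ed|ing|s)$//; $t }
        split /\s+/, $text;
}

package main;

my %field_specs = (
    content => DemoFieldSpec->new( analyzed => 1 ),
    action  => DemoFieldSpec->new( analyzed => 0 ),    # enum-like field
);
my $analyzer = DemoAnalyzer->new;

# Analyze the text only when the field says it is analyzed; otherwise
# treat the whole string as one literal token.
sub token_texts_for {
    my ( $field, $text ) = @_;
    my @token_texts
        = $field_specs{$field}->analyzed
        ? $analyzer->analyze_raw($text)
        : ($text);
    return grep {length} @token_texts;
}

print join( ' ', token_texts_for( content => 'rejected' ) ), "\n";
print join( ' ', token_texts_for( action  => 'rejected' ) ), "\n";
```

With the toy stemmer above, the analyzed field yields "reject" while the unanalyzed field keeps "rejected" intact, so the query term matches what was written to the index.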
> When I search for "action:rejected",
You may have seen this in a recent post of mine, but just FYI the
'field_name:term_text' syntax is now off by default in QueryParser.
You can get it back via $query_parser->set_heed_colons(1).
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/