[KinoSearch] OpenQueryParser
Nathan Kurz
nate at verse.com
Mon Apr 28 15:38:44 PDT 2008
On Mon, Apr 28, 2008 at 2:00 PM, Marvin Humphrey <marvin at rectangular.com> wrote:
> I think the most elegant solution is to use undef/NULL for the 'any' field
> type. (We'll have to modify TermQuery etc. to accept an undef value for
> 'field'.)
Sounds good.
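To make sure we mean the same thing by an undef 'field', here is a minimal sketch. The Sketch::TermQuery package and its applies_to_field method are hypothetical stand-ins I made up for illustration, not the real TermQuery:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a TermQuery; not KinoSearch's actual class.
package Sketch::TermQuery;

sub new {
    my ( $class, %args ) = @_;
    die "'term' is required" unless defined $args{term};
    # 'field' may be omitted or undef, meaning "any field".
    return bless { field => $args{field}, term => $args{term} }, $class;
}

# True if this query should be applied to the named field.
sub applies_to_field {
    my ( $self, $field_name ) = @_;
    return 1 unless defined $self->{field};    # undef matches every field
    return $self->{field} eq $field_name ? 1 : 0;
}

package main;

my $any   = Sketch::TermQuery->new( term => 'foo' );
my $title = Sketch::TermQuery->new( field => 'title', term => 'foo' );

print $any->applies_to_field('body')   ? "yes\n" : "no\n";    # yes
print $title->applies_to_field('body') ? "yes\n" : "no\n";    # no
```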
> The final output of the parser *has* to be a Query object, because this
> code snippet has to work:
>
> my $query = $query_parser->parse($query_string);
> my $hits = $searcher->search( query => $query );
Fine by me.
> So... keep the two-stage compilation, but have both stages output Query
> objects and just make it possible to walk child nodes for ANDQuery, ORQuery,
> etc.
Yes, definitely simpler this way.
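Roughly what I picture for walking child nodes, as a sketch only. The Sketch::* packages and the children() accessor are names I am assuming for illustration; the real classes may expose children differently:

```perl
use strict;
use warnings;

# Hypothetical node classes; names echo the thread but are not KinoSearch's API.
package Sketch::Query;
sub new { my ( $class, %args ) = @_; return bless {%args}, $class }
sub children { return @{ $_[0]{children} || [] } }

package Sketch::TermQuery; our @ISA = ('Sketch::Query');
package Sketch::ANDQuery;  our @ISA = ('Sketch::Query');
package Sketch::ORQuery;   our @ISA = ('Sketch::Query');

package main;

# Depth-first walk that calls $visit on each node along with its depth.
sub walk {
    my ( $node, $visit, $depth ) = @_;
    $depth ||= 0;
    $visit->( $node, $depth );
    walk( $_, $visit, $depth + 1 ) for $node->children;
}

my $tree = Sketch::ANDQuery->new(
    children => [
        Sketch::TermQuery->new( term => 'foo' ),
        Sketch::ORQuery->new(
            children => [
                Sketch::TermQuery->new( term => 'bar' ),
                Sketch::TermQuery->new( term => 'baz' ),
            ],
        ),
    ],
);

# Collect every leaf term in visiting order.
my @terms;
walk( $tree, sub {
    my ($node) = @_;
    push @terms, $node->{term} if $node->isa('Sketch::TermQuery');
} );
print join( ',', @terms ), "\n";    # foo,bar,baz
```

Since every node is a Query, the same walker works on the output of either stage.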
> And that finally suggests a decent replacement name for the Weight class:
> it should be renamed to KinoSearch::Search::Compiler.
This certainly makes me happier than Weight.
> To refine what we want out of the parser class, though... it ought to be
> an abstract syntax tree built from Query objects, rather than a parse tree per
> se. (<http://en.wikipedia.org/wiki/Abstract_syntax_tree>)
Pedantic, but true :). I don't see a great benefit in distinguishing
these two (an AST looks a lot to me like a Parse Tree for slightly
different input), but I will attempt to change my terminology.
> For example, we
> want the output of the parser for both of these query string inputs...
>
> 'foo -bar'
> 'foo AND NOT bar'
>
> ... to be exactly equivalent to this:
> [snip]
Yes. Although, personally, I'm not above canonicalizing the input
string as text before passing it to the parser. One could have the
Parser directly accept only one of these forms, and massage the other
into that form. This is much simpler than writing your own parser if
you just want a small change in the grammar.
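A rough sketch of the kind of text massaging I mean. The canonicalize function is hypothetical, and it deliberately ignores quoted phrases and a '-' at the very start of the string, which a real version would have to handle:

```perl
use strict;
use warnings;

# Hypothetical pre-processing step: rewrite '-term' into 'AND NOT term'
# so the parser only has to understand the explicit boolean form.
sub canonicalize {
    my ($query_string) = @_;
    # Match a '-' that follows whitespace and precedes a word character,
    # so hyphenated words such as 'well-known' are left alone.
    $query_string =~ s/(?<=\s)-(?=\w)/AND NOT /g;
    return $query_string;
}

print canonicalize('foo -bar'),        "\n";    # foo AND NOT bar
print canonicalize('foo AND NOT bar'), "\n";    # foo AND NOT bar
```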
> That's lossy -- because we lose the ability to recreate the input query
> string -- but we don't lose any intent and it's not a true optimization.
True, but I don't think this is a problem. It's probably best to save
the initial query string at some top level, so it can be redisplayed
later, but this can be outside KinoSearch core.
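For instance, an application-level wrapper could carry both the raw string and the parsed query. This is only a sketch: Sketch::UserQuery is a hypothetical class, and the stand-in parser below just returns a hash where a real application would call $query_parser->parse($query_string):

```perl
use strict;
use warnings;

# Hypothetical application-level wrapper, living outside the search library.
package Sketch::UserQuery;

sub new {
    my ( $class, %args ) = @_;
    my $raw = $args{query_string};
    # 'parse' is any code ref that turns a string into a query object.
    my $parsed = $args{parse}->($raw);
    return bless { raw => $raw, query => $parsed }, $class;
}

sub raw   { return $_[0]{raw} }      # original text, for redisplay
sub query { return $_[0]{query} }    # parsed form, for searching

package main;

# Stand-in parser for the sketch; it only records its input.
my $fake_parse = sub { return { parsed_from => $_[0] } };

my $uq = Sketch::UserQuery->new(
    query_string => 'foo -bar',
    parse        => $fake_parse,
);
print $uq->raw, "\n";    # foo -bar
```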
Nathan Kurz
nate at verse.com