[KinoSearch] opening up the scorers

Nathan Kurz nate at verse.com
Thu Apr 17 10:43:09 PDT 2008



>  I've been thinking about adding new public classes ORQuery, ANDQuery,
> ANDNOTQuery and ANDORQuery.  BooleanQuery would either be deprecated or
> removed; the logic from the compilation phase of BooleanScorer's first
> iteration would be moved to QueryParser.

This sounds like a good idea to me, especially changing QueryParser to
build the query directly from the components.  I think it would be
great to have a toolbox of component scorers (core or KSx) that can be
wired together in different ways by custom QueryParsers.

I'm not sure I understand the differences between the component
classes you are proposing, though.  Or maybe I do, and I just don't
understand the names.  On a related note, and as my sloppy terminology
probably makes evident, I still can't keep straight how Queries and
Scorers relate.

AndQuery: short-circuit AND, scored in some way as a product of subqueries?
OrQuery: score equal to the best-scoring subquery; could short-circuit if sorted?
AndOrQuery: score all subqueries and add them, possibly normalized?
AndNotQuery: not sure why this isn't a NotQuery; scored as a constant?
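
To make those guesses concrete, here is a minimal Python sketch of the four
combination rules as I read them.  None of this is KinoSearch API: the
function names, the use of None for "subquery did not match", and the
constant of 1.0 are all assumptions made for illustration, and the
AndNotQuery reading in particular is only one possible interpretation.

```python
# Hypothetical combiners, not KinoSearch code. Each takes the per-subquery
# scores for ONE document; None means "this subquery did not match".

def and_score(scores):
    """Short-circuit AND: give up at the first non-matching subquery,
    otherwise return the product of the subquery scores."""
    total = 1.0
    for s in scores:
        if s is None:
            return None  # short circuit: remaining subqueries are not consulted
        total *= s
    return total

def or_score(scores):
    """OR scored as the best-scoring matching subquery."""
    matched = [s for s in scores if s is not None]
    return max(matched) if matched else None

def and_or_score(scores):
    """Score every subquery and add the matches (no normalization here)."""
    matched = [s for s in scores if s is not None]
    return sum(matched) if matched else None

def and_not_score(required, prohibited, constant=1.0):
    """Match only when `required` matched and `prohibited` did not;
    the score is a fixed constant (an assumed reading of the question)."""
    if required is None or prohibited is not None:
        return None
    return constant
```

Passing a generator to and_score means later subqueries are never evaluated
once one fails, which is the short-circuit behavior I have in mind.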

> > ps.  Marvin --- the term-by-term approach might be a useful general
> > optimization for a special purpose additive OrScorer.
> >
>
>  Yeah, term-at-a-time scoring is great stuff, it's just that the combining
> scorers in KS all need to go doc-at-a-time in order to handle boolean
> constraints without blowing up.

I agree that it probably can't be the default OrQuery/OrScorer, but it
strikes me as a useful piece of rope to tempt users who are creating
their own queries.  It also might be useful to think about how Queries
could be split across cores/servers.  If it worked, there would be
some performance benefits to splitting the work per term rather than
partitioning the corpus.
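
A rough Python sketch of what I mean by term-at-a-time additive scoring, and
of splitting the work per term, follows.  The toy POSTINGS index, the
function names, and the round-robin assignment of terms to workers are all
made up for illustration; the "workers" are plain function calls, so this
shows only the shape of the computation, not any actual speedup.

```python
from collections import defaultdict

# Hypothetical toy index: term -> list of (doc_id, score contribution).
POSTINGS = {
    "apple": [(1, 0.5), (3, 0.25)],
    "pie": [(1, 0.25), (2, 1.0)],
    "crust": [(3, 0.5)],
}

def score_terms(terms, postings):
    """Term-at-a-time: walk one posting list completely before moving on
    to the next term, adding each contribution into a per-doc accumulator."""
    acc = defaultdict(float)
    for term in terms:
        for doc_id, contribution in postings.get(term, []):
            acc[doc_id] += contribution
    return dict(acc)

def merge(partials):
    """Add up partial accumulators produced by separate workers."""
    total = defaultdict(float)
    for partial in partials:
        for doc_id, score in partial.items():
            total[doc_id] += score
    return dict(total)

def score_split_by_term(terms, postings, n_workers):
    """Give each worker a subset of the terms (round-robin), score each
    subset independently, then merge the partial results."""
    shards = [terms[i::n_workers] for i in range(n_workers)]
    return merge(score_terms(shard, postings) for shard in shards)
```

Because the scoring is purely additive, the merged result is the same as
scoring all terms in one place.  That is also why this only suits an
additive OrScorer: a boolean constraint such as AND cannot be decided from
one term's posting list alone.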

Nathan Kurz
nate at verse.com




