[KinoSearch] Queries with large number of hits.

Nathan Kurz nate at verse.com
Fri Sep 19 11:25:53 PDT 2008



On Thu, Sep 18, 2008 at 10:25 PM, Marvin Humphrey
<marvin at rectangular.com> wrote:
> I've now performed the experiment pitting Lucene's TermQuery against
> KinoSearch's.  The results were worse than expected, and extra positions
> decoding overhead seem to be a bigger factor than I'd thought it would be.

Thanks for your quick research and improvements, Marvin.  You prompted
me to run a few quick tests as well:

kinosearch_instream/perl$ search "the OR and OR of OR for" (ten times)
kinosearch_instream/perl$ opreport -alt2 */KinoSearch.so
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  cum. samples  %        cum. %     symbol name
190      190           23.8394  23.8394    kino_InStream_read_c32
112      302           14.0527  37.8921    kino_ScorePost_read_record
68       370            8.5320  46.4241    kino_ScorerDocQ_top_next
55       425            6.9009  53.3250    kino_TermScorer_next
51       476            6.3990  59.7240    kino_InStream_read_u8
45       521            5.6462  65.3701    kino_SegPList_next
31       552            3.8896  69.2597    __i686.get_pc_thunk.bx
30       582            3.7641  73.0238    advance_after_current
29       611            3.6386  76.6625    anonymous symbol from section .plt
25       636            3.1368  79.7992    kino_MemMan_wrapped_realloc
22       658            2.7604  82.5596    kino_ScorePostScorer_tally
17       675            2.1330  84.6926    kino_ORScorer_tally

"opannotate --source */KinoSearch.so" is also useful to glance at, as
is adding the '-c' flag to opreport to see where the calls are coming
from and where the functions are spending their time internally.

The main thing that jumped out is that the function call to
Instream_read_c32 is killing us.  I don't see any way to have this
remain a per-int function call and still get good performance.  You
need to figure out some to decode the entire posting with fewer
function calls, and this is where the bulk of them are coming from.
I'd suggest having the Store level return an undecoded raw posting,
and let the Posting class decode the parts it wants.  That way the
VByte code can be macros that work on a local buffer in a tight loop.
I'm sure there are other ways to do it, though.

The second thing that jumps out is that decompressing VBytes is
expensive.  P4Delta might be a significant advantage, or perhaps there
are ways to optimize the decompression with the existing scheme.  I
toyed a bit with trying to come up with a branchless way of doing it,
but gave up without much to show for it.

The third thing (tiny, but perhaps easy to fix) is that
Scorepost_read_record is spending 40% of its time in REALLOC.  Is the
enlarged position buffer not getting reused for some reason?

The last thing (sort of a non-thing) is that to my surprise the
double/triple buffering of the Posting data doesn't seem to have much
of a negative effect.  I still think it's worth trying to avoid this,
but it's barely on the current radar.

> The way to do this in KS is to override FieldSpec->posting() in a subclass
> so that it returns a MatchPosting instead of a ScorePosting.  MatchPosting
> was a placeholder until today, but now a provisional implementation has been
> completed for testing purposes.
> Happily, with MatchPosting, we get much closer to Lucene performance: around
> 1.3x for "Reuters" and around 1.7x for "the".

Did you also recreate an index without the position information, or
are these times based on just skipping over the position info in the
index?  Times would probably be closer if the information was
out-of-line.

Nathan Kurz
nate at verse.com

ps.  The directions for building the Reuters benchmark index seem out
of date.   '-Mblib' no longer finds the uninstalled KinoSearch.so in
the parent hierarchy.

_______________________________________________
KinoSearch mailing list
KinoSearch at rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch




More information about the kinosearch mailing list