[KinoSearch] MultiSearcher's lack of features

Henka henka at cityweb.co.za
Tue Jun 12 02:28:26 PDT 2007


Oi - multi-search without sorting is an app killer in my case too...

>    # now:
>    (  451, 20, 52 )
>    # fixed:
>    ( [ 451, 'cat' ], [ 20, 'dog' ], [ 52, 'human' ] )
>
> Then the central machine could make string comparisons of the sort
> field values from different machines.
>
> Looking up those values is kind of expensive though.  Lexicons are
> only partially kept in memory (1 out of every 128 terms), and with a
> multi-segment index, you have to perform 1 disk scan per segment.
> Say you have 10 hits and 25 segments.  That's 250 disk seeks to
> associate each hit with a sort field value.  :(

Very bad indeed.  That would potentially murder search times on a busy
search cluster, right?  (IO being the bottleneck)
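
The central merge itself looks cheap once the values travel with the
hits.  A minimal sketch of how I picture it, in Python rather than Perl
for brevity; merge_hits and the node names are made up for illustration
and are not KinoSearch API, and I'm assuming each node returns its hits
already sorted by the sort field:

```python
import heapq
from itertools import islice


def merge_hits(hits_by_node, limit):
    """Merge per-node hit lists into one list ordered by sort-field value.

    hits_by_node maps a (hypothetical) node name to a list of
    (doc_num, sort_value) pairs, each list already sorted by sort_value.
    Doc numbers are only meaningful per node, so the node name is kept.
    """
    streams = [
        [(value, node, doc_num) for doc_num, value in hits]
        for node, hits in hits_by_node.items()
    ]
    # heapq.merge lazily merges already-sorted inputs; tuples compare
    # by sort value first, which is the string comparison described above.
    return list(islice(heapq.merge(*streams), limit))


hits_by_node = {
    "node_a": [(451, "cat"), (52, "human")],
    "node_b": [(20, "dog")],
}
top = merge_hits(hits_by_node, limit=3)
```

So the expensive part is the per-hit value lookup on each node, not the
merge on the central machine.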

> To avoid that cost, we might have to load entire lexicons for sort
> fields into memory.  I've been trying to avoid that, but I don't see
> how.

Just so I understand:  when you say "load entire lexicons for sort fields
into memory" you mean the sort fields of the -search result set-, right?

I'm trying to get my mind around what kind of memory and performance
requirements this would imply for a given corpus size per search node
and a given search volume, for planning purposes.
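
On the seek side, at least, the arithmetic from the quoted example can
be put into a back-of-envelope sketch.  The 10 ms per seek is purely my
assumption (not a measured figure), and it is a worst case: nothing in
the OS cache and one disk serving the seeks one after another.

```python
def lookup_seeks(hits, segments):
    """One disk scan per segment for every hit, as in the quoted example."""
    return hits * segments


def seek_seconds_per_query(hits, segments, seek_ms):
    """Worst-case time spent seeking for one query (seek_ms is assumed)."""
    return lookup_seeks(hits, segments) * seek_ms / 1000.0


def max_queries_per_second(hits, segments, seek_ms):
    """Upper bound on query rate if seeks are the only cost."""
    return 1.0 / seek_seconds_per_query(hits, segments, seek_ms)


seeks = lookup_seeks(hits=10, segments=25)                       # 250 seeks
seconds = seek_seconds_per_query(10, 25, seek_ms=10)             # assumed 10 ms
rate = max_queries_per_second(10, 25, seek_ms=10)
```

Plugging in fewer segments (for example right after optimizing the
index) shows how strongly the segment count drives the cost.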

Regards
Henry



