[KinoSearch] MultiSearcher's lack of features
Henka
henka at cityweb.co.za
Tue Jun 12 02:28:26 PDT 2007
Oi - multi-search without sorting is an app killer in my case too...
> # now:
> ( 451, 20, 52 )
> # fixed:
> ( [ 451, 'cat' ], [ 20, 'dog' ], [ 52, 'human' ] )
>
> Then the central machine could make string comparisons of the sort
> field values from different machines.
>
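A rough sketch of what that central merge might look like, in Python rather than KinoSearch's own code. The names merge_hits, node_a and node_b are made up for illustration, node_b's data is invented, and the sketch assumes each machine returns its hits already sorted by the sort field value:

```python
import heapq
from itertools import islice

# Hypothetical per-node results as (doc_num, sort_value) pairs, each list
# already sorted by sort value. node_a is the "fixed" example quoted above;
# node_b is invented for illustration.
node_a = [(451, "cat"), (20, "dog"), (52, "human")]
node_b = [(7, "ant"), (300, "emu")]

def merge_hits(per_node_hits, limit):
    """Merge pre-sorted hit lists by comparing sort values as strings."""
    merged = heapq.merge(*per_node_hits, key=lambda hit: hit[1])
    return list(islice(merged, limit))

top = merge_hits([node_a, node_b], 4)
```

Doc numbers are only meaningful per machine, so a real merge would presumably also have to carry a node identifier with each hit.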
> Looking up those values is kind of expensive though. Lexicons are
> only partially kept in memory (1 out of every 128 terms), and with a
> multi-segment index, you have to perform 1 disk scan per segment.
> Say you have 10 hits and 25 segments. That's 250 disk seeks to
> associate each hit with a sort field value. :(
Very bad indeed. That would potentially murder search times on a busy
search cluster, right? (IO being the bottleneck)
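The arithmetic behind that worry, as a small Python sketch. The hit and segment counts are the ones quoted above; the 8 ms per seek is purely an assumed figure for illustration, not a measurement, and lookup_seeks is a made-up name:

```python
def lookup_seeks(num_hits, num_segments):
    """One disk scan per segment for every hit's sort-value lookup."""
    return num_hits * num_segments

ASSUMED_SEEK_MS = 8  # assumption for illustration only, not a measured figure

seeks = lookup_seeks(10, 25)          # 10 hits across 25 segments
added_ms = seeks * ASSUMED_SEEK_MS    # extra latency per query under that assumption
```

Under that assumption a single query would spend about two seconds on seeks alone, before any contention from other queries on the same disks.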
> To avoid that cost, we might have to load entire lexicons for sort
> fields into memory. I've been trying to avoid that, but I don't see
> how.
Just so I understand: when you say "load entire lexicons for sort fields
into memory" you mean the sort fields of the -search result set-, right?
I'm trying to get my mind around what kind of memory and performance
requirements this would imply for a given (corpus size / number of
search nodes) x search volume, for planning purposes.
Regards
Henry