[KinoSearch] KinoSearch::Docs::Cookbook::ReusingSearchers
Nathan Kurz
nate at verse.com
Fri Sep 14 01:39:30 PDT 2007
On 9/14/07, Marvin Humphrey <marvin at rectangular.com> wrote:
>
> On Sep 13, 2007, at 12:12 AM, Henry wrote:
>
> > Dunno if you missed the above post (or you're playing catchup): any
> > comments on whether all search nodes can be "masters" for
> > aggregating/sorting results?
>
> Yeah, that will work fine.
For what it's worth, that approach appeals to me as well. The
simplicity of having every node be identical seems ideal, so long as
the resources on each machine are reasonably well utilized.

I think you mentioned this somewhere earlier, but how large is your
dataset, Henry? Are you using MultiSearcher because your index is too
large to fit on local disks, or is it mid-sized and you are trying to
keep everything in RAM?
> You had also asked about the MultiSearcher sort. I'd been back-
> burnering that one because I was hoping that a new approach would
> present itself during the course of fixing other things. Well, I
> believe that one has.
>
> What we need to do is break up nodes by task.
Ouch --- this runs counter to the simplicity I appreciate about the
masterless system Henry proposed. I agree that it would be pretty
easy to do programmatically, but it doesn't sound like much fun to
administer. I see this being of benefit only for really gigantic
loads/indexes, with the hardware customized to the role of each node.
What are the cases you are thinking it would benefit?
> If you'll recall, the problem with the MultiSearcher sort has to do
> with the overhead of loading large fields into memory to cut down on
> disk seeks. This solves that problem by loading the whole lexicon
> into one shared space for the whole search cluster.
I think that coming up with a good way of returning the field value to
the requester is going to be a better final solution. The fear of
disk seeks seems like a red herring --- if a block is being read
often, it's going to be cached; if it's not read often, it doesn't
matter. And if for some reason we are thrashing the page cache and
forcing a re-read, let's figure out how to change that!

But perhaps I'm missing part of this equation?
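
A minimal sketch of the "return the field value to the requester" idea,
written in Python rather than taken from KinoSearch. It assumes each node
sends back its top hits already sorted by the sort field, with the field's
value attached to every hit, so that whichever node is aggregating only has
to merge the lists. The names Hit and merge_sorted_hits are hypothetical
and are not part of any KinoSearch API.

```python
import heapq
from itertools import islice
from typing import List, NamedTuple, Sequence


class Hit(NamedTuple):
    """One search result as a node might report it (hypothetical structure)."""
    sort_value: str  # value of the sort field, sent along with the hit
    node: str        # which node produced the hit
    doc_id: int      # document number local to that node


def merge_sorted_hits(per_node_hits: Sequence[List[Hit]], wanted: int) -> List[Hit]:
    """Merge per-node hit lists, each already sorted by sort_value.

    Only the values carried in the hits are compared, so the merging node
    never needs the other nodes' field data in memory.
    """
    merged = heapq.merge(*per_node_hits, key=lambda hit: hit.sort_value)
    return list(islice(merged, wanted))


# Any node can play the requester role: it gathers the lists and merges them.
node_a = [Hit("apple", "a", 3), Hit("cherry", "a", 1)]
node_b = [Hit("banana", "b", 7), Hit("date", "b", 2)]
top = merge_sorted_hits([node_a, node_b], wanted=3)
```

Under these assumptions the extra cost is a few bytes per hit on the wire,
and each node reads field values only for the hits it actually returns.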
Nathan Kurz
nate at verse.com