[KinoSearch] getting back to mmap
Nathan Kurz
nate at verse.com
Tue Apr 29 12:57:09 PDT 2008
On Tue, Apr 29, 2008 at 11:51 AM, Marvin Humphrey
<marvin at rectangular.com> wrote:
> On Apr 28, 2008, at 6:37 PM, Nathan Kurz wrote:
> > We've been off the mmap theme for a while, but I think it's still very
> > relevant to KinoSearch.
> >
>
> Thing is, we never resolved the problems of how to balance data
> compression, file sizes over 4 gb, mmap, portability, and Posting's
> architecture. There is Lucene's MMapDirectory as a point of reference. But
> all that's better discussed on-list.

To my recollection, portability to Windows is the only real problem
with this approach. On Linux (and I presume OS X) mmap already
underlies the existing file system. All one is doing is stripping out
some unnecessary copies and duplication of effort between KinoSearch
and the system. The goal would be to have C structures with
elements pointing directly to the system buffers, and to let the
system handle all the paging and buffering issues. Compression would
be dealt with one Posting at a time (efficient to do it at the last
minute to keep things in processor cache) and memory limits aren't a
problem to my knowledge (if the file system allows we can handle it
too).

It's quite possible that this can be done on Windows as well, using
some other technique I'm not familiar with. But despite the
similarity in name, I don't think that MMapDirectory is particularly
relevant, as I don't think Java allows direct pointers to system
buffers. I think they are merely using mmap to open the files, and
then doing all the copies in userland, which doesn't buy much.
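
To make the "pointers directly into system buffers" idea concrete, here
is a minimal POSIX sketch. MappedFile, mapped_file_open, read_vint and
decode_doc_ids are hypothetical names, not KinoSearch API, and the
delta-encoded varint layout is only assumed for illustration. The file
is mapped once, and each doc id is decompressed at the last minute,
straight out of the page cache, with no intermediate userland buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical view of a mapped postings file; not a KinoSearch type. */
typedef struct {
    const uint8_t *base;  /* start of the mapping (backed by the page cache) */
    size_t         len;   /* mapped length in bytes */
} MappedFile;

/* Map a whole file read-only. Returns 0 on success, -1 on failure. */
static int mapped_file_open(MappedFile *mf, const char *path) {
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) { close(fd); return -1; }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED) return -1;
    mf->base = p;
    mf->len  = (size_t)st.st_size;
    return 0;
}

static void mapped_file_close(MappedFile *mf) {
    munmap((void *)mf->base, mf->len);
    mf->base = NULL;
    mf->len  = 0;
}

/* Decode one varint (7 data bits per byte, high bit means "more follows")
 * starting at *pos. Returns 0 on success, -1 if the data is truncated. */
static int read_vint(const MappedFile *mf, size_t *pos, uint32_t *out) {
    uint32_t v = 0;
    int shift = 0;
    while (*pos < mf->len && shift < 35) {
        uint8_t b = mf->base[(*pos)++];
        v |= (uint32_t)(b & 0x7F) << shift;
        if (!(b & 0x80)) { *out = v; return 0; }
        shift += 7;
    }
    return -1;
}

/* Decode up to max delta-encoded doc ids directly from the mapping.
 * Returns the number of ids written to ids[]. */
static size_t decode_doc_ids(const MappedFile *mf, uint32_t *ids, size_t max) {
    assert(mf->base != NULL);
    size_t pos = 0, n = 0;
    uint32_t doc = 0, delta;
    while (n < max && read_vint(mf, &pos, &delta) == 0) {
        doc += delta;      /* deltas accumulate into absolute doc ids */
        ids[n++] = doc;
    }
    return n;
}
```

The only copy here is from the mapped page into a register while
decoding; paging and read-ahead are left entirely to the kernel.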
The goal wouldn't be to change over to using all mmap'ed IO, but rather just
to design the file structures and internal APIs so it is possible to use
them in a more efficient fashion. From what I can tell, you are
already moving in this direction, making the lower levels (scorers)
less aware of the upper ones (posting lists). Then at some point,
someone (me once I have time?) would write a Linux-specific mmap
version that we can test for performance. If it works, we can
integrate it better with the core. Then ideally, someone with more
knowledge of Windows internals can do something parallel.
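
One way the "make it possible, don't require it" design could look is
sketched below. Everything here is an assumption for illustration:
ByteSource, byte_source_open and sum_bytes are invented names, and
sum_bytes merely stands in for a scorer. The consumer asks for a window
of bytes and never learns whether it was copied with pread() or points
straight into a mapping, so an mmap backend can be swapped in and
benchmarked without touching the callers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical interface, not KinoSearch API. */
typedef struct ByteSource ByteSource;
struct ByteSource {
    const uint8_t *(*window)(ByteSource *self, size_t off, size_t len);
    void (*destroy)(ByteSource *self);
    int      fd;
    size_t   file_len;
    uint8_t *buf;      /* copy buffer (pread variant only) */
    size_t   buf_cap;
    uint8_t *map;      /* mapping base (mmap variant only) */
};

static int in_range(const ByteSource *s, size_t off, size_t len) {
    return len > 0 && off <= s->file_len && len <= s->file_len - off;
}

/* Portable variant: copy the requested range into a private buffer. */
static const uint8_t *pread_window(ByteSource *s, size_t off, size_t len) {
    if (!in_range(s, off, len)) return NULL;
    if (len > s->buf_cap) {
        uint8_t *nb = realloc(s->buf, len);
        if (!nb) return NULL;
        s->buf = nb;
        s->buf_cap = len;
    }
    size_t got = 0;
    while (got < len) {
        ssize_t n = pread(s->fd, s->buf + got, len - got, (off_t)(off + got));
        if (n <= 0) return NULL;
        got += (size_t)n;
    }
    return s->buf;
}

/* mmap variant: no copy, just a pointer into the mapping. */
static const uint8_t *mmap_window(ByteSource *s, size_t off, size_t len) {
    return in_range(s, off, len) ? s->map + off : NULL;
}

static void pread_destroy(ByteSource *s) { free(s->buf); close(s->fd); }
static void mmap_destroy(ByteSource *s)  { munmap(s->map, s->file_len); close(s->fd); }

static int byte_source_open(ByteSource *s, const char *path, int use_mmap) {
    struct stat st;
    s->buf = NULL; s->buf_cap = 0; s->map = NULL;
    s->fd = open(path, O_RDONLY);
    if (s->fd < 0) return -1;
    if (fstat(s->fd, &st) < 0 || st.st_size <= 0) { close(s->fd); return -1; }
    s->file_len = (size_t)st.st_size;
    if (use_mmap) {
        void *p = mmap(NULL, s->file_len, PROT_READ, MAP_PRIVATE, s->fd, 0);
        if (p == MAP_FAILED) { close(s->fd); return -1; }
        s->map = p;
        s->window = mmap_window;
        s->destroy = mmap_destroy;
    } else {
        s->window = pread_window;
        s->destroy = pread_destroy;
    }
    return 0;
}

/* Stand-in for a scorer: it only ever sees a pointer and a length. */
static int sum_bytes(ByteSource *s, size_t off, size_t len, uint32_t *out) {
    assert(s != NULL && s->window != NULL);
    const uint8_t *p = s->window(s, off, len);
    if (!p) return -1;
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) sum += p[i];
    *out = sum;
    return 0;
}
```

A Windows backend would presumably be a third pair of window/destroy
functions behind the same struct.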
> You may be interested in an ongoing dialog between Mike McCandless and
> myself on java-dev at lucene.apache.org about PostingList and the postings file
> format. There's some stuff in there about phrase scorers, too. In addition
> to many other contributions to Lucene such as the lockless-commits file
> format innovation, Mike's applied a bunch of concepts from KS.
> http://www.nabble.com/Pooling-of-posting-objects-in-DocumentsWriter-tt16565743.html#a16596031

Thanks! I only read through it quickly, but there are a lot of good
ideas there (most of which flew over my head).

Given my comments above about working with the file system, a few of
the parts about bulk reads, buffered writes, and disk seeks made me
cringe a little. As the architecture article mentioned, 'flushing'
to 'disk' when your 'memory' is full might not work as well as one
hopes. I'd guess that one could do something simpler but just as
efficient by just calling 'write' each time, and letting the system
decide when to commit the least recently used page to the physical
disk. And I'd bet that (at least on Linux) one could do something
considerably more efficient by using a file backed mmap and letting
the system handle the details. Unfortunately, I won't be able to back
that up with code any time soon. :(
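
For reference, the two alternatives mentioned above might look roughly
like this on a POSIX system. This is a sketch only, with no performance
claims attached: write_via_write and write_via_mmap are hypothetical
helper names, the whole output size is assumed to be known up front,
and the sync_now flag is an invented knob to show that msync() is
optional when the kernel is trusted to write dirty pages back on its
own schedule.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Plain write(): hand the bytes to the kernel immediately and let it
 * decide when the dirty pages reach the physical disk. */
static int write_via_write(const char *path, const void *data, size_t len) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    const char *p = data;
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);
        if (n <= 0) { close(fd); return -1; }
        done += (size_t)n;
    }
    return close(fd);
}

/* File-backed mmap: size the file, map it shared, and store into it.
 * The stores land in the page cache; there is no user-level buffer. */
static int write_via_mmap(const char *path, const void *data, size_t len,
                          int sync_now) {
    assert(len > 0);  /* mmap rejects zero-length mappings */
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    if (ftruncate(fd, (off_t)len) < 0) { close(fd); return -1; }
    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping keeps the file open internally */
    if (map == MAP_FAILED) return -1;
    memcpy(map, data, len);
    /* Only force write-back when the caller needs durability right now. */
    int rc = sync_now ? msync(map, len, MS_SYNC) : 0;
    if (munmap(map, len) < 0) rc = -1;
    return rc;
}
```

A real indexer would grow the file in chunks rather than knowing the
final size in advance; that bookkeeping is left out here.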
Nathan Kurz
nate at verse.com
_______________________________________________
KinoSearch mailing list
KinoSearch at rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch