[KinoSearch] Problem during delete_docs_by_term

Marvin Humphrey marvin at rectangular.com
Sun Dec 21 15:06:53 PST 2008


Matt Williamson:

> I managed to index around 140000 files, but then I hit a problem with the
> following message:
>
>   Couldn't open file '/home/qatest/kinosearch/invindex/_10613.f0': File
>   exists
>
>   eval {...} called at index.pl line 79

This seems to indicate that a previous indexing run wasn't completed.  Is it
possible that sometimes you perform actions with the InvIndexer, either
deleting or adding a document, but don't call finish() ?  The maint branch of
KS won't recover gracefully under such circumstances.  Is your eval wrapper
hiding a crash?

I've snooped the relevant code in InvIndexer and SegWriter, and I can't see
any holes so long as finish() gets called.  The segment gets started when you
either add or delete a document, which changes the InvIndexer's internal state
to INITIALIZED.  Once it is in that state, calling InvIndexer->finish() triggers a call
to SegWriter->finish(), and at that point there's no avoiding the cleanup
logic.
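
To check whether the eval is swallowing a failure, something along these
lines might help.  It is only a sketch: MockIndexer is a hypothetical stand-in
so that the snippet runs without KinoSearch installed, run_batch() is a made-up
helper name, and only add_doc() and finish() are modeled.  The point is that
the error in $@ gets reported, and that you can tell whether finish() was
ever reached.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real indexer (not the KinoSearch API).
package MockIndexer;

sub new { return bless { docs => 0, finished => 0 }, shift }

sub add_doc {
    my ( $self, $doc ) = @_;
    # Simulate a crash on a problematic document.
    die "bad document\n" unless defined $doc->{content};
    $self->{docs}++;
}

sub finish { $_[0]{finished} = 1 }

package main;

# Add a batch of docs and finish.  Returns ( $finished, $error ) so the
# caller can see a failure instead of having the eval hide it.
sub run_batch {
    my ( $indexer, @docs ) = @_;
    my $ok = eval {
        $indexer->add_doc($_) for @docs;
        $indexer->finish;
        1;
    };
    my $err = $ok ? '' : ( $@ || "unknown error\n" );
    warn "indexing failed before finish(): $err" if $err;
    return ( $indexer->{finished}, $err );
}

# A clean batch reaches finish().
my ( $good_finished, $good_err )
    = run_batch( MockIndexer->new, { content => 'a' }, { content => 'b' } );

# A batch with a bad document dies inside the eval and never reaches finish().
my ( $bad_finished, $bad_err )
    = run_batch( MockIndexer->new, { content => 'a' }, { content => undef } );
```

If index.pl logs something like the warning above shortly before the "File
exists" error shows up, that would point at an unfinished indexing run.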

This isn't a scaling problem - there's nothing special about the number
140,000.  The problem arises because an old segment-in-progress isn't getting
cleaned up.  Something is peculiar about the document being processed, or about
the behavior of index.pl, at around that time.

Marvin Humphrey




