[KinoSearch] Problem during delete_docs_by_term
Marvin Humphrey
marvin at rectangular.com
Sun Dec 21 15:06:53 PST 2008
Matt Williamson:
> I managed to index around 140000 files, but then I hit a problem with the
> following message:
>
> Couldn't open file '/home/qatest/kinosearch/invindex/_10613.f0': File
> exists
> eval {...} called at index.pl line 79
This seems to indicate that a previous indexing run wasn't completed. Is it
possible that sometimes you perform actions with the InvIndexer, either
deleting or adding a document, but don't call finish()? The maint branch of
KS won't recover gracefully under such circumstances. Is your eval wrapper
hiding a crash?
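To make the eval question concrete, here is a minimal sketch of a wrapper that reports a failure instead of swallowing it. StubIndexer and run_session are hypothetical names invented for this illustration; the stub stands in for the real indexer so the snippet runs without KinoSearch installed, and it is not how index.pl is known to be written.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the indexer (an assumption for this sketch).
# It only tracks how many docs were added and whether finish() was reached.
package StubIndexer;
sub new { return bless { added => 0, finished => 0 }, shift }

sub add_doc {
    my ( $self, $doc ) = @_;
    die "cannot index '$doc'\n" if $doc =~ /bad/;
    $self->{added}++;
}
sub finish { $_[0]{finished} = 1 }

package main;

# Run one indexing session. Returns '' on success, the error text otherwise.
sub run_session {
    my ( $indexer, @docs ) = @_;
    my $ok = eval {
        $indexer->add_doc($_) for @docs;
        $indexer->finish;    # only reached if every add_doc succeeded
        1;
    };
    return '' if $ok;
    my $err = $@ || 'unknown error';
    warn "indexing session aborted before finish(): $err";
    return $err;
}

my $good = StubIndexer->new;
my $err1 = run_session( $good, 'a.txt', 'b.txt' );

my $bad  = StubIndexer->new;
my $err2 = run_session( $bad, 'a.txt', 'bad.txt', 'c.txt' );
```

The point of the pattern is that a die inside the eval skips finish(), so the caller has to look at the return value (or at $@) to notice that the session never completed.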
I've snooped the relevant code in InvIndexer and SegWriter, and I can't see
any holes so long as finish() gets called. The segment gets started when you
either add or delete a document, which changes the InvIndexer's internal state
to INITIALIZED. Once the InvIndexer is in that state, calling
InvIndexer->finish() triggers a call to SegWriter->finish(), and at that point
there's no avoiding the cleanup logic.
This isn't a scaling problem; there's nothing special about the number
140,000. The problem arises because an old segment-in-progress isn't getting
cleaned up. Something is peculiar about the document being processed, or about
the behavior of index.pl, at around that time.
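For illustration, a leftover file can produce this kind of "File exists" failure whenever a new file is opened in create-exclusive mode. The sketch below uses only core Perl modules. It assumes, without confirmation from the KinoSearch source, that the segment files are opened this way; the file name '_1.f0' and the helper open_new_file are made up for the example.

```perl
use strict;
use warnings;
use Fcntl qw( O_CREAT O_EXCL O_WRONLY );
use File::Temp qw( tempdir );
use File::Spec;

my $dir  = tempdir( CLEANUP => 1 );
my $path = File::Spec->catfile( $dir, '_1.f0' );    # hypothetical segment file

# Open a file that must not already exist. Returns '' on success,
# or an error message on failure.
sub open_new_file {
    my ($file) = @_;
    sysopen( my $fh, $file, O_CREAT | O_EXCL | O_WRONLY )
        or return "Couldn't open file '$file': $!";
    close $fh;
    return '';
}

my $first  = open_new_file($path);    # run 1 starts a segment, never cleans up
my $second = open_new_file($path);    # run 2 picks the same name and fails
unlink $path;                         # the cleanup step that was skipped
my $third  = open_new_file($path);    # with the leftover gone, this succeeds
```

Under that assumption, the failure depends only on whether a stale file from an unfinished run is still present, not on how many documents are in the index.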
Marvin Humphrey