[KinoSearch] _write_postings hanging in _02
Chris Nandor
projects at pudge.net
Wed Mar 7 20:08:11 PST 2007
At 19:35 -0800 2007.03.07, Marvin Humphrey wrote:
>Thanks for the report. I've reproduced and identified the problem,
>and am working on a fix.
Great, thanks.
>Your narrative brings up a related issue, recently uncovered. As
>currently implemented, delete_by_term() only operates on documents
>which were already in the index before the indexing session started.
...
>My inclination is to document the method's actual behavior, but I
>don't think that's enough -- the name delete_by_term suggests a
>certain behavior (I'm thinking SQL DELETE with a WHERE clause) and
>it's bad design to have it do something subtly different. Perhaps
>renaming the method to something more descriptive, like
>"delete_existing" would help.
>
>I haven't decided what to do yet. Thoughts?
Well, for now, you can just document it. :-) I can't think of any easy
way to change that behavior offhand, and there are more important issues to
deal with.
I am not sure if this can actually come up with our code. It probably can,
so I need to take a closer look.
Our code keeps a queue of changes to index, so I can run through the "new"
index items first, then look at the "modified" and "deleted" items and skip
any whose ID is also in the "new" list; those get picked up on the next run.
Just a few lines of code, probably.
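A minimal sketch of that ordering, assuming the queue is a list of
(change_type, id) pairs. The function name plan_run and the tuple layout are
hypothetical, and Python is used only for illustration; this is not the
actual indexing code:

```python
def plan_run(queue):
    """Split queued changes into work for this run and work to defer.

    `queue` is assumed to be a list of (change_type, doc_id) tuples, where
    change_type is "new", "modified", or "deleted" (hypothetical layout).
    """
    # IDs that are being added during this indexing session.
    new_ids = {doc_id for kind, doc_id in queue if kind == "new"}

    # "new" items are handled first, in queue order.
    this_run = [(kind, doc_id) for kind, doc_id in queue if kind == "new"]
    deferred = []

    for kind, doc_id in queue:
        if kind == "new":
            continue
        if doc_id in new_ids:
            # A delete or modify cannot see a document added in the same
            # session, so hold it back until the next run.
            deferred.append((kind, doc_id))
        else:
            this_run.append((kind, doc_id))

    return this_run, deferred
```

The deferred list would simply be put back on the queue, so that by the next
run the affected documents are already in the index.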
Oh, one more thing: are you planning (*cough*) on a dump_index replacement?
I know the schema is required, which complicates things a bit.
--
Chris Nandor pudge at pobox.com http://pudge.net/
Open Source Technology Group pudge at ostg.com http://ostg.com/