[KinoSearch] Index cleared when using add_doc, delete_by_term, et al.
Darian Anthony Patrick
darian at criticode.com
Tue Mar 31 21:00:15 PDT 2009
> On Tue, Mar 31, 2009 at 08:59:40PM -0400, Darian Anthony Patrick wrote:
>
>> $invindexer->delete_by_term(
>> 'listing_id' => $listing{'listing_id'}
>> );
>
>> to create the new entry. I do this in a loop over all entries in the
>> RSS feed. I'm seeing behavior where occasionally my entire index gets
>> totally blown away with only entries created during the current
>> invocation of my indexing script existing in the index.
>
> What's the analyzer for the listing_id field? My immediate guess is that
> there's a stemmer involved that's normalizing the listing_id, so that the
> delete hits multiple entries instead of just one.
>
I'm using the default KinoSearch::Analysis::PolyAnalyzer for all fields,
including listing_id, constructed from my schema like so:

    sub analyzer {
        return KinoSearch::Analysis::PolyAnalyzer->new(language => 'en');
    }

Here is an example of the contents of listing_id:

    cl-philadelphia-apa-1100510822
    cl-philadelphia-apa-1101384542
    cl-philadelphia-apa-1101378600
    cl-newyork-aap-1101426145
    cl-newyork-aap-1101425002
    cl-newyork-aap-1101408072

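
To make the multi-match concern from the quoted reply concrete, here is a minimal plain-Perl sketch. It is not KinoSearch code. It assumes the analysis chain lowercases the text and splits it on non-word characters; the `tokenize` helper and its `\w+` pattern are hypothetical stand-ins chosen for illustration, and stemming is left out. Under that assumption, each listing_id breaks into several tokens, some of which are shared across listings.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an analysis chain: lowercase, then split into
# word tokens. The \w+ pattern is an assumption made for illustration and is
# not taken from the KinoSearch source. Stemming is omitted.
sub tokenize {
    my ($text) = @_;
    my @tokens = lc($text) =~ /\w+/g;
    return @tokens;
}

# Three of the listing_id values shown above.
my @ids = qw(
    cl-philadelphia-apa-1100510822
    cl-philadelphia-apa-1101384542
    cl-newyork-aap-1101426145
);

# Count how many listing IDs contain each token.
my %ids_per_token;
for my $id (@ids) {
    my %seen;
    $ids_per_token{$_}++ for grep { !$seen{$_}++ } tokenize($id);
}

# Report the tokens in alphabetical order.
for my $token ( sort keys %ids_per_token ) {
    printf "%-14s appears in %d of %d ids\n",
        $token, $ids_per_token{$token}, scalar @ids;
}
```

If the field really is tokenized along these lines, a term such as `cl` would match every listing, which is one way a delete could hit more documents than intended. Whether `delete_by_term` analyzes its argument or treats it as a single exact term is not established by this sketch.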
>> Is this a known issue? Is this what "Fix delete_by_term behavior after
>> add_doc is called in InvIndexer" means on
>
> That's something else. It's that when you do this...
>
> $invindexer->add_doc( { content => 'foo' } );
> $invindexer->delete_by_term( content => 'foo' );
> $invindexer->finish;
>
> ... the doc you just added doesn't get deleted.

Ah, thanks for clarifying that point.
--
Darian Anthony Patrick, Criticode LLC
Office: (215) 789-9956
Facsimile: (866) 789-2992
XMPP/SMTP: darian at criticode.com
Web: http://criticode.com
=================================================
BCF1 E7AD 15AD 8A99 F613 AF5F 2A9C C45C F580 E087
=================================================