[KinoSearch] Index cleared when using add_doc, delete_by_term, et al.

Darian Anthony Patrick darian at criticode.com
Tue Mar 31 21:00:15 PDT 2009


> On Tue, Mar 31, 2009 at 08:59:40PM -0400, Darian Anthony Patrick wrote:
> 
>> $invindexer->delete_by_term(
>> 	'listing_id' => $listing{'listing_id'}
>> );
> 
>> to create the new entry.  I do this in a loop over all entries in the 
>> RSS feed.  Occasionally my entire index gets wiped, leaving only the 
>> entries created during the current invocation of my indexing script.
> 
> What's the analyzer for the listing_id field?  My immediate guess is that
> there's a stemmer involved that's normalizing the listing_id, so that the
> delete hits multiple entries instead of just one.
> 

I'm using the default KinoSearch::Analysis::PolyAnalyzer for all fields, 
including listing_id, constructed from my schema like so:

sub analyzer {
   return KinoSearch::Analysis::PolyAnalyzer->new(language => 'en');
}

Here is an example of the contents of listing_id:

cl-philadelphia-apa-1100510822
cl-philadelphia-apa-1101384542
cl-philadelphia-apa-1101378600
cl-newyork-aap-1101426145
cl-newyork-aap-1101425002
cl-newyork-aap-1101408072
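
To illustrate why the analyzer question matters for values shaped like these, here is a rough, self-contained sketch in plain Perl. It does not call KinoSearch at all. The tokenize() helper is hypothetical and assumes a tokenizer that splits on runs of word characters and lowercases them; the actual PolyAnalyzer chain also includes a stemmer, which is not modeled here. Whether this is what causes the index to be cleared is not established in this thread.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the tokenizing stage of an analyzer chain.
# ASSUMPTION: tokens are runs of word characters, lowercased. Stemming,
# which the real PolyAnalyzer also performs, is not modeled.
sub tokenize {
    my ($text) = @_;
    return map { lc $_ } $text =~ /(\w+)/g;
}

my @listing_ids = qw(
    cl-philadelphia-apa-1100510822
    cl-philadelphia-apa-1101384542
    cl-newyork-aap-1101426145
);

# Count how many listing_ids contain each token.
my %ids_per_token;
for my $id (@listing_ids) {
    my %seen;
    $ids_per_token{$_}++ for grep { !$seen{$_}++ } tokenize($id);
}

# Report the tokens, showing which ones are shared between ids.
for my $token ( sort keys %ids_per_token ) {
    printf "%-14s appears in %d of %d ids\n",
        $token, $ids_per_token{$token}, scalar @listing_ids;
}
```

Under that assumption, each hyphen-separated piece becomes its own token, so a piece such as "cl" is shared by every id and only the numeric suffix is unique. If a delete were to match on individual tokens rather than on the whole listing_id string, it could hit many documents at once.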

>> Is this a known issue?  Is this what "Fix delete_by_term behavior after 
>> add_doc is called in InvIndexer" means on 
> 
> That's something else.  It's that when you do this...
> 
>   $invindexer->add_doc( { content => 'foo' } );
>   $invindexer->delete_by_term( content => 'foo' );
>   $invindexer->finish;
> 
> ... the doc you just added doesn't get deleted.

Ah, thanks for clarifying that point.

-- 
Darian Anthony Patrick, Criticode LLC
Office:     (215) 789-9956
Facsimile:  (866) 789-2992
XMPP/SMTP:  darian at criticode.com
Web:        http://criticode.com
=================================================
BCF1 E7AD 15AD 8A99 F613 AF5F 2A9C C45C F580 E087
=================================================


