[KinoSearch] Index cleared when using add_doc, delete_by_term, et al.

Marvin Humphrey marvin at rectangular.com
Tue Mar 31 21:17:29 PDT 2009


On Wed, Apr 01, 2009 at 12:00:15AM -0400, Darian Anthony Patrick wrote:
> > What's the analyzer for the listing_id field?  My immediate guess is that
> > there's a stemmer involved that's normalizing the listing_id and so that the
> > delete hits multiple entries instead of just one.
> 
> I'm using the default KinoSearch::Analysis::PolyAnalyzer for all fields, 
> including listing_id, constructed from my schema like so:
> 
> sub analyzer {
>    return KinoSearch::Analysis::PolyAnalyzer->new(language => 'en');
> }
> 
> Here is an example of the contents of listing_id:
> 
> cl-philadelphia-apa-1100510822

Because of that PolyAnalyzer, when you do this:

    $invindexer->delete_by_term(
        field => 'listing_id',
        term  => 'cl-philadelphia-apa-1100510822'
    );

You're actually deleting everything that contains 'cl', because the analyzer
splits the ID at the hyphens.
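
To see how that ID gets broken up, here is a rough sketch in plain Perl of
what lowercasing and tokenizing do to it. It does not call KinoSearch; the
rough_tokens helper and its regex are assumptions chosen for illustration,
and the stemming stage is left out entirely.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the lowercase + tokenize stages of an
# analyzer chain. The pattern is an assumption, not KinoSearch's code.
sub rough_tokens {
    my ($text) = @_;

    # Runs of word characters become tokens; hyphens act as separators.
    return lc($text) =~ /\w+(?:'\w+)*/g;
}

my @tokens = rough_tokens('cl-philadelphia-apa-1100510822');
print join( ' | ', @tokens ), "\n";
# prints: cl | philadelphia | apa | 1100510822
```

Every listing_id with the same prefix yields the same leading 'cl' token, so
a delete that matches on that token is not limited to one document.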

The solution is to turn off analysis for that field.

    package UnAnalyzed;
    use base qw( KinoSearch::FieldSpec::TextField );
    sub analyzed { 0 }    # skip the analyzer; index the value as one term

    package MySchema;
    use base qw( KinoSearch::Schema );

    our %fields = (
        content    => 'text',
        title      => 'text',
        listing_id => 'UnAnalyzed',
    );

Best,

Marvin Humphrey



