[KinoSearch] Incremental indexing and deletion.

Marvin Humphrey marvin at rectangular.com
Sun Dec 21 15:12:00 PST 2008


Marcus Ramberg:

> As you can see on line 95 http://github.com/marcusramberg/mojomojo/tree/master/lib/MojoMojo/Model/Search.pm#L95
>   I delete docs by term before reindexing it. However, for each page
> save the term count increases by one for each word in the document. Am
> I misunderstanding something about kinosearch?

Here's the code in question:

    my $term = KinoSearch::Index::Term->new(path => $fixed_path );
    $index->delete_docs_by_term( $term );

The problem is that the text in the index has been processed by your default
Analyzer, a PolyAnalyzer, but the term text you are submitting to
delete_docs_by_term has not.  Since the string you are looking up differs from
the strings in the index, the lookup finds no documents and nothing gets
deleted.  The old copy of the page stays in the index alongside the newly
added one, which is why the term counts grow with each save.
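To make the mismatch concrete, here is a rough sketch in plain Perl.  It does
not use KinoSearch at all: toy_analyze() and the %postings hash are
hypothetical stand-ins for the analyzer and the index, and the real
PolyAnalyzer does more than lowercase and split.  The point is only that the
raw path string never appears as a term once the field has been analyzed.

```perl
use strict;
use warnings;

# Toy stand-in for an analyzer: lowercase, then split on non-word characters.
# This is NOT KinoSearch's PolyAnalyzer, only an illustration of the idea.
sub toy_analyze {
    my ($text) = @_;
    return grep { length } split /\W+/, lc $text;
}

# Toy inverted index: term text => { doc id => 1 }.
my %postings;

# Index a document's path the way an analyzed field would be indexed.
sub add_doc {
    my ( $doc_id, $path ) = @_;
    $postings{$_}{$doc_id} = 1 for toy_analyze($path);
}

# Delete by exact term text, with no analysis applied to the argument.
# Returns the number of documents that matched.
sub delete_by_raw_term {
    my ($term_text) = @_;
    my $docs = delete $postings{$term_text};
    return $docs ? scalar keys %$docs : 0;
}

add_doc( 1, '/Docs/Install' );

# The index holds 'docs' and 'install', not '/Docs/Install'.
my $deleted = delete_by_raw_term('/Docs/Install');
print "deleted: $deleted\n";                                # deleted: 0
print "terms: ", join( ', ', sort keys %postings ), "\n";   # terms: docs, install
```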

The solution is to turn off analysis for the "path" field:

  $invindexer->spec_field(
     name     => 'path',
     analyzed => 0,
  );

You probably want to do the same with your "date" field as well.
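As a sketch of why this helps, again in plain Perl rather than KinoSearch:
the %field_spec hash and the helper subs below are made up for illustration,
and only loosely mirror what analyzed => 0 does.  An unanalyzed field stores
the whole value as a single term, so an exact-string lookup finds it.

```perl
use strict;
use warnings;

# Hypothetical toy analyzer, standing in for the real one.
sub toy_analyze {
    my ($text) = @_;
    return grep { length } split /\W+/, lc $text;
}

# Toy per-field settings, loosely mirroring analyzed => 0 for "path".
my %field_spec = (
    path    => { analyzed => 0 },
    content => { analyzed => 1 },
);

# Toy inverted index: field => term text => { doc id => 1 }.
my %postings;

# Analyzed fields are broken into terms; unanalyzed fields are stored whole.
sub add_field {
    my ( $doc_id, $field, $value ) = @_;
    my @terms = $field_spec{$field}{analyzed} ? toy_analyze($value) : ($value);
    $postings{$field}{$_}{$doc_id} = 1 for @terms;
}

# Return ids of docs whose field contains exactly this term text.
sub docs_for_term {
    my ( $field, $term_text ) = @_;
    return sort keys %{ $postings{$field}{$term_text} || {} };
}

add_field( 1, 'path',    '/Docs/Install' );
add_field( 1, 'content', 'How to Install' );

my @by_path    = docs_for_term( 'path',    '/Docs/Install' );
my @by_content = docs_for_term( 'content', 'How to Install' );
print "path matches: ",    scalar(@by_path),    "\n";   # path matches: 1
print "content matches: ", scalar(@by_content), "\n";   # content matches: 0
```

The raw string matches in the unanalyzed "path" field but not in the analyzed
"content" field, which is the same situation as the failed deletion.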

FYI, in the devel branch delete_docs_by_term() has been replaced by
delete_by_term(), which has subtly different behavior: "If C<field> is
associated with an analyzer, C<term> will be processed automatically (so don't
pre-process it yourself)."

  http://xrl.us/o3z6c (Link to www.rectangular.com)

Marvin Humphrey





