[KinoSearch] Another error

Marvin Humphrey marvin at rectangular.com
Thu Jun 21 16:17:35 PDT 2007


On Jun 21, 2007, at 3:25 PM, Chris Nandor wrote:

> Can't locate object method "get_size" via package
> "KinoSearch::Index::MultiLexicon" at
> /usr/local/lib/perl5/site_perl/5.8.4/darwin-2level/KinoSearch/Search/RangeFilter.pm
> line 159, <GEN0> line 1.

Mmf.  OK, no big deal.  This is much easier to solve than the last  
one you threw my way.  :)

A Lexicon's "size" is the number of terms it holds.  We can't know  
the size of a MultiLexicon until we've iterated over the entire thing  
once.  We can know the number of terms each SegLexicon in the  
MultiLexicon holds, but we don't know how many terms overlap.  The  
iterator uses a PriorityQueue which checks for duplicates, though, so  
if we start at the top and count how many times
Lex_Next(multi_lexicon) returns true, we have the size.
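
To make the counting concrete, here is a minimal sketch of the idea, not the KinoSearch code. It assumes each segment's terms are already sorted and unique within that segment, and it uses a plain linear scan over the segment heads where the real iterator uses a PriorityQueue. SegTerms, MAX_SEGS and count_unique_terms are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SEGS 16   /* arbitrary cap for this sketch */

/* Hypothetical stand-in for a SegLexicon: a sorted array of unique terms. */
typedef struct {
    const char **terms;
    size_t       num_terms;
} SegTerms;

/* Count distinct terms across all segments by merging the sorted lists.
 * A term present in several segments is counted once. */
size_t
count_unique_terms(const SegTerms *segs, size_t num_segs)
{
    size_t cursor[MAX_SEGS] = {0};
    size_t count = 0;

    assert(num_segs <= MAX_SEGS);

    for (;;) {
        const char *least = NULL;
        size_t i;

        /* Find the smallest term among the current segment heads. */
        for (i = 0; i < num_segs; i++) {
            if (cursor[i] < segs[i].num_terms) {
                const char *cand = segs[i].terms[cursor[i]];
                if (least == NULL || strcmp(cand, least) < 0)
                    least = cand;
            }
        }
        if (least == NULL)
            break;  /* every segment is exhausted */

        /* Advance every segment whose head equals that term, so that
         * duplicates across segments collapse into a single step. */
        for (i = 0; i < num_segs; i++) {
            if (cursor[i] < segs[i].num_terms
                && strcmp(segs[i].terms[cursor[i]], least) == 0)
                cursor[i]++;
        }
        count++;
    }
    return count;
}
```

Each pass through the outer loop plays the role of one successful Lex_Next call, so the final count is the MultiLexicon's size.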

Fortunately, by this point, we'll have already performed that  
iteration -- during the call to build_sort_cache.  What we need to do  
is add a self->size member var to the MultiLexicon struct, then set  
it to self->term_num as soon as the iteration finishes in  
MultiLex_build_sort_cache.
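
Roughly, the pattern is "iterate once, then remember the count".  The sketch below is a toy under stated assumptions, not the real MultiLex_build_sort_cache: ToyMultiLex, its source array and toy_build_sort_cache are hypothetical, and term_num is assumed here to equal the number of terms seen once the loop ends.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for the real struct; only the fields discussed here. */
typedef struct {
    const char **source;      /* hypothetical: already-deduplicated terms */
    size_t       source_len;
    const char **lex_cache;   /* NULL until the cache has been built */
    int          term_num;    /* advances as the iteration proceeds */
    int          size;        /* valid only once lex_cache is filled */
} ToyMultiLex;

/* Iterate once over the terms, fill the cache, and record the count.
 * Returns 0 on success, -1 if allocation fails. */
int
toy_build_sort_cache(ToyMultiLex *self)
{
    size_t slots = self->source_len ? self->source_len : 1;
    const char **cache = malloc(slots * sizeof *cache);
    if (cache == NULL)
        return -1;

    self->term_num = 0;
    while ((size_t)self->term_num < self->source_len) {
        cache[self->term_num] = self->source[self->term_num];
        self->term_num++;
    }

    self->lex_cache = cache;
    self->size      = self->term_num;  /* the one-line addition */
    return 0;
}
```

The only part that matters is the assignment to size right after the loop; everything else is scaffolding to make the sketch compile.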

The actual accessor will look like this:

   i32_t
   MultiLex_get_size(MultiLexicon *self)
   {
       if (self->lex_cache == NULL)
           CONFESS("Can't call MultiLex_get_size unless cache filled");
       return self->size;
   }

We should add a Lex_Get_Size abstract accessor to Lexicon.c/h, along  
with an XS hook in Lexicon.pm which both SegLexicon and MultiLexicon  
will inherit.  We should zap the current XS hook in SegLexicon.pm and  
replace it with an implementation of Lex_Get_Size in SegLexicon.c/h.
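
For illustration only, here is one way an abstract accessor with two implementations can be laid out in plain C, using a table of function pointers.  This is a sketch and not KinoSearch's actual object system: LexiconVTable, SegLex, MultiLex and lex_get_size are hypothetical names, and the toy returns -1 for an unfilled cache where the real accessor would CONFESS.

```c
typedef struct Lexicon Lexicon;

/* Hypothetical method table shared by all Lexicon subclasses. */
typedef struct {
    int (*get_size)(Lexicon *self);
} LexiconVTable;

struct Lexicon {
    const LexiconVTable *vtable;
};

/* A single segment knows its term count up front. */
typedef struct {
    Lexicon base;        /* must be the first member for the casts below */
    int     num_terms;
} SegLex;

/* A multi-segment lexicon only knows its size once the cache is built. */
typedef struct {
    Lexicon base;
    int     cache_filled;
    int     size;
} MultiLex;

static int
seg_get_size(Lexicon *self)
{
    return ((SegLex *)self)->num_terms;
}

static int
multi_get_size(Lexicon *self)
{
    MultiLex *multi = (MultiLex *)self;
    return multi->cache_filled ? multi->size : -1;
}

static const LexiconVTable SEG_VTABLE   = { seg_get_size };
static const LexiconVTable MULTI_VTABLE = { multi_get_size };

/* The single entry point a binding layer would call. */
int
lex_get_size(Lexicon *self)
{
    return self->vtable->get_size(self);
}
```

With this shape, one hook on the base class is enough, since the call dispatches to whichever implementation the object carries.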

I have a deadline tomorrow, so I don't think I'll get to adding this  
code and the accompanying tests before the weekend.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/




