[KinoSearch] Scorer::doc

Marvin Humphrey marvin at rectangular.com
Sat Mar 1 14:26:53 PST 2008


On Mar 1, 2008, at 1:09 PM, Father Chrysostomos wrote:

> When is a Scorer’s doc method called?

Scorer::doc really means "get_doc_num".  It is supposed to return the  
same document number that next() just returned.

    while ( my $doc_num = $scorer->next ) {
        die "broken scorer" unless $scorer->get_doc_num == $doc_num;
    }

The method name came from Lucene, and I've been thinking of changing it  
for the sake of stylistic consistency with all the other KS getters.   
This bug report settles it: had the method been named get_doc_num, it  
would have been plain that the WildCardQuery recipe was incorrect.

[What will soon be] Scorer::get_doc_num gets called all the time  
internally.  For instance, when one subscorer within an ORScorer has a  
very high document number, that subscorer needs to stay on its  
document number until all the other subscorers catch up, over multiple  
calls to ORScorer_Next.  In the meantime the ORScorer has to be able to  
ask the subscorer where it is without advancing it.
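
The catch-up behavior can be sketched in plain Perl.  This is only an
illustration, not the actual ORScorer code: ToyScorer and ToyORScorer
are hypothetical packages invented for this example, and the real
implementation lives in C.  The point is that the subscorer parked on
doc 40 gets asked for its doc number several times while the other
one works its way up.

```perl
use strict;
use warnings;

# Hypothetical toy scorer: iterates over a sorted list of doc numbers.
package ToyScorer;

sub new {
    my ( $class, @doc_nums ) = @_;
    return bless { doc_nums => [@doc_nums], tick => -1 }, $class;
}

# Advance to the next doc; return its number, or 0 when exhausted.
sub next {
    my $self = shift;
    my $tick = ++$self->{tick};
    return 0 if $tick >= @{ $self->{doc_nums} };
    return $self->{doc_nums}[$tick];
}

# Return the doc number that next() most recently returned, without
# advancing.  Returns 0 before the first next() or after exhaustion.
sub get_doc_num {
    my $self = shift;
    my $tick = $self->{tick};
    return 0 if $tick < 0 or $tick >= @{ $self->{doc_nums} };
    return $self->{doc_nums}[$tick];
}

# Hypothetical toy OR scorer: matches any doc matched by a subscorer.
package ToyORScorer;

sub new {
    my ( $class, @subscorers ) = @_;
    # Prime each subscorer so get_doc_num() is meaningful; drop empty ones.
    my @live = grep { $_->next } @subscorers;
    return bless { subscorers => \@live, doc_num => 0 }, $class;
}

sub next {
    my $self = shift;
    my $last = $self->{doc_num};

    # Advance only the subscorers sitting on the doc already returned.
    # A subscorer parked on a higher doc number is left where it is.
    for my $subscorer ( @{ $self->{subscorers} } ) {
        $subscorer->next if $subscorer->get_doc_num == $last;
    }

    # Discard exhausted subscorers, then report the lowest current doc.
    my @live = grep { $_->get_doc_num } @{ $self->{subscorers} };
    $self->{subscorers} = \@live;
    return $self->{doc_num} = 0 unless @live;
    my ($lowest) = sort { $a <=> $b } map { $_->get_doc_num } @live;
    return $self->{doc_num} = $lowest;
}

sub get_doc_num { return $_[0]{doc_num} }

package main;

my $or_scorer = ToyORScorer->new(
    ToyScorer->new( 2, 3, 50 ),
    ToyScorer->new(40),
);
my @hits;
while ( my $doc_num = $or_scorer->next ) {
    die "broken scorer" unless $or_scorer->get_doc_num == $doc_num;
    push @hits, $doc_num;
}
print "@hits\n";    # 2 3 40 50
```

The linear scan over subscorers keeps the sketch short; it makes no
claim about how the real ORScorer organizes them.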

It used to be the case that Scorer_Next returned a boolean.  It was  
only after I implemented Nathan's suggestion of having document  
numbers start at 1 that it was possible to have Scorer_Next() return  
the document number and have that double as a boolean.
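
A small sketch of why the numbering matters, using hypothetical
helpers (make_next and drain are made up for this example and are not
part of KinoSearch).  If doc numbers started at 0, the first valid
document would look false and a while loop would quit before seeing
anything.

```perl
use strict;
use warnings;

# Hypothetical helper: build an iterator that returns each doc number
# in turn, then 0 once the list is exhausted.
sub make_next {
    my @doc_nums = @_;
    my $tick     = -1;
    return sub {
        $tick++;
        return $tick < @doc_nums ? $doc_nums[$tick] : 0;
    };
}

# Collect everything the iterator yields before it returns a false value.
sub drain {
    my $next = shift;
    my @seen;
    while ( my $doc_num = $next->() ) {
        push @seen, $doc_num;
    }
    return @seen;
}

# Doc numbers starting at 1: every doc is a true value.
my @one_based = drain( make_next( 1, 2, 3 ) );

# Doc numbers starting at 0: doc 0 is false, so the loop ends at once.
my @zero_based = drain( make_next( 0, 1, 2 ) );

print scalar(@one_based), " vs ", scalar(@zero_based), "\n";    # 3 vs 0
```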

> In the example in the cookbook, the first call to ‘next’ returns the  
> first doc number, while ‘doc’ returns the doc number that the *next*  
> call to ‘next’ will return. This doesn’t make sense to me. Is there  
> a mistake in the cookbook?

Yeah, I botched it, sorry.  Fixed by r3093, patch pasted below.
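
To see the difference in isolation, here is a stripped-down sketch
using closures instead of the recipe's inside-out object.  The helper
names are hypothetical, and the shape of doc() is an assumption (it
just looks up the doc number at the current tick); only the tick
handling in next() mirrors the patch.

```perl
use strict;
use warnings;

# Assumed shape of doc(): the doc number at the current tick, or 0 if
# the tick is out of range.
sub current_doc {
    my ( $tick, $doc_nums ) = @_;
    return 0 if $tick < 0 or $tick >= @$doc_nums;
    return $doc_nums->[$tick];
}

# Old behavior: tick starts at 0 and next() post-increments, so after
# next() returns, the tick already points at the following doc.
sub make_old_scorer {
    my @doc_nums = @_;
    my $tick     = 0;
    my $next     = sub {
        return 0 if $tick >= @doc_nums;
        return $doc_nums[ $tick++ ];
    };
    my $doc = sub { return current_doc( $tick, \@doc_nums ) };
    return ( $next, $doc );
}

# Patched behavior: tick starts at -1 and next() pre-increments, so
# the tick stays on the doc that next() just returned.
sub make_patched_scorer {
    my @doc_nums = @_;
    my $tick     = -1;
    my $next     = sub {
        my $current = ++$tick;
        return 0 if $current >= @doc_nums;
        return $doc_nums[$current];
    };
    my $doc = sub { return current_doc( $tick, \@doc_nums ) };
    return ( $next, $doc );
}

my ( $old_next, $old_doc ) = make_old_scorer( 3, 7, 9 );
my $old_first   = $old_next->();
my $old_doc_num = $old_doc->();

my ( $new_next, $new_doc ) = make_patched_scorer( 3, 7, 9 );
my $new_first   = $new_next->();
my $new_doc_num = $new_doc->();

print "old: next=$old_first doc=$old_doc_num\n";    # old: next=3 doc=7
print "new: next=$new_first doc=$new_doc_num\n";    # new: next=3 doc=3
```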

(I'd really like a way of running code samples in documentation.  And  
a pony.)

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


Modified: trunk/perl/lib/KinoSearch/Docs/Cookbook/WildCardQuery.pod
===================================================================
--- trunk/perl/lib/KinoSearch/Docs/Cookbook/WildCardQuery.pod	2008-03-01 21:04:02 UTC (rev 3092)
+++ trunk/perl/lib/KinoSearch/Docs/Cookbook/WildCardQuery.pod	2008-03-01 22:34:50 UTC (rev 3093)
@@ -185,7 +185,7 @@
         my @doc_nums = sort { $a <=> $b } keys %all_doc_nums;
         $doc_nums{$id} = \@doc_nums;

-        $tick{$id}     = 0;
+        $tick{$id}     = -1;
         $tally{$id}    = KinoSearch::Search::Tally->new;
         $tally{$id}->set_score(1.0);    # fixed score of 1.0

@@ -208,8 +208,9 @@
         my $self     = shift;
         my $id       = refaddr($self);
         my $doc_nums = $doc_nums{$id};
-        return 0 if $tick{$id} >= scalar @$doc_nums;
-        return $doc_nums[ $tick{$id}++ ];
+        my $tick     = ++$tick{$id};
+        return 0 if $tick >= scalar @$doc_nums;
+        return $doc_nums[$tick];
     }

next() advances the Scorer to the next valid matching doc.  In this  
example,



