[KinoSearch] Omitted results for different num_wanted values

Nick Wellnhofer wellnhofer at aevum.de
Thu Jul 2 08:01:41 PDT 2009


Marvin Humphrey wrote:
> OK.  Clearly that shouldn't be happening, and we have tests in the test suite
> to confirm that (t/511-sort-spec.t)... it seems they aren't rigorous
> enough...
> 
> In the absence of a failing test case, I have some questions:
> 
>   * Is this index built all at once, or modified incrementally?
>   * Are you using a SortSpec?
>   * If you're sorting by score (the default), what are the scores of the hits?
>   * Any deletions?

The index is built in one go, with no subsequent additions or deletions. I'm not using a SortSpec, but I also tried a SortSpec sorting by score and doc_id, with the same result.

Here are the doc_ids and scores for a query with num_wanted => 20:

12126: 9.36577701568604
10623: 9.26733779907227
9592: 6.62579917907715
10686: 3.88833498954773
7776: 3.88633751869202
8081: 3.60492777824402
10923: 0.136016055941582 ***
11107: 0.136016055941582
9881: 0.118667937815189
10136: 0.108812846243382
9158: 0.078095369040966
10616: 0.0764531493186951
11217: 0.0635384768247604
10563: 0.0588966794312
12129: 0.048088937997818
8701: 0.0340040139853954
12257: 0.0119014047086239

Here are the results for the same query with num_wanted => 10:

12126: 9.36577701568604
10623: 9.26733779907227
9592: 6.62579917907715
10686: 3.88833498954773
7776: 3.88633751869202
8081: 3.60492777824402
11107: 0.136016055941582
9881: 0.118667937815189
10136: 0.108812846243382
9158: 0.078095369040966

Document 10923 (marked *** in the first list) is missing from the second list. It has the same score as document 11107, but this may be a coincidence. In another case, a document with a unique score is missing:

num_wanted => 20

13533: 6.35173892974854
11709: 5.42276954650879
13288: 4.91336631774902
15935: 4.44877099990845
13292: 4.22429084777832
15941: 1.7977180480957
15918: 0.254095643758774 ***
15177: 0.222333684563637
15185: 0.222333684563637
13203: 0.1905717253685
13276: 0.158809781074524
15893: 0.158809781074524
13102: 0.127047821879387
13543: 0.127047821879387

num_wanted => 10

13533: 6.35173892974854
11709: 5.42276954650879
13288: 4.91336631774902
15935: 4.44877099990845
13292: 4.22429084777832
15941: 1.7977180480957
15177: 0.222333684563637
15185: 0.222333684563637
13203: 0.1905717253685
13276: 0.158809781074524
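
The comparison I did by eye above could be automated roughly as follows. This is only a sketch in Python, not KinoSearch code: the function name omitted_hits is made up for illustration, and the hit lists are copied by hand from the first example (only the first ten hits of the num_wanted => 20 run are included). It assumes that the top N hits of a larger request should match the hits of a request for exactly N.

```python
# Hypothetical helper, not part of KinoSearch: report hits that appear in
# the top N of a larger result set but are absent from the N-hit result set.
def omitted_hits(larger, smaller):
    expected = larger[:len(smaller)]           # top N of the larger run
    seen = {doc_id for doc_id, _ in smaller}   # doc_ids actually returned
    return [(doc_id, score) for doc_id, score in expected if doc_id not in seen]

# First ten hits of the num_wanted => 20 run (copied from the first example).
wanted_20 = [
    (12126, 9.36577701568604), (10623, 9.26733779907227),
    (9592, 6.62579917907715), (10686, 3.88833498954773),
    (7776, 3.88633751869202), (8081, 3.60492777824402),
    (10923, 0.136016055941582), (11107, 0.136016055941582),
    (9881, 0.118667937815189), (10136, 0.108812846243382),
]

# All hits of the num_wanted => 10 run for the same query.
wanted_10 = [
    (12126, 9.36577701568604), (10623, 9.26733779907227),
    (9592, 6.62579917907715), (10686, 3.88833498954773),
    (7776, 3.88633751869202), (8081, 3.60492777824402),
    (11107, 0.136016055941582), (9881, 0.118667937815189),
    (10136, 0.108812846243382), (9158, 0.078095369040966),
]

missing = omitted_hits(wanted_20, wanted_10)
print([doc_id for doc_id, _ in missing])  # prints [10923]
```

A check along these lines, run against a small fixed index, might also serve as the basis for the failing test case that is still missing.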


Nick


-- 
aevum gmbh
rumfordstr. 4
80469 münchen
germany

tel: +49 89 3838 0653
http://aevum.de/


