[KinoSearch] Omitted results for different num_wanted values
Nick Wellnhofer
wellnhofer at aevum.de
Thu Jul 2 08:01:41 PDT 2009
Marvin Humphrey wrote:
> OK. Clearly that shouldn't be happening, and we have tests in the test suite
> to confirm that (t/511-sort-spec.t)... it seems they aren't rigorous
> enough...
>
> In the absence of a failing test case, I have some questions:
>
> * Is this index built all at once, or modified incrementally?
> * Are you using a SortSpec?
> * If you're sorting by score (the default), what are the scores of the hits?
> * Any deletions?
The index is built in one go, with no later additions or deletions. I'm not using a SortSpec, but I also tried a SortSpec that sorts by score and doc_id, with the same result.
Here are the doc_ids and scores for a query with num_wanted => 20 (the hit that goes missing below is marked with ***):
12126: 9.36577701568604
10623: 9.26733779907227
9592: 6.62579917907715
10686: 3.88833498954773
7776: 3.88633751869202
8081: 3.60492777824402
10923: 0.136016055941582 ***
11107: 0.136016055941582
9881: 0.118667937815189
10136: 0.108812846243382
9158: 0.078095369040966
10616: 0.0764531493186951
11217: 0.0635384768247604
10563: 0.0588966794312
12129: 0.048088937997818
8701: 0.0340040139853954
12257: 0.0119014047086239
Here are the results for the same query with num_wanted => 10:
12126: 9.36577701568604
10623: 9.26733779907227
9592: 6.62579917907715
10686: 3.88833498954773
7776: 3.88633751869202
8081: 3.60492777824402
11107: 0.136016055941582
9881: 0.118667937815189
10136: 0.108812846243382
9158: 0.078095369040966
Document 10923 is missing. It has the same score as document 11107, but this may be a coincidence. In another case, a document with a unique score is missing:
num_wanted => 20
13533: 6.35173892974854
11709: 5.42276954650879
13288: 4.91336631774902
15935: 4.44877099990845
13292: 4.22429084777832
15941: 1.7977180480957
15918: 0.254095643758774 ***
15177: 0.222333684563637
15185: 0.222333684563637
13203: 0.1905717253685
13276: 0.158809781074524
15893: 0.158809781074524
13102: 0.127047821879387
13543: 0.127047821879387
num_wanted => 10
13533: 6.35173892974854
11709: 5.42276954650879
13288: 4.91336631774902
15935: 4.44877099990845
13292: 4.22429084777832
15941: 1.7977180480957
15177: 0.222333684563637
15185: 0.222333684563637
13203: 0.1905717253685
13276: 0.158809781074524
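
The comparison can also be checked mechanically. The following is only a rough sketch in Python, not KinoSearch code: the helper name missing_from_prefix is made up for illustration, and the doc_id lists are copied by hand from the first pair of listings above. It assumes that the num_wanted => 10 hits should equal the first ten hits of the num_wanted => 20 run, and reports any doc_id from that prefix that the shorter run omits.

```python
def missing_from_prefix(larger, smaller):
    """Return doc_ids in the first len(smaller) hits of `larger`
    that do not appear anywhere in `smaller`.

    Hypothetical helper for illustration; not part of KinoSearch.
    """
    expected = larger[:len(smaller)]
    present = set(smaller)
    return [doc_id for doc_id in expected if doc_id not in present]


# doc_ids in ranked order, copied from the first example above
hits_20 = [12126, 10623, 9592, 10686, 7776, 8081, 10923, 11107, 9881,
           10136, 9158, 10616, 11217, 10563, 12129, 8701, 12257]
hits_10 = [12126, 10623, 9592, 10686, 7776, 8081, 11107, 9881,
           10136, 9158]

print(missing_from_prefix(hits_20, hits_10))  # prints [10923]
```

Running the same check on the second pair of listings reports 15918, the document with the unique score.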
Nick
--
aevum gmbh
rumfordstr. 4
80469 münchen
germany
tel: +49 89 3838 0653
http://aevum.de/