[KinoSearch] Stemming and Term/TermQuery
Marvin Humphrey
marvin at rectangular.com
Tue Aug 14 19:59:10 PDT 2007
On Aug 14, 2007, at 4:19 PM, Evaldas Imbrasas wrote:
> The first series is done using a simple query call:
> my $hits = $searcher->search(query => 'organic');
>
> The second series is done using TermQuery:
> my $term = KinoSearch::Index::Term->new(title => 'organic');
> my $by_title = KinoSearch::Search::TermQuery->new(term => $term);
> my $hits = $searcher->search(query => $by_title);
>
> I expect both series to produce the same results, since there's only
> one field indexed per document.
They will not. The query string passed directly to the Searcher
receives additional processing -- crucially, it is run through an
Analyzer, which stems it. In the first case you are searching for
'organ', which is in the index. In the second, you are searching for
'organic', which is not.
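One way to make the TermQuery version behave like the first is to stem the text yourself before building the Term. The following is only a sketch: it assumes the index was built with an English Snowball stemmer and that Lingua::Stem::Snowball is installed. The variable names $raw and $stemmed are made up for illustration, and the KinoSearch calls in the comments are simply the ones from the quoted code.

```perl
use strict;
use warnings;
use Lingua::Stem::Snowball;

# Assumption: the index-time Analyzer used the English Snowball stemmer.
my $stemmer = Lingua::Stem::Snowball->new( lang => 'en' );

# Lowercase and stem the raw input, roughly as an Analyzer would.
my $raw     = 'organic';
my $stemmed = $stemmer->stem( lc $raw );

print "$raw => $stemmed\n";

# The stemmed text, not the raw text, is what goes into the Term:
#   my $term     = KinoSearch::Index::Term->new( title => $stemmed );
#   my $by_title = KinoSearch::Search::TermQuery->new( term => $term );
#   my $hits     = $searcher->search( query => $by_title );
```

If the Analyzer used at index time does more than lowercase and stem, the manual step has to match that too.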
> However, the output is different for
> some search terms:
>
> Test searches:
>
> cotton: 10 results
> bags: 29 results
> organic: 18 results
> bamboo: 7 results
> clothes: 7 results
>
> Test term searches:
>
> cotton: 10 results
> bags: 0 results
> organic: 0 results
> bamboo: 7 results
> clothes: 0 results
Try out each of these terms at
<http://snowball.tartarus.org/demo.php>. The ones where the stemmed
output is identical to the input produce identical results.
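The same check can be run locally instead of through the demo page. This is a rough sketch, again assuming the English Snowball stemmer via Lingua::Stem::Snowball. The %changed hash is just a helper for this example.

```perl
use strict;
use warnings;
use Lingua::Stem::Snowball;

# Assumption: English stemming, matching what the index was built with.
my $stemmer = Lingua::Stem::Snowball->new( lang => 'en' );

my @terms = qw( cotton bags organic bamboo clothes );
my %changed;

for my $term (@terms) {
    my $stem = $stemmer->stem($term);

    # A term whose stem differs from the input will not match as a raw Term.
    $changed{$term} = $stem ne $term ? 1 : 0;
    printf "%-8s => %-8s %s\n", $term, $stem,
        $changed{$term} ? 'changed' : 'unchanged';
}
```

Terms reported as changed are the ones that need stemming before being handed to a Term.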
PS to the list regarding my continuing absence...
I mentioned in a post a little while ago that for the contract job
I've been working on, testing had begun and the project lead had
left. Testing has gone about as well as we might have expected.
However, the absence of the project lead has made our troubleshooting
significantly less efficient, and I have had to step it up to
compensate as best I can. There is a lot of money at stake for a lot
of people, and I intend to continue pouring 100% of my efforts into
this job until it is certain that we are free and clear. I
appreciate your continuing patience with this pause.
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/