[KinoSearch] newbie: Indexing and searching text not working

Mike Barborak barborak at basikgroup.com
Sat Aug 23 12:22:14 PDT 2008


Hi,

After creating your index with PolyAnalyzer, your body field will have the
terms "short" and "body" but not "short body." Take a look at
KinoSearch::QueryParser::QueryParser as it will likely do what you want.

Best,
Mike

On Sat, Aug 23, 2008 at 2:58 PM, <kinosearch-request at rectangular.com> wrote:

> Date: Sat, 23 Aug 2008 18:31:58 +0530
> From: ram <ram at netcore.co.in>
> Subject: [KinoSearch] newbie: Indexing and searching text not working
> To: kinosearch at rectangular.com
> Message-ID: <1219496518.25503.64.camel at darkstar.netcore.co.in>
> Content-Type: text/plain
>
> Hi list,
>
>   I am a newbie here
> I have written an indexer to index all my mails. I am able to search on
> to & from fields but not search on the actual body
> can someone help me where I am going wrong
>
> I have the script below
>
>
>
>
>
> ---------------------------------------------------------------------------------------------------------------
> use KinoSearch::InvIndexer;
> use KinoSearch::Analysis::PolyAnalyzer;
> use KinoSearch::Searcher;
> use strict;
> #
> # Start on a clean slate
> #
> system("rm -rf /tmp/invindex/*");
>
> my $analyzer   = KinoSearch::Analysis::PolyAnalyzer->new( language =>
> 'en' );
> @gl::headers     = qw(from to cc subject body date reply-to message-id
> in-reply-to filename);
> my $invindexer = KinoSearch::InvIndexer->new(
>    invindex => '/tmp/invindex',
>    create   => 1,
>    analyzer => $analyzer,
>    );
> foreach (@gl::headers) {
>    $invindexer->spec_field( name => $_ ,indexed =>1);
> }
> my $doc = $invindexer->new_doc;
>
>
> my %mail = (
>    'date' => 'Mon, 07 Jan 2008 14:04:35 +0530',
>    'to' => 'myteam at example.com',
>    'subject' => 'subject test here ',
>    'body' => 'This is a very short body here',
>    'cc' => 'ram at example.com',
>    'from' => 'sagar at example.com',
>    'message-id' => '<1199694875.14998.392.camel at sagar.example.com>',
>    'filename'=>'/abc/def'
>    );
> foreach (keys %mail) {
>    next unless($mail{$_});
>    $doc->set_value( $_    => $mail{$_} );
> }
> $invindexer->add_doc($doc);
> $invindexer->finish;
>
>
> $analyzer   = KinoSearch::Analysis::PolyAnalyzer->new( language =>
> 'en' );
> my $searcher = KinoSearch::Searcher->new(
>    invindex => '/tmp/invindex',
>    analyzer => $analyzer,
>    );
>
>
>
> #
> # Search on From  ... works fine
> #
> my $term = KinoSearch::Index::Term->new("from","sagar");
> my $term_query = KinoSearch::Search::TermQuery->new(term => $term);
> my $hits = $searcher->search( query => $term_query );
> while ( my $hit = $hits->fetch_hit_hashref ){
>    print "Found HIT in from" .  $hit->{filename}."\n";
> }
>
>
> #
> # Search on body    ... no results
> #
> my $term = KinoSearch::Index::Term->new("body","short body");
> my $term_query = KinoSearch::Search::TermQuery->new(term => $term);
> my $hits = $searcher->search( query => $term_query );
> while ( my $hit = $hits->fetch_hit_hashref ){
>    print "Found HIT in body" .  $hit->{filename}."\n";
> }
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://rectangular.com/pipermail/kinosearch/attachments/20080823/8aad6b33/attachment-0002.htm 
-------------- next part --------------
_______________________________________________
KinoSearch mailing list
KinoSearch at rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch


More information about the kinosearch mailing list