[KinoSearch] How to easily disable SimpleHTMLEncoder in Highlighter & How to boost field

Marvin Humphrey marvin at rectangular.com
Fri Apr 11 11:12:56 PDT 2008




On Apr 10, 2008, at 11:19 PM, Vladimir Vlach wrote:
> 1) can you help me how to disable SimpleHTMLEncoder?

In 0.20_051, you subclass KinoSearch::Highlight::Encoder and override  
encode():

     package MyEncoder;
     use base qw( KinoSearch::Highlight::Encoder );

     # encode() is called as a method, so the text is the second argument.
     sub encode { my ( $self, $text ) = @_; return $text }

     package main;

     my $highlighter = KinoSearch::Highlight::Highlighter->new(
         encoder => MyEncoder->new,
     );
     ...

For future reference, in svn trunk you subclass Highlighter itself:

     package MyHighlighter;
     use base qw( KinoSearch::Highlight::Highlighter );

     # encode() is called as a method, so the text is the second argument.
     sub encode { my ( $self, $text ) = @_; return $text }

     package main;

     my $highlighter = MyHighlighter->new(
         searchable => $searcher,
         query      => $query,
         field      => 'content',
     );
     ...

> 2) I also have a question, how to boost some fields in the search
> results. For example Title. I know this was discussed before but it's
> related to older version. Do I have to alter Schema or can I boost it
> with Searcher?

It's possible, though it takes some manual work.  Basically, you have
to build a compound query yourself and boost part of it.

However, I think you can solve the problem another way.

> I created Schema/Category.pm but I am not sure if this works. Score is
> still the same.

Note that because you've specified that the field is not analyzed,
only an exact match counts:

>    sub analyzed     { 0 }

Such an exact match is unlikely to be produced by a QueryParser. For
example, if the category was "travel books", and you feed the query  
'travel books' into QueryParser, it will parse that into a search for  
"travel" plus a search for "books", neither of which will match the  
complete value "travel books".

If you change the category field to be analyzed, it will match, and I
bet you'd see a change in the score.  Alternatively, you can change
how you build the query so that it goes after exact matches.
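
To make the mismatch concrete, here is a minimal standalone sketch.  It
does not use KinoSearch at all: tokenize(), index_terms() and
matching_terms() are hypothetical stand-ins for an analyzer, the
indexer and a term lookup, and the real analysis chain may split text
differently.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an analyzer: lowercase, then split on
# non-word characters.
sub tokenize {
    my ($text) = @_;
    return grep { length } split /\W+/, lc $text;
}

# Build the set of indexed terms for one field value.  An unanalyzed
# field is indexed as a single term: the complete value.
sub index_terms {
    my ( $value, $analyzed ) = @_;
    my @terms = $analyzed ? tokenize($value) : ($value);
    return { map { $_ => 1 } @terms };
}

# Count how many of the parsed query terms exist in the field's index.
sub matching_terms {
    my ( $query, $indexed ) = @_;
    return scalar grep { $indexed->{$_} } tokenize($query);
}

my $category   = 'travel books';
my $unanalyzed = index_terms( $category, 0 );
my $analyzed   = index_terms( $category, 1 );

printf "unanalyzed: %d of 2 query terms match\n",
    matching_terms( 'travel books', $unanalyzed );
printf "analyzed:   %d of 2 query terms match\n",
    matching_terms( 'travel books', $analyzed );
```

In this sketch the unanalyzed field holds only the single term "travel
books", so neither "travel" nor "books" finds it, while the analyzed
field matches both.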

> sub boost { 2 }

You might also try a much higher boost, just to verify whether the
boost is having any effect at all.  To keep down index size, boost is
stored with extremely coarse resolution.  Sometimes a sensible boost
value like 2 doesn't cross a threshold and trigger a change in one
particular score.
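
As a rough illustration of how a coarse encoding can swallow a small
boost, here is a standalone sketch.  The encoding is invented for the
example (round down to the nearest power of 4) and the base weight of
0.3 is an arbitrary assumption; this message does not describe how
KinoSearch actually encodes the value.

```perl
use strict;
use warnings;
use POSIX qw( floor );

# Hypothetical encoding, invented for this sketch: round a positive
# value down to the nearest power of 4.  KinoSearch's real encoding
# may differ.
sub quantize {
    my ($value) = @_;
    return 4**floor( log($value) / log(4) );
}

# Hypothetical per-field weight before any boost is applied.
my $base = 0.3;

for my $boost ( 1, 2, 4 ) {
    printf "boost %d: exact %.2f, stored %.2f\n",
        $boost, $base * $boost, quantize( $base * $boost );
}
```

With these assumed numbers, boosts of 1 and 2 are stored as the same
value, and only the boost of 4 lands in a different bucket.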

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/

