[KinoSearch] Subclassable Highlighter

Father Chrysostomos sprout at cpan.org
Thu Feb 7 14:28:55 PST 2008


On Feb 1, 2008, at 4:47 PM, Marvin Humphrey wrote:

>> I did run into a problem with multiple HighlightSpans with the same  
>> start offset. It makes the ( 1 / ( 1 + log($diff) ) ) formula in  
>> HeatMap.pm blow up (log 0). So I’ve added code that eliminates  
>> duplicates, adding the weights together. I’m not sure if this is  
>> how it should be done.
>
> I'll cover this topic in a separate post.

I understand if you are unable to respond yet, but I just wanted to  
point out that, unless I misread the code, the current HeatMap  
implementation, when it calculates the density around a particular  
highlight span, does not take into account the weight of that  
highlight span itself. E.g., if someone searches for ‘the haecceity,’  
the location of the word ‘the’ will be the hottest because it’s near a  
heavy word, whereas ‘haecceity’ will have a lower heat, because its
closest word is merely ‘the.’
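
To make the effect concrete, here is a minimal sketch of the situation. It is not the actual HeatMap.pm code: the span offsets, the weights, and the `neighbour_heat` helper are all invented for illustration. The only assumption is that each span's heat is the sum of the other spans' weights scaled by the 1 / ( 1 + log($diff) ) factor quoted above.

```perl
use strict;
use warnings;

# Hypothetical spans: offsets and weights are made up for this example.
my @spans = (
    { term => 'the',       start => 0, weight => 0.1 },
    { term => 'haecceity', start => 4, weight => 5.0 },
);

# Toy heat calculation (an assumption, not KinoSearch's real code):
# sum the *other* spans' weights, each scaled by 1 / ( 1 + log($diff) ).
# The span's own weight is never added.
sub neighbour_heat {
    my ( $index, $spans ) = @_;
    my $heat = 0;
    for my $other ( 0 .. $#$spans ) {
        next if $other == $index;
        my $diff = abs( $spans->[$other]{start} - $spans->[$index]{start} );
        next if $diff == 0;    # log(0) is a fatal error in Perl
        $heat += $spans->[$other]{weight} / ( 1 + log($diff) );
    }
    return $heat;
}

my @heat = map { neighbour_heat( $_, \@spans ) } 0 .. $#spans;
printf "%-10s %.3f\n", $spans[$_]{term}, $heat[$_] for 0 .. $#spans;
```

Under these assumptions ‘the’ ends up far hotter than ‘haecceity,’ even though ‘haecceity’ carries nearly all of the weight.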

I suggest that we add 1 / ( 1 + log .5 ) * $posits[$loc_index]->get_weight
to the heat for that location. Or maybe use log 1 instead of log .5. The
problem is that $diff for that spot would, in theory, be 0, and log
would blow up on that. So if we adjust it a little it will work. Or is
this too much of a hack?
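
For comparison, here is a small sketch of what the two candidate constants do to a span's own weight. The `self_factor` helper and the weight of 5.0 are hypothetical; only the formula comes from the discussion above. With log .5 the factor is about 3.26, and with log 1 it is exactly 1, so the span's weight is added unchanged. Any stand-in value at or below 1/e (about 0.368) would make the denominator zero or negative, so the choice of constant matters.

```perl
use strict;
use warnings;

# Multiplier applied to a span's own weight when $diff is replaced by a
# small stand-in value instead of the true distance of 0.
sub self_factor {
    my ($stand_in) = @_;
    return 1 / ( 1 + log($stand_in) );
}

my $weight = 5.0;    # hypothetical weight of the span at this location

my $with_half = self_factor(0.5) * $weight;    # factor is roughly 3.26
my $with_one  = self_factor(1) * $weight;      # factor is exactly 1

printf "log .5: %.3f\nlog 1:  %.3f\n", $with_half, $with_one;
```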
