[KinoSearch] fast phrase matching [patch]

Marvin Humphrey marvin at rectangular.com
Mon Sep 10 19:47:05 PDT 2007




On Sep 10, 2007, at 6:05 PM, Nathan Kurz wrote:

> Here's a stab at a faster phrase matcher.  I haven't tested it on
> anything other than an improved version of t/502-phrase-query and
> Matthew's test case from the weekend.

Thanks!  This looks like a very nice patch.  I'll test it and  
presumably commit it after I finish some other work.

> Looking at the object code generated,

This is something I've dabbled in, but would like to pursue in  
earnest.  Can you suggest some links or a course of study to get me  
on my way?

> I think it should be considerably more efficient than the
> current code, but it's also unlikely that the current code is ever a
> bottleneck unless searching for 'and the' in a giant document.

Yes, but... we're about to attempt a generalization of this position  
code.
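
For anyone following along without the source open, the kind of
position matching under discussion can be sketched roughly as below.
This is neither Nathan's patch nor the current KinoSearch code.  The
function name, the signature, and the use of plain sorted arrays of
unsigned positions are all assumptions made for illustration.  It
counts the places where the second term sits exactly `offset`
positions after the first by walking both lists once.

```c
#include <stddef.h>

/* Hypothetical helper, not part of KinoSearch.
 * Count positions p in `first` for which p + offset occurs in `second`.
 * Both arrays are assumed to be sorted in strictly ascending order. */
size_t
count_phrase_matches(const unsigned *first, size_t first_len,
                     const unsigned *second, size_t second_len,
                     unsigned offset)
{
    size_t i = 0;
    size_t j = 0;
    size_t matches = 0;

    while (i < first_len && j < second_len) {
        unsigned target = first[i] + offset;

        if (second[j] < target) {
            j++;            /* second term is too early; advance it */
        }
        else if (second[j] > target) {
            i++;            /* no partner for this first-term position */
        }
        else {
            matches++;      /* terms are exactly `offset` apart */
            i++;
            j++;
        }
    }
    return matches;
}
```

With first-term positions {1, 5, 9}, second-term positions
{2, 7, 10}, and an offset of 1, the walk finds two matches (1/2 and
9/10).  Each list is traversed at most once, so the cost is linear in
the combined number of positions.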

> I don't know that it makes sense as a patch right now. I won't be
> offended if you decide that the current code is more reliable and
> readable.  I pulled out the DEBUG() and ASSERT() statements I used to
> write it, which felt odd.  I also convoluted it to agree with 'gcc
> -pedantic', which I think definitely hurt readability.  I think the
> expanded tests for phrase offsets merit inclusion, though.
>
> What is your opinion on C99 versus -pedantic?

MSVC is a target, and it doesn't support C99.

I really hate the declaration-after-statement limitation in  
particular, but KS needs to be maximally portable.
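
To make the limitation concrete, here is a small sketch written so
that it should compile under 'gcc -std=c89 -pedantic'.  The function
is a made-up example, not KinoSearch code.  In C89 every declaration
has to come before the first statement of its enclosing block; C99
lifts that rule, and gcc can flag violations with
-Wdeclaration-after-statement.

```c
#include <stddef.h>

/* Hypothetical helper, not part of KinoSearch.
 * Return the largest gap between consecutive positions, or 0 if
 * there are fewer than two positions.  Assumes ascending order. */
unsigned
max_gap(const unsigned *positions, size_t len)
{
    /* C89: all declarations precede the first statement of the block,
     * even though neither variable is needed until the loop. */
    size_t   i;
    unsigned best = 0;

    if (len < 2) {
        return 0;
    }

    /* C99 would allow `for (size_t i = 1; ...)` here instead. */
    for (i = 1; i < len; i++) {
        /* A nested block starts a new scope, so declaring `gap` at
         * its top is still valid C89. */
        unsigned gap = positions[i] - positions[i - 1];

        if (gap > best) {
            best = gap;
        }
    }
    return best;
}
```

Opening a nested block just to declare a variable near its first use
is the usual C89 workaround, at some cost in readability.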

> And how do you feel
> about littering the code with assertions and environmentally
> controlled debug statements?

Often, assertions are clarifying.  Other times, they're just paranoid  
-- like checking every dang pointer that could be a NULL.  I see the  
addition of clarifying or important ones as beneficial and would be  
pleased to have them in the code base.
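
As a rough illustration of the distinction, the sketch below pairs one
assertion of the clarifying kind with an environmentally controlled
debug statement.  Everything in it is hypothetical: the function
names, the DEMO_DEBUG environment variable, and the helper itself are
invented for this example and do not come from KinoSearch or from
Nathan's patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical switch: debug output is enabled when the DEMO_DEBUG
 * environment variable is set to something other than "" or "0...". */
static int
debug_enabled(void)
{
    const char *value = getenv("DEMO_DEBUG");
    return value != NULL && value[0] != '\0' && value[0] != '0';
}

/* Return 1 if positions are strictly ascending, 0 otherwise. */
static int
is_ascending(const unsigned *positions, size_t len)
{
    size_t i;

    for (i = 1; i < len; i++) {
        if (positions[i - 1] >= positions[i]) {
            return 0;
        }
    }
    return 1;
}

/* Distance from the first position to the last, or 0 if empty. */
unsigned
position_span(const unsigned *positions, size_t len)
{
    /* Clarifying: the unsigned subtraction below only makes sense if
     * the list is sorted, and this states that for the reader. */
    assert(is_ascending(positions, len));

    if (debug_enabled()) {
        fprintf(stderr, "position_span: len=%lu\n", (unsigned long)len);
    }
    if (len == 0) {
        return 0;
    }
    return positions[len - 1] - positions[0];
}
```

The assertion documents a precondition that is not obvious from the
signature.  Checking `positions != NULL` on every call would fall
closer to the paranoid end.  Compiling with -DNDEBUG removes the
assert() entirely, so the sortedness check costs nothing in release
builds.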

PS: Tabs suck.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/



_______________________________________________
KinoSearch mailing list
KinoSearch at rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch



