[KinoSearch] help with query filter

Marvin Humphrey marvin at rectangular.com
Tue Sep 2 21:28:17 PDT 2008


On Sep 2, 2008, at 1:06 PM, Mike Barborak wrote:

> if I re-index a document (done in a different process and without  
> spec_field being called),

Is there a reason not to call spec_field() when you re-index?  Calling  
it would likely solve the problem.

Calling spec_field() multiple times is fine (and recommended) so long  
as the field definition is always the same.

> What I _think_ is happening is that when I create my index, the  
> filter field is correctly not being analyzed but that when I do the  
> re-index, it is being analyzed and this then is causing an issue.

Yes, it looks like that's right.  The 'analyzed' flag is not stored  
with the index, so it defaults to a true value when the field  
metadata is read back in (FieldInfos->read_infos).  This wouldn't cause  
significant problems for most people because once the data is in the  
index, it doesn't get re-analyzed.  (I can see an esoteric bug with  
Searcher->_prepare_simple_search, but it wouldn't be easy to tickle.)

The workaround is to call spec_field(), with the same field  
definition, every time you re-index.  A fix for maint would involve  
storing the 'analyzed' flag, which would be a little tricky for  
back-compat reasons.
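
A minimal sketch of that workaround, assuming the maint (0.1x)  
InvIndexer API.  The path, the field names ('id', 'content',  
'category'), the document values and the open_invindexer() helper are  
hypothetical, not taken from Mike's code.  The idea is that the  
initial-indexing script and the re-indexing script share one routine,  
so the field definitions cannot drift apart:

```perl
use strict;
use warnings;
use KinoSearch::InvIndexer;
use KinoSearch::Analysis::PolyAnalyzer;
use KinoSearch::Index::Term;

# Hypothetical helper shared by every script that writes to the index.
# It always declares the same fields, so 'analyzed' never falls back to
# the default that is applied when field metadata is loaded from disk.
sub open_invindexer {
    my ( $path, $create ) = @_;
    my $invindexer = KinoSearch::InvIndexer->new(
        invindex => $path,
        create   => $create ? 1 : 0,
        analyzer =>
            KinoSearch::Analysis::PolyAnalyzer->new( language => 'en' ),
    );
    $invindexer->spec_field( name => 'id',       analyzed => 0 );
    $invindexer->spec_field( name => 'content' );
    $invindexer->spec_field( name => 'category', analyzed => 0 );
    return $invindexer;
}

# Re-index one document from a separate process (create is false).
my $invindexer = open_invindexer( '/path/to/invindex', 0 );

# Remove the old copy of the document, then add the new one.
my $term = KinoSearch::Index::Term->new( 'id', 'doc42' );
$invindexer->delete_docs_by_term($term);

my $doc = $invindexer->new_doc;
$doc->set_value( id       => 'doc42' );
$doc->set_value( content  => 'updated body text' );
$doc->set_value( category => 'news' );
$invindexer->add_doc($doc);

$invindexer->finish;
```

Passing a true value as the second argument would give the  
initial-indexing run, with identical spec_field() calls.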

I know the devel branch is not an option for you, but for the record,  
and for anyone else who might be concerned, this problem would not  
affect devel.  There, field definitions are determined by the  
FieldSpec class assigned to the given field name in the Schema, so  
this conflict between loading field definitions from disk and setting  
them via spec_field() wouldn't happen.

> So I'm hoping this rings a bell with anyone in terms of something  
> I'm doing wrong or what the issue might be. If not then I'll work on  
> developing a concise test case to hopefully reproduce what I'm seeing.

Good detective work.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
