[KinoSearch] Numeric range searches, unicode, thread safe

Marvin Humphrey marvin at rectangular.com
Tue Nov 7 06:39:37 PST 2006


On Nov 7, 2006, at 5:38 AM, Stephanie Belton wrote:

> Many thanks for your reply. As it turns out I tried installing
> KinoSearch yesterday and hit a major hurdle (for us at least!), which
> is that Perl 5.8.3+ is required and I am not in a position where we
> can upgrade from 5.8.0, unfortunately.

5.8.3 is required because there were numerous, serious Unicode bugs  
in Perl prior to that.
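
To illustrate the version requirement, here is a minimal sketch of a runtime check against 5.8.3. The helper name perl_is_new_enough is hypothetical and not part of KinoSearch. In practice a plain `use 5.008003;` at the top of a file makes perl refuse to compile it under anything older.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the given (or running) perl is at least 5.8.3.
# $] holds the numeric version of the running perl, e.g. 5.008003 for 5.8.3.
sub perl_is_new_enough {
    my ($version) = @_;
    $version = $] unless defined $version;
    return $version >= 5.008003 ? 1 : 0;
}

print perl_is_new_enough()
    ? "Perl $] meets the 5.8.3 minimum\n"
    : "Perl $] is older than 5.8.3\n";
```

Passing 5.008 (that is, 5.8.0) to the helper returns 0, which is the situation described in the quoted message.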

From perl581delta:

   * ... The pain was intensified by the Unicode implementation of
     Perl 5.8.0 (still) having nasty bugs, especially related to the
     use of s/// and tr///. (Bugs that have been fixed in 5.8.1)

   * If an input filehandle is marked :utf8 and Perl sees illegal UTF-8
     coming in when doing <FH>, if warnings are enabled a warning is
     immediately given - instead of being silent about it and Perl being
     unhappy about the broken data later. (The :encoding(utf8) layer
     also works the same way.)

   * binmode(SOCKET, ":utf8") only worked on the input side, not
     on the output side of the socket. Now it works both ways.

   * In 5.8.0 this

         $some_unicode .= <FH>;

     didn't work correctly but instead corrupted the data. This has
     now been fixed.
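
The append case in the last item is easy to exercise. This is a rough sketch with made-up sample data, assuming a perl built with PerlIO (needed for the in-memory filehandle). It is not a reproduction of the original bug report.

```perl
use strict;
use warnings;

# UTF-8 encoded bytes for "caf\x{e9}\n" (e-acute is C3 A9 in UTF-8).
my $octets = "caf\xC3\xA9\n";

# Read the bytes back through an in-memory filehandle with the :utf8 layer.
open my $fh, '<:utf8', \$octets or die "open failed: $!";

my $some_unicode = "\x{263A} ";   # start with a wide character and a space
$some_unicode .= <$fh>;           # the append that the notes say 5.8.0 corrupted
close $fh;

chomp $some_unicode;
printf "%d characters\n", length $some_unicode;
```

On a fixed perl the result should be six characters: the smiley, a space, and the four characters of the decoded word.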


From perl583delta:

   * Using substr() on a UTF8 string could cause subsequent accesses on
     that string to return garbage. This was due to incorrect UTF8
     offsets being cached, and is now fixed.

   * join() could return garbage when the same join() statement was
     used to process 8 bit data having earlier processed UTF8 data,
     due to the flags on that statement's temporary workspace not being
     reset correctly. This is now fixed.

   * Using Unicode keys with tied hashes should now work correctly.
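
The substr() and join() items can be turned into a quick sanity check. This is only a sketch with made-up sample data. It exercises the same operations the notes describe, and is not known to trigger the original bugs on an affected perl.

```perl
use strict;
use warnings;

# Sample data (made up): the wide character forces UTF-8 storage internally.
my $str = "\x{263A}abc\x{e9}def";

my $first  = substr($str, 1, 3);   # characters 1..3
my $second = substr($str, 5, 3);   # a later access on the same string
my $len    = length $str;

# Run one join() statement on wide-character data first, then on 8-bit data.
my @joined;
for my $parts ([ "\x{263A}", "x" ], [ "\xE9", "y" ]) {
    push @joined, join('-', @$parts);
}

print "substr ok\n" if $first eq 'abc' && $second eq 'def' && $len == 8;
print "join ok\n"   if $joined[0] eq "\x{263A}-x" && $joined[1] eq "\xE9-y";
```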

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/




