[KinoSearch] The Axis of Evil: $&, $`, and $'

Marvin Humphrey marvin at rectangular.com
Thu Mar 15 10:07:28 PST 2007


Greets,

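Anybody who's having index-time performance issues with KinoSearch
should make sure that they don't have a "SawAmpersand" problem: the
appearance of $&, $`, or $' anywhere in your script or its
dependencies will completely destroy the performance of KinoSearch's
Tokenizer, which runs a short regex over a large string many times in
a tight loop.  Once Perl has seen any of those variables, it has to
save a copy of the string being matched after every successful match,
so each pass through that loop pays for copying the large string.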
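
For anyone who wants to picture the kind of loop that gets hit, here
is a minimal sketch.  It is not KinoSearch's actual Tokenizer code;
the tokenize() function and the /\w+/ pattern are stand-ins I made up
for illustration.  It pulls each token out via the @- and @+ offset
arrays, since plain capturing parens rely on the same copying
mechanism for that one pattern.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a tokenizer loop; NOT KinoSearch's real code.
sub tokenize {
    my ($text) = @_;
    my @tokens;

    # Short pattern, long string, many iterations: the worst case
    # when the match variables force a copy on every match.
    while ( $text =~ /\w+/g ) {
        # $-[0] and $+[0] are the start and end offsets of the last
        # successful match, so the token can be extracted with substr.
        push @tokens, substr( $text, $-[0], $+[0] - $-[0] );
    }
    return \@tokens;
}

my $tokens = tokenize("the quick brown fox");
print join( "|", @$tokens ), "\n";    # prints the|quick|brown|fox
```

With a document-sized $text, that while loop runs once per token, so
any per-match copy of $text is multiplied by the number of tokens.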

This has been covered in other forums....

http://www.perlmonks.org/index.pl?node_id=599428
http://www.socialtext.net/stoss/index.cgi?kinosearch_slowdown

... but I've never explicitly brought it up on this list.  The topic  
came up on Perl 5 Porters today because the diagnostics module  
actually uses $&.  :( Needlessly -- Rafael just patched blead. ):
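
Code that leans on the evil vars can usually be rewritten without
them.  Here's a rough sketch of the equivalents, built on the @- and
@+ offset arrays documented in perlvar.  The match_parts() helper is
hypothetical, something I invented for this example; it isn't taken
from the diagnostics patch or from any module.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (prematch, match, postmatch) for the
# first match of $re in $str, or an empty list when nothing matches.
sub match_parts {
    my ( $str, $re ) = @_;
    return unless $str =~ $re;

    my $pre   = substr( $str, 0, $-[0] );                 # same text as $`
    my $match = substr( $str, $-[0], $+[0] - $-[0] );     # same text as $&
    my $post  = substr( $str, $+[0] );                    # same text as $'
    return ( $pre, $match, $post );
}

my ( $pre, $match, $post ) = match_parts( "hello world", qr/o w/ );
print "[$pre][$match][$post]\n";    # prints [hell][o w][orld]
```

The evil vars show up only in the comments there, which Perl never
parses, so the sketch doesn't trip the flag itself.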

I've contemplated issuing a warning from Tokenizer's constructor if  
use of the evil vars gets detected, but I don't think I can get away  
with that.

This problem may or may not be behind the recent posts about indexing
performance, but it's worth mentioning.

To test for the problem yourself, use Devel::SawAmpersand from CPAN.

   use Devel::SawAmpersand qw(sawampersand);

   die "A bullet to the brain is better than a slow, agonizing death"
      if sawampersand();

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/




