[KinoSearch] adding a proximity scorer - Boilerplater

Marvin Humphrey marvin at rectangular.com
Fri Jun 15 17:25:13 PDT 2007


On Jun 15, 2007, at 1:52 PM, Nathan Kurz wrote:

> On 6/15/07, Marvin Humphrey <marvin at rectangular.com> wrote:
>> To subclass OrScorer, create MyOrScorer.c and MyOrScorer.h following
>> the guidelines documented in the POD for devel/lib/BoilerPlater.pm.
>> All you'll need is a constructor and a MyOrScorer_tally function.
>> The rest will inherit.
>
> Got it, and it seems to Build correctly.

Wow, hot damn.

The BoilerPlater stuff came out well, but it wasn't and isn't really  
designed to be a public API.  It arose out of necessity because the  
faked-up inheritance schemes that Dave Balmain was using with Ferret  
and I was using with KS 0.15 were messy and scaled poorly.  The  
design was hashed out last fall on the Lucy developer's list.

> First subclass subquestion:  why do I actually need a constructor?

In theory you could write an XS function that "reblesses" the object  
by pointing its vtable pointer at a different vtable.

   MODULE = KinoSearch  PACKAGE = KinoSearch::Search::MyORScorer

   kino_MyORScorer*
   new(unused, sim, subscorers)
      SV *unused;
      kino_Similarity *sim;
      kino_VArray *subscorers;
   CODE:
      /* cast needed: the parent constructor returns the parent type */
      RETVAL = (kino_MyORScorer*)kino_ORScorer_new(sim, subscorers);
      RETVAL->_ = &KINO_MYORSCORER;    /* <-------- rebless */
   OUTPUT: RETVAL

I've opted never to do that mainly because I want XS code to be  
limited to glue whenever possible.  XS is powerful, but it's nasty  
and esoteric.

Take a look at RichPostingScorer_new() -- it does exactly the same  
thing as that XS function above, but within a dedicated C constructor.
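
Here is a minimal sketch of what such a dedicated C constructor might
look like. Everything in it (the Scorer struct, the two vtables, the
int-returning tally) is a simplified stand-in I made up for
illustration, not the real KinoSearch definitions:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; NOT the real KinoSearch structs. */
typedef struct Scorer Scorer;

typedef struct VTable {
    const char *class_name;
    int (*tally)(Scorer *self);
} VTable;

struct Scorer {
    VTable *_;        /* vtable pointer, named "_" as in the XS above */
    int     num_subs;
};

int ORScorer_tally(Scorer *self)   { return self->num_subs; }
int MyORScorer_tally(Scorer *self) { return self->num_subs * 2; }

VTable ORSCORER   = { "ORScorer",   ORScorer_tally };
VTable MYORSCORER = { "MyORScorer", MyORScorer_tally };

/* Parent constructor: allocates and installs the parent vtable. */
Scorer *ORScorer_new(int num_subs) {
    Scorer *self = malloc(sizeof(Scorer));
    assert(self != NULL);
    self->_        = &ORSCORER;
    self->num_subs = num_subs;
    return self;
}

/* Subclass constructor: let the parent build the object, then
 * point it at the subclass vtable.  No XS involved. */
Scorer *MyORScorer_new(int num_subs) {
    Scorer *self = ORScorer_new(num_subs);
    self->_ = &MYORSCORER;
    return self;
}

/* Method dispatch always goes through the vtable. */
int Scorer_tally(Scorer *self) { return self->_->tally(self); }
```

Note that this only works as long as the subclass adds no member
variables of its own, since the parent constructor allocates the
parent's struct size.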

> Is the class_name field actually used at runtime?

Yes, absolutely.  self->_->class_name is used all over the place,  
particularly when crossing the Perl/C boundary.

Note that class_name is a member of the vtable, and not a member of  
the object struct.  That means we don't have to waste space in every  
object with an extra pointer to the class name -- but also that the  
class name of an object is fixed.
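
To make the layout concrete, here is a rough sketch. The struct
members and the two helper functions are hypothetical names chosen
for the example; only the self->_->class_name access pattern comes
from the actual code:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layout, for illustration only. */
typedef struct VTable {
    const char *class_name;   /* stored once per class */
} VTable;

typedef struct Obj {
    VTable  *_;               /* the only per-object class info */
    unsigned refcount;
} Obj;

VTable SCORER = { "KinoSearch::Search::Scorer" };

/* Look the name up through the vtable: self->_->class_name. */
const char *Obj_class_name(const Obj *self) {
    assert(self != NULL && self->_ != NULL);
    return self->_->class_name;
}

/* Compare an object's class name against a Perl package name,
 * the kind of check that is handy at the Perl/C boundary. */
int Obj_is_class(const Obj *self, const char *package) {
    return strcmp(Obj_class_name(self), package) == 0;
}
```

Every object of a class shares the same string through its vtable,
so the only way to give an object a different class name is to give
it a different vtable.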

> And if I do need a
> constructor, is it possible to piggyback on top of the constructor for
> the class I'm inheriting from?  Something like:
>
>    MyORScorer *self = STEAL(MyORScorer, ORScorer_new(sim, subscorers));
>          (or BLESS, or RECREATE, RECLASS, whatever)

Cool idea, but it would have to look slightly different, because of  
the limitations of C syntax.  It would have to be either a function,  
or a multi-line macro like this:

   #define KINO_RECLASS(var, obj, type, vtable) \
      type* var = (type*)(obj); \
      var->_ = &(vtable)
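
For comparison, here is a sketch of both forms side by side. The
structs, the vtables and the reclass() function are hypothetical,
invented for the example. The macro has to sit where a declaration
is legal, because it expands to a declaration plus a statement; the
function form can be used inside an expression:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; NOT the real KinoSearch definitions. */
typedef struct VTable { const char *class_name; } VTable;
typedef struct ORScorer   { VTable *_; int num_subs; } ORScorer;
typedef struct MyORScorer { VTable *_; int num_subs; } MyORScorer;

VTable ORSCORER   = { "ORScorer" };
VTable MYORSCORER = { "MyORScorer" };

ORScorer *ORScorer_new(int num_subs) {
    ORScorer *self = malloc(sizeof(ORScorer));
    assert(self != NULL);
    self->_        = &ORSCORER;
    self->num_subs = num_subs;
    return self;
}

/* Macro form: declares `var` and swaps the vtable pointer. */
#define KINO_RECLASS(var, obj, type, vtable) \
   type* var = (type*)(obj); \
   var->_ = &(vtable)

/* Function form: relies on the vtable pointer being the first
 * member of every object struct. */
void *reclass(void *obj, VTable *vtable) {
    assert(obj != NULL);
    *(VTable **)obj = vtable;
    return obj;
}

MyORScorer *MyORScorer_new_via_macro(int num_subs) {
    KINO_RECLASS(self, ORScorer_new(num_subs), MyORScorer, MYORSCORER);
    return self;
}

MyORScorer *MyORScorer_new_via_func(int num_subs) {
    MyORScorer *self = reclass(ORScorer_new(num_subs), &MYORSCORER);
    return self;
}
```

The function form gives up the type checking on the vtable argument
that a per-class macro or constructor could provide, which is one
reason to prefer writing real constructors.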

> I don't understand the innards of your vtable implementation well
> enough to know quite what this would need to do, but if it is possible
> it seems likely you've already done it somewhere but I just haven't
> found where.

Yes.  Dynamic subclassing is supported via DynVirtualTable and the  
CREATE_SUBCLASS macro defined in MemManager.h.

The implementation is fairly complex, which is unfortunate, because  
it doesn't accomplish very much -- it just allows the object to be  
associated with an arbitrary class name.  :\  The feature is used by  
Schema, FieldSpec and Similarity to allow users to subclass via Perl  
without knowing anything about the underlying C objects.
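
To give a feel for the idea (and only the idea: this is not how
DynVirtualTable or CREATE_SUBCLASS are actually written), a dynamic
vtable can be thought of as a heap-allocated copy of the parent's
vtable with a different class name. All the names below are made up
for the sketch:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; NOT the real KinoSearch definitions. */
typedef struct Obj Obj;

typedef struct VTable {
    const char *class_name;
    int (*method)(Obj *self);   /* one inherited method slot */
} VTable;

struct Obj { VTable *_; };

int Similarity_method(Obj *self) { (void)self; return 1; }

VTable SIMILARITY = { "Similarity", Similarity_method };

/* A vtable created at runtime.  `base` must be the first member so
 * that a SketchDynVTable* can be used wherever a VTable* is. */
typedef struct SketchDynVTable {
    VTable  base;
    char   *name_buf;           /* heap copy of the class name */
} SketchDynVTable;

/* Clone the parent's method slots, but swap in a new class name. */
VTable *SketchDynVTable_new(const VTable *parent, const char *class_name) {
    SketchDynVTable *dyn = malloc(sizeof(SketchDynVTable));
    size_t len = strlen(class_name) + 1;
    assert(dyn != NULL);
    dyn->name_buf = malloc(len);
    assert(dyn->name_buf != NULL);
    memcpy(dyn->name_buf, class_name, len);
    dyn->base            = *parent;
    dyn->base.class_name = dyn->name_buf;
    return &dyn->base;
}

void SketchDynVTable_destroy(VTable *vtable) {
    SketchDynVTable *dyn = (SketchDynVTable *)vtable;
    free(dyn->name_buf);
    free(dyn);
}
```

An object whose vtable pointer is set to the returned vtable behaves
exactly like its parent class, but reports the new class name, which
is all a pure-Perl subclass needs.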

Ideally, our discussion will result in an improvement upon that  
scheme that will allow you to write your ORScorer subclass without  
touching BoilerPlater.  Something like this:

   package MyORScorer;
   use base qw( KinoSearch::Search::ORScorer );

   __PACKAGE__->register_c_method( tally => 'my_tally' );

   use Inline C => <<'END_C';

   kino_Tally*
   my_tally(kino_ORScorer *self) {
       /* ... */
   }

   END_C

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/