[KinoSearch] Orderly global destruction

Marvin Humphrey marvin at rectangular.com
Wed Feb 20 18:43:02 PST 2008


Greets,

This post will be of interest only to a few individuals, but I'm  
sending it to the list so that it ends up in the public record, and  
because it is relevant to the RT issue #32689 "KinoSearch::Simple and  
DESTROY".  It arises out of a phone conversation between Peter Karman  
and me last night.

When you call Devel::Peek::Dump($obj) on a "blessed reference"  
returned by the Perl function bless(), you see two SV structs: an  
outer RV, and the inner object SV:

     $ perl -MDevel::Peek -e 'my $blank; Dump bless \$blank, "Foo"'
     SV = RV(0x81a418) at 0x800d80
       REFCNT = 1
       FLAGS = (TEMP,ROK)
       RV = 0x800e34
       SV = PVMG(0x80e440) at 0x800e34
         REFCNT = 2
         FLAGS = (PADBUSY,PADMY,OBJECT)
         IV = 0
         NV = 0
         PV = 0
         STASH = 0x800f0c	"Foo"

KinoSearch stores only the inner SV, discarding the outer.  It does  
this by first incrementing the refcount of the inner object, then  
decrementing the refcount of the RV wrapper, triggering the RV's  
reclamation without killing off the inner object.  It's kind of  
pointless to create and then immediately destroy the outer RV, but we  
have to do things that way because there are no public Perl C API  
functions that allow for the direct creation of an inner object.

The relevant KS functions are prototyped in trunk/c_src/KinoSearch/ 
Obj.h and implemented in trunk/perl/xs/KinoSearch/Obj.c, but the  
following function illustrates what the code would look like if it  
were collected into one place and unrolled for clarity.  We'll use  
InvIndex as an example class.

     typedef struct InvIndex {
         VirtualTable *_;
         SV           *inner_perl_obj;
         Schema       *schema;
         Folder       *folder;
     } InvIndex;

     InvIndex*
     InvIndex_new(Schema *schema, Folder *folder)
     {
         InvIndex *self = (InvIndex*)malloc(sizeof(InvIndex));
         SV *perl_obj_rv = newSV(0); /* soon to be an RV */

         /* Turn perl_obj_rv into an RV and create an inner Perl object,
          * which right now only the RV can "see".
          */
         sv_setref_pv(perl_obj_rv, "KinoSearch::InvIndex", self);

         /* Copy a pointer to the inner Perl object into the KS obj. */
         self->inner_perl_obj = SvRV(perl_obj_rv);

         /* Increment the refcount of the inner object, so that it won't
          * go away when we decrement the RV.
          */
         SvREFCNT_inc(self->inner_perl_obj);

         /* Decrement the outer RV.  Its refcount falls to 0, and Perl
          * frees it.  Now [self] holds the sole refcount keeping
          * the inner Perl object alive.
          */
         SvREFCNT_dec(perl_obj_rv);

         /* Assign the vtable pointer for the InvIndex class. */
         self->_ = (VirtualTable*)&INVINDEX;

         /* Increment the refcounts of schema and folder and copy
          * pointers into the InvIndex object.
          */
         self->schema = KINO_REFCOUNT_INC(schema);
         self->folder = KINO_REFCOUNT_INC(folder);

         return self;
     }
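
To check the refcount arithmetic without pulling in the Perl headers,
here is a minimal sketch in plain C.  ToySV, toy_new, toy_dec,
toy_make_bare_inner and toy_live_count are names invented for this
illustration; they are not part of Perl or KS, and the model ignores
everything about real SVs except the refcount.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a Perl SV: a refcount plus an optional
 * referent.  A non-NULL referent makes it behave like an RV. */
typedef struct ToySV {
    int           refcnt;
    struct ToySV *referent;
} ToySV;

static int toy_live_count = 0;   /* number of ToySVs not yet freed */

/* Create a ToySV with refcount 1.  If referent is non-NULL, the new
 * ToySV takes over one refcount on it, as an RV would. */
static ToySV *toy_new(ToySV *referent)
{
    ToySV *sv = malloc(sizeof(ToySV));
    if (sv == NULL) abort();
    sv->refcnt   = 1;
    sv->referent = referent;
    toy_live_count++;
    return sv;
}

/* Drop one refcount; at zero, release the referent and free. */
static void toy_dec(ToySV *sv)
{
    if (--sv->refcnt > 0) return;
    if (sv->referent != NULL) toy_dec(sv->referent);
    free(sv);
    toy_live_count--;
}

/* Same sequence of steps as InvIndex_new above, minus the KS parts. */
static ToySV *toy_make_bare_inner(void)
{
    ToySV *inner = toy_new(NULL);   /* refcnt 1, owned by the RV  */
    ToySV *rv    = toy_new(inner);  /* the soon-to-die wrapper    */
    inner->refcnt++;                /* refcnt 2: RV + our pointer */
    toy_dec(rv);                    /* RV freed; inner back to 1  */
    return inner;
}
```

Calling toy_make_bare_inner() should leave exactly one live ToySV with
a refcount of 1, which is the state the real constructor is after.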

There is an esoteric but important reason for doing things this way.

During Perl global destruction, all objects pointed to by RVs are  
reclaimed in a first sweep.  Then later, Perl goes and sweeps over all  
existing SVs, repeatedly decrementing their refcounts until every last  
one gets DESTROYed.

It is essential that all composite KS objects get DESTROYed during the  
*first* sweep.
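
The two phases can be modelled in a few lines of plain C.  This is a
toy under heavy assumptions, not a transcription of sv.c: ToySV,
toy_set, toy_dec, toy_clean_objs, toy_clean_all and the fixed-size
arena are all hypothetical, and DESTROY is reduced to appending a
one-character tag to a log.

```c
#include <stddef.h>

#define ARENA_SIZE 4

/* Hypothetical stand-in for one SV slot in an arena. */
typedef struct ToySV {
    int           refcnt;     /* 0 means the slot is not live    */
    int           is_object;  /* "blessed": destruction is logged */
    struct ToySV *referent;   /* non-NULL makes this an RV        */
    char          tag;        /* label written to the log         */
} ToySV;

static char destroy_log[ARENA_SIZE + 1];
static int  destroy_count = 0;

/* Fill a slot with a live SV holding refcount 1. */
static void toy_set(ToySV *sv, int is_object, ToySV *referent, char tag)
{
    sv->refcnt    = 1;
    sv->is_object = is_object;
    sv->referent  = referent;
    sv->tag       = tag;
}

/* Drop one refcount; at zero, log objects and release the referent. */
static void toy_dec(ToySV *sv)
{
    if (sv->refcnt == 0) return;           /* already reclaimed */
    if (--sv->refcnt > 0) return;
    if (sv->is_object) destroy_log[destroy_count++] = sv->tag;
    if (sv->referent != NULL) {
        ToySV *target = sv->referent;
        sv->referent = NULL;
        toy_dec(target);
    }
}

/* Phase 1: sever every live RV that points at an object. */
static void toy_clean_objs(ToySV *arena, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        ToySV *sv = &arena[i];
        if (sv->refcnt > 0 && sv->referent != NULL
            && sv->referent->is_object) {
            ToySV *target = sv->referent;
            sv->referent = NULL;
            toy_dec(target);
        }
    }
}

/* Phase 2: decrement whatever is still live, in arena order,
 * and repeat until a pass finds nothing left. */
static void toy_clean_all(ToySV *arena, size_t n)
{
    int live;
    do {
        live = 0;
        for (size_t i = 0; i < n; i++) {
            if (arena[i].refcnt > 0) {
                toy_dec(&arena[i]);
                live++;
            }
        }
    } while (live > 0);
}
```

With one RV pointing at object 'A' and a second object 'B' that no RV
points at, toy_clean_objs() logs 'A' and leaves 'B' behind;
toy_clean_all() then reaches 'B' only when its slot comes up in arena
order.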

Consider this snippet, which does nothing but create an InvIndex object:

   use MySchema;
   my $invindex = MySchema->open('/path/to/invindex');

Under the current implementation, at the end of that script there are  
three Perl objects in existence, BUT ONLY ONE RV: $invindex, visible  
from Perl space.  The first sweep will catch this one RV, decrementing  
its refcount, and trigger an orderly cascade of destructor calls.

However, if KS were to keep RVs around instead of only the inner  
object, at the end of that script there would be four RVs in existence:  
one in Perl space, and three in C space.  Because the second phase of  
global destruction is unordered, it's entirely possible that either  
the Schema or the Folder object will get DESTROYed before the  
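InvIndex.  If that happens, InvIndex_destroy will decrement the
refcount of an object that has already been freed, a nasty
use-after-free that can end in a double free: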

     void
     InvIndex_destroy(InvIndex *self)
     {
         REFCOUNT_DEC(self->schema);  /* BAD if self->schema already freed */
         REFCOUNT_DEC(self->folder);
         FREE_OBJ(self);
     }

Fortunately, that can't happen the way things are now.
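
For anyone who wants to see the hazard without corrupting a heap, here
is a toy model in plain C.  ToyObj, toy_refcount_dec, toy_destroy,
toy_sweep and stale_decrements are invented names; instead of calling
free(), the sketch sets a flag so that a decrement on a dead object can
be counted rather than crash.  toy_sweep() forces each object to zero
in the order given, which is a crude stand-in for an unordered sweep
and not how Perl actually does it.

```c
#include <stddef.h>

/* Hypothetical refcounted object with up to two children. */
typedef struct ToyObj {
    int            refcnt;
    int            destroyed;  /* set in place of free(), so the
                                * hazard can be detected safely */
    struct ToyObj *kids[2];    /* children this object holds a
                                * refcount on, or NULL */
} ToyObj;

/* Number of decrements attempted on already-destroyed objects. */
static int stale_decrements = 0;

static void toy_destroy(ToyObj *obj);

static void toy_refcount_dec(ToyObj *obj)
{
    if (obj == NULL) return;
    if (obj->destroyed) {
        /* In real code this would be a read of freed memory. */
        stale_decrements++;
        return;
    }
    if (--obj->refcnt == 0) toy_destroy(obj);
}

/* Same shape as InvIndex_destroy: release children, then self. */
static void toy_destroy(ToyObj *obj)
{
    obj->destroyed = 1;
    toy_refcount_dec(obj->kids[0]);
    toy_refcount_dec(obj->kids[1]);
}

/* Destroy objects in the order given, ignoring who still points at
 * them.  Returns the number of stale decrements that resulted. */
static int toy_sweep(ToyObj **order, size_t n)
{
    stale_decrements = 0;
    for (size_t i = 0; i < n; i++) {
        if (!order[i]->destroyed) {
            order[i]->refcnt = 0;
            toy_destroy(order[i]);
        }
    }
    return stale_decrements;
}
```

Sweeping in the order {invindex, schema, folder} yields 0 stale
decrements, because the parent releases its children itself.  Sweeping
{schema, folder, invindex} yields 2, one for each child the parent
tries to release after the fact.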

Congratulations if you've fought your way through this entire  
message.  If the reasoning seems complicated, well, it sort of is --  
but that's XS for you.  Now you know why there is absolutely no XS any  
more in trunk/c_src, where most of KS is implemented.  :)

(Footnote: On the off chance that someone who reads this message wants  
to browse the relevant portions of the Perl source code, check out  
Perl_destruct in perl.c, and sv_clean_objs/sv_clean_all in sv.c.)

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/

Excerpt from a comment in sv.c:
--------------------------------------------------------------------------

The function visit() scans the SV arenas list, and calls a specified
function for each SV it finds which is still live - ie which has an SvTYPE
other than all 1's, and a non-zero SvREFCNT. visit() is used by the
following functions (specified as [function that calls visit()] / [function
called by visit() for each SV]):

     sv_report_used() / do_report_used()
                         dump all remaining SVs (debugging aid)

     sv_clean_objs() / do_clean_objs(),do_clean_named_objs()
                         Attempt to free all objects pointed to by RVs,
                         and, unless DISABLE_DESTRUCTOR_KLUDGE is defined,
                         try to do the same for all objects indirectly
                         referenced by typeglobs too.  Called once from
                         perl_destruct(), prior to calling sv_clean_all()
                         below.

     sv_clean_all() / do_clean_all()
                         SvREFCNT_dec(sv) each remaining SV, possibly
                         triggering an sv_free(). It also sets the
                         SVf_BREAK flag on the SV to indicate that the
                         refcnt has been artificially lowered, and thus
                         stopping sv_free() from giving spurious warnings
                         about SVs which unexpectedly have a refcnt
                         of zero.  called repeatedly from perl_destruct()
                         until there are no SVs left.






