[KinoSearch] Towards a stable C API... via indirect dispatch

Marvin Humphrey marvin at rectangular.com
Thu Nov 1 11:52:34 PDT 2007




On Oct 30, 2007, at 2:33 PM, Nathan Kurz wrote:

> OK, I think I got it now: the problem comes when methods are added to
> the ancestors of the class one is accessing in the shared library.  Could
> this also be solved by marking those classes as BOIL_FINAL_CLASS()?

In theory, we could add new "final" methods (though I believe we'd  
have to mod Boilerplater's implementation to actually make that  
work).  Method macros for "final" methods are just aliases to actual  
functions, and don't go through the vtable -- so adding new ones  
doesn't require changes to vtable layout.
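
To make the distinction concrete, here is a minimal sketch in C.  It is not
Boilerplater's actual output: the Obj type, the VTable struct and every
function and macro name below are hypothetical, chosen only to show that a
"final" method macro never touches the vtable while a virtual one does.

```c
typedef struct Obj Obj;

/* Hypothetical vtable with a single virtual method slot. */
typedef struct VTable {
    int (*get_num)(Obj *self);
} VTable;

struct Obj {
    VTable *vtable;   /* vtable pointer is the first member */
    int     num;
};

/* Plain functions implementing the two methods. */
int Obj_get_num(Obj *self)    { return self->num; }
int Obj_double_num(Obj *self) { return self->num * 2; }

/* Virtual method macro: looks the function up in the vtable, so it
 * depends on where get_num sits in the VTable layout. */
#define Obj_Get_Num(self)    ((self)->vtable->get_num(self))

/* "Final" method macro: just an alias for the function.  Adding more
 * macros like this one leaves the VTable struct untouched. */
#define Obj_Double_Num(self) Obj_double_num(self)
```

A caller writes Obj_Get_Num(obj) and Obj_Double_Num(obj) the same way; only
the first expands to a lookup through obj->vtable.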

However, we couldn't do any of the following, as they all mess with  
the composition of the vtable:

    * add new virtual methods
    * remove existing virtual methods, even non-public ones
    * change the order of virtual methods
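
A small sketch of why those changes are dangerous, and what indirect
dispatch buys.  Everything here is illustrative: the vtable is modelled as a
bare array of function pointers, all methods share one signature for
simplicity, and the names (OBJ_VTABLE_V2, STALE_GET_NUM_SLOT, the macros)
are made up.  Only the method_OFFSET naming follows the scheme under
discussion.

```c
#include <stddef.h>

typedef struct Obj Obj;
typedef int (*method_t)(Obj *self);   /* simplification: one signature */

struct Obj {
    method_t *vtable;   /* first member */
    int       id;
    int       num;
};

static int get_id_impl(Obj *self)  { return self->id; }
static int get_num_impl(Obj *self) { return self->num; }

/* Hypothetical "version 2" of the library: Get_Id was inserted ahead of
 * Get_Num, so Get_Num moved from slot 0 to slot 1. */
method_t OBJ_VTABLE_V2[] = { get_id_impl, get_num_impl };

/* Slot number baked into an extension that was compiled against
 * "version 1", where Get_Num was the first entry. */
#define STALE_GET_NUM_SLOT 0

/* Slot number published by the library itself and read at call time. */
size_t Obj_Get_Num_OFFSET = 1;

/* Fixed layout: silently calls whatever now lives in the old slot. */
#define Obj_Get_Num_Static(self) \
    ((self)->vtable[STALE_GET_NUM_SLOT](self))

/* Indirect dispatch: follows the method to wherever it moved. */
#define Obj_Get_Num_Indirect(self) \
    ((self)->vtable[Obj_Get_Num_OFFSET](self))
```

With an object whose id is 100 and num is 7, the static macro returns 100
without any error, while the indirect one returns 7.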

> I'm getting there, but I'm still not completely sold on the advantages
> of the indirection you are contemplating.

Well, let me put it this way:

If in order to present a C API, we have to freeze the vtables, there  
won't *be* a C API.

KinoSearch's implementation details are in too much flux.  The API  
and file format are approaching beta status... but an ABI that  
includes frozen vtables is a ways off, and it would be unreasonable  
to bump major versions as frequently as I expect we would need to.   
Perl and CPAN suck when it comes to handling major version changes.

> Along the lines of my previous message, the potential for silently
> wrong behaviour worries me.

Absolutely, that's a major concern of mine, and I hope to wind up  
with something that is *more* reliable rather than less.

Binary compatibility problems with C++ shared libraries are at the  
root of "DLL Hell" on Windows; if MSVC used indirect dispatch, one  
source of ABI compat issues would be removed.  By making the vtables  
dynamic, I hope to avoid accidental incompatibilities between XS KSx  
extensions and KS itself.  I don't want to screw up an update and mod  
a vtable somehow, causing people's apps to start segfaulting.

Some level of safety is gained by initializing all vtable pointers to  
NULL -- so if a bootstrap operation fails, we'll get an immediate  
segfault.  Furthermore, all method_OFFSET vars will get initialized  
to 0, and the first method in every vtable can be set to trigger a  
fatal exception.   This should compensate for the fact that the  
compiler won't be able to tell whether "extern" variables have been  
initialized.
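
Roughly, the safety net might look like the sketch below.  Assumptions are
flagged in the comments: the vtable is a bare array, all methods share one
signature, and trap_is_fatal, trap_hits, Obj_init_vtable and Obj_bootstrap
are hypothetical names that exist only so the behaviour can be observed.
The sketch fills every slot with the trap before bootstrap, which goes a
little further than reserving slot 0 alone.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct Obj Obj;
typedef int (*method_t)(Obj *self);   /* simplification: one signature */

struct Obj {
    method_t *vtable;   /* first member */
    int       num;
};

/* Static storage, so it starts at 0: an offset that bootstrap never set
 * keeps pointing at slot 0. */
size_t Obj_Get_Num_OFFSET;

int trap_is_fatal = 1;   /* hypothetical switch; clear it to observe the trap */
int trap_hits     = 0;

/* Occupies slot 0 permanently, and every other slot until bootstrap. */
static int unbootstrapped_method(Obj *self) {
    (void)self;
    trap_hits++;
    if (trap_is_fatal) {
        fputs("method invoked before bootstrap\n", stderr);
        abort();
    }
    return -1;
}

static int get_num_impl(Obj *self) { return self->num; }

#define OBJ_VTABLE_SIZE 2

void Obj_init_vtable(method_t *vtable) {
    for (size_t i = 0; i < OBJ_VTABLE_SIZE; i++) {
        vtable[i] = unbootstrapped_method;
    }
}

/* Bootstrap installs the real method and publishes its slot. */
void Obj_bootstrap(method_t *vtable) {
    vtable[1] = get_num_impl;
    Obj_Get_Num_OFFSET = 1;
}

#define Obj_Get_Num(self) ((self)->vtable[Obj_Get_Num_OFFSET](self))
```

Calling Obj_Get_Num() before Obj_bootstrap() lands in the trap rather than
in some unrelated method; after bootstrap the same call site reaches
get_num_impl.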

Aside from that, I don't think there's inherently anything less safe  
about the proposed new way of doing things, except that there'll be  
some new code with new bugs.

Ideally, KS objects should be opaque.  *No* struct layout details  
should be described by either the API or the ABI, with the exception  
that method calls will depend on finding a vtable as the first  
element in any object.   It won't be possible to enforce struct  
layout privacy, but it should be possible to set things up so that a  
conscientious author isn't *required* to write code that depends on  
violating it.
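
As a sketch of what that could mean for an extension author (all names are
hypothetical, and the vtable is again simplified to an array of
same-signature function pointers): the public side sees only an incomplete
type, and the single layout fact it relies on is that the object begins
with its vtable pointer.

```c
#include <stddef.h>
#include <stdlib.h>

/* ---- what a public header might expose ---- */
typedef struct Obj Obj;                 /* incomplete type: layout hidden */
typedef int (*method_t)(Obj *self);     /* simplification: one signature */
extern size_t Obj_Get_Num_OFFSET;
Obj  *Obj_new(int num);
void  Obj_destroy(Obj *self);

/* The only layout assumption: the vtable pointer comes first. */
#define OBJ_VTABLE(self)  (*(method_t **)(self))
#define Obj_Get_Num(self) (OBJ_VTABLE(self)[Obj_Get_Num_OFFSET](self))

/* ---- private implementation, invisible to callers ---- */
struct Obj {
    method_t *vtable;   /* must stay the first member */
    int       num;      /* free to change without breaking callers */
};

static int get_num_impl(Obj *self) { return self->num; }

/* Slot 0 is left reserved in this sketch; Get_Num lives in slot 1. */
static method_t OBJ_METHODS[] = { NULL, get_num_impl };
size_t Obj_Get_Num_OFFSET = 1;

Obj *Obj_new(int num) {
    Obj *self = malloc(sizeof *self);
    if (self == NULL) { return NULL; }
    self->vtable = OBJ_METHODS;
    self->num    = num;
    return self;
}

void Obj_destroy(Obj *self) { free(self); }
```

Code written against the top half never names a struct member, so fields
after the vtable pointer can be added, removed or reordered freely.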

> There's a good paper by Ulrich Drepper on shared libraries:
> http://people.redhat.com/drepper/dsohowto.pdf
> I haven't read it in a while, and probably should reread it.  Glancing
> at it again, I note that it has what seems like a very relevant
> section on ABI versioning, as well as a great overview of how ld.so
> actually works.

It's a good paper, though the section on ABI versioning in particular  
doesn't yield much because it focuses on versioning controls  
within .so files, and at the moment, KS versioning happens at the  
Perl level.

> Tangential to this, Drepper has been publishing a fantastic piece
> about optimizing memory access.

Cache management of the OO infrastructure is something I'm trying to  
take into account, as it seems likely that the vtables will be  
accessed constantly.  Right now, all of the vtable pointers and the  
method_OFFSET vars are getting dumped into a single C file, so that  
they'll be concentrated together.   Perl does something similar: the
functions implementing its most frequently executed ops are all in one
file, pp_hot.c.

The vtables themselves will be dynamically allocated.  The classic
way to handle things would be one malloc() per vtable, but I think
what we might try instead is grabbing memory in page-sized chunks, so
that the vtables will be packed together into a few memory pages.
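
One possible shape for that allocator, as a rough sketch only.  CHUNK_SIZE
is an assumed 4096-byte page (real code would ask the OS), plain malloc()
does not guarantee page alignment (posix_memalign() or mmap() would), and
the Arena, Chunk and arena_* names are invented for this example.

```c
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096   /* assumed page size */

/* Conservative alignment unit for anything stored in a chunk. */
typedef union { void *p; void (*fn)(void); long long ll; long double ld; } MaxAlign;
#define ALIGN       sizeof(MaxAlign)
#define ROUND_UP(n) (((n) + ALIGN - 1) / ALIGN * ALIGN)

/* Header at the start of each chunk; vtables are carved out after it. */
typedef struct Chunk {
    struct Chunk *next;
    size_t        used;   /* bytes consumed, header included */
} Chunk;

typedef struct Arena {
    Chunk *head;          /* most recently allocated chunk */
} Arena;

/* Hand out `size` bytes from the current chunk, starting a fresh chunk
 * when the current one is full.  Returns NULL on failure. */
void *arena_alloc(Arena *arena, size_t size) {
    size_t need   = ROUND_UP(size);
    size_t header = ROUND_UP(sizeof(Chunk));
    if (need > CHUNK_SIZE - header) { return NULL; }   /* too big for a chunk */

    Chunk *chunk = arena->head;
    if (chunk == NULL || chunk->used + need > CHUNK_SIZE) {
        chunk = malloc(CHUNK_SIZE);
        if (chunk == NULL) { return NULL; }
        chunk->next = arena->head;
        chunk->used = header;
        arena->head = chunk;
    }
    void *slot = (unsigned char *)chunk + chunk->used;
    chunk->used += need;
    return slot;
}

/* Release every chunk at once, e.g. at shutdown. */
void arena_destroy(Arena *arena) {
    while (arena->head != NULL) {
        Chunk *next = arena->head->next;
        free(arena->head);
        arena->head = next;
    }
}
```

Successive small requests come back adjacent to one another inside the same
chunk, which is the packing effect being aimed for.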

I'm also trying not to spend too much time on the optimization aspect  
of this, since it's unclear how much of a payoff we'll see.  :)
Flexibility, reliability and API stability are all more important  
than the difference between decent optimization and excellent  
optimization.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


