[KinoSearch] Towards a stable C API... via indirect dispatch
Marvin Humphrey
marvin at rectangular.com
Thu Nov 1 11:52:34 PDT 2007
On Oct 30, 2007, at 2:33 PM, Nathan Kurz wrote:
> OK, I think I got it now: the problem comes when methods are added to
> the ancestors of the class one is accessing in the shared library. Could
> this also be solved by marking those classes as BOIL_FINAL_CLASS()?
In theory, we could add new "final" methods (though I believe we'd
have to mod Boilerplater's implementation to actually make that
work). Method macros for "final" methods are just aliases to actual
functions, and don't go through the vtable -- so adding new ones
doesn't require changes to vtable layout.
However, we couldn't do any of the following, as they all mess with
the composition of the vtable:
* add new virtual methods
* remove existing virtual methods, even non-public ones
* change the order of virtual methods
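To make the failure mode concrete, here is a minimal sketch of what happens when an extension has a vtable slot number compiled in and the library later inserts a virtual method. All of the names (get_name, get_boost, vtable_v1, V1_SLOT_GET_SIZE and so on) are made up for illustration and are not Boilerplater output; the "final" macro at the end only mimics the alias idea described above.

```c
#include <assert.h>

typedef int (*method_t)(void);

/* Return codes so we can see which method actually ran. */
enum { GET_NAME = 1, GET_SIZE = 2, GET_BOOST = 3 };

static int get_name(void)  { return GET_NAME; }
static int get_size(void)  { return GET_SIZE; }
static int get_boost(void) { return GET_BOOST; }

/* Hypothetical library version 1: two virtual methods. */
const method_t vtable_v1[] = { get_name, get_size };

/* Hypothetical version 2: a new virtual method lands in the middle. */
const method_t vtable_v2[] = { get_name, get_boost, get_size };

/* Slot number an extension compiled against version 1 has baked in. */
#define V1_SLOT_GET_SIZE 1

/* Static dispatch through the compiled-in slot number. */
int extension_call_get_size(const method_t *vtable) {
    return vtable[V1_SLOT_GET_SIZE]();
}

/* A "final" method macro is just an alias for the function, so it
 * never looks at the vtable and is immune to layout changes. */
#define Final_get_size() get_size()

/* Against the layout it was compiled for, the extension is correct. */
void sanity_check_v1(void) {
    assert(extension_call_get_size(vtable_v1) == GET_SIZE);
}
```

Handing vtable_v2 to extension_call_get_size() silently runs get_boost instead of get_size, with no crash and no warning. That is the "silently wrong" case.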
> I'm getting there, but I'm still not completely sold on the advantages
> of the indirection you are contemplating.
Well, let me put it this way:
If in order to present a C API, we have to freeze the vtables, there
won't *be* a C API.
KinoSearch's implementation details are in too much flux. The API
and file format are approaching beta status... but an ABI that
includes frozen vtables is a ways off, and it would be unreasonable
to bump major versions as frequently as I expect we would need to.
Perl and CPAN suck when it comes to handling major version changes.
> Along the lines of my previous message, the potential for silently
> wrong behaviour worries me.
Absolutely, that's a major concern of mine, and I hope to wind up
with something that is *more* reliable rather than less.
Binary compatibility problems with C++ shared libraries are at the
root of "DLL Hell" on Windows; if MSVC used indirect dispatch, one
source of ABI compat issues would be removed. By making the vtables
dynamic, I hope to avoid accidental incompatibilities between XS KSx
extensions and KS itself. I don't want to screw up an update and mod
a vtable somehow, causing people's apps to start segfaulting.
Some level of safety is gained by initializing all vtable pointers to
NULL -- so if a bootstrap operation fails, we'll get an immediate
segfault. Furthermore, all method_OFFSET vars will get initialized
to 0, and the first method in every vtable can be set to trigger a
fatal exception. This should compensate for the fact that the
compiler won't be able to tell whether "extern" variables have been
initialized.
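A rough sketch of the scheme, to show how the safety nets fit together. This is not the Boilerplater implementation: OBJ_VTABLE, Obj_get_value_OFFSET, method_at() and bootstrap() are names invented for the example, and the vtable is a static here purely to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct Obj Obj;
typedef int (*obj_method_t)(Obj *self);

/* Hypothetical vtable: slot 0 is reserved for the trap. */
typedef struct VTable {
    obj_method_t slots[3];
} VTable;

struct Obj {
    VTable *vtable;   /* first member, so dispatch can always find it */
    int     value;
};

/* These would be "extern" in a header and defined in one C file.
 * They stay NULL / 0 until bootstrap() has run. */
VTable *OBJ_VTABLE              = NULL;
size_t  Obj_get_value_OFFSET    = 0;
size_t  Obj_double_value_OFFSET = 0;

/* Any call made through an offset that is still 0 lands here. */
static int trap_uninitialized(Obj *self) {
    (void)self;
    fprintf(stderr, "method offset used before bootstrap\n");
    abort();
}

static int get_value_impl(Obj *self)    { return self->value; }
static int double_value_impl(Obj *self) { return self->value * 2; }

/* Fetch the function pointer stored `offset` bytes into the vtable.
 * A NULL vtable pointer faults right here. */
static obj_method_t method_at(Obj *self, size_t offset) {
    assert(offset + sizeof(obj_method_t) <= sizeof(VTable));
    return *(obj_method_t *)((char *)self->vtable + offset);
}

/* Method macros: the offset is read at run time, not compiled in.
 * Note that `self` is evaluated twice. */
#define Obj_get_value(self) \
    (method_at((self), Obj_get_value_OFFSET)(self))
#define Obj_double_value(self) \
    (method_at((self), Obj_double_value_OFFSET)(self))

/* The library decides the layout when it loads; callers never
 * hard-code it. */
void bootstrap(void) {
    static VTable vt;
    vt.slots[0] = trap_uninitialized;
    vt.slots[1] = double_value_impl;
    vt.slots[2] = get_value_impl;
    Obj_double_value_OFFSET = offsetof(VTable, slots) + 1 * sizeof(obj_method_t);
    Obj_get_value_OFFSET    = offsetof(VTable, slots) + 2 * sizeof(obj_method_t);
    OBJ_VTABLE = &vt;
}
```

If the library reorders the slots in bootstrap(), code that calls Obj_get_value() keeps working without a recompile, because it only ever reads the offset variable.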
Aside from that, I don't think there's inherently anything less safe
about the proposed new way of doing things, except that there'll be
some new code with new bugs.
Ideally, KS objects should be opaque. *No* struct layout details
should be described by either the API or the ABI, with the exception
that method calls will depend on finding a vtable as the first
element in any object. It won't be possible to enforce struct
layout privacy, but it should be possible to set things up so that a
conscientious author isn't *required* to write code that depends on
violating it.
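One way this could look from the outside, as a sketch only: Doc, VTABLE_OF(), Doc_new() and friends are hypothetical names, and the "header" and "implementation" halves are shown in one file for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what a public header might expose ---- */

typedef struct VTable VTable;   /* opaque */
typedef struct Doc    Doc;      /* opaque: no fields visible to callers */

/* The single layout promise: a vtable pointer comes first. */
#define VTABLE_OF(obj) (*(VTable **)(obj))

Doc        *Doc_new(int id);
int         Doc_get_id(const Doc *self);
void        Doc_destroy(Doc *self);
const char *VTable_class_name(const VTable *vt);

/* ---- private to the library ---- */

struct VTable {
    const char *class_name;
};

struct Doc {
    VTable *vtable;   /* must stay first */
    int     id;       /* free to move, grow, or vanish between releases */
};

static VTable doc_vtable = { "Doc" };

Doc *Doc_new(int id) {
    Doc *self = malloc(sizeof *self);
    if (self == NULL) return NULL;
    self->vtable = &doc_vtable;
    self->id     = id;
    return self;
}

int Doc_get_id(const Doc *self) {
    assert(self != NULL);
    return self->id;
}

void Doc_destroy(Doc *self) { free(self); }

const char *VTable_class_name(const VTable *vt) { return vt->class_name; }
```

An extension author who sticks to the accessor functions and VTABLE_OF() never touches a member directly, so the library can rearrange everything after the first pointer.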
> There's a good paper by Ulrich Drepper on shared libraries:
> http://people.redhat.com/drepper/dsohowto.pdf
> I haven't read it in a while, and probably should reread it. Glancing
> at it again, I note that it has what seems like a very relevant
> section on ABI versioning, as well as a great overview of how ld.so
> actually works.
It's a good paper, though the section on ABI versioning in particular
doesn't yield much because it focuses on versioning controls
within .so files, and at the moment, KS versioning happens at the
Perl level.
> Tangential to this, Drepper has been publishing a fantastic piece
> about optimizing memory access.
Cache management of the OO infrastructure is something I'm trying to
take into account, as it seems likely that the vtables will be
accessed constantly. Right now, all of the vtable pointers and the
method_OFFSET vars are getting dumped into a single C file, so that
they'll be concentrated together. Perl does something similar: its
most frequently accessed op codes are all in one file, pp_hot.c.
The vtables themselves will be dynamically allocated. The classic
way to handle things would be one malloc() per object, but I think
what we might try instead is grabbing memory in page-sized chunks, so
that the vtables will be packed together into a few memory pages.
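Something along these lines, perhaps. This is a bare-bones bump allocator written for this message, not code from KS: VTablePool, CHUNK_SIZE and ALIGNMENT are assumptions, and plain malloc() does not actually guarantee page alignment, so a real version would want an aligned allocation from the OS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096   /* assumed page size; real code would ask the OS */
#define ALIGNMENT  16     /* assumed to be enough for tables of pointers */

typedef struct VTablePool {
    unsigned char *chunk;   /* chunk currently being filled, or NULL */
    size_t         used;    /* bytes handed out from that chunk */
} VTablePool;

/* Carve `size` bytes for one vtable out of the current chunk,
 * starting a fresh chunk when the current one can't fit it. */
void *VTablePool_alloc(VTablePool *pool, size_t size) {
    /* Round up so every vtable starts on an aligned boundary. */
    size = (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    assert(size <= CHUNK_SIZE);

    if (pool->chunk == NULL || pool->used + size > CHUNK_SIZE) {
        unsigned char *fresh = malloc(CHUNK_SIZE);
        if (fresh == NULL) return NULL;
        /* Old chunks are deliberately never freed: vtables are
         * expected to live for the whole life of the process. */
        pool->chunk = fresh;
        pool->used  = 0;
    }

    void *result = pool->chunk + pool->used;
    pool->used += size;
    return result;
}
```

Two 40-byte vtables requested back to back end up 48 bytes apart in the same chunk, rather than wherever separate malloc() calls happen to put them.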
I'm also trying not to spend too much time on the optimization aspect
of this, since it's unclear how much of a payoff we'll see. :)
Flexibility, reliability and API stability are all more important
than the difference between decent optimization and excellent
optimization.
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/