[KinoSearch] Towards a stable C API... via indirect dispatch
Marvin Humphrey
marvin at rectangular.com
Sun Oct 28 19:18:49 PDT 2007
On Oct 28, 2007, at 9:05 AM, Aaron Crane wrote:
> Theoretically, object pointers (including void pointers) and function
> pointers are incommensurate according to the C standard -- you get
> undefined behaviour when you cast between them.
Ah, yes, I'd forgotten that.
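For reference, here is a minimal sketch of the distinction. The C standard does allow converting one function pointer type to another and back again, so a generic function pointer type can hold any method as long as it is cast back to its true type before the call; it is the round trip through an object pointer such as void* that is not guaranteed. The names method_t, add_one and call_via_generic are made up for illustration and are not KinoSearch code.

```c
#include <stddef.h>

/* Generic function pointer type; hypothetical stand-in for kino_method_t. */
typedef void (*method_t)(void);

/* A concrete "method" with its own signature, plus a typedef for it. */
typedef int (*add_one_t)(int);
static int add_one(int x) { return x + 1; }

/* Store the method in a generic slot, then cast back before calling. */
static int call_via_generic(int x)
{
    method_t  slot = (method_t)add_one;  /* function ptr -> function ptr */
    add_one_t meth = (add_one_t)slot;    /* restore the true type */
    return meth(x);                      /* call only through the true type */
}
```

Calling through slot directly, without the cast back, would be undefined, since the pointed-to function's type would not match.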
Presently, the vtables are actual objects themselves, with a
'refcount' member and the whole bit. Having them be objects makes it
easier to implement dynamic subclassing, a feature which is required
by both Schema and FieldSpec, and which may come in handy elsewhere
in the future.
The vtable objects belong to the class
"KinoSearch::Util::VirtualTable". Here's the definition for
KinoSearch::Index::Term's vtable object:

    KINO_TERM_VTABLE KINO_TERM = {
        (KINO_OBJ_VTABLE*)&KINO_VIRTUALTABLE, /* vtable object's vtable */
        1,                                    /* refcount */
        (KINO_OBJ_VTABLE*)&KINO_OBJ,          /* parent */
        "KinoSearch::Index::Term",            /* class name */
        (kino_Obj_clone_t)kino_Term_clone,
        (kino_Obj_destroy_t)kino_Term_destroy,
        (kino_Obj_equals_t)kino_Term_equals,
        (kino_Obj_hash_code_t)kino_Obj_hash_code,
        (kino_Obj_is_a_t)kino_Obj_is_a,
        (kino_Obj_to_string_t)kino_Term_to_string,
        (kino_Obj_serialize_t)kino_Term_serialize,
        (kino_Term_get_field_t)kino_Term_get_field,
        (kino_Term_get_text_t)kino_Term_get_text,
        (kino_Term_copy_t)kino_Term_copy
    };

The first four member variables aren't function pointers, and I'd
kinda sorta been hoping to sneak them into the array somehow. ;) A
fifth member var will actually be needed as well: 'size' (or
something like that), describing the size of the vtable either in
bytes or in array members.
One approach is to keep the vtables as structs, with the last member
a "flexible array" of function pointers:

    typedef struct kino_VTable {
        KINO_OBJ_VTABLE *_;
        chy_u32_t        refcount;
        KINO_OBJ_VTABLE *parent;
        const char      *class_name;
        size_t           size;
        kino_method_t    methods[];
    } kino_VTable;

Flexible arrays are C99, but you can get away with them on C89 if you
declare them to be at least length 1.

    kino_method_t methods[1];

You then take advantage of C's lack of bounds checking to malloc()
enough memory for however many elements you need. :) It's a hack,
but widely portable -- Perl's regex engine depends on it, for example.
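A rough sketch of that allocation, assuming a cut-down vtable struct rather than the real kino_VTable. The names method_t, VTable and VTable_new are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Generic method pointer; hypothetical stand-in for kino_method_t. */
typedef void (*method_t)(void);

/* Cut-down vtable using the C89-friendly length-1 trailing array. */
typedef struct VTable {
    unsigned     refcount;
    const char  *class_name;
    size_t       size;         /* number of usable slots in methods[] */
    method_t     methods[1];   /* really `size` slots long */
} VTable;

/* Allocate a vtable with room for num_methods function pointers.
 * Returns NULL if malloc() fails. */
static VTable *VTable_new(const char *class_name, size_t num_methods)
{
    size_t  bytes = offsetof(VTable, methods)
                  + num_methods * sizeof(method_t);
    VTable *vt;
    size_t  i;

    if (bytes < sizeof(VTable)) {
        bytes = sizeof(VTable);   /* never allocate less than the struct */
    }
    vt = (VTable*)malloc(bytes);
    if (vt == NULL) {
        return NULL;
    }
    vt->refcount   = 1;
    vt->class_name = class_name;
    vt->size       = num_methods;
    for (i = 0; i < num_methods; i++) {
        vt->methods[i] = NULL;    /* slots start out empty */
    }
    return vt;
}
```

Sizing the block with offsetof() rather than sizeof() avoids counting the one declared array slot twice. The caller releases the vtable with a plain free().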
The downside of having the vtable be a struct rather than an array is
that it adds an extra addition op to the process of finding the right
function pointer. Compare the byte offset of the function pointer in
each case:

    array:   method_OFFSET * sizeof(kino_method_t)
    struct:  method_OFFSET * sizeof(kino_method_t) + FIXED_OFFSET

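In C terms, the two lookups might look something like the sketch below. The struct is a stand-in with the same five leading members as the proposed kino_VTable, using placeholder types; lookup_in_array, lookup_in_struct, fixed_offset and demo_method are hypothetical names. FIXED_OFFSET is simply offsetof(VTable, methods), which with five 4-byte members on a 32-bit target would presumably come to 20.

```c
#include <stddef.h>

/* Generic method pointer; hypothetical stand-in for kino_method_t. */
typedef void (*method_t)(void);

/* Stand-in vtable struct: five fixed members, then the methods. */
typedef struct VTable {
    void        *vtable;       /* the vtable object's own vtable */
    unsigned     refcount;
    void        *parent;
    const char  *class_name;
    size_t       size;
    method_t     methods[1];   /* length-1 trailing array */
} VTable;

/* Dummy method so there is something to look up. */
static void demo_method(void) { }

/* Array technique: the object points straight at the methods. */
static method_t lookup_in_array(method_t *methods, size_t method_offset)
{
    return methods[method_offset];
}

/* Struct technique: same indexing, but relative to a struct member,
 * so the address also includes the fixed offset of `methods`. */
static method_t lookup_in_struct(VTable *vt, size_t method_offset)
{
    return vt->methods[method_offset];
}

/* The FIXED_OFFSET term from the expression above, in bytes. */
static size_t fixed_offset(void)
{
    return offsetof(VTable, methods);
}
```

Both functions return the same pointer for the same slot; the only difference is where the base address comes from.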
Here's some AT&T assembler, for code implementing the array technique:

    # %eax register holds method_OFFSET
    # %edx register holds address of "methods" array
    movl    (%edx,%eax,4), %eax

Here's assembler for code using a vtable struct containing a
"methods" array:

    # %eax register holds method_OFFSET
    # %edx register holds vtable struct pointer
    movl    20(%edx,%eax,4), %eax    # <----------- NOTE extra "20"

(To see the whole context, view the attached file "need_meth.s",
which was generated from the attached file "need_meth.c" using the
command "gcc -S -Wall -Os need_meth.c" on an x86 Linux box.)
I'm not sure how much of a penalty you pay for the extra addition op
-- only a benchmark would tell -- but I'm reasonably sure it doesn't
help matters. :)
It seems to me that the only way to get away with using the array
rather than the struct containing the array involves some nasty
casting hacks. Worth it, y'think?
>> Say we remove the Kino_Term_Destroy method... then this code
>> will crash at run-time, because the kino_Term_destroy_OFFSET
>> symbol cannot be resolved:
>>
>> destroy_meth = self->_[kino_Term_destroy_OFFSET];
>>
>> Of course a run-time crash would be bad -- but that just means that
>> we can't redact public methods -- which we wouldn't be doing anyway.
>
> More specifically, the failure would be at link-time, right? Unless
> I'm misunderstanding, code using the new macro will contain a reference
> to the kino_Term_destroy_OFFSET symbol, so the linker should fail when
> trying to resolve uses of that symbol in callers. Of course, assuming
> that most uses of the Kinosearch code rely on a dynamically loaded
> KinoSearch.so (or local equivalent), that turns out to be roughly the
> same thing as run-time anyway.
Exactly.
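To make the dependency concrete, here is a small sketch of dispatch through an offset symbol, following the shape of the self->_[kino_Term_destroy_OFFSET] line quoted above. Everything here is a simplified stand-in (Obj, Term_destroy, Term_destroy_OFFSET, TERM_METHODS, Obj_destroy); in the real arrangement the offset variable would presumably be a data symbol exported by the shared library, so calling code picks up its value when the library is loaded instead of baking in a constant at compile time.

```c
#include <stddef.h>

/* Generic method pointer; hypothetical stand-in for kino_method_t. */
typedef void (*method_t)(void);

/* Minimal object: `_` points at the class's array of methods. */
typedef struct Obj {
    method_t *_;
    int       destroyed;   /* only here so the demo has a visible effect */
} Obj;

typedef void (*destroy_t)(Obj *self);

/* The method implementation that lives inside the library. */
static void Term_destroy(Obj *self)
{
    self->destroyed = 1;
}

/* Stand-in for a data symbol the library would export. Callers refer
 * to it by name, so if the library stopped defining it, the reference
 * could no longer be resolved. */
size_t Term_destroy_OFFSET = 0;

/* The class's method array; slot 0 holds destroy. */
static method_t TERM_METHODS[] = { (method_t)Term_destroy };

/* Caller-side dispatch: fetch the slot via the offset symbol,
 * cast back to the method's true type, then call. */
static void Obj_destroy(Obj *self)
{
    destroy_t destroy_meth = (destroy_t)self->_[Term_destroy_OFFSET];
    destroy_meth(self);
}
```

Because callers only name the offset symbol, methods can be added or reordered in the library without recompiling them, so long as no symbol a caller uses disappears.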
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: need_meth.c
Type: application/octet-stream
Size: 1156 bytes
Desc: not available
Url : http://rectangular.com/pipermail/kinosearch/attachments/20071028/2b906aef/attachment-0002.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: need_meth.s
Type: application/octet-stream
Size: 2286 bytes
Desc: not available
Url : http://rectangular.com/pipermail/kinosearch/attachments/20071028/2b906aef/attachment-0003.obj