[KinoSearch] adding a proximity scorer - Boilerplater
Nathan Kurz
nate at verse.com
Sun Jun 17 11:42:35 PDT 2007
On 6/16/07, Marvin Humphrey <marvin at rectangular.com> wrote:
> Your approach is probably cleaner than the one I've taken, which has
> been to create two functions: Parent_new() and Parent_init_base().
Since it is more efficient, your way is how I would approach it if I
were designing from scratch. Having class_new() do the allocation
and then call class_init() is my standard convention for C. It feels
funny to do it that way in Perl, though, since the allocation is
usually just an empty hash ref. And if the goal is to blur the
distinction between C and Perl, a common strategy for both would be
good.
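
For concreteness, here is a minimal sketch of the new()/init() split described above. The Parent and Child structs, their fields, and the function names are hypothetical and are not taken from the KinoSearch source. The point is that the subclass constructor allocates once at its own size and then chains to the parent's init, so no realloc() is needed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical base class; names are illustrative, not KinoSearch's. */
typedef struct Parent {
    int doc_count;
} Parent;

/* Child embeds Parent as its first member, so a Child* can be
 * treated as a Parent*. */
typedef struct Child {
    Parent base;
    float  boost;
} Child;

/* Initialize an already-allocated Parent; does no allocation. */
void Parent_init(Parent *self, int doc_count) {
    assert(self != NULL);
    self->doc_count = doc_count;
}

/* Allocate a Parent, then hand it to Parent_init(). */
Parent *Parent_new(int doc_count) {
    Parent *self = malloc(sizeof(Parent));
    if (self == NULL) return NULL;
    Parent_init(self, doc_count);
    return self;
}

/* Initialize the Parent portion first, then the Child's own fields. */
void Child_init(Child *self, int doc_count, float boost) {
    assert(self != NULL);
    Parent_init(&self->base, doc_count);
    self->boost = boost;
}

/* One allocation at the full Child size; no realloc() required. */
Child *Child_new(int doc_count, float boost) {
    Child *self = malloc(sizeof(Child));
    if (self == NULL) return NULL;
    Child_init(self, doc_count, boost);
    return self;
}
```

A caller would write something like Child_new(3, 1.5f) and later free() the result; a deeper subclass would follow the same pattern by embedding Child and calling Child_init().
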
My first (well, second) instinct was that the realloc() wouldn't be
that expensive, since allocations are often oversized to begin with,
and even if changing the real size is necessary, resizing the last
object allocated might be optimized. But a little quick test in a
loop made it seem like the realloc call is just about as expensive as
the original malloc.
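
A quick test of this kind might look roughly like the sketch below. It is not the code that was actually run; the function names, the sizes, and the use of clock() are assumptions made for illustration. One function times malloc() plus free(), the other times malloc() plus realloc() plus free(), so the difference between them approximates the cost of the realloc() calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Write through a volatile pointer to discourage the compiler from
 * optimizing the allocation away. */
static void touch(char *p) {
    *(volatile char *)p = 1;
}

/* CPU seconds for `iters` rounds of malloc + free,
 * or -1.0 if an allocation fails. */
double time_malloc_only(size_t iters, size_t small) {
    assert(small > 0);
    clock_t start = clock();
    for (size_t i = 0; i < iters; i++) {
        char *p = malloc(small);
        if (p == NULL) return -1.0;
        touch(p);
        free(p);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* CPU seconds for `iters` rounds of malloc + realloc + free,
 * or -1.0 if an allocation fails. */
double time_malloc_then_realloc(size_t iters, size_t small, size_t large) {
    assert(small > 0 && large >= small);
    clock_t start = clock();
    for (size_t i = 0; i < iters; i++) {
        char *p = malloc(small);
        if (p == NULL) return -1.0;
        touch(p);
        char *q = realloc(p, large);
        if (q == NULL) {
            free(p);   /* realloc failure leaves the original block valid */
            return -1.0;
        }
        touch(q);
        free(q);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Subtracting time_malloc_only(n, 32) from time_malloc_then_realloc(n, 32, 64) for a large n gives a rough estimate of the realloc() cost. The numbers will vary with the allocator, the sizes chosen, and the compiler's optimization level.
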
> However, I can see doing things your way by default, then going to
> the two-function setup only for performance-critical constructors.
> There aren't very many of those -- and DelDocs_new() probably isn't
> one of them. :)
And for most of the performance critical sections, one is going to
avoid dynamic allocation anyway. The intersection of performance
critical sections and those that call constructors should be small.
> What I really want to do, though, is figure out how to implement KS
> objects using the inside-out object model. I believe that this is
> the key to realizing the goals I've laid out for subclassing Scorer.
I hadn't read about that model until you mentioned it here. I can see
why the foreign object inheritance ability would be very useful. I
don't know if it would be a good fit for a performance sensitive
section in C, however. If the object is already a hash table, you
aren't losing much in performance. But if the object is a C struct,
looking up elements is currently zero-cost. Having to dereference (or
even call a function) every time you need an element seems both
expensive and at odds with any sort of compile time optimization. And
the lack of locality of an object seems like it would be very bad
for staying within processor cache. Are there implementations that
would get around this?
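
To make the concern concrete, here is a rough sketch contrasting the two layouts in C. Everything in it is hypothetical: the StructScorer type, the ScorerId handle, and the per-field tables are illustrative only. It also simplifies the inside-out model by indexing plain arrays with a small integer ID, where a Perl implementation would typically use hashes keyed on the object's address, which would cost more per lookup.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJECTS 1024

/* Conventional layout: all fields of one object sit together, and
 * s->doc_num compiles down to a fixed offset from the pointer. */
typedef struct StructScorer {
    int   doc_num;
    float score;
} StructScorer;

/* Inside-out layout (simplified): the object is only an ID, and
 * each field lives in its own table shared by all objects. */
typedef size_t ScorerId;

static int    doc_num_of[MAX_OBJECTS];
static float  score_of[MAX_OBJECTS];
static size_t next_id = 0;

/* Hand out the next free ID and fill in that object's slot in
 * every field table. */
ScorerId InsideOut_new(int doc_num, float score) {
    assert(next_id < MAX_OBJECTS);
    ScorerId id = next_id++;
    doc_num_of[id] = doc_num;
    score_of[id]   = score;
    return id;
}

/* Each field read is a separate table lookup; two fields of the
 * same object live in different arrays, far apart in memory. */
int InsideOut_get_doc_num(ScorerId id) {
    assert(id < next_id);
    return doc_num_of[id];
}

float InsideOut_get_score(ScorerId id) {
    assert(id < next_id);
    return score_of[id];
}
```

With the struct, reading doc_num and score for one object touches adjacent memory. With the tables, the same two reads land in two separate arrays, which is the locality cost in question.
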
I fear that until I'm more familiar with the KS code as it exists, I'm
not going to be very useful in design discussions. Hopefully I'll
have some better ideas once I've spent some more time working in
the system.
Nathan Kurz
nate at verse.com