[KinoSearch] FieldSpec/InvIndexSpec API
Peter Karman
peter at peknet.com
Sun Nov 19 14:37:04 PST 2006
Marvin Humphrey scribbled on 11/19/06 3:11 PM:
>
> On Nov 16, 2006, at 7:37 PM, Peter Karman wrote:
>
>> But something like XML::Simple should be able to handle most anything
>> we dream up for this purpose.
>>
>> I prefer libxml2 for its speed and ubiquity on unixy systems (and
>> because Swish uses it ;)). There's a nice tutorial for it here:
>> http://www.yolinux.com/TUTORIALS/GnomeLibXml2.html
>>
>> I also like libxml2 because it converts all input to UTF-8 for you (if
>> it isn't already). One less thing to worry about.
>
> I just checked out libxml2, and I'm concerned about its sheer size. I
> think it's too heavyweight for this limited task. Unlike Swish, we
> wouldn't be using it for serious parsing of user input.
>
> For now, all we need is a way to convey small amounts of data in a tree
> structure. I think if we limit ourselves to a strict subset of XML,
> writing our own C parser will be simple enough. Here's a starting set
> of constraints:
>
> * No attributes.
> * ASCII-only.
> * No escapes.
>
> Basically, nothing except for paired tags indicating node name, with a
> text value and optional child nodes.
>
Yes, given the XML you've been describing, it makes sense to go with a very
lightweight parser.
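
To make the quoted constraints concrete (no attributes, ASCII-only, no
escapes, just paired tags with a text value and optional child nodes), here
is a rough sketch of what such a parser might look like in C. It is only an
illustration under my own assumptions, not KinoSearch code: the Node struct
and the names copy_span, node_free and parse_node are hypothetical, and the
sketch further assumes that a node's text comes before its children and that
tag names are limited to letters, digits and underscores.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type: a name, a text value, and optional children. */
typedef struct Node {
    char *name;
    char *text;
    struct Node *first_child;
    struct Node *next_sibling;
} Node;

/* Copy len bytes starting at start into a fresh NUL-terminated string. */
static char *copy_span(const char *start, size_t len) {
    char *s = malloc(len + 1);
    if (s == NULL) return NULL;
    memcpy(s, start, len);
    s[len] = '\0';
    return s;
}

/* Free a node, its children, and any siblings that follow it. */
void node_free(Node *n) {
    while (n != NULL) {
        Node *next = n->next_sibling;
        node_free(n->first_child);
        free(n->name);
        free(n->text);
        free(n);
        n = next;
    }
}

/* Parse one <name>text...children...</name> element starting at *pp.
 * On success, returns the node and advances *pp past the closing tag.
 * On any violation of the subset, returns NULL and leaves *pp alone. */
Node *parse_node(const char **pp) {
    const char *p = *pp;
    assert(p != NULL);

    while (isspace((unsigned char)*p)) p++;
    if (*p != '<') return NULL;
    const char *name = ++p;
    while (isalnum((unsigned char)*p) || *p == '_') p++;
    size_t name_len = (size_t)(p - name);
    /* Anything other than '>' here (e.g. an attribute) is rejected. */
    if (name_len == 0 || *p != '>') return NULL;
    p++;

    Node *n = calloc(1, sizeof *n);
    if (n == NULL) return NULL;
    Node **tail = &n->first_child;
    n->name = copy_span(name, name_len);

    /* Text value: everything up to the next '<', taken literally
     * (no escapes), and rejected if it contains a non-ASCII byte. */
    const char *text = p;
    while (*p != '\0' && *p != '<') {
        if ((unsigned char)*p > 127) goto fail;
        p++;
    }
    n->text = copy_span(text, (size_t)(p - text));
    if (n->name == NULL || n->text == NULL) goto fail;

    /* Zero or more child elements, then the closing tag. */
    for (;;) {
        while (isspace((unsigned char)*p)) p++;
        if (p[0] != '<') goto fail;        /* also catches end of input */
        if (p[1] == '/') break;
        Node *child = parse_node(&p);
        if (child == NULL) goto fail;
        *tail = child;
        tail = &child->next_sibling;
    }

    p += 2;                                /* skip "</" */
    if (strncmp(p, name, name_len) != 0 || p[name_len] != '>') goto fail;
    *pp = p + name_len + 1;
    return n;

fail:
    node_free(n);
    return NULL;
}
```

A caller would point a const char * at the start of the input, pass its
address to parse_node, walk first_child/next_sibling on the result, and hand
the root to node_free when done. Whitespace between a parent's opening tag
and its first child ends up in the parent's text in this sketch; a real
version would probably want to trim it.
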
Glad to see you're going the XML route for now; writing the parser aside, I
think it'll make life easier having a human-readable meta format.
--
Peter Karman . http://peknet.com/ . peter at peknet.com