[KinoSearch] RE: Beginner help with search strategy
Marvin Humphrey
marvin at rectangular.com
Sat Dec 20 14:53:19 PST 2008
Jozo Radic:
> I'm planning to develop my first small search engine in KinoSearch, and I'm
> looking for some input on what the best way to approach it would be.
> Basically, I have a bunch of different "meta-groups" and each of them will
> have several categories of documents associated. For example, each group
> will have ~100 text documents, ~50 presentations, ~500 web pages of one type
> and ~1000 web pages of another type.
>
> Search results will be primarily grouped by the "meta-groups", and scored by
> the combined hits in all document categories. Each "hit" in the result page
> will show the "meta-group" name and top hits in each of the categories. But,
> I also want to offer the possibility of using only one of the categories to
> sort the relevance of the queries (for example, user wants to search only
> the text documents associated with the 'meta-groups').
>
> I was thinking of using a separate index for each category and combining the
> results of queries on these separate indexes. The only problem is with the
> pagination of such results, as I'd have to fetch all the hits each time for
> each query, in order to create the full result set (grouped by
> 'meta-group'). Is there another, more elegant way to approach this?
You can also put everything into one index and use filters. If you are using
the svn version, use KSx::Search::Filter; otherwise use
KinoSearch::Search::QueryFilter. Filters cache their result sets in an
internal bit vector, which makes them the best way to represent categories
when the number of categories is relatively small.
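
To illustrate the idea only, here is a rough Python sketch of a cached
bit-vector filter over a single index. It is not KinoSearch code and does not
use the KSx::Search::Filter or QueryFilter API; the names CategoryFilterCache,
bits, filtered_hits and the sample data are all made up for the example, and
it assumes each document carries exactly one category value.

```python
class CategoryFilterCache:
    """Hypothetical cache: one bit vector per category.

    Bit i of a vector is set when document i belongs to that category.
    A Python int stands in for the bit vector.
    """

    def __init__(self, doc_categories):
        # doc_categories[i] is the category name of document i (assumed layout).
        self._doc_categories = doc_categories
        self._cache = {}
        self.builds = 0  # counts how often a vector had to be computed

    def bits(self, category):
        # Build the vector on first use, then reuse the cached copy.
        if category not in self._cache:
            vec = 0
            for doc_id, cat in enumerate(self._doc_categories):
                if cat == category:
                    vec |= 1 << doc_id
            self._cache[category] = vec
            self.builds += 1
        return self._cache[category]


def filtered_hits(scored_hits, allowed_bits):
    """Keep hits whose doc id has its bit set, preserving the input order."""
    return [
        (doc_id, score)
        for doc_id, score in scored_hits
        if (allowed_bits >> doc_id) & 1
    ]


# Sample data: five documents in one index, each tagged with a category.
docs = ["text", "presentation", "web_a", "text", "web_b"]
cache = CategoryFilterCache(docs)

# Hits from some query, already sorted by score (doc id, score).
hits = [(3, 0.9), (1, 0.7), (0, 0.4)]
text_only = filtered_hits(hits, cache.bits("text"))
```

The vector for a category is computed once and reused by every later query
restricted to that category, which is why the approach suits a small, fixed
set of categories: memory grows with one vector per category.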
Marvin Humphrey