It’s just data

BlogScene

Erik Hatcher: "I decided to rename my current pet project to blogscene as I'm going to use Lucene as its underlying storage/retrieval mechanism rather than the filesystem." Now that sounds innovative! Using the search engine as the primary data store for a blogging tool. Cool!

I have a feeling that's actually what BlogStudio (www.blogstudio.com) does, which does present problems when it comes to accessing archives of the discussions they are hosting.

Posted by Simon Phipps at

And as you can see from the URL Sam posted, it's actually working! The search engine is indeed doing exactly what it should. I'm adding each blog document (blosxom-style text files) to the index with its category stored as a path, the same category path that blosxom uses, along with a date-format "path" (/2002/12/15), so it instantly finds documents based on these paths - no filesystem access whatsoever (except Lucene's API hitting its index files). Super fast.

Querying can be by full document, title, category, date, or permalink. And the query works from the "path" (category or date) downward, making it quite powerful. Sam already has this going on his site to some extent; my extension is adding the title field and storing the blogs within Lucene. This is all automated with Ant using my <index> task.
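To make the "path downward" idea concrete, here is a minimal sketch in plain Python. It is not Lucene's API and not the actual blogscene code; the `PathIndex` class, its methods, and the sample entries are all hypothetical, and an in-memory dictionary stands in for the index. It assumes each entry is registered under every prefix of its category path and its date path, so a query for a parent path also finds everything beneath it.

```python
from collections import defaultdict


class PathIndex:
    """Hypothetical in-memory stand-in for a path-keyed index (not Lucene)."""

    def __init__(self):
        self._by_path = defaultdict(set)  # path prefix -> set of permalinks
        self._titles = {}                 # permalink -> title

    @staticmethod
    def _prefixes(path):
        # "/2002/12/15" -> ["/2002", "/2002/12", "/2002/12/15"]
        parts = [p for p in path.split("/") if p]
        return ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]

    def add(self, permalink, title, category, date_path):
        # Register the entry under every prefix of both of its paths.
        self._titles[permalink] = title
        for path in (category, date_path):
            for prefix in self._prefixes(path):
                self._by_path[prefix].add(permalink)

    def find(self, path):
        # Normalise the query path, then look it up directly: no directory walk.
        key = "/" + "/".join(p for p in path.split("/") if p)
        return sorted(self._by_path.get(key, ()))


index = PathIndex()
index.add("lucene-storage", "BlogScene", "/java/lucene", "/2002/12/15")
index.add("ant-index-task", "Ant index task", "/java/ant", "/2002/12/14")
index.add("holiday", "Holiday", "/life", "/2003/01/02")

print(index.find("/java"))        # both entries filed under /java
print(index.find("/2002/12/15"))  # only the entry dated 2002-12-15
```

Storing every prefix trades index size for lookup speed: a category or date query becomes a single key lookup rather than a scan.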

Note: the app is currently under development but it's working nicely so far.

Posted by Erik Hatcher at

Can it do date range searches? The problem with BlogStudio is that they don't build archives but only support searches, so once items leave your top page they can only be retrieved by an explicit search. If they had a 'display all entries by range' option there would be no problem.

Looks good, BTW.

Posted by Simon Phipps at

Yes, it is indexing the files by their last modified date, and Lucene supports a RangeQuery. The tricky part is exposing that to the UI. Lucene's QueryParser does a great job, but it's got its limitations. I'm working on understanding how to phrase queries with it, though doing it through the API is trivial.
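For illustration only, a 'display all entries by range' lookup over last-modified dates can be sketched in plain Python. This is not Lucene's RangeQuery or QueryParser; the `DateIndex` class and the sample entries are hypothetical, and a sorted list stands in for the index. It assumes day-granularity dates and an inclusive range.

```python
import bisect
from datetime import date, timedelta


class DateIndex:
    """Hypothetical stand-in: entries kept sorted by last-modified date."""

    def __init__(self):
        self._entries = []  # sorted list of (modified_date, permalink)

    def add(self, permalink, modified):
        # insort keeps the list ordered, so range lookups can use binary search.
        bisect.insort(self._entries, (modified, permalink))

    def between(self, start, end):
        """Permalinks modified from start to end inclusive, oldest first."""
        lo = bisect.bisect_left(self._entries, (start, ""))
        # Everything dated on or before `end` sorts below (end + 1 day, "").
        hi = bisect.bisect_left(self._entries, (end + timedelta(days=1), ""))
        return [permalink for _, permalink in self._entries[lo:hi]]


archive = DateIndex()
archive.add("ant-index-task", date(2002, 12, 14))
archive.add("lucene-storage", date(2002, 12, 15))
archive.add("holiday", date(2003, 1, 2))

# All entries for December 2002.
print(archive.between(date(2002, 12, 1), date(2002, 12, 31)))
```

A UI could expose this as two date fields and leave free-text query syntax out of the picture, which sidesteps the parser question entirely.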

Posted by Erik Hatcher at
