Lucene's in Java, blosxom's in Perl. Most of the work is getting the two to talk. I'll describe the process bottom up, then show the code top down.
First, follow the Lucene instructions to index your blosxom *.txt files.
LuceneSearch.java is a simplified SearchFiles program that takes the index name and a URL-encoded query from its argument list and prints the names of the matching files to standard output. URL-encoding the query avoids the mess of concatenating arguments, unmatched quotes, and security issues.
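To see why the encoding helps, here is a minimal sketch of the round trip. It uses Java's own URLEncoder as a stand-in for the Perl side (which actually calls uri_escape and produces %20 rather than + for spaces; URLDecoder accepts both). The class name QueryRoundTrip and the sample query are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class QueryRoundTrip {
    /** Encode a raw query so that it contains no spaces or quote characters. */
    public static String encode(String rawQuery) {
        try {
            return URLEncoder.encode(rawQuery, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Recover the original query text, as LuceneSearch does with args[1]. */
    public static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical query containing quotes and spaces.
        String raw = "\"open source\" AND title:lucene";
        String encoded = encode(raw);
        System.out.println(encoded);         // a single token, safe to pass as one argument
        System.out.println(decode(encoded)); // the original query again
    }
}
```

The encoded form is one whitespace-free token, so it arrives in the Java program as exactly one argument no matter what the visitor typed.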
Lucene.pm is a Perl module that calls the Java program, captures the output into an array (after stripping off paths and linefeeds), and otherwise mimics the object interface of DirHandle. Stripping the paths is probably overkill at the moment and may ultimately need to be relaxed.
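The stripping step is small enough to sketch on its own. This is not the module itself (that one is Perl), just the same idea written in Java: split the captured output into lines, drop the linefeeds, and keep only what follows the last slash. The class name MatchNames and the sample paths are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class MatchNames {
    /** Turn the search program's raw output into bare file names. */
    public static List<String> fromOutput(String output) {
        List<String> names = new ArrayList<>();
        for (String line : output.split("\\r?\\n")) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            // Keep only the part after the last '/'; lines without one pass through.
            names.add(line.substring(line.lastIndexOf('/') + 1));
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical output: one matching file per line.
        String output = "/data/blog/first.txt\nnotes/second.txt\nthird.txt\n";
        for (String name : fromOutput(output)) {
            System.out.println(name);
        }
    }
}
```

Dropping the directory part is also why the stripping may need to be relaxed later: two entries with the same name in different directories become indistinguishable.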
blosxom.cgi is modified to simply use Lucene and to substitute the appropriate instance of this object if the "q" parameter was passed on the URL. Everything else is the same, so blosxom will do its normal job of sorting and selecting by date, and formatting as required.
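The substitution works because blosxom only ever asks its directory handle for a list of names. A rough Java analogy, with every name here (EntrySourceDemo, EntrySource, choose) invented for illustration: two sources share one read method, and the caller picks the search-backed one only when a query is present.

```java
import java.util.Arrays;
import java.util.List;

public class EntrySourceDemo {
    /** The single method the caller depends on, mirroring DirHandle's read. */
    public interface EntrySource {
        List<String> read();
    }

    /** Pick the search-backed source only when a query was supplied. */
    public static EntrySource choose(String q, EntrySource directory, EntrySource search) {
        return (q != null && !q.isEmpty()) ? search : directory;
    }

    public static void main(String[] args) {
        // Hypothetical file names standing in for a real directory and real hits.
        EntrySource directory = () -> Arrays.asList("first.txt", "second.txt", "third.txt");
        EntrySource search = () -> Arrays.asList("second.txt");

        // The caller's loop is identical in both cases.
        for (String name : choose("lucene", directory, search).read()) {
            System.out.println(name);
        }
    }
}
```

Everything downstream of the read call (sorting, date selection, formatting) never needs to know which source it got.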
All that is left is to add a form to the appropriate place in the html...
Add to foot.html:
<form id="searchform" method="get" action="Search">
<p id="searchlabel"><label for="q" accesskey="4"><span class="heading">Search this site:</span></label></p>
<p id="searchinput"><input type="text" id="q" name="q" size="18" maxlength="255" value="" /></p>
<p id="searchsubmit">
<input type="submit" value="Search" />
<a href="http://jakarta.apache.org/lucene"><img src="/images/lucene_green_100.gif" alt="Lucene" border="0" /></a>
</p>
</form>
Add to blosxom.cgi (the first line with the other use statements, the second before the foreach):
use Lucene;
param('q') and $dh = new Lucene(param('q'));
Create a Lucene.pm in the same directory as blosxom.cgi:
package Lucene;
use File::stat;
use URI::Escape;
use POSIX qw(strftime);
use Env qw(@PATH @CLASSPATH);

# --- Configurable variables -----
# Where's Java?
my $JAVA_HOME = '/home/rubys/jdk1.3.1_04';

# Where's lucene?
my $lucene = '/home/rubys/lucene/lib/*.jar';

# What's my index?
my $index = '/home/rubys/lucene/index';
# --------------------------------

unshift @PATH, "$JAVA_HOME/bin";
push @CLASSPATH, "$JAVA_HOME/lib/tools.jar";
push @CLASSPATH, "$JAVA_HOME/jre/lib/rt.jar";
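push @CLASSPATH, glob($lucene);

sub new {
  shift;
  # Escape everything except letters, digits, "_", "." and "-" so that
  # quotes, "*" and parentheses never reach the shell unescaped.
  my $arg = uri_escape(shift, "^A-Za-z0-9_.-");
  my @matches;

  foreach (`$JAVA_HOME/bin/java -cp $ENV{CLASSPATH} LuceneSearch $index $arg`) {
    chomp;
    s!.*/!!;    # strip the path, leaving only the file name
    push @matches, $_;
  }

  # bless needs a reference; assigning @matches directly would store only its count.
  my $self = \@matches;
  bless $self;
  return $self;
}

sub read {
  my $self = shift;
  @$self;
}

1;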
Compile LuceneSearch.java and place it into a jar:
import java.net.URLDecoder;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Hits;
import org.apache.lucene.queryParser.QueryParser;

class LuceneSearch {
  public static void main(String[] args) {
    try {
      Searcher searcher = new IndexSearcher(args[0]);
      Analyzer analyzer = new StandardAnalyzer();

      String line = URLDecoder.decode(args[1]);
      Query query = QueryParser.parse(line, "contents", analyzer);
      Hits hits = searcher.search(query);
      for (int i = 0; i < hits.length(); i++) {
        System.out.println(hits.doc(i).get("url"));
      }

      searcher.close();
    } catch (Exception e) {
      e.printStackTrace();
      System.exit(9);
    }
  }
}
You're done!