Filters and plugins are simple Unix pipes. Input arrives on stdin, parameters come from the config file, and output goes to stdout. Anything written to stderr is logged as an ERROR message. If nothing is written to stdout, the entry is not written to the cache or processed further; in fact, if the entry had previously been written to the cache, it will be removed.
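As a rough illustration of the pipe contract described above, here is a minimal sketch of a filter body in Python. The function name `run_filter` and the "drop blank entries" rule are assumptions made for this example and are not part of Venus.

```python
import io

def run_filter(instream, outstream):
    """Copy one entry from instream to outstream, or write nothing to drop it."""
    entry = instream.read()
    # Hypothetical rule, for illustration only: drop entries that are blank.
    if not entry.strip():
        return  # nothing reaches the output stream, so the entry is dropped
    outstream.write(entry)

# Demonstration with in-memory streams standing in for stdin and stdout.
kept = io.StringIO()
run_filter(io.StringIO("<entry/>"), kept)

dropped = io.StringIO()
run_filter(io.StringIO("   \n"), dropped)
```

In a real filter script the same function would presumably be called as `run_filter(sys.stdin, sys.stdout)`.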
There are two types of filters supported by Venus, input and template.
Input to an input filter is an aggressively normalized entry. For example, if a feed is RSS 1.0 with 10 items, the filter will be called ten times, each with a single Atom 1.0 entry, with all textConstructs expressed as XHTML, and everything encoded as UTF-8.
Input to a template filter will be the output produced by the template.
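For the input-filter case, the sketch below shows one way such a filter might process the single normalized Atom entry it receives. It removes XHTML img elements using only the standard library. The function name `strip_images` and the sample entry are hypothetical; this is not one of the filters shipped with Venus.

```python
from xml.dom import minidom

XHTML_NS = "http://www.w3.org/1999/xhtml"

def strip_images(entry_xml):
    """Return the entry with every XHTML img element removed."""
    doc = minidom.parseString(entry_xml)
    # Copy the node list first, because we remove nodes while looping.
    for img in list(doc.getElementsByTagNameNS(XHTML_NS, "img")):
        img.parentNode.removeChild(img)
    return doc.toxml()

# Hypothetical sample entry, shaped like the normalized input described above.
SAMPLE_ENTRY = (
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    '<content type="xhtml">'
    '<div xmlns="http://www.w3.org/1999/xhtml"><p>Hi<img src="a.png"/></p></div>'
    '</content></entry>'
)
cleaned = strip_images(SAMPLE_ENTRY)
```

Because every text construct arrives as XHTML, a filter like this only needs to handle one markup form.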
You will find a small set of example filters in the filters directory. The coral cdn filter will change links to images in the entry itself. The filters in the stripAd subdirectory will strip specific types of advertisements that you may find in feeds.
The excerpt filter adds metadata (in the form of a planet:excerpt element) to the feed itself. You can see examples of how parameters are passed to this program in either
Alternately, parameters may be passed URI style, for example:
The xpath sifter is a variation of the above, including or excluding feeds based on the presence (or absence) of data specified by xpath expressions. Again, parameters can be passed as config options or URI style.
The regexp sifter operates just like the xpath sifter, except it uses regular expressions instead of XPath expressions.
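The sifting idea can be sketched in a few lines: pass the entry through when it qualifies, and emit nothing when it does not, so that the entry is dropped. The function `sift` and its `require` and `exclude` parameters are names assumed for this example; they are not taken from the sifter shipped with Venus.

```python
import re

def sift(entry_xml, require=None, exclude=None):
    """Return the entry unchanged if it passes, or "" to drop it."""
    # An entry matching the exclude pattern is always dropped.
    if exclude and re.search(exclude, entry_xml):
        return ""
    # An entry that lacks the required pattern is dropped as well.
    if require and not re.search(require, entry_xml):
        return ""
    return entry_xml

entry = "<entry><title>Python news</title></entry>"
```

An XPath-based variant would follow the same shape, with the regular expression searches replaced by XPath evaluations against the parsed entry.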
Filters listed in the [planet] section of your config.ini will be invoked on all feeds. Filters listed in individual [feed] sections will only be invoked on those feeds. Filters listed in [template] sections will be invoked on the output of that template.
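The scoping rules above can be illustrated with a small sketch that works out which filters apply to a given feed. It makes several assumptions: that filters are listed under an option called `filters`, that names are whitespace separated, and that a feed's own filters run after the planet-wide ones. The sample config and the feed URL are hypothetical.

```python
import configparser

# Hypothetical config text, for illustration only.
SAMPLE = """
[planet]
filters = excerpt.py

[http://example.com/feed.xml]
filters = regexp_sifter.py
"""

def filters_for(config, feed):
    """List planet-wide filters, then any filters specific to this feed."""
    names = config.get("planet", "filters", fallback="").split()
    if config.has_section(feed):
        names += config.get(feed, "filters", fallback="").split()
    return names

config = configparser.ConfigParser()
config.read_string(SAMPLE)
```

A feed without its own section gets only the planet-wide list.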
.tmpl (a.k.a. htmltmp) are also options. Other languages, like Perl or Ruby or class/jar (Java), aren't supported at the moment, but these would be easy to add.
>), then the output stream is teed; one branch flows through the specified filter and the output is placed into the named file; the other unmodified branch continues onto the next filter, if any. One use case for this function is to use xhtml2html to produce both an XHTML and an HTML output stream from one source.
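The teeing behaviour can be pictured with the sketch below: one copy of the data goes through a filter and into a file, while the original is handed on unchanged. The helper `tee_through` and the stand-in filter are hypothetical, and this is not how Venus implements the feature internally.

```python
import os
import tempfile

def tee_through(data, filter_func, path):
    """Write filter_func(data) to path and return data unmodified."""
    with open(path, "w", encoding="utf-8") as out:
        out.write(filter_func(data))
    # The untouched branch continues on to the next filter, if any.
    return data

# Demonstration with a stand-in filter that upper-cases its input.
side_path = os.path.join(tempfile.mkdtemp(), "side-output.txt")
passed_on = tee_through("<p>hello</p>", str.upper, side_path)
with open(side_path, encoding="utf-8") as f:
    side_output = f.read()
```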
os.abort() can't be recovered from.