Using XPath to mine XHTML
Simon Willison: This morning, I finally decided to install libxml2 and see what all the fuss was about, in particular with respect to XPath. What followed is best described as an enlightening experience.
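For readers who have not tried it, here is a rough sketch of what mining XHTML with XPath can look like. It is not the code from the quoted post: it assumes Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath, in place of libxml2. The sample document and the `extract_links` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page; any well-formed XHTML document would do.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body>
    <p>See <a href="http://example.com/one">one</a> and
       <a href="http://example.com/two" rel="nofollow">two</a>.</p>
    <p><a name="anchor">no href here</a></p>
  </body>
</html>"""

# XHTML elements live in a namespace, so the path needs a prefix mapping.
NS = {"x": "http://www.w3.org/1999/xhtml"}


def extract_links(xhtml_text):
    """Return (href, link text) pairs for every <a> that has an href."""
    root = ET.fromstring(xhtml_text)
    # ".//x:a[@href]" selects all descendant <a> elements with an href attribute.
    anchors = root.findall(".//x:a[@href]", NS)
    return [(a.get("href"), "".join(a.itertext())) for a in anchors]


if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_XHTML):
        print(href, text)
```

The point is that a single path expression replaces a hand-written tree walk. This only works on well-formed input, which is why the conversion problem raised in the comments matters.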
One could take MSIE's HTMLDOM, loop through the available elements with JavaScript, and add them (with the right attributes and content) to an XMLDOM (with MSXML). It should be fairly easy to write a recursive function that does this.
The application would be web-based and require MSIE, but apart from that it would be «free». Anything MSIE understands, this conversion application would understand too.
Posted by Asbjørn Ulsberg at
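The recursive copy suggested in the comment above can be sketched outside the browser. This is only an illustration under stated assumptions: Python's `html.parser` stands in for MSIE's HTML DOM and `xml.dom.minidom` stands in for MSXML. The `Node`, `TreeBuilder`, `copy_children` and `html_to_xml` names, and the `fragment` wrapper element, are hypothetical.

```python
from html.parser import HTMLParser
from xml.dom.minidom import Document

# Elements that never have content or an end tag in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class Node:
    """Minimal stand-in for an HTML DOM element."""

    def __init__(self, tag, attrs=()):
        self.tag = tag
        self.attrs = list(attrs)
        self.children = []  # Node instances or plain strings


class TreeBuilder(HTMLParser):
    """Builds a Node tree from possibly sloppy HTML."""

    def __init__(self):
        super().__init__()
        self.root = Node("fragment")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close back to the nearest matching open element; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].children.append(data)


def copy_children(src, doc, parent):
    """Recursively copy src's children into the XML DOM under parent."""
    for child in src.children:
        if isinstance(child, str):
            parent.appendChild(doc.createTextNode(child))
            continue
        element = doc.createElement(child.tag)
        for name, value in child.attrs:
            # Valueless attributes (e.g. "checked") get the XHTML-style value.
            element.setAttribute(name, name if value is None else value)
        parent.appendChild(element)
        copy_children(child, doc, element)


def html_to_xml(html):
    """Return the HTML as an XML string wrapped in a <fragment> element."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    doc = Document()
    top = doc.createElement(builder.root.tag)
    doc.appendChild(top)
    copy_children(builder.root, doc, top)
    return top.toxml()
```

For example, `html_to_xml('<P class=intro>Hello<BR>world</P>')` yields lowercase tags, quoted attributes and a self-closed `<br/>`. A real browser DOM repairs far more broken markup than this parser does, so treat the sketch as a picture of the recursion, not a replacement for the approach described in the comment.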
Mining XHTML with XPath/XSLT works great. However, I have yet to find anything freely available that turns any HTML into XHTML. HtmlTidy chokes on all kinds of things.
Posted by Chris Sells at