intertwingly

It’s just data

Implied Warranty of Fitness


A number of people advocate avoiding templates when producing XML, lest they produce output that is not well formed.  Yet I use templates for this weblog.

Venus produces a DOM, and serializes it via XSLT, so it is pretty safe... or so you would think.  Here are a few ways I have found in which one can produce a DOM which can’t be serialized as well-formed XML:

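from xml.dom import minidom

# Build a DOM that minidom accepts but that cannot round-trip as XML (Python 3 syntax).
doc = minidom.getDOMImplementation().createDocument(None, None, None)
root = doc.createElement('9')                   # element name starts with a digit
root.setAttribute(';', '\x0C')                  # bad attribute name, form feed in the value
root.appendChild(doc.createTextNode('\uFFFF'))  # U+FFFF is not a legal XML character
root.appendChild(doc.createComment('-'))        # serializes as <!----->
doc.appendChild(root)

# Serialize, then try to parse the result back.
try:
    minidom.parseString(doc.toxml('utf-8'))
except Exception as e:
    print(e)

print()
print(doc.toxml())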
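
The script above reports only the first error the parser hits. To check each construct separately, one approach is to add them one at a time to an otherwise valid document. This is a minimal sketch in Python 3; the helper `round_trips`, the `CASES` table, and the placeholder names `root` and `a` are made up for illustration and are not part of Venus.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def round_trips(build):
    """Return True if the DOM produced by build() reparses after serialization."""
    doc = minidom.getDOMImplementation().createDocument(None, 'root', None)
    build(doc, doc.documentElement)
    try:
        minidom.parseString(doc.toxml('utf-8'))
        return True
    except ExpatError:
        return False


# Each entry adds exactly one suspect construct to an otherwise valid document.
CASES = {
    'element name': lambda doc, root: root.appendChild(doc.createElement('9')),
    'attribute name': lambda doc, root: root.setAttribute(';', 'x'),
    'attribute value': lambda doc, root: root.setAttribute('a', '\x0C'),
    'text': lambda doc, root: root.appendChild(doc.createTextNode('\uFFFF')),
    'comment': lambda doc, root: root.appendChild(doc.createComment('-')),
    # Control case: plain text that should survive the round trip.
    'control': lambda doc, root: root.appendChild(doc.createTextNode('ok')),
}

results = {label: round_trips(build) for label, build in CASES.items()}
for label, ok in results.items():
    print('%-16s %s' % (label, 'ok' if ok else 'not well-formed'))
```

Only the control case is expected to survive; minidom builds and serializes every other case without complaint, and the parser rejects the output.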

Am I missing anything?

Venus currently handles all of these cases, and it is my intention that it will continue to do so (as well as handle any other cases that I may have missed) as I transition from sgmllib-based processing to html5lib-based processing.
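
For readers who want to guard their own DOM-building code, here is one way the four cases could be handled before nodes are created. It is a sketch under stated assumptions, not how Venus does it: the helpers `is_safe_name`, `strip_illegal_chars`, and `safe_comment` are hypothetical, the name check is deliberately stricter than the XML Name production (ASCII only, no colons), and dropping illegal characters is just one possible policy.

```python
import re
from xml.dom import minidom

# Assumption: ASCII-only names without colons; the XML Name production allows more.
_NAME = re.compile(r'[A-Za-z_][A-Za-z0-9._-]*\Z')
# Anything outside the XML 1.0 Char production.
_ILLEGAL = re.compile(
    '[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]')


def is_safe_name(name):
    """True if name is acceptable as an element or attribute name."""
    return _NAME.match(name) is not None


def strip_illegal_chars(text):
    """Drop characters that XML 1.0 cannot represent."""
    return _ILLEGAL.sub('', text)


def safe_comment(text):
    """Make text usable as comment data: no '--' and no trailing '-'."""
    text = strip_illegal_chars(text)
    text = re.sub(r'-(?=-)', '- ', text)   # put a space between adjacent hyphens
    return text + ' ' if text.endswith('-') else text


# Usage: sanitize each value on the way into the DOM, then round-trip it.
doc = minidom.getDOMImplementation().createDocument(None, 'entry', None)
root = doc.documentElement
for name, value in [('class', 'a\x0Cb'), (';', 'dropped')]:
    if is_safe_name(name):                 # attributes with unsafe names are skipped
        root.setAttribute(name, strip_illegal_chars(value))
root.appendChild(doc.createTextNode(strip_illegal_chars('text\uFFFF')))
root.appendChild(doc.createComment(safe_comment('-')))

reparsed = minidom.parseString(doc.toxml('utf-8'))
print(reparsed.documentElement.toxml())
```

Silently dropping data is a design choice; replacing illegal characters with U+FFFD or rejecting the input outright would be equally defensible.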