It’s just data

Divide Et Imperia

Joe Gregorio: The pathologies in XML that preclude the use of regular expressions are just that, pathologies, and ones that need to be excised.

I believe that what Joe is looking for is YAML.

That being said, my preference would be to start with what was omitted from the InfoSet, and then see if there is a more appropriate serialization format for what is left.

I'll also note that Joe's RSS feed violates the first rule he proposes.

It sure does, because I haven't figured out a way to do namespaces that doesn't muck up the idea. I love namespaces and the extensibility they offer and I won't give them up. Given that, all my kvetching is probably for naught, since in their current implementation XML namespaces make my dream impossible.

I like your idea of starting with a complete InfoSet and then coming up with a serialization, as long as the InfoSet is discarded at the end of the process :)

Posted by joe at

The problem with subsetting is that everybody wants a different subset. ;-)

What's good about an alternate InfoSet serialization is that you don't require the entire world to cooperate in order to benefit. You can write a streaming filter that converts between the canonical XML representation and your favorite format. That converter isn't needed when interfacing with a like-minded tool.
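
To make the idea concrete, here is a minimal sketch of such a filter in Python, built on the standard library's xml.sax. The line-per-event output format, the LineEmitter class, and the xml_to_lines helper are all invented for illustration; the point is only that each parse event is written out as it arrives, so nothing needs to be held in memory and the result is easy to attack with regular expressions.

```python
import io
import xml.sax


class LineEmitter(xml.sax.ContentHandler):
    """Write one line per SAX event to `out`.

    The notation is hypothetical: "(" opens an element, "@" carries an
    attribute, "-" carries text, ")" closes an element.
    """

    def __init__(self, out):
        super().__init__()
        self.out = out

    def startElement(self, name, attrs):
        self.out.write(f"({name}\n")
        for key in attrs.getNames():
            self.out.write(f"@{key} {attrs.getValue(key)}\n")

    def endElement(self, name):
        self.out.write(f"){name}\n")

    def characters(self, content):
        # The parser may deliver text in several chunks; each non-blank
        # chunk becomes its own line in this sketch.
        text = content.strip()
        if text:
            self.out.write(f"-{text}\n")


def xml_to_lines(xml_text):
    """Convert an XML string to the line-oriented form, event by event."""
    out = io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), LineEmitter(out))
    return out.getvalue()


print(xml_to_lines('<a href="x">hi</a>'), end="")
```

This version ignores namespaces, comments, and processing instructions, and it makes no attempt to escape newlines inside text, so treat it as a starting point rather than a faithful InfoSet serialization. A filter going the other way would read those lines and emit XML again.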

Posted by Sam Ruby at

In brief: 27 Feb 2003

Google's first use of its new Blogger acquisition. Implications of audio blogging. Templating in Python. Data mining of comments. Emulators and virtual machines. XML and publishing. XML and regular expressions. Inside the RSS validator.... [more]

Trackback from dive into mark


Spotted YAML links today from Sam Ruby and Mark Pilgrim. It's great to see YAML recommended. I'd love to see some experienced perspectives on YAML. And yet, part of me feels so glad that YAML has largely stayed out of the arenas of intense critique...

