DocGems
A Blogger Code of UnProfessional Ethics and Wetware, the Killer App. Both are must reads.
P.S. Doc's RSS feed is not valid XML. Something about a reference to an undefined entity 'ouml'.
It’s just data
Dave had briefly posted a request for an RSS feed for Doc Searls, and I spent about an hour creating some XSLT to get Doc's OPML file into RSS format. By the time I was done, so was Dave.
Anyway, I'm looking at this now. This looks like a case where encoding HTML markup comes back to bite.
To get the XSLT processor to handle this, the input has to be proper XML. To get proper XML, I have to decode the entities in the OPML element's "text" attribute. But some of those entities are &lt; and &gt;.
So, if I leave it as it is, it isn't valid XML because of the undefined "ouml" entity. If I decode all the entities, it isn't valid XML because of the angle brackets.
So, now I'll go through the string initially, look for &lt; or &gt;, encode that as &amp;lt; or &amp;gt;, and then decode the entities later. An extra two lines of Perl (though I could make it one, if I really, really wanted to).
Oh, and don't forget the &quot; entities, either.
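For anyone following along, the protect-then-decode idea above might look roughly like this. It is a sketch in Python rather than the Perl mentioned in the comment, and the function name fix_entities and the sample string are made up for illustration. Protecting &amp;amp; and &amp;apos; as well is an added assumption, since a decoded bare ampersand would break the XML just like a bare angle bracket.

```python
import html
import re
import xml.etree.ElementTree as ET

# Entities that XML itself predefines. The comment above names lt, gt and quot;
# amp and apos are added here as an assumption, for the same reason.
PROTECTED = re.compile(r"&(lt|gt|quot|amp|apos);")


def fix_entities(text):
    """Decode HTML-only entities such as &ouml; but keep XML markup escaped.

    Hypothetical helper, not the commenter's actual Perl.
    """
    # Step 1: double-encode the entities that must survive, e.g. &lt; -> &amp;lt;
    shielded = PROTECTED.sub(r"&amp;\1;", text)
    # Step 2: decode one level of entities. &ouml; becomes a literal character,
    # and the shielded &amp;lt; drops back to &lt;
    return html.unescape(shielded)


if __name__ == "__main__":
    # Made-up attribute value in the style of an OPML "text" attribute.
    raw = "K&ouml;ln &lt;a href=&quot;x&quot;&gt;link&lt;/a&gt;"
    outline = ET.fromstring('<outline text="' + fix_entities(raw) + '"/>')
    print(ascii(outline.get("text")))
```

Note that html.unescape is lenient about HTML quirks (it will also decode some entities that lack a trailing semicolon), so a stricter pass may be preferable for feeds you do not control.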
Posted by Mark A. Hershberger at
Re: "though I could make it one if I really really wanted to". I think you just summed up the essence of Perl.
Posted by Mark at