Accordingly, I've converted my rss 2.0 feed from <content:encoded> to the more bandwidth and xpath friendly <xhtml:body>. It looks like gotdotnet and blogx users will soon follow. Hopefully the owners of the wellformedweb and w3future weblogs will take notice.
The updated feed is valid, and it uses namespaces in exactly the way that rss 2.0 and xhtml intend. I've tested it with radio and syndirella.
Sjoerd Visscher approves, but suggests <xhtml:div> instead. Any objections?
The choice of xhtml:body was intentional. The goal was to convey that "this is the content - no need to HTTP GET another entity unless you're looking for fancy styling."
Just blasting an xhtml:div into an item seemed less explicit as to why it was there.
If y'all would be happy coining yet another element name (e.g., realcontent) in some new namespace, then great. Otherwise, I think xhtml:body is the right choice.
There are two things going on here. First is that syndirella doesn't (yet) know how to handle xhtml:body. Second, it is falling back to the description which I had not properly encoded.
There we were, thinking we had some agreement on the RSS format and had achieved some convergence and stability at the 2.0 level, then this happens!
Trackback from TheArchitect.co.uk - Jorgen Thelin's weblog
Personally, I don't disagree with this change, but it does raise some interesting questions around whole-document versioning and format standardization.
My policy with parsing RSS has always been to parse everything, and then run through whatever I got looking for things I recognize in descending order of preference ("is there a dc:date? that's my date. no? is there a pubdate?..."), rather than saying "this is RSS 2, so unless there's a pubdate there's no date at all", so I don't see any real problem with saying "is there an xhtml:body? how about content:encoded? oh well, I'll take description."
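A minimal sketch of that "descending order of preference" lookup, in Python with the standard library's ElementTree. The helper name first_present and the sample item are made up for illustration, and the element is spelled pubDate here because that is how RSS 2.0 capitalizes it.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used only for lookups in this sketch.
NS = {
    "xhtml": "http://www.w3.org/1999/xhtml",
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Candidate elements, most preferred first.
CONTENT_PREFERENCE = ["xhtml:body", "content:encoded", "description"]
DATE_PREFERENCE = ["dc:date", "pubDate"]


def first_present(item, candidates):
    """Return (name, element) for the first candidate found under item, else (None, None)."""
    for name in candidates:
        element = item.find(name, NS)
        if element is not None:
            return name, element
    return None, None


# Hypothetical item that has no xhtml:body and no dc:date.
SAMPLE_ITEM = """\
<item xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <description>Short summary</description>
  <content:encoded>&lt;p&gt;Full entry&lt;/p&gt;</content:encoded>
  <pubDate>Tue, 01 Apr 2003 12:00:00 GMT</pubDate>
</item>
"""

item = ET.fromstring(SAMPLE_ITEM)
content_name, content_element = first_present(item, CONTENT_PREFERENCE)
date_name, date_element = first_present(item, DATE_PREFERENCE)
print(content_name, "/", date_name)
```

The parser never asks which RSS version it is looking at; it only asks which of the elements it recognizes are actually there.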
But the xpath users better come up with some fun and pretty implementations, to make up for the hassle I'm going through changing all my sax parsers to do something completely different when I see start and end tags while xhtml:body is open. It may be just code, but quite a lot of it is just code I didn't write, and only half understand.
I agree with Don on the xhtml:body vs. xhtml:div thing. Using the body tag, IMHO, conveys more semantic meaning than using a div in the middle of nowhere.
Don, you have <body> tags in your <description> elements. That doesn't seem right.
Furthermore, I thought the whole point of RSS was simple syndication. Having description, content:encoded or xhtml:body tags, doesn't make things simpler. And there aren't any precedence rules either.
Breyten: Don already said that he would try to get his descriptions properly encoded by Monday. The problem with description is that there is no documentation as to what proper encoding is. Some people don't encode it, others encode it once, and some encode it multiple times.
content:encoded is better documented, but it renders the structure of the comment opaque. Much of what makes the web work is that every bit of structure that people can tolerate putting into their data is fully exposed.
xhtml:body is a step forward. But it isn't for everyone. In particular, it won't work if the content is not well-formed XML.
As for precedence rules, these will emerge. RSS is not controlled by the W3C and these elements were proposed and implemented by different people. In this case, I think the order is clear: xhtml:body then content:encoded then description.
Interesting and a must have for RssComponents: XHTML in the body. Sam seems to be developing a syndirella clone that supports blog posting, so the result may be a tool similar to OExpress/NNTP but for RSS. Should have a closer look at how they incorporate...
Please reconsider using xhtml:div instead of xhtml:body. If you're going to use existing vocabularies, you're going to have to play by the rules of that vocabulary. Semantics is nice, but this is XML and how elements should be used is described by the schema. xhtml:body only allows block-like elements, like div, p, ul... So if your RSS file contains <xhtml:body>Click <a href="...">here</a></xhtml:body>, then this element does not validate according to the xhtml schema.
Personally, I'm ok with only block-like elements within the xhtml:body. If you want small snippets of text, put it into <description>. If you want to have rich xhtml markup, then follow the rules and wrap in <p> or <div> or something.
To me, the restriction is worth it to have more obvious and understandable semantics.
Sam Ruby: xhtml in rss 2.0 I've converted my rss 2.0 feed from <content:encoded> to the more bandwidth...
Trackback from Jim Mangan's Weblog
Tag du jour: <xhtml:body>. Now what?
...Sam himself updates his own feed to <xhtml:body>, saying it's "more bandwidth friendly" than <content:encoded>, which probably won't be true if all internal tags must also contain the xhtml: prefix, as some argue......
Trackback from Solipsism Gradient
Yes, yes, let's do both. Let's all do both. I'd help add to the hellish confusion, but I'm stuck on HTML 4. Then again, HTML is just a few regular expressions away from XML, right?
I stand by the use of xhtml:body even though short posts will need an innocuous <div> or <p>. I looked at a lot of feeds before pulling the trigger and many, many blog entries are multi-paragraph and naturally have <p> or <div> children anyway.
If people are really torqued about the use of xhtml:body, then we should define a NEW element whose content model is identical to <div> but whose (new) name would convey "this is the content of the damn entry in XHTML 1.0 transitional!!"
Just slamming a <div> element under item gives me the willies.
I've now updated my rss2 feeds to only insert <div> elements when it is necessary to make a valid <xhtml:body>. As Don points out, in many cases this isn't necessary.
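One way to picture "only insert a <div> when necessary" is the Python sketch below. It is not Sam's actual implementation: the function names are invented, the list of block elements is a deliberately abbreviated assumption rather than the full XHTML content model, and it simply applies the block-only rule as described in this thread.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Abbreviated, illustrative set of block-level XHTML elements (an assumption,
# not the complete list from the XHTML schema).
BLOCK_ELEMENTS = {"p", "div", "ul", "ol", "dl", "pre", "blockquote", "table",
                  "hr", "h1", "h2", "h3", "h4", "h5", "h6"}


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts on qualified names."""
    return tag.rsplit("}", 1)[-1]


def needs_wrapper(body):
    """True if body directly contains text or a non-block child element."""
    if body.text and body.text.strip():
        return True
    for child in body:
        if local_name(child.tag) not in BLOCK_ELEMENTS:
            return True
        if child.tail and child.tail.strip():
            return True
    return False


def ensure_block_content(body):
    """Move body's content into a single <div>, but only when it is needed."""
    if not needs_wrapper(body):
        return body
    div = ET.Element("{%s}div" % XHTML_NS)
    div.text = body.text
    children = list(body)
    div.extend(children)
    for child in children:
        body.remove(child)
    body.text = None
    body.append(div)
    return body


# Hypothetical short post (inline content) and long post (already block-level).
short = ET.fromstring(
    '<body xmlns="http://www.w3.org/1999/xhtml">Click <a href="#">here</a></body>')
long_post = ET.fromstring(
    '<body xmlns="http://www.w3.org/1999/xhtml"><p>One</p><p>Two</p></body>')

ensure_block_content(short)
ensure_block_content(long_post)
print([local_name(c.tag) for c in short], [local_name(c.tag) for c in long_post])
```

Multi-paragraph entries pass through untouched, which matches the observation that most posts already have <p> or <div> children.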
NewsGator 1.1 has been released! This is a significant release......
Trackback from Greg Reinacker's Weblog
RSS 2.0
Sometime soon I'm going to convert my RSS feed to version 2.0. I want to know what the right tags to put my content in are, so I'm keeping track of this bit on XHTML in RSS 2.0 from Sam Ruby. ...
I looked around for an XML Schema definition for RSS 2.0, so I could post some examples and ideas on extending the core specification based on some of the discussions over the weekend through Don Box's and Sam Ruby's weblogs. I was somewhat...
Trackback from TheArchitect.co.uk - Jorgen Thelin's weblog
Wherefore flyeth baby and bathwater?
Sam Ruby and other notables are replacing the content:encoded elements in their RSS 2.0 feeds with xhtml:body. From the point...
Trackback from Raw Blog
I like it.
It may be a while before I support it, as I am currently in the middle of refactoring my implementation of RESTLog to make better use of Cheetah, also adding a unit test suite. Once that work is complete it should be simple to add support for xhtml:body. Then I have to add it to Aggie, then to Pamphlet... Uh-oh, I'm beginning to think I have spread myself too thin. Just a little...
I just looked at it again and I like it. Gets me out of the business of maintaining a whole CMS when I am only really interested in the RESTLog interface. I'll finish off my last changes to RESTLog to have a final release then begin migrating to mombo.
I particularly like the EntryStore abstraction as I get to keep my native file format.
I spent most of the weekend offline. Here are a bunch of things that I have been keeping tabs on but haven't had a chance to look into: Cool freshmeat releases Highlight (source highlighter) 2.0b-6. I've been happy with GNU...
Sam Ruby gamely responded to my comments re. inserting xhtml in RSS : Update your RSS 1.0 feed, and I'll...
Trackback from Raw Blog
Ideagraph screenshot
Phil Ringnalda says: I'm much less interested in who will produce the data, and more interested in who will actually...
Trackback from Raw Blog
In brief: 1 April 2003
The Register's got a new RSS 1.0 feed, and it validates. Table-like CSS layouts....
Trackback from dive into mark
To add support for xhtml:body in Aggie RC5, add the following to RssExtractors.xml:
1. A namespace declaration under extractors/namespaces: "<namespace><prefix>xhtml</prefix><uri>http://www.w3.org/1999/xhtml</uri></namespace>"
2. An extractor declaration under extractors/properties: "<property><name>description</name><path>xhtml:body</path><owner>RssItem</owner><variant>All</variant></property>"
That's it.
(Hopefully, Sam, this will not wreak havoc on your site. I would've posted on my weblog, only Radio went south on me again, and I wanted people to give this a try...)
Ziv: I don't know enough about the internals of Aggie to comment, but be careful: the contents of <xhtml:body> is meant to be XML which is to be taken literally, not a string which is to be XML decoded.
For example, strings like "&amp;" should be left as is.
Ziv,
I did try it and it doesn't work since we're extracting the element's Text and not the element's InnerXml. Maybe add one more flag to the <property> element?
Seems to be the thing. I've implemented it for now here, but I'm not checking it into CVS yet. It's still hackish, with the extra <div></div> wrapped around all entries and fixate_url not being ...
Sam/Joe: I'm afraid I don't understand the comment. The RSS file is a well-formed XML file, so any literal "&amp;" string in it stands for a single ampersand (disregarding the edge-cases for the moment). Are XML vocabularies allowed to override this rule, or am I missing something?
Ziv: Compare the content:encoded and xhtml:body elements in my rss 1.0 and rss 2.0 feeds respectively. In the former, "<" becomes "&lt;". In the latter, "<" remains as is.
The same thing should be true of "&amp;". If you see it in the stream it actually represents a single "&" conceptually, but it will need to be encoded back into a "&amp;" when you put it into the output HTML.
Net: what you really want to do is a byte for byte copy of the characters between the <xhtml:body> and </xhtml:body> tags.
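To make the difference concrete, here is a rough Python sketch that reads the same entry both ways. It parses and re-serializes rather than copying bytes, so it is an approximation of the idea, not a literal byte-for-byte copy. The sample item and the inner_xml helper are made up for illustration; inner_xml plays the role of the InnerXml property mentioned above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

XHTML_NS = "http://www.w3.org/1999/xhtml"
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"

# Serialize XHTML elements without a prefix (this changes a process-wide registry).
ET.register_namespace("", XHTML_NS)

# Hypothetical item carrying the same entry in both forms.
FEED_ITEM = """\
<item xmlns:content="http://purl.org/rss/1.0/modules/content/"
      xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <content:encoded>&lt;p&gt;AT&amp;amp;T &lt;em&gt;news&lt;/em&gt;&lt;/p&gt;</content:encoded>
  <xhtml:body><xhtml:p>AT&amp;T <xhtml:em>news</xhtml:em></xhtml:p></xhtml:body>
</item>
"""


def inner_xml(element):
    """Serialize everything inside element: leading text plus each child subtree."""
    parts = [escape(element.text or "")]
    for child in element:
        # tostring() also emits the child's tail text and re-escapes "&" and "<".
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


item = ET.fromstring(FEED_ITEM)
encoded = item.find("{%s}encoded" % CONTENT_NS)
body = item.find("{%s}body" % XHTML_NS)

# content:encoded is an escaped string: one round of XML decoding yields the HTML.
encoded_html = encoded.text
# xhtml:body is literal markup: the text alone is not enough, so serialize the subtree.
literal_html = inner_xml(body)

print(encoded_html)
print(literal_html)
```

In both cases the ampersand ends up written as "&amp;" in the output HTML, but for xhtml:body that only happens because the serializer encodes it again on the way out.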
Sam says, "what you really want to do is a byte for byte copy of the characters between the <xhtml:body> and </xhtml:body> tags."
Similarly, one can take the <xhtml:body> fragment and preserve it as a DOM, then write the DOM back out as XML or HTML as needed (which would also correct for where a byte-for-byte copy wouldn't include the necessary namespace declarations).
If one is already using a DOM for the Comment API, xhtml:body is already a node in the tree. If one is using SAX, you'll have to look for <xhtml:body> specifically and trap all the start/end callbacks to create the DOM, until the closing </xhtml:body>.
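For the SAX case, a sketch along these lines might work, using Python's xml.sax with namespace processing turned on and an ElementTree TreeBuilder standing in for the DOM. The class name BodyCapture, the extract_bodies helper, and the sample feed are all hypothetical.

```python
import io
import xml.sax
import xml.etree.ElementTree as ET
from xml.sax.handler import ContentHandler, feature_namespaces

XHTML_NS = "http://www.w3.org/1999/xhtml"


def clark(uri, local):
    """Build the '{namespace}local' tag form that ElementTree uses."""
    return "{%s}%s" % (uri, local) if uri else local


class BodyCapture(ContentHandler):
    """Trap start/end/character events while an xhtml:body element is open."""

    def __init__(self):
        super().__init__()
        self.bodies = []       # finished xhtml:body subtrees
        self._builder = None   # TreeBuilder, set only while a body is open
        self._depth = 0

    def startElementNS(self, name, qname, attrs):
        uri, local = name
        if self._builder is None:
            if (uri, local) != (XHTML_NS, "body"):
                return  # ordinary RSS element; normal handling would go here
            self._builder = ET.TreeBuilder()
        self._depth += 1
        attributes = {clark(a_uri, a_local): value
                      for (a_uri, a_local), value in attrs.items()}
        self._builder.start(clark(uri, local), attributes)

    def endElementNS(self, name, qname):
        if self._builder is None:
            return
        self._builder.end(clark(*name))
        self._depth -= 1
        if self._depth == 0:  # the closing </xhtml:body>
            self.bodies.append(self._builder.close())
            self._builder = None

    def characters(self, content):
        if self._builder is not None:
            self._builder.data(content)


def extract_bodies(feed_xml):
    handler = BodyCapture()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(feed_xml.encode("utf-8")))
    return handler.bodies


# Hypothetical feed used only to exercise the handler.
SAMPLE_FEED = """\
<rss version="2.0" xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <channel>
    <item>
      <title>Example</title>
      <xhtml:body><xhtml:p>Hello <xhtml:a href="http://example.org/">world</xhtml:a></xhtml:p></xhtml:body>
    </item>
  </channel>
</rss>
"""

bodies = extract_bodies(SAMPLE_FEED)
print(len(bodies), bodies[0].tag)
```

Because the captured subtree keeps namespace-qualified names, writing it back out can regenerate whatever namespace declarations the fragment needs.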
Some kind of flag in the markup, "this is literal XML", would make that easier than having every application keep a current list of what fields may be literal XML.
Ken - you are correct. An InfoSet preserving XML to XML transformation is what we are looking for. In particular, there is a list of items that need not be preserved.
As to the added flag: my opinion is that it doesn't belong in the markup any more than the schema that says that an item has a title, link, and description belongs in the markup. Beyond the simple mustUnderstand semantics, you know what you are looking for and know the syntax rules for skipping over the rest.
the following are links i found at work today that i'd like to follow up on... but didn't have an email client setup on the machine, so i figured i'd......
Trackback from bish
In brief: 2nd April 2003
Bruce Eckel has a blog: ok, it's old news now, but Bruce Eckel has a blog. Some interesting things in the posts, especially bits from the opening of a not specified Python Conference (or a Python Conference I miss the specification of). And on...
(My humble opinion doesn't weigh much, but anyway...)
If the point of using RSS is to deliver content, and/or abstract it from the presentation layer, then WHY OH WHY use <xhtml:whatever> instead of <content:whatever>?
First content abstraction problem: enclosing content in an xhtml namespace tag seems to imply that the content is well-formed XHTML.
Now imagine for example Mark Pilgrim using xhtml:body in his RSS 2.0 feed, when his entries are definitely not going to be well-formed XHTML since his weblog itself is HTML 4.01. Should Mark enclose his entries in html:body to imply that they are well-formed HTML? Should he deliver a version that is converted to XHTML in order to fit in xhtml:body?
Now a second problem: not every source of content is XHTML/HTML. There can possibly be content encoded in Flash (or let's be crazy, MPEG or WAV), that happens to provide an RSS version of its entries. How would xhtml:body relate to the entries there?
--
Am I the only one thinking this is really complicating matters for a debatable gain in functionality/semantics?
Wouldn't extending the "content:" module be a simpler task?
After all, wouldn't something along the lines of <content:encoded type="application/xhtml+xml"> carry a better meaning of the content of the entry?
This way, one could post entries in whichever way one likes: Mark could use type="text/html" for example.
--
Excuse the ranting, but really sometimes to the external eye it looks a lot like you (we?) are complicating (y)our RSS feeds just for the heck of it, with not much use outside of pure novelty...
Michel: I guess we have different perspectives on what is simple.
<content:encoded> remains an option for content that is not well-formed. It is considerably more verbose, less readable, and less easy to parse, but it still exists.
Of course, that'll teach me not to post after so many hours of straight work (not!). I'll add the encoded/not-encoded field to Aggie once I complete transforming its RSS-to-object-graph engine (the part that reads RssExtractors) into something more general than just RSS.
Phil is right. The specs are silent on this. In fact, many people believe that relative links should be resolved relative to the feed itself, not the <channel><link> element's value. Spec issues aside,...
Trackback from Sam Ruby
I spent most of today knee deep in RSS, writing an aggregator for a project at work. It has quickly become apparent that "Really Simple Syndication" is anything but! There are currently three major (and goodness knows how many minor) ...
I am working on a RESTLog client, and specifically on the subsystem responsible for building RSS item fragments representing each weblog entry, and sending them to the server side of the application. As some of you may have noticed, I am using a new...
I encourage all bloggers, and especially Movable Type users, to upgrade their RSS to version 2.0. Movable Type users can copy the following RSS template code: <?xml version="1.0" encoding="<$MTPublishCharset$>"?>...
Trackback from Már Örlygsson
Okay, I ‘fess up. I don’t get it.
Why should I bother using xhtml:body when I could just as well be using content:encoded ? What’s the inherent benefit?...
How (and why) to include an xhtml:body in a Radio UserLand RSS feed
Sam Ruby and Don Box have both demonstrated valid RSS 2.0 feeds (Sam, Don) that include a <body> element, properly namespaced as XHTML. Quietly, last week, I joined the party. My primary feed now includes: ......
I'm waiting to be convinced about the utility of XHTML-in-RSS, but it's easy enough to play along....
Trackback from phil ringnalda dot com
When Sam started a small revolution last month by using <xhtml:body> in his rss-feed, he said this was because it was "more bandwidth and xpath friendly". I can see it's more xpath friendly (though I'm not sure why anyone would want to...
Trackback from public virtual MemoryStream
The logic (or lack thereof) behind xhtml:body
Trackback from .NET Blog - Chris Frazier Style
Mike's Briefs
There's a good article on ExtremeTech on Krazy Keyboards. I have a great interest in new input devices as......
At SXSW I told Mena Trott that RSS 1.0 was dead or dying, because it was too complicated. Turns out I was partially wrong -- it's very much alive, but perhaps only because it's the default in Movable Type. Six Apart has signed up for the semantic...
I posted a new drop of Syndirella that's based on Dmitry's May 9 release, and adds support for gzip/deflate compression, xhtml:body based items and multiple plugins (both IBlogThis and IBlogExtension)....
I posted a new drop of Syndirella that's based on Dmitry's May 9 release, and adds support for gzip/deflate compression, xhtml:body based items and multiple plugins (both IBlogThis and IBlogExtension). [Simon Fell] Well, here's the beauty of open...
The seed of an idea (and a good collection of links, along the way) Logicola Diet A project to learn and have fun, working with technologies such as PHP, RSS, XSLT and above all in areas that interest us and that we are passionate about, such as...
I've been using nntp//rss for over a month now, and overall, I'm pretty happy with it. I like it for the same reason people who live in Outlook love NewsGator. (I live in Outlook Express, because it lets me handle......
I’m a beginner compared to you guys, but I have a question that I just started searching out an answer for and came across this site. Can href tags and other html tags be put into RSS 2.0 feeds? Do you know of a convenient list of what can go into a feed and what cannot?
Thanks.
DeWitt Clinton: But what if you wanted to put something interesting inside a syndicated content feed? What if you wanted to put valid XHTML in a feed? You went through the trouble of writing XHTML, why should it be flattened to an opaque blob of...