It’s just data

Two Webs

Dare Obasanjo: When talking about REST and HTTP-For-APIs, we should be careful not to learn the wrong lessons from how HTTP-For-Browsers is used today.

My two cents: you have to look deeper.  Otherwise, you will miss the fact that the split is actually elsewhere.

99.44% of the requests your web server processes are for GETs.  There are trace amounts of everything else — and a lot of that is spam and can’t be trusted.

Focusing on the trees (whether the trace amounts of other requests are all POST or use the full WebDAV vocabulary) misses a bit of the forest.

The most fundamental constraint is that HTTP is request/response, as in one of each.  With GET, the request is already spent, and the recipient has little recourse but to cope with whatever cards it is dealt.

With all other verbs, the available options are wide open.  Such processors can afford to be a bit more Draconian.  In fact, being liberal with unsafe verbs opens up vectors for exploitation.  (Note: this is also true for browsers that take unsafe actions in response to HTTP GETs, hence Ian’s reference to security bugs)
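To make the split concrete, here is a minimal sketch in Python, not taken from any real server. The `handle` function, the status codes it picks, and the choice of `application/atom+xml` as the only accepted media type are all assumptions for illustration. The safe branch copes with whatever it is handed; the unsafe branch refuses anything it does not fully understand before any state would change.

```python
import xml.etree.ElementTree as ET

# Methods that HTTP defines as safe (no side effects requested by the client).
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}

# Hypothetical: the one media type this imaginary endpoint accepts for writes.
ACCEPTED_TYPE = "application/atom+xml"


def handle(method, content_type=None, body=b""):
    """Return an HTTP status code for a request (illustrative only)."""
    if method.upper() in SAFE_METHODS:
        # Safe request: nothing is changed, so be liberal and just answer.
        return 200

    # Unsafe request: be Draconian before touching any state.
    if content_type != ACCEPTED_TYPE:
        return 415  # Unsupported Media Type
    try:
        ET.fromstring(body)  # reject anything that is not well-formed XML
    except ET.ParseError:
        return 400  # Bad Request
    return 201  # Created (the actual state change is omitted in this sketch)
```

A GET with junk attached still gets an answer, while a POST with an unclosed `<entry>` or the wrong media type is turned away.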

Looking at the emerging APIs, I’m pleased to see that a lot more focus is being placed on the safe/unsafe split.


Two Weblog Posts

Dare Obasanjo: “In conclusion, I completely agree with Robert Sayre...” I knew it wouldn’t last. Sam Ruby: "you have to......

Excerpt from franklinmint.fm at

I feel like we may be talking past each other. The split I see between HTTP-For-APIs and HTTP-For-Browsers isn’t simply about whether GET and POST are the only HTTP methods supported. In general, it runs across the entire gamut of HTTP/XML/HTML compliance, from using correct MIME types when serving documents to always using well-formed, valid markup. With HTTP-For-Browsers your audience is fairly limited (the major browsers and search engine bots) and there isn’t much to gain by doing things the right way at this point. With HTTP-For-APIs, this isn’t the case at all which is why I provided the example of RSS/Atom feed readers.

Posted by Dare Obasanjo at

Off topic:
Hi Sam. Are you aware that some of your old blog entries are missing presentation? I implore you to retrofit if necessary :)

Posted by Frank at

Ah, a few pages were left naked.  Fixed.

As an aside, no files remain in my cache for longer than two weeks, so this would have been fixed in a matter of a couple of days.

Posted by Sam Ruby at

With HTTP-For-APIs, this isn’t the case at all which is why I provided the example of RSS/Atom feed readers.

OK, I’ll admit it.  I’m thoroughly confused now.

Are you saying that Content-Type and charset are or are not important to RSS readers?  What, if anything, are you planning to do with this information in RSSBandit?

What I am saying is closer to what Robert said on his weblog than what he said on yours.

Offtopic: the icon on your page that purports to point to an Atom 1.0 feed points to an RSS 2.0 feed.  My suggestion is that you consolidate to a single feed in the x.0 feed format of your choice.  Drop two icons and two auto-discovery links.  The selection dialogs that Firefox and IE7 now provide for subscribing to your page now make little sense at all.

Posted by Sam Ruby at

Sam,
Content-Type and Charset should be important for RSS readers today. However most readers don’t honor them because they are wrong a majority of the time. I personally would love to treat them as authoritative especially when it comes time for me to add podcasting support. The goal should be to move to a world where they are important to RSS readers in all cases.

RE - feed icons:  I just switched my feed to Feedburner. I need to do some cleaning of my various feed icons and links now that I’ve done that. I ran out of time to work on that last weekend; I’ll mess with it this weekend.

Posted by Dare Obasanjo at

However most readers don’t honor them because they are wrong a majority of the time.

IMHO, the first step is to try to honor them if they are present and aren’t obviously bogus.  And then, if that doesn’t work, perhaps try a fallback.

As it stands now, I can produce a valid feed that will cause RSSBandit to barf.  One of the most common feed errors is to not include an XML prolog, yet to have the data encoded in iso-8859-1 or windows-1252.  One possible fix to this without changing the feed would be to add a charset parameter on the server.  The end result would be a valid feed that the current RSSBandit will reject.
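As a rough sketch of the "honor it, then fall back" approach, here is how it might look in Python. This is not how RSSBandit or the Universal Feed Parser actually does it, and it skips the finer points of the XML-over-HTTP rules. The names `candidate_encodings` and `decode_feed`, and the fallback order of utf-8 then windows-1252, are assumptions made for illustration.

```python
import codecs
import re
from email.message import Message

# Matches an encoding declared in an XML prolog, when one is present.
PROLOG_ENCODING = re.compile(rb'^<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']')


def candidate_encodings(raw, content_type=None):
    """List encodings to try, most authoritative first (hypothetical order)."""
    candidates = []
    if content_type:
        # Let the stdlib header parser pull out the charset parameter.
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset()
        if charset:
            candidates.append(charset)
    match = PROLOG_ENCODING.match(raw)
    if match:
        candidates.append(match.group(1).decode("ascii"))
    # Assumed fallbacks for feeds that declare nothing usable.
    candidates += ["utf-8", "windows-1252"]
    return candidates


def decode_feed(raw, content_type=None):
    """Return (text, encoding_used) for the raw bytes of a feed."""
    for name in candidate_encodings(raw, content_type):
        try:
            codecs.lookup(name)
        except LookupError:
            continue  # obviously bogus charset name: skip it
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue  # declared but wrong: try the next candidate
    # Last resort: never refuse the feed outright.
    return raw.decode("utf-8", errors="replace"), "utf-8"
```

With this, a prolog-less iso-8859-1 feed served as `application/rss+xml; charset=iso-8859-1` decodes using the server's charset, and the same bytes served with no charset at all still decode through the windows-1252 fallback.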

Posted by Sam Ruby at

I don’t honor Content Types because they are most often wrong. There is the case of an XML document that is not in UTF-8 which doesn’t have an encoding stated in the XML prolog but has it specified in the charset parameter of its content type when served over HTTP.

I haven’t seen such a feed in the wild which wasn’t being created by some XML geek trying to prove a point. My main worry is breaking lots of feeds that work just to satisfy some arbitrary test case. 

While writing this I thought to myself that Mark Pilgrim has probably already solved this problem. So I did a quick search and found the documentation for Character Encoding Detection in the Universal Feed Parser. I can probably do something to better honor content types in a future version of RSS Bandit while ensuring not to break feeds that work today. Of course, the problem is that I don’t have any feeds in the wild to actually test this against. :)

Posted by Dare Obasanjo at

Important, I Think

I’m genuinely paranoid about banging my own drum and shouting “Listen to me!” because I know how often I’ve been wrong about things, and how much of the future is determined by luck and raw random chance. That said, if the lessons I’ve learned over...

Excerpt from ongoing at

While writing this I thought to myself that Mark Pilgrim has probably already solved this problem. So I did a quick search and found the documentation for Character Encoding Detection in the Universal Feed Parser.

Thank you.  I strive to be the person that people think about like that while writing things like this.  Really.

I would also like to point out that feedparser.NonXMLContentType is one of three exceptions — along with feedparser.CharacterEncodingOverride and feedparser.CharacterEncodingUnknown — which are derived from the base class feedparser.ThingsNobodyCaresAboutButMe.  If this turns out to be inaccurate, I will be more than happy to change the name of the base class in the next release.

Posted by Mark at

SOAP/REST split is a safe/unsafe split

... [more]

Trackback from Clemens Vasters: Enterprise Development and Alien Abductions

at

“There are trace amounts of everything else — and a lot of that is spam and can’t be trusted”

Actually, none of it - not even GETs - can be trusted.  That’s why it’s important that GET not have side-effects; so that the cost of responding to it stays below the publisher’s caring threshold, aka the “cost of joining the Web community”.

Posted by Mark Baker at

QOTD - In "Web Services", "Web" matters more than "Services"

Tim Bray says: so I think the “Web” part of “Web Services” is more important than the “Services” part. “Web Services” is a phrase weighted toward one of its words; “Web” is the more important one, but you need to know which Web is meant. Technorati Tags: Technology, webdev...

Excerpt from Yining.write() at

Sam’s Two Webs

... [more]

Trackback from Don Box's Spoutlet

at

Putting in the Potent

... [more]

Trackback from From 9 till 2

at

The safe/unsafe Web and what's that X-Bender header?

It’s been very refreshing to see recent discussions about why/how the Web works moving away from the REST vs SOA argument, or POX vs WS, or how cool AJAX is. Here are a few links to get you started: Don: “HTTP, XML, REST, and $100”, “coming out”,...

Excerpt from <savas:blog /> at
