Eric Lawrence: we’ve provided web-applications with the ability to opt-out of MIME-sniffing. Sending the new authoritative=true attribute on the Content-Type HTTP response header prevents Internet Explorer from MIME-sniffing a response away from the declared content-type
While I’m not a fan of content-sniffing, one of my few pet peeves with HTML5 is that it endeavors to institutionalize the practice with no provisions for content providers to opt out. As the lesser of the available evils, I hope Microsoft’s proposal is quickly adopted by other browsers.
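To make the proposal concrete, here is a minimal sketch of how a client might honor the parameter. The function names are hypothetical, and this is only my reading of the description quoted above, not IE's actual logic.

```python
def parse_content_type(value):
    """Split a Content-Type header value into (media_type, params)."""
    media_type, *rest = value.split(";")
    params = {}
    for item in rest:
        name, sep, val = item.partition("=")
        if sep:
            # Parameter names are case-insensitive; strip optional quotes.
            params[name.strip().lower()] = val.strip().strip('"').lower()
    return media_type.strip().lower(), params


def may_sniff(content_type_header):
    """Return False when the server has marked its type as authoritative.

    Assumption: a missing header, or any value other than
    authoritative=true, leaves the client free to sniff.
    """
    if content_type_header is None:
        return True
    _, params = parse_content_type(content_type_header)
    return params.get("authoritative") != "true"
```

With this sketch, `may_sniff("text/plain; authoritative=true")` is false, while a bare `text/plain` leaves the sniffing door open.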
article, aside, footer, header, nav, time
Each appears on my front page. Try putting them on yours.
Each was made up by somebody. Many require significant workarounds to work on IE and Gecko. The combined market share of these two browsers far exceeds 50%, by any measure.
And taken together, these six elements solve less significant problems than the problems caused by institutionalizing the content sniffing done by browsers past.
Oh, those naughty browsers-past.
Wonder how long it will be before Apache’s httpd.conf has a default value for DefaultAuthoritative?
Now THAT, Phil, is a much more interesting discussion to me than a discussion of who is, and who is not, permitted to become a self-appointed benevolent dictator.
I presume that you mean a default value of true? Because a default value of false would be what IE7 implements today, and what is being canonized by the HTML5 spec.
The problem that will stop such a default value from being considered is exactly the problem that is causing IE and the WHATWG to ignore the TAG finding — namely that my postulate seems to correlate closer to reality than the TAG finding does.
Put more succinctly, when considering an RSS 2.0 feed served as text/plain, the correct answer more often than not is that the MIME type is not authoritative, and for any distribution of Apache to default otherwise will result in more bug reports than leaving the default as false.
I wonder when we get authoritative=reallytrue.
No, that would break IE8 beta2. Thus, yet another parameter (reallyauthorative?) would be needed :-).
They sent us .rar as text/plain
I did some testing. I found a random rar file on the internet. Noted with amusement the svn:mime-type. Placed the file on my local machine and served it using the defaults. Verified that Content-Type: text/plain was returned.
Fetched the file using IE 7.0.5730.13, Firefox 3.0, Safari 3.1.2, and Opera 9.50. None of them displayed this content as text/plain. Two identified the content as a RAR file, two declared it as unknown.
Checked the latest Draft Recommendation, currently dated 3 July 2008. Saw no mention of RAR files or anything that justified the current browsers' behavior, though it is a big spec and I may have missed it.
This could be a principled stand on the part of HTML5's editor, or it could merely be an oversight. Given HTML5's past history of documenting the sniffing of content such as this, I suspect the latter.
Of course, I am told that I can opt out of that particular sniffing by putting 512 bytes of comments and/or whitespace before the first element in my feed. That’s until some future date whereby there is a need to implement really-really-really-opt-out whereby I will need to put in place 32,768 bytes of comments and/or whitespace.
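For the curious, a minimal sketch of that padding trick follows. It assumes the feed consists of an optional XML declaration followed directly by the root element, and that the 512-byte figure I was told is counted from the start of the response body. The function name is hypothetical.

```python
def pad_before_root(xml_text, threshold=512):
    """Return UTF-8 bytes in which the first element starts at or after
    `threshold` bytes, by inserting a whitespace-filled comment.

    Assumes an optional XML declaration followed directly by the root
    element (no existing comments or DOCTYPE in between).
    """
    data = xml_text.encode("utf-8")
    start = 0
    if data.startswith(b"<?xml"):
        start = data.index(b"?>") + 2
    prolog, body = data[:start], data[start:].lstrip()
    overhead = len(b"\n<!---->\n")  # newline, comment delimiters, newline
    fill = max(0, threshold - len(prolog) - overhead)
    return prolog + b"\n<!--" + b" " * fill + b"-->\n" + body
```

The result remains well-formed XML, since a comment is permitted between the declaration and the root element.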
Now, to answer Sylvain’s question: the world I would most like to live in is the one that follows the specs that Roy and friends have produced and implemented. However, I don’t expect to live in such a place any time soon. As such, given the choice between the lesser of two available evils (personified by IE and HTML5 in this case), I usually pick the latter, but in a few cases, such as this one, I prefer the former.
So one of the founders of Atom is in favor of people just making up a new “standard” and announcing it on their blogs after they’ve already implemented it. And he either can’t see — or is deliberately obscuring — the difference between “making shit up on a blog” and “standardizing behavior in a formal group setting.”
How far we’ve fallen.
How far we’ve fallen.
I call False dilemma here.
This was proposed on a blog, and ultimately showed up here.
More specifically, I would like to see this authoritative parameter (or separate header, as Sylvain proposes; I care not) take the same path that canvas did: implemented by a vendor, adopted by others, and then standardized in HTML5.
The only remaining question is whether the benevolent dictator will grant an audience to those who wish such to be considered, or if another work group is required.
@Sam:
Speaking from memory, I think the situation with HTML 5 is that it should be treated as application/octet-stream.
Geoffrey: OK, I see that now, and two of the four browsers behaved that way.
I continue to be profoundly uncomfortable with content sniffing, particularly when there are no provisions for an opt-out.
Thank you for making my point so succinctly (though you probably didn’t realize it).
The original RSS autodiscovery “standard” (nee blog post) was TERRIBLE. It didn’t specify what clients should do when multiple autodiscovery links were present; it wasn’t clear that attributes could appear in any order (which led directly to interoperability issues); it wasn’t clear that “alternate” could be one of any number of space-separated keywords in the @rel attribute, that the @rel attribute was not case-sensitive (more interop problems); it wasn’t clear whether relative URIs were supported and how they should be resolved; and on and on.
Oh, and it used “text/xml” for the @type attribute instead of “application/rss+xml”. Even though that mistake was corrected THREE DAYS LATER and I chased my referrers and left comments on OVER 100 BLOGS to notify people about the new syntax, the damage had already been done. To this day, client implementations still need to support the “text/xml” variant that I “standardized” on my blog FOR THREE DAYS, over six years ago. But since there’s no spec governing this behavior, clients are free to implement it in incompatible ways. For example, did you know that Firefox only treats a type="text/xml" link as a feed if the @title attribute contains the magic letters "RSS"? (And of course they’re still tweaking that heuristic as well.)
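To show how many small decisions the original post left open, here is a rough sketch of an autodiscovery client that handles the cases listed above. It is illustrative only: the names are hypothetical, resolving relative URIs against the page URL is an assumption, and the title check only approximates the Firefox heuristic as described here (treated as case-sensitive, which is also a guess).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link> elements that look like RSS feeds."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # A dict makes attribute order irrelevant; HTMLParser has already
        # lowercased the attribute names.
        a = {name: (value or "") for name, value in attrs}
        # @rel is a space-separated, case-insensitive keyword list.
        if "alternate" not in a.get("rel", "").lower().split():
            return
        mime = a.get("type", "").strip().lower()
        is_feed = mime == "application/rss+xml" or (
            # Legacy variant: accepted only when the title mentions RSS.
            mime == "text/xml" and "RSS" in a.get("title", "")
        )
        if is_feed and a.get("href"):
            self.hrefs.append(a["href"])


def find_feeds(html, base_url):
    """Return absolute feed URLs in document order (all links, not just one)."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return [urljoin(base_url, href) for href in finder.hrefs]
```

Returning every matching link in document order sidesteps the multiple-link question rather than answering it, which is rather the point.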
The original RSS autodiscovery “standard” (nee blog post) was TERRIBLE.
I can name two other standards that are just as TERRIBLE. And you know what? All the heuristics that you are inexplicably defending at this time are only making the problem worse.
Add to this the openly dismissive attitude towards efforts to nurture and retain this important aspect of evolvability, and the inevitable result is that those who may not be as predisposed as they should be to participate are given all the excuse that they need to pursue other venues.
Mark Jason Dominus in Good Advice and Maxims for Programmers:
#11953 Of course, this is a heuristic, which is a fancy way of saying that it doesn’t work.
But then, I think this whole thing is a case of #11901.
From a technical perspective, I’m not sure how I feel about this parameter, even though I’ve proposed something similar in the past.
The right time to standardize is a tough issue, as ever. Lots of good things come from making shit up on blogs, and everyone does it. Even Mark’s favorite open source projects. The trick is to find a happy medium. I certainly agree that MS has been tough to work with in the past and continues to be in many areas, but I also understand their reluctance to participate in a process that gives a competitor dictatorial control over most of the Web, and the W3C efforts are slow as usual. I am not sure what to do. Getting indignant probably won’t help (tried it).
All the heuristics that you are inexplicably defending at this time
I don’t even know what the hell you’re talking about. Where did I say I was defending anything? It’s like you’re having an entirely different discussion in your head, and I’m not invited.
Fair enough, when I used the word heuristics, there was an indefinite antecedent involved. Let me rephrase it in the form of a question. Consider the following sections:
§ 2.7.2 Content-Type sniffing: Web pages, § 2.7.3 Content-Type sniffing: text or binary, § 2.7.4 Content-Type sniffing: unknown type, § 2.7.5 Content-Type sniffing: image, § 2.7.6 Content-Type sniffing: feed or HTML.
The above sections I’ll collectively hereafter call heuristics. I’ll also note that these sections are the subject of an active discussion in a formal group setting, and furthermore contain the following text: “The above algorithm is a willful violation of the HTTP specification.” Which brings to mind the question of whose will is being referred to here. Before an answer of n out of m browsers is given, consider that the specification already defines elements that 0 out of m browsers support, like aside; and considers as an error elements that n out of n browsers support, like acronym.
Ah, but I digress. These heuristics are an unabashed and willful violation of the HTTP specification. Furthermore, these sections provide no means for a content provider to “opt out” short of avoiding the specific marker sequences that the specification defines, a process which in some cases is extraordinarily difficult.
And furthermore, let’s put aside the question of who raised the possibility of an opt-out, whatever perceived atrocities they may or may not have perpetrated in the past, and in what venue that question was initially raised. No, let’s merely consider the question itself.
So, without further ado, Mark, do you agree with the heuristics as currently stated in the Draft Recommendation? If the answer is no, permit me two follow-up questions. Would the Draft Recommendation be improved if it permitted a mechanism for a content provider to intentionally and explicitly opt out of these heuristics? Furthermore, given that weight seems to be given over the number of browsers that implement a given function, wouldn’t it be prudent for a number of browsers to implement such functions in the form of a beta in the hopes of obtaining early feedback?
Since you mentioned the authors of HTTP bis: what opinions have you seen expressed by the authors of HTTP bis regarding the numerous HTTP violations present in the HTML5 specification? Have their objections been addressed, either by changes to HTML5 or to HTTP bis?
Is this how I want the world to work? No.
Is it the lesser of the available evils? Until I see any other path that allows the MIME type that I am sending to be treated as authoritative, sadly, yes.
Two identified the content as a RAR file
If one of those two was Firefox, well, telling what it’s doing from the UI requires a careful eye. Identified as a RAR file would be a dialog saying “This is a RAR file” with the “always do this” box not disabled; identified as application/octet-stream by being sniffed as binary would be a dialog saying “This is [going to be, once you save it to your filesystem which is your only choice] a RAR file” with the “always do this” box disabled. Talking about what it will be once it’s saved for sniffed binary has a little whiff of wrongness about it, but not admitting what the browser already knows the OS will do about it, once you go through the required dance of saving it, isn’t actually going to help anyone.
“You have chosen to open Sql2005.rar which is a: RAR file from http://localhost What should Firefox do with this file? Open with [Browse...] Save File.”
The “Do this automatically for files like this from now on.” checkbox is disabled.
IE says “Unknown File Type”, Safari says “Type: Unknown”, and Opera says “Type: RAR File”, with an enabled “Remember choice and do not show dialog again” checkbox.
Until I see any other path that allows me to ...
I see the source of our disagreement. You want a solution to your specific problem, regardless of the cost(1), while I want browsers to converge on common behavior, regardless of whether that behavior solves any specific problem.
(1)In case you were wondering, the “cost” of Microsoft’s solution is that they get to keep making shit up and relying on well-intentioned stooges to complain that other browser vendors “aren’t keeping up.”
Mark, that’s a creative interpretation of our positions; one that would be considerably more credible if you were equally adamant about canvas not being a part of the specification. Apparently, Apple innovating an entire graphics subsystem is barely worthy of mention, but Microsoft putting out a trial balloon for a header that, in some limited cases, makes browsers more conformant with RFC 2616 is cause for alarm.
Anne - thanks for that link. Given the information in that email, I can see Ian believing that he is behaving exactly as he said, and I can see others reading that as merely indicative of a person who wants the working group to proceed at a brisk pace, and not as an indication that their vote would be interpreted as support for a democratically supported dictator.
At this point I’m sort of lost as to what point you’re trying to make though
Julian Reschke summarized it well.
Ian said:
I complained loudly about <canvas> at the time that came out. See, e.g.: <[link]>
Except now Ian, you dismiss any proposal that doesn’t first have an implementation. While that link indicates you have also dismissed anything first proposed by an implementation. You have the most severe case of not-invented-here I have ever seen. In your case the ‘here’ is your own brain.
In the case here of Microsoft proposing authoritative=true, we need to recognize the historical mistakes made so that they don’t get repeated with this parameter. Most importantly, httpd’s DefaultType should simply be removed. It could be replaced with a NullFilenameExtension MIMEType to fulfill the requirement Roy Fielding has described for the common unix practice of text files without a filename extension (e.g., README). To underscore the absurdity of the DefaultType setting, I think the following instructions would put it in perspective:
“Administrators must provide a mime type value for the DefaultType setting. When httpd cannot map a resource to a specific MIME content type, httpd will use this value as the value it pulls out of its ass. Keep in mind that httpd must pull a mime type out of its ass when it doesn’t know the intrinsic type.”
This helps drive home the absurdity of this setting. It also would be helpful if the originator of MIME magic had not called it “magic” since there are a lot of negative connotations among administrators for that word. Using MIME magic in a more widespread way would have led to more accurate server-side sniffing and reduced the need for client-side sniffing prohibited by the HTTP spec.
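As a rough illustration of what server-side magic amounts to, here is a minimal sketch that labels a file from its leading bytes instead of falling back to a default. The byte signatures are the well-known ones for each format; the function name, the table, and the choice of MIME labels (in particular the one used for RAR) are my assumptions, and this is not how httpd’s mod_mime_magic is actually implemented.

```python
# (signature bytes, MIME type) pairs, checked in order.
MAGIC_SIGNATURES = [
    (b"Rar!\x1a\x07\x00", "application/x-rar-compressed"),      # RAR 1.5 to 4.x
    (b"Rar!\x1a\x07\x01\x00", "application/x-rar-compressed"),  # RAR 5
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"%PDF-", "application/pdf"),
    (b"\x1f\x8b", "application/gzip"),
]


def detect_type(head):
    """Return a MIME type for the leading bytes of a file, or None.

    Returning None, rather than a default such as text/plain, is the
    point: the caller learns that the type is unknown.
    """
    for signature, mime_type in MAGIC_SIGNATURES:
        if head.startswith(signature):
            return mime_type
    return None
```

A server that consulted something like `detect_type` before reaching for a default would never have sent that .rar file as text/plain.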
My guess is that proposal-by-implementation is meant to be something along the lines of this:
The builds I linked to in my previous post contain this feature. I’ve uploaded the above example and the reflection demo...The next thing I have to do is to write up a spec proposal for all this work and get it discussed by the CSS and SVG working groups. Based on that feedback we’ll figure out the best way to deliver this functionality in Gecko.
So I think that in the case of IE8, there is a sense this is not an experimental feature in an experimental build, but rather an announcement of what shall come to pass.
I don’t think I’ve ever seen an argument by the IE team explaining why they are against early-stage openness about their intentions or experimental builds, but their past behavior seems to indicate that they do stand pretty firmly against these things.
I don’t think I’ve ever seen an argument by the IE team explaining why they are against early-stage openness about their intentions or experimental builds, but their past behavior seems to indicate that they do stand pretty firmly against these things.
Perhaps it is due to the open hostility and intransigence that such explorations are met with. Either that or the evident hostility and intransigence gives them all the cover they need to not participate.
The authoritative=true proposal is a prime example. Nobody likes content sniffing, and everybody would like to see a path towards a future where such is minimized if not outright eliminated. What is needed first is that a number of user agents agree on a common set of rules and implement such consistently over a long period of time. Ian himself has often talked about this being a process that may take a decade or more to complete, and given such a view, what is or is not done by IE6 becomes somewhat less relevant.
As such, it is the agreement itself rather than the specific solution that is of paramount importance. However, instead of giving the popular browsers a range of choices with clear pros and cons to pick from, Ian feels that he has been given the mandate to select, and perhaps even dictate, which specific solution everybody is to adopt. Even if it varies significantly from what the two largest browsers (in terms of current install base) do today, and varies significantly from what the relevant specifications require.
Furthermore, one of those two has indicated (albeit somewhat indirectly) that they find what is currently specified in HTML5 to meet requirements, and has made a counter-proposal. A second browser (Firefox) has tended to follow IE’s lead in terms of content sniffing, even if it means being at odds with the extant specifications.
I see welcoming and encouraging Microsoft to participate in the mailing list as a key factor in resolving this. My request on Sunday has yet to receive a direct response; however, Chris Wilson has chimed in.
Keep in mind that httpd must pull a mime type out of its ass when it doesn’t know the intrinsic type.
I don’t see a requirement anywhere in RFC 2616 to send a Content-Type header. In fact the spec explicitly grants licence to clients to sniff the type when the Content-Type header is absent, meaning its absence in some cases is explicitly anticipated and permitted. And if web servers would simply not label files they don’t know the type of, then clients could finally quit second-guessing the label when the server does supply one.
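A minimal sketch of what I mean, using Python’s standard mimetypes table as a stand-in for whatever mapping the server uses (the function name is hypothetical):

```python
import mimetypes


def response_headers(filename, length):
    """Build response headers, labeling the body only when the type is known."""
    headers = [("Content-Length", str(length))]
    guessed, _encoding = mimetypes.guess_type(filename)
    if guessed is not None:
        headers.append(("Content-Type", guessed))
    # No fallback default: an unknown type means no Content-Type header,
    # which leaves the client explicitly licensed to sniff.
    return headers
```

Under this scheme a file named README goes out unlabeled, while index.html is sent as text/html, and a label that is present can be taken at its word.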
It kind of boggles the mind why we’re having this discussion at all. The culprit in this mess seems very obvious to me.
Kevin H, I think the implementation requirement Ian cites is primarily meant to reverse the W3C priority of constituencies to make sure that HTML5 does not address the needs of users and authors. Very few implementations are going to invest significant time for the needs of users or authors when Ian has shown such hostility to proposals addressing their needs. Ian even spends time in the open source projects disparaging ideas not-invented-in-his-brain.
In the case of IE8 beta 2, this is clearly a pre-release version of IE. The IE team has even taken time to pre-announce their proposal. Focusing entirely on process, especially on the part of those who endorse the entirely perverted process brought to us by Ian, without raising any substantive criticisms basically guarantees IE will stick with this proposal. I would imagine that if some substantive issues were raised the IE team would take them seriously. However, I can clearly see the IE team responding to discussion held by the HTML WG: discussions that Ian almost completely ignores in the name of what he unilaterally calls “merit” (i.e., he deems the W3C and the HTML WG un-meritorious of his time).
Ok, Rob... enough. Any more, and I will resurrect my no hitting policy.
I happen to believe that most people honestly believe that they are pursuing the best path that they know how; and I most certainly believe that this applies to Ian.
In this case, Ian is correct that design by consensus often leads to monstrosities. However, it is my belief that by swinging the pendulum too far in the other direction, Ian may not have fully considered the implications of his own actions. In the WHATWG he has created an environment that Microsoft will certainly never participate in. To date, he has managed to replicate that environment in the W3C.
Such an approach is self-defeating. Even if it is currently declining, IE continues to enjoy a respectable market share, and will continue to do so for the foreseeable future. Therefore, it is vital to the success of HTML5 that it define something that IE would be willing to implement. It also wouldn’t hurt if Ian did not continue to alienate the IETF and the remainder of the W3C in the process.
I do think that Ian has thought through the consequences of his actions and he and the rest of his supporters are playing a very high stakes poker game as to whether Apple + Opera + Mozilla can force MSFT to do something.
The really disappointing thing is that he doesn’t have a moral high ground because his actions are identical to those of MSFT and IBM and every other previous technical dictator. If he gave the Working Group any chance to actually guide him through an open process, like “Who is for/against/neutral having distributed extensibility in HTML5” and then acted as an editor from that result, then I would be the biggest supporter around. But I have extreme trouble with the W3C even doing this work given that it’s making a mockery of an open process with technical and process oversight.
If it were clear that Mozilla valued documented specifications over compatibility with IE and the web in the context of content sniffing of feeds which are served with an incorrect MIME type, then I would agree: this is “just” a very high stakes poker game.
But if Microsoft’s proposal has enough “truthiness” to it that Mozilla decides to follow Microsoft’s lead, then Ian is not only challenging both the body of existing standards and Microsoft, he is quite possibly deciding to fracture the very coalition that supports him.
Which makes me wonder what the fundamental principle is that Ian seeks to defend?