Sam Ruby

OpenSearch Description Validation

2006-09-13T19:12:22-04:00

Yesterday, DeWitt Clinton IM’ed me that the feed validator is already proving helpful in working out an issue with an OpenSearch description document. That is cool in itself, and knowing that this code is actually being used motivated me to sprint to complete my first pass. At the present time, there are 101 tests. The bulk of the code is here and consists of only 120 or so lines, as I was able to leverage much of the infrastructure already present in the validator.

Undoubtedly, in my haste, I’ve made mistakes and errors of omission. The code may flag non-errors. It may miss real errors. The messages are undoubtedly not as helpful as they could be, but it is a start. Feedback is welcome. However, given that this format involves XML escaping, URL escaping, and MixedCase identifiers, I expect that a validator will prove to be very useful.

Feedback

A few additional pieces of feedback, the first of which I’ve already provided to DeWitt: the spec should be clear that none of the URIs can be relative: this applies not only to Images, but also to the Url templates themselves.
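To make the relative-URI point concrete, here is a minimal sketch of the kind of check a validator could apply. It is not the feed validator's actual code: the helper name `is_absolute` is hypothetical, and it assumes that "not relative" means an http-style URL with both a scheme and a host.

```python
from urllib.parse import urlsplit


def is_absolute(uri: str) -> bool:
    """Return True when the URI carries both a scheme and a network location.

    Assumption: Image values and Url templates are http-style URLs, so a
    missing scheme or host is treated as "relative".
    """
    parts = urlsplit(uri)
    return bool(parts.scheme) and bool(parts.netloc)


if __name__ == "__main__":
    # A template with an explicit scheme and host passes the check.
    print(is_absolute("http://example.com/search?q={searchTerms}"))
    # A path-only template is relative and fails it.
    print(is_absolute("/search?q={searchTerms}"))
```

The template placeholder `{searchTerms}` sits in the query string, so it does not interfere with splitting out the scheme and host.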

The remaining points are rather obscure and relate to internationalization. In the grammar for templates, parameter names and prefixes may contain non-reserved characters, and the grammar specifies percent encoding. Percent encoding applies to bytes, so the character encoding must be specified in this case. Neither the Input nor the Output encoding clearly applies, and each may appear multiple times in any case. This is an edge case that will rarely be seen, so I don’t see any value in allowing one to specify the encoding of parameter names; I’d suggest simply specifying that tprefix and tlname values be encoded in utf-8 before percent encoding.
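A small sketch of why the encoding has to be pinned down: the same parameter name yields different percent-encoded output depending on which bytes it is turned into first. The helper name `encode_name` and the sample name are illustrative only and do not come from the spec.

```python
from urllib.parse import quote


def encode_name(name: str) -> str:
    """Percent-encode a parameter name after first encoding it as UTF-8.

    safe="" means only unreserved characters (letters, digits, "_.-~")
    are left as-is.
    """
    return quote(name.encode("utf-8"), safe="")


if __name__ == "__main__":
    # UTF-8 first, as suggested above: "é" becomes two bytes.
    print(encode_name("café"))  # caf%C3%A9
    # The same name via Latin-1 gives a different, incompatible result.
    print(quote("café", safe="", encoding="latin-1"))  # caf%E9
```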

Finally, and this is simply a clarification, the searchTerms attribute in the Query element MUST be encoded in the same encoding as the enclosing document, not in one of the InputEncodings specified.

Outlook

I’ve obsessed over minutiae, but quite honestly these are mostly either edge cases or things that can be rapidly and accurately flagged by a validator; in either case, the impact of these concerns is very containable. The spec is sound.

Having now taken a close look at this specification, I’m rather bullish on its future. In many ways it is RSS’s textInput done right. The purpose of textInput has always been something of a mystery. The core issue is that the workflow is all wrong. One needs this information to discover a feed: after you have the feed it is a bit too late.

Ideally, everybody who supports feed autodiscovery in response to user input would also support OpenSearch autodiscovery, at the very least in HTML/XHTML.

This, coupled with standardized feed extensions and/or microformats, could also bootstrap an emerging genre of tools: mashup generators/IDEs.

OpenSearch Description Validation

2006-09-13T21:38:06-04:00

Is this a good place to leave feedback where DeWitt will read it?

For accessibility reasons, the Image element needs a way to specify alternate text, like the alt attribute in HTML. Yes, accessibility applies to XML vocabularies too.
Limiting names to 64 characters is stupid. In fact, limiting any element to a specific character length is stupid. RSS did that, and everyone ignored it.
Language in the autodiscovery section is very vague. “The "type" attribute must contain the value "application/opensearchdescription+xml"” Does this mean that the attribute must be exactly that value? What about leading and trailing whitespace (allowed in HTML)? What about media type parameters (also allowed)?
Similarly, “The "rel" attribute must contain the value "search".” Can I combine that with other rel values (HTML says ‘rel’ is a whitespace-separated list of values)? Can I use leading and trailing whitespace (HTML says yes)? Is case significant (HTML says no)? At the very least, you could mention that the rules of HTML apply and that implementers wishing to parse such autodiscovery elements should be familiar with those rules.
If you’re going to use RFC 2119 language, say so, and capitalize keywords appropriately.
“Tags must be a single word and are delimited by the space character (' ').” There are 18 space characters in Unicode 4. Please be more specific.
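
One lenient reading of the autodiscovery questions above, following the HTML rules mentioned, could be sketched as follows. This is an illustration under assumptions, not what the spec or any browser mandates: the function names `rel_matches` and `type_matches` are hypothetical, and media type parameters are simply ignored.

```python
OPENSEARCH_TYPE = "application/opensearchdescription+xml"


def rel_matches(rel: str, wanted: str = "search") -> bool:
    """Treat rel as a case-insensitive, whitespace-separated list of values.

    Note: str.split() splits on any whitespace, a slight superset of the
    whitespace characters HTML recognizes.
    """
    return wanted in rel.lower().split()


def type_matches(type_attr: str, wanted: str = OPENSEARCH_TYPE) -> bool:
    """Compare the media type only, ignoring parameters, case, and padding."""
    media_type = type_attr.split(";", 1)[0].strip().lower()
    return media_type == wanted


if __name__ == "__main__":
    # Extra rel values, mixed case, and surrounding whitespace are tolerated.
    print(rel_matches(" Search alternate "))
    # A parameter after the media type is ignored for matching purposes.
    print(type_matches(" application/opensearchdescription+xml; charset=utf-8 "))
```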

OpenSearch Description Validation

2006-09-13T22:01:57-04:00

Other than that, I’m in.

Sam Ruby: OpenSearch Description Validation

2006-09-13T22:15:17-04:00

Sam Ruby: OpenSearch Description Validation - it’s so nice to have Sam on board the OpenSearch train :-)...

OpenSearch Description Validation

2006-09-14T00:01:59-04:00

BTW, I added the OpenSearch autodiscovery elements to my page headers, and Firefox 2.0b2 recognizes them out-of-the-box and offers to add them as custom search engines in the search engine dropdown box (on the left side of the search bar). It picks up the icon and ShortName I specified and adds it to my search engine list, and from then on I have an easy way to search my site without going there first or adding extra parameters to the Google search. Too cool.

OpenSearch Description Validation

2006-09-14T01:31:00-04:00

Mark, yes, it’s a good place. I’m tabulating all of the feedback for the OpenSearch wiki and we can hash out the final version of OpenSearch 1.1 (and beyond) over there. But in the meantime, I’m happy to follow the thread wherever people are most comfortable discussing it.

Thanks!

OpenSearch Description Validation

2006-09-14T06:38:24-04:00

I’m tabulating all of the feedback for the OpenSearch wiki and we can hash out the final version of OpenSearch 1.1 (and beyond) over there.

Is there a specific page on the wiki where this is occurring? The issues that Mark and I have come up with tend to be harder to explain than to fix: a few short sentences can clear up all of these.

OpenSearch Description Validation

2006-09-14T08:22:21-04:00

Sam,

Not just yet, but depending on whether or not I have wifi access today, I will post it by the end of the day. I had planned to start a page that will list a short summary of each proposal, which can in turn be linked to a full page for more discussion if needed.

I’ll comment here when the page is up and you guys can let me know what you think.

OpenSearch Description Validation

2006-09-14T11:13:36-04:00

Mark:

At the very least, you could mention that the rules of HTML apply and that implementers wishing to parse such autodiscovery elements should be familiar with those rules.

Yes, that was certainly the intention; I suppose it could be made clearer, though. Nothing can top the comprehensiveness of [link], of course :-)

OpenSearch Description Validation

2006-09-14T15:12:28-04:00

Just jotting down an idea here in the hopes that DeWitt and others will notice it...

I don’t believe that this would require any changes at all to the spec, but probably should simply be enumerated somewhere so that people know about it as an option.

links for 2006-09-14

2006-09-19T06:45:37-04:00

From the blogroll… OpenSearch Description Validation Ultimate Blog Post of Ultimate Destiny Sharing...

Link: Unicode Spaces

2006-11-08T11:15:39-05:00

Link: Unicode Spaces Presented without comment, the 18 spaces of Unicode. (From Sam Ruby.)...