I’ve been looking into differences between the WHATWG URL Living Standard and the combination of RFC 3986 and RFC 3987. I’ve come up with an indirect but effective way to identify the differences. To start with, I downloaded urltestdata.txt and urltestparser. I then wrote a small script to convert the test data into JSON.
I then wrote another script to take this data and pass it through what is advertised as a closely conforming implementation of the relevant RFCs.
Looking at the results, the first set of issues related to the stripping of leading and trailing whitespace, so I updated the script to do that in order to focus on the remaining differences. Similarly, the URL parsing definition includes the leading ? and # in the query and fragment values respectively, so I eliminated those differences in the cases where the values were non-empty.
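A minimal sketch of those two adjustments, using the URL class built into browsers and Node.js. The function names here are illustrative assumptions and are not taken from the actual script; the sketch only shows the idea of trimming the input and dropping the leading ? and # from the search and hash values.

```javascript
// Sketch only: the function names are hypothetical, not those of the
// original script.

// Remove leading and trailing C0 control characters and spaces
// (U+0000 through U+0020), as the URL Standard's parser does.
function stripOuterWhitespace(input) {
  return input.replace(/^[\u0000-\u0020]+|[\u0000-\u0020]+$/g, "");
}

// Drop the leading "?" or "#" so that non-empty values line up with the
// query and fragment components as RFC 3986 defines them.
function toRfcComponents(input) {
  const url = new URL(stripOuterWhitespace(input));
  return {
    query: url.search.replace(/^\?/, ""),
    fragment: url.hash.replace(/^#/, ""),
  };
}

const parts = toRfcComponents("  http://example.com/path?a=b#frag \n");
console.log(parts.query, parts.fragment);
```

Empty values need no adjustment, since search and hash are already empty strings when there is no query or fragment.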
The resulting script produces this output.
The next set of differences concerns canonicalization, so I ran tests using Addressable’s normalize method. Note that this is non-standard. The updated output includes normalization.
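To give a feel for the kind of canonicalization involved, here is a small illustration using the JavaScript URL class, which follows the URL Standard. This is not Addressable’s normalize method, and the sample input is my own; it only shows that a URL Standard parser normalizes as it parses, which a plain RFC 3986 parse does not do.

```javascript
// Illustration only: the sample URL is made up for this example.
// Parsing lowercases the scheme and host, drops the default port,
// resolves "." and ".." path segments, and percent-encodes the space.
const url = new URL("HTTP://EXAMPLE.com:80/a/./b/../c d");
console.log(url.href);
```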
Updates to the test data should be sent as pull requests to w3c/web-platform-tests.
See a user agent that should be included in the results? Visit urltest and leave a comment with the user agent and hex code that the web page reports.
urltest is JS only. Does it make sense to test things like httpie, curl, modules and libraries from ruby, python, php and so on?
Sure! I’ll note that the ‘IETF’ rows actually represent data captured by a Ruby library. My personal preference is to focus on modern, actively maintained or spec compliant applications. A counter-example would be Java.
Opera/9.80 (Macintosh; Intel Mac OS X 10.9.5) Presto/2.12.388 Version/12.16
Added. Thanks!
To address a problem Anne found, I updated urltesttojson.js, regenerated urltestdata.json, captured new results for each browser (thanks, Simon!), and produced new output.
Colors on the initial page triage results:
Clicking through to an individual result, lack of convergence is represented by an entire column in gold. Exceptions thrown are shown in pale violet red (#D87093).
PLH ran these tests using the following user agent:
Mozilla/5.0 (iPad; CPU OS 8_0_2 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12A405 Safari/600.1.4
I compared these results to those obtained from Version/7.1 Safari/537.85.10 on Intel Mac OS X 10_9_5.
Not a single result changed.