Validome
There’s a new validator out there (at least new to me) that not only validates HTML, XHTML, WML, XML, DTD-Schema, and Google Sitemaps, but is also an “Advanced” Feed-Validator for RSS and Atom. It does appear to be pretty good, and as the development appears to be test-driven I'm confident that it will not only stay that way, but will continue to get better as more people use it.
I took a look at the 458 Atom 1.0 test cases defined (the Feed Validator currently has 763), and found a few items worthy of note, though some of these issues affected multiple tests.
| test | description | validation (FeedValidator) | validation (Validome) | comments |
|---|---|---|---|---|
| 100 | extending link elements | FeedValidator | Validome | Places where I assert that Validome deviates from the specification. |
| 106 | rel attribute with relative IRI | FeedValidator | Validome | |
| 150 | entry document with no author | FeedValidator | Validome | |
| 185 | scheme-attribute with a relative IRI | FeedValidator | Validome | |
| 292 | content element with a src attribute | FeedValidator | Validome | Feeds where the FeedValidator reports on more than Validome does. |
| 335 | multiple category elements | FeedValidator | Validome | |
I didn’t look as closely at the other feed formats, but did take a quick pass.
Validome does pay special attention to Mime types and charsets, issuing warnings when the Atom mime type isn't used for Atom feeds, and disallowing the unregistered but popular application/rss+xml.
While Validome says it will validate 0.91, the NetScape version of this specification required a DOCTYPE, and this validator does not support feeds with DTDs. It also seems to prefer the UserLand spelling of textInput.
Validome validates 0.92, which says that all sub-elements of item are optional. That spec also says that any 0.92 source is a valid 2.0 source, and yet the RSS 2.0 spec says that at least one of title or description must be present. At least one of these three statements must be false. The feed validator takes a more conservative route and requires at least one of title or description independent of the RSS version; Validome permits 0.92 feeds to omit both.
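The conservative rule described above (at least one of title or description on every item, whatever the RSS version) can be sketched in a few lines of Python. This is only an illustration of the rule, not the Feed Validator's or Validome's actual code; the function name and the sample feed are made up for the example.

```python
import xml.etree.ElementTree as ET


def items_missing_title_and_description(rss_text):
    """Return the 0-based positions of <item> elements that have
    neither a <title> nor a <description> child."""
    root = ET.fromstring(rss_text)
    missing = []
    for index, item in enumerate(root.iter("item")):
        if item.find("title") is None and item.find("description") is None:
            missing.append(index)
    return missing


# Hypothetical 0.92 feed: the second item carries only a link.
SAMPLE = """<rss version="0.92">
  <channel>
    <title>Example</title>
    <item><title>Has a title</title></item>
    <item><link>http://example.com/only-a-link</link></item>
  </channel>
</rss>"""

print(items_missing_title_and_description(SAMPLE))
```

A validator that follows the letter of 0.92 would accept the second item; one that applies the 2.0 rule everywhere would flag it.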
Validome doesn’t seem to verify that RSS 1.0 extensions are valid RDF/XML.
With regard to RSS 2.0, I'd suggest that Validome take a look at the profile work being done by the RSS Advisory Board. I'd also be curious to see how it handles the perennially problematic RFC 822 style dates that the RSS 2.0 specification mandates.
Validome does validate Atom 0.3, which should make Matt happy.
Much better than the W3C HTML validator. Among other things, it validates attribute values.
Unfortunately, it seems to screw up on the allowed values for width attribute of the MathML <mspace /> element (and, I’m gonna guess, <mpadded />, as well).
Probably, there are other glitches as well, but ... wow! ... what a good start.
Posted by Jacques Distler at
Any info about the underlying technology? On the About section it says “object-oriented” and “template-based”, and one person in the team is a Perl and C/C++ expert... But how has Validome been done?
Posted by Giulio Piancastelli at
@Sam
Sam, please consider that our feed validator was launched only three days ago, so we are aware that there is still much work to do on it. Furthermore, it is not a problem for us to change the behaviour of the validator in a short time, IF we misunderstood / misinterpreted the specifications. The behaviour of our validator reports (errors / warnings / ok) is completely independent from processing, data gathering, parsing, etc. In most cases, it can be done in minutes.
There are still some points to discuss regarding the test cases you have mentioned above:
100: FeedValidator fails here, as there is no element foo
106: rel_attribute:
<quote>Note that use of a relative reference other than a simple name is not allowed</quote>
As we understand the specification, “/foo” IS a relative reference (because of the leading "/")
150: Validome fixed behaviour
185: Validome fixed behaviour
292: FeedValidator has two WARNINGS, BUT the specification says:
<quote>atom:entry elements MUST contain an atom:summary element in either of the following cases:
the atom:entry contains an atom:content that has a “src” attribute (and is thus empty).</quote>
So, here Validome is right, this is not a warning but an error, as Validome reports.
335: FV reports an error “Unexpected Text”
I doubt that this is right, because the specification allows "undefinedContent", and as undefinedContent = (text|anyForeignElement)*, text is allowed there.
While Validome says it will validate 0.91, the NetScape version of this specification required a DOCTYPE, and this validator does not support feeds with DTDs
Fixed, thank you.
I’d also be curious to see how it handles the perennially problematic RFC 822 style dates that the RSS 2.0 specification mandates.
Take a look over here.
disallowing the unregistered but popular application/rss+xml.
In my humble opinion, the popularity of application/rss+xml is not a reason for non-conformity (as unregistered). Still, we changed this to a warning; if not, our validator would interrupt the validation process, which is not very user friendly. This behaviour is still ambiguous and not all members of our team agree with it.
Thank you for your efforts, comments and suggestions. You are always welcome to make proposals for improvement or criticize our work!
During the next days, we will run your testsuite and - in cases where necessary - improve the reliability of our validation results.
@Giulio
But how has Validome been done?
C, C++ and Java. At the moment it runs on Apache (RedHat), but as it is platform independent, it should not be a serious problem to get it working on IIS or Mac. Screen output is done in PHP.
We have begun to work on a local application; the first version will be only for Windows users, as it is made with .NET.
@Jacques
it seems to screw up on the allowed values for width attribute of the MathML <mspace /> element
We will take a look at this issue on Monday. Thank you for your comments!
Posted by Alex at
Sam, please consider that our feed validator was launched only three days ago, so we are aware that there is still much work to do on it.
I agree with Jacques: wow! ... what a good start.
In most cases, it can be done in minutes.
Agreed.
100: FeedValidator fails here, as there is no element foo
Fair enough.
106: rel_attribute: As we understand the specification, “/foo” IS a relative reference (because of the leading "/")
Either I misread the Validome output, or it has been fixed, as it is now correct.
150: Validome fixed behaviour
Not deployed online yet?
185: Validome fixed behaviour
Not deployed online yet?
292: FeedValidator has two WARNINGS, BUT the specification says:
You are right. Fixed.
335: FV reports an error “Unexpected Text”
Fair enough.
While Validome says it will validate 0.91, the NetScape version of this specification required a DOCTYPE, and this validator does not support feeds with DTDs
Fixed, thank you.
There are other differences. Image is not required in NetScape 0.91. Essentially, there are two different standards with the name RSS 0.91.
I’d also be curious to see how it handles the perennially problematic RFC 822 style dates that the RSS 2.0 specification mandates.
Take a look over here.
Two points:
- RFC 822 allows all sorts of things, like nested comments, so these should not generate an error; this being said, such features are discouraged and not widely supported, so a warning is definitely in order
- More specific warnings are in order here. I’m amazed at how often people get the day of the week wrong, for example; and this generates a lot of questions. Few would be able to spot the problem with this date, for example.
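As an illustration of the kind of specific warning suggested above, here is a minimal Python sketch that compares the day name written in a date with the day the calendar date actually falls on. It relies on the standard library's `email.utils.parsedate_tz`; the function name and the sample dates are assumptions for the example and are not taken from either validator.

```python
import datetime
from email.utils import parsedate_tz

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_mismatch(date_text):
    """Return (claimed, actual) day names when they disagree, else None."""
    parsed = parsedate_tz(date_text)
    if parsed is None:
        raise ValueError("not parseable as an RFC 822 date: %r" % date_text)
    claimed = date_text.strip()[:3]
    if claimed not in DAY_NAMES:
        return None  # the day of the week is optional in RFC 822
    # parsed[0:3] holds year, month and day of the month.
    actual = DAY_NAMES[datetime.date(parsed[0], parsed[1], parsed[2]).weekday()]
    return None if claimed == actual else (claimed, actual)


# 25 Sep 2006 was a Monday, so the first date below is internally inconsistent.
print(weekday_mismatch("Tue, 25 Sep 2006 14:11:22 GMT"))
print(weekday_mismatch("Mon, 25 Sep 2006 14:11:22 GMT"))
```

A message along the lines of "the date says Tue, but 25 Sep 2006 was a Mon" is far easier to act on than a generic "invalid date".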
disallowing the unregistered but popular application/rss+xml.
In my humble opinion, the popularity of application/rss+xml is not a reason for non-conformity (as unregistered). Still, we changed this to a warning; if not, our validator would interrupt the validation process, which is not very user friendly. This behaviour is still ambiguous and not all members of our team agree with it.
Understood.
During the next days, we will run your testsuite and - in cases where necessary - improve the reliability of our validation results.
If you see anything you disagree with, let me know.
FYI: I’ve also made a change to the FeedValidator so that if the Validome test cases are placed in (for example) testcases/validome/atom_1_0/*.xml, you can check the results by running:
python src/validtest.py testcases/validome/atom_1_0/*.xml
You can find instructions on installing and running the feed validator here.
Posted by Sam Ruby at
Sam, we will implement FeedValidator’s validation results in our testsuite. I hope this step will help in discussing ambiguous or even contradictory issues of the respective specifications.
We discuss at the moment with Olivier Thereaux from W3C the possibility of creating a kind of test framework for interested webmasters and developers, so people can easily contribute to or discuss on test cases and associated problems; the test framework was originally planned for HTML and XHTML, but as we launched the RSS and Atom validator, we began with these test cases first.
150 + 185 --> Not deployed online yet?
Oh, we’ve fixed these issues only on our internal server, I hope tomorrow they will be online. Thank you!
Posted by Alex at
BTW: Trying to validate the W3C main file, FV says it’s valid, Validome says NO.
Your opinion?
Posted by Alex at
We discuss at the moment with Olivier Thereaux from W3C the possibility of creating a kind of test framework for interested webmasters and developers, so people can easily contribute to or discuss on test cases and associated problems.
Why don’t we simply create a project at code.google.com and merge our test cases, under the MIT license?
Trying to validate the W3C main file, FV says it’s valid, Validome says NO. Your opinion?
I don’t believe that Validome should report errors on lines 14-17. RDF/XML was revised in 2004, and resource became rdf:resource.
Validome is correctly reporting errors on lines 27, 33, 39, and 57. The FeedValidator does not currently implement this check.
I don’t believe that any error should be reported on line 73 as rdf:Description is defined and RSS 1.0 is extensible.
Posted by Sam Ruby at
Why don’t we simply create a project at code.google.com and merge our test cases, under the MIT license?
A good idea, we will discuss this within the next days.
I don’t believe that Validome should report errors on lines 14-17. RDF/XML was revised in 2004, and resource became rdf:resource. [...] I don’t believe that any error should be reported on line 73 as rdf:Description is defined and RSS 1.0 is extensible.
We will take a look at these issues tomorrow.
also seems to prefer the UserLand spelling of textInput.
A very good point Sam. We discussed this and within next days we will give users the possibility to select between validation
--> 0.91 “Netscape-Mode” and
--> 0.91 “UserLand-Mode”
including a little bit of explanation on differences. It’s probably the best (most fair and transparent) way to provide 0.91 validation.
Thank you again for your efforts and suggestions.
During the next days, we will run your testsuite and - in cases where necessary - improve the reliability of our validation results.
To help with this, I committed a change to make it easier to compare Feed Validator and Validome test results for any of the Feed Validator tests for RSS 2.0 or Atom 1.0. Simply click through any section number (example), and you should see two icons next to every test.
I haven’t tried many things, but I did encounter this.
If we do merge our tests, I’m confident that the quality of both efforts will benefit.
Posted by Sam Ruby at
I haven’t tried many things, but I did encounter this.
It looks like you guys have been busy fixing things. Excellent!
within next days we will give users the possibility to select between validation 0.91 “Netscape-Mode” and 0.91 “UserLand-Mode” including a little bit of explanation on differences.
Skip hours has a different range in the two formats.
More troublesome: Netscape 0.91 does not specify the date format but does contain an example that shows an RFC 2822 formatted date with a four-digit year, while UserLand 0.91 does not contain an example with a date but does specify RFC 822 as the date format, a format that only permits two-digit years. At the present time, the Feed Validator applies RSS 2.0’s date processing rules on RSS 0.91 feeds of either format. How will Validome handle this?
- - -
Another question to ponder: should Validome issue a warning on this feed?

    $ curl --head http://www.zeldman.com/feed/zeldman.xml
    HTTP/1.1 302 Found
    Date: Mon, 25 Sep 2006 14:11:22 GMT
    Server: Apache/2.0.52 (Red Hat)
    Location: http://www.zeldman.com/rss/
    Connection: close
    Content-Type: text/html; charset=iso-8859-1

    $ curl --head http://www.zeldman.com/rss/
    HTTP/1.1 200 OK
    Date: Mon, 25 Sep 2006 14:11:34 GMT
    Server: Apache/2.0.52 (Red Hat)
    X-Powered-By: PHP/4.3.9
    Set-Cookie: bb2_screener_=1159193494+66.57.27.65; path=/
    X-Pingback: http://www.zeldman.com/xmlrpc.php
    Last-Modified: Fri, 22 Sep 2006 20:34:58 GMT
    ETag: "518eb41249b66ceff2a55ef01e924222"
    Status: 200 OK
    Connection: close
    Content-Type: text/xml; charset=UTF-8

Posted by Sam Ruby at
It looks like you guys have been busy fixing things.
We are working on it, as our spare time allows but we try our best.
to make it easier to compare Feed Validator and Validome test results for any of the Feed Validator tests for RSS 2.0 or Atom 1.0. Simply click through any section number (example), and you should see two icons next to every test.
That’s what we are working on at Validome at the moment. BTW: I hope by using Feed Validator’s icon / logo for processing FV validation results, there would be no problem with trademark issues, isn’t it?
Posted by anonymous at
I hope by using Feed Validator’s icon / logo for processing FV validation results, there would be no problem with trademark issues, isn’t it?
No problem whatsoever. The code, tests, and documentation are all open source.
Posted by Sam Ruby at
BTW: I hope by using Feed Validator’s icon / logo for processing FV validation results, there would be no problem with trademark issues, isn’t it?
Do you mean this icon?
The feed icon was created by the Mozilla Foundation and subsequently promoted as a universal icon to represent feeds. There should be no problem with trademark issues.
Feed icon usage guidelines can be found here.
Posted by Kevin H at
Kevin, I don’t mean the feed icon, but the graphical element representing the feed validator.
Sam, at the moment we are working on the xml:base issue, as it is quite tricky regarding IRI and usage of relative references (one example: rel="/foo") = having fun with DOM.
Posted by Alex at
Sam, “Zeldman-Bug” now fixed, it was because of the header redirect.
Feed Validator’s results are now completely implemented for all feed versions. We used the icon Kevin mentioned.
Posted by Alex at
Alex, you seem to believe that there are seven errors on my feed. Needless to say, I disagree.
Posted by Sam Ruby at
Feed Test Cases Project?
Sam Ruby: “Why don’t we simply create a project at code.google.com and merge our test cases, under the MIT license?” +1. This would be excellent. To help things along, I offer to contribute the Feed Security tests to this project and...
Excerpt from snellspace.com at
Alex, you seem to believe that there are seven errors on my feed. Needless to say, I disagree.
Line 6, Column 4: The icon element must contain a complete / absolute (not relative) and valid URL.
--> The specification says, it is an IRI reference and that MUST be absolute if you don’t have a xml:base.
This is logical: if you, e.g., share a feed at another URL, how should a user agent or machine know how to resolve a relative reference?
Line 13, Column 6: The same aspect. Specs 3.2.2
IRIs must be absolute to be resolved. There is only one exception: if you define xml:base, as then references CAN BE RESOLVED.
Posted by anonymous at
There is only one exception: if you define xml:base, as then references CAN BE RESOLVED.
OK, so we are down to two errors reported. I continue to be impressed by the responsiveness of the Validome team.
Take a close look at the definition of xml:base. It turns out that for any feed fetched over HTTP, it is always defined.
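To make the point about xml:base concrete: under XML Base, when no xml:base attribute is present the base falls back to the URI the document was retrieved from, so a relative reference in a feed fetched over HTTP still has something to resolve against. The Python sketch below illustrates that fallback with `urllib.parse.urljoin`; the function name and the example URLs are made up, and it is not how either validator implements resolution.

```python
from urllib.parse import urljoin


def resolve_reference(reference, document_url, xml_base=None):
    """Resolve a (possibly relative) reference found in a feed.

    An explicit xml:base, which may itself be relative, is first resolved
    against the URL the document was fetched from; without one, the
    retrieval URL serves as the base.
    """
    base = urljoin(document_url, xml_base) if xml_base else document_url
    return urljoin(base, reference)


# No xml:base: the retrieval URL is the base.
print(resolve_reference("/favicon.ico", "http://example.org/blog/index.atom"))
# Explicit xml:base on the element or an ancestor.
print(resolve_reference("icon.png", "http://example.org/blog/index.atom",
                        xml_base="http://example.com/feeds/"))
```

The case where no base exists at all (for example, a feed pasted into a form) is the one where a relative reference genuinely cannot be resolved.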
Posted by Sam Ruby at
I finally got around to giving this a test with my Atom feed. My first problem was that it wouldn’t accept an IRI for the source URL (which I suppose is fair enough). So then I tried with the punycode URI, but that didn’t help much since the next thing it choked on was the DTD. I wouldn’t mind so much if it just said “sorry we don’t support DTDs”, but I think it’s a bit harsh to claim my feed is invalid because of it.
Posted by James Holderness at
@Sam
Take a close look at the definition of xml:base.
We’ll do it today and respond.
@James Holderness
My first problem was that it wouldn’t accept an IRI for the source URL
Is the IRI relative or absolute?
but I think it’s a bit harsh to claim my feed is invalid because of it.
Could you post the URL of your feed, so we can take a look? You can send it per mail too, if you prefer (address: al at validome dot org) or use our contact us form. Thank you!
Posted by Alex at
Is the IRI relative or absolute?
Absolute. I’m talking about the IRI of the feed itself - the thing I want to validate. And by IRI I meant an IRI with an IDN in case that wasn’t clear.
Could you post the URL of your feed, so we can take a look?
www.詹姆斯.com/feed
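For readers unfamiliar with the IRI versus punycode distinction mentioned above, here is a rough Python sketch of converting only the host part of an IRI to its ASCII (punycode) form using the standard library's `idna` codec. The `http://` scheme is an assumption added for the example, the function name is hypothetical, and the sketch ignores user info and does not percent-encode non-ASCII characters in the path or query.

```python
from urllib.parse import urlsplit, urlunsplit


def host_to_punycode(iri):
    """Return the IRI with its host name converted to ASCII (punycode)."""
    parts = urlsplit(iri)
    if parts.hostname is None:
        raise ValueError("expected an absolute IRI with a host: %r" % iri)
    # The stdlib "idna" codec converts each label of the host name.
    ascii_host = parts.hostname.encode("idna").decode("ascii")
    netloc = ascii_host if parts.port is None else "%s:%d" % (ascii_host, parts.port)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(host_to_punycode("http://www.詹姆斯.com/feed"))
```

A validator front end could apply a step like this before fetching, rather than rejecting the IRI outright.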
Posted by James Holderness at
Hmm. Validome complains about my feed, too, for a totally bogus reason. It says: “Illegal namespace declaration (http://www.w3.org/1999/xhtml) - this must be xmlns="http://www.w3.org/2005/Atom.”
I declare the XHTML namespace as the default namespace and bind the Atom namespace to the prefix a.
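The reason this complaint is bogus is that a namespace-aware parser identifies elements by namespace name plus local name, not by prefix. A small Python sketch with `xml.etree.ElementTree` shows two spellings of the same feed yielding identical expanded names; both sample documents are invented for the illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Atom as the default namespace.
DEFAULT_NS_FEED = '<feed xmlns="http://www.w3.org/2005/Atom"><title>t</title></feed>'

# XHTML as the default namespace, Atom bound to the prefix "a".
PREFIXED_FEED = (
    '<a:feed xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:a="http://www.w3.org/2005/Atom"><a:title>t</a:title></a:feed>'
)


def expanded_names(xml_text):
    """List every element as '{namespace}local-name', ignoring prefixes."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


print(expanded_names(DEFAULT_NS_FEED) == expanded_names(PREFIXED_FEED))
```

A check that looks at the literal `xmlns` attribute on the root element, instead of the resolved namespace of each element, will misreport feeds written the second way.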
Hmmm. Some more XHTML 1.1 + MathML 2.0 problems:
1. <mtable width="..."> also screws up.
2. In MathML 1.0, <mtr> and <mtd> elements were inferred. In MathML 2.0, this is no longer the case. The children of an <mtable> element must be zero or more <mtr> (or <mlabeledtr>) elements and the children of <mtr> must be zero or more <mtd> elements. Or, at least, that’s the way I read the spec.
@Henri
for a totally bogus reason. It says: “Illegal namespace declaration (http://www.w3.org/1999/xhtml) - this must be xmlns="http://www.w3.org/2005/Atom.”
This is fixed now, sorry for the inconvenience.
Your feed is still NOT valid:
- regarding the warnings Validome agrees with FV results
More than that, the specification says
<quote_spec>
atom:entry elements that contain no child atom:content element MUST contain at least one atom:link element with a rel attribute value of “alternate”.
</quote_spec>
So, this is clearly an error and for Sam some work to fix.
@Jacques
Jacques, we are currently working under pressure on the feed validator. Please be patient for a few days; we will surely take a look and fix bugs, if necessary. Thank you for your efforts!
@James
I wouldn’t mind so much if it just said “sorry we don’t support DTDs”, but I think it’s a bit harsh to claim my feed is invalid because of it.
Yes, you are right. We will fix this behaviour.
Furthermore, we are looking for an acceptable solution for accepting DTDs; as you surely know, the problem is that a DTD within a feed file can be valid on its own BUT contain, for example, elements not allowed in the appropriate specification.
Alex: If the rel attribute is unspecified in a link element it is equivalent to rel="alternate". The entry that you’re reporting an error on contains <a:link href="http://hsivonen.iki.fi/kesakoodi/clipboard/"></a:link>, which is equivalent to <a:link rel="alternate" href="http://hsivonen.iki.fi/kesakoodi/clipboard/"></a:link>.
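The rule under discussion can be sketched in Python as follows: an entry without atom:content needs at least one atom:link whose rel is "alternate", and a missing rel counts as "alternate". This is a simplified illustration with a hypothetical function name; among other things it does not treat the full IANA relation IRI as equivalent to the short name.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def entry_has_content_or_alternate(entry):
    """True if the entry has atom:content or an alternate atom:link."""
    if entry.find(ATOM + "content") is not None:
        return True
    for link in entry.findall(ATOM + "link"):
        # A link without a rel attribute defaults to rel="alternate".
        if link.get("rel", "alternate") == "alternate":
            return True
    return False


# The link from the entry discussed above, with no rel attribute.
ENTRY = ET.fromstring(
    '<a:entry xmlns:a="http://www.w3.org/2005/Atom">'
    '<a:link href="http://hsivonen.iki.fi/kesakoodi/clipboard/"></a:link>'
    '</a:entry>'
)

print(entry_has_content_or_alternate(ENTRY))
```

Reading the attribute with a default, as `link.get("rel", "alternate")` does here, is what keeps the check from flagging entries like this one.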
Posted by James Snell at
Posted by Sam Ruby at
If the rel attribute is unspecified in a link element it is equivalent to rel="alternate"
@Henri + James + Sam
“Sivonen-Bug” fixed, thanks to James and Sam for clarification. Henri’s feed validates now.
@Sam
you can check the results by running: python src/validtest.py testcases/validome/atom_1_0/*.xml
Sam, as we need the header information (charset, mime type) we would need a possibility to access the script via HTTP request (not a local version). Could you help with this, please? (of course, only if your time allows)
Thank you in advance.
Sam, as we need the header information (charset, mime type) we would need a possibility to access the script via HTTP request (not a local version). Could you help with this, please?
I’m not sure if you are asking about the results of running the Feed Validator against the Validome tests, or the results of running Validome against the Feed Validator tests.
I can help with the former: here’s the current set of 127 differences for the Atom 1.0 tests. Clearly, some (like the first) are due to Validome catching something the Feed Validator doesn’t currently catch.
Longer term, we should pursue merging the tests with other like minded individuals, like James’s Abdera tests.
Posted by Sam Ruby at
@James
James, we discussed the integration of DTD support (it was NOT the first time we did so) and came to the conclusion to implement this feature.
As you surely know, there are a bunch of pitfalls associated with this issue, but we found a way to proceed.
Since more people are involved in development and fundamental changes, we will need approx. two days to handle this feature; we hope the results of our work will honor your efforts.
We are curious how FV handles some tricky DTD test cases.
Posted by Alex at
We are curious how FV handles some tricky DTD test cases.
If a DOCTYPE is spotted, I call a validating parser, and include any output it produces into the results: example.
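As a rough illustration of the first half of that approach (spotting a DOCTYPE before deciding whether to hand the document to a validating parser), the Python standard library's expat binding exposes a `StartDoctypeDeclHandler` callback. The sketch below only detects the declaration; it does not validate against the DTD, and it is not the Feed Validator's actual code. The function name and sample documents are assumptions for the example.

```python
import xml.parsers.expat


def find_doctype(xml_text):
    """Return (name, system_id, public_id) for the DOCTYPE, or None."""
    found = []
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found.append((name, system_id, public_id))

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)
    return found[0] if found else None


# Sample feed carrying the DOCTYPE used by Netscape-style RSS 0.91.
WITH_DOCTYPE = (
    '<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" '
    '"http://my.netscape.com/publish/formats/rss-0.91.dtd">'
    '<rss version="0.91"><channel><title>t</title></channel></rss>'
)
WITHOUT_DOCTYPE = '<rss version="0.91"><channel><title>t</title></channel></rss>'

print(find_doctype(WITH_DOCTYPE))
print(find_doctype(WITHOUT_DOCTYPE))
```

Once a DOCTYPE is detected, a tool can either run a validating parser or, as suggested earlier in the thread, at least report that DTDs are unsupported instead of declaring the feed invalid.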
Posted by Sam Ruby at
Hi everybody,
Now we have implemented full DTD support (that means Validome now also supports validation of external DTDs in feed files), but consider that the implementation is currently done only for Atom 1.0. DTD support for RSS 0.91, etc. will be completed within the next days, as time allows.
James' feed validates now. James, thank you very much for your help and suggestions!
Feel free to comment any further bug you see.
@Sam
If a DOCTYPE is spotted, I call a validating parser, and include any output it produces into the results: example.
It seems, FV has some problems with the DTD support, here are 2 examples:
Sam I didn’t mean the backend but the correctness of validation results.
This one should be fixed. FV seems also not to support correct processing of external DTDs.
Take a look at the test cases from
to
...last one is the error ID number, these are 7 quite interesting test cases.
Validome handles now errors in DTDs by suggesting users to fix the DTD issues FIRST; source of the incorrect DTD will be displayed, charset too.
I’m not sure if you are asking about the results of running the Feed Validator against the Validome tests
Yes, that is what we need help for: a possibility for HTTP request would be fine.
Posted by Alex at
Sorry, the first and last link of my post don’t work. It must be:
- James' feed validates now.
- ...by suggesting users to fix the DTD issues FIRST;
I’m not sure if you are asking about the results of running the Feed Validator against the Validome tests
Yes, that is what we need help for: a possibility for HTTP request would be fine.
At the moment, the longest part of the process is the downloading of all of the Validome tests. I parse the main page looking for links, and download them. This both takes time and the code needs to be updated if the page changes. If the plan is to ultimately set up a project for feed test cases, I’d rather focus my efforts there. With something like an svn repository, I could simply update to the latest and run.
Posted by Sam Ruby at
@Sam
I parse the main page looking for links, and download them. This both takes time and the code needs to be updated if the page changes.
Oh, it’s not necessary to do this. We will make a XML file for you which will be automatically updated when we add test cases. When completed, I’ll send you the URL.
If the plan is to ultimately set up a project for feed test cases, I’d rather focus my efforts there.
That is much better.
At the moment, we are coordinating this with Olivier, who wants to host this project on W3C; originally, there was a plan to create a testsuite for all markup stuff (HTML, XHTML, etc.)
As we launched our feed validator last week, we completed the RSS and Atom testsuite first.
As Olivier wrote me, he doesn’t see any conflict of interests if we merge the RSS testcases under a MIT license and share them with other people.
Any comments on Validome’s DTD support for Atom 1.0?
Posted by anonymous at
At the moment, we are coordinating this with Olivier, who wants to host this project on W3C; originally, there was a plan to create a testsuite for all markup stuff (HTML, XHTML, etc.)
Excellent!
As we launched our feed validator last week, we completed the RSS and Atom testsuite first.
As Olivier wrote me, he doesn’t see any conflict of interests if we merge the RSS testcases under a MIT license and share them with other people.
I don’t care who hosts it; let’s simply create a project where we each can contribute. If the W3C can create infrastructure for configuration management, bug tracking and mailing lists, I’m OK with it being hosted there.
In addition to contributing more tests for RSS and Atom, I can contribute tests for quite a number of common namespaces and for OpenSearch.
If desired, we could have separate access control lists for each subject area. I’m not sure that’s necessary, however.
Any comments on Validome’s DTD support for Atom 1.0?
It probably won’t be until early next week that I can look further into this.
Posted by Sam Ruby at
@Sam
We will make a XML file for you which will be automatically updated when we add test cases. When completed, I’ll send you the URL.
Here it is: everytime you call it, you will get the current version of our testsuite.
Posted by Alex at
@Sam
DTD support now completed. As promised, people can now select between 0.91-Netscape and 0.91-UserLand validation mode.
End of the week, when all our team members are back from vacation, we will make a final decision on the testsuite cooperation, so we can begin with the work on it soon.
@Jacques
As the RSS issues people commented here are now fixed / improved, we will take a look to the MathML issues.
@All
Thank you for your professional comments, suggestions and help!
Posted by Alex at
Sam, could you pass me a mail address, so we can discuss the planned testsuite?
Back from vacation, all our people agree to create a platform for merging and discussing RSS and Atom test cases. The programming and technical maintenance should begin as soon as possible.
Best regards, Alex.
Posted by Alex at
I could have sworn I read “At least three of these statements must be false.”
Posted by Robert Sayre at