It’s just data

Well Formed Comments

My comment system is based on a number of regular expressions which seem to work tolerably well in most instances when coupled with a preview function. Unfortunately, the results are not quite as good when used in an API context. So, today, I finally did something about it.

The way it works is as follows:


postcomment.py is a small Python script which posts a comment.

Message from Sam Ruby

at

What's wrong with simply putting together a DTD of XHTML modules that are acceptable, and running comments through a validator? It seems like ad-hoc checks here and there are just reinventing the wheel.

Posted by Jim Dabell at

Jim, I have no plans of requiring people to enter well formed XHTML in the online HTML forms based interface if that is what you are suggesting.

As to the API: I'm not aware of a validating parser that comes with Python.  In any case, the code that we are talking about here is the validate function in entryparser.py.  Quite small, actually.

Posted by Sam Ruby at

No, I'm not saying that you should require well-formed XHTML.  I'm merely suggesting doing it in place of "a simple scan... for objectionable tags".  In other words, only requiring well-formed XHTML when it will be treated as XHTML.

I don't know of a purely Pythonic validator either; I was thinking of simply using an external validator like xmllint.

Posted by Jim Dabell at

I was originally considering something along the lines of what Jim is talking about. I'm working on a restricted subset of XHTML Basic for use in my commenting system. A brief newsgroup discussion of it can be found in Google's archive.

Maybe using some kind of SAX implementation to process the XML input would be a more lightweight way of gathering the input, with a validation pass at a later point to ensure you caught everything.
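A minimal sketch of that SAX idea, using only Python's standard library. This is not the code from entryparser.py; the names `ALLOWED_TAGS`, `TagCollector`, and `check_comment` are hypothetical, and the allowlist itself is an assumed policy. The parser reports well-formedness errors, and the handler collects any element that is not on the list.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Hypothetical allowlist: which tags a comment may contain is a policy
# choice, not something taken from the original post.
ALLOWED_TAGS = {
    "div", "p", "a", "em", "strong", "blockquote",
    "code", "pre", "ul", "ol", "li", "br",
}


class TagCollector(ContentHandler):
    """Records every element name that is not in ALLOWED_TAGS."""

    def __init__(self):
        super().__init__()
        self.rejected = []

    def startElement(self, name, attrs):
        if name not in ALLOWED_TAGS:
            self.rejected.append(name)


def check_comment(fragment):
    """Return (ok, problems) for an XHTML fragment given as a string."""
    handler = TagCollector()
    # A comment is a fragment, so wrap it in a single root element
    # before handing it to the XML parser.
    document = "<div>%s</div>" % fragment
    try:
        xml.sax.parseString(document.encode("utf-8"), handler)
    except xml.sax.SAXParseException as exc:
        # The parser stops at the first well-formedness error.
        return False, ["not well formed: %s" % exc.getMessage()]
    if handler.rejected:
        return False, ["tag not allowed: %s" % tag for tag in handler.rejected]
    return True, []
```

Calling `check_comment("<p>unclosed")` would report a well-formedness problem, while `check_comment("<script>alert(1)</script>")` would report a disallowed tag. The sketch does not inspect attributes, so something like an `onclick` handler would still need a separate check.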

Posted by BenM at

A Python article has just popped up on xml.com that mentions a few Python validators:

http://www.xml.com/pub/a/2003/09/10/py.html

Posted by Jim Dabell at
