It’s just data

Trackbacks, Queries, and Encoding

This morning, I got a trackback from a Korean weblog.  Unfortunately, if you look at how a typical trackback is sent, you see that character encoding information is not provided.

Something to consider the next time you are tempted to think that you can get queries for free.  While both HTTP and XML provide mechanisms for defining encoding, support in widely deployed implementations is much better in XML than in straight HTTP.

URIs seem to be converging on UTF-8, albeit at an excruciatingly slow pace.  Don't leave this to chance: if you are defining a GenerativeNaming scheme today, make this explicit.
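
To see why this needs to be explicit, here is a minimal Python sketch. The segment `한국어` is only an illustrative name, not part of any real naming scheme. The same characters produce different percent-encoded bytes depending on the encoding, and a receiver that insists on UTF-8 will reject the other form.

```python
from urllib.parse import quote, unquote

name = "한국어"  # hypothetical path segment, for illustration only

# The same text, percent-encoded under two different character encodings.
as_utf8 = quote(name, safe="", encoding="utf-8")
as_euckr = quote(name, safe="", encoding="euc-kr")


def decode_segment(segment):
    """Decode a percent-encoded segment, insisting that the bytes are UTF-8."""
    return unquote(segment, encoding="utf-8", errors="strict")


# A sender and receiver that agree on UTF-8 round-trip cleanly.
round_tripped = decode_segment(as_utf8)

# A sender that silently used EUC-KR is caught instead of turning into mojibake.
try:
    decode_segment(as_euckr)
    euckr_accepted = True
except UnicodeDecodeError:
    euckr_accepted = False
```

Strict decoding is a design choice here: failing loudly seems preferable to guessing, but a scheme could just as well define a fallback.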

If you are defining a protocol based on HTTP POST, encourage the use of the charset parameter on the Content-Type header.  Require it if you can.
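
As a rough sketch of what that might look like, assuming a form-encoded POST body like the one trackback uses: the sender states the charset on the Content-Type header, and the receiver refuses to guess when it is missing. The function names and the `example.org` URL are made up for illustration, and nothing is actually sent over the network.

```python
from email.message import Message
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def build_ping(url, fields, charset="utf-8"):
    """Build (but do not send) a form-encoded POST that declares its charset."""
    body = urlencode(fields, encoding=charset).encode("ascii")
    content_type = f"application/x-www-form-urlencoded; charset={charset}"
    return Request(url, data=body, headers={"Content-Type": content_type}, method="POST")


def declared_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_param("charset")


def decode_ping(content_type, body):
    """Decode a form-encoded body using only the charset the sender declared."""
    charset = declared_charset(content_type)
    if charset is None:
        # "Require it if you can": reject rather than guess.
        raise ValueError("Content-Type carries no charset parameter")
    return parse_qs(body.decode("ascii"), encoding=charset, errors="strict")


# Hypothetical ping from a weblog that publishes in EUC-KR.
ping = build_ping(
    "http://example.org/trackback/1",
    {"title": "한국어", "url": "http://example.com/post"},
    charset="euc-kr",
)
# urllib stores header names capitalized as "Content-type".
fields = decode_ping(ping.get_header("Content-type"), ping.data)
```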


Anne van Kesteren : Trackbacks, Queries, and Encoding - I wonder if pingback solves this...

Excerpt from HotLinks - Level 1 at

[LIEN] Trackbacks, Queries, and Encoding

TB sucks...

Excerpt from Znarf Infos - le carnet web at

I wonder how that here... My link log doesn't even send pings out. I just read the Pingback specification [link] and it seems that isn't the optimal solution either.

From what I have heard, trackback is really bad. It can invalidate weblogs [link], and it can't handle encoding either. Someone should come up with a new format that addresses those aspects (encoding, validation, excerpts) and, of course, it would be nice if it were in some way compatible with trackback so that people can easily implement support for it.

Posted by Anne at

I have sent numerous requests to web site owners in Korea to define the charset (including when publishing pages in English, since some Latin chars in Korean fonts are not correctly displayed in iso- and utf), to no avail. They seem to think that because their target market is Korea and Koreans, their browsers will be set to display Korean by default, so no problem... :-(

Posted by dda at

dda, I've actually had some success in the past persuading web site owners in Japan to declare a charset, which is gratifying.  Meanwhile, if it makes you feel any better, at least one (presumably) Korean site owner is pursuing the issue of Trackbacks and character encoding that Sam raised here, though Six Apart hasn't replied yet.

Posted by jacob at

Trackback in, valid out (mostly)

Jacques Distler:  It turns out that by design it is rather hard for a string of bytes to be  valid utf-8, unless that string is pure US-ASCII, in which case it doesn't much matter which encoding you presume.... [more]

Trackback from Sam Ruby

at
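
The observation quoted above suggests a simple heuristic for input that arrives without a declared charset: try strict UTF-8 first and fall back only if that fails. A minimal sketch follows; the function name and the ISO-8859-1 fallback are assumptions for illustration, not what any particular weblog tool does.

```python
def decode_undeclared(raw, fallback="iso-8859-1"):
    """Decode bytes of unknown encoding: strict UTF-8 first, then a fallback.

    Returns the decoded text and the name of the encoding that was used.
    The fallback is an assumption; ISO-8859-1 never fails to decode, but the
    result may be wrong for text that was really in some other encoding.
    """
    try:
        return raw.decode("utf-8", errors="strict"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode(fallback), fallback


ascii_text, ascii_enc = decode_undeclared(b"Trackbacks, Queries, and Encoding")
utf8_text, utf8_enc = decode_undeclared("한국어".encode("utf-8"))
other_text, other_enc = decode_undeclared("한국어".encode("euc-kr"))
```

This only tells you when the bytes are not UTF-8. It cannot say what they actually are, which is why a declared charset is still the better answer.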

TB sucks...

Excerpt from Public marks from user znarf with tag trackback at
