It’s just data

Markup vs Unicode

HTML 4.01: Although Unicode specifies special characters that deal with text direction, HTML offers higher-level markup constructs that do the same thing: the dir attribute (do not confuse with the DIR element) and the BDO element. Thus, to express a Hebrew quotation, it is more intuitive to write

<Q lang="he" dir="rtl">...a Hebrew quotation...</Q>

than the equivalent with Unicode references:

&#x202B;&#x05F4;...a Hebrew quotation...&#x05F4;&#x202C;
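
For readers who don't have the code points memorized, here is a minimal Python sketch that decodes the character references in the second form and looks each one up in the Unicode database. The names `SNIPPET` and `describe_controls` are made up for this illustration; `html.unescape` and the `unicodedata` functions are from the standard library.

```python
import html
import unicodedata

# The character-reference form quoted from the HTML 4 spec above.
SNIPPET = "&#x202B;&#x05F4;...a Hebrew quotation...&#x05F4;&#x202C;"


def describe_controls(markup):
    """Return (code point, Unicode name, bidi class) for each non-ASCII character."""
    text = html.unescape(markup)  # turn &#x...; references into real characters
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))
        for ch in text
        if ord(ch) > 0x7F  # skip the plain ASCII placeholder text
    ]


for row in describe_controls(SNIPPET):
    print(row)
```

The first and last characters come back as RIGHT-TO-LEFT EMBEDDING and POP DIRECTIONAL FORMATTING, which is roughly the open/close pair that dir="rtl" on the Q element expresses in markup; the two U+05F4 characters are the Hebrew quotation marks themselves.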

Do the Unicode control characters also affect the direction of page layout, as the HTML constructs do, or do they apply to characters only?

For example, if a table appears in a "right-to-left" zone, do the table cells get formatted right-to-left?

I personally find the HTML constructs more intuitive, since they work with the document hierarchy, while the control characters have no relationship to the HTML structure whatsoever.

Posted by Martin Atkins at

Shouldn't this formatting preferably be done using CSS?

Posted by cybarber at

Mart: use the source, yo.
http://web3.w3.org/TR/html4/struct/tables.html#h-11.2.1.1

cybarber: no. It's a fundamental attribute of the text, required for it to be readable.

Posted by Evan Martin at

Putting the directionality logic in the mark-up also means that you can write HTML 4 bidirectional text in an 8-bit character set, which is supported by more OSes than UTF-8; and that this text can be viewed on more platforms and with more browsers (Internet Explorer 3.x for Windows, for example).

Posted by Dotan Dimet at
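
A rough Python sketch of the character-set point above, assuming ISO-8859-8 as the 8-bit Hebrew encoding (the comment does not name one). The sample word and the helper `fits_in` are hypothetical, chosen only for illustration.

```python
# Assumption: ISO-8859-8 stands in for "an 8-bit character set".
HEBREW_WORD = "\u05e9\u05dc\u05d5\u05dd"  # four Hebrew letters, illustration only
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING control character


def fits_in(text, encoding):
    """Return True if every character of text can be encoded in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(len(HEBREW_WORD.encode("iso8859_8")))  # one byte per letter
print(len(HEBREW_WORD.encode("utf-8")))      # two bytes per letter
print(fits_in(HEBREW_WORD, "iso8859_8"))     # the letters fit in the 8-bit set
print(fits_in(RLE, "iso8859_8"))             # the control character does not
```

The embedding control has no slot in the 8-bit set, so in such a file it could only appear as a numeric character reference, whereas dir="rtl" is ordinary ASCII markup.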

Atom Schemata

In the process, Tim generated a lot of feedback. Meanwhile, a few days ago Sean Palmer initiated an ExtensibilityFramework that just so happens to address a number of Tim's issues. In the process, a RelaxNG grammar... [more]

Trackback from Sam Ruby at
