Incite A Riot
This article first appeared in 24 Ways, the online advent calendar for geeks.
Given its relatively limited scope, HTML can be remarkably expressive. With a bit of lateral thinking, we can mark up content such as tag clouds and progress meters, even when we don’t have explicit HTML elements for those patterns.
Suppose we want to mark up a short conversation:
Alice:I think Eve is watching.
Bob:This isn’t a cryptography tutorial …we’re in the wrong example!
A note in the the HTML 4.01 spec says it’s okay to use a definition list:
Another application of
DL, for example, is for marking up dialogues, with each
DTnaming a speaker, and each
DDcontaining his or her words.
That would give us:
<dl> <dt>Alice</dt>: <dd>I think Eve is watching.</dd> <dt>Bob</dt>: <dd>This isn't a cryptography tutorial ...we're in the wrong example!</dd> </dl>
This usage of a definition list is proof that writing W3C specifications and smoking crack are not mutually exclusive activities. “I think Eve is watching” is not a definition of “Alice.” If you (ab)use a definition list in this way, Norm will hunt you down.
The conversation problem was revisited in HTML5. What if
dd didn’t always mean “definition title” and “definition description”? A new element was forged:
dialog. Now the the “d” in
dd doesn’t stand for “definition”, it stands for “dialog” (or “dialogue” if you can spell):
<dialog> <dt>Alice</dt>: <dd>I think Eve is watching.</dd> <dt>Bob</dt>: <dd>This isn't a cryptography tutorial ...we're in the wrong example!</dd> </dialog>
Problem solved …except that dialog is no longer in the HTML5 spec. Hixie further expanded the meaning of
dd so that they could be used inside
details (which makes sense—it starts with a “d”) and
figure (…um). At the same time as the content model of
figure were being updated, the completely-unrelated
dialog element was dropped.
Back to the drawing board, or in this case, the HTML 4.01 specification. The spec defines the
cite element thusly:
Contains a citation or a reference to other sources.
Perfect! There’s even an example showing how this can applied when attributing quotes to people:
As <CITE>Harry S. Truman</CITE> said, <Q lang="en-us">The buck stops here.</Q>
For longer quotes, the
blockquote element might be more appropriate. In a conversation, where the order matters, I think an ordered list would make a good containing element for this pattern:
<ol> <li><cite>Alice</cite>: <q>I think Eve is watching.</q></li> <li><cite>Bob</cite>: <q>This isn't a cryptography tutorial ...we're in the wrong example!</q></li> </ol>
Problem solved …except that the cite element has been redefined in the HTML5 spec:
citeelement represents the title of a work … A person’s name is not the title of a work … and the element must therefore not be used to mark up people’s names.
HTML5 is supposed to be backwards compatible with previous versions of HTML, yet here we have a semantic pattern already defined in HTML 4.01 that is now non-conforming in HTML5. The entire justification for the change boils down to this line of reasoning:
- Given that: titles of works are often italicised and
- given that: people’s names are not often italicised and
- given that: most browsers italicise the contents of the
- therefore: the
citeelement should not be used to mark up people’s names.
In other words, the default browser styling is now dictating semantic meaning. The tail is wagging the dog.
Not to worry, the HTML5 spec tells us how we can mark up names in conversations without using the cite element:
In some cases, the
belement might be appropriate for names
I believe the colloquial response to this is a combination of the letters W, T and F, followed by a question mark.
The non-normative note continues:
In other cases, if an element is really needed, the
spanelement can be used.
This is not a joke. We are seriously being told to use semantically meaningless elements to mark up content that is semantically meaningful.
We don’t have to take it.
Firstly, any conformance checker—that’s the new politically correct term for “validator“—cannot possibly check every instance of the
cite element to see if it’s really the title of a work and not the name of a person. So we can disobey the specification without fear of invalidating our documents.
Secondly, Hixie has repeatedly stated that browser makers have a powerful voice in deciding what goes into the HTML5 spec; if a browser maker refuses to implement a feature, then that feature should come out of the spec because otherwise, the spec is fiction. Well, one of the design principles of HTML5 is the Priority of Constituencies:
In case of conflict, consider users over authors over implementors over specifiers over theoretical purity.
That places us—authors—above browser makers. If we resolutely refuse to implement part of the HTML5 spec, then the spec becomes fiction.
Join me in a campaign of civil disobedience against the unnecessarily restrictive, backwards-incompatible change to the
cite element. Start using HTML5 but start using it sensibly. Let’s ensure that bad advice remains fictitious.
Tantek has set up a page on the WHATWG wiki to document usage of the
cite element for conversations. Please contribute to it.