XHTML5 in a nutshell

July 25th, 2010 by Sergey Mavrody in Syntax, What's Next, WHATWG

The WHATWG Wiki portal has a nice section describing the differences between HTML and XHTML, as well as the specifics of a polyglot HTML document, that is, an HTML5 document that can also be served as a valid XML document. I'd like to review what it takes to transform an HTML5 polyglot document into a valid XHTML5 document. It appears that 'XHTML5' has finally become an official name.

The W3C First Public Working Draft of the "Polyglot Markup" recommendation describes a polyglot HTML document as one that conforms to both the HTML and XHTML syntaxes by using a common subset of the two. In a nutshell, an HTML5 polyglot document has:

  • HTML5 doctype/namespace
  • XHTML well-formed syntax

A polyglot document can be served as either HTML or XHTML, depending on browser support and MIME type. Polyglot HTML5 code essentially becomes an XHTML5 document when it is served with the XML MIME type application/xhtml+xml. In a nutshell, an XHTML5 document has:

  • HTML doctype/namespace: the <!DOCTYPE html> declaration is optional, but it is useful in a polyglot document because it prevents browser quirks mode.
  • XHTML well-formed syntax
  • XML MIME type: application/xhtml+xml. This MIME declaration is not visible in the source code; it appears in the HTTP Content-Type header, which is configured on the server. Note that the XML MIME type is not yet supported by the current version of Internet Explorer, though IE can render XHTML documents served as text/html.
  • Default XHTML namespace: <html xmlns="http://www.w3.org/1999/xhtml">
  • Secondary namespaces such as SVG, MathML, XLink, etc. To me this is a kind of test: if you don't need these namespaces in your document, then using XHTML is overkill in the first place.
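
Since the MIME type lives in the HTTP Content-Type header rather than in the markup, a server has to decide which one to send. The following is a minimal Python sketch of that decision, assuming the server inspects the request's Accept header. The function name choose_content_type is hypothetical, and the parsing is deliberately simplified (it ignores q-values and wildcards).

```python
XHTML_MIME = "application/xhtml+xml"
HTML_MIME = "text/html"


def choose_content_type(accept_header: str) -> str:
    """Pick a Content-Type value for a polyglot document (simplified sketch)."""
    # Split the Accept header on commas and drop parameters such as ";q=0.9".
    accepted = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    # Serve as XHTML only if the client explicitly lists the XML MIME type;
    # otherwise fall back to text/html, which every browser understands.
    mime = XHTML_MIME if XHTML_MIME in accepted else HTML_MIME
    # Declaring the charset here follows the advice to set the encoding
    # in the HTTP header.
    return f"{mime}; charset=UTF-8"


if __name__ == "__main__":
    print(choose_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
    print(choose_content_type("text/html, */*"))
```

Because the document is polyglot, the same bytes can be sent in both cases; only the header changes.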

Finally, the basic XHTML5 document would look like this:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
	<head>
		<title></title>
		<meta charset="UTF-8" />
	</head>
	<body>
		<svg xmlns="http://www.w3.org/2000/svg">
			<rect stroke="black" fill="blue" x="45px" y="45px" width="200px" height="100px" stroke-width="2" />
		</svg>
	</body>
</html>
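
One quick way to see what "well-formed" means in practice is to feed the document above to an XML parser. The sketch below uses Python's standard xml.etree.ElementTree module as a stand-in for a browser's XML parser, which is an assumption made for illustration. It checks well-formedness and namespaces only, not XHTML5 validity, and the helper is_well_formed is a hypothetical name.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

DOCUMENT = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title></title>
    <meta charset="UTF-8" />
  </head>
  <body>
    <svg xmlns="http://www.w3.org/2000/svg">
      <rect stroke="black" fill="blue" x="45px" y="45px" width="200px" height="100px" stroke-width="2" />
    </svg>
  </body>
</html>"""


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup without errors."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(DOCUMENT)
# ElementTree reports tags as "{namespace}localname".
print(root.tag)
rect = root.find(f".//{{{SVG_NS}}}rect")
print(rect.tag)

# The HTML-style void element <meta charset="UTF-8"> (no trailing slash)
# is fine in HTML syntax but is not well-formed XML.
html_style = DOCUMENT.replace('<meta charset="UTF-8" />', '<meta charset="UTF-8">')
print(is_well_formed(DOCUMENT), is_well_formed(html_style))
```

The root element lands in the XHTML namespace and the rect element in the SVG namespace, which is the extensibility the secondary namespaces provide.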

The XML declaration <?xml version="1.0" encoding="UTF-8"?> is not required if the default UTF-8 encoding is used; an XHTML5 validator will not mind if it is omitted. However, it is strongly recommended to configure the encoding in the server's HTTP Content-Type header. Otherwise, the character encoding can be declared in the document itself with a meta tag, <meta charset="UTF-8" />. This encoding declaration is needed in a polyglot document so that it is treated as UTF-8 whether it is served as HTML or XHTML.
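
The UTF-8 default can be observed with any XML parser. The sketch below again uses Python's xml.etree.ElementTree as an illustrative stand-in for an XML processor; the helper parse_text is a hypothetical name. It parses the same small fragment as raw bytes with and without an XML declaration.

```python
import xml.etree.ElementTree as ET

FRAGMENT = "<p>caf\u00e9</p>"  # contains one non-ASCII character


def parse_text(data: bytes):
    """Return the text of the root element, or None if parsing fails."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


# UTF-8 bytes, no XML declaration: the parser assumes UTF-8 by default.
utf8_no_decl = FRAGMENT.encode("utf-8")

# ISO-8859-1 bytes, no declaration: the bytes are not valid UTF-8,
# so the parser rejects the document.
latin1_no_decl = FRAGMENT.encode("iso-8859-1")

# ISO-8859-1 bytes with a declaration naming the encoding: parsed correctly.
latin1_with_decl = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + latin1_no_decl

for label, data in [
    ("utf-8, no declaration", utf8_no_decl),
    ("iso-8859-1, no declaration", latin1_no_decl),
    ("iso-8859-1, with declaration", latin1_with_decl),
]:
    print(label, "->", parse_text(data))
```

Only a non-UTF-8 document needs the declaration, which is why a UTF-8 polyglot document can leave it out.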

The Total Validator tool (Firefox plugin and desktop app) now has a user-selectable option for XHTML5-specific validation.

I would say that the main advantage of XHTML5 is the ability to extend HTML5 with XML-based technologies such as SVG and MathML. The disadvantages are the lack of Internet Explorer support, more verbose code, and stricter XML error handling. Unless we need that extensibility, HTML5 is the way to go.