Death of a DTD
By Michael Floyd
As cool as XML is, the W3C hasn't exactly been perfect when it comes to defining supporting standards. Case in point? The document type definition, or DTD. As you likely know, a DTD is a collection of rules that define the type of content your elements may contain, the number of times a subelement may occur, and whether each subelement is required or optional. DTDs also let you specify default values for attributes. And without a DTD you can't declare entities of your own, so entity replacement would be limited to the handful of predefined entities, such as &amp; and &lt;.
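To make those jobs concrete, here is a minimal sketch in Python using the standard library's ElementTree module. The memo vocabulary, the priority attribute, and the sig entity are made up for illustration. Note that the underlying parser is nonvalidating: it reads the internal DTD to fill in the attribute default and expand the entity, but it does not enforce the content rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with an internal DTD. The element names,
# the "priority" attribute, and the "sig" entity are illustrative only.
DOC = """<!DOCTYPE memo [
  <!ELEMENT memo (to+, body?)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!ATTLIST memo priority CDATA "normal">
  <!ENTITY sig "the editor">
]>
<memo><to>Readers</to><body>Regards, &sig;</body></memo>"""

memo = ET.fromstring(DOC)

# The start tag carries no priority attribute, so the value
# comes from the default declared in the ATTLIST.
priority = memo.get("priority")

# &sig; is replaced with the text declared in the ENTITY line.
body = memo.findtext("body")

print(priority)
print(body)
```

The ELEMENT declarations say that a memo holds one or more to elements followed by an optional body. Checking those rules takes a validating parser, which this sketch does not use.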
When I look at the DTD, however, I think, "Gee, the W3C really rushed that one out the door!" You see, one of the requirements in defining XML was that it be SGML compliant. So when it came to specifying the XML DTD, the XML Working Group apparently decided that it had to use the same syntax as the SGML DTD. The problem with this syntax is that it's different from XML and relatively arcane. Even worse, the syntax varies (slightly) depending on whether the DTD is internal, contained within the XML document, or external, maintained in a separate file. That means learning yet another syntax, and remembering nuances that depend on the physical location of the definitions. In creating its new markup language, the XML Working Group initially missed the opportunity to redesign the DTD.
"Ah!" you say, "what about the requirement that XML remain SGML compliant?" Well, that's the beauty of XML (and SGML, for that matter). The first word in XML is "extensible." And there's no reason you can't rewrite the DTD directly in XML.
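As a rough sketch of what that might look like, the Python below expresses a few occurrence rules as ordinary XML and checks a document against them. The rules, element, and child vocabulary and the check function are hypothetical, invented here for illustration. They are not any W3C syntax, and the checker covers only how many times a subelement occurs.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule vocabulary written in XML itself: each <element>
# lists its allowed children and how many times each may occur.
RULES = """<rules>
  <element name="memo">
    <child name="to" min="1" max="unbounded"/>
    <child name="body" min="0" max="1"/>
  </element>
</rules>"""


def check(doc_xml, rules_xml):
    """Return a list of occurrence-count violations (empty if none)."""
    doc = ET.fromstring(doc_xml)
    rules = ET.fromstring(rules_xml)
    problems = []
    for rule in rules.findall("element"):
        name = rule.get("name")
        # iter() also yields the root element when its tag matches.
        for elem in doc.iter(name):
            for child in rule.findall("child"):
                child_name = child.get("name")
                count = len(elem.findall(child_name))
                low = int(child.get("min"))
                high = child.get("max")
                too_many = high != "unbounded" and count > int(high)
                if count < low or too_many:
                    problems.append(
                        f"<{name}>: {count} <{child_name}> element(s)"
                    )
    return problems


good = check("<memo><to>Readers</to></memo>", RULES)
bad = check("<memo><body>Hi</body><body>Again</body></memo>", RULES)
print(good)
print(bad)
```

Because the rules are plain XML, the same parser that reads the document also reads the rules, with no second syntax to learn.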