An XML document that obeys the rules of the previous section is described as well-formed.
It is also possible to specify additional rules for the structure and content of an XML document, via a schema for the document. If the document is well-formed and also obeys the rules given in a schema, then the document is described as valid.
The Document Type Definition language (DTD) is a language for describing the schema for an XML document. DTD code consists of element declarations and attribute declarations.
An element declaration should be included for every different type of element that will occur in an XML document. Each declaration describes what content is allowed inside a particular element. An element declaration is of the form <!ELEMENT elementName elementContents>, where elementName is the name of the element being declared.
The elementContents specifies whether an element can contain plain text, or other elements (and if so, which elements, in what order), or whether the element must be empty. Some possible values are shown below.
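As an illustration of the kinds of values involved, the declarations below sketch the common content models in standard DTD syntax. All of the element names are hypothetical placeholders chosen for this sketch; they do not come from the text.

```xml
<!-- Leaf elements referred to by the content models below -->
<!ELEMENT first  (#PCDATA)>
<!ELEMENT second (#PCDATA)>
<!ELEMENT item   (#PCDATA)>

<!ELEMENT note     (#PCDATA)>          <!-- plain text only -->
<!ELEMENT marker   EMPTY>              <!-- must be empty -->
<!ELEMENT anything ANY>                <!-- any mix of text and declared elements -->
<!ELEMENT pair     (first, second)>    <!-- first then second, once each, in that order -->
<!ELEMENT either   (first | second)>   <!-- exactly one of the two -->
<!ELEMENT items    (item*)>            <!-- zero or more item elements -->
<!ELEMENT nonempty (item+)>            <!-- one or more item elements -->
<!ELEMENT maybe    (item?)>            <!-- at most one item element -->
<!ELEMENT mixed    (#PCDATA | item)*>  <!-- text interleaved with item elements -->
```

The comma, vertical bar, and the *, + and ? suffixes can be combined and nested with parentheses to build more elaborate content models.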
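An attribute declaration should be included for every different type of element that can have attributes. The declaration describes which attributes an element may have, what sorts of values each attribute may take, and whether each attribute is optional or compulsory. An attribute declaration is of the form <!ATTLIST elementName attrName attrType attrDefault>, where elementName is the element that carries the attribute and attrName is the name of the attribute. The attrName attrType attrDefault group can be repeated to declare several attributes for the same element.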
The attrType controls what value the attribute can have. It can have one of the following forms:
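The commonly used forms in standard DTD syntax are sketched below. The station element and its attribute names are hypothetical and serve only as illustration; each attribute is given its own declaration so that a comment can sit next to it.

```xml
<!-- Hypothetical element used only to hang the attributes on -->
<!ELEMENT station EMPTY>

<!-- CDATA: arbitrary text -->
<!ATTLIST station name CDATA #IMPLIED>

<!-- ID: a unique identifier for this element -->
<!ATTLIST station code ID #IMPLIED>

<!-- IDREF: must match the ID value of some element in the document -->
<!ATTLIST station neighbour IDREF #IMPLIED>

<!-- Enumeration: the value must be one of the listed options -->
<!ATTLIST station status (active|closed) #IMPLIED>
```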
The value of an ID attribute must not start with a digit.
The attrDefault either provides a default value for the attribute or states whether the attribute is optional or required (i.e., must be specified). It can have one of the following forms:
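A minimal sketch of those three possibilities follows, again using a hypothetical station element and made-up attribute names rather than anything taken from the text.

```xml
<!ELEMENT station EMPTY>

<!-- A quoted value: used as the default when the attribute is omitted -->
<!ATTLIST station units CDATA "celsius">

<!-- #IMPLIED: the attribute is optional and has no default -->
<!ATTLIST station name CDATA #IMPLIED>

<!-- #REQUIRED: the attribute must be specified on every station element -->
<!ATTLIST station code CDATA #REQUIRED>
```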
A DTD can either be embedded within an XML document or be stored in a separate file and referred to from the XML document.
The DTD information is included within a DOCTYPE declaration following the XML declaration. An inline DTD has the form:
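In standard XML syntax the declarations sit between square brackets inside the DOCTYPE declaration. In the sketch below, rootElementName and the single element declaration are placeholders chosen to keep the example complete.

```xml
<?xml version="1.0"?>
<!DOCTYPE rootElementName [
    <!-- element and attribute declarations go here -->
    <!ELEMENT rootElementName (#PCDATA)>
]>
<rootElementName>some text</rootElementName>
```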
An external DTD stored in a file called file.dtd would be referred to as follows:
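With the SYSTEM keyword, the DOCTYPE declaration names the file instead of containing the declarations. As before, rootElementName is a placeholder, and the sketch assumes file.dtd sits in the same directory as the XML document.

```xml
<?xml version="1.0"?>
<!DOCTYPE rootElementName SYSTEM "file.dtd">
<rootElementName>some text</rootElementName>
```

Under this arrangement file.dtd would hold only the declarations themselves, for example <!ELEMENT rootElementName (#PCDATA)>, with no surrounding DOCTYPE wrapper.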
The name following the keyword DOCTYPE must match the name of the root element in the XML document.
Figure 6.1 shows a very small, well-formed and valid XML document with an embedded DTD.
<?xml version="1.0"?>
<!DOCTYPE temperatures [
  <!ELEMENT temperatures (filename, case)>
  <!ELEMENT filename (#PCDATA)>
  <!ELEMENT case EMPTY>
  <!ATTLIST case
    date CDATA #REQUIRED
    temperature CDATA #IMPLIED>
]>
<temperatures>
  <filename>ISCCPMonthly_avg.nc</filename>
  <case date="16-JAN-1994"
        temperature="278.9"/>
</temperatures>
Line 1 is the required XML declaration.
Lines 2 to 9 provide a DTD for the document. This DTD specifies that the root element for the document must be a temperatures element (line 2). The temperatures element must contain one filename element and one case element (line 3). The filename element must contain only plain text (line 4) and the case element must be empty (line 5).
The case element must have a date attribute (line 7) and may also have a temperature attribute (line 8). The values of both attributes can be arbitrary text (CDATA).
The elements within the XML document that mark up the actual data values are on lines 10 to 14.
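To make the distinction between well-formed and valid concrete, the variation below (an illustration, not part of the original figure) keeps the same DTD but omits the date attribute. The document is still well-formed, but it is no longer valid because the DTD declares date as #REQUIRED.

```xml
<?xml version="1.0"?>
<!DOCTYPE temperatures [
  <!ELEMENT temperatures (filename, case)>
  <!ELEMENT filename (#PCDATA)>
  <!ELEMENT case EMPTY>
  <!ATTLIST case
    date CDATA #REQUIRED
    temperature CDATA #IMPLIED>
]>
<temperatures>
  <filename>ISCCPMonthly_avg.nc</filename>
  <!-- Not valid: the required date attribute is missing -->
  <case temperature="278.9"/>
</temperatures>
```

A validating tool such as xmllint, run with its --valid option, would be expected to accept the original document and report an error for this one; a parser that only checks well-formedness would accept both.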
Paul Murrell
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 New Zealand License.