

An XML file itself is simple. However, the XML family is full of jargon and concepts. This chapter gives only a brief description of each one, based mainly on Benoit Marchal's book, XML by Example. For more detailed information, please refer to Resources.

 

DTD

DTD, short for Document Type Definition, is a mechanism to describe the structure of documents. The role of a DTD is to specify which elements are allowed where in the XML document. Software tools can read a DTD and learn about the document structure. Consequently, the tools can adapt themselves to better support that structure.

An XML document is not required to have a DTD. XML recognizes two classes of documents: well-formed and valid. A well-formed document follows the XML syntax rules; a valid document is well-formed and, in addition, declares a DTD and conforms to it. Look at the following examples:

 

Listing 2.1 An XML File

<?xml version="1.0"?>

<address>

<street>34 Fountain Square Plaza</street>

<region>OH</region>

<postal-code>45202</postal-code>

<locality>Cincinnati</locality>

<country>US</country>

</address>

 

Listing 2.2 The DTD: address.dtd

<!ELEMENT address (street, region?, postal-code, locality, country)>

<!ATTLIST address preferred (true| false) "false">

<!ELEMENT street (#PCDATA)>

<!ELEMENT region (#PCDATA)>

<!ELEMENT postal-code (#PCDATA)>

<!ELEMENT locality (#PCDATA)>

<!ELEMENT country (#PCDATA)>

 

Listing 2.3 A Valid Document

<?xml version="1.0"?>

<!DOCTYPE address SYSTEM "address.dtd">

<address>

<street>34 Fountain Square Plaza</street>

<region>OH</region>

<postal-code>45202</postal-code>

<locality>Cincinnati</locality>

<country>US</country>

</address>

 

Listing 2.4 An Invalid Document

<?xml version="1.0"?>

<!DOCTYPE address SYSTEM "address.dtd">

<address>

<street>34 Fountain Square Plaza</street>

<region>OH</region>

<postal-code>45202</postal-code>

<locality>Cincinnati</locality>

</address>

The element country, which is declared in address.dtd and is not optional, is missing. The document is therefore still well-formed, but it is not valid.
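The difference between well-formed and valid can be sketched in Python. The standard-library parser used here only checks well-formedness; it does not read address.dtd. The CONTENT_MODEL list and the check_address function are therefore hypothetical stand-ins, written by hand from the first line of Listing 2.2, and not a real DTD validator.

```python
import xml.etree.ElementTree as ET

# Listing 2.4: well-formed, but the required <country> element is missing.
INVALID_XML = """<?xml version="1.0"?>
<!DOCTYPE address SYSTEM "address.dtd">
<address>
<street>34 Fountain Square Plaza</street>
<region>OH</region>
<postal-code>45202</postal-code>
<locality>Cincinnati</locality>
</address>"""

# Hand-written copy of the content model in address.dtd (an assumption of
# this sketch): (street, region?, postal-code, locality, country)
# Each entry is (element name, is it required?).
CONTENT_MODEL = [
    ("street", True),
    ("region", False),
    ("postal-code", True),
    ("locality", True),
    ("country", True),
]


def check_address(xml_text):
    """Return a list of problems; an empty list means the children match."""
    # fromstring raises ET.ParseError if the text is not well-formed.
    root = ET.fromstring(xml_text)
    children = [child.tag for child in root]
    problems = []
    position = 0
    for name, required in CONTENT_MODEL:
        if position < len(children) and children[position] == name:
            position += 1
        elif required:
            problems.append(f"missing required element: {name}")
    if position < len(children):
        problems.append(f"unexpected element: {children[position]}")
    return problems


problems = check_address(INVALID_XML)
print(problems)
```

The parser accepts Listing 2.4 without complaint because it is well-formed; only the extra check notices the missing element. A validating XML processor does this kind of checking for you, directly from the DTD.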

The main benefits of using a DTD are:

  • The XML processor enforces the structure defined in the DTD.
  • The application can access the document structure, for example to populate a list of elements.
  • The DTD gives hints to the XML processor; for example, it helps the processor separate indentation from content.
  • The DTD can declare default or fixed values for attributes, which might result in a smaller document.



 

XSL

XSL, the Extensible Stylesheet Language, is organized into two parts: XSLT, short for XSL Transformations, and XSLFO, short for XSL Formatting Objects. CSS and XSLFO are very similar, so we will talk about CSS instead.

XSLT can be used to

  • add elements specifically for viewing, such as adding a logo or the sender's address to an XML invoice
  • create new content from existing content, such as creating a table of contents
  • present information at the right level of detail for the reader, such as using one style sheet to present high-level information to managers and another to present more detailed technical information to the rest of the staff
  • convert between different DTDs or different versions of a DTD, such as converting a company-specific DTD to an industry standard
  • transform XML documents into HTML for backward compatibility with existing browsers
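To give a feel for the last item, here is a minimal sketch of an XML-to-HTML transformation. It is not XSLT: Python's standard library has no XSLT processor, so the transformation is written by hand with ElementTree. The address document comes from Listing 2.1; the function name address_to_html and the HTML layout are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The document from Listing 2.1.
ADDRESS_XML = """<?xml version="1.0"?>
<address>
<street>34 Fountain Square Plaza</street>
<region>OH</region>
<postal-code>45202</postal-code>
<locality>Cincinnati</locality>
<country>US</country>
</address>"""


def address_to_html(xml_text):
    """Build a small HTML page listing each child of <address>."""
    source = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    # A heading that exists only for viewing; it is not in the XML source.
    ET.SubElement(body, "h1").text = "Address"
    items = ET.SubElement(body, "ul")
    for child in source:
        ET.SubElement(items, "li").text = f"{child.tag}: {child.text}"
    return ET.tostring(html, encoding="unicode", method="html")


html_text = address_to_html(ADDRESS_XML)
print(html_text)
```

An XSLT style sheet expresses the same idea declaratively: templates match elements in the source document and emit the output markup, so the source XML never has to change.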

You may want to go back to Your First Cup of XML, where three XSL files transform the student XML file into different presentations.



 

CSS

CSS, short for Cascading Style Sheets, can display XML without converting it to HTML. However, its main function is to tell the browser how to style the elements: fonts, colors, text indentation, and so on. XSLT is more ambitious in that it can transform an XML file and extract information from it.

Let's return to our student.xml example. Download student.css to your computer.

Open student.xml in Notepad. In the xml-stylesheet processing instruction, replace type="text/xsl" href="student1.xsl" with type="text/css" href="student.css". Then open student.xml in Internet Explorer to see the document styled by student.css.




 

DOM, by Jan Egil Refsnes


The DOM, short for Document Object Model, is a programming interface for HTML and XML documents. It defines the way a document can be accessed and manipulated.

Using a DOM, a programmer can create a document, navigate its structure, and add, modify, or delete its elements.

As a W3C specification, one important objective for the DOM has been to provide a standard programming interface that can be used in a wide variety of environments and applications.

The W3C DOM has been designed to be used with any programming language.
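As a minimal sketch of these operations, the following uses Python's xml.dom.minidom module, a lightweight DOM implementation in the standard library (not the Microsoft parser discussed in this chapter). The element names are borrowed from the address example; everything else is an assumption made for illustration.

```python
from xml.dom.minidom import getDOMImplementation

# Create a new document whose top-level element is <address>.
impl = getDOMImplementation()
doc = impl.createDocument(None, "address", None)
root = doc.documentElement

# Add two child elements, each holding a text node.
street = doc.createElement("street")
street.appendChild(doc.createTextNode("34 Fountain Square Plaza"))
root.appendChild(street)

country = doc.createElement("country")
country.appendChild(doc.createTextNode("US"))
root.appendChild(country)

# Navigate: the first child of the root is the <street> element.
first_name = root.firstChild.nodeName

# Modify the text inside <country>, then delete <street>.
country.firstChild.data = "USA"
root.removeChild(street)

# Convert the node tree back to XML text.
xml_text = root.toxml()
print(xml_text)
```

The method names (createElement, createTextNode, appendChild, removeChild) come from the W3C DOM, which is why code like this looks much the same in other languages that offer a DOM implementation.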

The Node Interface
As you will see in the next section, a program called an XML parser can be used to load an XML document into the memory of your computer. When the document is loaded, its information can be retrieved and manipulated by accessing the Document Object Model (DOM).

The DOM represents a tree view of the XML document. The documentElement is the top level of the tree. This element has one or more childNodes that represent the branches of the tree.

The Node interface is used to read and write (or access, if you like) the individual elements in the XML node tree. The childNodes property of the documentElement can be accessed with a for/each construct to enumerate each individual node.
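The enumeration described above can be sketched with Python's xml.dom.minidom, used here as a stand-in for the Microsoft XML parser. The parseString function is specific to minidom; documentElement, childNodes, nodeType, and nodeName are the DOM names used in the text.

```python
from xml.dom.minidom import parseString

# The document from Listing 2.1.
ADDRESS_XML = """<?xml version="1.0"?>
<address>
<street>34 Fountain Square Plaza</street>
<region>OH</region>
<postal-code>45202</postal-code>
<locality>Cincinnati</locality>
<country>US</country>
</address>"""

doc = parseString(ADDRESS_XML)
root = doc.documentElement  # the top level of the tree: <address>

elements = []
for node in root.childNodes:
    # The line breaks between tags are text nodes too; keep only elements.
    if node.nodeType == node.ELEMENT_NODE:
        elements.append((node.nodeName, node.firstChild.data))

for name, value in elements:
    print(f"{name}: {value}")
```

Note that childNodes contains more entries than there are elements, because the whitespace between the tags is part of the tree as well.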

The Microsoft XML parser used to demonstrate the DOM on this Web site supports all the necessary functions to traverse the node tree, access the nodes and their attribute values, insert and delete nodes, and convert the node tree back to XML.

All the demonstrated Microsoft XML parser functions are from the official W3C XML DOM recommendation, apart from the load and loadXML functions. (Believe it or not, the official DOM does not include standard functions for loading XML documents!)

A total of 13 node types are currently supported by the Microsoft XML parser. The following table lists the most commonly used node types:

Node type               Example
Document type           <!DOCTYPE food SYSTEM "food.dtd">
Processing instruction  <?xml version="1.0"?>
Element                 <drink type="beer">Carlsberg</drink>
Attribute               type="beer"
Text                    Carlsberg
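The node types in the table can be inspected from code. This sketch again uses Python's xml.dom.minidom rather than the Microsoft parser, so details differ: minidom does not expose the XML declaration as a node, so a hypothetical xml-stylesheet processing instruction (pointing at an assumed food.css) stands in for that row. The surrounding food element is also an assumption, added so that the root matches the DOCTYPE.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

FOOD_XML = """<?xml version="1.0"?>
<!DOCTYPE food SYSTEM "food.dtd">
<?xml-stylesheet type="text/css" href="food.css"?>
<food><drink type="beer">Carlsberg</drink></food>"""

doc = parseString(FOOD_XML)
drink = doc.getElementsByTagName("drink")[0]

# Find the processing instruction among the document's top-level nodes.
instruction = next(
    node for node in doc.childNodes
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
)

# Map the DOM's numeric node type constants to the names in the table.
TYPE_NAMES = {
    Node.DOCUMENT_TYPE_NODE: "Document type",
    Node.PROCESSING_INSTRUCTION_NODE: "Processing instruction",
    Node.ELEMENT_NODE: "Element",
    Node.ATTRIBUTE_NODE: "Attribute",
    Node.TEXT_NODE: "Text",
}

samples = [
    doc.doctype,
    instruction,
    drink,
    drink.getAttributeNode("type"),
    drink.firstChild,
]
found = [(TYPE_NAMES[node.nodeType], node.nodeName) for node in samples]

for type_name, node_name in found:
    print(f"{type_name}: {node_name}")
```

Every node, whatever its type, offers the same nodeType and nodeName properties, which is what makes generic tree-walking code possible.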
