Your First Cup of XML

Based on Let's Talk About XML by Purple/pconline.com

 


 

I. What is wrong with HTML?

XML stands for eXtensible Markup Language. Before we go any further, we assume that the reader is familiar with HTML and can read, understand and write basic HTML pages. As we go deeper into XML, the reader will find it easier with some knowledge of a scripting language, such as JavaScript or VBScript. For the time being, however, HTML is enough.

In order to understand XML, let us start from HTML. After all, XML exists because HTML is successful.

The following example is about a student, who has the following information: studentID=001, name=John Smith, sex=male, age=20.

We can display it in three ways through HTML:

Display 1:

The source code:

<body bgcolor="#FFFFFF" text="#000000">
001, John Smith, male, 20

</body>

Display 2:

The source code:

<body bgcolor="#FFFFFF" text="#000000">


<table width="300" border="1" cellspacing="0" cellpadding="0" bgcolor="#FFFFFF">
<tr bordercolor="#000000">
<td><b>studentID</b></td>
<td>001</td>
</tr>
<tr bordercolor="#000000">
<td><b>name</b></td>
<td>John Smith</td>
</tr>
<tr bordercolor="#000000">
<td><b>sex</b></td>
<td>male</td>
</tr>
<tr bordercolor="#000000">
<td><b>age</b></td>
<td>20 </td>
</tr>
</table>

</body>

Display 3:

<body bgcolor="#FFFFFF" text="#000000">
<p><font face="Times New Roman, Times, serif"><span class="text">Student Infor.</span></font></p>
<p class="text"><font face="Times New Roman, Times, serif">studentID:
<input type="text" name="textfield" value="001">
</font></p>
<span class="text"><font face="Times New Roman, Times, serif"> sex:
<input type="text" name="textfield2" value="male">
</font> </span>
<p class="text"><font face="Times New Roman, Times, serif">name:
<input type="text" name="textfield3" value="John Smith">
</font></p>
<p class="text"><font face="Times New Roman, Times, serif">age:
<input type="text" name="textfield4" value="20">
</font></p>
<p> <span class="text"><font face="Times New Roman, Times, serif">
<input type="submit" name="Submit" value="Submit">
<input type="reset" name="Submit2" value="Reset">
</font></span><font face="Times New Roman, Times, serif"></font></p>

</body>

What do we learn from the above example? The HTML gets fatter and fatter with each display, and it becomes more and more difficult to find the data in the source code.

HTML has some inherent shortcomings, such as the mixing of presentation and data and the ever-growing family of HTML tags. Information that is stored in databases can change dramatically after it is interpreted and transformed into HTML by CGI, ASP, etc. The popularity of e-commerce provides HTML with more opportunities, but HTML's shortcomings inhibit its further development.



 

II. Your First Cup of XML

The biggest advantage of XML is that it separates data from presentation. Doesn't sound concrete enough? Please walk through the following XML example:

Please download the following files to your computer and follow the steps below:

student.xml

student1.xsl

student2.xsl

student3.xsl

Step 1: Double-click student.xml. You will see a page which is the same as Display 1.

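Step 2: Open student.xml with Notepad. You will see the XML source, which includes a stylesheet reference with href="student1.xsl".

Now change href="student1.xsl" to href="student2.xsl", save student.xml and refresh the browser window. You will see a page which is the same as Display 2.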

Step 3: Still in Notepad, change href="student2.xsl" to href="student3.xsl", save student.xml and refresh the browser window. You will see a page which is the same as Display 3.

Step 4: Now let's see what an XML file looks like. In Notepad, delete <?xml-stylesheet type="text/xsl" href="student3.xsl"?>, save student.xml, then refresh the browser window. You will see the XML source displayed as a tree.

Try clicking on the "-" symbol, and see it change into "+". The tree folds and expands correspondingly.

What does XML look like? If you have ever used Windows Explorer, the "-" and "+" symbols should look familiar to you. The fact is that XML elements are organized in a tree structure.
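To make the tree structure concrete, here is a minimal sketch that walks the student document and prints one indented line per element, roughly like the expanded tree in the browser. Python and its standard xml.etree.ElementTree module are assumptions of this sketch (the tutorial itself only uses a browser and Notepad), and the helper name outline is hypothetical.

```python
import xml.etree.ElementTree as ET

# The student document from this tutorial, inlined so the sketch is self-contained.
STUDENT_XML = """<?xml version="1.0"?>
<student>
<studentID>001</studentID>
<name>John Smith</name>
<sex>male</sex>
<age>20</age>
</student>"""


def outline(element, depth=0):
    """Return one indented line per element (hypothetical helper)."""
    text = (element.text or "").strip()
    label = f"{element.tag}: {text}" if text else element.tag
    lines = ["  " * depth + label]
    # Each child element is one level deeper in the tree.
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(STUDENT_XML))
print("\n".join(tree_lines))
```

The root element "student" sits at the top, and its four child elements are indented one level below it.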

The differences between XML and HTML are:

  • XML focuses on content, while HTML focuses on page presentation and styles.
  • XML + XSL/CSS ==> HTML
  • XML and HTML are written in a similar way.
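The idea that one data file can feed several presentations can be sketched in a few lines. Note that this is not XSL: plain Python functions stand in for the style sheets, because the Python standard library has no XSLT processor. The function names as_plain_text and as_html_table are hypothetical, chosen only for this illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# The student document from this tutorial, inlined so the sketch is self-contained.
STUDENT_XML = """<?xml version="1.0"?>
<student>
<studentID>001</studentID>
<name>John Smith</name>
<sex>male</sex>
<age>20</age>
</student>"""


def as_plain_text(student):
    """Presentation 1: comma-separated values, like Display 1."""
    return ", ".join(child.text or "" for child in student)


def as_html_table(student):
    """Presentation 2: a simplified HTML table, in the spirit of Display 2."""
    rows = "".join(
        f"<tr><td><b>{escape(child.tag)}</b></td>"
        f"<td>{escape(child.text or '')}</td></tr>"
        for child in student
    )
    return f'<table border="1">{rows}</table>'


# The data is parsed once; only the presentation function changes.
student = ET.fromstring(STUDENT_XML)
plain = as_plain_text(student)
table = as_html_table(student)
print(plain)
print(table)
```

Swapping the presentation function plays the same role here as changing the href in student.xml from one .xsl file to another.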

The advantages of XML:

  • It facilitates information transmission between different systems, which is a huge convenience for B2B transactions.
  • It makes querying information more convenient. In XML, data and display are separated. When the display changes, the data file remains the same. When information is queried, only the XML file is searched.
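As a rough illustration of querying the data file directly, the sketch below looks up a student's name by studentID without touching any presentation markup. It assumes Python's standard xml.etree.ElementTree module, borrows the two-entry document from the syntax section, and uses a hypothetical helper named find_name.

```python
import xml.etree.ElementTree as ET

# Two student records, using the <entry> layout from the syntax section.
RECORDS_XML = """<?xml version="1.0"?>
<student>
<entry>
<studentID>001</studentID>
<name>John Smith</name>
<sex>male</sex>
<age>20</age>
</entry>
<entry>
<studentID>002</studentID>
<name>Ellen Joe</name>
</entry>
</student>"""


def find_name(root, student_id):
    """Return the name stored for student_id, or None if there is no such entry."""
    for entry in root.iter("entry"):
        if entry.findtext("studentID") == student_id:
            return entry.findtext("name")
    return None


records = ET.fromstring(RECORDS_XML)
print(find_name(records, "002"))
```

The query addresses elements by name ("studentID", "name"), which is only possible because the tags describe the data rather than its layout.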



 

III. XML Syntax

Let's have a look at the basic syntax of XML through our student example. For more detailed information about XML syntax, please refer to XML SYNTAX.

<?xml version="1.0"?>
<student>
<studentID>001</studentID>
<name>John Smith</name>
<sex>male</sex>
<age>20</age>
</student>

  • Declaration: <?xml version="1.0"?>

The declaration identifies the document as an XML document. The declaration also lists the version of XML used in the document. For the time being, it's 1.0.

  • XML is Case Sensitive

XML names are case sensitive. The following two elements are different for XML:

<studentID>001</studentID>

<StudentID>001</StudentID>
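A quick way to see case sensitivity in action is to hand both spellings to a parser. This sketch assumes Python's standard xml.etree.ElementTree module, which is not part of the original tutorial.

```python
import xml.etree.ElementTree as ET

lower = ET.fromstring("<studentID>001</studentID>")
upper = ET.fromstring("<StudentID>001</StudentID>")

# The parser keeps the two names apart: they are different elements.
same_name = lower.tag == upper.tag

# A start tag and an end tag that differ only in case do not match,
# so the parser rejects the element.
try:
    ET.fromstring("<studentID>001</StudentID>")
    mismatch_rejected = False
except ET.ParseError:
    mismatch_rejected = True

print(same_name, mismatch_rejected)
```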

  • Don't Forget End-Tags

The following will be rejected because "age" doesn't have an end tag.

<?xml version="1.0"?>
<student>
<studentID>001</studentID>
<name>John Smith</name>
<sex>male</sex>
<age>20
</student>
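The rejection can be observed with any XML parser. The sketch below assumes Python's standard xml.etree.ElementTree module and a hypothetical helper named first_error. It reports where the parser gives up on the document above.

```python
import xml.etree.ElementTree as ET

# The document above, with the end tag for "age" missing.
BROKEN = """<?xml version="1.0"?>
<student>
<studentID>001</studentID>
<name>John Smith</name>
<sex>male</sex>
<age>20
</student>"""


def first_error(text):
    """Return the parser's error for text, or None if it is well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return err
    return None


error = first_error(BROKEN)
if error is not None:
    line, column = error.position
    print(f"rejected at line {line}, column {column}: {error}")
```

The parser only notices the problem when it reaches </student> while "age" is still open, so the reported position is usually at that closing tag rather than at the line with the missing end tag.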

  • Empty Elements Must Be Closed

Elements that have no content are known as empty elements. For XML, the following two elements are identical:

<email href="mailto:abc@email.unc.edu" />

<email href="mailto:abc@email.unc.edu"></email>
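To check that the two forms really are equivalent to a parser, the following sketch parses both and compares the results. It assumes Python's standard xml.etree.ElementTree module, and the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring('<email href="mailto:abc@email.unc.edu" />')
long_form = ET.fromstring('<email href="mailto:abc@email.unc.edu"></email>')

# Same name, same attributes, no text and no children in either form.
same_attributes = short_form.attrib == long_form.attrib
both_empty = (
    short_form.text is None
    and long_form.text is None
    and len(short_form) == 0
    and len(long_form) == 0
)

# Writing the parsed elements back out gives the same bytes for both.
same_serialization = ET.tostring(short_form) == ET.tostring(long_form)

print(same_attributes, both_empty, same_serialization)
```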

  • Remember to Quote Attributes

In HTML, attribute values may be quoted or left unquoted. In XML, however, attribute values must always be quoted:

<tel preferred="true">919-962-1234</tel>

The following will be rejected:

<tel preferred=true>919-962-1234</tel>
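The quoting rule can be tried out with a small well-formedness check. This sketch assumes Python's standard xml.etree.ElementTree module, and the helper name is_well_formed is hypothetical. It also shows that single quotes are accepted as well as double quotes.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text, False if it rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


double_quoted = is_well_formed('<tel preferred="true">919-962-1234</tel>')
single_quoted = is_well_formed("<tel preferred='true'>919-962-1234</tel>")
unquoted = is_well_formed("<tel preferred=true>919-962-1234</tel>")

print(double_quoted, single_quoted, unquoted)
```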

  • One and Only One Root Element

At the root of the document, there must be one and only one element. The following example is illegal because there are two "student" elements at the root.

<?xml version="1.0"?>
<student>
<studentID>001</studentID>
<name>John Smith</name>
<sex>male</sex>
<age>20</age>
</student>

<student>
<studentID>002</studentID>
<name>Ellen Joe</name>

</student>

In order to correct it, we can change it to:

<?xml version="1.0"?>
<student>
<entry>
<studentID>001</studentID>
<name>John Smith</name>
<sex>male</sex>
<age>20</age>
</entry>

<entry>
<studentID>002</studentID>
<name>Ellen Joe</name>

</entry>
</student>
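The single-root rule can be verified in the same way. The sketch below assumes Python's standard xml.etree.ElementTree module and a hypothetical helper named count_entries. It feeds the parser a document with two root elements and then the wrapped version.

```python
import xml.etree.ElementTree as ET

# Two "student" elements at the root: not allowed.
TWO_ROOTS = """<?xml version="1.0"?>
<student>
<studentID>001</studentID>
<name>John Smith</name>
<sex>male</sex>
<age>20</age>
</student>
<student>
<studentID>002</studentID>
<name>Ellen Joe</name>
</student>"""

# The same records wrapped in a single root element.
WRAPPED = """<?xml version="1.0"?>
<student>
<entry>
<studentID>001</studentID>
<name>John Smith</name>
<sex>male</sex>
<age>20</age>
</entry>
<entry>
<studentID>002</studentID>
<name>Ellen Joe</name>
</entry>
</student>"""


def count_entries(text):
    """Return the number of <entry> children, or None if text is rejected."""
    try:
        return len(ET.fromstring(text).findall("entry"))
    except ET.ParseError:
        return None


two_roots_result = count_entries(TWO_ROOTS)
wrapped_result = count_entries(WRAPPED)
print(two_roots_result, wrapped_result)
```

The first document is rejected as soon as the parser meets content after the first root element has closed, while the wrapped document parses and exposes both records.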



After your first cup of XML, you should have a basic idea of what XML is and how it differs from HTML. XML gives you more freedom to organize your data by creating your own tags. At the same time, XML imposes more restrictions on how you use those tags. XML separates data from presentation: using different style sheets or applications, the same XML file can be displayed in many different ways.