XML Basics, Part II: The Key Concepts
In Part I of this series I answered the question, "What is XML?" Here in Part II of XML Basics, I will define, discuss, and illustrate some of the key concepts crucial to understanding and working with XML documents.
To recap briefly, XML is a metalanguage for describing markup languages. It provides a facility to define tags and the structural relationships between them.
To view XML documents hierarchically or to view their output, you need an XML parser and processor. While a number of these tools are available (Perfect XML has one such listing), for simplicity I will use Internet Explorer (5.x and later) to view the XML documents in my examples. Internet Explorer has a built-in XML parser and processor and is readily available.
The basic flow of XML processing consists of creating an XML document (and, optionally, corresponding XSL stylesheets) and passing it through an XML parser and processor to produce the desired output(s). One of the benefits of XML is the ability to create multiple outputs from one XML document, as the following visual representation of the process shows:
[Figure: one XML document, with optional XSL stylesheets, passing through an XML parser and processor to produce multiple outputs]
The XML Document
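XML documents are composed of markup and content. Six kinds of markup can occur in XML documents: elements, entity references, comments, processing instructions, marked sections (CDATA sections), and document type declarations.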
An XML document should begin with a declaration that identifies it as XML. While the XML declaration is not mandatory, it is good practice to include it anyway. In its simplest form it states only the XML version; it can also carry the optional attributes encoding and standalone.
The encoding attribute tells the XML parser which character encoding the document's text uses (for example, UTF-8 or ISO-8859-1), so that it can decode the bytes of the file into the correct Unicode characters.
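The standalone attribute specifies whether the XML document depends on markup declarations held in other, external files, such as an external DTD. A value of "yes" means it does not; "no" means it may.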
Most of the time, it will be sufficient to accept the defaults and not include these two attributes.
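To make the two forms concrete, here is a sketch of each. The version number 1.0 and the UTF-8 encoding are assumptions chosen for illustration. A document carries only one declaration, and it must be the very first thing in the file.

```xml
<?xml version="1.0"?>
```

With the optional attributes, which must appear in the order version, encoding, standalone:

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
```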
Document Type Declarations
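A document type declaration (DOCTYPE) associates an XML document with a Document Type Definition (DTD), which spells out the elements and attributes the document may contain and how they may be nested. Since one of XML's benefits is its strong adherence to common standards while still being extensible, the DTD, coupled with the XML specification, is the key to making this work in the real world.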
A DOCTYPE declaration for a movie catalog would name MovieCatalog as the root document element (we'll discuss this in a moment under Elements). The name is required and links the DTD to the entire element tree. The keyword SYSTEM and the URL that follows allow the parser to locate the corresponding DTD file, whether on the local file system or on a remote server.
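Such a declaration might be written as in the following sketch. MovieCatalog and the SYSTEM keyword come from the discussion above; the URL is a hypothetical placeholder standing in for wherever the DTD file actually lives.

```xml
<?xml version="1.0"?>
<!-- The DTD location below is a made-up placeholder for illustration -->
<!DOCTYPE MovieCatalog SYSTEM "http://www.example.com/dtd/MovieCatalog.dtd">
<MovieCatalog>
</MovieCatalog>
```

Note that the name after DOCTYPE matches the root element exactly, including capitalization.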
DTDs can get a little confusing and don't really make much sense until you've worked with some XML documents. So, for now, it is sufficient just to understand what they are and why they are important.
Elements
To draw an example from HTML, all of the following makes up one element, named h1:
<h1>This is my big heading.</h1>
Here, <h1> is the start tag, </h1> is the end tag, and the content sits between them.
Each XML document has a root element within which all other elements are nested. So, if we were creating an XML document representing a movie catalog, as in the example of the DOCTYPE statement above, the root element might look as follows:
<MovieCatalog>
Fundamental to the understanding of XML are the rules all elements must follow. I will list and describe them briefly:
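As a rough sketch, the core well-formedness rules from the XML specification can be seen in a small movie catalog document. Apart from MovieCatalog, the element and attribute names (Movie, Title, Director, Poster, id, file) and the sample data are hypothetical, chosen only for illustration.

```xml
<?xml version="1.0"?>
<!-- Rule: exactly one root element contains all the others -->
<MovieCatalog>
  <!-- Rule: attribute values are always quoted -->
  <Movie id="m001">
    <!-- Rule: every start tag has a matching end tag -->
    <Title>Example Movie Title</Title>
    <Director>Example Director</Director>
    <!-- Rule: an element with no content may use the empty-element form -->
    <Poster file="poster.jpg"/>
  </Movie>
  <!-- Rule: elements nest properly; Movie closes before MovieCatalog -->
  <!-- Rule: names are case-sensitive; <movie> would not match </Movie> -->
</MovieCatalog>
```

Opening a file like this in Internet Explorer 5 or later should display it as a collapsible tree, while breaking any of the rules should produce a parser error instead.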
Comments
Comments in XML are written the same way as in HTML, between <!-- and -->:
<!-- all the comments go in here -->
Comments can contain any combination of characters, numbers, or punctuation except for the literal string "--".
Where Does all this Lead?