On the first day God created SGML as a way of structuring documents
so that they would have something to live up to. (Any resemblance of God to
Charles Goldfarb is unintentional.) Tim Berners-Lee was shown SGML and saw
that it was good but waaaay too complex. So, on the second day, Berners-Lee
created HTML and saw that it was good and actually usable. Because HTML had
a fixed and determined set of elements (paragraphs, headings, bulleted lists,
etc.), a browser could look at HTML and know what to expect: if it says "<li>"
then what follows is an item on a list and should probably be indented and
have a bullet next to it. Thus was the world of documents made simpler. Much
simpler. Much much simpler. Too simple. Oversimplified. And inflexible.
So, on the third day, Jon Bosak, Tim Bray, Michael Sperberg-McQueen
and some others created XML which, like SGML, enabled page designers to create
their own types of elements and their own predefined document types. And they
looked at XML and said that it was way cool and just what we need, for XML
documents can be validated against their document type definitions (DTDs)
and can be structured so that a machine can read them and know which piece
of text is a part number and which is a dollar amount.
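To see what "a machine can read them" means in practice, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The element names (order, item, partnum, price) and the sample values are invented for this illustration; they do not come from any real document type.

```python
import xml.etree.ElementTree as ET

# Hypothetical order document; the tag names are made up for this example.
ORDER_XML = """
<order>
  <item>
    <partnum>TX-1138</partnum>
    <price currency="USD">19.95</price>
  </item>
  <item>
    <partnum>QR-42</partnum>
    <price currency="USD">5.00</price>
  </item>
</order>
"""


def read_items(xml_text):
    """Return a list of (part_number, amount, currency) tuples."""
    root = ET.fromstring(xml_text.strip())
    items = []
    for item in root.findall("item"):
        part = item.findtext("partnum")
        price = item.find("price")
        # The markup, not guesswork, tells us this text is a dollar amount.
        items.append((part, float(price.text), price.get("currency")))
    return items


if __name__ == "__main__":
    for part, amount, currency in read_items(ORDER_XML):
        print(part, amount, currency)
```

The program never has to guess which string is the part number and which is the price, because the tags say so.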
On the fourth day, the world looked at how the Web was developing
and looked at XML and saw that maybe they needed something more. XML was unwieldy
for some of the non-PC applications that were getting plugged into the Web
— cable boxes to refrigerators — and that XML stuff was still pretty hard
to do. Plus, XML isn't backwards compatible with the older Web browsers. Even
HTML, because it's so flexible and people write it so sloppily, requires multi-megabyte
interpreters (called "browsers") to be understood.
And so, on the sixth day (on the fifth day everyone downloaded
everything they could before Napster was shut off), XHTML was created. XHTML
is compatible with HTML 4, so if you develop your pages using it those pages
will still work in browsers that aren't so old that they choke on JavaScript.
And, of course, XHTML can be read by anything that can read XML, for it is
technically an XML document specification. While XHTML is less flexible than
the XML it's written in (for it has a fixed tagset), it's a stricter disciplinarian
than HTML; browsers are currently happy to read even the sloppiest of HTML
pages, but to be a valid instance of XHTML, authors have to remember to do
things like match all their tags with the appropriate end-tag, only use lowercase
for the tag names, get tag-nesting right and put attribute values into quotes
(e.g., <img width="200" />, not <img width=200>).
The free ride is over. But this discipline is required if Web pages are going
to be read by low-wattage applications like household appliances. (Yes, our
toaster-ovens are now setting the tune.)
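Because XHTML is XML, any XML parser can play disciplinarian. The following is a rough sketch, again in Python with the standard library's xml.etree.ElementTree, that checks a snippet for well-formedness (matched end-tags, proper nesting, quoted attribute values) and then for lowercase tag names. The function name check_fragment and the sample snippets are assumptions made for this example, and the check is not full validation against an XHTML DTD.

```python
import xml.etree.ElementTree as ET


def check_fragment(markup):
    """Return (ok, message) for a snippet of would-be XHTML."""
    try:
        # An XML parser rejects unclosed tags, bad nesting and
        # unquoted attribute values outright.
        root = ET.fromstring(markup)
    except ET.ParseError as err:
        return False, "not well-formed: " + str(err)
    # Well-formed XML may still use uppercase names, which XHTML forbids.
    for element in root.iter():
        if element.tag != element.tag.lower():
            return False, "uppercase tag name: " + element.tag
    return True, "ok"


SAMPLES = [
    '<p><img width="200" /></p>',  # quoted value, closed tag
    '<p><img width=200 /></p>',    # unquoted attribute value
    '<p><b><i>text</b></i></p>',   # tags nested in the wrong order
    '<ul><li>one<li>two</ul>',     # missing end-tags
    '<P>shouting</P>',             # uppercase tag name
]

if __name__ == "__main__":
    for sample in SAMPLES:
        ok, message = check_fragment(sample)
        print("PASS" if ok else "FAIL", sample, "->", message)
```

Only the first sample passes. A browser of the day would have cheerfully rendered all five, which is exactly the free ride in question.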
XHTML 1.0 starts out with three basic types of documents. "Strict"
is a minimal set of tags. "Transitional" will let you do all your
fancy-ass HTML formatting tricks. "Frameset" is for the loser pages
that use HTML frames. You can use style sheets with any of these, thus regaining
formatting capabilities such as "<center>" that have been
stripped out.
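A page announces which of the three types it is through the DOCTYPE declaration at its top. As a small sketch, the Python below looks for the public identifiers that the XHTML 1.0 specification assigns to the three document types. The function name flavor_of and the sample page are hypothetical, and a simple substring search stands in for real parsing.

```python
# Public identifiers for the three XHTML 1.0 document types.
XHTML1_PUBLIC_IDS = {
    "Strict": "-//W3C//DTD XHTML 1.0 Strict//EN",
    "Transitional": "-//W3C//DTD XHTML 1.0 Transitional//EN",
    "Frameset": "-//W3C//DTD XHTML 1.0 Frameset//EN",
}


def flavor_of(document_text):
    """Return "Strict", "Transitional", "Frameset", or None."""
    # The DOCTYPE sits near the top, so only the start is searched.
    head = document_text[:1024]
    for flavor, public_id in XHTML1_PUBLIC_IDS.items():
        if public_id in head:
            return flavor
    return None


# Hypothetical page used only to exercise the function.
SAMPLE_PAGE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Sample</title></head>
  <body><p>Hello, toaster.</p></body>
</html>
"""

if __name__ == "__main__":
    print(flavor_of(SAMPLE_PAGE))
```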
Will XHTML replace HTML? While some tag junkies may think so
because they believe the universe is ultimately rational, there's not a chance
in hell if only because the first browser that refuses to show you an HTML
page because it's not properly done in XHTML is the browser you'll throw off
your desktop. But XHTML is very likely to become the standard for people creating
Web pages for a living, for it adds enough rigor to make their work reusable
both by a wide range of devices and by computing applications trying to make
sense of the livable mess we call the Web.
RESOURCES:
General XHTML Reference:
http://www.xhtml.org/
http://www.w3.org/TR/xhtml1/
XML resources:
http://www.xml.org
http://www.xml.com
Differences between HTML 4 and XHTML:
http://www.wdvl.com/Authoring/Languages/XML/XHTML/dif.html
The Author
David Weinberger writes JOHO and is one of the Ringleaders of cluetrain.com,
a manifesto of web ethics. He also provides strategic marketing
consulting to high-tech companies, writes for several magazines
(including Wired)
and is a commentator on NPR's "All Things Considered."
He was, as VP of Strategic Marketing, one of the shapers of Open
Text's intranet strategy. David sits on several conference boards
and is a member of AIIM's Emerging Technology Advisory Group. Reach
him at self@evident.com.