Intranet Journal   Earthweb  

Burning X Feature
It's Millennial, But Is It XML?
XML pioneer Tim Bray answers questions about Microsoft Office 2000


Interview by David Weinberger
Editor, Journal of the Hyperlinked Organization

A few weeks ago I made the mistake of publishing a quasi-enthusiastic statement about the onrushing Office 2000 juggernaut:

Microsoft is continuing to commit to XML as a standard "save" format for its applications.

From the ensuing "fan" mail, I learned that - as is typically the case with opinions gleaned from marketing literature - there is more to this than meets the eye. So I posed some questions to Tim Bray (www.textuality.com), one of XML's parents and co-author of the standard itself:

Q: There are questions about how thoroughly MS is implementing XML support.

XML support per se is not a design goal of O2K [Office 2000]. They are using some XML machinery to store non-HTML-type information in their HTML++ save format, that's all.

Q: Is "WordML" -- the XML Word uses -- so complex and structured on whatever random paragraph styles the user has implemented that it is in effect proprietary? That is, could a developer write an app that processes MSW2000 docs, finds the XML islands, and does interesting things?

In terms of the popularly-held view of XML as a family of tag languages that mark up the *structure* and *meaning* embedded in data, the Office 2K stuff obviously doesn't qualify, for the reasons you state, although it's not fair to say it's unduly complex.

At a more concrete level, Office 2000 has no "save" format that conforms to the XML specification, thus there is no such thing as "the XML Word uses". The right way to think of their weird "HTML++" format that embeds chunks of XML in HTML -- the result being neither valid HTML nor well-formed XML -- is that it's a much more parseable and tractable version of RTF. The existence of such a thing would be of benefit to everyone - in particular if, as they've promised, they actually document all the tags and attributes that show up (they may already have, I could've missed it).

So the answer to your second question is "yes"; of course, the hypothetical developer can't actually use a standard off-the-shelf XML processor - even Microsoft's - to do so. The O2K format could have, with only a moderate amount of extra work, been made into real XML. Whether you regard the failure to do so as commentary on MS's competence, attitude, or competitive strategy probably depends on how you regard Microsoft.
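[Editor's sketch: to make the point concrete, here is a small Python illustration of why an off-the-shelf XML processor balks at this kind of file while a tolerant HTML scanner can still find the XML islands. The `SAMPLE` text is invented for illustration and is much simpler than real Office 2000 output; the names `is_well_formed`, `IslandFinder`, and `extract_island_fields` are hypothetical helpers, not part of any Microsoft or standard API.]

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Invented, simplified stand-in for an "HTML++" file: HTML with an
# embedded <xml> island. Real Office 2000 output is more elaborate.
SAMPLE = """<html xmlns:o="urn:schemas-microsoft-com:office:office">
<head>
<meta name=Generator content="Microsoft Word 9">
<xml>
 <o:DocumentProperties>
  <o:Author>Example Author</o:Author>
 </o:DocumentProperties>
</xml>
</head>
<body><p class=MsoNormal>Hello<br>world</p></body>
</html>"""


def is_well_formed(text):
    """Return True if a standard XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


class IslandFinder(HTMLParser):
    """Tolerant scanner that records leaf text found inside <xml> islands."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # greater than 0 while inside an <xml> island
        self.current = None  # most recently opened tag inside the island
        self.fields = {}     # lowercased tag name -> text content

    def handle_starttag(self, tag, attrs):
        if tag == "xml":
            self.depth += 1
        elif self.depth:
            self.current = tag

    def handle_endtag(self, tag):
        if tag == "xml" and self.depth:
            self.depth -= 1
        self.current = None

    def handle_data(self, data):
        text = data.strip()
        if self.depth and self.current and text:
            self.fields[self.current] = text


def extract_island_fields(text):
    """Collect tag/text pairs from every <xml> island in the document."""
    finder = IslandFinder()
    finder.feed(text)
    finder.close()
    return finder.fields


if __name__ == "__main__":
    print("well-formed XML?", is_well_formed(SAMPLE))
    print("island fields:", extract_island_fields(SAMPLE))
```

The unquoted attribute values and unclosed tags in the sample are what stop the XML parser. The HTML scanner gets through, but only by giving up what a conforming XML toolchain would provide: it lowercases tag names and ignores namespaces.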

Q: I assume MS's approach closes off the possibility of using Word as an XML editor that produces clean, valid, DTD-conforming, no-presentation-info XML docs.

Given Microsoft's resources and money, they could turn Word into a structured editing engine if they wanted to. Yes, it would be hard - months, maybe more than a year. They might get there quicker starting from scratch rather than building on Word or FrontPage. So far I don't think they see a market there. Their take seems to be that XML is primarily for app-to-app data interchange; they don't see why anyone would want to send XML to a human viewer.

The possibility always remains open. But first they'd have to be interested.

Q: The roundtripping in Word 2000 seems to me to work really well. I'm impressed with the preservation of presentation info in the HTML docs (at least in IE 5). Am I missing something?

Nope, works just fine. The problem is that the HTML++ format is neither valid HTML nor conforming XML, and it could have been both with only a little more work. By doing that work they could have immensely increased the reusability of O2K docs, to their customers' huge benefit. Either they don't see that, or there is some cost to providing that benefit that they're not talking about.

The Author

David Weinberger writes JOHO and is one of the Ringleaders of cluetrain.com, a manifesto of web ethics. He also provides strategic marketing consulting to high-tech companies, writes for several magazines (including Wired) and is a commentator on NPR's "All Things Considered." He was, as VP of Strategic Marketing, one of the shapers of Open Text's intranet strategy. David sits on several conference boards and is a member of AIIM's Emerging Technology Advisory Group. Reach him at self@evident.com.