Intranet Journal   Earthweb  
The Future of Internet Publishing
Putting XML in perspective
XML and Distributed Computing
XML and Meta-Data
Enhanced hyperlinks through XML
Emerging stylesheet standards
DataChannel and XML
Conclusion
References


In many cases, XML and SGML will be used within the same architecture. Many companies in the publishing arena have existing investments in SGML systems. There is no reason to stop using and expanding these systems. There are also many compelling reasons for having publishing systems use SGML in the authoring process since there are still many things that XML was not designed for. A common scenario in next-generation publishing systems is the use of SGML to create and author documents, and then XML to publish them via the Internet.

XML goes beyond traditional markup
XML can be used in ways that clearly go beyond traditional markup of documents. Two new areas currently emerging are distributed computing and metadata.

XML and Distributed Computing

XML allows the creation of software tools that can process XML objects quickly while keeping the working set of software within reasonable limits. This makes XML an ideal candidate for a message interchange format between distributed applications and components. This idea, already popular within the SGML community, gets new attention in light of XML's capabilities.

XML can be used as a transparent and efficient message format, and yet remain readable to humans. XML messages are passed between applications based on other established protocols such as HTTP.
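As a rough illustration of XML as a message format, the sketch below builds a small request message and reads it back using Python's standard xml.etree.ElementTree module. The element names (message, action, params, param) and the sender and recipient names are invented for this example; they are not taken from any DataChannel product or published protocol, and the transport (for instance an HTTP POST) is left out.

```python
import xml.etree.ElementTree as ET


def build_message(sender: str, recipient: str, action: str, payload: dict) -> bytes:
    """Serialize a request as a small XML message (hypothetical vocabulary)."""
    msg = ET.Element("message", {"from": sender, "to": recipient})
    ET.SubElement(msg, "action").text = action
    params = ET.SubElement(msg, "params")
    for name, value in payload.items():
        # Every value travels as text; the receiver decides how to interpret it.
        ET.SubElement(params, "param", {"name": name}).text = str(value)
    return ET.tostring(msg, encoding="utf-8")


def read_message(data: bytes) -> dict:
    """Parse a message produced by build_message back into a dictionary."""
    root = ET.fromstring(data)
    return {
        "from": root.get("from"),
        "to": root.get("to"),
        "action": root.findtext("action"),
        "params": {p.get("name"): p.text for p in root.iter("param")},
    }


# The bytes in `wire` are what one application would send to another.
wire = build_message("inventory", "billing", "reserve", {"sku": "A-100", "quantity": 3})
decoded = read_message(wire)
```

The serialized message stays plain text, so it can be inspected in a log or a network trace without special tools, which is the readability property described above.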

XML and Metadata

Metadata is information about data. An abstract, keywords, and information about the author of a document are all examples of metadata. Sometimes metadata can be extracted directly from the described data (as in the case of an XML document). In other cases, such as an image, metadata needs to come from an external source.

Working groups inside the World Wide Web Consortium are currently investigating the use of metadata for applications on the Internet. The Resource Description Framework (RDF) is a work in progress with the goal of defining a standard for expressing and exchanging metadata. RDF uses XML as the encoding syntax for its metadata.
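To make the idea of XML-encoded metadata concrete, here is a minimal sketch that describes an image (which cannot describe itself) with an external RDF-style record and reads it with Python's standard library. The namespace URIs are those of the RDF and Dublin Core vocabularies as later published, so they are an assumption relative to the draft discussed here; the resource URL and the property values are made up.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as later standardized; assumed here for illustration.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# External metadata about an image, kept outside the image itself.
DESCRIPTION = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="http://example.org/images/logo.png">
    <dc:title>Company logo</dc:title>
    <dc:creator>Jane Doe</dc:creator>
    <dc:subject>branding</dc:subject>
  </rdf:Description>
</rdf:RDF>
"""


def read_metadata(rdf_xml: str) -> dict:
    """Map each described resource to a dictionary of its properties."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        resource = desc.get(f"{{{RDF_NS}}}about")
        # ElementTree tags look like "{namespace}local"; keep the local part.
        result[resource] = {child.tag.split("}")[1]: child.text for child in desc}
    return result


metadata = read_metadata(DESCRIPTION)
```

This is only a reader for one simple shape of record, not a general RDF parser.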

Enhanced hyperlinks through XML

Linking is the capability to build an association between two objects; such an association is also known as a hyperlink. In their simplest form, links can be used to create associations between phrases or words in HTML documents. This simple mechanism is already used to build very powerful systems.

Hyperlinking can go much further. As part of the XML family, XLL is going to provide a set of very sophisticated association mechanisms. XLL will build on the association techniques and ideas of proven technologies, such as HyTime and TEI-Pointers.

XLL will push the idea of hyperlinking and hyper-navigation to new levels. Next generation publishing systems will be able to draw heavily from the new possibilities.

Some of these new capabilities are:

  • Multi-directional links.
  • External links (Link information is kept outside a document).
  • Addressing based on the structure of a document (e.g. 1st section of 2nd chapter).
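The last capability, addressing by document structure, can be approximated with tools that exist today. The sketch below is not XLL syntax; it uses the limited path language built into Python's xml.etree.ElementTree as a stand-in to select "the 1st section of the 2nd chapter". The book, chapter, section, and title element names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

BOOK = """
<book>
  <chapter><title>Introduction</title>
    <section><title>Scope</title></section>
  </chapter>
  <chapter><title>Markup</title>
    <section><title>Elements</title></section>
    <section><title>Attributes</title></section>
  </chapter>
</book>
"""

root = ET.fromstring(BOOK)

# Positions are 1-based: second <chapter> child, then its first <section> child.
target = root.find("chapter[2]/section[1]")
section_title = target.findtext("title")
```

The address depends only on where the section sits in the tree, so the target document needs no anchor or identifier of its own.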

Emerging stylesheet standards

Ideally, rendering is based on some kind of stylesheet mechanism. A stylesheet, in its most primitive form, can dictate that the header of a chapter is to be rendered in "bold 14pt Times New Roman." In HTML, there is no explicit style information. Browsers have hardwired or user-configurable style information that renders a Heading One (<h1>) for instance, in large, bold characters. The font and size are dependent on the browser and user preferences. Three important stylesheet mechanisms are CSS, DSSSL, and XSL.

Cascading Stylesheets (CSS)
CSS is an attempt to provide a simple yet sufficient stylesheet mechanism for HTML documents. Essentially CSS enables the association of rendering information like color, font-size, font-characteristics, and alignment with HTML element types (tags).

Although originally designed for HTML, CSS can be applied to XML as well. CSS is good enough for many simple publishing requirements. For more complex publishing needs, however, more powerful features are needed.
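The core idea, associating rendering properties with element types, can be sketched in a few lines. The example below is not a CSS engine: it keeps a hypothetical stylesheet as a Python dictionary keyed by element name and renders matching XML elements as HTML with inline styles. The title and para element names and the chosen property values are assumptions.

```python
import xml.etree.ElementTree as ET
from html import escape

# A primitive stylesheet: element type -> rendering properties.
STYLESHEET = {
    "title": {"font-family": "Times New Roman", "font-size": "14pt", "font-weight": "bold"},
    "para": {"font-size": "10pt"},
}


def render(xml_text: str) -> str:
    """Render every element that has a style rule as a styled HTML <div>."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        rules = STYLESHEET.get(element.tag)
        if rules is None:
            continue  # No rule for this element type, so it is not rendered.
        css = "; ".join(f"{prop}: {value}" for prop, value in rules.items())
        text = escape((element.text or "").strip())
        lines.append(f'<div style="{css}">{text}</div>')
    return "\n".join(lines)


html_out = render(
    "<chapter><title>XML &amp; style</title><para>Body text.</para></chapter>"
)
```

Because the rules live outside the document, the same XML can be rendered differently by swapping the dictionary, which is the separation a stylesheet language provides.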

Document Style Semantics and Specification Language (DSSSL)
In the SGML community, style has been one of the big challenges of the last decade. SGML systems often used proprietary stylesheets. Early attempts at standardization, such as the Formatting Output Specification Instance (FOSI), were not very successful.

In 1995 DSSSL was introduced. DSSSL is probably the most powerful stylesheet mechanism ever conceived. DSSSL includes features for rendering as well as tree-transformation. DSSSL also embeds a subset of Scheme enabling authors to include complex programs within a stylesheet. The DSSSL formatting model is explicitly designed to target the needs of the high-end publishing community.

As powerful as DSSSL is, it seems to face the same problem as SGML. Due to its complexity and size, it is not very suitable for the Internet, where processing speed and widespread adoption are important.

eXtensible Stylesheet Language - XSL
XSL is currently a submission to the W3C by a variety of Internet and publishing companies. Much as XML is a subset of SGML, XSL is an attempt to subset DSSSL and provide a syntax that is more appealing to the majority of WWW developers than Scheme(-like) expressions are.

XSL uses XML as its syntax and ECMAScript as an embedded programming language. XSL is likely to become the stylesheet language of choice in situations where traditional CSS stylesheets are insufficient, such as in professional or high-end publishing on the Internet.

Meta-content routing

The next generation of Internet Publishing also introduces one new and very important concept: meta-content routing. Meta-content routing enables a publisher to make sure that the right information gets to the right person at the right time.

A meta-content router is a software module that uses, maintains and controls a variety of different information repositories, all of which, though conceptually different, can be kept in the same database. In the ideal case, a Meta-content Router, such as DataChannel’s ChannelManager, works with the following data.

Channel Profiles
A channel can be compared to a stable classification scheme, such as the sections of a daily newspaper. Sports, politics, and so on can usually be found in the same place, and readers can be sure that the "Sports Pages" contain nothing but sports. In other words, a channel is a mechanism that ensures readers find information that is relevant or interesting to them and that belongs to a certain subject. A channel is a conceptual stream of information.

User Profiles
Users of a system can be described in traditional terms such as name, contact information, and what streams of information (channels) they are interested in.

Group Profiles
Users, especially in a corporate environment, are often organized into groups. For instance, a marketing person might be a member of both a general marketing department and an international accounts team. Everybody in the marketing department needs a common base of information, so a member of the department can "inherit the interest" in certain subjects from the group. For this reason, it is valuable to map these organizational structures into Meta-content Routers.
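How a user might "inherit the interest" of a group can be sketched as follows. This is a minimal model under stated assumptions, not ChannelManager's actual data model: the User and Group classes, the effective_channels method, and the channel names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Group:
    """A group profile: a name plus the channels every member should receive."""
    name: str
    channels: Set[str] = field(default_factory=set)


@dataclass
class User:
    """A user profile: personal channel choices plus group memberships."""
    name: str
    channels: Set[str] = field(default_factory=set)
    groups: List[Group] = field(default_factory=list)

    def effective_channels(self) -> Set[str]:
        """Personal channels plus everything inherited from the user's groups."""
        result = set(self.channels)
        for group in self.groups:
            result |= group.channels
        return result


marketing = Group("Marketing", {"press-releases", "campaigns"})
accounts = Group("International Accounts", {"emea-accounts"})
alice = User("alice", {"sports"}, [marketing, accounts])
alice_channels = alice.effective_channels()
```

Adding a channel to the marketing group reaches every member on the next lookup, without editing individual user profiles.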

Desktop Profiles
Desktop profiles are parameters a user can configure to adjust a desktop to preferred settings. A system like DataChannel's ChannelManager keeps desktop profiles in a database. A user can access this desktop profile at any time, from any place, and from any computer (provided it is connected to the Internet).

Metadata
Metadata is of special importance in a meta-content router. DataChannel's ChannelManager does not route or distribute the content itself. Imagine a network with 500 clients and a 20MB file that is pushed to all users at (potentially) the same time. Even with multicast push, this is the nightmare scenario for every intranet administrator.

That is why systems like ChannelManager route only Meta-content. ChannelManager notifies all 500 clients that the 20MB file is available, including a link to it. The consumers need only to know about an information object. It is then up to these consumers to decide how to proceed after analyzing the Meta-content.
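The difference between routing content and routing meta-content can be shown with a small sketch. It is an illustration under assumptions, not a description of how ChannelManager is implemented: the MetaContent record, the route function, the subscription table, and the example URL are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class MetaContent:
    """A description of an information object, not the object itself."""
    title: str
    channel: str
    url: str
    size_bytes: int


def route(item: MetaContent, subscriptions: Dict[str, Set[str]]) -> List[dict]:
    """Build one small notification per user subscribed to the item's channel."""
    notifications = []
    for user, channels in sorted(subscriptions.items()):
        if item.channel in channels:
            notifications.append(
                {
                    "to": user,
                    "title": item.title,
                    "url": item.url,  # A link only; the file stays on the server.
                    "size_bytes": item.size_bytes,
                }
            )
    return notifications


report = MetaContent(
    title="Q3 sales report",
    channel="sales",
    url="http://intranet.example.com/reports/q3.pdf",
    size_bytes=20 * 1024 * 1024,
)
subscriptions = {
    "alice": {"sales", "marketing"},
    "bob": {"engineering"},
    "carol": {"sales"},
}
notifications = route(report, subscriptions)
```

Each notification is a few hundred bytes regardless of the file size, and the size field gives the recipient what is needed to decide whether and when to fetch the 20MB file.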

DataChannel and XML

DataChannel is the leading company in the arena of next-generation publishing systems based on meta-content routing. DataChannel was one of the first adopters of XML technology and is going to play a leading role in its future.

DataChannel is represented in the W3C-XML-Working Group, the W3C-RDF-Schema-Working Group and the W3C-DOM-Interest Group. DataChannel’s representatives have been speakers at a variety of key industry events and conferences. DataChannel is the host of DXDE – the DataChannel XML Development Environment – a set of components for XML application developers.

Furthermore, DataChannel maintains ChannelWorld - a unique assembly of resources relevant to next generation standards.

Conclusion

New formats, standards, technologies, and concepts are changing the way publishing is done on the Internet. The transition does not have to be abrupt. SGML will continue to play a part in document formatting. The kinds of things that can be done with an SGML document will be greatly expanded. The emerging stylesheet standards and XML fill gaps in Internet publishing that have long been a thorn in the side of publishers.

Possibly the biggest change in the near future is the emergence of meta-content routing. Finally, publishers will be able to make sure that the right information gets to the right person at the right time.

References

© 1998 DataChannel, Inc.

The Company

DataChannel Inc., based in Bellevue, Washington, is the leader in XML-enabled active content technology. DataChannel's flagship product, DataChannel RIO, simplifies the process of delivering critical information to the right people at the right time through instant distribution of organized content, the ability of anyone to save content directly to the Web, and the provision of an open API (application program interface). To find out more, visit the company's web site at www.datachannel.com.

The Author

Norbert H. Mikula is Senior Online Information Architect at DataChannel and author of NXP (Norbert's XML Parser), the world's first fully featured XML parser written in Java.