Internet Messaging II

Standards on the sending desktop


Adapted from the Prentice-Hall text Internet Messaging, From the Desktop to the Enterprise, by David Strom and Marshall T. Rose.

 


Content Types

The Content-type: header is used to identify the content value contained within a message. A content type is identified by the following properties:

  • a type, which gives general guidance as to the resources required in order to process the content;
  • a subtype, which refines the content; and,
  • zero or more parameters, which allow for the customization of the content.

By convention, when people talk about a content type they say both the type and subtype. The two are separated by a solidus (aka forward slash, "/"), for example text/plain or application/MSWord. There are seven predefined types and several subtypes associated with each content type. The original content types were defined so that there probably won't be any more than the original seven.
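As a rough illustration of the three properties, the sketch below uses Python's standard email package to split a header value into its type, subtype, and parameters. The helper name split_content_type and the sample header value are assumptions made for this example, not part of the standard.

```python
from email.message import EmailMessage


def split_content_type(value):
    """Return (type, subtype, parameters) for a Content-type: header value.

    Hypothetical helper for illustration; the parsing itself is done by
    the standard library's header machinery.
    """
    msg = EmailMessage()
    msg["Content-Type"] = value
    header = msg["Content-Type"]
    return header.maintype, header.subtype, dict(header.params)


# Sample value chosen for illustration only.
main, sub, params = split_content_type('text/plain; charset="iso-8859-1"')
print(main, sub, params)
```

The type and subtype come back separately, and joining them with a solidus gives the conventional spoken form of the content type.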

The multipart type is the most complex. It is used to convey a content value that contains subordinate parts. Basically, a multipart content, regardless of its subtype, contains zero or more body parts, each separated by a delimiter. Each of the body parts is structured in a similar fashion to an electronic mail message. Unlike a message, however, no header fields need be present. Hence, any of the body parts could start with a blank line. However, there are usually headers present and they should all be named with a prefix of Content-. If no Content-type: header is present, then the value text/plain is used as a default, which means that the body part contains unstructured ASCII text.

There are eight subtypes of multipart in common use. We'll describe five here.

  1. multipart/mixed, which indicates that the subordinate body parts should be processed in sequence.
  2. multipart/parallel, which indicates that the subordinate body parts should be processed in parallel. However, if more than one body part requires exclusive access to a common resource (e.g., if two or more body parts require access to the user's keyboard when rendering them), or if the software processing the message is incapable of simulating parallel processing, then sequential processing is acceptable.
  3. multipart/digest, which indicates that each subordinate body part is an electronic message, having type message/rfc822 (discussed in just a moment). When messages are forwarded, this is the content type to use. Unfortunately, much "modern" desktop software simply includes the message as text, without actually structuring it as an included message. As a result, humans can figure out what's going on, but programs can't.
  4. multipart/alternative, which indicates that while there are multiple subordinate body parts present, they all have identical semantic content. As such, only one should be processed. The body parts are ordered in terms of expressive power, with the least expressive content being the first, and the most expressive content being the last. The reason for this is to make things simpler for pre-MIME software. That desktop software will display the entire message to the user; hopefully, the first body part will be legible to a human.
  5. multipart/report, which indicates that the message is an error report.
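To make the ordering rule for multipart/alternative concrete, here is a minimal sketch using Python's standard email package. The subject line and body text are invented for the example. The plain-text rendering is added first and the HTML rendering last, so the least expressive part leads.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Quarterly figures"  # hypothetical subject

# Least expressive rendering first, so pre-MIME software shows something legible.
msg.set_content("Sales rose 4% this quarter.")

# Most expressive rendering last; both parts carry the same meaning.
msg.add_alternative("<p>Sales rose <b>4%</b> this quarter.</p>", subtype="html")

part_types = [part.get_content_type() for part in msg.iter_parts()]
print(msg.get_content_type(), part_types)
```

A receiving program that understands both parts would normally display only the last one it can render.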

The easy way to think of the multipart type is that it is interpreted directly by the desktop software and the user should be completely unaware of its existence. The same is largely true of the next type, message, which has three commonly used subtypes and one unpopular subtype:

  1. message/rfc822, which indicates that the content value is an electronic mail message. When forwarding messages, the multipart/digest content type is used and each subordinate body part is of type message/rfc822.
  2. message/partial, which indicates that the content value is part of a fragmented message. When a message is too large to send, typically due to administrative controls, it can be divided into several fragments. Each fragment has a common id and a unique number. The final fragment must (and the other fragments usually do) have an indication as to the total number of fragments. Upon receiving all fragments, the original message can be reconstructed. The only particularly tricky part about the process is that the Content- headers and the Message-ID: of the original message are placed at the front of the value put in the first fragment. This prevents any confusion between the headers identifying each fragment and the headers in the original message.
  3. message/delivery-status, which is contained inside a structured error report. As described in Chapter 2, it's the second part of an error report that contains machine-readable information about the problems in delivering the message.
  4. message/external-body, which indicates that the content value is a pointer to the content, rather than the actual value. This subtype is falling out of use. The reason is that it is proving easier to send a message containing HTML, which embeds a link to the external content rather than constructing a separate external body part.
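The fragmentation scheme behind message/partial can be sketched as follows. This is a simplified illustration, not a complete implementation: the function names, the lines_per_fragment parameter, and the sample id are assumptions, and the sketch leaves out the outer envelope headers (From:, To:, and so on) that each fragment would need in order to be sent.

```python
from email.message import Message


def fragment(raw_message, lines_per_fragment, fragment_id):
    """Split the full text of a message into message/partial fragments.

    The original headers stay at the front of the text, so they land in
    the first fragment's value.
    """
    lines = raw_message.splitlines(keepends=True)
    chunks = ["".join(lines[i:i + lines_per_fragment])
              for i in range(0, len(lines), lines_per_fragment)]
    fragments = []
    for number, chunk in enumerate(chunks, start=1):
        part = Message()
        # Every fragment shares the id; each has its own number.
        part["Content-Type"] = (
            f'message/partial; id="{fragment_id}"; '
            f"number={number}; total={len(chunks)}"
        )
        part.set_payload(chunk)
        fragments.append(part)
    return fragments


def reassemble(fragments):
    """Rebuild the original text once all fragments have arrived."""
    ordered = sorted(fragments, key=lambda p: int(p.get_param("number")))
    total = int(ordered[-1].get_param("total"))
    if len(ordered) != total:
        raise ValueError("not all fragments have arrived")
    return "".join(p.get_payload() for p in ordered)
```

For simplicity this sketch writes the total on every fragment, although only the final fragment is required to carry it.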

As a note for protocol historians, this last subtype was developed at approximately the same time as the Web technologies. For various reasons, it didn't use the same syntax as the Web. This was, in retrospect, a mistake, given that an HTML fragment has equivalent functionality to an external body part.

The remaining content types are meaningful to the user. The standardized ones in common use are:

  1. text/plain, which indicates that the content value is plain text. A parameter indicates which character set should be used when rendering the text. In general, the simplest character set that faithfully represents the value should be chosen. For example, the characters contained in the US-ASCII set are a subset of those contained in the ISO-8859-1 repertoire. If a message makes use of only those characters in the former character set, then that should be the character set indicated by the e-mail program. However, as we'll see later in this chapter, not all products have been implemented in this fashion.
  2. text/html, which indicates that the content value is HTML, as used by the Web. The same character set issues apply as for text/plain.
  3. text/richtext, which indicates that the content value is input to a simple text formatter. This is another casualty of the early development of MIME not foreseeing the popularity of HTML.
  4. image/gif, which indicates that the content value is image data encoded using the Graphics Interchange Format (GIF).
  5. image/jpeg, which indicates that the content value is image data encoded using the Joint Photographic Experts Group (JPEG) format.
  6. audio/*, which indicates that the content value is audio data encoded using the indicated subtype (and parameters). Originally, there was the audio/basic content, which was phone-quality, single-channel audio, but this lacks the sizzle required by the people marketing today's Internet.
  7. video/*, which indicates that the content value is video encoded using the indicated subtype (and parameters).
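The advice under text/plain, to pick the simplest character set that faithfully represents the text, can be sketched as a short search over candidates. The function name simplest_charset and the use of UTF-8 as a final fallback are assumptions of this example; the text above names only US-ASCII and ISO-8859-1.

```python
from email.message import EmailMessage


def simplest_charset(text, candidates=("us-ascii", "iso-8859-1", "utf-8")):
    """Return the first candidate character set that can represent text.

    Candidates are tried from simplest to most general. The utf-8
    fallback is an assumption of this sketch.
    """
    for charset in candidates:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue
        return charset
    raise ValueError("no candidate character set can represent the text")


body = "Meet me at the caf\u00e9."  # contains one non-ASCII character
msg = EmailMessage()
msg.set_content(body, charset=simplest_charset(body))
print(msg.get_content_charset())
```

A message containing only US-ASCII characters would be labeled us-ascii by this approach, even though ISO-8859-1 could also represent it.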

As might be imagined, there are many subtypes of text, image, audio and video used for specialized applications throughout the Internet. However, we haven't yet described the seventh content type, which is where most of the customized behavior is found: the application type. Although the original intent of the application type was to convey a content value for mail-enabled applications, in practice any time something needs to be sent that is more complex than one of these four types (text, image, audio or video), then the application type is used.

For example, if you need to send a spreadsheet, a word processing document or a slide presentation, then the company that wrote the authoring program has already registered the application subtype that conveys the appropriate kind of file.[2] Among other things, the MIME standard documents the procedure wherein a vendor may register content types with a registration authority. In addition, there is one other common subtype:

  • application/octet-stream, which indicates that the content is arbitrary binary data. Parameters indicate a textual explanation of the contents. This subtype is generally used when the appropriate company has not registered a specific application subtype.

In effect, the application/octet-stream type provides a simple file transfer facility over e-mail. Let's look at two examples.
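As a small illustration of this file transfer use, the sketch below attaches arbitrary bytes to a message with Python's standard email package. The file name data.bin and the payload are invented for the example. Note that this library records the file name in a Content-Disposition: header, which is a detail of the library rather than something described above.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("The data file is attached.")

payload = bytes(range(256))  # arbitrary binary data, for illustration
msg.add_attachment(
    payload,
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",  # hypothetical file name
)

attachment = next(msg.iter_attachments())
print(msg.get_content_type(), attachment.get_content_type())
```

Adding the attachment turns the message into a multipart/mixed container, with the text first and the binary part second.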

First, let's combine the foregoing concepts to deconstruct a typical structured error report:

  1. A structured error report consists of two or three subordinate body parts. So, we know that it's going to be a multipart content type. The particular content type is multipart/report.
  2. The first part is a textual explanation as to the problem and includes the three-digit reply code. The particular content type is text/plain.
  3. The second part looks like a small message: it has a collection of headers. These headers include precise information as to what the problem was and where it occurred. This information is carefully generated to be machine readable. The particular content type is message/delivery-status.
  4. The third part, if present, is the original message. The particular content type is message/rfc822.
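The structure above can be explored with Python's standard email package. The report text below is a made-up sample, with invented addresses, boundary string, and status fields, and it is kept far simpler than a real report. The sketch only shows how the three parts map onto the content types just listed.

```python
import email
from email import policy

# Hypothetical error report; every address and value here is invented.
RAW_REPORT = """\
From: MAILER-DAEMON@example.com
To: sender@example.com
Subject: Delivery failure
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status; boundary="BOUND"

--BOUND
Content-Type: text/plain

550 No such user: nobody@example.org

--BOUND
Content-Type: message/delivery-status

Reporting-MTA: dns; mail.example.com

Final-Recipient: rfc822; nobody@example.org
Action: failed
Status: 5.1.1

--BOUND
Content-Type: message/rfc822

From: sender@example.com
To: nobody@example.org
Subject: Hello

Original text.
--BOUND--
"""

report = email.message_from_string(RAW_REPORT, policy=policy.default)
explanation, status, original = list(report.iter_parts())

# The library exposes the machine-readable part as a list of header blocks.
per_message, per_recipient = status.get_payload()

print(explanation.get_content().strip())
print(per_recipient["Action"], per_recipient["Status"])
print(original.get_payload(0)["Subject"])
```

A program can act on the Action: and Status: fields without having to interpret the human-readable explanation in the first part.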

As a second example, consider the correct way to generate a Bcc: message:

  1. Strip the Bcc: header out of the message, but remember the addresses contained therein.
  2. Send that message to the recipient addresses in the To: and cc: fields.
  3. For each address in the Bcc: header, construct a new message of type multipart/digest. It should have one subordinate body part, message/rfc822, which contains the message that was sent in the previous step.
  4. The headers of each new message sent should be identical to the original message sent except that the Content-*: headers should be removed and replaced with a Content-Type: of multipart/digest and the To: and cc: headers should be replaced with a To: header containing the address of the Bcc: recipient.
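The four steps above can be sketched in Python with the standard email package. The function name expand_bcc is hypothetical, and the sketch only builds the messages; it does not send them. It also skips the copied MIME-Version: header, since the library adds its own to each new message.

```python
import copy
from email.message import EmailMessage
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.utils import getaddresses


def expand_bcc(original):
    """Return (visible_message, list of one wrapper message per Bcc: address)."""
    # Step 1: remember the Bcc: addresses, then strip the header.
    bcc_addresses = [addr for _, addr in getaddresses(original.get_all("Bcc", []))]
    visible = copy.deepcopy(original)
    del visible["Bcc"]

    # Step 2 is the caller's job: send `visible` to the To: and cc: recipients.

    wrappers = []
    for address in bcc_addresses:
        # Step 3: a multipart/digest with one message/rfc822 body part.
        wrapper = MIMEMultipart("digest")
        wrapper.attach(MIMEMessage(visible))

        # Step 4: copy the headers, except Content-*, To: and cc:.
        for name, value in visible.items():
            lowered = name.lower()
            if lowered.startswith("content-") or lowered in ("to", "cc", "mime-version"):
                continue
            wrapper[name] = str(value)
        wrapper["To"] = address
        wrappers.append(wrapper)
    return visible, wrappers
```

Each Bcc: recipient receives the visible message intact as an included message, so the To: and cc: fields inside it still show who the open recipients were.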

[2] Of course, there is still plenty of room for user error when sending attachments.
