The Content-Transfer-Encoding: header, if present,
indicates how the content value has been encoded. There are four general rules
that govern encodings:
· The multipart and message content types are never encoded. This
requirement makes it much easier for mail reading software to parse structured
messages.
· There is no a priori binding between a content type and the mechanism
used to encode its value. Although some content values may lend themselves
to a particular encoding, these are independent issues. For example,
one could encode plain text using the same mechanism used to encode binary
information. There is no particular efficiency gained by such an approach,
but it does have the amusing side effect of preventing people with pre-MIME
desktop software from reading your messages!
· One should view the encoding and decoding of a content value as completely
separate from processing the value. Hence, when processing an incoming message,
the value is decoded to its native form prior to being processed as a particular
content type.
· Although MIME allows for extensibility of transfer encodings, the
definition of new mechanisms is strongly discouraged. MIME provides three
standard encoding mechanisms: one useful when the value consists entirely of
printable and format characters, one useful when the value consists mostly of
such characters, and a third for content values that are primarily binary in
nature.
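The third rule, decoding first and processing second, can be sketched with Python's standard email package. This is only an illustration under stated assumptions: the sample message in raw and the variable names are hypothetical, and the package itself is not something the text above prescribes. The sample also shows the second rule at work, since plain text is carried here using the binary-oriented base64 mechanism.

```python
from email import message_from_string

# Hypothetical sample message: plain text carried with the base64 encoding.
raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "SGVsbG8sIHdvcmxkIQ==\n"
)

msg = message_from_string(raw)

# Step 1: undo the transfer encoding named in the header.
# get_payload(decode=True) returns the content value as raw bytes.
encoding = msg["Content-Transfer-Encoding"]
decoded = msg.get_payload(decode=True)

# Step 2: only now interpret the bytes according to the content type.
charset = msg.get_content_charset() or "us-ascii"
text = decoded.decode(charset)

print(encoding, msg.get_content_type(), repr(text))
```

Note that the code which interprets the text never needs to know which transfer encoding was used; that concern ends at step 1.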
The three encoding mechanisms are:
· 7-bit (written 7bit in the header), which indicates that the content value
conforms to the ASCII repertoire.
· quoted-printable, which indicates that the content value is mostly
(or entirely) from the ASCII character set. It is useful when a small percentage
of the characters have the high-order bit set, or when it is possible that
mail software somewhere down the line might transform some of the characters
present. The latter can happen, for example, when a non-Internet e-mail
system is involved.
· base64, which indicates that the content value consists of arbitrary binary
values. For every 24 bits of input, it generates a four-character sequence
taken from a special subset of the ASCII characters. This subset
was carefully chosen to have an identical representation in all currently
standardized character sets. Arguably, it is the safest transfer encoding for
this reason.
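To make the difference between the last two mechanisms concrete, here is a minimal sketch using Python's standard quopri and base64 modules. The sample values (mostly_ascii, binary) are made up for illustration, and these modules are simply one convenient implementation, not the only one.

```python
import base64
import quopri

# Mostly-ASCII text with a single high-order-bit byte (0xE9 in Latin-1).
mostly_ascii = "caf\xe9 au lait".encode("latin-1")

# quoted-printable leaves the ASCII characters readable and replaces
# only the offending byte with "=" followed by its two hex digits.
qp = quopri.encodestring(mostly_ascii)
print(qp)  # b'caf=E9 au lait'

# base64 treats the input as opaque binary: every 24 bits (3 bytes)
# become 4 characters drawn from a 64-character ASCII subset.
binary = bytes([0x00, 0xFF, 0x10])
b64 = base64.b64encode(binary)
print(b64, len(binary) * 8, "bits ->", len(b64), "characters")

# Both encodings are reversible.
assert quopri.decodestring(qp) == mostly_ascii
assert base64.b64decode(b64) == binary
```

The quoted-printable output is still legible to someone without MIME-aware software, whereas the base64 output is not; that is the trade-off behind choosing one mechanism over the other for mostly-textual content.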
One might reasonably ask at this point why arbitrary binary values
couldn't be sent directly using the Internet messaging infrastructure. The
answer is historical: Internet e-mail grew up in an ASCII world. Message envelopes
(to be discussed shortly) and headers are all ASCII.[3]
Although it may be more bandwidth-efficient to support native binary transfers,
there are other efficiencies to consider, such as software compatibility.
For example, the Internet messaging infrastructure is blissfully unconcerned
with the content values it carries, other than that they are part of
an ASCII stream.
[3] MIME provides for non-ASCII information to be encoded in headers, typically
the Subject: header. This topic is rather esoteric and won't be discussed
further. Presumably, if your mail sending software allows you to specify character
sets for various headers, it uses MIME's mechanisms for doing so.