The notion that messages carry semistructured data is central to this book. RFC934 (January 1985, Proposed Standard for Message Encapsulation) introduces the idea of a message body that is logically divided into regions separated by an “encapsulation boundary.” This idea was elaborated in a series of MIME RFCs, from RFC1341 (June 1992, MIME (Multipurpose Internet Mail Extensions)) to RFC2045 (November 1996, MIME (Multipurpose Internet Mail Extensions) Part One: Format of Internet Message Bodies).
This series spells out the basic idea of MIME: a Content-Type: header can specify that a message body contains structured text, image data, other application-specific data, or a composite of these types.
The author of RFC1049 (March 1988, A Content-Type Header Field for Internet Messages) wrote, “A standardized Content-Type field allows mail reading systems to automatically identify the type of a structured message body and to process it for display accordingly.” This idea would become central not only to mailers and newsreaders, which use the Content-Type: header to identify and process rich content and attachments, but also to browsers. RFC2046 (November 1996, Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types) extended and revised RFC1049.
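To make the role of the Content-Type: header concrete, here is a minimal sketch in Python using the standard library's email package. The sample message, the addresses, and the choose_handler function with its handler names are all hypothetical, invented for illustration; a real mail reader would dispatch on many more types.

```python
from email import message_from_string

# A hypothetical message; the addresses and body are made up for illustration.
RAW = (
    "From: alice@example.com\n"
    "Subject: demo\n"
    "Content-Type: text/html; charset=us-ascii\n"
    "\n"
    "<p>Hello</p>\n"
)


def choose_handler(msg):
    """Pick a display strategy by looking only at the Content-Type header."""
    if msg.is_multipart():
        return "composite"            # a body made of several typed parts
    maintype = msg.get_content_maintype()
    if maintype == "text":
        return "render-text"          # text/plain, text/html, text/enriched, ...
    if maintype == "image":
        return "show-image"
    return "hand-off-to-helper"       # application-specific data


msg = message_from_string(RAW)
print(msg.get_content_type())         # full type/subtype from the header
print(msg.get_content_charset())      # the charset parameter, if present
print(choose_handler(msg))
```

The point of the sketch is that the reader never has to guess at the body's format: the type, subtype, and parameters are all declared up front in one header.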
RFC2048 (November 1996, Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures) describes rules and procedures for registering new MIME content types.
A series beginning with RFC1872 (December 1995, The MIME Multipart/Related Content-Type) and ending with RFC2387 (August 1998, same title) defines how email programs can format compound documents made of interrelated parts. It suggests the use of the cid: (Content-ID) URL scheme, supported in modern HTML-aware mailreaders, as a way to form intradocument links.
RFC1873 (December 1995, Message/External-Body Content-ID Access Type), a companion to RFC1872, defines the use of a Content-ID: header as a mechanism for intradocument references.
To illustrate how this can work, suppose I drag an image into an HTML mail message I’m writing with Netscape Composer. To the recipient, it appears that the image is embedded within the text, like this:
As you can see in this picture:

+----------+
| picture  |
+----------+

the graphic is shown inline.
If you inspect the body of such a message, you’ll see how the MIME multipart/related Content-Type, the cid: protocol, and the Content-ID: header interact:
Content-Type: multipart/related; boundary="------------9F32153EFCC9C5CAFE0BDFE9"

--------------9F32153EFCC9C5CAFE0BDFE9
Content-Type: text/html; charset=us-ascii
Content-Transfer-Encoding: 7bit

<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
As you can see in this picture:
<p><img SRC="cid:[email protected]" ALT="" BORDER=0 height=62 width=150>
<p>the graphic is shown inline.

--------------9F32153EFCC9C5CAFE0BDFE9
Content-Type: image/jpeg
Content-ID: <[email protected]>
Content-Transfer-Encoding: base64
Content-Disposition: inline; filename="C:TEMP smailN4.jpeg"

/9j/4AAQSkZJRgABAgEASABIAAD/7QE0UGhvdG9zaG9wIDMuMAA4QklNA+0AAAAAABAASAAA
AAEAAQBIAAAAAQABOEJJTQPzAAAAAAAIAAAAAAAAAAA4QklNJxAAAAAAAAoAAQAAAAAAAAAC
OEJJTQP1AAAAAABIAC9mZgABAGxmZgAGAAAAAAABAC9mZgABAKGZmgAGAAAAAAABADIAAAAB
AFoAAAAGAAAAAAABADUAAAABAC0AAAAGAAAAAAABOEJJTQP4AAAAAABwAAD/////////////
////////////////A+gAAAAA/////////////////////////////wPoAAAAAP//////////
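The way a mail reader might resolve such a cid: reference can be sketched in Python with the standard email package. This is an illustration under stated assumptions, not how any particular mailer does it: the boundary string XYZ, the address part1.demo@example.com, and the helper names index_by_content_id and cid_references are all hypothetical, and the image data is cut down to a few bytes.

```python
import re
from email import message_from_string
from urllib.parse import unquote

# A cut-down, hypothetical multipart/related message (not the one shown above).
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/related; boundary="XYZ"

--XYZ
Content-Type: text/html; charset=us-ascii

<html><p><img SRC="cid:part1.demo@example.com" ALT=""></html>
--XYZ
Content-Type: image/jpeg
Content-ID: <part1.demo@example.com>
Content-Transfer-Encoding: base64

/9j/4AAQSkZJRgABAgEASABIAAD/
--XYZ--
"""


def index_by_content_id(msg):
    """Map each part's Content-ID (angle brackets removed) to the part."""
    parts = {}
    for part in msg.walk():
        content_id = part.get("Content-ID")
        if content_id:
            parts[content_id.strip().strip("<>")] = part
    return parts


def cid_references(html):
    """Return the identifiers named by cid: URLs in the HTML text."""
    pattern = r"""cid:([^\s"'<>]+)"""
    return [unquote(ref) for ref in re.findall(pattern, html, flags=re.IGNORECASE)]


msg = message_from_string(RAW)
parts = index_by_content_id(msg)
html = next(p for p in msg.walk() if p.get_content_type() == "text/html").get_payload()

for ref in cid_references(html):
    target = parts[ref]
    data = target.get_payload(decode=True)   # undo the base64 transfer encoding
    print(ref, "->", target.get_content_type(), len(data), "bytes")
```

The lookup works because the identifier in the cid: URL and the value of the Content-ID: header are the same string once the angle brackets are removed.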
Another series, from RFC1523 (September 1993, The text/enriched MIME Content-Type) to RFC1896 (February 1996, same title), documents a predecessor to HTML email. It defines a simple, HTML-like tag language used to format ASCII text messages:
<bold>Now</bold> is the time for <italic>all</italic> good men
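To show how close text/enriched is to HTML, here is a rough Python sketch that translates a few formatting commands into HTML tags. The function name enriched_to_html and the TAG_MAP table are hypothetical, and the sketch is deliberately incomplete: it handles only four commands plus the << escape for a literal less-than sign, silently drops any other command, and ignores the format's line-break rules.

```python
import re
from html import escape

# Hypothetical mapping from a few text/enriched commands to HTML tags.
TAG_MAP = {"bold": "b", "italic": "i", "underline": "u", "fixed": "tt"}

# Either the "<<" escape for a literal "<", or a <command> / </command> token.
TOKEN = re.compile(r"<<|<(/?)([A-Za-z0-9-]+)>")


def enriched_to_html(text):
    """Translate a small subset of text/enriched markup to HTML."""
    out = []
    pos = 0
    for match in TOKEN.finditer(text):
        out.append(escape(text[pos:match.start()]))   # plain text between tokens
        if match.group(0) == "<<":
            out.append("&lt;")
        else:
            html_tag = TAG_MAP.get(match.group(2).lower())
            if html_tag:                               # unknown commands are dropped
                out.append("<" + match.group(1) + html_tag + ">")
        pos = match.end()
    out.append(escape(text[pos:]))
    return "".join(out)


print(enriched_to_html(
    "<bold>Now</bold> is the time for <italic>all</italic> good men"
))
```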
The mechanism supporting HTML email is described in RFC2110 (March 1997, MIME E-Mail Encapsulation of Aggregate Documents), which was superseded by RFC2557 (March 1999, MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)). Say the authors of RFC2557:
In order to transfer a complete HTML multimedia document in a single email message, it is necessary to: a) aggregate a text/html root resource and all of the subsidiary resources it references into a single composite message structure, and b) define a means by which URIs in the text/html root can reference subsidiary resources within that composite message structure.
HTML email messages need to be able to refer, by means of hyperlinks, to messages and to parts of messages. The mid: and cid: URL schemes defined in RFC2392 (August 1998, Content-ID and Message-ID Uniform Resource Locators) serve this purpose.
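As a rough illustration of how these URLs are formed from header values, the following Python sketch strips the angle brackets from a Content-ID: or Message-ID: value and percent-encodes what remains. The function names cid_url and mid_url and the sample identifiers are hypothetical, and using urllib's quote is an assumption that only approximates the encoding rules in RFC2392.

```python
from urllib.parse import quote


def _addr(header_value):
    """Strip whitespace and the enclosing angle brackets from an ID header."""
    return header_value.strip().strip("<>")


def cid_url(content_id):
    """Form a cid: URL that points at a body part in the same message."""
    return "cid:" + quote(_addr(content_id), safe="@")


def mid_url(message_id, content_id=None):
    """Form a mid: URL for a message, or for one body part within it."""
    url = "mid:" + quote(_addr(message_id), safe="@")
    if content_id is not None:
        url += "/" + quote(_addr(content_id), safe="@")
    return url


# Hypothetical identifiers, for illustration only.
print(cid_url("<part1.demo@example.com>"))
print(mid_url("<msg42@example.com>"))
print(mid_url("<msg42@example.com>", "<part1.demo@example.com>"))
```

A cid: URL only makes sense inside the message that contains the referenced part, while a mid: URL names a message, and optionally one of its parts, from anywhere.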