Format of Email Messages: RFC 822 and MIME

What format are we talking about?

Emails are exchanged between hosts on the internet by using SMTP. While transferring the email message, the sender SMTP specifies the address of the sender, the receiver and the data to be sent. The data is arbitrary ASCII text. SMTP does not interpret the data in any way or specify a format for it.

During the early days of email, a need was felt for a header in the data that would carry information such as the subject and the date on which the mail was sent. Several informal conventions were developed by individuals, which led to incompatibilities. It was felt necessary to codify these practices and to provide for features that seemed imminent. The result was RFC 733, which was later updated by RFC 822, "Standard for the Format of ARPA Internet Text Messages". RFC 822 specifies a standard set of message headers, which are followed by the message content.

The problem with RFC 822 is that it allows message content consisting only of ASCII text. MIME (Multipurpose Internet Mail Extensions) overcomes this limitation and allows messages containing character sets other than ASCII, non-textual data (attachments), multi-part messages and so on.

RFC 822: Standard for the Format of ARPA Internet Text Messages

A message consists of header fields and, optionally, a body. The body is simply a sequence of lines containing ASCII characters. It is separated from the headers by a null line, that is, a line with nothing on it. Each header field can be viewed as a single, logical line of ASCII characters, comprising a field-name followed by a colon (":"), followed by a field-body. Depending on the field-name, the field-body may be structured or unstructured.
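
The layout described above can be sketched in a few lines of Python. This is only an illustration: the message text and the addresses in it are made up, and the manual split ignores details such as header fields folded over several lines, which the standard library parser `email.message_from_string` does handle.

```python
from email import message_from_string

# A tiny hypothetical message: two header fields, a null line, then the body.
raw = (
    "From: alice@example.org\n"
    "Subject: Lunch\n"
    "\n"
    "Shall we meet at noon?\n"
)

# Manual split: everything before the first empty line is the header block.
header_block, _, body = raw.partition("\n\n")

# Each header line is "field-name: field-body".
# (Simplification: folded, multi-line header fields are not handled here.)
fields = {}
for line in header_block.split("\n"):
    name, _, value = line.partition(":")
    fields[name.strip()] = value.strip()

# The standard library parser does the same job more robustly.
msg = message_from_string(raw)

print(fields)
print(msg["Subject"])
print(msg.get_payload())
```

For real mail the library parser is the safer choice; the manual version is shown only to make the header/null line/body structure visible.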

Not all header fields received by the receiver may have been specified by the sender. There is a distinction between "message" and "envelope" headers. Briefly, the "envelope" headers are generated by the machines that receive a message, rather than by the sender, and are added at the beginning of the mail data. The SMTP relay servers that handle the message on the way to the final recipient also insert header fields into the message header; for example, each relay adds a Received field recording from which host and at what time it accepted the message.

In addition to the predefined header fields, users are allowed to define and use their own header fields. These fields may be used for transferring application-specific information. Such fields must have names that are not already used in the current specification. The names of user-defined fields usually begin with "X-" because it is guaranteed that predefined fields will never have names beginning with this string.

Some of the header fields specified by the standard are listed below:
(From the article Reading Email Headers, by Nathan Tenny of Qualcomm (the publishers of Eudora email software))

A sample email header from a mail I sent to myself is given below:

From malhotra_g@vsnl.net Fri Nov 7 19:17:53 2003
Return-Path: <malhotra_g@vsnl.net>
Delivered-To: gaurav@deimos.csa.iisc.ernet.in
Received: from csa.iisc.ernet.in (csa.iisc.ernet.in [144.16.67.8])
      by deimos.csa.iisc.ernet.in (Postfix) with ESMTP id 8EF794E662
      for <gaurav@deimos.csa.iisc.ernet.in> ; Fri, 7 Nov 2003 19:17:53 +0530 (IST)
Received: by csa.iisc.ernet.in (Postfix)
      id 30AA52BDF9; Fri, 7 Nov 2003 18:57:32 +0530 (IST)
Delivered-To: gaurav@csa.iisc.ernet.in
Received: from smtp2.vsnl.net (smtp2.vsnl.net [203.200.235.232])
      by csa.iisc.ernet.in (Postfix) with ESMTP id D45FA2BC9F
      for <gaurav@csa.iisc.ernet.in> ; Fri, 7 Nov 2003 18:57:31 +0530 (IST)
Received: from vsnl.net ([127.0.0.1])
      by smtp2.vsnl.net (iPlanet Messaging Server 5.2 HotFix 1.16 (built May 14
      2003)) with ESMTP id <0HNZ00HE9I8UA7@smtp2.vsnl.net> for
      gaurav@csa.iisc.ernet.in; Fri, 07 Nov 2003 19:16:06 +0530 (IST)
Received: from ([172.16.28.141])
      by smtp2.vsnl.net (InterScan E-Mail VirusWall Unix); Fri,
      07 Nov 2003 19:16:06 +0530 (IST)
Received: from [172.16.28.182] by pop2.vsnl.net (mshttpd); Fri,
      07 Nov 2003 18:46:06 +0500
Date: Fri, 07 Nov 2003 18:46:06 +0500
From: malhotra_g@vsnl.net
Subject: Test Mail
To: gaurav@csa.iisc.ernet.in
Message-id: <4d318d4d0507.4d05074d318d@vsnl.net>
X-Mailer: iPlanet Messenger Express 5.2 HotFix 1.16 (built May 14 2003)
Content-type: text/plain; charset=us-ascii
Content-language: en
Content-transfer-encoding: 7BIT
Content-disposition: inline
X-Accept-Language: en
Priority: normal
Status: RO
X-Status:
X-Keywords:
X-UID: 218
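
As a rough sketch of how such a header can be examined, the Python below parses a shortened copy of the sample (only two of the Received fields and a few others are kept, and the body is a placeholder) and lists the hosts that accepted the message. The regular expression that picks out the host after "by" is an assumption that fits these Postfix-style lines; Received fields written by other software may need different handling.

```python
import re
from email import message_from_string

# Shortened copy of the sample header above; the body is a placeholder.
raw = (
    "Received: from csa.iisc.ernet.in (csa.iisc.ernet.in [144.16.67.8])\n"
    "      by deimos.csa.iisc.ernet.in (Postfix) with ESMTP id 8EF794E662\n"
    "      for <gaurav@deimos.csa.iisc.ernet.in> ; Fri, 7 Nov 2003 19:17:53 +0530 (IST)\n"
    "Received: from smtp2.vsnl.net (smtp2.vsnl.net [203.200.235.232])\n"
    "      by csa.iisc.ernet.in (Postfix) with ESMTP id D45FA2BC9F\n"
    "      for <gaurav@csa.iisc.ernet.in> ; Fri, 7 Nov 2003 18:57:31 +0530 (IST)\n"
    "Date: Fri, 07 Nov 2003 18:46:06 +0500\n"
    "From: malhotra_g@vsnl.net\n"
    "Subject: Test Mail\n"
    "X-Mailer: iPlanet Messenger Express 5.2 HotFix 1.16 (built May 14 2003)\n"
    "\n"
    "(body omitted)\n"
)

msg = message_from_string(raw)

# Every relay adds its Received field at the top, so the list is newest first.
# Reversing it gives the path in the order the message travelled.
hops = []
for value in reversed(msg.get_all("Received")):
    flat = " ".join(value.split())          # undo the folding over several lines
    match = re.search(r"\bby (\S+)", flat)  # host that accepted the message
    if match:
        hops.append(match.group(1))

# User-defined fields are recognisable by their "X-" prefix.
x_fields = [name for name in msg.keys() if name.lower().startswith("x-")]

print(hops)
print(x_fields)
```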

MIME - Multipurpose Internet Mail Extensions

MIME extends the format of Internet mail, as specified by RFC 822, to allow non-US-ASCII textual messages, non-textual messages, multipart message bodies, and non-US-ASCII information in message headers. Specifically, MIME messages can contain text, images, audio, video, or other application-specific data.

To allow mail readers to recognise email messages that use MIME, some new MIME-specific header fields were defined. These allow email applications to distinguish between MIME and non-MIME messages so that each can be processed appropriately.

MIME defines the following new header fields:
(from MIME Overview, by Mark Grand)

  1. The MIME-Version header field, which uses a version number to declare that a message conforms to the MIME standard.
  2. The Content-Type header field, which specifies the type and subtype of the data in the body of a message and fully specifies the encoding of such data. The following Content-Type values are defined:
    1. The Content-Type value Text, which can be used to represent textual information in a number of character sets and formatted text description languages in a standardized manner.
    2. The Content-Type value Multipart, which can be used to combine several body parts, possibly of differing types of data, into a single message.
    3. The Content-Type value Application, which can be used to transmit application data or binary data.
    4. The Content-Type value Message, for encapsulating a mail message.
    5. The Content-Type value Image, for transmitting still image (picture) data.
    6. The Content-Type value Audio, for transmitting audio or voice data.
    7. The Content-Type value Video, for transmitting video or moving image data, possibly with audio as part of the composite video data format.
  3. The Content-Transfer-Encoding header field, which specifies how the data is encoded to allow it to pass through mail transports having data or character set limitations. Since SMTP was designed to carry US-ASCII text messages, binary data such as audio, video and images has to be suitably encoded before it can be transferred. The standard values for the Content-Transfer-Encoding field are 7bit, 8bit, binary, quoted-printable and base64.
  4. Two header fields that can be used to further identify and describe the data in a message body: the Content-ID and Content-Description header fields.
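
To make the idea of a transfer encoding concrete, here is a small Python sketch using the standard `base64` and `quopri` modules. The input bytes are chosen only for illustration: the first six bytes of a JPEG (JFIF) file, and the word "café" in Latin-1, whose last byte lies outside US-ASCII.

```python
import base64
import quopri

# First six bytes of a JPEG (JFIF) file: clearly not printable ASCII.
jpeg_start = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])

# base64 maps every 3 bytes to 4 characters from a 64-character ASCII alphabet.
encoded = base64.b64encode(jpeg_start)
print(encoded)                      # b'/9j/4AAQ'
print(base64.b64decode(encoded) == jpeg_start)

# quoted-printable leaves ASCII text readable and escapes other bytes as =XX.
latin1_text = "café".encode("latin-1")
qp = quopri.encodestring(latin1_text)
print(qp)                           # b'caf=E9'
print(quopri.decodestring(qp) == latin1_text)
```

base64 suits data that is mostly binary, while quoted-printable suits text that is mostly ASCII with the occasional non-ASCII character, since the result stays largely human-readable.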

MIME allows messages to contain multiple objects. When multiple objects are in a MIME message, each is represented in a form called a body part. A body part has a header and a body, so it makes sense to speak about the body of a body part. Body parts can also be nested: the body of a body part may itself contain one or more body parts. The Content-Type value Multipart is used to encapsulate multiple body parts in a single body. The interested reader can refer to the MIME RFCs or the MIME Overview by Mark Grand for technical details of exactly how MIME works.
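
A multipart message of this kind can be put together with Python's `email.message.EmailMessage` class, as in the sketch below. The addresses, the attachment bytes and the file name `data.bin` are hypothetical values chosen for illustration.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.org"      # hypothetical addresses
msg["To"] = "bob@example.org"
msg["Subject"] = "Testing"

# A single text body: at this point the message is plain text/plain.
msg.set_content("This is a test.")

# Adding an attachment converts the message to multipart/mixed; the text
# and the attachment become two body parts, each with its own header.
msg.add_attachment(
    b"\x00\x01\x02",                   # hypothetical binary payload
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",
)

parts = list(msg.iter_parts())
print(msg["MIME-Version"], msg.get_content_type())
for part in parts:
    print(part.get_content_type(), part["Content-Transfer-Encoding"])

# str(msg) would show the full wire format, including the boundary lines.
```

The library chooses the boundary string and the transfer encoding of each part itself; the binary attachment is sent as base64.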

A sample Multipart MIME Email message is given below (some lines have been deleted for clarity):

From gaurav@csa.iisc.ernet.in Fri Nov 14 16:04:00 2003
Return-Path: <gaurav@csa.iisc.ernet.in>
Delivered-To: gaurav@csa.iisc.ernet.in
Received: from deimos.csa.iisc.ernet.in (deimos.csa.iisc.ernet.in [144.16.67.57])
     by csa.iisc.ernet.in (Postfix) with ESMTP id 7F9B12B992
     for <gaurav@csa.iisc.ernet.in>; Fri, 14 Nov 2003 15:45:05 +0530 (IST)
Received: by deimos.csa.iisc.ernet.in (Postfix, from userid 9408)
     id EA8F24E659; Fri, 14 Nov 2003 16:03:59 +0530 (IST)
Received: from localhost (localhost [127.0.0.1])
     by deimos.csa.iisc.ernet.in (Postfix) with ESMTP id D57994A27C
     for <gaurav@csa.iisc.ernet.in>; Fri, 14 Nov 2003 16:03:59 +0530 (IST)
Date: Fri, 14 Nov 2003 16:03:59 +0530 (IST)
From: Gaurav Malhotra <gaurav@csa.iisc.ernet.in>
To: Gaurav Malhotra <gaurav@csa.iisc.ernet.in>
Subject: Testing
Message-ID: <Pine.LNX.4.58.0311141602430.11137@deimos.csa.iisc.ernet.in>
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="277887299-1503234992-1068806039=:11137"
Status: O
X-Status:
X-Keywords:
X-UID: 226

   This message is in MIME format. The first part should be readable text,
   while the remaining parts are likely unreadable without MIME-aware tools.
   Send mail to mime@docserver.cac.washington.edu for more info.

--277887299-1503234992-1068806039=:11137
Content-Type: TEXT/PLAIN; charset=US-ASCII

This is a test.
--277887299-1503234992-1068806039=:11137
Content-Type: IMAGE/jpeg; name="911c4s1_10x7.jpg"
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.LNX.4.58.0311141603590.11137@deimos.csa.iisc.ernet.in>
Content-Description:
Content-Disposition: attachment; filename="911c4s1_10x7.jpg"

/9j/4AAQSkZJRgABAgAAZABkAAD/7AARRHVja3kAAQAEAAAAPAAA/+4ADkFk
b2JlAGTAAAAAAf/bAIQABgQEBAUEBgUFBgkGBQYJCwgGBggLDAoKCwoKDBAM
DAwMDAwQDA4PEA8ODBMTFBQTExwbGxscHx8fHx8fHx8fHwEHBwcNDA0YEBAY
GhURFRofHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8f
Hx8fHx8fHx8f/8AAEQgDAAQAAwERAAIRAQMRAf/EALEAAQADAAMBAQEAAAAA
*******some lines have been deleted for clarity******
SBPHq9oFkl6gJoASCLUQVKQE0AkCaAKASBPUBNUBJQIJKJQBgK8SCavpAmvD
vAmoCrAmvUA1ANQB9PAoVIFeJR//2Q==

--277887299-1503234992-1068806039=:11137
Content-Type: TEXT/plain; charset=US-ASCII; name="proxies.txt"
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.LNX.4.58.0311141603591.11137@deimos.csa.iisc.ernet.in>
Content-Description:
Content-Disposition: attachment; filename="proxies.txt"

ZWNlIDogIDE0NC4xNi42NC40IDogICAgMzEyOA0Kc2VyYzogIDE0NC4xNi43
OS41ODoJMzEyOCAgIG9ubHkgYWZ0ZXIgNiBpbiB0aGUgZXZlbmluZyB0byA5
IGluIG1vcm5pbmcNCmNzYSA6ICAxNDQuMTYuNjcuOCA6CTgwODANCg==

--277887299-1503234992-1068806039=:11137--

In the example given above, the main message body is defined as Multipart/Mixed. It consists of three body parts. The first body part contains plain text. The second contains a JPEG image file encoded using BASE64, and the third contains plain text encoded as BASE64.
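
The following Python sketch shows how a MIME-aware program might take such a message apart. It uses a reduced copy of the sample: most header fields are dropped and the JPEG part is left out because its data is incomplete above, so only the text part and the proxies.txt attachment remain.

```python
from email import message_from_string

BOUNDARY = "277887299-1503234992-1068806039=:11137"

# Reduced copy of the sample message (JPEG part and most headers omitted).
raw = (
    "MIME-Version: 1.0\n"
    f'Content-Type: MULTIPART/MIXED; BOUNDARY="{BOUNDARY}"\n'
    "\n"
    "   This message is in MIME format.\n"
    "\n"
    f"--{BOUNDARY}\n"
    "Content-Type: TEXT/PLAIN; charset=US-ASCII\n"
    "\n"
    "This is a test.\n"
    f"--{BOUNDARY}\n"
    'Content-Type: TEXT/plain; charset=US-ASCII; name="proxies.txt"\n'
    "Content-Transfer-Encoding: BASE64\n"
    'Content-Disposition: attachment; filename="proxies.txt"\n'
    "\n"
    "ZWNlIDogIDE0NC4xNi42NC40IDogICAgMzEyOA0Kc2VyYzogIDE0NC4xNi43\n"
    "OS41ODoJMzEyOCAgIG9ubHkgYWZ0ZXIgNiBpbiB0aGUgZXZlbmluZyB0byA5\n"
    "IGluIG1vcm5pbmcNCmNzYSA6ICAxNDQuMTYuNjcuOCA6CTgwODANCg==\n"
    "\n"
    f"--{BOUNDARY}--\n"
)

msg = message_from_string(raw)

# walk() visits the container first and then each body part in order.
leaves = []
for part in msg.walk():
    if part.is_multipart():
        continue                            # skip the multipart container
    data = part.get_payload(decode=True)    # undoes the transfer encoding
    leaves.append((part.get_content_type(), part.get_filename(), data))

for content_type, filename, data in leaves:
    print(content_type, filename, data)
```

The text that precedes the first boundary line (the "This message is in MIME format" note) is not a body part; it is kept separately as the preamble and is only meant for readers without MIME-aware tools.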
