HYPERTEXT TRANSFER PROTOCOL

HTTP is an application-level protocol for distributed, collaborative hypermedia information systems. It is a generic, stateless request/reply (RR) protocol that runs over TCP, and it is the protocol most frequently used by Web browsers. The standard procedure for making an HTTP request is:

1. Establish a TCP connection to the host and port given in the URL (the WWW server is expected to be listening there). If no host is given in the URL, connect to the local host. If no port number is given in the URL, connect to port 80 (the default port).

2. Send the HTTP request.

3. Receive the requested document from the WWW server.

4. Close the TCP connection (no explicit closing of the HTTP connection is necessary).
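
The four steps above can be sketched in Python with a plain TCP socket. This is a minimal illustration, not a full client: the function names resolve_target and fetch are made up for this example, the request is sent as HTTP/1.0 so that the server closes the connection after the response, and no error handling or redirect support is attempted.

```python
import socket
from urllib.parse import urlsplit

DEFAULT_PORT = 80  # default HTTP port named in step 1


def resolve_target(url):
    """Return (host, port, path) for an http URL, applying the defaults from step 1."""
    parts = urlsplit(url)
    host = parts.hostname or "localhost"  # no host given: use the local host
    port = parts.port or DEFAULT_PORT     # no port given: use port 80
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return host, port, path


def fetch(url, timeout=10.0):
    """Perform the four steps and return the raw response bytes."""
    host, port, path = resolve_target(url)
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    # Step 1: establish the TCP connection.
    with socket.create_connection((host, port), timeout=timeout) as conn:
        # Step 2: send the HTTP request.
        conn.sendall(request)
        # Step 3: receive the document until the server closes its side.
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Step 4: leaving the "with" block closes the TCP connection.
    return b"".join(chunks)
```

Calling fetch("http://example.com/") would return the status line, headers, and body as one byte string; resolve_target can be checked on its own without any network access.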

OPERATION

HTTP is a request/response protocol. Over a connection with a server, a client sends a request consisting of a request method, URI, and protocol version, followed by a MIME-like message containing request modifiers, client information, and possibly body content. The server responds with a status line, including the message's protocol version and a success or error code, followed by a MIME-like message containing server information, entity metainformation, and possibly entity-body content.
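
As a rough illustration of these two message shapes, the sketch below builds a request (request line, header fields, blank line, optional body) and splits a status line into its protocol version, status code, and reason phrase. The helper names build_request and parse_status_line are hypothetical, and the code assumes ASCII header text.

```python
def build_request(method, uri, headers, body=b"", version="HTTP/1.1"):
    """Assemble a request: request line, header fields, blank line, optional body."""
    lines = [f"{method} {uri} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_status_line(line):
    """Split a status line into (protocol version, status code, reason phrase)."""
    parts = line.rstrip("\r\n").split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason
```

For example, parse_status_line("HTTP/1.1 404 Not Found") yields the version "HTTP/1.1", the error code 404, and the reason phrase "Not Found".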

Most HTTP communication is initiated by a user agent and consists of a request to be applied to a resource on some origin server. In the simplest case, this may be accomplished via a single connection (v) between the user agent (UA) and the origin server (O).

          request chain ------------------------>
       UA -------------------v------------------- O
          <----------------------- response chain

A more complicated situation occurs when one or more intermediaries are present in the request/response chain. There are three common forms of intermediary: proxy, gateway, and tunnel. A proxy is a forwarding agent, receiving requests for a URI in its absolute form, rewriting all or part of the message, and forwarding the reformatted request toward the server identified by the URI.

A gateway is a receiving agent, acting as a layer above some other server(s) and, if necessary, translating the requests to the underlying server's protocol. A tunnel acts as a relay point between two connections without changing the messages; tunnels are used when the communication needs to pass through an intermediary (such as a firewall) even when the intermediary cannot understand the contents of the messages.

          request chain -------------------------------------->
       UA -----v----- A -----v----- B -----v----- C -----v----- O
          <------------------------------------- response chain


A, B, and C are three intermediaries between the user agent and origin server. Some HTTP communication options may apply only to the connection with the nearest, non-tunnel neighbor, only to the end-points of the chain, or to all connections along the chain. Although the diagram is linear, each participant may be engaged in multiple, simultaneous communications. For example, B may be receiving requests from many clients other than A, and/or forwarding requests to servers other than C, at the same time that it is handling A's request.

HTTP communication usually takes place over TCP/IP connections. The default port is TCP 80, but other ports can be used. This does not preclude HTTP from being implemented on top of any other protocol on the Internet, or on other networks. HTTP only presumes a reliable transport; any protocol that provides such guarantees can be used.

In HTTP/1.0, most implementations used a new connection for each request/response exchange. In HTTP/1.1, a connection may be used for one or more request/response exchanges, although connections may be closed for a variety of reasons.

As far as HTTP is concerned, Uniform Resource Identifiers are simply formatted strings which identify--via name, location, or any other characteristic--a resource. The HTTP protocol does not place any a priori limit on the length of a URI.

HTTP MESSAGE

Header:- HTTP header fields include general-header, request-header, response-header, and entity-header fields. Each header field consists of a name followed by a colon (":") and the field value. Field names are case-insensitive. The order in which header fields with differing field names are received is not significant.
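
The header rules above can be sketched as a small parser. The function name parse_headers is hypothetical. Field names are lower-cased so that lookups are case-insensitive, and storing the fields in a dictionary reflects that the order of differently named fields does not matter. Joining repeated fields with a comma is an extra assumption here; RFC 2616 allows it only for fields defined as comma-separated lists.

```python
def parse_headers(block):
    """Parse "Name: value" lines (CRLF-separated) into a dict keyed by lower-cased name."""
    headers = {}
    for line in block.split("\r\n"):
        if not line:
            break  # an empty line ends the header section
        name, _, value = line.partition(":")
        key = name.strip().lower()  # field names are case-insensitive
        value = value.strip()
        if key in headers:
            # Assumption: repeated fields are list-valued and may be combined.
            headers[key] += ", " + value
        else:
            headers[key] = value
    return headers
```

With this sketch, "Content-Type: text/html" and "CONTENT-TYPE: text/html" both end up under the key "content-type".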

The message-body (if any) of an HTTP message is used to carry the entity-body associated with the request or response.

CONNECTION

Persistent Connections:- Persistent connections provide a mechanism by which a client and a server can signal the close of a TCP connection; this signaling uses the Connection header field. Unless otherwise indicated, the client SHOULD assume that the server will maintain a persistent connection, even after error responses from the server. An HTTP/1.1 server MAY assume that an HTTP/1.1 client intends to maintain a persistent connection unless a Connection header including the connection-token "close" was sent in the request.
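
A server-side check for this rule might look like the sketch below. The function name is_persistent is made up for illustration, the headers are assumed to be a dictionary with lower-cased field names, and HTTP/1.0 requests are simply treated as non-persistent, which is a simplification.

```python
def is_persistent(version, headers):
    """Decide whether to keep the connection open after answering a request."""
    raw = headers.get("connection", "")
    # The Connection header carries a comma-separated list of tokens.
    tokens = {t.strip().lower() for t in raw.split(",") if t.strip()}
    if "close" in tokens:
        return False  # the sender asked for the connection to be closed
    # HTTP/1.1 defaults to persistent; older versions are assumed not to.
    return version == "HTTP/1.1"
```

Under these assumptions, an HTTP/1.1 request without a Connection header keeps the connection open, while one carrying "Connection: close" does not.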

Pipelining:- A client that supports persistent connections MAY "pipeline" its requests (i.e., send multiple requests without waiting for each response). A server MUST send its responses to those requests in the same order that the requests were received.
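
The ordering requirement can be illustrated with a small simulation that involves no real network traffic. In this sketch each request is a hypothetical (path, delay) pair, the requests are processed concurrently, and the responses are still collected in the order the requests arrived. The names handle and serve_pipelined are assumptions made for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle(request):
    """Pretend to process one request; the delay stands in for server work."""
    path, delay = request
    time.sleep(delay)
    return f"response for {path}"


def serve_pipelined(requests, workers=4):
    """Process requests concurrently but return responses in arrival order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Futures are stored in the same order as the incoming requests.
        futures = [pool.submit(handle, r) for r in requests]
        # Waiting on them in that order keeps the response order fixed,
        # even when a later request finishes first.
        return [f.result() for f in futures]
```

Here a slow first request delays the delivery of a fast second one, which is the price of keeping responses in request order.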

Proxy Servers:- The proxy server MUST signal persistent connections separately with its clients and the origin servers (or other proxy servers) that it connects to.

For more details, refer to RFC 2616.