The Icon Bar: HTTP: an introduction

HTTP (the HyperText Transfer Protocol) is designed for two-way communication between a web server and a client, usually a web browser.

It caters for downloading files, submitting data, uploading files, deleting files, and retrieving information about both individual files and the server. It also allows people to read versions in their own language, clients to retrieve filetypes suitable for their browsers, multiple web sites to run on one computer, and a request to be 'bounced' through any number of proxy servers. A client can offer to accept a cached version of a file or demand a new version of it, and the whole exchange can even run over a secured connection.

Do not even consider basing any implementation of an HTTP client or server on this description - it is liable to be inaccurate or misleading, and is only intended as a basic overview. Base any implementation on the official HTTP specification, which is published as an IETF RFC and is also hosted by the W3C. If you wish to write a client on RISC OS, it is much better to use the Acorn URL modules, which include a transparent interface for fetching URIs of any supported protocol, rather than reimplementing a client at socket level.

How it works

At a basic level it works with one client and one server: the client generates a request and receives a response in return. Requests for information are always generated by the client; it is thus a request/response protocol, rather than a broadcast protocol where the server initiates the transaction.

The protocol works by a client initiating a TCP/IP connection (via a standard socket) with a port (usually 80) on the server and then sending a request header, optionally accompanied by a request body. The server will respond with a response header, again optionally accompanied by a response body. With a Keep-Alive or persistent connection, the connection may be kept open and another request sent by the client; otherwise the connection will be closed.
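As a rough illustration of this exchange, the sketch below opens a TCP connection, sends one request, and reads the response until the server closes the connection. It is not a real client or server: the names serve_once, fetch and demo are made up for this example, the "server" is a throwaway thread on the local machine listening on a free port instead of port 80, and it always sends the same fixed reply. The request format it sends is described in the sections that follow.

```python
import socket
import threading

def serve_once(listener):
    """Hypothetical stand-in server: accept one connection, read one
    request header, send a fixed response, then close the connection."""
    conn, _addr = listener.accept()
    with conn:
        received = b""
        # The request header ends with a blank line (CR LF CR LF).
        while b"\r\n\r\n" not in received:
            chunk = conn.recv(1024)
            if not chunk:
                break
            received += chunk
        body = b"hello"
        head = (
            "HTTP/1.1 200 OK\r\n"
            "Content-Length: %d\r\n"
            "Connection: close\r\n"
            "\r\n" % len(body)
        ).encode("ascii")
        conn.sendall(head + body)

def fetch(host, port, request):
    """Open a TCP connection, send the request, read until the server closes."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        response = b""
        while True:
            chunk = sock.recv(1024)
            if not chunk:  # empty read: the server closed the connection
                break
            response += chunk
    return response

def demo():
    """Run one request/response exchange against the local stand-in server."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    try:
        return fetch("127.0.0.1", port, b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
    finally:
        worker.join()
        listener.close()
```

Calling demo() returns the raw bytes of the response: the status line and headers, a blank line, then the body. This sketch closes the connection after a single response, which is the non-persistent case described above.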

Request Headers

The request header begins with the method (which could be GET, POST, PUT, OPTIONS, HEAD, DELETE, TRACE, CONNECT or an extension method), followed by a space, the URI, another space, the HTTP version and a CR LF, e.g.

GET / HTTP/1.1

This is followed by any number of request header lines, each of which takes the form of the header name, followed by a colon, a space and the value, terminated by a CR LF, e.g.

Host: www.iconbar.com

Coupled with the request line above, this would request the index page from www.iconbar.com. When the headers are finished, the client sends a blank line, again terminated by a CR LF, to indicate that it has finished.
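The layout described above can be sketched as a small function that joins the first line, the header lines and the closing blank line. The function name build_request_head is invented for this illustration, and it does no validation of the method, URI or header names.

```python
def build_request_head(method, uri, headers, version="HTTP/1.1"):
    """Assemble the first line, the header lines and the terminating blank line.

    `headers` is a list of (name, value) pairs, kept in the order given.
    """
    lines = ["%s %s %s" % (method, uri, version)]
    for name, value in headers:
        lines.append("%s: %s" % (name, value))
    # Joining with CR LF separates the lines; the trailing CR LF CR LF
    # ends the last line and adds the blank line that closes the header.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

request = build_request_head("GET", "/", [("Host", "www.iconbar.com")])
```

Here request holds the bytes of the two example lines shown above, followed by the blank line.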

The server will now process the headers and act accordingly - if the request was a PUT or a POST request, it will first read the amount of data specified in the Content-Length header, otherwise it will move on to the response headers.

Request Body

After the header has been sent for a POST or a PUT request, the client will send the appropriate data, the length of which has already been specified with a Content-Length header.
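A minimal sketch of a request that carries a body, assuming the body is already available as bytes. The function name build_request_with_body, the host www.example.com and the path /notes.txt are placeholders for illustration only.

```python
def build_request_with_body(method, uri, host, body):
    """Build a PUT or POST request whose Content-Length matches the body."""
    head = (
        "%s %s HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Content-Length: %d\r\n"   # length of the body in bytes
        "\r\n" % (method, uri, host, len(body))
    ).encode("ascii")
    # The body follows immediately after the blank line.
    return head + body

request = build_request_with_body("PUT", "/notes.txt", "www.example.com", b"hello")
```

The length is counted in bytes, not characters, so any text should be encoded before it is measured.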

Response Headers

The response header starts by specifying the HTTP protocol version, followed by a space, a status code, another space and a textual description of the status code, terminated by a CR LF, e.g.

HTTP/1.1 200 OK

The status codes are grouped as follows:

1xx: Informational - Request received, continuing process
2xx: Success - The action was successfully received, understood, and accepted
3xx: Redirection - Further action must be taken in order to complete the request
4xx: Client Error - The request contains bad syntax or cannot be fulfilled
5xx: Server Error - The server failed to fulfill an apparently valid request

This is then followed by any number of response header lines, dependent on the code. These follow the same format as request header lines.

The server should then supply the blank line terminated by CR LF to indicate the end of the header.

If the request is successful then the server will usually also supply a response body.
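To make the response header layout concrete, here is a simplified parsing sketch. The function name parse_response_head is made up for this example, and it makes assumptions a real client could not: the whole header is already in memory, header names are folded to lower case, and a repeated header simply overwrites the earlier one.

```python
def parse_response_head(raw):
    """Split raw response bytes into (version, status, reason, headers, rest).

    `rest` is whatever followed the blank line, i.e. the start of the body.
    """
    head, _sep, rest = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # First line: version, status code, then the textual description.
    parts = lines[0].split(" ", 2)
    version, status = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in lines[1:]:
        name, _colon, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, status, reason, headers, rest

parsed = parse_response_head(
    b"HTTP/1.1 404 Not Found\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 0\r\n"
    b"\r\n"
)
```

The textual description is split off last because it may itself contain spaces, as in "Not Found".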

Response Body

After the header has been sent, the server will send the appropriate data, the length of which has already been specified with a Content-Length header. Exceptions may include when an error occurs (4xx, 5xx), response redirection (3xx), responses in the 1xx range or when the request didn't generate data, e.g. a HEAD request, which is only permitted to return the headers.
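Reading the body can be sketched as follows, assuming the header has already been parsed into a dictionary with lower-case names and that the rest of the connection is available as a binary file-like object. The function name read_body is hypothetical, and treating a missing Content-Length as "no body" is a simplification made for this example, not something the specification promises.

```python
import io

def read_body(stream, method, status, headers):
    """Read a response body of the advertised length from `stream`."""
    # A HEAD request and a 1xx status are never followed by a body.
    if method == "HEAD" or 100 <= status < 200:
        return b""
    # Assumption for this sketch: no Content-Length means no body.
    length = int(headers.get("content-length", "0"))
    body = stream.read(length)
    if len(body) != length:
        raise EOFError("connection closed before the full body arrived")
    return body

# io.BytesIO stands in for the socket here.
body = read_body(io.BytesIO(b"hello world"), "GET", 200, {"content-length": "5"})
```

Only the first five bytes are consumed in this usage; anything after them would belong to the next response on a persistent connection.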

Commonly used methods

GET

The GET method is probably the most commonly used, and fetches whatever the request URI identifies. It returns the response headers and the entity referred to by the URI. The returned data may be cacheable.

HEAD

The HEAD method allows a client to verify whether data is still valid. It should return headers identical to those that would be returned by a GET response under the same conditions, but must not return the response body.
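The difference between GET and HEAD can be observed with Python's standard http.server and http.client modules. This is only a demonstration under stated assumptions: the Handler class and compare_get_and_head function are invented for this example, the server runs on the local machine on a free port, and it serves one fixed body whatever the URI.

```python
import http.client
import http.server
import threading

class Handler(http.server.BaseHTTPRequestHandler):
    BODY = b"<p>hello</p>"

    def _send_head(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(self.BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_head()
        self.wfile.write(self.BODY)

    def do_HEAD(self):
        self._send_head()  # same headers as GET, but no body is written

    def log_message(self, format, *args):
        pass  # suppress the default request logging

def compare_get_and_head():
    """Issue a GET and a HEAD for the same URI and collect both results."""
    server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        results = {}
        for method in ("GET", "HEAD"):
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request(method, "/")
            resp = conn.getresponse()
            results[method] = (resp.status, resp.getheader("Content-Length"), resp.read())
            conn.close()
        return results
    finally:
        server.shutdown()
        server.server_close()
```

Both entries in the returned dictionary carry the same status and Content-Length, but the body read for HEAD is empty.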

POST

The POST method allows a client to post data to the server. This is generally used alongside CGI to process data from a form. The response should not be cached unless it specifically includes appropriate Expires or Cache-Control headers.
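As an illustration of posting form data, the sketch below encodes a few fields with urllib.parse.urlencode and wraps them in a POST request. The function name build_form_post, the path /cgi-bin/form, the host and the field names are all hypothetical, and application/x-www-form-urlencoded is assumed as the encoding because it is what browsers use for ordinary forms.

```python
from urllib.parse import urlencode

def build_form_post(uri, host, fields):
    """Build a POST request carrying URL-encoded form fields."""
    # urlencode turns [("name", "Fred Bloggs")] into "name=Fred+Bloggs".
    body = urlencode(fields).encode("ascii")
    head = (
        "POST %s HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        "Content-Length: %d\r\n"
        "\r\n" % (uri, host, len(body))
    ).encode("ascii")
    return head + body

request = build_form_post(
    "/cgi-bin/form",
    "www.example.com",
    [("name", "Fred Bloggs"), ("age", "42")],
)
```

The Content-Length is computed from the encoded body, so it stays correct when a field contains characters that urlencode has to escape.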

Commonly Used Status Codes

Status Code	Example Textual Description
200	OK
300	Multiple Choices
301	Moved Permanently
302	Found
400	Bad Request
403	Forbidden
404	Not Found
500	Internal Server Error
501	Not Implemented
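Since the group a status code belongs to is determined by its first digit, a client can classify any code, including ones it does not recognise, with a simple lookup. The names STATUS_CLASSES and status_class are made up for this sketch.

```python
# Group names as listed under Response Headers, keyed by the first digit.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def status_class(code):
    """Return the group name for a three-digit status code."""
    return STATUS_CLASSES.get(code // 100, "Unknown")

for code in (200, 302, 404, 501):
    print(code, status_class(code))
```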

Useful links

The W3C Full HTTP Specification