Lab 1 (a): Experimentation with Wireshark

Lab 1 (b): Multi-threaded Web server

Requirements

The assignment is to create a small multi-threaded Web server using Java.  Your Web server should implement the subset of HTTP/1.0 as described in detail below. In short it should:


You can also download the slides of the in-class assignment presentation here.

Submission:

Assignment description

First download the skeleton-code for the client/server pair from the links server and client and read the instructions.

The server should implement a subset of the Hypertext Transfer Protocol (HTTP) version 1.0 as defined in the Internet Engineering Task Force (IETF) document RFC 1945 [ps, pdf].

The server should be able to handle several simultaneous requests.
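One common way to handle simultaneous requests is to start a new thread for every accepted connection. The following is a minimal sketch of that idea, not the structure of the skeleton code. The class name ThreadPerConnectionServer and the placeholder handler are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical class name; the skeleton code may be organised differently.
class ThreadPerConnectionServer {

    /** Accepts connections forever and serves each one on its own thread. */
    static void serve(ServerSocket listener) throws IOException {
        while (true) {
            Socket client = listener.accept();         // blocks until a client connects
            new Thread(() -> handle(client)).start();  // the accept loop continues at once
        }
    }

    /** Placeholder handler: a real one would read and parse the request first. */
    static void handle(Socket client) {
        try (Socket c = client; OutputStream out = c.getOutputStream()) {
            out.write("HTTP/1.0 200 OK\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        } catch (IOException e) {
            // The client went away; nothing more can be done for this connection.
        }
    }
}
```

Because each connection has its own thread, a slow client only delays its own response. A thread pool would be a reasonable alternative if the number of threads needs to be bounded.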

Your server only needs to understand the HTTP/1.0 request format and generate HTTP/1.0 responses, i.e. you can ignore all requirements for HTTP/0.9 in the RFC. The rest of this document describes which parts of HTTP/1.0 your server should be able to handle. While it might be possible to solve the lab without reading the RFC, it is highly recommended that you look at (at least) the referenced parts of the RFC. (It is rather long, so read it on screen if you're low on printer quota.)

HTTP Request Message

An HTTP request from a client to a server has the following structure:

       Full-Request   = Request-Line             ; Section 5.1
                        *( General-Header        ; Section 4.3
                         | Request-Header        ; Section 5.2
                         | Entity-Header )       ; Section 7.1
                        CRLF
                        [ Entity-Body ]          ; Section 7.2

(The section numbers above refer to the RFC. The BNF syntax is described in section 2.1 of the RFC.)

Looking up the Request-Line in the RFC gives us this information:

   The Request-Line begins with a method token, followed by the
   Request-URI and the protocol version, and ending with CRLF. The
   elements are separated by SP characters. No CR or LF are allowed
   except in the final CRLF sequence.

       Request-Line = Method SP Request-URI SP HTTP-Version CRLF

The Method can be any of "GET", "HEAD", "POST" or some implementation-defined extension. Your implementation only needs to support "GET" and "HEAD" (but should return the appropriate error response if it receives a request with an unsupported Method, see sections 8.1 and 8.2).

   The Request-URI is a Uniform Resource Identifier (Section 3.2) and
   identifies the resource upon which to apply the request.

       Request-URI    = absoluteURI | abs_path

In our case only abs_path is valid, since the server only serves local documents. The server should treat the path as relative to the current directory. (That is, you could just add a "." in front of it to get the path in the local file system. A secure Web server, however, needs to check the abs_path carefully.) A request for the server root, that is, an abs_path consisting of only "/", should return the document "/index.html".
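The mapping from abs_path to a local file, including a basic check that the result stays inside the served directory, could be sketched as follows. The class PathMapper and its resolve method are hypothetical names, and the sketch assumes the abs_path starts with "/" and contains no query string or percent-encoding.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; not part of the skeleton code.
class PathMapper {

    /**
     * Maps an abs_path to a file below root.
     * Returns null if the path would escape root (for example via "..").
     */
    static Path resolve(Path root, String absPath) {
        if (absPath.equals("/")) {
            absPath = "/index.html";           // the server root maps to index.html
        }
        Path base = root.toAbsolutePath().normalize();
        // Drop the leading "/" so the path is resolved relative to the root directory.
        Path target = base.resolve(absPath.substring(1)).normalize();
        return target.startsWith(base) ? target : null;
    }
}
```

For example, PathMapper.resolve(Paths.get("."), "/../secret") returns null, since the normalized path lies outside the current directory. Which error response to send in that case is left to the caller.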

       HTTP-Version   = "HTTP" "/" 1*DIGIT "." 1*DIGIT
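Putting the grammar rules above together, a Request-Line can be split on single spaces and checked piece by piece. This is a sketch under stated assumptions: the class RequestLine and the method preliminaryStatus are hypothetical names, the line is passed in without its trailing CRLF, and an absoluteURI is simply treated as a bad request.

```java
// Hypothetical helper; names are illustrative and not part of the skeleton code.
class RequestLine {
    final String method;
    final String uri;
    final String version;

    private RequestLine(String method, String uri, String version) {
        this.method = method;
        this.uri = uri;
        this.version = version;
    }

    /** Returns the parsed line, or null if it does not match the grammar. */
    static RequestLine parse(String line) {
        String[] parts = line.split(" ", -1);     // elements are separated by one SP each
        if (parts.length != 3) return null;
        if (parts[0].isEmpty()) return null;      // the method token must not be empty
        if (!parts[1].startsWith("/")) return null;               // only abs_path is accepted
        if (!parts[2].matches("HTTP/\\d+\\.\\d+")) return null;   // "HTTP" "/" 1*DIGIT "." 1*DIGIT
        return new RequestLine(parts[0], parts[1], parts[2]);
    }

    /** Status code that can be decided from the Request-Line alone. */
    static int preliminaryStatus(String line) {
        RequestLine r = parse(line);
        if (r == null) return 400;                                // Bad Request
        if (!r.method.equals("GET") && !r.method.equals("HEAD")) {
            return 501;                                           // Not Implemented
        }
        return 200;  // so far; the file lookup may still turn this into 404
    }
}
```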

The General-Header is described in section 4.3 and has the following form

       General-Header = Date                     ; Section 10.6
                      | Pragma                   ; Section 10.12

In our case we don't care about it in the request, but putting a Date line in the response header is good style.

The Request-Header looks like this

       Request-Header = Authorization            ; Section 10.2
                      | From                     ; Section 10.8
                      | If-Modified-Since        ; Section 10.9
                      | Referer                  ; Section 10.13
                      | User-Agent               ; Section 10.15

and you can ignore it. You don't need to implement conditional GET.

There should not be any Entity-Headers nor any Entity-Body in a GET or HEAD request, so it is safe to ignore those for now. (But you will see them again below, when we talk about the response.)

Note that the request is not considered finished until the final CRLF (the empty line that ends the headers) is received; for a request with a body, the Content-Length is used instead. The server should read the whole request before it sends the response. Otherwise, a slow client might fail to read the response if the server sends it prematurely and then closes the connection.
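Reading a GET or HEAD request up to that final empty line could look like the sketch below. The class RequestReader and the method readHead are hypothetical names. The sketch reads one byte at a time for simplicity, so in practice the socket's input stream would be wrapped in a BufferedInputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the skeleton code.
class RequestReader {

    /**
     * Reads the Request-Line and header lines, stopping after the empty
     * line that ends the request head. The empty line is not returned.
     */
    static List<String> readHead(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b != '\n') {
                current.append((char) b);
                continue;
            }
            String line = current.toString();
            if (line.endsWith("\r")) {
                line = line.substring(0, line.length() - 1);  // strip the CR of CRLF
            }
            if (line.isEmpty()) {
                return lines;            // empty line: the request head is complete
            }
            lines.add(line);
            current.setLength(0);
        }
        throw new IOException("connection closed before the request was complete");
    }
}
```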

HTTP Response Message

In response to a request message the server should generate a response message.

       Full-Response  = Status-Line              ; Section 6.1
                        *( General-Header        ; Section 4.3
                         | Response-Header       ; Section 6.2
                         | Entity-Header )       ; Section 7.1
                        CRLF
                        [ Entity-Body ]          ; Section 7.2

The Status-Line.

   The first line of a Full-Response message is the Status-Line,
   consisting of the protocol version followed by a numeric status code
   and its associated textual phrase, with each element separated by SP
   characters. No CR or LF is allowed except in the final CRLF sequence.

       Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

The status codes and reason phrases are described in sections 6.1.1 and 9 of the RFC. Your server should be able to generate these response messages:

      Status code    Reason phrase
      200            OK
      400            Bad Request
      404            Not Found
      500            Internal Server Error
      501            Not Implemented

('Internal Server Error' is useful as a last-ditch response in case something unexpected happens in your server.) It is common for an error response to also contain a small HTML message with a user-friendly explanation of the error as the entity body, which the Web browser will then display. Your Web server should return such messages.
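An error response with a small HTML body could be assembled as in the sketch below. The class ErrorResponse and the exact HTML text are assumptions, and the other headers the assignment asks for (such as Date and Server) are left out to keep the example short.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of the skeleton code.
class ErrorResponse {

    /** Reason phrase for the error codes listed above, or null if unknown. */
    static String reasonPhrase(int code) {
        switch (code) {
            case 400: return "Bad Request";
            case 404: return "Not Found";
            case 500: return "Internal Server Error";
            case 501: return "Not Implemented";
            default:  return null;
        }
    }

    /** Builds the complete response message: status line, headers and HTML body. */
    static byte[] build(int code) {
        String reason = reasonPhrase(code);
        if (reason == null) {            // unknown code: fall back to the last-ditch 500
            code = 500;
            reason = reasonPhrase(500);
        }
        String title = code + " " + reason;
        byte[] body = ("<html><head><title>" + title + "</title></head>"
                + "<body><h1>" + title + "</h1></body></html>\r\n")
                .getBytes(StandardCharsets.US_ASCII);
        byte[] head = ("HTTP/1.0 " + title + "\r\n"
                + "Content-Type: text/html\r\n"
                + "Content-Length: " + body.length + "\r\n"
                + "\r\n")                // empty line separates headers from the body
                .getBytes(StandardCharsets.US_ASCII);
        byte[] message = new byte[head.length + body.length];
        System.arraycopy(head, 0, message, 0, head.length);
        System.arraycopy(body, 0, message, head.length, body.length);
        return message;
    }
}
```

For a HEAD request, the same headers would be sent without the body.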

The General-Header lines in the response from your server should contain a Date entry, see sections 4.3 and 10.6.
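Section 3.3 of the RFC prefers the fixed-length RFC 1123 date format, always expressed in GMT. A possible way to produce it in Java is sketched below. The class name HttpDate is an assumption. A custom pattern is used because the JDK's built-in DateTimeFormatter.RFC_1123_DATE_TIME does not pad the day of the month to two digits.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper; not part of the skeleton code.
class HttpDate {
    // Fixed English names and a two-digit day, e.g. "Sun, 06 Nov 1994 08:49:37 GMT".
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                    .withZone(ZoneOffset.UTC);

    /** Formats a point in time as an HTTP date string. */
    static String format(Instant when) {
        return FORMAT.format(when);
    }

    /** A complete Date header line for the current time, including CRLF. */
    static String dateHeader() {
        return "Date: " + format(Instant.now()) + "\r\n";
    }
}
```

The same format method could also be used for the Last-Modified header, given the file's modification time as an Instant.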

The Response-Header lines should contain a Server entry. (Invent some nice name for your program! :)

       Response-Header = Location                ; Section 10.11
                       | Server                  ; Section 10.14
                       | WWW-Authenticate        ; Section 10.16

If the response includes an entity, for example the contents of the requested file or some error message that should be displayed in the client browser, some entity headers and an entity body must be included in the message. The HEAD method should only return the headers (and never an entity body).

From section 7.1 of the RFC:

   Entity-Header fields define optional meta information about the
   Entity-Body or, if no body is present, about the resource identified
   by the request.

       Entity-Header  = Allow                    ; Section 10.1
                      | Content-Encoding         ; Section 10.3
                      | Content-Length           ; Section 10.4
                      | Content-Type             ; Section 10.5
                      | Expires                  ; Section 10.7
                      | Last-Modified            ; Section 10.10
                      | extension-header

The only entity headers you need to include are the Content-Type header, the Content-Length header and the Last-Modified header. The Content-Length header is only required for a response NOT containing the expected entity body, i.e. when the content length is zero. (Note that this is the case if the requested file is empty.) See section 7.2 for the details.

It is enough to support just a few content types like text/html, text/plain, image/gif and image/jpeg and to use file name extensions to recognize them. Files of unknown type should be marked with application/octet-stream.
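Recognizing the content type from the file name extension can be as simple as the sketch below. The class name ContentTypes and the exact set of extensions (for example treating both ".htm" and ".html" as text/html) are assumptions.

```java
import java.util.Locale;

// Hypothetical helper; not part of the skeleton code.
class ContentTypes {

    /** Returns the Content-Type value for a file name, based on its extension. */
    static String forFileName(String name) {
        int dot = name.lastIndexOf('.');
        // Compare extensions case-insensitively, so "PHOTO.JPG" is also recognized.
        String ext = dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
        switch (ext) {
            case "html":
            case "htm":
                return "text/html";
            case "txt":
                return "text/plain";
            case "gif":
                return "image/gif";
            case "jpg":
            case "jpeg":
                return "image/jpeg";
            default:
                return "application/octet-stream";  // unknown type
        }
    }
}
```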

The entity body is just a stream of octets. To send a file, just read the file and copy its contents to the socket. Close the socket when the response has been sent.
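Copying the file to the socket can be done with a plain byte buffer, so that binary files such as images are sent unchanged. The following is a minimal sketch; the class name FileSender and the buffer size are assumptions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper; not part of the skeleton code.
class FileSender {

    /** Copies all bytes from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {  // -1 signals the end of the file
            out.write(buffer, 0, n);           // write only the bytes actually read
            total += n;
        }
        out.flush();
        return total;
    }
}
```

In the server, in would typically be a FileInputStream for the requested file and out the socket's output stream, after the headers and the empty line have been written.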

Hints