How Java Web Servers Work
by Budi Kurniawan
04/23/2003
Editor's Note: this article is adapted from Budi's self-published book on Tomcat internals. You can find more information on his web site.
A web server is also called a Hypertext Transfer Protocol (HTTP) server
because it uses HTTP to communicate with its clients, which are usually web
browsers. A Java-based web server uses two important classes,
java.net.Socket and java.net.ServerSocket, and
communicates through HTTP messages. Therefore, this article starts by
discussing HTTP and the two classes. Afterwards, I'll explain the simple
web server application that accompanies this article.
The Hypertext Transfer Protocol (HTTP)
HTTP is the protocol that allows web servers and browsers to send and receive data over the Internet. It is a request-and-response protocol: the client makes a request and the server responds to it. HTTP uses reliable TCP connections, by default on TCP port 80. The first version of HTTP was HTTP/0.9, which was superseded by HTTP/1.0. The current version is HTTP/1.1, which is defined by RFC 2616.
This section covers HTTP 1.1 briefly, in just enough detail for you to understand the messages sent by the web server application. If you are interested in more details, read RFC 2616.
In HTTP, the client always initiates a transaction by establishing a connection and sending an HTTP request. The server is in no position to contact a client or to make a callback connection to the client. Either the client or the server can prematurely terminate a connection. For example, when using a web browser, you can click the Stop button on your browser to stop the download process of a file, effectively closing the HTTP connection with the web server.
HTTP Requests
An HTTP request consists of three components:
- Method-URI-Protocol/Version
- Request headers
- Entity body
An example HTTP request is:
POST /servlet/default.jsp HTTP/1.1
Accept: text/plain, text/html
Accept-Language: en-gb
Connection: Keep-Alive
Host: localhost
Referer: http://localhost/ch8/SendDetails.htm
User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)
Content-Length: 33
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate

LastName=Franks&FirstName=Michael
The method-URI-Protocol/Version appears as the first line of the request.
POST /servlet/default.jsp HTTP/1.1
where POST is the request method,
/servlet/default.jsp represents the URI and HTTP/1.1
the Protocol/Version section.
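As a minimal sketch of how a server might take this first line apart, the following splits the request line on single spaces. The class and method names (RequestLineDemo, parseRequestLine) are hypothetical and chosen only for illustration; a production parser would validate far more than this.

```java
class RequestLineDemo {

    /**
     * Splits a request line into its three parts:
     * [0] method, [1] URI, [2] protocol/version.
     * Assumes the parts are separated by single spaces.
     */
    static String[] parseRequestLine(String line) {
        String[] parts = line.trim().split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + line);
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] parts = parseRequestLine("POST /servlet/default.jsp HTTP/1.1");
        System.out.println("Method:   " + parts[0]);
        System.out.println("URI:      " + parts[1]);
        System.out.println("Protocol: " + parts[2]);
    }
}
```

Rejecting any line that does not have exactly three parts keeps the sketch honest about what it handles; it makes no attempt to check that the method or version is one the server actually supports.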
Each HTTP request can use one of the many request methods, as specified in
the HTTP standards. HTTP 1.1 defines seven request methods:
GET, POST, HEAD, OPTIONS,
PUT, DELETE, and TRACE (RFC 2616 also reserves the name
CONNECT for use with proxies). GET
and POST are the most commonly used in Internet applications.
The URI identifies the requested resource. In a request line it is usually
interpreted as being relative to the server's root directory, so it should
begin with a forward slash (/). A URL is actually a type of URI. The protocol
version indicates the version of HTTP being used.
The request headers contain useful information about the client environment and the entity body of the request. For example, they could contain the language for which the browser is set, the length of the entity body, and so on. Each header line ends with a carriage return/linefeed (CRLF) sequence.
A very important blank line (CRLF sequence) comes between the headers and the entity body. This line marks the beginning of the entity body. Some Internet programming books consider this CRLF the fourth component of an HTTP request.
In the previous HTTP request, the entity body is simply the following line:
LastName=Franks&FirstName=Michael
The entity body could easily become much longer in a typical HTTP request.
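The role of the blank line can be sketched in a few lines of Java. The example below is only an illustration under stated assumptions: the class and method names (RequestSplitDemo, bodyStart, parseHeaders) are hypothetical, the request is held in a String, and header names are matched case-sensitively even though HTTP treats them as case-insensitive. It locates the first CRLF CRLF pair, treats everything after it as the entity body, and compares the body length with the Content-Length header.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RequestSplitDemo {

    /** Returns the index just past the first blank line (CRLF CRLF), or -1 if none. */
    static int bodyStart(String request) {
        int idx = request.indexOf("\r\n\r\n");
        return idx < 0 ? -1 : idx + 4;
    }

    /** Collects the header lines between the request line and the blank line. */
    static Map<String, String> parseHeaders(String request) {
        Map<String, String> headers = new LinkedHashMap<String, String>();
        int end = request.indexOf("\r\n\r\n");
        String head = end < 0 ? request : request.substring(0, end);
        String[] lines = head.split("\r\n");
        for (int i = 1; i < lines.length; i++) { // index 0 is the request line
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                String name = lines[i].substring(0, colon).trim();
                String value = lines[i].substring(colon + 1).trim();
                headers.put(name, value);
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        String request = "POST /servlet/default.jsp HTTP/1.1\r\n"
                + "Host: localhost\r\n"
                + "Content-Length: 33\r\n"
                + "Content-Type: application/x-www-form-urlencoded\r\n"
                + "\r\n"
                + "LastName=Franks&FirstName=Michael";

        String body = request.substring(bodyStart(request));
        int declared = Integer.parseInt(parseHeaders(request).get("Content-Length"));

        System.out.println("Body: " + body);
        // For an ASCII body the character count equals the byte count.
        System.out.println("Length matches Content-Length: " + (declared == body.length()));
    }
}
```

A real server reads bytes from a stream rather than a complete String, so it would typically read header lines until it sees the empty one and then read exactly Content-Length bytes of body.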
HTTP Responses
Similar to requests, an HTTP response also consists of three parts:
- Protocol-Status code-Description
- Response headers
- Entity body
The following is an example of an HTTP response:
HTTP/1.1 200 OK
Server: Microsoft-IIS/4.0
Date: Mon, 3 Jan 1998 13:13:33 GMT
Content-Type: text/html
Last-Modified: Mon, 11 Jan 1998 13:23:42 GMT
Content-Length: 112

<html>
<head>
<title>HTTP Response Example</title></head><body>
Welcome to Brainy Software
</body>
</html>
The first line of the response, the status line, is similar to the first line of the request. It tells you that the protocol used is HTTP version 1.1 and that the request succeeded (status code 200, with the description OK).
The response headers contain useful information similar to the headers in the request. The entity body of the response is the HTML content of the response itself. The headers and the entity body are separated by a blank line (a CRLF sequence).
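To make the three parts of a response concrete, here is a rough sketch that assembles a response as a String and reads the status code back out of it. The names ResponseDemo, buildResponse, and statusCode are hypothetical, and the sketch assumes a UTF-8 HTML body and only two headers; it is not how any particular server builds its responses.

```java
import java.nio.charset.StandardCharsets;

class ResponseDemo {

    /** Assembles status line, headers, blank line, and entity body. */
    static String buildResponse(int status, String description, String html) {
        // Content-Length counts bytes, not characters.
        int length = html.getBytes(StandardCharsets.UTF_8).length;
        return "HTTP/1.1 " + status + " " + description + "\r\n"
                + "Content-Type: text/html\r\n"
                + "Content-Length: " + length + "\r\n"
                + "\r\n"
                + html;
    }

    /** Extracts the numeric status code from the first line of a response. */
    static int statusCode(String response) {
        String statusLine = response.substring(0, response.indexOf("\r\n"));
        // Limit the split to 3 so a multi-word description stays in one piece.
        return Integer.parseInt(statusLine.split(" ", 3)[1]);
    }

    public static void main(String[] args) {
        String response = buildResponse(200, "OK",
                "<html><body>Welcome to Brainy Software</body></html>");
        System.out.println(response);
        System.out.println("Status code: " + statusCode(response));
    }
}
```

Computing Content-Length from the body bytes, rather than hard-coding it, avoids a mismatch that can leave a client waiting for data that never arrives.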
The Socket Class
A socket is an endpoint of a network connection. A socket enables an
application to read from and write to the network. Two software applications
residing on two different computers can communicate with each other by sending
and receiving byte streams over a connection. To send a message to another
application, you need to know its IP address, as well as the port number of its
socket. In Java, a socket is represented by the java.net.Socket
class.
To create a socket, you can use one of the many constructors of the
Socket class. One of these constructors accepts the host name and
the port number:
public Socket(String host, int port)
where host is the remote machine name or IP address, and
port is the port number of the remote application. For example, to
connect to yahoo.com at port 80, you would construct the following socket:
new Socket("yahoo.com", 80);
Once you create an instance of the Socket class successfully,
you can use it to send and receive streams of bytes. To send byte streams, you
must first call the Socket class's getOutputStream
method to obtain a java.io.OutputStream object. To send text to a
remote application, you often want to construct a
java.io.PrintWriter object from the OutputStream
object returned. To receive byte streams from the other end of the connection,
you call the Socket class's getInputStream method, which
returns a java.io.InputStream.
The following snippet creates a socket that can communicate with a local
HTTP server (127.0.0.1 denotes a local host), sends an HTTP request, and
receives the response from the server. It creates a StringBuffer
object to hold the response, and prints it to the console.
// imports needed by this snippet
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

// the port number is an int, not a String
Socket socket = new Socket("127.0.0.1", 8080);
boolean autoflush = true;
PrintWriter out = new PrintWriter(socket.getOutputStream(), autoflush);
BufferedReader in = new BufferedReader(
    new InputStreamReader(socket.getInputStream()));
// send an HTTP request to the web server; HTTP lines end with CRLF
out.print("GET /index.jsp HTTP/1.1\r\n");
out.print("Host: localhost:8080\r\n");
out.print("Connection: Close\r\n");
out.print("\r\n");
// autoflush only applies to println, so flush explicitly
out.flush();
// read the response until the server closes the connection
StringBuffer sb = new StringBuffer(8096);
int i;
while ((i = in.read()) != -1) {
    // append only real characters, never the -1 end-of-stream marker
    sb.append((char) i);
}
// display the response on the console
System.out.println(sb.toString());
socket.close();
Note that to get a proper response from the web server, you need to send an HTTP request that complies with the HTTP protocol. If you have read the previous section, "The Hypertext Transfer Protocol (HTTP)," you can understand the HTTP request in the code above.
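One way to check that a request is well formed before sending it over the network is to write it to an in-memory stream first. The sketch below is an illustration only: RequestWriterDemo and writeGet are hypothetical names, and the method accepts any java.io.OutputStream so that the same code could be handed the stream returned by a socket's getOutputStream method.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

class RequestWriterDemo {

    /** Writes a minimal GET request, ending every line with CRLF. */
    static void writeGet(OutputStream os, String host, String path) {
        PrintWriter out = new PrintWriter(
                new OutputStreamWriter(os, StandardCharsets.US_ASCII));
        out.print("GET " + path + " HTTP/1.1\r\n");
        out.print("Host: " + host + "\r\n");
        out.print("Connection: close\r\n");
        out.print("\r\n"); // blank line: end of headers, no entity body
        out.flush();       // print() does not flush on its own
    }

    public static void main(String[] args) {
        // Capture the bytes in memory instead of sending them to a server.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeGet(buffer, "localhost:8080", "/index.jsp");
        String sent = new String(buffer.toByteArray(), StandardCharsets.US_ASCII);
        System.out.print(sent);
    }
}
```

Writing explicit "\r\n" sequences, rather than relying on println, keeps the line endings the same on every platform, since println uses the platform's own line separator.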