- Introduction to Sockets
- Serialization and Deserialization
- Single Thread File Server
- Processes and Threads
- Multithreaded Servers
This post shows how to use Java sockets to build a simple HTTP server that responds to GET requests. We first build it with a single thread, then extend the server to multiple threads. It is based on my notes and assignments from Distributed Java, offered by Rice University on Coursera (Weeks 2 and 4).
Introduction to Sockets
Server Side
For JVM A and JVM B to communicate with each other, we assume that JVM A plays the “client” role and JVM B the “server” role. To establish the connection, the main thread in JVM B first creates a ServerSocket (called socket, say), which is bound to a designated port number. It then waits for client processes to connect to this socket by invoking the socket.accept() method, which returns an object of type Socket (called s, say). The s.getInputStream() and s.getOutputStream() methods can be invoked on this object to perform read and write operations via the socket, using the same APIs that you use for file I/O via streams.
Client Side
Once JVM B has set up a server socket, JVM A can connect to it as a client by creating a Socket object with the appropriate parameters to identify JVM B’s server port. As in the server case, the getInputStream() and getOutputStream() methods can be invoked on this object to perform read and write operations. With this setup, JVM A and JVM B can communicate with each other by using read and write operations, which get implemented as messages that flow across the network. Client-server communication occurs at a lower level and scale than MapReduce, which implicitly accomplishes communication among large numbers of processes. Hence, client-server programming is typically used for building distributed applications with small numbers of processes.
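A client along these lines might look like the sketch below. The class name EchoClientSketch, the method sendLine, and the host and port in main are assumptions for illustration; it presumes some line-oriented server is already listening at that address.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

// Hypothetical example class; names, host, and port are illustrative only.
final class EchoClientSketch {

    // Connects to host:port, sends one line, and returns the first line of the reply.
    static String sendLine(String host, int port, String message) throws IOException {
        // Creating the Socket performs the connection to the server's port.
        try (Socket s = new Socket(host, port);
             PrintWriter out = new PrintWriter(s.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream()))) {
            out.println(message);  // write travels to the server as a network message
            return in.readLine();  // blocks until the server replies or closes the connection
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumes a server is listening on localhost:8080.
        System.out.println(sendLine("localhost", 8080, "hello"));
    }
}
```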
Reading : Java tutorial titled Lesson: All About Sockets
Serialization and Deserialization
When communications are performed using input and output streams, the unit of data transfer is a sequence of bytes. Thus, it becomes important to serialize objects into bytes in the sender process, and to deserialize bytes into objects in the receiver process.
A few approaches:
- Custom approach
- in which the programmer provides custom code to perform the serialization and deserialization
- XML
- Java Serialization and Deserialization
- Interface Definition Language (IDL)
- like Google’s Protocol Buffers framework
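As a small illustration of the Java Serialization option from the list above, the sketch below turns an object into bytes and back using ObjectOutputStream and ObjectInputStream. The Greeting class and the helper method names are hypothetical; in a socket program the byte array streams would be replaced by the socket's output and input streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical example class; names are illustrative only.
final class SerializationSketch {

    // A made-up message type. Implementing Serializable opts in to Java serialization.
    static final class Greeting implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;
        final int count;

        Greeting(String text, int count) {
            this.text = text;
            this.count = count;
        }
    }

    // Sender side: object -> bytes.
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Receiver side: bytes -> object.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialize(new Greeting("hello", 2));
        Greeting copy = (Greeting) deserialize(data);
        System.out.println(copy.text + " x" + copy.count); // prints "hello x2"
    }
}
```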
Reading :
- Chapter 4. Encoding and Evolution (Kleppmann, Martin. Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems (Kindle Location 2919). O’Reilly Media. Kindle Edition. )
Single Thread File Server
(Spoiler! Proceed with caution)
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

/**
 * A basic and very limited implementation of a file server that responds to GET
 * requests from HTTP clients.
 */
public final class FileServer {
    /**
     * Main entrypoint for the basic file server.
     *
     * @param socket Provided socket to accept connections on.
     * @param fs A proxy filesystem to serve files from. See the PCDPFilesystem
     *           class for more detailed documentation of its usage.
     * @throws IOException If an I/O error is detected on the server. This
     *                     should be a fatal error, your file server
     *                     implementation is not expected to ever throw
     *                     IOExceptions during normal operation.
     */
    public void run(final ServerSocket socket, final PCDPFilesystem fs)
            throws IOException {
        /*
         * Enter a spin loop for handling client requests to the provided
         * ServerSocket object.
         */
        while (true) {
            // TODO 1) Use socket.accept to get a Socket object.
            // try-with-resources closes the connection on every path,
            // including requests that are not GETs.
            try (Socket socketObj = socket.accept()) {
                /*
                 * TODO 2) Using Socket.getInputStream(), parse the received HTTP
                 * packet. In particular, we are interested in confirming this
                 * message is a GET and parsing out the path to the file we are
                 * GETing. Recall that for GET HTTP packets, the first line of the
                 * received packet will look something like:
                 *
                 *     GET /path/to/file HTTP/1.1
                 */
                BufferedReader input = new BufferedReader(
                        new InputStreamReader(socketObj.getInputStream()));
                String requestLine = input.readLine();
                // readLine() returns null if the client disconnects without
                // sending data. Check explicitly, since assert statements
                // are disabled by default.
                if (requestLine == null) {
                    continue;
                }
                String[] parts = requestLine.trim().split("\\s+");
                if (parts.length < 2 || !parts[0].equals("GET")) {
                    continue; // not a well-formed GET; drop the connection
                }
                PCDPPath filePath = new PCDPPath(parts[1]);
                String file = fs.readFile(filePath);
                /*
                 * TODO 3) Using the parsed path to the target file, construct an
                 * HTTP reply and write it to Socket.getOutputStream(). If the file
                 * exists, the HTTP reply should be formatted as follows:
                 *
                 *     HTTP/1.0 200 OK\r\n
                 *     Server: FileServer\r\n
                 *     \r\n
                 *     FILE CONTENTS HERE\r\n
                 *
                 * If the specified file does not exist, you should return a reply
                 * with an error code 404 Not Found. This reply should be formatted
                 * as:
                 *
                 *     HTTP/1.0 404 Not Found\r\n
                 *     Server: FileServer\r\n
                 *     \r\n
                 *
                 * Don't forget to close the output stream.
                 */
                PrintWriter out = new PrintWriter(socketObj.getOutputStream(), true);
                if (file == null) {
                    // return 404
                    out.print("HTTP/1.0 404 Not Found\r\n");
                    out.print("Server: FileServer\r\n");
                    out.print("\r\n");
                } else {
                    out.print("HTTP/1.0 200 OK\r\n");
                    out.print("Server: FileServer\r\n");
                    out.print("\r\n");
                    out.print(file + "\r\n");
                }
                out.close(); // flushes the reply before the socket is closed
            }
        }
    }
}
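The request-line parsing and reply formatting used by the server above can also be pulled out into pure functions, which makes them easy to try without opening a socket. The sketch below is one hedged way to do that; the class HttpSketch and its method names are hypothetical and not part of the assignment's API.

```java
// Hypothetical helper class; names are illustrative only.
final class HttpSketch {

    // Returns the requested path if the line is a GET request line, otherwise null.
    static String parsePath(String requestLine) {
        if (requestLine == null) {
            return null;
        }
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length < 2 || !parts[0].equals("GET")) {
            return null;
        }
        return parts[1];
    }

    // Builds the reply in the format described in the assignment comments.
    // A null argument stands for "file not found" and yields a 404 reply.
    static String buildReply(String fileContents) {
        if (fileContents == null) {
            return "HTTP/1.0 404 Not Found\r\nServer: FileServer\r\n\r\n";
        }
        return "HTTP/1.0 200 OK\r\nServer: FileServer\r\n\r\n" + fileContents + "\r\n";
    }

    public static void main(String[] args) {
        System.out.println(parsePath("GET /path/to/file HTTP/1.1")); // prints "/path/to/file"
        System.out.print(buildReply("hello"));
    }
}
```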
Processes and Threads
In the case of Java applications, a process corresponds to a single Java Virtual Machine (JVM) instance, and threads are created within a JVM instance.
The advantages of creating multiple threads in a process include increased sharing of memory and per-process resources by threads, improved responsiveness due to multithreading, and improved performance since threads in the same process can communicate with each other through a shared address space.
The advantages of creating multiple processes in a node include improved responsiveness, here due to multiprocessing (e.g., when one JVM is paused during garbage collection, the others keep running), improved scalability (going past the scalability limitations of multithreading), and improved resilience to JVM failures within a node. However, processes can only communicate with each other through message-passing and other communication patterns for distributed computing.
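To make the shared address space concrete, here is a minimal sketch in which several threads of one JVM update the same counter object directly, with no messages involved. The class and method names are made up for the example; AtomicInteger is used so that concurrent increments are not lost.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; names are illustrative only.
final class SharedMemorySketch {

    // Starts nThreads threads that all increment one shared counter, then returns its value.
    static int countWithThreads(int nThreads, int incrementsPerThread)
            throws InterruptedException {
        AtomicInteger counter = new AtomicInteger(); // lives on the heap shared by all threads
        Thread[] threads = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.incrementAndGet();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join(); // wait for every thread to finish
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countWithThreads(4, 1000)); // prints 4000
    }
}
```

Two separate JVM processes could not share the counter this way; they would have to exchange its value over something like the sockets described earlier.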
Multithreaded Servers
A natural way to serve many clients at once is to handle each incoming request in its own thread. One challenge of following this approach literally is that there is significant overhead in creating and starting a Java thread. Moreover, since there is usually an upper bound on the number of threads that can be efficiently utilized within a node (often limited by the number of cores or hardware contexts), it is wasteful to create more threads than that number.
There are two approaches that are commonly taken to address this challenge in Java applications.
One is to use a thread pool, so that threads can be reused across multiple requests instead of creating a new thread for each request.
Another is to use lightweight tasking (e.g., as in Java’s ForkJoin framework), in which tasks execute on a thread pool with a bounded number of threads and offer the advantage that the overhead of task creation is significantly smaller than that of thread creation.
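Both approaches can be sketched in a few lines. In the hedged example below, distinctWorkers runs many small tasks on a fixed thread pool and reports how many different threads actually ran them, and SumTask is a ForkJoin RecursiveTask that splits an array sum into lightweight subtasks. The class name PoolSketch, the method names, and the threshold of 1,000 elements are assumptions for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.TimeUnit;

// Hypothetical example class; names and thresholds are illustrative only.
final class PoolSketch {

    // Runs nTasks tasks on poolSize reusable threads; returns how many distinct threads ran them.
    static int distinctWorkers(int poolSize, int nTasks) throws InterruptedException {
        Set<String> names = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < nTasks; i++) {
            pool.execute(() -> names.add(Thread.currentThread().getName()));
        }
        pool.shutdown();                             // stop accepting new tasks
        pool.awaitTermination(10, TimeUnit.SECONDS); // wait for submitted tasks to finish
        return names.size();                         // never more than poolSize
    }

    // A lightweight ForkJoin task that sums data[lo, hi).
    static final class SumTask extends RecursiveTask<Long> {
        private final int[] data;
        private final int lo;
        private final int hi;

        SumTask(int[] data, int lo, int hi) {
            this.data = data;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected Long compute() {
            if (hi - lo <= 1_000) {                  // small enough: sum sequentially
                long sum = 0;
                for (int i = lo; i < hi; i++) {
                    sum += data[i];
                }
                return sum;
            }
            int mid = (lo + hi) >>> 1;
            SumTask left = new SumTask(data, lo, mid);
            SumTask right = new SumTask(data, mid, hi);
            left.fork();                             // schedule the left half as a new task
            return right.compute() + left.join();    // compute the right half in this task
        }
    }

    static long parallelSum(int[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("distinct worker threads: " + distinctWorkers(4, 100));
        int[] data = new int[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data)); // prints 50005000
    }
}
```

With 100 tasks and a pool of 4, the tasks are spread over at most 4 threads, which is the reuse the thread pool approach is after.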
(Spoiler! Proceed with caution)
// Needs, in addition to the imports of the single-threaded version:
// java.util.concurrent.ExecutorService and java.util.concurrent.Executors.
public void run(final ServerSocket socket, final PCDPFilesystem fs,
        final int ncores) throws IOException {
    // A fixed pool of worker threads is reused across requests.
    ExecutorService executor = Executors.newFixedThreadPool(25 * ncores);
    /*
     * Enter a spin loop for handling client requests to the provided
     * ServerSocket object.
     */
    while (true) {
        final Socket socketObj = socket.accept();
        // Hand the task straight to the pool. Wrapping it in a new Thread
        // is unnecessary, because the pool's own workers run it.
        executor.execute(() -> {
            // try-with-resources closes the connection on every path.
            try (Socket client = socketObj) {
                BufferedReader input = new BufferedReader(
                        new InputStreamReader(client.getInputStream()));
                String requestLine = input.readLine();
                // readLine() returns null if the client sent nothing.
                if (requestLine == null) {
                    return;
                }
                String[] parts = requestLine.trim().split("\\s+");
                if (parts.length < 2 || !parts[0].equals("GET")) {
                    return; // not a well-formed GET; drop the connection
                }
                PCDPPath filePath = new PCDPPath(parts[1]);
                String file = fs.readFile(filePath);
                PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                if (file == null) {
                    // return 404
                    out.print("HTTP/1.0 404 Not Found\r\n");
                    out.print("Server: FileServer\r\n");
                    out.print("\r\n");
                } else {
                    out.print("HTTP/1.0 200 OK\r\n");
                    out.print("Server: FileServer\r\n");
                    out.print("\r\n");
                    out.print(file + "\r\n");
                }
                out.close(); // flushes the reply before the socket is closed
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
    }
}