TCP, Client-Server, Netcat

Please review SI110's lessons on TCP, client-server and netcat. If, using the above commands, you can create connections between different hosts and send and receive data, you get the basics of TCP clients and servers.

Names vs. IP Addresses

Before we can talk about connecting to machines, we need to be able to identify the host we want to connect to. You need a host's IP address in order to connect to it, but usually you have a domain name instead. Moreover, there are IPv4 and IPv6 addresses ... which to use? Not surprisingly, Java has a class InetAddress (API documentation) that has methods for "resolving" domain names into addresses, and deals with the IPv4 vs IPv6 distinction in the way that, by now, I hope you expect: InetAddress is a base class, and Inet4Address and Inet6Address are subclasses. You don't really need to worry about whether you're using IPv4 or IPv6, because you typically get InetAddress objects not by calling constructors, but by calling the method
static InetAddress getByName(String host);
which, in one fell swoop, resolves the domain name host and returns an InetAddress object of the appropriate type for it. If the name cannot be resolved, it throws an UnknownHostException. This makes it trivial to create a little Java version of the nslookup utility.
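A minimal sketch of such a lookup tool follows. The class name NsLookup and the helper lookup are made up for illustration; the only library calls used are InetAddress.getByName and getHostAddress. The helper returns an error message rather than throwing, which is one possible way to handle an unresolvable name.

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical class name; a sketch, not a replacement for nslookup.
class NsLookup {
    // Resolves host and reports the address family and the textual address.
    // Returns an error message instead of throwing if the name is unknown.
    static String lookup(String host) {
        try {
            InetAddress addr = InetAddress.getByName(host);
            // getByName hands back whichever subclass fits the result.
            String family = (addr instanceof Inet6Address) ? "IPv6" : "IPv4";
            return family + " " + addr.getHostAddress();
        } catch (UnknownHostException e) {
            return "cannot resolve " + host;
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java NsLookup <hostname>");
            return;
        }
        System.out.println(lookup(args[0]));
    }
}
```

Passing a literal address such as 127.0.0.1 skips the DNS query entirely, which makes it a handy first test.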

Of course, if you were doing this you would handle all the errors and exceptions nicely, wouldn't you ...

Sockets, connecting as a client

The object that represents a TCP (or UDP, but we're not covering that) communication endpoint is called a "socket". Think of a socket like a telephone: once connected, whatever goes into our phone travels over the network to another phone somewhere else, and whatever goes into the phone on the other end gets sent over and comes out of ours. The big difference is that instead of voice input and output, we have input bytestreams and output bytestreams.

There's an asymmetry with sockets (as with phones), in that one socket waits around for a call to come in (listens), and the other actually places the call (connects). The Java API has different classes for these different roles. The class Socket (API documentation) represents the client's role, i.e. the one that makes connection requests. The class ServerSocket (API documentation) represents the server's role, i.e. the one that listens for connection requests.

The important methods of the Socket class for us are:

Socket(InetAddress address, int port) - Creates a stream socket and connects it to the specified port number at the specified IP address.
InputStream getInputStream()          - Returns an input stream for this socket.
OutputStream getOutputStream()        - Returns an output stream for this socket.
void close()                          - Closes this socket.

So, the key idea is that, once connected, you get InputStream and OutputStream objects for the connection, which means you can use all the same classes and methods that you have for files, byte arrays, and so on.
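To see the two roles and the stream idea together, here is a small self-contained sketch that runs both ends inside one process over the loopback interface. The class name LoopbackDemo, the helper roundTrip, and the "echo: " reply are all invented for this illustration. The point is only that the socket's streams get wrapped in the same BufferedReader and PrintWriter classes you would use with a file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class: one listener, one client, one line each way.
class LoopbackDemo {
    // Sends msg to a one-shot server on the loopback interface and returns its reply.
    static String roundTrip(String msg) {
        InetAddress local = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system to pick any free port.
        try (ServerSocket listener = new ServerSocket(0, 1, local)) {
            Thread server = new Thread(() -> {
                // Server role: wait for the call, read one line, answer it.
                try (Socket peer = listener.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(peer.getInputStream()));
                     PrintWriter out = new PrintWriter(peer.getOutputStream(), true)) {
                    out.println("echo: " + in.readLine());
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            server.start();
            // Client role: place the call, write one line, read the reply.
            try (Socket sock = new Socket(local, listener.getLocalPort());
                 PrintWriter out = new PrintWriter(sock.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(sock.getInputStream()))) {
                out.println(msg);   // autoflush is on, so this is sent right away
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

The second constructor argument to PrintWriter turns on autoflush, so each println is pushed to the network immediately instead of sitting in a buffer.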

The following program assumes the command-line argument is the name of a webserver. It connects to that webserver on port 80, sends it the basic HTTP GET request for "/", i.e. the front page of the site, and echoes everything it gets back right to the terminal.
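A sketch of a program along those lines is given here. The class name HttpGet and the helper buildRequest are hypothetical. The exact request text is also an assumption: this version sends HTTP/1.1 with a Host header and "Connection: close", so that the server closes the connection when it is done and the read loop ends.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical class name; fetches "/" from a webserver and prints the raw response.
class HttpGet {
    // Builds a minimal GET request for "/". The header choice is an assumption.
    static String buildRequest(String host) {
        return "GET / HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java HttpGet <webserver>");
            return;
        }
        String host = args[0];
        // try-with-resources closes the socket even if something goes wrong.
        try (Socket sock = new Socket(InetAddress.getByName(host), 80)) {
            OutputStream out = sock.getOutputStream();
            out.write(buildRequest(host).getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // ISO-8859-1 maps every byte to a character, so nothing is lost on decode.
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(sock.getInputStream(), StandardCharsets.ISO_8859_1));
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);   // status line, headers, then the body
            }
        }
    }
}
```

HTTP requires each request line to end in "\r\n", and the blank line at the end tells the server the request is complete.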

A Java netcat client

Writing programs that communicate over networks gives us lots of nice reasons to use threads. To see this in action, let's try to write a Java version of the client functionality of netcat. So, you give it a hostname and a port and it connects, taking whatever it reads from standard in and sending it to the server on the other end, and taking whatever it gets sent by the server and writing it to standard out. The reason this involves threads is that the program needs to simultaneously wait for input from standard in and wait for input from the socket. This means two separate threads!

This is a very primitive program: it uses "busy wait" to determine when the two threads have exited, in order to close the socket, it doesn't do error handling, and so on. However, it illustrates some basic TCP client functionality. Note: the call to "flush()" is there to ensure that what we've just written with a println() actually gets sent then and there across the network.
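For comparison, here is a sketch of the same two-thread idea that swaps the busy wait for Thread.join(). The class name NetcatClient and the helper pump are hypothetical. One design choice to note is an assumption of this sketch, not something the text above prescribes: the thread reading standard in is marked as a daemon, so the program exits as soon as the server closes the connection, even if that thread is still blocked waiting for keyboard input.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.PrintStream;
import java.net.InetAddress;
import java.net.Socket;

// Hypothetical class name; a sketch of netcat's client side.
class NetcatClient {
    // Copies lines from in to out until end of stream, flushing after each line.
    static void pump(InputStream in, OutputStream out) {
        try {
            BufferedReader reader = new BufferedReader(new InputStreamReader(in));
            PrintStream writer = new PrintStream(out);
            String line;
            while ((line = reader.readLine()) != null) {
                writer.println(line);
                writer.flush();   // send this line now rather than when a buffer fills
            }
        } catch (IOException e) {
            // The stream was closed underneath us; just stop copying.
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java NetcatClient <host> <port>");
            return;
        }
        Socket sock = new Socket(InetAddress.getByName(args[0]), Integer.parseInt(args[1]));
        InputStream netIn = sock.getInputStream();
        OutputStream netOut = sock.getOutputStream();

        // Thread 1: standard in -> server. Daemon, so it cannot keep the JVM alive.
        Thread toServer = new Thread(() -> pump(System.in, netOut));
        toServer.setDaemon(true);
        // Thread 2: server -> standard out.
        Thread fromServer = new Thread(() -> pump(netIn, System.out));

        toServer.start();
        fromServer.start();
        fromServer.join();   // blocks until the server closes its end; no busy wait
        sock.close();
    }
}
```

Because pump only sees an InputStream and an OutputStream, the same method serves both directions, and it can be exercised with byte-array streams without any network at all.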