Networks III

Reading: APUE 16.2 describes sockets. APUE 16.4 describes the connect function.
Homework: Print out the homework and answer the questions on that paper.

Protocols and terms

A network protocol provides a service. Here are some important terms for describing services:

Connectionless - data sent as "datagrams", i.e. messages sent to an address, like a letter
Connection-oriented - a logical connection ("virtual circuit") is established between communicating processes (like a phone call).
Reliable - packets arrive complete, error-free, in the order sent.
Byte-stream - like a file or pipe in Unix, just a stream of bytes ... no boundaries.
Full-duplex - data flows in both directions at the same time.

IP (Internet Protocol) is the network-layer protocol we'll be using. It is: connectionless, and unreliable.
TCP (Transmission Control Protocol) is one of the transport layer protocols we'll be using. It is: connection-oriented, reliable, full-duplex, byte-stream.
UDP (User Datagram Protocol) is another transport layer protocol we'll be using. It is: connectionless, unreliable, full-duplex.

Sockets

To send and receive data to remote hosts, processes use file descriptors, just like for files and pipes and FIFOs, except that the descriptor is referring to something called a socket. A socket is a communication endpoint conforming to a fixed protocol. Sockets are created with a call to the function: socket.

int socket(int domain, int type, int protocol);

The protocol parameter is usually set to zero, which allows the actual protocol to be deduced from the given domain and type. The domain will be PF_INET (for IPv4) in our examples, although others are possible, like PF_INET6 for IPv6. The argument type will depend on what you want to do: the constant SOCK_STREAM gives us a TCP connection (reliable, connection-oriented bytestream) and SOCK_DGRAM gives us UDP (unreliable, connectionless). The choice of which to use really depends on the application.
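As a small sketch of the two type choices described above, the helper below (socket_type_of is a made-up name, not part of any library) creates a socket with protocol 0 and then asks the kernel what type it ended up with. It uses AF_INET, which has the same value as PF_INET on the systems I know of. It is an illustration rather than code you'd need in a real program.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper: create an IPv4 socket of the requested type and
 * ask the kernel which type it really is. Returns the type, or -1 on error. */
int socket_type_of(int type)
{
  /* protocol 0: let the system deduce the protocol from domain + type */
  int sfd = socket(AF_INET, type, 0);
  if (sfd == -1) return -1;

  int actual = 0;
  socklen_t len = sizeof(actual);
  int rc = getsockopt(sfd, SOL_SOCKET, SO_TYPE, &actual, &len);

  close(sfd);   /* a socket is a file descriptor, so close() releases it */
  return rc == 0 ? actual : -1;
}
```

Calling socket_type_of(SOCK_STREAM) should hand back SOCK_STREAM (our TCP case), and socket_type_of(SOCK_DGRAM) should hand back SOCK_DGRAM (our UDP case).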

A socket on a host is addressed by two things: a hostname and a port number. We say the socket is "bound" to a particular port number on a host. For our purposes, only one socket can be bound to a given port at any one time on a host, so that address is unique. However, many file descriptors may be referring to that socket simultaneously. Ports are numbered by a 16-bit non-negative number (C's unsigned short int). Some port numbers are dedicated to different services -- like port 22 for ssh, or imap on port 143, or WoW on port 3724. These assignments are registered with IANA, which operates under ICANN. Higher-numbered ports are there for whatever you need to do. So, to connect in order to communicate with a process on a remote host, you need to know both the host name / IP address and the port number. A socket can be bound to a port number in different ways, depending on whether the socket is being used as a "client" or as a "server".

The netstat utility can be used to show the current socket/port bindings. Here's an excerpt (I've cut out lots of stuff!):

michcsdbrownu$ netstat -a

TCP: IPv4
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q    State
-------------------- -------------------- ----- ------ ----- ------ -----------
michcsdbrownu.ssh    131.122.90.34.40215  135424      0 49232      0 ESTABLISHED
michcsdbrownu.40923  nwtime.usna.edu.ldap  6064      0 49640      0 ESTABLISHED
michcsdbrownu.893    chessie.cs.usna.edu.nfsd 49640      0 49640     76 ESTABLISHED
localhost.6010       localhost.41013      49152      0 49152      0 ESTABLISHED
michcsdbrownu.6000   hercules1.usna.navy.mil.3833 65535      0 49640      0 CLOSE_WAIT
michcsdbrownu.41055  chessie.cs.usna.edu.ssh 49640      0 49640      0 TIME_WAIT
michcsdbrownu.32774  chessie.cs.usna.edu.54831 49640      0 49640      0 ESTABLISHED

You see that we're looking at entries for sockets in the IPv4 domain of type TCP. You get the hostname.portnum for both ends of the sockets. When you see a name instead of a port number, like the .ssh in michcsdbrownu.ssh, that means that the port (22 in this case) is well-known so that the system has a name for it, and uses the name instead of the number. The last column is the "state" and that tells you something about the state of the current connection.

A simple TCP client-server: intro

There is an inherent asymmetry in making a phone call. To call Mr. X, I need to know his number, but he doesn't need to know mine. A business needs to have a well-known number, but the customers' numbers are irrelevant. In a networked application, we usually have one server process acting as the business, and its port number and hostname have to be known to anyone who wants to connect. The port number used by the other process, the client, is irrelevant ... the server will just send results straight back to whatever host/port connected to it in the first place. But to initiate the communication, the server's port number has to be known ahead of time. In fact, servers just sit around listening to their well-known port number waiting for clients to ask for connections. I have a server program, which you can run if you like, at ~wcbrown/courses/IC221/labs/L12/server that sits and listens at port 10000. What it does we'll see later. If we run that server (in the background with an & please), and try netstat again, we see a change:

michcsdbrownu$ ~wcbrown/courses/IC221/labs/L12/server &
[2] 1774
michcsdbrownu$ netstat -a

TCP: IPv4
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q    State
-------------------- -------------------- ----- ------ ----- ------ -----------
michcsdbrownu.ssh    131.122.90.34.40215  135424      0 49232      0 ESTABLISHED
michcsdbrownu.40923  nwtime.usna.edu.ldap  6064      0 49640      0 ESTABLISHED
michcsdbrownu.893    chessie.cs.usna.edu.nfsd 49640      0 49640     76 ESTABLISHED
localhost.6010       localhost.41013      49152      0 49152      0 ESTABLISHED
michcsdbrownu.6000   hercules1.usna.navy.mil.3833 65535      0 49640      0 CLOSE_WAIT
michcsdbrownu.41055  chessie.cs.usna.edu.ssh 49640      0 49640      0 TIME_WAIT
michcsdbrownu.32774  chessie.cs.usna.edu.54831 49640      0 49640      0 ESTABLISHED
michcsdbrownu.10000        *.*                0      0 49152      0 LISTEN

Notice that we now have an entry for port 10000, where our simple server is in the LISTEN state. There's no "Remote Address" yet, because no client has connected to it. I also have a simple client written, and you'll see how to write it momentarily. But if I launch the client, I should see a connection. The client knows to try to connect to port 10000, but you must give it the hostname on the command-line. So let's log in on another machine and run the client:

michcsdbrownu$ ssh mich302csd07d
Password: **********
mich302csd07d$ ~wcbrown/courses/IC221/labs/L12/client michcsdbrownu
the
THE

So, now if we look at netstat again, we see a change in the line for michcsdbrownu.10000 --- now there's a remote address, and the "state" is ESTABLISHED. Notice how the client's port number is some random big number ... its actual value is pretty much irrelevant.

michcsdbrownu$ netstat -a

TCP: IPv4
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q    State
-------------------- -------------------- ----- ------ ----- ------ -----------
michcsdbrownu.ssh    131.122.90.34.40215  135424      0 49232      0 ESTABLISHED
michcsdbrownu.40923  nwtime.usna.edu.ldap  6064      0 49640      0 ESTABLISHED
michcsdbrownu.893    chessie.cs.usna.edu.nfsd 49640      0 49640     76 ESTABLISHED
localhost.6010       localhost.41013      49152      0 49152      0 ESTABLISHED
michcsdbrownu.6000   hercules1.usna.navy.mil.3833 65535      0 49640      0 CLOSE_WAIT
michcsdbrownu.41055  chessie.cs.usna.edu.ssh 49640      0 49640      0 TIME_WAIT
michcsdbrownu.32774  chessie.cs.usna.edu.54831 49640      0 49640      0 ESTABLISHED
michcsdbrownu.10000  mich302csd07d.cs.usna.edu.32881 49640      0 49640      0 ESTABLISHED

So, to summarize: Sockets are kernel resources, which we request via the socket system call. To communicate across a network, sockets are bound to ports so that the hostname/port-number combination gives you an address for a particular socket --- only one socket has that hostname/portnumber combination, so the address is unique. Servers have to "listen" at a port number that's known ahead of time. Clients connect to a server at a well-known hostname/portnumber, and though the client uses a socket bound to some port, the actual port number is more or less irrelevant. Next, what does the client do?

A simple TCP client

A TCP client application basically just needs to call socket to create a socket, and a system call conveniently named connect to both

bind the new socket to a random port, and
establish a connection with a remote socket given by host + port.
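The first of those two jobs is easy to miss, so here is a sketch that makes it visible. The function (client_port_after_connect, a name made up for this example) sets up a throwaway listening socket on the loopback address to play the server, connects a client socket to it without ever calling bind on the client, and then asks which local port the client ended up with. The bind, listen and getsockname calls are real system calls, but they and the address structure are only explained further on in these notes, so read this for the idea rather than the details.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical demo: returns the local port (host byte order) that the
 * kernel assigned to a client socket during connect, or 0 on any error. */
unsigned short client_port_after_connect(void)
{
  struct sockaddr_in srv, cli;
  socklen_t len;
  unsigned short result = 0;
  int lfd = socket(AF_INET, SOCK_STREAM, 0);  /* stand-in "server" */
  int cfd = socket(AF_INET, SOCK_STREAM, 0);  /* the client */
  if (lfd == -1 || cfd == -1) goto done;

  /* Server side: bind to any free loopback port and start listening. */
  memset(&srv, 0, sizeof(srv));
  srv.sin_family = AF_INET;
  srv.sin_port = htons(0);
  srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  if (bind(lfd, (struct sockaddr*)&srv, sizeof(srv)) != 0) goto done;
  if (listen(lfd, 1) != 0) goto done;
  len = sizeof(srv);
  if (getsockname(lfd, (struct sockaddr*)&srv, &len) != 0) goto done;

  /* Client side: note there is no bind call here, only connect. */
  if (connect(cfd, (struct sockaddr*)&srv, sizeof(srv)) != 0) goto done;

  /* Ask the kernel which local port the client socket is bound to now. */
  len = sizeof(cli);
  if (getsockname(cfd, (struct sockaddr*)&cli, &len) != 0) goto done;
  result = ntohs(cli.sin_port);

done:
  if (lfd != -1) close(lfd);
  if (cfd != -1) close(cfd);
  return result;
}
```

The value returned is whatever port the kernel happened to pick, which is the "random big number" you see for the client in the netstat output above.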

Here's the prototype for connect:

#include <sys/types.h>
#include <sys/socket.h>

int  connect(int  s,  const  struct  sockaddr   *name,   int  namelen);

While conceptually this is quite simple, the code for it is a bit baroque because the "host + port" information is stored in a struct that isn't easy to use.

struct sockaddr_in {
 ...
      sa_family_t     sin_family;
      in_port_t       sin_port;
      struct  in_addr sin_addr;
 ...
};

struct in_addr {
 ...
      in_addr_t s_addr; ← this is just the 32-bit network byte order IP!
 ...
};

The "family" is easy for us: AF_INET to indicate IPv4. The port is easy, port 10000, though we must remember to put that number into network byte order: htons(10000). The last issue is the IP address itself, which is made difficult by the nested structs. All together, we need to do something like this:

  struct sockaddr_in mysa;
  mysa.sin_family = AF_INET;
  mysa.sin_port = htons(10000);
  mysa.sin_addr.s_addr = ???;

But what do we do to get the address? We need the address as a 32-bit int in network byte order. Well, we could hardcode the IP address as a 32-bit unsigned int, we could hardcode the IP address (or get it from argv) as a dotted quad and use inet_addr, or we could get the symbolic name (from argv) and use gethostbyname like last class.
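Here is a sketch of the dotted-quad option. The helper name fill_ipv4_addr is invented for this example; inet_addr, htons and the struct fields are the real ones discussed above. One known wart: inet_addr signals failure by returning INADDR_NONE, which is also the value of the valid address 255.255.255.255, so this sketch can't accept that one address.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/* Hypothetical helper: fill *sa with an IPv4 address given as a dotted
 * quad string plus a port in host byte order.
 * Returns 0 on success, -1 if the string is not a valid dotted quad. */
int fill_ipv4_addr(struct sockaddr_in *sa, const char *dotted,
                   unsigned short port)
{
  in_addr_t ip = inet_addr(dotted);   /* result is already network byte order */
  if (ip == INADDR_NONE) return -1;   /* also rejects 255.255.255.255 */

  memset(sa, 0, sizeof(*sa));         /* clear any padding fields */
  sa->sin_family = AF_INET;
  sa->sin_port = htons(port);         /* host order -> network order */
  sa->sin_addr.s_addr = ip;
  return 0;
}
```

After fill_ipv4_addr(&mysa, "127.0.0.1", 10000), the two bytes of mysa.sin_port in memory are 0x27 then 0x10 (10000 is 0x2710), most significant byte first, no matter what byte order the host itself uses. That is all "network byte order" means.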

Another piece of nastiness is that the second argument to connect is a struct sockaddr*, not a struct sockaddr_in*, which is what we have. And what about that third argument? Well, connect is supposed to work for several domains, not just IPv4 (a.k.a. AF_INET). In object oriented languages, "sockaddr" would be a base class and "sockaddr_in" would be a derived class --- one among several. This is C, and there's no OOP. So, we cast the struct sockaddr_in* to a struct sockaddr*, even though the two are really different types. Connect figures out what it has been passed from the family field, which sits at the start of every sockaddr_? struct, together with the third argument, which is the sizeof the actual struct being passed and tells connect how many bytes of address it has been given. So the call to connect looks like:

connect(sfd,(struct sockaddr*)&mysa,sizeof(mysa));

If successful in connecting, connect returns 0. Anything else indicates failure. Putting it all together:
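When connect does fail, it returns -1 and leaves the reason in errno, so it is worth reporting more than "could not connect". A minimal sketch of that, with the wrapper name connect_checked invented for this example:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: connect sfd to the IPv4 address in *sa.
 * Returns 0 on success. On failure, prints the reason to stderr and
 * returns the errno value that connect left behind. */
int connect_checked(int sfd, const struct sockaddr_in *sa)
{
  if (connect(sfd, (const struct sockaddr*)sa, sizeof(*sa)) != 0)
  {
    int err = errno;   /* save it: later library calls may overwrite errno */
    fprintf(stderr, "connect failed: %s\n", strerror(err));
    return err;
  }
  return 0;
}
```

A typical failure is ECONNREFUSED ("Connection refused"), which is what you should expect if the host is reachable but nothing is listening at the port you asked for, e.g. when you run the client before starting the server.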

/***********************************************************
 * Simple TCP Client.
 * This client connects to the server and sends the user
 * input to the server and echoes back what the server sends it.
 * Compile like this:  gcc -o client client.c -lnsl -lsocket
 ***********************************************************/
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>      // read, write, close
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

int main(int argc, char **argv)
{
  // Print usage
  if (argc == 1) { fprintf(stderr,"usage: %s <hostname>!\n",argv[0]); exit(1); }

  // Set up socket
  int sfd = socket(AF_INET,SOCK_STREAM,0);
  if (sfd == -1) { fprintf(stderr,"Socket not created!\n"); exit(2); }

  // Get IP address from symbolic name
  struct hostent *p;
  p = gethostbyname(argv[1]);
  if (p == NULL) { fprintf(stderr,"Name not found!\n"); exit(1); }
  unsigned int *ip = (unsigned int*)(p->h_addr_list[0]);

  // Set up address structure (zeroed first so unused fields aren't garbage)
  struct sockaddr_in mysa = {0};
  mysa.sin_family = AF_INET;
  mysa.sin_addr.s_addr = *ip;
  mysa.sin_port = htons(10000);

  // Connect!
  if (connect(sfd,(struct sockaddr*)&mysa,sizeof(mysa)) != 0)
  {
    fprintf(stderr,"Client could not connect!\n");
    exit(3);
  }

  // Communicate with server
  char inc, outc;
  while(scanf("%c",&inc) == 1)
  {
    write(sfd,&inc,1);
    read(sfd,&outc,1);
    printf("%c",outc);
  }
  close(sfd);

  return 0;
}
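The client above grabs the address by casting p->h_addr_list[0] to an unsigned int*. That works for IPv4 on the machines we use, but a slightly more defensive variation checks what gethostbyname actually returned and copies the bytes out instead of casting. This is only a sketch of that variation, and the helper name resolve_ipv4 is made up for this example.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <string.h>

/* Hypothetical helper: look up name and store its first IPv4 address
 * (network byte order) in *out.
 * Returns 0 on success, -1 if the lookup fails or the result isn't IPv4. */
int resolve_ipv4(const char *name, in_addr_t *out)
{
  struct hostent *p = gethostbyname(name);
  if (p == NULL) return -1;

  /* Make sure we really got 4-byte IPv4 addresses back. */
  if (p->h_addrtype != AF_INET || p->h_length != (int)sizeof(*out)) return -1;
  if (p->h_addr_list[0] == NULL) return -1;

  /* Copy the bytes rather than casting the char* to an int pointer. */
  memcpy(out, p->h_addr_list[0], sizeof(*out));
  return 0;
}
```

In the client you could then write something like resolve_ipv4(argv[1], &mysa.sin_addr.s_addr) in place of the gethostbyname lines. Passing a dotted quad such as "127.0.0.1" as the name works too.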

Recapping the arguments to connect

To recap: the prototype for connect is

int connect(int s, const struct sockaddr *name, int namelen);

... and calling connect always requires some casting.

If we were in an object oriented world, we'd have the following class hierarchy:
                   --- sockaddr_in
                  /
       sockaddr -<
                  \
                   --- sockaddr_in6
	
and connect's prototype would be connect(int s, sockaddr name) and we'd simply call like this: connect(sfd,mysa). The mysa argument could be either a sockaddr_in (for IPv4) or a sockaddr_in6 (for IPv6), because inheritance says both are sockaddr objects.

However, C is not an object oriented language, so the struct sockaddr* argument can only point to a struct sockaddr object. Therefore, to allow connect to take different kinds of objects as a second parameter, we cast whatever pointer we really have to a struct sockaddr* (tricking the compiler!). Then we need to somehow let the connect function know what type of object the second argument is really pointing to. The family field at the start of the struct identifies the type, and we pass the true size of the object as a third parameter so connect knows how many bytes it has been given.
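To see how a function on the receiving end of that cast can recover the real type, here is a sketch of the kind of dispatch involved. It relies on every sockaddr_? struct starting with a family field, so that field can be read safely through the generic struct sockaddr* pointer. The function name concrete_size is invented for this example, and this is only an illustration of the idea, not how any particular kernel implements connect.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper: given a generic address pointer, work out which
 * concrete struct it really points to by inspecting the family field,
 * and return that struct's size. Returns 0 for families we don't know. */
size_t concrete_size(const struct sockaddr *sa)
{
  switch (sa->sa_family)
  {
    case AF_INET:  return sizeof(struct sockaddr_in);   /* IPv4 */
    case AF_INET6: return sizeof(struct sockaddr_in6);  /* IPv6 */
    default:       return 0;
  }
}
```

So concrete_size((struct sockaddr*)&mysa) gives sizeof(struct sockaddr_in) when mysa.sin_family is AF_INET. This is the C stand-in for the "derived class" check that an object oriented language would do for us.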