Networks II


Reading
APUE sections 16.3.1 - 16.3.3.

Homework
Print out the homework and answer the questions on that paper.

The network stack
Throughout this course, we've viewed a computer system as three layers: the physical machine, the kernel, and user processes. Our processes talk only to the kernel, and the physical machine talks only to the kernel, so the kernel acts as an intermediary. The protocol governing communication between a user process and the kernel is the set of system calls, i.e. the set of services the kernel provides.

Networks make this layered view very explicit: there are more layers, and they are very well defined. The idea is the same, though. A layer communicates only with the layers directly above and below it, and the protocol governing that communication is defined by the services one layer provides to another. The set of layers is called "the stack". The internet stack is:

application layer
transport layer
network layer
link layer
physical layer
	  
The application layer is where our user processes live. HTTP, the protocol by which web servers and web clients communicate, is thus called an application layer protocol. Any communication between processes on different machines gets passed down from layer to layer on the sending host until it is transmitted via the physical layer, then passed back up through all the layers on the receiving host until the receiving process gets the message.

DNS: the phonebook of the internet
Before you can communicate with another host on the internet, you need an IP address for it. However, we usually have a domain name for the host, not an IP address. So we need to consult some kind of "phonebook" equivalent to get the IP address from the symbolic name. The irony is that consulting the "phonebook" means talking to another host, and that requires an IP address ... there's a whole chicken-egg thing here.

The "phonebook" of the internet is called DNS (Domain Name System). It consists of a global system of servers that translate symbolic names to IP addresses, either by knowing the answer or by passing the query along to a server that does. Since we're talking about server processes communicating with client processes, we see that DNS is actually an application layer protocol, like HTTP, the protocol for web servers and clients. To translate a symbolic name to an IP address, you need to query a nameserver. This requires knowing the nameserver's IP address. If you only had the symbolic name of the nameserver, you'd be in trouble. However, the query starts by looking in a file called /etc/resolv.conf that contains the IP addresses of one or more nameservers.

The nslookup utility is a command-line tool that will carry out a DNS request for you.

bash$ nslookup bear.cs.usna.edu
Server:         131.122.88.2
Address:        131.122.88.2#53

bear.cs.usna.edu        canonical name = mich301csdbrownu.cs.usna.edu.
Name:   mich301csdbrownu.cs.usna.edu
Address: 131.122.89.4
	  

The gethostbyname Network Services Library function
You can ask the OS to carry out a DNS request from within a program using the gethostbyname function in the Network Services Library. It returns a pointer to a struct hostent object, or NULL if the lookup fails. This struct is a bit more indirect than you'd expect, because a domain name might have more than one IP address associated with it. Also, the host might have other symbolic names (aliases) associated with it. Here's the struct:
struct	hostent {
	char	*h_name;	/* canonical name of host */
	char	**h_aliases;	/* alias list */
	int	h_addrtype;	/* host address type */
	int	h_length;	/* length of address */
	char	**h_addr_list;	/* list of addresses from name server:
				   the raw addresses, not dotted quads! */
	/* Just to muddy things up a bit: although h_addr_list has type
	   char**, they really mean void**, in that an entry such as
	   h_addr_list[0] could point to any kind of object. For IPv4 it
	   really points to a 4-byte unsigned int in network byte order,
	   so you must cast that pointer to (unsigned int*) to use it. */

};
	    
The member h_length is the number of bytes in the address, i.e. 4 for IPv4 and 16 for IPv6. The member h_addr_list is a NULL-terminated array. Each element of that array is a pointer to an IP address, so for IPv4 it really points to an object of type unsigned int (stored in network byte order), but we must cast the pointer to use it that way.

/*******************************************
 * sym2ip <symbolic hostname>
 * This program takes a symbolic hostname
 * as its sole command-line argument and
 * produces a list of all the IP addresses
 * corresponding to that name, both as an
 * integer and as a dotted quad.
 * Compile as: gcc -o sym2ip sym2ip.c -lnsl
 * (-lnsl is needed on Solaris; on Linux it can usually be omitted)
 *******************************************/
#include <stdio.h>
#include <stdlib.h>
#include <netinet/in.h>
#include <arpa/inet.h>  /* inet_ntoa, ntohl */
#include <netdb.h>
int main(int argc, char **argv)
{
  if (argc != 2) { printf("%s <name>\n",argv[0]); exit(1); }

  /* Get host entry */
  struct hostent *H;
  H = gethostbyname(argv[1]);
  if (H == NULL) { printf("Host not found!\n"); exit(2); }
  
  /* Get ip addresses */
  int i;
  for(i = 0; H->h_addr_list[i] != NULL ; ++i)
  {
    unsigned int *ip = (unsigned int*)H->h_addr_list[i];
    struct in_addr addr;            /* inet_ntoa takes a struct in_addr, */
    addr.s_addr = *ip;              /* so wrap the raw address in one    */
    printf("%u\n",ntohl(*ip));      /* Integer IP address     */
    printf("%s\n",inet_ntoa(addr)); /* Dotted quad IP address */
  }
  return 0;
}