Application Layer

Discussion

Introduction to the Application Layer

Recall from our introduction to networking that the Application Layer is where programs that provide end-user services reside in the TCP/IP Stack. The programs that you use on a daily basis, such as web browsers and email clients, operate at the Application Layer. We will look at some protocols that you know you use every day, and some that you probably didn't know you use every day. Just like the lower-layer protocols in the TCP/IP Stack, all of the protocols we discuss here provide a service, and all of them are based on a client-server model.

Application Layer Protocol Basics

The Application Layer is at the top of the TCP/IP Stack, but in practice that does not mean that an Application Layer protocol or service is used only by the user. In fact, as we continue to expand the use of the Internet and technology, more and more services at the Application Layer rely on other Application Layer protocols or services in order to provide their own service. For example, Application A, at the Application Layer, will make use of Application B, also at the Application Layer, in order to provide its (Application A's) service.

Security through Obscurity

Time and time again it has been shown, publicly and privately, that security through obscurity is not an effective long-term defense principle. Offering a service on a non-standard port will not thwart determined adversaries.

Each of the protocols below has a default Transport Layer port on which the server listens for incoming client requests. The default Transport Layer port is part of the Application Layer protocol specification. However, nothing prevents the use of any Application Layer protocol on any Transport Layer port; that is, any of the protocols below can be used on any Transport Layer port. But in order for a client to use the service being provided, it needs to know the port the server is listening on. When services are offered on the default port, clients (users) can easily find and use them. Offering services on other than default ports only makes it more difficult, but not impossible, for users on the Internet to find the service.
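As a minimal sketch of this point, the following Python example runs a toy echo service on a port chosen by the operating system rather than on any well-known default. The service itself and the names start_echo_server and ask are hypothetical, invented for illustration. The only thing it is meant to show is that the client succeeds as soon as it knows which port to use.

```python
import socket
import threading

def start_echo_server(port=0):
    """Start a one-shot TCP echo server; port 0 lets the OS pick any free port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", port))
    srv.listen(1)
    actual_port = srv.getsockname()[1]  # the port actually in use

    def serve_once():
        conn, _ = srv.accept()
        with conn:
            conn.sendall(conn.recv(1024))  # echo the request back
        srv.close()

    threading.Thread(target=serve_once, daemon=True).start()
    return actual_port

def ask(port, message):
    """Client side: works on any port, provided the client knows which one."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as c:
        c.sendall(message)
        return c.recv(1024)

port = start_echo_server()
reply = ask(port, b"hello")
print(f"service answered on port {port}: {reply!r}")
```

An adversary who does not know the port can simply try every port in turn, which is why moving a service off its default port is an inconvenience rather than a defense.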

Dynamic Host Configuration Protocol (DHCP)

A broadcast is a special message that is sent to all hosts on the same local network. It is similar to a live newscast on television, where everyone tuned into the same channel is simultaneously watching the same presentation.

At this point you might be wondering, if my host (computer) needs an IP address so that it can communicate via the Internet, then why have I never had to configure, much less know, what my IP address is? Connecting to and using a network is effortless for even beginner computer users in part because of DHCP. Right now, you are using a network to access this page, but you probably did not concern yourself with configuring your IP settings and DNS server address (more to follow). That is because it was automatically done for you by a service called DHCP (Dynamic Host Configuration Protocol) provided by the network. Your computer is already configured to use DHCP, so when you plug in the Ethernet cable (or connect to a wireless network), your computer broadcasts a request for an IP address. The DHCP server replies with an IP address, subnet mask, and default gateway router for your host to use. For networks without the DHCP service, users must obtain their IP settings from the network administrator and manually configure their computers.

DHCP runs over UDP, using ports 67 and 68. So, why UDP for DHCP? Well, let's think about that a little. TCP certainly has a great feature in that it provides reliable communications between the two hosts. Recall that TCP is connection oriented: the two hosts need to synchronize before they can start exchanging data. That synchronization cannot occur if one of the hosts does not have an IP address, and the whole reason a client uses DHCP is to get an IP address. Therefore, DHCP cannot use TCP; it must use UDP. UDP allows a host to send a message without a connection; a host does not need to establish a connection in order to create a UDP datagram and an associated IP packet.
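The connectionless nature of UDP can be seen in a few lines of Python. This is only a sketch, not a DHCP client: a real client broadcasts its request to the local network from port 68 to port 67, whereas the example below assumes the loopback address and an OS-chosen port so that it can be run safely on any machine. The message text is made up.

```python
import socket

# Receiver: bind a UDP socket to a loopback port chosen by the OS.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(5)
addr = receiver.getsockname()

# Sender: no connect() and no handshake; the very first packet carries data.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"request for configuration", addr)

data, source = receiver.recvfrom(1024)
print(data, "from", source)

sender.close()
receiver.close()
```

With TCP, the sender would first have to complete a handshake with the receiver before any data could flow, which is exactly the step a host without an IP address cannot perform.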

Domain Name System (DNS)

Shell commands resource. nslookup requires an Administrator shell.

Domain Names. When communicating over the phone, we distinguish between a person's name and their phone number. In fact we only need the number to make a call. The name by itself isn't useful. On the other hand, the name is what we actually associate with the person. If your friend says "Who did you just call?", you say "Bill" not "410-293-9999". Phone numbers may change, but usually a person's name stays the same.

The situation on the Internet is similar: a host's IP address may often change, but its associated name usually does not. What you need to communicate with another host on the Internet is its IP address. But when we as people identify a host, it's with a name, like www.usna.edu. This kind of name is called a domain name.

Domain Names Are Hierarchical. They're just like paths in a file system; the only difference is that we write them the other way around: usna.edu instead of /edu/usna. In a domain name, the more specific portion is to the left and the more general portion is to the right. For example, intranet.usna.edu is a subset of usna.edu, which is in turn a subset of .edu, which is called a "top-level domain".
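The file-system analogy can be made concrete with a short Python sketch. The function names domain_to_path and parent_domains are hypothetical helpers written for this illustration, not part of any DNS library.

```python
def domain_to_path(name):
    """Rewrite a domain name as a file-system-style path, most general part first."""
    labels = name.rstrip(".").split(".")   # drop the optional trailing root dot
    return "/" + "/".join(reversed(labels))

def parent_domains(name):
    """List the domain and every domain that contains it, most specific first."""
    labels = name.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]

print(domain_to_path("intranet.usna.edu"))   # /edu/usna/intranet
print(parent_domains("intranet.usna.edu"))   # ['intranet.usna.edu', 'usna.edu', 'edu']
```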

The top of the entire hierarchy is called the root, and the root domain has the name ".". It is common practice not to write the trailing . of a domain name; in DNS the root is always inferred. The next level down in the hierarchy is the name at the right end: .edu, .com, .mil, etc. These are all examples of top-level domains. Check out this list of top-level domain names.

The www at the front of a name, like www.usna.edu usually is meant to indicate a web server, but having www at the front of a domain name doesn't make a host a web server any more than having the first name "Prince" makes you royalty. Having your web server named www is just a convention, a convention that helps clients find your publicly available services.

Why We Need It. Before you can communicate with another host on the Internet, you need an IP address for it. However, we usually have a domain name, not an IP address. So we need to consult some kind of "phonebook" equivalent to get the IP address from the symbolic name. The irony is that the DNS "phonebook" is itself another host on the Internet, and talking to it requires an IP address ... so there's a whole chicken-egg thing here.

What It Does. The "phonebook" of the Internet is called DNS (Domain Name System). It consists of a global system of servers, called name servers, that translate symbolic names to IP addresses either by knowing the answer or by passing the query along to a server that does. To translate a symbolic name to an IP address, you need to query a name server, which requires knowing the name server's IP address. If you only had the symbolic name of the name server, you'd be in trouble. However, when your computer joins a network, it is usually given the IP address of one or more name servers by a protocol like DHCP. You can see these addresses with the shell command ipconfig /all. Look for the line

DNS Servers. . . . . : 10.1.74.10

that contains the IP addresses of one or more name servers.

A Tool for DNS: nslookup. The nslookup utility is a shell tool (for both Windows and UNIX) that will carry out a DNS request for you. Here's an example:

$ nslookup yog.academy.usna.edu
Server:		ns1.usna.edu
Address:	10.1.74.10

Name:	yog.academy.usna.edu
Address: 10.1.83.30

From this we see that the IP address of the host yog.academy.usna.edu is 10.1.83.30. Furthermore, the output is telling us that the name server that provided us this answer has IP address 10.1.74.10. The nslookup utility is also able to do reverse DNS requests — i.e. "here's an IP address, what's the name?". We can use that to find the name of the name server we just queried.

$ nslookup 10.1.74.10
Server:		ns1.usna.edu
Address:	10.1.74.10

From this we see that the name server at 10.1.74.10 has the name ns1.usna.edu.
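Behind the scenes, a reverse request is an ordinary DNS lookup on a specially constructed name: the octets of the IP address are reversed and placed under the in-addr.arpa domain. A minimal Python sketch, using the standard library's ipaddress module and the name server address from the example above:

```python
import ipaddress

addr = ipaddress.ip_address("10.1.74.10")
reverse_name = addr.reverse_pointer   # the name a reverse DNS request asks about
print(reverse_name)                   # 10.74.1.10.in-addr.arpa
```

The octets are reversed so that the most general part of the address ends up on the right, matching the way domain names are written.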

Querying Other Name Servers. Normally, nslookup will query the name server listed by the call to ipconfig /all to do DNS lookups. However, if you call nslookup with a second argument that is the name or IP address of a name server, nslookup will query that name server instead. So, for example:

$ nslookup www.google.com 8.8.8.8

... actually causes my host to contact 8.8.8.8 to resolve the name www.google.com. However, if I run this request inside the USNA network, it can't complete, because (for security reasons) USNA does not want DNS requests to be fulfilled by outside (potentially untrusted) name servers. This is another example of how using a service opens you up to a vulnerability. DNS, like most of the protocols in the TCP/IP Stack, accepts the first reply received. This means that protocol designers and tool developers have to take extra steps to ensure that a reply received is valid.

Name Resolution in Action. It's worthwhile thinking a bit about what happens when you send your browser to a website. When you enter

http://www.martinguitar.com/

in your browser's address bar, the browser is supposed to send a request to the web server www.martinguitar.com (specifically an HTTP GET request, more to follow). But that can't happen until the browser finds out what IP address goes with that name. In fact, you can enter the IP address directly into the browser's address bar, like this

http://209.235.218.124

... and you'll get the website. If you use the symbolic name, however, the browser first makes a DNS request to a name server to get the IP address for the name www.martinguitar.com, and then actually sends the HTTP GET request to the web server. If you don't have access to a name server, and you know only a web site's URL, not its IP address, you can't access the web site!

DNS servers listen on port 53. DNS queries are normally sent over UDP rather than TCP (TCP is used as a fallback, for example when a response is too large for a single datagram). If DNS is so crucial to Internet communications, then why does it use an unreliable protocol like UDP? Well, the answer is in the question: because DNS is so crucial and is used so heavily by the majority of hosts on the Internet, the protocol needs to be efficient. If every lookup used TCP, the overhead of setting up and tearing down a connection would severely reduce the efficiency of networking. This should become clearer once we go through how DNS resolves an IP address from a domain name.
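To get a feel for how little is needed, here is a minimal sketch that builds the bytes of a DNS query by hand. The function name build_dns_query and the query ID 0x1234 are arbitrary choices made for this illustration; the layout (a 12-byte header followed by the length-prefixed name, the record type, and the class) follows the standard DNS message format.

```python
import struct

def build_dns_query(name, query_id=0x1234):
    """Build the bytes of a DNS query asking for the A (IPv4 address) record of name."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length; a zero byte marks the root.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = Internet.
    return header + qname + struct.pack("!HH", 1, 1)

query = build_dns_query("www.usna.edu")
print(len(query), "bytes:", query.hex())
```

The whole question fits in 30 bytes. Sending it would be a single sendto call on a UDP socket addressed to port 53 of a name server, with the answer coming back in a single datagram and no connection setup in either direction.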

DNS is a complicated system, with millions of servers spread out across the earth. Suppose you query your local name server for www.foo.com. The general scheme works like this: There are 13 root name servers (13 named server addresses, each of which is in practice served by many machines). If your name server doesn't know the IP address of www.foo.com, it sends a query to one of the root name servers, which refers it to a name server for the .com top-level domain. The .com name server will send you to the name server for the foo.com domain, and that name server ought to be able to give the IP address for www.foo.com. If this much traffic were required for every name resolution, the Internet would be a much slower place. Instead, name servers remember in a cache the answers to queries they've answered recently.
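The effect of the cache can be sketched with a toy model in Python. Nothing here touches the network: the "servers" are dictionaries, and every server name and the address 192.0.2.10 are made up for illustration. Real name servers also expire cached answers after a while, which this sketch leaves out.

```python
# Toy referral data: each "server" maps a name either to a referral or to an answer.
# All names and the address are made up for illustration.
ROOT = {"com": "com-server"}
SERVERS = {
    "com-server": {"foo.com": "foo-server"},
    "foo-server": {"www.foo.com": "192.0.2.10"},
}

class LocalNameServer:
    def __init__(self):
        self.cache = {}
        self.queries_sent = 0   # how many queries went out to other servers

    def resolve(self, name):
        if name in self.cache:                 # answered recently: no traffic at all
            return self.cache[name]
        labels = name.split(".")
        self.queries_sent += 1                 # ask a root server
        tld_server = ROOT[labels[-1]]
        self.queries_sent += 1                 # ask the top-level domain server
        domain_server = SERVERS[tld_server][".".join(labels[-2:])]
        self.queries_sent += 1                 # ask the domain's own server
        address = SERVERS[domain_server][name]
        self.cache[name] = address
        return address

ns = LocalNameServer()
first = ns.resolve("www.foo.com")
sent_after_first = ns.queries_sent
second = ns.resolve("www.foo.com")
print(first, sent_after_first, ns.queries_sent)   # 192.0.2.10 3 3
```

The first lookup costs three queries to other servers; the second costs none, because the answer is already in the cache.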

From a security perspective it's crucial that DNS works properly. If the name bankwithallmymoney.com gets resolved incorrectly to an IP address owned by a bad guy, I could be in trouble. He could put up a dummy web page that looks just like bankwithallmymoney.com's, but which isn't and he could perhaps steal my password ... and then my money.

HyperText Transfer Protocol (HTTP) and HyperText Transfer Protocol Secure (HTTPS)

HTTP (HyperText Transfer Protocol) is the Application Layer protocol that the web is based on. HTTP servers (web servers) use TCP and listen on port 80. HTTP clients are called web browsers. HTTPS (HTTP Secure) is a more secure version of HTTP. It employs Transport Layer Security (TLS), a set of cryptographic protocols that support authentication as well as encryption of the Application Layer HTTP data. HTTP traffic is sent in the clear, meaning that anyone who receives the message can read the data in it. Remember, a packet will flow through many networks as it makes its way from Host A to Host B. HTTPS, on the other hand, encrypts the data (we will talk about encryption later in the course) such that only someone who knows the key can read the contents of the message. If you don't have the key to decrypt an encrypted message, then you cannot make sense of the data. You may be able to receive and read the encrypted data, but encrypted data is gibberish.

The protocol behind the web, HTTP, governs the interaction between web servers and web clients (browsers). Browsers can send messages like

GET /prices.html HTTP/1.1

to a server (of course you need its IP address to send it this message!). In the above GET request, /prices.html is the path and file name of the file the web client is requesting to view. The web server process accepts the GET request and looks in its file system for /prices.html. The HTTP protocol specifies exactly what this request can look like and what response it should elicit from the server. For example, the server might send back the message

HTTP/1.1 404 Not Found

which indicates that it did not have a file prices.html available to send.
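Because HTTP messages are plain text, both sides of this exchange are easy to sketch in Python. The helper names build_get_request and parse_status_line, and the host www.example.com, are assumptions made for illustration; a real browser sends many more headers than the two shown here.

```python
def build_get_request(host, path):
    """Return the bytes a browser-like client would send for a simple GET."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",          # HTTP/1.1 requires the Host header
        "Connection: close",
        "",                       # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def parse_status_line(line):
    """Split a status line such as 'HTTP/1.1 404 Not Found' into its three parts."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason

request = build_get_request("www.example.com", "/prices.html")
print(request.decode("ascii").splitlines()[0])      # GET /prices.html HTTP/1.1
print(parse_status_line("HTTP/1.1 404 Not Found"))  # ('HTTP/1.1', 404, 'Not Found')
```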

Secure SHell (SSH)

There are SSH clients for Windows, too, like PuTTY, which is freely available.

SSH (Secure SHell) is a protocol that allows secure, remote command shell access. In this setting, secure means preserving confidentiality and authentication. Nobody snooping on the network traffic can read off your password or other information that gets sent back and forth during the session. Generally, you use ssh like this: ssh username@hostname, e.g. ssh mxxxxxx@flux.academy.usna.edu. You'll be prompted for a password, and assuming you give the right one, you have a shell on the remote host (flux.academy.usna.edu in the example).

SSH is a client/server system just like the web (HTTP). For example, there is an ssh-server process running on rona listening on port 22 for connection requests. On Windows, the ssh command (which is actually an alias for a program called PuTTY) is an ssh-client. When you run it as ssh rona.academy.usna.edu, the client resolves the name rona.academy.usna.edu to an IP address, makes a TCP connection to that IP address on port 22, and from that point on follows (communicates using) the SSH protocol with the server process to carry out your shell commands.

So, you already have an ssh client installed on your machine (PuTTY). You just need to pull up a Windows shell and start up the ssh client with:

ssh mxxxxxx@rona.academy.usna.edu

RDP

The "Remote Desktop Protocol" is sort of like the Windows equivalent of ssh, except that you get a full desktop on the remote machine instead of just a shell. It uses TCP on port 3389. In this class, we won't do this from one Windows host to another, but we will be doing this later from a UNIX host to a Windows host using the UNIX tool rdesktop. The Windows RDP client is just called "Remote Desktop Connection". Just like HTTP, DNS, and SSH, this is a client/server system. The Windows host you want to connect to must have an RDP server process listening on port 3389, waiting for connection requests from remote desktop clients.

SFTP

SFTP (often expanded as Secure File Transfer Protocol; formally the SSH File Transfer Protocol) offers secure file transfer. You've already used WinSCP, which is an SFTP client, to transfer web content over to rona. SFTP uses TCP port 22, just like SSH. That's because SFTP actually runs on top of SSH. So the "secure" in SFTP comes from the fact that traffic is encrypted, so an eavesdropper can't snoop in. The protocol also provides for authentication.

SMB

The SMB (Server Message Block) protocol is an Application Layer protocol used for network file sharing. There are two popular implementations of the SMB protocol: Microsoft SMB and Samba. Windows systems use Microsoft SMB to share files over the network, which is what most people are familiar with. Unix systems use Samba, primarily to access resources shared using the Microsoft SMB protocol. Both versions are based on the original SMB protocol and are compatible with one another. SMB servers usually listen on port 445. The SMB protocol runs over TCP.

SMTP

SMTP (Simple Mail Transfer Protocol) is an Application Layer protocol for transporting email. An SMTP client is responsible for transferring electronic mail (email) messages to one or more SMTP servers. Message transfer can occur in a single connection between the original SMTP sender and the final SMTP recipient, or can occur in a series of hops through intermediary systems. DNS is used to identify hosts that act as SMTP servers (or relays).

Other Services/Protocols

Important: Each network service a host provides corresponds to a process sitting and listening for requests at a specific TCP or UDP port.

Each of these processes is a "server". Each process provides services to other hosts, but each process also represents a potential avenue into the host for attackers. These server processes are programs reading input from other hosts, often outside the local network. They are expecting input that follows the rules of whichever protocol the service uses. But what happens when input doesn't follow the rules? We know how hard it is to write programs that deal gracefully with all possible inputs!
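As a rough sketch of what "dealing gracefully" means in practice, the Python function below checks a single HTTP-style request line before acting on it. The function names, the length limit, and the particular checks are assumptions chosen for illustration; a real server has to validate far more than this.

```python
MAX_LINE = 1024
ALLOWED_METHODS = {"GET", "HEAD"}

def parse_request_line(raw):
    """Return (method, path) for a well-formed request line, else raise ValueError."""
    if len(raw) > MAX_LINE:
        raise ValueError("request line too long")
    try:
        text = raw.decode("ascii")
    except UnicodeDecodeError:
        raise ValueError("request line is not ASCII") from None
    parts = text.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError("expected: METHOD PATH VERSION")
    method, path, version = parts
    if method not in ALLOWED_METHODS:
        raise ValueError("unsupported method")
    if not path.startswith("/") or ".." in path:
        raise ValueError("suspicious path")
    if version != "HTTP/1.1":
        raise ValueError("unsupported version")
    return method, path

def is_accepted(raw):
    """True if the request line passes every check, False otherwise."""
    try:
        parse_request_line(raw)
        return True
    except ValueError:
        return False

print(is_accepted(b"GET /prices.html HTTP/1.1\r\n"))       # True
print(is_accepted(b"GET /../../etc/passwd HTTP/1.1\r\n"))  # False
print(is_accepted(b"\xff\xfe garbage"))                     # False
```

Every check that is missing from a server like this is a place where unexpected input reaches code that was written assuming well-behaved clients.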

Summarizing it all

The following table summarizes all of this service/protocol/port/tool information. You need to be able to match up the service to the protocol name to the standard tool for each table entry. Why? Because these are all so fundamental. You only need to memorize the ports that are highlighted.

| Service | Protocol | Port | TCP/UDP | Software/Tools |
|---|---|---|---|---|
| Network Connectivity / Host Status | ICMP | --* | -- | ping |
| World Wide Web | HTTP | 80 | TCP | browsers |
| Secure Web | HTTPS | 443 | TCP | browsers |
| Name Resolution | DNS | 53 | UDP | nslookup |
| Secure Remote Shell | SSH | 22 | TCP | ssh (PuTTY) |
| Remote Desktop (Windows) | RDP | 3389 | TCP | rdesktop (UNIX), Remote Desktop Connection (Windows) |
| Secure Remote File Transfer | SFTP | 22** | TCP | WinSCP, FileZilla, CyberDuck |
| Dynamic Host Configuration Protocol*** | DHCP | 67, 68**** | UDP | built into operating systems |
| Network file & printer sharing | SMB | 445 | TCP | File browser (see "map network drive" in Windows) |
| email | SMTP | 25 | TCP | any email client: Outlook, Thunderbird, Mail, browser, etc. |

* Ping doesn't use the Transport Layer, so there's no associated port

** Same port as SSH because it actually uses SSH

*** Means you can join a network "on the fly" and get assigned an IP address and find a DNS server

**** Port 67 is for the server, 68 is for the client