SY110: Intro to the TCP/IP Stack: Application Layer



Learning Outcomes

After completing these activities, you should be able to:


Intro to Network Communications

So far, our lectures have primarily focused on our own, isolated computing environments. Now, we will study the networks and communications that transport data across the world, specifically a standard called Transmission Control Protocol (TCP)/Internet Protocol (IP). Recall from the Intro to the World-Wide-Web and HTML class that the Advanced Research Projects Agency Network (ARPANET) was the precursor to the Internet. The TCP/IP model has since become the de facto standard for the Internet as we know it today; IP is specified in Request for Comments (RFC) 791 and TCP in RFC 793.

Since the ARPANET adopted TCP/IP in 1983 (the ARPANET itself was decommissioned in 1990), many other network communication standards have fallen by the wayside because of the overwhelming dominance and acceptance of TCP/IP, while some still exist for delivering vital communications. Some of these standards are listed below:
Organization (Year)    Protocol    Standard
ARPANET (1969)         IMP         RFC1
ITU-T (1975)           X.25        Grey Book
Xerox (1976)           PUP         RFC1132
Novell (1983)          IPX         RFC1132

The TCP/IP model divides communications into layers, which together make up the TCP/IP “stack”. This layered model represents standards for networked computing systems and consists of five distinct layers: (1) Physical, (2) Data Link, (3) Network, (4) Transport, and (5) Application. Layering keeps each layer independent of the others so that, for example, we can upgrade from a CAT-5 cable to a CAT-6 cable without having to change the Ethernet protocol.

  TCP/IP STACK LAYERS        COMPUTER ARCHITECTURE
+-----------------------+     +------------------+
| Layer 5 - APPLICATION | >>> | USER/APPLICATION |
+-----------------------+     +------------------+
| Layer 4 - TRANSPORT   | >>> |  APPLICATION/OS  |
+-----------------------+     +------------------+
| Layer 3 - NETWORK     | >>> |   OS/HARDWARE    |
+-----------------------+     +------------------+
| Layer 2 - DATA LINK   | >>> |     HARDWARE     |
+-----------------------+     +------------------+
| Layer 1 - PHYSICAL    | >>> |     HARDWARE     |
+-----------------------+     +------------------+
A Basic Overview of Each TCP/IP Stack Layer
Layer 5 - Application
  What it does: Protocols such as HTTP (Hypertext Transfer Protocol), SMTP (Simple Mail Transfer Protocol), and FTP (File Transfer Protocol) enable functions like web browsing, email communication, and file transfers.
  How it interacts with other layers: Processes (programs in execution) that communicate via the Internet to provide services directly to users (web, video-conferencing, file sharing, etc.) operate at the Application Layer.

Layer 4 - Transport
  What it does: TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) are prominent protocols operating at this layer, providing different levels of reliability based on application requirements. Both TCP and UDP use source and destination port numbers.
  How it interacts with other layers: Responsible for getting data from a process on Host A to a process on Host B. Most applications have default port numbers.

Layer 3 - Network
  What it does: Responsible for the logical addressing, routing, and forwarding of data packets across different networks. The IP protocols (IPv4 and IPv6) operate at the Network Layer.
  How it interacts with other layers: Encapsulates Layer 4 segments/datagrams into packets addressed with the source and destination IP addresses.

Layer 2 - Data Link
  What it does: Manages the transmission of data packets between devices on the same network segment. It utilizes protocols like Ethernet and establishes physical/hardware addresses (MAC addresses) for effective communication within Local Area Networks (LANs).
  How it interacts with other layers: Encapsulates Layer 3 packets into Layer 2 frames addressed with the source MAC address of the current host and the destination MAC address of the host serving as the next Layer 3 hop on the current host's local-area network. Handles some error detection and correction from the physical layer.

Layer 1 - Physical
  What it does: This foundational layer deals with the transmission of raw binary data (0s and 1s) over physical media such as copper wires, fiber optics, and wireless frequencies (RF, microwave, laser). It includes devices like repeaters, antennas, and transducers essential for signal propagation and reception.
  How it interacts with other layers: Encodes and decodes between frames and 0s and 1s for transmitting through a physical medium.

Each layer of the TCP/IP Stack has a specific responsibility that allows the transition of data from one layer to another. Additionally, each layer only needs to know how to interact with the layers directly above and below it. This is broadly analogous to postal mail. When you send a letter to grandma, there are certain standards you must conform to - such as proper addressing and postage - but once you mail the letter, you are agnostic to how the post office delivers it to grandma (such as whether the post office used a train, plane, or automobile to move the letter). Meanwhile, the postal worker is not concerned with the content of your letter, but making sure it’s properly delivered from the mailbox to the post office - or from the post office to grandma’s mailbox, on the other end.

Encapsulation and Decapsulation (aka Deencapsulation). As data moves down the stack (from the application layer to the physical layer) it is encapsulated: each layer adds information to the message, such as purposefully constructed header and footer information, to ensure proper handling and routing by the next layer. As the data arrives at its destination and moves back up the stack, these headers and footers are used to route or process the application data and are then removed and discarded in a process known as decapsulation.
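
To make the idea concrete, here is a minimal Python sketch of encapsulation and decapsulation. The bracketed text headers and the function names are invented for illustration; real headers are binary structures defined by each protocol, so treat this as a toy model rather than how any actual stack is implemented.

```python
# Toy model of encapsulation/decapsulation. The "[...-HDR]" strings are
# made-up stand-ins for real (binary) protocol headers.

LAYERS = ["TRANSPORT", "NETWORK", "DATALINK"]  # order going DOWN the stack


def encapsulate(app_data: str) -> str:
    """Wrap application data in one header per layer, highest layer first."""
    message = app_data
    for layer in LAYERS:
        message = f"[{layer}-HDR]" + message
    # The data link layer also appends a trailer (footer) to its frame.
    return message + "[DATALINK-TRL]"


def decapsulate(frame: str) -> str:
    """Strip the trailer, then the headers, as data moves UP the stack."""
    message = frame.removesuffix("[DATALINK-TRL]")
    for layer in reversed(LAYERS):
        message = message.removeprefix(f"[{layer}-HDR]")
    return message


frame = encapsulate("GET /index.html")
print(frame)               # what would be handed to the physical layer
print(decapsulate(frame))  # what the receiving application sees
```

Notice that the outermost header belongs to the lowest layer, so the receiving host removes headers in the opposite order from the one in which the sender added them.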

Overview of Networking Terminology and Utilities

The following components comprise computer networks:

A Protocol for Tactical Voice Communications

The document ACP 125 gives a protocol for communications between Allied Forces on tactical voice nets, to "provide a standardized way of passing speech and data traffic." The protocol specifies such things as a phonetic alphabet ("ALFA, BRAVO, ..., ZULU"), prowords (e.g., "say again", "roger" ), how to unambiguously record a message (e.g., zero written as: Ø, letter Z written as: Ƶ), and brevity codes (e.g. the brief phrase "Birds away" means "Friendly surface-to-air missiles have been fired at the designated target").
When you pick up that VHF bridge-to-bridge radio on one of the YPs, the ACP 125 protocol tells you how you should talk!

Here's an example dialog between call signs S7 and CC:

CC: "Sierra Seven this is Charlie Charlie, over."
S7: "This is Sierra Seven, roger, over."
CC: "Sierra Seven, Charlie Charlie, immediate execute, turn starboard niner, I say again, turn starboard niner, standby ... execute, over."
S7: "This is Sierra Seven, roger, out."

Host
A host is any hardware device that has the capability of permitting access to a network via a user interface, specialized software, network address, protocol stack, or any other means. Some examples include, but are not limited to, computers, personal electronic devices, thin clients, and multi-functional devices.
Network
A network is a system implemented with a collection of interconnected components. Such components may include routers, hubs, cabling, telecommunications controllers, key distribution centers, and technical control devices.
Protocol
A protocol is a set of rules used by two or more communicating entities that describe the message order and data structures for information exchanged between the entities. Internet protocols are typically codified in RFCs and developed by the Internet Engineering Task Force (IETF), while organizations such as the International Organization for Standardization (ISO) and Institute of Electrical and Electronics Engineers (IEEE) help codify a variety of other protocols, such as Wi-Fi and Bluetooth. Example protocols:
  • ARP - Address Resolution Protocol
  • DHCP - Dynamic Host Configuration Protocol
  • HTTPS - HyperText Transfer Protocol Secure
  • ICMP - Internet Control Message Protocol
  • IP - Internet Protocol
  • NTP - Network Time Protocol
  • SMTP - Simple Mail Transfer Protocol
  • TCP - Transmission Control Protocol
  • TFTP - Trivial File Transfer Protocol
  • UDP - User Datagram Protocol
  • VoIP - Voice-over-IP
Services
A process used or provided by programs to leverage network connectivity, which may be referred to as a daemon. Services utilize and follow protocols in order to establish network connectivity for an application, such as an Apache web service (daemon) running on the Linux server that allows the client-side browser to send an HTTP GET request to access a hosted webpage. Example services:
  • Mail
  • Name Resolution
  • Web
Utilities
The utility programs used to evaluate network communications in this course are tools already available on the system, typically run within a shell environment. These utilities are used for viewing processes and network connections, assessing network connectivity, looking up network services, and configuring the network. Example utilities:
  • arp - displays IP-MAC resolution
  • ipconfig - displays network information
  • netstat - displays network connections
  • nslookup - queries the name server
  • ping - sends ICMP echo requests to see if an IP host is responding
  • tasklist - outputs running processes
  • tracert - sends ICMP echo requests with increasing time-to-live (TTL) values to identify each hop on the way to the final destination

Introduction to the Application Layer

Many terms seem to be used interchangeably, and "application" is one of them. It is important to distinguish the subtle differences, no matter how minor, and this class will clarify some of the terms and tools needed to understand Layer 5, the Application Layer. Based on the National Institute of Standards and Technology (NIST) glossary, the term "application" generally refers to a component of software that can be executed. The difference between an application and a process is that an application has yet to be run and resides in non-volatile storage, whereas a process is executing: it is being run by the Central Processing Unit (CPU) and resides in volatile Random Access Memory (RAM).

At the Application layer of the TCP/IP Stack, "applications" refers to processes (including daemons) that provide services by communicating data across a computer network.
Daemons are processes running specifically on the server side to provide a service, such as web, name resolution, or mail. Daemons usually run in the background and handle requests for the services they provide.

Gmail Scenario. Throughout this part of the course, we will be building on this "Gmail Scenario" as a demonstration of how the TCP/IP Stack works. We start with your laptop and connect to Gmail at mail.google.com. Notice that accessing mail.google.com uses a web service, not a mail service, because we use our web browser to access it. As a precursor to the next lesson, know that the default Transport Layer port for HyperText Transfer Protocol Secure (HTTPS) is TCP port 443. Before your laptop can connect to the Gmail server, it must query a DNS server to resolve the name mail.google.com to its IP address. For name resolution, the default port is UDP port 53. We'll focus more on these port numbers later. For now, we focus on the application layer, which includes HTTPS and DNS.

  1. Your laptop's application layer generates a DNS query (#1) and passes it down to your laptop's transport layer.
    **See the next lesson for what happens below the application layer.**
    The DNS server's application layer receives the deencapsulated DNS query (#1).
  2. The DNS server's application layer processes the DNS query (#1), generates a DNS response (#2), and passes the DNS response (#2) down to the DNS server's transport layer.
    **See the next lesson for what happens below the application layer.**
    Your laptop's application layer receives the deencapsulated DNS response (#2).
  3. Now that your laptop has the IP address for mail.google.com, your laptop's application layer generates an HTTPS request (#3) for the Gmail web server and passes it down to your laptop's transport layer.
    **See the next lesson for what happens below the application layer.**
    The Gmail web server's application layer receives the deencapsulated HTTPS request (#3).

Activity: Analyzing Web and Mail Exchange Services

Web and mail are two different services, and this activity will analyze network information to determine how each of the services is implemented.

  1. Open a browser and navigate to mail.google.com.
    • Click on the address bar and the scheme will be HTTPS, which is a web service.
    • Open PowerShell and resolve the domain mail.google.com by entering: nslookup mail.google.com
    • In PowerShell, run the netstat command to observe network connections to any of the IPs that match mail.google.com: netstat -an | findstr "443". If you have a lot of browser tabs open, you'll see a lot of connections using port 443.
    • TCP/443 is the default port for the HTTPS web service and is now associated with your connection to mail.google.com. Can you find a connection in the list with one of the IP addresses listed in the nslookup output?
  2. In the browser, select an email to analyze by expanding the ellipses and clicking on Show Original.
      
  3. A new tab will open displaying the original message header and details. Take a moment to read through the content and pull out information that can be used to assess technical details, such as domain names and IP addresses.
  4. Various parts of the message will identify a domain name of mx.google.com. This is the mail exchange service that is identified for organizations, using domain names similar to that of web domains. The usna.edu domain is used in both web (https://www.usna.edu) and mail services (m9999@usna.edu).
      - If you are not able to read through and find the mx value, use Ctrl+F to find it within the message.
  5. Open PowerShell and run a DNS query for the mail exchange (MX) records of google.com: Resolve-DnsName -name google.com -type mx
  6. The NameExchange value should be smtp.google.com, the host that provides the Simple Mail Transfer Protocol (SMTP), or mail, service, which operates using TCP/25.
  7. Conduct a DNS lookup using the PowerShell cmdlet Resolve-DnsName -name smtp.google.com or nslookup smtp.google.com
      NOTE: The name resolution of what was returned in the shell and the scenario will differ because IP addresses will change over time. That is the purpose of DNS, to provide the most up-to-date information based on the existing infrastructure.

Application Layer Overview

Processes. Software and code residing on non-volatile storage is not in use and has not yet been executed. It is not until a user starts the application that the OS allocates resources to it, such as CPU time and RAM. The OS also assigns a process ID (PID), which can be observed by running the command tasklist in Windows.

For example, Google Chrome is a web browser application that is installed on your computer. Once you run it by double-clicking on the application, it becomes a process that can be seen, along with the amount of memory it is currently using, by running the command tasklist | findstr "chrome". The column names do not appear when the output is filtered, but running tasklist on its own will reveal them. The PID can also be used to refine netstat output to see which network connections are being used by that process.

PS C:\Users\m9999> tasklist | findstr "chrome"
chrome.exe                   10472 Console                    1    107,896 K

C:\Users\m9999>tasklist

Image Name                     PID Session Name        Session#    Mem Usage
========================= ======== ================ =========== ============
System Idle Process              0 Services                   0          8 K
System                           4 Services                   0     14,872 K
Registry                       148 Services                   0     66,796 K
smss.exe                       616 Services                   0      1,056 K
csrss.exe                     1448 Services                   0      5,556 K
wininit.exe                   1820 Services                   0      6,496 K
csrss.exe                     1828 Console                    1      6,436 K
services.exe                  1900 Services                   0     16,828 K
lsass.exe                     1920 Services                   0     27,504 K
svchost.exe                   2044 Services                   0     48,312 K
WUDFHost.exe                  1484 Services                   0     10,084 K
fontdrvhost.exe               1496 Services                   0      3,336 K
svchost.exe                   1624 Services                   0     26,236 K
chrome.exe                   10472 Console                    1    107,896 K
tasklist.exe                   656 Console                    1      9,208 K

PS C:\Users\m9999> netstat -ano | findstr "10472"
  TCP    10.60.145.241:55813    142.251.45.98:443      ESTABLISHED     10472
  TCP    10.60.145.241:55878    104.16.19.94:443       ESTABLISHED     10472
  TCP    10.60.145.241:55969    142.250.73.196:443     ESTABLISHED     10472
  TCP    10.60.145.241:55973    172.217.1.202:443      ESTABLISHED     10472
  TCP    10.60.145.241:56343    142.251.33.206:443     ESTABLISHED     10472
  TCP    10.60.145.241:57179    172.217.13.238:443     ESTABLISHED     10472
  TCP    10.60.145.241:57308    142.251.45.98:443      ESTABLISHED     10472
  TCP    10.60.145.241:58207    142.251.45.98:443      ESTABLISHED     10472
  TCP    10.60.145.241:58729    172.217.1.195:443      ESTABLISHED     10472
  TCP    10.60.145.241:59080    157.240.229.35:443     ESTABLISHED     10472
  TCP    10.60.145.241:59750    172.217.9.197:443      ESTABLISHED     10472
  TCP    10.60.145.241:60573    142.250.65.78:443      ESTABLISHED     10472
  TCP    10.60.145.241:61205    142.250.188.33:443     ESTABLISHED     10472
  TCP    10.60.145.241:61349    142.250.188.206:443    ESTABLISHED     10472
  TCP    10.60.145.241:61778    142.250.188.202:443    ESTABLISHED     10472
  TCP    10.60.145.241:63006    199.127.207.183:443    ESTABLISHED     10472
  TCP    10.60.145.241:63785    172.253.63.188:443     ESTABLISHED     10472
  TCP    10.60.145.241:63864    208.118.62.69:443      ESTABLISHED     10472
  TCP    10.60.145.241:64634    162.247.243.146:443    ESTABLISHED     10472
  TCP    10.60.145.241:64855    172.217.12.229:443     ESTABLISHED     10472
  TCP    10.60.145.241:64957    172.217.15.74:443      ESTABLISHED     10472
  TCP    10.60.145.241:65196    142.251.45.98:443      ESTABLISHED     10472
  UDP    0.0.0.0:5353           *:*                                    10472
  UDP    0.0.0.0:5353           *:*                                    10472

Let's investigate further by conducting a DNS query into Google's Gmail web services (if you haven't already done so as part of the activity). Use the nslookup utility to query the USNA name server and see what is returned for mail.google.com. Then associate the IP address to a network connection by piping the output of netstat -ano to the findstr command and searching for the IP associated with mail.google.com.

PS C:\Users\m9999> nslookup mail.google.com
Server:  ns1.usna.edu
Address:  10.1.74.10

Non-authoritative answer:
Name:    googlemail.l.google.com
Addresses:  2607:f8b0:4004:807::2005
          172.217.12.229
Aliases:  mail.google.com

PS C:\Users\m9999> netstat -ano | findstr "172.217.12.229"
  TCP    10.60.145.241:64855    172.217.12.229:443     ESTABLISHED     10472

Based on the output, the Chrome application with the process chrome.exe has now been associated with PID 10472 and has an established network connection to mail.google.com with a DNS resolution of 172.217.12.229 over TCP/443.

We can summarize our findings based on the table below:

Application       Chrome Browser
Process           chrome.exe
PID               10472
Domain Name       mail.google.com
DNS Resolution    172.217.12.229
Protocol          TCP
Port              443
Service           Web/HTTPS
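
The same association can be made programmatically. The sketch below is only an illustration: it parses a few rows copied from the netstat -ano output above and keeps those that match a given PID and remote IP address. The function names are our own, and the parsing assumes the Windows column layout shown in this lesson.

```python
# A few rows copied from the netstat -ano output shown earlier.
SAMPLE = """\
  TCP    10.60.145.241:64855    172.217.12.229:443     ESTABLISHED     10472
  TCP    10.60.145.241:64957    172.217.15.74:443      ESTABLISHED     10472
  UDP    0.0.0.0:5353           *:*                                    10472
"""


def parse_netstat_line(line: str) -> dict:
    """Split one netstat -ano row into named fields (UDP rows have no state)."""
    fields = line.split()
    state = fields[3] if len(fields) == 5 else ""
    return {
        "proto": fields[0],
        "local": fields[1],
        "remote": fields[2],
        "state": state,
        "pid": int(fields[-1]),  # the PID is always the last column
    }


def connections_to(text: str, remote_ip: str, pid: int) -> list:
    """Return the rows owned by `pid` whose remote address is `remote_ip`."""
    rows = [parse_netstat_line(line) for line in text.splitlines() if line.strip()]
    return [
        row for row in rows
        if row["pid"] == pid and row["remote"].rsplit(":", 1)[0] == remote_ip
    ]


for row in connections_to(SAMPLE, "172.217.12.229", 10472):
    print(row["proto"], row["local"], "->", row["remote"], row["state"])
```

Comparing the whole remote IP address, rather than searching for a substring as findstr does, avoids accidentally matching a longer address that merely contains the same digits.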

Now that we've explored the client side of the connections, let's look at a daemon running on a server. Permissions limit users' ability to look into the Apache2 web service, so the SSH daemon (sshd) will suffice.

  1. Connect to the SSH server: ssh m9999@
  2. View the ssh server process: systemctl status ssh
  3. View the SSH server processes for your specific connection: run the process status command ps, use grep to filter its output for ssh processes, and use grep again to filter for your username: ps -ef | grep ssh | grep m9999
    (Your output may differ slightly, but should look very similar.)
m9999@ward-rweb-09:~$ systemctl status ssh
 ssh.service - OpenBSD Secure Shell server
     Loaded: loaded (/lib/systemd/system/ssh.service; disabled; preset: enabled)
     Active: active (running) since Sun 2023-08-27 06:17:31 EDT; 1 month 4 days ago
       Docs: man:sshd(8)
             man:sshd_config(5)
    Process: 5591 ExecStartPre=/usr/sbin/sshd -t (code=exited, status=0/SUCCESS)
   Main PID: 5598 (sshd)
      Tasks: 3 (limit: 629145)
     Memory: 182.8M
        CPU: 16min 56.072s
     CGroup: /system.slice/ssh.service
             ├─   5598 "sshd: /usr/sbin/sshd -D [listener] 1 of 10-100 startups"
             ├─2025123 "sshd: m276720 [priv]"
             └─2025124 "sshd: m276720 [net]" ""
m9999@ward-rweb-09:~$ ps -ef | grep ssh | grep m9999
root     1730983    5598  0 14:13 ?        00:00:00 sshd: m9999 [priv]
m9999  1731040 1730983  0 14:13 ?        00:00:00 sshd: m9999@notty

Ports, Protocols, and Services (PPS). Enterprise-level management of Information Technology (IT) can be extremely challenging, and implementing a PPS Management (PPSM) solution is a critical part of a configuration management program. The hundreds of thousands of services and network connections used by thousands of students, staff, and faculty can seem overwhelming, but identifying the critical PPS is a starting point for planning recovery, restoration, and reconstitution when a disaster impacts any of the tenets of cybersecurity (the CIA triad).

Would you like to know more?
Look into the DoD's approach to PPSM at the Cyber Exchange.

Security through Obscurity

Time and time again it has been shown, publicly and privately, that security through obscurity is not an effective long-term defense principle. Offering a service on a non-standard port will not thwart determined adversaries.

Application Layer Protocols

Each of the protocols below has a default Transport Layer port on which the server listens for incoming client requests. The default Transport Layer port is part of the Application Layer protocol specification. However, nothing prevents the use of any Application Layer protocol on any Transport Layer port. In order for a client to use a service, though, it needs to know the port the server is listening on. When services are offered on their default ports, clients (users) can easily find and use them. Offering a service on a port other than the default only makes it more difficult, not impossible, for users on the Internet to find the service.
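
The point that a client must know the server's port can be demonstrated with a small Python sketch. Everything here is an assumption made for illustration: the greeting text, the function names, and the choice to let the operating system pick an unused port on the loopback address (127.0.0.1) so the example does not collide with a real service.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept a single client, send a greeting, and close the connection."""
    conn, _addr = listener.accept()
    with conn:
        conn.sendall(b"hello from the service\n")


def start_service(port: int) -> socket.socket:
    """Listen on 127.0.0.1:port in a background thread (port 0 = OS chooses)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", port))
    listener.listen(1)
    threading.Thread(target=serve_once, args=(listener,), daemon=True).start()
    return listener


def ask_service(port: int) -> bytes:
    """Connect as a client; this only works if we know the right port."""
    with socket.create_connection(("127.0.0.1", port), timeout=2) as client:
        return client.recv(1024)


# Example use:
#   listener = start_service(0)            # the service picks a non-default port
#   port = listener.getsockname()[1]       # the client has to be told this number
#   print(ask_service(port))
```

A client that guesses the wrong port is normally refused, which is inconvenient for legitimate users but does little against an adversary who simply tries every port.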

Dynamic Host Configuration Protocol (DHCP)

Remember, a broadcast is a special message that is sent to all hosts on the same local network. It is similar to listening to the radio, where everyone tuned into the same frequency is simultaneously listening to the same presentation.

At this point you might be wondering, if my host (computer) needs an IP address so that it can communicate via the Internet, then why have I never had to configure, much less know, what my IP address is? Connecting to and using a network is effortless for even beginner computer users in part because of DHCP. Right now, you are using a network to access this page, but you probably did not concern yourself with configuring your IP settings and DNS server address (more to follow). That is because it was automatically done for you by a service called DHCP (Dynamic Host Configuration Protocol) provided by the network. Your computer is already configured to use DHCP, so when you plug in the Ethernet cable (or connect to a wireless network), your computer broadcasts a request for an IP address. The DHCP server replies with an IP address, subnet mask, and default gateway router for your host to use. For networks without the DHCP service, users must obtain their IP settings from the network administrator and manually configure their computers.
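
As a rough illustration of what the DHCP service does, the Python sketch below models a server that hands each requesting host an IP address, subnet mask, and default gateway. It is a toy: the class name, the 192.168.1.0/24 network, and the MAC addresses are hypothetical, and real DHCP exchanges several binary messages that are not modeled here.

```python
import ipaddress


class ToyDhcpServer:
    """Hands out addresses from one network; NOT the real DHCP message format."""

    def __init__(self, network: str, gateway: str):
        self.network = ipaddress.ip_network(network)
        self.gateway = ipaddress.ip_address(gateway)
        # Every usable host address except the gateway is available to lease.
        self.free = [host for host in self.network.hosts() if host != self.gateway]
        self.leases = {}  # MAC address -> leased IP address

    def request(self, mac: str) -> dict:
        """Reply to a host's broadcast request with its IP settings."""
        if mac not in self.leases:
            self.leases[mac] = self.free.pop(0)
        return {
            "ip_address": str(self.leases[mac]),
            "subnet_mask": str(self.network.netmask),
            "default_gateway": str(self.gateway),
        }


server = ToyDhcpServer("192.168.1.0/24", gateway="192.168.1.1")
print(server.request("aa:bb:cc:dd:ee:01"))
print(server.request("aa:bb:cc:dd:ee:02"))
```

The server keys each lease on the requester's MAC address because a host that has no IP address yet has nothing else to identify itself with.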

Domain Name System (DNS)

Domain Names. When communicating over the phone, we distinguish between a person's name and their phone number. In fact we only need the number to make a call. The name by itself isn't useful. On the other hand, the name is what we actually associate with the person. If your friend says "Who did you just call?", you say "Bill" not "410-293-9999". Phone numbers may change, but usually a person's name stays the same.

The situation on the Internet is similar: a host's IP address may often change, but its associated name usually does not. What you need to communicate with another host on the Internet is its IP address. But when we as people identify a host, it's with a name, like www.usna.edu. This kind of name is called a domain name.

Domain names are Hierarchical. They're just like paths in a file system; the only difference is that we write them the other way around: usna.edu instead of /edu/usna. In a domain name, the more specific portion is to the left, the more general portion is to the right. For example, intranet.usna.edu is a subset of usna.edu, which is in turn a subset of .edu, which is called a "top-level domain".
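
The file-system analogy can be made concrete with a few lines of Python. The two helper functions below are our own illustration, not part of any DNS library: one rewrites a domain name as a path, and the other lists the chain of domains a name belongs to.

```python
def domain_to_path(domain: str) -> str:
    """Rewrite a domain name as a file-system style path (most general first)."""
    labels = domain.rstrip(".").split(".")
    return "/" + "/".join(reversed(labels))


def parent_domains(domain: str) -> list:
    """List the name itself and each enclosing domain, most specific first."""
    labels = domain.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


print(domain_to_path("intranet.usna.edu"))   # /edu/usna/intranet
print(parent_domains("intranet.usna.edu"))   # ['intranet.usna.edu', 'usna.edu', 'edu']
```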

The top of the entire hierarchy is called the root, and the root domain has the name ".". It is common practice not to write the ending . of the domain name; in DNS the root is always inferred. The next level down in the hierarchy is the name at the right end: .edu, .com, .mil, etc. These are all examples of top-level domains. Check out this list of top-level domain names.

The www at the front of a name, like www.usna.edu usually is meant to indicate a web server, but having www at the front of a domain name doesn't make a host a web server any more than having the first name "Prince" makes you royalty. Having your web server named www is just a convention that helps clients find your publicly available services.

Why We Need It. Before you can communicate with another host on the Internet, you need an IP address for it. However, we usually have a domain name, not an IP address. So we need to consult some kind of "phonebook" equivalent to get the IP address from the symbolic name. The irony is that the DNS "phonebook" is itself another host on the Internet, and talking to it requires an IP address ... so this may seem like a chicken-egg problem, but DHCP helps with this.

What It Does. The "phonebook" of the Internet is called DNS (Domain Name System). It consists of a global system of servers, called name servers, that translate symbolic names to IP addresses either by knowing the answer or by passing the query along to a server that does. To translate a symbolic name to an IP address, you need to query a name server, which requires knowing the name server's IP address. If you only had the symbolic name of the name server, you'd be in trouble. However, when your computer joins a network, it is usually given the IP address of one or more name servers by a protocol like DHCP. You can see these addresses with the shell command ipconfig /all. Look for the line

DNS Servers. . . . . : 10.1.74.10

that contains the IP addresses of one or more name servers.

A Tool for DNS: nslookup. The nslookup utility is a shell tool (for both Windows and UNIX) that will carry out a DNS request for you. Here's an example:

PS C:\Users\m9999> nslookup courses.cyber.usna.edu
Server:  ns1.usna.edu
Address:  10.1.74.10

Name:    ward-rweb-08.academy.usna.edu
Address:  10.1.83.71
Aliases:  courses.cyber.usna.edu

From this we see that the IP address of the host courses.cyber.usna.edu is 10.1.83.71. Furthermore, the output is telling us that the name server that provided us this answer has IP address 10.1.74.10. The nslookup utility is also able to do reverse DNS requests — i.e. "here's an IP address, what's the name?". We can use that to find the name of the name server we just queried.

PS C:\Users\m9999> nslookup 10.1.74.10
Server:  ns1.usna.edu
Address:  10.1.74.10

Name:    ns1.usna.edu
Address:  10.1.74.10

From this we see that the server at 10.1.74.10 has the name ns1.usna.edu.
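
Programs can make the same kinds of requests without calling nslookup. The Python sketch below uses the standard socket module; the function names are our own. One caveat to keep in mind: these calls ask the operating system's resolver, which may consult local sources (such as a hosts file) in addition to the name server, so the answers can differ slightly from what nslookup reports.

```python
import socket


def forward_lookup(name: str) -> list:
    """Return the sorted, de-duplicated IP addresses the OS resolver gives for a name."""
    results = socket.getaddrinfo(name, None)
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({entry[4][0] for entry in results})


def reverse_lookup(ip_address: str) -> str:
    """Return the primary host name associated with an IP address."""
    hostname, _aliases, _addresses = socket.gethostbyaddr(ip_address)
    return hostname


# Example use (the results depend on your network and name server):
#   forward_lookup("courses.cyber.usna.edu")
#   reverse_lookup("10.1.74.10")
```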

Querying Other Name servers. Normally, nslookup will query the name server listed by the call to ipconfig /all to do DNS lookups. However, if you call nslookup with a second argument that is the name or IP address of a name server, nslookup will query that name server instead. So, for example:

nslookup www.google.com 8.8.8.8

... actually causes my host to contact 8.8.8.8 to resolve the name www.google.com. However, if I run this request inside the USNA network, it can't complete because, for security reasons, USNA does not want DNS requests to be fulfilled by outside (potentially untrusted) name servers. This is another example of how using a service opens you up to a vulnerability. DNS, like most of the protocols in the TCP/IP Stack, accepts the first reply received. This means that protocol designers and tool developers have to take extra steps to ensure that a reply received is valid.

Name Resolution in Action. It's worthwhile thinking a bit about what happens when you send your browser to a website. When you enter

https://www.archive.org/
in your browser's address bar, the browser is supposed to send a request to the web server www.archive.org (specifically an HTTP GET request). But that can't happen until the browser finds out what IP address goes with that name. In fact, for some websites, you can enter the IP address directly into the browser's address bar, like this
https://207.241.224.2
... and you'll get the website. If you use the symbolic name, however, the browser first makes a DNS request to a name server to get the IP address for the name www.archive.org, and then actually sends the HTTP GET request to the web server. If you don't have access to a name server, and you know only a web site's URL, not its IP address, you can't access the web site!

DNS servers listen on port 53. For most queries, DNS uses UDP rather than TCP. If DNS is so crucial to Internet communications, then why does it use an unreliable protocol like UDP? The answer is in the question: because DNS is so crucial to Internet communications and is used heavily by a majority of hosts on the Internet, the protocol needs to be efficient. If DNS used TCP for every query, the overhead associated with TCP would severely reduce the efficiency of networking. This should become clearer once we learn about TCP and UDP, which are Transport (Layer 4) protocols.
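
To see how small a DNS request is, the Python sketch below builds a query by hand and sends it in a single UDP datagram to port 53. The message layout follows the DNS specification (RFC 1035), but the function names and the fixed query ID are our own choices, the reply is returned as raw bytes without being parsed, and this is a sketch rather than a complete resolver.

```python
import socket
import struct


def build_dns_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query asking for the IPv4 address (A record) of `name`."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A (IPv4 address), QCLASS 1 = IN (Internet).
    return header + qname + struct.pack("!HH", 1, 1)


def send_dns_query(name: str, server_ip: str, timeout: float = 2.0) -> bytes:
    """Send the query as one UDP datagram to port 53 and return the raw reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_dns_query(name), (server_ip, 53))
        reply, _server = sock.recvfrom(512)
        return reply


query = build_dns_query("usna.edu")
print(len(query), "bytes:", query.hex())
```

The entire request fits in a few dozen bytes and needs no connection setup, which is the efficiency argument for UDP made above. The timeout matters because UDP gives no notice if the query or the reply is lost.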

DNS is a complicated system, with millions of servers spread out across the earth. Suppose you query your local name server for www.foo.com. The general scheme works like this: There are 13 root name servers. If your name server doesn't know the IP address of www.foo.com, it sends a query to one of the 13 root name servers governed by the Internet Assigned Numbers Authority (IANA). The root name server refers your name server to the name server for the .com top-level domain. The .com name server in turn refers it to the name server for the foo.com domain, and that name server ought to be able to give the IP address for www.foo.com. If this much traffic were required for every name resolution, the Internet would be a much slower place. Instead, name servers remember the answers to queries they've answered recently in a cache.
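
The benefit of caching can be shown with a toy model in Python. Nothing below talks to real name servers: the dictionaries stand in for the root, the .com name server, and the foo.com name server, the server labels are invented, and 203.0.113.10 is an address from a range reserved for documentation.

```python
# Toy name servers. Each dictionary maps a name to a referral or an answer.
ROOT = {"com": "tld-com"}                       # root refers us to the .com server
SERVERS = {
    "tld-com": {"foo.com": "ns-foo"},           # .com refers us to foo.com's server
    "ns-foo": {"www.foo.com": "203.0.113.10"},  # foo.com's server knows the answer
}


class CachingResolver:
    """A local name server that remembers answers it has already looked up."""

    def __init__(self):
        self.cache = {}
        self.queries_sent = 0

    def resolve(self, name: str) -> str:
        if name in self.cache:          # answered from the cache: no traffic at all
            return self.cache[name]
        labels = name.split(".")
        self.queries_sent += 1          # query 1: ask a root name server
        tld_server = ROOT[labels[-1]]
        self.queries_sent += 1          # query 2: ask the top-level domain server
        domain_server = SERVERS[tld_server][".".join(labels[-2:])]
        self.queries_sent += 1          # query 3: ask the domain's own name server
        address = SERVERS[domain_server][name]
        self.cache[name] = address
        return address


resolver = CachingResolver()
print(resolver.resolve("www.foo.com"), resolver.queries_sent)  # 203.0.113.10 3
print(resolver.resolve("www.foo.com"), resolver.queries_sent)  # 203.0.113.10 3
```

The second lookup sends no new queries because the answer is already in the cache.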

From a security perspective, it's crucial that DNS works properly. If the name bankwithallmymoney.com gets resolved incorrectly to an IP address owned by a bad guy, I could be in trouble. He could put up a dummy web page that looks just like bankwithallmymoney.com's but isn't, and he could perhaps steal my password ... and then my money.

HyperText Transfer Protocol (HTTP) and HyperText Transfer Protocol Secure (HTTPS)

HTTP (HyperText Transfer Protocol) is the Application Layer protocol that the web is based on. HTTP servers (web servers) use TCP and listen on port 80. The most familiar HTTP clients are web browsers. HTTPS (HTTP Secure) is a more secure version of HTTP, and HTTPS servers listen on TCP port 443. It employs Transport Layer Security (TLS), a set of cryptographic protocols which support authentication as well as encryption of the Application Layer HTTP data. HTTP traffic is sent in the clear, meaning that anyone who receives the message can read the data in it. Remember that a packet will flow through many networks as it makes its way from Host A to Host B. HTTPS, on the other hand, encrypts the data so that only someone who knows the key can read the contents of the message. Without the key, you may be able to receive the encrypted data, but it will look like gibberish.

The protocol behind the web, HTTP, governs the interaction between web servers and web clients (browsers). Browsers can send messages like

GET /prices.html HTTP/1.1

to a server (of course you need its IP address to send it this message!). In the above GET request, /prices.html is the path and file name of the file the web client is requesting to view. The web server process accepts the GET request and looks in its file system for /prices.html. The HTTP protocol specifies exactly what a request can look like and what response it should elicit from the server. For example, the server might send back the message

HTTP/1.1 404 Not Found

which indicates that it did not have a file prices.html available to send.
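The exchange above can be sketched in a few lines: one function assembles the text of a GET request, and another splits a status line into its parts. The function names and the host www.example.com are assumptions for illustration. The Host header is included because HTTP/1.1 requires it. No network traffic is sent.

```python
def build_get_request(path: str, host: str) -> bytes:
    """Assemble the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",        # required by HTTP/1.1
        "Connection: close",
        "",                     # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(line: str):
    """Split a status line into (version, numeric code, reason phrase)."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


request = build_get_request("/prices.html", "www.example.com")
version, code, reason = parse_status_line("HTTP/1.1 404 Not Found")
print(code, reason)  # prints "404 Not Found"
```

A browser would send bytes like these over a TCP connection to port 80 on the server and then read the status line from the start of the reply.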

Secure SHell (SSH)

There are SSH clients for Windows, too, like PuTTY, which is freely available.

SSH (Secure SHell) is a protocol that allows secure, remote command shell access. In this setting, secure means providing confidentiality and authentication: nobody snooping on the network traffic can read your password or other information sent back and forth during the session. Generally, you use SSH like this: ssh username@hostname, e.g. ssh m9999@ssh.cyber.usna.edu. You'll be prompted for a password, and assuming you give the right one, you have a shell on the remote host (ssh.cyber.usna.edu in the example).

SSH is a client/server system just like the web (HTTP). For example, there is an SSH server process running as a daemon on the remote host, listening on port 22 for connection requests. On Windows, the ssh command is an SSH client. When you run it as ssh username@hostname, the client resolves the host name to an IP address, makes a TCP connection to that IP address on port 22, and from that point on communicates with the server process using the SSH protocol to carry out your shell commands.

So, you already have an SSH client installed on your machine (PuTTY). You just need to pull up a Windows shell and start up the SSH client with:

ssh m9999@
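The first steps an SSH client takes (split the target into user and host, resolve the host name, open a TCP connection to port 22) can be sketched as follows. The function names are hypothetical, and this is not an SSH implementation: it stops where the SSH protocol itself would begin. The connection function is defined but deliberately not called here.

```python
import socket

SSH_PORT = 22  # the port SSH servers listen on, as stated in the text


def parse_target(target: str):
    """Split 'username@hostname' into its two parts."""
    user, sep, host = target.partition("@")
    if not sep or not user or not host:
        raise ValueError("expected username@hostname")
    return user, host


def open_ssh_transport(target: str, timeout: float = 5.0) -> socket.socket:
    """Resolve the host name and open the TCP connection a client would use."""
    _user, host = parse_target(target)
    # create_connection performs the name lookup and the TCP connect.
    return socket.create_connection((host, SSH_PORT), timeout=timeout)


user, host = parse_target("m9999@ssh.cyber.usna.edu")
print(user, host, SSH_PORT)  # prints "m9999 ssh.cyber.usna.edu 22"
```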

RDP

The "Remote Desktop Protocol" (RDP) allows users to access the Graphical User Interface (GUI) of a remote system. It encrypts the session just like SSH, except that the full desktop is shown instead of only a shell or command-line interface (CLI). It uses TCP on port 3389. The Windows RDP client is just called "Remote Desktop Connection." Just like HTTP, DNS, and SSH, this is a client/server system. The Windows host you want to connect to must have an RDP server process listening on port 3389, waiting for connection requests from remote desktop clients.

SFTP

Secure File Transfer Protocol (SFTP), formally the SSH File Transfer Protocol, offers encrypted file transfer. You've already used WinSCP, which is an SFTP client, to transfer web content. SFTP uses TCP port 22, just like SSH, because SFTP runs on top of an SSH connection. The "secure" in SFTP comes from the fact that traffic is encrypted, so an eavesdropper can't snoop on it. The protocol also provides for authentication.

SMB

The SMB (Server Message Block) protocol is an Application Layer protocol used for network file sharing. There are two popular implementations of the SMB protocol: Microsoft SMB and Samba. Windows systems use Microsoft SMB to share files over the network, which is what most people are familiar with. Unix systems use Samba, primarily to access resources shared using the Microsoft SMB protocol. Both implementations are based on the original SMB protocol and are compatible with one another. SMB servers usually listen on TCP port 445.

SMTP

SMTP (Simple Mail Transfer Protocol) is an Application Layer protocol for transporting email. An SMTP client is responsible for transferring electronic mail (email) messages to one or more SMTP servers. Most often, one server uses SMTP to forward an email to another server. Sometimes, email client software (like Outlook or Thunderbird, but not webmail in a browser) will use SMTP to transmit a message to the server of the sending user's email service provider. Message transfer can occur in a single connection between the original SMTP sender and the final SMTP recipient, or can occur in a series of hops through intermediary systems. DNS is used to identify "mail exchange" (MX) hosts that act as SMTP servers (or relays).

Other Services/Protocols

FTP (21) - File Transfer Protocol for non-secure transfer of files (pre-dates SFTP, circa 1971)

TELNET (23) - provides a non-secure bi-directional ASCII-text communication connection to a remote host (pre-dates SSH, circa 1969, i.e., the very early days of the Internet).

IMAP (143) - Internet Message Access Protocol, for email clients to manage email held on servers.

IRC (6667) - Internet Relay Chat, real-time chat.
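The port numbers mentioned in this lesson can be collected into a small lookup table. This is only a convenience sketch: the dictionary name and helper function are made up for illustration, and the entries are limited to the ports stated above.

```python
# Port numbers as given in this lesson (DNS mostly over UDP, the rest over TCP).
WELL_KNOWN_PORTS = {
    "FTP": 21,
    "SSH": 22,
    "SFTP": 22,
    "TELNET": 23,
    "DNS": 53,
    "HTTP": 80,
    "IMAP": 143,
    "SMB": 445,
    "RDP": 3389,
    "IRC": 6667,
}


def services_on_port(port: int):
    """Return the names of all listed services that use the given port."""
    return sorted(name for name, p in WELL_KNOWN_PORTS.items() if p == port)


print(services_on_port(22))  # prints "['SFTP', 'SSH']"
```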

Important: Each network service that a host provides corresponds to a running process listening for requests on a specific TCP or UDP port.

Each of these processes is a "server." Each process provides services to other hosts, but each process also represents a potential avenue into the host for attackers. These server processes are programs reading input from other hosts, often outside the local network. They are expecting input that follows the rules of whichever protocol the service uses. But what happens when input doesn't follow the rules? We know how hard it is to write programs that deal gracefully with all possible inputs!
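One way a server can defend itself is to check every piece of input against the protocol's rules before acting on it. The sketch below validates an HTTP-style request line and rejects anything unexpected. The function names, the set of allowed methods, and the 8192-byte length limit are assumptions chosen for illustration, not requirements taken from a particular server.

```python
ALLOWED_METHODS = {"GET", "HEAD", "POST"}   # assumed subset for this sketch
ALLOWED_VERSIONS = {"HTTP/1.0", "HTTP/1.1"}
MAX_LINE_LENGTH = 8192                      # assumed limit


def parse_request_line(raw: bytes):
    """Return (method, path, version), or raise ValueError on bad input."""
    if len(raw) > MAX_LINE_LENGTH:
        raise ValueError("request line too long")
    try:
        text = raw.decode("ascii")
    except UnicodeDecodeError:
        raise ValueError("request line is not ASCII") from None
    parts = text.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError("expected METHOD PATH VERSION")
    method, path, version = parts
    if method not in ALLOWED_METHODS:
        raise ValueError("unsupported method")
    if not path.startswith("/"):
        raise ValueError("path must start with /")
    if version not in ALLOWED_VERSIONS:
        raise ValueError("unsupported version")
    return method, path, version


def is_rejected(raw: bytes) -> bool:
    """Report whether the parser refuses the given input."""
    try:
        parse_request_line(raw)
    except ValueError:
        return True
    return False


print(parse_request_line(b"GET /prices.html HTTP/1.1\r\n"))
print(is_rejected(b"GET /prices.html"))  # prints "True": version is missing
```

A server written this way answers malformed input with an error instead of passing it on to code that assumes well-formed requests.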


Supplemental Media:



Application Layer Functionality and Protocols


Review Questions:

  1. What are the FIVE layers of the TCP/IP Stack?
  2. How do the TCP/IP layers interact with each other?
  3. How do encapsulation and decapsulation take place within the TCP/IP Stack?
  4. What are the networking terms and associated utilities that are used to assess and configure networked systems?
  5. What is the purpose of the Application Layer?
  6. What does DHCP do for us?
  7. What network configuration items are provided by DHCP?
  8. Why do we need DNS?
  9. How does DNS work?
  10. What utility is used to look up domain names?


References

  1. OmniSecu. (2008). "TCP/IP Encapsulation and Decapsulation." [Online]. Available: http://www.omnisecu.com/tcpip/tcpip-encapsulation-decapsulation.php
  2. Medium. (2018). "Intro To Computer Networking And Internet Protocols." [Online]. Available: https://medium.com/@sadatnazrul/intro-to-computer-networking-and-internet-protocols-8f03710ca409
  3. S. Tatham, "PuTTY: a free SSH and Telnet client", Jul 8, 2019.
  4. Internet Corporation for Assigned Names and Numbers, "List of Top-Level Domains", Jul 2019.