SY110: Intro to the TCP/IP Stack: Data Link Layer




Learning Outcomes

After completing these activities, you should be able to:


Introduction to the Data Link Layer

The Data Link Layer interfaces with the Physical Layer by translating physical network signals and characteristics into logical ones. It also conducts error checking and builds tables of unique identifiers that reference hardware addressing schemes. The Data Link Layer is limited to Local Area Networks (LANs) because the data used within this layer is discarded during deencapsulation on the way up to the Network Layer. A host's Network Interface Card (NIC) has a chip that contains its Media Access Control (MAC) address, which must be unique within a network. At Layer 2 of the TCP/IP Stack, communications take place using MAC addresses; Request for Comments (RFC) 894 defines how Internet Protocol (IP) datagrams are transmitted over Ethernet networks.

Overview of Ethernet

The Institute of Electrical and Electronics Engineers (IEEE) defines Ethernet standards under 802.3. The 802.3 Working Group contributes improvements, tracks issues, and prepares future drafts of the Ethernet specifications. This course focuses only on Ethernet standards for LANs, which interconnect devices within a limited area such as a residence, campus, or building. Other data-link technologies outside of the IEEE 802 standards include Fiber Distributed Data Interface (FDDI), X.25, Frame Relay, and Asynchronous Transfer Mode (ATM).

Five common network topology configurations. The network topology of a LAN can be configured as Point-to-Point, Bus, Ring, Star, or Mesh. All are used with Ethernet, but it is uncommon to see an Ethernet ring network implemented.

Network topologies can be configured in many ways, but five common LAN implementations are (1) point-to-point, (2) bus, (3) ring, (4) star, and (5) mesh. Hybrid or combined topologies can benefit organizations, depending on the technological requirements established when designing a network. For example, Nimitz Library has many workstations available for students to conduct research throughout the main floor. Today, each has a separate cable connecting it to a switch in a data closet, forming a star topology. Decades ago, however, running a separate cable to the data closet for each host may have been cost prohibitive, whereas a bus or ring architecture allows each host to be connected with much less cable. For WiFi connections throughout Hopper Hall, Wireless Access Points (WAPs) allow each host to connect directly to the Wireless Local Area Network (WLAN), thus implementing a star network.

Ethernet frames are used for hosts to communicate within a LAN. Construction of an Ethernet frame is as follows:
+----------------+-------------+-------------+----+----------------------------------------+---------+
|    PREAMBLE    |  DEST ADDR  |  SRC ADDR   |TYPE|                  DATA                  |   FCS   |
+----------------+-------------+-------------+----+----------------------------------------+---------+
|----8 Bytes-----|---6 Bytes---|---6 Bytes---|-2B-|---------------~48 Bytes----------------|-4 Bytes-|
The purpose of the preamble is to inform a receiving host that the incoming signal is an intentional transmission and not interference or noise. Its first seven bytes contain a sequence of alternating 1s and 0s, and its last byte, 10101011 (the Start Frame Delimiter), indicates the end of the preamble.

The destination and source addresses are used to communicate between hosts and network devices. At the data link layer, the addressing scheme is derived from MAC addresses, which are embedded into all NICs by their manufacturers. A NIC is a peripheral installed in a system that allows the device to be connected to a network; this includes LANs, WLANs, and Wireless Personal Area Networks, as in the case of short-range wireless communications. Although the 802.3 specifications are written only for Ethernet LANs, MAC addresses of NICs are not limited to Ethernet.
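To make the header fields concrete, here is a minimal Python sketch that splits the first 14 bytes of a frame into destination address, source address, and type. The function names and the sample frame are hypothetical, chosen only for illustration. The sketch assumes the preamble and FCS have already been stripped, which is usually the case for frames handed to software.

```python
import struct

def format_mac(raw):
    """Render 6 raw bytes in the hyphenated style that Windows prints."""
    return "-".join(f"{b:02X}" for b in raw)

def parse_ethernet_header(frame):
    """Split a frame into destination, source, type, and payload.

    Assumption: the frame starts at the destination address, with no
    preamble in front and no FCS at the end.
    """
    if len(frame) < 14:
        raise ValueError("an Ethernet header needs at least 14 bytes")
    # "!" = network byte order; 6s6sH = two 6-byte fields and one 2-byte integer
    dest, src, ether_type = struct.unpack("!6s6sH", frame[:14])
    return {
        "dest": format_mac(dest),
        "src": format_mac(src),
        "type": f"0x{ether_type:04X}",
        "payload": frame[14:],
    }

# Hypothetical frame: broadcast destination, the example workstation MAC as
# the source, type 0x0800 (IPv4), and placeholder payload bytes.
example_frame = (
    bytes.fromhex("FFFFFFFFFFFF")
    + bytes.fromhex("C03EBAAF5E43")
    + struct.pack("!H", 0x0800)
    + b"payload-bytes"
)
header = parse_ethernet_header(example_frame)
print(header["dest"], header["src"], header["type"])
```

Everything after the 14-byte header is treated as the encapsulated data.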

The data portion of an Ethernet frame is not fixed at the roughly 48 bytes shown above; its size is determined by the Layer 3 packet that Layer 2 needs to encapsulate. Under IEEE 802.3, the data field carries between 46 and 1500 bytes, with padding added when the packet is shorter than the minimum. Remember that a frame encapsulates all the data assembled from the layers above Data Link.
Error checking and credit card verification. Credit card numbers entered into systems are validated with the Luhn algorithm, similar to how error checking in computing uses modulo arithmetic or other mathematical algorithms.

The Frame Check Sequence (FCS) of an Ethernet frame is used to implement error detection. The sender computes a 32-bit cyclic redundancy check (CRC) over the frame and places the result in the FCS field; the receiver recomputes the value and discards the frame if the two do not match. Much as the Luhn algorithm verifies credit card numbers, the FCS reveals that data was damaged during transmission, but it cannot correct missing or incorrect data. Recovery, such as retransmission, is left to higher layers.
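The detection step can be sketched in Python with the standard library's `zlib.crc32`, which uses the same CRC-32 polynomial as Ethernet. The helper names, the sample bytes, and the byte order chosen for the trailing checksum are assumptions made for illustration; a real NIC computes and checks the FCS in hardware.

```python
import zlib

def append_fcs(frame_without_fcs):
    """Append a 4-byte CRC-32 checksum computed over the frame contents."""
    fcs = zlib.crc32(frame_without_fcs)
    return frame_without_fcs + fcs.to_bytes(4, "little")

def fcs_is_valid(frame_with_fcs):
    """Recompute the checksum and compare it with the trailing 4 bytes."""
    body, received = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return zlib.crc32(body).to_bytes(4, "little") == received

sent = append_fcs(b"hypothetical header and data bytes")

# Simulate noise on the wire by flipping a single bit in transit.
corrupted = bytearray(sent)
corrupted[5] ^= 0x01

print(fcs_is_valid(sent))              # True
print(fcs_is_valid(bytes(corrupted)))  # False
```

The check only reports that the frame is damaged. It does not say which bit changed, so the receiver can do nothing except drop the frame.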

MAC Addressing Schemes

MAC Address Scheme. The OUI and Network Interface Controller parts of a MAC address together provide a globally unique identifier.

A MAC address is a unique identifier that allows network communications to be sent to a specific host. It is 6 bytes (48 bits) in length and has two parts: (1) a 3-byte Organizationally Unique Identifier (OUI) and (2) a 3-byte Network Interface Controller-specific portion (not to be confused with the Network Interface Card, which is the physical network card that has the MAC address embedded by the manufacturer).
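As a quick illustration, the following Python sketch splits a MAC address into its two halves and prints its bits. The function names are hypothetical; the address is the example workstation MAC used in this lesson.

```python
def split_mac(mac):
    """Return (oui, nic_specific) for a MAC written with '-' or ':' separators."""
    octets = mac.replace(":", "-").upper().split("-")
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}: {mac!r}")
    for octet in octets:
        if len(octet) != 2:
            raise ValueError(f"each octet needs two hex digits: {octet!r}")
        int(octet, 16)  # raises ValueError if the octet is not hexadecimal
    return "-".join(octets[:3]), "-".join(octets[3:])

def mac_to_bits(mac):
    """Write each octet as 8 binary digits, most significant bit first."""
    octets = mac.replace(":", "-").split("-")
    return " ".join(f"{int(octet, 16):08b}" for octet in octets)

oui, nic_specific = split_mac("C0-3E-BA-AF-5E-43")
print(oui)           # C0-3E-BA
print(nic_specific)  # AF-5E-43
print(mac_to_bits("C0-3E-BA-AF-5E-43"))
```

The first value is what an OUI lookup tool takes as input; the second is the part the vendor assigns.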

Before building NICs, a manufacturer must request OUIs from the IEEE, which is the Registration Authority for MAC addresses under EUI-48. Navigate to the Wireshark OUI Lookup website and determine the manufacturer of the device for each MAC address in the illustration.

Fingerprinting Systems

The OUI of a MAC address can potentially identify the manufacturer of a device, an IP address can be associated with an organization, and port connections can reveal the service and OS in use. The operations of the TCP/IP Stack make possible the interconnected world we live in, but they can also reveal a lot about a host. Putting these artifacts together to fingerprint a system, including behaviors unique to an OS such as ephemeral port usage, can lead to the discovery of system vulnerabilities, the identification of cyber-personas associated with devices, and eventually exploits that can be delivered to target systems.

A 48-bit MAC address allows 2^48 possible values, but because of how OUIs are arranged, each vendor works within a much smaller space of 2^24 values per OUI. An OUI assignment allocates the first half of the MAC address to a specific company, and the second half is used by that company for distribution. The workstation with the MAC address below has an OUI of C0-3E-BA, which is owned by Dell, Inc., with the remaining three octets assigned by the vendor.

|Organizationally Unique Id|  Network Int Controller  | field
|   C0   |   3E   |   BA   |   AF   |   5E   |   43   | hex
|11000000 00111110 10111010 10101111 01011110 01000011| bits
|    |                                           |   |
|    most-significant-byte  least-significant-byte   |
most-significant-bit             least-significant-bit
Knowledge Check: If a MAC address is 6 bytes in length, how many bits is that?


Networking tools like Wireshark capture and interpret data transmitted across networks. Check out this captured Ethernet information:


Notice that the OUI for the host and network device has already been identified and the critical information for the frame captured includes the source and destination addresses.

You might have noticed that a port number is much smaller than a MAC address or an IP address. There's some rationale behind that. Each network interface has a unique MAC address, and MAC addresses are allocated by manufacturer: hardware manufacturers are assigned large blocks of MAC addresses for the hardware they make. If you were a hardware vendor, would you want to be able to produce and sell a small number or a large number of devices? A large number, of course. Additionally, having a large address allows new networking technologies to operate using the same concepts; e.g., Wi-Fi and Bluetooth both adopted the 48-bit MAC address from Ethernet. Likewise, we need a large number (32 bits for IPv4, or 128 bits for IPv6) to uniquely identify all the hosts connected to the global Internet.

But how big does a port number need to be? Is 16 bits enough? Well, how much multitasking do you like to do? Better yet, how much multitasking do you think you can do? Can you manage 65 thousand processes all running at the same time? A single process (like a web server) can have many sockets, so the comparison of processes to ports fits clients better than servers. In practice, 16 bits to represent the processes communicating on a host has not been an issue.
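To compare these sizes directly, here is a small Python sketch that counts how many distinct values each field width allows. The dictionary and function names are hypothetical; the bit widths are the ones named in the text.

```python
def address_space(bits):
    """Number of distinct values an unsigned field of this width can hold."""
    return 2 ** bits

FIELD_WIDTHS = {
    "port number": 16,
    "vendor half of a MAC address": 24,
    "IPv4 address": 32,
    "MAC address": 48,
    "IPv6 address": 128,
}

for name, bits in FIELD_WIDTHS.items():
    print(f"{name}: {bits} bits -> {address_space(bits):,} values")
```

A 16-bit port number gives 65,536 values per host, while a 48-bit MAC address gives 281,474,976,710,656 values across all hardware.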

Layer 2 Network Devices

Circuit Diagram of a Switch - Data Link Layer. ASICs are responsible for processing frames for a certain number of ports, allowing for the management of layer 2 data and isolating network traffic through collision domains, so that a frame reaches only the host identified by its destination address.

Switches are the primary Data Link (Layer 2) devices. Switches contain Application-Specific Integrated Circuits (ASICs), chips built to perform a specific set of tasks, which are extremely efficient and cost effective. Hosts communicating on the network will send data only to the destination address. This is possible with a switch because it maintains Content Addressable Memory (CAM) tables that match the MAC addresses of all directly connected devices to the switch ports to which each connects. Any time a host sends a frame to another host, the switch references the CAM table and directs traffic to that specific port, thus isolating collision domains (a Layer 1 issue) and allowing for increased confidentiality on LANs.
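The CAM table lookup can be sketched as a small Python simulation. The class and method names are hypothetical, and the sketch assumes typical learning-switch behavior that this lesson does not spell out: the switch learns the port for each source MAC it sees and floods a frame out of every other port when the destination is unknown or is the broadcast address.

```python
BROADCAST = "FF-FF-FF-FF-FF-FF"

class LearningSwitch:
    """Toy model of a Layer 2 switch with a MAC-to-port (CAM) table."""

    def __init__(self, port_count):
        self.ports = list(range(1, port_count + 1))
        self.cam_table = {}  # MAC address -> switch port

    def receive(self, in_port, src_mac, dest_mac):
        """Learn the sender's port, then return the ports to forward out of."""
        self.cam_table[src_mac.upper()] = in_port
        out_port = self.cam_table.get(dest_mac.upper())
        if dest_mac.upper() == BROADCAST or out_port is None:
            # Unknown or broadcast destination: flood to all other ports.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination is on the arrival port; nothing to forward
        return [out_port]

laptop, router = "C0-3E-BA-AF-5E-43", "88-3A-30-A2-FC-80"
switch = LearningSwitch(port_count=4)
print(switch.receive(1, laptop, router))  # [2, 3, 4]  router not learned yet
print(switch.receive(2, router, laptop))  # [1]        laptop was learned on port 1
print(switch.receive(1, laptop, router))  # [2]        router was learned on port 2
```

Once both addresses are in the table, frames between the two hosts appear only on their own ports, which is the isolation the paragraph above describes.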

Basic Small Office Home Office (SOHO) devices leverage a Central Processing Unit (CPU) to manage MAC address tables and are typically limited to 4-8 ports, whereas commercial switches can have 48 ports in a single device or hundreds of ports in expandable units.

Activity: Network Utilities

Take a look at some of the Data Link layer properties, such as your MAC address, on your current system.

  1. Open PowerShell and run ipconfig /all
  2. The Windows Operating System (OS) identifies the system's MAC address as the Physical Address. Locate that and look up the manufacturer of the Ethernet NIC.
  3. Look up the manufacturer of your wireless NIC.

C:\Users\m9999>ipconfig /all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : host01
   Primary Dns Suffix  . . . . . . . : academy.usna.edu
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : academy.usna.edu
                                       usna.edu

Ethernet adapter Ethernet:

   Connection-specific DNS Suffix  . : academy.usna.edu
   Description . . . . . . . . . . . : Realtek USB GbE Family Controller #2
   Physical Address. . . . . . . . . : C0-3E-BA-AF-5E-43
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   IPv4 Address. . . . . . . . . . . : 10.60.145.241(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Lease Obtained. . . . . . . . . . : Wednesday, October 6, 2023 1:03:30 AM
   Lease Expires . . . . . . . . . . : Friday, October 22, 2023 1:03:36 AM
   Default Gateway . . . . . . . . . : 10.60.145.1
   DHCP Server . . . . . . . . . . . : 10.1.74.10
   DNS Servers . . . . . . . . . . . : 10.1.74.10
   NetBIOS over Tcpip. . . . . . . . : Enabled

Apple is a large technology company that is involved in the manufacturing process of their own products, where the devices and OUIs match. Many systems that rely on global supply chains will likely source parts from other manufacturers, as is the case for your issued computers. Do a little more research to determine where the company that manufactured your Ethernet and wireless NICs is headquartered. What other products does that company make and who are they also likely to supply parts to?

Address Resolution Protocol

For Layer 2 to know what MAC address to put into the destination address field of the frame, it uses the Address Resolution Protocol (ARP). Recall that MAC addresses are associated with hardware-specific NICs, while IP uses logical addresses. Every time a host receives a frame from another host in its IP network, it caches the source MAC-IP address match in its ARP table. If a host does not have a cached entry for a destination, it transmits an ARP request in which the source and destination IP addresses and the source MAC address are known, and the frame's destination MAC address is the broadcast address, FF:FF:FF:FF:FF:FF. When the host whose IP address matches the ARP request's destination IP address receives the request, it replies with an ARP response in which all Layer 2 and Layer 3 addresses are known. Upon receiving the ARP response, the host that sent the ARP request caches the destination's IP-MAC address association in its ARP table and is then able to form the frame for the IP packet that has been waiting to be transmitted to its next hop.
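To show which addresses are known and unknown in a request, here is a minimal Python sketch that assembles the bytes of an ARP request inside an Ethernet frame, following the field layout in RFC 826. The function name and the sender MAC address are hypothetical placeholders. The sketch only builds the bytes and sends nothing on the network.

```python
import ipaddress
import struct

BROADCAST_MAC = bytes.fromhex("FFFFFFFFFFFF")

def build_arp_request(sender_mac, sender_ip, target_ip):
    """Return a 42-byte Ethernet frame carrying an ARP request."""
    # Ethernet header: broadcast destination, sender as source, type 0x0806 (ARP)
    ethernet_header = BROADCAST_MAC + sender_mac + struct.pack("!H", 0x0806)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IPv4
        6,        # hardware address length in bytes
        4,        # protocol address length in bytes
        1,        # operation: 1 = request
        sender_mac,
        ipaddress.IPv4Address(sender_ip).packed,
        bytes(6),  # target MAC is unknown, so it is left as zeros
        ipaddress.IPv4Address(target_ip).packed,
    )
    return ethernet_header + arp_payload

# Hypothetical sender MAC; the IP addresses come from the capture in this lesson.
frame = build_arp_request(
    sender_mac=bytes.fromhex("020000000001"),
    sender_ip="10.60.145.238",
    target_ip="10.60.145.241",
)
print(len(frame))         # 42
print(frame[:6].hex())    # ffffffffffff
```

The only blank field is the target MAC address, which is exactly what the ARP response fills in.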

Below is an example of viewing an ARP table for your local host. Run the arp -a command in PowerShell on your computer to see how many ARP resolutions have taken place. Why aren't there more addresses in the ARP table?

PS C:\Users\m9999> arp -a

Interface: 10.60.145.241 --- 0x3
  Internet Address      Physical Address      Type
  10.60.145.1           88-3a-30-a2-fc-80     dynamic
  224.0.0.22            01-00-5e-00-00-16     static
  224.0.0.251           01-00-5e-00-00-fb     static
  239.255.255.250       01-00-5e-7f-ff-fa     static
  255.255.255.255       ff-ff-ff-ff-ff-ff     static

The network captures below illustrate an ARP request and ARP response.

Ethernet Broadcast Containing ARP Request Data. The host 10.60.145.238 sent a layer 2 broadcast to FF:FF:FF:FF:FF:FF to see what MAC has an IP of 10.60.145.241.

Ethernet Frame Containing ARP Reply Data. Source MAC and IP and Destination MAC and IP are contained in this ARP reply data within an Ethernet frame.

All devices within the LAN receive ARP broadcasts but reply only if the requested IP address matches their own. Some network devices, including routers and printers, often send ARP broadcasts to keep their entries current in the ARP tables of other devices.

Knowledge Check: Based on the ARP request to determine who has 10.60.145.241, what is the MAC address of the system that would respond with an ARP reply? Take a look at the screenshots above to find the answer.


ARP Poisoning (aka ARP Spoofing). ARP does not provide any inherent authentication mechanism. As a result, if an adversary can physically connect their device to a network, they can inject ARP packets claiming the target device's IP address correlates to the attacker's MAC address. By 'poisoning' the ARP caches of other devices on the network, an attacker can intercept and modify traffic intended for the target.
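From a defender's point of view, one simple warning sign is an IP address whose MAC address suddenly changes. The Python sketch below scans a list of observed IP-MAC pairs for such changes. The function name and the second MAC address are hypothetical; the gateway entry comes from the ARP table shown earlier. A change is not proof of an attack, since replacing a NIC would look the same.

```python
def find_arp_conflicts(observations):
    """Return (ip, old_mac, new_mac) for every IP whose MAC address changed.

    observations: (ip, mac) pairs in the order they were seen in ARP replies.
    """
    seen = {}
    conflicts = []
    for ip, mac in observations:
        mac = mac.lower()
        previous = seen.get(ip)
        if previous is not None and previous != mac:
            conflicts.append((ip, previous, mac))
        seen[ip] = mac
    return conflicts

observed = [
    ("10.60.145.1", "88-3a-30-a2-fc-80"),  # gateway entry from the ARP table
    ("10.60.145.1", "88-3A-30-A2-FC-80"),  # same MAC in different letter case
    ("10.60.145.1", "02-00-00-00-00-66"),  # hypothetical attacker MAC
]
for ip, old_mac, new_mac in find_arp_conflicts(observed):
    print(f"{ip} moved from {old_mac} to {new_mac}")
```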

How DHCP works at Layer 2

In the previous lesson, we explored how DHCP works at Layer 3. The process is simpler at Layer 2 because no assignments or reconfigurations occur at Layer 2. The Layer 2 addresses are already embedded in the host's NICs. As such, frames carrying DHCP traffic are addressed much like ARP request frames, that is, the Layer 2 source address is known and the destination address is the Layer 2 broadcast address, FF:FF:FF:FF:FF:FF. When the DHCP client receives a DHCP offer, it inserts the IP-MAC address pair into its ARP table.

Gmail Scenario

So far, we've covered how the Application Layer provides services, the Transport Layer connects these services end-to-end, and the Network Layer provides connectivity from one network device to another across multiple network hops. Now, we take one step deeper into the TCP/IP Stack by exploring how the Data Link Layer provides connectivity between devices within a LAN. The introduction of switching in this lesson means that our scenario now needs to incorporate switches. As our diagram continues to grow, you may need to scroll to see its right end.

  1. Your laptop's application layer generates a DNS query (#1) and passes it down to your laptop's transport layer.
  2. Your laptop's transport layer encapsulates the DNS query (#1) in a UDP datagram (#2) and passes the datagram to your laptop's network layer. The source port number is an arbitrary ephemeral port number, and the destination port number is the default for DNS (53).
  3. Your laptop's network layer encapsulates the UDP datagram (#2) in an IP packet (#3) and passes the packet to your laptop's data link layer. The source IP address is your laptop's IP address (10.60.145.241), and the destination IP address is the DNS server's IP address (10.1.74.10).
  4. Your laptop's data link layer encapsulates the IP packet (#3) into an Ethernet frame (#4) and passes the frame to your laptop's physical layer. The source MAC address is your laptop's MAC address, and the destination MAC address is the USNA router's MAC address.
    Why the router? You may have noticed that your laptop and the DNS server are in different IP subnets; therefore, we need a layer 3 device (a router) to route between these subnets.
    What about the switch? Each Layer 2 device examines the Layer 2 frame header of each frame received, compares the destination MAC address to its CAM table, and decides which way it should forward the frame.
    **See the next lesson for what happens below the data link layer.**
    The USNA router's data link layer receives the Ethernet frame (#4), deencapsulates the IP packet (#3), and passes the IP packet (#3) up to the USNA router's network layer.
  5. The USNA router examines the IP packet (#3), decides how to route the packet, and passes the packet to the USNA router's data link layer. The USNA router's data link layer encapsulates the IP packet (#3) into an Ethernet frame (#5) and passes the frame to the USNA router's physical layer. The source MAC address is the USNA router's MAC address, and the destination MAC address is the DNS server's MAC address.
    **See the next lesson for what happens below the data link layer.**
    The DNS server's data link layer receives the frame (#5), deencapsulates the IP packet (#3), and passes the IP packet (#3) up to the DNS server's network layer. The DNS server's network layer receives the deencapsulated IP packet (#3), deencapsulates the UDP datagram (#2) and passes the UDP datagram (#2) up to the DNS server's transport layer. The DNS server's transport layer receives the deencapsulated UDP datagram (#2), deencapsulates the DNS query (#1), and passes the DNS query (#1) up to the DNS server's application layer.
  6. The DNS server's application layer processes the DNS query (#1), generates a DNS response (#6), and passes the DNS response (#6) down to the DNS server's transport layer.
  7. The DNS server's transport layer encapsulates the DNS response (#6) in a UDP datagram (#7) and passes the UDP datagram (#7) to the DNS server's network layer. For the port numbers, the DNS server uses the same two port numbers as the query but swaps their roles. As such, the source port number for this response matches the port through which it received the query (port 53), and the destination port number for this response matches the port through which the DNS client transmitted the query (port 54321).
  8. The DNS server's network layer encapsulates the UDP datagram (#7) in an IP packet (#8) and passes the IP packet (#8) down to the DNS server's data link layer.
  9. The DNS server's data link layer encapsulates the IP packet (#8) in an Ethernet frame (#9) and passes the frame down to the DNS server's physical layer. The Layer 2 source and destination addresses follow the same template as before with the USNA router being the destination.
    **See the next lesson for what happens below the data link layer.**
    The USNA router's data link layer receives the Ethernet frame (#9), deencapsulates the IP packet (#8), and passes the IP packet (#8) up to the USNA router's network layer.
  10. The USNA router examines the IP packet (#8), decides how to route the packet, and passes the packet to the USNA router's data link layer. The USNA router's data link layer encapsulates the IP packet (#8) into an Ethernet frame (#10) and passes the frame to the USNA router's physical layer. The source MAC address is the USNA router's MAC address, and the destination MAC address is your laptop's MAC address.
    **See the next lesson for what happens below the data link layer.**
    Your laptop's data link layer receives the frame (#10), deencapsulates the IP packet (#8), and passes the IP packet (#8) up to your laptop's network layer. Your laptop's network layer receives the deencapsulated IP packet (#8), deencapsulates the UDP datagram (#7), and passes the UDP datagram (#7) up to the transport layer. Your laptop's transport layer receives the deencapsulated UDP datagram (#7), deencapsulates the DNS response (#6), and passes the DNS response (#6) up to the application layer.
  11. Now that your laptop has the IP address for mail.google.com (172.217.12.229), your laptop's application layer generates an HTTPS request (#11) for the Gmail web server and passes it down to your laptop's transport layer.
  12. Your laptop's transport layer recognizes the need for a TCP connection and encryption to support HTTPS, so it conducts the TCP three-way handshake and negotiates the TLS configuration with the Gmail web server (#12). We explored these steps in greater detail in the previous lessons but will summarize them from here on down the TCP/IP Stack. The source port number (54322) is merely incremented from the last assigned ephemeral port number, and the destination port number is the default for HTTPS (443). Because TCP is connection-oriented, your laptop and the Gmail web server will use these same established sockets for all remaining TCP communications in this session.
  13. Your laptop's transport layer and the Gmail web server's transport layer both use their respective network layer to convey all TCP segments between them. At each hop (each Layer 3 device) between source and destination, the router examines the Layer 3 IP packet header, compares the destination IP address to its route table, and decides which way it should forward the packet. Every device's network layer passes packets down to its data link layer (to be conveyed across the network) and receives deencapsulated packets from its data link layer. Although each TCP segment (three segments for the three-way handshake and at least two segments for the TLS negotiation) gets encapsulated in its own IP packet, we summarize these communications here to keep the diagram and this description succinct. We show the individual IP packets per UDP datagram in the DNS query/response part of the scenario.
  14. Now that your laptop has an encrypted connection with the Gmail web server, your laptop's transport layer encrypts and then encapsulates the HTTPS request (#11) in a TCP segment (#14) and passes the segment to your laptop's network layer.
  15. Your laptop's network layer encapsulates the TCP segment (#14) in an IP packet (#15) and passes the IP packet (#15) to your laptop's data link layer.
  16. Your laptop's data link layer encapsulates the IP packet (#15) into an Ethernet frame (#16) and passes the frame to your laptop's physical layer. The source MAC address is your laptop's MAC address, and the destination MAC address is the USNA router's MAC address.
    **See the next lesson for what happens below the data link layer.**
    The USNA router's data link layer receives the Ethernet frame (#16), deencapsulates the IP packet (#15), and passes the IP packet (#15) up to the USNA router's network layer.
  17. The USNA router examines the IP packet (#15), decides how to route the packet, and passes the packet to the USNA router's data link layer. The USNA router's data link layer encapsulates the IP packet (#15) into an Ethernet frame (#17) and passes the frame to the USNA router's physical layer. The source MAC address is the USNA router's MAC address, and the destination MAC address is the MAC address of USNA's Internet Service Provider (ISP), hidden in the diagram's Internet cloud.
    **See the next lesson for what happens below the data link layer.**
    Between your laptop and the Gmail web server, the packet will stay intact as it passes through many devices (without considering NAT). When passing through a Layer 3 device, remember that the Layer 2 frame gets discarded during deencapsulation and the Layer 3 packet is encapsulated in a new Layer 2 frame.
  18. The Google router does exactly as the USNA router does, that is, it deencapsulates the IP packet (#15) from the incoming Ethernet frame (#18), examines the packet header, determines how to route the packet, and encapsulates it in a new Ethernet frame (#19).
  19. The Gmail web server's data link layer receives the Ethernet frame (#19) and deencapsulates the IP packet (#15) for the network layer, which deencapsulates the TCP segment (#14) for the transport layer, which deencapsulates and then decrypts the HTTPS request (#11), and passes the HTTPS request (#11) up to the Gmail web server's application layer.

Note that we've gotten to the point at which the Gmail web server has received our first HTTPS request. Now, it's the Gmail web server's turn to respond to your laptop's HTTPS request. Can you figure out how the HTTPS response gets from the Gmail web server's application layer to your laptop's application layer?

As you learned when building your own web sites, your browser will need to send an additional HTTPS request for each additional element, and for each of those additional elements that are not at mail.google.com, your laptop will need to send another DNS query to resolve each other server's IP address. Amazingly, all this happens in a fraction of a second!
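The "Why the router?" question in step 4 comes down to a subnet comparison, which can be sketched in Python with the standard `ipaddress` module. The function name is hypothetical; the addresses, subnet mask, default gateway, and ARP entry are taken from the `ipconfig /all` and `arp -a` output shown earlier.

```python
import ipaddress

def next_hop_ip(local_interface, dest_ip, default_gateway):
    """Pick the IP address whose MAC address goes in the frame's destination field.

    local_interface: this host's address with prefix length, e.g. "10.60.145.241/24".
    """
    local_network = ipaddress.ip_interface(local_interface).network
    if ipaddress.ip_address(dest_ip) in local_network:
        return dest_ip        # same subnet: deliver directly to the destination
    return default_gateway    # different subnet: hand the frame to the router

# ARP table entry for the default gateway, from the arp -a output.
arp_table = {"10.60.145.1": "88-3a-30-a2-fc-80"}

hop = next_hop_ip("10.60.145.241/24", "10.1.74.10", "10.60.145.1")
print(hop, arp_table.get(hop))  # 10.60.145.1 88-3a-30-a2-fc-80
```

The IP packet still carries the DNS server's address as its destination. Only the frame's destination MAC address points at the router.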


Supplemental Media:



MAC Addresses Explained


Review Questions:

  1. How does the Data Link layer interact with the Physical and Network layers of the TCP/IP Stack?
  2. What are the FIVE different networking topologies commonly used to interconnect hosts?
  3. What network hardware devices are associated with the Data Link layer?
  4. What addressing scheme do Ethernet frames use and how does it enable network communications?
  5. Where is the physical address located on a host system?
  6. Where is networking information stored for resolving MAC-IP addresses?


References

  1. Network Working Group, "RFC 826 An Ethernet Address Resolution Protocol", Internet Engineering Task Force, Nov. 1982.
  2. Network Working Group, "RFC 894 A Standard for the Transmission of IP Datagrams over Ethernet Networks", Internet Engineering Task Force, Apr. 1984.
  3. Network Working Group, "RFC 903 A Reverse Address Resolution Protocol", Internet Engineering Task Force, Jun. 1984.
  4. Xpert Technologies, "Cat5 vs. Cat6 Ethernet Cables (and Why You Should Care)", Sep. 2018.