File Systems and Hierarchies

Learning Outcomes

After completing this class you should be able to:



File Systems

File System Workflow. The Hard Disk Drive (HDD) is partitioned and formatted to support a specific file system. The file system contains directories and files that are organized within the file hierarchy.

This class discusses how file systems are organized, structured, and accessed by the user, and how they are managed by the Operating System (OS). This course began with the physical components of a computer in Computer Architecture, one of the key items being the HDD. Data stored on the HDD is non-volatile and must be organized so that the OS can access and interpret it. The file system workflow describes this process and introduces the terms and concepts used in the Linux and Windows file hierarchies.

Windows Disk Management application showing the different partitions and physical drives. The C: drive is highlighted with 935 GB formatted as New Technology File System (NTFS) and 18.45 GB allocated for OS recovery in case the system needs to be restored.

Disk Management

HDDs are organized so that computers can access the data stored on them. Formatting determines how the OS stores data, which is why a disk formatted with the macOS Apple File System (APFS) cannot be read by the Windows OS without additional software, and why a Windows NTFS disk can be read but not written by macOS by default. This matters most for portable drives, such as flash drives: moving data from one system to another on a Universal Serial Bus (USB) drive requires a format that both OSs can read. Partitioning, or dividing, the hard disk into different areas starts with a partition table, either the Master Boot Record (MBR) or the Globally Unique Identifier (GUID) Partition Table (GPT). The Windows Disk Management application can be accessed by going to Windows Start, typing diskmgmt.msc, and running it as administrator. Notice in the file system column that the (C:) volume is NTFS but the (E:) volume is formatted as exFAT. That's because the portable drive assigned to (E:) is a flash drive that is also used by a camera.

In Linux, the df command shows the file systems and partitions that have been mounted (see console output below). If you run that command on the Ubuntu server, you'll notice several entries listed as tmpfs, for temporary file system; these are held in memory rather than on a physical disk. Add the -T option (df -T) to display a Type column, and you'll be able to pick out the physical drive partitions, whose device names begin with /dev/, and see that they are formatted as ext4, the Fourth Extended File System, commonly used by Linux OSs.

Windows (C:)...what happened to (A:) and (B:)?
The default Windows root drive is typically (C:). Early Personal Computers (PCs) often had no hard drive and ran their OS and programs from removable disks. These disks began as large magnetic floppies, each housed in its own sleeve to protect the contents from the external environment. Reduced in size from 8 inches, the 5¼-inch floppy disk drives were the first to be integrated into PCs, and the letters (A:) and (B:) were reserved for the first and second floppy drives. Floppy disks later shrank further to 3.5 inches. Because (A:) and (B:) were already taken, the first hard disk partition was assigned (C:), a convention that persists today. The save icon used in many applications resembles the 3.5-inch floppy disks that were once popular for carrying data from one computer to another by hand, an informal practice jokingly called sneakernet.
Example of Linux console display of the df and df -T command outputs:
m9999@ubuntu:~$ df
Filesystem      1K-blocks       Used  Available Use% Mounted on
udev            198029824          0  198029824   0% /dev
tmpfs            39615408       2748   39612660   1% /run
/dev/sda2       190603188  101340556   79507752  57% /
tmpfs           198077024          0  198077024   0% /dev/shm
tmpfs                5120          0       5120   0% /run/lock
tmpfs           198077024          0  198077024   0% /sys/fs/cgroup
/dev/nvme0n1p1  479595176     219080  454940488   1% /var/lib/mysql
/dev/sdb1      9296620868 3999099796 4828922392  46% /home/mids
m9999@ubuntu:~$ df -T
Filesystem     Type      1K-blocks       Used  Available Use% Mounted on
udev           devtmpfs  198029824          0  198029824   0% /dev
tmpfs          tmpfs      39615408       2748   39612660   1% /run
/dev/sda2      ext4      190603188  101340556   79507752  57% /
tmpfs          tmpfs     198077024          0  198077024   0% /dev/shm
tmpfs          tmpfs          5120          0       5120   0% /run/lock
tmpfs          tmpfs     198077024          0  198077024   0% /sys/fs/cgroup
/dev/nvme0n1p1 ext4      479595176     219080  454940488   1% /var/lib/mysql
/dev/sdb1      ext4     9296620868 3999100284 4828921904  46% /home/mids
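
The df -T listing above can also be read by a program. The following is a minimal Python sketch, not part of the course tooling, that takes a few of the sample lines from that output and picks out the physical partitions and their file system types. The names SAMPLE_DF_T, parse_df_types, and physical_partitions are made up for this illustration, and the parsing assumes mount points contain no spaces.

```python
# A few `df -T` lines copied from the console output above.
SAMPLE_DF_T = """\
Filesystem     Type      1K-blocks       Used  Available Use% Mounted on
udev           devtmpfs  198029824          0  198029824   0% /dev
tmpfs          tmpfs      39615408       2748   39612660   1% /run
/dev/sda2      ext4      190603188  101340556   79507752  57% /
/dev/nvme0n1p1 ext4      479595176     219080  454940488   1% /var/lib/mysql
/dev/sdb1      ext4     9296620868 3999100284 4828921904  46% /home/mids
"""


def parse_df_types(text):
    """Return (device, fs_type, mount_point) tuples from `df -T` text."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Columns: Filesystem, Type, 1K-blocks, Used, Available, Use%, Mounted on
        rows.append((fields[0], fields[1], fields[6]))
    return rows


def physical_partitions(rows):
    """Keep only entries whose device name is a path under /dev/."""
    return [row for row in rows if row[0].startswith("/dev/")]


if __name__ == "__main__":
    for device, fs_type, mount_point in physical_partitions(parse_df_types(SAMPLE_DF_T)):
        print(f"{device} is formatted as {fs_type} and mounted on {mount_point}")
```

Filtering on the /dev/ prefix is a simplification: it drops the udev and tmpfs entries, which are not backed by a disk partition.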
      

File Hierarchy

Learning the different hierarchies, default directories, and user profiles in both Linux and Windows environments will build an understanding of how data is stored across these systems. System files are necessary for OS operations and are kept in specific directories, whereas user profiles have their own dedicated locations.

Ukraine War - Windows malware targets Master Boot Record (MBR)
The Department of Homeland Security (DHS) Cybersecurity and Infrastructure Security Agency (CISA) published a Cybersecurity Advisory (CSA) under Alert AA22-057A identifying threat actors that deployed destructive malware against organizations in Ukraine to destroy computer systems and render them inoperable. HermeticWiper manipulates the MBR, which prevents the OS from loading and results in boot failure. Details of the malware were released by security researchers on February 23, 2022, a day before Russia launched its military invasion of Ukraine.


Windows File Hierarchy

Root drive. A drive letter does not always correspond to a single physical drive, because one physical drive can be divided into multiple partitions. For the Windows OS, the root drive will typically be the C drive (C:). The root drive is always the highest level in the file hierarchy for that partition. From there, subdirectories organize the information used by the OS and users, with each level separated by a backslash \. The direction of a slash is determined by the top of the slash: if the top leans to the left "\" then it's a backslash, and if the top leans to the right "/" then it's a forward slash. Windows system administrators may also call a double backslash "whack whack" when connecting to remote systems.

Windows File System Hierarchy
Where did you save your lecture notes from class 01? They may be saved in two different locations in the file hierarchy!

Home directory. The default user directory, or the home directory, in Windows is C:\Users\m9999\ (boxed in the illustration). If you are not aware of where in the folder hierarchy you're saving notes from class lectures, you may end up with copies of a file in different locations and think that the computer is not saving your work. The file Class01.txt appears in two different locations, as depicted in the file hierarchy illustration on the right.

Parent and child directories. Desktop\ and Documents\ are folders or directories, which are objects that may contain other files or directories. Both Desktop\ and Documents\ are child directories of m9999\, and m9999\ is the parent directory of Desktop\ and Documents\. Users\ is the parent directory of m9999\. Understanding the folder hierarchy and the terms used to describe how directories relate to each other is important.

Absolute and relative file paths. File paths can be expressed in two ways, absolute and relative. An absolute file path always begins with the root directory. For example, the Chrome.exe application is located at C:\Program Files\Google\Chrome\Application\Chrome.exe. No matter where you are in the file system, the absolute file path tells you exactly where the file is located. See if you can properly identify the absolute file path of the Class01.txt file under Documents\.

Relative file paths are based on the current working directory. If the home directory is the current working directory, then the path to the Class01.txt file accessible from the desktop would be Desktop\Lectures\Class01.txt. Notice that the path starts with Desktop\, a child of the current working directory, rather than with the root drive. Because Chrome\ is not a subdirectory of m9999\, ..\ has to be used to refer to the parent directory. The relative file path to get to Chrome.exe from m9999\ is ..\..\Program Files\Google\Chrome\Application\Chrome.exe: the first ..\ moves up to Users\ and the second moves up to C:\.
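
The relationship between a relative path and the current working directory can be checked with a short Python sketch. It uses the standard ntpath and pathlib modules, which apply Windows path rules on any OS. The helper name resolve and the constants HOME and CHROME_RELATIVE are illustrative choices, and the paths are the ones used in this section.

```python
import ntpath  # Windows path rules, usable on any OS
from pathlib import PureWindowsPath

HOME = r"C:\Users\m9999"  # current working directory in this example
CHROME_RELATIVE = r"..\..\Program Files\Google\Chrome\Application\Chrome.exe"


def resolve(cwd, relative):
    """Join a relative path onto cwd and collapse parent-directory segments."""
    return ntpath.normpath(ntpath.join(cwd, relative))


if __name__ == "__main__":
    # An absolute path starts at the root drive; a relative path does not.
    print(PureWindowsPath(HOME).is_absolute())
    print(PureWindowsPath(CHROME_RELATIVE).is_absolute())
    # Resolving each relative path against the home directory.
    print(resolve(HOME, CHROME_RELATIVE))
    print(resolve(HOME, r"Desktop\Lectures\Class01.txt"))
```

Resolving the Chrome path from the home directory gives the same absolute path quoted earlier, which is a quick way to confirm that a relative path has been written correctly.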

Knowledge Check: Starting with the Documents\ directory, what is the relative file path to the Class01.txt file (specifically, the one located in a subfolder of the Documents\ directory)?


Directories and Files. The Windows Graphical User Interface (GUI) uses an application called File Explorer to view system files. This is a different application from Internet Explorer, which is a browser used to view content on the World Wide Web (WWW). To run File Explorer, click on Windows Start and type file explorer followed by the ↩ Enter key. A new window should appear giving the user access to the entire file system. Directories, also referred to as folders, are objects that contain other directories and files.

Windows Explorer File System Hierarchy
Windows File System Hierarchy. Example of folders and files that would be displayed in File Explorer and associated absolute file paths.
Files contain data that is used by the system and applications. The Windows OS uses filename extensions, or simply file extensions, to determine which application opens a file. The extension is the familiar suffix after the "." in a file name, such as file.docx. The docx extension refers to a standard used by Microsoft Office, so the file opens in the Word application. What are some other file extensions you may know of? Common ones include exe, txt, jpg, mp4, pdf, zip, and log. There are also file extensions used by the OS itself, such as sys, dll, and bin. What application is used to open an html file? Several web browsers could be used, but is there a favorite or preferred browser? The default can be changed by going to Windows Start and typing settings. On the left pane, select Apps > Default Apps and scroll down to Choose default apps by file type. Look for .html and determine the default application configured for that file type.
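
As a rough illustration of how an extension can be mapped to an application, the Python sketch below extracts the suffix of a file name and looks it up in a small table. The DEFAULT_APPS table and the default_app function are hypothetical and exist only for this example; the real associations are stored in the Windows settings described above and vary from machine to machine.

```python
from pathlib import PurePath

# Hypothetical association table for illustration only; the real mapping
# lives in Windows settings (Default Apps) and differs between machines.
DEFAULT_APPS = {
    ".docx": "Microsoft Word",
    ".pdf": "Adobe Acrobat",
    ".html": "Google Chrome",
}


def default_app(filename):
    """Look up the application associated with a file's extension."""
    extension = PurePath(filename).suffix.lower()  # e.g. ".docx"
    return DEFAULT_APPS.get(extension, "no default application configured")


if __name__ == "__main__":
    for name in ("file.docx", "Report.PDF", "index.html", "notes.xyz"):
        print(name, "->", default_app(name))
```

Lower-casing the suffix reflects the fact that Windows treats file names as case insensitive, so Report.PDF and report.pdf resolve to the same application.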

File header highlighted with a PDF file
Raw data located within a PDF file with the highlighted byte-data consisting of the file header.
File headers are the first few bytes of data in a file; they identify the file type so that an application can read the rest of the file's contents. Use binted.jar to view the file contents, noting the first few bytes of data and matching them against the file header table under course references. The screen capture of the PDF file shows the first four bytes of data as 25 50 44 46, which matches the entry in the file header table that associates that hexadecimal string with the PDF file type. This is how applications, like Adobe Acrobat, and web browsers, like Google Chrome, recognize that they can open and read the contents of that specific file.
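
The same check that a hex viewer makes visible can be sketched in a few lines of Python. The example below reads the first four bytes of a file and compares them with the PDF header bytes 25 50 44 46 quoted above. The function names read_header and looks_like_pdf are illustrative, and the demonstration writes a small stand-in file rather than assuming any particular PDF exists on disk.

```python
import os
import tempfile

# The four header bytes 25 50 44 46, which are the ASCII characters "%PDF".
PDF_SIGNATURE = bytes.fromhex("25504446")


def read_header(path, length=4):
    """Return the first `length` bytes of the file at `path`."""
    with open(path, "rb") as handle:  # "rb" reads raw bytes, not text
        return handle.read(length)


def looks_like_pdf(path):
    """Check the file header rather than trusting the file extension."""
    return read_header(path) == PDF_SIGNATURE


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        # A stand-in file: PDF header bytes, but a misleading .txt extension.
        sample = os.path.join(folder, "notes.txt")
        with open(sample, "wb") as handle:
            handle.write(b"%PDF-1.7\n")
        print(read_header(sample).hex(" "))
        print(looks_like_pdf(sample))
```

The stand-in file deliberately has a .txt extension to show that the header, not the name, is what identifies the content.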

Knowledge Check: What would the expected file header be for a jpg file extension?


Linux File Hierarchy

Linux File System Hierarchy.

Root. One of the first things that distinguishes Linux from Windows is that there is no "root drive." It is simply root. The Linux file system uses the forward slash / to signify the root directory, which is the highest level in the file system. So what happens if an external hard drive or USB drive is added? Additional drives are typically mounted under the /mnt or /media directories and are accessible from there.

Home directory. The Linux home directory will be located at /home/mids/m9999/. From there, the familiar subdirectories of Desktop/, Documents/, Downloads/, etc., are available as part of the user profile. Recognize the difference between m9999/ as part of the directory hierarchy and a username of m9999. If user m9999 changes their current working directory to /home/mids/m9555/, they are still logged in as user m9999. Know that when accessing a remote session on the Linux server, users are placed in their home directories by default.

Parent and child directories. Linux directory structures may look different from Windows, but the concept of parent and child directories, or subdirectories, is the same. Now that you know about home directories, what would be the parent directory? The Linux File System Hierarchy illustration may be useful, but just go "up" one directory to mids/ and that is the parent of m9999/.

Absolute and relative file paths. Absolute file paths always begin with root, whereas relative file paths are based on the current working directory. Notice that the path in the first sentence of the Home Directory paragraph is an absolute file path because it begins with root, or /. No matter where in the file hierarchy the current working directory may be, the absolute file path will bring you to its precise location. What would you think if I told you to meet in room 128? That would be relative to where you are currently located. Bancroft, Michelson, Hopper, Mahan? If it were more specific, such as Mahan 128, would that be enough information? Not if cadets from the military academy needed a meeting location. Yes, there's a Mahan Hall at West Point as well! So, an absolute path would look something like this: United States, State of Maryland, City of Annapolis, US Naval Academy, Mahan Hall, 1st Floor, Room 128. No matter where in the world a person may be, that precise location is clearly understood.
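
The same ideas can be tried out with Linux-style paths in Python. This is a small sketch using the standard posixpath and pathlib modules; the helper name to_absolute is made up for illustration, and the directories are the ones named in this section.

```python
import posixpath
from pathlib import PurePosixPath

HOME = PurePosixPath("/home/mids/m9999")  # home directory from this section


def to_absolute(cwd, path):
    """Resolve `path` against `cwd`; an absolute `path` is returned as-is."""
    # posixpath.join discards cwd when path already starts at root (/).
    return posixpath.normpath(posixpath.join(str(cwd), path))


if __name__ == "__main__":
    print(HOME.is_absolute())               # the path begins with /
    print(HOME.parent)                      # one level "up" from m9999/
    print(to_absolute(HOME, "Documents"))   # child of the working directory
    print(to_absolute(HOME, "../m9555"))    # up to mids/, then down to m9555/
    print(to_absolute(HOME, "/etc"))        # already absolute, cwd is ignored
```

Like the Mahan 128 example, a relative path such as Documents only identifies a location once the current working directory is known, while a path that begins with / means the same thing from anywhere.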

Knowledge Check: What is the absolute file path to the highlighted class03.txt file?


Directories and Files. In Linux file systems, directories are denoted by the letter d in front of the permissions (drwxr-xr-x) in the long listing format of a directory's contents. Lab 2 will go further into basic shell commands and navigating the file hierarchy, but also take note that Linux file systems are case sensitive, which is not the case for the Windows file system. See if you can spot the only file in the user's home directory below.

Linux console display of the contents of a home directory:
m9999@ubuntu:~$ ls -l
total 16901988
drwxr-xr-x  2 m9999  mids        4096 Feb 15  2023 Desktop
drwxr-xr-x  6 m9999  mids        4096 May 12 18:07 Documents
drwxr-xr-x  2 m9999  mids        4096 May  5 14:41 Downloads
-rw-r--r--  1 m9999  mids         276 Aug 16  2023 Class01.txt
drwxr-xr-x  2 m9999  mids        4096 Feb 15  2023 Music
drwxr-xr-x  2 m9999  mids        4096 Feb 15  2023 Pictures
drwxr--r--  4 m9999  mids        4096 Oct  4  2023 projects
drwxr-xr-x  2 m9999  mids        4096 Feb 15  2023 Public
drwxr-xr-x 12 m9999  mids        4096 Jul 23 14:43 public_html
drwxr-xr-x  2 m9999  mids        4096 Feb 15  2023 Templates
drwxr-xr-x  2 m9999  mids        4096 Feb 15  2023 Videos
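
The leading d or - in each line of the listing comes from the file's type as recorded by the file system. As a minimal sketch, Python's standard stat module can produce the same style of string for any path. The function names permission_string and is_directory are illustrative, and the demonstration uses a temporary directory rather than the home directory shown above.

```python
import os
import stat
import tempfile


def permission_string(path):
    """Return the ls-style type and permission string, e.g. 'drwxr-xr-x'."""
    return stat.filemode(os.stat(path).st_mode)


def is_directory(path):
    """A directory's permission string starts with the letter d."""
    return permission_string(path).startswith("d")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        # Create one regular file inside the temporary directory.
        note = os.path.join(folder, "example.txt")
        with open(note, "w") as handle:
            handle.write("sample contents\n")
        for path in (folder, note):
            print(permission_string(path), os.path.basename(path))
```

The exact permission letters printed depend on the system's default settings, but a directory's string always begins with d and a regular file's with -.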

File extensions in Linux are also used to associate default applications within the GUI, but this course will use the remote shell for most of the Linux-based activities. When running a script from the shell, the interpreter is typically specified along with the file being run. For example, to run a Python script, the command would look something like python script.py. User applications are typically located in /usr/bin/ and system configuration files in /etc/.

Key Points
  1. Files and folders (directories) are arranged hierarchically
  2. Every file and directory has a place in the hierarchy
  3. Every file and directory is uniquely named by its path
  4. In a file viewer window, you see the contents of one directory, the current working directory, and the address bar describes the path to the current directory

Supplemental Media:

File Systems


Review Questions:

  1. What is the hardware component that is relevant to file systems?
  2. Where are the home directories located on Windows and Linux systems?
  3. What is the root drive for Windows and the root for Linux?
  4. How can directories be traversed using absolute and relative file paths?
  5. What are the differences between files and directories?
  6. How are file extensions used by the OS?
  7. How are file headers used by applications?

