1. Goals:
    1. Meet the data management needs of the user
    2. Guarantee that the data in the file are valid
    3. Optimize performance
    4. Provide I/O support for a variety of storage device types
    5. Minimize the potential for lost or destroyed data
    6. Provide a standardized set of I/O interface routines to user processes
    7. Provide I/O support for multiple users in the case of multiple-user systems
  2. Parts:
    1. Drivers talk directly to the disk
    2. Basic Filesystem deals with blocks of data, writing to disk, or buffering in RAM.
    3. Supervisor selects device, does scheduling, assigns buffers.
    4. Logical I/O provides record-level access.
  3. Types of files
    1. Some operating systems provide support for special file structures.
    2. Most files are just sequential piles of bits from the OS perspective. The user applications manage how they're structured.
    3. This can be inefficient for things like database servers, which need to find individual records in a file quickly.
    4. OSes support this by structuring the files around indexed keys.
    5. The most common mechanism is the B-tree.
  4. B-Trees
    1. Balanced tree, NOT A BINARY TREE.
    2. Made up of keys and records, similar to binary trees.
    3. Branching factor of \(d \leq b \leq 2d\).
    4. Every node has at least \(d-1\) keys and \(d\) pointers, except for the root, which may have as few as 1 key.
    5. Dummy node leaves.
    6. Search is done like any ordered tree.
    7. Insertion is done like a 2-3-4 tree. Insert, if too many keys, split and promote.
    8. Deletion is also the same. Find, delete, move up a child.
    9. What we've made is a generalization of a 2-3-4 tree; call it a \(d\)-\((d+1)\)-\((d+2)\)-...-\(2d\) tree. "B-tree" is easier to say.
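The search and "insert, split, promote" steps above can be sketched in a few lines. This is a minimal illustration, not a production B-tree: the class names `Node` and `BTree` are hypothetical, duplicates are silently ignored (an assumption), deletion is left out, and `d` is the minimum branching factor from the notes, so a node holds at most \(2d-1\) keys.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []          # sorted keys
        self.children = children or []  # empty list for a leaf

    @property
    def leaf(self):
        return not self.children

class BTree:
    def __init__(self, d=2):
        self.d = d                      # d=2 gives a 2-3-4 tree
        self.root = Node()

    def search(self, key, node=None):
        node = node or self.root
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if node.leaf:
            return False
        return self.search(key, node.children[i])

    def insert(self, key):
        split = self._insert(self.root, key)
        if split:                       # root overflowed: tree grows by one level
            mid, right = split
            self.root = Node([mid], [self.root, right])

    def _insert(self, node, key):
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return None                 # assumption: ignore duplicates
        if node.leaf:
            node.keys.insert(i, key)
        else:
            split = self._insert(node.children[i], key)
            if split:                   # child split: adopt the promoted key
                mid, right = split
                node.keys.insert(i, mid)
                node.children.insert(i + 1, right)
        if len(node.keys) <= 2 * self.d - 1:
            return None
        # Too many keys (2d): split around the median and promote it.
        m = len(node.keys) // 2
        mid = node.keys[m]
        right = Node(node.keys[m + 1:], node.children[m + 1:])
        node.keys = node.keys[:m]
        node.children = node.children[:m + 1]
        return mid, right

tree = BTree(d=2)
for k in range(1, 21):
    tree.insert(k)
print(tree.search(7), tree.search(99))  # True False
```

Splitting a node with \(2d\) keys leaves \(d\) keys on the left and \(d-1\) on the right, so both halves still satisfy the minimum of \(d-1\) keys.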
  5. File directories.
    1. The operating system has to keep a model of the logical location of files, and map that to the physical location. This is the directory system.
    2. You keep the following info:
    3. Usually the directory is a file itself. Sometimes some of this info is in the file header.
  6. Sharing and Access levels
    1. None
    2. Knowledge
    3. Execution
    4. Reading
    5. Appending
    6. Updating
    7. Change protection
    8. Deletion
    9. Simultaneous access?
  7. File allocation
    1. Do we allocate a large chunk at creation?
    2. What size portions to use (groups of blocks).
    3. How do we keep track of portions?
    4. Pre-allocation - simple, difficult to estimate size.
    5. Allocating large portions keeps more of the file contiguous and keeps the tables that track everything small. Fixed, block-sized portions are simple to manage. Small portions waste less space.
    6. Placement? Best fit, first fit, nearest fit.
    7. Basic methods of allocation: Contiguous, chained, indexed.
    8. Contiguous: pre-allocate a contiguous block. Fragmentation is a problem.
    9. Chained: each block points to the next. No accommodation for locality.
    10. Indexed. Index entry for each block.
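The three placement policies mentioned above can be compared on a toy free list. This is a sketch under assumptions: free space is modeled as a list of hypothetical `(start, length)` extents measured in blocks, and "nearest fit" is taken to mean the usable extent whose start is closest to a given block number `near` (for example, the end of the file's previous portion).

```python
def first_fit(free, size):
    """First extent, in list order, with at least `size` blocks."""
    for start, length in free:
        if length >= size:
            return (start, length)
    return None

def best_fit(free, size):
    """Smallest extent that still holds `size` blocks."""
    candidates = [e for e in free if e[1] >= size]
    return min(candidates, key=lambda e: e[1], default=None)

def nearest_fit(free, size, near):
    """Usable extent whose start is closest to block `near`."""
    candidates = [e for e in free if e[1] >= size]
    return min(candidates, key=lambda e: abs(e[0] - near), default=None)

# Hypothetical free list: (start block, length in blocks)
free = [(0, 5), (20, 12), (40, 8), (100, 30)]

print(first_fit(free, 8))             # (20, 12)
print(best_fit(free, 8))              # (40, 8)
print(nearest_fit(free, 8, near=90))  # (100, 30)
```

First fit is the cheapest to compute, best fit leaves the smallest leftover fragment, and nearest fit trades both for locality.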
  8. Free Space management
    1. In order to quickly find free space, we need some sort of table of where it is.
    2. Bit vector: one bit per block. A 16 GB disk with 512-byte blocks has 32M blocks, so the bit vector takes 4 MB. If we don't have room for it in RAM, searching it is very slow.
    3. Chained Free portions.
    4. Indexing. Use index table (hash)
    5. Free block list. Again, large.
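The bit-vector arithmetic above, plus a toy allocator, can be worked through as follows. This is an illustrative sketch: the `BitVector` class and its method names are hypothetical, and the convention that 0 means free and 1 means in use is an assumption.

```python
DISK_BYTES = 16 * 2**30   # 16 GB disk
BLOCK_BYTES = 512

blocks = DISK_BYTES // BLOCK_BYTES  # one bit per block
bitmap_bytes = blocks // 8
print(blocks)        # 33554432  (32M blocks)
print(bitmap_bytes)  # 4194304   (4 MB)

class BitVector:
    """Toy free-space map: bit i is 1 if block i is in use (assumed convention)."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.bits = bytearray((nblocks + 7) // 8)

    def is_free(self, i):
        return ((self.bits[i // 8] >> (i % 8)) & 1) == 0

    def allocate(self):
        # Linear scan for the first free block; this scan is what
        # becomes slow when the vector does not fit in RAM.
        for i in range(self.nblocks):
            if self.is_free(i):
                self.bits[i // 8] |= 1 << (i % 8)
                return i
        return None

    def release(self, i):
        self.bits[i // 8] &= ~(1 << (i % 8)) & 0xFF

fv = BitVector(16)
print(fv.allocate(), fv.allocate())  # 0 1
fv.release(0)
print(fv.allocate())                 # 0
```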
  9. Unix Filesystem
    1. 6 types of files: regular, directory, special (devices), named pipes, link, symbolic link.
    2. Allocation done on block level. Pointers to the blocks are in the direct and indirect fields in the inode.
    3. The inode is of small fixed size, can be kept in memory.
    4. Small files have little indirection.
    5. Can handle very large files.
    6. Directories are handled just as above, where the directory is a list of pointers to other directories or inodes.
    7. Volume structure: boot block (boots the system), superblock (size of partition, inode table size, etc.), inode table, data blocks.
    8. Access control: rwx bits for owner, group, and all others. We know this already.
    9. We might not know that there are 3 more bits: setUID, setGID, and sticky. Sticky applies to a directory, saying that only the owner of a file in that directory (or the directory's owner, or root) can move, rename, or delete that file.
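To see why small files need little indirection while very large files are still addressable, here is a rough calculation. The layout numbers are assumptions chosen for illustration and are not stated in the notes: 12 direct pointers, one single, one double, and one triple indirect pointer, 4 KB blocks, and 4-byte block pointers.

```python
# All four constants are illustrative assumptions.
BLOCK_SIZE = 4096
PTR_SIZE = 4
N_DIRECT = 12
PTRS_PER_BLOCK = BLOCK_SIZE // PTR_SIZE  # 1024 pointers fit in one block

def indirection_level(block_index):
    """0 = direct, 1 = single indirect, 2 = double, 3 = triple."""
    if block_index < N_DIRECT:
        return 0
    block_index -= N_DIRECT
    for level in (1, 2, 3):
        span = PTRS_PER_BLOCK ** level   # blocks reachable at this level
        if block_index < span:
            return level
        block_index -= span
    raise ValueError("block index beyond what the inode can address")

max_blocks = N_DIRECT + sum(PTRS_PER_BLOCK ** k for k in (1, 2, 3))

print(indirection_level(0))          # 0: first block, direct pointer
print(indirection_level(12))         # 1: first block past the direct ones
print(indirection_level(12 + 1024))  # 2: first block needing double indirection
print(max_blocks * BLOCK_SIZE)       # maximum file size in bytes, roughly 4 TiB
```

Under these assumptions a file of up to 48 KB is reached entirely through direct pointers, which is why the common case stays fast.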
  10. Access control lists
    1. Another more recent way to control access is with Access Control Lists.
    2. Assign a list of user IDs with individual permission rights with the setfacl command.
    3. See man page.
  11. Linux virtual file system
    1. In its early days, Linux had a problem. Its specially designed filesystem, ext, worked well, but it lived in a world where most people used disks with other filesystems, so the designers needed to support all of these in order to gain acceptance.
    2. Thus they developed the virtual filesystem.
    3. This layer maps between the Linux idea of files and that of the other system, e.g. inodes to FATs.
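The idea of one interface in front of several on-disk formats can be sketched as follows. This is a toy model, not the Linux kernel's actual VFS API: every class and method name here (`FileSystem`, `ExtLike`, `FatLike`, `VFS`, `mount`, `read`) is hypothetical, and the "disks" are just dictionaries.

```python
from abc import ABC, abstractmethod

class FileSystem(ABC):
    """Common interface every concrete filesystem must provide."""
    @abstractmethod
    def read(self, path):
        ...

class ExtLike(FileSystem):
    """Inode-style: a name maps to an inode number, the inode holds the data."""
    def __init__(self):
        self.inodes = {1: b"hello from an inode"}
        self.names = {"/a.txt": 1}

    def read(self, path):
        return self.inodes[self.names[path]]

class FatLike(FileSystem):
    """FAT-style: a name maps to a first cluster, the table chains the rest."""
    def __init__(self):
        self.clusters = {2: b"hello ", 3: b"from a FAT"}
        self.fat = {2: 3, 3: None}       # cluster -> next cluster
        self.first_cluster = {"/b.txt": 2}

    def read(self, path):
        data, c = b"", self.first_cluster[path]
        while c is not None:
            data += self.clusters[c]
            c = self.fat[c]
        return data

class VFS:
    """Routes each path to whichever filesystem is mounted at its prefix."""
    def __init__(self):
        self.mounts = {}

    def mount(self, prefix, fs):
        self.mounts[prefix] = fs

    def read(self, path):
        # Pick the longest mount prefix that matches the path.
        prefix = max((p for p in self.mounts if path.startswith(p)), key=len)
        return self.mounts[prefix].read(path[len(prefix):])

vfs = VFS()
vfs.mount("/mnt/ext", ExtLike())
vfs.mount("/mnt/fat", FatLike())
print(vfs.read("/mnt/ext/a.txt"))  # b'hello from an inode'
print(vfs.read("/mnt/fat/b.txt"))  # b'hello from a FAT'
```

The caller uses the same `read` call in both cases and never sees whether the bytes came from an inode or a cluster chain.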