COMP 3400 Lecture 10: File Systems
Part One: File Systems Interface
Introduction
A file is the logical unit of secondary storage. It is a named collection
of related information on secondary storage. Secondary storage includes most
magnetic and optical media such as disks, CDs, and tapes.
We will focus on disk storage. All user data on secondary storage must reside
in a file. The OS also uses secondary storage for process/memory/storage management
such as process swapping space. We will focus on user files.
A file system is the logical collection of files. We call it logical because a single device may contain multiple
file systems (e.g. partitions); likewise a single file system may occupy multiple devices (e.g. Unix device mounting).
File and File System Management Issues
We will cover a number of issues relevant to files and file systems. First we focus on file issues:
- types and internal structure (from an OS standpoint),
- data attributes,
- operations,
- access methods,
- protection.
Then we introduce directories as special types of files and revisit the same issues for them.
Finally, we look at file systems and in particular file system
organization and storage allocation.
The OS View of File Types
Files of different types are created by different programs: word processors, compilers, etc.
Should the OS recognize file types as a service to the
user?
- Windows philosophy: use filename extensions, a period
followed by one or more characters. The user (or software) can associate a
filename extension with an application.
- Unix philosophy:
categorize files simply as being either text or binary (anything other than text). Not very
fine grained. It relies on file contents, not name, to yield clues as to the file's type but only
very crudely. Executable files contain a magic number which the OS can sense.
- Macintosh philosophy: places the name of the associated application directly into the file.
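The content-based approach can be sketched in a few lines. This is only an illustration, not how any particular kernel or utility does it: the function name guess_kind and the crude text test are assumptions made for this example. The ELF signature and the "#!" script prefix are real conventions.

```python
ELF_MAGIC = b"\x7fELF"   # first four bytes of an ELF executable
SHEBANG = b"#!"          # first two bytes of an interpreter script


def guess_kind(header: bytes) -> str:
    """Classify a file from its leading bytes (hypothetical helper)."""
    if header.startswith(ELF_MAGIC):
        return "executable (ELF)"
    if header.startswith(SHEBANG):
        return "script"
    try:
        header.decode("ascii")
    except UnicodeDecodeError:
        return "binary"
    # NUL bytes are valid ASCII but rarely appear in text files.
    return "binary" if b"\x00" in header else "text"
```

Passing the first few bytes of a file, e.g. guess_kind(open(path, "rb").read(64)), yields only the coarse text/binary/executable distinction described above.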
The OS View of File Structure
A file's internal structure depends on the program that created it.
A similar question applies here: Should the OS know anything about a file's structure?
Process analogy (we will do this frequently): A process has a particular
data structure including text segment, data segment, stack segment, and process
management variables. The OS knows a lot about the process structure! Segmented
memory management techniques organize the process in memory based on its segmented
structure.
- The file management equivalent of pages/frames is blocks, the unit
of disk storage. The OS may see the file as merely a collection of blocks
without regard to its internal structure.
- Unix philosophy: a file is a byte stream, an unstructured
sequence of bytes. It is the responsibility of the application to impose structure
on the raw bytes. This philosophy is reflected in the basic I/O system calls
read() and write(), which simply read/write the specified
number of bytes starting at the specified memory location (buffer).
- Alternative philosophy: OS uses knowledge of certain internal file
structures to improve performance of file operations in the applications that
use them. This ties the OS to the application, requiring such OS support to
be installed when a new application is installed.
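To see what "the application imposes structure" means, here is a minimal sketch in which a program layers fixed-length records over a raw byte stream. The record layout (a 4-byte id followed by a 16-byte name) and the function names are assumptions chosen for illustration; the OS would see only 20-byte runs of bytes.

```python
import io
import struct

RECORD = struct.Struct("<I16s")   # assumed layout: 4-byte id, 16-byte name


def write_records(stream, records):
    """Append (id, name) pairs to the stream as fixed-length records."""
    for rec_id, name in records:
        # Short names are padded with NUL bytes up to 16 bytes.
        stream.write(RECORD.pack(rec_id, name.encode("ascii")))


def read_records(stream):
    """Read records until the stream runs out of whole records."""
    out = []
    while True:
        chunk = stream.read(RECORD.size)
        if len(chunk) < RECORD.size:
            break
        rec_id, raw = RECORD.unpack(chunk)
        out.append((rec_id, raw.rstrip(b"\x00").decode("ascii")))
    return out
```

Any object with read() and write() works as the stream, for example an io.BytesIO buffer or a file opened in binary mode.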
File Attributes (data)
Certain file management information must be stored with the file. It is not normally
stored in the file itself, but in a directory structure. The kinds of
data stored include name, type (sometimes), disk location
(for OS use), size, protection, ownership, usage
times (creation, last modification, last use).
File Operations
The OS provides system calls through which programmers can work with files. Typical operations
include create, delete, rename, execute, open, close,
read, write, append, seek, and truncate.
The typical life-cycle of a file goes something like this:
- create, open, write, write, write, ... , close
- open, read, write, read, write, ... , close
- delete
Every read and write operation makes use of the current-file-position pointer.
- OS variable associated with an open file
- indicates where the next read/write operation will take place.
- initialized when the file is opened, to either the beginning or the end
(if opened for appending) of the file.
- updated with each read/write operation.
- can be manipulated by the programmer through the seek operation listed above.
The Unix system calls for basic file I/O are creat(), open(),
read(), write(), lseek(), and close(). Open
is the most complex of these, with options for read-only, write-only, read-write,
append, truncate, create-if-not-found, and more.
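The effect of these calls on the current-file-position pointer can be observed through Python's os module, which provides thin wrappers around the same Unix system calls. The sketch below is illustrative only; the file name demo.dat and the function name are made up for the example.

```python
import os
import tempfile


def demo_file_position():
    """Write, seek, and read a scratch file; return what was observed."""
    with tempfile.TemporaryDirectory() as scratch_dir:
        path = os.path.join(scratch_dir, "demo.dat")   # hypothetical file name
        # Read-write, create-if-not-found, truncate: three of the open() options.
        fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            os.write(fd, b"hello world")                 # position moves 0 -> 11
            after_write = os.lseek(fd, 0, os.SEEK_CUR)   # ask where we are
            os.lseek(fd, 6, os.SEEK_SET)                 # jump to byte 6
            data = os.read(fd, 5)                        # 5 bytes from there
            after_read = os.lseek(fd, 0, os.SEEK_CUR)
        finally:
            os.close(fd)
    return after_write, data, after_read
```

The write leaves the pointer at 11, the seek moves it to 6, and the 5-byte read returns b"world" and leaves the pointer at 11 again.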
When a file is opened, an entry is added to an OS open-file table.
- programmer is returned a table entry pointer or similar data structure as
its handle for subsequent file access.
- A multiuser system maintains both a system-wide table and a per-process
table.
- system-wide table permits sharable files to occupy only one table entry
when opened by multiple processes.
- per-process table entry is deleted when the file is closed; the system-wide
table entry is not deleted until all processes using the file have closed
it (keep a file open count incremented upon open and decremented
upon close).
- Because every open file requires one or two OS table entries, it is important
for system performance that programmers close files when finished with them.
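The two-table bookkeeping can be modeled with a toy class. This is a sketch of the idea only, not a real kernel structure: the class and field names are assumptions, and for simplicity a process that opens the same file twice is counted once.

```python
class OpenFileTables:
    """Toy model of a system-wide table plus per-process tables."""

    def __init__(self):
        self.system = {}        # file name -> open count
        self.per_process = {}   # pid -> set of file names that process has open

    def open(self, pid, name):
        files = self.per_process.setdefault(pid, set())
        if name not in files:
            files.add(name)
            self.system[name] = self.system.get(name, 0) + 1

    def close(self, pid, name):
        files = self.per_process.get(pid, set())
        if name in files:
            files.remove(name)          # per-process entry goes away at once
            self.system[name] -= 1
            if self.system[name] == 0:
                del self.system[name]   # last closer removes the shared entry
```

If processes 1 and 2 both open "a.txt", the system-wide table holds one entry with count 2; the entry disappears only after both have closed the file.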
The OS limits the number of open files a process can have. To test this limit in Unix, write a program
containing an infinite loop that opens a file without closing it. The open() system call
returns an error value (-1) when your limit is reached.
File Access Methods
File data are accessed either sequentially or directly (randomly).
- Sequential is based on the tape concept: fixed read/write head and linear
media moving past it.
- Direct is based on the disk concept: movable read/write head and rotating
media.
- Either access method works well on physical disks; the direct access method
does not work well on physical tapes.
Addresses in a file may be specified in different units: records, blocks, or bytes.
- File addresses are numbered sequentially starting at 0.
- This is a logical address because it is the same regardless of
where the file is physically stored.
- Every location is thus relative to the beginning of the file.
- Process analogy: process logical memory address space, which starts at 0
regardless of where it will be physically loaded into memory.
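Direct access by logical address is simple arithmetic. The sketch below assumes fixed-length records; the record size of 32 bytes, the block size of 512 bytes, and the function names are all illustrative choices.

```python
import io

RECORD_SIZE = 32   # assumed fixed record length in bytes
BLOCK_SIZE = 512   # assumed block length in bytes


def read_record(stream, record_number, record_size=RECORD_SIZE):
    """Direct access: jump straight to a record by its logical number."""
    stream.seek(record_number * record_size)   # offset relative to file start
    return stream.read(record_size)


def locate(byte_offset, block_size=BLOCK_SIZE):
    """Convert a byte address into (logical block number, offset within block)."""
    return divmod(byte_offset, block_size)
```

Record 2 starts at byte 64 no matter where the file lives on disk, and byte 1300 falls in logical block 2 at offset 276.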
Directory Structures
Mere mortal users would not be able to effectively manage files were they not organized by partitions
and directories (called folders in the Windows world).
The top level of organization is partitions, which are logical storage
devices. A physical device may contain one or more partitions; a partition may cover more than one
physical device. Partitions are sometimes also called volumes.
Each partition contains a device directory, a table with information about
all the files it contains. We will also assume the directory can contain other directories, called
subdirectories. The directory entry is where file attributes are stored.
Like files, directories have certain defined structures, attributes and operations.
Unlike a regular file, it is essential for the OS to be familiar with the internal
structure of a directory (its table). Operations such as creating, deleting,
reading and writing are applied to a directory but they must be implemented
in a special way.
Consider directory deletion. Deleting a non-empty directory requires OS policy -- should the files it contains be deleted too? If not,
who should adopt them? Windows deletes them; Unix allows the deletion of a non-empty directory but only through
a special command switch which deletes both the directory and its files;
DOS prohibits the deletion of a non-empty directory.
Directory operations include:
- create a file in the directory
- delete a file from the directory
- rename a file in the directory
- search a directory for a given file
- list a directory's contents
- traverse the file system
We assume the directories in a file system form a tree structure. This means the file system has a
single root directory which contains files and subdirectories, and the subdirectories in turn
contain other files and other subdirectories. Any given file or subdirectory is contained in exactly one
directory. A sketch that shows directories and files as nodes and the "contains" relationship as links would
thus resemble a tree.
While working with a file system, each user has a current directory. It contains the files of current
interest.
The location of
every file can be specified using a pathname which lists each directory in the unique path through the
tree to its location. The pathname can either be relative starting from the current directory or
absolute starting from the root directory.
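The relationship between relative and absolute pathnames can be sketched with Python's posixpath module. The function name resolve and the sample paths are assumptions for illustration; the resolution here is purely textual and does not consult a real file system.

```python
import posixpath


def resolve(current_dir, pathname):
    """Turn a relative or absolute Unix-style pathname into an absolute one."""
    if posixpath.isabs(pathname):
        return posixpath.normpath(pathname)          # already starts at the root
    # A relative pathname is interpreted starting from the current directory.
    return posixpath.normpath(posixpath.join(current_dir, pathname))
```

With a current directory of /home/pat/proj, the relative pathname ../notes.txt resolves to /home/pat/notes.txt, while /etc/passwd is unchanged.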
The user can traverse the file system using appropriate commands or actions. If commands are used, the
absolute or relative pathname of the destination directory can be specified directly.
The problem with tree-structured directories is they do not allow files to be shared among multiple
directories. Such shared files would allow all members of a group project to have shared files listed in
directories that they individually own. A file system that allows such sharing forms an acyclic graph
instead of a tree -- there can be more than one directory path to a given file.
Such sharing is implemented through links. There is one physical copy of the file plus one
or more links to it. There are two approaches to implementing links:
- A symbolic link (aka soft link) implements a link by storing the pathname of the shared file
in the directory table.
- A non-symbolic link (aka hard link) implements a link by
duplicating the directory table entry of the shared file.
Each approach has its issues, particularly where file movements and deletions are concerned.
- If the shared
file is renamed or moved to a different directory, its physical location on disk remains the same and a
hard link remains valid. A soft link would become invalid since the file pathname is changed.
- Deletion is
tricky too. If the shared file is deleted, all soft links to it become invalid. Deleting a shared file
with hard links works OK if the file has an associated reference count, a count of the number of hard
links to it. If one user "deletes" the file, the reference count is decremented. The actual file is deleted only
when the reference count reaches 0.
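Both behaviors can be observed on a Unix-like system through Python's os module. This is a sketch under the assumption that the temporary directory lives on a file system supporting both kinds of link; the file names are made up.

```python
import os
import tempfile


def demo_links():
    """Give one file a hard link and a soft link, then rename and unlink it."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "report.txt")
        moved = os.path.join(d, "report-v2.txt")
        hard = os.path.join(d, "hard")
        soft = os.path.join(d, "soft")

        with open(original, "w") as f:
            f.write("shared data")
        os.link(original, hard)       # second directory entry for the same file
        os.symlink(original, soft)    # stores the pathname of the target

        count_linked = os.stat(hard).st_nlink    # reference count with two names
        os.rename(original, moved)               # the pathname changes
        soft_valid = os.path.exists(soft)        # follows the link to the old name
        with open(hard) as f:
            hard_data = f.read()                 # hard link still reaches the data

        os.unlink(moved)                         # "delete" through one name
        count_after = os.stat(hard).st_nlink     # count drops; the file survives
        return count_linked, soft_valid, hard_data, count_after
```

After the rename the soft link dangles while the hard link still reads the data, and unlinking one name only lowers the reference count from 2 to 1.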
File System Mounting
One of the OS responsibilities in maintaining a file system is mounting, which maps a physical
device to one or more logical partitions (file system entities). This is necessary before a user process can reference
the partition. Mounting is done at boot time and may also be done "on the fly" as
devices are attached and removed.
- Windows mounts devices to drive letters (A, C, D, etc) and presents the collection of logical drives as
its file system. Note that devices may be attached via network connections and mapped to logical drives.
- Macintosh searches devices for presence of file system structures, including file system name,
and places a corresponding icon on the desktop. The file systems are then referenced by name.
- Unix maintains a single file system with a root directory called "/" (slash). A device's file system is attached to a
directory (its mount point) using the mount command. The devices themselves are named by device files, normally found in the "/dev" directory. Thus individual
devices do not have a "root identity" as they do in Windows or Mac.
File Protection
The protection of files refers to assuring they cannot be "improperly" accessed. This includes
assuring that only authorized users may access the file, and assuring that only authorized operations can
be performed. The term controlled access refers to allowing access to some users but not others and
allowing some operations but not others.
Access controls can be defined for any of the file operations listed above.
Limiting access to authorized users is accomplished by defining an access
list for each file that the OS checks before allowing access. Two approaches
to implementing the access list are:
- a list of users, and permitted access for each user (access control list, or ACL)
- a fixed list of user categories with permitted access for each category
The first approach is very precise but is variable length and slow to use and maintain. The
second approach is less precise but is fixed length and quick to use and maintain.
File Access Control in Windows
To see Windows file protection, right-click on a file icon and select "Properties". Check the
resulting window for the "General" and "Security" tabs. The General tab implements some protections, such
as read-only access. The Security tab details which users or groups of users are allowed access, and what access is
allowed for them.
File Access Control in Unix/Linux
To see Unix/Linux protection, type the "ls -l" command to produce a detailed listing of files in the
current directory. The leftmost column will contain a string of 10 characters consisting of the following
characters: -, r, w, x, d. The string actually represents 4 groups of information:
- The first character indicates what kind of file it is: - for regular file, d for directory, and a
couple others more rarely seen such as l for link and s for socket.
- The next 3 characters represent access rights for the file owner. Every file has an owner ID as one of its attributes.
There are 3 kinds of
access: read, write, and execute. The first of the three characters indicates whether read access is allowed
(r) or not (-). The second indicates whether write access is allowed (w) or not (-). The third
indicates whether execute access is allowed (x) or not (-).
- The next 3 characters represent access rights for the file group. A named group containing a list of
its members is separately maintained. Every file has a group ID as one of its attributes. The 3 access rights
are defined the same as for owner.
- The last 3 characters represent access rights for everyone else, the world. The 3 access rights
are defined the same as for owner.
For example, a Unix file with protection code "-rw-r--r--" is a regular file to
which everyone has read access but only the owner has write access. A Unix file
with protection code "-rwx--x--x" is a regular file which anyone can execute (it
is an executable program or script), but only the owner can read or write it.
File protection values are set when the file is created, and changed by an authorized
user with the chmod (change mode) command or system call.
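The 10-character string is just a rendering of the numeric mode that chmod sets. The sketch below uses Python's stat.filemode(), which formats a mode the way ls -l does; the describe() helper and its dictionary keys are assumptions added for illustration.

```python
import stat


def describe(mode_string):
    """Split a 10-character ls -l mode string into its four groups."""
    kinds = {"-": "regular file", "d": "directory", "l": "link", "s": "socket"}
    return {
        "kind": kinds.get(mode_string[0], "other"),
        "owner": mode_string[1:4],
        "group": mode_string[4:7],
        "world": mode_string[7:10],
    }


# S_IFREG marks a regular file; octal 644 is rw- for owner, r-- for group and world.
EXAMPLE = stat.filemode(stat.S_IFREG | 0o644)
```

Each octal digit encodes one group (read = 4, write = 2, execute = 1), so mode 644 corresponds to "-rw-r--r--" and mode 711 to "-rwx--x--x".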
Part Two: File Systems Implementation
Introduction
OSs provide consistent control and access to secondary storage through a file system. Design
of a file system involves multiple layers of concern, here listed from the top-down:
- logical file system which is the API
- file-organization module which maps logical file structures to physical ones.
- basic file system which interacts with device drivers to request operations on physical addresses
- device drivers and interrupt handlers to push bits between the device and memory
- storage devices themselves
Application programs then interface with the API to use the file system. Examples
include Unix command shells, Windows Explorer, word processors, compilers.
We have already covered a significant portion of the file system API: operations
for files and directories.
File System Storage Allocation
File space is allocated in logical units called blocks. These are
ultimately translated into physical storage locations. In the case of disks,
the physical location of a block is an ordered triple: < cylinder, track,
sector >. The logical-physical translation is non-trivial. We'll limit our
discussion to logical blocks, which are typically 512 to 4096 bytes long.
Assume that disk blocks, like file blocks, are a linear resource numbered sequentially starting at 0.
This is analogous to memory frames and process pages; both are numbered sequentially starting at 0.
There are 3 basic strategies for allocating disk blocks to files:
- contiguous allocation, which allocates consecutively-numbered disk blocks.
- linked allocation, which allocates non-consecutive blocks and organizes them into a linked list.
- indexed allocation, a variation of linked allocation in which the links are organized into
a single per-file structure called the index block.
Contiguous allocation
Contiguous allocation has the same properties, advantages and disadvantages as its memory counterpart.
Overhead is low (directory entry need only store starting block and length) and both sequential and
direct access are fast,
but finding the right hole (e.g. first fit, best fit) is expensive and external fragmentation results.
Files that dynamically grow are also bothersome.
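A minimal sketch of contiguous allocation follows, assuming the free space is tracked as a list of (start, length) holes. The function names and the hole representation are illustrative, not taken from any particular file system.

```python
def first_fit(free_holes, length):
    """Return the start of the first hole with at least `length` blocks, or None.

    free_holes is a list of (start_block, hole_length) pairs.
    """
    for start, size in free_holes:
        if size >= length:
            return start
    return None


def disk_block(start_block, file_length, logical_block):
    """Map a file's logical block number to a disk block number."""
    if not 0 <= logical_block < file_length:
        raise IndexError("logical block outside the file")
    return start_block + logical_block   # one addition: direct access is cheap
```

The mapping needs only the two values stored in the directory entry, which is why overhead is low; the search in first_fit is where the cost lies.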
Linked allocation
Any available disk block can be allocated to a file block. There is additional overhead: the directory
entry need only store the starting block at minimum, but each data block must contain a link to the next block. This
means the data portion of a file block is slightly smaller than the disk block that holds it.
The link overhead can be reduced by allocating blocks in multi-block clusters. This however
increases internal fragmentation.
Linked allocation supports sequential access well (traversing the linked list) but is dismal for
direct access (reduces almost to the mag tape model).
Both the link overhead and direct access problems can be reduced by collecting all links into a
single file allocation table (FAT) stored at the beginning of a partition. The FAT contains
one entry per disk block and is indexed by disk block number. Each entry contains the link that would
otherwise occupy part of the disk block. If the FAT is cached in main memory, direct access becomes
considerably faster.
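The table lookup can be sketched as follows. The FAT is modeled as a plain list indexed by disk block number; the end-of-chain marker of -1 and the function names are assumptions for this example, not the encoding used by any real FAT variant.

```python
END_OF_FILE = -1   # assumed end-of-chain marker for this sketch


def fat_lookup(fat, start_block, logical_block):
    """Follow the chain in the table to find the disk block for a logical block."""
    block = start_block
    for _ in range(logical_block):
        block = fat[block]             # a table lookup, not a disk read
        if block == END_OF_FILE:
            raise IndexError("logical block beyond end of file")
    return block


def fat_chain(fat, start_block):
    """List every disk block of a file in logical order."""
    chain = []
    block = start_block
    while block != END_OF_FILE:
        chain.append(block)
        block = fat[block]
    return chain
```

Reaching logical block i still takes i steps, but each step is a memory access when the table is cached, rather than a disk access per link.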
Indexed allocation
Any available disk block can be allocated to a file block. Instead of organizing the blocks into a linked
list or collecting the list pointers into a per-partition FAT, the indexed organization collects the list pointers
into a per-file table, indexed by the file's block number, called the index block. Each entry
contains the corresponding disk block number. The index block is located through the file's directory
entry. This is analogous to a process page table.
This solves the direct access problem of linked allocation, since the index block entry of any file block
can be reached in one access.
The problem with indexed allocation is how to organize the index block -- the number of entries needed
is the same as the file length in blocks. But a file
can be as small as one block or as large as the partition itself! This is analogous to the page table
size problem. Here are approaches to that problem:
- start with an index block which is one block long. When it is full, allocate another block for the
index block and link to it. Thus the index block is a linked list of blocks. Access to blocks
toward the end of the file is much slower than to those toward the beginning.
- Organize it in two or more levels with indirection. Each entry in the primary index block points to a secondary
level index block. This is analogous to multi-level page table organization. Access to any disk block
requires two (or however many levels there are) steps.
- Combine direct and indirect indexing. This is best illustrated by Unix and described below.
Unix combines direct and indirect indexing. The index block is contained in the
inode (short for index node). The inode contains:
- 12 (or so) entries that point directly to the disk block. For a file which is 12 or fewer blocks long,
this is all that's needed.
- one entry that points to a single indirect block, which is a second level index block.
- one entry that points to a double indirect block, which is an index block containing
pointers to single indirect blocks.
- one entry that points to a triple indirect block, which is an index block containing
pointers to double indirect blocks.
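- If you assume that blocks are 1024 (1K) bytes and block numbers are 4 bytes each, then the above allows
a very large file system and a file to be quite long!
A 4-byte block number supports about 4.3 billion blocks (2^32), and each block is 1K bytes long, so up to
4 terabytes can be addressed. The inode structure itself limits a single file to a little over 16 gigabytes,
once the triple indirect block (256 * 256 * 256 = 16777216 more blocks) is counted.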
- If file size is 12K bytes or less, only direct indexing is needed (most Unix files are very
small).
- The single indirect block handles the next 1024/4 = 256 blocks. The max length is 12K + 256K = 268K bytes
for a file, requiring no more than one level of indirection.
- The double indirect block extends an additional
256 single indirect blocks, or 256 * 256 = 65536 blocks. The max length is now 268K + 65536K = 65804K bytes
for a file (just over 64 MB), requiring no more than two levels of indirection.
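The same arithmetic can be carried through the triple indirect level. The sketch below assumes the figures used above (1K blocks, 4-byte block numbers, 12 direct entries); the function names are illustrative.

```python
BLOCK_SIZE = 1024        # bytes per block (assumption from the text)
POINTER_SIZE = 4         # bytes per block number (assumption from the text)
DIRECT_ENTRIES = 12      # direct pointers in the inode
PER_BLOCK = BLOCK_SIZE // POINTER_SIZE   # 256 pointers fit in one index block


def indirection_level(logical_block):
    """Return 0 for direct, 1 for single, 2 for double, 3 for triple indirect."""
    limit = DIRECT_ENTRIES
    for level in range(4):
        if logical_block < limit:
            return level
        limit += PER_BLOCK ** (level + 1)   # blocks added by the next level
    raise ValueError("block number exceeds what the inode can address")


def max_file_blocks():
    """Total blocks reachable through direct plus all three indirect entries."""
    return DIRECT_ENTRIES + PER_BLOCK + PER_BLOCK ** 2 + PER_BLOCK ** 3
```

Logical blocks 0-11 are direct, 12-267 need one level, 268-65803 need two, and everything beyond needs three, for 16843020 blocks in all (a little over 16 GB at 1K per block).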
How important is this?
The choice of allocation and indexing method is critical to OS performance simply because secondary storage
devices are SO SLOW compared to the memory and processing units. Nearly any performance improvement
will have significant impact on overall system performance. Suppose an indexing improvement saves 1 millisecond
per disk access. During 1 millisecond, the processor possibly executes over 100,000 machine instructions.
So even if the improvement requires an additional say 10,000 lines of C code, it is probably worthwhile.
File System Free Space Management
A file system's free-space list organizes (disk) blocks that are available to be allocated to growing files.
As with any disk operation, performance issues are critical because the devices are so slow.
Note that the FAT method of linked allocation incorporates the free-space list; it is just one additional
linked list.
The greatest performance benefit comes from caching the free-space list in
main memory. Coherency is an obvious issue, but there is also the issue of memory
requirements. If each block is 4K and a drive's capacity is 600 GB, the drive
contains 150 million blocks and thus a drive could easily have a free-space
list 100 million or more blocks long!
Note that allocating blocks in clusters cuts the memory requirements considerably.
The free-space list may be implemented as a bit vector. The block number (0 to n) is the bit position
index into a bit string. If a bit position contains 0, the corresponding block is available (if 1, it is
occupied). The advantage is the small space required; the disadvantage is the variable length of time to find
an available block (have to search the vector).
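A bit-vector free list can be sketched with a list of 0/1 values standing in for the bit string, using the convention above (0 = available, 1 = occupied). The function names are assumptions; a real implementation would pack the bits and scan a word at a time.

```python
def first_free(bitmap):
    """Return the number of the first free block (bit 0), or None if all are in use."""
    for block, bit in enumerate(bitmap):
        if bit == 0:
            return block
    return None


def allocate(bitmap):
    """Claim the first free block by setting its bit to 1."""
    block = first_free(bitmap)
    if block is not None:
        bitmap[block] = 1
    return block


def bitmap_bytes(n_blocks):
    """Memory needed for a packed bit vector covering n_blocks blocks."""
    return (n_blocks + 7) // 8
```

For the 150-million-block drive mentioned earlier, bitmap_bytes(150_000_000) gives 18,750,000 bytes, under 19 MB for the whole free-space map.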
An alternative is to maintain a linked list of free blocks. This is what the FAT does. It could
be organized either as a stack or a queue. Either one limits access only to the ends and thus access
time is constant.
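The stack organization can be sketched in a few lines. The class name and the use of an in-memory Python list are assumptions for illustration; on disk the links would be stored in the free blocks themselves or in the FAT.

```python
class FreeList:
    """Free blocks kept as a stack: allocate and release touch only the top."""

    def __init__(self, blocks):
        self._stack = list(blocks)

    def allocate(self):
        """Pop the most recently freed block, or None if nothing is free."""
        return self._stack.pop() if self._stack else None

    def release(self, block):
        """Push a block that a file no longer needs."""
        self._stack.append(block)
```

Neither operation searches the list, which is the constant-time property noted above.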
Peter Sanderson (PSanderson@otterbein.edu)