COMP 3400 Lecture 8: Main Memory
This is reasonable: even if only one process is currently running, a quick context switch to a "ready" process cannot occur unless at least a portion of that process already resides in memory. And with multiple processes present, the OS cannot guarantee any of them a "reserved" location in memory.
Given those assumptions, address references contained in the binary executable code will in general not match the target physical addresses when the process runs. The address references in the program are thus logical addresses, not physical addresses. In order for the program to run correctly, logical addresses must be mapped, or bound, to physical addresses at some point. Logical addresses are also known as virtual addresses. The de-coupling of virtual and physical addresses permits a process to have a larger address space than is physically present. This in turn leads to the requirement that a process can execute while only partially loaded into memory.
Historically, address binding has been available at any of three stages:
- Compile time: if the process's memory location is known in advance, absolute code can be generated.
- Load time: if the location is not known until the program is loaded, the compiler generates relocatable code and binding is delayed until load time.
- Execution time: if the process can be moved in memory during execution, binding is delayed until run time; this requires hardware support.
Computers and OSs in the modern era use execution time binding almost exclusively. Exceptions include certain OS kernel code which resides in fixed low memory locations (e.g. interrupt vectors and handlers).
Execution time binding requires special hardware consisting at a minimum of a relocation (base) register and limit register, located in the memory-management unit (MMU). Recall these were introduced in lecture 1.
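The base-and-limit scheme can be sketched in a few lines. This is a minimal model of what the MMU does in hardware on every memory reference; the register values are illustrative, not drawn from any real machine.

```python
# Sketch of execution-time binding with a relocation (base) register and a
# limit register, as performed by the MMU on every reference.
RELOCATION_REGISTER = 0x40000   # physical start of the process (illustrative)
LIMIT_REGISTER = 0x10000        # size of the process's logical address space

def translate(logical_address):
    """Map a logical address to a physical address, trapping on overflow."""
    if logical_address >= LIMIT_REGISTER:
        # A real MMU raises a trap to the OS, which typically kills the process.
        raise MemoryError("trap: logical address beyond limit register")
    return RELOCATION_REGISTER + logical_address
```

For example, `translate(0x1234)` yields physical address `0x41234`, while `translate(0x10000)` traps. Relocating the process is just a matter of copying its memory and reloading the relocation register.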
Partitioning involves loading the entire process space into memory. Physical memory is thus partitioned into the various processes, and each process is stored in a contiguous chunk of memory.
This makes address limit checking and relocation very simple! It is not an efficient use of space, however, because different processes have different sizes.
Early partitioning techniques used fixed partitions: a fixed number of partitions, each of fixed size. Each partition held exactly one process, and the OS made an effort to fit processes to partitions.
This soon gave way to variable partitions in which the partition was only as large as the process, and could be allocated in any available contiguous hole of memory large enough to contain it. The OS must maintain a list of holes. The trick then is matching processes to holes, and several strategies emerged:
- First fit: allocate the first hole that is big enough.
- Best fit: allocate the smallest hole that is big enough.
- Worst fit: allocate the largest hole.
All these approaches result in external fragmentation: holes between processes that are too small to be usable. Fragments can be periodically removed through compaction (analogous to disk defragmentation), but this requires OS overhead.
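The hole-matching strategies are easy to state as code. This is a sketch only: holes are represented as `(start, size)` pairs, and the function names mirror the strategy names rather than any real allocator's API.

```python
# Hole-matching strategies for variable partitioning.
# A hole is a (start_address, size) pair; 'size' is the request size.

def first_fit(holes, size):
    """Return the first hole large enough, or None."""
    for hole in holes:
        if hole[1] >= size:
            return hole
    return None

def best_fit(holes, size):
    """Return the smallest hole that is large enough, or None."""
    candidates = [h for h in holes if h[1] >= size]
    return min(candidates, key=lambda h: h[1], default=None)

def worst_fit(holes, size):
    """Return the largest hole, provided it is large enough, or None."""
    biggest = max(holes, key=lambda h: h[1], default=None)
    return biggest if biggest is not None and biggest[1] >= size else None
```

Given holes `[(0, 50), (100, 20), (200, 120)]` and a 20-byte request, first fit picks `(0, 50)`, best fit picks `(100, 20)`, and worst fit picks `(200, 120)`. Note that after allocation the chosen hole must be shrunk or removed from the list, which this sketch omits.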
The fragmentation problems of variable partitioning were caused by the requirement for contiguous memory allocation. Paging allows the physical process space to be non-contiguous.
Here we consider only the mechanism, which is loading pages into frames and translating logical addresses to physical addresses. The policy, which concerns which frames to load, when to load them, where to load them and when to replace them, is covered in the virtual memory lecture.
Paging eliminates external fragmentation but introduces internal fragmentation. This is simply the unused portion of the last page of a process, on average one half of a page in size. Internal fragmentation can be reduced only by reducing the page size, which has performance costs of its own (smaller pages mean more pages per process and thus a longer page table; see below).
Another cost of paging: since any frame can be allocated to any page, the OS has to keep track of which frames are free (for future allocations) and which frames are allocated to which process (for protection -- a process must be prevented from accessing a frame allocated to a different process). It does this using a frame table with one entry per physical frame.
The hardware required to implement paged addressing includes:
- a page table, with one entry per page of the logical address space, mapping page numbers to frame numbers;
- a page-table base register (PTBR) pointing to the page table in memory;
- for performance, a translation cache (the TLB, discussed below).
Each process has its own page table. The page table (or at least a pointer to it) is part of the Process Control Block and must be saved/restored upon context switch.
The page table can be implemented using registers only if the table is very small. The benefits of a large page table (e.g. 1 million entries) are great enough that the sacrifice is made to store it in main memory instead. In this case, the base address of the page table itself is stored in a register (the PTBR). Retrieving a page table entry during translation thus requires an additional memory access.
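The translation itself is simple bit manipulation. The sketch below assumes 4 KB pages (a 12-bit displacement) and a toy page table held as a dictionary; in hardware the table lookup is the extra memory access just described.

```python
PAGE_SIZE = 4096     # 4 KB pages -> 12-bit displacement (assumed for this sketch)
OFFSET_BITS = 12

# Toy per-process page table: index = page number, value = frame number.
page_table = {0: 7, 1: 3, 2: 11}

def translate(logical_address):
    """Split a logical address and rebuild the physical address."""
    page = logical_address >> OFFSET_BITS        # high-order bits: page number
    offset = logical_address & (PAGE_SIZE - 1)   # low-order bits: displacement
    frame = page_table[page]                     # the extra memory access
    return (frame << OFFSET_BITS) | offset
```

For example, logical address `2 * 4096 + 5` (page 2, offset 5) maps through frame 11 to physical address `11 * 4096 + 5`. The displacement passes through unchanged; only the page number is replaced by a frame number.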
Protection bits can also be added to each page table entry. Two examples are:
- a valid-invalid bit, indicating whether the page is part of the process's logical address space at all;
- a read-only bit, causing a trap if the process attempts to write to a read-only page.
If multiple processes are running the same application, it is advantageous to keep only one shared copy of the application's binary code in memory. The page tables for those processes will all contain entries pointing to the same set of shared frames occupied by the application.
With the page table in main memory, every logical memory reference now costs two physical accesses. How can we avoid doubling the effective memory access time? The answer is a small associative cache of recently-accessed page table entries called the Translation Look-aside Buffer (TLB).
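The TLB's role can be sketched as a lookup performed before the page table. The dictionary here is a stand-in for a real TLB, which is a small associative hardware cache searched in parallel; the variable names are illustrative.

```python
# Sketch of translation with a TLB consulted before the in-memory page table.
tlb = {}                            # stand-in for the associative hardware cache
page_table = {0: 7, 1: 3, 2: 11}    # toy in-memory page table

def lookup_frame(page):
    """Return the frame for a page, using the TLB when possible."""
    if page in tlb:
        return tlb[page]            # TLB hit: no memory access for the table
    frame = page_table[page]        # TLB miss: extra access to the page table
    tlb[page] = frame               # cache the translation for future references
    return frame
```

Because of locality of reference, most lookups hit the TLB after a short warm-up, so the extra page-table access is paid only on misses. A real TLB also has a fixed capacity and must evict entries, and must be flushed or tagged on a context switch since each process has its own page table; this sketch omits both.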
Page table size is a huge concern. Consider a 32-bit virtual address with a reasonable 4 KB page size; the displacement field uses the low-order 12 bits, leaving 20 for the page number. The page table thus has 2^20 (over 1 million) entries, with each entry requiring 4 bytes or more -- 4 MB of RAM per process!
Wait, it gets worse... the page table must be stored in contiguous locations to allow the page number to be used as an index!
Solution? Page the page table! Instead of having one page table with 2^20 entries, you could have 2^10 page tables each with 2^10 entries, i.e. 1024 page tables of 1024 entries each. The contiguous storage requirement then drops from 4 MB to 4 KB (each piece could be stored in one frame).
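Under this scheme the 20-bit page number is itself split in two: the high 10 bits index the outer page table, and the next 10 bits index the selected page of the page table. A sketch of the split, assuming the 32-bit/4 KB layout above:

```python
# Two-level paging address split for a 32-bit address with 4 KB pages:
# | 10-bit outer index | 10-bit inner index | 12-bit displacement |

def split(logical_address):
    """Decompose a 32-bit logical address for two-level paging."""
    offset = logical_address & 0xFFF          # low 12 bits: displacement
    inner = (logical_address >> 12) & 0x3FF   # next 10 bits: entry within a
                                              # page of the page table
    outer = logical_address >> 22             # high 10 bits: outer-table entry
    return outer, inner, offset
```

For instance, an address built as `(5 << 22) | (9 << 12) | 0x34` splits into outer index 5, inner index 9, and displacement `0x34`: outer entry 5 locates the page-table page, whose entry 9 holds the frame number.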
Advantage? Page table can be split up and stored in non-contiguous frames, facilitating memory management.
Disadvantage? Now two memory accesses are required to find the frame number, one to access the outer page table and a second to access the page table page. This can be overcome using a TLB, since a TLB hit would eliminate both accesses.
Can this be extended to 3-levels? 4? Sure -- the Motorola 68030, used in Macintoshes for years, implemented 4-level paging.
Paging is fine, but it results in a memory organization that bears no resemblance to the programmer's view of a process: functions, classes, modules, data, and so forth.
Organizing memory by segmentation means thinking of the logical address space as a collection of segments.
Advantage of segmentation: it facilitates sharing and protection. For read-only contents, such as a program module, several processes can share one copy of a memory-resident segment. Access to and manipulation of a segment can be controlled through protection bits stored in its single segment table entry.
Disadvantage of segmentation: external fragmentation of memory, since allocation is based on variable segment sizes.
The advantages of segmentation are considerable, but how can we control the fragmentation problem? You guessed it: page the segments! We relax the requirement that a segment be stored in contiguous memory. A generic solution defines a segment table in which each entry points to the page table for that segment. You should be able to figure it out from there.
We will not go into the details of this solution. However you should be aware that this is not just of theoretical interest; the Intel Pentium processor implements the technique of paged segments.