C SC 340 Lecture 9: Virtual Memory


What is virtual memory?

Recall that the previous lecture concerned organizing memory and translating logical memory addresses to physical addresses. The following line of reasoning motivates virtual memory.

  1. The OS should support multiprogramming so the CPU will not be idle while a process is waiting on I/O or some other event
  2. An instruction can be executed in a fetch/decode/execute cycle only if it is stored in (main) memory
  3. As a result of 1 and 2, it is necessary to keep more than one process in memory at a time
  4. However, it is not possible to store more than one process beginning at memory location 0 (or any other fixed address) at the same time.
  5. As a result of 4, a compiler cannot generate absolute addresses because it does not know at what address the process will be loaded.
  6. As a result of 5, the compiler generates logical addresses, which assume a starting address of 0.
  7. Address translation hardware is then used to translate each logical address request into a physical address at runtime.
  8. There is no good reason to restrict logical address space from exceeding physical address space.
  9. There is no good reason to require that an entire process be in memory while it is running - vast parts of it are unused at any given time or possibly not used at all!
  10. If we allow both 8 and 9, then processes can be larger than the total physical memory! Programs and processes thus occupy virtual memory.

Point #9 is the final key to virtual memory. By allowing a partially-loaded process to run, the OS realizes the benefits of points 8 and 9: a process can be larger than physical memory, and more processes can be resident at once.

Our study of virtual memory is limited to systems that use paged memory allocation. Assume the translation hardware consists of registers for the virtual and physical address plus a basic page table. The TLB and all variations of the page table are allowed but are not relevant to the topics at hand.

So, what are the "topics at hand"?

The major topics we cover here are demand paging, page replacement, frame allocation, and thrashing.


Demand Paging

Demand paging refers to the practice of not loading a page into memory until it is needed by the running process, i.e. loading each page only on demand. If no pages are loaded initially for a new process, it is called pure demand paging.

The mechanism for determining that a page needs to be loaded is relatively simple: a valid bit in each page table entry records whether the page is in memory, and a reference to a page marked invalid traps to the OS.

When a process requests memory located in a page not yet loaded, a page fault occurs. The OS must allocate a frame to the demanded page. This is potentially very complex (to do well) and involves OS policy.

Before getting into that, note that the other issues demand paging raises are usually pretty easy to solve.


Page Replacement

When a page demand occurs, the OS (among other things):

  1. consults its frame table to find an available frame,
  2. allocates the frame to the page,
  3. reads the page contents from disk into the frame, and
  4. updates its frame table and the process' page table.
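The four steps above can be sketched in Python. All names here (free_frames, read_from_disk, and so on) are illustrative stand-ins, not a real OS interface, and the sketch assumes a free frame is available:

```python
# Minimal sketch of servicing a page demand when a free frame is available.

free_frames = [0, 1, 2, 3]   # frame table: frames not yet allocated
page_table = {}              # page number -> frame number (valid pages only)
memory = {}                  # frame number -> contents

def read_from_disk(page):
    """Stand-in for the disk read that fills a frame with page contents."""
    return f"contents of page {page}"

def handle_page_fault(page):
    frame = free_frames.pop(0)             # 1. find an available frame
    page_table[page] = frame               # 2. allocate the frame to the page
    memory[frame] = read_from_disk(page)   # 3. read page contents into the frame
    return frame                           # 4. tables were updated in steps 1-2

handle_page_fault(7)
print(page_table)   # {7: 0}
```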

But what happens if there are no available frames? The process has an immediate demand for the page, and it should be serviced promptly. It is the OS's responsibility to replace a currently allocated page. But which one to choose? This is the page replacement problem.

In this situation, additional steps are needed between 1 and 2.

  1a. if no frames are available, select a victim page for replacement, and
  1b. if necessary, save the victim to swap storage (disk).

Step 1b requires yet another disk operation, which slows down the paging process even more! Fortunately, it can often be avoided. How? Add a dirty bit to the frame table or page table entry. The dirty bit is set to 0 when the page is loaded and set to 1 when any location in that page is modified. If it is still 0 at replacement time, the page need not be written to disk. A similar bit can be used to mark read-only pages.
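The write-back savings can be illustrated with a small sketch (the frame-entry layout here is hypothetical):

```python
# Sketch of the dirty-bit optimization at replacement time.

disk_writes = 0

def evict(frame_entry):
    """Write the victim back to swap only if it was modified while resident."""
    global disk_writes
    if frame_entry["dirty"]:
        disk_writes += 1          # write-back needed
    frame_entry["dirty"] = False

clean = {"page": 3, "dirty": False}    # loaded but never modified
modified = {"page": 5, "dirty": True}  # at least one store hit this page

evict(clean)       # no disk write
evict(modified)    # one disk write
print(disk_writes)  # 1
```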

The page to be replaced can be selected either from among those in the same process, called local replacement, or from among those in all processes, called global replacement. The replacement techniques described here apply equally well to both.

An optimal page replacement technique

The ideal technique is to replace the page that will not be used again for the longest period of time! Remind you of SJF scheduling? It has the same implementation problem - it requires knowledge of the future - but like SJF it can be used as a benchmark.

As with SJF, we'll try to approximate the ideal by using past behavior to predict future behavior.
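A simulator, unlike a running OS, knows the entire reference string in advance, so the optimal policy can be computed offline. Here is a sketch that counts its page faults (the reference string is a commonly used textbook sample, not from these notes):

```python
# Optimal (MIN) replacement: evict the resident page whose next use is
# furthest in the future (or never). Only feasible offline -- hence its
# role as a benchmark rather than a real policy.

def optimal_faults(references, num_frames):
    frames, faults = [], 0
    for i, page in enumerate(references):
        if page in frames:
            continue                      # hit: no fault
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)           # free frame still available
        else:
            future = references[i + 1:]
            # victim = page never used again, else the one used furthest ahead
            victim = max(frames,
                         key=lambda p: future.index(p) if p in future else len(future))
            frames[frames.index(victim)] = page
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(optimal_faults(refs, 3))   # 9
```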

First In First Out (FIFO) replacement

This technique replaces the oldest page, i.e. the page which has been in memory the longest. The data structure is easy to maintain: just a queue of page-frame allocations. Insert at the tail and replace at the head.

Advantage? Easy to implement. Disadvantage? The oldest pages are often the most heavily used. Swapping one out results in it needing to be paged back in very soon.

FIFO does not take process memory access behavior into account, so is not a good approximation to the optimal.
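FIFO is easy to simulate. On the same sample reference string used for the optimal benchmark above, it faults 15 times where the optimal faults only 9:

```python
from collections import deque

# FIFO replacement: insert at the tail, replace at the head.

def fifo_faults(references, num_frames):
    frames, queue, faults = set(), deque(), 0
    for page in references:
        if page in frames:
            continue                          # hit: FIFO order is NOT updated
        faults += 1
        if len(frames) == num_frames:
            frames.remove(queue.popleft())    # replace the oldest page
        frames.add(page)
        queue.append(page)
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(fifo_faults(refs, 3))   # 15
```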

Least Recently Used (LRU) replacement

Uses past behavior to predict the future in this way: replace the page that has not been used for the longest period of time. This is the "back in time" equivalent to the optimal.

Implementing LRU is a bear because it requires significant overhead. One approach is to store a clock value in the frame table entry at every reference, then at replacement time search the frame table for the oldest entry. Another approach is to maintain a stack of all page numbers; when a page is referenced, pull it from wherever it is on the stack and put it on top, so at replacement time the one on the bottom is the least recently used. Both approaches require something to be updated at every memory reference.
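The stack approach can be sketched directly; a Python list stands in for the stack (top = end of the list). On the sample reference string used above, LRU faults 12 times - between FIFO's 15 and the optimal's 9:

```python
# LRU via the stack approach: referenced page moves to the top;
# the page on the bottom is the least recently used.

def lru_faults(references, num_frames):
    stack, faults = [], 0    # stack[-1] = most recent, stack[0] = least recent
    for page in references:
        if page in stack:
            stack.remove(page)       # pull it from wherever it is on the stack
        else:
            faults += 1
            if len(stack) == num_frames:
                stack.pop(0)         # replace the one on the bottom
        stack.append(page)           # put the referenced page on top
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(lru_faults(refs, 3))   # 12
```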

Approximating LRU through aging

Pure LRU requires too much overhead. Here is an approach that approximates it at much lower cost. It combines the concepts of the clock, a reference bit, and a daemon OS process. Here's how it works:

  1. each page table entry contains a reference bit and an 8-bit history byte,
  2. the hardware sets the reference bit to 1 whenever the page is referenced,
  3. at a regular interval (here, every 100 ms), the daemon shifts each history byte one position to the right, copies the reference bit into the high-order bit, and clears the reference bit, and
  4. at replacement time, the page with the lowest history byte value is selected as the victim.

The history byte in this case represents page usage over the past 800 ms, and the bit positions represent age. If the page is never used, its value remains 0 because all the reference bit values shifted into it were 0. If it is frequently used, its value may be as high as 255 (all 1's). The low-order bit is the furthest peek into the past.
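Here is a sketch of the daemon's work, assuming an 8-bit history byte updated every 100 ms (the page numbers and reference pattern are made up for illustration):

```python
# Aging approximation of LRU: every tick, shift each history byte right
# and copy the reference bit into the high-order bit.

history = {p: 0 for p in (0, 1, 2)}        # 8-bit history byte per page
referenced = {p: False for p in (0, 1, 2)}  # hardware-set reference bits

def tick():
    """Run by the daemon every 100 ms (800 ms window / 8 bits)."""
    for p in history:
        history[p] = (history[p] >> 1) | (0x80 if referenced[p] else 0)
        referenced[p] = False              # clear reference bit for next interval

# Simulate 8 intervals: page 0 referenced every time, page 1 once, page 2 never.
for interval in range(8):
    referenced[0] = True
    if interval == 0:
        referenced[1] = True
    tick()

print(history)   # {0: 255, 1: 1, 2: 0}
victim = min(history, key=history.get)   # lowest value = least recently used
print(victim)    # 2
```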

Approximating LRU through second chance

Here is a second LRU approximation approach that combines FIFO replacement and a reference bit. Here's how it works:

  1. maintain the FIFO queue of page-frame allocations as before,
  2. set a page's reference bit to 1 whenever it is referenced,
  3. at replacement time, examine the page at the head of the queue,
  4. if its reference bit is 0, select it as the victim, and
  5. if its reference bit is 1, clear the bit, move the page to the tail of the queue, and repeat from step 3.

This last step represents the "second chance": if the oldest page has been recently used, it is given a second chance.
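Victim selection under second chance can be sketched with (page, reference bit) pairs in a FIFO queue (illustrative, not a real kernel structure):

```python
from collections import deque

# Second-chance replacement: FIFO order plus a reference bit.
# Head of the deque = oldest page.

def select_victim(queue):
    """Pop pages from the head until one with reference bit 0 is found.
    A recently used page gets its bit cleared and a second chance at the tail."""
    while True:
        page, ref = queue.popleft()
        if ref == 0:
            return page               # oldest page not recently used
        queue.append((page, 0))       # clear the bit, move to the tail

q = deque([(3, 1), (5, 0), (8, 1)])
victim = select_victim(q)
print(victim)   # 5 -- page 3 was recently used, so it gets a second chance
```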

Determining how many frames to allocate

The OS can allocate the minimum number of frames. Consider the minimum number of frames needed by a single instruction. You may say "one" but in fact several may be required: the instruction itself may straddle a page boundary, each memory operand may lie in (or straddle) yet another page, and indirect addressing raises the count further.

The OS can allocate an equal share of available frames to every new process. This is simple to implement but not very effective.

The OS can allocate a number of frames proportional to the process size, priority or both. Larger and/or higher priority processes are allocated more frames than smaller and/or lower priority ones.
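Proportional allocation by size is simple arithmetic. Here is a sketch (the process names and sizes are made up):

```python
# Proportional frame allocation: process i receives a share of the free
# frames proportional to its virtual size s_i: a_i = (s_i / sum(s)) * m.

def proportional_allocation(sizes, total_frames):
    total = sum(sizes.values())
    return {pid: (s * total_frames) // total for pid, s in sizes.items()}

# Two processes of 10 and 127 pages sharing 62 free frames:
alloc = proportional_allocation({"P1": 10, "P2": 127}, 62)
print(alloc)   # {'P1': 4, 'P2': 57}
```

A priority-weighted variant would simply substitute (or combine) a priority value for the size in the same formula.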

The OS can dynamically regulate the number of frames allocated to a process. Some methods are described below in the discussion of thrashing.


Thrashing

When demand paging occurs with high frequency, the OS spends an inordinate amount of time simply swapping pages between memory and disk. Because little productive work is accomplished despite the flurry of activity, the situation is called thrashing.

What causes this to happen? It is usually a combination of page replacement and CPU scheduling policy coupled with high process workload. Assume global page replacement:

  1. a "greedy" process "steals" pages from other processes,
  2. the affected processes subsequently generate more page faults which in turn take pages from yet other processes.
  3. recall that processes waiting for paging I/O are blocked, so the processes involved in paging cannot run.
  4. as a result, the CPU's "ready" queue diminishes and CPU utilization decreases.
  5. the OS notices this and increases the degree of multiprogramming - the number of processes that can reside in memory simultaneously!
  6. this opens the floodgates to new processes
  7. the new processes generate a large number of page faults to get started
  8. those page faults take frames from other processes, which cause them to generate page faults
  9. the CPU scheduler kicks up the degree of multiprogramming again! See where this is headed?

What can be done to keep this situation from occurring? Here are several techniques.

The working set model of a process was mentioned above as one way to prevent thrashing.
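The working set itself is easy to compute in a simulation. The definition assumed here is the standard one - the set of pages referenced in the last delta references ending at time t - since these notes do not spell one out:

```python
# Working set WS(t, delta): the set of distinct pages referenced in the
# window of the last `delta` references ending at time t.

def working_set(references, t, delta):
    return set(references[max(0, t - delta + 1): t + 1])

refs = [1, 2, 1, 5, 7, 7, 7, 5, 1]
print(sorted(working_set(refs, 8, 4)))   # [1, 5, 7]
```

If the sum of all processes' working set sizes exceeds the number of frames, the OS can suspend a process rather than let thrashing begin.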


[ C SC 340 | Peter Sanderson | Math Sciences server  | Math Sciences home page | Otterbein ]

Last updated:
Peter Sanderson (PSanderson@otterbein.edu)