Memory Management

Basic Hardware

Registers built into the CPU are generally accessible within one CPU clock tick, and CPUs can often perform one or more operations on register contents per clock tick.

The same cannot be said for RAM, whose contents are transferred to the CPU registers via the memory bus. Completing a RAM access may take more than one clock tick. This is intolerable, since the CPU accesses memory so frequently. The solution is to use cache memory: memory that resides between the CPU and RAM. Hardware components automatically fetch blocks of data from RAM and place them in the cache for quicker access by the CPU.
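
To see the effect of the cache, consider a small C sketch (my example, not from the lecture): summing the same large array sequentially and then with a large stride. The sequential pass reuses every block the hardware pulls into cache; the strided pass defeats that reuse and typically runs several times slower.

/* Sketch: sequential vs. strided access over the same array.
 * Sequential access lets the hardware fetch whole blocks into
 * cache and reuse them; strided access misses on nearly every read. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1 << 24)      /* 16M ints, far larger than any cache        */
#define STRIDE 4096      /* jump far enough to defeat block reuse      */

int main(void) {
    int *a = calloc(N, sizeof *a);
    if (!a) return 1;
    long sum = 0;

    clock_t t0 = clock();
    for (int i = 0; i < N; i++)          /* sequential: cache-friendly */
        sum += a[i];
    clock_t t1 = clock();
    for (int j = 0; j < STRIDE; j++)     /* strided: same N reads,     */
        for (int i = j; i < N; i += STRIDE)   /* little cache reuse    */
            sum += a[i];
    clock_t t2 = clock();

    printf("sequential: %.2fs  strided: %.2fs  (sum=%ld)\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t2 - t1) / CLOCKS_PER_SEC, sum);
    free(a);
    return 0;
}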

To benefit from multi-processing, we need to have code and data for multiple processes loaded in RAM at the same time.

Although RAM is not physically manufactured as a linear array of bytes, the operating system views it as such. The smallest unit of RAM that can be addressed is the byte and each byte in RAM has a unique address, starting with address 0.

Memory (RAM) Access

For the rest of this lecture, when we refer to memory we’ll be referring to RAM.

The OS, operating in kernel-mode, is given unrestricted access to all memory. This allows the OS to

  • load a user’s program
  • dump a program in case of errors
  • perform I/O to and from user memory
  • store a process’s state in memory during a context switch
  • restore a process’s state during a context switch

For the most part, user-mode processes are not allowed to access memory that was not allocated to them.

Symbolic, Virtual and Physical Addresses

Usually a program resides on disk as a binary file. When a program is slated to be executed, it is put in a queue. When the kernel finds space for it in memory, a process is created and its code is loaded into memory. It is then put on the scheduler’s ready queue.

Most systems allow a user process to reside in any unused area of physical memory. So, although the address space of memory starts at address 0, the process will most likely not be loaded at memory address 0. The question, then, is: since the compiler has no idea where in RAM the code will be loaded, how can it specify an address (e.g. the target of a jump instruction) in the compiled code?

The name of a function in source code can be thought of as a symbolic address – it indicates where a block of code begins. Since the compiler doesn’t know at compile time where the process will reside when it is run, it typically binds symbolic addresses to relocatable virtual addresses (aka logical addresses), starting at 0, which denotes the beginning of the module. That is, it compiles the code as if the code will be loaded in RAM starting at address 0. For example, the symbolic address for a function named foo may be bound to virtual address 14: 14 bytes from the beginning of the code block, wherever that may be. For the time being, let’s assume that the code is loaded in a contiguous block of memory.

When an address in a program is loaded into the CPU, the virtual address must be bound to the physical address space.  The runtime mapping from the virtual address space to the physical address space is performed by the memory-management unit (MMU).

One way to handle this is to store the starting address of the process’s code block. We call this the base address and store it in a register. For example, suppose the base address for a process is 10,000 and the virtual address for a function named getAddress is 140. Then the physical address of getAddress is computed by:

physical address = base address + virtual address
10140 = 10000 + 140

In other words, the value of the base address is added to every virtual address to get a physical address.
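
As a rough illustration, here is a minimal C sketch of this base-register scheme. The limit check and its value are assumptions added for the example, not details from the lecture: real hardware also compares each virtual address against a limit register and traps if it is out of range.

/* Sketch of base-register relocation: every virtual address is
 * checked against a limit, then offset by the base register. */
#include <stdio.h>
#include <stdlib.h>

#define BASE  10000   /* where the process was loaded (from the example) */
#define LIMIT 5000    /* size of the process's allocation (assumed)      */

/* Translate a virtual address; trap (exit) if it is out of range. */
unsigned translate(unsigned vaddr) {
    if (vaddr >= LIMIT) {
        fprintf(stderr, "trap: address %u beyond limit\n", vaddr);
        exit(1);
    }
    return BASE + vaddr;
}

int main(void) {
    /* getAddress is at virtual address 140, so physically at 10140 */
    printf("getAddress -> physical %u\n", translate(140));
    return 0;
}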

Paging

One way to allocate memory is to give a process a single contiguous block of memory. However, as processes terminate and others replace them, memory develops holes. This is called fragmentation. One way to avoid fragmentation is to use paging. Paging is used in most OSs, on hardware ranging from mainframes to mobile devices.

To implement paging, physical memory is partitioned into fixed-size blocks called frames. All frames have the same size, a power of 2, typically between 4 KB and 8 KB. When the kernel allocates memory for a process, it reserves a set of frames for the process to use. This set of frames need not form a single contiguous area of memory. The OS keeps track of which frames are allocated to which processes in a frame table.
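
A frame table can be as simple as an array with one entry per physical frame recording the owning process. The following C sketch is an illustrative design, not any particular OS’s; note that nothing forces the frames it finds to be contiguous.

/* Sketch of a frame table: frame number -> owning pid (or FREE). */
#include <stdio.h>

#define NFRAMES 16
#define FREE    (-1)

static int frame_table[NFRAMES];

/* Reserve n free frames for process pid; returns how many were found. */
int alloc_frames(int pid, int n, int out[]) {
    int got = 0;
    for (int f = 0; f < NFRAMES && got < n; f++)
        if (frame_table[f] == FREE) {
            frame_table[f] = pid;   /* record the owner  */
            out[got++] = f;         /* frames may be scattered */
        }
    return got;
}

int main(void) {
    for (int f = 0; f < NFRAMES; f++) frame_table[f] = FREE;
    int frames[4];
    int got = alloc_frames(42, 4, frames);
    printf("pid 42 got %d frames:", got);
    for (int i = 0; i < got; i++) printf(" %d", frames[i]);
    printf("\n");
    return 0;
}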

We previously stated that a compiler can assign virtual addresses in a virtual address space with addresses from 0 to some max value. With paging, the compiler partitions the virtual address space into fixed-size blocks called pages, each page having the same size as a frame. Each virtual address created by the compiler can then be written as a pair (p, o), where p is the page number and o is the offset from its page boundary. Each page that an executable defines is mapped to a physical frame in a page table.
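
Because the page size is a power of 2, splitting a virtual address into (p, o) requires no division hardware: the page number is just the high-order bits and the offset the low-order bits. A minimal C sketch, assuming 1024-byte pages for concreteness:

/* Sketch: decompose a virtual address into (page, offset)
 * using shift and mask, since PAGE_SIZE is a power of 2. */
#include <stdio.h>

#define PAGE_SIZE   1024u   /* assumed: 2^10-byte pages */
#define OFFSET_BITS 10

int main(void) {
    unsigned vaddr = 7 * PAGE_SIZE + 100;    /* will decompose to (7, 100) */
    unsigned p = vaddr >> OFFSET_BITS;       /* same as vaddr / PAGE_SIZE  */
    unsigned o = vaddr & (PAGE_SIZE - 1);    /* same as vaddr % PAGE_SIZE  */
    printf("vaddr %u -> (p=%u, o=%u)\n", vaddr, p, o);
    return 0;
}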

When the CPU needs to compute the physical address for a virtual address (p, o), the following occurs:

  1. The frame number associated with the page (p) is determined by querying the page table.
  2. The frame boundary is computed by multiplying the frame number by the size of each frame.
  3. The physical address associated with (p,o) is computed by adding o to the frame boundary.

Example
  • Suppose a function named getAddress() has a virtual address given by (7, 100).
  • Assume page 7 is mapped to frame 2 in the process’ page table
  • Assume also that the frame size is 1024 bytes.

Then the physical address of getAddress is:

(2 * 1024) + 100 = 2048 + 100 = 2148
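
Here is the same computation as a C sketch. The page-table contents are made up for the example, except that page 7 maps to frame 2 as stated above:

/* Sketch of the three translation steps for a virtual address (p, o). */
#include <stdio.h>

#define FRAME_SIZE 1024u

/* page number -> frame number; page 7 maps to frame 2 as in the example */
static unsigned page_table[] = {5, 3, 9, 1, 8, 6, 4, 2};

unsigned translate(unsigned p, unsigned o) {
    unsigned frame    = page_table[p];        /* step 1: query page table */
    unsigned boundary = frame * FRAME_SIZE;   /* step 2: frame boundary   */
    return boundary + o;                      /* step 3: add the offset   */
}

int main(void) {
    /* getAddress() is at (7, 100): (2 * 1024) + 100 = 2148 */
    printf("physical address: %u\n", translate(7, 100));
    return 0;
}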

Dynamic Linking

Dynamically linked libraries are system libraries that are linked to user programs when the programs are run. This feature is usually used with system libraries that are called frequently.

When a program is loaded, a stub for each dynamically linked library routine is included in the loaded user image. The stub is a small piece of code that indicates how to locate the routine if it is already in memory, or how to load it if it is not. If the routine is already in the process’s memory space, it is executed directly. If the routine resides in another process’s address space, the OS updates the process’s memory address table so the process can access the routine and run it.

Notice that here, to run a dynamically linked library routine that is already stored in memory, a process may in fact share another process’s memory space. Thus, only one copy of the routine is ever in memory at a given time, giving rise to the term shared libraries.
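
On POSIX systems, a program can also load a shared library explicitly at run time through the dlopen/dlsym interface, a close cousin of the stub mechanism described above. A minimal sketch, using libm and cos() purely as a familiar example of a shared system library:

/* Sketch: load a shared library at run time and call into it. */
#include <stdio.h>
#include <dlfcn.h>

int main(void) {
    void *lib = dlopen("libm.so.6", RTLD_LAZY);  /* map library in (or reuse it) */
    if (!lib) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    double (*cosine)(double) = (double (*)(double))dlsym(lib, "cos");
    if (!cosine) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    printf("cos(0.0) = %f\n", cosine(0.0));      /* call into the shared copy */
    dlclose(lib);
    return 0;
}

On most Linux systems this compiles with: cc demo.c -ldl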

Swapping

A process must be in memory to be executed. Since there may not be enough memory to hold all the currently running processes, a process may be swapped out of memory to a backing store (e.g. a swap partition on a hard drive) and be brought back into memory for continued execution when the CPU is ready for it.

As we mentioned last time, the kernel maintains a ready queue for processes that are ready to run, and the scheduler chooses a process to run. The dispatcher then checks whether that process is in memory or in the backing store. If it is in the backing store, the dispatcher swaps out a process that is in memory to make room and swaps in the selected process. It then restores the state of the registers and transfers control to the selected process.

The time to swap a process out can be costly (roughly 2 seconds per 100 MB).

What if a process was allocated 1 GB of memory?
What if a process that is waiting on I/O directed at its own memory is swapped out?

How does an operating system deal with these issues?

  • Most OSs swap only if the amount of free memory is below a threshold.
  • Swapping is halted when free memory increases again (a sketch of this watermark policy follows the list).
  • Some swap out only a portion of a process, leaving some of the outgoing process’s data in memory.
  • Mobile devices do not support swapping. If a user launches a new app and there is no memory available, an old app is terminated instead.
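
The first two bullets are often implemented with a pair of watermarks rather than a single threshold, so the swapper does not rapidly toggle on and off around one value. A minimal C sketch of that policy; the watermark values are assumptions, not numbers from any particular kernel:

/* Sketch of threshold-based swapping with two watermarks:
 * start reclaiming below the low mark, stop above the high mark. */
#include <stdbool.h>
#include <stdio.h>

#define LOW_WATERMARK  (64u * 1024 * 1024)   /* assumed: start swapping */
#define HIGH_WATERMARK (128u * 1024 * 1024)  /* assumed: stop swapping  */

static bool swapping = false;

void check_memory(unsigned long free_bytes) {
    if (!swapping && free_bytes < LOW_WATERMARK)
        swapping = true;                      /* begin swapping out processes */
    else if (swapping && free_bytes > HIGH_WATERMARK)
        swapping = false;                     /* free memory has recovered    */
    printf("free=%luMB swapping=%s\n", free_bytes >> 20,
           swapping ? "yes" : "no");
}

int main(void) {
    check_memory(200ul << 20);   /* plenty free: no swapping      */
    check_memory(50ul << 20);    /* below low mark: start         */
    check_memory(100ul << 20);   /* between marks: keep swapping  */
    check_memory(150ul << 20);   /* above high mark: halt         */
    return 0;
}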

© 2017, Eric. All rights reserved.