Synook

A simple demonstration of virtual memory

In discussions, it seems many people are unclear on how memory (RAM) is managed in modern OSs, in particular on the concept of virtual memory and its implications for a process, and on how virtual memory is not the same thing as paging/swapping (or RAM, or ‘storage’, or the memory used by virtual machines, etc.).

Many modern computer architectures (e.g., x86-64) have a Memory Management Unit (MMU) that translates the memory addresses used by running processes into physical addresses. The memory seen by a process is therefore only virtual memory. This allows the OS to present a contiguous and isolated virtual address space to each process, irrespective of what else is running on the machine; each process in effect thinks it has all the memory to itself. This is easily demonstrated by the program below, which when compiled and run (on, e.g., Linux x86-64) will show the same memory address holding two different values ‘simultaneously’!

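#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
  int a, i;
  pid_t pid = fork();  /* from here on, two processes run this code */
  if (pid < 0) {
    perror("fork");
    return EXIT_FAILURE;
  }
  if (pid > 0) sleep(1);  /* parent: offset its output from the child's */
  a = getpid();  /* each process stores its own pid in its own copy of a */
  for (i = 0; i < 4; i++) {
    /* %p expects a void *, and pid_t is cast to int for %i */
    printf("pid=%i: &a=%p, a=%i\n", (int)getpid(), (void *)&a, a);
    sleep(2);
  }
  return 0;
}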

Sample output for the lazy:

pid=23: &a=0x7ffff89d1b08, a=23
pid=22: &a=0x7ffff89d1b08, a=22
pid=23: &a=0x7ffff89d1b08, a=23
pid=22: &a=0x7ffff89d1b08, a=22
pid=23: &a=0x7ffff89d1b08, a=23
pid=22: &a=0x7ffff89d1b08, a=22
pid=23: &a=0x7ffff89d1b08, a=23
pid=22: &a=0x7ffff89d1b08, a=22

Of course, this is because the memory spaces seen by the two instances of the program are completely isolated, and the MMU is translating the same virtual address to a different physical memory location depending on the process. Neither process can see the other's memory at all, or even find out where it might be in RAM (unless some form of shared memory is explicitly set up).

Virtual memory also enables the use of paging (the usual Windows term) or swapping (the usual Unix term), where some virtual addresses, or more accurately pages (fixed-size blocks of addresses, hence ‘page file’, etc.), are backed by a location on secondary storage (hard disk, etc.) rather than RAM, increasing the apparent amount of memory available. When a program touches such a page, the MMU raises a page fault and the OS loads the page back into RAM, moving other pages out to disk if necessary (i.e., they are ‘swapped’). Obviously this would not be possible without virtual memory. Swapping, however, is not the same as virtual memory; it is just a feature enabled by it.

If you are wondering, &a (the memory address of a) is the same in both instances of the program because fork creates the new process as an exact copy of the current one, virtual address space included (once again only possible due to virtual memory), and both processes then continue execution from the instruction after the fork. If you ran the program twice separately, the address would almost certainly be different, since systems such as Linux randomize parts of the address space on each run (ASLR). It would not differ by much, you might note, since a program's data are usually laid out consistently in its virtual address space across invocations (this layout is often controlled by the headers in the binary file, and so is even more likely to be similar for multiple instances of one program).