Unfortunately, the term virtual memory (VM) is misunderstood, or at best hazily understood, by a large proportion of engineers. In this section, we attempt to clarify what this term and its associated terminology (such as memory pyramid, addressing, and paging) really mean; it's important for developers to clearly understand this key area.
First, what is a process?
A process is an instance of a program in execution.
A program is a binary executable file: a dead, on-disk object. For example, take the cat program:
$ ls -l /bin/cat
-rwxr-xr-x 1 root root 36784 Nov 10 23:26 /bin/cat
$
When we run cat, it becomes a live, schedulable runtime entity, which, in the Unix universe, we call a process.
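To make the program/process distinction concrete, here is a minimal sketch in Python (chosen purely for illustration). It launches the same on-disk program twice and shows that each launch yields a separate live process with its own PID. The helper name run_two_instances is hypothetical, and the Python interpreter itself stands in for cat so that the sketch does not assume a particular path.

```python
import subprocess
import sys


def run_two_instances(argv):
    """Start the same program twice; return the PIDs of the two processes."""
    # Each Popen() call asks the OS to create a new process from the same
    # executable file. Both children block reading stdin, so they are
    # alive at the same time.
    first = subprocess.Popen(argv, stdin=subprocess.PIPE)
    second = subprocess.Popen(argv, stdin=subprocess.PIPE)
    pids = (first.pid, second.pid)

    # Closing stdin delivers end-of-file, which lets each child exit.
    first.stdin.close()
    second.stdin.close()
    first.wait()
    second.wait()
    return pids


if __name__ == "__main__":
    # Assumption: a tiny child that just reads stdin until EOF, much as
    # cat does when given no file arguments.
    child = [sys.executable, "-c", "import sys; sys.stdin.read()"]
    pid_a, pid_b = run_two_instances(child)
    print(f"one program, two processes: PID {pid_a} and PID {pid_b}")
```

The executable file is unchanged on disk throughout; only the two running instances come and go.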
In order to understand deeper concepts clearly, we start with a small, simple, and fictional machine. Imagine it has a microprocessor with 16 address lines. Thus, it's easy to see that it will have access to a total potential memory space (or address space) of 2^16 = 65,536 bytes = 64 KB:
But what if the physical memory (RAM) on the machine is a lot less, say, 32 KB?
Clearly, the preceding diagram depicts virtual memory, not physical.
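The arithmetic for this fictional machine can be checked with a short Python sketch. It assumes byte-addressable memory, as the text does; the function names are hypothetical and exist only for illustration.

```python
KB = 1024


def address_space_bytes(address_lines):
    """Bytes addressable with the given number of address lines."""
    # Each extra address line doubles the number of distinct addresses.
    return 2 ** address_lines


def ram_shortfall_bytes(address_lines, ram_bytes):
    """How much of the address space cannot be backed by RAM alone."""
    return max(0, address_space_bytes(address_lines) - ram_bytes)


if __name__ == "__main__":
    space = address_space_bytes(16)
    print(f"address space: {space} bytes = {space // KB} KB")
    gap = ram_shortfall_bytes(16, 32 * KB)
    print(f"not backed by 32 KB of RAM: {gap // KB} KB")
```

The shortfall is the portion of the address space that RAM alone cannot hold on this machine.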
Meanwhile, physical memory (RAM) looks as follows:
Still, the system makes a promise to every process alive: each one will have available to it the entire virtual address space, that is, 64 KB. Sounds absurd, right? Yes, until one realizes that memory is more than just RAM; in fact, memory is viewed as a hierarchy, commonly referred to as the memory pyramid:
As with life, everything's a trade-off. Toward the apex of the pyramid, we gain speed at the cost of size; toward the bottom of the pyramid, it's inverted: size at the cost of speed. One could also consider CPU registers to be at the very apex of the pyramid; as their size is almost insignificant, they have not been shown.
To help quantify this, fairly typical numbers follow, according to Computer Architecture: A Quantitative Approach, 5th Ed., by Hennessy & Patterson:
| Type | CPU registers | L1 cache | L2 cache | L3 cache | RAM | Swap/storage |
|---|---|---|---|---|---|---|
| Server (size) | 1000 bytes | 64 KB | 256 KB | 2 - 4 MB | 4 - 16 GB | 4 - 16 TB |
| Server (access time) | 300 ps | 1 ns | 3 - 10 ns | 10 - 20 ns | 50 - 100 ns | 5 - 10 ms |
| Embedded (size) | 500 bytes | 64 KB | 256 KB | - | 256 - 512 MB | 4 - 8 GB Flash |
| Embedded (access time) | 500 ps | 2 ns | 10 - 20 ns | - | 50 - 100 ns | 25 - 50 us |
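To get a feel for the trade-off, the following Python sketch expresses each level's access time as a multiple of a register access. It uses the lower bound of each server-class range from the table above; taking the lower bound is an assumption made here for simplicity, and the names are hypothetical.

```python
# Lower-bound access times for the server-class row, in picoseconds.
SERVER_ACCESS_TIME_PS = {
    "CPU registers": 300,
    "L1 cache": 1_000,              # 1 ns
    "L2 cache": 3_000,              # 3 ns
    "L3 cache": 10_000,             # 10 ns
    "RAM": 50_000,                  # 50 ns
    "Swap/storage": 5_000_000_000,  # 5 ms
}


def slowdown_vs_registers(access_times_ps):
    """Return each level's access time as a multiple of register access."""
    base = access_times_ps["CPU registers"]
    return {level: t / base for level, t in access_times_ps.items()}


if __name__ == "__main__":
    for level, factor in slowdown_vs_registers(SERVER_ACCESS_TIME_PS).items():
        print(f"{level:14s} ~{factor:,.0f}x a register access")
```

On these figures, RAM is a few hundred times slower than a register, while swap is millions of times slower, which is why pages that land at the bottom of the pyramid are so costly to reach.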
The OS will do its best to keep the working set of pages as high up the pyramid as is possible, optimizing performance.
The reader should keep in mind that several of the following areas are explained in concept and not in actuality. The reason is straightforward: the deep and gory technical details are well beyond the scope of this book. The Further reading section, available on the GitHub repository, provides references for readers who are interested in going deeper into these matters.