Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 1. Introduction to the Linux Kernel

After three decades of use, the unix operating system is still regarded as one of the most powerful and elegant systems in existence. Since the creation of Unix in 1969, the brainchild of Dennis Ritchie and Ken Thompson has become a creature of legends, a system whose design has withstood the test of time with few bruises to its name.

Unix grew out of Multics, a failed multiuser operating system project in which Bell Laboratories was involved. With the Multics project terminated, members of Bell Laboratories' Computer Sciences Research Center were left without a capable interactive operating system. In the summer of 1969, Bell Lab programmers sketched out a file system design that ultimately evolved into Unix. Testing their design, Thompson implemented the new system on an otherwise idle PDP-7. In 1971, Unix was ported to the PDP-11, and in 1973, the operating system was rewritten in C, an unprecedented step at the time, but one that paved the way for future portability. The first Unix widely used outside of Bell Labs was Unix System, Sixth Edition, more commonly called V6.

Other companies ported Unix to new machines. Accompanying these ports were enhancements that resulted in several variants of the operating system. In 1977, Bell Labs released a combination of these variants into a single system, Unix System III; in 1982, AT&T released System V^[1].

The simplicity of Unix's design, coupled with the fact that it was distributed with source code, led to further development at outside organizations. The most influential of these contributors was the University of California at Berkeley. Variants of Unix from Berkeley are called Berkeley Software Distributions (BSD). The first Berkeley Unix was 3BSD in 1979. A series of 4BSD releases, 4.0BSD, 4.1BSD, 4.2BSD, and 4.3BSD, followed 3BSD. These versions of Unix added virtual memory, demand paging, and TCP/IP. In 1993, the final official Berkeley Unix, featuring a rewritten VM, was released as 4.4BSD. Today, development of BSD continues with the Darwin, Dragonfly BSD, FreeBSD, NetBSD, and OpenBSD systems.

In the 1980s and 1990s, multiple workstation and server companies introduced their own commercial versions of Unix. These systems were typically based on either an AT&T or Berkeley release and supported high-end features developed for their particular hardware architecture. Among these systems were Digital's Tru64, Hewlett Packard's HP-UX, IBM's AIX, Sequent's DYNIX/ptx, SGI's IRIX, and Sun's Solaris.

The original elegant design of the Unix system, along with the years of innovation and evolutionary improvement that followed, have made Unix a powerful, robust, and stable operating system. A handful of characteristics of Unix are responsible for its resilience. First, Unix is simple: Whereas some operating systems implement thousands of system calls and have unclear design goals, Unix systems typically implement only hundreds of system calls and have a very clear design. Next, in Unix, everything is a file^[2]. This simplifies the manipulation of data and devices into a set of simple system calls: open(), read(), write(), ioctl(), and close(). In addition, the Unix kernel and related system utilities are written in C—a property that gives Unix its amazing portability and accessibility to a wide range of developers. Next, Unix has fast process creation time and the unique fork() system call. This encourages strongly partitioned systems without gargantuan multi-threaded monstrosities. Finally, Unix provides simple yet robust interprocess communication (IPC) primitives that, when coupled with the fast process creation time, allow for the creation of simple utilities that do one thing and do it well, and that can be strung together to accomplish more complicated tasks.

Today, Unix is a modern operating system supporting multitasking, multithreading, virtual memory, demand paging, shared libraries with demand loading, and TCP/IP networking. Many Unix variants scale to hundreds of processors, whereas other Unix systems run on small, embedded devices. Although Unix is no longer a research project, Unix systems continue to benefit from advances in operating system design while they remain practical and general-purpose operating systems.

Unix owes its success to the simplicity and elegance of its design. Its strength today lies in the early decisions that Dennis Ritchie, Ken Thompson, and other early developers made: choices that have endowed Unix with the capability to evolve without compromising itself.

Along Came Linus: Introduction to Linux

Linux was developed by Linus Torvalds in 1991 as an operating system for computers using the Intel 80386 microprocessor, which at the time was a new and advanced processor. Linus, then a student at the University of Helsinki, was perturbed by the lack of a powerful yet free Unix system. Microsoft's DOS product was useful to Torvalds for little other than playing Prince of Persia. Linus did use Minix, a low-cost Unix created as a teaching aid, but he was discouraged by the inability to easily make and distribute changes to the system's source code (because of Minix's license) and by design decisions made by Minix's author.

In response to his predicament, Linus did what any normal, sane, college student would do: He decided to write his own operating system. Linus began by writing a simple terminal emulator, which he used to connect to larger Unix systems at his school. His terminal emulator evolved and improved. Before long, Linus had an immature but full-fledged Unix on his hands. He posted an early release to the Internet in late 1991.

For reasons that will be studied through all of time, use of Linux took off. Quickly, Linux gained many users. More important to its success, however, Linux quickly attracted many developers—adding, changing, improving code. Because of its license terms, Linux quickly became a collaborative project developed by many.

Fast forward to the present. Today, Linux is a full-fledged operating system also running on AMD x86-64, ARM, Compaq Alpha, CRIS, DEC VAX, H8/300, Hitachi SuperH, HP PA-RISC, IBM S/390, Intel IA-64, MIPS, Motorola 68000, PowerPC, SPARC, UltraSPARC, and v850. It runs on systems as small as a watch to machines as large as room-filling super-computer clusters. Today, commercial interest in Linux is strong. Both new Linux-specific corporations, such as MontaVista and Red Hat, as well as existing powerhouses, such as IBM and Novell, are providing Linux-based solutions for embedded, desktop, and server needs.

Linux is a Unix clone, but it is not Unix. That is, although Linux borrows many ideas from Unix and implements the Unix API (as defined by POSIX and the Single Unix Specification) it is not a direct descendant of the Unix source code like other Unix systems. Where desired, it has deviated from the path taken by other implementations, but it has not compromised the general design goals of Unix or broken the application interfaces.

One of Linux's most interesting features is that it is not a commercial product; instead, it is a collaborative project developed over the Internet. Although Linus remains the creator of Linux and the maintainer of the kernel, progress continues through a loose-knit group of developers. In fact, anyone can contribute to Linux. The Linux kernel, as with much of the system, is free or open source software^[3]. Specifically, the Linux kernel is licensed under the GNU General Public License (GPL) version 2.0. Consequently, you are free to download the source code and make any modifications you want. The only caveat is that if you distribute your changes, you must continue to provide the recipients with the same rights you enjoyed, including the availability of the source code^[4].

Linux is many things to many people. The basics of a Linux system are the kernel, C library, compiler, toolchain, and basic system utilities, such as a login process and shell. A Linux system can also include a modern X Window System implementation including a full-featured desktop environment, such as GNOME. Thousands of free and commercial applications exist for Linux. In this book, when I say Linux I typically mean the Linux kernel. Where it is ambiguous, I try explicitly to point out whether I am referring to Linux as a full system or just the kernel proper. Strictly speaking, after all, the term Linux refers to only the kernel.

Overview of Operating Systems and Kernels

Because of the ever-growing feature set and ill design of some modern commercial operating systems, the notion of what precisely defines an operating system is vague. Many users consider whatever they see on the screen to be the operating system. Technically speaking, and in this book, the operating system is considered the parts of the system responsible for basic use and administration. This includes the kernel and device drivers, boot loader, command shell or other user interface, and basic file and system utilities. It is the stuff you need—not a web browser or music players. The term system, in turn, refers to the operating system and all the applications running on top of it.

Of course, the topic of this book is the kernel. Whereas the user interface is the outermost portion of the operating system, the kernel is the innermost. It is the core internals; the software that provides basic services for all other parts of the system, manages hardware, and distributes system resources. The kernel is sometimes referred to as the supervisor, core, or internals of the operating system. Typical components of a kernel are interrupt handlers to service interrupt requests, a scheduler to share processor time among multiple processes, a memory management system to manage process address spaces, and system services such as networking and interprocess communication. On modern systems with protected memory management units, the kernel typically resides in an elevated system state compared to normal user applications. This includes a protected memory space and full access to the hardware. This system state and memory space is collectively referred to as kernel-space. Conversely, user applications execute in user-space. They see a subset of the machine's available resources and are unable to perform certain system functions, directly access hardware, or otherwise misbehave (without consequences, such as their death, anyhow). When executing the kernel, the system is in kernel-space executing in kernel mode, as opposed to normal user execution in user-space executing in user mode. Applications running on the system communicate with the kernel via system calls (see Figure 1.1). An application typically calls functions in a library—for example, the C library—that in turn rely on the system call interface to instruct the kernel to carry out tasks on their behalf. Some library calls provide many features not found in the system call, and thus, calling into the kernel is just one step in an otherwise large function. For example, consider the familiar printf() function. It provides formatting and buffering of the data and only eventually calls write() to write the data to the console. Conversely, some library calls have a one-to-one relationship with the kernel. For example, the open() library function does nothing except call the open() system call. Still other C library functions, such as strcpy(), should (you hope) make no use of the kernel at all. When an application executes a system call, it is said that the kernel is executing on behalf of the application. Furthermore, the application is said to be executing a system call in kernel-space, and the kernel is running in process context. This relationship—that applications call into the kernel via the system call interface—is the fundamental manner in which applications get work done.

Figure 1.1. Relationship between applications, the kernel, and hardware.

The kernel also manages the system's hardware. Nearly all architectures, including all systems that Linux supports, provide the concept of interrupts. When hardware wants to communicate with the system, it issues an interrupt that asynchronously interrupts the kernel. Interrupts are identified by a number. The kernel uses the number to execute a specific interrupt handler to process and respond to the interrupt. For example, as you type, the keyboard controller issues an interrupt to let the system know that there is new data in the keyboard buffer. The kernel notes the interrupt number being issued and executes the correct interrupt handler. The interrupt handler processes the keyboard data and lets the keyboard controller know it is ready for more data. To provide synchronization, the kernel can usually disable interrupts—either all interrupts or just one specific interrupt number. In many operating systems, including Linux, the interrupt handlers do not run in a process context. Instead, they run in a special interrupt context that is not associated with any process. This special context exists solely to let an interrupt handler quickly respond to an interrupt, and then exit.

These contexts represent the breadth of the kernel's activities. In fact, in Linux, we can generalize that each processor is doing one of three things at any given moment:

In kernel-space, in process context, executing on behalf of a specific process
In kernel-space, in interrupt context, not associated with a process, handling an interrupt
In user-space, executing user code in a process

This list is inclusive. Even corner cases fit into one of these three activities: For example, when idle, it turns out that the kernel is executing an idle process in process context in the kernel.

Linux Versus Classic Unix Kernels

Owing to their common ancestry and same API, modern Unix kernels share various design traits. With few exceptions, a Unix kernel is typically a monolithic static binary. That is, it exists as a large single-executable image that runs in a single address space. Unix systems typically require a system with a paged memory-management unit; this hardware enables the system to enforce memory protection and to provide a unique virtual address space to each process.

See the bibliography for my favorite books on the design of the classic Unix kernels.

Monolithic Kernel Versus Microkernel Designs

Operating kernels can be divided into two main design camps: the monolithic kernel and the microkernel. (A third camp, exokernel, is found primarily in research systems but is gaining ground in real-world use.)

Monolithic kernels involve the simpler design of the two, and all kernels were designed in this manner until the 1980s. Monolithic kernels are implemented entirely as single large processes running entirely in a single address space. Consequently, such kernels typically exist on disk as single static binaries. All kernel services exist and execute in the large kernel address space. Communication within the kernel is trivial because everything runs in kernel mode in the same address space: The kernel can invoke functions directly, as a user-space application might. Proponents of this model cite the simplicity and performance of the monolithic approach. Most Unix systems are monolithic in design.

Microkernels, on the other hand, are not implemented as single large processes. Instead, the functionality of the kernel is broken down into separate processes, usually called servers. Idealistically, only the servers absolutely requiring such capabilities run in a privileged execution mode. The rest of the servers run in user-space. All the servers, though, are kept separate and run in different address spaces. Therefore, direct function invocation as in monolithic kernels is not possible. Instead, communication in microkernels is handled via message passing: An interprocess communication (IPC) mechanism is built into the system, and the various servers communicate and invoke “services” from each other by sending messages over the IPC mechanism. The separation of the various servers prevents a failure in one server from bringing down another.

Likewise, the modularity of the system allows one server to be swapped out for another. Because the IPC mechanism involves quite a bit more overhead than a trivial function call, however, and because a context switch from kernel-space to user-space or vice versa may be involved, message passing includes a latency and throughput hit not seen on monolithic kernels with simple function invocation. Consequently, all practical microkernel-based systems now place most or all the servers in kernel-space, to remove the overhead of frequent context switches and potentially allow for direct function invocation. The Windows NT kernel and Mach (on which part of Mac OS X is based) are examples of microkernels. Neither Windows NT nor Mac OS X run any microkernel servers in user-space in their latest versions, defeating the primary purpose of microkernel designs altogether.

Linux is a monolithic kernel—that is, the Linux kernel executes in a single address space entirely in kernel mode. Linux, however, borrows much of the good from microkernels: Linux boasts a modular design with kernel preemption, support for kernel threads, and the capability to dynamically load separate binaries (kernel modules) into the kernel. Conversely, Linux has none of the performance-sapping features that curse microkernel designs: Everything runs in kernel mode, with direct function invocation—not message passing—the method of communication. Yet Linux is modular, threaded, and the kernel itself is schedulable. Pragmatism wins again.

As Linus and other kernel developers contribute to the Linux kernel, they decide how best to advance Linux without neglecting its Unix roots (and more importantly, the Unix API). Consequently, because Linux is not based on any specific Unix, Linus and company are able to pick and choose the best solution to any given problem—or at times, invent new solutions! Here is an analysis of characteristics that differ between the Linux kernel and other Unix variants:

Linux supports the dynamic loading of kernel modules. Although the Linux kernel is monolithic, it is capable of dynamically loading and unloading kernel code on demand.
Linux has symmetrical multiprocessor (SMP) support. Although many commercial variants of Unix now support SMP, most traditional Unix implementations did not.
The Linux kernel is preemptive. Unlike traditional Unix variants, the Linux kernel is capable of preempting a task even if it is running in the kernel. Of the other commercial Unix implementations, Solaris and IRIX have preemptive kernels, but most traditional Unix kernels are not preemptive.
Linux takes an interesting approach to thread support: It does not differentiate between threads and normal processes. To the kernel, all processes are the same—some just happen to share resources.
Linux provides an object-oriented device model with device classes, hotpluggable events, and a user-space device filesystem (sysfs).
Linux ignores some common Unix features that are thought to be poorly designed, such as STREAMS, or standards that are brain dead.
Linux is free in every sense of the word. The feature set Linux implements is the result of the freedom of Linux's open development model. If a feature is without merit or poorly thought out, Linux developers are under no obligation to implement it. To the contrary, Linux has adopted an elitist attitude toward changes: Modifications must solve a specific real-world problem, have a sane design, and have a clean implementation. Consequently, features of some other modern Unix variants, such as pageable kernel memory, have received no consideration.

Despite any differences, Linux remains an operating system with a strong Unix heritage.

Linux Kernel Versions

Linux kernels come in two flavors: stable or development. Stable kernels are production-level releases suitable for widespread deployment. New stable kernel versions are released typically only to provide bug fixes or new drivers. Development kernels, on the other hand, undergo rapid change where (almost) anything goes. As developers experiment with new solutions, often-drastic changes to the kernel are made.

Linux kernels distinguish between stable and development kernels with a simple naming scheme (see Figure 1.2). Three numbers, each separated by a dot, represent Linux kernels. The first value is the major release, the second is the minor release, and the third is the revision. The minor release also determines whether the kernel is a stable or development kernel; an even number is stable, whereas an odd number is development. Thus, for example, the kernel version 2.6.0 designates a stable kernel. This kernel has a major version of two, has a minor version of six, and is revision zero. The first two values also describe the “kernel series”—in this case, the 2.6 kernel series.

Figure 1.2. Kernel version naming convention.

Development kernels have a series of phases. Initially, the kernel developers work on new features and chaos ensues. Over time, the kernel matures and eventually a feature freeze is declared. At that point, no new features can be submitted. Work on existing features, however, can continue. After the kernel is considered nearly stabilized, a code freeze is put into effect. When that occurs, only bug fixes are accepted. Shortly thereafter (one hopes), the kernel is released as the first version of a new stable series. For example, the development series 1.3 stabilized into 2.0 and 2.5 stabilized into 2.6.

Everything I just told you is a lie

Well, not exactly. Technically speaking, the previous description of the kernel development process is true. Indeed, historically the process has proceeded exactly as described. In the summer of 2004, however, at the annual invite-only Linux Kernel Developers Summit, a decision was made to prolong the development of the 2.6 kernel without introducing a 2.7 development series in the near future. The decision was made because the 2.6 kernel is well received, it is generally stable, and no large intrusive features are on the horizon. Additionally, perhaps most importantly, the current 2.6 maintainer system that exists between Linus Torvalds and Andrew Morton is working out exceedingly well. The kernel developers believe that this process can continue in such a way that the 2.6 kernel series both remains stable and receives new features. Only time will tell, but so far, the results look good.

This book is based on the 2.6 stable kernel series.

The Linux Kernel Development Community

When you begin developing code for the Linux kernel, you become a part of the global kernel development community. The main forum for this community is the Linux kernel mailing list. Subscription information is available at http://vger.kernel.org. Note that this is a high-traffic list with upwards of 300 messages a day and that the other readers—which include all the core kernel developers, including Linus—are not open to dealing with nonsense. The list is, however, a priceless aid during development because it is where you will find testers, receive peer review, and ask questions.

Later chapters provide an overview of the kernel development process and a more complete description of participating successfully in the kernel development community.

Before We Begin

This book is about the Linux kernel: how it works, why it works, and why you should care. It covers the design and implementation of the core kernel subsystems as well as the interfaces and programming semantics. The book is practical, and takes a middle road between theory and practice when explaining how all this stuff works. This approach—coupled with some personal anecdotes and tips on kernel hacking—should ensure that this book gets you off the ground running.

I hope you have access to a Linux system and have the kernel source. Ideally, by this point, you are a Linux user and have been poking and prodding at the source, but require some help making it all come together. Conversely, you might never have used Linux but just want to learn the design of the kernel out of curiosity. However, if your desire is to write some code of your own, there is no substitute for the source. The source code is freely available; use it!

Oh, and above all else, have fun!

^[1]What about System IV? The rumor is it was an internal development version.

^[2]Well, okay, not everything—but much is represented as a file. Modern operating systems, such as Unix's successor at Bell Labs, Plan9, implement nearly everything as a file.

^[3]I will leave the free versus open debate to you. See http://www.fsf.org and http://www.opensource.org.

^[4]You should probably read the GNU GPL version 2.0 if you have not. There is a copy in the file COPYING in your kernel source tree. You can also find it online at http://www.fsf.org.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 1. Introduction to the Linux Kernel

Create new playlist

Sign In

Sign Up