Linux started as a hobby project in 1991 by Linus Torvalds, then a Finnish student. The project has grown steadily and continues to do so, with roughly a thousand contributors around the world. Nowadays, Linux is ubiquitous, in embedded systems as well as on servers. A kernel is the central part of an operating system, and its development is far from straightforward. Linux offers many advantages over other operating systems: it is free of charge, well documented with a large community, portable across different platforms, provides access to its source code, and comes with a huge amount of free, open source software.
This book tries to be as generic as possible. There is, however, one special topic, the device tree, which is not yet a full-fledged x86 feature; its coverage will therefore be dedicated to ARM processors, specifically those that fully support the device tree. Why these architectures? Because they are the most widely used on desktops and servers (x86) and in embedded systems (ARM).
In this chapter, we will cover the following topics:
When you're working in embedded system fields, there are terms you must be familiar with, before even setting up your environment. They are as follows:
Because embedded computers have limited resources (CPU, RAM, disk, and so on), it is common for the host to be an x86 machine, which is much more powerful and has far more resources, speeding up the development process. However, over the past few years, embedded computers have become more powerful, and they are increasingly used for native compilation (thus serving as the host). A typical example is the Raspberry Pi 4, which has a powerful quad-core CPU and up to 8 GB of RAM.
In this chapter, we will be using an x86 machine as the host, either for native builds or for cross-compilation. So, any "native build" reference will mean an "x86 native build." As for the distribution, I'm running Ubuntu 18.04.
To quickly check this information, you can use the following command:
lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic
My computer is an ASUS RoG with a 16-core AMD Ryzen CPU (you can use the lscpu command to pull this information out), 16 GB of RAM, a 256 GB SSD, and a 1 TB magnetic hard drive (information you can obtain using the df -h command). That said, a quad-core CPU and 4 or 8 GB of RAM can be enough, at the cost of an increased build duration. My favorite editor is Vim, but you are free to use the one you are most comfortable with. If you are using a desktop machine, you could use Visual Studio Code (VS Code), which is becoming widely used.
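As a quick sanity check of your own host's resources, the commands mentioned above can be combined as follows (a minimal sketch relying only on coreutils and procfs; lscpu and free give richer output):

```shell
# Inspect the host's build resources before compiling:
nproc                         # number of logical CPUs
grep MemTotal /proc/meminfo   # installed RAM
df -h /                       # free space on the root filesystem
```

The output will of course differ from machine to machine.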
Now that we are familiar with the compilation-related keywords we will be using, we can start preparing the host machine.
Before you can start the development process, you need to set up an environment. The environment that's dedicated to Linux development is quite simple – on Debian-based systems, at least (which is our case).
On the host machine, you need to install a few packages, as follows:
$ sudo apt update
$ sudo apt install gawk wget git diffstat unzip
texinfo gcc-multilib build-essential chrpath socat
libsdl1.2-dev xterm ncurses-dev lzop libelf-dev make
In the preceding code, we installed a few development tools and some mandatory libraries so that we have a nice user interface when we're configuring the Linux kernel.
Now, we need to install the compiler and the tools (linker, assembler, and so on) that the build process needs to produce executables for the target. This set of tools is called Binutils, and the combination of the compiler and Binutils (plus other build-time dependency libraries, if any) is called a toolchain. So, when you read "I need a toolchain for <this> architecture" or a similar sentence, you now know what it means.
Before we can start compiling, we need to install the necessary packages and tools for native or ARM cross-compiling; that is, the toolchains. GCC is the compiler that's supported by the Linux kernel. A lot of macros that are defined in the kernel are GCC-related. Due to this, we will use GCC as our (cross-)compiler.
For a native compilation, you can use the following toolchain installation command:
sudo apt install gcc binutils
When you need to cross-compile, you must identify and install the right toolchain. Compared to a native compiler, cross-compiler executables are prefixed by the name of the target operating system, architecture, and (sometimes) library. Thus, to identify architecture-specific toolchains, a naming convention has been defined: arch[-vendor][-os]-abi. Let's look at what the fields in the pattern mean:
The following are some toolchain names to illustrate the use of the pattern:
Now that we are familiar with toolchain naming conventions, we can determine which toolchain can be used to cross-compile for our target architecture.
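The arch[-vendor][-os]-abi pattern can be illustrated with plain shell string handling (a rough sketch, not an official tool; the triplet below is one of the examples from this chapter):

```shell
# Decompose a toolchain triplet following the arch[-vendor][-os]-abi pattern.
triplet=arm-linux-gnueabihf
arch=${triplet%%-*}    # strip everything after the first dash -> "arm"
abi=${triplet##*-}     # strip everything up to the last dash  -> "gnueabihf"
echo "arch=${arch} os/abi=${abi}"
```

Here, arm is the architecture, linux the target OS, and gnueabihf the ABI (GNU EABI, hardware float).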
To cross-compile for a 32-bit ARM machine, we would install the toolchain using the following command:
$ sudo apt install gcc-arm-linux-gnueabihf binutils-arm-linux-gnueabihf
Note that 64-bit ARM support is called arm64 in the Linux tree, while GCC and Binutils name the architecture aarch64. So, the cross-compiler package is called something like gcc-aarch64-linux-gnu*, while the Binutils package is called something like binutils-aarch64-linux-gnu*. Thus, for a 64-bit ARM toolchain, we would use the following command:
$ sudo apt install make gcc-aarch64-linux-gnu binutils-aarch64-linux-gnu
Note
Note that the AArch64 architecture mandates hardware floating point, so only hardware-float aarch64 toolchains exist. Thus, there is no need for an hf suffix at the end.
Note that not all versions of the compiler can compile a given Linux kernel version. Thus, it is important to take care of both the Linux kernel version and the compiler (GCC) version. While the previous commands installed the latest version that's supported by your distribution, it is possible to target a particular version. To achieve this, you can use gcc-<version>-<arch>-linux-gnu*.
For example, to install version 8 of GCC for aarch64, you can use the following command:
sudo apt install gcc-8-aarch64-linux-gnu
Now that our toolchain has been installed, we can look at the version that was picked by our distribution package manager. For example, to check which version of the aarch64 cross-compiler was installed, we can use the following command:
$ aarch64-linux-gnu-gcc --version
aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
[...]
For the 32-bit ARM variant, we can use the following command:
$ arm-linux-gnueabihf-gcc --version
arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
[...]
Finally, for the native version, we can use the following command:
$ gcc --version
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
Now that we have set up our environment and made sure we are using the right tool versions, we can start downloading the Linux kernel sources and dig into them.
In the early kernel days (until 2003), an even-odd versioning scheme was used, where even minor numbers denoted stable releases and odd minor numbers denoted development (unstable) releases. When version 2.6 was released, the versioning scheme switched to X.Y.Z. Let's look at this in more detail:
This scheme resembles semantic versioning and was used until version 2.6.39, when Linus Torvalds decided to bump the version to 3.0, which marked the end of this versioning style in 2011. At that point, an X.Y scheme was adopted.
When the version that would have been 3.20 approached, Linus argued that he could no longer increase Y. Therefore, he decided to switch to an arbitrary versioning scheme, incrementing X whenever Y got so big that he ran out of fingers and toes to count it. This is why the version moved directly from 3.19 to 4.0.
Now, the kernel uses an arbitrary X.Y versioning scheme, which has nothing to do with semantic versioning.
According to the Linux kernel release model, there are always two latest releases of the kernel out there: the stable release and the long-term support (LTS) release. All bug fixes and new features are collected and prepared by subsystem maintainers and then submitted to Linus Torvalds for inclusion into his Linux tree, which is called the mainline Linux tree, also known as the master Git repository. This is where every stable release originates from.
Before each new kernel version is released, it is submitted to the community through release candidate tags so that developers can test and polish the new features. Based on the feedback he receives during this cycle, Linus decides when the final version is ready. When he is convinced that the new kernel is ready, he makes the final release. We call this release "stable" to indicate that it is no longer a "release candidate"; such releases are tagged vX.Y.
There is no strict timeline for making releases, but new mainline kernels are generally released every 2-3 months. Stable kernel releases are based on Linus releases; that is, the mainline tree releases.
Once a stable kernel is released by Linus, it also appears in the linux-stable tree (available at https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/), where it becomes a branch. Here, it can receive bug fixes. This tree is called a stable tree because it is used to track previously released stable kernels. It is maintained and curated by Greg Kroah-Hartman. However, all fixes must go into Linus's tree first, which is the mainline repository. Once the bug has been fixed in the mainline repository, it can be applied to previously released kernels that are still maintained by the kernel development community. All the fixes that have been backported to stable releases must meet a set of important criteria before they are considered – one of them is that they "must already exist in Linus's tree."
Note
Bugfix kernel releases are considered stable.
For example, when the 4.9 kernel is released by Linus, the stable kernel is released based on the kernel's numbering scheme; that is, 4.9.1, 4.9.2, 4.9.3, and so on. Such releases are called bugfix kernel releases, and the sequence is usually shortened with the number "4.9.y" when referring to their branch in the stable kernel release tree. Each stable kernel release tree is maintained by a single kernel developer, who is responsible for picking the necessary patches for the release and going through the review/release process. Usually, there are only a few bugfix kernel releases until the next mainline kernel becomes available – unless it is designated as a long-term maintenance kernel.
Every subsystem and kernel maintainer repository is hosted here: https://git.kernel.org/pub/scm/linux/kernel/git/. Here, we can also find either a Linus or a stable tree. In the Linus tree (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/), there is only one branch; that is, the master branch. Its tags are either stable releases or release candidates. In the stable tree (https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/), there is one branch per stable kernel release (named <A.B>.y, where <A.B> is the release version in the Linus tree) and each branch contains its bugfix kernel releases.
In this book, we will be using Linus's tree, which can be downloaded using the following commands:
git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git --depth 1
cd linux
git fetch --depth 1 origin tag v5.10
git checkout v5.10
ls
In the preceding commands, we used --depth 1 to avoid downloading the whole history (only the most recent commit is fetched), which considerably reduces the download size and saves time. Since Git supports branching and tagging, the checkout command allows you to switch to a specific tag or branch. In this example, we switch to the v5.10 tag.
Note
In this book, we will be dealing with Linux kernel v5.10.
Let's look at the content of the main source directory:
To enforce portability, any architecture-specific code lives in the arch directory. Moreover, the kernel code related to the user space API (system calls, /proc, /sys, and so on) must never change in an incompatible way, as that would break existing programs.
In this section, we have familiarized ourselves with the Linux kernel's source content. After going through all the sources, it seems quite natural to configure them to be able to compile a kernel. In the next section, we will learn how kernel configuration works.
There are numerous drivers/features and build options available in the Linux kernel sources. The configuration process consists of choosing what features/drivers are going to be part of the compilation process. Depending on whether we are going to perform native compilation or cross-compilation, there are environment variables that must be defined, even before the configuration process takes place.
The compiler that's invoked by the kernel's Makefile is $(CROSS_COMPILE)gcc. In other words, CROSS_COMPILE is the prefix of the cross-compilation tools (gcc, as, ld, objcopy, and so on) and must either be specified when you invoke make or be exported before any make command is executed. Only gcc and its related Binutils executables are prefixed with $(CROSS_COMPILE).
Note that various assumptions are made and options/features/flags are enabled by the Linux kernel build infrastructure based on the target architecture. To achieve that, in addition to the cross-compiler prefix, the architecture of the target must be specified as well. This can be done through the ARCH environment variable.
Thus, a typical Linux configuration or build command would look as follows:
ARCH=<XXXX> CROSS_COMPILE=<YYYY> make menuconfig
It can also look as follows:
ARCH=<XXXX> CROSS_COMPILE=<YYYY> make <make-target>
If you don't wish to specify these environment variables when you launch a command, you can export them into your current shell. The following is an example:
export CROSS_COMPILE=aarch64-linux-gnu-
export ARCH=arm64
Remember that if these variables are not specified, the native host machine is going to be targeted; that is, if CROSS_COMPILE is omitted or not set, $(CROSS_COMPILE)gcc will result in gcc, and it will be the same for other tools that will be invoked (for example, $(CROSS_COMPILE)ld will result in ld).
In the same manner, if ARCH (the target architecture) is omitted or not set, it defaults to the architecture of the host where make is executed, that is, the equivalent of $(uname -m).
As a result, you should leave CROSS_COMPILE and ARCH undefined to have the kernel natively compiled for the host architecture using gcc.
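The expansion performed by the kernel's Makefile can be mimicked in the shell to sanity-check a prefix before launching a long build (the aarch64 prefix below is just an example):

```shell
# $(CROSS_COMPILE)gcc simply concatenates the prefix and the tool name.
CROSS_COMPILE=aarch64-linux-gnu-
echo "${CROSS_COMPILE}gcc"   # the compiler the kernel build will invoke
echo "${CROSS_COMPILE}ld"    # the linker
CROSS_COMPILE=               # empty prefix: native build
echo "${CROSS_COMPILE}gcc"   # plain "gcc"
```

The same concatenation rule applies to every Binutils executable the build invokes.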
The Linux kernel is a Makefile-based project that contains thousands of options and drivers. Each option that's enabled can make another one available or can pull specific code into the build. To configure the kernel, you can use make menuconfig for a ncurses-based interface or make xconfig for an X-based interface. The ncurses-based interface looks as follows:
For most options, you have three choices. However, we can enumerate five types of options while configuring the Linux kernel:
The selected options will be stored in a .config file, at the root of the source tree.
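Since .config is a plain text file, its entries can be inspected with standard tools. The following sketch creates a two-line sample .config in a temporary directory purely for illustration; in practice, you would run the grep at the root of a configured kernel tree:

```shell
# Enabled options appear as CONFIG_FOO=y (built-in) or CONFIG_FOO=m
# (module); disabled ones appear as "# CONFIG_FOO is not set".
tmp=$(mktemp -d)
cat > "${tmp}/.config" <<'EOF'
CONFIG_SOUND=y
# CONFIG_SOUNDWIRE is not set
EOF
grep -E '^(# )?CONFIG_SOUND' "${tmp}/.config"
rm -rf "${tmp}"
```

This prints both the enabled and the disabled form, which is handy for checking what a configuration step actually changed.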
It is very difficult to know which configuration is going to work on your platform. In most cases, there will be no need to start a configuration from scratch. There are default and functional configuration files available in each arch directory that you can use as a starting point (it is important to start with a configuration that already works):
ls arch/<your_arch>/configs/
For 32-bit ARM-based CPUs, these config files can be found in arch/arm/configs/. In this architecture, there is usually one default configuration per CPU family. For instance, for i.MX6-7 processors, the default config file is arch/arm/configs/imx_v6_v7_defconfig. However, on ARM 64-bit CPUs, there is only one big default configuration to customize; it is located in arch/arm64/configs/ and is called defconfig. Similarly, for x86 processors, we can find the files in arch/x86/configs/. There will be two default configuration files here – i386_defconfig and x86_64_defconfig, for 32- and 64-bit x86 architectures, respectively.
The kernel configuration command, given a default configuration file, is as follows:
make <foo_defconfig>
This will generate a new .config file in the main (root) directory, while the old .config will be renamed .config.old. This is useful for reverting to the previous configuration. Then, to customize the configuration, you can use the following command:
make menuconfig
Saving your changes will update your .config file. While you could share this config with your teammates, you are better off creating a default configuration file in the same minimal format as those shipped with the Linux kernel sources. To do that, you can use the following command:
make savedefconfig
This command creates a minimal configuration file (only the options whose values differ from the defaults are stored). The generated default configuration file is called defconfig and is stored at the root of the source tree. You can store it in another location using the following command:
mv defconfig arch/<arch>/configs/myown_defconfig
This way, you can share a reference configuration inside the kernel sources and other developers can now get the same .config file as you by running the following command:
make myown_defconfig
Note
Note that, for cross-compilation, ARCH and CROSS_COMPILE must be set before you execute any make command, even for kernel configuration. Otherwise, you'll have unexpected changes in your configuration.
The following are the various configuration commands you can use, depending on the target system:
make x86_64_defconfig
make menuconfig
ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make imx_v6_v7_defconfig
ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make menuconfig
With the first command, you store the default options in the .config file, while with the second, you can update (add/remove) options, depending on your needs.
ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make defconfig
ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make menuconfig
You may run into a Qt4 error with xconfig. In such a case, you should just use the following command to install the missing packages:
sudo apt install qt4-dev-tools qt4-qmake
Note
You may be switching from an old kernel to a new one. Given the old .config file, you can copy it into the new kernel source tree and run make oldconfig. If there are new options in the new kernel, you'll be prompted to include them or not. However, you may want to use the default values for those options. In this case, you should run make olddefconfig. Finally, to say no to every new option, you should run make oldnoconfig.
There may be a better option to find an initial configuration file, especially if your machine is already running. Debian and Ubuntu Linux distributions save the .config file in the /boot directory, so you can use the following command to copy this configuration file:
cp /boot/config-`uname -r` .config
Other distributions may not do this, so I recommend always enabling the IKCONFIG and IKCONFIG_PROC kernel configuration options, which expose the running kernel's configuration through /proc/config.gz. This is a standard method that also works with embedded distributions.
Now that we can configure the kernel, let's enumerate some useful configuration features that may be worth enabling in your kernel:
# zcat /proc/config.gz | grep CONFIG_SOUND
CONFIG_SOUND=y
CONFIG_SOUND_OSS_CORE=y
CONFIG_SOUND_OSS_CORE_PRECLAIM=y
# CONFIG_SOUNDWIRE is not set
[...]
Now that the kernel has been configured, it must be built to generate a runnable kernel. In the next section, we will describe the kernel building process, as well as the expected build artifacts.
This step requires you to be in the same shell where you were during the configuration step; otherwise, you'll have to redefine the ARCH and CROSS_COMPILE environment variables.
Linux is a Makefile-based project. Building such a project requires using the make tool and executing the make command. Regarding the Linux kernel, this command must be executed from the main kernel source directory, as a normal user.
By default, if not specified, the make target is all. In the Linux kernel sources, for x86 architectures, this target points to (or depends on) the vmlinux, bzImage, and modules targets; for 32-bit ARM or aarch64 architectures, it corresponds to the vmlinux, zImage (Image on aarch64), modules, and dtbs targets.
In these targets, bzImage is an x86-specific make target that produces a compressed binary of the same name. vmlinux is the make target that produces the raw, uncompressed kernel ELF image, vmlinux. zImage is the 32-bit ARM compressed kernel image (aarch64 builds an uncompressed Image instead), while dtbs builds the device tree blobs for the target CPU variant. modules is the make target that builds all the selected modules (those marked with m in the configuration).
While building, you can leverage the host's CPU performance by running multiple jobs in parallel thanks to the -j make options. The following is an example:
make -j16
Most people define their -j number as 1.5x the number of cores. In my case, I always use ncpus * 2.
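Rather than hard-coding the number, the job count can be derived from the host itself (nproc reports the number of logical CPUs); this sketch follows the ncpus * 2 rule mentioned above:

```shell
# Compute a parallel job count from the number of logical CPUs.
jobs=$(( $(nproc) * 2 ))
echo "will build with -j${jobs}"
# make -j"${jobs}"    # uncomment when run inside a configured kernel tree
```

This way, the same build script adapts to any host it runs on.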
You can build the Linux kernel like so:
make -j16
ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make -j16
Each make target can be invoked separately, like so:
ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make dtbs
You can also do the following:
ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make zImage -j16
Finally, you can also do the following:
make bzImage -j16
Note
I have used -j16 in my commands to match my host's CPU. This number of jobs must be adapted according to your host configuration.
At the end of your 32-bit ARM cross-compilation jobs, you will see something like the following:
[…]
LZO arch/arm/boot/compressed/piggy_data
CC arch/arm/boot/compressed/misc.o
CC arch/arm/boot/compressed/decompress.o
CC arch/arm/boot/compressed/string.o
SHIPPED arch/arm/boot/compressed/hyp-stub.S
SHIPPED arch/arm/boot/compressed/lib1funcs.S
SHIPPED arch/arm/boot/compressed/ashldi3.S
SHIPPED arch/arm/boot/compressed/bswapsdi2.S
AS arch/arm/boot/compressed/hyp-stub.o
AS arch/arm/boot/compressed/lib1funcs.o
AS arch/arm/boot/compressed/ashldi3.o
AS arch/arm/boot/compressed/bswapsdi2.o
AS arch/arm/boot/compressed/piggy.o
LD arch/arm/boot/compressed/vmlinux
OBJCOPY arch/arm/boot/zImage
Kernel: arch/arm/boot/zImage is ready
By using the default targets, various binaries will result from the build process, depending on the architecture. These are as follows:
This is bzImage (which means "big zImage") for x86 and zImage for 32-bit ARM (aarch64 produces an uncompressed Image), and it varies for other architectures.
Now that we know how to (cross-)compile the Linux kernel, let's learn how to install it.
The Linux kernel installation process differs depending on whether you compiled natively or cross-compiled:
Now that we are familiar with the kernel configuration, including the build and installation processes, let's look at kernel modules, which allow you to extend the kernel at runtime.
Modules can be built separately using the modules target. You can install them using the modules_install target. Modules are built in the same directory as their corresponding source. Thus, the resulting kernel objects are spread over the kernel source tree:
make modules
sudo make modules_install
The resulting modules will be installed in /lib/modules/$(uname -r)/kernel/, in the same directory structure as their corresponding source. A custom install path can be specified using the INSTALL_MOD_PATH environment variable.
ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make modules
ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- INSTALL_MOD_PATH=<dir> make modules_install
In addition to the kernel/ directory that holds the modules, the following files are also installed in /lib/modules/<version>:
With that, the modules have been installed. We have now learned how to configure the Linux kernel, enable the features we need, and build and install both the kernel and its modules.
In this chapter, you learned how to download the Linux kernel source and perform your first build. We also covered some common tasks, such as configuring the kernel and selecting the appropriate toolchain. That said, this chapter was quite brief and only an introduction. In the next chapter, we will dig deeper into the kernel build process, learn how to compile a driver (either externally or as part of the kernel), and cover some basics you should master before starting the long journey that kernel development represents. Let's take a look!