Quick, think: how often have you interspersed printf() instances (or the equivalent) in your program in order to follow its progress as it executes code, and indeed, to see at approximately which point it (perhaps) crashes? Often, I'm guessing! Don't feel bad at all, this is a really good debugging technique! It has a fancy name to boot: instrumentation.
What you've been doing is instrumenting your code, allowing you to see the flow (depending on the granularity of your print statements); this allows you to understand where it's been. Often enough, this is all that's required to debug many situations. Do recollect, though, what we discussed in the previous chapter – a technique like instrumentation is typically useful in certain circumstances, not all. For example, a resource leak (such as a memory leak) defect is difficult, if not impossible, to debug with instrumentation. For most other situations though, it's a really useful technique!
In this chapter, we're going to understand how to instrument kernel (or driver) code, primarily using the powerful printk() – and friends – APIs. Further, we shall continue along this path in the following chapter as well, focusing our efforts on another kernel technology than can be used for instrumentation on production systems – kprobes.
In this chapter, we will focus on covering the following main topics:
These very practical topics are important: knowing how to efficiently debug via instrumentation can result in a quick cure for annoying bugs!
The technical requirements and workspace remain identical to what's described in Chapter 1, A General Introduction to Debugging Software. The code examples can be found within the book's GitHub repository here: https://github.com/PacktPublishing/Linux-Kernel-Debugging.
There's a good reason the famous and familiar Kernighan and Ritchie (K&R) Hello, world C program employs the printf() API: it's the preferred API via which any output is written to the screen (well, technically, to the standard output channel stdout of the calling process). After all, it's how we can actually see that our program is really doing something, right?
You will surely recall using this API when writing your very first C program. Did you write the code that incorporates the printf() function? No, of course not; then where is it? You know: it's part of the (typically rather large) standard C library – GNU libc (glibc) on Linux. Pretty much every binary executable program on a Linux box automatically and dynamically links into this library; thus printf() is pretty much always available! (On x86, doing ldd $(which ps) will have the useful ldd script show you the libraries that the ps app links into; one of them will be the standard C library libc.so.* – try it.)
Except printf() isn't available within the kernel! Why? This itself is a key point: the Linux kernel does not use libraries – dynamically or statically – in the way userspace applications do. There are what could perhaps pass as the equivalent: the lib/ branch of the kernel source tree (peek at it here if you wish: https://github.com/torvalds/linux/tree/master/lib) contains many useful APIs that get built into the kernel image itself. Also, the kernel's framework for writing modules – the Loadable Kernel Module (LKM) – has facilities that kind of mimic the user mode library: the module stacking approach and the ability to link together several source files into a single kernel module object (.ko) file.
Important Note
These facilities – writing kernel modules with the LKM framework, the module stacking approach, the usage of the printk() API, and so on, are covered in detail in my earlier book Linux Kernel Programming.
So, how is the kernel or driver developer expected to emit a message that can be seen and, even better, logged? Via the ubiquitous printk() API, that's how! We say this because the printk() (and friends) APIs can be used anywhere – within interrupt handlers (all sorts – hardirq/softirq/tasklets), process context, while holding a lock; they're SMP-safe.
For you, the reader of this book, I do assume that you understand the basic usage of the useful printk() API, so I'll mostly skip over the very basics and instead explain a summary of typical basic usage, along with a few examples from the kernel code base.
The printk() API's signature is as follows:
// include/linux/printk.h
int printk(const char *fmt, ...);
If you're curious, the actual implementation is here within the kernel source: kernel/printk/printk.c:printk().
Tip – Browsing Source Trees
Efficiently browsing large code bases is an important skill; the modern Linux kernel source tree's Source Lines Of Code (SLOCs) are in excess of 20 million lines! Though you could go with the typical find <ksrc>/ -name "*.[ch]" |xargs grep –Hn "<pattern>" approach, it quickly gets tiresome.
Instead, please do yourself a big favor and learn to use powerful and efficient purpose-built code-browsing tools like (exuberant!) ctags and cscope (you installed them when following directions in Chapter 1, A General Introduction to Debugging Software). In fact, for the Linux kernel, they're built-in targets to the top-level Makefile; here's how you can build their index files for the kernel:
cd <kernel-src-tree>
make –j8 tags
make –j8 cscope
To build the indices for a particular architecture, set the environment variable ARCH to the architecture name; for example, to build cscope indices for AArch64 (ARM 64-bit):
make ARCH=arm64 cscope
You'll find links to tutorials on using ctags and cscope in the Further reading section of this chapter.
Great – let's actually make use of the famous printk(); to do so, we'll begin by checking out the logging levels at which messages can be emitted.
Syntax-wise, the printk API usage is almost identical to that of the familiar printf(3); the immediately visible difference is the usage of a logging level prefixed to the format specifier, the KERN_<foo> as the first token. Here's a sample printk with the logging level set to KERN_INFO:
printk(KERN_INFO "Hello, kernel debug world ");
First off, notice that KERN_INFO is not a separate parameter; it's part of the format string being passed as the argument. Next, it's not a priority level; it's merely a marker to specify that this printk is being logged as an informational one. Utilities to view logs – such as dmesg(1), journalctl(1), and even GUI tools such as gnome-logs(1) – can subsequently be used to filter log messages by logging level.
The printk has eight available log levels (from 0 to 7); you're expected to use the one appropriate to the situation at hand. We'll show them to you direct from the source. The comment to the right of each log level specifies the typical circumstances under which you're expected to use it:
// include/linux/kern_levels.h
[...]
#define KERN_EMERG KERN_SOH "0" /* system is unusable */
#define KERN_ALERT KERN_SOH "1" /* action must be taken immediately */
#define KERN_CRIT KERN_SOH "2" /* critical conditions */
#define KERN_ERR KERN_SOH "3" /* error conditions */
#define KERN_WARNING KERN_SOH "4" /* warning conditions */
#define KERN_NOTICE KERN_SOH "5" /* normal but significant condition */
#define KERN_INFO KERN_SOH "6" /* informational */
#define KERN_DEBUG KERN_SOH "7" /* debug-level messages */
#define KERN_DEFAULT "" /* the default kernel log level */
[...]
You can see that the KERN_<FOO> log levels are merely strings ("0", "1", ..., "7") that get prefixed to the kernel message being emitted by printk, nothing more. KERN_SOH is simply the kernel Start Of Header (SOH), which is the value