Trapping and extracting information from a crash

Next, we will look at a test program, ch12/handle_segv.c, that contains deliberate bugs covering several invalid memory access cases. Each of them results in the OS generating the SIGSEGV signal. How the application developer handles this signal is important: we demonstrate how you can use it to gather important details, such as the address of the memory location whose access caused the crash and the values of all registers at that point in time. These details often provide useful clues to the root cause of the memory bug.

To see how the program is meant to be used, run it without any parameters:

$ ./handle_segv 
Usage: ./handle_segv u|k r|w
u => user mode
k => kernel mode
 r => read attempt
 w => write attempt
$

As can be seen, we can perform four kinds of invalid memory access, in effect, four bug cases:

Invalid user [u] mode read [r]
Invalid user [u] mode write [w]
Invalid kernel [k] mode read [r]
Invalid kernel [k] mode write [w]
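The mapping from the two command-line parameters to these four cases can be sketched as a small helper. This is only an illustration: the enum and the function name parse_bug_case are hypothetical and do not appear in handle_segv.c. Note that tolower() is applied to the character itself before the comparison is made.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical names for illustration; not part of handle_segv.c. */
enum bug_case {
    BUG_INVALID = -1,
    BUG_USER_READ,
    BUG_USER_WRITE,
    BUG_KERNEL_READ,
    BUG_KERNEL_WRITE
};

/* Map the "u|k" and "r|w" parameters to one of the four bug cases. */
enum bug_case parse_bug_case(const char *mode, const char *access)
{
    int m, a;

    if (mode == NULL || access == NULL)
        return BUG_INVALID;

    /* Cast to unsigned char: tolower() is undefined for negative values. */
    m = tolower((unsigned char)mode[0]);
    a = tolower((unsigned char)access[0]);

    if (m == 'u' && a == 'r')
        return BUG_USER_READ;
    if (m == 'u' && a == 'w')
        return BUG_USER_WRITE;
    if (m == 'k' && a == 'r')
        return BUG_KERNEL_READ;
    if (m == 'k' && a == 'w')
        return BUG_KERNEL_WRITE;
    return BUG_INVALID;
}
```

A caller could then switch on the returned value and fall back to printing the usage message for BUG_INVALID.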

Some typedefs and macros we use are as follows:

typedef unsigned int u32;
typedef long unsigned int u64;

#define ADDR_FMT "%lx"
#if __x86_64__ /* 64-bit; __x86_64__ works for gcc */
 #define ADDR_TYPE u64
 static u64 invalid_uaddr = 0xdeadfaceL;
 static u64 invalid_kaddr = 0xffff0b9ffacedeadL;
#else
 #define ADDR_TYPE u32
 static u32 invalid_uaddr = 0xfacedeadL;
 static u32 invalid_kaddr = 0xdeadfaceL;
#endif
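As an aside, the same goal (printing an address with a width that matches the platform) can be sketched without per-architecture macros by using the C99 types from <stdint.h> and <inttypes.h>. This is an alternative technique, not what the program above does, and the helper name format_addr is hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a pointer value as "0x<hex>" into buf.
 * uintptr_t is wide enough for a pointer on both 32-bit and 64-bit
 * builds, and PRIxPTR expands to the matching printf conversion.
 * Returns what snprintf() returns (the length that was needed).
 */
int format_addr(char *buf, size_t len, const void *addr)
{
    return snprintf(buf, len, "0x%" PRIxPTR, (uintptr_t)addr);
}
```

With this approach, a single format string serves both word sizes, so an ADDR_FMT/ADDR_TYPE pair is not needed for printing.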

The main function is shown as follows:

int main(int argc, char **argv)
{
    struct sigaction act;

    if (argc != 3) {
        usage(argv[0]);
        exit(1);
    }

    memset(&act, 0, sizeof(act));
    act.sa_sigaction = myfault;
    act.sa_flags = SA_RESTART | SA_SIGINFO;
    sigemptyset(&act.sa_mask);
    if (sigaction(SIGSEGV, &act, 0) == -1)
        FATAL("sigaction SIGSEGV failed\n");

    if ((tolower(argv[1][0]) == 'u') && (tolower(argv[2][0]) == 'r')) {
        ADDR_TYPE *uptr = (ADDR_TYPE *) invalid_uaddr;
        printf("Attempting to read contents of arbitrary usermode va uptr = 0x"
               ADDR_FMT ":\n", (ADDR_TYPE) uptr);
        printf("*uptr = 0x" ADDR_FMT "\n", *uptr); // just reading

    } else if ((tolower(argv[1][0]) == 'u') && (tolower(argv[2][0]) == 'w')) {
        ADDR_TYPE *uptr = (ADDR_TYPE *) & main;
        printf("Attempting to write into arbitrary usermode va uptr (&main actually) = 0x"
               ADDR_FMT ":\n", (ADDR_TYPE) uptr);
        *uptr = 0x2A; // writing

    } else if ((tolower(argv[1][0]) == 'k') && (tolower(argv[2][0]) == 'r')) {
        ADDR_TYPE *kptr = (ADDR_TYPE *) invalid_kaddr;
        printf("Attempting to read contents of arbitrary kernel va kptr = 0x"
               ADDR_FMT ":\n", (ADDR_TYPE) kptr);
        printf("*kptr = 0x" ADDR_FMT "\n", *kptr); // just reading

    } else if ((tolower(argv[1][0]) == 'k') && (tolower(argv[2][0]) == 'w')) {
        ADDR_TYPE *kptr = (ADDR_TYPE *) invalid_kaddr;
        printf("Attempting to write into arbitrary kernel va kptr = 0x"
               ADDR_FMT ":\n", (ADDR_TYPE) kptr);
        *kptr = 0x2A; // writing
    } else
        usage(argv[0]);
    exit(0);
}

va = virtual address.

Here is the key part, the signal handler for SIGSEGV:

static void myfault(int signum, siginfo_t * si, void *ucontext)
{
    fprintf(stderr,
        "%s:\n------------------- FATAL signal ---------------------------\n",
        APPNAME);
    fprintf(stderr," %s: received signal %d. errno=%d\n"
        " Cause/Origin: (si_code=%d): ",
        __func__, signum, si->si_errno, si->si_code);

    switch (si->si_code) {
        /* Possible values si_code can have for SIGSEGV */
    case SEGV_MAPERR:
        fprintf(stderr,"SEGV_MAPERR: address not mapped to object\n");
        break;
    case SEGV_ACCERR:
        fprintf(stderr,"SEGV_ACCERR: invalid permissions for mapped object\n");
        break;
    /* SEGV_BNDERR and SEGV_PKUERR result in compile failure? */

    /* Other possibilities for si_code; here just to show them... */
    case SI_USER:
        fprintf(stderr,"user\n");
        break;
    case SI_KERNEL:
        fprintf(stderr,"kernel\n");
        break;

--snip--

    default:
        fprintf(stderr,"-none-\n");
    }
<...>
     
    /* 
     * Placeholders for real-world apps:
     * crashed_write_to_log();
     * crashed_perform_cleanup();
     * crashed_inform_enduser();
     *
     * Now have the kernel generate the core dump by:
     *  Reset the SIGSEGV to (kernel) default, and,
     *  Re-raise it!
     */
    signal(SIGSEGV, SIG_DFL);
    raise(SIGSEGV);
}

There is much to observe here:

We print out the signal number and origin value
We interpret the signal origin value (via the switch-case)
- In particular, for SIGSEGV, the SEGV_MAPERR and SEGV_ACCERR values
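The interpretation step can also be sketched as a pure function that returns a description instead of printing it, which makes it easy to reuse when writing to a log. The function name segv_origin is hypothetical, and only the si_code values shown in the handler above are covered.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose siginfo_t and the si_code constants */
#endif
#include <signal.h>

/* Hypothetical helper: describe the origin of a SIGSEGV from si_code. */
const char *segv_origin(int si_code)
{
    switch (si_code) {
    case SEGV_MAPERR:
        return "SEGV_MAPERR: address not mapped to object";
    case SEGV_ACCERR:
        return "SEGV_ACCERR: invalid permissions for mapped object";
    case SI_USER:
        return "SI_USER: sent by kill(2)";
    case SI_KERNEL:               /* Linux-specific value */
        return "SI_KERNEL: sent by the kernel";
    default:
        return "-none-";
    }
}
```

Inside a handler, it would be called as segv_origin(si->si_code).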

Here comes the interesting bit: the following code prints out the faulting instruction or address! Not only that, we devise a means by which we can print out most of the CPU registers as well via our dump_regs function. As mentioned earlier, we also make use of the helper routine psiginfo(3) as follows:

fprintf(stderr," Faulting instr or address = 0x" ADDR_FMT "\n",
         (ADDR_TYPE) si->si_addr);
fprintf(stderr, "--- Register Dump [x86_64] ---\n");
dump_regs(ucontext);
fprintf(stderr,
     "------------------------------------------------------------\n");
psiginfo(si, "psiginfo helper");
fprintf(stderr,
     "------------------------------------------------------------\n");

We then keep some placeholder stubs for the functionality you would probably want in a real-world application when handling a fatal signal such as this (we do not actually write any code here, as it is, of course, very application-specific):

/* 
 * Placeholders for real-world apps:
 * crashed_write_to_log();
 * crashed_perform_cleanup();
 * crashed_inform_enduser();
 */

Finally, calling abort(3) so that the process terminates (as it's now in an undefined state and cannot continue) is one way to finish. However, think for a second: abort() terminates the process by raising SIGABRT, so the termination status, and any core dump the kernel generates, would be attributed to SIGABRT rather than to the SIGSEGV that actually occurred. (As mentioned, a core dump is essentially a snapshot of the process's dynamic memory segments at the time of the crash; it's very useful for developers to debug and determine the root cause of the crash.) So, having the kernel generate the core dump for the original SIGSEGV would indeed be useful. How can we arrange for this? It's quite simple really; we need to do the following:

Reset the SIGSEGV signal's handler to the (kernel) default
Have the signal (re)raised on the process

This code fragment achieves just this:

[...]
 * Now have the kernel generate the core dump by:
 * Reset the SIGSEGV to (kernel) default, and,
 * Re-raise it!
 */
 signal(SIGSEGV, SIG_DFL);
 raise(SIGSEGV);

As it's a simple case, we just use the simpler signal(2) API to revert the signal's action to the default. Then we use the library API raise(3) to raise the signal on the calling process. (The error-checking code has been left out for easy readability.)
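For completeness, here is a sketch of the same two steps with the error checking put back in, using sigaction(2) rather than signal(2). The function names are hypothetical, and exiting via _exit(2) on failure is an assumption made for this sketch, not the book's choice.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose sigaction(2) and friends */
#endif
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Step 1: restore the default action for signo. Returns 0 or -1. */
int reset_to_default(int signo)
{
    struct sigaction dfl;

    memset(&dfl, 0, sizeof(dfl));
    dfl.sa_handler = SIG_DFL;
    sigemptyset(&dfl.sa_mask);
    return sigaction(signo, &dfl, NULL);
}

/*
 * Step 2: re-raise the signal so that the default action applies.
 * If either step fails, terminate anyway (assumption for this sketch).
 */
void reset_and_reraise(int signo)
{
    if (reset_to_default(signo) == -1 || raise(signo) != 0)
        _exit(1);
}
```

The last statement of a SIGSEGV handler would then be reset_and_reraise(SIGSEGV);.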
