Scale, index, base, and displacement

This is a very flexible addressing mode because it lets us address memory much as we address elements of an array. Although it is often referred to as scale/index/base (omitting the displacement part), we are not forced to use all of its elements at once; as we will see, the scale/index/base/displacement scheme is often reduced to base, base + index, or displacement + index. The latter two may come with or without scale. First, let's see what each part represents:

  • Displacement: Technically, this is an integer offset relative to a certain segment base (DS by default).
  • Base: This is a register containing the offset of the data relative to the displacement, or the address of the start of the data if no displacement is specified (an omitted displacement is treated as zero).
  • Index: This is a register containing the offset into the data relative to base + displacement. This is similar to the index of an array member.
  • Scale: The CPU has no concept of data types; it only understands sizes. Therefore, if we are operating on values larger than 1 byte, we have to scale the index value appropriately. The scale may be 1, 2, 4, or 8 for bytes, words, double words, or quad words, respectively. There is no need to specify a scale of 1 explicitly, as it is the default value when no scale is given.

It is possible to explicitly specify another segment by prepending the segment prefix to the address (for example, cs: for CS, es: for ES, and so on).

To calculate the final address, the processor takes the segment's base address (the default is DS), adds the displacement, adds the base, and finally adds the index multiplied by the scale:

segment base address + displacement + base + index * scale
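
The formula above can be sketched in C as a small helper. This is only an illustration: the function name effective_address and its parameters are hypothetical, the segment base is assumed to be zero (a flat memory model), and the displacement is modelled as a signed 32-bit value that is sign-extended before the addition.

```c
#include <stdint.h>

/* Hypothetical helper mirroring: displacement + base + index * scale.
   Assumption: the segment base is zero (flat memory model). */
uint64_t effective_address(uint64_t base, uint64_t index,
                           unsigned scale, int32_t displacement)
{
    /* Sign-extend the displacement to 64 bits, then add everything up. */
    return base + index * scale + (uint64_t)(int64_t)displacement;
}
```

For instance, a base of 0x1000, an index of 3, a scale of 8, and a displacement of 16 give 0x1000 + 24 + 16 = 0x1028.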

In theory, all of this looks nice and easy, so let's advance toward practice, which is nicer and much easier too. If we take another look at the example code for direct addressing, we may see that it contains a few completely redundant lines. The following would be the first one for us to deal with:

mov rbx, [rbx]

Although it provides a good example of register-based direct addressing, it may be safely removed, and the following instruction (call) should then be changed to (remember the indirect call?):

call qword [rbx]

However, even this line may be omitted just like most of the caller code. Taking a closer look at the problem, we see that there is an array of procedure pointers (in fact, an array of two). In terms of a high-level language, C for example, what the preceding code is intended to do is as follows:

int my_proc0(void)
{
    return 0;
}

int my_proc1(void)
{
    return 1;
}

int call_func(int selector)
{
    int (*funcs[])(void) = {my_proc0, my_proc1};
    return funcs[selector]();
}

The Intel architecture provides a similar interface for addressing data/code in an array-like fashion of base + index, yet it introduces another member of the equation: scale. As the assembler and, especially, the processor do not care about the types of data we are operating on, we have to help them with it ourselves.

The processor treats the base part (whether it is a label or a register holding an address) as an address in memory, and an unscaled index as simply a number of bytes to add to that base address. In this particular case, we may, of course, scale the index ourselves, as the algorithm is fairly simple. We only have two possible values for the selector (the rax register in the preceding Assembly code), 0 and 1, so we load, for example, the rbx register with the address of my_proc_address:

lea rbx, [my_proc_address]

Then, we shift the rax register left by three bits (this is equivalent to multiplying by 8, because we are in 64-bit mode and addresses are 8 bytes long; with a selector of 1 we would otherwise point at the second byte of the address of my_proc0) and add the result to the rbx register. This may be good enough for a single use, but it is not very convenient for code that gets executed very frequently. And even if we use an additional register to store the sum of rbx and rax, what if we need that register for something else?
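
The manual scaling just described can be mimicked in C. The sketch below is only an approximation of the idea: proc0, proc1, table, and call_scaled_by_hand are hypothetical names, and the shift by three assumes 8-byte pointers, as on a typical 64-bit target.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the two procedures in the text. */
static int proc0(void) { return 0; }
static int proc1(void) { return 1; }

typedef int (*proc_t)(void);

/* Plays the role of the pointer table at my_proc_address. */
static proc_t table[] = { proc0, proc1 };

/* Scale the selector by hand, as the shift-and-add approach would.
   Assumption: pointers are 8 bytes long, hence the shift by 3. */
int call_scaled_by_hand(uint64_t selector)
{
    const unsigned char *base = (const unsigned char *)table; /* lea rbx, [table] */
    uint64_t offset = selector << 3;                          /* shl rax, 3       */
    proc_t target;

    /* Read the 8-byte pointer stored at base + offset (add rbx, rax). */
    memcpy(&target, base + offset, sizeof target);
    return target();                                          /* call qword [rbx] */
}
```

Without the shift, a selector of 1 would make the read start one byte into the first pointer rather than at the second one.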

This is where the scale part comes into play. Rewriting the calling code from the Assembly example would result in the following:

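xor rax, rax                    ; selector = 0
; inc rax                       ; uncomment to call the second procedure
lea rbx, [my_proc_address]      ; rbx = address of the pointer table
call qword [rbx + rax * 8]      ; call the procedure stored at entry rax

; or, even more conveniently, with the label as the displacement

xor rax, rax
; inc rax
call qword [my_proc_address + rax * 8]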

Of course, the base/index/scale mode may be used for addressing any type of array, not necessarily an array of function pointers.
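
As a rough illustration with plain data, the following C sketch reads array elements by computing base + index * scale by hand, where the scale is the element size. The names load_dword, load_qword, sample_dwords, and sample_qwords are hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sample data. */
static const uint32_t sample_dwords[] = { 100, 200, 300 };
static const uint64_t sample_qwords[] = { 1000, 2000, 3000 };

/* Double words: the scale is 4, as in [base + index * 4]. */
uint32_t load_dword(const uint32_t *base, size_t index)
{
    uint32_t value;
    memcpy(&value, (const unsigned char *)base + index * 4, sizeof value);
    return value;
}

/* Quad words: the scale is 8, as in [base + index * 8]. */
uint64_t load_qword(const uint64_t *base, size_t index)
{
    uint64_t value;
    memcpy(&value, (const unsigned char *)base + index * 8, sizeof value);
    return value;
}
```

Here load_dword(sample_dwords, 2) reads the same element as sample_dwords[2]; the compiler normally performs this scaling for us, whereas in Assembly we state it in the address expression.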
