Now that you know the basics of assembly language programming, it's time to start putting those concepts to practical use. One very common use of assembly language programming is to code assembly functions within higher-level languages, such as C and C++. There are a couple of different ways to do this. This chapter describes how to place assembly language functions directly within C and C++ language programs. This technique is called inline assembly.
The chapter begins by describing how C and C++ programs use functions, and how the functions are converted to assembly language code by the compiler. Next, the basic inline assembly format is discussed, including how to incorporate simple assembly functions. After that, the extended inline assembly format is described. This format enables you to incorporate more complex assembly language functions within the C or C++ programs. Finally, the chapter explains how to define macros using complex inline assembly language functions within C programs.
In a standard C or C++ program, code is entered in the C or C++ syntax in a text source code file. The source code file is then compiled into assembly language code using the compiler. After that step, the assembly language code is linked with any required libraries to produce an executable program (see Chapter 3, "The Tools of the Trade").
In the Linux world, the GNU compiler (gcc) is used to create the executable program from the text source code file. Normally, the step of converting the code to assembly language is hidden from the programmer. But as shown in Chapter 3, you can use the -S option of the GNU compiler to view the actual assembly language code generated from the source code.
A common programming technique in C and C++ programming is to create separate standalone functions within the source code file. These functions perform individual processes that can be called multiple times from the main program. When a C or C++ program is divided into functions, the compiler compiles each function into a separate assembly function (see Chapter 11, "Using Functions"). The functions are still contained within the same assembly language file, but as separate functions. To see what is produced, you can still use the -S option to compile the program and view the generated assembly language code.
To demonstrate this, the cfunctest.c program uses separate functions within a simple C language program:
/* cfunctest.c - An example of functions in C */
#include <stdio.h>

float circumf(int a)
{
    return 2 * a * 3.14159;
}

float area(int a)
{
    return a * a * 3.14159;
}

int main()
{
    int x = 10;
    printf("Radius: %d\n", x);
    printf("Circumference: %f\n", circumf(x));
    printf("Area: %f\n", area(x));
    return 0;
}
The two functions are defined as having a single integer value for the input, and producing a single-precision floating-point value (a C float) as the output. The mathematical calculations are performed within the individual functions, separate from the main program code. The functions can be called as many times as required within the main program without having to write additional code.
To view the assembly language code generated by the compiler, compile using the -S option:
$ gcc -S cfunctest.c
This command creates the file cfunctest.s, which looks like this:
    .file "cfunctest.c"
    .version "01.01"
gcc2_compiled.:
    .section .rodata
    .align 8
.LC0:
    .long 0xf01b866e,0x400921f9
    .text
    .align 16
.globl circumf
    .type circumf,@function
circumf:
    pushl %ebp
    movl %esp, %ebp
    subl $4, %esp
    movl 8(%ebp), %eax
    addl %eax, %eax
    pushl %eax
    fildl (%esp)
    popl %eax
    fldl .LC0
    fmulp %st, %st(1)
    fstps -4(%ebp)
    flds -4(%ebp)
    movl %ebp, %esp
    popl %ebp
    ret
.Lfe1:
    .size circumf,.Lfe1-circumf
    .section .rodata
    .align 8
.LC2:
    .long 0xf01b866e,0x400921f9
    .text
    .align 16
.globl area
    .type area,@function
area:
    pushl %ebp
    movl %esp, %ebp
    subl $4, %esp
    movl 8(%ebp), %eax
    imull 8(%ebp), %eax
    pushl %eax
    fildl (%esp)
    popl %eax
    fldl .LC2
    fmulp %st, %st(1)
    fstps -4(%ebp)
    flds -4(%ebp)
    movl %ebp, %esp
    popl %ebp
    ret
.Lfe2:
    .size area,.Lfe2-area
    .section .rodata
.LC4:
    .string "Radius: %d\n"
.LC5:
    .string "Circumference: %f\n"
.LC6:
    .string "Area: %f\n"
    .text
    .align 16
.globl main
    .type main,@function
main:
    pushl %ebp
    movl %esp, %ebp
    subl $8, %esp
    movl $10, -4(%ebp)
    subl $8, %esp
    pushl -4(%ebp)
    pushl $.LC4
    call printf
    addl $16, %esp
    subl $4, %esp
    subl $8, %esp
    pushl -4(%ebp)
    call circumf
    addl $12, %esp
    leal -8(%esp), %esp
    fstpl (%esp)
    pushl $.LC5
    call printf
    addl $16, %esp
    subl $4, %esp
    subl $8, %esp
    pushl -4(%ebp)
    call area
    addl $12, %esp
    leal -8(%esp), %esp
    fstpl (%esp)
    pushl $.LC6
    call printf
    addl $16, %esp
    movl $0, %eax
    movl %ebp, %esp
    popl %ebp
    ret
.Lfe3:
    .size main,.Lfe3-main
    .ident "GCC: (GNU) 2.96 20000731 (Linux-Mandrake 8.0 2.96-0.48mdk)"
By now you should be able to understand the assembly language code generated by the compiler. The two C functions were created as separate assembly language functions, set apart from the main program code. The main program uses the standard C style function format to pass the input parameter to the functions (by placing the input value onto the top of the stack). The CALL instruction is used to invoke the functions from the main program.
In this simple example, the assembly code generated to implement the functions was fairly trivial. However, in a more complex application, you may not want the compiler to generate the assembly language code, or you may want to use assembly language instructions that the compiler is incapable of producing (such as the CPUID instruction).
If you want to directly control what assembly language code is generated to implement a function, you can do one of three things:
Implement the function from scratch in assembly language code and call it from the C program.
Create the assembly language version of the C code using the -S option, modify the assembly language code as necessary, and then link the assembly code to create the executable.
Create the assembly language code for the functions within the original C code and compile it using the standard C compiler.
The first option is discussed in Chapter 14, "Calling Assembly Libraries." The second option is discussed in Chapter 15, "Optimizing Routines." The third option is exactly how inline assembly language programming works. This method enables you to create assembly language functions within the C or C++ source code itself, without having to link additional libraries or programs. It gives you greater control over how certain functions are implemented at the assembly language level of the final program.
Creating inline assembly code is not much different from creating assembly functions, except that it is done within a C or C++ program. This section describes how to create basic inline assembly code functions that can implement simple assembly language code within C or C++ programs.
The GNU C compiler uses the asm keyword to denote a section of source code that is written in assembly language. The basic format of the asm section is as follows:
asm( "assembly code" );
The assembly code contained within the parentheses must be in a specific format:
The instructions must be enclosed in quotation marks.
If more than one instruction is included, the newline character must be used to separate each line of assembly language code. Often, a tab character is also included to help indent the assembly language code to make lines more readable.
The second rule is required because the compiler takes the assembly code in the asm section verbatim and places it within the assembly code generated for the program. Each assembly language instruction must be on a separate line, hence the requirement to include the newline character.
Some assemblers also require instructions to be indented by a tab character to distinguish them from labels. The GNU assembler does not require this, but many programmers include the tab character for consistency.
These requirements can create some confusing-looking assembly code in the source code, but they help keep the generated assembly language code sane.
A sample basic inline assembly section could look like this:
asm ("movl $1, %eax\n\tmovl $0, %ebx\n\tint $0x80");
This example uses three instructions: two MOVL instructions to place a value of one in the EAX register and a value of zero in the EBX register, and the INT instruction to perform the Linux system call.
This format can get somewhat messy when using a lot of assembly instructions. Most programmers place instructions on separate lines. When doing this, each instruction must be enclosed in quotation marks:
asm ( "movl $1, %eax\n\t"
      "movl $0, %ebx\n\t"
      "int $0x80");
This format is much easier to read when trying to debug an application. The asm section can be placed anywhere within the C or C++ source code. The following asmtest.c program demonstrates how the asm section would look in an actual program:
/* asmtest.c - An example of using an asm section in a program */
#include <stdio.h>

int main()
{
    int a = 10;
    int b = 20;
    int result;

    result = a * b;
    asm ( "nop");
    printf("The result is %d\n", result);
    return 0;
}
The assembly language instruction used in the asm statement (the NOP instruction) does not do anything in the C program, but will appear in the assembly language code generated by the compiler. To generate the assembly language code for this program, use the -S option of the gcc command. The generated assembly code file should look like this:
    .file "asmtest.c"
    .section .rodata
.LC0:
    .string "The result is %d\n"
    .text
.globl main
    .type main, @function
main:
    pushl %ebp
    movl %esp, %ebp
    subl $24, %esp
    andl $-16, %esp
    movl $0, %eax
    subl %eax, %esp
    movl $10, -4(%ebp)
    movl $20, -8(%ebp)
    movl -4(%ebp), %eax
    imull -8(%ebp), %eax
    movl %eax, -12(%ebp)
#APP
    nop
#NO_APP
    movl -12(%ebp), %eax
    movl %eax, 4(%esp)
    movl $.LC0, (%esp)
    call printf
    movl $0, %eax
    leave
    ret
    .size main, .-main
    .section .note.GNU-stack,"",@progbits
    .ident "GCC: (GNU) 3.3.2 (Debian)"
The generated code uses the normal C style function prologue and the LEAVE instruction to implement the standard epilogue (see Chapter 11). Within the prologue and epilogue code is the code generated by the C source code, and within that is a section identified by the #APP and #NO_APP symbols. This section contains the inline assembly code specified by the asm section. Note how the code is placed using the newline and tab characters specified.
Just implementing assembly language code itself won't be able to accomplish much. To do any real work, there must be a way to pass data into and out of the inline assembly language function.
The basic inline assembly code can utilize global C variables defined in the application. The word to remember here is "global." Only globally defined variables can be used within the basic inline assembly code. The variables are referenced by the same names used within the C program.
The globaltest.c program demonstrates how to do this:
/* globaltest.c - An example of using C variables */
#include <stdio.h>

int a = 10;
int b = 20;
int result;

int main()
{
    asm ( "pusha\n\t"
          "movl a, %eax\n\t"
          "movl b, %ebx\n\t"
          "imull %ebx, %eax\n\t"
          "movl %eax, result\n\t"
          "popa");
    printf("The result is %d\n", result);
    return 0;
}
The a, b, and result variables are defined as global variables in the C program, and are used within the asm section of the code. Note that the values are used as memory locations within the assembly language code, and not as immediate data values. The variables can also be used elsewhere in the C program as normal.
Remember that the data variables must be declared as global. You cannot use local variables within the asm section.
The generated assembly code from the compiler looks like this:
    .file "globaltest.c"
.globl a
    .data
    .align 4
    .type a, @object
    .size a, 4
a:
    .long 10
.globl b
    .align 4
    .type b, @object
    .size b, 4
b:
    .long 20
    .section .rodata
.LC0:
    .string "The result is %d\n"
    .text
.globl main
    .type main, @function
main:
    pushl %ebp
    movl %esp, %ebp
    subl $8, %esp
    andl $-16, %esp
    movl $0, %eax
    subl %eax, %esp
#APP
    pusha
    movl a, %eax
    movl b, %ebx
    imull %ebx, %eax
    movl %eax, result
    popa
#NO_APP
    movl result, %eax
    movl %eax, 4(%esp)
    movl $.LC0, (%esp)
    call printf
    movl $0, %eax
    leave
    ret
    .size main, .-main
    .comm result,4,4
    .section .note.GNU-stack,"",@progbits
    .ident "GCC: (GNU) 3.3.2 (Debian)"
Notice how the a and b variables are declared in the .data section and assigned the proper values. The result variable, because it is not initialized in the C code, is declared as a .comm value.
One other important feature is shown in this example program. Notice the PUSHA instruction at the start of the assembly language code, and the POPA instruction at the end. It is important to remember to store the initial values of the registers before entering your code, and then restore them when you are done. It's quite possible that the compiler will use those registers for other values within the compiled C source code. If you modify them in your asm section, unpredictable things may occur.
When creating inline assembly code in your application, you must be aware of what the compiler may do to it during the compile operation. In a normal C or C++ application, the compiler may attempt to optimize the generated assembly code to increase performance. This is usually done by eliminating functions that are not used, sharing registers between values that are not concurrently used, and rearranging code to facilitate better flow of the program.
Sometimes optimization is not a good thing with inline assembly functions. It is possible that the compiler may look at the inline code and attempt to optimize it as well, possibly producing undesirable effects.
If you want the compiler to leave your hand-coded inline assembly function alone, you can just say so! The volatile modifier can be placed in the asm statement to indicate that no optimization is desired on that section of code. The format of the asm statement using the volatile modifier is as follows:
asm volatile ("assembly code");
The assembly code within the statement follows the same rules it would without the volatile modifier. Adding the volatile modifier also does not change the requirement to store and retrieve the register values within the inline assembly code.
The asm keyword used to identify the inline assembly code section may be altered if necessary. When GCC compiles in a strict ANSI/ISO C mode (for example, with the -ansi option), it does not recognize asm as a keyword, because a conforming program is free to use that name as an ordinary identifier. If you are writing code using the ANSI C conventions, you must use the __asm__ keyword instead of the normal asm keyword.
The assembly code section within the statement does not change, just the asm keyword, as shown in the following example:
__asm__ ("pusha\n\t"
         "movl a, %eax\n\t"
         "movl b, %ebx\n\t"
         "imull %ebx, %eax\n\t"
         "movl %eax, result\n\t"
         "popa");
The __asm__ keyword can also be modified using the __volatile__ modifier.
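As a minimal sketch of the two alternative keywords used together, consider the following function. The name burn_cycles is hypothetical and is not part of this chapter's example programs; the sketch assumes GCC or a compatible compiler, and a target whose assembler accepts the nop mnemonic (x86 does). Because it uses only the double-underscore spellings, it should also compile in a strict mode such as -ansi.

```c
/* burn_cycles is a hypothetical helper name used only for illustration. */
int burn_cycles(int count)
{
    int executed = 0;
    int i;

    for (i = 0; i < count; i++) {
        /* __volatile__ asks the compiler to leave this statement in place */
        __asm__ __volatile__ ("nop");
        executed++;
    }
    return executed;   /* number of nop statements executed */
}
```

Calling burn_cycles(5) runs the loop body five times; viewing the -S output is one way to check that the nop survives between the #APP and #NO_APP markers.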
The basic asm format provides an easy way to create assembly code, but it has its limitations. For one, all input and output values have to use global variables from the C program. In addition, you have to be extremely careful not to change the values of any registers within the inline assembly code.
The GNU compiler provides an extended format for the asm section that helps solve these problems. The extended format provides additional options that enable you to more precisely control how the inline assembly language code is generated within the C or C++ language program. This section describes the extended asm format.
Because the extended asm format provides additional features, its syntax has additional parts to express them. The format of the extended version of asm looks like this:
asm ("assembly code" : output locations : input operands : changed registers);
This format consists of four parts, each separated by a colon:
Assembly code: The inline assembly code using the same syntax used for the basic asm format
Output locations: A list of registers and memory locations that will contain the output values from the inline assembly code
Input operands: A list of registers and memory locations that contain input values for the inline assembly code
Changed registers: A list of any additional registers that are changed by the inline code
Not all of the sections are required to be present in the extended asm format. If no output values are associated with the assembly code, the section must be blank, but two colons must still separate the assembly code from the input operands. If no registers are changed by the inline assembly code, the last colon may be omitted.
The following sections describe how to use the extended asm format.
In the basic asm format, input and output values are incorporated using the C global variable name within the assembly code. Things are a little different when using the extended format.
In the extended format, you can assign input and output values from both registers and memory locations. The format of the input and output values list is
"constraint"(variable)
where variable is a C variable declared within the program. In the extended asm format, both local and global variables can be used. The constraint defines where the variable is placed (for input values) or moved from (for output values). This is what defines whether the value is placed in a register or a memory location.
The constraint is a single-character code. The constraint codes are shown in the following table.
Constraint | Description |
---|---|
a | Use the %eax, %ax, or %al registers. |
b | Use the %ebx, %bx, or %bl registers. |
c | Use the %ecx, %cx, or %cl registers. |
d | Use the %edx, %dx, or %dl registers. |
S | Use the %esi or %si registers. |
D | Use the %edi or %di registers. |
r | Use any available general-purpose register. |
q | Use either the %eax, %ebx, %ecx, or %edx register. |
A | Use the %eax and the %edx registers for a 64-bit value. |
f | Use a floating-point register. |
t | Use the first (top) floating-point register. |
u | Use the second floating-point register. |
m | Use the variable's memory location. |
o | Use an offset memory location. |
V | Use only a direct memory location. |
i | Use an immediate integer value. |
n | Use an immediate integer value with a known value. |
g | Use any register or memory location available. |
In addition to these constraints, output values include a constraint modifier, which indicates how the output value is handled by the compiler. The output modifiers that can be used are shown in the following table.
Output Modifier | Description |
---|---|
+ | The operand can be both read from and written to. |
= | The operand can only be written to. |
% | The operand can be switched with the next operand if necessary. |
& | The operand is written before the inline code has finished reading its input operands (an early clobber), so it cannot share a register with an input. |
The easiest way to see how the input and output values work is to see some examples. This example:
asm ("assembly code" : "=a"(result) : "d"(data1), "c"(data2));
places the C variable data1 into the EDX register, and the variable data2 into the ECX register. The result of the inline assembly code will be placed into the EAX register, and then moved to the result variable.
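The same register constraints are enough to reach the CPUID instruction mentioned at the start of the chapter. The following is only a sketch under stated assumptions: cpu_vendor is a hypothetical helper name, the target is assumed to be x86 or x86-64 (other targets simply report failure), and the __asm__ spelling is used so that it also compiles in strict ANSI mode. The input leaf number is loaded into EAX, and the four result registers are copied back to C variables.

```c
#include <string.h>

/* cpu_vendor is a hypothetical helper; it assumes an x86 or x86-64 target. */
int cpu_vendor(char vendor[13])
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int max_leaf, ebx, ecx, edx;

    /* Leaf 0 goes in through EAX; four registers come back out. */
    __asm__ __volatile__ ("cpuid"
                          : "=a"(max_leaf), "=b"(ebx), "=c"(ecx), "=d"(edx)
                          : "a"(0));
    (void)max_leaf;                 /* highest supported leaf, unused here */

    /* The vendor ID string is returned in EBX, EDX, ECX order. */
    memcpy(vendor,     &ebx, 4);
    memcpy(vendor + 4, &edx, 4);
    memcpy(vendor + 8, &ecx, 4);
    vendor[12] = '\0';
    return 1;
#else
    vendor[0] = '\0';               /* no CPUID instruction on this target */
    return 0;
#endif
}
```

No registers are named inside the assembly string itself, so the example does not depend on how register names are written in the extended format.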
If the input and output variables are assigned to registers, the registers can be used within the inline assembly code almost as normal. I use the word "almost" because there is one oddity to deal with.
In extended asm format, to reference a register in the assembly code you must use two percent signs instead of just one (the reason for this will be discussed a little later). This makes the code a little odd looking, but not too much different.
The regtest1.c program demonstrates using registers within the extended asm format:
/* regtest1.c - An example of using registers */
#include <stdio.h>

int main()
{
    int data1 = 10;
    int data2 = 20;
    int result;

    asm ("imull %%edx, %%ecx\n\t"
         "movl %%ecx, %%eax"
         : "=a"(result)
         : "d"(data1), "c"(data2));
    printf("The result is %d\n", result);
    return 0;
}
This time, the C variables are declared as local variables, which you couldn't do with the basic asm format. Each C variable is assigned to a specific register. The output register is modified with the equal sign to indicate that it can only be written to by the assembly code (this is required for all output values in the inline code).
When the C program is compiled, the compiler automatically generates the assembly code necessary to place the C variables in the appropriate registers to implement the inline assembly code. You can see what is generated again by using the -S option. The inline code generated looks like this:
    movl $10, -4(%ebp)
    movl $20, -8(%ebp)
    movl -4(%ebp), %edx
    movl -8(%ebp), %ecx
#APP
    imull %edx, %ecx
    movl %ecx, %eax
#NO_APP
    movl %eax, -12(%ebp)
The compiler moved the data1 and data2 values onto the stack spaces reserved for the C variables. The values were then loaded into the EDX and ECX registers required by the inline assembly code. The resulting output in the EAX register was then moved to the result variable location on the stack.
You don't always need to specify the output value in the inline assembly section. Some assembly instructions already assume that the input values contain the output values.
The MOVS instructions include the output location within the input values. The movstest.c program demonstrates this:
/* movstest.c - An example of instructions with only input values */
#include <stdio.h>

int main()
{
    char input[30] = {"This is a test message.\n"};
    char output[30];
    int length = 25;

    asm volatile ("cld\n\t"
                  "rep movsb"
                  :
                  : "S"(input), "D"(output), "c"(length));
    printf("%s", output);
    return 0;
}
The movstest.c program specifies the required three input values for the MOVS instruction as input values. The location of the string to copy is placed in the ESI register, the location of the destination is placed in the EDI register, and the length of the string to copy is placed in the ECX register (remember to include the terminating null character in the string length).
The output value is already defined as one of the input values, so no output values are specifically defined in the extended format. Because no specific output values are defined, it is important to use the volatile keyword; otherwise, the compiler may remove the asm section as unnecessary, as it doesn't produce an output.
In the regtest1.c example, the input values were placed in specific registers declared in the inline assembly section, and the registers were specifically utilized in the assembly instructions. While this worked fine for just a few input values, for functions that require a lot of input values this is a somewhat tedious way in which to use them.
To help you out, the extended asm format provides placeholders that can be used to reference input and output values within the inline assembly code. This enables you to declare input and output values in any register or memory location that is convenient for the compiler.
The placeholders are numbers, preceded by a percent sign. Each input and output value listed in the inline assembly code is assigned a number based on its location in the listing, starting with zero. The placeholders can then be used in the assembly code to represent the values.
For example, the following inline code:
asm ("assembly code" : "=r"(result) : "r"(data1), "r"(data2));
will produce the following placeholders:
%0 will represent the register containing the result variable value.
%1 will represent the register containing the data1 variable value.
%2 will represent the register containing the data2 variable value.
Notice that the placeholders provide a method for utilizing both registers and memory locations within the inline assembly code. The placeholders are used in the assembly code just as the original data types would be:
imull %1, %2
movl %2, %0
Remember that you must declare the input and output values as the proper storage elements (registers or memory) required by the assembly instructions in the inline code. In this example, both of the input values were required to be loaded into registers for the IMULL instruction.
To demonstrate using placeholders, the regtest2.c program performs the same function as the regtest1.c program, but enables the compiler to choose which registers to use:
/* regtest2.c - An example of using placeholders */
#include <stdio.h>
int main()
{
    int data1 = 10;
    int data2 = 20;
    int result;

    asm ("imull %1, %2\n\t"
         "movl %2, %0"
         : "=r"(result)
         : "r"(data1), "r"(data2));

    printf("The result is %d\n", result);
    return 0;
}
The regtest2.c program uses the r constraint when defining the input and output values, using registers for all of the data requirements. The compiler selects the registers used when the assembly language code for the program is generated. You can see this by viewing the generated assembly code with the -S option:
    movl $10, -4(%ebp)
    movl $20, -8(%ebp)
    movl -4(%ebp), %edx
    movl -8(%ebp), %eax
#APP
    imull %edx, %eax
    movl %eax, %eax
#NO_APP
    movl %eax, -12(%ebp)
My compiler elected to do something interesting when the assembly code was generated. It used the EDX register to hold the data1 value, and the EAX register to hold the data2 value, as we would normally expect. The interesting part is that it noticed that the result was generated after the input values were finished being used, so it assigned the result variable to the EAX register as well. My poorly constructed inline assembly code still performed the MOVL instruction, but it just moved the EAX register to itself.
You can watch the running program in the debugger to see if the MOVL instruction is really executed. To generate an executable that can be used in the debugger, you can use the -gstabs option with the gcc compiler:
$ gcc -gstabs -o regtest2 regtest2.c
When the executable is created, it can then be run in the debugger:
$ gdb -q regtest2
(gdb) break *main
Breakpoint 1 at 0x8048364: file regtest2.c, line 4.
(gdb) run
Starting program: /home/rich/palp/chap13/regtest2

Breakpoint 1, main () at regtest2.c:4
4       {
(gdb) s
5          int data1 = 10;
(gdb) s
6          int data2 = 20;
(gdb) s
9          asm ("imull %1, %2\n\t"
(gdb) s
14         printf("The result is %d\n", result);
(gdb) info reg
eax            0xc8     200
ecx            0x1      1
edx            0xa      10
To set a breakpoint in a C program, you can specify either the line number to start or a function label. This example set the breakpoint at the main() function label, or the start of the program.
One thing you may notice as you are stepping through the program is that the asm section is considered a single statement by the debugger. You can step into the asm section using the stepi debugger command and execute each instruction separately.
The registers listing shows that after the asm section, the data1 value was loaded into the EDX register, and the EAX register was used as the result variable.
As you saw in the regtest2.c program, I needlessly used a MOVL instruction to produce the output value in the proper variable. Sometimes it is beneficial to use the same variable as both an input value and an output value. To do this, you must define the input and output values differently in the extended asm section.
If an input and output value in the inline assembly code share the same C variable from the program, you can specify that using the placeholders as the constraint value. This can create some odd-looking code, but it comes in handy to reduce the number of registers required in the code.
To fix the inline code from the regtest2.c program, you could write the following:
asm ("imull %1, %0" : "=r"(data2) : "r"(data1), "0"(data2));
The 0 constraint signals the compiler to place the data2 input value in the same location as operand %0. Operand %0 is the register defined in the second line, which assigns a register to the data2 output variable. This ensures that the same register will be used to hold the input and output values. Of course, the result will be placed in the data2 value when the inline code is complete.
The regtest3.c program demonstrates this:
/* regtest3.c - An example of using placeholders for a common value */
#include <stdio.h>

int main()
{
    int data1 = 10;
    int data2 = 20;

    asm ("imull %1, %0"
         : "=r"(data2)
         : "r"(data1), "0"(data2));
    printf("The result is %d\n", data2);
    return 0;
}
The regtest3.c program uses the data2 value as both an input value and the output value.
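The + output modifier listed in the modifier table offers another way to express the same idea: it marks a single operand as both read and written, so a separate matching input is not needed. The following is a minimal sketch rather than one of this chapter's programs; multiply_in_place is a hypothetical name, and an x86 or x86-64 target is assumed (a plain C fallback is included for anything else).

```c
/* multiply_in_place is a hypothetical helper used only for illustration. */
int multiply_in_place(int data1, int data2)
{
#if defined(__i386__) || defined(__x86_64__)
    /* %0 is data2 (read and written); %1 is data1 (read only) */
    __asm__ ("imull %1, %0"
             : "+r"(data2)
             : "r"(data1));
#else
    data2 *= data1;     /* fallback for non-x86 targets */
#endif
    return data2;
}
```

With the values used in regtest3.c, multiply_in_place(10, 20) returns 200. Which of the two forms reads better is largely a matter of taste; the matching-digit form makes the shared location explicit, while the + form keeps the operand list shorter.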
If you are working with a lot of input and output values, the numeric placeholders can quickly become confusing. To help keep things sane, the GNU compiler (starting with version 3.1) enables you to declare alternative names as placeholders.
The alternative name is defined within the sections in which the input and output values are declared, and is referenced in the assembly code as %[name]. The format of the declaration is as follows:
[name] "constraint"(variable)
The name value defined becomes the new placeholder identifier for the variable in the inline assembly code, as shown in the following example:
asm ("imull %[value1], %[value2]" : [value2] "=r"(data2) : [value1] "r"(data1), "0"(data2));
The alternative placeholder names are used in the same way as the normal placeholders were, as demonstrated in the following alttest.c program:
/* alttest.c - An example of using alternative placeholders */
#include <stdio.h>

int main()
{
    int data1 = 10;
    int data2 = 20;

    asm ("imull %[value1], %[value2]"
         : [value2] "=r"(data2)
         : [value1] "r"(data1), "0"(data2));
    printf("The result is %d\n", data2);
    return 0;
}
You may have noticed in the examples presented so far that I have not used the changed registers list in the extended asm format, even though it is obvious that each of the programs contained registers that were changed.
The compiler assumes that registers used in the input and output values will change, and handles that accordingly. You do not need to include these values in the changed registers list. In fact, if you do, it will produce an error message, as demonstrated in the following badregtest.c program:
/* badregtest.c - An example of incorrectly using the changed registers list */
#include <stdio.h>
int main()
{
    int data1 = 10;
    int result = 20;

    asm ("addl %1, %0"
         : "=d"(result)
         : "c"(data1), "0"(result)
         : "%ecx", "%edx");
    printf("The result is %d\n", result);
    return 0;
}
The badregtest.c program specifies that the result variable should be loaded into the EDX register and the data1 variable into the ECX register. The changed registers list incorrectly specifies that the ECX and EDX registers change within the inline code. Note that the registers are listed in the changed registers list using the full register names, not just a single letter as with the input and output register definitions. Using the percent sign with the register name is optional.
When you try to compile this program, an error will be produced:
$ gcc -o badregtest badregtest.c
badregtest.c: In function 'main':
badregtest.c:8: error: can't find a register in class 'DREG' while reloading 'asm'
$
The compiler already knew that the EDX register was in use as the output register, so it could not also honor the request to treat it as a changed register.
The proper use of the changed register list is to notify the compiler if your inline assembly code uses any additional registers that were not initially declared as input or output values. The compiler must know about these registers so it knows to avoid using them, as demonstrated in the changedtest.c program:
/* changedtest.c - An example of setting registers in the changed registers list */
#include <stdio.h>

int main()
{
    int data1 = 10;
    int result = 20;

    asm ("movl %1, %%eax\n\t"
         "addl %%eax, %0"
         : "=r"(result)
         : "r"(data1), "0"(result)
         : "%eax");
    printf("The result is %d\n", result);
    return 0;
}
In the changedtest.c program, the inline assembly code uses the EAX register as an intermediate location to store a data value. Because the register was not declared as an input or output value, it must be included in the changed registers list.
Now that the compiler knows that the EAX
register is not available, it will work around that. The input and output values were declared using the r
constraint, which enables the compiler to select the registers to use. Looking at the generated assembly language code, you can see which registers were selected:
movl $10, -4(%ebp)
movl $20, -8(%ebp)
movl -4(%ebp), %ecx
movl -8(%ebp), %edx
#APP
movl %ecx, %eax
addl %eax, %edx
#NO_APP
movl %edx, %eax
The code for moving the C variables into registers uses the ECX and EDX registers (remember that in the regtest2.c program it used the EAX and EDX registers). The compiler purposely avoided using the EAX register, as it was declared as being used in the inline assembly code.
There is one oddity with the changed registers list: if the inline assembly code uses any memory locations that are not defined in the input or output values, they must be tagged as corrupted as well. The word "memory" is used in the changed registers list to tell the compiler that memory locations were altered within the inline assembly code.
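As a rough illustration of the "memory" flag, the sketch below writes to a C variable through a pointer, so the location being changed is not one of the declared output values. The store_int() helper is hypothetical (it is not one of this chapter's programs). It assumes GCC on an x86 target, uses __asm__ (GCC's alternate spelling of asm), and falls back to plain C on other targets.

```c
/* Hypothetical helper: stores value at the address held in ptr. */
void store_int(int *ptr, int value)
{
#if defined(__i386__) || defined(__x86_64__)
    /* The destination is reached only through a pointer held in a
       register, so it is not a declared output. The "memory" entry
       tells the compiler that memory contents may have changed. */
    __asm__ volatile ("movl %1, (%0)"
                      :                          /* no outputs */
                      : "r"(ptr), "r"(value)     /* inputs */
                      : "memory");               /* changed list */
#else
    *ptr = value;   /* plain C fallback for non-x86 targets */
#endif
}
```

Calling store_int(&x, 42) should leave 42 in x. Without the "memory" entry, the compiler would be free to assume that x still holds its earlier value.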
Although using registers in the inline assembly language code is faster, you can also directly use the memory locations of the C variables. The m constraint is used to reference memory locations in the input and output values. Remember that you still have to use registers for the assembly instructions that require them, so you may have to define intermediate registers to hold the data. The memtest.c program demonstrates this:
/* memtest.c - An example of using memory locations as values */
#include <stdio.h>

int main()
{
   int dividend = 20;
   int divisor = 5;
   int result;

   asm("divb %2\n\t"
       "movl %%eax, %0"
       : "=m"(result)
       : "a"(dividend), "m"(divisor));
   printf("The result is %d\n", result);
   return 0;
}
The asm section loads the dividend value into the EAX register as required by the DIV instruction. The divisor is kept in a memory location, as is the output value. The generated assembly code looks like the following:
movl $20, -4(%ebp)
movl $5, -8(%ebp)
movl -4(%ebp), %eax
#APP
divb -8(%ebp)
movl %eax, -12(%ebp)
#NO_APP
The values are loaded into memory locations (in the stack), with the dividend value also moved to the EAX register. When the result is determined, it is moved into its memory location on the stack, instead of to a register.
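Because this example uses the DIVB instruction, it will only work with dividend values less than 65,536 and divisor values less than 256. DIVB also places the quotient in AL and the remainder in AH, so the value copied from EAX matches the quotient only when the division is exact and the quotient fits in 8 bits. If you want to use larger values, you must modify the inline assembly language code to use the DIVW or DIVL instructions.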
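A minimal sketch of the DIVL variant mentioned above, assuming GCC on an x86 target (a plain C fallback covers other targets). The divide_u32() name is made up for illustration, and __asm__ is GCC's alternate spelling of asm. Unlike memtest.c, the sketch declares both EAX and EDX as outputs, because DIVL overwrites them with the quotient and the remainder.

```c
/* Hypothetical helper: 32-bit unsigned division using DIVL.
   Returns the quotient and stores the remainder in *remainder. */
unsigned int divide_u32(unsigned int dividend, unsigned int divisor,
                        unsigned int *remainder)
{
    unsigned int quotient, rem;
#if defined(__i386__) || defined(__x86_64__)
    /* DIVL divides EDX:EAX by the operand. The "0" and "1"
       placeholders load the dividend into EAX and zero into EDX. */
    __asm__ ("divl %4"
             : "=a"(quotient), "=d"(rem)
             : "0"(dividend), "1"(0u), "rm"(divisor)
             : "cc");
#else
    quotient = dividend / divisor;   /* plain C fallback */
    rem = dividend % divisor;
#endif
    *remainder = rem;
    return quotient;
}
```

The "rm" constraint lets the compiler keep the divisor in either a register or a memory location, so the memory-operand approach from memtest.c still applies.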
Because of the way the FPU uses registers as a stack, things are a little different when using floating-point values in inline assembly language coding. You must be more careful about how the FPU registers are handled by the inline code.
You may have noticed that three different constraints dealt with the FPU register stack:
f references any available floating-point register
t references the top floating-point register
u references the second floating-point register
When retrieving output values from the FPU, you cannot use the f constraint; you must declare the t or u constraints to specify the FPU register in which the output value will be, as shown in the following example:
asm( "fsincos" : "=t"(cosine), "=u"(sine) : "0"(radian));
The FSINCOS instruction places the output in the first two registers in the FPU stack. You must be sure to specify the correct register for the correct output value. Because the input value must also be in the ST(0) register, it uses the same register as the first output value, and is declared using the placeholder. The sincostest.c program demonstrates using this inline assembly code:
/* sincostest.c - An example of using two FPU registers */
#include <stdio.h>

int main()
{
   float angle = 90;
   float radian, cosine, sine;

   radian = angle / 180 * 3.14159;
   asm("fsincos"
       :"=t"(cosine), "=u"(sine)
       :"0"(radian));
   printf("The cosine is %f, and the sine is %f\n", cosine, sine);
   return 0;
}
The assembly language code generated by the compiler for this function looks like this:
flds -8(%ebp)
#APP
fsincos
#NO_APP
fstps -24(%ebp)
movl -24(%ebp), %eax
movl %eax, -12(%ebp)
fstps -24(%ebp)
movl -24(%ebp), %eax
movl %eax, -16(%ebp)
The radian variable is loaded into the FPU stack from the program stack using the FLDS instruction. After the FSINCOS instruction, the two output values are popped from the FPU stack using the FSTPS instruction and moved to their appropriate C variable location.
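The same pattern, a t output paired with a "0" placeholder input, also fits FPU instructions that replace ST(0) with a single result. The following is a small sketch using the FABS instruction, assuming GCC on an x86 target. The fpu_abs() name is hypothetical, __asm__ is GCC's alternate spelling of asm, and other targets get a plain C fallback.

```c
/* Hypothetical helper: absolute value computed in ST(0). */
double fpu_abs(double x)
{
#if defined(__i386__) || defined(__x86_64__)
    double result;
    /* The input is loaded into ST(0) via the "0" placeholder, and
       the output is read back from the same register via "=t". */
    __asm__ ("fabs" : "=t"(result) : "0"(x));
    return result;
#else
    return x < 0 ? -x : x;   /* plain C fallback */
#endif
}
```

Because the one input is consumed and the one output replaces it, the FPU stack depth is unchanged and no entry is needed in the changed registers list.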
In the preceding example, because the compiler knows the output values are in the first two FPU registers, it pops the values, restoring the FPU stack to its previous condition. If you perform any operations within the FPU stack that are not cleared, you must specify the appropriate FPU registers in the changed registers list. The areatest.c program demonstrates this:
/* areatest.c - An example of using floating point regs */
#include <stdio.h>

int main()
{
   int radius = 10;
   float area;

   asm("fild %1\n\t"
       "fimul %1\n\t"
       "fldpi\n\t"
       "fmul %%st(1), %%st(0)"
       : "=t"(area)
       :"m"(radius)
       : "%st(1)");
   printf("The result is %f\n", area);
   return 0;
}
The areatest.c program places the radius value into a memory location, and then loads that value into the top of the FPU stack with the FILD instruction. That value is multiplied by itself, with the result still in the ST(0) register. The pi value is then placed on top of the FPU stack, shifting the squared radius value down to the ST(1) position. The FMUL instruction is then used to multiply the two values within the FPU.
The output value is taken from the top of the FPU stack and assigned to the area C variable. Because the ST(1) register was used, but not assigned as an output value, it must be listed in the changed registers list so the compiler knows to clean it up afterward.
The inline assembly language code can also contain labels to define locations in the inline assembly code. Normal assembly conditional and unconditional branches can be implemented to jump to the defined labels.
The jmptest.c program demonstrates this:
/* jmptest.c - An example of using jumps in inline assembly */
#include <stdio.h>

int main()
{
   int a = 10;
   int b = 20;
   int result;

   asm("cmp %1, %2\n\t"
       "jge greater\n\t"
       "movl %1, %0\n\t"
       "jmp end\n"
       "greater:\n\t"
       "movl %2, %0\n"
       "end:"
       :"=r"(result)
       :"r"(a), "r"(b));
   printf("The larger value is %d\n", result);
   return 0;
}
The inline assembly code defines two labels within the instructions. The JGE instruction is used along with the CMP instruction to compare the two input values loaded into registers. The JMP instruction is used to unconditionally jump to the end of the inline assembly code.
The assembly code generated by the compiler contains the labels as well as the instructions:
movl -4(%ebp), %edx
movl -8(%ebp), %eax
#APP
cmp %edx, %eax
jge greater
movl %edx, %eax
jmp end
greater:
movl %eax, %eax
end:
#NO_APP
movl %eax, -12(%ebp)
There are two restrictions when using labels in inline assembly code. The first one is that you can only jump to a label within the same asm section. You cannot jump from one asm section to a label in another asm section.
The second restriction is somewhat more complicated. The jmptest.c program uses the labels greater and end. However, there is a potential problem with this. As you saw from the assembled code listing, the inline assembly labels are encoded into the final assembled code. This means that if you have another asm section in your C code, you cannot use the same labels again, or an error message will result due to duplicate use of labels. In addition, if you try to use labels that match names already defined in the C program, such as function names or global variables, you will also generate errors.
There are two ways to solve this. The easiest solution is to just use different labels within different asm sections. If you are hand-coding each of the asm sections, this is a viable alternative.
If you are reusing the same asm section (for example, if you declare macros, as explained in the "Using Inline Assembly Code" section later), you cannot alter the labels within the inline assembly code. The solution is to use local labels.
Both conditional and unconditional branches allow you to specify a number as a label, along with a directional flag to indicate which way the processor should look for the numerical label. The first occurrence of the label found will be taken. To demonstrate this, the jmptest2.c program can be used:
/* jmptest2.c - An example of using generic jumps in inline assembly */
#include <stdio.h>

int main()
{
   int a = 10;
   int b = 20;
   int result;

   asm("cmp %1, %2\n\t"
       "jge 0f\n\t"
       "movl %1, %0\n\t"
       "jmp 1f\n"
       "0:\n\t"
       "movl %2, %0\n"
       "1:"
       :"=r"(result)
       :"r"(a), "r"(b));
   printf("The larger value is %d\n", result);
   return 0;
}
The labels have been replaced with 0: and 1:. The JGE and JMP instructions use the f modifier to indicate the label is forward from the jump instruction. To move backward, you must use the b modifier.
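To show the b modifier in use, here is a hedged sketch of a small loop that jumps backward to a numeric label. The sum_to() function is hypothetical and is not one of this chapter's programs. It assumes GCC on an x86 target, uses __asm__ (GCC's alternate spelling of asm), relies on the + modifier to mark operands that are both read and written, and falls back to plain C on other targets.

```c
/* Hypothetical helper: adds n + (n-1) + ... + 1 using a backward jump. */
int sum_to(int n)
{
    int total = 0;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("testl %1, %1\n\t"   /* is n zero or negative?          */
             "jle 2f\n"           /* yes: skip forward past the loop */
             "1:\n\t"
             "addl %1, %0\n\t"    /* total += n                      */
             "decl %1\n\t"        /* n--                             */
             "jnz 1b\n"           /* loop backward while n != 0      */
             "2:"
             : "+r"(total), "+r"(n)
             :
             : "cc");
#else
    while (n > 0)                 /* plain C fallback */
        total += n--;
#endif
    return total;
}
```

The jnz 1b instruction looks backward for the nearest 1: label, while jle 2f looks forward for the nearest 2: label, so the same numbers can be reused safely in other asm sections.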
While you can place inline assembly code anywhere within the C program, most programmers utilize inline assembly code as macro functions. The C macro functions enable you to declare a single macro that contains a function. When the macro is referenced in the main program, the macro is expanded to the full function defined by the macro. This section shows how to create inline assembly macros in your C programs.
In C and C++ programs, macros are used to define anything from a constant value to complex functions. A macro is defined using the #define statement. The format of the #define statement is as follows:
#define NAME expression
By convention, the macro name NAME is defined using uppercase letters (this helps ensure it will not conflict with C library functions). The expression value can be a numeric or string value that is constant.
If you have done much coding in C or C++, you are most likely familiar with defining constant macros. The constant macro assigns a specific value to a macro name. The macro name can then be used throughout the program to represent the value.
A macro can be defined as a numeric value, such as the following:
#define MAX_VALUE 1024
Whenever the macro MAX_VALUE is used in the program code, the compiler substitutes the value associated with it:
data = MAX_VALUE;
if (size > MAX_VALUE)
The macro value is not treated like a variable in that it cannot be altered. It remains a constant value throughout the program. It can, however, be used in numeric equations:
data = MAX_VALUE / 4;
Another aspect of the C macro is the macro functions. These are described in the next section.
While constant macros come in handy for defining values, macro functions can be utilized to save typing time throughout the program. An entire function can be assigned to a macro at the beginning of the program and used everywhere in it.
The macro function defines input and output values, and then defines the function that processes the input values and produces the output values. The format of the macro function is as follows:
#define NAME(input values, output value) (function)
The input values are a comma-separated list of variables used for input to the function. The function defines how the input values are processed to produce the output value.
Macros are defined as a single line of text. With macro functions, that can create a very long line of text. To help make the macro more readable, a line continuation character (the backslash) can be used to split the function. Here's an example of a simple C macro function:
#define SUM(a, b, result) ((result) = (a) + (b))
The macro SUM is defined as requiring two input values and producing a single output value, which is the result of the addition of the two input values. Whenever the SUM() macro function is used in a program, the compiler expands it to the full macro function definition.
It is important to note that this is the opposite of standard C functions, which are used to save coding space. The macro function is expanded in full at every place it is used, before the code is compiled, which produces larger code.
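The line continuation character mentioned earlier can be sketched with a slightly longer macro function. The CLAMP macro and the clamp_int() wrapper are hypothetical names used only for illustration; the backslash at the end of each line tells the preprocessor that the definition continues on the next line.

```c
/* Hypothetical macro: limits value to the range low..high.
   Each backslash continues the definition on the next line;
   the final line of the definition has none. */
#define CLAMP(value, low, high, result)               \
    ((result) = (value) < (low)  ? (low)  :           \
                (value) > (high) ? (high) : (value))

/* Small wrapper so the macro can be exercised like a function. */
int clamp_int(int value, int low, int high)
{
    int result;
    CLAMP(value, low, high, result);
    return result;
}
```

Each use of CLAMP is replaced by the full conditional expression, so using it in ten places produces ten copies of the comparison code.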
An example of a C macro function is shown in the mactest1.c program:
/* mactest1.c - An example of a C macro function */
#include <stdio.h>

#define SUM(a, b, result) ((result) = (a) + (b))

int main()
{
   int data1 = 5, data2 = 10;
   int result;
   float fdata1 = 5.0, fdata2 = 10.0;
   float fresult;

   SUM(data1, data2, result);
   printf("The result is %d\n", result);
   SUM(1, 1, result);
   printf("The result is %d\n", result);
   SUM(fdata1, fdata2, fresult);
   printf("The floating result is %f\n", fresult);
   SUM(fdata1, fdata2, result);
   printf("The mixed result is %d\n", result);
   return 0;
}
There are a few things of interest to note in the mactest1.c example program. First, note that the variables defined in the macro function are completely independent of the result variables defined in the program. You can use any variables in the SUM() macro function.
Second, note that the same SUM() macro function worked for integer variables, numeric constants, floating-point variables, and even mixed input and output values! You can see how versatile macro functions can be. Now it's time to apply that to the inline assembly functions.
If you want to see the code with the expanded macro lines, you can use the -E command-line option when compiling.
Just as you can with the C macro functions, you can declare macro functions that include inline assembly code. The inline assembly code must use the extended asm format, so the proper input and output values can be defined. Because the macro function can be used multiple times in a program, you should also use numeric labels for any branches required in the assembly code.
An example of defining an inline assembly macro function is as follows:
#define GREATER(a, b, result) ({ \
    asm("cmp %1, %2\n\t" \
        "jge 0f\n\t" \
        "movl %1, %0\n\t" \
        "jmp 1f\n" \
        "0:\n\t" \
        "movl %2, %0\n" \
        "1:" \
        :"=r"(result) \
        :"r"(a), "r"(b)); })
The a and b input variables are assigned to registers so they can be used in the CMP instruction. The JGE and JMP instructions use numeric labels so the macro function can be used multiple times in the program without duplicating assembly labels. The result variable is copied from the register that contains the greater of the two input values. Note that the asm statement must be in a set of curly braces to indicate the start and end of the statement. Without them, the compiler will generate an error each time the macro is used in the C code.
The mactest2.c program demonstrates using this macro function in a C program:
/* mactest2.c - An example of using inline assembly macros in a program */
#include <stdio.h>

#define GREATER(a, b, result) ({ \
    asm("cmp %1, %2\n\t" \
        "jge 0f\n\t" \
        "movl %1, %0\n\t" \
        "jmp 1f\n" \
        "0:\n\t" \
        "movl %2, %0\n" \
        "1:" \
        :"=r"(result) \
        :"r"(a), "r"(b)); })

int main()
{
   int data1 = 10;
   int data2 = 20;
   int result;

   GREATER(data1, data2, result);
   printf("a = %d, b = %d result: %d\n", data1, data2, result);
   data1 = 30;
   GREATER(data1, data2, result);
   printf("a = %d, b = %d result: %d\n", data1, data2, result);
   return 0;
}
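Because the ({ ... }) form is a GNU C statement expression, a macro written this way can also yield a value instead of taking a result parameter. The sketch below is a variation under stated assumptions, not the macro used in mactest2.c: GREATER_OF and larger() are hypothetical names, it assumes GCC on an x86 target, it uses __asm__ (GCC's alternate spelling of asm), and it adds the & (early clobber) modifier so the output register is never shared with an input register. Other targets get a plain C fallback.

```c
#if defined(__i386__) || defined(__x86_64__)
/* Hypothetical macro: evaluates to the larger of two int values. */
#define GREATER_OF(a, b) ({                         \
    int greater_out_;                               \
    __asm__ ("cmpl %1, %2\n\t"                      \
             "jge 0f\n\t"                           \
             "movl %1, %0\n\t"                      \
             "jmp 1f\n"                             \
             "0:\n\t"                               \
             "movl %2, %0\n"                        \
             "1:"                                   \
             : "=&r"(greater_out_)                  \
             : "r"(a), "r"(b)                       \
             : "cc");                               \
    greater_out_; })
#else
/* Plain C fallback for non-x86 targets. */
#define GREATER_OF(a, b) ((a) > (b) ? (a) : (b))
#endif

/* Small wrapper so the macro can be exercised like a function. */
int larger(int a, int b)
{
    return GREATER_OF(a, b);
}
```

The last expression inside the braces (greater_out_) becomes the value of the whole macro, so it can be used directly in an assignment or as a function argument.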
This chapter discussed how to use assembly language code inside of C and C++ programs. The technique of inline assembly code enables you to place assembly language functions inside C or C++ programs, pass program variables to the assembly language code, and place output from the assembly language code into C program variables.
The C asm statement contains assembly language code that is transferred to the compiled assembly language program from the C program code. The asm statement has two formats. The basic asm format enables you to code assembly language instructions directly, using C global variables as input and output values.
The extended asm format provides advanced techniques for passing input values to the assembly code, and moving output values to the C program code. Any type of C data, such as local variables, can be passed to either registers or memory locations using the extended asm format. The input values can be assigned to specific registers or you can allow the compiler to assign the registers as necessary. Similarly, output values can be assigned to either registers or memory locations. Numerous features can be used to control how the variables are used within the inline assembly language code.
Inline assembly language code in the asm section is often defined using C macro functions. The C macro function uses a format that defines a function name, the input values used, and the output values used, along with the asm section function. Each time the macro function is called in the main program, the compiler expands the inline assembly language code.
The next chapter digs deeper into using assembly language in mixed programming environments. Besides inline assembly language code, you can create complete assembly language libraries that can be utilized by C and C++ programs. This technique is discussed and demonstrated in the next chapter.