All of the Diamond cores except the 108Mini implement a set of zero-overhead loop instructions. Loops are a fundamental programming structure and are usually implemented with a processor’s decrement-test-and-branch instructions. Like all instructions, these must be fetched and executed, which takes time and generates memory cycles and bus traffic that can increase power dissipation. In addition, branch instructions inevitably create pipeline bubbles.
Together, these costs make up the loop overhead: cycles during which the processor performs no useful work. The zero-overhead loop instructions use three additional 32-bit registers to set up and track loop conditions, so no loop-housekeeping instructions execute inside the loop. These registers become part of the processor’s special-register group. Table 6.5 lists the three zero-overhead loop registers and Table 6.6 lists the three new instructions that implement the zero-overhead loops.
Table 6.5 Zero-overhead loop registers

Register mnemonic | Register name | Special register number |
---|---|---|
LBEG | Loop Begin | 0 |
LEND | Loop End | 1 |
LCOUNT | Loop Count | 2 |

Table 6.6 Zero-overhead loop instructions

Instruction mnemonic | Instruction definition |
---|---|
LOOP | Set up the zero-overhead loop by initializing the LBEG, LEND, and LCOUNT registers. |
LOOPGTZ | Set up the zero-overhead loop by initializing the LBEG, LEND, and LCOUNT registers. Skip the loop if LCOUNT is not positive. |
LOOPNEZ | Set up the zero-overhead loop by initializing the LBEG, LEND, and LCOUNT registers. Skip the loop if LCOUNT is zero. |
The XCC compiler automatically uses the zero-overhead loop registers and instructions, if present, to accelerate program execution. Once the loop is set up, no instructions within the loop are needed for loop housekeeping. The processor hardware manages the loop automatically, using logic circuits that perform the loop computations and comparisons in the background. This mechanism speeds up all loops, and the gain is largest for small loops, where housekeeping instructions would otherwise account for a large share of each iteration.
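
To see why small loops benefit most, a rough cycle estimate helps. The sketch below is a back-of-the-envelope model, not measured data: it assumes one cycle per instruction, and the parameters `overhead_instrs` (housekeeping instructions per iteration), `branch_penalty` (bubble cycles per taken branch), and `setup_instrs` are hypothetical values chosen only for illustration.

```python
def conventional_cycles(body, iterations, overhead_instrs=2, branch_penalty=2):
    """Estimate cycles for a loop closed by decrement-test-and-branch code."""
    taken_branches = max(iterations - 1, 0)  # the final branch falls through
    return iterations * (body + overhead_instrs) + taken_branches * branch_penalty


def zero_overhead_cycles(body, iterations, setup_instrs=1):
    """Estimate cycles when only the set-up instruction is loop overhead."""
    return setup_instrs + iterations * body


def speedup(body, iterations):
    """Ratio of the conventional estimate to the zero-overhead estimate."""
    return conventional_cycles(body, iterations) / zero_overhead_cycles(body, iterations)


if __name__ == "__main__":
    for body in (2, 10, 50):
        print(body, round(speedup(body, 100), 2))
```

Under these assumed numbers, a 2-instruction body repeated 100 times runs nearly three times faster, while a 50-instruction body gains only about 8 percent, because the fixed per-iteration overhead is spread over more useful work.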
It’s possible to terminate a zero-overhead loop prematurely by branching directly to the address stored in the LEND register. It’s also possible to cut the current iteration short, so that the loop either moves on to its next iteration or ends if none remain, by branching to a nop instruction placed just before the address contained in the LEND register.
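
The difference between the two branch targets can be sketched with a toy interpreter. This Python model rests on assumptions rather than on the processor's specification: addresses are small integers, the hardware loop-back is modeled as firing only when execution falls through sequentially to the LEND address, and a taken branch straight to LEND therefore leaves the loop. The function `sum_until` and its five-instruction body are hypothetical.

```python
LBEG, NOP, LEND = 1, 5, 6  # toy addresses; the nop sits just before LEND


def sum_until(values, sentinel=0):
    """Sum values, skipping negatives and stopping early at the sentinel."""
    if not values:
        return 0  # a LOOPNEZ-style set-up would skip the loop
    lcount = len(values) - 1  # assumed: loop-backs still to be taken
    index = total = value = 0
    pc = LBEG
    while pc != LEND:
        taken = False
        if pc == 1:  # load the next value
            value = values[index]
            index += 1
        elif pc == 2 and value == sentinel:
            pc, taken = LEND, True  # branch to LEND: terminate the loop
        elif pc == 3 and value < 0:
            pc, taken = NOP, True  # branch to the nop: skip the rest of this pass
        elif pc == 4:
            total += value
        # pc == 5 is the nop and does nothing
        if not taken:
            pc += 1
            if pc == LEND and lcount != 0:
                lcount -= 1  # sequential arrival at LEND loops back
                pc = LBEG
    return total


if __name__ == "__main__":
    print(sum_until([3, -1, 4, 0, 9]))  # skips -1, stops at the sentinel 0
```

In the sample call, the negative value takes the branch to the nop and the loop continues with the next element, whereas the sentinel takes the branch to LEND and the remaining element is never visited.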