Information includes our private data that we desire to protect from unwilling leakage depending on the application. Cryptology is a field of research that offers appropriate solutions for the data protection by exploring how to construct a secure communication for fair information exchange. Modern cryptology often deals with digitalized data rather than analog data that cannot be expressed simply with a series of 0s and 1s. In our daily life, information is exchanged by digital devices such as radio frequency identification (RFID) tags, smart cards, and smart phones, where a computational resource is limited. Therefore, it is one of the most important challenges in cryptology to realize an efficient implementation of cryptosystems.
There are various ways to realize encryption that is a kind of computational process for information to be protected. In a symmetric-key cipher, information is encrypted with a secret key, and it is expected that the owner of the secret key can decrypt the encrypted information correctly. For instance, let us see the situation, where Alice would like to send a message to Bob in a secure way. If the secret key, K, is shared only with Alice and Bob, only Bob can decrypt the message from the encrypted message. The original and the encrypted messages are called plaintext and ciphertext, respectively. Figure 1.1 illustrates the encryption and decryption processes.
The encryption by Alice can be written as
The ciphertext is decrypted by Bob as
Only Bob can decrypt and read the message, and Eve, who does not own the secret key, cannot decrypt it.
Alice and Bob need to compute the cryptographic operations based on the functions, and . The simpler the functions are, the more efficiently they can compute. For instance, Vernam cipher, invented in 1917, uses just XOR operations as
to convert plaintext and ciphertext. The XOR operation is explained in Section 1.2.1.
However, in order to guarantee the security, that is, in order that Eve cannot obtain any information of message from C, the secret key needs to be refreshed with a random number for each encryption/decryption. In other words, in order to communicate securely with the Vernam cipher, a very long key, which is the same size as M, is required. This is significantly inefficient. In general, encryption anddecryption processes are based on the trade-offs between cost, performance, and security.
The fundamental idea to achieve an efficient encryption scheme is designing a fixed-input size encryption scheme, and iteratively applying this scheme to encrypt arbitrary length messages. Such a fixed-input size encryption scheme is called block cipher, and the group of bits with the fixed-input size is called block. If the unit of operation is small enough, for example, 1 bit or 1 byte, such a symmetric-key cipher is called stream cipher. As block ciphers are expected to compute encryption and decryption efficiently, they have an iterated structure, and repeat the same function several times. Such a function is called round function. The iterated structure contributes to achieving a small program code in software and implementing a compact circuit design in hardware.
Modern block ciphers are mainly categorized into two kinds: Feistel structure and substitution-permutation network (SPN) structure. Feistel structure was employed in data encryption standard (DES) block cipher proposed in 1977. Including FEAL and Camellia, the Feistel structure has been employed by many block ciphers.
On the contrary, Advanced Encryption Standard (AES) employed SPN structure. AES is the main target of this book as it is one of the most widely used block ciphers, and it contains fundamental ideas of SPN structure. The basic mathematics to understand SPN structure and AES specification will be explained later in this chapter.
Boolean functions are used in most of the block ciphers including AES. A Boolean function, f, is described as
where is called Boolean domain and is the set of all n-tuples , where are all in Boolean domain.1
The most simple Boolean function is inversion or the INV operation that is a bit complement. It operates as
where is used for representing the INV operation. Alternatively, the logic symbol, , is also used for INV. In this book, we allow both usage, that is, .
For the case of , representative Boolean functions are OR, AND, and XOR. OR is defined as
Likewise, AND and XOR are defined, respectively, as
“,” “,” and “” are used for representing OR, AND, and XOR operations.
The truth table for OR, AND, and XOR is described in Table 1.1.
Table 1.1 Truth table for basic operators
x | y | |||
0 | 0 | 0 | 0 | 0 |
0 | 1 | 1 | 0 | 1 |
1 | 0 | 1 | 0 | 1 |
1 | 1 | 1 | 1 | 0 |
Finite filed or Galois field deals with a finite number of elements. Over a Galois filed, addition, subtraction, multiplication, and division are defined. Galois field with the smallest order is called a binary field or . For instance, addition, multiplication, additive inverse, and multiplicative inverse over are defined in Table 1.2.
Table 1.2 Operations over
x | y | ||||
0 | 0 | 0 | 0 | 0 | – |
0 | 1 | 1 | 0 | 0 | – |
1 | 0 | 1 | 0 | 1 | 1 |
1 | 1 | 0 | 1 | 1 | 1 |
As can be found from Tables 1.1 and 1.2, addition and multiplication over are realized, respectively, with XOR and AND.
Table 1.3 Operations over
x | y | ||||
0 | 0 | ||||
0 | 1 | ||||
0 | 2 | ||||
0 | 3 | ||||
0 | 4 | ||||
1 | 0 | ||||
1 | 1 | ||||
1 | 2 | ||||
1 | 3 | ||||
1 | 4 | ||||
2 | 0 | ||||
2 | 1 | ||||
2 | 2 | ||||
2 | 3 | ||||
2 | 4 | ||||
3 | 0 | ||||
3 | 1 | ||||
3 | 2 | ||||
3 | 3 | ||||
3 | 4 | ||||
4 | 0 | ||||
4 | 1 | ||||
4 | 2 | ||||
4 | 3 | ||||
4 | 4 |
Binary field, , can be extended to a large field size called extended binary field, , where n is a positive integer. Especially, in the case of AES, operations in are of special interest. The number of elements of is . There are several different representations for the elements, which affect the cost and speed performance of software and hardware implementations.
As the number of elements of is a power of 2, each bit of the binary representation can be used for each coefficient of a polynomial whose degree is . Any element in can be expressed with the so-called polynomial basis as
where . For instance, 16 elements in can be expressed with the binary representation, . By assigning each bit to the coefficient of a polynomial of x, we have . Addition of two field elements, for example, , can be calculated as
as over .
Multiplication of the two field elements, for example, , needs modular reduction with an irreducible polynomial, for example, , which specifies the field.2 Therefore, the multiplication result becomes as
Alternatively, elements in are described using normal basis as
where and are roots of an irreducible polynomial, , that is,
Furthermore,
This can be confirmed by Fermat little theorem.
For the case of , suppose that , that is, . Addition in the normal basis representation of can be calculated simply by XORing each coefficient of two elements in the form of Equation ((1.12) ). That is,
where the normal basis representations of and can be found in Table 1.4.
Table 1.4 Representations of elements for irreducible polynomial in
Binary | Bit concatenation | Hex. | Polynomial basis | Power of | Normal basis |
0 |
(0, 0, 0, 0) | ||||
1 |
(1, 1, 1, 1) | ||||
2 |
x | (0, 0, 0, 1) | |||
4 |
(0, 0, 1, 0) | ||||
8 |
(1, 0, 1, 1) | ||||
9 |
(0, 1, 0, 0) | ||||
b |
(0, 1, 0, 1) | ||||
f |
(0, 1, 1, 1) | ||||
7 |
(1, 1, 0, 0) | ||||
e |
(1, 0, 0, 0) | ||||
5 |
(1, 1, 0, 1) | ||||
a |
(1, 0, 1, 0) | ||||
d |
(0, 1, 1, 0) | ||||
3 |
(1, 1, 1, 0) | ||||
6 |
(0, 0, 1, 1) | ||||
c |
(1, 0, 0, 1) |
This is correct as . By using the fact of , multiplication in , for example, is calculated as
The most advantageous point to use the normal basis representation lies in the fact that squaring is easy to compute in . As can be found in Table 1.4, squaring for is . More precisely, in squaring, the elements in the normal basis representation are derived as
This merit is often used in both software and hardware implementations. However, in general, implementing modular multiplication in the normal basis requires more computation than that in the polynomial basis. Hereafter, we mainly use polynomial basis representation.
Addition and multiplication by a constant are linear functions in . Suppose that and , where . Addition of and is
From the fact that , it is confirmed that addition in is a linear function.
For multiplication by a constant B, there exist such that
Therefore, we know that such multiplication in is also a linear function. It can be easily understood considering the fact that multiplication by a constant can be computed with multiple additions of in .
On the contrary, (normal) modular multiplication and multiplicative inverse in are nonlinear functions. The AES block cipher uses a nonlinear function in a part of the design that is based on modular multiplicative inversion in . The multiplicative inverse computation can be done with Fermat's (little) theorem as
for . In AES, multiplicative inverse of 0
is mapped to 0
.
One of the most optimal ways to compute the inversion is to find addition chain. On the basis of the Itoh–Tsujii algorithm, the computation can be performed with four multiplications and seven modular squarings as
Itoh–Tsujii algorithm utilizes the relationship of
As discussed in Section 1.3, logical operations are classified into linear operations and nonlinear operations. Composition of linear operations is also linear. Hence, if all the cipher's operations are linear, the resulting cipher is also linear, which is insecure. In order to break the linearity of the cipher, nonlinear operations need to be introduced. However, in general, the cost of implementing nonlinear operations is more expensive than the one for linear operations.
The strategy of the block cipher design is alternately applying nonlinear and linear operations several times. To avoid the heavy cost, nonlinear operation is designed to be weak but its cost is small. In many cases, a nonlinear operation is designed to be operated on a smaller size than the block size, and the operation is applied in parallel to all the data. Then, in order to compensate the weak nonlinear computations, a linear operation mixes the entire block. The strategy is depicted in Figure 1.2. In the following, each of the nonlinear layer and linear layer is further detailed.
In order to reduce the implementation cost, a nonlinear operation is designed to work on a fraction of the data. Typical choices of the size are 64 bits, 32 bits, 8 bits (called byte), 4 bits (called nibble), and 1 bit. The size of the nonlinear operation is determined depending on the following two aspects.
When the cipher is designed for being used in high-end CPUs, the implementation cost is not a big issue but the operation should be optimized for instructions adopted in such a CPU. Currently, many CPUs operate on 64 or 32 bits, thus the size of the nonlinear operation is also adjusted to 64 or 32 bits. The high-end CPUs can perform the modular addition or subtraction efficiently. The nonlinearity is often introduced by addition or subtraction on modulo or .
When the cipher is designed for more resource-constrained hardwares such as micro-controllers, the balance of the implementation cost and the computation efficiency is important. When the CPU register size is smaller than 32 bits, the 32- or 64-bit modular addition cannot be performed efficiently. The hardware implementation also faces some problems for those operations. Typical choices of the size of the nonlinear operation are 8 or 4 bits. Because the size is small, using the substitution table is a popular approach to introduce the nonlinearity. The substitution table, or S-box, is a pre-specified mapping from the input values to the output values. An example of 4-bit to 4-bit S-box is given in Table 1.5.
Table 1.5 An example of 4-bit to 4-bit S-box,
Input | x | 0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
a |
b |
c |
d |
e |
f |
Output | c |
0 |
f |
a |
2 |
b |
9 |
5 |
8 |
3 |
d |
7 |
1 |
e |
6 |
4 |
All values are described in the hexadecimal format.
In this S-box, the input value 5
is transformed to b
according to the table. A 4-bit to 4-bit S-box is implemented only with bits of memory, which is very small. An 8-bit to 8-bit S-box is implemented only with bits of memory, which is bigger than the 4-bit to 4-bit S-box but can mix the data faster than the 4-bit to 4-bit S-box.
A Boolean function is the smallest tool to introduce the nonlinearity. By using an AND or OR operation, the nonlinearity is introduced in 1 bit. When the cipher is designed to be a very resource constraint environment such as RFID, a Boolean function is a typical choice as a source of the nonlinearity. A Boolean function can also fit the high-end CPUs. Thirty two-bit CPUs can operate bit-wise for each of the 32 bits in parallel. If this is combined with modular additions (not bit-wise), the nonlinearity can be introduced quickly.
It is also a popular approach to specify the input and output correspondence of some Boolean functions as an S-box. If the cipher is implemented with some memory, the S-box can be implemented, and the nonlinearity of several bits can be introduced with 1 table look-up. If the cipher is implemented with small hardware, the logic of the Boolean function is implemented to minimize the implementation cost.
Unfortunately, there is no obvious choice that shows the overwhelming performance in any implementation environment. When the cipher is designed in multi-platforms, that is, both the high- and low-end environment, an S-box maybe chosen as the source of nonlinearity that shows a relatively good performance in both the environments. The popular choices of the nonlinear operations are summarized in Figure 1.3.
Note that the data is mixed by alternately applying a nonlinear operation and a linear operation. The choice of the nonlinear operation also depends on the choice of the linear operation.
The purpose of the linear layer is mixing all the output data from the nonlinear layer in which the data is updated in a small part independently. The linear layer is required to be performed efficiently and implemented lightly.
One of the simplest linear operations is XOR. A part of the nonlinear layer output is XORed to another part to mix the data from different parts. The XOR operation can be performed several times between different parts to mix the data more.
The bit-rotation and bit-shift are also simple linear operations. For example, by applying the 1-bit rotation to the entire data, 1-bit from each part will be moved to the next part. The XOR, bit-shift, and bit-rotation can be implemented efficiently in various platforms, thus they are suitable for the block cipher design.
Another important example is a multiplication over a finite field or modular multiplication. Suppose that the size of the nonlinear operation is n bits and each bit of n-bit value represents each coefficient of a polynomial whose degree is . As explained in Section 1.3, multiplication over a finite field with some irreducible polynomial is a linear function. Suppose that the entire data consists of m parts of n-bit data, that is, its size is bits. The purpose of the linear function is mixing m independent outputs from the nonlinear layer. In order to mix all the m outputs, matrices are often used.
For instance, when , four n-bit values , , , are updated to four n-bit values , , , by the following matrix operation:
where each is a constant number.
Any combination of linear operations is a linear operation. A popular design approach is combining different types of light linear operations to introduce a strong mixing effect. An example of the linear layer is depicted in Figure 1.4.
A maximum distance separable matrix (in short MDS matrix) is a matrix with some special property useful for block cipher's design. Considering the usage in block cipher AES, only the case with the same input and output size is discussed here. Let x be the m-component input to the matrix, M, and y be the m-component output from the matrix, that is, . The matrix M is called MDS if no distinct input-output pairs collide in m or more components.
For the application to cryptology, the fact that at least components differ in distinct pairs of is important. In other words, the MDS matrix guarantees a certain amount of change in different input and output values. For instance, suppose that the value of x is slightly modified to , which differs only 1 bit from x, and the corresponding output value is computed. The multiplication by the MDS matrix can guarantee that all the m components of the outputs y and have different values, meaning that the 1-bit change of the input always changes all the m components of the output.
Substitution-permutation network, which is often called SPN, is a design approach to mix a fixed-length input data. SPN is a special form of the iterative application of nonlinear and linear computations.
The substitution layer (or S-layer), which applies a nonlinear operation, is supposed to be an S-box application in a small size. The permutation layer (or P-layer) applies a linear operation to mix the results of the S-layer efficiently.
The SPN structure is adopted in many block ciphers. AES, which is a main target of this book, also adopts the SPN structure.
AES is the most widely used block cipher in present time in both governmental and commercial purposes. AES is standardized internationally, and a lot of academic researches and industrial developments have been proposed about AES. This section explains the specification of AES.
The block cipher AES supports three different key sizes: 128 bits, 192 bits, and 256 bits. The corresponding AES algorithms are called AES-128, AES-192, and AES-256, respectively. AES supports a fixed block size: 128 bits. That is to say, when the key is determined, AES provides a bijective map from 128-bit plaintext to 128-bit ciphertext, that is, for a key K, AES-128, AES-192, AES-256: (Figure 1.5).
In high level, the 128-bit key K is expanded to eleven 128-bit subkeys according to the key schedule function, or KSF.
Then, a plaintext is encrypted to a ciphertext as follows:
The computation structure of AES-128 in a function level is described in Figure 1.6.
In practice, it is not necessary to compute all the 11 subkeys at the very beginning. For example, the last subkey will not be used until the very end of the encryption process. Thus, generating the last subkey and keeping it in a register is a waste of computation resource. In order to minimize the computation resource, the KSF and the round function updates are computed in parallel round by round. The AES-128 encryption algorithm in the function level can be described as Algorithm 1.1.
In AES, byte represents 8-bit values. AES is a byte-oriented cipher. All operations are defined at byte level. Let v be a byte value and be its bit-wise representation, of which the corresponding vector representation is . In AES, each bit of a byte represents coefficients of polynomial of :
A byte value can be represented in hexadecimal. For example, the byte represents the polynomial .
Addition of two bytes, and , returns
Multiplication in corresponds with multiplication of polynomials modulo, an irreducible binary polynomial of degree 8. The irreducible polynomial of AES is defined as
Because the multiplication by and is later introduced inside the round function, more details of the operation are explained here. is written as
When , the result is according to the definition of byte. When , the irreducible polynomial is subtracted from the result. Subtraction is the inverseof the addition. Because the addition is the XOR, the subtraction is also a simple application of the XOR operations. Hence, the result is
According to the definition of byte, the result is .
AES uses a substitution-box (S-box) to mix the data. The S-box is used in both of the round function and the KSF, and thus is defined here. The S-box used in AES is a pre-determined bijective mapping from an 8-bit value to an 8-bit value. The definition of the AES S-box is shown in Table 1.6. Hereafter, the S-box transformation is described as . For example, returns 2f
, and returns 03
.
Table 1.6 AES S-box
Lower four digits | |||||||||||||||||
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
a |
b |
c |
d |
e |
f |
||
0 |
6 3 |
7 c |
7 7 |
7 b |
f 2 |
6 b |
6 f |
c 5 |
3 0 |
0 1 |
6 7 |
2 b |
f e |
d 7 |
a b |
7 6 |
|
1 |
c a |
8 2 |
c 9 |
7 d |
f a |
5 9 |
4 7 |
f 0 |
a d |
d 4 |
a 2 |
a f |
9 c |
a 4 |
7 2 |
c 0 |
|
2 |
b 7 |
f d |
9 3 |
2 6 |
3 6 |
3 f |
f 7 |
c c |
3 4 |
a 5 |
e 5 |
f 1 |
7 1 |
d 8 |
3 1 |
1 5 |
|
3 |
0 4 |
c 7 |
2 3 |
c 3 |
1 8 |
9 6 |
0 5 |
9 a |
0 7 |
1 2 |
8 0 |
e 2 |
e b |
2 7 |
b 2 |
7 5 |
|
4 |
0 9 |
8 3 |
2 c |
1 a |
1 b |
6 e |
5 a |
a 0 |
5 2 |
3 b |
d 6 |
b 3 |
2 9 |
e 3 |
2 f |
8 4 |
|
5 |
5 3 |
d 1 |
0 0 |
e d |
2 0 |
f c |
b 1 |
5 b |
6 a |
c b |
b e |
3 9 |
4 a |
4 c |
5 8 |
c f |
|
6 |
d 0 |
e f |
a a |
f b |
4 3 |
4 d |
3 3 |
8 5 |
4 5 |
f 9 |
0 2 |
7 f |
5 0 |
3 c |
9 f |
a 8 |
|
Upper | 7 |
5 1 |
a 3 |
4 0 |
8 f |
9 2 |
9 d |
3 8 |
f 5 |
b c |
b 6 |
d a |
2 1 |
1 0 |
f f |
f 3 |
d 2 |
four digits | 8 |
c d |
0 c |
1 3 |
e c |
5 f |
9 7 |
4 4 |
1 7 |
c 4 |
a 7 |
7 e |
3 d |
6 4 |
5 d |
1 9 |
7 3 |
9 |
6 0 |
8 1 |
4 f |
d c |
2 2 |
2 a |
9 0 |
8 8 |
4 6 |
e e |
b 8 |
1 4 |
d e |
5 e |
0 b |
d b |
|
a |
e 0 |
3 2 |
3 a |
0 a |
4 9 |
0 6 |
2 4 |
5 c |
c 2 |
d 3 |
a c |
6 2 |
9 1 |
9 5 |
e 4 |
7 9 |
|
b |
e 7 |
c 8 |
3 7 |
6 d |
8 d |
d 5 |
4 e |
a 9 |
6 c |
5 6 |
f 4 |
e a |
6 5 |
7 a |
a e |
0 8 |
|
c |
b a |
7 8 |
2 5 |
2 e |
1 c |
a 6 |
b 4 |
c 6 |
e 8 |
d d |
7 4 |
1 f |
4 b |
b d |
8 b |
8 a |
|
d |
7 0 |
3 e |
b 5 |
6 6 |
4 8 |
0 3 |
f 6 |
0 e |
6 1 |
3 5 |
5 7 |
b 9 |
8 6 |
c 1 |
1 d |
9 e |
|
e |
e 1 |
f 8 |
9 8 |
1 1 |
6 9 |
d 9 |
8 e |
9 4 |
9 b |
1 e |
8 7 |
e 9 |
c e |
5 5 |
2 8 |
d f |
|
f |
8 c |
a 1 |
8 9 |
0 d |
b f |
e 6 |
4 2 |
6 8 |
4 1 |
9 9 |
2 d |
0 f |
b 0 |
5 4 |
b b |
1 6 |
1 All the numbers in this table are in hexadecimal.
Note that the S-box and the inverse S-box transformations are not identical. As explained later, AES decryption algorithm requires the look-up table for the inverse of , that is .
The block size of AES is 128 bits. In AES, 128-bit data is called state. The 128-bit state consists of 16 bytes, and is represented as a two-dimensional array as depicted in Figure 1.7.
The 128-bit key K is loaded into a 128-bit subkey . Then, is computed for . The input is represented as a state. The state is further divided into four columns: , , , and . The output is computed column by column. At first, a temporary 4-byte value tmp
is generated by using the value of .
tmp
.tmp
by 1 byte. Precisely, let be the 4 bytes of tmp
. Then, tmp
is updated to .tmp
.Then, by using the 4-byte value tmp
, the next subkey is generated as follows.
The key schedule procedure for AES-128 is depicted in Figure 1.8.
The round function takes as input the previous 128-bit state and subkey , and generates the next 128-bit state . The round function consists of four transformations called SubBytes, ShiftRows, MixColumns, and AddRoundKey. It updates the state by following Algorithm 1.2.
SubBytes is a byte-wise operation. It updates the state by applying the S-box defined in Table 1.6 to each of the 16 bytes of the state. It is worth noting that the operation is the only nonlinear one in the AES round function.
ShiftRows is a row-wise byte positions swap. The state consists of four rows: row 0, row 1, row 2, and row 3. Each row of the state consists of 4 bytes. The ShiftRows operation applies a left cyclic shift by i bytes to the 4 bytes of row i. The ShiftRows operation is depicted in Figure 1.9.
MixColumns is a column-wise data mixing operation. It takes as input 4 bytes in a column and computes another 4-byte value. The same computation is applied to all of the four columns. Let and be the 4-byte input and 4-byte output, respectively. The is computed by the following matrix operation:
Each element in the matrix is written in hexadecimal.
The operation was designed to satisfy the MDS property explained in Section 1.4.2.1. The impact of modifying 1 input byte always expands to all the 4 output bytes. More generally, the sum of the number of modified input bytes and the number of modified output bytes is always greater than or equal to 5.
AddRoundKey updates the state by XORing the subkey to the state.
In the last round (Round 10 for AES-128), the round function is different from the middle rounds. The operation is not computed that is, only the , and operations are performed.
To decrypt ciphertext C to P, the round function is applied in reverse order. The KSF is exactly the same. Eleven subkeys are generated from K. Different from the encryption algorithm, is firstly used, and then the decryption is processed with , and finally with .
Inside the round function, four operations are computed in reverse order. The inverse of the operation is exactly the same as the original operation because the XOR operation is involution.
For the inverse of the operation, the inversion matrix is required. Let and be the 4-byte input and 4-byte output to the inverse operation, respectively. The is computed by the following matrix operation:
Each element in the matrix is again written in hexadecimal.
The inverse of the operation is relatively simple. It applies a right cyclic shift by i bytes to the 4 bytes of row i.
The inverse of the operation requires another table to substitute each byte value. The inverse S-box, denoted by , is defined in Table 1.7.
Table 1.7 AES inverse S-box
Lower four digits | |||||||||||||||||
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
a |
b |
c |
d |
e |
f |
||
0 |
5 2 |
0 9 |
6 a |
d 5 |
3 0 |
3 6 |
a 5 |
3 8 |
b f |
4 0 |
a 3 |
9 e |
8 1 |
f 3 |
d 7 |
f b |
|
1 |
7 c |
e 3 |
3 9 |
8 2 |
9 b |
2 f |
f f |
8 7 |
3 4 |
8 e |
4 3 |
4 4 |
c 4 |
d e |
e 9 |
c b |
|
2 |
5 4 |
7 b |
9 4 |
3 2 |
a 6 |
c 2 |
2 3 |
3 d |
e e |
4 c |
9 5 |
0 b |
4 2 |
f a |
c 3 |
4 e |
|
3 |
0 8 |
2 e |
a 1 |
6 6 |
2 8 |
d 9 |
2 4 |
b 2 |
7 6 |
5 b |
a 2 |
4 9 |
6 d |
8 b |
d 1 |
2 5 |
|
4 |
7 2 |
f 8 |
f 6 |
6 4 |
8 6 |
6 8 |
9 8 |
1 6 |
d 4 |
a 4 |
5 c |
c c |
5 d |
6 5 |
b 6 |
9 2 |
|
5 |
6 c |
7 0 |
4 8 |
5 0 |
f d |
e d |
b 9 |
d a |
5 e |
1 5 |
4 6 |
5 7 |
a 7 |
8 d |
9 d |
8 4 |
|
6 |
9 0 |
d 8 |
a b |
0 0 |
8 c |
b c |
d 3 |
0 a |
f 7 |
e 4 |
5 8 |
0 5 |
b 8 |
b 3 |
4 5 |
0 6 |
|
Upper | 7 |
d 0 |
2 c |
1 e |
8 f |
c a |
3 f |
0 f |
0 2 |
c 1 |
a f |
b d |
0 3 |
0 1 |
1 3 |
8 a |
6 b |
four digits | 8 |
3 a |
9 1 |
1 1 |
4 1 |
4 f |
6 7 |
d c |
e a |
9 7 |
f 2 |
c f |
c e |
f 0 |
b 4 |
e 6 |
7 3 |
9 |
9 6 |
a c |
7 4 |
2 2 |
e 7 |
a d |
3 5 |
8 5 |
e 2 |
f 9 |
3 7 |
e 8 |
1 c |
7 5 |
d f |
6 e |
|
a |
4 7 |
f 1 |
1 a |
7 1 |
1 d |
2 9 |
c 5 |
8 9 |
6 f |
b 7 |
6 2 |
0 e |
a a |
1 8 |
b e |
1 b |
|
b |
f c |
5 6 |
3 e |
4 b |
c 6 |
d 2 |
7 9 |
2 0 |
9 a |
d b |
c 0 |
f e |
7 8 |
c d |
5 a |
f 4 |
|
c |
1 f |
d d |
a 8 |
3 3 |
8 8 |
0 7 |
c 7 |
3 1 |
b 1 |
1 2 |
1 0 |
5 9 |
2 7 |
8 0 |
e c |
5 f |
|
d |
6 0 |
5 1 |
7 f |
a 9 |
1 9 |
b 5 |
4 a |
0 d |
2 d |
e 5 |
7 a |
9 f |
9 3 |
c 9 |
9 c |
e f |
|
e |
a 0 |
e 0 |
3 b |
4 d |
a e |
2 a |
f 5 |
b 0 |
c 8 |
e b |
b b |
3 c |
8 3 |
5 3 |
9 9 |
6 1 |
|
f |
1 7 |
2 b |
0 4 |
7 e |
b a |
7 7 |
d 6 |
2 6 |
e 1 |
6 9 |
1 4 |
6 3 |
5 5 |
2 1 |
0 c |
7 d |
All the numbers in this table are in hexadecimal.
AES supports not only the 128-bit key but also the 192-bit and the 256-bit keys. For all the key sizes, round function is identical. The differences are the number of rounds computed and the KSF.
The 192-bit key K is loaded into a array of bytes, which is denoted by . Then, is computed for . The state is further divided into six columns: , , , , , and . The output is computed column by column. At first, a temporary 4-byte value tmp
is generated by using the value of .
tmp
.tmp
by 1 byte. Precisely, let be the 4 bytes of tmp
. Then, tmp
is updated to .tmp
.Then, by using the 4-byte value tmp
, the next subkey is generated as follows.
Among the 192-bit of the , the first four columns (128 bits) are set to , and the remaining two columns (64 bits) are set to the left half of . Among the 192-bit of the , the first two columns (64 bits) are set to the right half of , and the remaining four columns (128 bits) are set to . Similarly, are obtained.
Note that is the last four columns of , and then is the first four columns of . The last two columns of are never used. Thus, in order to omit the redundant computations, the KSF should be processed up to the first four columns of .
The key schedule procedure for AES-192 is depicted in Figure 1.10.
The KSF for AES-256 can be similarly defined. The size of the key state is 256 bits consisting of -byte array. Each key state produces two subkeys, and 15 subkeys are generated.
The update computation is very similar to the ones for AES-128 and AES-192. In order to mix the data quickly, another S-box layer is introduced between columns 3 and 4. The detailed procedure is omitted. The key schedule procedure for AES-256 is depicted in Figure 1.11.
Note that is the first four columns of . The last four columns of are never used. Thus, in order to omit the redundant computations, the KSF should be processed up to the first four columns of .
The computation of AES-128 with all the operations is described in Figure 1.12. The state after the first XOR of the plaintext and is denoted by . Similarly in round i, where ,
As explained before, the state is represented by a -byte array. Using two subindices often causes misunderstanding, and thus each byte position is also denoted by a single sequence . For state S, the byte , where is converted to the byte . The byte positions in the single sequence are shown in Figure 1.13.
Byte values of state S in several different byte positions are often denoted by . For example, the 4-byte value in the column 0 of state S is denoted by .
State becomes state after the operation. During this process, 4 bytes in are moved to different byte positions in . The moved positions are denoted by .
Those 4-byte positions are called diagonal.
Similarly 4 bytes in are moved to different byte positions in through the inverse of the operation. The moved positions are denoted by .
Those 4-byte positions are called inverse diagonal.