In 2006, NIST announced a competition to produce a new hash function to serve alongside SHA-2. The new function was required to be at least as secure as SHA-2 and to have the same four output possibilities. Fifty-one entries were submitted, and in 2012, Keccak was announced as the winner. It was certified as a standard by NIST in 2012 in FIPS-202. (The name is pronounced “ketchak”. It has been suggested that the name is related to “Kecak,” a type of Balinese dance. Perhaps the movement of the dancers is analogous to the movement of the bits during the algorithm.) This algorithm became the hash function SHA-3.
The SHA-3 algorithm was developed by Guido Bertoni, Joan Daemen, and Gilles Van Assche from STMicroelectronics and Michaël Peeters from NXP Semiconductors. It differs from the Merkle-Damgård construction and is based on the theory of Sponge Functions. The idea is that the first part of the algorithm absorbs the message, and then the hash value is squeezed out. Here is how it works. The state of the machine is a string of bits, which is fed to a function that takes an input of bits and outputs a string of bits, thus producing a new state of the machine. In contrast to the compression functions in the Merkle-Damgård construction, the function is a one-to-one function. Such a function could not be used in the Merkle-Damgård situation since the number of input bits (from and the previous step) is greater than the number of output bits. But the different construction in the present case allows it.
Parameters (“the rate”) and (“the capacity”) are chosen so that . The message (written in binary) is padded so that its length is a multiple of , then is broken into blocks of length :
To start, the state is initialized to all 0s. The absorption stage is the first part of Figure 11.2.
After the absorption is finished, the hash value is squeezed out: bits are output and truncated to the bits that are used as the hash value. This is the second part of Figure 11.2.
Producing a SHA-3 hash value requires only one squeeze. However, the algorithm can also be used with multiple squeezes to produce arbitrarily long pseudorandom bitstrings. When it is used this way, it is often called SHAKE (= Secure Hash Algorithm with Keccak).
The “length extension” and collision-based attacks of Section 11.3 are less likely to succeed. Suppose two messages yield the same hash value. This means that when the absorption has finished and the squeezing stage is starting, there are bits of the state that agree with these bits for the other message. But there are at least bits that are not output, and there is no reason that these bits match. If, instead of starting the squeezing, you do another round of absorption, the differing bits will cause the subsequent states and the outputted hash values to differ. In other words, there are at least possible internal states for any given -bit output .
SHA-3 has four different versions, named SHA3-224, SHA3-256, SHA3-384, and SHA3-512. For SHA3-, the denotes the security level, measured in bits. For example, SHA3-256 is expected to require around operations to find a collision or a preimage. Since , this should be impossible well into the future. The parameters are taken to be
For SHA3-256, these are
The same function is used in all versions, which means that it is easy to change from one security level to another. Note that there is a trade-off between speed and security. If the security parameter is increased, then decreases, so the message is read slower, since it is read in blocks of bits.
In the following, we concentrate on SHA3-256. The other versions are obtained by suitably varying the parameters. For more details, see [FIPS 202].
The Padding. We start with a message . The message is read in blocks of bits, so we want the message to have length that is a multiple of 1088. But first the message is padded to . This is for “domain separation.” There are other uses of the Keccak algorithm such as SHAKE (mentioned above), and for these other purposes, is padded differently, for example with 1111
. This initial padding makes it very likely that using in different situations yields different outputs. Next, “10*1
padding” is used. This means that first a 1 is appended to 011 to yield 011 . Then sufficiently many 01 s are appended to make the total length one less than a multiple of 1088. Finally, a 1 is appended. We can now divide the result into blocks of length 1088.
Why are these choices made for the padding? Why not simply append enough 0
s s to get the desired length? Suppose that . Then . Now append 1079 zeros to get the block to be hashed. If is being used in SHAKE, then is padded with 1080 zeros to yield the same block. This means that the outputs for and are equal. The padding is designed to avoid all such situations.
Absorption and Squeezing. From now on, we assume that the padding has been done and we have blocks of length :
The absorption now proceeds as in Figure 11.2 (we describe the function later).
The initial state is a string of 0s s of length 1600.
For to , let , where denotes a string of 0s of length . What this does is XOR the message block with the first bits of , and then apply the function . This yields an updated state , which is modified during each iteration of the index .
Return .
The squeezing now proceeds as in Figure 11.2:
Input and let be the empty string.
While (where is the output size)
Let , where denotes the first bits of .
.
Return .
The bitstring is the 256-bit hash value SHA-. For the hash value, we need only one squeeze to obtain . But the algorithm could also be used to produce a much longer pseudorandom bitstring, in which case several squeezes might be needed.
The function . The main component of the algorithm is the function , which we now describe. The input to is the 1600-bit state of the machine
where each is a bit. It’s easiest to think of these bits as forming a three-dimensional array with coordinates satisfying
A “column” consists of the five bits with fixed . A “row” consists of the five bits with fixed . A “lane” consists of the 64 bits with fixed .
When we write “for all ” we mean for and , and similarly for other combinations of .
The correspondence between and is given by
for all . For example, . The ordering of the indices could be described as “lexicographic” order using (not ), since the index of corresponding to is smaller than the index for if precedes in “alphabetic order.”
The coordinates are taken to be numbers mod 5, and is mod 64. For example is taken to be , since mod 5, mod 5, and mod 64.
The computation of proceeds in several steps. The steps receive the array as input and they output a modified array to replace .
The following steps through are repeated for to :
The first step XORs the bits in a column with the parities of two nearby columns.
For all , let
This gives the “parity” of the bitstring formed by the five bits in the column.
For all , let .
For all , let .
The second step rotates the 64 bits in each lane by an amount depending on the lane:
For all , let .
Let .
For to
For all , let .
Let
Return .
For example, consider the bits with coordinates of the form . They are handled by the case and we have , so the bits in this lane are rotated by one position. Then, in Step 3(b), is changed to for the iteration with . We have , so this lane is rotated by three positions. Then is changed to , which is reduced mod 5 to , and we pass to , which gives a rotation by six (the rotations are by what are known as “triangular numbers”). After , all of the lanes have been rotated.
The third step rearranges the positions of the lanes:
For all , let .
Again, the coordinate should be reduced mod 5.
The next step is the only nonlinear step in the algorithm. It XORs each bit with an expression formed from two other bits in its row.
For all , let
The multiplication is multiplying two binary bits, hence is the same as the AND operator.
Finally, some bits in the lane are modified.
For all , let .
Set .
For to 6, let , where is an auxiliary function defined below.
For all , let .
Return .
After I through V are completed for one value of , the next value of is used and I through V are repeated for the new , through . The final output is the new array , which yields a new bitstring of length .
This completes the description of the function , except that we still need to describe the auxiliary function .
The function takes an integer mod 255 as input and outputs a bit according to the following algorithm:
If mod 255, return 1 . Else
10000000
For to mod 255
Let
Let
Let
Let
Let
Let .
Return .
The bit that is outputted is .