1 Introduction

The Keccak hash function [5] is the winner of NIST’s SHA-3 competition. The hash function uses the sponge construction [4] to map arbitrarily long inputs into fixed length outputs, and its official versions have an internal state size of b=1600 bits, and an output size n of either 224, 256, 384, or 512 bits. The internal permutation of Keccak consists of 24 applications of a non-linear round function, applied to the 1600-bit state. Previous papers on Keccak, such as [17], include analysis of Keccak versions with a reduced internal state size, or with different output sizes. However, in this paper, we concentrate on the official Keccak versions, and the only way in which we modify them is by reducing their number of rounds.

Previous results on Keccak’s internal permutation include zero-sum distinguishers presented in [2], and later improved in [6, 7, 11]. Although zero-sum distinguishers can distinguish the full internal permutation of Keccak from a random permutation, they have very high complexities, and they seem unlikely to threaten the core security properties of Keccak (namely, collision resistance, preimage resistance and second-preimage resistance). Other results on Keccak’s internal permutation include a differential analysis given in [12]. Using techniques adapted from the rebound attack [16], the authors construct differential characteristics which give distinguishers on up to 8 rounds of the permutation, with complexity of about \(2^{491}\). However, in their method it is not clear how to reach the starting state differences of these characteristics from valid initial states of Keccak’s internal permutation, since in sponge constructions a large portion of the initial state of the permutation is fixed and cannot be chosen by the cryptanalyst. Thus, although the results of [12] seem to be more closely related to the core security properties of Keccak than zero-sum distinguishers, they still do not lead to any attacks on the Keccak hash function itself.

Currently, there are very few results that analyze reduced-round variants of the full Keccak (rather than its building blocks): in [3], Bernstein described preimage attacks which extend up to 8 rounds of Keccak, but are only marginally faster than exhaustive search, and use a huge amount of memory. More recently, Naya-Plasencia, Röck and Meier presented practical attacks on Keccak-224 and Keccak-256 with a very small number of rounds [18]. These attacks include a preimage attack on 2 rounds, as well as collisions on 2 rounds and near-collisions on 3 rounds. In this paper, we extend these collision attacks on Keccak-224 and Keccak-256 by 2 additional rounds: we find actual collisions in 4 rounds and actual near-collisions in 5 rounds of Keccak-224 and Keccak-256, with Hamming distance 5 and 10, respectively.

The collisions and near-collisions of [18] were obtained using low Hamming weight differential characteristics, starting from the initial state of Keccak’s permutation. Such low Hamming weight characteristics are also the starting point of our new attacks, but we do not require the characteristics to start from the initial state of the permutation. Given a low Hamming weight starting state difference of a characteristic, we can easily extend it backwards by one round, and maintain its high probability (as done in [12]). However, due to the very fast diffusion of the inverse linear mapping used by Keccak’s permutation, the new starting state difference of the extended characteristic has a very high Hamming weight. We call this starting state difference a target difference, since our goal is to find message pairs which have this difference after one round of the Keccak permutation (after the fixed round, this difference will evolve according to the characteristic with high probability). One of the main tools we develop in this paper is an algorithm that aims to achieve this goal, namely, to find message pairs which satisfy a given target difference after one Keccak permutation round. We call this algorithm a target difference algorithm, and it allows us to extend our initial characteristic by two additional rounds (as shown in Fig. 1): we first extend the characteristic backwards by one round to obtain the target difference (while maintaining the characteristic’s high probability). Then, we use the target difference algorithm to link the characteristic to the initial state of Keccak’s permutation, through an additional round. We note that the final link, which efficiently bypasses Keccak’s first Sbox layer, uses algebraic techniques rather than standard probabilistic techniques. Our methods are thus related to previously published work (such as [1]) which combines algebraic and differential techniques in block cipher cryptanalysis.

Fig. 1. Extending a 2-round differential characteristic by two additional rounds.

In the domain of hash functions, the target difference algorithm is related to several cryptanalytic techniques that were developed in recent years. In particular, it is related to the work of Khovratovich, Biryukov and Nikolic [15], where, similarly to our algorithm, the authors use linear algebra to quickly satisfy many conditions of a differential characteristic. However, these techniques seem to work best on byte-oriented hash functions, whose internal structure can be described using a few sparse equations, which is not the case for Keccak. Our algorithm is also closely related to the work of Khovratovich [14] that exploits structures (which aggregate internal states of the hash function) in order to reduce the amortized complexity of collision attacks: the attacker first finds a truncated differential characteristic and searches for a few pairs of initial states that satisfy it. Then, using the structures and the initially found pairs, the attacker efficiently obtains many additional pairs that satisfy the truncated characteristic. However, in the case of Keccak, there are very few characteristics that can lead to a collision with high probability, and it seems unlikely that they can be joined in order to form the truncated differential characteristic required in order to organize the state differences into such structures. Moreover, it seems difficult to find even one pair of initial states that satisfy the target difference for Keccak. Another attack related to the target difference algorithm is the rebound attack [16]. In this attack, the cryptanalyst uses the available degrees of freedom to efficiently link and extend two truncated differential characteristics, both forwards and backwards, from an intermediate state of the hash function. However, once again, such high probability truncated characteristics are unlikely to exist for Keccak.
Moreover, it is not clear how to use the rebound attack to link the backward characteristic to the initial state of the permutation. Thus, our target difference algorithm can be viewed as an asymmetric rebound attack, where one side of the characteristic is fixed.

Our full attacks have two parts, where in the first part we execute the target difference algorithm in order to obtain a sufficiently large set of message pairs that satisfy the target difference after the first round. In the second part of the attack, we try different message pairs in this set in order to find a pair whose difference evolves according to a characteristic whose starting state is the target difference. Since the target difference algorithm does not control the differences beyond the first round, the second part of the attack is a standard probabilistic differential attack (which only searches for collisions or near-collisions obtained from message pairs within a specific set). The high probability differential characteristic beyond the first round ensures that the time complexity of the second part of the attack is relatively low.

Although the target difference algorithm is heuristic, and there is no provable bound on its running time, it was successfully applied with its expected complexity to many target differences defined by the high probability differential characteristics. Consequently, we were able to find actual collisions for 4 rounds of Keccak-224 and Keccak-256 within minutes on a standard PC. By using good differential characteristics for an additional round, we found near-collisions for 5 rounds of Keccak-224 and Keccak-256. However, this required more computational effort (namely, a few days on a single PC), since the extended characteristics have lower probabilities.

The paper is organized as follows. In Sect. 2, we briefly describe Keccak, and in Sect. 3 we introduce our notations. In Sect. 4, we give a comprehensive overview of the target difference algorithm and describe the properties of Keccak that it exploits. In Sect. 5, we present our results on round-reduced Keccak. In Appendix A, we describe the full details of the target difference algorithm, and in Appendix B, we propose an alternative algorithm, which has a better understood time complexity. Since the original algorithm gave us very good results in practice, we did not use this alternative version. However, it may be more efficient in some cases, especially if someone finds longer high probability characteristics for Keccak’s permutation.

2 Description of Keccak

In this section, we give short descriptions of the sponge construction and the Keccak hash function. More details can be found in the Keccak specification [5].

The sponge construction [4] works on a state of b bits, which is split into two parts: the first part contains the first r bits of the state (called the outer part of the state) and the second part contains the last c=b−r bits of the state (called the inner part of the state).

Given a message, it is first padded and cut into r-bit blocks, and the b state bits are initialized to zero. The sponge construction then processes the message in two phases: In the absorbing phase, the message blocks are processed iteratively by XORing each block into the first r bits of the current state, and then applying a fixed permutation on the value of the b-bit state. After processing all the blocks, the sponge construction switches to the squeezing phase. In this phase, n output bits are produced iteratively, where in each iteration the first r bits of the state are returned as output and the permutation is applied.

The Keccak hash function uses multi-rate padding: given a message, it first appends a single 1 bit. Then, it appends the minimum number of 0 bits followed by a single 1 bit, such that the length of the result is a multiple of r. Thus, multi-rate padding appends at least 2 bits and at most r+1 bits.
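As an illustration, the padding rule above can be sketched in Python as follows. The function name `multirate_pad` and the representation of a message as a list of bits are assumptions of this sketch and do not come from the Keccak specification.

```python
def multirate_pad(message_bits, r):
    """Multi-rate padding: append a 1 bit, the minimum number of 0 bits,
    and a final 1 bit, so that the padded length is a multiple of r."""
    # Number of 0 bits needed after accounting for the two mandatory 1 bits.
    num_zeros = (-(len(message_bits) + 2)) % r
    return list(message_bits) + [1] + [0] * num_zeros + [1]
```

For a message whose length is r−8 bits, the rule appends exactly 8 bits, a single 1 followed by six 0 bits and a final 1.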

The official Keccak versions have b=1600 and c=2n, where n∈{224,256,384,512}. The 1600-bit state can be viewed as a 3-dimensional array of bits, a[5][5][64], and each state bit is associated with 3 integer coordinates, a[x][y][z], where x and y are taken modulo 5, and z is taken modulo 64.

The Keccak permutation consists of 24 rounds, which operate on the 1600 state bits. Each round of the permutation consists of five mappings R=ι∘χ∘π∘ρ∘θ. Keccak uses the following naming conventions, which are helpful in describing these mappings:

  • A row is a set of 5 bits with constant y and z coordinates, i.e., a[∗][y][z].

  • A column is a set of 5 bits with constant x and z coordinates, i.e., a[x][∗][z].

  • A lane is a set of 64 bits with constant x and y coordinates, i.e., a[x][y][∗].

  • A slice is a set of 25 bits with a constant z coordinate, i.e., a[∗][∗][z].

The five mappings are given below, for each x,y, and z (where the state addition operations are over GF(2)):

  1.

    θ is a linear map, which adds to each bit in a column, the parity of two other columns:

    $$\theta \mbox{: } a[x][y][z] \leftarrow a[x][y][z] + \displaystyle\sum\limits_{y'=0}^{4} a[x-1][y'][z] + \displaystyle\sum\limits_{y'=0}^{4} a[x+1][y'][z-1]. $$

    In this paper, we also use the inverse mapping, \(\theta^{-1}\), which is more complicated and provides much faster diffusion: for \(\theta^{-1}\), flipping the value of any input bit flips the value of more than half of the output bits.

  2.

    ρ rotates the bits within each lane by T(x,y), which is a predefined constant for each lane:

    $$\rho \mbox{: } a[x][y][z] \leftarrow a[x][y]\bigl[z+T(x,y)\bigr]. $$
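  3.

    π reorders the lanes:

    $$\pi \mbox{: } a[x][y][z] \leftarrow a\bigl[x'\bigr]\bigl[y'\bigr][z], \quad \mbox{where } \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix} \cdot \begin{pmatrix} x' \\ y' \end{pmatrix}. $$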

  4.

    χ is the only non-linear mapping of Keccak, working on each of the 320 rows independently:

    $$\chi \mbox{: } a[x][y][z] \leftarrow a[x][y][z] + \bigl(\bigl(\neg a[x + 1][y][z]\bigr) \wedge a[x + 2][y][z]\bigr). $$

    Since χ works on each row independently, it can be viewed as an Sbox layer which simultaneously applies the same 5-bit to 5-bit Sbox to the 320 rows of the state. We note that the Sbox function is an invertible mapping, and our techniques are heavily based on the observation that the algebraic degree of each output bit of χ as a polynomial in the five input bits is only 2. We also note that the algebraic degree of the inverse mapping \(\chi^{-1}\) is 3 (as noted in [5]).

  5.

    ι adds a round constant to the state:

    $$\iota \mbox{: } a \leftarrow a + \mathrm{RC}[i_r]. $$

    We omit the values of \(\mathrm{RC}[i_{r}]\), as they are not needed for our analysis.
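To make the five mappings concrete, the following is a minimal Python sketch of the permutation, written in the style of compact public reference implementations. It is an illustration under stated assumptions and not the code used in our experiments: lanes are stored as 64-bit integers `lanes[x][y]` whose bit z is a[x][y][z], the rotation offsets and round constants are generated as in the Keccak specification [5] (they are not given in this section), and the function name `keccak_f1600` and its `rounds` parameter (which applies the first `rounds` rounds) are hypothetical.

```python
MASK64 = (1 << 64) - 1

def rol64(a, n):
    """Rotate a 64-bit lane left by n positions."""
    n %= 64
    return ((a << n) | (a >> (64 - n))) & MASK64

def keccak_f1600(lanes, rounds=24):
    """Apply the first `rounds` rounds of the permutation to lanes[x][y]."""
    lanes = [list(column) for column in lanes]  # work on a copy
    lfsr = 1  # LFSR state that generates the iota round constants
    for _ in range(rounds):
        # theta: add to each column the parity of two neighbouring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x - 1) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to its new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi: the only non-linear mapping, applied to every row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: add the round constant to lane (0, 0)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes
```

A round-reduced variant is obtained by calling, for example, `keccak_f1600(state, rounds=4)`.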

3 Notations

Given a message M, we denote its length in bits by |M|. Unless specified otherwise, in this paper we assume that |M|=r−8. Namely, we consider only single-block messages of maximal length such that |M|≡0 (mod 8), which give us the maximal number of degrees of freedom for single-block messages containing an integral number of bytes. Given M, we denote the initial state of the Keccak permutation as the 1600-bit word \(\overline{M} \triangleq M\parallel p \parallel 0^{c}\), where ∥ denotes concatenation, and p denotes the 8-bit pad 10000001.

The first three operations of Keccak’s round function are linear mappings, and we denote their composition by L≜ρ∘π∘θ. We sometimes refer to L as a “half round” of the Keccak permutation, where ι∘χ represents the other half. We denote the Keccak nonlinear function on 5-bit words defined by varying the first index by \(\chi_{|5}\). The difference distribution table (DDT) of this function is a two-dimensional 32×32 integer table, where all the differences are assumed to be over GF(2). The entry \(\mathrm{DDT}(\delta_{in},\delta_{out})\) specifies the number of input pairs to this Sbox with difference \(\delta_{in}\) that give the output difference \(\delta_{out}\) (i.e., the size of the set \(\{x\in\{0,1\}^{5} \mid \chi_{|5}(x)+\chi_{|5}(x+\delta_{in})=\delta_{out}\}\)).
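The DDT defined above can be computed directly from the definition of χ. The following Python sketch does so under the assumption that a 5-bit word is encoded as an integer whose bit x holds a[x]; the names `chi5`, `build_ddt` and `DDT` are chosen for illustration only.

```python
def chi5(v):
    """Keccak Sbox on a 5-bit word; bit x of v holds a[x], indices modulo 5."""
    out = 0
    for x in range(5):
        a0 = (v >> x) & 1
        a1 = (v >> ((x + 1) % 5)) & 1
        a2 = (v >> ((x + 2) % 5)) & 1
        out |= (a0 ^ ((a1 ^ 1) & a2)) << x
    return out

def build_ddt():
    """ddt[d_in][d_out] counts the inputs v with chi5(v) ^ chi5(v ^ d_in) == d_out."""
    ddt = [[0] * 32 for _ in range(32)]
    for d_in in range(32):
        for v in range(32):
            ddt[d_in][chi5(v) ^ chi5(v ^ d_in)] += 1
    return ddt

DDT = build_ddt()
```

Each row of the table sums to 32, since every input value contributes to exactly one output difference.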

We denote a 1600-bit difference in the state of Keccak’s permutation after i rounds by \(\varDelta S_{i}\) (e.g., \(\varDelta S_{0}\) is the initial difference, \(\varDelta S_{0.5}\) is the difference after the application of L and \(\varDelta S_{1}\) is the difference after the application of the first round function). We denote the 1600-bit target difference \(\varDelta S_{1}\), which is the input of the target difference algorithm, by \(\varDelta_{T}\). The output of the algorithm is a subset of ordered pairs of single block messages \(\{(M_{1}^{1},M_{1}^{2}), (M_{2}^{1},M_{2}^{2}), \dots, (M_{k}^{1},M_{k}^{2})\}\) that satisfy this difference after one round R, namely \(R(\overline{M}_{i}^{1})+R(\overline{M}_{i}^{2})=\varDelta _{T}\) for all i∈{1,2,…,k}.

4 Overview of the Target Difference Algorithm

When designing the target difference algorithm, we face two problems: first, the target difference \(\varDelta_{T}\) extends backwards, beyond the first Keccak Sbox layer to \(\varDelta S_{0.5}\), with very low probability (due to its high Hamming weight). The second problem is that the initial state of the permutation fixes many of the state bits to pre-defined values, and the initial states that we use must satisfy these constraints. On the other hand, Keccak has several useful properties that we can exploit in our target difference algorithm. In this section, we describe these properties in detail and give an overview of the algorithm.

4.1 The Properties of Keccak Exploited by the Target Difference Algorithm

Property 1

Keccak-224 and Keccak-256 allow the user to control many of the 1600 state bits of the initial state of the permutation. Thus, given a target difference, we expect many solutions to exist (namely, one-block message pairs which have the 1600-bit target difference after one permutation round): since we consider message pairs where each message is of length r−8=1600−8−c bits (1144 for Keccak-224, and 1080 for Keccak-256), given an arbitrary 1600-bit target difference, there is an expected number of \(2^{2(1600-8-c)-1600}=2^{1584-2c}\) message pairs of this length that satisfy this difference (regardless of the value of the inner part of the state). Thus, the algorithm has 688 and 560 degrees of freedom for Keccak-224 and Keccak-256, respectively.

Despite the large number of available degrees of freedom, the number of possible solutions varies significantly according to the target difference. To demonstrate this, we use the fact that \(L^{-1}\) has very fast diffusion (i.e., even an input with one non-zero bit is mapped by \(L^{-1}\) into a roughly balanced output). We consider the case where t>0 out of the 320 Sboxes of the target difference are active (i.e., they have a non-zero output difference). Each one of the 320−t non-active Sbox zero output differences is uniquely mapped backwards to a zero input difference into the first Sbox layer. Using the Keccak Sbox DDT, it is easy to see that each one of the t active Sbox output differences is mapped to more than 8 possible input differences. Thus, the number of possible state differences \(\varDelta S_{0.5}\) is more than \(8^{t}=2^{3t}\). Since L is invertible and acts deterministically on the differences, the number of possible input differences to the Keccak compression function \(\varDelta S_{0}\) remains the same. If we require that the last c+8 bits of \(\varDelta S_{0}\) are zero, for t large enough, we still expect more than \(2^{3t-c-8}\) valid solutions. When the target difference is chosen at random, we have t≈310 (since the probability that an Sbox output difference is zero is \(\frac{1}{32}\)). This gives more than \(2^{930-448-8}=2^{474}\) expected solutions for Keccak-224, and more than \(2^{930-512-8}=2^{410}\) expected solutions for Keccak-256. On the other hand, consider the extreme case of t=1 (i.e., the target difference has only one active Sbox). Clearly, this Sbox cannot contribute more than 31 possible values of \(\varDelta S_{0.5}\). Since \(L^{-1}\) has very fast diffusion, these possible differences are mapped to at most 31 roughly balanced non-zero possible input differences \(\varDelta S_{0}\), and we do not expect the last c+8 bits of any of them to be zero. To conclude, target differences with a small number of active Sboxes are likely to have no solutions at all.
On the other hand, a majority of the target differences have a very large number of expected solutions for Keccak-224 and Keccak-256. Note that having a large number of solutions does not imply that it is easy to find any one of them, since their density is still minuscule.
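The counting argument above is simple arithmetic, and can be reproduced with the following Python sketch. The function name `log2_expected_solutions` is hypothetical; the exponent 3t−c−8 is the lower bound derived in the text, with c=2n.

```python
def log2_expected_solutions(t, c):
    """Exponent of the lower bound on the number of valid input differences:
    more than 8 choices (3 bits) per active Sbox, minus c+8 zero-bit conditions."""
    return 3 * t - (c + 8)

# Expected number of active Sboxes for a random target difference:
# each of the 320 Sboxes is inactive with probability 1/32.
t_random = 320 * 31 // 32

exponent_224 = log2_expected_solutions(t_random, c=2 * 224)
exponent_256 = log2_expected_solutions(t_random, c=2 * 256)
```

For a target difference with a single active Sbox, the same function returns a large negative exponent, in line with the observation that such target differences are likely to have no solutions.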

Property 2

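The algebraic degree of the Keccak Sboxes is only 2. This implies that given a 5-bit input difference \(\delta_{in}\) and a 5-bit output difference \(\delta_{out}\), the set of values \(\{v_{1},v_{2},\dots,v_{l}\}\) such that \(\chi_{|5}(v_{i})+\chi_{|5}(v_{i}+\delta_{in})=\delta_{out}\) is an affine subset (since the derivative of a quadratic function is affine). Since \((v_{i}+\delta_{in})+\delta_{in}=v_{i}\), then \(v_{i}+\delta_{in}\in\{v_{1},v_{2},\dots,v_{l}\}\), implying \(\{v_{1},v_{2},\dots,v_{l}\}=\{v_{1}+\delta_{in},v_{2}+\delta_{in},\dots,v_{l}+\delta_{in}\}\). Thus, both coordinates of the ordered pairs give the same subset whose size is \(\mathrm{DDT}(\delta_{in},\delta_{out})\).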
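Property 2 can be checked exhaustively over all 32×32 difference transitions. The Python sketch below is one way to do this, assuming the same integer encoding of 5-bit words as before (bit x holds a[x]); the helper names `solutions` and `is_affine` are illustrative.

```python
def chi5(v):
    """Keccak Sbox on a 5-bit word; bit x of v holds a[x], indices modulo 5."""
    out = 0
    for x in range(5):
        a0 = (v >> x) & 1
        a1 = (v >> ((x + 1) % 5)) & 1
        a2 = (v >> ((x + 2) % 5)) & 1
        out |= (a0 ^ ((a1 ^ 1) & a2)) << x
    return out

def solutions(d_in, d_out):
    """All values v that satisfy the difference transition (d_in, d_out)."""
    return [v for v in range(32) if chi5(v) ^ chi5(v ^ d_in) == d_out]

def is_affine(values):
    """A set over GF(2) is an affine subset iff a + b + c stays in the set
    for all of its elements a, b, c (the empty set passes vacuously)."""
    s = set(values)
    return all((a ^ b ^ c) in s for a in s for b in s for c in s)

all_affine = all(is_affine(solutions(d_in, d_out))
                 for d_in in range(32) for d_out in range(32))
```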

We note that similar observations were used in [10] to prove that when \(\mathrm{DDT}(\delta_{in},\delta_{out})=2\) or 4, the same holds. In the specific case of Keccak, we also use 3-dimensional affine subsets of pairs that satisfy the Sbox difference transition \((\delta_{in},\delta_{out})\), for which \(\mathrm{DDT}(\delta_{in},\delta_{out})=8\). On the other hand, since the algebraic degree of the inverse Sbox is 3, which is reduced to 2 (rather than 1) after differentiation, the output values that satisfy an input and an output difference do not necessarily form an affine subset.

Property 3

For any non-zero 5-bit output difference \(\delta_{out}\) from a Keccak Sbox, the set of possible input differences, \(\{\delta_{in} \mid \mathrm{DDT}(\delta_{in},\delta_{out})>0\}\), contains at least 5 (and up to 17) 2-dimensional affine subspaces. These affine subspaces can be easily pre-computed using the DDT, for each one of the 31 possible non-zero output differences. However, we note that there is no output difference for which the set of possible input differences contains an affine subspace of dimension 3 or higher.
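The pre-computation mentioned in Property 3 can be sketched in Python as follows. The sketch assumes that a 2-dimensional affine subspace is represented as a 4-element set {a, b, c, a+b+c} of input differences, and the names `possible_input_differences`, `affine_planes` and `plane_counts` are hypothetical. The dictionary `plane_counts` can be inspected to compare against the counts stated above.

```python
from itertools import combinations

def chi5(v):
    """Keccak Sbox on a 5-bit word; bit x of v holds a[x], indices modulo 5."""
    out = 0
    for x in range(5):
        a0 = (v >> x) & 1
        a1 = (v >> ((x + 1) % 5)) & 1
        a2 = (v >> ((x + 2) % 5)) & 1
        out |= (a0 ^ ((a1 ^ 1) & a2)) << x
    return out

def possible_input_differences(d_out):
    """All d_in for which the transition d_in -> d_out has a non-zero DDT entry."""
    return {d_in for d_in in range(32) for v in range(32)
            if chi5(v) ^ chi5(v ^ d_in) == d_out}

def affine_planes(diffs):
    """All 2-dimensional affine subspaces {a, b, c, a^b^c} contained in diffs."""
    planes = set()
    for a, b, c in combinations(sorted(diffs), 3):
        d = a ^ b ^ c
        if d in diffs:
            planes.add(frozenset((a, b, c, d)))
    return planes

# Number of 2-dimensional affine subspaces for each non-zero output difference.
plane_counts = {d_out: len(affine_planes(possible_input_differences(d_out)))
                for d_out in range(1, 32)}
```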

4.2 Formulating the Problem

Given \(\varDelta_{T}\), an arbitrary message pair \((M^{1},M^{2})\) in which \(|M^{1}|=|M^{2}|=r-8\) is a solution to our problem if \(R(\overline{M}^{1})+R(\overline{M}^{2})=\varDelta _{T}\). This can be formulated using two constraints on the 1600-bit words \((\overline{M}^{1},\overline{M}^{2})\):

  1.

    The last c+8 bits of \(\overline{M}^{1}\) and \(\overline{M}^{2}\) are equal to \(p\parallel 0^{c}\), where p denotes the 8-bit pad 10000001.

  2.

    \(R(\overline{M}^{1}) + R(\overline{M}^{2}) = \varDelta _{T}\) (where R is the permutation round of Keccak).

We can easily formulate the first constraint using linear equations on the bits of \(\overline{M}^{1}\) and \(\overline{M}^{2}\). Since Keccak’s Sbox has an algebraic degree of 2 over GF(2), we can formulate the second constraint as a system of quadratic equations on these bits. Standard heuristic techniques for solving such systems include using the available degrees of freedom to fix some message values (or values before the first Sbox layer) in order to linearize the system. However, these techniques require many more than the available number of degrees of freedom when used in a trivial way. For example, in order to get linear equations after one round of Keccak’s permutation, we can fix 3 out of the 5 bits entering an Sbox (after the first linear layer), such that there are no two consecutive unknown input bits entering the Sbox. Using this technique reduces the single quadratic term in the symbolic form of each of the Sbox’s output bits to a linear term. However, this requires fixing 320⋅3=960 bits per message, and 2⋅960=1920 bits in total, which is significantly more than the 688 available degrees of freedom for Keccak-224 (and clearly more than the available number of degrees of freedom for the other Keccak versions). Consequently, we have to repeat the linearization procedure a huge number of times, with different fixed values, in order to find a solution.
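The trivial linearization described above can be illustrated on a single Sbox. The following Python sketch fixes the input bits at positions 0, 2 and 4 (an arbitrary choice made for this example, leaving the non-consecutive positions 1 and 3 unknown) and checks that χ is then affine in the two unknown bits for every assignment of the fixed bits. All names are hypothetical.

```python
from itertools import product

FIXED = (0, 2, 4)     # positions of the fixed input bits (assumed for this example)
UNKNOWN = (1, 3)      # positions of the unknown input bits, not cyclically adjacent

def chi5_bits(a):
    """chi on a list of five bits a[0..4], indices taken modulo 5."""
    return [a[x] ^ ((a[(x + 1) % 5] ^ 1) & a[(x + 2) % 5]) for x in range(5)]

def affine_after_fixing(fixed_values):
    """A function of two unknown bits is affine iff the XOR of its outputs
    over all four assignments of the unknowns is zero."""
    total = [0] * 5
    for unknown_values in product((0, 1), repeat=2):
        a = [0] * 5
        for pos, val in zip(FIXED, fixed_values):
            a[pos] = val
        for pos, val in zip(UNKNOWN, unknown_values):
            a[pos] = val
        total = [t ^ o for t, o in zip(total, chi5_bits(a))]
    return total == [0] * 5

always_affine = all(affine_after_fixing(f) for f in product((0, 1), repeat=3))

# Cost of the technique: 3 fixed bits per Sbox, 320 Sboxes, 2 messages.
bits_fixed_per_pair = 2 * 320 * 3
```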

A Two-Phase Algorithm

Although we expect our quadratic system to have many solutions, solving all the equations at once seems difficult. Thus, we split the problem into easier tasks by exploiting the low algebraic degree of Keccak’s Sbox to a greater extent than in the standard techniques: as described in Property 2 of Sect. 4.1, given an input difference and an output difference to an Sbox, all the pairs of input values that satisfy them form an affine subset. This suggests an algorithm with two phases, where in the first phase (called the difference phase) we find an input difference to all the Sboxes, and in the second phase (called the value phase) we obtain the actual values of the message pairs that lead to the target difference.

Using this two-phase approach, the ordered pairs produced by our algorithm satisfy two additional properties: the 1600-bit input difference of the initial states \(\varDelta S_{0}\) is fixed to some 1600-bit value \(\varDelta_{I}\) (i.e., \(\overline{M}_{i}^{1} + \overline{M}_{i}^{2} = \varDelta _{I}\) for all i∈{1,2,…,k}), and the set composed of all the initial states defined by the first message in each ordered pair (i.e., \(\bigcup\{\overline{M}_{i}^{1}\}\) for i∈{1,2,…,k}) forms an affine subset. The algorithm outputs the ordered pairs as the fixed 1600-bit input difference \(\varDelta_{I}\), and some basis for the affine subset \(\bigcup\{\overline{M}_{i}^{1}\}\), i∈{1,2,…,k}. We note that the large number of degrees of freedom allows us to restrict the set of solutions (i.e., the set of message pairs that satisfy the target difference) to a smaller subset (but still large enough for our purposes) that can be found relatively easily. In particular, the algorithm considers only message pairs with a fixed difference \(\varDelta_{I}\), for which all the solutions can be found by solving linear equations.

The two constraints above, which define our quadratic equation system, are broken into two sets of constraints, since we have to simultaneously enforce two difference constraints (given as constraints on the 1600-bit word \(\varDelta_{I}\)):

Difference Constraint 1

The last c+8 bits of \(\varDelta_{I}\) are equal to zero.

Difference Constraint 2

The difference transition \(L(\varDelta_{I})\rightarrow\varDelta_{T}\) is possible, i.e., there exists some 1600-bit word W such that \(\chi(W)+\chi(W+L(\varDelta_{I}))=\varDelta_{T}\) (note that since L is a linear function, \(L(\varDelta_{I})\) is well-defined).

The first difference constraint simply equates bits of the input difference \(\varDelta_{I}\) to zero (456 bits for Keccak-224 and 520 bits for Keccak-256), while the second difference constraint assigns to every 5 bits of \(L(\varDelta_{I})\) that enter an Sbox, several possible values which are not related by simple affine equations.

In the second phase, we enforce additional value constraints (given on the 1600-bit word \(\overline{M}^{1}\)):

Value Constraint 1

The last c+8 bits of \(\overline{M}^{1}\) are equal to \(p\parallel 0^{c}\), where p denotes the 8-bit pad 10000001.

Value Constraint 2

\(R(\overline{M}^{1}) + R(\overline{M}^{1} + \varDelta _{I}) = \varDelta _{T}\).

Note that the first difference constraint and the first value constraint on each \(\overline{M}_{i}^{1}\) also ensure that the same value constraint holds for \(\overline{M}_{i}^{2}\) (i.e., the last c+8 bits of \(\overline{M}_{i}^{2}\) are equal to \(p\parallel 0^{c}\)).

Given a single 1600-bit Sbox layer input difference \(\varDelta S_{0.5}\), Property 2 of Sect. 4.1 implies that enforcing the two value constraints simply reduces to solving a union of two sets of linear equations. On the other hand, it is not clear how to simultaneously enforce both of the difference constraints, since given an output difference to an Sbox \(\delta_{out}\), all the possible input differences \(\delta_{in}\) such that \(\mathrm{DDT}(\delta_{in},\delta_{out})>0\), are not related by simple affine relations.

4.3 The Difference Phase

Unsuccessful Attempts to Enforce the Difference Constraints

We can try to enforce both difference constraints by assigning the undetermined 1600−c−8 bits of \(\varDelta_{I}\), in such a way that the second difference constraint will hold. This usually involves iteratively constructing an assignment for \(\varDelta_{I}\), by guessing several undetermined bits at a time, and filtering the guesses by verifying the second difference constraint. However, this is likely to have a very large time complexity, since L diffuses the bits of \(\varDelta_{I}\) in a way that forces us to guess many bits before we can start filtering the guesses. Moreover, for any \(\varDelta_{T}\), the fraction of input differences satisfying the first difference constraint that also satisfy the second difference constraint is very small. Thus, most of the computational effort turns out to be useless, since the guesses are likely to be discarded at later stages of the algorithm. Another approach is to guess \(L(\varDelta_{I})\) by iteratively guessing the 5-bit Sbox input differences, and filtering the guesses by verifying the first difference constraint. For similar reasons, this approach is likely to have a very large time complexity.

A Better Approach

Both of these approaches are very strict, since each guess made by the algorithm commits to a specific value for some of the bits of \(\varDelta_{I}\), or \(L(\varDelta_{I})\), and restricts the solution space significantly. Thus, we use Property 3 of Sect. 4.1, which gives us more flexibility, and significantly reduces the time complexity: given any non-zero 5-bit output difference to a Keccak Sbox, the set of possible input differences contains at least five 2-dimensional affine subspaces. Consequently, in order to enforce the second difference constraint, for each Sbox with a non-zero output difference (i.e., an active Sbox), we choose one of the affine subsets (which contains 4 potential values for the 5 Sbox input bits of \(L(\varDelta_{I})\)), instead of choosing specific values for these bits. This enables us to maintain an affine subspace of potential values for \(L(\varDelta_{I})\), starting with the full 1600-dimensional space, and iteratively reducing its dimension by adding affine equations in order to enforce the second difference constraint for each Sbox. In addition to these affine equations that we add per active Sbox, we also have to add the linear equations for the non-active Sboxes (which equate their 5 input difference bits to zero), and the additional c+8 linear equations that enforce the first difference constraint. All of these equations are added to a linear system of equations that we denote by \(E_{\varDelta}\).

Since the c+8 equations that enforce the first difference constraint do not depend on the target difference, we add them to \(E_{\varDelta}\) before we iterate the Sboxes. While iterating over the active Sboxes, we add equations on \(L(\varDelta_{I})\) in order to enforce the second difference constraint and hope that for each Sbox, we can add equations such that \(E_{\varDelta}\) remains consistent. Note that the equations in \(E_{\varDelta}\) in each stage of the algorithm depend on the order in which we consider the active Sboxes, and on the order in which we consider the possible affine subsets of input differences for each Sbox. Thus, if we reach an Sbox for which we cannot add equations in order to enforce the second constraint (while maintaining the consistency of \(E_{\varDelta}\)), we can simply change the order in which we consider the active Sboxes, or the order in which we consider the affine subsets for each Sbox, and try again. Since we cannot predict in advance the orderings that give the best result, we choose them heuristically, as described in Appendix A.
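The bookkeeping of the difference phase (maintaining a linear system and adding equations only when it stays consistent) can be sketched in Python as below. This is a simplified illustration under several assumptions: equations are abstract pairs `(mask, rhs)` over GF(2), where the bits of `mask` select the variables, the class `GF2System` and the function `difference_phase` are hypothetical names, and the greedy loop simply takes the first consistent candidate for each Sbox in the given order, whereas the actual ordering heuristics are those of Appendix A and are not reproduced here.

```python
class GF2System:
    """Incrementally built system of linear equations over GF(2).

    Rows are stored by pivot (the highest set bit of their mask), which
    keeps the system in echelon form and makes consistency checks cheap."""

    def __init__(self):
        self.rows = {}  # pivot bit index -> (mask, rhs)

    def _reduce(self, mask, rhs):
        """Eliminate the leading variables of (mask, rhs) using stored rows."""
        while mask:
            pivot = mask.bit_length() - 1
            if pivot not in self.rows:
                break
            row_mask, row_rhs = self.rows[pivot]
            mask ^= row_mask
            rhs ^= row_rhs
        return mask, rhs

    def try_add(self, equations):
        """Add all equations if the system stays consistent; otherwise
        leave the system unchanged and return False."""
        saved = dict(self.rows)
        for mask, rhs in equations:
            mask, rhs = self._reduce(mask, rhs)
            if mask == 0:
                if rhs:  # reduced to 0 = 1, so the system is inconsistent
                    self.rows = saved
                    return False
                continue  # redundant equation
            self.rows[mask.bit_length() - 1] = (mask, rhs)
        return True

def difference_phase(system, candidates_per_sbox):
    """For each active Sbox, try its candidate affine subsets (each given as
    a list of equations) in order. Returns the chosen indices, or None if
    some Sbox has no consistent candidate under this ordering."""
    chosen = []
    for candidates in candidates_per_sbox:
        for index, equations in enumerate(candidates):
            if system.try_add(equations):
                chosen.append(index)
                break
        else:
            return None
    return chosen
```

When `difference_phase` returns `None`, a caller would reorder the Sboxes or the candidates and start again from a fresh system that contains only the equations of the first difference constraint.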

4.4 The Value Phase

In case the difference phase procedure described above succeeds, it actually outputs an affine subspace of candidate input differences, rather than a single value for \(\varDelta_{I}\). Next, we can commit to a specific value for \(\varDelta_{I}\) and run the value phase, hoping that the set of all linear equations defined by the value constraints has a solution. Namely, we allocate another system of equations, which we denote by \(E_{M}\), and add the equations on \(\overline{M}^{1}\) that enforce the first value constraint. We then add the additional linear equations that enforce the second value constraints for all the Sboxes, and output the solution to the system, if it exists. However, once again, this approach is too strict, and may force us to repeat the value phase a huge number of times with different values for \(\varDelta_{I}\), until we find a solution. Thus, we do not choose a single value for \(\varDelta_{I}\) in advance. Instead, we reduce the linear subset of candidates for \(\varDelta_{I}\) gradually by fixing the input difference to each one of the active Sboxes, until a single value for \(\varDelta_{I}\) remains. Thus, we continue to maintain \(E_{\varDelta}\) throughout the value phase, and iteratively add the additional 2 equations which are required to uniquely specify a 5-bit input difference for each active Sbox, among the 2-dimensional affine subsets chosen in the difference phase. Once we fix the input difference to an Sbox, we immediately obtain linear equations on \(\overline{M}^{1}\), and we can check their consistency with the current equations in \(E_{M}\). In case the equations in \(E_{M}\) are not consistent for a certain Sbox, we can try to choose another input difference for it. This gives different equations on \(\overline{M}^{1}\), which may be consistent and allow us to continue the process.

Similarly to the difference phase, the equations in \(E_{M}\) in each stage of the algorithm depend on the order in which we consider the active Sboxes, and on the order in which we consider the possible input differences for each Sbox. Thus, once again, if at some stage of the value phase we cannot add any consistent equations to \(E_{M}\), we can change one of these orderings and try again, hoping to obtain a valid solution.
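To illustrate how fixing the input difference of one Sbox yields linear equations on values, the following Python sketch derives, for a given transition, all the linear equations over the 5 Sbox input bits that hold for every solution. By Property 2 these equations describe the solution set exactly. The sketch works on the Sbox input bits only (in the algorithm these are bits of L applied to the initial state, so the equations translate linearly to the message bits), and the names `value_equations` and `parity` are hypothetical.

```python
def chi5(v):
    """Keccak Sbox on a 5-bit word; bit x of v holds a[x], indices modulo 5."""
    out = 0
    for x in range(5):
        a0 = (v >> x) & 1
        a1 = (v >> ((x + 1) % 5)) & 1
        a2 = (v >> ((x + 2) % 5)) & 1
        out |= (a0 ^ ((a1 ^ 1) & a2)) << x
    return out

def parity(v):
    """Parity of the set bits of v."""
    return bin(v).count("1") & 1

def value_equations(d_in, d_out):
    """Return the equations (mask, rhs) satisfied by every input value v with
    chi5(v) ^ chi5(v ^ d_in) == d_out, or None if the transition is impossible.
    An equation (mask, rhs) means: parity(mask & v) == rhs."""
    sols = [v for v in range(32) if chi5(v) ^ chi5(v ^ d_in) == d_out]
    if not sols:
        return None
    equations = []
    for mask in range(1, 32):
        parities = {parity(mask & v) for v in sols}
        if len(parities) == 1:  # the same parity for all solutions
            equations.append((mask, parities.pop()))
    return equations
```

The returned list also contains linearly dependent equations; a system that eliminates redundant rows can absorb them unchanged.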

We stress again that both phases of the algorithm are not guaranteed to succeed. The success of each phase depends on the target difference, and on orderings which are chosen heuristically. As a result, we may have to iterate both phases of the algorithm an undetermined number of times with modified orderings, hoping to obtain better results.

5 Application of the Target Difference Algorithm to Round-Reduced Keccak

Since we would like to use the target difference algorithm in order to find collisions and near-collisions in Keccak, it is crucial to verify the algorithm’s success on target differences which lead to these results. Thus, before we run the algorithm, we have to find such high probability differential characteristics, and to obtain the target differences which are likely to be the most successful inputs to the algorithm. As described in the introduction, once we find a high probability differential characteristic with a low Hamming weight starting state difference ΔS 2, we extend it backwards to obtain the target difference Δ T =ΔS 1 (while maintaining its high probability). We then use the target difference algorithm to link the extended characteristic backwards to the initial state of Keccak’s permutation, with an additional round. Thus, any low Hamming weight characteristic for r rounds of Keccak’s permutation can be used to obtain results on a round-reduced version of Keccak with r+2 rounds. Specifically, in this section we demonstrate how we use 2-round characteristics in order to find collisions for 4 rounds of Keccak-224 and Keccak-256, and how to use 3-round characteristics in order to find near-collisions for 5 rounds of these Keccak versions.

5.1 Searching for Differential Characteristics

We reuse the notion of a column parity kernel or CP-kernel that was defined in the Keccak submission document [5]: a 1600-bit state difference is in the CP-kernel if all of its columns have even parity. It is easy to see that such state differences are fixed points of the function θ, which does not increase their Hamming weight. Since ρ and π just reorder the bits of the state, the application of L to a CP-kernel does not change its total Hamming weight. In addition, there is a high probability that such low Hamming weight differential states are fixed points of χ. Thus, when we start a differential characteristic from a low Hamming weight CP-kernel, we can extend it beyond the Sbox layer, χ, to one additional round of the Keccak permutation, with relatively high probability and without increasing its Hamming weight. However, extending such a characteristic to more rounds in a similar way is more challenging, since we have to ensure that the state difference before the application of θ remains in the CP-kernel at the beginning of each round.
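The fixed-point property of θ on the CP-kernel can be checked with a short sketch, assuming the standard description of θ and a state difference stored as a 5×5 array of 64-bit lanes indexed as `state[x][y]`. The function names are introduced here for illustration only. Since θ is linear, applying it to a difference gives the difference of the outputs.

```python
MASK64 = (1 << 64) - 1


def rotl64(value, n):
    """Rotate a 64-bit lane left by n positions."""
    n %= 64
    return ((value << n) | (value >> (64 - n))) & MASK64


def column_parities(state):
    """XOR of the five lanes in each of the five sheets (one bit per column)."""
    return [state[x][0] ^ state[x][1] ^ state[x][2] ^ state[x][3] ^ state[x][4]
            for x in range(5)]


def in_cp_kernel(state):
    """A state difference is in the CP-kernel if every column has even parity."""
    return all(parity == 0 for parity in column_parities(state))


def theta(state):
    """The linear map theta: add to each bit the parities of two nearby columns."""
    c = column_parities(state)
    d = [c[(x - 1) % 5] ^ rotl64(c[(x + 1) % 5], 1) for x in range(5)]
    return [[state[x][y] ^ d[x] for y in range(5)] for x in range(5)]


def hamming_weight(state):
    return sum(bin(lane).count("1") for sheet in state for lane in sheet)


def empty_state():
    return [[0] * 5 for _ in range(5)]


# Two active bits in the same column: the difference is in the CP-kernel.
kernel_diff = empty_state()
kernel_diff[0][0] = 1
kernel_diff[0][1] = 1

# A single active bit: the column has odd parity, so theta spreads it.
single_bit_diff = empty_state()
single_bit_diff[0][0] = 1
```

For `kernel_diff`, θ returns the same difference, so the Hamming weight stays at 2. For `single_bit_diff`, the odd column parity is added to two full neighbouring columns, which raises the Hamming weight from 1 to 11.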

Using Previous Results

In [9], [12] and [18], the authors propose algorithms for constructing low Hamming weight differential characteristics for Keccak. These algorithms successfully find differential characteristics that stay in the CP-kernel for 2 rounds (named double kernel trails in [18]), some of which lead to collisions on the n-bit extract taken from the final state after 2 rounds, with high probability. However, when trying to extend each one of these characteristics by another round, the state difference is no longer in the CP-kernel and thus its Hamming weight increases significantly (from less than 10 to a few dozen bits). Nevertheless, the Hamming weight of the characteristics is still relatively low, and they can lead with reasonably high probability to near-collisions on the n output bits extracted. Beyond 3 rounds, the Hamming weight of the characteristics becomes very high (more than 100), and it seems unlikely that they can be extended to give collisions or near-collisions with reasonable probability. The currently known double kernel differential trails only extend forward to at most three rounds with reasonably high probability (higher than \(2^{-100}\)).

Our attacks on round-reduced Keccak make use of the type of differential characteristics that were found in [9], [12] and [18], namely low Hamming weight characteristics that stay in the CP-kernel for 2 rounds. The double kernel trails with the highest probability have Hamming weight of 6 at the input to the initial round, and due to their low Hamming weight, we could easily find all these characteristics within a minute on a standard PC. There are 571 such characteristics, out of which 128 can give collisions for Keccak-224 and 64 can give collisions for Keccak-256. However, when trying to extend the characteristics by an additional round, we were not able to find any characteristic that gives collisions for Keccak-224 (or Keccak-256) with reasonable probability. Thus, our best 3-round characteristics lead only to near-collisions, rather than collisions. The characteristics that give the near-collisions with the smallest difference Hamming weight for Keccak-224 and Keccak-256 are, again, double kernel trails with 6 non-zero input bits. The best 3-round characteristics for Keccak-224 lead to near-collisions with a difference Hamming weight of 5, and for Keccak-256, the best 3-round characteristics lead to near-collisions with a difference Hamming weight of 8. Examples of these characteristics are found in Appendix C.

Extending the Characteristics Backwards

Since the characteristics that we use start with a low Hamming weight state difference, we can extend them backwards by one round without reducing their probability significantly (as done in [12]): we take this low Hamming weight initial state difference ΔS 2, and choose a valid state difference input to the previous Sbox layer ΔS 1.5 which could produce it. We then apply L −1, and obtain a new initial state difference for the extended characteristic ΔS 1, which serves as a target difference Δ T for our new algorithm. Note that the target difference is not in the CP-kernel (otherwise, we would have found a low Hamming weight differential characteristic that stays in the CP-kernel for 3 rounds). Thus, when we apply L −1 to ΔS 1.5, we usually obtain a roughly balanced target difference, with only a few non-active Sboxes. This is significant to the success of the target difference algorithm, which strongly depends on the number of active Sboxes in the target difference (as demonstrated in Sect. 4.1). In case the target difference obtained from a characteristic has too many non-active Sboxes, we can try to select another target difference for the characteristic, by tweaking ΔS 1.5.
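The choice of a valid input difference to the previous Sbox layer can be illustrated on a single row with a minimal sketch, assuming the standard 5-bit description of χ and a bit ordering in which bit i of an integer is the i-th bit of the row (the ordering is an assumption of this illustration). The helper names are hypothetical. The sketch builds the difference distribution table of the Sbox and lists, for a given output difference, the input differences that can produce it.

```python
def chi5(x):
    """The Keccak Sbox chi on one 5-bit row: y_i = x_i ^ (~x_{i+1} & x_{i+2})."""
    bits = [(x >> i) & 1 for i in range(5)]
    out = 0
    for i in range(5):
        y = bits[i] ^ ((bits[(i + 1) % 5] ^ 1) & bits[(i + 2) % 5])
        out |= y << i
    return out


def build_ddt():
    """ddt[din][dout] = number of inputs x with chi5(x) ^ chi5(x ^ din) == dout."""
    ddt = [[0] * 32 for _ in range(32)]
    for din in range(32):
        for x in range(32):
            ddt[din][chi5(x) ^ chi5(x ^ din)] += 1
    return ddt


def input_candidates(ddt, dout):
    """Input differences that can lead to dout, most probable first,
    as (input difference, number of conforming inputs out of 32)."""
    pairs = [(din, ddt[din][dout]) for din in range(32) if ddt[din][dout]]
    return sorted(pairs, key=lambda pair: -pair[1])


DDT = build_ddt()
# Candidate input differences for an output difference with a single active bit.
candidates_for_one_bit = input_candidates(DDT, 0b00001)
```

Applying such a choice to every active row of ΔS 2 gives one possible ΔS 1.5. Tweaking ΔS 1.5, as described above, corresponds to picking a different candidate for some row.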

Assuming that the algorithm succeeds and we obtain a sufficiently large linear subspace of message pairs (such that it contains at least one pair whose difference evolves according to the characteristic), we can find collisions for 4 rounds and near-collisions for 5 rounds of Keccak-224 and Keccak-256. For example, if we have an extended characteristic which can give collisions for 3 rounds of Keccak-256 with probability \(2^{-24}\), we need a linear subspace which contains at least \(2^{24}\) message pairs in order to find a collision on 4-round Keccak-256 with high probability.
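As a rough worked estimate, assume (as an idealization) that each of the N message pairs in the subspace follows the characteristic independently with probability p. The probability that at least one pair conforms is then

\[
1-(1-p)^{N} \;\approx\; 1-e^{-Np}.
\]

For \(p = 2^{-24}\) and \(N = 2^{24}\) this gives \(1-e^{-1}\approx 0.63\), and for \(N = 2^{26}\) it gives \(1-e^{-4}\approx 0.98\). Under this assumption, a subspace of dimension only slightly larger than 24 already suffices with high probability.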

5.2 Applying the Target Difference Algorithm to the Selected Differential Characteristics

We tested our target difference algorithm using a standard PC, on dozens of double-kernel trails with Hamming weight of 6. For each one of them, after tweaking ΔS 1.5 at most once, we could easily compute a target difference where all of the 320 Sboxes are active. We then ran the target difference algorithm on each one of these targets. For both Keccak-224 and Keccak-256, the target difference algorithm eventually succeeded: the basic procedure of the difference phase always succeeded within the first two attempts (after changing the order in which we considered the Sboxes), while the value phase was more problematic, and we had to iterate its basic procedure dozens to thousands of times in order to find a good ordering of the Sboxes and obtain results. For Keccak-224, the algorithm typically returned an affine subspace of message pairs with a dimension of about 100 within one minute. For Keccak-256, the dimension of the affine subspaces of message pairs returned was typically between 35 and 50, which is smaller compared to the typical result size for Keccak-224 (as expected since we have fewer degrees of freedom). In addition, unlike Keccak-224, for Keccak-256 we had to rerun the algorithm (starting from the difference phase) a few times, when the value phase did not seem to succeed for the choice of candidate input difference subset. Hence, the running time of the algorithm was typically longer—between 3 and 5 minutes, which is, of course, still practical.

5.3 Obtaining Actual Collisions and Near-Collisions for Round-Reduced Keccak-224 and Keccak-256

Obtaining Collisions

After successfully running the target difference algorithm, we were able to find collisions for 4-round Keccak for each tested double-kernel trail with Hamming weight of 6 (which leads to a collision). Since the probability of each one of these differential characteristics is greater than \(2^{-30}\), the probability that a random pair which satisfies its corresponding target difference leads to a collision is greater than \(2^{-30}\). Thus, we expect to find collisions quickly for both Keccak-224 and Keccak-256, once the target difference algorithm returns a set of more than \(2^{30}\) message pairs. However, even though the subsets we used contained more than \(2^{30}\) message pairs, we were not able to find collisions within several of these subsets for Keccak-224, and for many of the subsets for Keccak-256. As a result, we had to rerun the target difference algorithm and obtain additional sets of message pairs, until a collision was found. Thus, the entire process of finding a collision typically takes about 2–3 minutes for Keccak-224, and 15–30 minutes for Keccak-256. The reason that there were no 4-round collisions within many of the subsets of message pairs is the incomplete diffusion of the Keccak permutation within the first two rounds. Since our subsets of message pairs are relatively small (especially for Keccak-256), and the values of all the message pairs within a subset are closely related, some close relations between a small number of bits still hold before the Sbox layer of the second round (e.g., the value of a certain bit is always 0, or the XOR of two bits is always 1). Some of these non-random relations make the desired difference transition into the second Sbox layer impossible, for all the message pairs within a subset. We note that we were still able to find collisions rather quickly, since it is easy to detect the cases where the difference transition within the second Sbox layer is impossible (see Footnote 4), which allowed us to immediately rerun the target difference algorithm. In addition, when this difference transition is possible, we were always able to find collisions within the subset. Two concrete examples of colliding message pairs for Keccak-224 and Keccak-256 are given in Appendix D.

Obtaining Near-Collisions

In order to obtain near-collisions on 5-round Keccak-224 and Keccak-256, we again start by choosing suitable differential characteristics. Out of all the characteristics that we searched, we chose the differential characteristics described in Appendix C, which lead to near-collisions of minimal Hamming weight for the two versions of Keccak. The results of the target difference algorithm, when applied to targets chosen according to these characteristics, were similar to the results described in Sect. 5.2. However, compared to the probability of the characteristics leading to a collision, the probability of these longer characteristics is lower: the probabilities of the characteristics are \(2^{-57}\) and \(2^{-59}\) for Keccak-224 and Keccak-256, respectively. Thus, obtaining message pairs whose differences propagate according to these characteristics, and lead to 5-round near-collisions, is more difficult than obtaining collisions for 4 rounds of Keccak-224 and Keccak-256. However, for each such main characteristic, there are several secondary characteristics which diverge from the main one in the final two rounds and give similar results. Thus, the probabilities of finding near-collisions with a small Hamming distance for 5 rounds of Keccak-224 and Keccak-256 are higher than the ones stated above. In addition, by using some simple message modification techniques within the subsets returned by the target difference algorithm, we were able to further reduce the workload of finding conforming message pairs. Thus, for Keccak-224, we obtained near-collisions with a Hamming distance of 5, which is the same as the output Hamming distance of the main characteristic that we used. For Keccak-256, the main characteristic that we used has an output Hamming distance of 8, but we were only able to find message pairs which give a near-collision with a slightly higher Hamming distance of 10. All of these near-collisions were found within a few days on a standard PC. Examples of such near-collisions are given in Appendix D.

5.4 Applying the Target Difference Algorithm to Other Versions of Keccak

Although there are only four official versions of Keccak, the hash function is defined for any capacity c and rate r such that c+r=1600 (and also for other state sizes, which we do not consider in this paper). Thus, it is interesting to apply the target difference algorithm to different versions of Keccak, and estimate the maximal value of c (or minimal value of r) for which the algorithm is able to solve random challenges.

We applied the target difference algorithm to Keccak with different values of c, on dozens of randomly chosen target differences. According to our simulations, the algorithm is able to quickly solve (within 10 minutes) a substantial fraction (at least 10%) of the challenges for values of c close to 650 (for which the algorithm has 300 degrees of freedom). For values of c which are larger than 650, the success rate of the algorithm seems to decrease rapidly as c grows. Thus, it is very unlikely that the current version of our algorithm can succeed in reasonable time to solve challenges for Keccak-384 (where c=768, and there are only a few dozen available degrees of freedom). For Keccak-512 (where c=1024), only a negligible fraction of target differences have a solution, and it is not possible to solve an overwhelming majority of them (without exploiting additional degrees of freedom in multi-block messages).
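The figures quoted above are consistent with a simple count, sketched below under the assumption that the number of available degrees of freedom is roughly 1600 − 2c (this formula is inferred here from the quoted values for c=650 and c=768, and it ignores the few bits consumed by padding). The capacities follow from c=2n for the official versions; the function names are introduced only for this illustration.

```python
STATE_BITS = 1600

# Capacity of each official version: c = 2n for output size n.
CAPACITY = {"Keccak-224": 448, "Keccak-256": 512,
            "Keccak-384": 768, "Keccak-512": 1024}


def rate(capacity):
    """Rate r such that c + r = 1600."""
    return STATE_BITS - capacity


def degrees_of_freedom(capacity):
    """Assumed estimate: 1600 - 2c, which may be negative.

    A negative value means that a random target difference is not
    expected to have any solution within a single block.
    """
    return STATE_BITS - 2 * capacity


ESTIMATES = {name: degrees_of_freedom(c) for name, c in CAPACITY.items()}
```

Under this estimate, Keccak-224 and Keccak-256 have several hundred degrees of freedom, Keccak-384 has 64, and Keccak-512 has a negative count, which matches the qualitative behaviour described above.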

Solving Keccak Collision Challenges

In 2011, the Keccak team announced challenges for 48 reduced-round Keccak instances [19]. By applying our techniques to two of these instances, we were able to obtain the best known collision attacks for three and four rounds (with a capacity of c=160 bits).

6 Conclusions and Future Work

In this paper, we presented practical collision and near-collision attacks on reduced-round variants of Keccak-224 and Keccak-256. Our attacks are based on a novel target difference algorithm, which is used to link high probability differential characteristics for the Keccak internal permutation to legal initial states of the hash function. Consequently, we were able to significantly improve the best known previous results on Keccak, by doubling (from 2 to 4) the number of rounds for which collisions can be found in a practical amount of time.

Our target difference algorithm is clearly limited by the number of available degrees of freedom, and it seems difficult to extend it to reach target differences spanning 2 or more rounds of the Keccak permutation. However, it seems very likely that the algorithm will be useful in the future if longer high probability differential characteristics are found for the Keccak permutation. In particular, finding new high probability differential characteristics, starting from a low Hamming weight state difference and extending forwards more than 3 rounds, remains a challenging task.