Cryptography and Communications, Volume 1, Issue 2, pp 135–173

A combinatorial analysis of recent attacks on step reduced SHA-2 family

Authors

  • Somitra Kumar Sanadhya
    • Applied Statistics Unit, Indian Statistical Institute
  • Palash Sarkar
    • Applied Statistics Unit, Indian Statistical Institute
Article

DOI: 10.1007/s12095-009-0011-5

Cite this article as:
Sanadhya, S.K. & Sarkar, P. Cryptogr. Commun. (2009) 1: 135. doi:10.1007/s12095-009-0011-5

Abstract

We perform a combinatorial analysis of the SHA-2 compression function. This analysis explains in a unified way the recent attacks against reduced round SHA-2. We start with a general class of local collisions and show that the previously used local collisions by Nikolić and Biryukov (NB) and Sanadhya and Sarkar (SS) are special cases. The study also clarifies several advantages of the SS local collision over the NB local collision. Deterministic constructions of up to 22-round SHA-2 collisions are described using the SS local collision and up to 21-round SHA-2 collisions are described using the NB local collision. For 23 and 24-round SHA-2, we describe a general strategy and then apply the SS local collision to this strategy. The resulting attacks are faster than those proposed by Indesteege et al. using the NB local collision. We provide colliding message pairs for 22, 23 and 24-round SHA-2. Although these attacks improve upon the existing reduced round SHA-256 attacks, they do not threaten the security of the full SHA-2 family.

Keywords

SHA-2 family · Reduced round collisions · Cryptanalysis

Mathematics Subject Classifications (2000)

94A60 Cryptography

1 Introduction

Collision resistant hash functions (CRHF) are of great practical importance in cryptography. Consequently, over the years, a lot of effort has been expended in the design and analysis of such functions. The most famous CRHFs are the SHA families standardized by NIST [21] in the USA. They follow the iterative Merkle-Damgård construction [3, 12] and build on the MD family of hash functions designed by Rivest.

A CRHF maps arbitrarily long strings to short fixed length strings. Consequently, collisions are bound to exist. Cryptanalysis of a CRHF consists of finding one such collision for the given CRHF. Since the description of the function is given, one needs to carefully analyse its structure in order to determine a collision. This necessitates a detailed combinatorial study of the function. One approach is to linearize the function by replacing all non-linear components with their best linear approximation. Finding a collision for such a linearized function is easy, but the collision holds for the original function only probabilistically. One then has to look for methods to increase the probability. Alternatively, one could work directly with the nonlinear function itself. This makes the analysis more difficult, but the probability of a collision is much higher.

Cryptanalysis of the MD-family and the SHA-family has been extensively studied with major successes coming at infrequent intervals. The first major success was the cryptanalysis of MD4 by Dobbertin [4, 5] which led to the exhibition of an actual colliding message pair. This was followed by partial attacks on MD5 with full cryptanalysis of MD5 and other hash functions coming recently [23, 25]. The NIST standard SHA-1 family was theoretically cryptanalysed in [24] (though, till date, a colliding message pair for SHA-1 remains to be found). Earlier, partial cryptanalysis of SHA-0 was done in [1, 2]. Following the works in [24, 25], there have been attacks [9, 22] on MD5 with improved time complexities and/or providing collisions of structured messages.

The SHA-2 family consists of two main hash functions, SHA-256 and SHA-512, and their truncated versions SHA-224 and SHA-384. In view of the existing attacks, the only surviving family in the NIST standard is the SHA-2 family. Consequently, it is of interest to analyse the SHA-2 family. Cryptanalysis of the SHA-2 family has recently gained momentum due to the important work of Nikolić and Biryukov [13]. Prior work on finding collisions for step reduced SHA-256 was done in [10, 11] and [16]. These earlier works used local collisions valid for the linearized version of SHA-256 from [6] and [15]. On the other hand, the work [13] used a local collision which is valid for the actual SHA-256.

The authors in [13] developed techniques to handle nonlinear functions and the message expansion of SHA-2 to obtain collisions for up to 21-round SHA-256. The 21-round attack of [13] succeeded with probability \(2^{-19}\). Very recently, Indesteege et al. [7] have developed attacks against the 23 and 24-round SHA-2 family. They utilize the local collision from [13] in these attacks. Following the work of [13] and partly in parallel to [7], we have published several papers [14, 18–20] on finding SHA-2 collisions for up to 24 steps with time complexities better than those obtained in [7]. The current work subsumes our previous works and provides a unified combinatorial analysis of the attacks. More details are given below.

Our contributions

We take a general approach to the analysis of the SHA-2 family. The set of all possible 9-round local collisions using additive differentials is analysed using a general and unified framework. The expressions are simplified in a systematic manner, which leads us to the local collisions from [13] and [20] as special cases. We will call these the NB and SS local collisions respectively.

We show that it is possible to deterministically construct up to 22-round SHA-2 collisions using the SS local collision and up to 21-round SHA-2 collisions using the NB local collision. A general method for obtaining collisions for 23 and 24-round SHA-2 is described. This method can be applied with both the NB and the SS local collisions. From the analysis it becomes clear that the SS local collision offers certain advantages over the NB local collision. Hence, we focus on the SS local collision, which leads to 23 and 24-round collisions with better time complexities than those obtained using the NB local collision. A summary of results on collision attacks against the reduced SHA-2 family is given in Table 1. Examples of 22, 23 and 24-round SHA-256 and SHA-512 collisions are presented in Appendix A.
Table 1

Summary of results against reduced SHA-2 family

| Work | Hash function | Steps | Effort (Prob.) | Effort (Calls) | Local collision utilized | Attack type | Example provided |
|---|---|---|---|---|---|---|---|
| [10, 11] | SHA-256 | 18 | | (a) | GH [6] | Linear | Yes |
| [16] | SHA-256 | 18 | (b) | | SS5 [15] | Linear | Yes |
| [13] | SHA-256 | 20 | \(\frac{1}{3}\) | | NB [13] | Non-linear | Yes |
| | | 21 | \(2^{-19}\) | | NB [13] | Non-linear | Yes |
| [20] | SHA-256/SHA-512 | 18, 20 | 1 | 1 | SS [20] | Non-linear | Yes |
| | SHA-256 | 21 | \(2^{-15}\) | | SS [20] | Non-linear | Yes |
| [18] | SHA-256/SHA-512 | 21 | 1 | 1 | SS [20] | Non-linear | Yes |
| [7] | SHA-256 | 23 | | \(2^{18}\) | NB [13] | Non-linear | Yes |
| | | 24 | | \(2^{28.5}\) | NB [13] | Non-linear | Yes |
| | SHA-512 | 23 | | \(2^{44.9}\) | NB [13] | Non-linear | Yes |
| | | 24 | | \(2^{53}\) | NB [13] | Non-linear | No |
| This work | SHA-256/SHA-512 | 22 | 1 | 1 | SS [20] | Non-linear | Yes |
| | SHA-256 | 23 | | \(2^{11.5}\) | SS [20] | Non-linear | Yes |
| | | 24 | | \(2^{28.5}\) | SS [20] | Non-linear | Yes |
| | | 24 | | \(2^{15.5}\) (c) | SS [20] | Non-linear | No |
| | SHA-512 | 23 | | \(2^{16.5}\) | SS [20] | Non-linear | Yes |
| | | 24 | | \(2^{32.5}\) | SS [20] | Non-linear | Yes |
| | | 24 | | \(2^{22.5}\) (d) | SS [20] | Non-linear | No |

Effort is expressed as either the probability of success or as the number of calls to the respective reduced round hash function. A blank cell in the first two columns repeats the entry above it.

(a) It is mentioned in [10, 11] that the effort is \(2^{0}\) but no details are provided.

(b) Effort is given as running a C-program for about 30–40 min on a standard PC.

(c) A table containing \(2^{32}\) entries, each entry of size 8 bytes, is required.

(d) A table containing \(2^{64}\) entries, each entry of size 16 bytes, is required.

We highlight the case of the 23 and 24-round SHA-512 attacks from Table 1. These are considerably improved in comparison to the existing attack of [7]. While [7] describes these attacks with reported complexities of \(2^{44.9}\) and \(2^{53}\) calls to the corresponding functions, our attacks have complexities of \(2^{16.5}\) and \(2^{32.5}\) calls. In fact, the improvement in the time complexity of the 24-round SHA-512 attack allows us to provide the first message pair which collides for 24-round SHA-512.

Chronology of recent attacks on SHA-2

Nikolić and Biryukov [13] started the analysis of SHA-2 using nonlinear differentials and attacked up to 21-round SHA-256. Our work was motivated by theirs. We generalize their technique and use a different local collision with certain advantages over the NB local collision. Also, we extend the number of rounds that can be attacked to 24.

The work [7], which circulated in several earlier versions [8] before being published as [7], was done independently and in parallel to ours [14, 18–20]. It used the NB local collision. The chronological sequence of our work and that of the different versions of [8] for obtaining 22 to 24-round SHA-2 collisions is the following.
  1. Our work [14, 08-Mar-2008] provided the first example of colliding message pairs for 22-round SHA-2.

  2. The version [8, 08-Apr-2008] provided the first examples of colliding message pairs for 23 and 24-round SHA-256.

  3. Our report [17, 12-Jun-2008] provided examples of colliding message pairs for 23 and 24-round SHA-256 with improved time complexities.

  4. The version [8, 14-Jul-2008] provided the first examples of colliding message pairs for 23-round SHA-512 and a theoretical attack on 24-round SHA-512 with reported time complexity of \(2^{53}\) calls to the compression function.

  5. Our paper [19] provides an example of a colliding message pair for 23-round SHA-512 with improved time complexity, and the first example of a colliding message pair for 24-round SHA-512 (also with improved time complexity).

As mentioned earlier, the current work subsumes our earlier works [14, 18–20], providing a unified view of the attacks. More generally, the framework of our combinatorial analysis explains how things fit together. We believe that the approach we take leads to a better understanding of the combinatorial structure of the SHA-2 family.

2 Preliminaries

We will use the following notation:
  • Message words: \(W_i \in \{0,1\}^{n}\), \(W^{\prime}_i \in \{0,1\}^{n}\); n is 32 for SHA-256 and 64 for SHA-512.

  • Colliding message pair: \(\{W_0, W_1, W_2, \ldots, W_{15}\}\) and \(\{W^{\prime}_0, W^{\prime}_1, W^{\prime}_2, \ldots, W^{\prime}_{15}\}\).

  • Expanded message pair: \(\{W_0, W_1, W_2, \ldots, W_{N-1}\}\) and \(\{W^{\prime}_0, W^{\prime}_1, W^{\prime}_2, \ldots, W^{\prime}_{N-1}\}\). The number of steps N is 64 for SHA-256 and 80 for SHA-512.

  • The internal registers for the two messages at step i: \({\rm{REG}}_i = \{a_i, \ldots, h_i\}\) and \({\rm{REG}}^{\prime}_i = \{a^{\prime}_i, \ldots, h^{\prime}_i\}\).

  • \(ROTR^{k}(x)\): Right rotation of an n-bit string x by k bits.

  • \(SHR^{k}(x)\): Right shift of an n-bit string x by k bits.

  • ⊕: bitwise XOR.

  • +, −: addition and subtraction modulo \(2^{n}\).

  • \(\delta X = X^{\prime} - X\), where X is an n-bit quantity.

2.1 SHA-2 compression function

The complete description of the SHA-2 hash family can be found in [21]. In this work, we will need only the compression function. A description is given below.

The input to the compression function consists of 8 n-bit registers and a message block which consists of 16 n-bit words. The output consists of 8 n-bit words. For the first message block, the values of the input registers are given by 8 fixed n-bit words called the initialization vector and for later message blocks, these values are the output of the previous invocation of the compression function.

The message block is expanded from 16 n-bit words \(W_0,\ldots,W_{15}\) to N n-bit words \(W_0,\ldots,W_{N-1}\). A round function is applied N times. Each application updates the values of the registers. In Step i with 0 ≤ i ≤ N − 1, the 8 registers are updated from \((a_{i-1}, b_{i-1}, c_{i-1}, d_{i-1}, e_{i-1}, f_{i-1}, g_{i-1}, h_{i-1})\) to \((a_i, b_i, c_i, d_i, e_i, f_i, g_i, h_i)\) as follows, where \((a_{-1},\ldots,h_{-1})\) denotes the initial value of the registers.
$$\left . \begin{array}{lll} a_{i} &=& \Sigma_0 (a_{i-1}) + f_{MAJ}(a_{i-1},b_{i-1},c_{i-1}) + \Sigma_1 (e_{i-1}) + f_{IF}(e_{i-1},f_{i-1},g_{i-1}) + h_{i-1} + K_i + W_i \\[2pt] b_{i} &=& a_{i-1} \\[2pt] c_{i} &=& b_{i-1} \\[2pt] d_{i} &=& c_{i-1} \\[2pt] e_{i} &=& d_{i-1} + \Sigma_1 (e_{i-1}) + f_{IF}(e_{i-1},f_{i-1},g_{i-1}) + h_{i-1} + K_i + W_i \\[2pt] f_{i} &=& e_{i-1} \\[2pt] g_{i} &=& f_{i-1} \\[2pt] h_{i} &=& g_{i-1} \end{array} \right \}$$
(1)
The functions \(f_{IF}\) and \(f_{MAJ}\) are three-variable boolean functions defined as:
$$\begin{array}{lll} f_{IF}(x,y,z) &=& (x \wedge y) \oplus (\neg x \wedge z), \\ f_{MAJ}(x,y,z) &=& (x \wedge y) \oplus (y \wedge z) \oplus (z \wedge x). \end{array} $$
For SHA-256, the functions Σ0 and Σ1 are defined as:
$$\begin{array}{lllllll} \Sigma_0(x) &=& ROTR^{2}(x) &\oplus& ROTR^{13}(x) &\oplus& ROTR^{22}(x), \\ \Sigma_1(x) &=& ROTR^{6}(x) &\oplus& ROTR^{11}(x) &\oplus& ROTR^{25}(x). \\ \end{array}$$
For SHA-512, the corresponding functions are:
$$\begin{array}{lllllll} \Sigma_0(x) &=& ROTR^{28}(x) &\oplus& ROTR^{34}(x) &\oplus& ROTR^{39}(x), \\ \Sigma_1(x) &=& ROTR^{14}(x) &\oplus& ROTR^{18}(x) &\oplus& ROTR^{41}(x). \\ \end{array} $$
Given the message words \(W_0, W_1, \ldots, W_{15}\), for i ≥ 16 the word \(W_i\) is computed as follows.
$$W_i = \sigma_1(W_{i-2}) + W_{i-7} + \sigma_0(W_{i-15}) + W_{i-16}$$
(2)
For SHA-256, the functions σ0 and σ1 are defined as:
$$\begin{array}{lllllll} \sigma_0(x) &=& ROTR^{7}(x) &\oplus& ROTR^{18}(x) &\oplus& SHR^{3}(x), \\ \sigma_1(x) &=& ROTR^{17}(x) &\oplus& ROTR^{19}(x) &\oplus& SHR^{10}(x). \\ \end{array}$$
And for SHA-512, they are defined as:
$$\begin{array}{lllllll} \sigma_0(x) &=& ROTR^{1}(x) &\oplus& ROTR^{8}(x) &\oplus& SHR^{7}(x), \\ \sigma_1(x) &=& ROTR^{19}(x) &\oplus& ROTR^{61}(x) &\oplus& SHR^{6}(x). \\ \end{array}$$

The final output of the compression function is \((a_{-1} + a_{N-1},\ldots,h_{-1} + h_{N-1})\). Adding the initial values \((a_{-1},\ldots,h_{-1})\) to the output of the final application of the round function is called feed-forward.
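The description above can be turned into a short executable sketch for SHA-256 (n = 32). This is an illustration under stated assumptions, not the authors' code: the helper names (`step`, `expand`, `compress`, `sha256_single_block`) are ours, and the round constants and initialization vector are derived from cube and square roots of the first primes, as the specification [21] defines them, rather than being typed in. The `n_steps` parameter gives the reduced round variant discussed next.

```python
import hashlib
from math import isqrt

MASK = 0xFFFFFFFF  # all arithmetic is modulo 2^32 (SHA-256, n = 32)

def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & MASK

def Sigma0(x): return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)
def Sigma1(x): return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)
def sigma0(x): return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)
def sigma1(x): return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)
def f_if(x, y, z): return (x & y) ^ (~x & z & MASK)
def f_maj(x, y, z): return (x & y) ^ (y & z) ^ (z & x)

def icbrt(n):
    """Floor of the cube root of a non-negative integer (binary search)."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def first_primes(count):
    found, c = [], 2
    while len(found) < count:
        if all(c % p for p in found):
            found.append(c)
        c += 1
    return found

PRIMES = first_primes(64)
K = [icbrt(p << 96) & MASK for p in PRIMES]       # fractional parts of cube roots
IV = [isqrt(p << 64) & MASK for p in PRIMES[:8]]  # fractional parts of square roots

def expand(block_words, n_steps):
    """Message expansion, recursion (2)."""
    w = list(block_words)
    for i in range(16, n_steps):
        w.append((sigma1(w[i - 2]) + w[i - 7] + sigma0(w[i - 15]) + w[i - 16]) & MASK)
    return w[:n_steps]

def step(reg, k, w):
    """One application of the round function (1)."""
    a, b, c, d, e, f, g, h = reg
    t = (Sigma1(e) + f_if(e, f, g) + h + k + w) & MASK
    return ((Sigma0(a) + f_maj(a, b, c) + t) & MASK, a, b, c, (d + t) & MASK, e, f, g)

def compress(iv, block_words, n_steps=64):
    """Compression function with n_steps rounds, including the feed-forward."""
    reg = tuple(iv)
    for k, w in zip(K, expand(block_words, n_steps)):
        reg = step(reg, k, w)
    return [(x + y) & MASK for x, y in zip(iv, reg)]

def sha256_single_block(msg):
    """Full SHA-256 of a message of at most 55 bytes (one padded block)."""
    assert len(msg) <= 55
    block = msg + b"\x80" + b"\x00" * (55 - len(msg)) + (8 * len(msg)).to_bytes(8, "big")
    words = [int.from_bytes(block[4 * j:4 * j + 4], "big") for j in range(16)]
    return b"".join(v.to_bytes(4, "big") for v in compress(IV, words))

# Self-check against the standard library implementation.
assert sha256_single_block(b"abc") == hashlib.sha256(b"abc").digest()
```

Calling `compress(IV, words, n_steps=24)` evaluates a 24-step variant with everything else unchanged.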

Reduced Round SHA-2

The value of N is fixed by the specification [21]. For the purpose of analysis, one may work with a lower value of N. In this paper, we will work with N up to 24. Everything else in the compression function, including the feed-forward, remains the same. In fact, the feed-forward plays no role in what follows, since we obtain collisions for several steps of the round function itself.

2.2 Cross dependence equation (CDE)

By the form of the round update function in (1), we have the following relation.
$$e_i = a_i + a_{i-4} -\Sigma_0(a_{i-1})-f_{MAJ}(a_{i-1},a_{i-2},a_{i-3}).$$
(3)
Later, we make extensive use of this relation. A special case of this equation was utilized in Section 6.1 of [20]. The equation in the form above was used in [18]. This equation can be used to show that the SHA-2 state update can be rewritten in terms of only one state variable. This fact was independently observed in [7].
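For completeness, relation (3) can be derived from (1) in two lines. Subtracting the update for \(e_i\) from the update for \(a_i\) cancels every term that involves the e, f, g, h registers, the constant and the message word:

$$\begin{array}{lll} a_i - e_i &=& \Sigma_0(a_{i-1}) + f_{MAJ}(a_{i-1},b_{i-1},c_{i-1}) - d_{i-1}. \end{array}$$

The register shifts in (1) give \(b_{i-1} = a_{i-2}\), \(c_{i-1} = a_{i-3}\) and \(d_{i-1} = a_{i-4}\), so that

$$\begin{array}{lll} e_i &=& a_i + a_{i-4} - \Sigma_0(a_{i-1}) - f_{MAJ}(a_{i-1},a_{i-2},a_{i-3}), \end{array}$$

which is (3).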

The following result can be used to set registers to specific values.

Proposition 1

Suppose that \((a_{i-1},\ldots,h_{i-1})\) are known and α and β are any two n-bit words. Then it is possible to choose \(W_i\) such that either \(a_i = \alpha\) or \(e_i = \beta\). In general, however, using only \(W_i\), it is not possible to simultaneously set both \(a_i\) to α and \(e_i\) to β.

Proof

This is an easy consequence of (1). Consider the equation for \(a_i\). This is given in terms of \((a_{i-1},\ldots,h_{i-1})\) and \(W_i\). So, if we set
$$\begin{array}{lll} W_i&=&\alpha-(\Sigma_0 (a_{i-1}) + f_{MAJ}(a_{i-1},b_{i-1},c_{i-1}) + \Sigma_1 (e_{i-1})\\ && + f_{IF}(e_{i-1},f_{i-1},g_{i-1}) + h_{i-1} + K_i), \end{array}$$
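then clearly \(a_i = \alpha\) is attained. Similarly for \(e_i\).

Note, however, that using \(W_i\) alone, we cannot simultaneously set the values of both \(a_i\) and \(e_i\): by (1), the difference \(a_i - e_i\) does not depend on \(W_i\). □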

Even though we cannot use Proposition 1 to simultaneously set the values of \(a_i\) and \(e_i\), there is a way out, and it is given by the CDE. Suppose the values that \(a_{i-3},\ldots,a_i\) are to take have already been fixed, but \(a_{i-4}\) is still free. Then by choosing a suitable value for \(a_{i-4}\) we can attain any desired value for \(e_i\). Now, using Proposition 1, we can use \(W_{i-4}\) to set \(a_{i-4}\) to the required value. So, in effect, we can use \(W_{i-4}\) to set \(e_i\) to any desired value. This is convenient from a cryptanalytic point of view, and somewhat unexpected; we use this feature extensively.
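The following sketch illustrates Proposition 1 and the CDE-based trick for SHA-256. It is our own illustration: the function names are hypothetical, and random 32-bit values stand in for the round constants, since the argument does not depend on their actual values.

```python
import random

MASK = 0xFFFFFFFF

def rotr(x, k): return ((x >> k) | (x << (32 - k))) & MASK
def Sigma0(x): return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)
def Sigma1(x): return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)
def f_if(x, y, z): return (x & y) ^ (~x & z & MASK)
def f_maj(x, y, z): return (x & y) ^ (y & z) ^ (z & x)

def step(reg, k, w):
    """One application of the round function (1)."""
    a, b, c, d, e, f, g, h = reg
    t = (Sigma1(e) + f_if(e, f, g) + h + k + w) & MASK
    return ((Sigma0(a) + f_maj(a, b, c) + t) & MASK, a, b, c, (d + t) & MASK, e, f, g)

def w_for_a(reg, k, alpha):
    """Proposition 1: message word that forces the next a-register to alpha."""
    a, b, c, d, e, f, g, h = reg
    return (alpha - (Sigma0(a) + f_maj(a, b, c) + Sigma1(e) + f_if(e, f, g) + h + k)) & MASK

def w_for_e(reg, k, beta):
    """Proposition 1: message word that forces the next e-register to beta."""
    a, b, c, d, e, f, g, h = reg
    return (beta - (d + Sigma1(e) + f_if(e, f, g) + h + k)) & MASK

def a_needed_for_e(beta, a_i, a_im1, a_im2, a_im3):
    """CDE (3) solved for a_{i-4}, given the target e_i = beta."""
    return (beta - a_i + Sigma0(a_im1) + f_maj(a_im1, a_im2, a_im3)) & MASK

rng = random.Random(1)
start = tuple(rng.getrandbits(32) for _ in range(8))   # registers before step i-4
ks = [rng.getrandbits(32) for _ in range(5)]           # stand-ins for K_{i-4},...,K_i
a_im3, a_im2, a_im1, a_i = (rng.getrandbits(32) for _ in range(4))
beta = rng.getrandbits(32)                             # desired value of e_i

# Choose a_{i-4} through the CDE, then force all five a-values with W_{i-4},...,W_i.
a_im4 = a_needed_for_e(beta, a_i, a_im1, a_im2, a_im3)
reg = start
for k, alpha in zip(ks, [a_im4, a_im3, a_im2, a_im1, a_i]):
    reg = step(reg, k, w_for_a(reg, k, alpha))

print(reg[0] == a_i, reg[4] == beta)
```

Both `a_i` and `e_i` end up at their targets, although each message word was spent on an a-register only.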

2.3 Differential properties of σ1

For the analysis of 23 and 24-round SHA-2, we will need to consider the differential properties of σ1 with respect to modular addition. The particular property that we require is discussed in this section.

SHA-256

Consider the distribution of \(\delta = \sigma_1(W) - \sigma_1(W-1)\) as W ranges over all \(2^{32}\) values. This distribution is highly skewed and was mentioned in Section 7.1 of [20]. Later, it was independently observed in [7] that δ takes only 6181 distinct values and that several values of δ occur for \(2^{29}\) or more values of W.

Let \(\textsf {freq}_{\delta}\) be the number of W such that \(\delta = \sigma_1(W) - \sigma_1(W-1)\). It is quite easy to prepare a list of \((\delta,\textsf {freq}_{\delta})\) values. For each of the \(2^{32}\) values of W, compute \(\delta = \sigma_1(W) - \sigma_1(W-1)\). If this δ has been obtained earlier, then increment the frequency for this δ; else insert \((\delta,\textsf {freq}_{\delta}=1)\) into the list. To do this efficiently, we need a suitable index structure for searching and inserting into the list. A height balanced tree (or AVL tree) is the optimal solution; but, for the current application, a simple (data structure) hash technique is good enough and is the technique we implemented. Some values of \((\delta,\textsf {freq}_{\delta})\) are given in Table 2.
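The counting procedure just described can be sketched in a few lines. This is an illustrative sketch, not the implementation used for the paper: it uses Python's hash-based `Counter` as the index structure, and it processes a random sample of \(2^{16}\) words instead of all \(2^{32}\), which is an assumption made only to keep the running time short. The sample size and seed are arbitrary.

```python
import random
from collections import Counter

MASK = 0xFFFFFFFF

def rotr(x, k): return ((x >> k) | (x << (32 - k))) & MASK
def sigma1(x): return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)

def delta_histogram(words):
    """Count how often each delta = sigma1(W) - sigma1(W - 1) (mod 2^32) occurs."""
    freq = Counter()
    for w in words:
        freq[(sigma1(w) - sigma1((w - 1) & MASK)) & MASK] += 1
    return freq

SAMPLE = 1 << 16                      # assumption: a sample, not the full 2^32 range
rng = random.Random(2009)
hist = delta_histogram(rng.getrandbits(32) for _ in range(SAMPLE))

top4 = [delta for delta, _ in hist.most_common(4)]
for delta, count in hist.most_common(4):
    print(f"{delta:08x}  observed {count} of {SAMPLE}")
```

Replacing the sampled generator by `range(1 << 32)` gives the exhaustive list, at a correspondingly higher cost in time and memory.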
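Table 2

Some examples of high frequency values of \(\delta = \sigma_1(W) - \sigma_1(W-1)\) for SHA-256

| δ | \(\textsf {freq}_{\delta}\) | δ | \(\textsf {freq}_{\delta}\) |
|---|---|---|---|
| ffff6000 | \(2^{29} + 2^{26} + 2^{25}\) | 0000a000 | \(2^{29} + 2^{26} + 2^{25}\) |
| ffffa000 | \(2^{29} + 2^{26}\) | 00006000 | \(2^{29} + 2^{26}\) |
| ff006001 | \(2^{16}\) | ff005fff | \(2^{16}\) |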

Note

Interestingly, we have observed that if \(\textsf {freq}_{\delta}\) is greater than \(2^{16}\), then δ is always even.

SHA-512

In this case, n = 64 and it is not possible to exhaustively prepare a list of values of \(\delta = \sigma_1(W) - \sigma_1(W-1)\) for all \(2^{64}\) possible values of W. Instead, we created a list using \(2^{25}\) randomly chosen values of W. This provides certain values of δ with certain frequencies. From these frequencies we extrapolate to estimate the actual frequency of each δ among all the \(2^{64}\) choices of W. The extrapolation is done in the following manner. If a particular difference δ occurs κ times in \(2^{25}\) random trials, then we expect it to have a frequency \(\textsf {freq}_{\delta}\) of about \(\kappa \times 2^{64}/2^{25}\). Some of the observed and the extrapolated frequencies are shown in Table 3.
Table 3

Some examples of high frequency values of \(\delta = \sigma_1(W) - \sigma_1(W-1)\) for SHA-512

| δ | \(\textsf {freq}_{\mbox{o}}\) | \(\textsf {freq}_{\delta}\) | δ | \(\textsf {freq}_{\mbox{o}}\) | \(\textsf {freq}_{\delta}\) |
|---|---|---|---|---|---|
| 200000000008 | 4795491 | \(2^{61.5}\) | 8e000000003a9 | 22 | \(2^{43.5}\) |
| ffffdffffffffff8 | 4793201 | \(2^{61.5}\) | fff26000000000c9 | 22 | \(2^{43.5}\) |
| 1ffffffffff8 | 4792982 | \(2^{61.5}\) | 600000000237 | 18 | \(2^{43.5}\) |

The column \(\textsf {freq}_{\mbox{o}}\) denotes the observed frequencies among \(2^{25}\) random trials of computing δ. The column \(\textsf {freq}_{\delta}\) contains the extrapolated values of the frequencies for the complete search space of \(2^{64}\).
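The sampling and extrapolation for SHA-512 can be sketched as follows. This is a hedged illustration rather than the original experiment: the number of trials (\(2^{14}\) here instead of \(2^{25}\)), the seed and the function names are our assumptions, chosen so that the sketch runs in well under a second.

```python
import random
from collections import Counter

MASK64 = (1 << 64) - 1

def rotr64(x, k): return ((x >> k) | (x << (64 - k))) & MASK64
def sigma1_512(x): return rotr64(x, 19) ^ rotr64(x, 61) ^ (x >> 6)

def sample_deltas(trials, rng):
    """Observed frequencies of delta = sigma1(W) - sigma1(W - 1) over random W."""
    seen = Counter()
    for _ in range(trials):
        w = rng.getrandbits(64)
        seen[(sigma1_512(w) - sigma1_512((w - 1) & MASK64)) & MASK64] += 1
    return seen

def extrapolate(kappa, trials, bits=64):
    """Estimated frequency over all 2^bits words: kappa * 2^bits / trials."""
    return kappa * (1 << bits) / trials

TRIALS = 1 << 14                      # assumption: far fewer trials than in the text
observed = sample_deltas(TRIALS, random.Random(512))
top4_512 = [delta for delta, _ in observed.most_common(4)]
for delta, kappa in observed.most_common(4):
    print(f"{delta:016x}  kappa={kappa}  estimate={extrapolate(kappa, TRIALS):.3e}")
```

With so few trials only the most frequent differences are estimated reliably; rare differences such as those in the right half of Table 3 need a much larger sample.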

3 A general non-linear differential path

In this section, we present a description of a general differential path. This description is given in terms of several variables w, x, y and z and the message differences \(\delta W_i,\ldots,\delta W_{i+8}\). The values of the δW's are obtained in terms of w, x, y and z so that the differential path holds. Starting from this general description, we obtain conditions to simplify the expressions for the variables and the δW's. This leads to the analysis of special cases of the general differential path.

The process of moving from the general to the specific is done in several steps. First, we simplify the expressions for the variables and the δW's. Next, we try to set as many of the δW's to zero as possible. To this end, we obtain conditions for \(\delta W_{i+4},\ldots,\delta W_{i+7}\) to be set to zero. These lead to two special cases of the differential path which can be used to find reduced round collisions.

We use a differential technique to find a 9-round local collision. The idea is to use modular differentials, which were first used for SHA-2 by Nikolić and Biryukov [13]. Given a word w, we define
$$x=-\delta\Sigma_0^i(w)-\delta f_{MAJ}^i(w,0,0); \ \ y=-\delta f_{MAJ}^{i+1}(0,w,0);\ \ z=-\delta f_{MAJ}^{i+2}(0,0,w).$$
(4)
For n-bit words α, β, γ and an integer i, we use the following shorthand.
$$\left.\begin{array}{rcl} \delta\Sigma_1^i(\alpha) & = & \Sigma_1(e_i+\alpha)-\Sigma_1(e_i) = \Sigma_1(e_i^{\prime}) - \Sigma_1(e_i). \\[2pt] \delta\Sigma_0^i(\alpha) & = & \Sigma_0(a_i+\alpha)-\Sigma_0(a_i) = \Sigma_0(a_i^{\prime}) - \Sigma_0(a_i). \\[2pt] \delta f_{IF}^i(\alpha,\beta,\gamma) & = & f_{IF}(e_i+\alpha,f_i+\beta,g_i+\gamma)-f_{IF}(e_i,f_i,g_i)\\[2pt] &= &f_{IF}(e_i^{\prime},f_i^{\prime},g_i^{\prime})-f_{IF}(e_i,f_i,g_i). \\[2pt] \delta f_{MAJ}^i(\alpha,\beta,\gamma) & = & f_{MAJ}(a_i+\alpha,b_i+\beta,c_i+\gamma)-f_{MAJ}(a_i,b_i,c_i)\\[2pt] &=& f_{MAJ}(a_i^{\prime},b_i^{\prime},c_i^{\prime})-f_{MAJ}(a_i,b_i,c_i). \\[2pt] \delta\sigma_0(\delta W_i) & = & \sigma_0(W_i+\delta W_i)-\sigma_0(W_i) = \sigma_0(W_i^{\prime}) - \sigma_0(W_i). \\[2pt] \delta\sigma_1(\delta W_i) & = & \sigma_1(W_i+\delta W_i)-\sigma_1(W_i) = \sigma_1(W_i^{\prime}) - \sigma_1(W_i). \end{array}\right\} $$
(5)
The general differential path and the corresponding message differences are shown in Table 4. It can be verified that the differential path holds for the stated message differences. We show the first step of the computation; the other steps are similar. In the (i + 1)st step we want \(\delta a_{i+1} = 0\) and \(\delta e_{i+1} = x\). The given values of x and \(\delta W_{i+1}\) ensure that these two conditions hold. Note that the values of the other registers are fixed by the values of the registers at the ith step.
$$\begin{array}{lll} \delta a_{i+1} & = & a_{i+1}^{\prime}-a_{i+1} \\ & = & (\Sigma_0 (a_{i}^{\prime}) + f_{MAJ}(a_{i}^{\prime},b_{i}^{\prime},c_{i}^{\prime}) + \Sigma_1 (e_{i}^{\prime}) + f_{IF}(e_{i}^{\prime},f_{i}^{\prime},g_{i}^{\prime}) + h_{i}^{\prime} + K_{i+1}^{\prime} + W_{i+1}^{\prime}) \\ & & -(\Sigma_0 (a_{i}) + f_{MAJ}(a_{i},b_{i},c_{i}) + \Sigma_1 (e_{i}) + f_{IF}(e_{i},f_{i},g_{i}) + h_{i} + K_{i+1} + W_{i+1}) \\ & = & (\Sigma_0 (a_{i}^{\prime})-\Sigma_0 (a_{i})) + (f_{MAJ}(a_{i}^{\prime},b_{i}^{\prime},c_{i}^{\prime})-f_{MAJ}(a_{i},b_{i},c_{i})) \\ & & +(\Sigma_1 (e_{i}^{\prime})-\Sigma_1 (e_{i})) +(f_{IF}(e_{i}^{\prime},f_{i}^{\prime},g_{i}^{\prime})-f_{IF}(e_{i},f_{i},g_{i})) +(W_{i+1}^{\prime}-W_{i+1}) \\ & = & \delta\Sigma_0^i(w)+\delta f_{MAJ}^i(w,0,0) +\delta\Sigma_1^i(w)+\delta f_{IF}^i(w,0,0)+\delta W_{i+1} \\ & = & -x +(\delta\Sigma_1^i(w)+\delta f_{IF}^i(w,0,0))+ (x-\delta\Sigma_1^i(w)-\delta f_{IF}^i(w,0,0)) \\ & = & 0 \\ \delta e_{i+1} & = & e_{i+1}^{\prime}-e_{i+1} \\ & = & (\Sigma_1 (e_{i}^{\prime}) + f_{IF}(e_{i}^{\prime},f_{i}^{\prime},g_{i}^{\prime})+ h_{i}^{\prime} + K_{i+1}^{\prime} + W_{i+1}^{\prime}) \\ & & -(\Sigma_1 (e_{i}) + f_{IF}(e_{i},f_{i},g_{i})+ h_{i} + K_{i+1} + W_{i+1}) \\ & = & (\Sigma_1 (e_{i}^{\prime})-\Sigma_1 (e_{i})) +(f_{IF}(e_{i}^{\prime},f_{i}^{\prime},g_{i}^{\prime})-f_{IF}(e_{i},f_{i},g_{i})) +(W_{i+1}^{\prime}-W_{i+1}) \\ & = & \delta\Sigma_1^i(w)+\delta f_{IF}^i(w,0,0)+\delta W_{i+1} \\ & = & \delta\Sigma_1^i(w)+\delta f_{IF}^i(w,0,0)+x-\delta\Sigma_1^i(w)-\delta f_{IF}^i(w,0,0) \\ & = & x. \end{array}$$
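Table 4

General 9-round nonlinear local collision for SHA-256

Differential Path

| Step | \(\delta W\) | \(\delta a\) | \(\delta b\) | \(\delta c\) | \(\delta d\) | \(\delta e\) | \(\delta f\) | \(\delta g\) | \(\delta h\) |
|---|---|---|---|---|---|---|---|---|---|
| i − 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| i | w | w | 0 | 0 | 0 | w | 0 | 0 | 0 |
| i + 1 | \(\delta W_{i+1}\) | 0 | w | 0 | 0 | x | w | 0 | 0 |
| i + 2 | \(\delta W_{i+2}\) | 0 | 0 | w | 0 | y | x | w | 0 |
| i + 3 | \(\delta W_{i+3}\) | 0 | 0 | 0 | w | z | y | x | w |
| i + 4 | \(\delta W_{i+4}\) | 0 | 0 | 0 | 0 | w | z | y | x |
| i + 5 | \(\delta W_{i+5}\) | 0 | 0 | 0 | 0 | 0 | w | z | y |
| i + 6 | \(\delta W_{i+6}\) | 0 | 0 | 0 | 0 | 0 | 0 | w | z |
| i + 7 | \(\delta W_{i+7}\) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | w |
| i + 8 | \(\delta W_{i+8}\) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Message Word Differences

$$\begin{array}{lll} \delta W_{i} &=& w; \\ \delta W_{i+1} &=& x-\delta\Sigma_1^i(w)-\delta f_{IF}^i(w,0,0); \\ \delta W_{i+2} &=& y-\delta\Sigma_1^{i+1}(x)-\delta f_{IF}^{i+1}(x,w,0); \\ \delta W_{i+3} &=& z-\delta\Sigma_1^{i+2}(y)-\delta f_{IF}^{i+2}(y,x,w); \\ \delta W_{i+4} &=& -w-\delta\Sigma_1^{i+3}(z)-\delta f_{IF}^{i+3}(z,y,x); \\ \delta W_{i+5} &=& -x-\delta\Sigma_1^{i+4}(w)-\delta f_{IF}^{i+4}(w,z,y); \\ \delta W_{i+6} &=& -y-\delta f_{IF}^{i+5}(0,w,z); \\ \delta W_{i+7} &=& -z-\delta f_{IF}^{i+6}(0,0,w); \\ \delta W_{i+8} &=& -w. \end{array}$$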

The important thing to note about the differential path shown in Table 4 is that it puts no restrictions on the actual message words \(W_i,\ldots,W_{i+8}\). Starting at any value for the registers \(a_i\) to \(h_i\), and using any given non-zero w and any \(W_i,\ldots,W_{i+8}\), we simply run the compression function step by step and define the words x, y, z, the respective \(\delta W_i\)s and consequently the respective \(W_i^{\prime}\)s. All the steps are deterministic and hence, with probability one, we obtain \(W_i^{\prime}\)s which collide with the \(W_i\)s. This gives rise to a local collision.
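The step-by-step procedure can be sketched for SHA-256 as below. This is our own illustration, with hypothetical names and random stand-ins for the round constants. Instead of evaluating the nine formulas of Table 4 one at a time, the sketch forces \(\delta a_i = w\) and \(\delta a_{i+1} = \cdots = \delta a_{i+8} = 0\) by the choice of each \(W^{\prime}\). Since the message difference that achieves a given \(\delta a\) is unique at every step, this yields the same message differences as Table 4, and the values x, y, z appear automatically in the e-register.

```python
import random

MASK = 0xFFFFFFFF

def rotr(x, k): return ((x >> k) | (x << (32 - k))) & MASK
def Sigma0(x): return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)
def Sigma1(x): return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)
def f_if(x, y, z): return (x & y) ^ (~x & z & MASK)
def f_maj(x, y, z): return (x & y) ^ (y & z) ^ (z & x)

def step(reg, k, w):
    """One application of the round function (1)."""
    a, b, c, d, e, f, g, h = reg
    t = (Sigma1(e) + f_if(e, f, g) + h + k + w) & MASK
    return ((Sigma0(a) + f_maj(a, b, c) + t) & MASK, a, b, c, (d + t) & MASK, e, f, g)

def w_for_a(reg, k, alpha):
    """Message word that forces the next a-register to alpha (Proposition 1)."""
    a, b, c, d, e, f, g, h = reg
    return (alpha - (Sigma0(a) + f_maj(a, b, c) + Sigma1(e) + f_if(e, f, g) + h + k)) & MASK

def local_collision(reg, ks, ws, w):
    """Build W'_i..W'_{i+8} from W_i..W_{i+8} so that the registers collide after 9 steps.

    Returns (primed words, final registers, final primed registers, difference path).
    """
    reg_p, ws_p, path = reg, [], []
    for j, (k, wj) in enumerate(zip(ks, ws)):
        reg = step(reg, k, wj)
        target = (reg[0] + (w if j == 0 else 0)) & MASK  # delta a = w at step i, else 0
        wj_p = w_for_a(reg_p, k, target)
        reg_p = step(reg_p, k, wj_p)
        ws_p.append(wj_p)
        path.append(tuple((p - q) & MASK for p, q in zip(reg_p, reg)))
    return ws_p, reg, reg_p, path

rng = random.Random(7)
start = tuple(rng.getrandbits(32) for _ in range(8))   # common registers at step i-1
ks = [rng.getrandbits(32) for _ in range(9)]           # stand-ins for K_i..K_{i+8}
ws = [rng.getrandbits(32) for _ in range(9)]           # arbitrary W_i..W_{i+8}
w = rng.getrandbits(32) | 1                            # any non-zero difference

ws_p, final, final_p, path = local_collision(start, ks, ws, w)
print(final == final_p, ws_p != ws)
```

Each row of `path` is one row of the differential path in Table 4, for the particular x, y, z determined by the chosen registers and w.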

Note

We have defined \(\delta X = X^{\prime} - X\) and so \(\delta W_i = w\) means \(W_i^{\prime}=W_i+w\); if we had defined δX to be \(X - X^{\prime}\), then \(W_i^{\prime}\) would have been \(W_i - w\). Consequently, without loss of generality one can assume w > 0.

Specifying the values of (w,x,y,z) completely specifies message differences as well as the differences in the register values at all the steps. Two special cases for (w,x,y,z) have been used.
  • Nikolić-Biryukov (NB) [13]. (w, x, y, z) = (1, −1, 0, 0).

  • Sanadhya-Sarkar (SS) [20]. (w, x, y, z) = (1, −1, −1, 0).

The NB local collision was the first to be proposed and has been used for finding collisions in both [13] and [7, 8]. The SS local collision was proposed later and was motivated by the analysis done in [13]. It turns out, however, that the SS local collision is more attractive than the NB local collision, because the time complexities of collision attacks using the SS local collision are lower than those of collision attacks using the NB local collision. To understand why this is so, one needs to go through the detailed combinatorial analysis of the SHA-2 round function carried out in this work.

3.1 Simplifications

The differential path by itself is not useful for obtaining longer round collisions. To do this, we need to simplify the expressions and obtain conditions. The idea behind the simplifications is to obtain conditions which are easy to satisfy and which ensure that the differential path holds. These conditions are obtained in terms of values of the variables w,x,y and z as well as values of the different a and e registers. The registers can then be set to appropriate values using Proposition 1. The simplification is done using several rules which are actually sufficient conditions. The rules and their consequences are described below.

Simplifying δΣ0

There is only one occurrence of \(\Sigma_0\) in all the expressions and that is in the expression for x. In both SHA-256 and SHA-512, \(\Sigma_0\) is a linear function whose only fixed points are 0 and −1. Note that \(-1={\tt ffffffff}\) for SHA-256 and \(-1={\tt ffffffffffffffff}\) for SHA-512. Since \(\delta\Sigma_0^i(w)=\Sigma_0(a_i+w)-\Sigma_0(a_i)\), an easy way to simplify this term is to ensure that both \(a_i\) and \(a_i + w\) are either 0 or −1.

Rule 1

Ensure that \(\delta\Sigma_0^i(w)=w\) by putting w = 1 and ai = − 1.
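As a quick check of Rule 1, note that \(\Sigma_0\) is the XOR of three rotations, so it maps the all-zero word to itself and the all-one word to itself. With w = 1 and \(a_i = -1\) we therefore get

$$\begin{array}{lll} \delta\Sigma_0^i(1) &=& \Sigma_0(-1+1)-\Sigma_0(-1) \\ &=& \Sigma_0(0)-\Sigma_0(-1) \\ &=& 0-(-1) \;=\; 1 \;=\; w. \end{array}$$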

Simplifying Majority

If two of the inputs are equal, then the output of fMAJ() is equal to this input. Based on this observation, we have the following rule.

Rule 2

Simplify each occurrence of fMAJ by making two of the inputs equal.

This rule has several consequences. The function fMAJ is used only in the definitions of x, y and z. Consider, for example x which, after the application of Rule 1, is equal to
$$x=-w-f_{MAJ}(a_i+w,a_{i-1},a_{i-2})+f_{MAJ}(a_i,a_{i-1},a_{i-2}).$$
There are three ways to apply Rule 2 to this occurrence of fMAJ. These are:
  1. Set \(a_{i-1} = a_{i-2}\), which implies x = −w;

  2. set \(a_{i-1} = a_i + w\) and \(a_i = a_{i-2}\), which implies x = −2w;

  3. set \(a_{i-2} = a_i + w\) and \(a_i = a_{i-1}\), which also implies x = −2w.

So applying Rule 2 to x implies that either x = −w (in which case \(a_{i-1} = a_{i-2}\)) or x = −2w (in which case either (\(a_{i-1} = a_i + w\) and \(a_i = a_{i-2}\)) or (\(a_{i-2} = a_i + w\) and \(a_i = a_{i-1}\))).
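The two values of x can be checked directly from the expression for x given above, using only the fact that \(f_{MAJ}\) returns the repeated input when two of its inputs are equal. For the first choice, both majority terms equal the common value \(a_{i-1} = a_{i-2}\) and cancel:

$$\begin{array}{lll} x &=& -w - a_{i-1} + a_{i-1} \;=\; -w. \end{array}$$

For the second choice, \(f_{MAJ}(a_i+w,\,a_i+w,\,a_i) = a_i + w\) and \(f_{MAJ}(a_i,\,a_i+w,\,a_i) = a_i\), so that

$$\begin{array}{lll} x &=& -w - (a_i + w) + a_i \;=\; -2w. \end{array}$$

The third choice is symmetric to the second, with the roles of \(a_{i-1}\) and \(a_{i-2}\) exchanged.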
Similar reasoning applies to the expressions for y and z. Now, if we simultaneously apply Rule 2 to all the three occurrences of fMAJ, then there are eight possible values of (w,x,y,z) which are listed as Cases (I) to (VIII) in Table 5. The related sufficient conditions are given in Table 6.
Table 5

Different cases for (w, x, y, z)

| Case | (w, x, y, z) | Case | (w, x, y, z) |
|---|---|---|---|
| (I) | (w, −w, 0, 0) | (II) | (w, −w, 0, −w) |
| (III) | (w, −w, −w, 0) | (IV) | (w, −w, −w, −w) |
| (V) | (w, −2w, 0, 0) | (VI) | (w, −2w, 0, −w) |
| (VII) | (w, −2w, −w, 0) | (VIII) | (w, −2w, −w, −w) |


Table 6

Result of applying Rules 1 and 2

| Case | \(a_{i-2}\) | \(a_{i-1}\) | \(a_i\) | \(a_{i+1}\) | \(a_{i+2}\) | \(e_{i+2}\) | \(e_{i+1}\) |
|---|---|---|---|---|---|---|---|
| I | α | α | −1 | α | α | \(-\Sigma_0(\alpha) + \alpha\) | \(1 + a_{i-3}\) |
| II(a) | 0 | 0 | −1 | 0 | −1 | −1 | \(1 + a_{i-3}\) |
| II(b) | −1 | −1 | −1 | −1 | 0 | 1 | \(1 + a_{i-3}\) |
| III(a) | −1 | −1 | −1 | 0 | 0 | 0 | \(2 + a_{i-3}\) |
| III(b) | 0 | 0 | −1 | −1 | −1 | 1 | \(a_{i-3}\) |
| IV(a) | −1 | −1 | −1 | 0 | −1 | −1 | \(2 + a_{i-3}\) |
| IV(b) | 0 | 0 | −1 | −1 | 0 | 2 | \(a_{i-3}\) |
| V(a) | −1 | 0 | −1 | 0 | 0 | −1 | \(2 + a_{i-3}\) |
| V(b) | 0 | −1 | −1 | −1 | −1 | 1 | \(1 + a_{i-3}\) |
| VI(a) | −1 | 0 | −1 | 0 | −1 | −2 | \(2 + a_{i-3}\) |
| VI(b) | 0 | −1 | −1 | −1 | 0 | 2 | \(1 + a_{i-3}\) |
| VII(a) | −1 | 0 | −1 | −1 | −1 | 0 | \(1 + a_{i-3}\) |
| VII(b) | 0 | −1 | −1 | 0 | 0 | 1 | \(2 + a_{i-3}\) |
| VIII(a) | −1 | 0 | −1 | −1 | 0 | 1 | \(1 + a_{i-3}\) |
| VIII(b) | 0 | −1 | −1 | 0 | −1 | −1 | \(2 + a_{i-3}\) |

For this table, we have w = 1 and \(a_i = -1\).

These sufficient conditions specify certain values for the registers \((a_{i-2},a_{i-1},a_i,a_{i+1},a_{i+2})\) and \((e_{i+1},e_{i+2})\). Actually, the conditions on the a-register values are independent and the conditions on the e-register values are obtained from these values using the CDE. Using Proposition 1, it is possible to set the values of \((W_{i-2},\ldots,W_{i+2})\) to ensure that \((a_{i-2},\ldots,a_{i+2})\) obtain the required values. Consequently, we can ensure that any of the cases in Table 6 can be made to hold with probability one.

Note

If w = 1, then Case (I) of Table 5 corresponds to the NB local collision and Case (III) of Table 5 corresponds to the SS local collision. As we proceed, we will see that the other cases become unusable.
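The rows of Table 6 that matter most, Case I (NB) and Cases III(a) and III(b) (SS), can be re-derived mechanically for SHA-256 from definition (4) and the CDE (3). The sketch below is an illustration with function names of our own choosing; it assumes n = 32 and represents −1 by `0xFFFFFFFF`.

```python
MASK = 0xFFFFFFFF
NEG1 = MASK  # the word -1 modulo 2^32

def rotr(x, k): return ((x >> k) | (x << (32 - k))) & MASK
def Sigma0(x): return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)
def f_maj(x, y, z): return (x & y) ^ (y & z) ^ (z & x)

def xyz(a_im2, a_im1, a_i, a_ip1, a_ip2, w):
    """Evaluate definition (4), using b_j = a_{j-1} and c_j = a_{j-2}."""
    aw = (a_i + w) & MASK
    x = -(Sigma0(aw) - Sigma0(a_i)) - (f_maj(aw, a_im1, a_im2) - f_maj(a_i, a_im1, a_im2))
    y = -(f_maj(a_ip1, aw, a_im1) - f_maj(a_ip1, a_i, a_im1))
    z = -(f_maj(a_ip2, a_ip1, aw) - f_maj(a_ip2, a_ip1, a_i))
    return x & MASK, y & MASK, z & MASK

def e_from_cde(a_j, a_jm1, a_jm2, a_jm3, a_jm4):
    """Cross dependence equation (3) for e_j."""
    return (a_j + a_jm4 - Sigma0(a_jm1) - f_maj(a_jm1, a_jm2, a_jm3)) & MASK

def table6_row(a_im3, a_im2, a_im1, a_i, a_ip1, a_ip2, w=1):
    """Return ((x, y, z), e_{i+2}, e_{i+1}) for the given a-register values."""
    e_ip1 = e_from_cde(a_ip1, a_i, a_im1, a_im2, a_im3)
    e_ip2 = e_from_cde(a_ip2, a_ip1, a_i, a_im1, a_im2)
    return xyz(a_im2, a_im1, a_i, a_ip1, a_ip2, w), e_ip2, e_ip1

a_im3 = 0x0BADF00D   # arbitrary: a_{i-3} is not constrained by Table 6
alpha = 0x13579BDF   # arbitrary value of alpha for Case I

case_I = table6_row(a_im3, alpha, alpha, NEG1, alpha, alpha)
case_IIIa = table6_row(a_im3, NEG1, NEG1, NEG1, 0, 0)
case_IIIb = table6_row(a_im3, 0, 0, NEG1, NEG1, NEG1)
print(case_I[0], case_IIIa[0], case_IIIb[0])
```

Case I gives (x, y, z) = (−1, 0, 0), the NB values, while both variants of Case III give (−1, −1, 0), the SS values.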

3.2 Simplifying \(\delta W_{i+4}\) to \(\delta W_{i+7}\)

The expression for \(\delta W_{i+4}\) involves \(\delta \Sigma_1^{i+3}(z)\) and \(\delta f_{IF}^{i+3}(z,y,x)\). Joint simplification of these two quantities is possible by ensuring that both \(e_{i+3}\) and \(e_{i+3} + z\) are either 0 or −1.

  1. If z = 0, then \(e_{i+3}\) can be either 0 or −1.

  2. If z = −w, then we choose \(e_{i+3} = 0\) if w = 1, and \(e_{i+3} = -1\) if w = −1.

Similarly, simplification of \(\delta W_{i+5}\) is possible by ensuring that both \(e_{i+4}\) and \(e_{i+4} + w\) are either 0 or −1.
For \(\delta W_{i+6}\) and \(\delta W_{i+7}\) we respectively ensure that \(e_{i+5}\) and \(e_{i+6}\) are either 0 or −1. The effects of these simplifications are summarized in Tables 7 and 8. In particular, the simplifying conditions and the resulting values of the respective δWs are shown. The condition on the value of an e-register can be achieved by setting the corresponding message word W (see Proposition 1). So, any of the conditions in Tables 7 and 8 can be achieved.
Table 7

Summary of simplifying conditions for δWi + 4 and δWi + 5

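$$\begin{array}{lll}
\delta W & \mbox{Condition(s)} & \mbox{Value of } \delta W \\
\hline
\delta W_{i+4} & z = 0,\ e_{i+3} = 0 & -w-x \\
 & z = 0,\ e_{i+3} = -1 & -w-y \\
 & w = 1,\ z = -w,\ e_{i+3} = 0 & e_{i+1}-e_{i+2}+y \\
\delta W_{i+5} & w = 1,\ e_{i+4} = -1 & -w-x-y+e_{i+3}-e_{i+2}
\end{array}$$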

These simplifications require Rules 1 and 2 and so, in particular w = 1 in all these cases

Table 8

Summary of simplifying conditions for δWi + 6 and δWi + 7

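$$\begin{array}{lll}
\delta W & \mbox{Condition(s)} & \mbox{Value of } \delta W \\
\hline
\delta W_{i+6} & e_{i+5} = 0 & -y-z \\
 & e_{i+5} = -1 & -y-w \\
\delta W_{i+7} & e_{i+6} = 0 & -w-z \\
 & e_{i+6} = -1 & -z
\end{array}$$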

These simplifications do not require Rules 1 and 2 and consequently, w can be any n-bit word

4 Obtaining up to 22-round collisions

The basic idea is the following. Choose a suitable value for i and place the local collision from Steps i to i + 8. By placing we mean the following. Ensure that δW0,...,δWi − 1 are all zeros and introduce the required differences in δWi,...,δWi + 8. This creates a collision from Steps i to i + 8. Ensure that there are no further disturbances by setting δWi + 9 to δW15 to be zero. This works well if we are interested in up to 16-round collisions.

For obtaining collisions on r > 16 rounds, we need to consider the message expansion. The initial words W0,...,W15 are free and from W16 onwards, the words are computed using the message expansion recursion given by (2). For clarity, some word differences are shown in Table 9. The differences in the message words introduced in Steps i to i + 8 can possibly affect δW16,δW17,...,δWr − 1. We ensure that the effects of these induced differences can be cancelled out and we have δW16 = δW17 = ⋯ = δWr − 1 = 0. This results in an r-round collision.
Table 9

Message expansion from W16 to W23

δW16 = δσ1(δW14) + δW9 + δσ0(δW1) + δW0

δW17 = δσ1(δW15) + δW10 + δσ0(δW2) + δW1

δW18 = δσ1(δW16) + δW11 + δσ0(δW3) + δW2

δW19 = δσ1(δW17) + δW12 + δσ0(δW4) + δW3

δW20 = δσ1(δW18) + δW13 + δσ0(δW5) + δW4

δW21 = δσ1(δW19) + δW14 + δσ0(δW6) + δW5

δW22 = δσ1(δW20) + δW15 + δσ0(δW7) + δW6

δW23 = δσ1(δW21) + δW16 + δσ0(δW8) + δW7

18-Round Collisions

Deterministic 18-round collisions are easy to obtain by setting i = 3 (i.e., the local collision spans from i = 3 to i + 8 = 11). So, we necessarily have δWj = 0 for j = 0,1,2,12,13,14,15.

Additionally, we need to ensure that δW16 = δW17 = 0. From Table 9, we see that in the expression for δW16 the only possible non-zero term is δW9 = δWi + 6. Similarly, in the expression for δW17, the only possible non-zero term is δW10 = δWi + 7. By ensuring that δWi + 6 = δWi + 7 = 0, we will obtain δW16 = δW17 = 0. Ensuring δWi + 6 = δWi + 7 = 0 can be easily done by setting a suitable condition from Table 8. For example, if y = z = 0, then the setting ei + 5 = 0 and ei + 6 = − 1 ensures δWi + 6 = δWi + 7 = 0 for any choice of w. Using Proposition 1, the required values of ei + 5 and ei + 6 can be achieved by setting Wi + 5 and Wi + 6 to appropriate values. As a net result, we obtain deterministic 18-round collisions for any value of w.

20-Round Collisions

Set i = 5, i.e., the local collision spans from i = 5 to i + 8 = 13, so that δWj = 0 for j = 0,...,4,14,15. We need to ensure that δW16 = ⋯ = δW19 = 0. From Table 9, we see that this can be achieved by setting δW9 = δW10 = δW11 = δW12 = 0.
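The bookkeeping behind the 18-round and 20-round cases can be checked mechanically. The following Python sketch lists which message-word differences of a local collision placed at Steps i to i + 8 feed into the expanded words W16,...,Wr − 1 through the recursion of Table 9. The function names are illustrative and not from the paper, and the sketch assumes that the expanded words themselves carry no difference, which is exactly the situation the attack enforces.

```python
def expansion_inputs(t):
    """Indices of the four words that feed W_t in the SHA-2 message expansion (Table 9)."""
    return {t - 2, t - 7, t - 15, t - 16}


def differences_to_cancel(i, rounds):
    """Indices in Steps i..i+8 whose differences reach W_16..W_{rounds-1}."""
    span = set(range(i, i + 9))          # steps covered by the local collision
    hit = set()
    for t in range(16, rounds):
        hit |= expansion_inputs(t) & span
    return sorted(hit)


print(differences_to_cancel(3, 18))  # [9, 10]
print(differences_to_cancel(5, 20))  # [9, 10, 11, 12]
```

For i = 3 the two indices are i + 6 and i + 7, and for i = 5 the four indices are i + 4 to i + 7, in agreement with the conditions stated in the text.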

Since i = 5, this means that we have to set δWi + 4 = δWi + 5 = δWi + 6 = δWi + 7 = 0. The conditions for individually setting any of these to 0 are given in Tables 7 and 8. In the present case, we need to consider how to simultaneously set all of these to 0. In this situation, some conditions become infeasible. More precisely, certain conditions for obtaining δWi + 4 = 0 are incompatible with certain conditions for obtaining δWi + 5 = 0. The possible conditions for ensuring these two δWs to be zero are given in Table 10. In particular, we see that z = 0 in all cases. The conditions for setting δWi + 6 = 0 and δWi + 7 = 0 do not cause any conflict with other conditions. The set of conditions required for setting δWi + 4 = δWi + 5 = δWi + 6 = δWi + 7 = 0 are summarized in Table 11. Again achieving the appropriate values of the a and e-registers can be done using Proposition 1.
Table 10

Conditions for setting δWi + 4 = δWi + 5 = 0

$$\begin{array}{lrrrrrrrl}
\mbox{Case} & w & x & y & z & e_{i+2} & e_{i+3} & e_{i+4} & \mbox{Extra condition} \\
\hline
\mbox{A} & 1 & -1 & 0 & 0 & 0 & 0 & -1 & \mbox{Case I} \\
\mbox{B} & 1 & -1 & -1 & 0 & 1 & 0 & -1 & \mbox{Case III (b)} \\
\mbox{C} & 1 & -2 & -1 & 0 & 1 & 0 & -1 & \mbox{Case VII (b)} \\
\mbox{D} & 1 & -1 & -1 & 0 & 0 & -1 & -1 & \mbox{Case III (a)} \\
\mbox{E} & 1 & -2 & 0 & 0 & 1 & -1 & -1 & \mbox{Case V (b)} \\
\mbox{F} & 1 & -2 & -1 & 0 & 1 & -1 & -1 & \mbox{Case VII (b)}
\end{array}$$

Table 11

Conditions for setting δWi + 4 = δWi + 5 = δWi + 6 = δWi + 7 = 0

A row of Table 10

AND

(ei + 5 = 0 and y = − z) or (ei + 5 = − 1 and y = − w)

AND

ei + 6 = − 1.

Note

Tables 10 and 11 show that it is possible to deterministically set all the four δWs to zero in Case (A) which is the NB local collision. Consequently, it is possible to obtain deterministic 20-round collision using this local collision. This was not done in [13] but was later mentioned in [20].

21-Round Collisions

Set i = 6, i.e., the local collision spans from i = 6 to i + 8 = 14. We need to ensure that δW16 = ⋯ = δW20 = 0. As in the case of 20-round collision, we set δWi + 4 = δWi + 5 = δWi + 6 = δWi + 7 = 0 by a suitable set of conditions given by Table 11. So, we have δWj = 0 for j = 0,...,5,10,11,12,13,15. From Table 9, we see that if we can now achieve δW16 = 0, then we will have achieved the condition δW16 = ⋯ = δW20 = 0.

From the structure of the differential path shown in Table 4, δW14 = δWi + 8 = − w and so
$$\delta W_{16} = \delta\sigma_1(\delta W_{14})+\delta W_9.$$
(6)
Consider δW9 = δWi + 3 which by the differential path in Table 4 is equal to \(z-\delta\Sigma_1^{i+2}(y)-\delta f_{IF}^{i+2}(y,x,w)\). To simplify this, we choose rows from Table 10 such that both ei + 2 and ei + 2 + y are either 0 or − 1. These are rows A and D.

In the case of row D, we have δW9 = − e7 + e6 + 2; whereas for row A, we get δW9 = − 1. It is possible to deterministically satisfy the case for row D. However, row A (which is the NB local collision) cannot be used in the attack. By (6), row A would require \(\delta\sigma_1(\delta W_{14}) = 1\), and there does not exist any word X such that σ1(X) − σ1(X − 1) = − 1 either for SHA-256 or for SHA-512.

Since i = 6, the row of Table 6 corresponding to row D of Table 10 ensures that a4, a5, a6, a7, a8 and e8 are all fixed to particular values. Using CDE, we can now use a3 to set e7 to any specific value and then use a2 to set e6 to any specific value.

Now, the following strategy is used to ensure that δW16 = 0. Choose an arbitrary value for W14 and compute δ to be
$$\delta = \delta\sigma_1(\delta W_{14}) = \sigma_1(W_{14}+\delta W_{14}) -\sigma_1(W_{14}) = \sigma_1(W_{14}-w)-\sigma_1(W_{14}).$$
Choose W2 and W3 to set a2 and a3 such that δW9 = − e7 + e6 + 2 = − δ, i.e., e7 − e6 − 2 = δ. From (6), it now follows that δW16 = 0. This gives a deterministic 21-round collision.
It is also possible to obtain a deterministic 21-round collision by placing the local collision from Steps 7 to 15. Set i = 7 so that the local collision spans steps i = 7 to i + 8 = 15. In this case, set δWi + 4 = δWi + 5 = δWi + 6 = 0, the sufficient condition for this being any row of Table 10 AND ((ei + 5 = 0, y = − z) or (ei + 5 = − 1, y = − w)). This ensures δW11 = δW12 = δW13 = 0. Now
$$\begin{array}{lll} \delta W_{16} & = & \delta\sigma_1(\delta W_{14}) + \delta W_9, \\ \delta W_{17} & = & \delta\sigma_1(\delta W_{15}) + \delta W_{10}. \end{array}$$
We have δW15 = − w and by setting ei + 6 = 0, we also have δW14 = − w. Also,
$$\begin{array}{lll} \delta W_{9} = \delta W_{i+2} & = & y-\delta\Sigma_1^{i+1}(x)-\delta f_{IF}^{i+1}(x,w,0) \\ \delta W_{10} = \delta W_{i+3} & = & -\delta\Sigma_1^{i+2}(y)-\delta f_{IF}^{i+2}(y,x,w). \end{array}$$
To simplify δW10 = δWi + 3 we choose rows from Table 10 such that both ei + 2 and ei + 2 + y are either 0 or − 1. These are rows A and D (which correspond to the NB and SS local collisions respectively). Similarly, to simplify δW9 = δWi + 2, in row A we choose ei + 1 = 0; and in row D we choose ei + 1 = − 1.

The overall strategy is now the following. Choose arbitrary values for W14 and W15 and compute δ1 = δσ1(δW14) and δ2 = δσ1(δW15), where δW14 = δW15 = − w. Now set δW9 = − δ1 and δW10 = − δ2 by using W3 and W4 to set a3 and a4 and hence, using CDE, to set e7 and e8 to the desired values. This can be done deterministically.

We have sketched two ways of achieving deterministic 21-round collisions. In one case, the local collision spans from Step 6 to Step 14 and in the second case, the local collision spans from Step 7 to Step 15. For the first case, only the SS local collision can be used, while in the second case, both the SS and the NB local collisions can be used. The fact that the NB local collision can be used to obtain deterministic 21-round collisions was not mentioned in [13]; it was mentioned in [18].

The sketches above can be developed into detailed algorithms. We do not describe these algorithms, because below we describe in detail a similar algorithm for constructing deterministic 22-round collisions.

4.1 22-round collisions

Set i = 7 so that the local collision spans from i = 7 to i + 8 = 15. Use sufficient conditions from Table 11 to ensure that δWi + 4 = δWi + 5 = δWi + 6 = δWi + 7 = 0. So δWj = 0 for j = 0,...,6,11,12,13,14. If we can now ensure that δW16 = δW17 = 0, then from Table 9, we will have δWj = 0 for j = 18,19,20,21 which will give rise to a 22-round collision. Under the conditions, we have
$$\left.\begin{array}{rcl} \delta W_{16} & = & \delta W_9 \\ \delta W_{17} & = & \delta \sigma_1(\delta W_{15})+\delta W_{10}. \end{array}\right\} $$
(7)
So, if we can achieve δW9 = δWi + 2 = 0 and δσ1(δW15) + δW10 = 0, then we are done. Note that δW15 = − w = − 1.

First, consider the condition on δW17. To simplify δW10 = δWi + 3 we need to choose both ei + 2 and ei + 2 + y to be 0 or − 1. These imply that we have to use either row A or row D of Table 10 (which respectively correspond to the NB and the SS local collisions).

In case of row D, we have δW10 = − e8 + e7 + 2, whereas in the case of row A, we have δW10 = − 1. The computation for row D is as follows. (A similar computation shows the value of δW10 for row A.)
$$\begin{array}{lll} \delta W_{10} &=& -\delta f_{IF}^{9}(-1,-1,1) - \delta \Sigma_1(-1)\\ &=& -f_{IF}(e_{9}-1,f_{9}-1,g_{9}+1)+ f_{IF}(e_{9},f_{9},g_{9}) - \Sigma_1(e_9-1) + \Sigma_1(e_9) \\ &=& -f_{IF}(e_{9}-1,e_{8}-1,e_{7}+1)+ f_{IF}(e_{9},e_{8},e_{7}) - \Sigma_1(e_9-1) + \Sigma_1(e_9) \\ &=& -f_{IF}(-1,e_8-1,e_{7}+1)+f_{IF}(0,e_8,e_{7}) - \Sigma_1(-1) + \Sigma_1(0) \\ &=& -(e_8-1)+e_{7}-(-1) + 0 \\ &=& -e_8+e_{7} + 2. \end{array}$$
If we want to use row A, then from (7) and δW15 = − 1 we need to have a value for W15 such that σ1(W15 − 1) − σ1(W15) = 1. There is no such value for W15 for either SHA-256 or SHA-512. Hence, row A, which corresponds to the NB local collision, cannot be used.

So, we use row D which corresponds to Case III(a) of Table 6. In this case, we see that e8 = ei + 1 = 2 + ai − 3 = 2 + a4. We set a4 = − 2, so that e8 = 0 and δW10 = e7 + 2. Setting a4 to − 2 is done using W4 as in Proposition 1.

Choose an arbitrary value for W15 and set δ = − δσ1(δW15) where δW15 = − 1. Then use W3 to set a3 such that due to CDE, e7 gets set to a particular value required to ensure that e7 + 2 = δW10 = δ, i.e., e7 = δ − 2. This computation is as follows.
$$\begin{array}{lll} \delta = e_{7} + 2 &=& a_3 + a_7-\Sigma_0(a_6)-f_{MAJ}(a_6,a_5,a_4)+2\\ &=& a_3 + (-1) -\Sigma_0(-1)-f_{MAJ}(-1,-1,-2)+2\\ &=& a_3 -1 -(-1)-(-1)+2\\ &=& a_3 +3. \end{array}$$
So, using W3 we need to set a3 = δ − 3.
Now consider the condition on δW16 given in (7), i.e., the condition δW9 = 0. Up to this point, the values of a3 to a6 have been fixed as follows: a3 = δ − 3, a4 = − 2, a5 = − 1, a6 = − 1. Noting that i = 7 and row D corresponds to (w,x,y,z) = (1, − 1, − 1,0), from Table 4, we have
$$\begin{array}{lll} \delta W_9 = \delta W_{i+2} & = & y-\delta\Sigma_1^{i+1}(x)-\delta f_{IF}^{i+1}(x,w,0) \\ & = & y -\Sigma_1(e_{i+1}+x)+\Sigma_1(e_{i+1}) \\ & & -f_{IF}(e_{i+1}+x,e_i+w,e_{i-1})+f_{IF}(e_{i+1},e_i,e_{i-1}) \\ & = & -1-(-1)+0-f_{IF}(-1,e_7+1,e_6)+f_{IF}(0,e_7,e_6) \\ & = & -e_7-1+e_6 \\ & = & 2-\delta-1+e_6. \end{array}$$
To obtain δW9 = 0, we need to have e6 = δ − 1. Using CDE, we have
$$\begin{array}{lll} e_6 & = & a_6+a_{2}-\Sigma_0(a_5)-f_{MAJ}(a_5,a_4,a_3) \\ & = & -1+a_2-(-1)-f_{MAJ}(-1,-2,\delta-3). \end{array}$$
So, setting a2 = δ − 1 + fMAJ( − 1, − 2,δ − 3) ensures e6 = δ − 1 as required. This completes the description. All of the above steps can be written more explicitly in an algorithmic form. We provide this below.

Algorithm to Obtain 22-Round Collisions

We define two functions which return the required message word Wi to set the register value ai or ei to desired values, say desired_a and desired_e, at Step i. (See Proposition 1.) Equation 1 provides the definitions of these two functions.
  1. W_to_set_register_A(Step i, desired_a, Current State {ai − 1, bi − 1, ...hi − 1}) :
     \(= \mathtt{desired\_a} - \Sigma_0(a_{i-1}) - f_{MAJ}(a_{i-1},b_{i-1},c_{i-1}) - \Sigma_1(e_{i-1}) - f_{IF}(e_{i-1},f_{i-1},g_{i-1}) - h_{i-1} - K_i\)
  2. W_to_set_register_E(Step i, desired_e, Current State {ai − 1, bi − 1, ...hi − 1}) :
     \(= \mathtt{desired\_e} - d_{i-1} - \Sigma_1(e_{i-1}) - f_{IF}(e_{i-1},f_{i-1},g_{i-1}) - h_{i-1} - K_i\)
Using these functions, the complete algorithm to obtain message pairs leading to deterministic 22-round collisions for SHA-2 family is described in Table 12.
Table 12

Deterministic algorithm to obtain message pairs leading to collisions for 22-round SHA-2

external W_to_set_register_A(Step i, desired_a, Current State {ai − 1, bi − 1, ...hi − 1}) : Returns the required message Wi to be used in step i so that ai is set to the given value.

external W_to_set_register_E(Step i, desired_e, Current State {ai − 1, bi − 1, ...hi − 1}) : Returns the required message Wi to be used in step i so that ei is set to the given value.

First Message words:

1. Select W0, W1, W14 and W15 randomly.

2. Set \(\texttt{DELTA} = \sigma_1(W_{15}) - \sigma_1(W_{15}-1)\).

3. Run Steps 0 and 1 of hash evaluation to define {a1,b1, ...h1}.

4. Choose W2 = W_to_set_register_A(2, \(\texttt{DELTA} - 1 + f_{MAJ}(-1,-2,\texttt{DELTA}-3)\), {a1, b1, ...h1}).

5. Run Step 2 of hash evaluation to define {a2,b2, ...h2}.

6. Choose W3 = W_to_set_register_A(3, \(\texttt{DELTA}-3\), {a2, b2, ...h2}).

7. Run Step 3 of hash evaluation to define {a3,b3, ...h3}.

8. Choose W4 = W_to_set_register_A(4, − 2, {a3, b3, ...h3}).

9. Run Step 4 of hash evaluation to define {a4,b4, ...h4}.

10. Choose W5 = W_to_set_register_A(5, − 1, {a4, b4, ...h4}).

11. Run Step 5 of hash evaluation to define {a5,b5, ...h5}.

12. Choose W6 = W_to_set_register_A(6, − 1, {a5, b5, ...h5}).

13. Run Step 6 of hash evaluation to define {a6,b6, ...h6}.

14. Choose W7 = W_to_set_register_A(7, − 1, {a6, b6, ...h6}).

15. Run Step 7 of hash evaluation to define {a7,b7, ...h7}.

16. Choose W8 = W_to_set_register_A(8, 0, {a7, b7, ...h7}).

17. Run Step 8 of hash evaluation to define {a8,b8, ...h8}.

18. Choose W9 = W_to_set_register_A(9, 0, {a8, b8, ...h8}).

19. Run Step 9 of hash evaluation to define {a9,b9, ...h9}.

20. Choose W10 = W_to_set_register_E(10, − 1, {a9, b9, ...h9}).

21. Run Step 10 of hash evaluation to define {a10,b10, ...h10}.

22. Choose W11 = W_to_set_register_E(11, − 1, {a10, b10, ...h10}).

23. Run Step 11 of hash evaluation to define {a11,b11, ...h11}.

24. Choose W12 = W_to_set_register_E(12, − 1, {a11, b11, ...h11}).

25. Run Step 12 of hash evaluation to define {a12,b12, ...h12}.

26. Choose W13 = W_to_set_register_E(13, − 1, {a12, b12, ...h12}).

Second message words:

27. Define δWi = 0 for i ∈ {0,1,2,3,4,5,6,9,11,12,13,14}.

28. Define δW7 = 1 and δW15 = − 1.

29. Define δW8 = − 1 − fIF(e7 + 1,f7,g7) + fIF(e7,f7,g7) − Σ1(e7 + 1) + Σ1(e7). (Refer Table 4.)

30. Define δW10 = − fIF(e9 − 1,f9 − 1,g9 + 1) + fIF(e9,f9,g9) − Σ1(e9 − 1) + Σ1(e9). (Refer Table 4.)

31. Compute \(W^{\prime}_i = W_i + \delta W_i\) for 0 ≤ i ≤ 15.
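As an illustration, the steps of Table 12 can be transcribed into a short Python sketch for SHA-256 reduced to 22 steps. This is a minimal sketch and not the authors' implementation: the names step, w_to_set_a, w_to_set_e, compress and collide22 are illustrative, the standard SHA-256 initial value and round constants are assumed, and compress returns the output of the 22-step compression function with the usual feed-forward.

```python
import random

MASK = 0xFFFFFFFF
IV = [0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
      0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19]
K = [0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1,
     0x923f82a4, 0xab1c5ed5, 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3,
     0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, 0xe49b69c1, 0xefbe4786,
     0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa]  # first 22 constants only

def rotr(x, n): return ((x >> n) | (x << (32 - n))) & MASK
def Sigma0(x): return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)
def Sigma1(x): return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)
def sigma0(x): return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)
def sigma1(x): return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)
def f_if(e, f, g): return ((e & f) ^ (~e & g)) & MASK
def f_maj(a, b, c): return (a & b) ^ (a & c) ^ (b & c)

def step(s, w, k):
    """One step of the compression function on state s = [a, ..., h]."""
    a, b, c, d, e, f, g, h = s
    t1 = (h + Sigma1(e) + f_if(e, f, g) + k + w) & MASK
    t2 = (Sigma0(a) + f_maj(a, b, c)) & MASK
    return [(t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g]

def w_to_set_a(i, desired, s):
    """Message word that makes a_i equal to `desired` (Proposition 1)."""
    a, b, c, d, e, f, g, h = s
    return (desired - Sigma0(a) - f_maj(a, b, c) - Sigma1(e)
            - f_if(e, f, g) - h - K[i]) & MASK

def w_to_set_e(i, desired, s):
    """Message word that makes e_i equal to `desired` (Proposition 1)."""
    a, b, c, d, e, f, g, h = s
    return (desired - d - Sigma1(e) - f_if(e, f, g) - h - K[i]) & MASK

def compress(words, rounds=22):
    w = list(words)
    for t in range(16, rounds):
        w.append((sigma1(w[t - 2]) + w[t - 7] + sigma0(w[t - 15]) + w[t - 16]) & MASK)
    s = list(IV)
    for t in range(rounds):
        s = step(s, w[t], K[t])
    return [(x + y) & MASK for x, y in zip(s, IV)]

def collide22(rng):
    W = [rng.getrandbits(32) for _ in range(16)]  # W0, W1, W14, W15 stay random
    delta = (sigma1(W[15]) - sigma1((W[15] - 1) & MASK)) & MASK
    a_target = {2: (delta - 1 + f_maj(MASK, MASK - 1, (delta - 3) & MASK)) & MASK,
                3: (delta - 3) & MASK, 4: MASK - 1,       # a4 = -2
                5: MASK, 6: MASK, 7: MASK, 8: 0, 9: 0}    # a5 = a6 = a7 = -1
    s, after = list(IV), {}
    for i in range(14):
        if i in a_target:
            W[i] = w_to_set_a(i, a_target[i], s)
        elif i >= 10:
            W[i] = w_to_set_e(i, MASK, s)                 # e10..e13 = -1
        s = step(s, W[i], K[i])
        after[i] = s
    dW = [0] * 16
    dW[7], dW[15] = 1, MASK                               # +1 and -1
    e7, f7, g7 = after[7][4:7]
    dW[8] = (-1 - f_if((e7 + 1) & MASK, f7, g7) + f_if(e7, f7, g7)
             - Sigma1((e7 + 1) & MASK) + Sigma1(e7)) & MASK
    e9, f9, g9 = after[9][4:7]
    dW[10] = (-f_if((e9 - 1) & MASK, (f9 - 1) & MASK, (g9 + 1) & MASK)
              + f_if(e9, f9, g9) - Sigma1((e9 - 1) & MASK) + Sigma1(e9)) & MASK
    return W, [(x + d) & MASK for x, d in zip(W, dW)]
```

A call such as collide22(random.Random(1)) returns two distinct 16-word message blocks; comparing compress on both blocks shows whether they agree after 22 steps. The choice of random.Random as the source of W0, W1, W14 and W15 is only a convenience of this sketch.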

A Remark on the NB Local Collision

We have mentioned that if we place the local collision from Steps 7 to 15, then row A of Table 10 cannot be used to obtain a deterministic 22-round collision. Row A corresponds to the NB local collision.

We considered the issue of whether it is possible to place the NB local collision from Steps 8 to 16 to obtain a 22-round collision (which may not be deterministic). In this case, the local collision will end at Step 16 and hence δW16 = − 1. Recall from Table 9, that a difference in δW16 will affect δW18. We would like to have δW18 = 0 so as to ensure that there are no differences after the local collision ends. Again from Table 9 and the fact that the local collision spans Steps 8 to 16, to achieve δW18 = 0, we need to have δσ1(δW16) + δW11 = 0.

More generally, we considered the situation, where the NB local collision spans Steps i to (i + 8), with i ≥ 8 and we require δWi + 10 = 0. From Table 9, the last condition is achieved if δσ1(δWi + 8) + δWi + 3 = 0. Note that δWi + 8 = − 1.

For SHA-512, using the NB local collision makes achieving the condition δσ1(δWi + 8) + δWi + 3 = 0 difficult. This is because there is a “gap” between the values of |δWi + 3| and |δσ1(δWi + 8)|. In Appendix B, we show that the probability of \(|\delta W_{i+3}|\geq 2^j\) is less than \(1/2^{j-1}\); and for any 64-bit value for Wi + 8, \(|\sigma_1(W_{i+8})-\sigma_1(W_{i+8}-1)|\geq 2^{42}+2^{39}+2^{38}+2^{36}-2^3\). As a consequence, to achieve δσ1(δWi + 8) + δWi + 3 = 0, we need to have \(|\delta W_{i+3}|> 2^{42}\), an event which occurs with probability less than \(2^{-41}\).

The above probability computation is over uniform random choices of Wi + 8 and Wi + 3. In fact, this was one of the factors that had led us to focus only on the SS local collision. It was shown in [7] that the NB local collision can be used to obtain 23 and 24-round SHA-512 collisions. However, the time complexity of the NB local collision attack is higher than that of the SS local collision attack. This fact is possibly attributable to the “gap” between the values of |δWi + 3| and |δσ1(δWi + 8)| mentioned above.

5 A general idea for obtaining 23 and 24-round collisions

Obtaining deterministic collisions up to 22 rounds did not require the (single) local collision to extend beyond Step 15. For obtaining collisions for a greater number of rounds, we will need to start the local collision at Step 8 (or further) and hence the local collision will end at Step 16 (or further). This will require us to analyse the message expansion more carefully.

For obtaining collisions up to 22 rounds, we also needed to consider message expansion. But, we ensured that there were no differences in message words from Step 16 onwards. However, now that we consider the local collision to end at Step 16 (or further), this will necessarily mean that one or more δWi (for i ≥ 16) will be non-zero. This will require a modification of the strategy followed so far. Instead of requiring δWi = 0 for i ≥ 16, we will require δWi = 0 for a few i’s after the local collision ends. So, supposing that the local collision ends at Step 16 and we want a 23-round collision, then δW16 is necessarily − w and we will require δW17 = ⋯ = δW22 = 0.

5.1 A class of local collisions

A local collision of the type shown in Table 4 is completely determined by the values of w,x,y and z which in turn determine the values of δWi to δWi + 8. We need to consider some special values for the δWs. Let
$$(\delta W_i,\ldots,\delta W_{i+8}) = (w,-w,\delta_1,\delta_2,0,0,0,u,-w) \mbox{ with } w = 1. $$
(8)
The value of u is either 0 or w and the values of δ1 and δ2 will be explained later. Using the form of the δWs from Table 4, Equation 8 gives rise to the following 9 equations. We will refer to them as (9.1) to (9.9).
$$\left. \begin{array}{llll} (1)& \delta W_i & & = w; \\ (2)& \delta W_{i+1} & = x-\delta\Sigma_1^i(w)-\delta f_{IF}^i(w,0,0) & = -w;\\ (3)& \delta W_{i+2} & = y-\delta\Sigma_1^{i+1}(x)-\delta f_{IF}^{i+1}(x,w,0) & = \delta_1;\\ (4)& \delta W_{i+3} & = z-\delta\Sigma_1^{i+2}(y)-\delta f_{IF}^{i+2}(y,x,w) & = \delta_2;\\ (5)& \delta W_{i+4} & = -w-\delta\Sigma_1^{i+3}(z)-\delta f_{IF}^{i+3}(z,y,x) & = 0;\\ (6)& \delta W_{i+5} & = -x-\delta\Sigma_1^{i+4}(w)-\delta f_{IF}^{i+4}(w,z,y) & = 0;\\ (7)& \delta W_{i+6} & = -y-\delta f_{IF}^{i+5}(0,w,z) & = 0;\\ (8)& \delta W_{i+7} & = -z-\delta f_{IF}^{i+6}(0,0,w) & = u;\\ (9)& \delta W_{i+8} & & = -w.\\ \end{array} \right\} $$
(9)
The values of x,y and z from (4) are the following.
$$\begin{array}{lll} x=-\delta\Sigma_0^i(w)-\delta f_{MAJ}^i(w,0,0); \ \ y=-\delta f_{MAJ}^{i+1}(0,w,0);\ \ z=-\delta f_{MAJ}^{i+2}(0,0,w). \end{array}$$
We now set conditions on the values for a and the e registers to obtain desired values for x,y and z and also to simplify the values of δWs. Using the kind of analysis done to obtain Rules 1 and 2, the following are easy to verify.
  1. If \(a_i = -1\) and \(a_{i-1} = a_{i-2} = \alpha\), then x = − 1.
  2. If \(a_{i+1} = a_{i-1}\), then y = 0; if \(a_{i+1}=\overline{a_{i-1}}\), then y = − 1.
  3. If \(a_{i+2} = a_{i+1}\), then z = 0; if \(a_{i+2}=\overline{a_{i+1}}\), then z = − 1.
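These three rules can be checked numerically from the expressions for x, y and z given above. The following Python sketch does so for 32-bit words with w = 1; the function name xyz and its argument names are illustrative only, and the SHA-256 rotation amounts for Σ0 are assumed.

```python
import random

MASK = 0xFFFFFFFF

def rotr(x, n): return ((x >> n) | (x << (32 - n))) & MASK
def Sigma0(x): return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)
def f_maj(a, b, c): return (a & b) ^ (a & c) ^ (b & c)

def xyz(a_im2, a_im1, a_i, a_ip1, a_ip2, w=1):
    """Return (x, y, z) as 32-bit words for the given a-register values."""
    a_i2 = (a_i + w) & MASK  # value of a_i in the second message
    x = (-(Sigma0(a_i2) - Sigma0(a_i))
         - (f_maj(a_i2, a_im1, a_im2) - f_maj(a_i, a_im1, a_im2))) & MASK
    y = (-(f_maj(a_ip1, a_i2, a_im1) - f_maj(a_ip1, a_i, a_im1))) & MASK
    z = (-(f_maj(a_ip2, a_ip1, a_i2) - f_maj(a_ip2, a_ip1, a_i))) & MASK
    return x, y, z

alpha = random.getrandbits(32)
beta = ~alpha & MASK
print(xyz(alpha, alpha, MASK, alpha, alpha))  # (x, y, z) = (-1, 0, 0) mod 2^32
print(xyz(alpha, alpha, MASK, beta, beta))    # (x, y, z) = (-1, -1, 0) mod 2^32
```

The first call corresponds to β = α and the second to the complemented choice of β, for any value of α.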

     

Note

In our analysis of up to 22-round SHA-2, we saw that z = 0 arose as a necessary condition. Motivated by this, we will continue to work with z = 0. So, we will have ai + 2 = ai + 1. Let this common value be β. Further, if β = α, then y = 0 and if \(\beta=\overline{\alpha}\), then y = − 1. These and other values of a and e registers are shown in Table 13. We note the following.
  1. If y = 0, then λ = α − Σ0(α).
  2. If y = − 1, then \(\lambda=\alpha+\overline{\alpha}+1-\Sigma_0(\overline{\alpha})=-\Sigma_0(\overline{\alpha})\).
At a later point in the analysis, we will obtain λ and will need to find a corresponding value for α. In the case y = − 1, \(\alpha=\overline{\Sigma_0^{-1}(-\lambda)}\) and it is easy to obtain α from λ. The case y = 0 is not so simple. For SHA-256, one works with 32-bit words and then obtaining α from λ can be done by exhaustive search; however, for SHA-512, one has to work with 64-bit words and then things become more difficult. This is one of the reasons why it is more convenient to work with y = − 1. (Note that (w,x,y,z) = (1, − 1,0,0) corresponds to the NB local collision, whereas (w,x,y,z) = (1, − 1, − 1,0) corresponds to the SS local collision.)
Table 13

Values of a and e register for the δWs given by (8) to hold

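$$\begin{array}{l|ccccccccc}
\mbox{Index} & i-2 & i-1 & i & i+1 & i+2 & i+3 & i+4 & i+5 & i+6 \\
\hline
a & \alpha & \alpha & -1 & \beta & \beta & & & & \\
e & \gamma & \gamma+1 & -1 & \mu & \lambda & \lambda+y & -1 & y & -1-u
\end{array}$$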

The value of u is either 0 or w. We have w = 1, x = − 1 and z = 0. If y = 0, then β = α, while if y = − 1, then \(\beta=\overline{\alpha}\). By CDE, we have λ = β + α − Σ0(β) − fMAJ(β, − 1,α). Thus, the independent quantities are α,γ and μ

The values shown in Table 13 have been chosen so that the conditions on δWi + 1 and δWi + 5 to δWi + 7 hold with probability one. Consider, for example, δWi + 1. From (9.2), we have
$$\begin{array}{lll} \delta W_{i+1} & = & x-\delta\Sigma_1^i(w)-\delta f_{IF}^i(w,0,0) \\ & = & x-(\Sigma_1(e_i+w)-\Sigma_1(e_i))-(f_{IF}(e_i+w,e_{i-1},e_{i-2})-f_{IF}(e_i,e_{i-1},e_{i-2})) \\ & = & -1-(0-(-1))-(e_{i-2}-e_{i-1}) \\ & = & -2-\gamma+\gamma+1 \\ & = & -1. \end{array}$$
Similarly, Equations (9.6), (9.7) and (9.8) can be verified. Equations (9.3), (9.4) and (9.5) on the other hand give rise to the following conditions on the values of α, γ and μ.
$$\left. \begin{array}{rcl} \delta_1 & = & y-\Sigma_1(\mu+x)+\Sigma_1(\mu)-f_{IF}(\mu+x,0,\gamma+1)+f_{IF}(\mu,-1,\gamma+1) \\ \delta_2 & = & -\Sigma_1(\lambda+y)+\Sigma_1(\lambda)-f_{IF}(\lambda+y,\mu+x,0)+f_{IF}(\lambda,\mu,-1) \\ w & = & -f_{IF}(\lambda+y,\lambda+y,\mu+x)+f_{IF}(\lambda+y,\lambda,\mu). \end{array} \right\} $$
(10)
The special case of these equations with y = 0 have been reported in [7] and a method for solving them has been discussed. The method to solve these equations is different for SHA-256 and for SHA-512. Next, we discuss methods to solve (10) for the case y = − 1.

5.2 Solving (10) for y = − 1

The following provides an outline of the method to solve (10) for μ,γ and λ when y = − 1 and δ1 and δ2 are given. From λ, we obtain α.
  • The third equation holds with probability 1 if both λ and μ are odd.

  • Given that λ and μ are odd, the second equation simplifies to \(\delta_2 =-\Sigma_1(\lambda-1)+\Sigma_1(\lambda)+\overline{(\lambda-1)}\). For a given odd value of δ2 occurring in the distribution of σ1(W) − σ1(W − 1), it is possible to solve this equation for odd λ.

  • Given such a λ, it is easy to solve the equation \(\lambda = -\Sigma_0(\overline{\alpha})\) to obtain a suitable value of α, since Σ0 is an invertible mapping for both SHA-256 and SHA-512.

  • For the first equation, the term − fIF(μ − 1,0,γ + 1) + fIF(μ, − 1,γ + 1) is equal to μ, if γ is odd. This term is equal to μ − 1 if γ is even. Further, we note that − Σ1(μ − 1) + Σ1(μ) is always even for both SHA-256 and SHA-512. Thus taking an arbitrary odd value of γ, the first equation is in the single variable μ and can be solved easily for a given δ1.
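The third step of the outline, recovering α from λ, only needs an inverse of Σ0. Since Σ0 is an XOR of rotations, it is linear over GF(2), and one possible way to invert it is Gaussian elimination, as in the Python sketch below for SHA-256. The helper names solve_linear and alpha_from_lambda are illustrative and not from the paper; any other method of inverting Σ0 would serve equally well.

```python
MASK = 0xFFFFFFFF

def rotr(x, n): return ((x >> n) | (x << (32 - n))) & MASK
def Sigma0(x): return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)

def solve_linear(f, target, nbits=32):
    """Solve f(x) = target for a GF(2)-linear bijection f on nbits-bit words."""
    pivots = {}  # leading bit position -> (image, preimage) with f(preimage) = image
    for j in range(nbits):
        img, pre = f(1 << j), 1 << j
        while img:
            top = img.bit_length() - 1
            if top not in pivots:
                pivots[top] = (img, pre)
                break
            img ^= pivots[top][0]
            pre ^= pivots[top][1]
    x = 0
    while target:
        top = target.bit_length() - 1
        img, pre = pivots[top]  # a KeyError here would mean f is not invertible
        target ^= img
        x ^= pre
    return x

def alpha_from_lambda(lam):
    """alpha = complement of Sigma0^{-1}(-lambda), the case y = -1."""
    return ~solve_linear(Sigma0, (-lam) & MASK) & MASK

print(hex(alpha_from_lambda(0x051f9f7f)))  # 0x32b308b2
```

The input 051f9f7f and the output 32b308b2 are the values of λ and α listed for SHA-256 in Table 14.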

Now we provide proofs of the observations above.

Lemma 1

If y = − 1, then the third equation of (10) is satisfied for any odd λ and odd μ.

Proof

We have to show that
$$ 1 = -f_{IF}(\lambda-1, \lambda-1,\mu-1) +f_{IF}(\lambda-1,\lambda,\mu). $$
The quantities λ and λ − 1 differ only in their least significant bit since λ is odd. Similarly, μ and μ − 1 differ only in their least significant bit since μ is odd. Let xi denote the ith bit of x, then λ0=1, (λ − 1)0 = 0, μ0 = 1 and (μ − 1)0 = 0. Let (λ − 1)i = λi = 1 and (λ − 1)j = λj = 0 for some non-zero indices i and j. Also, let μi = b1 and μj = b2 for these bit positions i and j. Now we are ready to write the bit patterns of the quantities occurring in the third equation.

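$$\begin{array}{l|ccccccc}
\mbox{bit} & 63 & \ldots & i & \ldots & j & \ldots & 0 \\
\hline
\lambda-1 & . & \ldots & 1 & \ldots & 0 & \ldots & 0 \\
\lambda & . & \ldots & 1 & \ldots & 0 & \ldots & 1 \\
\mu & . & \ldots & b_1 & \ldots & b_2 & \ldots & 1 \\
f_{IF}(\lambda-1,\lambda,\mu) & . & \ldots & 1 & \ldots & b_2 & \ldots & 1
\end{array}$$

Similarly,

$$\begin{array}{l|ccccccc}
\mbox{bit} & 63 & \ldots & i & \ldots & j & \ldots & 0 \\
\hline
\lambda-1 & . & \ldots & 1 & \ldots & 0 & \ldots & 0 \\
\lambda-1 & . & \ldots & 1 & \ldots & 0 & \ldots & 0 \\
\mu-1 & . & \ldots & b_1 & \ldots & b_2 & \ldots & 0 \\
f_{IF}(\lambda-1,\lambda-1,\mu-1) & . & \ldots & 1 & \ldots & b_2 & \ldots & 0
\end{array}$$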

From the two bit patterns above, we get that
$$ f_{IF}(\lambda-1,\lambda, \mu) - f_{IF}(\lambda-1,\lambda-1, \mu-1) = 1. $$

Lemma 2

Let y = − 1. For odd λ and odd μ, the second equation of (10) simplifies to\(\delta_2 =-\Sigma_1(\lambda-1)+\Sigma_1(\lambda)+\overline{(\lambda-1)}\).

Proof

Consider the following expression
$$ -f_{IF}(\lambda-1,\mu-1,0)+f_{IF}(\lambda,\mu,-1). $$
Similar to the proof of the previous lemma, we consider the bit patterns of the quantities occurring in the above equation. Let λi = 1 and λj = 0 for some non- zero i,j. Also, let μi = b1 and μj = b2. Then the following bit patterns can be seen for the various quantities.

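$$\begin{array}{l|ccccccc}
\mbox{bit} & 63 & \ldots & i & \ldots & j & \ldots & 0 \\
\hline
\lambda & . & \ldots & 1 & \ldots & 0 & \ldots & 1 \\
\mu & . & \ldots & b_1 & \ldots & b_2 & \ldots & 1 \\
-1 & 1 & \ldots & 1 & \ldots & 1 & \ldots & 1 \\
f_{IF}(\lambda,\mu,-1) & . & \ldots & b_1 & \ldots & 1 & \ldots & 1
\end{array}$$

Similarly,

$$\begin{array}{l|ccccccc}
\mbox{bit} & 63 & \ldots & i & \ldots & j & \ldots & 0 \\
\hline
\lambda-1 & . & \ldots & 1 & \ldots & 0 & \ldots & 0 \\
\mu-1 & . & \ldots & b_1 & \ldots & b_2 & \ldots & 0 \\
0 & 0 & \ldots & 0 & \ldots & 0 & \ldots & 0 \\
f_{IF}(\lambda-1,\mu-1,0) & . & \ldots & b_1 & \ldots & 0 & \ldots & 0
\end{array}$$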

From the two bit patterns above, we get that fIF(λ,μ, − 1) and fIF(λ − 1,μ − 1, 0) will have the same bit value whenever the corresponding bit of λ is 1 and different bit value whenever the corresponding bit of λ is 0, except the least significant bit which will always be different. Comparing this difference with the bit pattern \(\overline{\lambda-1}\), we obtain
$$ f_{IF}(\lambda,\mu,-1) - f_{IF}(\lambda-1,\mu-1,0) = \overline{\lambda-1}. $$
This completes the proof. □

Lemma 3

Let y = − 1. For odd μ and odd γ, the first equation of (10) simplifies to δ1 = − 1 − Σ1(μ − 1) + Σ1(μ) + μ.

Proof

By considering the bit patterns of μ, μ − 1 and γ + 1 the following can be proved in a manner similar to the previous two lemmas.
$$ f_{IF}(\mu,-1,\gamma+1) - f_{IF}(\mu-1,0,\gamma+1) = \left \{ \begin{array}{ll} \mu & \mbox{if $\gamma$ is odd.}\\ \mu-1 & \mbox{if $\gamma$ is even.} \end{array} \right . $$
Substituting the above value in the equation for δ1 gives the required proof. □
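The three identities on fIF that underlie Lemmas 1 to 3 are easy to test on random inputs. The following Python sketch does this for 64-bit words; it is only a sanity check, not a substitute for the proofs, and the name check_lemmas is illustrative.

```python
import random

MASK = (1 << 64) - 1

def f_if(e, f, g):
    """Bitwise IF (choice) function on 64-bit words."""
    return ((e & f) ^ (~e & g)) & MASK

def check_lemmas(trials=1000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        lam = rng.getrandbits(64) | 1   # odd lambda
        mu = rng.getrandbits(64) | 1    # odd mu
        gam = rng.getrandbits(64)       # gamma of either parity
        g1 = (gam + 1) & MASK
        # Lemma 1: f_IF(l-1, l, m) - f_IF(l-1, l-1, m-1) = 1
        l1 = (f_if(lam - 1, lam, mu) - f_if(lam - 1, lam - 1, mu - 1)) & MASK
        # Lemma 2: f_IF(l, m, -1) - f_IF(l-1, m-1, 0) = complement of (l-1)
        l2 = (f_if(lam, mu, MASK) - f_if(lam - 1, mu - 1, 0)) & MASK
        # Lemma 3: f_IF(m, -1, g+1) - f_IF(m-1, 0, g+1) = m or m-1
        l3 = (f_if(mu, MASK, g1) - f_if(mu - 1, 0, g1)) & MASK
        expected3 = mu if gam & 1 else mu - 1
        if l1 != 1 or l2 != (~(lam - 1) & MASK) or l3 != expected3:
            return False
    return True

print(check_lemmas())  # True
```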

SHA-256

For SHA-256 we did not solve the second equation explicitly since random search is itself good enough. Solving all the three equations for α,γ and μ can be done in a few seconds on a current PC. Examples of values of (δ1,δ2) and the solutions to (10) for λ,γ and μ are provided in Table 14. The value of α is obtained from λ as explained earlier. The justification for choosing these particular values for the δs as well as the explanation for the first column will be provided later.
Table 14

Values leading to collisions for different number of steps of SHA-256

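$$\begin{array}{llllllll}
(\#\mbox{ rnds}, i) & \delta_1 & \delta_2 & u & \alpha & \lambda & \gamma & \mu \\
\hline
(23, 8) & {\tt 0} & {\tt ff006001} & 0 & {\tt 32b308b2} & {\tt 051f9f7f} & {\tt 684e62b7} & {\tt 041fff81} \\
(23, 9) & {\tt 00006000} & {\tt ff006001} & 1 & {\tt 32b308b2} & {\tt 051f9f7f} & {\tt 98e3923b} & {\tt fbe05f81} \\
(24, 10) & " & " & " & " & " & " & "
\end{array}$$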

The value of i denotes the start point of the local collision, i.e., the local collision is placed from Step i to i + 8
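The entries of Table 14 can be substituted back into (10) as a consistency check. The Python sketch below evaluates the three right-hand sides of (10) with w = 1, x = − 1 and y = − 1, together with the relation between λ and α for y = − 1. The function name system10 is illustrative, and the SHA-256 rotation amounts for Σ0 and Σ1 are assumed.

```python
MASK = 0xFFFFFFFF

def rotr(x, n): return ((x >> n) | (x << (32 - n))) & MASK
def Sigma0(x): return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)
def Sigma1(x): return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)
def f_if(e, f, g): return ((e & f) ^ (~e & g)) & MASK

def system10(alpha, lam, gam, mu, x=-1, y=-1):
    """Return (delta_1, delta_2, w, lambda recomputed from alpha) for system (10)."""
    mx, ly, g1 = (mu + x) & MASK, (lam + y) & MASK, (gam + 1) & MASK
    d1 = (y - Sigma1(mx) + Sigma1(mu) - f_if(mx, 0, g1) + f_if(mu, MASK, g1)) & MASK
    d2 = (-Sigma1(ly) + Sigma1(lam) - f_if(ly, mx, 0) + f_if(lam, mu, MASK)) & MASK
    w = (-f_if(ly, ly, mx) + f_if(ly, lam, mu)) & MASK
    lam_check = (-Sigma0(~alpha & MASK)) & MASK
    return d1, d2, w, lam_check

# Rows (23, 8) and (23, 9) of Table 14: (alpha, lambda, gamma, mu)
row_8 = (0x32b308b2, 0x051f9f7f, 0x684e62b7, 0x041fff81)
row_9 = (0x32b308b2, 0x051f9f7f, 0x98e3923b, 0xfbe05f81)
print([hex(v) for v in system10(*row_8)])  # ['0x0', '0xff006001', '0x1', '0x51f9f7f']
print([hex(v) for v in system10(*row_9)])  # ['0x6000', '0xff006001', '0x1', '0x51f9f7f']
```

In both rows the computed δ1 and δ2 agree with the table, the third equation gives w = 1, and the recomputed λ matches the tabulated one.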

SHA-512

It is possible to solve (10) for SHA-512 as well, although we require a different approach than SHA-256. The main difference is in solving the first and the second equations. Since now 64-bit quantities are involved, it is no longer possible to solve the first and second equations by exhaustive search. We describe a method to solve the second equation with the aid of an example.

Examples of values of (δ1,δ2) and the solutions to (10) for λ,γ and μ are provided in Table 15 and the value of α is obtained from λ as explained earlier. As in the case of SHA-256, the justification for choosing these particular values for the δs as well as the explanation for the first column will become clear later.
Table 15

Values leading to collisions for different number of steps of SHA-512

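$$\begin{array}{llllllll}
(\#\mbox{ rnds}, i) & \delta_1 & \delta_2 & u & \alpha & \lambda & \gamma & \mu \\
\hline
(23, 8) & {\tt 0} & {\tt 600000000237} & 0 & {\tt 7201b90f9f8df85e} & {\tt 3e000007ffdc9} & {\tt 1} & {\tt 43fffff800001} \\
(23, 9) & {\tt 200000000008} & {\tt 600000000237} & 1 & {\tt 7201b90f9f8df85e} & {\tt 3e000007ffdc9} & {\tt 1} & {\tt 45fffff800009} \\
(24, 10) & " & " & " & " & " & " & "
\end{array}$$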

The value of i denotes the start point of the local collision, i.e., the local collision is placed from Step i to i + 8

Solving the Second Equation of (10) For SHA-512

As shown in Lemma 2, for odd λ the second equation simplifies to
$$ \delta_2 = -\Sigma_1(\lambda-1)+\Sigma_1(\lambda)+\overline{(\lambda-1)}. $$
We need to get an odd λ satisfying the above equation for a given value of δ2. Since − Σ1(λ − 1) + Σ1(λ) is always even and \(\overline{(\lambda-1)}\) is odd due to our choice of odd λ, we require δ2 to be odd. This equation can be solved by hand. We explain the method to solve this equation for \(\delta_2 = {\tt 600000000237}\).
First note that Σ1(x) is the XOR addition of 3 n-bit quantities which are rotated/shifted forms of x. If λ is odd, then λ and λ − 1 differ only in the least significant bit. Therefore, the bit patterns of Σ1(λ) and Σ1(λ − 1) will be same except at 3 bit positions. These 3 bit positions are indexed by 23, 46 and 50. By the structure of Σ1 function and using the fact that λ is odd (i.e. λ0 = 1), we have the following
$$\begin{array}{lll} b_1 = (\Sigma_1(\lambda))_{23} &=& \lambda_0 \oplus \lambda_{37} \oplus \lambda_{41} = 1\oplus \lambda_{37} \oplus \lambda_{41},\\ b_2 = (\Sigma_1(\lambda))_{46} &=& \lambda_0 \oplus \lambda_{23} \oplus \lambda_{60}= 1\oplus \lambda_{23} \oplus \lambda_{60},\\ b_3 = (\Sigma_1(\lambda))_{50} &=& \lambda_0 \oplus \lambda_{4} ~\oplus \lambda_{27}=1\oplus \lambda_{4} ~\oplus \lambda_{27} . \end{array}$$
Also, because (λ − 1)0 = 0, we have \((\Sigma_1(\lambda-1))_{23} = \overline{b_1}\), \((\Sigma_1(\lambda-1))_{46} = \overline{b_2}\) and \((\Sigma_1(\lambda-1))_{50} = \overline{b_3}\).
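The claim about the three affected bit positions can be checked numerically. The following is a minimal sketch, assuming the standard SHA-512 definition of Σ1 (rotations by 14, 18 and 41); the helper names `rotr64`, `big_sigma1_512` and `flipped_positions` are illustrative and do not come from the paper.

```python
MASK64 = (1 << 64) - 1

def rotr64(x, r):
    """Rotate a 64-bit word right by r bits."""
    return ((x >> r) | (x << (64 - r))) & MASK64

def big_sigma1_512(x):
    """SHA-512 Sigma_1, assumed here to be ROTR^14 xor ROTR^18 xor ROTR^41."""
    return rotr64(x, 14) ^ rotr64(x, 18) ^ rotr64(x, 41)

def flipped_positions(lam):
    """Bit positions where Sigma_1(lam) and Sigma_1(lam - 1) differ."""
    diff = big_sigma1_512(lam) ^ big_sigma1_512(lam - 1)
    return [i for i in range(64) if (diff >> i) & 1]

def bit(x, i):
    """Bit i of x."""
    return (x >> i) & 1

lam = 0x3e000007ffdc9  # an odd value of lambda taken from Table 15
print(flipped_positions(lam))  # [23, 46, 50]
```

Because Σ1 is linear over XOR, the same three positions are obtained for every odd λ.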
Now consider the bit pattern of various quantities as follows.

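| bit | 63 | ... | 50 | ... | 46 | ... | 23 | ... | 0 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A = Σ1(λ − 1) | . | ... | \(\overline{b_3}\) | ... | \(\overline{b_2}\) | ... | \(\overline{b_1}\) | ... | . |
| B = Σ1(λ) | . | ... | b3 | ... | b2 | ... | b1 | ... | . |
| A − B | . | ... | . | ... | . | ... | 1 | 0... | 0 |
| δ2 | . | ... | . | ... | . | ... | . | ... | . |
| A − B + δ2 | . | ... | . | ... | . | ... | . | ... | . |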

We require the quantity (A − B + δ2) to be equal to \(\overline{(\lambda-1)}\). It is clear from the bit pattern above that the lowest 23 bits (indexed from 0 to 22) of (A − B + δ2) will be the same as those of δ2. Equating these bits to the corresponding bits of \(\overline{(\lambda-1)}\), we immediately get the lowest 23 bits of λ.

Now consider the bits between 23 and 46 of (A − B). It is clear that all these bits will be equal. Further, all these bits will be equal to 1 if b1 = 1 due to the borrow while subtracting B from A at bit position 23. Similarly, all these bits of (A − B) will be equal to 0 if b1 = 0. Our choice of δ2 has all these bits equal to zero, hence the term (A − B + δ2) will too have all these bits equal. But since this term is equal to \(\overline{(\lambda-1)}\), all these bits of (λ − 1) will also be equal. Finally, note that λ and (λ − 1) differ only in the lowest bit position, hence all the bits between 23 and 46 of λ will also be equal. In particular, we will have λ37 = λ41, hence we have that b1 = 1 ⊕ λ37 ⊕ λ41 = 1.

Continuing reasoning on bit positions in this way, for any given δ2, either we can solve for λ or determine that a solution does not exist. For \(\delta_2 = {\tt 600000000237}\) we obtained the solution \(\lambda = {\tt 3e000007ffdc9}\). Note that the method explained above does not require any particular structure of the bits of δ2. As another example, we also solved for \(\delta_2 = {\tt 19ffffffffdd9}\) and obtained the solution as \(\lambda = {\tt 2200000800227}\).
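The two (δ2, λ) pairs quoted above can be checked against the simplified equation. This is a minimal sketch under two assumptions that are not spelled out in this passage: the overline denotes bitwise complement of a 64-bit word, and additions and subtractions are taken modulo 2^64. The function names are illustrative only.

```python
MASK64 = (1 << 64) - 1

def rotr64(x, r):
    """Rotate a 64-bit word right by r bits."""
    return ((x >> r) | (x << (64 - r))) & MASK64

def big_sigma1_512(x):
    """SHA-512 Sigma_1, assumed here to be ROTR^14 xor ROTR^18 xor ROTR^41."""
    return rotr64(x, 14) ^ rotr64(x, 18) ^ rotr64(x, 41)

def delta2_from_lambda(lam):
    """Right-hand side of the simplified second equation of (10) for odd lam:
    -Sigma_1(lam - 1) + Sigma_1(lam) + complement(lam - 1), modulo 2^64."""
    complement = ~(lam - 1) & MASK64
    return (big_sigma1_512(lam) - big_sigma1_512(lam - 1) + complement) & MASK64

print(hex(delta2_from_lambda(0x3e000007ffdc9)))  # 0x600000000237
print(hex(delta2_from_lambda(0x2200000800227)))  # 0x19ffffffffdd9
```

The function only verifies a candidate λ; it does not replace the bit-by-bit reasoning used to find one.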

Note

  1. The first equation can be solved in a similar manner for μ for a given δ1.

  2. It is possible to design an algorithm to do the task described above. But, such an algorithm will be complicated. Since we are interested in solving for a single value of δ2, we chose not to describe and implement an algorithm. The method of solving by hand is good enough.

6 Finding 23 and 24-round collisions

We show that by suitably placing a local collision of the type described in Section 5.1 and using proper values for α,γ and μ, it is possible to obtain several 23 and 24-round collisions for SHA-2. For the description below, we will be considering the SS local collision, i.e., (w,x,y,z) = (1, − 1, − 1,0).

6.1 23-round collisions

There are two options for placing the SS local collision: from Step i = 8 to Step i + 8 = 16, and from Step i = 9 to Step i + 8 = 17. This gives rise to two kinds of 23-round collisions for SHA-2.

Case i = 8

The local collision is started at i = 8 and ends at i = 16.

We have (w,x,y,z) = (1, − 1, − 1,0) and \(\beta=\overline{\alpha}\). Also, we set u = 0 and δ1 = 0. We need to choose a suitable value for δ2 which is the value of δWi + 3 = δW11. For this case, we let δ = δ2. The value of δ2 has to be chosen so that (10) has a solution. The time complexity of the algorithm depends on \(\textsf {freq}_{\delta}\) (see Section 2.3 for the meaning of \(\textsf {freq}_{\delta}\)) as explained below, so, one would like to choose δ such that \(\textsf {freq}_{\delta}\) is as high as possible. At the same time, we have to ensure that (10) can be solved for the particular value of δ. Our choices of δ given in the rows with (23,8) of Tables 14 and 15 have the highest value of \(\textsf {freq}_{\delta}\) for which it is possible to solve (10).

Since the local collision ends at Step 16, from Table 8 it necessarily follows that δW16 = − 1. To obtain a 23-round collision, we want to ensure that δW17 = ⋯ = δW22 = 0. From Table 9, (8) and the fact that δWj = 0 for 0 ≤ j ≤ 7, it follows that the condition δW17 = ⋯ = δW22 = 0 is achieved if we can ensure δW18 = 0. Again, from Table 9, we have
$$\delta W_{18}=\delta\sigma_1(W_{16})+\delta W_{11}.$$
(11)
So, for δW18 to be zero, we need δ = δW11 = − δσ1(W16), so that δW11 should be one of the values which occur in the distribution of σ1(W) − σ1(W − 1) for some W. (This is the reason why we analysed the differential behaviour of σ1 in Section 2.3.) The word W16 is defined using message recursion and so, we cannot control this word directly. Instead, we analyse which message words can be used to control W16.

First, let us consider which register values need to be set to specific values. Since i = 8, from Table 13, we see that a6 to a10 and e6 to e14 get defined. Using CDE, the value of e10 is actually determined by the values of a6 to a10. Using CDE, the values of e9 down to e6 determine the values of a5 down to a2. So, the values of a2 to a10 and the values of e11 to e14 are fixed.

From message recursion, the expression for W16 is the following.
$$ W_{16} = \sigma_1(W_{14})+W_9+\sigma_0(W_1)+W_0. $$
From the update function of the e-register, we have
$$ W_{14}=e_{14}-(\Sigma_1(e_{13})+f_{IF}(e_{13},e_{12},e_{11})+a_{10}+e_{10}+K_{14}). $$
In this equation, all values other than W14 have already been fixed. So, W14 and hence σ1(W14) have fixed values. Let us now consider W9. From the update function of the a-register, we can write
$$W_9 = a_9-\Sigma_0(a_8)-f_{MAJ}(a_8,a_7,a_6)-\Sigma_1(e_8)-f_{IF}(e_8,e_7,e_6)-e_5-K_9.$$
In the right hand side, all quantities other than e5 have fixed values. Using CDE,
$$e_5=a_5+a_1-\Sigma_0(a_4)-f_{MAJ}(a_4,a_3,a_2).$$
Again in the right hand side, all quantities other than a1 have fixed values. So, we can write W9 = C − a1, where C is a fixed value. Now,
$$a_1=\Sigma_0(a_0)+f_{MAJ}(a_0,b_0,c_0)+\Sigma_1(e_0)+f_{IF}(e_0,f_0,g_0)+h_0+K_1+W_1$$
where a0 and e0 depend on W0 whereas b0,c0,f0,g0 and h0 depend only on the initialization vector and hence are constants. Thus, we can write a1 = Φ(W0) + W1, where
$$\Phi(W_0)=\Sigma_0(a_0)+f_{MAJ}(a_0,b_0,c_0)+\Sigma_1(e_0)+f_{IF}(e_0,f_0,g_0)+h_0+K_1.$$
We write Φ(W0) to emphasize that this depends only on W0. At this point, we can write
$$\begin{array}{lll} W_{16} & = & \sigma_1(W_{14}) + W_9 + \sigma_0(W_1) + W_0 \\ & = & \sigma_1(W_{14}) + C-\Phi(W_0)-W_1+\sigma_0(W_1) + W_0 \\ & = & D-\Phi(W_0)-W_1+\sigma_0(W_1) + W_0. \end{array} $$
(12)
We need to obtain W0 and W1 such that the value of W16 given by (12) satisfies the condition σ1(W16 − 1) − σ1(W16) = − δ and then using (11) we obtain δW18 = 0 giving us the required condition of δW17 = ⋯ = δW22 = 0.

Once W0,W1 have been obtained, a collision can be constructed in a manner similar to that for the 22-round case and as shown in Table 12. The idea is to first run SHA-2 for two steps using W0 and W1. This determines the registers (a1,...,h1). Now, using Proposition 1, run SHA-2 step-by-step using Wi to set ai to the desired value for 2 ≤ i ≤ 10. Then run SHA-2 step-by-step using Wi to set ei to the desired value for 11 ≤ i ≤ 14. Finally, choose any value for W15. The values of \(W_i^{\prime}\) are determined by the values of Wi and δWi for 0 ≤ i ≤ 15. This gives a colliding message pair (W0,...,W15) and \((W_0^{\prime},\ldots,W_{15}^{\prime})\).

Estimate of Computation Effort

The main computational effort is in solving (12) for W0 and W1 such that σ1(W16 − 1) − σ1(W16) = − δ. We did not attempt an analytic solution. Instead, we tried random choices of W0 and W1 until we found a suitable W16. There are \(\textsf {freq}_{\delta}\) values of W16 for which σ1(W16) − σ1(W16 − 1) equals δ. On average, success is obtained after about \(\frac{2^n}{\textsf {freq}_{\delta}}\) trials. Each trial corresponds to about a single step of SHA-2 computation, i.e., about \(2^{-4.5}\) of a 23-round computation. So, the total cost of finding suitable W0 and W1 is about \(\frac{2^n}{\textsf {freq}_{\delta}\times 2^{4.5}}\) tries of 23-round SHA-2 computations.

SHA-256

The value of δ given in Table 14 is such that \(\textsf {freq}_{\delta}=2^{16}\). (See Table 2 in Section 2.3.) So, the complexity of finding a 23-round SHA-256 collision is about \(2^{11.5}\) tries of 23-round SHA-256 computations. A message pair colliding for 23-round SHA-256 is given in Table 18 of Appendix A.

SHA-512

In this case, we have estimates on \(\textsf {freq}_{\delta}\). (Again, see Section 2.3 for discussion on this issue.) For the particular value of δ given in Table 15, our estimate is \(\textsf {freq}_{\delta}\approx 2^{43}\). (See Table 3.) So, the effort required is about \(\frac{2^{64}}{2^{4.5}\times \textsf {freq}_{\delta}}\) = \(\frac{2^{21}}{2^{4.5}}\) = \(2^{16.5}\) trials of 23-round SHA-512. A message pair colliding for 23-round SHA-512 is given in Table 21 of Appendix A.
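The two estimates follow from the same expression, which is easiest to evaluate in base-2 logarithms. The sketch below is only an arithmetic check; the function name and the default of 4.5 for the logarithm of the step count are illustrative choices based on the figures quoted in the text.

```python
def log2_cost_23_round(n, log2_freq, log2_steps=4.5):
    """log2 of the expected number of 23-round computations,
    i.e. log2( 2^n / (freq_delta * 2^4.5) )."""
    return n - log2_freq - log2_steps

print(log2_cost_23_round(32, 16))  # SHA-256: 11.5
print(log2_cost_23_round(64, 43))  # SHA-512: 16.5
```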

Case i = 9

It is possible to place the local collision from Step 9 to Step 17 and then perform an analysis to show that it is possible to obtain 23-round collisions for both y = 0 and y = − 1. We do not provide these details, since a similar technique with an additional constraint is required for 24-round collision for which we provide complete details. An example of a collision obtained using this method is given in Table 19 of Appendix A.

6.2 24-round collisions

The SS local collision is placed from Step i = 10 to Step i + 8 = 18, i.e. (w,x,y,z) = (1, − 1, − 1,0). The message differences are as given by (8) where we choose u = 1. The values of δ1,δ2 need to be suitably chosen and then the values of λ,γ and μ can be found by solving (10) as explained in Section 5.2. From λ, we find α as explained earlier.

Since the collision ends at Step 18 and u = 1, from (8) we have δW17 = 1 and δW18 = − 1. To obtain a 24-round collision, we need to ensure δW19 = ⋯ = δW23 = 0.

From Table 9, (8) and the fact that δWj = 0 for 0 ≤ j ≤ 9, we get that the conditions δW19 = δW20 = 0 translate into the conditions
$$\left. \begin{array}{rcl} \delta_1=\delta W_{12} & = & -(\sigma_1(W_{17}+1)-\sigma_1(W_{17})) \\ \delta_2=\delta W_{13} & = & -(\sigma_1(W_{18}-1)-\sigma_1(W_{18})). \end{array} \right\} $$
(13)
As in the case of 23-round collisions, based on the differential behaviour of σ1 (described in Section 2.3), we should try to choose δ1 and δ2 such that \(\textsf {freq}_{-\delta_1}\) and \(\textsf {freq}_{\delta_2}\) are as high as possible.
Consider Table 13. This table tells us what the values of the different a and e-registers need to be. The values of a8 to a12 and the values of e8 to e16 get defined. Using CDE, the values of e11 down to e8 determine the values of a7 down to a4. Thus, the values of a4 to a12 and e13 to e16 are fixed. So, the values of a0 to a3 are free. In particular, we see that e16 = − 1 − u = − 2. This can be achieved by setting W16 to
$$W_{16} = e_{16} - \Sigma_1(e_{15})-f_{IF}(e_{15},e_{14},e_{13})-a_{12}-e_{12}-K_{16}. $$
(14)
Since all values on the right hand side are constants, we have that W16 is a constant value. On the other hand, W16 is defined by message recursion. So, we have to ensure that W16 takes the correct value. This is in addition to the requirement that the value of W17 and W18 satisfy (13).
We have already seen that W16 is a fixed value. Note that
$$\left. \begin{array}{rcl} W_{14} & = & e_{14}-\Sigma_1(e_{13})-f_{IF}(e_{13},e_{12},e_{11})-a_{10}-e_{10}-K_{14} \\ W_{15} & = & e_{15}-\Sigma_1(e_{14})-f_{IF}(e_{14},e_{13},e_{12})-a_{11}-e_{11}-K_{15}. \end{array}\right\} $$
(15)
Since for both equations, all the quantities on the right hand side are fixed values, so are W14 and W15.
Using CDE twice, we can write
$$\left. \begin{array}{rcl} W_9 & = & -W_1 + C_4 + f_{MAJ}(a_4,a_3,a_2)-\Phi_0 \\ W_{10} & = & -W_2 + C_5 + f_{MAJ}(a_5,a_4,a_3)-\Phi_1 \\ W_{11} & = & -W_3 + C_6 + f_{MAJ}(a_6,a_5,a_4)-\Phi_2 \end{array}\right\} $$
(16)
where
$$\left. \begin{array}{rcl} C_i & = & e_{i+5}-\Sigma_1(e_{i+4})-f_{IF}(e_{i+4},e_{i+3},e_{i+2})-2a_{i+1}-K_{i+5}+\Sigma_0(a_i) \\ \Phi_i & = & \Sigma_0(a_i)+f_{MAJ}(a_i,b_i,c_i)+\Sigma_1(e_i)+f_{IF}(e_i,f_i,g_i)+h_i+K_{i+1}. \end{array} \right\} $$
(17)
Using the expressions for W9,W10 and W11 we obtain the following expressions for W16,W17 and W18.
$$\left. \begin{array}{rcl} W_{16} & = & \sigma_1(W_{14})+ C_4-W_1+f_{MAJ}(a_4,a_3,a_2)-\Phi_0+\sigma_0(W_1)+W_0 \\ W_{17} & = & \sigma_1(W_{15})+ C_5-W_2+f_{MAJ}(a_5,a_4,a_3)-\Phi_1+\sigma_0(W_2)+W_1 \\ W_{18} & = & \sigma_1(W_{16})+ C_6-W_3+f_{MAJ}(a_6,a_5,a_4)-\Phi_2+\sigma_0(W_3)+W_2. \end{array}\right\} $$
(18)
We need to ensure that W16 has the desired value given by (14) and that W17 and W18 take values which satisfy (13).
The only free quantities are W0 to W3 which determine a0 to a3. The value of C4 depends on e8, e7 and e6, where e8 has a fixed value and e7 and e6 are in turn determined using CDE by a3 and a2. Similarly, C5 is determined by e9,e8 and e7; where e9,e8 have fixed values and e7 is determined using a3. The value of C6 on the other hand is fixed. Coming to the Φ values, Φ0 is determined only by W0; Φ1 determined by W0 and W1; and Φ2 determined by W0,W1 and W2. Let
$$D = W_{16}-(\sigma_1(W_{14})+C_4+f_{MAJ}(a_4,a_3,a_2)-\Phi_0+W_0). $$
(19)
If we fix W0 and a3,a2, then the value of D gets fixed and we need to find W1 such that the following equation holds.
$$D = -W_1+\sigma_0(W_1). $$
(20)
A guess-then-determine algorithm can be used to solve this equation. This algorithm will be different for SHA-256 and for SHA-512 since the σ0 function is different for the two. The guess-then-determine algorithms for both SHA-256 and SHA-512 are described in Section 6.3.

Solving (20) Using Table Look-Up

An alternative approach would be to use a pre-computed table. For each of the \(2^n\) possible values of W1 (n is the word size, 32 or 64), prepare a table of entries (W1, − W1 + σ0(W1)) sorted on the second column. Then all solutions (if there are any) for (20) can be found by a simple look-up into the table using D. The table would have \(2^n\) entries and if a proper index structure is used, then the look-up can be done very quickly. We have not implemented this method.
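The look-up idea can be illustrated at a reduced scale. The sketch below is a toy, not the method of the paper: it uses a 16-bit word and a hypothetical `toy_sigma0` with made-up rotation and shift amounts so that the full table fits in memory, and it uses a hash map keyed on the second column in place of a sorted table. All names are assumptions introduced for illustration.

```python
from collections import defaultdict

N = 16                      # toy word size (SHA-2 uses 32 or 64)
MASK = (1 << N) - 1

def rotr(x, r):
    """Rotate an N-bit word right by r bits."""
    return ((x >> r) | (x << (N - r))) & MASK

def toy_sigma0(x):
    """Hypothetical stand-in for sigma_0; the constants are NOT SHA-2 parameters."""
    return rotr(x, 3) ^ rotr(x, 7) ^ (x >> 2)

def build_table():
    """Map each value of -W + toy_sigma0(W) (mod 2^N) to the list of W producing it."""
    table = defaultdict(list)
    for w in range(1 << N):
        table[(toy_sigma0(w) - w) & MASK].append(w)
    return table

def solve_by_lookup(table, d):
    """Return all W with -W + toy_sigma0(W) = d, or an empty list if none exist."""
    return table.get(d, [])

table = build_table()
d = (toy_sigma0(0x1234) - 0x1234) & MASK
print(0x1234 in solve_by_lookup(table, d))  # True
```

For the real word sizes the table would need \(2^{32}\) or \(2^{64}\) entries, which is the trade-off discussed in the cost estimate.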

Given a1,b1,...,h1 and a2 the value of W2 gets uniquely defined; similarly, given a2,b2,...,h2 and a3, the value of W3 gets uniquely defined. The equations are the following.
$$\left. \begin{array}{rcl} W_2 & = & a_2 - (\Sigma_0(a_1)+f_{MAJ}(a_1,b_1,c_1)+h_1+\Sigma_1(e_1)+f_{IF}(e_1,f_1,g_1)+K_2) \\ W_3 & = & a_3 - (\Sigma_0(a_2)+f_{MAJ}(a_2,b_2,c_2)+h_2+\Sigma_1(e_2)+f_{IF}(e_2,f_2,g_2)+K_3) \end{array}\right\} $$
(21)
The strategy for determining suitable W0,...,W3 is the following.
  1. Make random choices for W0 and a2,a3.

  2. Run SHA-2 with W0 and determine Φ0.

  3. From a3 and a2 determine e7 and e6 using CDE.

  4. Determine C4 using (17) and then D using (19).

  5. Solve (20) for W1 using the guess-then-determine algorithm.

  6. Run SHA-2 with W1 to define a1,...,h1.

  7. Determine Φ1 using (17) and then W2 using (21).

  8. Run SHA-2 with W2 to define a2,...,h2.

  9. Determine Φ2 using (17) and then W3 using (21).

  10. Compute W17 and W18 using (18).

  11. If σ1(W17 + 1) − σ1(W17) = − δ1 and σ1(W18 − 1) − σ1(W18) = − δ2, as required by (13), then return W0,W1,W2 and W3.

The values of W0,W1,W2 and W3 returned by this procedure ensure that the local collision ends properly at Step 18 and that δWj = 0 for j = 19,...,23. This provides a 24-round collision. The actual construction of the collision is similar to the procedure for obtaining 22-round collisions described in Table 12; using the obtained values of W0,...,W3 run SHA-2 for 4 steps to define the values of (a3,...,h3). Use Proposition 1 to set W4,...,W12 to values so that a4,...,a12 get the required values. Set W13,W14,W15 to ensure that e13,e14,e15 get the required values. Finally, set \(W_i^{\prime}=W_i+\delta W_i\) for i = 0,...,15. Then the message pairs (W0,...,W15) and \((W_0^{\prime},\ldots,W_{15}^{\prime})\) provide a 24-round collision.

Estimate of Computation Effort

Let Step 5 involve a computation of g operations, where each operation is much faster than a single step of SHA-2; by our assessment the time for each operation is around \(2^{-4}\) times the cost of a single step of SHA-2. Thus, the time for Step 5 is about \(\frac{g}{2^4}\) single SHA-2 steps. Further, let the success probability of the guess-then-determine attack be p. Then Step 5 needs to be repeated roughly \(\frac{1}{p}\) times to obtain a solution.

By the choice of δ1, the equality σ1(W17 + 1) − σ1(W17) = − δ1 holds roughly with probability \(\frac{\textsf {freq}_{\delta_1}}{2^n}\) while by the choice of δ2 the equality σ1(W18 − 1) − σ1(W18) = − δ2 holds roughly with probability \(\frac{\textsf {freq}_{\delta_2}}{2^n}\) and we obtain success in Step 11 with roughly \(\frac{\textsf {freq}_{\delta_1}\times \textsf {freq}_{\delta_2}}{2^{2n}}\) probability. So, the entire procedure needs to be carried out around \(\frac{2^{2n}}{\textsf {freq}_{\delta_1}\times \textsf {freq}_{\delta_2}}\) times to obtain a collision.

The guess-then-determine step takes about \(\frac{g}{2^4}\) single SHA-2 steps. The time for executing the entire procedure once is about \((\frac{g}{2^4}+3)\) single SHA-2 steps which is about \(2^{-4.5} \times(\frac{g}{2^4}+3)\) 24-round SHA-2 computations. Since the entire process needs to be repeated \(\frac{2^{2n}}{\textsf {freq}_{\delta_1}\times \textsf {freq}_{\delta_2}}\) times for obtaining success, the number of 24-round SHA-2 computations till success is obtained is about
$$\left(\frac{2^{2n}}{\textsf {freq}_{\delta_1}\times \textsf {freq}_{\delta_2}}\right) \times \left(2^{-4.5} \times\left(\frac{g}{2^4}+3\right)\times\frac{1}{p}\right).$$
If (20) is solved using a table look-up, then the cost estimate changes quite a lot. The cost of Step 5 reduces to about a single SHA-2 step so that the overall cost reduces to about
$$\left(\frac{2^{2n}}{\textsf {freq}_{\delta_1}\times \textsf {freq}_{\delta_2}}\right) \times \left(2^{-4.5} \times3 \times \frac{1}{p}\right)$$
24-round SHA-2 computations. The trade-off is that we need to use a look-up table having \(2^n\) entries.

SHA-256

We choose \(\delta_2={\tt ff006001}\) with \(\textsf {freq}_{\delta_2}=2^{16}\). Also, we choose \(\delta_1={\tt 00006000}\) so that \(-\delta_1={\tt ffffa000}\) and \(\textsf {freq}_{-\delta_1}=2^{29}+2^{26}\). (See Table 2 in Section 2.3.) (For choices of δ2 with higher value of \(\textsf {freq}_{\delta_2}\) there are no solutions to the second equation of (10).)

For these values of δ1 and δ2, it is possible to solve (10) to obtain suitable λ,γ and μ, which in turn determine α. An example of these values is shown in Table 14 in the row (24,9). (The same values also hold for obtaining 23-round collision by placing a local collision from Step 9 to 17.)

The values of g, \(\textsf {freq}_{\delta_1}\) and \(\textsf {freq}_{\delta_2}\) are \(2^{18}\), \(2^{29}\) and \(2^{16}\) respectively. So, the time complexity is about \(2^{28.5}\) 24-round SHA-256 computations. In our experiments, we found that the computation effort required to find W0,...,W3 actually turns out to be less than the estimated effort of \(2^{28.5}\) 24-round SHA-256 computations. The value of \(2^{28.5}\) matches the figure given in [7], but [7] does not provide a detailed analysis of the cost. A message pair colliding for 24-round SHA-256 is given in Table 20 of Appendix A.
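As a worked check of this figure, the general cost expression can be evaluated with n = 32 and the values just listed. The substitution below assumes p ≈ 1, the value reported for SHA-256 in Section 6.3, and approximates \(\textsf {freq}_{-\delta_1}=2^{29}+2^{26}\) by \(2^{29}\) as the text does:

$$\left(\frac{2^{2\times 32}}{2^{29}\times 2^{16}}\right) \times \left(2^{-4.5} \times\left(\frac{2^{18}}{2^4}+3\right)\times \frac{1}{1}\right) = 2^{19}\times 2^{-4.5}\times\left(2^{14}+3\right) \approx 2^{28.5}.$$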

As already explained, if (20) is solved using a table look-up, then the cost reduces to about \(2^{15.5}\) 24-round SHA-256 computations.

SHA-512

We choose \(\delta_2={\tt 600000000237}\) with \(\textsf {freq}_{\delta_2}\approx 2^{43}\). Also, we choose \(\delta_1={\tt 200000000008}\) so that \(\textsf {freq}_{-\delta_1}\approx 2^{61.5}\). (See Table 3 in Section 2.3.) For these values of δ1 and δ2, it is possible to solve (10) to obtain suitable λ,γ and μ, which in turn determine α. An example of these values is shown in the row marked (24,10) of Table 15.

The guess-then-determine attack for the SHA-512 case requires \(g = 2^{15}\) operations. Hence, the effort required for the 24-round SHA-512 attack is about
$$\left(\frac{2^{2 \times 64}}{2^{61.5}\times {2^{43}}}\right) \times \left(2^{-4.5} \times\left(\frac{2^{15}}{2^4}+3\right)\times \frac{1}{2^{-2.5}}\right) = 2^{32.5}$$
trials of 24-round SHA-512. In [7], the corresponding effort is \(2^{53}\) trials of 24-round SHA-512. This significant improvement in the attack complexity allows us to provide the first example of a colliding message pair for 24-round SHA-512. A message pair colliding for 24-round SHA-512 is given in Table 22 of Appendix A.

Note that using a table having \(2^{64}\) entries to solve (20) will reduce the computational effort to about \(2^{22.5}\) trials of 24-round SHA-512.

6.3 Guess-then-determine algorithm for solving (20)

For the ease of notation, in this section we will use W instead of W1.

For SHA-256

Consider Fig. 1 where the structure of W and σ0(W) is shown for SHA-256. We have − W + σ0(W) = D, where D = (d31,...,d0) is a 32-bit constant. For 31 ≥ k ≥ l ≥ 0, we will use the notation X[k,l] to denote bits xk,...,xl of the 32-bit quantity X.
https://static-content.springer.com/image/art%3A10.1007%2Fs12095-009-0011-5/MediaObjects/12095_2009_11_Fig1_HTML.gif
Fig. 1

Structure of W and σ0(W) for SHA-256

We explain how the guess-then-determine algorithm proceeds. Suppose that we guess W[14,0]. Let X = D + W and Y = (W[14,0] ≫ 3) ⊕ (W[14,0] ≫ 7). Then \(W[25,18]=(X\oplus Y)\&({\tt ff})\). Having determined W[25,18] we next determine W[29,26] using positions 22 to 19 of Fig. 1. This time, however, there may have been a possible carry into the 19th bit and we need to account for that. Let c0 be a bit. Define X = (D ≫ 19) + (W[25,18] ≫ 1) + c0 and Y = (W[14,0] ≫ 5) ⊕ (W[25,18] ≫ 4). Then \(W[29,26]=(X\oplus Y)\&({\tt f})\). This illustrates the general idea and can be extended to determine the other bits. Once the entire W has been determined we need to determine whether − W + σ0(W) = D. The entire algorithm is shown in Fig. 2.
https://static-content.springer.com/image/art%3A10.1007%2Fs12095-009-0011-5/MediaObjects/12095_2009_11_Fig2_HTML.gif
Fig. 2

A guess-then-determine algorithm for solving D = − W + σ0(W) for SHA-256

This algorithm involves guessing W[14,0] and bits c0,c1,c2, which is a total of 18 bits. If the equation D = − W + σ0(W) does not have any solution, then none will be returned by this algorithm; on the other hand, if there are one or more solutions, then all solutions will be returned. A total of \(2^{18}\) operations are required. The time for each operation is significantly less than the time for a single SHA-256 step and by our assessment it is about \(2^{-4}\) times the time for a single SHA-256 step.
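The first two determination steps described above can be written out directly. The following is a minimal sketch of those two steps only, not of the full algorithm of Fig. 2; it assumes the standard SHA-256 definition of σ0 (rotations by 7 and 18, shift by 3), and the function names are illustrative. In `demo`, the true carry computed from a known W stands in for the guessed bit c0.

```python
MASK32 = (1 << 32) - 1

def rotr32(x, r):
    """Rotate a 32-bit word right by r bits."""
    return ((x >> r) | (x << (32 - r))) & MASK32

def sigma0_256(x):
    """SHA-256 sigma_0, assumed here to be ROTR^7 xor ROTR^18 xor SHR^3."""
    return rotr32(x, 7) ^ rotr32(x, 18) ^ (x >> 3)

def recover_w_25_18(d, w_low15):
    """First step: W[25,18] from D and a guess of W[14,0]."""
    x = (d + w_low15) & MASK32            # low 8 bits equal those of D + W
    y = (w_low15 >> 3) ^ (w_low15 >> 7)   # known bits W[j+3] xor W[j+7]
    return (x ^ y) & 0xFF

def recover_w_29_26(d, w_low15, w_25_18, c0):
    """Second step: W[29,26], given the carry c0 into bit 19 of D + W."""
    x = (d >> 19) + (w_25_18 >> 1) + c0   # low 4 bits equal bits 22..19 of D + W
    y = (w_low15 >> 5) ^ (w_25_18 >> 4)   # W[8,5] xor W[25,22]
    return (x ^ y) & 0xF

def demo(w):
    """Check both steps against a known W."""
    d = (sigma0_256(w) - w) & MASK32
    w_low15 = w & 0x7FFF
    mid = recover_w_25_18(d, w_low15)
    c0 = ((d & 0x7FFFF) + (w & 0x7FFFF)) >> 19   # true carry into bit 19
    top = recover_w_29_26(d, w_low15, mid, c0)
    return mid == ((w >> 18) & 0xFF) and top == ((w >> 26) & 0xF)

print(demo(0x12345678))  # True
```

In an actual search, W[14,0] and the carry bits are unknown and every combination is tried, which is where the 18 guessed bits come from.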

Note

In [7], it has been remarked that “by guessing the least 15 bits of W1 the entire W1 can be reconstructed and with probability \(2^{-14}\) it is going to be correct”. No details are provided. In particular, the guess-then-determine algorithm that we have described is not present in [7].

In our experiments with SHA-256, we found that for almost every other value of D, (20) has solutions, the number of solutions being one or two. So, for a random choice of D, we consider (20) to hold with probability p ≈ 1.

For SHA-512

Consider Fig. 3 where the structure of W and σ0(W) is shown for SHA-512. We have − W + σ0(W) = D, where D = (d63,...,d0) is a 64-bit constant. For 63 ≥ k ≥ l ≥ 0, we will use the notation X[k,l] to denote bits xk,...,xl of the 64-bit quantity X.
https://static-content.springer.com/image/art%3A10.1007%2Fs12095-009-0011-5/MediaObjects/12095_2009_11_Fig3_HTML.gif
Fig. 3

Structure of W and σ0(W) for SHA-512

We explain how the guess-then-determine attack proceeds. Suppose that we guess W[7,0]. So we know the 7 bits W[7,1] and W[6,0]. Now, consider the lowest 7 bits of D + W. We need D + W to be equal to σ0(W). The term σ0(W) consists of 3 quantities XOR’ed, one of which, W[7,1], is already known. The other two quantities are W[13,7] and W[14,8]. So we can compute X = W[13,7] ⊕ W[14,8] = (D + W) ⊕ W[7,1]. Now, consider the least significant bit of X. This is the XOR of W[7] and W[8]. We already know W[7], so it is possible to compute W[8]. Once W[8] is known, we can compute W[9] by considering the second least significant bit of X. Continuing this way, we can get W[14,7].

Now consider the quantity \((D+W) \oplus (W\ggg 1) \) for bit positions 7 to 13. If the possible carry bit into the addition D + W at bit position 7 can be guessed, then W[21,15] can be determined. Extending this reasoning further, we need to guess 7 carry bits and the initial 8 bits of W to completely determine W. If the obtained value of W satisfies − W + σ0(W) = D, then we have the correct solution. The entire algorithm is shown in Fig. 4.
https://static-content.springer.com/image/art%3A10.1007%2Fs12095-009-0011-5/MediaObjects/12095_2009_11_Fig4_HTML.gif
Fig. 4

A guess-then-determine algorithm for solving D = − W + σ0(W) for SHA-512

In the algorithm, we use a function GTD, which takes low order 7i bits of W as input and produces low order 7i + 7 bits of W. This function is described at the end of the figure.

This algorithm involves guessing W[7,0] and bits c1,c2, ...c7, which is a total of 15 bits. If the equation D = − W + σ0(W) does not have any solution, then none will be returned by this algorithm; on the other hand, if there are one or more solutions, then all solutions will be returned. A total of \(2^{15}\) operations are required. The time for each operation is significantly less than the time for a single SHA-512 step and by our assessment it is about \(2^{-4}\) times the time for a single SHA-512 step.
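The first stage of this procedure, recovering W[14,8] from a guess of W[7,0], can be sketched as follows. This is an illustration of that one stage only, not of the GTD function of Fig. 4; it assumes the standard SHA-512 definition of σ0 (rotations by 1 and 8, shift by 7), and the function names are illustrative.

```python
MASK64 = (1 << 64) - 1

def rotr64(x, r):
    """Rotate a 64-bit word right by r bits."""
    return ((x >> r) | (x << (64 - r))) & MASK64

def sigma0_512(x):
    """SHA-512 sigma_0, assumed here to be ROTR^1 xor ROTR^8 xor SHR^7."""
    return rotr64(x, 1) ^ rotr64(x, 8) ^ (x >> 7)

def recover_w_14_8(d, w_low8):
    """Recover W[14,8] (as a 7-bit integer) from D and a guess of W[7,0]."""
    # X = W[13,7] xor W[14,8]: low 7 bits of (D + W) xor W[7,1]
    x = ((d + w_low8) ^ (w_low8 >> 1)) & 0x7F
    prev = (w_low8 >> 7) & 1          # W[7] is known from the guess
    out = 0
    for j in range(7):
        nxt = ((x >> j) & 1) ^ prev   # W[8+j] = X[j] xor W[7+j]
        out |= nxt << j
        prev = nxt
    return out

def demo(w):
    """Check the recovery against a known W."""
    d = (sigma0_512(w) - w) & MASK64
    return recover_w_14_8(d, w & 0xFF) == ((w >> 8) & 0x7F)

print(demo(0x0123456789ABCDEF))  # True
```

No carry guess is needed at this stage because the lowest 7 bits of D + W depend only on W[6,0]; the later stages each require one guessed carry bit, as described in the text.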

7 Concluding remarks

The method of attack described so far cannot be meaningfully extended beyond 24 steps as already mentioned in [7]. This is due to the fact that every extra step will introduce a new condition on the previous message words. The 24-round collision already utilized the freedom in the first message word W0. Attempting a 25-round collision by starting the local collision at Step i = 11 makes it impossible to ensure that the message word difference δW16 = 0. This is explained below.

As shown in Section 5.1, the local collision is {w, − w,δ1,δ2,0,0,0,u,w}. If we start this local collision at Step i = 11, then δW15 = δW16 = δW17 = 0. Now from the message recursion of SHA-2, we have:
$$ W_{16} = \sigma_1(W_{14}) + W_9 + \sigma_0(W_1) + W_0. $$
In the above equation, the differences of all the terms on the right-hand side, except that of W14, are zero. Therefore the condition δW16 = 0 cannot be satisfied by this local collision. Similar reasons apply for longer round collisions.

Perhaps more fundamentally, the problem is that we are using only a single local collision. Since the local collision is nonlinear in nature, it is difficult to combine two or more such collisions. Further progress in the analysis of step-reduced SHA-256 collisions will require some method to combine more than one (linear or non-linear) local collision.

Acknowledgements

We would like to thank the reviewers for suggesting changes to improve the readability of the paper.

Copyright information

© Springer Science + Business Media, LLC 2009