1 Introduction

Block ciphers and hash functions are cornerstones of symmetric cryptography, where privacy and authenticity of communication are established by efficient schemes. Design and analysis of these primitives was pushed by the competitions for new standards: most notably AES for block ciphers (1998–2001) and SHA-3 for hash functions (2008–2012). Not only new designs, but also new types of attacks emerged from these two competitions.

The SHA-3 competition grew out of concerns in the cryptographic community about the new attacks on hash functions that had appeared by 2005 [6, 39, 44, 45]. The standard SHA-2 followed the same design strategy as the broken SHA-0 and SHA-1, and it was largely unknown whether the new attacks could be carried over to SHA-2 within the next few years. After a series of workshops, the American standardization organization NIST decided to run a public competition for a new standard, hoping to have a replacement ready before SHA-2 came under attack as well. Although attacks on SHA-2 have progressed slowly since 2006 [16, 24, 31, 33, 37], NIST eventually chose Keccak [4] out of 64 submissions.

Among the methods proposed for analysis of the SHA-3 candidates, rotational cryptanalysis and the rebound attack are notably universal and effective. Rotational analysis is well suited for bit-oriented designs, in particular for those based on modular addition, rotation, and XOR (so-called ARX schemes). The reduced versions of SHA-3 candidates Skein [22], Shabal [1], BMW [38], and eventually the SHA-3 winner Keccak [11] are affected by this type of attack, despite their relative resistance to the well-studied differential cryptanalysis.

The rebound attack, first presented in [34], was initially and mostly aimed at byte-oriented primitives with an SPN structure. It produces conforming pairs for differential paths in a meet-in-the-middle fashion, essentially shaving off the most “expensive” part of a differential path. The method gives the best results so far on reduced variants of the SHA-3 candidates Grøstl [17] and ECHO [18], LANE [30], Luffa [21], Cheetah [46] and the hash function Whirlpool [26], among others. It also yields a differential distinguisher for the largest number of rounds of SHA-3 (Keccak) [11].

Our Results

In this paper we combine the rotational and the rebound attacks and apply them to the compression function of the SHA-3 candidate Skein and its underlying cipher Threefish in version 1.2 [14]. We carefully use the degrees of freedom in the inbound phase of the rebound attack, so that we attack many more rounds than all other results on Skein/Threefish. We introduce a new type of distinguishing property, called a rotational collision, and prove formally that in the black-box model the complexity of finding such collisions is significantly higher than the complexity of producing rotational collisions for the Skein-256 compression function reduced to up to 53 (out of 72) rounds, and for the Skein-512 compression function reduced to up to 55 (out of 72) rounds. Our approach aims at the largest number of rounds at the cost of complexity; similar results with almost practical complexity can be obtained for a smaller number of rounds, but this is outside the scope of this paper. We also provide a more accurate estimation of rotational probabilities compared to [22].

Our results demonstrate weaknesses both in the reduced Threefish cipher (in the ideal cipher model) and in the Skein compression function as of version 1.2. Our models require that the key, the message, and the tweak can be freely chosen by an attacker, which is certainly not the case for a hash function or block cipher. Nevertheless, our attacks show that reduced Threefish should not instantiate an ideal cipher in any indifferentiability proof using it, like the one used for the Skein hash function [13]. The designers eventually responded to our attack by changing Skein so that its components are much less vulnerable to the rotational analysis [15].

Details on Our Rotational-Rebound Attack

We start our research with a more detailed and careful analysis of the rotational property and its propagation. We represent it analytically, and derive necessary conditions on the key bits to increase the rotational probability and thus reduce the complexity of our attacks. We also correct [22] in terms of the independence assumptions, and find the best values of the key bits with an optimized computer search. Although we attack the second version of Skein (v1.2 [14]), we would like to stress that our attack approach is applicable to the first version of Skein as well, but does not apply to the latest version, Skein v1.3.

This preliminary rotational analysis gives us a rotational distinguisher for the compression function of Skein on up to 40 rounds. We advance further and show how to put the rotational property into the outbound phase of the rebound attack. The inner part of the rebound attack, which is the inbound phase, is accelerated with the method of the auxiliary path [19] and neutral bits [5]. In contrast to the first attacks on Skein, where these paths were used in differential attacks, we demonstrate their use in the rotational attack. As a result, we get a rotational distinguisher for the reduced Skein compression function. We attack 53 rounds of Skein-256 and 55 rounds of Skein-512 (Sect. 5), whereas the full versions have 72 rounds.

2 Preliminaries

2.1 Short History of Skein

Skein is a family of hash functions designed by Ferguson et al. [12]. Since its submission to the SHA-3 competition in 2008, Skein underwent a series of revisions. The first revision, yielding the version 1.1, appeared quickly after the submission and corrected typos. The second revision (version 1.2) appeared in September 2009 and allegedly improved the diffusion properties with a new set of rotation constants [14]. Our analysis, as well as the first conceptual paper on the rotational cryptanalysis [22], was published in the first half of 2010 and is devoted to this version.

As soon as the next phase of the SHA-3 competition again allowed changes, the designers of Skein responded to the rotational attacks and tweaked it to version 1.3 in October 2010 [15]. The tweak changed only the constant in the key schedule, which effectively prevents rotational cryptanalysis. Though Skein was considered a favorite in the competition until its end, NIST eventually declared Keccak the winner on October 2, 2012.

2.1.1 Third-Party Cryptanalysis of Skein

The first third-party cryptanalysis [2] targeted various properties of the Skein v1.1 compression function and Threefish up to 35 rounds. Near-collisions up to 24 rounds were investigated in [43]. The first rotational attack demonstrated distinguishers on the underlying Threefish cipher [22], and was able to penetrate 39 rounds of Skein-256 v1.2 and 42 rounds of Skein-512 v1.2 (see Table 1).

Table 1. Summary of the attacks on Skein and Threefish.

After a preliminary version of this paper was published [23], two more results were announced. Biclique preimage attacks [24] cryptanalyze 22 rounds of the Skein-512 v1.3 hash function and 37 rounds of the Skein-512 compression function. Advanced boomerang distinguishers were applied to up to 32 rounds of Threefish v1.3 [28], and new differential near-collision attacks on the Skein v1.3 compression function for up to 32 steps were proposed in [27, 47].

2.2 Description of Skein v1.2

Skein is a family of hash functions based on different versions of the block cipher Threefish. Threefish has three incarnations, with 256-, 512-, and 1024-bit blocks and keys of the same size. The 1024-bit version was not intended for the SHA-3 output sizes (224, 256, 384, 512 bits), and later it was explicitly stated that only Threefish-512 would be used for all output sizes for the purposes of the SHA-3 competition. In this work, we analyze Threefish-256 and Threefish-512.

Threefish is a tweakable block cipher, where the tweak value T is a public 128-bit input that can be used to further parametrize the cipher and to break the similarity between the compression function calls when used in Skein. By \(E_{K,T}(P)\) we denote Threefish with an input key K, a tweak T, and a plaintext P. The compression function F of Skein uses Threefish in the Matyas–Meyer–Oseas (MMO) mode:

$$ F_T(CV,M) = E_{CV,T}(M)\oplus M, $$
(1)

where CV is the chaining value, and M is the message block of the same size. The hash function of Skein produces the output after a sequence of compression function calls. As our analysis does not go beyond the compression function, and as the full description of the hash function is rather complicated, we omit it and refer to [15].

The definition of the cipher is as follows. The internal state I undergoes a sequence of 72 similar rounds, and after each fourth round a subkey is modularly added to the state. An additional subkey addition (key whitening) is done at the beginning of the first round.

Internal Round

Each round has a simple structure (cf. also Fig. 1). The internal state is partitioned into \(N_w\) 64-bit words \(I_{0},I_{1},\ldots,I_{N_{w}-1}\), where \(N_w=4\) for Threefish-256 and \(N_w=8\) for Threefish-512. Then two distinct operations are applied to all the state words. The first is a pairwise non-linear MIX operation, while the second is a simple word permutation π. Round r, \(0\le r<72\):

  1. For \(0\le j<N_w/2\) set \((I_{2j},I_{2j+1})\leftarrow \mathrm{MIX}(I_{2j},I_{2j+1})\);

  2. For \(0\le j<N_w\) set \(I^{\mathrm{new}}_{j}\leftarrow I_{\pi(j)}\).

Fig. 1. Two rounds of Threefish-512.

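The operation π depends on the round number r, while MIX depends as well on the index j, and the output \((Y_1,Y_2)=\mathrm{MIX}(X_1,X_2)\) is defined as follows:

$$ Y_1 = X_1 \boxplus X_2,\qquad Y_2 = (X_2 \lll R_{r,j}) \oplus Y_1, $$

where \(\boxplus\) denotes addition modulo \(2^{64}\) and \(\lll\) denotes rotation to the left.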

The rotation constants R r,j and the permutation π are defined in Appendix A.
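As a minimal sketch of the round structure described above, the following Python fragment implements MIX and a single round on 64-bit words. The rotation constants and the permutation are passed in as parameters, since their actual values are given in Appendix A; the names `rotl64`, `mix`, and `threefish_round` are illustrative only, and MIX is written in its usual Threefish form (modular addition followed by rotate-and-XOR).

```python
MASK64 = (1 << 64) - 1  # all arithmetic is on 64-bit words


def rotl64(x, r):
    """Rotate the 64-bit word x to the left by r bits."""
    r %= 64
    return ((x << r) | (x >> (64 - r))) & MASK64


def mix(x1, x2, rot_const):
    """MIX: y1 = x1 + x2 mod 2^64, y2 = (x2 <<< rot_const) XOR y1."""
    y1 = (x1 + x2) & MASK64
    y2 = rotl64(x2, rot_const) ^ y1
    return y1, y2


def threefish_round(state, rot_consts, perm):
    """One round: MIX on consecutive word pairs, then the word permutation.

    state      -- list of N_w words
    rot_consts -- N_w/2 rotation constants for this round (from Appendix A)
    perm       -- list with perm[j] = pi(j) (from Appendix A)
    """
    mixed = list(state)
    for j in range(len(state) // 2):
        mixed[2 * j], mixed[2 * j + 1] = mix(
            state[2 * j], state[2 * j + 1], rot_consts[j]
        )
    return [mixed[perm[j]] for j in range(len(state))]
```

The subkey addition after every fourth round is deliberately left out here; it is the only place where key material and counters enter the state.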

Key Schedule and Subkeys

Let \(K_{0},K_{1},\ldots,K_{N_{w}-1}\) be the 64-bit words of the master key K. An additional word \(K_{N_{w}}\), acting as a checksum, is computed as

$$K_{N_w} =\mathtt{0x55\ldots 5}\oplus \bigoplus_{j=0}^{N_w-1}K_j. $$

Here we would like to point out that rotating a word (either to the left or to the right) by an even number of bits does not change the constant \(\mathtt{0x55\ldots 5}\); this is a crucial property exploited in our subsequent rotational attack. In the final version of Threefish, i.e. v1.3, the constant was changed to 0x1BD11BDAA9FC1A22.
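The invariance claim is easy to check mechanically. The sketch below lists, for a given 64-bit constant, all non-trivial left rotation amounts that leave it unchanged; the helper names are illustrative and not taken from the Skein specification.

```python
MASK64 = (1 << 64) - 1

C_V12 = 0x5555555555555555  # key schedule constant of Skein v1.2
C_V13 = 0x1BD11BDAA9FC1A22  # key schedule constant of Skein v1.3


def rotl64(x, r):
    """Rotate the 64-bit word x to the left by r bits."""
    r %= 64
    return ((x << r) | (x >> (64 - r))) & MASK64


def invariant_rotations(c):
    """Return all rotation amounts 1..63 that map the constant c to itself."""
    return [r for r in range(1, 64) if rotl64(c, r) == c]


print(invariant_rotations(C_V12))  # every even amount
print(invariant_rotations(C_V13))  # no amount at all
```

A 64-bit word that is invariant under some non-trivial rotation must have identical upper and lower 32-bit halves, which is why the v1.3 constant admits no such rotation.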

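Similarly, a checksum tweak word \(T_2\) is computed from the 64-bit tweak words \(T_0,T_1\), i.e. \(T_2=T_0\oplus T_1\). Let \(K^{0},K^{1},\ldots,K^{18}\) be the subkeys, and \(K_{0}^{s},K_{1}^{s},\ldots,K_{N_{w}-1}^{s}\) be the 64-bit words of the subkey \(K^{s}\), \(s=0,\ldots,18\). These words are computed as follows:

$$ K^s_j = \begin{cases} K_{(s+j) \bmod (N_w+1)}, & 0\le j\le N_w-4;\\[2pt] K_{(s+j) \bmod (N_w+1)} + T_{s \bmod 3}, & j=N_w-3;\\[2pt] K_{(s+j) \bmod (N_w+1)} + T_{(s+1) \bmod 3}, & j=N_w-2;\\[2pt] K_{(s+j) \bmod (N_w+1)} + s, & j=N_w-1, \end{cases} $$

where + denotes addition modulo \(2^{64}\).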

Note the counter s added to the last subkey word.
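The key schedule can be sketched in a few lines of Python. The fragment follows the Threefish key schedule as it is usually stated (extended key with a checksum word, extended tweak, and the counter s added to the last word); the function name `subkey_words` and its argument layout are assumptions made for this illustration.

```python
MASK64 = (1 << 64) - 1
C_V12 = 0x5555555555555555  # checksum constant of Skein v1.2


def subkey_words(key, tweak, s, const=C_V12):
    """Return the N_w words of subkey number s.

    key   -- list of N_w 64-bit master key words K_0..K_{N_w-1}
    tweak -- list of the two 64-bit tweak words T_0, T_1
    """
    n_w = len(key)
    checksum = const
    for k in key:
        checksum ^= k                      # K_{N_w} = const XOR all key words
    k_ext = list(key) + [checksum]
    t_ext = [tweak[0], tweak[1], tweak[0] ^ tweak[1]]  # T_2 = T_0 XOR T_1

    words = [k_ext[(s + j) % (n_w + 1)] for j in range(n_w)]
    words[n_w - 3] = (words[n_w - 3] + t_ext[s % 3]) & MASK64
    words[n_w - 2] = (words[n_w - 2] + t_ext[(s + 1) % 3]) & MASK64
    words[n_w - 1] = (words[n_w - 1] + s) & MASK64     # round counter
    return words
```

With an all-zero key and tweak, subkey 1 already shows the two non-rotational ingredients side by side: the checksum constant and the counter land in the same word.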

3 Rotational and Rebound Attacks

3.1 Rotational Cryptanalysis

It has been known for a while that if one vector is a rotation of the other, then the bitwise operations such as XOR or AND keep this property. Some designers used this fact in the initial analysis of their own cryptosystems [3, 41], whereas Daum explored the propagation of this property through the modular addition operation [10]. The term rotational cryptanalysis was introduced by Khovratovich and Nikolić in the analysis of Skein [22].

The pair \((X, \overleftarrow{X})\) is called a rotational pair (with a rotation amount r), where \(\overleftarrow{X}\) is the rotation of X by r bits to the left. A rotational pair is preserved by any bitwise transformation, particularly by the bitwise XOR and by any rotation. The probability that a rotational pair is kept by the modular addition of n-bit words is given by the following formula [10]:

$$ \mathbf{P}\bigl(\overleftarrow{x+y} = \overleftarrow{x} + \overleftarrow{y}\bigr) = \frac{1}{4}\bigl(1+ 2^{r-n} + 2^{-r} + 2^{-n}\bigr). $$
(2)

For large n and small r we get the following table:

For r=n/2 the probability is the lowest and it is close to 1/4. The same holds for rotations to the right. When an addition of rotational inputs does not produce rotational outputs then we say that the addition produces a rotational error.
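Formula (2) can be evaluated and cross-checked with a short script. The sketch below computes the formula exactly and compares it with an exhaustive count over all pairs of n-bit words for a toy word size; the function names are illustrative.

```python
import math
from fractions import Fraction


def rotl(x, r, n):
    """Rotate the n-bit word x to the left by r bits (0 < r < n)."""
    mask = (1 << n) - 1
    return ((x << r) | (x >> (n - r))) & mask


def formula_probability(r, n):
    """Right-hand side of (2), as an exact fraction."""
    two = Fraction(2)
    return Fraction(1, 4) * (1 + two ** (r - n) + two ** (-r) + two ** (-n))


def exhaustive_probability(r, n):
    """Fraction of pairs (x, y) for which rotation commutes with x + y mod 2^n."""
    mask = (1 << n) - 1
    hits = sum(
        1
        for x in range(1 << n)
        for y in range(1 << n)
        if rotl((x + y) & mask, r, n) == (rotl(x, r, n) + rotl(y, r, n)) & mask
    )
    return Fraction(hits, 1 << (2 * n))


for r in (1, 2, 3):
    p = formula_probability(r, 64)
    print(r, float(p), math.log2(float(p)))
```

The exhaustive count is only feasible for small n, but it confirms that (2) is exact rather than an approximation; for 64-bit words the loop above prints the probabilities for the small rotation amounts relevant to this paper.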

Similarly to differential and linear cryptanalysis, the rotational cryptanalysis requires the rotational property to hold through a number of rounds of a primitive. Clearly, the probability of this event depends on the number of operations that may violate the rotational property, e.g., the modular addition.

The simplest rotational attack first establishes that for the primitive F with n-bit output the rotational property holds with probability \(\mathbb{P}\gg 2^{-n}\):

$$F\bigl(\overleftarrow{X}\bigr) \overset{\mathbb{P}}{=} \overleftarrow{F(X)}, $$

for some pre-fixed rotational amount r.

It will be proven in Sect. 4 that this property and its variations yield non-random behavior for unkeyed primitives like compression and hash functions. In the further text, we always work with a single rotational amount, an optimal value of which has to be found. For keyed primitives this property allows for shortcut key recovery attacks, as it can be used as a distinguisher to verify partial key guesses.

Constant Addition

The use of constants is a typical countermeasure against slide attacks and other methods exploiting the similarity of rounds. The addition of a constant also violates the rotational property unless the constant is self-rotational: if \(C=\overleftarrow{C}\), then \(\overleftarrow{X}\oplus C=\overleftarrow{X\oplus C}\). However, if the constant addition follows another operation that may fail to preserve the rotational pair, then the resulting errors may compensate each other. The first rotational attack on Skein [22] made the errors introduced by three consecutive operations compensate each other (see Fig. 2).

Fig. 2. Dealing with constant addition in Threefish: error is introduced by the modular addition, then is corrected by the key addition, and is finally compensated by another modular addition.

3.2 Rebound Attack

The rebound attack [26, 34] was introduced as a variant of differential cryptanalysis optimized for the cryptanalysis of hash functions. It aims to efficiently produce inputs conforming to valid differential paths. At the same time, the rebound attack can be seen as a high-level model for the cryptanalysis of key-less primitives. It was first applied to AES-like constructions because it is easy to find truncated differential characteristics in them for a number of rounds. Distinguishers for generic Feistel schemes [40] and meet-in-the-middle attacks on block ciphers [8] also use elements of the rebound attack.

The rebound attack (Fig. 3) decomposes primitive E—a compression function, a block cipher, or a permutation—into three parts:

$$E = E_{{3}} \circ E_{{2}} \circ E_{{1}}. $$

The two phases are as follows:

  • Inbound phase searches for inputs conforming to some property (usually, to a differential path) in the meet-in-the-middle fashion in E 2. Here the search is efficiently aided by the degrees of freedom available to a cryptanalyst.

  • Outbound phase computes the solutions of the inbound phase in both forward- and backward direction through E 1 and E 3 and checks whether they are solutions for the full E. If this is a probabilistic event, an attacker repeats the inbound phase to obtain more starting points for the outbound phase.

Recent modifications of the rebound approach include the inside-out variant [32], the linear solving variant [32], or the multiple-inbound variant [26, 30].

Fig. 3. Outline of the rebound attack.

Our idea is to target the rotational property in the rebound attack, so that the inputs conforming to the rotational property in the inbound phase can be found with a low complexity. In the next section we formally prove that our approach produces outputs which are improbable to find for an ideal primitive with the same complexity.

4 Rotational Distinguishers

In this section we argue that the attack, described in detail in Sect. 5, indeed shows non-random behavior of the Skein compression function. A typical argument would show that an attacker with only a black-box access to an ideal primitive of the same domain and range is not able to produce the same behavior with the same or better effort and probability. We follow the approach of [7], where the adversary produces so-called q-multicollisions for AES significantly faster than for an ideal cipher. Then we carry over this statement to the compression function.

The Threefish key schedule uses a counter in each subkey \(K^{s}\). As none of these counters is rotation-invariant, the subkey injection always violates the rotational property of a pair of internal states. As indicated in Sect. 3.1, we have to compensate emerging errors by other probabilistic operations around the subkey injection. As will be explained in detail in Sect. 5, adding a constant e to the chaining value CV is sufficient to obtain the rotational property with reasonable probability for a reduced Skein compression function F:

$$ F_{\overleftarrow{T}}\bigl(\overleftarrow{CV}+ e,\overleftarrow{M}\bigr) \overset{\mathbb{P}}{=} \overleftarrow{F_T(CV,M)}. $$
(3)

Moreover, we can produce q such inputs with average complexity \(1/\mathbb{P}\) and the same e:

$$ \begin{cases} \overleftarrow{F_{T_1}(CV_1,M_1)} = F_{\overleftarrow{T_1}}\bigl(\overleftarrow{CV_1}+ e, \overleftarrow{M_1}\bigr);\\[4pt] \overleftarrow{F_{T_2}(CV_2,M_2)} = F_{\overleftarrow{T_2}}\bigl(\overleftarrow{CV_2}+ e, \overleftarrow{M_2}\bigr);\\ \vdots\\[4pt] \overleftarrow{F_{T_q}(CV_q,M_q)} = F_{\overleftarrow{T_q}}\bigl(\overleftarrow{CV_q}+ e, \overleftarrow{M_q}\bigr). \end{cases} $$
(4)

Let us introduce an appropriate definition of the wanted property.

Definition 1

A set

$$\bigl\{e; (CV_1,M_1,T_1), (CV_2,M_2,T_2),\ldots,(CV_q, M_q,T_q)\bigr\}$$

is called a rotational q-collision set for a compression function \(F_T(CV,M)\) if (4) holds for it.

A similar definition can be introduced for the cipher on which the compression function is based. The MMO mode (1) yields the following conversion:

$$\overleftarrow{F_T(CV,M)} = F_{\overleftarrow{T}}\bigl(\overleftarrow{CV}+e,\overleftarrow{M}\bigr)\quad\Longleftrightarrow\quad \overleftarrow{E_{CV,T}(M)} = E_{\overleftarrow{CV} + e,\overleftarrow{T}}\bigl(\overleftarrow{M}\bigr). $$
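This equivalence follows directly from (1), using only the fact that rotation commutes with XOR. Expanding both sides of the left-hand statement gives

$$ \overleftarrow{F_T(CV,M)} = \overleftarrow{E_{CV,T}(M)\oplus M} = \overleftarrow{E_{CV,T}(M)}\oplus\overleftarrow{M}, \qquad F_{\overleftarrow{T}}\bigl(\overleftarrow{CV}+e,\overleftarrow{M}\bigr) = E_{\overleftarrow{CV}+e,\overleftarrow{T}}\bigl(\overleftarrow{M}\bigr)\oplus\overleftarrow{M}, $$

and cancelling the common term \(\overleftarrow{M}\) leaves exactly the right-hand statement about E.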

Hence we can introduce an appropriate definition for a tweakable cipher.

Definition 2

A set

$$\bigl\{e; (P_1,K_1,T_1), (P_2,K_2,T_2),\ldots,(P_q, K_q,T_q)\bigr\}$$

is called a rotational q-collision set for a tweakable cipher \(E_{K,T}(P)\) if

$$ \begin{cases} \overleftarrow{E_{K_1,T_1}(P_1)} = E_{\overleftarrow{K_1}+ e,\overleftarrow{T_1}}\bigl(\overleftarrow{P_1}\bigr);\\[4pt] \overleftarrow{E_{K_2,T_2}(P_2)} = E_{\overleftarrow{K_2}+ e,\overleftarrow{T_2}}\bigl(\overleftarrow{P_2}\bigr);\\ \vdots\\[4pt] \overleftarrow{E_{K_q,T_q}(P_q)} = E_{\overleftarrow{K_q}+ e,\overleftarrow{T_q}}\bigl(\overleftarrow{P_q}\bigr).\\ \end{cases} $$
(5)

We follow the line of the first distinguisher for the full AES [7] and compare the problem of finding a rotational collision set for an ideal cipher with that for reduced Threefish. By ideal cipher, as usual, we understand a set of randomly chosen permutations of cardinality equal to the size of the key space. Our results demonstrate that the versions of Threefish that we consider do not behave like an ideal cipher with respect to this rotational property. Afterwards we proceed with the same statement on the compression function.

The complexity of the generic attack is measured in the number of queries to the encryption and decryption oracles of an ideal cipher.

Lemma 1

To construct a rotational q-collision set for an ideal (tweakable) cipher with an n-bit block and key and success rate 1/2, an adversary needs at least \(\mathrm{min}(\frac{q}{12}\cdot 2^{({(q-1)}/{(q+1)})n}, 2^{n-1})\) queries.

Proof

Let A be an adversary attacking the cipher, and assume that A asks its oracles a total of L queries, where \(L<2^{n-1}\). Let us compute the probability of the event that a rotational q-collision set (5) is found. The probability is taken over all possible choices of permutations for the cipher.

First, we denote the equations in (5) as \(U_1,U_2,\ldots,U_q\). With each equation we associate an integer \(t_j\) such that the \(t_j\)th oracle query computes the chronologically second element of \(U_j\) and hence is able to check whether the equation holds. Without loss of generality, assume that \(t_1<t_2<\cdots<t_q\). Finally, define \(t_{1}'\) as the index of the query that computes the first element of the equation \(U_1\):

(6)

Now compute for every tuple \((t_{1}',t_{1},t_{2},t_{3},\ldots, t_{q})\) the probability that it leads to a rotational q-collision set. Before submitting the \(t_i\)th query, \(i>1\), equations \(U_1,U_2,\ldots,U_{i-1}\) hold, where the terms of \(U_1,U_2,\ldots,U_{i-1}\) are completely determined by the tuple \((t_{1}',t_{1},t_{2},t_{3},\ldots,t_{i-1})\). Indeed, from \(t_{1}'\) and \(t_1\) we define \(K_1\), e, \(P_1\), \(T_1\), and the rotation amount; from \(t_j\) we define \(K_j\), \(T_j\), and \(P_j\).

Just before the moment \(t_i\) only one term of \(U_i\) is computed; w.l.o.g. let it be \(E_{K_{i},T_{i}}(P_{i})\). Thus the following equation should hold:

$$\overleftarrow{E_{K_i,T_i}(P_i)}= \underbrace{E_{\overleftarrow{K_i}+ e,\overleftarrow{T_i}}\bigl(\overleftarrow{P_i}\bigr)}_{\mathrm{queried\ at\ }t_i}. $$

By our definition, \(t_i\) is the first moment when \(E_{\overleftarrow{K_{i}}+ e,\overleftarrow{T_{i}}}(\overleftarrow{P_{i}})\) is queried. Then either the decryption or the encryption oracle is called. In the first case the decryption oracle is called with a ciphertext C and a key K, which for some i should be equal to \(\overleftarrow{K_{i}}+ e\). By the definition of \(t_i\), the value C is chosen from the set where \(E_{\overleftarrow{K_{i}}+ e,\overleftarrow{T_{i}}}(\cdot)\) is undefined. To become a part of a rotational q-collision set, there should exist \(P_i\) such that \(C = \overleftarrow{E_{K_{i},T_{i}}(P_{i})} \). On the other hand, after the decryption oracle is called, the following equation should hold:

$$ E_{\overleftarrow{K_i}+ e,\overleftarrow{T_i}}^{-1}(C)=\overleftarrow{P_i}. $$
(7)

Since \(L<2^{n-1}\), not more than \(2^{n-1}\) texts were encrypted or decrypted with the key \(\overleftarrow{K_{i}}+ e\). So the probability that (7) holds does not exceed \(1/2^{n-1}\).

In the second case, let the encryption oracle be queried with a plaintext P, tweak T, and a key K, which for some i should be equal to \(\overleftarrow{K_{i}}+ e\). For an answer C, a similar equation should hold:

$$ C=\overleftarrow{E_{K_i,T_i}(P_i)}. $$
(8)

The same probability argument holds for this equation. Therefore, for every \(t_i\), \(i\ge 2\), we get a multiplier \(2^{1-n}\) to the probability that a tuple \((t_{1}',t_{1},t_{2},t_{3},\ldots, t_{q})\) defines a rotational q-collision set. There are \(\binom{L}{q+1}\) such tuples, each defining a rotational q-collision set with probability at most \(2^{(q-1)(1-n)}\). We get the following inequality for the number of queries required to get a q-collision set with probability 1/2:

$$ \binom{L}{q+1} \geq 2^{(q-1)(n-1)-1}. $$
(9)

Let us simplify the left part:

(10)

Substitute the result to (9):

(11)

This concludes the proof. □
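The step from (9) to the bound stated in Lemma 1 can be reproduced in several ways. One possible route, which is only a sketch and not necessarily the one taken in the original derivation, uses the standard estimate \(\binom{L}{k}\le (3L/k)^{k}\):

$$ \Bigl(\frac{3L}{q+1}\Bigr)^{q+1} \geq \binom{L}{q+1} \geq 2^{(q-1)(n-1)-1} \quad\Longrightarrow\quad L \geq \frac{q+1}{3}\cdot 2^{\frac{(q-1)(n-1)-1}{q+1}} = \frac{q+1}{3}\cdot 2^{-\frac{q}{q+1}}\cdot 2^{\frac{q-1}{q+1}n} > \frac{q}{12}\cdot 2^{\frac{q-1}{q+1}n}. $$

The last inequality holds because \(2^{-q/(q+1)}>1/2\) and \((q+1)/6>q/12\). The second term of the minimum, \(2^{n-1}\), accounts for the assumption \(L<2^{n-1}\) made at the beginning of the proof.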

Let us consider compression functions now. An ideal compression function is introduced similarly to the ideal cipher and is an equivalent of a random PRF. So, an ideal compression function over a particular domain with a given range is a set of randomly chosen transformations with the same domain and range.

Theorem 1

To construct a rotational q-collision set for an ideal compression function with an n-bit output and success rate 1/2, an adversary needs at least \(\mathrm{min}(\frac{q}{12}\cdot 2^{({(q-1)}/{(q+1)})n}, 2^{n-1})\) queries.

Proof

The proof is almost identical to the proof of Lemma 1. However, we are equipped with a single compression oracle, and do not perform any sort of decryption. Hence we merely omit from the proof the “first case” where the decryption oracle is called. Given the bound of \(2^{n-1}\) queries, we find that for every tuple of query indices of size q+1 the probability that it defines a rotational q-collision set does not exceed \(2^{(q-1)(1-n)}\). The rest of the proof remains the same. □

In the next section we show how to obtain a rotational q-collision set for reduced Threefish and the Skein compression function.

5 Rotational Rebound Attack on the Skein Compression Function

5.1 Overview

Our goal in this section is to construct a rotational q-collision set for the cipher Threefish, which immediately converts to a rotational q-collision set for the Skein compression function. We proceed as follows:

  • Fix the optimal rotation amount;

  • Find and fix the optimal key values \(K_i\);

  • Calculate the transition probabilities and demonstrate that there exist inputs for which the rotational property holds throughout reduced Threefish (details in Sect. 5.3);

  • Identify the rounds for the inbound phase, where states conforming to the rotational property can be generated efficiently, and show the procedure;

  • Identify neutral bits that help to ensure the rotational property beyond the inbound phase;

  • Identify remaining degrees of freedom and estimate the total complexity of the attack including the outbound phase;

  • Demonstrate that for some q the attack outputs rotational q-collisions faster than the lower bound for the ideal case allows.

Having all these details elaborated, the attack would proceed as follows:

  1. Produce internal states that conform to the rotational property through the inbound phase;

  2. Filter out those for which the rotational property does not hold in the rounds of the acceleration phase;

  3. Generate more solutions for the rounds covered in the acceleration phase;

  4. Filter out those that do not conform to the rotational property through the outbound phase.

The attack is illustrated in Fig. 4 and summarized in Table 2.

Fig. 4. The complete rotational rebound attack on Threefish-256, -512. Arrows indicate the direction of the computation.

Table 2. Structure of the rebound attack on Skein.

Eventually for the fixed correction e and rotational amount r we produce tuples (P,K,T) such that

$$E_{\overleftarrow{K}+ e,\overleftarrow{T}}(\overleftarrow{P}) = \overleftarrow{E_{K,T}(P)}, $$

where E is the Threefish-256 reduced to rounds 2–54 (0–50 for the 512-bit version), without the encompassing key addition. For the Skein compression function, this yields tuples (CV,T,M) such that

$$F_{\overleftarrow{T}}\bigl(\overleftarrow{CV} + e, \overleftarrow{M}\bigr) = \overleftarrow{F_T(CV,M)} $$

for the same e. The total complexity is about \(2^{239}\) per tuple in Skein-256, and \(2^{480}\) per tuple in Skein-512. Here and below, the complexity unit is one evaluation of the compression function. The memory consumption is about \(2^{30}\) Skein states. Having fixed \(q=2^{6}\) for both variants, we are able to construct a rotational q-collision set for the Skein compression function with complexity lower than for an ideal compression function. Also, we can construct a rotational q-collision set for the cipher Threefish with complexity lower than for an ideal cipher. This proves the distinguishing nature of our attack.
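The comparison with the generic bound of Lemma 1 and Theorem 1 can be checked numerically. The sketch below works with base-2 logarithms and assumes that the total attack cost is simply q times the per-tuple cost quoted above, with \(q=2^{6}=64\) tuples; the function names are illustrative.

```python
import math


def log2_generic_bound(q, n):
    """log2 of min(q/12 * 2^(((q-1)/(q+1)) * n), 2^(n-1)) from Lemma 1."""
    ideal = math.log2(q / 12) + (q - 1) / (q + 1) * n
    return min(ideal, n - 1)


def log2_attack_cost(q, log2_per_tuple):
    """log2 of q * 2^log2_per_tuple (assumed total cost for q tuples)."""
    return math.log2(q) + log2_per_tuple


q = 2 ** 6
for name, n, per_tuple in (("Skein-256", 256, 239), ("Skein-512", 512, 480)):
    attack = log2_attack_cost(q, per_tuple)
    generic = log2_generic_bound(q, n)
    print(name, "attack 2^%.1f" % attack, "generic 2^%.1f" % generic, attack < generic)
```

Under these assumptions the attack cost stays below the generic bound for both variants, which is the content of the distinguishing claim.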

5.2 Selecting Parameters for the Attack

Rotation Amount

We recalled in Sect. 3.1 that the rotation by 1 bit delivers the highest probability for the modular addition. However, the constant \(\mathtt{0x55\ldots 5}\) is invariant to the rotation by 2 bits. Since the other constants used in Threefish—round counters—are not rotation-invariant for any amount, we select the rotation amount as two bits to the left.

Corrections

Our experiments showed that the counter in the key schedule quite often prevents the rotational property from holding for several consecutive rounds. If there were no corrections, the probability for the rotational property to hold through rounds 4s−1,4s (i.e. with the key addition in between) would be zero for many values of s. To avoid that, we fix some of the key bits and introduce corrections. As a result, the errors introduced by counters, corrections, and modular additions compensate each other, and we want this property to hold for as many rounds as possible.

The optimal key bits and the resulting correction values were the subject of our experiments and were found with an optimized computer search. The optimal values are given in Tables 3 and 4. Hence the inbound phase of the rebound attack starts with assigning actual values to the 24 bits (for Threefish-256) or 48 bits (for Threefish-512) of the key K and its counterpart \(\overleftarrow{K} + e\).

Table 3. Pre-fixed values of key bits for the rotational pair and the decimal value of the correction in Skein-256. The middle 58 bits of \(K_i\) coincide (regarding rotation) in K and its rotated counterpart.
Table 4. Pre-fixed values of key bits and correction in Skein-512.

5.3 Rotational Probabilities

This section gives an outline of the search for optimal key values. More details are given in Appendix B.

We follow the idea of [22], and introduce corrections in the Threefish key pair. However, unlike [22], we consider modular corrections, i.e. we define the related-key pair as \((K, \overleftarrow{K} + e)\), where e is a low-weight correction and + is a modular addition. The rotation amount is fixed to 2 in order to make the constant used in the key schedule, and thus the checksum master key word \(K_{N_{w}}\), susceptible to rotational analysis, and also to maximize the rotational probability of modular additions (see Footnote 1).

To obtain the highest number of rounds in the outbound phase, we find optimal values for the corrections as well as values of several bits of the key pair. These values are found with an exhaustive search on a computer. However, due to the large size of the search space, a simple brute force would be infeasible, and thus we first have to significantly reduce the number of possible candidates by performing a detailed analysis. Below we explain how to optimize the search in Skein-256 (see Fig. B.1, where the rotational pairs are presented one atop another).

First we divide the cipher into pairs of consecutive rounds. There are two types of such pairs. The first type is composed of pairs that do not have a subkey addition in between the rounds, e.g. rounds 5 and 6, rounds 9 and 10, etc. As such double rounds have no operations involving counters (there is no subkey addition), we assume all the input and output pairs of these rounds to be fully rotational. Thus their rotational probability is fixed to \(2^{-8.5}\) for Skein-256 and \(2^{-17}\) for Skein-512. These numbers were obtained empirically with computer experiments and in fact differ from the theoretical values of \(2^{-6.7}\) (4 modular additions in the two rounds, each with rotational probability \(2^{-1.67}\)) and \(2^{-13.4}\) used in [22]. The second type of double rounds consists of pairs of consecutive rounds of Skein-256 that have a subkey addition in between (such as rounds 3 and 4, 7 and 8, etc.). Only such pairs could be used to efficiently prune and optimize the search, and to find the optimal values for the corrections and the key bits. The details of our search are quite technical, and we describe them in Appendix B.

We assigned optimal values to 6×4=24 bits of the first master key, i.e. 4 MSBs and 2 LSBs of each 64-bit word, and 24 bits of the second key. The values of these bits are given in Table 3. Once we had the optimal values of the keys and the optimal differences, we found the probability for four consecutive rounds. We start with a random rotational input pair of states and go through three rounds. Then we add the subkeys (with the particular counters) and go through an additional round. The outcome of this testing is given in Table C.2 of Appendix C. Thus, in Skein-256 the probability to pass rounds 2–41 (i.e. 10 key additions) is about \(2^{-239}\).

Skein-512

Optimal values for the differences and some key bits can be obtained for Skein-512 as well. A property of the double subkey rounds of Skein-512 that helps to run the optimal search is that these two double subkey rounds can be split into two non-overlapping halves (see Fig. C.1 in Appendix C), and then for each half the optimal differences can be found independently. Note that this simply speeds up the search for optimal differences and values, and has no impact on the actual probability of the rotational property. Unlike Skein-256, in Skein-512 we could not find empirically the probabilities for four consecutive rounds because they were too low. Instead, we considered each four rounds as double round + double subkey round and simply multiplied the probabilities of these two. The values for the optimal 6 bits of each key word in Skein-512 are given in Table 4. In Skein-512 the probability to pass rounds 0–41 is about \(2^{-480}\) (details in Table C.3).

5.3.1 Probabilities in the Khovratovich–Nikolić Analysis

The paper [22] provided the rotational analysis of Threefish on up to 42 rounds. The probability estimates were based on several independence assumptions, which must be corrected as follows:

  • The probability of the rotational pair propagation through double rounds without key addition (2–3, 6–7, etc.) is not the product of the probabilities for a single round. The problem is that two consecutive modular additions \((a\boxplus b)\boxplus c\) have lower rotational probability than expected. For example, the rotational probability of one round in Skein-256 is \(2^{-3.35}\) for the rotation by 2, but the probability of two rounds is \(2^{-8.52}\) instead of \(2^{2\cdot(-3.35)}=2^{-6.7}\).

  • The rotational inputs to the round before the key addition (4, 8, etc.) are not uniformly distributed, and this partly compensates for the negative effect of the dependency (see above). We note that the non-uniformity of inputs is best approximated by restricting the two most significant bits to the value {00}.

  • The propagation of the rotational inputs through the double round with the key addition in Threefish-256, with the appearance and the correction of errors, cannot be considered as two independent events (i.e., as getting rotational pairs in the further MIX operations independently). As a result, the probability of this event cannot be computed as a product of other probabilities, and must be computed as a single value.
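The dependency between consecutive additions can be illustrated on a toy model. The following sketch is not the experiment behind the figures above: it assumes uniformly random inputs, a small word size chosen so that exhaustive enumeration is feasible, and two additions chained directly, without the rotation and XOR that separate them in Threefish. The function names are hypothetical. It uses the standard closed form \(\frac{1}{4}(1+2^{r-n}+2^{-r}+2^{-n})\) for a single n-bit addition with rotation amount r, and counts exhaustively how often both additions in \((a\boxplus b)\boxplus c\) preserve the rotational relation.

```python
from itertools import product
from math import log2


def rotl(x, r, n):
    """Rotate the n-bit word x to the left by r positions."""
    mask = (1 << n) - 1
    return ((x << r) | (x >> (n - r))) & mask


def single_add_prob(n, r):
    """Closed-form rotational probability of one n-bit modular addition
    with uniformly random inputs and rotation amount r."""
    return 0.25 * (1 + 2.0 ** (r - n) + 2.0 ** (-r) + 2.0 ** (-n))


def exhaustive_probs(n, r):
    """Exhaustively count, over all n-bit a, b, c, how often
    (1) a + b is rotational, and
    (2) both a + b and (a + b) + c are rotational.
    Toy model only: no rotation/XOR between the two additions."""
    mask = (1 << n) - 1
    one = two = 0
    for a, b in product(range(1 << n), repeat=2):
        s = (a + b) & mask
        if rotl(s, r, n) != (rotl(a, r, n) + rotl(b, r, n)) & mask:
            continue
        one += 1
        for c in range(1 << n):
            if rotl((s + c) & mask, r, n) == (rotl(s, r, n) + rotl(c, r, n)) & mask:
                two += 1
    return one / 4 ** n, two / 8 ** n


if __name__ == "__main__":
    p1, p2 = exhaustive_probs(6, 2)
    print("one addition      : 2^%.2f" % log2(p1))
    print("two chained       : 2^%.2f" % log2(p2))
    print("independent guess : 2^%.2f" % log2(p1 * p1))
    print("64-bit, r = 2     : 2^%.2f" % log2(single_add_prob(64, 2)))
```

In this toy setting the chained probability is visibly smaller than the square of the single-addition probability, which is the qualitative effect described in the first item above. The absolute values are not comparable to the Skein figures.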

5.4 Inbound Phase

We are going to produce a pair of states, keys, and tweaks, that follow the rotational trail in rounds 43–52. The rotational probability for Skein-256 is equal to \(2^{-79.5}\). We show how to produce a conforming input with negligible amortized cost. Please refer to Fig. 5 for more details. These rounds are chosen as the most expensive for the rotational property, which makes the forward direction of the outbound phase very short. Nevertheless, our attack could be equally well run with the inbound phase in the middle at the cost of a slightly increased complexity.

Fig. 5. Probabilities in the inbound phase for Skein-256.

First, we produce \(2^{30}\) states that conform to the rotational trail in rounds 45–46, and do the same for rounds 49–50. This can be done with negligible amortized cost, as we basically need to fix a handful of bits to ensure the rotational property to propagate through modular additions. Then we match those states by determining the value of subkey \(K^{12}\). We have already estimated that with probability \(2^{-20.2}\) the resulting subkey compensates the errors introduced by the counter. Hence we output \(2^{30+30-20}\) solutions for rounds 45–50 with complexity \(2^{60}\).

The subkey \(K^{12}\) determines the words \(K_2\), \(K_3+T_0\), \(K_4+T_1\), \(K_0\). We can freely choose \(K_1\) and \(K_3\) in order to pass through rounds 51–52 and 43–44. We note that only the least significant bits affect the rotational property when computing rounds 51–52 in the forward direction. Having fixed the 20 least significant bits of \(K_1\) and \(K_3\), we can filter out the states not conforming to rounds 51–52. Hence we are left with \(2^{19}\) solutions for rounds 45–52 and 80 bits of freedom left (note the pre-fixed key bit values). Now let us note that the knowledge of \(K_2\), \(K_3+T_0\), \(K_4+T_1\), \(K_0\) and the 20 LSBs of \(K_1\) and \(K_3\) determines the 20 LSBs of subkey \(K^{11}\). This allows us to compute backwards through rounds 43–44 except for the modular additions in round 43, and thus to filter out almost all internal states incompatible with the conditions of rounds 43–44, with about 5 filtering bits left. Hence we produce \(2^{19-20+5}=2^{4}\) states that conform to all but five bit conditions in rounds 43–52, and still have 80 bits of freedom left. The amortized cost of building a solution for the full section of rounds 43–52 is therefore about \(2^{5}\). The solutions for Skein-512 are built in a similar way. The memory consumption is about \(2^{30}\) Skein states.
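Which key and tweak words enter a given subkey follows from the Threefish key schedule. The sketch below is an illustration only: it encodes the schedule as given in the Skein specification (key word index \((s+i) \bmod (N_w+1)\), tweak words \(s \bmod 3\) and \((s+1) \bmod 3\) on the third-to-last and second-to-last positions, and the counter s on the last position), and the function name and the string output format are hypothetical. For s = 12 and four words it reproduces the combination listed above, with the counter on the last word written out explicitly.

```python
def subkey_words(s, nw=4):
    """Return a symbolic description of the nw words of Threefish subkey s.

    Assumes the key schedule of the Skein specification:
    word i uses key word (s + i) mod (nw + 1); the third-to-last and
    second-to-last words also add a tweak word, the last word adds s.
    """
    words = []
    for i in range(nw):
        term = f"K{(s + i) % (nw + 1)}"
        if i == nw - 3:
            term += f"+T{s % 3}"
        elif i == nw - 2:
            term += f"+T{(s + 1) % 3}"
        elif i == nw - 1:
            term += f"+{s}"
        words.append(term)
    return words


if __name__ == "__main__":
    for s in (11, 12):
        print(s, subkey_words(s))
```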

In the next section we shall see how to further exploit the degrees of freedom we apparently have left.

5.5 Acceleration Phase

The acceleration phase of the attack may be seen as part of the inbound phase or part of the outbound phase. Technically, starting from here computations are done in an inside-out manner, yet remaining degrees of freedom are used to accelerate the search for right pairs in the outbound phase.

As soon as we get a right pair of computations for the inbound phase, we produce many more of them from the given one as follows. We follow the simple idea of neutral bits as applied, e.g., in the analysis of SHA-0 and SHA-1 [5]. We view them as an auxiliary path [19] (also formalized as tunnels or submarines in [25, 36, 42]) and apply the differences specified by the path to the key and the tweak.

The configuration of the auxiliary path for Skein-256 is given in Table 5. We apply the original path difference to the first execution of the pair, and the rotated path difference to the second execution.

Table 5. Configuration of the auxiliary path for Skein-256. \(K_i\) is the ith word of the first subkey \(K^{0}\).

We consider ⊕-differences here, so we have to take into account the fact that the tweak and the key are injected by modular addition. Therefore, we choose the difference so that the probability of observing a carry is low. However, since adjacent bits are often neutral as well, a carry bit may still preserve the rotational pair.

In Skein-256 we take various δ and apply the resulting auxiliary path \(\mathcal{P}_{\delta}\) to the right pair. We choose δ so that the differences in the subkey \(K^{12}\) compensate each other. Then we check whether the modular additions in rounds 41–42 and 53–54 are not affected by the modification. If so, we get another rotational pair for rounds 41–54.

In experiments, we found that 44 of the 64 possible individual bits that result in a local collision of the latter type behave neutrally with probability larger than 0.75 for three rounds in the forward direction and simultaneously two rounds in the backward direction; 37 consecutive bits among them have a probability very close to 1. Details for this phase can be found in Table C.1 in the Appendix. Overall, the results mean that every time those four rounds in the outbound phase are computed, and the effort of those is less than \(2^{37}\), the amortized effort for those computations will be negligible. If the effort for those five rounds is more, the effect of this acceleration phase, the speed-up, still grows roughly exponentially with the number of neutral bits used.

5.6 Degrees of Freedom Analysis

Now we discuss the following question: How often can this inbound phase be repeated? After fixing the differences and the corrections, for Skein-256 we have 256+256+128=640 degrees of freedom available to perform the attack. The outbound phase fixes 24 of the 256 bits of the key (also 12 bits of the 128-bit tweak), and in addition may need up to 256 bits to follow the longest possible trail with high probability. What remains is 640−36−256=348 degrees of freedom to be spent by the inbound and the acceleration phase. In Skein-512 we would have 512+512+128=1152 degrees of freedom, of which 1152−512−60=580 bits are left. If variants with fewer rounds are targeted, this number is higher, as fewer repetitions are needed for the shorter outbound phase. Overall, this is enough for our purposes.

5.7 Summary and Complexity Estimates

We experimentally verified the probabilities of the outbound phase, and took various dependencies into account, and also experimentally verified parts of the acceleration and inbound phase.

Using the Skein-256 compression function as an example, we describe the resulting attack. As already illustrated in Fig. 4, the 8-round inbound part is performed close to the output of the cipher/compression function, with the 4-round acceleration area (2 rounds in each direction) surrounding it. The majority of the inside-out computation is then done in the backward direction, covering 38 rounds for Skein-256 and 40 rounds for Skein-512. In total this gives about 52/54 rounds. Additionally, early stopping techniques will only require the computation of a small number of rounds in the outbound part before another trial is made, saving a factor of the computational complexity that is in the order of the number of rounds.

We estimate the amortized cost for the rounds covered by the inbound and acceleration phases for both Skein-256 and Skein-512 by a computation that is equivalent to a single computation of the compression function, as there are plenty of neutral bits that cover up the costs of solving for the right pairs in those inner rounds. In Skein-256, we will spend \(2^{239}\) computations in the inbound+acceleration phases to find \(2^{239}\) starting pairs for the outbound phase. One such pair will pass this phase with probability close to one. Therefore with an effort that is roughly equivalent to \(2^{239}\) calls to the compression function of Skein-256 we can find one rotational pair of messages and chaining values (with corrections) that produces a rotational pair of updated chaining values. To produce \(2^{6}\) such pairs, i.e. to find \(2^{6}\)-rotational collisions in Skein-256, we only need \(2^{6+239}=2^{245}\) calls. On the other hand, in an ideal function one has to make at least \(2^{2.5}\cdot 2^{\frac{64-1}{64+1}256}\approx 2^{250}\) calls (see Lemma 1).

Similarly, for the compression function of Skein-512, we can create a \(2^{6}\)-rotational collision set with \(2^{6+480}=2^{486}\) compression function calls, while an ideal function would require \(2^{2.5}\cdot 2^{\frac{63}{65}512}\approx 2^{499}\) calls.
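The comparison between the attack and the generic bound is simple exponent arithmetic and can be rechecked with a short script. This is a sketch under the assumption that the constant 64 in the two bounds above is the size q of the collision set and that n is the output size in bits; the function names are hypothetical. All quantities are base-2 logarithms of the number of compression function calls.

```python
from math import log2


def ideal_log2(q, n):
    """log2 of the generic bound 2^2.5 * 2^((q-1)/(q+1) * n),
    read off from the two bounds quoted in the text (q = 64)."""
    return 2.5 + (q - 1) / (q + 1) * n


def attack_log2(q, single_pair_log2):
    """log2 of the attack cost: q pairs, each at 2^single_pair_log2 calls."""
    return log2(q) + single_pair_log2


if __name__ == "__main__":
    for name, n, single in (("Skein-256", 256, 239), ("Skein-512", 512, 480)):
        print(name,
              "attack 2^%.0f" % attack_log2(64, single),
              "ideal 2^%.1f" % ideal_log2(64, n))
```

The generic exponents evaluate to roughly 250.6 and 498.7, in line with the approximations quoted above, and exceed the attack exponents 245 and 486 in both cases.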

5.8 Probabilities with the New Key Schedule

Skein v1.3 differs from v1.2 in the constant used to generate the subkey word \(K_{N_{w}}\). As a result, a rotated key will not generate rotated subkeys: every fifth word in the subkey sequence would violate the rotational property. Consequently, most key addition layers would generate additional rotational errors. We expect those errors to vanish with probability not higher than \(2^{-32}\), which subtracts at least 20 rounds from the Skein-256 attack, and 15 rounds from the Skein-512 attack, ignoring possible troubles in the inbound phase. We therefore do not pursue our attacks for the new version of Skein.
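The effect of the constant can be checked directly. The sketch below assumes the values given in the Skein specifications, which are not quoted in this paper: 0x5555555555555555 for the older versions and 0x1BD11BDAA9FC1A22 for v1.3, with \(K_{N_w}\) computed as the XOR of the constant and all key words. It also assumes the rotation amount 2 used for the probabilities above. The function names are hypothetical. Since the older constant is invariant under rotation by 2, the extra key word of a rotated key is the rotation of the original extra key word; with the v1.3 constant this no longer holds.

```python
MASK64 = (1 << 64) - 1
C_OLD = 0x5555555555555555   # assumed constant of the pre-v1.3 key schedule
C_V13 = 0x1BD11BDAA9FC1A22   # assumed constant of the v1.3 key schedule


def rotl64(x, r):
    """Rotate a 64-bit word to the left by r positions."""
    return ((x << r) | (x >> (64 - r))) & MASK64


def extra_key_word(key_words, const):
    """K_{N_w}: XOR of the constant with all key words."""
    acc = const
    for w in key_words:
        acc ^= w
    return acc


def extra_word_is_rotational(key_words, const, r=2):
    """True if rotating every key word by r also rotates K_{N_w} by r."""
    rotated_key = [rotl64(w, r) for w in key_words]
    return extra_key_word(rotated_key, const) == rotl64(
        extra_key_word(key_words, const), r)


if __name__ == "__main__":
    key = [0x0123456789ABCDEF, 0xFEDCBA9876543210, 0x0F1E2D3C4B5A6978, 0x1]
    print("old constant :", extra_word_is_rotational(key, C_OLD))
    print("v1.3 constant:", extra_word_is_rotational(key, C_V13))
```

Because XOR commutes with rotation, the outcome depends only on whether the constant itself is invariant under the rotation, not on the particular key chosen.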

6 Conclusion and Future Work

Our results do not threaten the practical use of full-round Skein or Threefish. However, we show that reduced versions of these constructions behave in a non-random manner in settings where all or most of the inputs could be chosen, and this holds for many more rounds than initially expected. We argue that variants of Threefish reduced from 72 to about 52/54 rounds, in the chosen-key-and-tweak model, do not behave like an ideal cipher with respect to the rotational property we have defined. Remember that the ideal cipher model implies that the key is freely chosen, and hence nothing is said about the security of Threefish as a PRP. For the compression function of Skein a similar argument is made. Due to the finalization round, our results are unlikely to carry over to the actual hash function.

To summarize, the following ideas and approaches lead to the improved results:

  • The rebound approach as a high-level model for the attack.

  • Considering rotational corrections with respect to integer addition instead of XOR.

  • Based on analytic reasoning, we find an efficient search method for fixing a subset of input bits before other phases of attacks.

  • Using the degrees of freedom in the internal state to efficiently solve for the inner 8 rounds.

  • Using the 8-round local collision as long-range neutral bits in an inside-out manner to speed up the outbound phase.

It will be interesting to study how rotational properties found in other constructions, some of which have been reported recently (for SHA-3 e.g. in [35]), can also be amplified in a way similar to what we demonstrated in this paper for Skein. Our new methods cannot directly be used to recover key bits in Threefish in a secret-key model; this remains an open problem. The inbound and acceleration techniques we use in our analysis are to a large extent independent of the statistical property that is meant to be produced at the inputs and outputs of Skein. Hence, in addition to the rotational attacks described in this paper, more traditional differential attacks aiming for collisions or near-collisions will also be able to take advantage of those techniques.