1 Introduction

Impossible differential cryptanalysis is a very powerful attack against block ciphers introduced independently by Knudsen [18] and Biham et al. [3]. The idea of these attacks is to exploit impossible differentials, which are differentials occurring with probability zero. The general approach is then to extend the impossible differential by some rounds, possibly in both directions, guess the key bits that intervene in these rounds and check whether a trial pair is partially encrypted (or decrypted) to the impossible differential. In this case, we know that the guessed key bits are certainly wrong and we can remove the subsequent key from the candidate key space. Impossible differential attacks have been successfully applied to a large variety of block ciphers, based both on the SPN and the Feistel construction. In some cases, they yield the best cryptanalysis against the targeted cipher; this is the case for the standardized Feistel cipher Camellia [10, 25], for example. Furthermore, impossible differential attacks were for a long time the most successful attacks against AES-128 [27, 29, 39].

Recently, a generic complexity analysis of impossible differential attacks against Feistel ciphers was presented [10]. Thanks to this generalized vision, several flaws in previous attacks were detected and many new attacks were proposed. Our work is the natural extension of the analysis given in [10], which has inspired new results and analyses since its publication (e.g., [4, 5, 11, 23, 32, 38]). The techniques introduced in this paper correct, complete and improve the techniques and analyses given in [10]. We further show how to combine all of these concepts in practice to mount optimized impossible differential attacks. In our applications, and in contrast to [10], we consider SPN ciphers. It is important to recall here that the time complexity formula of [10] is a lower-bound approximation. This approximation is met in practice most of the time, but as shown in [11], some counter-examples may exist. So, as already pointed out in [10], we insist here on the fact that the exact complexity of each attack needs to be carefully computed.

1.1 Our Contributions

The main contributions of this paper are as follows.

Correction of the time complexity approximation taking into account the role of the key schedule. The first contribution of this paper is related to the role that the nature of the key schedule plays in an impossible differential attack. Indeed, if the key schedule is nonlinear and has sufficiently good diffusion, then it is usually not trivial to translate guessed information on a subkey into information on the master key. In this case, the key schedule can be seen as a black box between the first and last subkeys. We show that this implies a new term must be taken into account in the time complexity evaluation. This remark results in a more accurate estimate of the time complexity and then leads to a correction of the time complexity formula provided in [10].

New technique for improving data complexity: Multiple differentials (not to be confused with the technique of multiple impossible differentials introduced in [27]). Our second contribution is to apply the technique of multiple differentials to impossible differential attacks, in order to reduce the data complexity. While this idea seems quite natural, these two techniques had never been combined before. Applying this idea, sometimes in combination with multiple impossible differentials, leads to improved attacks against many ciphers, as we show through some concrete applications.

Experimental verifications of the introduced techniques. Our third contribution is to experimentally verify the theoretical complexities of our techniques and those of [10]. More precisely, we have implemented the state-test technique and the use of multiple (impossible) differentials with toy examples. In the state-test case, we show that the estimated complexity gain matches the real gain. With respect to the multiple (impossible) differentials, we have performed several experiments, leading to the following important conclusions:

  • When the wanted probability of keeping a random partial key as candidate is around 1 / 2 (implying a certain needed number of pairs), the use of any multiple output (impossible) differential will lead to a data complexity matching the formulas. In the case of multiple input (impossible) differentials, the obtained complexities will match the theoretical complexities only if the amount and form of needed pairs allow the plaintext structures to be optimally exploited and will be slightly reduced otherwise.

  • When the wanted probability of keeping a random key is much smaller, e.g., if we only want to keep the correct secret key at the end of the attack, the corresponding amount of pairs will be slightly increased if multiple (impossible) differentials are considered (whether they are input or output ones), as a direct consequence of the higher number of key bits being involved. This previously unknown side effect will also imply a divergence with respect to the formulas. This divergence adds to the previous one in the case of non-optimal input configurations.

To the best of our knowledge, this is the first time these techniques have been implemented.

Multiple impossible differentials vs. simple impossible differentials. We provide a discussion on the comparison of an attack that exploits multiple impossible differentials with an attack with similar parameters but that exploits only one impossible differential. An interesting question that arises in this type of situation is whether there are cases where an attack using a single impossible differential provides better complexities than an attack exploiting multiple impossible differentials. To answer this question, we provide in Sect. 5 an application against the block cipher ARIA-128 and we demonstrate that while the data complexity is always worse in the single case, the time complexity of an attack with a single impossible differential can sometimes be slightly better.

Application to various block ciphers. We apply our techniques to a variety of block ciphers. Our goal is to demonstrate the practical combination of our techniques with some of those of [10] (such as the state-test technique, for example). This is a technical task which was not correctly treated in [10]. Table 1 shows the complexities of all of our attacks together with a summary of the best known cryptanalyses on the targeted ciphers. As the table shows, the techniques of this paper permitted us to improve the attacks of [10] against the Feistel ciphers LBlock, CLEFIA-128 and Camellia-256 (without FL layers and whitening keys). In fact, most of the attacks that we provide improve on the memory complexities of the best known attacks. We also improve on the best known impossible differential attacks against three SPN block ciphers, namely AES-128, CRYPTON-128 and ARIA-128. While other types of cryptanalysis have led to more powerful attacks on these three ciphers, our techniques still yield an interesting improvement on previous impossible differential attacks. Each of these applications illustrates a different combination of our methods. Only our application against 7-round AES-128 will be treated in full here (the other applications are sketched more briefly). This attack has the best memory complexity among all known attacks against AES-128 (though its time complexity does not improve on the best). This attack gives a perfect illustration of how to practically combine almost all of the techniques introduced in this paper.

Table 1 Summary of best single-key attacks against AES-128, CRYPTON-128, ARIA-128, CLEFIA-128, Camellia-256‡ and LBlock

The rest of the paper is organized as follows: Sect. 2 presents our new techniques and remarks on impossible differential attacks. The role of the key schedule is discussed, the combination of multiple differentials and multiple impossible differentials is presented, and a corrected formula for estimating the time complexity of an attack is given. Section 3 is dedicated to the implementation of the introduced techniques on toy ciphers. Finally, Sect. 4 presents our attacks against AES-128, CRYPTON-128 and ARIA-128.

2 Impossible Differential Cryptanalysis

We provide here the basic principles of an impossible differential attack and introduce the notation that will be used throughout this paper.

2.1 An Overview of Impossible Differential Cryptanalysis

We start by recalling the framework introduced in [10].

An impossible differential attack against an n-bit block cipher, parametrized by a key \(\mathcal {K}\) of length K, starts with the discovery of an impossible differential composed of an input difference \(\mathcal {D}_X\) that propagates after \(r_{\mathcal {D}}\) rounds to an output difference \(\mathcal {D}_Y\) with probability zero. After this, one extends this differential \(r_\mathrm{in}\) rounds backward to obtain a difference that we will denote \(\mathcal {D}_\mathrm{in}\) and \(r_\mathrm{out}\) rounds forward to obtain a difference called \(\mathcal {D}_\mathrm{out}\). The \(\log _2\) of the size of a set \(\mathcal {D}\) will be denoted by \(\varDelta \).

The two appended differentials are used to eliminate the candidate keys that encrypt and decrypt data to the impossible differential. Indeed, if for a candidate key both differentials \(\mathcal {D}_\mathrm{in} \rightarrow \mathcal {D}_{X}\) and \(\mathcal {D}_\mathrm{out} \rightarrow \mathcal {D}_Y\) are satisfied, then this key is certainly wrong as it leads to an impossible differential and must therefore be rejected.


Two important quantities in an impossible differential attack are the total number of key bits that intervene in the appended rounds and the number of bit-conditions that must be satisfied in order to get \(\mathcal {D}_X\) from \(\mathcal {D}_\mathrm{in}\) and \(\mathcal {D}_Y\) from \(\mathcal {D}_\mathrm{out}\). We will therefore let \(k_\mathrm{in}\) (resp. \(k_\mathrm{out}\)) denote the number of key bits that have to be guessed during the first (resp. last) rounds, and \(|k_\mathrm{in}\cup k_\mathrm{out}|\) the entropy of the involved key bits when considering relations due to the key schedule. Similarly, \(c_\mathrm{in}\) (resp. \(c_\mathrm{out}\)) will denote the number of bit-conditions to be verified during the first (resp. last) rounds.

We continue by briefly recalling how to determine the number of pairs needed for the attack.

The probability that for a given key, a pair of inputs already satisfying the differences \(\mathcal {D}_\mathrm{in}\) and \(\mathcal {D}_\mathrm{out}\) verifies all the \((c_\mathrm{in} + c_\mathrm{out})\) bit-conditions is \(2^{-(c_\mathrm{in}+c_\mathrm{out})}\). In other words, this is the probability that for a pair of inputs satisfying the difference \(\mathcal {D}_\mathrm{in}\) and whose outputs satisfy the difference \(\mathcal {D}_\mathrm{out}\), a key from the possible key set is discarded. Therefore, by repeating the procedure with N different input (or output) pairs, the probability that a trial key is kept in the candidate keys set is

$$\begin{aligned} P = (1 - 2^{-(c_\mathrm{in}+c_\mathrm{out})})^N. \end{aligned}$$

There is no unique strategy for choosing the number of input (or output) pairs N. This choice principally depends on the overall time complexity, which is influenced by N, and the induced data complexity. Different trade-offs are therefore possible. A popular strategy generally used by default is to choose N such that only the right key is left after the sieving procedure. This amounts to choosing P as

$$\begin{aligned} P = (1 - 2^{-(c_\mathrm{in}+c_\mathrm{out})})^N < \frac{1}{2^{|k_\mathrm{in}\cup k_\mathrm{out}|}}. \end{aligned}$$

However, as shown in [10], a different approach can be applied that helps to reduce the number of pairs needed for the attack and offers better trade-offs between the data and time complexity. More precisely, it is permitted to consider smaller values of N. By proceeding like this, one will probably be left with more than one key in the candidate keys set and will need to proceed to an exhaustive search among the remaining candidates, but the total time complexity of the attack will probably be much lower. In practice, one will start by considering values of N such that P is slightly smaller than \(\frac{1}{2}\) so as to reduce the exhaustive search by at least one bit. So N should be chosen such that

$$\begin{aligned} P=(1-2^{-(c_\mathrm{in}+c_\mathrm{out})})^N \approx e^{-N \times 2^{-(c_\mathrm{in} + c_\mathrm{out})}}< \frac{1}{2}. \end{aligned}$$
(1)
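As an illustration of Formula (1) and of the stricter default strategy, the following minimal sketch computes the number of pairs needed to reach a target survival probability \(P = 2^{-t}\), using the approximation \(P \approx e^{-N \times 2^{-(c_\mathrm{in}+c_\mathrm{out})}}\). The function name and the parameter values in the usage lines are hypothetical and only serve the example.

```python
import math

def log2_pairs_needed(c_in: int, c_out: int, t: float) -> float:
    """Return log2(N) such that exp(-N * 2^-(c_in + c_out)) equals 2^-t.

    t = 1 corresponds to P = 1/2 (Formula (1));
    t = |k_in U k_out| corresponds to keeping only the right key.
    """
    if t <= 0:
        raise ValueError("t must be positive")
    # exp(-N * 2^-c) = 2^-t  <=>  N = t * ln(2) * 2^c
    return (c_in + c_out) + math.log2(t * math.log(2))

# Hypothetical parameters: 22 bit-conditions and 64 involved key bits.
print(log2_pairs_needed(10, 12, 1))    # pairs for P = 1/2
print(log2_pairs_needed(10, 12, 64))   # pairs for P = 2^-64
```

Since \((1-p)^N < e^{-Np}\) for \(0< p < 1\), rounding the returned value up to an integer number of pairs always gives a true probability below the target.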

We recall here that the quantity N determines the memory complexity of the attack.

The data complexity of an attack can be determined by the following formula given in [10].

$$\begin{aligned} C_{N}=\max \left\{ \min _{{\varDelta } \in \{\varDelta _\mathrm{in}, \varDelta _\mathrm{out} \}}\left\{ \sqrt{N2^{n+1-\varDelta }}\right\} ,N2^{n+1-\varDelta _\mathrm{in}-\varDelta _\mathrm{out}}\right\} , \end{aligned}$$
(2)

where \(\varDelta _\mathrm{in}\) is the number of active bits in \(\mathcal {D}_\mathrm{in}\) (\(\log _2\) of the dimension of the input space) and \(\varDelta _\mathrm{out}\) is the number of active bits in \(\mathcal {D}_\mathrm{out}\).
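Formula (2) is convenient to evaluate in the \(\log _2\) domain. The sketch below does this directly; the function name and the numerical values are hypothetical and chosen only to show which of the two terms dominates.

```python
def log2_data_complexity(log2_n: float, n: int, delta_in: int, delta_out: int) -> float:
    """log2 of C_N from Formula (2), with log2_n = log2(N) and block size n."""
    # First term: min over Delta in {Delta_in, Delta_out} of sqrt(N * 2^(n+1-Delta)).
    birthday_term = min((log2_n + n + 1 - d) / 2 for d in (delta_in, delta_out))
    # Second term: N * 2^(n+1-Delta_in-Delta_out).
    structure_term = log2_n + n + 1 - delta_in - delta_out
    return max(birthday_term, structure_term)

# Hypothetical parameters for a 128-bit block cipher.
print(log2_data_complexity(50.0, 128, 32, 32))
# Hypothetical parameters for a 64-bit block cipher with large structures.
print(log2_data_complexity(20.0, 64, 48, 48))
```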

Finally, we recall the analysis of the time complexity presented in [10]. We stress again that the formula provided is a lower-bound approximation of the time complexity. This is due to the fact that each of the terms of this formula represents the minimum complexity of the operations that should be done in order to accomplish each step.

By following the early abort technique, the attack consists in storing the N pairs and testing the key candidates step by step, reducing at each step the number of remaining possible pairs. The time complexity is then determined by three quantities. The first term is the cost \(C_N\), that is the amount of needed data [see Formula (2)] for obtaining the N pairs, where N is such that \(P<1/2\). The second term corresponds to the number of candidate keys \(2^{|k_\mathrm{in} \cup k_\mathrm{out}|}\), multiplied by the average cost of testing the remaining pairs. For all the applications that we have studied, this cost can be very closely approximated by \(\left( N+2^{|k_\mathrm{in}\cup k_\mathrm{out}|}\frac{N}{2^{c_\mathrm{in}+c_\mathrm{out}}}\right) C_\mathrm{E}' \), where \(C_\mathrm{E}'\) is the ratio of the cost of partial encryption to the full encryption. Finally, the third term is the cost of the exhaustive search for the key candidates still in the candidate keys set after the sieving. By taking into account the cost of one encryption \(C_\mathrm{E}\), the approximation of the time complexity is given by

$$\begin{aligned} C_{T} = \left( C_{N}+\left( N+2^{|k_\mathrm{in}\cup k_\mathrm{out}|}\frac{N}{2^{c_\mathrm{in}+c_\mathrm{out}}}\right) C_\mathrm{E}' +2^{K}P\right) C_\mathrm{E}. \end{aligned}$$
(3)

Obviously, as the attack complexity should be smaller than that of exhaustive search, the quantity \(C_T\) should be smaller than \(2^{K}C_\mathrm{E}\). In Sect. 2.5, after discussing the role of the key schedule in an impossible differential attack and after presenting our new techniques, we provide a corrected time complexity formula that takes all of the above into account.

In all of the applications that we provide at the end of this paper, we aim to derive different possible trade-offs for the time, data and memory complexity of an attack. For this reason, we introduce a parameter \(\varepsilon \) offering this possibility. More precisely, we take \(N = 2^{c_\mathrm{in} + c_\mathrm{out} + \varepsilon }\). The data and time complexity formulas are subsequently modified. Different values of \(\varepsilon \) provide different complexity trade-offs.
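To make the role of \(\varepsilon \) concrete: with \(N = 2^{c_\mathrm{in} + c_\mathrm{out} + \varepsilon }\), the approximation of Formula (1) gives \(P \approx e^{-2^{\varepsilon }}\), independently of the number of bit-conditions. The following sketch tabulates this and the resulting exhaustive search term \(2^K P\); the function names and the key length used in the loop are assumptions made for illustration only.

```python
import math

def log2_survival_probability(epsilon: float) -> float:
    """log2 of P ~ exp(-2^epsilon) when N = 2^(c_in + c_out + epsilon)."""
    return -(2.0 ** epsilon) * math.log2(math.e)

def log2_exhaustive_search(key_len: int, epsilon: float) -> float:
    """log2 of the final term 2^K * P of the time complexity."""
    return key_len + log2_survival_probability(epsilon)

# Hypothetical 128-bit key: each extra bit of epsilon doubles the number of pairs
# and squares the survival probability P.
for eps in range(0, 6):
    print(eps, round(log2_survival_probability(eps), 2),
          round(log2_exhaustive_search(128, eps), 2))
```

Already for \(\varepsilon = 0\) the probability \(e^{-1}\) is below \(\frac{1}{2}\), so larger values of \(\varepsilon \) trade more data and memory for a cheaper final search.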

In [10], it was said that \(\mathcal {D}_\mathrm{in}\) and \(\mathcal {D}_\mathrm{out}\) were obtained by allowing the differences \(\mathcal {D}_X\) and \(\mathcal {D}_Y\) to propagate with probability 1 in the backward and forward directions, respectively. However, we point out here that this restriction is not necessary. In the case of Feistel constructions, it is a common technique to propagate the \(\mathcal {D}_X\) and \(\mathcal {D}_Y\) differences with probability 1, as one usually does not have the choice of doing this in a different manner. However, in the case of SPN ciphers using AES-type matrices for diffusion, considering probabilistic propagation clearly makes sense as it considerably increases the number of possibilities for extending the impossible differential and therefore offers more flexibility to the attacker for finding the best parameters for the cryptanalysis. If we take for example the case of AES, there are usually many possibilities for extending an active state after the MixColumns operation. An attacker can thus choose among all these possible cases and take the transitions that provide the best parameters for her attack.

This remark has important consequences for the data complexity of some attacks. Indeed, as seen by the formulas given in [10], if we allow only transitions \(\mathcal {D}_X \rightarrow \mathcal {D}_\mathrm{in}\) and \(\mathcal {D}_Y \rightarrow \mathcal {D}_\mathrm{out}\) of probability 1, then the equalities \(\varDelta _\mathrm{in}-c_\mathrm{in}=\varDelta _X \text{ and } \varDelta _\mathrm{out}-c_\mathrm{out}=\varDelta _Y\) are true by Bayes’ theorem. Thus, the minimal data complexity, given by \(C_N\), is in this case \(2^{n+1-\varDelta _X-\varDelta _Y}\), meaning for example that if only one impossible differential is considered, the attack on some ciphers will not work, without the use of any special techniques, because of a lack of data. This is the case for impossible differential attacks against the block cipher Simon, for example, where \(\varDelta _X = \varDelta _Y = 0\). Indeed, as can be seen in [10], both \(\mathcal {D}_X\) and \(\mathcal {D}_Y\) in the attack against Simon have only one bit active; therefore, the \(\log _2\) of both these quantities is zero. This then leads to \(C_N \ge 2^{n+1}\), implying that the attack does not work. If the probability for choosing the input and output differences is not 1, then this does not hold anymore and more flexibility is available for choosing the different trade-off parameters.
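The value of the minimal data complexity can be checked with a short computation. As a sketch, assume that \(N \approx 2^{c_\mathrm{in}+c_\mathrm{out}}\) (the order of magnitude required by Formula (1)) and that the second term of Formula (2) is the dominant one:

$$\begin{aligned} N\,2^{n+1-\varDelta _\mathrm{in}-\varDelta _\mathrm{out}} \approx 2^{c_\mathrm{in}+c_\mathrm{out}}\cdot 2^{n+1-\varDelta _\mathrm{in}-\varDelta _\mathrm{out}} = 2^{n+1-(\varDelta _\mathrm{in}-c_\mathrm{in})-(\varDelta _\mathrm{out}-c_\mathrm{out})} = 2^{n+1-\varDelta _X-\varDelta _Y}. \end{aligned}$$

With \(\varDelta _X = \varDelta _Y = 0\), this evaluates to \(2^{n+1}\), which exceeds the \(2^n\) plaintexts that exist for an n-bit block cipher.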

2.2 On the Key Schedule Seen as a Black Box

The first contribution of this paper is to reveal that the nature of the key schedule has an impact on the complexity of an impossible differential attack. Indeed, if the cipher’s key schedule is strongly nonlinear, the first few subkeys necessarily have a very complicated relation with the subkeys of the last rounds. Note that the link between the nature of the key schedule and the complexity of the underlying attack has been independently reported by Derbez [11].

In the context of impossible differential attacks, in general one has to guess key bits that belong to subkeys that have a gap of some rounds between them. If the key schedule is complex, then it is not possible to directly translate the information guessed on the subkey bits into the same amount of information on the master key. For this reason, one has to complete the missing bits to some of the partially known subkeys of the first or of the last rounds until we have enough bits to compute through the key schedule (or its inverse). Once this is done, one can verify if this way of completing the missing bits was correct by checking if the result matches with the previously known key bits of the subkeys found on the other side of the impossible differential.

Usually, the part of the key schedule that connects the subkeys of the first rounds to the subkeys of the last rounds can be seen as a black box, and the computation above should be taken into account in the estimation of the time complexity. Before providing the new term that has to be taken into account in such a situation, we briefly define a classification of the key schedules and give the resulting key bit guessing techniques that should be adopted. We further introduce two new notations, \(k_A\) and \(k_B\), which permit us to partition the key bits to be guessed into two separate groups according to the three following cases:

Linear or almost linear key schedules In such a case, it is possible to directly translate the \(k_\mathrm{in}\) and \(k_\mathrm{out}\) bits of the first and last rounds in the same number of bits of the master key by using the key schedule. Therefore, we set \(k_A = |k_\mathrm{in} \cup k_\mathrm{out}|\) and \(k_B = 0\). For example, the block cipher LBlock has a key schedule of this type, and this was exploited in the attack provided in [9].

Complex key schedule of AES type In cases where it is very complicated to connect the \(k_\mathrm{in}\) bits of the first rounds to the \(k_\mathrm{out}\) bits of the last rounds, we simply set \(k_A = k_\mathrm{in}\) and \(k_B = k_\mathrm{out}\). The block ciphers AES, CRYPTON and ARIA belong to this group.

Complex key schedule of MISTY1 or Camellia type This category also includes ciphers with highly nonlinear key schedules; however, the partition of the key bits into first and last round bits is not always relevant. For example, Camellia-128’s key schedule can be seen as dividing subkeys into two groups, where on the one hand the relation between subkeys of the same group is very easy to compute, but on the other hand it is very complicated to connect subkeys of different groups. The difference with the previous type of key schedule is that these two groups do not exactly correspond to \(k_\mathrm{in}\) and \(k_\mathrm{out}\). Therefore, in such a case \(k_A\) will represent the subkey bits of one group, and \(k_B\) the subkey bits of the other group. The block cipher CLEFIA also has a key schedule of this type.

We are now ready to introduce the term taking into consideration the black box phenomenon that has to be added to Eq. (3): \(\min (2^{K-k_A}, 2^{K-k_B})\cdot P \cdot 2^{k_A + k_B} \cdot C_{KS},\) where \(C_{KS}\) is the key schedule cost. The quantities \(K-k_A\) and \(K-k_B\) correspond to the number of missing key bits that have to be completed. The above term can be simply rewritten as

$$\begin{aligned} \min (2^{K+k_A}, 2^{K+k_B})\cdot P \cdot C_{KS}. \end{aligned}$$

This term, multiplied by \(\max (2^{-k_A},2^{-k_B})\) \(\cdot \frac{1}{C_{KS}},\) gives the number of candidate keys to test.
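The two statements above can be checked against each other. Suppose, for this illustration only, that \(k_A \le k_B\); the other case is symmetric. Then

$$\begin{aligned} \min (2^{K-k_A}, 2^{K-k_B})\cdot P \cdot 2^{k_A + k_B} \cdot C_{KS}&= 2^{K-k_B}\cdot 2^{k_A+k_B}\cdot P \cdot C_{KS} = 2^{K+k_A}\cdot P \cdot C_{KS},\\ 2^{K+k_A}\cdot P \cdot C_{KS}\cdot \max (2^{-k_A},2^{-k_B}) \cdot \frac{1}{C_{KS}}&= 2^{K+k_A}\cdot 2^{-k_A}\cdot P = 2^{K}\cdot P, \end{aligned}$$

so the number of candidate keys to test is \(2^{K} P\), which agrees with the last term of Eq. (3).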

To conclude this paragraph, we emphasize that the remark presented in this section had never been pointed out before and was not taken into consideration in [10]. Indeed, in many previous attacks, even if the key schedule of the analyzed cipher becomes highly nonlinear through the rounds, it was wrongly supposed that one guessed word of a subkey could directly be seen as one guessed word of the master key. Of course, the classification given above does not take into account every possible key schedule that one can imagine. One can think of key schedules not fitting any of the above categories. However, in concrete constructions, we have not encountered such cases and we believe that the key schedules used in practice lie in one of the above classes.

2.3 Multiple Differentials in Impossible Differential Cryptanalysis

Multiple differential cryptanalysis [31] is a generalization of differential cryptanalysis in which several input and output differences are considered simultaneously. We show here that this technique can be successfully combined with impossible differential cryptanalysis to reduce the data complexity of an attack. To the best of our knowledge, the idea of combining these two techniques had never been considered before. Furthermore, as we demonstrate in the next section, this method can also be combined with multiple impossible differentials [10, 35] to further reduce the amount of data that an attacker requires.

The idea here is to consider several input differences \(\mathcal {D}_\mathrm{in}\) and several output differences \(\mathcal {D}_\mathrm{out}\), all of them corresponding to the same pair of differences \((\mathcal {D}_X, \mathcal {D}_Y)\) as depicted in Fig. 1. This method recalls the idea from [16], where multiple differentials were applied to rebound-type distinguishers.

Fig. 1 Multiple inputs and multiple outputs

Considering multiple differentials provides the attacker with more input/output differences \(\mathcal {D}_\mathrm{in}, \mathcal {D}_\mathrm{out}\), meaning that there are more choices for the input/output patterns of a pair. Indeed, in a chosen-plaintext attack, a plaintext pair will be kept if the truncated difference of the corresponding ciphertexts is among the multiple output differences \(\mathcal {D}_\mathrm{out}\), leading to more choices than in an attack with a single \(\mathcal {D}_\mathrm{out}\). As we realized during our experiments, the input case is a bit more complicated to deal with, but globally it can be seen in the same way. Therefore, less data are needed to construct the pairs for the attack; this is the reason why this method helps to reduce the overall data complexity.

We define here two new variables, \(m_\mathrm{in}\) and \(m_\mathrm{out}\), corresponding to the number of input/output multiple differentials taken into account for a single impossible differential. Following the same reasoning as in [10], where the data complexity of an attack using multiple impossible differentials was given as a function of the data complexity \(C_N\) of a standard attack with the same parameters, we deduce that the new data complexity \(C_{N'}\) is

$$\begin{aligned} C_{N'} = \frac{C_N}{m_\mathrm{in}m_\mathrm{out}}. \end{aligned}$$
(4)

We show in the sequel how to correctly deal with the situation where different sets of key bits are related to the different \(\mathcal {D}_\mathrm{in}, \mathcal {D}_\mathrm{out}\) differences. However, we note here that our analysis only considers multiple input and output differences of equal Hamming weight. Otherwise, the individual complexities might not be equivalent, and in that case, the differential with the highest complexity becomes the leading term, so that the final complexity is not improved.

2.4 Multiple Differentials with Multiple Impossible Differentials

The idea of multiple impossible differentials, first introduced by Tsunoo et al. [35] and later formalized in [10], is to simultaneously consider several impossible differentials (\(\mathcal {D}_X, \mathcal {D}_Y\)). This technique reduces the data complexity of the attack compared to a cryptanalysis that only exploits one impossible differential. This is due to the fact that using multiple impossible differentials implies fewer bit-conditions to be verified (as one has more choice), and the number of bit-conditions directly affects the number of pairs N and thus the amount of data, as can be seen in Eq. (2).

We introduce the idea of using multiple differentials and multiple impossible differentials together to further reduce the amount of data. If \(n_\mathrm{in}\) is the number of input differences \(\mathcal {D}_X\) and \(n_\mathrm{out}\) the number of output differences \(\mathcal {D}_Y\), then the reduced data complexity obtained by combining both techniques is

$$\begin{aligned} C_{N'}=\frac{C_N}{n_\mathrm{in}n_\mathrm{out}m_\mathrm{in}m_\mathrm{out}}. \end{aligned}$$
(5)

This formula is directly derived from Eq. (4) and from the formula for the data complexity given in [10] for multiple impossible differentials.
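In the \(\log _2\) domain, Eqs. (4) and (5) amount to subtracting the logarithm of the total number of multiples from the data complexity of the standard attack. A minimal sketch follows; the function name and the example values are hypothetical, and Eq. (4) is recovered by leaving \(n_\mathrm{in} = n_\mathrm{out} = 1\).

```python
import math

def log2_data_with_multiples(log2_cn: float, n_in: int = 1, n_out: int = 1,
                             m_in: int = 1, m_out: int = 1) -> float:
    """log2 of the reduced data complexity of Eq. (5), given log2(C_N)."""
    multiples = n_in * n_out * m_in * m_out
    if multiples < 1:
        raise ValueError("each number of multiples must be at least 1")
    return log2_cn - math.log2(multiples)

# Hypothetical attack with C_N = 2^100, two input and two output impossible
# differentials, and four input differences per impossible differential.
print(log2_data_with_multiples(100.0, n_in=2, n_out=2, m_in=4, m_out=1))
```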

In practice, whether the different differentials come from several impossible differentials (\(\mathcal {D}_X, \mathcal {D}_Y\)) or from several input and/or output differences (\(\mathcal {D}_\mathrm{in}, \mathcal {D}_\mathrm{out}\)) will not change the way the complexity of the attack is affected, and we treat both types equally. For simplicity, we use multiples to refer to both multiple impossible differentials and multiple differentials. Applications of the above are provided in Sect. 4.

2.5 Putting it All Together

Multiples and black box key schedules. When considering several multiples, there can be different patterns for the groups of key bits of size \(k_A\) or \(k_B\) involved in the attack. In some cases, these groups will be disjoint, but this will not always be so. For the sake of simplicity, we concentrate on the case of output multiples (the analysis of input multiples is similar). We let \(M=n_\mathrm{in}n_\mathrm{out}m_\mathrm{in}m_\mathrm{out}\) denote the total number of multiples. We let \(k_B^\mathrm{inv}\) denote the number of \(k_B\) bits that are involved in at least one of these differentials (i.e., the union of all the sets of \(k_B\) bits), and \(k_B^\mathrm{int}=M \times k_B-k_B^\mathrm{inv}\) the total number of redundant bits from \(k_B\) when we suppose that all the key bits are affected by all the differentials at the same time. The quantities \(k_A^\mathrm{inv}\) and \(k_A^\mathrm{int}\) are defined analogously for \(k_A\). In this case, the term to take the black box phenomenon into account is:

$$\begin{aligned} 2^{K-k_B^\mathrm{inv}}\cdot (P^{1/M} \cdot 2^{k_A + k_B})^M \cdot 2^{-k_B^\mathrm{int}}\cdot 2^{-k_A^\mathrm{int}}\cdot C_{KS} = 2^{K}\cdot P \cdot 2^{k_A^\mathrm{inv}}\cdot C_{KS}. \end{aligned}$$
(6)

This term, multiplied by \(2^{-k_A^\mathrm{inv}}\cdot \frac{1}{C_{KS}}\), gives the number of candidate keys to test, while the last term of the complexity stays \(2^{K}\cdot P\cdot C_\mathrm{E}\). We omit the \(\min \) here, since we can choose the roles of \(k_A\) and \(k_B\). Several applications of this situation are provided in Sect. 4.
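The simplification in Eq. (6) can be verified step by step. The sketch below uses \(k_B^\mathrm{int}=M \times k_B-k_B^\mathrm{inv}\) and assumes that \(k_A^\mathrm{int}\) satisfies the analogous relation \(k_A^\mathrm{int}=M \times k_A-k_A^\mathrm{inv}\):

$$\begin{aligned} 2^{K-k_B^\mathrm{inv}}\cdot (P^{1/M} \cdot 2^{k_A + k_B})^M \cdot 2^{-k_B^\mathrm{int}}\cdot 2^{-k_A^\mathrm{int}}&= 2^{K-k_B^\mathrm{inv}}\cdot P \cdot 2^{M k_A + M k_B} \cdot 2^{-(M k_B-k_B^\mathrm{inv})}\cdot 2^{-(M k_A-k_A^\mathrm{inv})}\\&= 2^{K}\cdot P \cdot 2^{k_A^\mathrm{inv}}. \end{aligned}$$

Multiplying both sides by \(C_{KS}\) gives Eq. (6).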

Given these formulas, the combination of both the state-test and the multiple (impossible) differentials is now straightforward. Combining everything, the new time complexity formula that we propose is

$$\begin{aligned} C_T = \left( C_{N} + \left( N+2^{k_A + k_B}\frac{N}{2^{c_\mathrm{in}+c_\mathrm{out}}}\right) C_\mathrm{E}' + 2^{K}\cdot P \cdot 2^{k_A^\mathrm{inv}}\cdot C'_{KS} +2^{K}\cdot P\right) C_\mathrm{E}, \end{aligned}$$
(7)

where \(C'_{KS}\) is the ratio of the cost of the key schedule compared to the full encryption. Our application against AES-128 gives an illustration of such a combination.
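Formula (7) can be evaluated numerically as follows. This is only a sketch: the function names, the default cost ratios and the parameter values in the usage lines are hypothetical, and the result is expressed in units of one encryption, i.e., it is \(\log _2 (C_T / C_\mathrm{E})\).

```python
import math

def log2_sum(*log2_terms: float) -> float:
    """log2 of a sum of terms given by their log2, avoiding overflow."""
    m = max(log2_terms)
    return m + math.log2(sum(2.0 ** (t - m) for t in log2_terms))

def log2_time_complexity(log2_cn, log2_n, k_a, k_b, c_in, c_out,
                         key_len, log2_p, k_a_inv,
                         ce_prime=1.0, cks_prime=1.0):
    """log2 of C_T / C_E from Formula (7)."""
    data = log2_cn                                           # C_N
    sieving = log2_sum(log2_n,                               # N
                       k_a + k_b + log2_n - (c_in + c_out))  # 2^(kA+kB) N / 2^(cin+cout)
    sieving += math.log2(ce_prime)                           # times C_E'
    key_schedule = key_len + log2_p + k_a_inv + math.log2(cks_prime)
    search = key_len + log2_p                                # 2^K P
    return log2_sum(data, sieving, key_schedule, search)

# Hypothetical parameters, not taken from any attack of the paper.
print(log2_time_complexity(log2_cn=100, log2_n=90, k_a=20, k_b=20,
                           c_in=15, c_out=15, key_len=128,
                           log2_p=-40, k_a_inv=12))
```

Keeping the four terms separate inside the function makes it easy to see which one dominates for a given choice of parameters.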

Multiples and state-test The aim of the state-test technique, introduced in [10], is to eliminate some candidate keys without having to consider all of the possibilities for the involved key bits. This can be done, for example, by considering the value x of an s-bit word of the internal state that is needed to verify whether a condition is satisfied in the second round. Typically, with a constant c from the diffusion layer and an invertible Sbox S, we would have \(x=x' + c S(P_i+K_i)+K_j,\) where \(x'\) is an already known value that we have computed from the plaintexts/ciphertexts and the already guessed key bits. The s-bit variable \(P_i\) corresponds to the fixed part of the state, i.e., it has the same value for all the considered pairs. The variables \(K_i\) and \(K_j\) correspond to the involved parts of the key that have been neither guessed nor determined yet, of size s each. We easily see that if, instead of guessing both variables \(K_i\) and \(K_j\), we directly guess the value \(x+x'\), then we can perform the rest of the attack in a similar way, with a complexity reduced by s bits, since s fewer bits have to be guessed. Each guess of \(x+x'\) will imply a disjoint set of possibilities for \(K_i\) and \(K_j,\) and considering all the values of \(x'+x\) will provide all possible combinations of \(K_i\) and \(K_j\). The attack is performed as before, where now we will determine the candidate values for \(x+x'\). Note again that this is only possible because the value of \(P_i\) is fixed.

This simplified version of the state-test technique, combined with a simplified vision of multiple (impossible) differentials, eases their combination. Consider a simple attack, i.e., one involving a single impossible differential, performed with \(N_s\) pairs. Let \(P_s\) be the proportion of candidate keys that we obtain, and let \(C_{N_s}\) be the data complexity of the corresponding attack. The number of remaining key candidates is \(2^{|k_\mathrm{in}\cup k_\mathrm{out}|}\cdot P_s \cdot 2^{K-|k_\mathrm{in}\cup k_\mathrm{out}|}=2^{K} \cdot P_s\).
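The partition argument above can be checked on a toy instance. The following is a minimal sketch, not the setting of any attack in this paper: it assumes \(s=4\), uses the PRESENT Sbox as S, and picks the hypothetical constant \(c=2\) in \(GF(2^4)\) (reduction polynomial \(x^4+x+1\)). For a fixed \(P_i\), it groups all pairs \((K_i,K_j)\) by the value \(x+x' = cS(P_i+K_i)+K_j\).

```python
from collections import defaultdict

# PRESENT 4-bit Sbox, used here only as a convenient invertible S (assumption).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def gf16_mul(a, b):
    """Multiply two elements of GF(2^4) modulo x^4 + x + 1."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return r


def state_test_classes(p_i, c=2):
    """Group all (K_i, K_j) by z = x + x' = c*S(P_i + K_i) + K_j for a fixed P_i."""
    classes = defaultdict(list)
    for k_i in range(16):
        for k_j in range(16):
            z = gf16_mul(c, SBOX[p_i ^ k_i]) ^ k_j
            classes[z].append((k_i, k_j))
    return classes


classes = state_test_classes(p_i=0x7)
# 2^s values of z, each covering a disjoint set of 2^s key pairs.
print(len(classes), {len(v) for v in classes.values()})
```

Guessing z therefore costs \(2^s\) guesses instead of \(2^{2s}\), and no combination \((K_i,K_j)\) is lost.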

Now, suppose that we repeat this attack T times in parallel for different sets of data, possibly involving different key bits. Assuming that the parameters of the repeated attacks are the same as for the first one, the number of candidate keys left (see Footnote 2) will be \((2^{|k_\mathrm{in}\cup k_\mathrm{out}|}\cdot P_s )^T \cdot 2^{-k_\mathrm{int}}\), where \(k_\mathrm{int}\) is the total number of redundant bits from \(\mathcal {K}\) when we consider all the key bits affected by all the multiple differentials together. The data complexity in this case is \(T\cdot C_{N_s},\) for a proportion of keys \({P_s}^T\), and the time complexity is about \(T\cdot C_{T_s}\). It is easy to see that when we use multiple differentials instead of a parallel repetition, we follow a similar procedure, but we can reuse the data. Therefore, the data complexity of this multiple attack will be smaller, while the time and memory complexities will a priori stay the same.

Combining the above representations of the state-test and multiple impossible differentials techniques, together with the new formula that correctly takes into account the key schedule when using multiple differentials, is now straightforward. The attack against AES-128 gives a detailed illustration of how these two methods can be combined.

3 Verification of the Improvement Techniques

In this section, we detail the implementation experiments that we performed in order to verify the improvement techniques introduced both in this paper and in [10]. To the best of our knowledge, this is the first time that the state-test technique and the multiple (impossible) differential techniques have been implemented and therefore validated. We emphasize the importance of implementation as the only means of corroborating the theoretical approaches.

Multiple differentials

The scope of the first implementation experiment is to get a clear idea of the accuracy of the equations given in Sect. 2, and in particular Formula (5), when working with multiple (impossible) differentials (see Footnote 3). To do so, we considered a toy cipher corresponding to a 6-round Feistel network using blocks of \(n=32\) bits and whose round function has an SP structure. This round function is therefore composed of 3 operations: a bitwise key addition, the parallel application of a 4-bit Sbox (same as for PRESENT [8]) and finally the application of an MDS linear transformation P (same as for LED [15]).

For the sake of simplicity, we considered independent round keys. We used a 4-round impossible differential, with an input difference of the form \(\mathcal {D}_X = (0,0,0,0 | a,0,0,0)\) (where a is a nonzero nibble) and an output difference \(\mathcal {D}_Y = ( x,y,z,0 | 0,0,0,0 )\), where x, y and z are nibbles free of conditions (see Footnote 4). We add one round before and after this differential and end up with the following parameters: \(\mathcal {D}_\mathrm{in} = (a,0,0,0 | P(b,0,0,0))\) with a and b nibbles free of conditions, leading to \(\varDelta _\mathrm{in} = 8\), \(c_\mathrm{in} = 4\), and \(\mathcal {D}_\mathrm{out} = (P(u,v,w,0)|(x,y,z,0))\) (all nibbles free of conditions), leading to \(\varDelta _\mathrm{out} = 24\), \(c_\mathrm{out} = 12\).

To start with, we consider a simple attack against the above toy cipher exploiting only one impossible differential and we suppose that our goal is to discard half of the candidate keys after the attack. According to Eq. (1), we need N pairs satisfying both \(\mathcal {D}_\mathrm{in}\) and \(\mathcal {D}_\mathrm{out}\), where N is such that

$$\begin{aligned} P=(1-2^{-(c_\mathrm{in}+c_\mathrm{out})})^N = (1-2^{-16})^N < \frac{1}{2}. \end{aligned}$$

This leads to \(N > 2^{15.47}\). We verified experimentally that the above formula is accurate by launching 10 tests on our toy cipher. Indeed, the experiment showed that we need on average \(2^{23.44}\) pairs satisfying \(\mathcal {D}_\mathrm{in}\) to eliminate half of the candidate keys. This means that we have \(2^{23.44-n+\varDelta _\mathrm{out}} = 2^{23.44-8} = 2^{15.44} \) pairs satisfying both \(\mathcal {D}_\mathrm{in}\) and \(\mathcal {D}_\mathrm{out}\). In the remainder of our implementation experiments, we are interested in the evolution of the number of necessary pairs N when more than one impossible differential is used.
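The bound on N can be reproduced numerically. The sketch below only evaluates the inequality stated above; the function name `log2_pairs_needed` is a hypothetical helper, not something taken from the experiments.

```python
import math


def log2_pairs_needed(c_in, c_out, log2_keep_prob):
    """log2 of the threshold N with (1 - 2^-(c_in+c_out))^N = 2^log2_keep_prob."""
    # -ln(1 - 2^-(c_in + c_out)), computed with log1p for numerical accuracy
    per_pair = -math.log1p(-(2.0 ** -(c_in + c_out)))
    n = -log2_keep_prob * math.log(2) / per_pair
    return math.log2(n)


# Toy cipher parameters from the text: c_in = 4, c_out = 12, target P = 1/2.
log2_n = log2_pairs_needed(4, 12, -1)
print(round(log2_n, 2))                 # pairs satisfying D_in and D_out
print(round(log2_n + (32 - 24), 2))     # same, counted as pairs satisfying D_in only
```

The second value uses the factor \(2^{n-\varDelta _\mathrm{out}} = 2^{8}\) from the text and can be compared with the measured \(2^{23.44}\).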

Furthermore, in the following experiments, we verify the accuracy of Formula (5) in the case of input multiples, in the case of output multiples, and in the case that the probability of keeping a key is not one half, but is equal to the inverse of the number of possible (involved) keys.

We first describe the simplest experiments, where the probability of not discarding a key is 1/2, i.e., where half of the possible keys are eliminated after the attack.

Using multiple outputs We are interested here in the evolution of the quantity of required pairs if we use multiple impossible differentials for the second half of the differential, i.e., if we exploit the 4 possible patterns (of same Hamming weight) for \(\mathcal {D}_Y\): \((x,y,z,0 | 0,0,0,0)\), \((x,0,y,z | 0,0,0,0)\), \((x,y,0,z | 0,0,0,0)\) and \((0,x,y,z | 0,0,0,0)\) (see Fig. 2).

In such a case, if we have N pairs satisfying both \(\mathcal {D}_\mathrm{in}\) and one of the 4 possible \(\mathcal {D}_\mathrm{out}\), the probability of not discarding a key is not modified, as the conditions remain unchanged for all 4 output possibilities: it is equal to \(P = (1-2^{-16})^N\). Hence, if we want to divide the number of possible keys by 2, the required number of pairs satisfying both \(\mathcal {D}_\mathrm{in}\) and (one of the) \(\mathcal {D}_\mathrm{out}\) is unchanged. On the other hand, since more output differences are valid, we need to encrypt fewer pairs satisfying \(\mathcal {D}_\mathrm{in}\) to find N pairs. Indeed, if \(n_\mathrm{out}\) output differences are considered, we need to encrypt only a fraction \(n_\mathrm{out}^{-1}\) of the pairs satisfying \(\mathcal {D}_\mathrm{in}\) (see Fig. 2).
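As a rough illustration of this prediction, the following sketch computes the expected number of pairs satisfying \(\mathcal {D}_\mathrm{in}\) for \(n_\mathrm{out} = 1, \dots , 4\). It assumes the theoretical value \(N = 2^{15.47}\) and the conversion factor \(2^{n-\varDelta _\mathrm{out}}\) used earlier; it yields theoretical values only, not measured ones.

```python
import math

LOG2_N = 15.47          # pairs satisfying D_in and D_out (theoretical value)
N_BITS = 32             # block size n of the toy cipher
DELTA_OUT = 24          # Delta_out of the toy attack


def log2_din_pairs(n_out):
    """log2 of the pairs satisfying D_in needed when n_out output patterns are used."""
    return LOG2_N + (N_BITS - DELTA_OUT) - math.log2(n_out)


for n_out in (1, 2, 3, 4):
    print(n_out, round(log2_din_pairs(n_out), 2))
```

Each doubling of \(n_\mathrm{out}\) lowers the exponent by one, which is the behaviour the experiments are meant to confirm.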

The results of our experiments are given in Table 2. One can remark that these results correspond to what was predicted by theory.

Fig. 2 Attack configuration with 2 \(\mathcal {D}_\mathrm{in}\) and 4 \(\mathcal {D}_\mathrm{out}\)

Table 2 Necessary amount of pairs N satisfying \(\mathcal {D}_\mathrm{in}\) in order to eliminate half of the candidate keys (average on 10 tests) and associated \(C_N\)

Using multiple inputs and combination with multiple outputs We consider here the case of several input differentials \(\mathcal {D}_\mathrm{in}\), and as we will see, the situation is now slightly different. The probability of eliminating a key remains unchanged, so we still require the same amount of pairs N satisfying (one of the) \(\mathcal {D}_\mathrm{in}\) and \(\mathcal {D}_\mathrm{out}\).

The experiments we did on our toy cipher agree with this theory. More precisely, we generated random pairs, alternately following the first and the second \(\mathcal {D}_\mathrm{in}\), and counted how many pairs are necessary to divide the set of possible keys by 2. We found that we need (average on 10 tests) \(2^{22.5}\) pairs in each \(\mathcal {D}_\mathrm{in}\) if we use a single \(\mathcal {D}_\mathrm{out}\). This quantity decreases to \(2^{21.5}\), \(2^{21.0}\) and \(2^{20.6}\), respectively, for 2, 3 and 4 \(\mathcal {D}_\mathrm{out}\).

The gain from the multiple inputs would come from the fact that we can create pairs following one of the \(\mathcal {D}_\mathrm{in}\) in a clever way, by carefully selecting the plaintexts we encrypt. For instance, if we use 2 different \(\mathcal {D}_\mathrm{in}\), a nice choice would be to choose a random plaintext p and to encrypt the \(2^{16}\) messages given by:

$$\begin{aligned} \{p\oplus (a,0,0,0 | P(b,0,0,0))\oplus (0,c,0,0 | P(0,d,0,0)), a, b, c, d \in GF(2^4)\} \end{aligned}$$

With such a set, we are able to make \(2^{15} \times 2^8 = 2^{23}\) pairs satisfying each of the input patterns, i.e., \(2^{24}\) pairs satisfying one of the \(\mathcal {D}_\mathrm{in}\), while this number of encryptions would have given only \(2^{23}\) pairs if only one \(\mathcal {D}_\mathrm{in}\) was exploited.

We can visualize such a structure as a two-dimensional array: Each line is made of \(2^8\) plaintexts that form a structure for the first \(\mathcal {D}_\mathrm{in}\), and each column makes a structure for the second \(\mathcal {D}_\mathrm{in}\) (see Fig. 3).
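The pair counts of this two-dimensional structure can be verified by direct counting. The sketch below is only a counting exercise under the description above (a \(2^8 \times 2^8\) array, each line a structure for the first pattern and each column a structure for the second); the variable names are illustrative.

```python
from math import comb, log2

FREE_BITS = 8                       # two free nibbles per input pattern
rows = cols = 2 ** FREE_BITS        # 2^8 lines and 2^8 columns
messages = rows * cols              # 2^16 chosen plaintexts in total

pairs_first = rows * comb(cols, 2)  # pairs inside a line follow the first D_in
pairs_second = cols * comb(rows, 2) # pairs inside a column follow the second D_in
pairs_both = pairs_first + pairs_second

# Same number of encryptions, but organised as 2^8 structures for a single D_in.
pairs_single = (messages // 2 ** FREE_BITS) * comb(2 ** FREE_BITS, 2)

print(round(log2(pairs_first), 2), round(log2(pairs_both), 2), round(log2(pairs_single), 2))
```

The exact counts are slightly below \(2^{23}\) and \(2^{24}\) because a structure of \(2^8\) plaintexts yields \(\binom{2^8}{2}\) pairs rather than \(2^{15}\).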

Fig. 3 Efficient structure to exploit two \(\mathcal {D}_\mathrm{in}\)

If we require a multiple of \(2^{24}\) pairs satisfying one of the \(\mathcal {D}_\mathrm{in}\), the data gain should be a factor of one half. However, if we require fewer pairs, building structures is less obvious and the real gain could be smaller. If we consider 2 possible \(\mathcal {D}_\mathrm{in}\), a solution would be to create a structure similar to the one in Fig. 3 with \(2^{\ell _1}\le 2^8 \) lines and \(2^{\ell _2} \le 2^8\) columns, which would allow us to build approximately \(2^{\ell _1 + 2 \ell _2 - 1}\) pairs satisfying the first \(\mathcal {D}_\mathrm{in}\) and \(2^{\ell _2 + 2 \ell _1 - 1}\) pairs satisfying the second one. The aim is then to build the needed number of pairs satisfying one of the \(\mathcal {D}_\mathrm{in}\) (\(2^{\ell _1 + 2 \ell _2 - 1}+2^{\ell _2 + 2 \ell _1 - 1}\)) while minimizing the number of encryptions (\(2^{\ell _1+\ell _2}\)). We have therefore verified that the given equations cannot always be met when multiples are considered in the input, and that the loss depends on the best way of building the structures. Some results show that, in the generic case with \(n_\mathrm{in}\) differentials, the best configuration for having the smallest loss is to take maximal values for the first \(\ell _i\) while not exceeding the needed N, and then complete the next one with the needed amount, while considering 1 for the others. This implies that using all the structures associated with a 1 will not improve the data complexity, as it will be useless, and the best possible improvement is achieved when considering as many different \(\mathcal {D}_\mathrm{in}\) as the various \(\ell _i\) different from 1 that we have.
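The trade-off between \(\ell _1\) and \(\ell _2\) can be explored with a small exhaustive search. This is a simplified sketch, not the procedure used for the experiments: it restricts \(\ell _1, \ell _2\) to integers, uses the approximate pair counts given above, and the helper name `best_structure` is hypothetical.

```python
from itertools import product


def best_structure(log2_needed, max_l=8):
    """Return integer (l1, l2) minimising 2^(l1+l2) encryptions while reaching
    at least 2^log2_needed pairs, or None if no structure is large enough."""
    best = None
    for l1, l2 in product(range(max_l + 1), repeat=2):
        # approximate pair counts for the first and second D_in (from the text)
        pairs = 2 ** (l1 + 2 * l2 - 1) + 2 ** (l2 + 2 * l1 - 1)
        if pairs >= 2 ** log2_needed and (best is None or l1 + l2 < sum(best)):
            best = (l1, l2)
    return best


for target in (16, 20, 24):
    print(target, best_structure(target))
```

When \(2^{24}\) pairs are required, only the full \(2^8 \times 2^8\) array suffices; for smaller targets the search shows how unbalanced the cheapest structure becomes.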

Choosing N in order to keep only the correct key The situation becomes quite different if we are interested in keeping only the correct key. In this case, for a simple attack with one \(\mathcal {D}_\mathrm{in}\) and one \(\mathcal {D}_\mathrm{out}\), Eq. (1) becomes

$$\begin{aligned} P=(1-2^{-(c_\mathrm{in}+c_\mathrm{out})})^N < \frac{1}{2^{16}}, \end{aligned}$$

since 4 nibbles of the key intervene in the attack. This theoretical formula gives that N has to be larger than \(2^{19.47}\). We launched 10 experiments and were able to confirm this quantity in the non-multiple case; the average obtained is \(2^{27.5}\) pairs satisfying \(\mathcal {D}_\mathrm{in}\). When considering several possible \(\mathcal {D}_\mathrm{out}\), the number of possible involved key bits is slightly increased, and this increase is enough to affect the data complexity. This was taken into account neither in [10] nor in our theoretical formulas and should be kept in mind. As can be seen in Table 3, this has a small but clear effect on the data needs, so that the gain is slightly smaller than predicted.
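For reference, the bound on N follows from the inequality above by taking logarithms and using \(-\ln (1-2^{-16}) \approx 2^{-16}\); the second relation converts the measured number of pairs satisfying \(\mathcal {D}_\mathrm{in}\) with the same factor \(2^{n-\varDelta _\mathrm{out}} = 2^{8}\) as before:

$$\begin{aligned} N > \frac{16 \ln 2}{-\ln (1-2^{-16})} \approx 16 \ln 2 \cdot 2^{16} \approx 11.09 \cdot 2^{16} \approx 2^{19.47}, \qquad 2^{27.5-n+\varDelta _\mathrm{out}} = 2^{27.5-8} = 2^{19.5}. \end{aligned}$$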

Table 3 Necessary amount of pairs following the unique \(\mathcal {D}_\mathrm{in}\) in order to keep only the right key (average on 10 tests) and associated \(C_N\)

For the sake of completeness, we have considered the case where the probability of not discarding a key is as small as necessary to only keep one key, and we consider at the same time 2 different \(\mathcal {D}_\mathrm{in}\) and up to 4 different \(\mathcal {D}_\mathrm{out}\). We see in Table 4 how the side effect of a higher number of involved key bits is a bit stronger here than in Table 3, as with two \(\mathcal {D}_\mathrm{in}\), this number is increased. We do not see here the effect of the non-optimal structures constructed with the \(\mathcal {D}_\mathrm{in}\), as in this particular case, the amount needed is big enough to optimally exploit such structures. Therefore, the obtained experimental values for \(C_N\) nearly match the theoretical ones.

Table 4 Necessary amount of pairs satisfying one of the two \(\mathcal {D}_\mathrm{in}\) in order to keep only the right key (average on 10 tests) and associated \(C_N\)

State-test technique We describe here the experiments we performed to validate the state-test technique. For this purpose, we used a slightly modified version of CLEFIA [34] in which we reduce the word size from 8 bits to 4 bits, adapting the internal functions to fit this new size. As depicted in Fig. 4, we attack 6 rounds of such a CLEFIA with a 4-round impossible differential and the same 2-round input differential used in the attacks of [10].

For practical reasons, we suppose that the subkey \(RK_1\) has already been guessed. We then applied the state-test technique on the value of the nibble denoted by x in Fig. 4 to recover one nibble of the subkey \(RK_0\) and one nibble of \(RK_2\). We performed this experiment for \(2^3\) randomly chosen keys, using different amounts of data, as described in Table 5. This experiment consists in counting the average number of Sbox evaluations as well as the number of times we abort to try a candidate key, i.e., the number of false positives, until we recover the right key. For comparison, we provide the corresponding quantities in the case of a traditional cryptanalysis—that is, without applying the state-test technique. We have therefore been able to verify that the state-test technique considerably improves the time complexity of the attacks, as predicted.

Fig. 4 Reduced version of CLEFIA used to verify the correctness of the state-test technique. The number 2 on the rightmost word of the ciphertext means that at least two nibbles have a nonzero difference

Table 5 Comparison of the average number of partial encryptions and the number of candidate keys with and without the state-test technique

4 Applications

We start by providing a brief overview of the importance and impact of each of our applications and a comparison with previous attacks. Table 6 provides the techniques and improvements that are applied in each case, the parameters used as well as the attack complexities. The formulas and techniques that we provide allow for a straightforward application on several ciphers. Thanks to our complexity estimates, we manage to improve on many of the previous attacks. As we have seen in Sect. 3 and as can be seen in the attack against AES-128, these estimations accurately meet the attack complexities.

Table 6 Summary of the details concerning the new applications

AES-128 We provide a complete description of one of our attacks that entirely matches the complexity estimations. We chose to detail this attack because it involves the application of the state-test technique, the use of multiples and the consideration of the black box term. Another application to AES using multiples and taking the black box term into account gives a data complexity of \(2^{105}\) CP, a time complexity of \(2^{106.88} C_\mathrm{E}\) and a memory complexity of \(2^{74}\) words. This new attack provides a previously unknown trade-off, and the complexities are comparable to those of the best attacks on 7-round AES. Two attacks on AES often cited as the best known are given in Table 1. If we compare their results with our second attack, even though our time complexity is slightly higher, the data complexity is the same and our memory complexity is much better. Given the importance of AES, these new trade-offs, comparable to the best attacks, are interesting.

CRYPTON-128 We consider several multiples and use \(n_\mathrm{in}=4,\) \(n_\mathrm{out}=4\) and \(m_\mathrm{out}=6\). As pointed out previously, the multiple impossible differentials that correspond to the same key bits in the extended rounds can be seen as a reduction in the number of bit-conditions, which provides a lower memory complexity while leaving the other complexity parameters exactly the same.

ARIA-128 In this case, the proposed attacks are far from being the best attacks overall, but they are the best impossible differential attacks against this cipher. This application is nevertheless a very illustrative example because of the many multiples that can be considered, and it provides the perfect scenario for comparing the use of multiples with the use of a single impossible differential. We discuss the advantages, disadvantages, and how far we can go with this type of attack.

Camellia-256 without FL/FL \(^{-1}\) layers and whitening keys We improve on the previous attack from [10] (which covers the highest number of rounds, starting from the first one), by efficiently combining the state-test technique with multiple impossible differentials. Consequently, we are able to consider more fixed bits for the state-test technique. In addition, we take into account the black box term corresponding to the complex key schedule. This was not done in the previous best cryptanalysis, and therefore, we provide the corrected complexity of the full key-recovery attack. The most important parameters of this application are given in Table 1.

CLEFIA-128 We applied the state-test technique, the use of multiples and the correct way of choosing \(\varDelta _\mathrm{out}\) and \(c_\mathrm{out}\) to CLEFIA-128 in a similar way to our attack on Camellia-256. We check carefully what happens with the key bits, and we can apply the state-test technique when fixing 16 input bits. We also take into account the black box term corresponding to the key schedule. Finally, we obtain the improved and corrected complexities given in Table 1.

LBlock In this case, we consider the same starting parameters as in [10]. We can improve the 23-round attack on LBlock by applying the state-test technique with 8 bits fixed on the plaintexts, which could not be done without combining it with multiple impossible differentials. While the techniques we used here were already presented in [10], the attack proposed there on LBlock was much worse because the techniques were not applied in combination. The parameters used are given in Table 1.

4.1 AES-128

We give here a detailed description of our attack against AES-128. We do not recall here the specifications of the AES algorithm but refer to the design paper [14]. To ease the description of the attack, we number the bytes of the \(4\times 4\) AES state from 0 to 15, where byte 0 is the byte on the top left corner, byte 1 is the one in the second row from the top and in the leftmost column, and so on.

Previous attacks During the last 15 years, the security of AES-128 has been extensively analyzed. Among the many different types of attacks considered, impossible differential attacks have long led to the best cryptanalysis. Today, the most successful cryptanalysis of AES-128 is a meet-in-the-middle attack [13] reaching 7 out of 10 AES rounds.

In this work, we use improved impossible differential attacks to considerably improve not just the previous impossible differential attacks, but also the memory complexity of the best known attacks against 7-round AES-128 [13], maintaining a similar time complexity. We then provide comparable attacks with new, previously unknown trade-offs. To determine the impossible differential providing the best complexity trade-off for the attack, we carried out an exhaustive search for finding 4-round impossible differentials. For all of them, the application of the MixColumns of the last round was omitted. For this search, we considered two different types of impossible differentials, covering what we believe to be all impossible differentials on 4 rounds.

4.1.1 Search of 4-Round Impossible Differentials of AES

We provide here some details of our automated search of 4-round impossible differentials for AES. To perform our search, we considered two types of 4-round impossible differentials. The first type includes differentials where we computed one round in the forward direction and three rounds in the backward direction and applied the miss-in-the-middle technique after the first round. An impossible differential of this type was used in the attack of Mala et al. [29]. The second type includes differentials with two rounds in the forward and two rounds in the backward direction, with the miss-in-the-middle technique applied after two rounds. An impossible differential of this kind is for example used in the attacks of Bahrak et al. [2], Lu et al. [27] and Zhang et al. [39].

We then divided the impossible differentials found into equivalence classes, where each class contained those differentials where both \(\mathcal {D}_X\) and \(\mathcal {D}_Y\) had the same active columns and where each of the four columns had the same Hamming weight. The reason for this is that the first operation taking place when expanding the impossible differential backward and forward is MixColumns (or its inverse). As a consequence, the exact position of the active bytes in one column does not alter the attack; only the Hamming weight of each column matters. After this, by taking one representative differential of each class, we checked with an automated program which impossible differential led to an attack with the lowest possible complexities. For this, we generated all possible differentials from \(\mathcal {D}_X\) and \(\mathcal {D}_Y\), by taking into account all possibilities after each MixColumns operation or its inverse. For each possible attack, we computed the data, memory and time complexities in order to choose the impossible differential that offered the best trade-off among these three quantities.
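The classification step can be pictured as follows. This is a minimal sketch of the grouping criterion only, not of the automated search itself; it assumes differentials are given as sets of active byte indices in the column-major numbering of Sect. 4.1 (byte i lies in column i // 4), and all function names are hypothetical.

```python
from collections import defaultdict


def column_weights(active_bytes):
    """Hamming weight of each of the 4 state columns for a set of active bytes."""
    weights = [0, 0, 0, 0]
    for b in active_bytes:
        weights[b // 4] += 1
    return tuple(weights)


def class_key(dx_active, dy_active):
    """Two differentials are equivalent iff both column-weight tuples agree."""
    return column_weights(dx_active), column_weights(dy_active)


def group_into_classes(differentials):
    """Map each class key to the list of (D_X, D_Y) patterns it contains."""
    classes = defaultdict(list)
    for dx, dy in differentials:
        classes[class_key(dx, dy)].append((dx, dy))
    return classes


# Example: two D_X patterns with 3 active bytes in the first and third columns.
example = [
    ({1, 2, 3, 8, 9, 11}, {0}),   # inactive bytes 0 and 10
    ({0, 2, 3, 8, 9, 10}, {0}),   # inactive bytes 1 and 11
    ({1, 2, 3, 8, 9, 11}, {4}),   # same D_X, but D_Y active in another column
]
classes = group_into_classes(example)
print(len(classes))  # the first two patterns share a class, the third does not
```

Only one representative per class then needs to be extended and costed.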

The conclusion of this automated search is that the impossible differential providing the best complexity trade-offs for attacking 7 rounds of AES is the one pointed out by Mala et al. in [29]. This impossible differential is such that there are at least three active bytes in the first and the third column of \(\mathcal {D}_X\), while the other two columns stay inactive, and such that there is exactly one active byte in \(\mathcal {D}_Y\). On the one hand, this impossible differential makes it possible to take into account in the best way the key schedule of AES-128, rendering the number of key bits that have to be guessed quite reasonable, while on the other hand it makes it possible to minimize the data complexity (Fig. 5).

Fig. 5 A 4-round impossible differential of AES. A square with a dot symbolizes an active byte, while an empty square stands for inactive bytes. The number 3 on the last three columns after the application of the first MixColumns says that at least 3 of the 4 bytes of each column will be active after the application of this operation

However, we would like to point out that the above impossible differential is not the one leading to the lowest number of key bits that one has to guess. The impossible differential whose \(\mathcal {D}_X\) has at least three active bytes in the leftmost column and \(\mathcal {D}_Y\) is formed by exactly one active byte can be extended in a way where only 104 bits have to be guessed during the attack, while at least 112 bits are needed with the impossible differential of [29]. However, the induced attack leads to worse data, time and memory complexities than the attack we propose using the differential of [29]. This remark disproves the claim stated in [33] saying that the time complexity of an impossible differential attack only depends on the number of key bits that need to be guessed (Fig. 6).

Fig. 6 A 4-round impossible differential of AES that requires as low as 104 key bits to be guessed during an attack against 7-round AES-128

Parameters of the basic attack Since our attack is based on the best previous impossible differential attack of Mala et al. [29], we recall here some of its characteristics.

Their attack is based on several impossible differential paths on \(r_{\mathcal {D}} = 4\) rounds. These differ in the pattern of \(\mathcal {D}_{X}\), which can take 4 different forms, each having 3 active bytes in columns 1 and 3, but in different positions. One of these patterns is represented in Fig. 7. Its inactive bytes of columns 1 and 3 are in the positions 0 and 10, but other possibilities are 1 and 11, 2 and 8, and finally 3 and 9. The differentials used in [29] are represented in black in Fig. 7. According to our notations, the parameters of the attack described in [29] are: \(\varDelta _\mathrm{in}=64, \varDelta _\mathrm{out}=32, c_\mathrm{in}=46, c_\mathrm{out}=22, k_\mathrm{in}=80, k_\mathrm{out}=32.\)

Now, we detail how the application of our techniques leads to a reduced time and memory complexity compared to the Mala et al. attack. First, notice that to conduct this attack, we need to guess four bytes in Round 7 (\(K_7^{0}\), \(K_7^{7}\), \(K_7^{10}\) and \(K_7^{13}\)), as well as the following 12 bytes of the first two rounds: \(K_0^0\), \(K_0^2\), \(K_0^5\), \(K_0^7\), \(K_0^8\), \(K_0^{10}\), \(K_0^{13}\), \(K_0^{15}\), \(K_1^0\), \(K_1^2\), \(K_1^8\) and \(K_1^{10}\). However, a study of the AES-128 key schedule reveals that the value of \(K_1^0\) (resp. \(K_1^2\)) can be directly computed from the values of \(K_0^0\) and \(K_0^{13}\) (resp. of \(K_0^2\) and \(K_0^{15}\)), which explains why \(k_\mathrm{in} \) is only 80 and not 96.
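The two key schedule relations can be made explicit. In the sketch below, the Sbox is generated from its algebraic definition rather than hard-coded, the relations are those of the standard AES-128 key expansion (\(K_1^0 = K_0^0 + S(K_0^{13}) + \mathrm{01}\) and \(K_1^2 = K_0^2 + S(K_0^{15})\)), and the helper names are hypothetical. The check uses the key expansion example of the AES standard.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def aes_sbox(a):
    """AES Sbox: field inversion (a^254) followed by the affine transformation."""
    inv = 1
    for _ in range(254):
        inv = gf_mul(inv, a)
    out = inv
    for i in range(1, 5):
        out ^= ((inv << i) | (inv >> (8 - i))) & 0xFF
    return out ^ 0x63


def derive_k1_bytes(k0):
    """Return (K_1^0, K_1^2) from the 16 bytes of K_0 (column-major numbering)."""
    k1_0 = k0[0] ^ aes_sbox(k0[13]) ^ 0x01   # 0x01 is the first round constant
    k1_2 = k0[2] ^ aes_sbox(k0[15])
    return k1_0, k1_2


k0 = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
print([hex(v) for v in derive_k1_bytes(k0)])
```

Since both bytes are functions of already guessed bytes of \(K_0\), they add nothing to \(k_\mathrm{in}\).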

When applying the generic formulas (3) and (2), with the above parameters, we obtain \(N=2^{68+\varepsilon }\) and \(C_N=2^{101+\varepsilon },\) where \(\varepsilon \) is a crucial variable that appears in particular in the expression of the probability P of keeping an incorrect key as a candidate: \(P\approx 2^{-1.442\cdot 2^\varepsilon }\). Note that different values of \(\varepsilon \) lead to different time/data/memory trade-offs. Since the key schedule of 7-round AES-128 is nonlinear and has a relatively good diffusion after several rounds, we treat it as a black box and then add the potentially expensive term (6) discussed in Sect. 2.2 to the final time complexity of the attack.
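The expression of P can be recovered from the generic probability of keeping a wrong key, using \(c_\mathrm{in}+c_\mathrm{out} = 46+22 = 68\) and the approximation \((1-2^{-68})^{2^{68}} \approx e^{-1}\):

$$\begin{aligned} P = (1-2^{-(c_\mathrm{in}+c_\mathrm{out})})^N = (1-2^{-68})^{2^{68+\varepsilon }} \approx e^{-2^{\varepsilon }} = 2^{-2^{\varepsilon }/\ln 2} \approx 2^{-1.442\cdot 2^{\varepsilon }}. \end{aligned}$$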

Fig. 7 Impossible differential cryptanalysis of 7-round AES-128. The two circled bytes of subkey \(K_1\) come for free by exploiting the key schedule relations. The four colors used for Rounds 6 and 7 correspond to the four output multiple differentials considered

4.1.2 Combining the State-Test Technique with Multiple Differentials

The first change we introduce to the attack of [29] is to consider several 4-round impossible differentials that differ in the pattern of \(\mathcal {D}_{Y}\), resulting in four output differences in Round 7, each corresponding to a different anti-diagonal. The involved bits of \(K_7\) form a partition of \(K_7\), as depicted in Fig. 7. Our second improvement is the use of the state-test technique. In order to apply the state-test technique, we have to slightly modify the differential of the first rounds used in [29] in order to render one of the previously active bytes of \(\mathcal {D}_\mathrm{in}\) inactive (namely byte 7). We provide a detailed explanation of this in the description of Step 3 of our attack.

To enhance the complexities of our attack, we use the early abort technique together with two precomputed tables:

Table \(T_1\) This table contains all the possible values for the differences lying in the main diagonal of the state after the first SubBytes operation. To compute these values, we start from the \(2^{16}\) possible differences of the first column of the state after the MixColumns layer in Round 1 and invert the two linear operations MixColumns and ShiftRows.

Table \(T_2\) Following the same reasoning, we compute the possible values for the differences in bytes 2, 8 and 13 of the internal state after the first SubBytes operation. Contrary to the previous case, we have an additional condition. Indeed, byte 11 of the state outputting the ShiftRows layer is inactive, meaning that only \(2^{8}\) differences are possible.

The precomputed tables require a total memory space of \(2^{16} + 2^{8}\) words, which is negligible in comparison with the memory used to store the N pairs.

We now describe the online part of our attack.

Step 1. This step consists in guessing the 32 key bits corresponding to the first diagonal of \(K_0\) (i.e., \(K_0^0, K_0^5, K_0^{10},K_0^{15}\)). Starting from \(C_N=2^{107+\varepsilon }\) plaintexts, we extract the \(2^{68+\varepsilon }\) pairs that meet the input difference \(\mathcal {D}_\mathrm{in}\) and one of the four possible output differences \(\mathcal {D}_\mathrm{out}\) and store them in a list \(L_1\). We sort this list according to the value of the plaintext difference in the 32-bit diagonal (bytes 0, 5, 10 and 15), thus creating \(2^{32}\) sublists of \(2^{36+ \varepsilon }\) pairs. We then make a first guess on the 32 bits of the first diagonal of \(K_0\) and apply the following process to each sublist. First, we compare the fixed diagonal plaintext difference with the \(2^{16}\) possible output differences of Table \(T_1\), and we use the difference distribution table (DDT) of the Sbox to check if the transitions are possible. If they are, we derive the possible values entering the Sboxes. Due to a well-known property of invertible Sboxes, there is one value derived on average for each transition, so we expect \(2^{16}\) values for each sublist. We then combine these values with the previous 32-bit guess on \(K_0\) to deduce the corresponding value of the plaintext diagonal. After that, we look into \(L_1\) and remove the sublists corresponding to plaintexts that have a diagonal value different from the ones compiled. Out of the \(2^{32}\) possible diagonal values, only \(2^{16}\) are kept, so a proportion of \(2^{-16}\) pairs remains. Note that this sieve corresponds to the probability of having two non-active bytes at the leftmost column after the application of MixColumns in the first round (Fig. 8).
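The DDT property invoked in Step 1 can be illustrated on a small Sbox. The sketch below uses the 4-bit PRESENT Sbox as a stand-in (an assumption made for brevity; the attack itself uses the 8-bit AES Sbox) and shows that, for a fixed nonzero input difference, the number of input values averaged over all output differences is exactly one.

```python
# PRESENT 4-bit Sbox, used here only as a small invertible example (assumption).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def ddt(sbox):
    """table[d_in][d_out] = number of x with S(x) ^ S(x ^ d_in) == d_out."""
    n = len(sbox)
    table = [[0] * n for _ in range(n)]
    for d_in in range(n):
        for x in range(n):
            table[d_in][sbox[x] ^ sbox[x ^ d_in]] += 1
    return table


def inputs_for_transition(sbox, d_in, d_out):
    """Values entering the Sbox that realise the transition d_in -> d_out."""
    return [x for x in range(len(sbox)) if sbox[x] ^ sbox[x ^ d_in] == d_out]


table = ddt(SBOX)
for d_in in (0x1, 0x7, 0xF):
    average = sum(table[d_in]) / len(SBOX)
    print(hex(d_in), average)
```

Impossible transitions (zero entries) are discarded, and possible ones return two or more values, which balances out to one value per transition on average.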

Fig. 8 The four lists \(L_1\), \(L_2\), \(L_3'\) and \(L_4\) that will be created during the attack. The size of each list as well as the way they are sorted can be visualized

At this point, there are \(N_1=2^{68-16+\varepsilon }=2^{52+\varepsilon }\) remaining pairs that we store in a list named \(L_2\) sorted by the difference in the bytes 2, 8 and 13. Each one of the \(2^{24}\) differences indexes a sublist of \(2^{28+\varepsilon }\) pairs.

Step 2. This step is very similar to Step 1 and consists in guessing the bytes \(K_0^2, K_0^8, K_0^{13}\) of the subkey \(K_0\). We study each of the \(2^{24}\) sublists together with the \(2^8\) differences contained in table \(T_2\) to deduce possible values for the inputs of the active Sboxes. As explained before, there will be one such value on average. We then make a guess of the corresponding bytes of \(K_0\) (\(K_0^2, K_0^8, K_0^{13}\)) and deduce by XOR the possible values for the related bytes of the plaintext. The plaintexts of the sublist that are different from those \(2^8\) candidates are eliminated, resulting in a \(2^{-16}\) sieve. After this step, the number of remaining pairs is \(N_2=2^{52-16+\varepsilon }=2^{36+\varepsilon }\). Once again, this filter corresponds to the probability of having two non-active bytes at the third column after the application of the MixColumns operation of the first round. The complexity up to here is \(2^{32+24+36+\varepsilon }=2^{92+\varepsilon }\) table lookups. We can now compute for free the values of \(K_1^0\) and \(K_1^2\) from the guesses already made on \(K_0\).

Step 3. We then repeat the following procedure for each of the \(2^{36+\varepsilon }\) pairs left. Starting from the known values of bytes 0 and 2 outputting the MixColumns operation of Round 1 and of the two subkey bytes of \(K_1\), we compute the values of the two corresponding bytes after the SubBytes operation of Round 2. After the application of ShiftRows, those two bytes are not in the same column anymore, but are in places 0 and 10. The first column then contains two active bytes, including one which is unknown in position 2, so there are \(2^8\) possible values for the difference of this column. However, since we know that after passing through MixColumns the difference should follow a specific pattern with only three active bytes, the number of possibilities for the unknown byte is restricted. Indeed, if the position of the inactive byte is fixed in \(\mathcal {D}_{X}\), only one possibility remains. However, since here we consider four possible patterns, there are four possibilities, which we denote by \(\delta ^i_{10}\), \(i = 0, \dots , 3\), each one corresponding to a non-active position. The same reasoning holds for the difference in the third column of the state outputting the ShiftRows operation of Round 2, in which the known byte difference is in position 10 and the unknown one is byte 8. We denote by \(\delta ^i_{8}\), \(i = 0, \dots , 3\) the four possibilities for this last one. Since the pattern of the first column leads to a unique possibility for the third column, we have only four possible values for \(\delta ^i_{10}\) and \(\delta ^i_{8}\) given fixed differences in bytes 0 and 2 at the output of the SubBytes operation. Since we know the difference transitions of the active Sboxes in positions 8 and 10 of Round 2, we can refer to the DDT to obtain the values that permit these transitions. Once again, there is on average one value for each transition, which we denote, according to their positions, by \(x_8\) and \(x_{10}\) (see Fig. 7).
The new list obtained, named \(L_3'\), is of size \(4\cdot 2^{36+\varepsilon } = 2^{38+\varepsilon }\).
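As an illustration of the DDT lookup used in Step 3, the following Python sketch rebuilds the AES Sbox from its algebraic definition and lists, for a given input/output difference pair, the Sbox inputs compatible with the transition. The helper names (`gf_mul`, `aes_sbox`, `ddt_solutions`) are illustrative and not taken from the attack itself; the sketch only checks the "one value on average" statement for a single input difference.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def aes_sbox():
    """Build the AES Sbox: field inversion followed by the affine map."""
    def rotl8(x, s):
        return ((x << s) | (x >> (8 - s))) & 0xFF

    table = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):  # x^254 is the inverse of x in GF(2^8)
                inv = gf_mul(inv, x)
        table.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2)
                     ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return table


SBOX = aes_sbox()


def ddt_solutions(delta_in, delta_out):
    """Return all inputs x such that S(x) xor S(x xor delta_in) equals delta_out."""
    return [x for x in range(256)
            if (SBOX[x] ^ SBOX[x ^ delta_in]) == delta_out]


# For a fixed nonzero input difference, the 256 inputs are spread over the
# 256 possible output differences, i.e. one compatible value on average.
row_total = sum(len(ddt_solutions(0x01, d)) for d in range(256))
average = row_total / 256
```

Individual transitions are either impossible or admit several values; only the average over all output differences equals one.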

The next natural step for continuing the attack is to confront those values with the ones obtained from the plaintext. Indeed, the expression of byte 8 at the input of SubBytes of Round 2 is \(2S(P^8+K_0^8) + 3S(P^{13}+K_0^{13}) + S(P^2+K_0^2) + S(P^7+K_0^7) + K_1^{8},\) and the one of byte 10 is \(S(P^8+K_0^8) + S(P^{13}+K_0^{13}) + 2S(P^2+K_0^2) + 3S(P^7+K_0^7) + K_1^{10},\) where S is the AES Sbox and where the multiplication is performed in \(GF(2^8)\). So to compare it with the four values obtained previously, we would need additional key guesses of the values of \(K_0^7\), \(K_1^8\) and \(K_1^{10}\). However, instead of guessing those 3 bytes, we use the state-test technique, which allows us to decrease the time complexity by a factor of \(2^8\). The general idea behind this is to study together the pairs that lead to the same values of \(S(P^7+K_0^7) + K_1^{8}\) and of \(3S(P^7+K_0^7) + K_1^{10}\). To do so, we first compute the known values \(2S(P^8+K_0^8) + 3S(P^{13}+K_0^{13}) + S(P^2+K_0^2)\) and \(S(P^8+K_0^8) + S(P^{13}+K_0^{13}) + 2S(P^2+K_0^2) \) and then XOR them, respectively, with the 4 possible values of \(x_8\) and \(x_{10}\). These two quantities, which we denote by \(z_1\) and \(z_2\), are equal to \(z_1 = S(P^7+K_0^7) + K_1^{8}\) and \(z_2 = 3S(P^7+K_0^7) + K_1^{10}.\) We use the couple \((z_1, z_2)\) to sort the list \(L_3'\) into \(2^{16}\) sublists of \(2^{38-16+\varepsilon }\) elements. We continue the attack with the sublist \(L_4\) of size \(N_3 = 2^{22+\varepsilon }\) pairs. The complexity up to this point is \(2^{32+24+16+22+\varepsilon }=2^{94+\varepsilon }\) simple operations.
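The regrouping performed by the state-test technique can be sketched as follows. This is a minimal Python illustration under stated assumptions: `S` is a random stand-in permutation rather than the AES Sbox (the identity being checked is purely algebraic), and all function and variable names are hypothetical. It computes the known parts of bytes 8 and 10 and derives the pair \((z_1, z_2)\) used as the sorting key.

```python
import random


def xtime(a):
    """Multiply a byte by 2 in GF(2^8) modulo the AES polynomial."""
    a <<= 1
    return (a ^ 0x11B) & 0xFF if a & 0x100 else a


def mul3(a):
    """Multiply a byte by 3 in GF(2^8)."""
    return xtime(a) ^ a


rng = random.Random(1)
S = list(range(256))
rng.shuffle(S)  # stand-in permutation (assumption); the attack uses the AES Sbox


def known_parts(p2, p8, p13, k2, k8, k13):
    """Parts of bytes 8 and 10 that depend only on already guessed key bytes."""
    a, b, c = S[p8 ^ k8], S[p13 ^ k13], S[p2 ^ k2]
    return xtime(a) ^ mul3(b) ^ c, a ^ b ^ xtime(c)


def state_test_key(x8, x10, p2, p8, p13, k2, k8, k13):
    """Return (z1, z2), the sorting key obtained without guessing K0^7, K1^8, K1^10."""
    u8, u10 = known_parts(p2, p8, p13, k2, k8, k13)
    return x8 ^ u8, x10 ^ u10


# Consistency check on random values: build bytes 8 and 10 from the full
# expressions, then recover (z1, z2) from the known parts only.
p2, p7, p8, p13 = (rng.randrange(256) for _ in range(4))
k2, k7, k8, k13, k1_8, k1_10 = (rng.randrange(256) for _ in range(6))
t = S[p7 ^ k7]
u8, u10 = known_parts(p2, p8, p13, k2, k8, k13)
x8 = u8 ^ t ^ k1_8
x10 = u10 ^ mul3(t) ^ k1_10
z1, z2 = state_test_key(x8, x10, p2, p8, p13, k2, k8, k13)
```

Pairs sharing the same `(z1, z2)` can then be processed together, which is what replaces the guess of the three remaining key bytes.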

Step 4. In this final step, we study the last round of the differential attack. To do so, we divide \(L_4\) into 4 sublists of size \(2^{20+\varepsilon }\), each one corresponding to a fixed output pattern. This list is sorted first by \(\mathcal {D}_\mathrm{out}\) and then by value. We then perform the last guess of the attack on the 32 bits of \(K_7\) and check whether the impossible differential is satisfied. If none of the pairs satisfy it, then the partially guessed key bits are returned as a candidate value for the secret subkey bits. We keep four lists of independent possible values for each of the 32 output key bits, each of size \(2^{32}\cdot 2^{-1.442\cdot 2^{\varepsilon -2}}\). This quantity depends on \({(\varepsilon -2)}\) instead of \(\varepsilon \) since we have lists that are four times smaller. The time complexity up to the creation of these lists is \(2^{32+24+16+32+\varepsilon }=2^{104+\varepsilon }\) memory accesses.
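The exponents quoted in Steps 2 to 4 can be re-derived with a few lines of arithmetic. The Python sketch below simply transcribes the base-2 exponents as printed, leaving out the common additive \(\varepsilon \); the variable names are illustrative, and the starting value 52 is the one implied by \(N_2=2^{52-16+\varepsilon }\).

```python
# Base-2 exponents of the pair counts (epsilon is added to each of them).
n1 = 52               # pairs entering Step 2, as implied by N_2 = 2^(52-16+eps)
n2 = n1 - 16          # Step 2: the 2^-16 sieve
l3_prime = n2 + 2     # Step 3: four candidate patterns per pair
n3 = l3_prime - 16    # Step 3: split into 2^16 sublists indexed by (z1, z2)

# Base-2 exponents of the complexities after each step, as printed
# (epsilon is again left out).
cost_step2 = 32 + 24 + n2
cost_step3 = 32 + 24 + 16 + n3
cost_step4 = 32 + 24 + 16 + 32

# Each of the four Step 4 sublists holds a quarter of L_4.
sublist_step4 = n3 - 2
```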

An important aspect of the attack is to obtain lists that are small enough that the cost of merging them is not higher than the cost that we have paid so far. This issue arises because we have no way of knowing if the guessed input key bits and the guessed output key bits form a match unless we complete the remaining part of the state (see Sect. 2.2). The cost of merging these four lists is given directly by the equation in Sect. 2.5: \(2^{-1.442\cdot 2^{\varepsilon }}2^{4(72+32)}2^{-3\cdot 72}\cdot C_{KS}=2^{-1.442\cdot 2^{\varepsilon }+72+128}\cdot C_{KS},\) where \(C_{KS}\) is the cost of the application of the key schedule. The number of candidates that remain is the previous term multiplied by \(2^{-k_\mathrm{in}}\) and by \(1/C_{KS}\), that is, \(2^{128-1.442\cdot 2^{\varepsilon }}\).
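The exponents of the merging cost can be checked numerically. The following Python sketch is an illustration only; the function names are hypothetical, and the default parameters (\(k_\mathrm{in}=72\), 32 output key bits, four lists) are those of this attack. It also compares the factor \(2^{-1.442\cdot 2^{\varepsilon }}\) with \((1-2^{-c})^{2^{c+\varepsilon }}\), using the fact that \(1.442 \approx \log _2 e\).

```python
import math

LOG2_E = math.log2(math.e)  # about 1.442, the constant appearing in the exponents


def survival_log2(eps):
    """log2 of the approximate probability 2^(-1.442 * 2^eps) of keeping a wrong key."""
    return -LOG2_E * 2.0 ** eps


def exact_survival_log2(c, eps):
    """log2 of (1 - 2^-c)^(2^(c+eps)), computed via log1p to avoid rounding to 1."""
    return 2.0 ** (c + eps) * math.log1p(-2.0 ** -c) / math.log(2)


def merge_cost_log2(eps, k_in=72, k_out=32, lists=4):
    """Exponent of the merging cost, before multiplying by C_KS."""
    return survival_log2(eps) + lists * (k_in + k_out) - (lists - 1) * k_in


def remaining_candidates_log2(eps, k_in=72):
    """Exponent of the number of key candidates left after the merge."""
    return merge_cost_log2(eps, k_in=k_in) - k_in
```

With the default parameters, `merge_cost_log2(eps)` evaluates to \(200 - 1.442\cdot 2^{\varepsilon }\) and `remaining_candidates_log2(eps)` to \(128 - 1.442\cdot 2^{\varepsilon }\), up to the rounding of the constant.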

Computing \(C_\mathrm{E}'\). We estimate \(C_\mathrm{E}'\) by following the common practice of counting the number of Sbox applications computed in the bottleneck part of the attack (the penultimate step of the previous procedure) compared to the number of Sbox applications in the full cipher (as done for instance in [7]). Our computations give \(C_\mathrm{E}'=2^{-5.13}\), since we are comparing four Sbox applications to the total of \(16\cdot 7+28=140\) Sbox applications used in 7 rounds of AES. Finally, we deduce that \(C_{KS}/C_\mathrm{E}=2^{-3.6}\).

This attack has data complexity \(C_N=2^{107+\varepsilon }\) CP, time complexity

$$\begin{aligned} 2^{104+\varepsilon }\cdot 2^{-5}+2^{-1.442\cdot 2^{\varepsilon }+72+128}\cdot 2^{-3.6}+2^{128} \cdot 2^{-1.442\cdot 2^{\varepsilon }} C_\mathrm{E}, \end{aligned}$$

and memory complexity \(N=2^{68+\varepsilon }\) words. The best time complexity is obtained by taking \(\varepsilon = 6.1\), leading to a data complexity of \(2^{113.1}\) CP, a time of \(2^{105.1} + 2^{113.1}\) \(C_\mathrm{E}\) and a memory complexity of \(2^{74.1}\) words.
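To see which term dominates at \(\varepsilon = 6.1\), the three terms of the time expression above can be evaluated individually. The Python sketch below transcribes the exponents as printed (with the constant 1.442 and the factors \(2^{-5}\) and \(2^{-3.6}\)); the function names are illustrative.

```python
import math

LOG2_E = 1.442  # constant used in the text, approximately log2(e)


def aes_time_terms_log2(eps):
    """Exponents of the three terms of the time expression, in the order printed."""
    survive = -LOG2_E * 2.0 ** eps
    pairs_term = 104 + eps - 5
    merge_term = survive + 72 + 128 - 3.6
    search_term = 128 + survive
    return pairs_term, merge_term, search_term


def log2_sum(exponents):
    """log2 of a sum of powers of two, evaluated relative to the largest term."""
    top = max(exponents)
    return top + math.log2(sum(2.0 ** (e - top) for e in exponents))


terms = aes_time_terms_log2(6.1)
total_log2 = log2_sum(terms)
data_log2 = 107 + 6.1
memory_log2 = 68 + 6.1
```

At this value of \(\varepsilon \), the first term is the largest of the three, and the data complexity exponent `data_log2` exceeds all of them, which matches the two summands quoted for the time complexity.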

4.2 CRYPTON-128

This example aims at showing the application of multiple differentials, combined with multiple impossible differentials in impossible differential cryptanalysis.

CRYPTON is an involutive 128-bit block cipher designed by Lim [24] that was a candidate in the AES competition. This block cipher can be parametrized by a key of 128, 192 or 256 bits. The number of rounds is fixed to 12.

Similarly to AES, an internal state of CRYPTON can be seen as a \(4\times 4\)-byte matrix. Each round is composed of the following four operations:

  • \(\mathbf {\gamma }\): a nonlinear operation, that uses \(8\times 8\)-bit involutive Sboxes applied in parallel on the bytes of the internal state.

  • \(\mathbf {\pi }\): a linear byte-wise transformation, that applies a \(4\times 4\)-byte matrix, with branch number 4, to each column of the state.

  • \(\mathbf {\tau }\): a byte transposition of columns into rows with respect to the anti-diagonal of the internal state.

  • \(\mathbf {\sigma }\): a byte-wise key addition, identical to the AddRoundKey operation of AES.

It must be noted that the encryption process of CRYPTON starts by applying \(\sigma \) with the first subkey. Finally, after performing the 12 rounds, the output transformation \(\tau \circ \pi \circ \gamma \), i.e., an actual round without the key addition step \(\sigma \), is applied to the state.

4.2.1 Previous Cryptanalysis and Our Contributions

The best previous impossible differential attack against CRYPTON-128 was published by Mala et al. [30]. This 7-round attack has a data complexity of \(2^{121}\) CP, a time complexity of \(2^{116.2} C_\mathrm{E}\) and a memory complexity of \(2^{119}\) words.

4.2.2 Description of the Attack

In this section, we show how to improve all complexity parameters of this attack. To do so, we jointly use the techniques of multiple differentials and of multiple impossible differentials, for both the first and the last appended rounds. More precisely, we exploit input and output differentials of two types, namely differentials having different \(\mathcal {D}_X, \mathcal {D}_Y\), leading to different impossible differentials, as well as multiple differentials having different \(\mathcal {D}_\mathrm{in}, \mathcal {D}_\mathrm{out}\), as discussed in Sects. 2.3 and 2.4. In the same way as for AES, we performed an exhaustive analysis of all 4-round impossible differentials for CRYPTON. We analyzed in an automated way all such impossible differentials, up to equivalence classes, to find the one that leads to the optimal complexities for an attack. In this way, we were able to confirm that the type of impossible differentials used in [30] is the best choice. An impossible differential of this kind has a single active byte in \(\mathcal {D}_X\) and exactly two active bytes in a single column of \(\mathcal {D}_Y\) and can be visualized in Fig. 9.

This impossible differential covering Rounds 2–5 is then extended one round backward and two rounds forward, in exactly the same way as done in [30]. One such extension is visualized in Fig. 9. We skip here most of the details of the attack, as all basic parameters are identical to those of [30]. Instead, we provide details of the multiple (impossible) differentials used. Recall that we call multiples all the differentials that correspond either to multiple differentials or to multiple impossible differentials.

Fig. 9 Impossible differential attack against CRYPTON-128

Input multiples We use in total 4 input multiples, as shown in Fig. 10. These four differentials correspond to \(m_\mathrm{in} = 4\) different \(\mathcal {D}_\mathrm{in}\), described by a different single active column. Each \(\mathcal {D}_\mathrm{in}\) corresponds to one \(\mathcal {D}_X\), each one composed of a single active byte in the last column of the state. However, it would have been possible to take into account four times more \(\mathcal {D}_X\) than what we actually use, by considering further any other column with a single active byte. This is depicted in Fig. 9 by a \(\times 4\) symbol. Nevertheless, for the sake of simplicity and for being in line with previous analyses, the multiple impossible differentials whose conditions do not depend on any key will instead be taken into account by decreasing \(c_\mathrm{in}\). In this application, this \(\times 4\) parameter is counted in the probability of passing the application \(\pi \) of the first round, that we take to be \(p = 2^{-22}\) instead of \(p=2^{-24}\), as shown in Fig. 9. Note, however, that both approaches are equivalent.

Output multiples We consider here \(n_\mathrm{out} = 4\) differences \(\mathcal {D}_Y\). Each \(\mathcal {D}_Y\) corresponds to one column with the lower two bytes active. Each \(\mathcal {D}_Y\) then gives us \(\binom{4}{2} = 6\) possibilities after the application of \(\pi \) in Round 5 for choosing two active bytes within it, leading to \(m_\mathrm{out}=6\) differences \(\mathcal {D}_\mathrm{out}\). We finally obtain \(n_\mathrm{out} \times m_\mathrm{out} = 4\times 6=24\) output multiples in total. These differentials are visualized in Fig. 10. As explained for the case of input multiples, we could have alternatively considered \(4\times 6\) differences \(\mathcal {D}_Y\), by taking also into account the six possible positions for the two active bytes. However, by following the same approach as before, we integrate this \(\times 6\) factor by decreasing instead \(c_\mathrm{out}\) by \(\log _2{6}\).

The remaining parameters of the attack (see Fig. 9) are \(\varDelta _\mathrm{in}=32, \varDelta _\mathrm{out}=64, c_\mathrm{in}=24-\log _2 4=22, c_\mathrm{out}=14.38-\log _2 6 +48=59.8, k_{A}=32, k_{B}=80. \)
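The counting of output multiples and the adjusted values of \(c_\mathrm{in}\) and \(c_\mathrm{out}\) can be re-derived with a short Python sketch. The variable names are illustrative; the constants 24, 14.38 and 48 are taken as given from the parameters above.

```python
import math

n_out = 4                    # differences D_Y: one per column
m_out = math.comb(4, 2)      # positions of the two active bytes after pi
output_multiples = n_out * m_out

c_in = 24 - math.log2(4)                 # the x4 input factor absorbed into c_in
c_out = 14.38 - math.log2(m_out) + 48    # the x6 output factor absorbed into c_out

# The number of pairs is N = 2^(c_in + c_out + eps); this is the exponent
# without the epsilon term.
pairs_log2_offset = c_in + c_out
```

Up to rounding, `c_out` and `pairs_log2_offset` agree with the values 59.8 and 81.8 used in the text.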

Therefore, by using formula (5), the data complexity is \(C_N = 2^{108.22 + \varepsilon }\) CP. The memory complexity, given by the number of pairs N, is \(2^{c_\mathrm{in} + c_\mathrm{out} + \varepsilon } = 2^{81.8 + \varepsilon }\). Finally, since the cost of the key schedule, computed similarly to AES, is \(C_{KS} = 2^{-3.6}\), \(C_\mathrm{E}'\) is \(2^{-5}\) and in this application \(k_A^\mathrm{inv} = 128\), the time complexity given by formula (7) is \(2^{112+\varepsilon }2^{-5}+2^{256-1.442\cdot 2^{\varepsilon }}2^{-3.6}+ 2^{128-1.442\cdot 2^{\varepsilon }}\) \(C_\mathrm{E}\). By taking \(\varepsilon = 6.7\), we thus obtain a data complexity of \(2^{114.92}\) CP, a time complexity of \(2^{113.7}\) \(C_\mathrm{E}\) and a memory complexity of \(2^{88.5}\) 128-bit words.
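The choice of \(\varepsilon \) can be explored numerically. The Python sketch below evaluates the three-term time expression printed above (and only those three terms) on a grid of values of \(\varepsilon \) and keeps the minimizer; the grid step of 0.1 and its bounds are arbitrary assumptions, and the function name is illustrative.

```python
import math


def crypton_time_log2(eps):
    """log2 of the three-term time expression above, in units of C_E."""
    survive = -1.442 * 2.0 ** eps
    exponents = (112 + eps - 5, 256 + survive - 3.6, 128 + survive)
    top = max(exponents)
    # Sum the terms relative to the largest one to keep the floats in range.
    return top + math.log2(sum(2.0 ** (e - top) for e in exponents))


# Scan epsilon on a 0.1 grid between 4.0 and 9.0 (arbitrary bounds).
grid = [i / 10 for i in range(40, 91)]
best_eps = min(grid, key=crypton_time_log2)
best_time_log2 = crypton_time_log2(best_eps)

data_log2 = 108.22 + best_eps
memory_log2 = 81.8 + best_eps
```

For smaller \(\varepsilon \) the second term grows very quickly, while for larger \(\varepsilon \) the first term increases linearly, so the minimum sits close to the point where the two terms cross.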

We can see by the above description that we considerably improve all complexity parameters of the previous best impossible differential attack against CRYPTON-128 (Fig. 11).

Fig. 10 Multiples for the attack against CRYPTON-128. The \(\times 4\) factor symbolizes that 4 more differences \(\mathcal {D}_X\) can be taken into account for each depicted state, by activating another byte in the same row as the one shown. Equally, we can consider six times more differentials than the ones shown, by choosing different positions for the two active bytes in each column

Fig. 11 A 4-round impossible differential of CRYPTON. A square with a dot symbolizes an active byte, while an empty square stands for inactive bytes. A crossed square after the application of \(\pi \) indicates that at least 3 of the 4 bytes of each column will be active after the application of this operation

5 ARIA-128

ARIA [19] is a 128-bit block cipher designed in 2003 by Kwon et al. and established as a Korean Standard in 2004. ARIA-128 has 12 rounds, and each round is composed of 3 operations. The first operation is the Key Addition (ARK) that simply XORs the 128-bit round key to the state. The second operation is the Substitution Layer (SL) that consists in the parallel application of 4 different Sboxes on every byte of the state. Finally, the Diffusion Layer (DL) is defined by a \(16\times 16\) involutory binary matrix ensuring a branch number of 8 and is omitted in the last round. We refer to the design paper [19] for more details.

5.1 Improved 6-Round Impossible Differential Attack Using Multiples

In this section, we improve on the best impossible differential attack on ARIA-128 [22] that covers up to 6 rounds. This attack is illustrated in Fig. 12. We achieve this improvement by using multiple impossible differentials. The main goal of this application is to compare a simple attack (with only one impossible differential) with a similar attack exploiting multiple differentials instead. We show in particular that in this last case, the value of the variable \(\varepsilon \) has to be higher, but the data complexity is lower. We consider for our attacks a configuration similar to the one used in [22], i.e., with parameters \(\varDelta _\mathrm{in}=48, \varDelta _\mathrm{out}=32, c_\mathrm{in}=40, c_\mathrm{out}=24, k_A = k_\mathrm{in}=48, k_B= k_\mathrm{out}=32\), as can be seen in Fig. 12. We now provide the complexities in the simple and the multiple case.

Fig. 12 Example of a 6-round attack on ARIA

Simple Case By directly applying the above attack parameters in the formulas of Sect. 2, we get a memory complexity of \(N = 2^{40+24 + \varepsilon _s} = 2^{ 64 + \varepsilon _s}\) words. The data complexity given by Eq. (5) is \(C_{N}=2^{129+64+\varepsilon _s-48-32}=2^{113+\varepsilon _s}\) CP. The time complexity can be computed by directly applying Eq. (7): \(C_{T}=2^{80+\varepsilon _s}2^{-5}+2^{128+32}2^{-1.442\cdot 2^{\varepsilon _s}}2^{-1.58}+2^{128}2^{-1.442\cdot 2^{\varepsilon _s}} C_\mathrm{E}.\) By choosing \(\varepsilon _s = 5.9\), the data complexity is \(2^{118.9}\) CP, the time complexity is \(2^{80.9} C_\mathrm{E}\), and the memory complexity is \(2^{69.9}\) 128-bit words.

Multiple Case In order to find the multiple differentials with the same associated parameters as in the simple case described above, we performed an exhaustive search. So we determined how many impossible differentials from two equal active bytes to four equal active bytes exist. We found that there are more than \(2^9\) such impossible differentials; thus, we can consider \(M=n_\mathrm{in}\cdot n_\mathrm{out}=2^9\). As we show in Sect. 2.3, the multiple attacks can be seen, except for the data complexity, as applications in parallel of simple attacks, where the associated \(\varepsilon _s\) are related to the final \(\varepsilon \) determining the probability of keeping a key as candidate by the following relation: \(\varepsilon -\varepsilon _s=\log _2 (M)\), which equals 9 in our case. So we have \(N = 2^{ 64 + \varepsilon }\) (which is also the memory complexity), and consequently, the data complexity is then: \(C_{N}=2^{129+64+\varepsilon -48-32-9}=2^{104+\varepsilon }\) CP. The time complexity can be computed by directly applying the formula for the time complexity and the modification from Sect. 2.5: \(C_T=2^{80+\varepsilon }2^{-5}+2^{256}2^{-1.442\cdot 2^{\varepsilon }}2^{-1.58}+2^{128}2^{-1.442\cdot 2^{\varepsilon }}C_\mathrm{E}.\) By taking \(\varepsilon = 7\), we get a data complexity of \(2^{111}\) CP, a time complexity of \(2^{82} C_\mathrm{E}\) and a memory complexity of \(2^{71}\) 128-bit words.
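The figures of the simple and the multiple case can be reproduced side by side. The Python sketch below transcribes the exponents of the data complexity, of the first term of the time complexity and of the memory complexity as they are written above; the function name and its `num_differentials` argument (standing for \(M\)) are illustrative.

```python
import math


def aria_complexities_log2(eps, num_differentials=1):
    """Exponents (data, first time term, memory) for the 6-round ARIA setting."""
    c_in, c_out = 40, 24
    delta_in, delta_out = 48, 32
    k_a, k_b = 48, 32
    data = (129 + c_in + c_out + eps - delta_in - delta_out
            - math.log2(num_differentials))
    first_time_term = k_a + k_b + eps - 5   # 2^(80+eps) * 2^-5
    memory = c_in + c_out + eps             # N = 2^(64+eps)
    return data, first_time_term, memory


simple = aria_complexities_log2(5.9)
multiple = aria_complexities_log2(7, num_differentials=2 ** 9)
```

Comparing the two tuples shows the trade-off discussed in the text: the multiple case lowers the data exponent, while the larger \(\varepsilon \) slightly raises the first time term and the memory.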

Comparing both An interesting generic question is: Are there cases where the simple attack might provide a better complexity? As one can see from above, if we wanted to obtain the same data complexity, we should take an \(\varepsilon \) such that \(\varepsilon =\varepsilon _s +9\). In this case, the memory complexity of the multiple case is a factor of \(2^9\) higher than in the simple one. Let us see what happens with the time complexity. The first and last terms of the time complexity are equal. The difference might come from the middle term: \(2^{128+32}2^{-1.442\cdot 2^{\varepsilon _s}}2^{-1.58}\) and \(2^{256}2^{-1.442\cdot 2^{\varepsilon }}2^{-1.58}\). We see that in the multiple case, the simple case term is multiplied by \(2^{96}\cdot 2^{-1.442\cdot (2^9-1) \cdot 2^{\varepsilon _s}}=2^{96-736.86\cdot 2^{\varepsilon _s}}\), which provides a better complexity. Nevertheless, when the bottleneck of the time complexity for the best attacks is the first term rather than the second, as is the case for our results on ARIA, the simple attack may achieve a slightly better time complexity thanks to the smaller \(\varepsilon _s\) that it allows, even though its data complexity is always much worse.

6 Conclusion

In this paper, we presented new techniques for improving impossible differential attacks. Furthermore, we showed that the nature of the key schedule has a non-negligible impact on the time complexity of such attacks and provided a new complexity formula taking this phenomenon into account. We applied these new techniques, individually and in combination, to various ciphers, based on both SPN and Feistel constructions. From this point of view, our work complements the results of [10] where only Feistel ciphers were analyzed. We showed here that our techniques, as well as those introduced in [10], work on both constructions. However, there are small differences in the extent of the applicability of these techniques. For example, we noticed that applying multiple differentials in impossible differential cryptanalysis is somewhat easier on SPN ciphers, because linear layers of MixColumns type offer more possibilities for extending a differential and hence naturally provide more input/output differences. On the other hand, the state-test technique applies more easily to Feistel ciphers. A natural explanation for this is that in SPN ciphers, even if the state-test technique can almost always be applied, the gain in the time complexity generally leads to an equivalent loss in data complexity, because a portion of the active part of the plaintexts has to be fixed.

We also compared attacks based on multiple (impossible) differentials with equivalent attacks exploiting only a single differential. We showed that when exploiting multiple differentials, the data complexity is always lower. However, the gain in the time complexity is not always clear, and a simple attack can sometimes lead to a better time complexity.

Additionally, in order to verify and validate the applicability of the proposed techniques, we implemented two of the techniques on toy ciphers. These experiments confirm that our theoretical estimates are indeed good estimates of the complexities. However, we insist that for an exact determination of the complexity, one must perform the detailed attack step by step.