Keywords

1 Introduction

Block ciphers are certainly among the most important cryptographic primitives. Since the invention of the DES [1] in the mid 70’s and even more with the design of the AES [2], a huge amount of research has been done on various aspects of block cipher design and block cipher analysis. In the last decade, many new block ciphers have been proposed that aim at highly resource constrained devices. Driven by new potential applications like the internet of things, we have witnessed not only many new designs, but also several new cryptanalytic results. Today, we have at hand a well established set of cryptanalytic tools that, when are carefully applied, allow to gain significant confidence in the security of a block cipher design. The most prominent tools here are certainly differential [5] and linear [21] attacks and their numerous variations [4, 7, 14, 15].

Despite this fact, quite some of the recently proposed lightweight block ciphers got broken rather quickly. One of the reasons for those attacks, on what is supposed to be a well-understood field of cryptographic designs, is that the new lightweight block ciphers are designed more aggressive than e.g. most of the AES candidates. Especially when it comes to the design of the key schedule, many new proposals keep the design very simple, often using identical round keys. While there is no general defect with such a key schedule, structural attacks become much more of an issue compared to a cipher that deploys a more complicated key schedule. In this paper we introduce a new structural attack, named nonlinear invariant attack. At first glance, it might seem quite unlikely that such an attack could ever be successfully applied. However, we give several examples of ciphers that are highly vulnerable to this attack.

1.1 Our Contribution

Given a block cipher \(E_k:\mathbb {F}_2^n \rightarrow \mathbb {F}_2^n \), the general principle of the nonlinear invariant attack is to find an efficiently computable nonlinear Boolean function \(g:\mathbb {F}_2^n\rightarrow \mathbb {F}_2\) such that

$$\begin{aligned} g(x) \oplus g(E_k(x)) \end{aligned}$$

is constant for any x and for many possible keys k. Keys such that this term is constant are called weak keys. The function g itself is called nonlinear invariant for \(E_k\). Clearly, when the block cipher \(E_k\) has a (non-trivial) nonlinear invariant function g, \(g(p) \oplus g(E_k(p))\) is constant for any plaintext p and any weak key k. On the other hand, the probability that random permutations have this property is about \(2^{-N+1}\) when g is balanced. Therefore, attackers can immediately execute a distinguishing attack. Moreover, if the constant depends on the secret key, an attacker can recover one bit of information about the secret key by using one known plaintext-ciphertext pair.

For round-based block ciphers, our attack builds the nonlinear invariants from the nonlinear invariants of the single round functions. In order to extend the nonlinear invariant for a single round to the whole cipher, all round-keys must be weak keys. It may be infeasible to find such weak-key classes for block ciphers with a non-trivial key schedule. However, as mentioned above, many recent block ciphers are designed for lightweight applications, and they adopt more aggressive designs to achieve high performance even in highly constrained environments. Several lightweight ciphers do not deploy any key schedule at all, but rather use the master key directly as the identical round key for all rounds. In such a situation, the weak-key class of round keys is trivially converted into the weak-key class of the secret key. In particular, when all round keys are weak, this property is iterative over an arbitrary number of rounds.

(Ciphertext-Only) Message-Recovery Attacks. The most surprising application of the nonlinear invariant attack is an extension to ciphertext-only message-recovery attacks. Clearly, we cannot execute any ciphertext-only attack without some information on the plaintexts. Therefore, our attack is ciphertext-only attack under the following environments. Suppose that block ciphers which are vulnerable against the nonlinear invariant attack are used in well-known modes of operation, e.g., CBC, CFB, OFB, and CTR. Then, if the same unknown plaintext is encrypted by the same weak key and different initialization vectors, attackers can practically recover a part of the plaintext from the ciphertexts only.

Applications. We demonstrate that our new attack practically breaks the full authenticated encryption schemes SCREAM Footnote 1 [11] and iSCREAM  [10] and the low-energy block cipher Midori64 [3] in the weak-key setting.

Table 1. Summary of the nonlinear invariant attack

We show that the tweakable block ciphers Scream and iScream have a nonlinear invariant function, and the number of weak keys is \(2^{96}\). Midori64 also has a nonlinear invariant function, and the number of weak keys is \(2^{64}\). Table 1 summarizes the result of the nonlinear invariant attack against SCREAM, iSCREAM, and Midori64. The use of the tweakable block cipher Scream is defined by the authenticated encryption SCREAM, and the final block is encrypted like CTR when the byte length of a plaintext is not multiple of 16. We exploit this procedure and recover 32 bits of the final block of the plaintext if the final block length ranges from 12 bytes to 15 bytes. We can also execute a similar attack against iSCREAM. Note that our attack breaks SCREAM and iSCREAM in the nonce-respecting model. Midori64 is a low-energy block cipher, and we consider the case that Midori64 is used by well-known modes of operation. As a result, we can recover 32 bits in every 64-bit block of the plaintext if Midori64 is used in CBC, CFB, OFB, and CTR.

Comparison with Previous Attacks. Leander et al. proposed invariant subspace attack on iSCREAM [19], which is a weak-key attack working for \(2^{96}\) weak keys. The attack can be a distinguishing attack and key recovery attack in the chosen-message and chosen-tweak model. Guo et al. presented a weak-key attack on full Midori64 [12], which works for \(2^{32}\) weak keys, distinguishes the cipher with 1 chosen-plaintext query, and recovers the key with \(2^{16}\) computations.

Compared to [19], our attack has the same weak key size and we distinguish the cipher in the known-message and chosen-tweak model. Compared to [12], our weak-key class is much larger and the cipher is distinguished with 2 known-plaintext queries. In both applications, the key space can be reduce by 1 bit, besides a part of message/plaintext can be recovered from the ciphertext.

1.2 Related Work

The nonlinear invariant attack can be regarded as an extension of linear cryptanalysis [21]. While linear cryptanalysis uses a linear function to approximate the cipher, the nonlinear invariant attack uses a nonlinear function and the probability of the nonlinear approximation is one. When g is linear, ciphers that are resistant against the linear cryptanalysis never have a linear approximation with probabilistically one.

The use of the nonlinear approximation has previously been studied. This extension was first discussed by Harpes et al.  [13], and Knudsen and Robshaw later investigated the effectiveness deeply [16]. However, they showed that there are insurmountable problems in the general use of nonlinear approximations. For instance, one cannot join nonlinear approximations for more than one round of a block cipher because the actual approximations depend on the specific value of the state and key. Knudsen and Robshaw demonstrated that nonlinear approximations can replace linear approximations in the first and last rounds only [16]. Unfortunately, nonlinear cryptanalysis has not been successful because of this limited application. Our attack can be seen as the first application of the nonlinear cryptanalysis against real ciphers in the past two decades.

Other related attacks are the invariant subspace attack [18, 19] and symmetric structures [8, 17, 23]. Similar to the nonlinear invariant attack, those attacks exploit a cryptanalytic property which continues over an arbitrary number of rounds in the weak-key setting. While those attacks have to choose plaintexts, i.e. are chosen plaintext attacks, the nonlinear invariant attack does not need to choose plaintexts in general. This in particular allows us to extend the nonlinear invariant attack from a pure distinguishing attack to the (ciphertext-only) message-recovery attack.

1.3 Paper Organization

We explain the general ideas and principles of the new attack in Sect. 2. Section 3 explains how, in many cases, the attack can be constructed in an almost automatic way using an algorithmic approach that is for most ciphers practical. Moreover, we give a structural reason why some ciphers, more precisely some linear layers, are inherently weak against our attack and why our attacks are possible against those ciphers. In Sect. 4 we explain in detail our attacks on SCREAM and iSCREAM. Moreover, Sect. 5 details our nonlinear invariant attack on Midori64. Finally, in Sect. 6, we give some additional insights into the general structure of nonlinear invariant functions and outline some future work.

2 Nonlinear Invariant Attack

In this section, we describe the basic principle of the attack and its extension to (ciphertext-only) message-recovery attacks when used in common modes of operations. While our attack can be applied to any cipher structure in principle, we focus on the case of key-alternating ciphers and later on substitution permutation networks (SPN) ciphers to simplify the description. We start by explaining the basic idea and later how, surprisingly, the attack can be extended to a (ciphertext-only) message-recovery attack in many scenarios.

2.1 Core Idea

Let \(F:\mathbb {F}_2^n \rightarrow \mathbb {F}_2^n\) be the round function of a key-alternating cipher and \(F_k(x) = F(x \oplus k)\) be the round function including the key XOR. Thus, for an r-round cipher, the ciphertext C is computed from a plaintext P using round keys \(k_i\) as

$$\begin{aligned} x_0= & {} P \\ x_{i+1}= & {} F_{k_i}(x_i) =F(x_i \oplus k_i) \quad 0\le i \le r-1\\ C= & {} x_{r} \end{aligned}$$

where we ignore post-whitening key for simplicity.

The core idea of the nonlinear invariant attack is to detect a nonlinear Boolean function g such that

$$\begin{aligned} g(F(x \oplus k))=g(x \oplus k) \oplus c = g(x) \oplus g(k) \oplus c \quad \forall x \end{aligned}$$

for many keys k, where c is a constant in \(\mathbb {F}_2\). Keys for which this equality holds will be called weak keys. The function g itself is called nonlinear invariant in this paper.

The important remark is that, if all round-keys \(k_i\) are weak then

Thus, the invariant is iterative over an arbitrary number of rounds and immediately leads to a distinguishing attack.

Distinguishing Attack. Assume that we found a Boolean function g that is nonlinear invariant for the round function \(F_k\) of a block cipher. Then, if all round keys are weak, this function g is also nonlinear invariant over an arbitrary number of rounds.

Let \((P_i,C_i)\) \(1\le i \le N\) be N pairs of plaintexts and corresponding ciphertexts. Then, \(g(P_i) \oplus g(C_i)\) is constant for all pairs. If g is balanced, the probability that random permutations have this property is about \(2^{-N+1}\). Note that the case that g is unbalanced can be handled as well, but is not the main focus of our paper. Therefore, we can practically distinguish the block cipher from random permutations under a known-plaintext attack.

Suitable Nonlinear Invariants. We next discuss a particular choice of a nonlinear invariant g for which it is directly clear that weak keys exist. Imagine we were able to identify a nonlinear invariant g for F, i.e. a function such that

$$\begin{aligned} g(F(x)) \oplus g(x) \end{aligned}$$

is constant, such that g is actually linear (or constant) in some of the inputs. In this case, all round keys that are zero in the nonlinear components of g, are weak.

More precisely, without loss of generality, assume that the nonlinear invariant g is linear in the last t bits of input (implying that g is nonlinear in the first s bits of input where \(s=n-t\)). Namely, we can view g as

$$\begin{aligned} g: (\mathbb {F}_2^s \times \mathbb {F}_2^t) \rightarrow \mathbb {F}_2 \end{aligned}$$

such that

$$\begin{aligned} g(x,y)=g(x,0) \oplus g(0,y) = f(x) \oplus \ell (y) \end{aligned}$$

where f is the nonlinear part of g, and \(\ell \) is the linear part of g. As g is a nonlinear invariant for F, it holds that

$$\begin{aligned} g(x,y) \oplus g(F(x,y)) = c, \end{aligned}$$

where c is a constant in \(\mathbb {F}_2\). Now consider a round key \(k\in \mathbb {F}_2^s \times \mathbb {F}_2^t\) of the form \((0,k')\). That is, we consider a round key such that its first s bits are zero. Then it holds that

$$\begin{aligned} g(F_k(x,y))= & {} g(F(x,y \oplus k'))\\= & {} g(x,y \oplus k') \oplus c\\= & {} f(x) \oplus \ell (y \oplus k') \oplus c\\= & {} f(x) \oplus \ell (y) \oplus \ell (k') \oplus c\\= & {} g(x,y) \oplus g(0,k') \oplus c. \end{aligned}$$

In other words, all those round-keys that are zero in the first s bits are weak. Phrased differently, the density of weak keys is \(2^{-s}\).

Example 1

Let \(g : \mathbb {F}_2^4 \rightarrow \mathbb {F}_2\) be a nonlinear invariant as

$$\begin{aligned} g(x_4,x_3,x_2,x_1) = x_4 x_3 \oplus x_3 \oplus x_2 \oplus x_1. \end{aligned}$$

Then, the function g can be viewed as

$$\begin{aligned} g(x_4,x_3,x_2,x_1) = f(x_4, x_3) \oplus \ell (x_2, x_1). \end{aligned}$$

Now consider a round key \(k\in \mathbb {F}_2^2 \times \mathbb {F}_2^2\) of the form \((0,k')\). Then, the function g is a nonlinear invariant for the key XOR because

$$\begin{aligned} g(x) \oplus g(x \oplus k) = g(x) \oplus g(x) \oplus g(0,k') = g(0,k'). \end{aligned}$$

On Key Schedule and Round Constants. Many block ciphers generate round keys from the master key by a key schedule. For a proper key schedule, it is very unlikely that all round keys are weak in the above sense. However, many recent lightweight block ciphers do not have a well-diffused key schedule, but rather use (parts of) the master key directly as the round keys. From a performance point of view, this approach is certainly preferable.

However, the direct XORing with the secret key often causes vulnerabilities like the slide attack [6] or the invariant subspace attack [18]. To avoid those attacks, round constants are additionally XORed in such lightweight ciphers. While dense and random-looking round constant would be a conservative choice, many such ciphers adopt sparse round constants because they are advantageous in limited memory requirements.

Focusing on the case of identical round keys, assume that there is a Boolean function g which is nonlinear invariant for the round function F. Now if all used round constants \(c_i\) are such that \(c_i\) is only involved in the linear terms of g, the function g is nonlinear invariant for this constant addition. This follows by the same arguments for the weak keys above. We call such constants, in line with the notation of weak keys from above, weak constant.

To conclude, given a key-alternating cipher with identical round-keys and weak round-constants, any master-key that is weak, is immediately weak for an arbitrary number of rounds. In this scenario, the number of weak keys is \(2^t\), or equivalently, the density of weak keys is \(2^{-s}\).

2.2 Message Recovery Attack

As described so far, the nonlinear invariant attack leaks at most one bit of the secret key. However, if a block cipher that is vulnerable to the nonlinear invariant attack is used in well-known modes of operation, e.g., CBC, CFB, OFB, and CTR, surprisingly, the attack can be turned into a ciphertext-only message recovery attack.

Clearly, we cannot execute any ciphertext-only attack without some information on the plaintexts. When block ciphers are used under well-known modes of operation, the plaintext itself is not the input of block ciphers and the input is rather initialization vectors. Here we assume that an attacker can collect several ciphertexts where the same plaintext is encrypted by the same (weak) key and different initialization vectors. We like to highlight that this assumption is more practical not only compared to the chosen-ciphertext attack but also to the known-plaintext attack. In practice, for instance, assuming an application sends secret password several times, we can recover the password practically. While the feasibility depends on the behavior of the application, our attack is highly practical in this case.

Attack Against CBC Mode. Figure 1 shows the CBC mode, where h message blocks are encrypted. Let \(P_j\) be the jth plaintext block, and \(C_j^i\) denotes the jth ciphertext block by using the initialization vector \(IV^i\). The attacker aims at recovering the plaintext \((P_1, P_2, \ldots , P_h)\) by observing the ciphertext \((IV^i, C_1^i, C_2^i, \ldots , C_h^i)\). Moreover, we assume that the block cipher \(E_k\) is vulnerable against the nonlinear invariant attack, i.e., there is a function g such that \(g(x) \oplus g(y)\) is constant, where x and y denote the input and output of the block cipher.

Fig. 1.
figure 1

CBC mode

First, we explain how to recover the plaintext \(P_1\) by focusing on the first block. Since \(E_k\) is vulnerable against the nonlinear invariant attack, there is a function g such that \(g(P_1 \oplus IV_1^i) \oplus g(C_1^i)\) is constant for any \(i \in \{1,2,\ldots ,N\}\). If g would be a linear function,

$$\begin{aligned} g(P_1 \oplus IV_1^j) \oplus g(C_1^j)=g(P_1) \oplus g(IV_1^j) \oplus g(C_1^j) \end{aligned}$$

is constant, and the attacker could only recover at most one bit of secret information. However, g is nonlinear in our attack. Therefore, we can guess and determine the part of \(P_1\) that is involved in the nonlinear term of g. More precisely, assume as above – without loss of generality – that g is nonlinear in the first s inputs and linear in the last t inputs, i.e.

$$\begin{aligned} g:\mathbb {F}_2^s \times \mathbb {F}_2^t \end{aligned}$$

such that

$$\begin{aligned} g(x,y)=f(x)\oplus \ell (y) \end{aligned}$$

where f is any Boolean function, and \(\ell \) is a linear Boolean function. Consider again a plaintext \(P_1=(x,y)\) with \(x\in \mathbb {F}_2^s\) and \(y \in \mathbb {F}_2^t\). The corresponding ciphertext \(C_1^i\) is split as \(C_1^i=(c_i,d_i)\) and the IVs as \(IV^i=(a_i,b_i)\). With this notation, we can rewrite the following

$$\begin{aligned} g(P_1 \oplus IV^i)= & {} \left( f(x\oplus a_i)\oplus \ell (y\oplus b_i)\right) , \\ g(P_1 \oplus IV^j)= & {} \left( f(x\oplus a_j)\oplus \ell (y\oplus b_j)\right) ,\\ g(C_1^i)= & {} \left( f(c_i)\oplus \ell (d_i)\right) , \\ g(C_1^j)= & {} \left( f(c_j)\oplus \ell (d_j)\right) . \end{aligned}$$

Now, by using two distinct initialization vectors \(IV^i\) and \(IV^j\)

$$\begin{aligned} 0 = g(P_1 \oplus IV^i) \oplus g(C_1^i) \oplus g(P_1 \oplus IV^j) \oplus g(C_1^j) \end{aligned}$$

implies

$$\begin{aligned} f(x\oplus a_i)\oplus f(x\oplus a_j)= & {} \ell (b_i\oplus b_j) \oplus g(C_1^i)\oplus g(C_1^j). \end{aligned}$$
(1)

Assuming that the left side of Eq. (1) randomly changes depending on x, that is the left part of \(P_1\), we can recover one bit of information on \(P_1\) by using two initialization vectors. Similarly, we can recover \(N-1\) bits of \(P_1\) by using N initialization vectors. Note that we can usually efficiently recover these bits by solving linear systems if the algebraic degree of f is small [22]. We show the specific procedure for SCREAM and Midori64 in Sects. 4 and 5, respectively. The relationship among \((P_1, IV, C_1)\) is equivalent to that among \((P_i, C_{i-1}, C_i)\). Therefore, we can similarly guess and determine the part of \(P_i\) from \(C_{i-1}\) and \(C_i\) for any of the plaintext blocks. One interesting remark is that as long as we start to recover the message from the second block, the attack can be executed even without the knowledge of the IV.

Attacks Against Other Modes. We can execute similar attack against the CFB, OFB, and CTR modes.

In the CFB mode, the hth ciphertext block \(C_h\) is encrypted as

$$\begin{aligned} C_h = E_k(C_{h-1}) \oplus P_h, \end{aligned}$$

where the initialization vector IV is used as the input of the first block. For simplicity, let \(C_0\) be IV. Then, we can recover the part of \(P_h\) from two ciphertext blocks \(C_{h-1}\) and \(C_h\).

In the OFB mode, the hth ciphertext block \(C_h\) is encrypted as

$$\begin{aligned} C_h = (E_k)^h(IV) \oplus P_h, \end{aligned}$$

where \((E_k)^h(IV)\) is h times multiple encryption. Since the nonlinear invariant property is iterative over an arbitrary number of rounds, the multiple encryption is also vulnerable against the nonlinear invariant attack. Therefore, we can recover the part of \(P_h\) from IV and \(C_h\).

In the CTR mode, the hth ciphertext block \(C_h\) is encrypted as

$$\begin{aligned} C_h = E_k(IV+h) \oplus P_h. \end{aligned}$$

Therefore, we can recover the part of \(P_h\) from \(IV+h\) and \(C_h\).

3 Finding Nonlinear Invariants for SP-ciphers

We start by considering the very general problem of finding nonlinear invariants. Namely, given any function

$$\begin{aligned} F :\mathbb {F}_2^m \rightarrow \mathbb {F}_2^m, \end{aligned}$$

our goal is to find a Boolean function

$$\begin{aligned} g: \mathbb {F}_2^m \rightarrow \mathbb {F}_2 \end{aligned}$$

such that

$$\begin{aligned} g(x)= & {} g(F(x)) \oplus c \end{aligned}$$
(2)

where c is a constant in \(\mathbb {F}_2\).

The description so far is generic in the sense that it applies to basically any block cipher. For now, and actually for the remainder of the paper, we focus on key-alternating ciphers with a round function using a layer of S-boxes and a linear layer, so called substitution-permutation networks (SPN).

3.1 SPN Ciphers

In the following, we consider the un-keyed round function only. That is to say that we ignore the key schedule and also any round constants.

For simplicity, we focus on the case of identical S-boxes, but the more general case can be handled in a very similar manner. We denote by t the number of S-boxes and by n the size of one S-box. Thus, the block size processed is \(n \cdot t\) bits. With this notation, we consider one round R of an SPN

$$\begin{aligned} R:\left( \mathbb {F}_2^n\right) ^t \rightarrow \left( \mathbb {F}_2^n\right) ^t \end{aligned}$$

as consisting of a layer of S-boxes \(\mathcal{S}\) with

$$\begin{aligned} \mathcal{S}(x_1,\dots ,x_t)=(S(x_1),\dots ,S(x_t)) \end{aligned}$$

where S is an n-bit S-box and a linear layer

$$\begin{aligned} L:\left( \mathbb {F}_2^n\right) ^t \rightarrow \left( \mathbb {F}_2^n\right) ^t \end{aligned}$$

which can also be seen as

$$\begin{aligned} L:\mathbb {F}_2^{nt} \rightarrow \mathbb {F}_2^{nt} . \end{aligned}$$

The round function R is given as the composition of the S-box layer and the linear layer, i.e.

$$\begin{aligned} R(x)=L\circ \mathcal{S}(x). \end{aligned}$$

We would like to find nonlinear invariant g for R. However, computing this directly is difficult as soon as the block size is reasonable large. For any function F, let us denote by

$$\begin{aligned} U(F):=\{ g: \mathbb {F}_2^m \rightarrow \mathbb {F}_2 \ | \ g(x)=g(F(x)) \oplus c \} \end{aligned}$$

the set of all nonlinear invariants for F, and it holds that

$$\begin{aligned} g \in \left( U(\mathcal{S}) \cap U(L) \right) \subset U(R). \end{aligned}$$

In other words, functions that are invariant under both \(\mathcal{S}\) and L are clearly invariants for their composition R.

As we will explain next, computing parts of \( U(\mathcal{S}) \cap U(L)\) is feasible, and sufficient to automatically detect the weaknesses described later in the paper.

The S-box Layer. We start by investigating the S-box-layer. Given the S-box as a function

$$\begin{aligned} S :\mathbb {F}_2^n \rightarrow \mathbb {F}_2^n \end{aligned}$$

computing U(S) is feasible as long as n is only moderate in size.

Note that, for any function F, U(F) is actually a subspace of Boolean functions. To see this, note that given two Boolean functions \(f,g \in U(F)\), it holds

$$\begin{aligned} (f \oplus g)(x)= & {} f(x) \oplus g(x) \\= & {} \left( f(F(x)) \oplus c \right) \oplus \left( g(F(x)) \oplus c'\right) \\= & {} (f \oplus g)(F(x)) \oplus (c \oplus c') \end{aligned}$$

for any x. Thus the sum, \(f \oplus g\), is in U(F) as well. Moreover, the all-zero function is in U(F) for any F. Therefore, any nonlinear invariant \(g_S\in U(S)\) can actually be described by a linear combination of basis elements of U(S). More precisely, let \(b_1,\dots ,b_d:\mathbb {F}_2^n \rightarrow \mathbb {F}_2\) be a basis of U(S), then any \(g_S\in U(S)\) can be written s

$$\begin{aligned} g_S(x)= \bigoplus _{i=1}^{d} \gamma _i b_i(x) \end{aligned}$$

for suitable coefficients \(\gamma _i\) in \(\mathbb {F}_2\).

To identify a nonlinear invariant \(g_S \in U(S)\), the idea is to consider the algebraic normal form (ANF) of \(g_S\), that is to express \(g_S\) as

$$\begin{aligned} g_S(x)=\bigoplus _{ u \in \mathbb {F}_2^n} \lambda _u x^u , \end{aligned}$$

where \(\lambda _u \in \mathbb {F}_2\) are the coefficients to be determined and \(x^u\) denotes \(\prod x_i^{u_i}\). The key observation is that Eq. (2), for any fixed \(x\in \mathbb {F}_2^n\), translates into one linear (or affine) equation for the coefficients \(\lambda _u\), namely

$$\begin{aligned} \bigoplus _{ u \in \mathbb {F}_2^n} \lambda _u (x^u \oplus S(x)^u) = c. \end{aligned}$$

The ANF of \((x^u \oplus S(x)^u)\) is computed for all \(u \in \mathbb {F}_2^n\), and we can easily solve the basis \(b_1,\ldots ,b_d \in U(S)\) for n not too big. Appendix A shows the algorithm in detail. In particular, for commonly used S-box sizes of up to 8 bits, the space U(S) can be computed in less than a second on a standard PC.

So far, we have considered only a single S-box, and it still needs to be discussed how those results can be translated into the knowledge of invariants for the parallel execution of S-boxes, i.e. for \(\mathcal{S}\). Again, for a layer of S-boxes \(\mathcal{S}\) computing \(U(\mathcal{S})\) directly using its ANF is (in general) too expensive. However, we can easily construct many elements in \(U(\mathcal{S})\) from elements in U(S) as summarized in the following proposition.

Proposition 1

Let \(g_i \in U(S)\), for \(i \in \{1,\dots ,t\}\) be any set of invariants for the S-box S. Then, any function of the form

$$\begin{aligned} g_{\mathcal{S}}(x_1,\dots ,x_t)=\bigoplus _{i=1}^t \alpha _i g_i(x_i) \end{aligned}$$

with \(\alpha _i \in \mathbb {F}_2\) is in \(U(\mathcal{S})\), that is an invariant for the entire S-box layer. The set of function form a subspace of \(U(\mathcal{S})\) of dimension \(d*t\) where d is the dimension of U(S), and t is the number of parallel S-boxes.

We denote this subspace of invariants for \(\mathcal{S}\) by \(U_{\ell }(\mathcal{S})\), and \(U_{\ell }(\mathcal{S}) \subset U(\mathcal{S})\).

It turns out that, in general, many more elements are contained in \(U(\mathcal{S})\) than those covered by the construction above. We decided to shift those details, which are not directly necessary for the understanding of the attacks presented in Sects. 4 and 5 to the end of the paper, in Sect. 6.

The Linear Layer. For the linear layer computing U(L) using its ANF seems again difficult. But, as stated above, we focus on

$$\begin{aligned} g \in (U(L) \cap U_\ell (\mathcal{S})) \subset (U(L) \cap U(\mathcal{S})) \subset U(R), \end{aligned}$$

and computing \(U(L) \cap U_{\ell }(\mathcal{S})\) is feasible in all practical cases.

Recall that any nonlinear invariant \(g\in U(S)\) can actually be described by a linear combination of basis of U(S) as

$$\begin{aligned} g_S(x)= \bigoplus _{i=1}^{d} \gamma _i b_i(x) \end{aligned}$$

for suitable coefficients \(\gamma _i\) in \(\mathbb {F}_2\).

As any f in \(U_{\ell }(\mathcal{S})\) is itself a direct sum of elements in U(S), it can be written as

$$\begin{aligned} f(x_1,\dots ,x_t)=\bigoplus _{i=1}^t \bigoplus _{j=1}^d \beta _{i,j} b_j(x_i) \end{aligned}$$

with \(\beta _{i,j} \in \mathbb {F}_2\). Computing those coefficients \(\beta _{i,j}\) can again be done by solving linear system, as any fixed \(x\in \left( \mathbb {F}_2^n\right) ^t\) results in a linear equation for the coefficients by using

$$\begin{aligned} f(x)=f(L(x)). \end{aligned}$$

As long as the dimension of \(U_{\ell }(\mathcal{S})\), i.e. the number of unknowns, is not too large, this again can be computed within seconds on a standard PC.

Experimental Results. When the procedure explained above was applied to the ciphers SCREAM and Midori, it instantaneously detected possible attacks. Actually, as we will explain next, there is a common structural reason why non linear invariant attacks are possible on those ciphers.

3.2 Structural Weakness with Respect to Nonlinear Invariant

Let us consider linear layers which are actually used in the LS-designs [9] (cf. Sect. 4) and also in any AES-like cipher that uses a binary diffusion matrix as a replacement for the usual MixColumns operation. Then, we consider a linear layer that can be decomposed into the parallel application of n identical \(t\times t\) binary matrices M. The input for the first \(t\times t\) matrix is composed of all the first output bits of the t S-boxes, the input for the second matrix is composed of all the second output bits of the S-boxes, etc.

Here, when M is an orthogonal matrix, that is if

$$\begin{aligned} \langle x,y \rangle =\langle xM, yM\rangle \quad \forall \ x,y, \end{aligned}$$

any quadratic nonlinear invariant for the S-box can be extended to a nonlinear invariant of the whole round function as described in Theorem 1.

Note that from a design point of view, taking M as an orthogonal matrix seems actually beneficial. Thanks to the orthogonality of M, bounds on the number of active S-boxes for differential cryptanalysis directly imply the same bounds on the number of active S-boxes for linear cryptanalysis.

Theorem 1

For the SPN ciphers whose round function follows the construction used in LS-designs, let \(M \in \mathbb {F}_2^{t \times t}\) be the binary representation of the linear layer and M is orthogonal. Assume there is a nonlinear invariant \(g_S\) for the S-box. If \(g_S\) is quadratic, then the function

$$\begin{aligned} g(x_1,\dots ,x_t):=\bigoplus _{i=1}^{t} g_S(x_i) \end{aligned}$$

is a nonlinear invariant for the round function \(L \circ \mathcal{S}\).

Proof

First, due to Proposition 1, it is immediately clear that g is a nonlinear invariant for the S-box layer \(\mathcal{S}\).

Next, let us consider the linear layer L. Let \(x \in (\mathbb {F}_2^n)^t\) and \(y \in (\mathbb {F}_2^n)^t\) be the input and output of L, respectively. Moreover, \(x_i[j]\) and \(y_i[j]\) denotes the jth bit of \(x_i\) and \(y_i\), respectively. For simplicity, let \(x^T \in (\mathbb {F}_2^t)^n\) and \(y^T \in (\mathbb {F}_2^t)^n\) be the transposed input and output, respectively, where \(x^T_j \in \mathbb {F}_2^t\) denotes \((x_1[j],x_2[j],\ldots ,x_t[j])\). Then, it holds \(y^T_i = x^T_i \times M\) for all \(i \in \{1,2,\ldots ,n\}\). Since the Boolean function \(g_S\) is quadratic, the function is represented as

$$\begin{aligned} g_S(x_i) = \bigoplus _{i_1=1}^{n} \bigoplus _{i_2=1}^{n} \gamma _{i_1,i_2} (x_i[i_1] \wedge x_i[i_2]), \end{aligned}$$

where \(\gamma _{i_1,i_2}\) are coefficients depending on the function g. From the inner product \(\langle x^T_{i_1} , x^T_{i_2} \rangle = \bigoplus _{i=1}^{t} x_i[i_1] \wedge x_i[i_2]\),

$$\begin{aligned} g(x) = \bigoplus _{i=1}^{t} g_S(x_i) = \bigoplus _{i_1=1}^{n} \bigoplus _{i_2=1}^{n} \gamma _{i_1,i_2} \langle x^T_{i_1} , x^T_{i_2} \rangle . \end{aligned}$$

Then,

$$\begin{aligned} g(y)&= \bigoplus _{i_1=1}^{n} \bigoplus _{i_2=1}^{n} \gamma _{i_1,i_2} \langle x^T_{i_1} M , x^T_{i_2} M \rangle \end{aligned}$$

From the orthogonality of M,

$$\begin{aligned} g(y)&= \bigoplus _{i_1=1}^{n} \bigoplus _{i_2=1}^{n} \gamma _{i_1,i_2} \langle x^T_{i_1} , x^T_{i_2} \rangle \\&= \bigoplus _{i=1}^{t} g_S(x_i) = g(x) \end{aligned}$$

Therefore, the function \(g(x)=\bigoplus _{i=1}^{t}g_S(x_i)\) is a nonlinear invariant for L.    \(\square \)

Assuming that the matrix M is orthogonal, Theorem 1 shows that there is a nonlinear invariant for the round function \(L \circ \mathcal{S}\) if there is a quadratic function which is nonlinear invariant for the S-box.

4 Practical Attack on SCREAM

The most interesting application of the nonlinear invariant attack is a practical attack against the authenticated encryption SCREAM and iSCREAM in the nonce-respecting model. Both authenticated encryptions have \(2^{96}\) weak keys, and we then practically distinguish their ciphers from a random permutation. Moreover, we can extend this attack to a ciphertext-only attack.

4.1 Specification of SCREAM

SCREAM is an authenticated encryption and a candidate of the CAESAR competition [11]. It uses the tweakable block cipher Scream, which is based on the tweakable variant of LS-designs [9].

LS-Designs. LS-designs were introduced by Grosso et al. in [9], and it is used to design block ciphers. We do not refer to the design rational in this paper, and we only show the brief structure to understand this paper. The state of LS-designs is represented as an \(s \times \ell \) matrix, where every element of the matrix is only one bit, i.e., the block length is \(n = s \times \ell \). The ith round function proceeds as follows:

  1. 1.

    The s-bit S-box S is applied to \(\ell \) columns in parallel.

  2. 2.

    The \(\ell \)-bit L-box L is applied to s rows in parallel.

  3. 3.

    The round constant C(i) is XORed with the state.

  4. 4.

    The secret key K is XORed with the state.

Figure 2 shows the components of a LS-design. Let SB and LB be the S-box layer and L-box layer, respectively. Then, we call the composite function \((LB \circ SB)\) a LS-function. Let \(x \in \mathbb {F}_2^{s \times \ell }\) be the state of LS-designs. Then \(x[i,\star ] \in \mathbb {F}_2^\ell \) denotes the row of index i of x, and \(x[\star ,j] \in \mathbb {F}_2^s\) denotes the column of index j of x. Moreover, let x[ij] be the bit in the \((i+1)\)th row and \((j+1)\)th column. The S-box S is applied to \(x[\star ,j]\) for all \(j \in [0,\ell )\), and the L-box L is applied to \(x[i,\star ]\) for all \(i \in [0,s)\).

Fig. 2.
figure 2

The components of a LS-design

Tweakable Block Cipher Scream . Scream is based on a tweakable LS-design with an \(8 \times 16\) matrix, i.e., the block length is \(8 \times 16 = 128\) bits. Let \(x \in \mathbb {F}_2^{8 \times 16}\) be the state of Scream, then the entire algorithm is defined as Algorithm 1. Here S and L denote the 8-bit S-box and 16-bit L-box, respectively. The round constant C(r) is defined as

figure a
Fig. 3.
figure 3

The \(\sigma \)th step function of Scream

$$\begin{aligned} C(r)=2199 \cdot r ~\mathrm{mod}~2^{16}. \end{aligned}$$

The binary representation of C(r) is XORed with the first row \(x[0,\star ]\). Scream uses an 128-bit key K and an 128-bit tweak T as follows. First, the tweak is divided into 64-bit halves, i.e., \(T=t_0 \Vert t_1\). Then, every tweakey is defined as

$$\begin{aligned} TK(\sigma = 3i)&= K \oplus (t_0 \Vert t_1), \\ TK(\sigma = 3i + 1)&= K \oplus (t_0 \oplus t_1 \Vert t_1), \\ TK(\sigma = 3i + 2)&= K \oplus (t_1 \Vert t_0 \oplus t_1). \end{aligned}$$

Here, the \(x[i,\star ]\) contains state bits from \(16(i-1)\) to \(16i-1\), e.g., \(x[0,\star ]\) contains state bits from 0 to 15 and \(x[1,\star ]\) contains state bits from 16 to 31. Moreover, Fig. 3 shows the step function, where SB and LB are the S-box layer and L-box layer, respectively.

Authenticated Encryption SCREAM . The authenticated encryption SCREAM uses the tweakable block cipher Scream in the TAE mode [20]. SCREAM consists of three steps: associated data processing, encryption of the plaintext block, and tag generation. Since our attack exploits encryption of the plaintext block, we explain the specification (see Fig. 4). Plaintext values are encrypted by using Scream in order to produce the ciphertext values, and all blocks use \(T_c=(N \Vert c \Vert 00000000)\). If the last block is a partial block, its bitlength is encrypted to generate a mask, which is then truncated to the partial block size and XORed with the partial plaintext block. Therefore, the ciphertext length is the same as the plaintext length.

Fig. 4.
figure 4

Encryption of plaintext blocks

Security Parameter. Finally, we like to recall the security parameters of SCREAM, as described by the designers. Let \(n_b\) be the nonce bytesize, and it can be chosen by the user between 1 and 15 bytes. However, the designers recommend that \(n_b=11\), and we also use the recommended parameter in this paper.

SCREAM has three security parameters, i.e., lightweight security, single-key security, and related-key security. They are summarized as follows.

Lightweight security: 80-bit security, with a protocol avoiding related keys. Tight parameters: 6 steps, Safe parameters: 8 steps.

Single-key security: 128-bit security, with a protocol avoiding related keys. Tight parameters: 8 steps, Safe parameters: 10 steps.

Related-key security: 128-bit security, with possible related keys. Tight parameters: 10 steps, Safe parameters: 12 steps.

More precisely, designers order their recommended sets of parameters as follows:

  • First set of recommendations: SCREAM with 10 steps, single-key security.

  • Second set of recommendations: SCREAM with 12 steps, related-key security.

4.2 Nonlinear Invariant for Scream

The L-box of Scream is chosen as an orthogonal matrix. Therefore, there is a nonlinear invariant for the LS function from Theorem 1 if we can find quadratic Boolean function \(g:\mathbb {F}_2^8 \rightarrow \mathbb {F}_2\) which is a nonlinear invariant for the S-box S.

Let \(x \in \mathbb {F}_2^8\) and \(y \in \mathbb {F}_2^8\) be the input and output of the S-box S, respectively. Moreover, \(x[j] \in \mathbb {F}_2\) and \(y[j] \in \mathbb {F}_2\) denote the jth bits of x and y, respectively. Then, the Scream S-box has the following property

$$\begin{aligned} (x[1] \wedge x[2]) \oplus x[0] \oplus x[2] \oplus x[5] = (y[1] \wedge y[2]) \oplus y[0] \oplus y[2] \oplus y[5] \oplus 1. \end{aligned}$$

Let \(g_S:\mathbb {F}_2^8 \rightarrow \mathbb {F}_2\) be a quadratic Boolean function, where

$$\begin{aligned} g_S(x)=(x[1] \wedge x[2]) \oplus x[0] \oplus x[2] \oplus x[5]. \end{aligned}$$

Then, the function \(g_S\) is a quadratic nonlinear invariant for S because

$$\begin{aligned} g_S(x) \oplus g_S(S(x)) = g_S(x) \oplus g_S(x) \oplus 1 = 1. \end{aligned}$$

Therefore, due to Theorem 1, the Boolean function

$$\begin{aligned} g(x) = \bigoplus _{j=0}^{15} g_S(x[\star ,j]) = \bigoplus _{j=0}^{15} \bigg ( x[1,j] \wedge x[2,j] \oplus x[0,j] \oplus x[2,j] \oplus x[5,j] \bigg ) \end{aligned}$$

is a nonlinear invariant for the LS function. Note that this nonlinear invariant g is clearly balanced, as it is linear (and not constant) in parts of its input.

Next, we show that this Boolean function is also a nonlinear invariant for the constant addition and tweakey addition. The round constant C(r) is XORed with only \(x[0,\star ]\). Since C(r) linearly affects the output of the function g,

$$\begin{aligned} g(x \oplus C(r))&= g(x) \oplus g(C(r)) \end{aligned}$$

for any x. The tweakey \(TK(\sigma )\) is defined as

$$\begin{aligned} TK(\sigma = 3i)&= K \oplus (t_0 \Vert t_1), \\ TK(\sigma = 3i + 1)&= K \oplus (t_0 \oplus t_1 \Vert t_1), \\ TK(\sigma = 3i + 2)&= K \oplus (t_1 \Vert t_0 \oplus t_1), \end{aligned}$$

where \(T=t_0 \Vert t_1\). Therefore, if we restrict the key and tweak by fixing

$$\begin{aligned}&K[1,\star ]=K[2,\star ]=0, \\&T[1,\star ]=T[2,\star ]=T[5,\star ]=T[6,\star ]=0, \end{aligned}$$

\(TK(\sigma )[1,\star ]\) and \(TK(\sigma )[2,\star ]\) are always zero vectors. Then, since the tweakey linearly affects the output of the function g,

$$\begin{aligned} g(x \oplus TK(\sigma ))&= g(y) \oplus g(TK(\sigma )), \end{aligned}$$

and all those keys are weak. Therefore, the density of weak keys is \(2^{-32}\), i.e., there are \(2^{96}\) weak keys.

Let P and C be the plaintext and ciphertext of Scream, respectively. In \(N_s\)-step Scream, the relationship between p and c is represented as

$$\begin{aligned} g(P)&= g(C) \oplus \bigoplus _{r = 1}^{2N_s} g\big (C(r)\big ) \bigoplus _{\sigma = 0}^{N_s}g\big ( TK(\sigma ) \big ) \\&= g(C) \oplus c \oplus g_T(N_s,T) \oplus g_K(N_s,K), \end{aligned}$$

where \(c = \bigoplus _{r = 1}^{2N_s} g\big (C(r)\big )\), and \(g_T(N_s,T)\) and \(g_K(N_s,K)\) are defined as

$$\begin{aligned} g_T(N_s,T)&= {\left\{ \begin{array}{ll} g(t_0 \Vert t_1) &{} N_s = 0 \text { mod }6, \\ g(t_1 \Vert 0) &{} N_s = 1 \text { mod }6, \\ g(0 \Vert t_0 \oplus t_1) &{} N_s = 2 \text { mod }6, \\ g(t_0 \Vert t_0) &{} N_s = 3 \text { mod }6, \\ g(t_1 \Vert t_0 \oplus t_1) &{} N_s = 4 \text { mod }6, \\ 0 &{} N_s = 5 \text { mod }6, \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} g_K(N_s,K)&= {\left\{ \begin{array}{ll} g(K) &{} N_s = 0 \text { mod }2, \\ 0 &{} N_s = 1 \text { mod }2, \end{array}\right. } \end{aligned}$$

respectively. When the master key belongs to the class of weak-keys, \(g(p) \oplus g(c) \oplus g_T(N_s,T)\) is constant for all plaintexts and a given key. When the key does not belong to the weak-key class, the probability that the output is constant is about \(2^{-n+1}\) given n known plaintexts. Therefore, we can easily distinguish whether or not the using key belongs to the weak-key class. Note that all recommended numbers of rounds are even number. Therefore, from

$$\begin{aligned} g(K) = g(p) \oplus g(c) \oplus c \oplus g_T(N_s,T), \end{aligned}$$

we can recover one bit of information about the secret key K.

4.3 Practical Attack on SCREAM

Known-Plaintext Attack. We exploit the encryption step of SCREAM (see Fig. 4). The nonlinear invariant attack is a chosen-tweak attack under the weak-key setting. First, let us consider the class of weak tweaks. In the encryption step, the tweak \(T_c=(N \Vert c \Vert 00000000)\) is used, where we assume that \(n_b=11\). Figure 5 shows the structure of \(T_c\). From the condition of the nonlinear invariant attack, the following \(T_c\)

$$\begin{aligned} T_c[1,\star ]=T_c[2,\star ]=T_c[5,\star ]=T_c[6,\star ]=0 \end{aligned}$$

are weak tweaks. Namely, we choose N whose 3rd, 4th, 5th, 6th, and 11th bytes are zero. Then, if the counter c is less than 256, i.e. from \(T_0\) to \(T_{255}\), the tweak fulfils the condition. Moreover, the actual nonce fulfils the needs of the tweak if the nonce is implemented as a counter increment, which seems to occur in practice. If the master key belongs to the weak-key class, we can recover one bit of information about the secret key by using only one known plaintext. Moreover, by using n known plaintexts, the probability that the output is constant is about \(2^{-n+1}\) when the key does not belong to weak-key class. Therefore, an attacker can distinguish whether or not the used key belongs to the weak-key class.

Fig. 5.
figure 5

Tweak mapping

Ciphertext-Only Message Recovery Attack. The interesting application of the nonlinear invariant attack is a ciphertext-only attack. This setting is more practical than the known-plaintext attack.

We focus on the procedure of the final block. The input of Scream is the bitlength of \(P_{m-1}\), and the bitlength is encrypted to generate a mask. Then the mask is truncated to the partial block size and XORed with \(P_{m-1}\). Therefore, the ciphertext length is the same as the plaintext length. In the ciphertext-only attack, we cannot know \(P_{m-1}\). On the other hand, we know ciphertext \(C_{m-1}\) and the bitlength \(|P_{m-1}|\) can be obtained from \(|C_{m-1}|\). Therefore, we guess \(P_{m-1}\) and evaluate

$$\begin{aligned} g(|P_{m-1}|) \oplus g(P_{m-1} \oplus C_{m-1}) \oplus g_T(N_s,T), \end{aligned}$$

and the above value is always constant for any weak tweaks T. Therefore, from two ciphertexts corresponding to the same final plaintext block encrypted by distinct tweaks, we create a linear equation as

$$\begin{aligned} g(P_{m-1} \oplus C_{m-1}) \oplus g(P_{m-1} \oplus C_{m-1}') = g_T(N_s,T) \oplus g_T(N_s,T'). \end{aligned}$$
(3)

We can compute the right side of Eq. (3). Moreover, we regard the function g as

$$\begin{aligned} g(X) = f(X) \oplus \ell (X), \end{aligned}$$

where

$$\begin{aligned} f(X)&= \bigoplus _{j=0}^{15} \bigg ( X[1,j] \wedge X[2,j] \bigg ),\\ \ell (X)&= \bigoplus _{j=0}^{15} X[0,j] \oplus X[2,j] \oplus X[5,j]. \end{aligned}$$

Then,

$$\begin{aligned}&g(P_{m-1} \oplus C_{m-1}) \oplus g(P_{m-1} \oplus C_{m-1}')\\&\quad = f(P_{m-1} \oplus C_{m-1}) \oplus f(P_{m-1} \oplus C_{m-1}') \oplus \ell (C_{m-1}) \oplus \ell (C_{m-1}') \\&\quad = \bigoplus _{j=0}^{15} \bigg ( C_{m-1}[1,j] P_{m-1}[2,j] \oplus P_{m-1}[1,j] C_{m-1}[2,j] \oplus C_{m-1}[1,j] C_{m-1}[2,j] \bigg ) \\&\quad \quad \bigoplus _{j=0}^{15} \bigg ( C'_{m-1}[1,j] P_{m-1}[2,j] \oplus P_{m-1}[1,j] C'_{m-1}[2,j] \oplus C'_{m-1}[1,j] C'_{m-1}[2,j] \bigg ) \\&\quad \quad \oplus \ell (C_{m-1}) \oplus \ell (C_{m-1}'). \end{aligned}$$

The equation above is actual a linear equation in 32 unknown bits, \(P_{m-1}[1,j]\) and \(P_{m-1}[2,j]\), as all other terms are known. Therefore, we can create t linear equations by collecting \(t+1\) ciphertexts encrypted by distinct tweaks. We can recover 32 bits, \(P_{m-1}[1,j]\) and \(P_{m-1}[2,j]\) by solving this system as soon as the corresponding system has full rank. Assuming the system behaves like a randomly generated system of linear equations, we can expect that the system has full rank already when taking slightly more than 33 equations. The time complexity for solving this system is negligible.

Note that the system involves four 16-bit words, \(C_{m-1}[0,j]\), \(C_{m-1}[1,j]\), \(C_{m-1}[2,j]\), and \(C_{m-1}[5,j]\). Since the bitlength of \(C_{m-1}\) is equal to that of \(P_{m-1}\), we cannot solve this system if \(|P_{m-1}|<96\). Therefore, the necessary condition of this attack is \(96 \le |P_{m-1}| < 128\).

Experimental Results. In order to verify our findings and in particular to verify that the system indeed behaves like a random system of linear equations, we implemented our ciphertext-only message recovery attack for SCREAM. In our experiment, the key is randomly chosen from the weak-key class. Moreover, we use N distinct nonces such that the corresponding tweak is weak, and collect N corresponding ciphertexts. We repeated our attack 1000 times. Table 2 summarizes the success probability of recovering the correct 32 bits. Moreover, in the table we compare the experimental success probability to the theoretically expected probability in the case of a randomly generated system of linear equations. As can be seen, the deviation of the experimental results to the theoretically expected results is very small.

Table 2. The success probability of recovering the correct 32 plaintext bits on SCREAM.

4.4 Application to iSCREAM

The authenticated encryption iSCREAM also has the similar structure of SCREAM. We search for the nonlinear invariant for the underlying tweakable block cipher iScream. As a result, the following quadratic Boolean function

$$\begin{aligned} g_S(x)= (x[4] \wedge x[5]) \oplus x[0] \oplus x[6]. \end{aligned}$$

is nonlinear invariant for the S-boxFootnote 2, and it holds

$$\begin{aligned} g_S(x) \oplus g_S(S(x)) = g_S(x) \oplus g_S(x) = 0. \end{aligned}$$

Therefore, from Theorem 1, the following Boolean function

$$\begin{aligned} g(x) = \bigoplus _{j=0}^{15} g_S(x[\star ,j]) = \bigoplus _{j=0}^{15} \bigg ( x[4,j] \wedge x[5,j] \oplus x[0,j] \oplus x[6,j] \bigg ). \end{aligned}$$

is nonlinear invariant for the LS function.

5 Practical Attack on Midori64

5.1 Specification of Midori64

Midori is a light-weight block cipher recently proposed by Banik et al. [3], which is particularly optimized for low-energy consumption. There are two versions depending on the block size; Midori64 for 64-bit block and Midori128 for 128-bit block. Both use 128-bit key. The nonlinear invariant attack can be applied to Midori64, thus we only explain the specification of Midori64 briefly.

Midori64 adopts an SPN structure with a non-MDS matrix and a very light key schedule. The state is represented by a \(4\times 4\)-nibble array. At first the plaintext is loaded to the state, then the key whitening is performed. The state is updated with a round function 16 times, and a final key whitening is performed. The resulting state is the ciphertext. The overall structure is illustrated in Fig. 6. More details on each operation will be given in the following paragraphs.

Fig. 6.
figure 6

Computation structure of Midori64

Key Schedule Function. A user-provided 128-bit key is divided into two 64-bit key states \(K_0\) and \(K_1\). Then, a whitening key WK and 15 round keys \(RK_i, i=0.1,\ldots ,14\) are generated as follows.

$$\begin{aligned} WK \leftarrow K_0 \oplus K_1,&\qquad \quad RK_i \leftarrow K_{i \text { mod } 2} \oplus \alpha _i, \end{aligned}$$

where the \(\alpha _i\) are fixed 64-bit constants. The round constant \(\alpha _i\) are binary for each nibble, i.e. any nibble in \(\alpha _i\) is either 0000 or 0001. Using such constants is beneficial to keep the energy consumption low. The exact values of the \(\alpha _i\) are given in Table 3 for the first 6 rounds. We refer to [3] for the complete specification.

Table 3. Examples of round constant \(\alpha _i\)

Round Function. The round function consists of four operations: SubCell, ShuffleCell, MixColumn, and KeyAdd. Each operation is explained in the following.  

SubCell::

The 4-bit S-box S defined below is applied to each nibble in the state.

 

x

0

1

2

3

4

5

6

7

8

9

A

B

C

D

E

F

S(x)

C

A

D

3

E

B

F

7

8

9

1

5

0

2

4

6

  • ShuffleCell: Each cell of the state is permuted as ShiftRows in AES. Let \(s_0,s_1,s_2,s_3\) be four nibbles in the first row. Let \(s_4, \ldots , s_{15}\) be the other 12 nibbles similarly defined. Then, the cell permutation is specified as follows.

    $$\begin{aligned} (s_0, s_1, \ldots , s_{15}) \leftarrow (s_0, s_{10}, s_5, s_{15}, s_{14}, s_4, s_{11}, s_1, s_9, s_3, s_{12}, s_6, s_7, s_{13}, s_2, s_8) \end{aligned}$$

    Note that our nonlinear invariant attack would actually work in exactly the same way for any other cell permutation as well.

  • MixColumn: The following \(4\times 4\) orthogonal binary matrix M is applied to every column of the state.

    $$\begin{aligned} M= \begin{pmatrix} 0 &{} 1 &{} 1 &{} 1 \\ 1 &{} 0 &{} 1 &{} 1 \\ 1 &{} 1 &{} 0 &{} 1 \\ 1 &{} 1 &{} 1 &{} 0 \\ \end{pmatrix} \end{aligned}$$
  • KeyAdd: The round key \(RK_i\) is xored to the state in round i.

In the last round, only SubCell (followed by the post-whitening) is performed.

5.2 Nonlinear Invariant for Midori64

The matrix used in MixColumn is a binary and orthogonal matrix. Thus, Theorem 1 implies that any quadratic Boolean function \(g:\mathbb {F}_2^4 \rightarrow \mathbb {F}_2\), which is a nonlinear invariant for the S-box S, allows us to find nonlinear invariant for the entire round function. Similarly to the previous section, we use the notation \(x[j] \in \mathbb {F}_2\) and \(y[j] \in \mathbb {F}_2\) to denote the jth bits of 4-bit S-box input x and 4-bit S-box output y, respectively.

We search for g such that \(g(x)=g(S(x))\). Different from Scream, the S-box of Midori64 is small, and many of such g usually exist. Actually, we found 15 choices of such g.

We then pick up ones that are also nonlinear invariant for the key addition \(RK_i\), which is computed by \(RK_i \leftarrow K_{i \text { mod } 2} \oplus \alpha _i\). Here, \(\alpha _i\) takes 0 or 1 in each nibble, i.e. the 2nd, 3rd, and 4th bits are always 0. Thus we need to avoid g in which the first bit is included in the nonlinear component, i.e. g cannot involve x[0] and y[0] in their nonlinear component.

Among 15 choices, only one can satisfy this condition. The picked S-box property of Midori64 is as follows.

$$\begin{aligned} (x[3] \wedge x[2]) \oplus x[2] \oplus x[1] \oplus x[0] = (y[3] \wedge y[2]) \oplus y[2] \oplus y[1] \oplus y[0]. \end{aligned}$$

Then, the following \(g_S:\mathbb {F}_2^4 \rightarrow \mathbb {F}\) is nonlinear invariant for S;

$$\begin{aligned} g_S(x)=(x[3] \wedge x[2]) \oplus x[2] \oplus x[1] \oplus x[0]. \end{aligned}$$

Here, ShuffleCell does not affect the nonlinear invariant. Therefore, from Theorem 1, the following Boolean function

$$\begin{aligned} g(x) = \bigoplus _{j=0}^{15} g_S(s_i) \end{aligned}$$

is a nonlinear invariant for the round function of Midori64. Note, as for SCREAM   the Boolean function g is actually balanced.

5.3 Distinguishing Attack

As mentioned in Sect. 2, the simple distinguishing attack can be mounted against a weak key. Let \(\ell : \mathbb {F}_2^4 \rightarrow \mathbb {F}\) be a linear part of g, namely \(\ell (x) = x[2] \oplus x[1] \oplus x[0]\). We have \(g(p) \oplus g(c) = const\) and const is a linear part of the values injected to round function during the encryption process;

$$\begin{aligned} const&= \ell (WK) \oplus \ell (RK_0) \oplus \ell (RK_1) \oplus \cdots \oplus \ell (RK_{14}) \oplus \ell (WK),\\&= \ell (RK_0) \oplus \ell (RK_1) \oplus \cdots \oplus \ell (RK_{14}). \end{aligned}$$

Given \(RK_i = K_{i \text { mod } 2} \oplus \alpha _i\), the above equation is further converted as

$$\begin{aligned} const&= \ell (K_1) \oplus \ell (\alpha _0) \oplus \ell (\alpha _1) \oplus \cdots \oplus \ell (\alpha _{14}). \end{aligned}$$

As \(\alpha _i[2]=\alpha _i[1]=0\) for any i, it can be simply written as

$$\begin{aligned} const&= \ell (K_1) \oplus \bigoplus _{i=0}^{14} \bigoplus _{j=0}^{15}\alpha _{i,j}, \end{aligned}$$

where \(\alpha _{i,j}\) is the jth nibble of \(\alpha _i\). We confirmed that the total number of 1 in all \(\alpha _i\) is even, thus \(\bigoplus _{i=0}^{14} \bigoplus _{j=0}^{15}\alpha _{i,j} = 0\). In the end, \(g(p) \oplus g(c) = \ell (K_1)\) always holds for Midori64, while this holds with probability 1 / 2 for a random permutation.

5.4 Experimental Results

As mentioned in Sect. 2, the above property can reveal 32 bits (the two most significant bits from each nibble) of an unknown plaintext block in the weak-key setting when Midori64 is used in well-known block cipher modes.

We implemented our ciphertext-only message recovery attack for Midori64 in the CBC mode. In our experiment, the key and IV are chosen uniformly at random from the weak-key space and the entire IV space. We also choose a 64-bit plaintext block p, uniformly at random, and assume that p is iterated over b blocks, where \(33 \le b \le 43\). We executed our attack of repeating 1000 times, and Table 4 summarizes the success probability of recovering the correct 32 bits.

Table 4. The success probability of recovering the correct 32 bits on Midori64-CBC.

Similarly to the case of SCREAM the system of equations behaves very much like a random system of equation in the sense that the probability that it has full rank is very close to the corresponding probability for a random system with the same dimensions.

6 Extensions and Future Work

In this section we outline some extensions to the previously described attacks. Furthermore, we give some additional insights in the structure of nonlinear invariants in general. Finally, we explain how invariant subspace attacks can be seen as a special, chosen plaintext, variant of nonlinear invariant attacks. It is important to point out that none of the observations in this section lead to any attacks. But we feel that those explanations provide good starting points for future investigations.

More General Nonlinear Invariant. We continue to use the notation that we fixed in Sect. 3. First recall Proposition 1, that allowed to construct nonlinear invariants for the whole S-box layer by linearly combining nonlinear invariants for each single S-box. This proposition can actually be easily extended. Instead of only linearly combining the nonlinear invariants for each S-box, any combination by using an arbitrary Boolean function results in an invariant for the whole S-box layer as well. The following proposition summarizes this observation.

Proposition 2

Given any Boolean function \(f:\mathbb {F}_2^t \rightarrow \mathbb {F}_2\) and t elements

$$\begin{aligned} g_1,\dots ,g_t :\mathbb {F}_2^n \rightarrow \mathbb {F}_2 \end{aligned}$$

from U(S) it holds that

is an element of \(U(\mathcal{S})\)

Note that the special case of f being linear actually corresponds to the choice made in Proposition 1.

While this generalization potentially allows a much larger variety of invariants, and therefore potential attacks, we like to mention that the restriction made in Proposition 1 has two crucial advantages. First, the choice is small enough, so that it can be handled exhaustively and second, the invariants generated by Proposition 1 are usually balanced, while this is not necessarily the case for the generalization.

At first sight, one might be tempted to assume that the above construction actually covers all invariants for the S-box layer. However, in general, this is not the case.

One counter-example, that is a nonlinear invariant not covered by this construction, can easily be identified as follows: For simplicity, consider an S-box layer consisting of two identical n bit S-boxes only. If the two inputs to those two S-boxes are equal, so are the outputs. Thus, the function

$$\begin{aligned} g: \mathbb {F}_2^n \times \mathbb {F}_2^n\rightarrow & {} \mathbb {F}_2 \\ g(x,y)= & {} \left\{ \begin{array}{ccc} 1 &{} \text{ if } &{} x=y \\ 0 &{} \text{ else } \end{array}\right. \end{aligned}$$

is an nonlinear invariant of the S-box layer as

$$\begin{aligned} g(x,y)=1 \Leftrightarrow x=y \Leftrightarrow S(x)=S(y) \Leftrightarrow g(S(x),S(y))=1 . \end{aligned}$$

Moreover, this nonlinear invariant can certainly not be generated by Proposition 2.

Cycle Structure. Actually, there is a nice, and potentially applicable way, of describing all nonlinear invariants for a given permutation F by considering its cycles. Recall that a cycle of F being a set

$$\begin{aligned} C_x:=\{ F^i(x) \ | \ i \in \mathbb {N} \} \end{aligned}$$

for a value \(x\in \mathbb {F}_2^n\). Actually, one can show that a mapping g is contained in U(F) if and only if g is either constant on all cycles of F or alternating along the cycles of F. The later case corresponds to nonlinear invariants such that

$$\begin{aligned} g(x)+g(F(x))=1. \end{aligned}$$

This is because \(g(x)=g(F(x))\) implies

$$\begin{aligned} g(x)=g(F(x))=g(F(F(x)))=\dots =g(F^i(x)) . \end{aligned}$$

Thus, looking at the cycle structure of F, we can assign to each cycle one value the function g should evaluate to on this cycle. That view point also shows that the number of invariant functions g is equal to

$$\begin{aligned} |U(F)| =2^{(\mathrm{\#~cycles~of~}F)}, \end{aligned}$$

in the case where there exist at least one cycle of odd length or

$$\begin{aligned} |U(F)| =2^{(\mathrm{\#~cycles~of~} F)+1}, \end{aligned}$$

in the case where all cycles of F have even length. This perspective allows to actually compute a basis of U(F) very efficiently. Consider, for simplicity, the case were not all cycles are of even length. Then, a basis of U(F) clearly consists of the set of all indicator functions of \(C_x\), i.e.

$$\begin{aligned} U(F)=\text{ span }\{ \delta _{C_a} \ | \ a \in \mathbb {F}_2^n \}. \end{aligned}$$

Here, for a subset \(A\subseteq \mathbb {F}_2^n\), the function \(\delta _{A}\) denotes the indicator function of the set A, i.e.

$$\begin{aligned} \delta _{A}(x) = \left\{ \begin{array}{ccc} 1 &{} \text{ if } &{} x\in A \\ 0 &{} \text{ else } \end{array}\right. \end{aligned}$$

Example 2

Consider the function \(F:\mathbb {F}_2^2 \rightarrow \mathbb {F}_2^2\) with

The cycle composition of F is

$$\begin{aligned} (0, 1, 2)(3) . \end{aligned}$$

Thus we have two cycles of odd length. Following the above, any nonlinear invariant of F is constant on those cycles. In this case we have the following invariants

$$\begin{aligned} g_1(x)= & {} \delta _{\{0,1,2\}}(x) \\ g_1(x)= & {} \delta _{\{3\}}(x) \end{aligned}$$

or, more explicitly

and

together with the trivial invariants, that is the identical zero or identical one functions. So in total F has 4 invariants.    \(\square \)

Relation to Invariant Subspace Attack. Along the same lines, one can also see the invariant subspace attack as a special case of a nonlinear invariant. Recall that a subspace \(V\subseteq \mathbb {F}_2^n\) is called invariant under (a block cipher) F if

$$\begin{aligned} F(V)=V. \end{aligned}$$

That is, the set V is mapped to itself by the function F. Note that the complement \(\bar{V}\) is also mapped to itself because the function F is permutation. This means nothing else than that the nonlinear Boolean function \(\delta _V(x)\) is a nonlinear invariant for F as

$$\begin{aligned} \delta _V(x)=1 \Leftrightarrow x \in V \Leftrightarrow F(x) \in V \Leftrightarrow \delta _V(F(x))=1,\\ \delta _V(x)=0 \Leftrightarrow x \in \bar{V} \Leftrightarrow F(x) \in \bar{V} \Leftrightarrow \delta _V(F(x))=0. \end{aligned}$$

In other words, invariant subspace attacks are nonlinear invariant attacks where the support of the nonlinear invariant is a subspace of \(\mathbb {F}_2^n\). And as such, nonlinear invariant attacks could be called invariant set attacks, as the function g splits in the inputs into two sets, the support of g and its complement, that are invariant under F.

Further Research. Other interesting directions for further research include the generalization of the nonlinear invariant to the case where one does not consider the same function g in every round, but rather a sequence of functions that can be chained together. In fact, we also found quadratic Boolean function \(g^{\prime }:\mathbb {F}_2^4 \rightarrow \mathbb {F}_2\) such that \(g(x)=g^{\prime }(S(x))\) for Midori64. Owing to the involution property of the S-box, \(g(x)=g^{\prime }(S(x))\) always implies \(g^{\prime }(x)=g(S(x))\). Combining with the alternative use of \(K_0\) and \(K_1\) in the key schedule, such \(g,g^{\prime }\) may be exploited in the attack. Unfortunately, since such Boolean functions are not nonlinear invariant for the constant addition in Midori64, we cannot exploit them in real cryptanalysis. However, it is clearly worth discussing this extension. And last but not least, even so it seems notoriously difficult, it would be nice to be able to use a statistical variant of the attack described here, i.e. consider nonlinear functions such that \(g(F(x))=g(x)\) for many – but not necessarily for all – inputs x.