Match Box Meet-in-the-Middle Attack Against KATAN
Abstract
Recent years have seen considerable interest in lightweight cryptography. One particular consequence is a renewed study of meet-in-the-middle attacks, which aim to exploit the relatively simple key schedules often encountered in lightweight ciphers. In this paper we propose a new technique to extend the number of rounds covered by a meet-in-the-middle attack, called a match box. Furthermore, we demonstrate the use of this technique on the lightweight cipher KATAN, and obtain the best attack to date on all versions of KATAN. Specifically, we are able to attack 153 of the 254 rounds of KATAN32 with low data requirements, improving on the previous best attack on 115 rounds which requires the entire codebook.
Keywords
Cryptanalysis, Meet-in-the-middle, Biclique, Match box, KATAN
1 Introduction
Over the past few years, ultra-lightweight embedded systems such as RFID tags and sensor nodes have become increasingly common. Many such devices require cryptography, typically for authentication purposes. However, traditional ciphers such as AES were not primarily designed for use in this context. Highly constrained devices allow only a very small hardware footprint; on the other hand, they typically do not require a security level as high as that offered by AES.
To cater for this need, a number of lightweight ciphers have been developed, such as PRESENT [5], KATAN [7], LED [9], or Simon [2]. These ciphers aim to offer a tradeoff between security and the constraints of embedded systems. This is often achieved by innovative designs that look to push the boundaries of traditional ciphers. The security of these new designs needs to be carefully assessed; in this process, new cryptanalytic techniques have emerged.
In particular, there has been a resurgence in the study of meet-in-the-middle attacks in the context of block ciphers [6, 10]. This type of attack requires a fairly simple key schedule, and is rarely applicable to traditional ciphers. However, many lightweight ciphers rely on simple round functions and key schedules, which are compensated by a high number of rounds. This makes them good targets for meet-in-the-middle attacks.
Our Contribution. In this paper, we propose a new way to extend meet-in-the-middle attacks, which we call a match box. This technique may be seen as a form of sieve-in-the-middle [8] or three-subset meet-in-the-middle attack [6], in that it extends the rounds covered in the middle section of the attack. It does so by relying on a large precomputed lookup table with a special structure. As such, it is also a form of time/memory tradeoff.
We demonstrate this technique on the lightweight block cipher KATAN. As a result, we improve on previous results on all three versions of KATAN, both in terms of number of rounds and data requirements. Of independent interest is our construction of bicliques on KATAN, which takes full advantage of the linearity of the key schedule, and improves on previous attacks with negligible memory requirements.
Related Work. Previous results on KATAN include a conditional differential analysis by Knellwolf, Meier and Naya-Plasencia [11, 12] and a differential cryptanalysis of 115 rounds of KATAN32 by Albrecht and Leander [1]. In [10], Isobe and Shibutani describe meet-in-the-middle attacks on reduced versions of all three variants of KATAN. The attack that reaches the highest number of rounds on all three versions is a multidimensional meet-in-the-middle attack by Zhu and Gong [15]. However, this attack may be regarded as an optimized exhaustive search, as it involves performing a partial encryption under every possible value of the key.
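Summary of results (KP: known plaintexts, CP: chosen plaintexts).

| Cipher | Model | Data | Memory | Time | Rounds | Reference |
|---|---|---|---|---|---|---|
| KATAN32 | CP | \(2^{22}\) | - | \(2^{22}\) | 78 | [11] |
| KATAN32 | KP | 138 | \(2^{75}\) | \(2^{77}\) | 110 | [10] |
| KATAN32 | CP | \(2^{32}\) | - | \(2^{79}\) | 115 | [1] |
| KATAN32 | KP | 3 | \(2^{79.58}\) | \(2^{79.30}\) | 175 | [15] |
| KATAN32 | KP | 4 | \(2^{5}\) | \(2^{77.5}\) | 121 | Sect. 4.3 |
| KATAN32 | CP | \(2^{7}\) | \(2^{5}\) | \(2^{77.5}\) | 131 | Sect. 4.5 |
| KATAN32 | CP | \(2^{5}\) | \(2^{76}\) | \(2^{78.5}\) | 153 | Sect. 4.7 |
| KATAN48 | CP | \(2^{34}\) | - | \(2^{34}\) | 70 | [11] |
| KATAN48 | KP | 128 | \(2^{78}\) | \(2^{78}\) | 100 | [10] |
| KATAN48 | KP | 2 | \(2^{79.00}\) | \(2^{79.45}\) | 130 | [15] |
| KATAN48 | KP | 4 | \(2^{5}\) | \(2^{77.5}\) | 110 | Sect. 4.3 |
| KATAN48 | CP | \(2^{6}\) | \(2^{5}\) | \(2^{77.5}\) | 114 | Sect. 4.5 |
| KATAN48 | CP | \(2^{5}\) | \(2^{76}\) | \(2^{78.5}\) | 129 | Sect. 4.7 |
| KATAN64 | CP | \(2^{35}\) | - | \(2^{35}\) | 68 | [11] |
| KATAN64 | KP | 116 | \(2^{77.5}\) | \(2^{77.5}\) | 94 | [10] |
| KATAN64 | KP | 2 | \(2^{79.00}\) | \(2^{79.45}\) | 112 | [15] |
| KATAN64 | KP | 4 | \(2^{5}\) | \(2^{77.5}\) | 102 | Sect. 4.3 |
| KATAN64 | CP | \(2^{7}\) | \(2^{5}\) | \(2^{77.5}\) | 107 | Sect. 4.5 |
| KATAN64 | CP | \(2^{5}\) | \(2^{74}\) | \(2^{78.5}\) | 119 | Sect. 4.7 |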
2 Meet-in-the-Middle Attacks
2.1 Meet-in-the-Middle Framework
For each partial key \(k_{\cap} \in K_{\cap} = K_1 \cap K_2\):

* For each partial key \(k_1 \in K_1\) extending \(k_{\cap}\), \(\mathbf{v}\) is computed. For each possible value of \(\mathbf{v}\), the \(k_1\)’s leading to that value are stored in a table.

* For each partial key \(k_2 \in K_2\) extending \(k_{\cap}\), \(\mathbf{v}\) is computed. The \(k_1\)’s leading to this same \(\mathbf{v}\) are retrieved from the previous table. Each \(k_2\) merged with each \(k_1\) leading to the same \(\mathbf{v}\) provides a candidate master key.

The actual encryption key is necessarily among the candidate keys. Indeed, for the actual key, encryption from the plaintext and decryption from the ciphertext are mirrors of each other, and agree on the intermediate value \(\mathbf{v}\). If we denote by \(|\mathbf{v}|\) the size of \(\mathbf{v}\) in bits, candidate keys form a proportion \(2^{-|\mathbf{v}|}\) of the total key space.
In order to compute the actual encryption key, it remains to test candidate keys against enough plaintext/ciphertext pairs to ensure only one key remains. Each plaintext/ciphertext pair divides the number of candidate keys by \(2^{B}\), where \(B\) denotes the block size. Thus, in order to have only one key left, \(\lceil K/B\rceil\) pairs are necessary on average, where \(K\) denotes the key size.
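The two stages above can be sketched on a toy cipher. Everything in the following block is an illustrative assumption rather than part of KATAN or of the attack in this paper: a 16-bit block, two independent 8-bit key halves (so \(K_{\cap}\) is empty), and a meeting value \(\mathbf{v}\) made of the low 8 bits of the middle state.

```python
from collections import defaultdict

MASK = 0xFFFF    # 16-bit toy block
V_MASK = 0x00FF  # meeting value v: low 8 bits of the middle state, so |v| = 8
CONST = 0x9E37   # arbitrary constant, chosen for illustration only

def rotl(x, r):
    return ((x << r) | (x >> (16 - r))) & MASK

def rotr(x, r):
    return ((x >> r) | (x << (16 - r))) & MASK

def half_enc(x, k):
    """Toy keyed half-cipher with an 8-bit key; invertible on 16-bit words."""
    return rotl((x + 0x0101 * k) & MASK, 5) ^ CONST

def half_dec(y, k):
    """Inverse of half_enc under the same key."""
    return (rotr(y ^ CONST, 5) - 0x0101 * k) & MASK

def encrypt(p, k1, k2):
    return half_enc(half_enc(p, k1), k2)

def filter_keys(p, c):
    """Key filtering stage: all (k1, k2) that agree on v for the pair (p, c)."""
    table = defaultdict(list)
    for k1 in range(256):                        # guesses from the left
        table[half_enc(p, k1) & V_MASK].append(k1)
    candidates = []
    for k2 in range(256):                        # guesses from the right
        v = half_dec(c, k2) & V_MASK
        candidates.extend((k1, k2) for k1 in table.get(v, ()))
    return candidates

def recover_keys(pairs):
    """Verification stage: keep the candidates consistent with every pair."""
    p0, c0 = pairs[0]
    return [k for k in filter_keys(p0, c0)
            if all(encrypt(p, *k) == c for p, c in pairs)]
```

With a 16-bit key and \(|\mathbf{v}| = 8\), roughly \(2^{16-8}\) candidates are expected to survive the filtering stage; the verification stage then discards the wrong ones.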
Simultaneous Matching. As we have seen, overall, meet-in-the-middle attacks proceed in two stages: a key filtering stage that produces key candidates, followed by a verification stage that tests the key candidates against a few plaintext/ciphertext pairs. This division in two stages is reflected in the complexity of the attack. The complexity of the first stage is determined mostly by the sizes of \(K_1\) and \(K_2\); the complexity of the second stage depends only on the size of \(\mathbf{v}\) (for a fixed cipher).
Directly tweaking the size of \(\mathbf{v}\) is one way to try and evenly spread the load between the two stages. However, increasing \(|\mathbf{v}|\) will often disproportionately impact the sizes of \(K_1\) and \(K_2\). Simultaneous matching provides a very efficient alternative way of increasing the size of \(\mathbf{v}\). The idea is to use \(n\) plaintext/ciphertext pairs instead of just one. For each guess of \(K_1\) and \(K_2\), we concatenate the \(\mathbf{v}\)’s produced by each pair in order to have a larger global \(\mathbf{v}\), and use that for matching, as before.
In other words, we are performing a standard meet-in-the-middle attack, but on a cipher formed by \(n\) parallel applications of the basic cipher. This increases the complexity of the first stage only linearly, while exponentially decreasing the complexity of the second stage.
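The linear-versus-exponential trade can be made concrete with a crude cost model. The function below is a hypothetical helper, not taken from the paper: it counts partial computations in the first stage and surviving candidates in the second, and ignores the fact that a partial computation costs less than a full encryption.

```python
import math

def stage_costs(key_bits, k1_bits, k2_bits, v_bits, n_pairs):
    """Return (log2 of first-stage partial computations,
               log2 of candidates left for the second stage)."""
    # First stage: every guess of K_1 and of K_2 is evaluated once per pair.
    stage1 = math.log2(n_pairs * (2 ** k1_bits + 2 ** k2_bits))
    # Second stage: each pair contributes v_bits of filtering; at least the
    # correct key always survives, hence the floor at 0.
    stage2 = max(key_bits - n_pairs * v_bits, 0)
    return stage1, stage2

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, stage_costs(key_bits=16, k1_bits=8, k2_bits=8, v_bits=4, n_pairs=n))
```

In this toy setting, going from one pair to three multiplies the first-stage work by 3 while dividing the number of candidates by \(2^{8}\).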
Indirect Matching. With the newfound interest in meet-in-the-middle attacks occasioned by lightweight ciphers, a number of techniques originally developed for the cryptanalysis of hash functions have been adapted to meet-in-the-middle attacks on block ciphers. A short survey of these techniques has already been presented in, for example, [14], and is out of the scope of this article. Still, we briefly mention one of these techniques, namely indirect matching, as we will use it later on KATAN. We also generalize this technique slightly.
In a regular meet-in-the-middle attack, some value \(\mathbf{v}\) of the internal state is computed from the left as \(e(k_1)\) and from the right as \(d(k_2)\), where \(e\) and \(d\) are essentially a partial encryption and decryption. Keys are filtered by checking \(e(k_1) = d(k_2)\). Now assume some key bit \(k\) in \(k_1\) only has a linear impact on the value of \(e(k_1)\), i.e. \(e(k_1) = e'(k'_1) \oplus k\), where \(k'_1\) is \(k_1\) minus the knowledge of \(k\). Then if knowledge of \(k\) is included in \(K_2\), the equality in the middle \(e(k_1) = d(k_2)\) may be rewritten as \(e'(k'_1) = d(k_2) \oplus k = d'(k_2)\). In this way, guessing \(k\) is no longer necessary in the encryption direction, and the associated complexity decreases accordingly.
Here, we assumed that \(k\) is included in \(K_2\), i.e. \(k\) is in \(K_{\cap}\) since it is already in \(K_1\). But we can get the same benefit even if \(k\) is in \(K_1 \setminus K_{\cap}\): the only real requirement is that it linearly impacts \(e(k_1)\). To show this, the proof is a little more elaborate than in the previous case. Assume that \(k\) is in \(K_1 \setminus K_{\cap}\), and write \(e(k_1) = e'(k'_1) \oplus k\) as before.
Up to now, \(k_1\) together with \(k_2\) was assumed to contain knowledge of the entire key. We guessed \(k_1\) from the left, then \(k_2\) from the right and matched compatible guesses by checking \(e(k_1) = d(k_2)\). Instead, we are now going to guess \(k'_1\) from the left and \(k_2\) from the right, so the combination of the two does not encompass the entire key (\(k\) is missing). Furthermore, all guesses of \(k'_1\) and \(k_2\) are compatible. However, for each pair of guesses, we set \(k = e'(k'_1) \oplus d(k_2)\), and the combination of \(k'_1\), \(k_2\) and \(k\) gives us one candidate master key.
Thus, the number of candidate master keys is unchanged. However, we need not guess \(k\) from the left, and the complexity of guessing \(k_1\) is reduced accordingly. Thus the benefit is exactly the same as in the case where \(k\) belonged to \(K_{\cap}\). Note that we remain compatible with simultaneous matching: if we use several plaintext/ciphertext pairs, they all must agree on \(k\), which yields the usual filter on the candidate master keys.
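The equivalence argued above can be checked exhaustively on a toy instance. In the sketch below, `e_prime` and `d` are made-up one-bit functions standing in for the partial encryption and decryption (the extra argument `t` plays the role of the plaintext/ciphertext pair); only the relation \(e(k_1) = e'(k'_1) \oplus k\) is taken from the text.

```python
from itertools import product

def e_prime(k1p, t):
    """Hypothetical nonlinear part of the forward computation (one output bit)."""
    x = (k1p + t) & 7
    return (((x >> 2) & (x >> 1)) ^ x) & 1

def d(k2, t):
    """Hypothetical backward computation (one output bit)."""
    y = (3 * k2 + t) & 7
    return (((y >> 1) | y) ^ (y >> 2)) & 1

def direct(pairs):
    """Guess k explicitly and keep (k1', k, k2) when e'(k1') xor k == d(k2) on every pair."""
    return {(k1p, k, k2)
            for k1p, k, k2 in product(range(8), (0, 1), range(8))
            if all(e_prime(k1p, t) ^ k == d(k2, t) for t in pairs)}

def indirect(pairs):
    """Never guess k: derive it as e'(k1') xor d(k2) and require all pairs to agree."""
    candidates = set()
    for k1p, k2 in product(range(8), repeat=2):
        ks = {e_prime(k1p, t) ^ d(k2, t) for t in pairs}
        if len(ks) == 1:
            candidates.add((k1p, ks.pop(), k2))
    return candidates
```

Both functions return the same candidate set, but `indirect` enumerates half as many guesses on the left-hand side.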
3 Match Box
We now introduce the match box technique. This technique fits within the general sieve-in-the-middle framework introduced in [8], which we recall here.
3.1 Sieve-in-the-Middle
Let us still denote by \(K_{\cap}\) the information on the key common to \(K_1\) and \(K_2\); furthermore, let \(K'_1\) (resp. \(K'_2\)) be the proper part of \(K_1\) (resp. \(K_2\)), i.e. the part not already in \(K_{\cap}\). In a standard meet-in-the-middle attack, a few bits of internal state \(\mathbf{l}\) are computed from the left by guessing \(k_1 \in K_1\), then the same bits \(\mathbf{r}\) are computed from the right by guessing \(k_2 \in K_2\). Valid key candidates are determined by checking \(\mathbf{l} = \mathbf{r}\).
When applying this idea with a general compatibility relation \(\mathcal{R}(\mathbf{l},\mathbf{r})\) in place of the equality \(\mathbf{l} = \mathbf{r}\), the following problem arises. Once \(k_{\cap} \in K_{\cap}\) has been guessed, the natural way to proceed would be to compute \(\mathbf{l}\) and \(\mathbf{r}\) for each \(k'_1 \in K'_1\) and \(k'_2 \in K'_2\) respectively, and exhaustively test \(\mathcal{R}(\mathbf{l},\mathbf{r})\) for every pair \((k'_1,k'_2)\). However, this would amount to a brute force search since \(K_{\cap} \times K'_1 \times K'_2\) is in fact the entire key. It should be noted that there is no completely general solution to this problem, since \(\mathcal{R}\) does need to be tested for every pair \((\mathbf{l},\mathbf{r})\) yielded by every \((k'_1,k'_2)\).
In the sieve-in-the-middle paper, this issue is solved by using merging algorithms originally introduced in [13]. These algorithms tend to assume, roughly, that the size of \(\mathbf{l}\) is less than the size of \(K'_1\) (divided by a sieving factor). We refer the reader to [8] for a complete explanation of merging techniques. What we propose is a different way of matching \(\mathbf{l}\) and \(\mathbf{r}\) while avoiding exhaustive search, which we call a match box.
3.2 Match Box
Consider the situation depicted on Fig. 3. Here, \(\mathbf{l}\) contains some partial information \(\mathbf{l'}\) about the internal state entering an S-box. At the output of this S-box, some round key is added, and \(\mathbf{r}\) contains the entire state after the key addition. Now assume that the round key may be decomposed as a sum of some \(f_1(k'_1)\) depending on \(k'_1 \in K'_1\), and some \(f_2(k_2)\) depending on \(k_2 \in K_2\). Note that this is automatically true if the key schedule is linear. Since \(K_2\) is known when computing from the right, the component \(f_2(k_2)\) may be directly added into \(\mathbf{r}\).
So in this situation, \(\mathbf{l} = (\mathbf{l'},k'_1)\) and \(\mathbf{r}\) are compatible iff \(S^{-1}(\mathbf{r} \oplus f_1(k'_1))\) equals \(\mathbf{l'}\) (wherever \(\mathbf{l'}\) is defined). If \(|\mathbf{r}|\) is larger than \(|k'_2|\), then since \(k'_1\) is included in \(\mathbf{l}\), \(|\mathbf{l}|\) is also larger than \(|k'_1|\), and a merging technique in the style of [8] cannot apply. However, a match box is possible.
For each \(k_{\cap} \in K_{\cap}\):

* For each \(k'_1 \in K'_1\), \(\mathbf{l'}\) is computed. This yields a function \(f:K'_1 \rightarrow L'\), from which we obtain \(M(f)\).

* For each \(k'_2 \in K'_2\), \(\mathbf{r}\) is computed. Candidate master keys are those corresponding to the pairs \((k'_1,k'_2)\) for each \(k'_1\) in \(M(f)(\mathbf{r})\).
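A minimal sketch of such a table on toy parameters follows. It assumes that \(M(f)(\mathbf{r})\) is the set of \(k'_1\) for which \(S^{-1}(\mathbf{r} \oplus f_1(k'_1))\) agrees with \(f(k'_1)\), which is how the compatibility condition above reads; the 4-bit S-box, the choice of \(\mathbf{l'}\) as the low bit of the S-box input, the 2-bit \(k'_1\) and the linear map `f1` are all arbitrary illustrative choices.

```python
from itertools import product

# Arbitrary 4-bit permutation used as a toy S-box (illustrative only).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
SBOX_INV = [SBOX.index(y) for y in range(16)]

K1P_BITS = 2                    # toy size of k'_1
F1_COLUMNS = (0b0011, 0b0101)   # hypothetical linear map f_1: one column per bit of k'_1

def f1(k1p):
    """Linear contribution of k'_1 to the round key added after the S-box."""
    out = 0
    for bit, column in enumerate(F1_COLUMNS):
        if (k1p >> bit) & 1:
            out ^= column
    return out

def build_match_box():
    """Map every (f, r) to the list of k'_1 compatible with r.

    f is a tuple giving the 1-bit value l' computed from the left for each k'_1;
    r is the 4-bit state after key addition, with f_2(k_2) already removed.
    """
    n_keys = 1 << K1P_BITS
    box = {}
    for f in product((0, 1), repeat=n_keys):
        for r in range(16):
            box[f, r] = [k for k in range(n_keys)
                         if (SBOX_INV[r ^ f1(k)] & 1) == f[k]]
    return box
```

The table has \(2^{|\mathbf{l'}| \cdot 2^{|k'_1|}} \cdot 2^{|\mathbf{r}|}\) entries (here \(2^{4} \cdot 2^{4}\)) and does not depend on any plaintext or ciphertext, so in this toy setting it is built once and then only queried inside the loops over \(k_{\cap}\) and \(k'_2\).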

3.3 Compressing \(\mathcal {R}\)
In this situation, each \(f_i\) is a boolean function of \(k'_1\), so it may be written as a polynomial in the bits of \(k'_1\). As such, each \(f_i\) can be fully expressed by no more than \(2^{|k'_1|}\) coefficients \((f_i^n)_{n < 2^{|k'_1|}}\). This is beneficial as long as \(|\mathbf{l'}| \cdot 2^{|k'_1|} < |\mathbf{r}|\), i.e. there are fewer \(f^n_i\)’s than bits of \(\mathbf{r}\). In this manner, \(\mathbf{r}\) is effectively shortened to \(|\mathbf{l'}| \cdot 2^{|k'_1|}\) bits.
The only limit is the size and complexity necessary to build the table converting \(\mathbf{r}\) into the \(f_i^n\)’s. Note that in general, \(\mathbf{r}\) is more or less a set of internal state bits, with potentially some partial keys added in; so computing \(\mathbf{r}\) and the \(f_i\)’s is akin to a partial encryption. In that case, for a given \(\mathbf{r}\), the \(f^n_i\)’s can be indirectly computed by evaluating the \(f_i\)’s for all values of \(k'_1\). In this way, for each \(\mathbf{r}\) and each \(i\), the value of the \(f^n_i\)’s can be computed in at most \(2^{|k'_1|}\) encryption equivalents.
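One standard way to go from the \(2^{|k'_1|}\) evaluations of a boolean function to its polynomial coefficients is the binary Möbius transform. The sketch below uses it as an illustration; the paper does not state which procedure is used, and the function names are hypothetical.

```python
def anf_coefficients(truth_table):
    """Binary Moebius transform.

    truth_table[x] is the value of the function at input x (bits of x are the
    bits of k'_1); the result holds at index n the coefficient of the monomial
    made of the variables whose bits are set in n.
    """
    coeffs = list(truth_table)
    size = len(coeffs)          # must be a power of two
    step = 1
    while step < size:
        for i in range(size):
            if i & step:
                coeffs[i] ^= coeffs[i ^ step]
        step <<= 1
    return coeffs

def evaluate_anf(coeffs, x):
    """Evaluate the polynomial at x: XOR the coefficients of all monomials contained in x."""
    value = 0
    for n, c in enumerate(coeffs):
        if n & x == n:
            value ^= c
    return value
```

For three key bits, the eight values returned by `anf_coefficients` play the role of the \(f^n\)’s: they determine the function on all of \(K'_1\), whatever the length of the \(\mathbf{r}\) they were derived from.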
4 Application to KATAN
KATAN is an ultra-lightweight block cipher presented by Christophe de Cannière, Orr Dunkelman and Miroslav Knežević at CHES 2009 [7]. Its design is inspired by the stream cipher Trivium, and relies on two nonlinear feedback registers. This is unusual for a block cipher, and makes the cryptanalysis of KATAN especially interesting, since it indirectly evaluates the strength of this type of design.
In [7] the authors describe two families of block ciphers, KATAN and KTANTAN, which only differ in their key schedule. In KATAN, the key is stored in a register, while in KTANTAN, it is hard-coded into the circuit. The tradeoff is that while the key cannot be modified, the circuit area is significantly reduced by avoiding the need for a register dedicated to the storage of the key. However, KTANTAN has been broken [6, 14], mostly due to weaknesses in its key schedule. Hereafter we focus solely on KATAN.
4.1 Description of KATAN
KATAN is a family of three block ciphers with block sizes 32, 48, and 64 bits, denoted by KATAN32, KATAN48, and KATAN64 respectively. In all cases the key size is 80 bits, and the total number of rounds is 254. We begin by giving a brief description of KATAN32. KATAN48 and KATAN64 are very similar, as we shall see. We refer the reader to [7] for more details about the design of KATAN.
4.2 Linear Key Partition
We now introduce a few notions that will prove useful to mount a meet-in-the-middle attack against KATAN. Let \(RK_1\) (resp. \(RK_2\)) denote the set of round keys necessary to compute some fixed bits of internal state at an intermediate round from the left (resp. from the right). The first step of a meet-in-the-middle attack is to guess the bits of information on the master key common to \(RK_1\) and \(RK_2\) (see Sect. 2.1). Hence it is necessary to define an intersection of \(RK_1\) and \(RK_2\) in terms of bits of information on the master key.
In general, this intersection may be impossible to define. In [10], a generic solution is proposed: all round keys are regarded as independent, i.e. the master key is redefined as the union of all round keys. This yields good results on various lightweight ciphers, including KATAN. However, it has a significant impact on the attack complexity. This can be avoided when the key schedule is linear: indeed, in that case, the intersection of \(RK_1\) and \(RK_2\) can be cleanly defined, as we now show for KATAN.
Let us regard a master key of KATAN as a vector in \(E = (\mathbb {Z}/2\mathbb {Z})^{80}\). The value of the master key corresponds to the coordinates of this vector along the canonical basis. Each round key is a linear combination of bits of the master key; that is, it is the image of the master key through some map \((x_i) \mapsto \sum \lambda _i x_i\), i.e. a linear functional on \(E\). Let us denote by \(\mathcal {L}(E)\) the space of linear functionals on \(E\).
From this standpoint, the information carried by \(RK_1\) (resp. \(RK_2\)) is the value of the master key on the subspace \(E_{K_1}\) (resp. \(E_{K_2}\)) of \(\mathcal {L}(E)\) generated by the round keys of \(RK_1\) (resp. \(RK_2\)). Let \(E_{K_{\cap }} = E_{K_1} \cap E_{K_2}\). Then the bits of information on the master key common to \(RK_1\) and \(RK_2\) are exactly the value of the key on the functionals of \(E_{K_{\cap }}\).
Let us choose an arbitrary basis \(B_{\cap}\) of \(E_{K_{\cap}}\), and extend it to a basis \(B_1\) of \(E_{K_1}\), and \(B_2\) of \(E_{K_2}\). Then in concrete terms a partial key in \(K_{\cap}\) is a mapping \(B_{\cap} \rightarrow \{0,1\}\); likewise, \(K_1\) and \(K_2\) are regarded as the set of mappings \(B_1 \rightarrow \{0,1\}\) and \(B_2 \rightarrow \{0,1\}\) respectively. We are now able to apply the meet-in-the-middle attack framework exactly as it was presented in Sect. 2.1.
In the remainder, it will always be assumed that \(B = B_1 \cup B_2\) is a basis for the whole space \(\mathcal {L}(E)\). In particular, knowledge of the value of a key on \(B\) amounts to knowing the entire key; it will be convenient at times to identify the key space with \(\{0,1\}^B\), which we will denote by \(K\), by analogy with \(K_1\) and \(K_2\).
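The dimension of \(E_{K_{\cap}}\) can be computed with plain linear algebra over \(\mathbb{Z}/2\mathbb{Z}\). The following sketch represents each linear functional as an integer bitmask over the master key bits. The 8-bit key and the recurrence in `round_key_masks` are a made-up linear key schedule for illustration only; they are not KATAN's, whose key is 80 bits long.

```python
def round_key_masks(n_round_keys, key_bits=8, offsets=(8, 5)):
    """Toy linear key schedule: rk[i] = key bit i for i < key_bits,
    otherwise the XOR of rk[i - o] over the given offsets (hypothetical recurrence)."""
    masks = [1 << i for i in range(key_bits)]
    while len(masks) < n_round_keys:
        i = len(masks)
        mask = 0
        for o in offsets:
            mask ^= masks[i - o]
        masks.append(mask)
    return masks[:n_round_keys]

def rank_gf2(vectors):
    """Rank of a family of bitmask vectors over GF(2), by incremental reduction."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)   # clears the leading bit of b in v if it is set
        if v:
            basis.append(v)
    return len(basis)

def dim_intersection(rk1, rk2):
    """dim(E1 ∩ E2) = dim E1 + dim E2 - dim(E1 + E2)."""
    return rank_gf2(rk1) + rank_gf2(rk2) - rank_gf2(list(rk1) + list(rk2))

if __name__ == "__main__":
    masks = round_key_masks(16)
    rk1, rk2 = masks[0:6], masks[9:15]   # round keys used from the left / from the right
    print(rank_gf2(rk1), rank_gf2(rk2), dim_intersection(rk1, rk2))
```

In this toy instance the two sets of round keys share no round key at all, yet they still have common information on the master key, which is exactly what treating all round keys as independent would miss.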
4.3 Key Dependencies
A first step towards building a meet-in-the-middle attack is to choose a value \(\mathbf{v}\) extracted from an internal state at an intermediate round to serve as a meeting point. In order to make this choice, it is necessary to evaluate which key bits are necessary to compute \(\mathbf{v}\) from the plaintext, and from the ciphertext (presumably for some reduced version of the cipher). We have carried out this computation using an algorithm similar to Algorithm 1 in [10].
The principle of such an algorithm is that once some round key enters the state, the impacted bit is marked as depending on that key. Then this dependency is propagated along the cipher each time this internal state bit affects other internal state bits. In our case, because we will use indirect matching, we keep track separately of key bits whose impact is linear, and those whose impact is nonlinear.
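The propagation principle can be sketched on a single toy shift register. The register length, the tap positions and the convention that round \(i\) injects round key \(i\) are hypothetical and do not reproduce KATAN's round function; the tracking is also a conservative over-approximation, since it never accounts for two linear contributions of the same key cancelling out.

```python
def track_dependencies(n_rounds, reg_len=8, xor_taps=(7, 4), and_taps=(5, 2)):
    """For each register position, return (linear_keys, nonlinear_keys).

    Toy feedback: new_bit = s[7] ^ s[4] ^ (s[5] & s[2]) ^ round_key, inserted
    at position 0 while the other bits shift towards higher positions.
    """
    empty = (frozenset(), frozenset())
    state = [empty] * reg_len
    for rnd in range(n_rounds):
        linear, nonlinear = {rnd}, set()      # the round key enters linearly
        for t in xor_taps:                    # XOR keeps each kind of dependency
            linear |= state[t][0]
            nonlinear |= state[t][1]
        for t in and_taps:                    # AND makes every dependency nonlinear
            nonlinear |= state[t][0] | state[t][1]
        linear -= nonlinear                   # a key counts as linear only if never nonlinear
        state = [(frozenset(linear), frozenset(nonlinear))] + state[:-1]
    return state
```

Keys that end up in the linear set of the meeting bit are the ones that indirect matching can leave unguessed.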
Key dependency of the bit at position 9 (middle) of register B.

Basic Meet-in-the-Middle Attack Against KATAN. With what we have so far, we can mount a first meet-in-the-middle attack against KATAN. While this is not the best attack we will propose, it is still worth mentioning because it has a simple description, requires only known plaintexts and minimal data requirements, and improves on previously published attacks.
4.4 Bicliques
The number of rounds covered by a meet-in-the-middle attack may be extended by a biclique. This technique was also originally developed for the cryptanalysis of hash functions [4], and first applied to block ciphers in [3] to produce an accelerated key search against AES. Such a search requires all possible keys to be tried, but each try costs significantly less than a full encryption.
However, bicliques may also be used in the context of a traditional attack, where not all keys are tried. This is the model known as “long bicliques” in [3], and corresponds to [4] for hash functions. We will use this approach against KATAN, and so we recall it here briefly.
Definition 1
For simplicity, assume \(a=0\), and we have a biclique covering rounds \(a\) to \(b\) as in the above definition. In order to construct an attack up to round \(r\), the remaining rounds from \(b\) to \(r\) must be covered by a meet-in-the-middle attack. Furthermore, the biclique and the meet-in-the-middle segments must be compatible in the following sense. Let \(C_i\) be the ciphertext corresponding to \(A_i\) after \(r\) encryption rounds, and let \(\mathbf{v}\) be the internal value used as a meeting point for the meet-in-the-middle attack. Let \(K_{i,*}\) denote the partial information on the key expressing the fact that it is one of the \(K_{i,j}\)’s, for fixed \(i\) and variable \(j\). Let \(K_{*,j}\) be defined in the same way.
Then the biclique and the meet-in-the-middle segments of the attack are compatible iff the middle value \(\mathbf{v}\) can be computed starting from \(B_j\) with only knowledge of \(K_{*,j}\), and from \(C_i\) with only knowledge of \(K_{i,*}\). The situation is illustrated on Fig. 5. This requirement is quite restrictive. However, it becomes easier to enforce if the key schedule is linear, as we shall see with KATAN.
For each partial key \(k_{\cap} \in K_{\cap} = \bigcup K_{i,*} \cap K_{*,j}\):

* For each \(j \le n\), \(\mathbf{v}\) is computed starting from \(B_j\) using \(K_{*,j}\). For each possible value of \(\mathbf{v}\), the \(j\)’s leading to that value are stored in a table.

* For each \(i \le n\), \(\mathbf{v}\) is computed starting from \(C_i\) using \(K_{i,*}\). The \(j\)’s leading to this same \(\mathbf{v}\) are retrieved from the previous table. For each pair \((i,j)\) leading to the same \(\mathbf{v}\), \(K_{i,j}\) is a candidate master key.

If the actual encryption key is among the \(K_{i,j}\)’s, then it is necessarily a candidate. Indeed, encryption by \(K_{i,j}\) will follow the path depicted on Fig. 5. As with a standard meet-in-the-middle attack, it remains to test candidate keys on a few additional plaintext/ciphertext pairs to single out the right key. This step is unchanged. Finally, to ensure that the actual key is among the \(K_{i,j}\)’s, the key space must be covered by bicliques, and the previous attack is repeated for each biclique.
Construction of a biclique varies depending on the cipher, but in general the construction cost is negligible with respect to the global complexity. Note that it is implicitly assumed that constructing a biclique on a set of keys \(K_{i,j}\) does not require each key to be computed individually, i.e. there is a structure. The overall complexity is then the same as that of the meet-in-the-middle segment if it were simply applied to a fixed plaintext/ciphertext pair.
4.5 Bicliques on KATAN
We have presented bicliques in the previous section. It remains to show how to construct bicliques on KATAN. Once again, it all comes down to the linearity of the key schedule, and the weak nonlinearity of the cipher reduced to a few rounds. In fact, these two properties make it possible to adjoin a biclique to any pre-existing, arbitrary meet-in-the-middle attack, in a compatible manner. Furthermore, a single biclique will suffice to cover the entire key space.
Assume we have a pre-existing meet-in-the-middle attack, with the notation of the previous sections. Recall that \(K_{\cap}\) (resp. \(K_1\), \(K_2\)) are regarded as maps \(B_{\cap} \rightarrow \{0,1\}\) (resp. \(B_1 \rightarrow \{0,1\}\), \(B_2 \rightarrow \{0,1\}\)). Let us denote by \(K'_1\) (resp. \(K'_2\)) the proper part of \(K_1\) (resp. \(K_2\)) with respect to \(K_{\cap}\), i.e. its restriction to \(B_1 \setminus B_{\cap}\) (resp. \(B_2 \setminus B_{\cap}\)).
Let us denote by \(\mathrm{{Enc}}_{a \rightarrow b}^{k}{M}\) and \(\mathrm{{Dec}}_{b \rightarrow a}^{k}{M}\) the encryption and decryption of a message \(M\) between rounds \(a\) and \(b\) with key \(k\). We extend this notation to the case where \(k\) is a partial key (i.e. an element of \(K_1\), \(K'_1\), \(K_2\), \(K'_2\), or \(K_{\cap }\)) by completing the key by 0 on the rest of \(B\). In addition, for \(k \in K\), let us write \(k_1 \in K_1\) for its restriction to \(B_1\), and define in a similar way \(k'_1\), \(k_2\), and \(k'_2\). Finally, let \(k(i)\) denote the value of the \(i\)th round key generated by \(k\); again, if \(k\) is only partially defined, it is completed by 0 on the rest of the basis \(B\).
Definition 2
Proposition 1
Proof. The proof is essentially the same for all three versions of KATAN. For the sake of simplicity, we only present the proof for KATAN32. It will be convenient to designate bits in each register by their position, in the order depicted on Fig. 4.
For \(i=0\), (6) yields the definition of \(A_{k_2}\), so it holds. Assume that it holds for some round \(i < 10\). When we step forward one encryption round, since we are dealing with shift registers, the equality remains true everywhere, except possibly on the two new bits entering the registers (at positions 0 and 19 on Fig. 4). Let us show for instance that the equality remains true for the bit entering register B (position 0). Let us denote by \(f\) the feedback function from register A into register B.
Assume for instance we are in the first case. Then in (8), bits 24 and 27 are equal for \(\mathrm{{Enc}}_{0 \rightarrow i}^{k_1}(0) \oplus \mathrm{{Dec}}_{10 \rightarrow i}^{k'_2}(0)\) and \(\mathrm{{Dec}}_{10 \rightarrow i}^{k'_2}(0)\), and null for the last term, thus the only nonlinear component of \(f\) has the same contribution on each side of the equation. On the rest \(f\) is linear, so we are done. \(\square \)
In essence, there is only one multiplication on register A of KATAN32, between bits 24 and 27. By restricting ourselves to \(13-(27-24) = 10\) rounds, where 13 is the length of register A, we ensure that there is no nonlinear interaction between the bits of \(B_{k_1}\) dependent on \(k_1\), and the bits of \(A_{k'_2}\) dependent on \(k'_2\). With register B the same computation yields \(19 - \max (12-10,8-3) = 14\) rounds. That is why we can build a biclique of length 10 on KATAN32. The same reasoning shows that we can build bicliques of length 5 on KATAN48, and 5 again on KATAN64.
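The round count for KATAN32 can be reproduced with a two-line computation. The helper name is hypothetical; the register lengths and multiplied positions are the ones quoted in the paragraph above, and taking the minimum over the two registers is our reading of why the biclique length is 10.

```python
def linear_rounds(register_len, product_taps):
    """Rounds during which no product in the register can involve bits coming
    from both ends: the register length minus the widest gap between the two
    positions of a multiplied pair."""
    return register_len - max(hi - lo for hi, lo in product_taps)

# KATAN32 figures quoted in the text (positions numbered as on Fig. 4).
ROUNDS_A = linear_rounds(13, [(27, 24)])
ROUNDS_B = linear_rounds(19, [(12, 10), (8, 3)])
BICLIQUE_LENGTH = min(ROUNDS_A, ROUNDS_B)

if __name__ == "__main__":
    print(ROUNDS_A, ROUNDS_B, BICLIQUE_LENGTH)
```

The tap positions for KATAN48 and KATAN64 are not given in this section, so the sketch is not extended to them.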
Building Several Bicliques. Later on, we will want to use simultaneous matching (cf. Sect. 2.1). For this purpose, we need several distinct bicliques, and the previous proposition only gives us one. Fortunately, it is possible to build new distinct bicliques on the same model as that of Proposition 1, by adding parameters to Definition 2. There are several ways to proceed. In particular, it is possible to either modify the bits of \(A_{k_2}\) that do not actually depend on \(k_2\), or those that do. We only describe the second option, as it is enough for our purpose.
Biclique Attack Against KATAN. In Sect. 4.3, we attacked 121 rounds of KATAN32. If we use four bicliques instead of four plaintext/ciphertext pairs, we gain an additional 10 rounds, as explained in the previous section. Meanwhile, the core of the attack remains the same, except we meet on \(b_{60}\) starting from round 10, instead of \(b_{50}\) starting from round 0. In particular, the complexity is unchanged. However we now require chosen plaintexts. Because the dimension of \(K'_2\) is 5, each biclique requires \(2^5\) chosen plaintexts, so the data requirements increase to \(4\cdot 2^5 = 2^7\). The attack covers 131 rounds with complexity \(2^{77.5}\). In the same way, we can extend the previous attacks on KATAN48 and KATAN64 respectively to 114 rounds with \(2^6\) CP, and 107 rounds with \(2^7\) CP, both with complexity \(2^{77.5}\).
4.6 Match Box
We now explain how the match box technique applies to KATAN32. Variants for KATAN48 and KATAN64 will be very similar. Assume we are meeting in the middle on \(b_{62}\) (this will be the case in the final attack). The idea is that we are going to isolate round keys whose impact on the value of \(b_{62}\) when computing from the right can be evaluated with knowledge of only a few bits of information. However, we do not consider round keys whose impact is only linear, since those can be ignored thanks to indirect matching.
What we have gained in this example is that round keys 175 and 179 no longer need to be known when computing from the right. This decreases the dimension of \(K_2\) by two, and thus spares a factor \(2^2\) when guessing its value. Moreover this gain can be spent in order to extend the attack by one more round. Indeed, we can now append one extra round at the end of the (reduced) cipher, and simply add the two bits of round key for that round into \(K_2\). This increases the dimension of \(K_2\) by two, back to its original value. Then the attack proceeds as before. In short, every time we are able to decrease the dimension of \(K_2\) by two by needing fewer round keys in order to compute \(r_{62}\), we can increase it again in order to extend the attack by one more round.
Sizes of \(\mathbf {r}\) sufficient to spare a certain number of round keys.

Note that this table only indicates the number of round keys spared. One can expect that every two round keys spared gains one round, but this is dependent on the two round keys being linearly independent of the rest of \(K_2\). This in turn depends on which round keys are in \(K_2\), i.e. which rounds are covered by \(K_2\).
4.7 Final Attack
In this section, we describe the final version of the attack we propose against KATAN32, combining all components from the previous sections. We aim at having \(K_1\) and \(K_2\) both of dimension 77. Starting from round 10 after the biclique, this allows us to cover 62 rounds in the forward direction. This corresponds to meeting on \(b_{62}\) (i.e. position 9 of register B after 72 rounds). The number of rounds covered in the backwards direction will depend on the match box. We will ensure that in the end, the dimension of \(K'_1\) is 3.
Compression Table (cf. Sect. 3.3). We have \(|k'_1| = 3\), so \(2^{|k'_1|} = 8\), which makes it worthwhile to use the compression technique. We want to build a compression table \(C\) converting an \(\mathbf{r}\) of size greater than 8 into 8 \(f^n\)’s. Note that we meet on a single bit, so there is only one line in (2), which is why we talk about \(f^n\)’s and not \(f^n_i\)’s. On the other hand, we will use simultaneous matching on three bicliques, but always on the same bit, so the conversion from \(\mathbf{r}\) into the \(f^n\)’s is the same for each pair: we only need one compression table.
As observed in Sect. 3.3, for each \(\mathbf{r}\), the \(f^n\)’s can be computed with \(2^{|k'_1|}\) partial encryptions. Hence for KATAN32 we can choose \(|\mathbf{r}| = 73\) (see Table 3), yielding a table of size \(2^{76}\) in complexity \(2^{76}\). This spares 35 round keys in the decryption direction. In the end, we can begin the backwards computation from round 153.
Match Box (cf. Sect. 4.6). We perform simultaneous matching on \(b_{62}\) for 3 distinct bicliques; \(\mathbf{l'}\) contains the value of \(b_{62}\) computed from the left for each biclique, so \(|\mathbf{l'}| = 3\). Meanwhile \(\mathbf{r}\) contains the 8 bits \(f^n\) computed from the right, again for each biclique, hence \(|\mathbf{r}| = 8\times 3 = 24\). This yields a match box table of size \(2^{3^3 + 24 + 3}= 2^{54}\) in less than \(2^{54}\) encryptions. Note that both the compression table and the match box table are absolute precomputations, in the sense that they do not depend on the actual plaintext/ciphertext pairs and need only be built once.
In the end, we attack 153 rounds: the first 10 are covered by the bicliques; the next 71 are the forward part of the meet-in-the-middle attack; the next 19 are covered by the match box; and the final 53 are the backwards part of the meet-in-the-middle attack. See the Appendix for the list of round keys involved in \(K_1\) and \(K_2\) and the list of round keys spared by using the match box.
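The round budget can be checked with a few lines of Python. The segment lengths are those given above; counting round boundaries from 0 is an assumption made only for display.

```python
segments = {
    "bicliques": 10,
    "forward meet-in-the-middle": 71,
    "match box": 19,
    "backward meet-in-the-middle": 53,
}

total = sum(segments.values())

start = 0
for name, length in segments.items():
    # Print the range of rounds covered by each part of the attack.
    print(f"{name:28s} rounds {start:3d} to {start + length:3d}")
    start += length

print(total)
```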

1. Precompute the compression table \(C\).
2. Precompute the match box \(M\).
3. For each partial key \(k_{\cap } \in K_{\cap }\):
   (a) For each partial key \(k'_1\), knowing \(k_1 = k_{\cap } \oplus k'_1\), compute \(b_{62}\) from the left for each biclique, and denote their concatenation by \(\mathbf {l'}\). This yields a function \(F: k'_1 \rightarrow \mathbf {l'}\). Retrieve \(M(F)\).
   (b) For each partial key \(k'_2\):
       * For each biclique, knowing \(k_2 = k_{\cap } \oplus k'_2\), compute the 31-bit \(\mathbf {r}\) from the right for that biclique. Convert it into 8 bits \(f^n\) through \(C\).
       * Having done this for all 3 bicliques, the concatenation of the \(f^n\)’s makes up the 24-bit \(\mathbf {r}\) entry of the match box. Match \(k'_2\) with the \(k'_1\)’s in \(M(F)(\mathbf {r})\) to form candidate master keys.
4. Test candidate master keys on 3 plaintext/ciphertext pairs as in a standard meet-in-the-middle attack. This should be done on the fly.
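The control flow of the procedure above can be sketched in Python on a made-up 8-bit cipher. Everything in this sketch is hypothetical: the functions `forward`, `middle` and `last` are toy stand-ins for the three parts of the cipher, two plaintext/ciphertext pairs replace the three bicliques, the key dimensions are tiny, and the bits predicted from the right are computed directly rather than read from a compression table.

```python
from itertools import product

KC_BITS, K1P_BITS, K2P_BITS = 4, 2, 2   # toy dimensions of K_cap, K'_1, K'_2
N_K1P = 1 << K1P_BITS
N_PAIRS = 2                              # stand-in for the 3 bicliques

def rotl8(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF

def forward(p, kc, k1p):                 # part computed from the left
    return (rotl8(p ^ (kc * 17), 3) + 5 * k1p + 1) & 0xFF

def middle(x, k1p, k2p):                 # part handled by the match box
    return ((x ^ (0x1D * (k1p + 1))) + 7 * k2p) & 0xFF

def middle_inv(z, k1p, k2p):
    return ((z - 7 * k2p) & 0xFF) ^ (0x1D * (k1p + 1))

def last_mask(kc, k2p):
    return (kc * 16 + k2p * 3 + 0xA5) & 0xFF

def last(y, kc, k2p):                    # part computed from the right
    return rotl8(y, 5) ^ last_mask(kc, k2p)

def last_inv(c, kc, k2p):
    return rotl8(c ^ last_mask(kc, k2p), 3)

def encrypt(p, key):
    kc, k1p, k2p = key
    return last(middle(forward(p, kc, k1p), k1p, k2p), kc, k2p)

def build_match_box():
    lefts = list(product((0, 1), repeat=N_PAIRS))    # l' for one k'_1
    rights = list(product((0, 1), repeat=N_K1P))     # predicted bits, one pair
    return {
        F: {
            r: [k for k in range(N_K1P)
                if all(F[k][i] == r[i][k] for i in range(N_PAIRS))]
            for r in product(rights, repeat=N_PAIRS)
        }
        for F in product(lefts, repeat=N_K1P)
    }

def attack(pairs, M):
    found = []
    for kc in range(1 << KC_BITS):
        # F maps each k'_1 to the meeting bits computed from the left.
        F = tuple(tuple(forward(p, kc, k) & 1 for p, _ in pairs)
                  for k in range(N_K1P))
        row = M[F]
        for k2p in range(1 << K2P_BITS):
            # Meeting bit predicted from the right for every k'_1.
            r = tuple(tuple(middle_inv(last_inv(c, kc, k2p), k, k2p) & 1
                            for k in range(N_K1P))
                      for _, c in pairs)
            for k1p in row[r]:
                key = (kc, k1p, k2p)
                # Test each candidate on the fly.
                if all(encrypt(p, key) == c for p, c in pairs):
                    found.append(key)
    return found

M = build_match_box()
secret = (11, 2, 1)
pairs = [(p, encrypt(p, secret)) for p in (0x3C, 0xA7)]
print(secret in attack(pairs, M))
```

In this toy setting the list returned by `attack` may contain other keys that happen to agree with the secret key on both pairs; the sketch only illustrates how the lookup in \(M(F)\) replaces an explicit loop over \(k'_1\) inside the loop over \(k'_2\).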
5 Conclusion
In this paper, we presented a new technique to extend meet-in-the-middle attacks. This technique makes it possible to extend the middle portion of the attack with no increase in the overall complexity, but at the cost of significant precomputation. As such, it is a form of time/memory tradeoff. We have applied this technique to the lightweight cipher KATAN, and significantly improved on previous results on this cipher.
Acknowledgments
The authors would like to thank Henri Gilbert for many helpful discussions, as well as Anne Canteaut and María Naya-Plasencia for their insightful remarks, including the idea of compression (Sect. 3.3).
References
 1. Albrecht, M.R., Leander, G.: An all-in-one approach to differential cryptanalysis for small block ciphers. In: Knudsen, L.R., Wu, H. (eds.) SAC 2012. LNCS, vol. 7707, pp. 1–15. Springer, Heidelberg (2013)
 2. Beaulieu, R., Shors, D., Smith, J., Treatman-Clark, S., Weeks, B., Wingers, L.: The Simon and Speck families of lightweight block ciphers. Cryptology ePrint Archive, Report 2013/404 (2013). http://eprint.iacr.org/2013/404
 3. Bogdanov, A., Khovratovich, D., Rechberger, C.: Biclique cryptanalysis of the full AES. In: Lee, D.H., Wang, X. (eds.) ASIACRYPT 2011. LNCS, vol. 7073, pp. 344–371. Springer, Heidelberg (2011)
 4. Khovratovich, D., Rechberger, C., Savelieva, A.: Bicliques for preimages: Attacks on Skein-512 and the SHA-2 family. In: Canteaut, A. (ed.) FSE 2012. LNCS, vol. 7549, pp. 244–263. Springer, Heidelberg (2012)
 5. Bogdanov, A.A., Knudsen, L.R., Leander, G., Paar, C., Poschmann, A., Robshaw, M., Seurin, Y., Vikkelsoe, C.: PRESENT: An ultra-lightweight block cipher. In: Paillier, P., Verbauwhede, I. (eds.) CHES 2007. LNCS, vol. 4727, pp. 450–466. Springer, Heidelberg (2007)
 6. Bogdanov, A., Rechberger, C.: A 3-subset meet-in-the-middle attack: Cryptanalysis of the lightweight block cipher KTANTAN. In: Biryukov, A., Gong, G., Stinson, D.R. (eds.) SAC 2010. LNCS, vol. 6544, pp. 229–240. Springer, Heidelberg (2011)
 7. De Cannière, C., Dunkelman, O., Knežević, M.: KATAN and KTANTAN — A family of small and efficient hardware-oriented block ciphers. In: Clavier, C., Gaj, K. (eds.) CHES 2009. LNCS, vol. 5747, pp. 272–288. Springer, Heidelberg (2009)
 8. Canteaut, A., Naya-Plasencia, M., Vayssière, B.: Sieve-in-the-middle: Improved MITM attacks. Cryptology ePrint Archive, Report 2013/324 (2013, to appear). http://eprint.iacr.org/2013/324
 9. Guo, J., Peyrin, T., Poschmann, A., Robshaw, M.: The LED block cipher. In: Preneel, B., Takagi, T. (eds.) CHES 2011. LNCS, vol. 6917, pp. 326–341. Springer, Heidelberg (2011)
 10. Isobe, T., Shibutani, K.: All subkeys recovery attack on block ciphers: extending meet-in-the-middle approach. In: Knudsen, L.R., Wu, H. (eds.) SAC 2012. LNCS, vol. 7707, pp. 202–221. Springer, Heidelberg (2013)
 11. Knellwolf, S., Meier, W., Naya-Plasencia, M.: Conditional differential cryptanalysis of NLFSR-based cryptosystems. In: Abe, M. (ed.) ASIACRYPT 2010. LNCS, vol. 6477, pp. 130–145. Springer, Heidelberg (2010)
 12. Knellwolf, S., Meier, W., Naya-Plasencia, M.: Conditional differential cryptanalysis of Trivium and KATAN. In: Miri, A., Vaudenay, S. (eds.) SAC 2011. LNCS, vol. 7118, pp. 200–212. Springer, Heidelberg (2012)
 13. Naya-Plasencia, M.: How to improve rebound attacks. In: Rogaway, P. (ed.) CRYPTO 2011. LNCS, vol. 6841, pp. 188–205. Springer, Heidelberg (2011)
 14. Wei, L., Rechberger, C., Guo, J., Wu, H., Wang, H., Ling, S.: Improved meet-in-the-middle cryptanalysis of KTANTAN (Poster). In: Parampalli, U., Hawkes, P. (eds.) ACISP 2011. LNCS, vol. 6812, pp. 433–438. Springer, Heidelberg (2011)
 15. Zhu, B., Gong, G.: Multidimensional Meet-in-the-Middle Attack and Its Applications to KATAN32/48/64. Cryptology ePrint Archive, Report 2011/619 (2011). http://eprint.iacr.org/2011/619