# Linear Distinguishers in the Key-less Setting: Application to PRESENT

## Abstract

The application of the concept of linear cryptanalysis to the domain of key-less primitives is largely an open problem. In this paper we, for the first time, propose a model in which its application is meaningful for distinguishing block ciphers.

Combining our model with ideas from message modification and rebound-like approaches, we initiate a study of cryptographic primitives with respect to this new attack vector and choose the lightweight block cipher PRESENT as an example target. This leads to known-key distinguishers over up to 27 rounds, whereas the best previous result is up to 18 rounds in the chosen-key model.

## Keywords

Hash function, Block cipher, Linear cryptanalysis, Distinguisher, PRESENT

## 1 Introduction

We start off with a simple, clearly undesirable property of a block cipher and generalize it; suppose there is an *n*-bit block cipher which allows, for a particular known or chosen key, to determine a plaintext, such that the plaintext is the same as the ciphertext. For a good block cipher, accomplishing this should be very unlikely with much less than \(2^n\) trials. It would, for example, allow preimage attacks in fully preimage-secure compression function constructions that use this block cipher.

Now, consider an *n*-bit block cipher where the key is known or chosen by the attacker and let us focus on a single bit at position *i* of the plaintext \(p_i\) and ciphertext \(c_i\) in this setting. We would expect that the equation \(p_i=c_i\) holds in exactly half the cases. In fact, any statistically significant deviation from this expectation can be interpreted as a sign of non-randomness in the cipher.

Such an attack would be in the so-called *key-less model*, which covers both the *known-key* and *chosen-key* models, and is hence of relevance if the cipher is used as part of a hash function construction. More generally, it allows one to make meaningful statements and differentiate between ciphers beyond what is possible in other models. Should we consider such a cipher a good building block for a compression function? Not if there were an alternative cipher with similar implementation characteristics that does not allow for such a distinguisher!

### 1.1 Contributions

We discuss the two types of contributions in this paper. One is of a more conceptual/modeling nature, while the other is a concrete cryptanalytic application of the former.

**A New Way of Formulating Key-less Distinguishers.** The property described in the beginning resembles properties used in linear cryptanalysis to recover secret keys. The problem with the above line of reasoning was that so far there did not exist a meaningful model to properly express the setting. By this, we mean a model which has a proper characterization of the power of generic attackers and a clear distinction as to when a dedicated attack in fact can be considered a valid distinguisher, i.e. outperforms generic attackers. In this paper, after starting off by giving notation and preliminary notions of block ciphers and linear cryptanalysis in Sect. 2, we put in Sect. 3 the above very informal description of a possible demonstration of non-randomness on more rigorous grounds.

The usual requirement for a distinguisher to be valid is that one must compare the cost of satisfying a specific property, which varies from case to case, for a concrete permutation \(\pi \), with the cost of achieving the same property for an ideal permutation. In our model, we expand on this by posing the problem of determining for a concrete permutation \(\pi \): (i) a linear relation over \(\pi \) in the form of an input/output mask and (ii) a set of inputs to \(\pi \), such that the number of inputs satisfying the linear relation is *expected* to deviate from what one expects of an ideal permutation by a significant amount. Such a property should not be attainable for an ideal primitive.

Our proposed key-less linear distinguisher model captures the possibility of distinguishing a cipher using any previous linear cryptanalysis, in the sense that the attacker needs only a linear hull and the probability distribution on the absolute correlation, to perform his analysis. To amplify the distinguisher to either cover more rounds or to need less computation, approaches inspired by message modification [42] and rebound attacks [28, 35] are used.

**Application to PRESENT.** We can find concrete results in the new model in round-reduced versions of the leading lightweight-cipher PRESENT [10] (used in compression function designs advocated e.g. in [11]). In Sect. 4 we describe the relevant aspects of the PRESENT block cipher and give results on linear hulls and keys pertaining to it. Section 5 details the application of the key-less linear distinguisher to PRESENT. We fix a bit position *i*, devise an algorithm for determining up to \(2^{61.97}\) key-dependent plaintexts in a very efficient manner, and study the expected number of plaintext and ciphertext pairs where \(p_i=c_i\). What we claim to be able to find is a deviation from the expectation that the equation \(p_i=c_i\) is fulfilled with probability \(\frac{1}{2}\). Depending on the size of the allowable key-set, this will work for up to 27 rounds of PRESENT. Detailed results are summarized in Table 4, before our conclusions and a discussion of open problems in Sect. 6. We confirm the results with experimental verifications (see Appendix C and [29]).

### 1.2 Related Work

Linear cryptanalysis, a technique to recover keys in ciphers, was pioneered by Matsui from 1992 on [32, 34], with extensions or variants such as multiple linear approximations [5, 20], linear hulls [38], multidimensional variants [16], zero-correlations [12] and considerations of a general statistical framework [3, 30, 37].

The application of linear cryptanalysis to key-less constructions, i.e. in models where the key is either known or chosen by the attacker, is largely an open problem. Sometimes, designs are evaluated with respect to standard linear cryptanalysis [2, 31]. Some designers of SHA-3 candidates state properties with respect to this class of attacks (such as linear probability) without ever mentioning specific models. The reason is that there simply was no model, a situation that we address in this paper.

In all cases of linear cryptanalysis applied in a key-less setting, the analysis done is exactly the same as in a setting with a secret key: a linear approximation with a non-zero correlation is presented. The only exception known to us is a linear analysis of CubeHash by Ashur and Dunkelman [2]. There, an 11-round linear approximation with bias \(2^{-235}\) is used to describe a standard distinguisher with \(2^{470}\) queries. Then, inspired by a chosen-plaintext variant of linear cryptanalysis of DES by Knudsen and Mathiassen [23], the authors fix 80 bits of the plaintext input of modular additions, thereby gaining the first round for free, arriving at a 12-round result with a complexity below \(2^{512}\). This can be seen as a predecessor to our deterministic technique of Sect. 5.2.

The only analysis of PRESENT in a setting without secret keys we know of is by Koyama, Sasaki, and Kunihiro [25]. In their work, differential chosen-key distinguishers (a setting that gives the attacker more freedom than in our known-key model) for up to 18 rounds are obtained.

At its core is a differential rebound attack with an inbound phase of 5 rounds that needs 100 degrees of freedom^{1}. In the method we propose, we allow the key to be fixed arbitrarily, and out of the 64 degrees of freedom from the plaintext input, more than 61 remain. Hence our results, which cover more rounds and use a deterministic phase over 3 rounds that needs only 3 degrees of freedom, compare favorably to this result.

## 2 Preliminaries

In this section we introduce our notation, give basic definitions and recall known properties related to our analysis throughout the paper.

**Notation.** For an *n*-bit block cipher with key space \({\mathcal {K}}\), let \(E : {\mathbb {F}}_2^n \times {\mathcal {K}}\rightarrow {\mathbb {F}}_2^n\) and \(D : {\mathbb {F}}_2^n \times {\mathcal {K}}\rightarrow {\mathbb {F}}_2^n\) denote encryption and decryption functions, respectively. For convenience, we also use the notation that \(E_K(x) := E(x,K)\) and \(D_K(c) := D(c,K)\). We use \(\sharp X\) to denote the size of a set *X*. For a real number *w*, |*w*| denotes the absolute value of *w*. We let \({\mathsf{Perm}}(n)\) denote the set of all permutations on *n*-bit inputs and we let \(x \xleftarrow {\$} X\) denote the assignment of *x* by an element of *X* chosen uniformly at random. We use \(\mathcal{N}(\mu ,\sigma ^2)\) and \(\mathcal{B}(n,p)\) to denote the normal- and binomial distributions respectively. For a distribution *D* we use \({\varPhi }(D,x)\) to denote the cumulative distribution function of *D* at point *x*. We use the notation that \(\mathbf {e}_i\) is a binary string with a 1 in position *i* and zeroes elsewhere.

In this paper, when we talk about the key-less setting, we implicitly mean adversarial assumptions where the key \(K \in {\mathcal {K}}\) is either known or chosen by the attacker.

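**Trails and Hulls.** In the following, let \(F : {\mathbb {F}}_2^n \rightarrow {\mathbb {F}}_2^n\) be an iterated function of the form \(F = F_R \circ \cdots \circ F_1\). We borrow to a large extent the notation from Leander’s treatment on linear cryptanalysis [30]. We define a *mask* as a vector \(\alpha \in {\mathbb {F}}_2^n\). For two masks \(\alpha ,\beta \), we denote by \(\langle \alpha ,\beta \rangle \) the inner product of the two masks:

\[\langle \alpha ,\beta \rangle = \bigoplus _{j=0}^{n-1} \alpha [j] \cdot \beta [j],\]

where \(\alpha [j]\) denotes bit *j* of \(\alpha \).

We define an *R*-round *trail* as an element \((\delta ,\alpha _1,\ldots ,\alpha _{R-1},\gamma ) \in ({\mathbb {F}}_2^n)^{R+1}\), where \(\delta \) and \(\gamma \) are the *input* and *output* masks, respectively. The \(\alpha _i\) are called the *intermediate* masks. For a randomly chosen \(x \in {\mathbb {F}}_2^n\), and for \(i=1,\ldots ,R\) (letting \(\alpha _0=\delta \) and \(\alpha _R = \gamma \)), we have

\[\Pr \left[ \langle \alpha _{i-1},x \rangle = \langle \alpha _i,F_i(x) \rangle \right] = \frac{1}{2} \left( 1 + {\mathbf {C}}_{F_i}(\alpha _{i-1},\alpha _i) \right) ,\]

where \({\mathbf {C}}_{F_i}(\alpha _{i-1},\alpha _i)\) is called the *correlation* over \(F_i\). The *trail correlation* over *F* is defined in terms of the \({\mathbf {C}}_{F_i}\) as

\[{\mathbf {C}}_F(t) = \prod _{i=1}^{R} {\mathbf {C}}_{F_i}(\alpha _{i-1},\alpha _i). \qquad (1)\]

We say that a trail is *valid* if and only if each constituent correlation of (1) is non-zero.

We define an *R*-round *linear hull* \(\text {LH}_R(\delta ,\gamma )\) as the union of all valid linear trails with input mask \(\delta \) and output mask \(\gamma \). As such, we use the notation that \(t \in \text {LH}_R(\delta ,\gamma )\) for an *R*-round trail *t*. Note that a linear hull \(\text {LH}_R(\delta ,\gamma )\) defines an *R*-round *linear relation* between *x* and *F*(*x*), which we denote \(\mathcal{R}_{\delta ,\gamma }^F : {\mathbb {F}}_2^n \rightarrow {\mathbb {F}}_2\), where

\[\mathcal{R}_{\delta ,\gamma }^F(x) = \langle \delta ,x \rangle \oplus \langle \gamma ,F(x) \rangle \oplus 1.\]

When \(\mathcal{R}_{\delta ,\gamma }^F(x) = 1\), we say that the relation is *satisfied* for input *x* and otherwise it is not. The *linear hull correlation* [17, Theorem 7.8.1] is given by

\[{\mathbf {C}}_F(\text {LH}_R(\delta ,\gamma )) = \sum _{t \in \text {LH}_R(\delta ,\gamma )} {\mathbf {C}}_F(t) = \sum _{t \in \text {LH}_R(\delta ,\gamma )} (-1)^{ sgn (t)} \, |{\mathbf {C}}_F(t)|,\]

where \( sgn (t) \in \{0,1\}\) denotes the sign of the correlation of the trail *t* over *F*. For a block cipher, the value of \( sgn (t)\) for \(t \in \text {LH}_R(\delta ,\gamma )\) depends on the secret key \(K \in {\mathcal {K}}\), and hence the value of \(|{\mathbf {C}}_F(\text {LH}_R(\delta ,\gamma ))|\) depends on the difference between the number of trails with \( sgn (t) = 1\) and those with \( sgn (t) = 0\). In this paper, we use the following assumption.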

### **Assumption 1**

For any fixed key \(K \in {\mathcal {K}}\), we assume that for any two trails \(t,t' \in \text {LH}_R(\delta ,\gamma )\), where \(t \ne t'\), the signs \( sgn (t)\) and \( sgn (t')\) are independent Bernoulli random variables with \(p=\frac{1}{2}\).

We note that Assumption 1 has been experimentally verified for PRESENT, see e.g. [13, 30].
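As a small illustration of the correlation of a mask pair, the following Python sketch computes it exhaustively for a 4-bit function, using the convention that the correlation equals \(2 \Pr [\langle a,x \rangle = \langle b,S(x) \rangle ] - 1\). The table is the PRESENT S-box from the public specification [10], which is not listed in this section; the helper names `parity` and `correlation` are ours and purely illustrative.

```python
# PRESENT 4-bit S-box, taken from the public specification (assumption:
# the table itself is not reproduced in this section of the paper).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def parity(v):
    """XOR of all bits of v; parity(a & x) is the inner product <a, x>."""
    return bin(v).count("1") & 1


def correlation(table, a, b):
    """Correlation 2 * Pr[<a, x> = <b, table[x]>] - 1 over all inputs x."""
    agree = sum(parity(a & x) == parity(b & table[x])
                for x in range(len(table)))
    return 2 * agree / len(table) - 1


# Largest absolute correlation over all pairs of non-zero masks.
best = max(abs(correlation(SBOX, a, b))
           for a in range(1, 16) for b in range(1, 16))

# Correlation of one pair of single-bit masks (bit 1 in, bit 1 out).
single_bit = correlation(SBOX, 0b0010, 0b0010)
print(best, single_bit)
```

Exhaustive evaluation like this is only feasible for small components; for a full 64-bit cipher one works with trails and hulls instead, as described above.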

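Finally, we say that an input \(x \in {\mathbb {F}}_2^n\) *follows* an *R*-round trail \((\alpha _0,\ldots ,\alpha _R)\) over *F* if and only if \(\langle \alpha _{i-1},x_{i-1} \rangle = \langle \alpha _i,x_i \rangle \) for \(i=1,\ldots ,R\), where \(x_0 = x\) and \(x_i = F_i(x_{i-1})\).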

## 3 Key-less Linear Distinguishers for Block Ciphers

Even though block ciphers have already been used for a very long time, either implicitly or explicitly, to construct hash functions, a separate study of the security of block ciphers where the key is either known or under the control of the adversary has started only recently. Knudsen and Rijmen proposed so-called known-key distinguishers [24]. Later Biryukov et al. [8] and Lamberger et al. [27] proposed open- or chosen-key models to evaluate the security of block ciphers.

Even though these models often exhibit a rather contrived-looking property and evade a formally rigorous definition^{2} (a property they share with collision attacks), cryptanalysts largely agree that these distinguishers are useful and interesting. Indeed, techniques developed to improve the original known-key distinguishers from [24], such as the rebound attack, later led to collision attacks on various hash functions [21, 27, 36]. Also, the findings in the open-key model from [8] were later used to find the first related-key key-recovery attacks on AES-256 and AES-192 [6, 7].

### 3.1 Motivation for Our Distinguisher

Sometimes distinguisher descriptions are merely motivated by the fact that they *can* be formulated, as e.g. the 7-round known-key distinguisher on AES from [24], where byte-level zero-sums are used as a distinguishing property. Another example is the rotational rebound attack on reduced Skein [22], where the existence of “rotational collisions with errors” is defined as a distinguishing property. Sometimes, however, they are better motivated, e.g. by the construction of near-collisions, or the subspace- and limited-birthday distinguishers [19, 27, 28] that resemble some generalization of the concept of near-collisions.

The distinguisher we propose below comes with a new motivation that stems from preimage attacks on hash functions or compression functions^{3}. As an example, consider the compression function construction using a single call to a block cipher in Matyas-Meyer-Oseas mode. The *i*th message block \(m_i\) is compressed by using it as the plaintext input when computing the next chaining value \(H_{i+1}\) using \(H_i\) as the cipher key, i.e. \(H_{i+1} = E_{H_i}(m_i) \oplus m_i\). If an attacker can determine a relation stating that the *j*th bit of \(m_i\) equals the *j*th bit of \(E_{H_i}(m_i)\) with a high probability, then it is likely that the *j*th bit of \(H_{i+1}\) equals zero. In a preimage attack, if the target preimage is zero at position *j*, this then leads to an advantage over brute-force search.

Motivated by this example, we proceed with our new key-less linear distinguisher model for block ciphers that we will use throughout the paper.

### 3.2 The Key-less Linear Distinguisher Model

In the following, we give our definition of key-less linear distinguishers. Essentially, the model captures the possibility of distinguishing any block cipher in the key-less setting, given that a linear relation (in the form of a linear hull) of sufficiently high absolute correlation for a reasonable fraction of the key space \({\mathcal {K}}\), is available. The notions of Definitions 1 and 2 are largely inspired by the recent work of Gilbert on pushing known-key attacks further on the AES [18].

The following definition of \(\alpha \) *-separability* formalizes how a linear relation, combined with a set of inputs for a permutation \(\pi : {\mathbb {F}}_2^n \rightarrow {\mathbb {F}}_2^n\), can exhibit a *significant* deviation from the behavior of a random permutation.

### **Definition 1**

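**(** \(\alpha \) **-separability).** Let \({\mathcal {P}}\) be a set of permutations from \({\mathbb {F}}_2^n\) to \({\mathbb {F}}_2^n\) and let \(\pi \in {\mathcal {P}}\) denote a particular, fixed permutation from \({\mathcal {P}}\). Let \({\mathcal {S}}\subseteq {\mathbb {F}}_2^n\) with size \(\mathcal{M}\) and let \(\delta ,\gamma \in {\mathbb {F}}_2^n \backslash \{(0,\ldots ,0)\}\). We say that the tuple \(({\mathcal {P}},\pi ,{\mathcal {S}},\mathcal{R}_{\delta ,\gamma }^{\pi })\) is \(\alpha \)*-separable* if and only if

\[\Pr \left[ \left| {\mathbb {E}}\left[ \sharp \{ x \in {\mathcal {S}}\;|\;\mathcal{R}_{\delta ,\gamma }^{\pi }(x) = 1 \} \right] - \frac{\mathcal{M}}{2} \right| \ge \sqrt{\mathcal{M}} \right] \ge \alpha ,\]

where the probability is taken over the permutations \(\pi \in {\mathcal {P}}\).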

### **Definition 2**

**(**\((T,\mathcal{M},\alpha )\)**-intractability).** Let \(\mathcal {P}\) be a set of permutations from \({\mathbb {F}}_2^n\) to \({\mathbb {F}}_2^n\) and let \(\pi \in \mathcal {P}\) denote a particular, fixed permutation from \(\mathcal {P}\). Let \({\mathcal {S}}\subseteq {\mathbb {F}}_2^n\) of size \(\mathcal{M}\) and let \(\delta ,\gamma \in {\mathbb {F}}_2^n \backslash \{(0,\ldots ,0)\}\). We say that the tuple \((\mathcal {P},\pi , {\mathcal {S}},\mathcal{R}_{\delta ,\gamma }^{\pi })\) is \((T,\mathcal{M},\alpha )\)*-intractable* if and only if it is impossible for any algorithm \({\mathcal {A}}\) to

1. Commit to a choice of \(\delta ',\gamma ' \in {\mathbb {F}}_2^n \backslash \{(0,\ldots ,0)\}\), and
2. When given access to a fixed pair \({\varPi },{\varPi }^{-1}\) with \({\varPi }\xleftarrow {\$} {\mathsf{Perm}}(n)\), construct a set \({\mathcal {S}}'\) of size \(\mathcal{M}\) in time *T*, s.t. the tuple \(({\mathsf{Perm}}(n),{\varPi },{\mathcal {S}}',\mathcal{R}_{\delta ',\gamma '}^{{\varPi }})\) is \(\alpha \)-separable.

### *Note 1*

For our distinguisher model, the notion of *one time unit* corresponds to a single evaluation of the respective permutation.

With the definition of \(\alpha \)-separability and \((T,\mathcal{M},\alpha )\)-intractability in hand, we are ready to formulate our proposed key-less linear distinguisher.

### **Definition 3**

**(Key-less linear distinguisher).** Let \(E : {\mathbb {F}}_2^n \times {\mathcal {K}}\rightarrow {\mathbb {F}}_2^n\) be a block cipher and let \(\mathcal {E}\) denote the set of permutations due to choices of the key \(K \in {\mathcal {K}}\). Let \(E_K\) denote *some* fixed permutation from \(\mathcal {E}\).

Fix \(\delta ,\gamma \in {\mathbb {F}}_2^n \backslash \{(0,\ldots ,0)\}\) and let \({\mathcal {A}}\) be an algorithm producing in time *T* a set \({\mathcal {S}}\subseteq {\mathbb {F}}_2^n\) of size \(\mathcal{M}\). Then the tuple \(({\mathcal {A}}, \mathcal {E}, E_K, {\mathcal {S}}, T,\mathcal{R}_{\delta ,\gamma }^{E_K}, \alpha )\) is said to be a *key-less linear distinguisher* if and only if \((\mathcal {E}, E_K, {\mathcal {S}}, \mathcal{R}_{\delta ,\gamma }^{E_K})\) is both \(\alpha \)-separable and \((T, \mathcal{M}, \alpha )\)-intractable.

### *Note 2*

In all of the definitions above, the fixed linear masks \(\delta ,\gamma \in {\mathbb {F}}_2^n \backslash \{(0,\ldots ,0)\}\) are *chosen* by the algorithm \({\mathcal {A}}\), but the choice *must be made before* the production of the input set \({\mathcal {S}}\) commences.

In the context of distinguishing a block cipher, the adversary commits to \(\delta \) and \(\gamma \) and then obtains access to \(E_K\) upon which the production of \({\mathcal {S}}\) in time *T* begins. The parameter \(\alpha \) directly expresses a lower bound on the fraction of the permutations \(\pi \in \mathcal {P}\) for which the key-less linear distinguisher is valid. The time *T* allowed to construct \({\mathcal {S}}\) is a parameter chosen by the adversary.

**Analysis.** In the following, we analyze and argue that the key-less linear distinguisher is meaningful. First, informally, the notion of \(\alpha \)-separability expresses that for a concrete permutation \(\pi : {\mathbb {F}}_2^n \rightarrow {\mathbb {F}}_2^n\), one can provide a linear relation which captures, for some constructed set of inputs, a *significant non-random behavior* in a permutation which is supposed to behave randomly. The *significant* part is captured by the requirement that the number of inputs satisfying the relation \(\mathcal{R}_{\delta ,\gamma }^{\pi }\) should deviate from what is expected in the ideal case by at least \(\sqrt{\mathcal{M}}\). This reflects the usual requirement in linear cryptanalysis, that the data complexity is inversely proportional to the squared correlation. Second, on top of that, Definition 2 captures the notion that for a random permutation \({\varPi }\xleftarrow {\$} {\mathsf{Perm}}(n)\), it should not be possible, in the same amount of time, to provide such a relation with a set of inputs which exhibits the same significant non-random behavior.

With respect to Definition 2, one of the components to analyzing our proposed key-less linear distinguisher is to answer the following question: What is the *upper bound* on the probability \(\alpha '\) that an algorithm \({\mathcal {A}}\), when given access to the fixed pair \({\varPi }\) and \({\varPi }^{-1}\), can produce in time *T* a set \({\mathcal {S}}' \subseteq {\mathbb {F}}_2^n\) of size \(\mathcal{M}\), together with a pre-determined relation \(\mathcal{R}_{\delta ,\gamma }^{\varPi }\), such that \(({\mathsf{Perm}}(n),{\varPi },{\mathcal {S}}',\mathcal{R}_{\delta ,\gamma }^{{\varPi }})\) is \(\alpha '\)-separable? Our analysis answers this question in the following, and it implicitly provides a *lower bound* on \(\alpha \) for when a concrete permutation \(\pi : {\mathbb {F}}_2^n \rightarrow {\mathbb {F}}_2^n \in \mathcal {P}\) (in the notation of Definitions 1 and 2) can be shown to be \((T, \mathcal{M}, \alpha )\)-intractable, for fixed *T* and \(\mathcal{M}\). We begin our analysis with Lemma 1.

### **Lemma 1**

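Let \({\varPi }\xleftarrow {\$} {\mathsf{Perm}}(n)\) and let \(\delta ',\gamma ' \in {\mathbb {F}}_2^n \backslash \{(0,\ldots ,0)\}\) be fixed. An optimal algorithm \({\mathcal {A}}\) for constructing a set \({\mathcal {S}}'\) of size \(\mathcal{M}\), such that the expected number of inputs in \({\mathcal {S}}'\) satisfying \(\mathcal{R}_{\delta ',\gamma '}^{\varPi }\) deviates as much as possible from \(\mathcal{M}/2\), in time *T* is the following:

1. Construct an arbitrarily chosen set \(\mathcal {Q} \subseteq {\mathbb {F}}_2^n\) of size *T*.
2. Partition \(\mathcal {Q}\) into \({\mathcal {Q}}_1 = \{ x \in \mathcal {Q} \;|\;\mathcal{R}_{\delta ',\gamma '}^{\varPi }(x) = 1 \}\) and \({\mathcal {Q}}_0 = \{ x \in \mathcal {Q} \;|\;\mathcal{R}_{\delta ',\gamma '}^{\varPi }(x) = 0 \}\) by querying \({\varPi }(x)\) for all \(x \in \mathcal {Q}\).
3. Set \({\mathcal {S}}'\) equal to the larger of the sets \({\mathcal {Q}}_0\) and \({\mathcal {Q}}_1\).
4. Fill up \({\mathcal {S}}'\) with arbitrarily chosen inputs from \({\mathbb {F}}_2^n \backslash \mathcal {Q}\) until \(\sharp {\mathcal {S}}' = \mathcal{M}\).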

### *Proof*

As \({\varPi }\xleftarrow {\$} {\mathsf{Perm}}(n)\), the particular choice of \(\delta ',\gamma ' \in {\mathbb {F}}_2^n \backslash \{(0,\ldots ,0)\}\) does not affect the analysis. The most information \({\mathcal {A}}\) can learn about \({\varPi }\) in time *T* is to obtain *T* pairs \((x,{\varPi }(x))\), as is done when determining \(\mathcal {Q}\) and its image under \({\varPi }\). In order to optimally shift the balance of the expected number of inputs of \({\mathcal {S}}'\) satisfying \(\mathcal{R}_{\delta ',\gamma '}^{\varPi }\) away from \(\mathcal{M}/2\), \({\mathcal {A}}\) should take the larger of \({\mathcal {Q}}_1\) and \({\mathcal {Q}}_0\) and pool it with randomly chosen inputs *x* for which the value of \(\mathcal{R}_{\delta ',\gamma '}^{\varPi }(x)\) is not known. \(\square \)
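The four steps of the generic algorithm can be sketched in Python on a toy scale. This is a minimal sketch under stated assumptions: the permutation is a lookup table drawn with Python's `random` module, the block size of 10 bits, the masks and the values of *T* and \(\mathcal{M}\) are hypothetical, and the names `relation` and `generic_set` are ours.

```python
import random


def parity(v):
    """XOR of all bits of v; parity(a & x) is the inner product <a, x>."""
    return bin(v).count("1") & 1


def relation(perm, delta, gamma, x):
    """R(x) = 1 iff <delta, x> = <gamma, perm[x]> (perm is a lookup table)."""
    return 1 ^ parity(delta & x) ^ parity(gamma & perm[x])


def generic_set(perm, delta, gamma, T, M, rng):
    """Build a set of size M using T queries, following steps 1 to 4."""
    domain = range(len(perm))
    Q = rng.sample(domain, T)                                    # step 1
    Q1 = [x for x in Q if relation(perm, delta, gamma, x)]       # step 2
    Q0 = [x for x in Q if not relation(perm, delta, gamma, x)]
    known = Q1 if len(Q1) >= len(Q0) else Q0                     # step 3
    queried = set(Q)
    unqueried = [x for x in domain if x not in queried]
    filler = rng.sample(unqueried, M - len(known))               # step 4
    return known + filler, len(known)


# Toy parameters (hypothetical): a random 10-bit permutation, single-bit masks.
rng = random.Random(1)
perm = list(range(1 << 10))
rng.shuffle(perm)
S, k = generic_set(perm, 0b10, 0b10, T=64, M=256, rng=rng)
print(len(S), k)
```

The first `k` elements of `S` are the queried inputs whose relation value is known; the remaining elements behave like fresh random inputs, which is why only the queried part can shift the expected count away from \(\mathcal{M}/2\).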

Continuing our analysis, assuming an algorithm \({\mathcal {A}}\) constructs \({\mathcal {S}}'\) as in Lemma 1, we determine an upper bound on the value \(\alpha '\) as a function of \(\mathcal{M}\) and *T*, such that the resulting tuple \(({\mathsf{Perm}}(n), {\varPi }, {\mathcal {S}}', \mathcal{R}_{\delta ',\gamma '}^{\varPi })\) is \(\alpha '\)-separable. We give this result in Theorem 1.

### **Theorem 1**

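**(Generic success probability).** Let \({\mathcal {A}}, {\varPi }, \delta ', \gamma ', {\mathcal {S}}'\) and *T* be as in Lemma 1, where \(T \le 4\sqrt{\mathcal{M}}\), and let \(\mathcal{X}:= \sharp \{ x \in {\mathcal {S}}' \;|\;\mathcal{R}_{\delta ',\gamma '}^{\varPi }(x) = 1 \}\). Then

\[\Pr \left[ \left| {\mathbb {E}}\left[ \mathcal{X}\right] - \frac{\mathcal{M}}{2} \right| \ge \sqrt{\mathcal{M}} \right] = \sum _{i=\lceil 2\sqrt{\mathcal{M}} \rceil }^{T} \binom{T}{i} 2^{-T} + \sum _{i=0}^{\lfloor T-2\sqrt{\mathcal{M}} \rfloor } \binom{T}{i} 2^{-T}.\]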

### *Proof*

First, note that \(\sharp \mathcal{Q}_1 \sim \mathcal{B}(T,\frac{1}{2})\). We want to determine the probability that we have \(\left| {\mathbb {E}}\left[ \mathcal{X}\right] - \frac{\mathcal{M}}{2} \right| \ge \sqrt{\mathcal{M}}\). The consideration is split into two cases depending on whether or not \(\sharp \mathcal{Q}_1 \ge T/2\).

*Case* \(\sharp \mathcal{Q}_1 \ge T/2\). In this case, we know that at least \(\sharp \mathcal{Q}_1\) of the \(\mathcal{M}\) inputs satisfy the relation. Thus, \({\mathbb {E}}\left[ \mathcal{X}\right] = {\mathbb {E}}\left[ Z\right] + \sharp \mathcal{Q}_1\) where \(Z \sim \mathcal{B}\left( \mathcal{M}- \sharp \mathcal{Q}_1, \frac{1}{2}\right) \). Thus, \({\mathbb {E}}\left[ \mathcal{X}\right] = \frac{\mathcal{M}+ \sharp \mathcal{Q}_1}{2}\), and the requirement \(\left| {\mathbb {E}}\left[ \mathcal{X}\right] - \frac{\mathcal{M}}{2} \right| \ge \sqrt{\mathcal{M}}\) is equivalent to either \(\sharp \mathcal{Q}_1 \ge 2\sqrt{\mathcal{M}}\) or \(\sharp \mathcal{Q}_1 \le -2\sqrt{\mathcal{M}}\), the latter not being possible as \(\sharp \mathcal{Q}_1\) is non-negative.

*Case* \(\sharp \mathcal{Q}_1 < T/2\). In this case, we know that there are at least \(T-\sharp \mathcal{Q}_1\) of the \(\mathcal{M}\) inputs that *do not* satisfy the relation. Thus, \({\mathbb {E}}\left[ \mathcal{X}\right] = {\mathbb {E}}\left[ Z\right] \) where \(Z \sim \mathcal{B}\left( \mathcal{M}- T + \sharp \mathcal{Q}_1, \frac{1}{2}\right) \). Thus, \({\mathbb {E}}\left[ \mathcal{X}\right] = \frac{\mathcal{M}- T + \sharp \mathcal{Q}_1}{2}\), and the requirement \(\left| {\mathbb {E}}\left[ \mathcal{X}\right] - \frac{\mathcal{M}}{2} \right| \ge \sqrt{\mathcal{M}}\) is equivalent to either \(\sharp \mathcal{Q}_1 \ge T + 2\sqrt{\mathcal{M}}\) or \(\sharp \mathcal{Q}_1 \le T - 2\sqrt{\mathcal{M}}\), the former not being possible as \(\sharp \mathcal{Q}_1 \le T\).

Since \(\sharp \mathcal{Q}_1 \sim \mathcal{B}(T,\frac{1}{2})\), adding the probabilities of the two cases yields the claim. \(\square \)

### *Note 3*

The requirement \(T \le 4\sqrt{\mathcal{M}}\) in the statement of Theorem 1 is needed because otherwise the two sums would overlap and count the same terms twice. The probability which is derived as a function of \(\mathcal{M}\) and *T* provides a lower bound on \(\alpha \) for when, in the notation of Definition 2, a tuple \((\mathcal {P}, \pi , {\mathcal {S}}, \mathcal{R}_{\delta ,\gamma }^{\pi })\) can be \((T,\mathcal{M},\alpha )\)-intractable.
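The generic success probability can be evaluated numerically by following the two cases of the proof directly. The sketch below is ours and not part of the paper: it enumerates all values of \(\sharp \mathcal{Q}_1 \sim \mathcal{B}(T,\frac{1}{2})\), takes the size of the larger class among the queried inputs, and checks whether half of that size reaches \(\sqrt{\mathcal{M}}\). The function name and the toy parameters are hypothetical.

```python
from fractions import Fraction
from math import comb


def generic_success_probability(M, T):
    """Probability, over #Q1 ~ B(T, 1/2), that the generic algorithm of
    Lemma 1 shifts the expected count by at least sqrt(M) from M/2."""
    favourable = 0
    for q1 in range(T + 1):
        # Size of the larger class among the T queried inputs.
        known = q1 if 2 * q1 >= T else T - q1
        # known / 2 >= sqrt(M)  <=>  known^2 >= 4 * M (integers only).
        if known * known >= 4 * M:
            favourable += comb(T, q1)
    return Fraction(favourable, 2 ** T)


# Toy values with M = 16, i.e. 2*sqrt(M) = 8 and 4*sqrt(M) = 16.
for T in (7, 10, 16):
    print(T, generic_success_probability(16, T))
```

Because the case split on \(\sharp \mathcal{Q}_1 \ge T/2\) is kept explicit, no term is counted twice, so the function can also be evaluated at and beyond \(T = 4\sqrt{\mathcal{M}}\).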

### **Corollary 1**

Let \({\mathcal {A}}\) be an algorithm which, after a choice of \(\delta ,\gamma \in {\mathbb {F}}_2^n \backslash \{(0,\ldots ,0)\}\) is fixed, is given access to some permutation \(\pi : {\mathbb {F}}_2^n \rightarrow {\mathbb {F}}_2^n \in \mathcal {P}\).

When \(T < 2\sqrt{\mathcal{M}}\) and \(\mathcal {P} = {\mathsf{Perm}}(n)\), it is impossible for \({\mathcal {A}}\) to produce in time *T* a set \({\mathcal {S}}\subseteq {\mathbb {F}}_2^n\) of size \(\mathcal{M}\) s.t. the tuple \((\mathcal {P}, \pi , {\mathcal {S}}, \mathcal{R}_{\delta ,\gamma }^{\pi })\) is \(\alpha \)-separable for any \(\alpha > 0\).

On the other hand, when \(T \ge 4\sqrt{\mathcal{M}}\) and \(\mathcal {P} = \mathcal {E}\) (in the notation of Definition 3), then it is impossible for \({\mathcal {A}}\) to produce in time *T* a set \({\mathcal {S}}\subseteq {\mathbb {F}}_2^n\) of size \(\mathcal{M}\) s.t. the tuple \(({\mathcal {A}}, \mathcal {P}, \pi , {\mathcal {S}}, T, \mathcal{R}_{\delta ,\gamma }^{\pi },\alpha )\) is a key-less linear distinguisher for any \(\alpha > 0\).

### *Proof*

The first result follows directly from Theorem 1 when observing that both sums are zero when \(T < 2\sqrt{\mathcal{M}}\). The second result follows from Theorem 1 when observing that the sums equal one when \(T=4\sqrt{\mathcal{M}}\). This makes \((T,\mathcal{M},\alpha )\)-intractability impossible. \(\square \)

### *Note 4*

The key-less linear distinguisher specified in Definition 3 does not ask to provide outputs, hence it is not ruled out to give a valid key-less linear distinguisher without pre-computation, i.e. to have \(T=0\). Indeed, one of the concrete attacks we will later show does not need any computations.

Indeed, from Corollary 1 it follows that when no pre-computation is allowed, i.e. when \(T=0\), any algorithm \({\mathcal {A}}\) producing a set \({\mathcal {S}}\subseteq {\mathbb {F}}_2^n\) together with any relation \(\delta ,\gamma \in {\mathbb {F}}_2^n \backslash \{(0,\ldots ,0)\}\) for a permutation \(E_K \in \mathcal {E}\), yields a key-less linear distinguisher \(({\mathcal {A}},\mathcal {E}, E_K, {\mathcal {S}}, T, \mathcal{R}_{\delta ,\gamma }^{E_K}, \alpha )\) for *some* \(\alpha > 0\). Note, however, that the parameter \(\alpha \) measures how likely such a distinguisher is to work for a specific key. For example, when \(\alpha \) is very small, one might have a valid key-less linear distinguisher for many rounds, but for a tiny fraction of the key space. As such, when \(T=0\), such a key-less linear distinguisher is to be taken with a grain of salt, depending on the value \(\alpha \). In the following sections, we always provide together with our distinguishers the parameter \(\alpha \), to make clear the lower bound on the fraction of the key space for which it is valid.

Having analyzed the generic case, we move on to stating in Theorem 2 a necessary condition for when, for a particular fixed \(\pi \in \mathcal {P}\) and \(\delta ,\gamma \in {\mathbb {F}}_2^n \backslash \{(0,\ldots ,0)\}\), an algorithm \({\mathcal {A}}\) can construct \({\mathcal {S}}\subseteq {\mathbb {F}}_2^n\) of size \(\mathcal{M}\) in time *T*, s.t. the tuple \((\mathcal {P}, \pi , {\mathcal {S}}, \mathcal{R}_{\delta ,\gamma }^{\pi })\) is \(\alpha \)-separable.

### **Theorem 2**

Let \(\pi \in \mathcal {P}\) and fix \(\delta ,\gamma \in {\mathbb {F}}_2^n \backslash \{(0,\ldots ,0)\}\). Let \({\mathcal {S}}\subseteq {\mathbb {F}}_2^n\) have size \(\mathcal{M}\). Then the tuple \((\mathcal {P}, \pi , {\mathcal {S}}, \mathcal{R}_{\delta ,\gamma }^\pi )\) can be \(\alpha \)-separable for \(\alpha > 0\) if and only if the absolute correlation \(|{\mathbf {C}}_\pi |\) of \(\mathcal{R}_{\delta ,\gamma }^\pi \) satisfies \(|{\mathbf {C}}_\pi | \ge 2/\sqrt{\mathcal{M}}\). Furthermore, the largest \(\alpha \) for which \(\alpha \)-separability is obtained is given by \(\alpha = \Pr \left[ |{\mathbf {C}}_\pi | \ge 2/\sqrt{\mathcal{M}}\right] \).

### *Proof*

Let \(\mathcal{X}:= \sharp \{ x \in {\mathcal {S}}\mid \mathcal{R}_{\delta ,\gamma }^\pi (x) = 1 \}\). Then \(\mathcal{X}\sim \mathcal{B}\left( \mathcal{M}, \frac{1}{2} + \frac{{\mathbf {C}}_\pi }{2}\right) \). We have \(\alpha \)-separability if and only if \(\Pr \left[ \left| {\mathbb {E}}\left[ \mathcal{X}\right] - \frac{\mathcal{M}}{2}\right| \ge \sqrt{\mathcal{M}}\right] \ge \alpha \). Thus, we require either \({\mathbb {E}}\left[ \mathcal{X}\right] \ge \frac{\mathcal{M}}{2} + \sqrt{\mathcal{M}}\) or \({\mathbb {E}}\left[ \mathcal{X}\right] \le \frac{\mathcal{M}}{2} - \sqrt{\mathcal{M}}\). Since \({\mathbb {E}}\left[ \mathcal{X}\right] = \frac{\mathcal{M}}{2} + \mathcal{M}\cdot \frac{{\mathbf {C}}_\pi }{2}\), this happens exactly when \(|{\mathbf {C}}_\pi | \ge 2/\sqrt{\mathcal{M}}\). From this, the results follow. \(\square \)
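The condition \(|{\mathbf {C}}_\pi | \ge 2/\sqrt{\mathcal{M}}\) is equivalent to \(\mathcal{M}\ge 4/{\mathbf {C}}_\pi ^2\), which is convenient to evaluate on a logarithmic scale. The following Python helpers are a small sketch of that arithmetic; the function names are ours, and the correlation value in the example is an arbitrary illustrative input.

```python
def min_log2_set_size(log2_abs_corr):
    """log2 of the smallest M with |C| >= 2 / sqrt(M), i.e. M >= 4 / C^2."""
    return 2.0 - 2.0 * log2_abs_corr


def can_be_separable(log2_abs_corr, log2_M):
    """Check |C| >= 2 / sqrt(M) using base-2 logarithms of |C| and M."""
    return log2_abs_corr >= 1.0 - log2_M / 2.0


# Illustrative input: |C| = 2^-23.20 requires log2(M) of at least about 48.4.
print(min_log2_set_size(-23.20))
print(can_be_separable(-23.20, 50.0))
```

In words, each halving of the absolute correlation costs a factor of four in the required set size, mirroring the usual data complexity rule of linear cryptanalysis.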

## 4 The Block Cipher PRESENT, Keys and Linear Hulls

PRESENT is a 64-bit iterated block cipher [10] for use in lightweight applications such as RFID tags and wireless sensor networks. Its use in compression function designs is e.g. studied and advocated for in [11]. The key space is \({\mathcal {K}}= {\mathbb {F}}_2^\kappa \) with \(\kappa \) either 80 or 128 bits. The respective block ciphers are denoted PRESENT-80 and PRESENT-128. Both ciphers have 31 rounds. The PRESENT key-schedule (see Appendix A for details) produces 32 \(\kappa \)-bit round keys, but only the 64 most significant bits are used in the key addition of each round. We refer to these 64-bit round keys as \(K_i\) with \(i=0,\ldots ,31\).

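The round function is given by

\[F_i(x) = P(S(x \oplus K_i)),\]

where *x* is the 64-bit state input to round *i*, *S* is the parallel application of sixteen identical 4-bit S-boxes and *P* is a fixed bitwise permutation^{4}. The full cipher is composed of 31 applications of the round function followed by addition of a post-whitening key, i.e.

\[E_K(x) = K_{31} \oplus (F_{30} \circ \cdots \circ F_0)(x).\]

For the specification of *S* and *P*, see Appendix A.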
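The structure described above (key addition, S-box layer, bit permutation, and a final post-whitening key) can be sketched in Python for PRESENT-80. The S-box table, the bit permutation \(P(i) = 16i \bmod 63\) (with bit 63 fixed) and the key schedule follow the public PRESENT specification [10]; they are not given in this section, so treat them here as assumptions imported from that specification. The function names and the `rounds` parameter for round-reduced variants are ours.

```python
# S-box from the public PRESENT specification (not listed in this section).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
MASK80 = (1 << 80) - 1


def s_layer(x):
    """Apply the 4-bit S-box to each of the sixteen nibbles of the state."""
    y = 0
    for k in range(16):
        y |= SBOX[(x >> (4 * k)) & 0xF] << (4 * k)
    return y


def p_layer(x):
    """Move bit i to position 16*i mod 63; bit 63 stays in place."""
    y = 0
    for i in range(64):
        j = 63 if i == 63 else (16 * i) % 63
        y |= ((x >> i) & 1) << j
    return y


def round_keys_80(key):
    """Round keys K_0..K_31: the 64 most significant bits of the register."""
    keys = []
    for i in range(32):
        keys.append(key >> 16)
        key = ((key << 61) | (key >> 19)) & MASK80            # rotate left 61
        key = (SBOX[key >> 76] << 76) | (key & ((1 << 76) - 1))
        key ^= (i + 1) << 15                                  # round counter
    return keys


def present80_encrypt(plaintext, key, rounds=31):
    """Encrypt one 64-bit block; rounds < 31 gives a round-reduced variant
    (our convention: the key after the last round acts as whitening key)."""
    ks = round_keys_80(key)
    x = plaintext
    for i in range(rounds):
        x = p_layer(s_layer(x ^ ks[i]))   # F_i(x) = P(S(x xor K_i))
    return x ^ ks[rounds]                 # post-whitening key


print(hex(present80_encrypt(0, 0)))
```

A bit-by-bit permutation loop is slow but keeps the correspondence with the description of *P* obvious; table-based implementations are the usual choice when speed matters.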

### 4.1 Keys and Linear Hulls in PRESENT

One of the first thorough treatments of linear cryptanalysis on PRESENT is by Ohkuma [39]. This work defines *optimal linear trails* using solely masks of Hamming weight one. Furthermore, 64 *optimal hulls* using these trails are determined, along with the number of trails in each hull.

The absolute correlation of an *R*-round optimal trail *t* is \(|{\mathbf {C}}_{E_K}(t)| = 2^{-2R}\). Considering a particular *R*-round optimal hull \(\text {LH}_R(\delta ,\gamma )\), let \(T_R^+\) (respectively \(T_R^-\)) denote the number of trails *t* in the hull for which \( sgn (t)=0\) (respectively \( sgn (t)=1\)). We also let \(T_R := \sharp \text {LH}_R(\delta ,\gamma )\), i.e. \(T_R = T_R^+ + T_R^-\). By Assumption 1, for a fixed key \(K \in {\mathcal {K}}\), we have \(T_R^+ \sim \mathcal{B}\left( T_R, \frac{1}{2}\right) \), which for sufficiently large \(T_R\) is well approximated by \(T_R^+ \sim \mathcal{N}\left( \frac{T_R}{2}, \frac{T_R}{4}\right) \). Let \(Z = T_R^+ - T_R^- = 2T_R^+ - T_R\). Thus, *Z* is normally distributed with \(\mu = 2 \cdot \frac{T_R}{2} - T_R = 0\) and \(\sigma ^2 = 2^2 \cdot \frac{T_R}{4} = T_R\), so \(Z \sim \mathcal {N}(0, T_R)\). When \(|Z| \ge N\), for some *N*, where \(0 \le N \le T_R\), the absolute linear hull correlation is

\[|{\mathbf {C}}_{E_K}(\text {LH}_R(\delta ,\gamma ))| = |Z| \cdot 2^{-2R} \ge N \cdot 2^{-2R},\]

which happens with probability \(\Pr \left[ |Z| \ge N \right] = 2 \left( 1 - {\varPhi }(\mathcal{N}(0,T_R),N) \right) \).

For a fixed number of rounds *R*, using the analysis above, \(T_R\) can be used directly to determine (i) a lower bound on \(|{\mathbf {C}}_{E_K}|\) and (ii) the probability that for a random \(K \in {\mathcal {K}}\), this bound is obtained. Table 1 gives, for various probabilities \(\alpha \) and number of rounds *R*, the value \(\beta \) such that \(\alpha = \Pr \left[ |{\mathbf {C}}_{E_K}| \ge \beta \right] \). Table 7 in Appendix B gives the same data points for \(R \in \{1,\ldots ,31\}\).

Table 1. Values \(\log _2 \beta \) s.t. \(\alpha = \Pr \left[ |{\mathbf {C}}_{E_K}| \ge \beta \right] \) for *R*-round PRESENT

*R* / \(\alpha \) | 0.01 | 0.05 | 0.10 | 0.30 | 0.50 | 0.70 | 0.90 | 0.95 | 0.99 |
---|---|---|---|---|---|---|---|---|---|
7 | \(-9.55\) | \(-9.94\) | \(-10.20\) | \(-10.86\) | \(-11.48\) | \(-12.29\) | \(-13.91\) | \(-14.91\) | \(-17.23\) |
11 | \(-14.74\) | \(-15.14\) | \(-15.39\) | \(-16.06\) | \(-16.68\) | \(-17.48\) | \(-19.10\) | \(-20.10\) | \(-22.43\) |
16 | \(-21.27\) | \(-21.66\) | \(-21.92\) | \(-22.58\) | \(-23.20\) | \(-24.01\) | \(-25.63\) | \(-26.63\) | \(-28.95\) |
24 | \(-31.71\) | \(-32.11\) | \(-32.36\) | \(-33.03\) | \(-33.65\) | \(-34.46\) | \(-36.07\) | \(-37.07\) | \(-39.40\) |
26 | \(-34.33\) | \(-34.72\) | \(-34.97\) | \(-35.64\) | \(-36.26\) | \(-37.07\) | \(-38.68\) | \(-39.69\) | \(-42.01\) |
28 | \(-36.94\) | \(-37.33\) | \(-37.58\) | \(-38.25\) | \(-38.87\) | \(-39.68\) | \(-41.30\) | \(-42.30\) | \(-44.62\) |
31 | \(-40.85\) | \(-41.25\) | \(-41.50\) | \(-42.17\) | \(-42.79\) | \(-43.60\) | \(-45.21\) | \(-46.22\) | \(-48.54\) |

### *Example 1*

For \(R=28\), we have \(T_{28} = 45170283840\). Thus, with probability \(\alpha = 0.30\), a randomly chosen \(K \in {\mathcal {K}}\) yields that one of Ohkuma’s optimal hulls has \(|{\mathbf {C}}_{E_K}| \ge 2^{-38.25}\).
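The value in Example 1 can be reproduced numerically. The sketch below assumes the normal model \(Z \sim \mathcal {N}(0, T_R)\) and the relation \(|{\mathbf {C}}_{E_K}| = |Z| \cdot 2^{-2R}\) from this section; the function name `log2_beta` is illustrative and not taken from the paper or its source code [29].

```python
from math import log2, sqrt
from statistics import NormalDist


def log2_beta(num_trails, rounds, alpha):
    """log2 of beta such that Pr[|C| >= beta] = alpha for an optimal hull.

    Model: Z ~ N(0, T_R) and |C| = |Z| * 2^(-2R).
    """
    # Two-sided tail: Pr[|Z| >= N] = alpha  <=>  N = sqrt(T_R) * Phi^{-1}(1 - alpha/2)
    n = sqrt(num_trails) * NormalDist().inv_cdf(1 - alpha / 2)
    return log2(n) - 2 * rounds


T_28 = 45170283840  # number of trails in a 28-round optimal hull (Example 1)
print(round(log2_beta(T_28, 28, 0.30), 2))
```

Under these assumptions the printed value agrees with the entry for \(R=28\), \(\alpha = 0.30\) in Table 1, and other entries of that row are obtained by changing `alpha`.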

## 5 Application to PRESENT

In this section we give key-less linear distinguishers on PRESENT for varying parameters: the number of rounds *R*, the pre-computation time *T*, the size \(\mathcal{M}\) of the set \({\mathcal {S}}\) produced, and the lower bound \(\alpha \) on the fraction of the key space for which they are valid.

As already hinted in Sect. 4, PRESENT has received some attention in the context of key-recovery attacks, especially with respect to linear cryptanalysis [13, 15, 30, 39] on which our results build. The attack described is completely independent of the key size used, and hence also of the key-schedule.

### 5.1 Distinguishers with \(T=0\)

In this section we present key-less linear distinguishers on PRESENT using the model introduced in Sect. 3. We refer to the approach described here as the *probabilistic phase*, which in Sect. 5.2 is combined with a *deterministic phase* to extend the distinguishers by three more rounds.

The distinguishers we present here do not use any pre-computation, i.e. in the notation of the model, we have \(T=0\). Corollary 1 implies in this case that when \(|{\mathbf {C}}_{E_K}| > 0\), the tuple produced by any algorithm \({\mathcal {A}}\) is always \((T,\mathcal{M},\alpha )\)-intractable for some \(\alpha > 0\), and hence a valid distinguisher. The results match those of distinguishers used in key-recovery attacks and are as such of limited interest. We hope the discussion below makes it easier to follow (and appreciate) the real use of the model we introduced, namely for the case described in Sect. 5.2 when we do some, albeit very little, pre-computation.

Table 2 gives, for various set sizes \(\mathcal{M}\) and numbers of rounds *R*, lower bounds \(\alpha \) on the fraction of the key space, s.t. \(({\mathcal {A}},\mathcal{E}, E_K, {\mathcal {S}}, T=0, \mathcal{R}_{\delta ,\gamma }^{E_K}, \alpha )\) are key-less linear distinguishers.

Table 2. Lower bounds \(\alpha \) on the fraction of the key space \({\mathcal {K}}\) susceptible to key-less linear distinguishers using \(T = 0\) and the specified parameters \(\mathcal{M}\) and number of rounds *R*. A dash indicates that \(\alpha \) rounds to 0.00 at the displayed precision.

\(\mathcal M\) / rounds | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
\(2^{40}\) | 0.96 | 0.89 | 0.74 | 0.41 | 0.04 | - | - | - | - | - | - | - | - | - |
\(2^{44}\) | 0.99 | 0.97 | 0.93 | 0.84 | 0.61 | 0.21 | - | - | - | - | - | - | - | - |
\(2^{46}\) | 0.99 | 0.99 | 0.97 | 0.92 | 0.80 | 0.53 | 0.12 | - | - | - | - | - | - | - |
\(2^{52}\) | 1.00 | 1.00 | 1.00 | 0.99 | 0.97 | 0.94 | 0.85 | 0.63 | 0.24 | - | - | - | - | - |
\(2^{54}\) | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 0.97 | 0.92 | 0.81 | 0.55 | 0.14 | - | - | - | - |
\(2^{56}\) | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.98 | 0.96 | 0.90 | 0.77 | 0.46 | 0.07 | - | - | - |
\(2^{62}\) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.97 | 0.93 | 0.82 | 0.58 | 0.17 | - |
\(2^{63}\) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.98 | 0.95 | 0.87 | 0.69 | 0.33 | 0.02 |
\(2^{64}\) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 0.96 | 0.91 | 0.78 | 0.49 | 0.09 |

Note that the \(\alpha \) parameter from Table 2 immediately gives the probability that such an *R*-round key-less linear distinguisher without pre-computation for PRESENT is valid in practice for a fixed chosen or known key \(K \in {\mathcal {K}}\). For example, with \(\mathcal{M}= 2^{40}\), the probability of having a valid key-less linear distinguisher for 13-round PRESENT with a fixed key *K* is *at least* \(\alpha = 0.41\). Another example is a key-less linear distinguisher on 22-round PRESENT which is valid for a fraction of at least \(\alpha = 0.33\) of the key space, using \(\mathcal{M}= 2^{63}\).
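To illustrate how such \(\alpha \) values relate to the hull analysis of Sect. 4.1, the following sketch estimates \(\alpha = \Pr \left[ |{\mathbf {C}}_{E_K}| \ge 2/\sqrt{\mathcal{M}}\right] \) under the normal model. As an assumption made only for this example, the scale of the hull correlation is recovered from the \(\alpha = 0.50\) column of Table 1 instead of from the trail counts \(T_R\), which are not all listed here; the names are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()

# log2(beta) at alpha = 0.50, copied from Table 1 and indexed by rounds R.
TABLE1_MEDIAN = {7: -11.48, 11: -16.68, 16: -23.20, 24: -33.65,
                 26: -36.26, 28: -38.87, 31: -42.79}


def valid_key_fraction(hull_rounds, log2_m):
    """Estimate alpha = Pr[|C| >= 2 / sqrt(M)] for an optimal hull."""
    # |C| = scale * |G| with G standard normal. The median of |G| is
    # Phi^{-1}(0.75), so scale = beta_median / Phi^{-1}(0.75).
    scale = 2.0 ** TABLE1_MEDIAN[hull_rounds] / STD.inv_cdf(0.75)
    threshold = 2.0 ** (1 - log2_m / 2)  # equals 2 / sqrt(M)
    return 2 * (1 - STD.cdf(threshold / scale))


print(round(valid_key_fraction(11, 40), 2))
print(round(valid_key_fraction(16, 46), 2))
```

Because the Table 1 inputs are rounded to two decimals, the results are approximate. For very small \(\alpha \) the expression `1 - STD.cdf(...)` loses precision in floating point, so this sketch is only suitable for entries that are not deep in the tail.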

### 5.2 Extension by Deterministic Phase

Next, we describe how one can use pre-computation to extend the key-less linear distinguishers from Sect. 5.1 to cover three more rounds with no degradation to the valid key space fraction \(\alpha \). In the notation of the model, we now have \(T > 0\), which in turn means that \((T,\mathcal{M},\alpha )\)-intractability is no longer granted for free by Corollary 1, unless \(T < 2\sqrt{\mathcal{M}}\). In Appendix D we outline an approach for a deterministic phase over 6 rounds, reminiscent of the rebound approach [28, 35], which, however, has too high a computational complexity to fit into our model.

We refer to the pre-computation described in the following as the *deterministic phase*.

Let \(S_{r,j}\) denote the *j*th S-box (counting from right to left) of round \({R}_{r}\) and let \(K_{r,j}\) denote the *j*th least significant bit of the round key \(K_r\), where all indices start from zero. Consider then \(S_{2,5}\), which is highlighted in Fig. 2. By inspection, the PRESENT S-box has 10 inputs *x* which satisfy \(\langle x,(0,0,1,0) \rangle = \langle S(x),(0,0,1,0) \rangle \) and hence follow the trail \((\mathbf {e}_{21},\mathbf {e}_{21})\) over \({R}_{2}\), no matter what the inputs on the other S-boxes are. By adding the key bits \((K_{2,23} \Vert \cdots \Vert K_{2,20})\) to each *x*, we can trace those back through the permutation layer of \({R}_{1}\). For each value of \(x \oplus (K_{2,23} \Vert \cdots \Vert K_{2,20})\), we now have a particular value on output bit 1 of each of the S-boxes \(S_{1,7},\ldots ,S_{1,4}\), as indicated in Fig. 2. By the bijectivity of the S-box, it holds that for each of these S-boxes, half the inputs will give the desired output bit. However, for the S-box \(S_{1,5}\) we have the extra requirement that the input bit on position 1 should equal the output bit on position 1, and only 5 inputs have both properties. As such, we can trace each of the ten values for *x* back through \({R}_{1}\), also adding the key bits \((K_{1,31} \Vert \cdots \Vert K_{1,16})\), to obtain \(10 \cdot 8^3 \cdot 5 = 25600\) inputs to \({R}_{2} \circ {R}_{1}\) which follow the trail \((\mathbf {e}_{21},\mathbf {e}_{21},\mathbf {e}_{21})\) by construction. By tracing each of these values back through \({R}_0\) the same way, and adding the full round key \(K_0\), algorithm \({\mathcal {A}}\) has a construction of the set \({\mathcal {S}}\) which consists of inputs which follow \({\mathcal {T}}\) over three rounds with probability 1. Using this approach to constructing \({\mathcal {S}}\), the size of the set can be *up to* \(\mathcal{M}= 25600 \cdot 8^{15} \cdot 5 = 4503599627370496000 \approx 2^{61.97}\). As such, if one should wish to use a smaller \(\mathcal{M}\) for the key-less linear distinguisher, this is also possible, simply by leaving out elements in the construction of \({\mathcal {S}}\).
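The S-box counts used in this construction can be checked by exhaustive enumeration. The sketch below assumes the S-box table from the PRESENT specification [10] and reads the mask \((0,0,1,0)\) as selecting bit 1, counting from the least significant bit at index zero; the helper names are illustrative.

```python
from math import log2

# PRESENT S-box from the specification [10] (assumed here, see Appendix A).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def bit(v, i):
    """Bit i of v, with bit 0 the least significant."""
    return (v >> i) & 1


# Inputs whose bit 1 equals bit 1 of the S-box output: they follow the
# single-bit trail through this S-box with probability 1.
follow = [x for x in range(16) if bit(x, 1) == bit(SBOX[x], 1)]


def preimages(v):
    """All inputs whose output bit 1 equals the prescribed value v."""
    return [x for x in range(16) if bit(SBOX[x], 1) == v]


def constrained(v):
    """Inputs that follow the trail and produce output bit 1 equal to v."""
    return [x for x in follow if bit(SBOX[x], 1) == v]


two_round_inputs = len(follow) * 8**3 * 5   # inputs to R_2 o R_1
max_set_size = two_round_inputs * 8**15 * 5  # maximum size M of S

print(len(follow), two_round_inputs, max_set_size, round(log2(max_set_size), 2))
```

The enumeration only confirms the counting argument; it does not build \({\mathcal {S}}\) itself, which additionally requires the round keys and the inverse permutation layer.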

Table 3. Tight values \(\alpha \) such that \((\mathcal{E}, E_K, {\mathcal {S}}, \mathcal{R}_{\delta ,\gamma }^{E_K})\) is \(\alpha \)-separable, where \(E_K\) is *R*-round PRESENT for a fixed, known \(K \in {\mathcal {K}}\) (and thus \(E_K \in \mathcal{E}\))

Rounds | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 |
---|---|---|---|---|---|---|---|---|---|
\(\alpha \) | 0.998 | 0.995 | 0.988 | 0.970 | 0.926 | 0.819 | 0.571 | 0.162 | 0.001 |

Consider \(E_K\) being *R*-round PRESENT for a particular fixed \(K \in {\mathcal {K}}\), and thus \(E_K \in \mathcal{E}\). Let \({\mathcal {A}}\) be an algorithm for constructing \({\mathcal {S}}\) using the 3-round deterministic technique described, with \(\mathcal{M}\approx 2^{61.97}\) for one of Ohkuma’s optimal linear hull relations \(\mathcal{R}_{\delta ,\gamma }^{E_K}\). Table 3 gives, for various numbers of rounds *R*, the highest possible \(\alpha \) s.t. \((\mathcal{E}, E_K, {\mathcal {S}}, \mathcal{R}_{\delta ,\gamma }^{E_K})\) is \(\alpha \)-separable as per Definition 1. Of course, in order for the key-less linear distinguisher \(({\mathcal {A}}, \mathcal{E}, E_K, {\mathcal {S}}, T, \mathcal{R}_{\delta ,\gamma }^{E_K}, \alpha )\) to be valid, it also has to hold that the tuple \((\mathcal{E}, E_K, {\mathcal {S}}, \mathcal{R}_{\delta ,\gamma }^{E_K})\) is \((T,\mathcal{M},\alpha )\)-intractable as per Definition 2, where *T* is the time required by \({\mathcal {A}}\) to construct the set \({\mathcal {S}}\).

In Sect. 5.3, we show that the time *T* required to construct \({\mathcal {S}}\) by \({\mathcal {A}}\) is equivalent to \(T = \frac{409641}{16R}\) calls to an *R*-round PRESENT encryption oracle. As such, we have that \(T < 2\sqrt{\mathcal{M}}\), and from Corollary 1, it follows that \((\mathcal{E}, E_K, {\mathcal {S}}, \mathcal{R}_{\delta ,\gamma }^{E_K})\) is \((T,\mathcal{M},\alpha )\)-intractable.

In Appendix C, we give examples of experimental verification of the key-less linear distinguishers presented on 9-round PRESENT. The code for this experimental verification is available as [29].

### 5.3 Computational Complexity *T*

In this section we analyze the computational complexity, i.e. the time *T* required by \({\mathcal {A}}\) to construct \({\mathcal {S}}\) in the deterministic phase of Sect. 5.2. In the key-less setting, the attacker has white-box access to the encryption oracle. This is what is exploited by \({\mathcal {A}}\). In order to measure the time *T* spent in this phase, we determine the number of S-box lookups performed by \({\mathcal {A}}\) and then compare this to the number of S-box applications for a full call to the encryption oracle.

Let us consider all S-boxes as being different for generality, as the complexity in this case will certainly upper bound the case where they are all equal. In particular, since the key is known, this allows us to consider the key addition as part of the S-boxes.

The analysis follows the construction of \({\mathcal {S}}\) by \({\mathcal {A}}\) itself, starting from \({R}_2\) and working its way up (referring again to Fig. 2). To determine the 10 inputs to \(S_{2,5}\), \({\mathcal {A}}\) performs one lookup into this S-box. For each of these 10 values, one bit is traced back to each of four S-boxes of \({R}_1\), so this adds \(10 \cdot 4\) S-box lookups. Finally, \({\mathcal {A}}\) has 25600 inputs to \({R}_1\) for which it traces one bit back to each of the 16 S-boxes of \({R}_0\), contributing \(25600\cdot 16\) S-box lookups.

In total, the number of lookups is \(1+10\cdot 4+25600\cdot 16 = 409641\). Now, comparing to the number of S-box lookups involved with a call to an *R*-round PRESENT oracle, the number of lookups would be 16*R*, not counting key scheduling. As such, we find that the time *T* spent by \({\mathcal {A}}\) for constructing \({\mathcal {S}}\) is \(T = \frac{409641}{16R}\).
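The arithmetic above, together with the condition \(T < 2\sqrt{\mathcal{M}}\) needed for Corollary 1, can be checked with a few lines of Python. This is a sketch of the counting argument only; the constant and function names are illustrative.

```python
from math import log2, sqrt

# S-box lookups: one for S_{2,5}, four per input to S_{2,5} in R_1,
# sixteen per input to R_1 in R_0.
LOOKUPS = 1 + 10 * 4 + 25600 * 16

# Maximum size of the set S produced by the deterministic phase.
MAX_SET_SIZE = 25600 * 8**15 * 5


def precomputation_time(rounds):
    """Time T in units of one call to a rounds-round PRESENT oracle."""
    return LOOKUPS / (16 * rounds)


for rounds in (14, 27):
    t = precomputation_time(rounds)
    # Print the round count, log2(T), and whether T < 2 * sqrt(M) holds.
    print(rounds, round(log2(t), 1), t < 2 * sqrt(MAX_SET_SIZE))
```

For every number of rounds from 1 to 31, the resulting *T* stays far below \(2\sqrt{\mathcal{M}} \approx 2^{31.98}\) at the maximum set size.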

**Memory Complexity.** The memory complexity, though not a formal part of the key-less linear distinguisher model, is at a practical level. The storage of the set \({\mathcal {S}}\) can be encoded efficiently with two lists. In a first list with 4-bit entries of length 256 (128 bytes), we store all possible input values before the first subkey addition. The second list contains 25600 sets of 16 indices to the first list. Even a naïve encoding of this needs only about 400 kB.

### 5.4 Overview of Selected Distinguishers and Discussion

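Let \(w_2 \le 10\) and \(w_1 \le 25600\) denote the number of inputs to \(S_{2,5}\) and to \({R}_1\), respectively, that \({\mathcal {A}}\) uses when constructing a set \({\mathcal {S}}\) of less than the maximum size. Following the analysis of Sect. 5.3, the time *T* required by \({\mathcal {A}}\) to construct \({\mathcal {S}}\) is \(T = \frac{1 + 4w_2 + 16w_1}{16R}\) for *R*-round PRESENT. Obviously, for a fixed target size \(\mathcal{M}\), minimizing \(w_1\) yields the lower time complexity *T*.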
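The formula \(T = \frac{1 + 4w_2 + 16w_1}{16R}\) can be evaluated for a given target size \(\mathcal{M}\). The sketch below rests on an assumption that is not stated explicitly in this section: \(w_1\) is taken as the smallest number of inputs to \({R}_1\) that yields at least \(\mathcal{M}\) elements (each contributing \(5 \cdot 8^{15}\) of them), and \(w_2\) as the smallest number of inputs to \(S_{2,5}\) that yields \(w_1\) such inputs (each contributing \(8^3 \cdot 5 = 2560\)). The names are hypothetical.

```python
from math import log2

PER_R1_INPUT = 5 * 8**15   # elements of S obtained from one input to R_1
PER_S25_INPUT = 8**3 * 5   # inputs to R_1 obtained from one input to S_{2,5}


def log2_time(set_size, rounds):
    """log2 of T = (1 + 4*w2 + 16*w1) / (16*R) for an integer set size M."""
    w1 = -(-set_size // PER_R1_INPUT)   # ceiling division (assumed choice of w1)
    w2 = -(-w1 // PER_S25_INPUT)        # ceiling division (assumed choice of w2)
    return log2((1 + 4 * w2 + 16 * w1) / (16 * rounds))


for set_size, rounds in ((2**40, 18), (2**49, 14), (2**61, 26)):
    print(rounds, round(log2_time(set_size, rounds), 1))
```

With this choice of \(w_1\) and \(w_2\), small sets need only \(w_1 = w_2 = 1\), i.e. 21 S-box lookups, which is why *T* can be below one oracle call.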

Table 4. Overview of parameters for key-less linear distinguishers on PRESENT. The entries give, for each \(\mathcal{M}\) and each total number of rounds *R*, a pair \((\log _2 T, \log _2 (\alpha \cdot 2^{128}))\) s.t. algorithm \({\mathcal {A}}\) can construct \({\mathcal {S}}\) in time *T* and result in a distinguisher for *at least* a fraction \(\alpha \) of the key space. Here, we indicate for PRESENT-128 the number of keys supporting the distinguisher. The equivalent number for PRESENT-80 is obtained as \(\alpha \cdot 2^{80}\). A dash indicates that \(\log _2 (\alpha \cdot 2^{128}) < 0\), i.e. that fewer than one key is expected to support the distinguisher.

\(\mathcal M\) / rounds | 14 | 18 | 22 | 25 | 26 | 27 | 28 |
---|---|---|---|---|---|---|---|
\(2^{22}\) | - | - | - | - | - | - | - |
\(2^{25}\) | - | - | - | - | - | - | - |
\(2^{28}\) | \((-3.4, 70.9)\) | - | - | - | - | - | - |
\(2^{31}\) | \((-3.4, 119.2)\) | - | - | - | - | - | - |
\(2^{34}\) | \((-3.4, 126.2)\) | - | - | - | - | - | - |
\(2^{37}\) | \((-3.4, 127.5)\) | - | - | - | - | - | - |
\(2^{40}\) | \((-3.4, 127.8)\) | \((-3.8, 107.1)\) | - | - | - | - | - |
\(2^{43}\) | \((-3.4, 127.9)\) | \((-3.8, 124.3)\) | - | - | - | - | - |
\(2^{46}\) | \((-3.4, 128.0)\) | \((-3.8, 127.1)\) | - | - | - | - | - |
\(2^{49}\) | \((-1.7, 128.0)\) | \((-2.1, 127.7)\) | \((-2.4, 75.1)\) | - | - | - | - |
\(2^{52}\) | \((0.9, 128.0)\) | \((0.5, 127.9)\) | \((0.3, 119.8)\) | - | - | - | - |
\(2^{55}\) | \((3.9, 128.0)\) | \((3.5, 128.0)\) | \((3.2, 126.3)\) | - | - | - | - |
\(2^{58}\) | \((6.9, 128.0)\) | \((6.5, 128.0)\) | \((6.2, 127.5)\) | \((6.0, 103.1)\) | - | - | - |
\(2^{61}\) | \((9.9, 128.0)\) | \((9.5, 128.0)\) | \((9.2, 127.8)\) | \((9.0, 123.7)\) | \((9.0, 108.5)\) | \((8.9, 21.0)\) | - |
\(2^{61.97}\) | \((10.8, 128.0)\) | \((10.5, 128.0)\) | \((10.2, 127.9)\) | \((10.0, 125.4)\) | \((9.9, 117.1)\) | \((9.9, 71.8)\) | - |

Using these simple observations, we give in Table 4 an overview of selected results for key-less linear distinguishers on *R*-round PRESENT. We give the size \(\mathcal{M}\) of \({\mathcal {S}}\subseteq {\mathbb {F}}_2^n\) constructed by \({\mathcal {A}}\), the time *T* required to do so, and the parameter \(\alpha \) (implicitly, as we give \(\alpha \cdot 2^{128}\)) for the distinguisher, i.e. the lower bound on the fraction of the key space for which the distinguisher is valid. As such, the table is representative for PRESENT-128. Numbers for PRESENT-80 can be directly determined with the same *T* and \(\alpha \cdot 2^{80}\). Note, however, that for 27-round PRESENT-80 using \(\mathcal{M}= 2^{61.97}\), \(\alpha \cdot 2^{80} < 0\), so one can distinguish at most 26 rounds of PRESENT-80.

What is evident from Table 4 is that there is a clear limit to how many rounds can be distinguished using a particular \(\mathcal{M}\). This shows in the diagonal line through the table. Another observation is that for a fixed \(\mathcal{M}\), there is a clear drop in the fraction of the key space \(\alpha \) for which the distinguisher works between *R* and \(R+1\) rounds. For example, with \(\mathcal{M}= 2^{61}\), we see a drop from \(2^{108.5}\) keys supporting the distinguisher for 26 rounds to just \(2^{21}\) for 27 rounds. What is also apparent is that in all cases, \(T \ll 2\sqrt{\mathcal{M}}\), indeed sometimes \(T < 1\), so by Corollary 1, \((T,\mathcal{M},\alpha )\)-intractability is granted.

One thing worth discussing is the time complexity *T*. This is the time, converted to equivalent calls to an *R*-round encryption oracle, required by the key-less linear distinguisher algorithm \({\mathcal {A}}\) to construct the set \({\mathcal {S}}\). In a scenario where one would verify the distinguisher for a concrete block cipher \(E_K\), i.e. for a particular value of *K*, one would need to determine the value of the random variable \(\mathcal{X}\) of Definition 1. What we denote as the *verifying complexity* in this case is dominated by \(\mathcal{M}\), because this is the number of inputs to the permutation that need to be evaluated in order to determine \(\mathcal{X}\).
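The verification step can be sketched on a toy scale as follows. The sketch assumes that the relation \(\mathcal{R}_{\delta ,\gamma }^{\pi }(x)\) holds when \(\langle \delta , x \rangle = \langle \gamma , \pi (x) \rangle \) and uses the deviation \(\left| \mathcal{X} - \frac{\mathcal{M}}{2}\right| \ge \sqrt{\mathcal{M}}\) from the proof in Sect. 3 as the acceptance criterion; the permutation, masks and function names are placeholders and not part of the paper.

```python
from math import sqrt


def parity(v):
    """Parity of the bits of v, i.e. the inner product over F_2 with all-ones."""
    return bin(v).count("1") & 1


def count_relation(perm, inputs, delta, gamma):
    """Number of inputs x with <delta, x> == <gamma, perm(x)>.

    This performs one evaluation of perm per element, so the cost is M calls.
    """
    return sum(1 for x in inputs
               if parity(delta & x) == parity(gamma & perm(x)))


def deviates(count, set_size):
    """True if the count differs from M/2 by at least sqrt(M)."""
    return abs(count - set_size / 2) >= sqrt(set_size)


# Toy example: the identity on 8-bit values stands in for E_K.
inputs = range(256)
biased = count_relation(lambda x: x, inputs, delta=0x01, gamma=0x01)
balanced = count_relation(lambda x: x, inputs, delta=0x01, gamma=0x02)
print(biased, deviates(biased, 256), balanced, deviates(balanced, 256))
```

For an actual experiment, `perm` would be a round-reduced PRESENT encryption under the fixed key and `inputs` the set \({\mathcal {S}}\) produced by \({\mathcal {A}}\).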

## 6 Conclusion and Open Problems

In this paper we have formalized the notion of distinguishers for block ciphers using linear cryptanalysis in the key-less setting, i.e. where the block cipher is instantiated with a single known or chosen key.

The introduced key-less statistical distinguisher based on linear cryptanalysis led to a wide variety of results on PRESENT, for example a linear distinguisher of up to 26 and 27 rounds of PRESENT-80 and PRESENT-128, with respective computational complexities of about \(2^{9}\) and \(2^{10}\), and verifying complexities of about \(2^{61}\) and \(2^{61.97}\), for both PRESENT variants. The very low computational complexity made a practical verification possible for a reduced number of rounds, but also leaves room for improvements: Is it possible to extend the deterministic phase to cover more rounds while still keeping the work factor below the allowed \(2^{30}\)?

While PRESENT was chosen because it is a relatively high profile cryptanalytic target and because relatively long useful linear hulls exist, we point out that the new distinguisher model is not specifically tailored for it. KATAN, a cipher with a very different round transformation and design philosophy, exhibits linear effects as described in [14] that make it another interesting target for an application of the techniques introduced in this paper.

More research is needed on the relations between the use of degrees of freedom and the number of rounds that can be sidestepped, e.g. in our deterministic phase. Even though there is no good theoretical understanding of this yet, the literature already contains many data points for differential properties. The linear counterpart seems different and interesting enough to warrant a separate study, see also Appendix D.

The techniques we developed for the presented distinguisher might also have applications to preimage attacks that are inspired by linear cryptanalysis, or at least to somewhat speed up brute-force preimage search. It will be interesting to see how this approach compares to other such methods [9, 41]. Also, the approach naturally and directly applies to permutations, which are becoming an increasingly important primitive in their own right, also due to the popularization of the Sponge [4] construction.

## Footnotes

- 1. Authors mention that 92 degrees of freedom out of 192 (from key and plaintext input) are left for the outbound phase.
- 2. One exception being [1].
- 3. We emphasize here that the application to PRESENT later on in the paper will not be a preimage attack.
- 4. *S* and *P* are called sBoxLayer and pLayer, respectively, in the specification.

## Notes

### Acknowledgments

We would like to thank Mohamed Ahmed Abdelraheem, Dmitry Khovratovich, Gregor Leander, and Tyge Tiessen for helpful discussions on the paper.

## References

- 1. Andreeva, E., Bogdanov, A., Mennink, B.: Towards understanding the known-key security of block ciphers. In: Moriai, S. (ed.) FSE 2013. LNCS, vol. 8424, pp. 348–366. Springer, Heidelberg (2014)
- 2. Ashur, T., Dunkelman, O.: Linear analysis of reduced-round cubehash. In: Lopez, J., Tsudik, G. (eds.) ACNS 2011. LNCS, vol. 6715, pp. 462–478. Springer, Heidelberg (2011)
- 3. Baignères, T., Junod, P., Vaudenay, S.: How far can we go beyond linear cryptanalysis? In: Lee, P.J. (ed.) ASIACRYPT 2004. LNCS, vol. 3329, pp. 432–450. Springer, Heidelberg (2004)
- 4. Bertoni, G., Daemen, J., Peeters, M., Van Assche, G.: On the indifferentiability of the sponge construction. In: Smart, N.P. (ed.) EUROCRYPT 2008. LNCS, vol. 4965, pp. 181–197. Springer, Heidelberg (2008)
- 5. Biryukov, A., De Cannière, C., Quisquater, M.: On multiple linear approximations. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 1–22. Springer, Heidelberg (2004)
- 6. Biryukov, A., Dunkelman, O., Keller, N., Khovratovich, D., Shamir, A.: Key recovery attacks of practical complexity on AES-256 variants with up to 10 rounds. In: Gilbert, H. (ed.) EUROCRYPT 2010. LNCS, vol. 6110, pp. 299–319. Springer, Heidelberg (2010)
- 7. Biryukov, A., Khovratovich, D.: Related-key cryptanalysis of the full AES-192 and AES-256. In: Matsui [33], pp. 1–18
- 8. Biryukov, A., Khovratovich, D., Nikolić, I.: Distinguisher and related-key attack on the full AES-256. In: Halevi, S. (ed.) CRYPTO 2009. LNCS, vol. 5677, pp. 231–249. Springer, Heidelberg (2009)
- 9. Bogdanov, A., Khovratovich, D., Rechberger, C.: Biclique cryptanalysis of the full AES. In: Lee, D.H., Wang, X. (eds.) ASIACRYPT 2011. LNCS, vol. 7073, pp. 344–371. Springer, Heidelberg (2011)
- 10. Bogdanov, A.A., Knudsen, L.R., Leander, G., Paar, C., Poschmann, A., Robshaw, M., Seurin, Y., Vikkelsoe, C.: PRESENT: an ultra-lightweight block cipher. In: Paillier, P., Verbauwhede, I. (eds.) CHES 2007. LNCS, vol. 4727, pp. 450–466. Springer, Heidelberg (2007)
- 11. Bogdanov, A., Leander, G., Paar, C., Poschmann, A., Robshaw, M.J.B., Seurin, Y.: Hash functions and RFID tags: mind the gap. In: Oswald, E., Rohatgi, P. (eds.) CHES 2008. LNCS, vol. 5154, pp. 283–299. Springer, Heidelberg (2008)
- 12. Bogdanov, A., Rijmen, V.: Linear hulls with correlation zero and linear cryptanalysis of block ciphers. Des. Codes Cryptogr. **70**(3), 369–383 (2014)
- 13. Bulygin, S.: More on linear hulls of present-like ciphers and a cryptanalysis of full-round EPCBC-96. IACR Cryptol. ePrint Arch. **2013**, 28 (2013)
- 14. De Cannière, C., Dunkelman, O., Knežević, M.: KATAN and KTANTAN — a family of small and efficient hardware-oriented block ciphers. In: Clavier, C., Gaj, K. (eds.) CHES 2009. LNCS, vol. 5747, pp. 272–288. Springer, Heidelberg (2009)
- 15. Cho, J.Y.: Linear cryptanalysis of reduced-round PRESENT. In: Pieprzyk [40], pp. 302–317
- 16. Cho, J.Y., Hermelin, M., Nyberg, K.: A new technique for multidimensional linear cryptanalysis with applications on reduced round serpent. In: Lee, P.J., Cheon, J.H. (eds.) ICISC 2008. LNCS, vol. 5461, pp. 383–398. Springer, Heidelberg (2009)
- 17. Daemen, J., Rijmen, V.: The Design of Rijndael: AES - The Advanced Encryption Standard. Information Security and Cryptography. Springer, Heidelberg (2002)
- 18. Gilbert, H.: A simplified representation of AES. In: Sarkar, P., Iwata, T. (eds.) ASIACRYPT 2014. LNCS, vol. 8873, pp. 200–222. Springer, Heidelberg (2014)
- 19. Gilbert, H., Peyrin, T.: Super-Sbox cryptanalysis: improved attacks for AES-like permutations. In: Hong, S., Iwata, T. (eds.) FSE 2010. LNCS, vol. 6147, pp. 365–383. Springer, Heidelberg (2010)
- 20. Kaliski Jr., B.S., Robshaw, M.: Linear cryptanalysis using multiple approximations. In: Desmedt, Y.G. (ed.) CRYPTO 1994. LNCS, vol. 839, pp. 26–39. Springer, Heidelberg (1994)
- 21. Khovratovich, D., Naya-Plasencia, M., Röck, A., Schläffer, M.: Cryptanalysis of *Luffa* v2 components. In: Biryukov, A., Gong, G., Stinson, D.R. (eds.) SAC 2010. LNCS, vol. 6544, pp. 388–409. Springer, Heidelberg (2011)
- 22. Khovratovich, D., Nikolić, I., Rechberger, C.: Rotational rebound attacks on reduced skein. In: Abe, M. (ed.) ASIACRYPT 2010. LNCS, vol. 6477, pp. 1–19. Springer, Heidelberg (2010)
- 23. Knudsen, L.R., Mathiassen, J.E.: A chosen-plaintext linear attack on DES. In: Schneier, B. (ed.) FSE 2000. LNCS, vol. 1978, pp. 262–272. Springer, Heidelberg (2001)
- 24. Knudsen, L.R., Rijmen, V.: Known-key distinguishers for some block ciphers. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS, vol. 4833, pp. 315–324. Springer, Heidelberg (2007)
- 25. Koyama, T., Sasaki, Y., Kunihiro, N.: Multi-differential cryptanalysis on reduced DM-PRESENT-80: collisions and other differential properties. In: Kwon et al. [26], pp. 352–367
- 26. Kwon, T., Lee, M.-K., Kwon, D. (eds.): ICISC 2012. LNCS, vol. 7839. Springer, Heidelberg (2013)
- 27. Lamberger, M., Mendel, F., Rechberger, C., Rijmen, V., Schläffer, M.: Rebound distinguishers: results on the full whirlpool compression function. In: Matsui [33], pp. 126–143
- 28. Lamberger, M., Mendel, F., Schläffer, M., Rechberger, C., Rijmen, V.: The rebound attack and subspace distinguishers: application to whirlpool. J. Cryptol. **28**(2), 1–40 (2015)
- 29. Lauridsen, M.M., Rechberger, C.: Source code for experimental validation. https://github.com/mmeh/present-keyless
- 30. Leander, G.: On linear hulls, statistical saturation attacks, PRESENT and a cryptanalysis of PUFFIN. In: Paterson, K.G. (ed.) EUROCRYPT 2011. LNCS, vol. 6632, pp. 303–322. Springer, Heidelberg (2011)
- 31. Li, Y., Ailan, W.: Linear cryptanalysis for the compression function of hamsi-256. In: Proceedings of the 2011 International Conference on Network Computing and Information Security - vol. 01, NCIS 2011, pp. 302–306. IEEE Computer Society, Washington (2011)
- 32. Matsui, M.: Linear cryptanalysis method for DES cipher. In: Helleseth, T. (ed.) EUROCRYPT 1993. LNCS, vol. 765, pp. 386–397. Springer, Heidelberg (1994)
- 33. Matsui, M. (ed.): ASIACRYPT 2009. LNCS, vol. 5912. Springer, Heidelberg (2009)
- 34. Matsui, M., Yamagishi, A.: A new method for known plaintext attack of FEAL cipher. In: Rueppel, R.A. (ed.) EUROCRYPT 1992. LNCS, vol. 658, pp. 81–91. Springer, Heidelberg (1993)
- 35. Mendel, F., Rechberger, C., Schläffer, M., Thomsen, S.S.: The rebound attack: cryptanalysis of reduced whirlpool and grøstl. In: Dunkelman, O. (ed.) FSE 2009. LNCS, vol. 5665, pp. 260–276. Springer, Heidelberg (2009)
- 36. Mendel, F., Rechberger, C., Schläffer, M., Thomsen, S.S.: Rebound attacks on the reduced grøstl hash function. In: Pieprzyk [40], pp. 350–365
- 37. Murphy, S.: The effectiveness of the linear hull effect. J. Math. Cryptol. **6**(2), 137–147 (2012)
- 38. Nyberg, K.: Linear approximation of block ciphers. In: De Santis, A. (ed.) EUROCRYPT 1994. LNCS, vol. 950, pp. 439–444. Springer, Heidelberg (1995)
- 39. Ohkuma, K.: Weak keys of reduced-round PRESENT for linear cryptanalysis. In: Jacobson Jr., M.J., Rijmen, V., Safavi-Naini, R. (eds.) SAC 2009. LNCS, vol. 5867, pp. 249–265. Springer, Heidelberg (2009)
- 40. Pieprzyk, J. (ed.): CT-RSA 2010. LNCS, vol. 5985. Springer, Heidelberg (2010)
- 41. Rechberger, C.: On bruteforce-like cryptanalysis: new meet-in-the-middle attacks in symmetric cryptanalysis. In: Kwon et al. [26], pp. 33–36
- 42. Wang, X., Yu, H.: How to break MD5 and other hash functions. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 19–35. Springer, Heidelberg (2005)