
1 Introduction

In conventional security notions of Authenticated Encryption (AE), release of the decrypted plaintext is conditioned on successful verification. In their pioneering paper at Asiacrypt 2014, Andreeva et al. challenged this model by introducing and formalizing the idea of releasing unverified plaintexts (RUP) [5, 6]. The idea was motivated by several practical problems faced by the classical approach, such as insufficient memory in constrained environments, real-time usage requirements and efficiency issues. The basic idea is to separate plaintext computation from verification during AE decryption, so that plaintexts are always released irrespective of the status of the verification process. In order to assess security under RUP and to bridge the gap with the classical approach, the authors introduced two new definitions: INT-RUP (for integrity) and plaintext awareness or PA (in combination with IND-CPA) for privacy.

In this work, we try to answer a question pertaining to RUP that arises from a side-channel viewpoint: Can the ability to observe unverified plaintexts serve as a source of side-channel information? Our research reveals that the answer is affirmative with respect to differential fault analysis (DFA) [8, 10–14, 16], which is known to be one of the most effective side-channel attacks on symmetric-key constructions. The basic requirement of any form of fault analysis is the ability to induce a fault in the intermediate state of the cipher and consequently observe the faulty output. Our first observation is that in the classical approach, where successful verification precedes release of plaintexts, fault attacks are infeasible. This is attributed to the fact that if the attacker induces a fault, the probability that the faulty plaintext passes verification is negligible, thereby denying the attacker the ability to observe the faulty output. This scenario changes in the presence of unverified plaintexts. So the first opportunity that RUP places in the hands of the attacker is the ability to observe faulty unverified plaintexts. Our second observation concerns the nonce constraint. In Indocrypt 2014, Saha et al. studied the impact of the nonce constraint in their EscApe fault attack [15] on the authenticated cipher APE [3]. The authors showcased the restriction that the uniqueness of nonces imposes on the replaying criterion (Footnote 1) of fault analysis and demonstrated the idea of faulty collisions to overcome it. In this work we argue that the ability to attack the decryption, provided by RUP, gives the additional benefit of totally bypassing the nonce constraint. This follows from the very definition of AE decryption, which allows an attacker to make multiple queries to the decryption oracle with the same nonce. Thus the prospect of nonce bypass makes fault analysis highly feasible.

Following these observations, we mount Scope: a differential fault attack on the decryption of APE, which is also one of the submissions to the ongoing CAESAR [1] competition. The choice of APE is motivated by the fact that, according to the PA classification of schemes provided by Andreeva et al. in [5, 6], APE, which has offline decryption, is one of the CAESAR submissions that supports RUP. Authenticated Permutation-based Encryption [4] or APE was introduced by Andreeva et al. in FSE 2014 and later reintroduced in CAESAR along with GIBBON and HANUMAN and an indigenous permutation called PRIMATE as part of the authenticated encryption family PRIMATEs [2, 3]. We studied the fault diffusion in the inverse of the internal permutation PRIMATE of APE using random uni-word fault injections in the penultimate round. We capitalize on properties arising from the non-square nature of the internal state and also on the knowledge of the fault-free unverified plaintext. Our analysis shows an average key-space reduction from \(2^{160}\) to \(2^{50}\) using 12 faults and to \(2^{24}\) using 16 faults. Finally, this work identifies and addresses a broader problem in differential fault analysis: fault analysis with partial state information. Since only part of the state is observable, the fault analysis presented here deviates from classical DFA [8, 10–14, 16], which generally assumes availability of the entire state at the output. Here we showcase that even knowledge of just one-fifth of the state (the size of a plaintext block) can be used to reconstruct the differential state and finally reduce the key-space. Moreover, the close similarity between the PRIMATE permutation and AES [9] automatically amplifies the scope of the results presented here. The contributions of this work can be summarized as follows:

  • Scrutinizing the recently introduced RUP model in the light of fault attacks.

  • Showing that unverified plaintext can be an important source of side-channel information.

  • Showing the feasibility of fault induction using nonce bypass.

  • For the first time attacking the decryption of an AE scheme using DFA.

  • Presenting the Scope attack, which exploits fault diffusion in the last two rounds of the inverse PRIMATE permutation and the ability to observe faulty unverified plaintexts.

  • Finally, achieving a key space reduction from \(2^{160}\) to \(2^{50}\) with 12 faults and \(2^{24}\) with 16 faults using the random word fault model.

  • Moreover, this work also brings into focus the idea of fault analysis of AES-based constructions with partial state information.

The rest of the work is organized as follows: Sect. 2 gives a brief description of the PRIMATE permutation and its inverse and introduces the notations used in this work. Section 3 looks at the RUP and classical models in the light of side-channel analysis. Some properties of APE decryption that become relevant in the presence of faults are discussed in Sect. 4. The proposed Scope attack is introduced in Sect. 5. Section 6 furnishes the experimental results with a brief discussion while Sect. 7 gives the concluding remarks.

2 Preliminaries

2.1 The Design of PRIMATE

PRIMATE has two variants in terms of size: PRIMATE-80 (200-bit permutation) and PRIMATE-120 (280-bit) which operate on states of \((5 \times 8)\) and \((7 \times 8)\) 5-bit elements respectively. The family consists of four permutations \(p_1, p_2, p_3,p_4\) which differ in the round constants used and the number of rounds. All notations introduced in this section are with reference to PRIMATEs-80 with the APE mode of operation.

Definition 1

(Word). Let \(\mathbb {T} = \mathbb {F}_2[x]/(x^5 + x^2 + 1)\) be the field \(\mathbb {F}_{2^5}\) used in the PRIMATE MixColumn operation. Then a word is defined as an element of \(\mathbb {T}\).

Definition 2

(State). Let \(\mathbb {S} = (\mathbb {T}^5)^8\) be the set of \((5 \times 8)\)-word matrices. Then the internal state of the PRIMATE-80 permutation family is defined as an element of \(\mathbb {S}\). We denote a state \(s\in \mathbb {S}\) with elements \(s_{i,j}\) as \([s_{i,j}]_{5,8}\).

$$\begin{aligned} s= [s_{i,j}]_{5,8}, \;\text{ where } {\left\{ \begin{array}{ll} s_{i,j} \in \mathbb {T}\\ 0 \le i \le 4, \; 0 \le j \le 7\\ \end{array}\right. } \end{aligned}$$
(1)

In the rest of the paper, for simplicity, we omit the dimensions in \([s_{i,j}]_{5,8}\) and use \([s_{i,j}]\) as the default notation for the \(5 \times 8\) state. We denote a column of \([s_{i,j}]\) as \(s_{*,j}\) while a row is referred to as \(s_{i,*}\). We now describe in brief the design of the PRIMATE permutation. In this work we also deal with the inverse of the PRIMATE permutation. APE instantiates \(p_1\), which is a composition of 12 round functions. The inverse permutation \(p_1^{-1}\) applies the round functions in the reverse order, with each component operation itself being inverted. For the sake of consistency, in the rest of the work rounds of \(p^{-1}\) will be denoted w.r.t. the corresponding rounds of p. For instance, the last round of \(p^{-1}\) will be referred to as \(\mathcal {R}^{-1}_1\) since functionally it is the inverse of the first round of p.

$$\begin{array}{cl | l} p_1, p_1^{-1} : \mathbb {S}\longrightarrow \mathbb {S}, &{} \quad p_1 = \mathcal {R}_{12} \circ \mathcal {R}_{11} \circ \cdots \circ \mathcal {R}_{1} \quad &{} \quad p_1^{-1} = \mathcal {R}_{1}^{-1} \circ \mathcal {R}_{2}^{-1} \circ \cdots \circ \mathcal {R}_{12}^{-1} \\ \mathcal {R}_r, \mathcal {R}_r^{-1} : \mathbb {S}\longrightarrow \mathbb {S}, &{} \quad \mathcal {R}_r = \alpha _r \circ \mu _r \circ \rho _r \circ \beta _r \quad &{} \quad \mathcal {R}_r^{-1} = \beta _r^{-1} \circ \rho _r^{-1} \circ \mu _r^{-1} \circ \alpha _r^{-1} \end{array}$$

where \(\mathcal {R}_r\) is a composition of four bijective functions on \(\mathbb {S}\) while \(\mathcal {R}_r^{-1}\) denotes the inverse round function. The index r denotes the \(r^{th}\) round and may be dropped if the context is obvious. Here, the component function \(\beta \) represents the non-linear transformation SubBytes, which constitutes word-wise substitution of the state according to a predefined S-box. The definitions extend analogously for the inverse.

$$\begin{aligned} \beta _r, \beta _r^{-1} : \mathbb {S}\longrightarrow \mathbb {S},&\quad \quad s= [s_{i,j}] \mathop {\mapsto }\limits ^{\beta } [S(s_{i,j})],&\quad \quad s= [s_{i,j}] \mathop {\mapsto }\limits ^{\beta ^{-1}} [S^{-1}(s_{i,j})]&\end{aligned}$$

where \(S : \mathbb {T} \longrightarrow \mathbb {T}\) is the S-box given in Table 1. The transformation \(\rho \) corresponds to ShiftRows which cyclically shifts each row of the state based on a set of offsets. The same applies to \(\rho ^{-1}\) with only the direction of shift being reversed.

$$\begin{aligned} \rho _r, \rho ^{-1} : \mathbb {S}\longrightarrow \mathbb {S}&, \quad \quad s= [s_{i,j}] \mathop {\mapsto }\limits ^{\rho } [s_{i,(j - \sigma (i))\text { mod }8}],&\quad \quad s= [s_{i,j}] \mathop {\mapsto }\limits ^{\rho ^{-1}} [s_{i,(j + \sigma (i))\text { mod }8}] \end{aligned}$$

where, \(\sigma = \{0,1,2,4,7\}\) is the ShiftRow offset vector and \(\sigma (i)\) defines shift-offset for the \(i^{th}\) row. The MixColumn operation, denoted by \(\mu \), operates on the state column-wise. \(\mu \) is actually a left-multiplication by a \(5 \times 5\) matrix \((M_{\mu })\) over the finite field \(\mathbb {T}\). For the InverseMixColumn \((\mu ^{-1})\), the multiplication is carried out using \((M_{\mu }^{-1})\).

$$\begin{aligned} \mu _r : \mathbb {S}\longrightarrow \mathbb {S}, \quad \quad s= [s_{i,j}] \longmapsto s' = [s'_{i,j}], \quad \quad s'_{*,j} = M_{\mu } \times s_{*,j} \end{aligned}$$

The last operation of the round function is \(\alpha \) which corresponds to the round constant addition. The constants are the output \(\{\mathcal {B}_1, \mathcal {B}_2, \cdots , \mathcal {B}_{12}\}\) of a 5-bit LFSR and are xored to the word \(s_{1,1}\) of the state \([s_{i,j}]\). \(\alpha \) is involutory implying \(\alpha = \alpha ^{-1}\).

$$\begin{aligned} \alpha _r : \mathbb {S}\longrightarrow \mathbb {S}, \quad \quad [s_{i,j}] \longmapsto [s'_{i,j}], \quad \quad s'_{i,j} = {\left\{ \begin{array}{ll} s_{i,j} \oplus \mathcal {B}_r \;\text{ if }\; (i,j) = (1,1)\\ s_{i,j}, \;\text{ otherwise } \end{array}\right. } \end{aligned}$$
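To make the component operations concrete, the following Python sketch implements \(\beta ^{-1}\), \(\rho ^{-1}\), \(\mu ^{-1}\) and \(\alpha ^{-1}\) on a \(5 \times 8\) state directly from the formulas above. It is only an illustration under stated assumptions: the 5-bit S-box of Table 1 and the entries of \(M_{\mu }^{-1}\) are not reproduced in this section, so SBOX is a placeholder and M_inv is passed in as a parameter; gf_mul implements multiplication in \(\mathbb {T}\) using the reduction polynomial of Definition 1.

```python
# Sketch of the inverse round components on a 5x8 state (a list of 5 rows of 8 words).
# SBOX is a placeholder permutation of 0..31 (the real values are in Table 1) and
# M_inv stands for M_mu^{-1}, whose entries are not reproduced in this section.
MOD = 0b100101                 # x^5 + x^2 + 1, the reduction polynomial of T (Definition 1)
SIGMA = [0, 1, 2, 4, 7]        # ShiftRow offset vector sigma
SBOX = list(range(32))         # placeholder S-box; replace with Table 1
SBOX_INV = [SBOX.index(x) for x in range(32)]

def gf_mul(a, b):
    """Multiplication in T = F_2[x]/(x^5 + x^2 + 1)."""
    r = 0
    for i in range(5):
        if (b >> i) & 1:
            r ^= a << i
    for d in range(8, 4, -1):  # reduce degrees 8 down to 5
        if (r >> d) & 1:
            r ^= MOD << (d - 5)
    return r

def inv_sub_bytes(s):          # beta^{-1}: word-wise inverse S-box
    return [[SBOX_INV[w] for w in row] for row in s]

def inv_shift_rows(s):         # rho^{-1}: new s_{i,j} = old s_{i,(j+sigma(i)) mod 8}
    return [[s[i][(j + SIGMA[i]) % 8] for j in range(8)] for i in range(5)]

def inv_mix_columns(s, M_inv):  # mu^{-1}: left-multiply every column by M_mu^{-1} over T
    out = [[0] * 8 for _ in range(5)]
    for j in range(8):
        for i in range(5):
            for k in range(5):
                out[i][j] ^= gf_mul(M_inv[i][k], s[k][j])
    return out

def inv_add_constant(s, B_r):  # alpha^{-1} = alpha: XOR the round constant into s_{1,1}
    t = [row[:] for row in s]
    t[1][1] ^= B_r
    return t

def inv_round(s, B_r, M_inv):  # R_r^{-1} = beta^{-1} o rho^{-1} o mu^{-1} o alpha^{-1}
    return inv_sub_bytes(inv_shift_rows(inv_mix_columns(inv_add_constant(s, B_r), M_inv)))
```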

The APE mode of operation is depicted in Fig. 1. Here, \(N[\cdot ]\) represents a Nonce block while \(A[\cdot ]\) and \(M[\cdot ]\) denote blocks of associated data and message respectively. The IVs shown in Fig. 1 are predefined and vary according to the lengths of the message and the associated data. Figure 1a and b show the encryption and decryption modules of APE respectively. It is evident from Fig. 1b that decryption starts from the last ciphertext block and proceeds in the reverse direction, which implies that APE decryption is offline.

Table 1. The PRIMATE 5-bit S-box
Fig. 1. The APE mode of operation

2.2 Notations

Definition 3

(Differential state). A differential state is defined as the element-wise XOR between a state \([s_{i,j}]\) and the corresponding faulty state \([s'_{i,j}]\).

$$\begin{aligned} s'_{i,j} = s_{i,j} \oplus \delta _{i,j}, \; \forall \; i,j \end{aligned}$$
(2)

\(\delta \) fully captures the initial fault as well as the dispersion of the fault in the state. In this work we assume induction of random faults in some word of a state. Thus, if the initial fault occurs in word \(s_{I,J} \in s\), the differential state is of the following form:

$$\begin{aligned} \delta _{i,j} = {\left\{ \begin{array}{ll} f : f \xleftarrow {R} \mathbb {T} \setminus \{0\}, \text{ if } (i = I, j = J) \\ 0, \text{ Otherwise } \end{array}\right. } \end{aligned}$$
(3)

If \(\exists j : \delta _{i,j} = 0 \;\forall i\) then \(\delta _{*,j}\) is called a pure column, otherwise \(\delta _{*,j}\) is referred to as a faulty column.
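These notions translate directly into code; a tiny sketch (the function names are ours, not part of any specification):

```python
def differential_state(s, s_faulty):
    # delta_{i,j} = s_{i,j} XOR s'_{i,j}  (Eq. 2)
    return [[a ^ b for a, b in zip(r, rf)] for r, rf in zip(s, s_faulty)]

def faulty_columns(delta):
    # a column delta_{*,j} is "pure" if all of its words are zero, otherwise "faulty"
    return [j for j in range(8) if any(delta[i][j] != 0 for i in range(5))]
```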

Definition 4

(Hyper-column). A Hyper-column is a \((5 \times 1)\) column vector where each element is a set of words, i.e., a subset of \(\mathbb {T}\). It is denoted by \(\mathcal {H}\).

$$\begin{aligned} \mathcal {H} = \begin{bmatrix} b_0\\ b_1\\ \vdots \\ b_4\\ \end{bmatrix} \quad \text{ where } b_i \subset \mathbb {T},&\qquad \qquad \text{ Also, } \;\; \mathcal {H} = \varnothing \quad \text{ if } \quad \exists i : b_i = \varnothing \end{aligned}$$

The Hyper-column helps to capture the candidate words for a column that result from the fault analysis presented here. Also, a hyper-column is considered to be empty if at least one of its component sets is empty.

Definition 5

(Hyper-state [15]). A Hyper-state of a state \(s = [s_{i,j}]\), denoted by \(s^h = [s^h_{i,j}]\), is a two-dimensional matrix, where each element \(s^h_{i,j}\) is a non-empty subset of \(\mathbb {T}\), such that s is an element-wise member of \(s^h\).

$$\begin{aligned} s^h = \begin{bmatrix} s^h_{00}&s^h_{01}&\cdots&s^h_{07} \\ s^h_{10}&s^h_{11}&\cdots&s^h_{17} \\ \vdots&\vdots&\ddots&\vdots \\ s^h_{40}&s^h_{41}&\cdots&s^h_{47} \\ \end{bmatrix} \quad \text{ where } {\left\{ \begin{array}{ll} s^h_{i,j} \subset \mathbb {T}, \;\; s^h_{i,j} \ne \varnothing \\ s_{i,j} \in s^h_{i,j} \; \forall i,j \end{array}\right. } \end{aligned}$$
(4)

The significance of a hyper-state \(s^h\) is that the state s is in a way ‘hidden’ inside it. This means that if we create all possible states taking one word from each element of \(s^h\), then one of them will be exactly equal to s.

The hyper-state has some interesting properties with respect to the component transformations of the PRIMATE permutation and consequently its inverse. For instance, all the inverse operations like InverseShiftRow (\(\rho ^{-1}\)), InverseSubByte (\(\beta ^{-1}\)) and InverseAddRoundConstant (\(\alpha ^{-1}\)) can be applied on a hyper-state with minor technical changes. This is possible since all these operations work word-wise and thus can be applied as a whole to each element-set of a hyper-state with an equivalent effect. We define the analogs of these operations on a hyper-state as hyper-state-\(<\)operation\(>\): \((\rho ^{-1})', (\beta ^{-1})', (\alpha ^{-1}_r)'\). The formal definitions are provided in Appendix A. Another observation of particular interest is that hyper-state-\(<\)operation\({>}{(s^h)}\) = (\(<\)operation\({>}{(s))^h}\).
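As a small illustration (the formal definitions are in Appendix A; SIGMA and SBOX_INV are the offset vector and placeholder inverse S-box from the round-function sketch above), the word-wise operations lift to element-sets as follows:

```python
def hyper_inv_sub_bytes(sh):
    # (beta^{-1})': apply the inverse S-box to every candidate word of every element-set
    return [[{SBOX_INV[w] for w in cell} for cell in row] for row in sh]

def hyper_inv_shift_rows(sh):
    # (rho^{-1})': permute the element-sets exactly as rho^{-1} permutes words
    return [[sh[i][(j + SIGMA[i]) % 8] for j in range(8)] for i in range(5)]

def hyper_inv_add_constant(sh, B_r):
    # (alpha_r^{-1})': XOR the round constant into every candidate word of cell (1,1)
    out = [[set(cell) for cell in row] for row in sh]
    out[1][1] = {w ^ B_r for w in sh[1][1]}
    return out
```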

Definition 6

(Kernel [15]). If \(s^h\) is a hyper-state of s, then Kernel of a column \(s^h_{*,j} \in s^h\), denoted by \(\mathcal {K}^{s^h_{*,j}}\), is defined as the cross-product of \(s^h_{0,j}, s^h_{1,j}, \cdots , s^h_{4,j}\).

Subsequently, Kernel of the entire hyper-state is the set of the Kernels of all of its columns: \(\mathcal {K}^{s^h} = \{\mathcal {K}^{s^h_{*,0}}, \mathcal {K}^{s^h_{*,1}}, \cdots , \mathcal {K}^{s^h_{*,7}}\}\)

Each element \(w_k\) of \(\mathcal {K}^{s^h_{*,j}}\) is thus a \((5 \times 1)\) column vector of words (its transpose being \(w_k^T\)). One should note that \(s_{*,j} \in \mathcal {K}^{s^h_{*,j}} \;\forall j\). Thus each column of s is contained in the corresponding element of \(\mathcal {K}^{s^h}\). We now define an operation \((\mu ^{-1})'\) over the Kernel of a hyper-state which is equivalent to \(\mu ^{-1}\) operating on a state.

Definition 7

(Kernel-InverseMixColumn). The Kernel-InverseMixColumn transformation denoted by \((\mu ^{-1})'\) is the left multiplication of \(M^{-1}_\mu \) to each element of each \(\mathcal {K}^{s^h_{*,j}} \in \mathcal {K}^{s^h}\).

$$\begin{aligned} (\mu ^{-1})'(\mathcal {K}^{s^h_{*,j}})&= \{M^{-1}_\mu \times w_i, \; \forall w_i \in \mathcal {K}^{s^h_{*,j}} \}\\ (\mu ^{-1})'(\mathcal {K}^{s^h})&= \{(\mu ^{-1})'(\mathcal {K}^{s^h_{*,0}}), (\mu ^{-1})'(\mathcal {K}^{s^h_{*,1}}), \cdots , (\mu ^{-1})'(\mathcal {K}^{s^h_{*,7}})\} \end{aligned}$$

An important implication is that \((\mu ^{-1})'(\mathcal {K}^{s^h}) = \mathcal {K}^{(\mu ^{-1}(s))^h}\). The notion of Hyper-state and Kernel will be used in the Outbound phase of Scope detailed in Subsect. 5.3.
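A minimal sketch of the Kernel and of \((\mu ^{-1})'\), reusing gf_mul from the round-function sketch; M_inv again stands for \(M_{\mu }^{-1}\), whose entries are not reproduced in this section:

```python
from itertools import product

def kernel_of_column(sh, j):
    # K^{s^h_{*,j}}: every candidate (5 x 1) column vector, one word from each element-set
    return [list(w) for w in product(*(sh[i][j] for i in range(5)))]

def kernel(sh):
    # K^{s^h}: the Kernels of all eight columns
    return [kernel_of_column(sh, j) for j in range(8)]

def kernel_inv_mix_columns(K, M_inv):
    # (mu^{-1})': left-multiply every candidate column vector by M_mu^{-1} over T
    def apply(w):
        out = [0] * 5
        for i in range(5):
            for k in range(5):
                out[i] ^= gf_mul(M_inv[i][k], w[k])
        return out
    return [[apply(w) for w in column_kernel] for column_kernel in K]
```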

3 RUP in the Light of Side-Channels

RUP, which has been argued to be a very desirable property, can be a major source of side-channel information. In this work we study how RUP stands out in the light of fault attacks. Our research reveals that RUP opens up an exploitable opportunity with respect to fault analysis which would not have been possible if verification preceded the release of plaintexts. Moreover, attacking the decryption also allows the attacker to bypass the nonce constraint imposed by the encryption. It has been shown that nonce-based encryption has an automatic protection against DFA, and hence the ability to bypass the nonce constraint exposes the AE scheme to fault attacks. In the rest of the paper we refer to the classical model that does not allow RUP as RVP (Release of Verified Plaintexts). We now argue why RVP has an implicit protection against fault attacks which makes attacking the decryption infeasible.

In order to understand the significance of the opportunity that the RUP model places in the hands of the attacker, one has to be aware of why fault analysis in the RVP model is infeasible. According to the classical RVP model, the decryption oracle returns the entire plaintext if the verification passes and \(\bot \) otherwise. We define the term faulty forgery as the ability of the attacker to produce, after inducing faults, a plaintext \(p^* \ne p\) such that the verification passes. Now, in standard fault analysis it is assumed that the attacker can induce random faults in the state but the value of the fault is unknown. Under this scenario, the probability that the attacker produces a faulty forgery is negligible (Fig. 2).

Fig. 2. RVP vs RUP from the perspective of fault analysis

On the contrary, RUP gives the attacker the ability to induce random faults while decrypting any chosen or known ciphertext and to unconditionally observe the corresponding faulty plaintexts (which would never have passed verification in the RVP model). This power opens up the side-channel for fault analysis and is the basis of the differential fault attack presented in this work. Moreover, the ability to attack the decryption has the additional and important advantage of bypassing the nonce constraint that is imposed while making encryption queries. This magnifies the feasibility of mounting fault attacks.

In the next section, we look at some of the features of APE decryption and the inverse PRIMATE permutation \(p^{-1}\) that gain importance from a fault attack perspective. Finally, building upon these observations we introduce the Scope attack where for the first time we show how the decryption can also be attacked under RUP to retrieve the entire internal state of \(p^{-1}\) leading to recovery of the key with practical complexities.

4 Analyzing APE Decryption in the Presence of Faults

In this section we look at certain properties of APE decryption that become relevant in the context of RUP and from the perspective of fault induction. We first look at a property which by itself poses no threat to the security of APE but becomes exploitable in the presence of faults in the RUP scenario.

4.1 The Block Inversion Property

The Block Inversion Property is purely attributed to the APE mode of operation. This property allows the attacker to retrieve partial information about the contents of the state matrix after the last round InverseMixColumn operation.

Property 1

Let the state after \(\mu ^{-1}_1\) in \(\mathcal {R}^{-1}_1\) (the last round of \(p^{-1}\)) be represented as \(t = [t_{i,j}]\), and let the released plaintext block and the next ciphertext block be p and c respectively. Then \(t_{0,*}\) is public, given by the following expression:

$$\begin{aligned} t_{0,i}= S(v_i), \quad \text{ where } v_i = (p\oplus c)_i \;\text{ and } S \text{ is the PRIMATE S-box (Table 1)} \end{aligned}$$

Analysis: By virtue of the APE mode of operation and the underlying SPONGE [7] construction, the rate part (top row of the state) after \(\mathcal {R}^{-1}_1\) of \(p^{-1}\) is XORed with the next ciphertext block and released as the plaintext block (which can be observed unconditionally under RUP). If the state after \(\mathcal {R}^{-1}_1\) is \(s = [s_{i,j}]\), then \(p\oplus c\) gives back \(s_{0,*}\). We can now invert this block to get inside \(\mathcal {R}^{-1}_1\) despite partial knowledge of the state. This becomes possible since \(\beta \) operates word-wise and \(\rho \) operates row-wise. Moreover, \(\rho \) can be ignored since it has no effect on the top row, whose shift-offset is zero. Thus, applying \(\beta \) on \(s_{0,*}\) we get the value of \(t_{0,*}\). However, the inversion stops here since \(\mu \) operates column-wise and only one word of each column is known.    \(\blacksquare \)
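Property 1 amounts to a one-line computation; the sketch below reuses the placeholder SBOX from the round-function sketch and treats a block as a list of eight words:

```python
def public_top_row_after_inv_mix(p_block, c_block):
    # Property 1: t_{0,i} = S((p XOR c)_i); rho is skipped since row 0 has shift-offset 0
    return [SBOX[pw ^ cw] for pw, cw in zip(p_block, c_block)]
```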

Later in this work we show how the Scope attack can exploit the Block Inversion Property along with RUP and use both faulty and fault-free plaintexts to reconstruct the differential state after \(\mu ^{-1}_2\) in \(\mathcal {R}^{-1}_2\). We now study the fault induction and diffusion in the state of \(p^{-1}\), which is vital to the understanding of the attack presented here.

4.2 Fault Diffusion in the Inverse PRIMATE Permutation

In this section we describe the induction and diffusion of faults in the inverse (\(p^{-1}\)) of the PRIMATE permutation during APE decryption. In fact, our intention is to study the fault diffusion in the differential state of \(p^{-1}\) which we exploit to formulate a fault attack on APE Decryption. The fault induction and subsequent differential plaintext block formation are illustrated in Fig. 3. One can see from Fig. 3 that the fault is induced in the input of the penultimate round \(\mathcal {R}^{-1}_2\) of \(p^{-1}\). The logic behind this will be clear from the following important property of fault diffusion in the internal state of \(p^{-1}\).

Fig. 3. Fault induction in APE decryption before releasing unverified plaintext and the unverified differential plaintext block

Property 2

If a single column is faulty at the start of \(\mathcal {R}^{-1}_{r+1}\) then there are exactly three fault-free words in each row of the differential state after \(\mathcal {R}^{-1}_{r}\).

Analysis: This property surfaces because in two rounds the fault does not spread to the entire state matrix. This is primarily attributed to the fact that the state matrix is non-square. To visualize this we need to first look at fault diffusion in the \(\mathcal {R}^{-1}_{r+1}\) round. Let us denote the differential state at the input of \(\mathcal {R}^{-1}_{r+1}\) as \(s= [s_{i,j}]\). This analysis takes into account the structural dispersion of the fault and is independent of the actual value of \(s\). At the beginning of \(\mathcal {R}^{-1}_{r+1}\) only one column \(s_{*,j}\) is faulty. The operation \(\alpha ^{-1}\) is omitted from analysis since round-constant addition has no effect on the differential state.

  • Fault diffusion in \(\mathcal {R}^{-1}_{r+1}\)

    • \(\mu ^{-1}_{r+1}:\) Intra-column diffusion. Fault spreads to entire column \(s_{*,j}\).

    • \(\rho ^{-1}_{r+1}: \) No diffusion, fault shifts to the words \(\{s_{i,(j+\sigma (i))\text { mod }8}\) : \({\scriptstyle 0\le i <|\sigma |} \}\).

    • \(\beta ^{-1}_{r+1}: \) No diffusion, fault limited to the same words as after \(\rho ^{-1}_{r+1}\).

      $$\begin{aligned} s_{*,j} \xrightarrow {\mu ^{-1}_{r+1}} s_{*,j} \xrightarrow {\beta ^{-1}_{r+1} \circ \rho ^{-1}_{r+1}} \{s_{i,(j+\sigma (i))\text { mod }{8}}\} \end{aligned}$$
      (5)
  • Fault diffusion in \(\mathcal {R}^{-1}_{r}\)

    • \(\mu ^{-1}_{r}:\) Fault spreads to each column \(s_{*,(j+\sigma (i))\text { mod }{8}}.\)

    • \(\rho ^{-1}_{r}\) : No diffusion, fault shifts to the words \(\{s_{i,(j+\sigma (i)+\sigma (k))\text { mod }8}\) : \({\scriptstyle 0\le i,k <|\sigma |}\}\).

    • \(\beta ^{-1}_{r}: \) No diffusion, fault limited to the same words as after \(\rho ^{-1}_{r}\).

      $$\begin{aligned} \{s_{i,(j+\sigma (i))\text { mod }{8}} \} \xrightarrow {\mu ^{-1}_{r}} s_{*,(j+\sigma (i))\text { mod }{8}} \xrightarrow {\beta ^{-1}_{r} \circ \rho ^{-1}_{r}} \{s_{i,(j+\sigma (i)+\sigma (k))\text { mod }8}\} \end{aligned}$$
      (6)

From (5) and (6) we have the following relation between the faulty column \(s_{*,j}\) at the start of \(\mathcal {R}^{-1}_{r+1}\) and the faulty words after \(\mathcal {R}^{-1}_r\).

$$\begin{aligned} s_{*,j} \xrightarrow {\mathcal {R}^{-1}_{r}\circ \mathcal {R}^{-1}_{r+1}} \big \{ s_{i,(j+\sigma (i)+\sigma (k))\text { mod }8}: {\scriptstyle 0\le i,k <|\sigma |} \big \} \end{aligned}$$
(7)

For PRIMATE-80, \(\sigma = \{0,1,2,4,7\}\), implying that \(\big |\sigma \big | = 5\). From (7), we have \(\big |\{s_{i,(j+\sigma (i)+\sigma (k))\text { mod }8} \}\big | = 25\). Thus a single faulty column before \(\mathcal {R}^{-1}_{r+1}\) results in 25 faulty words at the end of \(\mathcal {R}^{-1}_{r}\). Moreover, for each value of i we have \(\big |\{s_{i,(j+\sigma (i)+\sigma (k))\text { mod }8} : {\scriptstyle 0\le k < 5}\}\big | = 5\), implying that each row has 5 faulty words and consequently \(8-5 = 3\) fault-free words at the end of \(\mathcal {R}^{-1}_{r}\). An illustration of the above analysis with the source fault in column \(s_{*,3}\) is depicted in Fig. 4.    \(\blacksquare \)
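The counting argument of relation (7) is easy to check mechanically; the following snippet (an illustration only) verifies the 25-word diffusion pattern and the three fault-free words per row:

```python
SIGMA = [0, 1, 2, 4, 7]

def faulty_positions_after_two_rounds(j):
    # relation (7): positions {(i, (j + sigma(i) + sigma(k)) mod 8)} reachable from column j
    return {(i, (j + SIGMA[i] + SIGMA[k]) % 8) for i in range(5) for k in range(5)}

positions = faulty_positions_after_two_rounds(3)           # source fault in column s_{*,3}
assert len(positions) == 25                                 # 25 faulty words in total
assert all(sum(1 for (i, _) in positions if i == r) == 5    # 5 faulty, hence 3 fault-free,
           for r in range(5))                               # words in every row
```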

Fig. 4. 2-round fault diffusion with a uni-word fault in column \(s_{*,3}\)

4.3 The Bijection Lemma

This lemma stems from the property mentioned above and is pivotal in increasing the efficiency of the Scope attack. Again, it is a direct consequence of the non-square nature of the internal state of \(p^{-1}\).

Lemma 1

If a fault is induced in the \(j^{th}\) column of the state at the input of \(\mathcal {R}^{-1}_{r+1}\), then the fault-free words in the differential plaintext block released after \(\mathcal {R}^{-1}_{r}\) are at positions \(((j+3), (j+5), (j+6)) \text { mod }8\).

Proof

This directly follows from relation (7). One can recall that for APE decryption under RUP, the first row of the state is released after XORing with the next ciphertext block. However, since we are considering a differential here, the effect of the ciphertext block is nullified. Now, for \(i = 0\), relation (7) gives \(\{s_{0,(j+\sigma (0)+\sigma (k))\text { mod }8} : {\scriptstyle 0\le k < 5}\} = \{s_{0,j}, s_{0,j+1}, s_{0,j+2}, s_{0,j+4}, s_{0,j+7}\}\) (indices mod 8), which is the set of faulty words in the differential plaintext block. Hence, the complement of this set w.r.t. the set of all words in the plaintext block is \(\{s_{0,j+3}, s_{0,j+5}, s_{0,j+6}\}\), which are the fault-free words.    \(\blacksquare \)

The implication of this lemma is that there exists a bijection between the positions of the fault-free words in the differential plaintext block released after \(\mathcal {R}^{-1}_{r}\) and the position of the column in which the fault was induced before \(\mathcal {R}^{-1}_{r+1}\). This is vital to the analysis presented in this work and shows that by looking at the unverified differential plaintext block the attacker can ascertain the column position of the fault. This makes the attack 8 times faster. However, this information is not sufficient to guess the row position, since all faults in the same column produce the same pattern of fault-free words.
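A sketch of the resulting lookup (the helper names are ours); an inconsistent fault-free pattern simply yields no match:

```python
def fault_free_positions(j):
    # Lemma 1: fault-free positions in the differential plaintext block, fault in column j
    return frozenset((j + d) % 8 for d in (3, 5, 6))

PATTERN_TO_COLUMN = {fault_free_positions(j): j for j in range(8)}   # the bijection

def trace_fault_column(diff_plaintext_block):
    zeros = frozenset(i for i, w in enumerate(diff_plaintext_block) if w == 0)
    return PATTERN_TO_COLUMN.get(zeros)    # None if the observed pattern is inconsistent
```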

In the case of \(p^{-1}\), \(r = 1\), and the Bijection Lemma implies that by looking at the unverified differential block (Fig. 3) released after \(\mathcal {R}^{-1}_1\), the attacker can ascertain in which column the fault was induced before \(\mathcal {R}^{-1}_2\). With knowledge of all these characteristics of the APE mode of operation as well as of \(p^{-1}\), we are now in a position to introduce the differential fault attack developed in this work: Scope.

5 Scope: Differential Fault Analysis of APE Decryption (Exploiting Release of Unverified Plaintexts)

The first task is to run APE decryption and observe the released unverified plaintexts. Next, the attacker queries the decryption with the same set of inputs. Recall that the nonce constraint can be bypassed by definition. Every time, while replaying the decryption, he induces a random uni-word fault at the input of \(\mathcal {R}^{-1}_2\) of \(p^{-1}\) during the processing of the same ciphertext block. By the RUP principle, the attacker can observe the corresponding faulty plaintext blocks. The fault-free plaintext block (p) along with each corresponding faulty plaintext block (\(p'_i\)) is stored. Now, using the Bijection Lemma, every differential plaintext block (\(p \oplus p'_i\)) is analyzed to trace the faulty column before \(\mathcal {R}^{-1}_2\). The information is stored in the fault count vector (\(\mathcal {F}\)), an array keeping count of the number of faults traced back to each column before \(\mathcal {R}^{-1}_2\). For each unverified faulty plaintext, the Inbound phase is initiated to get back a set of hyper-columns. The process is detailed in the next subsection.
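The flow just described can be summarized in the following schematic sketch. It is only an outline: decrypt_unverified, decrypt_with_fault, inbound_phase, noise_handling and outbound_phase are hypothetical interfaces standing in for the oracles and for the phases of Sects. 5.1-5.3, while trace_fault_column is the Bijection-Lemma lookup sketched in Sect. 4.3.

```python
def scope_attack(decrypt_unverified, decrypt_with_fault, ciphertext, n_faults):
    """Schematic outline of the Scope flow; not a drop-in implementation."""
    p = decrypt_unverified(ciphertext)             # fault-free unverified plaintext block
    fault_count = [0] * 8                          # the fault count vector F
    H = [[] for _ in range(8)]                     # sets of hyper-columns, per column
    for _ in range(n_faults):
        # the same nonce may be replayed at will for decryption queries
        p_faulty = decrypt_with_fault(ciphertext)  # random uni-word fault, input of R_2^{-1}
        diff = [a ^ b for a, b in zip(p, p_faulty)]
        j = trace_fault_column(diff)               # Bijection Lemma (Sect. 4.3)
        if j is None:
            continue
        fault_count[j] += 1
        H[j].append(inbound_phase(p, p_faulty, j))                # Sect. 5.1
    hyper_columns = [noise_handling(H[j]) for j in range(8)]      # Sect. 5.2
    return outbound_phase(hyper_columns, p)                       # Sect. 5.3: reduced key-space
```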

5.1 The Inbound Phase

The main aim of this phase is to reduce the number of candidate words for the column to which the fault was traced back. Let the state after \(\mu ^{-1}_1\) for the fault-free case be \(s = [s_{i,j}]\) and for the faulty case be \(s' = [s'_{i,j}]\). Now, by virtue of the Block Inversion Property, \(s_{0,*}\) and \(s'_{0,*}\) are known to the attacker. He now exploits the relation, arising from the fault diffusion, between the differential state before and after \(\mu ^{-1}_1\) to reconstruct the entire differential state after \(\mathcal {R}^{-1}_2\). To be more precise, the attacker is interested in the nature of the differential block \((s_{0,*} \oplus s'_{0,*})\). Due to the InverseMixColumn operation, every non-zero word of \((s_{0,*} \oplus s'_{0,*})\) is a multiple of the non-zero word in the corresponding column before \(\mu ^{-1}_1\), and the relation is governed by the InverseMixColumn matrix. Thus, if the source fault is in column 4, \((s_{0,*} \oplus s'_{0,*})\) is of the following form: \(\{0,0, x_1 \times F_5, x_2 \times F_1, x_3 \times F_2, x_4 \times F_3, 0, x_5 \times F_4\}\). Now, to get back each \(F_i\) from the differential row, the attacker makes use of the Factor Matrix given in Table 2. As one can notice, the Factor Matrix is a circulant matrix. The \(i^{th}\) row corresponds to the factors to be used if the source fault is in the \(i^{th}\) column. The ‘*’ represents the positions of the zero values of the corresponding differential row. So the attacker retrieves each \(F_i\) by word-wise Galois Field division of the differential row \((s_{0,*} \oplus s'_{0,*})\) using the appropriate row of the Factor Matrix. The method of generating the Factor Matrix is detailed in Appendix D.

Table 2. The Factor Matrix
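The division step itself is a word-wise operation in \(\mathbb {T}\); the sketch below reuses gf_mul from the round-function sketch and adds a brute-force inverse (cheap in a 32-element field). The factor row is the row of Table 2 selected by the traced column; since the table entries are not reproduced here, '*' entries are represented as None:

```python
def gf_inv(a):
    # multiplicative inverse in T by exhaustive search over the 31 non-zero words
    return next(x for x in range(1, 32) if gf_mul(a, x) == 1)

def recover_column_differences(diff_row, factor_row):
    # word-wise GF division of (s_{0,*} XOR s'_{0,*}) by the matching Factor Matrix row
    F = []
    for d, f in zip(diff_row, factor_row):
        if f is None:            # a '*' position: the differential row must be zero here
            assert d == 0
        else:
            F.append(gf_mul(d, gf_inv(f)))
    return F                     # the non-zero differences entering mu^{-1}_1, in column order
```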

The attacker now has the entire differential state after \(\mathcal {R}^{-1}_2\). He cannot invert further deterministically since \(\beta \) is nonlinear. However, as \(\rho \) and \(\beta \) are commutative, he can apply \(\rho \) before \(\beta \). By virtue of the fault diffusion described in Property 2, the differential state after \(\beta ^{-1}_2\) has only one non-zero column, and it is the same column where the fault was induced. The attacker now solves differential equations involving that column at the input of \(\beta ^{-1}_2\), which arise due to the InverseMixColumn of \(\mathcal {R}^{-1}_2\). However, these equations are characterized by the row in which the initial fault was induced. One can recall from Lemma 1 that the information available is not sufficient to ascertain the exact row. For instance, the fault invariants (Footnote 2) for different rows of column 4 are shown in Fig. 5. So the attacker solves the five sets of equations, assuming all the possibilities. Out of these, one set corresponds to the actual row that was affected. Solving the equations results in a significant reduction of the column space. The candidate words that satisfy the equations are stored in hyper-columns (Definition 4). Each row guess results in a different hyper-column, and hence there can be a maximum of 5 hyper-columns. However, many wrong candidate words may be accepted because they satisfy a wrong set of equations arising from an incorrect row guess. We refer to all accepted words other than the legitimate ones as Noise. Thus one run of the Inbound phase returns a set of hyper-columns with a maximum cardinality of 5. The phase is repeated for each faulty unverified plaintext and the corresponding set of hyper-columns is stored in a set of sets of hyper-columns: \(\mathbb {H}\). After all faulty plaintexts have been processed, the set \(\mathbb {H}\) along with the fault count vector \(\mathcal {F}\) are passed on to the Noise handling phase.
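The core of the equation solving is a standard S-box differential test. The sketch below (reusing SBOX_INV, gf_mul and M_inv as before) shows one possible, deliberately simple way to organize it: for every row guess and every non-zero fault value, candidate words are those satisfying the S-box differential equation, and they are accumulated into one (possibly noisy) hyper-column per row guess. The exact bookkeeping in Scope may differ.

```python
def sbox_diff_candidates(delta_in, delta_out):
    # all words x with S^{-1}(x) XOR S^{-1}(x XOR delta_in) == delta_out
    return {x for x in range(32) if SBOX_INV[x] ^ SBOX_INV[x ^ delta_in] == delta_out}

def hyper_column_for_row_guess(row_guess, delta_out_column, M_inv):
    # delta_out_column: the five known differences of the faulty column after beta^{-1}_2
    cells = [set() for _ in range(5)]
    for F in range(1, 32):                               # unknown non-zero fault value
        for i in range(5):
            delta_in = gf_mul(M_inv[i][row_guess], F)    # difference entering beta^{-1}_2
            cells[i] |= sbox_diff_candidates(delta_in, delta_out_column[i])
    return cells                                         # one candidate (possibly noisy) hyper-column
```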

Fig. 5. Generation of hyper-columns using a word-fault at the beginning of \(\mathcal {R}^{-1}_2\)

5.2 Noise Handling

Here the attacker takes advantage of the fact that while he induces random uni-word faults at the input of \(\mathcal {R}^{-1}_2\), there is a high probability that some faults get induced in the same column. Thus he will have multiple sets of hyper-columns from the Inbound phase that reduce the column space for the same column before \(\mathcal {R}^{-1}_2\). On the contrary, it might so happen that only one fault gets induced for a particular column. The worst-case scenario occurs if none of the induced faults affects some specific column. The former cases are dealt with below, while for the latter case the attacker is left with exhaustive search, implying that the Noise handling phase returns a hyper-column that spans the entire column space.

Fig. 6. The Noise handling phase

Noise Inclusion. When the attacker traces only one fault back to a column, he faces an ambiguity regarding the source row. In this scenario, he has no option but to carry all the hyper-columns into the next phase of the attack, i.e., he includes all the Noise in the final step. Noise Inclusion thus corresponds to a word-wise union of all hyper-columns, as depicted in Fig. 6a. Noise Inclusion certainly increases the column space; however, computer simulations show that the final cardinality is still much better than brute force.

Noise Reduction. When the attacker traces multiple faults to the same column, he can significantly reduce the column space by eliminating Noisy hyper-columns. For example, if two faults are traced back to column x, then the attacker has two sets of hyper-columns. He takes the cross-product of these two sets; every element of the cross-product is a pair of hyper-columns. He then takes the set intersection within each such pair. The result is again a hyper-column, with the cardinality of its component sets greatly reduced. However, if the hyper-column turns out to be empty (Footnote 3), it is discarded. Experiments show that most of the elements of the cross-product get eliminated this way and the attacker is left with a single final hyper-column. In case multiple hyper-columns remain, an element-wise union is taken to form the final hyper-column.

This Noise handling phase is repeated for all the columns and returns a set of eight hyper-columns for the last phase of the attack.
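Both Noise Inclusion and Noise Reduction are element-wise set operations on hyper-columns; the sketch below (function names are ours) shows the two-fault case of Noise Reduction:

```python
from itertools import product

def intersect_hyper_columns(h1, h2):
    # Noise Reduction: element-wise set intersection; discarded if any component is empty
    cells = [a & b for a, b in zip(h1, h2)]
    return None if any(len(c) == 0 for c in cells) else cells

def union_hyper_columns(columns):
    # Noise Inclusion / merging of surviving hyper-columns: element-wise union
    return [set().union(*cells) for cells in zip(*columns)]

def reduce_two_fault_sets(set_a, set_b):
    # two faults traced to the same column: intersect every pair, drop empties, merge the rest
    survivors = [h for h in (intersect_hyper_columns(x, y)
                             for x, y in product(set_a, set_b)) if h is not None]
    return union_hyper_columns(survivors) if survivors else None
```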


5.3 The Outbound Phase

The Outbound phase of Scope is inspired by the Outbound phase of the EscApe [15] attack proposed by Saha et al. in Indocrypt 2014 and closely follows it. It borrows the ideas of a Hyper-state and a Kernel from there. The input to this phase is the set of eight hyper-columns. Since none of the hyper-columns is empty, they can easily be combined structurally to form the hyper-state of the state after \(\mu ^{-1}_2\). Let us denote this state by \(s = [s_{i,j}]\); its hyper-state is then \(s^h\). This hyper-state \(s^h\) captures the reduced state-space for the state s that has been generated by the last two phases. In this phase we want to further reduce the state-space using knowledge of the fault-free plaintext block, again employing the Block Inversion property. This phase is called Outbound since it moves outward from \(\mu ^{-1}_2\). We start by propagating further through \(\mathcal {R}^{-1}_2\) and then move into \(\mathcal {R}^{-1}_1\) by applying some hyper-state-\(<\)operations\(>\) on \(s^h\). The steps of the Outbound phase are listed below.

  1.

    The attacker starts the Outbound phase by applying the Hyper-state InverseShiftRow transformation (Definition 8) on \(s^h\), followed by the Hyper-state InverseSubByte transformation (Definition 9) on the result. This completes the \(\mathcal {R}^{-1}_2\) propagation.

    $$\begin{aligned} s^h \xrightarrow {(\rho ^{-1})'} (\rho ^{-1}_2(s))^h \xrightarrow {(\beta ^{-1})'} (\beta ^{-1}_2(\rho ^{-1}_2(s)))^h \rightarrow v^h (say) \end{aligned}$$
  2.

    We now move forward into the last round of \(p^{-1}\) : \(\mathcal {R}^{-1}_1\). Let us denote the state \(\beta ^{-1}_2(\rho ^{-1}_2(s))\) as v. We now apply Hyper-state InverseAddRoundConstant (Definition 10): \((\alpha ^{-1}_1)'\) on the hyper-state \(v^h\). The next step is to compute the Kernel for \((\alpha ^{-1}_1(v))^h : \mathcal {K}^{(\alpha ^{-1}_1(v))^h}\).

    $$\begin{aligned} v^h \xrightarrow {(\alpha ^{-1}_1)'} (\alpha ^{-1}_1(v))^h \xrightarrow {\text{ Compute } \text{ Kernel }} \mathcal {K}^{(\alpha ^{-1}_1(v))^h} \end{aligned}$$
  3.

    Then the attacker applies the Kernel-InverseMixColumn transformation on the Kernel \(\mathcal {K}^{(\alpha ^{-1}_1(v))^h}\)

    $$\begin{aligned} \mathcal {K}^{(\alpha ^{-1}_1(v))^h} \xrightarrow {(\mu ^{-1})'} \mathcal {K}^{(\mu ^{-1}_1(\alpha ^{-1}_1(v)))^h} \end{aligned}$$
  4.

    Next comes the reduction step. Note that \(\mathcal {K}^{(\mu ^{-1}_1(\alpha ^{-1}_1(v)))^h}\) represents the Kernel of the hyper-state of \(\mu ^{-1}_1(\alpha ^{-1}_1(v))\), i.e., of the state just before the application of \(\rho ^{-1}_1\). Now let \(t = \mu ^{-1}_1(\alpha ^{-1}_1(v))\). By the Block Inversion property, the actual value of \(t_{0,*}\) is known. This knowledge is used to reduce the size of each \(\mathcal {K}^{t^h_{*,j}} \in \mathcal {K}^{t^h}\) (a sketch of this reduction is given after the list). The reduction algorithm is nearly identical to ReduceKernel given in [15] and is restated in Appendix B for easy reference.
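As referenced in step 4, the following is a minimal sketch of that reduction, assuming it simply keeps the candidate column vectors whose top word matches the known \(t_{0,j}\); the actual ReduceKernel algorithm is the one restated in Appendix B:

```python
def reduce_kernel(kernel_columns, known_top_row):
    # kernel_columns[j]: candidate (5 x 1) column vectors for column j of t
    # known_top_row[j]:  t_{0,j}, obtained via the Block Inversion property
    return [[w for w in kernel_columns[j] if w[0] == known_top_row[j]]
            for j in range(8)]
```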

A pictorial description of the Outbound phase is furnished in Fig. 7. Thus, after the Outbound phase we get a reduced Kernel for the state at the end of \(\mu ^{-1}_1\). Every element of the cross-product of the Kernels of the columns is a candidate state. Finally, applying \(\rho ^{-1}_1\) and \(\beta ^{-1}_1\) on each candidate state produces the reduced state-space at the end of \(\mathcal {R}^{-1}_1\) of \(p^{-1}\). This reduced state-space directly corresponds to the key-space, since recovering the internal state implies recovery of the key. The overall Scope attack is summarized by the following algorithm:

Algorithm: The overall Scope attack
Fig. 7. Final reduction in state-space using fault-free unverified plaintext block

6 Experimental Results and Discussion

Scope was verified by extensive computer simulations. The experimental results confirm a large-scale reduction in the state-space and consequently the key-space. Average-case analysis reveals that with 12 random uni-word faults at the input of \(\mathcal {R}^{-1}_2\), the state-space at the end of \(\mathcal {R}^{-1}_1\) reduces from \(2^{160}\) to \(2^{50}\), while 16 faults give a reduced state-space of \(2^{24}\). It is interesting to note that the fault distribution has a direct impact on the state (key)-space reduction. To highlight the impact we look at two different fault distributions with 12 faults. Let the fault count vectors be \(\mathcal {F}_1 =\{1,2,3,0,2,2,1,1\}\) and \(\mathcal {F}_2 =\{2,2,2,0,2,2,1,1\}\). The average reductions with these distributions are \(2^{45}\) and \(2^{28}\) respectively. This large variance in the reduced key-spaces is attributed firstly to the fact that \(\mathcal {F}_2\) is a more uniform distribution. Secondly, \(\mathcal {F}_1\) has three columns which receive just one fault, so Noise Reduction cannot be applied to them, whereas for \(\mathcal {F}_2\) there are only two such columns, which leads to better Noise Reduction in the Noise handling phase and hence a better reduction of the overall key-space. To conclude, the best results are obtained when the fault distribution is such that the maximum number of columns receive at least two faults.

It might be argued that, in comparison to the EscApe attack by Saha et al., Scope requires more faults. However, it must be kept in mind that Scope works with only partial state information, while EscApe has the full state at its disposal. Moreover, since Scope attacks APE decryption, it can bypass the nonce constraint and hence also avoid the need for faulty collisions, which are inevitable for EscApe. Overall, Scope provides an interesting case-study where an AES-like construction is analyzed using faults with only partial state information available to the attacker.

7 Conclusion

In this work we explore the opportunity provided by the RUP model with regard to fault analysis. We argue that the ability to observe unverified plaintexts opens up the fault side-channel to attackers, which is otherwise unavailable or available only with negligible probability. In this work, for the first time, we show how the decryption of APE, an AE scheme that supports RUP, becomes vulnerable to DFA. Experiments reveal that, using the random word fault model, the key-space can be reduced from \(2^{160}\) to \(2^{50}\) using 12 faults, while 16 faults reduce it to \(2^{24}\). An important implication of the ability to attack the decryption using RUP is that the attacker can totally bypass the nonce constraint imposed by the encryption. Finally, this work shows that though RUP is a desirable property addressing many practical problems, it also provides a unique opportunity to the attacker for mounting the Scope fault attack.