# Two Attacks on a White-Box AES Implementation

## Abstract

White-box cryptography aims to protect the secret key of a cipher in an environment in which an adversary has full access to the implementation of the cipher and its execution environment. In 2002, Chow, Eisen, Johnson and van Oorschot proposed a white-box implementation of AES. In 2004, Billet, Gilbert and Ech-Chatbi presented an efficient attack (referred to as the BGE attack) on this implementation, extracting its embedded AES key with a work factor of \(2^{30}\). In 2012, Tolhuizen presented an improvement of the most time-consuming phase of the BGE attack. The present paper includes three contributions. First, we describe several improvements of the BGE attack. We show that the overall work factor of the BGE attack is reduced to \(2^{22}\) when all improvements are implemented. Second, this paper presents a new attack on the initial white-box implementation of Chow *et al.* This attack exploits collisions occurring on internal variables of the implementation and it achieves a work factor of \(2^{22}\). Finally, we address the white-box AES implementation presented by Karroumi in 2010, which aims to withstand the BGE attack. We show that the implementations of Karroumi and Chow *et al.* are the same, making them both vulnerable to the same attacks.

## Keywords

White-box cryptography, AES implementation, Dual cipher, Cryptanalysis

## 1 Introduction

In 2002, Chow *et al.* introduced the concept of white-box cryptography by presenting a white-box implementation of AES [5]. White-box cryptography aims to protect the confidentiality of the secret key of a cipher in a white-box model, i.e., where an adversary is assumed to have full access to the implementation of the cipher and its execution environment. For example, in a white-box context the adversary can use tools such as decompilers and debuggers to reverse engineer the implementation of the cipher, and to read and alter values of intermediate results of the cipher during its execution. A typical example of an application in which a cipher is implemented in a white-box environment is a content protection system in which a client is executed on the main processor of a PC, a tablet, a mobile device, or a set-top box.

In 2004, Billet *et al.* [3] presented an attack on the white-box AES implementation of Chow *et al.*. The BGE attack assumes that the order of the bytes of the intermediate AES results is randomized in the white-box implementation, and extracts its embedded AES key with a work factor of \(2^{30}\). In 2012, Tolhuizen [12] proposed an improvement to the most time-consuming phase of the BGE attack, reducing the work factor of this phase to \(2^{19}\). If the improvement of Tolhuizen is implemented, then the work factor of the BGE attack is dominated by the other phases of the BGE attack, and equals \(2^{29}\). This paper presents several improvements to the other phases of the BGE attack, and shows that the work factor of the BGE attack is reduced to \(2^{22}\) when Tolhuizen’s improvement and the improvements presented in this paper are implemented.

This paper also presents a new attack on the white-box implementation of Chow *et al.* The key idea is to exploit collisions in the output of the first round in order to construct sparse linear systems. Solving these systems then reveals the byte encodings and secret key byte(s) involved in some target look-up tables. Applied to the original scheme, this yields an attack of complexity \(2^{22}\).

The BGE attack triggered the design of new white-box AES implementations, such as the ones proposed by Xiao and Lai in 2009 [13] and by Karroumi in 2010 [6]. In [10], De Mulder, Roelse and Preneel presented a cryptanalysis of Xiao and Lai’s white-box AES implementation, showing that this implementation is insecure.

In [6], Karroumi uses the concept of dual ciphers [1, 2, 4] and the white-box techniques of Chow *et al.* to design a new white-box AES implementation. In [6], Karroumi argues that the additional secrecy introduced by the dual cipher increases the work factor of the BGE attack to \(2^{93}\). This paper shows that the white-box AES implementations of Chow *et al.* and Karroumi are the same. As a direct consequence, Karroumi’s white-box AES implementation is vulnerable to the same attacks, including the original BGE attack and the attacks presented in this paper.

**Paper organization.** Section 2 describes aspects of AES, the white-box AES implementation of Chow *et al.*, and the BGE attack that are relevant to this paper. The improvements of the BGE attack and their work factor are presented in Sect. 3. The new attack based on collisions is presented in Sect. 4. The insecurity of Karroumi’s scheme is shown in Sect. 5. Finally, concluding remarks are provided in Sect. 6.

## 2 Preliminaries

### 2.1 AES

- **ShiftRows:** a permutation on the indices of the 16 bytes of the state;
- **AddRoundKey:** a byte-wise addition of 16 round key bytes \(k_i^{(r,j)}\) (\(0 \le i,j \le 3\)) and the 16-byte state;
- **SubBytes:** applies the AES S-box, denoted by \(S\), to every byte of the 16-byte state;
- **MixColumns:** a linear operation on \(\mathbf{F}_{256}^{16}\). The MixColumns operation is represented by a \(4 \times 4\) matrix MC over \(\mathbf{F}_{256}\); the linear operation applies 4 instances of this matrix in parallel to the 16-byte state. The 16 coefficients of MC are denoted by \(mc_{ij}\) for \(0 \le i,j \le 3\).
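As an illustration of the MixColumns operation, the following minimal Python sketch applies the standard AES matrix to a single 4-byte column over \(\mathbf{F}_{256}\). The names `gf_mul`, `MC` and `mix_column` are hypothetical and chosen for this example only; the field arithmetic assumes the AES reduction polynomial \(x^8+x^4+x^3+x+1\).

```python
# Illustrative sketch: one instance of the MixColumns matrix on a 4-byte column.
# All names are hypothetical; the field is GF(2^8) modulo the AES polynomial 0x11B.

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using shift-and-add with reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

# MC[i][j] plays the role of the coefficient mc_ij.
MC = [
    [2, 3, 1, 1],
    [1, 2, 3, 1],
    [1, 1, 2, 3],
    [3, 1, 1, 2],
]

def mix_column(column):
    """Return MC * column, where column is a list of four bytes."""
    out = []
    for i in range(4):
        acc = 0
        for j in range(4):
            acc ^= gf_mul(MC[i][j], column[j])
        out.append(acc)
    return out
```

A full MixColumns step would call `mix_column` once for each of the four columns of the state.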

**AES Subrounds.** The mappings in the following definition will be used to describe the white-box AES implementations and the attacks on the implementations. In the following text, the finite field representation as defined in [8] is referred to as the AES polynomial representation, and \(\oplus \) and \(\otimes \) denote the addition and multiplication operations in this representation, respectively.

### **Definition 1**

Observe that an AES subround consists of the key additions, the S-box operations and the MixColumns operations in an AES round that are associated with a single MixColumns matrix operation, and that one AES round comprises four AES subrounds. The subrounds are indexed by \(j\) in Definition 1, and this paper assumes throughout that the four subrounds in a round are numbered left to right. The bytes \(k_i^{(r,j)}\) for \(0 \le i,j \le 3\) are the 16 bytes of the AES round key of round \(r\).
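The structure of a subround can be sketched in Python as follows. This is only an illustration under the assumption that a subround adds four key bytes, applies the S-box to each byte, and then multiplies the resulting column by MC; the function names are hypothetical, and the S-box is generated from its usual algebraic description rather than copied from a table.

```python
# Sketch of one AES subround on four bytes (hypothetical names, assumed structure:
# key addition, then S-box, then the MixColumns matrix on the column).

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def build_sbox():
    """Build the AES S-box: field inversion followed by the AES affine map."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):      # x^254 is the inverse of x in GF(2^8)
                inv = gf_mul(inv, x)
        s = 0x63
        for shift in range(5):        # inv XOR its left rotations by 1..4 bits
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s)
    return sbox

SBOX = build_sbox()
MC = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]

def aes_subround(x, k):
    """Map four input bytes x and four key bytes k to four output bytes."""
    t = [SBOX[x[i] ^ k[i]] for i in range(4)]
    return [
        gf_mul(MC[i][0], t[0]) ^ gf_mul(MC[i][1], t[1])
        ^ gf_mul(MC[i][2], t[2]) ^ gf_mul(MC[i][3], t[3])
        for i in range(4)
    ]
```

For instance, `aes_subround([0x32, 0x5a, 0x98, 0x34], [0x2b, 0xae, 0x15, 0x3c])` reproduces the first MixColumns output column `04 66 81 e5` of the worked example in FIPS 197, Appendix B.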

### 2.2 Chow *et al.*’s White-Box AES Implementation and the BGE Attack

This section describes aspects of Chow *et al.*’s white-box AES implementation [5] and the BGE attack [3] that are relevant to this paper. For an in-depth tutorial on how Chow *et al.*’s white-box AES implementation is constructed, refer to [9].

**Encoded AES Subrounds.** In the following text, \(P^{(r,j)}_i\) and \(Q^{(r,j)}_i\) for \(0 \le i \le 3\) denote bijective mappings on the vector space \(\mathbf{F}_2^8\), referred to as *encodings* in white-box cryptography. The encodings are generated randomly and are kept secret in a white-box implementation (for details about encodings, refer to [5, 9]). A vector of four mappings, such as \(\big (P^{(r,j)}_0, P^{(r,j)}_1,P^{(r,j)}_2, P^{(r,j)}_3 \big )\) or \(\big (Q^{(r,j)}_0, Q^{(r,j)}_1, Q^{(r,j)}_2, Q^{(r,j)}_3 \big )\), denotes the mapping defined by applying the \(i\)-th element of the vector to its \(i\)-th input byte for \(0 \le i \le 3\). For \(a \in \mathbf{F}_2^n\) the mapping \(\oplus _{a}:\mathbf{F}_2^n \rightarrow \mathbf{F}_2^n\) denotes the addition with \(a\). With slight abuse of notation, an input to \(AES^{(r,j)}\) is considered to be an element of \(\mathbf{F}_{256}^4\) using the AES polynomial representation in the following definition, and an output of \(AES^{(r,j)}\) is considered to be an element of \((\mathbf{F}_2^8)^4\).

### **Definition 2**

In Chow *et al.*’s white-box AES implementation, the output encodings \(Q_i^{(r-1,j)}\) and input encodings \(P_{i}^{(r,j)}\) for \(0 \le i,j \le 3\) of successive AES rounds are pairwise annihilating to maintain the functionality of AES. The data-flow of the white-box implementation between successive AES rounds \(r-1\) and \(r\) determines the 16 pairs of output/input encodings which are pairwise annihilating.
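The pairwise annihilating property can be illustrated with a small Python sketch. It assumes that an encoding is simply a random bijection on bytes stored as a 256-entry table, and that the input encoding of round \(r\) is the inverse of the matching output encoding of round \(r-1\); the names below are hypothetical.

```python
import random

# Sketch (hypothetical names): an output encoding of round r-1 and the matching
# input encoding of round r cancel each other when composed.

def random_byte_bijection(rng):
    """Return a random permutation of 0..255 as a lookup table."""
    table = list(range(256))
    rng.shuffle(table)
    return table

def invert(table):
    """Return the inverse lookup table of a byte bijection."""
    inverse = [0] * 256
    for x, y in enumerate(table):
        inverse[y] = x
    return inverse

def annihilates(p_next, q_prev):
    """Check that applying q_prev and then p_next is the identity on bytes."""
    return all(p_next[q_prev[x]] == x for x in range(256))

rng = random.Random(2024)            # fixed seed, for reproducibility only
Q_prev = random_byte_bijection(rng)  # output encoding of round r-1
P_next = invert(Q_prev)              # matching input encoding of round r
unrelated = random_byte_bijection(rng)
```

Because the two tables cancel, chaining the encoded rounds leaves the underlying AES computation unchanged, while each individual table only exposes encoded values.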

### *Remark 1*

Although not explicitly mentioned by Chow *et al.* [5], one can use a randomization of the order of the subrounds in an AES round and in the order of the bytes within each subround to add confusion to the implementation. This can be implemented without increasing the size and without decreasing the performance of the white-box implementation. We capture such a randomization in the next definition of encoded subround where permutations \(\varPi _i^{(r,j)} :(\mathbf{F}_2^8)^4 \rightarrow (\mathbf{F}_2^8)^4\) (\(i=1,2\)) for \(1 \le r \le 9\) and \(0 \le j \le 3\) are added to randomize the order of the input bytes and output bytes of an AES subround. Moreover, permutations \(\pi ^{(r)}:\{0,1,2,3\} \rightarrow \{0,1,2,3\}\) for \(1 \le r \le 9\) randomize the order of the four AES subrounds within an AES round. These permutations are randomly chosen and kept secret in a white-box implementation.

### **Definition 3**

In [3], Billet *et al.* described a cryptanalysis of Chow *et al.*’s white-box AES implementation [5] with byte permutations and subround permutations. The starting point of their attack is that for rounds \(1 \le r \le 9\), it is possible to compose certain white-box look-up tables in such a way that an adversary has access to the encoded AES subrounds of each round.

**BGE Attack.** As indicated above, the adversary has access to the encoded AES subrounds \(\overline{AES}^{(r,j)}_{\text {enc}}\) for \(1 \le r \le 9\) and \(0 \le j \le 3\). Next, the BGE attack [3] comprises the following three phases: Phases 1 and 2 retrieve the bytes of the AES round key associated with round \(r\) for some \(r\) with \(2 \le r \le 9\), and Phase 3 determines the correct order of the round key bytes and extracts the AES key.

*Phase 1* retrieves the encodings \(Q^{(r,j)}_i\) (\(0 \le i \le 3\)) up to an affine part for each encoded AES subround \(j\) (\(0\le j \le 3\)). Because of the pairwise annihilating property of the encodings between successive rounds, the encodings \(P^{(r,j)}_i\) (\(0 \le i,j \le 3\)) can be retrieved up to an affine part by applying the same technique to the encoded AES subrounds of the previous round.

*Phase 2* assumes that all encodings of an encoded AES round are affine mappings (as the other parts have been retrieved in Phase 1). Phase 2 first retrieves the affine encodings \(Q^{(r,j)}_i\) (\(0 \le i \le 3\)) for each encoded AES subround \(j\) (\(0\le j \le 3\)). During this process, the key-dependent affine mappings \(\widetilde{P}^{(r,j)}_i(x) = P^{(r,j)}_i(x) \oplus \bar{k}^{(r,j)}_i\) (\(0 \le i,j \le 3\)) are obtained as well. As in Phase 1, the affine encodings \(P^{(r,j)}_i\) (\(0 \le i,j \le 3\)) are retrieved by applying the same technique to the encoded AES subrounds of the previous round. This enables the adversary to compute the round key bytes \(\bar{k}^{(r,j)}_i = \widetilde{P}^{(r,j)}_i(0) \oplus P^{(r,j)}_i(0)\) for \(0 \le i,j \le 3\).

*Phase 3* retrieves the round key bytes of round \(r+1\) as discussed above in Phases 1 and 2, and uses the fact that the round key bytes of rounds \(r\) and \(r+1\) are related to each other via both the data-flow of the white-box implementation and the AES key scheduling algorithm to retrieve the AES round key. Finally, assuming that the AES variant with a 128-bit key is used, the adversary can use the property of the AES key scheduling algorithm that the AES key can be computed if one of the round keys is known.
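The last observation, that a 128-bit AES key follows from any single round key, can be made concrete by running the key schedule backwards. The Python sketch below is an illustration only; the function names are hypothetical, and a round key is represented as four 4-byte words.

```python
# Sketch (hypothetical names): AES-128 key schedule, forwards and backwards.

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def build_sbox():
    """Build the AES S-box: field inversion followed by the AES affine map."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):
                inv = gf_mul(inv, x)
        s = 0x63
        for shift in range(5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s)
    return sbox

SBOX = build_sbox()
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]

def g(word, r):
    """RotWord, SubWord and round-constant addition used to derive round key r."""
    rotated = word[1:] + word[:1]
    out = [SBOX[b] for b in rotated]
    out[0] ^= RCON[r - 1]
    return out

def xor_words(a, b):
    return [x ^ y for x, y in zip(a, b)]

def next_round_key(rk, r):
    """Derive round key r from round key r-1."""
    w0 = xor_words(rk[0], g(rk[3], r))
    w1 = xor_words(rk[1], w0)
    w2 = xor_words(rk[2], w1)
    w3 = xor_words(rk[3], w2)
    return [w0, w1, w2, w3]

def prev_round_key(rk, r):
    """Derive round key r-1 from round key r."""
    p3 = xor_words(rk[3], rk[2])
    p2 = xor_words(rk[2], rk[1])
    p1 = xor_words(rk[1], rk[0])
    p0 = xor_words(rk[0], g(p3, r))
    return [p0, p1, p2, p3]

def recover_master_key(rk, r):
    """Walk the key schedule back from round key r to the AES-128 key."""
    for current in range(r, 0, -1):
        rk = prev_round_key(rk, current)
    return rk
```

Applied to a round key of round \(r\), `recover_master_key(rk, r)` returns the four words of the original key, which is why recovering one correctly ordered round key is enough for the adversary.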

*Work factor of the BGE attack.* In [3], the authors claim that the work factor associated with the three phases of the BGE attack is around \(2^{30}\). As a result, the white-box AES implementation of Chow *et al.* is insecure. For detailed information about the BGE attack, refer to [3].

## 3 Reducing the Work Factor of the BGE Attack

1. A method to reduce the expected work factor of Phase 2 of the BGE attack;
2. An efficient method to retrieve the round key bytes of round \(r+1\) after the round key bytes of round \(r\) are extracted;
3. An efficient method to determine the correct order of the round key bytes, given the round key bytes of two consecutive rounds.

**Phases 1 and 2: Retrieve the Round Key Bytes** \(\bar{k}^{(r,j)}_i\) (\(0 \le i,j \le 3\)) **Associated with Round** \(r\) (\(2 \le r \le 8\))

The first two phases are the ones of the BGE attack [3] using Tolhuizen’s improvement, and retrieve the round key bytes \(\bar{k}^{(r,j)}_i\) for \(0 \le i,j \le 3\) associated with round \(r\) for some \(r\) with \(2 \le r \le 8\).

*Work factor of Phase 1.* Tolhuizen’s improvement [12] reduces the work factor of Phase 1 to around \(2 \cdot 4 \cdot 4 \cdot (35 \cdot 2^8) < 2^{19}\). The first three factors (i.e., \(2 \cdot 4 \cdot 4\)) denote the number of encodings involved in Phase 1, i.e., four encodings for each of the four subrounds for each of the two consecutive rounds. The fourth factor (i.e., \(35 \cdot 2^8\)) denotes the work factor required to retrieve one encoding up to an affine part using Tolhuizen’s method.

*Work factor of Phase 2.* The expected work factor \(F\) of the second phase as described in [3] equals approximately \(2 \cdot 4 \cdot 4 \cdot 2^{15} \cdot 2^{8} = 2^{28}\), and is measured in the number of evaluations of mappings on \(\mathbf{F}_{2}^8\). The evaluations are required to determine if a mapping on \(\mathbf{F}_{2}^8\) is affine. The mappings \(f\) that need to be tested for being affine are listed in [3, Proposition 3]. Each \(f\) is associated with a secret encoding \(P^{(r,j)}_i\) (\(0 \le i,j \le 3\)) of a round \(r\). As Phase 2 needs to be applied to two consecutive rounds, this involves a total of \(2 \cdot 4 \cdot 4\) mappings (which corresponds to the first three factors in \(F\)). The mappings \(f\) are permutations on \(\mathbf{F}_{2}^8\) and have the structure

To show the correctness of this algorithm, it is sufficient to show that an affine mapping always satisfies Eq. 2. If \(f\) is affine, then \(f(x) = A(x) \oplus b\) for some \(A \in \mathbf{F}_{2}^{n \times n}\) and some \(b \in \mathbf{F}_{2}^{n}\). It follows that \(f(0) \oplus f(e_1) \oplus f(e_2) = b \oplus A(e_1) \oplus b \oplus A(e_2) \oplus b = A(e_1 \oplus e_2) \oplus b = f(e_1 \oplus e_2)\).

### **Lemma 1**

If \(f\) is a random permutation on \(\mathbf{F}_{2}^{n}\) and if \(E(n)\) denotes the expected number of evaluations of \(f\) required by the algorithm described above, then \(E(n) < 5\).

### *Proof*

Let \(p(n)\) denote the probability that Eq. 2 holds true for a random permutation. To determine \(p(n)\), note that \(f(0),f(e_1),f(e_2)\) and \(f(e_1 \oplus e_2)\) are four distinct elements of \(\mathbf{F}_{2}^{n}\) if \(f\) is a permutation. From this it follows that \(f(0) \oplus f(e_1) \oplus f(e_2)\) and \(f(e_1 \oplus e_2)\) are both elements of \(\mathbf{F}_{2}^{n} \setminus \{ f(0),f(e_1),f(e_2) \}\). Further, as \(f\) is a random permutation, \(f(e_1 \oplus e_2)\) is a random element of this set. Hence, \(p(n) = 1 / (2^n - 3)\) and \(E(n) = 4(1-p(n)) + 2^n p(n) = 4 + (2^n - 4) / (2^n - 3) < 5\). \(\square \)
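An affine-test with early abort, in the spirit of the analysis above, might look like the following Python sketch. It is an assumption-laden illustration rather than the exact algorithm of the paper: it takes \(e_1\) and \(e_2\) to be the first two unit vectors, checks the relation \(f(0) \oplus f(e_1) \oplus f(e_2) = f(e_1 \oplus e_2)\) with four evaluations, and only runs a full check over all \(2^n\) inputs when that relation holds. The name `is_affine` is hypothetical.

```python
# Sketch (hypothetical name): affine-test on n-bit mappings with early abort.
# Assumption: e1 and e2 are the first two unit vectors.

def is_affine(f, n=8):
    """Return True if f behaves as an affine map over GF(2) on n-bit inputs."""
    cache = {}

    def evaluate(x):
        # Each input is evaluated at most once, so the cost is at most 2^n calls.
        if x not in cache:
            cache[x] = f(x)
        return cache[x]

    e1, e2 = 1, 2
    # Early abort: four evaluations reject almost every random permutation.
    if evaluate(0) ^ evaluate(e1) ^ evaluate(e2) != evaluate(e1 ^ e2):
        return False

    # Full check: an affine f is determined by f(0) and the images of the unit vectors.
    f0 = evaluate(0)
    basis = [evaluate(1 << i) ^ f0 for i in range(n)]
    for x in range(1 << n):
        expected = f0
        for i in range(n):
            if (x >> i) & 1:
                expected ^= basis[i]
        if evaluate(x) != expected:
            return False
    return True
```

A mapping that fails the first relation costs exactly four evaluations, whereas an affine mapping costs \(2^n\) evaluations, which matches the two terms in the expression for \(E(n)\).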

Under the assumption that \(f\) in Eq. 1 behaves as a random permutation on \(\mathbf{F}_{2}^8\) for every incorrect guess for \((c,d)\), the expected work factor of the affine-test is reduced from \(2^8\) to approximately \(5\) evaluations if \(f\) is not affine and the work factor is \(2^8\) if \(f\) is affine. This implies that the fifth factor in \(F\) is reduced to approximately \(5\). That is, the expected work factor of Phase 2 of the BGE attack is now approximately \(2 \cdot 4 \cdot 4 \cdot 2^{15} \cdot 5 \approx 2^{22}\).

**Phase 3: Retrieve the Round Key Bytes** \(\bar{k}^{(r+1,j)}_i\) (\(0 \le i,j \le 3\)) **Associated with Round** \(r+1\)

As mentioned in the description of the BGE attack in Sect. 2.2, [3] obtains the round key bytes of round \(r+1\) by applying Phases 1 and 2 to round \(r+1\) as well. Here, we present a more efficient method based on the affine-test described above. The method comprises the following three steps for each encoded AES subround \(j\) (\(0 \le j \le 3\)) associated with round \(r+1\) to retrieve the round key bytes \(\bar{k}_i^{(r+1,j)}\) (\(0 \le i,j \le 3\)):

*Step 1* applies Phase 1 (using Tolhuizen’s improvement) to round \(r+1\) in order to retrieve the encodings \(Q_i^{(r+1,j)}\) (\(0 \le i \le 3\)) up to an affine part.

*Step 2* first removes the non-affine part of the output encodings as recovered in Step 1 from the encoded AES subround. Next, Step 2 removes the input encodings \(P_i^{(r+1,j)}\) (\(0 \le i \le 3\)) from the encoded AES subround (observe that the inverses of these input encodings were obtained in Phases 1 and 2). The resulting mapping \(f^{(r+1,j)} : (\mathbf{F}_{2}^8)^4 \rightarrow (\mathbf{F}_{2}^8)^4\) is given by

*Step 3* retrieves the round key bytes \(\bar{k}_i^{(r+1,j)}\) (\(0 \le i \le 3\)). To find a key byte, say \(\bar{k}_0^{(r+1,j)}\), fix the other three input bytes to \(f^{(r+1,j)}\) (e.g., to zero), search over all possible \(2^8\) values of the key byte \(k\) and verify if

The correctness of Step 3 uses the fact that the mapping \(S\big ( c \oplus S^{-1}(x) \big )\) is non-affine for all non-zero values of \(c\). This has already been proven in [3, proof of Proposition 3].
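The non-affinity fact used here can also be checked exhaustively. The Python sketch below builds the AES S-box, forms the table of \(x \mapsto S(c \oplus S^{-1}(x))\) for every \(c\), and tests each table for affinity over \(\mathbf{F}_2\). It is an empirical illustration with hypothetical function names, not a replacement for the proof in [3].

```python
# Sketch (hypothetical names): check that x -> S(c XOR S^{-1}(x)) is affine only for c = 0.

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def build_sbox():
    """Build the AES S-box: field inversion followed by the AES affine map."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):
                inv = gf_mul(inv, x)
        s = 0x63
        for shift in range(5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s)
    return sbox

SBOX = build_sbox()
INV_SBOX = [0] * 256
for value, image in enumerate(SBOX):
    INV_SBOX[image] = value

def is_affine_table(table):
    """Return True if the 256-entry table describes an affine map over GF(2)."""
    f0 = table[0]
    for x in range(256):
        expected = f0
        for i in range(8):
            if (x >> i) & 1:
                expected ^= table[1 << i] ^ f0
        if table[x] != expected:
            return False
    return True

def shifted_sbox_map(c):
    """Table of the mapping x -> S(c XOR S^{-1}(x))."""
    return [SBOX[c ^ INV_SBOX[x]] for x in range(256)]

affine_constants = [c for c in range(256) if is_affine_table(shifted_sbox_map(c))]
```

In the key search of Step 3, this is what singles out the correct guess: a wrong guess leaves a non-zero offset between the S-box and its inverse, so the tested mapping is not affine.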

*Work factor of Phase 3.* The work factor of Step 3 equals \(4 \cdot 4 \cdot 2^7 \cdot 5 \approx 2^{13}\), where \(4 \cdot 4\) denotes the number of round key bytes, \(2^7\) denotes the expected number of key values for which the affine-test is performed and \(5\) denotes the expected number of evaluations of the affine-test if \(g_k\) is not affine. The work factor of Step 1 is \(4 \cdot 4 \cdot (35 \cdot 2^8) < 2^{18}\), where the first two factors denote the number of output encodings involved in Step 1. As a result, the work factor of Phase 3 is dominated by Step 1 and is less than \(2^{18}\).

**Phase 4: Determine the Correct Order of the Round Key Bytes and Extract the Secret AES Key**

After Phases 1–3, the values of the round key bytes of two consecutive rounds \(r\) and \(r+1\) are known. However, for each round, the order of the round key bytes of each subround and the order of the four subrounds are still unknown. Notice that there are still \((4!)^5 \approx 2^{23}\) possibilities for the round key if only the bytes of that round key are considered. In [3], it is indicated how the correct order can be determined given the “shuffled” round key bytes of rounds \(r\) and \(r+1\). However, [3] does not contain an explicit description of such a method. As the work factor of the first three phases equals \(2^{22}\), it is desirable to have a method to determine the correct order of the round key bytes with a work factor that is less than \(2^{22}\). Below we present such a method, comprising the following three steps:

*Step 1* retrieves \(\mathtt{MC}^{(r,j)}\) associated with each subround \(j\) (\(0 \le j \le 3\)) of round \(r\). Recall that the encodings \(P^{(r,j)}_i\) and \(Q^{(r,j)}_i\) (\(0 \le i,j \le 3\)) were obtained in Phases 1 and 2. Together with the knowledge of the round key bytes \(\bar{k}_i^{(r,j)}\) (\(0 \le i,j \le 3\)), compute

*Step 2* computes for each \(\mathtt{MC }^{(r,j)}\) (\(0 \le j \le 3\)) the permutations \(\varPi _1, \varPi _2: (\mathbf{F}_2^8)^4 \rightarrow (\mathbf{F}_2^8)^4 \) such that

After this, the order of the round key bytes associated with each subround is known up to an uncertainty of four possibilities (circular shifts). Observe that the order of the four subrounds is still unknown.

*Step 3* determines the correct order of the round key bytes. For each of the possible orderings of the four AES subrounds of round \(r\) and the round key bytes within these subrounds (as determined in Step 2), obtain a candidate for the \((r+1)^{th}\) round key using the following two methods: (i) the AES key scheduling algorithm and (ii) the data-flow of the white-box AES implementation between the encoded subrounds of rounds \(r\) and \(r+1\). Notice that once an order of the round key bytes of round \(r\) is selected, the order of the round key bytes of round \(r+1\) can be determined using the corresponding pair of permutations of each of the subrounds of round \(r\) (see also Eq. 4) and the data-flow of the white-box implementation. With overwhelming probability, only one ordering of round key bytes of round \(r\) results in the same \((r+1)^{th}\) round key; this ordering corresponds to the correct round key of round \(r\). Finally, use the property of the AES key scheduling algorithm that the AES key can be computed if one of the round keys is known.

*Work factor of Phase 4.* A naive approach yields an expected work factor of \((4!)^2 \approx 2^{9}\) for Step 2 by searching over all possible pairs of permutations. Step 2 reduces the number of possible orderings of the round key bytes from \(2^{23}\) to \(4^4 \cdot 4! < 2^{13}\) (where the first and second factor denote the possible orderings of round key bytes within each subround and of the four subrounds, respectively), which equals the work factor of Step 3. As a result, the overall work factor of Phase 4 is dominated by the work factor of Step 3 and hence is less than \(2^{13}\).

### 3.1 Conclusion

The work factor of the improved BGE attack is dominated by the work factor of the second phase and equals \(2^{22}\).
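The work factors quoted in this section can be recomputed with a few lines of Python. The sketch below only restates the products given in the text (the variable names are hypothetical) and prints their base-2 logarithms.

```python
from math import factorial, log2

# Recompute the work factors quoted in Sect. 3 (variable names are hypothetical).
tolhuizen_per_encoding = 35 * 2**8                # one encoding up to an affine part

phase1 = 2 * 4 * 4 * tolhuizen_per_encoding       # two rounds, four subrounds, four encodings
phase2_original = 2 * 4 * 4 * 2**15 * 2**8        # affine-test costing 2^8 evaluations
phase2_improved = 2 * 4 * 4 * 2**15 * 5           # affine-test costing about 5 evaluations
phase3_step1 = 4 * 4 * tolhuizen_per_encoding     # output encodings of round r+1
phase3_step3 = 4 * 4 * 2**7 * 5                   # key-byte search with early abort
shuffled_round_keys = factorial(4) ** 5           # orderings before Phase 4, Step 2
phase4_step3 = 4**4 * factorial(4)                # orderings left after Phase 4, Step 2

for name, value in [
    ("Phase 1", phase1),
    ("Phase 2 (original)", phase2_original),
    ("Phase 2 (improved)", phase2_improved),
    ("Phase 3, Step 1", phase3_step1),
    ("Phase 3, Step 3", phase3_step3),
    ("Phase 4, Step 3", phase4_step3),
]:
    print(f"{name}: about 2^{log2(value):.1f}")
```

The improved Phase 2 is the largest term, in line with the statement that it dominates the overall work factor.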

Note that the uncertainty in the order of the round key bytes results in the need to retrieve key bytes of two consecutive rounds. This affects the work factor of the original BGE attack. In the improved BGE attack this is no longer the case, as the work factors of the phases that determine the correct order (i.e. Phases 3 and 4) are negligible compared to the work factor of Phase 2. A consequence of Tolhuizen’s improvement is that the use of non-affine white-box encodings has a negligible impact on the overall work factor of the improved BGE attack.

## 4 A New Attack Exploiting Internal Collisions

In this section we propose a new attack on the initial Chow *et al.* implementation, exploiting collisions in the output of the first AES round. Note that unlike the BGE attack, the description below only considers the basic implementation, i.e., without byte permutations. In this section, an encoded AES subround is defined as in Definition 2.

### 4.1 Recovering the \(S_i\) Functions

*i.e.* \(u_i = S_0(i)\) and \(v_i=S_1(i)\)). Then (8) can be rewritten as

*i.e.* \(u'_i=S_0(0) \oplus S_0(i)\) and \(v'_i=S_1(0) \oplus S_1(i)\)), which is such that \(u'_i\ne 0\) and \(v'_i\ne 0\) by bijectivity of \(S_0\) and \(S_1\). The obtained system is hence of rank at most \(509\).

### **Lemma 2**

### *Proof*

The map \(\varphi \) is a \(4\)th-order derivative of the function \(g\) (specifically \(\varphi = D_{ \mathsf {1}} D_{ \mathsf {2}} D_{ \mathsf {4}} D_{ \mathsf {8}}(g)\)) and since \(g\) has algebraic degree at most \(4\), all its \(4\)th-order derivatives are null. \(\square \)
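The vanishing of such a higher-order derivative can be observed on a toy function. The Python sketch below is an illustration under assumptions that are not taken from the lemma: the toy \(g\) applies an affine map to its input and then two random 4-bit bijections to the two nibbles of the result, so every output bit has algebraic degree at most 3 in the input bits. The sketch then XORs \(g\) over the cosets of the subspace spanned by \(\mathsf{1}, \mathsf{2}, \mathsf{4}, \mathsf{8}\). All names are hypothetical.

```python
import random

# Sketch (hypothetical names): 4th-order derivative of a toy low-degree function.
rng = random.Random(7)  # fixed seed, for reproducibility only

def random_nibble_bijection():
    """Return a random permutation of 0..15 as a lookup table."""
    table = list(range(16))
    rng.shuffle(table)
    return table

N_HI = random_nibble_bijection()
N_LO = random_nibble_bijection()
COLUMNS = [rng.randrange(256) for _ in range(8)]  # images of the unit vectors

def linear(x):
    """GF(2)-linear map on bytes defined by the images of the unit vectors."""
    y = 0
    for i in range(8):
        if (x >> i) & 1:
            y ^= COLUMNS[i]
    return y

def g(x):
    """Toy function: affine layer followed by two 4-bit bijections on the nibbles."""
    y = linear(x) ^ 0xA7
    return (N_HI[y >> 4] << 4) | N_LO[y & 0xF]

def phi(func, x):
    """XOR of func over the coset x + span{1, 2, 4, 8}, i.e. D_1 D_2 D_4 D_8 func at x."""
    acc = 0
    for alpha in range(16):
        acc ^= func(x ^ alpha)
    return acc

all_zero = all(phi(g, x) == 0 for x in range(256))
```

Because the sum runs over only 16 values, evaluating \(\varphi\) at one point is cheap, which is what makes it usable as a selection criterion in an exhaustive search.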

### *Remark 2*


Once \(S_0\) has been recovered, we can recover \(S_1\) from (11) by exhaustive search on \(v_0\). Here again, the good solution is determined using Lemma 2 and the above approach. The remaining functions \(S_{2}\) and \(S_{3}\) are recovered similarly by solving the linear systems arising from collisions of the form \(f'_\ell (\alpha ,0,0,0) = f'_\ell (0,0,\beta ,0)\) and \(f'_\ell (\alpha ,0,0,0) = f'_\ell (0,0,0,\beta )\). Since \(S_0\) is already known, we get the same situation as for the recovery of \(S_1\). Namely, all the elements of \(S_{2}\) (resp. \(S_{3}\)) can be expressed as affine functions of \(S_{2}(0)\) (resp. \(S_{3}(0)\)), and we can recover the overall function by exhaustive search on this value and with the selection criterion of Lemma 2.

### 4.2 Recovering the Secret Key

Since the output byte-encodings of the first round are the inverse of the input byte-decodings of the second round, we now show how to retrieve the key bytes in the second round from that knowledge. In what follows, we shall slightly change the definition of \(f'\) and the \(S_i\)’s given in (5) and (6). Namely, \(f'\) shall denote the first encoded subround of the second round (rather than that of the first round), and \(S_i\) the associated functions, that is \(f' = AES_{enc}^{(2,0)}\) and \(S_i (\cdot ) = S(k^{(2,0)}_i \oplus (P^{(2,0)}_i)(\cdot ))\) for \(0\le i \le 3\). As in the previous section, we shall further drop all the superscripts \((2,0)\) for the sake of clarity.

This way, we can easily recover \(k_0\) by exhaustive search while testing for every candidate whether the function \(\hat{g}\) is of algebraic degree \(4\) or not. Namely, for every guess \(\hat{k}_0\), we test whether the function

The key bytes \(k_1\), \(k_2\) and \(k_3\) can be retrieved similarly; only the definition of the function \(g\) changes. For instance, \(g\) is defined as \(f_0' (0, P_1^{-1}(S^{-1}(\cdot )\oplus k_1), 0, 0)\) for \(k_1\), and so on for \(k_{2}\) and \(k_{3}\). The other key bytes \(k_i^{(2,j)}\) for \(j\ge 1\) can be recovered in exactly the same way. Finally, from the second round key, one can easily recover the full AES secret key by inverting the key schedule process.

### 4.3 Attack Complexity

*i.e.* in \(512\) times a few operations). To recover \(S_0\), one loops over the \(2^{16}\) candidate values for \((u_0,u_1)\), and for each value tests whether \(\hat{\varphi }(x)=0\) (which is a XOR over \(16\) elements) for at most \(16\) values \(x\). The test is evaluated lazily: we first test whether \(\hat{\varphi }(0)=0\); if this fails we stop, and if it holds we move on to the next \(x\), and so on. Now getting \(\hat{\varphi }(x)=0\) for a wrong pair \((u_0,u_1)\) occurs with probability roughly \(1/256\), therefore the expected number of tests is \(1+1/256+\cdots +1/(256^{15}) \le 1.004\). The complexity of the recovery of \(S_0\) is hence of

The recovery of the key bytes has a negligible complexity compared to the recovery of the \(S_i\) functions in the first round. Indeed, according to the above analysis, the recovery of one key byte costs roughly \(2^8 \cdot 1.004 \cdot 2^4 \approx 2^{12}\). This must be done \(16\) times, yielding a complexity of \(16\cdot 2^{12} \ll 2^{22}\).
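The arithmetic behind these estimates can be reproduced directly. The short Python sketch below (with hypothetical variable names) evaluates the expected number of lazy tests for a wrong candidate and the resulting cost of recovering the key bytes.

```python
from math import log2

# Recompute the estimates of Sect. 4.3 (variable names are hypothetical).

# Expected number of evaluations of the test for a wrong candidate, when each
# further test is only reached with probability 1/256.
expected_tests = sum((1 / 256) ** i for i in range(16))

# One key byte: 2^8 guesses, each needing about expected_tests tests,
# each test being a XOR over 2^4 elements.
per_key_byte = 2**8 * expected_tests * 2**4

# All 16 key bytes of the second round key.
all_key_bytes = 16 * per_key_byte

print(f"expected tests per wrong candidate: {expected_tests:.6f}")
print(f"one key byte: about 2^{log2(per_key_byte):.2f}")
print(f"all key bytes: about 2^{log2(all_key_bytes):.2f}")
```

The total stays far below \(2^{22}\), so the key-recovery step does not affect the overall complexity of the attack.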

## 5 Karroumi’s White-Box AES Implementation

Karroumi’s method to generate a white-box AES implementation [6] can be divided into two phases; *Phase 1* generates a dual AES cipher from a key-instantiated AES cipher, and *Phase 2* applies the white-box techniques presented by Chow *et al.* to the dual AES cipher. Below, aspects of these phases that are relevant to this paper are described.

**Phase 1: Dual AES Cipher**

### **Definition 4**

The following lemma presents a property that is required to show that a dual AES cipher maintains the functionality of AES. As the lemma is also used in the cryptanalysis in this paper, and as a formal proof of this property is omitted in [4] and [6], we include a proof as well.

### **Lemma 3**

### *Proof*

Now, Karroumi [6] obtains a dual AES cipher as follows:

*Step 1* assigns a randomly chosen \(\varDelta _{r,j} \in \mathcal {T}\) to each AES subround \(AES^{(r,j)}\) (\(1 \le r \le 9\) and \(0 \le j \le 3\)). Based on \(\varDelta _{r,j}\), the corresponding dual AES subround \(AES^{(r,j,\varDelta _{r,j})}\) is implemented as specified by Definition 4. The mappings \(\varDelta _{r,j}\) and \(\delta _{r,j}\) (and the implementation of the dual cipher) are kept secret.

*Step 2* ensures that the functionality of AES is maintained by including an additional operation (referred to as ChangeDualState) between ShiftRows and AddRoundKey operations of round \(r\) for \(1 \le r \le 9\). If the inverse ShiftRows operation is defined by the mapping \(sr(i,j) = (j+i) \text { mod } 4\) for \(0 \le i,j \le 3\), then the ChangeDualState operation of round \(r\) applies the mapping \(C_i^{(r,j)}: \mathbf{F}_{256}\rightarrow \mathbf{F}_{256}\) to the byte of the state associated with the \(i\)-th input byte of \(\text {AES}^{(r,j,\varDelta _{r,j})}\) for \(0 \le i,j \le 3\), defined by \(C_i^{(1,j)} = \varDelta _{1,j}\) and \(C_i^{(r,j)} = \varDelta _{r,j} \circ \varDelta _{r-1,\text {sr}(i,j)}^{-1}\) if \(2 \le r \le 9\). Observe that for \(2 \le r \le 9\), \(C_i^{(r,j)}\) maps elements from \(\mathbf{F}_{256}\) using the polynomial representation associated with \(\varDelta _{r-1,\text {sr}(i,j)}\) to elements of \(\mathbf{F}_{256}\) using the polynomial representation associated with \(\varDelta _{r,j}\).

Karroumi presents two different but equivalent methods (from a security point of view) in [6] to perform the ChangeDualState operation, and specifies the white-box AES implementation using one of these methods. In this paper we use the specification as in [6]; the cryptanalysis can easily be adapted if the other method is used.

**Phase 2: Apply the Techniques of Chow** **et al.**

The following description of Karroumi’s white-box AES implementation is equivalent to the description in [6]:

*Step 1* applies the techniques of Chow *et al.* to write the dual AES cipher (with a fixed key) obtained in Phase 1 as a series of lookup tables. In particular, the dual AES key addition operations and the dual S-box operations are merged into key-dependent bijective mappings \(T_i^{(r,j,\varDelta _{r,j})}\) for \(0 \le i,j \le 3\) and \(1 \le r \le 9\). These mappings are referred to as dual T-boxes and are defined by

*et al.* in [5]. The number and types of tables (including the tables representing the dual T-boxes) and the data-flow between tables are the same as in the lookup table implementation of AES in [5]. The only difference is that the values of the table entries of the dual AES implementation are likely to be different from the values of the corresponding entries in the AES implementation in [5] due to the dual version of the AES operations.

*Step 2* applies the white-box encoding techniques of Chow *et al.* in [5] to this lookup table implementation of dual AES. As these white-box encoding techniques do not depend on the values of the table entries, the number and types of white-box tables, and the data-flow of Karroumi’s white-box AES implementation are the same as in the white-box AES implementation of Chow *et al.* in [5].

In [6], Karroumi argues that the secrecy of the mappings \(\varDelta _{r,j}\), randomly selected from the set \(\mathcal {T}\) and used to generate the dual cipher, increases the work factor of the BGE attack to \(2^{93}\).

### 5.1 Insecurity

This section shows that Karroumi’s white-box AES implementation [6] is insecure. Recall that Karroumi’s white-box AES implementation uses the same number and types of white-box tables, and that the data-flow of the implementation is the same as in Chow *et al.*’s white-box AES implementation in [5]. As a result, the techniques of Billet *et al.* can be applied directly to compose lookup tables in Karroumi’s implementation to obtain access to the encoded dual AES subrounds (instead of the encoded AES subrounds in case of Chow *et al.*’s implementation) for rounds \(1 \le r \le 9\). In the following definition, \(A_i^{(r,j)}\) and \(B_i^{(r,j)}\) for \(0 \le i \le 3\) denote bijective mappings (or encodings) on the vector space \(\mathbf{F}_2^8\). Further, with slight abuse of notation, an output of \(A_i^{(r,j)}\) is considered to be an element of \(\mathbf{F}_{256}\) using the polynomial representation associated with the mapping \(R_l\) as defined by \(\varDelta _{r-1,\text {sr}(i',j')}\), and an output of \(AES^{(r,j,\varDelta _{r,j})}\) is considered to be an element of \((\mathbf{F}_2^8)^4\). In the following definition, \(\varPi _1^{(r,j)}, \varPi _2^{(r,j)}\) and \(\pi ^{(r)}\) are the permutations as used in Definition 3.

### **Definition 5**

The next lemma shows that an encoded dual AES subround can be represented by an encoded AES subround using the same key bytes:

### **Lemma 4**

### *Proof*

From the discussion above it follows that Karroumi’s white-box AES implementation and the white-box AES implementation of Chow *et al.* are the same. As a consequence, Karroumi’s white-box AES implementation is vulnerable to the original BGE attack and the attacks presented in this paper.
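The intuition behind this equivalence can be checked on a toy example. The sketch below is a simplified one-byte illustration, not the proof of Lemma 4: it considers a single T-box without MixColumns, uses byte-wide random encodings, and takes \(\varDelta\) to be the squaring map on \(\mathbf{F}_{256}\) (an \(\mathbf{F}_2\)-linear field automorphism, chosen here as one convenient example of a dual-cipher mapping). The key byte `k` and all function and variable names are assumptions made for the illustration.

```python
import random

def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def gf_inv(a):
    """Return a^254, i.e. the inverse of a, with 0 mapped to 0."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r

def rotl8(x, s):
    return ((x << s) | (x >> (8 - s))) & 0xFF

def aes_sbox(x):
    """AES S-box: field inversion followed by the affine map of FIPS 197."""
    b = gf_inv(x)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63

def random_bijection(rng):
    perm = list(range(256))
    rng.shuffle(perm)
    inv = [0] * 256
    for x, y in enumerate(perm):
        inv[y] = x
    return perm, inv

SBOX = [aes_sbox(x) for x in range(256)]

# Assumption: Delta is squaring in GF(2^8), used as an example dual mapping.
DELTA = [gf_mul(x, x) for x in range(256)]
DELTA_INV = [0] * 256
for x, y in enumerate(DELTA):
    DELTA_INV[y] = x

k = 0x2B  # hypothetical key byte
rng = random.Random(7)
f, f_inv = random_bijection(rng)  # secret input encoding
g, _ = random_bijection(rng)      # secret output encoding

# Encoded dual T-box: dual S-box Delta o S o Delta^{-1} with dual key Delta(k).
dual_sbox = [DELTA[SBOX[DELTA_INV[y]]] for y in range(256)]
enc_dual = [g[dual_sbox[f_inv[x] ^ DELTA[k]]] for x in range(256)]

# Encoded standard T-box with the SAME key byte k and the encodings
# g' = g o Delta and f'^{-1} = Delta^{-1} o f^{-1}.
g2 = [g[DELTA[v]] for v in range(256)]
f2_inv = [DELTA_INV[f_inv[x]] for x in range(256)]
enc_std = [g2[SBOX[f2_inv[x] ^ k]] for x in range(256)]

assert enc_dual == enc_std
```

Because `g2` and `f2_inv` are again bijections on bytes, an adversary who sees only the encoded table cannot tell whether it came from the dual cipher or from standard AES with different encodings. The linearity of `DELTA` is what allows the dual key byte `DELTA[k]` to be rewritten in terms of the original key byte `k`.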

## 6 Conclusion

The BGE attack on the white-box AES implementation of Chow *et al.* extracts the AES key from such an implementation with a work factor of \(2^{30}\). Taking Tolhuizen’s improvement to the most time-consuming phase of the BGE attack as the starting point, Sect. 3 presented several improvements to the other phases of the BGE attack. It was shown that the overall work factor of the BGE attack is reduced to \(2^{22}\) when all improvements are implemented. Unlike the original BGE attack, the use of non-affine white-box encodings and the randomization in the order of the bytes of the intermediate results in AES have a negligible contribution to the overall work factor of the improved BGE attack.

Section 4 presented a new attack on the white-box implementation of Chow *et al.* based on collisions occurring in the output bytes of an encoded AES round. It was shown that the new attack also has a work factor of \(2^{22}\).

Karroumi’s white-box AES implementation was designed to withstand the BGE attack. Section 5 showed that the white-box AES implementations of Chow *et al.* and Karroumi are the same. As a result, the original BGE attack and the attacks presented in this paper can be applied directly to extract the key from Karroumi’s white-box AES implementation, implying that this implementation is insecure.

## Notes

### Acknowledgments

This work was supported in part by the Research Council KU Leuven: GOA TENSE (GOA/11/007). In addition, this work was supported by the Flemish Government, FWO WET G.0213.11N and IWT GBO SEC SODA. Yoni De Mulder was supported in part by a research grant of iMinds of the Flemish Government.

## References

- 1. Barkan, E., Biham, E.: In how many ways can you write Rijndael? In: Zheng, Y. (ed.) ASIACRYPT 2002. LNCS, vol. 2501, pp. 160–175. Springer, Heidelberg (2002)
- 2. Barkan, E., Biham, E.: The book of Rijndaels. IACR Cryptology ePrint Archive, 2002:158. http://eprint.iacr.org/2002/158 (2002)
- 3. Billet, O., Gilbert, H., Ech-Chatbi, C.: Cryptanalysis of a white box AES implementation. In: Handschuh, H., Hasan, M.A. (eds.) SAC 2004. LNCS, vol. 3357, pp. 227–240. Springer, Heidelberg (2004)
- 4. Biryukov, A., De Cannière, C., Braeken, A., Preneel, B.: A toolbox for cryptanalysis: linear and affine equivalence algorithms. In: Biham, E. (ed.) EUROCRYPT 2003. LNCS, vol. 2656, pp. 33–50. Springer, Heidelberg (2003)
- 5. Chow, S., Eisen, P., Johnson, H., Van Oorschot, P.C.: White-box cryptography and an AES implementation. In: Nyberg, K., Heys, H. (eds.) SAC 2002. LNCS, vol. 2595, pp. 250–270. Springer, Heidelberg (2003)
- 6. Karroumi, M.: Protecting white-box AES with dual ciphers. In: Rhee, K.-H., Nyang, D. (eds.) ICISC 2010. LNCS, vol. 6829, pp. 278–291. Springer, Heidelberg (2011)
- 7. Lepoint, T., Rivain, M.: Another nail in the coffin of white-box AES implementations. Cryptology ePrint Archive, Report 2013/455. http://eprint.iacr.org/2013/455.pdf (2013)
- 8. National Institute of Standards and Technology: Advanced encryption standard. In: Federal Information Processing Standard (FIPS), Publication 197, U.S. Department of Commerce, Washington, DC (November 2001). http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf
- 9. Muir, J.A.: A tutorial on white-box AES. In: Kranakis, E. (ed.) Advances in Network Analysis and its Applications. Mathematics in Industry, pp. 209–229. Springer, Heidelberg (2013). http://www.ccsl.carleton.ca/ jamuir/papers/wb-aes-tutorial.pdf
- 10. De Mulder, Y., Roelse, P., Preneel, B.: Cryptanalysis of the Xiao-Lai white-box AES implementation. In: Knudsen, L.R., Wu, H. (eds.) SAC 2012. LNCS, vol. 7707, pp. 34–49. Springer, Heidelberg (2013)
- 11. De Mulder, Y., Roelse, P., Preneel, B.: Revisiting the BGE attack on a white-box AES implementation. Cryptology ePrint Archive, Report 2013/450. http://eprint.iacr.org/2013/450.pdf (2013)
- 12. Tolhuizen, L.: Improved cryptanalysis of an AES implementation. In: 33rd WIC Symposium on Information Theory in the Benelux (2012)
- 13. Xiao, Y., Lai, X.: A secure implementation of white-box AES. In: 2nd International Conference on Computer Science and its Applications (CSA 2009), pp. 1–6. IEEE (2009)