
1 Introduction

While the importance of attacks targeting actual primitives is obvious, structural attacks can also lead to interesting developments. In fact, the last few years have seen the publication of several such attacks. For example, the attack targeting the SASAS construction has recently been extended to larger constructions [1]. The ASASA structure, which might look weaker at first glance due to its lower number of non-linear layers, has actually proved to be a challenging target; it was even proposed as the basis for public key encryption and white-box schemes [2]. Attacking this generic structure requires the sophisticated methods presented in [3] and [4]. Feistel Networks have also been the target of generic attacks in two different settings. If the Feistel functions are completely secret, attacks on up to 5 rounds are presented in [5]. If the Feistel functions consist of public functions preceded by the addition of a secret key, powerful attacks with very low data complexity are presented in [6].

As illustrated by the usage of the ASASA structure, generic constructions can be applied in white-box cryptography, where the aim is to prevent an attacker from accessing some of the inner components an algorithm uses to perform its computations. Thus, structural attacks are important in this context. They can also be used to reverse-engineer the secret structure of an S-Box, allowing for example an attacker to enjoy the benefits of a lightweight implementation known a priori only by the designer of the S-Box. The use of small Feistel Networks for lightweight S-Box design is investigated in [7] and, in fact, a secret hardware-efficient decomposition was recently discovered for the S-Box of the last Russian standards [8] using such reverse-engineering.

Our Contribution. Our results are based on the high-degree indicator matrix (HDIM), a new object we introduce. We associate to any n-bit permutation F an \(n \times n\) Boolean matrix \(\hat{H}(F)\) which can be computed in time \(\text {O}(n2^{n-1})\) using the full code-book and which is related all at once to the LAT/Walsh spectrum of F, to its algebraic normal form and to the existence of integral distinguishers.

The HDIM provides new attack directions which we illustrate by analysing some generic constructions based on Feistel Networks. In particular, we show the existence of some patterns in the HDIM of 2n-bit Feistel Networks with r rounds and Feistel functions with degree d depending on \(\theta (d, r)\) with

$$\begin{aligned} \theta (d,r) = d^{\lfloor {r/2} \rfloor -1} + d^{\lceil {r/2} \rceil -1}. \end{aligned}$$

These patterns provide efficient distinguishers for such structures. When the round functions are bijective, such patterns always exist in Feistel Networks with up to at least 5 rounds. We also show that these distinguishers can be interpreted as particular integral distinguishers and describe some relations between our results and Todo’s division property [9]. Due to their integral nature, our distinguishers are extremely memory efficient: we only need to store a block containing the sum studied. In contrast, the impossible differential for 5-round Feistel Networks [10] and the yoyo-game [5] are the best known distinguishers for 5-round FNs with bijective Feistel functions and require respectively \(\text {O}(2^n)\) and \(\text {O}(2^{2n})\) blocks of memory.

We also present a new type of recovery attack against Feistel Networks with secret round functions which rebuilds the last Feistel function by exploiting the predictable absence of some monomials in the algebraic normal form of the permutation without its last round.

Table 1. Structural attacks against Feistel Networks. n is the branch size, d is the degree of the Feistel functions.

Outline. We first describe the definitions and notations that we shall use throughout the paper in Sect. 2. Then, we investigate in Sect. 3 the relation between the different rows and columns of a table containing the congruence modulo 4 of the biases in the LAT of some n-bit permutation and, in doing so, introduce and study the high-degree indicator matrix (HDIM). Section 4 shows that the HDIM of a Feistel Network exhibits very strong patterns depending on the number of rounds, the algebraic degree of the Feistel functions and whether these are bijective or not. We also describe attacks relying on these patterns targeting both Feistel Networks and Feistel Networks whitened using affine layers. Then, in Sect. 5, we introduce a new kind of attack which efficiently rebuilds the algebraic normal form of secret Feistel functions by exploiting the predictable absence of some monomials in the ANF of round-reduced Feistel Networks. Finally, we discuss in Sect. 6 how our findings can fit in the framework of integral attacks.

Table 2. Structural attacks against Feistel Networks whitened with unknown affine layers. The attacks recover parts of the unknown affine layers. n is the branch size, d is the degree of the Feistel functions.

2 Notations and Boolean Functions Basics

In this section, we introduce the notations and concepts that will be used throughout the paper. A thorough introduction to Boolean functions can be found in [13]. First, let us define some sets and simple operations:

  • \(\mathbb {F}_{2}^{}\) denotes the finite field of size 2,

  • the exclusive-OR (or XOR) is denoted \(\oplus \),

  • the logical AND is denoted \(\wedge \),

  • the Hamming weight \(\text {hw}(x)\) of a vector x of \(\mathbb {F}_{2}^{n}\) is the number of ones in x,

  • |S| and \(\# S\) denote the size of a set S,

  • the scalar product of two elements \(x = (x_{0},...,x_{n-1})\) and \(y = (y_{0},...,y_{n-1})\) of \(\mathbb {F}_{2}^{n}\) is denoted “\(\cdot \)” and is equal to \(x \cdot y = \bigoplus _{i=0}^{n-1} x_{i} \wedge y_{i}\),

  • if \(x = (x_{0},...,x_{n-1})\) and \(u = (u_{0},...,u_{n-1})\) are two elements of \(\mathbb {F}_{2}^{n}\) then \(x^u = \prod _{i=0}^{n-1} x_i^{u_i}\), and

  • if \(x = (x_{0},...,x_{n-1})\) and \(u = (u_{0},...,u_{n-1})\) are two elements of \(\mathbb {F}_{2}^{n}\) then \(x \preccurlyeq u\) is true if and only if \((u_i = 0 \implies x_i = 0)\) is true for all i in \([0, n-1]\). We say that u “covers” x.

We now define some of the key components used in our analysis.

Definition 1

(Boolean Function). We call Boolean function a function mapping \(\mathbb {F}_{2}^{n}\) to \(\mathbb {F}_{2}^{}\). A function mapping \(\mathbb {F}_{2}^{n}\) to \(\mathbb {F}_{2}^{m}\) is a vectorial Boolean function and its restrictions to each output bit are its coordinates. Finally, for a vectorial Boolean function F, the Boolean functions \(x \mapsto c \cdot F(x)\) are its components.

Note that a coordinate of a vectorial Boolean function is one of its components but that the converse is not necessarily true. Let us then introduce the concept of balancedness.

Definition 2

(Balanced Boolean Function). A (vectorial) Boolean function F mapping \(\mathbb {F}_{2}^{n}\) to \(\mathbb {F}_{2}^{m}\) is said to be balanced if the preimages of all elements of \(\mathbb {F}_{2}^{m}\) have the same size.

A vectorial Boolean function is balanced if and only if all of its components \(x \mapsto c \cdot F(x)\) with \(c \ne 0\) are balanced.

We also recall the definition of the Algebraic Normal Form of a Boolean function.

Definition 3

(Algebraic Normal Form (ANF)). Any Boolean function f mapping n bits to 1 can be decomposed into

$$\begin{aligned} f (x) ~=~ \bigoplus _{u \in \mathbb {F}_{2}^{n}} a_u x^u ~\text {with }~a_u ~=~ \bigoplus _{x \preccurlyeq u} f(x), \end{aligned}$$

in a unique fashion which is called the Algebraic Normal Form (ANF) of f. The coefficients \(a_u\) can be obtained using the so-called Möbius transform. For vectorial Boolean functions, the ANF is the ANF of each of the coordinates.
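As an illustration of Definition 3, the following sketch computes the coefficients \(a_u\) with the usual in-place butterfly form of the Möbius transform. The encoding is an assumption made for this example only: a Boolean function is stored as a truth table indexed by the integer x whose bit i is \(x_i\), and the helper name `mobius_transform` is hypothetical.

```python
def mobius_transform(truth_table):
    """Return the ANF coefficients a_u of a Boolean function.

    truth_table[x] is f(x) and has length 2**n; index u of the result
    encodes the monomial x^u (bit i of u set means x_i is in the monomial).
    """
    a = list(truth_table)
    n = len(a).bit_length() - 1
    for i in range(n):
        step = 1 << i
        for x in range(len(a)):
            if x & step:
                # a_u accumulates f over all x covered by u, one variable at a time
                a[x] ^= a[x ^ step]
    return a


# Example function (chosen for illustration): f(x) = x_0 x_1 XOR x_2 on 3 bits.
f = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(8)]
anf = mobius_transform(f)
monomials = [u for u, a_u in enumerate(anf) if a_u]
print(monomials)  # [3, 4], i.e. the monomials x_0 x_1 and x_2
```

Applying the transform twice gives back the truth table, since the Möbius transform is an involution over \(\mathbb {F}_{2}^{}\).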

Definition 4

(Algebraic Degree). The algebraic degree of a Boolean function is the largest number of variables in a single term of its ANF, i.e. the maximum Hamming weight of all u of \(\mathbb {F}_{2}^{n}\) such that \(a_u \ne 0\). The algebraic degree of a vectorial Boolean function is the maximum algebraic degree of its coordinates. The algebraic degree of a (vectorial) Boolean function F is denoted \(\deg (F)\).

We observe that the algebraic degree of a permutation of n bits is at most equal to \(n-1\).

Our analysis will involve the LAT or Fourier Transform (related to the Walsh spectrum by a constant multiplication) of a Boolean function. These almost identical concepts are introduced below.

Definition 5

(LAT, Fourier Transform, Walsh Spectrum). The Linear Approximation Table of a function f mapping n bits to m is a \(2^{n} \times 2^{m}\) matrix \(\mathcal {L}\) where \(\mathcal {L}[a, b] = \# \{ x \in \mathbb {F}_{2}^{n}, a \cdot x = b \cdot f(x) \} - 2^{n-1}\). We note that, for \(a \ne 0\), the coefficient \(\mathcal {L}[a, b]\) can equivalently be expressed as follows:

$$\begin{aligned} \mathcal {L}[a, b] ~=~ - \sum _{x \in \mathbb {F}_{2}^{n}} \big ( b \cdot f(x) \big ) \times (-1)^{a \cdot x} ~=~ \frac{1}{2} \sum _{x \in \mathbb {F}_{2}^{n}} (-1)^{a \cdot x \oplus b \cdot f(x)}, \end{aligned}$$

where the first sum corresponds to the Fourier transform of \(x \mapsto b \cdot f(x)\) and the second to its Walsh spectrum. Furthermore, the coefficient \(\mathcal {L}[a, b]\) of a LAT \(\mathcal {L}\) is called bias of the approximation \((a \leadsto b)\).

Remark 1

If F is an n-bit permutation then, for all (a, b) in \((\mathbb {F}_{2}^{n})^2\), we have \(\mathcal {L}[a, b] \equiv 0 \mod 2\).
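Definition 5 and Remark 1 can be checked numerically on a small example. The sketch below is only an illustration: the permutation is drawn at random with a fixed seed, vectors of \(\mathbb {F}_{2}^{n}\) are encoded as integers, and the helper names `dot` and `lat` are hypothetical.

```python
import random


def dot(a, b):
    """Scalar product of two vectors of F_2^n encoded as integers."""
    return bin(a & b).count("1") & 1


def lat(F, n):
    """LAT of F with L[a][b] = #{x : a.x = b.F(x)} - 2^(n-1)."""
    size = 1 << n
    return [[sum(dot(a, x) == dot(b, F[x]) for x in range(size)) - size // 2
             for b in range(size)]
            for a in range(size)]


random.seed(0)
n = 4
F = list(range(1 << n))
random.shuffle(F)  # a random 4-bit permutation

L = lat(F, n)
# Remark 1: every coefficient of the LAT of a permutation is even.
all_even = all(v % 2 == 0 for row in L for v in row)
print(all_even)
```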

When a Boolean function \(\mu \) mapping n bits to m is linear, we use \(\mu \) to represent both the function itself and its matrix representation. The transpose of a matrix \(\mu \) is denoted \(\mu ^t\). Finally, we state the following well-known remark regarding the algebraic degree of a (vectorial) Boolean function.

Remark 2

If F is a (vectorial) Boolean function and \(\mathcal {V}\) is a vector space of \(\mathbb {F}_{2}^{n}\) such that \(|\mathcal {V}| > 2^{\deg (F)}\), then \(\bigoplus _{v \in \mathcal {V}} F(v) = 0\).

3 Patterns in Biases Modulo 4 and HDIM

Our initial goal was to identify new generic attacks against Feistel Networks. As suggested in [12], we looked at a visual representation of the Linear Approximation Table of such permutations. We identified some patterns which turned out to be byproducts of a strong structure in the congruence modulo 4 of the biases. Figure 1a and b show the “Pollock representation” of the LAT modulo 4 of a 4-round and a 5-round 6-bit Feistel Network with bijective Feistel functions picked uniformly at random.

Fig. 1. LAT of r-round Feistel Networks (modulo 4).

As we can see, the congruence of the biases is constant in each square of dimensions \(8 \times 8\) for the 4-round Feistel Network. Furthermore, there seem to be linear patterns for the 5-round structure: if we divide the LAT into \(8 \times 8\) squares as before then we find that each square at position (i, j) is the sum of the squares at positions (i, 0) and (0, j) and a square-wise constant.

The reason behind these patterns is two-fold. The first aspect is a generic observation about the linearity (in some sense) of the construction of the LAT modulo 4. Indeed, we show in this section that the function \((a,b) \mapsto (\mathcal {L}[a,b] \mod 4)\) for the LAT \(\mathcal {L}\) of a permutation is a bilinear form and that its matrix representation has interesting properties. The second aspect of the justification for the patterns is the probability 1 presence of zeroes in some positions which is discussed later in Sect. 4.

3.1 The High-Degree Indicator Matrix

We first re-write the congruence modulo 4 of the biases in the LAT of a permutation using Boolean functions.

Lemma 1

(LAT modulo 4). Let F be a permutation of n bits (\(n > 2\)) and let \(\mathcal {L}\) be its LAT. Then \(\mathcal {L}[a,b]\) is such that \( \mathcal {L}[a, b] ~\equiv ~ 2 \times \left( \bigoplus _{x \in \mathbb {F}_{2}^{n}} \big ( b \cdot F(x) \big )\big ( a \cdot x \big ) \right) \mod 4 \) or, equivalently,

$$\begin{aligned} \frac{\mathcal {L}[a, b]}{2} ~\equiv ~ \bigoplus _{x \in \mathbb {F}_{2}^{n}} \big ( b \cdot F(x) \big )\big ( a \cdot x \big ) \mod 2. \end{aligned}$$

Proof

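Since \((-1)^{z} = 1-2z\) (for z in \(\{0,1\}\)), the coefficient \(\mathcal {L}[a, b]\) is equal to

$$\begin{aligned} \mathcal {L}[a, b] ~=~ - \sum _{x \in \mathbb {F}_{2}^{n}} \big ( b \cdot F(x) \big ) \big ( 1 - 2 ( a \cdot x ) \big ) ~=~ - \sum _{x \in \mathbb {F}_{2}^{n}} \big ( b \cdot F(x) \big ) + 2 \sum _{x \in \mathbb {F}_{2}^{n}} \big ( b \cdot F(x) \big )\big ( a \cdot x \big ). \end{aligned}$$

The first sum in this expression is equal to \(2^{n-1}\) as every component of a permutation is balanced. Thus, if we look at the congruence modulo 4 of \(\mathcal {L}[a, b]\), we obtain the following (for any \(n >2\)):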

$$\begin{aligned} \mathcal {L}[a, b] ~\equiv ~ 2\Big ( \sum _{x \in \mathbb {F}_{2}^{n}} \big ( b \cdot F(x) \big )\big ( a \cdot x \big ) \Big ) \mod 4, \end{aligned}$$

from which we deduce that

$$\begin{aligned} \frac{\mathcal {L}[a, b]}{2} ~\equiv ~ \sum _{x \in \mathbb {F}_{2}^{n}} \big ( b \cdot F(x) \big )\big ( a \cdot x \big ) \mod 2 \end{aligned}$$

As sum and XOR are equivalent modulo 2, this proves the lemma.    \(\square \)
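Lemma 1 is easy to test exhaustively for a small permutation. The following sketch does so for a random 4-bit permutation; the seed, the size n = 4 and the helper names are assumptions made for this illustration.

```python
import random


def dot(a, b):
    """Scalar product of two vectors of F_2^n encoded as integers."""
    return bin(a & b).count("1") & 1


random.seed(1)
n = 4
F = list(range(1 << n))
random.shuffle(F)  # a random 4-bit permutation


def lat_coeff(a, b):
    """L[a, b] as in Definition 5."""
    return sum(dot(a, x) == dot(b, F[x]) for x in range(1 << n)) - (1 << (n - 1))


def xor_sum(a, b):
    """XOR over all x of (b.F(x)) AND (a.x), the right-hand side of Lemma 1."""
    acc = 0
    for x in range(1 << n):
        acc ^= dot(b, F[x]) & dot(a, x)
    return acc


lemma_holds = all(lat_coeff(a, b) % 4 == 2 * xor_sum(a, b)
                  for a in range(1 << n) for b in range(1 << n))
print(lemma_holds)
```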

This lemma has several consequences regarding the congruence modulo 4 of the LAT coefficients of F (or, alternatively, the congruence modulo 2 of their half). First, we define \(\mathcal {L}_{4}\) to be a \(2^{n} \times 2^{n}\) matrix such that \(\mathcal {L}_{4}[a, b] \equiv \mathcal {L}[a,b] \mod 4\) and \(\mathcal {L}_{4}[a, b] \in \{0,2\}\). Using this, we define \(B(\mathcal {L})\) to be a \(2^{n} \times 2^{n}\) Boolean matrix with \(B(\mathcal {L})[a, b] = \mathcal {L}_{4}[a, b]/2\). This matrix has the following property:

$$\begin{aligned} B(\mathcal {L})[a \oplus a', b \oplus b'] = B(\mathcal {L})[a, b] \oplus B(\mathcal {L})[a, b'] \oplus B(\mathcal {L})[a', b] \oplus B(\mathcal {L})[a', b']. \end{aligned}$$

As a consequence, the function \((a, b) \mapsto B(\mathcal {L})[a, b]\) is a bilinear form and can be represented using an \(n \times n\) matrix \(\hat{H}(F)\).

Definition 6

(High-Degree Indicator Matrix (HDIM)). Let F be an n-bit permutation and let \(B(\mathcal {L})\) be the Boolean matrix representing the congruence modulo 4 of its LAT (as described above). We define the High-Degree Indicator Matrix \(\hat{H}(F)\) of F to be the \(n \times n\) matrix such that

$$\begin{aligned} \hat{H}(F)[i, j] = \bigoplus _{x \in \mathbb {F}_{2}^{n}} \big ( e_{i} \cdot F(x) \big )\big ( e_{j} \cdot x \big ), \end{aligned}$$

where \(e_{k}\) is an all zero n-bit vector with a single 1 at position k. This matrix is such that

$$\begin{aligned} B(\mathcal {L})[a, b] = b^{t} \times \hat{H}(F) \times a. \end{aligned}$$
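Definition 6 translates directly into code. The sketch below computes the HDIM of a random 4-bit permutation from its definition and checks the relation \(B(\mathcal {L})[a, b] = b^{t} \times \hat{H}(F) \times a\) for all pairs (a, b). The integer encoding (bit k of an integer is the coordinate selected by \(e_k\)) and the helper names are assumptions of this example.

```python
import random


def dot(a, b):
    return bin(a & b).count("1") & 1


def hdim(F, n):
    """H[i][j] = XOR over x of (bit i of F(x)) AND (bit j of x)."""
    H = [[0] * n for _ in range(n)]
    for x, y in enumerate(F):
        for i in range(n):
            if (y >> i) & 1:
                for j in range(n):
                    H[i][j] ^= (x >> j) & 1
    return H


def bilinear(H, a, b, n):
    """Evaluate b^t * H * a over GF(2)."""
    acc = 0
    for i in range(n):
        if (b >> i) & 1:
            for j in range(n):
                acc ^= H[i][j] & (a >> j) & 1
    return acc


rng = random.Random(0)
n = 4
F = list(range(1 << n))
rng.shuffle(F)
H = hdim(F, n)


def half_lat_mod2(a, b):
    """B(L)[a, b]: half of the LAT coefficient, reduced modulo 2."""
    coeff = sum(dot(a, x) == dot(b, F[x]) for x in range(1 << n)) - (1 << (n - 1))
    return (coeff % 4) // 2


matches = all(bilinear(H, a, b, n) == half_lat_mod2(a, b)
              for a in range(1 << n) for b in range(1 << n))
print(matches)
```

The full \(2^{n} \times 2^{n}\) table of congruences modulo 4 is thus determined by only \(n^{2}\) bits.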

Lemma 2

The coefficients of \(\hat{H}(F)\) indicate the presence of the highest degree terms in the coordinates of F. More precisely, \(\hat{H}(F)[i, j] = 1\) if and only if the ANF of \(F_{i}\) contains the monomial \(\prod _{k \ne j} x_{k}\) (which has degree \(n-1\)).

Proof

Let F be an n-bit permutation. As \(\hat{H}(F)[i,j]\) is the sum over a space of size \(2^{n}\) of the Boolean function \(x \mapsto \big ( e_{i} \cdot F(x) \big )\big ( e_{j} \cdot x \big ) = F_i(x) \cdot x_j\), it is equal to 0 unless this Boolean function has algebraic degree n. As F has degree at most \(n-1\), this occurs if and only if \(F_{i}\) contains \(\prod _{k \ne j} x_{k}\). Indeed, in this case (and in this case only), the ANF of \(x_j \cdot F_i(x)\) contains the only possible degree n term \(\prod _{k=0}^{n-1}x_k\).    \(\square \)

This lemma is the reason behind the name “high-degree indicator matrix”. Indeed, the HDIM coefficients simply state whether each of the n possible terms of degree \(n-1\) appears in each coordinate of F or not.
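Lemma 2 can be verified by comparing the HDIM with the ANF of each coordinate. In the sketch below, the monomial \(\prod _{k \ne j} x_{k}\) is encoded by the integer with all bits set except bit j; this encoding, the seed and the helper names are assumptions of the example.

```python
import random


def mobius_transform(truth_table):
    """ANF coefficients of a Boolean function given by its truth table."""
    a = list(truth_table)
    n = len(a).bit_length() - 1
    for i in range(n):
        for x in range(len(a)):
            if x & (1 << i):
                a[x] ^= a[x ^ (1 << i)]
    return a


def hdim(F, n):
    """H[i][j] = XOR over x of (bit i of F(x)) AND (bit j of x)."""
    H = [[0] * n for _ in range(n)]
    for x, y in enumerate(F):
        for i in range(n):
            if (y >> i) & 1:
                for j in range(n):
                    H[i][j] ^= (x >> j) & 1
    return H


rng = random.Random(0)
n = 4
F = list(range(1 << n))
rng.shuffle(F)

full = (1 << n) - 1
from_anf = []
for i in range(n):
    coordinate = [(y >> i) & 1 for y in F]       # truth table of F_i
    anf = mobius_transform(coordinate)
    # coefficient of the monomial that misses only x_j, for each j
    from_anf.append([anf[full ^ (1 << j)] for j in range(n)])

print(from_anf == hdim(F, n))
```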

We finally note that the HDIM of a function can be computed much more efficiently than the LAT or the difference distribution table. Indeed, we can compute a column of the HDIM by summing the function over a cube of dimension \(n-1\) (see Sect. 6.1). The complexity for all n columns is therefore \(n2^{n-1}\).
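The cube-sum computation mentioned above can be sketched as follows. Since \(e_{j} \cdot x\) is 1 exactly when \(x_j = 1\), column j of the HDIM is the XOR of F(x) over the \(2^{n-1}\) inputs with \(x_j = 1\). The function names and the integer encoding are assumptions made for this illustration.

```python
import random


def hdim_by_definition(F, n):
    H = [[0] * n for _ in range(n)]
    for x, y in enumerate(F):
        for i in range(n):
            for j in range(n):
                H[i][j] ^= ((y >> i) & 1) & ((x >> j) & 1)
    return H


def hdim_by_cubes(F, n):
    """One XOR of n-bit words per input of each cube: n * 2^(n-1) word operations."""
    columns = []
    for j in range(n):
        acc = 0
        for x in range(1 << n):
            if (x >> j) & 1:          # x lies in the cube {x : x_j = 1}
                acc ^= F[x]
        columns.append(acc)           # bit i of acc is H[i][j]
    return [[(columns[j] >> i) & 1 for j in range(n)] for i in range(n)]


rng = random.Random(0)
n = 5
F = list(range(1 << n))
rng.shuffle(F)
same = hdim_by_cubes(F, n) == hdim_by_definition(F, n)
print(same)
```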

3.2 Some Properties of the High-Degree Indicator Matrix

Let us investigate the effect of some simple transformations on the HDIM. First, we point out that due to the fact that the LAT of the inverse of a permutation F is the transpose of the LAT of F, the HDIM of \(F^{-1}\) is the transpose of the HDIM of F.

We now show that the HDIM of \(\eta \circ F \circ \mu \) can easily be deduced from that of F when \(\eta \) and \(\mu \) are n-bit linear permutations. The corresponding theorem will be used in Sect. 4.2 to attack Feistel Networks whitened using affine layers.

Theorem 1

Let \(\mu , \eta \) be linear n-bit permutations, F be an n-bit permutation and let \(G = \eta \circ F \circ \mu \). Furthermore, let \(\hat{H}(F)\) be the HDIM of F and \(\hat{H}(G)\) be that of G. Then it holds that

$$\begin{aligned} \hat{H}(G) = \eta \times \hat{H}(F) \times (\mu ^{t})^{-1}. \end{aligned}$$

Proof

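We prove this result in two steps. First, the fact that \(\hat{H}(F \circ \mu ) = \hat{H}(F) \times (\mu ^{-1})^{t}\) can be derived by substituting \(y = \mu (x)\) in the definition of the HDIM:

$$\begin{aligned} \hat{H}(F \circ \mu )[i, j] ~=~ \bigoplus _{x \in \mathbb {F}_{2}^{n}} \big ( e_{i} \cdot F(\mu (x)) \big )\big ( e_{j} \cdot x \big ) ~=~ \bigoplus _{y \in \mathbb {F}_{2}^{n}} \big ( e_{i} \cdot F(y) \big )\big ( e_{j} \cdot \mu ^{-1}(y) \big ) ~=~ \bigoplus _{y \in \mathbb {F}_{2}^{n}} \big ( e_{i} \cdot F(y) \big )\big ( ((\mu ^{-1})^{t} e_{j}) \cdot y \big ), \end{aligned}$$

which is the coefficient at position (i, j) of \(\hat{H}(F) \times (\mu ^{-1})^{t}\).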

We then note that \(\hat{H}(\eta \circ F) = \hat{H}(F^{-1} \circ \eta ^{-1})^{t}\) which, using what we just found, is equal to \((\hat{H}(F^{-1}) \times \eta ^{t})^t = (\hat{H}(F)^{t} \times \eta ^{t})^t\), so that \(\hat{H}(\eta \circ F) = \eta \times \hat{H}(F)\). This concludes the proof.    \(\square \)
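Theorem 1 can be checked numerically. To avoid inverting a matrix, the sketch below tests the equivalent identity \(\hat{H}(G) \times \mu ^{t} = \eta \times \hat{H}(F)\). Matrices are lists of rows acting on column vectors whose coordinate j is bit j of an integer; this convention, the rejection sampling of invertible matrices and all helper names are assumptions of the example.

```python
import random


def apply(M, x):
    """Return M * x over GF(2), with x_j = bit j of the integer x."""
    y = 0
    for i, row in enumerate(M):
        bit = 0
        for j, m_ij in enumerate(row):
            bit ^= m_ij & (x >> j) & 1
        y |= bit << i
    return y


def mul(A, B):
    """Matrix product over GF(2)."""
    return [[sum(A[i][k] & B[k][j] for k in range(len(B))) & 1
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(M):
    return [list(col) for col in zip(*M)]


def random_invertible(n, rng):
    """Draw random matrices until the induced map is a bijection (fine for small n)."""
    while True:
        M = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n)]
        if len({apply(M, x) for x in range(1 << n)}) == 1 << n:
            return M


def hdim(F, n):
    H = [[0] * n for _ in range(n)]
    for x, y in enumerate(F):
        for i in range(n):
            if (y >> i) & 1:
                for j in range(n):
                    H[i][j] ^= (x >> j) & 1
    return H


rng = random.Random(2)
n = 4
F = list(range(1 << n))
rng.shuffle(F)
mu, eta = random_invertible(n, rng), random_invertible(n, rng)
G = [apply(eta, F[apply(mu, x)]) for x in range(1 << n)]  # G = eta o F o mu

lhs = mul(hdim(G, n), transpose(mu))
rhs = mul(eta, hdim(F, n))
print(lhs == rhs)
```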

The ANF and the LAT of an n-bit permutation are connected in the sense that it is possible to determine the congruence modulo 4 of the LAT \(\mathcal {L}\) of an n-bit permutation F given parts of its ANF. Indeed, as shown in this section, this congruence only depends on the terms of degree \(n-1\) in the ANF of the coordinates of F.

4 The High-Degree Indicator Matrix of Feistel Networks

In what follows, we denote by \(\mathsf {F}^{r}_{d}\) an r-round FN with bijective Feistel functions of algebraic degree at most d. An example of such a structure is given in Fig. 2. It is possible to use the HDIM to analyse such generic structures.

Fig. 2. A sample \(\mathsf {F}^{3}_{d}\) structure, where \(\deg (f_i) \le d\).

4.1 Artifacts in the HDIM of Feistel Networks

The HDIM of a Feistel Network may yield interesting patterns depending on the degree of its Feistel functions, whether they are bijections or not and its number of rounds. These are formalized by Theorem 2 and its corollary (Corollary 1). These results link the maximum degree d of the Feistel functions, the number of rounds r and the presence or not of some patterns using the function \(\theta : \mathbb {Z}^2 \rightarrow \mathbb {Z}\) defined by

$$\begin{aligned} \theta (d, r) = d^{\lfloor {r/2} \rfloor -1} + d^{\lceil {r/2} \rceil -1}, \end{aligned}$$

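where \(\lfloor \cdot \rfloor \) and \(\lceil \cdot \rceil \) denote the floor and the ceiling respectively, so that \(\lfloor {2k/2} \rfloor = \lfloor {(2k+1)/2} \rfloor = k\) and \(\lceil {2k/2} \rceil = \lceil {(2k-1)/2} \rceil = k\).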

Theorem 2

Let F be a 2n-bit \(\mathsf {F}^{r}_{d}\). Then the HDIM of F is such that \(\hat{H}(F)[i, j] = 0\) if \(i < n\) or \(j < n\) under the following conditions:

  • if the Feistel functions are bijections and \(\theta (d,r) < 2n\), or

  • if the Feistel functions are not bijections and \(\theta (d,r+1) < 2n\).

The general idea of the proof is to express the sum corresponding to coefficient \(\hat{H}(F)[i,j]\) using well-chosen variables \((\alpha ,\beta )\) located in the middle of the encryption. The value of F(x) is then a function of degree \(d^{\lceil {r/2} \rceil -1}\) of \((\alpha , \beta )\) and that of x is a function of degree \(d^{\lfloor {r/2} \rfloor -1}\). The coefficients can thus be written as

$$\begin{aligned} \hat{H}(F)[i, j] = \bigoplus _{(\alpha , \beta ) \in (\mathbb {F}_{2}^{n})^2} \big ( e_{i} \cdot F(x(\alpha , \beta )) \big )\big ( e_{j} \cdot x(\alpha , \beta ) \big ) \end{aligned}$$

and the result is equal to 0 if \(\theta (d, r) = d^{\lfloor {r/2} \rfloor -1} + d^{\lceil {r/2} \rceil -1} < 2n\). If the Feistel functions are not bijective then a “trick” used to slightly decrease the degree in \((\alpha ,\beta )\) of the output cannot be used, hence the small discrepancy in this case. The complete formal proof of this theorem is given in the full version of this paper [14].

Corollary 1

Let F be a 2n-bit \(\mathsf {F}^{r}_{d}\). The HDIM of F is such that \(\hat{H}(F)[i, j] = 0\) if \(i < n\) and \(j < n\) under the following conditions:

  • if the Feistel functions are bijections and \(\theta (d,r-1) < 2n\), or

  • if the Feistel functions are not bijections and \(\theta (d,r) < 2n\).

Proof

Let r and d be such that \(\mathsf {F}^{r-1}_{d}\) fits the hypothesis of Theorem 2. The right word of the output of a \(\mathsf {F}^{r}_{d}\) structure is the left word output by a \(\mathsf {F}^{r-1}_{d}\) structure. As each row of the HDIM corresponds to one output bit, the top n rows of the HDIM of the r-round FN are equal to the bottom n rows of the HDIM of the same permutation reduced to \((r-1)\) rounds. Because of Theorem 2, this bottom half is such that the first n columns are all 0. Thus, the first n columns of the first n rows of the HDIM of a \(\mathsf {F}^{r}_{d}\) are all equal to 0.    \(\square \)

To illustrate these theorems, we give the HDIMs of the 4- and 5-round Feistel Networks with 3-bit bijective Feistel functions picked uniformly at random whose LATs modulo 4 were given in Fig. 1a and b. The Feistel functions must have an algebraic degree at most equal to 2. Since \(\theta (2, 4) = 2^1 + 2^1 = 4 < 6\), these HDIMs must exhibit the patterns described in the theorems above. It is the case, as we can see below. The zeroes caused by Theorem 2 and Corollary 1 are represented in grey:

(1)
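A comparable experiment can be reproduced with the sketch below, which draws its own random 3-bit bijections and therefore does not reproduce the exact matrices of the original illustration. The word and bit ordering is an assumption chosen so that the zeroes land where Theorem 2 and Corollary 1 state them: the input word entering the first Feistel function and the output word produced one round earlier occupy the indices below n. All function names are hypothetical.

```python
import random


def theta(d, r):
    """theta(d, r) = d^(floor(r/2) - 1) + d^(ceil(r/2) - 1), for r >= 2."""
    return d ** (r // 2 - 1) + d ** ((r + 1) // 2 - 1)


def feistel(x, fs, n):
    # Assumed ordering: the low word of x enters the first Feistel function;
    # the word computed last is placed in the high word of the output.
    prev, cur = x >> n, x & ((1 << n) - 1)
    for f in fs:
        prev, cur = cur, prev ^ f[cur]
    return (cur << n) | prev


def hdim(F, m):
    H = [[0] * m for _ in range(m)]
    for x, y in enumerate(F):
        for i in range(m):
            if (y >> i) & 1:
                for j in range(m):
                    H[i][j] ^= (x >> j) & 1
    return H


def random_bijection(n, rng):
    p = list(range(1 << n))
    rng.shuffle(p)
    return p


def feistel_hdim(rounds, n, rng):
    fs = [random_bijection(n, rng) for _ in range(rounds)]
    F = [feistel(x, fs, n) for x in range(1 << (2 * n))]
    return hdim(F, 2 * n)


rng = random.Random(3)
n = 3                       # branch size; 3-bit bijections have degree at most 2
H4 = feistel_hdim(4, n, rng)
H5 = feistel_hdim(5, n, rng)
for name, H in (("4 rounds", H4), ("5 rounds", H5)):
    print(name)
    for row in H:
        print(" ".join(map(str, row)))
```

With this ordering, only the block with \(i \ge n\) and \(j \ge n\) may be non-zero for 4 rounds, and the block with \(i < n\) and \(j < n\) is zero for 5 rounds.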

Even though a \(\mathsf {F}^{r}_{d}\) structure has an algebraic degree of \(2n-1\) in the conditions of Theorem 2, the way in which this high degree is achieved is very structured: only half of the output bits actually have a maximum degree and the monomials of degree \(2n-1\) cannot contain the product of \(n-1\) bits from the right side of the input. Thus, a simple analysis of the algebraic degree can be made more sophisticated by also investigating the possible structure of the monomials of highest degree.

These patterns lead to the existence of distinguishers as long as the conditions necessary for Corollary 1 are satisfied. Table 3 shows the number of rounds for which the conditions of Corollary 1 are satisfied for different values of d, r and n, both in the 1-to-1 case and in the case where collisions in the Feistel functions are allowed. If real ciphers correspond to these parameters, we specify them. Note that the rotation applied to one of the branches in the round function of LBlock [15] does not change anything. The key-dependent linear FL layers in MISTY1 [16] do not protect against our distinguisher either and may be included on either side for free.

Table 3. If \(r = r_{\text {max}}(d, 2n)\) then the 2n-bit permutation \(\mathsf {F}^{r}_{d}\) exhibits an artifact of size \(n^{2}\) in its HDIM.

4.2 Bypassing Affine Whitening

In the context of component reverse-engineering/white-box cryptography, it may not be sufficient to be able to attack generic Feistel structures. Indeed, simply whitening a generic structure with secret affine layers can prevent many attacks from succeeding at a small cost for the designer. For example, applying affine layers before and after a 5-round Feistel Network would prevent the yoyo-game used in [5] from being exploitable. Similarly, the recent attacks against ASASA [3, 4] are much more sophisticated than the attack against SASAS proposed by Biryukov et al. in the first place [19]. We also note that the secret structure of the S-Box of the last Russian standard primitives recently recovered was indeed whitened with seemingly random linear layers [8].

As a consequence, we study the generic construction denoted \(\mathsf {AF}^{r}_{d}\mathsf {A}\) consisting of a \(\mathsf {F}^{r}_{d}\) construction with secret Feistel functions preceded and followed by the application of independent and secret linear layers. This structure has already been studied in [8] but our attacks are significantly more efficient. Note also that one of the S-Boxes of ZUC [20] has this structure: it is a 3-round Feistel Network composed with a bit rotation. Let us show how the HDIM and the artifacts identified in the previous section can be used to attack permutations with \(\mathsf {AF}^{r}_{d}\mathsf {A}\) structures.

Fig. 3. The target of our attack, its result and its alternative representation. In Fig. 3c, \(f'_i\) is affine equivalent to \(f_i\).

Our attack works for a subset of all possible linear layers. We define \(G = \eta \circ F \circ \mu \) where F has a \(\mathsf {F}^{r}_{d}\) structure satisfying the conditions of Theorem 2 and \(\mu \) and \(\eta \) are linear layers. The layer applied first must have a decomposition as follows:

$$\begin{aligned} \mu = \left[ \begin{array}{cc} \mu _{0, 0} &{} \mu _{0, 1} \\ \mu _{1, 0} &{} \mu _{1, 1} \\ \end{array} \right] ~=~ \left[ \begin{array}{cc} d &{} 0 \\ c &{} b \\ \end{array} \right] \times \left[ \begin{array}{cc} I &{} a \\ 0 &{} I \end{array} \right] ~=~ \left[ \begin{array}{cc} d~~ &{} d \times a \\ c~~ &{} b + c\times a \\ \end{array} \right] , \end{aligned}$$

and the layer applied last must have a similar one:

$$\begin{aligned} \eta = \left[ \begin{array}{cc} \eta _{0, 0} &{} \eta _{0, 1} \\ \eta _{1, 0} &{} \eta _{1, 1} \\ \end{array} \right] ~=~ \left[ \begin{array}{cc} I &{} a' \\ 0 &{} I \end{array} \right] \times \left[ \begin{array}{cc} d' &{} 0 \\ c' &{} b' \\ \end{array} \right] ~=~ \left[ \begin{array}{cc} d'+a'\times c'~~ &{} a'\times b' \\ c'~~ &{} b' \\ \end{array} \right] . \end{aligned}$$

It is sufficient for such a decomposition of the first layer to exist that \(\mu _{0,0}\) is invertible. Indeed, we can then simply set \(d = \mu _{0,0}, c = \mu _{1,0}, a = d^{-1} \times \mu _{0,1}\) and \(b = \mu _{1,1} -c \times a\). Note that b has to be invertible since \(\mu \) is invertible. Similarly, it is sufficient that \(\eta _{1,1}\) is invertible to decompose the final layer. We define \(F'\) using these decompositions so that G is equal to:

$$\begin{aligned} G ~=~ \left[ \begin{array}{cc} I &{} a' \\ 0 &{} I \end{array} \right] \circ \left[ \begin{array}{cc} d' &{} 0 \\ c' &{} b' \\ \end{array} \right] \circ F \circ \left[ \begin{array}{cc} d &{} 0 \\ c &{} b \\ \end{array} \right] \circ \left[ \begin{array}{cc} I &{} a \\ 0 &{} I \end{array} \right] ~=~ \left[ \begin{array}{cc} I &{} a' \\ 0 &{} I \end{array} \right] \circ F' \circ \left[ \begin{array}{cc} I &{} a \\ 0 &{} I \end{array} \right] . \end{aligned}$$

A graphical representation of the relation between F, \(F'\) and G is provided in Fig. 3a and b. As F satisfies the condition of Theorem 2, its HDIM is such that \(\hat{H}(F)[i,j] = 0\) if \(i<n\) or \(j<n\). Applying Theorem 1 gives us that the HDIM of \(F'\) is equal to

$$\begin{aligned} \hat{H}(F') ~=~ \left[ \begin{array}{cc} d' &{} 0 \\ c' &{} b' \\ \end{array} \right] \times \hat{H}(F) \times \left[ \begin{array}{cc} d^{t} &{} c^{t} \\ 0 &{} b^{t} \\ \end{array} \right] ^{-1} ~=~ \left[ \begin{array}{cc} 0 &{} 0 \\ 0 &{} h' \\ \end{array} \right] \text {with } h' = b' \times h \times (b^{t})^{-1}, \end{aligned}$$

h being the bottom-right part of \(\hat{H}(F)\). As in \(\hat{H}(F)\), it holds that \(\hat{H}(F')[i,j]=0\) if \(i < n\) or \(j < n\). Another way to see why this holds is shown in Fig. 3c. Indeed, \(F'\) can be written as a \(\mathsf {F}^{r}_{d}\) structure, like F, where n-bit linear permutations are applied only on two branches and where the Feistel functions \(f'_i\) are obtained from compositions of \(b, b', d, d'\) and \(f_i\), as well as the addition of c and \(c'\) for the first and last rounds. We deduce that if G indeed has an \(\mathsf {AF}^{r}_{d}\mathsf {A}\) structure satisfying the conditions for Theorem 2, then the following equation, whose unknowns are the \(n \times n\) binary matrices a and \(a'\), must have at least one solution:

$$\begin{aligned} \left[ \begin{array}{cc} I &{} a' \\ 0 &{} I \\ \end{array} \right] \times \hat{H}(G) \times \left[ \begin{array}{cc} I &{} 0 \\ a &{} I \\ \end{array} \right] ~=~ \left[ \begin{array}{cc} 0 &{} 0 \\ 0 &{} h_{1,1} \\ \end{array} \right] , \end{aligned}$$

where \(h_{1,1}\) is the bottom right corner of \(\hat{H}(G)\). This system has \(2n^{2}\) unknowns and \(3n^{2}\) equations, meaning that it is unlikely to have solutions if G is a random permutation. However, if it does have a solution then we deduce both that G has an \(\mathsf {AF}^{r}_{d}\mathsf {A}\) structure and the expression of parts of the linear layers. We summarize these results in the following attack.

Attack 1

(Partial Recovery Against \(\mathsf {AF}^{r}_{d}\mathsf {A}\) ). Let G be a 2n-bit permutation. For G to be in \(\mathsf {AF}^{r}_{d}\mathsf {A}\) for some (r, d) satisfying Theorem 2, it is necessary that the equation

$$\begin{aligned} \left[ \begin{array}{cc} I &{} a' \\ 0 &{} I \\ \end{array} \right] \times \hat{H}(G) \times \left[ \begin{array}{cc} I &{} 0 \\ a &{} I \\ \end{array} \right] ~=~ \left[ \begin{array}{cc} 0 &{} 0 \\ 0 &{} h_{1,1} \\ \end{array} \right] , \end{aligned}$$

where \(h_{1,1}\) is an unknown \(n \times n\) matrix, has at least one solution. The unknowns are the coefficients of the \(n \times n\) matrices a and \(a'\), so that \(2n^{2}\) Boolean variables must satisfy \(3n^{2}\) equations corresponding to the zeroes in the right hand side.

This distinguisher requires the full code-book and as much time as is needed to compute the HDIM and solve a system of equations. Since the system is small, the bottle-neck is the computation of the HDIM which can be done in time \(\text {O}(n 2^{2n})\) where n is the branch size.
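The following sketch runs Attack 1 on a toy instance: a 4-round Feistel Network with 3-bit bijective functions, whitened with layers that already have the form \(\left[ \begin{array}{cc} I &{} a \\ 0 &{} I \end{array} \right] \). It is one possible way to solve the system, not necessarily the procedure used by the authors: the top-right and bottom-left blocks give two linear systems over \(\mathbb {F}_{2}^{}\), and the top-left block is then checked. The word ordering, the names `X` and `Y` for the unknown blocks and all helpers are assumptions of this example.

```python
import random


def feistel(x, fs, n):
    # Assumed ordering: the low word enters the first Feistel function and
    # the word computed last becomes the high word of the output.
    prev, cur = x >> n, x & ((1 << n) - 1)
    for f in fs:
        prev, cur = cur, prev ^ f[cur]
    return (cur << n) | prev


def hdim(F, m):
    H = [[0] * m for _ in range(m)]
    for x, y in enumerate(F):
        for i in range(m):
            if (y >> i) & 1:
                for j in range(m):
                    H[i][j] ^= (x >> j) & 1
    return H


def apply(M, x):
    y = 0
    for i, row in enumerate(M):
        bit = 0
        for j, m_ij in enumerate(row):
            bit ^= m_ij & (x >> j) & 1
        y |= bit << i
    return y


def mul(A, B):
    return [[sum(A[i][k] & B[k][j] for k in range(len(B))) & 1
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(M):
    return [list(col) for col in zip(*M)]


def block(M, bi, bj, n):
    return [row[bj * n:(bj + 1) * n] for row in M[bi * n:(bi + 1) * n]]


def shear(a, upper=True):
    """Build [[I, a], [0, I]] if upper, else [[I, 0], [a, I]]."""
    n = len(a)
    eye = [[int(i == j) for j in range(n)] for i in range(n)]
    zero = [[0] * n for _ in range(n)]
    top = [eye[i] + (a[i] if upper else zero[i]) for i in range(n)]
    return top + [(zero[i] if upper else a[i]) + eye[i] for i in range(n)]


def solve(A, B):
    """One solution Y of A * Y = B over GF(2) (free variables set to 0), or None."""
    n, width = len(A), len(A[0])
    rows = [A[i] + B[i] for i in range(n)]
    pivots, r = [], 0
    for c in range(width):
        p = next((k for k in range(r, n) if rows[k][c]), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for k in range(n):
            if k != r and rows[k][c]:
                rows[k] = [u ^ v for u, v in zip(rows[k], rows[r])]
        pivots.append(c)
        r += 1
    if any(any(row[width:]) for row in rows[r:]):
        return None                      # inconsistent system
    Y = [[0] * len(B[0]) for _ in range(width)]
    for k, c in enumerate(pivots):
        Y[c] = rows[k][width:]
    return Y


rng = random.Random(4)
n = 3
fs = []
for _ in range(4):
    f = list(range(1 << n))
    rng.shuffle(f)
    fs.append(f)
F = [feistel(x, fs, n) for x in range(1 << (2 * n))]
secret_a = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n)]
secret_a2 = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n)]
G = [apply(shear(secret_a2), F[apply(shear(secret_a), x)]) for x in range(1 << (2 * n))]

HG = hdim(G, 2 * n)
g01, g10, g11 = block(HG, 0, 1, n), block(HG, 1, 0, n), block(HG, 1, 1, n)
Y = solve(g11, g10)                                   # bottom-left: g10 + g11 * Y = 0
X_t = solve(transpose(g11), transpose(g01))           # top-right:   g01 + X * g11 = 0
X = transpose(X_t)
cleaned = mul(mul(shear(X), HG), shear(Y, upper=False))
```

If either linear system had no solution, or if the top-left block of `cleaned` were non-zero, G would be rejected as a candidate \(\mathsf {AF}^{r}_{d}\mathsf {A}\) structure.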

We can use the exact same reasoning to attack one more round if the decompositions of \(\eta \) and \(\mu \) involve the same “linear Feistel function” a. This happens in particular if \(\eta = \mu ^{-1}\). In this case, we can use the distinguisher obtained from the following attack.

Attack 2

(Partial Recovery Against \(\mathsf {A}^{-1}\mathsf {F}^{r+1}_{d}\mathsf {A}\) ). Let G be a 2n-bit permutation. In order for G to be in \(\mathsf {AF}^{r}_{d}\mathsf {A}\) for some (r, d) satisfying Corollary 1 in such a way that the linear layers are the inverse of one another, it is necessary that the equation

$$\begin{aligned} \left[ \begin{array}{cc} I &{} a \\ 0 &{} I \\ \end{array} \right] \times \hat{H}(G) \times \left[ \begin{array}{cc} I &{} 0 \\ a &{} I \\ \end{array} \right] ~=~ \left[ \begin{array}{cc} 0 &{} h_{0,1} \\ h_{1,0} &{} h_{1,1} \\ \end{array} \right] , \end{aligned}$$

where \(h_{0,1}, h_{1,0}\) and \(h_{1,1}\) are unknown \(n \times n\) matrices, has at least one solution. The unknowns are the coefficients of the \(n \times n\) matrix a, so that \(n^{2}\) Boolean variables must satisfy \(n^{2}\) equations corresponding to the zero block in the right hand side.

Note that if there is a single whitening affine layer applied on one side, we have a similar system with \(n^2\) unknowns. If we consider one more round, we will have \(n^2\) equations as well. Therefore we can attack \(\mathsf {F}^{r}_{d}\mathsf {A}\), where r is the maximum number of rounds satisfying Corollary 1. Another view on this attack is given in Sect. 5.3.

5 The Impossible Monomials Attack

In the previous sections we used absent terms of highest degree to recover whitening linear layers from Feistel Networks. In this section we generalize this method to terms of lower degree and, as a result, we present an attack recovering a secret round function from a 5-round Feistel Network with bijections. Furthermore, we generalize this attack to more rounds if the degrees of the round functions are small.

5.1 Impossible Monomials in Feistel Networks

Let F be a 2n-bit \(\mathsf {F}^{4}_{n-1}\) and let \(F_i\) be the ith output bit of F (\(F_0\) is the leftmost bit of F). We will denote by \(L=\{0,\ldots ,n-1\}\) and \(R=\{n,\ldots ,2n-1\}\) the indices from the left and right halves respectively, and \(F_L\) and \(F_R\) the truncations of the function F to the left and right half respectively. Consider the ANF of \(F_i\):

$$\begin{aligned} F_i(x_l||x_r) = \bigoplus _{u_l,u_r \in \mathbb {F}_{2}^{n}} a^{F_i}_{u_{l}||u_{r}} x_{l}^{u_{l}} x_{r}^{u_{r}}, \end{aligned}$$
(2)

where \(x_l\) and \(x_r\) are vectors of input variables from the left and right halves respectively. We will now show that some monomials are impossible, that is, \(a^{F_i}_{u_{l}||u_{r}} = 0\) for some \(u_l, u_r\) independently of the choice of the Feistel functions. To prove it, we will need the following lemmas.

Lemma 3

Let \(a, b \in \mathbb {F}_{2}^{n}\) be some vectors of variables and let \(f: \mathbb {F}_{2}^{n} \rightarrow \mathbb {F}_{2}^{}\) be a Boolean function of degree at most d. Then if some term in the ANF of \(f(a \oplus b)\) has degree \(d_a\) on variables from a, then it has degree at most \(d - d_a\) on variables from b. In particular, there are no terms of degree d on a and non-zero degree on b.

Proof

Let \(s(a,b) = a \oplus b\). Then \(\deg {s} = 1\) and \(\deg {(f \circ s)} \le d\). Hence a term containing \(d_a\) variables from a contains at most \(d-d_a\) variables from b.

Lemma 4

Let \(\pi :\mathbb {F}_{2}^{n} \rightarrow \mathbb {F}_{2}^{n}\) be a permutation and let \(f: \mathbb {F}_{2}^{n} \rightarrow \mathbb {F}_{2}^{}\) be some Boolean function of degree at most \(n-1\). Then \(\deg {(f \circ \pi )} \le n - 1\).

Proof

By the Möbius transform, the term of degree n is present in the ANF of \(f \circ \pi \) if and only if the sum of \(f \circ \pi \) over \(\mathbb {F}_{2}^{n}\) is equal to 1. Since \(\pi \) is a permutation, we have that \(\sum _{x\in \mathbb {F}_{2}^{n}} f (\pi (x)) = \sum _{x\in \mathbb {F}_{2}^{n}} f(x)\). But this last sum is equal to zero because \(\deg {f} \le n-1\). Therefore, there is no term of degree n in the ANF of \(f \circ \pi \) and we conclude that \(\deg {(f \circ \pi )} \le n-1\).
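Lemma 4 can be checked experimentally on small instances. The following Python sketch is only an illustration under stated assumptions: the helper names `anf`, `degree` and `check_lemma4` are hypothetical, functions are stored as truth tables indexed by the integer x, and the index u of an ANF coefficient encodes the monomial \(x^u\).

```python
import random

def anf(table):
    """Binary Moebius transform: truth table -> ANF coefficients.
    Index u of the result is the coefficient of the monomial x^u."""
    a = list(table)
    n = len(a).bit_length() - 1
    for i in range(n):
        for x in range(len(a)):
            if (x >> i) & 1:
                a[x] ^= a[x ^ (1 << i)]
    return a

def degree(table):
    """Largest Hamming weight of a monomial present in the ANF (0 if none)."""
    weights = (bin(u).count("1") for u, c in enumerate(anf(table)) if c)
    return max(weights, default=0)

def check_lemma4(n, trials=100, seed=0):
    """Compose random f of degree <= n-1 with random permutations pi."""
    rng = random.Random(seed)
    size = 1 << n
    for _ in range(trials):
        # Random f of degree <= n-1: draw ANF coefficients, clear the top one.
        coeffs = [rng.randrange(2) for _ in range(size)]
        coeffs[size - 1] = 0
        f = anf(coeffs)  # the transform is an involution: ANF -> truth table
        pi = list(range(size))
        rng.shuffle(pi)
        if degree([f[pi[x]] for x in range(size)]) > n - 1:
            return False
    return True
```

A call such as `check_lemma4(4)` returns `True` when none of the sampled compositions \(f \circ \pi \) contains the monomial of degree n.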

We now formally describe classes of impossible monomials using the following theorem.

Theorem 3

Let F and its ANF be as defined before. Then \(a^{F_i}_{u_l||u_r} = 0\) if one of the following holds:

  1. \(i \in R\) and \(hw(u_l) = n\);

  2. \(i \in R\) and \(hw(u_l) = n-1\), \(hw(u_r) = n-1\);

  3. \(i \in R\) and \(hw(u_l) = n-1\), \(hw(u_r) = n\);

  4. \(i \in L\) and \(hw(u_l) = n\), \(hw(u_r) = n-1\).

Proof

Points 3–4 are part of Theorem 2 and are presented here for the sake of completeness. It is left to prove points 1 and 2.

Fig. 4. The 4-round integral characteristic: words taking all values are represented in bold red and balanced words are represented in dashed blue. (Color figure online)

  1. Consider the 4-round integral characteristic from Fig. 4. Let C be any cube which contains the whole left part. From the integral characteristic it follows that the sum of F over the cube C is zero on the right side. Therefore, by the Möbius transform, the corresponding ANF coefficients are zero.

  2. Let \(f_0,f_1,f_2,f_3: \mathbb {F}_{2}^{n} \rightarrow \mathbb {F}_{2}^{n}\) be the round functions of F. The right half of the output is then given by:

    $$\begin{aligned} F_R(l||r) = l \oplus f_0(r) \oplus f_2(r \oplus f_1(l \oplus f_0(r))). \end{aligned}$$
    (3)

    Clearly, the first two terms do not contain any monomial of degree \(n-1\) on l and \(n-1\) on r. Consider the expression \(f_2(r \oplus f_1(l \oplus f_0(r)))\). Assume that a term with degree \(n-1\) on both l and r is present in the ANF of the expression. Then the term is present in the expansion of some product of at most \(n-1\) bits, where the bits are output bits of the expression \(r \oplus f_1(l \oplus f_0(r))\), i.e. each of the at most \(n-1\) factors is either a bit from the outer r or a bit from \(f_1(l \oplus f_0(r))\). Note that the term cannot be generated by choosing bits only from the outer r, because in that case it would contain no variables from l. Therefore at most \(n-2\) bits are taken from the outer r, while \(n-1\) variables from l and at least one variable \(r_i\) are taken from \(f_1(l \oplus f_0(r))\). It means that there exists a monomial function \(\pi \) such that \(\pi \circ f_1(l \oplus f_0(r))\) contains a term of degree \(n-1\) on l and degree at least 1 on r. By Lemma 4, \(\pi \circ f_1\) has degree at most \(n-1\), and by Lemma 3 there cannot be such a term in \(\pi \circ f_1(l \oplus f_0(r))\).
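Points 1 to 3 of Theorem 3 can be tested on small random instances. The sketch below is an illustration, not the implementation used for the experiments reported in this paper: the names `feistel4_right`, `is_impossible_right` and `check_theorem3_right` are hypothetical, the input is encoded as the integer `(l << n) | r`, round functions are random permutations given as lookup tables, and bit `b` of a word is taken as `(w >> b) & 1`. Point 4 (left half) is not covered.

```python
import random

def anf(table):
    """Binary Moebius transform: truth table -> ANF coefficients."""
    a = list(table)
    n = len(a).bit_length() - 1
    for i in range(n):
        for x in range(len(a)):
            if (x >> i) & 1:
                a[x] ^= a[x ^ (1 << i)]
    return a

def feistel4_right(f0, f1, f2, n):
    """Right output half of Eq. 3 for every input x = (l << n) | r."""
    mask = (1 << n) - 1
    out = []
    for x in range(1 << (2 * n)):
        left, right = x >> n, x & mask
        x1 = left ^ f0[right]
        x2 = right ^ f1[x1]
        out.append(x1 ^ f2[x2])
    return out

def is_impossible_right(u, n):
    """True if the monomial x^u falls under points 1, 2 or 3 of Theorem 3."""
    hl = bin(u >> n).count("1")
    hr = bin(u & ((1 << n) - 1)).count("1")
    return hl == n or (hl == n - 1 and hr >= n - 1)

def check_theorem3_right(n, trials=20, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        fs = []
        for _k in range(3):
            perm = list(range(1 << n))
            rng.shuffle(perm)
            fs.append(perm)
        right = feistel4_right(fs[0], fs[1], fs[2], n)
        for bit in range(n):
            coeffs = anf([(w >> bit) & 1 for w in right])
            if any(c for u, c in enumerate(coeffs) if is_impossible_right(u, n)):
                return False
    return True
```

Only \(f_0, f_1, f_2\) are sampled because, by Eq. 3, the right half of the 4-round output does not depend on \(f_3\).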

5.2 An Attack on 5-Round Feistel Network

In this section we use the impossible monomials to attack a 5-round Feistel Network built from permutations. The key idea is to observe the presence of some 4-round impossible monomials in the 5-round network and extract some information about the last round function. Consider some monomial \(x^u\) which is impossible at the right side of a 4-round Feistel Network. We now add the 5th round. If we observe \(x^u\) on the left side, then we know that this monomial has come from the last round function. Otherwise, we know that it has not come from the last round function, which gives us some information as well. Using these observations we build a system of linear equations where the unknowns are the ANF coefficients of the coordinates of the last round function. By solving the system we recover the ANF coefficients and hence the function itself. Note that in order to compute the ANF, we have to obtain the full codebook.

Let \(F^{5}\) be a 2n-bit \(\mathsf {F}^{5}_{d}\), \(F^{4}\) be its first 4 rounds and f be the last round function. Let \(a^g_u\) be the coefficient of term \(x^u\) in the ANF of the Boolean function g. Consider the equation of the ith bit of \(F^5\) for \(i \in L\):

$$\begin{aligned} F^5_{i}(x) = F^4_{i+n}(x) \oplus f_{i}(F^5_{R}(x)) = \bigoplus _{u \in \mathbb {F}_{2}^{2n}} a^{F^4_{i+n}}_u x^{u} \oplus \bigoplus _{u \in \mathbb {F}_{2}^{n}} a^{f_{i}}_u (F^5_R(x))^u. \end{aligned}$$

Fig. 5. Impossible monomials in the last round of a 5-round FN with 3-bit branches. The wire with 4-round impossible monomials is in dashed blue, the path of the observed monomials is highlighted with bold red. \(a_u\) is the ANF coefficient of some 4-round impossible monomial. (Color figure online)

The ANF of \(F^5_{i}\) with \(i \in L\) contains some monomial from the first or the second group from Theorem 3 if and only if the ANF of \(f_i \circ F^5_R\) does. Since we can compute the ANF of \(F^5_R\), we can check which possible terms from the ANF of \(f_i\) generate the impossible monomial. Then from the presence or absence of the impossible monomial in the ANF of \(F^5_{i}\) we deduce whether the number of such terms in the ANF of \(f_i\) is odd or even. This gives us a linear equation over \(\mathbb {F}_{2}^{}\) where the unknowns are the ANF coefficients of \(f_i\). For an illustration see Fig. 5.

Note that the 4-round impossible monomials which are still impossible in a 5-round Feistel Network do not leak any information about f. For example, since a Feistel Network is a bijection, the monomial of degree 2n is impossible for any number of rounds, so it cannot be used in the attack. However, it is the only such monomial. Therefore we can use \(2^n-1\) impossible monomials from the first group of Theorem 3 and \(n^2\) from the second group. Each such monomial yields one equation for each bit of f. There are \(2^n\) unknown coefficients in the ANF of \(f_i\), so the number of equations is enough to recover \(f_i\) for all i, and hence f, with high probability. Note that we can recover f only up to a xor with a constant because the constant may propagate through the Feistel Network and merge with other round functions (see the introduction of [5] for a more detailed explanation of this phenomenon).

The complexity of the attack is \(O(2^{3n})\) and is dominated by generating the equation matrix, which is the same for all output bits (the only difference is the target vector). For each of the \(2^n\) possible terms in the ANF of \(f_i\) we compute the ANF of the term applied after F in time \(O(2^{2n})\) and then we check if this term generates the impossible monomials. The next step is to solve the systems. Since the equation matrix is the same for all output bits, we can do some precomputation (for example the LU-decomposition) once and solve all n systems of equations very fast. Computing the target vectors is dominated by computing the ANF of \(F^5_i\) for \(i \in L\) which takes total time of \(O(n2^{2n})\).
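The construction of the linear system can be sketched as follows. This is a minimal Python illustration under stated assumptions and is not the Sage implementation mentioned below: all function names are hypothetical, inputs are encoded as `(l << n) | r`, round functions are random permutations stored as tables, and bit i of a word is `(w >> i) & 1`. The sketch builds the equation matrix and the target vectors, then checks that the secret ANF coefficients of the last round function satisfy every equation; solving the system (for instance by Gaussian elimination over \(\mathbb {F}_{2}\)) is left out.

```python
import random

def anf(table):
    """Binary Moebius transform: truth table -> ANF coefficients."""
    a = list(table)
    n = len(a).bit_length() - 1
    for i in range(n):
        for x in range(len(a)):
            if (x >> i) & 1:
                a[x] ^= a[x ^ (1 << i)]
    return a

def feistel(fs, n):
    """(left, right) output words for every input x = (l << n) | r."""
    mask = (1 << n) - 1
    left, right = [], []
    for x in range(1 << (2 * n)):
        prev, cur = x >> n, x & mask
        for f in fs:
            prev, cur = cur, prev ^ f[cur]
        left.append(cur)
        right.append(prev)
    return left, right

def impossible_monomials(n):
    """Groups 1 and 2 of Theorem 3, without the monomial of degree 2n."""
    mask = (1 << n) - 1
    rows = []
    for m in range((1 << (2 * n)) - 1):
        hl, hr = bin(m >> n).count("1"), bin(m & mask).count("1")
        if hl == n or (hl == n - 1 and hr == n - 1):
            rows.append(m)
    return rows

def equation_matrix(right, n):
    """M[k][u] = coefficient of the k-th impossible monomial in (F^5_R(x))^u."""
    rows = impossible_monomials(n)
    cols = [anf([int((w & u) == u) for w in right]) for u in range(1 << n)]
    return [[cols[u][m] for u in range(1 << n)] for m in rows]

def target_vector(left, i, n):
    """Observed coefficients of the impossible monomials in bit i of F^5_L."""
    coeffs = anf([(w >> i) & 1 for w in left])
    return [coeffs[m] for m in impossible_monomials(n)]

def relations_hold(n, seed=0):
    rng = random.Random(seed)
    fs = []
    for _ in range(5):
        perm = list(range(1 << n))
        rng.shuffle(perm)
        fs.append(perm)
    left, right = feistel(fs, n)
    matrix = equation_matrix(right, n)  # shared by all output bits
    for i in range(n):
        secret = anf([(fs[4][y] >> i) & 1 for y in range(1 << n)])
        target = target_vector(left, i, n)
        for row, t in zip(matrix, target):
            if sum(c & a for c, a in zip(row, secret)) % 2 != t:
                return False
    return True
```

The column for u = 0 is always zero on these rows, which reflects the fact that f is recovered only up to a constant.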

As a consequence of the algebraic nature of the attack, if the round function has lower degree, then the complexity decreases. Indeed, there are fewer unknowns and therefore both generating the equation matrix and solving the systems take less time. As an edge case, consider the \(F^5A\) structure where the affine layer can be seen as the 6th round with a function of degree 1. The complexity of recovering the affine round is \(O(n2^{2n})\), as was shown in Sect. 4.2.

Note that the attack can be run in the reverse direction as well, so that we recover the first round function instead of the last one.

We have implemented the attack in Sage [21]. We successfully attacked a 5-round Feistel Network with bijections and branch size of up to 9 bits and recovered the outer secret round functions in a few minutes on a modern laptop.

5.3 A Generalization of the Attack on Feistel Networks with Low Degree Round Functions

When the round functions in a Feistel Network have low degree, the degree deficiency decreases slowly and, as a result, impossible monomials may exist for more than 5 rounds. Moreover, since there are fewer unknowns to recover, we need fewer impossible monomials to mount the attack.

In the following theorem we give a lower bound on the maximum number of Feistel rounds for which a large class of monomials, namely the class from point 1 of Theorem 3, is impossible. The size of this class is \(2^n\), which is enough to recover a round function of full degree. Therefore this is a lower bound on the maximum number of rounds that can be attacked using the ANF recovery technique from Sect. 5.2.

Theorem 4

Let F be a 2n-bit \(\mathsf {F}^{r}_{d}\) with arbitrary functions and let its ANF be as in Eq. 2. Then \(a^{F_i}_{u_l||u_r} = 0\) if \(d^{r-2} < n\), \(i \in R\) and \(hw(u_l)=n\).

Proof

Let l||r be the input to F. Consider the degrees on the variables from l at the intermediate states. Initially, the degrees are 1 on the left and 0 on the right. After the first round the degrees are the same, because the input to the round function has no variables from l. Now if we have the respective degrees \((d_1,d_2)\) at some point and we add a swap and a xor with the round function, the degrees become \((\max (d_2, d\cdot d_1), d_1)\). Then for 2 rounds the degrees are (d, 1), for 3 rounds they are \((d^2, d)\) and, in general, for r rounds the degrees are \((d^{r-1}, d^{r-2})\). Therefore, when \(d^{r-2} < n\), the r-round Feistel Network has no monomials with degree n on l in the right branch of the output.
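The recurrence used in this proof is easy to evaluate numerically. The following Python sketch (the function names are hypothetical) iterates the update \((d_1, d_2) \mapsto (\max (d_2, d \cdot d_1), d_1)\) starting from the state after the first round, and tests the condition \(d^{r-2} < n\) of Theorem 4. The returned values are upper bounds on the degrees, not exact degrees.

```python
def degrees_on_l(d, r):
    """Upper bounds on the degree in the variables of l for the
    (left, right) output branches after r >= 1 rounds."""
    left, right = 1, 0  # state after the first round
    for _ in range(r - 1):
        left, right = max(right, d * left), left
    return left, right

def class1_impossible(d, r, n):
    """Condition of Theorem 4: no monomial of degree n on l in the right branch."""
    return degrees_on_l(d, r)[1] < n
```

For instance, with d = 2 and n = 9 the bound on the right branch is \(2^{3} = 8 < 9\) after 5 rounds and \(2^{4} = 16\) after 6 rounds, so the condition holds for 5 rounds but not for 6.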

As a corollary of the theorem, we can attack a 2n-bit \(\mathsf {F}^{r}_{d}\) if \(d^{r-3} < n\). Note that for the 5-round Feistel Network with bijections which we attacked in the previous section this bound is not satisfied (for \(n \ge 3\)): \(d^{5-3} = (n-1)^2 > n\), i.e. we attacked more rounds than we could attack by Theorem 4. Though we expect that the bound is tight for the specified class of monomials in FN with non-bijective round functions, there are other classes of impossible monomials for Feistel Networks with more rounds. Moreover, if the degree is low, there are fewer ANF coefficients to recover and, therefore, smaller classes of impossible monomials may be enough for an attack. As an edge case, consider an additional round function of degree 1 (a linear function). The impossible monomials of degree \(2n-1\) from Corollary 1 can be used to recover such a round function, as was shown in the attacks from Sect. 4.2. The maximal number of rounds (without the last linear one) for this attack is given by the condition \(\theta (d,r) = d^{\lfloor {r/2} \rfloor -1} + d^{\lceil {r/2} \rceil -1} < 2n\) (or 1 more round if the Feistel functions are bijections). In the general case, if the Feistel functions are bijections, we can attack 5 normal rounds plus 1 linear round.

6 Relationship Between Our Results and Other Attacks

6.1 Integral Attacks

The HDIM has a simple integral interpretation. Indeed, its coefficients correspond to the presence or not of some monomials in the ANF of its coordinates. They thus correspond to coefficients in said ANF which can be computed using the Möbius transform:

$$\begin{aligned} \hat{H}(F)[i, j] = \bigoplus _{x \preccurlyeq \overline{e_{j}}} F_{i}(x) \end{aligned}$$

where \(\overline{e_{j}}\) is the vector where all elements are equal to 1 except in position j. This has two consequences.

  1. we can compute the HDIM of an n-bit permutation in time \(\text {O}(n 2^{n-1})\), and

  2. zeroes in column j imply the existence of an integral distinguisher.
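The integral formula for the HDIM translates directly into code. The sketch below is an illustration with a hypothetical function name `hdim`; it assumes that the permutation is given as a lookup table of n-bit integers and that bit i of a word is `(w >> i) & 1`, which may differ from the bit ordering used elsewhere in the paper. For each column j it XORs the outputs over the cube where input bit j is fixed to 0, which amounts to \(n 2^{n-1}\) word XORs in total.

```python
def hdim(perm, n):
    """H[i][j] = XOR of bit i of perm[x] over all x whose bit j is 0."""
    H = [[0] * n for _ in range(n)]
    for j in range(n):
        acc = 0
        for x in range(1 << n):
            if ((x >> j) & 1) == 0:
                acc ^= perm[x]
        for i in range(n):
            H[i][j] = (acc >> i) & 1
    return H
```

For the identity on 2 bits this gives `[[0, 1], [1, 0]]`, while for the identity on 3 bits every entry is 0 because no coordinate contains a monomial of degree 2.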

In light of this, we state the following corollary of Corollary 1.

Corollary 2

(Integral Distinguisher for \(\mathsf {F}^{r}_{d}\) ). Let F be a 2n-bit \(\mathsf {F}^{r}_{d}\) and suppose that one of the following conditions holds:

  • the Feistel functions are bijections and \(\theta (d,r-1) < 2n\), or

  • the Feistel functions are not bijections and \(\theta (d,r) < 2n\).

Then there exists an integral distinguisher with data and time complexity \(2^{2n-1}\) for this structure, namely

$$\begin{aligned} \bigoplus _{x \preccurlyeq \overline{e_{j}}} \big ( e_{i} \cdot F(x) \big ) = 0 \end{aligned}$$

for all \(i < n\) and \(j < n\). In other words, the sum of the right words of F(x) is equal to 0 over a cube where one bit of the input right word is fixed to 0.

We notice a relation between our attacks and the so-called division property. This tool for finding integral attacks was introduced by Todo in [9] and later used by the same author to attack the full MISTY1 [22]. In his seminal paper, Todo gives some integral distinguishers against Feistel Networks for various block sizes, numbers of rounds and degrees of the Feistel functions, for both bijective and non-bijective Feistel functions. Interestingly, his results are extremely similar to ours! Indeed, while there is no generic formula in Todo’s paper, the application of his algorithm shows the existence of cubes of size \(2n-1\) whose sum is equal to 0 for a number of rounds identical to the ones we predicted. In fact, results about the division property of the output of a Feistel Network can be extracted from its HDIM. To explain this, we first recall the definition of the division property.

Definition 7

(Division Property). Let \(\mathbb {X}\) be a multiset of \(\mathbb {F}_{2}^{n}\) and k be an integer of [0, n]. We say that \(\mathbb {X}\) has the division property \(\mathcal {D}^{n}_{k}\) if, for all u in \(\mathbb {F}_{2}^{n}\) such that \(\text {hw}(u) \le k\), \(\bigoplus _{x \in \mathbb {X}} x^u = 0\).
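Definition 7 can be checked by brute force on small multisets. The following Python sketch uses a hypothetical function name and follows the definition exactly as stated above (all u with \(\text {hw}(u) \le k\)); elements of \(\mathbb {F}_{2}^{n}\) are encoded as integers and \(x^u\) is 1 exactly when `(x & u) == u`.

```python
def has_division_property(multiset, n, k):
    """True if the XOR of x^u over the multiset is 0 for every u with hw(u) <= k."""
    elements = list(multiset)
    for u in range(1 << n):
        if bin(u).count("1") <= k:
            parity = 0
            for x in elements:
                parity ^= int((x & u) == u)
            if parity:
                return False
    return True
```

For example, the full cube \(\mathbb {F}_{2}^{3}\) passes the check for k = 2 but fails for k = 3, since the monomial of degree 3 sums to 1 over the whole space.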

This property is further generalized into the vectorial division property which we define in the particular case of a Feistel Network.

Definition 8

(Vectorial Division Property (for Feistel Networks)). Let \(\mathbb {X}\) be a multiset of \((\mathbb {F}_{2}^{n})^2\) and \(k^L, k^R\) be integers of [0, n]. We say that \(\mathbb {X}\) has the collective division property \(\mathcal {D}^{n}_{(k^L, k^R)}\) if, for all u, v in \(\mathbb {F}_{2}^{n}\) such that \(\text {hw}(u) \le k^L\) and \(\text {hw}(v) \le k^R\), \(\bigoplus _{(x,y) \in \mathbb {X}} x^u y^v = 0\).

In particular, Todo applied his technique to 2n-bit \(\mathsf {F}^{r}_{d}\). The integral distinguishers against the highest number of rounds correspond to integrals over cubes of size \(2n-1\) where the constant bit has to be on the left side (Footnote 4). As we have seen, summing over such a cube is equivalent to computing half of the lines of the HDIM of the function.

Let F be a 2n-bit \(\mathsf {F}^{r}_{d}\), x denote the left input bits, y denote the right ones and \(F_L\) and \(F_R\) denote its left and right output halves so that \(F(x || y) = F_L(x || y) || F_R(x||y)\). Suppose that the top left corner of the HDIM of F is all zero. We deduce that the following holds for any cube \(\mathcal {C}_k\) of dimension \(2n-1\) where the bit at index \(k < n\) is fixed and for any \(i < n\): \(\bigoplus _{x \in \mathcal {C}_k} e_i \cdot F(x) = 0\). This can also be written as

$$\begin{aligned} \bigoplus _{x \in \mathcal {C}_k} \left( F_L(x) \right) ^{u_i}\left( F_R(x) \right) ^{0} =\bigoplus _{x \in \mathcal {C}_k} \left( F_L(x) \right) ^{u_i} = 0, \end{aligned}$$

where \(u_i\) is the element of \(\mathbb {F}_{2}^{n}\) equal to 0 except at position i where it is equal to 1. In other words, for all u in \(\mathbb {F}_{2}^{n}\), \(\text {hw}(u) \le 1\) implies that \(\bigoplus _{x \in \mathcal {C}_k} \left( F_L(x) \right) ^{u} = 0\), which means that the image of \(\mathcal {C}_k\) has vectorial division property \(\mathcal {D}^{n}_{1, 0}\). The HDIM of Feistel Networks can thus be interpreted as describing the vectorial division property of each output half!

The relation between the ANF and integral attacks is further stressed by the attack we described in Sect. 5. Indeed, the complexity of this attack is very similar to that of the integral attack against 5-round FN with bijective Feistel functions described in [5].

7 Conclusion

Investigating surprising visual patterns in the LAT of Feistel Networks led us to interesting results. To explain them, we introduced the high-degree indicator matrix (HDIM). It causes a form of linearity of the LAT modulo 4 and is related to the presence (or lack thereof) of some monomials in the ANF of the permutation. We identified patterns in the distribution of these monomials for Feistel Networks and provided theorems allowing us to predict the existence of these patterns (Theorem 2 and Corollary 1). More generally, we showed how the predictable absence of some monomials can be leveraged to attack a Feistel Network in an impossible monomial attack. We also drew some connections between our results and integral distinguishers.