
1 Introduction

The division property [25] was first proposed by Todo at EUROCRYPT 2015 to uncover and exploit the spectrum of properties hidden between the two extremes, the ALL and BALANCE properties, in the traditional integral cryptanalysis [6, 16] targeting word-oriented primitives. Compared with the traditional integral cryptanalysis, the division property offers a more refined way for cryptanalysts to identify balanced output bits, in which the algebraic degree information of the local components of the target is fully utilized. Its power and potential were demonstrated by the break of the full Misty1 [24]. Subsequently, by considering the division property at the bit level, Todo and Morii [27] introduced the bit-based division property to find balanced bits of the round-reduced Simon. Moreover, to also capture constant output bits and some cancellation characteristics ignored by the conventional bit-based division property, the so-called three-subset bit-based division property was proposed in the same work [27].

This seemingly natural and obvious migration from words to bits (1-bit words) not only makes division properties applicable to bit-oriented designs, but also reveals the intimate relationship between division properties and the algebraic normal form (ANF) of the target [26], well beyond merely the algebraic degree. This relationship hints at how the division property can be employed to probe the ANF of a complex Boolean function whose explicit formula is typically not available. As expected, the division property was shown to be useful in (partially) determining the algebraic structures of the superpolies arising in cube attacks [9, 26, 29, 30]. Essentially, every cryptanalytic attempt based on the division property employs some procedures which we call detection algorithms.

Detection Algorithms. Given a Boolean function f, a detection algorithm for a certain property \(\mathcal {P}\) is a procedure used to determine whether \(\mathcal {P}\) holds for f. The property \(\mathcal {P}\) can be as simple as “f is a constant” or as complicated as “the sum of f over all possible values of certain variables is zero regardless of the values of some other variables”. Given a Boolean function f and a detection algorithm for \(\mathcal {P}\), four possibilities are in order:

  • Hit: \(\mathcal {P}\) holds and the output of the algorithm is positive;

  • Miss: \(\mathcal {P}\) holds but the output of the algorithm is negative;

  • False alarm: \(\mathcal {P}\) does not hold but the output of the algorithm is positive;

  • Correct reject: \(\mathcal {P}\) does not hold and the output of the algorithm is negative.

At this point, we remind the readers that much of the research on the division property so far concerns the construction of detection algorithms, loosely speaking, for the balance (or more generally the key-independent constant) property, or more essentially, the absence of certain monomials. A no-false-alarm algorithm can be employed by an attacker (e.g., to find balanced output bits), while a no-miss algorithm can be employed by a designer in security proofs. Our ultimate goal is to devise a perfect and efficient detection algorithm that never misses and never raises false alarms.
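The four outcomes listed above can be made concrete with a minimal Python sketch. The names `Outcome`, `classify` and `is_perfect` are hypothetical and chosen only for illustration; the sketch merely encodes the case distinction, under the assumption that each observation is a pair (whether the property holds, whether the algorithm answered positively).

```python
from enum import Enum
from typing import Iterable, Tuple


class Outcome(Enum):
    HIT = "hit"
    MISS = "miss"
    FALSE_ALARM = "false alarm"
    CORRECT_REJECT = "correct reject"


def classify(property_holds: bool, algorithm_positive: bool) -> Outcome:
    """Map (ground truth, algorithm answer) to one of the four outcomes."""
    if property_holds:
        return Outcome.HIT if algorithm_positive else Outcome.MISS
    return Outcome.FALSE_ALARM if algorithm_positive else Outcome.CORRECT_REJECT


def is_perfect(observations: Iterable[Tuple[bool, bool]]) -> bool:
    """A perfect algorithm never misses and never raises a false alarm."""
    good = (Outcome.HIT, Outcome.CORRECT_REJECT)
    return all(classify(holds, positive) in good for holds, positive in observations)


# A no-false-alarm algorithm may still miss: it is usable by an attacker,
# but it is not perfect in the sense above.
no_false_alarm_run = [(True, True), (True, False), (False, False)]
print(is_perfect(no_false_alarm_run))  # False, because of the miss
```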

Our Contributions. Capturing the algebraic essentials of many attempts to make the detection of division properties more accurate, we propose a new technique called monomial prediction. This is a perfect detection algorithm for detecting the presence and absence of any monomial \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\) in the product \({\textit{\textbf{y}}}^{\textit{\textbf{v}}}\) of any output bits of a vectorial Boolean function \({\textit{\textbf{y}}}= \textit{\textbf{f}}({\textit{\textbf{x}}})\) by counting the number of the so-called monomial trails connecting \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\) and \({\textit{\textbf{y}}}^{\textit{\textbf{v}}}\) across a sequence of simpler vectorial Boolean functions whose composition is \(\textit{\textbf{f}}\). We then establish an equivalence between the monomial prediction approach and the recently proposed three-subset bit-based division property without unknown subset at EUROCRYPT 2020  [9]. We also show that all the predecessors of [9] (except the lazy propagation method  [27]) can be categorized as no-false-alarm detection algorithms.

The monomial prediction technique can be regarded as a new language for describing the division properties. The original language for the division properties is somewhat indirect and vague, since a property (the division property) of an object (a vectorial Boolean function) is defined via its effects on external objects (multisets) rather than via its own intrinsic nature. The monomial prediction delivers a definition of division properties that dispenses with the external multisets entirely. This new treatment not only gives us a unified view on the two-subset bit-based division property, three-subset bit-based division property, and three-subset division property without unknown subset, but also naturally leads to new search strategies. We revisit several well-known applications of the division property with the monomial prediction approach, and identify some improvements over the state-of-the-art.

By showing the presence of monomials with a certain degree and the absence of monomials with larger degrees, we obtain the exact algebraic degree of the output bits of Trivium up to 834 rounds for the first time. Our results show that the algebraic degree of 834-round Trivium is only 78, which is much lower than the previous estimations by Liu at CRYPTO 2017 [18], where the upper bound on the degree of 793-round Trivium already reaches 79. Along the way, we observe and report on an interesting and somewhat counter-intuitive phenomenon: the algebraic degree of Trivium can drop as the number of rounds grows. For example, the degree of 807-round Trivium is proven to be 71, but the degree of the next round drops to 70.

For a Boolean function f, we can check the presence and absence of all monomials that are divisible by the cube term to recover the superpoly in the cube attack. With the help of a divide-and-conquer strategy, our algorithm achieves high efficiency and scales well, making it possible to test many cubes in a limited time. As a result, we are able to identify cubes for Trivium with smaller dimensions than those in the previous best works. For instance, in [8, 9] all the cubes chosen for 840-, 841- and 842-round Trivium are of dimension 78, so it takes \(2^{78}\) Trivium encryptions to recover one bit of key information and \(2^{79}\) Trivium encryptions to recover the remaining key bits by exhaustive search. Thus the total complexity of the key-recovery attack is estimated as \(2^{78} + 2^{79} \approx 2^{79.6}\). Using our technique, for 840-round Trivium, we can recover superpolies with three different cubes of dimension only 75, which reduces the complexity of recovering the key to \(2^{77.8}\) encryptions. For 841-round Trivium, we recover two superpolies with two different cubes of dimension 76, which reduces the complexity of recovering the full key to \(2^{78.6}\) encryptions. For 842-round Trivium, with two different cubes of dimension 76 together with their superpolies, we can recover the full key with time complexity \(2^{78.6}\). We summarize our cube attacks on Trivium in Table 1.
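The complexity figures quoted above can be reproduced with a short calculation. The sketch below assumes a simple cost model inferred from the 78-dimensional case in the text: each cube of dimension d costs \(2^{d}\) encryptions and yields one key bit, and the remaining bits of the 80-bit Trivium key are found by exhaustive search. The function name `attack_cost_log2` is hypothetical.

```python
from math import log2

KEY_BITS = 80  # Trivium uses an 80-bit key


def attack_cost_log2(num_cubes: int, cube_dim: int, key_bits: int = KEY_BITS) -> float:
    """log2 of the total number of encryptions under the assumed cost model:
    num_cubes * 2^cube_dim for the cube sums, plus 2^(key_bits - num_cubes)
    for the exhaustive search over the remaining key bits."""
    return log2(num_cubes * 2**cube_dim + 2 ** (key_bits - num_cubes))


for label, cubes, dim in [
    ("one cube of dimension 78, as in [8, 9]", 1, 78),
    ("840 rounds, three cubes of dimension 75", 3, 75),
    ("841/842 rounds, two cubes of dimension 76", 2, 76),
]:
    print(f"{label}: 2^{attack_cost_log2(cubes, dim):.1f}")
```

Under this model the three cases evaluate to roughly \(2^{79.6}\), \(2^{77.8}\) and \(2^{78.6}\), matching the figures in the text.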

Table 1. The complexity of cube attacks on 840-, 841- and 842-round Trivium measured by the encryption of Trivium. #Cube means the number of cubes used in the offline phase of the cube attack.

Remark. Before going any further, we would like to briefly discuss the relationship between the monomial prediction and division properties. When used as detection algorithms for the key-independent sum property, both monomial prediction and the three-subset bit-based division property without unknown subsets are perfect. Originally, the division properties are defined over the multisets that the target cipher acts on, while the monomial prediction technique is fully formulated via the algebraic structure of the cipher itself. Our philosophy is that the effect of a cipher on multisets should be regarded as a manifestation of the cipher’s intrinsic property, which should not be mixed with the definition of this property. A unified view naturally emerges with the monomial prediction technique for all previous division properties, since all of them are manifestations of the properties of the ANFs of the target cipher. Finally, we would like to mention that Hebborn et al. [10] show that the three-subset bit-based division property without unknown subsets allows one to decide whether or not a specific monomial appears in the ANF with the help of the parity set proposed in [2]. So we say that the monomial prediction and the division properties achieve the same goal through different routes.

Organization. In Sect. 2, we introduce necessary notations and preliminaries. The principle of the monomial prediction approach is established in Sect. 3. This leads to the applications to the degree evaluation in Sect. 4 and to cube attacks in Sect. 5. In Sect. 6, we establish the equivalence between the three-subset bit-based division property without unknown subsets and the monomial prediction technique, and theoretically prove that they are perfect in detecting the key-independent sum property. Also, we theoretically show that other algorithms for division properties raise no false alarms. Section 7 concludes and discusses potential future work.

2 Preliminaries

We use bold italic lowercase letters to represent bit vectors, and \(\textit{\textbf{0}}\) represents a bit vector with all elements being 0. For an n-bit vector \({\textit{\textbf{u}}}\in \mathbb {F}_2^n\), its i-th coordinate is denoted by \(u_i\), and thus \({\textit{\textbf{u}}}= (u_0, \cdots , u_{n-1})\). The complementary vector of \({\textit{\textbf{u}}}\) is denoted by \(\bar{{\textit{\textbf{u}}}}\) where \(u_i \oplus \bar{u}_i = 1\) for \(0 \le i < n\). The Hamming weight of \({\textit{\textbf{u}}}\) is \(wt({\textit{\textbf{u}}}) = \sum _{i=0}^{n-1}u_i\). For any n-bit vectors \({\textit{\textbf{u}}}\) and \({\textit{\textbf{u}}}'\), we define \({\textit{\textbf{u}}}\succeq {\textit{\textbf{u}}}'\) if \(u_i \ge u'_i\) for all i, otherwise, \({\textit{\textbf{u}}}\nsucceq {\textit{\textbf{u}}}'\). Similarly, we define \({\textit{\textbf{u}}}\preceq {\textit{\textbf{u}}}'\) if \(u_i \le u'_i\) for all i, \({\textit{\textbf{u}}}\prec {\textit{\textbf{u}}}'\) if \(u_i < u'_i\) for all i and \({\textit{\textbf{u}}}\succ {\textit{\textbf{u}}}'\) if \(u_i > u'_i\) for all i.
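The vector notation above translates directly into code. The following is a minimal sketch in which bit vectors are modelled as Python tuples of 0/1 entries; the helper names `complement`, `wt`, `succeq` and `preceq` are hypothetical and simply mirror \(\bar{{\textit{\textbf{u}}}}\), \(wt\), \(\succeq\) and \(\preceq\).

```python
from typing import Tuple

BitVec = Tuple[int, ...]  # (u_0, ..., u_{n-1}) with entries in {0, 1}


def complement(u: BitVec) -> BitVec:
    """Complementary vector: u_i XOR complement(u)_i = 1 for every i."""
    return tuple(1 - ui for ui in u)


def wt(u: BitVec) -> int:
    """Hamming weight: the number of coordinates equal to 1."""
    return sum(u)


def succeq(u: BitVec, v: BitVec) -> bool:
    """u >= v coordinate-wise."""
    return all(ui >= vi for ui, vi in zip(u, v))


def preceq(u: BitVec, v: BitVec) -> bool:
    """u <= v coordinate-wise."""
    return succeq(v, u)


u = (1, 0, 1)
print(complement(u), wt(u))                  # (0, 1, 0) 2
print(succeq((1, 1, 0), (1, 0, 0)))          # True
# The order is only partial: (1,0,0) and (0,1,0) are incomparable.
print(succeq((1, 0, 0), (0, 1, 0)), preceq((1, 0, 0), (0, 1, 0)))  # False False
```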

Let \(f: \mathbb {F}_2^n \rightarrow \mathbb {F}_2 \) be a Boolean function in \(\mathbb {F}_2[x_0, x_1, \ldots , x_{n-1}]/(x_0^2-x_0, x_1^2-x_1, \ldots , x_{n-1}^2-x_{n-1})\) whose algebraic normal form (ANF) is

$$f(\textit{\textbf{x}}) = f(x_0, x_1, \ldots , x_{n-1}) = \bigoplus _{{\textit{\textbf{u}}}\in \mathbb {F}_2^n} a_{{\textit{\textbf{u}}}} \prod _{i=0}^{n-1}x_i^{u_i}, $$

where \(a_{{\textit{\textbf{u}}}} \in \mathbb {F}_2\), and

$$\textit{\textbf{x}}^{\textit{\textbf{u}}} = \pi _{\textit{\textbf{u}}}(\textit{\textbf{x}}) =\prod _{i=0}^{n-1}x_i^{u_i} \text { with } x_i^{u_i} =\left\{ \begin{aligned} x_i,&\mathrm {~if~} u_i=1,\\ 1,&\mathrm {~if~} u_i = 0, \end{aligned} \right. $$

is called a monomial. If the coefficient of \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\) in f is 1, we say \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\) is contained by f, denoted by \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\rightarrow f\). Otherwise, \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\) is not contained by f, denoted by \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\nrightarrow f\). In the remainder of the paper, we will use \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\) and \(\pi _{\textit{\textbf{u}}}({\textit{\textbf{x}}})\) interchangeably to avoid using the awkward notation \(\tiny {{\textit{\textbf{x}}}^{(i)}}^{{\textit{\textbf{u}}}^{(j)}}\) when both \(\textit{\textbf{x}}\) and \(\textit{\textbf{u}}\) have superscripts.

Example 1

Let \(f(x_0, x_1)= x_0x_1 \oplus x_0 \oplus 1\), then we have \(x_0x_1 \rightarrow f\), \(x_0 \rightarrow f\), \(1 \rightarrow f\), and \(x_1 \nrightarrow f\).
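When the truth table of f is small enough to enumerate, the containment relation of Example 1 can be checked mechanically. The sketch below uses the standard binary Möbius transform to obtain the ANF coefficients from the truth table; the indexing convention (bit i of the table index holds \(x_i\)) and the function name `anf_coefficients` are assumptions made for this illustration.

```python
from typing import List


def anf_coefficients(truth_table: List[int], n: int) -> List[int]:
    """Binary Moebius transform: entry u of the result is the ANF coefficient
    a_u, where bit i of the index u is the exponent u_i of x_i."""
    a = list(truth_table)
    for i in range(n):
        step = 1 << i
        for idx in range(1 << n):
            if idx & step:
                a[idx] ^= a[idx ^ step]
    return a


def f(x0: int, x1: int) -> int:
    # The function of Example 1: f = x0*x1 + x0 + 1 over F_2.
    return (x0 & x1) ^ x0 ^ 1


n = 2
table = [f(idx & 1, (idx >> 1) & 1) for idx in range(1 << n)]
coeffs = anf_coefficients(table, n)

names = {0b00: "1", 0b01: "x0", 0b10: "x1", 0b11: "x0x1"}
for u, name in names.items():
    relation = "->" if coeffs[u] else "-/->"
    print(f"{name} {relation} f")
```

The output agrees with Example 1: the monomials 1, \(x_0\) and \(x_0x_1\) are contained by f, while \(x_1\) is not.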

Let \(\textit{\textbf{y}} = (y_0, \cdots , y_{m-1}) = \textit{\textbf{f}}(\textit{\textbf{x}}) = (f_0(\textit{\textbf{x}}), \cdots , f_{m-1}(\textit{\textbf{x}}))\) be a vectorial Boolean function from \(\mathbb {F}_2^n\) to \(\mathbb {F}_2^m\). For \({\textit{\textbf{v}}}= (v_0, v_1, \ldots , v_{m-1}) \in \mathbb {F}_2^m\), a monomial \({\textit{\textbf{y}}}^{\textit{\textbf{v}}}\) of \(\textit{\textbf{y}}\) can be symbolically expressed as a polynomial of the variable \(\textit{\textbf{x}}\):

$${\textit{\textbf{y}}}^{\textit{\textbf{v}}}= \prod _{i=0}^{m-1}(f_i({\textit{\textbf{x}}}))^{v_i} = \bigoplus _{{\textit{\textbf{u}}}\in \mathbb {F}_2^n} a_{{\textit{\textbf{u}}}} {\textit{\textbf{x}}}^{\textit{\textbf{u}}}, a_{\textit{\textbf{u}}}\in \mathbb {F}_2. $$

In the following, we show how to determine whether \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\rightarrow {\textit{\textbf{y}}}^{\textit{\textbf{v}}}\) for a given monomial \({\textit{\textbf{x}}}^{{\textit{\textbf{u}}}}\).
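For small functions whose ANFs are known, the expansion of \({\textit{\textbf{y}}}^{\textit{\textbf{v}}}\) into monomials of \(\textit{\textbf{x}}\) can be carried out directly. The sketch below assumes an illustrative encoding that is not part of the original text: a polynomial is a set of monomials, and a monomial is an integer whose bit i is the exponent of \(x_i\). Multiplication uses \(x_i^2 = x_i\) (bitwise OR of exponents) and addition over \(\mathbb {F}_2\) (toggling set membership). The two coordinate functions are those of \(\textit{\textbf{f}}^{(0)}\) in Example 2 of Sect. 3.

```python
from typing import FrozenSet, Sequence

Poly = FrozenSet[int]  # set of monomials; bit i of a monomial <-> variable x_i


def poly_mul(p: Poly, q: Poly) -> Poly:
    """Product in F_2[x]/(x_i^2 - x_i): exponents are OR-ed, equal terms cancel."""
    out = set()
    for a in p:
        for b in q:
            out ^= {a | b}
    return frozenset(out)


def output_monomial(f: Sequence[Poly], v: int) -> Poly:
    """ANF of y^v = prod_i f_i(x)^{v_i}; bit i of v is the exponent v_i."""
    result: Poly = frozenset({0})  # the constant polynomial 1
    for i, f_i in enumerate(f):
        if (v >> i) & 1:
            result = poly_mul(result, f_i)
    return result


# y0 = x0 + x1 + x2,  y1 = x0*x1 + x0 + x2
F0 = [frozenset({0b001, 0b010, 0b100}), frozenset({0b011, 0b001, 0b100})]

y0y1 = output_monomial(F0, 0b11)
print(sorted(y0y1))          # monomials of y0*y1 as bit masks
print(0b001 in y0y1)         # does x0 -> y0*y1 hold?
```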

3 Monomial Prediction

Let \(\textit{\textbf{f}} : \mathbb {F}_2^n \rightarrow \mathbb {F}_2^m\) be a vectorial Boolean function sending \(\textit{\textbf{x}} = (x_0, \cdots , x_{n-1})\) to \(\textit{\textbf{y}} = (y_0, \cdots , y_{m-1})\) with \(y_i = f_i(\textit{\textbf{x}})\). By the monomial prediction we mean the problem of determining the presence or absence of a particular monomial \(\textit{\textbf{x}} ^ {\textit{\textbf{u}}}\) in \(\textit{\textbf{y}}^{\textit{\textbf{v}}}\), that is, whether \({\textit{\textbf{x}}}^{{\textit{\textbf{u}}}} \rightarrow {\textit{\textbf{y}}}^{{\textit{\textbf{v}}}}\). This is a trivial problem if the ANF of \(\textit{\textbf{f}}\) is available. However, in the context of symmetric-key cryptography, the ANF of the targeted \(\textit{\textbf{f}}\) is in most cases too complicated to be computed (or even to be stored) in practice. Typically, the only fact we know is that \(\textit{\textbf{f}}\) is built by composition from a sequence of vectorial Boolean functions whose ANFs are known, i.e.,

$$\begin{aligned} \textit{\textbf{y}} = \textit{\textbf{f}}(\textit{\textbf{x}}) = \textit{\textbf{f}}^{(r-1)} \circ \textit{\textbf{f}}^{(r-2)} \circ \cdots \circ \textit{\textbf{f}}^{(0)}(\textit{\textbf{x}}). \end{aligned}$$

Now, how do we determine whether \({\textit{\textbf{x}}}^{{\textit{\textbf{u}}}} \rightarrow {\textit{\textbf{y}}}^{{\textit{\textbf{v}}}}\) ?

Let \({\textit{\textbf{x}}}^{(i)}\) and \({\textit{\textbf{x}}}^{(i+1)}\) be the input and output variables of \(\textit{\textbf{f}}^{(i)}: \mathbb {F}_2^{n_i} \rightarrow \mathbb {F}_2^{n_{i+1}}\), respectively. Then \({\textit{\textbf{x}}}^{(i+1)} = \textit{\textbf{f}}^{(i)}( {\textit{\textbf{x}}}^{(i)} )\) for \(0 \le i < r\), and thus \({\textit{\textbf{x}}}^{(i)}\) can be represented as a vectorial Boolean function of \({\textit{\textbf{x}}}^{(j)}\) with \(j < i\):

$$\begin{aligned} {\textit{\textbf{x}}}^{(i)}= \textit{\textbf{f}}^{(i-1)}\circ \cdots \circ \textit{\textbf{f}}^{(j+1)} \circ \textit{\textbf{f}}^{(j)}({\textit{\textbf{x}}}^{(j)}), \text{ for } 1 \le i \le r. \end{aligned}$$

Since the ANF of \({\textit{\textbf{x}}}^{(i+1)} = \textit{\textbf{f}}^{(i)}({\textit{\textbf{x}}}^{(i)})\) is available, one can determine whether \(\pi _{{\textit{\textbf{u}}}^{(i)}}({\textit{\textbf{x}}}^{(i)}) \rightarrow \pi _{{\textit{\textbf{u}}}^{(i+1)}}({\textit{\textbf{x}}}^{(i+1)})\) for any \({\textit{\textbf{u}}}^{(i)}\) and \({\textit{\textbf{u}}}^{(i+1)}\), which gives rise to the concept of the monomial trail.

Definition 1

(Monomial Trail). Let \({\textit{\textbf{x}}}^{(i+1)} = \textit{\textbf{f}}^{(i)}({\textit{\textbf{x}}}^{(i)})\) for \(0 \le i < r\). We call a sequence of monomials \(( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } , {\pi _{{\textit{\textbf{u}}}^{(1)}} ( {\textit{\textbf{x}}}^{(1)} ) } , \ldots , {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } )\) an r-round monomial trail connecting \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \) and \( {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \) with respect to the composite function \(\textit{\textbf{f}} = \textit{\textbf{f}}^{(r-1)} \circ \textit{\textbf{f}}^{(r-2)} \circ \cdots \circ \textit{\textbf{f}}^{(0)}\) if

$$\begin{aligned} {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow \cdots \rightarrow {\pi _{{\textit{\textbf{u}}}^{(i)}} ( {\textit{\textbf{x}}}^{(i)} ) } \rightarrow \cdots \rightarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } . \end{aligned}$$

If there is at least one monomial trail connecting \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \) and \( {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \), we write \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightsquigarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \). Otherwise, we write \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \not \rightsquigarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \).

Note that a monomial trail is always specified with respect to a given composition sequence \(\textit{\textbf{f}}^{(r-1)} \circ \textit{\textbf{f}}^{(r-2)} \circ \cdots \circ \textit{\textbf{f}}^{(0)}\). When this sequence is obvious from the context, we will omit it to keep the presentation concise. Also, we always assume by default that

$${\textit{\textbf{x}}}^{(r)} = \textit{\textbf{f}}^{(r-1)}({\textit{\textbf{x}}}^{(r-1)}) = \textit{\textbf{f}}^{(r-1)}\circ \textit{\textbf{f}}^{(r-2)}({\textit{\textbf{x}}}^{(r-2)}) = \cdots = \textit{\textbf{f}}^{(r-1)} \circ \cdots \circ \textit{\textbf{f}}^{(0)}({\textit{\textbf{x}}}^{(0)}). $$

Example 2

Let \(\textit{\textbf{z}} = (z_0, z_1) = \textit{\textbf{f}}^{(1)}(y_0, y_1) = (y_0y_1, y_0 \oplus y_1)\), \(\textit{\textbf{y}} = (y_0, y_1) = \textit{\textbf{f}}^{(0)}(x_0, x_1, x_2) = (x_0 \oplus x_1 \oplus x_2, x_0 x_1 \oplus x_0 \oplus x_2 )\) and \(\textit{\textbf{f}} = \textit{\textbf{f}}^{(1)} \circ \textit{\textbf{f}}^{(0)}\).

Consider the monomial \((x_0, x_1, x_2)^{(1,0,0)} = x_0\). Since the ANF of \(\textit{\textbf{f}}^{(0)}\) is available, we can compute all monomials of \(\textit{\textbf{y}}\), i.e.,

$$ \begin{aligned}&(y_0, y_1)^{(0, 0)} = 1, (y_0, y_1)^{(1, 0)} = y_0 = \underline{x_0} \oplus x_1 \oplus x_2, (y_0, y_1)^{(0, 1)} = y_1 = x_0 x_1 \oplus \underline{x_0} \oplus x_2, \\&(y_0, y_1)^{(1, 1)} = y_0 y_1 = x_0x_1x_2 \oplus x_0x_1 \oplus x_1 x_2 \oplus \underline{x_0} \oplus x_2. \end{aligned}$$

Then

$$ x_0 \rightarrow y_0,~ x_0 \rightarrow y_1,~ x_0 \rightarrow y_0 y_1 $$

are all the three monomial trails of \(\textit{\textbf{f}}^{(0)}\) connecting \(x_0\) and monomials of \(\textit{\textbf{y}}\).

Similarly, we can compute all the monomials of \(\textit{\textbf{z}}\) as follows,

$$ \begin{aligned}&(z_0, z_1)^{(0, 0)} = 1, (z_0, z_1)^{(1, 0)} = z_0 = \underline{ y_0 y_1 }, (z_0, z_1)^{(0, 1)} = z_1 = \underline{y_0} \oplus \underline{y_1},\\&(z_0, z_1)^{(1, 1)} = z_0z_1 = 0. \end{aligned} $$

There are three monomial trails of \(\textit{\textbf{f}}\) connecting \(x_0\) and monomials of \(\textit{\textbf{z}}\):

$$ \begin{aligned} x_0 \rightarrow y_0 \rightarrow z_1, \quad x_0 \rightarrow y_1 \rightarrow z_1, \quad x_0 \rightarrow y_0 y_1 \rightarrow z_0. \end{aligned} $$
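The trails of Example 2 can be enumerated by brute force. The following sketch is only meant for toy sizes and rests on an illustrative encoding: a polynomial is a set of monomials, a monomial is an integer bit mask of its variables, and each layer \(\textit{\textbf{f}}^{(i)}\) is a list of coordinate polynomials. All function names are hypothetical.

```python
from typing import FrozenSet, List, Sequence, Tuple

Poly = FrozenSet[int]  # set of monomials; bit i of a monomial <-> variable i


def poly_mul(p: Poly, q: Poly) -> Poly:
    """Product over F_2 with x_i^2 = x_i; equal terms cancel in pairs."""
    out = set()
    for a in p:
        for b in q:
            out ^= {a | b}
    return frozenset(out)


def output_monomial(layer: Sequence[Poly], v: int) -> Poly:
    """ANF of the output monomial with exponent mask v, in the layer's inputs."""
    result: Poly = frozenset({0})
    for i, f_i in enumerate(layer):
        if (v >> i) & 1:
            result = poly_mul(result, f_i)
    return result


def trails(layers: Sequence[Sequence[Poly]], u0: int) -> List[Tuple[int, ...]]:
    """All monomial trails starting from the input monomial with exponent u0."""
    partial: List[Tuple[int, ...]] = [(u0,)]
    for layer in layers:
        extended = []
        for t in partial:
            for v in range(1 << len(layer)):
                if t[-1] in output_monomial(layer, v):  # one step "->"
                    extended.append(t + (v,))
        partial = extended
    return partial


# Example 2: y = f0(x) and z = f1(y)
F0 = [frozenset({0b001, 0b010, 0b100}), frozenset({0b011, 0b001, 0b100})]
F1 = [frozenset({0b11}), frozenset({0b01, 0b10})]

x0_trails = trails([F0, F1], 0b001)
for t in x0_trails:
    print(t)  # (x-exponent, y-exponent, z-exponent) as bit masks
```

With masks read as exponents, the three tuples found correspond to \(x_0 \rightarrow y_0 \rightarrow z_1\), \(x_0 \rightarrow y_1 \rightarrow z_1\) and \(x_0 \rightarrow y_0y_1 \rightarrow z_0\).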

Lemma 1

\( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightsquigarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \) if \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \), and thus \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \not \rightsquigarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \) implies \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \nrightarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \).

Proof

We prove it by induction on r. Assuming this lemma holds for \(r < s\), we are going to show that it also holds for \(r = s\). First, we expand \( {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } \) on \({\textit{\textbf{x}}}^{(s-1)}\) as

$$ {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } = \bigoplus _{ {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } \rightarrow {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } }^{} {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } . $$

Since \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } \), there is at least one \( {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } \) contained by \( {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } \) satisfying \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } \). According to our assumption, \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightsquigarrow {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } \), then \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightsquigarrow {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } \).    \(\square \)

According to Lemma 1, \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \) is sufficient for \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightsquigarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \). However, the converse is not true in general. In Example 2, although \(x_0 \rightsquigarrow z_1\), we have \(x_0 \nrightarrow z_1\) since

$$ z_1 = y_0 \oplus y_1 = \underline{x_0} \oplus x_1 \oplus x_2 \oplus x_0x_1 \oplus \underline{x_0} \oplus x_2 = x_0 x_1 \oplus x_1. $$

The reason is that the two \(x_0\)’s (underlined in the above equation) cancel each other. In the following, we will demonstrate that whether \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \) holds is determined by the number of monomial trails connecting them rather than by the mere existence of a monomial trail, which motivates the definition below.

Definition 2

(Monomial Hull). For \(\textit{\textbf{f}}\) with a specific composition sequence, the monomial hull of \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \) and \( {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \), denoted by \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \), is the set of all monomial trails connecting them. The number of trails in the monomial hull is called the size of the hull and is denoted by \(| {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } |\).

Example 3

In Example 2, the monomial hull of \(x_0\) and \(z_1\) is the set

$$ x_0 \bowtie z_1 = \left\{ x_0 \rightarrow y_0 \rightarrow z_1, x_0 \rightarrow y_1 \rightarrow z_1 \right\} . $$

Thus the size of \(x_0 \bowtie z_1\) is 2. Furthermore, since \(x_0 \not \rightsquigarrow z_0z_1\), \(x_0 \bowtie z_0z_1 = \emptyset \) and \(|x_0 \bowtie z_0z_1| = 0\).

For \(i \ge 1\), if \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightsquigarrow {\pi _{{\textit{\textbf{u}}}^{(i)}} ( {\textit{\textbf{x}}}^{(i)} ) } \), \(| {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie {\pi _{{\textit{\textbf{u}}}^{(i)}} ( {\textit{\textbf{x}}}^{(i)} ) } |\) can be calculated recursively as follows,

Lemma 2

For \(i \ge 1\), if \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightsquigarrow {\pi _{{\textit{\textbf{u}}}^{(i)}} ( {\textit{\textbf{x}}}^{(i)} ) } \),

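$$ | {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie {\pi _{{\textit{\textbf{u}}}^{(i)}} ( {\textit{\textbf{x}}}^{(i)} ) } | = \sum _{ {\pi _{{\textit{\textbf{u}}}^{(i-1)}} ( {\textit{\textbf{x}}}^{(i-1)} ) } \rightarrow {\pi _{{\textit{\textbf{u}}}^{(i)}} ( {\textit{\textbf{x}}}^{(i)} ) } } | {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie {\pi _{{\textit{\textbf{u}}}^{(i-1)}} ( {\textit{\textbf{x}}}^{(i-1)} ) } |. $$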

The time has come to address the monomial prediction problem we mentioned at the beginning of this section.

Proposition 1

\( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \) if and only if \(| {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } |\) is odd.

Proof

We prove it by induction on r. Assuming this proposition holds for \(r < s\), we are going to show that it also holds for \(r = s\). First, we expand \( {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } \) on \({\textit{\textbf{x}}}^{(s-1)}\) as

$$ {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } = \bigoplus _{ {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } \rightarrow {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } } {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } . $$

Consequently, we have

$$ | {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } | = \sum _{\begin{array}{c} {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } \\ \rightarrow {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } \end{array}} | {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } |. $$

Moreover, \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } \) if and only if there is an odd number of \( {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } \) contained by \( {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } \) such that \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } \), or equivalently, according to the induction hypothesis we made at the beginning, there is an odd number of \( {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } \) contained by \( {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } \) such that \(| {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } |\) is odd. Finally, Proposition 1 is true for \(r = s\) since \(| {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } |\) is odd if and only if

$$ \sum _{ {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } \rightarrow {\pi _{{\textit{\textbf{u}}}^{(s)}} ( {\textit{\textbf{x}}}^{(s)} ) } }^{} | {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie {\pi _{{\textit{\textbf{u}}}^{(s-1)}} ( {\textit{\textbf{x}}}^{(s-1)} ) } |~~\mathrm {is~~odd}. $$

   \(\square \)
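Proposition 1 can be checked exhaustively on the toy function of Example 2. The sketch below is a brute-force illustration rather than a practical search method: hull sizes are computed layer by layer with the recursion used in the proof above, starting from a count of 1 for the input monomial itself (an assumed base case), and the parity is compared with the ANF obtained by explicit substitution. The encoding (monomials as bit masks, polynomials as sets) and all function names are hypothetical.

```python
from typing import Dict, FrozenSet, Sequence

Poly = FrozenSet[int]  # set of monomials; bit i of a monomial <-> variable i


def poly_mul(p: Poly, q: Poly) -> Poly:
    out = set()
    for a in p:
        for b in q:
            out ^= {a | b}  # x_i^2 = x_i, and equal terms cancel over F_2
    return frozenset(out)


def output_monomial(layer: Sequence[Poly], v: int) -> Poly:
    result: Poly = frozenset({0})
    for i, f_i in enumerate(layer):
        if (v >> i) & 1:
            result = poly_mul(result, f_i)
    return result


def compose(outer: Poly, inner_layer: Sequence[Poly]) -> Poly:
    """Substitute the inner layer into a polynomial in its output variables."""
    out = set()
    for v in outer:
        out ^= output_monomial(inner_layer, v)
    return frozenset(out)


def hull_size(layers: Sequence[Sequence[Poly]], u0: int, target: int) -> int:
    """Number of monomial trails from exponent u0 to exponent target."""
    counts: Dict[int, int] = {u0: 1}
    for layer in layers:
        nxt: Dict[int, int] = {}
        for v in range(1 << len(layer)):
            # Sum the counts of all predecessors u with pi_u -> pi_v.
            total = sum(counts.get(u, 0) for u in output_monomial(layer, v))
            if total:
                nxt[v] = total
        counts = nxt
    return counts.get(target, 0)


F0 = [frozenset({0b001, 0b010, 0b100}), frozenset({0b011, 0b001, 0b100})]
F1 = [frozenset({0b11}), frozenset({0b01, 0b10})]

mismatches = [
    (u0, w)
    for u0 in range(1 << 3)
    for w in range(1 << 2)
    if (hull_size([F0, F1], u0, w) % 2 == 1)
    != (u0 in compose(output_monomial(F1, w), F0))
]
print(hull_size([F0, F1], 0b001, 0b10))  # |x0 hull z1|: even, so x0 is absent from z1
print(mismatches)                        # expected to be empty by Proposition 1
```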

3.1 Derived Function

When applying the monomial prediction technique to cryptanalysis, we may consider functions that are derived from a vectorial Boolean function \(\textit{\textbf{f}}\) by fixing some variables of \(\textit{\textbf{f}}\) to known constants. In this case, the derived function has fewer variables than the original function \(\textit{\textbf{f}}\). Also, the remaining variables are not treated equally. Some of them are public (IV bits, plaintext bits, tweak bits, etc.), while some of them are secret (key bits). To highlight the semantic difference of the variables and distinguish between the variables fixed to 0 and those fixed to 1, we introduce the notion of variable masks. Together with the original function \(\textit{\textbf{f}}\), these masks completely determine the derived function, and tell us which variables of the derived function are public and which are secret.

Remark. The only purpose of introducing the concept of the derived function is to have a unified approach to specify the functions to which our techniques are applied. It has no theoretical significance and the readers who do not care about the details of the attacks on concrete targets can safely skip this part to avoid being overloaded by unnecessary notations. Actually, skipping this part is encouraged and the readers can look back when necessary.

Variable Masks and Derived Function. Let \(\varvec{\varGamma ^0}\), \(\varvec{\varGamma ^1}\), \(\varvec{\varGamma ^p}\), and \(\varvec{\varGamma ^s} \in \mathbb {F}_2^n\) be constant vectors, called variable masks, such that \(\{ 0 \le i < n : \varGamma _i^0 = 1 \}\), \(\{ 0 \le i < n : \varGamma _i^1 = 1 \}\), \(\{ 0 \le i < n : \varGamma _i^p = 1 \}\), and \(\{ 0 \le i < n : \varGamma _i^s = 1 \}\) form a partition of \(\{0, \cdots , n-1 \}\). For a vectorial Boolean function \(\textit{\textbf{f}}(\textit{\textbf{x}})\) from \(\mathbb {F}_2^n\) to \(\mathbb {F}_2^m\), we can derive a new function \(\textit{\textbf{f}}_d\) from \(\textit{\textbf{f}}\) with the variable masks by setting certain variables of \(\textit{\textbf{f}}\) to constants according to the following rule for \(i \in \{0, 1, \cdots , n-1 \}\):

$$ x_i = \left\{ \begin{aligned} 0,&\mathrm {~if~} \varGamma _i^0 = 1,\\ 1,&\mathrm {~if~} \varGamma _i^1 = 1. \end{aligned} \right. $$

The remaining \(x_i\)’s are still treated as variables but with different access permissions: \(x_i\)’s with \(\varGamma _i^p = 1\) are public variables and can be manipulated by the attackers, while \(x_i\)’s with \(\varGamma _i^s = 1\) are secret variables. Although in practice secret variables typically represent secret key bits and are actually fixed to unknown constants, in our framework we still regard them as symbolic objects rather than constants. The concept of the derived function should be best understood by a concrete example.

Example 4

Let \(\textit{\textbf{y}} = \textit{\textbf{f}}(x_0, x_1, x_2, x_3, k_0, k_1, k_2, k_3)\), where \(x_0, x_1, x_2, x_3\) are four public input bits and \(k_0, k_1, k_2, k_3\) are four secret input bits. If we fix \(x_0\) to 0 and \(x_1\) to 1, the resulting function mapping \((0, 1, x_2, x_3, k_0, k_1, k_2, k_3)\) to

$$\textit{\textbf{f}}(0, 1, x_2, x_3, k_0, k_1, k_2, k_3)$$

is a derived function from \(\textit{\textbf{f}}\) with the following variable masks

$$ \begin{aligned} \varvec{\varGamma }^{0} = (1,0,0,0,0,0,0,0),~~ \varvec{\varGamma }^{1} = (0,1,0,0,0,0,0,0), \\ \varvec{\varGamma }^{p} = (0,0,1,1,0,0,0,0),~~ \varvec{\varGamma }^{s} = (0,0,0,0,1,1,1,1). \end{aligned} $$
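The effect of the masks of Example 4 on an ANF can be sketched in a few lines. The polynomial `F` below is a made-up toy function (the text does not specify \(\textit{\textbf{f}}\)), the encoding of monomials as bit masks is an assumption of this illustration, and the helper names are hypothetical. Variables are indexed in the order \((x_0, x_1, x_2, x_3, k_0, k_1, k_2, k_3)\), so \(x_i\) is bit i and \(k_j\) is bit 4 + j.

```python
from typing import FrozenSet, Sequence

Poly = FrozenSet[int]  # set of monomials; bit i of a monomial <-> variable i


def mask_from_vector(vec: Sequence[int]) -> int:
    """Turn a 0/1 vector (Gamma_0, ..., Gamma_{n-1}) into an integer bit mask."""
    return sum(bit << i for i, bit in enumerate(vec))


def is_partition(masks: Sequence[int], n: int) -> bool:
    """The supports must be pairwise disjoint and cover all n positions."""
    union = 0
    for m in masks:
        if union & m:
            return False
        union |= m
    return union == (1 << n) - 1


def derive(poly: Poly, gamma0: int, gamma1: int) -> Poly:
    """ANF after fixing gamma0-variables to 0 and gamma1-variables to 1."""
    out = set()
    for mono in poly:
        if mono & gamma0:
            continue                # the monomial vanishes
        out ^= {mono & ~gamma1}     # factors fixed to 1 drop out; terms may cancel
    return frozenset(out)


G0 = mask_from_vector((1, 0, 0, 0, 0, 0, 0, 0))
G1 = mask_from_vector((0, 1, 0, 0, 0, 0, 0, 0))
GP = mask_from_vector((0, 0, 1, 1, 0, 0, 0, 0))
GS = mask_from_vector((0, 0, 0, 0, 1, 1, 1, 1))

# Hypothetical toy function: x0*x2 + x1*x2*k1 + x2*k1 + x3*k0 + x1
F = frozenset({0b00000101, 0b00100110, 0b00100100, 0b00011000, 0b00000010})

print(is_partition([G0, G1, GP, GS], 8))
print(sorted(derive(F, G0, G1)))  # masks of the surviving monomials
```

In this toy case the derived function is \(x_3k_0 \oplus 1\): the term \(x_0x_2\) vanishes, \(x_1x_2k_1\) collapses onto \(x_2k_1\) and cancels it, and \(x_1\) becomes the constant 1.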

In the following sections, we typically first give a function \(\textit{\textbf{f}}\) which can be directly obtained from the description of the targeted cipher, and then we specify the associated variable masks. Finally, the techniques presented in this work are applied to the corresponding derived function.

In the case of \(\textit{\textbf{f}}_d\), note that \({\textit{\textbf{x}}}^{{\textit{\textbf{v}}}} \equiv 1\) for any \({\textit{\textbf{v}}}\preceq \varvec{\varGamma }^{1}\). Hence \({\textit{\textbf{x}}}^{{\textit{\textbf{u}}}\oplus {\textit{\textbf{v}}}} = {\textit{\textbf{x}}}^{\textit{\textbf{u}}}\cdot {\textit{\textbf{x}}}^{\textit{\textbf{v}}}= {\textit{\textbf{x}}}^{\textit{\textbf{u}}}\) for any \({\textit{\textbf{u}}}\preceq \varvec{\varGamma }^{p} \oplus \varvec{\varGamma }^{s}\) and any \({\textit{\textbf{v}}}\preceq \varvec{\varGamma }^{1}\), and Proposition 1 can be converted to the following proposition.

Proposition 2

Let \(\textit{\textbf{f}}_d\) be the derived function of \(\textit{\textbf{f}}\) with \(\varvec{\varGamma }^{0}, \varvec{\varGamma }^{1}, \varvec{\varGamma }^{p}, \varvec{\varGamma }^{s}\). For \({\textit{\textbf{x}}}^{(r)} = \textit{\textbf{f}}_{d}({\textit{\textbf{x}}}^{(0)})\) and \({\textit{\textbf{u}}}^{(0)} \preceq \varvec{\varGamma }^{p} \oplus \varvec{\varGamma }^{s}\), \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \) if and only if

$$\begin{aligned} \sum _{ {\textit{\textbf{v}}}\preceq \varvec{\varGamma }^{1} } | \pi _{{\textit{\textbf{u}}}^{(0)} \oplus {\textit{\textbf{v}}}}({\textit{\textbf{x}}}^{(0)}) \bowtie \pi _{{\textit{\textbf{u}}}^{(r)}}({\textit{\textbf{x}}}^{(r)}) | \bmod 2 = 1. \end{aligned}$$

4 Application I: Degree Evaluation

Since the algebraic degree of a symmetric-key primitive significantly affects its security against cryptanalytic techniques such as algebraic attacks  [20], higher-order differential attacks  [15, 17], interpolation attacks  [14], and integral attacks  [6, 16], methods and tools for degree evaluation have been an important topic in the community all along. To put our approach into perspective, we highlight several important works in this line of research. At EUROCRYPT 2002, Canteaut and Videau developed a method for upper bounding the algebraic degree of composite functions  [5], which was improved by Boura et al.  [3] at FSE 2011. In  [1], the authors identified a simple closed formula bounding the number of rounds necessary to achieve full degree for the block ciphers with secret components. At CRYPTO 2017, Liu presented a general framework known as numeric mapping, which is exclusively used for estimating the algebraic degrees of the cryptosystems based on the nonlinear feedback shift register (NFSR)  [18].

Another approach for the degree evaluation is based on the division property. The accuracy of this approach is determined by the accuracy of the “propagation rules” of the underlying detection algorithms for division properties. When the detection algorithm is perfect (the meaning of perfect is made concrete in Sect. 6), its estimation is exact. In the following, we show that the monomial prediction technique achieves this exactness.

4.1 Compute Exact Algebraic Degree of a Boolean Function

The algebraic degree of a Boolean function f is defined as follows,

$$\begin{aligned} \deg (f) = \max \limits _{ {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow f } wt( {\textit{\textbf{u}}}^{(0)} ). \end{aligned}$$
(1)

To show that the algebraic degree of f is d, we only need to prove the existence of a monomial \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow f\) with \(wt({\textit{\textbf{u}}}^{(0)}) = d\) such that \(\pi _{{\textit{\textbf{u}}}'}({\textit{\textbf{x}}}^{(0)}) \nrightarrow f\) for any \(\textit{\textbf{u}}'\) with \(wt(\textit{\textbf{u}}') > d\), which can be done in two steps:

  1. Find a monomial \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightsquigarrow f\) with \(wt({\textit{\textbf{u}}}^{(0)}) = d\) and prove \(\pi _{{\textit{\textbf{u}}}'}({\textit{\textbf{x}}}^{(0)}) \nrightarrow f\) for any \(\textit{\textbf{u}}'\) with \(wt(\textit{\textbf{u}}') > d\).

  2. Compute \(| {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie f|\) to confirm the presence of \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \). If the value is odd, then \(\deg (f) = d\); otherwise, we repeat the process until we find a desired monomial of f.
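For a function on only a few variables, Eq. (1) can be checked by brute force, which is useful as a reference when testing a MILP model on toy instances. The sketch below is not the MILP procedure described here: it computes the ANF from a truth table with the Möbius transform and reads off the degree. The names `anf` and `degree` and the example function are assumptions made for illustration.

```python
def anf(truth_table):
    """Moebius transform: truth table (2^n bits) -> ANF coefficients.

    Index x encodes the input (bit i of x is x_i); index u of the result
    encodes the monomial x^u.
    """
    a = list(truth_table)
    n = len(a).bit_length() - 1
    for i in range(n):
        step = 1 << i
        for x in range(len(a)):
            if x & step:
                a[x] ^= a[x ^ step]
    return a


def degree(truth_table):
    """Maximum Hamming weight of u over the monomials x^u present in the ANF."""
    coeffs = anf(truth_table)
    return max((bin(u).count("1") for u, c in enumerate(coeffs) if c), default=-1)


# Hypothetical example: f(x0, x1, x2) = x0*x1 + x2.
tt = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(8)]
monomials = [u for u, c in enumerate(anf(tt)) if c]  # u = 3 is x0*x1, u = 4 is x2
```

The zero function has no monomials, so `degree` returns -1 for it by convention in this sketch.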

The Mixed Integer Linear Programming (MILP) approach has been extensively used to probe the structure of Boolean functions in previous works such as  [9, 22, 26, 28,29,30,31]. In this work, we also employ the MILP-based approach to search for the monomials of f. In this MILP model, the objective function of the model is to maximize \(wt({\textit{\textbf{u}}}^{(0)})\) according to Eq. (1). One solution of the MILP model is a sequence of \(({\textit{\textbf{u}}}^{(0)}, {\textit{\textbf{u}}}^{(1)}, \ldots , {\textit{\textbf{u}}}^{(r)})\)Footnote 1, such that

$$ {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow {\pi _{{\textit{\textbf{u}}}^{(1)}} ( {\textit{\textbf{x}}}^{(1)} ) } \rightarrow \cdots \rightarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } . $$

To confirm the presence of \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \) as in the above Step 2, we use the \(\mathtt {PoolSearchMode}\) of Gurobi to compute \(| {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie f|\).

PoolSearchMode of Gurobi. To judge whether the size of a monomial hull is an odd number, we frequently need to find all solutions of a MILP model. Following Hao et al.’s work at EUROCRYPT 2020  [9], we also employ the PoolSearchMode of GurobiFootnote 2 to perform solution enumerations. The PoolSearchMode is a mode implemented by Gurobi to systematically search for multiple solutions. For a MILP model \(\mathcal {M}\), we use

$$\mathcal {M}.\mathtt {PoolSearchMode} \leftarrow 1$$

to signal that the PoolSearchMode is turned on. All the source codes are available at https://github.com/hukaisdu/MonomialPrediction.

4.2 Application to Trivium

Specification of Trivium. Trivium   [4] is an NFSR-based stream cipher with a 288-bit internal state \({\textit{\textbf{x}}}= (x_0,x_1,\ldots ,x_{287})\) divided into three registers (denoted as Reg 0, Reg 1 and Reg 2 in Fig. 1). The 80-bit secret key K is loaded to the first register (Reg 0), and the 80-bit initialization vector IV is loaded to the second register. The other bits of the three registers are set to 0 except the last three bits of the third register. Namely, we have

$$\begin{aligned} (x_0, x_1, \ldots , x_{92})&\leftarrow (K[0], K[1], \ldots , K[79], 0, \ldots , 0), \\ (x_{93}, x_{94}, \ldots , x_{176})&\leftarrow (IV[0], IV[1], \ldots , IV[79], 0, \ldots , 0), \\ (x_{177}, x_{178}, \ldots , x_{287})&\leftarrow (0, 0, \ldots , 0, 1, 1, 1). \end{aligned}$$

Let \(h: \mathbb {F}_2^5 \rightarrow \mathbb {F}_2\) be a Boolean function such that \(h(\alpha _0, \alpha _1, \alpha _2, \alpha _3, \alpha _4) =\alpha _0 \oplus \alpha _1 \alpha _2 \oplus \alpha _3 \oplus \alpha _4 \). The pseudo code of the update function is given by

$$\begin{aligned}&t_1 \leftarrow h(x_{65}, x_{90}, x_{91}, x_{92} , x_{170} ) = x_{65} \oplus x_{90} x_{91} \oplus x_{92} \oplus x_{170},\\&t_2 \leftarrow h( x_{161}, x_{174}, x_{175}, x_{176} , x_{263} ) = x_{161} \oplus x_{174} x_{175} \oplus x_{176} \oplus x_{263},\\&t_3 \leftarrow h( x_{242}, x_{285} , x_{286} , x_{287} , x_{68} ) = x_{242} \oplus x_{285} x_{286} \oplus x_{287} \oplus x_{68}. \end{aligned}$$

The state of the next clock is computed as

$$\begin{aligned} (x_0, x_1, \ldots , x_{92})&\leftarrow (t_3, x_0, \ldots , x_{91}), \\ (x_{93}, x_{94}, \ldots , x_{176})&\leftarrow (t_1, x_{93}, \ldots , x_{175}), \\ (x_{177}, x_{178}, \ldots , x_{287})&\leftarrow (t_2, x_{177}, \ldots , x_{286}). \end{aligned}$$

During the initialization, the state is updated 1152 times without producing any output. After the initialization, one key stream bit is produced per application of the update function by the key stream generation function \(g: \mathbb {F}_2^{288} \rightarrow \mathbb {F}_2\) as

$$\begin{aligned} z~ \leftarrow g(x_{0}, x_{1},\ldots , x_{287}) = x_{65} \oplus x_{92} \oplus x_{161} \oplus x_{176} \oplus x_{242} \oplus x_{287}. \end{aligned}$$
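The specification above translates directly into a small reference implementation. The sketch below follows the 0-indexed state layout used in this section; the function names are assumptions, and no claim is made about agreement with the bit ordering of any published test vectors.

```python
def trivium_init(key, iv):
    """Load an 80-bit key and an 80-bit IV (lists of 0/1) into the 288-bit state."""
    assert len(key) == 80 and len(iv) == 80
    x = [0] * 288
    x[0:80] = key            # Reg 0: K[0..79], then zeros up to x_92
    x[93:173] = iv           # Reg 1: IV[0..79], then zeros up to x_176
    x[285] = x[286] = x[287] = 1
    return x


def trivium_update(x):
    """One application of the update function."""
    t1 = x[65] ^ (x[90] & x[91]) ^ x[92] ^ x[170]
    t2 = x[161] ^ (x[174] & x[175]) ^ x[176] ^ x[263]
    t3 = x[242] ^ (x[285] & x[286]) ^ x[287] ^ x[68]
    # Each register shifts by one position and receives its feedback bit.
    return [t3] + x[0:92] + [t1] + x[93:176] + [t2] + x[177:287]


def trivium_output(x):
    """Key stream generation function g."""
    return x[65] ^ x[92] ^ x[161] ^ x[176] ^ x[242] ^ x[287]


def z(key, iv, r):
    """Output bit after r updates of the initial state."""
    x = trivium_init(key, iv)
    for _ in range(r):
        x = trivium_update(x)
    return trivium_output(x)
```

With no updates at all, only three taps are nonzero, so `z(key, iv, 0)` equals `key[65] ^ iv[68] ^ 1`.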

MILP Model for a Monomial Trail of Trivium. Let \({\textit{\textbf{x}}}^{(0)}\) denote the initial state of Trivium and \({\textit{\textbf{x}}}^{(i+1)}\) denote the state after the i-th update function \(\textit{\textbf{f}}^{(i)}\). The output bit after r-round TriviumFootnote 3 \(z_r\) is a Boolean function of \({\textit{\textbf{x}}}^{(0)}\) which is denoted by \(z_r = f({\textit{\textbf{x}}}^{(0)})\). Naturally, f is the composition of the update functions and the key stream generation function as

$$\begin{aligned} z_r&= f({\textit{\textbf{x}}}^{(0)}) = g \circ \textit{\textbf{f}}^{(r-1)} \circ \textit{\textbf{f}}^{(r-2)} \circ \cdots \circ \textit{\textbf{f}}^{(0)}({\textit{\textbf{x}}}^{(0)}) \nonumber \\&= g ({\textit{\textbf{x}}}^{(r)}) = x^{(r)}_{65} \oplus x^{(r)}_{92} \oplus x^{(r)}_{161} \oplus x^{(r)}_{176} \oplus x^{(r)}_{242} \oplus x^{(r)}_{287}. \end{aligned}$$
(2)

To construct the MILP model for the monomial trail of Trivium, we should study the ANFs of \(\textit{\textbf{f}}^{(i)}\) and g and model the monomial trail locally for them.

Fig. 1. The illustration of \(\textit{\textbf{f}}^{(i)}\). In the first phase, if \(j \notin \{ 92, 176, 287\}\), \(y^{(i)}_j = x^{(i)}_j\). In the second phase, \(x^{(i+1)}_{(j+1)\bmod 288} = y^{(i)}_j\).

According to Fig. 1, \(\textit{\textbf{f}}^{(i)}\) can be represented by parallel bit-permutations and three H functions such as

$$\begin{aligned}&x^{(i+1)}_{j+1\bmod 288} = x^{(i)}_{j}, \mathrm {~if~} j\notin \{ {\scriptstyle 65,90,91,92,170, 161,174,175,176,263, 242,285,286,287,68} \},\end{aligned}$$
(3)
$$\begin{aligned}&(x^{(i+1)}_{66}, x^{(i+1)}_{91},x^{(i+1)}_{92},x^{(i+1)}_{93},x^{(i+1)}_{171}) = H(x^{(i)}_{65}, x^{(i)}_{90}, x^{(i)}_{91}, x^{(i)}_{92}, x^{(i)}_{170} ) \end{aligned}$$
(4)
$$\begin{aligned}&(x^{(i+1)}_{162}, x^{(i+1)}_{175},x^{(i+1)}_{176},x^{(i+1)}_{177},x^{(i+1)}_{264}) = H(x^{(i)}_{161}, x^{(i)}_{174}, x^{(i)}_{175}, x^{(i)}_{176}, x^{(i)}_{263}) \end{aligned}$$
(5)
$$\begin{aligned}&(x^{(i+1)}_{243}, x^{(i+1)}_{286},x^{(i+1)}_{287},x^{(i+1)}_{0},x^{(i+1)}_{69} ) = H(x^{(i)}_{242}, x^{(i)}_{285}, x^{(i)}_{286}, x^{(i)}_{287}, x^{(i)}_{68} ) \end{aligned}$$
(6)

where \(H: \mathbb {F}_2^5 \rightarrow \mathbb {F}_2^5\) is defined as follows:

$$ (\beta _0, \beta _1, \beta _2, \beta _3, \beta _4 ) = H(\alpha _0, \alpha _1, \alpha _2, \alpha _3, \alpha _4 ) = (\alpha _{0}, \alpha _{1}, \alpha _{2}, \alpha _{0} \oplus \alpha _{1} \alpha _{2} \oplus \alpha _{3} \oplus \alpha _{4}, \alpha _{4}). $$

H can be decomposed into a sequence of smaller functions such as \(\mathsf {COPY}\), \(\mathsf {AND}\) and \(\mathsf {XOR}\), which is shown in Fig. 2.

Fig. 2. The decomposition of H function by \(\mathsf {COPY}\), \(\mathsf {AND}\) and \(\mathsf {XOR}\).
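One way to decompose H into \(\mathsf {COPY}\), \(\mathsf {AND}\) and \(\mathsf {XOR}\) is sketched below. The exact wiring of Fig. 2 is not reproduced in the text, so the order of operations here is an assumption; the sketch only checks that this particular wiring agrees with the closed formula for H on all 32 inputs.

```python
from itertools import product


def COPY(a):
    return a, a


def AND(a, b):
    return a & b


def XOR(*bits):
    out = 0
    for b in bits:
        out ^= b
    return out


def H_direct(a0, a1, a2, a3, a4):
    return (a0, a1, a2, a0 ^ (a1 & a2) ^ a3 ^ a4, a4)


def H_decomposed(a0, a1, a2, a3, a4):
    # Assumed wiring: copy every input that is both passed through and consumed.
    b0, c0 = COPY(a0)
    b1, c1 = COPY(a1)
    b2, c2 = COPY(a2)
    b4, c4 = COPY(a4)
    b3 = XOR(c0, AND(c1, c2), a3, c4)
    return (b0, b1, b2, b3, b4)


agree = all(H_direct(*a) == H_decomposed(*a) for a in product((0, 1), repeat=5))
```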

MILP Model for the Monomial Trail of \(\textit{\textbf{f}}^{(i)}\). The operations in Eq. (3) are simple bit-permutations which can be handled by directly changing the positions of the variables, thus no inequalities are required for this condition. To model the H function, we generate inequalities to model the monomial trails of \({\texttt {COPY}}\), \({\texttt {AND}}\) and \({\texttt {XOR}}\). For \({\texttt {COPY}}\), consider \(x \xrightarrow {{\texttt {COPY}}} (x, x)\) where x is a bit variable. We have

$$ {\left\{ \begin{array}{ll} x^0 (= 1) \rightarrow x^0 \cdot x^0 (= 1), \quad x^0 (= 1) \nrightarrow x^0 \cdot x^1 (=x) \\ x^0 (= 1) \nrightarrow x^1 \cdot x^0 (=x), \quad x^0 (= 1) \nrightarrow x^1 \cdot x^1 (=x) \\ x^1 (= x) \nrightarrow x^0 \cdot x^0 (=1), \quad x^1 (= x) \rightarrow x^0 \cdot x^1 (=x) \\ x^1 (= x) \rightarrow x^1 \cdot x^0 (=x), \quad x^1 (= x) \rightarrow x^1 \cdot x^1 (=x) \\ \end{array}\right. }. $$

Then there are four valid monomial trails of \({\texttt {COPY}}\), i.e., (0, 0, 0), (1, 0, 1), (1, 1, 0) and (1, 1, 1). Similarly, \({\texttt {AND}}\) has two monomial trails (0, 0, 0) and (1, 1, 1), while \({\texttt {XOR}}\) has three monomial trails (0, 0, 0), (1, 0, 1) and (0, 1, 1).
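These trail lists can be reproduced by brute force from the definition: \({\textit{\textbf{x}}}^{\textit{\textbf{u}}} \rightarrow {\textit{\textbf{y}}}^{\textit{\textbf{v}}}\) holds exactly when \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\) appears in the ANF of \({\textit{\textbf{y}}}^{\textit{\textbf{v}}}\). The following sketch is meant only for tiny components; the helper names `anf_support` and `monomial_trails` are assumptions, and each trail is written as the input exponents followed by the output exponents.

```python
from itertools import product


def anf_support(func, n):
    """Exponent vectors u of the monomials in the ANF of func: F_2^n -> F_2."""
    points = list(product((0, 1), repeat=n))
    coeffs = {x: func(x) & 1 for x in points}
    for i in range(n):  # Moebius transform, one coordinate at a time
        for x in points:
            if x[i]:
                coeffs[x] ^= coeffs[x[:i] + (0,) + x[i + 1:]]
    return [u for u in points if coeffs[u]]


def monomial_trails(F, n, m):
    """All tuples u + v with x^u -> y^v, where y = F(x) and F: F_2^n -> F_2^m."""
    trails = []
    for v in product((0, 1), repeat=m):
        def y_pow_v(x, v=v):
            out = 1
            for yj, vj in zip(F(x), v):
                if vj:
                    out &= yj
            return out
        trails.extend(u + v for u in anf_support(y_pow_v, n))
    return sorted(trails)


def COPY(x):
    return (x[0], x[0])


def AND(x):
    return (x[0] & x[1],)


def XOR(x):
    return (x[0] ^ x[1],)


copy_trails = monomial_trails(COPY, 1, 2)
and_trails = monomial_trails(AND, 2, 1)
xor_trails = monomial_trails(XOR, 2, 1)
```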

To generate inequalities for monomial trails of each function, we follow Sun et al.’s approach in  [23] to derive linear inequalities by SageFootnote 4 and then use the greedy algorithm to simplify them. In the end, a set of 15 inequalities \(\mathcal {L}\) with 5 auxiliary variables (given in Appendix A of  [11]) is sufficient to describe the H function. Thus we need 45 linear inequalities and 15 auxiliary variables to model \(\textit{\textbf{f}}^{(i)}\). In Appendix B (Ref.  [11]), we provide an alternative method to describe the monomial trails of H with fewer inequalities, where H is treated as a whole. Note that Proposition 1 implies that decomposing the target Boolean function at different levels of granularity does not affect the parity of the number of its monomial trails.

MILP Model for the Monomial Trail of g. Since g is a simple Boolean function that contains 6 monomials (Eq. (2)), a set of simple constraints as

$$\begin{aligned} \left\{ \begin{aligned}&u^{(r)}_{65} + u^{(r)}_{92} + u^{(r)}_{161} + u^{(r)}_{176} + u^{(r)}_{242} + u^{(r)}_{287} = 1, \\&u^{(r)}_j = 0, \mathrm {~if~} j \notin \{ 65, 92, 161, 176, 242, 287 \}. \end{aligned} \right. \end{aligned}$$
(7)

will complete our modeling.

figure b

In Algorithm 1, we demonstrate how to generate the MILP model for Trivium, where \(\mathcal {L}\) represents the inequalities for the model of H. Note that in some cases we may want to manipulate the first term (e.g., line 16 of Algorithm 2) and the last term (e.g., line 11 of Algorithm 3) of the monomial trail. For this reason, the MILP model in Algorithm 1 excludes the model of g; instead, the variables representing the first monomial \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \) and the last monomial \( {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \) are also returned for later use.

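Degree of Trivium. The output bit \(z_r = f ({\textit{\textbf{x}}}^{(0)})\) after r-round Trivium is a Boolean function of the initial state \({\textit{\textbf{x}}}^{(0)}\). If we regard the IV bits as public variables and the key bits as secret variables, the initial setup of the state implies the following derived function with four variable masks \(\varvec{\varGamma }^{0}, \varvec{\varGamma }^{1}, \varvec{\varGamma }^{p}, \varvec{\varGamma }^{s}\), which follow from the state loading given in the specification:

$$\begin{aligned} \varGamma ^{s}_i = 1&\iff 0 \le i \le 79, \\ \varGamma ^{p}_i = 1&\iff 93 \le i \le 172, \\ \varGamma ^{1}_i = 1&\iff 285 \le i \le 287, \\ \varGamma ^{0}_i = 1&\iff 80 \le i \le 92 \mathrm {~or~} 173 \le i \le 284. \end{aligned}$$
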
In accordance, the derived function and its variable masks can be used to modify the algebraic degree expression given in Eq. (1), therefore the algebraic degree of \(z_r\) can be computed as

$$\begin{aligned} \deg (z_r) = \max \limits _{ \begin{array}{c} {\textit{\textbf{u}}}^{(0)} \preceq \varvec{\varGamma }^{p} \oplus \varvec{\varGamma }^{s} \\ {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow z_r \end{array} } \Big \{ \sum _{\varvec{\varGamma }^{p}_i = 1} u^{(0)}_i \Big \} = \max \limits _{ \begin{array}{c} {\textit{\textbf{u}}}^{(0)} \preceq \varvec{\varGamma }^{p} \oplus \varvec{\varGamma }^{s} \\ {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow z_r \end{array} } \Big \{ \sum _{93 \le i \le 172} u^{(0)}_i \Big \}. \end{aligned}$$

By calling Algorithm 1, Algorithm 2 finds the monomial with the potential maximum degree satisfying \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightsquigarrow z_r\). Thereafter, \(| {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie z_r|\) is computed under the PoolSearchMode to determine if \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow z_r\) holds. Once \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightarrow z_r\) is confirmed, we derive the exact algebraic degree of r-round Trivium.

figure c
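For very small r the degree in the IV variables can also be obtained by direct symbolic computation, which gives a reference against which a MILP-based implementation can be tested. The sketch below is not Algorithm 2: it expands the ANF explicitly, representing a polynomial as a set of monomials and a monomial as a frozenset of variables. The names and the tagging of variables as `("k", i)` and `("v", i)` are assumptions, and the approach quickly becomes infeasible as r grows.

```python
from itertools import product

ONE = frozenset()  # the empty monomial, i.e. the constant 1


def p_xor(*polys):
    out = set()
    for p in polys:
        out ^= p  # equal monomials cancel in pairs
    return out


def p_and(p, q):
    out = set()
    for a, b in product(p, q):
        out ^= {a | b}  # x*x = x over F_2, so a product is a union of variables
    return out


def symbolic_init():
    x = [set() for _ in range(288)]
    for i in range(80):
        x[i] = {frozenset({("k", i)})}        # secret key bit K[i]
        x[93 + i] = {frozenset({("v", i)})}   # public IV bit IV[i]
    for j in (285, 286, 287):
        x[j] = {ONE}
    return x


def symbolic_update(x):
    t1 = p_xor(x[65], p_and(x[90], x[91]), x[92], x[170])
    t2 = p_xor(x[161], p_and(x[174], x[175]), x[176], x[263])
    t3 = p_xor(x[242], p_and(x[285], x[286]), x[287], x[68])
    return [t3] + x[0:92] + [t1] + x[93:176] + [t2] + x[177:287]


def iv_degree(r):
    """Degree of z_r in the IV variables (key bits kept symbolic); -1 if z_r = 0."""
    x = symbolic_init()
    for _ in range(r):
        x = symbolic_update(x)
    z = p_xor(x[65], x[92], x[161], x[176], x[242], x[287])
    return max((sum(1 for tag, _ in mono if tag == "v") for mono in z), default=-1)
```

Counting only the `"v"` variables of each monomial mirrors the objective \(\sum _{\varGamma ^{p}_i = 1} u^{(0)}_i\) in the degree formula above.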

Our Results. With the help of the monomial prediction we are able to evaluate the exact algebraic degree of Trivium up to 834 rounds and the results are listed in Table 5 in Appendix E (Ref.  [11]). Interestingly, for the first time, we notice a counter-intuitive phenomenon that the algebraic degree of Trivium is not monotonically increasing with rounds. For example, the degrees of 806-, 807- and 808-round Trivium are 69, 71, 70, respectively. It implies that some monomials with the maximum degree are canceled in the subsequent round. Such degree drops are highlighted in Table 5.

A comparison of monomial prediction and the numeric mapping technique for upper bounding the degree of NFSR ciphers  [18] is illustrated in Fig. 3. As the number of iterated rounds gets larger, the gap between the upper bound and the exact degree becomes more significant. For the degree of the 793-round Trivium, the numeric mapping technique gives an upper bound of 79, while the monomial prediction method tells us that the exact degree is only 67.

Fig. 3. The exact degree derived by monomial prediction and the upper bound derived by numeric mapping  [18].

We also perform the degree evaluations with the two-subset bit-based division property  [27] to estimate the upper bound of the degree of r-round Trivium. The results show that the division property is quite precise. From 1- to 834-round Trivium, there are only 14 cases where the division property fails to hit the exact degrees, which are listed in Table 2.

Table 2. The gaps among the exact degree, the upper bound obtained by the two-subset bit-based division property and the numeric mapping for several special cases of Trivium up to 834-round. For the other cases, the result obtained by the two-subset bit-based division property equals the exact degree.

5 Application II: Cube Attacks

The cube attack was proposed by Dinur and Shamir  [7] at EUROCRYPT 2009. Let \(f(\textit{\textbf{x}})\) be a Boolean function from \(\mathbb {F}_2^n\) to \(\mathbb {F}_2\), and \(\textit{\textbf{u}} \in \mathbb {F}_2^n\) be a constant vector. Then \(f(\textit{\textbf{x}})\) can be represented uniquely as

$$ f(\textit{\textbf{x}}) = {\textit{\textbf{x}}}^{{\textit{\textbf{u}}}} p(\textit{\textbf{x}}) + q(\textit{\textbf{x}}), $$

where each term of \(q(\textit{\textbf{x}})\) is not divisible by \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\). Note that in our notations, the set \(I_{\textit{\textbf{u}}}= \{ 0 \le i \le n-1: u_i = 1 \} \subseteq \{0, \cdots , n-1\}\) and the monomial \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\) correspond to the cube indices and cube term that are commonly used in the literature of cube attacksFootnote 5. If we compute the sum of f over the cube \(\mathbb {C}_{\textit{\textbf{u}}} = \{\textit{\textbf{x}} \in \mathbb {F}_2^n : \textit{\textbf{x}} \preceq \textit{\textbf{u}} \}\), we have

$$ \textstyle \bigoplus _{\textit{\textbf{x}} \in \mathbb {C}_{\textit{\textbf{u}}}} f(\textit{\textbf{x}}) = \bigoplus _{\textit{\textbf{x}} \in \mathbb {C}_{\textit{\textbf{u}}}} ({\textit{\textbf{x}}}^{\textit{\textbf{u}}}p(\textit{\textbf{x}}) + q(\textit{\textbf{x}})) = p(\textit{\textbf{x}}), $$

where \(p(\textit{\textbf{x}})\) is called the superpoly of the cube \(\mathbb {C}_{\textit{\textbf{u}}}\), and \(p(\textit{\textbf{x}})\) only involves variables \(x_j\) with \(j \in I_{\bar{\textit{\textbf{u}}}} = \{ 0 \le i \le n-1: u_i = 0 \}\).
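The identity can be checked numerically on a toy function. In the sketch below, `cube_sum` and `toy` are hypothetical names, and the non-cube inputs are simply held at whatever values are passed in `fixed`; for the chosen `toy`, the superpoly of the cube over positions 0 and 1 is \(k_0 \oplus k_1\).

```python
from itertools import product


def cube_sum(f, cube, fixed):
    """XOR f over all assignments of the cube positions; other inputs stay as in `fixed`."""
    total = 0
    for bits in product((0, 1), repeat=len(cube)):
        x = list(fixed)
        for pos, b in zip(cube, bits):
            x[pos] = b
        total ^= f(x)
    return total


def toy(x):  # hypothetical: x = (v0, v1, k0, k1)
    v0, v1, k0, k1 = x
    # v0*v1*(k0 + k1) + v0*k0*k1 + v1 + k0
    return (v0 & v1 & (k0 ^ k1)) ^ (v0 & k0 & k1) ^ v1 ^ k0


# Value of the superpoly for every key (k0, k1).
superpoly_table = {
    (k0, k1): cube_sum(toy, [0, 1], [0, 0, k0, k1])
    for k0, k1 in product((0, 1), repeat=2)
}
```

All terms of `toy` that are not divisible by \(v_0 v_1\) are summed an even number of times and vanish, which is the cancellation behind the formula above.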

The superpoly recovery plays a critical role in the cube attack. The attacker recovers the superpoly in the offline phase; in the online phase, the attacker queries the encryption oracle with the cube and obtains the value of the superpoly. If the superpoly is a balanced Boolean function, one bit of information about the secret key can be obtained. The remaining key bits can be recovered by exhaustive search.

At the early stage in the applications of cube attacks, the superpoly recovery is achieved experimentally by summing the outputs over certain “good” cubes, and therefore the sizes of cubes are largely confined in a practical range. Moreover, superpolies derived from small cubes have to be extremely simple (typically linear or quadratic functions  [7, 19]) in order to be recovered in a probabilistic way.

In [26], the division property was first introduced to enhance cube attacks, which allows us to identify the key bits that do not appear in the superpoly. This approach is deterministic and can be used to analyze cubes whose sizes are beyond practical reach. By setting the key bits that are not involved in the superpoly to arbitrary constants and varying the remaining l key bits, one can obtain the truth table of the superpoly for a subsequent key-recovery attack with complexity \(2^{|I|+l}\). At CRYPTO 2018, Wang et al. proposed the flag technique and term enumeration technique to recover directly all the monomials of the superpoly based on the two-subset bit-based division property, which further lowers the complexity of the superpoly recovery and thus attacks of more rounds on several targets are mounted  [29].

However, in  [26, 29], it was assumed that every identified secret key variable or monomial must be involved in the superpoly. If such an assumption does not hold, the superpoly can be much simpler than estimated, or even fall into the extreme case: \(p(x) \equiv 0\). In fact it has been reported in  [8, 9, 30, 32] that some previous key-recovery attacks are actually distinguishers. To get rid of this assumption, Wang et al. for the first time proposed a systematic method based on the three-subset bit-based division property to recover the exact superpoly  [30]. In  [9], the method was refined as the three-subset bit-based division property without unknown subsets and was modeled under the \(\mathtt {PoolSearchMode}\) of Gurobi. As a result, they recovered the exact superpolies for 840-, 841- and 842-round Trivium.

5.1 Apply Monomial Prediction to Superpoly Recovery

It is natural to apply the monomial prediction to the recovery of the superpoly. For \(f: \mathbb {F}_2^n \rightarrow \mathbb {F}_2\), we define a constant vector \({\textit{\textbf{u}}}\in \mathbb {F}_2^n\) and let the corresponding cube term be \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\). To recover the superpoly which is a polynomial of \(x_i\)’s with \(\bar{u}_i = 1\), we find all the possible monomials like \({\textit{\textbf{x}}}^{{\textit{\textbf{u}}}\oplus \textit{\textbf{w}}} = {\textit{\textbf{x}}}^{\textit{\textbf{u}}}\cdot {\textit{\textbf{x}}}^{\textit{\textbf{w}}}\) where \(\textit{\textbf{w}} \preceq \bar{{\textit{\textbf{u}}}}\) satisfying \({\textit{\textbf{x}}}^{{\textit{\textbf{u}}}\oplus \textit{\textbf{w}}} \rightarrow f\). Then the superpoly of \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\) is

$$ p({\textit{\textbf{x}}}) = \bigoplus _{\begin{array}{c} \textit{\textbf{w}} \preceq \bar{{\textit{\textbf{u}}}} \\ {\textit{\textbf{x}}}^{{\textit{\textbf{u}}}\oplus \textit{\textbf{w}} } \rightarrow f \end{array} } {\textit{\textbf{x}}}^{\textit{\textbf{w}}} = \Big ( \bigoplus _{\begin{array}{c} \textit{\textbf{w}} \preceq \bar{{\textit{\textbf{u}}}} \\ {\textit{\textbf{x}}}^{{\textit{\textbf{u}}}\oplus \textit{\textbf{w}} } \rightarrow f \end{array} } {\textit{\textbf{x}}}^{{\textit{\textbf{u}}}\oplus \textit{\textbf{w}}} \Big ) / {{\textit{\textbf{x}}}^{\textit{\textbf{u}}}}. $$

To find all \({\textit{\textbf{x}}}^{{\textit{\textbf{u}}}\oplus \textit{\textbf{w}}} \rightarrow f\) for \(\textit{\textbf{w}} \preceq \bar{{\textit{\textbf{u}}}}\), we can use the \(\mathtt {PoolSearchMode}\) of the Gurobi solver to find all solutions satisfying \({\textit{\textbf{x}}}^{{\textit{\textbf{u}}}\oplus \textit{\textbf{w}}} \rightsquigarrow f\). Next, we store all the \({\textit{\textbf{x}}}^{{\textit{\textbf{u}}}\oplus \textit{\textbf{w}}}\) in a hash table indexed by \(({\textit{\textbf{u}}}, \textit{\textbf{w}})\), so that the size of each possible \({\textit{\textbf{x}}}^{{\textit{\textbf{u}}}\oplus \textit{\textbf{w}}} \bowtie f\) for \(\textit{\textbf{w}} \preceq \bar{{\textit{\textbf{u}}}}\) can be counted directly.
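The counting step can be sketched independently of the solver. Assume the enumeration has already produced one entry per monomial trail, each entry being the starting monomial of that trail; the function name and the representation of a monomial as a frozenset of variable names are assumptions. A monomial belongs to the superpoly exactly when it heads an odd number of trails.

```python
from collections import Counter


def superpoly_from_trails(trail_starts, cube):
    """Keep monomials x^(u+w) that head an odd number of trails, then divide by x^u.

    trail_starts: iterable of frozensets, one per enumerated trail.
    cube: the variables of the cube term x^u.
    """
    cube = frozenset(cube)
    counts = Counter(trail_starts)  # hash table: monomial -> number of trails
    return {m - cube for m, c in counts.items() if c % 2 == 1 and cube <= m}


# Hypothetical enumeration result for the cube term v0*v1.
cube = {"v0", "v1"}
starts = (
    [frozenset({"v0", "v1", "k0"})] * 2          # even count: cancels
    + [frozenset({"v0", "v1", "k1"})]            # odd count: survives
    + [frozenset({"v0", "v1", "k0", "k1"})] * 3  # odd count: survives
)
superpoly = superpoly_from_trails(starts, cube)
```

In this toy run the recovered superpoly is \(k_1 \oplus k_0 k_1\).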

Speedup and Memory Reduction: A Divide-and-Conquer Strategy. In this paper, we only study the composite function f, where

$$ f = \textit{\textbf{f}}^{(r-1)}\circ \textit{\textbf{f}}^{(r-2)} \circ \cdots \circ \textit{\textbf{f}}^{(0)}. $$

According to Lemma 2, if \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightsquigarrow f \), then for \(0< i < r\),

$$\begin{aligned} | {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie f | \equiv \sum _{ {\pi _{{\textit{\textbf{u}}}^{(r-i)}} ( {\textit{\textbf{x}}}^{(r-i)} ) } \rightarrow f} | {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie {\pi _{{\textit{\textbf{u}}}^{(r-i)}} ( {\textit{\textbf{x}}}^{(r-i)} ) } | \pmod 2. \end{aligned}$$
(8)

Generally speaking, computing \(| {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie {\pi _{{\textit{\textbf{u}}}^{(r-i)}} ( {\textit{\textbf{x}}}^{(r-i)} ) } | \) one by one is much easier than computing \(| {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \bowtie f |\) when i is significantly smaller than r. In this paper, we always expand f first and then obtain the speedups and memory reductions by the divide-and-conquer strategy.
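The parity relation behind this strategy can be verified by brute force on a toy two-layer composition. In the sketch below, `f0` and `f1` are hypothetical 3-bit functions chosen only for illustration: the last layer is first expanded into monomials of the intermediate state, the trails from the input to each such monomial are counted, and the monomials with an odd total are compared against the ANF of the composite function.

```python
from collections import Counter
from itertools import product


def anf_support(func, n):
    """Exponent vectors u of the monomials in the ANF of func: F_2^n -> F_2."""
    points = list(product((0, 1), repeat=n))
    coeffs = {x: func(x) & 1 for x in points}
    for i in range(n):
        for x in points:
            if x[i]:
                coeffs[x] ^= coeffs[x[:i] + (0,) + x[i + 1:]]
    return [u for u in points if coeffs[u]]


def power(y, v):
    """The monomial y^v evaluated at the point y."""
    out = 1
    for yj, vj in zip(y, v):
        if vj:
            out &= yj
    return out


def f0(x):  # hypothetical first layer, F_2^3 -> F_2^3
    return (x[0] ^ (x[1] & x[2]), x[1] ^ x[2], x[0] & x[2])


def f1(y):  # hypothetical last layer, F_2^3 -> F_2
    return (y[0] & y[1]) ^ y[2]


# Step 1: expand the last layer into monomials of the intermediate state.
intermediate = anf_support(f1, 3)
# Step 2: for each intermediate monomial, count the input monomials leading to it.
counts = Counter()
for v in intermediate:
    for u in anf_support(lambda x, v=v: power(f0(x), v), 3):
        counts[u] += 1
via_trails = sorted(u for u, c in counts.items() if c % 2 == 1)
# Reference: ANF of the composite function computed directly.
direct = sorted(anf_support(lambda x: f1(f0(x)), 3))
```

The two lists coincide because each intermediate monomial is itself a polynomial in the input, and contributions that occur an even number of times cancel.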

5.2 Application to Trivium

Let \(z_r = f({\textit{\textbf{x}}}^{(0)})\) be the output of the r-round Trivium with \({\textit{\textbf{x}}}^{(0)} \in \mathbb {F}_2^{288}\). When the cube attack is applied to Trivium, only the cube variables indexed by the cube indices I and the secret key bits are regarded as symbolic variables in our analysis, and all other input variables are fixed to constants. Therefore, we are actually analyzing the derived function of f with the variable masks \(\varvec{\varGamma }^0\), \(\varvec{\varGamma }^1\), \(\varvec{\varGamma }^p\), and \(\varvec{\varGamma }^s\) given as follows:

(9)

To recover the superpoly corresponding to the cube indices \(I = \{ 0 \le i \le 287 : \varGamma ^p_i = 1 \}\), we need to find all \(\pi _{ {\tiny \varvec{\varGamma }^{p}} \oplus \textit{\textbf{w}} }( {\textit{\textbf{x}}}^{(0)} ) \rightarrow f \) for all \(\textit{\textbf{w}} \preceq \varvec{\varGamma }^{s}\).

In practice, we take the divide-and-conquer strategy based on Eq. (8) to keep the consumption of computational resources under control. Let the internal state of the i-th round Trivium be \(\textit{\textbf{x}}^{(i)}\). We first express \(z_r\) as a polynomial of \(\textit{\textbf{x}}^{(r-r_0)}\) for some \(r_0\). According to Proposition 3, when \(r_0\) is not very large, the expression of \(z_r\) in \(\textit{\textbf{x}}^{(r -r_0)}\) can be obtained by the monomial prediction techniqueFootnote 6.

Proposition 3

Let \(z_r = f({\textit{\textbf{x}}}^{(0)})\), and

$$ \mathbb {U}_{r-r_0} = \{ {\textit{\textbf{u}}}^{(r-r_0)}: ~| {\pi _{{\textit{\textbf{u}}}^{(r-r_0)}} ( {\textit{\textbf{x}}}^{(r-r_0)} ) } \bowtie f | \bmod 2 = 1~\},~~\mathrm {then} $$
$$ f = \bigoplus \limits _{\begin{array}{c} {\textit{\textbf{u}}}^{(r-r_0)} \in \mathbb {U}_{r-r_0} \end{array} } {\pi _{{\textit{\textbf{u}}}^{(r-r_0)}} ( {\textit{\textbf{x}}}^{(r-r_0)} ) } . $$

Based on Proposition 3, an algorithm to express r-round Trivium in \({\textit{\textbf{x}}}^{(r-r_0)}\) is presented in Algorithm 4 in Appendix D (Ref.  [11]).

Remark. We can also get the expression by symbolic computation. We choose the monomial prediction technique because most variables and constraints needed to complete this step are already present in our model, which significantly reduces the burden of extra coding efforts.

figure d

Algorithm 3 shows how we recover the superpoly of a certain cube based on the divide-and-conquer strategy. The divide-and-conquer strategy leads to remarkable speedups and memory reductions in practice, which makes it possible to test more cubes with limited resources. As a result, we identify some cubes with smaller dimensions for Trivium, and thus improve upon several currently known best attacks on Trivium. We list our experimental results with different smaller-dimension cubes in Table 3 (Ref.  [11]). To verify our program, we re-conduct the experiments in  [9] using the same cube indices for 840- and 841-round Trivium and obtain the same superpolies.

Cube Attack on 840-Round Trivium. We find the superpolies \(p_{I_1}, p_{I_2}\) and \(p_{I_3}\) for three different cube indices \(I_1, I_2\) and \(I_3\)Footnote 7, whose dimensions are 75, 76, and 76, respectively.

Taking the cube of dimension 75 as \(I_1 = \{ 0, 1, \ldots ,69, 71, 73, 75, 77, 79 \}\) with

$$ IV[70] = IV[72] = IV[74] = IV[76] = IV[78] = 0, $$

we recover a balanced superpoly for 840-round Trivium that has 41 terms and algebraic degree 4. The independent monomial of the superpoly is labeled in red.

figure e

Taking the cube of dimension 76 as \(I_2 = \{0,1,\ldots ,71, 73, 75, 77, 79\}\) with

$$ IV[72] = IV[74] = IV[76] = IV[78] = 0, $$

we recover a balanced superpoly for 840-round Trivium that has 4 terms and algebraic degree of 2, and give it as follows

figure f

Taking the cube of dimension 76 as \(I_3 = \{0,1,\ldots ,69, 71, 72, 73, 75, 77, 79\}\) with

$$ IV[70] = IV[74] = IV[76] = IV[78] = 0, $$

we recover a balanced superpoly for 840-round Trivium that has 6 terms and algebraic degree of 3 as below,

figure g

Let \(\mathbb {C}_I = \{ {\textit{\textbf{x}}}\in \mathbb {F}_2^{288} : {\textit{\textbf{x}}}\preceq \varvec{\varGamma }^{p} \}\), where \(\varvec{\varGamma }^{p}\) is set as in Eq. (9). Since \(I_2 = I_1 \cup \{ 70 \}\),

$$ p_{I_2} = \bigoplus _{{\textit{\textbf{x}}}\in \mathbb {C}_{I_2}} f({\textit{\textbf{x}}}) = \bigoplus _{{\textit{\textbf{x}}}\in \mathbb {C}_{I_1}, IV[70]=1} f({\textit{\textbf{x}}}) \oplus \bigoplus _{{\textit{\textbf{x}}}\in \mathbb {C}_{I_1}, IV[70]=0} f({\textit{\textbf{x}}}), $$

and

$$ p_{I_1} = \bigoplus _{{\textit{\textbf{x}}}\in \mathbb {C}_{I_1}} f({\textit{\textbf{x}}}) = \bigoplus _{{\textit{\textbf{x}}}\in \mathbb {C}_{I_1}, IV[70]=0} f({\textit{\textbf{x}}}), $$

then we can deduce that \(p_{I_4} = p_{I_1} \oplus p_{I_2}\) is the superpoly for the cube indices \( I_4 = \{0,1,\ldots , 69, 71, 73, 75, 77, 79\} \) with

$$IV[72] = IV[74] = IV[76] = IV[78] = 0, IV[70] = 1.$$

Similarly, we can deduce that \(p_{I_5} = p_{I_1} \oplus p_{I_3}\) is the superpoly for the cube indices \( I_5 = \{0,1,\ldots ,69, 71, 73, 75, 77, 79\} \) with

$$ IV[70] = IV[74] = IV[76] = IV[78] = 0, IV[72] = 1. $$
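The deduction used for \(p_{I_4}\) and \(p_{I_5}\) holds for any Boolean function and can be checked on a toy example. In this sketch `toy` is a hypothetical function of three public bits and two key bits; the small cube is over positions 0 and 1, and the large cube additionally contains position 2.

```python
from itertools import product


def cube_sum(f, cube, fixed):
    """XOR f over all assignments of the cube positions; other inputs stay as in `fixed`."""
    total = 0
    for bits in product((0, 1), repeat=len(cube)):
        x = list(fixed)
        for pos, b in zip(cube, bits):
            x[pos] = b
        total ^= f(x)
    return total


def toy(x):  # hypothetical: x = (v0, v1, v2, k0, k1)
    v0, v1, v2, k0, k1 = x
    return (v0 & v1 & v2 & k0) ^ (v0 & v1 & k1) ^ (v0 & v2 & k0 & k1) ^ (v1 & k0) ^ v2


def relation_holds(k0, k1):
    p_small_0 = cube_sum(toy, [0, 1], [0, 0, 0, k0, k1])  # extra bit fixed to 0
    p_small_1 = cube_sum(toy, [0, 1], [0, 0, 1, k0, k1])  # extra bit fixed to 1
    p_large = cube_sum(toy, [0, 1, 2], [0, 0, 0, k0, k1])  # extra bit in the cube
    return p_small_1 == (p_small_0 ^ p_large)


all_ok = all(relation_holds(k0, k1) for k0, k1 in product((0, 1), repeat=2))
```

The superpoly with the extra IV bit fixed to 1 therefore costs nothing beyond the two cube sums that are already computed.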

\(p_{I_1}\), \(p_{I_4}\) and \(p_{I_5}\) are balanced Boolean functions because each of them contains a monomial that is independent of its other monomials. Therefore, we can recover 3 bits of key information with time complexity \(3 \times 2^{75} \approx 2^{76.6}\). The dominant part of the whole key recovery attack is the exhaustive search after the recovery of the 3-bit key information, which has time complexity \(2^{77}\). So in total, the time complexity of the attack on 840-round Trivium is \(2^{76.6} + 2^{77} \approx 2^{77.8}\).
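The complexity figures are sums of powers of two and can be rechecked with a few lines of arithmetic. The helper name `log2_sum` is an assumption; the second total uses the figures \(2^{78}\) and \(2^{77}\) stated for the 841- and 842-round attacks.

```python
from math import log2


def log2_sum(*exponents):
    """log2 of a sum of powers of two, given the individual exponents."""
    return log2(sum(2.0 ** e for e in exponents))


cube_cost_840 = log2(3) + 75              # three cube sums of dimension 75
total_840 = log2_sum(cube_cost_840, 77)   # plus exhaustive search over 77 key bits
total_841_842 = log2_sum(78, 77)          # 2^78 exhaustive search plus 2^77 cube sums
```

Rounded to one decimal place these give 76.6, 77.8 and 78.6, matching the exponents quoted in the text.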

Cube Attack on 841-Round Trivium. We find the superpolies \(p_{I_6}\) and \(p_{I_7}\) for the set of cube indices \(I_6\) and \(I_7\), whose dimensions are 76 and 77, respectively. Taking the cube of dimension 76 as \( I_6 = \{0,1,\ldots ,69, 71, 73, 74, 75, 77, 79\} \) with

$$ IV[70] = IV[72] = IV[76] = IV[78] = 0, $$

we recover a balanced superpoly \(p_{I_6}\) for 841-round Trivium that has 3632 terms and algebraic degree of 9. Since the number of terms in \(p_{I_6}\) (and other superpolies, e.g., \(p_{I_7}, p_{I_9} \text { and } p_{I_{10}}\)) is too large to list here, we provide them at https://github.com/hukaisdu/MonomialPrediction/blob/master/superpoly.pdf.

Taking the cube of dimension 77 as \( I_7 = \{0,1,\ldots ,71, 73, 74, 75, 77, 79\} \) with

$$ IV[72] = IV[76] = IV[78] = 0, $$

we recover a balanced superpoly \(p_{I_7}\) for 841-round Trivium that has 1400 terms and algebraic degree of 8.

Similarly to \(p_{I_4}\), \(p_{I_8} = p_{I_6} \oplus p_{I_7}\) is the superpoly for the cube indices \( I_8 = \{0,1,\ldots ,69, 71, 73, 74, 75, 77, 79\} \) with

$$ IV[72] = IV[76] = IV[78] = 0, IV[70] = 1. $$

Hence, we can recover 2 bits of the key information with time complexity \(2^{77} = 2 \times 2^{76}\). The dominant part of the whole key recovery attack is the exhaustive search after 2-bit key recovery, which has time complexity \(2^{78}\). Therefore, the total time complexity of the attack on the 841-round Trivium is \(2^{78} + 2^{77} \approx 2^{78.6}\).

Cube Attack on 842-Round Trivium. We find the superpolies \(p_{I_9}\) and \(p_{I_{10}}\) for the set of cube indices \(I_9\) and \(I_{10}\), whose dimensions are 76 and 77, respectively.

Taking the cube of dimension 76 as \( I_9 = \{ 0,1,\ldots ,71, 73, 75, 77, 79\} \) with

$$ IV[72] = IV[74] = IV[76] = IV[78] = 0, $$

we recover a balanced superpoly \(p_{I_9}\) for 842-round Trivium that has 5147 terms and algebraic degree 8.

Taking the cube of dimension 77 as \( I_{10} = \{0,1,\ldots ,73, 75, 77, 79\} \) with

$$ IV[74] = IV[76] = IV[78] = 0, $$

we recover a balanced superpoly \(p_{I_{10}}\) for 842-round Trivium that has 4174 terms and algebraic degree 8.

Similarly to \(p_{I_{4}}\), \(p_{I_{11}} = p_{I_9} \oplus p_{I_{10}}\) is the superpoly of the cube indices \( I_{11} = \{0,1,\ldots ,71, 73, 75, 77, 79\} \) with \( IV[74] = IV[76] = IV[78] = 0, IV[72] = 1. \) Therefore, we can recover 2 bits of key information with time complexity \(2^{77} = 2 \times 2^{76}\). The dominant part of the whole key-recovery attack is the exhaustive search after the 2-bit key recovery, which costs \(2^{78}\). In total, the time complexity is \(2^{78} + 2^{77} \approx 2^{78.6}\).
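The complexity figures quoted for the three attacks are sums of powers of two, so they can be re-derived with a few lines of arithmetic. The following Python sketch is only a sanity check of those sums; the helper name `log2_total` is introduced here for illustration and is not part of the attack itself.

```python
import math

def log2_total(*terms):
    """Return log2 of a sum of terms, each given as (multiplier, exponent),
    i.e. multiplier * 2**exponent. Exact integers are used before the log."""
    return math.log2(sum(m * 2 ** e for m, e in terms))

# 840 rounds: three cube summations of 2^75 each, then a 2^77 exhaustive search.
t840 = log2_total((3, 75), (1, 77))
# 841 and 842 rounds: one 2^77 = 2 * 2^76 summation, then a 2^78 exhaustive search.
t841 = log2_total((2, 76), (1, 78))
t842 = log2_total((2, 76), (1, 78))

print(f"840 rounds: 2^{t840:.1f}")
print(f"841 rounds: 2^{t841:.1f}")
print(f"842 rounds: 2^{t842:.1f}")
```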

6 Division Property from an Algebraic Viewpoint

Since 2015, various division properties together with their “propagation rules” have been proposed in the literature, including the word-based division property  [21, 25], the two-subset bit-based division property  [27] (a.k.a. the conventional bit-based division property), the three-subset bit-based division property  [27], and the recent three-subset bit-based division property without unknown subset  [9, 30]. Based on these properties with their associated propagation rules, detection algorithms or tools can be built. In a narrow sense, these detection algorithms are used to detect whether the sum of an output bit of a symmetric-key primitive over a carefully constructed input data set is key-independent, that is, whether the sum is a constant (0 or 1) for any key.

We now look at the detection algorithms for the key-independent property from an algebraic viewpoint. Before going any further, we would like to mention that the first attempt to formulate the division property in an algebraic way was made by Boura and Canteaut at CRYPTO 2016  [2]. However, they focused only on local components rather than on the global (keyed) Boolean functions. Furthermore, Biryukov, Khovratovich, and Perrin proposed the multiset-algebraic cryptanalysis, which can also be seen as an algebraic treatment of the division property  [1], but they focused mainly on the algebraic degree. Now, let us proceed to show the following conclusions:

  • A perfect detection algorithm for the key-independent property can be constructed based on the monomial prediction (i.e., this algorithm never raises false alarms and never misses).

  • The word-based division property  [25], two-subset bit-based division property  [27] and three-subset bit-based division property  [27] together with their propagation rules lead to no-false-alarm detection algorithms for the key-independent property (however, these algorithms can miss).

  • The three-subset bit-based division property without unknown subset with its propagation rules  [9] forms a perfect detection algorithm for the key-independent property, and an equivalence between it and the monomial prediction technique can be established.

6.1 A Perfect Detection Algorithm Based on Monomial Prediction

For a composite function \(\textit{\textbf{f}}: \mathbb {F}_2^n \rightarrow \mathbb {F}_2^m, \textit{\textbf{x}}^{(r)} = \textit{\textbf{f}}(\textit{\textbf{x}}^{(0)})\), we fix a constant vector \({\textit{\textbf{u}}}\in \mathbb {F}_2^n\) and derive the structure of input values \(\mathbb {X} = \{ {\textit{\textbf{x}}}\in \mathbb {F}_2^n : {\textit{\textbf{x}}}\preceq {\textit{\textbf{u}}}\}\). We want to detect whether

$$\begin{aligned} \lambda = \bigoplus _{\textit{\textbf{x}} \in \mathbb {X}} \pi _{{\textit{\textbf{u}}}^{(r)} } ( \textit{\textbf{f}}({\textit{\textbf{x}}}) ) \end{aligned}$$

is independent of the variables \(x_i\) with \(\bar{u}_i = 1\), in which case we call it \(\bar{{\textit{\textbf{u}}}}\)-independent (and \(\bar{{\textit{\textbf{u}}}}\)-dependent otherwise). From the viewpoint of the presence and absence of monomials, we have

$$ \lambda = \left\{ \begin{array}{@{}ll@{}} \bar{{\textit{\textbf{u}}}}\text{-dependent }, &{} \text { if }\pi _{{\textit{\textbf{u}}}\oplus \textit{\textbf{w}}}({\textit{\textbf{x}}}^{(0)}) \rightarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \hbox { for some }\textit{\textbf{0}} \prec \textit{\textbf{w}} \preceq \bar{{\textit{\textbf{u}}}} \\ \bar{{\textit{\textbf{u}}}}\text{-independent }, &{} \text { if }\pi _{{\textit{\textbf{u}}}\oplus \textit{\textbf{w}}}({\textit{\textbf{x}}}^{(0)}) \nrightarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \hbox { for all }\textit{\textbf{0}} \prec \textit{\textbf{w}} \preceq \bar{{\textit{\textbf{u}}}} \\ \end{array} \right. $$

Hence, for \(\textit{\textbf{f}}\), the monomial prediction can, in theory, detect precisely whether \(\lambda \) is independent of the \(x_i\) with \(\bar{u}_i = 1\) by computing \(|\pi _{ {\textit{\textbf{u}}}\oplus \textit{\textbf{w}} } ({\textit{\textbf{x}}}^{(0)}) \bowtie {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } |\) for every possible \(\textit{\textbf{0}} \prec \textit{\textbf{w}} \preceq \bar{{\textit{\textbf{u}}}}\).

Application to Derived Function. When applying the monomial prediction to a practical cipher, some of the public variables are fixed to constant values. Let \(\varvec{\varGamma }^{0}, \varvec{\varGamma }^{1}, \varvec{\varGamma }^{p}\) and \(\varvec{\varGamma }^{s}\) be four constant vectors indicating the 0-constant public variables, the 1-constant public variables, the non-constant public variables and the secret variables, respectively. Then we study the derived function \(\textit{\textbf{f}}_d\) of \(\textit{\textbf{f}}\) with \(\varvec{\varGamma }^{0}, \varvec{\varGamma }^{1}, \varvec{\varGamma }^{p}, \varvec{\varGamma }^{s}\). In the integral attack, the chosen plaintext set is
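On a function small enough to write down its ANF, the criterion above can be checked directly. The following Python sketch is a toy illustration only: it takes a single-output-bit function (so \(\pi _{{\textit{\textbf{u}}}^{(r)}}\) simply selects that bit), obtains the ANF by a Moebius transform instead of counting monomial trails, and compares the monomial-based decision with a brute-force summation in which the variables outside the cube are treated as free. The functions `f_dep` and `f_ind` and all helper names are assumptions made for this example.

```python
def anf(truth_table, n):
    """Binary Moebius transform: truth table -> list of ANF coefficients."""
    a = list(truth_table)
    for i in range(n):
        for x in range(1 << n):
            if x & (1 << i):
                a[x] ^= a[x ^ (1 << i)]
    return a

def nonzero_submasks(mask):
    """Yield every w with 0 < w <= mask in the bitwise partial order."""
    w = mask
    while w:
        yield w
        w = (w - 1) & mask

def independent_by_monomials(f, n, u):
    """True iff no monomial x^(u xor w), 0 < w <= ubar, occurs in the ANF of f."""
    coeffs = anf([f(x) for x in range(1 << n)], n)
    ubar = ((1 << n) - 1) ^ u
    return not any(coeffs[u | w] for w in nonzero_submasks(ubar))

def independent_by_summation(f, n, u):
    """Brute force: sum f over the cube for every assignment of the other bits."""
    ubar = ((1 << n) - 1) ^ u
    sums = set()
    for v in [0, *nonzero_submasks(ubar)]:
        s = 0
        for c in [0, *nonzero_submasks(u)]:
            s ^= f(c | v)
        sums.add(s)
    return len(sums) == 1

def bit(x, i):
    return (x >> i) & 1

def f_dep(x):  # x0*x1*x2 + x0*x1 + x3: the cube sum over {x0, x1} is x2 + 1
    return (bit(x, 0) & bit(x, 1) & bit(x, 2)) ^ (bit(x, 0) & bit(x, 1)) ^ bit(x, 3)

def f_ind(x):  # x0*x1 + x2*x3: the cube sum over {x0, x1} is the constant 1
    return (bit(x, 0) & bit(x, 1)) ^ (bit(x, 2) & bit(x, 3))

U = 0b0011  # cube over x0 and x1
for name, f in (("f_dep", f_dep), ("f_ind", f_ind)):
    print(name, independent_by_monomials(f, 4, U), independent_by_summation(f, 4, U))
```

For real ciphers the ANF is not available, which is exactly why the paper counts monomial trails instead of expanding the polynomial.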

$$\begin{aligned} \mathbb {X}_0 = \{ {\textit{\textbf{x}}}\oplus \varvec{\varGamma }^{1} \in \mathbb {F}_2^n : {\textit{\textbf{x}}}\preceq \varvec{\varGamma }^{p} \}. \end{aligned} \tag{10}$$

And we are interested in whether

$$\begin{aligned} \varLambda = \bigoplus _{\textit{\textbf{x}} \in \mathbb {X}_0} \pi _{{\textit{\textbf{u}}}^{(r)} } ( \textit{\textbf{f}}_d({\textit{\textbf{x}}}) ) \end{aligned}$$

is independent of the secret variables \(x_i\) with \(\varGamma ^s_i = 1\), in which case we call it key-independent (and key-dependent otherwise). Similarly,

$$ \varLambda = \left\{ \begin{array}{@{}ll@{}} \text{ key-dependent }, &{} \text { if }\pi _{{\tiny \varvec{\varGamma }^{p}} \oplus \textit{\textbf{w}}}({\textit{\textbf{x}}}^{(0)}) \rightarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \hbox { for some }\textit{\textbf{0}} \prec \textit{\textbf{w}} \preceq \varvec{\varGamma }^{s} \\ \text{ key-independent }, &{} \text { if }\pi _{ {\tiny \varvec{\varGamma }^{p}} \oplus \textit{\textbf{w}}}({\textit{\textbf{x}}}^{(0)}) \nrightarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \hbox { for all }\textit{\textbf{0}} \prec \textit{\textbf{w}} \preceq \varvec{\varGamma }^{s} \\ \end{array} \right. $$

Hence, by computing \(|\pi _{ {\tiny \varvec{\varGamma }^{p} } \oplus \textit{\textbf{w}} } ({\textit{\textbf{x}}}^{(0)}) \bowtie {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } |\) for every possible \(\textit{\textbf{0}} \prec \textit{\textbf{w}} \preceq \varvec{\varGamma }^{s}\), we can predict whether or not \(\varLambda \) is key-independent.

6.2 No-False-Alarm Detection Algorithms

Although the monomial prediction can predict the key-independent property precisely, computing the size of a monomial hull is generally difficult, especially for a block cipher, because the monomial hull is usually huge. Furthermore, for attackers, an integral property on any output bit is already useful in distinguishing attacks (it is not necessary to find all of them). Therefore, some trade-off between efficiency and precision is necessary and reasonable.

Following this idea of a trade-off, we make a simple observation. Recall that, by Lemma 1, if \(\pi _{ {\textit{\textbf{u}}}} ({\textit{\textbf{x}}}^{(0)}) \not\rightsquigarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \), we have \(\pi _{ {\textit{\textbf{u}}}} ({\textit{\textbf{x}}}^{(0)}) \nrightarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \). Then, if we claim that \(\varLambda \) is key-independent on the basis of \(\pi _{ {\tiny \varvec{\varGamma }^{p} } \oplus \textit{\textbf{w}} } ({\textit{\textbf{x}}}^{(0)}) \not\rightsquigarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \) for any \(\textit{\textbf{w}} \preceq \varvec{\varGamma }^{s}\), the detection algorithm we employ will never raise false alarms.

Definition 3

(No-False-Alarm Approximations). For two detection algorithms \(\mathcal {A}_1\) and \(\mathcal {A}_2\), if \(\mathcal {A}_2\) must also claim that a certain property \(\mathcal {P}\) holds whenever \(\mathcal {A}_1\) claims that \(\mathcal {P}\) holds, then we say \(\mathcal {A}_1\) is a no-false-alarm approximation of \(\mathcal {A}_2\).

Next we prove that the two-subset bit-based division property is a no-false-alarm approximation of the monomial prediction.

Definition 4

(Two-Subset Bit-Based Division Property  [27]). Let \(\mathbb {X}\) be a multiset whose elements are n-bit vectors and \(\mathbb {K}\) be a set whose elements are n-bit vectors. When the multiset \(\mathbb {X}\) has the division property \(\mathcal {D}^{1^n}_ {\mathbb {K}}\), it fulfills the following conditions:

$$ \bigoplus _{\textit{\textbf{x}}\in \mathbb {X}}\pi _{\textit{\textbf{u}}}(\textit{\textbf{x}}) = \left\{ \begin{array}{ll} unknown, &{}\mathrm {~if~there~exist~} \textit{\textbf{k}}\in \mathbb {K} \text{ s.t. } \textit{\textbf{u}}\succeq \textit{\textbf{k}},\\ 0, &{}\mathrm {otherwise}. \end{array} \right. $$

Let \(\textit{\textbf{f}}_d\) be the derived function of \(\textit{\textbf{f}}\) with \(\varvec{\varGamma }^{0}, \varvec{\varGamma }^{1}, \varvec{\varGamma }^{p}, \varvec{\varGamma }^{s}\). Suppose the initially chosen set (multiset) of the plaintext is \(\mathbb {X}_0\) as defined in Eq. (10) and the multiset of the ciphertext is \(\mathbb {X}_r = \{ {\textit{\textbf{y}}}: {\textit{\textbf{y}}}= \textit{\textbf{f}}_d(\textit{\textbf{x}}), \textit{\textbf{x}} \in \mathbb {X}_0 \}\). Then we first compute the division property of \(\mathbb {X}_0\) as \(\mathcal {D}^{1^n}_{\mathbb {K}_0 }\), where

$$\begin{aligned} \mathbb {K}_0 = \{ \textit{\textbf{k}} \in \mathbb {F}_2^n : \textit{\textbf{k}} \succeq \varvec{\varGamma }^{p} \}. \end{aligned} \tag{11}$$

To compute the division property of \(\mathbb {X}_r\), i.e., \(\mathcal {D}^{1^n}_{\mathbb {K}_r}\), we will trace all the propagation from the vectors in \(\mathbb {K}_0\). The propagation rules for the two-subset bit-based division property are listed in  [13, 27, 31].

Proposition 4

  The two-subset bit-based division property is a no-false-alarm approximation of the monomial prediction in detecting the balance property; therefore, the two-subset bit-based division property claims \(\bigoplus _{ \textit{\textbf{x}}^{(r)}\in \mathbb {X}_r} \pi _{\textit{\textbf{k}}^{(r)} } ({\textit{\textbf{x}}}^{(r)}) \equiv 0\) without false alarms.

Proof

Firstly, for any \(\textit{\textbf{k}}^{(0)} \in \mathbb {K}_0\), \(\pi _{\textit{\textbf{k}}^{(0)} }({\textit{\textbf{x}}}^{(0)}) = \pi _{ {\tiny \varvec{\varGamma }^{p} } \oplus \textit{\textbf{w}} } ({\textit{\textbf{x}}}^{(0)})\) where \(\textit{\textbf{w}} = \varvec{\varGamma }^{p} \oplus \textit{\textbf{k}}^{(0)} \preceq \varvec{\varGamma }^{1} \oplus \varvec{\varGamma }^{s} \). Next, we consider the propagation from these vectors in \(\mathbb {K}_0\). Note that all kinds of components of a cipher can be seen as an S-box \(\textit{\textbf{y}} = \textit{\textbf{S}}(\textit{\textbf{x}})\), and the propagation through an S-box for the two-subset bit-based division property can be summarized as the following rule: Let \(\mathcal {D}^{1^n}_{\mathbb {K}_{in}}\) and \(\mathcal {D}^{1^n}_{\mathbb {K}_{out}}\) be the input and output two-subset bit-based division property of \(\textit{\textbf{S}}\), respectively. If \({\textit{\textbf{u}}}\in \mathbb {K}_{in}\) can propagate to \({\textit{\textbf{v}}}\in \mathbb {K}_{out}\), there must be \({\textit{\textbf{u}}}' \succeq {\textit{\textbf{u}}}\) satisfying \(\pi _{{\textit{\textbf{u}}}'}({\textit{\textbf{x}}}) \rightarrow {\textit{\textbf{y}}}^{\textit{\textbf{v}}}\). Since the monomial trail requires \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\rightarrow {\textit{\textbf{y}}}^{\textit{\textbf{v}}}\), from the same \({\textit{\textbf{u}}}\) the two-subset bit-based division property can propagate to a larger range of vectors \({\textit{\textbf{v}}}\).

Hence, if \(\textit{\textbf{k}}^{(r)} \notin \mathbb {K}_r\), we have \(\pi _{\textit{\textbf{k}}}({\textit{\textbf{x}}}^{(0)}) \not\rightsquigarrow \pi _{\textit{\textbf{k}}^{(r)}}({\textit{\textbf{x}}}^{(r)})\) for all \(\textit{\textbf{k}}\in \mathbb {K}_0\). Therefore, \(\pi _{\textit{\textbf{k}}^{(r)}}({\textit{\textbf{x}}}^{(r)})\) does not contain any term of the form \(\pi _{ {\tiny \varvec{\varGamma }^{p} } \oplus \textit{\textbf{w}} } ({\textit{\textbf{x}}}^{(0)}) = \pi _{\textit{\textbf{w}}}({\textit{\textbf{x}}}^{(0)}) \pi _{ {\tiny \varvec{\varGamma }^{p}} }({\textit{\textbf{x}}}^{(0)})\) for \(\textit{\textbf{w}} \preceq \varvec{\varGamma }^{1} \oplus \varvec{\varGamma }^{s}\), and consequently

$$ \bigoplus _{ \textit{\textbf{x}}^{(r)}\in \mathbb {X}_r} \pi _{\textit{\textbf{k}}^{(r)} } ({\textit{\textbf{x}}}^{(r)}) = \bigoplus _{ \textit{\textbf{x}}^{(0)} \in \mathbb {X}_0} \pi _{\textit{\textbf{k}}^{(r)} } (\textit{\textbf{f}}_d ( {\textit{\textbf{x}}}^{(0)} ) )\equiv 0. $$

   \(\square \)

According to the proof, it can be checked that even if \({\textit{\textbf{k}}}^{(r)} \in \mathbb {K}_r\), we cannot determine whether \(\pi _{{\textit{\textbf{k}}}^{(0)}}({\textit{\textbf{x}}}^{(0)}) \rightsquigarrow \pi _{{\textit{\textbf{k}}}^{(r)}}({\textit{\textbf{x}}}^{(r)})\) (let alone \(\pi _{{\textit{\textbf{k}}}^{(0)}}({\textit{\textbf{x}}}^{(0)}) \rightarrow \pi _{{\textit{\textbf{k}}}^{(r)}}({\textit{\textbf{x}}}^{(r)})\)), while the two-subset division property claims that the parity is an unknown value. In other words, the two-subset bit-based division property may miss some balance properties.

Similarly, we can prove that the three-subset bit-based division property and the word-based division property are also no-false-alarm approximations of the monomial prediction. The proofs are provided in Appendix C (Ref. [11]).

6.3 The Three-Subset Bit-Based Division Property Without Unknown Subset is Perfect

In  [30], Wang et al. found that it suffices to focus on only a part of the propagation of the three-subset bit-based division property when processing a public-update cipher. Later, in  [9], Hao et al. formalized this method as the three-subset bit-based division property without unknown subset. In this subsection, we show that it is perfect in detecting the key-independent property.

Definition 5

(Three-Subset Bit-Based Division Property w/o Unknown Subset [9, 30]). Let \(\mathbb {X}\) and \(\mathbb {L}\) be two multisets whose elements are n-bit vectors. When the multiset \(\mathbb {X}\) has the three-subset bit-based division property without unknown subset \(\mathcal {T}_{\mathbb {L}}^{1^n}\), it fulfills the following conditions:

$$ \bigoplus _{{\textit{\textbf{x}}}\in \mathbb {X}} \pi _{\varvec{\ell }} ({\textit{\textbf{x}}}) = \left\{ \begin{array}{ll} 1, &{} \text { if }\varvec{\ell }\text { appears an odd number of times in }\mathbb {L}, \\ 0, &{} \text { if }\varvec{\ell }\text { appears an even number of times in }\mathbb {L}. \end{array} \right. $$

Let \(\textit{\textbf{f}}_d\) be the derived function of \(\textit{\textbf{f}}\) with \(\varvec{\varGamma }^{0}, \varvec{\varGamma }^{1}, \varvec{\varGamma }^{p}, \varvec{\varGamma }^{s}\) (see Footnote 8). Suppose the initially chosen set (multiset) of the plaintext is \(\mathbb {X}_0\) in Eq. (10), and the multiset of the ciphertext is \(\mathbb {X}_r = \{{\textit{\textbf{y}}}: {\textit{\textbf{y}}}= \textit{\textbf{f}}_d(\textit{\textbf{x}}), \textit{\textbf{x}} \in \mathbb {X}_0 \}\). Then we first compute the division property of \(\mathbb {X}_0\) as \(\mathcal {T}^{1^n}_{\mathbb {L}_0 }\)  [30], where

$$\begin{aligned} \mathbb {L}_0 = \{ \varvec{\ell }\in \mathbb {F}_2^n : \varvec{\varGamma }^{p} \preceq \varvec{\ell }\preceq \varvec{\varGamma }^{p} \oplus \varvec{\varGamma }^{1} \}. \end{aligned} \tag{12}$$

To compute the division property of \(\mathbb {X}_r\), i.e., \(\mathcal {T}^{1^n}_{\mathbb {L}_r}\), we will trace all the propagation from the vectors in \(\mathbb {L}_0\). The propagation rules for three-subset bit-based division property without unknown subset are listed in  [9, 30].
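The initial set \(\mathbb {L}_0\) of Eq. (12) can be checked by brute force for small parameters. The following Python sketch is a toy check under a stated assumption: it reads Eq. (10) literally, so the positions marked by \(\varvec{\varGamma }^{0}\) and \(\varvec{\varGamma }^{s}\) are held at 0 in \(\mathbb {X}_0\), whereas the paper treats the secret bits symbolically. The concrete masks and helper names are chosen here for illustration.

```python
def pi(u, x):
    """Monomial pi_u(x): 1 iff every bit selected by u is set in x."""
    return int((x & u) == u)

def submasks(mask):
    """Yield every w with w <= mask in the bitwise partial order (including 0)."""
    w = mask
    while True:
        yield w
        if w == 0:
            break
        w = (w - 1) & mask

N = 5
GAMMA_P = 0b00011  # non-constant public bits (the cube)
GAMMA_1 = 0b00100  # public bits fixed to 1
GAMMA_0 = 0b01000  # public bits fixed to 0
GAMMA_S = 0b10000  # secret bits (held at 0 in this literal reading)

# Eq. (10): X_0 = { x xor Gamma^1 : x <= Gamma^p }
X0 = [x ^ GAMMA_1 for x in submasks(GAMMA_P)]

# Vectors ell whose monomial has odd parity over X_0.
odd_parity = {l for l in range(1 << N) if sum(pi(l, x) for x in X0) % 2 == 1}

# Eq. (12): L_0 = { ell : Gamma^p <= ell <= Gamma^p xor Gamma^1 }
upper = GAMMA_P ^ GAMMA_1
L0 = {l for l in range(1 << N)
      if (l & GAMMA_P) == GAMMA_P and (l & upper) == l}

print(sorted(odd_parity), sorted(L0))
```

In this toy instance each vector of \(\mathbb {L}_0\) occurs exactly once, so the odd-parity monomials coincide with \(\mathbb {L}_0\).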

Proposition 5

  The three-subset bit-based division property without unknown subset predicts \(\bigoplus _{ \textit{\textbf{x}}^{(r)}\in \mathbb {X}_r} \pi _{\varvec{\ell }^{(r)} } ({\textit{\textbf{x}}}^{(r)})\) for any \(\varvec{\ell }^{(r)}\) perfectly.

Proof

Firstly, for any \(\varvec{\ell }^{(0)} \in \mathbb {L}_0\), \(\pi _{\varvec{\ell }^{(0)}}({\textit{\textbf{x}}}^{(0)}) = \pi _{ {\tiny \varvec{\varGamma }^{p}} \oplus \textit{\textbf{w}} } ({\textit{\textbf{x}}}^{(0)})\) where \(\textit{\textbf{w}} = \varvec{\varGamma }^{p} \oplus \varvec{\ell }^{(0)} \preceq \varvec{\varGamma }^{1}\). Since the variables indexed by \(\varvec{\varGamma }^{1}\) are fixed to 1, \(\pi _{ {\tiny \varvec{\varGamma }^{p} } \oplus \textit{\textbf{w}} } ({\textit{\textbf{x}}}^{(0)}) = \pi _{ {\tiny \varvec{\varGamma }^{p} } } ({\textit{\textbf{x}}}^{(0)})\). Next, we consider the propagation from these vectors in \(\mathbb {L}_0\). Since all kinds of components of a cipher can be seen as an S-box \(\textit{\textbf{y}} = \textit{\textbf{S}}(\textit{\textbf{x}})\), and the propagation through an S-box for the three-subset bit-based division property without unknown subset follows a rule that guarantees \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\rightarrow {\textit{\textbf{y}}}^{\textit{\textbf{v}}}\)  [30], we can trace the propagation and compute \(\mathbb {L}_r\). Therefore, for every vector \(\varvec{\ell }^{(r)} \in \mathbb {L}_r\), there is a monomial trail connecting \(\pi _{\varvec{\ell }^{(0)}}({\textit{\textbf{x}}}^{(0)})\) and \(\pi _{\varvec{\ell }^{(r)}}({\textit{\textbf{x}}}^{(r)})\), since \({\textit{\textbf{x}}}^{\textit{\textbf{u}}}\rightarrow {\textit{\textbf{y}}}^{\textit{\textbf{v}}}\) is also required by Definition 1. Suppose \(\varvec{\ell }^{(r)}\) appears N times in \(\mathbb {L}_r\); then

$$ N = \sum _{\varvec{\ell }\in \mathbb {L}_0 } |\pi _{\varvec{\ell }} ({\textit{\textbf{x}}}^{(0)}) \bowtie \pi _{\varvec{\ell }^{(r)}} ({\textit{\textbf{x}}}^{(r)}) | = \sum _{\textit{\textbf{w}} \preceq \varvec{\varGamma }^{1} } |\pi _{ {\tiny \varvec{\varGamma }^{p}} \oplus \textit{\textbf{w}} } ({\textit{\textbf{x}}}^{(0)}) \bowtie \pi _{\varvec{\ell }^{(r)}}({\textit{\textbf{x}}}^{(r)}) |. $$

According to Proposition 2, \(\pi _{ {\tiny \varvec{\varGamma }^{p}} } ({\textit{\textbf{x}}}^{(0)}) \rightarrow \pi _{\varvec{\ell }^{(r)}}({\textit{\textbf{x}}}^{(r)})\), if and only if \(N\bmod 2 = 1\).    \(\square \)
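The parity argument used in this proof (a monomial is present exactly when the number of monomial trails is odd) can be observed on a toy iterated function. The following Python sketch is an illustration under assumptions: the 3-bit round function `round_fn`, the round count and all helper names are invented for this example, and the ANF of the full composite function is computed by a Moebius transform only because the toy is small enough to allow it.

```python
N = 3
SIZE = 1 << N
ROUNDS = 3

def round_fn(x):
    """Toy 3-bit round function (an assumption made for this example)."""
    x0, x1, x2 = x & 1, (x >> 1) & 1, (x >> 2) & 1
    y0 = x0 ^ (x1 & x2)
    y1 = x1 ^ x2
    y2 = x2 ^ (x0 & x1)
    return y0 | (y1 << 1) | (y2 << 2)

def pi(u, x):
    """Monomial pi_u(x): 1 iff every bit selected by u is set in x."""
    return int((x & u) == u)

def anf(truth_table):
    """Binary Moebius transform: truth table -> ANF coefficients."""
    a = list(truth_table)
    for i in range(N):
        for x in range(SIZE):
            if x & (1 << i):
                a[x] ^= a[x ^ (1 << i)]
    return a

def monomials_of(v):
    """Set of u such that x^u occurs in pi_v(round_fn(x)), i.e. x^u -> y^v."""
    coeffs = anf([pi(v, round_fn(x)) for x in range(SIZE)])
    return {u for u in range(SIZE) if coeffs[u]}

T = [monomials_of(v) for v in range(SIZE)]

def count_trails(u_in, u_out, rounds):
    """Number of monomial trails from x^(0)^u_in to x^(rounds)^u_out."""
    counts = [0] * SIZE
    counts[u_in] = 1
    for _ in range(rounds):
        counts = [sum(counts[u] for u in T[v]) for v in range(SIZE)]
    return counts[u_out]

def iterate(x, rounds):
    for _ in range(rounds):
        x = round_fn(x)
    return x

def contains(u_in, u_out, rounds):
    """True iff x^u_in occurs in the ANF of pi_u_out of the composite function."""
    coeffs = anf([pi(u_out, iterate(x, rounds)) for x in range(SIZE)])
    return bool(coeffs[u_in])

agree = all((count_trails(u, v, ROUNDS) % 2 == 1) == contains(u, v, ROUNDS)
            for u in range(SIZE) for v in range(SIZE))
print("trail parity matches ANF membership:", agree)
```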

6.4 An Alternative Detection Algorithm for Division Property

The algebraic insights into the division property bring us much more flexibility in designing new detection algorithms for balance properties. Although the three-subset bit-based division property is more accurate than the two-subset bit-based division property  [30], the latter is more MILP-friendly and needs simpler programming, so the two-subset version is more efficient. According to the existing literature, the three-subset bit-based division property can find several more balanced bits, but it can hardly cover more rounds than the two-subset version. Hence, the two-subset bit-based division property is still the dominant method in searching for integral properties.

Table 3. Some experimental results of our new detection algorithm compared with the previous ones. All results are re-produced on the same platform.

From an algebraic viewpoint, we show how to design a new detection algorithm for the division property that surpasses the capability of the two-subset bit-based division property while achieving similar efficiency. For the derived function \(\textit{\textbf{f}}_d\) with \(\varvec{\varGamma }^{0}, \varvec{\varGamma }^{1}, \varvec{\varGamma }^{p}, \varvec{\varGamma }^{s}\), if we want to determine whether \( \bigoplus _{{\textit{\textbf{x}}}\in \mathbb {X}_0} \pi _{ {\textit{\textbf{u}}}^{(r)} } ( \textit{\textbf{f}}_d( {\textit{\textbf{x}}}) ) \) is key-independent or not, we only need to check whether \( {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \) contains any term in

$$ \mathbb {S}_0 = \{ \pi _{{\tiny \varvec{\varGamma }^{p}} \oplus \textit{\textbf{w}} } ({\textit{\textbf{x}}}^{(0)}) : \textit{\textbf{0}} \prec \textit{\textbf{w}} \preceq \varvec{\varGamma }^{s} \}. $$

Consider \(\mathbb {S}_r = \{ {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } : {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightsquigarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \}\), where \({\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) }\) ranges over \(\mathbb {S}_0\). If \( {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \notin \mathbb {S}_r\), then we know that \( {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \) does not contain any monomial in \(\mathbb {S}_0\), since there is no monomial trail. Therefore, \( \bigoplus _{{\textit{\textbf{x}}}\in \mathbb {X}_0} \pi _{ {\textit{\textbf{u}}}^{(r)} } ( \textit{\textbf{f}}_d( {\textit{\textbf{x}}}) ) \) is a key-independent value.

To detect this, we first construct the model of \( {\pi _{{\textit{\textbf{u}}}^{(0)}} ( {\textit{\textbf{x}}}^{(0)} ) } \rightsquigarrow {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \) by decomposing the target cipher as we do for \(\textsc {Trivium} \). Secondly, we impose an additional constraint on all the round key bits \(k_i\) in the MILP model \(\mathcal {M}\) as

$$ \mathcal {M}\leftarrow \sum _{i} k_i \ge 1. $$

Finally, we check the feasibility of this model. If the model is infeasible, then \( {\pi _{{\textit{\textbf{u}}}^{(r)}} ( {\textit{\textbf{x}}}^{(r)} ) } \) contains no monomial in \(\mathbb {S}_0\) and \( \bigoplus _{{\textit{\textbf{x}}}\in \mathbb {X}_0} \pi _{ {\textit{\textbf{u}}}^{(r)} } ( \textit{\textbf{f}}_d( {\textit{\textbf{x}}}) ) \) is key-independent. Since we do not need to compute the size of the monomial hull, the model is easy to solve. We conducted some experiments to show the capability of this alternative detection algorithm and list the results in Table 3.
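The three steps above can be imitated on a toy cipher without an MILP solver. The following Python sketch is only an analogy under stated assumptions: it replaces the MILP feasibility check by an exhaustive search for monomial trails, models the key bits as part of the initial state rather than as round keys, and uses an invented 4-bit round function. It claims key-independence when no trail starts from a monomial in \(\mathbb {S}_0\) (the role played by the constraint \(\sum _i k_i \ge 1\)), and compares each claim with a brute-force summation over all keys.

```python
N = 4
SIZE = 1 << N
ROUNDS = 2
GAMMA_P = 0b0011  # cube (non-constant public) bits x0, x1
GAMMA_S = 0b1100  # secret bits x2, x3

def round_fn(x):
    """Toy round function (an assumption): key bits x2, x3 are carried along."""
    b = [(x >> i) & 1 for i in range(N)]
    y = [b[0] ^ (b[1] & b[2]), b[1] ^ b[3], b[2], b[3]]
    return sum(bit << i for i, bit in enumerate(y))

def pi(u, x):
    return int((x & u) == u)

def anf(truth_table):
    a = list(truth_table)
    for i in range(N):
        for x in range(SIZE):
            if x & (1 << i):
                a[x] ^= a[x ^ (1 << i)]
    return a

def nonzero_submasks(mask):
    w = mask
    while w:
        yield w
        w = (w - 1) & mask

# T[v] = set of u such that x^u -> y^v through one round.
T = []
for v in range(SIZE):
    coeffs = anf([pi(v, round_fn(x)) for x in range(SIZE)])
    T.append({u for u in range(SIZE) if coeffs[u]})

def claims_key_independent(u_out, rounds):
    """True iff no monomial trail connects any monomial of S_0 to pi_u_out."""
    reach = {u_out}
    for _ in range(rounds):  # walk the trails backwards, round by round
        reach = {u for v in reach for u in T[v]}
    s0 = {GAMMA_P | w for w in nonzero_submasks(GAMMA_S)}  # at least one key bit
    return reach.isdisjoint(s0)

def truly_key_independent(u_out, rounds):
    """Brute force: is the cube sum the same for every key?"""
    sums = set()
    for key in [0, *nonzero_submasks(GAMMA_S)]:
        s = 0
        for cube in [0, *nonzero_submasks(GAMMA_P)]:
            x = key | cube
            for _ in range(rounds):
                x = round_fn(x)
            s ^= pi(u_out, x)
        sums.add(s)
    return len(sums) == 1

for u_out in range(1, SIZE):
    print(f"{u_out:04b}", claims_key_independent(u_out, ROUNDS),
          truly_key_independent(u_out, ROUNDS))
```

By Lemma 1, every claim made by `claims_key_independent` must be confirmed by the brute-force check, while the converse need not hold because trails may cancel.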

7 Conclusion and Discussion

In this work, a purely algebraic treatment of the division property is presented, and we propose the monomial prediction technique, which determines the presence or absence of a monomial by counting the number of monomial trails in the corresponding monomial hull. Based on this technique, we obtain the exact algebraic degrees of Trivium up to 834 rounds and improved key-recovery attacks on 840-, 841- and 842-round Trivium.

Moreover, we categorize existing detection algorithms for division properties into perfect, no-false-alarm, and no-missing classes. In particular, we prove that the three-subset bit-based division property without unknown subset and the monomial prediction are perfect. At this point, a natural question arises: can we design an efficient no-missing detection algorithm for the division property that does not raise too many false alarms? Such an algorithm would be very useful for designers to theoretically determine security bounds against attacks based on division properties.