Abstract
The National Institute of Standards and Technology (NIST) recently published a Format-Preserving Encryption standard accepting two Feistel-structure-based schemes called FF1 and FF3. In particular, FF3 is a tweakable block cipher based on an 8-round Feistel network. At CCS 2016, Bellare et al. gave an attack that breaks FF3 (and FF1) with time and data complexity \(O(N^5\log (N))\), which is much larger than the code book (but uses many tweaks), where \(N^2\) is the domain size of the Feistel network. In this work, we give a new practical total-break attack on the FF3 scheme (also known as the BPS scheme). Our FF3 attack requires \(O(N^{\frac{11}{6}})\) chosen plaintexts with time complexity \(O(N^{5})\). Our attack was successfully tested with \(N\leqslant 2^9\). It is a slide attack (using two tweaks) that exploits the bad domain separation of the FF3 design. Due to this weakness, we reduce the FF3 attack to an attack on a 4-round Feistel network. Biryukov et al. already gave a 4-round Feistel structure attack at SAC 2015. However, it works with chosen plaintexts and ciphertexts whereas we need a known-plaintext attack. Therefore, we develop a new generic known-plaintext attack on the 4-round Feistel network that reconstructs the entire tables of all round functions. It works with \(N^{\frac{3}{2}} \left( \frac{N}{2} \right) ^{\frac{1}{6}}\) known plaintexts and time complexity \(O(N^{3})\). Our 4-round attack is simple to extend to five and more rounds with complexity \(N^{(r-5)N+o(N)}\). It shows that FF1 with \(N=7\) and FF3 with \(7\leqslant N\leqslant 10\) do not offer 128-bit security. Finally, we provide an easy and intuitive fix to protect the FF3 scheme from our \(O(N^{5})\) attack.
Keywords
 Format-preserving Encryption (FPE)
 Tweakable Block Cipher (TBC)
 Feistel Network (FN)
 Round Function
 Slide Attacks
1 Introduction
Format-Preserving Encryption (FPE) provides a method to encrypt data in a specific format into a ciphertext of the same format. A format in FPE schemes refers to a finite set of characters, such as the decimal (or binary) numerals or alphanumerals, along with the length of the character sequence that forms the plaintexts. FPE has attracted attention in the applied cryptography community due to its desirable functionality: it secures data while keeping the database schema intact. For instance, given a legacy database system, upgrading the database security requires a way of encrypting credit card numbers (CCN) or social security numbers (SSN) that is transparent to its applications.
Brightwell and Smith [9] introduced the first known format-preserving encryption, which they termed datatype-preserving encryption, in 1997. They wanted to encrypt an existing database so that all applications could access encrypted data just as they access non-encrypted data. Their solution reduced to preserving the particular datatype of the entries in the database. The term format-preserving encryption is due to Terence Spies from Voltage Security [21]. Though FPE dates back to the late 1990s, the demand for FPE-based databases has created an active area of research during the last few years. Many techniques have been proposed to build FPE schemes, such as prefix cipher, cycle walking, Feistel network, and Feistel modes [2, 4, 5, 7, 16, 20, 21]. The complete list of FPE schemes for small domain sizes, along with their description and security level, can be found in a synopsis by Rogaway [18, pp. 6, 7]. In his list, Rogaway considers the schemes that are built from pseudorandom functions (which themselves might be constructed from block ciphers).
It is natural to build FPE schemes on a Feistel network (FN) since it can be used with existing conventional block ciphers, such as AES. Indeed, the National Institute of Standards and Technology (NIST) published an FPE standard [1] (finalized in March 2016) that includes two approved Feistel-based FPE schemes: FF1 [5] and FF3 [8]. Both are expected to offer 128-bit security. In this work, we are particularly interested in attacks that break the FN-based standard FF3 [1] and in attacks against Feistel networks. The former attack uses the latter, which is designed as a generic round-function-recovery attack.
The FF3 construction is an 8-round FN that uses a tweak XORed with a round counter as an input to the block cipher. The XOR operation guarantees that the round functions are pairwise different. This is usually called “domain separation”. FF3 is claimed to achieve several cryptographic goals, including chosen-plaintext security and even PRP-security against an adaptive chosen-ciphertext attack, under the assumption that the underlying round function is a good pseudorandom function (PRF). Our work shows that this security goal is not met even when the round functions are replaced by secure PRFs, and gives a round-function-recovery attack on FF3.
Our Contributions. Our work makes three significant contributions. (a) We give a total practical break of the 8-round Feistel-network-based FF3 FPE standard over a small domain. Our attack exploits the “bad domain separation” in FF3. Namely, the specific design choice of FF3 allows us to permute the round functions by changing the tweak, which leads us to develop a slide attack (using only two tweaks). The attack works with chosen plaintexts and tweaks when the message domain is small. It requires \(O(N^{\frac{7}{4}+ \frac{1}{4L}})\) chosen plaintexts and two tweaks, with time complexity \(O(N^{5})\), where \(N^2\) is the input domain size of the Feistel network and L is a parameter of our attack which is typically set to \(L=3\) in our experiments. Fortunately, the fix that protects FF3 against our attack is quick and easy to apply without changing the main structure of the scheme. (b) While forming our slide attack to break FF3, we develop a new generic known-plaintext attack on 4-round Feistel networks and insert it into our slide attack. Our technique for the 4-round attack is novel and differs from previously known attacks on Feistel networks. In our attack, we fully recover the round functions with \(N^{\frac{3}{2}} \left( \frac{N}{2} \right) ^{\frac{1}{2L}}\) known plaintexts and time complexity \(O(N^{2+\frac{3}{L}})\) for four rounds. (c) We use our 4-round FN attack to extend the round-function recovery to more rounds. Due to the generic and known-plaintext nature of our 4-round FN attack, we easily adapt it to a chosen-plaintext attack on Feistel structures with 5 and more rounds. Our attack shows that neither FF1 with \(N=7\) nor FF3 with \(7\leqslant N\leqslant 10\) (even with our fix) offers 128-bit security.
Overview of Previous Works. A security notion for message recovery in FPE constructions, along with many other notions for FPE, was first defined by Bellare et al. [4]. A recent work by Bellare et al. [3] gives a practical message-recovery attack on the NIST standard Feistel-based FPE schemes (both FF1 and FF3) on small domain sizes. In their work, however, security is considered under a new message-recovery notion that they define in the same work. Briefly, consider two messages X and \(X'\) which share the same right (or left) half. In their attack, the adversary is given \(X'\) together with the encryptions of X and \(X'\) under q tweaks. The adversary wins if it can fully recover X, in particular its unknown half. The attack by Bellare et al. uses a data complexity that exceeds the message space size. Put plainly, their work shows that Feistel-based FPE with the standardized number of rounds does not achieve good enough security on small domain sizes.
The attack by Bellare et al. works using \(O(N^{5} \log N )\) data and time complexity with many tweaks on eight rounds. This is quite interesting when the amount of data is limited for each tweak. It is a decryption attack. Our attack herein is more traditional. It uses only two tweaks, but \(O(N^{\frac{11}{6}})\) chosen plaintexts with \(O(N^5)\) time complexity. We recover the entire codebook (for both tweaks).
To apply the slide attack to recover the entire round functions of Feistel networks, we develop a generic known-plaintext attack on 4 rounds.
Since their invention, Feistel networks have created active research areas for cryptographers (both in theory and in practice) due to their applications and influence on the development of major constructions such as DES. The security of Feistel networks has been investigated for a very long time and there already exist interesting cryptanalytic results. Attacks on Feistel schemes aim either to distinguish a Feistel scheme from a random permutation or to recover the round functions. In their famous work [15], Luby and Rackoff proved the indistinguishability of the 3-round Feistel network against chosen-plaintext attacks and of 4 rounds against chosen-plaintext and ciphertext attacks for a number of queries \(q \ll \sqrt{N}\), where \(N^2\) is the size of the input domain. Subsequent work tried to improve the security bounds up to \(q \ll N\) (called the “birthday bound”), which was a natural bound from information theory.^{Footnote 1} A work by Patarin [17], using the mirror theory, showed improved proofs and stronger security bounds for four-, five-, and six-round Feistel networks. Namely, for \(q \ll N\), four rounds are secure against known-plaintext attacks, five rounds are secure against chosen-plaintext attacks, and six rounds are secure against chosen-plaintext and ciphertext attacks.
From an information theory viewpoint, we could recover all functions in time \(N^{\mathcal {O}(N)}\) by exhaustive search. As far as we know, there is no efficient generic attack which is polynomial in N on the Feistel scheme with \(q\sim N\). Our attack uses \(q\sim N^{\frac{3}{2}}\) and is polynomial in N with known plaintexts up to four rounds.
A recent work by Dinur et al. [11] gives a new attack on Feistel structures with more than four rounds that recovers the round keys from a few known plaintext/ciphertext pairs when the \(i^{th}\) round uses \(x \mapsto F_{i}(x \oplus k_i)\), where \(F_i\) is public whereas \(k_i\) is kept secret. Here, we focus on the case where each round function is secret in a balanced 2-branch Feistel scheme. Furthermore, we do not restrict ourselves to XOR addition: our results also apply to Feistel schemes with modular addition. New cryptanalysis results against Feistel networks with modular addition for four and five rounds are presented in a recent work by Biryukov et al. [6]. For four rounds, they achieve the full recovery of the round functions with data complexity \(O(N^{\frac{3}{2}})\) using a guess-and-determine technique. However, their attack uses chosen plaintexts and ciphertexts. We summarize their results and ours in Table 1.
Structure of the Paper. In Sects. 2 and 3, we give the details of the FF3 construction and of Tweakable Encryption, respectively. In Sect. 4, we develop our new generic attack on Feistel structures, specifically on 4 rounds, and extend it to 5 and more rounds. In Sect. 5, we give our complete slide attack on the NIST standard FF3 scheme.
2 The FF3 Scheme
A Tweakable Format-Preserving Encryption (TFPE) scheme is a block cipher that preserves the format of the domain in the output. A TFPE function \(E: \mathcal {K} \times \mathcal {T} \times \mathcal {X} \rightarrow \mathcal {X}\) is defined from a key space \(\mathcal {K}\), a tweak space \(\mathcal {T}\), and a domain \(\mathcal {X}\) to the same domain \(\mathcal {X}\). We are particularly interested in the TFPE scheme by Brier, Peyrin, and Stern (depicted in Fig. 1(b)) [8], whose design is based on the Feistel network depicted in Fig. 1(a). It is named FF3 in the NIST standard.
We use the following notation for the rest of the paper. The domain \(\mathcal {X}\) consists of strings of characters; s represents the cardinality of the set S of characters and b represents the length of the messages in the domain \(\mathcal {X}\). For example, credit card numbers (CCNs) consist of 16 decimal digits with \(S= \{0,1, \ldots , 9\}\), \(s=10\) and \(b=16\), where we have \(10^{16} \approx 2^{53}\) possible distinct numeral strings. We set the minimum length of the message block to \(minlen=2\) and the maximum length of the message block to \(maxlen=\lfloor \log _{s}(2^{f-32})\rfloor \), where \( f \) is the input/output size of the round function used in the Feistel scheme in FF3.^{Footnote 2} We denote the number of rounds in the scheme by \( w \).
Unlike standard Feistel schemes which use the exclusive or (XOR) (denoted by \(\oplus \)), FF3 uses the modular addition that is denoted by \(\boxplus \).
We define the following notations for three functions:
\({\mathbf{{STR^{b}_s:}}}\) a function that maps an integer x where \(0 \leqslant x < s^b\) to a string of length b in base s with most significant character first, e.g. \(STR^{4}_{12}(554)= 03A2\).
\({\mathbf{{NUM_{s}:}}}\) a function that maps a string X to an integer x such that \(STR^{b}_{s}(x)=X\). For instance, \(NUM_{2}(00011010)=26\).
\({\mathbf{{REV(X):}}}\) a function that reverses the order of the characters of string X.
The length of string X is denoted by \(|X|\). The concatenation of strings is denoted by \(\Vert \). The first (leftmost) character of string X is X[0]. The \(i^{th}\) one is \(X[i-1]\). We denote by \(X[a \cdots b]\) the substring of X formed by \(X[a]X[a+1] \cdots X[b]\).
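The three string functions can be sketched in a few lines of Python. This is only an illustration of the definitions above: the names `str_s_b`, `num_s`, `rev` and the character table `DIGITS` are assumptions made for this sketch (the paper's example \(03A2\) suggests digits followed by letters, which is what is assumed here), not identifiers from the standard.

```python
# Assumed character table: digits then uppercase letters (radix up to 36).
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def str_s_b(x, s, b):
    """STR^b_s: map 0 <= x < s**b to a length-b string, most significant first."""
    if not 0 <= x < s ** b:
        raise ValueError("x must satisfy 0 <= x < s**b")
    chars = []
    for _ in range(b):
        x, digit = divmod(x, s)      # peel off the least significant character
        chars.append(DIGITS[digit])
    return "".join(reversed(chars))

def num_s(text, s):
    """NUM_s: inverse of STR^b_s for b = len(text)."""
    value = 0
    for ch in text:
        value = value * s + DIGITS.index(ch)
    return value

def rev(text):
    """REV: reverse the order of the characters."""
    return text[::-1]

print(str_s_b(554, 12, 4))      # the paper's example: 03A2
print(num_s("00011010", 2))     # the paper's example: 26
```

Note that `num_s` does not check that each character is a valid base-s digit; a production encoder would.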
FF3 uses a tweakable block cipher as a round function, \(F_{K}(T,X)=Y\) with \(X,Y \in \{0,1,\ldots ,2^{ f }-1\}\) and \(T \in \{0,1\}^{32}\), where K is a key and T is one half of the FF3 tweak with an offset.
In lines 1–2, the encryption algorithm splits the input X into two substrings \(L_0\) and \(R_0\). In lines 5–8 (respectively in lines 10–12), the algorithm first feeds the tweak \(T_{R}\) (respectively \(T_L\)) XORed with the encoded round index \( i \), together with \(R_i\) (respectively \(L_i\)), into the tweakable PRF \(F_{K}\). Second, it adds the output of \(F_{K}\) to \(L_i\) (respectively \(R_i\)) with modular addition.
For simplicity and by abuse of notation, we say that FF3 encrypts the plaintext \((L_{0}, R_{0})\) into the ciphertext \((L_{ w }, R_{ w })\) with tweak \((T_{L}, T_{R})\), so that we only concentrate on lines 4–14. We illustrate the 4-round FF3 scheme in Fig. 1(b).
In the concrete proposal, \( w =8\), \( f =128\), and the round function \(F_{K}\) is instantiated with AES, where AES maps an \( f \)-bit bitstring to an \( f \)-bit bitstring [1].
3 Tweakable Encryption
A tweakable block cipher (TBC) is a tuple \((\mathcal {K}, \mathcal {E}_{K}(\cdot , \cdot ), \mathcal {D}_{K}(\cdot , \cdot ))\) formed of three algorithms for key generation, encryption, and decryption with a key K; all are efficiently computable. We follow the notion of security from [13] of a chosen-plaintext-secure (CPA) tweakable block cipher.
Definition 1
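A TBC is a \((q,t,\epsilon )\)-CPA-secure cipher if for any probabilistic adversary \(\mathcal {A}\) limited to t steps and q oracle queries, the advantage of distinguishing the TBC from \(\Pi \) is bounded by \(\epsilon \):
\[ \left| \Pr \left[ \mathcal {A}^{\mathcal {E}_{K}(\cdot , \cdot )} = 1 \right] - \Pr \left[ \mathcal {A}^{\Pi (\cdot , \cdot )} = 1 \right] \right| \leqslant \epsilon , \]
where \(K \in \mathcal {K}\) is selected at random and \(\Pi (T, \cdot )\) is defined as a random permutation for every T.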
In the standard model, tweakable block ciphers [4, 14] are used to construct tweakable format-preserving encryption schemes since tweakable encryption provides better security bounds for tweakable FPE in terms of the number of chosen plaintexts/ciphertexts needed to attack the system [4].
It is underlined in [8] that using the same round function F twice during an encryption process can introduce a security vulnerability. So, the domains of the tweaks in different rounds must be separated. For this, the scheme in [8] XORs the tweaks with a round counter. However, this way of separating domains is not fully effective. Indeed, the tweaks are known to the adversary and are under the adversary’s control in chosen-tweak attacks. Consider a 4-round Feistel network with tweaks \(T_{R}\) and \(T_{L}= T_{R} \oplus STR_{2}^{32}(1)\). For the first round, we have the tweak \(T_{R} \oplus STR_{2}^{32}(0) = T_{R}\) and for the second round we have \(T_{L} \oplus STR_{2}^{32}(1)=T_{R}\). Then, for the third round we have \(T_{R} \oplus STR_{2}^{32}(2)\) and for the fourth round \(T_{L} \oplus STR_{2}^{32}(3)=T_{R} \oplus STR_{2}^{32}(2)\). We observe the following behavior: rounds \(2 i \) and \(2 i +1\) use the same function \(F_i=F_K(T_R \oplus STR_{2}^{32}(2 i ), \cdot )\).
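The collision of round tweaks is easy to check numerically. The sketch below treats the 32-bit tweak halves as Python integers and computes only the tweak that each round passes to \(F_K\); the function name `round_tweaks` and the sample tweak value are assumptions for illustration.

```python
def round_tweaks(t_left, t_right, rounds):
    """Tweak given to F_K in round i: T_R xor i for even i, T_L xor i for odd i."""
    return [(t_right if i % 2 == 0 else t_left) ^ i for i in range(rounds)]

T_RIGHT = 0x0BADCAFE        # arbitrary 32-bit tweak half (illustrative value)
T_LEFT = T_RIGHT ^ 1        # adversarial choice T_L = T_R xor STR(1)

TWEAKS = round_tweaks(T_LEFT, T_RIGHT, 4)
for i, tweak in enumerate(TWEAKS):
    print(f"round {i}: tweak {tweak:#010x}")
# Rounds 0 and 1 share one tweak and rounds 2 and 3 share another, so each
# pair of consecutive rounds applies the same round function.
```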
For a variant of FF3 with \(\oplus \) instead of \(\boxplus \), we present a trivial attack. Consider an FF3 encryption with a key \(K \in \mathcal {K}\), a tweak \(T=T_{L} \Vert T_{R} \in \mathcal {T}\), and domain \(\mathcal {X}\). Each round \( i \) defines a random function \(F_{ i }=F_{K}(T_{R} \oplus STR_{2}^{32}( i ), \cdot )\) for \( i \) even (\(F_{ i }=F_{K}(T_{L} \oplus STR_{2}^{32}( i ), \cdot )\) for \( i \) odd). Figure 2(a) shows the encryption of an input message \(X=(L_{0}, R_{0})\) into the ciphertext \(Y=(L_{ w }, R_{ w })\), with output \(X_ i \) from each round. We assume that \( b \) is even so that \(\ell = r \). Now, we take the ciphertext Y from Fig. 2(a) and swap its halves into \((L'_{0},R'_0)= (R_{ w }, L_{ w })\) to encrypt it with a new tweak \(T'=T_{R} \oplus STR_{2}^{32}( w -1) \Vert T_{L} \oplus STR_{2}^{32}( w -1) \in \mathcal {T}\). We show this encryption in Fig. 2(b). We assume that \( w \) is a power of two (Fig. 2 uses \( w =8\)). With this encryption, we obtain the round functions \(F'_{ i }=F_{ w -1- i }\) as shown in Fig. 2(a). More precisely, the attack works as follows:

\(\circ \) Encrypt \((L_{0},R_{0})\) with the tweak T to get \((L_{ w }, R_{ w })\).
\(\circ \) Encrypt \((R_{ w }, L_{ w })\) with the tweak \(T'\) to get \((L', R')\).
\(\circ \) If \(L'=R_{0}\) and \(R'=L_{0}\), output 1. Otherwise, output 0.
The adversary always outputs 1 with \(\mathcal {E}_{K}\). It outputs 1 with \(\Pi (\cdot , \cdot )\) with probability \(\frac{1}{s^b}\). Therefore, the advantage is \(1-\frac{1}{s^b}\).
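The distinguisher can be exercised on a toy model of the XOR variant. The sketch below is not FF3 itself: the round function is a stand-in PRF built from SHA-256 (an assumption made only so the example is self-contained), each half is 8 bits wide, and the tweak halves are plain 32-bit integers. It follows the round structure described in Sect. 2, with even rounds updating the left half and odd rounds updating the right half.

```python
import hashlib

W = 8            # number of rounds; the argument needs W to be a power of two
HALF_BITS = 8    # toy width of each half (assumption, not an FF3 parameter)

def prf(key, tweak, x):
    """Stand-in for F_K(T, x): SHA-256 truncated to HALF_BITS bits."""
    data = key + tweak.to_bytes(4, "big") + x.to_bytes(4, "big")
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:4], "big") % (1 << HALF_BITS)

def encrypt_xor_variant(key, t_left, t_right, left, right):
    """FF3-like Feistel with XOR in place of modular addition."""
    for i in range(W):
        if i % 2 == 0:
            left ^= prf(key, t_right ^ i, right)
        else:
            right ^= prf(key, t_left ^ i, left)
    return left, right

def distinguisher(key, t_left, t_right, left, right):
    """Return 1 if re-encrypting the swapped ciphertext under T' gives (R_0, L_0)."""
    l_w, r_w = encrypt_xor_variant(key, t_left, t_right, left, right)
    t2_left, t2_right = t_right ^ (W - 1), t_left ^ (W - 1)   # the tweak T'
    l2, r2 = encrypt_xor_variant(key, t2_left, t2_right, r_w, l_w)
    return int((l2, r2) == (right, left))

print(distinguisher(b"toy key", 0x12345678, 0x9ABCDEF0, 17, 200))
```

The second encryption undoes the first one round by round because \((w-1) \oplus i = w-1-i\) whenever \(w\) is a power of two and \(0 \leqslant i < w\), so the call returns 1 for every key, tweak, and message in this toy model.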
4 Known-Plaintext Round-Function-Recovery Attack on Feistel Scheme
In this section, we define the Feistel network over a group of order \(\mathsf {N}\). Typically, this group is \(\mathbb {Z}_{\mathsf {N}}\). Later in Sect. 5, we assume \(\mathsf {b}\) is even and \(\mathsf {N}=s^{\frac{b}{2}}\).
First of all, we observe that the round functions are not uniquely defined by the codebook. Namely, if \((F_0,\ldots ,F_{r-1})\) is a solution that maps the given sample plaintexts to the corresponding ciphertexts, then we can construct many other solutions. Indeed, for any set of values \(\alpha _0,\ldots ,\alpha _{r-1}\) such that \(\alpha _1+\alpha _3+\alpha _5+\cdots =\alpha _0+\alpha _2+\alpha _4+\cdots =0\), we can define
\[ F'_j(u) = F_j(u - \alpha _{j-1} - \alpha _{j-3} - \alpha _{j-5} - \cdots ) + \alpha _j \]
for all j and u to obtain another solution. Therefore, we can fix one point arbitrarily in \(F_0,\ldots ,F_{r-3}\) when looking for a solution. All the other solutions are obtained by the above transformation of the round functions.
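This symmetry can be checked exhaustively on a small example. The sketch below assumes the Feistel convention used in the rest of this section (the state \((a, b)\) becomes \((b, a + F_j(b))\) in round j, with addition modulo N) and represents each round function as a lookup table; the helper names `feistel` and `shift_round_functions` are hypothetical.

```python
import random

def feistel(tables, n, x, y):
    """r-round Feistel over Z_n: (a, b) -> (b, a + F_j(b)) in round j."""
    a, b = x, y
    for table in tables:
        a, b = b, (a + table[b]) % n
    return a, b

def shift_round_functions(tables, alphas, n):
    """F'_j(u) = F_j(u - alpha_{j-1} - alpha_{j-3} - ...) + alpha_j."""
    shifted = []
    for j, table in enumerate(tables):
        offset = sum(alphas[k] for k in range(j - 1, -1, -2)) % n
        shifted.append([(table[(u - offset) % n] + alphas[j]) % n
                        for u in range(n)])
    return shifted

rng = random.Random(0)
N = 11
TABLES = [[rng.randrange(N) for _ in range(N)] for _ in range(4)]
ALPHAS = [3, 5, (-3) % N, (-5) % N]    # even- and odd-indexed sums are both 0
SHIFTED = shift_round_functions(TABLES, ALPHAS, N)

SAME_CODEBOOK = all(feistel(TABLES, N, x, y) == feistel(SHIFTED, N, x, y)
                    for x in range(N) for y in range(N))
print(SAME_CODEBOOK, SHIFTED != TABLES)
```

The two sets of tables differ but define the same permutation, which is why an attacker may fix one value of a round function for free.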
The rest of the section is organized as follows: in Sect. 4.1, we give a heuristic attack on the 3-round FN and analyze its time complexity. We report the success rate of the recovery in Fig. 3 as a function of the attack parameters. In Sect. 4.2, we give an attack on the 4-round FN that leverages our 3-round attack. Its correctness and further analysis are presented with formally stated lemmas. In Sect. 4.3, we extend our attack to five and more rounds and derive the time complexities.
4.1 Round-Function-Recovery on 3-Round Feistel Scheme
Consider a 3-round Feistel scheme with three round functions \(F_0, F_1, F_2\) and modular addition. Given x and y in \(\mathcal {X}\), we define:
\[ c = x + F_0(y), \quad t = y + F_1(c), \quad z = c + F_2(t). \]
Due to the symmetry of the set of solutions \((F_0,F_1,F_2)\) (as already observed), we can fix \(F_0\) on one point arbitrarily. The idea of our attack is to concentrate on data for which we know how to evaluate \(F_0\) so that we can deduce the output for the round function \(F_2\). Then, we concentrate on data for which we know how to evaluate \(F_2\) and we deduce more points in \(F_0\). We continue by alternating the deduction between \(F_0\) and \(F_2\) until we recover them all. When we continue iterating as described, we can fully recover the tables for all three round functions \((F_0, F_1, F_2)\). Our attack is presented in Algorithm 2 in more detail.
We model our set S as a bipartite graph with two parts of N vertices each (one for the y’s and the other for the t’s) and an edge for each (y, t) pair represented by a tuple from S. Our algorithm simply looks for the connected component of a random starting point y, with complexity \(O(\theta N)\). Following the theory of random graphs [19], we have \(\theta N\) random edges, so the graph is likely to be fully connected when \(\theta \approx \ln (N)\). For a constant \(\theta \geqslant 1\), it is likely to have a giant connected component. This component corresponds to a constant fraction of the tables of \(F_0\) and \(F_2\). Therefore, after \(\log _{\theta }N\) iterations, we can reconstruct \(F_0\) and \(F_2\), which allows us to reconstruct \(F_1\). Any given y does not appear in S with probability \(\left( 1- \frac{1}{N} \right) ^{\theta N} \approx e^{-\theta }\). Thus, we can only hope to recover a fraction \(1- e^{- \theta }\) of the table of \(F_0\). The same holds for \(F_1\) and \(F_2\). Therefore, with data and time complexity N, we recover a good fraction of all tables. With data and time complexity \(N\ln N\), we recover the full tables with good probability.
We implemented our attack. In Fig. 3, we plot the average fraction of recovered \(F_0\) values depending on \(\theta \) for several values of N. For this, we computed an average over 10,000 independent runs. For \(\theta =1\), the fraction is about \(40\%\). We also plot the fraction of the trials which fully recovered all functions. These two values can be taken as approximations of the expected fraction of the recovered table of \(F_0\) and of the probability to fully recover all functions, respectively. As we can see, the first value does not depend much on N (we have a giant connected component for \(\theta \) around 1), but the second one jumps for \(\theta \) proportional to \(\ln N\) (the graph becomes fully connected). For \(\theta =\ln N\), the probability is roughly \(\frac{1}{3}\).
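The alternating deduction between \(F_0\) and \(F_2\) can be sketched as a graph traversal. This is not a transcription of Algorithm 2: it is a minimal illustration that assumes the 3-round convention \(c = x + F_0(y)\), \(t = y + F_1(c)\), \(z = c + F_2(t)\) with ciphertext \((z, t)\), fixes \(F_0\) to 0 at one starting point, and, for brevity, feeds the whole codebook of a toy cipher instead of \(\theta N\) random samples. The names `encrypt3` and `recover3` are hypothetical.

```python
import random
from collections import defaultdict

def encrypt3(tables, n, x, y):
    c = (x + tables[0][y]) % n
    t = (y + tables[1][c]) % n
    z = (c + tables[2][t]) % n
    return z, t

def recover3(samples, n):
    """samples: list of (x, y, z, t). Returns partial tables as dictionaries."""
    by_y, by_t = defaultdict(list), defaultdict(list)
    for sample in samples:
        by_y[sample[1]].append(sample)
        by_t[sample[3]].append(sample)
    start = samples[0][1]
    f0, f1, f2 = {start: 0}, {}, {}      # F_0(start) = 0 is the free choice
    todo = [("y", start)]
    while todo:
        kind, value = todo.pop()
        if kind == "y":                  # F_0(y) known: deduce F_2(t)
            for x, y, z, t in by_y[value]:
                c = (x + f0[y]) % n
                f1[c] = (t - y) % n
                if t not in f2:
                    f2[t] = (z - c) % n
                    todo.append(("t", t))
        else:                            # F_2(t) known: deduce F_0(y)
            for x, y, z, t in by_t[value]:
                c = (z - f2[t]) % n
                f1[c] = (t - y) % n
                if y not in f0:
                    f0[y] = (c - x) % n
                    todo.append(("y", y))
    return f0, f1, f2

rng = random.Random(2017)
N = 16
TRUE_F = [[rng.randrange(N) for _ in range(N)] for _ in range(3)]
SAMPLES = [(x, y) + encrypt3(TRUE_F, N, x, y) for x in range(N) for y in range(N)]
F0, F1, F2 = recover3(SAMPLES, N)
print(len(F0), "of", N, "points of F_0 recovered")
```

The recovered tables generally differ from the true ones by the constant shifts described at the start of Sect. 4, but they encrypt every sample reached by the traversal to the correct ciphertext.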
4.2 Round-Function-Recovery on 4-Round Feistel Scheme
In this section, we give an attack to fully recover the round functions of a 4-round Feistel scheme.
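Consider a 4-round Feistel scheme with round functions \(F_{0}, F_{1}, F_{2}, F_{3}\). Given x and y in \(\mathcal {X}\), we define the following equations (see Fig. 4(a)):
\[ c = x + F_0(y), \quad d = y + F_1(c), \quad z = c + F_2(d), \quad t = d + F_3(z). \]
Assume that we collected M random pairwise different plaintext messages (xy). We collect the pairs:
\[ V = \{ (xy, x'y') \mid z'=z,\ t'-y'=t-y,\ xy \ne x'y' \} \]
and,
\[ V_{good} = \{ (xy, x'y') \mid z'=z,\ c'=c,\ xy \ne x'y' \} \]
where c, d, z, t (respectively \(c',d',z',t'\)) are defined from (xy) (respectively from \((x'y')\)) as above. We define \(Label(xy, x'y')=x-x'\).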
We form a directed graph \(G=(V,E)\) with the vertex set V as defined above. We take \((x_1y_1x'_1y'_1,x_2y_2 x'_2y'_2) \in E\) if \(y'_1=y_2\) (i.e., a pair \(x_{1}y_{1}x'_1y'_1\) is connected to a pair \(x_2y_2x'_2y'_2\) if the y value of the second message of the former pair equals the y value of the first message of the latter pair). Furthermore, we let \(E_{good}=(V_{good} \times V_{good}) \cap E\) and define the subgraph \(G_{good}=(V_{good}, E_{good})\).
Then, we have the following Lemma with four properties:
Lemma 1
Given a graph G with a vertex set V defined as above:

1. \(V_{good} \subseteq V\).
2. If \((xy,x'y') \in V\), then \(y \ne y'\).
3. If \((xy,x'y') \in V_{good}\), then \(F_{0}(y') - F_{0}(y) = Label(xy,x'y')\).
4. For all cycles \(v_{1}v_{2}\cdots v_{L}v_{1}\) of \(G_{good}\), \(\sum _{i=1}^{L}Label(v_{i})=0\).^{Footnote 3}
Proof
The proofs are straightforward:

1. Clearly, \(z'=z\) and \(c'=c\) imply that \(t'-y'=t-y\), hence \(V_{good} \subseteq V\).
2. If \(t'-y'=t-y\) and \(y'=y\), then \(t'=t\). If we further have \(z'=z\), then we deduce \(c'=c\). If \(c'=c\), then \(x'=x\), thus \(xy=x'y'\). Hence, we cannot have \((xy,x'y') \in V\).
3. If \(c'=c\) then \(F_0(y')-F_0(y)=x-x'=Label(xy,x'y')\).
4. Let \(v_i=(x_iy_i,x'_iy'_i)\). If \(v_i \in V_{good}\) then \(F_0(y'_i)-F_0(y_i)=Label(v_i)\). If we have a cycle then \(y'_i=y_{i+1}\) with \(y_{L+1}=y_1\). Hence, the sum telescopes and \(\sum _i Label(v_i)=0\). \(\square \)
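Properties 2 and 3 can be checked on a toy cipher. The sketch below assumes the 4-round equations \(c = x + F_0(y)\), \(d = y + F_1(c)\), \(z = c + F_2(d)\), \(t = d + F_3(z)\) and finds the vertices of V by bucketing messages on the observable pair \((z, t-y)\). The internal value c is kept only as ground truth to mark good vertices; a real adversary does not see it. The helper names are hypothetical and the full codebook is used only to keep the example small.

```python
import random
from collections import defaultdict
from itertools import permutations

def encrypt4(tables, n, x, y):
    c = (x + tables[0][y]) % n
    d = (y + tables[1][c]) % n
    z = (c + tables[2][d]) % n
    t = (d + tables[3][z]) % n
    return c, z, t            # c is returned for checking only

def build_vertices(messages, tables, n):
    """Return a list of (xy, x'y', label, is_good) for all vertices of V."""
    buckets = defaultdict(list)
    for x, y in messages:
        c, z, t = encrypt4(tables, n, x, y)
        buckets[(z, (t - y) % n)].append((x, y, c))
    vertices = []
    for group in buckets.values():
        # Ordered pairs of distinct messages with z' = z and t' - y' = t - y.
        for (x, y, c), (x2, y2, c2) in permutations(group, 2):
            vertices.append(((x, y), (x2, y2), (x - x2) % n, c == c2))
    return vertices

rng = random.Random(4)
N = 16
F = [[rng.randrange(N) for _ in range(N)] for _ in range(4)]
MESSAGES = [(x, y) for x in range(N) for y in range(N)]
VERTICES = build_vertices(MESSAGES, F, N)
GOOD = [v for v in VERTICES if v[3]]
print(len(VERTICES), "vertices,", len(GOOD), "of them good")
```

Bucketing is the collision-detection step: it finds all of V in roughly \(M \log M\) time instead of comparing all \(M^2\) pairs.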
The principle of our attack is as follows: if we get vertices in \(V_{good}\), property 3 of Lemma 1 gives equations that characterize \(F_{0}\). One problem is that we can identify vertices in V, but we cannot tell good and non-good (bad) ones apart. One way to recognize good vertices is to use property 4 of Lemma 1: find cycles with zero sum of labels. For this, we will prove in Lemma 4 that this is a characteristic property of good cycles, meaning that all the vertices in these cycles are good vertices. First, we estimate the number of vertices and edges with the following two lemmas.
Lemma 2
For \(x,y,x',y'\) random and \(F_{0}, F_{1}, F_{2}, F_{3}\) random,
Proof
We compute the following probabilities:
Hence,
\(\square \)
Lemma 3
The expected number of elements in \(V_{good}\) is \(\frac{M(M-1)\left( 1- \frac{1}{N} \right) }{N^{2}} \approx \frac{M^{2}}{N^{2}}\).
Proof
We have \(M(M-1)\) possible pairs of tuples \(xy,x'y'\) with \(xy\ne x'y'\) to construct \(V_{good}\). From Eq. (2), the probability that each such pair is in \(V_{good}\) is \(\frac{1}{N^{2}} \left( 1 - \frac{1}{N} \right) \). Thus, we expect to have \(\frac{M(M-1)\left( 1- \frac{1}{N} \right) }{N^{2}} \approx \frac{M^{2}}{N^{2}}\) elements in \(V_{good}\). \(\square \)
We have the property that for each cycle \(v_{1}v_{2} \cdots v_{L}v_{1} \in G\), if \(v_{1}, \ldots ,v_{L}\) are all in \(V_{good}\), then the sum of \(Label(v_{i})\) is zero due to Lemma 1, property 4. If one vertex is not good, the sum may be random. This suggests a way to find good vertices in V that is to look for long cycles in G with a zero sum of labels.
Lemma 4
(\(L=2\) case). If \(v_1=(x_1y_1,x'_1y'_1)\), we say that \(v_1\) and \(v_2\) are permuting if \(v_2=(x'_1y'_1,x_1y_1)\). If \(v_{1}v_{2}v_{1}\) is a cycle in G with zero sum of labels, and \(v_{1}, v_{2}\) are not permuting, then \(v_{1}\) and \(v_{2}\) are likely to be good. More precisely, for \(v_{1}=(x_{1}y_{1},x'_{1}y'_{1})\) and \(v_{2}=(x_{2}y_{2},x'_{2}y'_{2})\) random, we have \(\Pr [v_{1},v_{2} \in V_{good} \mid v_{1}v_{2}v_{1} ~ \text {is a cycle}, v_{1}, v_{2} ~\text {not permuting}, \sum _{i=1}^{2}Label(v_{i}) = 0] \geqslant \frac{1}{1+\frac{10}{N-5}}\).
The proof of Lemma 4 is in Appendix A.1. We believe that Lemma 4 remains true for valid cycles of small length except in trivial cases. In Appendix A.2, we extend it to \(L>2\) for cycles satisfying a special non-repeating condition \([\lnot \mathsf {repeat}]\) on the c and d values to rule out many trivial cases. However, this condition \([\lnot \mathsf {repeat}]\) cannot be checked by the adversary. Instead, we could just avoid repetitions of any message throughout the cycle (as repeating messages induce repeating c’s or d’s). We use the following conjecture (which is supported by experiment for \(L=3\)).
Conjecture 1
If \(v_1v_2 \cdots v_Lv_1\) is a cycle of length L in G with zero sum of labels and the vertices use no messages in common, then \(v_1, \ldots , v_L\) are all good with probability close to 1.
For M known plaintexts, the expected number of valid cycles in \(G_{good}\) of a given length L is \(\frac{M^{2L}}{N^{3L}}\).
The aim of our attack is to collect as many \(F_0\) outputs as possible to reconstruct a table of this function. Thus, we are interested in vertices \(v_i=(x_iy_i,x'_iy'_i)\) whose labels satisfy \(Label(v_i)=F_0(y'_i)-F_0(y_i), \forall i \in \{0,1, \ldots , |V|\} \), and we generate another graph to represent the collection of many independent equations for \(F_0\).
We have a valid cycle \(v_1v_2 \cdots v_Lv_1\) of length L in G when \(v_i \in V\),
\[ \sum _{i=1}^{L} Label(v_i) = 0, \]
and the vertices use no messages in common. Now, let us define an undirected graph \(G'=(V', E')\), where \(V'=\{0,1, \ldots , N-1\}\) and \(E'\) is defined as follows: for each vertex \(v_i=(x_iy_i,x'_iy'_i)\) in a valid cycle \(v_1v_2 \cdots v_Lv_1\) of length L, add \(\{ y_i,y'_i \}\) as an edge in \(E'\) with label set to \(Label(v_i)\). The purpose of such a graph \(G'\) is to put y values which depend on each other in a single connected component and to keep independent y values apart in separate connected components.
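Once \(G'\) is built, the values of \(F_0\) inside one connected component follow from a single fixed point by propagating labels along edges. The sketch below illustrates this with a breadth-first traversal, assuming each edge \(\{y, y'\}\) carries the label \(F_0(y') - F_0(y)\) as in property 3 of Lemma 1. The function name `propagate`, the toy table, and the hand-picked edges are assumptions for illustration.

```python
import random
from collections import defaultdict, deque

def propagate(edges, n, root, root_value=0):
    """edges: (y, y2, label) with label = F_0(y2) - F_0(y) mod n.

    Returns the inferred F_0 values on the connected component of root.
    """
    adjacency = defaultdict(list)
    for y, y2, label in edges:
        adjacency[y].append((y2, label))
        adjacency[y2].append((y, -label))     # traverse the edge backwards
    known = {root: root_value % n}
    queue = deque([root])
    while queue:
        y = queue.popleft()
        for y2, label in adjacency[y]:
            value = (known[y] + label) % n
            if y2 not in known:
                known[y2] = value
                queue.append(y2)
            elif known[y2] != value:
                raise ValueError("inconsistent labels, possibly a bad edge")
    return known

rng = random.Random(7)
N = 12
TRUE_F0 = [rng.randrange(N) for _ in range(N)]
PAIRS = [(0, 5), (5, 3), (3, 9), (9, 0), (2, 7)]   # (2, 7) is a separate component
EDGES = [(a, b, (TRUE_F0[b] - TRUE_F0[a]) % N) for a, b in PAIRS]
KNOWN = propagate(EDGES, N, root=0)
print(sorted(KNOWN))
```

Every additional component would need its own guessed value of \(F_0\), which is why a single large component is preferable. The consistency check also hints at how bad edges could be detected in a densely connected \(G'\).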
When we model \(G'\) as a random graph, we can adjust M so that we have a large connected component in \(G'\). Given the vertex set size \(|V'|=N\) and the number of edges \(|E'|=m\), we have \(m= \frac{N(N-1)}{2}p\), where p is the probability that \(G'\) has an edge between two vertices. From the Erdős-Rényi model [12] of random graphs, we want \(Np \geqslant 1\). We know that \(Np \sim 2\frac{m}{N}\). So, we want \(m \geqslant \frac{N}{2}\). We have \(\frac{M^{2L}}{L\cdot N^{3L}}\) expected good cycles (counted without repetition of their L circular rotations) of length L, thus \(m \sim \frac{M^{2L}}{N^{3L}}\). Therefore, we need to set \(M = \lambda N^{ \frac{3}{2}} \left( \frac{N}{2}\right) ^{\frac{1}{2L}}\) for a constant \(\lambda \geqslant 1\) to have a large connected component in \(G'\). Our attack works with \(M= N^{\frac{3}{2} + \epsilon }\) for \(\epsilon > 0\) small, with complexity \(O(2^{L}N^{(1+2\epsilon )L})\) and a constant probability of success. If our attack recovers at least \(\sqrt{N}\) points of \(F_0\) correctly (which is the case when we have a large connected component in \(G'\)), we obtain \(M\times \frac{\sqrt{N}}{N} \gg N\) samples to apply the attack on 3 rounds, so that it recovers a good fraction of \(F_1\), \(F_2\), \(F_3\). This is enough to bootstrap a yoyo attack (Steps 9–18 of Algorithm 3), and our attack succeeds.
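As a numerical illustration of this choice of M, the following sketch evaluates \(M = \lambda N^{3/2} (N/2)^{1/(2L)}\) and the resulting estimate \(m \sim M^{2L}/N^{3L}\) for \(N = 2^9\) and \(L = 3\). The function names are hypothetical and the figures are simply the formulas above evaluated in floating point.

```python
def data_complexity(n, cycle_len, lam=1.0):
    """M = lambda * N^(3/2) * (N/2)^(1/(2L)) known plaintexts."""
    return lam * n ** 1.5 * (n / 2) ** (1 / (2 * cycle_len))

def expected_edges(m_plaintexts, n, cycle_len):
    """m ~ M^(2L) / N^(3L), the estimated number of edges in G'."""
    return m_plaintexts ** (2 * cycle_len) / n ** (3 * cycle_len)

N, L = 2 ** 9, 3
M = data_complexity(N, L)
EDGES = expected_edges(M, N, L)
print(f"M is about {M:.0f} of {N * N} possible messages")
print(f"expected edges: {EDGES:.1f} (threshold N/2 = {N / 2:.0f})")
```

With \(\lambda = 1\) the estimate lands exactly on the threshold \(m = N/2\), and M stays well below the codebook size \(N^2\).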
Now, we give the full algorithm of our attack on the 4-round Feistel scheme.
Experimentally, we noticed that \(\lambda =0.8\) is too small to obtain a large enough connected component for \(L=3\). Conversely, for \(\lambda =2\), \(G'\) is more connected but the giant component contains many bad edges that we want to avoid.
Let \(E_{j}\) be the event that the sizes of the \(\mathsf {j}\) largest connected components sum to more than \(\sqrt{N}\) with no bad edges in \(G'\). Let \(E_{\leqslant j}\) be the event that any of \(E_1, E_2, \ldots ,E_j\) occurs. We simulated the attack for various N values and \(\lambda =1,2,3\) and report the numbers for \(E_{\leqslant 1}, E_{\leqslant 2}, E_{\leqslant 3}\) in Table 2. Reading the table with \(\lambda =1\) and \(\mathsf {j}=3\), our attack recovers \(\sqrt{N}\) points of \(F_0\) with probability at least \(23~\%\). In our attack, if we look at \(\mathsf {j}\) connected components, we need to multiply the complexity by \(N^{j-1}\) (we can fix \(F_0\) on one point for free, then all values in its connected component are inferred, but for each additional connected component, we must guess one value of \(F_{0}\)). It is likely that we can mitigate this \(N^{j-1}\) factor by early abort during the attack on 3 rounds.
In our experiments, we observe the best success probability of our attack with \(\lambda =1\). With larger \(\lambda \), the attack hardly ever succeeds. It may look paradoxical to say that the attack fails if \(\lambda \) is too large, but this is due to the higher chance of collecting bad edges. However, when \(G'\) is heavily connected, we could propose algorithms to eliminate inconsistencies in labels and get rid of bad edges. It means that we would have a successful attack for any \(\lambda \geqslant 2\). We leave it as future work.
Therefore, we have a double phase transition. The first phase transition occurs when we have enough data to build the graph and find cycles. Our attack quickly succeeds after this phase transition. The second phase transition occurs when we start having bad edges in the collected cycles. Beyond it, our attack must be enriched to keep working. We deliberately did not do so, as we noticed that there is a sufficient window between these two phase transitions to break the scheme with good probability of success and without caring about possible bad edges.
In Table 3, we show the experimental success probability of the entire attack for various strategies. Let \(S_j\) be the event associated with strategy j. In \(S_1\), we accumulate the three largest connected components and abort unless the accumulated size is at least \(\sqrt{N}\) and they have no bad edges. That is, \(S_1\) is exactly \(E_{\leqslant 3}\). In \(S_2\), we just look at the largest connected component and fail unless it has no bad edges in \(G'\) (we remove the condition that the size of the connected component is greater than \(\sqrt{N}\)). In \(S_3\) (resp. \(S_4\)), we look at the two largest (resp. three largest) connected components that have no bad edges. Table 3 reports the probability \(\Pr _{succ}\) that \(S_i\) occurs and we recover the entire table of each round function. These strategies are considered for experimental purposes, even though our theoretical results suggest conditioning on the size of the connected component.
The data complexity of our attack in Algorithm 3 is \(M=O(N^{\frac{3}{2}+\frac{1}{2L}})\). We compute the time complexity of the algorithm based on steps 2, 3, 4, and 5, since the other steps are much shorter. In step 2, creating our graph G means forming the vertices of G. This can be done in \(M \log (M)\) time with collision detection on the M known plaintext/ciphertext pairs. In step 3, we look for cycles of length L. These can be found by multiplication with the adjacency matrix (which is sparse). One such multiplication can be done in \(O(V^2d)\), where \(d=\frac{E}{V}\) is the average degree of a vertex; therefore, its complexity is O(VE). With the Floyd-Warshall algorithm, we need \((L-1)\) multiplications by the adjacency matrix in the max-plus algebra, which leads to a complexity of O(LVE). With \(E \sim \frac{V^2}{N}\), where \(V=2\frac{M^2}{N^2}=2^{3-\frac{1}{L}}N^{1+\frac{1}{L}}\) and L constant, we obtain \(O(\frac{V^3}{N})\), which is equal to \(O(N^{2+\frac{3}{L}})\). Another method to find cycles is to enumerate all L-tuples of vertices in \(O(V^L)\), which is \(O(N^{L+1})\). Taking the minimum of the two methods gives \(O(N^3)\) for any L, and this is the complexity of step 3. (It can even be lower for \(L>3\).) Step 4 takes N time and finally step 5 takes \(\frac{M^{2L}}{N^{3L}}=\frac{N}{2}\). Since the complexity is dominated by step 3, the time complexity of our algorithm is \(O(N^3)\) for \(L=3\) and a smaller \(O(N^{2+\frac{3}{L}})\) for \(L>3\). Instead of \(L-1\) multiplications by a sparse matrix in the max-plus algebra, we could also use \(O(\log L)\) general-purpose matrix multiplications over the integers with the Coppersmith-Winograd algorithm [10]. We would reach a complexity of \(O(V^{2.38}\log L)\), which is not better.
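To illustrate the O(LVE) cost of step 3, here is a minimal sketch that propagates a reachability frontier for L steps from every vertex, which amounts to L multiplications by the sparse adjacency matrix. It works in the Boolean algebra rather than the max-plus algebra mentioned above, and it detects closed walks of length L (which are not necessarily simple cycles), so it is a simplified stand-in rather than the procedure of Algorithm 3. The function name and the edge-list input format are assumptions made for this example.

```python
def vertices_on_closed_walk(edges, length):
    """Return the vertices v that have a closed walk v -> ... -> v of exactly `length` edges.

    `edges` is an iterable of directed pairs (u, v). Each start vertex costs at most
    `length` passes over the edge set, so the total work is O(length * V * E).
    """
    adjacency = {}
    vertices = set()
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        vertices.update((u, v))

    hits = set()
    for start in vertices:
        frontier = {start}
        for _ in range(length):
            # One sparse "matrix-vector product" in the Boolean algebra.
            frontier = {w for u in frontier for w in adjacency.get(u, ())}
            if not frontier:
                break
        if start in frontier:
            hits.add(start)
    return hits
```

For instance, on the edge list `[(0, 1), (1, 2), (2, 0), (2, 3)]` with `length=3`, the vertices 0, 1, and 2 lie on the triangle, whereas vertex 3 does not.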
4.3 Round-Function-Recovery on 5-Round Feistel Scheme and More
Given the 4-round full recovery attack from Sect. 4.2, we can extend it to attack a 5-round Feistel network. The extension is straightforward; it uses chosen plaintexts and a guessing strategy. First of all, consider our 4-round attack and the known plaintexts it requires. We choose the plaintexts for the 5-round attack so that the right halves of the messages take as few distinct values as possible, and then guess the corresponding images through \(F_0\). That is, we generate all possible partial tables of the first round function on these right-half values. Then, we identify the consistent table by running the attack on the following 4 rounds. The data complexity of our 4-round attack is \(\lambda N^{\frac{3}{2}+\epsilon }\), so the right halves take about \(\lambda N^{\frac{1}{2}+\epsilon }\) distinct values; hence the time complexity of our 5-round recovery with chosen plaintexts is \(O(N^{\lambda N^{\frac{1}{2}+\epsilon }+3})\). The data complexity is unchanged.
We can attack \(r\) rounds similarly with complexity \(O(N^{(r-5)N+\sqrt{N}+3})\) by guessing the round functions of the last \((r-5)\) rounds. The data complexity is unchanged. We can apply this to FF1 (\(r=10\)) and FF3 (\(r=8\)). We obtain a complexity lower than \(2^{128}\) for FF1 with \(N=7\) and for FF3 with \(7\leqslant N\leqslant 10\). (For lower N, exhaustive search on either the codebook or the round functions reaches the same conclusion.) Hence, these instances of FF1 and FF3 do not offer 128-bit security.
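The thresholds quoted above can be checked with a few lines of arithmetic. The sketch below evaluates \(\log _2\) of \(N^{(r-5)N+\sqrt{N}+3}\), ignoring the constant hidden in the O-notation, and lists the values of N for which it stays below 128. The function names are assumptions made for this example.

```python
import math


def log2_attack_cost(rounds, n):
    """log2 of N^((r-5)N + sqrt(N) + 3), with the O-constant ignored."""
    exponent = (rounds - 5) * n + math.sqrt(n) + 3
    return exponent * math.log2(n)


def weak_sizes(rounds, candidates, security_bits=128):
    """Values of N among `candidates` whose estimated cost is below 2^security_bits."""
    return [n for n in candidates if log2_attack_cost(rounds, n) < security_bits]
```

With `rounds=8` (FF3) and candidates 7 to 12, this returns 7, 8, 9, 10; with `rounds=10` (FF1), it returns only 7, in line with the statement above.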
5 Slide Attack on FF3
We developed an attack on the 4-round Feistel network in Sect. 4, and we deploy it as a building block for our chosen-plaintext and chosen-tweak attack on the FF3 scheme. Our FF3 attack aims to reconstruct the entire codebook for a challenge tweak with a number of queries that is lower than that of the brute-force codebook attack. The main idea of the FF3 attack is to take advantage of the flexibility to change the tweak in order to permute the round functions.
Consider two functions G and H, where G is a 4-round Feistel scheme using the tweakable block cipher F with tweaks \((T_R \oplus STR^{32}_2(0), T_L \oplus STR^{32}_2(1), T_R \oplus STR^{32}_2(2), T_L \oplus STR^{32}_2(3))\) and H is a 4-round Feistel scheme using the tweakable block cipher F with tweaks \((T_R \oplus STR^{32}_2(4), T_L \oplus STR^{32}_2(5), T_R \oplus STR^{32}_2(6), T_L \oplus STR^{32}_2(7))\). In Fig. 5, we show two runs of FF3 encryption with tweak \(T=T_L \Vert T_R\) in (a) and tweak \(T'= T_L \oplus STR^{32}_2(4) \Vert T_R \oplus STR^{32}_2(4)\) in (b) on two distinct plaintexts. We observe that \(FF3.E(K,T, \cdot )= H \circ G\) and \(FF3.E(K,T', \cdot )= G \circ H\). For simplicity, we do not explicitly write \(STR^{32}_{2}(\cdot )\) any longer. Given this ability to permute the round functions by choosing the tweaks that are XORed with the round indices, we want to obtain a “cyclic” behavior of plaintext/ciphertext pairs under two FF3 encryptions by sliding G and H.
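The identity \(FF3.E(K,T, \cdot )= H \circ G\) and \(FF3.E(K,T', \cdot )= G \circ H\) can be checked on a toy model. The sketch below is not the FF3 specification: it assumes an 8-round Feistel network over \(\mathbb {Z}_N \times \mathbb {Z}_N\) with modular addition, a hypothetical round function built from SHA-256 in place of the AES-based one, and a plain integer XOR of the round index with a 32-bit tweak word in place of \(STR^{32}_2(\cdot )\). All names and the value of `N` are chosen for this example only.

```python
import hashlib

N = 16  # toy size of each half; the full domain has N * N points


def round_fn(key, tweak_word, x):
    """Hypothetical keyed round function with values in Z_N (stand-in for F)."""
    data = key + tweak_word.to_bytes(4, "big") + x.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big") % N


def feistel(key, t_left, t_right, rounds, state):
    """Apply the listed rounds; round i uses T_R (i even) or T_L (i odd), XORed with i."""
    left, right = state
    for i in rounds:
        word = t_right if i % 2 == 0 else t_left
        left, right = right, (left + round_fn(key, word ^ i, right)) % N
    return left, right


def encrypt(key, t_left, t_right, state):
    """Full 8-round toy encryption under the tweak (t_left, t_right)."""
    return feistel(key, t_left, t_right, range(8), state)


def g_part(key, t_left, t_right, state):
    """G: rounds 0..3 under the original tweak."""
    return feistel(key, t_left, t_right, range(0, 4), state)


def h_part(key, t_left, t_right, state):
    """H: rounds 4..7 under the original tweak."""
    return feistel(key, t_left, t_right, range(4, 8), state)
```

In this model, XORing 4 into both tweak words maps round index i to \(i \oplus 4\), which swaps the blocks of rounds 0..3 and 4..7 while preserving parity, so `encrypt(key, t_left ^ 4, t_right ^ 4, s)` equals `g_part(key, t_left, t_right, h_part(key, t_left, t_right, s))` for every state `s`.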
We pick at random two sets of messages \(X=\{xy_{0}^{1}, \ldots , xy_{0}^{i}, \ldots ,\) \(xy_{0}^{A}\}\) and \(\overline{X}=\{ \overline{xy}_{0}^{1}, \ldots , \overline{xy}_{0}^{i}, \ldots , \overline{xy}_{0}^{A}\}\) of size A. For each message \(xy_{0}^{i}\) in X, set \(xy_{j+1}^{i}= Enc(K,T, xy_{j}^{i})\) with a fixed tweak \(T \in \mathcal {T}\) and a fixed key \(K \in \mathcal {K}\). We repeat the chain encryption of outputs B times for each message in X. Let XC be the set of chain encryptions of the elements of X. It contains segments of length B of cycles of \(H \circ G\). Similarly, for each message \(\overline{xy}_{0}^{i}\) in \(\overline{X}\), set \(\overline{xy}_{j+1}^{i}= Enc(K,T', \overline{xy}_{j}^{i})\) with the fixed tweak \(T' \in \mathcal {T}\) under the same key K. Let \(\overline{XC}\) be the set of chain encryptions of the elements of \(\overline{X}\). Clearly, we have \(|XC|=AB\) and \(|\overline{XC}|=AB\). Given these two sets XC and \(\overline{XC}\), we attempt to find a collision between XC and \(\overline{XC}\) such that \(G(xy_{j}^{i})=\overline{xy}_{0}^{i'}\) or \(G(xy_{0}^{i})=\overline{xy}_{j'}^{i'}\) for \(1 \leqslant i,i' \leqslant A\) and \(1\leqslant j,j' \leqslant B\). (See Fig. 6.) Once we have a table of inputs and outputs of G and H, we can apply the known-plaintext recovery attack on 4-round Feistel networks. The concrete algorithm to collect plaintext/ciphertext pairs is given in Algorithm 4.
We now formally prove useful results for the analysis and success probability of the attack in Algorithm 4.
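The sliding argument behind this collection step can be illustrated as follows. If \(G(xy_{j}^{i})=\overline{xy}_{0}^{i'}\), then \(G(xy_{j+k}^{i})=\overline{xy}_{k}^{i'}\) for every k, because \(G \circ (H \circ G)^k = (G \circ H)^k \circ G\); each correct hypothesis therefore yields a whole list of input/output pairs for G. The sketch below is not Algorithm 4 itself: it models G and H as random permutations on a small domain, and the helper names are assumptions made for this example.

```python
import random


def random_permutation(size, seed):
    """A pseudo-random permutation of range(size), used as a toy G or H."""
    table = list(range(size))
    random.Random(seed).shuffle(table)
    return table


def make_chain(enc, start, length):
    """Return [start, enc(start), enc(enc(start)), ...] with `length` encryptions."""
    chain = [start]
    for _ in range(length):
        chain.append(enc(chain[-1]))
    return chain


def candidate_pairs(chain, chain_bar, j):
    """Pairs (x, G(x)) implied by the hypothesis G(chain[j]) == chain_bar[0]."""
    return list(zip(chain[j:], chain_bar))
```

In the attack the hypothesis is not known in advance, so every triple \((i, i', j)\) is tried and the resulting pairs are handed to the 4-round recovery attack, which rejects wrong hypotheses.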
Let \(\Pi \) be a random permutation on \(\{0,\ldots ,N^2-1\}\). Let \(c_k\) be the number of cycles of length k in \(\Pi \). The total number of elements over all cycles is \(N^2\), meaning that \(\sum _{k=1}^{N^2}(kc_{k})=N^2\). It is well known that the expected number of cycles of length k over a random \(\Pi \) is \(\mathbb {E}_{\Pi }(c_{k})=\frac{1}{k}\).^{Footnote 4}
In what follows we show two useful results.
Lemma 5
For a message \(xy^{i}\) picked at random, let \(length(xy^{i})\) be the length of the cycle that contains \(xy^{i}\). For two messages \(xy^{i}\) and \(\overline{xy}^{i'}\) picked at random, let \(E_0\) be the event that \(xy^{i}\) and \(\overline{xy}^{i'}\) are in the same cycle. The expected value of \(length(xy^{i})\) is \(\mathbb {E}_{xy^{i},\Pi }[length(xy^{i})]=\frac{N^2 +1}{2} \) and the expected value of \(length(xy^{i})\) given \(E_{0}\) is \(\mathbb {E} [length(xy^{i}) \mid E_0] = \frac{2N^2 +1}{3}\).
Proof
We use the same notation for \(c_k\) as above.
We first observe that, for any messages \(xy^{i}\) and \(\overline{xy}^{i'}\), being in the same cycle (over every possible length) occurs with probability close to \(\frac{1}{2}\). Then,
\(\square \)
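The computation behind Lemma 5 can be sketched as follows. This is a reconstruction consistent with the statement of the lemma, not the original display; it assumes that the cycle length of a uniformly random point is uniform on \(\{1,\ldots ,N^2\}\) (see Footnote 4) and that the second message is an independent uniform point, so that it falls in a given cycle of length k with probability \(\frac{k}{N^2}\).

\[
\mathbb {E}[length(xy^{i})] = \sum _{k=1}^{N^2} k \cdot \frac{1}{N^2} = \frac{N^2+1}{2}, \qquad \Pr [E_0] = \sum _{k=1}^{N^2} \frac{1}{N^2} \cdot \frac{k}{N^2} = \frac{N^2+1}{2N^2} \approx \frac{1}{2},
\]

\[
\mathbb {E}[length(xy^{i}) \mid E_0] = \frac{1}{\Pr [E_0]} \sum _{k=1}^{N^2} k \cdot \frac{1}{N^2} \cdot \frac{k}{N^2} = \frac{2N^2}{N^2+1} \cdot \frac{(N^2+1)(2N^2+1)}{6N^2} = \frac{2N^2+1}{3}.
\]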
This means that if we pick \(xy^{i}\) and \(\overline{xy}^{i'}\) at random and let \(xy^{j}=G^{-1}(\overline{xy}^{i'})\) then \(xy^{i}\) and \(\overline{xy}^{i'}\) are in the same cycle with probability close to \(\frac{1}{2}\) and we will observe Fig. 6. One problem is that the cycle is typically long, i.e., of length about \(\frac{2N^2}{3}\) as shown in Lemma 5, but we want the two segments of length B starting from \(xy^{i}\) and \(\overline{xy}^{i'}\) to intersect on at least M points. Therefore, we need the probability that two segments overlap on at least M points in a cycle of length k.
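The two expectations of Lemma 5 can be verified exactly on a tiny domain by enumerating all permutations. In the sketch below, `n` plays the role of \(N^2\), the second point is taken uniform and independent of the first, and the function names are assumptions made for this example. Exhaustive enumeration is only feasible for very small `n`.

```python
from fractions import Fraction
from itertools import permutations


def cycle_lengths(perm):
    """For each point x, the length of the cycle of `perm` that contains x."""
    n = len(perm)
    lengths = [0] * n
    seen = [False] * n
    for start in range(n):
        if seen[start]:
            continue
        cycle = []
        x = start
        while not seen[x]:
            seen[x] = True
            cycle.append(x)
            x = perm[x]
        for v in cycle:
            lengths[v] = len(cycle)
    return lengths


def exact_expectations(n):
    """Return (E[length], E[length | both points in the same cycle]) over all permutations."""
    total_len = 0
    total_sq = 0
    count = 0
    for perm in permutations(range(n)):
        for length in cycle_lengths(perm):
            total_len += length
            # A second uniform point lands in the same cycle with probability length / n,
            # so the conditional expectation is sum(length^2) / sum(length).
            total_sq += length * length
            count += 1
    return Fraction(total_len, count), Fraction(total_sq, total_len)
```

For `n = 5`, the result is \((3, \frac{11}{3})\), matching \(\frac{n+1}{2}\) and \(\frac{2n+1}{3}\).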
Lemma 6
Let \(E_1^{k}\) be the event that the two segments \(xy^{i} - \Pi (xy^{i}) - \Pi ^2(xy^{i}) - \cdots - \Pi ^{B}(xy^{i})\) and \(\overline{xy}^{i'} - \Pi (\overline{xy}^{i'}) - \Pi ^2(\overline{xy}^{i'}) - \cdots - \Pi ^{B}(\overline{xy}^{i'})\) overlap in a given cycle of length k on at least M points. Let \(E_1\) be the union of the \(E_1^{k}\) over every possible length k. The probability that \(E_{1}\) occurs is equivalent to \(\frac{2(B-M)}{N^2}\) for \(M=o(N^2)\).
Proof
We use the same notation for \(c_k\) as above.
\(\square \)
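A heuristic computation consistent with the statement of Lemma 6 is the following. It is a sketch under simplifying assumptions (the cycle is long compared to B so that wrap-around is ignored, and the offset between two points of the same cycle is uniform), not the original proof. Two segments of length B in the same cycle share at least M points when the offset between their starting points is at most about \(B-M\) in either direction, so

\[
\Pr [E_1^{k} \mid \text {both points lie in the same cycle of length } k] \approx \frac{2(B-M)}{k}.
\]

Summing over the cycle length, with the same weights \(\frac{1}{N^2}\) and \(\frac{k}{N^2}\) as in Lemma 5,

\[
\Pr [E_1] \approx \sum _{k=1}^{N^2} \frac{1}{N^2} \cdot \frac{k}{N^2} \cdot \frac{2(B-M)}{k} = \frac{2(B-M)}{N^2}.
\]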
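The probability of success of our FF3 attack depends on \(\Pr [E_1]\) and on the success probability of our 4-round recovery attack on the Feistel network. More precisely, since there are \(A^2\) pairs of segments, \(p_{success}= \left( 1-\left( 1-\Pr [E_1]\right) ^{A^{2}}\right) p_{success}^{Feistel}\),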
which is equivalent to \(\left( 1- e^{-\frac{2(B-M)A^{2}}{N^2}}\right) p_{success}^{Feistel}\). Thus, we need \(A^2(B-M) \approx N^2\) to obtain a constant \(p_{success}\). We can neglect the cost of the attack on H as we have plenty of samples and we only run it once G is recovered.
Our attack has data complexity 2AB. The time complexity is \(A^{2}B\) times the complexity of the 4-round recovery attack on the Feistel network. To minimize the data complexity 2AB subject to \(A^{2}(B-M) = N^{2}\) and \(B \geqslant M\), we set \(B=2M\), and then \(A=\frac{N}{\sqrt{M}}\). Therefore, the data complexity of the FF3 attack is \(4N\sqrt{M}\), the time complexity is \(2N^2\) times the complexity of the 4-round recovery attack on the Feistel network, and \(p_{success} \approx \left( 1 - e^{-2}\right) p_{success}^{Feistel}\).
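The parameter choice above can be turned into a small calculator. The sketch below takes the half-domain size N and the number M of samples required by the 4-round attack, sets \(B=2M\) and \(A=\frac{N}{\sqrt{M}}\), and reports the resulting data complexity, the number of runs of the 4-round attack, and the probability that some pair of segments overlaps on at least M points. The function name and the dictionary keys are assumptions made for this example, and A is left as a real number rather than rounded.

```python
import math


def ff3_attack_parameters(n, m):
    """Parameters for B = 2M and A = N / sqrt(M), so that A^2 (B - M) = N^2."""
    b = 2 * m
    a = n / math.sqrt(m)
    return {
        "A": a,
        "B": b,
        "data": 2 * a * b,              # equals 4 N sqrt(M)
        "feistel_runs": a * a * b,      # equals 2 N^2
        "slide_probability": 1 - math.exp(-2 * a * a * (b - m) / n ** 2),
    }
```

For example, `ff3_attack_parameters(64, 16)` gives A = 16, B = 32, a data complexity of 1024 and 8192 runs of the 4-round attack; the value M = 16 is arbitrary here and is not the M prescribed by the 4-round attack for N = 64.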
We fully implemented the attack, but to test its success probability we could skip the parts of the computation where we knew the attack would fail. Namely, in Algorithm 4 we can identify directly which segments overlap (using the key) and proceed directly to the 4-round Feistel attack on the right pair of segments. We show in Table 4 the experimental probability of success of the whole attack following the strategies \(S_j\), \(j=1,\ldots ,4\). The probability was computed over 10,000 executions.^{Footnote 5} We also counted the executions collecting fewer than M samples, as long as they succeeded in recovering all tables. Curiously, the \(N\leqslant 4\) and \(\lambda =1\) cases seem to take M too low to be able to find cycles. As we can see, the success probability is good (\(18\%\)–\(77\%\) for \(8\leqslant N\leqslant 512\)) for \(\lambda =1\) and the strategy \(S_2\), which uses the largest connected component in \(G'\).
We conclude that the full attack succeeds with good probability.
6 Repairing FF3
As a quick fix, we propose to change the length of the tweak in FF3 so that the adversary no longer has control over what is XORed with the round index. The same should hold if some other part of the tweak is XORed with a counter in a CBC mode, as proposed by the authors of the construction [8]. We obtain a scheme with a shorter tweak, to which we concatenate the round index instead of XORing it.
The original Luby-Rackoff result [15] was extended following this idea by Black and Rogaway [7], but the resulting security guarantee is quite weak: we can only prove that the cipher resists chosen-plaintext attacks for a number of queries \(q\ll \sqrt{N}\), even with only three rounds. By similarly extending the results by Patarin [17], we obtain that the cipher resists chosen-plaintext and ciphertext attacks for \(q\ll N\), even with only six rounds. However, this says nothing about the case \(q\sim N^{\frac{3}{2}}\), which is the regime of our 4-round attack.^{Footnote 6}
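The difference between XORing and concatenating the round index can be seen on a toy encoding. The sketch below is only an illustration of domain separation: the function names, the integer encoding, and the 4-bit field for the round index are assumptions made for this example and are not taken from any specification.

```python
def xor_round_input(tweak_word, round_index):
    """Round-function tweak input when the round index is XORed into the tweak."""
    return tweak_word ^ round_index


def concat_round_input(tweak_word, round_index, index_bits=4):
    """Round-function tweak input when the round index is appended in its own field."""
    if not 0 <= round_index < (1 << index_bits):
        raise ValueError("round index does not fit in the reserved field")
    return (tweak_word << index_bits) | round_index
```

With the XOR encoding, the tweak word `t` in round 0 and the tweak word `t ^ 4` in round 4 give the same input, which is what the slide attack exploits. With the concatenated encoding, distinct (tweak word, round index) pairs always give distinct inputs.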
7 Conclusion
We took the NIST standard FF3 and investigated its security on small domain sizes. We started by exploiting the fact that we can permute the round functions, due to a bad domain separation in the tweak scheme, which uses an XOR with the round index. This permutation leads us to develop a slide attack on FF3 based on our own attack on 4-round Feistel schemes, which works with known plaintexts/ciphertexts. Our FF3 attack works with chosen plaintexts and two tweaks. It improves on the recent results of Bellare et al. [3] in the data and time complexity needed to break FF3. Our 4-round Feistel network attack is a full round-function-recovery attack that works with known plaintexts, unlike the recent results of Biryukov et al. [6], which require chosen plaintexts and ciphertexts.
Notes
 1.
In an r-round FN, q samples give \(2q\log _2N\) bits of information but functions are defined by a table of \(rN\log _2N\) bits. Thus, \(q=\frac{r}{2}N\) queries are enough to reconstruct the round functions, in theory.
 2.
We consider here the FF3 block cipher. However, there is a mode of operation for FF3 allowing variable-length messages in the original paper [8].
 3.
Note that the cycle length notation L should not be confused with the subscript L indicating the left part of a plaintext or a ciphertext.
 4.
The probability that a given point is in a cycle of length exactly \(\mathsf {k}\) is \(\frac{(N^2-1)\cdots (N^2-k+1)}{N^2(N^2-1) \cdots (N^2-k+1)}=\frac{1}{N^2}\). Hence, the expected number of points in a cycle of length \(\mathsf {k}\) is \(1=\mathbb {E}_{\Pi }(kc_k)\).
 5.
Executions of the attack on the 4-round Feistel scheme which we used to fill our previous tables are precisely those getting the M samples in this experiment. For some rows with M too large, no experiment collected M pairwise different messages, so they are not reported in the previous table. Nevertheless, our attack may still work even though we collect fewer than M samples. This is why they appear in Table 4.
 6.
In reaction to this attack, NIST released the following announcement:
https://beta.csrc.nist.gov/News/2017/RecentCryptanalysisofFF3.
References
Recommendation for Block Cipher Modes of Operation: Methods for Format Preserving Encryption. National Institute of Standards and Technology (2016)
Anderson, R., Biham, E.: Two practical and provably secure block ciphers: BEAR and LION. In: Gollmann, D. (ed.) FSE 1996. LNCS, vol. 1039, pp. 113–120. Springer, Heidelberg (1996). doi:10.1007/3540608656_48
Bellare, M., Hoang, V.T., Tessaro, S.: Message-recovery attacks on Feistel-based format preserving encryption. In: 23rd CCS Proceedings (2016)
Bellare, M., Ristenpart, T., Rogaway, P., Stegers, T.: Format-preserving encryption. In: Jacobson, M.J., Rijmen, V., Safavi-Naini, R. (eds.) SAC 2009. LNCS, vol. 5867, pp. 295–312. Springer, Heidelberg (2009). doi:10.1007/9783642054457_19
Bellare, M., Rogaway, P., Spies, T.: The FFX mode of operation for format-preserving encryption. Draft 1.1. Submission to NIST, February 2010. http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/ffx/ffxspec.pdf
Biryukov, A., Leurent, G., Perrin, L.: Cryptanalysis of Feistel networks with secret round functions. In: Dunkelman, O., Keliher, L. (eds.) SAC 2015. LNCS, vol. 9566, pp. 102–121. Springer, Cham (2016). doi:10.1007/9783319313016_6
Black, J., Rogaway, P.: Ciphers with arbitrary finite domains. In: Preneel, B. (ed.) CT-RSA 2002. LNCS, vol. 2271, pp. 114–130. Springer, Heidelberg (2002). doi:10.1007/3540457607_9
Brier, E., Peyrin, T., Stern, J.: BPS: a format-preserving encryption proposal. http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/bps/bpsspec.pdf
Brightwell, M., Smith, H.E.: Using datatype-preserving encryption to enhance data warehouse security (1997). http://csrc.nist.gov/nissc/1997/proceedings/141.pdf
Coppersmith, D., Winograd, S.: Matrix multiplication via arithmetic progressions. J. Symb. Comput. 9(3), 251–280 (1990)
Dinur, I., Dunkelman, O., Keller, N., Shamir, A.: New attacks on Feistel structures with improved memory complexities. In: Gennaro, R., Robshaw, M. (eds.) CRYPTO 2015. LNCS, vol. 9215, pp. 433–454. Springer, Heidelberg (2015). doi:10.1007/9783662479896_21
Erdős, P., Renyi, A.: On random graphs I. Publicationes Mathematicae 6, 290–297 (1959)
Goldenberg, D., Hohenberger, S., Liskov, M., Schwartz, E.C., Seyalioglu, H.: On tweaking Luby-Rackoff blockciphers. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS, vol. 4833, pp. 342–356. Springer, Heidelberg (2007). doi:10.1007/9783540769002_21
Liskov, M., Rivest, R.L., Wagner, D.: Tweakable block ciphers. J. Cryptol. 24(3), 588–613 (2011)
Luby, M., Rackoff, C.: How to construct pseudorandom permutations from pseudorandom functions. SIAM J. Comput. 17(2), 373–386 (1988)
Lucks, S.: Faster Luby-Rackoff ciphers. In: Gollmann, D. (ed.) FSE 1996. LNCS, vol. 1039, pp. 189–203. Springer, Heidelberg (1996). doi:10.1007/3540608656_53
Patarin, J.: Security of balanced and unbalanced Feistel schemes with linear non equalities (2010). http://eprint.iacr.org/2010/293
Rogaway, P.: A synopsis of format preserving encryption. http://web.cs.ucdavis.edu/~rogaway/papers/synopsis.pdf
Saltykov, A.I.: The number of components in a random bipartite graph. Discrete Math. Appl. 5, 515–523 (1995)
Schneier, B., Kelsey, J.: Unbalanced Feistel networks and block cipher design. In: Gollmann, D. (ed.) FSE 1996. LNCS, vol. 1039, pp. 121–144. Springer, Heidelberg (1996). doi:10.1007/3540608656_49
Spies, T.: Format preserving encryption. Unpublished white paper (2008). https://www.voltage.com/wpcontent/uploads/VoltageSecurityWhitePaperFormatPreservingEncryption.pdf
Acknowledgments
The work was done while the first author was visiting EPFL. It was supported by NSF grant CNS1453132. This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) and Space and Naval Warfare Systems Center, Pacific (SSC Pacific) under contract No. N6600115C4070.
We thank Adi Shamir for the useful comments and Stefano Tessaro for the discussions.
A Deferred Proofs
A.1 Proof of Lemma 4
Proof
Before we start the computations, we introduce the following notation:
[good]: the event that \(v_1\) and \(v_2\) are both in \(V_{good}\).
[bad]: the event that \(v_1\) and \(v_2\) are both in V but not both in \(V_{good}\).
[cyc]: the event that \(y'_1=y_2\) and \(y'_2=y_1\).
[perm]: the event that \(x_1y_1=x'_2y'_2\) and \(x'_1y'_1=x_2y_2\).
\([\Sigma =0]\): the event that \(Label(v_1)+Label(v_2)=0\).
\([\#\{d\}=4]\): the event that \(d_1,d'_1,d_2,d'_2\) are pairwise different.
\([\#\{d\}=j]\): the event that there are exactly j pairwise different values among \(d_1,d'_1,d_2,d'_2\).
Let \(p_{good}=\Pr [good ,cyc,\lnot perm,\Sigma =0]\).
Let \(p_{bad}=\Pr [ bad, cyc,\lnot perm,\Sigma =0]\).
We are interested in \(\Pr [ good \mid cyc, \lnot perm, \Sigma =0]=\frac{1}{1+\frac{p_{bad}}{p_{good}}}\).
We want to upper bound \(\frac{p_{bad}}{p_{good}}\). And, we start with the probability \(p_{good}\).
Note that if [good], we have \([\Sigma =0]\) and it is equivalent to \([c_1=c'_1,c_2=c'_2,d_1\ne d'_1,d_2\ne d'_2,z_1=z'_1,z_2=z'_2]\). When \([c_1=c'_1,c_2=c'_2, cyc]\) holds, [perm] is equivalent to \([c'_1=c_2]\). When \([c_1=c'_1, c_2=c'_2, y'_1=y_2]\) holds, \((d_1 - y_1) - (d'_2 - y'_2)= F_1(c_1) - F_1(c'_2)=F_1(c'_1) - F_1(c_2)=(d'_1 - y'_1) - (d_2 - y_2)=d'_1 - d_2\). So, \(y_1=y'_2\) is equivalent to \(d_1 - d'_2=d'_1 - d_2\).
We let A be the event \([c_1=c'_1\ne c_2=c'_2,\#\{d\}=4, d_1+d_2=d'_1+d'_2]\) which consists of only the c and d. Picking the xy is equivalent to picking the cd. So, A only depends on the c, d. We have \(\Pr [A] \geqslant \frac{1}{N^3} \left( 1-\frac{1}{N}\right) ^2 \left( 1- \frac{3}{N}\right) \geqslant \frac{1}{N^3} \left( 1- \frac{5}{N}\right) \) (we first pick \(c_1\) and \(d_1\), then \(c_2 \ne c_1\), \(d'_1 \ne d_1\), and \(d_2 \notin \{d_1,d'_1,2d'_1 - d_1\}\)). When A holds, \([y'_1=y_2]\) only depends on \(F_1\) and occurs with probability \(\frac{1}{N}\). When A holds, \([z_1=z'_1,z_2=z'_2]\) only depends on \(F_2\) and occurs with probability \(\frac{1}{N^2}\). Therefore,
Now, we compute the probability \(p_{bad}\).
We know that [bad] is equivalent to \([c_1\ne c'_1 \mathsf {\ or\ } c_2\ne c'_2,F_1(c_1)=F_1(c'_1),\) \(F_1(c_2)=F_1(c'_2), d_1\ne d'_1,d_2\ne d'_2, z_1=z'_1,z_2=z'_2]\). When [cyc] occurs, \([\lnot perm]\) is equivalent to \([c'_1\ne c_2 \mathsf {\ or\ } c_1 \ne c'_2]\). When \([F_1(c_1)=F_1(c'_1),F_1(c_2)=F_1(c'_2)]\) holds, [cyc] is equivalent to \([d_1+d_2=d'_1+d'_2,y'_1=y_2]\). When [cyc] holds, \([\Sigma =0]\) is equivalent to \([c_1+c_2=c'_1+c'_2]\). So, when \([cyc, \Sigma =0]\) occurs, \([c_1\ne c'_1 \mathsf {\ or\ } c_2\ne c'_2]\) is equivalent to \([c_1\ne c'_1, c_2\ne c'_2]\).
From the symmetry, \([c'_1\ne c_2\mathsf {\ or\ }c_1\ne c'_2]\) case is at most twice the \([c'_1\ne c_2]\) case. Let B be the event \([c_1\ne c'_1\ne c_2\ne c'_2,c_1+c_2=c'_1+c'_2, d_1+d_2=d'_1+d'_2,d_1\ne d'_1,d_2\ne d'_2]\) which consists of only the c and d. When B holds, \([F_1(c_1)=F_1(c'_1),F_1(c_2)=F_1(c'_2),y'_1=y_2]\) only depends on \(F_1\). Therefore,
We split B following the \([\#\{d\}=j]\) cases for \(j=2,3,4\). Each case is denoted \(B_j\). When we have \([d_1 \ne d'_{1}, d_{2} \ne d'_{2}, \# \{d\}=2, d_1 + d_2 = d'_1 + d'_2]\), we have either \([d_1=d'_2, d'_1=d_2]\) or \([d_1=d_2, d'_1=d'_2, d'_1=d_1+ \frac{N}{2}]\). When we have \([d_1 \ne d'_{1}, d_{2} \ne d'_{2}, \# \{d\}=3]\), we have \([d_1=d_2 \mathsf {\ or\ } d'_1=d'_2]\) (if we had \([d_1=d'_2 \mathsf {\ or\ } d'_1=d_2]\), then \( d_1 + d_2 = d'_1 + d'_2\) would imply \(\# \{d\}=2\), a contradiction). When we have \([d_1 \ne d'_{1}, d_{2} \ne d'_{2}, \# \{d\}=4]\), we have no equality among the d’s. For \(B_4\),
For each of the two cases of \(B_3\), either \(z_1=z'_1\) or \(z_2=z'_2\) occurs with probability \(\frac{1}{N}\). So,
For \(B_2\),
Therefore, \(\Pr _{c,d,F_2}[B,z_1=z'_1,z_2=z'_2]\leqslant \frac{5}{N^4}\) and \(p_\mathsf {bad} \leqslant \frac{10}{N^7}\).
Finally, \(\frac{p_\mathsf {bad}}{p_\mathsf {good}}\leqslant \frac{10}{N-5}\). We deduce
\(\square \)
A.2 Extended Lemma 4
Lemma 7
If \(v_{1}v_{2} \cdots v_{i} \cdots v_{L} v_{1}\) is a cycle of length L in G with zero sum of labels and the vertices use no \(d_i\) or \(c_i\) in common, then all \(v_{i}\) are likely to be good. More precisely, for \(v_{i}=(x_{i}y_{i}x'_{i}y'_{i})\) random, we have
\(\Pr \left[ \forall i,\ v_{i} \in V_{good} \;\middle| \; v_{1} \cdots v_{i} \cdots v_{L} v_{1} ~ \text {is a cycle},\ \left( \#\{c\}=\#\{c'\}=L, \forall i\ne j~c_i\ne c'_j \right) ,\ \left( \#\{d\}=L, \forall i,j~ d_i\ne d'_j \right) ,\ \sum _{i=1}^{L}Label(v_{i}) = 0 \right] \geqslant \frac{1}{1+\frac{2^L-1}{N}}\).
Proof
We compute \(p= \Pr [good \mid good \vee bad, cyc, \lnot \mathsf {repeat}_c, \lnot \mathsf {repeat}_d, \Sigma =0]\), where we use the same notation as in Lemma 4 together with the new events \([\lnot \mathsf {repeat}_c]\) and \([\lnot \mathsf {repeat}_d]\). We define them as follows:
We note that when all \(v_i\) are vertices (good or bad), since \(F_{1}(c'_i)=F_1(c_i)\), \(y'_{i}=y_{i+1}\) is equivalent to \(d'_i - d_{i+1}=F_1(c_i) - F_1(c_{i+1})\). We further note that when this holds, then \(\sum d_i=\sum d'\). To be able to compute the probability of \([\mathsf {cyc}]\), we introduce a condition on the non-repetition of the c and \(c'\), except for the possible equalities \(c_i=c'_i\) in good vertices. Namely, we define
When \([\lnot \mathsf {repeat}_c,\sum d=\sum d']\) holds and all \(v_i\) are vertices, \([\mathsf {cyc}]\) occurs with probability \(\frac{1}{N^{L-1}}\). Therefore, \(\Pr [cyc \mid good \vee bad, \lnot \mathsf {repeat}_c, \Sigma d= \Sigma d'] = \frac{1}{N^{L-1}}\).
The event \([\forall i\quad z_i=z'_i]\) is equivalent to \(c_i+F_2(d_i)=c'_i+F_2(d'_i)\) for all i. To be able to compute its probability, we introduce a condition on the non-repetition of the d and \(d'\). Namely, we define
Hence, when \([\lnot \mathsf {repeat}_d]\) occurs, \([\forall i\quad z_i=z'_i]\) occurs with probability \(\frac{1}{N^L}\): \(\Pr [z'=z \mid \lnot \mathsf {repeat}_d]=\frac{1}{N^L}\). Finally, when [cyc] holds, \([\Sigma =0]\) is equivalent to \(\Sigma (c-c')=0\), and \([good \vee bad]\) is equivalent to \([F_1(c)=F_1(c'), z'=z]\).
We define
with obvious shorthands \([c=c']\), \([z'=z]\), \([F_1(c)=F_1(c')]\), \([\sum (c-c')=0]\).
We upper bound \(\frac{p_\mathsf {bad}}{p_\mathsf {good}}\) to compute p.
We have
So,
where \(\left[ \lnot \mathsf {repeat}_c \mathrm {\ except\ } c'_{\max I} \right] \) means
By relaxing the constraints on \(c'_{\max I}\), we can compute the probability of \(\Sigma (c-c')=0\) conditioned on the other events about c and \(c'\). This probability is \(\frac{1}{N}\).
Therefore,
and we have
\(\square \)
Copyright information
© 2017 International Association for Cryptologic Research
About this paper
Cite this paper
Durak, F.B., Vaudenay, S. (2017). Breaking the FF3 Format-Preserving Encryption Standard over Small Domains. In: Katz, J., Shacham, H. (eds) Advances in Cryptology – CRYPTO 2017. CRYPTO 2017. Lecture Notes in Computer Science, vol 10402. Springer, Cham. https://doi.org/10.1007/9783319637150_23
DOI: https://doi.org/10.1007/9783319637150_23
Publisher Name: Springer, Cham
Print ISBN: 9783319637143
Online ISBN: 9783319637150