An Inverse-Free Single-Keyed Tweakable Enciphering Scheme
Abstract
In CRYPTO 2003, Halevi and Rogaway proposed CMC, a tweakable enciphering scheme (TES) based on a blockcipher. It requires two blockcipher keys and it is not inverse-free (i.e., the decryption algorithm uses the inverse (decryption) of the underlying blockcipher). We present here a new inverse-free, single-keyed TES. Our construction is a tweakable strong pseudorandom permutation (TSPRP), i.e., it is secure against chosen-plaintext-ciphertext adversaries, assuming that the underlying blockcipher is a pseudorandom permutation (PRP), i.e., secure against chosen-plaintext adversaries. In comparison, the SPRP assumption on the blockcipher is required for the TSPRP security of CMC. Our scheme can be viewed as a mixture of type-1 and type-3 Feistel ciphers, and so we call it FMix or mixed-type Feistel cipher.
Keywords
(Tweakable strong) pseudorandom permutation, Coefficient H Technique, Encipher, CMC, Feistel cipher
1 Introduction
A tweakable enciphering scheme (TES) is a length-preserving encryption scheme that takes a tweak as an additional input. In other words, for each tweak, a TES computes a ciphertext of the same length as the plaintext. Preserving length can be very useful in applications such as disk-sector encryption (as addressed by the IEEE SISWG P1619), where a length-preserving encryption preserves the file size after encryption. When a tweakable enciphering scheme is used, the disk sectors can serve as tweaks. Other applications of enciphering schemes could include bandwidth-efficient network protocols and security-retrofitting of old communication protocols.
Examples based on Paradigms. There are four major paradigms of tweakable enciphering schemes. Almost all enciphering schemes fall in one of the following categories.

Feistel Structure: The 2-block Feistel design was used in early block ciphers like Lucifer [4, 22] and DES [23]. Luby and Rackoff gave a security proof of Feistel ciphers [12], and later the design was generalised to obtain inverse-free enciphering of longer messages [17]. Examples: Naor-Reingold Hash [16], GFN [10], matrix representations [1].

Hash-Counter-Hash: Two layers of universal hash with a counter mode of encryption in between. Examples: XCB [13], HCTR [25], HCH [2].

Hash-Encrypt-Hash: Two layers of universal hash with an ECB mode of encryption in between. Examples: PEP [3], TET [6], HEH [21].

Encrypt-Mix-Encrypt: Two encryption layers with a mixing layer in between. Examples: EME [8], EME* [5] (with ECB encryption layer), CMC [7] (with CBC encryption layer).
Among all these constructions, the examples from the Feistel cipher and Encrypt-Mix-Encrypt paradigms are based on blockciphers alone (i.e., no field multiplication or other primitive is used). Now we take a closer look at CMC encryption.

For an encryption using \(e_K\), the decryption needs \(e_K^{-1}\). In a combined hardware implementation, the footprint size (e.g., the number of gates or slices) goes up;

The security proof of CMC relied on the stronger assumption SPRP (Strong Pseudo-Random Permutation) on the underlying blockcipher;

The tweak is processed using an independent key, and the proposed single-key variant uses an extra call to the blockcipher.
One recent inverse-free construction based on Feistel networks is AEZ-core, which forms part of the implementation of AEZ [9]. It belongs to the Encrypt-Mix-Encrypt paradigm, where the encryption uses a Feistel structure. It requires five blockcipher calls for every two plaintext blocks, but is highly parallelizable.
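The notion of an inverse-free design can be illustrated with a plain balanced Feistel network. The following is a minimal sketch, not the FMix construction: the round function `round_fn` (a truncated SHA-256 call), the 16-byte half size, and the round count of four are all assumptions chosen only for illustration. What it shows is that decryption evaluates the same round function in the forward direction, so no inverse of the primitive is ever needed.

```python
import hashlib

BLOCK = 16  # bytes per half; an illustrative choice, not taken from the paper


def round_fn(key: bytes, round_no: int, half: bytes) -> bytes:
    """Stand-in keyed round function; it is only ever evaluated forwards."""
    return hashlib.sha256(key + bytes([round_no]) + half).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def feistel_encrypt(key: bytes, left: bytes, right: bytes, rounds: int = 4):
    # One round maps (L, R) to (R, L xor F(R)).
    for r in range(rounds):
        left, right = right, xor(left, round_fn(key, r, right))
    return left, right


def feistel_decrypt(key: bytes, left: bytes, right: bytes, rounds: int = 4):
    # Undo the rounds in reverse order, still calling round_fn forwards.
    for r in reversed(range(rounds)):
        left, right = xor(right, round_fn(key, r, left)), left
    return left, right


if __name__ == "__main__":
    pt = (b"A" * BLOCK, b"B" * BLOCK)
    ct = feistel_encrypt(b"demo key", *pt)
    print(feistel_decrypt(b"demo key", *ct) == pt)
```

In hardware terms, this is why an inverse-free scheme can share one circuit for the primitive between encryption and decryption.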
1.1 Our Contribution
1. FMix is inverse-free, i.e., it needs the same f for both encryption and decryption, giving a low footprint in a combined hardware implementation.
2. Because it is inverse-free, an important improvement is on the security requirement of \(e_K\). CMC relies upon an SPRP-secure \(e_K\), while our construction just needs a PRF-secure \(e_K\). This can have significant practical implications in reducing the cost of implementation.
3. The tweak is processed through the same f, removing the requirement of an extra independent blockcipher key.
4. To encrypt a message with \(\ell \) blocks and a tweak (a single block), CMC needs \(2\ell +1\) calls to the blockcipher e. Its variant (which eliminates the independent key), however, uses \(2\ell +2\) calls to e. Our construction requires \(2\ell +1\) calls, without needing the independent key.
A comparison of some blockcipher-based TES. The columns are as follows: (1) number of blockcipher calls, (2) number of keys, (3) number of sequential layers under full parallelization, (4) security assumption on the underlying blockcipher, (5) whether the scheme is inverse-free. (CMC’ is a “natively tweakable” variant of CMC, as described in [7].)
Scheme  #BC calls  #Keys  #Layers  BC security  Inverse-free?

CMC  \(2\ell +1\)  2  \(\ell +2\)  SPRP  NO
CMC’  \(2\ell +2\)  2  \(\ell +2\)  SPRP  NO
EME  \(2\ell +3\)  1  4  SPRP  NO
GFN1  \(4\ell -2\)  \(4\ell -2\)  \(4\ell -2\)  PRP  YES
GFN3  \(2\ell ^2-2\)  \(2\ell ^2-2\)  \(2\ell -2\)  PRP  YES
AEZ-core  \(\sim \frac{5}{2}\ell \)  1  5  PRP  YES
FMix (this paper)  \(2\ell +1\)  1  \(\ell +3\)  PRP  YES
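To make the call counts concrete, the sketch below evaluates the formulas for CMC, CMC’, EME and FMix as listed in the comparison. The function name `blockcipher_calls` and the example of a 512-byte sector split into 16-byte blocks (so \(\ell = 32\)) are assumptions made for illustration; the GFN and AEZ-core rows are left out.

```python
def blockcipher_calls(scheme: str, ell: int) -> int:
    """Number of blockcipher calls for an ell-block message plus one tweak block."""
    formulas = {
        "CMC": 2 * ell + 1,
        "CMC'": 2 * ell + 2,   # single-key variant: one extra call
        "EME": 2 * ell + 3,
        "FMix": 2 * ell + 1,
    }
    return formulas[scheme]


if __name__ == "__main__":
    ell = 512 // 16  # hypothetical 512-byte sector with 128-bit blocks
    for name in ("CMC", "CMC'", "EME", "FMix"):
        print(name, blockcipher_calls(name, ell))
```

For \(\ell = 32\) this gives 65 calls for CMC and FMix, 66 for CMC’ and 67 for EME; FMix matches the two-key CMC count while using a single key.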
2 Preliminaries
2.1 Tweakable Encryption Schemes
Definition 1
Note that choosing a function uniformly from a class \(\{f_{\alpha }\}_{\alpha \in I}\) indexed by some finite set I can be achieved by choosing \(\alpha _0\) uniformly from I and then picking \(f_{\alpha _0}\) as the chosen function.
Tweakable Random Permutation. When \(\mathcal {R}=\mathcal {D}\), a popular choice of \(\mathscr {C}\) is \(\varPi _\mathcal {D}\), the class of all permutations on \(\mathcal {D}\) (i.e., bijections from \(\mathcal {D}\) to itself). A random permutation over \(\mathcal {D}\) is a \(\varPi _\mathcal {D}\)-random function. It is an ideal choice corresponding to an encryption scheme over \(\mathcal {D}\). The ideal choice corresponding to a tweakable enciphering scheme over \(\mathcal {D}\) with tweak space \(\mathcal {T}\) is called a tweakable random permutation \(\tilde{\pi }\), which is chosen uniformly from the class \(\varPi _{\mathcal {D}}^{\mathcal {T}}\). For each tweak \(\mathfrak {T} \in \mathcal {T}\), we choose a random permutation \(\pi _{\mathfrak {T}}\) independently, and \(\tilde{\pi }\) is a stochastically independent collection of random permutations \(\{\pi _{\mathfrak {T}}; \mathfrak {T} \in \mathcal {T}\}\).
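A tweakable random permutation can be simulated by lazy sampling: each tweak gets its own partial permutation, extended one point at a time. The sketch below is one way to do this under stated assumptions: the class name `TweakableRandomPermutation` is hypothetical, the domain is \(\{0,\dots ,2^n-1\}\) for a small n, and Python's `random` module stands in for true randomness (it is not cryptographically secure).

```python
import random


class TweakableRandomPermutation:
    """Lazily sampled family {pi_T}: one independent random permutation per tweak."""

    def __init__(self, n_bits: int, seed=None):
        self.size = 1 << n_bits
        self.rng = random.Random(seed)
        self.fwd = {}  # tweak -> {input: output}
        self.bwd = {}  # tweak -> {output: input}

    def _sample_unused(self, used) -> int:
        # Rejection sampling: uniform over the points not yet used.
        while True:
            v = self.rng.randrange(self.size)
            if v not in used:
                return v

    def encipher(self, tweak, x: int) -> int:
        f = self.fwd.setdefault(tweak, {})
        b = self.bwd.setdefault(tweak, {})
        if x not in f:
            y = self._sample_unused(b)
            f[x], b[y] = y, x
        return f[x]

    def decipher(self, tweak, y: int) -> int:
        f = self.fwd.setdefault(tweak, {})
        b = self.bwd.setdefault(tweak, {})
        if y not in b:
            x = self._sample_unused(f)
            f[x], b[y] = y, x
        return b[y]
```

Because the tables are kept per tweak, answers under different tweaks are sampled independently, which mirrors the stochastically independent collection described above.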
2.2 Pseudorandomness and Distinguishing Games
It should be noted that a random function or a random permutation is an ideal concept, since in practice the sizes of \(\mathcal {R}^\mathcal {D}\) or \(\varPi _{\mathcal {D}}\) are so huge that the cost of simulating a uniform random sampling on them is prohibitive. What is used instead of a truly random function is a pseudorandom function (PRF), a function whose behaviour is so close to that of a truly random function that no algorithm can effectively distinguish between the two. An adversary for a pseudorandom function \(f_1\) is a deterministic algorithm \(\mathcal {A}\) that tries to distinguish \(f_1\) from a truly random \(f_0\).
Pointless Adversaries. In addition to the adversary being deterministic, we also assume that it does not make any pointless queries. An adversary \(\mathcal {A}\) making queries to a tweakable encryption scheme f and \(f^{-1}\) is called pointless if either it makes a duplicate query, or it makes an f-query \((\mathfrak {T}, P)\) and obtains response C, and also makes an \(f^{-1}\)-query \((\mathfrak {T}, C)\) and obtains response P (the order of these two queries can be reversed). We can assume that the adversary is not pointless, since the responses are uniquely determined for these types of queries.
Theorem 1
The above result says that a uniform length-preserving random permutation is very close to a uniform length-preserving random function.
2.3 Domain Extensions and Coefficient H Technique
The notion of pseudorandomness, while giving us an approximate implementation of random functions, introduces a new problem. In general, it is very hard to decide whether or not there is an adversary that breaks the pseudorandomness of a particular function, since there is no easy way of exhaustively covering all possible adversaries in an analysis, and since there is no true randomness in a practically implemented function, probabilistic arguments cannot be used.
The common workaround is to assume we have PRFs \(f_1,...,f_n\), each with domain \(\mathcal {D}\), and use them to obtain an F with domain \(\mathcal {D'}\supset \mathcal {D}\), such that a PRF-attack on F leads to a PRF-attack on one of \(f_1,...,f_n\). Now, there are known functions on small domains (like AES, for instance) which have withstood decades of attempted PRF-attacks and are believed to be reasonably secure against PRF-attacks. Choosing \(\mathcal {D}\) suitably to begin with and using the known PRFs in our construction, we can find a PRF F with domain \(\mathcal {D'}\) that is secure as long as the smaller functions are secure. This technique is known as a domain extension.
Let \(\textsf {view}(\mathcal {A}^{\mathcal {O}})\) denote the view obtained by the adversary \(\mathcal {A}\) interacting with \(\mathcal {O}\).
Theorem 2
This technique was first introduced in Patarin’s PhD thesis [18] (as mentioned in [24]) and was later formalized in [19].
3 The FMix Construction
The details of the construction are demonstrated in the figure, which shows a four-block FMix construction. The algorithm for general l is described in the box. Here, b is a balanced linear permutation, which we define below, and \(b'\) is \(b^{-1}\). Decryption is almost identical, just with T and b(T) switching roles.
Definition 2
A permutation \(b:\{0,1\}^n\longrightarrow \{0,1\}^n\) will be called a balanced linear permutation if both \(t\mapsto b(t)\) and \(t\mapsto t+b(t)\) are linear permutations.
One choice of b could be multiplication by a primitive \(\alpha \), but this is not very software-friendly. A more software-friendly choice is \((t_1,t_2)\mapsto (t_1\oplus t_2,t_1)\), where \(t_1\) and \(t_2\) are the higher and lower halves of t.
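The software-friendly choice of b can be checked exhaustively on a toy block size. The sketch below assumes \(n=8\) (real blockciphers use far larger n, e.g. 128) and represents blocks as integers with addition as XOR; the helper names `b_inv`, `is_permutation` and `is_linear` are introduced here for illustration only. It verifies that both \(t\mapsto b(t)\) and \(t\mapsto t+b(t)\) are linear permutations, as Definition 2 requires.

```python
N = 8                      # toy block size in bits (assumption)
HALF = N // 2
MASK = (1 << HALF) - 1


def b(t: int) -> int:
    """(t1, t2) -> (t1 xor t2, t1), with t1 the higher half and t2 the lower half."""
    t1, t2 = t >> HALF, t & MASK
    return ((t1 ^ t2) << HALF) | t1


def b_inv(u: int) -> int:
    """Inverse map: (u1, u2) -> (u2, u1 xor u2)."""
    u1, u2 = u >> HALF, u & MASK
    return (u2 << HALF) | (u1 ^ u2)


def is_permutation(fn) -> bool:
    return len({fn(t) for t in range(1 << N)}) == 1 << N


def is_linear(fn) -> bool:
    return all(fn(s ^ t) == fn(s) ^ fn(t)
               for s in range(1 << N) for t in range(1 << N))


if __name__ == "__main__":
    print(is_permutation(b), is_linear(b))
    print(is_permutation(lambda t: t ^ b(t)), is_linear(lambda t: t ^ b(t)))
```

Since \(t+b(t)=(t_2,t_1\oplus t_2)\) is itself invertible, the only solution of \(t+b(t)=0\) is \(t=0\).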
In the TSPRP game, the adversary makes q queries to the oracle \(\mathcal {O}\). Each query is of the form \((\delta ,\mathfrak {T},\mathbf X )\), where \(\delta \in \{e,d\}\) denotes the direction of the query, \(\mathfrak {T}\in \{0,1\}^n\) is the tweak, and \(\mathbf X \in \{0,1\}^{nl}\) for some l is the input. If \(\mathcal {O}\) is imitating FMix, \(\mathcal {O}(e,\mathfrak {T},\mathbf X )\) returns \(\mathcal {E}^f(\mathfrak {T},\mathbf X )\), and \(\mathcal {O}(d,\mathfrak {T},\mathbf X )\) returns \(\mathcal {D}^f(\mathfrak {T},\mathbf X )\). If \(\mathcal {O}\) is imitating a tweaked PRP \(\varPi \), \(\mathcal {O}(e,\mathfrak {T},\mathbf X )\) returns \(\varPi (\mathfrak {T},\mathbf X )\), and \(\mathcal {O}(d,\mathfrak {T},\mathbf X )\) returns \(\varPi ^{-1}(\mathfrak {T},\mathbf X )\). The output of \(\mathcal {O}\) is denoted \(\mathbf Y \).
All the queries and their outputs taken together form what we call a view. We use the following notation in a view. For the i-th query, \(\delta ^i\) denotes the direction of the query, \(\mathfrak {T}^i\) denotes the tweak, and \(l^i\) denotes the number of blocks in \(\mathbf X \). When \(\delta ^i=e\), the blocks of \(\mathbf X \) are denoted \(P_1,...,P_{l^i}\) and those of \(\mathbf Y \) are denoted \(C_1,...,C_{l^i}\). When \(\delta ^i=d\), this notation is reversed, i.e., the blocks of \(\mathbf Y \) are denoted \(P_1,...,P_{l^i}\) and those of \(\mathbf X \) are denoted \(C_1,...,C_{l^i}\). In the analysis, the tweak \(\mathfrak {T}^i\) is denoted both \(P_0^i\) and \(C_0^i\).
4 TSPRP Security Analysis of FMix
4.1 Good Views and Interpolation
Our first task is to formulate the version of Patarin’s Coefficient H Technique we shall use for our proof. We begin by restricting our attention to a particular class of views.
1. \(\exists i \ne i'\) such that \(\delta ^i = \delta ^{i'} = e\), \(P^i = P^{i'}\).
2. \(\exists i \ne i'\) such that \(\delta ^i = \delta ^{i'} = d\), \(C^i = C^{i'}\).
3. \(\exists i' < i\) such that \(\delta ^i = e\), \(\delta ^{i'} = d\), \(P^i = P^{i'}\).
4. \(\exists i' < i\) such that \(\delta ^i = d\), \(\delta ^{i'} = e\), \(C^i = C^{i'}\).
The first two cases are for duplicate queries. The third holds when we obtain a response \(P^{i'}\) for some decryption query \(C^{i'}\) and then make an encryption query \(P^{i} := P^{i'}\). (The fourth case is the third case with the order of the queries reversed.) It is easy to see that when an adversary \(\mathcal {A}\) is interacting with a TES, the view obtained is pointless if and only if \(\mathcal {A}\) is pointless.
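The four conditions can be checked mechanically on a recorded view. The following is a minimal sketch under assumptions: a view is modelled as a list of tuples `(direction, tweak, plaintext, ciphertext)`, the function name `is_pointless` is hypothetical, and equality of \(P\) or \(C\) is taken to include the tweak, in line with the earlier description of pointless adversaries.

```python
def is_pointless(view) -> bool:
    """Return True if the view meets one of the four pointless conditions."""
    seen_plain, seen_cipher = set(), set()
    for direction, tweak, plain, cipher in view:
        # Conditions 1 and 3: an encryption query repeats a plaintext that
        # already appeared as an earlier query input or an earlier response.
        if direction == "e" and (tweak, plain) in seen_plain:
            return True
        # Conditions 2 and 4: the same for decryption queries and ciphertexts.
        if direction == "d" and (tweak, cipher) in seen_cipher:
            return True
        seen_plain.add((tweak, plain))
        seen_cipher.add((tweak, cipher))
    return False


if __name__ == "__main__":
    view = [("e", 0, (1,), (9,)), ("d", 0, (2,), (8,))]
    print(is_pointless(view))
    print(is_pointless(view + [("e", 0, (2,), (8,))]))
```

A decryption query whose response merely happens to equal an earlier plaintext is deliberately not flagged, since none of the four conditions covers it.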
As we do not allow pointless adversaries, we can restrict ourselves to non-pointless views only. Now we define good and bad views among this class.
Definition 3
The proof revolves around showing that the good views have a near-random distribution, and the bad views occur with a low probability. For the rest of the analysis, we fix a good view \(\mathcal {V}\).
Proposition 1
Armed with this result and the Coefficient H Technique, we are now ready to state and prove the main result of this paper.
Theorem 3
Proof
When a non-pointless adversary \(\mathcal {A}\) is interacting with a pair of independent random functions \((f_0, f_0')\), the probability that it obtains a bad view is upper bounded by \(\binom{q}{2}/2^n\). To see this, suppose the bad event occurs for the first time at the \(i^{\mathrm {th}}\) query. If it is an encryption query (a similar proof can be carried out for a decryption query), then \(C^i_1\) is chosen randomly from \(\{0,1\}^n\), and so the probability that it matches one of the previous first ciphertext blocks is at most \((i-1)/2^n\). So \(\mathrm {Pr}_{f_0, f_0'}[\textsf {view}(\mathcal {A}^{f_0,f'_0})\) is a bad view\(] \le \sum _{i=1}^q \frac{i-1}{2^n} = \frac{q(q-1)}{2^{n+1}}\). Using the Coefficient H Technique (see Sect. 2.3) and the proposition stated above, the theorem follows. \(\square \)
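The summation in the proof can be checked with exact rational arithmetic. The sketch below is only a sanity check of the identity \(\sum _{i=1}^q \frac{i-1}{2^n} = \frac{q(q-1)}{2^{n+1}}\); the function names and the sample parameters \(n=128\), \(q=2^{32}\) are assumptions chosen for illustration.

```python
from fractions import Fraction


def bad_view_bound(q: int, n: int) -> Fraction:
    """Union bound: the i-th query collides with probability at most (i-1)/2^n."""
    return sum((Fraction(i - 1, 2 ** n) for i in range(1, q + 1)), Fraction(0))


def closed_form(q: int, n: int) -> Fraction:
    """q(q-1)/2^(n+1), which equals binom(q, 2)/2^n."""
    return Fraction(q * (q - 1), 2 ** (n + 1))


if __name__ == "__main__":
    print(all(bad_view_bound(q, 8) == closed_form(q, 8) for q in range(50)))
    # Illustrative parameters: 128-bit blocks and 2^32 queries.
    print(closed_form(2 ** 32, 128) < Fraction(1, 2 ** 65))
```

With these sample parameters the bound stays below \(2^{-65}\), since \(q(q-1) < q^2 = 2^{64}\).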
Corollary 1
This follows from the standard hybrid argument.
4.2 Extension of FMix for Partial Block Input
Theorem 4
The proof of the statement is immediate from Theorem 1 and the generic conversion as described in [14].
5 Proof of Proposition 1
In this section we provide the proof of Proposition 1.
5.1 Simulations
We shall develop an effective way of calculating the interpolation probability of \(\mathcal {V}\). We begin by introducing the notion of variables. Let E be the set of all encryption query indices, i.e., \(E=\{i \mid \delta ^i=e\}\). Similarly, let D be the set of all decryption query indices. In identifying and labelling internal blocks, we continue using superscripts to denote query indices. Thus, for a query i, the \(2l^i\) inputs of f (other than \(\mathfrak {T}^i\)) are denoted \(U_1^i,...,U_{l^i}^i,U'^i_1,...,U'^i_{l^i}\), and the \(2l^i+1\) outputs of f are denoted \(V_0^i,V_1^i,...,V_{l^i}^i,V'^i_1,...,V'^i_{l^i}\). For ease of notation, we shall write both \(U_0^i\) and \(U_0'^{i}\) to denote \(\mathfrak {T}^i\).
Let us assume for now that the input block and its corresponding output block are unrelated. We note that all input and output blocks of f are either variables or derivables. Thus, if we assign values to the variables, all the inputs and outputs of f over all queries are linearly determined. Thus, the variables linearly generate the entire set of input and output blocks, while themselves being linearly independent. We now formalise the notion of value assignment to variables.
Definition 4
A transcript \(\tau \) is a collection of variable-value pairs (Z, v) such that no two pairs in the collection contain the same variable. For every \((Z,v) \in \tau \), the variable Z is said to be assigned the value v under \(\tau \). We denote this as \(Z|_{\tau }=v\). The domain \(\mathbb {D}(\tau )\) of a transcript \(\tau \) is defined as \(\{Z \mid (\exists v)(Z,v)\in \tau \}\). Given a set S of variables, a transcript \(\tau \) with \(\mathbb {D}(\tau )=S\) is said to be an instantiation of S.
For a transcript \(\tau \) and a derivable \(Z'\) whose value only depends on the variables in \(\mathbb {D}(\tau )\), \(\tau \) effectively determines a value for \(Z'\). This value is denoted by \(Z'|_{\tau }\). For ease of notation, for any view block X, \(X|_{\tau }\) will simply denote the value of X fixed in \(\mathcal {V}\). An instantiation \(\sigma \) of \(\mathcal {S}\) will be called a simulation, since it determines all inputs and outputs of f and thus describes a complete simulation of the internal computations that resulted in view \(\mathcal {V}\).
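A transcript maps naturally onto a dictionary, whose keys cannot repeat. The sketch below is a simplified model under assumptions: variable names are plain strings, values are n-bit integers, and a derivable is taken to be an XOR of variables and one view-block constant (the permutation b is ignored). The function names are hypothetical.

```python
def domain(transcript: dict) -> set:
    """D(tau): the set of variables that tau assigns a value to."""
    return set(transcript)


def is_instantiation(transcript: dict, variables) -> bool:
    """tau is an instantiation of S exactly when D(tau) = S."""
    return domain(transcript) == set(variables)


def derivable_value(transcript: dict, terms, constant: int = 0) -> int:
    """Value of a derivable given as the XOR of `terms` and a view constant.

    Raises KeyError if the derivable depends on a variable outside D(tau).
    """
    value = constant
    for var in terms:
        value ^= transcript[var]
    return value


if __name__ == "__main__":
    tau = {"V_0": 0b1010, "V_1": 0b0110}
    print(is_instantiation(tau, {"V_0", "V_1"}))
    print(bin(derivable_value(tau, ["V_0", "V_1"], constant=0b0001)))
```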
Not all simulations make sense, however, when we consider the connection between an input block and its corresponding output block. A dependence now creeps in among the variables, owing to the key observation below, which poses the only nontrivial questions in the entire proof.
Wherever the inputs of f are identical, so are its outputs.
There can be simulations which violate this rule, and thus describe internal computations that can never occur. A simulation which actually describes a possible set of internal computations is called realisable. It is immediately clear that our observation holds for all realisable simulations. The problem of calculating the interpolation probability of \(\mathcal {V}\) boils down to counting the number of realisable simulations.
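The key observation translates into a simple consistency check on the input-output pairs of f that a simulation determines. The sketch below assumes those pairs are available as a list of `(input, output)` tuples; the function name `respects_function_rule` is hypothetical, and the check covers only the stated observation, not any further structure of the construction.

```python
def respects_function_rule(io_pairs) -> bool:
    """True if identical inputs of f are always paired with identical outputs."""
    seen = {}
    for x, y in io_pairs:
        # setdefault stores y on first sight of x and returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True


if __name__ == "__main__":
    print(respects_function_rule([(1, 5), (2, 7), (1, 5)]))
    print(respects_function_rule([(1, 5), (1, 6)]))
```

A simulation that fails this check describes internal computations that no function f could produce.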
5.2 Admissibility
All realisable simulations can be difficult to count, however. We shall focus instead on a smaller class of simulations, called admissible simulations, which are easy to count and yet are abundant enough to give us the desired result. Before that, let us formulate in specific terms the ramifications of this observation. The immediate consequence is what we call predestined collisions. Let \(\mathcal {I}=\cup _i\{U^i_0,U^i_1,...,U^i_{l^i},U'^i_1,...,U'^i_{l^i}\}\) be the set of all input blocks of f.
Definition 5
All other collisions between input blocks are called accidental collisions. Our next task is to identify all predestined collisions. For that we’ll need some more definitions.
Definition 6

\((U^i_k,U^{i'}_k),0\le k<\min (l^i,l^{i'}),i\sim _{e_k}i'\)

\((U'^i_k,U'^{i'}_k),0\le k<\min (l^i,l^{i'}),i\sim _{d_k}i'\)

\((V^i_{k-1}+P^i_k,V^{i'}_{k-1}+P^{i'}_k),0\le k<\min (l^i,l^{i'}),i\sim _{e_k}i'\)

\((V'^i_{k-1}+C^i_k,V'^{i'}_{k-1}+C^{i'}_k),0\le k<\min (l^i,l^{i'}),i\sim _{d_k}i'\)
 (a) \((i\sim _{e_k}i')\rightarrow (V^i_k=V^{i'}_k),0\le k<\min (l^i,l^{i'}),\)
 (b) \((i\sim _{d_k}i')\rightarrow (V'^i_k=V'^{i'}_k),0\le k<\min (l^i,l^{i'}).\)
The predestined output collisions linearly follow from the predestined collisions, but are formulated separately here, because they’ll later be useful as a class of constraints on realisable simulations. Finally, we define the class of admissible simulations.
Definition 7
(Admissible). A simulation \(\sigma \) is called admissible if, for any \(Z_1,Z_2\in \mathcal {I}\) that do not constitute a predestined collision, \(Z_1|_{\sigma }\ne Z_2|_{\sigma }\).
Thus, in an admissible simulation, no two input blocks of f can accidentally collide, and the only collisions are the predestined ones.
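Admissibility can be phrased as a check over all pairs of input blocks. The sketch below assumes the values of the input blocks under \(\sigma \) are given as a dictionary from block names to integers, and that the predestined collisions are supplied as a set of unordered pairs; both representations and the function name `is_admissible` are hypothetical.

```python
from itertools import combinations


def is_admissible(input_values: dict, predestined: set) -> bool:
    """True if every collision between input blocks is a predestined one."""
    for (z1, v1), (z2, v2) in combinations(input_values.items(), 2):
        if v1 == v2 and frozenset((z1, z2)) not in predestined:
            return False  # accidental collision
    return True


if __name__ == "__main__":
    values = {"U^1_1": 3, "U^2_1": 3, "U^1_2": 4}
    forced = {frozenset(("U^1_1", "U^2_1"))}
    print(is_admissible(values, forced))
    print(is_admissible(values, set()))
```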
5.3 Basis and Extension
We now identify a subclass B of the variables which are linearly independent under assumption of admissibility, and such that an instantiation \(\tau _B\) of B admits a unique extension \(\mathbb {E}(\tau _B)\) to a realisable simulation. We shall call B a basis of X. First, we’ll need one more definition.
Definition 8
A query index i, \(1\le i\le q\), is called k-fresh, \(k\ge 0\), if \(k=l^i\), or \(k<l^i\) and \(\nexists i'\le i\) with \(k<l^{i'}\) such that \(i\sim _{e_k}i'\) or \(i\sim _{d_k}i'\).
The set \(E_k\) of k-fresh encryption queries is defined as \(\{i \mid \delta ^i=e,\ i\text { is }k\text {-fresh}\}\). Similarly, the set \(D_k\) of k-fresh decryption queries is defined as \(\{i \mid \delta ^i=d,\ i\text { is }k\text {-fresh}\}\). Clearly, \(E=\cup _k E_k\), and \(D=\cup _k D_k\), since any i is \(l^i\)-fresh.
Definition 9
Clearly, if i is k-fresh, then i is its own k-ancestor.
Definition 10
Thus, a query slice is the portion of a transcript that refers to a specific query. The query slices of a transcript form a partition of it.
To show that \(\sigma \) indeed is a simulation, we just observe that if \(\cup _1^{i-1}Q_i(\sigma )\) is realisable, and \(\delta ^i=d\), then \(Q_i(\sigma )\) cannot violate 4.3 (a) (which concerns encryption queries only), and \(Q_i(\sigma )\) is chosen so as to conform to 4.3 (b).
5.4 Extension Equations
We observe that in extending \(\tau _B\) to \(\mathbb {E}(\tau _B)\), once we’ve set the basis variables in accordance with \(\tau _B\), none of the steps we perform thereafter depend on the specific instantiation \(\tau _B\). Thus, for each variable we can identify an equation relating it to the basis variables, so that a simulation can be obtained by simply plugging in an appropriate instantiation of B. We call these equations the extension equations.
Pick \(i\in E,j\in \{0,...,l^i\}\). Then \(V^i_j\) is a variable. Let \(b_1\) be j, and \(a_1\) be \(A_j^e(i)\). Having obtained \(b_1,...,b_k\) and \(a_1,...,a_k\), we stop if k is odd and \(a_k\in E\), or if k is even and \(a_k\in D\). Otherwise, let \(b_{k+1}=l^{a_k}-1-b_k\), and \(a_{k+1}\) be \(A_{b_{k+1}}^{\delta ^{a_k}}(a_k)\). Since \(a_{k+1}>a_k\), this terminates after finitely many steps, say upon obtaining \(a_{k_0}\). Then we call \(((b_1,a_1),...,(b_{k_0},a_{k_0}))\) the extension chain of \(V^i_j\), denoted \(\mathfrak {C}(V^i_j)\).
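The chain construction is a short loop, sketched below under assumptions. The ancestor map \(A^{\delta }_b(a)\) is passed in as a callable `ancestor(direction, b, a)` whose behaviour here is hypothetical (a toy lookup table that defaults to a query being its own ancestor); `direction` and `length` are dictionaries giving \(\delta ^a\) and \(l^a\) per query index. A step cap is added as a safety guard and is not part of the description above.

```python
def extension_chain(i, j, direction, length, ancestor, max_steps=1000):
    """Build ((b_1, a_1), ..., (b_k0, a_k0)) for V^i_j with i an encryption query."""
    chain = [(j, ancestor("e", j, i))]           # (b_1, a_1) = (j, A^e_j(i))
    while True:
        k = len(chain)
        b_k, a_k = chain[-1]
        # Stop if k is odd and a_k is in E, or k is even and a_k is in D.
        if (k % 2 == 1 and direction[a_k] == "e") or \
           (k % 2 == 0 and direction[a_k] == "d"):
            return chain
        if k >= max_steps:
            raise RuntimeError("chain did not terminate")
        b_next = length[a_k] - 1 - b_k            # b_{k+1} = l^{a_k} - 1 - b_k
        chain.append((b_next, ancestor(direction[a_k], b_next, a_k)))


if __name__ == "__main__":
    # Toy view: query 1 is a decryption, query 2 an encryption, two blocks each.
    direction = {1: "d", 2: "e"}
    length = {1: 2, 2: 2}
    table = {("e", 0, 2): 1}                      # hypothetical: A^e_0(2) = 1
    ancestor = lambda d, b, a: table.get((d, b, a), a)
    print(extension_chain(2, 0, direction, length, ancestor))
    print(extension_chain(2, 1, direction, length, ancestor))
```

In the toy view, the chain of \(V^2_0\) jumps back to query 1 and stops there at the second step, while the chain of \(V^2_1\) stops immediately because query 2 is its own ancestor and lies in E.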
To obtain the extension equation of \(V^i_j\) from \(\mathfrak {C}(V^i_j)\), note that \(V^i_j=V^{a_1}_j\), and for any even \(k\le k_0\), \(V'^{a_k}_j=V'^{a_{k-1}}_j\), and (if \(k<k_0\)) \(V^{a_{k+1}}_j=V^{a_k}_j\). To bridge these equations, we just need to recall the equations relating \(V^{i'}_j\) to \(V'^{i'}_{l^{i'}-1-j}\) for arbitrary \(i'\) with \(l^{i'}\ge j\).
From our algorithm, \(V'^{i'}_0=V^{i'}_0\), \(V'^{i'}_{l^{i'}}=b(V^{i'}_{l^{i'}-1}+V^{i'}_0+P^{i'}_{l^{i'}})+C^{i'}_1\) and \(V'^{i'}_{l^{i'}-1}=b(V^{i'}_{l^{i'}}+V^{i'}_0+P^{i'}_1)+C^{i'}_{l^{i'}}\).
We’ll show that whenever for two input derivables Z and \(Z'\), \(\mathfrak {B}(Z)=\mathfrak {B}(Z')\), \((Z,Z')\) is either a predestined collision, or Z and \(Z'\) cannot collide. This’ll show that every accidental input collision corresponds to a linear equation on the basis variables and view blocks. Note that this linear equation actually corresponds to n linear equations in terms of the bits, all of which should be dodged. For most of the analysis, this distinction will not matter, and it’ll only become important when we deal with two special cases in the very end.
Lemma 1
Every accidental input collision imposes a nontrivial linear equation on the basis variables.
The proof of the lemma is postponed to the end of this section. It basically considers all cases for accidental collision and shows that it gives a nontrivial linear equation.
5.5 Bringing It All Together
5.6 Proof of Lemma 1
Proof

Whether they both occur in the same layer (encryption layer \(\{U^i_j\}\) or decryption layer \(\{U'^i_j\}\)), or in different layers;

Whether they occur in the right layer (encryption layer of an encryption query, or decryption layer of a decryption query) or the wrong layer;

Whether their first-cross indices match (this would be the current query index if in the wrong layer, and the index after the first backward jump during extension if in the right layer).
We begin with an easy group of cases, where both occur in the right layer, and their first-cross indices do not match:
Case 1a. \((U^i_j,U^{i'}_{j'}),i,i'\in E,a=A^e_{j-1}(i)<A^e_{j'-1}(i')=a'\)
\(\mathfrak {B}(U^i_j)=\mathfrak {B}(V^i_{j-1})\) can only contain basis variables with query indices \(\le a\). Since \(\mathfrak {B}(U^{i'}_{j'})=\mathfrak {B}(V^{i'}_{j'-1})\) will contain either \(V'^{a'}_{l^{a'}}\) or \(V^{a'}_{j'}\), \(\mathfrak {B}(U^i_j)\ne \mathfrak {B}(U^{i'}_{j'})\).
Case 1b. \((U'^i_j,U'^{i'}_{j'}),i,i'\in D,a=A^d_{j-1}(i)<A^d_{j'-1}(i')=a'\)
Case 1c. \((U^i_j,U'^{i'}_{j'}),i\in E,i'\in D,a=A^e_{j-1}(i)<A^d_{j'-1}(i')=a'\)
Case 1d. \((U'^i_j,U^{i'}_{j'}),i\in D,i'\in E,a=A^d_{j-1}(i)<A^e_{j'-1}(i')=a'\)
We next turn to another easy group, where exactly one of them is in the right layer, and first-cross indices do not match:
Case 2a. \((U^i_j,U'^{i'}_{j'}),i,i'\in E,a=A^e_{j-1}(i)\ne i'\)
If \(a<i'\), \(V^{i'}_{l^{i'}}\) is in \(\mathfrak {B}(U'^{i'}_{j'})\) but not in \(\mathfrak {B}(U^i_j)\). If \(a>i'\), either \(V^a_{j-1}\) is in \(\mathfrak {B}(U^i_j)\) but not in \(\mathfrak {B}(U'^{i'}_{j'})\), or \(V'^a_{l^a}\) is in \(\mathfrak {B}(U^i_j)\) but not in \(\mathfrak {B}(U'^{i'}_{j'})\).
Case 2b. \((U^i_j,U'^{i'}_{j'}),i,i'\in D,i\ne A^d_{j'-1}(i')=a'\)
Case 2c. \((U^i_j,U^{i'}_{j'}),i\in E,i'\in D,a=A^e_{j-1}(i)\ne i'\)
Case 2d. \((U'^i_j,U'^{i'}_{j'}),i\in E,i'\in D,i\ne A^d_{j'-1}(i')=a'\)
The next group is even easier: both in the wrong layer, with non-matching first-cross indices. This takes care of all cases with non-matching first-cross indices.
Case 3a. \((U^i_j,U^{i'}_{j'}),i,i'\in D,i<i'\)
\(V'^{i'}_{l^{i'}}\) is in \(\mathfrak {B}(U^{i'}_{j'})\) but not in \(\mathfrak {B}(U^i_j)\).
Case 3b. \((U'^i_j,U'^{i'}_{j'}),i,i'\in E,i<i'\)
Case 3c. \((U^i_j,U'^{i'}_{j'}),i\in D,i'\in E,i<i'\)
Case 3d. \((U'^i_j,U^{i'}_{j'}),i\in E,i'\in D,i<i'\)
Next we turn to a slightly trickier group, where they are in the same layer, both in the right layer, and first-cross indices match.
Case 4. \((U^i_j,U^{i'}_{j'}),i,i'\in E,A^e_{j-1}(i)=A^e_{j'-1}(i')\)
Consider \(\mathfrak {C}(V^i_{j-1})=((b_1,a_1),...,(b_{k_0},a_{k_0}))\), \(\mathfrak {C}(V^{i'}_{j'-1})=((b'_1,a'_1),...,(b'_{k'_0},a'_{k'_0}))\). If the chains follow the same query paths (i.e., if \(k_0=k'_0\) and \((\forall k\le k_0)(a_k=a'_k)\)), assuming without loss of generality that \(k_0\) is odd and \(a_{k_0}\in E\) (from the chain-termination condition), we have \(V^{a_{k_0}}_{b_{k_0}}\in \mathfrak {B}(U^i_j)\), and \(V^{a_{k_0}}_{b'_{k_0}}\in \mathfrak {B}(U^{i'}_{j'})\), all other basis variables in the two extension equations being the same. Thus, if \(b_{k_0}\ne b'_{k_0}\), then \(\mathfrak {B}(U^i_j)\ne \mathfrak {B}(U^{i'}_{j'})\), and if \(b_{k_0}=b'_{k_0}\), \((U^i_j,U^{i'}_{j'})\) is either a predestined collision (if \(P^i_j=P^{i'}_{j'}\)) or it cannot be a collision. If the chains do not follow the same query path, we can find k such that \(a_k\ne a'_k\), which reduces to one of the previous cases.
Case 4a. \((U'^i_j,U'^{i'}_{j'}),i,i'\in D,A^d_{j-1}(i)=A^d_{j'-1}(i')\)
The next group is much simpler, where they are in different layers, both in the right layer, and first-cross indices match.
Case 5. \((U^i_j,U'^{i'}_{j'}),i\in E,i'\in D,a=A^e_{j-1}(i)=A^d_{j'-1}(i')\)
Without loss of generality, \(a\in E\). So \(V^a_{l^a}\) is in \(\mathfrak {B}(U^{i'}_{j'})\) but not in \(\mathfrak {B}(U^i_j)\).
Case 5a. \((U'^i_j,U^{i'}_{j'}),i\in D,i'\in E,A^d_{j-1}(i)=A^e_{j'-1}(i')\)
We’re almost done with the proof at this point. We wrap up with the few remaining cases. In the next group, they come from different layers, exactly one of them in the right layer, and first-cross indices match.
Case 6. \((U^i_j,U'^{i'}_{j'}),i,i'\in E,A^e_{j-1}(i)=i'\)
Here, \(V^{i'}_{l^{i'}}\) is in \(\mathfrak {B}(U^{i'}_{j'})\) but not in \(\mathfrak {B}(U^i_j)\).
Case 6a. \((U^i_j,U'^{i'}_{j'}),i,i'\in D,i=A^d_{j'-1}(i')\)
The four cases of the final group can be proved using the extension-chain-comparison technique of Case 4. In this group, they are in the same layer, at least one in the wrong layer, and first-cross indices match. (If they are both in the wrong layer, and first-cross indices match, they occur at the same query, so they cannot be in different layers, so this wraps up the case analysis.)
Case 7. \((U^i_j,U^{i'}_{j'}),i\in E,i'\in D,A^e_{j-1}(i)=i'\)
Case 7a. \((U'^i_j,U'^{i'}_{j'}),i\in E,i'\in D,i=A^d_{j'-1}(i')\)
Case 7b. \((U^i_j,U^i_{j'}),i\in D\)
Case 7c. \((U'^i_j,U'^i_{j'}),i\in E\)
This leaves only a few boundary cases (involving the likes of \(U^i_{l^i}\)), which can be easily verified. We just point out two special cases which underline the importance of choosing b as a balanced permutation. For the pair \((U^i_1,U'^i_1)\) for some i, if \(P^i_1=C^i_1\), the condition for an accidental collision becomes \(V^i_0+b(V^i_0)=0\), which is still n independent linear equations in terms of the bits, by choice of b. Similarly, if \(i\sim _{e_{l^i-1}}i'\), and \(b(P^i_{l^i})=P^{i'}_{l^i}\), the pair \((U^i_{l^i},U^{i'}_{l^i})\) yields the equation \(b(V^i_{l^i-1})+V^{i'}_{l^i-1}=0\), which again is n independent linear equations in terms of the bits.
Thus we establish our lemma. \(\square \)
6 Conclusion and Future Works
In this paper we propose a new Feistel-type length-preserving tweakable encryption scheme. Our construction, called FMix, has several advantages over CMC and other blockcipher-based enciphering schemes. It makes an optimal number of blockcipher calls using a single-keyed PRP blockcipher. The only drawback compared to EME is that the first layer of encryption, like CMC, is sequential. We can view our construction as a composition of type-1 and type-3 Feistel ciphers.
There are several possible directions for future work. When we apply a generic method to encrypt a last partial block of the message, we need an independent key. (This is always true for generic constructions.) However, one could devise a specific way to handle partial-block messages while keeping only one blockcipher key. The presence of the function b helps us to simplify the security proof. However, we do not know of any attack if we do not use this function (except for handling the tweak in the bottom layer; that use is necessary). So it would be interesting to see whether our proof can be extended to the variant without the function b.
References
 1. Berger, T.P., Minier, M., Thomas, G.: Extended generalized Feistel networks using matrix representation. In: Lange, T., Lauter, K., Lisoněk, P. (eds.) SAC 2013. LNCS, vol. 8282, pp. 289–305. Springer, Heidelberg (2014)
 2. Chakraborty, D., Sarkar, P.: HCH: a new tweakable enciphering scheme using the hash-encrypt-hash approach. In: Barua, R., Lange, T. (eds.) INDOCRYPT 2006. LNCS, vol. 4329, pp. 287–302. Springer, Heidelberg (2006)
 3. Chakraborty, D., Sarkar, P.: A new mode of encryption providing a tweakable strong pseudorandom permutation. In: Robshaw, M. (ed.) FSE 2006. LNCS, vol. 4047, pp. 293–309. Springer, Heidelberg (2006)
 4. Feistel, H.: Block cipher cryptographic system. US Patent 3,798,359, 19 March 1974
 5. Halevi, S.: EME*: Extending EME to handle arbitrary-length messages with associated data. In: Canteaut, A., Viswanathan, K. (eds.) INDOCRYPT 2004. LNCS, vol. 3348, pp. 315–327. Springer, Heidelberg (2004)
 6. Halevi, S.: Invertible universal hashing and the TET encryption mode. In: Menezes, A. (ed.) CRYPTO 2007. LNCS, vol. 4622, pp. 412–429. Springer, Heidelberg (2007)
 7. Halevi, S., Rogaway, P.: A tweakable enciphering mode. In: Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 482–499. Springer, Heidelberg (2003)
 8. Halevi, S., Rogaway, P.: A parallelizable enciphering mode. In: Okamoto, T. (ed.) CT-RSA 2004. LNCS, vol. 2964, pp. 292–304. Springer, Heidelberg (2004)
 9. Hoang, V.T., Krovetz, T., Rogaway, P.: Robust authenticated-encryption AEZ and the problem that it solves. In: Oswald, E., Fischlin, M. (eds.) EUROCRYPT 2015. LNCS, vol. 9056, pp. 15–44. Springer, Heidelberg (2015)
 10. Hoang, V.T., Rogaway, P.: On generalized Feistel networks. In: Rabin, T. (ed.) CRYPTO 2010. LNCS, vol. 6223, pp. 613–630. Springer, Heidelberg (2010)
 11. Liskov, M., Rivest, R.L., Wagner, D.: Tweakable block ciphers. In: Yung, M. (ed.) CRYPTO 2002. LNCS, vol. 2442, pp. 31–46. Springer, Heidelberg (2002)
 12. Luby, M., Rackoff, C.: How to construct pseudorandom permutations from pseudorandom functions. SIAM J. Comput. 17(2), 373–386 (1988)
 13. McGrew, D.A., Fluhrer, S.R.: The security of the extended codebook (XCB) mode of operation. In: Adams, C., Miri, A., Wiener, M. (eds.) SAC 2007. LNCS, vol. 4876, pp. 311–327. Springer, Heidelberg (2007)
 14. Nandi, M.: A generic method to extend message space of a strong pseudorandom permutation. Computación y Sistemas 12(3), 285–296 (2009)
 15. Nandi, M.: XLS is not a strong pseudorandom permutation. In: Sarkar, P., Iwata, T. (eds.) ASIACRYPT 2014. LNCS, vol. 8873, pp. 478–490. Springer, Heidelberg (2014)
 16. Naor, M., Reingold, O.: On the construction of pseudorandom permutations: Luby-Rackoff revisited. J. Cryptology 12(1), 29–66 (1999)
 17. Nyberg, K.: Generalized Feistel networks. In: Kim, K., Matsumoto, T. (eds.) Advances in Cryptology - ASIACRYPT 1996. LNCS, vol. 1163, pp. 91–104. Springer, Berlin Heidelberg (1996)
 18. Patarin, J.: Etude des Générateurs de Permutations Basés sur le Schéma du D.E.S. Ph.D. Thèse de Doctorat de l’Université de Paris 6 (1991)
 19. Patarin, J.: The “Coefficients H” technique. In: Avanzi, R.M., Keliher, L., Sica, F. (eds.) SAC 2008. LNCS, vol. 5381, pp. 328–345. Springer, Heidelberg (2009)
 20. Ristenpart, T., Rogaway, P.: How to enrich the message space of a cipher. In: Biryukov, A. (ed.) FSE 2007. LNCS, vol. 4593, pp. 101–118. Springer, Heidelberg (2007)
 21. Sarkar, P.: Improving upon the TET mode of operation. In: Nam, K.H., Rhee, G. (eds.) ICISC 2007. LNCS, vol. 4817, pp. 180–192. Springer, Heidelberg (2007)
 22. Sorkin, A.: Lucifer, a cryptographic algorithm. Cryptologia 8(1), 22–42 (1984)
 23. Data Encryption Standard: FIPS PUB 46. Federal Information Processing Standards Publication, Appendix A (1977)
 24. Vaudenay, S.: Decorrelation: a theory for block cipher security. In: Journal of Cryptology, Lecture Notes in Computer Science, vol. 16(4), pp. 249–286. Springer-Verlag, New York (2003)
 25. Wang, P., Feng, D., Wu, W.: HCTR: a variable-input-length enciphering mode. In: Feng, D., Lin, D., Yung, M. (eds.) CISC 2005. LNCS, vol. 3822, pp. 175–188. Springer, Heidelberg (2005)