1 Introduction

Information-theoretic indistinguishability proofs are fundamental tools in cryptography, and take a particularly prominent role in symmetric cryptography. In this context, it is imperative to derive bounds which are as precise as possible – a tighter bound yields a better understanding of the actual security of the system at hand, and avoids potential inefficiency provoked by the choice of unnecessarily large parameters, such as the key- and block-lengths, and the number of rounds.

This paper falls within a line of works investigating generic techniques to obtain best-possible information-theoretic bounds. We investigate a new approach to indistinguishability proofs – which we refer to as the chi-squared method – which will help us tighten (and simplify) proofs for certain examples where tight proofs have so far eluded more classical methods, such as the H-coefficient method.

Specifically, we apply our methodology to the analyses of three, a priori seemingly unrelated, constructions – the XOR of permutations (initially studied by Hall, Wagner, Kelsey, and Schneier [12]), the Encrypted Davies-Meyer construction by Cogliati and Seurin [10], and the Swap-or-not construction by Hoang, Morris, and Rogaway [13]. Previously, no connections between these problems had been observed; as an application of our framework, we give significantly improved bounds for each of them.

Information-theoretic indistinguishability. Many cryptographic security proofs require showing, for a distinguisher A with access to one of two systems, \(\mathbf {S}_{0}\) and \(\mathbf {S}_{1}\), an upper bound on

$$\begin{aligned} {\mathsf {Adv}}^{\mathsf {dist}}_{\mathbf {S}_{0}, \mathbf {S}_{1}}(A) = \Pr [A(\mathbf {S}_{0}) = 1] - \Pr [A(\mathbf {S}_{1}) = 1] \;, \end{aligned}$$

where \(\Pr [A(\mathbf {S}_b) = 1]\) denotes the probability that A outputs 1 when interacting with \(\mathbf {S}_b\).

While it is customary to only target the case where A is computationally bounded, in many cases, the actual proofs themselves are concerned with the information-theoretic case where the advantage is maximized over all distinguishers, only bounded by their number q of queries, but with no further restrictions on their time complexities. A first example in this domain is the analysis of Feistel networks in the seminal work of Luby and Rackoff [16], whose main step is a proof that the Feistel construction with truly random round functions is information-theoretically indistinguishable from a random permutation. (This was first pointed out explicitly by Maurer [18].) Another class of inherently information-theoretic analyses – dating back to the analysis of the Even-Mansour [11] block cipher – studies constructions in ideal models (such as the ideal-cipher or random-permutation models), where adversaries are also only bounded in their query-complexity.

In this context, the perhaps most widely-used proof technique is that of bounding the probability of a certain failure condition, where \(\mathbf {S}_{0}\) and \(\mathbf {S}_{1}\) behave identically, in some well-defined sense, as long as the condition is not violated. This approach was abstracted e.g. in Maurer’s random systems [19] and Bellare-Rogaway game playing [4] frameworks. Unfortunately, such methods are fairly crude, and often fall short of providing tight bounds, especially for so-called beyond-birthday security.

More sophisticated approaches [5, 23, 25] directly bound the statistical distance \(\Vert \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) \Vert \), where \(\mathsf {p}_{\mathbf {S}_{1}, A}\) and \(\mathsf {p}_{\mathbf {S}_{0}, A}\) are the respective probability distributions of the answers obtained by A, which is assumed to be deterministic. This is an upper bound on \({\mathsf {Adv}}^{\mathsf {dist}}_{\mathbf {S}_{0}, \mathbf {S}_{1}}(A)\). In particular, Patarin’s H-coefficient method [25] has recently re-gained substantial popularity, mostly thanks to Chen and Steinberger’s exposition [6]. The technique was further refined by Hoang and Tessaro [14], who provided a “smoothed” version of the H-coefficient method, called the “expectation method.”

A different avenue. Techniques such as the H-coefficient method heavily exploit computing the probabilities \(\mathsf {p}_{\mathbf {S}_{1}, A}(\mathbf {Z})\) and \(\mathsf {p}_{\mathbf {S}_{0}, A}(\mathbf {Z})\) that a full sequence of q outputs \(\mathbf {Z} = (Z_1, \ldots , Z_q)\) occurs. Often, these probabilities are easy to compute and compare under the condition that the sequence of outputs belongs to a set of good transcripts. One case where such methods do not, however, yield a good bound is when we are only given local information, e.g., the distance between \(\mathsf {p}_{\mathbf {S}_{1}, A}(\cdot \mid \mathbf {Z}_{i-1})\) and \(\mathsf {p}_{\mathbf {S}_{0}, A}(\cdot \mid \mathbf {Z}_{i-1})\) for all sequences \(\mathbf {Z}_{i-1}\) and all \(i \ge 1\), where \(\mathbf {Z}_{i-1}\) is the sequence of the first \(i - 1\) outputs. Here, the naïve approach is to use a so-called hybrid argument, and bound the distance as

$$\begin{aligned} \Vert \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) \Vert \le \sum _{i=1}^q \mathbf {E}\Bigl [\Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot \mid \mathbf {X}_{i-1}) - \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot \mid \mathbf {X}_{i-1})\Vert \Bigr ] \;, \end{aligned}$$
(1)

where \(\mathbf {X}_{i-1}\) is the vector of answers to A’s first \(i-1\) queries, according to \(\mathsf {p}_{\mathbf {S}_{0}, A}(\cdot )\). (Symmetrically, they can be all sampled according to \(\mathsf {p}_{\mathbf {S}_{1}, A}(\cdot )\).) If all summands are upper bounded by \(\epsilon \), we obtain a bound of \(q \epsilon \). This is rarely tight, and often sub-optimal. A different avenue was explored by Bellare and Impagliazzo (BI) [2], in an unpublished note. They consider the sequence of random variables \(U_1, \ldots , U_q\), where

$$\begin{aligned} U_i = \frac{ \mathsf {p}_{\mathbf {S}_{1}, A}(X_i|\mathbf {X}_{i-1})}{\mathsf {p}_{\mathbf {S}_{0}, A}(X_i|\mathbf {X}_{i-1})}\;, \end{aligned}$$

and \(\mathbf {X}_{i-1}\) and \(X_i\) are sampled from A’s interaction with \(\mathbf {S}_{0}\). Roughly, they show that if \(\left| U_i - 1\right| \) is sufficiently concentrated, say \(\left| U_i - 1\right| \le \epsilon \) for all i, except with probability \(\delta \), then the bound becomes

$$ \Vert \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) \Vert \le O(\sqrt{q} \cdot \epsilon \lambda ) + e^{-\lambda ^2/2} + \delta \;.$$

Unfortunately, the BI method is rather complex to use – it requires a careful balancing act in order to assess the trade-off between \(\epsilon \) and \(\delta \), and the additional slackness due to the \(\lambda \) term is also problematic and appears to be an artifact of the proof technique. To the best of our knowledge, the BI method was never used elsewhere.
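Before moving on, the looseness of the naive hybrid bound (1) is easy to see numerically. The sketch below (a toy i.i.d. example with illustrative distributions, not taken from the paper) computes the exact statistical distance between the q-fold products of two nearby distributions and compares it to q times the per-coordinate distance:

```python
import itertools

def tv(p, r):
    # statistical distance between two distributions given as probability lists
    return sum(max(0.0, pi - ri) for pi, ri in zip(p, r))

def tv_product(p, r, q):
    # exact statistical distance between the q-fold i.i.d. products of p and r
    d = 0.0
    for xs in itertools.product(range(len(p)), repeat=q):
        pp = rr = 1.0
        for x in xs:
            pp *= p[x]
            rr *= r[x]
        d += max(0.0, pp - rr)
    return d

mu = [0.36, 0.32, 0.32]          # toy "real" per-query answer distribution
nu = [1/3, 1/3, 1/3]             # toy "ideal" per-query answer distribution
q = 6

exact  = tv_product(mu, nu, q)   # true distance between the q-query transcripts
hybrid = q * tv(mu, nu)          # the bound from Eq. (1)
print(exact, hybrid)
```

The exact distance sits strictly between the single-query distance and the hybrid bound, illustrating the slack that motivates the methods below.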

Our method: The Chi-squared Method. In this work, we consider a different version of the above method. In particular, we revisit the setting of (1), and change our metric to measure the distance between \(\mu (\cdot ) = \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot \mid \mathbf {Z}_{i-1})\) and \(\nu (\cdot ) = \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot \mid \mathbf {Z}_{i-1})\). Instead of the statistical distance, we will use the so-called \(\chi ^2\)-divergence, as proposed by Neyman and Pearson,

$$\begin{aligned} \chi ^2(\mu , \nu ) = \sum _{x} \frac{\left( \mu (x) - \nu (x) \right) ^2}{\nu (x)} \;, \end{aligned}$$

where the sum is over all x such that \(\nu (x) > 0\), and we assume that if \(\mu (x) > 0\), then \(\nu (x) > 0\), too. In particular, letting \(\chi ^2(\mathbf {Z}_{i-1}) = \chi ^2(\mu , \nu )\) as above, we show that

$$\begin{aligned} \Vert \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) \Vert \le \sqrt{\frac{1}{2} \sum _{i=1}^q \mathbf {E}\left[ \chi ^2(\mathbf {X}_{i-1})\right] }\;, \end{aligned}$$

where for all \(i = 1, \ldots , q\), \(\mathbf {X}_{i-1}\) is sampled according to \(\mathsf {p}_{\mathbf {S}_{1}, A}(\cdot )\). We refer to the method of obtaining a bound by upper bounding the q expectations \(\mathbf {E}\left[ \chi ^2(\mathbf {X}_{i-1})\right] \) as the chi-squared method. A crucial property that will make calculations manageable and elegant is that the distribution of \(\mathbf {X}_{i-1}\) and the distribution in the denominator of the \(\chi ^2\)-divergence are with respect to different systems. In many cases, we will be able to show that \(\mathbf {E}\left[ \chi ^2(\mathbf {X}_{i-1})\right] \) is much smaller than the statistical distance \(\epsilon \) – even quadratically, i.e., \(O(\epsilon ^2)\) – and thus the method gives a very good bound of the order \(O(\sqrt{q} \epsilon )\).
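The simplest instance of this bound is the i.i.d. case, where every answer is drawn from \(\mu \) under one system and from \(\nu \) under the other; the bound then reads \(\Vert \mu ^{\otimes q} - \nu ^{\otimes q}\Vert \le \sqrt{q \cdot \chi ^2(\mu , \nu ) / 2}\). A small numerical check (toy distributions chosen for illustration only):

```python
import itertools, math

def chi2(p, r):
    # chi-squared divergence sum((p-r)^2 / r); r must have full support
    return sum((pi - ri)**2 / ri for pi, ri in zip(p, r))

def tv_product(p, r, q):
    # exact statistical distance between the q-fold i.i.d. products of p and r
    d = 0.0
    for xs in itertools.product(range(len(p)), repeat=q):
        pp = rr = 1.0
        for x in xs:
            pp *= p[x]
            rr *= r[x]
        d += max(0.0, pp - rr)
    return d

mu = [0.36, 0.32, 0.32]   # per-query answer distribution of the "real" system
nu = [1/3, 1/3, 1/3]      # per-query answer distribution of the "ideal" system
q = 6

exact = tv_product(mu, nu, q)
bound = math.sqrt(0.5 * q * chi2(mu, nu))
print(exact, bound)       # the chi-squared bound dominates the exact distance
```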

In contrast to the proof behind BI’s method, which relies on somewhat heavy machinery, such as Azuma’s inequality, the proof behind the chi-squared method is fairly simple, and relies on Pinsker’s and Jensen’s inequalities. In fact, we are not claiming that relations between the statistical distance and the \(\chi ^2\)-divergence are novel, but we believe this methodology to be new in the context of cryptographic indistinguishability proofs for interactive systems. Our method, as we discuss in the body of the paper, can also be seen as a generalization of a technique by Chung and Vadhan [8], used in a different context.

We will apply our method to three different problems, improving (or simplifying) existing bounds.

Application: XOR of random permutations. A potential drawback of block ciphers is that their permutation structure makes them unsuitable to be used as good pseudorandom functions, as they become distinguishable from a truly random function when reaching \(q \approx 2^{n/2}\) queries, where n is the block length. For this reason, Hall, Wagner, Kelsey, and Schneier [12] initiated the study of constructions of good pseudorandom functions from block ciphers with security beyond the so-called birthday barrier, i.e., above \(2^{n/2}\). A particularly simple construction they proposed – which we refer to as the XOR construction – transforms a permutation \(\pi : \{0,1\}^n \rightarrow \{0,1\}^n\) into a function \(f: \{0,1\}^{n-1} \rightarrow \{0,1\}^n\) by computing \(f(x) = \pi (0 \,\Vert \,x) \oplus \pi (1 \,\Vert \,x)\), where \(\pi \) is meant to be instantiated by a block cipher which is a good pseudorandom permutation, but is treated as a random permutation in the core argument of the proof, which we focus on.
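For concreteness, here is a minimal Python sketch of the XOR construction (toy block length, and a shuffled table standing in for the ideal random permutation rather than an actual block cipher; all parameter choices are illustrative):

```python
import random

n = 8                                  # toy block length (assumption)
rng = random.Random(2024)              # fixed seed for reproducibility

# A uniformly random permutation on n-bit strings models the block cipher.
pi = list(range(2**n))
rng.shuffle(pi)

def f(x):
    """XOR construction: f(x) = pi(0 || x) XOR pi(1 || x) for an (n-1)-bit x."""
    assert 0 <= x < 2**(n - 1)
    return pi[x] ^ pi[(1 << (n - 1)) | x]   # prepend bit 0 resp. bit 1 to x

outputs = [f(x) for x in range(2**(n - 1))]
# pi is injective, so pi(0||x) != pi(1||x), and f can never output 0^n:
print(any(o == 0 for o in outputs))  # False
```

The last line already hints at the tightness discussion in Sect. 4: unlike a truly random function, the construction never returns the all-zero block.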

Lucks [17] proved this construction to be secure up to roughly \(q = 2^{2n/3}\) queries, whereas Bellare and Impagliazzo [2] gave a better bound of \(O(n) q/2^n\), but only provided a proof sketch. Patarin [24] gave an improved bound of \(O(q/2^{n})\), but the proof was quite complex. This bound was further improved to \(q/2^n\) in an unpublished manuscript [26]. Patarin’s tight proof is very involved, using an approach he refers to as “mirror theory”, with some claims remaining open or unproved. (Also, as a related problem, Cogliati, Lampe, and Patarin [9] gave weaker bounds for the case of the sum of at least three permutations.) The XOR construction is particularly helpful as a tool for beyond-birthday security, and has been used for example within Iwata’s CENC mode of operation [15].

Here, as an application of the chi-squared method, we give a fairly simple proof giving us a bound of \((1.5q + 3 \sqrt{q})/2^n\). One can argue that the improvement is small (and in fact, if the bound in [26] is indeed correct, ours is slightly worse). However, we believe the analysis of the XOR construction to be fundamental, and it has evaded simple proofs for nearly two decades. While Patarin’s proof deals with precise bounds on the number of permutations satisfying a given input-output relationship, our approach is simpler in that it does not require a fine-grained understanding of the underlying distribution, but only requires computing certain expectations.

A related version of the construction is the one computing \(f'(x) = \pi _1(x) \oplus \pi _2(x)\) for two independent permutations \(\pi _1, \pi _2\). We also analyze this variant in Appendix A, giving a bound of \(q^{1.5} / 2^{1.5n}\), and in the body focus on the “single-key” variant which is somewhat harder to analyze and more efficient.

Application: The EDM construction. As another application of the chi-squared method, we study the encrypted Davies-Meyer (EDM) construction recently introduced by Cogliati and Seurin [10]. The construction depends on two random permutations \(\pi \) and \(\pi '\), and on input x outputs the value \(\pi '(\pi (x) \oplus x)\). Again, the goal is to show that this is a good PRF, with security beyond the birthday barrier. In [10], security up to \(q = 2^{2n/3}\) queries was shown. Using the chi-squared method, we show that security up to \(q = 2^{3n/4}\) queries is achieved. We note that in work concurrent to ours, Mennink and Neves [21] prove that EDM security approaches \(2^n\). Their bound uses Patarin’s mirror theory, and has a different purpose than ours – we aim for a simpler-to-use framework, and the question of whether our approach yields better bounds remains open for future work.

The EDM construction is the underlying structure of a nonce-based misuse-resistant MAC that Cogliati and Seurin (CS) proposed. CS proved that the MAC construction also achieves \(2n/3\) bits of security and conjectured that it actually has n-bit security. While our chi-squared technique seems able to handle the MAC construction as well, the combinatorics (also in CS’s work) will be very complex, and thus we leave this analysis for future work.

Application: Swap-or-not. As our final application, we consider the swap-or-not block cipher, introduced by Hoang, Morris, and Rogaway [13]. Swap-or-not is a block cipher that supports an arbitrary abelian group \(\mathbb {G}\) of size N as its domain, and, for sufficiently many rounds \(r = \varOmega (\log (N))\), is meant to withstand up to \(q < N/c\) queries, for a small constant \(c \ge 2\). This makes it particularly suitable as a cipher for format-preserving encryption (FPE) [3], both because of its flexibility to support multiple domain formats, and because its high security makes it suitable for small domains. Subsequent work [22, 30] focused on boosting its security to \(q = N\), at the cost of higher (worst-case) round complexity. The swap-or-not cipher is a particularly interesting object to analyze, as it uses a very different structure from the more usual Feistel-like designs. The original proof in [13] uses a fairly ad-hoc analysis, which however, as an intermediate step, ends up upper bounding exactly the quantity \(\mathbf {E}\left[ \chi ^2(\mathbf {X}_{i-1})\right] \). As a result, we end up saving a factor \(\sqrt{N}\) in the final advantage bound.

For example, for \(N = 2^{64}\), \(q = 2^{60}\), and r rounds, the original analysis gives a CCA-security advantage \(2^{90 - 0.415 r}\) vs one of approximately \(2^{62 - 0.415 r}\) for our new analysis. Thus, if we are interested in achieving security \(2^{-64}\), we would need \(r \ge 371\) rounds according to the old analysis, whereas our analysis shows that 293 rounds are sufficient.

A perspective and further related works. We conclude by stressing that, with respect to our current state of knowledge, there does not seem to be a universal method to obtain tight bounds on information-theoretic indistinguishability, and ultimately the best method depends on the problem at hand. This situation is no different from the one encountered in statistics, where proving bounds on the variational distance requires different tools depending on the context.

We are certainly not the first to observe the importance of using different metrics as a tool in cryptographic security proofs and reductions. For example, in symmetric cryptography, Steinberger [31] used the Hellinger distance to sharpen bounds on key-alternating ciphers. The H-coefficient technique itself can be seen as bounding a different distance metric between distributions. Further, cryptographic applications have often relied on using the KL-divergence, e.g., in parallel repetition theorems [7, 28], and Rényi divergences, e.g., in lattice-based cryptography [1].

2 Preliminaries

Notation. Let n be a positive integer. We use [n] to denote the set \(\{1, \ldots , n\}\). For a finite set S, we let \(x \leftarrow _{\$ }S\) denote sampling an element of S uniformly at random and assigning it to x. Let |x| denote the length of the string x, and for \(1 \le i < j \le |x|\), let \(x[i, j]\) denote the substring from the ith bit to the jth bit (inclusive) of x. If A is an algorithm, we let \(y \leftarrow A(x_1,\ldots ;r)\) denote running A with randomness r on inputs \(x_1,\ldots \) and assigning the output to y. We let \(y \leftarrow _{\$ }A(x_1,\ldots )\) denote the result of picking r at random and letting \(y \leftarrow A(x_1,\ldots ;r)\).

PRF security. Let \(F: \mathcal {K}\times \{0,1\}^m \rightarrow \{0,1\}^n\) be a family of functions. Let \(\mathrm {Func}(m, n)\) be the set of all functions \(g: \{0,1\}^m \rightarrow \{0,1\}^n\). For an adversary A, define

$$ {\mathsf {Adv}}^{\mathrm {prf}}_{F}(A) = \Pr \bigl [K \leftarrow _{\$ }\mathcal {K}: A^{F(K, \cdot )} = 1\bigr ] - \Pr \bigl [f \leftarrow _{\$ }\mathrm {Func}(m, n): A^{f(\cdot )} = 1\bigr ] $$

as the PRF advantage of A attacking F.

Distance measures. Let \(\mu \) and \(\nu \) be two distributions on a finite event space \(\varOmega \). The statistical distance between \(\mu \) and \(\nu \) is defined as

$$ \Vert \mu - \nu \Vert = \sum _{x \in \varOmega } \max \{0, \mu (x) - \nu (x)\}. $$

The Kullback-Leibler (KL) divergence between \(\mu \) and \(\nu \) is defined as

$$ \varDelta _{\mathrm {KL}}(\mu , \nu ) = \sum _{x \in \varOmega } \mu (x) \ln \Bigl (\frac{\mu (x)}{\nu (x)} \Bigr ). $$

Note that for \(\varDelta _{\mathrm {KL}}\) to be well-defined, we need \(\nu \) to have full support, i.e., \(\nu (x) > 0\) for every \(x \in \varOmega \). The well-known Pinsker’s inequality relates the previous two notions.

Lemma 1 (Pinsker’s inequality)

Let \(\mu \) and \(\nu \) be two distributions on a finite event space \(\varOmega \) such that \(\nu \) has full support. Then

$$ (\Vert \mu - \nu \Vert )^2 \le \frac{1}{2}\varDelta _{\mathrm {KL}}(\mu , \nu ). $$

Another well-known fact about the KL-divergence is that it decomposes nicely for product distributions, via the chain rule. The chi-squared divergence between \(\mu \) and \(\nu \) is defined as

$$ \chi ^2(\mu , \nu ) = \sum _{x \in \varOmega } \frac{(\mu (x) - \nu (x))^2}{\nu (x)}. $$

Note that for \(\chi ^2(\mu , \nu )\) to be well-defined, again \(\nu \) needs to have full support. We remark that \(\chi ^2(\mu , \nu )\) is related to the notion of collision probability. To justify this remark, let \(\varOmega \) be some finite set and let \(M = |\varOmega |\). Let \(\nu \) be the uniform distribution over \(\varOmega \) and \(\mu \) be any distribution over \(\varOmega \). Let \(X_1, X_2\) be two i.i.d. samples from \(\mu \). Then

$$\begin{aligned} \chi ^2(\mu , \nu ) = \sum _{x \in \varOmega } M \cdot (\mu (x) - 1/M)^2 = M \cdot \Pr [X_1 = X_2] - 1. \end{aligned}$$
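This identity is easy to verify numerically. The following sketch (the distribution and seed are illustrative choices, not from the text) checks \(\chi ^2(\mu , \nu ) = M \cdot \Pr [X_1 = X_2] - 1\) for \(\nu \) uniform:

```python
import random

rng = random.Random(7)
M = 6
w = [rng.random() + 0.01 for _ in range(M)]
s = sum(w)
mu = [x / s for x in w]           # an arbitrary distribution on M points

chi2 = sum(M * (p - 1 / M)**2 for p in mu)
coll = sum(p * p for p in mu)     # Pr[X1 = X2] for i.i.d. X1, X2 ~ mu
print(abs(chi2 - (M * coll - 1))) # ~0, up to floating-point error
```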

The following lemma relates the chi-squared divergence and the KL-divergence.

Lemma 2

Let \(\varOmega \) be a finite set, and let \(\mu \) and \(\nu \) be two distributions on \(\varOmega \) such that \(\nu \) has full support. Then

$$ \varDelta _{\mathrm {KL}}(\mu , \nu ) \le \chi ^2(\mu , \nu ). $$

Proof

Since the function \(\ln (x)\) is concave, Jensen’s inequality gives

$$\begin{aligned} \sum _{x \in \varOmega } \mu (x) \ln \Bigl ( \frac{\mu (x)}{\nu (x)} \Bigr ) \le \ln \Bigl ( \sum _{x \in \varOmega } \frac{(\mu (x))^2}{\nu (x)}\Bigr ) . \end{aligned}$$
(2)

Next,

$$\begin{aligned} \sum _{x \in \varOmega } \frac{(\mu (x) - \nu (x))^2}{\nu (x)} = \sum _{x \in \varOmega } \frac{(\mu (x))^2}{\nu (x)} - \sum _{x \in \varOmega } (2\mu (x) - \nu (x)) = \sum _{x \in \varOmega } \frac{(\mu (x))^2}{\nu (x)} - 1. \end{aligned}$$
(3)

Finally, using the inequality that \(e^t - 1 \ge t\) for any real number t, we have

$$\begin{aligned} \sum _{x \in \varOmega } \frac{(\mu (x))^2}{\nu (x)} - 1 \ge \ln \Bigl ( \sum _{x \in \varOmega } \frac{(\mu (x))^2}{\nu (x)}\Bigr ). \end{aligned}$$
(4)

From Eqs. (2)–(4), we obtain the claimed result.
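The chain \(\Vert \mu - \nu \Vert ^2 \le \frac{1}{2}\varDelta _{\mathrm {KL}}(\mu , \nu ) \le \frac{1}{2}\chi ^2(\mu , \nu )\) given by Lemmas 1 and 2 can be spot-checked numerically; the sketch below uses random toy distributions (seed and support size are arbitrary choices):

```python
import math, random

rng = random.Random(1)

def rand_dist(m):
    # a random distribution on m points, bounded away from zero
    w = [rng.random() + 0.01 for _ in range(m)]
    s = sum(w)
    return [x / s for x in w]

mu, nu = rand_dist(6), rand_dist(6)   # nu has full support by construction

tv  = sum(max(0.0, p - r) for p, r in zip(mu, nu))       # statistical distance
kl  = sum(p * math.log(p / r) for p, r in zip(mu, nu))   # KL-divergence
chi = sum((p - r)**2 / r for p, r in zip(mu, nu))        # chi-squared divergence

print(tv**2 <= 0.5 * kl <= 0.5 * chi)  # True: Pinsker, then Lemma 2
```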

3 The Chi-Squared Method

In this section, we describe the chi-squared method, which simplifies previous results by Bellare and Impagliazzo (BI), and Chung and Vadhan (CV) [2, 8].

Notational setup. Let A be an adversary that tries to distinguish two stateless systems \(\mathbf {S}_{1}\) and \(\mathbf {S}_{0}\). Since we allow A to be computationally unbounded, without loss of generality, assume that A is deterministic. Assume further that the adversary always makes exactly q queries. Since the adversary is deterministic, for any \(i \le q - 1\), the answers for the first i queries completely determine the first \(i + 1\) queries. For a system \(\mathbf {S}\in \{\mathbf {S}_{1}, \mathbf {S}_{0}\}\) and strings \(z_1, \ldots , z_i\), let \(\mathsf {p}_{\mathbf {S}, A}(z_1, \ldots , z_i)\) denote the probability that when the adversary A interacts with system \(\mathbf {S}\), the answers for the first i queries that it receives are \(z_1, \ldots , z_i\). If \(\mathsf {p}_{\mathbf {S}, A}(z_1, \ldots , z_i) > 0\), let \(\mathsf {p}_{\mathbf {S}, A}(z_{i + 1} \mid z_1, \ldots , z_i)\) denote the conditional probability that the answer for the \((i + 1)\)-th query when the adversary interacts with system \(\mathbf {S}\) is \(z_{i + 1}\), given that the answers for the first i queries are \(z_1, \ldots , z_i\) respectively. For each \(\varvec{Z}= (z_1, \ldots , z_q)\), let \(\varvec{Z}_i = (z_1, \ldots , z_i)\), and for \(\mathbf {S}\in \{\mathbf {S}_{1}, \mathbf {S}_{0}\}\), let \(\mathsf {p}_{\mathbf {S}, A}(\cdot \mid \varvec{Z}_i)\) denote \(\mathsf {p}_{\mathbf {S}, A}(\cdot \mid z_1, \ldots , z_i)\). We let \(\varvec{Z}_0\) be the empty vector, and \(\mathsf {p}_{\mathbf {S}, A}(\cdot \mid \varvec{Z}_{0})\) is understood as \(\mathsf {p}_{\mathbf {S}, A}(\cdot )\).

The technique. We first give a brief intuition regarding our technique. On the high level, the chi-squared method relates the statistical distance of a product distribution to the expected chi-squared divergence of the components, via Kullback-Leibler divergence. The advantage of this approach is that the term that depends on the number of components, say q, is “under the square-root”, because of Pinsker’s inequality. The details follow.

For each \(i \le q\) and each vector \(\varvec{Z}_{i - 1} = (z_1, \ldots , z_{i - 1})\), define (with slight abuse of notation)

$$\begin{aligned} \chi ^2(\varvec{Z}_{i-1}) = \chi ^2\bigl (\mathsf {p}_{\mathbf {S}_{1}, A}(\cdot \mid \varvec{Z}_{i - 1}),\; \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot \mid \varvec{Z}_{i - 1})\bigr ) = \sum _{z_i} \frac{\Bigl ( \mathsf {p}_{\mathbf {S}_{1}, A}(z_i \mid \varvec{Z}_{i - 1}) - \mathsf {p}_{\mathbf {S}_{0}, A}(z_i \mid \varvec{Z}_{i - 1}) \Bigr )^2}{ \mathsf {p}_{\mathbf {S}_{0}, A}(z_i \mid \varvec{Z}_{i - 1})}, \end{aligned}$$

where the sum is taken over all \(z_i\) in the support of the distribution \(\mathsf {p}_{\mathbf {S}_{0}, A}(\cdot \mid \varvec{Z}_{i - 1})\). We require that if \(\mathsf {p}_{\mathbf {S}_{1}, A}(\varvec{Z}_{i}) > 0\), then \(\mathsf {p}_{\mathbf {S}_{0}, A}(\varvec{Z}_i) > 0\) as well, so that \(\chi ^2(\varvec{Z}_{i-1})\) is well-defined. Typically, in applications, \(\mathbf {S}_{0}\) is the “ideal” system, and this technical constraint is always met.

The following lemma bounds the distinguishing advantage of A.

Lemma 3

Suppose that whenever \(\mathsf {p}_{\mathbf {S}_{1}, A}(\varvec{Z}_i) > 0\), we also have \(\mathsf {p}_{\mathbf {S}_{0}, A}(\varvec{Z}_i) > 0\). Then,

$$ \Vert \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot )\Vert \le \Bigl (\frac{1}{2}\sum _{i = 1}^{q} \mathbf {E}[\chi ^2(\varvec{X}_{i-1})]\Bigr )^{1/2}, $$

where the expectation is taken over vectors \(\varvec{X}_{i-1}\) of the first \(i - 1\) answers sampled according to the interaction with \(\mathbf {S}_{1}\).

Discussion. To illustrate the power of the chi-squared method, suppose that

$$ \Bigl | \frac{ \mathsf {p}_{\mathbf {S}_{1}, A}(z_i \mid \varvec{Z}_{i - 1})}{ \mathsf {p}_{\mathbf {S}_{0}, A}(z_i \mid \varvec{Z}_{i - 1})} - 1 \Bigr | \le \varepsilon $$

for every i and every \(\varvec{Z}_i\). If one uses the H-coefficient technique, the first step is to give a lower bound for the ratio \(\mathsf {p}_{\mathbf {S}_{1}, A}(\varvec{Z}) / \mathsf {p}_{\mathbf {S}_{0}, A}(\varvec{Z})\), which is

$$ \prod _{i = 1}^{q} \frac{\mathsf {p}_{\mathbf {S}_{1}, A}(z_i \mid \varvec{Z}_{i - 1})}{\mathsf {p}_{\mathbf {S}_{0}, A}(z_i \mid \varvec{Z}_{i - 1})} \ge (1 - \varepsilon )^q \ge 1 - \varepsilon q. $$

Thus the distinguishing advantage is at most the statistical distance between \(\mathsf {p}_{\mathbf {S}_{0}, A}(\cdot )\) and \(\mathsf {p}_{\mathbf {S}_{1}, A}(\cdot )\), which is

$$\begin{aligned} \sum _{\varvec{Z}} \max \{0, \mathsf {p}_{\mathbf {S}_{0}, A}(\varvec{Z}) - \mathsf {p}_{\mathbf {S}_{1}, A}(\varvec{Z})\} \le \sum _{\varvec{Z}} \varepsilon q \cdot \mathsf {p}_{\mathbf {S}_{0}, A}(\varvec{Z}) \le \varepsilon q. \end{aligned}$$

In contrast, from Lemma 3, the distinguishing advantage is at most \(\varepsilon \sqrt{q/2}\), because

$$\begin{aligned} \chi ^2(\varvec{Z}_{i - 1}) = \sum _{z_i} \mathsf {p}_{\mathbf {S}_{0}, A}(z_i \mid \varvec{Z}_{i - 1}) \Bigl ( \frac{ \mathsf {p}_{\mathbf {S}_{1}, A}(z_i \mid \varvec{Z}_{i - 1})}{ \mathsf {p}_{\mathbf {S}_{0}, A}(z_i \mid \varvec{Z}_{i - 1})} - 1 \Bigr )^2 \le \sum _{z_i} \mathsf {p}_{\mathbf {S}_{0}, A}(z_i \mid \varvec{Z}_{i - 1}) \cdot \varepsilon ^2 = \varepsilon ^2. \end{aligned}$$

This is why the chi-squared method can substantially improve the security bound in many settings, as we will demonstrate in subsequent sections.
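The gap between the two bounds can be seen concretely in the extreme case where the ratio is exactly \(1 \pm \varepsilon \) at every point, so that \(\chi ^2 = \varepsilon ^2\). The sketch below (a toy binary example; the parameter values are illustrative) compares both bounds against the true distance between the q-query transcript distributions:

```python
import itertools, math

eps, q = 0.02, 10
mu = [(1 + eps) / 2, (1 - eps) / 2]   # ratio mu(x)/nu(x) is exactly 1 +/- eps
nu = [0.5, 0.5]

chi2 = sum((p - r)**2 / r for p, r in zip(mu, nu))   # equals eps^2

exact = 0.0   # exact distance between the q-fold product distributions
for xs in itertools.product(range(2), repeat=q):
    pp = math.prod(mu[x] for x in xs)
    rr = math.prod(nu[x] for x in xs)
    exact += max(0.0, pp - rr)

hybrid = eps * q                     # hybrid / H-coefficient-style bound
chisq  = eps * math.sqrt(q / 2)      # bound from the chi-squared method
print(exact, chisq, hybrid)         # exact <= chisq < hybrid
```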

Proof

(of Lemma 3). Recall that the adversary’s distinguishing advantage is at most the statistical distance between \(\mathsf {p}_{\mathbf {S}_{0}, A}(\cdot )\) and \(\mathsf {p}_{\mathbf {S}_{1}, A}(\cdot )\). On the other hand, from Pinsker’s inequality,

$$\begin{aligned} 2\Bigl ( \Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) \Vert \Bigr )^2 \le \varDelta _{\mathrm {KL}}\bigl (\mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ),\, \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot )\bigr ) = \sum _{i = 1}^{q} \sum _{\varvec{Z}_i = (z_1, \ldots , z_i)} \mathsf {p}_{\mathbf {S}_{1}, A}(\varvec{Z}_i) \ln \Bigl ( \frac{\mathsf {p}_{\mathbf {S}_{1}, A}(z_i \mid \varvec{Z}_{i - 1})}{\mathsf {p}_{\mathbf {S}_{0}, A}(z_i \mid \varvec{Z}_{i - 1})} \Bigr ), \end{aligned}$$
(5)

where the equality is the chain rule for the KL-divergence.

Fix \(i \le q\) and \(\varvec{Z}_{i - 1}\). Let \(\mu \) and \(\nu \) be the distributions \(\mathsf {p}_{\mathbf {S}_{1}, A}(\cdot \mid \varvec{Z}_{i - 1})\) and \(\mathsf {p}_{\mathbf {S}_{0}, A}(\cdot \mid \varvec{Z}_{i - 1})\) respectively. Let S be the support of \(\nu \), and recall that the support of \(\mu \) is a subset of S. Notice that from Lemma 2, we have

$$\begin{aligned} \sum _{x \in S} \mu (x) \ln \Bigl ( \frac{\mu (x)}{\nu (x)} \Bigr ) \le \sum _{x \in S} \frac{(\mu (x) - \nu (x))^2}{\nu (x)}. \end{aligned}$$
(6)

From Eqs. (5) and (6),

$$\begin{aligned} 2\Bigl ( \Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot )- \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) \Vert \Bigr )^2&\le \sum _{i = 1}^q \sum _{\varvec{Z}_i = (z_1, \ldots , z_i)} \mathsf {p}_{\mathbf {S}_{1}, A}(\varvec{Z}_{i-1}) \frac{\Bigl ( \mathsf {p}_{\mathbf {S}_{1}, A}(z_i \mid \varvec{Z}_{i - 1}) - \mathsf {p}_{\mathbf {S}_{0}, A}(z_i \mid \varvec{Z}_{i - 1})\Bigr )^2}{\mathsf {p}_{\mathbf {S}_{0}, A}(z_i \mid \varvec{Z}_{i - 1})} \\&= \sum _{i = 1}^q \sum _{\varvec{Z}_{i-1}} \mathsf {p}_{\mathbf {S}_{1}, A}(\varvec{Z}_{i-1}) \cdot \chi ^2(\varvec{Z}_{i-1}) = \sum _{i = 1}^{q} \mathbf {E}[ \chi ^2(\varvec{X}_{i-1})]. \end{aligned}$$

This concludes the proof.    \(\square \)
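As a sanity check of Lemma 3 (a classical example, not one of the paper's applications), consider distinguishing a random permutation from a random function with q distinct queries: under \(\mathbf {S}_{1}\) the answers are sampled without replacement, under \(\mathbf {S}_{0}\) uniformly with replacement. Both sides of the lemma then have closed forms: the exact statistical distance is the birthday collision probability, and, with \(k = i - 1\) distinct previous answers, a direct computation gives \(\chi ^2(\varvec{Z}_{i-1}) = k^2/(N(N-k)) + k/N\) deterministically under \(\mathbf {S}_{1}\). The sketch below (toy parameters) verifies that the lemma's bound dominates the exact distance:

```python
import math

N, q = 2**8, 16   # toy domain size and query count (assumption)

# Exact statistical distance between the two q-tuple answer distributions:
# it equals the probability that q uniform samples from [N] collide.
no_coll = 1.0
for i in range(q):
    no_coll *= (N - i) / N
exact = 1 - no_coll

# With k = i - 1 distinct previous answers under S1:
# chi^2(Z_{i-1}) = k^2 / (N*(N-k)) + k/N, deterministically.
total = sum(k * k / (N * (N - k)) + k / N for k in range(q))
bound = math.sqrt(0.5 * total)
print(exact, bound)   # Lemma 3: exact <= bound
```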

Comparison with CV’s framework. Underneath CV’s work is, in essence, a specialized treatment of our framework for the case where the ideal system \(\mathbf {S}_{0}\) implements an ideal random function. Thus their method can be used to justify the security of the xor of two permutations (Sect. 4) and the Encrypted Davies-Meyer PRF (Sect. 5), but it does not work for the Swap-or-Not shuffle (Sect. 6). CV, however, do not consider these potential applications, and focus only on the Generalized Leftover Hash Lemma (GLHL) for block sources. To the best of our knowledge, CV’s method was never used for any other application, perhaps because it is phrased in language specific to the context of the GLHL.

Comparison with BI’s framework. Compared to BI’s framework, ours is better in both usability and tightness.

  • In BI’s method, the bound is a formula of two user-provided parameters. Consequently, to use BI’s method, one has to fine-tune the parameters to optimize the bound. Moreover, since BI’s method requires strong concentration bounds, in applications such as the xor of two permutations, one has to make non-trivial use of martingales and Azuma’s inequality. In contrast, under the chi-squared method, when we handle the xor of two permutations in Sect. 4, we only compute an expectation, and there is no need to use advanced probabilistic tools.

  • Due to BI’s requirement of strong concentration bounds, in some settings the results that BI’s method obtains can be sub-optimal. The looseness in BI’s method varies greatly among different settings. For example, for the xor of two permutations, BI’s bound is about \(nq / 2^n\), whereas ours is just \(q / 2^n\). For the Encrypted Davies-Meyer PRF, BI’s method only gives \(\frac{2n}{3}\)-bit security, which is on par with the result of Cogliati and Seurin via the H-coefficient technique, but our method yields \(\frac{3n}{4}\)-bit security. Finally, for the Swap-or-Not shuffle, BI’s framework does not mesh with the analysis in [13], whereas our method can easily make use of the analysis in [13] to improve their result.

4 The XOR Construction

In this section, we consider the so-called xor-construction, which was initially proposed in [12], and which is used to efficiently obtain a good pseudorandom function from a block cipher. Here, in particular, we consider a version which involves only one permutation (at the price of a slightly smaller domain). We analyze a two-permutation version in Appendix A.

Setup and main theorem. Let \(\mathrm {Perm}(n)\) be the set of permutations \(\pi : \{0,1\}^n \rightarrow \{0,1\}^n\). Define \(\mathsf {XOR}[n]: \mathrm {Perm}(n) \times \{0,1\}^{n - 1} \rightarrow \{0,1\}^n\) to be the construction that takes a permutation \(\pi \in \mathrm {Perm}(n)\) as a key, and on input x it returns \(\pi (x \,\Vert \,0) \oplus \pi (x \,\Vert \,1)\). Theorem 1 below gives the PRF security of \(\mathsf {XOR}[n]\).

Theorem 1

Fix an integer \(n \ge 8\). For any adversary A that makes \(q \le 2^{n - 5}\) queries we have

$$ {\mathsf {Adv}}^{\mathrm {prf}}_{\mathsf {XOR}[n]}(A) \le \frac{1.5q + 3 \sqrt{q}}{2^n} . $$

Discussion. Before proceeding to the proof, we make a few remarks. First, the bound in Theorem 1 is tight, since in the real system (the one implementing \(\mathsf {XOR}[n]\)), no answer can be \(0^n\). Hence if one simply looks for a \(0^n\)-answer among the q answers, one can distinguish the two systems with advantage \(1 - (1 - 1/2^n)^q \approx q / 2^n\). Next, if we blindly use the chi-squared method, with \(\mathbf {S}_{1}\) being the real system, and \(\mathbf {S}_{0}\) the ideal one (the one implementing a uniformly random function), then the bound is weak, around \(\sqrt{q / 2^n}\). The reason is that, for each \(i \le q\) and each \(\varvec{Z}_{i - 1} = (z_1, \ldots , z_{i - 1})\) that the real system can produce for its first \(i - 1\) answers,

$$ \chi ^2(\varvec{Z}_{i - 1}) \ge \frac{\Bigl (\mathsf {p}_{\mathbf {S}_{1}}(0^n \mid \varvec{Z}_{i - 1}) - \mathsf {p}_{\mathbf {S}_{0}}(0^n \mid \varvec{Z}_{i - 1})\Bigr )^2}{\mathsf {p}_{\mathbf {S}_{0}}(0^n \mid \varvec{Z}_{i - 1})} = \frac{1}{2^n}. $$

Hence when we sample \(\varvec{X}_{i - 1}\) according to the interaction with \(\mathbf {S}_{1}\), it holds that \(\mathbf {E}[\chi ^2(\varvec{X}_{i - 1})] \ge 1/2^n\), and consequently we end up with an inferior bound of \(\sqrt{q / 2^n}\). To avoid this issue, the system \(\mathbf {S}_{0}\) in our proof is instead a “normalized” version of the ideal system: it only outputs uniformly random answers in \(\{0,1\}^n \backslash \{0^n\}\). This normalization introduces a term \(q / 2^n\) in the bound, but the important point is that this term will not be under the square root. We will use the chi-squared method with \(\mathbf {S}_{1}\) being the real system, and \(\mathbf {S}_{0}\) being the normalized ideal system.
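The cost of this normalization is exactly the probability that the un-normalized ideal system ever returns \(0^n\), which is where the additive \(q/2^n\) term comes from. A quick numerical check (toy parameters, chosen small enough that floating point stays accurate):

```python
n, q = 20, 2**10   # toy parameters (assumption)

# Probability that a uniformly random function returns 0^n at least once
# over q queries: this is the statistical distance between the ideal system
# and its normalized version, and also the 0^n-distinguisher's advantage.
p_bad = 1 - (1 - 2**-n) ** q
union_bound = q / 2**n
print(p_bad <= union_bound)   # True: the union bound q/2^n dominates
```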

Proof

(Theorem 1 ). Let \(\mathbf {S}_{1}\) be the real system, and let \(\mathbf {S}_2\) be the ideal system. To obtain a good advantage, as explained above, we’ll first “normalize” \(\mathbf {S}_2\) to obtain another system \(\mathbf {S}_{0}\): the system that implements an ideal random function mapping \(\{0,1\}^{n - 1}\) to \(\{0,1\}^n \backslash \{0^n\}\). Let \(\varGamma _{\mathrm {good}}= (\{0,1\}^n \backslash \{0^n\})^q\), and \(\varGamma _{\mathrm {bad}}= (\{0,1\}^n)^q \backslash \varGamma _{\mathrm {good}}\). Recall that \({\mathsf {Adv}}^{\mathrm {prf}}_{\mathsf {XOR}[n]}(A)\) is at most the statistical distance between \(\mathsf {p}_{\mathbf {S}_{1}, A}\) and \(\mathsf {p}_{\mathbf {S}_2, A}\). From the triangle inequality,

$$ \Vert \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_2, A}(\cdot ) \Vert \le \Vert \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) \Vert + \Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_2, A}(\cdot ) \Vert . $$

Let T be the random variable for the q answers in \(\mathbf {S}_2\). Then

$$\begin{aligned} \Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_2, A}(\cdot ) \Vert= & {} \sum _{\varvec{Z}} \max \{0, \mathsf {p}_{\mathbf {S}_2, A}(\varvec{Z}) - \mathsf {p}_{\mathbf {S}_{0}, A}(\varvec{Z}) \} \\= & {} \sum _{\varvec{Z}\in \varGamma _{\mathrm {bad}}} \mathsf {p}_{\mathbf {S}_2, A}(\varvec{Z}) = \Pr [T \in \varGamma _{\mathrm {bad}}] \end{aligned}$$

where the second equality is due to the fact that \(\mathsf {p}_{\mathbf {S}_2, A}(\varvec{Z}) > \mathsf {p}_{\mathbf {S}_{0}, A}(\varvec{Z})\) if and only if \(\varvec{Z}\in \varGamma _{\mathrm {bad}}\), and \(\mathsf {p}_{\mathbf {S}_{0}, A}(\varvec{Z}) = 0\) for every \(\varvec{Z}\in \varGamma _{\mathrm {bad}}\). Note that \(\Pr [T \in \varGamma _{\mathrm {bad}}]\) is the probability that among the q answers of \(\mathbf {S}_2\) (the system implementing a uniformly random function) there is at least one \(0^n\)-answer, which happens with probability at most \(q / 2^n\).
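The last step is a union bound; as a quick numeric sanity check (parameter values illustrative), the exact probability \(1 - (1 - 2^{-n})^q\) of seeing a \(0^n\)-answer indeed stays below \(q / 2^n\):

```python
def prob_zero_answer(n, q):
    """Exact probability that among q uniform n-bit answers at least one is 0^n."""
    return 1.0 - (1.0 - 2.0 ** (-n)) ** q
```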

What is left is to bound \(\Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) \Vert \). We shall use the chi-squared method. Let \(\varvec{X}= (X_1, \ldots , X_q)\) be the random variable for the q answers in \(\mathbf {S}_{1}\), and let \(\varvec{X}_i = (X_1, \ldots , X_i)\) for every \(i \le q\). Fix \(i \le q\) and fix \(x \in \{0,1\}^n \backslash \{0^n\}\). Let \(Y_{i, x}\) be the following random variable. If \(\varvec{X}_{i - 1}\) takes values \((z_1, \ldots , z_{i - 1})\) then \(Y_{i, x}\) takes the value \(\mathsf {p}_{\mathbf {S}_{1}, A}(x \mid z_1, \ldots , z_{i - 1})\). Recall that

$$\begin{aligned} \chi ^2(\varvec{X}_{i - 1})= & {} \sum _{x \in \{0,1\}^n \backslash \{0^n\}} \frac{(Y_{i, x} - 1/(2^n - 1))^2}{1/(2^n - 1)} \nonumber \\\le & {} \sum _{x \in \{0,1\}^n \backslash \{0^n\}} 2^n \cdot (Y_{i, x} - 1/(2^n - 1))^2. \end{aligned}$$
(7)

We now expand \(Y_{i, x}\) into a more expressive and convenient formula to work with. Let \(\pi \in \mathrm {Perm}(n)\) be the secret key of \(\mathsf {XOR}[n]\). Let \(m_1, \ldots , m_{i}\) be the first i queries of the adversary. Let \(V_1 = \pi (m_1 \,\Vert \,0), V_2 = \pi (m_1 \,\Vert \,1), \ldots , V_{2i - 3} = \pi (m_{i - 1} \,\Vert \,0),\) and \(V_{2i - 2} = \pi (m_{i - 1} \,\Vert \,1)\). Regardless of how the adversary chooses its queries, marginally, these \(V_1, \ldots , V_{2i - 2}\) are simply random variables sampled uniformly without replacement from \(\{0,1\}^{n}\). Let \(S = \{V_1, \ldots , V_{2i - 2} \}\). Let \(D_{i, x}\) be the number of pairs \((u, u \oplus x)\) such that both u and \(u \oplus x\) belong to S. Note that S and \(D_{i, x}\) are both random variables, and in fact functions of the random variables \(V_1, \ldots , V_{2i - 2}\). If \(\pi (m_{i} \,\Vert \,0) \oplus \pi (m_{i} \,\Vert \,1) = x\), there are exactly \(2^{n} - 4(i - 1) + D_{i, x}\) choices for the pair \((\pi (m_{i} \,\Vert \,0), \pi (m_{i} \,\Vert \,1))\):

  • First, \(\pi (m_{i} \,\Vert \,0)\) must take value in \(\{0,1\}^{n} \backslash (S \cup S^*)\), where \(S^* = \{u \oplus x \mid u \in S\}\). There are exactly \(2^n - |S \cup S^*| = 2^n - |S| - |S^*| + |S \cap S^*| = 2^n - 4(i -1) + D_{i, x}\) choices for \(\pi (m_{i} \,\Vert \,0)\).

  • Once \(\pi (m_{i} \,\Vert \,0)\) is fixed, the value of \(\pi (m_{i} \,\Vert \,1)\) is determined.

Hence

$$ Y_{i, x} = \frac{2^n - 4(i - 1) + D_{i, x}}{(2^n - 2i + 1)(2^n - 2i)}, $$

and thus

$$ |Y_{i, x} - 1/(2^{n} - 1)| = \frac{|(2^n - 1) D_{i, x} - 4(i - 1)^2 + 2(2^n - i)|}{(2^n - 2i + 1) (2^n - 2i)(2^n - 1)}. $$

Note that

$$\begin{aligned}&\frac{|(2^n - 1) D_{i, x} - 4(i - 1)^2 + 2(2^n - i)|}{2^n - 1} \\= & {} \Bigl |D_{i, x} - \frac{4(i - 1)^2}{2^n - 1} + 2 - \frac{2(i-1)}{2^n - 1}\Bigr | \\= & {} \Bigl |D_{i, x} - \frac{4(i - 1)^2}{2^n} + 2 - \frac{2(i - 1)}{2^n - 1} - \frac{4(i - 1)^2}{2^n(2^n - 1)} \Bigr | \\\le & {} \Bigl |D_{i, x} - \frac{4(i - 1)^2}{2^n} \Bigr | + 2 - \frac{2(i - 1)}{2^n - 1} - \frac{4(i - 1)^2}{2^n(2^n - 1)} \\\le & {} \Bigl |D_{i, x} - \frac{4(i - 1)^2}{2^n} \Bigr | + 2, \end{aligned}$$

where the first inequality is due to the facts that (i) \(|a + b| \le |a| + |b|\) for any numbers a and b, and (ii) \(2 - \frac{2(i - 1)}{2^n - 1} - \frac{4(i - 1)^2}{2^n(2^n - 1)} > 0\), which is in turn due to the hypothesis that \(i \le q \le 2^{n - 5}\), and \(n \ge 8\). Dividing both sides by \((2^n - 2i + 1)(2^n - 2i)\) we have

$$\begin{aligned} |Y_{i, x} - 1/(2^{n} - 1)|\le & {} \frac{|D_{i, x} - 4(i - 1)^2/ 2^n| + 2}{(2^n - 2i + 1)(2^n - 2i)} \\\le & {} \frac{|D_{i, x} - 4(i - 1)^2 / 2^n| + 2}{\frac{7}{8} \cdot 2^{2n}} \\= & {} \frac{\frac{8}{7} \cdot |D_{i, x} - 4(i - 1)^2 / 2^n| + \frac{16}{7}}{2^{2n}} \\\le & {} \frac{\frac{8}{7} \cdot |D_{i, x} - 4(i - 1)^2 / 2^n| + 3}{2^{2n}}, \end{aligned}$$

where the second inequality is also due to the hypothesis that \(i \le q \le 2^{n - 5}\), and \(n \ge 8\). Using the fact that \((a + b)^2 \le 2(a^2 + b^2)\) for all real numbers a and b,

$$\begin{aligned} (Y_{i, x} - 1/(2^n - 1))^2\le & {} \frac{\frac{128}{49}(D_{i, x} - 4(i -1)^2 / 2^n)^2 + 18}{2^{4n}} \\\le & {} \frac{ 3(D_{i, x} - 4(i -1)^2 / 2^n)^2 + 18}{2^{4n}}. \end{aligned}$$

From Eq. (7),

$$\begin{aligned} \mathbf {E}[\chi ^2(\varvec{X}_{i - 1})]\le & {} \sum _{x \in \{0,1\}^n \backslash \{0^n\}} 2^n \cdot \mathbf {E}\Bigl [(Y_{i, x} - 1/(2^n - 1))^2 \Bigr ]\\\le & {} \sum _{x \in \{0,1\}^n \backslash \{0^n\}} \frac{18}{2^{3n}} + \frac{3}{2^{3n}} \;\; \mathbf {E}\Bigl [\Bigl (D_{i, x} - \frac{4(i -1)^2}{2^n} \Bigr )^2 \Bigr ]. \end{aligned}$$

In the last formula, it is helpful to think of each \(D_{i, x}\) as a function of \(V_1, \ldots , V_{2i - 2}\), and the expectation is taken over the choices of \(V_1, \ldots , V_{2i - 2}\), sampled uniformly without replacement from \(\{0,1\}^{n}\). We will show that for any \(x \in \{0,1\}^n \backslash \{0^n\}\),

$$\begin{aligned} \mathbf {E}\Bigl [\Bigl (D_{i, x} - \frac{4(i -1)^2}{2^n} \Bigr )^2 \Bigr ] \le \frac{4(i - 1)^2}{2^{n}}, \end{aligned}$$
(8)

and thus

$$ \mathbf {E}[\chi ^2(\varvec{X}_{i - 1})] \le \sum _{x \in \{0,1\}^n \backslash \{0^n\}} \Bigl ( \frac{18}{2^{3n}} + \frac{12(i - 1)^2}{2^{4n}}\Bigr ) \le \frac{18}{2^{2n}} + \frac{12(i - 1)^2}{2^{3n}}. $$

Summing up, from Lemma 3,

$$\begin{aligned} (\Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) \Vert )^2\le & {} \frac{1}{2}\sum _{i = 1}^q \mathbf {E}[\chi ^2(\varvec{X}_{i - 1})] \\\le & {} \frac{1}{2}\sum _{i = 1}^q \Bigl ( \frac{18}{2^{2n}} + \frac{12(i - 1)^2}{2^{3n}} \Bigr ) \\\le & {} \frac{1}{2} \Bigl ( \frac{18q}{2^{2n}} + \frac{4q^3}{2^{3n}} \Bigr ) \le \frac{9q + 0.25 q^2}{2^{2n}}, \end{aligned}$$

where the last inequality is due to the hypothesis that \(q \le 2^{n - 5}\).

We now justify Eq. (8). Fix \(x \in \{0,1\}^n \backslash \{0^n\}\). For each \(1 \le j \le 2i -2\), let \(B_j\) be the Bernoulli random variable such that \(B_j = 1\) if and only if \(V_j \in \{V_1 \oplus x, \ldots , V_{j - 1} \oplus x\}\). Then \(D_{i, x} = 2(B_1 + \cdots + B_{2i - 2})\): if \(V_j = V_k \oplus x\) for some \(k < j\), then these account for two pairs \((u, v)\) such that \(v = u \oplus x\), whereas \(B_k = 0\) and \(B_j = 1\). Let \(S_k = B_1 + \cdots + B_k\), and \(L_k = S_k - k^2 / 2^{n + 1}\). We will prove by induction that for any \(k \le 2i - 2\),

$$\begin{aligned} \mathbf {E}\Bigl [(L_k)^2 \Bigr ]\le & {} \frac{2k^2}{2^{n + 1}}, \text{ and }\\ \mathbf {E}\Bigl [ L_k \Bigr ]\ge & {} \frac{-k}{2^{n + 1}}. \end{aligned}$$

This subsumes Eq. (8) as the special case for \(k = 2i - 2\). The base case \(k = 1\) is vacuous, since \(B_1 = 0\). Suppose this holds for \(k - 1\); we’ll prove that it holds for k as well. Given \(B_1, \ldots , B_{k - 1}\), the conditional probability that \(B_k = 1\) is exactly

$$ p = \frac{k - 1 - 2S_{k - 1}}{2^n - (k - 1)} $$

because it is equally likely for \(V_k\) to take any value in \(\{0,1\}^n \backslash P\), where \(P = \{V_1, \ldots , V_{k - 1}\}\) and \(2S_{k - 1}\) is the number of elements \(u \in P\) such that \(u \oplus x\) is also in P. Moreover,

$$ \frac{k - 1 - 2S_{k - 1}}{2^n - (k - 1)} = \frac{k - 1 - 2(L_{k - 1} + (k - 1)^2 / 2^{n + 1})}{2^n - (k - 1)} = \frac{k - 1}{2^n} - \frac{2L_{k - 1}}{2^n - (k - 1)}. $$

Hence \(p = \frac{k - 1}{2^n} - \frac{2L_{k - 1}}{2^n - (k - 1)}\), and thus

$$\begin{aligned} \mathbf {E}[L_k]= & {} \mathbf {E}[L_{k - 1} + B_k - (2k - 1) / 2^{n + 1}] = \mathbf {E}[L_{k - 1} + p - (2k - 1) / 2^{n + 1}] \\= & {} \mathbf {E}\Bigl [ \Bigl (1 - \frac{2}{2^n - (k - 1)} \Bigr ) L_{k - 1}- \frac{1}{2^{n + 1}}\Bigr ] \\= & {} \Bigl (1 - \frac{2}{2^n - (k - 1)} \Bigr ) \mathbf {E}[L_{k - 1}] - \frac{1}{2^{n + 1}} \\\ge & {} \Bigl (1 - \frac{2}{2^n - (k - 1)} \Bigr ) \frac{(1 - k)}{2^{n + 1}} - \frac{1}{2^{n + 1}} \ge \frac{-k}{2^{n + 1}}, \end{aligned}$$

where the second-to-last inequality is due to the induction hypothesis. On the other hand,

$$\begin{aligned}&\mathbf {E}[ (L_k)^2] = \mathbf {E}\Bigl [ \Bigl ( L_{k - 1} + B_k - (2k - 1) / 2^{n + 1} \Bigr )^2\Bigr ] \nonumber \\= & {} \mathbf {E}\Bigl [ p \Bigl ( L_{k - 1} + 1 - (2k - 1) / 2^{n + 1} \Bigr )^2 + (1 - p) \Bigl ( L_{k - 1} - (2k - 1) / 2^{n + 1} \Bigr )^2\Bigr ]. \end{aligned}$$
(9)

By substituting \(p = \frac{k - 1}{2^{n}} - \frac{2L_{k - 1}}{2^n - (k - 1)}\) and using some simple algebraic manipulations,

$$\begin{aligned}&p \Bigl ( L_{k - 1} + 1 - (2k - 1) / 2^{n + 1} \Bigr )^2 + (1 - p) \Bigl ( L_{k - 1} - (2k - 1) / 2^{n + 1} \Bigr )^2 \nonumber \\= & {} \Bigl ( 1 - \frac{4}{2^n - (k - 1)} \Bigr ) (L_{k - 1})^2 - \Bigl (\frac{1}{2^n} + \frac{2}{2^n - (k - 1)} \Bigr ) L_{k - 1} + \frac{(2k - 1)^2}{2^{2n + 2}} + \frac{(2k - 1)}{2^{n + 1}} \nonumber \\\le & {} (L_{k - 1})^2 - \Bigl (\frac{1}{2^n} + \frac{2}{2^n - (k - 1)} \Bigr ) L_{k - 1} + \frac{3(2k - 1)}{2^{n + 2}}, \end{aligned}$$
(10)

where the last inequality is due to the fact that \(k \le 2q \le 2^{n - 4}\). Taking expectation of both sides of Eq. (10), and using the induction hypothesis yield

$$\begin{aligned} \mathbf {E}\Bigl [ \Bigl ( L_{k} \Bigr )^2\Bigr ]\le & {} \frac{2(k - 1)^2}{2^{n + 1}} + \Bigl (\frac{1}{2^n} + \frac{2}{2^n - (k - 1)} \Bigr ) \frac{k - 1}{2^{n + 1}} + \frac{3(2k - 1)}{2^{n + 2}} \le \frac{2k^2}{2^{n + 1}}, \end{aligned}$$

where the last inequality is again due to the fact that \(k \le 2q \le 2^{n - 4}\). This concludes the proof.    \(\square \)
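As a sanity check on the decomposition \(D_{i, x} = 2(B_1 + \cdots + B_{2i - 2})\) used in the proof, one can confirm it by brute force on small random samples (a toy Python check; names are ours):

```python
import random

def bernoulli_decomposition(V, x):
    """B_j = 1 iff V_j lies in {V_1 XOR x, ..., V_{j-1} XOR x}."""
    return [1 if any(V[j] == V[k] ^ x for k in range(j)) else 0
            for j in range(len(V))]

random.seed(7)
n, m = 8, 20                          # m plays the role of 2i - 2
V = random.sample(range(2 ** n), m)   # the V_j, uniform without replacement
x = 0b10101010                        # an arbitrary nonzero x
S = set(V)
D = sum(1 for u in S if u ^ x in S)   # D_{i,x}: pairs (u, u XOR x) inside S
```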

5 The Encrypted Davies-Meyer Construction

In this section we consider the PRF construction \(\mathrm {EDM}\) that Cogliati and Seurin (CS) recently proposed [10]. They show that \(\mathrm {EDM}\) achieves \(\frac{2n}{3}\)-bit security and conjecture that it actually achieves n-bit security. Here we give a \(\frac{3n}{4}\)-bit security proof for \(\mathrm {EDM}\). We begin by describing the construction.

Setup and results. The construction \(\mathrm {EDM}[n]: (\mathrm {Perm}(n))^2 \times \{0,1\}^n \rightarrow \{0,1\}^n\) takes two secret permutations \(\pi , \pi ' \in \mathrm {Perm}(n)\) as its key, and outputs \(\pi '(\pi (x) \oplus x)\) on input x. Theorem 2 below shows that \({\mathsf {Adv}}^{\mathrm {prf}}_{\mathrm {EDM}[n]}(A) \le \frac{7q^2}{2^{3n/2}}\), namely \(\frac{3n}{4}\)-bit security, whereas CS’s result shows that \({\mathsf {Adv}}^{\mathrm {prf}}_{\mathrm {EDM}[n]}(A) \le \frac{5 q^{3/2}}{2^n}\).
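In code, the construction is a single line (a toy Python model; the explicit permutation tables stand in for the secret keys):

```python
import random

def edm(pi1, pi2, x):
    """EDM[n] on input x: pi'(pi(x) XOR x), with pi1, pi2 playing pi, pi'."""
    return pi2[pi1[x] ^ x]

random.seed(2)
n = 8
pi1 = random.sample(range(2 ** n), 2 ** n)  # secret permutation pi
pi2 = random.sample(range(2 ** n), 2 ** n)  # secret permutation pi'
outputs = [edm(pi1, pi2, x) for x in range(2 ** n)]
```

Note that the inner map \(x \mapsto \pi (x) \oplus x\) need not be injective, so \(\mathrm {EDM}\) is analyzed as a PRF rather than a PRP.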

We note that a concurrent work by Mennink and Neves (MN) [21] shows that \({\mathsf {Adv}}^{\mathrm {prf}}_{\mathrm {EDM}[n]}(A) \le \frac{q}{2^n} + \frac{{q \atopwithdelims ()t + 1}}{2^{nt}}\) for any integer \(t \ge 1\) and any \(q \le 2^n / 67t\). While MN’s bound is considerably better than ours, their work relies on Patarin’s “mirror theory” [26]. Here, our goal is to give a much simpler proof, and we leave open the question of whether our bound can be tightened without resorting to mirror theory. A graphical comparison of the three bounds is shown in Fig. 1.

Fig. 1. Comparison among CS’s bound (left), ours (middle), and MN’s (right) for \(n = 128\). The x-axis gives the log (base 2) of q, and the y-axis gives the security bounds. For MN’s bound, we use \(t = 9\) as suggested by MN.

Theorem 2

Let \(n \ge 16\) be an integer. Then for any adversary A that makes at most q queries,

$$ {\mathsf {Adv}}^{\mathrm {prf}}_{\mathrm {EDM}[n]}(A) \le \frac{7q^2}{2^{1.5n}}. $$

Proof

Without loss of generality, assume that \(q \le 2^{n - 4}\); otherwise the claimed bound is vacuous. Assume that the adversary is deterministic and never repeats a past query. For convenience of analysis, instead of working directly with the real system (the one implementing EDM), we will “normalize” it to ensure that it behaves nicely even if the past answers are bad.

Specifically, let \(\mathbf {S}_{0}\) be the ideal system (the one implementing a uniformly random function), and \(\mathbf {S}_2\) be the real system. We construct a system \(\mathbf {S}_{1}\), the “normalized” version of \(\mathbf {S}_2\), as follows. The system \(\mathbf {S}_{1}\) keeps a secret boolean \({{\textsf {bad}}}\) that is initially set to false. Initially, it implements \(\mathbf {S}_2\), but if among the past answers there are 4 that are equal, then it sets \({{\textsf {bad}}}\) to true. Once \({{\textsf {bad}}}\) is set, \(\mathbf {S}_{1}\) instead implements \(\mathbf {S}_{0}\). We now show that the advantage \({\mathsf {Adv}}^{\mathrm {prf}}_{\mathrm {EDM}}(A)\) can be bounded via the statistical distance between \(\mathsf {p}_{\mathbf {S}_{0}, A}(\cdot )\) and \(\mathsf {p}_{\mathbf {S}_{1}, A}(\cdot )\), which we then bound via the chi-squared method. First, recall that \({\mathsf {Adv}}^{\mathrm {prf}}_{\mathrm {EDM}}(A)\) is at most

$$\begin{aligned} \Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_2, A}(\cdot ) \Vert \le \Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) \Vert + \Vert \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_2, A}(\cdot ) \Vert . \end{aligned}$$
(11)

Let X and \(X'\) be the random variables for the q answers in \(\mathbf {S}_{0}\) and \(\mathbf {S}_{1}\) respectively. Let \(\varGamma _{\mathrm {bad}}\) be the subset of \((\{0,1\}^n)^q\) such that for any \(\varvec{Z}\in \varGamma _{\mathrm {bad}}\), there are 4 components of \(\varvec{Z}\) that are the same. Then \(\mathsf {p}_{\mathbf {S}_{1}, A}(\varvec{Z}) = \mathsf {p}_{\mathbf {S}_2, A}(\varvec{Z})\) for every \(\varvec{Z}\in (\{0,1\}^n)^q \backslash \varGamma _{\mathrm {bad}}\), and thus

$$\begin{aligned} \Vert \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_2, A}(\cdot ) \Vert= & {} \sum _{\varvec{Z}\in (\{0,1\}^n)^q} \max \{0, \mathsf {p}_{\mathbf {S}_{1}, A}(\varvec{Z}) - \mathsf {p}_{\mathbf {S}_2, A}(\varvec{Z}) \} \\= & {} \sum _{\varvec{Z}\in \varGamma _{\mathrm {bad}}} \max \{0, \mathsf {p}_{\mathbf {S}_{1}, A}(\varvec{Z}) - \mathsf {p}_{\mathbf {S}_2, A}(\varvec{Z}) \} \\\le & {} \sum _{\varvec{Z}\in \varGamma _{\mathrm {bad}}} \mathsf {p}_{\mathbf {S}_{1}, A}(\varvec{Z}) = \Pr [X' \in \varGamma _{\mathrm {bad}}]. \end{aligned}$$

On the other hand, note that \(\Pr [X' \in \varGamma _{\mathrm {bad}}] - \Pr [X \in \varGamma _{\mathrm {bad}}]\) can’t exceed the statistical distance between \(X'\) and X, which is \(\Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) \Vert \). Hence

$$\begin{aligned} \Vert \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_2, A}(\cdot ) \Vert\le & {} \Pr [X' \in \varGamma _{\mathrm {bad}}] \nonumber \\\le & {} \Pr [X \in \varGamma _{\mathrm {bad}}] + \Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) \Vert . \end{aligned}$$
(12)

From Eqs. (11) and (12),

$$\begin{aligned} \Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_2, A}(\cdot ) \Vert\le & {} 2 \Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) \Vert + \Pr [X \in \varGamma _{\mathrm {bad}}] \\\le & {} 2 \Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) \Vert + \frac{q^4}{2^{3n}} \\\le & {} 2 \Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) \Vert + \frac{q^{2}}{2^{1.5n}}. \end{aligned}$$

Hence what’s left is to bound \(\Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) \Vert \). Fix \(i \le q\) and \(\varvec{Z}_{i - 1} = (z_1, \ldots , z_{i - 1}) \in (\{0,1\}^n)^{i - 1}\). Recall that

$$ \chi ^2(\varvec{Z}_{i - 1}) = \sum _{z_i \in \{0,1\}^n} \frac{(\mathsf {p}_{\mathbf {S}_{1}, A}(z_i \mid \varvec{Z}_{i - 1}) - 1/2^n)^2}{1/2^n}. $$

We claim that if \(z_i \in \{z_1, \ldots , z_{i - 1}\}\) then

$$\begin{aligned} \frac{1}{2^n} - \frac{4i}{2^{2n}} \le \mathsf {p}_{\mathbf {S}_{1}, A}(z_i \mid \varvec{Z}_{i - 1}) \le \frac{1}{2^n} + \frac{2i}{2^{2n}}, \end{aligned}$$
(13)

and if \(z_i \not \in \{z_1, \ldots , z_{i - 1}\}\)

$$\begin{aligned} \frac{1}{2^n} - \frac{2i^2}{2^{3n}} \le \mathsf {p}_{\mathbf {S}_{1}, A}(z_i \mid \varvec{Z}_{i - 1}) \le \frac{1}{2^n} + \frac{5i^2}{2^{3n}}. \end{aligned}$$
(14)

Consequently,

$$ \chi ^2(\varvec{Z}_{i - 1}) \le (i - 1) \frac{16i^2}{2^{3n}} + (2^n - i + 1) \frac{25i^4}{2^{5n}} \le \frac{18i^3}{2^{3n}}. $$

Hence from Lemma 3, if one samples vectors \(\varvec{X}_{i - 1}\) according to interaction with system \(\mathbf {S}_{1}\),

$$ (\Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) \Vert )^2 \le \frac{1}{2} \sum _{i = 1}^q \mathbf {E}[\chi ^2(\varvec{X}_{i - 1})] \le \frac{1}{2} \sum _{i = 1}^q \frac{18i^3}{2^{3n}} \le \frac{9q^4}{2^{3n}}. $$

We now justify the two claims above, namely Eqs. (13) and (14). Note that if there are 4 components of \(\varvec{Z}_{i - 1}\) that are the same, then the claims are obviously true, as \(\mathsf {p}_{\mathbf {S}_{1}, A}(z_i \mid \varvec{Z}_{i - 1}) = 1/2^n\). Suppose that there are no 4 components of \(\varvec{Z}_{i - 1}\) that are the same. Let \((m_1, \ldots , m_i)\) be the queries that are uniquely determined from \(\varvec{Z}_{i - 1}\). Let \(v_j = \pi (m_j) \oplus m_j\) for every \(j \le i\).

We first justify Eq. (13), namely \(z_i \in \{z_1, \ldots , z_{i - 1}\}\). First consider the upper bound. Let S be the subset of \(\{1, \ldots , i - 1\}\) such that \(z_i = z_j\), for every \(j \in S\). Then \(0 < |S| \le 3\). Let \(\ell \) be an arbitrary element of S. Note that \(\mathbf {S}_{1}\) outputs \(z_{i}\) on query \(m_i\) if and only if \(\pi (m_i) = v_\ell \oplus m_i\). For each fixed choice of \(v_1, \ldots , v_{i - 1}\), the conditional probability that \(\pi (m_i) = v_\ell \oplus m_i\), given \(\pi (m_j) = v_j \oplus m_j\) for every \(j \le i - 1\), is either 0 or \(1 / (2^n - i)\). Hence

$$ \mathsf {p}_{\mathbf {S}_{1}, A}(z_i \mid \varvec{Z}_{i - 1}) \le \frac{1}{2^n - i} \le \frac{1}{2^n} + \frac{2i}{2^{2n}}, $$

where the last inequality is due to the hypothesis that \(i \le q \le 2^{n - 4}\). Next, consider the lower bound in Eq. (13). For each fixed choice of \(v_j\), with \(j \in \{1, \ldots , i - 1\} \backslash S\), there are at least \(2^n - 4i\) choices for \(v_\ell \), out of at most \(2^n\) possible choices, such that \(v_\ell \oplus m_k \ne v_j \oplus m_j\), for every \(j \in \{1, \ldots , i - 1\} \backslash S\) and every \(k \in S \cup \{i\}\). For each such tuple \((v_1, \ldots , v_{i - 1})\), the conditional probability that \(\pi (m_i) = v_\ell \oplus m_i\), given \(\pi (m_j) = v_j \oplus m_j\) for every \(j \le i - 1\), is exactly \(1 / (2^n - i)\). Hence

$$ \mathsf {p}_{\mathbf {S}_{1}, A}(z_i \mid \varvec{Z}_{i - 1}) \ge \frac{2^n - 4i}{2^n(2^n - i)} \ge \frac{1}{2^n} - \frac{4i}{2^{2n}}, $$

where the last inequality is due to the hypothesis that \(i \le q \le 2^{n - 4}\).

We now justify Eq. (14), namely \(z_i \not \in \{z_1, \ldots , z_{i - 1}\}\). First consider the lower bound. Let r be the number of elements in \(\{z_1, \ldots , z_{i - 1}\}\), and thus \(r \le i - 1\). The system \(\mathbf {S}_{1}\) will give an answer not in \(\{z_1, \ldots , z_{i - 1}\}\) if and only if \(v_i \not \in \{v_1, \ldots , v_{i - 1}\}\). Note that for each \(x, x' \in \{0,1\}^n \backslash \{z_1, \ldots , z_{i - 1}\}\), we have \(\mathsf {p}_{\mathbf {S}_{1}, A}(x \mid \varvec{Z}_{i - 1}) = \mathsf {p}_{\mathbf {S}_{1}, A}(x' \mid \varvec{Z}_{i - 1})\), since as long as \(v_i \not \in \{ v_1, \ldots , v_{i - 1}\}\), \(\pi '(v_i)\) is equally likely to take any value in \(\{0,1\}^n \backslash \{z_1, \ldots , z_{i - 1}\}\). Hence

$$\begin{aligned} \mathsf {p}_{\mathbf {S}_{1}, A}(z_{i} \mid \varvec{Z}_{i - 1})= & {} \frac{1}{2^n - r} \Bigl ( 1 - \sum _{x \in \{z_1, \ldots , z_{i - 1}\}} \mathsf {p}_{\mathbf {S}_{1}, A}(x \mid \varvec{Z}_{i - 1}) \Bigr ) \\\ge & {} \frac{1}{2^n - r} \Bigl ( 1 - \sum _{x \in \{z_1, \ldots , z_{i -1}\}} \frac{1}{2^n}(1 + 2i/2^n) \Bigr ) \\\ge & {} \frac{1}{2^n - r} \Bigl ( 1 - \frac{r}{2^n}(1 + 2i/2^n) \Bigr ) \\\ge & {} \frac{1}{2^n} - \frac{2ri}{2^{2n}(2^n - r)} \ge \frac{1}{2^n} - \frac{2i^2}{2^{3n}}. \end{aligned}$$

For the upper bound of Eq. (14),

$$\begin{aligned} \mathsf {p}_{\mathbf {S}_{1}, A}(z_{i} \mid \varvec{Z}_{i - 1})= & {} \frac{1}{2^n - r} \Bigl ( 1 - \sum _{x \in \{z_1, \ldots , z_{i - 1}\}} \mathsf {p}_{\mathbf {S}_{1}, A}(x \mid \varvec{Z}_{i - 1}) \Bigr ) \\\le & {} \frac{1}{2^n - r} \Bigl ( 1 - \sum _{x \in \{z_1, \ldots , z_{i - 1}\}} \frac{1}{2^n}(1 - 4i/2^n) \Bigr ) \\\le & {} \frac{1}{2^n - r} \Bigl ( 1 - \frac{r}{2^n}(1 - 4i/2^n) \Bigr ) \\\le & {} \frac{1}{2^n} + \frac{4ri}{2^{2n}(2^n - r)} \le \frac{1}{2^n} + \frac{5i^2}{2^{3n}}. \end{aligned}$$

This concludes the proof.    \(\square \)

6 The Swap-or-Not Construction

As a final application of our framework, we prove a tighter bound on the security of the swap-or-not construction by Hoang, Morris, and Rogaway [13] using the chi-squared method. We start by reviewing the construction, before turning to its analysis.

The swap-or-not construction. Let \(r \ge 1\) be a round parameter. Let \(\mathbb {G}\) be a finite abelian group, for which we use additive notation to denote the associated operation. The swap-or-not construction \(\mathrm {SN}_r\) uses r functions \(f_1, \ldots , f_r: \mathbb {G}\rightarrow \{0,1\}\) (to be chosen independently and uniformly at random in the proof), and additionally uses r round keys \(K = (K_1, \ldots , K_r) \in \mathbb {G}^r\). On input \(X \in \mathbb {G}\), it computes states \(X_0, X_1, \ldots , X_r \in \mathbb {G}\), where \(X_0 = X\), and for \(i \in \{1, \ldots , r\}\), with \(V_i = \max \{X_{i-1}, K_i - X_{i-1}\}\),Footnote 7

$$\begin{aligned} X_i = {\left\{ \begin{array}{ll} K_i - X_{i - 1} &{} \text{ if } f_i(V_i) = 1, \\ X_{i - 1} &{} \text{ otherwise. } \end{array}\right. } \end{aligned}$$
(15)

Finally, it outputs \(X_r\). Inversion proceeds by running these steps in reverse order. We denote the resulting construction by \(\mathrm {SN}_r[\mathbb {G}]\).
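Each round above is in fact an involution: applying the same round twice returns \(X_{i - 1}\), which is why running the rounds in reverse inverts the construction. A toy Python sketch over \(\mathbb {Z}_N\) (round functions stored as lookup tables, integer order used as the canonical ordering for the max; all names are illustrative):

```python
import random

def sn_encrypt(x, keys, fs, N):
    """Swap-or-not over Z_N: X' = K_i - X; swap iff f_i(max(X, X')) = 1."""
    for K, f in zip(keys, fs):
        x_alt = (K - x) % N
        if f[max(x, x_alt)]:
            x = x_alt
    return x

def sn_decrypt(y, keys, fs, N):
    """Each round is an involution, so invert by running rounds in reverse."""
    for K, f in zip(reversed(keys), reversed(fs)):
        y_alt = (K - y) % N
        if f[max(y, y_alt)]:
            y = y_alt
    return y

random.seed(3)
N, r = 257, 10
keys = [random.randrange(N) for _ in range(r)]                  # round keys K_i
fs = [[random.randrange(2) for _ in range(N)] for _ in range(r)]  # round functions f_i
```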

Security notions. For a block cipher \(E: \mathcal {K}\times \mathcal {M}\rightarrow \mathcal {M}\) and an adversary A, the CCA advantage \({\mathsf {Adv}}^{\mathrm {cca}}_{E}(A)\) of A against E is defined as

$$ {\mathsf {Adv}}^{\mathrm {cca}}_{E}(A) = \Pr \bigl [ K \leftarrow \mathcal {K}: A^{E_K, E_K^{-1}} = 1 \bigr ] - \Pr \bigl [ \pi \leftarrow \mathrm {Perm}(\mathcal {M}): A^{\pi , \pi ^{-1}} = 1 \bigr ], $$
where \(\mathrm {Perm}(\mathcal {M})\) is the set of all permutations on \(\mathcal {M}\). We emphasize that here \(\mathcal {M}\) is an arbitrary set. If the adversary only queries its first oracle, and makes only non-adaptive queries, then we write \({\mathsf {Adv}}^{\mathrm {ncpa}}_{E}(A)\) instead. We write \({\mathsf {Adv}}^{\mathrm {cca}}_E(q)\) and \({\mathsf {Adv}}^{\mathrm {ncpa}}_E(q)\) to denote the CCA and NCPA advantage of the best adversaries of q queries against E, respectively.

Given two block ciphers F and G on the same message space that are merely NCPA-secure, one can obtain a CCA-secure block cipher E via the composition \(E = F \circ G^{-1}\), meaning that \(E_{K, K'}(x) = G^{-1}_{K'}(F_K(x))\). The following well-known theorem by Maurer, Pietrzak, and Renner [20] bounds the CCA security of E based on the NCPA security of F and G.

Lemma 4

([20]). Let F and G be block ciphers on the same message space, and let \(E = F \circ G^{-1}\). Then for any q,

$$ {\mathsf {Adv}}^{\mathrm {cca}}_E(q) \le {\mathsf {Adv}}^{\mathrm {ncpa}}_F(q) + {\mathsf {Adv}}^{\mathrm {ncpa}}_G(q). \;\;\;\; $$

   \(\square \)

We note that Lemma 4 only holds in the information-theoretic setting, where one considers the best possible, computationally unbounded adversaries. Pietrzak shows that this lemma does not hold in the computational setting [27].
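The composition in Lemma 4 reads \(E_{K, K'}(x) = G^{-1}_{K'}(F_K(x))\): encrypt with F, then decrypt with G. A toy sketch with table-based ciphers (illustrative names; a stand-in, not a secure instantiation):

```python
import random

def compose(F_table, G_inv_table):
    """E = F composed with G^{-1}: E(x) = G^{-1}(F(x)), as in Lemma 4."""
    return lambda x: G_inv_table[F_table[x]]

random.seed(5)
N = 64
F = random.sample(range(N), N)   # stand-in for an NCPA-secure cipher F
G = random.sample(range(N), N)   # stand-in for an NCPA-secure cipher G
G_inv = [0] * N
for u, v in enumerate(G):
    G_inv[v] = u                 # lookup table of G^{-1}
E = compose(F, G_inv)
outputs = [E(x) for x in range(N)]
```

Since both tables are bijections, E is again a permutation on the message space.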

NCPA security of Swap-or-Not. Following the route in the analysis of [13], we’ll first consider the NCPA security of Swap-or-Not, and then use Lemma 4 to amplify it to CCA security.

Lemma 5

For any adversary A that makes at most q queries and an abelian group \(\mathbb {G}\) of N elements,

$$\begin{aligned} {\mathsf {Adv}}^{\mathrm {ncpa}}_{\mathrm {SN}_r[\mathbb {G}]}(A) \le \frac{N}{\sqrt{r + 1}} \Bigl (\frac{N + q}{2N} \Bigr )^{(r + 1)/2}\;. \end{aligned}$$

Proof

We assume without loss of generality that A is deterministic, and doesn’t make redundant queries. The adversary A interacts with the construction \(\mathrm {SN}_r[\mathbb {G}]\) with r secret and randomly chosen functions \(f_1, \ldots , f_r: \mathbb {G}\rightarrow \{0,1\}\), and r keys \(K = (K_1, \ldots , K_r)\). We denote by \(\mathbf {S}_{1}\) the system resulting from \(\mathrm {SN}_r[\mathbb {G}]\) and by \(\mathbf {S}_{0}\) the system resulting from interacting with the random permutation \(\pi \). We will bound

$$ {\mathsf {Adv}}^{\mathrm {ncpa}}_{\mathrm {SN}_r[\mathbb {G}]}(A) \le \Vert \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) \Vert . $$

For each \(i \in \{0, 1, \ldots , q\}\), we define \(\varvec{X}_{i}\) to be the vector of outputs from the first i queries of A to \(\mathbf {S}_{1}\). Let \(m_i = N - i + 1\). We will use the following lemma from [13] to bound \(\mathbf {E}[\chi ^2(\varvec{X}_{i-1})]\).

Lemma 6

([13]). For any NCPA adversary A making q queries and for any \(i \le q\),

$$\begin{aligned} \mathbf {E}\Bigl (\sum _{x \in \mathbb {G}\backslash \{x_1, \ldots , x_{i - 1}\}} (\mathsf {p}_{\mathbf {S}_{1}, A}(x \mid \varvec{X}_{i - 1}) - 1/m_i)^2 \Bigr ) \le \left( \frac{N + i}{2N}\right) ^r, \end{aligned}$$

where the expectation is taken over a vector \(\varvec{X}_{i - 1} = (x_1, \ldots , x_{i - 1})\) sampled according to interaction with \(\mathbf {S}_{1}\).    \(\square \)

Fix some \(\varvec{Z}_{i-1} = (z_1, \ldots , z_{i-1})\) such that \(\mathsf {p}_{\mathbf {S}_{0}}(\varvec{Z}_{i - 1}) > 0\). Notice that the i-th output of \(\mathbf {S}_{0}\), given that the first \(i - 1\) outputs are \(\varvec{Z}_{i-1}\), is uniformly distributed over \(\mathbb {G}\backslash \{z_1, \ldots , z_{i-1}\}\). In other words, for any \(x \in \mathbb {G}\backslash \{z_1, \ldots , z_{i-1}\}\),

$$ \mathsf {p}_{\mathbf {S}_{0}, A}(x \mid \varvec{Z}_{i - 1}) = 1/m_i. $$

Hence, from Lemma 6,

$$\begin{aligned} \mathbf {E}[\chi ^2(\varvec{X}_{i-1})]= & {} \mathbf {E}\Bigl (\sum _{x \in \mathbb {G}\backslash \{x_1, \ldots , x_{i - 1}\}} m_i \cdot (\mathsf {p}_{\mathbf {S}_{1}, A}(x \mid \varvec{X}_{i - 1}) - 1/m_i)^2 \Bigr ) \nonumber \\\le & {} m_i \left( \frac{N + i}{2N}\right) ^r \le N \left( \frac{N + i}{2N}\right) ^r. \end{aligned}$$
(16)

Using Lemma 3, we obtain,

$$\begin{aligned} (\Vert \mathsf {p}_{\mathbf {S}_{0}, A}(\cdot ) - \mathsf {p}_{\mathbf {S}_{1}, A}(\cdot ) \Vert )^2\le & {} \frac{1}{2} \cdot \sum _{i = 1}^q \mathbf {E}[\chi ^2(\varvec{X}_{i-1})] \\\le & {} \frac{1}{2} \sum _{i=1}^q N \left( \frac{N + i}{2N}\right) ^r \\\le & {} N^2 \int _{0}^{q/2N} \Bigl ( \frac{1}{2} + x \Bigr )^r dx \le \frac{N^2}{r + 1} \left( \frac{N + q}{2N}\right) ^{r + 1}. \end{aligned}$$

Taking square roots of both sides yields the claimed bound.    \(\square \)

CCA security of Swap-or-Not. Note that the inverse of \(\mathrm {SN}_r[\mathbb {G}]\) is itself an instance of \(\mathrm {SN}_r[\mathbb {G}]\) (with the round functions and round keys taken in reverse order). Hence from Lemmas 4 and 5, we conclude the following.

Theorem 3

For any \(q, r \in \mathbb {N}\) and any abelian group \(\mathbb {G}\) of N elements,

$$ {\mathsf {Adv}}^{\mathrm {cca}}_{\mathrm {SN}_{2r}[\mathbb {G}]}(q) \le \frac{2N}{\sqrt{r + 1}}\Bigl (\frac{N + q}{2N}\Bigr )^{(r + 1)/2}. \;\;\;\;\; $$

   \(\square \)

Note that in Theorem 3, the number of rounds in the Swap-or-Not shuffle is 2r. The original bound in [13] is

$$ {\mathsf {Adv}}^{\mathrm {cca}}_{\mathrm {SN}_{2r}[\mathbb {G}]}(q) \le \frac{4N^{3/2}}{r + 2} \left( \frac{N + q}{2N}\right) ^{r/2 + 1}. $$

Typically one uses \(r = \varTheta (\log (N))\), and thus our result improves the original analysis by a factor of \(\varTheta (\sqrt{N / \log (N)})\). We note that our result is probably not tight, meaning that it might be possible to improve the security bound for Swap-or-Not even further.
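To get a concrete sense of the gap, one can compare the two bounds in log scale (a small Python sketch; the parameter choices below are illustrative):

```python
import math

def new_bound_log2(N, q, r):
    """log2 of 2N/sqrt(r+1) * ((N+q)/2N)^((r+1)/2), the bound of Theorem 3."""
    return (1 + math.log2(N) - 0.5 * math.log2(r + 1)
            + ((r + 1) / 2) * math.log2((N + q) / (2 * N)))

def old_bound_log2(N, q, r):
    """log2 of 4N^(3/2)/(r+2) * ((N+q)/2N)^(r/2+1), the original bound of [13]."""
    return (2 + 1.5 * math.log2(N) - math.log2(r + 2)
            + (r / 2 + 1) * math.log2((N + q) / (2 * N)))
```

For instance, with \(N = 2^{64}\), \(q = 2^{60}\) and \(r = 1000\), the new bound is smaller by roughly 27 bits, consistent with the \(\varTheta (\sqrt{N / \log (N)})\) improvement discussed above.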