Keywords

1 Introduction

Authenticated encryption (AE) schemes target the security goals of privacy and integrity. The field of AE has received more interest in the light of the recently announced CAESAR competition [9]. In the target scope of the competition fall secure and efficient AE algorithms for specific or possibly multiple environments.

While AE can securely be achieved by combining a probabilistic encryption scheme and a message authentication code using Bellare and Namprempre’s generic composition [6], this approach comes at the cost of using two keys, one for encryption and one for authentication. This and further efficiency optimization reasons have led to the development of many dedicated nonce-based AE solutions such as CCM [33], CWC [20], EAX [7], GCM [23], IACBC [18], IAPM [18], OCB1-3 [21, 27, 29], and OTR [24].

Of these schemes, today GCM is the most widely deployed. GCM has been standardized by many organizations including ANSI, IEEE, ISO/IEC, and NIST. GCM has also been adopted by major cryptographic protocols such as IPsec, SSH, and TLS/SSL.

One advantage of GCM is that it performs well on Intel CPUs. According to Gladman [13], GCM outperforms CCM, CWC, and EAX on Intel P3/P4 and AMD 64(32/64) processors, if a 64 K table is used with GCM. This is mostly due to the fact that finite-field multiplication over \(\mathrm {GF}(2^{128})\) can be implemented efficiently on these platforms so that it runs faster than serial AES or hashing modulo \(2^{127}-1\).

There are several ways of parallelizing the polynomial hashing in GCM [14]. For example, instead of performing finite-field multiplications sequentially by Horner’s rule as \(\bigl (\bigl (\bigl (X[1]L\oplus X[2]\bigr )L\oplus X[3]\bigr )L\oplus X[4]\bigr )L\), one precomputes \(L^2\), \(L^3\) and \(L^4\), stores them in a table, and then computes the hash in a parallelizable way as \(X[1]L^4+X[2]L^3+X[3]L^2+X[4]L\). Here \(L\) denotes the key of polynomial hashing and \(X[i]\) the data blocks.

On more recent Intel CPUs such as Nehalem and Sandy Bridge, finite-field multiplication runs slower than AES [21]. Note that these processors are equipped with dedicated instruction sets, PCLMULQDQ for finite-field multiplication and AES-NI for AES block cipher computation. However, according to the latest report by Gueron [15] PCLMULQDQ is now more efficient on the latest Haswell processor, making finite-field multiplication over \(\mathrm {GF}(2^{128})\) faster than AES block cipher computation. This also makes GCM still attractive for use on Intel platforms.

Another advantage of GCM is the fact that it does not require the block cipher inverse. This contrasts sharply with schemes like OCB, where the cipher inverse is necessary for decryption. Besides the extra cost to implement the inverse algorithm, the problem is that the security proof needs to rely on a stronger assumption about the underlying block cipher if its inverse is used by the scheme. This issue has been discussed for OCB [5] and has led to the invention of OTR [24].

Given these features and its wide-spread use, GCM is often considered as a reference AE mode of operation. In fact, the call for submissions of the CAESAR competition [9] requires that authors “must explain, in particular, why users should prefer this cipher over AES-GCM.”

All of the above-mentioned dedicated schemes are proven secure in a nonce-respecting model — formalism proposed by Rogaway [28] — where an adversary is limited to making encryption queries only with non-repeating nonce values. For the cases when nonce values do repeat, none of these AE schemes provides any formal security guarantees. Indeed, all of these schemes, including the latest OTR, can be “attacked” under nonce repetition, as described by Fleischmann et al. [12].

Nonce repetition can, however, occur in practice due to the fact that the nonce is chosen by the application programmer rather than the scheme itself as discussed by Fleishmann et al. [12]. Examples of nonce repetition are flawed implementations [8, 10, 19, 22, 34], bad management of nonces by the user, and backup resets or virtual machine clones when the nonce is stored as a counter.

One way to address these situations is to design AE schemes which provide misuse resistance in a model where the adversary can perform queries with repeating nonces. Such schemes include the deterministic AE solutions SIV [30], BTM [16], and HBS [17], and also the authenticated online ciphers McOE-G [12], APE [2], and COPA [3]. The latter schemes are more efficient (need to process the message just once), even though the security under nonce repetition is limited to indistinguishability up to a common prefix.

We note that we are missing a “GCM-like” authenticated online cipher. McOE-G makes one block cipher call plus one finite-field multiplication per message block, but it is inherently sequential and not parallelizable like GCM. APE is permutation-based and sequential. COPA is parallelizable, but it makes two block cipher calls per message block. Moreover, all of these schemes require the inverse primitive calls for decryption. In this paper, therefore, we set out to propose a new authenticated online cipher whose efficiency is comparable to that of GCM.

Our Results. We present a secure and efficient solution for AE, which we name COBRA. A formal description of COBRA for integral message blocks is given in Sect. 3, and it is depicted in Figs. 2 and 3. (A description of COBRA for arbitrary-length messages is given in Appendix A).

Design. At first glance our design may seem to combine characteristics of the COPA [3] and OTR [24] designs. Indeed, to ensure misuse-resistance we include features from COPA and then substitute the parallelization procedures with the two-round balanced Feistel structure as proposed by Minematsu [24] in OTR. The latter design decision enables the use of just a single type of primitive, namely a block cipher in the forward encryption direction, without losing parallelizability, for efficiently authenticating and encrypting at the same time using polynomial hashing. It also allows for a scheme that does not need the inverse of the block cipher in decryption.

However, the construction of COBRA is not motivated by the mere combination of the two designs. Indeed, the employment of the Feistel network seems necessary for efficiently authenticating and encrypting at the same time using polynomial hashing. It also allows for a scheme that does not need the inverse of the block cipher in decryption. In order to achieve integrity of COBRA, we utilize the checksum of intermediate state values of the Feistel structure, which is similar to a technique proposed by Anderson et al. in their ManTiCore design [1].

Security. In Sect. 4, we prove that COBRA is secure against chosen-plaintext attacks (CPA) and against forgery up to approximately \(2^{n/2}\) queries, where \(n\) is the block length of the underlying cipher. Our result for privacy covers nonce-repeating attackers. This contrasts sharply with GCM whose security collapses once the nonce is repeated. Note that authenticity of COBRA requires nonce-respecting attackers.

Our new scheme requires no block cipher inverse and hence enjoys provable security under the pseudo-random permutation (PRP) assumption about the underlying block cipher. This is not the case for COPA, whose security proof relies on a stronger assumption about the block cipher.

Our proof itself is simplified by decomposing COBRA into smaller parts which are dealt with individually. The main idea here is to turn a call to the block cipher into a call to a tweakable cipher which we instantiate with Rogaway’s XE [27] construction. COBRA utilizes universal hashing (finite-field multiplication) and produces the tag using intermediate values of the Feistel networks. These differences make COBRA’s proof slightly simpler than that of COPA.

Efficiency. The efficiency of COBRA is comparable to that GCM. That is, they both perform one block cipher call plus one finite field multiplication per message block in a parallelizable way.

As compared to COPA, COBRA saves the cost of implementing the inverse of the underlying block cipher. COBRA performs potentially better than COPA on platforms where the finite-field multiplication runs faster than the underlying block cipher call. Such CPUs include Intel’s latest Haswell processor, where a 128-bit multiplication using the PCLMULQDQ instruction set runs faster than one AES call even using the AES-NI instruction set, hence essentially faster than any other block cipher implemented.

Attack on Previous Scheme. In the period between acceptance and publication of this paper, Nandi found an attack on the authenticity of the scheme using a nonce-repeating adversary [25]. As a result we have reduced the security claim of authenticity from being secure against nonce-repeating adversaries to being secure against nonce-respecting adversaries and we have made a small adjustment in the processing of the nonce to accomplish this security level: instead of multiplying the nonce with the message blocks, we use it in the block cipher call to create the secret value \(L\). This does not change the privacy proof and authenticity is achieved for the same reason that authenticity is achieved in OTR.

2 Preliminaries

By \((\{0,1\}^n)^+\) we denote the set of strings whose length is a positive multiple of \(n\) bits. Given two strings \(A\) and \(B\), we use \(A\parallel B\) and \(AB\) interchangeably to denote the concatenation of \(A\) and \(B\). For \(A\in \{0,1\}^*\), by \(A10^*\) we denote the string with a \(1\) appended, and then padded with zeros until its length is a multiple of \(n\). If \(X\) is a string with length a multiple of \(n\), by \(X[i]\) we denote the \(i\)th \(n\)-bit block of \(X\). The length of a string \(X\) is denoted by \(|X|\).

A block cipher \(E:\mathcal {K}\times \{0,1\}^n\rightarrow \{0,1\}^n\) is a function that takes as input a key \(k\in \mathcal {K}\) and a plaintext \(M\in \{0,1\}^n\), and produces a ciphertext \(C=E(k,M)\). We sometimes write \(E_k(\cdot )=E(k,\cdot )\). For a fixed key \(k\), a block cipher is a permutation on \(n\) bits.

We can view the set \(\{0,1\}^n\) of bit strings as the finite field \(\mathrm {GF}(2^n)\) consisting of \(2^n\) elements. To this end, we represent an element of \(\mathrm {GF}(2^n)\) as a polynomial over the field \(\mathrm {GF}(2)\) of degree less than \(n\), and a string \(a_{n-1}a_{n-2}\cdots a_1a_0\in \{0,1\}^n\) corresponds to the polynomial \(a_{n-1}\mathtt {x}^{n-1}+ a_{n-2}\mathtt {x}^{n-2}+\cdots + a_1\mathtt {x}+ a_0\in \mathrm {GF}(2^n)\). The addition in the field is simply addition of polynomials over \(\mathrm {GF}(2)\) (i.e., bitwise XOR, denoted by \(\oplus \)). To define multiplication in the field, we fix an irreducible polynomial \(f(\mathtt {x})\) of degree \(n\) over the field \(\mathrm {GF}(2)\). For \(a(\mathtt {x}),b(\mathtt {x})\in \mathrm {GF}(2^n)\), their product is defined as \(a(\mathtt {x}) b(\mathtt {x}) \mod f(\mathtt {x})\) — polynomial multiplication over the field \(\mathrm {GF}(2)\) reduced modulo \(f(\mathtt {x})\). We simply write \(a(\mathtt {x})b(\mathtt {x})\) and \(a(\mathtt {x})\cdot b(\mathtt {x})\) to mean the product in the field \(\mathrm {GF}(2^n)\), and denote the multiplication by \(\otimes \).

The set \(\{0,1\}^n\) can alternatively be regarded as a set of integers ranging from \(0\) through \(2^n-1\), where a string \(a_{n-1}a_{n-2}\cdots a_1a_0\in \{0,1\}^n\) corresponds to the integer \(a_{n-1}2^{n-1}+ a_{n-2}2^{n-2}+\cdots + a_12+ a_0\in [0,2^n-1]\). Based on these conversions, we often simply write elements of \(\mathrm {GF}(2^n)\) as integers. For example, “\(2\)” means \(\mathtt {x}\) and “\(3\)” means \(\mathtt {x}+1\). When we write multiplications such as \(2\cdot 3\), we mean those in the field \(\mathrm {GF}(2^n)\).

3 Specification

In this section we give the specification of our scheme COBRA. Here we define COBRA for messages whose length is a positive multiple of \(2n\), where \(n\) denotes the block length of the underlying block cipher. The case of fractional messages is given in Appendix A.

Let \(E:\mathcal {K}\times \{0,1\}^n\rightarrow \{0,1\}^n\) be an \(n\)-bit block cipher. COBRA consists of two functionalities, an encryption function \(\mathcal {E}\) and a decryption function \(\mathcal {D}\):

$$\begin{aligned} \mathcal {E}&: \mathcal {K}\times \{0,1\}^{n-1} \times \{0,1\}^* \times (\{0,1\}^{2n})^+ \rightarrow (\{0,1\}^{2n})^+ \times \{0,1\}^n,\\ \mathcal {D}&: \mathcal {K}\times \{0,1\}^{n-1} \times \{0,1\}^* \times (\{0,1\}^{2n})^+ \times \{0,1\}^n \rightarrow (\{0,1\}^{2n})^+\cup \{\bot \}. \end{aligned}$$

The function \(\mathcal {E}\) takes as input a key \(K\), a nonce \(N\), associated data \(A\), and a message \(M\), and returns a ciphertext \(C\) and tag \(T\): \((C,T)\leftarrow \mathcal {E}(K,N,A,M)\). The decryption function \(\mathcal {D}\) also gets a ciphertext \(C\) and tag \(T\) in addition to a key, nonce and associated data; it outputs \(M\) if the tag is correct and \(\bot \) otherwise, which we denote as \(M/\bot \leftarrow \mathcal {D}(K, N,A,C,T)\).

On input of a key \(K\), nonce \(N\), associated data \(A\), and a message \(M\) padded into \(n\)-bit blocks \(M[1]M[2]\cdots M[2d]\) (resp., a ciphertext \(C=C[1]C[2]\cdots C[2d]\) and tag \(T\)), the function \(\mathcal {E}\) (resp., \(\mathcal {D}\)) is defined in Fig. 1. Note that the functions are sound: for any \(K,N,A,M\) we have \(M\leftarrow \mathcal {D}(K,N,A,\mathcal {E}(K,N,A,M))\). For the case the associated data \(A\) is of length at most \(4n\) and the message is of length at most \(6n\), the function \(\mathcal {E}\) is depicted in Figs. 2 and 3.

Fig. 1.
figure 1

COBRA.

Fig. 2.
figure 2

Processing plaintext. Note that \(L'\) is defined in Fig. 3 below.

Fig. 3.
figure 3

Processing associated data (top), computing the tag (bottom left), and the secret values (bottom right).

4 Security

We briefly settle some notation for the security analysis in Sect. 4.1. In Sect. 4.2, we introduce some preliminary results related to COBRA. Confidentiality of COBRA is then proven in Sect. 4.3, and integrity in Sect. 4.4.

4.1 Notation

When writing \(x\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}X\) for some finite set \(X\) we mean that \(x\) is sampled uniformly from \(X\). We write \(\Pr \bigl [\varvec{\mathsf {A}}\bigm |\varvec{\mathsf {B}}\big ]\) to denote the probability of event \(\varvec{\mathsf {A}}\) given \(\varvec{\mathsf {B}}\).

Say that \(M\in \{0,1\}^{2n\ell }\). We write \(M[1]M[2]\cdots M[2\ell ]\xleftarrow {n} M\) to denote the blocks that make up \(M\), and \(\hat{M}[1]\hat{M}[2]\cdots \hat{M}[\ell ]\xleftarrow {2n} M\) to denote the fragments that make up \(M\). Note that a fragment is made of two blocks: \(\hat{M}[i] = M[2i-1]\parallel M[2i]\).

For convenience, we use the notation

$$\begin{aligned} \mathop {\mathrm {\Delta }}_{\mathbf {D}}(f\,;\,g) := \left|\Pr [\mathbf {D}^f=1] - \Pr [\mathbf {D}^g=1]\right|\end{aligned}$$
(1)

to denote the distinguishing advantages of adversary \(\mathbf {D}\) in distinguishing oracles \(f\) and \(g\), where the notation \(\mathbf {D}^\mathcal {O}\) indicates the value output by \(\mathbf {D}\) after interacting with oracle \(\mathcal {O}\). The probabilities are defined over the random coins used in the oracles and the random coins of the adversary, if any. If a class of distinguishers is described by some parameters, e.g. the number of queries \(q\), then by \(\mathop {\mathrm {\Delta }}_{q}(f\,;\,g)\) we denote the supremum of \(\mathop {\mathrm {\Delta }}_{\mathbf {D}}(f\,;\,g)\) over all distinguishers \(\mathbf {D}\) in this class of adversaries. Multiple oracles are separated by a comma or given by a set, e.g. \(\mathop {\mathrm {\Delta }}_{}(f_1, f_2\,;\,g_1, g_2)\) or \(\mathop {\mathrm {\Delta }}_{}(\{f_1,f_2\}\,;\,\{g_1,g_2\})\) denotes distinguishing the combination of \(f_1\) and \(f_2\) from the combination of \(g_1\) and \(g_2\).

A uniform random function (URF) from \(m\) bits to \(n\) bits is a uniformly distributed random variable over the set of all functions from \(\{0,1\}^m\) to \(\{0,1\}^n\). A uniform random permutation (URP) on \(n\) bits is a uniformly distributed random variable over the set of all permutations on \(n\) bits.

Definition 1

Let \(E\) be a block cipher. Let \(\pi \) be a URP on \(n\) bits. The \(\mathrm {prp}\) advantage of a distinguisher \(\mathbf {D}\) is defined as

$$\begin{aligned} \mathbf {Adv}_{E}^{\mathrm {prp}}(\mathbf {D}) = \mathop {\mathrm {\Delta }}_{\mathbf {D}}(E_k\,;\,\pi ). \end{aligned}$$

Here, \(\mathbf {D}\) is a distinguisher with oracle access to either \(E_k\) or \(\pi \). The probabilities are taken over \(k\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathcal {K}\), the randomness of \(\pi \), and random coins of \(\mathbf {D}\), if any. By \(\mathbf {Adv}_{E}^{\mathrm {prp}}(t,q)\) we denote the maximum advantage taken over all distinguishers that run in time \(t\) and make \(q\) queries.

4.2 Preliminary Results

The input to each block cipher call in COBRA is first XORed with one of the following masks:

$$\begin{aligned} \{2J, 2^iL, 2^iL\oplus L, 3(2^iL\oplus L), 3^2(2^iL\oplus L)\}, \end{aligned}$$
(2)

where \(i\ge 2\), \(J := E_k(0)\) and \(L := E_k(1)\). As a result, each block cipher call can be viewed as a call to an XE construction [27]. Note that by using the doubling method from [27], we can produce many different values of the mask from the secret values \(J\) and \(L\). Specifically, we adopt the tweaks used in [24], allowing us to replace each of the XEs with independent URFs.

Lemma 1

([24, 27]). Let \(\mathcal {T}\) denote some set of indices such that \(\tau \rightarrow \mu _\tau \) maps all indices to all tweaks injectively. The permutations \(\{E_{k}(\mu _\tau \oplus \cdot )\}_{\tau \in \mathcal {T}}\) are indistinguishable from independent URFs \(\bigl \{\varphi _{\tau }\bigr \}_{\tau \in \mathcal {T}}\). Specifically, let \(\mathbf {D}\) be a distinguisher running in time \(t\) and making at most \(q\) queries, then

$$\begin{aligned} \mathop {\mathrm {\Delta }}_{\mathbf {D}}(\{E_k(\mu _\tau \oplus \cdot )\}_{\tau \in \mathcal {T}}\,;\,\{\varphi _\tau \}_{\tau \in \mathcal {T}}) \le \frac{5q^2}{2^n}+\mathbf {Adv}_{E}^{\mathrm {prp}}(\mathbf {D}'), \end{aligned}$$

where \(\mathbf {D}'\) is a distinguisher with running time similar to \(\mathbf {D}\), making \(2q\) queries.

In Fig. 4 one can see a description of COBRA where the XE constructions are replaced with URFs, where the URFs are labeled \(\alpha \) (replacing the XE construction in ProcessAD), \(\beta _i^N,\gamma _i^N\) for \(i\ge 1\) and all \(N\) (replacing them in COBRA-Encrypt and COBRA-Decrypt), and \(\delta _1^N,\delta _2^N\) for all \(N\) (replacing them in ComputeTag). Throughout, we will denote this scheme by \(\mathcal {E}'\).

We can describe \(\mathcal {E}'\) as a sequence of functions each computing one ciphertext fragment, a function computing the tag, and a function processing the associated data. More formally:

Definition 2

Say that \(\mathcal {E}'\) maps \((N,A,M)\) to \((C,T)\). Define \(f_i:\{0,1\}^n\times \{0,1\}^{2ni}\rightarrow \{0,1\}^{2n}\) to be the function mapping \((N, \hat{M}[1]\cdots \hat{M}[i])\) to \(\hat{C}[i]\), \(h:\{0,1\}^*\rightarrow \{0,1\}^n\) the function mapping \(A\) to \(U\) (where \(U\) is as shown in Fig. 4), and \(f':\{0,1\}^n\times \{0,1\}^*\times (\{0,1\}^{2n})^+\rightarrow \{0,1\}^n\) the function mapping \((N,A,M)\) to \(T\).

As a second step, we replace the associated data computation \(h\) in \(\mathcal {E}'\) with a URF \(\varOmega \):

Lemma 2

Let \(\varOmega :\{0,1\}^*\rightarrow \{0,1\}^n\) be a URF and let \(\mathbf {D}\) be a distinguisher making at most \(q\) queries each of length less than \(nl\), then

$$\begin{aligned} \mathop {\mathrm {\Delta }}_{\mathbf {D}}(h\,;\,\varOmega )\le \frac{lq^2}{2^n}. \end{aligned}$$

Proof

The URF \(\alpha \) generates independent, uniformly distributed values as long as its inputs are unique. The only issue is when two different \(A\)’s map to the same input to \(\alpha \), which itself reduces to finding zeros of a polynomial in \(J\) of degree at most \(l\). Since \(J\) is an independent, uniformly distributed value generated using a URP, and polynomials of degree \(l\) have at most \(l\) distinct zeroes, the probability that a pair of plaintexts collides is \(l/2^n\). By allowing the adversary to make \(q\) queries we get our desired bound. \(\quad \square \)

We define \(\mathbb {E}\) to be \(\mathcal {E}'\) with \(h\) replaced by \(\varOmega \). In other words, \(\mathbb {E}\) corresponds to COBRA where (i) the XE constructions have been replaced with independent URFs, and (ii) \(h\) has been replaced with \(\varOmega \). Formally, we obtain the following result for \(\mathbb {E}\).

Fig. 4.
figure 4

Construction with independent URFs.

Proposition 1

Let \((\mathcal {E}, \mathcal {D})\) denote COBRA and let \(\mathbf {D}\) be a distinguisher making at most \(q\) queries each of length less than \(2n\ell \), then

$$\begin{aligned} \mathop {\mathrm {\Delta }}_{\mathbf {D}}(\mathcal {E}_k\,;\,\mathbb {E})\le \frac{5(2\ell q+2q)^2}{2^n} + \frac{2\ell q^2}{2^n} + \mathbf {Adv}_{E}^{\mathrm {prp}}(\mathbf {D}'), \end{aligned}$$

where the probability is taken over \(k\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathcal {K}\) and the URFs in \(\mathbb {E}\), and \(\mathbf {D'}\) is a distinguisher with running time similar to \(\mathbf {D}\), making \(4\ell q + 4q\) queries.

Proof

We first apply Lemma 1, where we note that one query by \(\mathbf {D}\) leads to at most \(2\ell +2\) block cipher calls, and then Lemma 2 to get the desired result. \(\quad \square \)

Using Proposition 1, we will prove confidentiality of COBRA in Sect. 4.3 and integrity in Sect. 4.4. For the proof of confidentiality, we present an additional elementary lemma:

Lemma 3

Say \(f_1\), \(g_1\) are random functions independent of each other, and that \(f_2\), \(g_2\) are independent random functions as well. Let \(\mathbf {D}\) be a distinguisher for \(\{f_1,g_1\}\) and \(\{f_2,g_2\}\) making \(q_f\) queries to the \(f_i\) oracles and \(q_g\) queries to the \(g_i\) oracle. Then there exist distinguishers \(\mathbf {D}_f\) and \(\mathbf {D}_g\) such that

$$\begin{aligned} \mathop {\mathrm {\Delta }}_{\mathbf {D}}(f_1, g_1\,;\,f_2, g_2) \le \mathop {\mathrm {\Delta }}_{\mathbf {D}_f}(f_1\,;\,f_2) + \mathop {\mathrm {\Delta }}_{\mathbf {D}_g}(g_1\,;\,g_2), \end{aligned}$$

where \(\mathbf {D}_f\) makes \(q_f\) queries and \(\mathbf {D}_g\) makes \(q_g\) queries.

4.3 Confidentiality

We adopt the definitions of security given in [3], yet rather than comparing our scheme to a random variable over the set of all online permutations, we explicitly describe an ideal online function in terms of URFs.

Definition 3

(Ideal Online Function). Let \(g_{i}:\{0,1\}^n\times \{0,1\}^{2ni}\rightarrow \{0,1\}^{2n}\) be URFs and let \(g':\{0,1\}^n\times \{0,1\}^*\times (\{0,1\}^{2n})^+\rightarrow \{0,1\}^{n}\) be a URF. We define \(\$:\{0,1\}^n\times \{0,1\}^*\times (\{0,1\}^{2n})^+\rightarrow (\{0,1\}^{2n})^+\times \{0,1\}^n\) as

$$\begin{aligned} \$(N,A,M) = g_{1}(N,\hat{M}[1])\parallel g_{2}(N,\hat{M}[1]\hat{M}[2])\parallel \cdots \parallel g_{\ell }(N, M)\parallel g'(N,A,M) \end{aligned}$$

where \(\hat{M}[1]\hat{M}[2]\cdots \hat{M}[\ell ]\xleftarrow {2n} M\).

Definition 4

(IND-CPA). Let \(\mathcal {E}\) be an encryption scheme. The IND-CPA advantage of a distinguisher \(\mathbf {D}\) relative to \(\mathcal {E}\) is given by

$$\begin{aligned} \mathbf {Adv}_{\mathcal {E}}^{\mathrm {cpa}}(\mathbf {D}) := \mathop {\mathrm {\Delta }}_{\mathbf {D}}(\mathcal {E}_k\,;\,\$), \end{aligned}$$

where \(k\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathcal {K}\) and \({\$}\) is as defined in Definition 3.

Theorem 1

Let \(\mathcal {E}\) denote COBRA and let \(\mathbf {D}\) be a distinguisher running in time \(t\) and making at most \(q\) queries to \(\mathcal {E}\), each of length less than \(2n\ell \), then

$$\begin{aligned} \mathbf {Adv}_{\mathcal {E}}^{\mathrm {cpa}}(\mathbf {D}) \le \frac{22(\ell +1)^2q^2}{2^n} + \mathbf {Adv}_{E}^{\mathrm {prp}}(\mathbf {D}'), \end{aligned}$$

where \(\mathbf {D}'\) is a distinguisher with running time similar to \(\mathbf {D}\), making \(4\ell q + 4q\) queries.

Proof

Let \(\mathbf {D}\) be a distinguisher running in time \(t\) and making at most \(q\) queries each of length less than \(2n\ell \). As a first step, we move from \(\mathcal {E}\) to \(\mathbb {E}\), where the underlying XE constructions are replaced by independent URFs, and \(h\) by \(\varOmega \). By Proposition 1:

$$\begin{aligned} \mathop {\mathrm {\Delta }}_{\mathbf {D}}(\mathcal {E}_k\,;\,\mathbb {E}) \le \frac{5(2\ell q+2q)^2}{2^n} + \frac{2q^2\ell }{2^n} + \mathbf {Adv}_{E}^{\mathrm {prp}}(\mathbf {D}'), \end{aligned}$$
(3)

where \(\mathbf {D}'\) has running time similar to \(\mathbf {D}\) and makes at most \(4\ell q + 4q\) queries. Next, note that the \(f_i\)’s and \(f'\) (cf. Definition 2) are independent functions, as their underlying URFs \(\beta _i^N,\gamma _i^N,\delta _i^N\) are independent functions. By Lemma 3:

$$\begin{aligned} \mathop {\mathrm {\Delta }}_{\mathbf {D}}(\mathbb {E}\,;\,\$) \le \mathop {\mathrm {\Delta }}_{\mathbf {D}_t}(f'\,;\,g') + \sum _{i=1}^{\ell -1}\mathop {\mathrm {\Delta }}_{\mathbf {D}_i}(f_i\,;\,g_i). \end{aligned}$$
(4)

for some \(\mathbf {D}_t\) that makes at most \(q\) queries of total length less than \(2n\ell \) and \(\mathbf {D}_i\) (for \(i\in \{1,\ldots ,\ell -1\}\)) that makes at most \(q\) queries (of fixed length). A bound on \(\mathop {\mathrm {\Delta }}_{\mathbf {D}_i}(f_i\,;\,g_i)\) for arbitrary \(i\) is derived in Lemma 4. In Lemma 5 we compute a bound on \(\mathop {\mathrm {\Delta }}_{\mathbf {D}_t}(f'\,;\,g')\). We find:

$$\begin{aligned} \mathop {\mathrm {\Delta }}_{\mathbf {D}}(\mathbb {E}\,;\,\$) \le \frac{q^2(2\ell +3)}{2^n} + \sum _{i=1}^{\ell -1}\frac{q^22(2i+1)}{2^n} = \frac{q^2(2\ell +3)}{2^n} + \frac{q^22(\ell ^2-1)}{2^n}. \end{aligned}$$
(5)

The proof is completed by simplifying the obtained bound. \(\quad \square \)

Lemma 4

Let \(\mathbf {D}_i\) be a distinguisher making at most \(q\) queries, then

$$\begin{aligned} \mathop {\mathrm {\Delta }}_{\mathbf {D}_i}(f_i\,;\,g_i) \le \frac{2(2i+1)q^2}{2^n}. \end{aligned}$$

Proof

The proof is similar to that of Lemma 2. We use that the inputs to the URFs are polynomials of degree at most \(2i+1\), and that \(f_i\) consists of two URFs \(\beta _i^N\) and \(\gamma _i^N\). \(\quad \square \)

Lemma 5

Let \(\mathbf {D}_t\) be a distinguisher making at most \(q\) queries each of length less than \(2n\ell \), then

$$\begin{aligned} \mathop {\mathrm {\Delta }}_{\mathbf {D}_t}(f'\,;\,g')\le \frac{(2\ell +3)q^2}{2^n}. \end{aligned}$$

Proof

Without loss of generality, we may assume that the distinguisher does not make repeat queries. Say that \(\{(N_1, A_1, M_1), \ldots , (N_{i}, A_{i}, M_{i})\}\) is the query history. We consider what happens on query \((N^*, A^*, M^*)\).

Let \(U_i := \varOmega (A_i)\) and let \(\varSigma _i\) denote input to \(\delta _1^{N_i}\) (see Fig. 4). Let \(U^*\) and \(\varSigma ^*\) be the corresponding values for \((N^*, A^*, M^*)\). We compute the probability that

$$\begin{aligned} U^*\oplus \delta _1^{N^*}(\varSigma ^*)\oplus N^* = U_j\oplus \delta _1^{N_j}(\varSigma _j)\oplus N_j \end{aligned}$$
(6)

for some \(j\) for which \(1\le j\le i\).

  1. 1.

    If \(A^* \ne A_j\) then \(U^*\) is independent of \(U_j\oplus \delta _1^{N_j}(\varSigma _j)\oplus \delta _1^{N^*}(\varSigma ^*)\oplus N^*\oplus N_j\), hence the probability that Eq. (6) is satisfied is not more than \(1/2^n\).

  2. 2.

    If \(A^* = A_j\), then \(U^* = U_j\) and we focus on the probability that

    $$\begin{aligned} \delta _1^{N^*}(\varSigma ^*)\oplus N^* = \delta _1^{N_j}(\varSigma _j)\oplus N_j. \end{aligned}$$
    (7)
    1. (a)

      If \(M^* = M_j\), then \(\varSigma ^*=\varSigma _j\) and Eq. (7) reduces to \(\delta _1^{N^*}(\varSigma ^*)\oplus \delta _1^{N_j}(\varSigma ^*) = N^*\oplus N_j\). Since we do not allow repeat queries \(N^*\ne N_j\), and so this occurs with probability \(1/2^n\).

    2. (b)

      Say that \(M^* \ne M_j\), and let \(\rho ^*\) and \(\rho _j\) denote the output of the last call to \(\beta \) made when processing \(M^*\) and \(M_j\), respectively. If \(\rho ^*\) and \(\rho _j\) are independent, then \(\varSigma ^* = \varSigma _j\) with probability \(1/2^n\). If \(\varSigma ^*\ne \varSigma _j\), then \(\delta _1^{N^*}(\varSigma ^*)\) and \(\delta _1^{N_j}(\varSigma _j)\) are independent (and also independent of \(N^*\) and \(N_j\)), hence the probability of Eq. (7) being true is upper bounded by \(2/2^n\). The probability that \(\rho ^*\) and \(\rho _j\) are not independent is upper bounded by the probability of having a collision in the inputs to the last \(\beta \) call (and only if \(M^*\) and \(M_j\) are the same length), which is \((2\ell +1)/2^n\).

Putting our results together we get that

$$\begin{aligned} \Pr (\text {Eq.}\,\mathrm{(}6\mathrm{)}~\text {holds}) \le \max \left\{ \frac{1}{2^n}, \frac{2\ell +1}{2^n} + \frac{2}{2^n}\right\} = \frac{2\ell +3}{2^n} . \end{aligned}$$
(8)

This means that the probability that \(U^*\oplus \delta _1^{N^*}(\varSigma ^*)\oplus N^*\) collides with any of the previous \(U_j\oplus \delta _1^{N_j}(\varSigma _j)\oplus N_j\) is upper bounded by \(\frac{q(2\ell + 3)}{2^n}\). As long as the input to \(\delta _2\) is unique, the tag produced is uniform and independent of all previous values, hence \(f'\) remains indistinguishable from \(g'\). Summing over all queries, we get the desired bound. \(\quad \square \)

4.4 Integrity

Definition 5

Let \(\mathcal {E}\) be an AE scheme. The integrity advantage of a distinguisher \(\mathbf {D}\) relative to \(\mathcal {E}\) is given by

$$\begin{aligned} \mathbf {Adv}_{\mathcal {E}}^{\mathrm {int}}(\mathbf {D}) := \mathop {\mathrm {\Delta }}_{\mathbf {D}}(\mathcal {E}_k,\mathcal {D}_k\,;\,\mathcal {E}_k,\bot ), \end{aligned}$$

where \(k\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathcal {K}\) and \(\bot \) is a function that responds with \(\bot \) on every query. We assume that the distinguisher does not make queries of the form \(\mathcal {D}_k(N, A, C, T)\), where \((C, T) = \mathcal {E}_k(N, A, M)\) for some previously queried \((N, A, M)\) and that it does not query \(\mathcal {E}_k\) twice under the same nonce.

Theorem 2

Let \(\mathcal {E}\) denote COBRA and let \(\mathbf {D}\) be a distinguisher running in time \(t\) and making at most \(q\) queries to \(\mathcal {E}\) and \(q_f\) forgery attempts, each of length less than \(2n\ell \), then

$$\begin{aligned} \mathbf {Adv}_{\mathcal {E}}^{\mathrm {int}}(\mathbf {D})\le \frac{(3q+1)q_f}{2^n} + \frac{22(\ell +1)^2q^2}{2^n} + \mathbf {Adv}_{E}^{\mathrm {prp}}(\mathbf {D}'), \end{aligned}$$

where \(\mathbf {D}'\) is a distinguisher with running time similar to \(\mathbf {D}\), making \(4\ell q + 4q\) queries.

Proof

As with the proof of confidentiality, we use Proposition 1 to switch to \(\mathbb {E}.\)

We first focus on adversaries with one forgery attempt, with \((N^*, A^*, C^*, T^*)\) being the attempt. Let \(\{(N_1, A_1, M_1), \ldots , (N_q, A_q, M_q)\}\) denote the history of queries made by the adversary to the encryption oracle, where \((C_i, T_i)\) is the output corresponding to \((N_i, A_i, M_i)\) and each \(N_i\) is distinct. Let \(U_i := \varOmega (A_i)\), \(U^* := \varOmega (A^*)\), and let \(\varSigma _i\) denote the input to \(\delta _1^{N_i}\) during the computation of \((N_i, A_i, M_i)\); define \(\varSigma ^*\) similarly. Note that

$$\begin{aligned} \delta _2^{N_i}\Big (U_i\oplus \delta _1^{N_i}(\varSigma _i)\oplus N_i\Big ) = T_i, \end{aligned}$$
(9)

and similarly for \(T^*\).

If

$$\begin{aligned} U^*\oplus \delta _1^{N^*}(\varSigma ^*)\oplus N^* \ne U_i\oplus \delta _1^{N_i}(\varSigma _i)\oplus N_i \end{aligned}$$
(10)

for all \(i\), then the input to \(\delta _2\) is distinct from all previous inputs to \(\delta _2\), hence the output of \(\delta _2\) from the forgery query is uniformly distributed and independent of \(T^*\), which means that the forgery will be successful with probability at most \(1/2^n\). Hence we focus on computing the probability that there is an \(i\) resulting in a collision in the \(\delta _2\) input.

Fix an \(i\) such that \(1\le i\le q\). We compute the probability that

$$\begin{aligned} U^*\oplus \delta _1^{N^*}(\varSigma ^*)\oplus N^* = U_i\oplus \delta _1^{N_i}(\varSigma _i)\oplus N_i. \end{aligned}$$
(11)
  1. 1.

    If \(A^* \ne A_i\), then \(U^*\) is uniformly distributed and independent of \(U_i\), hence the probability that Eq. (11) is satisfied is bounded above by \(1/2^n\).

  2. 2.

    If \(A^* = A_i\), then \(U^* = U_i\) and we focus on the probability that

    $$\begin{aligned} \delta _1^{N^*}(\varSigma ^*)\oplus N^* = \delta _1^{N_i}(\varSigma _i)\oplus N_i. \end{aligned}$$
    (12)
    1. (a)

      If \(C^* = C_j\), Eq. (12) reduces to

      $$\begin{aligned} \delta _1^{N^*}(\varSigma ^*)\oplus N^* = \delta _1^{N_i}(\varSigma ^*)\oplus N_i. \end{aligned}$$
      (13)

      Since \(A^* = A_j\), then either \(N^*\ne N_i\), in which case we only get a successful forgery with probability \(1/2^n\), or \(N^* = N_j\) and \(T^*\ne T_j\), in which case we get a failed forgery attempt as well.

    2. (b)

      Say that \(C^*\ne C_j\), and that they differ at the \(m\)th fragment, i.e. \(\hat{C}^*[m]\ne \hat{C}_j[m]\). We also assume that \(N^* = N_j\), because if \(N^*\ne N_j\) then the tags are independent of each other since they are produced by independent URFs \(\delta _2^{N^*}\) and \(\delta _2^{N_j}\).

      Since \(N^* = N_j\), and \(N_j\) does not equal any of the other \(N_i\), \(\varSigma ^*\) is independent of all \(\varSigma _i\) for \(i\ne j\). If \(C^*[2m-1] = C_j[2m-1]\), then \(C^*[2m] \ne C_j[2m]\), hence the inputs to \(\beta _m^{N^*}\) for \(C^*\) and \(C_j\) are different. This means that \(\varSigma ^* = \varSigma _j\) with probability at most \(1/2^n\), hence \(\delta _1(\varSigma ^*) = \delta _1(\varSigma _j)\) with probability at most \(2/2^n\). If \(\delta _1^{N^*}(\varSigma ^*)\ne \delta _1^{N_j}(\varSigma _j)\), then Eq. (12) is satisfied with probability at most \(1/2^n\) since \(N^*\) and \(N_i\) are independent of the outputs of \(\delta _1\). If \(C^*[2m-1]\ne C_j[2m-1]\), we can apply the same reasoning.

Putting the above results together, we get that the probability of Eq. (11) being satisfied is bounded above by \(3/2^n\). Hence, the probability that there exists an \(i\) satisfying (11) is bounded above by \(3q/2^n\). The probability that the forgery is successful is thus bounded above by \(3q/2^n + 1/2^n\).

Generalizing to adversaries which can make up to \(q_f\) forgery queries as explained in Andreeva et al. [4], we have our desired bound. \(\quad \square \)

5 Future Work

We shall implement COBRA and compare its software performance with GCM and COPA. It is interesting to see how much of an overhead COBRA actually has over GCM on a specific platform, possibly due to the Feistel network, larger state, extra mask generation, and the reverse order of multplication and block cipher call.