1 Introduction

An authenticated encryption (AE) scheme achieves two complementary data security goals: confidentiality (privacy) and integrity (authenticity). Traditionally, these goals were achieved by combining two cryptographic primitives, a privacy-only encryption scheme and a message authentication code (MAC), a paradigm known as generic composition (GC) [7, 8, 20]. The notion of AE, as a desirable symmetric-key primitive in its own right, was introduced in 2000 [7, 9, 18]. Since then, security notions for AE schemes have been defined and refined [14, 23, 25–27], together with many dedicated AE designs seeking some advantages over the GC-based schemes.

AE schemes have been studied for over a decade, yet the topic remains a highly active area of research, as evidenced by the currently running CAESAR competition [10]. OMD [12, 13] is one of 57 first-round CAESAR submissions, of which 8 have been withdrawn due to major security flaws at the time of writing.

Among the features that OMD possesses, the following two are notably interesting and distinctive: OMD is the only CAESAR submission that is designed (as a mode of operation) based on a compression function [3], and it provides (provably) high security levels (about twice that of the AES-based submissions) when implemented with an off-the-shelf compression function such as those of the standard SHA family [2].

Instantiations of OMD using the compression functions of SHA-256 and SHA-512, called OMD-sha256 and OMD-sha512 respectively, can freely benefit from the widely-deployed optimized implementations of these primitives, e.g. [15, 16]; in particular, OMD-sha256 can take advantage of the new Intel SHA Extensions [17].

Motivated by these appealing features of OMD, we investigate whether algorithmic improvements to the original OMD scheme can boost its efficiency while preserving its security properties. We show that there is a natural way (inspired by the work of [28]) to modify OMD to make it more compact and efficient with respect to processing associated data (AD). Our new variant of OMD, called pure OMD (p-OMD), has the following features:

  • It inherits all desirable security features of OMD. We prove the security of p-OMD under the same standard assumption (namely, pseudo-randomness of the compression function) as made in OMD. Furthermore, the proven security bounds for p-OMD are the same as those of OMD. This shows that the modifications we made to OMD, to obtain the performance-boosted variant p-OMD, do not sacrifice any security.

  • It has a more compact structure and processing AD is almost free. The original OMD scheme couples a single pass of the MD iteration (in which the chaining values are xored with specially crafted offsets) with the counter-based XOR MAC algorithm [6] to process a message and its associated data. The p-OMD scheme dispenses with the XOR MAC algorithm and is solely based on the (masked) MD iteration. This is achieved by absorbing the associated data blocks during the core MD path rather than processing them separately by an additional XOR MAC algorithm. To encrypt a message of \(\ell \) blocks having associated data of \(a\) blocks, OMD needs \(\ell +a+2\) calls to the compression function while p-OMD only requires \(\max \left\{ \ell , a\right\} +2\) calls. That is, for a typical case where \(\ell \ge a\), p-OMD makes just \(\ell +2\) calls independently of the length of AD.
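The call counts quoted above can be compared with a small sketch. The function names `omd_calls` and `pomd_calls` are hypothetical helpers introduced here for illustration; they only evaluate the two formulas stated in the text.

```python
def omd_calls(l: int, a: int) -> int:
    """Compression-function calls for OMD: l + a + 2 (formula from the text)."""
    return l + a + 2


def pomd_calls(l: int, a: int) -> int:
    """Compression-function calls for p-OMD: max(l, a) + 2 (formula from the text)."""
    return max(l, a) + 2


if __name__ == "__main__":
    # A 10-block message with a 3-block header.
    print(omd_calls(10, 3), pomd_calls(10, 3))
```

For any \(a \le \ell \) the p-OMD count does not change with \(a\), which is the sense in which AD processing is "almost free".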

We note that neither OMD nor p-OMD satisfies the nonce-reuse misuse-resistance notions defined in [14, 27]. Misuse-resistant variants of OMD were recently proposed in [21], but in these variants the encryption process is not online and they are less efficient than OMD.

A correction. In the preproceedings version of this paper, we claimed a partial level of robustness to nonce misuse with respect to the authenticity property. Tomer Ashur and Bart Mennink pointed out [4] that this claim was incorrect; hence, we have removed the claim. This is the revised and corrected version.

Organization of the paper. Notations and preliminary concepts are presented in Sect. 2. Definitions of security notions for AE schemes are reviewed in Sect. 3. Section 4 provides the specification of the p-OMD mode of operation. In Sect. 5, we provide the security analysis of p-OMD. Section 6 provides an experimental performance comparison between p-OMD and OMD.

2 Preliminaries

Notations. Let \(x\mathop {\leftarrow }\limits ^{\$}S\) denote choosing an element x from a finite set S uniformly at random. \(X \leftarrow Y\) is used for denoting the assignment statement where the value of Y is assigned to X. All strings are binary strings. The empty string is denoted by \(\varepsilon \). The set of all strings of length n bits (for some positive integer n) is denoted as \(\left\{ 0,1\right\} ^{n}\), the set of all strings whose lengths are upper-bounded by L is denoted by \(\left\{ 0,1\right\} ^{\le L}\) and the set of all strings of finite length is denoted by \(\left\{ 0,1\right\} ^{*}\). The notations X||Y and XY both stand for the string obtained by concatenating a string Y to a string X. For an m-bit string \(X=X[m-1] \cdots X[0]\) we denote the first (leftmost) bit by \(\mathsf{firstbit }(X)=X[m-1]\) and the last (rightmost) bit by \(\mathsf{lastbit }(X)=X[0]\). Let \(X[i \cdots j]=X[i] \cdots X[j]\) denote a substring of X, for \(m-1 \ge i \ge j \ge 0\); by convention we let \(X[i \cdots j]=\varepsilon \) if \(i<0\) and \(X[i \cdots j]=X[i \cdots 0]\) if \(j<0\).

For a non-negative integer i let \(\left\langle i\right\rangle _m\) denote the binary representation of i by an m-bit string. For a bit string \(X=X[m-1] \cdots X[0]\), let \({\mathsf {str2num}}(X)=\sum _{i=0}^{m-1}X[i]2^{i}\) denote the non-negative integer represented by X. Let \(\mathtt{ntz }(i)\) denote the number of trailing zeros (i.e. the number of rightmost bits that are zero) in the binary representation of a positive integer i. Let \(1^{n}0^{m}\) denote the concatenation of n ones followed by m zeros.

We let \(\mathsf{firstbits }_i(X)=X[m-1 \cdots m - i]\) denote the i leftmost bits and \(\mathsf{lastbits }_i(X)=X[i-1 \cdots 0]\) denote the i rightmost bits of X. For two strings \(X=X[m-1] \cdots X[0]\) and \(Y=Y[n-1] \cdots Y[0]\) of possibly different lengths, let the notation \(X \oplus Y\) denote the bitwise xor of \(\mathsf{firstbits }_i(X)\) and \(\mathsf{firstbits }_i(Y)\) where \(i=\min \left\{ m, n\right\} \). Clearly, if X and Y have the same length then \(X \oplus Y\) matches the usual bitwise xor. For any string X, define \(X \oplus \varepsilon =\varepsilon \oplus X=\varepsilon \).
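The string notations above can be made concrete with a minimal sketch. It assumes bit strings are represented as Python `str` values over the characters `'0'` and `'1'`, with the leftmost character being \(X[m-1]\); the function names mirror the paper's notation, but the representation is an illustrative choice and not part of the specification.

```python
def str2num(x: str) -> int:
    """Integer represented by bit string x (the empty string maps to 0)."""
    return int(x, 2) if x else 0


def num2str(i: int, m: int) -> str:
    """<i>_m: the m-bit binary representation of i (assumes 0 <= i < 2**m, m >= 1)."""
    return format(i, "0{}b".format(m))


def ntz(i: int) -> int:
    """Number of trailing zero bits of a positive integer i."""
    return (i & -i).bit_length() - 1


def firstbits(x: str, i: int) -> str:
    """The i leftmost bits of x."""
    return x[:i]


def lastbits(x: str, i: int) -> str:
    """The i rightmost bits of x."""
    return x[len(x) - i:]


def xor(x: str, y: str) -> str:
    """Bitwise xor of the min(|x|, |y|) leftmost bits of x and y."""
    # zip stops at the shorter operand, so xor with the empty string is empty.
    return "".join("1" if a != b else "0" for a, b in zip(x, y))
```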

The special symbol \(\bot \) signifies both that the value of a variable or a function at some input is undefined, and an error. Let |Z| denote the number of elements of Z if Z is a set, and the length of Z in bits if Z is a string. We let \(|\varepsilon |=0\). For \(X \in \left\{ 0, 1\right\} ^{*}\) let \(X_1||X_2 \cdots ||X_m \mathop {\leftarrow }\limits ^{b}X\) denote partitioning X into blocks \(X_i\) such that \(|X_i|=b\) for \(1 \le i \le m-1\) and \(|X_m| \le b\); let \(m=|X|_b\) denote the length of X in b-bit blocks.
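The block-partitioning notation can be sketched as follows, again assuming bit strings are Python `str` values of `'0'`/`'1'` characters. The helper name `partition` is introduced here for illustration; treating the empty string as zero blocks is an assumption of this sketch.

```python
def partition(x: str, b: int) -> list:
    """Split x into b-bit blocks; every block has b bits except possibly the last."""
    return [x[k:k + b] for k in range(0, len(x), b)]


def length_in_blocks(x: str, b: int) -> int:
    """|X|_b: the number of blocks produced by partition(x, b)."""
    return len(partition(x, b))
```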

The Finite Field with \(2^{n}\) Elements. Let \((GF(2^n), \oplus ,.)\) denote the Galois Field with \(2^n\) elements. An element \(\alpha \) in \(GF(2^n)\) is represented as a formal polynomial \(\alpha (X)=\alpha _{n-1}X^{n-1}+ \cdots +\alpha _{1}X+\alpha _{0}\) with binary coefficients. We can assign an element \(\alpha _i \in GF(2^n)\) to an integer \(i \in \{0, \ldots , 2^{n}-1\}\) in a natural way, and similarly an element \(\alpha _s\) to a string \(s \in \{0,1\}^n\). We sometimes refer to the elements of \(GF(2^n)\) directly by strings or integers when the context leaves no ambiguity. The addition “\(\oplus \)” and multiplication “.” of two field elements in \(GF(2^n)\) are defined as usual [13]. For \(GF(2^{256})\) we use \(P_{256}(X)=X^{256}+X^{10}+X^5+X^2+1\), and for \(GF(2^{512})\) we use \(P_{512}(X)=X^{512}+X^8+X^5+X^2+1\) as the irreducible polynomials used in the field multiplications.
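As an illustration of the field arithmetic, the following sketch represents field elements as Python integers (bit \(i\) of the integer is the coefficient \(\alpha _i\)) and multiplies by repeated doubling with reduction modulo \(P_{256}\) or \(P_{512}\). The names `gf_double` and `gf_mul` are hypothetical, and this simple bit-serial method is chosen for clarity, not speed or constant-time behaviour.

```python
# Low-order terms of the irreducible polynomials given in the text.
LOW_TERMS = {
    256: (1 << 10) | (1 << 5) | (1 << 2) | 1,  # X^10 + X^5 + X^2 + 1
    512: (1 << 8) | (1 << 5) | (1 << 2) | 1,   # X^8 + X^5 + X^2 + 1
}


def gf_double(a: int, n: int) -> int:
    """Multiply a by X in GF(2^n)."""
    a <<= 1
    if a >> n:  # degree reached n: subtract (xor) the irreducible polynomial
        a ^= (1 << n) | LOW_TERMS[n]
    return a


def gf_mul(a: int, b: int, n: int) -> int:
    """Multiply a and b in GF(2^n) by shift-and-add over the bits of b."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = gf_double(a, n)
        b >>= 1
    return result
```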

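Advantage Function. The insecurity of a scheme \(\varPi \) with regard to a security property xxx is measured using the resource-parametrized function \(\mathbf {Adv}^{\mathrm {xxx}}_{\varPi }(\mathbf r)=\max _{\varvec{A}}\left\{ \mathbf {Adv}^{\mathrm {xxx}}_{\varPi }(\varvec{A})\right\} \), where the maximum is taken over all adversaries \(\varvec{A}\) which use resources bounded by \(\mathbf r \).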

Pseudorandom Functions (PRFs) and Tweakable PRFs. Let \(\text {Func}(m, n)=\left\{ f: \left\{ 0, 1\right\} ^{m} \rightarrow \left\{ 0, 1\right\} ^{n}\right\} \) be the set of all functions from m-bit strings to n-bit strings. A random function (RF) R with m-bit input and n-bit output is a function selected uniformly at random from \(\text {Func}(m, n)\). We denote this by \(R \mathop {\leftarrow }\limits ^{\$} \text {Func}(m, n)\).

Let \(\text {Func}^{\mathcal {T}}(m, n)\) be the set of all functions \(\left\{ \widetilde{f}: \mathcal {T} \times \left\{ 0, 1\right\} ^{m} \rightarrow \left\{ 0, 1\right\} ^{n}\right\} \), where \(\mathcal {T}\) is a set of tweaks. A tweakable RF with the tweak space \(\mathcal {T}\), m-bit input and n-bit output is a map \(\widetilde{R}: \mathcal {T} \times \left\{ 0, 1\right\} ^{m} \rightarrow \left\{ 0, 1\right\} ^{n}\) selected uniformly at random from \(\text {Func}^{\mathcal {T}}(m, n)\); i.e. \(\widetilde{R} \mathop {\leftarrow }\limits ^{\$} \text {Func}^{\mathcal {T}}(m, n)\). Clearly, if \(\mathcal {T}=\left\{ 0, 1\right\} ^{t}\) then \(|\text {Func}^{\mathcal {T}}(m, n)|=|\text {Func}(m+t, n)|\), and hence, \(\widetilde{R}\) can be instantiated using a random function R with \((m+t)\)-bit input and n-bit output. We use \(\widetilde{R}^{\left\langle T \right\rangle }(.)\) and \(\widetilde{R}(T, .)\) interchangeably, for every \(T \in \mathcal {T}\). Notice that each tweak T names a random function \(\widetilde{R}^{\left\langle T \right\rangle }: \left\{ 0, 1\right\} ^{m} \rightarrow \left\{ 0, 1\right\} ^{n}\) and distinct tweaks name distinct (independent) random functions.
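A tweakable random function is commonly simulated by lazy sampling: each (tweak, input) pair receives a fresh uniform output the first time it is queried, and the same output thereafter. The class below is a minimal sketch of that idea; the class name and the integer encoding of inputs and outputs are assumptions made for illustration.

```python
import secrets


class TweakableRandomFunction:
    """Lazily sampled tweakable random function with n-bit outputs."""

    def __init__(self, out_bits: int):
        self.out_bits = out_bits
        self.table = {}  # maps (tweak, x) to an already sampled output

    def __call__(self, tweak, x):
        key = (tweak, x)
        if key not in self.table:
            # First query for this pair: sample a fresh uniform n-bit value.
            self.table[key] = secrets.randbits(self.out_bits)
        return self.table[key]
```

Because the table is keyed on the pair, two different tweaks applied to the same input are answered by independent samples, matching the remark that distinct tweaks name independent random functions.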

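Let \(F:\mathcal {K} \times \left\{ 0, 1\right\} ^{m} \rightarrow \left\{ 0, 1\right\} ^{n}\) be a keyed function and let \(\widetilde{F}:\mathcal {K} \times \mathcal {T} \times \left\{ 0, 1\right\} ^{m} \rightarrow \left\{ 0, 1\right\} ^{n}\) be a keyed and tweakable function, where the key space \(\mathcal {K}\) is some nonempty set. Let \(F_{K}(.)=F(K, .)\) and \(\widetilde{F}^{\left\langle T \right\rangle }_{K}(.)=\widetilde{F}(K, T, .)\). Let \(\varvec{A}\) be an adversary. Then:

\[\mathbf {Adv}^{\mathrm {prf}}_{F}(\varvec{A})=\Pr \left[ K \mathop {\leftarrow }\limits ^{\$} \mathcal {K}: \varvec{A}^{F_{K}(\cdot )} \Rightarrow 1\right] -\Pr \left[ R \mathop {\leftarrow }\limits ^{\$} \text {Func}(m, n): \varvec{A}^{R(\cdot )} \Rightarrow 1\right] \]

\[\mathbf {Adv}^{\widetilde{\mathrm {prf}}}_{\widetilde{F}}(\varvec{A})=\Pr \left[ K \mathop {\leftarrow }\limits ^{\$} \mathcal {K}: \varvec{A}^{\widetilde{F}_{K}(\cdot , \cdot )} \Rightarrow 1\right] -\Pr \left[ \widetilde{R} \mathop {\leftarrow }\limits ^{\$} \text {Func}^{\mathcal {T}}(m, n): \varvec{A}^{\widetilde{R}(\cdot , \cdot )} \Rightarrow 1\right] \]

The resource-parametrized advantage functions are defined accordingly, considering that the adversarial resources of interest here are the time complexity (t) of the adversary and the total number of queries (q) asked by the adversary (note that we just consider fixed-input-length functions, so the lengths of queries are fixed and known). We say that F is \((t, q; \epsilon )\)-PRF if \(\mathbf {Adv}^{\mathrm {prf}}_{F}(t, q) \le \epsilon \). We say that \(\widetilde{F}\) is \((t, q; \epsilon )\)-tweakable PRF if \(\mathbf {Adv}^{\widetilde{\mathrm {prf}}}_{\widetilde{F}}(t, q) \le \epsilon \).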

3 Security Notions for AEAD

Syntax of an AEAD Scheme. A nonce-based authenticated encryption with associated data, AEAD for short, is a symmetric key scheme \(\varPi =(\mathcal {K}, \mathcal {E}, \mathcal {D})\). The key space \(\mathcal {K}\) is some non-empty finite set. The encryption algorithm \(\mathcal {E}: \mathcal {K} \times \mathcal {N} \times \mathcal {A} \times \mathcal {M} \rightarrow \mathcal {C} \cup \left\{ \bot \right\} \) takes four arguments, a secret key \(K \in \mathcal {K}\), a nonce \(N \in \mathcal {N}\), an associated data (a.k.a. header data) \(A \in \mathcal {A}\) and a message \(M \in \mathcal {M}\), and returns either a ciphertext \(\mathbb {C} \in \mathcal {C}\) or a special symbol \(\bot \) indicating an error. The decryption algorithm \(\mathcal {D}: \mathcal {K} \times \mathcal {N} \times \mathcal {A} \times \mathcal {C} \rightarrow \mathcal {M} \cup \left\{ \bot \right\} \) takes four arguments \((K, N, A, \mathbb {C})\) and either outputs a message \(M \in \mathcal {M}\) or an error indicator \(\bot \).

For correctness of the scheme, it is required that \(\mathcal {D}(K, N, A, \mathbb {C})=M\) for any \(\mathbb {C}\) such that \(\mathbb {C}=\mathcal {E} (K, N, A, M)\). It is also assumed that if the algorithms \(\mathcal {E}\) and \(\mathcal {D}\) receive a parameter outside their specified domain of arguments, they will output \(\bot \). We write \(\mathcal {E}_{K}(N, A, M)=\mathcal {E}(K, N, A, M)\) and similarly \(\mathcal {D}_{K}(N, A, \mathbb {C})=\mathcal {D}(K, N, A, \mathbb {C})\).

We assume that the message and associated data can be any binary string of arbitrary but finite length; i.e. \(\mathcal {M}=\left\{ 0, 1\right\} ^{*}\) and \(\mathcal {A}=\left\{ 0, 1\right\} ^{*}\), but the key and nonce are some fixed-length binary strings, i.e. \(\mathcal {N}=\left\{ 0, 1\right\} ^{|N|}\) and \(\mathcal {K}=\left\{ 0, 1\right\} ^{k}\), where the positive integers |N| and k are respectively the nonce length and the key length of the scheme in bits. We assume that \(|\mathcal {E}_{K}(N, A, M)|=|M|+\tau \) for some positive fixed constant \(\tau \); that is, we will have \(\mathbb {C}=C||\mathsf{Tag }\) where \(|C|=|M|\) and \(|\mathsf{Tag }|=\tau \). We call C the core ciphertext and \(\mathsf{Tag }\) the tag.

Nonce Respecting Adversaries. Let \(\varvec{A}\) be an adversary. We say that \(\varvec{A}\) is nonce-respecting if it never repeats a nonce in its encryption queries. That is, if \(\varvec{A}\) queries the encryption oracle \(\mathcal {E}_K (\cdot , \cdot , \cdot )\) on \((N_1, A_1, M_1) \cdots (N_q, A_q, M_q)\) then \(N_1, \cdots , N_q\) must be distinct.
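In an experiment harness, the nonce-respecting condition can be enforced by wrapping the encryption oracle so that a repeated nonce is rejected. The sketch below assumes a caller-supplied `encrypt(nonce, ad, msg)` callable with the key already fixed; the class name and the choice to raise `ValueError` are illustrative assumptions.

```python
class NonceRespectingOracle:
    """Wraps an encryption oracle and refuses any repeated nonce."""

    def __init__(self, encrypt):
        self._encrypt = encrypt  # hypothetical callable: (nonce, ad, msg) -> ciphertext
        self._seen = set()

    def query(self, nonce: bytes, ad: bytes, msg: bytes):
        if nonce in self._seen:
            raise ValueError("nonce repeated in an encryption query")
        self._seen.add(nonce)
        return self._encrypt(nonce, ad, msg)
```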

Privacy Notion. We adopt the privacy notion called indistinguishability of ciphertext from random bits under CPA (IND$-CPA), which is defined in [26] as a stronger variant of the classical IND-CPA notion [5, 7].

Let \(\varPi =(\mathcal {K}, \mathcal {E}, \mathcal {D})\) be a nonce-based AEAD scheme. Let \(\varvec{A}\) be a nonce-respecting adversary. \(\varvec{A}\) is provided with an oracle which can be either a real encryption oracle \(\mathcal {E}_K(\cdot , \cdot , \cdot )\) such that on input \((N, A, M)\) returns \(\mathbb {C}=\mathcal {E}_K(N, A, M)\), or a fake encryption oracle \(\$(\cdot ,\cdot , \cdot )\) which on any input \((N, A, M)\) returns \(|\mathbb {C}|\) fresh random bits. The advantage of \(\varvec{A}\) in mounting a chosen plaintext attack (CPA) against the privacy property of \(\varPi \) is measured as follows:

\[\mathbf {Adv}^{\mathrm {priv}}_{\varPi }(\varvec{A})=\Pr \left[ K \mathop {\leftarrow }\limits ^{\$} \mathcal {K}: \varvec{A}^{\mathcal {E}_K(\cdot , \cdot , \cdot )} \Rightarrow 1\right] -\Pr \left[ \varvec{A}^{\$(\cdot , \cdot , \cdot )} \Rightarrow 1\right] \]

Authenticity Notion. We adopt the established notion of authenticity, called integrity of ciphertext (INT-CTXT) under CCA attacks. The notion was originally defined in [7] for AE schemes and later revisited to include (authentication of AD in) AEAD schemes in [23].

Let \(\varPi =(\mathcal {K}, \mathcal {E}, \mathcal {D})\) be a nonce-based AEAD scheme. Let \(\varvec{A}\) be a nonce-respecting adversary. We stress that the nonce-respecting requirement applies only to the encryption queries; that is, \(\varvec{A}\) can repeat nonces during its decryption queries and it can also ask an encryption query with a nonce that was already used in a decryption query. Let \(\varvec{A}\) be provided with the encryption oracle \(\mathcal {E}_K(\cdot , \cdot , \cdot )\) and the decryption oracle \(\mathcal {D}_K(\cdot , \cdot , \cdot )\); that is, we consider adversaries that can mount chosen ciphertext attacks (CCA). We say that \(\varvec{A}\) forges if it makes a decryption query \((N, A, \mathbb {C})\) such that \(\mathcal {D}_K(N, A, \mathbb {C}) \ne \bot \) and no previous encryption query \(\mathcal {E}_K(N, A, M)\) returned \(\mathbb {C}\).

Resource parameters. Let \((N^1, A^1, M^1) \cdots (N^{q_e}, A^{q_e}, M^{q_e})\) denote the encryption queries and \(({N'}^1, {A'}^1, {\mathbb {C}'}^1) \cdots ({N'}^{q_v}, {A'}^{q_v}, {\mathbb {C}'}^{q_v})\) the decryption queries made by an adversary \(\varvec{A}\). We define the resource parameters of \(\varvec{A}\) as \((t, q_e, q_v, \sigma _A, \sigma _M, \sigma _{A'}, \sigma _{\mathbb {C}'}, L_{max})\), where t is the time complexity, \(q_e\) and \(q_v\) are respectively the total number of encryption queries and decryption queries, \(L_{max}\) is the maximum length of each query in bits, \(\sigma _A=\sum _{i=1}^{q_e}|A^i|\), \(\sigma _{M}=\sum _{i=1}^{q_e}|M^i|\), \(\sigma _{A'}=\sum _{i=1}^{q_v}|{A'}^i|\) and \(\sigma _{\mathbb {C}'}=\sum _{i=1}^{q_v}(|{\mathbb {C}'}^i|-\tau )\).

The absence of a resource parameter will mean that the parameter is irrelevant in the context and hence omitted.

4 The p-OMD Mode of Operation

p-OMD is a mode of operation that converts a keyed compression function to an AEAD scheme. To instantiate p-OMD, one must first choose and fix a keyed compression function \(F:\mathcal {K} \times \left( \{0,1\}^{n} \times \{0,1\}^{m} \right) \rightarrow \{0,1\}^{n}\) and a tag length \(\tau \le n\); with the key space \(\mathcal {K}=\{0,1\}^{k}\) and \(m\le n\). Let \({\text {p-OMD}}[F,\tau ]\) denote the p-OMD instantiated by fixing F and \(\tau \).

If the compression function at hand does not have a dedicated key input per se, as is the case for standard hash functions, then a keyed compression function with \(n+m\) input bits can be obtained from the keyless compression function with \(n+b\) input bits by allocating k input bits for the key, such that \(b=m+k\). For example, if we use the compression function of SHA-256, we have \(n=256, b=512\) and setting \(k=256\) will give us a keyed compression function with \(m=n=256\).
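The SHA-256 example can be sketched in pure Python, since the standard `hashlib` module does not expose the raw compression function. The round constants and initial value are derived from the cube and square roots of the first primes, as in the SHA-256 standard. The placement of the key within the 512-bit block (here `K || M`) is an assumption of this sketch; the text above only fixes the bit budget \(b=m+k\).

```python
import hashlib
from math import isqrt

MASK = 0xFFFFFFFF


def _icbrt(x: int) -> int:
    """Integer cube root by binary search."""
    lo, hi = 0, 1 << ((x.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo


def _primes(count: int) -> list:
    found, c = [], 2
    while len(found) < count:
        if all(c % p for p in found):
            found.append(c)
        c += 1
    return found


# First 32 fractional bits of the cube roots (K) and square roots (IV) of primes.
K256 = [_icbrt(p << 96) & MASK for p in _primes(64)]
IV = [isqrt(p << 64) & MASK for p in _primes(8)]


def _rotr(x: int, r: int) -> int:
    return ((x >> r) | (x << (32 - r))) & MASK


def sha256_compress(h: list, block: bytes) -> list:
    """Keyless SHA-256 compression: 8-word chaining value, 64-byte block."""
    w = [int.from_bytes(block[4 * i:4 * i + 4], "big") for i in range(16)]
    for t in range(16, 64):
        s0 = _rotr(w[t - 15], 7) ^ _rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = _rotr(w[t - 2], 17) ^ _rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
    a, b, c, d, e, f, g, hh = h
    for t in range(64):
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & MASK & g)
        t1 = (hh + big_s1 + ch + K256[t] + w[t]) & MASK
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        a, b, c, d, e, f, g, hh = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip(h, [a, b, c, d, e, f, g, hh])]


def keyed_compress(key: bytes, chaining: bytes, msg: bytes) -> bytes:
    """F_K(H, M) with n = m = k = 256; the block layout K || M is an assumption."""
    assert len(key) == len(chaining) == len(msg) == 32
    state = [int.from_bytes(chaining[4 * i:4 * i + 4], "big") for i in range(8)]
    out = sha256_compress(state, key + msg)
    return b"".join(word.to_bytes(4, "big") for word in out)
```

A quick sanity check of the compression function is to hash a one-block message by hand (padding `b"abc"` to 64 bytes and compressing from `IV`) and compare the result with `hashlib.sha256`.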

Description of the Mode. The main design rationale behind p-OMD is the integration of AD processing into the same MD path that processes the message. Figure 1 shows a schematic representation of the encryption algorithm of \({\text {p-OMD}}[F,\tau ]\). The decryption algorithm can be straightforwardly derived from the encryption algorithm with the additional verification of the authentication tag at the end of the decryption process. While the overall structure of such design is rather simple, the combined processing of the message and associated data blocks in p-OMD creates several additional possible cases, to be treated and analyzed carefully, compared to the analysis of OMD. Figure 2 provides an algorithmic description.

In the following we briefly explain the components that may need further clarification.

Fig. 1. The encryption process of \({\text {p-OMD}}[F,\tau ]\). Refer to Sect. 4 for details. See Sect. 2 for our convention on the notation \(X \oplus Y\) for operands of different lengths.

Fig. 2. Description of the encryption algorithm of p-OMD[\(F, \tau \)]. Stage 1 processes blocks of message and AD simultaneously (Cases A, B and C in Fig. 1). Stage 2 processes only message blocks (Case B in Fig. 1, and the case with a message but no AD, which is not shown in the figure). Stage 3 processes only double blocks of AD (Cases C and D in Fig. 1). Note that Cases E and F are handled outside of the three stages. Subroutines PARTITION, PAD, SWITCH and PROC1-3 are described in Fig. 3.

Fig. 3. The subroutines used in the encryption algorithm of p-OMD[\(F, \tau \)] (Fig. 2)

  (1) Computing \(\varvec{\varDelta _{N, i, j}}\). As shown in Fig. 1, before each call to the underlying compression function F, we xor a (key-dependent) masking value \(\varDelta _{N, i, j}\) to the chaining variable, where N is the nonce, the i component of the index is incremented at each call to the compression function and the j component is changed when needed (according to a pattern that will be detailed shortly). This method is known as the XE method [24] and is used for converting F to a tweakable function. There are different plausible ways to compute such masking values (under efficiency and security constraints) [11, 19, 24]. We adopt the Gray code based method following [19]. In the following, all multiplications (denoted by “\(\cdot \)”) are in \(GF(2^{n})\).

    (a) Precomputation. Let \(L_*(0)=0^n\), \(L_*(1)=F_{K}(0^n, 0^m)\) and \(L_*(i)=i~\cdot ~L_*(1)\) for \(2\le i \le 15\). Let \(L(0)=16~\cdot ~L_*(1)\) and \(L(j)=2 \cdot L(j-1)\) for \(j \ge 1\). For a fast implementation the values \(L_*(i)\) and L(j) can be precomputed and stored in a table for \(1\le i \le 15\) and \(0 \le j \le \left\lceil \log _2 (\ell _{max}) \right\rceil \), where \(\ell _{max}\) is the bound on the maximum number of blocks in M or A. Alternatively, if memory is restricted, they can be computed on-the-fly. Note that all \(L_*(i)\) are linear.

    (b) Computation of the masking sequence. The masking values \(\varDelta _{N, i, j}\) are computed sequentially as follows. Let \(\varDelta _{N, 0, 0}=F_{K}(N||10^{n-1-|N|}, 0^m)\). For \(i \ge 1\) and \(j, j' \in \{0,1,\ldots , 15 \}\): \(\varDelta _{N, i, j}=\varDelta _{N, i-1, j'}\oplus L(\mathtt{ntz }(i)) \oplus L_* \left( {\mathsf {str2num}}\left( \langle j \rangle _4 \oplus \langle j' \rangle _4 \right) \right) \). For details on how we get this compact relation adopting the Gray code based sequence partition method, we refer to Appendix ??.
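The mask recurrence can be exercised with a short sketch for \(n=m=256\). The function `F` below is a stand-in PRF built from `hashlib.sha256` purely so the example runs; it is not the keyed compression function of the scheme. The helper names, the byte-aligned nonce, and the representation of field elements as integers are likewise assumptions of this sketch.

```python
import hashlib

N = 256
LOW_TERMS = (1 << 10) | (1 << 5) | (1 << 2) | 1  # X^10 + X^5 + X^2 + 1


def gf_mul(a: int, b: int) -> int:
    """Multiplication in GF(2^256) by shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a >> N:
            a ^= (1 << N) | LOW_TERMS
        b >>= 1
    return result


def F(key: bytes, h: int, msg: bytes) -> int:
    """Stand-in PRF (illustration only, NOT the real compression function)."""
    data = key + h.to_bytes(32, "big") + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def ntz(i: int) -> int:
    return (i & -i).bit_length() - 1


def l_star1(key: bytes) -> int:
    """L_*(1) = F_K(0^n, 0^m)."""
    return F(key, 0, bytes(32))


def initial_mask(key: bytes, nonce: bytes) -> int:
    """Delta_{N,0,0} = F_K(N || 1 0^{n-1-|N|}, 0^m) for a byte-aligned nonce."""
    padded = nonce + b"\x80" + bytes(32 - 1 - len(nonce))
    return F(key, int.from_bytes(padded, "big"), bytes(32))


def next_mask(prev: int, l1: int, i: int, j: int, j_prev: int) -> int:
    """Delta_{N,i,j} from Delta_{N,i-1,j'} using the recurrence in the text."""
    # L(ntz(i)) = 2^{ntz(i)} * 16 * L_*(1), i.e. X^{4 + ntz(i)} * L_*(1).
    l_ntz = gf_mul(1 << (4 + ntz(i)), l1)
    # L_*(j xor j') = (j xor j') * L_*(1).
    return prev ^ l_ntz ^ gf_mul(j ^ j_prev, l1)
```

Since \(L(\mathtt{ntz }(i))\) accumulates to the Gray code \(\gamma _i = i \oplus (i \gg 1)\) times \(L(0)\), the recurrence should agree with the closed form \(\varDelta _{N,0,0} \oplus (16 \cdot \gamma _i \oplus j) \cdot L_*(1)\), which gives a convenient consistency check for an implementation.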

  (2) Encryption Algorithm: To encrypt a message \(M \in \left\{ 0, 1\right\} ^*\) with associated data \(A \in \left\{ 0, 1\right\} ^*\) using nonce \(N \in \left\{ 0, 1\right\} ^{|N|}\) and key \(K \in \left\{ 0, 1\right\} ^{k}\), obtaining a ciphertext \(\mathbb {C}=C||\mathsf{Tag } \in \left\{ 0, 1\right\} ^{|M|+\tau }\), do the following.

    (a) Partitioning the message and associated data. The partitioning is done by the PARTITION subroutine in Fig. 3. Let \(M_1||M_2 \cdots M_{\ell -1}||M_\ell \mathop {\leftarrow }\limits ^{m}M\). Let \(A'||A^*\leftarrow A\) where \(A'\leftarrow A[|A|-1 \cdots |A| - (\ell +1)n]\) and \(A^*\leftarrow A[|A| -|A'|-1 \cdots 0]\) (refer to the notations in Sect. 2). Let \(A'_1||A'_2 \cdots A'_{a'-1}||A'_{a'} \xleftarrow {n} A'\) and \(A^*_1||A^*_2 \cdots A^*_{a^*-1}||A^*_{a^*} \xleftarrow {n+m} A^* \). The string \(A'\) consists of \(a' \le \ell +1\) n-bit blocks and these blocks will be simply absorbed into the chaining variable during the message encryption. In a typical use case where the associated data is (a header) shorter than the message, we will have \(A'=A\), i.e. \(A^*=\varepsilon \) (Case A and Case B in Fig. 1). The string \(A^*\) will be non-empty only if \(|A|>(\ell +1)n\), in which case, while \(A^*\) is being processed, there are no more message blocks to encrypt. To maximize efficiency, we partition the string \(A^*\) into \((n+m)\)-bit blocks so that we can make use of both inputs to F (see Case C and Case D in Fig. 1).

    (b) Processing the message and associated data. The message and associated data blocks are processed by the modified MD iteration of the keyed compression function F as shown in Fig. 1. For every call to F, the n-bit input (chaining variable) is masked by the value \(\varDelta _{N,i,j}\), where the N component in the index denotes the nonce; i starts with the value \(i=1\) at the first call to F and is incremented (by one) for every call; the j component is used to separate logical parts in the encryption process as well as different types of input arguments. Appropriate use of the j component is essential for security and facilitates the analysis, as will be described in the following.

  (3) Selection of the \(\varvec{j}\) component in the index of the masks \(\varvec{\varDelta _{N, i, j}}\). We use different values of j to separate the calls to the masked F in different contexts. We classify the calls to the masked F into two types: (1) the final call to F which returns the tag, and (2) the internal calls. We note that in the special case that \(M=\varepsilon \) and \(|A| \le n\) there will be only one call to F which returns the tag; hence, it is considered as the final call.

      • Internal Calls. We use \(j \in \{0, 1, 2\}\) for the internal calls made to the masked F as follows.

        For \(i=1\), i.e. the first call to F, the value of j is determined as follows:

        • if \(\ell >0\) and \(a'>0\) then let \(j=0\),

        • if \(\ell >0\) and \(a'=0\) then let \(j=1\),

        • if \(\ell =0\) and \(a^*>0\) then let \(j=2\).

        For \(1< i < \ell + 1 + a^*\), depending on the presence of message blocks and AD blocks to be processed at the \(i^{\text {th}}\) call to the masked F, we have:

        • if both an n-bit AD block and an m-bit message block are present then \(j=0\),

        • if only an m-bit message block is present (no AD block is processed) then \(j=1\),

        • if only an \((n+m)\)-bit AD block is present (no message block is processed) then \(j=2\).

      • Final Call. The final call to F which produces the authentication tag uses \(j_f \in \{3, 4, 5, \ldots , 14, 15\}\). If the tag is produced by a call to F with \(i \ne 1\), we have three main cases depending on the inputs to the final masked F.

        • If both an AD block and a message block are present in the final call (see Case A in Fig. 1) then \(j_f\in \{4, 5, 6, 7\}\); where, we let \(j_f=4\) if \(|M_\ell |=m\) and \(|A'_{a'}|=n\); let \(j_f=5\) if \(|M_\ell |<m\) and \(|A'_{a'}|=n\); let \(j_f=6\) if \(|M_\ell |=m\) and \(|A'_{a'}|<n\), and otherwise (\(|M_\ell |<m\) and \(|A'_{a'}|<n\)) let \(j_f=7\).

        • If only a message block is present but no AD block is processed in the final call (see Case B in Fig. 1) then \(j_f\in \{8, 9, 10, 11\}\); where, we let \(j_f=8\) if \(|M_\ell |=m\) and \(|A'_{a'}|=n\); let \(j_f=9\) if \(|M_\ell |<m\) and \(|A'_{a'}|=n\); let \(j_f=10\) if \(|M_\ell |=m\) and \(|A'_{a'}|<n\), and otherwise (\(|M_\ell |<m\) and \(|A'_{a'}|<n\)) let \(j_f=11\). For the special case where there is no associated data at all, i.e. \(A=\varepsilon \), we let \(j_f=8\) if \(|M_\ell |=m\) and let \(j_f=9\) if \(|M_\ell |<m\).

        • If only an AD block is present but no message block is processed in the final call (see Case C and Case D in Fig. 1) then \(j_f\in \{12,13,14,15\}\); where, we let \(j_f=12\) if \(|M_\ell |=m\) and \(|A^*_{a^*}|=n+m\); let \(j_f=13\) if \(|M_\ell |<m\) and \(|A^*_{a^*}|=n+m\); let \(j_f=14\) if \(|M_\ell |=m\) and \(|A^*_{a^*}|<n+m\), and otherwise (\(|M_\ell |<m\) and \(|A^*_{a^*}|<n+m\)) let \(j_f=15\). For the special case where there is no message at all, i.e. \(M=\varepsilon \), let \(j_f=12\) if \(|A^*_{a^*}|=n+m\) and let \(j_f=14\) if \(|A^*_{a^*}|<n+m\).

        For \(i=1\) (meaning that the final call is the same as the first call, which happens if \(M=\varepsilon \) and \(|A| \le n\)) we need to apply a special treatment:

        • if both \(M = A =\varepsilon \) then \(j_f=3\) (Case F in Fig. 1),

        • if \(M=\varepsilon \) and \(0 < |A| \le n\) then we let \(j_f=12\) if \(|A|=n\), otherwise, let \(j_f=14\) (Case E box in Fig. 1).

      Note that there is no variable \(j_f\) in Fig. 2 as \(j_f\) corresponds to a special use of variable j in the last call to F. Specifically, \(j_f\) corresponds to the calls to the SWITCH subroutine that use the value of new j of the form \(\mathsf {const}+j_A+j_M\) or the value 3.
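The selection rules above reduce to a base constant plus two flags, in line with the form \(\mathsf {const}+j_A+j_M\) mentioned for the SWITCH subroutine. The sketch below encodes that reading; the function names, the `kind` strings, and the boolean parameters are hypothetical, and deciding which case applies to a given (M, A) pair is left to the caller.

```python
def internal_j(kind: str) -> int:
    """j for an internal call: 0 (AD and message), 1 (message only), 2 (AD only)."""
    return {"both": 0, "message_only": 1, "ad_only": 2}[kind]


def final_j(kind: str, msg_last_full: bool = True, ad_last_full: bool = True) -> int:
    """j_f for the final call, as const + j_A + j_M.

    kind is one of "both", "message_only", "ad_only" or "empty" (M = A = empty).
    A missing message or AD is treated as "full", matching the special cases
    in the text (for example j_f = 8 or 9 when A is empty).
    """
    if kind == "empty":
        return 3
    const = {"both": 4, "message_only": 8, "ad_only": 12}[kind]
    j_m = 0 if msg_last_full else 1  # last message block shorter than m bits
    j_a = 0 if ad_last_full else 2   # last AD block shorter than its block size
    return const + j_a + j_m
```

For instance, the case \(M=\varepsilon \), \(0<|A|<n\) corresponds to `final_j("ad_only", True, False)`, which yields 14 as stated for the Case E box.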

  (4) Decryption Algorithm: The decryption algorithm accepts a ciphertext \(\mathbb {C} \in \left\{ 0, 1\right\} ^*\) together with associated data \(A \in \left\{ 0, 1\right\} ^*\) and nonce \(N \in \left\{ 0, 1\right\} ^{|N|}\), and using key \(K \in \left\{ 0, 1\right\} ^{k}\) obtains a plaintext \(M \in \left\{ 0, 1\right\} ^*\) or returns the error indicator \(\bot \). If \(|\mathbb {C}| < \tau \) then return \(\bot \). Otherwise let C be the first \(|\mathbb {C}|-\tau \) bits of \(\mathbb {C}\) and \(\mathsf{Tag }\) be the remaining \(\tau \) bits. Since the encryption process of p-OMD is an additive stream cipher with an integrated authentication mechanism, the decryption process proceeds the same as the encryption process up until the verification of the tag, which happens at the end of the decryption process, where the newly computed tag \(\mathsf{Tag }'\) is compared with the provided tag \(\mathsf{Tag }\). If \(\mathsf{Tag }' =\mathsf{Tag }\) then output M, otherwise output \(\bot \).
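The outer steps of decryption (length check, splitting \(\mathbb {C}\) into C and Tag, and releasing M only when the tags match) can be sketched as below. The sketch assumes byte-aligned lengths and uses `None` for \(\bot \); the constant-time comparison via `hmac.compare_digest` is an implementation precaution added here, not something the text prescribes. The core computation that recovers M and the recomputed tag is not shown.

```python
import hmac


def split_ciphertext(c: bytes, tau_bytes: int):
    """Return (C, Tag), or None (standing for the error symbol) if c is too short."""
    if len(c) < tau_bytes:
        return None
    cut = len(c) - tau_bytes
    return c[:cut], c[cut:]


def release_if_valid(message: bytes, computed_tag: bytes, provided_tag: bytes):
    """Output M only if the recomputed tag equals the provided tag."""
    if hmac.compare_digest(computed_tag, provided_tag):
        return message
    return None
```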

5 Security Analysis

The security analysis for p-OMD is modular and easy to follow. The high-level structure of the analysis is similar to that of OMD, as expected from the similarities of the algorithms, though the details differ and are more involved. We refer to the full version of this paper [22] for all omitted details.

The proof is divided into three main steps as follows:

  • Step 1: Idealization of the p-OMD scheme using a tweakable random function. We first analyse the security of a generalized variant of \(\text {p-OMD}[F, \tau ]\) where the “masked F” (aimed to instantiate a tweakable function) is replaced by an ideal primitive; namely, a tweakable random function \(\widetilde{R}\). This is the major proof step which differs from and is more involved than that of OMD.

  • Step 2: Realization of the tweakable random function by a tweakable PRF. This is a well-known classical method where the (ideal) random function is replaced by a PRF. This proof step is therefore the same as that of OMD.

  • Step 3: Instantiation of the tweakable PRF via a PRF. To make a tweakable PRF out of a PRF, we use the XE method of [24] with the masking sequence generated based on an appropriate adjustment of a canonical Gray code sequence [19, 26]. This step is similar to that of OMD; only the details of the mask generation function differ.

The security bound for p-OMD is given by Theorem 1.

Theorem 1

Fix \(n\ge 1\), \(0 \le \tau \le n\). Let \(F:\mathcal {K} \times \left( \{0,1\}^{n}\times \{0,1\}^{m}\right) \rightarrow \{0,1\}^{n}\) be a PRF, where the key space \(\mathcal {K} = \{0,1\}^{k}\) for \(k\ge 1\) and \(1\le m \le n \). We have

where \(q_e\) and \(q_v\) are, respectively, the number of encryption and decryption queries, \(\ell _{max}\) denotes the maximum number of the internal calls to F in an encryption or decryption query, \(t'=t+cn\sigma \) for some constant c, and \(\sigma _e\) and \(\sigma \) are the total number of calls to the underlying compression function F in all queries asked by the CPA and CCA adversaries against the privacy and authenticity of the scheme, respectively.

The proof is obtained by combining Lemma 1 in Sect. 5.1 with Lemma 2 in Sect. 5.2 and Lemma 3 in Sect. 5.3.

5.1 Idealization of p-OMD

Let \(\text {p-}\mathbb {OMD}[\widetilde{R}, \tau ]\) be a generalization (idealization) of \(\text {p-OMD}[F, \tau ]\) that uses a tweakable random function \(\widetilde{R}: \mathcal {T} \times (\left\{ 0, 1\right\} ^{n} \times \left\{ 0, 1\right\} ^{m}) \rightarrow \left\{ 0, 1\right\} ^{n}\) instead of the masked F. The tweak space \(\mathcal {T}\) consists of sixteen mutually exclusive sets of tweaks \(\mathcal {T}= \bigcup _{i=0}^{15} \mathcal {N} \times \mathbb {N}\times \left\{ i\right\} \), where \(\mathcal {N}=\left\{ 0, 1\right\} ^{|N|}\) is the set of nonces and \(\mathbb {N}\) is the set of positive integers.

Lemma 1

Let \(\text {p-}\mathbb {OMD}[\widetilde{R}, \tau ]\) be the idealized scheme. Then

where \(q_e\) and \(q_v\) are, respectively, the number of encryption and decryption queries, \(\ell _{max}\) denotes the maximum number of the internal calls to the underlying tweakable random function \(\widetilde{R}\) in an encryption or decryption query, and \(\sigma _e\) and \(\sigma \) are the total number of calls to \(\widetilde{R}\) in all queries asked by the CPA and CCA adversaries against the privacy and authenticity of the scheme, respectively.

The proof (and figures depicting \(\text {p-}\mathbb {OMD}[\widetilde{R}, \tau ]\)) can be found in the full version of the paper [22].

5.2 Realization of Tweakable RFs with Tweakable PRFs

This is a classical step in which the ideal primitive—tweakable random function \(\widetilde{R}\)—is replaced with a standard primitive—tweakable PRF \(\widetilde{F}\). The security loss induced by this step is stated in the following lemma. See the full version of this paper [22] for the proof.

Lemma 2

Let \(\widetilde{R}: \mathcal {T} \times (\left\{ 0, 1\right\} ^{n} \times \left\{ 0, 1\right\} ^{m}) \rightarrow \left\{ 0, 1\right\} ^{n}\) be a tweakable RF and \(\widetilde{F}:\mathcal {K} \times \mathcal {T} \times (\left\{ 0, 1\right\} ^{n} \times \left\{ 0, 1\right\} ^{m}) \rightarrow \left\{ 0, 1\right\} ^{n}\) be a tweakable PRF. Then

where \(q_e\) and \(q_v\) are, respectively, the number of encryption and decryption queries, \(q=q_e+q_v\), \(\ell _{max}\) denotes the maximum number of the internal calls to F in an encryption or decryption query, \(t'=t+cn\sigma _e\) and \(t''=t+c'n\sigma \) for some constants \(c, c'\), and \(\sigma _e\) and \(\sigma \) are the total number of calls to the underlying compression function F in all queries asked by the CPA and CCA adversaries against the privacy and authenticity of the scheme, respectively.

5.3 Instantiation of Tweakable PRFs with PRFs

The last step is to instantiate the tweakable PRFs by means of a (keyed) compression function which is assumed to be a PRF. Similar to OMD, we use the XE method of [24] as shown in Fig. 4.

The proof and bound for this step follow from those of OMD, which in turn are a straightforward adaptation of the proof of the XE construction in [19]. Lemma 3 states the bound for this transformation. Here, the only difference between OMD and p-OMD is the way that the masking sequence \(\varDelta _{N, i, j}\) is computed. It is proved in the full version of this paper [22] that the required security and efficiency properties are satisfied by the specific mask generation scheme of p-OMD, as described in Sect. 4.

Fig. 4. Building a tweakable PRF \(\widetilde{F}^{\left\langle T\right\rangle }_{K}: \left\{ 0, 1\right\} ^{n} \times \left\{ 0, 1\right\} ^{m} \rightarrow \left\{ 0, 1\right\} ^{n}\) using a PRF \(F_K:\left\{ 0, 1\right\} ^{n} \times \left\{ 0, 1\right\} ^{m} \rightarrow \left\{ 0, 1\right\} ^{n}\).
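The masking step \(\widetilde{F}^{\left\langle T\right\rangle }_K (X, Y)=F_K(X \oplus \varDelta _K(T), Y)\) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration only: `prf` uses HMAC-SHA256 as a stand-in for the keyed compression function, and `toy_mask` is a placeholder that is not the mask generation scheme of p-OMD from Sect. 4. The sketch only shows how a tweak-dependent mask is XORed into the n-bit input before the call to the PRF.

```python
import hashlib
import hmac

N_BYTES = 32  # assumed chaining-value size n = 256 bits


def prf(key: bytes, x: bytes, y: bytes) -> bytes:
    """Stand-in for F_K(X, Y). HMAC-SHA256 is an assumption, not the paper's F."""
    return hmac.new(key, x + y, hashlib.sha256).digest()[:N_BYTES]


def toy_mask(key: bytes, tweak: tuple) -> bytes:
    """Hypothetical placeholder for Delta_K(T); NOT the p-OMD mask of Sect. 4."""
    nonce, i, j = tweak  # tweak T = (N, i, j) with j in 0..15
    mask_key = hashlib.sha256(b"mask" + key).digest()
    label = nonce + i.to_bytes(8, "big") + j.to_bytes(1, "big")
    return hmac.new(mask_key, label, hashlib.sha256).digest()[:N_BYTES]


def tweakable_prf(key: bytes, tweak: tuple, x: bytes, y: bytes) -> bytes:
    """XE-style masking: evaluate F_K on (X xor Delta_K(T), Y)."""
    if len(x) != N_BYTES:
        raise ValueError("x must be exactly n bits long")
    delta = toy_mask(key, tweak)
    masked_x = bytes(a ^ b for a, b in zip(x, delta))
    return prf(key, masked_x, y)
```

With a fixed key and inputs, changing only the tweak changes the mask and hence the value fed to the PRF, so each tweak selects what behaves like a separate function.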

Fig. 5. Performance comparisons between OMD and p-OMD. Top left: encryption complexity with fixed message length. Top right: encryption complexity with equal message length and AD length. Bottom right: comparison of OMD without AD to OMD and p-OMD with AD. Bottom left: encryption complexity of p-OMD for varying message and AD lengths.

Lemma 3

Let \(F: \mathcal {K} \times (\left\{ 0, 1\right\} ^{n} \times \left\{ 0, 1\right\} ^{m}) \rightarrow \left\{ 0, 1\right\} ^{n}\) be a function family with key space \(\mathcal {K}\). Let \(\widetilde{F}: \mathcal {K} \times \mathcal {T} \times (\left\{ 0, 1\right\} ^{n} \times \left\{ 0, 1\right\} ^{m}) \rightarrow \left\{ 0, 1\right\} ^{n}\) be defined by \(\widetilde{F}^{\left\langle T\right\rangle }_K (X, Y)=F_K((X \oplus \varDelta _K(T)), Y)\) for every \(T\in \mathcal {T}, K\in \mathcal {K}\), \(X\in \left\{ 0, 1\right\} ^n, Y\in \left\{ 0, 1\right\} ^m\), where \(\varDelta _K(T)\) is the masking function of p-OMD as defined in Sect. 4. If F is a PRF then \(\widetilde{F}\) is a tweakable PRF; more precisely

6 Performance Comparison with OMD

To verify the performance advantage of p-OMD over OMD, with respect to processing associated data, we implemented the two algorithms in software and made some measurements to determine and compare their performance.

The comparison is performed on the x86-64 architecture (Intel Core i7-3632QM, with all measurements carried out on a single core). For OMD, we used the OMD-sha512 instantiation optimised for the AVX1 instruction extension, which achieves the best result according to the CAESAR benchmarking measurements [1]. We made the necessary modifications (as in the description of p-OMD) to the same code to obtain our implementation of p-OMD. Both OMD and p-OMD were instantiated with the same parameters: key length = 512 bits, nonce length = 256 bits, tag length = 256 bits. Both implementations were built with the gcc compiler using the -Ofast optimization flag.

We measure the time complexity of the encryption process for varying lengths of message and associated data. For the sake of this section, let m denote the message length and a the AD length, both in bytes. We measure the encryption time for \(m \in \{64, 128, 192, \ldots , 4096\}\) and, for every value of m, for \(a\in \{ 64, 128, \ldots , m \}\). That is, we consider the typical case when AD is at most as long as the message.
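For concreteness, the set of measured \((m, a)\) pairs can be enumerated as follows. This is only an illustrative sketch; the function name `parameter_grid` and the explicit step of 64 bytes are assumptions read off from the sets above.

```python
def parameter_grid(max_len: int = 4096, step: int = 64):
    """Return all (m, a) pairs with m in {step, ..., max_len} and a in {step, ..., m}."""
    return [
        (m, a)
        for m in range(step, max_len + 1, step)
        for a in range(step, m + 1, step)
    ]


pairs = parameter_grid()
```

Under these assumptions there are 64 message lengths, and the message length \(m = 64k\) is paired with k AD lengths, giving \(64 \cdot 65 / 2 = 2080\) pairs in total.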

For both OMD and p-OMD and for every pair of values \((m, a)\), we time a single encryption with the rdtsc instruction 200 times and compute the mean of these timings. This is repeated 91 times, and the value we take as the result is the median of these 91 mean encryption times. We additionally apply the same procedure to measure the time complexity of the encryption of OMD with \(m \in \{64, 128, \ldots , 4096\}\) and \(a=0\). The results are shown in Fig. 5.
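The median-of-means procedure can be sketched as below. This is not the measurement code used for the paper: the function name `median_of_means` is hypothetical, and `time.perf_counter_ns` is assumed as a portable stand-in for the rdtsc cycle counter, so the result is in nanoseconds rather than cycles.

```python
import statistics
import time


def median_of_means(encrypt, clock=time.perf_counter_ns, inner=200, outer=91):
    """Median over `outer` batches of the mean of `inner` single-call timings."""
    means = []
    for _ in range(outer):
        total = 0
        for _ in range(inner):
            start = clock()          # assumed stand-in for reading rdtsc
            encrypt()                # one encryption for a fixed (m, a) pair
            total += clock() - start
        means.append(total / inner)
    return statistics.median(means)
```

Averaging within a batch smooths out timer granularity, while taking the median across batches discards batches disturbed by interrupts or other system noise.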

The top left graph in Fig. 5 shows that the relative complexity of encryption of both OMD and p-OMD decreases as the length of AD increases; however, p-OMD performs better than OMD. The top right graph demonstrates that if the length of AD is close to the message length then p-OMD has a clear advantage over OMD. The bottom right graph confirms that p-OMD provides almost free authentication of associated data compared to OMD.

For both OMD and p-OMD, these measurements exclude the complexity of the precomputation step in computing \(\varDelta _{N, i, j}\) (see Sect. 4), which is done only once during the whole lifetime of a key. As an upper bound, we measure the complexity of the precomputation step that is sufficient to encrypt messages with length up to \(2^{63}\) blocks. On average, the precomputation step takes 5818 cycles for OMD and 6863 cycles for p-OMD.