1 Introduction

Background. An authenticated encryption with associated data scheme (AEAD) is a symmetric key cryptographic primitive that provides both confidentiality and integrity of plaintexts, and integrity of associated data. There are several ways of designing AEADs, and we focus on a design based on a blockcipher. CCM [39] was proposed by Whiting, Housley, and Ferguson for use within the IEEE 802.11 standard for Wireless LANs. It is adopted as NIST recommendation [16], and is broadly used in practice [9, 20, 21]. The mode is 2-pass, meaning that we run two algorithms, one for encryption and one for authentication. It is provably secure [25], but CCM suffers from a number of limitations, most notably it is not on-line; the encryption process cannot be started until knowing the whole input data. There are other issues in CCM [35], and EAX was proposed by Bellare, Rogaway, and Wagner to overcome these limitations [13]. EAX is included in ISO 19772 [9], and it has a number of attractive features; it is simple as it uses CMAC and CTR mode in a black-box manner, and it was designed by taking provable security into consideration. However, it has several implementation costs, and EAX-prime was designed by Moise, Beroset, Phinney, and Burns [31] to reduce the costs. It was designed to reduce the number of blockcipher calls both in precomputation and in processing the input data, to eliminate the key dependent constants, also called masks, to reduce memory requirement to store them, and to unify the associated data and the nonce, which contributes to reduce the memory requirement and the number of blockcipher calls as well. However, a practical attack was pointed out against EAX-prime [30], showing that it is not a secure AEAD. Later, Minematsu, Lucks, and Iwata proposed a variant of EAX called EAX\(^+\), which has similar complexity as EAX-prime and is provably secure as EAX [29].

Presumably, though not clearly stated in the document [31], the most significant advantage of EAX-prime over original EAX (and CCM) is its efficient handling of short input data with small memory. As EAX-prime needs only one blockcipher call in precomputation whereas EAX needs three calls, EAX-prime gains performance for short (say 16 bytes) input data, in particular if precomputation is difficult due to a limited amount of memory, or frequent key changes, or both. The performance for short input data is important for many practical applications, most notably for low-power wireless sensor networks, since messages are typically short to suppress the energy consumption of sensor nodes, which are usually battery-powered. For example, Zigbee [8] limits the maximum message length to 127 bytes, and Bluetooth low energy limits the length to 47 bytes [4]. Another example is the Electronic Product Code (EPC), which is a replacement for bar codes using RFID tags, and it typically has \(96\) bits [5].

Our Contributions. In this paper, we present a mode of operation, \(\mathrm {CLOC}\) (which stands for Compact Low-Overhead CFB, and is pronounced as “clock”), to meet the demand. The design of \(\mathrm {CLOC}\) aims at optimizing previous schemes, CCM, EAX, and EAX-prime, in terms of the implementation overhead beyond the blockcipher, the precomputation complexity, and the memory requirement. \(\mathrm {CLOC}\) is sequential and its asymptotic performance (i.e. for long input data) is comparable to CCM, EAX, and EAX-prime. However, \(\mathrm {CLOC}\) has a unique feature in its low overhead computation. \(\mathrm {CLOC}\) works without any precomputation beyond the key scheduling of the blockcipher. Specifically, we do not need any blockcipher calls nor generating a key dependent table. This contributes to the improvement of the performance for short input data. For example, when the input data consists of 1-block nonce, 1-block associated data, and 1-block plaintext, \(\mathrm {CLOC}\) needs 4 blockcipher calls, while we need 5 or 6 calls in CCM, 7 calls (where 3 out of 7 can be precomputed) in EAX, and 5 calls (where 1 out of 5 can be precomputed) in EAX-prime. We focus on provably secure schemes, but for comparison, there are lightweight AE schemes including ALE [15] and Fides [14], where ALE needs 44 AES rounds which amount to 4.4 AES calls (10 out of 44 AES rounds can be precomputed), and Fides needs 33 round function calls, where the round function is similar to that of AES but has larger state. This property of \(\mathrm {CLOC}\) is particularly beneficial for embedded devices since the internal blockcipher is relatively slow due to limited computing power. Moreover, \(\mathrm {CLOC}\) can be implemented using only two state blocks, i.e. the working memory of \(2n\) bits with an \(n\)-bit blockcipher, except those needed for interfacing and blockcipher invocations. 
We are not aware of any provably secure AE mode with on-line capability that works with such a small amount of memory, and this property makes \(\mathrm {CLOC}\) suitable even for small processors.

Important properties of \(\mathrm {CLOC}\) can be summarized as follows.

  1. It is a nonce-based authenticated encryption with associated data (AEAD).

  2. It uses only the encryption of the blockcipher both for encryption and decryption.

  3. It makes \(\lceil |N|/n\rceil +\lceil |A|/n\rceil + 2\lceil |M|/n\rceil \) blockcipher calls for a nonce \(N\), associated data \(A\), and a plaintext \(M\), when \(|A|\ge 1\), where \(|X|\) is the length of \(X\) in bits and \(n\) is the block length in bits of the blockcipher. No precomputation is needed. We note that in \(\mathrm {CLOC}\), \(1\le |N|\le n-1\) holds (hence we always have \(\lceil |N|/n\rceil =1\)), and when \(|A|=0\), it needs \(\lceil |N|/n\rceil +1+ 2\lceil |M|/n\rceil \) blockcipher calls.

  4. It works with two state blocks (i.e. \(2n\) bits).
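The call count in item 3 can be evaluated mechanically. The following is a minimal sketch, not part of the specification: the function name and the default \(n=128\) are assumptions made for illustration, and the sketch restricts itself to non-empty plaintexts.

```python
def cloc_blockcipher_calls(nonce_bits: int, ad_bits: int, msg_bits: int, n: int = 128) -> int:
    """Number of blockcipher calls for one encryption, per the formula in item 3.

    Hypothetical helper for illustration; lengths are given in bits.
    """
    if not 1 <= nonce_bits <= n - 1:
        raise ValueError("CLOC requires 1 <= |N| <= n-1")
    if ad_bits < 0 or msg_bits < 1:
        raise ValueError("this sketch covers |A| >= 0 and |M| >= 1 only")

    def ceil_div(x: int) -> int:
        return (x + n - 1) // n

    nonce_calls = ceil_div(nonce_bits)                    # always 1 under the nonce constraint
    ad_calls = ceil_div(ad_bits) if ad_bits >= 1 else 1   # empty A still costs one call
    return nonce_calls + ad_calls + 2 * ceil_div(msg_bits)
```

With a 96-bit nonce, one block of associated data, and one block of plaintext, the function returns 4, matching the count quoted above for the one-block case.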

We introduce various design techniques in order to achieve the above mentioned design goals. We introduce tweak functions which are used to update the internal state at several points in the encryption and the decryption. While bit-wise operations, such as a constant multiplication over \(\mathrm {GF}(2^n)\), are often employed in majority of previous schemes, considering the performance for small devices, we completely eliminate bit-wise operations. Instead, our tweak functions consist of word-wise permutations and xor’s. As a result, each tweak function can be described by using a \(4\times 4\) binary matrix.

The use of word-wise permutations and xor’s to update a mask or a key dependent constant was discussed in [22, 29], and the approach was applied to CMAC and EAX. Here we use them directly to update the internal state, instead of updating a key dependent constant and xoring it to the state. This was employed for example in designs of MACs [32, 40] using bit shift operations. The techniques introduced here seem to be useful in other areas, e.g., in designing MACs, and thus may be of independent interest.

We also introduce bit-fixing functions. CFB mode leaks input and output pairs of the underlying blockcipher, which may result in the loss of security. We use the functions to logically separate the encryption part and the authentication part of \(\mathrm {CLOC}\).

With these techniques, we prove \(\mathrm {CLOC}\) secure, in a reduction-based provable security paradigm, under the assumption that the blockcipher is a pseudorandom permutation. For security notions, \(\mathrm {CLOC}\) fulfills the standard security notions for nonce-based AEADs, i.e., the privacy and the authenticity under nonce-respecting adversaries [34]. Furthermore, we prove that the authenticity notion holds even for nonce-reusing adversaries, a goal that only a small number of schemes achieve and that most known modes fail to provide [18]. See Table 1 for a brief comparison of \(\mathrm {CLOC}\) to other AEADs.

Table 1. Comparison of AE modes, for \(a\)-block associated data and \(m\)-block plaintext with one-block nonce, where \(a\ge 1\)

2 Preliminaries

Let \(\{0,1\}^{*}\) be the set of all finite bit strings, including the empty string \(\varepsilon \). For an integer \(\ell \ge 0\), let \(\{0,1\}^{\ell }\) be the set of all bit strings of \(\ell \) bits. For \(X,Y\in \{0,1\}^{*}\), we write \(X\,\Vert \,Y\), \((X,Y)\), or simply \(XY\) to denote their concatenation. For \(\ell \ge 0\), we write \(0^{\ell }\in \{0,1\}^{\ell }\) to denote the bit string that consists of \(\ell \) zeros, and \(1^{\ell }\in \{0,1\}^{\ell }\) to denote the bit string that consists of \(\ell \) ones. For \(X\in \{0,1\}^{*}\), \(|X|\) is its length in bits, and for \(\ell \ge 1\), \(|X|_{\ell }=\lceil |X|/\ell \rceil \) is the length in \(\ell \)-bit blocks. For \(X\in \{0,1\}^{*}\) and \(\ell \ge 0\) such that \(|X|\ge \ell \), \(\mathsf {msb}_{\ell }(X)\) is the most significant (the leftmost) \(\ell \) bits of \(X\). For instance we have \(\mathsf {msb}_{1}(1100)=1\) and \(\mathsf {msb}_{3}(1100)=110\). For \(X\in \{0,1\}^{*}\) and \(\ell \ge 1\), we write its partition into \(\ell \)-bit blocks as \((X[1],\dots ,X[x])\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \ell }}X\), which is defined as follows. If \(X=\varepsilon \), then \(x=1\) and \(X[1]\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \ell }}X\), where \(X[1]=\varepsilon \). Otherwise \(X[1],\ldots ,X[x]\in \{0,1\}^{*}\) are unique bit strings such that \(X[1]\,\Vert \,\cdots \,\Vert \,X[x]=X\), \(|X[1]|=\cdots =|X[x-1]|=\ell \), and \(1\le |X[x]|\le \ell \). For a finite set \(\mathcal {X}\), \(X\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathcal {X}\) means that \(X\) is chosen uniformly at random from \(\mathcal {X}\).
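The notation for \(\mathsf {msb}\) and for the partition into blocks can be illustrated with a small sketch. Representing bit strings as Python strings of the characters `0` and `1` is an assumption made here for readability, and the function names are hypothetical.

```python
def msb(ell: int, x: str) -> str:
    """Return the leftmost ell bits of the bit string x (requires |x| >= ell)."""
    if ell < 0 or len(x) < ell:
        raise ValueError("msb requires 0 <= ell <= |x|")
    return x[:ell]


def partition(x: str, ell: int) -> list:
    """Split x into ell-bit blocks; only the last block may be shorter.

    The empty string yields a single empty block, as in the definition.
    """
    if ell < 1:
        raise ValueError("block length must be at least 1")
    if x == "":
        return [""]
    return [x[i:i + ell] for i in range(0, len(x), ell)]
```

Note that the partition of the empty string has one (empty) block, whereas \(|\varepsilon |_{\ell }=0\); the two notions of length differ only in this case.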

In what follows, we fix a block length \(n\) and a blockcipher \(E:\mathcal {K}_E\times \{0,1\}^n\rightarrow \{0,1\}^n\), where \(\mathcal {K}_E\) is a non-empty set of keys. Let \(\mathrm {Perm}(n)\) be the set of all permutations over \(\{0,1\}^n\). We write \(E_K\in \mathrm {Perm}(n)\) for the permutation specified by \(K\in \mathcal {K}_E\), and \(C=E_K(M)\) for the ciphertext of plaintext \(M\in \{0,1\}^n\) under key \(K\in \mathcal {K}_E\).

3 Specification of \(\mathrm {CLOC}\)

\(\mathrm {CLOC}\) takes three parameters, a blockcipher \(E:\mathcal {K}_E\times \{0,1\}^n\rightarrow \{0,1\}^n\), a nonce length \(\ell _N\), and a tag length \(\tau \). We require \(1\le \ell _N\le n-1\) and \(1\le \tau \le n\). We also require that \(n/4\) is an integer. We write \(\mathrm {CLOC}[E,\ell _N,\tau ]\) for \(\mathrm {CLOC}\) that is parameterized by \(E\), \(\ell _N\), and \(\tau \), and we often omit the parameters if they are irrelevant or they are clear from the context. \(\mathrm {CLOC}[E,\ell _N,\tau ]=(\mathrm {CLOC}\mathrm {\text {-}}\mathcal {E},\mathrm {CLOC}\mathrm {\text {-}}\mathcal {D})\) consists of the encryption algorithm \(\mathrm {CLOC}\mathrm {\text {-}}\mathcal {E}\) and the decryption algorithm \(\mathrm {CLOC}\mathrm {\text {-}}\mathcal {D}\).
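The parameter constraints above are easy to check programmatically. The sketch below is illustrative only; the function name is hypothetical, and lengths are in bits.

```python
def check_cloc_params(n: int, nonce_len: int, tag_len: int) -> int:
    """Validate (n, l_N, tau) against the stated constraints.

    Returns the word length n/4 in bits, which is the unit on which the
    tweak functions operate.
    """
    if n <= 0 or n % 4 != 0:
        raise ValueError("n must be a positive multiple of 4")
    if not 1 <= nonce_len <= n - 1:
        raise ValueError("nonce length must satisfy 1 <= l_N <= n-1")
    if not 1 <= tag_len <= n:
        raise ValueError("tag length must satisfy 1 <= tau <= n")
    return n // 4
```

For instance, \(n=128\), \(\ell _N=96\), and \(\tau =64\) (values chosen here purely as an example) pass the check and give 32-bit words.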

\(\mathrm {CLOC}\mathrm {\text {-}}\mathcal {E}\) and \(\mathrm {CLOC}\mathrm {\text {-}}\mathcal {D}\) have the following syntax.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathrm {CLOC}\mathrm {\text {-}}\mathcal {E}: \mathcal {K}_{\mathrm {CLOC}}\times \mathcal {N}_{\mathrm {CLOC}}\times \mathcal {A}_{\mathrm {CLOC}}\times \mathcal {M}_{\mathrm {CLOC}}\rightarrow \mathcal {CT}_{\mathrm {CLOC}}\\ \mathrm {CLOC}\mathrm {\text {-}}\mathcal {D}: \mathcal {K}_{\mathrm {CLOC}}\times \mathcal {N}_{\mathrm {CLOC}}\times \mathcal {A}_{\mathrm {CLOC}}\times \mathcal {CT}_{\mathrm {CLOC}}\rightarrow \mathcal {M}_{\mathrm {CLOC}}\cup \{\bot \} \end{array}\right. } \end{aligned}$$

\(\mathcal {K}_{\mathrm {CLOC}}=\mathcal {K}_E\) is the key space, which is identical to the key space of the underlying blockcipher, \(\mathcal {N}_{\mathrm {CLOC}}=\{0,1\}^{\ell _N}\) is the nonce space, \(\mathcal {A}_{\mathrm {CLOC}}=\{0,1\}^{*}\) is the associated data space, \(\mathcal {M}_{\mathrm {CLOC}}=\{0,1\}^{*}\) is the plaintext space, \(\mathcal {CT}_{\mathrm {CLOC}}=\mathcal {C}_{\mathrm {CLOC}}\times \mathcal {T}_{\mathrm {CLOC}}\) is the ciphertext space, where \(\mathcal {C}_{\mathrm {CLOC}}=\{0,1\}^{*}\) and \(\mathcal {T}_{\mathrm {CLOC}}=\{0,1\}^{\tau }\) is the tag space, and \(\bot \not \in \mathcal {M}_{\mathrm {CLOC}}\) is the distinguished reject symbol. We write \((C,T)\leftarrow \mathrm {CLOC}\mathrm {\text {-}}\mathcal {E}_K(N,A,M)\) and \(M\leftarrow \mathrm {CLOC}\mathrm {\text {-}}\mathcal {D}_K(N,A,C,T)\) or \(\bot \leftarrow \mathrm {CLOC}\mathrm {\text {-}}\mathcal {D}_K(N,A,C,T)\), where \((C,T)\in \mathcal {CT}_{\mathrm {CLOC}}\) is a ciphertext, and we also call \(C\in \mathcal {C}_{\mathrm {CLOC}}\) a ciphertext.

\(\mathrm {CLOC}\mathrm {\text {-}}\mathcal {E}\) and \(\mathrm {CLOC}\mathrm {\text {-}}\mathcal {D}\) are defined in Fig. 1. In these algorithms, we use four subroutines, \(\mathsf {HASH}\), \(\mathsf {PRF}\), \(\mathsf {ENC}\), and \(\mathsf {DEC}\). They have the following syntax.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathsf {HASH}: \mathcal {K}_{\mathrm {CLOC}}\times \mathcal {N}_{\mathrm {CLOC}}\times \mathcal {A}_{\mathrm {CLOC}}\rightarrow \{0,1\}^n\\ \mathsf {PRF}: \mathcal {K}_{\mathrm {CLOC}}\times \{0,1\}^n\times \mathcal {C}_{\mathrm {CLOC}}\rightarrow \mathcal {T}_{\mathrm {CLOC}}\\ \mathsf {ENC}: \mathcal {K}_{\mathrm {CLOC}}\times \{0,1\}^n\times \mathcal {M}_{\mathrm {CLOC}}\rightarrow \mathcal {C}_{\mathrm {CLOC}}\\ \mathsf {DEC}: \mathcal {K}_{\mathrm {CLOC}}\times \{0,1\}^n\times \mathcal {C}_{\mathrm {CLOC}}\rightarrow \mathcal {M}_{\mathrm {CLOC}} \end{array}\right. } \end{aligned}$$

These subroutines are defined in Fig. 2, and illustrated in Figs. 3, 4, and 5. In the figures, \(\mathsf {i}\) is the identity function, and \(\mathsf {i}(X)=X\) for all \(X\in \{0,1\}^n\). In the subroutines, we use the one-zero padding function \(\mathsf {ozp}:\{0,1\}^{*}\rightarrow \{0,1\}^{*}\), the bit-fixing functions \(\mathsf {fix0}, \mathsf {fix1}:\{0,1\}^{*}\rightarrow \{0,1\}^{*}\), and five tweak functions \(\mathsf {f}_1\), \(\mathsf {f}_2\), \(\mathsf {g}_1\), \(\mathsf {g}_2\), and \(\mathsf {h}\), which are functions over \(\{0,1\}^n\).

The one-zero padding function \(\mathsf {ozp}\) is used to adjust the length of an input string so that the total length becomes a positive multiple of \(n\) bits. For \(X\in \{0,1\}^{*}\), \(\mathsf {ozp}(X)\) is defined as \(\mathsf {ozp}(X)= X\) if \(|X|=\ell n\) for some \(\ell \ge 1\), and \(\mathsf {ozp}(X)=X\,\Vert \,10^{n-1-(|X|\mod n)}\) otherwise. We note that \(\mathsf {ozp}(\varepsilon )=10^{n-1}\), and we also note that, in general, the function is not invertible.
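As a small illustration of the padding rule, the following sketch implements \(\mathsf {ozp}\) on bit strings written as Python strings of `0` and `1` (a representation assumed here for readability; the explicit parameter `n` is also an assumption of the sketch).

```python
def ozp(x: str, n: int) -> str:
    """One-zero padding to a positive multiple of n bits."""
    if len(x) > 0 and len(x) % n == 0:
        return x  # already a positive multiple of n: unchanged
    # Append a single 1 followed by as many zeros as needed to reach a block boundary.
    return x + "1" + "0" * (n - 1 - (len(x) % n))
```

With \(n=8\), both `ozp("1011", 8)` and `ozp("10111000", 8)` evaluate to `"10111000"`, which is a concrete instance of the non-invertibility noted above.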

The bit-fixing functions \(\mathsf {fix0}\) and \(\mathsf {fix1}\) are used to fix the most significant bit of an input string to zero and one, respectively. For \(X\in \{0,1\}^{*}\), \(\mathsf {fix0}(X)\) is defined as \(\mathsf {fix0}(X)=X\wedge 01^{|X|-1}\), and \(\mathsf {fix1}(X)\) is defined as \(\mathsf {fix1}(X)=X\vee 10^{|X|-1}\), where \(\wedge \) and \(\vee \) are the bit-wise AND operation, and the bit-wise OR operation, respectively.
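On byte-oriented platforms the two bit-fixing functions reduce to masking the first byte. The sketch below assumes byte-aligned, non-empty inputs whose most significant bit is the top bit of the first byte; these are assumptions of the example rather than part of the definition.

```python
def fix0(x: bytes) -> bytes:
    """Force the most significant bit of x to zero (x AND 01...1)."""
    if not x:
        raise ValueError("input must be non-empty")
    return bytes([x[0] & 0x7F]) + x[1:]


def fix1(x: bytes) -> bytes:
    """Force the most significant bit of x to one (x OR 10...0)."""
    if not x:
        raise ValueError("input must be non-empty")
    return bytes([x[0] | 0x80]) + x[1:]
```

Since outputs of `fix0` always start with a 0 bit and outputs of `fix1` always start with a 1 bit, the two output sets are disjoint.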

The tweak function \(\mathsf {h}\) is used in \(\mathsf {HASH}\) if the most significant bit of \(\mathsf {ozp}(A[1])\) is zero. We use \(\mathsf {f}_1\) and \(\mathsf {f}_2\) in \(\mathsf {HASH}\) and \(\mathsf {PRF}\), where \(\mathsf {f}_1\) is used if the last input block is full (i.e., if \(|A[a]|=n\) or \(|C[m]|=n\)) and \(\mathsf {f}_2\) is used otherwise. We use \(\mathsf {g}_1\) and \(\mathsf {g}_2\) in \(\mathsf {PRF}\), where we use \(\mathsf {g}_1\) if the second argument of the input is the empty string (i.e., \(|C|=0\)), and otherwise we use \(\mathsf {g}_2\). Now for \(X\in \{0,1\}^n\), let \((X[1],X[2],X[3],X[4])\mathop {\leftarrow }\limits ^{{\scriptscriptstyle n/4}} X\). Then \(\mathsf {f}_1\), \(\mathsf {f}_2\), \(\mathsf {g}_1\), \(\mathsf {g}_2\), and \(\mathsf {h}\) are defined as follows.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathsf {f}_1(X)=(X[1,3],X[2,4],X[1,2,3],X[2,3,4])\\ \mathsf {f}_2(X)=(X[2],X[3],X[4],X[1,2])\\ \mathsf {g}_1(X)=(X[3],X[4],X[1,2],X[2,3])\\ \mathsf {g}_2(X)=(X[2],X[3],X[4],X[1,2])\\ \mathsf {h}(X)=(X[1,2],X[2,3],X[3,4],X[1,2,4]) \end{array}\right. } \end{aligned}$$

Here \(X[a,b]\) stands for \(X[a]\oplus X[b]\) and \(X[a,b,c]\) stands for \(X[a]\oplus X[b]\oplus X[c]\).
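The word-wise definitions translate directly into code. The following is a minimal sketch operating on byte strings, which assumes that \(n/4\) is a multiple of 8 so that each word is a whole number of bytes; the helper names `_words` and `_xor` are hypothetical.

```python
def _words(x: bytes):
    """Split a block into its four words X[1], X[2], X[3], X[4]."""
    if len(x) == 0 or len(x) % 4 != 0:
        raise ValueError("block length must be a positive multiple of 4 bytes")
    w = len(x) // 4
    return x[:w], x[w:2 * w], x[2 * w:3 * w], x[3 * w:]


def _xor(*words: bytes) -> bytes:
    """Xor any number of equal-length words."""
    out = bytearray(len(words[0]))
    for word in words:
        for i, b in enumerate(word):
            out[i] ^= b
    return bytes(out)


def f1(x: bytes) -> bytes:
    x1, x2, x3, x4 = _words(x)
    return _xor(x1, x3) + _xor(x2, x4) + _xor(x1, x2, x3) + _xor(x2, x3, x4)


def f2(x: bytes) -> bytes:
    x1, x2, x3, x4 = _words(x)
    return x2 + x3 + x4 + _xor(x1, x2)


def g1(x: bytes) -> bytes:
    x1, x2, x3, x4 = _words(x)
    return x3 + x4 + _xor(x1, x2) + _xor(x2, x3)


def g2(x: bytes) -> bytes:
    return f2(x)  # g2 and f2 have the same definition


def h(x: bytes) -> bytes:
    x1, x2, x3, x4 = _words(x)
    return _xor(x1, x2) + _xor(x2, x3) + _xor(x3, x4) + _xor(x1, x2, x4)
```

Each function costs only word moves and a few word xors, with no bit shifts inside a word.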

Alternatively the tweak functions can be specified by a matrix. Let

$$\begin{aligned} \mathbf {M}= \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \end{aligned} \tag{1}$$

be a \(4\times 4\) binary matrix, and let \(\mathbf {M}^i\) for \(i\ge 0\) be exponentiations of \(\mathbf {M}\), where \(\mathbf {M}^0\) denotes the identity matrix. Then we have \(\mathsf {f}_1(X) = X\cdot \mathbf {M}^8\), \(\mathsf {f}_2(X) = X\cdot \mathbf {M}\), \(\mathsf {g}_1(X) = X\cdot \mathbf {M}^2\), \(\mathsf {g}_2(X) = X\cdot \mathbf {M}\), and \(\mathsf {h}(X) = X\cdot \mathbf {M}^4\), where \(X=(X[1],X[2],X[3],X[4])\) is interpreted as a vector.
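The correspondence between the matrix powers and the word-wise definitions can be checked by computing \(\mathbf {M}^i\) over \(\mathrm {GF}(2)\) and reading off each column. The sketch below does this with plain lists; the helper names are hypothetical.

```python
# The matrix M of Eq. (1). X is a row vector of four words, so column j of a
# matrix tells which input words are xored together to form output word j.
M = [
    [0, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]
IDENTITY = [[int(i == j) for j in range(4)] for i in range(4)]


def mat_mul(a, b):
    """Multiply two 4x4 matrices over GF(2)."""
    return [[sum(a[i][k] & b[k][j] for k in range(4)) % 2 for j in range(4)]
            for i in range(4)]


def mat_pow(m, e):
    """Compute m**e over GF(2) by repeated multiplication (e >= 0)."""
    result = IDENTITY
    for _ in range(e):
        result = mat_mul(result, m)
    return result


def word_formula(m):
    """For each output word, list the 1-based input word indices xored into it."""
    return tuple(tuple(i + 1 for i in range(4) if m[i][j]) for j in range(4))


def order(m):
    """Smallest e >= 1 with m**e equal to the identity (assumes one exists)."""
    e, acc = 1, m
    while acc != IDENTITY:
        acc, e = mat_mul(acc, m), e + 1
    return e
```

For example, `word_formula(mat_pow(M, 4))` gives `((1, 2), (2, 3), (3, 4), (1, 2, 4))`, which is the definition of \(\mathsf {h}\).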

The design rationale for the tweak functions is explained in Sect. 4.

Fig. 1. Pseudocode of the encryption and the decryption algorithms of \(\mathrm {CLOC}\)

4 Design Rationale

Overall Structure. At an abstract level \(\mathrm {CLOC}\) is a straightforward combination of CFB mode and CBC MAC, where CBC MAC is called twice for processing associated data and a ciphertext, and CFB mode is called once to generate a ciphertext. However, when we want to achieve low-overhead computation and small memory consumption, we found that any other combination of a basic encryption mode and a MAC mode did not work. For instance, we could not use CTR mode or OFB mode, as they require one state block in processing a plaintext to hold a counter value or a blockcipher output. We then realized that combining CFB mode and CBC MAC was not an easy task. Since we avoid using two keys or using blockcipher pre-calls, such as \(L=E_K(0^n)\) used in EAX, we could not computationally separate CFB mode and CBC MAC via input masking, such as Galois-field doubling (\(2^iL\) for the \(i\)-th block, where \(2L\) denotes the multiplication of \(2\) and \(L\) in \(\mathrm {GF}(2^n)\)) [13, 33]. This implies that CFB mode leaks input and output pairs of the blockcipher calls, which can be freely used to guess or fake the internal chaining value of CBC MAC, leading to a break of the scheme. Lucks [28] proposed an AEAD scheme based on CFB mode, called CCFB. However, the problem is not relevant to CCFB due to the difference in the global structure. To overcome this obstacle in composition, we introduced the bit-fixing functions. Their role is to completely separate the input blocks of the blockcipher in CFB mode and the first input block of CBC MAC. This forces the most significant bit of the input of CBC MAC to be fixed to \(0\), implying a one-bit input loss. The set of five tweak functions, which is another tool we introduced in this paper, is used to compensate for this information loss. It also works to compensate for the information loss caused by padding functions applied to the last input block to CBC MAC. A similar technique can be found in the literature [32, 40]; however, the previous works considered only MACs, and the tweak functions required bit operations.

Fig. 2. Subroutines used in the encryption and decryption algorithms of \(\mathrm {CLOC}\)

Fig. 3. \(V\leftarrow \mathsf {HASH}_K(N,A)\) for \(0\le |A|\le n\) (left) and \(|A|\ge n+1\) (right)

Fig. 4. \(C\leftarrow \mathsf {ENC}_K(V,M)\) for \(|M|\ge 1\) (left), and \(\mathsf {DEC}_K(V,C)\) for \(|C|\ge 1\) (right)

Fig. 5. \(T\leftarrow \mathsf {PRF}_K(V,C)\) for \(|C|=0\) (left) and \(|C|\ge 1\) (right)

In the following we explain the specific requirements for the tweak functions.

Definition of \(\mathsf {f}_1, \mathsf {f}_2, \mathsf {g}_1, \mathsf {g}_2\), and \(\mathsf {h}\). These functions are defined to meet the following properties. First, they have the additive property. That is, for any \(\mathsf {z}\in \{\mathsf {f}_1, \mathsf {f}_2, \mathsf {g}_1, \mathsf {g}_2, \mathsf {h}\}\), we have \(\mathsf {z}(X\oplus X')=\mathsf {z}(X)\oplus \mathsf {z}(X')\) for all \(X,X'\in \{0,1\}^n\). Next, these functions are invertible over \(\{0,1\}^n\). For any \(\mathsf {z}\in \{\mathsf {f}_1, \mathsf {f}_2, \mathsf {g}_1, \mathsf {g}_2, \mathsf {h}\}\), we have \(\mathsf {z}\in \mathrm {Perm}(n)\). Finally, they satisfy the differential probability constraints specified in Fig. 6. Let \(\mathsf {z}\) be a function in Fig. 6. Then we require that, for any \(Y\in \{0,1\}^n\), \(\Pr [\mathsf {z}(K)=Y]=1/2^n\), where the probability is taken over \(K\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\{0,1\}^n\). When \(\mathsf {z}\) is of the form \(\mathsf {z}=\mathsf {z}'\oplus \mathsf {z}''\), then \(\mathsf {z}(K)\) stands for \(\mathsf {z}'(K)\oplus \mathsf {z}''(K)\). When \(\mathsf {z}\) is of the form \(\mathsf {z}=\mathsf {z}'\mathsf {z}''\), then \(\mathsf {z}(K)\) stands for \(\mathsf {z}'(\mathsf {z}''(K))\). Recall that we define \(\mathsf {i}\) as \(\mathsf {i}(K)=K\).
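For a function given by a \(4\times 4\) binary matrix acting word-wise, the requirement \(\Pr [\mathsf {z}(K)=Y]=1/2^n\) for every \(Y\) holds exactly when the matrix is invertible over \(\mathrm {GF}(2)\), because the map acts independently on each bit position of the four words. The sketch below checks this by brute force in the toy case of one-bit words (\(n=4\)). The specific combinations listed in Fig. 6 are not reproduced here; the example tests sums of the form \(\mathbf {M}^i\oplus \mathbf {M}^j\) purely as an illustration, and all helper names are hypothetical.

```python
from itertools import product

M = [[0, 0, 0, 1], [1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0]]
IDENTITY = [[int(i == j) for j in range(4)] for i in range(4)]


def mat_mul(a, b):
    """Multiply two 4x4 matrices over GF(2)."""
    return [[sum(a[i][k] & b[k][j] for k in range(4)) % 2 for j in range(4)]
            for i in range(4)]


def mat_pow(m, e):
    """Compute m**e over GF(2) by repeated multiplication (e >= 0)."""
    result = IDENTITY
    for _ in range(e):
        result = mat_mul(result, m)
    return result


def mat_add(a, b):
    """Entry-wise xor, i.e. the matrix of the function z' xor z''."""
    return [[a[i][j] ^ b[i][j] for j in range(4)] for i in range(4)]


def apply(x, m):
    """Row vector x (four one-bit words) times matrix m over GF(2)."""
    return tuple(sum(x[i] & m[i][j] for i in range(4)) % 2 for j in range(4))


def is_uniform(m):
    """True if every output Y is hit by exactly one of the 16 inputs K."""
    outputs = {apply(x, m) for x in product((0, 1), repeat=4)}
    return len(outputs) == 16
```

In this toy setting `is_uniform(mat_add(mat_pow(M, i), mat_pow(M, j)))` holds for all \(0\le i<j\le 14\), while the all-zero matrix `mat_add(M, M)` fails the check.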

Fig. 6. Differential probability constraints of \(\mathsf {f}_1, \mathsf {f}_2, \mathsf {g}_1, \mathsf {g}_2\), and \(\mathsf {h}\)

Choosing Tweak Functions. Finding simple and word-wise tweak functions fulfilling all properties is not a trivial task. We start with matrix \(\mathbf {M}\) of (1), which is invertible and has order 15 (i.e. \(\mathbf {M}^{15}=\mathbf {M}^0\)), and test all combinations of the form \((\mathsf {f}_1, \mathsf {f}_2, \mathsf {g}_1, \mathsf {g}_2, \mathsf {h})=(i_1,\dots ,i_5)\in \{1,\dots ,14\}^5\), where \(i_1=2\) means \(\mathsf {f}_1(X)=X\cdot \mathbf {M}^2\), using a computer. There are 864 candidates out of 537,824 fulfilling the differential probability constraints of Fig. 6. The implementation complexity increases as the exponent of \(\mathbf {M}\) grows when we implement the tweak function by iterating \(\mathbf {M}\), which seems suitable for hardware. For software we would directly implement \(\mathbf {M}^i\) using a word-wise permutation and xor, and in this case we observe slightly irregular but similar behavior (e.g. \(\mathbf {M}^1\) needs one xor while \(\mathbf {M}^3\) needs three xor’s). Figure 7 shows \(\mathbf {M}^i\) and the Feistel-like implementations using a word-wise permutation and xor. It shows that, except for \(\mathbf {M}^5\) and \(\mathbf {M}^{10}\), we have a simple implementation using at most four xor’s. Based on these observations, we simply define the cost of computing \(\mathbf {M}^i\) as \(i\) for \(1\le i\le 7\) and \(15-i\) for \(8\le i\le 14\), and define \(f_{\text {cost}}(i_1,\dots ,i_5)\) as

$$\begin{aligned} \left( i_1\times \frac{1}{16} + i_2\times \frac{15}{16}\right) \times 2 + i_4 + i_5\times \frac{1}{2}. \end{aligned}$$

This corresponds to the expected total cost for given \((i_1,\dots ,i_5)\), where associated data and a plaintext are assumed to be non-empty byte strings of random lengths (as we expect the standard use of \(\mathrm {CLOC}\) is AEAD, not MAC), and we also assume that the most significant bit of the associated data is random. Then there remain only two candidates giving the minimum value of \(f_{\text {cost}}\), which are \((i_1,\dots ,i_5)=(8,1,2,1,4)\) and \((8,1,6,1,4)\). As smaller \(i_3\) is better, we choose the former as the sole winner. We also tested other matrices, say the one replacing the fourth column of \(\mathbf {M}\) by the transposition of \((1,0,1,0)\), but no better solution was found.
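The cost function can be evaluated with exact arithmetic. The sketch below makes one interpretive assumption that should be flagged: each \(i_k\) appearing in the formula for \(f_{\text {cost}}\) is read as the cost of computing \(\mathbf {M}^{i_k}\) (that is, \(i_k\) or \(15-i_k\)), not as the raw exponent. The function names are hypothetical, and the search over the constraints of Fig. 6 is not reproduced.

```python
from fractions import Fraction


def power_cost(i: int) -> int:
    """Cost of computing M**i as defined in the text: i for 1..7, 15-i for 8..14."""
    if 1 <= i <= 7:
        return i
    if 8 <= i <= 14:
        return 15 - i
    raise ValueError("exponent must be in 1..14")


def f_cost(indices) -> Fraction:
    """Expected total cost for (i1, ..., i5) = exponents of (f1, f2, g1, g2, h).

    Assumption of this sketch: the formula is applied to the costs of the
    exponents. The exponent i3 (for g1) does not enter the formula.
    """
    c1, c2, _c3, c4, c5 = (power_cost(i) for i in indices)
    return (c1 * Fraction(1, 16) + c2 * Fraction(15, 16)) * 2 + c4 + c5 * Fraction(1, 2)
```

Under this reading the two finalists \((8,1,2,1,4)\) and \((8,1,6,1,4)\) receive the same value, since \(i_3\) does not appear in \(f_{\text {cost}}\), which is consistent with the tie being broken by the smaller \(i_3\).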

We note that \(\mathbf {M}^8=\mathbf {M}^2\oplus \mathbf {M}^0\) and \(\mathbf {M}^4=\mathbf {M}^1\oplus \mathbf {M}^0\) hold, implying that we have \(\mathsf {f}_1(X)=\mathsf {g}_1(X)\oplus X\) and \(\mathsf {h}(X)=\mathsf {f}_2(X)\oplus X=\mathsf {g}_2(X)\oplus X\), which may be useful in some implementations.
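These identities can also be verified directly from the word-wise definitions, as the following short calculation shows (a worked check added for convenience, using only the definitions given in Sect. 3):

$$\begin{aligned} X\cdot (\mathbf {M}\oplus \mathbf {M}^0)&=(X[2],X[3],X[4],X[1,2])\oplus (X[1],X[2],X[3],X[4])\\&=(X[1,2],X[2,3],X[3,4],X[1,2,4])=\mathsf {h}(X),\\ X\cdot (\mathbf {M}^2\oplus \mathbf {M}^0)&=(X[3],X[4],X[1,2],X[2,3])\oplus (X[1],X[2],X[3],X[4])\\&=(X[1,3],X[2,4],X[1,2,3],X[2,3,4])=\mathsf {f}_1(X). \end{aligned}$$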

Fig. 7. Matrix exponentiations for the tweak functions

5 Security of \(\mathrm {CLOC}\)

In this section, we define the security notions of a blockcipher and \(\mathrm {CLOC}\), and present our security theorems.

PRP Notion. We assume that the blockcipher \(E: \mathcal {K}_E\times \{0,1\}^n\rightarrow \{0,1\}^n\) is a pseudo-random permutation, or a PRP [27]. We say that \(P\) is a random permutation if \(P\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathrm {Perm}(n)\), and define

$$\begin{aligned} \mathbf {Adv}_{E}^{\mathrm {prp}}(\mathcal {A}) \mathop {=}\limits ^{\mathrm {def}}\Pr \left[ \mathcal {A}^{E_K(\cdot )}\Rightarrow 1\right] -\Pr \left[ \mathcal {A}^{P(\cdot )}\Rightarrow 1\right] , \end{aligned}$$

where the first probability is taken over \(K\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathcal {K}_E\) and the randomness of \(\mathcal {A}\), and the last is over \(P\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathrm {Perm}(n)\) and \(\mathcal {A}\). We write \(\mathrm {CLOC}[\mathrm {Perm}(n),\ell _N,\tau ]\) for \(\mathrm {CLOC}\) that uses \(P\) as \(E_K\), and the encryption and decryption algorithms are written as \(\mathrm {CLOC}\mathrm {\text {-}}\mathcal {E}_P\) and \(\mathrm {CLOC}\mathrm {\text {-}}\mathcal {D}_P\). We also consider \(\mathrm {CLOC}\) that uses a random function as \(E_K\), which is naturally defined as the invertibility of \(E_K\) is irrelevant in the definition of \(\mathrm {CLOC}\). Let \(\mathrm {Rand}(n)\) be the set of all functions from \(\{0,1\}^n\) to \(\{0,1\}^n\), and we say that \(R\) is a random function if \(R\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathrm {Rand}(n)\). We write \(\mathrm {CLOC}[\mathrm {Rand}(n),\ell _N,\tau ]\) for \(\mathrm {CLOC}\) that uses \(R\) as \(E_K\), and its encryption and decryption algorithms are written as \(\mathrm {CLOC}\mathrm {\text {-}}\mathcal {E}_R\) and \(\mathrm {CLOC}\mathrm {\text {-}}\mathcal {D}_R\).

Privacy Notion. We define the privacy notion for \(\mathrm {CLOC}[E,\ell _N,\tau ]=(\mathrm {CLOC}\mathrm {\text {-}}\mathcal {E},\mathrm {CLOC}\mathrm {\text {-}}\mathcal {D})\). This notion captures the indistinguishability of a nonce-respecting adversary in a chosen plaintext attack setting [34]. We consider an adversary \(\mathcal {A}\) that has access to the \(\mathrm {CLOC}\) encryption oracle, or a random-bits oracle. The encryption oracle takes \((N,A,M)\in \mathcal {N}_{\mathrm {CLOC}}\times \mathcal {A}_{\mathrm {CLOC}}\times \mathcal {M}_{\mathrm {CLOC}}\) as input and returns \((C,T)\leftarrow \mathrm {CLOC}\mathrm {\text {-}}\mathcal {E}_K(N,A,M)\). The random-bits oracle, \( {\$} \)-oracle, takes \((N,A,M)\in \mathcal {N}_{\mathrm {CLOC}}\times \mathcal {A}_{\mathrm {CLOC}}\times \mathcal {M}_{\mathrm {CLOC}}\) as input and returns a random string \((C,T)\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\{0,1\}^{|M|+\tau }\). We define the privacy advantage as

$$\begin{aligned} \mathbf {Adv}_{\mathrm {CLOC}[E,\ell _N,\tau ]}^{\mathrm {priv}}(\mathcal {A}) \mathop {=}\limits ^{\mathrm {def}}\Pr \left[ \mathcal {A}^{\mathrm {CLOC}\mathrm {\text {-}}\mathcal {E}_K(\cdot ,\cdot ,\cdot )}\Rightarrow 1\right] -\Pr \left[ \mathcal {A}^{\$(\cdot ,\cdot ,\cdot )}\Rightarrow 1\right] , \end{aligned}$$

where the first probability is taken over \(K\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathcal {K}_{\mathrm {CLOC}}\) and the randomness of \(\mathcal {A}\), and the last is over the random-bits oracle and \(\mathcal {A}\). We assume that \(\mathcal {A}\) in the privacy game is nonce-respecting, that is, \(\mathcal {A}\) does not make two queries with the same nonce.

Privacy Theorem. Let \(\mathcal {A}\) be an adversary that makes \(q\) queries, and suppose that the queries are \((N_1,A_1,M_1),\ldots ,(N_q,A_q,M_q)\). Then we define the total associated data length as \(a_1+\cdots +a_q\), and the total plaintext length as \(m_1+\cdots +m_q\), where \((A_i[1],\ldots ,A_i[a_i])\mathop {\leftarrow }\limits ^{{\scriptscriptstyle n}} A_i\) and \((M_i[1],\ldots ,M_i[m_i])\mathop {\leftarrow }\limits ^{{\scriptscriptstyle n}} M_i\). We have the following information theoretic result.

Theorem 1

Let \(\mathrm {Perm}(n)\), \(\ell _N\), and \(\tau \) be the parameters of \(\mathrm {CLOC}\). Let \(\mathcal {A}\) be an adversary that makes at most \(q\) queries, where the total associated data length is at most \(\sigma _A\), and the total plaintext length is at most \(\sigma _M\). Then we have \(\mathbf {Adv}_{\mathrm {CLOC}[\mathrm {Perm}(n),\ell _N,\tau ]}^{\mathrm {priv}}(\mathcal {A})\le 5\sigma _{\mathrm {priv}}^2/2^n\), where \(\sigma _{\mathrm {priv}}=q+\sigma _A+2\sigma _M\).
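To get a feel for the bound in Theorem 1, the sketch below computes \(\sigma _{\mathrm {priv}}\) from a list of query lengths and evaluates \(5\sigma _{\mathrm {priv}}^2/2^n\) exactly. The block counts follow the partition defined in Sect. 2, so an empty string counts as one block. The function names and the representation of a query as a pair of bit lengths are assumptions of this example.

```python
from fractions import Fraction


def blocks(bit_len: int, n: int) -> int:
    """Number of blocks in the n-bit partition; the empty string has one block."""
    return max(1, (bit_len + n - 1) // n)


def sigma_priv(queries, n: int) -> int:
    """q + sigma_A + 2*sigma_M for queries given as (|A_i|, |M_i|) in bits."""
    q = len(queries)
    sigma_a = sum(blocks(a, n) for a, _ in queries)
    sigma_m = sum(blocks(m, n) for _, m in queries)
    return q + sigma_a + 2 * sigma_m


def privacy_bound(queries, n: int) -> Fraction:
    """Upper bound of Theorem 1 on the privacy advantage."""
    return Fraction(5 * sigma_priv(queries, n) ** 2, 2 ** n)
```

For three queries with one block of associated data and one block of plaintext each and \(n=128\), we get \(\sigma _{\mathrm {priv}}=3+3+6=12\) and a bound of \(720/2^{128}\). The bound is quadratic in the amount of processed data, i.e. it is of the usual birthday type.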

A proof overview is given in Sect. 6, and a complete proof is presented in [23, Appendix A]. If we use a blockcipher \(E\), which is secure in the sense of the PRP notion, instead of \(\mathrm {Perm}(n)\), then the corresponding complexity theoretic result can be shown by a standard argument. See e.g. [11]. We note that the privacy of \(\mathrm {CLOC}\) is broken if the nonce is reused.

Authenticity Notion. We next define the authenticity notion, which captures the unforgeability of an adversary in a chosen ciphertext attack setting [34]. We consider a strong adversary that can repeat the same nonce multiple times. Let \(\mathcal {A}\) be an adversary that has access to the \(\mathrm {CLOC}\) encryption oracle and the \(\mathrm {CLOC}\) decryption oracle. The encryption oracle is defined as above. The decryption oracle takes \((N,A,C,T)\in \mathcal {N}_{\mathrm {CLOC}}\times \mathcal {A}_{\mathrm {CLOC}}\times \mathcal {C}_{\mathrm {CLOC}}\times \mathcal {T}_{\mathrm {CLOC}}\) as input and returns \(M\leftarrow \mathrm {CLOC}\mathrm {\text {-}}\mathcal {D}_K(N,A,C,T)\) or \(\bot \leftarrow \mathrm {CLOC}\mathrm {\text {-}}\mathcal {D}_K(N,A,C,T)\). The authenticity advantage is defined as

$$\begin{aligned} \mathbf {Adv}_{\mathrm {CLOC}[E,\ell _N,\tau ]}^{\mathrm {auth}}(\mathcal {A}) \mathop {=}\limits ^{\mathrm {def}}\Pr \left[ \mathcal {A}^{\mathrm {CLOC}\mathrm {\text {-}}\mathcal {E}_K(\cdot ,\cdot ,\cdot ),\mathrm {CLOC}\mathrm {\text {-}}\mathcal {D}_K(\cdot ,\cdot ,\cdot ,\cdot )} \ \text {forges}\right] , \end{aligned}$$

where the probability is taken over \(K\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathcal {K}_{\mathrm {CLOC}}\) and the randomness of \(\mathcal {A}\), and the adversary forges if the decryption oracle returns a bit string (other than \(\bot \)) for a query \((N,A,C,T)\), but \((C,T)\) was not previously returned to \(\mathcal {A}\) from the encryption oracle for a query \((N,A,M)\). The adversary \(\mathcal {A}\) in the authenticity game is not necessarily nonce-respecting, and \(\mathcal {A}\) can make two or more queries with the same nonce. Specifically, \(\mathcal {A}\) can repeat using the same nonce for encryption queries, a nonce used for encryption queries can be used for decryption queries and vice-versa, and the same nonce can be repeated for decryption queries. Without loss of generality, we assume that \(\mathcal {A}\) does not make trivial queries, i.e., if the encryption oracle returns \((C,T)\) for a query \((N,A,M)\), then \(\mathcal {A}\) does not make a query \((N,A,C,T)\) to the decryption oracle, and \(\mathcal {A}\) does not repeat a query.

Authenticity Theorem. Let \(\mathcal {A}\) be an adversary that makes \(q\) encryption queries and \(q'\) decryption queries. Let \((N_1,A_1,M_1),\ldots ,(N_q,A_q,M_q)\) be the encryption queries, and \((N'_1,A'_1,C'_1,T'_1),\ldots ,(N'_{q'},A'_{q'},C'_{q'},T'_{q'})\) be the decryption queries. Then we define the total associated data length in encryption queries as \(a_1+\cdots +a_q\), the total plaintext length as \(m_1+\cdots +m_q\), the total associated data length in decryption queries as \(a'_1+\cdots +a'_{q'}\), and the total ciphertext length as \(m'_1+\cdots +m'_{q'}\), where \((A_i[1],\ldots ,A_i[a_i])\mathop {\leftarrow }\limits ^{{\scriptscriptstyle n}} A_i\), \((M_i[1],\ldots ,M_i[m_i])\mathop {\leftarrow }\limits ^{{\scriptscriptstyle n}} M_i\), \((A'_i[1],\ldots ,A'_i[a'_i])\mathop {\leftarrow }\limits ^{{\scriptscriptstyle n}} A'_i\), and \((C'_i[1],\ldots ,C'_i[m'_i])\mathop {\leftarrow }\limits ^{{\scriptscriptstyle n}} C'_i\). We have the following information theoretic result.

Theorem 2

Let \(\mathrm {Perm}(n)\), \(\ell _N\), and \(\tau \) be the parameters of \(\mathrm {CLOC}\). Let \(\mathcal {A}\) be an adversary that makes at most \(q\) encryption queries and at most \(q'\) decryption queries, where the total associated data length in encryption queries is at most \(\sigma _A\), the total plaintext length is at most \(\sigma _M\), the total associated data length in decryption queries is at most \(\sigma _{A'}\), and the total ciphertext length is at most \(\sigma _{C'}\). Then we have \(\mathbf {Adv}_{\mathrm {CLOC}[\mathrm {Perm}(n),\ell _N,\tau ]}^{\mathrm {auth}}(\mathcal {A})\le 5\sigma _{\mathrm {auth}}^2/2^n+q'/2^{\tau }\), where \(\sigma _{\mathrm {auth}}=q+\sigma _A+2\sigma _M+q'+\sigma _{A'}+\sigma _{C'}\).
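As a rough numerical illustration of Theorem 2, the bound can be evaluated for a concrete workload. The sketch below is not part of the scheme: the helper name `cloc_auth_bound` and the sample query counts are assumptions chosen for illustration, and all lengths are counted in \(n\)-bit blocks as in the theorem.

```python
from fractions import Fraction

def cloc_auth_bound(n, tau, q, q_dec, sigma_a, sigma_m, sigma_a_dec, sigma_c_dec):
    """Right-hand side of Theorem 2: 5*sigma_auth^2 / 2^n + q' / 2^tau.

    Hypothetical helper; exact rationals avoid floating-point underflow.
    """
    sigma_auth = q + sigma_a + 2 * sigma_m + q_dec + sigma_a_dec + sigma_c_dec
    return Fraction(5 * sigma_auth ** 2, 2 ** n) + Fraction(q_dec, 2 ** tau)

# Assumed workload (not from the paper): n = 128, tau = 64, 2^20 queries of
# each kind, one block of associated data and two blocks of text per query.
bound = cloc_auth_bound(
    n=128, tau=64,
    q=2 ** 20, q_dec=2 ** 20,
    sigma_a=2 ** 20, sigma_m=2 ** 21,
    sigma_a_dec=2 ** 20, sigma_c_dec=2 ** 21,
)
print(float(bound))
```

For this assumed workload the tag-guessing term \(q'/2^{\tau }\) dominates: it equals \(2^{-44}\), whereas the birthday-type term \(5\sigma _{\mathrm {auth}}^2/2^n\) equals \(500/2^{88}\).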

A proof overview is given in Sect. 6, and a complete proof is presented in [23, Appendix A]. As in the privacy case, if we use a blockcipher \(E\) secure in the sense of the PRP notion, then we obtain the corresponding complexity theoretic result by a standard argument in, e.g., [11].

6 Overview of Security Proofs

PRP/PRF Switching. The first step is to replace the random permutation \(P\) in \(\mathrm {CLOC}[\mathrm {Perm}(n),\ell _N,\tau ]\) with a random function \(R\), and use the PRP/PRF switching lemma [12] to obtain the following differences.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathbf {Adv}_{\mathrm {CLOC}[\mathrm {Perm}(n),\ell _N,\tau ]}^{\mathrm {priv}}(\mathcal {A})-\mathbf {Adv}_{\mathrm {CLOC}[\mathrm {Rand}(n),\ell _N,\tau ]}^{\mathrm {priv}}(\mathcal {A})\\ \mathbf {Adv}_{\mathrm {CLOC}[\mathrm {Perm}(n),\ell _N,\tau ]}^{\mathrm {auth}}(\mathcal {A})-\mathbf {Adv}_{\mathrm {CLOC}[\mathrm {Rand}(n),\ell _N,\tau ]}^{\mathrm {auth}}(\mathcal {A}) \end{array}\right. } \end{aligned}$$

Defining \(Q_1,\ldots ,Q_{26}\) and \(\mathrm {CLOC}\mathrm {2}\). We define twenty six functions \(Q_1,\ldots ,Q_{26}: \{0,1\}^n\rightarrow \{0,1\}^n\) based on \(R\), \(K_1\), \(K_2\), and \(K_3\), where \(K_1, K_2, K_3\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\{0,1\}^n\) are three independent random \(n\)-bit strings. We also define a modified version of \(\mathrm {CLOC}[\mathrm {Rand}(n),\ell _N,\tau ]\) called \(\mathrm {CLOC}\mathrm {2}[\ell _N,\tau ]\), which uses \(Q=(Q_1,\ldots ,Q_{26})\) as oracles. \(Q\) and \(\mathrm {CLOC}\mathrm {2}\) are designed so that \(\mathrm {CLOC}\mathrm {\text {-}}\mathcal {E}_R\) and \(\mathrm {CLOC}\mathrm {2}\mathrm {\text {-}}\mathcal {E}_Q\) are the same algorithms, \(\mathrm {CLOC}\mathrm {\text {-}}\mathcal {D}_R\) and \(\mathrm {CLOC}\mathrm {2}\mathrm {\text {-}}\mathcal {D}_Q\) are the same algorithms (except that \(\mathrm {CLOC}\mathrm {2}\mathrm {\text {-}}\mathcal {D}_Q\) is used for the verification only, and it does not output a plaintext even if the verification succeeds), and \(Q_1,\ldots ,Q_{26}\) are indistinguishable from \(F_1,\ldots ,F_{26}\), which are independent random functions. We then have

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathbf {Adv}_{\mathrm {CLOC}[\mathrm {Rand}(n),\ell _N,\tau ]}^{\mathrm {priv}}(\mathcal {A}) =\mathbf {Adv}_{\mathrm {CLOC}\mathrm {2}[\ell _N,\tau ]}^{\mathrm {priv}}(\mathcal {A}),\\ \mathbf {Adv}_{\mathrm {CLOC}[\mathrm {Rand}(n),\ell _N,\tau ]}^{\mathrm {auth}}(\mathcal {A}) =\mathbf {Adv}_{\mathrm {CLOC}\mathrm {2}[\ell _N,\tau ]}^{\mathrm {auth}}(\mathcal {A}), \end{array}\right. } \end{aligned}$$

and we show the distinguishing probability of \(Q=(Q_1,\ldots ,Q_{26})\) and \(F=(F_1,\ldots ,F_{26})\) in [23, Lemma 1]. However, the indistinguishability does not hold for arbitrary adversaries. We formalize an input-respecting adversary, and our indistinguishability result in [23, Lemma 1] holds only for these adversaries.

The three random strings, \(K_1, K_2\), and \(K_3\), are secret keys from the adversary’s perspective, and we introduce them to show the indistinguishability between \(Q\) and \(F\). For instance, we know that the input \(\mathsf {fix0}(\mathsf {ozp}(A[1]))\) to produce \(S_{\mathsf {H}}[1]\) in \(\mathsf {HASH}_K(N,A)\) (the 2nd line of \(\mathsf {HASH}_K(N,A)\) in Fig. 2) never collides with the input \(\mathsf {fix1}(C[i])\) to produce \(S_{\mathsf {E}}[i+1]\) in \(\mathsf {ENC}_K(V,M)\) (the 8th line of \(\mathsf {ENC}_K(V,M)\) in Fig. 2), and hence we can safely assume that they are independent. Likewise, we show that the collision probability between \(\mathsf {fix0}(\mathsf {ozp}(A[1]))\) and, say, \(S_{\mathsf {H}}[i-1]\oplus A[i]\) in \(\mathsf {HASH}_K(N,A)\) (the 7th line of \(\mathsf {HASH}_K(N,A)\) in Fig. 2) is low, and the three random strings are introduced to help this argument.

Defining \(\mathrm {CLOC}\mathrm {3}\). We define another version of \(\mathrm {CLOC}[\mathrm {Rand}(n),\ell _N,\tau ]\) called \(\mathrm {CLOC}\mathrm {3}[\ell _N,\tau ]\). It uses \(F=(F_1,\ldots ,F_{26})\) as oracles, and the encryption algorithm \(\mathrm {CLOC}\mathrm {3}\mathrm {\text {-}}\mathcal {E}_F\) and the decryption algorithm \(\mathrm {CLOC}\mathrm {3}\mathrm {\text {-}}\mathcal {D}_F\) are obtained from \(\mathrm {CLOC}\mathrm {2}\mathrm {\text {-}}\mathcal {E}_Q\) and \(\mathrm {CLOC}\mathrm {2}\mathrm {\text {-}}\mathcal {D}_Q\) by replacing \(Q_1,\ldots ,Q_{26}\) with \(F_1,\ldots ,F_{26}\), respectively. We use [23, Lemma 1] to obtain the following differences.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathbf {Adv}_{\mathrm {CLOC}\mathrm {2}[\ell _N,\tau ]}^{\mathrm {priv}}(\mathcal {A}) -\mathbf {Adv}_{\mathrm {CLOC}\mathrm {3}[\ell _N,\tau ]}^{\mathrm {priv}}(\mathcal {A})\\ \mathbf {Adv}_{\mathrm {CLOC}\mathrm {2}[\ell _N,\tau ]}^{\mathrm {auth}}(\mathcal {A}) -\mathbf {Adv}_{\mathrm {CLOC}\mathrm {3}[\ell _N,\tau ]}^{\mathrm {auth}}(\mathcal {A}) \end{array}\right. } \end{aligned}$$

The simulations work with input-respecting adversaries, and hence [23, Lemma 1] is sufficient for our purpose.

Indistinguishability of \((\mathsf {HASH3}, \mathsf {HASH3}', \mathsf {HASH3}'')\). We then consider three subroutines \(\mathsf {HASH3}\), \(\mathsf {HASH3}'\), and \(\mathsf {HASH3}''\) in \(\mathrm {CLOC}\mathrm {3}[\ell _N,\tau ]\). \(\mathsf {HASH3}\) roughly corresponds to a function that computes \(S_{\mathsf {E}}[1]\) from \((N,A)\) in \(\mathrm {CLOC}[E,\ell _N,\tau ]\), i.e., \(E_K(\mathsf {HASH}_K(N,A))\). \(\mathsf {HASH3}'\) computes the tag \(T\) when \(|C|=0\), i.e., this function roughly corresponds to \(\mathsf {msb}_{\tau }(E_K(\mathsf {g}_1(\mathsf {HASH}_K(N,A))))\). \(\mathsf {HASH3}''\) computes \(S_{\mathsf {P}}[0]\) from \((N,A)\), which is used when \(|C|\ge 1\), i.e., \(E_K(\mathsf {g}_2(\mathsf {HASH}_K(N,A)))\). Then in [23, Lemma 2], we show that these functions are indistinguishable from three independent random functions \(\mathsf {HASH4}\), \(\mathsf {HASH4}'\), and \(\mathsf {HASH4}''\).

Defining \(\mathrm {CLOC}\mathrm {4}\). We define another version of \(\mathrm {CLOC}[\mathrm {Rand}(n),\ell _N,\tau ]\), called \(\mathrm {CLOC}\mathrm {4}[\ell _N,\tau ]\). This is obtained by replacing \(\mathsf {HASH3}\), \(\mathsf {HASH3}'\), and \(\mathsf {HASH3}''\) in \(\mathrm {CLOC}\mathrm {3}\) with \(\mathsf {HASH4}\), \(\mathsf {HASH4}'\), and \(\mathsf {HASH4}''\), respectively. We use [23, Lemma 2] to obtain the following differences.

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathbf {Adv}_{\mathrm {CLOC}\mathrm {3}[\ell _N,\tau ]}^{\mathrm {priv}}(\mathcal {A}) -\mathbf {Adv}_{\mathrm {CLOC}\mathrm {4}[\ell _N,\tau ]}^{\mathrm {priv}}(\mathcal {A})\\ \mathbf {Adv}_{\mathrm {CLOC}\mathrm {3}[\ell _N,\tau ]}^{\mathrm {auth}}(\mathcal {A}) -\mathbf {Adv}_{\mathrm {CLOC}\mathrm {4}[\ell _N,\tau ]}^{\mathrm {auth}}(\mathcal {A}) \end{array}\right. } \end{aligned}$$

Indistinguishability of \(\mathsf {PRF4}\). We then consider a subroutine called \(\mathsf {PRF4}\) in \(\mathrm {CLOC}\mathrm {4}\). This function outputs a tag \(T\) from \((N,A,C)\), and internally uses \(\mathsf {HASH4}'\), \(\mathsf {HASH4}''\), \(F_{24}\), \(F_{25}\), and \(F_{26}\). We show in [23, Lemma 3] that this function is indistinguishable from a random function \(\mathsf {PRF5}\).

Defining \(\mathrm {CLOC}\mathrm {5}\). We define our final version of \(\mathrm {CLOC}[\mathrm {Rand}(n),\ell _N,\tau ]\), called \(\mathrm {CLOC}\mathrm {5}[\ell _N,\tau ]\), which is obtained from \(\mathrm {CLOC}\mathrm {4}\) by replacing \(\mathsf {PRF4}\) with \(\mathsf {PRF5}\). This function is used in both encryption and decryption, and we obtain the following differences from [23, Lemma 3].

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathbf {Adv}_{\mathrm {CLOC}\mathrm {4}[\ell _N,\tau ]}^{\mathrm {priv}}(\mathcal {A}) -\mathbf {Adv}_{\mathrm {CLOC}\mathrm {5}[\ell _N,\tau ]}^{\mathrm {priv}}(\mathcal {A})\\ \mathbf {Adv}_{\mathrm {CLOC}\mathrm {4}[\ell _N,\tau ]}^{\mathrm {auth}}(\mathcal {A}) -\mathbf {Adv}_{\mathrm {CLOC}\mathrm {5}[\ell _N,\tau ]}^{\mathrm {auth}}(\mathcal {A}) \end{array}\right. } \end{aligned}$$

Privacy and Authenticity of \(\mathrm {CLOC}\mathrm {5}\). Finally, we analyze the privacy and the authenticity of \(\mathrm {CLOC}\mathrm {5}\) in [23, Lemma 4]. The privacy result gives an upper bound on \(\mathbf {Adv}_{\mathrm {CLOC}\mathrm {5}[\ell _N,\tau ]}^{\mathrm {priv}}(\mathcal {A})\), and the proof is reduced to bounding the collision probability among the input values of the random function which is used to encrypt plaintexts. The authenticity result gives an upper bound on \(\mathbf {Adv}_{\mathrm {CLOC}\mathrm {5}[\ell _N,\tau ]}^{\mathrm {auth}}(\mathcal {A})\). Its proof is simple: the adversary, even if the nonce is reused, has to guess the output of a random function \(\mathsf {PRF5}\) for an input that was not queried before.

We finally obtain the proofs of Theorems 1 and 2 by combining the above differences between advantage functions.
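The combination can be read as a telescoping sum. The following is only a sketch in our own shorthand: \(x\) stands for either \(\mathrm {priv}\) or \(\mathrm {auth}\), the parameters \(\ell _N\) and \(\tau \) are omitted from the subscripts, and no concrete bounds are asserted here.

$$\begin{aligned} \mathbf {Adv}_{\mathrm {CLOC}[\mathrm {Perm}(n)]}^{x}(\mathcal {A}) ={}&\bigl (\mathbf {Adv}_{\mathrm {CLOC}[\mathrm {Perm}(n)]}^{x}(\mathcal {A})-\mathbf {Adv}_{\mathrm {CLOC}[\mathrm {Rand}(n)]}^{x}(\mathcal {A})\bigr )\\ &+\bigl (\mathbf {Adv}_{\mathrm {CLOC}\mathrm {2}}^{x}(\mathcal {A})-\mathbf {Adv}_{\mathrm {CLOC}\mathrm {3}}^{x}(\mathcal {A})\bigr ) +\bigl (\mathbf {Adv}_{\mathrm {CLOC}\mathrm {3}}^{x}(\mathcal {A})-\mathbf {Adv}_{\mathrm {CLOC}\mathrm {4}}^{x}(\mathcal {A})\bigr )\\ &+\bigl (\mathbf {Adv}_{\mathrm {CLOC}\mathrm {4}}^{x}(\mathcal {A})-\mathbf {Adv}_{\mathrm {CLOC}\mathrm {5}}^{x}(\mathcal {A})\bigr ) +\mathbf {Adv}_{\mathrm {CLOC}\mathrm {5}}^{x}(\mathcal {A}) \end{aligned}$$

The identity holds because \(\mathbf {Adv}_{\mathrm {CLOC}[\mathrm {Rand}(n)]}^{x}(\mathcal {A})=\mathbf {Adv}_{\mathrm {CLOC}\mathrm {2}}^{x}(\mathcal {A})\), so all intermediate terms cancel. Each bracketed difference is bounded by the corresponding step above, and the last term is bounded by [23, Lemma 4].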

7 Software Implementation

We first tested \(\mathrm {CLOC}\) on a general-purpose CPU. The encryption process and the tag generation can be done in parallel, which could speed up the overall computation by a factor close to 2 for long plaintexts, so that the final speed could be close to that of encryption only in serial mode. To show this, we implemented \(\mathrm {CLOC}\) instantiated with AES-128 using the AES new instruction set, and tested it on an Intel Core i5-3427U 1.80 GHz processor (Ivy Bridge) [6]. It is known that Intel’s AES instruction allows fast parallel processing (up to \(4\) or \(8\) blocks), and we used this technique for two parallel inputs to AES. The measured speed for long plaintexts (more than \(2^{20}\) blocks) is around 4.9 cycles per byte (cpb), while AES-128 encrypts at a speed of 4.3 cpb in serial mode. In Table 2, we provide the test vectors.
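To put the cycles-per-byte figures in perspective, they can be converted into approximate throughput. The following is a back-of-the-envelope sketch, assuming the processor runs at its nominal 1.80 GHz clock throughout (no frequency scaling); the function name is ours and the resulting throughput values are estimates, not measurements.

```python
def throughput_mb_per_s(clock_hz, cycles_per_byte):
    """Convert a cycles-per-byte figure to megabytes per second (1 MB = 10^6 bytes)."""
    return clock_hz / cycles_per_byte / 1e6

CLOCK_HZ = 1.80e9   # nominal clock of the tested CPU (assumed constant)
CLOC_CPB = 4.9      # CLOC with AES-128, long plaintexts (from the text)
AES_CPB = 4.3       # serial AES-128 encryption (from the text)

cloc_mbps = throughput_mb_per_s(CLOCK_HZ, CLOC_CPB)
aes_mbps = throughput_mb_per_s(CLOCK_HZ, AES_CPB)
overhead = CLOC_CPB / AES_CPB - 1   # relative cost of CLOC over serial AES

print(f"CLOC: {cloc_mbps:.0f} MB/s, AES: {aes_mbps:.0f} MB/s, overhead: {overhead:.0%}")
```

Under this assumption, the mode costs roughly 14% more cycles per byte than serial AES-128 encryption alone, even though it makes about two blockcipher calls per plaintext block.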

Table 2. Test vector of \(\mathrm {CLOC}\) instantiated with AES-128
Table 3. Software implementation on ATmega128

We then tested \(\mathrm {CLOC}\) on embedded software. We used an 8-bit microprocessor, Atmel AVR ATmega128 [2]. For comparison we also implemented EAX [13], EAX-prime [31], and OCB3 [26]. For OCB3 we used a byte-oriented code from [7]. OCB3 needs relatively large precomputation for GF doublings, but we modified the code so that the doublings are on-line, since large precomputation may not be suitable to handle short input data for microprocessors. We also considered GCM for comparison; however, recent studies show that GCM does not perform well on constrained devices (see e.g. [10, 38]), hence we decided not to include it. All modes are written in C and combined with AES-128. Our AES code is taken from [3], which is written in assembler. AES runs at 156.7 cpb for encryption, 196.8 cpb for decryption, both without key scheduling, and the key scheduling runs at 1,979 cycles. Our code is compiled with Atmel Studio 6 available from [2]. Cycle counts are measured on the simulator of Atmel Studio 6. Table 3 shows the implementation result. ROM denotes the object size in bytes. The speed is measured based on the scenario of non-static associated data, i.e., we excluded key setup and other computations before processing associated data and a nonce, defined as “Init”, and figures for Data \(b\) denote cycles per byte to process a \(b\)-byte plaintext. In EAX, “Init” includes the computation of \(E_K(0^n)\), \(E_K(0^{n-1}1)\), and \(E_K(0^{n-2}10)\). The length of associated data is fixed to \(16\) bytes except for EAX-prime, and for EAX-prime, we use 32-byte “cleartext,” which can be regarded as the combination of associated data and a nonce [31]. For OCB3 we also measured the decryption performance, whereas those of \(\mathrm {CLOC}\), EAX, and EAX-prime are almost the same as encryption, since \(\mathrm {CLOC}\), EAX, and EAX-prime require only the forward direction of the underlying blockcipher.
The result shows a superior performance of \(\mathrm {CLOC}\) for short input data, up to around \(128\) bytes, which would be sufficiently long for low-power wireless networks, as we mentioned in Sect. 1. We also measured the RAM usage of the AVR implementations, using a public tool [41], based on data of 16 bytes. The measurements show that \(\mathrm {CLOC}\) requires much less RAM than OCB3.
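The per-call cost of the blockcipher explains why precomputation matters for short inputs. As a rough sketch using only the AES figures quoted above, the following estimates the cycles spent in AES alone for the three blockcipher calls in the EAX “Init” step; the helper name is ours, and the estimate is a lower bound that ignores all non-AES overhead.

```python
AES_ENC_CPB = 156.7             # cycles per byte, AES-128 encryption (from the text)
AES_KEY_SCHEDULE_CYCLES = 1979  # cycles for the key schedule (from the text)
BLOCK_BYTES = 16                # AES block size in bytes

def aes_cycles(num_calls):
    """Estimated cycles for num_calls forward AES-128 block encryptions."""
    return num_calls * AES_ENC_CPB * BLOCK_BYTES

one_block = aes_cycles(1)   # one blockcipher call
eax_init = aes_cycles(3)    # E_K(0^n), E_K(0^{n-1}1), E_K(0^{n-2}10)
ratio = eax_init / AES_KEY_SCHEDULE_CYCLES

print(f"one call: {one_block:.0f} cycles, EAX Init (AES only): {eax_init:.0f} cycles")
print(f"EAX Init is about {ratio:.1f}x the key schedule")
```

By this estimate, one AES call costs about 2,507 cycles, so the three calls in the EAX “Init” step cost about 7,522 cycles before any input data is processed, which is comparable to encrypting three further blocks of a short message.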

8 Hardware Implementation

Although the primary focus of \(\mathrm {CLOC}\) is embedded software, we also implemented \(\mathrm {CLOC}\) on hardware to obtain basic performance figures. We used an Altera FPGA, Cyclone IV GX (EP4CGX110DF31C7) [1], and implemented \(\mathrm {CLOC}\) using AES-128. The AES implementation is round-based, and the S-box of AES is based on a composite field [37]. For reference we also wrote EAX for the same device, using the same AES. Both \(\mathrm {CLOC}\) and EAX use one AES core for encryption and authentication. In the EAX implementation, all input masks are stored in registers. Table 4 shows the results. The size is measured by the number of logic elements (LEs). Our implementation is not optimized. Still, these figures show that \(\mathrm {CLOC}\) has slightly smaller size and faster speed than EAX. Table 4 lacks other important modes, in particular OCB. A more comprehensive comparison and optimized implementation for short input data are interesting future topics.

Table 4. Hardware implementation. Throughput figures of \(\mathrm {CLOC}\) and EAX are measured for 8-block plaintexts with one-block associated data.

9 Conclusions

We presented a blockcipher mode of operation called \(\mathrm {CLOC}\) for authenticated encryption with associated data. It uses a variant of CFB mode in its encryption part and a variant of CBC MAC in the authentication part. The scheme efficiently handles short input data without heavy precomputation or large memory, and it is suitable for use in microprocessors. We proved \(\mathrm {CLOC}\) secure, in a reduction-based provable security paradigm, under the assumption that the blockcipher is a pseudorandom permutation. We also presented our preliminary implementation results.

It would be interesting to see improved implementation results using possibly lightweight blockciphers.