Beyond Conventional Security in Sponge-Based Authenticated Encryption Modes

The Sponge function is known to achieve 2^{c/2} security, where c is its capacity. This bound was carried over to its keyed variants, such as SpongeWrap, to achieve a min{2^{c/2}, 2^κ} security bound, with κ the key length. Similarly, many CAESAR competition submissions were designed to comply with the classical 2^{c/2} security bound.
We show that Sponge-based constructions for authenticated encryption can achieve the significantly higher bound of min{2^{b/2}, 2^c, 2^κ}, with b > c the permutation size, by proving that the CAESAR submission NORX achieves this bound. The proof relies on a rigorous computation of multi-collision probabilities, which may be of independent interest. We additionally derive a generic attack based on multi-collisions that matches the bound. We show how to apply the proof to five other Sponge-based CAESAR submissions: Ascon, CBEAM/STRIBOB, ICEPOLE, Keyak, and two out of the three PRIMATEs. A direct application of the result shows that the parameter choices of some of these submissions are overly conservative; simple tweaks render the schemes considerably more efficient without sacrificing security. We finally consider the remaining one of the three PRIMATEs, APE, and derive a blockwise adaptive attack in the nonce-respecting setting with complexity 2^{c/2}, therewith demonstrating that the techniques cannot be applied to APE.

We remark that ICEPOLE v1/v2 consists of three configurations (two with security level 128 and one with security level 256) and Keyak v1 of four configurations (one with an 800-bit state and three with a 1600-bit state). The result for Keyak v1 directly applies to SpongeWrap [19] and DuplexWrap [22], upon which Keyak v1 is built.
Our results imply that the initial submissions of these CAESAR candidates were overly conservative in choosing their parameters, since reducing c would have led to the same bound. For instance, Ascon-128 could take (c, r) = (128, 192) instead of (256, 64), NORX64 (the proposed mode with 256-bit security) could increase its rate by 128 bits, and GIBBON-120 and HANUMAN-120 could increase their rate by a factor of 4, all without affecting their mode security levels.
These observations only concern the mode security, where characteristics of the underlying permutation are set aside. Specifically, the concrete security of the underlying permutations plays a fundamental role in the choice of parameters. For instance, the authors of Ascon [33,34], NORX [7,8], and PRIMATEs [2,3] acknowledge that non-random properties of some of the underlying primitives exist. Furthermore, the authenticity bound degrades as a function of the number of forgery attempts f: min{2^{(r+c)/2}, 2^c/f, 2^κ}. In practical applications, the number of forgery attempts may be limited, but if this is not possible, caution must be taken. We refer to [75] for a discussion.

Tightness of the Result
The earlier version of this article by Jovanovic et al. [53] had a security bound of the form min{2^{(r+c)/2}, 2^c/r, 2^κ}, showing a security loss logarithmic in the rate (a factor r in the 2^c term). This loss was, however, not justified by any existing attack; it arose as an artifact of naively bounding the probability of a multi-collision occurring in the outer state, where multiple evaluations of the underlying primitive map to the same outer value.
In this article, we thoroughly analyze multi-collisions and derive bounds on the size of multi-collisions for various possible choices of r and c. Most importantly, we can conclude that if r ≪ c or r ≫ c, multi-collisions have no effect on the security. If r ≈ c, the security loss approaches 1.4c/(log₂ c − 2), as opposed to the factor r loss from [53]. We refer to Table 2 for a comprehensive description of the bound. Note that for all schemes in Table 1, r ≪ c or r ≫ c. The rigorous analysis of multi-collisions relies on an application of Stirling's approximation and the Lambert W function, and it is not only applicable to Sponge-based modes. For example, quite a few cryptographic schemes have been attacked using multi-collisions, such as block-cipher-based hashing schemes [73], identification schemes [41], the JH hash function [58], the MDC-2 hash function [54], HMAC and ChopMD MAC [68], the LED block cipher [70], iterated Even-Mansour [32], and strengthened HMAC [88]. Multi-collisions have also influenced various security upper bounds. Typical examples are the indifferentiability proof for the ChopMD construction [27], the collision resistance proof for the Lesamnta-LW hash function [46], and the indistinguishability proof for RMAC [52], where the bound is O(2^n/n) due to the existence of n-collisions. The compression function proposed by Hirose et al. [47] has a similar type of bound. Finally, the recent line of research on the keyed Sponge and Duplex constructions [6,18,20,26,31,38,60,69] strongly relies on "multiplicities." Some of these security analyses can be improved using our rigorous analysis of multi-collisions.
For r < c, the old bound of [53] is dominated by 2^{(r+c)/2} and is in fact tight. The new bound improves over the one of [53] for r ≥ c, and in this work we additionally show that the new bound is tight for all possible choices of (r, c). To this end, we present a multi-collision-based adversary that meets the bound proven in our analysis. The attack is described for a generalized Sponge construction that covers CBEAM, ICEPOLE, Keyak v1, NORX, and STRIBOB. Even for variants with an additional XOR of the secret key at the end (Ascon, GIBBON, and HANUMAN, see Fig. 4), a similar adversary with slightly higher complexity can meet the bound. A comparison of the earlier bound of [53], the new bound, and the attack complexity for the case of c = 256 and r ≥ c is given in Fig. 1.

APE
One of the interesting questions triggered by the publication of [53] concerns APE, the third of the PRIMATEs. In more detail, the schemes listed in Table 1 are proven to achieve a beyond 2^{c/2} security level against nonce-respecting adversaries, but they are insecure against nonce-misusing adversaries. In contrast, APE is proven to achieve 2^{c/2} security in the nonce-reuse scenario [4], and it is of interest to investigate what security guarantees APE offers against nonce-respecting adversaries. In this work, we include an analysis of APE in this setting and show that there exists a nonce-respecting blockwise adaptive adversary that can break the privacy with a total complexity of about 2^{c/2}. In other words, while APE is more robust against nonce-misusing adversaries (up to common prefix), in the nonce-respecting setting it achieves a lower security level than the schemes listed in Table 1.

Publication History and Subsequent Work
An extended abstract of this article appeared in the proceedings of ASIACRYPT 2014 [53]. This article is the full version of [53] and additionally includes the proofs that were absent in the proceedings version. New with respect to the full version of [53] are (i) a more rigorous analysis of multi-collisions and the resulting improved security bound (Sect. 3), (ii) the generic attack on Sponge-based authenticated encryption schemes demonstrating tightness of the bound (Sect. 5), and (iii) a proof that, unlike the schemes of Table 1, APE does not achieve beyond 2^{c/2} security in the nonce-respecting setting (Sect. 7).
Parts (i) and (ii) are due to Sasaki and Yasuda [90], with whom we have collaborated to combine their ideas for a complete analysis of the Sponge-based modes.
In response to the observations made in [53], the designers of Ascon and NORX have reconsidered their parameter choices. The new parameter choices are also listed in Table 1 and testify to a significant security gain for Ascon v1.1 [34] without sacrificing efficiency, and a significant efficiency gain for NORX v2 [8] without sacrificing security. The adjustments make the schemes faster and more competitive. Mihajloska et al. [61] recently generalized the analysis of [53] to the CAESAR submission π-Cipher [42,43], which is structurally different from NORX in the way it maintains state: a so-called "common internal state" is used throughout the evaluation.
From a more general perspective, the work has triggered analysis in the direction of high-efficiency full-state keyed Duplexes [31,60,89]. The result of Mennink et al. [60] on the full-state keyed Duplex prompted the designers of Keyak to perform a major revision of their scheme. In more detail, Keyak v2 [23] is built on top of the "Motorist" mode, an alternative to the full-state keyed Duplex that was analyzed by Daemen et al. [31]. We remark that the results on the full-state keyed Sponges and Duplexes are more general than the target design in this work. The most important difference between [31,60] and our work is that we explicitly target nonce-based designs, which allows for beyond 2^{c/2} security. The work has, to a certain extent, furthermore triggered the use of permutations for nonce-reuse secure authenticated encryption schemes [29,44,59] beyond APE.
Parallel to the research on keyed Duplexes is the research on keyed Sponges, i.e., keyed versions of the Sponge that only aim for authenticity. Bertoni et al. [18] introduced the original keyed Sponge. Chang et al. [26] suggested putting the key in the inner part of the Sponge. Andreeva et al. [6] formalized and improved the analysis of the outer- and inner-keyed Sponges. The analysis was generalized to the full-state Sponge in [31,38,60,69], following ideas that date back to the donkeySponge [21]. Beyond authentication (and encryption), keyed versions of the Sponge have found applications in reseedable pseudorandom sequence generation [18,39].

Outline
We present our security model in Sect. 2. In Sect. 3, we perform an in-depth analysis of multi-collisions with respect to Sponges. A security proof for NORX is derived in Sect. 4. Tightness of the bound is proven in Sect. 5. In Sect. 6, we show that the proof of NORX generalizes to other CAESAR submissions, as well as to SpongeWrap and DuplexWrap. We consider the security of APE against nonce-respecting adversaries in Sect. 7. The work is concluded in Sect. 8, where we also discuss possible generalizations to Artemia [1].

Security Model
For n ∈ N, let Perm(n) denote the set of all permutations on n bits. When writing x ←$ X for some finite set X, we mean that x gets sampled uniformly at random from X. For x ∈ {0,1}^n and a, b ≤ n, we denote by [x]^a and [x]_b the a leftmost and b rightmost bits of x, respectively. For tuples (j, k) and (j′, k′) we use lexicographical order: (j′, k′) < (j, k) if and only if j′ < j, or j′ = j and k′ < k. Let Π be an authenticated encryption scheme, with an encryption function E and a decryption function D, where

E_K(N; H, M, T) = (C, A) and D_K(N; H, C, T; A) ∈ {M, ⊥}.

Here, N denotes a nonce value, H a header, M a message, C a ciphertext, T a trailer, and A an authentication tag. The values (H, T) will be referred to as associated data. If verification is successful, the decryption function D_K outputs M, and ⊥ otherwise. The scheme Π is also determined by a set of parameters such as the key size, state size, and block size, but these are left implicit. In addition, we define $ to be an ideal version of E_K, where $ returns (C, A) ←$ {0,1}^{|M|+τ} for every query (N; H, M, T). We follow the convention in analyzing modes of operation for permutations by modeling the underlying permutations as being drawn uniformly at random from Perm(b), where b is a parameter determined by the scheme.
An adversary A is a probabilistic algorithm that has access to one or more oracles O, denoted A^O. By A^O = 1 we denote the event that A, after interacting with O, outputs 1. We consider adversaries A that have unbounded computational power and whose complexity is solely measured by the number of queries made to their oracles. These adversaries have query access to (i) the underlying idealized permutations, (ii) E_K or its counterpart $, and possibly (iii) D_K. The key K is randomly drawn from {0,1}^κ at the beginning of the security experiment. The security definitions below follow [11,37,51,77,80].

Privacy
Let p denote a list of idealized permutations on which Π may depend. We define the advantage of an adversary A in breaking the privacy of Π as

Adv^priv_Π(A) = |Pr(A^{p±, E_K} = 1) − Pr(A^{p±, $} = 1)|,

where the probabilities are taken over the random choices of p, $, K, and A, if any. The fact that the adversary has access to both the forward and inverse permutations in p is denoted by p±. We assume that adversary A is nonce-respecting, which means that it never makes two queries to E_K or $ with the same nonce. By Adv^priv_Π(q_p, q_E, λ_E) we denote the maximum advantage taken over all adversaries that query p± at most q_p times, and that make at most q_E queries, of total length (over all queries) at most λ_E blocks, to E_K or $. We remark that this privacy notion is also known as indistinguishability under chosen plaintext attack (IND-CPA) for an (authenticated) encryption scheme.

Integrity
As above, let p denote the list of underlying idealized permutations of Π. We define the advantage of an adversary A in breaking the integrity of Π as

Adv^auth_Π(A) = Pr(A^{p±, E_K, D_K} forges),

where the probability is taken over the random choices of p, K, and A, if any. We say that "A forges" if D_K ever returns a message other than ⊥ on input of (N; H, C, T; A), where (C, A) has never been output by E_K on input of a query (N; H, M, T) for some M. We assume that adversary A is nonce-respecting, which means that it never makes two queries to E_K with the same nonce. Nevertheless, A is allowed to repeat nonces in decryption queries. By Adv^auth_Π(q_p, q_E, λ_E, q_D, λ_D) we denote the maximum advantage taken over all adversaries that query p± at most q_p times, make at most q_E queries, of total length (over all queries) at most λ_E blocks, to E_K, and at most q_D queries, of total length at most λ_D blocks, to D_K.

Multi-Collisions
Consider the following game of balls and bins. Let R ≥ 1 be the number of bins and σ the number of balls. The σ balls are thrown uniformly at random into the R bins. By multcol(R, σ, ρ) we denote the event of a ρ-collision, namely that there exists a bin containing ρ or more balls after all σ balls are thrown.
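As an illustration, the game can be simulated directly. The following sketch (function names are ours) estimates Pr(multcol(R, σ, ρ)) by Monte Carlo sampling:

```python
import random
from collections import Counter

def multcol_occurs(R, sigma, rho, rng=random):
    """Throw sigma balls uniformly into R bins; return True if some
    bin ends up holding rho or more balls (a rho-collision)."""
    bins = Counter(rng.randrange(R) for _ in range(sigma))
    return max(bins.values()) >= rho

def estimate_multcol_prob(R, sigma, rho, trials=1000, seed=1):
    """Monte Carlo estimate of Pr(multcol(R, sigma, rho))."""
    rng = random.Random(seed)
    hits = sum(multcol_occurs(R, sigma, rho, rng) for _ in range(trials))
    return hits / trials

# With sigma = R (here 2^10 balls into 2^10 bins), a 2-collision is
# essentially guaranteed by the birthday bound, while a 10-collision
# is very unlikely.
p2 = estimate_multcol_prob(R=1024, sigma=1024, rho=2)
p10 = estimate_multcol_prob(R=1024, sigma=1024, rho=10)
```

The contrast between p2 and p10 is exactly the gap that the bounds below quantify.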
A folklore result [67, Theorem 3.1], [64, Lemma 5.1] states the following upper bound on the probability of a ρ-collision for ρ ≥ 2:

Pr(multcol(R, σ, ρ)) ≤ C(σ, ρ) · (1/R)^{ρ−1},  (1)

where R ≥ 1 and σ ≥ ρ. Note that σ can be smaller or larger than R. The bound of (1) involves a binomial coefficient and hence factorials. To evaluate these factorials we rely on Stirling's approximation, which can be written as an inequality as [71]

x! ≥ √(2πx) · (x/e)^x,  (2)

where π = 3.14... and e = 2.71..., and which holds for all x ≥ 1.
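Numerically, replacing the factorial by its Stirling estimate can only loosen the folklore bound (1). The following sketch (function names are ours) checks this for a few parameter sets, comparing the exact binomial form with the relaxation obtained via ρ! ≥ √(2πρ)·(ρ/e)^ρ:

```python
import math

def folklore_bound(R, sigma, rho):
    """Folklore bound (1): Pr(multcol) <= C(sigma, rho) / R^(rho-1)."""
    return math.comb(sigma, rho) / R ** (rho - 1)

def stirling_relaxation(R, sigma, rho):
    """Relax C(sigma, rho) <= sigma^rho / rho! via Stirling's lower
    bound on rho!, giving (R / sqrt(2*pi*rho)) * (e*sigma/(rho*R))^rho."""
    return R / math.sqrt(2 * math.pi * rho) * (math.e * sigma / (rho * R)) ** rho

# The relaxation is an upper bound on (1) for every admissible (R, sigma, rho).
for (R, sigma, rho) in [(2**10, 2**10, 8), (2**16, 2**12, 4), (2**8, 2**10, 64)]:
    assert folklore_bound(R, sigma, rho) <= stirling_relaxation(R, sigma, rho)
```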
For the purpose of this paper we combine inequalities (1) and (2) as follows. Let S be some positive number limiting the maximum value of σ, i.e., σ ≤ S. From (1) and (2), we get

Pr(multcol(R, σ, ρ)) ≤ (R/√(2πρ)) · (eσ/(ρR))^ρ  (3)
 ≤ (σ/S) · (R/√(2πρ)) · (eS/(ρR))^ρ.  (4)

Remark 1. The probability that multcol(R, σ, ρ) occurs can also be bounded using the Chernoff bound [28]. Consider any fixed bin, and for i = 1, ..., σ, let X_i be the indicator random variable that equals 1 if the ith ball lands in that bin and 0 otherwise. Defining X = Σ_{i=1}^σ X_i as the number of balls in that specific bin, the Chernoff bound states that for any t > 0 [64, Section 4.2],

Pr(X ≥ ρ) ≤ Pr(e^{tX} ≥ e^{tρ}) ≤ Ex(e^{tX})/e^{tρ}.

As in our case the random variables X_i are mutually independent, Ex(e^{tX}) = Π_{i=1}^σ Ex(e^{tX_i}) = (1 + (e^t − 1)/R)^σ. One therefore finds, for any t > 0,

Pr(X ≥ ρ) ≤ e^{−tρ} · (1 + (e^t − 1)/R)^σ.

Looking ahead, in our applications we will need an upper bound of the form σ/S, where ρ is a function of R and S. The bound of (4) is more suited for that.
An alternative approach to bounding the probability that multcol(R, σ, ρ) occurs is via the first and second moments, as done by Raab and Steger [74]. In detail, Raab and Steger demonstrate that Pr(multcol(R, σ, ρ(R, σ))) = o(1) for various parameter settings and choices of ρ as a function of R and σ [74, Theorem 1]. This approach, as well as the related approaches in the field of cryptography [10,49], again does not fit our targeted upper bound.

Lambert W Function
Stirling's approximation contains a "self-exponential" function x^x, and we will need to solve equations of the form

ξ^ξ = d  (6)

for the variable ξ. For this purpose we utilize the Lambert W function [71]. Consider the function f(w) = we^w defined for complex numbers w. Then, the Lambert W function is the inverse relation of f. More precisely, Z = W(Z)e^{W(Z)} is the defining equation for W, and Eq. (6) can be solved, using W, as

ξ = D/W(D),  (7)

where D := ln d [30].
In this work, we can restrict the domain of W to real numbers X ≥ −1/e and the range to real numbers W(X) ≥ −1, and we focus on the principal branch W_p, which is a single-valued function. Hoorfar and Hassani [50] derived the following inequality on W_p(X) for any X ≥ e:

W_p(X) ≥ ln X − ln ln X.

Back to (6), when ξ is restricted to real numbers, the solution (7) becomes

ξ = D/W_p(D) ≤ D/(ln D − ln ln D).

It should be emphasized that this bound is valid only under the condition D ≥ e, or equivalently, d ≥ e^e.
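The Lambert W machinery can be checked numerically. The sketch below (a Newton-iteration implementation of our own, not from any particular library) solves a self-exponential equation ξ^ξ = d via ξ = D/W(D) with D = ln d, and verifies estimates of the type derived by Hoorfar and Hassani, namely ln X − ln ln X ≤ W_p(X) ≤ ln X − (1/2) ln ln X for X ≥ e:

```python
import math

def lambert_w(x, tol=1e-12):
    """Principal branch W_p(x) for x >= 0, via Newton iteration on
    the defining equation w * e^w = x."""
    w = math.log1p(x)  # reasonable starting point for x >= 0
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1))
        w -= step
        if abs(step) < tol:
            break
    return w

# Solving xi^xi = d via xi = D / W_p(D), where D = ln d.
d = 1e12
D = math.log(d)
xi = D / lambert_w(D)
assert abs(xi ** xi - d) / d < 1e-6

# Two-sided estimate on W_p(X) for X >= e.
for X in [math.e, 10.0, 1e3, 1e9]:
    W = lambert_w(X)
    assert math.log(X) - math.log(math.log(X)) <= W + 1e-9
    assert W <= math.log(X) - 0.5 * math.log(math.log(X)) + 1e-9
```

The identity behind the first check: if W_p(D)e^{W_p(D)} = D, then ξ = D/W_p(D) satisfies ξ ln ξ = D, i.e., ξ^ξ = d.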

Bounding Multi-Collision Probability
We will derive Sponge-oriented bounds for ρ. In more detail, consider parameters b, r, c such that b = r + c, and write R = 2^r and S = min{2^{b/2}, 2^c}. We will derive choices for ρ (depending on r and c) such that the probability (4) of a multi-collision is bounded by σ/S. The resulting choices of ρ, stated in Lemma 1, involve the constant β := log₂ e + log₂ log₂ e.
The proof of Lemma 1 is constructive: the bounds for ρ are derived rather than merely verified to hold. The reasoning is, however, structurally different for the cases where r < c (cases (i)-(iv)) and the cases where r ≥ c (cases (v)-(vii)).

Proof of Lemma 1(i)-(iv).
For the case r < c, our basic strategy is to bound Pr(multcol(R, σ, ρ)) by σ/S, where S = 2^{b/2}, by setting

ρ = θ · 2^{(c−r)/2}

for a sufficiently large parameter θ. Note that, by the generalized pigeonhole principle, 2^{(c−r)/2} is the minimum value of ρ when σ reaches S = 2^{b/2}. Assume that ρ ≥ eS/R = e·2^{b/2}/2^r = e·2^{(c−r)/2}, i.e., θ ≥ e. Then, (4) simplifies to an expression in terms of a function ϕ(ζ), defined for real numbers ζ ∈ [0, 7.2], and it remains to show the following claim; its proof proceeds by computing the derivative of ϕ.

Case (iv): c − 2 log₂ c + 7.2 < r < c. The value of θ needs to increase as r approaches c; in general, θ cannot be bounded by a constant but is rather a function of r and c. The Lambert W function can handle such a case, yielding a fairly sharp bound.
Claim. Let c ≥ 13. The inequality holds for all r ∈ [c − 2 log₂ c + 7.2, c). (The condition c ≥ 13 is to make the range of r non-empty.)
As we will show, the bound (16) "works" not only for r ≥ 2c but for all r > c. Moreover, it turns out that (16) is actually better than (15) for a large part of r ∈ (c, 2c], except where r ≈ c.

Claim. Let r > c. For ρ as in (16), we have Pr(multcol(R, σ, ρ)) ≤ σ/S.

Proof of claim. Define the function Δ_c(u), whose domain is the real numbers u ∈ (c, 2c] with c ≥ 11, and where β = log₂ e + log₂ log₂ e = 1.97.... The equation Δ_c(u) = 0 becomes u = c + e log₂ u − eβ, whose solution we denote by u₀. We differentiate Δ_c with respect to u. Note that c + e log₂ r − eβ > c + e log₂ c − eβ, making the distinction between this case (vi) and the previous case (v) clear.

NORX
We introduce NORX at the level required for understanding the security proof and refer to Aumasson et al. [7,8] for the formal specification. Let p be a permutation on b bits. All b-bit state values are split into an outer part of r bits and an inner part of c bits. We denote the key size of NORX by κ bits, the nonce size by ν bits, and the tag size by τ bits. The header, message, and trailer can be of arbitrary length and are padded using 10*1-padding to a length that is a multiple of r bits. Throughout, the r-bit header, message, and trailer blocks are denoted as outlined in Fig. 2; the parameter choices of NORX are listed in Table 1.
Although NORX starts with an initialization function init, which requires the parameters (D, R, τ) as input, as soon as our security experiment starts we consider (D, R, τ) fixed and constant. Hence, we can view init as a function that maps (K, N) to an initial state determined by K, N, and a constant const, where const is irrelevant to the mode security analysis of NORX and will be ignored in the remaining analysis.
After init is called, the header H is compressed into the rate, the state is branched into D states (if necessary), the message blocks are encrypted in a streaming way, the D states are merged into one state (if necessary), the trailer is compressed, and finally the tag A is computed. All rounds are preceded by a domain separation constant XORed into the capacity: 01 for header compression, 02 for message encryption, 04 for trailer compression, and 08 for tag generation. If D ≠ 1, domain separators 10 and 20 are used for branching and merging, along with pairwise distinct lane indices id_k for k = 1, ..., D (if D = 1 we write id_1 = 0). In Fig. 2 we depict NORX for D = 1 and D = 2. The privacy of NORX is proven in Sect. 4.1 and the integrity in Sect. 4.2. In both proofs we consider an adversary that makes q_p permutation queries and q_E encryption queries of total length λ_E. In the proof of integrity, the adversary can additionally make q_D decryption queries of total length λ_D. To aid the analysis, we compute the number of permutation calls made via the q_E encryption queries; the exact same computation holds for decryption queries, with the parameters defined analogously.
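To make the padding step concrete, the following sketch implements 10*1-padding at byte granularity (NORX itself specifies it at bit level; the function name is ours). The essential property used later in the proof is injectivity:

```python
def pad10star1(data: bytes, rate_bytes: int) -> bytes:
    """10*1 padding, byte-granular variant: append a 0x01 byte,
    zero-fill up to the rate boundary, and OR 0x80 into the final
    byte, so the result is always a positive multiple of the rate."""
    padded = bytearray(data) + b"\x01"
    padded += b"\x00" * (-len(padded) % rate_bytes)
    padded[-1] |= 0x80
    return bytes(padded)

# Injectivity in action: distinct inputs yield distinct padded strings,
# which is what lets the proof assume full blocks only.
assert pad10star1(b"", 8) != pad10star1(b"\x01", 8)
assert len(pad10star1(b"abcdefg", 8)) == 8    # one block
assert len(pad10star1(b"abcdefgh", 8)) == 16  # full input forces an extra block
```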
Consider a query to E_K, consisting of u header blocks, v message blocks, and w trailer blocks. We denote its corresponding state values as in (17) and as outlined in Fig. 2. We denote the number of state values by σ_{E,j}, where the dependence on D is suppressed as D does not change during the security game. In other words, σ_{E,j} denotes the number of primitive calls in the jth query to E_K. Furthermore, we define σ_E := Σ_{j=1}^{q_E} σ_{E,j} to be the total number of primitive evaluations via the encryption queries, and derive an upper bound (18) on σ_E in terms of q_E and λ_E. This bound is rather tight: particularly, for D = 1, an adversary can meet it by only making queries without header and trailer. For queries to D_K we define σ_{D,j} and σ_D analogously.
Theorem 1. Let p ←$ Perm(b). Then, up to constant factors,

Adv^priv_NORX(q_p, q_E, λ_E) ≲ (q_p + σ_E)²/2^b + ρ·q_p/2^c + (q_p + σ_E)/2^κ,

where σ_E is defined in (18), and where ρ = ρ(r, c) is the function defined in Lemma 1.
Theorem 1 can be interpreted as saying that NORX provides privacy as long as the total complexity q_p + σ_E does not exceed min{2^{b/2}, 2^κ} and the total number of primitive queries q_p, also known as the offline complexity, does not exceed 2^c/ρ. The presence of the term ρ makes the bound somewhat opaque; in Table 2 we give the main implication of this bound for the various possible values of r and c, as outlined in Lemma 1. See Table 1 for the security level of the various parameter choices of NORX: for NORX v1 [7], we are concerned with case (vi), where ρ = ⌈2.5⌉ = 3 for both b ∈ {512, 1024}; for NORX v2 [8], we are in case (vii), where ρ = 2.
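This interpretation of Theorem 1 can be turned into a small calculator (the function and its dictionary keys are our own illustrative choices, following the complexity caps described above):

```python
import math

def norx_mode_security(b, c, kappa, rho):
    """Mode-security levels (log2 of the attack complexities) implied by
    the interpretation of Theorem 1: the total complexity q_p + sigma_E
    is capped by min{2^(b/2), 2^kappa}, and the offline complexity q_p
    by 2^c / rho."""
    return {
        "total": min(b / 2, kappa),          # log2 of min{2^(b/2), 2^kappa}
        "offline": c - math.log2(rho),       # log2 of 2^c / rho
    }

# NORX v1 with b = 1024, c = 256, kappa = 256: case (vi) gives rho = 3,
# so the total complexity is capped at 2^256 and the offline complexity
# at 2^256 / 3.
levels = norx_mode_security(b=1024, c=256, kappa=256, rho=3)
```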
The proof is based on the observation that NORX is indistinguishable from a random scheme as long as there are no collisions among the (direct and indirect) evaluations of p. Due to the uniqueness of the nonce, state values from evaluations of E_K collide with probability approximately 1/2^b. Regarding collisions between direct calls to p and calls via E_K: while these may happen with probability about 1/2^c, they turn out not to significantly influence the bound. The latter is demonstrated in part using the principle of multiplicities [18]: roughly stated, the multiplicity is the maximum number of state values with the same outer part. We use Lemma 1 to bound the multiplicities. The formal security proof is more detailed. Furthermore, we remark that, at the cost of readability and simplicity of the proof, the bound could be improved by a constant factor.
Proof. Consider any adversary A with access to either (p±, E_K) or (p±, $), whose goal is to distinguish these two worlds. We start by replacing p± with a random function to simplify the analysis. This is done with a "URP-URF" switch [13], in which we make a transition from p± to a primitive f± defined as follows (as done by Andreeva et al. [4]).
The primitive f± maintains an initially empty list F of query/response tuples (x, y), whose sets of domain and range values are denoted by dom(F) and rng(F), respectively. For a forward query f(x) with x ∈ dom(F), the value in {y | (x, y) ∈ F} that occurs lexicographically first is returned. For a new forward query f(x), the response y is randomly drawn from {0,1}^b, and the tuple (x, y) is added to F. The description of f⁻¹ is similar. We let abort denote the event that a new query f(x) results in a value y that is already in rng(F), or that a new query f⁻¹(y) results in a value x that is already in dom(F).
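The lazy-sampling mechanism behind f± can be sketched as follows. This is a simplified illustration of our own (the list F in the proof may hold multiple tuples per input, whereas a dictionary keeps one); abort fires exactly when a fresh answer collides with an earlier one, i.e., where f would stop behaving like a permutation:

```python
import random

class LazyRandomFunction:
    """Lazily sampled random function f / f^{-1} on b bits, raising an
    exception when the 'abort' event of the URP-URF switch occurs."""

    def __init__(self, b, seed=None):
        self.b = b
        self.fwd = {}   # forward table: x -> y
        self.inv = {}   # inverse table: y -> x
        self.rng = random.Random(seed)

    def f(self, x):
        if x in self.fwd:                 # repeated query: stay consistent
            return self.fwd[x]
        y = self.rng.getrandbits(self.b)  # fresh query: uniform b-bit answer
        if y in self.inv:
            raise RuntimeError("abort: range collision")
        self.fwd[x], self.inv[y] = y, x
        return y

    def f_inv(self, y):
        if y in self.inv:
            return self.inv[y]
        x = self.rng.getrandbits(self.b)
        if x in self.fwd:
            raise RuntimeError("abort: domain collision")
        self.fwd[x], self.inv[y] = y, x
        return x

# Repeated queries are answered consistently; for large b and few queries,
# abort occurs with probability roughly q^2 / 2^(b+1).
F = LazyRandomFunction(b=128, seed=0)
assert F.f(0) == F.f(0)
assert F.f_inv(F.f(7)) == 7
```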
By applying the triangle inequality, we have

Adv^priv_NORX(A) ≤ |Pr(A^{f±, E_K} = 1) − Pr(A^{f±, $} = 1)| + |Pr(A^{p±, E_K} = 1) − Pr(A^{f±, E_K} = 1)| + |Pr(A^{p±, $} = 1) − Pr(A^{f±, $} = 1)|.  (19)

The two rightmost terms are bounded above by the maximum advantage of any adversary distinguishing p± and f± in at most q_p + σ_E queries. Since p± and f± are identical until abort, by the Fundamental Lemma of Game Playing [12,13] the two rightmost terms are in turn bounded by

2 · Pr(abort) ≤ (q_p + σ_E)²/2^b.  (21)

We henceforth restrict our attention to A with oracle access to (f±, F), where F ∈ {E_K, $}. Without loss of generality, we can assume that the adversary only queries full blocks and that no padding rules are involved: the padding rules are injective, so the proof carries over to the case of fractional blocks with 10*1-padding.
We introduce some terminology. Queries to f± are denoted (x_i, y_i) for i = 1, ..., q_p, while queries to F are written as tuples (N_j; H_j, M_j, T_j; C_j, A_j) for j = 1, ..., q_E. If F = E_K, the state values are denoted as in (17), subscripted with a j, as in (22). If the structure of (22) is irrelevant, we refer to the tuple as (s_{j,1}, ..., s_{j,σ_{E,j}}), where we use the convention of listing the elements of the matrix column-wise. In this case, we write parent(s_{j,k}) to denote the state value that led to s_{j,k}, with parent(s_{j,1}) := ∅ and parent(s^T_{j,0}) := (s^M_{j,1,v_1}, ..., s^M_{j,D,v_D}). We remark that the characteristic structure of NORX, with the D parallel states, only becomes relevant in the two technical lemmas used at the end of the proof. We point out that s_{j,1} corresponds to the initial state value of the evaluation, which requires special attention throughout the remainder of the proof.
The remainder of the proof is structured as follows. In Lemma 2 we prove that (f±, E_K) and (f±, $) are identical until the event guess ∨ hit occurs. In other words, by applying the Fundamental Lemma of Game Playing [12,13],

|Pr(A^{f±, E_K} = 1) − Pr(A^{f±, $} = 1)| ≤ Pr(guess ∨ hit).  (23)

Then, in Lemma 3 we bound this probability in terms of ρ = ρ(r, c), the function defined in Lemma 1. This completes the proof via equations (19), (21), and (23).

Lemma 2.
The outputs of (f±, E_K) and (f±, $) are identically distributed until the event guess ∨ hit occurs.
Proof. The outputs of f± are sampled independently and uniformly at random in (f±, $). This holds in the real world as well, unless a query to f± collides with an f± query made via E_K. Therefore, until guess occurs, the outputs of f± are distributed identically in both worlds. Furthermore, the outputs of f± are independent of the distinguisher's query history; hence, assuming all past queries were identically distributed across worlds, a query to f± will not change the fact that both worlds are identically distributed, until guess occurs. Let N_j be a new nonce used in the F-query (N_j; H_j, M_j, T_j), with corresponding ciphertext and authentication tag (C_j, A_j). Denote the query's state values as in (22). Let u, v, and w denote the number of padded header blocks, padded message blocks, and padded trailer blocks, respectively.
Consider the jth query. By the definition of $, in the ideal world we have (C_j, A_j) ←$ {0,1}^{|M_j|+τ}. We will prove that (C_j, A_j) is identically distributed in the real world, under the assumption that guess ∨ hit has not yet occurred. Denote the message blocks of M_j by M_{j,k,ℓ} for k = 1, ..., D and ℓ = 1, ..., v_k. As the state value s^M_{j,k,ℓ−1} has not been evaluated by f before (neither directly nor indirectly via an encryption query), f(s^M_{j,k,ℓ−1}) outputs a uniformly random value from {0,1}^b, and hence the corresponding ciphertext block is uniformly distributed. We remark that similar reasoning shows that a ciphertext block corresponding to a truncated message block is uniformly randomly drawn as well, yet from a smaller set. The fact that A_j ←$ {0,1}^τ follows the same reasoning, using that s^tag_j is a new input to f. Thus, (C_j, A_j) is identically distributed in the real and ideal worlds.

Looking back at the reasoning in the proof of Lemma 2, we notice that if the event guess ∨ hit has not yet occurred, then each state value in an F-query is sampled independently and uniformly at random. In particular, once the adversary fixes the inputs to an F-query, each state value in that F-query is independent of the adversary's input, and the state values are independent of each other. Furthermore, the inner parts of those state values are never released to the adversary; hence, the adversary's future queries are independent of the inner parts of the state values. We therefore have the following result:

Lemma 3. Pr
Proof. Consider the adversary interacting with (f±, E_K), and let Pr(guess ∨ hit) denote the probability we aim to bound. For i ∈ {1, ..., q_p}, define key(i) as the event that the ith primitive query hits the key, i.e., that x_i coincides with a valid initial state value for the key K, and let key = ∨_i key(i). Let j ∈ {1, ..., q_E} and k ∈ {1, ..., σ_{E,j}}, and consider any threshold ρ ≥ 1; define multi(j, k) as the event that ρ or more of the state values up to (and including) the kth state of the jth construction query have the same outer part. Event multi(j, k) is used to bound the number of states that collide in the outer part. Note that the state values s_{j′,1} are not considered here, as they will be covered by key. We define multi = multi(q_E, σ_{E,q_E}), which is a monotone event. By basic probability theory,

Pr(guess ∨ hit) ≤ Pr(guess ∨ hit | ¬(key ∨ multi)) + Pr(key ∨ multi).  (25)

In the remainder of the proof, we bound these probabilities as follows (a formal explanation of the proof technique is given in "Appendix"): we consider the ith forward or inverse primitive query (for i ∈ {1, ..., q_p}) or the kth state of the jth construction query (for j ∈ {1, ..., q_E} and k ∈ {1, ..., σ_{E,j}}), and bound the probability that this evaluation sets guess ∨ hit, under the assumptions that this query does not set key ∨ multi and that guess ∨ hit ∨ key ∨ multi has not been set before. For the analysis of Pr(key ∨ multi), a similar technique is employed.

Event guess. This event can be set in the ith primitive query (for i = 1, ..., q_p) or in any state evaluation of the jth construction query (for j = 1, ..., q_E). Denote the state values of the jth construction query as in (22). Consider any evaluation, assume this query does not set key ∨ multi, and assume that guess ∨ hit ∨ key ∨ multi has not been set before. Firstly, note that x_i = s^init_j for some i, j would imply key(i) and hence invalidate our assumption. Therefore, we can exclude s^init_j from further analysis of guess. For i = 1, ..., q_p, let j_i ∈ {0, ..., q_E} be the number of encryption queries made before the ith primitive query. Similarly, for j = 1, ..., q_E, denote by i_j ∈ {0, ..., q_p} the number of primitive queries made before the jth encryption query.
- Consider a primitive query (x_i, y_i) for i ∈ {1, . . . , q_p}, which may be a forward or an inverse query, and assume it has not been queried to f^± before.
Since by ¬multi at most ρ construction states share any given outer part value, the probability that guess is set via a direct query is at most ρq_p/2^c in total.
- Next, consider the probability that the jth construction query sets guess, for j ∈ {1, . . . , q_E}. For simplicity, first consider D = 1; hence the message is processed in one lane and we can use state labeling (s_{j,1}, . . . , s_{j,σ_{E,j}}). We range from s_{j,2} to s_{j,σ_{E,j}} (recall that s_{j,1} = s^init_j can be excluded) and consider the probability that this state sets guess, assuming it has not been set before. Let k ∈ {2, . . . , σ_{E,j}}. The state value s_{j,k} equals f(s_{j,k−1}) ⊕ v, where v is some value determined by the adversarial input prior to the evaluation of f(s_{j,k−1}), including input from (H_j, M_j, T_j) and constants serving as domain separators. By assumption, guess ∨ hit has not been set before, and f(s_{j,k−1}) is thus randomly drawn from {0, 1}^b. It hits any x_i (i ∈ {1, . . . , i_j}) with probability at most i_j/2^b. Next, consider the general case D > 1. We return to the labeling of (22), where v_1, . . . , v_D are some distinct values determined by the adversarial input prior to the evaluation of the jth construction query. These are distinct by the XOR of the lane numbers id_1, . . . , id_D. Any of these nodes equals x_i for i ∈ {1, . . . , q_p} with probability at most i_j D/2^b. Finally, for the merging node s^T_{j,0} we can apply the same analysis, noting that it is derived from a sum of D new f-evaluations. Concluding, the jth construction query sets guess with probability at most i_j σ_{E,j}/2^b (we always have in total at most σ_{E,j} new state values). Summing over all q_E construction queries, we obtain a total bound of q_p σ_E/2^b. Here we use that ∑_{j=1}^{q_E} i_j σ_{E,j} ≤ q_p σ_E, which follows from a simple counting argument.

Event hit. We again employ ideas of guess, and particularly that, as long as guess ∨ hit is not set, we can consider all new state values (except for the initial states) to be randomly drawn from a set of size 2^b.
Particularly, we can refrain from explicitly discussing the branching and merging nodes (the detailed analysis of guess applies) and label the states as (s_{j,1}, . . . , s_{j,σ_{E,j}}). Clearly, s_{j,1} ≠ s_{j',1} for all j ≠ j' by uniqueness of the nonce. Any state value s_{j,k} for k > 1 (at most σ_E − q_E in total) hits an initial state value s_{j',1} only if [s_{j,k}]_κ = K, which happens with probability at most σ_E/2^κ in total, assuming s_{j,k} is generated randomly. Finally, any two other states s_{j,k}, s_{j',k'} for k, k' > 1 collide with probability at most 1/2^b each.

Event key. For i ∈ {1, . . . , q_p}, the query sets key(i) if [x_i]_κ = K, which happens with probability 1/2^κ (assuming it did not happen in queries 1, . . . , i − 1). The adversary makes q_p attempts, and hence Pr(key) ≤ q_p/2^κ.

Event multi. Event multi can be related to multcol of Sect. 3, in the following way. Consider any new state value s_{j,k−1}; then it contributes to the bin corresponding to an outer part value. If a threshold ρ needs to be exceeded for some α, at least ρ/2 of the contributing states are either of the first kind or of the second kind. The event multi can henceforth be seen as a balls-and-bins game with 2^r bins, σ_E balls, and threshold ρ' = ρ/2. By Lemma 1, we can bound Pr(multcol(2^r, σ_E, ρ')), where ρ' is the function described in Lemma 1 (parameters r, c are implicit). Note that we put ρ = 2ρ'. Addition of the four bounds via (25) gives the claimed result, where ρ = ρ(r, c) is the function defined in Lemma 1.
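The balls-and-bins abstraction used for multi can be checked empirically. The following Monte Carlo sketch (toy parameters of our choosing, far smaller than real sponge sizes, and a helper name `multicollision_prob` that is ours) estimates the probability that some bin among 2^r receives more than ρ' of σ uniformly random balls:

```python
import random
from collections import Counter

def multicollision_prob(r_bits, balls, threshold, trials=2000, seed=1):
    """Estimate Pr[some bin among 2^r_bits receives more than `threshold` balls]."""
    bins = 1 << r_bits
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Throw `balls` balls uniformly into the bins and record the max load.
        counts = Counter(rng.randrange(bins) for _ in range(balls))
        if max(counts.values()) > threshold:
            hits += 1
    return hits / trials

# With far more balls than bins, exceeding a small threshold is almost certain;
# with few balls and many bins, even an ordinary collision is unlikely.
p_high = multicollision_prob(r_bits=4, balls=200, threshold=3)
p_low = multicollision_prob(r_bits=16, balls=20, threshold=3)
```

For the real parameters (2^r bins with r ≥ 128) such a simulation is of course infeasible; Lemma 1 replaces it with an explicit bound.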

Theorem 2. Let Π = (E, D) be NORX based on an ideal underlying primitive p. Then,
where σ E , σ D are defined in (18), and where ρ = ρ(r, c) is the function defined in Lemma 1.
The bound is more complex than that of Theorem 1, but intuitively it implies that NORX offers integrity as long as it offers privacy, the number of forgery attempts σ_D is limited, and the total complexity q_p + σ_E + σ_D does not exceed 2^c/σ_D. See Table 1 for the security levels for the various parameter choices of NORX. Needless to say, the exact bound is more fine-grained.
Proof. We consider any adversary A that has access to (p^±, E_K, D_K) and attempts to make D_K output a non-⊥ value. As in the proof of Theorem 1, we apply a URP-URF switch to find (26). Then we focus on A having oracle access to (f^±, E_K, D_K). As before, we assume without loss of generality that the adversary only makes full-block queries. We inherit terminology from Theorem 1. The state values corresponding to encryption and decryption queries will both be labeled (j, k), where j indicates the query and k the state value within the jth query. If needed, we add a parameter δ ∈ {D, E} to indicate that a state value s_{δ,j,k} belongs to the jth query to oracle δ, for δ ∈ {D, E} and j ∈ {1, . . . , q_δ}. Particularly, this means we will either label the state values as in (22) with δ appended to the subscript, or simply as (s_{δ,j,1}, . . . , s_{δ,j,σ_{δ,j}}).
Observe that from (26) we get (27). A bound on the probability that A sets event is derived in Lemma 4. The remainder of this proof centers on the probability that A forges given that event does not happen. Such a forgery requires that [f(s^tag_{D,j})]_τ = A_j for some decryption query j. By ¬event, we know that s^tag_{D,j} is a new state value for all j ∈ {1, . . . , q_D}; hence f's output under s^tag_{D,j} is independent of all other values and uniformly distributed, for all j. As a result, the jth forgery attempt is successful with probability at most 1/2^τ. Summing over all q_D queries, we obtain a total bound of q_D/2^τ, and the proof is completed via (26), (27), and the bound of Lemma 4, where we again use that ρ = ρ(r, c) is the function defined in Lemma 1.
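In display form, the concluding step above can be restated as follows (the shorthand for the forging probability is ours):

```latex
\Pr[\text{forge}]
  \;\le\; \Pr[\textsf{event}]
        + \Pr[\,\exists j:\, [f(s^{\mathrm{tag}}_{D,j})]_\tau = A_j \mid \neg\textsf{event}\,]
  \;\le\; \Pr[\textsf{event}] + \sum_{j=1}^{q_D} \frac{1}{2^\tau}
  \;=\;   \Pr[\textsf{event}] + \frac{q_D}{2^\tau}.
```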

Lemma 4. Pr
Proof. Recall that event = guess ∨ hit ∨ Dguess ∨ Dhit. Employing the events key and multi from Lemma 3, we find (28). The proof builds upon Lemma 3; in particular, we will use the same proof technique of running over all queries and computing the probability that a query sets event, assuming event has not been set before. The bounds on Pr(guess ∨ hit | ¬(key ∨ multi)) and Pr(key ∨ multi) carry over from Lemma 3 verbatim, where we additionally note that, for a given query, the previous decryption queries are of no influence, as by hypothesis Dguess ∨ Dhit was not set before the query in question. We continue with the analysis of Dguess and Dhit.

Event Dguess. Note that the adversary may freely choose the outer part in decryption queries and primitive queries. Indeed, the ciphertext values that A chooses in decryption queries define the outer parts of the state values. Consequently, Dguess gets set as soon as there are a primitive state and a decryption state whose inner parts are equal. This happens with probability at most Pr(Dguess | ¬(key ∨ multi)) ≤ q_p σ_D/2^c.

Event Dhit. A technicality occurs in that the adversary can reuse nonces in decryption queries.
To increase readability, we first note that any decryption state s satisfies [s]_κ = K only with probability at most σ_D/2^κ in total, and in the remainder we can exclude this case. Next, we define an event innerhit. Let (δ, j, k) and (δ', j', k') be two decryption query indices, and let const ∈ {0, 01 ⊕ 02, 01 ⊕ 04, 01 ⊕ 08, 01 ⊕ 10, 02 ⊕ 04, 02 ⊕ 08, 02 ⊕ 20, 02 ⊕ 20 ⊕ id_i, 04 ⊕ 08}. Note that for any choice of indices and const, we have Pr(innerhit(δ, j, k; δ', j', k'; const)) ≤ 1/2^c. We consider the general case D ≥ 1. Consider the jth decryption query (N; H, C, T; A). Say it consists of u header blocks H_1 . . . H_u, v ciphertext blocks C_1 . . . C_v, and w trailer blocks T_1 . . . T_w, and write its state values as in (17). Let (N_{δ,j}; H_{δ,j}, C_{δ,j}, T_{δ,j}; A_{δ,j}) be an older ciphertext tuple that shares the longest common blockwise prefix with (N; H, C, T; A). Note that this tuple may not be unique (for instance if N is new), and that it may come from an encryption or decryption query. Say that this query consists of u_{δ,j} header blocks, v_{δ,j} ciphertext blocks, and w_{δ,j} trailer blocks, and write its state values as in (22). We proceed with a case distinction.
(a) ℓ = ∞. Note that s^T_{min{w,w_{δ,j}}} = s^T_{δ,j,min{w,w_{δ,j}}} ⊕ 04 ⊕ 08. If this input to f is old, it implies innerhit(δ, j, min{w, w_{δ,j}}; δ', j', k'; 04 ⊕ 08) for some (δ', j', k') older than the current query (D, j, min{w, w_{δ,j}}), which is the case with probability at most 1/2^c (for all possible index tuples). Otherwise, f generates a new value and a new state value s (s^T_{w+1} if w > w_{δ,j}, or s^tag if w < w_{δ,j}), which sets Dhit if it sets innerhit with an older state s_{δ',j',k'} under const = 0. This also happens with probability at most 1/2^c for any (δ', j', k'). This procedure propagates to s^tag. In total, the jth decryption query sets Dhit with probability at most the claimed bound.⁵
(b) ℓ ≠ ∞. As before, s^T_ℓ is a new input to f, except if innerhit(δ, j, ℓ; δ', j', k'; 0) for some (δ', j', k') older than the current query (D, j, ℓ). This is the case with probability at most 1/2^c for all possible older queries. The procedure propagates to s^tag as before, and the same bound holds;
(3) (N; H) = (N_{δ,j}; H_{δ,j}) but C ≠ C_{δ,j}. The analysis is similar, but a special treatment is required to deal with the merging phase. Consider the ciphertext C to be divided into blocks C_{k,ℓ} for k = 1, . . . , D and ℓ = 1, . . . , v_k, and similarly for C_{δ,j}. For each lane k, let ℓ_k denote the length of the longest common blockwise prefix of the kth lanes. We make a further distinction between whether or not (ℓ_1, . . . , ℓ_D) = (∞, . . . , ∞).

⁵ Note that if (δ, j) were not unique, then we similarly have s^T_{ℓ−1} = s^T_{δ',j',ℓ−1} and s^T_ℓ = s^T_{δ',j',ℓ} ⊕ (T_ℓ ∥ 0^c) ⊕ (T_{δ',j',ℓ} ∥ 0^c) = s^T_{δ',j',ℓ} for all other queries (δ', j') with the same prefix (possibly XORed with 04 ⊕ 08).
(a) (ℓ_1, . . . , ℓ_D) = (∞, . . . , ∞). As C ≠ C_{δ,j}, there must be a k such that v_k ≠ v_{δ,j,k}, and thus C_k is a strictly smaller substring of C_{δ,j,k} or vice versa. Consequently, the corresponding states differ by a domain separation constant const (a different const applies if there is no merging phase, or if there is furthermore no trailer). Then, this state is new to f except if innerhit(δ, j, k, v_k; δ', j', k'; const) is set for the const described above. (We slightly misuse notation here in that v_k is input to innerhit.) This means that also s^T_0 will be new except if it hits a certain older state, which happens with probability 1/2^c. The reasoning propagates up to s^tag as before, and the same bound holds;
(b) (ℓ_1, . . . , ℓ_D) ≠ (∞, . . . , ∞). The reasoning of case (2b) carries over for all future state values;
(4) N = N_{δ,j} but H ≠ H_{δ,j}. The analysis follows largely the same principles, albeit using const ∈ {0, 01 ⊕ 02, 01 ⊕ 04, 01 ⊕ 08, 01 ⊕ 10};
(5) N ≠ N_{δ,j}. The nonce N is new (hence the query shares no prefix with any older query). There has not been an earlier state s that satisfies [s]_κ = K (by virtue of the analysis in hit and key, and the first step of this event Dhit). Therefore, s^init is new by construction, and a simplification of the above analysis applies.
Summing over all queries gives the claimed bound on Pr(Dhit | ¬(key ∨ multi)), where the last term comes from the exclusion of the event that any decryption state satisfies [s]_κ = K. Together with the bound of Lemma 3, we find the result via (28).

Fig. 3. Target structure in key recovery attack.
where ρ = ρ(r, c) is the function defined in Lemma 1.

Tightness of the Bound
We derive a generic attack on Sponge-based authenticated encryption schemes. The attack exploits multi-collisions on the outer part of the internal state. Using the multi-collision bounds of Suzuki et al. [91,92], we demonstrate that the attack matches the proven security bound, meaning that the bounds of Sect. 4 are tight. To this end, we first describe our simplified target structure in Sect. 5.1. The attack is described in Sect. 5.2 and evaluated in Sect. 5.3.

Target Structure
We consider the simplified structure of Fig. 3. Without loss of generality, we consider a key K ∈ {0, 1}^κ and a nonce N ∈ {0, 1}^{b−κ} (hence ν = b − κ), and we assume that init initializes the state as (K, N) → K ∥ N. (The attack can be generalized to the setting where the key is absorbed in multiple evaluations of p, or where the key is XORed into the state before outputting A. See also Sect. 5.4.) We consider no associated data, or in the terminology of Sect. 2, we put H, T ← Null. The message size must be at least one complete block. Note that, in many schemes, a message of one complete block will expand to two blocks by the padding procedure. We consider a general setting where the τ-bit authentication tag A may be generated in multiple extraction rounds (two in Fig. 3), and we assume that τ ≥ c. We ignore minor issues irrelevant to our attack, such as padding, frame bits, domain separation for the message processing and tag generation parts, and truncation of the tag.

As shown in Fig. 3, the b-bit state after the first permutation call is denoted s_1. Its outer and inner parts are denoted [s_1]_r and [s_1]_c, respectively. Then, an r-bit message block M_1 is XORed into [s_1]_r and the first ciphertext block C_1 = [s_1]_r ⊕ M_1 is output. The state is evaluated using the permutation, and the resulting state is s_2. Note that the values M_i and C_i reveal the outer part of state s_i as [s_i]_r = M_i ⊕ C_i.
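The structure of Fig. 3 can be made concrete with a toy instantiation. The sketch below uses made-up miniature parameters (b = 16, r = c = 8, κ = 4) and a seeded random bijection in place of the ideal permutation p; none of these values or helper names come from an actual scheme, they only illustrate the dataflow:

```python
import random

B, R, C, KAPPA = 16, 8, 8, 4   # toy sizes: b = r + c, key length kappa
NU = B - KAPPA                 # nonce length nu = b - kappa

# A fixed random bijection on {0,1}^B stands in for the ideal permutation p.
_rng = random.Random(2024)
_P = list(range(1 << B))
_rng.shuffle(_P)
p = _P.__getitem__

outer = lambda s: s >> C               # [s]_r : top r bits of the state
inner = lambda s: s & ((1 << C) - 1)   # [s]_c : bottom c bits

def encrypt(K, N, M1):
    """One-block toy sponge AE as in Fig. 3: returns (C1, A) with a 2r-bit tag."""
    s1 = p((K << NU) | N)                # init: (K, N) -> K || N, then permute
    C1 = outer(s1) ^ M1                  # first ciphertext block
    s2 = p((C1 << C) | inner(s1))        # duplexing: outer part becomes C1
    A = (outer(s2) << R) | outer(p(s2))  # two extraction rounds, tau = 2r >= c
    return C1, A

K, N, M1 = 0b1010, 0x3A7, 0x5C
C1, A = encrypt(K, N, M1)
# The pair (M1, C1) reveals the outer part of s1: [s1]_r = M1 ^ C1.
assert M1 ^ C1 == outer(p((K << NU) | N))
```

The final assertion mirrors the observation above: the adversary learns [s_1]_r for free, while [s_1]_c stays hidden.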

Distinguishing Attacks via Key Recovery
Let ρ ≥ 2. If 2^κ ≤ 2^c/ρ, a naive key recovery attack can be performed with complexity 2^κ; hence, we assume that 2^κ > 2^c/ρ. We first give an overview of the attack. Once a b-bit state in the structure of Fig. 3 is recovered, the secret key K can be recovered immediately by computing the inverse of the permutation. Our attack aims to recover the internal state s_1 after the first permutation call. It consists of an online phase followed by an offline phase.
In the online phase, the adversary searches for a ρ-collision on the r-bit value C_1. It makes a certain number of encryption oracle queries for different N and possibly different M_1. Let q denote the total number of encryption queries needed. The online phase results in ρ pairs (N, M_1) that produce the same C_1 but different [s_1]_c. The adversary also stores the tag A for each pair.
In the offline phase, the adversary recovers an inner part [s_1]_c. Using the value C_1, which is the same for all tuples, the value [s_1]_c is exhaustively guessed. In a bit more detail, the adversary computes the authentication tag A from C_1 ∥ [s_1]_c offline and checks whether it matches any stored tag. Because ρ tags are stored, the attack cost is about 2^c/ρ.
The formal description of the attack is given below. Here, we denote the data D for the kth block in the jth query by D_{j,k}. We omit the second subscript where the block length is always 1, e.g., nonce N_j.

Online
1. Choose q distinct tuples (N_i, M_{i,1});
2. Query (N_i, M_{i,1}) for i = 1, 2, . . . , q and receive (C_{i,1}, A_{i,1} A_{i,2} . . .);
3. Find a ρ-collision on C_{·,1};
4. Store the ρ triplets (N_i, M_{i,1}, A_{i,1} A_{i,2} . . .) contributing to the ρ-collision. We denote the colliding value of C_{·,1} by C, which is also stored.
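The online phase can be sketched on a miniature sponge (all parameters, the toy permutation, and helper names such as `online_phase` are illustrative assumptions, not taken from any real scheme): encrypt under fresh nonces and bucket the results by C_1 until some bucket holds ρ entries.

```python
import random
from collections import defaultdict

B, R, C, KAPPA = 16, 8, 8, 4   # toy sizes: b = r + c
NU = B - KAPPA

# Seeded random bijection standing in for the ideal permutation p.
_rng = random.Random(2024)
_P = list(range(1 << B))
_rng.shuffle(_P)
p = _P.__getitem__
outer = lambda s: s >> C
inner = lambda s: s & ((1 << C) - 1)

def encrypt(K, N, M1):
    """One-block toy sponge AE: returns (C1, A) with a 2r-bit tag."""
    s1 = p((K << NU) | N)
    C1 = outer(s1) ^ M1
    s2 = p((C1 << C) | inner(s1))
    return C1, (outer(s2) << R) | outer(p(s2))

def online_phase(K, rho):
    """Query under fresh nonces until some C1 value occurs rho times."""
    buckets = defaultdict(list)        # C1 value -> stored triplets (N, M1, A)
    for N in range(1 << NU):           # nonce-respecting: each nonce used once
        M1 = 0                         # known plaintext suffices
        C1, A = encrypt(K, N, M1)
        buckets[C1].append((N, M1, A))
        if len(buckets[C1]) == rho:    # rho-collision on C1 found
            return C1, buckets[C1]
    raise RuntimeError("no rho-collision found")

C1, triplets = online_phase(K=0b1010, rho=4)
```

With 2^ν nonces and only 2^r possible C_1 values, a ρ-collision is guaranteed here by the pigeonhole principle; at real sizes the required number of queries q is governed by the multi-collision bounds discussed in Sect. 5.3.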

Offline
1. Exhaustively guess the inner part [s_1]_c ∈ {0, 1}^c and compute the corresponding tag from C ∥ [s_1]_c;
2. If the computed tag matches a stored A_{i,1} A_{i,2} . . ., invert the permutation on (C ⊕ M_{i,1}) ∥ [s_1]_c;
3. If the resulting value matches nonce N_i, output the first κ bits of the state as the recovered key K.
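Both phases can be combined end-to-end on the same toy instantiation (again, every parameter and helper name here is an illustrative assumption): guess [s_1]_c, recompute the tag from C ∥ [s_1]_c, filter against the ρ stored tags, and confirm via p^{-1} and the nonce. About 2^c/ρ guesses are expected before one of the ρ stored inner parts is hit.

```python
import random
from collections import defaultdict

B, R, C, KAPPA = 16, 8, 8, 4
NU = B - KAPPA

# Seeded random bijection (and its inverse) standing in for the permutation p.
_rng = random.Random(2024)
_P = list(range(1 << B))
_rng.shuffle(_P)
_P_INV = [0] * (1 << B)
for x, y in enumerate(_P):
    _P_INV[y] = x
p, p_inv = _P.__getitem__, _P_INV.__getitem__
outer = lambda s: s >> C
inner = lambda s: s & ((1 << C) - 1)

def encrypt(K, N, M1):
    s1 = p((K << NU) | N)
    C1 = outer(s1) ^ M1
    s2 = p((C1 << C) | inner(s1))
    return C1, (outer(s2) << R) | outer(p(s2))

def tag_from_state(C1, guess):
    """Recompute the tag offline from the guessed state C1 || [s1]_c."""
    s2 = p((C1 << C) | guess)
    return (outer(s2) << R) | outer(p(s2))

def attack(K_secret, rho=4):
    # Online: rho-collision on C1 under distinct nonces (known plaintext M1 = 0).
    buckets = defaultdict(list)
    for N in range(1 << NU):
        C1, A = encrypt(K_secret, N, 0)
        buckets[C1].append((N, A))
        if len(buckets[C1]) == rho:
            stored = {A: N for N, A in buckets[C1]}   # tag -> nonce filter
            break
    # Offline: about 2^c / rho guesses until one of the rho tags matches.
    for guess in range(1 << C):
        tag = tag_from_state(C1, guess)
        if tag in stored:
            s1 = (C1 << C) | guess        # [s1]_r = M1 ^ C1 with M1 = 0
            s0 = p_inv(s1)                # invert back to the initial state
            if (s0 & ((1 << NU) - 1)) == stored[tag]:   # nonce check
                return s0 >> NU           # first kappa bits: the key
    return None

assert attack(K_secret=0b1010) == 0b1010
```

Any of the ρ stored tuples serves as a valid target, which is exactly where the factor-ρ speedup over plain exhaustive search of [s_1]_c comes from.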

Attack Evaluation
In the online phase, the adversary does not strictly need to choose N and M_1; a given list of q different tuples suffices. Thus, the attack is a known-plaintext attack. The data complexity is q one-block messages, and memory to store the q triplets (N_i, M_{i,1}, A_{i,1} A_{i,2} . . .) for i = 1, . . . , q is required. A time complexity of at least q memory accesses is also required. Intuitively, all complexities in the online phase are q.
In the offline phase, because ρ candidates are stored from the online phase and 2^c/ρ guesses are examined, one match is expected. If the internal state values match, the corresponding tag values also match; thus, the right guess is identified. Due to the assumption that the tag size is at least c bits, a match almost certainly indicates the right guess. In addition, we can further filter out false positives with the match of N in the last step. Thus, with very high probability the key is successfully recovered. For the complexity, the only important factor is the time complexity of 2^c/ρ evaluations of the tag generation function.
What remains is to appropriately choose the parameters q and ρ so that the total complexity max{q, 2^c/ρ} is minimized. Suzuki et al. [91,92] showed that, when c ≤ r, the complexity q to find a ρ-collision with probability about 0.5 can be computed explicitly.

c = r. We demonstrate tightness of the bound for the cases c = r = 128, c = r = 256, and c = r = 512. Note that, provided κ is large enough, the bound of Theorem 1 is dominated by 2^c/α with α = 1.4r log_2 r + r − c − 2 (cf. Table 2). In Table 3 we evaluate the attack complexity such that max{q, 2^r/ρ} is minimized. This complexity is always larger than, but very close to, the proven bound, which shows tightness of the security bound.

c < r. It is common practice to enlarge the rate of Sponge-based authenticated encryption so that more data can be processed per permutation call. We demonstrate tightness of our attack for the case c = 256 and r ∈ [257, 768]. Figure 1 depicts the evaluated attack complexity and our security bound for c = 256. For completeness, it also includes the 2^c/r bound of the original ASIACRYPT 2014 article [53], which decreases by approximately a logarithmic factor log_2 r.

Note that the adversary needs to find a multi-collision on r bits within only 2^c trials. When the rate increases, and particularly when r > 2c, the adversary cannot even find an ordinary collision within 2^c trials. In this case, the multi-collision-based attack is no longer effective. Due to this, our bound approaches 2^c when r becomes large. The advantage of the attack comes from the number of generated multi-collisions. Considering that the number of multi-collisions can only take discrete values while our bound ranges continuously, the bound is essentially tight.

c > r. Note that, for c > r, the security bound of Theorem 1 is not dominated by 2^c/α but rather by 2^{b/2}, omitting constants (cf. Table 2). Tightness of the bound follows from a naive attack that aims to find collisions on the b-bit state.
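The trade-off between q and 2^c/ρ can be explored numerically. The sketch below uses the standard ρ-collision complexity estimate q ≈ (ρ!)^{1/ρ} · 2^{r(ρ−1)/ρ} (for success probability about 1/2, commonly attributed to Suzuki et al.; the exact constants may differ slightly from the bound used in the text) and, for given r and c, picks the ρ minimizing max{q, 2^c/ρ}:

```python
import math

def log2_q_rho_collision(rho, r):
    """log2 of the queries for a rho-collision on r bits:
    q ~ (rho!)^(1/rho) * 2^(r(rho-1)/rho)."""
    return (math.lgamma(rho + 1) / math.log(2) + r * (rho - 1)) / rho

def best_attack_complexity(r, c, rho_max=4096):
    """Minimize max{q, 2^c / rho} over the multi-collision threshold rho."""
    best = None
    for rho in range(2, rho_max + 1):
        cost = max(log2_q_rho_collision(rho, r), c - math.log2(rho))
        if best is None or cost < best[1]:
            best = (rho, cost)
    return best   # (optimal rho, log2 of total attack complexity)

rho, log2_cost = best_attack_complexity(r=128, c=128)
```

For c = r the optimum sits slightly below 2^c, consistent with the gain of a factor ρ discussed above; for r ≫ c the online phase dominates and the attack degrades toward 2^c.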

Distinguishing Attacks Without Key Recovery
As shown later in Fig. 4, several practical designs use the key K both for initialization and for tag generation. Those schemes cannot be attacked by a straightforward application of the above generic procedure, yet it is still possible to distinguish them by increasing the attack complexity by only 1 bit or so.
We focus on Ascon, GIBBON, and HANUMAN, in which the use of K in the tag computation prevents the adversary from computing the tag A offline. This can be solved by extending the number of message blocks in each query. Instead of the tag A_{i,1} A_{i,2} . . ., the outer parts of the subsequent blocks [s_{i,2}]_r [s_{i,3}]_r . . . serve as a filter to identify the correct guess. If the number of filtered bits is much larger than c, a match suggests the correct guess with very high probability. Owing to the additional message blocks, the attack complexity increases by 1 bit or so, depending on how many message blocks are added.
In HANUMAN, K can be recovered from the internal state by inverting the permutation back to the initial value. In Ascon and GIBBON, on the other hand, K cannot be recovered, and the adversary can only mount distinguishing attacks.

Other CAESAR Submissions
In this section we discuss how the mode security proof of NORX generalizes to the CAESAR submissions Ascon, the BLNK mode underlying CBEAM/STRIBOB, ICEPOLE, Keyak (v1 only), and two out of the three PRIMATEs. Before doing so, we make a number of observations and note how the proof can accommodate small design differences.
- NORX uses domain separation constants at all rounds, but this is not strictly necessary, and other solutions exist. In the privacy and integrity proofs of NORX, and more specifically in the analysis of state collisions caused by a decryption query in Lemma 4, the domain separations are only needed at the transitions between variable-length inputs, such as header to message data or message to trailer data. This means that the proofs would equally hold if there were simpler transitions at these positions, such as in Ascon. Alternatively, the domain separation can be done by using a different primitive, as in GIBBON and HANUMAN, or a slightly more elaborate padding, as in BLNK, ICEPOLE, and Keyak;
- The extra permutation evaluations at the initialization and finalization of NORX are not strictly necessary: in the proof we consider the monotone event that no state collides, assuming no earlier state collision occurred. For instance, in the analysis of Dhit in the proof of Lemma 4, we necessarily have a new input to p at some point, and consequently all subsequent inputs to p are new (except with some probability);
- NORX starts by initializing the state with init(K, N) = (K ∥ N ∥ 0^{b−κ−ν}) ⊕ const for some constant const and then permuting this value. Placing the key and nonce at different positions of the state does not influence the security analysis. The proof would also work if, for instance, the header were preceded with K ∥ N or a properly padded version thereof and the starting state were 0^b;
- In a similar fashion, there is no problem in defining the tag to be a different τ bits of the final state, for instance the rightmost τ bits;
- Key additions into the inner part after the first permutation are harmless for the mode security proof. Particularly, as long as these are done at fixed positions, they have the same effect as XORing a domain separation constant.
These five modifications allow one to generalize the proof of NORX to Ascon, CBEAM and STRIBOB, ICEPOLE, Keyak, and two PRIMATEs, GIBBON and HANUMAN. The only major difference lies in the fact that none of these designs accommodates a trailer, hence all are functions of the same form, except for one instance of ICEPOLE which accommodates a secret message number. Additionally, these designs have σ_δ ≤ λ_δ + q_δ for δ ∈ {D, E} (or σ_δ ≤ λ_δ + 2q_δ for CBEAM/STRIBOB). We always write H = (H_1, . . . , H_u) and M = (M_1, . . . , M_v) whenever notation permits. In the sections below we elaborate on these designs separately, where we slightly deviate from the alphabetical order to suit the presentation. Diagrams of all modes are given in Fig. 4. The parameters and achieved provable security levels of the schemes are given in Table 1.
We remark that the attack of Sect. 5 carries over to CBEAM and STRIBOB, ICEPOLE, and a simplified version of Keyak v1 (with only one round of key absorption). It does not apply to Ascon, GIBBON, and HANUMAN, due to the additional XOR of the secret key at the end.

Ascon
Ascon is a submission by Dobraunig et al. [33,34] and is depicted in Fig. 4a. It is originally defined based on two permutations p 1 , p 2 that differ in the number of underlying rounds. We discard this difference, considering Ascon with one permutation p.
Ascon initializes its state using init that maps (K , N ) to (0 b−κ−ν K N ) ⊕ const, where const is determined by some design-specific parameters set prior to the security experiment. The header and message can be of arbitrary length and are padded to length a multiple of r bits using 10 * -padding. An XOR with 1 separates header processing from message processing. From the above observations, it is clear that the proofs of NORX directly carry over to Ascon.

ICEPOLE
ICEPOLE is a submission by Morawiecki et al. [65,66] and is depicted in Fig. 4c. It is originally defined based on two permutations, p 1 and p 2 , that differ in the number of underlying rounds. We discard this difference, considering ICEPOLE with one permutation p.
ICEPOLE initializes its state as NORX does, be it with a different constant. The header and message can be of arbitrary length and are padded as follows. Every block is first appended with a frame bit: 0 for header blocks H_1, . . . , H_{u−1} and message block M_v, and 1 for header block H_u and message blocks M_1, . . . , M_{v−1}. Then, the blocks are padded to length a multiple of r bits using 10*-padding. In other words, every padded block of r bits contains at most r − 2 data bits. This form of domain separation using frame bits suffices for the proof to go through.

Keyak

Keyak absorbs the encoded input (K, N, H) and uses frame bits for domain separation: one value for the blocks of the encoding of (K, N, H), and 11 for message blocks M_1, . . . , M_{v−1} and 10 for M_v. Then, the blocks are padded to length a multiple of r bits using 10*1-padding. In other words, every padded block of r bits contains at most r − 2 data bits. This form of domain separation using frame bits suffices for the proof to go through. Due to the above observations, our proof readily generalizes to SpongeWrap [19] and DuplexWrap [22], and thus to Keyak. Without going into detail, we note that the same analysis can be generalized to the parallelized mode of Keyak [22]. Additionally, Keyak also supports sessions, where the state is re-used for a next evaluation. Our proof generalizes to this case, simply with a more extended description of (17).

BLNK (CBEAM and STRIBOB)
CBEAM and STRIBOB are submissions by Saarinen [81,83–86]. Minaud identified an attack on CBEAM [62], but we focus on the modes of operation. Both modes are based on the BLNK Sponge mode [82], which is depicted in Fig. 4b. The BLNK mode initializes its state with 0^b, compresses K into the state (using one or two permutation calls, depending on κ), and does the same with N. Then, the mode is similar to SpongeWrap [19], though using a slightly more involved domain separation system similar to the one of NORX. Due to the above observations, our proof readily generalizes to BLNK [82], and thus to CBEAM and STRIBOB.

PRIMATEs: GIBBON and HANUMAN
PRIMATEs is a submission by Andreeva et al. [2,3] and consists of three algorithms: APE, GIBBON, and HANUMAN. The APE mode is the most robust one; it differs significantly from the other two, and from the other CAESAR submissions discussed in this work, in the way that ciphertexts are derived, and because the mode is secure against nonce-misusing adversaries up to common prefix [4]. (See Sect. 7 for a discussion of APE.) We now focus on GIBBON and HANUMAN, which are depicted in Fig. 4e, f. GIBBON is based on three related permutations p = (p_1, p_2, p_3), where the difference between p_2 and p_3 is used as domain separation of the header compression and message encryption phases (the difference of p_1 from (p_2, p_3) is irrelevant for the mode security analysis). Similarly, HANUMAN uses two related permutations p = (p_1, p_2) for domain separation. GIBBON and HANUMAN initialize their state using init that maps (K, N) to 0^{b−κ−ν} ∥ K ∥ N. The header and message can be of arbitrary length and are padded to length a multiple of r bits using 10*-padding. In case the true header (or message) happens to be a multiple of r bits long, the 10*-padding is considered to spill over into the capacity. From the above observations, it is clear that the proofs of NORX directly carry over to GIBBON and HANUMAN. A small difference appears due to the usage of two different permutations: we need to make two URP-URF switches for each world.

PRIMATEs: APE
Unlike GIBBON and HANUMAN, the APE authenticated encryption scheme follows a different design strategy. It is depicted in Fig. 5. APE is based on one permutation p, and characteristic of the design is the way the ciphertexts are derived and verified. APE uses a key of size c bits, and the initialization init places K in the inner part of the state. If a nonce N is present, it is prepended to the header H, denoted N ∥ H. The nonce is of fixed length, of suggested size 2r bits [2,3]. The header and message can be of arbitrary length and are padded to length a multiple of r bits using 10*-padding. In case the true header (or message) happens to be a multiple of r bits long, the 10*-padding is considered to spill over into the capacity. In case the message is not a multiple of r bits long, the last ciphertext block is derived slightly differently, and we refer to [2,3].
The scheme is designed and proven to be 2^{c/2} secure against nonce-misusing adversaries up to common prefix [4]. We now consider the security of APE in the nonce-respecting setting, and present an adversary that breaks privacy with a complexity of about 2^{c/2}. We assume that the adversary can make blockwise queries to the scheme. In more detail, upon an authenticated encryption of M_1, . . . , M_v, it only needs to input the jth message block after it receives the (j − 1)th ciphertext block, for j = 2, . . . , v.

Proposition 1. Let Π = (E, D) be APE based on an ideal underlying primitive p. Then, where all q_E queries are of length (2^{(c+1)/2} + 1)/q_E + ρ + 1.

Conclusions
In this work we analyzed one Sponge-based authenticated encryption design in detail, NORX, and proved that it achieves security of approximately min{2^{b/2}, 2^c, 2^κ}, significantly improving upon the traditional bound of min{2^{c/2}, 2^κ}. Additionally, we showed that this proof straightforwardly generalizes to five other CAESAR modes: Ascon, BLNK (of CBEAM/STRIBOB), ICEPOLE, Keyak v1, and two of the three PRIMATEs (GIBBON and HANUMAN). Our findings indicate an overly conservative parameter choice made by the designers, implying that some designs can improve speed by a factor of 4 at barely any security loss. It is expected that the security proofs also generalize to the mode of Artemia [1]. However, this mode is based on the JH hash function [96] and XORs data blocks into both the rate and the inner part. It does not use domain separations; rather, it encodes the lengths of the inputs into the padding at the end [9]. Therefore, a generalization of the proof of NORX to Artemia is not entirely straightforward.
The results in this work are derived in the ideal permutation model, where the underlying primitive is assumed to be ideal. We acknowledge that this model does not perfectly reflect the properties of the primitives. For instance, the designers of Ascon, NORX, and PRIMATEs state that non-random (but harmless) properties of the underlying permutations exist. Furthermore, it is important to realize that proofs of security for the modes of operation in the ideal model do not have a direct connection with the security analysis performed on the permutations, as is the case with block cipher modes of operation. Nevertheless, we can use these proofs as heuristics to guide cryptanalysts to focus on the underlying permutations, rather than on the modes themselves.