1 Introduction

A Message Authentication Code (MAC) is an important cryptographic primitive for authenticating a digital message or packet transmitted over an insecure communication channel. When a sender wants to send a message m, she computes a MAC function on input m, the shared secret key k, and possibly an auxiliary input variable \(\nu \) (called a nonce), and obtains a tag t. She then sends \((\nu , m, t)\) to the receiver. Upon receiving it, the receiver verifies the authenticity of \((\nu , m, t)\) by computing the MAC on \((\nu , m, k)\) and checking whether the computed tag \(t'\) matches t.
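The sign-then-verify flow described above can be sketched as follows. This is only an illustration: HMAC-SHA256 from the Python standard library stands in for the MAC function, and the names `sign` and `verify`, the key length and the nonce encoding are assumptions made for the example, not part of any construction discussed in this paper.

```python
import hashlib
import hmac
import os

def sign(key: bytes, nonce: bytes, message: bytes) -> bytes:
    """Compute a tag over (nonce, message); HMAC-SHA256 is only a stand-in MAC."""
    # Length-prefix the nonce so that (nonce, message) is encoded unambiguously.
    data = len(nonce).to_bytes(2, "big") + nonce + message
    return hmac.new(key, data, hashlib.sha256).digest()

def verify(key: bytes, nonce: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag t' and compare it with the received tag t."""
    expected = sign(key, nonce, message)
    return hmac.compare_digest(expected, tag)

key = os.urandom(32)                    # shared secret key k
nonce = (1).to_bytes(12, "big")         # nonce nu, here a simple counter value
message = b"transfer 10 units"
tag = sign(key, nonce, message)         # the sender transmits (nonce, message, tag)
```

The receiver accepts only if `verify(key, nonce, message, tag)` returns `True`; any change to the nonce, the message or the tag makes the recomputed tag differ.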

The Wegman-Carter (WC) MAC [25] is the first example of a nonce-based MAC; it masks the hash value of the message with an encrypted nonce to generate the tag. The WC MAC gives optimal security when the nonce is unique for every authenticated message. However, its security is compromised if the nonce repeats even once: when the Wegman-Carter MAC is instantiated with a polynomial hash, a single repetition of the nonce reveals the hash key of the polynomial hash. Maintaining the uniqueness of the nonce for every authenticated message is nevertheless a challenging task in practice. For example, it is difficult to maintain the uniqueness of the nonce while implementing the cipher in a stateless device or when the nonce is chosen randomly from a small set. The nonce may also repeat accidentally due to a faulty implementation of the cipher or due to a fault caused by resetting the nonce itself [4]. Therefore, protection against nonce-repetition attacks is highly desirable in a nonce-based MAC.
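To illustrate why a repeated nonce is fatal for a Wegman-Carter MAC with a polynomial hash, the toy sketch below uses one-block messages, so the polynomial hash reduces to \(\textsf{H}_k(m) = k \cdot m\). The field GF(2^8), the helper names and the random byte `pad` that stands in for the encrypted nonce are assumptions chosen for the demonstration only.

```python
import secrets

def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (toy field size)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result

def gf_inv(a: int) -> int:
    """Invert a nonzero element by exhaustive search (fine for 256 elements)."""
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)

def wc_tag(hash_key: int, pad: int, block: int) -> int:
    """Toy Wegman-Carter tag: one-block polynomial hash masked by a nonce pad."""
    return gf_mul(hash_key, block) ^ pad

hash_key = secrets.randbelow(255) + 1      # secret hash key
pad = secrets.randbelow(256)               # stands in for the encrypted nonce
m1, m2 = 0x3A, 0xC5                        # two distinct one-block messages
t1 = wc_tag(hash_key, pad, m1)
t2 = wc_tag(hash_key, pad, m2)             # same nonce, hence the same pad

# The pad cancels: t1 ^ t2 = hash_key * (m1 ^ m2), so the key follows by division.
recovered = gf_mul(t1 ^ t2, gf_inv(m1 ^ m2))
```

With two tags under the same nonce, the adversary needs no knowledge of the pad: the xor of the tags depends on the hash key alone.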

As a remedy, the Encrypted Wegman-Carter-Shoup (EWCS) MAC [11] was proposed, which guarantees security even when the nonce repeats. But its security is limited to the birthday bound even when the nonce is unique. To this end, Encrypted Wegman-Carter with Davies-Meyer (EWCDM) [11] and Decrypted Wegman-Carter with Davies-Meyer (DWCDM) [13] have been proposed, which give beyond the birthday bound security when the nonce is unique (Footnote 1) and birthday bound security when the nonce repeats (Footnote 2). However, the security of both constructions falls to the birthday bound with a single accidental repetition of the nonce.

In FSE 2010 [21], Minematsu proposed EHtM, a beyond birthday bound secure probabilistic MAC. It is built upon two independent n-bit keyed functions \(\textsf {F}_{k_1}\) and \(\textsf {F}_{k_2}\) and an n-bit axu hash function \(\textsf {H}_{k_h}\), defined as follows:

$$\textsf {EHtM}[\textsf {F}, \textsf {H}](m) {\mathop {=}\limits ^{\mathrm {\varDelta }}}\big (r, \; \textsf {F}_{k_1}(r) \oplus \textsf {F}_{k_2}(r \oplus \textsf {H}_{k_h}(m))\big ), \quad \text{where } r \text{ is an } n\text{-bit random salt.}$$

This construction has been further analyzed in [15] to improve its security bound. In Eurocrypt 2019, Dutta et al. [16] proposed a nonce-based variant of EHtM, called the nEHtM MAC, where the random salt r is replaced by an \((n-1)\)-bit nonce value \(\nu \) and an n-bit block cipher \(\textsf {E}_k\) is used as the internal primitive instead of two independent n-bit keyed functions. A schematic diagram of nEHtM is shown in Fig. 1. Similar to EWCDM and DWCDM, nEHtM gives beyond the birthday bound (resp. birthday bound) security in the nonce-respecting (resp. nonce-misuse) setting. But, unlike these two constructions, the security of the nEHtM MAC degrades gracefully with the repetition of the nonce: its security remains beyond the birthday bound with a single repetition of the nonce, which is not true for EWCDM and DWCDM. That is, one can get adequate security from nEHtM if the repetition of the nonce occurs in a controlled way. This phenomenon is formally captured by a notion called the faulty nonce model [16]. Informally, a nonce is faulty if it appears in a previous signing query. It is stated in [16] that the faulty nonce model is a weaker notion than multicollision of nonces, a natural and popular metric for measuring nonce misuse. Under the faulty nonce model, Dutta et al. have shown that nEHtM is secure up to roughly \(2^{2n/3}\) queries.

We would like to mention here that this construction was also analyzed by Moch and List [22], in parallel to [16], under the name HPxNP, where two independent n-bit block ciphers are used (as they did not use the domain separation technique). However, Moch and List analyzed its security under the condition that the nonce is unique, whereas Dutta et al. [16] proved that its security degrades gracefully with the repetition of the nonce.

1.1 Permutation Based Cryptography

All the nonce-based MACs discussed above are built on block ciphers as their underlying primitives; moreover, these primitives are evaluated only in the forward direction. As most block ciphers are designed to be efficient in both the forward and the inverse direction, they provide more than such designs need [10]. In contrast, cryptographic permutations are designed to be fast in the forward direction, but not necessarily in the inverse direction. Examples of such permutations include Keccak [2], Gimli [1] and SPONGENT [5]. Moreover, in most cases evaluating an unkeyed public permutation is faster than evaluating a keyed block cipher, as the latter involves evaluating the underlying key scheduling algorithm each time the block cipher is invoked in the design. With the advancement of public permutation-based designs and the efficiency of evaluating them in the forward direction, numerous public permutation-based inverse-free hash and authenticated encryption designs have been proposed. The use of cryptographic permutations gained momentum during the SHA-3 competition [24]. Furthermore, the selection of the permutation-based Keccak sponge function as the SHA-3 standard has given the community a high level of confidence in using this primitive. Today, the permutation-based sponge construction has become a successful and full-fledged alternative to block cipher-based modes. In fact, in the first round of the ongoing NIST light-weight competition [23], 24 out of 57 submissions are based on cryptographic permutations, and 16 of these 24 permutation-based proposals have qualified for the second round. These statistics clearly show the wide adoption of permutation-based designs [1, 3, 7, 8, 12, 14] in the community.
In another direction, a long line of research has studied the design of block ciphers and tweakable block ciphers from public random permutations. The Even-Mansour (EM) cipher [17] and the Iterated Even-Mansour (IEM) cipher [6] are notable approaches in this direction.
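As a small illustration of how a block cipher is obtained from a public permutation, the sketch below implements the single-round Even-Mansour cipher \(\textsf{E}_{k_1,k_2}(x) = \pi (x \oplus k_1) \oplus k_2\). The 8-bit block size, the shuffled lookup table that plays the role of the public permutation \(\pi \), and the key values are assumptions made only to keep the example runnable.

```python
import random

N_BITS = 8  # toy block size; real designs use a fixed public permutation

rng = random.Random(2024)
PI = list(range(2 ** N_BITS))
rng.shuffle(PI)                      # public permutation pi (a random table here)
PI_INV = [0] * len(PI)
for x, y in enumerate(PI):
    PI_INV[y] = x                    # table for pi^{-1}

def em_encrypt(k1: int, k2: int, x: int) -> int:
    """Single-round Even-Mansour: E(x) = pi(x xor k1) xor k2."""
    return PI[x ^ k1] ^ k2

def em_decrypt(k1: int, k2: int, y: int) -> int:
    """Inverse direction, which needs pi^{-1}."""
    return PI_INV[y ^ k2] ^ k1

k1, k2 = 0x5C, 0xA7
ciphertexts = [em_encrypt(k1, k2, x) for x in range(2 ** N_BITS)]
```

Decryption is the only place where \(\pi ^{-1}\) is needed, which is why inverse-free designs call the permutation in the forward direction only.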

Nonce-based MACs using public permutations are mostly designed as sponge-type constructions. The drawbacks of such designs are that (i) they do not use the full size of the permutation for guaranteeing security and (ii) they attain only birthday bound security in the size of the capacity c, i.e., c/2-bit security (except Beetle [7], whose security bound is roughly the size of its capacity). Sponge-type designs that offer c/2-bit security are good in practice when they are instantiated with large permutations such as Keccak [2], whose state size is 1600 bits. But such large permutations are not suitable for use in resource-constrained environments. In such scenarios, one instead aims to use light-weight permutations such as SPONGENT [5] and PHOTON [18], whose state sizes go as low as 88 and 100 bits respectively. If we use these light-weight permutations as the underlying primitives in birthday bound secure sponge-type constructions, the resulting security is inadequate in practice. As a result, sponge-type constructions instantiated with light-weight permutations are not suitable for deployment in resource-constrained environments. Thus, it is natural to ask

Can we design a public permutation-based nonce-based MAC that gives adequate security when instantiated with a light-weight permutation?

This question led us to consider designing a MAC whose security depends on the entire size of the underlying permutation (unlike sponge-type constructions, whose security depends on only a part of it) and crosses the birthday barrier. Coming up with such a design is the goal of this paper. In this direction, Chen et al. [10] have shown two instances of public permutation-based pseudorandom functions that give beyond the birthday bound security with respect to the size of the permutation. We extend this line of research by designing a public permutation-based nonce-based MAC that gives beyond the birthday bound security with respect to the size of the permutation.
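The gap between the two security levels can be made concrete with a little arithmetic. The sketch below compares the c/2-bit birthday level of a sponge-type design with a 2n/3-bit level measured in the full permutation size. It makes the optimistic assumption that the capacity equals the whole state, so the sponge figures are upper limits rather than the levels of any concrete scheme; the function names are illustrative.

```python
def sponge_birthday_bits(capacity_bits: int) -> float:
    """Birthday-bound security level of a sponge-type design: c/2 bits."""
    return capacity_bits / 2

def bbb_two_thirds_bits(state_bits: int) -> float:
    """A 2n/3-bit security level measured in the full permutation size n."""
    return 2 * state_bits / 3

for state_bits in (88, 100, 1600):
    # Optimistic assumption: the capacity c equals the whole state size.
    ceiling = sponge_birthday_bits(state_bits)
    print(state_bits, ceiling, round(bbb_two_thirds_bits(state_bits), 1))
```

For an 88-bit permutation the birthday level cannot exceed 44 bits, whereas a 2n/3-bit level corresponds to roughly 58.7 bits.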

The sole contribution of this paper is the design of a beyond birthday bound secure nonce-based MAC using public random permutations. To this end we propose \(\textsf {nEHtM}_p\), a nonce-based MAC designed using public permutations. As depicted in Fig. 1, our construction structurally resembles the nEHtM MAC [16], where we replace its block cipher with a public random permutation and an appropriate masking of the key.

Fig. 1. (Left): \(\textsf {nEHtM}\) MAC based on block cipher \(\textsf {E}_k\); (Middle): \(\textsf {nEHtM}_p\) MAC based on a single public random permutation \(\pi \); (Right): 2-round iterated Even-Mansour cipher.

Note that, by instantiating the underlying block cipher of the nEHtM MAC with the 2-round iterated Even-Mansour cipher (as shown in Fig. 1), one can easily obtain a public permutation variant of the nEHtM MAC, which is secure beyond the birthday bound (in the faulty nonce model). However, such a transformation requires 4 permutation calls, 7 xor operations and one hash evaluation. In comparison, \(\textsf {nEHtM}_p\) requires only 2 permutation calls, 3 xor operations and one hash evaluation. We show that \(\textsf {nEHtM}_p\) is secure up to roughly \(2^{2n/3}\) queries in the nonce-respecting setting. Moreover, this security bound degrades gracefully under the faulty nonce model [16]. We show the unforgeability of this construction through an extended distinguishing game and apply the expectation method to bound its distinguishing advantage. We also show that our proven security bound is tight by giving a matching attack with roughly \(2^{2n/3}\) query complexity and \(2^{2n-4}\) time complexity (Footnote 3).

2 Preliminaries

For \(n \in \mathbb {N}\), we denote the set of all binary strings of length n and the set of all binary strings of finite arbitrary length by \(\{0,1\}^n\) and \(\{0,1\}^*\) respectively. We often refer to the elements of \(\{0,1\}^n\) as blocks. For an n-bit binary string \(x = (x_{n-1} \ldots x_0)\), \(\textsf {msb}(x)\) denotes the first bit of x in left to right ordering, i.e. \(\textsf {msb}(x) = x_{n-1}\). Moreover, \(\textsf {chop}_{\mathrm {msb}}(x) {\mathop {=}\limits ^{\mathrm {\varDelta }}}(x_{n-2} \ldots x_0)\), i.e., \(\textsf {chop}_{\mathrm {msb}}(x)\) returns the string x with its msb dropped. For any element \(x \in \{0,1\}^*\), |x| denotes the number of bits in x and for \(x, y \in \{0,1\}^*\), \(x \Vert y\) denotes the concatenation of x followed by y. We denote the bitwise xor operation of \(x, y \in \{0,1\}^n\) by \(x \oplus y\). We parse \(x \in \{0,1\}^*\) as \(x = x_1 \Vert x_2 \Vert \ldots \Vert x_l\) where for each \(i = 1, \ldots , l-1\), \(x_i\) is a block and \(1 \le |x_l| \le n\). For a sequence \((x^1, x^2, \ldots , x^s)\) of elements of \(\{0,1\}^*\), \(x_a^i\) denotes the a-th block of the i-th element \(x^i\). For a value s, we denote by \(t \leftarrow s\) the assignment of s to the variable t. For any natural number \(j \in \mathbb {N}\), \(\langle j \rangle _{s}\) denotes the s-bit binary representation of the integer j. For \(i \in \{0,1\}^n\), \(\textsf {left}_k(i)\) represents the leftmost k bits of i. Similarly, \(\textsf {right}_k(i)\) represents the rightmost k bits of i. For any finite set \(\mathcal {X}\), \(X \xleftarrow {\$} \mathcal {X}\) denotes that X is sampled uniformly at random from \(\mathcal {X}\) and \(X_1, \ldots , X_s \xleftarrow {\$} \mathcal {X}\) denotes that the \(X_i\)’s are sampled uniformly and independently from \(\mathcal {X}\). \(\mathbb {F}_{\mathcal {X}}(n)\) denotes the set of all functions from \(\mathcal {X}\) to \(\{0,1\}^n\). We often write \(\mathbb {F}(n)\) when the domain is clear from the context. We denote the set of all permutations over \(\{0,1\}^n\) by \(\mathbb {P}(n)\).
For integers \(1 \le b \le a\), \((a)_b\) denotes the product \(a(a-1) \ldots (a-b+1)\), where \((a)_0 = 1\) by convention and for \(q \in \mathbb {N}\), [q] refers to the set \(\{1, \ldots , q\}\).
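The bit-string notation above can be mirrored by a few small helpers. In this sketch an n-bit string is represented by a Python integer together with its length n; the function names and this representation are assumptions made for illustration.

```python
def msb(x: int, n: int) -> int:
    """Leftmost bit x_{n-1} of an n-bit string."""
    return (x >> (n - 1)) & 1

def chop_msb(x: int, n: int) -> int:
    """Drop the msb, leaving the (n-1)-bit string x_{n-2} ... x_0."""
    return x & ((1 << (n - 1)) - 1)

def left(x: int, n: int, k: int) -> int:
    """Leftmost k bits of an n-bit string."""
    return x >> (n - k)

def right(x: int, k: int) -> int:
    """Rightmost k bits."""
    return x & ((1 << k) - 1)

def encode(j: int, s: int) -> str:
    """<j>_s: the s-bit binary representation of the integer j."""
    return format(j, f"0{s}b")

def falling_factorial(a: int, b: int) -> int:
    """(a)_b = a (a-1) ... (a-b+1), with (a)_0 = 1 by convention."""
    result = 1
    for i in range(b):
        result *= a - i
    return result
```

For example, with x = 1011 and n = 4, `msb` returns 1 and `chop_msb` returns the 3-bit string 011, while `falling_factorial(5, 3)` evaluates the product 5 · 4 · 3.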

2.1 Public Permutation Based Nonce Based MAC

Let \(\mathsf {F}: \mathcal {K}\times \mathcal {N}\times \mathcal {M}\rightarrow \mathcal {T}\) be a keyed function where \(\mathcal {K}, \mathcal {N}, \mathcal {M}\) and \(\mathcal {T}\) are the key space, nonce space, message space and tag space respectively. We assume that \(\mathsf F\) makes internal calls to the public random permutations \(\varvec{\pi }\) = \((\pi _1, \ldots , \pi _d)\) for \(d \ge 1\), where all of the d permutations are independent and uniformly sampled from \(\mathbb {P}(n)\) for some \(n \in \mathbb {N}\). For simplicity, we write \(\textsf {F}^{\varvec{\pi }}_k\) to denote \(\mathsf F\) with uniform k and uniform \(\varvec{\pi }\). Based on \(\textsf {F}^{\varvec{\pi }}_k\), we define the nonce-based message authentication code \(\mathcal {I} = (\mathcal {I}.\textsf {KGen}, \mathcal {I}.\textsf {Sign}, \mathcal {I}.\textsf {Ver})\) built from public permutations as follows: for \(k \in \mathcal {K}\), the signing algorithm \(\mathcal {I}.\textsf {Sign}_{k}\) takes as input \((\nu , m) \in \mathcal {N}\times \mathcal {M}\) and outputs \(t \leftarrow \textsf {F}^{\varvec{\pi }}_k(\nu , m)\), and the verification algorithm \(\mathcal {I}.\mathsf {Ver}_{k}\) takes as input \((\nu , m, t) \in \mathcal {N}\times \mathcal {M}\times \mathcal {T}\) and outputs 1 if \(\mathsf {F}^{\varvec{\pi }}_k(\nu , m) = t\); otherwise it outputs 0.

A signing query \((\nu , m)\) by an adversary \(\mathsf {A}\) is called a faulty query if \(\mathsf {A}\) has already queried the signing algorithm with the same nonce but with a different message. Let \(\mathsf {A}\) be a \((\eta , q_m, q_v, p, \textsf {t})\)-adversary against the unforgeability of \(\mathcal {I}\) with oracle access to the signing algorithm \(\mathcal {I}.\textsf {Sign}_k\), the verification algorithm \(\mathcal {I}.\mathsf {Ver}_{k}\), the d-tuple of permutations \({\varvec{\pi }}\) and their inverses \({\varvec{\pi }}^{-1} = (\pi _1^{-1}, \ldots , \pi _d^{-1})\), such that it makes at most \(\eta \) faulty signing queries out of \(q_m\) signing queries, together with \(q_v\) verification queries and p primitive queries, with running time at most \(\textsf {t}\). \(\mathsf {A}\) is said to be nonce respecting (resp. nonce misusing) if \(\eta = 0\) (resp. \(\eta \ge 1\)). However, \(\mathsf {A}\) may repeat nonces in its verification queries. Moreover, the primitive queries may be interleaved with the signing and the verification queries. \(\mathsf {A}\) is said to forge \(\mathcal {I}\) if, for any of its verification queries (not obtained through a previous signing query), the verification algorithm returns 1. The advantage of \(\mathsf {A}\) against the unforgeability of the nonce-based MAC \(\mathcal {I}\) is defined as

$$\mathbf {Adv}^{\mathrm {nMAC}}_{\mathcal {I}}(\mathsf {A}) {\mathop {=}\limits ^{\mathrm {\varDelta }}}\Pr \left[ \mathsf {A}^{\mathcal {I}.\textsf {Sign}_{k}, \mathcal {I}.\mathsf {Ver}_{k}, \varvec{\pi }, \varvec{\pi ^{-1}}} \text{ forges } \right] ,$$

where the randomness is defined over \(k \xleftarrow {\$} \mathcal {K}\), \(\pi _1, \ldots , \pi _d \xleftarrow {\$} \mathbb {P}(n)\), and the randomness of the adversary (if any). We write

$$\mathbf {Adv}^{\mathrm {nMAC}}_{\mathcal {I}}(\eta , q_m, q_v, p, \textsf {t}) {\mathop {=}\limits ^{\mathrm {\varDelta }}}\max _{\textsf {A}} \mathbf {Adv}^{\mathrm {nMAC}}_{\mathcal {I}}(\mathsf {A}),$$

where the maximum is taken over all \((\eta , q_m, q_v, p, \textsf {t})\)-adversaries \(\mathsf {A}\). In this paper, we omit the time parameter of the adversary, as we assume throughout that the adversary is computationally unbounded. This allows us to assume that the adversary is deterministic.
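The parameter \(\eta \) in the definition above counts the faulty signing queries. A minimal sketch of that bookkeeping, following the definition literally, is given below; the function name and the representation of queries as pairs of byte strings are assumptions made for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

def count_faulty_queries(queries: Iterable[Tuple[bytes, bytes]]) -> int:
    """Count signing queries (nonce, message) whose nonce was already used
    in an earlier signing query with a different message."""
    seen: Dict[bytes, Set[bytes]] = {}
    faulty = 0
    for nonce, message in queries:
        earlier = seen.setdefault(nonce, set())
        if any(m != message for m in earlier):
            faulty += 1
        earlier.add(message)
    return faulty

signing_queries = [
    (b"n1", b"a"),
    (b"n2", b"b"),
    (b"n1", b"c"),   # faulty: n1 was used with message a
    (b"n1", b"a"),   # faulty: n1 was also used with message c
    (b"n3", b"d"),
]
eta = count_faulty_queries(signing_queries)
```

An adversary whose signing queries give a count of zero is nonce respecting in the sense defined above.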

Following [15], to obtain an upper bound for \(\mathbf {Adv}^{\mathrm {nMAC}}_{\mathcal {I}}(\textsf {A})\), we consider a random oracle \(\textsf {RF}\) that samples the tag t independently and uniformly at random from \(\{0,1\}^n\) for every nonce-message pair \((\nu , m)\), and an oracle \(\textsf {Rej}\) that always returns \(\bot \) for any \((\nu , m, t)\). Then, \(\mathbf {Adv}^{\mathrm {nMAC}}_{\mathcal {I}}(\textsf {A})\) is upper bounded by

$$\begin{aligned} \max _{\textsf {A}} \bigg | \Pr \left[ \mathsf {A}^{\mathcal {I}.\textsf {Sign}_{k}, \mathcal {I}.\mathsf {Ver}_{k}, \varvec{\pi }, \varvec{\pi ^{-1}}} \Rightarrow 1 \right] - \Pr \left[ \mathsf {A}^{\textsf {RF}, \textsf {Rej}, \varvec{\pi }, \varvec{\pi ^{-1}}} \Rightarrow 1 \right] \bigg |, \end{aligned}$$
(1)

where \(\textsf {A}^{\varvec{\mathcal {O}}} \Rightarrow 1\) denotes that the adversary \(\textsf {A}\) outputs 1 after interacting with its oracle \({\varvec{\mathcal {O}}}\) (which may be a tuple of oracles).

2.2 Almost Xor Universal and Almost Regular Hash Function

Let \(\mathcal {K}_h\) and \(\mathcal {X}\) be two non-empty finite sets and \(\textsf {H}\) be a keyed function \(\textsf {H} : \mathcal {K}_h\times \mathcal {X}\rightarrow \{0,1\}^n\). Then, \(\textsf {H}\) is said to be an \(\epsilon _{\mathrm {axu}}\)-almost xor universal (axu) hash function, if for any distinct \(x, x' \in \mathcal {X}\) and for any \(\varDelta \in \{0,1\}^n\),

$$\Pr \left[ k_h \xleftarrow {\$} \mathcal {K}_h : \textsf {H}_{k_h}(x) \oplus \textsf {H}_{k_h}(x') = \varDelta \right] \le \epsilon _{\mathrm {axu}}.$$

Moreover, \(\textsf {H}\) is said to be an \(\epsilon _{\mathrm {reg}}\)-almost regular (ar) hash function, if for any \(x \in \mathcal {X}\) and for any \(\varDelta \in \{0,1\}^n\),

$$\Pr \left[ k_h \xleftarrow {\$} \mathcal {K}_h : \textsf {H}_{k_h}(x) = \varDelta \right] \le \epsilon _{\mathrm {reg}}.$$
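As an illustration, recall that the axu property bounds the probability, over the hash key, that \(\textsf {H}_{k_h}(x) \oplus \textsf {H}_{k_h}(x')\) equals a given difference, and that the ar property bounds the probability that \(\textsf {H}_{k_h}(x)\) equals a given value. The sketch below computes both quantities exhaustively for a hypothetical one-block hash \(\textsf {H}_k(x) = k \cdot x\) over the toy field GF(2^4). The field size, the reduction polynomial and the restriction to nonzero inputs for the ar check are assumptions of the example.

```python
from fractions import Fraction
from itertools import product

FIELD_BITS = 4                    # toy size so that exhaustive search is cheap
SIZE = 1 << FIELD_BITS

def gf16_mul(a: int, b: int) -> int:
    """Multiply in GF(2^4) modulo x^4 + x + 1."""
    result = 0
    for _ in range(FIELD_BITS):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & SIZE:
            a ^= 0b10011
    return result

def toy_hash(key: int, x: int) -> int:
    """Hypothetical one-block hash H_k(x) = k * x over GF(2^4)."""
    return gf16_mul(key, x)

# Largest differential probability over distinct inputs and all differences.
eps_axu = max(
    Fraction(sum(toy_hash(k, x) ^ toy_hash(k, y) == d for k in range(SIZE)), SIZE)
    for x, y, d in product(range(SIZE), repeat=3)
    if x != y
)

# Largest output probability; x = 0 is excluded because H_k(0) = 0 for every key.
eps_reg = max(
    Fraction(sum(toy_hash(k, x) == d for k in range(SIZE)), SIZE)
    for x, d in product(range(1, SIZE), range(SIZE))
)
```

Both maxima equal 1/16 here, since for a nonzero field element c the equation k · c = d has exactly one solution k.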

2.3 Expectation Method

The Expectation Method of Hoang and Tessaro [19] was used to derive a tight multi-user security bound for the key-alternating cipher. This technique has subsequently been used in [16, 20]. Let \(\textsf {A}\) be a computationally unbounded deterministic distinguisher that interacts with one of two worlds, \(\mathbf {O}_{\mathrm {re}}\) or \(\mathbf {O}_{\mathrm {id}}\), where these oracles are possibly randomized stateful systems. After the interaction, \(\textsf {A}\) returns a single bit. The interaction between \(\textsf {A}\) and the system results in an ordered sequence of queries and responses, summarized in \(\tau = ((x_1, y_1), (x_2, y_2), \ldots , (x_q, y_q))\), called a transcript, where \(x_i\) is the i-th query of \(\textsf {A}\) and \(y_i\) is the corresponding response of the system with which \(\textsf {A}\) interacts. Let \(\textsf {D}_{\mathrm {re}}\) (resp. \(\textsf {D}_{\mathrm {id}}\)) be the random variable that takes a transcript resulting from the interaction between \(\textsf {A}\) and \(\mathbf {O}_{\mathrm {re}}\) (resp. \(\mathbf {O}_{\mathrm {id}}\)). A transcript \(\tau \) is said to be attainable if \(\Pr [\textsf {D}_{\mathrm {id}} = \tau ] > 0\). Let \(\mathrm {\Theta }\) denote the set of all attainable transcripts.

Let \(\mathsf {\Phi } : \mathrm {\Theta } \rightarrow [0, \infty )\) be a non-negative function which maps any attainable transcript to a non-negative real value. Suppose there is a set of good transcripts \(\mathsf {GoodT} \subseteq \mathrm {\Theta }\) such that for any \(\tau \in \mathsf {GoodT}\),

$$\begin{aligned} \frac{\Pr \left[ \mathsf {D}_{\mathrm {re}} = \tau \right] }{\Pr \left[ \mathsf {D}_{\mathrm {id}} = \tau \right] } \ge 1 - \mathsf {\Phi }(\tau ). \end{aligned}$$
(2)

Then, the statistical distance between \(\textsf {D}_{\mathrm {re}}\) and \(\textsf {D}_{\mathrm {id}}\) can be bounded as

$$\begin{aligned} \varDelta (\mathsf {D}_{\mathrm {re}}, \mathsf {D}_{\mathrm {id}}) \le \mathbf {E}[\mathsf {\Phi }(\mathsf {D}_{\mathrm {id}})] + \Pr [\mathsf {D}_{\mathrm {id}} \in \mathsf {BadT}], \end{aligned}$$
(3)

where \(\mathsf {BadT} {\mathop {=}\limits ^{\mathrm {\varDelta }}}\mathrm {\Theta } \setminus \mathsf {GoodT}\) is the set of all bad transcripts. In other words, the advantage of \(\textsf {A}\) in distinguishing \(\mathbf {O}_{\mathrm {re}}\) from \(\mathbf {O}_{\mathrm {id}}\) is bounded by \(\mathbf {E}[\mathsf {\Phi }(\textsf {D}_{\mathrm {id}})] + \Pr [\textsf {D}_{\mathrm {id}} \in \mathsf {BadT}]\). In the rest of the paper, we write \(\mathrm {\Theta }\), \(\mathsf {GoodT}\) and \(\mathsf {BadT}\) to denote the set of attainable, set of good and set of bad transcripts respectively.
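A toy numerical check may help to see how the bound in Eq. (3) is used. The two transcript distributions below are made up for illustration, one transcript is declared bad, and \(\mathsf {\Phi }\) is taken as the smallest non-negative function satisfying Eq. (2) on the good transcripts (and zero on the bad one); none of these values come from the analysis in this paper.

```python
from fractions import Fraction as F

# Hypothetical transcript distributions over four transcripts.
d_id = {"t1": F(1, 4), "t2": F(1, 4), "t3": F(1, 4), "t4": F(1, 4)}
d_re = {"t1": F(3, 10), "t2": F(1, 5), "t3": F(9, 40), "t4": F(11, 40)}
bad = {"t4"}                          # transcripts declared bad
good = set(d_id) - bad

def phi(tau: str) -> F:
    """Smallest non-negative Phi with Pr[D_re = tau] / Pr[D_id = tau] >= 1 - Phi(tau)."""
    return max(F(0), 1 - d_re[tau] / d_id[tau])

# Statistical distance: half the L1 distance between the two distributions.
stat_dist = sum(abs(d_re[t] - d_id[t]) for t in d_id) / 2

# Right-hand side of Eq. (3), with Phi set to zero on bad transcripts.
expected_phi = sum(d_id[t] * phi(t) for t in good)
bound = expected_phi + sum(d_id[t] for t in bad)
```

In this example the statistical distance is 3/40, while the right-hand side of Eq. (3) evaluates to 13/40, so the inequality holds with room to spare.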

2.4 Sum-Capture Lemma

We use the sum capture lemma by Chen et al. [9]. Informally, the result states that for a random subset \(\mathcal {S}\) of \(\{0,1\}^n\) of size q and for any two arbitrary subsets \(\mathcal {A}\) and \(\mathcal {B}\) of \(\{0,1\}^n\), the size of the set \(\{(s, a, b) \in \mathcal {S} \times \mathcal {A} \times \mathcal {B} : s = a \oplus b\}\) is at most \(q |\mathcal {A}| |\mathcal {B}| / 2^n\), except with negligible probability. In our setting, \(\mathcal {S}\) is the set of tag values \(t_i\), which are sampled with replacement from \(\{0,1\}^n\).

Lemma 1

(Sum-Capture Lemma). Let \(n, q \in \mathbb {N}\) such that \(9n \le q \le 2^{n-1}\). Let \(\mathcal {S} = \{t_1, \ldots , t_q\} \subseteq \{0,1\}^n\), where the \(t_i\)’s are sampled with replacement from \(\{0,1\}^n\). Then, for any two subsets \(\mathcal {A}\) and \(\mathcal {B}\) of \(\{0,1\}^n\), we have

$$\begin{aligned} \Pr [|\{(t, a, b) \in \mathcal {S} \times \mathcal {A} \times \mathcal {B} : t = a \oplus b\}| \ge q |\mathcal {A}| |\mathcal {B}| / 2^n + 3 \sqrt{nq |\mathcal {A}| |\mathcal {B}|}] \le \frac{2}{2^n}, \end{aligned}$$
(4)

where the randomness is defined over the set \(\mathcal {S}\).
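The quantity controlled by Lemma 1 can be computed directly for small parameters. The sketch below samples the \(t_i\)’s with replacement, picks two arbitrary sets, counts the triples with \(t = a \oplus b\) and evaluates the threshold of Eq. (4). The parameter values, the fixed seed and the random choice of the two sets are assumptions of this experiment; a single run illustrates the lemma but proves nothing.

```python
import math
import random

def sum_capture_count(samples, set_a, set_b):
    """Number of triples (t, a, b) with t = a xor b, counted over the sample list."""
    members_b = set(set_b)
    return sum(1 for t in samples for a in set_a if (t ^ a) in members_b)

def sum_capture_threshold(n, q, size_a, size_b):
    """Threshold q|A||B|/2^n + 3 sqrt(n q |A||B|) from Lemma 1."""
    return q * size_a * size_b / 2 ** n + 3 * math.sqrt(n * q * size_a * size_b)

n, q = 10, 200                      # toy parameters with 9n <= q <= 2^(n-1)
rng = random.Random(7)
samples = [rng.randrange(2 ** n) for _ in range(q)]   # with-replacement sample
set_a = rng.sample(range(2 ** n), 32)
set_b = rng.sample(range(2 ** n), 32)

count = sum_capture_count(samples, set_a, set_b)
threshold = sum_capture_threshold(n, q, len(set_a), len(set_b))
```

With these sizes the leading term \(q |\mathcal {A}| |\mathcal {B}| / 2^n\) equals 200, and the observed count should fall well below the threshold.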

3 Solving a System of Affine (Non)-equations

In this section, we present a lower bound on the number of solutions of a system of bi-variate affine equations and bi-variate affine non-equations over a finite number of unknown variables that are sampled without replacement from \(\{0,1\}^n\). This result will be useful for analysing the security of our proposed construction.

Consider an undirected edge-labelled acyclic graph \(G = (\mathcal {V}{\mathop {=}\limits ^{\mathrm {\varDelta }}}\{Y_1, \ldots , Y_{\alpha }\}, \mathcal {F}\sqcup \mathcal {F}', \mathcal {L})\) with edge labelling function \(\mathcal {L}: \mathcal {F}\sqcup \mathcal {F}' \rightarrow \{0,1\}^n\), where the edge set is partitioned into two disjoint sets \(\mathcal {F}\) and \(\mathcal {F}'\). For an edge \(\{Y_i, Y_j\} \in \mathcal {F}\), we write \(\mathcal {L}(\{Y_i,Y_j\}) = \lambda _{ij}\) (and so \(\lambda _{ij} = \lambda _{ji}\)) and \(\mathcal {L}(\{Y_i,Y_j\}) = \lambda '_{ij}\) for all \(\{Y_i,Y_j\} \in \mathcal {F}'\). Let \(G^{=}{\mathop {=}\limits ^{\mathrm {\varDelta }}}(\mathcal {V}, \mathcal {F}, \mathcal {L}_{|\mathcal {F}})\) denote the subgraph of G, where \(\mathcal {L}_{|\mathcal {F}}\) is the function \(\mathcal {L}\) restricted to the set \(\mathcal {F}\). We say G is good if it satisfies the following two conditions: (i) for every path \(P_{st}\) of \(G^{=}\) between vertices \(Y_s\) and \(Y_t\), \(\mathcal {L}(P_{st}) \ne \mathbf {0}\), where \(\mathcal {L}(P_{st}) {\mathop {=}\limits ^{\mathrm {\varDelta }}}\sum _{e \in P_{st}} \mathcal {L}(e) = Y_s \oplus Y_t\), and (ii) for every cycle C in G whose edge set contains exactly one non-equation edge \(e' \in \mathcal {F}'\), \(\mathcal {L}(C) \ne \mathbf {0}\), where \(\mathcal {L}(C) {\mathop {=}\limits ^{\mathrm {\varDelta }}}\sum _{e \in C} \mathcal {L}(e)\). For such a good graph G, the induced system of equations and non-equations is defined as:

$$\mathcal {E}_G = {\left\{ \begin{array}{ll} Y_i \oplus Y_j &{} = \lambda _{ij} \; \forall \; \{Y_i,Y_j\}\in \mathcal {F}, \\ Y_i \oplus Y_j &{} \ne \lambda '_{ij} \; \forall \; \{Y_i,Y_j\}\in \mathcal {F}', \end{array}\right. } $$

The set of components of G is denoted by \(\textsf {comp}(G) = (\textsf {C}_1, \ldots , \textsf {C}_k)\), \(\mu _i\) denotes the size of (i.e. the number of vertices in) the i-th component \(\textsf {C}_i\) and \(\mu _{\max } = \max \{\mu _1, \ldots , \mu _{k}\}\) is the size of the largest component of G. Moreover, \(\rho _i\) denotes the total number of vertices up to the i-th component, with the convention that \(\rho _0 = 0\) (Fig. 2).

Fig. 2. (Left): Graph is a tree of size 4; (Middle): Graph is a cycle of size 3; (Right): Graph with equation edges and a non-equation edge. A continuous red edge represents an equation edge and a dashed blue edge represents a non-equation edge. (Color figure online)

Definition 1

(Injective Solution). With respect to the system of equations and non-equations \(\mathcal {E}_G\) (as defined above), an injective function \(\mathrm {\Phi } : \mathcal {V}\rightarrow \mathcal {R}\), where \(\mathcal {R}\subseteq \{0,1\}^n\), is said to be an injective solution if \(\mathrm {\Phi }(Y_i) \oplus \mathrm {\Phi }(Y_j) = \lambda _{ij}\) for all \(\{Y_i,Y_j\}\in \mathcal {F}\) and \(\mathrm {\Phi }(Y_i) \oplus \mathrm {\Phi }(Y_j) \ne \lambda '_{ij}\) for all \(\{Y_i,Y_j\}\in \mathcal {F}'\).

Theorem 1

Let \(\mathcal {U}= \{u_1, \ldots , u_{\sigma }\}\) be a non-empty finite subset of \(\{0,1\}^n\), for some \(\sigma \ge 0\). Let \(G = (\mathcal {V}, \mathcal {F}\sqcup \mathcal {F}', \mathcal {L})\) be a good graph with \(\alpha \) vertices such that \(|\mathcal {F}| = q_m, |\mathcal {F}'| = q_v\). Let \(\mathsf {comp}(G^{=}) = (\mathsf {C}_1, \ldots , \mathsf {C}_k)\) and \(|\mathsf {C}_i| = \mu _i\), \(\rho _i = (\mu _1 + \cdots + \mu _i)\). Then the total number of injective solutions, chosen from a set \(\mathcal {Z}=\{0,1\}^n \setminus \mathcal {U}\) of size \(2^n -\sigma \), for the induced system of equations and non-equations \(\mathcal {E}_G\) is at least:

(5)

provided \(\rho '_k \mu _{\max } \le 2^n/4\) where \(\rho '_i = \rho _i + \sigma \).

Proof

We prove the result by counting the number of solutions in each of the k components. Let \(\tilde{\mu }_{ij}\) denote the number of edges from \(\mathcal {F}'\) connecting vertices between the i-th and j-th components of \(G^{=}\), and let \(\mu '_i\) be the number of edges in \(\mathcal {F}'\) incident on \(v_i \in \mathcal {V}\setminus G^{=}(\mathcal {V})\). For the first component, the number of solutions is at least \((2^n-\mu _1\sigma )\). We fix such a solution and count the number of solutions for the second component, which is at least \((2^n - \mu _1 \mu _2 - \tilde{\mu }_{12} - \mu _2\sigma )\). To see this, let \(Y_{i_{\mu _1+1}}\) be an arbitrary vertex of the second component and let \(y_{i_{\mu _1+1}}\) be a value assigned to it. This value is valid if the following conditions hold:

  • \(y_{i_{\mu _1+1}} \notin \mathcal {U}\).

  • \(y_{i_{\mu _1+1}}\) does not take \(\mu _1\) values \((y_{i_1}, \ldots , y_{i_{\mu _1}})\) from the first component.

  • It must discard \(\mu _1(\mu _2-1)\) values \((y_{i_1} \oplus \mathcal {L}(P_j), \ldots , y_{i_{\mu _1}} \oplus \mathcal {L}(P_j))\) for all possible paths \(P_j\) from a fixed vertex to any other vertex in the second component.

  • It must discard \(\sigma (\mu _2-1)\) values as \((y_{i_{\mu _1+1}} \oplus \mathcal {L}(P_j)) \notin \mathcal {U}\) for all possible paths \(P_j\) from \(Y_{i_{\mu _1+1}}\) to any other vertex in the second component.

  • \(y_{i_{\mu _1+1}}\) does not take \(\tilde{\mu }_{12}\) values to compensate for the fact that the set of values is no longer a group.

Summing up all the conditions, the number of solutions for the second component is at least \((2^n - \mu _1 \mu _2 - \mu _2\sigma - \tilde{\mu }_{12})\). In general, the number of solutions for the \(i\text{-th }\) component is at least \(\bigg (2^n - \rho _{i-1}\mu _i - \mu _i \sigma - \sum \limits _{j = 1}^{i-1}\tilde{\mu }_{ij} \bigg )\), so the total number of solutions for all k components is at least \(\prod \limits _{i=1}^k \bigg (2^n - \rho _{i-1}\mu _i - \mu _i \sigma - \sum \limits _{j = 1}^{i-1}\tilde{\mu }_{ij} \bigg )\). Suppose there are \(k'\) vertices that do not belong to the set of vertices of the subgraph \(G^{=}\). Fix such a vertex \(Y_{\rho _k+i}\) and assume that \(\mu '_{\rho _k+i}\) blue dashed edges are incident on it. If \(y_{\rho _k+i}\) is a valid solution for the variable \(Y_{\rho _k+i}\), then (a) \(y_{\rho _k+i}\) must be distinct from the previous \(\rho _k\) assigned values, (b) \(y_{\rho _k+i}\) must be distinct from the \((i-1)\) values assigned to the variables that do not belong to the set of vertices of the subgraph \(G^{=}(\mathcal {V})\), (c) \(y_{\rho _k+i}\) must be distinct from the values of \(\mathcal {U}\), and (d) \(y_{\rho _k+i}\) must not take the \(\mu '_{\rho _k+i}\) values excluded by the non-equation edges incident on it. Therefore, the total number of solutions is at least

$$\begin{aligned} h_\alpha \ge \prod \limits _{i=1}^k \bigg (2^n - \rho _{i-1} \mu _i - \mu _i \sigma - \sum \limits _{j = 1}^{i-1}\tilde{\mu }_{ij} \bigg ) \cdot \prod \limits _{i \in [k']} (2^n - \rho _k - \sigma - i + 1 - \mu '_{\rho _k+i}). \end{aligned}$$
(6)
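The counting argument behind Eq. (6) can be checked by brute force on a toy instance. The system below, with \(n = 4\), \(\mathcal {U} = \{0\}\), two equation edges and one non-equation edge, is a hypothetical example chosen for this sketch; it has two components of size 2 and no vertex outside \(G^{=}\), so only the first product of Eq. (6) is involved.

```python
from itertools import permutations

N = 4                                   # toy block size
U = {0}                                 # forbidden values, sigma = 1
Z = [v for v in range(2 ** N) if v not in U]

# Hypothetical good graph on (Y1, Y2, Y3, Y4), indexed 0..3.
equations = [(0, 1, 3), (2, 3, 5)]      # Y1 xor Y2 = 3, Y3 xor Y4 = 5
non_equations = [(1, 2, 6)]             # Y2 xor Y3 != 6

def is_solution(y):
    """Check every equation and non-equation for an assignment y."""
    return (all(y[i] ^ y[j] == lam for i, j, lam in equations)
            and all(y[i] ^ y[j] != lam for i, j, lam in non_equations))

# permutations() yields tuples of distinct values, i.e. injective assignments.
h_alpha = sum(1 for y in permutations(Z, 4) if is_solution(y))

# Eq. (6) with mu_1 = mu_2 = 2, sigma = 1, one non-equation edge between the
# two components, and no vertex outside G^= (k' = 0).
sigma, mu1, mu2, mu_tilde_21 = 1, 2, 2, 1
lower_bound = (2 ** N - mu1 * sigma) * (2 ** N - mu1 * mu2 - mu2 * sigma - mu_tilde_21)
```

For this instance the exhaustive count is 144 injective solutions, against a lower bound of 14 · 9 = 126 from Eq. (6). The gap arises because some of the excluded values coincide.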

Let \(\chi _i {\mathop {=}\limits ^{\mathrm {\varDelta }}}(\tilde{\mu }_{i1} + \ldots + \tilde{\mu }_{i,i-1}), q''_v {\mathop {=}\limits ^{\mathrm {\varDelta }}}(\mu '_{\rho _k+1} + \ldots + \mu '_{\rho _k+k'})\) and \(\rho '_i = \rho _i + \sigma \). After a simple algebraic calculation on Eq. (6), we obtain

$$\begin{aligned} h_{\alpha } \frac{2^{nq_m}}{(2^n-\sigma )_{\alpha }} \ge \underbrace{\prod _{i=1}^k \frac{(2^n - \rho '_{i-1}\mu _i - \chi _i)2^{n(\mu _i-1)}}{(2^n - \rho '_{i-1})_{\mu _i}}}_{\mathsf {D.1}} \underbrace{\prod _{i=1}^{k'} \frac{(2^n - \rho '_{k} - i + 1 - \mu '_{\rho _k + i})}{(2^n - \rho '_{k} - i + 1)}}_{\mathsf {D.2}}. \end{aligned}$$

By expanding \((2^n - \rho '_{i-1})_{\mu _i}\) we have \((2^n - \rho '_{i-1})_{\mu _i} \le 2^{n\mu _i} - 2^{n(\mu _i-1)} \bigg (\rho '_{i-1}\mu _i + {\mu _i \atopwithdelims ()2} \bigg ) + 2^{n(\mu _i-2)} A_i\), where \(A_i = \bigg ({\mu _i \atopwithdelims ()2}(\rho '_{i-1})^2 + {\mu _i \atopwithdelims ()2}(\mu _i-1) \rho '_{i-1} + {\mu _i \atopwithdelims ()2}\frac{(\mu _i-2)(3\mu _i-1)}{12} \bigg )\).

With a simplification on the expression of D.1, we have

where (4) follows from the fact that \(2^{n}(\rho '_{i-1}\mu _i + {\mu _i \atopwithdelims ()2}) - A_i \le 2^{2n}/2\), which holds true when \(\rho '_k \mu _{\max } \le 2^n/4\), (5) holds true due to the fact that \(A_i \le 3(\rho '_{i-1})^2 {\mu _{i} \atopwithdelims ()2}\) and \((\chi _1 + \ldots + \chi _k) = q'_v\), the total number of blue dashed edges across the components of \(G^{=}\) and \(\mu _1 + \ldots + \mu _k \le \alpha \).

For bounding D.2, we have

where (6) follows due to the fact that \((\rho '_k + i - 1) \le 2^n/2\) and (7) follows as we denote \((\mu '_{\rho _k+1} + \ldots + \mu '_{\rho _k+k'}) = q''_v\), the total number of blue dashed edges incident on the vertices outside of the set \(G^{=}(\mathcal {V})\).

Finally, by combining the expression of \(\textsf {D.1}\) and \(\textsf {D.2}\), we have

where \(q_v = q'_v + q''_v\), the total number of non-equation edges in G.    \(\square \)

4 Security of nEHtM in Public Permutation Model

In this section, we first state that \(\textsf {nEHtM}_p\) achieves 2n/3-bit security in the public permutation model under the faulty nonce model. We then demonstrate a matching attack in Subsect. 4.2 to show that the security bound is tight.

4.1 Security of \(\textsf {nEHtM}_p\)

We show that \(\textsf {nEHtM}_p\) is secure against all adversaries that make up to roughly \(2^{2n/3}\) queries in the faulty nonce model. However, similar to nEHtM, the construction admits a birthday bound forging attack when the number of faulty nonces reaches the order of \(2^{n/2}\) [16].

Theorem 2

Let \(\mathcal {M}\) and \(\mathcal {K}_{h}\) be two finite and non-empty sets. Let \(\pi \xleftarrow {\$} \mathbb {P}(n)\) be an n-bit public random permutation and \(\mathsf {H} : \mathcal {K}_h \times \mathcal {M} \rightarrow \{0,1\}^{n-1}\) be an \((n-1)\)-bit \(\epsilon _{\mathrm {axu}}\)-almost xor universal and \(\epsilon _{\mathrm {reg}}\)-almost regular hash function. Moreover, let \(K \xleftarrow {\$} \{0,1\}^{n-1}\) be an \((n-1)\)-bit random key and \(\eta \) be a fixed parameter. Then the forging advantage of any \((\eta , q_m, q_v, 2p)\)-adversary against the construction \(\mathsf {nEHtM}_p[\pi , \mathsf {H}, K]\) that makes at most \(\eta \) faulty queries out of \(q_m\) signing queries, together with \(q_v\) verification queries and altogether 2p primitive queries, is given by

$$\begin{aligned} \mathbf {Adv}^{\mathrm {MAC}}_{\mathsf {nEHtM}_p}(\eta , q_m, q_v, 2p)\le & {} \frac{12\eta ^2}{2^{2n}}\bigg (q_m + 2p\bigg )^2 + \bigg (p+q_m\bigg ) \bigg (\frac{192pq_m}{2^{2n}} + \frac{48pq^2_m\epsilon _{\mathrm {axu}}}{2^{2n}} \bigg ) \\+ & {} \frac{48q^3_m}{2^{2n}} + \frac{12q^4_m\epsilon _{\mathrm {axu}}}{2^{2n}} + \frac{2q_v}{2^n} + \frac{p^2 \epsilon _{\mathrm {reg}}}{2^n}\bigg (3q_m + 2q_v\bigg ) + \frac{q_m}{2^n} \\+ & {} \epsilon _{\mathrm {axu}} \bigg (\frac{4q^3_m}{2^n} + 2 \eta q_m + \frac{pq^2_m}{2^n} + \frac{q^2_m}{2^{n+1}} + (\eta + 1)q_v \bigg ) \\+ & {} \epsilon _{\mathrm {reg}}(2 \eta p + p \sqrt{3nq_m}) + \frac{2p^2q_m}{2^{2n}} + \frac{2 + 2\eta }{2^n} + \frac{2p \sqrt{3nq_m}}{2^n}. \end{aligned}$$

By assuming \(\epsilon _{\mathrm {axu}} \approx 2/2^{n}\) and \(\epsilon _{\mathrm {reg}} \approx 2/2^{n}\), the above bound is simplified to

$$\begin{aligned} \mathbf {Adv}^{\mathrm {MAC}}_{\mathsf {nEHtM}_p}(\eta , q_m, q_v, 2p)\le & {} \frac{80q_m^3}{2^{2n}} + \frac{4(q_m + q_v)}{2^n} + \frac{4p \sqrt{3nq_m}}{2^n} + \frac{12\eta ^2}{2^{2n}}\bigg (q_m+2p\bigg )^2 \\+ & {} (p+q_m)\bigg (\frac{200pq_m}{2^{2n}} + \frac{96pq_m^2}{2^{3n}} + \frac{4\eta }{2^n}\bigg ) + \frac{2\eta q_v}{2^n} + \frac{4p^2q_v}{2^{2n}} \\+ & {} \frac{2}{2^n} + \frac{2\eta }{2^n} . \end{aligned}$$

We defer the proof of this theorem to Sect. 5. The forging advantage of \(\textsf {nEHtM}_p\) for \(\eta \le 2^{n/3}\), \(q_m \le 2^{2n/3}\) and \(p \le 2^{2n/3}\) is thus bounded by

$$\begin{aligned} \mathbf {Adv}^{\mathrm {MAC}}_{\mathsf {nEHtM}_p}(q_m, q_v, 2p) \le \bigg (\frac{29q_m}{2^{2n/3}} + \frac{6q_v}{2^{2n/3}} + \frac{28p}{2^{2n/3}}\bigg ) + \frac{296p^2q_m}{2^{2n}} + \frac{296pq^2_m}{2^{2n}} + \frac{4p^2q_v}{2^{2n}} + \frac{4}{2^{2n/3}}. \end{aligned}$$
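To get a feeling for where the simplified bound of Theorem 2 becomes vacuous, it can be evaluated numerically. The following is a minimal sketch, assuming \(\epsilon _{\mathrm {axu}} = \epsilon _{\mathrm {reg}} = 2/2^n\) as in the simplification above; the function name `simplified_bound` and the use of floating point arithmetic are choices made for this illustration only.

```python
import math

def simplified_bound(n, eta, q_m, q_v, p):
    """Right-hand side of the simplified bound of Theorem 2 (floating point)."""
    N = 2.0 ** n
    root = math.sqrt(3 * n * q_m)          # the sqrt(3 n q_m) factor
    return (
        80 * q_m**3 / N**2
        + 4 * (q_m + q_v) / N
        + 4 * p * root / N
        + 12 * eta**2 * (q_m + 2 * p) ** 2 / N**2
        + (p + q_m) * (200 * p * q_m / N**2 + 96 * p * q_m**2 / N**3 + 4 * eta / N)
        + 2 * eta * q_v / N
        + 4 * p**2 * q_v / N**2
        + 2 / N
        + 2 * eta / N
    )

if __name__ == "__main__":
    n = 96                                  # toy choice with 2n/3 = 64
    for e in (40, 50, 60, 64):
        q = 2 ** e
        print(e, simplified_bound(n, 0, q, q, q))
```

With these toy parameters the value stays far below 1 while \(q_m\) and p are well under \(2^{2n/3}\), and it exceeds 1 once they reach \(2^{2n/3}\), which matches the 2n/3-bit security claim.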

4.2 Matching Attack on \(\textsf {nEHtM}_p\)

In this section, we show a matching attack on \(\textsf {nEHtM}_p\) with \(2^{2n/3}\) signing queries and a total of \(2^{2n/3}+2\) primitive queries. To carry out the attack, we consider the following version of the Polyhash function, a specific instantiation of an almost xor universal (axu) and almost regular (ar) hash function: for a message m, if the size of m is not a multiple of n, where n is the key size of the hash function, then we first apply an injective padding (e.g., \(10^*\)) to it to generate a padded message \(m'\). Then the output of the hash function for \(m'\) is computed as follows:

$$\textsf {Poly}_{k_h}(m') = k_h^{l+1} \oplus k_h^{l} \cdot m'_{l} \oplus k_h^{l-1} m'_{l-1} \oplus \ldots \oplus k_h \cdot m'_{1},$$

where l denotes the number of message blocks of \(m'\) and \(m'_i\) denotes the i-th message block of \(m'\). It is easy to see that this hash function is both \((l_{\max }+1)/2^n\)-axu and \((l_{\max }+1)/2^n\)-ar, where \(l_{\max }\) is the maximum number of message blocks allowed. With this instance of the hash function in \(\textsf {nEHtM}_p\), we mount the following attack, which exploits bad event B.1. We construct a deterministic adversary \(\textsf {A}\) that forges \(\textsf {nEHtM}_p\) by making \(2^{2n/3}\) signing queries and a total of \(2^{2n/3} + 2\) primitive queries to \(\pi \) as follows:

Attack Algorithm:

  1. \(\textsf {A}\) first chooses a single-block message m consisting of all zeroes, i.e., \(m = 0^n\).

  2. Then \(\textsf {A}\) makes \(2^{2n/3}\) signing queries with \((\nu _j, m)\) and obtains the tag \(t_j\) for \(j \in [2^{2n/3}]\), where \(\nu _j = 0^{n/3-1} \Vert \langle j \rangle _{2n/3}\).

  3. \(\textsf {A}\) makes \(2^{2n/3-1}\) forward primitive queries to \(\pi \) with \(x^{1}_j\) and obtains the output \(y^{1}_j\) for \(j \in [2^{2n/3-1}]\), where \(x^1_j = 0 \Vert \langle j \rangle _{2n/3-1} \Vert 0^{n/3}\).

  4. \(\textsf {A}\) makes another \(2^{2n/3-1}\) forward primitive queries to \(\pi \) with \(x^{2}_j\) and obtains the output \(y^{2}_j\) for \(j \in [2^{2n/3-1}]\), where \(x^2_j = 1 \Vert \textsf {left}_{n/3-1}(\langle j \rangle _{2n/3-1}) \Vert 0^{n/3} \Vert \textsf {right}_{n/3}(\langle j \rangle _{2n/3-1})\).

  5. Then, \(\textsf {A}\) finds a triplet \((i, j, \ell ) \in [2^{2n/3}] \times [2^{2n/3-1}] \times [2^{2n/3-1}]\) such that \(t_i = y^1_j \oplus y^2_{\ell }\).

  6. \(\textsf {A}\) makes two additional forward primitive queries to \(\pi \) with \(x^1_{\star } = x^1_j \oplus 0 \Vert 1^{n-1}\) and \(x^2_{\star } = x^2_{\ell } \oplus 0 \Vert 1^{n-1}\). Let the received responses be \(y^1_{\star }\) and \(y^2_{\star }\) respectively.

  7. Finally, \(\textsf {A}\) forges with \((\nu _i \oplus 1^{n-1}, m, y^1_{\star } \oplus y^2_{\star })\).
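The steps above can be sketched on a toy instance. The code below is an illustration under several assumptions that are not part of the paper: the block size is n = 6, the permutation is a shuffled lookup table, the value `h` stands in for the hash of \(m = 0^n\) treated as an \((n-1)\)-bit string, index sets are enumerated from 0, and the adversary tries one forgery for every triplet found in step 5 rather than a single one. All function names are hypothetical.

```python
import random

N = 6                 # toy block size n (a multiple of 3); real n is much larger
T = N // 3
MSB = 1 << (N - 1)    # the leading bit in "1 || ..."
ONES = MSB - 1        # 1^{n-1}; read as n bits it is Delta = 0 || 1^{n-1}
LOW = (1 << T) - 1    # mask for the rightmost n/3 bits

def make_toy_instance(seed):
    """Hypothetical oracles: pi, a signing oracle for m = 0^n, and a verifier."""
    rng = random.Random(seed)
    table = list(range(1 << N))
    rng.shuffle(table)                               # random n-bit permutation
    k, h = rng.randrange(MSB), rng.randrange(MSB)    # h stands in for H(0^n)
    pi = table.__getitem__
    sign = lambda nu: pi(nu ^ k) ^ pi(MSB | (nu ^ h))
    verify = lambda nu, tag: sign(nu) == tag
    return pi, sign, verify

def run_attack(pi, sign):
    # Step 2: nonces 0^{n/3-1} || <j>_{2n/3}
    tags = {nu: sign(nu) for nu in range(1 << (2 * T))}
    # Step 3: x1 = 0 || <j>_{2n/3-1} || 0^{n/3}
    y1 = {j << T: pi(j << T) for j in range(1 << (2 * T - 1))}
    # Step 4: x2 = 1 || left(j) || 0^{n/3} || right(j)
    y2 = {}
    for j in range(1 << (2 * T - 1)):
        x2 = MSB | ((j >> T) << (2 * T)) | (j & LOW)
        y2[x2] = pi(x2)
    # Step 5: every triplet with t_i = y1_j xor y2_l
    candidates = [(nu, x1, x2)
                  for nu, t in tags.items()
                  for x1, a in y1.items()
                  for x2, b in y2.items() if t == a ^ b]
    # Steps 6-7: shift both permutation inputs by Delta and forge
    forgeries = [(nu ^ ONES, pi(x1 ^ ONES) ^ pi(x2 ^ ONES))
                 for nu, x1, x2 in candidates]
    return tags, forgeries

if __name__ == "__main__":
    pi, sign, verify = make_toy_instance(seed=1)
    tags, forgeries = run_attack(pi, sign)
    print(any(nu not in tags and verify(nu, tag) for nu, tag in forgeries))
```

In this toy setting the triplet whose permutation inputs actually coincide with those of a signing query is always among the candidates, so at least one of the attempted forgeries is expected to verify.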

Analysis of the Forging Advantage. We first note that \(\nu _j, x^1_j\) and \(x^2_j\) have the following structure:

$$\begin{aligned} \nu = \bigg \{\underbrace{0~0~ \ldots ~0}_{n/3-1} \Vert \underbrace{\star ~\star ~ \ldots ~\star }_{n/3} \Vert \underbrace{\star ~\star ~ \ldots ~\star }_{n/3} \bigg \}, ~~ x^1 = \bigg \{0 \Vert \underbrace{\star ~\star ~ \ldots ~\star }_{n/3-1} \Vert \underbrace{\star ~\star ~ \ldots ~\star }_{n/3} \Vert \underbrace{0~0~ \ldots ~0}_{n/3}\bigg \}. \end{aligned}$$
$$\begin{aligned} x^2 = \bigg \{1 \Vert \underbrace{\star ~\star ~ \ldots ~\star }_{n/3-1} \Vert \underbrace{0~0~ \ldots ~0}_{n/3} \Vert \underbrace{\star ~\star ~ \ldots ~\star }_{n/3}\bigg \}. \end{aligned}$$

Note that the number of pairs \((\nu _i, x^1_j)\) that satisfy the relation \(0 \Vert (\nu _i \oplus k) = x^1_j\) is exactly \(2^{n/3}\). As a result, the expected number of triplets \((i, j, \ell )\) that satisfy \(0 \Vert (\nu _i \oplus k) = x^1_j\) and \(1 \Vert (\nu _i \oplus k_h^2) = x^2_{\ell }\) is exactly 1. For this particular triplet \((i, j, \ell )\) that satisfies the relation, \(\textsf {A}\) makes two additional forward primitive queries to \(\pi \) with \(x^1_{\star } = x^1_j \oplus \varDelta \) and \(x^2_{\star } = x^2_{\ell } \oplus \varDelta \), where \(\varDelta = 0 \Vert 1^{n-1}\). Thus, if \(\textsf {A}\) makes a forging query with \(\nu _i \oplus 1^{n-1}\) (which is distinct from all nonces used in the signing queries) and with the same message \(m = 0^n\), then we have

$$\begin{aligned}&\pi (0 \Vert (\nu _i \oplus 1^{n-1} \oplus k)) \oplus \pi (1 \Vert (\nu _i \oplus 1^{n-1} \oplus k_h^2)) \\= & {} \pi ((0 \Vert (\nu _i \oplus k)) \oplus \varDelta ) \oplus \pi ((1 \Vert (\nu _i \oplus k_h^2)) \oplus \varDelta ) = \pi (x^1_{\star }) \oplus \pi (x^2_{\star }) = y^1_{\star } \oplus y^2_{\star } \end{aligned}$$

which makes \((\nu _i \oplus 1^{n-1}, m, y^1_{\star } \oplus y^2_{\star })\) a valid and successful forging attempt. Note that the number of signing queries required is \(2^{2n/3}\) and the total number of primitive queries required is \(2^{2n/3}+2\). However, the time complexity of this attack is \(2^{2n-2}\).
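The counting claims of this analysis can be checked by brute force for small n. The sketch below is illustrative only: it assumes that the index sets cover every encoding (indices start at 0), writes `h` for the \((n-1)\)-bit hash value of \(m = 0^n\), and uses the hypothetical names `count_matches` and `search_space`.

```python
def count_matches(n, k, h):
    """Count nonce/primitive-input matches for (n-1)-bit values k and h."""
    t = n // 3
    low = (1 << t) - 1
    nonces = range(1 << (2 * t))                         # 0^{n/3-1} || <i>
    x1_hat = {j << t for j in range(1 << (2 * t - 1))}   # x^1_j without its msb
    x2_hat = {((j >> t) << (2 * t)) | (j & low)
              for j in range(1 << (2 * t - 1))}          # x^2_l without its msb
    # Each nonce matches at most one x^1_j, so counting nonces counts pairs.
    pairs = sum(1 for nu in nonces if (nu ^ k) in x1_hat)
    triples = sum(1 for nu in nonces
                  if (nu ^ k) in x1_hat and (nu ^ h) in x2_hat)
    return pairs, triples

def search_space(n):
    """Number of (i, j, l) combinations examined in step 5."""
    t = n // 3
    return (1 << (2 * t)) * (1 << (2 * t - 1)) ** 2

if __name__ == "__main__":
    print(count_matches(6, k=0b10110, h=0b01101), search_space(6))
```

Under these assumptions the number of matching pairs is \(2^{n/3}\) and there is a single matching triplet for every choice of k and h, while the search space of step 5 has size \(2^{2n-2}\), in line with the stated time complexity.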

5 Proof of Theorem 2: MAC Security of \(\textsf {nEHtM}_p\)

Due to Eq. (1), we bound the distinguishing advantage instead of bounding the forging advantage of \(\textsf {nEHtM}_p\). For this, we consider any information-theoretic deterministic distinguisher \(\textsf {A}\) that has access to the following oracles in either the real world or the ideal world: in the real world it has access to \((\textsf {nEHtM}_p.\textsf {Sig}^{\pi }_{(k, k_h)}, \textsf {nEHtM}_p.\textsf {Ver}^{\pi }_{(k, k_h)}, \pi , \pi ^{-1})\); in the ideal world it has access to \((\textsf {RF}, \textsf {Rej}, \pi , \pi ^{-1})\). We summarize the interactions of the distinguisher with its oracles in a transcript \(\tau _m \cup \tau _v\), where \(\tau _m {\mathop {=}\limits ^{\mathrm {\varDelta }}}\{(\nu _1, m_1, t_1), \ldots , (\nu _{q_m}, m_{q_m}, t_{q_m})\}\) is the MAC transcript and \(\tau _v {\mathop {=}\limits ^{\mathrm {\varDelta }}}\{(\nu '_1, m'_1, t'_1, b'_1), \ldots , (\nu '_{q_v}, m'_{q_v}, t'_{q_v}, b'_{q_v})\}\) is the verification transcript. Primitive queries to \(\pi \) are summarized in two lists of the form \(\tau ^{(1)}_{p} {\mathop {=}\limits ^{\mathrm {\varDelta }}}\{(x^1_1, y^1_1), \ldots , (x^1_p, y^1_p)\}\) and \(\tau ^{(2)}_p {\mathop {=}\limits ^{\mathrm {\varDelta }}}\{(x^2_1, y^2_1), \ldots , (x^2_p, y^2_p)\}\), where \(\mathsf {msb}(x^1_i) = 0\) and \(\mathsf {msb}(x^2_i) = 1\). We assume that none of the transcripts contains any duplicate elements. After the interaction, we reveal the keys \(k, k_h\) to the distinguisher (before it outputs its decision); these are the keys used in the construction in the real world and uniformly sampled dummy keys in the ideal world. The complete view is denoted by \(\tau ' = (\tau _m, \tau _v, \tau ^{(1)}_p, \tau ^{(2)}_p, k, k_h)\).

5.1 Definition and Probability of Bad Transcripts

For notational simplicity, we denote \(\textsf {H}_{k_h}(m_i) = \textsf {H}_i\). \(\hat{x}^b_i\) denotes \(\textsf {chop}_{\mathrm {msb}}(x^b_i)\) for \(b = 1, 2\). We also define three sets: (a) \(\mathcal {T} {\mathop {=}\limits ^{\mathrm {\varDelta }}}\{t_i : (\nu _i, m_i, t_i) \in \tau _m\}\), (b) \(\mathcal {Y}_1 {\mathop {=}\limits ^{\mathrm {\varDelta }}}\{y^1_i : (x^1_i, y^1_i) \in \tau ^{(1)}_p\}\) and (c) \(\mathcal {Y}_2 {\mathop {=}\limits ^{\mathrm {\varDelta }}}\{y^2_i : (x^2_i, y^2_i) \in \tau ^{(2)}_p\}\). The main idea behind the bad events is to rule out collisions between the permutation inputs of the construction and the primitive queries: such a collision determines the corresponding tag, so the tag loses its randomness, which in turn helps the adversary to distinguish the output from random.

Definition 2

(Bad Transcript for \({{\mathbf {\mathsf{{nEHtM}}}}}_p\)). Given a parameter \(\xi \in \mathbb {N}\), where \(\xi \ge \eta \), an attainable transcript \(\tau '= (\tau _m, \tau _v, \tau ^{(1)}_p, \tau ^{(2)}_p, k, k_h)\) is called a bad transcript if any one of the following holds:

  • \(\mathsf {B.1}\) : \(\exists \) \(i \in [q_m], j, \ell \in [p]\) such that \(\nu _i \oplus k = \hat{x}^1_j, \nu _i \oplus \mathsf {H}_i = \hat{x}^2_{\ell }\).

  • \(\mathsf {B.2}\) : \(\exists \) \(i, j, \ell \in [q_m], i \ne j, i \ne \ell \) such that \(\nu _i = \nu _j\) and \(\nu _i \oplus \mathsf {H}_i = \nu _{\ell } \oplus \mathsf {H}_{\ell }\).

  • \(\mathsf {B.3}\) : \(\exists \) \(i \ne j \in [q_m], \ell \in [p]\) such that \(\nu _i \oplus k = \hat{x}^1_{\ell }\) and \(\nu _i \oplus \mathsf {H}_i = \nu _j \oplus \mathsf {H}_j\).

  • \(\mathsf {B.4}\) : \(\exists \) \(i \ne j \in [q_m], \ell \in [p]\) such that \(\nu _i = \nu _j\) and \(\nu _i \oplus \mathsf {H}_i = \hat{x}^2_{\ell }\).

  • \(\mathsf {B.5}\) : \(\exists \) \(i \ne j \in [q_m]\) such that \(\nu _i = \nu _j\) and \(t_i = t_j\).

  • \(\mathsf {B.6}\) : \(\exists \) \(i \ne j \in [q_m]\) such that \(\nu _i \oplus \mathsf {H}_i = \nu _j \oplus \mathsf {H}_j\) and \(t_i = t_j\).

  • \(\mathsf {B.7}\) : \(\#\{(t_i, y^1_j, y^2_{\ell }) \in \mathcal {T} \times \mathcal {Y}_1 \times \mathcal {Y}_2 : t_i = y^1_j \oplus y^2_{\ell }\} \ge p^2q_m/2^n + p \sqrt{3nq_m}\).

  • \(\mathsf {B.8}\) : \(\exists \) \(i \in [q_m], j, \ell \in [p]\) such that \(\nu _i \oplus k = \hat{x}^1_j, y^1_j \oplus t_i = y^2_{\ell }\).

  • \(\mathsf {B.9}\) : \(\exists \) \(i \in [q_m], j, \ell \in [p]\) such that \(\nu _i \oplus \mathsf {H}_i = \hat{x}^2_j, y^2_j \oplus t_i = y^1_{\ell }\).

  • \(\mathsf {B.10}\) : \(\exists \) \(\{i_1,\ldots ,i_{\xi +1}\} \subseteq [q_m]\) such that \(\nu _{i_1} \oplus \mathsf {H}_{i_1} = \nu _{i_2} \oplus \mathsf {H}_{i_2} = \ldots = \nu _{i_{\xi +1}} \oplus \mathsf {H}_{i_{\xi +1}}\) (the optimal value of \(\xi \) shall be determined later in the proof).

  • \(\mathsf {B.11}\) : \(\exists \) \(a \in [q_v]\), \(\exists \) \(i \in [q_m]\) such that \(\nu _i = \nu '_a\), \(\nu _i \oplus \mathsf {H}_i = \nu '_a \oplus \mathsf {H}'_a\) and \(t_i = t'_a\).

  • \(\mathsf {B.12}\) : \(\exists \) \(a \in [q_v]\), \(\exists \) \(j, \ell \in [p]\) such that \(\nu '_a \oplus k = \hat{x}^1_j\), \(\nu '_a \oplus \mathsf {H}'_a = \hat{x}^2_{\ell }\) and \(t'_a = y^1_j \oplus y^2_{\ell }\).

  • \(\mathsf {B.13}\) : \(\exists \) \(i \in [q_m]\) such that \(t_i = 0^n\).
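Some of these conditions depend only on the visible part of the transcript and are easy to test mechanically. As an illustration, the following sketch checks \(\mathsf {B.7}\) and \(\mathsf {B.13}\) for values encoded as Python integers; the function names are hypothetical and the encoding of n-bit strings as integers is an assumption of this example.

```python
import math
from collections import Counter

def b7_count(tags, y1, y2):
    """Number of (t, y1, y2) in T x Y_1 x Y_2 with t = y1 xor y2."""
    xor_counts = Counter(a ^ b for a in y1 for b in y2)
    return sum(xor_counts[t] for t in tags)

def b7_threshold(n, p, q_m):
    """Threshold p^2 q_m / 2^n + p sqrt(3 n q_m) from the definition of B.7."""
    return p * p * q_m / 2 ** n + p * math.sqrt(3 * n * q_m)

def is_b7(n, p, q_m, tags, y1, y2):
    # T, Y_1 and Y_2 are sets, so duplicates are collapsed first.
    return b7_count(set(tags), set(y1), set(y2)) >= b7_threshold(n, p, q_m)

def is_b13(tags):
    """B.13: some signing query returned the all-zero tag."""
    return any(t == 0 for t in tags)

if __name__ == "__main__":
    print(b7_count({1, 2}, {0, 3}, {1, 2}), is_b13([5, 0]))
```

Precomputing the multiset of xors \(y^1_j \oplus y^2_{\ell }\) keeps the count at roughly \(p^2 + q_m\) operations instead of iterating over all \(p^2 q_m\) triples.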

Lemma 2

Let \(\mathsf {D}_{\mathrm {id}}\) and \(\mathsf {BadT}\) be defined as in Sect. 2.3. Then

$$\begin{aligned} \Pr [\mathsf {D}_{\mathrm {id}} \in \mathsf {BadT}]\le & {} \frac{p^2 \epsilon _{\mathrm {reg}}}{2^n}(3q_m + 2q_v) + \epsilon _{\mathrm {axu}} \bigg (\frac{q^2_m}{2\xi } + 2 \eta q_m + \frac{pq^2_m}{2^n} + \frac{q^2_m}{2^{n+1}} + (\eta + 1)q_v \bigg ) \\+ & {} \epsilon _{\mathrm {reg}}(2 \eta p + p \sqrt{3nq_m}) + \frac{2p^2q_m}{2^{2n}} + \frac{2 + 2\eta }{2^n} + \frac{2p \sqrt{3nq_m}}{2^n} + \frac{q_m}{2^n}. \end{aligned}$$

Proof of the lemma can be found in Sect. 6.

5.2 Analysis of Good Transcripts

For a good transcript \(\tau ' = (\tau _m, \tau _v, \tau ^{(1)}_p, \tau ^{(2)}_p, k, k_h)\), the ideal interpolation probability is

$$\begin{aligned}&\mathsf {p}_{\mathrm {id}}(\tau ') {\mathop {=}\limits ^{\mathrm {\varDelta }}}\Pr [\mathsf {D}_{\mathrm {id}} = \tau '] = \frac{1}{|\mathcal {K}_h|} \cdot \frac{1}{2^{n-1}} \cdot \frac{1}{2^{nq_m}} \cdot \frac{1}{(2^n)_{2p}}. \end{aligned}$$
(7)

To compute the real interpolation probability, we regroup the elements of \(\tau _m, \tau ^{(1)}_p\) and \(\tau ^{(2)}_p\) into three new transcripts \(\hat{\tau }_m, \hat{\tau }^{(1)}_p\) and \(\hat{\tau }^{(2)}_p\) in the following way: initially, the new transcripts are set to the old ones. Now, for each \((\nu _i, m_i, t_i) \in \tau _m\): (a) if \(\nu _i \oplus k = \hat{x}^1_j\) for some j, then \(\hat{\tau }_m \leftarrow \hat{\tau }_m \setminus \{(\nu _i, m_i, t_i)\}\) and \(\hat{\tau }^{(2)}_p \leftarrow \hat{\tau }^{(2)}_p \cup \{(1 \Vert (\nu _i \oplus \textsf {H}_i), t_i \oplus y^1_j)\}\); (b) if \(\nu _i \oplus \textsf {H}_i = \hat{x}^2_j\) for some j, then \(\hat{\tau }_m \leftarrow \hat{\tau }_m \setminus \{(\nu _i, m_i, t_i)\}\) and \(\hat{\tau }^{(1)}_p \leftarrow \hat{\tau }^{(1)}_p \cup \{(0 \Vert (\nu _i \oplus k), t_i \oplus y^2_j)\}\). Since \(\tau '\) is a good transcript, it does not meet any of the bad conditions listed in Definition 2. In case (a), if \(\nu _i \oplus k = \hat{x}^1_j\), then \(\nu _i \oplus \textsf {H}_i\) cannot collide with any \(\hat{x}^2_{\ell }\) (due to \(\lnot \textsf {B.1}\)) and \(y^1_j \oplus t_i\) cannot collide with any \(y^2_{\ell }\) (due to \(\lnot \textsf {B.8}\)), so the pair added to \(\hat{\tau }^{(2)}_p\) is fresh. The symmetric argument, using \(\lnot \textsf {B.1}\) and \(\lnot \textsf {B.9}\), applies to the pair added to \(\hat{\tau }^{(1)}_p\) in case (b). This way, we end up with soundly defined \(\hat{\tau }^{(1)}_p\) and \(\hat{\tau }^{(2)}_p\) and a set of signing queries \(\hat{\tau }_m\) that does not collide with any tuple in \(\hat{\tau }^{(1)}_p\) or \(\hat{\tau }^{(2)}_p\).
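The regrouping step can be sketched as a short procedure. This is a minimal illustration, assuming that n-bit strings are encoded as integers, that the primitive-query lists are stored as dictionaries from input to output, and that `hash_of` is a hypothetical stand-in for \(\textsf {H}_{k_h}\); it also assumes a good transcript, so cases (a) and (b) never apply to the same signing query.

```python
def regroup(tau_m, tau_p1, tau_p2, k, hash_of, n):
    """Return (hat_tau_m, hat_tau_p1, hat_tau_p2) for a good transcript.

    tau_m  : list of (nu, m, t) signing queries
    tau_p1 : dict x -> y for primitive queries with msb(x) = 0
    tau_p2 : dict x -> y for primitive queries with msb(x) = 1
    """
    msb = 1 << (n - 1)
    hat_m, hat_p1, hat_p2 = [], dict(tau_p1), dict(tau_p2)
    for nu, m, t in tau_m:
        u = nu ^ k                      # 0 || (nu xor k)
        v = msb | (nu ^ hash_of(m))     # 1 || (nu xor H(m))
        if u in tau_p1:                 # case (a): first input already queried
            hat_p2[v] = t ^ tau_p1[u]
        elif v in tau_p2:               # case (b): second input already queried
            hat_p1[u] = t ^ tau_p2[v]
        else:                           # no collision: stays a signing query
            hat_m.append((nu, m, t))
    return hat_m, hat_p1, hat_p2

if __name__ == "__main__":
    toy_hash = lambda m: m & 0b111      # hypothetical 3-bit hash for n = 4
    out = regroup([(0b110, 1, 5), (0b010, 4, 7), (0b000, 2, 1)],
                  {0b0011: 9}, {0b1110: 4}, 0b101, toy_hash, 4)
    print(out)
```

Each signing query that hits a primitive query on one side is converted into a new input/output pair on the other side, so the sizes of the extended lists grow by the number of such hits while \(\hat{\tau }_m\) shrinks by the same amount.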

Let \(s_1\), \(s_2 \le p\) be the number of signing queries that collide with an element of \(\tau ^{(1)}_p\) and \(\tau ^{(2)}_p\) respectively. Therefore, \(p_1 {\mathop {=}\limits ^{\mathrm {\varDelta }}}|\hat{\tau }^{(1)}_p| = p + s_2\), \(p_2 {\mathop {=}\limits ^{\mathrm {\varDelta }}}|\hat{\tau }^{(2)}_p| = p + s_1\) and \(q'_m {\mathop {=}\limits ^{\mathrm {\varDelta }}}|\hat{\tau }_m| = q_m - s_1 - s_2\). We denote \(q'_p = p_1 + p_2 = 2p + s_1 + s_2\). We say that a permutation \(\pi \) is compatible with \(\hat{\tau } {\mathop {=}\limits ^{\mathrm {\varDelta }}}\hat{\tau }_m \cup \tau _v \cup \hat{\tau }^{(1)}_p \cup \hat{\tau }^{(2)}_p\) if the following holds:

  • for all \((\nu _i, m_i, t_i) \in \hat{\tau }_m, \pi (0 \Vert (\nu _i \oplus k)) \oplus \pi (1 \Vert (\nu _i \oplus \textsf {H}_i)) = t_i\)

  • for all \(a \in [q_v], \pi (0 \Vert (\nu '_a \oplus k)) \oplus \pi (1 \Vert (\nu '_a \oplus \textsf {H}'_a)) \ne t'_a\)

  • for all \((x^1_i, y^1_i) \in \hat{\tau }^{(1)}_p\), \(\pi (x^1_i) = y^1_i\)

  • for all \((x^2_i, y^2_i) \in \hat{\tau }^{(2)}_p\), \(\pi (x^2_i) = y^2_i\).

Therefore, the remaining part is to count the number of compatible permutations \(\pi \). As a result, we have

$$\begin{aligned} \mathsf {p}_{\mathrm {re}}(\tau ') {\mathop {=}\limits ^{\mathrm {\varDelta }}}\Pr [\mathsf {D}_{\mathrm {re}}=\hat{\tau }] = \frac{1}{|\mathcal {K}_h|} \cdot \frac{1}{2^{n-1}} \cdot \frac{h_{\alpha }}{(2^n)_{p_1 + p_2 + \alpha }}, \end{aligned}$$
(8)

where \(h_{\alpha }\) denotes the number of injective solutions to the following system of equations and non-equations \((\mathcal {E}^{=} \cup \mathcal {E}^{\ne })\), with \(\alpha \) many distinct variables. For notational simplicity, we denote \(\pi (0 \Vert \nu _i \oplus k)\) as \(U_i\) and \(\pi (1 \Vert \nu _i \oplus \textsf {H}_i)\) as \(V_i\).

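[Figure a: the system \(\mathcal {E}^{=} \cup \mathcal {E}^{\ne }\), consisting of the equations \(U_i \oplus V_i = t_i\) for the \(q'_m\) signing queries in \(\hat{\tau }_m\) and the non-equations \(\pi (0 \Vert (\nu '_a \oplus k)) \oplus \pi (1 \Vert (\nu '_a \oplus \textsf {H}'_a)) \ne t'_a\) for \(a \in [q_v]\), as in the compatibility conditions above.]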

where \(q'_m = q_m - s_1 - s_2\). Note that \(\mathcal {E}^{=} \cup \mathcal {E}^{\ne }\) is defined over \(\alpha \) many distinct variables. Therefore, some variables in \(\mathcal {E}^{=} \cup \mathcal {E}^{\ne }\) may collide with each other. Thus, from Eq. (7) and Eq. (8), we have

$$\begin{aligned} \frac{\mathsf {p}_{\mathrm {re}}(\tau ')}{\mathsf {p}_{\mathrm {id}}(\tau ')} = \underbrace{\frac{2^{ns_1}}{(2^n-2p)_{s_1}}}_{\mathsf {A.1}} \cdot \underbrace{\frac{2^{ns_2}}{(2^n-2p-s_1)_{s_2}}}_{\mathsf {A.2}} \cdot \underbrace{\frac{h_{\alpha } \cdot 2^{nq'_m}}{(2^n - 2p - s_1 - s_2)_{\alpha }}}_{\mathsf {A.3}}. \end{aligned}$$
(9)
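The factorisation in Eq. (9) is a rearrangement of falling factorials and can be checked with exact rational arithmetic. The sketch below is an illustration only: `h_alpha` is passed in as an arbitrary placeholder integer (it is not computed here), the common key factors of Eq. (7) and Eq. (8) are cancelled up front, and the helper names are hypothetical.

```python
from fractions import Fraction

def falling(a, b):
    """Falling factorial (a)_b = a (a-1) ... (a-b+1)."""
    out = 1
    for i in range(b):
        out *= a - i
    return out

def ratio_direct(n, q_m, p, s1, s2, alpha, h_alpha):
    """p_re / p_id computed directly from Eq. (7) and Eq. (8)."""
    N = 2 ** n
    p_id = Fraction(1, N ** q_m * falling(N, 2 * p))
    p_re = Fraction(h_alpha, falling(N, (p + s2) + (p + s1) + alpha))
    return p_re / p_id

def ratio_factors(n, q_m, p, s1, s2, alpha, h_alpha):
    """The three factors A.1, A.2 and A.3 of Eq. (9)."""
    N = 2 ** n
    a1 = Fraction(N ** s1, falling(N - 2 * p, s1))
    a2 = Fraction(N ** s2, falling(N - 2 * p - s1, s2))
    a3 = Fraction(h_alpha * N ** (q_m - s1 - s2),
                  falling(N - 2 * p - s1 - s2, alpha))
    return a1, a2, a3

if __name__ == "__main__":
    args = dict(n=8, q_m=5, p=3, s1=1, s2=2, alpha=4, h_alpha=7)
    a1, a2, a3 = ratio_factors(**args)
    print(a1 * a2 * a3 == ratio_direct(**args), a1 >= 1, a2 >= 1)
```

The first two factors are at least 1 because each of the \(s_1\) (respectively \(s_2\)) terms in the denominator is at most \(2^n\).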

Note that, \(\textsf {A.1} \ge 1\) and \(\textsf {A.2} \ge 1\). Therefore, we are left to bound \(\textsf {A.3}\). Note that, the induced graph G of \(\mathcal {E}^{=} \cup \mathcal {E}^{\ne }\) has \(\alpha \) many vertices. Moreover, \(|\mathcal {F}| = q_m\) and \(|\mathcal {F}'| = q_v\). It is easy to verify that as \(\tau '\) is a good transcript, G is a good graph. Therefore, by putting \(\sigma = q'_p\) in Theorem 1, we have

(10)

From Eq. (8) and Eq. (10), we have

where the simplification for (1) follows from the fact \(\rho '_{i-1} = \alpha + q'_p \le 2(q'_m + q'_p)\). Now, from Sect. 6.2 of [16] we have

(11)

By applying the expectation method of Sect. 2.3 on Eq. (11), we have

$$\begin{aligned} \mathbf {E}[\mathsf {\Phi }(\mathsf {D}_{\mathrm {id}})] \le \frac{12(q'_m + q'_p)^2}{2^{2n}} \bigg ((q'_m)^2 \epsilon _{\mathrm {axu}} + \eta ^2 + 4q'_m \bigg ) + \frac{2q_v}{2^n}. \end{aligned}$$
(12)

By simple algebra on Eq. (12), using \(q'_m \le q_m\) and \(q'_m + q'_p = q_m + 2p\) (which follows from \(q'_m = q_m - s_1 - s_2\) and \(q'_p = 2p + s_1 + s_2\)), we have

$$\begin{aligned} \mathbf {E}[\mathsf {\Phi }(\mathsf {D}_{\mathrm {id}})]\le & {} \bigg (\frac{12q^4_m \epsilon _{\mathrm {axu}}}{2^{2n}} + \frac{12\eta ^2 q^2_m}{2^{2n}} + \frac{48q_m^3}{2^{2n}} + \frac{48pq^3_m \epsilon _{\mathrm {axu}}}{2^{2n}} + \frac{48\eta ^2 pq_m }{2^{2n}} + \frac{192pq^2_m }{2^{2n}} \nonumber \\&~~~~~ + \frac{48p^2q^2_m \epsilon _{\mathrm {axu}}}{2^{2n}} + \frac{48 \eta ^2 p^2}{2^{2n}} + \frac{192 p^2q_m}{2^{2n}} + \frac{2q_v}{2^n}\bigg ). \end{aligned}$$
(13)
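For concreteness, the expansion behind Eq. (13) can be spelled out as follows. This is a sketch that uses \(q'_m \le q_m\) together with the identity \(q'_m + q'_p = q_m + 2p\), which follows from \(q'_m = q_m - s_1 - s_2\) and \(q'_p = 2p + s_1 + s_2\):

$$\begin{aligned} \frac{12(q_m + 2p)^2}{2^{2n}} \bigg (q_m^2 \epsilon _{\mathrm {axu}} + \eta ^2 + 4q_m \bigg )&= \frac{12q_m^2 + 48pq_m + 48p^2}{2^{2n}} \bigg (q_m^2 \epsilon _{\mathrm {axu}} + \eta ^2 + 4q_m \bigg ) \\&= \frac{12q^4_m \epsilon _{\mathrm {axu}} + 12\eta ^2 q^2_m + 48q_m^3}{2^{2n}} + \frac{48pq^3_m \epsilon _{\mathrm {axu}} + 48\eta ^2 pq_m + 192pq^2_m}{2^{2n}} \\&\quad + \frac{48p^2q^2_m \epsilon _{\mathrm {axu}} + 48 \eta ^2 p^2 + 192 p^2q_m}{2^{2n}}. \end{aligned}$$

Adding the remaining term \(2q_v/2^n\) of Eq. (12) gives the ten terms of Eq. (13).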

We have assumed that \(\xi \ge \eta \) and from the condition of Theorem 1, we have \(\xi \le 2^n / (8q'_m+2q'_p) \le 2^n / 8q'_m\). By assuming \(\eta \le 2^n / 8q'_m\) (otherwise the bound becomes vacuously true) we choose \(\xi = 2^n / 8q'_m\). Hence, the result follows by applying Eq. (3), Lemma 2, Eq. (13) and \(\xi = 2^n / 8q'_m\).
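As a worked step, substituting \(\xi = 2^n / 8q'_m\) into the \(\mathsf {B.10}\) term of Lemma 2 and using \(q'_m \le q_m\) gives

$$\begin{aligned} \frac{q^2_m \epsilon _{\mathrm {axu}}}{2\xi } = \frac{q^2_m \epsilon _{\mathrm {axu}} \cdot 8q'_m}{2 \cdot 2^n} = \frac{4q^2_m q'_m \epsilon _{\mathrm {axu}}}{2^n} \le \frac{4q^3_m \epsilon _{\mathrm {axu}}}{2^n}, \end{aligned}$$

which is the first term inside the \(\epsilon _{\mathrm {axu}}\) bracket of the bound in Theorem 2; all other terms of Lemma 2 carry over unchanged.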

6 Proof of Lemma 2

By the union bound,

$$\begin{aligned} \Pr [\mathsf {D}_{\mathrm {id}} \in \mathsf {BadT}] \le \sum \limits _{i=1}^{7} \Pr [\textsf {B.i}] + \Pr [\textsf {B.8}~|~\overline{\textsf {B.7}}] + \Pr [\textsf {B.9}~|~\overline{\textsf {B.7}}] + \sum \limits _{i=10}^{13} \Pr [\textsf {B.i}]. \end{aligned}$$
(14)

In the following, we bound the probabilities of all the bad events individually. The lemma will follow by adding the individual bounds.

Bounding B.1. For any possible signing query \((\nu _i, m_i, t_i) \in \tau _m\) and a pair of any possible primitive queries \((x^1_j, y^1_j) \in \tau ^{(1)}_p\) and \((x^2_{\ell }, y^2_{\ell }) \in \tau ^{(2)}_p\), the only randomness in the equation \(\nu _i \oplus k = \hat{x}^1_j\) is k and the randomness in the equation \(\nu _i \oplus \textsf {H}_i = \hat{x}^2_{\ell }\) is \(k_h\), the hash key. In the ideal world, k and \(k_h\) are dummy keys, sampled uniformly and independently from their respective space. Therefore, for a fixed choice of i, j and \(\ell \), the probability of the event is \(\epsilon _{\mathrm {reg}} / 2^{n-1}\), where \(\epsilon _{\mathrm {reg}}\) is the regular advantage of the underlying hash function. Summing over all possible choices of i, j and \(\ell \) we have

$$\begin{aligned} \Pr [\textsf {B.1}] \le \frac{2p^2q_m \epsilon _{\mathrm {reg}}}{2^n}. \end{aligned}$$
(15)

Bounding B.2. Let \(\mathcal {N}\) be the set of all query indices i for which there is a \(j \ne i\) such that \(\nu _i =\nu _j\). It is easy to see that \(|\mathcal {N}| \le 2\eta \). Event B.2 occurs if for some \(j \in \mathcal {N}\), \(\nu _j \oplus \mathsf {H}_j = \nu _{\ell } \oplus \mathsf {H}_{\ell }\) for some \(\ell \ne j\). For any such fixed \(i,j,\ell \), the probability of the event is at most \(\epsilon _{\mathrm {axu}}\), where \(\epsilon _{\mathrm {axu}}\) is the almost xor universal advantage of the underlying hash function. The number of such choices of \((i, j, \ell )\) is at most \(2\eta q_m\). Hence,

$$\begin{aligned} \Pr [\textsf {B.2}] \le 2\eta q_m \epsilon _{\mathrm {axu}}. \end{aligned}$$
(16)

Bounding B.3. For any two signing queries \((\nu _i, m_i, t_i), (\nu _j, m_j, t_j) \in \tau _m\) and a primitive query \((x^1_{\ell }, y^1_{\ell }) \in \tau ^{(1)}_p\), the only randomness in the equation \(\nu _i \oplus k = \hat{x}^1_{\ell }\) is k and the randomness in the equation \(\textsf {H}_i \oplus \textsf {H}_j = \nu _i \oplus \nu _j\) is \(k_h\). In the ideal world, k and \(k_h\) are dummy keys, sampled uniformly and independently from their respective space. Therefore, for a fixed choice of i, j and \(\ell \), the probability of the event is \(\epsilon _{\mathrm {axu}} / 2^{n-1}\), where \(\epsilon _{\mathrm {axu}}\) is the almost xor universal advantage of the underlying hash function. Summing over all possible choices of i, j and \(\ell \) we have

$$\begin{aligned} \Pr [\textsf {B.3}] \le \frac{pq^2_m \epsilon _{\mathrm {axu}}}{2^n}. \end{aligned}$$
(17)

Bounding B.4. For any two signing queries \((\nu _i, m_i, t_i), (\nu _j, m_j, t_j) \in \tau _m\) and a primitive query \((x^2_{\ell }, y^2_{\ell }) \in \tau ^{(2)}_p\), the only randomness in the equation \(\nu _i \oplus \textsf {H}_i = \hat{x}^2_{\ell }\) is \(k_h\). In the ideal world, \(k_h\) is sampled uniformly from \(\mathcal {K}_h\). Therefore, for a fixed choice of i, j and \(\ell \), the probability of the event is \(\epsilon _{\mathrm {reg}}\). The number of choices of \(i \ne j \in [q_m]\) such that \(\nu _i = \nu _j\) is at most \(2 \eta \) and the number of choices of \(\ell \) is at most p. Summing over all possible choices of i, j and \(\ell \) we have

$$\begin{aligned} \Pr [\textsf {B.4}] \le 2 \eta p \epsilon _{\mathrm {reg}}. \end{aligned}$$
(18)

Bounding B.5. For a fixed choice of indices i and j, the probability of the event is at most \(1 / 2^{n}\). The number of choices of i and j such that \(\nu _i = \nu _j\) is at most \(2 \eta \). Summing over all possible choices of i and j we have

$$\begin{aligned} \Pr [\textsf {B.5}] \le \frac{2\eta }{2^{n}}. \end{aligned}$$
(19)

Bounding B.6. Similar to B.5, for a fixed choice of indices i and j, the probability of the event is at most \(\epsilon _{\mathrm {axu}} / 2^{n}\), as the event \(\nu _i \oplus \textsf {H}_i = \nu _j \oplus \textsf {H}_j\) is independent of the event \(t_i = t_j\). Summing over all possible choices of i and j we have

$$\begin{aligned} \Pr [\textsf {B.6}] \le \frac{q^2_m \epsilon _{\mathrm {axu}}}{2^{n+1}}. \end{aligned}$$
(20)

Bounding B.7. Event B.7 is bounded by Lemma 1, where we take \(\mathcal {A} = \mathcal {Y}_1\) and \(\mathcal {B} = \mathcal {Y}_2\).

$$\begin{aligned} \Pr [\textsf {B.7}] \le \frac{2}{2^n}. \end{aligned}$$
(21)

Bounding \({\mathbf {\mathsf{{B.8}}}}~|~ \overline{{\mathbf {\mathsf{{B.7}}}}}\). Let \(\textsf {C} {\mathop {=}\limits ^{\mathrm {\varDelta }}}p^2q_m/2^n + p \sqrt{3nq_m}\). As we are bounding the event \(\textsf {B.8}~|~\overline{\textsf {B.7}}\), the number of choices of i, j and \(\ell \) that satisfy \(t_i = y^1_j \oplus y^2_{\ell }\) is at most \(\textsf {C}\). For a fixed choice of indices i, j and \(\ell \), the probability of the event is at most \(1/2^{n-1}\). Hence, by summing over all possible choices of i, j and \(\ell \), we have

$$\begin{aligned} \Pr [\textsf {B.8}~|~\overline{\textsf {B.7}}] \le \frac{2p^2q_m}{2^{2n}} + \frac{2p \sqrt{3nq_m}}{2^n}. \end{aligned}$$
(22)

Bounding \({\mathbf {\mathsf{{B.9}}}}~|~\overline{{\mathbf {\mathsf{{B.7}}}}}\). Bounding \(\textsf {B.9}~|~\overline{\textsf {B.7}}\) is analogous to bounding \(\textsf {B.8}~|~\overline{\textsf {B.7}}\), except that the randomness now comes from the hash key: for a fixed choice of indices i, j and \(\ell \), the probability of the event is at most \(\epsilon _{\mathrm {reg}}\). Summing over the at most \(\textsf {C}\) possible choices of i, j and \(\ell \) we have

$$\begin{aligned} \Pr [\textsf {B.9}~|~\overline{\textsf {B.7}}] \le \frac{p^2q_m \epsilon _{\mathrm {reg}}}{2^{n}} + p \sqrt{3nq_m} \epsilon _{\mathrm {reg}}. \end{aligned}$$
(23)

Bounding B.10. Event B.10 occurs if there exist \(\xi +1\) distinct signing query indices \(\{i_1, \ldots , i_{\xi +1}\} \subseteq [q_m]\) such that \(\nu _{i_1} \oplus \textsf {H}_{i_1} = \ldots = \nu _{i_{\xi +1}} \oplus \textsf {H}_{i_{\xi +1}}\). This event is thus a \((\xi +1)\)-multicollision on the \(\epsilon _{\mathrm {univ}}\)-universal hash function (see Footnote 4) mapping \((\nu ,m)\) to \(\nu \oplus \textsf {H}_{k_h}(m)\) (as \(\textsf {H}_{k_h}\) is \(\epsilon _{\mathrm {axu}}\)-almost xor universal). Therefore, by applying the multicollision theorem for universal hash functions (Theorem 1 of [16]), we have

$$\begin{aligned} \Pr [\textsf {B.10}] \le q^2_m \epsilon _{\mathrm {axu}} / 2 \xi . \end{aligned}$$
(24)

Bounding B.11. For some \(a \in [q_v]\) and \(i \in [q_m]\), if \(\nu _i = \nu '_a\), \(\nu _i \oplus \textsf {H}_i = \nu '_a \oplus \textsf {H}'_a\) and \(t_i = t'_a\), then \(m_i \ne m'_a\) (as the distinguisher is non-trivial). Hence the probability that \(\nu _i \oplus \textsf {H}_i = \nu '_a \oplus \textsf {H}'_a\) holds is at most \(\epsilon _{\mathrm {axu}}\), due to the axu probability of the hash function. Now, for any choice of \(a \in [q_v]\), there can be at most \((\eta +1)\) indices i such that \(\nu _i =\nu '_a\). Hence, the required probability is bounded as

$$\begin{aligned} \Pr [\textsf {B.11}] \le (\eta +1) q_v \epsilon _{\mathrm {axu}}. \end{aligned}$$
(25)

Bounding B.12. For any possible verification query \((\nu '_a, m'_a, t'_a) \in \tau _v\) and a pair of any possible primitive queries \((x^1_j, y^1_j) \in \tau ^{(1)}_p\) and \((x^2_{\ell }, y^2_{\ell }) \in \tau ^{(2)}_p\), the only randomness in the equation \(\nu '_a \oplus k = \hat{x}^1_j\) is k and the randomness in the equation \(\nu '_a \oplus \textsf {H}'_a = \hat{x}^2_{\ell }\) is \(k_h\). In the ideal world, k and \(k_h\) are dummy keys, sampled uniformly and independently from their respective spaces. Therefore, for a fixed choice of a, j and \(\ell \), the probability of the event is \(\epsilon _{\mathrm {reg}} / 2^{n-1}\). Summing over all possible choices of a, j and \(\ell \) we have

$$\begin{aligned} \Pr [\textsf {B.12}] \le \frac{2q_vp^2 \epsilon _{\mathrm {reg}}}{2^n}. \end{aligned}$$
(26)

Bounding B.13. For a fixed choice of i, the probability that \(t_i = 0^n\) is exactly \(2^{-n}\). Summing over all possible choices of i we have

$$\begin{aligned} \Pr [\textsf {B.13}] \le \frac{q_m}{2^n}. \end{aligned}$$
(27)

The proof follows from Eq. (14)–Eq. (27).   \(\square \)
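As a sanity check on the bookkeeping, the individual bounds of Eq. (15) to Eq. (27) can be summed numerically and compared with the closed form stated in Lemma 2. The sketch below is illustrative: it uses floating point arithmetic, the parameter values are arbitrary, and the function names are hypothetical.

```python
import math

def bad_event_bounds(n, p, q_m, q_v, eta, xi, eps_axu, eps_reg):
    """Individual bounds from Eq. (15) to Eq. (27), keyed by bad event."""
    N = 2.0 ** n
    root = math.sqrt(3 * n * q_m)
    return {
        "B.1": 2 * p**2 * q_m * eps_reg / N,
        "B.2": 2 * eta * q_m * eps_axu,
        "B.3": p * q_m**2 * eps_axu / N,
        "B.4": 2 * eta * p * eps_reg,
        "B.5": 2 * eta / N,
        "B.6": q_m**2 * eps_axu / (2 * N),
        "B.7": 2 / N,
        "B.8|not B.7": 2 * p**2 * q_m / N**2 + 2 * p * root / N,
        "B.9|not B.7": p**2 * q_m * eps_reg / N + p * root * eps_reg,
        "B.10": q_m**2 * eps_axu / (2 * xi),
        "B.11": (eta + 1) * q_v * eps_axu,
        "B.12": 2 * q_v * p**2 * eps_reg / N,
        "B.13": q_m / N,
    }

def lemma2_bound(n, p, q_m, q_v, eta, xi, eps_axu, eps_reg):
    """Closed form on the right-hand side of Lemma 2."""
    N = 2.0 ** n
    root = math.sqrt(3 * n * q_m)
    return (
        p**2 * eps_reg / N * (3 * q_m + 2 * q_v)
        + eps_axu * (q_m**2 / (2 * xi) + 2 * eta * q_m + p * q_m**2 / N
                     + q_m**2 / (2 * N) + (eta + 1) * q_v)
        + eps_reg * (2 * eta * p + p * root)
        + 2 * p**2 * q_m / N**2
        + (2 + 2 * eta) / N
        + 2 * p * root / N
        + q_m / N
    )

if __name__ == "__main__":
    params = dict(n=32, p=1000, q_m=500, q_v=10, eta=3, xi=8,
                  eps_axu=2 / 2**32, eps_reg=2 / 2**32)
    total = sum(bad_event_bounds(**params).values())
    print(math.isclose(total, lemma2_bound(**params), rel_tol=1e-9))
```

The two quantities agree term by term: for instance, the \(\epsilon _{\mathrm {reg}}\) contributions of B.1, B.9 and B.12 combine into \(\frac{p^2 \epsilon _{\mathrm {reg}}}{2^n}(3q_m + 2q_v)\).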