1 Introduction

Authentication is among the most basic and important cryptographic tasks. In the present paper we construct efficient (secret key) authentication schemes from the assumption that the learning parity with noise (LPN) problem is hard. Informally, this problem asks one to distinguish noisy inner products from random. We construct the first efficient message authentication codes (MACs) from LPN, as well as simpler and more efficient two-round authentication protocols that achieve a notion called active security. Prior to our work, the only known way to construct an LPN-based MAC was via a relatively inefficient generic transformation [23] (that works with any pseudorandom generator), and all interactive LPN-based protocols with security properties similar to our new protocol required at least three rounds and had a loose security reduction. Our constructions and techniques diverge significantly from prior work in the area and will hopefully be of independent interest.

The pursuit of LPN-based authentication is motivated by two distinct concerns, one theoretical and one practical. On the theoretical side, the LPN problem provides an attractive basis for provable security [3, 5, 6, 27, 32, 41]. It is closely related to the well-studied problem of decoding random linear codes, and unlike most number-theoretic problems used in cryptography, the LPN problem does not succumb to known quantum algorithms. On the practical side, LPN-based authentication schemes are strikingly efficient, requiring relatively few bit-level operations. Indeed, in their original proposal, Hopper and Blum [27] suggested that humans could perform the computation in their provably secure scheme, even with realistic parameters.

Each of these motivations would on its own be sufficient to justify investigation, but their combination is particularly compelling. LPN-based authentication offers a theoretical improvement in terms of provable security while also being more efficient than approaches based on more classical symmetric techniques that are not related to hard problems. Usually we trade one benefit for the other, but here we hope to get the best of both worlds.

Before describing our contributions in more detail, we start by recalling authentication protocols, the LPN problem, and some of the prior work on which we build.

Authentication Protocols. An authentication protocol is a (shared-key) protocol where a prover \(\mathcal {P}\) authenticates itself to a verifier \(\mathcal {V}\). We recall some of the common definitions for security against impersonation attacks. A passive attack proceeds in two phases, where in the first phase the adversary eavesdrops on several interactions between \(\mathcal {P}\) and \(\mathcal {V}\), and then attempts to cause \(\mathcal {V}\) to accept in the second phase (where \(\mathcal {P}\) is no longer available). In an active attack, the adversary is additionally allowed to interact with \(\mathcal {P}\) in the first phase. The strongest and most realistic attack model is a man-in-the-middle attack (MIM), where the adversary can arbitrarily interact with \(\mathcal {P}\) and \(\mathcal {V}\) (with polynomially many concurrent executions allowed) and wins if it can make \(\mathcal {V}\) accept in any execution where at least one message was changed.

The LPN Problem. For \(\tau <1/2\) and a vector \(\mathbf {x}\in \mathbb {Z}_2^\ell \), define the distribution \(\Lambda _{\tau ,\ell }(\mathbf {x})\) on \(\mathbb {Z}_2^\ell \times \mathbb {Z}_2\) by \((\mathbf {r},\mathbf {r}^{\mathsf {T}}\mathbf {x}+ e)\), where \(\mathbf {r}\in \mathbb {Z}_2^\ell \) is uniformly random and \(e\in \mathbb {Z}_2\) is selected according to \(\mathsf{Ber}_\tau \), the Bernoulli distribution over \(\mathbb {Z}_2\) with parameter \(\tau \) (i.e., \(\Pr [e=1]=\tau \)). The \(\mathsf {LPN}_{\tau ,\ell }\) problem is to distinguish an oracle returning samples from \(\Lambda _{\tau ,\ell }(\mathbf {x})\), where \(\mathbf {x}\in \mathbb {Z}_2^\ell \) is random and fixed, from an oracle returning uniform samples. It was shown by Blum et al. [5] that this is equivalent to the search version of LPN, where one needs to find \(\mathbf {x}\) given oracle access to \(\Lambda _{\tau ,\ell }(\mathbf {x})\) (cf. [31, Thm.2] for precise bounds). We note that the search variant of LPN is efficiently solvable when there is no noise (i.e., when \(\tau = 0\)) using Gaussian elimination; for this we just need \(\ell \) samples with linearly independent \(\mathbf {r}\)’s. The best algorithms take time \(2^{\ell /\log \ell }\) when \(\tau >0\) is treated as a constant [6,7,8,9, 24, 34, 46].
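To make the noise-free case concrete, the following minimal sketch (our own illustration, not part of the paper; the helper gf2_solve and all parameter values are ours) recovers the secret from \(\ell \) noiseless samples with linearly independent \(\mathbf {r}\)'s via Gaussian elimination over \(\mathbb {Z}_2\).

```python
# A toy illustration (ours, not from the paper): solving noiseless LPN
# (tau = 0) by Gaussian elimination over Z_2. Parameters are illustrative.
import numpy as np

def gf2_solve(A, b):
    """Solve A x = b over Z_2; return None if A is singular over Z_2."""
    A, b = A.copy() % 2, b.copy() % 2
    n = A.shape[0]
    for c in range(n):
        pivots = [r for r in range(c, n) if A[r, c]]
        if not pivots:
            return None
        p = pivots[0]
        A[[c, p]] = A[[p, c]]                    # swap pivot row into place
        b[[c, p]] = b[[p, c]]
        for r in range(n):
            if r != c and A[r, c]:
                A[r] ^= A[c]                     # eliminate column c from row r
                b[r] ^= b[c]
    return b

rng = np.random.default_rng(0)
ell = 64
x = rng.integers(0, 2, ell)                      # the secret
while True:
    R = rng.integers(0, 2, (ell, ell))           # ell challenge vectors (as rows)
    b = (R @ x) % 2                              # noiseless labels r^T x
    recovered = gf2_solve(R, b)
    if recovered is not None:                    # R invertible over Z_2
        break
print(np.array_equal(recovered, x))              # True
```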

Authentication Protocols from LPN. Starting with the work of Hopper and Blum [27], several authentication protocols based on the LPN problem have been proposed. Their original elegant protocol is simple enough for us to recall right away. The shared secret key is a binary vector \(\mathbf {s}\in \mathbb {Z}_2^\ell \). The interaction consists of two messages. First \(\mathcal {V}\) sends a random challenge \(\mathbf {r}\in \mathbb {Z}_2^\ell \), and then \(\mathcal {P}\) answers with the bit \(z=\mathbf {r}^{\mathsf {T}}\mathbf {s}+ e\), where \(e\in \mathbb {Z}_2\) is sampled according to \(\mathsf{Ber}_\tau \). Finally, the verifier accepts if \(z = \mathbf {r}^{\mathsf {T}}\mathbf {s}\).

Fig. 1: The \({\mathsf {HB}}\) protocol, secure against passive attacks

This basic protocol has a large completeness error \(\tau \) (as \(\mathcal {V}\) will reject if \(e=1\)) and soundness error 1/2 (as a random \(\mathbf {r},z\) satisfies \(\mathbf {r}^{\mathsf {T}}\mathbf {s}=z\) with probability 1/2). Both errors can be reduced via sequential or parallel composition. The parallel variant, denoted \({\mathsf {HB}}\), is illustrated in Fig. 1 (we represent several \(\mathbf {r}\) with a matrix \(\mathbf {R}\), and the noise bits are now arranged in a vector \(\mathbf {e}\)). The verifier accepts if at most a \(\tau '\) fraction (where \(\tau<\tau '<1/2\)) of the n basic authentication steps is incorrect.
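The following sketch (an illustrative Python mock-up, not the paper's Fig. 1; parameter values are our own assumptions) runs one execution of this parallel variant with the acceptance rule just described.

```python
# A toy mock-up (ours) of one execution of the parallel HB protocol: the
# verifier sends R, the prover answers z = R^T s + e, and the verifier accepts
# if at most a tau' fraction of the n parity checks is violated.
import numpy as np

rng = np.random.default_rng(1)
ell, n = 768, 256                                # illustrative sizes
tau, tau_prime = 0.05, 0.125                     # tau < tau' < 1/2

s = rng.integers(0, 2, ell)                      # shared secret key

R = rng.integers(0, 2, (ell, n))                 # verifier's challenge
e = (rng.random(n) < tau).astype(int)            # prover's Bernoulli_tau noise
z = (R.T @ s + e) % 2                            # prover's answer

wrong = np.count_nonzero((z - R.T @ s) % 2)      # number of violated basic checks
print("accept" if wrong <= tau_prime * n else "reject")
```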

The two-round \({\mathsf {HB}}\) protocol is provably secure against passive attacks, but efficient active attacks are known against it. This is unsatisfying because in several scenarios an adversary will be able to mount an active attack. Subsequently, Juels and Weis [28] proposed an efficient three-round variant of \({\mathsf {HB}}\), called \({\mathsf {HB^+}}\), and proved it secure against active attacks. Again the error can be reduced by sequential repetition, and as shown by Katz, Shin and Smith via a non-trivial analysis, parallel repetition works as well [30, 31]. The protocol (in its parallel repetition variant) is illustrated in Fig. 2.

Gilbert et al. [20] showed that \({\mathsf {HB^+}}\) can be broken by a MIM attack. Several variants \({\mathsf {HB}}^{++}\) [11], \({\mathsf {HB}}^*\) [16], \({\mathsf {HB}}\)-\(\mathsf{MP}\) [37] were proposed to prevent the particular attack from [20], but all of them were later shown to be insecure [21]. In [22], a variant \({\mathsf {HB}}^{\#}\) was presented which provably resists the particular attack from [20], but was shown susceptible to a more general MIM attack [38]. However, no improvements in terms of round complexity, security or tightness of the reduction over \({\mathsf {HB^+}}\) were achieved: Three-round protocols achieving active security \(\sqrt{\varepsilon }\) (assuming LPN is \(\varepsilon \)-hard) are the state of the art.

Fig. 2: The \({\mathsf {HB^+}}\) protocol, secure against active attacks

1.1 Our Contribution

We provide new constructions of authentication protocols and even MACs from LPN. Our first contribution is a two-round authentication protocol secure against active adversaries (this is mentioned as an open problem in [28]) which moreover has a tight security reduction (an open problem mentioned in [31]). As a second contribution, we build two efficient MACs and thus also get two-round authentication protocols secure against MIM attacks, from the LPN assumption. Unlike some previous proposals, our constructions are not ad hoc, and we give a reduction to the LPN problem. Our authentication protocol is roughly as efficient as the \({\mathsf {HB^+}}\) protocol but has twice the key length. Our MACs perform roughly the same computation as the authentication protocol plus one evaluation of a pairwise independent permutation of an \(\approx 2\ell \)-bit domain, where \(\ell \) is the length of an LPN secret.

Two-Round Authentication with Active Security. Our first contribution is a two-round authentication protocol which we prove secure against active attacks assuming the hardness of the LPN problem. Our protocol diverges considerably from all previous \({\mathsf {HB}}\)-type protocols [22, 27, 28, 31] and runs counter to the intuition that the only way to efficiently embed the LPN problem into a two-round protocol is via an \({\mathsf {HB}}\)-type construction.

We now sketch our protocol. In \({\mathsf {HB}}\) and its variants, the prover must compute LPN samples of the form \(\mathbf {R}^{\mathsf {T}}\cdot \mathbf {s}+ \mathbf {e}\), where \(\mathbf {R}\) is the challenge chosen by the verifier in the first message. We take a different approach. Instead of sending \(\mathbf {R}\), we now let the verifier choose a random subset of the bits of \(\mathbf {s}\) to act as the “session key” for this interaction. It represents this subset by sending a binary vector \(\mathbf {v}\in \mathbb {Z}_2^\ell \) that acts as a “bit selector” of the secret \(\mathbf {s}\), and we write \(\mathbf {s}_{\downarrow \mathbf {v}}\) for the sub-vector of \(\mathbf {s}\) which is obtained by deleting all bits from \(\mathbf {s}\) where \(\mathbf {v}\) is 0. (E.g., if \(\mathbf {s}=(1,1,1,0,0,0)^{\mathsf {T}},\mathbf {v}=(0,1,1,1,0,0)^{\mathsf {T}}\) then \(\mathbf {s}_{\downarrow \mathbf {v}}=(1,1,0)^{\mathsf {T}}\).) The prover then picks \(\mathbf {R}\) by itself and computes noisy inner products of the form \(\mathbf {R}^{\mathsf {T}}\cdot \mathbf {s}_{\downarrow \mathbf {v}} + \mathbf {e}\). Curiously, allowing the verifier to choose which bits of \(\mathbf {s}\) to use in each session is sufficient to prevent active attacks. We only need to add a few sanity checks that no pathological \(\mathbf {v}\) or \(\mathbf {R}\) was sent by an active adversary.

Our proof relies on the recently introduced subspace LPN problem [40]. In contrast to the active attack security proof of \({\mathsf {HB^+}}\) [31], our proof does not use any rewinding techniques. Avoiding rewinding has at least two advantages. First, the security reduction becomes tight. Second, the proofs also work in a quantum setting: Our protocol is secure against quantum adversaries assuming LPN is secure against such adversaries. As first observed by van de Graaf [43], classical proofs using rewinding in general do not translate to the quantum setting (cf. [45] for a more recent discussion). Let us emphasize that this only means that there is no security proof for \({\mathsf {HB^+}}\) in the quantum setting, but we do not know if a quantum attack actually exists.

MAC and Man-in-the-Middle Security. In Sect. 4, we give two constructions of message authentication codes (MACs) that are secure (formally, unforgeable under chosen message attacks) assuming that the LPN problem is hard. It is well known that a MAC implies a two-round MIM secure authentication protocol: The verifier chooses a random message as challenge, and the prover returns the MAC on the message.

As a first attempt, let us try to view our authentication protocol as a MAC. That is, a MAC tag is of the form \(\phi =(\mathbf {R}, \mathbf {z}=\mathbf {R}^{\mathsf {T}}\cdot f_\mathbf {s}(m) + \mathbf {e})\), where the secret key derivation function \(f_\mathbf {s}(m) \in \mathbb {Z}_2^\ell \) first uniquely encodes the message \(m\) into \(\mathbf {v}\in \mathbb {Z}_2^{2\ell }\) of weight \(\ell \) and then returns \(\mathbf {s}_{\downarrow \mathbf {v}}\) by selecting \(\ell \) bits from secret \(\mathbf {s}\), according to \(\mathbf {v}\). However, this MAC is not secure: Given a MAC tag \(\phi =(\mathbf {R}, \mathbf {z})\) an adversary can ask verification queries where it sets individual rows of \(\mathbf {R}\) to zero until verification fails: If the last row set to zero was the ith, then the ith bit of \(f_{\mathbf {s}}(m)\) must be 1.Footnote 1 Our solution is to randomize the mapping f, i.e., use \(f_{\mathbf {s}}(m,\mathbf {b})\) for some randomness \(\mathbf {b}\) and compute the tag as \(\phi =\pi (\mathbf {R}, \mathbf {R}^{\mathsf {T}}\cdot f_\mathbf {s}(m,\mathbf {b}) + \mathbf {e},\mathbf {b})\), where \(\pi \) is a pairwise independent permutation (contained in the secret key). We can prove that if LPN is hard then this construction yields a secure MAC. (The key argument is that, with high probability, all non-trivial verification queries are inconsistent and hence lead to \(\mathsf {reject}\).) However, the security reduction to the LPN problem is quite loose: assuming the LPN problem is \(\epsilon \)-hard and the adversary makes Q queries, we get security of roughly \(Q\sqrt{\epsilon }\). Moreover, there is a non-uniformity issue, as we need to know \(\epsilon \), or at least have a good guess for it, to achieve this bound (concretely, the optimal choice of the parameter \(\mu \) used in the protocol depends on \(\epsilon \), cf. Corollary 4.3).
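To make the attack concrete, the following sketch (ours; the threshold verification rule verify and all parameters are illustrative assumptions, since the MAC's verification algorithm is only specified in Sect. 4) recovers \(f_\mathbf {s}(m)\) bit by bit. For readability it zeroes a single row of \(\mathbf {R}\) per verification query instead of zeroing rows cumulatively as in the description above.

```python
# A toy demonstration (ours) of the row-zeroing verification-query attack on
# the unrandomized first attempt phi = (R, z = R^T f_s(m) + e). Zeroing the
# i-th row of R leaves the tag valid iff the i-th bit of f_s(m) is 0, so each
# bit of the per-message key leaks through one verification query.
import numpy as np

rng = np.random.default_rng(3)
ell, n, tau, tau_prime = 64, 256, 0.05, 0.125    # illustrative parameters

fs_m = rng.integers(0, 2, ell)                   # f_s(m): secret per-message key

def verify(R, z):                                # assumed threshold verification
    return np.count_nonzero((z - R.T @ fs_m) % 2) <= tau_prime * n

# an honestly generated tag for the message m
R = rng.integers(0, 2, (ell, n))
e = (rng.random(n) < tau).astype(int)
z = (R.T @ fs_m + e) % 2

recovered = np.zeros(ell, dtype=int)
for i in range(ell):                             # one verification query per bit
    R_i = R.copy()
    R_i[i, :] = 0                                # zero the i-th row of R
    recovered[i] = 0 if verify(R_i, z) else 1    # still valid <=> bit i of f_s(m) is 0

print(np.array_equal(recovered, fs_m))           # True (with overwhelming probability)
```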

We also give another construction with a much tighter and completely uniform reduction, achieving security roughly \(Q\epsilon \). For this, we instantiate the above MAC with a different secret key derivation function \(f_\mathbf {s}(m,\mathbf {b}) = \mathbf {s}_0+\mathbf {S}\cdot \mathbf {v}\) (where \(\mathbf {v}=h(m, \mathbf {b})\) and \(h(\cdot )\) is a pairwise independent hash). The drawback of our second construction is the larger key size, as the secret key contains a matrix \(\mathbf {S}\in \mathbb {Z}_2^{\ell \times \mu }\). The security reduction uses a technique from [4, 13] that we needed to adapt to the case of LPN.Footnote 2

Table 1: A comparison of our new authentication protocols with the \({\mathsf {HB}}\), \({\mathsf {HB^+}}\) protocols and the classical GGM construction

1.2 Efficiency

Table 1 shows a rough comparison of our new authentication protocols with the \({\mathsf {HB}},{\mathsf {HB^+}}\) protocols and, as a reference, also the protocol one gets by using the PRF constructed via the classical tree-based GGM construction [23] when the underlying PRG is instantiated with LPN. \(\lambda \) denotes a statistical security parameter, say \(\lambda =80\). Q denotes the total number of queries made during the attack. The second row in the table specifies the security notion that is (provably) achieved under the \(\mathsf {LPN}_{\tau ,\ell }\) or “subset LPN” \(\mathsf {SLPN}^*_{\tau ,L,\ell '}\) assumption (all notions will be defined in Sect. 2.4). Typical parameters suggested for HB-type protocols [34] are \(\ell =768, \tau = 0.05\); at this level, [34] estimates \(\varepsilon \approx 2^{-80}\) security for \(\mathsf {LPN}_{\tau ,\ell }\), which is sufficient for schemes with a tight security reduction, like the passive security of \({\mathsf {HB}}\) or the active security of our \(\mathsf {AUTH}\) protocol (cf. the last column in Table 1). The \(\mathsf {SLPN}^*_{\tau ,L,\ell '}\) assumption reduces to \(\mathsf {LPN}_{\tau ,\ell '-g}\) with an additive loss of \(Q/2^g\), so we should set \(\ell '\approx \ell +80\), say \(\ell '=850\).

The reduction for all these protocols (except GGM) loses an additional additive \(Q\cdot 2^{-\Theta (n)}\) term, where n is the number of repetitions required to amplify the security of a basic protocol which only achieves constant soundness and security. The hidden constant in \(\Theta (\cdot )\) depends on \(\tau \) and, for \(\mathsf {AUTH}\) and \(\mathsf {MAC}_1\), on another parameter \(\gamma >0\). An upper bound on the \(2^{-\Theta (n)}\) term for \(\mathsf {AUTH}\) is \(2e^{-2(\frac{1}{2}-\frac{1}{2+\gamma })^2\ell }+2e^{-2(\frac{1}{4}-\frac{\tau }{2})^2n}\) (cf. Sect. 3.2); to make this term \(<2^{-80}\) it suffices to set \(n=360\) and \(\gamma =0.75\). Summing up, with \(\ell '=850, \tau =0.05, n=360, \gamma =0.75\) we provably get around 80 bits of security for \(\mathsf {AUTH}\), assuming LPN provides 80 bits of security for \(\ell =768,\tau =0.05\).

Computation complexity counts the number of binary operations over \(\mathbb {F}_2\). Communication complexity counts the total length of all exchanged messages.Footnote 3 The last row in the table states the tightness of the security reduction, i.e., what exact security is achieved (ignoring constants and higher order terms) assuming the \(\mathsf {LPN}_{\tau ,\ell }\) problem is \(\varepsilon \)-hard.

The prover and verifier in the \({\mathsf {HB}},{\mathsf {HB^+}}\) and our new protocols have to perform \(\Theta (\ell \cdot n)\) basic binary operations, assuming the \(\mathsf {LPN}_{\tau , \ell }\) problem (i.e., LPN with secrets of length \(\ell \)) is hard. This seems optimal, as \(\Theta (\ell )\) operations are necessary to compute the inner product which generates a single pseudorandom bit. We will thus consider an authentication protocol or MAC efficient if it requires \(O(\ell \cdot n)\) binary operations. Let us mention that one gets a length-doubling PRG under the \(\mathsf {LPN}_{\tau ,\ell }\) assumption with \(\Theta (\ell ^2)\) binary operations [17]. Via the classical GGM construction [23], we obtain a PRF and hence a MAC. This PRF, however, requires \(\Theta (\ell ^2\cdot \lambda )\) operations per invocation (where \(\lambda \) is the bit length of the PRF inputs), which is not very practical.

Communication Versus Key Size. For all constructions except GGM, there is a natural trade-off between communication and key size, where for any constant c (\(1\le c\le n\)), we can decrease communication by a factor of c and increase key size by a factor of c (cf. Appendix A for how exactly this can be done). For the first three protocols in the table, the choice of c does not affect the computational efficiency, but it does so for the MAC-based construction: To compute or verify a tag one has to evaluate a pairwise independent permutation (PIP) on the entire tag of length \(l{:}{=}\Theta (\ell \cdot n / c)\).

There are two standard ways to construct a PIP \(\pi \) over \(\mathbb {F}_{2^l}\), which we briefly review below together with their computational complexity:

  • Define \(\pi (x){:}{=}a\cdot x+b \in \mathbb {F}_{2^l}\) for random \(a,b \in \mathbb {F}_{2^l}\). The computational cost of evaluating the PIP is thus one multiplication of two l-bit values: Asymptotically, such a multiplication takes only \(O(l\log l\log \log l)\) time [18, 42], but for small l (like in our scheme) this will not be faster than schoolbook multiplication, which takes \(\Theta (l^2)\) time (see the sketch after this list).

  • Define \(\pi (\mathbf {x}) {:}{=}\mathbf {A}\cdot \mathbf {x}+ \mathbf {b}\in \{0,1\}^l\), where we now interpret the input as an l-bit vector, and where \(\mathbf {A}\) is a Toeplitz matrixFootnote 4 of dimension l and \(\mathbf {b}\) is a random l-bit vector. The product of any Toeplitz matrix with any vector takes time \(O(l\log l)\) [2, 29], but note that the description of \(\pi \) now requires storing roughly 3l bits (whereas the previous solution requires just 2l bits).
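The following sketch (ours) illustrates the first construction on a toy field: the size l = 8 and the reduction polynomial \(x^8+x^4+x^3+x+1\) are our illustrative choices (the l used in our schemes is much larger), and we restrict to \(a\ne 0\) so that \(\pi \) is indeed a permutation.

```python
# A toy instance (ours) of the first PIP construction pi(x) = a*x + b over
# F_{2^l}, using schoolbook carry-less multiplication with modular reduction.
import secrets

L = 8
MOD = 0x11B                                      # x^8 + x^4 + x^3 + x + 1 (irreducible)

def gf_mul(a, x, l=L, mod=MOD):
    """Multiply two elements of F_{2^l}, represented as integers below 2^l."""
    res = 0
    for i in range(l):                           # multiply as polynomials over Z_2
        if (x >> i) & 1:
            res ^= a << i
    for i in range(2 * l - 2, l - 1, -1):        # reduce modulo the field polynomial
        if (res >> i) & 1:
            res ^= mod << (i - l)
    return res

def sample_pip(l=L):
    a = secrets.randbelow(2**l - 1) + 1          # uniform nonzero field element
    b = secrets.randbelow(2**l)
    return lambda x: gf_mul(a, x) ^ b            # pi(x) = a*x + b

pi = sample_pip()
print(sorted(pi(x) for x in range(2**L)) == list(range(2**L)))   # pi is a bijection
```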

The \(\mathsf {PIP}\) term in the table accounts for the above complexity. For parameters \(\ell =850, n=360, \gamma =0.75\) as above and trade-off \(c=n\) (which minimizes the tag length l) we get \(l\approx 2400\) (\(2(1+\gamma )\ell \) plus a statistical security parameter) for \(\mathsf {MAC}_1\) and \(l\approx 930\) for \(\mathsf {MAC}_2\) (\(\ell \) plus a statistical security parameter). Hence, depending on the parameters, the evaluation of the PIP may be the computational bottleneck of our MACs.

1.3 Subsequent Work

An abridged version of this paper appeared as [33]. Our results inspired several follow-up works on the problem of efficient authentication from hard learning problems. In [15], Dodis et al. started a systematic study of randomized MACs and showed how to replace the PIP from our MAC constructions with a pairwise independent hash function (leading to more efficient schemes). Heyse et al. [25] proposed Lapin, a variant of our actively secure authentication protocol based on the Ring-LPN problem (introduced in [36]), which simultaneously achieves a very short key and low communication complexity. The hardware efficiency and side-channel security of Lapin were analyzed in [19].

Lyubashevsky and Masny [35] built three-round authentication protocols from LPN that are man-in-the-middle secure for sequential sessions. In [12], Cash et al. show how to reduce the round complexity to two rounds via a generic transformation from active security to sequential man-in-the-middle security.

2 Definitions

2.1 Notation

For a positive integer k, [k] denotes the set \(\{{1, \ldots , k}\}\); [0] is the empty set. For \(a,b \in \mathbb {R}\), \(]a,b[=\{x\in \mathbb {R}\ ;\ a<x<b\}\). We denote the set of integers modulo an integer \(q \ge 1\) by \(\mathbb {Z}_q\). We will use normal, bold and capital bold letters like x, \(\mathbf {x}\), \(\mathbf{X}\) to denote single elements, (column) vectors and matrices over \(\mathbb {Z}_q\), respectively. \(\mathbf {X}[i]\) denotes the ith column vector of matrix \(\mathbf {X}\). \(\mathbf {x}[i]\) denotes the ith element of vector \(\mathbf {x}\).

For a vector \(\mathbf {x}\in \mathbb {Z}_q^m\), \(|{\mathbf {x}}|=m\) denotes the length of \(\mathbf {x}\); \(\mathbf{wt}(\mathbf {x})\) denotes the Hamming weight of the vector \(\mathbf {x}\), i.e., the number of indices \(i\in \{1,\ldots ,|\mathbf {x}|\}\) where \(\mathbf {x}[i]\ne 0\). For \(\mathbf {v}\in \mathbb {Z}_2^m\) we denote by \(\overline{\mathbf {v}}\) its inverse, i.e., \(\overline{\mathbf {v}}[i]=1-\mathbf {v}[i]\) for all i. For two vectors \(\mathbf {v}\in \mathbb {Z}_2^\ell \) and \(\mathbf {x}\in \mathbb {Z}_q^\ell \), we denote by \(\mathbf {x}_{\downarrow \mathbf {v}} \in \mathbb {Z}_q^{\mathbf{wt}(\mathbf {v})}\) the vector (of length \(\mathbf{wt}(\mathbf {v})\)) which is derived from \(\mathbf {x}\) by deleting all the bits \(\mathbf {x}[i]\) where \(\mathbf {v}[i]=0\). If \(\mathbf {X}\in \mathbb {Z}_2^{\ell \times m}\) is a matrix, then \(\mathbf {X}_{\downarrow \mathbf {v}}\) denotes the sub-matrix obtained by deleting the ith row if \(\mathbf {v}[i]=0\). We also extend Boolean operators to vectors, i.e., for two vectors \(\mathbf {x}, \mathbf {y}\in \mathbb {Z}_2^m\) we define \(\mathbf {x}\wedge \mathbf {y}= \mathbf {z}\in \mathbb {Z}_2^m\) with \(\mathbf {z}[i] = \mathbf {x}[i] \wedge \mathbf {y}[i]\) and \(\mathbf {x}\vee \mathbf {y}= \mathbf {z}\in \mathbb {Z}_2^m\) where \(\mathbf {z}[i] = \mathbf {x}[i] \vee \mathbf {y}[i]\).
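The following helpers (ours, purely illustrative) mirror this notation for binary vectors and matrices.

```python
# Illustrative helpers (ours) for the notation above: Hamming weight wt(x),
# the complement of a binary vector, the restriction x_{|v} (keep the
# positions where v[i] = 1), its row-wise analogue for matrices, and
# component-wise AND / OR.
import numpy as np

def wt(x):               return int(np.count_nonzero(x))
def complement(v):       return 1 - v
def restrict(x, v):      return x[v == 1]        # x_{|v}
def restrict_rows(X, v): return X[v == 1, :]     # X_{|v}: delete row i if v[i] = 0
def vand(x, y):          return x & y
def vor(x, y):           return x | y

s = np.array([1, 1, 1, 0, 0, 0])
v = np.array([0, 1, 1, 1, 0, 0])
print(wt(v), restrict(s, v))                     # 3 [1 1 0]  (cf. the example in Sect. 1.1)
```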

A function in \(\lambda \) is negligible, written \({{\mathrm{negl}}}(\lambda )\), if it vanishes faster than the inverse of any polynomial in \(\lambda \). An algorithm \(\mathcal {A}\) is probabilistic polynomial time (PPT) if \(\mathcal {A}\) uses some randomness as part of its logic (i.e., \(\mathcal {A}\) is probabilistic) and for any input \(\mathbf {x}\in \{0,1\}^*\) the computation of \(\mathcal {A}(\mathbf {x})\) terminates in at most \({{\mathrm{poly}}}(|{\mathbf {x}}|)\) steps.

2.2 Authentication Protocols

An authentication protocol is an interactive protocol executed between a prover \(\mathcal {P}\) and a verifier \(\mathcal {V}\), both PPT algorithms. Both hold a secret \(\mathbf {x}\) (generated using a key generation algorithm \(\mathcal {K}\) executed on the security parameter \(\lambda \) in unary) that has been shared in an initial phase. After the execution of the authentication protocol, \(\mathcal {V}\) outputs either \(\mathsf {accept}\) or \(\mathsf {reject}\). We say that the protocol has completeness error \(\alpha \) if for all secret keys \(\mathbf {x}\) generated by \(\mathcal {K}(1^\lambda )\), the honestly executed protocol returns \(\mathsf {reject}\) with probability at most \(\alpha \).

Passive Attacks. An authentication protocol is secure against passive attacks if there exists no PPT adversary \(\mathcal {A}\) who can win the following game with non-negligible probability: In a first phase, we sample a key \(\mathbf {x}\leftarrow \mathcal {K}(1^\lambda )\), and then \(\mathcal {A}\) gets to see any number of transcripts from the protocol execution between \(\mathcal {P}(\mathbf {x})\) and \(\mathcal {V}(\mathbf {x})\) (including \(\mathcal {V}\)’s final decision \(\mathsf {accept}\) or \(\mathsf {reject}\)). In a second phase, \(\mathcal {A}\) interacts with \(\mathcal {V}(\mathbf {x})\) and wins if the verifier outputs \(\mathsf {accept}\). Here, we only give the adversary one shot to convince the verifier.Footnote 5

Active Attacks. A stronger notion for authentication protocols is security against active attacks. Here, the second phase is the same as in a passive attack, but in the first phase, the adversary \(\mathcal {A}\) is additionally given access to \(\mathcal {P}(\mathbf {x})\). For two-round protocols, there is no difference between concurrent and sequential execution of the sessions with the prover.

We say an authentication protocol is \((t,Q,\varepsilon )\)-secure against active adversaries if every adversary \(\mathcal {A}\), running in time at most t and making Q queries to the honest prover, has probability at most \(\varepsilon \) to win the above game.

Man-in-the-Middle Attacks. The strongest standard security notion for authentication protocols is security against man-in-the-middle (MIM) attacks. Here, the adversary can interact concurrently in many sessions between the prover and the verifier. The adversary gets to learn the verifier \(\mathsf {accept}/\mathsf {reject}\) decisions, and wins whenever it manages to let the verifier accept in a session where it changed at least one of the messages sent by the prover or the verifier. One can construct two-round authentication schemes which are secure against MIM attacks from basic cryptographic primitives like MACs, which we define next.

2.3 Message Authentication Codes

A message authentication code \(\mathsf {MAC}=\{\mathcal {K},\mathcal {T},\mathcal {V}\}\) is a triple of algorithms with associated key space \(\mathcal {K}\), message space \(\mathcal {M}\) and tag space \(\mathcal {T}\).

  • Key Generation. The probabilistic key generation algorithm \(\mathcal {K}\) takes as input a security parameter \(\lambda \in \mathbb {N}\) (in unary) and outputs a secret key \(K\in \mathcal {K}\).

  • Tagging. The probabilistic authentication algorithm \(\mathcal {T}\) takes as input a secret key \(K \in \mathcal {K}\) and a message \(m\in \mathcal {M}\) and outputs an authentication tag \(\phi \in \mathcal {T}\).

  • Verification. The deterministic verification algorithm \(\mathcal {V}\) takes as input a secret key \(K \in \mathcal {K}\), a message \(m\in \mathcal {M}\) and a tag \(\phi \in \mathcal {T}\) and outputs \(\{\mathsf {accept},\mathsf {reject}\}\).

If the \(\mathcal {T}\) algorithm is deterministic, one does not have to explicitly define \(\mathcal {V}\), since it is already defined by the \(\mathcal {T}\) algorithm as \(\mathcal {V}(K,m,\phi )=\mathsf {accept}\) iff \(\mathcal {T}(K,m)=\phi \).

Completeness. We say that \(\mathsf {MAC}\) has completeness error \(\alpha =\alpha (\lambda )\) if for all \(m\in \mathcal {M}\) and \(\lambda \in \mathbb {N}\)

$$\begin{aligned} \Pr [\mathcal {V}(K,m,\phi )=\mathsf {reject}:~K\leftarrow \mathcal {K}(1^\lambda );\phi \leftarrow \mathcal {T}(K,m)] \le \alpha . \end{aligned}$$
Fig. 3: Experiment \(\mathbf {Exp}^{\mathrm {uf}\hbox {-}\mathrm {cma}}_{\mathsf {MAC}}(\mathcal {A},\lambda )\) defining uf-cma security of \(\mathsf {MAC}\)

Security. The standard security notion for a MAC is unforgeability under a chosen message attack (uf-cma). Formally, this is the probability that the experiment \(\mathbf {Exp}^{\mathrm {uf}\hbox {-}\mathrm {cma}}_{\mathsf {MAC}}(\mathcal {A},\lambda )\) of Fig. 3 outputs 1. The experiment features an adversary \(\mathcal {A} \) that issues tag queries on messages \(m\), and verification queries on pairs \((m,\phi )\); the adversary is successful if she ever asks a verification query \((m,\phi )\) that is accepted, for some message \(m\) not previously asked to the tag oracle (i.e., \(\mathcal {A}\) has found a valid forgery). We say that \(\mathsf {MAC}\) is \((t,Q,\varepsilon )\)-secure against uf-cma adversaries if for any \(\mathcal {A}\) running in time t, and asking a total number of Q queries to her oracles, we have \(\Pr [\mathbf {Exp}^{\mathrm {uf}\hbox {-}\mathrm {cma}}_{\mathsf {MAC}}(\mathcal {A},\lambda ) = 1] \le \varepsilon \).
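The following skeleton (ours; it is not the pseudocode of Fig. 3, and the oracle and variable names are our own) spells out the uf-cma game just described for abstract algorithms \(\mathcal {K},\mathcal {T},\mathcal {V}\).

```python
# A sketch (ours) of the uf-cma experiment: the adversary gets a tagging
# oracle and a verification oracle and wins if some verification query
# (m, phi) is accepted for a message m never submitted to the tagging oracle.
def exp_uf_cma(K, T, V, adversary, lam):
    key = K(lam)
    tagged, forgery_found = set(), False

    def tag_oracle(m):
        tagged.add(m)
        return T(key, m)

    def vrfy_oracle(m, phi):
        nonlocal forgery_found
        ok = (V(key, m, phi) == "accept")
        if ok and m not in tagged:
            forgery_found = True                 # valid forgery found
        return ok

    adversary(tag_oracle, vrfy_oracle)           # adversary interacts with both oracles
    return 1 if forgery_found else 0
```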

2.4 Hard Learning Problems

Let \(\mathsf{Ber}_\tau \) be the Bernoulli distribution over \(\mathbb {Z}_2\) with parameter (bias) \(\tau \in ~]0,1/2[\) (i.e., \(\Pr [x=1]=\tau \) if \(x\leftarrow \mathsf{Ber}_\tau \)). For \(\ell \ge 1\), \(\mathsf{Ber}_\tau ^\ell \) denotes the distribution over \(\mathbb {Z}_2^\ell \) where each vector consists of \(\ell \) independent samples drawn from \(\mathsf{Ber}_\tau \). Given a secret \(\mathbf {x}\in \mathbb {Z}_2^{\ell }\) and \(\tau \in ]0,\frac{1}{2}[\) , we write \(\Lambda _{\tau ,\ell }(\mathbf {x})\) for the distribution over \(\mathbb {Z}_2^\ell \times \mathbb {Z}_2\) whose samples are obtained by sampling vectors \(\mathbf {r}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\ell \) and \(e \mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathsf{Ber}_{\tau }\) outputting \((\mathbf {r},\mathbf {r}^{\mathsf {T}}\cdot \mathbf {x}+ e)\).

2.4.1 Learning Parity with Noise

The LPN assumption, formally defined below, states that it is hard to distinguish \(\Lambda _{\tau ,\ell }(\mathbf {x})\) (with a random secret \(\mathbf {x}\in \mathbb {Z}_2^\ell \)) from the uniform distribution on \(\ell +1\) bits denoted \(U_{\ell +1}\).

Definition 2.1

(Learning Parity with Noise) The (decisional) \(\mathsf {LPN}_{\tau ,\ell }\) problem is \((t,Q,\varepsilon )\)-hard if for every distinguisher \(\mathcal {D} \) running in time t and making Q queries,

$$\begin{aligned} \left| \Pr \left[ \mathcal {D} ^{\Lambda _{\tau ,\ell }(\mathbf {x})}=1;\ \mathbf {x}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\ell \right] - \Pr \left[ \mathcal {D} ^{U_{\ell +1}}=1\right] \right| \le \varepsilon . \end{aligned}$$

It will sometimes be convenient to think of \(U_{\ell +1}\) as LPN samples with uniform errors; note that for any \(\mathbf {x}\), the distributions \(\Lambda _{1/2,\ell }(\mathbf {x})\) and \(U_{\ell +1}\) are the same.
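The following sketch (ours, with illustrative parameters) implements the two oracles of Definition 2.1; the LPN assumption is precisely that no efficient distinguisher can tell them apart.

```python
# Illustrative oracles (ours) for Definition 2.1. Lambda_{tau,ell}(x) returns
# (r, r^T x + e) with e ~ Ber_tau; U_{ell+1} returns a uniform (ell+1)-bit
# string.
import numpy as np

rng = np.random.default_rng(4)
ell, tau = 512, 0.125                            # illustrative parameters

x = rng.integers(0, 2, ell)                      # the (hidden) LPN secret

def lpn_oracle():
    r = rng.integers(0, 2, ell)
    e = int(rng.random() < tau)                  # Bernoulli_tau error
    return r, (int(r @ x) + e) % 2

def uniform_oracle():
    return rng.integers(0, 2, ell), int(rng.integers(0, 2))
```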

2.4.2 Subspace Learning Parity with Noise

We now define the (seemingly) stronger subspace LPN assumption (SLPN for short) introduced in [40]. Here, the adversary can ask for inner products not only with the secret \(\mathbf {x}\), but with any affine function \(\mathbf {A}\mathbf {x}+\mathbf {b}\) of \(\mathbf {x}\) where \(\mathbf {A}\) can be any (adversarially and adaptively chosen) matrix of sufficiently high rank. For minimal dimension \(d\le \ell \), a secret \(\mathbf {x}\in \mathbb {Z}_2^\ell \) and any \(\mathbf {A}\in \mathbb {Z}_2^{\ell \times \ell }, \mathbf {b}\in \mathbb {Z}_2^\ell \) we define the distribution

$$\begin{aligned} \Gamma _{\tau ,\ell ,d}(\mathbf {x},\mathbf {A},\mathbf {b})=\left\{ \begin{array}{ll} \bot &{}\quad {\text {if }}\mathsf {rank}(\mathbf {A})<d\\ \Lambda _{\tau ,\ell }(\mathbf {A}\cdot \mathbf {x}+\mathbf {b})&{}\quad {\text {otherwise}} \end{array}\right. \end{aligned}$$

and let \(\Gamma _{\tau ,\ell ,d}(\mathbf {x},\cdot ,\cdot )\) denote the oracle which on input \(\mathbf {A},\mathbf {b}\) outputs a sample from \(\Gamma _{\tau ,\ell ,d}(\mathbf {x},\mathbf {A},\mathbf {b})\).

Definition 2.2

(Subspace LPN) Let \(\ell ,d\in \mathbb {Z}\) where \(d\le \ell \). The (decisional) \(\mathsf {SLPN}_{\tau ,\ell ,d}\) problem is \((t,Q,\varepsilon )\)-hard if for every distinguisher \(\mathcal {D} \) running in time t and making Q queries,

$$\begin{aligned} \left| \Pr \left[ \mathcal {D} ^{\Gamma _{\tau ,\ell ,d}(\mathbf {x},\cdot ,\cdot )}=1;\ \mathbf {x}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\ell \right] - \Pr \left[ \mathcal {D} ^{U_{\ell +1}(\cdot ,\cdot )}=1\right] \right| \le \varepsilon , \end{aligned}$$

where \(U_{\ell +1}(\cdot ,\cdot )\) on input \(\mathbf {A},\mathbf {b}\) outputs a sample of \(U_{\ell +1}\) if \(\mathsf {rank}(\mathbf {A})\ge d\) and \(\bot \) otherwise.

The following proposition states that the subspace LPN problem mapping to dimension \(d+g\) is almost as hard as the standard LPN problem with secrets of length d, the hardness gap being exponentially small in g. This proposition is a special case of Theorem 1 from [40]. To be self-contained, we include a proof of this proposition, but first we state a much simpler result, which will be used in the construction of our second MAC, namely that the subspace LPN problem is exactly as hard as the LPN problem if \(d=\ell \), i.e., the subspace must span the entire space.

Proposition 2.3

For any \(\ell \in \mathbb {Z}\), if the \(\mathsf {LPN}_{\tau ,\ell }\) problem is \((t,Q,\varepsilon )\)-hard, then the \(\mathsf {SLPN}_{\tau ,\ell ,\ell }\) problem is \((t',Q,\varepsilon )\)-hard, where \(t'=t-{{\mathrm{poly}}}(\ell ,Q)\).

Proof

Given a \(\Lambda _{\tau ,\ell }(\mathbf {x})\) sample \((\mathbf {r},\mathbf {r}^{\mathsf {T}}\mathbf {x}+ e)\), for any \(\mathbf {A},\mathbf {b}\) where \(\mathsf {rank}(\mathbf {A})=\ell \), we can transform it into a \(\Gamma _{\tau ,\ell ,\ell }(\mathbf {x},\mathbf {A},\mathbf {b})\) sample \(({\hat{\mathbf {r}}},{\hat{\mathbf {r}}}^{\mathsf {T}}(\mathbf {A}\mathbf {x}+\mathbf {b}) + e)\) by setting \({\hat{\mathbf {r}}}^{\mathsf {T}}=\mathbf {r}^{\mathsf {T}}\mathbf {A}^{-1}\) and outputting \(({\hat{\mathbf {r}}},\mathbf {r}^{\mathsf {T}}\mathbf {x}+e+{\hat{\mathbf {r}}}^{\mathsf {T}}\mathbf {b})\). We now have

$$\begin{aligned} ({\hat{\mathbf {r}}},\mathbf {r}^{\mathsf {T}}\mathbf {x}+e+{\hat{\mathbf {r}}}^{\mathsf {T}}\mathbf {b})= ({\hat{\mathbf {r}}},{\hat{\mathbf {r}}}^{\mathsf {T}}\mathbf {A}\mathbf {x}+e+{\hat{\mathbf {r}}}^{\mathsf {T}}\mathbf {b})= ({\hat{\mathbf {r}}},{\hat{\mathbf {r}}}^{\mathsf {T}}(\mathbf {A}\mathbf {x}+\mathbf {b})+e) \end{aligned}$$

It just remains to check that e and \({\hat{\mathbf {r}}}\) have the right distribution. The error e in the subspace LPN sample is exactly the same value as in the LPN sample and thus also distributed as \(\mathsf{Ber}_{\tau }\). The vector \({\hat{\mathbf {r}}}\) is uniform as required; to see this, recall that \(\mathbf {r}\) is uniform and \(\mathbf {A}\) has full rank. Using this transformation, any distinguisher for \(\mathsf {SLPN}_{\tau ,\ell ,\ell }\) versus uniform can be used to distinguish \(\mathsf {LPN}_{\tau ,\ell }\) from uniform with the same advantage.\(\square \)
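The following sketch (ours; toy sizes, and the Gaussian-elimination helper gf2_solve is our own) checks the transformation used in this proof on a single sample: given \((\mathbf {r},\mathbf {r}^{\mathsf {T}}\mathbf {x}+e)\), a full-rank \(\mathbf {A}\) and an arbitrary \(\mathbf {b}\), the transformed pair is a valid \(\Gamma _{\tau ,\ell ,\ell }(\mathbf {x},\mathbf {A},\mathbf {b})\) sample with the same error e.

```python
# A toy check (ours) of the reduction above on a single sample.
import numpy as np

def gf2_solve(A, b):
    """Solve A y = b over Z_2; return None if A is singular over Z_2."""
    A, b = A.copy() % 2, b.copy() % 2
    n = A.shape[0]
    for c in range(n):
        piv = next((r for r in range(c, n) if A[r, c]), None)
        if piv is None:
            return None
        A[[c, piv]] = A[[piv, c]]
        b[[c, piv]] = b[[piv, c]]
        for r in range(n):
            if r != c and A[r, c]:
                A[r] ^= A[c]
                b[r] ^= b[c]
    return b

rng = np.random.default_rng(5)
ell, tau = 32, 0.1
x = rng.integers(0, 2, ell)
b = rng.integers(0, 2, ell)

r = rng.integers(0, 2, ell)                      # one LPN sample (r, z)
e = int(rng.random() < tau)
z = (int(r @ x) + e) % 2

rhat = None
while rhat is None:                              # resample A until invertible over Z_2
    A = rng.integers(0, 2, (ell, ell))
    rhat = gf2_solve(A.T, r)                     # rhat^T = r^T A^{-1}  <=>  A^T rhat = r

z_hat = (z + int(rhat @ b)) % 2                  # output sample (rhat, z + rhat^T b)
target = (int(rhat @ ((A @ x + b) % 2)) + e) % 2 # rhat^T (A x + b) + e
print(z_hat == target)                           # True: same error e, secret A x + b
```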

Proposition 2.4

For any \(\ell ,d,g\in \mathbb {Z}\) (where \(\ell \ge d+g\)), if the \(\mathsf {LPN}_{\tau ,d}\) problem is \((t,Q,\varepsilon )\)-hard then the \(\mathsf {SLPN}_{\tau ,\ell ,d+g}\) problem is \((t',Q,\varepsilon ')\)-hard where

$$\begin{aligned} t'=t-{{\mathrm{poly}}}(\ell ,Q)\qquad \varepsilon '=\varepsilon + Q/2^{g}. \end{aligned}$$

Proof

Let \(\mathcal {D} '\) be an adversary with advantage \(\varepsilon '\) for the \(\mathsf {SLPN}_{\tau ,\ell ,d+g}\) problem; from \(\mathcal {D} '\) we will construct a distinguisher \(\mathcal {D} \) with advantage \(\varepsilon \ge \varepsilon '-Q/2^g\) for the \(\mathsf {LPN}_{\tau ,d}\) problem. W.l.o.g., we assume that the oracle queries of \(\mathcal {D} '\) are of the form \(\mathbf {A}, \mathbf {b}\) with \(\mathbf {b}= \mathbf 0\) since \(({\hat{\mathbf {r}}},{\hat{\mathbf {z}}}) \mapsto ({\hat{\mathbf {r}}},{\hat{\mathbf {z}}}+{\hat{\mathbf {r}}}^{\mathsf {T}}\mathbf {b})\) reduces queries with arbitrary \(\mathbf {b}\) to queries with \(\mathbf {b}=\mathbf 0\).

Our \(\mathcal {D} \) will transform the samples of the form \((\mathbf {r},\mathbf {r}^{\mathsf {T}}\mathbf {x}+e)\) it gets (where e is either sampled according to \(\mathsf{Ber}_\tau \) or uniform) into samples \(({\hat{\mathbf {r}}},{\hat{\mathbf {r}}}^{\mathsf {T}}\mathbf {A}{\hat{\mathbf {x}}}+e)\) for any \(\mathbf {A}\in \mathbb {Z}_2^{\ell \times \ell }\) of rank \(\ge d+g\). In particular, LPN samples are mapped to SLPN samples, and random samples are mapped to random samples. For each of the Q queries made by \(\mathcal {D} '\), the transformation will fail with probability at most \(2^{-g}\), which is where the \(Q/2^g\) loss in distinguishing advantage comes from. We now formally define \(\mathcal {D} \).

Initially, \(\mathcal {D} ^{\Lambda _{\delta ,d}(\mathbf {x})}\) (where \(\delta \) is either \(\tau \) or 1/2) samples \(\mathbf {W}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^{\ell \times d},\mathbf {w}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^{\ell }\), which (implicitly) define the secret \({\hat{\mathbf {x}}} = \mathbf {W}\mathbf {x}+ \mathbf {w}\) for the transformation. Now, \(\mathcal {D} ^{\Lambda _{\delta ,d}(\mathbf {x})}\) invokes \(\mathcal {D} '\) and answers every query \(\mathbf {A}\) of \(\mathcal {D} '\) as follows.

If \(\mathsf {rank}(\mathbf {A}) < d+g\) return \(\bot \) to \(\mathcal {D} '\). Otherwise, query the oracle \(\Lambda _{\delta ,d}(\mathbf {x})\) to get a sample \((\mathbf {r},\mathbf {r}^{\mathsf {T}}\cdot \mathbf {x}+ e)\). Define the set \(\mathcal {S}\subseteq \mathbb {Z}_2^\ell \) of solutions to the system of linear equations:

$$\begin{aligned} \mathcal {S}= \left\{ \mathbf {y}:~\mathbf {y}^{\mathsf {T}}\mathbf {A}\mathbf {W}= \mathbf {r}^{\mathsf {T}}\right\} \subseteq \mathbb {Z}_2^{\ell } \end{aligned}$$

Note that if \(\mathbf {A}\mathbf {W}\) has rank d then \(\mathcal {S}\) is non-empty. \(\mathcal {D} \) samples \({\hat{\mathbf {r}}}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathcal {S}\) and outputs the sample

$$\begin{aligned} ({\hat{\mathbf {r}}},\mathbf {r}^{\mathsf {T}}\mathbf {x}+ {\hat{\mathbf {r}}}^{\mathsf {T}}\mathbf {A}\mathbf {w}+ e ), \end{aligned}$$
(2.1)

Note that \(\mathcal {D} \) runs in time \(t'\) (as it must invoke \(\mathcal {D} '\)) plus some \({{\mathrm{poly}}}(\ell ,Q)\) overhead as claimed. It remains to show that the simulation performed by \(\mathcal {D} \) is correct. This is shown in the following claims.

Claim 2.5

If \(\mathbf {V}= \mathbf {A}\mathbf {W}\) has (full) rank d, then \({\hat{\mathbf {r}}}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathcal {S}\) is uniformly random (given \(\mathbf {A},\mathbf {W},\mathbf {w}\)).

Proof of Claim

We show that for any \(\mathbf {v}\in \mathbb {Z}_2^\ell \), \(\Pr [{\hat{\mathbf {r}}}=\mathbf {v}\mid \mathbf {W},\mathbf {A},\mathbf {w}]=2^{-\ell }\). First, as \(\mathbf {r}\in \mathbb {Z}_2^d\) is uniform, \(\Pr [\mathbf {v}^{\mathsf {T}}\mathbf {A}\mathbf {W}=\mathbf {r}^{\mathsf {T}}] =2^{-d}\); if this event does not hold, then \(\mathbf {v}\ne {\hat{\mathbf {r}}}\). Otherwise, \({\hat{\mathbf {r}}}\) is sampled uniformly from an \((\ell -d)\)-dimensional affine space, and thus \(\Pr [\mathbf {v}={\hat{\mathbf {r}}} \mid \mathbf {v}^{\mathsf {T}}\mathbf {A}\mathbf {W}=\mathbf {r}^{\mathsf {T}}]=2^{d-\ell }\). We get

$$\begin{aligned} \Pr [{\hat{\mathbf {r}}}=\mathbf {v}\mid \mathbf {W},\mathbf {A},\mathbf {w}]=\Pr [\mathbf {v}^{\mathsf {T}}\mathbf {A}\mathbf {W}=\mathbf {r}^{\mathsf {T}}]\Pr [\mathbf {v}={\hat{\mathbf {r}}} \mid \mathbf {v}^{\mathsf {T}}\mathbf {A}\mathbf {W}=\mathbf {r}^{\mathsf {T}}]= 2^{-d}2^{d-\ell }=2^{-\ell }. \end{aligned}$$

\(\square \)

Claim 2.6

\(\mathcal {D} \) perfectly simulates the distribution \(\Gamma _{\delta ,\ell ,d+g}({\hat{\mathbf {x}}},\mathbf {A})\) (where \({\hat{\mathbf {x}}} = \mathbf {W}\mathbf {x}+\mathbf {w}\)).

Proof of Claim

We can rewrite the samples of Eq. (2.1) as

$$\begin{aligned} ({\hat{\mathbf {r}}},\mathbf {r}^{\mathsf {T}}\mathbf {x}+ {\hat{\mathbf {r}}}^{\mathsf {T}}\mathbf {A}\mathbf {w}+ e)= & {} ({\hat{\mathbf {r}}},{\hat{\mathbf {r}}}^{\mathsf {T}}\mathbf {A}\mathbf {W}\mathbf {x}+ {\hat{\mathbf {r}}}^{\mathsf {T}}\mathbf {A}\mathbf {w}+ e) \qquad \left( \text {since } {\hat{\mathbf {r}}} \in \mathcal {S}\right) \\= & {} ({\hat{\mathbf {r}}},{\hat{\mathbf {r}}}^{\mathsf {T}}\mathbf {A}(\mathbf {W}\mathbf {x}+\mathbf {w}) + e) \\= & {} ({\hat{\mathbf {r}}},{\hat{\mathbf {r}}}^{\mathsf {T}}\mathbf {A}{\hat{\mathbf {x}}} + e) \end{aligned}$$

which is a sample from \(\Gamma _{\delta ,\ell ,d+g}({\hat{\mathbf {x}}},\mathbf {A})\) as required.\(\square \)

Claim 2.7

The probability that the set \(\mathcal {S}\) is empty is at most \(2^{-g}\).

Proof of Claim

Recall that the set \(\mathcal {S}\) can possibly be empty only when \(\mathbf {V}= \mathbf {A}\mathbf {W}\in \mathbb {Z}_2^{\ell \times d}\) has rank less than d, where \(\mathbf {A}\in \mathbb {Z}_2^{\ell \times \ell }\) has rank \(\ge d+g\) and \(\mathbf {W}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^{\ell \times d}\).

Denote with \(\Delta (d,g)\) the probability that a random matrix in \(\mathbb {Z}_2^{(d+g)\times d}\) has rank less than d. Since the matrix \(\mathbf {A}\) has rank at least \(d+g\), we can assume, without loss of generality, that the first \(d+g\) rows of \(\mathbf {A}\) are linearly independent. Since the matrix \(\mathbf {W}\) is random, the upper \((d+g) \times d\) matrix of \(\mathbf {V}= \mathbf {A}\mathbf {W}\) is uniformly random in \(\mathbb {Z}_2^{(d+g)\times d}\) and thus has rank less than d with probability \(\Delta (d,g)\). Since \(\mathbf {V}\) can have rank less than d only if this upper sub-matrix does, \(\mathbf {V}\) has rank strictly less than d with probability at most \(\Delta (d,g)\). Using Lemma 4.20, we see that this probability is bounded by \(2^{-g}\).

Applying the union bound, we can upper bound the probability that for any of the Q queries the matrix \(\mathbf {V}= \mathbf {A}\mathbf {W}\) has rank less than d by \(Q\cdot 2^{-g}\). This error probability is thus an upper bound on the gap between the success probability \(\varepsilon '\) of \(\mathcal {D} '\) and the success probability \(\varepsilon \) we get in breaking LPN using the transformation.

Finally, we need to consider the fact that the queries \(\mathbf {A}\) chosen by \(\mathcal {D} '\) are chosen adaptively. To show that adaptivity does not help in picking an \(\mathbf {A}\) where \(\mathbf {A}\mathbf {W}\) has rank \(< d\), we must show that the view of \(\mathcal {D} '\) is independent of \(\mathbf {W}\) (except for the fact that so far no query was made where \(\mathsf {rank}(\mathbf {A}\mathbf {W}) < d\)). This holds as the secret \({\hat{\mathbf {x}}} = \mathbf {W}\mathbf {x}+\mathbf {w}\) used in the simulation is independent of \(\mathbf {W}\) as it is blinded with a uniform \(\mathbf {w}\). In fact, the only reason we use this blinding is to enforce this independence.\(\square \)

2.4.3 Subset Learning Parity with Noise

For some of our constructions, we will only need a weaker version of the \(\mathsf {SLPN}_{\tau ,\ell ,d}\) problem that we call subset LPN. As the name suggests, here the adversary does not ask for inner products with \(\mathbf {A}\mathbf {x}+ \mathbf {b}\) for any \(\mathbf {A}\) (of rank \(\ge d\)), but only with subsets of \(\mathbf {x}\) (of size \(\ge d\)). It will be convenient to explicitly define this special case. For \(\mathbf {x},\mathbf {v}\in \mathbb {Z}_2^{\ell }\), let \(\mathsf {diag}(\mathbf {v})\in \mathbb {Z}_2^{\ell \times \ell }\) denote the matrix with \(\mathbf {v}\) in the diagonal and zero elsewhere, and let

$$\begin{aligned} \Gamma ^*_{\tau ,\ell ,d}(\mathbf {x},\mathbf {v}){:}{=}\Gamma _{\tau ,\ell ,d}(\mathbf {x},\mathsf {diag}(\mathbf {v})) =\left\{ \begin{array}{ll} \bot &{}\quad {\text {if }}\mathbf{wt}(\mathbf {v})<d\\ \Lambda _{\tau ,\ell }(\mathbf {x}\wedge \mathbf {v})&{}\quad {\text {otherwise.}} \end{array}\right. \end{aligned}$$

Definition 2.8

(Subset LPN) Let \(\ell ,d\in \mathbb {Z}\) where \(d\le \ell \). The \(\mathsf {SLPN}^*_{\tau ,\ell ,d}\) problem is \((t,Q,\varepsilon )\)-hard if for every distinguisher \(\mathcal {D} \) running in time t and making Q queries,

$$\begin{aligned} \left| \Pr \left[ \mathcal {D} ^{\Gamma ^*_{\tau ,\ell ,d}(\mathbf {x},\cdot )}=1; \mathbf {x}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\ell \right] - \Pr \left[ \mathcal {D} ^{U_{\ell + 1}(\cdot )}=1\right] \right| \le \varepsilon , \end{aligned}$$

where \(U_{\ell + 1}(\cdot )\) on input \(\mathbf {v}\) outputs a sample of \(U_{\ell +1}\) if \(\mathbf{wt}(\mathbf {v})\ge d\) and \(\bot \) otherwise.

Remark 2.9

\(\Gamma ^*_{\tau ,\ell ,d}(\mathbf {x},\mathbf {v})\) samples are of the form \((\mathbf {r}, \mathbf {r}^{\mathsf {T}}_{\downarrow \mathbf {v}} \mathbf {x}_{\downarrow \mathbf {v}} + e)\in \mathbb {Z}_2^{\ell +1}\), where \(e \mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathsf{Ber}_\tau \). To compute the inner product, only \(\mathbf {r}_{\downarrow \mathbf {v}}\in \mathbb {Z}_2^{\mathbf{wt}(\mathbf {v})}\) is needed; the remaining bits \(\mathbf {r}_{\downarrow \overline{\mathbf {v}}}\in \mathbb {Z}_2^{\ell -\mathbf{wt}(\mathbf {v})}\) are irrelevant. We use this observation to improve the communication complexity (for protocols) or tag length (for MACs), by using “compressed” samples of the form \((\mathbf {r}_{\downarrow \mathbf {v}},\mathbf {r}^{\mathsf {T}}_{\downarrow \mathbf {v}}\mathbf {x}_{\downarrow \mathbf {v}} + e)\in \mathbb {Z}_2^{\mathbf{wt}(\mathbf {v})+1}\).
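The following sketch (ours, with illustrative sizes) checks the identity behind this compression: the label of a subset LPN sample depends only on the positions selected by \(\mathbf {v}\).

```python
# A toy check (ours) of Remark 2.9: r^T (x AND v) equals r_{|v}^T x_{|v}, so
# the compressed sample (r_{|v}, r_{|v}^T x_{|v} + e) carries the same label.
import numpy as np

rng = np.random.default_rng(6)
ell, tau = 32, 0.05                              # illustrative sizes

x = rng.integers(0, 2, ell)                      # the secret
v = rng.integers(0, 2, ell)                      # the subset / "bit selector"

r = rng.integers(0, 2, ell)
e = int(rng.random() < tau)

full_label = (int(r @ (x & v)) + e) % 2          # label of the uncompressed sample
comp_label = (int(r[v == 1] @ x[v == 1]) + e) % 2  # label of the compressed sample
print(full_label == comp_label,                  # True
      int(np.count_nonzero(v)) + 1, "bits instead of", ell + 1)
```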

2.5 Hoeffding Inequality

We will repeatedly use the following tail inequality, known as Hoeffding’s inequality.

Theorem 2.10

[26] Let \(X_1,X_2,\ldots ,X_n\) be i.i.d. Bernoulli variables with \(\Pr [X_i=1]=p\) and \(X=\sum _{i=1}^n X_i\) (so \(\mathrm {E}[X]=np\)). Then, for \(\epsilon \ge 0\),

$$\begin{aligned} \Pr [X-np \ge \epsilon n]\le & {} e^{-2\epsilon ^2n}\\ \Pr [|X-np| \ge \epsilon n]\le & {} 2e^{-2\epsilon ^2n}. \end{aligned}$$

The above also holds if the \(X_i\) are sampled without replacement.

3 Two-Round Authentication with Active Security

In this section, we describe our new two-round authentication protocol and prove its active security under the hardness of the \(\mathsf {SLPN}^*_{\tau ,2\ell ,d}\) problem, where \(d=\ell /(2+\gamma )\) for some constant \(\gamma >0\).

  • Public Parameters. The authentication protocol has the following public parameters, where \(\tau ,\tau '\) are constants and \(\ell ,n\) depend on the security parameter \(\lambda \).Footnote 6

    \(\ell \in \mathbb {N}\): length of the secret key \(\mathbf {s}\in \mathbb {Z}_2^{2 \ell }\)

    \(\tau \in {]}0,1/2{[}\): parameter of the Bernoulli error distribution \(\mathsf{Ber}_\tau \)

    \(\tau ' = 1/4+\tau /2\): acceptance threshold

    \(n \in \mathbb {N}\): number of parallel repetitions (we require \(\ell \ge 2n\))

  • Key Generation. Algorithm \(\mathcal {K}(1^\lambda )\) samples \(\mathbf {s}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^{2\ell }\) and returns \(\mathbf {s}\) as the secret key.

  • Authentication Protocol. The two-round authentication protocol with prover \(\mathcal {P}_{\tau ,n}\) and verifier \(\mathcal {V}_{\tau ',n}\) is given in Fig. 4.

Fig. 4: Two-round protocol \(\mathsf {AUTH}\) with active security from the LPN assumption
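The following sketch (ours; parameters are illustrative and the prover-side sanity checks on the challenge are omitted) runs one honest execution of \(\mathsf {AUTH}\) with the verifier's two checks used in the protocol: \(\mathbf {R}\) has full rank over \(\mathbb {Z}_2\), and at most \(n\cdot \tau '\) of the parity checks are violated.

```python
# A toy run (ours) of one honest execution of AUTH from Fig. 4.
import numpy as np

def gf2_rank(M):
    """Rank of a binary matrix over Z_2 (Gaussian elimination)."""
    M = M.copy() % 2
    rank = 0
    for c in range(M.shape[1]):
        piv = next((r for r in range(rank, M.shape[0]) if M[r, c]), None)
        if piv is None:
            continue
        M[[rank, piv]] = M[[piv, rank]]
        for r in range(M.shape[0]):
            if r != rank and M[r, c]:
                M[r] ^= M[rank]
        rank += 1
    return rank

rng = np.random.default_rng(7)
ell, n, tau = 128, 32, 0.05                      # illustrative; the protocol needs ell >= 2n
tau_prime = 0.25 + tau / 2

s = rng.integers(0, 2, 2 * ell)                  # K(1^lambda): secret key of length 2*ell

# Verifier -> Prover: a challenge v of weight exactly ell
v = np.zeros(2 * ell, dtype=int)
v[rng.choice(2 * ell, ell, replace=False)] = 1

# Prover -> Verifier: (R, z) with z = R^T s_{|v} + e
s_v = s[v == 1]
R = rng.integers(0, 2, (ell, n))
e = (rng.random(n) < tau).astype(int)
z = (R.T @ s_v + e) % 2

# Verifier's decision
full_rank = gf2_rank(R) == n
wrong = np.count_nonzero((z - R.T @ s_v) % 2)
print("accept" if full_rank and wrong <= n * tau_prime else "reject")
```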

Theorem 3.1

For any constant \(\gamma >0\), let \(d = \ell /(2+\gamma )\). If the \(\mathsf {SLPN}^*_{\tau ,2\ell ,d}\) problem is \((t,nQ,\varepsilon )\)-hard, then the authentication protocol from Fig. 4 is \((t',Q,\varepsilon ')\)-secure against active adversaries, where for constants \(c_\gamma ,c_\tau >0\) that depend only on \(\gamma \) and \(\tau \), respectively,

$$\begin{aligned} t' = t - {{\mathrm{poly}}}(Q,\ell ) \quad \varepsilon '=\varepsilon +Q\cdot 2^{-c_\gamma \cdot \ell }+2^{-c_\tau \cdot n} \ =\ \varepsilon +Q\cdot 2 ^{-\Theta (n)}. \end{aligned}$$

The protocol has completeness error \(2^{-c'_\tau \cdot n}+2^{n-\ell }\in 2^{-\Theta (n)}\) (as \(\ell \ge 2n\)) where \(c'_\tau >0\) depends only on \(\tau \).

3.1 Proof of Completeness

We first bound the completeness error \(\alpha \) of the authentication protocol. For any \(n\in \mathbb {N}\), \(\tau \in ]0,1/2[\), let

$$\begin{aligned} \alpha _{\tau ,n}{:}{=}\Pr \left[ \mathbf{wt}(\mathbf {e})>n\cdot \tau ':~\mathbf {e}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathsf{Ber}_\tau ^n\right] \mathop {\le }\limits ^{(\mathrm{Thm.}2.10)}2e^{-2\left( \frac{1}{4}-\frac{\tau }{2}\right) ^2n} \mathop {=}\limits ^{\small {\mathsf {def}}}2^{-c'_{\tau }\cdot n} \end{aligned}$$
(3.1)

denote the probability that n independent Bernoulli samples with bias \(\tau \) contain more than a \(\tau '{:}{=}1/4+\tau /2\) fraction of 1’s (i.e., \(n(1/4+\tau /2)-n\tau =n(1/4-\tau /2)\) more than the expected number).

The verifier performs the following two checks. In the first verification step, the verifier rejects if the random matrix \(\mathbf {R}\) does not have full rank. By Lemma 4.20 (cf. Appendix B), the probability of this event is \(\le 2^{n-\ell }\). Now, let \(\mathbf {e}{:}{=}\mathbf {z}+ \mathbf {R}^{\mathsf {T}}\cdot \mathbf {s}_{\downarrow \mathbf {v}}\) denote the noise added by \(\mathcal {P}_{\tau ,n}\). Then, in the second verification step, the verifier rejects if \(\mathbf{wt}(\mathbf {e})> n\cdot \tau '\). From Eq. (3.1), we have that this happens with probability \(\alpha _{\tau ,n}\). It follows that \(\alpha \le 2^{n-\ell } + 2^{-c'_\tau \cdot n}\).

3.2 Proof of Security

We first define some more terms that will be used later in the security proof. For a constant \(\gamma >0\), let \(d= \ell /(2+\gamma )\) (as in Theorem 3.1). Let \(\alpha '_{\ell ,d}\) denote the probability that a random substring of length \(\ell \) chosen from a string of length \(2\ell \) with Hamming weight \(\ell \) has Hamming weight less than \(d=\ell /(2+\gamma )\). Using the Hoeffding bound, we see there is a constant \(c_\gamma >0\) s.t.

$$\begin{aligned} \alpha '_{\ell ,d} \mathop {\le }\limits ^{(\mathrm{Thm.}2.10)}2e^{-2\left( \frac{1}{2}-\frac{1}{2+\gamma }\right) ^2\ell } \mathop {=}\limits ^{\small {\mathsf {def}}}2^{-c_\gamma \cdot \ell }. \end{aligned}$$
(3.2)

For \(\tau ' = 1/4+\tau /2\), let \(\alpha ''_{\tau ',n}\) denote the probability that a random bitstring \(\mathbf {y}\in \mathbb {Z}_2^n\) has Hamming weight \(\mathbf{wt}(\mathbf {y})\le n\cdot \tau '\). From the Hoeffding bound, it follows that there exists a constant \(c_\tau >0\) (only depending on \(\tau \)), such that (below we use \(1/2-\tau '=1/4-\tau /2\))

$$\begin{aligned} \alpha ''_{\tau ',n}{:}{=}2^{-n}\cdot \sum _{i=0}^{\lfloor n\cdot \tau '\rfloor }{n \atopwithdelims ()i} \mathop {\le }\limits ^{(\mathrm{Thm.}2.10)}2e^{-2\left( \frac{1}{4}-\frac{\tau }{2}\right) ^2n} \mathop {=}\limits ^{\small {\mathsf {def}}}2^{-c_{\tau }\cdot n}. \end{aligned}$$
(3.3)

We now prove security of the authentication protocol. Consider an oracle \(\mathcal {O}\) which is either the subset LPN oracle \(\Gamma ^*_{\tau , 2\ell ,d}(\mathbf {x},\cdot )\) or \(U_{2\ell +1}(\cdot )\), as defined in Definition 2.8. We will construct an adversary \(\mathcal {B} ^\mathcal {O}\) that uses \(\mathcal {A} \) (who breaks the active security of \(\mathsf {AUTH}\) with advantage \(\varepsilon '\)) in a black-box way such that:

$$\begin{aligned} \Pr [\mathcal {B} ^{\Gamma ^*_{\tau ,2\ell ,d}(\mathbf {x},\cdot )} = 1]\ge \varepsilon ' \quad {\text {and}}\quad \Pr [\mathcal {B} ^{U_{2\ell + 1}(\cdot )} = 1]\le \alpha ''_{\tau ',n} +Q\cdot \alpha '_{\ell ,d} . \end{aligned}$$

Thus \(\mathcal {B} ^\mathcal {O}\) can distinguish between the two oracles with advantage at least \(\varepsilon '-Q\cdot \alpha '_{\ell ,d}-\alpha ''_{\tau ',n}\ge \varepsilon '-Q\cdot 2^{-c_\gamma \cdot \ell }-2^{-c_\tau \cdot n}=\varepsilon \), as claimed in the statement of the theorem. Below we define \(\mathcal {B} ^\mathcal {O}\).

Setup:

Initially, \(\mathcal {B} ^\mathcal {O}\) samples

$$\begin{aligned} \mathbf {x}^*\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^{2\ell }, \qquad \mathbf {v}^*\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\left\{ \mathbf {y}\in \mathbb {Z}_2^{2\ell } : \mathbf{wt}(\mathbf {y})=\ell \right\} . \end{aligned}$$

The intuition of our simulation below is as follows. Let us first assume \(\mathcal {O}\) is a subset LPN oracle \(\Gamma ^*_{\tau ,2\ell ,d}(\mathbf {x},\cdot )\) with secret \(\mathbf {x}\). To simulate the prover during the first phase we have to produce answers \(\phi = (\mathbf {R},\mathbf {z})\) to each query \(\mathbf {v}\in \{\mathbf {y}\in \mathbb {Z}_2^{2\ell } : \mathbf{wt}(\mathbf {y})=\ell \}\) issued by \(\mathcal {A} \). The simulated answers will have the same distribution as the answers of an honest prover \(\mathcal {P}_{\tau ,n}(\mathbf {s}\in \mathbb {Z}_2^{2\ell })\) where

$$\begin{aligned} \mathbf {s}= (\mathbf {x}^*\wedge \mathbf {v}^*)+ (\mathbf {x}\wedge \overline{\mathbf {v}}^*). \end{aligned}$$
(3.4)

Thus one half of the bits of \(\mathbf {s}\) come from \(\mathbf {x}^*\), and the other half come from the unknown secret \(\mathbf {x}\) (for which we use the oracle \(\mathcal {O}\)). In the second phase, we give \(\mathcal {A} \) the challenge \(\mathbf {v}^*\). As \(\mathbf {s}_{\downarrow \mathbf {v}^*} = (\mathbf {x}^*\wedge \mathbf {v}^*)_{\downarrow \mathbf {v}^*} = \mathbf {x}^*_{\downarrow \mathbf {v}^*}\) is known, we will be able to verify if \(\mathcal {A} \) did output a valid answer.

On the other hand, if \(\mathcal {O}\) is the uniform oracle \(U_{2\ell + 1}(\cdot )\), then after the interaction with the prover we will show that \(\mathbf {s}_{\downarrow \mathbf {v}^*}=(\mathbf {x}^*\wedge \mathbf {v}^*)_{\downarrow \mathbf {v}^*}\) is information theoretically hidden, and thus \(\mathcal {A} \) cannot succeed except with exponentially small probability.

First phase:

In the first phase \(\mathcal {B} ^\mathcal {O}\) invokes \(\mathcal {A} \) who expects access to \(\mathcal {P}_{\tau ,n}(\mathbf {s}\in \mathbb {Z}_2^{2\ell })\). We now specify how \(\mathcal {B} ^\mathcal {O}\) samples the answer \(\phi = (\mathbf {R},\mathbf {z})\) to a query \(\mathbf {v}\in \{\mathbf {y}\in \mathbb {Z}_2^{2\ell } : \mathbf{wt}(\mathbf {y})=\ell \}\) made by \(\mathcal {A} \). Let

$$\begin{aligned} \mathbf {u}^* {:}{=}\mathbf {v}\wedge \mathbf {v}^*\quad \mathbf {u}{:}{=}\mathbf {v}\wedge \overline{\mathbf {v}}^*. \end{aligned}$$
  1.

    \(\mathcal {B} ^\mathcal {O}\) queries its oracle n times on the input \(\mathbf {u}\). If the oracle’s output is \(\bot \) (this is the case if \(\mathbf{wt}(\mathbf {u})<d\)), then \(\mathcal {B} ^\mathcal {O}\) outputs 1 and stops. Otherwise, let \({\hat{\mathbf {R}}}_1\in \mathbb {Z}_2^{2\ell \times n},\mathbf {z}_1\in \mathbb {Z}_2^n\) denote the n outputs of the oracle.

  2.

    Sample \({\hat{\mathbf {R}}}_0 \mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^{2\ell \times n}\) and set \(\mathbf {z}_0 = {\hat{\mathbf {R}}}_0^{\mathsf {T}}\cdot (\mathbf {x}^* \wedge \mathbf {u}^*)\).

  3.

    Return \(\phi = (\mathbf {R}={\hat{\mathbf {R}}}_{\downarrow \mathbf {v}}\in \mathbb {Z}_2^{\ell \times n}, \mathbf {z}=\mathbf {z}_0 + \mathbf {z}_1\in \mathbb {Z}_2^n)\), where \({\hat{\mathbf {R}}}\) is uniquely determined by requiring \({\hat{\mathbf {R}}}_{\downarrow \mathbf {v}^*}={\hat{\mathbf {R}}}_0\) and \({\hat{\mathbf {R}}}_{\downarrow \overline{\mathbf {v}}^*}={\hat{\mathbf {R}}}_1\).

Second phase:

Eventually, \(\mathcal {A} \) enters the second phase of the active attack expecting a challenge from \(\mathcal {V}_{\tau ',n}(\mathbf {s}\in \mathbb {Z}_2^{2\ell })\).

  1.

    \(\mathcal {B} ^\mathcal {O}\) forwards \(\mathbf {v}^*\) as the challenge to \(\mathcal {A} \).

  2.

    \(\mathcal {A} \) answers with some \((\mathbf {R}^*,\mathbf {z}^*)\).

  3.

    \(\mathcal {B} ^\mathcal {O}\) checks if

    $$\begin{aligned} \mathsf{rank}(\mathbf {R}^*)=n\quad {\text {and}}\quad \mathbf{wt}(\mathbf {z}^* + {\mathbf {R}^*}^{\mathsf {T}}\cdot \mathbf {x}^*_{\downarrow \mathbf {v}^*} )\le n \cdot \tau '. \end{aligned}$$
    (3.5)

The output is 1 if both checks succeed and 0 otherwise.

Claim 3.2

\(\Pr [\mathcal {B} ^{U_{2\ell + 1}(\cdot )} = 1]\ \le \ \alpha ''_{\tau ',n}+Q\cdot \alpha '_{\ell ,d}\).

Proof of Claim

Recall that we defined \(\mathcal {B} \) such that it will output 1 and stop in the first phase of the experiment whenever it receives \(\bot \) as output from its oracle. We defined \({U_{2\ell + 1}(\cdot )}\) to output \(\bot \) if it is queried on an input \(\mathbf {u}=\mathbf {v}\wedge \overline{\mathbf {v}}^*\) where \(\mathbf{wt}(\mathbf {u})<d\). Since \(\mathbf {v}^*\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\{ \mathbf {y}\in \mathbb {Z}_2^{2\ell } : \mathbf{wt}(\mathbf {y})=\ell \}\), for any \(\mathbf {v}\), the probability that \(\mathbf{wt}(\mathbf {v}\wedge \overline{\mathbf {v}}^*)<d\) is (by definition) \(\alpha '_{\ell ,d}\) as defined in Eq. (3.2). Using the union bound, we can upper bound the probability that \(\mathbf{wt}(\mathbf {v}\wedge \overline{\mathbf {v}}^*)<d\) for any of the Q different \(\mathbf {v}\)’s chosen by the adversary in the first phase as \(Q\cdot \alpha '_{\ell ,d}\).

If we did not output 1 in the first phase and in the second phase \(\mathbf {R}^*\) does not have full rank then \(\mathcal {B} \) outputs 0 by definition. Therefore, we now consider the case where \(\mathsf {rank}(\mathbf {R}^*)=n\).

The answers \(\phi = (\mathbf {R},\mathbf {z})\) that the adversary \(\mathcal {A} \) obtained from \(\mathcal {B} ^{U_{2\ell + 1}(\cdot )}\) in the first phase are independent of \(\mathbf {x}^*\) (i.e., \(\mathbf {z}= \mathbf {z}_0 + \mathbf {z}_1\) is uniform as \(\mathbf {z}_1\) is uniform). Since \(\mathbf {x}^*_{\downarrow \mathbf {v}^*}\) is uniformly random and \(\mathbf {R}^*\) has full rank, the vector

$$\begin{aligned} \mathbf {y}{:}{=}{\mathbf {R}^*}^{\mathsf {T}}\cdot \mathbf {x}^*_{\downarrow \mathbf {v}^*} + \mathbf {z}^* \end{aligned}$$

is uniformly random over \(\mathbb {Z}_2^n\). Thus the probability that the second verification in Eq. (3.5) does not fail is \( \Pr [\mathbf{wt}(\mathbf {y})\le n\cdot \tau ']=\alpha ''_{\tau ',n} \).\(\square \)

Claim 3.3

\(\Pr [\mathcal {B} ^{\Gamma ^*_{\tau ,2\ell ,d}(\mathbf {x},\cdot )} = 1]\ \ge \ \varepsilon '\)

Proof of Claim

We will show that \(\mathcal {B} \) outputs 1 with probability \(\ge \varepsilon '\) if the subset LPN oracle accepts subsets of arbitrarily small size (and does not simply output \(\bot \) on inputs \(\mathbf {v}\) where \(\mathbf{wt}(\mathbf {v})<d\)), i.e.,

$$\begin{aligned} \Pr [\mathcal {B} ^{\Gamma ^*_{\tau ,2\ell ,0}(\mathbf {x},\cdot )} = 1] \ge \varepsilon '. \end{aligned}$$
(3.6)

This then implies the claim using

$$\begin{aligned} \Pr [\mathcal {B} ^{\Gamma ^*_{\tau ,2\ell ,d}(\mathbf {x},\cdot )} = 1]\ge \Pr [\mathcal {B} ^{\Gamma ^*_{\tau ,2\ell ,0}(\mathbf {x},\cdot )} = 1] \end{aligned}$$

which holds as the oracles \(\Gamma ^*_{\tau ,2\ell ,d}(\mathbf {x},\cdot )\) and \(\Gamma ^*_{\tau ,2\ell ,0}(\mathbf {x},\cdot )\) behave the same except on inputs \(\mathbf {v}\) with \(\mathbf{wt}(\mathbf {v})<d\) in which case the former outputs \(\bot \), and \(\mathcal {B} \) will then immediately output 1.

Equation (3.6) holds as:

  • The answers \(\phi = (\mathbf {R},\mathbf {z})\) that \(\mathcal {B} ^{\Gamma ^*_{\tau ,2\ell ,0}(\mathbf {x},\cdot )}\) gives to \(\mathcal {A} \)’s queries in the first phase of the attack have exactly the same distribution as what \(\mathcal {A} \) would get when interacting with an honest prover \(\mathcal {P}_{\tau ,n}(\mathbf {s}\in \mathbb {Z}_2^{2\ell })\) where the “simulated” secret \(\mathbf {s}\) is defined in Eq. (3.4).

    To see this, recall that on a query \(\mathbf {v}\) from \(\mathcal {A} \), adversary \(\mathcal {B} ^{\Gamma ^*_{\tau ,2\ell ,0}(\mathbf {x},\cdot )}\) must compute n SLPN samples \(({\hat{\mathbf {R}}}, \mathbf {z}={\hat{\mathbf {R}}}^{\mathsf {T}}\cdot (\mathbf {s}\wedge {\mathbf {v}})+\mathbf {e})\) and then forward the compressed version of these samples to \(\mathcal {A} \) (that is, \((\mathbf {R}, \mathbf {z}=\mathbf {R}^{\mathsf {T}}\cdot \mathbf {s}_{\downarrow \mathbf {v}}+\mathbf {e})\) where \(\mathbf {R}= {\hat{\mathbf {R}}}_{\downarrow \mathbf {v}}\), cf. Remark 2.9). We next show that the \(\mathbf {z}\) computed by \(\mathcal {B} \) indeed has exactly this distribution. In the first step, \(\mathcal {B} \) queries its oracle with \(\mathbf {u}=\mathbf {v}\wedge \overline{\mathbf {v}}^*\) and obtains noisy inner products \(({\hat{\mathbf {R}}}_1,\mathbf {z}_1)\) with the part of \(\mathbf {s}_{\downarrow \mathbf {v}}\) that contains only bits from \(\mathbf {x}\), i.e.,

    $$\begin{aligned} \mathbf {z}_1 = {\hat{\mathbf {R}}}_1^{\mathsf {T}}\cdot (\mathbf {x}\wedge {\mathbf {u}}) + \mathbf {e}= {\hat{\mathbf {R}}}_1^{\mathsf {T}}\cdot (\mathbf {s}\wedge {\mathbf {u}}) + \mathbf {e}. \end{aligned}$$

    In the second step, \(\mathcal {B} \) samples n inner products \(({\hat{\mathbf {R}}}_0,\mathbf {z}_0)\) (with no noise) with the part of \(\mathbf {s}_{\downarrow \mathbf {v}}\) that contains only bits from the known \(\mathbf {x}^*\), i.e.,

    $$\begin{aligned} \mathbf {z}_0 = {\hat{\mathbf {R}}}_0^{\mathsf {T}}\cdot (\mathbf {x}^*\wedge {\mathbf {u}^*})= {\hat{\mathbf {R}}}_0^{\mathsf {T}}\cdot (\mathbf {s}\wedge {\mathbf {u}^*}). \end{aligned}$$

    In the third step, \(\mathcal {B} \) then generates \(({\hat{\mathbf {R}}}, {\hat{\mathbf {R}}}^{\mathsf {T}}\cdot (\mathbf {s}\wedge \mathbf {v}) + \mathbf {e})\) from the previous values where \({\hat{\mathbf {R}}}\) is defined by \({\hat{\mathbf {R}}}_{\downarrow \mathbf {v}^*} = {\hat{\mathbf {R}}}_0\) and \({\hat{\mathbf {R}}}_{\downarrow \overline{\mathbf {v}}^*} = {\hat{\mathbf {R}}}_1\). Using \(\mathbf {v}= \mathbf {u}+ \mathbf {u}^*\), we get

    $$\begin{aligned} \mathbf {z}= & {} \mathbf {z}_0 + \mathbf {z}_1 \\= & {} {\hat{\mathbf {R}}}_0^{\mathsf {T}}\cdot (\mathbf {s}\wedge {\mathbf {u}^*}) + {\hat{\mathbf {R}}}_1^{\mathsf {T}}\cdot (\mathbf {s}\wedge {\mathbf {u}}) + \mathbf {e}\\= & {} {\hat{\mathbf {R}}}^{\mathsf {T}}\cdot (\mathbf {s}\wedge \mathbf {v}) + \mathbf {e}. \end{aligned}$$
  • The challenge \(\mathbf {v}^*\) sent initially to \(\mathcal {A} \) is uniformly random and therefore has the same distribution as a challenge in an active attack.

  • \(\mathcal {B} ^{\Gamma ^*_{\tau ,2\ell ,0}(\mathbf {x},\cdot )}\) outputs 1 if Eq. (3.5) holds, which is exactly the case when \(\mathcal {A} \)’s response to the challenge was valid. By assumption this probability is at least \(\varepsilon '\).

This concludes the proof of Eq. (3.6).\(\square \)

3.3 Avoiding Checking

One disadvantage of the protocol in Fig. 4, compared to \({\mathsf {HB}}\) style protocols, is the necessity to check whether the messages exchanged have the right form: the prover checks if \(\mathbf {v}\) has weight \(\ell \), while the verifier must make the even more expensive check whether \(\mathbf {R}\) has full rank. We note that it is possible to eliminate these checks by blinding the exchanged messages \(\mathbf {v}\) and \(\mathbf {z}\) with random vectors that are part of the secret key, as shown in Fig. 5. The analysis of this protocol is similar to that of the protocol in Fig. 4; we thus just sketch a proof, emphasizing the differences.

The setup phase is as before, except that, apart from \(\mathbf {x}^*\) and \(\mathbf {v}^*\), \(\mathcal {B} \) also samples the random blinding vectors of the secret key. In the first phase \(\mathcal {B} ^\mathcal {O}\) answers \(\mathcal {A} \)’s queries as before, except that the blinded version of the received query is used in place of \(\mathbf {v}\), and the blinding vector for \(\mathbf {z}\) is added to the answer (just so that when \(\mathcal {O}\) is an SLPN oracle we get the same distribution as in the protocol). Also the second phase is identical, except that the challenge is sent to \(\mathcal {A} \) in blinded form (so the actual subset used to answer the challenge is still \(\mathbf {v}^*\)).

Fig. 5

By blinding the values \(\mathbf {v},\mathbf {z}\) with secret random vectors we can avoid checking whether \(\mathbf{wt}(\mathbf {v})=\ell \) and \(\mathsf {rank}(\mathbf {R})=n\) as in the protocol from Fig. 4

Let \(\alpha ^*_{\ell ,d}\) denote the probability that a uniformly random string of length \(\ell \) has Hamming weight less than d. Then \(\Pr [\mathcal {B} ^{U_{2\ell + 1}(\cdot )} = 1]\ \le \ \alpha ''_{\tau ',n}+Q\cdot \alpha ^*_{\ell ,d}\); this is shown basically as in Claim 3.2, except that \(\alpha '_{\ell ,d}\) is replaced with \(\alpha ^*_{\ell ,d}\). As the blinding vector is uniformly random and secret, the subset on which the oracle is actually queried is uniformly random, so its relevant part has Hamming weight less than d (in which case the oracle outputs \(\bot \)) with probability \(\alpha ^*_{\ell ,d}\); by the union bound, this happens in one of the Q queries with probability at most \(Q\cdot \alpha ^*_{\ell ,d}\). In this argument we also use the fact that the answers that \(\mathcal {B} \) sends to \(\mathcal {A}\) are uniformly random, and thus do not leak any information about the blinding vectors (except for what can be deduced from the fact that \(\mathcal {B} \) has not already output 1 and stopped). If we reach the second phase, the probability that \(\mathcal {A} \) comes up with a valid answer \((\mathbf {R}^*,\mathbf {z}^*)\) is at most \(\alpha ''_{\tau ',n}\): at this point the blinding vector for \(\mathbf {z}\), and thus also the string \(\mathbf{w}\) whose weight is checked by the verifier, is uniformly random given \(\mathcal {A} \)’s view, which implies \(\Pr [\mathbf{wt}(\mathbf{w})\le n\cdot \tau ']\le \alpha ''_{\tau ',n}\).

The proof that \(\Pr [\mathcal {B} ^{\Gamma ^*_{\tau ,2\ell ,d}(\mathbf {x},\cdot )} = 1]\ \ge \ \varepsilon '\) is almost analogous to the proof of Claim 3.3.

4 Message Authentication Codes

In this section, we construct two message authentication codes whose security can be reduced to the \(\mathsf {LPN}\) assumption. Our first construction is based on the two-round authentication protocol from Sect. 3. We prove that if the \(\mathsf {LPN}\) problem is \(\varepsilon \)-hard, then no adversary making Q queries can forge a MAC with probability more than \(\Theta (\sqrt{\varepsilon }\cdot Q)\). The construction is somewhat non-uniform: to achieve this security, one must set a parameter of the construction as a function of \(\varepsilon \) (cf. Remark 4.4). Our second construction has no such issue and achieves the better security bound \(\Theta (\varepsilon \cdot Q)\). The efficiency of this construction is similar to that of the first construction, but a larger key is required.

4.1 First Construction

Recall the two-round authentication protocol from Sect. 3. In the protocol the verifier chooses a random challenge subset \(\mathbf {v}\). To turn this interactive protocol into a MAC, we will compute this \(\mathbf {v}\) from the message \(m\) to be authenticated as \(\mathbf {v}=\mathsf {C}(h(m,\mathbf {b}))\), where h is a pairwise independent hash function, \(\mathbf {b}\in \mathbb {Z}_2^{\nu }\) is some fresh randomness and \(\mathsf {C}\) is some encoding scheme. The code \(\mathsf {C}\) is fixed and public, while the function h is part of the secret key. The authentication tag \(\phi \) is computed in the same manner as the prover’s answer in the authentication protocol. That is, we sample a random matrix \(\mathbf {R}\in \mathbb {Z}_2^{\ell \times n}\) and compute a noisy inner product \(\mathbf {z}{:}{=}\mathbf {R}^{\mathsf {T}}\cdot \mathbf {s}_{\downarrow \mathbf {v}} + \mathbf {e}\), where \(\mathbf {e}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathsf{Ber}_\tau ^n\). We note that using \((\mathbf {R},\mathbf {z},\mathbf {b})\) as an authentication tag would not be secure, and we need to blind these values. This is done by applying a pairwise independent permutation (PIP) \(\pi \)—which is part of the secret key—to \((\mathbf {R},\mathbf {z},\mathbf {b})\in \mathbb {Z}_2^{\ell \times n+n+\nu }\).

Remark 4.1

In the construction it is sufficient to just use almost pairwise independent functions and permutations. In fact, if the message space \(\mathcal {M}\) is a priori unbounded, we have to settle for almost pairwise independence (as the description of a pairwise independent h is linear in the length of the inputs it can process).

Construction. The message authentication code \(\mathsf {MAC}_1= (\mathcal {K},\mathcal {T},\mathcal {V})\) with associated message space \(\mathcal {M}\) is defined as follows:

  • Public Parameters. \(\mathsf {MAC}_1\) has the following public parameters.

    \(\ell ,\tau ,\tau ',n\) :

    as in the authentication protocol from Sect. 3

    \(\mu \in \mathbb {N}\) :

    output length of the hash function

    \(\nu \in \mathbb {N}\) :

    length of the randomness

    \(\mathsf {C}: \mathbb {Z}_2^\mu \rightarrow \mathbb {Z}_2^{2\ell }\) :

    encoding, where \(\forall ~\mathbf {y}\ne \mathbf {y}'\in \mathbb {Z}_2^\mu \) we have \(\mathbf{wt}(\mathsf {C}(\mathbf {y}))=\ell \) and \(\mathbf{wt}(\mathsf {C}(\mathbf {y})+\mathsf {C}(\mathbf {y}'))\ge 0.9\ell \).

  • Key Generation. Algorithm \(\mathcal {K}(1^\lambda )\) samples \(\mathbf {s}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^{2\ell }\), a pairwise independent hash function \(h : \mathcal {M}\times \mathbb {Z}_2^\nu \rightarrow \mathbb {Z}_2^{\mu }\) and a pairwise independent permutation \(\pi \) over \(\mathbb {Z}_2^{\ell \times n + n + \nu }\). It returns \(K=(\mathbf {s},h,\pi )\) as the secret key.

  • Tagging. Given secret key \(K=(\mathbf {s},h,\pi )\) and message \(m\in \mathcal {M}\), algorithm \(\mathcal {T}\) proceeds as follows.

    1.

      \(\mathbf {R}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^{\ell \times n}\), \(\mathbf {b}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\nu \), \(\mathbf {e}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathsf{Ber}^n_\tau \)

    2.

      \(\mathbf {v}{:}{=} \mathsf {C}(h(m, \mathbf {b})) \in \mathbb {Z}_2^{2 \ell }\)

    3.

      Return \(\phi {:}{=}\pi (\mathbf {R}, \mathbf {R}^{\mathsf {T}}\cdot \mathbf {s}_{\downarrow \mathbf {v}} + \mathbf {e}, \mathbf {b})\)

  • Verification. On input a secret key \(K=(\mathbf {s},h,\pi )\), message \(m\in \mathcal {M}\) and tag \(\phi \), algorithm \(\mathcal {V}\) proceeds as follows.

    1.

      Parse \(\pi ^{-1}(\phi )\) as \((\mathbf {R}\in \mathbb {Z}_2^{\ell \times n},\mathbf {z}\in \mathbb {Z}_2^n,\mathbf {b}\in \mathbb {Z}_2^\nu )\). If \(\mathsf {rank}(\mathbf {R}) \ne n\), then return \(\mathsf {reject}\)

    2.

      \(\mathbf {v}{:}{=} \mathsf {C}(h(m, \mathbf {b}))\)

    3.

      If \(\mathbf{wt}(\mathbf {z}+ \mathbf {R}^{\mathsf {T}}\cdot \mathbf {s}_{\downarrow \mathbf {v}})> n\cdot \tau '\) return \(\mathsf {reject}\), otherwise return \(\mathsf {accept}\)
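To illustrate the algorithms above, here is a minimal Python/numpy sketch of key generation, tagging and verification for \(\mathsf {MAC}_1\). The toy parameters and the stand-ins for h, \(\mathsf {C}\) and \(\pi \) are assumptions made only so that the sketch runs: the toy encoding does not meet the distance requirement \(\mathbf{wt}(\mathsf {C}(\mathbf {y})+\mathsf {C}(\mathbf {y}'))\ge 0.9\ell \), the permutation stand-in is not pairwise independent, and the rank check is done over the rationals instead of \(\mathbb {Z}_2\).

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy parameters, chosen only so that the sketch runs; real parameters are as in Theorem 4.2.
ell, n, mlen = 32, 16, 64            # secret length 2*ell, tag dimension n, message bit-length
mu = nu = ell                        # hash output / randomness length (mu = ell for the toy code C)
tau, tau_p = 0.05, 0.25              # noise rate tau and verification threshold tau'

def C(y):
    # Toy encoding: bit y_i -> (y_i, 1 - y_i), so wt(C(y)) = ell exactly;
    # the distance requirement wt(C(y) + C(y')) >= 0.9*ell is NOT met by this stand-in.
    return np.stack([y, 1 - y], axis=1).reshape(-1)

def keygen():
    s = rng.integers(0, 2, 2 * ell)                  # s <- Z_2^{2 ell}
    H = rng.integers(0, 2, (mu, mlen + nu))          # random affine map: a pairwise independent h
    c = rng.integers(0, 2, mu)
    N = ell * n + n + nu                             # placeholder for the PIP pi:
    perm = rng.permutation(N)                        # a bit-permutation plus an XOR mask
    mask = rng.integers(0, 2, N)                     # (NOT pairwise independent, structure only)
    return s, (H, c), (perm, mask)

def h(hk, m, b):
    H, c = hk
    return (H @ np.concatenate([m, b]) + c) % 2

def pi_apply(pk, u):
    perm, mask = pk
    return (u[perm] + mask) % 2

def pi_invert(pk, w):
    perm, mask = pk
    u = np.empty_like(w)
    u[perm] = (w + mask) % 2
    return u

def tag(key, m):
    s, hk, pk = key
    R = rng.integers(0, 2, (ell, n))
    b = rng.integers(0, 2, nu)
    e = (rng.random(n) < tau).astype(int)
    v = C(h(hk, m, b))                               # v := C(h(m, b)), wt(v) = ell
    z = (R.T @ s[v.astype(bool)] + e) % 2            # z := R^T . s_{|v} + e
    return pi_apply(pk, np.concatenate([R.reshape(-1), z, b]))

def verify(key, m, phi):
    s, hk, pk = key
    u = pi_invert(pk, phi)
    R = u[:ell * n].reshape(ell, n)
    z, b = u[ell * n:ell * n + n], u[ell * n + n:]
    # The GF(2) rank check of the construction is approximated by the rank over the rationals.
    if np.linalg.matrix_rank(R) < n:
        return False
    v = C(h(hk, m, b))
    return int(((z + R.T @ s[v.astype(bool)]) % 2).sum()) <= n * tau_p

key = keygen()
m = rng.integers(0, 2, mlen)
print(verify(key, m, tag(key, m)))                   # True except with small completeness error
```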

Theorem 4.2

For \(\mu =\nu \in \mathbb {N}\), a constant \(\gamma >0\) and \(d {:}{=}\ell /(2+\gamma )\), if the \(\mathsf {SLPN}^*_{\tau , 2\ell ,d}\) problem is \((t,nQ,\varepsilon )\)-hard then \(\mathsf {MAC}_1\) is \((t',Q,\varepsilon ')\)-secure against uf-cma adversaries, where for constants \(c_\gamma ,c_\tau >0\) that depend only on \(\gamma \) and \(\tau \), respectively, and for \(Q = Q_{\mathsf {tag}} + Q_{\mathsf {vrfy}}\),

$$\begin{aligned} t' \approx t, \qquad \varepsilon = \min \left\{ \varepsilon '/2 - \frac{Q^2}{2^{\mu }} , \ \frac{\varepsilon '}{2^{\mu + 1}} - Q_\mathsf {tag}\cdot 2^{-c_\gamma \cdot \ell } - Q_\mathsf {vrfy}\cdot 2^{-c_\tau \cdot n} \right\} \;. \end{aligned}$$

\(\mathsf {MAC}_1\) has completeness error \(2^{-c'_\tau \cdot n}+2^{n-\ell }\in 2^{-\Theta (n)}\) where \(c'_\tau >0\) depends only on \(\tau \).

Corollary 4.3

Choosing \(\mu \) s.t. \(2^{\mu }=\frac{4Q^2}{\varepsilon '}\) in the above theorem, we get \(\varepsilon =\min \{\varepsilon '/4,{{\varepsilon '}^2/8Q^2} - Q\cdot 2^{-\Theta (n)}\}\). The second term is the minimum here, and solving for \(\varepsilon '\) gives

$$\begin{aligned} \varepsilon ' \le 3Q\cdot \sqrt{\varepsilon +Q\cdot 2^{-\Theta (n)}}. \end{aligned}$$
(4.1)
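For completeness, the solving step in the corollary is the following short calculation, using the second term of the minimum:

$$\begin{aligned} \varepsilon \ \ge \ \frac{{\varepsilon '}^2}{8Q^2} - Q\cdot 2^{-\Theta (n)} \;\Longrightarrow \; {\varepsilon '}^2 \le 8Q^2\left( \varepsilon +Q\cdot 2^{-\Theta (n)}\right) \;\Longrightarrow \; \varepsilon ' \le \sqrt{8}\cdot Q\cdot \sqrt{\varepsilon +Q\cdot 2^{-\Theta (n)}}\ \le \ 3Q\cdot \sqrt{\varepsilon +Q\cdot 2^{-\Theta (n)}}. \end{aligned}$$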

Remark 4.4

(about \(\mu \)) Note that to get security as claimed in the above corollary, we need to choose \(\mu \) as a function of Q and \(\varepsilon \) such that \(2^\mu \approx 4Q^2/\varepsilon '\) for \(\varepsilon '\) as in Eq. (4.1). Of course we can just fix Q (as an upper bound on the number of queries made by the adversary) and \(\varepsilon \) (as our guess on the actual hardness of \(\mathsf {SLPN}^*_{\tau ,2\ell ,d}\)). But too conservative a guess (i.e., choosing \(\mu \) too small) will result in a construction whose security is worse than what is claimed in the above corollary, while too generous a guess will make the security reduction meaningless, though we do not have any attacks on the MAC for large \(\mu \).

We now give some intuition for the proof of Theorem 4.2. Every query \((m,\phi )\) to \(\mathcal {V}\) and query \(m\) to \(\mathcal {T}\) defines a subset \(\mathbf {v}\) (as computed in the second step in the definitions of both \(\mathcal {V}\) and \(\mathcal {T}\)). We say that a forgery \((m,\phi )\) is “fresh” if the \(\mathbf {v}\) contained in \((m,\phi )\) is different from all \(\mathbf {v}\)’s contained in all the previous \(\mathcal {V}\) and \(\mathcal {T}\) queries. The proof makes a case distinction and uses a different reduction for the two cases where the first forgery found by the adversary is more likely to be fresh, or more likely to be non-fresh. In both cases we consider a reduction \(\mathcal {B} ^\mathcal {O}\) which has access to either a uniform oracle \(\mathcal {O}=U\) or a subset LPN oracle \(\mathcal {O}=\Gamma ^*\). Adversary \(\mathcal {B} ^\mathcal {O}\) uses an adversary \(\mathcal {A}\) who can find forgeries for the MAC to distinguish those cases and thus break the subset LPN assumption. In the first case, where the first forgery is likely to be non-fresh, we can show (using the fact that a pairwise independent permutation is used to blind the tag) that if \(\mathcal {B} ^\mathcal {O}\)’s oracle is \(\mathcal {O}=U\), even a computationally unbounded \(\mathcal {A}\) cannot come up with a message/tag pair \((m,\phi )\) that contains a non-fresh \(\mathbf {v}\). Thus we can distinguish the cases \(\mathcal {O}=U\) and \(\mathcal {O}=\Gamma ^*\) by just observing if \(\mathcal {A}\) ever makes a \(\mathcal {V}\) query \((m,\phi )\) that contains a non-fresh \(\mathbf {v}\) (even without being able to tell if \((m,\phi )\) is valid or not).

If the first forgery found by \(\mathcal {A}\) is more likely to be fresh, we can use a similar argument as in the proof of our authentication protocol in the last section. An additional difficulty here is that the reduction has to guess the fresh \(\mathbf {v}\in \mathbb {Z}_2^{2\ell }\) (for which there are \(2^\mu \) possibilities) contained in the first forgery and cannot choose it itself as in the protocol. This is the reason why the reduction loses a factor \(2^\mu \).

Proof of Theorem 4.2

The proof of completeness is essentially the same (and we get exactly the same quantitative bound) as the proof of completeness for the protocol in Fig. 4 as claimed in Theorem 3.1.

We now prove security. As in the theorem statement, we set \(\mu =\nu \) (but for clarity we will keep the different letters \(\mu \) for the range of h and \(\nu \) for the length of the randomness). Let \(\mathcal {A}\) be an adversary running in time \(t'\) that breaks the uf-cma security of \(\mathsf {MAC}_1\) in the experiment \(\mathbf {Exp}^{\mathrm {uf}\hbox {-}\mathrm {cma}}_{\mathsf {MAC}_1,\mathcal {A},\lambda }\) with advantage \(\varepsilon '\). Let \(Q_\mathsf {tag}\) and \(Q_\mathsf {vrfy}\) denote the number of queries that \(\mathcal {A}\) makes to the tag and verification oracles, respectively, such that \(Q=Q_\mathsf {tag}+Q_\mathsf {vrfy}\). We assume that \(\mathcal {A}\) never makes the same verification query twice (since \(\mathcal {V}\) is deterministic, repeating queries gives no additional information to \(\mathcal {A}\)) and also that she never makes a verification query \((m,\phi )\) where \(\phi \) was received as the output from the tag oracle on input \(m\). Since the completeness error of \(\mathsf {MAC}_1\) is \(2^{-\Theta (n)}\), this is basically without loss of generality (as the answer would almost certainly be \(\mathsf {accept}\)). Every verification query \((m,\phi )\) and tag query \(m\) define a subset \(\mathbf {v}\) (as computed in step 2. in the definitions of both \(\mathcal {V}\) and \(\mathcal {T}\)).

By definition, in the uf-cma experiment, with probability \(\varepsilon '\) the adversary \(\mathcal {A}\) at some point makes a verification query \((m,\phi )\) where (i) \(\phi \) was not received as output on a tag query \(m\), and (ii) \(\mathcal {V}(K,m,\phi )=\mathsf {accept}\). We say that such a forgery \((m,\phi )\) is “fresh” if the \(\mathbf {v}\) defined by \((m,\phi )\) is different from all \(\mathbf {v}\)’s defined by all the previous verification and tag queries. Let \(E_{\mathsf {fresh}}\) denote the event that the first forgery found by \(\mathcal {A}\) is fresh. We will consider the two cases where \(\Pr [E_{\mathsf {fresh}}]>\varepsilon '/2\) and \(\Pr [E_{\mathsf {fresh}}]\le \varepsilon '/2\) separately.

The case \(\underline{\Pr [E_{\mathsf {fresh}}]\le \varepsilon '/2}\). Given \(\mathcal {A}\), we will construct an adversary \(\mathcal {B} _1^\mathcal {O}\) who can distinguish \(\mathcal {O}=\Gamma ^*_{\tau ,2\ell ,\ell }(\mathbf {s},\cdot )\) from \(\mathcal {O}=U_{2\ell +1}(\cdot )\) (as in Definition 2.8) with advantage

$$\begin{aligned} \varepsilon '/2 - \frac{Q^2}{2^{\mu }}. \end{aligned}$$
(4.2)

\(\mathcal {B} _1^\mathcal {O}\) samples \(\pi ,h\) (but not \(\mathbf {s}\)) as defined by \(\mathcal {K}\). Next, it invokes \(\mathcal {A}\) (who expects to attack \(\mathsf {MAC}_1\) with a key \((\mathbf {s},h,\pi )\)) answering its queries as follows:

  • Tag queries. If \(\mathcal {A}\) makes a tag query \(m\), then \(\mathcal {B} _1^{\mathcal {O}}\) does the following:

    1.

      Sample \(\mathbf {b}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\nu \) and compute \(\mathbf {v}{:}{=}\mathsf {C}(h(m,\mathbf {b}))\).

    2.

      Query the oracle \(\mathcal {O}\) n times on input \(\mathbf {v}\): for \(i=1,\ldots ,n\) let \((\mathbf {R}[i], \mathbf {z}[i]) \mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathcal {O}(\mathbf {v})\).

    3.

      Return \(\phi {:}{=} \pi (\mathbf {R},\mathbf {z},\mathbf {b})\) where \(\mathbf {R}=[\mathbf {R}[1],\ldots ,\mathbf {R}[n]]\) and \(\mathbf {z}=[\mathbf {z}[1],\ldots ,\mathbf {z}[n]]\) to \(\mathcal {A}\).

  • Verification queries. If \(\mathcal {A}\) makes a verification query \((m,\phi )\), \(\mathcal {B} _1^\mathcal {O}\) simply answers with \(\mathsf {reject}\).

If any tag or verification query contains a \(\mathbf {v}\) which has appeared in a previous query, \(\mathcal {B} _1^\mathcal {O}\) outputs 1; otherwise it outputs 0. (Note that \(\mathcal {B} _1^\mathcal {O}\) can compute the value \(\mathbf {v}\) in a verification query as it knows \(\pi ,h\).)

Claim 4.5

If \(\mathcal {O}= \Gamma ^*_{\tau ,2\ell ,\ell }(\mathbf {s},\cdot )\), then \(\mathcal {B} _1^{\mathcal {O}}\) outputs 1 with probability \(\ge \varepsilon '/2\).

Proof of Claim

The answers to the tag queries of \(\mathcal {A}\) computed by \(\mathcal {B} _1^{\mathcal {O}}\) have exactly the same distribution as in the uf-cma experiment (where the secret key is \((\mathbf {s},h,\pi )\)). (Note that each tag query yields a vector \(\mathbf {v}\) with Hamming weight exactly \(\ell \).) The answers to the verification queries (which are always \(\mathsf {reject}\)) are correct as long as \(\mathcal {A}\) does not query a valid forgery. The probability that the first valid forgery found by \(\mathcal {A}\) is not fresh is \(\varepsilon '-\Pr [E_{\mathsf {fresh}}]\ge \varepsilon '-\varepsilon '/2\ge \varepsilon '/2\), which is thus a lower bound on the probability that \(\mathcal {B} _1^{\mathcal {O}}\) outputs 1.\(\square \)

Claim 4.6

If \(\mathcal {O}= U_{2\ell +1}(\cdot )\), then \(\mathcal {B} _1^{\mathcal {O}}\) outputs 1 with probability \(< Q^2/2^{\mu }\).

Proof of Claim

The answers that \(\mathcal {A}\) obtains on a tag query \(m\) from \(\mathcal {B} _1^{U_{2\ell + 1}(\cdot )}\) (i.e., \(\pi (\mathbf {R},\mathbf {z},\mathbf {b})\) where \(\mathbf {R},\mathbf {z},\mathbf {b}\) are sampled uniformly) are uniformly random, and in particular independent of h and \(\pi \). The answers to verification queries are always \(\mathsf {reject}\) and thus contain no information about \(h,\pi \) either. Then, we have that \(\mathbf {v}_i=\mathbf {v}_j\) (where \(\mathbf {v}_i=\mathsf {C}(h(m_i,\mathbf {b}_i))\) is defined by the ith tag or verification query) iff \(h(m_i,\mathbf {b}_i)=h(m_j,\mathbf {b}_j)\). We will upper bound the probability of such a collision, which then is also an upper bound on the probability that \(\mathcal {B} _1^{\mathcal {O}}\) outputs 1. \(\mathcal {A}\) makes a total of Q queries. Assume that up to the \((i-1)\)th query, all the \(\mathbf {v}\)’s were distinct.

If the ith query is a tag query \(m_i\), a fresh \(\mathbf {b}_i\) is sampled which will collide with a previous \(\mathbf {b}_j\) (\(j<i\)) with probability at most \((i-1)/2^\nu \). Assuming no such collision happened, the probability that \(h(m_i,\mathbf {b}_i) = h(m_j,\mathbf {b}_j)\) for any \(j<i\) can be upper bounded by \((i-1)/2^\mu \) (here we use the fact that the answers that \(\mathcal {A}\) gets from \(\mathcal {B} _1^{U_{2\ell + 1}(\cdot )}\) are uniformly random, and thus \(\mathcal {A}\) has no information about the function h).

If the ith query is a verification query \((m_i,\phi _i)\), then using the fact that \(\pi \) is a pairwise independent permutation (and \(\mathcal {A}\) has no information about it) we get that \(\pi ^{-1}(\phi _i) = (\mathbf {R}_i,\mathbf {z}_i,\mathbf {b}_i)\) contains a \(\mathbf {b}_i\) which will collide with a previous \(\mathbf {b}_j\) (\(j<i\)) where \(\phi _j\ne \phi _i\) with probability at most \((i-1)/2^{\nu }\). If we have no such collision, then also \((m_i,\mathbf {b}_i)\ne (m_j,\mathbf {b}_j)\) for all \(j<i\), since for j with \(\phi _i=\phi _j\) we must have that \(m_i\ne m_j\). If \((m_i,\mathbf {b}_i)\ne (m_j,\mathbf {b}_j)\) for all \(j<i\) we can again upper bound the probability that \(h(m_i,\mathbf {b}_i) = h(m_j,\mathbf {b}_j)\) for some \(j<i\) by \((i-1)/2^{\mu }\).

We showed that for the ith query (no matter if it is a tag or verification query) the probability that \(h(m_i,\mathbf {b}_i)=h(m_j,\mathbf {b}_j)\) for some \(j< i\) is at most \(\frac{(i-1)}{2^\nu } +\frac{(i-1)}{2^\mu }\). Taking the union bound over all \(i,1\le i\le Q\), we can upper bound the probability that \(h(m_i,\mathbf {b}_i)=h(m_j,\mathbf {b}_j)\) for any \(i\ne j\) as \(\sum _{i=1}^Q\left( \frac{(i-1)}{2^\nu } +\frac{(i-1)}{2^\mu } \right) \le Q^2/2^{\mu } \) (recall that \(\mu =\nu \)) as claimed.\(\square \)
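Spelled out, with \(\mu =\nu \) the sum in the last step of the proof above evaluates to

$$\begin{aligned} \sum _{i=1}^Q\left( \frac{i-1}{2^\nu } +\frac{i-1}{2^\mu } \right) = \frac{2}{2^{\mu }}\cdot \frac{Q(Q-1)}{2} = \frac{Q(Q-1)}{2^{\mu }}\ \le \ \frac{Q^2}{2^{\mu }}. \end{aligned}$$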

The case \(\underline{\Pr [E_{\mathsf {fresh}}]>\varepsilon '/2}\). In this case, \(\mathcal {A}\) will make tag/verification queries, where with probability \(>\varepsilon '/2\), at some point she will make an accepting verification query \((m,\phi )\) that defines a fresh \(\mathbf {v}\). We now construct an adversary \(\mathcal {B} _2^\mathcal {O}\) that uses \(\mathcal {A}\) as a black-box, and can distinguish \(\mathcal {O}=\Gamma ^*_{\tau ,2\ell ,d}(\mathbf {s},\cdot )\) from \(\mathcal {O}=U_{2\ell +1}(\cdot )\) (as in Definition 2.8) with advantage

$$\begin{aligned} \frac{\varepsilon '}{2^{\mu +1}}-Q_\mathsf {tag}\cdot \alpha '_{\ell ,d} - Q_\mathsf {vrfy}\cdot \alpha ''_{\tau ',n}. \end{aligned}$$
(4.3)

The construction of \(\mathcal {B} _2^\mathcal {O}\) is very similar to the adversary \(\mathcal {B} \) that we constructed in the proof of Theorem 3.1 (where we proved that the authentication protocol in Fig. 4 is secure against active attacks). The queries to the prover in the first phase of an active attack directly correspond to tag queries. However, we now have to additionally answer verification queries (we will always answer \(\mathsf {reject}\)). Furthermore, we cannot choose the challenge \(\mathbf {v}^*\). Instead, we will simply hope that (in the case where \(\mathcal {O}=\Gamma ^*_{\tau ,2\ell ,d}(\mathbf {s},\cdot )\)) the \(\mathbf {v}\) contained in the first valid verification query (i.e., forgery) that \(\mathcal {A}\) makes is fresh (which by assumption happens with probability more than \(\varepsilon '/2\)). Moreover, we will hope that it equals the one value \(\mathbf {v}^*\) (out of \(2^{\mu }\) possible ones) for which \(\mathcal {B} _2^\mathcal {O}\) can verify this. This gives us a distinguishing advantage of nearly \((\varepsilon '/2)/2^{\mu }=\varepsilon '/2^{\mu +1}\) as stated in Eq. (4.3). We do lose an additional additive term \(Q_\mathsf {tag}\cdot \alpha '_{\ell ,d}\) as there is an exponentially small probability that the transformation of subset LPN samples to tag queries will fail, and moreover an exponentially small term \(Q_\mathsf {vrfy}\cdot \alpha ''_{\tau ',n}\) which accounts for the probability that \(\mathcal {A}\) correctly guesses an accepting tag even in the case where \(\mathcal {O}=U_{2\ell +1}(\cdot )\).

\(\mathcal {B} _2^\mathcal {O}\) samples \(\pi ,h\) (but not \(\mathbf {s}\)) as defined by \(\mathcal {K}\), and \(\mathbf {y}^*\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\mu ,\ \mathbf {s}^*\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^{2\ell }\). Let \(\mathbf {v}^*{:}{=}\mathsf {C}(\mathbf {y}^*)\). Next, \(\mathcal {B} _2^\mathcal {O}\) invokes \(\mathcal {A}\) and answers its queries as follows (the intuition for the sampling below is given in the proof of Claim 4.8).

  • Tag queries. The answer \(\phi \) to a tag query \(m\in \mathcal {M}\) is computed by \(\mathcal {B} _2^{\mathcal {O}}\) as follows:

    1.

      Sample \(\mathbf {b}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\nu \) and compute \(\mathbf {v}{:}{=}\mathsf {C}(h(m,\mathbf {b}))\).

      Let \(\mathbf {u}{:}{=}\mathbf {v}\wedge \overline{\mathbf {v}}^*\) and \(\mathbf {u}^* {:}{=}\mathbf {v}\wedge \mathbf {v}^*\).

    2.

      For \(i=1,\ldots ,n\), let \((\mathbf {R}'[i],\mathbf {z}'[i]) \mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathcal {O}(\mathbf {u})\), \( \mathbf {R}''[i]\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^{2\ell }\) and \(\mathbf {z}''[i]{:}{=}\langle \mathbf {R}''[i],\mathbf {s}^* \wedge \mathbf {u}^* \rangle \). Define \(\mathbf {R}=[\mathbf {R}[1],\mathbf {R}[2],\ldots ,\mathbf {R}[n]]\) and \(\mathbf {z}=[\mathbf {z}[1],\ldots ,\mathbf {z}[n]]\) where \(\mathbf {R}[i]{:}{=} (\mathbf {R}'[i]\wedge \mathbf {u}+ \mathbf {R}''[i]\wedge \mathbf {u}^*)_{\downarrow \mathbf {v}}\) and \(\mathbf {z}[i]: = \mathbf {z}'[i] + \mathbf {z}''[i]\).

    3.

      Return \(\phi {:}{=}\pi (\mathbf {R},\mathbf {z},\mathbf {b})\) to \(\mathcal {A}\).

  • Verification queries. If \(\mathcal {A}\) makes a verification query \((m,\phi )\), then \(\mathcal {B} _2^\mathcal {O}\) always answers \(\mathsf {reject}\), but also makes the following check:

    1.

      Parse \(\mathbf {y}{:}{=}\pi ^{-1}(\phi )\) as \([\mathbf {R}\in \mathbb {Z}_2^{\ell \times n},\mathbf {z}\in \mathbb {Z}_2^{n},\mathbf {b}\in \mathbb {Z}_2^\nu ]\) and compute \(\mathbf {v}{:}{=} \mathsf {C}(h(m,\mathbf {b}))\).

    2.

      If \(\mathbf {v}\ne \mathbf {v}^*\), processing this query is over, otherwise go to the next step.

    3.

      If \(\mathsf {rank}(\mathbf {R})=n\) and \(\mathbf{wt}(\mathbf {R}^{\mathsf {T}}\cdot \mathbf {s}^*_{\downarrow \mathbf {v}^*}+ \mathbf {z})\le n\cdot \tau '\) (i.e., we have a forgery) output 1 and stop.

If \(\mathcal {A}\) has finished its queries, \(\mathcal {B} _2^\mathcal {O}\) stops with output 0.

Claim 4.7

If \(\mathcal {O}=U_{2\ell +1}(\cdot )\), then \(\mathcal {B} _2^{\mathcal {O}}\) outputs 1 with probability \(\le Q_\mathsf {vrfy}\cdot \alpha ''_{\tau ',n}+ Q_{\mathsf {tag}}\cdot \alpha '_{\ell ,d}\).

Proof of Claim

The proof of this claim is almost identical to the proof of Claim 3.2, except that here we have an additional factor \(Q_\mathsf {vrfy}\) as we have to take the union bound over all \(Q_\mathsf {vrfy}\) queries, whereas in Claim 3.2 the adversary was (by definition of an active attack) only allowed one guess.\(\square \)

Claim 4.8

If \(\mathcal {O}=\Gamma ^*_{\tau ,2\ell ,d}(\mathbf {s},\cdot )\), then \(\mathcal {B} _2^{\mathcal {O}}\) outputs 1 with probability \(\ge \frac{\varepsilon '}{2^{\mu +1}} \).

Proof

The proof of this claim is similar to the proof of Claim 3.3. That is, one shows that \(\mathcal {B} _2\) with oracle access to \(\Gamma ^*_{\tau ,2\ell ,0}(\mathbf {s},\cdot )\) (i.e., the subset LPN oracle accepts subsets of arbitrarily small size) perfectly simulates the experiment and thus receives a fresh forgery from \(\mathcal {A}\) with probability at least \(\varepsilon '/2\). Unlike in Claim 3.3, here \(\mathcal {B} _2\) is only able to recognize a fresh forgery if it uses a \(\mathbf {v}\) that equals \(\mathbf {v}^*\); this accounts for the extra \(2^{-\mu }\) factor.\(\square \)

Summing up, using \(\mathcal {A} \) we can break the subset LPN assumption with advantage which is given either by Eq. (4.2) or Eq. (4.3), i.e.,

$$\begin{aligned} \varepsilon = \min \left\{ \varepsilon '/2 - \frac{Q^2}{2^{\mu }} , \ \frac{\varepsilon '}{2^{\mu +1}}- Q_\mathsf {tag}\cdot \alpha '_{\ell ,d} - Q_\mathsf {vrfy}\cdot \alpha ''_{\tau ',n} \right\} . \end{aligned}$$

\(\square \)

4.2 Second Construction

We now give the construction of another MAC based on the hardness of the LPN problem. The main difference from \(\mathsf {MAC}_1\) of the previous subsection is the way we generate the values \(\mathbf {s}(\mathbf {v})\). In the new construction we define \(\mathbf {s}(\mathbf {v}) = \mathbf {s}_0 + \mathbf {S}\cdot \mathbf {v}\), where \(\mathbf {S}\in \mathbb {Z}_2^{\ell \times \mu }\) and \(\mathbf {s}_0\in \mathbb {Z}_2^\ell \) are both part of the secret key. Moreover, in the computation of a tag, the output is masked via another vector \(\mathbf {s}_0'\in \mathbb {Z}_2^n\) that is also included in the secret key. The construction borrows ideas from [4], which we adapt to the case of LPN.

Construction. The message authentication code \(\mathsf {MAC}_2=(\mathcal {K},\mathcal {T},\mathcal {V})\) with associated message space \(\mathcal {M}\) is defined as follows:

  • Public Parameters. \(\mathsf {MAC}_2\) has the following public parameters.

    \(\ell ,\tau ,\tau ',n\) :

    as in the authentication protocol from Sect. 3

    \(\mu \in \mathbb {N}\) :

    output length of the hash function

    \(\nu \in \mathbb {N}\) :

    length of the randomness

  • Key Generation. Algorithm \(\mathcal {K}(1^\lambda )\) samples \(\mathbf {S}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^{\ell \times \mu }\), \(\mathbf {s}_0 \mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\ell ,\mathbf {s}_0'\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^n\) and chooses an almost pairwise independent hash function \(h : \mathcal {M}\times \mathbb {Z}_2^\nu \rightarrow \mathbb {Z}_2^{\mu }\setminus \{0\}\), as well as a pairwise independent permutation \(\pi \) over \(\mathbb {Z}_2^{\ell \times n + n + \nu }\) (cf. Remark 4.1). It returns \(K=(\mathbf {S},\mathbf {s}_0,\mathbf {s}_0',h,\pi )\) as the secret key.

  • Tagging. Given secret key \(K=(\mathbf {S},\mathbf {s}_0,\mathbf {s}_0',h,\pi )\) and message \(m\in \mathcal {M}\), algorithm \(\mathcal {T}\) proceeds as follows.

    1.

      \(\mathbf {R}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^{\ell \times n}\), \(\mathbf {b}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\nu \), \(\mathbf {e}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathsf{Ber}^n_\tau \)

    2.

      \(\mathbf {v}{:}{=} h(m, \mathbf {b})\)

    3.

      \(\mathbf {s}(\mathbf {v}) {:}{=} \mathbf {s}_0 + \mathbf {S}\cdot \mathbf {v}\)

    4.

      Return \(\phi {:}{=}\pi (\mathbf {R}, \mathbf {s}_0' + \mathbf {R}^{\mathsf {T}}\cdot \mathbf {s}(\mathbf {v}) + \mathbf {e}, \mathbf {b})\)

  • Verification. On input a secret key \(K=(\mathbf {S},\mathbf {s}_0,\mathbf {s}_0',h,\pi )\), message \(m\in \mathcal {M}\) and tag \(\phi \), algorithm \(\mathcal {V}\) proceeds as follows.

    1.

      Parse \(\pi ^{-1}(\phi )\) as \((\mathbf {R}\in \mathbb {Z}_2^{\ell \times n},\mathbf {z}\in \mathbb {Z}_2^n,\mathbf {b}\in \mathbb {Z}_2^\nu )\). If \(\mathsf {rank}(\mathbf {R}) \ne n\), then return \(\mathsf {reject}\)

    2.

      \(\mathbf {v}{:}{=} h(m, \mathbf {b})\)

    3.

      \(\mathbf {s}(\mathbf {v}) {:}{=} \mathbf {s}_0 + \mathbf {S}\cdot \mathbf {v}\)

    4.

      If \(\mathbf{wt}(\mathbf {z}+ \mathbf {s}_0' + \mathbf {R}^{\mathsf {T}}\cdot \mathbf {s}(\mathbf {v}))> n\cdot \tau '\) return \(\mathsf {reject}\), otherwise return \(\mathsf {accept}\)
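As a quick illustration of what changes compared to \(\mathsf {MAC}_1\), the following Python/numpy fragment sketches only the computation of \(\mathbf {s}(\mathbf {v})\) and of the \((\mathbf {R},\mathbf {z})\) part of a tag; the parameters are toy values and the hash h and permutation \(\pi \) are omitted (see the sketch after the \(\mathsf {MAC}_1\) construction).

```python
import numpy as np

rng = np.random.default_rng(1)
ell, n, mu, tau = 32, 16, 32, 0.05     # toy parameters, for illustration only

# Key material specific to MAC_2 (h and pi are as in MAC_1 and omitted here).
S   = rng.integers(0, 2, (ell, mu))    # S    <- Z_2^{ell x mu}
s0  = rng.integers(0, 2, ell)          # s_0  <- Z_2^{ell}
s0p = rng.integers(0, 2, n)            # s_0' <- Z_2^{n}

def s_of_v(v):
    return (s0 + S @ v) % 2            # s(v) := s_0 + S . v

def tag_core(v):
    """The (R, z) part of a MAC_2 tag for a given v = h(m, b) != 0."""
    R = rng.integers(0, 2, (ell, n))
    e = (rng.random(n) < tau).astype(int)
    z = (s0p + R.T @ s_of_v(v) + e) % 2        # z := s_0' + R^T . s(v) + e
    return R, z

v = rng.integers(0, 2, mu)                     # stands in for h(m, b)
R, z = tag_core(v)
# Verification recomputes s(v) and checks wt(z + s_0' + R^T . s(v)) <= n * tau'.
print(int(((z + s0p + R.T @ s_of_v(v)) % 2).sum()))   # = wt(e), small w.h.p.
```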

Theorem 4.9

For \(\mu = \nu \in \mathbb {N}\), if the \(\mathsf {SLPN}_{\tau ,\ell ,\ell }\) problem is \((t,nQ,\varepsilon )\)-hard (by Proposition 2.3 this problem is as hard as \(\mathsf {LPN}_{\tau ,\ell }\)), then \(\mathsf {MAC}_2\) is \((t',Q,\varepsilon ')\)-secure against uf-cma adversaries, where for constants \(c_\tau , c_\tau '>0\) that depend only on \(\tau \), and for \(Q = Q_\mathsf {tag}+ Q_\mathsf {vrfy}\),

$$\begin{aligned} t' \approx t \qquad \varepsilon \ = \min \left\{ \varepsilon '/2 - \frac{Q^2}{2^{\mu }}, \frac{\varepsilon '}{8\mu Q_\mathsf {vrfy}} - Q_\mathsf {vrfy}\cdot ( 2^{-c_\tau '\cdot n} + 2^{-c_\tau \cdot n}) \right\} . \end{aligned}$$

\(\mathsf {MAC}_2\) has completeness error \(2^{-c_\tau '\cdot n}+2^{n-\ell }\).

We now give intuition for the proof of Theorem 4.9. Similar to the proof of Theorem 4.2, we distinguish fresh and non-fresh forgeries. Here the new and interesting case is when the first forgery found by the adversary is fresh. In the analysis we move to a mental experiment where tags computed by the tag oracle are uniform and independent of the secret key. The technical heart of the proof is to show that such a modification defines an indistinguishable distribution, assuming that the standard LPN assumption holds. In more detail, consider the two experiments defined in Fig. 6. In the “real experiment,” the answers from the \(\textsc {Eval}(\mathbf {v})\) oracle have the same distribution as the values \((\mathbf {R},\mathbf {z})\) from a tag on message \(m\) such that \(h(m,\mathbf {b})=\mathbf {v}\); in the “random experiment,” the answers from the \(\textsc {Eval}(\mathbf {v})\) oracle are uniform. Oracle \(\textsc {Chal}(\mathbf {R}^*,\mathbf {v}^*)\), which can be queried at most once, essentially corresponds to the output of a verification query on a fresh forgery \((m^*,\phi ^*)\), such that \(h(m^*,\mathbf {b}^*) = \mathbf {v}^*\). The lemma below states that it is hard to distinguish the two cases. Its proof uses a hybrid technique from [4, 13].

Lemma 4.10

Let \(\ell ,\mu ,n\in \mathbb {N}\) and \(\tau \in (0,1/2)\). Assume that the \(\mathsf {LPN}_{\tau ,\ell }\) problem is \((t,nQ,\varepsilon )\)-hard. Then, for all adversaries \(\mathcal {B} \) running in time \(t'\approx t\), and asking Q queries to the \(\textsc {Eval}(\cdot )\) oracle, we have that

$$\begin{aligned} \left| \Pr \left[ \mathbf {Exp}^{\mathsf{real}}_{\ell ,\mu ,n,\tau }(\mathcal {B}) = 1\right] - \Pr \left[ \mathbf {Exp}^{\mathsf{rand}}_{\ell ,\mu ,n,\tau }(\mathcal {B}) = 1\right] \right| \le 4\mu \varepsilon . \end{aligned}$$
Fig. 6

Experiments \(\mathbf {Exp}^{\mathsf{real}}_{\ell ,\mu ,n,\tau }(\mathcal {B})\) and \(\boxed {\mathbf {Exp}^{\mathsf{rand}}_{\ell ,\mu ,n,\tau }(\mathcal {B})}\). The boxed statement redefining \(\mathbf {z}\) is only executed in \(\mathbf {Exp}^{\mathsf{rand}}_{\ell ,\mu ,n,\tau }\)

Proof

We start by making a syntactical change in the real experiment. Let \(\mathbf {S}= (\mathbf {S}[j])_{j\in [\mu ]}\), with \(\mathbf {S}[j]\in \mathbb {Z}_2^\ell \). One can show that there exist vectors \(\mathbf {s}_{j,k}\in \mathbb {Z}_2^\ell \) for \(j \in [\mu ]\) and \(k \in \{0,1\}\) such that

$$\begin{aligned} \mathbf {s}(\mathbf {v}) = \mathbf {S}\cdot \mathbf {v}+ \mathbf {s}_0 = \sum _{j=1}^\mu \mathbf {s}_{j,\mathbf {v}[j]}. \end{aligned}$$

This is obtained by letting \(\mathbf {s}_0 = \sum _{j=1}^{\mu }\mathbf {s}_{j,0}\), and \(\mathbf {S}[j] = \mathbf {s}_{j,1} - \mathbf {s}_{j,0}\). Furthermore, choosing the \(\mathbf {s}_{j,k}\) uniformly at random yields uniformly distributed values \(\mathbf {S}\), \(\mathbf {s}_0\).
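Indeed, over \(\mathbb {Z}_2\) (where \(-\mathbf {s}_{j,0}=+\mathbf {s}_{j,0}\)) this choice gives, for every \(\mathbf {v}\in \mathbb {Z}_2^\mu \),

$$\begin{aligned} \mathbf {S}\cdot \mathbf {v}+ \mathbf {s}_0 = \sum _{j=1}^{\mu }\mathbf {v}[j]\left( \mathbf {s}_{j,1}+\mathbf {s}_{j,0}\right) + \sum _{j=1}^{\mu }\mathbf {s}_{j,0} = \sum _{j=1}^{\mu }\left( \mathbf {v}[j]\,\mathbf {s}_{j,1}+(1+\mathbf {v}[j])\,\mathbf {s}_{j,0}\right) = \sum _{j=1}^{\mu }\mathbf {s}_{j,\mathbf {v}[j]}. \end{aligned}$$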

Let \(\mathbf {G}_0\) be identical to the “real experiment” \(\mathbf {Exp}^{\mathsf{real}}_{\ell ,\mu ,n,\tau }(\mathcal {B})\), with the difference that at the beginning of the experiment we sample the vectors \(\mathbf {s}_{j,k}\) (as defined above) uniformly at random, instead of sampling \((\mathbf {S},\mathbf {s}_0)\), and later those vectors are used to define \(\mathbf {s}(\mathbf {v})\). Clearly, \(\mathbf {G}_0\) is identically distributed to the “real experiment.”

We prove the lemma by considering a sequence of intermediate games, starting with game \(\mathbf {G}_0\). The games are shown in Fig. 7. Note that in Game \(\mathbf {G}_{1,0}\) the value \(\mathbf {s}_0'\) is computed as \(\mathsf {RF}_0(\bot )\), where \(\bot \) is the empty string; that is, \(\mathsf {RF}_0\) always outputs the same uniformly random vector \(\mathbf {s}_0'\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^n\). Therefore, we have

Claim 4.11

\(\Pr [\mathbf {G}_0 = 1] = \Pr [\mathbf {G}_{1,0} = 1]\).

The next claim shows that any two adjacent hybrid games are indistinguishable, if the LPN assumption holds.

Claim 4.12

For all \(i \in \{0,\ldots ,\mu -1\}\), there exists a distinguisher \(\mathcal {D} \), with running time similar to that of \(\mathcal {B} \), such that

$$\begin{aligned} \left| \Pr \left[ \mathcal {D} ^{\Lambda _{\tau ,\ell }(\mathbf {s})}=1;\ \mathbf {s}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\ell \right] - \Pr \left[ \mathcal {D} ^{U_{\ell +1}}=1\right] \right| \ge \frac{1}{2}\left| \Pr \left[ \mathbf {G}_{1,i+1} = 1\right] - \Pr \left[ \mathbf {G}_{1,i} = 1\right] \right| . \end{aligned}$$
Fig. 7

Hybrid experiments \(\mathbf {G}_0\), \(\mathbf {G}_{1,i}\) (for \(i\in \{0,\ldots ,\mu \}\)), and \(\mathbf {G}_2\) in the proof of Lemma 4.10. Here \(\mathsf {RF}_i:\{0,1\}^i\rightarrow \mathbb {Z}_2^n\) is a random function, and \(\mathbf {v}[1\ldots i]\in \mathbb {Z}_2^i\), for \(i\in [\mu ]\), is the ith prefix of vector \(\mathbf {v}\in \mathbb {Z}_2^\mu \)

Proof

Fix some particular \(i\in \{0,\ldots ,\mu -1\}\), and let Q be the total number of queries that \(\mathcal {B} \) asks to the \(\textsc {Eval}(\cdot )\) oracle. Distinguisher \(\mathcal {D} ^\mathcal {O}\) will ask nQ queries to its oracle \(\mathcal {O}(\cdot )\), where \(\mathcal {O}\) is either equal to \(\Lambda _{\tau ,\ell }(\mathbf {s})\) or to \(U_{\ell + 1}\). Looking ahead, distinguisher \(\mathcal {D} \) will start by sampling a bit \(b\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\{0,1\}\) as its guess for \(\mathbf {v}^*[i+1]\) and defining the random function \(\mathsf {RF}_{i+1}(\cdot )\) recursively as

$$\begin{aligned} \mathsf {RF}_{i+1}(\mathbf {v}[1\ldots i+1]) {:}{=} {\left\{ \begin{array}{ll} \mathsf {RF}_{i}(\mathbf {v}[1\ldots i]) &{} \quad \text {if } \mathbf {v}[i+1]=b,\\ \mathsf {RF}_{i}(\mathbf {v}[1\ldots i]) + \mathsf {RF}'_{i}(\mathbf {v}[1\ldots i]) &{} \quad \text {if } \mathbf {v}[i+1]=1-b, \end{array}\right. } \end{aligned}$$
(4.4)

where \(\mathsf {RF}'_i:\{0,1\}^i\rightarrow \mathbb {Z}_2^n\) is another random function (to be determined). One can verify that, in case \(\mathsf {RF}_i(\cdot )\) is a random function, so is \(\mathsf {RF}_{i+1}(\cdot )\). A formal description of \(\mathcal {D} \) follows:

  1.

    At setup \(\mathcal {D} \) does the following:

    • Sample \(b\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\{0,1\}\) and set \(\mathcal {Q}_\mathbf {v}{:}{=} \emptyset \).

    • Choose all the vectors \(\mathbf {s}_{j,k} \mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\ell \) at random (for all \(j\in [\mu ]\) and \(k\in \{0,1\}\)), except for \(\mathbf {s}_{i+1,1-b}\) which is implicitly set to be the vector \(\mathbf {s}\) from the LPN oracle.

    • Query the \(\mathcal {O}(\mathbf {s})\) oracle nQ times, obtaining answers \((\mathbf {R}_j,\mathbf {z}'_j)_{j\in [Q]}\), grouped into Q blocks of n samples each; let \(\mathcal {L}_i\) be an initially empty list.

  2.

    Upon input a query \(\mathbf {v}\) to oracle \(\textsc {Eval}(\cdot )\), distinguisher \(\mathcal {D} \) does the following:

    • Update \(\mathcal {Q}_\mathbf {v}{:}{=} \mathcal {Q}_\mathbf {v}\cup \{\mathbf {v}\}\).

    • If \(\mathbf {v}[i+1] = b\), sample \(\mathbf {R}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^{\ell \times n}\) and \(\mathbf {e}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathsf{Ber}_\tau ^n\), compute \(\mathbf {z}= \mathsf {RF}_{i}(\mathbf {v}[1\ldots i]) + \mathbf {R}^{\mathsf {T}}\cdot (\sum _{j=1}^{\mu }\mathbf {s}_{j,\mathbf {v}[j]}) + \mathbf {e}\), and return \((\mathbf {R},\mathbf {z})\).

    • Else, in case \(\mathbf {v}[i+1] = 1-b\), check whether there exists an entry of the form \((\mathbf {v}[1,\ldots ,i],(\mathbf {R}_j,\mathbf {z}'_j))\) in the list \(\mathcal {L}_i\), for some \(j\in [Q]\). If this is the case, let \((\mathbf {R},\mathbf {z}') {:}{=} (\mathbf {R}_{j},\mathbf {z}'_j)\), otherwise let \((\mathbf {R},\mathbf {z}') {:}{=} (\mathbf {R}_{j},\mathbf {z}'_j)\) for the next (in lexicographic order) not already used LPN sample, and add \((\mathbf {v}[1,\ldots ,i],(\mathbf {R}_j,\mathbf {z}'_j))\) to the list \(\mathcal {L}_i\). Define

      $$\begin{aligned} \mathbf {z}= \mathsf {RF}_i(\mathbf {v}[1\ldots i]) + \mathbf {R}^{\mathsf {T}}\cdot \sum _{\begin{array}{c} j=1 \\ j\ne i+1 \end{array}}^{\mu }\mathbf {s}_{j,\mathbf {v}[j]} + \mathbf {z}' \end{aligned}$$

      and return \((\mathbf {R},\mathbf {z})\).

  3.

    Upon input query \((\mathbf {R}^*,\mathbf {v}^*)\) to oracle \(\textsc {Chal}(\cdot )\), distinguisher \(\mathcal {D} \) does the following:

    • In case \(\mathbf {v}^*[i+1] \ne b\), abort.

    • Else, define \(\mathbf {s}(\mathbf {v}^*) = \sum _{j=1}^{\mu }\mathbf {s}_{j,\mathbf {v}^*[j]}\).

    • Return \(\mathbf {z}^* = \mathbf {R}^{*{\mathsf {T}}}\cdot \mathbf {s}(\mathbf {v}^*) + \mathsf {RF}_i(\mathbf {v}^*[1\ldots i])\).

  4.

    Upon input the decision bit d from \(\mathcal {B} \), distinguisher \(\mathcal {D} \) returns \(d \wedge (\mathbf {v}^*\not \in \mathcal {Q}_\mathbf {v})\).

Suppose that \(\mathcal {D} \) correctly guessed \(\mathbf {v}^*[i+1]\), which happens with probability 1/2. Note that in this case \(\mathcal {D} \) simulates perfectly the answer of the \(\textsc {Chal}(\cdot )\) oracle (as it knows \(\mathbf {s}_{i+1,b}\)). It remains to analyze the distribution of oracle \(\textsc {Eval}(\cdot )\). Below we argue that, depending on the value of \(\mathbf {v}[i+1]\), the distribution will either equal that of \(\mathbf {G}_{1,i}\) or \(\mathbf {G}_{1,i+1}\), with the random function \(\mathsf {RF}_{i+1}\) as defined in Eq. (4.4). In case \(\mathbf {v}[i+1] = b\), the distribution is equal to that of both \(\mathbf {G}_{1,i}\) and \(\mathbf {G}_{1,i+1}\) (which is the same, as in this case \(\mathsf {RF}_{i+1}(\mathbf {v}[1\ldots i+1]) = \mathsf {RF}_i(\mathbf {v}[1\ldots i])\)). In case \(\mathbf {v}[i+1] = 1-b\), we consider two cases depending on whether the oracle \(\mathcal {O}(\mathbf {s})\) outputs LPN samples or uniform samples. In the first case, we have \(\mathbf {z}' = \mathbf {R}^{\mathsf {T}}\cdot \mathbf {s}_{i+1,1-b} + \mathbf {e}\) and thus the answer

$$\begin{aligned} \mathbf {z}= & {} \mathsf {RF}_i(\mathbf {v}[1\ldots i]) + \mathbf {R}^{\mathsf {T}}\cdot \sum _{\begin{array}{c} j=1 \\ j\ne i+1 \end{array}}^{\mu }\mathbf {s}_{j,\mathbf {v}[j]} + \mathbf {R}^{\mathsf {T}}\cdot \mathbf {s}_{i+1,1-b} + \mathbf {e}\\= & {} \mathsf {RF}_i(\mathbf {v}[1\ldots i]) + \mathbf {R}^{\mathsf {T}}\cdot \sum _{j=1}^{\mu }\mathbf {s}_{j,\mathbf {v}[j]} + \mathbf {e}, \end{aligned}$$

is distributed like in game \(\mathbf {G}_{1,i}\). In the second case, we have \(\mathbf {z}' = \mathbf {R}^{\mathsf {T}}\cdot \mathbf {s}_{i+1,1-b} + \mathbf {e}+ \mathbf {u}\) (for a uniform \(\mathbf {u}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^n\)). Thus, the answer \(\mathbf {z}\) computed by \(\mathcal {D} \) is distributed like in \(\mathbf {G}_{1,i+1}\) with random function \(\mathsf {RF}_i'(\mathbf {v}[1\ldots i]) = \mathbf {u}\) and \(\mathsf {RF}_{i+1}\) defined as in Eq. (4.4). Note that \(\mathsf {RF}_i'\) is well defined, i.e., the value \(\mathsf {RF}_i'(\mathbf {v}[1\ldots i])\) does not get overwritten in case the \(\textsc {Eval}(\cdot )\) oracle is queried on two different \(\mathbf {v},\mathbf {v}'\) that are equal in the first i positions (this is because the reduction keeps track of which LPN sample is associated to each of the vectors \(\mathbf {v}[1 \ldots i]\), using the list \(\mathcal {L}_i\)). The claim follows.\(\square \)

Claim 4.13

\(\Pr [\mathbf {G}_{1,\mu } = 1] = \Pr [\mathbf {G}_2 = 1]\).

Proof

The claim follows from the fact that in \(\mathbf {G}_{1,\mu }\) all outputs computed via \(\textsc {Eval}(\cdot )\) are masked by \(\mathsf {RF}_{\mu }(\mathbf {v})\) and thus are independent of \(\mathbf {s}_{j,k}\). Hence, the output of \(\textsc {Chal}(\cdot )\) is uniform.\(\square \)

Finally, we apply all steps in reverse order to re-obtain the initial distribution in the \(\textsc {Eval}(\cdot )\) oracle. The proof of the following claim is analogous to that of Claim 4.12 and is therefore omitted.

Claim 4.14

For all \(i\in \{0,\ldots ,\mu -1\}\), there exists a distinguisher \(\mathcal {D} \), with running time similar to that of \(\mathcal {B} \), such that

$$\begin{aligned}&\left| \Pr \left[ \mathcal {D} ^{\Lambda _{\tau ,\ell }(\mathbf {s})}=1;\ \mathbf {s}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\ell \right] - \Pr \left[ \mathcal {D} ^{U_{\ell +1}}=1\right] \right| \\&\quad \ge \frac{1}{2}\left| \Pr \left[ \mathbf {G}_{2} = 1\right] - \Pr \left[ \mathbf {Exp}^{\mathsf{rand}}_{\ell ,\mu ,n,\tau }(\mathcal {B}) = 1\right] \right| . \end{aligned}$$

The statement of Lemma 4.10 now follows by putting together Claims 4.11–4.14.\(\square \)

We now turn to the proof of Theorem 4.9.

Proof of Theorem 4.9

The proof of the completeness error is similar to that of the previous schemes and is therefore omitted. As for security, let \(\mathcal {A}\) be an adversary that successfully forges in the uf-cma experiment with probability \(\varepsilon '\). Assume that \(\mathcal {A} \) asks a total of \(Q = Q_\mathsf {tag}+ Q_\mathsf {vrfy}\) queries, where \(Q_\mathsf {tag}\) (resp., \(Q_\mathsf {vrfy}\)) stands for the total number of queries asked to the tag (resp., verification) oracle.

We adopt the same conventions and definition of freshness as in the proof of Theorem 4.2 and consider the two cases \(\Pr [E_{\mathsf {fresh}}]\le \varepsilon '/2\) and \(\Pr [E_{\mathsf {fresh}}]>\varepsilon '/2\) separately.

The case \(\underline{\Pr [E_{\mathsf {fresh}}]\le \varepsilon '/2}\). We now give the description of \(\mathcal {B} _1^{\mathcal {O}}\) attacking the \(\mathsf {SLPN}_{\tau ,\ell ,\ell }\) problem, i.e., \(\mathcal {B} _1^\mathcal {O}\) can distinguish \(\mathcal {O}=\Gamma _{\tau ,\ell ,\ell }(\mathbf {s},\cdot , \cdot )\) from \(\mathcal {O}=U_{\ell +1}(\cdot ,\cdot )\) with advantage

$$\begin{aligned} \varepsilon '/2 - \frac{Q^2}{2^{\mu }}. \end{aligned}$$
(4.5)

Adversary \(\mathcal {B} _1^\mathcal {O}\) samples \(\pi ,h,\mathbf {S},\mathbf {s}_0'\) (but not \(\mathbf {s}_0\)) as defined by \(\mathcal {K}\) and \(\mathbf {B}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^{\ell \times \mu }\). Next, it implicitly defines \(\mathbf {s}_0\) as \(\mathbf {s}_0 = \mathbf {s}\), where \(\mathbf {s}\) is only implicitly defined through \(\Gamma _{\tau ,\ell ,\ell }(\mathbf {s},\cdot , \cdot )\). It is easy to see that with this setup of \(K = (\mathbf {S},\mathbf {s}_0,\mathbf {s}_0',h,\pi )\) we have that, for each \(\mathbf {v}\in \mathbb {Z}_2^\mu \setminus \{0\}\),

$$\begin{aligned} \mathbf {s}(\mathbf {v}) = \mathbf {s}_0 + \mathbf {S}\cdot \mathbf {v}= \mathbf {s}+ \mathbf {S}\cdot \mathbf {v}. \end{aligned}$$
(4.6)

Note that \(\mathbf {S}\cdot \mathbf {v}\) is known to \(\mathcal {B} _1\). Adversary \(\mathcal {B} _1^\mathcal {O}\) cannot evaluate \(\mathbf {s}(\mathbf {v})\) but looking ahead, it will use its oracle \(\mathcal {O}\) to answer \(\mathcal {A} \)’s queries as follows.

  • Tag queries. If \(\mathcal {A}\) makes a tag query for message \(m\in \mathcal {M}\), then \(\mathcal {B} _1^{\mathcal {O}}\) does the following:

    1.

      Sample \(\mathbf {b}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\nu \) and compute \(\mathbf {v}{:}{=} h(m, \mathbf {b})\).

    2.

      Query the oracle \(\mathcal {O}\) on \((\mathbf {I},\mathbf {S}\cdot \mathbf {v})\) n times to obtain \((\mathbf {R},\mathbf {z}')\): for \(i=1,\ldots ,n\) let \((\mathbf {R}[i], \mathbf {z}'[i]) \mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathcal {O}(\mathbf {I},\mathbf {S}\cdot \mathbf {v})\). Here \(\mathbf {I}\) is the identity matrix.

    3.

      Return \(\phi {:}{=}\pi (\mathbf {R}, \mathbf {s}_0' + \mathbf {z}', \mathbf {b})\).

  • Verification queries. If \(\mathcal {A}\) makes a verification query \((m,\phi )\), \(\mathcal {B} _1^\mathcal {O}\) simply answers with \(\mathsf {reject}\).

Finally, if any tag or verification query contains a \(\mathbf {v}\) which has appeared in a previous query, \(\mathcal {B} _1^\mathcal {O}\) outputs 1 and stops. Otherwise, it outputs 0. Note that if \(\mathcal {O}= \Gamma _{\tau ,\ell ,\ell }(\mathbf {s},\cdot , \cdot )\), then \(\mathcal {B} _1^\mathcal {O}\) perfectly simulates the \(\mathcal {T}(K,\cdot )\) algorithm, as \(\mathbf {z}'[i] = \mathbf {R}[i]^{\mathsf {T}}(\mathbf {I}\cdot \mathbf {s}+ \mathbf {S}\cdot \mathbf {v}) + \mathbf {e}[i] = \mathbf {R}[i]^{\mathsf {T}}\cdot \mathbf {s}(\mathbf {v}) + \mathbf {e}[i]\).

The following two claims are the analogs of Claim 4.5 and 4.6, respectively. Their proofs are essentially the same and are therefore omitted.

Claim 4.15

If \(\mathcal {O}= \Gamma _{\tau ,\ell ,\ell }(\mathbf {s},\cdot ,\cdot )\), then \(\mathcal {B} _1^{\mathcal {O}}\) outputs 1 with probability \(\ge \varepsilon '/2\).

Claim 4.16

If \(\mathcal {O}=U_{\ell +1}(\cdot ,\cdot )\), then \(\mathcal {B} _1^{\mathcal {O}}\) outputs 1 with probability \(< \frac{Q^2}{2^{\mu }}\).

The case \(\underline{\Pr [E_{\mathsf {fresh}}]>\varepsilon '/2}\). We will use games, denoting by \(\mathbf {G}_i\) the output of the ith game. Game \(\mathbf {G}_0\) runs the uf-cma security experiment \(\mathbf {Exp}^{\mathrm {uf}\hbox {-}\mathrm {cma}}_{\mathsf {MAC}_2,\mathcal {A},\lambda }\) and defines the output as 1 iff event \(E_\mathsf {fresh}\) holds. By definition we have \(\Pr [\mathbf {G}_0 = 1] = \Pr [E_\mathsf {fresh}] > \varepsilon '/2\). Throughout the rest of the proof, if in the game \(\mathcal {A} \) finds a forgery, and the first forgery is fresh, we will denote with \(\cdot ^*\) the values associated with this first forgery. In particular, \(\mathbf {v}^*\) is the \(\mathbf {v}\)-value computed to evaluate the verification query on \((m^*, \phi ^*)\). Note that, by definition, \(\mathbf {v}^*\) is fresh, i.e., it is different from all the \(\mathbf {v}\)-values from previous tag and verification queries. Without loss of generality, we assume that after \(\mathbf {G}_0\) processes a verification query with respect to \(\mathbf {v}^*\), as defined above, the random variable corresponding to the outcome of the game is defined and the experiment stops without processing any further query. For simplicity, we also assume that the answer to all verification queries \((m,\phi )\) before the forgery is found is \(\mathsf {reject}\), unless the pair \((m,\phi )\) was returned by the tag oracle (in which case we assume the answer is always \(\mathsf {accept}\)); this results in a loss of at most \(Q_{\mathsf {vrfy}}\cdot \alpha \) in the final bound, where \(\alpha {:}{=} 2^{n-\ell } + 2^{-c_\tau '\cdot n}\) is the completeness error of \(\mathsf {MAC}_2\).

Consider games \(\mathbf {G}_1,\ldots ,\mathbf {G}_{Q_\mathsf {vrfy}}\) where game \(\mathbf {G}_j\) is identical to \(\mathbf {G}_0\), but allows the adversary \(\mathcal {A} \) to ask only j verification queries. Define \(E_\mathsf {fresh}^j\) to be the event that in \(\mathbf {G}_0\) the jth verification query is the one where the first fresh forgery is found; this means that all previous verification queries are either rejected, or relative to a pair \((m,\phi )\) previously returned by the tag oracle. Since, for \(j\in \{1,\ldots ,Q_\mathsf {vrfy}\}\), all the events \(E_\mathsf {fresh}^j\) are disjoint, and additionally \(\Pr [\mathbf {G}_j = 1] \ge \Pr [E_\mathsf {fresh}^j]\), we have:

$$\begin{aligned} \varepsilon '/2 < \Pr \left[ E_\mathsf {fresh}\right] = \Pr \left[ \mathbf {G}_0 = 1\right]= & {} \Pr \left[ \bigcup _{j=1}^{Q_\mathsf {vrfy}}E_\mathsf {fresh}^j\right] = \sum _{j=1}^{Q_\mathsf {vrfy}}\Pr \left[ E_\mathsf {fresh}^j\right] \nonumber \\\le & {} \sum _{j=1}^{Q_\mathsf {vrfy}}\Pr \left[ \mathbf {G}_j = 1\right] . \end{aligned}$$
(4.7)

In the remainder of the proof, we will upper bound \(\Pr [\mathbf {G}_j = 1]\), for all \(j\in \{1,\ldots ,Q_\mathsf {vrfy}\}\). As a first step, consider a modified version \(\mathbf {G}_j'\) of Game \(\mathbf {G}_j\) where the tag oracle internally uses uniform \(\mathbf {z}\in \mathbb {Z}_2^n\), instead of \(\mathbf {z}= \mathbf {s}_0' + \mathbf {R}^{\mathsf {T}}\cdot \mathbf {s}(\mathbf {v}) + \mathbf {e}\), to generate the tag \(\phi \) on message \(m\).

Claim 4.17

\(\left| \Pr [\mathbf {G}_j= 1] - \Pr [\mathbf {G}_j'=1]\right| \le 4\mu \varepsilon \), for all \(j\in \{1,\ldots ,Q_\mathsf {vrfy}\}\).

Proof

Fix a value of \(j\in \{1,\ldots ,Q_\mathsf {vrfy}\}\). We prove something stronger, namely that the distributions induced by \(\mathbf {G}_j\) and \(\mathbf {G}_j'\) are computationally close, within distance \(4\mu \varepsilon \). Assume toward a contradiction that there exists a distinguisher \(\mathcal {D} \) that distinguishes games \(\mathbf {G}_j\) and \(\mathbf {G}_j'\) with advantage greater than \(4\mu \varepsilon \). We build an attacker \(\mathcal {B} \) (running \(\mathcal {D} \)) such that

$$\begin{aligned} \left| \Pr \left[ \mathbf {Exp}^{\mathsf{real}}_{\ell ,\mu ,n,\tau }(\mathcal {B}) = 1\right] - \Pr \left[ \mathbf {Exp}^{\mathsf{rand}}_{\ell ,\mu ,n,\tau }(\mathcal {B}) = 1\right] \right| > 4\mu \varepsilon , \end{aligned}$$

contradicting Lemma 4.10. Adversary \(\mathcal {B} \) works as follows:

  1. At the beginning \(\mathcal {B} \) samples \(h,\pi \) (but not \(\mathbf {S},\mathbf {s}_0,\mathbf {s}_0'\)).

  2. Upon input a query \(m\) to the tag oracle, \(\mathcal {B} \) does the following:

    • Sample a random \(\mathbf {b}\mathop {\leftarrow }\limits ^{{\scriptscriptstyle \$}}\mathbb {Z}_2^\nu \) and compute \(\mathbf {v}= h(m,\mathbf {b})\).

    • Query \(\mathbf {v}\) to oracle \(\textsc {Eval}(\cdot )\), obtaining a pair \((\mathbf {R},\mathbf {z})\), and forward \(\phi = \pi (\mathbf {R},\mathbf {z},\mathbf {b})\) to \(\mathcal {D} \).

  3. Upon input a verification query \((m,\phi )\) to the verification oracle, \(\mathcal {B} \) does the following:

    • First check whether \((m,\phi )\) is equal to one of the tags previously returned to \(\mathcal {D} \); if this is the case answer with \(\mathsf {accept}\).

    • Otherwise, check whether \((m,\phi )\) is the jth verification query; if this is not the case, then answer with \(\mathsf {reject}\).

    • Else, \((m,\phi )\) is the jth verification query; call it \((m^*,\phi ^*)\). Let \((\mathbf {R}^*,\mathbf {z}^*,\mathbf {b}^*) = \pi ^{-1}(\phi ^*)\), compute \(\mathbf {v}^* = h(m^*,\mathbf {b}^*)\) and forward \((\mathbf {R}^*,\mathbf {v}^*)\) to oracle \(\textsc {Chal}(\cdot )\) obtaining a vector \(\mathbf {z}'\). Check that \(\mathbf{wt}(\mathbf {z}' + \mathbf {z}^*) \le n\cdot \tau '\); if this is the case return \(\mathsf {accept}\) to \(\mathcal {D} \), otherwise return \(\mathsf {reject}\).

  4. Finally \(\mathcal {B} \) outputs whatever \(\mathcal {D} \) does.

For the analysis, note that \(\mathcal {B} \) runs in time similar to that of \(\mathcal {D} \). By inspection, one can verify that when \(\mathcal {B} \) is running in the “real experiment” or in the “random experiment,” the simulation of the tag queries provided by \(\mathcal {B} \) is distributed as in \(\mathbf {G}_j\) or in \(\mathbf {G}_j'\), respectively. Finally, all verification queries before the jth query are either answered with \(\mathsf {accept}\) (in case they are identical to a previously simulated tag) or with \(\mathsf {reject}\) (otherwise); this is consistent with both games \(\mathbf {G}_j\) and \(\mathbf {G}_j'\). The jth verification query is fresh by definition and is simulated using the answer from the \(\textsc {Chal}(\cdot )\) oracle, so it has the right distribution. The claim follows.\(\square \)
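The Python sketch below transcribes the reduction \(\mathcal {B}\) described above. It is only a schematic: the oracles \(\textsc {Eval}(\cdot )\) and \(\textsc {Chal}(\cdot )\), the hash \(h\), and the invertible encoding \(\pi \) are modeled as abstract callables supplied by the caller, and the class and method names are ours rather than the paper's.

```python
import numpy as np

class ReductionB:
    """Schematic transcription of the reduction B from the proof of Claim 4.17.

    Assumed (illustrative) interfaces:
      eval_oracle(v)    -> (R, z)   models Eval(.)
      chal_oracle(R, v) -> z'       models Chal(.)
      h(m, b), pi(R, z, b), pi_inv(phi) model the hash h and the invertible encoding pi;
      pi is assumed to return a hashable value (e.g., bytes).
    """

    def __init__(self, eval_oracle, chal_oracle, h, pi, pi_inv,
                 j, n, nu, tau_prime, rng=None):
        self.eval_oracle, self.chal_oracle = eval_oracle, chal_oracle
        self.h, self.pi, self.pi_inv = h, pi, pi_inv
        self.j = j                       # index of the verification query forwarded to Chal(.)
        self.n, self.nu, self.tau_prime = n, nu, tau_prime
        self.rng = rng or np.random.default_rng()
        self.issued = set()              # (m, phi) pairs returned by the simulated tag oracle
        self.vrfy_count = 0

    def tag_query(self, m):
        """Step 2: simulate the tag oracle via Eval(.)."""
        b = self.rng.integers(0, 2, size=self.nu, dtype=np.int8)
        v = self.h(m, b)
        R, z = self.eval_oracle(v)
        phi = self.pi(R, z, b)
        self.issued.add((m, phi))
        return phi

    def vrfy_query(self, m, phi):
        """Step 3: simulate the verification oracle; only the j-th query uses Chal(.)."""
        self.vrfy_count += 1
        if (m, phi) in self.issued:      # replay of a previously issued tag
            return "accept"
        if self.vrfy_count != self.j:    # not the j-th verification query
            return "reject"
        R_star, z_star, b_star = self.pi_inv(phi)
        v_star = self.h(m, b_star)
        z_prime = self.chal_oracle(R_star, v_star)
        weight = int(np.sum((z_prime + z_star) % 2))
        return "accept" if weight <= self.n * self.tau_prime else "reject"
```

A driver corresponding to step 4 would instantiate ReductionB with either the real or the random oracles of Lemma 4.10, let \(\mathcal {D}\) interact with tag_query and vrfy_query, and output whatever \(\mathcal {D}\) outputs.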

Claim 4.18

\(\Pr [\mathbf {G}_j' = 1] \le \ \alpha ''_{\tau ',n}\), for all \(j\in \{1,\ldots ,Q_\mathsf {vrfy}\}\).

Proof of Claim

Fix a value of \(j\in \{1,\ldots ,Q_\mathsf {vrfy}\}\). If \(\mathbf {R}^*\) does not have full rank, then the experiment outputs 0 by definition. So from now on we only consider the case where \(\mathsf {rank}(\mathbf {R}^*)=n\). In Game \(\mathbf {G}_j'\), the values \((\mathbf {R},\mathbf {z})\) that the adversary \(\mathcal {A} \) obtains from the tag oracle are independent of the secrets \((\mathbf {S},\mathbf {s}_0,\mathbf {s}_0')\). Since \(\mathbf {s}(\mathbf {v}^*)\) is uniformly random (the view of \(\mathcal {A} \) in \(\mathbf {G}_j'\) is independent of the secrets) and \(\mathbf {R}^*\) has full rank, the vector \( \mathbf {x}{:}{=}\mathbf {s}_0' + {\mathbf {R}^*}^{\mathsf {T}}\cdot \mathbf {s}(\mathbf {v}^*) + \mathbf {z}^* \) is uniformly random over \(\mathbb {Z}_2^n\). Thus the probability that the second verification check \(\mathbf{wt}(\mathbf {z}^*+ \mathbf {s}_0' + {\mathbf {R}^*}^{\mathsf {T}}\cdot \mathbf {s}(\mathbf {v}^*)) \le n \cdot \tau ' \) passes is \( \Pr [\mathbf{wt}(\mathbf {x})\le n\cdot \tau ']=\alpha ''_{\tau ',n}=2^{-\Theta (n)} \).\(\square \)
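The quantity \(\alpha ''_{\tau ',n}\) is simply a binomial tail: for a uniform \(\mathbf {x}\in \mathbb {Z}_2^n\), \(\Pr [\mathbf{wt}(\mathbf {x})\le n\tau ']=2^{-n}\sum _{k\le n\tau '}\binom{n}{k}\), which decays exponentially in \(n\) whenever \(\tau '<1/2\) is a constant. The short script below evaluates this tail exactly for some illustrative parameter values (examples only, not parameters proposed in this paper).

```python
from math import comb, log2

def alpha_tail(n: int, tau_prime: float) -> float:
    """Pr[wt(x) <= n*tau'] for x uniform over Z_2^n, i.e., a binomial tail."""
    threshold = int(n * tau_prime)
    return sum(comb(n, k) for k in range(threshold + 1)) / 2 ** n

# Exponential decay in n for a fixed tau' < 1/2 (example values only).
for n in (128, 256, 512, 1024):
    p = alpha_tail(n, tau_prime=0.3)
    print(f"n={n:4d}  Pr[wt(x) <= 0.3*n] = 2^{log2(p):.1f}")
```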

Summing up, in the case \(\Pr [E_{\mathsf {fresh}}]>\varepsilon '/2\), combining the bounds of Claims 4.17 and 4.18 with Eq. (4.7) (and accounting for the loss of at most \(Q_\mathsf {vrfy}\cdot \alpha \) from the simplifying assumption made in \(\mathbf {G}_0\)), we can use \(\mathcal {A} \) to break the LPN assumption with advantage at least \(\tfrac{\varepsilon '}{8\mu Q_\mathsf {vrfy}} - Q_\mathsf {vrfy}\cdot (\alpha + \alpha ''_{\tau ',n})\). On the other hand, in the case \(\Pr [E_{\mathsf {fresh}}] \le \varepsilon '/2\), we obtain an advantage as given in Eq. (4.5). Thus

$$\begin{aligned} \varepsilon \ge \min \left\{ \varepsilon '/2 - \frac{Q^2}{2^{\mu }}, \frac{\varepsilon '}{8\mu Q_\mathsf {vrfy}} - Q_\mathsf {vrfy}\cdot (\alpha +\alpha ''_{\tau ',n}) \right\} . \end{aligned}$$

\(\square \)
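To see how the two branches of this bound trade off, the snippet below simply evaluates the displayed minimum for some illustrative parameter values (the numbers are examples of ours, not parameters proposed in this paper). For such values the second branch, which carries the multiplicative \(8\mu Q_\mathsf {vrfy}\) loss, is typically the smaller of the two.

```python
def lpn_advantage_lower_bound(eps_prime, Q, Q_vrfy, mu, alpha, alpha_dd):
    """Evaluate the two branches of the final bound and return their minimum.
    All inputs are illustrative; they are not values taken from the paper."""
    branch_guess = eps_prime / 2 - Q ** 2 / 2 ** mu                           # case Pr[E_fresh] small
    branch_fresh = eps_prime / (8 * mu * Q_vrfy) - Q_vrfy * (alpha + alpha_dd)  # case Pr[E_fresh] large
    return min(branch_guess, branch_fresh)

# Example: a forger with advantage 2^-10 making 2^20 queries, mu = 80.
print(lpn_advantage_lower_bound(eps_prime=2**-10, Q=2**20, Q_vrfy=2**20,
                                mu=80, alpha=2**-80, alpha_dd=2**-80))
```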