Separate Your Domains: NIST PQC KEMs, Oracle Cloning and Read-Only Indifferentiability
Abstract
It is convenient and common for schemes in the random oracle model to assume access to multiple random oracles (ROs), leaving to implementations the task (we call it oracle cloning) of constructing them from a single RO. The first part of the paper is a case study of oracle cloning in KEM submissions to the NIST Post-Quantum Cryptography standardization process. We give key-recovery attacks on some submissions arising from mistakes in oracle cloning, and find other submissions using oracle cloning methods whose validity is unclear. Motivated by this, the second part of the paper gives a theoretical treatment of oracle cloning. We give a definition of what is an “oracle cloning method” and what it means for such a method to “work,” in a framework we call read-only indifferentiability, a simple variant of classical indifferentiability that yields security not only for usage in single-stage games but also in multi-stage ones. We formalize domain separation, and specify and study many oracle cloning methods, including common domain-separating ones, giving some general results to justify (prove read-only indifferentiability of) certain classes of methods. We are not only able to validate the oracle cloning methods used in many of the unbroken NIST PQC KEMs, but also able to specify and validate oracle cloning methods that may be useful beyond that.
1 Introduction
Theoretical works giving, and proving secure, schemes in the random oracle (RO) model [11] often, for convenience, assume access to multiple, independent ROs. Implementations, however, like to implement them all via a single hash function like \(\mathsf {SHA256}\) that is assumed to be a RO.
If it were merely a question of the specific domain-separation method of Eq. (1), which derives the ith oracle by prefixing the input with a fixed-length encoding of i, we’d be inclined to agree. But we have found some good reasons to revisit the question and look into theoretical foundations. They arise from the NIST Post-Quantum Cryptography (PQC) standardization process [35].
We analyzed the KEM submissions. We found attacks, breaking some of them, that arise from incorrect ways of turning one random oracle into many, indicating that the process is error-prone. We found other KEMs where methods other than Eq. (1) were used and whether or not they work is unclear. In some submissions, instantiations for multiple ROs were left unspecified. In others, they differed between the specification and reference implementation.
Domain separation as per Eq. (1) is a method, not a goal. We identify and name the underlying goal, calling it oracle cloning: given one RO, build many, independent ones. (More generally, given m ROs, build \(n>m\) ROs.) We give a definition of what is an “oracle cloning method” and what it means for such a method to “work,” in a framework we call read-only indifferentiability, a simple variant of classical indifferentiability [29]. We specify and study many oracle cloning methods, giving some general results to justify (prove read-only indifferentiability of) certain classes of them. The intent is not only to validate as many NIST PQC KEMs as possible (which we do) but to specify and validate methods that will be useful beyond that.
Below we begin by discussing the NIST PQC KEMs and our findings on them, and then turn to our theoretical treatment and results.
NIST PQC KEMs. In late 2016, NIST put out a call for post-quantum cryptographic algorithms [35]. In the first round they received 28 submissions targeting IND-CCA-secure KEMs, of which 17 remain in the second round [37].
Recall that in a KEM (Key Encapsulation Mechanism) \({\mathsf {KE}}\), the encapsulation algorithm \(\mathsf {{\mathsf {KE}}{.}E}\) takes the public key (but no message) and returns a symmetric key K together with a ciphertext \(C^*\) encapsulating it. Given an IND-CCA KEM, one can easily build an IND-CCA PKE scheme by hybrid encryption [18], explaining the focus of standardization on the KEMs.
Most of the KEM submissions (23 in the first round, 15 in the second round) are constructed from a weak (OW-CPA, IND-CPA, ...) PKE scheme using either a method from Hofheinz, Hövelmanns and Kiltz (HHK) [24] or a related method from [21, 27, 40]. This results in a KEM \({\mathsf {KE}}_{4}\), the subscript indicating that it uses up to four ROs that we’ll denote \( H _1, H _2, H _3, H _4\). Results of [21, 24, 27, 40] imply that \({\mathsf {KE}}_{4}\) is provably IND-CCA, assuming the ROs \( H _1, H _2, H _3, H _4\) are independent.
Next, the step of interest for us, the oracle cloning: they build the multiple random oracles via a single RO \( H \), replacing \( H _i\) with an oracle \(\mathbf {F}[ H ](i,\cdot )\), where we refer to the construction \(\mathbf {F}\) as a “cloning functor,” and \(\mathbf {F}[ H ]\) means that \(\mathbf {F}\) gets oracle access to \( H \). This turns \({\mathsf {KE}}_{4}\) into a KEM \({\mathsf {KE}}_{1}\) that uses only a single RO \( H \), allowing an implementation to instantiate the latter with a single NIST-recommended primitive like \(\mathsf {SHA3}\text {-}\mathsf {512}\) or \(\mathsf {SHAKE256}\) [36]. (In some cases, \({\mathsf {KE}}_{1}\) uses a number of ROs that is more than one but less than the number used by \({\mathsf {KE}}_{4}\), which is still oracle cloning, but we’ll ignore this for now.)
Often the oracle cloning method (cloning functor) is not specified in the submission document; we obtained it from the reference implementation. Our concern is the security of this method and the security of the final, single-RO-using KEM \({\mathsf {KE}}_{1}\). (As above we assume the starting \({\mathsf {KE}}_{4}\) is secure if its four ROs are independent.)
Oracle cloning in submissions. We surveyed the relevant (first- and second-round) NIST PQC KEM submissions, looking in particular at the reference code, to determine what choice of cloning functor \(\mathbf {F}\) was made, and how it impacted security of \({\mathsf {KE}}_{1}\). Based on our findings, we classify the submissions into groups as follows.
First is a group of successfully attacked submissions. We discover and specify attacks, enabled through erroneous RO cloning, on three first-round submissions: [8], [7] and [22]. Our attacks on two of them recover the symmetric key K from the ciphertext \(C^*\) and public key. Our attack on the remaining one, DAGS, succeeds in partial key recovery, recovering 192 bits of the symmetric key. These attacks are very fast, taking at most about the same time as taken by the (secret-key equipped, prescribed) decryption algorithm to recover the key. None of our attacks needs access to a decryption oracle, meaning we violate much more than IND-CCA.
Next are submissions with questionable oracle cloning. We put just one in this group, namely the submission of [2]. Here we do not have a proof of security in the ROM for the final instantiated scheme \({\mathsf {KE}}_{1}\). We do show that the cloning methods used here do not achieve our formal notion of rd-indiff security, but this does not result in an attack on \({\mathsf {KE}}_{1}\), so we do not have a practical attack either. We recommend changes in the cloning methods that permit proofs.
Next is a group of ten submissions that use ad-hoc oracle cloning methods (as opposed, say, to conventional domain separation as per Eq. (1)) but for which our results (to be discussed below) are able to prove security of the final single-RO scheme. In this group are the submissions of [3], [44], [28], [16], [4], [38], [30], [6], [19] and [43]. Still, the security of these oracle cloning methods remains brittle and prone to vulnerabilities under slight changes.
A final group of twelve submissions did well, employing something like Eq. (1). In particular our results can prove these methods secure. In this group are the submissions of [13], [5], [41], [34], [32], [42], [25], [14], [1], [31], [26] and [23].
This classification omits 14 KEM schemes that do not fit the above framework. (For example they do not target IND-CCA KEMs, do not use HHK-style transforms, or do not use multiple random oracles.)
Lessons and response. We see that oracle cloning is error-prone, and that it is sometimes done in ad-hoc ways whose validity is not clear. We suggest that oracle cloning not be left to implementations. Rather, scheme designers should give proof-validated oracle cloning methods for their schemes. To enable this, we initiate a theoretical treatment of oracle cloning. We formalize oracle cloning methods, define what it means for one to be secure, and specify a library of proven-secure methods from which designers can draw. We are able to justify the oracle cloning methods of many of the unbroken NIST PQC KEMs. The framework of read-only indifferentiability we introduce and use for this purpose may be of independent interest.
The NIST PQC KEMs we break are first-round candidates, not second-round ones, and in some cases other attacks on the same candidates exist, so one may say the breaks are no longer interesting. We suggest reasons they are. Their value is illustrative, showing not only that errors in oracle cloning occur in practice, but that they can be devastating for security. In particular, the extensive and long review process for the first-round NIST PQC submissions seems to have missed these simple attacks, perhaps due to lack of recognition of the importance of good oracle cloning.
Indifferentiability background. Let \(\mathsf {SS},\mathsf {ES}\) be sets of functions. (We will call them the starting and ending function spaces, respectively.) A functor \(\mathbf {F} {:\;\;}\mathsf {SS} \rightarrow \mathsf {ES}\) is a deterministic algorithm that, given as oracle a function \( s \in \mathsf {SS}\), defines a function \(\mathbf {F}[ s ] \in \mathsf {ES}\). Indifferentiability of \(\mathbf {F}\) is a way of defining what it means for \(\mathbf {F}[ s ]\) to emulate \( e \) when \( s , e \) are randomly chosen from \(\mathsf {SS},\mathsf {ES}\), respectively. It permits a “composition theorem” saying that if \(\mathbf {F}\) is indifferentiable then use of \( e \) in a scheme can be securely replaced by use of \(\mathbf {F}[ s ]\).
Maurer, Renner and Holenstein (MRH) [29] gave the first definition of indifferentiability and corresponding composition theorem. However, Ristenpart, Shacham and Shrimpton (RSS) [39] pointed out a limitation, namely that it only applies to single-stage games. MRH-indiff fails to guarantee security in multi-stage games, a setting that includes many goals of interest including security under related-key attack, deterministic public-key encryption and encryption of key-dependent messages. Variants of MRH-indiff [17, 20, 33, 39] tried to address this, with limited success.
Rd-indiff. Indifferentiability is the natural way to treat oracle cloning. A cloning of one function into n functions (\(n=4\) above) can be captured as a functor (we call it a cloning functor) \(\mathbf {F}\) that takes the single RO \( s \) and for each \(i \in [1..n]\) defines a function \(\mathbf {F}[ s ](i,\cdot )\) that is meant to emulate a RO. We will specify many oracle cloning methods in this way.
We define in Sect. 4 a variant of indifferentiability we call read-only indifferentiability (rd-indiff). The simulator, unlike for reset-indiff [39], has access to a game-maintained state \(st\), but, unlike for MRH-indiff [29], that state is read-only, meaning the simulator cannot alter it across invocations. Rd-indiff is a stronger requirement than MRH-indiff (if \(\mathbf {F}\) is rd-indiff then it is MRH-indiff) but a weaker one than reset-indiff (if \(\mathbf {F}\) is reset-indiff then it is rd-indiff). Despite the latter, rd-indiff, like reset-indiff, admits a composition theorem showing that an rd-indiff \(\mathbf {F}\) may securely substitute a RO even in multi-stage games. (The proof of RSS [39] for reset-indiff extends to show this.) We do not use reset-indiff because some of our cloning functors do not meet it, but they do meet rd-indiff, and the composition benefit is preserved.
General results. In Sect. 4, we define translating functors. These are simply ones whose oracle queries are non-adaptive. (In more detail, a translating functor determines from its input W a list of queries, makes them to its oracle and, from the responses and W, determines its output.) We then define a condition on a translating functor \(\mathbf {F}\) that we call invertibility and show that if \(\mathbf {F}\) is an invertible translating functor then it is rd-indiff. This is done in two parts, Theorems 1 and 2, that differ in the degree of invertibility assumed. The first, assuming the greater degree of invertibility, allows a simpler proof with a simulator that does not need the read-only state allowed in rd-indiff. The second, assuming the lesser degree of invertibility, depends on a simulator that makes crucial use of the read-only state. It sets the latter to a key for a PRF that is then used to answer queries that fall outside the set of ones that can be trivially answered under the invertibility condition. This use of a computational primitive (a PRF) in the indifferentiability context may be novel and may seem odd, but it works.
We apply this framework to analyze particular, practical cloning functors, showing that these are translating and invertible, and then deducing their rd-indiff security. But the above-mentioned results are stronger and more general than we need for the application to oracle cloning. The intent is to enable further, future applications.
Analysis of oracle cloning methods. We formalize oracle cloning as the task of designing a functor (we call it a cloning functor) \(\mathbf {F}\) that takes as oracle a function \(s\in \mathsf {SS}\) in the starting space and returns a two-input function \(e= \mathbf {F}[s] \in \mathsf {ES} \), where \(e(i,\cdot )\) represents the ith RO for \(i\in [1..n]\). Section 5 presents the cloning functors corresponding to some popular and practical oracle cloning methods (in particular ones used in the NIST PQC KEMs), and shows that they are translating and invertible. Our above-mentioned results allow us to then deduce they are rd-indiff, which means they are safe to use in most applications, even ones involving multi-stage games. This gives formal justification for some common oracle cloning methods. We now discuss some specific cloning functors that we treat in this way.
The prefix (cloning) functor \(\mathbf {F}_{\mathrm {pf}(\mathbf {p})}\) is parameterized by a fixed, public vector \(\mathbf {p}\) such that no entry of \(\mathbf {p}\) is a prefix of any other entry of \(\mathbf {p}\). Receiving function \( s \) as an oracle, it defines function \(e= \mathbf {F}_{\mathrm {pf}(\mathbf {p})}[ s ]\) by \(e(i,X) = s (\mathbf {p}[i]\Vert X)\), where \(\mathbf {p}[i]\) is the \(i^{\text {th}}\) element of vector \(\mathbf {p}\). When \(\mathbf {p}[i]\) is a fixed-length bitstring representing the integer i, this formalizes Eq. (1).
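To make the definition concrete, the following is a minimal Python sketch of a prefix functor. It assumes SHA3-512 as a stand-in for the oracle \(s\) and one-byte prefixes encoding i; the names `prefix_functor` and `is_prefix_free` are illustrative and do not come from the paper or from any submission.

```python
import hashlib
from typing import Callable, Sequence

Oracle = Callable[[bytes], bytes]


def is_prefix_free(p: Sequence[bytes]) -> bool:
    """Return True if no entry of p is a prefix of a different entry of p."""
    return not any(
        i != j and p[j].startswith(p[i])
        for i in range(len(p))
        for j in range(len(p))
    )


def prefix_functor(s: Oracle, p: Sequence[bytes]) -> Callable[[int, bytes], bytes]:
    """Build e with e(i, X) = s(p[i] || X), using 1-based i as in the text."""
    if not is_prefix_free(p):
        raise ValueError("prefix vector must be prefix-free")

    def e(i: int, x: bytes) -> bytes:
        if not 1 <= i <= len(p):
            raise ValueError("oracle index out of range")
        return s(p[i - 1] + x)

    return e


def sha3_512(x: bytes) -> bytes:
    # Assumption: SHA3-512 plays the role of the single random oracle s.
    return hashlib.sha3_512(x).digest()


# Assumption: one-byte encodings 0x01..0x04 of the oracle index.
P = [bytes([i]) for i in range(1, 5)]
e = prefix_functor(sha3_512, P)
```

With fixed-length prefixes the prefix-free condition holds automatically as long as the entries are distinct; the explicit check matters when prefixes of different lengths are mixed.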
Some NIST PQC submissions use a method we call output splitting. The simplest case is that we want \(e(1,\cdot ),\ldots ,e(n,\cdot )\) to all have the same output length L. We then define e(i, X) as bits \((i-1)L+1\) through iL of the given function \(s\) applied to X. That is, receiving function \( s \) as an oracle, the splitting (cloning) functor \(\mathbf {F}_{\mathrm {spl}}\) returns function \(e= \mathbf {F}_{\mathrm {spl}}[ s ]\) defined by \(e(i,X) = s(X)[(i-1)L+1 .. iL]\).
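Output splitting can be sketched in Python as follows. This is only an illustration: it works on bytes rather than bits, assumes SHAKE256 with a fixed output length of nL bytes as the oracle \(s\), and the name `splitting_functor` is hypothetical.

```python
import hashlib
from typing import Callable


def splitting_functor(
    s: Callable[[bytes], bytes], n: int, L: int
) -> Callable[[int, bytes], bytes]:
    """Build e with e(i, X) = the i-th L-byte block of s(X), for i in 1..n."""

    def e(i: int, x: bytes) -> bytes:
        if not 1 <= i <= n:
            raise ValueError("oracle index out of range")
        out = s(x)
        if len(out) != n * L:
            raise ValueError("s must return exactly n*L bytes")
        # Bytes (i-1)L+1 through iL in 1-based indexing.
        return out[(i - 1) * L : i * L]

    return e


N, L = 4, 32


def s(x: bytes) -> bytes:
    # Assumption: SHAKE256 truncated to N*L bytes stands in for the oracle s.
    return hashlib.shake_256(x).digest(N * L)


e = splitting_functor(s, N, L)
```

The blocks of a single output are disjoint, so concatenating \(e(1,X),\ldots ,e(n,X)\) gives back \(s(X)\).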
An interesting case, present in some NIST PQC submissions, is trivial cloning: just set \(e(i,X)=s(X)\) for all X. We formalize this as the identity (cloning) functor \(\mathbf {F}_{\mathrm {id}}\) defined by \(\mathbf {F}_{\mathrm {id}}[s](i,X) = s(X)\). Clearly, this is not always secure. It can be secure, however, for usages that restrict queries in some way. One such restriction, used in several NIST PQC KEMs, is length differentiation: \(e(i,\cdot )\) is queried only on inputs of some length \(l_i\), where \(l_1,\ldots ,l_n\) are chosen to be distinct. We are able to treat this in our framework using the concept of working domains that we discuss next, but we warn that this method is brittle and prone to misuse.
Working domains. One could capture trivial cloning with length differentiation as a restriction on the domains of the ending functions, but this seems artificial and dangerous because the implementations do not enforce any such restriction; the functions there are defined on their full domains and it is, apparently, left up to applications to use the functions in a way that does not get them into trouble. The approach we take is to leave the functions defined on their full domains, but define and ask for security over a subdomain, which we call the working domain. A choice of working domain \(\mathcal{W}\) accordingly parameterizes our definition of rd-indiff for a functor, and also the definition of invertibility of a translating functor. Our result says that the identity functor is rd-indiff for certain choices of working domains that include the length differentiation one.
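As an illustration of trivial cloning with a length-differentiated working domain, the Python sketch below leaves the identity functor defined on all inputs and expresses the working domain as a separate membership test. SHA3-256 as the oracle, the byte lengths in `LENGTHS`, and all function names are assumptions made for this example.

```python
import hashlib
from typing import Callable, Dict


def identity_functor(s: Callable[[bytes], bytes]) -> Callable[[int, bytes], bytes]:
    """Trivial cloning: e(i, X) = s(X) for every index i."""

    def e(i: int, x: bytes) -> bytes:
        return s(x)

    return e


def lengths_are_distinct(lengths: Dict[int, int]) -> bool:
    """Length differentiation requires l_1, ..., l_n to be pairwise distinct."""
    return len(set(lengths.values())) == len(lengths)


def in_working_domain(i: int, x: bytes, lengths: Dict[int, int]) -> bool:
    """(i, X) is in the working domain iff X has the length l_i assigned to i."""
    return i in lengths and len(x) == lengths[i]


def s(x: bytes) -> bytes:
    # Assumption: SHA3-256 stands in for the single random oracle.
    return hashlib.sha3_256(x).digest()


# Hypothetical input lengths (in bytes) for the four oracles.
LENGTHS = {1: 16, 2: 32, 3: 48, 4: 64}
e = identity_functor(s)
```

Inside the working domain no input is ever shared by two oracles. Outside it the clones coincide, for example `e(1, x) == e(2, x)` for any 32-byte `x`, which is why nothing is claimed there.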
Making the working domain explicit will, hopefully, force the application designer to think about, and specify, what it is, increasing the possibility of staying out of trouble. Working domains also provide flexibility and versatility under which different applications can make different choices of the domain.
Working domains not being present in prior indifferentiability formalizations, the comparisons, above, of rd-indiff with these prior formalizations assume the working domain is the full domain of the ending functions. Working domains alter the comparison picture; a cloning functor which is rd-indiff on a working domain may not be even MRH-indiff on its full domain.
Application to KEMs. The framework above is broad, staying in the land of ROs and not speaking of the usage of these ROs in any particular cryptographic primitive or scheme. As such, it can be applied to analyze RO instantiation in many primitives and schemes. In the full version of this paper [10], we exemplify its application in the realm of KEMs as the target of the NIST PQC designs.
This may seem redundant, since an indifferentiability composition theorem says exactly that once indifferentiability of a functor has been shown, “all” uses of it are secure. However, prior indifferentiability frameworks do not consider working domains, so the known composition theorems apply only when the working domain is the full one. (Thus the reset-indiff composition theorem of [39] extends to rd-indiff so that we have security for applications whose security definitions are underlain by either single- or multi-stage games, but only for full working domains.)
To give a composition theorem that is conscious of working domains, we must first ask what they are, or mean, in the application. We give a definition of the working domain of a KEM \({\mathsf {KE}}\). This is the set of all points that the scheme algorithms query to the ending functions in usage, captured by a certain game we give. (Queries of the adversary may fall outside the working domain.) Then we give a working-domain-conscious composition theorem for KEMs that says the following. Say we are given an IND-CCA KEM \({\mathsf {KE}}\) whose oracles are drawn from a function space \(\mathsf {{\mathsf {KE}}{.}FS}\). Let \(\mathbf {F}{:\;\;}\mathsf {SS} \rightarrow \mathsf {{\mathsf {KE}}{.}FS}\) be a functor, and let \(\overline{\mathsf {KE}}\) be the KEM obtained by implementing the oracles of \({\mathsf {KE}}\) via \(\mathbf {F}\). (So the oracles of this second KEM are drawn from the function space \(\mathsf {\overline{\mathsf {KE}}{.}FS}= \mathsf {SS}\).) Let \(\mathcal{W}\) be the working domain of \({\mathsf {KE}}\), and assume \(\mathbf {F}\) is rd-indiff over \(\mathcal{W}\). Then \(\overline{\mathsf {KE}}\) is also IND-CCA. Combining this with our rd-indiff results on particular cloning functors justifies not only conventional domain separation as an instantiation technique for KEMs, but also more broadly the instantiations in some NIST PQC submissions that do not use domain separation, yet whose cloning functors are rd-indiff over the working domain of their KEMs. The most important example is the identity cloning functor used with length differentiation.
A key definitional element of our treatment that allows the above is, following [9], to embellish the syntax of a scheme (here a KEM \({\mathsf {KE}}\)) by having it name a function space \(\mathsf {{\mathsf {KE}}{.}FS}\) from which it wants its oracles drawn. Thus, the scheme specification must say how many ROs it wants, and of what domains and ranges. In contrast, in the formal version of the ROM in [11], there is a single, scheme-independent RO that has some fixed domain and range, for example mapping \(\{0,1\}^*\) to \(\{0,1\}\). This leaves a gap, between the object a scheme wants and what the model provides, that can lead to error. We suggest that, to reduce such errors, schemes specified in standards include a specification of their function space.
2 Oracle Cloning in NIST PQC Candidates
Notation. A KEM scheme \({\mathsf {KE}}\) specifies an encapsulation algorithm \(\mathsf {{\mathsf {KE}}{.}E}\) that, on input a public encryption key, returns a session key K and a ciphertext \(C^*\) encapsulating it. A PKE scheme \({\mathsf {PKE}}\) specifies an encryption algorithm \(\mathsf {{\mathsf {PKE}}{.}E}\) that, on input a public key, message \(M\in \{0,1\}^{\mathsf {{\mathsf {PKE}}{.}ml}}\) and randomness R, deterministically returns a ciphertext. For neither primitive will we, in this section, be concerned with the key generation or decapsulation/decryption algorithm. We might write \({\mathsf {KE}}[X_1,X_2,\ldots ]\) to indicate that the scheme has oracle access to functions \(X_1,X_2,\ldots \), and correspondingly then write \(\mathsf {{\mathsf {KE}}{.}E}[X_1,X_2,\ldots ]\), and similarly for \({\mathsf {PKE}}\).
2.1 Design Process

(1) First, they specify a \(\mathrm {S}_{\mathrm {pke}}\)-secure public-key encryption scheme \({\mathsf {PKE}}\).

(2) Second, they pick a sound transform \(\mathbf {T}\) and obtain KEM \({\mathsf {KE}}_{4}[ H _1, H _2, H _3, H _4] = \mathbf {T}[{\mathsf {PKE}}, H _2, H _3, H _4]\). (The notation is from [24]. The transforms use up to three random oracles that we are denoting \( H _2, H _3, H _4\), reserving \( H _1\) for possible use by the PKE scheme.) We refer to \({\mathsf {KE}}_{4}\) (the subscript refers to its using 4 oracles) as the base KEM, and, as we will see, it differs across the transforms.
(3) Finally (the under-the-radar step that is our concern), the ROs \( H _1,\ldots , H _4\) are constructed from cryptographic hash functions to yield what we call the final KEM \({\mathsf {KE}}_{1}\). In more detail, the submissions make various choices of cryptographic hash functions \( F _1,\ldots , F _m\) that we call the base functions, and, for \(i=1,2,3,4\), specify constructions \(\mathbf {C}_i\) that, with oracle access to the base functions, define the \( H _i\), which we write as \( H _i \leftarrow \mathbf {C}_i[ F _1,\ldots , F _m]\). We call this process oracle cloning, and we call \(H_i\) the final functions. (Common values of m are 1, 2.) The actual, submitted KEM \({\mathsf {KE}}_{1}\) (the subscript because m is usually 1) uses the final functions: its encapsulation algorithm sets \( H _i \leftarrow \mathbf {C}_i[ F _1,\ldots , F _m]\) for \(i=1,2,3,4\) and then runs \(\mathsf {{\mathsf {KE}}_{4}{.}E}[ H _1, H _2, H _3, H _4]\) on the public key.
2.2 The Base KEM
We need first to specify the base \({\mathsf {KE}}_{4}\) (the result of the sound transform, from step (2) above). The NIST PQC submissions typically cite one of HHK [24], Dent [21], SXY [40] or JZCWM [27] for the sound transform they use, but our examinations show that the submissions have embellished, combined or modified the original transforms. The changes do not (to the best of our knowledge) violate soundness (meaning the used transforms still yield an IND-CCA \({\mathsf {KE}}_{4}\) if \( H _2, H _3, H _4\) are independent ROs and \({\mathsf {PKE}}\) is \(\mathrm {S}_{\mathrm {pke}}\)-secure) but they make a succinct exposition challenging. We address this with a framework to unify the designs via a single, but parameterized, transform, capturing the submission transforms by different parameter choices.
Figure 1 (top) shows the encapsulation algorithm \(\mathsf {{\mathsf {KE}}_{4}{.}E}\) of the KEM that our parameterized transform associates to \({\mathsf {PKE}}\) and \(H_1,H_2,H_3,H_4\). The parameters are the variables X, Y, Z (they will be functions of other quantities in the algorithms), a boolean \(\mathsf {D}\), and an integer \(\mathsf {k}^*\). When choices of these are made, one gets a fully-specified transform and corresponding base KEM \({\mathsf {KE}}_{4}\). Each row in the table in the same Figure shows one such choice of parameters, resulting in 15 fully-specified transforms. The final column shows the submissions that use the transform.
The encapsulation algorithm at the top of Fig. 1 takes as input a public key and has oracle access to functions \(H_1,H_2,H_3,H_4\). At line 1, it picks a random seed M whose length is the message length of the given PKE scheme. Boolean \(\mathsf {D}\) being \(\mathsf {true}\) (as it is with just one exception) means \(\mathsf {{\mathsf {PKE}}{.}E}\) is randomized. In that case, line 2 applies \(H_2\) to X (the latter, determined as per the table, depends on M and possibly also on the public key) and parses the output to get coins R for \(\mathsf {{\mathsf {PKE}}{.}E}\) and possibly (if the parameter \(\mathsf {k}^*\ne 0\)) an additional string \(K'\). At line 3, a ciphertext \(C\) is produced by encrypting the seed M using \(\mathsf {{\mathsf {PKE}}{.}E}\) with the public key and coins R. In some schemes, a second portion of the ciphertext, Y, often called the “confirmation,” is derived from X or M, using \( H _3\), as shown in the table, and line 4 then defines \(C^*\). Finally, \( H _4\) is used as a key derivation function to extract a symmetric key K from the parameter Z, which varies widely among transforms.
In total, 26 of the 39 NIST PQC submissions which target KEMs in either the first or second round use transforms which fall into our framework. The remaining schemes do not use more than one random oracle, construct KEMs without transforming PKE schemes, or target security definitions other than IND-CCA.
2.3 Submissions We Break
We present attacks on the first-round submissions of [8], [7], and [22]. These attacks succeed in full or partial recovery of the encapsulated KEM key from a ciphertext, and are extremely fast. We have implemented the attacks to verify them.
Although none of these schemes progressed to Round 2 of the competition without significant modification, to the best of our knowledge, none of the attacks we describe were pointed out during the review process. Given how simple the attacks are, this is surprising and suggests to us that more attention should be paid to oracle cloning methods and their vulnerabilities during review.
Randomness-based decryption. The PKE schemes used by two of these submissions have the property that given a ciphertext and also given the coins R used to produce it, it is easy to recover M, even without knowledge of the secret key. We formalize this property, saying \({\mathsf {PKE}}\) allows randomness-based decryption, if there is an (efficient) algorithm \(\mathsf {{\mathsf {PKE}}{.}DecR}\) that, for any public key, coins R and message M, recovers M from the public key, R, and the ciphertext obtained by encrypting M under that public key with coins R. This will be used in our attacks.
Attack on a submission using transform \(\mathbf {T}_{11}\). The base KEM \({\mathsf {KE}}_{4}[H_2,H_3,H_4]\) is given by the transform \(\mathbf {T}_{11}\) in the table of Fig. 1. The final KEM \({\mathsf {KE}}_{1}[F]\) uses a single base function F to instantiate the final functions, which it does as follows. It sets \(H_4=F\). The specification and reference implementation differ in how \(H_2,H_3\) are defined: In the former, \( H _2(x) = F(F(x)){\,\Vert \,}F(x)\) and \( H _3(x) = F(F(F(x)))\), while, in the latter, \( H _2(x) = F(F(F(x))) {\,\Vert \,}F(x)\) and \( H _3(x) = F(F(x))\). These differences arise from differences in the way the output of a certain function W[F] is parsed.
This attack exploits the difference between the way \(H_2,H_3\) are defined across the specification and implementation, which may be a bug in the implementation with regard to the parsing of W[F](x). However, the attack also exploits dependencies between \( H _2\) and \( H _3\), which ought not to exist when instantiating what are required to be distinct random oracles.
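The dependency between \(H_2\) and \(H_3\) follows directly from the stated definitions, and the Python sketch below checks it numerically. It is not the attack itself, only the algebraic relation the attack relies on: under the reference-implementation definitions, applying F to \(H_3(x)\) yields the first block of \(H_2(x)\), and under the specification definitions, applying F to the first block of \(H_2(x)\) yields \(H_3(x)\). SHA3-256 as the base function F and all function names are assumptions for illustration.

```python
import hashlib


def F(x: bytes) -> bytes:
    # Assumption: SHA3-256 stands in for the submission's base function F.
    return hashlib.sha3_256(x).digest()


BLOCK = hashlib.sha3_256().digest_size  # output length of F in bytes


# Definitions as in the reference implementation.
def H2_impl(x: bytes) -> bytes:
    return F(F(F(x))) + F(x)


def H3_impl(x: bytes) -> bytes:
    return F(F(x))


# Definitions as in the specification.
def H2_spec(x: bytes) -> bytes:
    return F(F(x)) + F(x)


def H3_spec(x: bytes) -> bytes:
    return F(F(F(x)))


def h2_first_block_from_h3_impl(h3_value: bytes) -> bytes:
    """Implementation variant: F(H3(x)) equals the first block of H2(x)."""
    return F(h3_value)


def h3_from_h2_spec(h2_value: bytes) -> bytes:
    """Specification variant: F(first block of H2(x)) equals H3(x)."""
    return F(h2_value[:BLOCK])
```

Neither relation would hold, except with negligible probability, if \(H_2\) and \(H_3\) were independent random oracles.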
This first-round scheme was incorporated into a second-round submission, which specifies a different base function and cloning functor (the latter of which uses the secure method we call “output splitting”) to instantiate oracles \(H_2\) and \(H_3\). This attack therefore does not apply to the second-round submission.
Attack on DAGS. If x is a byte string we let x[i] be its ith byte, and if x is a bit string we let \(x_i\) be its ith bit. We say that a function V is an extendable output function if it takes input a string x and an integer \(\ell \) to return an \(\ell \)-byte output, and \(\ell _1 \le \ell _2\) implies that \(V(x,\ell _1)\) is a prefix of \(V(x,\ell _2)\). If \(v = v_1v_2v_3v_4v_5v_6v_7v_8\) is a byte then let \(Z(v) = 00v_3v_4v_5v_6v_7v_8\) be obtained by zeroing out the first two bits. If y is a string of \(\ell \) bytes then let \(Z'(y) = Z(y[1])\Vert \cdots \Vert Z(y[\ell ])\). Now let \(V'(x,\ell ) = Z'(V(x,\ell ))\).
The base KEM \({\mathsf {KE}}_{4}[H_1,H_2,H_3,H_4]\) is given by the transform \(\mathbf {T}_{8}\) in the table of Fig. 1. The final KEM \({\mathsf {KE}}_{1}[V]\) uses an extendable output function V to instantiate the random oracles, which it does as follows. It sets \( H _2(x) = V'(x,512)\) and \( H _3(x) = V'(x,32)\). It sets \( H _4(x) = V(x,64)\).
As per \(\mathbf {T}_8\) we have \(K = H_4(M)\) and \(Y = H_3(M)\). Let L be the first 32 bytes of the 64-byte K. Then \(Y = Z'(L)\). So Y reveals \(32\cdot 6 = 192\) bits of K. Since Y is in the ciphertext, this results in a partial encapsulated-key recovery attack. The attack reduces the effective length of K from \(64\cdot 8 = 512\) bits to \(512-192 = 320\) bits, meaning \(37.5\%\) of the encapsulated key is recovered. Also \(R = H_2(M)\), so Y, as part of the ciphertext, reveals 32 bytes of R, which does not seem desirable, even though it is not clear how to exploit it for an attack.
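The relation \(Y = Z'(L)\) can be checked with a short Python sketch. It assumes SHAKE256 as the extendable output function V and takes the "first two bits" of a byte to be its two most significant bits; both are assumptions for illustration, as is the sample seed.

```python
import hashlib


def V(x: bytes, ell: int) -> bytes:
    # Assumption: SHAKE256 stands in for V; V(x, l1) is a prefix of V(x, l2) for l1 <= l2.
    return hashlib.shake_256(x).digest(ell)


def Z(v: int) -> int:
    """Zero out the first two bits of a byte (assumed to be the most significant ones)."""
    return v & 0x3F


def Z_prime(y: bytes) -> bytes:
    """Apply Z to every byte of y."""
    return bytes(Z(b) for b in y)


def V_prime(x: bytes, ell: int) -> bytes:
    return Z_prime(V(x, ell))


def H2(x: bytes) -> bytes:
    return V_prime(x, 512)


def H3(x: bytes) -> bytes:
    return V_prime(x, 32)


def H4(x: bytes) -> bytes:
    return V(x, 64)


M = b"hypothetical seed M"
K = H4(M)  # encapsulated key
Y = H3(M)  # confirmation value sent in the ciphertext
R = H2(M)  # encryption coins

# Each byte of Y exposes 6 bits of the corresponding byte of K.
LEAKED_BITS = 6 * len(Y)
```

Here `Y == Z_prime(K[:32])` and `Y == R[:32]` both hold for every seed, matching the 192 leaked key bits and the 32 revealed bytes of R described above.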
2.4 Submissions with Unclear Security
For the scheme of [2], we can give neither an attack nor a proof of security. However, we can show that the final functions \(H_2, H_3, H_4\) produced by its cloning functor with oracle access to a single extendable-output function V are differentiable from independent random oracles. The cloning functor sets \(H_1(x)=V(x,128)\) and \(H_4(x) = V(x,32)\). It computes \(H_2\) and \(H_3\) from V using the output splitting cloning functor. Concretely, \({\mathsf {KE}}_{1}\) parses V(x, 96) as \(H_2(x){\,\Vert \,}H_3(x)\), where \(H_2\) has output length 64 bytes and \(H_3\) has output length 32 bytes. Because V is an extendable-output function, \(H_4(x)\) will be a prefix of \(H_2(x)\) for any string x.
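The prefix correlation is easy to observe in code. The sketch below assumes SHAKE256 as the extendable-output function V; the function names mirror the text but the instantiation is only illustrative.

```python
import hashlib


def V(x: bytes, ell: int) -> bytes:
    # Assumption: SHAKE256 stands in for the extendable-output function V.
    return hashlib.shake_256(x).digest(ell)


def H2(x: bytes) -> bytes:
    """First 64 bytes of V(x, 96), as in output splitting."""
    return V(x, 96)[:64]


def H3(x: bytes) -> bytes:
    """Last 32 bytes of V(x, 96)."""
    return V(x, 96)[64:]


def H4(x: bytes) -> bytes:
    return V(x, 32)


def distinguishes_from_independent(h2_out: bytes, h4_out: bytes) -> bool:
    """A trivial distinguisher: independent ROs would pass this check only by chance."""
    return h2_out.startswith(h4_out)
```

For every input `x`, `distinguishes_from_independent(H2(x), H4(x))` returns `True`, whereas for independent random oracles with these output lengths it would do so with probability \(2^{-256}\).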
We do not know how to exploit this correlation to attack the IND-CCA security of the final KEM scheme \({\mathsf {KE}}_2[V]\), and we conjecture that, due to the structure of \(\mathbf {T}_{10}\), no efficient attack exists. We can, however, attack the rd-indiff security of the cloning functor, showing that the security proof for the base KEM \({\mathsf {KE}}_1[H_2,H_3,H_4]\) does not naturally transfer to \({\mathsf {KE}}_2[V]\). Therefore, in order to generically extend the provable security results for \({\mathsf {KE}}_1\) to \({\mathsf {KE}}_2\), it seems advisable to instead apply appropriate oracle cloning methods.
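To illustrate the correlation, the sketch below checks the prefix relation between \(H_4\) and \(H_2\) and contrasts it with independently sampled oracles. It is an informal illustration and not the formal rd-indiff adversary: SHAKE256 as V, the class `LazyRO`, and the probe string are our assumptions.

```python
import hashlib
import os

def V(x: bytes, n: int) -> bytes:
    """Stand-in extendable output function (assumption: SHAKE256)."""
    return hashlib.shake_256(x).digest(n)

def H4(x: bytes) -> bytes:
    return V(x, 32)

def H2(x: bytes) -> bytes:
    return V(x, 96)[:64]      # first 64 bytes of the 96-byte output

def H3(x: bytes) -> bytes:
    return V(x, 96)[64:]      # last 32 bytes of the 96-byte output

class LazyRO:
    """Lazily sampled random function with a fixed output length in bytes."""
    def __init__(self, out_len: int):
        self.out_len = out_len
        self.table = {}
    def __call__(self, x: bytes) -> bytes:
        if x not in self.table:
            self.table[x] = os.urandom(self.out_len)
        return self.table[x]

def distinguisher(h2, h4, x: bytes = b"probe") -> bool:
    """Return True if h4(x) is a prefix of h2(x)."""
    return h2(x).startswith(h4(x))

real_world = distinguisher(H2, H4)                    # cloned oracles
ideal_world = distinguisher(LazyRO(64), LazyRO(32))   # independent oracles
```

With the cloned oracles the check always succeeds, while for independent 32-byte and 64-byte oracles it succeeds only with probability \(2^{-256}\).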
2.5 Submissions with Provable Security but Ambiguous Specification
In their reference implementations, these submissions use cloning functors which we can and do validate via our framework, providing provable security in the random oracle model for the final KEM schemes. However, the submission documents do not clearly specify a secure cloning functor, meaning that variant implementations or adaptations may unknowingly introduce weaknesses. The schemes BIKE [3], LOCKER [4] and Round5 [6], together with the schemes of [16], [19], [28], [30], [38], [43] and [44], fall into this group.
Length differentiation. Many of these schemes use the “identity” functor in their reference implementations, meaning that they set the final functions \(H_1 = H_2 = H_3 = H_4 = F\) for a single base function F. If the scheme \({\mathsf {KE}}_1[H_1,H_2,H_3,H_4]\) never queries two different oracles on inputs of a single length, the domains of \(H_1,\ldots ,H_4\) are implicitly separated. Reference implementations typically enforce this separation by fixing the input length of every call to F. Our formalism calls this query restriction “length differentiation” and proves its security as an oracle cloning method. We also generalize it to all methods which prevent the scheme from querying any two distinct random oracles on a single input.
In the following, we discuss two schemes from the group where ambiguity about cloning methods between the specification and reference implementation jeopardizes the security of applications using these schemes. It will be important that, like some of the schemes considered earlier, the PKE schemes defined by these two submissions allow randomness-based decryption.
The scheme of [16] also follows transform \(\mathbf {T}_9\) to produce its base KEM \({\mathsf {KE}}_1[H_2,H_3,H_4]\). Its submission document suggests instantiation with a single function F as follows: it sets \(H_3 = H_4 = F\), and it sets \(H_2 = W \circ F\) for some postprocessing function W whose details are irrelevant here. Since, in \(\mathbf {T}_9\), \(Y = H _3(M) = F(M)\) and \(R = H _2(M) = W\circ F (M) = W(Y)\), the randomness R will again be leaked through Y in the ciphertext, permitting a key-recovery attack using randomness-based decryption much like the others we have described. This attack is prevented in the reference implementation of the scheme, which instantiates \(H_3\) and \(H_4\) using an independent function G. The domains of \(H_3\) and \(H_4\) are separated by length differentiation. This allows us to prove the security of the final KEM \({\mathsf {KE}}_2[G,F]\), as defined by the reference implementation.
The reference implementation of the public-key encryption schemes prevents the attack by cloning \( H _3\) and \( H _4\) from G via a third cloning functor, this one using the output-splitting method. Yet, the inconsistency in the choice of cloning functors between the specification and both implementations underlines that ad-hoc cloning functors may easily “get lost” in modifications or adaptations of a scheme.
2.6 Submissions with Clear Provable Security
Here we place schemes which explicitly discuss their methods for domain separation and follow good practice in their implementations: NTS-KEM [1], CRYSTALS-Kyber [5], and the schemes of [13], [14], [23], [25], [26], [31], [32], [34], [41] and [42]. These schemes are careful to account for dependencies between random oracles that are considered to be independent in their security models. When choosing to clone multiple random oracles from a single primitive, the schemes in this group use padding bytes, deploy hash functions designed to accommodate domain separation, or impose restrictions on the length of the inputs which are codified in the specification. These explicit domain separation techniques can be cast in the formalism we develop in this work.
Two submissions are unique among the PQC KEM schemes in that their specifications warn that the identity functor admits key-recovery attacks. As protection, they recommend that \( H _2\) and \( H _3\) be instantiated with unrelated primitives.
Signatures. Although the main focus of this paper is on domain separation in KEMs, we wish to note that these issues are not unique to KEMs. At least one digital signature scheme in the second round of the NIST PQC competition [15] models multiple hash functions as independent random oracles in its security proof, then clones them from the same primitive without explicit domain separation. We have not analyzed the NIST PQC digital signature schemes’ security to see whether more subtle domain separation is present, or whether oracle collisions admit the same vulnerabilities to signature forgery as they do to session-key recovery. This does, however, highlight that the problem of random oracle cloning is pervasive among more types of cryptographic schemes.
3 Preliminaries
Basic notation. By [i..j] we abbreviate the set \(\{i,\ldots ,j\}\), for integers \(i \le j\). If \(\mathbf {x}\) is a vector then \(|\mathbf {x}|\) is its length (the number of its coordinates), \(\mathbf {x}[i]\) is its ith coordinate and \([\mathbf {x}]=\{\mathbf {x}[i] \,:\, i\in [1..|\mathbf {x}|]\}\) is the set of its coordinates. The empty vector is denoted (). If S is a set, then \(S^*\) is the set of vectors over S, meaning the set of vectors of any (finite) length with coordinates in S. Strings are identified with vectors over \(\{0,1\}\), so that if \(x \in \{0,1\}^*\) is a string then \(|x|\) is its length, x[i] is its ith bit, and x[i..j] is the substring from its ith to its jth bit (inclusive), for \(i \le j\). The empty string is \(\varepsilon \). If x, y are strings then we write \(x \preceq y\) to indicate that x is a prefix of y. If S is a finite set then \(|S|\) is its size (cardinality). A set \(S\subseteq \{0,1\}^*\) is length closed if \(\{0,1\}^{|x|}\subseteq S\) for all \(x\in S\).
We let \(y \leftarrow A[\mathsf {O}_1, \ldots ](x_1,\ldots ; r)\) denote executing algorithm A on inputs \(x_1,\ldots \) and coins r, with access to oracles \(\mathsf {O}_1, \ldots \), and letting y be the result. We let \(y {\leftarrow }{\$}\, A[\mathsf {O}_1, \ldots ](x_1,\ldots )\) be the result of picking r at random and letting \(y \leftarrow A[\mathsf {O}_1, \ldots ](x_1,\ldots ;r)\). We let \(\mathrm {OUT}(A[\mathsf {O}_1, \ldots ](x_1,\ldots ))\) denote the set of all possible outputs of algorithm A when invoked with inputs \(x_1,\ldots \) and access to oracles \(\mathsf {O}_1, \ldots \). Algorithms are randomized unless otherwise indicated. Running time is worst case. An adversary is an algorithm.
We use the code-based game-playing framework of [12]. A game \(\text {G}\) (see Fig. 2 for an example) starts with an \(\textsc {init}\) procedure, followed by a nonnegative number of additional procedures, and ends with a \(\textsc {fin}\) procedure. Procedures are also called oracles. Execution of adversary \(\mathcal {A}\) with game \(\text {G}\) consists of running \(\mathcal {A}\) with oracle access to the game procedures, with the restrictions that \(\mathcal {A}\)’s first call must be to \(\textsc {init}\), its last call must be to \(\textsc {fin}\), and it can call these two procedures at most once. The output of the execution is the output of \(\textsc {fin}\). We write \(\Pr [\text {G}(\mathcal {A})]\) to denote the probability that the execution of game \(\text {G}\) with adversary \(\mathcal {A}\) results in the output being the boolean \(\mathsf {true}\). Note that our adversaries have no output. The role of what in other treatments is the adversary output is, for us, played by the query to \(\textsc {fin}\). We adopt the convention that the running time of an adversary is the worst-case time to execute the game with the adversary, so the time taken by game procedures (oracles) to respond to queries is included.
Functions. As usual \(g{:\;\;}\mathcal {D}\rightarrow \mathcal {R}\) indicates that g is a function taking inputs in the domain set \(\mathcal {D}\) and returning outputs in the range set \(\mathcal {R}\). We may denote these sets by \(\mathrm {Dom}({g})\) and \(\mathrm {Rng}({g})\), respectively.
We say that \(g{:\;\;}\mathrm {Dom}({g}) \rightarrow \mathrm {Rng}({g})\) has output length \(\ell \) if \(\mathrm {Rng}({g})=\{0,1\}^{\ell }\). We say that g is a single output-length (sol) function if there is some \(\ell \) such that g has output length \(\ell \) and also the set \(\mathrm {Dom}({g})\) is length closed. We let \(\mathrm {SOL}(\mathcal {D},\ell )\) denote the set of all sol functions \(g{:\;\;}\mathcal {D}\rightarrow \{0,1\}^{\ell }\).
We say g is an extendable output length (xol) function if the following are true: (1) \(\mathrm {Rng}({g})=\{0,1\}^*\), (2) there is a length-closed set \(\mathrm {Dom}_{*}({g})\) such that \(\mathrm {Dom}({g}) = \mathrm {Dom}_{*}({g}) \times {{\mathbb N}}\), (3) \(|g(x,\ell )|=\ell \) for all \((x,\ell )\in \mathrm {Dom}({g})\), and (4) \(g(x,\ell )\preceq g(x,\ell ')\) whenever \(\ell \le \ell '\). We let \(\mathrm {XOL}(\mathcal {D})\) denote the set of all xol functions \(g{:\;\;}\mathcal {D}\rightarrow \{0,1\}^{*}\).
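As an informal illustration of the xol conditions, the sketch below lazily samples a function whose outputs have the requested length and are prefix-consistent. It is only meant to mimic a random xol function: the class name `LazyXOL` is ours, and lengths are counted in bytes rather than bits for convenience.

```python
import os

class LazyXOL:
    """Lazily sampled g(x, l) with |g(x, l)| = l and g(x, l) a prefix of g(x, l')."""

    def __init__(self):
        self._stream = {}   # x -> bytes sampled so far for input x

    def __call__(self, x: bytes, l: int) -> bytes:
        cur = self._stream.get(x, b"")
        if len(cur) < l:
            # Extend the stored string with fresh random bytes; never resample.
            cur += os.urandom(l - len(cur))
            self._stream[x] = cur
        return cur[:l]

g = LazyXOL()
short = g(b"x", 4)
long_ = g(b"x", 10)
```

Because the output for \((x,\ell )\) is always a prefix of the output for \((x,\ell ')\) when \(\ell \le \ell '\), the values at these two inputs are correlated rather than independent.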
4 Read-Only Indifferentiability of Translating Functors
We define read-only indifferentiability (rd-indiff) of functors. Then we define a class of functors called translating, and give general results about their rd-indiff security. Later we will apply this to analyze the security of cloning functors, but the treatment in this section is broader and, looking ahead to possible future applications, more general than we need for ours.
4.1 Functors and Read-Only Indifferentiability
A random oracle, formally, is a function drawn at random from a certain space of functions. A construction (functor) is a mapping from one such space to another. We start with definitions for these.
Function spaces and functors. A function space \(\mathsf {FS}\) is simply a set of functions, with the requirement that all functions in the set have the same domain \(\mathsf {Dom}(\mathsf {FS})\) and the same range \(\mathsf {Rng}(\mathsf {FS})\). Examples are \(\mathrm {SOL}(\mathcal {D},\ell )\) and \(\mathrm {XOL}(\mathcal {D})\). Now \(f {\leftarrow }{\$}\, \mathsf {FS}\) means we pick a function f uniformly at random from the set \(\mathsf {FS}\).
Sometimes (but not always) we want an extra condition called input independence. It asks that the values of f on different inputs are identically and independently distributed when \(f {\leftarrow }{\$}\, \mathsf {FS}\). More formally, let \(\mathcal {D}\) be a set and let \(\mathrm {Out}\) be a function that associates to any \(W \in \mathcal {D}\) a set \(\mathrm {Out}(W)\). Let \(\mathrm {Out}(\mathcal {D})\) be the union of the sets \(\mathrm {Out}(W)\) as W ranges over \(\mathcal {D}\). Let \(\mathrm {FUNC}(\mathcal {D},\mathrm {Out})\) be the set of all functions \(f{:\;\;}\mathcal {D}\rightarrow \mathrm {Out}(\mathcal {D})\) such that \({f(W)\in \mathrm {Out}(W)}\) for all \({W\in \mathcal {D}}\). We say that \(\mathsf {FS}\) provides input independence if there exists such an \(\mathrm {Out}\) such that \(\mathsf {FS}= \mathrm {FUNC}(\mathsf {Dom}(\mathsf {FS}),\mathrm {Out})\). Put another way, there is a bijection between \(\mathsf {FS}\) and the set S that is the cross product of the sets \(\mathrm {Out}(W)\) as W ranges over \(\mathsf {Dom}(\mathsf {FS})\). (Members of S are \(|\mathsf {Dom}(\mathsf {FS})|\)-vectors.) As an example the function space \(\mathrm {SOL}(\mathcal {D},\ell )\) satisfies input independence, but \(\mathrm {XOL}(\mathcal {D})\) does not satisfy input independence.
Let \(\mathsf {SS}\) be a function space that we call the starting space. Let \(\mathsf {ES}\) be another function space that we call the ending space. We imagine that we are given a function \(s\in \mathsf {SS}\) and want to construct a function \(e\in \mathsf {ES}\). We refer to the object doing this as a functor. Formally a functor is a deterministic algorithm \(\mathbf {F}\) that, given as oracle a function \( s \in \mathsf {SS}\), returns a function \(\mathbf {F}[ s ]\in \mathsf {ES}\). We write \(\mathbf {F}{:\;\;}\mathsf {SS}\rightarrow \mathsf {ES}\) to emphasize the starting and ending spaces of functor \(\mathbf {F}\).
Rd-indiff. We want the ending function to “emulate” a random function from \(\mathsf {ES}\). Indifferentiability is a way of defining what this means. The original definition of MRH [29] has been followed by many variants [17, 20, 33, 39]. Here we give ours, called read-only indifferentiability, which implies composition not just for single-stage games, but even for multi-stage ones [20, 33, 39].
The working domain \(\mathcal{W}\subseteq \mathrm {Dom}(\mathsf {ES})\), a parameter of the definition, is included as a way to allow the notion of read-only indifferentiability to provide results for oracle cloning methods like length differentiation whose security depends on domain restrictions.
The \(\mathsf {S}{.}{\mathrm {Ev}}\) algorithm is given direct access to \( e _0\), rather than access to \(\textsc {priv}\) as in other definitions, to bypass the working domain restriction, meaning it may query \( e _0\) at points in \(\mathrm {Dom}(\mathsf {ES})\) that are outside the working domain.
All invocations of \(\mathsf {S}{.}{\mathrm {Ev}}[e_0]\) are given the same (static, game-maintained) state \(st\) as input, but \(\mathsf {S}{.}{\mathrm {Ev}}[e_0]\) cannot modify this state, which is why it is called read-only. Note \(\textsc {init}\) does not return \(st\), meaning the state is not given to the distinguisher.
Discussion. To compare rd-indiff to other indiff notions, we set \(\mathcal{W}= \mathrm {Dom}(\mathsf {ES})\), because prior notions do not include working domains. Now, rd-indiff differs from prior indiff notions because it requires that the simulator state be just the immutable string chosen at the start of the game. In this regard, rd-indiff falls somewhere between the original MRH-indiff [29] and reset indiff [39] in the sense that our simulator is more restricted than in the first and less than in the second. A construction (functor) that is reset-indiff is thus rd-indiff, but not necessarily vice versa, and a construction that is rd-indiff is MRH-indiff, but not necessarily vice versa. Put another way, the class of rd-indiff functors is larger than the class of reset-indiff ones, but smaller than the class of MRH-indiff ones. Now, RSS’s proof [39] that reset-indiff implies security for multi-stage games extends to rd-indiff, so we get this for a potentially larger class of functors. This larger class includes some of the cloning functors we have described, which are not necessarily reset-indiff.
4.2 Translating Functors
Translating functors. We focus on a class of functors that we call translating. This class includes natural and existing oracle cloning methods, in particular all the effective methods used by NIST KEMs, and we will be able to prove general results for translating functors that can be applied to the cloning methods.
A translating functor \(\mathbf {T} {:\;\;}\mathsf {SS}\rightarrow \mathsf {ES}\) is a functor that, with oracle access to \(s\) and on input \({W \in \mathrm {Dom}(\mathsf {ES})}\), nonadaptively calls \(s\) on a fixed number of inputs, and computes its output \(\mathbf {T}[s](W)\) from the responses and W. Its operation can be split into three phases which do not share state: (1) a preprocessing phase which chooses the inputs to \( s \) based on W alone; (2) the calls to \( s \) to obtain responses; (3) a postprocessing phase which uses W and the responses collected in phase 2 to compute the final output value \(\mathbf {T}[s](W)\).
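The three phases can be sketched as a higher-order function. This is a minimal illustration under our own naming: `qt` plays the role of the preprocessing phase (query translation), `at` the postprocessing phase (answer translation), and SHA-256 is an arbitrary stand-in for the starting function \(s\).

```python
import hashlib
from typing import Any, Callable, Sequence

QueryTranslator = Callable[[Any], Sequence[bytes]]
AnswerTranslator = Callable[[Any, Sequence[bytes]], bytes]

def translating_functor(qt: QueryTranslator, at: AnswerTranslator):
    """Build the map s -> e, where e(w) = at(w, [s(u) for u in qt(w)])."""
    def functor(s: Callable[[bytes], bytes]):
        def e(w):
            queries = qt(w)                     # phase 1: depends on w alone
            answers = [s(u) for u in queries]   # phase 2: nonadaptive calls to s
            return at(w, answers)               # phase 3: uses w and the answers
        return e
    return functor

def s(u: bytes) -> bytes:
    """Stand-in starting function (assumption: SHA-256)."""
    return hashlib.sha256(u).digest()

# Example: a functor that ignores the index i in w = (i, X) and returns s(X).
identity = translating_functor(lambda w: [w[1]], lambda w, v: v[0])
e = identity(s)
```

Here `e((1, X))` and `e((2, X))` both equal `s(X)`, which is the behavior of the identity functor discussed later.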
Inverses. So far, query and answer translators may have just seemed an unduly complex way to say that a translating oracle construction is one that makes nonadaptive oracle queries. The purpose of making the query and answer translators explicit is to define invertibility, which determines rdindiff security.
We say that \((\mathsf {QT},\mathsf {AT})\) is invertible over \(\mathcal{W}\) if there exist \(\mathsf {QTI},\mathsf {ATI}\) such that \(\mathsf {QTI},\mathsf {ATI}\) are inverses of \(\mathsf {QT},\mathsf {AT}\) over \(\mathcal{W}\), and we say that a translating functor \(\mathbf {TF}_{\mathsf {QT},\mathsf {AT}}\) is invertible over \(\mathcal{W}\) if \((\mathsf {QT},\mathsf {AT})\) is invertible over \(\mathcal{W}\).
Additionally we of course ask that the functions \(\mathsf {QT},\mathsf {AT},\mathsf {QTI},\mathsf {ATI}\) all be efficiently computable. In an asymptotic setting, this means they are polynomial time. In our concrete setting, they show up in the running time of the simulator or constructed adversaries. (The latter, as per our conventions, being the time for the execution of the adversary with the overlying game.)
4.3 Rd-Indiff of Translating Functors
We now move on to showing that invertibility of a pair \((\mathsf {QT},\mathsf {AT})\) implies rd-indifferentiability of the translating functor \(\mathbf {TF}_{\mathsf {QT},\mathsf {AT}}\). We start with the case that \(\mathsf {QTI}\) has full support.
Theorem 1
Proof
The simulator in Theorem 1 is stateless, so when \(\mathcal{W}\) is chosen to be \(\mathrm {Dom}(\mathsf {ES})\) the theorem is establishing reset indifferentiability [39] of \(\mathbf {F}\).
For translating functors where \(\mathsf {QTI}\) does not have full support, we need an auxiliary primitive that we call a \((\mathsf {SS},\mathsf {ES})\)-oracle-aided PRF. Given an oracle for a function \(e\in \mathsf {ES}\), an \((\mathsf {SS},\mathsf {ES})\)-oracle-aided PRF \(\mathsf {G}\) defines a function \(\mathsf {G}[e]{:\;\;}\{0,1\}^{\mathsf {G}.\mathsf {kl}}\times \mathrm {Dom}(\mathsf {SS})\rightarrow \mathrm {Rng}(\mathsf {SS})\). The first input is a key. For \(\mathcal {C}\) an adversary, let \(\mathbf {Adv}^{\mathrm {prf}}_{\mathsf {G},\mathsf {SS},\mathsf {ES}}(\mathcal {C}) = 2\Pr [\mathbf {G}^{\mathrm {prf}}_{\mathsf {G},\mathsf {SS},\mathsf {ES}}(\mathcal {C})]-1\), where the game is in Fig. 6. The simulator uses its read-only state to store a key \(st\) for \(\mathsf {G}\), and then uses \(\mathsf {G}(st,\cdot )\) to answer queries outside the support \(\mathbf {sup}(\mathsf {QTI})\).
We introduce this primitive because it allows multiple instantiations. The simplest is that it is a PRF, which happens when it does not use its oracle. In that case the simulator is using a computational primitive (a PRF) in the indifferentiability context, which seems novel. Another instantiation prefixes \(st\) to the input and then invokes e to return the output. This works for certain choices of \(\mathsf {ES}\), but not always. Note \(\mathsf {G}\) is used only by the simulator and plays no role in the functor.
The proof of the following is in [10].
Theorem 2
5 Analysis of Cloning Functors
Section 4 defined the rd-indiff metric of security for functors and gave a framework to prove rd-indiff of translating functors. We now apply this to derive security results about particular, practical cloning functors.
Arity-n function spaces. The cloning functors apply to function spaces where a function specifies subfunctions, corresponding to the different random oracles we are trying to build. Formally, a function space \(\mathsf {FS}\) is said to have arity n if its members are two-argument functions f whose first argument is an integer \(i \in [1..n]\). For \(i\in [1..n]\) we let \(f_i = f(i,\cdot )\) and \(\mathsf {FS}_i = \{f_i \,:\, f\in \mathsf {FS}\}\), and refer to the latter as the ith subspace of \(\mathsf {FS}\). We let \(\mathsf {Dom}_i(\mathsf {FS})\) be the set of all X such that \((i,X) \in \mathrm {Dom}(\mathsf {FS})\).
We say that \(\mathsf {FS}\) has sol subspaces if \(\mathsf {FS}_i\) is a set of sol functions with domain \(\mathsf {Dom}_i(\mathsf {FS})\), for all \(i \in [1..n]\). More precisely, there must be integers \(\mathsf {OL}_1(\mathsf {FS}), \ldots , \mathsf {OL}_n(\mathsf {FS})\) such that \(\mathsf {FS}_i = \mathrm {SOL}(\mathsf {Dom}_i(\mathsf {FS}),\mathsf {OL}_i(\mathsf {FS}))\) for all \(i\in [1..n]\). In this case, we let \(\mathsf {Rng}_i(\mathsf {FS}) = \{0,1\}^{\mathsf {OL}_i(\mathsf {FS})}\). This is the most common case for practical uses of ROs.
To explain, access to n random oracles is modeled as access to a two-argument function f drawn at random from \(\mathsf {FS}\), written \(f {\leftarrow }{\$}\, \mathsf {FS}\). If \(\mathsf {FS}\) has sol subspaces, then for each i, the function \(f_i\) is a sol function, with a certain domain and output length depending only on i. All such functions are included. This ensures input independence as we defined it earlier. Thus if \(f {\leftarrow }{\$}\, \mathsf {FS}\), then for each i and any distinct inputs to \(f_i\), the outputs are independently distributed. Also functions \(f_1,\ldots ,f_n\) are independently distributed when \(f {\leftarrow }{\$}\, \mathsf {FS}\). Put another way, we can identify \(\mathsf {FS}\) with \(\mathsf {FS}_1\times \cdots \times \mathsf {FS}_n\).
Domain-separating functors. We can now formalize the domain separation method by seeing it as defining a certain type of (translating) functor.
Let the ending space \(\mathsf {ES}\) be an arity n function space. Let \(\mathbf {F} {:\;\;}\mathsf {SS}\rightarrow \mathsf {ES}\) be a translating functor and \(\mathsf {QT},\mathsf {AT}\) be its query and answer translations, respectively. Assume \(\mathsf {QT}\) returns a vector of length 1 and that \(\mathsf {AT}((i,X),\varvec{V})\) simply returns \(\varvec{V}[1]\). We say that \(\mathbf {F}\) is domain separating if the following is true: \(\mathsf {QT}(i_1,X_1)\ne \mathsf {QT}(i_2,X_2)\) for any \((i_1,X_1),(i_2,X_2) \in \mathrm {Dom}(\mathsf {ES})\) that satisfy \(i_1\ne i_2\).
To explain, recall that the ending function is obtained as \(e\leftarrow \mathbf {F}[s]\), and defines \(e_i\) for \(i\in [1..n]\). Function \(e_i\) takes input X, lets \((u)\leftarrow \mathsf {QT}(i,X)\) and returns \(s(u)\). The domain separation requirement is that if \((u_i)\leftarrow \mathsf {QT}(i,X_i)\) and \((u_j)\leftarrow \mathsf {QT}(j,X_j)\), then \(i\ne j\) implies \(u_i\ne u_j\), regardless of \(X_i,X_j\). Thus if \(i\ne j\) then the inputs to which \(s\) is applied are always different. The domain of \(s\) has been “separated” into disjoint subsets, one for each i.
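The domain separation requirement can be checked by brute force on a small sample of inputs. The sketch below is illustrative only: a finite check of this kind can refute domain separation but cannot prove it for an infinite domain, and the two query translators, the bit-string encoding as Python `str`, and the choice \(n = 3\) are our assumptions.

```python
from itertools import product

def is_domain_separating(qt, n, inputs):
    """Check that qt(i, x) never collides across different indices i on `inputs`."""
    owner = {}                       # query u -> index i that first produced it
    for i in range(1, n + 1):
        for x in inputs:
            (u,) = qt(i, x)          # QT returns a vector of length 1
            if owner.setdefault(u, i) != i:
                return False         # the same query u arises from two indices
    return True

# All bit strings of length 0 to 4, written as Python strings over "01".
inputs = ["".join(bits) for k in range(5) for bits in product("01", repeat=k)]

# Fixed-length 2-bit prefix encoding of the index (works for n <= 3 here).
qt_prefix = lambda i, x: (format(i, "02b") + x,)
# Identity query translation: the index is ignored.
qt_identity = lambda i, x: (x,)

prefix_ok = is_domain_separating(qt_prefix, 3, inputs)
identity_ok = is_domain_separating(qt_identity, 3, inputs)
```

The prefix-based translator passes the check, while the identity translator fails it on the very first input shared by two indices.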
Practical cloning functors. We show that many popular methods for oracle cloning in practice, including ones used in NIST KEM submissions, can be cast as translating functors.
In the following, the starting space \(\mathsf {SS} = \mathrm {SOL}(\{0,1\}^*,\mathsf {OL}(\mathsf {SS})) \) is assumed to be a sol function space with domain \(\{0,1\}^*\) and an output length denoted \(\mathsf {OL}(\mathsf {SS})\). The ending space \(\mathsf {ES}\) is an arity n function space that has sol subspaces.
Prefixing. Here we formalize the canonical method of domain separation. Prefixing is used in several NIST PQC submissions.
Let \(\mathbf {p}\) be a vector of strings. We require that it be prefix-free, by which we mean that \(i\ne j\) implies that \(\mathbf {p}[i]\) is not a prefix of \(\mathbf {p}[j]\). Entries of this vector will be used as prefixes to enforce domain separation. One example is that the entries of \(\mathbf {p}\) are distinct strings all of the same length. Another is that \(\mathbf {p}[i]=\mathrm {E}(i)\) for some prefix-free code \(\mathrm {E}\) like a Huffman code.
Assume \(\mathsf {OL}_i(\mathsf {ES})=\mathsf {OL}(\mathsf {SS})\) for all \(i\in [1..n]\), meaning all ending functions have the same output length as the starting function. The functor \(\mathbf {F}_{\mathrm {pf}(\mathbf {p})} {:\;\;}\mathsf {SS} \rightarrow \mathsf {ES}\) corresponding to \(\mathbf {p}\) is defined by \(\mathbf {F}_{\mathrm {pf}(\mathbf {p})}[s](i,X) = s(\mathbf {p}[i]\Vert X)\). To explain, recall that the ending function is obtained as \(e\leftarrow \mathbf {F}_{\mathrm {pf}(\mathbf {p})}[s]\), and defines \(e_i\) for \(i\in [1..n]\). Function \(e_i\) takes input X, prefixes \(\mathbf {p}[i]\) to X to get a string \(X'\), applies the starting function \(s\) to \(X'\) to get Y, and returns Y as the value of \(e_i(X)\).
We claim that \(\mathbf {F}_{\mathrm {pf}(\mathbf {p})}\) is a translating functor that is also a domain-separating functor as per the definitions above. To see this, define query translator \(\mathsf {QT}_{\mathrm {pf}(\mathbf {p})}\) by \(\mathsf {QT}_{\mathrm {pf}(\mathbf {p})}(i,X)= (\mathbf {p}[i]\Vert X)\), the 1-vector whose sole entry is \(\mathbf {p}[i]\Vert X\). The answer translator \(\mathsf {AT}_{\mathrm {pf}(\mathbf {p})}\), on input \((i,X),\varvec{V}\), returns \(\varvec{V}[1]\), meaning it ignores i, X and returns the sole entry in its 1-vector \(\varvec{V}\).
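A minimal sketch of the prefixing functor follows. It assumes byte-string prefixes and uses SHA-256 as a stand-in for the starting function \(s\); the function names and the example prefix vector are ours and are not taken from any submission.

```python
import hashlib

def is_prefix_free(p) -> bool:
    """True if no entry of p is a prefix of a different entry of p."""
    return not any(i != j and q.startswith(r)
                   for i, r in enumerate(p)
                   for j, q in enumerate(p))

def prefix_functor(p, s):
    """Return e with e(i, X) = s(p[i] || X), for 1-based index i."""
    if not is_prefix_free(p):
        raise ValueError("prefix vector must be prefix-free")
    def e(i: int, x: bytes) -> bytes:
        return s(p[i - 1] + x)
    return e

def s(u: bytes) -> bytes:
    """Stand-in starting function (assumption: SHA-256)."""
    return hashlib.sha256(u).digest()

p = [b"\x01", b"\x02", b"\x03"]   # distinct prefixes of equal length
e = prefix_functor(p, s)
```

Distinct equal-length prefixes are the simplest prefix-free choice; a vector such as `[b"a", b"ab"]` is rejected because the first entry is a prefix of the second.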
Identity. Many NIST PQC submissions simply let \(e_i(X) = s(X)\), meaning the ending functions are identical to the starting one. This is captured by the identity functor \(\mathbf {F}_{\mathrm {id}}{:\;\;}\mathsf {SS} \rightarrow \mathsf {ES}\), defined by \(\mathbf {F}_{\mathrm {id}}[s](i,X) = s(X)\). This again assumes \(\mathsf {OL}_i(\mathsf {ES})=\mathsf {OL}(\mathsf {SS})\) for all \(i\in [1..n]\), meaning all ending functions have the same output length as the starting function. This functor is translating, via \(\mathsf {QT}_{\mathrm {id}}(i,X)=X\) and \(\mathsf {AT}_{\mathrm {id}}((i,X),\varvec{V})=\varvec{V}[1]\). It is however not, at least in general, domain separating.
Clearly, this functor is not, in general, rd-indiff. To make secure use of it nonetheless, applications can restrict the inputs to the ending functions to enforce a virtual domain separation, meaning, for \(i\ne j\), the schemes never query \(e_i\) and \(e_j\) on the same input. One way to do this is length differentiation. Here, for \(i\in [1..n]\), the inputs to which \(e_i\) is applied all have the same length \(l_i\), and \(l_1,\ldots ,l_n\) are distinct. Length differentiation is used in a number of NIST PQC submissions. There are, of course, many other similar ways to enforce the virtual domain separation.
There are two ways one might capture this with regard to security. One is to restrict the domain \(\mathrm {Dom}(\mathsf {ES})\) of the ending space. For example, for length differentiation, we would require that there exist distinct \(l_1,\ldots ,l_n\) such that for all \((i,X)\in \mathrm {Dom}(\mathsf {ES})\) we have \(|X|=l_i\). For such an ending space, the identity functor would provide security. The approach we take is different. We don’t restrict the domain of the ending space, but instead define security with respect to a subdomain, which we called the working domain, where the restriction is captured. This, we believe, is better suited for practice, for a few reasons. One is that a single implementation of the ending functions can be used securely in different applications that each have their own working domain. Another is that implementations of the ending functions do not appear to enforce any restrictions, leaving it up to applications to figure out how to securely use the functions. In this context, highlighting the working domain may help application designers think about what is the working domain in their application and make this explicit, which can reduce error.
But we warn that the identity functor approach is more prone to misuse and in the end more dangerous and brittle than some others.
As per the above, inverses can only be given for certain working domains. Let us say that \(\mathcal{W}\subseteq \mathrm {Dom}(\mathsf {ES})\) separates domains if for all \((i_1,X_1),(i_2,X_2)\in \mathcal{W}\) satisfying \(i_1\ne i_2\), we have \(X_1\ne X_2\). Put another way, for any string X there is at most one j such that \((j,X)\in \mathcal{W}\). We assume an efficient inverter for \(\mathcal{W}\). This is a deterministic algorithm \(\mathrm {In}_{\mathcal{W}}\) that on input \(X\in \{0,1\}^*\) returns the unique i such that \((i,X)\in \mathcal{W}\) if such an i exists, and otherwise returns \(\bot \). (The uniqueness is by the assumption that \(\mathcal{W}\) separates domains.)
As an example, for length differentiation, we pick some distinct integers \(l_1,\ldots ,l_n\) such that \(\{0,1\}^{l_i}\subseteq \mathsf {Dom}_i(\mathsf {ES})\) for all \(i\in [1..n]\). We then let \(\mathcal{W}= \{(i,X)\in \mathrm {Dom}(\mathsf {ES}) \,:\, |X|=l_i\}\). This separates domains. Now we can define \(\mathrm {In}_{\mathcal{W}}(X)\) to return the unique i such that \(|X| = l_i\) if \(|X| \in \{l_1,\ldots ,l_n\}\), otherwise returning \(\bot \).
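The inverter for length differentiation is simple enough to write out. In this sketch, which is ours and only illustrative, lengths are measured in bytes rather than bits, `None` plays the role of \(\bot \), and the example lengths 32 and 64 are arbitrary.

```python
def make_length_inverter(lengths):
    """Build In_W for W = {(i, X) : |X| = l_i}, where lengths[i-1] = l_i."""
    if len(set(lengths)) != len(lengths):
        raise ValueError("the lengths l_1, ..., l_n must be distinct")
    index_of = {l: i for i, l in enumerate(lengths, start=1)}

    def inverter(x: bytes):
        # Return the unique i with |x| = l_i, or None if no such i exists.
        return index_of.get(len(x))

    return inverter

def in_working_domain(i: int, x: bytes, lengths) -> bool:
    """Membership test for the working domain W."""
    return 1 <= i <= len(lengths) and len(x) == lengths[i - 1]

In_W = make_length_inverter([32, 64])
```

An application that only ever queries \(e_i\) on inputs passing `in_working_domain` stays inside the working domain, which is exactly the restriction under which the identity functor is analyzed.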
Output-splitting. We formalize another method that we call output splitting. It is used in several NIST PQC submissions.
Let \(\ell _0 = 0\) and \(\ell _i = \mathsf {OL}_1(\mathsf {ES}) + \cdots + \mathsf {OL}_{i}(\mathsf {ES})\) for \(i\in [1..n]\). Let \(\ell = \mathsf {OL}(\mathsf {SS})\) be the output length of the sol functions \(s\in \mathsf {SS}\), and assume \(\ell = \ell _n\). The output-splitting functor \(\mathbf {F}_{\mathrm {spl}}{:\;\;}\mathsf {SS} \rightarrow \mathsf {ES}\) is defined by \(\mathbf {F}_{\mathrm {spl}}[s](i,X) = s(X)[\ell _{i-1}\!+\!1 .. \ell _{i}]\). That is, if \(e\leftarrow \mathbf {F}_{\mathrm {spl}}[s]\), then \(e_i(X)\) lets \(Z \leftarrow s(X)\) and then returns bits \(\ell _{i-1}\!+\!1\) through \(\ell _{i}\) of Z. This functor is translating, via \(\mathsf {QT}_{\mathrm {spl}}(i,X)=X\) and \(\mathsf {AT}_{\mathrm {spl}}((i,X),\varvec{V})=\varvec{V}[1][\ell _{i-1}\!+\!1 .. \ell _{i}]\). It is however not domain separating.
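The following sketch spells out output splitting with the convention \(\ell _0 = 0\) made explicit. It works at byte rather than bit granularity, and it uses SHAKE256 truncated to a fixed 96 bytes as a stand-in sol function \(s\); these choices and all names are our assumptions.

```python
import hashlib
from itertools import accumulate

def split_functor(out_lens, s):
    """Return e with e(i, X) = bytes l_{i-1}+1 .. l_i of s(X), for 1-based i."""
    bounds = [0] + list(accumulate(out_lens))   # l_0 = 0, l_1, ..., l_n

    def e(i: int, x: bytes) -> bytes:
        z = s(x)
        if len(z) != bounds[-1]:
            raise ValueError("output length of s must equal l_n")
        return z[bounds[i - 1]:bounds[i]]        # 0-based slice of the same range

    return e

def s(x: bytes) -> bytes:
    """Stand-in starting function with fixed 96-byte output (assumption)."""
    return hashlib.shake_256(x).digest(96)

e = split_functor([64, 32], s)
x = b"input"
```

Both ending functions query \(s\) on the same input X, which is why the functor is not domain separating even though the two outputs are disjoint segments of \(s(X)\).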
The implication of this result is that NewHope’s implementation differs noticeably from the model in which its security claims are set, even when \(\mathsf {SHAKE256}\) is assumed to be a random oracle. This admits the possibility of hash function collisions and other sources of vulnerability that are not eliminated by the security proof. To claim provable security for NewHope’s implementation, further justification is required to argue that these potential collisions are rare or unexploitable. We do not claim that an attack on read-only indifferentiability implies an attack on the IND-CCA security of NewHope, but it does highlight a gap that needs to be addressed. Read-only indifferentiability constitutes a useful tool for detecting such gaps and measuring the strength of various oracle cloning methods.
Acknowledgments
The authors were supported in part by NSF grant CNS-1717640 and a gift from Microsoft. Günther was additionally supported by Research Fellowship grant GU 1859/1-1 of the German Research Foundation (DFG).
References
 1. Albrecht, M., Cid, C., Paterson, K.G., Tjhai, C.J., Tomlinson, M.: NTS-KEM. NIST PQC Round 2 Submission (2019)
 2. Alkim, E., et al.: NewHope: algorithm specifications and supporting documentation. NIST PQC Round 2 Submission (2019)
 3. Aragon, N., et al.: BIKE: bit flipping key encapsulation. NIST PQC Round 2 Submission (2019)
 4. Aragon, N., et al.: LOCKER: low rank parity check codes encryption. NIST PQC Round 1 Submission (2017)
 5. Avanzi, R., et al.: CRYSTALS-Kyber: algorithm specifications and supporting documentation. NIST PQC Round 2 Submission (2019)
 6. Baan, H., et al.: Round5: KEM and PKE based on (ring) learning with rounding. NIST PQC Round 2 Submission (2019)
 7. Banegas, G., et al.: DAGS: key encapsulation from dyadic GS codes. NIST PQC Round 1 Submission (2017)
 8. Bardet, M., et al.: BIG QUAKE: binary Goppa quasi-cyclic key encapsulation. NIST PQC Round 1 Submission (2017)
 9. Bellare, M., Bernstein, D.J., Tessaro, S.: Hash-function based PRFs: AMAC and its multi-user security. In: Fischlin, M., Coron, J.-S. (eds.) EUROCRYPT 2016, Part I. LNCS, vol. 9665, pp. 566–595. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49890-3_22
 10. Bellare, M., Davis, H., Günther, F.: Separate your domains: NIST PQC KEMs, oracle cloning and read-only indifferentiability. Cryptology ePrint Archive (2020)
 11. Bellare, M., Rogaway, P.: Random oracles are practical: a paradigm for designing efficient protocols. In: Denning, D.E., Pyle, R., Ganesan, R., Sandhu, R.S., Ashby, V. (eds.) ACM CCS 1993, pp. 62–73. ACM Press, November 1993
 12. Bellare, M., Rogaway, P.: The security of triple encryption and a framework for code-based game-playing proofs. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 409–426. Springer, Heidelberg (2006). https://doi.org/10.1007/11761679_25
 13. Bernstein, D.J., et al.: Classic McEliece: conservative code-based cryptography. NIST PQC Round 2 Submission (2019)
 14. Bernstein, D.J., Chuengsatiansup, C., Lange, T., van Vredendaal, C.: NTRU Prime. NIST PQC Round 2 Submission (2019)
 15. Chen, M.-S., Hülsing, A., Rijneveld, J., Samardjiska, S., Schwabe, P.: MQDSS specifications. NIST PQC Round 2 Submission (2019)
 16. Cheon, J.H., et al.: Lizard public key encryption. NIST PQC Round 1 Submission (2017)
 17. Coron, J.-S., Dodis, Y., Malinaud, C., Puniya, P.: Merkle-Damgård revisited: how to construct a hash function. In: Shoup, V. (ed.) CRYPTO 2005. LNCS, vol. 3621, pp. 430–448. Springer, Heidelberg (2005). https://doi.org/10.1007/11535218_26
 18. Cramer, R., Shoup, V.: Design and analysis of practical public-key encryption schemes secure against adaptive chosen ciphertext attack. SIAM J. Comput. 33(1), 167–226 (2003)
 19. D’Anvers, J.-P., Karmakar, A., Roy, S.S., Vercauteren, F.: SABER: Mod-LWR based KEM. NIST PQC Round 2 Submission (2019)
 20. Demay, G., Gaži, P., Hirt, M., Maurer, U.: Resource-restricted indifferentiability. In: Johansson, T., Nguyen, P.Q. (eds.) EUROCRYPT 2013. LNCS, vol. 7881, pp. 664–683. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38348-9_39
 21. Dent, A.W.: A designer’s guide to KEMs. In: Paterson, K.G. (ed.) Cryptography and Coding 2003. LNCS, vol. 2898, pp. 133–151. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-40974-8_12
 22. Garcia-Morchon, O., Zhang, Z.: Round2: KEM and PKE based on GLWR. NIST PQC Round 1 Submission (2017)
 23. Hamburg, M.: Post-quantum cryptography proposal: ThreeBears. NIST PQC Round 2 Submission (2019)
 24. Hofheinz, D., Hövelmanns, K., Kiltz, E.: A modular analysis of the Fujisaki-Okamoto transformation. In: Kalai, Y., Reyzin, L. (eds.) TCC 2017, Part I. LNCS, vol. 10677, pp. 341–371. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70500-2_12
 25. Hülsing, A., Rijneveld, J., Schanck, J.M., Schwabe, P.: NTRU-HRSS-KEM: algorithm specifications and supporting documentations. NIST PQC Round 1 Submission (2017)
 26. Jao, D., et al.: Supersingular isogeny key encapsulation. NIST PQC Round 2 Submission (2019)
 27. Jiang, H., Zhang, Z., Chen, L., Wang, H., Ma, Z.: IND-CCA-secure key encapsulation mechanism in the quantum random oracle model, revisited. In: Shacham, H., Boldyreva, A. (eds.) CRYPTO 2018, Part III. LNCS, vol. 10993, pp. 96–125. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-96878-0_4
 28. Lu, X., Liu, Y., Jia, D., Xue, H., He, J., Zhang, Z.: LAC: Lattice-based cryptosystems. NIST PQC Round 2 Submission (2019)
 29. Maurer, U., Renner, R., Holenstein, C.: Indifferentiability, impossibility results on reductions, and applications to the random oracle methodology. In: Naor, M. (ed.) TCC 2004. LNCS, vol. 2951, pp. 21–39. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-24638-1_2
 30. Melchor, C.A., et al.: ROLLO: Rank-Ouroboros, LAKE, & LOCKER. NIST PQC Round 2 Submission (2018)
 31. Melchor, C.A., et al.: Rank quasi-cyclic (RQC). NIST PQC Round 2 Submission (2019)
 32. Melchor, C.A., et al.: Hamming quasi-cyclic (HQC). NIST PQC Round 2 Submission (2019)
 33. Mittelbach, A.: Salvaging indifferentiability in a multi-stage setting. In: Nguyen, P.Q., Oswald, E. (eds.) EUROCRYPT 2014. LNCS, vol. 8441, pp. 603–621. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-55220-5_33
 34. Naehrig, M., et al.: FrodoKEM: learning with errors key encapsulation. NIST PQC Round 2 Submission (2019)
 35. NIST. Post-Quantum Cryptography Standardization Process. https://csrc.nist.gov/projects/post-quantum-cryptography
 36. NIST. Federal Information Processing Standard 202, SHA-3 Standard: Permutation-Based Hash and Extendable-Output Functions, August 2015
 37. NIST. PQC Standardization Process: Second Round Candidate Announcement, January 2019. https://csrc.nist.gov/news/2019/pqc-standardization-process-2nd-round-candidates
 38. Plantard, T.: Odd Manhattan’s algorithm specifications and supporting documentation. NIST PQC Round 1 Submission (2017)
 39. Ristenpart, T., Shacham, H., Shrimpton, T.: Careful with composition: limitations of the indifferentiability framework. In: Paterson, K.G. (ed.) EUROCRYPT 2011. LNCS, vol. 6632, pp. 487–506. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-20465-4_27
 40. Saito, T., Xagawa, K., Yamakawa, T.: Tightly-secure key-encapsulation mechanism in the quantum random oracle model. In: Nielsen, J.B., Rijmen, V. (eds.) EUROCRYPT 2018, Part III. LNCS, vol. 10822, pp. 520–551. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-78372-7_17
 41. Seo, M., Park, J.H., Lee, D.H., Kim, S., Lee, S.J.: Proposal for NIST post-quantum cryptography standard: EMBLEM and R.EMBLEM. NIST PQC Round 1 Submission (2017)
 42. Smart, N.P., et al.: LIMA: a PQC encryption scheme. NIST PQC Round 1 Submission (2017)
 43. Steinfeld, R., Sakzad, A., Zhao, R.K.: Titanium: proposal for a NIST post-quantum public-key encryption and KEM standard. NIST PQC Round 1 Submission (2017)
 44. Zhao, Y., Jin, Z., Gong, B., Sui, G.: A modular and systematic approach to key establishment and public-key encryption based on LWE and its variants. NIST PQC Round 1 Submission (2017)