Abstract
We show that the widely deployed RSA-OAEP encryption scheme of Bellare and Rogaway (Eurocrypt 1994), which combines RSA with two rounds of an underlying Feistel network whose hash (i.e., round) functions are modeled as random oracles, meets indistinguishability under chosen-plaintext attack (IND-CPA) in the standard model based on simple, non-interactive, and non-interdependent assumptions on RSA and the hash functions. To prove this, we first give a result on a more general notion called “padding-based” encryption, saying that such a scheme is IND-CPA if (1) its underlying padding transform satisfies a “fooling” condition against small-range distinguishers on a class of high-entropy input distributions, and (2) its trapdoor permutation is sufficiently lossy as defined by Peikert and Waters (STOC 2008). We then show that the first round of OAEP satisfies condition (1) if its hash function is t-wise independent for t roughly proportional to the allowed message length. We clarify that this result requires the hash function to be keyed, and for its key to be included in the public key of RSA-OAEP. We also show that RSA satisfies condition (2) under the \(\Phi \)-Hiding Assumption of Cachin et al. (Eurocrypt 1999). This is the first positive result about the instantiability of RSA-OAEP. In particular, it increases confidence that chosen-plaintext attacks are unlikely to be found against the scheme. In contrast, RSA-OAEP’s predecessor in PKCS #1 v1.5 was shown to be vulnerable to such attacks by Coron et al. (Eurocrypt 2000).
Introduction
Bellare and Rogaway [5] designed the RSA-OAEP encryption scheme as a drop-in replacement for RSA PKCS #1 v1.5 [55] with provable security. In particular, it follows the same paradigm as RSA PKCS #1 v1.5 in that it encrypts a message of less than k bits to a k-bit ciphertext (where k is the modulus length) by first applying a fast, randomized, and invertible “padding transform” to the message before applying RSA. In the case of RSA-OAEP, the underlying padding transform (which is itself called ‘OAEP’^{Footnote 1}) embeds a message m and random coins r as \(s\Vert (H(s) {\,\oplus \,}r)\) where ‘\(\Vert \)’ denotes concatenation, \(s = (m \Vert 0^{k_1}) {\,\oplus \,}G(r)\) for some parameter \(k_1\), and G and H are hash functions (see Fig. 2). In contrast, PKCS #1 v1.5 essentially just concatenates m with r.
RSA-OAEP was designed using the random oracle (RO) methodology [6]. This means that the hash functions are modeled as independent truly random functions, available to all parties via oracle access. When the scheme is implemented in practice, these oracles are heuristically “instantiated” in certain ways using a cryptographic hash function. In particular, this means that any oracle call by the scheme’s algorithms is replaced by the computation of a concrete function. In terms of security, a cryptographic hash function (or a function built from one) is of course neither random nor computable only via an oracle (it has a short, public description), but schemes designed using this methodology are hoped to be secure. Unfortunately, a series of works, starting with the seminal paper of Canetti et al. [20], showed that there are schemes secure in the RO model that are insecure under every instantiation of the oracles; such RO model schemes are called uninstantiable. Thus, to gain confidence in an RO model scheme, we should show that it is instantiable, i.e., that the oracles admit a secure instantiation by efficiently computable functions under well-defined assumptions. Then, when we instantiate the scheme, we know that our goal is at least plausible. We feel this is especially important for a scheme such as RSA-OAEP, which is by now widely standardized and implemented (e.g., in SSH [32]).
Yet, while RO model schemes continue to be proposed, relatively few have been shown to be instantiable. In particular, we are not aware of any result showing instantiability of RSA-OAEP, even under a relatively modest security model. In fact, the scheme has come under criticism lately due to several works (discussed in Sect. 1.2) showing the impossibility of certain types of instantiations under chosen-ciphertext attack (IND-CCA) [52]. Fortunately, we bring some good news: We give reasonable assumptions under which RSA-OAEP is secure against chosen-plaintext attack (IND-CPA) [31]. We believe this is an important step toward a better understanding of the scheme’s security.
Our Contributions
Our result on the instantiability of RSA-OAEP is obtained in three steps, each of which is a result that may be of independent interest. First, we show a general result on the instantiability of “padding-based encryption,” of which f-OAEP is a special case, under the assumption that the underlying padding transform is what we call a fooling extractor and the trapdoor permutation is lossy [49]. We then show (as the second and third steps, respectively) that OAEP and RSA satisfy the respective conditions under suitable assumptions.
Padding-based encryption without ROs. Our first result is a general theorem about padding-based encryption (PBE), a notion formalized recently by Kiltz and Pietrzak [38].^{Footnote 2} PBE generalizes the design methodology of PKCS #1 and RSA-OAEP we already mentioned. Namely, we start with a k-bit to k-bit trapdoor permutation (TDP) that satisfies a weak security notion like one-wayness. To “upgrade” the TDP to an encryption scheme satisfying a strong security notion like IND-CPA, we design an invertible “padding transform” which embeds a plaintext and random coins into a k-bit string, to which we then apply the TDP. This methodology is quite natural and has long been prevalent in practice, motivating the design of OAEP and later schemes such as SAEP [13] and PSS-E [23]. The latter were all designed and analyzed in the RO model.
We show that the RO model is unnecessary in the design and analysis of IND-CPA secure PBE. To do so, we formulate a connection between PBE, a new notion we call “fooling extractor for small-range distinguishers” (or just “fooling extractor”), and lossy trapdoor functions as defined by Peikert and Waters [49]. Lossiness means that there is an alternative, “lossy” key generation algorithm that outputs a public key indistinguishable from a normal one, but which induces a small-range (“lossy”) function. This is powerful because it allows one to prove security with respect to the lossy key generation algorithm, where information-theoretic arguments apply. A fooling extractor is a kind of randomness extractor (a concept introduced in [46]) whose output on a high-entropy source looks random to any function (or distinguisher) with a small range.^{Footnote 3} Our result says that if the padding transform of a PBE scheme is an “adaptive” fooling extractor for sources of the form (m, R), where m is a plaintext and R is the random coins (we call these “encryption sources”), and its TDP is sufficiently lossy (the logarithm of its lossy range size should be slightly less than the length of R), then the PBE scheme is IND-CPA. Here “adaptive” means that m may depend on the choice of the extractor seed. We call such padding transforms “encryption-compatible.”
OAEP fools small-range distinguishers. Our second result says that the OAEP padding transform is encryption-compatible if the hash function G is t-wise independent for appropriate t (roughly, proportional to the allowed message length).^{Footnote 4} Note that no restriction is put on hash function H; in particular, neither hash function is modeled as an RO. The inspiration for our proof comes from the “Crooked” Leftover Hash Lemma (LHL) of Dodis and Smith [26], especially its application to deterministic encryption by Boldyreva et al. [10] (who also gave a simpler proof). Qualitatively, the Crooked LHL says that \((K,f(\Pi (K,X)))\) looks like (K, f(U)) for any small-range function f, pairwise-independent function \(\Pi \) keyed by K, and high-entropy source X; in our terminology, this says that a pairwise-independent function is a fooling extractor for such X. In our application, we might naïvely view \(\Pi \) as OAEP. There are two problems with this. First, OAEP is not pairwise independent, even in the RO model. Second, showing that OAEP is encryption-compatible entails showing adaptivity (as defined above), whereas in the lemma K is independent of X.
To solve the first problem, we show that the Crooked LHL can be strengthened to say that \(K,f(X,\Pi (K,X))\) looks like K, f(X, U); i.e., that \(\Pi (K,X)\) looks random to f even given X. The proof is a careful extension of the proof of the Crooked LHL in [10]. Then, by viewing X as the random coins in OAEP and \(\Pi \) as the hash function G, we can conclude that OAEP is a fooling extractor for any fixed encryption source (m, R), where m is independent of K (note that our analysis does not use any properties of H—the only fact we use about the second Feistel round is that it is invertible).
To solve the second problem, we extend an idea of Trevisan and Vadhan [61] to our setting and show that if G is t-wise independent for large enough t, the probability that the chosen seed (or key) is “bad” for a particular encryption source is so small that we can take a union bound over all possible m and conclude that OAEP is in fact adaptive, meaning it is indeed encryption-compatible. Interestingly, we obtain better parameters in the case that f is regular, meaning every preimage set has the same size. However, our analysis still goes through assuming that every preimage set is sufficiently large, which we show can always be assumed with some loss in parameters.
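The union-bound step can be written schematically as follows. This is only an illustrative sketch: the symbol \(\delta \) is a hypothetical placeholder (not notation from the paper) for an upper bound on the probability that a random seed K is “bad” for one fixed encryption source (m, R).

```latex
\Pr_{K}\bigl[\exists\, m \in \{0,1\}^{\mu} : K \text{ is bad for } (m,R)\bigr]
  \;\le\; \sum_{m \in \{0,1\}^{\mu}} \Pr_{K}\bigl[K \text{ is bad for } (m,R)\bigr]
  \;\le\; 2^{\mu} \cdot \delta .
```

The bound is useful only when \(\delta \) is much smaller than \(2^{-\mu }\), which is consistent with the requirement that t grow roughly in proportion to the allowed message length.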
Lossiness of RSA. To instantiate RSA-OAEP, it remains to show lossiness of RSA. Our final result is that RSA is indeed lossy under reasonable assumptions. We first show lossiness of RSA under the \(\Phi \)-Hiding Assumption (\(\Phi \)A) of Cachin, Micali, and Stadler [16]. \(\Phi \)A has been used as the basis for a number of efficient protocols, e.g., [15, 16, 29, 33]. \(\Phi \)A states roughly that given an RSA modulus \(N = pq\), it is hard to distinguish primes e that divide \(\phi (N) = (p-1)(q-1)\) from those that do not. Normal RSA parameters (N, e) are such that \(\gcd (e,\phi (N)) = 1\). Under \(\Phi \)A, we may alternatively choose (N, e) such that e divides \(p-1\). The range of the RSA function then shrinks to a 1/e fraction of its original size. To resist known attacks, we can take the bit-length of e up to almost 1/4 that of N, giving RSA lossiness of almost k/4 bits, where k is the modulus length.^{Footnote 5} We also stress that even though the only currently known algorithm to break \(\Phi \)A with such parameters is to factor the modulus N, the assumption is considerably stronger than the standard factoring/RSA assumptions.
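The effect of choosing e to divide \(p-1\) can be checked by brute force on a toy modulus. The sketch below is illustrative only: the helper name `rsa_image_size` and the parameters (N = 11 · 23, e = 3 versus e = 5) are assumptions chosen for the example and are far too small to be secure.

```python
from math import gcd

def rsa_image_size(n: int, e: int) -> int:
    """Count the distinct values of x^e mod n as x ranges over the units of Z_n."""
    units = [x for x in range(1, n) if gcd(x, n) == 1]
    return len({pow(x, e, n) for x in units})

# Toy modulus: N = 11 * 23, so phi(N) = 10 * 22 = 220.
N = 11 * 23

# Injective mode: gcd(3, 220) = 1, so x -> x^3 permutes the units.
injective = rsa_image_size(N, 3)

# Lossy mode: 5 divides p - 1 = 10, so the image shrinks by a factor of 5.
lossy = rsa_image_size(N, 5)

print(injective, lossy)
```

In this toy case the image in lossy mode has exactly \(\phi (N)/e\) elements, matching the 1/e reduction described above.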
In practice, e is usually chosen to be small for efficiency reasons. We observe that in this case more lossiness can be achieved by considering multi-prime RSA where \(N = p_1 \cdots p_m\) for \(m \ge 2\) (for a fixed modulus length). In the lossy case, we choose (N, e) such that e divides \(p_i - 1\) for all \(1 \le i \le m-1\); the range of the RSA function is then reduced by a factor \(1/e^{m-1}\). In a preliminary version of this paper [37], we showed that the maximum bit-length of e in this case to avoid our best attack was roughly \(k(1/m - 2/m^2)\) where k is the modulus length. By devising better attacks, this value was subsequently reduced to \(k(2/3m^{2/3})\) by Herrmann [35] and \(k(1/m - 2/(em \log (m+1)))\), where e is the base of the natural logarithm, by Tosu and Kunihiro [60]. So, for a fixed modulus size we gain in lossiness only for small e. If we assume such multi-prime RSA moduli are indistinguishable from two-prime ones, we can achieve such a gain in lossiness in the case of standard (two-prime) RSA as well.
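The multi-prime case can be illustrated the same way. The following sketch uses hypothetical toy primes (11, 31, 23) with e = 5 dividing \(p_i - 1\) for the first \(m-1 = 2\) primes; the helper `image_size` is an illustrative name, and the parameters are not meant to reflect secure sizes.

```python
from math import gcd, prod

def image_size(primes, e):
    """Number of distinct e-th powers among the units modulo the product of `primes`."""
    n = prod(primes)
    return len({pow(x, e, n) for x in range(1, n) if gcd(x, n) == 1})

# m = 3 primes; e = 5 divides 11 - 1 and 31 - 1 but not 23 - 1.
primes = [11, 31, 23]
e = 5

phi = prod(p - 1 for p in primes)   # 10 * 30 * 22 = 6600 units
size = image_size(primes, e)        # size of the lossy-mode image

print(phi, size, phi // size)
```

Here the image shrinks by \(e^{m-1} = 25\), compared with a factor of only e when a single prime factor satisfies the divisibility condition.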
Implications for RSA-OAEP. Combining the results above gives that RSA-OAEP is IND-CPA in the standard model under (rather surprisingly, at least to us) simple, non-interactive, and non-interdependent assumptions on RSA and the hash functions. The parameters for RSA-OAEP supported by our proofs are discussed in Sect. 6. While they are considerably worse than what is expected in practice, we view the upshot of our results not as the concrete parameters they support, but rather that they increase the theoretical backing for the scheme’s security at a more qualitative level, showing it can be instantiated at least for larger parameters. In particular, our results give us greater confidence that chosen-plaintext attacks are unlikely to be found against the scheme; such attacks are known against the predecessor of RSA-OAEP in PKCS #1 v1.5 [22]. That said, we strongly encourage further research to try to improve the concrete parameters. Indeed, initial steps in this direction have already been taken; see Sect. 1.3 below.
Moreover, our analysis brings to light some simple modifications that may increase the scheme’s security. The first is to key the hash function G. Although our results have some interpretation in the case that G is a fixed function (see below), it may be preferable for G to have an explicit, randomly selected key. It is an interesting open question whether our proof can be extended to function families that use shorter keys. The second possible modification is to increase the length of the randomness versus that of the redundancy in the message when encrypting short messages under RSA-OAEP. Of course, we suggest these modifications only in cases where they do not impact efficiency too severely.
Using unkeyed hash functions. Formally, our results assume G is randomly chosen from a large family (i.e., it is a keyed hash function). However, our analysis actually shows that almost every function (i.e., all but a very small fraction) from the family yields a secure instantiation; we just do not know an explicit member that works. In other words, it is not strictly necessary that G be randomly chosen. When G is instantiated in practice using a fixed cryptographic hash function like MD5 or SHA-1, it is plausible that the resulting instantiation is secure. One can also assume the fixed cryptographic hash function to be implicitly keyed, where the key (in this context called the initialization vector) is chosen and fixed by its designer, and hard-coded into its implementation.
On chosen-ciphertext security. Any extension of our results to security under chosen-ciphertext attack (IND-CCA) must get around the negative results of Kiltz and Pietrzak [38] (which we discuss in more detail in Sect. 1.2). One possible approach to this is based on the fact that, by the results of Bellare and Palacio [4], the notion of plaintext awareness (PA) + IND-CPA implies IND-CCA. Thus, in order to show IND-CCA security of RSA-OAEP in the standard model it suffices, by our results, to show PA (which is an orthogonal property to privacy). To show the latter one could try to use non-black-box assumptions on H along the lines of [18]. We leave a detailed investigation to future work.
Related Work
Security of OAEP in the RO model. In their original paper [5], Bellare and Rogaway showed that OAEP is IND-CPA assuming the TDP is one-way. They further showed it achieves a notion they called “plaintext awareness.” Subsequently, Shoup [58] observed that the latter notion is too weak to imply security against chosen-ciphertext attacks, and in fact there is no black-box proof of IND-CCA security of OAEP based on one-wayness of the TDP. Fortunately, Fujisaki et al. [28] proved that OAEP is nevertheless IND-CCA assuming so-called “partial-domain” one-wayness, and that partial-domain one-wayness and (standard) one-wayness of RSA are equivalent.
Security of OAEP without ROs. Results on instantiability of OAEP have so far mainly been negative. Boldyreva and Fischlin [11] showed that (contrary to a conjecture of Canetti [17]) one cannot securely instantiate even one of the two hash functions (while still modeling the other as an RO) of OAEP under IND-CCA by a “perfectly one-way” hash function [17, 19] if one assumes only that f is partial-domain one-way. Brown [14] and Paillier and Villar [47] later showed that there are no “key-preserving” black-box proofs of IND-CCA security of RSA-OAEP based on one-wayness of RSA. Recently, Kiltz and Pietrzak [38] (building on the earlier work of Dodis et al. [24] in the signature context) generalized these results and showed that there is no black-box proof of IND-CCA (or even NM-CPA) security of OAEP based on any property of the TDP satisfied by an ideal (truly random) permutation.^{Footnote 6} In fact, their result can be extended to rule out a black-box proof of NM-CPA security of OAEP assuming the TDP is lossy [39], so our results are in some sense optimal given our assumptions.
Instantiations of related schemes. A positive instantiation result about a variant of OAEP called OAEP++ [40] (where part of the transform is output in the clear) was obtained by Boldyreva and Fischlin in [12]. They showed an instantiation that achieves (some weak form of) non-malleability under chosen-plaintext attacks (NM-CPA) for random messages, assuming the existence of non-malleable pseudorandom generators (NMPRGs).^{Footnote 7} We note that the approach of trying to obtain positive results for instantiations under security notions weaker than IND-CCA originates from their work, and the authors explicitly ask whether OAEP can be shown IND-CPA in the standard model based on reasonable assumptions on the TDP and hash functions.
Another line of work has looked at instantiating other RO model schemes related at least in spirit to OAEP. Canetti [17] showed that the IND-CPA scheme in [6] can be instantiated using (a strong form of) perfectly one-way probabilistic hash functions. More recently, the works of Canetti and Dakdouk [18], Pandey et al. [48], and Boldyreva et al. [9] obtained (partial) instantiations of the earlier IND-CCA scheme of [6]. Hofheinz and Kiltz [36] recently showed an IND-CCA secure instantiation of a variant of the DHIES scheme of [51].
Subsequent Work
Subsequent to the preliminary version of this paper [37], our results have been improved in several ways. First, as mentioned above, Herrmann [35] and Tosu and Kunihiro [60] gave better cryptanalyses of our extension of \(\Phi \)A to the case of multiple primes. Furthermore, Lewko et al. [42] resolved an open problem raised by our work and proved “approximate regularity” of lossy RSA on arithmetic progressions of sufficient length, leading to improved security bounds for RSA-OAEP; see Sect. 6. They also showed that this result gives a proof of IND-CPA security of RSA PKCS #1 v1.5. Subsequently, Smith and Zhang [59] proved a stronger result on approximate regularity of lossy RSA under a stronger assumption on RSA, leading to better parameters. They also fixed an erroneous claim of [42] about an “average-case” version of approximate regularity of lossy RSA, which can be used to prove large consecutive runs of input bits simultaneously hard-core without the stronger assumption on RSA.
Seurin [57] (building additionally on Freeman et al. [27]) showed how to extend our results to the case of the Rabin trapdoor function [50] instead of RSA. Hemenway et al. [34] showed how to use our result on the lossiness of RSA under \(\Phi \)A to obtain new constructions of non-committing encryption under this assumption. Bellare et al. [3] proved IND-CPA security of RSA-OAEP under standard one-wayness of RSA, but making a much stronger assumption on the hash functions than we do.
Preliminaries
Notation and conventions. For a probabilistic algorithm A, by \(y \xleftarrow {\$} A(x)\), we mean that A is executed on input x and the output is assigned to y, whereas if S is a finite set then by \(s \xleftarrow {\$} S\), we mean that s is assigned a uniform element of S. We sometimes use \(y \leftarrow A(x;r)\) to make A’s random coins r explicit. We denote by \(\mathrm {Pr}\bigl [A(x) \,{\Rightarrow }\,y :\,\ldots ~]\) the probability that A outputs y on input x when x is sampled according to the elided experiment. Unless otherwise specified, an algorithm may be probabilistic and its running-time includes that of any overlying experiment. We denote by \(1^k\) the unary encoding of the security parameter k. We sometimes suppress dependence on k for readability. For \(i \in {\mathbb {N}}\) we denote by \(\{0,1\}^i\) the set of all binary strings of length i. If s is a string, then \(|s|\) denotes its length in bits, whereas if S is a set then \(|S|\) denotes its cardinality. By ‘\(\Vert \)’ we denote string concatenation. All logarithms are base 2.
Basic Definitions. Writing \(P_X(x)\) for the probability that a random variable X assigns to x, the statistical distance between random variables X and Y with the same range is given by \(\Delta (X,Y) = \frac{1}{2} \sum _x |P_X(x) - P_Y(x)|\). If \(\Delta (X,Y)\) is at most \(\varepsilon \) then we say X, Y are \(\varepsilon \)-close and write \(X \approx _\varepsilon Y\). We say that X is independent if it is independent of all other random variables under consideration. The min-entropy of X is \(\mathrm {H}_\infty (X) = -\log (\max _x P_X(x))\). A random variable X over \(\{0,1\}^n\) is called an \((n,\ell )\)-source if \(\mathrm {H}_\infty (X) \ge \ell \). If \(\ell = n\) then X is said to be uniform. Let \(f : A \rightarrow B\) be a function. We denote by R(f) the range of f, i.e., \(\{b \in B \mid \exists a \in A, f(a) = b\}\). We call \(|R(f)|\) the range size of f. We call f regular if each preimage set is the same size, i.e., \(|\{x \in A \mid f(x) = y \}|\) is the same for all \(y \in R(f)\).
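For concreteness, statistical distance and min-entropy can be computed directly for small explicit distributions. The sketch below represents a distribution as a dictionary from outcomes to probabilities; the function names and the two example distributions are illustrative assumptions, not notation from the paper.

```python
from math import log2

def statistical_distance(p: dict, q: dict) -> float:
    """Half the L1 distance between two distributions given as {outcome: probability}."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

def min_entropy(p: dict) -> float:
    """Negative base-2 logarithm of the largest point probability."""
    return -log2(max(p.values()))

# Uniform distribution on {0,1}^2: a (2, 2)-source.
uniform2 = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}

# A source on {0,1}^2 whose first bit is always 0: a (2, 1)-source.
biased = {"00": 0.5, "01": 0.5}

print(statistical_distance(uniform2, biased))
print(min_entropy(uniform2), min_entropy(biased))
```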
Public-key encryption and its security. A public-key encryption scheme with message-space \(\mathrm {MsgSp}\) is a triple of algorithms \({\mathcal {AE}}= ({\mathcal {K}}, {\mathcal {E}}, {\mathcal {D}})\). The key generation algorithm \({\mathcal {K}}\) returns a public key \( pk \) and matching secret key \( sk \). The encryption algorithm \({\mathcal {E}}\) takes \( pk \) and a plaintext m to return a ciphertext. The deterministic decryption algorithm \({\mathcal {D}}\) takes \( sk \) and a ciphertext c to return a plaintext. We require that for all messages \(m \in \mathrm {MsgSp}\)
\[ \mathrm {Pr}\bigl [{\mathcal {D}}( sk , {\mathcal {E}}( pk , m)) \,{\Rightarrow }\,m :\, ( pk , sk ) \xleftarrow {\$} {\mathcal {K}}\bigr ] \]
is (very close to) 1.
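To an encryption scheme \(\Pi = ({\mathcal {K}}, {\mathcal {E}},{\mathcal {D}})\) and an adversary \(A = (A_1, A_2)\), we associate a chosen-plaintext attack experiment \(\mathbf {Exp}^{\mathrm {ind\text{-}cpa}}_{\Pi ,A}(k)\),
\[ \begin{array}{l} b \xleftarrow {\$} \{0,1\};\ ( pk , sk ) \xleftarrow {\$} {\mathcal {K}}(1^k) \\ (m_0, m_1, state ) \xleftarrow {\$} A_1( pk ) \\ c \xleftarrow {\$} {\mathcal {E}}( pk , m_b) \\ d \xleftarrow {\$} A_2( pk , c, state ) \\ \text {If } d = b \text { return } 1 \text { else return } 0 \end{array} \]
where we require A’s output to satisfy \(|m_0| = |m_1|\). Define the ind-cpa advantage of A against \(\Pi \) as
\[ \mathbf {Adv}^{\mathrm {ind\text{-}cpa}}_{\Pi ,A}(k) = 2 \cdot \mathrm {Pr}\bigl [\mathbf {Exp}^{\mathrm {ind\text{-}cpa}}_{\Pi ,A}(k) \,{\Rightarrow }\,1\bigr ] - 1. \]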
Lossy trapdoor permutations. A lossy trapdoor permutation (LTDP) generator [49]^{Footnote 8} is a pair \(\mathsf {LTDP}= ({\mathcal {F}}, {\mathcal {F}}')\) of algorithms. Algorithm \({\mathcal {F}}\) is a usual trapdoor permutation (TDP) generator, namely it outputs a pair \((f, f^{-1})\) where f is a (description of a) permutation on \(\{0,1\}^k\) and \(f^{-1}\) its inverse. Algorithm \({\mathcal {F}}'\) outputs a (description of a) function \(f'\) on \(\{0,1\}^k\). We call \({\mathcal {F}}\) the “injective mode” and \({\mathcal {F}}'\) the “lossy mode” of \(\mathsf {LTDP}\) respectively, and we call \({\mathcal {F}}\) “lossy” if it is the first component of some lossy TDP. For a distinguisher D, define its ltdp-advantage against \(\mathsf {LTDP}\) as
\[ \mathbf {Adv}^{\mathrm {ltdp}}_{\mathsf {LTDP},D}(k) = \mathrm {Pr}\bigl [D(f) \,{\Rightarrow }\,1 :\, (f,f^{-1}) \xleftarrow {\$} {\mathcal {F}}\bigr ] - \mathrm {Pr}\bigl [D(f') \,{\Rightarrow }\,1 :\, f' \xleftarrow {\$} {\mathcal {F}}'\bigr ]. \]
We say \(\mathsf {LTDP}\) has residual leakage \(s\) if for all \(f'\) output by \({\mathcal {F}}'\) we have \(|R(f')| \le 2^s\). The lossiness of \(\mathsf {LTDP}\) is \(\ell = k - s\).
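t-wise independent hashing. Let \(H :{\mathcal {K}}\times D \rightarrow R\) be a (keyed) hash function. We say that H is t-wise independent [62] if for all distinct \(x_1,\ldots , x_t \in D\) and all \(y_1, \ldots , y_t \in R\)
\[ \mathrm {Pr}\bigl [H(K,x_1) = y_1 \wedge \cdots \wedge H(K,x_t) = y_t :\, K \xleftarrow {\$} {\mathcal {K}}\bigr ] = \frac{1}{|R|^t}. \]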
In other words, \(H(K,x_1),\ldots ,H(K,x_t)\) are all uniform and independent.
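A standard way to obtain a t-wise independent family is to use random polynomials of degree less than t over a finite field. The sketch below checks the definition exhaustively for a toy prime field; the modulus P = 5, the function names, and the use of a prime field (rather than bit strings) are illustrative assumptions and not the construction fixed by the paper.

```python
from collections import Counter
from itertools import combinations, product

P = 5  # toy prime modulus; domain and range are both Z_P

def poly_hash(key, x):
    """Evaluate the polynomial with coefficients key = (a_0, a_1, ...) at x mod P (Horner's rule)."""
    acc = 0
    for coeff in reversed(key):
        acc = (acc * x + coeff) % P
    return acc

def is_t_wise_independent(num_coeffs: int, t: int) -> bool:
    """Exhaustively check t-wise independence of polynomials with `num_coeffs` coefficients."""
    keys = list(product(range(P), repeat=num_coeffs))
    for xs in combinations(range(P), t):          # every set of t distinct inputs
        counts = Counter(tuple(poly_hash(key, x) for x in xs) for key in keys)
        for ys in product(range(P), repeat=t):    # every target output tuple
            # Each output tuple must be hit with probability exactly 1 / P^t.
            if counts[ys] * P**t != len(keys):
                return False
    return True

print(is_t_wise_independent(3, 3))  # quadratic polynomials, 3 points
print(is_t_wise_independent(2, 3))  # linear polynomials, 3 points
```

Polynomials with t coefficients pass the check for t points because t point-value pairs determine such a polynomial uniquely; with only two coefficients, three outputs are correlated and the check fails.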
Padding-Based Encryption from Lossy TDP + Fooling Extractor
In this section, we show a general result on how to build IND-CPA secure padding-based encryption (PBE) without using random oracles, by combining a lossy TDP with a “fooling extractor” for small-range distinguishers.
Background and Tools
We first provide the definitions relevant to our result.
Padding-based encryption. The idea behind padding-based encryption (PBE) is as follows: We start with a k-bit to k-bit trapdoor permutation (e.g., RSA) and wish to build a secure encryption scheme. As in [5], we are interested in encrypting messages of less than k bits to ciphertexts of length k. It is well-known that we cannot simply encrypt messages under the TDP directly to achieve strong security. So, in a PBE scheme we “upgrade” the TDP by first applying a randomized and invertible “padding transform” to a message prior to encryption.
Our definition of PBE largely follows the recent formalization in [38]. Let \(k,\mu ,\rho \) be three integers such that \(\mu +\rho \le k\). A padding transform \((\pi ,{\hat{\pi }})\) consists of two mappings \(\pi : \{0,1\}^{\mu + \rho } \rightarrow \{0,1\}^k\) and \({\hat{\pi }}: \{0,1\}^k \rightarrow \{0,1\}^\mu \cup \{\bot \}\) such that \(\pi \) is injective and the following consistency requirement is fulfilled:
\[ {\hat{\pi }}(\pi (m \Vert r)) = m \quad \text {for all } m \in \{0,1\}^\mu ,\ r \in \{0,1\}^\rho . \]
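A padding transform generator is an algorithm \(\Pi \) that on input \(1^k\) outputs a (description of a) padding transform \((\pi ,{\hat{\pi }})\). Let \({\mathcal {F}}\) be a k-bit trapdoor permutation generator and \(\Pi \) be a padding transform generator. Define the associated padding-based encryption scheme \({\mathcal {AE}}_\Pi [{\mathcal {F}}] = ({\mathcal {K}}, {\mathcal {E}}, {\mathcal {D}})\) with message-space \(\{0,1\}^\mu \) by
\[ \begin{array}{lll} \textbf {Alg } {\mathcal {K}}(1^k): & \textbf {Alg } {\mathcal {E}}((\pi ,f), m): & \textbf {Alg } {\mathcal {D}}(({\hat{\pi }},f^{-1}), c): \\ (\pi ,{\hat{\pi }}) \xleftarrow {\$} \Pi (1^k) & r \xleftarrow {\$} \{0,1\}^\rho & y \leftarrow f^{-1}(c) \\ (f,f^{-1}) \xleftarrow {\$} {\mathcal {F}}(1^k) & y \leftarrow \pi (m \Vert r) & m \leftarrow {\hat{\pi }}(y) \\ \text {Return } ((\pi ,f),({\hat{\pi }},f^{-1})) & \text {Return } f(y) & \text {Return } m \end{array} \]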
Padding-based encryption schemes have long been prevalent in practice, for example PKCS #1 [55]. While OAEP [5] is the best-known, the notion also captures later schemes such as SAEP [13] and PSS-E [23].
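The overall shape of a padding-based scheme (pad, then apply the permutation; invert, then unpad) can be sketched in a few lines. Everything concrete below is an assumption made for illustration: the “trapdoor permutation” is an affine map modulo \(2^k\), which is a permutation on k-bit values but offers no security, and the padding transform is plain concatenation of message and coins.

```python
import secrets

K_BITS, MU, RHO = 16, 8, 8  # toy sizes with MU + RHO <= K_BITS

def keygen():
    """Return (pk, sk). The permutation x -> a*x + b mod 2^k is an insecure stand-in for a TDP."""
    a = secrets.randbits(K_BITS) | 1       # odd, hence invertible modulo 2^k
    b = secrets.randbits(K_BITS)
    a_inv = pow(a, -1, 1 << K_BITS)        # the "trapdoor"
    return (a, b), (a_inv, b)

def pad(m: int, r: int) -> int:
    """Padding transform pi: embed message m and coins r into one k-bit value."""
    return (m << RHO) | r

def unpad(y: int) -> int:
    """Inverse transform pi-hat: recover the message and discard the coins."""
    return y >> RHO

def encrypt(pk, m: int) -> int:
    a, b = pk
    r = secrets.randbits(RHO)              # fresh random coins per encryption
    return (a * pad(m, r) + b) % (1 << K_BITS)

def decrypt(sk, c: int) -> int:
    a_inv, b = sk
    return unpad(((c - b) * a_inv) % (1 << K_BITS))

pk, sk = keygen()
print(decrypt(sk, encrypt(pk, 0x5A)))
```

In the formal definition the padding transform is itself sampled by a generator and placed in the public key; the sketch fixes it for brevity.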
Fooling extractors. We define a new notion that we call “fooling extractor for small-range distinguishers” or just “fooling extractor.” Intuitively, fooling extractors are a type of randomness extractor [46] that “fools” distinguishers with small-range output. We give some more intuition after the formal definition.
Definition 3.1
Let \(\mathsf {FExt}:\{0,1\}^c \times \{0,1\}^n \rightarrow \{0,1\}^k\) be a function and let \({\mathcal {X}}= \{X_1, \ldots , X_q\}\) be a class of \((n,\ell )\)-sources (as defined in Sect. 2). We say that \(\mathsf {FExt}\) fools range-\(2^s\) distinguishers on \({\mathcal {X}}\) with probability \(1-\varepsilon \) (or is an \((s,\varepsilon )\)-fooling extractor for \({\mathcal {X}}\)) if for all functions \(f'\) on \(\{0,1\}^k\) with range size at most \(2^s\) and all \(1 \le i \le q\):
\[ \Delta \bigl ((K,f'(\mathsf {FExt}(K,X_i))),\, (K,f'(U))\bigr ) \le \varepsilon , \]
where K is uniform on \(\{0,1\}^c\) and U is uniform and independent on \(\{0,1\}^k\). We call K the key or seed of \(\mathsf {FExt}\). Note that K is independent of i above.
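We say that \(\mathsf {FExt}\) adaptively fools range-\(2^s\) distinguishers on \({\mathcal {X}}\) with probability \(1-\varepsilon \) (or is an adaptive \((s,\varepsilon )\)-fooling extractor for \({\mathcal {X}}\)) if for all functions \(f'\) on \(\{0,1\}^k\) with range size at most \(2^s\):
\[ \mathop {\mathbf {E}}_{k \leftarrow K}\Bigl [\max _{1 \le i \le q} \Delta \bigl (f'(\mathsf {FExt}(k,X_i)),\, f'(U)\bigr )\Bigr ] \le \varepsilon . \]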
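Since \(\Delta ((K,Y),(K,Z)) = \mathop {\mathbf {E}}_{k \leftarrow K}\bigl [\Delta (Y|_{K=k},\, Z|_{K=k})\bigr ]\) for any random variables Y and Z, the above implies that \((K,f'(\mathsf {FExt}(K,X_i))) \approx _{\varepsilon } (K,f'(U))\) for i depending on K (or, put differently, \((K,f'(\mathsf {FExt}(K,X_i))) \approx _{\varepsilon } (K,f'(U))\) holds for every i over the same choice of K).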
As a useful special case, we say that \(\mathsf {FExt}\) fools range-\(2^s\) regular distinguishers on \({\mathcal {X}}\) with probability \(1-\varepsilon \) (or is a regular \((s,\varepsilon )\)-fooling extractor for \({\mathcal {X}}\)) if we quantify only over regular \(f'\) in the definition. An adaptive regular \((s,\varepsilon )\)-fooling extractor for \({\mathcal {X}}\) is defined analogously.
We note that while the intuition given prior to the definition describes fooling the function \(f'\), the definition actually requires fooling an “implicit” or “external” distinguisher that sees both the output \(f'(\mathsf {FExt}(K,X_i))\) of \(f'\) and the extractor seed K. This is crucial for the definition to be meaningful. Indeed, just asking that \(f'(\mathsf {FExt}(K,X_i))\) be indistinguishable from \(f'(U)\) for all small-range functions \(f'\) is equivalent to asking only that \(\mathsf {FExt}(K,X_i)\) be indistinguishable from U. This latter requirement is trivial to achieve if one is not concerned with key length, for example by using K as a one-time pad.
We also note that the concept of fooling extractors was implicit in the work of Dodis and Smith [26] on error-correction without leaking partial information, whose “Crooked” Leftover Hash Lemma establishes in our language that a pairwise-independent function is an \((s,\varepsilon )\)-fooling extractor for every singleton \((n,\ell )\)-source X where \(s \le \ell - 2 \log (1/\varepsilon ) + 2\). This lemma was later applied in the context of deterministic public-key encryption by Boldyreva et al. [10], who also gave a simpler proof.
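The Crooked LHL can be checked numerically on a toy instance. In the sketch below, the pairwise-independent family is \(x \mapsto ax + b \bmod 17\), the source X is uniform on 8 points (min-entropy \(\ell = 3\)), and the distinguisher is the parity function (range size \(2^s\) with \(s = 1\)). These choices, including working over \({\mathbb {Z}}_{17}\) rather than bit strings, are illustrative assumptions. Rearranging \(s \le \ell - 2\log (1/\varepsilon ) + 2\) gives \(\varepsilon = \frac{1}{2}\sqrt{2^{s}/2^{\ell }} = 1/4\) for these values.

```python
from fractions import Fraction
from itertools import product

P = 17                     # outputs live in Z_P (illustrative; not bit strings)
SOURCE = list(range(8))    # X is uniform on 8 points, so its min-entropy is 3

def f(y: int) -> int:
    """A small-range distinguisher: the parity of y (range size 2)."""
    return y % 2

def distance_to_uniform(a: int, b: int) -> Fraction:
    """Statistical distance between f(a*X + b mod P) and f(U) for one fixed key (a, b)."""
    total = Fraction(0)
    for bit in (0, 1):
        p_ext = Fraction(sum(f((a * x + b) % P) == bit for x in SOURCE), len(SOURCE))
        p_uni = Fraction(sum(f(u) == bit for u in range(P)), P)
        total += abs(p_ext - p_uni)
    return total / 2

# Averaging over the uniform key K = (a, b) gives the distance between the
# joint distributions (K, f(Ext(K, X))) and (K, f(U)).
avg = sum(distance_to_uniform(a, b) for a, b in product(range(P), repeat=2)) / P**2
bound = Fraction(1, 4)     # (1/2) * sqrt(2^s / 2^l) with s = 1, l = 3

print(float(avg), float(bound))
```

Individual keys can be far from uniform (for instance a = 0 makes the output constant), but the average over keys stays within the bound, which is what the definition of a fooling extractor measures.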
The Result
To state our result, we first formalize the concept of encryption-compatible padding transforms.
Definition 3.2
Let \(\Pi \) be a padding transform generator whose coins are drawn from \(\mathsf {Coins}\). Define the associated function \(h_\Pi : \mathsf {Coins}\times \{0,1\}^{\mu + \rho } \rightarrow \{0,1\}^k\) by \(h_\Pi (cc,m \Vert r) = \pi (m \Vert r)\) for all \(cc \in \mathsf {Coins}, m \in \{0,1\}^\mu , r \in \{0,1\}^\rho \), where \((\pi ,{\hat{\pi }}) \leftarrow \Pi (1^k; cc)\). Define the class \({\mathcal {X}}_\Pi \) of encryption sources associated to \(\Pi \) as containing all sources of the form (m, R), where \(m \in \{0,1\}^\mu \) is fixed and \(R \in \{0,1\}^\rho \) is uniform. (Note that the class \({\mathcal {X}}_\Pi \) therefore contains \(2^\mu \) distinct \((\mu +\rho ,\rho )\)-sources.) We say that \(\Pi \) is \((s,\varepsilon )\)-encryption-compatible if \(h_\Pi \) as above is an adaptive \((s,\varepsilon )\)-fooling extractor for \({\mathcal {X}}_\Pi \). (Here \(\mathsf {Coins}\) plays the role of \(\{0,1\}^c\) in Definition 3.1.) A regular \((s,\varepsilon )\)-encryption-compatible padding transform generator is defined analogously.
Theorem 3.3
Let \(\mathsf {LTDP}= ({\mathcal {F}},{\mathcal {F}}')\) be an LTDP with residual leakage \(s\), and let \(\Pi \) be an \((s,\varepsilon )\)-encryption-compatible padding transform generator. Then, for any IND-CPA adversary A against \({\mathcal {AE}}_\Pi [{\mathcal {F}}]\), there is an adversary D against \(\mathsf {LTDP}\) such that for all \(k \in {\mathbb {N}}\)
Furthermore, the running-time of D is the time to run A.
Proof
Given \(A=(A_1, A_2)\), we define three games, called \(G_0,G_1,G_2\), in Fig. 1. Note that game \(G_0\) is the experiment \(\mathbf {Exp}^{\mathrm {ind\text{-}cpa}}_{\Pi ,A}(k)\) defining IND-CPA security. We claim that for a distinguisher D against \(\mathsf {LTDP}\) that is simple to construct, we have
from which the theorem follows by rearranging terms. So let us justify the above.
Equation (1) is true by the definition of IND-CPA security.
For (2) we can construct a distinguisher D as required since \(G_0, G_1\) do not use \(f^{-1}\) in any way.
Equation (3) is true by the definition of encryption compatibility. Namely, since \(h_\Pi \) in the definition is an adaptive \((s,\varepsilon )\)-fooling extractor for \({\mathcal {X}}_\Pi \), we know that the expectation over the coins cc of \(\Delta (f'(\pi (m \Vert R)), f'(U))\) is at most \(\varepsilon \) even for m depending on cc (and hence on \(\pi \)), where \((\pi ,{\hat{\pi }}) \leftarrow \Pi (1^k; cc)\), so in particular the bound holds for \(m = m_b\) in game \(G_1\).
Finally, (4) uses the fact that in \(G_2\) no information about b is given to A. Note that the final two steps in the proof are information-theoretic, meaning they do not use any assumption about A’s running-time. \(\square \)
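The four steps justified above fit the following chain, given here only as a hedged sketch. It assumes that the ind-cpa advantage is normalized as \(2\Pr [\text {win}] - 1\), that D outputs 1 exactly when A guesses b correctly, that \(G_1\) is \(G_0\) with a lossy \(f'\) in place of f, and that \(G_2\) replaces \(f'(\pi (m_b \Vert R))\) by \(f'(U)\). The games are specified in Fig. 1, which is not reproduced here, so these are assumptions rather than the paper's exact statement.

```latex
\begin{align*}
\tfrac{1}{2} + \tfrac{1}{2}\,\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{AE}_\Pi[\mathcal{F}],A}(k)
  &= \Pr[\,G_0 \Rightarrow b\,] && \text{(1)} \\
  &\le \Pr[\,G_1 \Rightarrow b\,] + \mathbf{Adv}^{\mathrm{ltdp}}_{\mathsf{LTDP},D}(k) && \text{(2)} \\
  &\le \Pr[\,G_2 \Rightarrow b\,] + \mathbf{Adv}^{\mathrm{ltdp}}_{\mathsf{LTDP},D}(k) + \varepsilon && \text{(3)} \\
  &= \tfrac{1}{2} + \mathbf{Adv}^{\mathrm{ltdp}}_{\mathsf{LTDP},D}(k) + \varepsilon && \text{(4)}
\end{align*}
```

Under these normalizations, rearranging bounds the ind-cpa advantage by \(2\,\mathbf {Adv}^{\mathrm {ltdp}}_{\mathsf {LTDP},D}(k) + 2\varepsilon \); the constants in the formal statement depend on how the advantages are normalized.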
Remark 3.4
The analogous result holds for regular LTDPs and regular encryption-compatible padding transforms. That is, if the LTDP is regular, then it suffices to use a regular encryption-compatible padding transform to obtain the same conclusion. The latter may be easier to design or more efficient than in the general case; indeed, we get better parameters for OAEP in the regular case in Sect. 4. Furthermore, known examples of LTDPs (including RSA, as shown in Sect. 5) are regular, although a technical issue about the domain of RSA versus the output range of OAEP makes it challenging to exploit this for RSA-OAEP; see Sect. 6.
OAEP as a Fooling Extractor
In this section, we show that the OAEP padding transform of Bellare and Rogaway [5] is encryption-compatible as defined in Sect. 3 if its initial hash function is t-wise independent for t depending on the message length and lossiness of the TDP.
OAEP
We recall the OAEP padding transform of Bellare and Rogaway [5], lifted to the “instantiated” setting, i.e., where its hash functions may be keyed. (The original scheme was defined for unkeyed hash functions.) Let \(G :{\mathcal {K}}_G \times \{0,1\}^\rho \rightarrow \{0,1\}^\mu \) and \(H :{\mathcal {K}}_H \times \{0,1\}^\mu \rightarrow \{0,1\}^\rho \) be hash functions. The associated padding transform generator \(\mathsf {OAEP}[G,H]\) on input \(1^k\) returns \((\pi _{K_G,K_H},{\hat{\pi }}_{K_G,K_H})\), where
and
, defined via
See Fig. 2 for a graphical illustration.
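As a rough illustration of the two-round structure just recalled, the following Python sketch implements a no-redundancy OAEP-style transform and its inverse, following the description \(s = m {\,\oplus \,}G(r)\), \(t = r {\,\oplus \,}H(s)\) from the Introduction. The names `oaep_pad`, `oaep_unpad`, `_keyed_xof` and the byte lengths are hypothetical, and SHAKE-256 with a prepended key is only a stand-in for the keyed hash functions: it is not t-wise independent, so the sketch fixes the interface rather than instantiating the analysis below.

```python
import hashlib

MU_BYTES = 16   # message length mu in bytes (illustrative assumption)
RHO_BYTES = 32  # randomness length rho in bytes (illustrative assumption)


def _keyed_xof(key: bytes, data: bytes, out_len: int) -> bytes:
    # Stand-in keyed hash: SHAKE-256 over a length-prefixed key and the input.
    # This is NOT t-wise independent; it only models K x {0,1}^* -> {0,1}^out.
    prefix = len(key).to_bytes(2, "big") + key
    return hashlib.shake_256(prefix + data).digest(out_len)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def oaep_pad(k_g: bytes, k_h: bytes, m: bytes, r: bytes) -> bytes:
    """Map (m, r) to s || t with s = m XOR G(K_G, r), t = r XOR H(K_H, s)."""
    if len(m) != MU_BYTES or len(r) != RHO_BYTES:
        raise ValueError("unexpected input length")
    s = _xor(m, _keyed_xof(k_g, r, MU_BYTES))
    t = _xor(r, _keyed_xof(k_h, s, RHO_BYTES))
    return s + t


def oaep_unpad(k_g: bytes, k_h: bytes, padded: bytes) -> bytes:
    """Invert oaep_pad: recover r from t, then m from s."""
    s, t = padded[:MU_BYTES], padded[MU_BYTES:]
    r = _xor(t, _keyed_xof(k_h, s, RHO_BYTES))
    return _xor(s, _keyed_xof(k_g, r, MU_BYTES))
```

In an encryption scheme the output of `oaep_pad` would be fed to the trapdoor permutation, and in the setting of this paper the key of G would be part of the public key.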
Remark 4.1
Since we mainly study IND-CPA security, for simplicity we define above the “no-redundancy” version of OAEP, i.e., corresponding to the “basic scheme” in [5]. However, all our results also hold for the redundant version. Additionally, as is typical in the literature, we have defined OAEP to apply the G-function to the least-significant bits of the input; in standards and implementations, it is typically the most-significant bits (where the order of m and r is switched). Again, we stress that our results hold in either case.
Analysis
The following establishes that OAEP is encryption-compatible if the hash function G is t-wise independent for appropriate t. No restriction is put on the other hash function H. Indeed, our result also applies to SAEP [13] (although the latter is neither standardized nor known to provide CCA security in the RO model, except in certain cases).
Theorem 4.2
Let \(G :{\mathcal {K}}_G \times \{0,1\}^\rho \rightarrow \{0,1\}^\mu \) and \(H :{\mathcal {K}}_H \times \{0,1\}^\mu \rightarrow \{0,1\}^\rho \) be hash functions, and suppose G is t-wise independent. Let \(\mathsf {OAEP}= \mathsf {OAEP}[G,H]\). Then

(1) \(\mathsf {OAEP}\) is \((s,\varepsilon )\)-encryption-compatible where \(\varepsilon = 2^{-u}\) for \(u = \frac{t}{3t+2}(\rho - s - \log t + 2) - \frac{2(\mu + s)}{3t+2} - 1\).

(2) \(\mathsf {OAEP}\) is regular \((s,\varepsilon )\)-encryption-compatible where \(\varepsilon = 2^{-u}\) for \(u = \frac{t}{2t+2}(\rho - s - \log t + 2) - \frac{\mu + s + 2}{t+1} - 1\).

(3) When \(t = 2\), \(\mathsf {OAEP}\) is \((s,\varepsilon )\)-encryption-compatible where \(\varepsilon = 2^{-u}\) for \(u = (\rho - s - 2\mu )/4 - 1\).
Note that parts (2) and (3) capture special cases of (1) in which we get better bounds. The techniques used in the proof were first developed in the context of the classical LHL by Trevisan and Vadhan [61] and Dodis, Sahai and Smith [25], though the style of presentation of our theorem statement and proof is inspired by Barak et al. [1, Lemma 1]. We mention that due to our use of (variants of) the Crooked LHL rather than the classical one and the structure of OAEP, some of the technical details differ in our case and require new ideas.
Corollary 4.3
Let \(G :{\mathcal {K}}_G \times \{0,1\}^\rho \rightarrow \{0,1\}^\mu \) and \(H :{\mathcal {K}}_H \times \{0,1\}^\mu \rightarrow \{0,1\}^\rho \) be hash functions and suppose that G is t-wise independent for \(t\ge 3 \frac{\mu +s}{\rho - s}\). Then \(\mathsf {OAEP}[G,H]\) is \((s,\varepsilon )\)-encryption-compatible where \(\varepsilon =\exp (-c(\rho - s - \log t))\) for a constant \(c>0\).
In particular, \(c\approx 1/2\) for regular functions. For such a function, if \(\rho - s\) is at least 180, then \(\varepsilon \) is roughly \(2^{-80}\) for \(t=10\) and message lengths \(\mu \le 2^{15}\) (which for practical purposes does not restrict the message space). Applying Theorem 3.3, we see that if G is 10-wise independent and the number of random bits used in OAEP is at least 180 bits larger than the residual leakage of the TDP, then the security of OAEP is tightly related to that of the lossy TDP.
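To make the three bounds of Theorem 4.2 easier to compare, the following Python sketch evaluates the exponent u (so that \(\varepsilon = 2^{-u}\)) for each part. The function names are hypothetical, logarithms are taken to base 2, and the formulas are transcribed from the theorem statement with all lengths in bits; this is a calculator for exploring parameters, not part of the proof.

```python
from math import log2


def u_general(rho: int, s: int, mu: int, t: int) -> float:
    """Exponent from part (1): arbitrary lossy function, t-wise independent G."""
    return t / (3 * t + 2) * (rho - s - log2(t) + 2) - 2 * (mu + s) / (3 * t + 2) - 1


def u_regular(rho: int, s: int, mu: int, t: int) -> float:
    """Exponent from part (2): regular lossy function, t-wise independent G."""
    return t / (2 * t + 2) * (rho - s - log2(t) + 2) - (mu + s + 2) / (t + 1) - 1


def u_pairwise(rho: int, s: int, mu: int) -> float:
    """Exponent from part (3): pairwise-independent G (t = 2)."""
    return (rho - s - 2 * mu) / 4 - 1


if __name__ == "__main__":
    # Illustrative parameters only (not taken from the paper).
    rho, s, mu, t = 231, 50, 38, 8
    print("part (1):", u_general(rho, s, mu, t))
    print("part (2):", u_regular(rho, s, mu, t))
    print("part (3):", u_pairwise(rho, s, mu))
```

A larger u means a smaller \(\varepsilon \); for the illustrative parameters above, the regular bound gives a noticeably larger exponent than the general one.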
Remark 4.4
To show security of OAEP against what we call key-independent chosen-plaintext attack, it suffices to argue that \(\mathsf {OAEP}[G,H]\) is a fooling extractor for any fixed encryption source \(X = (m,R)\) where \(m \in \{0,1\}^\mu \). The latter holds for any \(\varepsilon > 0\) and \(s \le \rho - 2 \log (1/\varepsilon ) + 2\) assuming G is only pairwise-independent (i.e., \(t = 2\)). See Appendix 8 for details.
Proof
(of Theorem 4.2) We now prove the above theorem.
Overview. We write \(\mathsf {OAEP}\) for \(\mathsf {OAEP}[G,H]\). The high-level idea for all three parts of the theorem is the same. Fix a lossy function \(f'\) with range size at most \(2^s\). We first show that for every fixed message \(m\in \{0,1\}^\mu \), with high probability (say \(1-\delta \)) over the choice of \(K_G\), the statistical distance between \(f'(\mathsf {OAEP}(m,R))\) and \(f'(U)\) is small (say \({\hat{\varepsilon }}\)). This aspect of the proof changes from part to part. We then take a union bound to show that the above holds for all messages over the same choice of \(K_G\) with probability at least \(1-2^\mu \delta \). This means that the statistical distance between the pair \((K_G,f'(\mathsf {OAEP}(m,R)))\) and \((K_G, f'(\mathsf {OAEP}(U)))\) is at most \(\varepsilon ={\hat{\varepsilon }}+2^{\mu }\delta \) for all messages over the same choice of \(K_G\). Finally, we express \(\delta \) as a function of \({\hat{\varepsilon }}\), and select \({\hat{\varepsilon }}\) to minimize this sum. Note that the entire argument works for any choice of H.
We first prove part (3) of the theorem, then part (2), and finally part (1).
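The last step of the overview, trading \({\hat{\varepsilon }}\) against the failure probability, can be sketched numerically. The Python snippet below assumes, as in the proofs that follow, that the failure probability has the form \(\gamma \cdot {\hat{\varepsilon }}^{-t}\) (with \(t = 1\) in the Markov-based argument) and balances the two terms by setting \({\hat{\varepsilon }} = \gamma ^{1/(t+1)}\). The function name `balanced_epsilon` is hypothetical, and the choice balances the two terms rather than computing the exact minimizer.

```python
def balanced_epsilon(gamma: float, t: int = 1) -> tuple[float, float]:
    """Return (eps_hat, eps) for eps = eps_hat + gamma * eps_hat**(-t).

    eps_hat is chosen as gamma**(1/(t+1)), which makes both terms equal,
    so that eps = 2 * gamma**(1/(t+1)).
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma is expected to lie in (0, 1)")
    eps_hat = gamma ** (1.0 / (t + 1))
    delta = gamma * eps_hat ** (-t)   # failure probability after the union bound
    return eps_hat, eps_hat + delta


if __name__ == "__main__":
    eps_hat, eps = balanced_epsilon(2.0 ** -40, t=1)
    print(eps_hat, eps)
```

With this choice both terms contribute equally, which is where the factor 2 in the bounds \(\varepsilon \le 2\gamma ^{1/2}\) and \(\varepsilon \le 2\gamma ^{1/(t+1)}\) comes from.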
Proof of part (3). To prove part (3) of the theorem, we strengthen the Crooked LHL of [26] to give the distinguisher access to the input to the fooling function as well as its output.
Lemma 4.5
(Augmented Crooked LHL.) Let \(h :{\mathcal {K}}\times A \rightarrow B\) be a pairwise-independent function and let \(g :A \times B \rightarrow S\) be a function. Let X be a random variable on A such that \(\mathrm {H}_\infty (X) \ge \lg |S| + 2\lg (1/{\hat{\varepsilon }}) - 2\) for some \({\hat{\varepsilon }} > 0\). Then
where
and .
The proof, which extends the proof of the Crooked LHL given in [10], is in Appendix 1.
Now we let G play the role of h in Lemma 4.5 and let \(\{0,1\}^\rho \) and \(\{0,1\}^\mu \) play the roles of A and B, respectively. Let g in the lemma be defined by \(g(a,b) = f(m {\,\oplus \,}a \Vert b {\,\oplus \,}H(K_H, m {\,\oplus \,}a))\) for arbitrary but fixed \(m \in \{0,1\}^\mu , K_H \in {\mathcal {K}}_H\). It follows that OAEP is an \((s,{\hat{\varepsilon }})\)-fooling extractor for every fixed encryption source X of the form (m, R). Part (3) of the theorem now follows by applying Markov’s inequality and taking a union bound over all such sources.
In more detail, let \(f'\) be any function on \(\{0,1\}^k\) to a set \({\mathcal {Y}}\) of size at most \(2^s\), and let \(X = (m,R)\) be any \((\mu + \rho ,\rho )\)-source, where \(m \in \{0,1\}^\mu \) is fixed and R is uniform over \(\{0,1\}^\rho \). Define random variable \(Z_{K_G,K_H}\) to take value \(\Delta (f'(\pi _{k_G,k_H}(m \Vert R)), f'(U))\) for U uniform on \(\{0,1\}^k\), if \(K_G = k_G\) and \(K_H = k_H\), where here and in what follows the probability is over the random choices of \(K_G\) and \(K_H\) (although the distribution on \(K_H\) does not matter; we use only the fact that it is independent of \(m,R,K_G\)). Then applying Lemma 4.5 as explained above, we have \({\mathbf{E}}\left[ \, Z_{K_G,K_H} \,\right] \le \frac{1}{2} \sqrt{|S|\cdot 2^{-\rho }}\). Thus by Markov’s inequality
for any \({\hat{\varepsilon }} > 0\). By a union bound, the probability that the above holds simultaneously for all \(2^\mu \) possible \((\mu + \rho ,\rho )\)-sources \(X = (m,R)\) is at least \(1-\delta _{{\hat{\varepsilon }}}\), where
It now follows (by a conditioning argument) that \(\mathsf {OAEP}\) is \((s,\varepsilon )\)-encryption-compatible with \(\varepsilon ={\hat{\varepsilon }}+\delta _{{\hat{\varepsilon }}}\). Note that \(\delta _{{\hat{\varepsilon }}}\) can be written in the form \( \gamma \cdot {\hat{\varepsilon }}^{-1}\) (where \(\gamma \) depends on \(\rho ,s,\mu \) but not \({\hat{\varepsilon }}\)). Setting \({\hat{\varepsilon }}=\gamma ^{1/2}\) yields \(\varepsilon \le 2 \gamma ^{1/2}\) and part (3) of the Theorem follows by observing that
Proof of part (2). Instead of Markov’s inequality, the proof of part (2) of the theorem uses a stronger tail inequality for t-wise independent random variables, due to Bellare and Rompel [7] (our application was inspired by the use of t-wise independence by Trevisan and Vadhan [61] and Dodis, Sahai, and Smith [25]).
Let \(f'\) be any function on \(\{0,1\}^k \) to a set \({\mathcal {Y}}\) of size at most \(2^s\). For this part of the theorem, assume that \(f'\) is regular, that is, that each preimage set has size exactly \(2^{k-s}\). Let \(X = (m,R)\) be any \((\mu + \rho ,\rho )\)-source, where \(m \in \{0,1\}^\mu \) is fixed and R is uniform over \(\{0,1\}^\rho \). For each \(r \in \{0,1\}^\rho \) and \(y \in {\mathcal {Y}}\), define the random variable
where as before the probability is over the random choices of \(K_G\) and \(K_H\) (although as before the distribution on \(K_H\) does not matter; we use only the fact that it is independent of \(m,R,K_G\)). Let \(Z_y = \sum _r Z_{r,y}\). We claim that \({\mathbf{E}}\left[ \, Z_y \,\right] = 2^{-s}\). To see this, note that
where we use the fact that R is uniform and \(f'\) is regular.
To bound the deviation of \(Z_y\) from its mean, note that for a fixed y, the variables \(\{Z_{r,y}\}_{r\in \{0,1\}^{\rho }}\) are t-wise independent (by the t-wise independence of G) and take values in \([0,2^{-\rho }]\). We can apply the following tail bound (modified from the original to apply to random variables in \([0,M]\) rather than [0, 1]).
Lemma 4.6
(Bellare and Rompel [7]) Let \(A_1, \ldots , A_n\) be t-wise independent random variables taking values in \([0,M]\). Let \(A = \sum _i A_i\) and \(\delta \le 1\). Then
where \(c_t < 3\) and \(c_t < 1\) when \(t \ge 8\).
Setting \(\delta = 2 {\hat{\varepsilon }}\), we get that for every \(y\in {\mathcal {Y}}\),
By a union bound, the probability that there exists a \(y\in {\mathcal {Y}}\) such that \(|Z_y - 2^{-s}| \ge 2 {\hat{\varepsilon }} \cdot 2^{-s}\) is at most
Observe that if \(|Z_y - 2^{-s}| < 2 {\hat{\varepsilon }} \cdot 2^{-s}\) for all \(y \in {\mathcal {Y}}\) then, letting Y denote the random variable \(f'(\pi _{K_G,K_H}(m, R))\), we have
By another union bound, the probability that the above holds simultaneously for all \(2^\mu \) possible \((\mu + \rho ,\rho )\)-sources \(X = (m,R)\) is at least \(1-\delta _{{\hat{\varepsilon }}}\), where
It now follows (by a conditioning argument) that \(\mathsf {OAEP}\) is \((s,\varepsilon )\)-encryption-compatible with \(\varepsilon ={\hat{\varepsilon }}+\delta _{{\hat{\varepsilon }}}\). Note that \(\delta _{{\hat{\varepsilon }}}\) can be written in the form \( \gamma \cdot {\hat{\varepsilon }}^{-t}\) (where \(\gamma \) depends on \(t,\rho ,s,\mu \) but not \({\hat{\varepsilon }}\)). Setting \({\hat{\varepsilon }}=\gamma ^{1/(t+1)}\) yields \(\varepsilon \le 2 \gamma ^{1/(t+1)}\) and part (2) of the Theorem follows by observing that
Proof of part (1). We now turn to proving the theorem for general (not necessarily balanced) functions \(f'\). We first give a proof for approximately balanced functions, in which no preimage set is too small; we then show that this implies a bound for arbitrary functions.
Assume for now that \(\min _{y \in {\mathcal {Y}}} |\mathsf {preimg}_{f'}(y)| \ge \lambda \cdot 2^{k-s}\) for some real number \(0<\lambda \le 1\) (note that regularity corresponds to \(\lambda = 1\)), where \(\mathsf {preimg}_{f'}(y) = \{x \in \{0,1\}^k \mid f'(x) =y\}\). We sketch how to modify the proof of part (2) under this assumption; essentially, we end up with an extra factor of \(\lambda \) in the denominator of Eq. (6). We use the same definition of \(Z_y\) as in part (2). Instead of \({\mathbf{E}}\left[ \, Z_y \,\right] = 2^{-s}\), we now have \({\mathbf{E}}\left[ \, Z_y \,\right] = \Pr \left[ \, f'(U \Vert R) = y \,\right] = |\mathsf {preimg}_{f'}(y)|/2^k\). Thus, instead of Eq. (5), we have
Using \(\min _{y \in {\mathcal {Y}}} |\mathsf {preimg}_{f'}(y)| \ge \lambda \cdot 2^{k-s}\) and taking a union bound, we get that the probability that there exists \(y\in {\mathcal {Y}}\) such that
is at most
We can obtain a bound for arbitrary functions \(f'\) by noting that every function \(f'\) is “close” to a function with no small preimages. Specifically:
Claim 4.7
Let \(f' :\{0,1\}^k \rightarrow {\mathcal {Y}}\) where \(|{\mathcal {Y}}| \le 2^s\) be a function. For any real number \(\lambda >0\), there exists a function \(g' :\{0,1\}^k \rightarrow {\mathcal {Y}}\) such that (i) \(\min _{y \in {\mathcal {Y}}} |\mathsf {preimg}_{g'}(y)| \ge \lambda \cdot 2^{k-s}\); and (ii) the function \(g'\) agrees with \(f'\) on a \(1-\lambda \) fraction of its domain. In particular, \(\Delta (f'(U),g'(U)) \le \lambda \).
We can now prove part (1) of the Theorem from Eq. (8) by choosing \(\lambda = {\hat{\varepsilon }}\) in the claim and then completing the analysis as in part (2). It remains to prove the claim.
Proof (of Claim 4.7): The idea is that we will take all the small preimage sets of \(f'\) and merge them together with some larger preimage set (e.g., if 0 has a large preimage set, then for all elements x such that \(|\mathsf {preimg}_{f'}(f'(x))|\) is small, we set \(g'(x)=0\)). How many elements can belong to small preimage sets? There are at most \(2^s\) preimage sets, and each small one contains fewer than \(\lambda \cdot 2^{k-s}\) elements. So there are at most \(\lambda \cdot 2^k\) elements of the domain on which \(f'\) has to be changed.\(\square \)
This concludes the proof of the Theorem.
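The merging argument behind Claim 4.7 can be illustrated on a toy function. The Python sketch below represents \(f'\) as a list of outputs indexed by the \(2^k\) domain points, treats a preimage set as small when it has fewer than \(\lambda \cdot 2^{k-s}\) elements, and redirects those points to the output with the largest preimage. The function name `merge_small_preimages` and this list representation are assumptions made for illustration only.

```python
from collections import Counter


def merge_small_preimages(f: list, k: int, s: int, lam: float) -> list:
    """Return g' (as a list) obtained from f' by merging small preimage sets.

    f[x] is f'(x) for x in range(2**k); the range has at most 2**s points.
    """
    if len(f) != 2 ** k:
        raise ValueError("f must list one output per domain point")
    threshold = lam * 2 ** (k - s)
    sizes = Counter(f)
    # The largest preimage has at least 2**(k-s) elements, so for lam <= 1
    # it is never itself below the threshold.
    big = max(sizes, key=sizes.get)
    return [y if sizes[y] >= threshold else big for y in f]


if __name__ == "__main__":
    # Toy function on 16 points with preimage sizes 10, 4, 1, 1.
    f = [0] * 10 + [1] * 4 + [2] + [3]
    g = merge_small_preimages(f, k=4, s=2, lam=0.5)
    changed = sum(a != b for a, b in zip(f, g))
    print(changed, Counter(g))
```

On this toy input only two domain points are changed, well within the \(\lambda \cdot 2^k\) budget, and every remaining preimage set meets the threshold.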
Lossiness of RSA
In this section, we show that the RSA trapdoor permutation is lossy under reasonable assumptions. In particular, we show that, for large enough encryption exponent e, RSA is considerably lossy under the \(\Phi \)-Hiding Assumption of [16]. We then show that by generalizing this assumption to multi-prime RSA we can get even more lossiness. Finally, we propose a “Two-Or-m-Primes” Assumption that, when combined with the former, amplifies the lossiness of standard (two-prime) RSA for small e.
Background on RSA and Notation
We denote by \({\mathcal {RSA}}_k\) the set of all tuples (N, p, q) such that \(N=pq\) is the product of two distinct k/2-bit primes. Such an N is called an RSA modulus. By \((N,p,q) \leftarrow _{\$} {\mathcal {RSA}}_k\) we mean that (N, p, q) is sampled according to the uniform distribution on \({\mathcal {RSA}}_k\). An RSA TDP generator [53] is an algorithm \({\mathcal {F}}\) that returns (N, e), (N, d), where N is an RSA modulus and \(ed \equiv 1 \pmod {\phi (N)}\). (Here \(\phi (\cdot )\) denotes Euler’s totient function, so in particular \(\phi (N) = (p-1)(q-1)\).) The tuple (N, e) defines the permutation on \({{\mathbb {Z}}}_N^*\) given by \(f(x)=x^e \bmod N\), and similarly (N, d) defines its inverse. We say that a lossy TDP generator \(\mathsf {LTDP}= ({\mathcal {F}}, {\mathcal {F}}')\) is an RSA LTDP if \({\mathcal {F}}\) is an RSA TDP generator.
To define the \(\Phi \)-Hiding Assumption and later some extensions of it, the following notation is also useful. For \(i \in {\mathbb {N}}\) we denote by \({\mathcal {P}}_i\) the set of all i-bit primes. Let R be a relation on p and q. By \({\mathcal {RSA}}_k[R]\) we denote the subset of \({\mathcal {RSA}}_k\) for which the relation R holds on p and q. For example, let e be a prime. Then \({\mathcal {RSA}}_k[p=1 \bmod e]\) is the set of all (N, p, q), where \(N=pq\) is the product of two distinct k/2-bit primes p, q and \(p=1 \bmod e\). That is, the relation R(p, q) is true if \(p=1 \bmod e\) and q is arbitrary. By \((N,p,q) \leftarrow _{\$} {\mathcal {RSA}}_k[R]\) we mean that (N, p, q) is sampled according to the uniform distribution on \({\mathcal {RSA}}_k[R]\).
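For concreteness, a toy RSA TDP generator in the sense just described might look as follows in Python. The primes are passed in explicitly and are far too small for any security; the names `rsa_keygen_toy` and `rsa_apply` are hypothetical, and the sketch only illustrates the relation \(ed \equiv 1 \pmod {\phi (N)}\) and the resulting permutation and inverse.

```python
from math import gcd


def rsa_keygen_toy(p: int, q: int, e: int):
    """Return ((N, e), (N, d)) with e*d = 1 mod phi(N), for given primes p != q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to phi(N) in injective mode")
    d = pow(e, -1, phi)  # modular inverse (Python 3.8+)
    return (n, e), (n, d)


def rsa_apply(key, x: int) -> int:
    """Evaluate x -> x^exp mod N for key = (N, exp)."""
    n, exp = key
    return pow(x, exp, n)


if __name__ == "__main__":
    pk, sk = rsa_keygen_toy(61, 53, 17)
    y = rsa_apply(pk, 65)
    print(pk, sk, y, rsa_apply(sk, y))
```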
RSA Lossy TDP from \(\Phi \)-Hiding
\(\Phi \)-Hiding Assumption (\(\Phi \)A). We recall the \(\Phi \)-Hiding Assumption of [16]. For an RSA modulus N, we say that N \(\phi \)-hides a prime e if \(e \mid \phi (N)\). Intuitively, the assumption is that, given RSA modulus N, it is hard to distinguish primes which are \(\phi \)-hidden by N from those that are not. Formally, let \(0<c < 1/2\) be a (public) constant determined later. Consider the following two distributions:
To a distinguisher D, we associate its \(\Phi A\) advantage defined as
As shown in [16], distributions \({\mathcal {R}}_1, {\mathcal {L}}_1\) can be sampled efficiently assuming the widely accepted Extended Riemann Hypothesis (as we need a density estimate on the number of primes of a particular form).^{Footnote 9}
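The sampling procedure mentioned in Footnote 9 (repeatedly choosing x until \(p = xe + 1\) is prime) can be sketched as follows. This is a toy Python version under stated assumptions: the primality test is plain trial division, suitable only for very small sizes, and the names `is_prime` and `sample_prime_hiding` as well as the convention of forcing the top bit of x are illustrative choices, not taken from [16].

```python
import random


def is_prime(n: int) -> bool:
    """Trial-division primality test; adequate for toy sizes only."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def sample_prime_hiding(e: int, x_bits: int, rng: random.Random) -> int:
    """Sample a prime p = x*e + 1, so that e divides p - 1."""
    while True:
        # Force the top bit so that x has exactly x_bits bits.
        x = rng.getrandbits(x_bits) | (1 << (x_bits - 1))
        p = x * e + 1
        if is_prime(p):
            return p


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducibility of the demo
    p = sample_prime_hiding(65537, 12, rng)
    print(p, (p - 1) % 65537)
```

A real implementation would use a probabilistic primality test and cryptographic randomness; the expected number of iterations is governed by the density of primes in the progression, which is where the Extended Riemann Hypothesis enters.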
RSA LTDP from \(\Phi \)A. We construct an RSA LTDP based on \(\Phi \)A. In injective mode the public key is (N, e) where e is not \(\phi \)-hidden by N, whereas in lossy mode it is. Namely, define \(\mathsf {LTDP}_1 = ({\mathcal {F}}_1, {\mathcal {F}}'_1)\) as follows:
The fact that algorithm \({\mathcal {F}}_1\) has only a very small probability of failure (returning \(\bot \)) follows from the fact that \(\phi (N)\) can have only a constant number of prime factors of length ck and Bertrand’s Postulate.
Proposition 5.1
Suppose there is a distinguisher D against \(\mathsf {LTDP}_1\). Then there is a distinguisher \(D'\) such that for all \(k\in {\mathbb {N}}\)
Furthermore, the running time of \(D'\) is that of D. \(\mathsf {LTDP}_1\) has lossiness ck.
The proof is straightforward.
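The difference between the two modes can be observed by brute force on toy moduli. The Python sketch below counts the image of \(x \mapsto x^e \bmod N\) on \({{\mathbb {Z}}}_N^*\): when e is coprime to \(\phi (N)\) the map is a permutation, and when a prime e divides \(p-1\) but not \(q-1\) the image shrinks by a factor of e, i.e., \(\log e\) bits are lost. The function name `image_size` and the tiny moduli (whose factors are not of equal bit-length) are illustrative assumptions.

```python
from math import gcd, log2


def image_size(n: int, e: int) -> int:
    """Number of distinct values of x^e mod n over x in Z_n^* (brute force)."""
    return len({pow(x, e, n) for x in range(1, n) if gcd(x, n) == 1})


if __name__ == "__main__":
    e = 5
    # Injective mode: N = 7 * 13, phi(N) = 72, gcd(5, 72) = 1.
    print("injective:", image_size(7 * 13, e), "of", 6 * 12)
    # Lossy mode: N = 11 * 7, and 5 divides 11 - 1 but not 7 - 1.
    lossy = image_size(11 * 7, e)
    print("lossy:", lossy, "of", 10 * 6, "bits lost:", log2(10 * 6 / lossy))
```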
From a practical perspective, a drawback of \(\mathsf {LTDP}_1\) is that \({\mathcal {F}}_1\) chooses \(N = pq\) in a nonstandard way, so that it hides a prime of the same length as e. Moreover, for small values of e it returns \(\bot \) with high probability. This is done for consistency with how [16] formulated \(\Phi \)A. But, to address this, we also propose what we call the Enhanced \(\Phi \)A (E\(\Phi \)A), which says that N generated in the nonstandard way (i.e., by \({\mathcal {F}}_1\)) is indistinguishable from one chosen at random subject to \(\gcd (e,\phi (N)) = 1\).^{Footnote 10} We conjecture that E\(\Phi \)A holds for all values of c that \(\Phi \)A does. Details follow.
Enhanced \(\Phi \)-Hiding Assumption. We say that the Enhanced \(\Phi \)-Hiding Assumption (E\(\Phi \)A) holds for c if the following two distributions \({\mathcal {R}}_{1^*}\) and \({\mathcal {L}}_{1^*}\) are computationally indistinguishable:
To a distinguisher D, we associate its E\(\Phi \)A advantage defined as
As before, distributions \({\mathcal {R}}_{1^*}, {\mathcal {L}}_{1^*}\) can be sampled efficiently assuming the widely accepted Extended Riemann Hypothesis. We conjecture that E\(\Phi \)A holds for all values of \({\mathcal {K}}_\phi , c\) that \(\Phi \)A does.
RSA LTDP from E\(\Phi \)A. Now define \(\mathsf {LTDP}_{1^*} = ({\mathcal {F}}_{1^*}, {\mathcal {F}}'_{1^*})\) where
and \({\mathcal {F}}'_{1^*} = {\mathcal {F}}'_1\) in Sect. 5.2. Again, the probability that \({\mathcal {F}}_{1^*}\) returns \(\bot \) is very small. We stress that \({\mathcal {F}}_{1^*}\), unlike \({\mathcal {F}}_1\), chooses p, q at random as is typical in practice. We have the following proposition.
Proposition 5.2
If the Enhanced \(\Phi \)-Hiding Assumption holds for c, then \(\mathsf {LTDP}_{1^*}=({\mathcal {F}}_{1^*}, {\mathcal {F}}'_{1^*})\) is an RSA LTDP with lossiness ck. In particular, suppose there is a distinguisher D against \(\mathsf {LTDP}_{1^*}\). Then there is a distinguisher \(D'\) such that
Furthermore, the running time of \(D'\) is that of D.
Again, the proof is straightforward.
Parameters for \(\mathsf {LTDP}_1\). When e is too large, \(\Phi \)A can be broken by using Coppersmith’s method for finding small roots of a univariate polynomial modulo an unknown divisor of N [21, 43]. Namely, consider the polynomial \(r(x) = e x + 1 \bmod p\). Coppersmith’s method allows us to find all roots of r smaller than \(N^{1/4}\), and thus factor N, in lossy mode in polynomial time if \(c \ge 1/4\). (This is essentially the “factoring with high bits known” attack.) More specifically, applying [43, Theorem 1], N can be factored in time \(\mathrm{poly}(\log N)\) and \(O(N^\varepsilon )\) if \(c = 1/4 - \varepsilon \) (i.e., \(\log e \ge (1/4-\varepsilon )\log N\)). For example, with modulus size \(k = 2048\), we can set \(\varepsilon = .04\) for 80-bit security (to enforce \(k \varepsilon \ge 80\)) and obtain \(2048 (1/4-0.04) \approx 430\) bits of lossiness.
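The arithmetic in this example is easy to reproduce. The short Python sketch below takes the modulus size k and the margin \(\varepsilon \), and returns the attack-cost exponent \(k\varepsilon \) together with the lossiness \(ck\) for \(c = 1/4 - \varepsilon \), rounded down to whole bits; the function name `phi_hiding_parameters` is hypothetical.

```python
def phi_hiding_parameters(k: int, eps: float) -> tuple[float, int]:
    """Return (k * eps, floor(c * k)) for c = 1/4 - eps."""
    c = 0.25 - eps
    if c <= 0:
        raise ValueError("eps must be smaller than 1/4")
    return k * eps, int(c * k)


if __name__ == "__main__":
    margin, lossiness = phi_hiding_parameters(2048, 0.04)
    print(margin, lossiness)
```

Increasing \(\varepsilon \) buys a larger security margin against the Coppersmith-style attack at the cost of fewer lossy bits.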
RSA Lossy TDP from Multi-prime \(\Phi \)-Hiding
Multi-prime RSA (according to [41] the earliest reference is [54]) is a generalization of RSA to moduli \(N = p_1 \cdots p_m\) of length k with \(m \ge 2\) prime factors of equal bit-length. Multi-prime RSA is of interest to practitioners since it allows decryption to be sped up and is included in RSA PKCS #1 v2.1. We are interested in it here because for it we can show greater lossiness, in particular with smaller encryption exponent e.
Notation and terminology. Let \(m \ge 2\) be fixed. We denote by \({\mathcal {MRSA}}_k\) the set of all tuples \((N,p_1,\ldots ,p_m)\), where \(N=p_1\cdots p_m\) is the product of distinct k/m-bit primes. Such an N is called an m-prime RSA modulus. By \((N,p_1,\ldots ,p_m) \leftarrow _{\$} {\mathcal {MRSA}}_k\) we mean that \((N,p_1,\ldots ,p_m)\) is sampled according to the uniform distribution on \({\mathcal {MRSA}}_k\). The rest of the notation and terminology of Sect. 5 is extended to the multi-prime setting in the obvious way.
Multi \(\Phi \)-Hiding Assumption. For an m-prime RSA modulus N, let us say that N \(m \phi \)-hides a prime e if \(e \mid p_i-1\) for all \(1 \le i \le m-1\). Intuitively, the assumption is that, given such N, it is hard to distinguish primes which are \(m \phi \)-hidden by N from those that do not divide \(p_i-1\) for any \(1 \le i \le m\). Formally, let \(m = m(k) \ge 2\) be a polynomial and let \(c = c(k)\) be an inverse polynomial determined later. Consider the following two distributions:
Above and in what follows, by \(p_{i \le m-1} = 1 \bmod e\) we mean that \(p_i = 1 \bmod e\) for all \(1 \le i \le m-1\). To a distinguisher D, we associate its M\(\Phi \)A advantage defined as
As before, distributions \({\mathcal {R}}_2, {\mathcal {L}}_2\) can be sampled efficiently assuming the widely accepted Extended Riemann Hypothesis.
Note that if we had required that in the lossy case \(N = p_1 \cdots p_m\) is such that \(e \mid p_i-1\) for all \(1 \le i \le m\), then in this case we would always have \(N = 1 \bmod e\). But in the injective case \(N \bmod e\) is random, which would lead to a trivial distinguishing algorithm. This explains why we do not impose \(e \mid p_m-1\) in the lossy case above.
Multi-prime RSA LTDP from M\(\Phi \)A. We construct a multi-prime RSA LTDP based on M\(\Phi \)A having lossiness \((m-1) \log e\), where in lossy mode N \(m\phi \)-hides e. Namely, define \(\mathsf {LTDP}_2 = ({\mathcal {F}}_2, {\mathcal {F}}'_2)\) as follows:
Proposition 5.3
Suppose there is a distinguisher D against \(\mathsf {LTDP}_2\). Then there is a distinguisher \(D'\) such that for all \(k \in {\mathbb {N}}\)
Furthermore, the running time of \(D'\) is that of D. \(\mathsf {LTDP}_2\) has lossiness \((m-1)ck\).
The proof is straightforward.
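The \((m-1)\log e\) figure can be checked on a toy multi-prime modulus using the Chinese Remainder Theorem: on each cyclic factor \({{\mathbb {Z}}}_{p_i}^*\) the map \(x \mapsto x^e\) is \(\gcd (e, p_i-1)\)-to-1. The Python sketch below computes the resulting lossiness in bits from the factorization; the function name `lossiness_bits` and the small primes are assumptions made purely for illustration. It also shows the observation above that \(N = 1 \bmod e\) whenever every prime factor is \(1 \bmod e\).

```python
from math import gcd, log2, prod


def lossiness_bits(primes: list, e: int) -> float:
    """log2 of (domain size / image size) of x -> x^e on Z_N^*, N = prod(primes)."""
    phi = prod(p - 1 for p in primes)
    image = prod((p - 1) // gcd(e, p - 1) for p in primes)
    return log2(phi / image)


if __name__ == "__main__":
    e = 5
    # Lossy mode with m = 3: e divides p_1 - 1 and p_2 - 1 but not p_3 - 1.
    print(lossiness_bits([11, 31, 7], e))   # (m - 1) * log2(e) bits
    # Injective mode: e divides none of the p_i - 1.
    print(lossiness_bits([7, 13, 17], e))
    # If e divided every p_i - 1, then N mod e would always be 1.
    print(prod([11, 31, 41]) % e)
```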
Parameters for \(\mathsf {LTDP}_2\). Using [35, Section 3] we can break the M\(\Phi \)A in time \(\mathrm{poly}(\log N)\) and \(O(N^{\varepsilon })\) if
For \(m \ge 3\) this improves the bound with \(c \ge 1/m-1/m^2-\varepsilon \) obtained from “factoring with high bits known”; for \(m\ge 4\) this improves the bound with \(c \ge 1/m - 2\frac{ (1/m)^{1/(m-1)} - (1/m)^{m/(m-1)}}{m(m-1)}-\varepsilon \) from the preliminary version [37]. We also note that Tosu and Kunihiro [60] showed a bound with \(c \ge 1/m - \frac{2}{em \log (m+1)}\) where e is the base of the natural logarithm, which is better than [35] for \(m \ge 6\) (see [60, Section 4.4] for comparison).
For example, with modulus size \(k=2048\) and \(m=3\) (\(m=4,5\)) we set \(\varepsilon = .04\) (for about 80-bit security) and obtain 676 (778, 822) bits of lossiness for \(\mathsf {LTDP}_2\), according to Proposition 5.3.
Small-Exponent RSA LTDP from 2-vs-m Primes
For efficiency reasons, the public RSA exponent e is typically not chosen to be too large in practice. (For example, researchers at UC San Diego [63] found that 99.5 % of the certificates in the campus’s TLS corpus had \(e = 2^{16} +1\).) Therefore, we investigate the possibility of using an additional assumption to “amplify” the lossiness of RSA for small e.
Our high-level idea is to assume that it is hard to distinguish \(N = pq\) where p, q are primes of length k/2 from \(N = p_1 \cdots p_m\) for \(m > 2\), where \(p_1, \ldots , p_m\) are primes of length k/m (which we call the “2-vs-m Primes” Assumption). This assumption is a generalization of the “2-vs-3 Primes” Assumption introduced in [8] and used independently to construct a “slightly lossy” TDF based on modular squaring [45]. Combined with the M\(\Phi \)A Assumption of Sect. 5.3, we obtain \((m-1) \log e\) bits of lossiness from standard (two-prime) RSA. Let us state our assumption and construction formally.
2-vs-m Primes Assumption. We say that the 2-vs-m primes assumption holds for m if the following two distributions \({\mathcal {N}}_2\) and \({\mathcal {N}}_m\) are computationally indistinguishable:
To a distinguisher D, we associate its HFA advantage defined as
RSA LTDP from 2-vs-m Primes + M\(\Phi \)A. Define \(\mathsf {LTDP}_3 = ({\mathcal {F}}_3, {\mathcal {F}}'_3)\) as follows:
Proposition 5.4
If the 2-vs-m Primes Assumption holds for m and the Multi-Prime \(\Phi \)-Hiding Assumption holds for m, e, then \(\mathsf {LTDP}_3=({\mathcal {F}}_3, {\mathcal {F}}'_3)\) is an RSA LTDP with lossiness \((m-1)ck\). In particular, suppose there is a distinguisher D against \(\mathsf {LTDP}_3\). Then there are distinguishers \(D_1, D_2\) such that
Furthermore, the running time of \(D_1,D_2\) is that of D.
Again, the proof is straightforward.
Parameters for \(\mathsf {LTDP}_3\). We note that m in the construction cannot be too large; otherwise, a small factor of N in the lossy case can be recovered by the elliptic curve factoring method due to Lenstra [41], whose running time depends mainly on the size of the smallest prime factor of N. The largest factor recovered by the method so far was 223 bits in length [64]. Thus, for example using 2048-bit RSA with \(e = 2^{16} + 1\), if we assume it is hard to recover factors larger than that we can get \(8 \cdot 16 = 128\) bits of lossiness under the HFA plus M\(\Phi \)A where \(m = 9\).
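The choice \(m = 9\) and the 128-bit figure follow from simple arithmetic, sketched below in Python. The function names are hypothetical; the sketch assumes that each of the m prime factors has k/m bits, that factors of at least 223 bits are out of reach of the elliptic curve method, and that \(\log e\) is rounded down to whole bits.

```python
def max_num_primes(k: int, min_factor_bits: int) -> int:
    """Largest m such that each of the m prime factors has at least min_factor_bits bits."""
    return k // min_factor_bits


def small_exponent_lossiness(m: int, e: int) -> int:
    """(m - 1) * floor(log2(e)) bits of lossiness in lossy mode."""
    return (m - 1) * (e.bit_length() - 1)


if __name__ == "__main__":
    k = 2048
    e = 2 ** 16 + 1          # the common public exponent mentioned earlier in this section
    m = max_num_primes(k, 223)
    print(m, k / m, small_exponent_lossiness(m, e))
```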
Enhanced HFA. As in the previous cases, to address the fact that in practice \(N = pq\) is chosen at random and not subject to p hiding a prime of the same bit-length as e, we may define an enhanced version of HFA. Then under the enhanced HFA + enhanced M\(\Phi \)A assumptions, we obtain the same amount of lossiness for standard 2-prime RSA.
Instantiating RSA-OAEP
By combining the results of Sects. 3, 4, and 5, we obtain standard model instantiations of RSA-OAEP under chosen-plaintext attack.
Regularity. In particular, we would like to apply part (2) of Theorem 4.2 in this case, as it is not hard to see that under all of the assumptions discussed in Sect. 5, RSA is a regular lossy TDP on the domain \({{\mathbb {Z}}}_N^*\). Unfortunately, this is different from \(\{0,1\}^{\rho + \mu }\) (identified as integers), the range of OAEP. In RSA PKCS #1 v2.1, the mismatch is handled by selecting \(\rho +\mu = \lfloor \log N\rfloor - 16\), and viewing OAEP’s output as an integer less than \(2^{\rho +\mu }<N/2^{16}\) (i.e., the most significant two bytes of the output are zeroed out). The problem is that in the lossy case RSA may not be regular on the subdomain \(\{0,\ldots ,2^{\rho +\mu } \}\) (although this has been proven in subsequent work; see below). So, we just detail the weaker parameters given by part (1) of Theorem 4.2 here.
Concrete parameters. Since the results in Sect. 5 have several cases and the parameter settings are rather involved, we avoid stating an explicit theorem about RSA-OAEP. If we use part (1) of Theorem 4.2, one can see that for \(u = 80\) bits security, messages of roughly \(\mu \approx k-s- 3\cdot 80\) bits can be encrypted (for sufficiently large t). For concreteness, we give two example parameter settings. Using the Multi \(\Phi \)-Hiding Assumption with \(k=1024\) bits and 3 primes, we obtain \(\ell =k-s=291\) bits of lossiness and hence can encrypt messages of length \(\mu = 40\) bits (for \(t \approx 400\)). Using the \(\Phi \)-Hiding Assumption with \(k=2048\), we obtain \(\ell =k-s=430\) bits of lossiness and hence can encrypt messages of length \(\mu = 160\) bits (for \(t \approx 150\)).
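The two example settings can be plugged into the exponent of part (1) of Theorem 4.2. The Python sketch below does so under two simplifying assumptions that are ours rather than the paper's: the padded length satisfies \(\rho + \mu = k\) exactly (ignoring the 16-bit adjustment discussed above), and logarithms are to base 2. The function name `u_part1` is hypothetical.

```python
from math import log2


def u_part1(k: int, lossiness: int, mu: int, t: int) -> float:
    """Exponent u of Theorem 4.2(1), so that epsilon = 2**(-u)."""
    s = k - lossiness   # residual leakage of the lossy TDP
    rho = k - mu        # assumption: rho + mu = k
    return t / (3 * t + 2) * (rho - s - log2(t) + 2) - 2 * (mu + s) / (3 * t + 2) - 1


if __name__ == "__main__":
    # Multi Phi-Hiding example: k = 1024, 291 lossy bits, mu = 40, t = 400.
    print(u_part1(1024, 291, 40, 400))
    # Phi-Hiding example: k = 2048, 430 lossy bits, mu = 160, t = 150.
    print(u_part1(2048, 430, 160, 150))
```

Under these assumptions both settings give an exponent of roughly 79, consistent with the 80-bit target quoted in the text.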
Subsequent improvements. The approximate regularity of RSA on the above subdomain (and, more generally, on arithmetic progressions of sufficient length) has subsequently been shown by Lewko et al. [42]. This allows us to obtain essentially the better parameters given by part (2) of Theorem 4.2. For example, using the \(\Phi \)-Hiding Assumption with \(k=2048\), we can encrypt messages of length 274 bits (see [42, Section 5.3]).
Notes
1. We often use the same terminology for ‘f-OAEP,’ which refers to OAEP using an abstract TDP f, with the meaning hopefully clear from context.
2. Such schemes were called “simple embedding schemes” by Bellare and Rogaway [5], who discussed them only on an intuitive level.
3. In the formal definition, we actually consider an “external” distinguisher who gets the extractor seed; see Sect. 3 for details.
4. In particular, this result requires that G is a keyed hash function whose key is included in the public key for OAEP. On the other hand, cryptographic hash functions are typically unkeyed. But see “Using unkeyed hash functions” below.
5. We remark that the recent attacks on \(\Phi \)A [56] are for moduli of a special form that does not include RSA.
6. Note, however, that their result does not rule out such a proof based on other properties of the TDP, non-black-box assumptions on the hash functions, or in the case of a specific TDP like RSA.
7. In particular, their security notion does not imply IND-CPA since they consider random messages. We also point out that it remains an open question whether NM-PRGs can be constructed.
8. We note that [49] actually defines lossy trapdoor functions, but the extension to permutations is straightforward.
9. This is done by choosing a uniform \((1/2-c)k\)-bit number x until \(p = x e + 1\) is a prime.
10. Additionally, in practice the encryption exponent e is usually fixed. This can be addressed by parameterizing E\(\Phi \)A by a fixed e instead of choosing it at random. Note that for \(e = 3\) one should make both \(e \mid p-1\) and \(e \mid q-1\) in the lossy case (otherwise the assumption is false [16]).
References
 1.
M. Abdalla, M. Bellare, P. Rogaway, The oracle Diffie–Hellman assumptions and an analysis of DHIES, in D. Naccache, editor, CTRSA 2001. LNCS, vol. 2020 (Springer, Heidelberg, April 2001), pp. 143–158
 2.
B. Barak, R. Shaltiel, E. Tromer, True random number generators secure in a changing environment, in C.D. Walter, Ç.K. Koç, C. Paar, editors, CHES 2003. LNCS, vol. 2779 (Springer, Heidelberg, September 2003), pp. 166–180
 3.
M. Bellare, A. Boldyreva, A. O’Neill, Deterministic and efficiently searchable encryption, in A. Menezes, editor, CRYPTO 2007. LNCS, vol. 4622 (Springer, Heidelberg, August 2007), pp. 535–552
 4.
M. Bellare, V.T. Hoang, S. Keelveedhi, Instantiating random oracles via UCEs, in R. Canetti, J.A. Garay, editors, CRYPTO 2013, Part II. LNCS, vol. 8043 (Springer, Heidelberg, August 2013), pp. 398–415
 5.
M. Bellare, A. Palacio, Towards plaintextaware publickey encryption without random oracles, in P.J. Lee, editor, ASIACRYPT 2004. LNCS, vol. 3329 (Springer, Heidelberg, December 2004), pp. 48–62
 6.
M. Bellare, P. Rogaway, Random oracles are practical: a paradigm for designing efficient protocols. in V. Ashby, editor, ACM CCS 93. (ACM Press, November 1993), pp. 62–73
 7.
M. Bellare, P. Rogaway, Optimal asymmetric encryption, in A. De Santis, editor, EUROCRYPT’94. LNCS, vol. 950 (Springer, Heidelberg, May 1995), pp. 92–111
 8.
M. Bellare, J. Rompel, Randomness-efficient oblivious sampling, in 35th FOCS (IEEE Computer Society Press, November 1994), pp. 276–287
 9.
M. Blum, P. Feldman, S. Micali, Proving security against chosen cyphertext attacks, in S. Goldwasser, editor, CRYPTO’88. LNCS, vol. 403 (Springer, Heidelberg, August 1990), pp. 256–268
 10.
A. Boldyreva, D. Cash, M. Fischlin, B. Warinschi, Foundations of non-malleable hash and one-way functions, in M. Matsui, editor, ASIACRYPT 2009. LNCS, vol. 5912 (Springer, Heidelberg, December 2009), pp. 524–541
 11.
A. Boldyreva, S. Fehr, A. O’Neill, On notions of security for deterministic encryption, and efficient constructions without random oracles, in D. Wagner, editor, CRYPTO 2008. LNCS, vol. 5157 (Springer, Heidelberg, August 2008), pp. 335–359
 12.
A. Boldyreva, M. Fischlin, Analysis of random oracle instantiation scenarios for OAEP and other practical schemes, in V. Shoup, editor, CRYPTO 2005. LNCS, vol. 3621 (Springer, Heidelberg, August 2005), pp. 412–429
 13.
A. Boldyreva, M. Fischlin, On the security of OAEP, in X. Lai, K. Chen, editors, ASIACRYPT 2006. LNCS, vol. 4284 (Springer, Heidelberg, December 2006), pp. 210–225
 14.
D. Boneh, Simplified OAEP for the RSA and Rabin functions, in J. Kilian, editor, CRYPTO 2001. LNCS, vol. 2139 (Springer, Heidelberg, August 2001), pp. 275–291
 15.
D.R.L. Brown, What hashes make RSA-OAEP secure? Cryptology ePrint Archive, Report 2006/223. http://eprint.iacr.org/ (2006)
 16.
C. Cachin, Efficient private bidding and auctions with an oblivious third party, in ACM CCS 99 (ACM Press, November 1999), pp. 120–127
 17.
C. Cachin, S. Micali, M. Stadler, Computationally private information retrieval with polylogarithmic communication, in J. Stern, editor, EUROCRYPT’99. LNCS, vol. 1592 (Springer, Heidelberg, May 1999), pp. 402–414
 18.
R. Canetti, Towards realizing random oracles: hash functions that hide all partial information, in B.S. Kaliski Jr., editor, CRYPTO’97. LNCS, vol. 1294 (Springer, Heidelberg, August 1997), pp. 455–469
 19.
R. Canetti, R.R. Dakdouk, Extractable perfectly one-way functions, in L. Aceto, I. Damgård, L.A. Goldberg, M.M. Halldórsson, A. Ingólfsdóttir, I. Walukiewicz, editors, ICALP 2008, Part II. LNCS, vol. 5126 (Springer, Heidelberg, July 2008), pp. 449–460
 20.
R. Canetti, O. Goldreich, S. Halevi, The random oracle methodology, revisited. J. ACM, 51(4), 557–594 (2004)
 21.
R. Canetti, D. Micciancio, O. Reingold, Perfectly one-way probabilistic hash functions (preliminary version), in 30th ACM STOC (ACM Press, May 1998), pp. 131–140
 22.
D. Coppersmith, Small solutions to polynomial equations, and low exponent RSA vulnerabilities. J. Cryptol., 10(4), 233–260 (1997)
 23.
J.S. Coron, M. Joye, D. Naccache, P. Paillier, New attacks on PKCS#1 v1.5 encryption, in B. Preneel, editor, EUROCRYPT 2000. LNCS, vol. 1807 (Springer, Heidelberg, May 2000), pp. 369–381
 24.
J.S. Coron, M. Joye, D. Naccache, P. Paillier, Universal padding schemes for RSA, in M. Yung, editor, CRYPTO 2002. LNCS, vol. 2442 (Springer, Heidelberg, August 2002), pp. 226–241
 25.
Y. Dodis, R. Oliveira, K. Pietrzak, On the generic insecurity of the full domain hash, in V. Shoup, editor, CRYPTO 2005. LNCS, vol. 3621 (Springer, Heidelberg, August 2005), pp. 449–466
 26.
Y. Dodis, A. Sahai, A. Smith, On perfect and adaptive security in exposure-resilient cryptography, in B. Pfitzmann, editor, EUROCRYPT 2001. LNCS, vol. 2045 (Springer, Heidelberg, May 2001), pp. 301–324
 27.
Y. Dodis, A. Smith, Correcting errors without leaking partial information, in H.N. Gabow, R. Fagin, editors, 37th ACM STOC (ACM Press, May 2005), pp. 654–663
 28.
D.M. Freeman, O. Goldreich, E. Kiltz, A. Rosen, G. Segev, More constructions of lossy and correlation-secure trapdoor functions. J. Cryptol., 26(1), 39–74 (2013)
 29.
E. Fujisaki, T. Okamoto, D. Pointcheval, J. Stern, RSA-OAEP is secure under the RSA assumption. J. Cryptol., 17(2), 81–104 (2004)
 30.
C. Gentry, P.D. Mackenzie, Z. Ramzan, Password authenticated key exchange using hidden smooth subgroups, in V. Atluri, C. Meadows, A. Juels, editors, ACM CCS 05 (ACM Press, November 2005), pp. 299–309
 31.
O. Goldreich, Foundations of Cryptography: Basic Applications, vol. 2 (Cambridge University Press, Cambridge, UK, 2004)
 32.
S. Goldwasser, S. Micali, Probabilistic encryption. J. Comput. Syst. Sci., 28(2), 270–299 (1984)
 33.
B. Harris, RSA Key Exchange for the Secure Shell (SSH) Transport Layer Protocol. RFC 4432
 34.
B. Hemenway, R. Ostrovsky, Public-key locally-decodable codes, in D. Wagner, editor, CRYPTO 2008. LNCS, vol. 5157 (Springer, Heidelberg, August 2008), pp. 126–143
 35.
B. Hemenway, R. Ostrovsky, A. Rosen, Non-committing encryption from \(\phi \)-hiding, in Y. Dodis, J.B. Nielsen, editors, TCC 2015, Part I. LNCS, vol. 9014 (Springer, Heidelberg, March 2015), pp. 591–608
 36.
M. Herrmann, Improved cryptanalysis of the multi-prime \(\phi \)-hiding assumption, in A. Nitaj, D. Pointcheval, editors, AFRICACRYPT 11. LNCS, vol. 6737 (Springer, Heidelberg, July 2011), pp. 92–99
 37.
D. Hofheinz, E. Kiltz, The group of signed quadratic residues and applications, in S. Halevi, editor, CRYPTO 2009. LNCS, vol. 5677 (Springer, Heidelberg, August 2009), pp. 637–653
 38.
E. Kiltz, K. Pietrzak, Personal communication (2009)
 39.
E. Kiltz, A. O’Neill, A. Smith, Instantiability of RSA-OAEP under chosen-plaintext attack, in T. Rabin, editor, CRYPTO 2010. LNCS, vol. 6223 (Springer, Heidelberg, August 2010), pp. 295–313
 40.
E. Kiltz, K. Pietrzak, On the security of padding-based encryption schemes or why we cannot prove OAEP secure in the standard model, in A. Joux, editor, EUROCRYPT 2009. LNCS, vol. 5479 (Springer, Heidelberg, April 2009), pp. 389–406
 41.
K. Kobara, H. Imai, OAEP++: a very simple way to apply OAEP to deterministic OW-CPA primitives. Cryptology ePrint Archive, Report 2002/130. http://eprint.iacr.org/ (2002)
 42.
A.K. Lenstra, Unbelievable security. Matching AES security using public key systems (invited talk), in C. Boyd, editor, ASIACRYPT 2001. LNCS, vol. 2248 (Springer, Heidelberg, December 2001), pp. 67–86
 43.
M. Lewko, A. O’Neill, A. Smith, Regularity of lossy RSA on subdomains and its applications, in T. Johansson, P.Q. Nguyen, editors, EUROCRYPT 2013. LNCS, vol. 7881 (Springer, Heidelberg, May 2013), pp. 55–75
 44.
A. May, Using LLL-reduction for solving RSA and factorization problems: a survey, in LLL+25 Conference in Honour of the 25th Birthday of the LLL Algorithm (2007)
 45.
S. Micali, C. Rackoff, B. Sloan, The notion of security for probabilistic cryptosystems, in A.M. Odlyzko, editor, CRYPTO’86. LNCS, vol. 263 (Springer, Heidelberg, August 1987), pp. 381–392
 46.
P. Mol, S. Yilek, Chosen-ciphertext security from slightly lossy trapdoor functions, in P.Q. Nguyen, D. Pointcheval, editors, PKC 2010. LNCS, vol. 6056 (Springer, Heidelberg, May 2010), pp. 296–311
 47.
N. Nisan, D. Zuckerman, Randomness is linear in space. J. Comput. Syst. Sci., 52(1), 43–52 (1996)
 48.
P. Paillier, J.L. Villar, Trading one-wayness against chosen-ciphertext security in factoring-based encryption, in X. Lai, K. Chen, editors, ASIACRYPT 2006. LNCS, vol. 4284 (Springer, Heidelberg, December 2006), pp. 252–266
 49.
O. Pandey, R. Pass, V. Vaikuntanathan, Adaptive one-way functions and applications, in D. Wagner, editor, CRYPTO 2008. LNCS, vol. 5157 (Springer, Heidelberg, August 2008), pp. 57–74
 50.
C. Peikert, B. Waters, Lossy trapdoor functions and their applications. SIAM J. Comput., 40(6), 1803–1844 (2011)
 51.
RSA public-key cryptography standards (PKCS). http://www.rsa.com/rsalabs/node.asp?id=2124
 52.
M.O. Rabin, Digitalized signatures and public-key functions as intractable as factorization. Technical report (1979)
 53.
C. Rackoff, D.R. Simon, Non-interactive zero-knowledge proof of knowledge and chosen ciphertext attack, in J. Feigenbaum, editor, CRYPTO’91. LNCS, vol. 576 (Springer, Heidelberg, August 1992), pp. 433–444
 54.
R.L. Rivest, A. Shamir, L. Adleman, U.S. patent 4405829: cryptographic communications system and method
 55.
R.L. Rivest, A. Shamir, L. Adleman, A method for obtaining public-key cryptosystems and digital signatures. Technical Memo MIT/LCS/TM-82, Massachusetts Institute of Technology, Laboratory for Computer Science (1977)
 56.
C. Schridde, B. Freisleben, On the validity of the phi-hiding assumption in cryptographic protocols, in J. Pieprzyk, editor, ASIACRYPT 2008. LNCS, vol. 5350 (Springer, Heidelberg, December 2008), pp. 344–354
 57.
Y. Seurin, On the lossiness of the Rabin trapdoor function, in H. Krawczyk, editor, PKC 2014. LNCS, vol. 8383 (Springer, Heidelberg, March 2014), pp. 380–398
 58.
V. Shoup, OAEP reconsidered. J. Cryptol., 15(4), 223–249 (2002)
 59.
A. Smith, Y. Zhang, On the regularity of lossy RSA: improved bounds and applications to padding-based encryption, in Y. Dodis, J.B. Nielsen, editors, TCC 2015, Part I. LNCS, vol. 9014 (Springer, Heidelberg, March 2015), pp. 609–628
 60.
K. Tosu, N. Kunihiro, Optimal bounds for multi-prime phi-hiding assumption, in Information Security and Privacy – 17th Australasian Conference, ACISP 2012, Wollongong, NSW, Australia, July 9–11, 2012. Proceedings (2012), pp. 1–14
 61.
L. Trevisan, S.P. Vadhan, Extracting randomness from samplable distributions, in 41st FOCS (IEEE Computer Society Press, November 2000), pp. 32–42
 62.
M.N. Wegman, L. Carter, New hash functions and their use in authentication and set equality. J. Comput. Syst. Sci., 22(3), 265–279 (1981)
 63.
S. Yilek, E. Rescorla, H. Shacham, B. Enright, S. Savage, When private keys are public: results from the 2008 Debian OpenSSL vulnerability, in Internet Measurement Conference
 64.
P. Zimmermann, Integer factoring records. http://www.loria.fr/~zimmerma/records/factor.html
Acknowledgments
We thank Mihir Bellare, Alexandra Boldyreva, Dan Brown, Yevgeniy Dodis, Mathias Herrmann, Jason Hinek, Arjen Lenstra, Alex May, Phil Rogaway, and the anonymous reviewers of Crypto 2010 and the Journal of Cryptology for helpful comments. In particular, we thank Dan for reminding us of [16, Remark 2, p. 6], Alex and Mathias for pointing out the improved attacks in Sect. 5.3, Phil for encouraging us to consider the case of small e more closely and for telling us that KI security as defined in Appendix 8 was previously considered by [44], and Yevgeniy for suggesting the statement of Lemma 4.5 (our original lemma was specific to OAEP).
Part of this work was done while E.K. was at CWI, Amsterdam. E.K. is funded by ERC Project ERCC (FP7/615074) and the German Federal Ministry for Education and Research. Part of this work was done while A.O. was at Georgia Institute of Technology, supported in part by NSF award #0545659 and NSF Cyber Trust award #0831184. A.S. was supported in part by NSF awards #0747294, 0729171.
Eike Kiltz was partially supported by DFG grant KI 795/41 and ERC Project ERCC (FP7/615074). Adam Smith was funded by US National Science Foundation award CCF0747294.
Additional information
A preliminary version of this paper appears in Advances in Cryptology—CRYPTO 2010, 30th Annual International Cryptology Conference, T. Rabin ed., LNCS, Springer, 2010. This is the full version.
Communicated by Kenneth Paterson.
Appendices
Appendix 1: Proof of Lemma 4.5
We introduce the following notation for the proof. For a random variable V with range \({\mathcal {V}}\), we define the collision probability of V as \(\mathrm {Col}(V) = \Pr \left[ \, V = V' \,\right] = \sum _{v \in {\mathcal {V}}} P_V(v)^2\), where \(V'\) is an independent copy of V, and for an event \({\mathcal {E}}\) we define the conditional collision probability \(\mathrm {Col}_{{\mathcal {E}}}(V) = \Pr \left[ \, V = V' \mid {\mathcal {E}}\,\right] \). For random variables V, W, we define the square of the 2-distance as \(D(V,W) = \sum _v \big (P_V(v) - P_W(v)\big )^2\).
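To make this notation concrete, here is a small sketch that computes the collision probability and the squared 2-distance for explicit distributions. The dictionary representation, the function names, and the toy distributions are assumptions chosen for illustration; exact rationals are used so the values can be checked by hand.

```python
from fractions import Fraction
from math import sqrt

def collision_probability(p):
    """Col(V) = sum_v P_V(v)^2 for a distribution given as {value: probability}."""
    return sum(q * q for q in p.values())

def squared_l2_distance(p, w):
    """D(V, W) = sum_v (P_V(v) - P_W(v))^2 over the union of the supports."""
    support = set(p) | set(w)
    return sum((p.get(v, 0) - w.get(v, 0)) ** 2 for v in support)

def statistical_distance(p, w):
    """Half the 1-distance between the two distributions."""
    support = set(p) | set(w)
    return sum(abs(p.get(v, 0) - w.get(v, 0)) for v in support) / 2

# Toy example: a skewed distribution versus the uniform one on four points.
p_v = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
p_u = {s: Fraction(1, 4) for s in "abcd"}

col = collision_probability(p_v)          # 1/4 + 1/16 + 1/16
dist2 = squared_l2_distance(p_v, p_u)     # (1/4)^2 + (1/4)^2
sd = statistical_distance(p_v, p_u)

# Cauchy-Schwarz relates the two distances: SD <= (1/2) * sqrt(|range| * D).
print(col, dist2, sd, float(sd) <= 0.5 * sqrt(4 * dist2))
```

For a uniform W on a range of size n that contains the support of V, the squared 2-distance equals Col(V) - 1/n, which is why collision probabilities are the natural quantity to bound in this kind of argument.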
Writing \({\mathbf{E}}_k\) for expectation over the choice of random k from \({\mathcal {K}}\), we have
where the first inequality is by Cauchy–Schwarz and the second is by Jensen’s inequality. We now show
from which the lemma follows. Write \((X,Y_k) = (X,h(k,X))\) for an arbitrary but fixed k. Then
Using the Kronecker delta \(\delta _{s,s'}\) which equals 1 if \(s =s'\) and else 0 for all \(s,s' \in S\), we can write \(P_{g(X,Y_k)}(s) = \sum _x P_X(x) \delta _{g(x,h(k,x)),s}\), and thus
We use the pairwise independence of h to rewrite this in terms of collision probabilities:
where the subscript \({\mathcal {E}}\) denotes (conditioning on) the event that \(X \ne X'\). That is,
Similarly,
so that
where \({\mathcal {E}}\) is defined as above. Note that the only difference between the expression above and that in (11) is that even when \(X=X'\), a collision is not guaranteed.
Finally,
as well. By combining the above, we have
To complete the proof, we can plug the bound above into (10):
By the assumption on the min-entropy of X, the collision probability \(\mathrm {Col}(X)\) is at most \(4 {\hat{\varepsilon }}^2 / |S|\). So the statistical distance \(\Delta \bigl ((K,g(X,h(K,X))), (K,g(X,U))\bigr )\) is at most \({\hat{\varepsilon }}\), as desired.\(\square \)
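For orientation, the way the pieces of this proof fit together can be sketched as one chain of inequalities. This is a hedged reconstruction: it assumes that (10) is the usual Cauchy–Schwarz and Jensen bound on the statistical distance, and that the intermediate claim is \({\mathbf{E}}_k\bigl [D(g(X,Y_k),g(X,U))\bigr ] \le \mathrm {Col}(X)\). Both assumptions are consistent with the constants in the final step, but the exact form of the displays is not taken from the text.

\[
\Delta \bigl ((K,g(X,h(K,X))), (K,g(X,U))\bigr )
\;\le \; \tfrac{1}{2}\,{\mathbf{E}}_k\Bigl [\sqrt{|S| \cdot D\bigl (g(X,Y_k),g(X,U)\bigr )}\Bigr ]
\;\le \; \tfrac{1}{2}\sqrt{|S| \cdot {\mathbf{E}}_k\bigl [D\bigl (g(X,Y_k),g(X,U)\bigr )\bigr ]}
\;\le \; \tfrac{1}{2}\sqrt{|S| \cdot \mathrm {Col}(X)}
\;\le \; \tfrac{1}{2}\sqrt{|S| \cdot 4{\hat{\varepsilon }}^2/|S|}
\;=\; {\hat{\varepsilon }}.
\]

Under this reading, the third inequality is where pairwise independence enters: when \(X \ne X'\), the hash values are independent and uniform, so the contributions conditioned on \({\mathcal {E}}\) cancel across the three collision terms, and only the part weighted by \(\mathrm {Col}(X)\) remains.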
Appendix 2: Security of OAEP Under Key-Independent Chosen-Plaintext Attack
The commonly accepted notions of security for encryption ask for privacy with respect to messages that may depend on the public key. We define here a notion of privacy for messages not depending on the public key. We mention that such a definition appears, for example, in the work of Micali et al. [44] (under the name “three-pass,” versus “one-pass,” cryptosystem), in the text of Goldreich [30], and in the context of the recent work on deterministic encryption [2].
The definition. To an encryption scheme \(\Pi = ({\mathcal {K}}, {\mathcal {E}},{\mathcal {D}})\) and an adversary \(B = (B_1, B_2)\) we associate
We require \(|m_0| = |m_1|\) above. Define the ind-ki-cpa advantage of B against \(\Pi \) as
Remarks. While nonstandard, KI security seems adequate for some applications. For example, in [30] Goldreich points out that high-level applications that use encryption as a tool do so in a key-oblivious manner, and Bellare et al. [2] argue that in real life public keys are abstractions hidden in our software, so messages are unlikely to depend on them. KI security also suffices for hybrid encryption.
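The key-independent game described in the definition above can be sketched in code. This is an illustration under stated assumptions: the experiment is assumed to have the usual two-stage form in which the first stage picks the messages without seeing the public key, the advantage is assumed to be the usual 2·Pr[win] − 1, and the toy scheme and all function names are hypothetical (the toy scheme is deliberately insecure and is there only to exercise the harness).

```python
import secrets

def ind_ki_cpa_experiment(keygen, encrypt, b1, b2):
    """Run one key-independent CPA experiment; return True iff B guesses the bit."""
    b = secrets.randbits(1)             # hidden challenge bit
    pk, _sk = keygen()                  # key pair; the secret key is never given to B
    m0, m1, state = b1()                # stage 1 chooses messages WITHOUT seeing pk
    if len(m0) != len(m1):
        raise ValueError("challenge messages must have equal length")
    c = encrypt(pk, m1 if b else m0)    # challenge ciphertext
    d = b2(pk, c, state)                # stage 2 receives pk only now
    return d == b

def estimate_advantage(keygen, encrypt, b1, b2, trials=1000):
    """Empirical estimate of 2 * Pr[B wins] - 1."""
    wins = sum(ind_ki_cpa_experiment(keygen, encrypt, b1, b2) for _ in range(trials))
    return 2 * wins / trials - 1

# Toy, deliberately broken scheme: the "public key" is the XOR pad itself.
def toy_keygen():
    key = secrets.token_bytes(16)
    return key, key

def toy_encrypt(pk, m):
    return bytes(x ^ y for x, y in zip(m, pk))

def toy_b1():
    return b"\x00" * 16, b"\xff" * 16, None

def toy_b2(pk, c, state):
    # XOR with pk undoes the toy encryption, so the adversary always wins.
    return 0 if toy_encrypt(pk, c) == b"\x00" * 16 else 1

print(estimate_advantage(toy_keygen, toy_encrypt, toy_b1, toy_b2, trials=200))
```

The only structural difference from a standard IND-CPA harness is the signature of the first stage: it takes no public key, which is exactly the restriction that KI security imposes on message choice.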
The result. We can show a standard-model instantiation under KI security directly from Lemma 4.5, where G is any pairwise-independent function. This is captured by the theorem below.
Theorem 8.1
Let \(\mathsf {LTDP}= ({\mathcal {F}}, {\mathcal {F}}')\) be an LTDP with residual leakage \(\ell \), and let \(\mathsf {OAEP}\) be the encryption scheme associated to \({\mathcal {F}}\), hash functions G, H, and a parameter \(k_0 < k\). Suppose G is pairwise-independent. Let \(\varepsilon > 0\). Then for any \(k_0 \ge \ell + 2 \log (1/\varepsilon ) - 2\) and any IND-KI-CPA adversary B against \(\mathsf {OAEP}\), there is a distinguisher D against \(\mathsf {LTDP}\) such that
Furthermore, the running time of D is the time to run B.
As we mentioned, the proof is a simple hybrid argument concluding by Lemma 4.5.
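Theorem 8.1 asks only that G be pairwise-independent. As a concrete illustration of that property, the sketch below uses the textbook affine family \(h_{a,b}(x) = ax + b \bmod P\) over a prime field and checks pairwise independence exhaustively. The family, the toy prime P = 7, and the function names are assumptions for the example; the G used in the scheme maps bit strings and is not specified by this sketch.

```python
from itertools import product

P = 7  # toy prime modulus, small enough for an exhaustive check

def h(key, x):
    """Affine hash h_{a,b}(x) = a*x + b mod P, keyed by (a, b)."""
    a, b = key
    return (a * x + b) % P

def is_pairwise_independent():
    """Check that for all x != x2 and all targets (y, y2), exactly one key
    out of P*P maps x to y and x2 to y2, i.e. the pair of hash values is
    uniform on Z_P x Z_P."""
    keys = list(product(range(P), repeat=2))
    for x, x2 in product(range(P), repeat=2):
        if x == x2:
            continue
        for y, y2 in product(range(P), repeat=2):
            hits = sum(1 for k in keys if h(k, x) == y and h(k, x2) == y2)
            if hits != 1:
                return False
    return True

print(is_pairwise_independent())
```

The check succeeds because, for distinct inputs, the two linear equations in (a, b) have a unique solution modulo a prime. Note that the key (a, b) plays the role of the hash key, which in the keyed setting would be published alongside the public key.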
Cite this article
Kiltz, E., O’Neill, A. & Smith, A. Instantiability of RSA-OAEP Under Chosen-Plaintext Attack. J Cryptol 30, 889–919 (2017). https://doi.org/10.1007/s00145-016-9238-4
Keywords
 RSA
 OAEP
 Padding-based encryption
 Lossy trapdoor functions
 Leftover hash lemma
 Standard model