Simple Constructions from (Almost) Regular One-Way Functions

Mazor, Noam; Zhang, Jiapeng

doi:10.1007/s00145-024-09507-4

Research Article
Open access
Published: 30 May 2024

Volume 37, article number 25, (2024)
Journal of Cryptology


Abstract

Two of the most useful cryptographic primitives that can be constructed from one-way functions are pseudorandom generators (PRGs) and universal one-way hash functions (UOWHFs). In order to implement them in practice, the efficiency of such constructions must be considered. The three major efficiency measures are: the seed length, the call complexity to the one-way function, and the adaptivity of these calls. Still, the optimal efficiency of these constructions is not yet fully understood: there exist gaps between the known upper bound and the known lower bound for black-box constructions. A special class of one-way functions called unknown-regular one-way functions is much better understood. Haitner, Harnik and Reingold (CRYPTO 2006) presented a PRG construction with semi-linear seed length and linear number of calls based on a method called randomized iterate. Ames, Gennaro and Venkitasubramaniam (ASIACRYPT 2012) then gave a construction of UOWHF with similar parameters and using similar ideas. On the other hand, Holenstein and Sinha (FOCS 2012) and Barhum and Holenstein (TCC 2013) showed an almost linear call-complexity lower bound for black-box constructions of PRGs and UOWHFs from one-way functions. Hence, Haitner et al. and Ames et al. reached tight constructions (in terms of seed length and the number of calls) of PRGs and UOWHFs from regular one-way functions. These constructions, however, are adaptive. In this work, we present non-adaptive constructions for both primitives which match the optimal call complexity given by Holenstein and Sinha and Barhum and Holenstein. Our constructions, besides being simple and non-adaptive, are robust also for almost-regular one-way functions.


1 Introduction

A wide class of cryptographic primitives can be constructed from one-way functions, which is the minimal assumption for cryptography. Informally, a function f is called a one-way function if it is easy to compute, but hard to invert by polynomial-time algorithms. Two important primitives that can be constructed from one-way functions are pseudorandom generators (PRGs) [5, 22] and universal one-way hash functions (UOWHFs) [19]. These two primitives are useful for constructing even more powerful primitives such as encryption, digital signatures and commitments. Thus, an improvement in the efficiency of constructions for PRGs and UOWHFs would have an effect on other primitives. Yet, the optimal efficiency of these two basic primitives is not fully understood.

There are several important efficiency measures to account for when considering PRGs and UOWHFs. For PRG constructions, one aims to minimize the seed length and the number of calls to the one-way function f. For UOWHF constructions, there is a need to minimize the key length and the number of calls to f. Besides these two measurements, another important parameter is the adaptivity of the calls. That is, if the inputs for the one-way function are independent of the output of previous calls, then the construction can be implemented in parallel. By contrast, if the calls are adaptive, one must make them sequentially.

Constructions. Much progress has been made since the notion of PRGs was introduced. The first construction of pseudorandom generators was given by Blum and Micali [5] based on the assumption that a specific function is hard to invert. This construction was generalized by Yao [22] to work with any one-way permutation. Since then, many subsequent works have sought to construct PRGs based on arbitrary one-way functions. Notably, by introducing the randomized iterate^{Footnote 1} method, Goldreich, Krawczyk and Luby [8] gave a PRG construction from any unknown-regular one-way function. The notion of a regular one-way function generalizes that of a one-way permutation: A one-way function f is called regular if for every n and $x,x'$ with $\left| x\right| = \left| x'\right| =n$ it holds that $\left| f^{-1}(f(x))\right| = \left| f^{-1}(f(x'))\right| $. We say that the function is unknown-regular if the regularity parameter, $\left| f^{-1}(f(x))\right| $, need not be an efficiently computable function of n. More recently, the randomized iterate method was further studied by [11, 23], who reached a construction of PRGs from any unknown-regular one-way function, with $O(n\log n)$ seed length and $O(n/\log n)$ calls to the one-way function. [25] improved the seed length to $\omega (n)$ by using a transformation that converts any unknown-regular function into a function that is known-regular on its image.^{Footnote 2}

For arbitrary one-way functions, a seminal work by Håstad, Impagliazzo, Levin and Luby [15] gave the first PRG construction. Since then, the efficiency has been improved by many works ([10, 13, 16, 21]). Currently, the state-of-the-art construction of PRGs due to [21] uses $O(n^3)$ bits of random seed and $O(n^3)$ adaptive calls to the one-way function, or alternatively a seed of size $O(n^4)$ with non-adaptive calls [13, 21].^{Footnote 3}

PRG constructions
Assumption	Number of calls	Seed length	Adaptivity
One-way permutation	1	O(n)	No
Known-regular OWF	$\omega (1)$	$\omega (n)$	No
Unknown-regular OWF ([11])	$O(n/\log n)$	$O(n\log n)$	Yes
Unknown-regular OWF ([23])	$\omega (n/\log n)$	$\omega (n)$	Yes
Arbitrary OWF ([21])	$\omega (n^3)$	$\omega (n^3)$	Yes
Arbitrary OWF ([13])	$\omega (n^3)$	$\omega (n^4)$	No
Unknown-regular OWF (this work)	$O(n/\log n)$	$O(n^2)$	No

The constructions of UOWHFs use similar ideas to the constructions of PRGs. Still, the best PRG constructions from arbitrary one-way functions are more efficient than the best known UOWHF constructions. Rompel [20] gave the first UOWHF construction from arbitrary one-way functions. The efficiency was improved by [12], who gave a construction of UOWHF using $O(n^{13})$ adaptive calls with a key of size $O(n^7)$. Constructing a UOWHF using $O(n^3)$ calls to the one-way function is still an interesting open question.

The efficiency of UOWHFs based on an unknown-regular one-way function is similar to the efficiency of the unknown-regular-based PRGs. Interestingly, this was shown by [2] using the same method of randomized iterate, resulting in a construction that uses $O(n\log n)$ key length and $O(n/\log n)$ calls. We stress that when the regularity of f is known (i.e., can be computed efficiently given n), there are much more efficient constructions for both PRGs and UOWHFs ([7, 9, 19, 23]).

UOWHF constructions
Assumption	Number of calls	Key length	Adaptivity
One-way permutation ([19])	1	O(n)	No
Known-regular OWF	$\omega (1)$	$\omega (n)$	No
Unknown-regular OWF ([2])	$O(n/\log n)$	$O(n\log n)$	Yes
Arbitrary OWF ([12])	$\omega (n^{13})$	$\omega (n^7)$	Yes
Unknown-regular OWF (this work)	$O(n/\log n)$	$O(n^2)$	No

Lower bounds. The lower bounds for black-box constructions are relatively far from the upper bounds. In this line of work, there are two incomparable types of results. The first type, due to [6], is stated in terms of the stretch of the PRG and the compression of the UOWHF, respectively. Specifically, [6] showed that any black-box PRG construction $G:\left\{ 0,1\right\} ^m \rightarrow \left\{ 0,1\right\} ^{m+s}$ from f must use $\Omega (s/\log n)$ calls to f. Similarly, any black-box UOWHF construction with input size m and output size $m-s$ must use $\Omega (s/\log n)$ calls. In the second type of results, [17] showed that any black-box PRG construction from f must use $\Omega (n/\log n)$ calls to f, even for 1-bit stretch. [3] showed similar results for 1-bit compressing UOWHFs.

As mentioned, there is a substantial gap between the aforementioned lower and upper bounds. One explanation for that gap is that all of the above lower bounds hold even when the one-way function f is unknown-regular. In this case, the bounds are known to be tight: they are matched by the constructions mentioned above, which are based on the randomized iterate. These constructions, however, are adaptive.

1.1 Our Contribution

In this paper, we give non-adaptive constructions of tight call complexity for PRGs and UOWHFs from unknown-regular one-way functions. Both of our constructions are quite simple and are very similar to each other. As in previous results, the security of our constructions also holds if f is only almost-regular ([23]), which means that for every $\left| x\right| =\left| x'\right| $, the ratio between $\left| f^{-1}(f(x))\right| $ and $\left| f^{-1}(f(x'))\right| $ is bounded by a polynomial in $\left| x\right| $ (compared to a ratio of exactly 1 in the case of regular functions).

The seed (or key) length in our construction for PRGs (or UOWHFs, respectively) is $O(n^2)$, compared to $\tilde{O}(n)$ bits in the previous adaptive constructions. This seems unavoidable and raises an interesting open question.^{Footnote 4}

1.1.1 Our Constructions and Results

In this section, we present our constructions. The results here are stated for regular one-way functions but can be naturally expanded to almost-regular functions, as stated in Sects. 3 and 4. The main crux of the construction is the following observation. For regular f and i.i.d uniform random variables $X_1$, $X_2$ over $\left\{ 0,1\right\} ^n$, given any fixing of $f(X_1)$, both the entropy and min-entropy of the pair $X_1,f(X_2)$ are exactly n. To see the above, recall that for regular f with (unknown) regularity parameter r, it holds that there are exactly r possible values for $X_1$ given $f(X_1)$, and exactly $2^n/r$ possible values for $f(X_2)$. Thus, the regularity parameter r “cancels out” when considering the number of possible values (given $f(X_1)$) of the pair $X_1,f(X_2)$, which is $r\cdot 2^n/r=2^n$. In the PRG construction, we exploit this fact by using a universal family of hash functions $\mathcal {H}$ (and the Goldreich–Levin theorem) in order to extract pseudo-uniform bits. In the UOWHF construction, we use similar ideas in order to compress the pair $X_1,f(X_2)$ without creating too many collisions. For both constructions, we need additional properties from the universal family $\mathcal {H}$ that we ignore for this introduction. See more details in Sects. 3 and 4. We next present the constructions. The main ideas of the proofs for the following theorems are described in Sect. 1.2.
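The counting argument above can be checked by brute force on a toy example. The sketch below assumes a stand-in function `x >> k` (not from the paper, and certainly not one-way), chosen only because it is regular with regularity $r=2^k$; the helper name `count_pairs` is likewise hypothetical.

```python
def count_pairs(f, n, y1):
    """Count the distinct values of the pair (x1, f(x2)) given that f(x1) = y1."""
    domain = range(2 ** n)
    preimages = [x for x in domain if f(x) == y1]  # r possible values for x1
    image = {f(x) for x in domain}                 # 2^n / r possible values for f(x2)
    return len({(x1, y2) for x1 in preimages for y2 in image})

n = 6
for k in range(n + 1):
    # Toy regular function with regularity r = 2^k: drop the k low-order bits.
    f = lambda x, k=k: x >> k
    # The regularity cancels out: r * (2^n / r) = 2^n for every k.
    assert count_pairs(f, n, f(5)) == 2 ** n
```

Since the pair is uniform over these $2^n$ values for a regular f, its entropy and min-entropy are both n, whatever r is.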

A simple construction of PRGs from regular one-way functions. We start with a description of our PRG construction. Let $\mathcal {H}= \left\{ h:\left\{ 0,1\right\} ^{2n}\rightarrow \left\{ 0,1\right\} ^{n+\log n}\right\} $ be a family of 2-universal hash functions. For a regular one-way function $f:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^n$ and an integer $t\in {\mathbb {N}}$,^{Footnote 5} the generator $G_{t}:\mathcal {H}\times \left\{ 0,1\right\} ^{ n(t+1)}\rightarrow \mathcal {H}\times \left\{ 0,1\right\} ^{t\cdot (n+\log n)}$ is given by

$$\begin{aligned} G_t\big (h,x_1,\dots , x_{t+1}\big )=\left( h,h(x_1,f(x_{2})), \dots ,h(x_t,f(x_{t+1}))\right) \end{aligned}$$

We show that for every polynomial t, the distribution $G_{t}(\mathcal {H}, X_1,\dots , X_{t+1})$ is pseudorandom. Note that the input length of $G_{t}$ is $|h| + n\cdot (t+1)$ and the output length is $|h| + t\cdot (n+\log n)$, so the output is longer than the input exactly when $t\cdot \log n > n$. By making $t = \Theta (n/\log n)$ calls, we show that $G_{t}$ is indeed a pseudorandom generator.
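To make the shape of $G_t$ concrete, here is a minimal sketch under loudly stated assumptions: the hash family is a uniformly random binary matrix (universal, but not claimed to have the additional properties the paper requires of $\mathcal {H}$), and `toy_f` is a placeholder that is neither regular nor one-way. All function names are hypothetical.

```python
import math
import secrets

def sample_hash(in_bits, out_bits):
    """Random in_bits x out_bits matrix over GF(2), one out_bits-bit integer per row."""
    return [secrets.randbits(out_bits) for _ in range(in_bits)]

def apply_hash(h, v):
    """Return v * M over GF(2): XOR the rows selected by the set bits of v."""
    out = 0
    for i, row in enumerate(h):
        if (v >> i) & 1:
            out ^= row
    return out

def prg(f, n, h, xs):
    """G_t(h, x_1..x_{t+1}) = (h, h(x_1, f(x_2)), ..., h(x_t, f(x_{t+1})))."""
    ys = [f(x) for x in xs[1:]]  # t calls; every input is fixed up front (non-adaptive)
    blocks = [apply_hash(h, (x << n) | y) for x, y in zip(xs[:-1], ys)]
    return h, blocks

n = 16
log_n = int(math.log2(n))
t = n // log_n + 1                        # smallest t allowed by Theorem 1.1 for this n
toy_f = lambda x: (x * x + 1) % (2 ** n)  # placeholder only, NOT a one-way function
h = sample_hash(2 * n, n + log_n)
xs = [secrets.randbits(n) for _ in range(t + 1)]
_, blocks = prg(toy_f, n, h, xs)

in_len = n * (t + 1)         # seed bits, excluding the description of h
out_len = t * (n + log_n)    # output bits, excluding the description of h
assert out_len - in_len == t * log_n - n
```

The description of h appears on both sides, so the stretch is $t\log n - n$ bits, which is positive once t exceeds $n/\log n$.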

Theorem 1.1

(Main theorem for PRG, informal) Let $f:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^n$ be an unknown-regular one-way function and let $t(n) \ge n/\log n+1$ be some polynomial. Then, $G_t$ is a PRG with seed length $O(n^2+n(t(n)+1))$. Furthermore, $G_t$ makes t(n) non-adaptive calls to f.

A simple construction of UOWHFs from regular one-way functions. Now we introduce the construction of the UOWHFs. It is a well-known fact that in order to construct a UOWHF, it is sufficient to construct a function for which it is hard to find a collision for a random input. Let f be a one-way function, let t be a parameter and let $\mathcal {H}= \left\{ h:\left\{ 0,1\right\} ^{2n}\rightarrow \left\{ 0,1\right\} ^{n-\log n}\right\} $ be a family of hash functions. We define the function $C_{t}:\mathcal {H}\times \left\{ 0,1\right\} ^{ n\cdot t}\rightarrow \mathcal {H}\times \left\{ 0,1\right\} ^{(t-1)\cdot (n-\log n)+2n}$ as

$$\begin{aligned} C_{t}\left( h,x_1,\dots , x_t\right) = \left( h,f(x_1),h(x_1,f(x_{2})), \dots ,h(x_{t-1},f(x_{t})),x_t\right) \end{aligned}$$

The main difference of this construction from the PRG one is that h is now a shrinking function. In addition, we also output $f(x_1)$ and the very last input of $C_t$. As before, since the output length of UOWHFs has to be shorter than the input length, we have to make up for the additional output $(f(x_1), x_t)$ by taking t to be $\Theta (n/\log n)$.
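The compression function $C_t$ can be sketched in the same style. As before, this is only an illustration under assumptions: a random binary matrix stands in for $h\in \mathcal {H}$ (without the extra properties the paper needs), `toy_f` is a placeholder that is not one-way, and the names `apply_hash` and `compress` are hypothetical.

```python
import secrets

def apply_hash(h, v):
    """Return v * M over GF(2), where h lists the rows of M as integers."""
    out = 0
    for i, row in enumerate(h):
        if (v >> i) & 1:
            out ^= row
    return out

def compress(f, n, h, xs):
    """C_t(h, x_1..x_t) = (h, f(x_1), h(x_1,f(x_2)), ..., h(x_{t-1},f(x_t)), x_t)."""
    ys = [f(x) for x in xs]  # t non-adaptive calls to f
    blocks = [apply_hash(h, (xs[i] << n) | ys[i + 1]) for i in range(len(xs) - 1)]
    return h, ys[0], blocks, xs[-1]

n, log_n = 16, 4
t = n // log_n + 2                        # smallest t allowed by Theorem 1.2 for this n
toy_f = lambda x: (x * x + 1) % (2 ** n)  # placeholder only, NOT a one-way function
h = [secrets.randbits(n - log_n) for _ in range(2 * n)]  # shrinking hash: 2n -> n - log n bits
xs = [secrets.randbits(n) for _ in range(t)]
out = compress(toy_f, n, h, xs)

in_len = n * t                            # input bits, excluding h
out_len = (t - 1) * (n - log_n) + 2 * n   # output bits, excluding h
assert in_len - out_len == (t - 1) * log_n - n
```

The function shrinks by $(t-1)\log n - n$ bits, so it compresses only when $t > n/\log n + 1$, which is why the extra outputs $f(x_1)$ and $x_t$ force $t=\Theta (n/\log n)$.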

The UOWHF can now be defined using $C_t$. Let $k=\log \left| \mathcal {H}\right| + n\cdot t$ and for a string $z\in \left\{ 0,1\right\} ^k$, let $C_z$ be the function defined by $C_z(w)= C_t(w\oplus z)$ for every $w\in \left\{ 0,1\right\} ^k$. Our main theorem for this part is stated as follows.

Theorem 1.2

(Main theorem for UOWHF, informal) Let $f:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^n$ be an unknown-regular one-way function and let $t(n) \ge n/\log n+2$ be some polynomial. Then, $\left\{ C_z\right\} _{z\in \left\{ 0,1\right\} ^k}$ is a family of universal one-way hash functions with key length $k=O(n^2+n\cdot t(n))$ and output length $O(n^2+n\cdot t(n))$. Furthermore, for every $z\in \left\{ 0,1\right\} ^k$, $C_z$ makes t non-adaptive calls to f.

On the seed/key length. As explained above, the seed (or key) length in our constructions is $O(n^2)$, compared to $\tilde{O}(n)$ bits in the previous constructions, which seems to be unavoidable due to the lower bound on the number of calls necessary for any black-box construction. However, one of the bottlenecks in the seed (or key) length of our construction is the description of the hash function we use. Thus, even if one is able to use correlated inputs to save input length, the seed length might still be long compared to the adaptive constructions. In this case, improving the seed length requires a suitable, more efficient hash family. For the PRG construction, we can use hash functions based on Toeplitz matrices for better seed length (see, for example, [9, 25]). For the UOWHF construction, we do not know whether such a hash function exists.

1.2 Proof Overview

Here we give a short overview of our proofs. For both constructions, the proof boils down to showing that each input pair $x_i,x_{i+1}$ induces a weak version of the desired primitive. For PRG, the main part of the security proof is showing that given $f(x_1)$ and h, it is hard to distinguish between $h(x_1,f(x_2))$ and a uniform string. For UOWHF, we prove the security by showing that given $h,x_1,x_2$, it is hard to find a collision $h,x'_1,x'_2$ to the function $C(h,x_1,x_2)=h,f(x_1),h(x_1,f(x_2))$. Note that it may be easy to find $x'_2 \ne x_2$ with $f(x'_2)=f(x_2)$. To solve this, we further demand that $f(x'_2)\ne f(x_2)$.^{Footnote 6} To show that this is enough, we prove that any collision in our UOWHF must contain a collision of the above form, for at least one input pair. Below we describe the main ideas in more detail.

The PRG construction. We start by sketching the security proof for the PRG. Let $X_1$ and $X_2$ be uniform random variables over $\left\{ 0,1\right\} ^n$, and let h be a hash function, uniformly sampled from a universal family of hash functions $\mathcal {H}= \left\{ h:\left\{ 0,1\right\} ^{2n}\rightarrow \left\{ 0,1\right\} ^{n+\log n}\right\} $. Recall that we want to show that given h and $f(X_1)$, it holds that $h(X_1,f(X_2))$ is computationally indistinguishable from uniform $n+\log n$ bits. For simplicity, assume that we are only interested in proving that the distinguishing advantage is at most $n^{-c}$, for some constant $c>1$.

The main observation is that for regular f, given $f(X_1)$, the pair $X_1,f(X_2)$ has exactly n bits of min-entropy. Thus, by the leftover hash lemma, the $n-O(c\log n)$ first bits of $h(X_1,f(X_2))$ are $n^{-c}/2$ statistically close to uniform. To argue that the suffix of $h(X_1,f(X_2))$ looks uniform, we show that $g(x_1,y)=h,f(x_1),h(x_1,y)_{1,\dots ,n-O(c\log n)}$ is a one-way function,^{Footnote 7} and thus we can use Goldreich–Levin in order to extract additional $O(c\log n)$ pseudorandom bits from $X_1,f(X_2)$.

The UOWHF construction. We now sketch the security proof for the UOWHF. Let $\mathcal {H}= \left\{ h:\left\{ 0,1\right\} ^{2n}\rightarrow \left\{ 0,1\right\} ^{n-\log n}\right\} $ be a universal family of hash functions. We show that given random h and uniformly sampled $x_1$ and $x_2$ from $\left\{ 0,1\right\} ^n$, it is hard to find $(x'_1,x'_2)\ne (x_1,x_2)$ such that $f(x_1)=f(x'_1)$, $f(x_2)\ne f(x'_2)$ and yet $h(x_1,f(x_2))=h(x'_1,f(x'_2))$. For $x_1,x_2 \in \left\{ 0,1\right\} ^n$ and $h\in \mathcal {H}$ we define

$$\begin{aligned} {{\mathcal {G}}}_{h,x_1,x_2}:=\left\{ (x'_1,y):h(x_1,f(x_2))= h(x'_1,y) \ \wedge \ f(x_1)=f(x'_1) \ \wedge \ y\in \textsf{Im}(f)\right\} . \end{aligned}$$

That is, the set ${{\mathcal {G}}}_{h,x_1,x_2}$ contains all the pairs $(x'_1,f(x'_2))$ for which $h,x'_1,x'_2$ collides with $h,x_1,x_2$. The main observation here is that, since h outputs $n-\log n$ bits, and there are exactly $2^n$ pairs $(x'_1,y)$ such that $y\in Im(f)$ and $f(x'_1)=f(x_1)$, the expected size of ${{\mathcal {G}}}_{h,x_1,x_2}$ is at most $2^n/2^{n-\log n}=n$. Thus, we can use an algorithm $\textsf{A}$ that finds a collision in the above function in order to invert f: Given input y, we choose random $x_1,x_2\in \left\{ 0,1\right\} ^n$ and plant y in ${{\mathcal {G}}}_{h,x_1,x_2}$. That is, we choose a random h conditioned on the event that $h(x_1,f(x_2))=h(x'_1,y)$ for some $x'_1 \in f^{-1}(f(x_1))$. Since there are about n such pairs, we can hope that the planted pair $(x'_1,y)$ will be output by $\textsf{A}$ with good probability.
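The expected-size estimate can be written out as a short calculation. This is a sketch for a uniformly random h only (not the planted distribution used in the reduction), assuming f is regular and that distinct inputs collide under a random $h\in \mathcal {H}$ with probability at most $2^{-(n-\log n)}$; the symbol ${\mathcal {S}}$ is introduced here for the set of candidate pairs and is not the paper's notation.

$$\begin{aligned} {\mathcal {S}}&:=\left\{ (x'_1,y):f(x'_1)=f(x_1) \ \wedge \ y\in \textsf{Im}(f)\right\} , \qquad \left| {\mathcal {S}}\right| = r\cdot \frac{2^n}{r} = 2^n,\\ {\mathop {\textrm{E}}\limits _{h\leftarrow \mathcal {H}}}\left[ \left| {{\mathcal {G}}}_{h,x_1,x_2}\right| \right]&= \sum _{(x'_1,y)\in {\mathcal {S}}} {\mathop {\Pr }\limits _{h\leftarrow \mathcal {H}}}\left[ h(x'_1,y)=h(x_1,f(x_2))\right] \\&\le 1+(2^n-1)\cdot 2^{-(n-\log n)} \le 1+n. \end{aligned}$$

The additive 1 accounts for the pair $(x_1,f(x_2))$ itself, which always belongs to the set; the informal bound of n in the text is best read as holding up to this term.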

However, we need to find $x'_1$ for which the pair $(x'_1,y)$ has a good probability to be output by $\textsf{A}$. To do that, we use a similar approach to the one presented in [2]. That is, we use $\textsf{A}$ twice. First, we use $\textsf{A}$ to find a pre-image $x'_1$ of $f(x_1)$, and then plant $(x'_1,y)$ in ${{\mathcal {G}}}_{h,x_1,x_2}$. Similarly to [2], we show by a “collision based” argument that $x'_1$ has a good probability to be output again by $\textsf{A}$. For more details, see Sect. 4.

1.3 Additional Related Work

Arbitrary one-way functions. In [12], the notion of inaccessible entropy (introduced in [14]) was used in order to construct UOWHF. Similar techniques were later used in [10] to construct PRG, where the notion of inaccessible entropy was replaced with next-block pseudoentropy. This construction was later simplified by [21], who also improved the seed length at the cost of adaptivity. Lately, [1] pointed out that the notions of accessible entropy and next-block pseudoentropy are deeply related to each other.

Regular one-way functions. As mentioned above, constructions from regular one-way functions are more efficient. Besides almost-regularity, a few other variants of regularity were considered in past works. [4] showed a construction for UOWHF that uses $O(ns^6(n))$ key length under the assumption that $f^{-1}(f(x))$ is concentrated in an interval of size $2^{s(n)}$. [24] considered unknown-weakly regular functions. These are functions for which the set of inputs with maximal number of siblings is of fraction at least $n^{-c}$ for some constant c. For such functions, [24] presented a PRG with $O(n\log n)$ seed length and $O(n^{2c+1})$ calls. [23] considered known-almost-regular and unknown-weakly regular functions. For the latter, [23] showed a tight construction of UOWHF based on the randomized iterate method.

1.4 Paper Organization

Formal definitions are given in Sect. 2. The PRG construction and proof of Theorem 1.1 are in Sect. 3. The UOWHF construction and proof of Theorem 1.2 are in Sect. 4.

2 Preliminaries

2.1 Notations

We use calligraphic letters to denote sets, uppercase for random variables, and lowercase for values and functions. For $n \in {{\mathbb {N}}}$, let $[n] :=\left\{ 1,\dots ,n\right\} $. Given a vector $s\in \left\{ 0,1\right\} ^n$, let $s_i$ denote its i-th entry, and $s_{1,\dots , i}$ denote its first i entries. For $s,w\in \left\{ 0,1\right\} ^*$ we use $s\circ w$ to denote their concatenation and for $s,w\in \left\{ 0,1\right\} ^n$, we use $s\oplus w \in \left\{ 0,1\right\} ^n$ to denote their bit-wise XOR.

The support of a distribution P over a finite set ${\mathcal {S}}$ is defined by ${\text {Supp}}(P) :=\left\{ x\in {\mathcal {S}}: P(x)>0\right\} $. For a (discrete) distribution D let $d\leftarrow D$ denote that d was sampled according to D. Similarly, for a set ${\mathcal {S}}$, let $s\leftarrow {\mathcal {S}}$ denote that s is drawn uniformly from ${\mathcal {S}}$. For a function $f:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^n$, let $y\leftarrow f(\left\{ 0,1\right\} ^n)$ denote that y is sampled from the following distribution: sample x uniformly from $\left\{ 0,1\right\} ^n$, and let $y=f(x)$. Let $\textsf{Im}(f) :=\left\{ f(x) :x\in \left\{ 0,1\right\} ^n\right\} $ be the image of f. The statistical distance (also known as variation distance) of two distributions P and Q over a discrete domain ${\mathcal {X}}$ is defined by $\mathsf {\textsc {SD}}({P},{Q}) :=\max _{{\mathcal {S}}\subseteq {\mathcal {X}}} \left| P({\mathcal {S}})-Q({\mathcal {S}})\right| = \frac{1}{2} \sum _{x \in {\mathcal {X}}}\left| P(x)-Q(x)\right| $. The min-entropy of a distribution X, denoted by ${\text {H}_{\infty }}(X)$, is defined by ${\text {H}_{\infty }}(X):=-\log (\max _{x\in {\text {Supp}}(X)}\left\{ \Pr \left[ X=x\right] \right\} )$.
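As a small worked example of these two definitions, the sketch below computes statistical distance in both of its equivalent forms and the min-entropy, for distributions given as dictionaries from outcomes to probabilities. The representation and the function names are illustrative assumptions, and the logarithm is taken base 2.

```python
import math

def statistical_distance(p, q):
    """Half the L1 distance between two distributions given as {outcome: probability}."""
    domain = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in domain)

def sd_by_best_event(p, q):
    """The max-over-events form: the maximizing event is S = {x : P(x) > Q(x)}."""
    domain = set(p) | set(q)
    return sum(p.get(x, 0.0) - q.get(x, 0.0)
               for x in domain if p.get(x, 0.0) > q.get(x, 0.0))

def min_entropy(p):
    """H_inf(X) = -log2 of the largest point probability."""
    return -math.log2(max(p.values()))

P = {"00": 0.5, "01": 0.25, "10": 0.25}
U = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}

assert statistical_distance(P, U) == sd_by_best_event(P, U) == 0.25
assert min_entropy(P) == 1.0 and min_entropy(U) == 2.0
```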

Let ${\text {poly}}$ denote the set of all polynomials, and let PPT stand for probabilistic polynomial time. A function $\nu :{{\mathbb {N}}}\rightarrow [0,1]$ is negligible, denoted $\nu (n) = neg(n)$, if $\nu (n) < 1/p(n)$ for every $p\in {\text {poly}}$ and large enough n. Lastly, we identify a matrix $M\in \left\{ 0,1\right\} ^{n \times m}$ with a function $M:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^m$ by $M(x):=x\cdot M$, thinking of $x\in \left\{ 0,1\right\} ^n$ as a vector with dimension n.

2.2 One-Way Functions

We now formally define basic cryptographic primitives. We start with the definition of one-way function.

Definition 2.1

(One-way function) A polynomial-time computable function $f:\{0,1\}^{*}\rightarrow \{0,1\}^{*}$ is called a one-way function if for every probabilistic polynomial-time algorithm $\textsf{A}$, there is a negligible function $\nu :{\mathbb {N}}\rightarrow [0,1]$ such that for every $n\in {{\mathbb {N}}}$

$$\begin{aligned} {\mathop {\Pr }\limits _{x\leftarrow \{0,1\}^{n}}}\left[ \textsf{A}(f(x))\in f^{-1}(f(x))\right] \le \nu (n) \end{aligned}$$

For simplicity we assume that the one-way function f is length-preserving. That is, $\left| f(x)\right| =\left| x\right| $ for every $x\in \left\{ 0,1\right\} ^*$. This can be assumed without loss of generality, and is not crucial for our constructions.

In this paper we focus on almost-regular one-way functions, formally defined below.

Definition 2.2

(Almost-regular function) A function family $f=\left\{ f_n: \left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^n\right\} $ is $\beta $-almost-regular for $\beta \ge 0$ if for every $n\in {{\mathbb {N}}}$ and $x\in \left\{ 0,1\right\} ^n$ it holds that

$$\begin{aligned} \frac{2^n}{\left| \textsf{Im}(f) ~\right| }\cdot n^{-\beta } \le \left| f^{-1}(f(x))\right| \le \frac{2^n}{\left| \textsf{Im}(f) ~\right| }\cdot n^\beta . \end{aligned}$$

f is almost-regular if there exists $\beta \ge 0$ such that f is $\beta $-almost-regular, and regular if it is 0-almost-regular.

Note that we do not assume that the regularity of f can be computed efficiently. That is, we only assume that f is unknown-(almost)-regular.
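For intuition, Definition 2.2 can be evaluated by brute force on small toy functions. The sketch below computes, for a single input length n, the smallest $\beta $ satisfying the two inequalities; the definition itself requires one $\beta $ that works for every n. Both example functions and the name `smallest_beta` are assumptions made for illustration.

```python
import math
from collections import Counter

def smallest_beta(f, n):
    """Smallest beta with (2^n/|Im f|) * n^-beta <= |f^-1(f(x))| <= (2^n/|Im f|) * n^beta."""
    sizes = Counter(f(x) for x in range(2 ** n))  # preimage size of every image point
    avg = 2 ** n / len(sizes)                     # 2^n / |Im(f)|
    ratio = max(max(sizes.values()) / avg, avg / min(sizes.values()))
    return math.log(ratio, n)                     # requires n >= 2

def regular(x):
    return x >> 1  # every image point has exactly 2 preimages

def skewed(x):
    # Images 0..3 have 2 preimages each, images 4 and 5 have 4 preimages each.
    return x >> 1 if x < 8 else 4 + ((x - 8) >> 2)

assert smallest_beta(regular, 4) == 0.0      # regular means 0-almost-regular
assert 0.29 < smallest_beta(skewed, 4) < 0.30
```

This exhaustive computation takes time $2^n$, which is consistent with the remark above: nothing in the constructions assumes the regularity can be computed efficiently.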

Immediately from the definition of a one-way function, we get the following simple observation.

Claim 2.3

For every one-way function $f:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^n$, there exists a negligible function $\nu (n)$ such that for every input $x\in \left\{ 0,1\right\} ^n$ it holds that $\left| f^{-1}(f(x))\right| \le 2^n\cdot \nu (n)$.

2.3 Pseudorandom Generators

In Sect. 3 we use one-way functions in order to construct PRGs. The latter are formally defined below.

Definition 2.4

(Pseudorandom generator) Let n be a security parameter. A polynomial-time computable function $G:\{0,1\}^{n}\rightarrow \{0,1\}^{m(n)}$ is called a pseudorandom generator if for every $n>0$ it holds that $m(n)>n$ and, for every probabilistic polynomial-time algorithm $\textsf{D}$, there is a negligible function $\nu :{\mathbb {N}}\rightarrow [0,1]$ such that for every $n>0$,

$$\begin{aligned} \left| {\mathop {\Pr }\limits _{x\leftarrow \left\{ 0,1\right\} ^{n}}\left[ \textsf{D}(G(x))=1\right] }-{\mathop {\Pr }\limits _{x\leftarrow \left\{ 0,1\right\} ^{m(n)}}\left[ \textsf{D}(x)=1\right] }\right| \le \nu (n). \end{aligned}$$

A key ingredient in the construction of PRG from one-way function is the Goldreich–Levin hardcore predicate. The following lemma follows almost directly from [9].

Lemma 2.5

Let n be a security parameter. Let $f:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^n$ be a function, and D a distribution on $\left\{ 0,1\right\} ^n$, such that for every PPT $\textsf{A}$, there exists a negligible function $\nu $, such that

$$\begin{aligned} {\mathop {\Pr }\limits _{x\leftarrow D}\left[ \textsf{A}(f(x))\in f^{-1}(f(x))\right] } \le \nu (n). \end{aligned}$$

Then for every PPT $\textsf{P}$, there exists a negligible function $\nu '$, such that,

$$\begin{aligned} {\mathop {\Pr }\limits _{x\leftarrow D, r\leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{P}(f(x),r)={\text {GL}}(x,r)\right] } \le 1/2+\nu '(n) \end{aligned}$$

where ${\text {GL}}(x,r):=\langle x,r \rangle $ is the Goldreich–Levin predicate.
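For concreteness, the predicate is the inner product of x and r over GF(2), that is, the parity of the bitwise AND. A minimal sketch, with n-bit strings encoded as Python integers (an encoding chosen here for illustration):

```python
def gl(x: int, r: int) -> int:
    """Goldreich-Levin predicate <x, r> mod 2 for bit strings encoded as integers."""
    return bin(x & r).count("1") % 2

assert gl(0b1011, 0b0110) == 1
# The predicate is linear in x over GF(2).
assert gl(0b1011 ^ 0b0101, 0b0110) == gl(0b1011, 0b0110) ^ gl(0b0101, 0b0110)
```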

Proof

By the proof of Goldreich–Levin [9], for every $p\in {\text {poly}}$ there is an oracle-aided PPT algorithm $\textsf{A}$ such that for every algorithm $\textsf{P}$ and x with

$$\begin{aligned} {\mathop {\Pr }\limits _{r\leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{P}(f(x),r)={\text {GL}}(x,r)\right] } \ge 1/2+1/p(n) \end{aligned}$$

it holds that

$$\begin{aligned} \Pr \left[ \textsf{A}^{\textsf{P}}(f(x))=x\right] \ge 1/p^2(n). \end{aligned}$$

On the other hand, since $\textsf{A}^{\textsf{P}}$ is an efficient algorithm when $\textsf{P}$ is, there exists some negligible function $\nu $ such that

$$\begin{aligned} {\mathop {\Pr }\limits _{x\leftarrow D}\left[ \textsf{A}^{\textsf{P}}(f(x))=x\right] } \le \nu (n). \end{aligned}$$

Thus, it holds for every $p\in {\text {poly}}$ that

$$\begin{aligned} {\mathop {\Pr }\limits _{x\leftarrow D}\left[ {\mathop {\Pr }\limits _{r\leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{P}(f(x),r)={\text {GL}}(x,r)\right] } \ge 1/2+1/p(n)\right] }\le \nu (n)\cdot p^2(n). \end{aligned}$$

This implies that

$$\begin{aligned} {\mathop {\Pr }\limits _{x\leftarrow D, r\leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{P}(f(x),r)={\text {GL}}(x,r)\right] } \le 1/2+1/p(n) +\nu (n)\cdot p^2(n). \end{aligned}$$

Since the above holds for every $p\in {\text {poly}}$, and since $\nu (n)\cdot p^2(n)$ is a negligible function for every such p (in particular, $\nu (n)\cdot p^2(n)\le 1/p(n)$ for large enough n), we get that

$$\nu '(n)={\mathop {\Pr }\limits _{x\leftarrow D, r\leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{P}(f(x),r)={\text {GL}}(x,r)\right] }-1/2$$

is a negligible function. $\square $

The next lemma, stated in [22], is useful for showing that a sequence of bits is pseudorandom. The proof of the lemma is given below for completeness.

Lemma 2.6

(Distinguishability to prediction) There exists an oracle-aided PPT algorithm $\textsf{P}$ such that the following holds. Let Q be a distribution over $\left\{ 0,1\right\} ^*\times \left\{ 0,1\right\} ^n$, let $\textsf{D}$ be an algorithm and $\alpha \in [0,1]$ such that,

$$\begin{aligned} {\mathop {\Pr }\limits _{(x,y)\leftarrow Q, z\leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{D}(x,z)=1\right] } -{\mathop {\Pr }\limits _{(x,y)\leftarrow Q}\left[ \textsf{D}(x,y)=1\right] }\ge \alpha . \end{aligned}$$

Then there exists $i\in [n]$ such that

$$\begin{aligned} {\mathop {\Pr }\limits _{(x,y) \leftarrow Q}\left[ \textsf{P}^{\textsf{D}}(x,y_{1,\dots ,i-1})=y_i\right] } \ge 1/2 + \alpha /n. \end{aligned}$$

Proof of Lemma 2.6

Let $Q, \textsf{D}$ and $\alpha $ be as in Lemma 2.6. We start by showing that $\textsf{D}$ can be used in order to distinguish $y_{i}$ from uniform bit given $x,y_{1,\dots , i-1}$ for some index $i\in [n]$. Later, we use this fact in order to predict $y_i$. Indeed, it holds that

$$\begin{aligned} \alpha&\le {\mathop {\Pr }\limits _{(x,y)\leftarrow Q, z\leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{D}(x,z)=1\right] } - {\mathop {\Pr }\limits _{(x,y)\leftarrow Q}\left[ \textsf{D}(x,y)=1\right] } \\&\le \sum _{i=1}^{n} \left( {\mathop {\Pr }\limits _{(x,y)\leftarrow Q,z\leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{D}(x,y_{1,\dots ,i-1},z_{i,\dots , n})=1\right] }\right. \\&\quad \left. -{\mathop {\Pr }\limits _{(x,y)\leftarrow Q, z\leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{D}(x,y_{1,\dots ,i},z_{i+1,\dots , n})=1\right] }\right) , \end{aligned}$$

and thus there exists $i \in [n]$ such that

$$\begin{aligned}&\epsilon :={\mathop {\Pr }\limits _{\begin{array}{c} (x,y)\leftarrow Q, b\leftarrow \left\{ 0,1\right\} \\ z\leftarrow \left\{ 0,1\right\} ^{n-i} \end{array}}\left[ \textsf{D}(x,y_{1,\dots ,i-1},b,z)=1\right] }\nonumber \\&\quad -{\mathop {\Pr }\limits _{\begin{array}{c} (x,y)\leftarrow Q,\\ z\leftarrow \left\{ 0,1\right\} ^{n-i} \end{array}}\left[ \textsf{D}(x,y_{1,\dots ,i-1},y_{i},z)=1\right] } \ge \alpha /n \end{aligned}$$

(1)

as we wanted to show. We now describe the predictor $\textsf{P}$. Consider the following algorithm.
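A predictor consistent with the probability calculation that follows can be sketched as below. This is a reconstruction from that calculation, not necessarily the authors' exact pseudocode: it guesses a bit b, pads with uniform bits, and flips its guess when the distinguisher outputs 1 (which, by Eq. (1), is more likely on a uniform bit than on the real $y_i$). The calling convention for `D` and the list encoding of bit strings are assumptions.

```python
import secrets

def predictor(D, x, y_prefix, n):
    """P^D(x, y_1..y_{i-1}): guess y_i, where D(x, bits) returns 0 or 1."""
    i = len(y_prefix) + 1
    b = secrets.randbits(1)                          # candidate value for y_i
    z = [secrets.randbits(1) for _ in range(n - i)]  # uniform padding for positions i+1..n
    if D(x, y_prefix + [b] + z) == 1:
        return 1 - b   # D says "looks uniform": b was probably the wrong bit
    return b           # D says "looks real": keep the guess

# Toy check: y_1 always equals the bit x, and D outputs 1 exactly when the
# first bit it sees differs from x. The predictor is then always correct.
toy_D = lambda x, bits: int(bits[0] != x)
assert all(predictor(toy_D, x, [], 3) == x for x in (0, 1) for _ in range(50))
```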

We next show that the probability that $\textsf{P}$ outputs $y_i$ is at least $1/2 + \alpha /n$.

Let $p:={\mathop {\Pr }\limits _{\begin{array}{c} (x,y)\leftarrow Q,\\ z\leftarrow \left\{ 0,1\right\} ^{n-i} \end{array}}\left[ \textsf{D}(x,y_{1,\dots ,i-1},y_{i},z)=1\right] }$. It holds that

$$\begin{aligned} p+\epsilon&={\mathop {\Pr }\limits _{\begin{array}{c} (x,y)\leftarrow Q, b\leftarrow \left\{ 0,1\right\} \\ z\leftarrow \left\{ 0,1\right\} ^{n-i} \end{array}}\left[ \textsf{D}(x,y_{1,\dots ,i-1},b,z)=1\right] }\\&= 1/2\cdot ({\mathop {\Pr }\limits _{\begin{array}{c} (x,y)\leftarrow Q,\\ z\leftarrow \left\{ 0,1\right\} ^{n-i} \end{array}}\left[ \textsf{D}(x,y_{1,\dots ,i-1},y_{i},z)=1\right] }\\&\quad +{\mathop {\Pr }\limits _{\begin{array}{c} (x,y)\leftarrow Q,\\ z\leftarrow \left\{ 0,1\right\} ^{n-i} \end{array}}\left[ \textsf{D}(x,y_{1,\dots ,i-1},1-y_{i},z)=1\right] })\\&= 1/2\cdot (p+{\mathop {\Pr }\limits _{\begin{array}{c} (x,y)\leftarrow Q,\\ z\leftarrow \left\{ 0,1\right\} ^{n-i} \end{array}}\left[ \textsf{D}(x,y_{1,\dots ,i-1},1-y_{i},z)=1\right] }). \end{aligned}$$

Thus, ${\mathop {\Pr }\limits _{\begin{array}{c} (x,y)\leftarrow Q,\\ z\leftarrow \left\{ 0,1\right\} ^{n-i} \end{array}}\left[ \textsf{D}(x,y_{1,\dots ,i-1},1-y_{i},z)=1\right] }= p+2\epsilon $. Consequently, the probability that $\textsf{P}$ outputs $y_i$ is given by

$$\begin{aligned}&{\mathop {\Pr }\limits _{b \leftarrow \left\{ 0,1\right\} }\left[ b =y_i\right] }\cdot (1- p)+{\mathop {\Pr }\limits _{b \leftarrow \left\{ 0,1\right\} }\left[ b =1-y_i\right] }\cdot \\&\quad {\mathop {\Pr }\limits _{\begin{array}{c} (x,y)\leftarrow Q,\\ z\leftarrow \left\{ 0,1\right\} ^{n-i} \end{array}}\left[ \textsf{D}(x,y_{1,\dots ,i-1},1-y_{i},z)=1\right] }\\&= 1/2\cdot (1-p) + 1/2 \cdot (p+2\epsilon )\\&= 1/2 + \epsilon \\&\ge 1/2 + \alpha /n \end{aligned}$$

as needed. $\square $
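The description of $\textsf{P}$ is not reproduced in the text above, so the following Python sketch reconstructs it from the analysis only: $\textsf{P}$ guesses a bit $b$, pads with uniform bits $z$, and keeps $b$ exactly when $\textsf{D}$ outputs 0. The function name `predictor`, the list-of-bits encoding and the toy distinguisher in the usage note are assumptions made for illustration, not the authors' pseudocode.

```python
import random

def predictor(D, x, y_prefix, n, rng=random):
    """Guess the bit y_i from x and y_1..y_{i-1}, where i = len(y_prefix) + 1.

    D(x, s) is a distinguisher on an n-bit string s that, as in the proof,
    outputs 1 more often on random continuations than on the true one.
    """
    i = len(y_prefix) + 1
    b = rng.randrange(2)                          # uniform guess for y_i
    z = [rng.randrange(2) for _ in range(n - i)]  # uniform padding for positions i+1..n
    if D(x, list(y_prefix) + [b] + z) == 1:
        return 1 - b   # D "complains": the guess was probably wrong, flip it
    return b           # D accepts: keep the guess
```

For example, with a toy distinguisher that outputs 1 exactly when the second bit of its input differs from `x[0]`, the call `predictor(toy_D, [1, 0], [0], 4)` always returns `x[0]`, matching the case $\epsilon = 1/2$ of the calculation above.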

2.4 Universal One-Way Hash Function

Lastly, we formally define UOWHF.

Definition 2.7

(Universal one-way hash function) Let k be a security parameter. A family of functions ${\mathcal {F}}=\left\{ f_z:\left\{ 0,1\right\} ^{n(k)}\rightarrow \left\{ 0,1\right\} ^{m(k)}\right\} _{z \in \left\{ 0,1\right\} ^k}$ is a family of universal one-way hash functions (UOWHFs) if it satisfies:

1.
Efficiency: Given $z\in \left\{ 0,1\right\} ^k$ and $x\in \left\{ 0,1\right\} ^{n(k)}$, $f_z(x)$ can be evaluated in time ${\text {poly}}(n(k),k)$.
2.
Shrinking: $m(k)<n(k)$.
3.
Target Collision Resistance: For every probabilistic polynomial-time adversary $\textsf{A}$, the probability that $\textsf{A}$ succeeds in the following game is negligible in k:
(a)
  Let $(x,state)\leftarrow \textsf{A}(1^k)\in \left\{ 0,1\right\} ^{n(k)}\times \left\{ 0,1\right\} ^*$.
(b)
  Choose $z\leftarrow \left\{ 0,1\right\} ^k$.
(c)
  Let $x'\leftarrow \textsf{A}(state, z)\in \left\{ 0,1\right\} ^{n(k)}$.
(d)
  $\textsf{A}$ succeeds if $x\ne x'$ and $f_z(x)=f_z(x')$.

A relaxation of the target collision resistance property can be done by requiring the function to be collision resistant only on random inputs. This is also known as second pre-image resistance (SPR).

Definition 2.8

(Collision resistance on random inputs) Let n be a security parameter. A function $f:\left\{ 0,1\right\} ^{n}\rightarrow \left\{ 0,1\right\} ^{m(n)}$ is collision resistant on random inputs if for every probabilistic polynomial-time adversary $\textsf{A}$, the probability that $\textsf{A}$ succeeds in the following game is negligible in n:

1.
Choose $x\leftarrow \left\{ 0,1\right\} ^{n}$.
2.
Let $x'\leftarrow \textsf{A}(x)\in \left\{ 0,1\right\} ^{n}$.
3.
$\textsf{A}$ succeeds if $x\ne x'$ and $f(x)=f(x')$.

The following lemma states that, in order to get a UOWHF, it is enough to construct a function that is collision resistant on random inputs.

Lemma 2.9

(From random inputs to targets, folklore) Let n be a security parameter. Let $F:\left\{ 0,1\right\} ^{n} \rightarrow \left\{ 0,1\right\} ^{m(n)}$ be a length-decreasing function. Suppose F is collision-resistant on random inputs. Then $\left\{ F_y:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^m\right\} _{y\in \left\{ 0,1\right\} ^n}$, for $F_y(x):=F(y\oplus x)$, is a family of target collision-resistant hash functions.
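The folklore reduction behind this lemma can be sketched in a few lines of Python. The sketch assumes $n$-bit strings are encoded as Python integers and models the target-collision adversary as two hypothetical callbacks, `commit` and `find`; these names, and the toy function used in the usage note, are illustrative assumptions rather than part of the paper.

```python
import secrets

def keyed(F, y):
    """Return F_y, where F_y(x) = F(y XOR x)."""
    return lambda x: F(y ^ x)

def random_input_collision(F, n, commit, find):
    """Turn a target-collision adversary for {F_y} into a collision finder
    for F on a random input x."""
    x = secrets.randbits(n)      # random-input challenge for F
    target, state = commit(n)    # the adversary commits to its target first
    y = x ^ target               # uniform key, since x is uniform and independent of target
    x_prime = find(state, y)     # claimed collision with target under F_y
    # F_y(target) = F(x) and F_y(x_prime) = F(y ^ x_prime), so y ^ x_prime
    # collides with x under F, and differs from x exactly when x_prime != target.
    return x, y ^ x_prime
```

With the toy (clearly insecure) choice `F = lambda v: v >> 1`, an adversary that commits to target 0 and answers with `target ^ 1` makes `random_input_collision` return a pair of distinct inputs with the same image under `F`.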

2.5 2-Universal Hash Families

2-universal families are an important ingredient in our constructions. In this section, we formally define this notion, together with some useful properties of such families.

Definition 2.10

(2-universal family) A family of functions ${\mathcal {F}}=\left\{ f:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^\ell \right\} $ is 2-universal if for every $x \ne x' \in \left\{ 0,1\right\} ^n$ it holds that ${\mathop {\Pr }\limits _{f \leftarrow {\mathcal {F}}}\left[ f(x)=f(x')\right] }= 2^{-\ell }$.

A 2-universal family is explicit if given a description of a function $f \in {\mathcal {F}}$ and $x\in \left\{ 0,1\right\} ^n$, f(x) can be computed in polynomial time (in $n,\ell $). Such a family is constructible if it is explicit and there is a PPT algorithm that given $x,x'\in \left\{ 0,1\right\} ^n$ outputs a uniform $f \in {\mathcal {F}}$, such that $f(x) =f(x')$.
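As a small sanity check of the definition, the following Python sketch brute-forces the collision probability for the family of all binary $n\times \ell $ matrices, which is the family used later in the text. The helper names `apply_matrix` and `collision_count` are illustrative assumptions, and the enumeration is only feasible for tiny parameters.

```python
from itertools import product

def apply_matrix(m, x):
    """Return x . m over GF(2); m is a list of n rows, each a list of l bits."""
    out = [0] * len(m[0])
    for bit, row in zip(x, m):
        if bit:
            out = [a ^ b for a, b in zip(out, row)]
    return out

def collision_count(n, l, x, x_prime):
    """Count the matrices m in {0,1}^(n x l) with x . m == x' . m by enumeration."""
    count = 0
    for flat in product([0, 1], repeat=n * l):
        m = [list(flat[r * l:(r + 1) * l]) for r in range(n)]
        if apply_matrix(m, x) == apply_matrix(m, x_prime):
            count += 1
    return count
```

For $n=3$ and $\ell =2$ there are $2^6=64$ matrices, and for any fixed $x\ne x'$ the count should equal $64\cdot 2^{-2}=16$, in line with Definition 2.10.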

An important property of 2-universal families is that they can be used to construct a strong extractor. This is stated in the leftover hash lemma:

Lemma 2.11

(Leftover hash lemma [18]) Let $n\in {{\mathbb {N}}}$, $\epsilon \in [0,1]$, and let X be a random variable over $\left\{ 0,1\right\} ^n$. Let $\mathcal {H}=\left\{ h:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^\ell \right\} $ be a 2-universal hash family with $\ell \le {\text {H}_{\infty }}(X)-2\log 1/\epsilon $. Then,

$$\begin{aligned} SD((H,H(X)),(H, U_\ell )) \le \epsilon \end{aligned}$$

for $U_\ell $ being the uniform distribution over $\left\{ 0,1\right\} ^\ell $ and H being the uniform distribution over $\mathcal {H}$.

Recall that we identify a matrix $M\in \left\{ 0,1\right\} ^{n\times \ell }$ with a function $M:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^{\ell }$ defined by $M(x)=x\cdot M$. The family of all binary matrices of size $n\times \ell $, $\left\{ m :m\in \left\{ 0,1\right\} ^{n\times \ell }\right\} $, is a constructible 2-universal family. This family has an additional property that is useful in the proof. This property is defined below.

Definition 2.12

(Approximately flat family) A family of functions $\mathcal {H}=\left\{ h :\left\{ 0,1\right\} ^{2n} \rightarrow \left\{ 0,1\right\} ^{\ell }\right\} $ is approximately flat if for every set ${\mathcal {Y}}\subseteq \left\{ 0,1\right\} ^n$, $x_1,x_2 \in \left\{ 0,1\right\} ^n$ and $y_1 \in {\mathcal {Y}}$ it holds that,

$$\begin{aligned} {\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \exists y_2\in {\mathcal {Y}}\text { s.t. } h(x_1,y_1)= h(x_2,y_2)\right] } \ge 2^{-10}\cdot \min \left\{ \left| {\mathcal {Y}}\right| \cdot 2^{-\ell }, 1\right\} . \end{aligned}$$

Lemma 2.13

For every $\ell ,n \in {{\mathbb {N}}}$ such that $\ell \le n$, the family $\left\{ m :m\in \left\{ 0,1\right\} ^{2n\times \ell }\right\} $ is approximately flat.

Proof of Lemma 2.13

Fix ${\mathcal {Y}},x_1,x_2$ and $y_1$ as in Definition 2.12. We want to show that

$$\begin{aligned} {\mathop {\Pr }\limits _{M \leftarrow \left\{ 0,1\right\} ^{2n\times \ell }}\left[ \exists y_2\in {\mathcal {Y}}\text { s.t. } M(x_1,y_1)= M(x_2,y_2)\right] }\ge 2^{-10}\cdot \min \left\{ \left| {\mathcal {Y}}\right| \cdot 2^{-\ell }, 1\right\} . \end{aligned}$$

We first assume that $x_1\ne x_2$, as otherwise the lemma holds trivially (take $y_2=y_1$). Next, we observe that M can be split into two blocks $M_{\mathcal {X}}\in \left\{ 0,1\right\} ^{n\times \ell }$ and $M_{\mathcal {Y}}\in \left\{ 0,1\right\} ^{n\times \ell }$, such that for all vectors $x,y\in \left\{ 0,1\right\} ^n$ it holds that

$$\begin{aligned}&M(x,y) = (x \cdot M_{\mathcal {X}}) \oplus (y \cdot M_{{\mathcal {Y}}}). \end{aligned}$$

(2)

We want to bound the probability that there exists $y_2\in {\mathcal {Y}}$ such that $M(x_1,y_1)= M(x_2,y_2)$, or equivalently,

$$\begin{aligned}&(x_1 \oplus x_2) \cdot M_{\mathcal {X}}= (y_2 \oplus y_1) \cdot M_{\mathcal {Y}}. \end{aligned}$$

(3)

Since $x_1 \ne x_2$, it holds that $(x_1 \oplus x_2)\cdot M_{\mathcal {X}}$ is a uniform element in $\left\{ 0,1\right\} ^{\ell }$. Thus, we are interested in lower bounding the probability

$$\begin{aligned}&{\mathop {\Pr }\limits _{M_{\mathcal {Y}}\leftarrow \left\{ 0,1\right\} ^{n\times \ell }, z' \leftarrow \left\{ 0,1\right\} ^{\ell }}\left[ \exists y_2\in {\mathcal {Y}}\text { s.t. } z'= (y_2\oplus y_1)\cdot M_{\mathcal {Y}}\right] }\\&= {\mathop {\Pr }\limits _{M_{\mathcal {Y}}\leftarrow \left\{ 0,1\right\} ^{n\times \ell }, z \leftarrow \left\{ 0,1\right\} ^{\ell }}\left[ \exists y_2\in {\mathcal {Y}}\text { s.t. } z= y_2\cdot M_{\mathcal {Y}}\right] } \end{aligned}$$

where the equality holds since $z:=z' \oplus y_1\cdot M_{{\mathcal {Y}}}$ is a uniform element in $\left\{ 0,1\right\} ^\ell $ which is independent of $M_{{\mathcal {Y}}}$. In the following we show that with probability at least 1/2 over the choice of $M_{\mathcal {Y}}$, the size of the set $ {\mathcal {Y}}\cdot M_{\mathcal {Y}}= \left\{ y\cdot M_{\mathcal {Y}}:y\in {\mathcal {Y}}\right\} $ is at least $\min \left\{ |{\mathcal {Y}}|/2,2^\ell /32\right\} $, from which the lemma follows.

To see the above, first notice that for every vector $v\in \left\{ 0,1\right\} ^n$ with $v\ne 0$, it holds that

$$\begin{aligned} {\mathop {\Pr }\limits _{M_{\mathcal {Y}}}\left[ v \cdot M_{\mathcal {Y}}=0\right] } = 2^{-\ell } \end{aligned}$$

and thus,

$$\begin{aligned}&{\mathop {{\text {E}}}\limits _{M_{\mathcal {Y}}}\left[ \left| \left\{ y_1\ne y_2\in {\mathcal {Y}}:y_1\cdot M_{\mathcal {Y}}=y_2 \cdot M_{\mathcal {Y}}\right\} \right| \right] }\\&\quad ={\mathop {{\text {E}}}\limits _{M_{\mathcal {Y}}}\left[ \left| \left\{ y_1\ne y_2\in {\mathcal {Y}}:(y_1\oplus y_2)\cdot M_{\mathcal {Y}}=0\right\} \right| \right] }\le \left| {\mathcal {Y}}\right| ^2\cdot 2^{-\ell }. \end{aligned}$$

By Markov's inequality, we get that with probability at least 1/2 over the choice of $M_{\mathcal {Y}}$, it holds that

$$\begin{aligned}&\left| \left\{ y_1\ne y_2\in {\mathcal {Y}}:y_1\cdot M_{\mathcal {Y}}=y_2 \cdot M_{\mathcal {Y}}\right\} \right| \le 2\left| {\mathcal {Y}}\right| ^2\cdot 2^{-\ell }. \end{aligned}$$

(4)

In the following we show that for every matrix $M_{\mathcal {Y}}$ for which Equation (4) holds, it holds that $\left| {\mathcal {Y}}\cdot M_{\mathcal {Y}}\right| \ge \min \left\{ |{\mathcal {Y}}|/2,2^\ell /32\right\} $.

Indeed, consider a graph ${{\mathcal {G}}}$, in which the set of vertices is ${\mathcal {Y}}$, and the set of edges E is the set $\left\{ y_1\ne y_2\in {\mathcal {Y}}:y_1\cdot M_{\mathcal {Y}}=y_2 \cdot M_{\mathcal {Y}}\right\} $. By assumption, $\left| E\right| \le 2\left| {\mathcal {Y}}\right| ^2\cdot 2^{-\ell }$. Furthermore, it is not hard to see that ${{\mathcal {G}}}$ is composed of disjoint cliques, and that the number of connected components in ${{\mathcal {G}}}$ is exactly the size of $ {\mathcal {Y}}\cdot M_{{\mathcal {Y}}}$. To bound the number of connected components of ${{\mathcal {G}}}$, we first assume that ${{\mathcal {G}}}$ has no more than $|{\mathcal {Y}}|/2$ isolated vertices, as otherwise the bound trivially follows. We start by removing the isolated vertices from ${{\mathcal {G}}}$, to get a graph with at least $|{\mathcal {Y}}|/2$ vertices and at most $2\left| {\mathcal {Y}}\right| ^2\cdot 2^{-\ell }$ edges. Let k be the number of connected components in the graph, and let $c_1,\dots , c_k$ be the number of vertices in each component. Since $c_i > 1$ for every i, the number of edges in the i-th component is at least $c_i^2/4$. By the Cauchy–Schwarz inequality (Lemma 2.15),

$$\begin{aligned} (\left| {\mathcal {Y}}\right| /2)^2 \le (\sum _{i\in [k]} c_i)^2 \le k \cdot \sum _{i\in [k]} c_i^2 \le 4k \left| E\right| \le 8k\left| {\mathcal {Y}}\right| ^2\cdot 2^{-\ell }, \end{aligned}$$

which implies that $k \ge 2^{\ell }/32$, and the lemma follows. $\square $
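The final step, from the bound on $\left| {\mathcal {Y}}\cdot M_{\mathcal {Y}}\right| $ to the statement of the lemma, can be spelled out as follows. This is a short derivation using only quantities from the proof; the event name $E$ is introduced here for convenience. Let $E$ be the event, over the choice of $M_{\mathcal {Y}}$, that $\left| {\mathcal {Y}}\cdot M_{\mathcal {Y}}\right| \ge \min \left\{ |{\mathcal {Y}}|/2,2^\ell /32\right\} $, so that $\Pr [E]\ge 1/2$. Since z is uniform over $\left\{ 0,1\right\} ^\ell $ and independent of $M_{\mathcal {Y}}$, it hits the set ${\mathcal {Y}}\cdot M_{\mathcal {Y}}$ with probability $\left| {\mathcal {Y}}\cdot M_{\mathcal {Y}}\right| \cdot 2^{-\ell }$, and therefore

$$\begin{aligned} {\mathop {\Pr }\limits _{M_{\mathcal {Y}}, z}\left[ \exists y_2\in {\mathcal {Y}}\text { s.t. } z= y_2\cdot M_{\mathcal {Y}}\right] }&\ge \Pr [E]\cdot \min \left\{ |{\mathcal {Y}}|/2,2^\ell /32\right\} \cdot 2^{-\ell }\\&\ge \min \left\{ \left| {\mathcal {Y}}\right| \cdot 2^{-\ell }/4, 1/64\right\} \\&\ge 2^{-10}\cdot \min \left\{ \left| {\mathcal {Y}}\right| \cdot 2^{-\ell }, 1\right\} . \end{aligned}$$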

2.6 Useful Inequalities

The following well-known inequalities will be useful later on.

Lemma 2.14

(Jensen Inequality) Let X be a distribution over ${{\mathbb {R}}}$ and let $f:{{\mathbb {R}}}\rightarrow {{\mathbb {R}}}$ be a convex function. It holds that

$$\begin{aligned} f({\text {E}}{}\left[ X\right] )\le {\text {E}}{}\left[ f(X)\right] \end{aligned}$$

Lemma 2.15

(Cauchy–Schwarz inequality) Let $n \in {{\mathbb {N}}}$ and $a_1,\dots ,a_n \in {{\mathbb {R}}}$ be numbers. Then,

$$\begin{aligned} (\sum _{i\in [n]} a_i)^2 \le n \cdot \sum _{i\in [n]} a_i^2 \end{aligned}$$

Lastly, the following lemma will be useful in the security proof of the UOWHF. Let $\textsf{A}$ be an algorithm such that for every x, the output of $\textsf{A}(x)$ is in some small set ${\mathcal {S}}_x$. Then the lemma roughly states that the event of two executions of $\textsf{A}$ returning the same value is not too rare.

Lemma 2.16

Let $\Omega \subseteq \left\{ 0,1\right\} ^n$ and ${\mathcal {X}}$ be some set, let X be a distribution over ${\mathcal {X}}$, and let $S:{\mathcal {X}}\rightarrow P(\Omega )$ be a function that maps elements in ${\mathcal {X}}$ to subsets of $\Omega $. Let $\textsf{A}$ be an algorithm, such that for every $x\in {\mathcal {X}}$, $\textsf{A}(x)\in S(x)\cup \{\bot \}$. Assume that for every $u \in \Omega $, it holds that $0<{\mathop {\Pr }\limits _{x\leftarrow X}\left[ u\in S(x)\right] }\le \ell /\left| \Omega \right| $, and that ${\mathop {\Pr }\limits _{x\leftarrow X}\left[ \textsf{A}(x)\in S(x)\right] } \ge p$. Then

$$\begin{aligned} \sum _{u\in \Omega } {\mathop {\Pr }\limits _{x \leftarrow X}\left[ \textsf{A}(x)=u\right] } {\mathop {\Pr }\limits _{x \leftarrow X}\left[ \textsf{A}(x)=u\mid u \in S(x)\right] }\ge p^2/\ell . \end{aligned}$$


Proof

Using the Cauchy–Schwarz inequality (Lemma 2.15), it holds that:

$$\begin{aligned}&\sum _{u\in \Omega } {\mathop {\Pr }\limits _{x \leftarrow X}\left[ \textsf{A}(x)=u\right] } {\mathop {\Pr }\limits _{x \leftarrow X}\left[ \textsf{A}(x)=u\mid u \in S(x)\right] }\\&=\sum _{u\in \Omega } {\mathop {\Pr }\limits _{x \leftarrow X}\left[ \textsf{A}(x)=u\right] } {\mathop {\Pr }\limits _{x \leftarrow X}\left[ \textsf{A}(x)=u, u \in S(x)\right] }/{\mathop {\Pr }\limits _{x \leftarrow X}\left[ u \in S(x)\right] }\\&= \sum _{u\in \Omega } {\mathop {\Pr }\limits _{x \leftarrow X}\left[ \textsf{A}(x)=u\right] }^2/{\mathop {\Pr }\limits _{x \leftarrow X}\left[ u \in S(x)\right] }\\&\ge \sum _{u\in \Omega } {\mathop {\Pr }\limits _{x \leftarrow X}\left[ \textsf{A}(x)=u\right] }^2\cdot \left| \Omega \right| /\ell \\&\ge \left( \sum _{u\in \Omega } {\mathop {\Pr }\limits _{x \leftarrow X}\left[ \textsf{A}(x)=u\right] }\right) ^2/\ell \\&\ge p^2/\ell , \end{aligned}$$

where the second equality holds since the output of $\textsf{A}$ is always in $S(x)\cup \left\{ \bot \right\} $. $\square $
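To make the quantities in Lemma 2.16 concrete, the following Python sketch evaluates the left-hand side exactly on a toy instance. The instance is an assumption chosen for illustration: $\Omega =\{0,1,2,3\}$ (read as 2-bit strings), X uniform over the same set, $S(x)=\{x, x+1 \bmod 4\}$ and a deterministic $\textsf{A}(x)=x$, so that $\ell =2$ and $p=1$.

```python
from fractions import Fraction

def lemma_2_16_lhs(omega, xs, S, A):
    """Sum over u of Pr[A(x)=u] * Pr[A(x)=u | u in S(x)], for x uniform over xs
    and a deterministic algorithm A."""
    total = Fraction(0)
    for u in omega:
        hit = Fraction(sum(1 for x in xs if A(x) == u), len(xs))
        in_s = Fraction(sum(1 for x in xs if u in S(x)), len(xs))
        both = Fraction(sum(1 for x in xs if A(x) == u and u in S(x)), len(xs))
        total += hit * both / in_s   # in_s > 0 by the lemma's assumption
    return total

omega = range(4)
xs = range(4)
S = lambda x: {x, (x + 1) % 4}   # every u lies in S(x) with probability 2/4
A = lambda x: x                  # always outputs an element of S(x), so p = 1
ell, p = 2, 1
lhs = lemma_2_16_lhs(omega, xs, S, A)
```

On this instance each of the four terms equals $(1/4)\cdot (1/4)/(1/2)=1/8$, so `lhs` is $1/2$, which meets the bound $p^2/\ell =1/2$ with equality.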

3 The PRG Construction

In this section we prove the security of our PRG construction. We start with a description of the construction. Let $f:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^n$ be an almost-regular one-way function, let t be a parameter and let $\mathcal {H}= \left\{ m :m \in \left\{ 0,1\right\} ^{2n\times ( n+\log n)}\right\} $ be the 2-universal family induced by the set of matrices of size $2n\times (n+\log n)$.^{Footnote 8} The generator $G:\mathcal {H}\times \left\{ 0,1\right\} ^{ n(t+1)}\rightarrow \mathcal {H}\times \left\{ 0,1\right\} ^{t\cdot (n+\log n)}$ is given by

$$\begin{aligned} G\big (h,x_1,\dots , x_{t+1}\big )=\left( h,h(x_1,f(x_{2})), \dots ,h(x_t,f(x_{t+1}))\right) . \end{aligned}$$
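The data flow of G can be sketched in Python as follows. This is a minimal illustration under stated assumptions: bit strings are lists of 0/1 values, `toy_f` is a SHA-256 based placeholder that merely stands in for f (it is not claimed to be regular or one-way in the sense of the paper), and the helper names are hypothetical.

```python
import hashlib
import secrets

def toy_f(x):
    """Placeholder for f on n-bit inputs (n <= 256); an assumption for the demo."""
    digest = hashlib.sha256(bytes(x)).digest()
    bits = [(byte >> k) & 1 for byte in digest for k in range(8)]
    return bits[:len(x)]

def random_matrix(rows, cols):
    """Sample h: a uniform binary matrix with the given shape."""
    return [[secrets.randbits(1) for _ in range(cols)] for _ in range(rows)]

def apply_matrix(m, v):
    """Return v . m over GF(2)."""
    out = [0] * len(m[0])
    for bit, row in zip(v, m):
        if bit:
            out = [a ^ b for a, b in zip(out, row)]
    return out

def prg(h, xs, f=toy_f):
    """G(h, x_1..x_{t+1}) = (h, h(x_1, f(x_2)), ..., h(x_t, f(x_{t+1})))."""
    ys = [f(x) for x in xs[1:]]   # t calls, all fixed in advance (non-adaptive)
    blocks = [apply_matrix(h, xs[j] + ys[j]) for j in range(len(ys))]
    return h, blocks
```

For instance, with $n=8$, $\log n=3$ and $t=4$, the seed contributes $n(t+1)=40$ bits besides h, while the output blocks contain $t(n+\log n)=44$ bits.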

The main theorem of this part is as follows.

Theorem 3.1

(Main theorem for PRG) Let $f:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^n$ be an almost-regular one-way function and let $t(n) \ge n/\log n+1$ be some polynomial. Then G is a PRG with seed length $O(n^2+n(t+1))$. Furthermore, G uses t non-adaptive calls to f.

Note that the stretch of G is $t\cdot \log n - n$, which is tight with [6] for large values of t. We now prove Theorem 3.1. Our main lemma states that given h and $f(x_1)$, the hash $h(x_1,f(x_2))$ looks uniform for a computationally bounded algorithm.

Lemma 3.2

Let $f:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^n$ be an almost-regular one-way function. For any PPT algorithm $\textsf{D}$, there exists a negligible function $\nu $ such that

$$\begin{aligned}&\left| {\mathop {\Pr }\limits _{\begin{array}{c} x_1\leftarrow \left\{ 0,1\right\} ^{n}, h \leftarrow \mathcal {H},\\ u \leftarrow \left\{ 0,1\right\} ^{n+\log n} \end{array}}\left[ \textsf{D}(h,f(x_1),u)=1\right] } -{\mathop {\Pr }\limits _{\begin{array}{c} x_1,x_2 \leftarrow \left\{ 0,1\right\} ^{n},\\ h \leftarrow \mathcal {H} \end{array}}\left[ \textsf{D}(h,f(x_1),h(x_1,f(x_{2})))=1\right] }\right| \\&\quad \le \nu (n) \end{aligned}$$

We prove Lemma 3.2 below, but first we use it in order to give the proof of Theorem 3.1, which is straightforward.

Proof of Theorem 3.1

Let f and t be as in Theorem 3.1. By construction G makes t calls to f. Additionally, $t(n+\log n) > n(t+1)$ when $t\ge n/\log n +1$. We are left to show that the output of G is indistinguishable from uniform. The proof is by a hybrid argument. Let H be a uniform random variable over $\mathcal {H}$, and $X_1,\dots ,X_{t+1}$ be i.i.d. uniform random variables over $\left\{ 0,1\right\} ^n$. Assume toward a contradiction that there is a PPT algorithm $\mathsf{\widehat{D}}$ that can distinguish $G(H,X_1,\dots ,X_{t+1})$ from uniform. Then we show that the following algorithm $\textsf{D}$ contradicts Lemma 3.2.

For each $\ell \in [t+1]$, let the distribution $Hyb_\ell $ be defined as

$$\begin{aligned} Hyb_\ell :=\left( H,H(X_1,f(X_{2})), \dots ,H(X_{\ell -1},f(X_{\ell })),U_{(t+1-\ell )\cdot (n+\log n)}\right) \end{aligned}$$

where $U_{(t+1-\ell )\cdot (n+\log n)}$ is the uniform distribution over $\left\{ 0,1\right\} ^{(t+1-\ell )\cdot (n+\log n)}$. That is, $Hyb_\ell $ is equal to $G(H,X_1,\dots ,X_{t+1})$ on the first $\ell -1$ blocks, and uniform on the rest. Observe that for every fixing of $\ell $ in the algorithm, the distribution of w for input $h \leftarrow \mathcal {H}, y \leftarrow f(U_n), z \leftarrow \left\{ 0,1\right\} ^{n+\log n}$ is exactly the distribution $Hyb_{\ell }$. Similarly, the distribution of w for input $h \leftarrow \mathcal {H}, y \leftarrow f(U_n)$ and $z = h(X',Y')$ for $X'\leftarrow f^{-1}(y)$ and $Y' \leftarrow f(\left\{ 0,1\right\} ^n)$ is exactly the distribution $Hyb_{\ell +1}$. Thus, it holds that,

$$\begin{aligned}&\left| {\mathop {\Pr }\limits _{\begin{array}{c} x_1\leftarrow \left\{ 0,1\right\} ^{n}, h \leftarrow \mathcal {H},\nonumber \\ u \leftarrow \left\{ 0,1\right\} ^{n+\log n} \end{array}}\left[ \textsf{D}(h,f(x_1),u)=1\right] }-{\mathop {\Pr }\limits _{\begin{array}{c} x_1,x_2 \leftarrow \left\{ 0,1\right\} ^{n},\\ h \leftarrow \mathcal {H} \end{array}}\left[ \textsf{D}(h,f(x_1),h(x_1,f(x_{2})))=1\right] }\right| \nonumber \\&= \left| 1/t \cdot \sum _{\ell =1}^t\bigg ({\mathop {\Pr }\limits _{w\leftarrow Hyb_{\ell }}\left[ \mathsf{\widehat{D}}(w)=1\right] }-{\mathop {\Pr }\limits _{w\leftarrow Hyb_{\ell +1}}\left[ \mathsf{\widehat{D}}(w)=1\right] }\bigg )\right| \nonumber \\ {}&= 1/t \cdot \left| {\mathop {\Pr }\limits _{w\leftarrow Hyb_{1}}\left[ \mathsf{\widehat{D}}(w)=1\right] }-{\mathop {\Pr }\limits _{w\leftarrow Hyb_{t+1}}\left[ \mathsf{\widehat{D}}(w)=1\right] }\right| \nonumber \\&=1/t \cdot \left| {\mathop {\Pr }\limits _{w\leftarrow \left\{ 0,1\right\} ^{\log \left| \mathcal {H}\right| +(n+\log n)\cdot t}}\left[ \mathsf{\widehat{D}}(w)=1\right] }-{\mathop {\Pr }\limits _{w\leftarrow G(H,X_1,\dots ,X_{t+1})}\left[ \mathsf{\widehat{D}}(w)=1\right] }\right| . \end{aligned}$$

(5)

where the last equality holds since $Hyb_{t+1}\equiv G(H,X_1,\dots ,X_{t+1})$ and $Hyb_1$ is the uniform distribution. We conclude by Lemma 3.2 that the distinguishing advantage of $\mathsf{\widehat{D}}$ is negligible. $\square $
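The algorithm $\textsf{D}$ used in this proof is not reproduced in the text above. The following Python sketch is one reconstruction that is consistent with the stated hybrid distributions: it picks a position $\ell \in [t]$, generates the first $\ell -1$ blocks honestly (using y in place of $f(x_\ell )$), plants the challenge z as block $\ell $, and fills the remaining blocks with uniform bits. The names `make_distinguisher` and `d_hat`, and the optional `ell` argument for fixing the position, are assumptions made for illustration.

```python
import random

def apply_matrix(m, v):
    """Return v . m over GF(2)."""
    out = [0] * len(m[0])
    for bit, row in zip(v, m):
        if bit:
            out = [a ^ b for a, b in zip(out, row)]
    return out

def make_distinguisher(d_hat, f, n, out_len, t, rng=random):
    """Build D(h, y, z) for Lemma 3.2 from a distinguisher d_hat for G."""
    def rand_bits(k):
        return [rng.randrange(2) for _ in range(k)]

    def D(h, y, z, ell=None):
        if ell is None:
            ell = rng.randrange(1, t + 1)             # uniform position in [t]
        xs = [rand_bits(n) for _ in range(ell - 1)]   # x_1 .. x_{ell-1}
        blocks = []
        for j in range(ell - 1):
            # the last honest block uses y, which plays the role of f(x_ell)
            nxt = y if j == ell - 2 else f(xs[j + 1])
            blocks.append(apply_matrix(h, xs[j] + nxt))
        blocks.append(list(z))                        # block ell is the challenge
        blocks += [rand_bits(out_len) for _ in range(t - ell)]
        return d_hat((h, blocks))

    return D
```

If z is uniform the assembled string is distributed as $Hyb_\ell $, and if $z=h(X',Y')$ with $f(X')=y$ it is distributed as $Hyb_{\ell +1}$, which is what the telescoping sum in Equation (5) relies on.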

3.1 Proving Lemma 3.2

In the rest of this section, we prove Lemma 3.2. Fix $\beta \ge 0$, any $\beta $-almost-regular one-way function $f:\{0,1\}^{n}\rightarrow \{0,1\}^{n}$ and $n\in {{\mathbb {N}}}$. Recall that we want to show that $h(x_1,f(x_2))$ looks uniform to computationally bounded algorithms, given h and $f(x_1)$. By the leftover hash lemma, every sufficiently short prefix $p(x_1,x_2)$ of the above hash $h(x_1,f(x_2))$ is statistically close to uniform. In order to show that the suffix looks uniform as well, we prove that the concatenation of $h,f(x_1)$ and $p(x_1,x_2)$ is a one-way function, and then use Goldreich–Levin. The next claim states that the described function is indeed one-way on part of its domain.

Claim 3.3

For every $i \in [n+\log n]$, let $g_i:\mathcal {H}\times \left\{ 0,1\right\} ^n\times \left\{ 0,1\right\} ^n\rightarrow \mathcal {H}\times \left\{ 0,1\right\} ^n\times \left\{ 0,1\right\} ^{i-1}$ be the following function

$$\begin{aligned} g_i(h,x_1,y) :=\left( h,f(x_1),h(x_1, y)_{1,\dots ,i-1} \right) . \end{aligned}$$

Then it holds that for every PPT $\textsf{A}$ and every function $i=i(n)$, there exists a negligible function $\nu $ such that

$$\begin{aligned}&{\mathop {\Pr }\limits _{\begin{array}{c} h\leftarrow \mathcal {H}, x_1,x_2 \leftarrow \left\{ 0,1\right\} ^{n}\\ z=(h,x_1,f(x_2)) \end{array}}\left[ \textsf{A}(g_i(z))\in g^{-1}_i(g_i(z))\right] } \le \nu (n). \end{aligned}$$

(6)

Proof

Assume toward a contradiction that the claim does not hold. That is, there exist a PPT algorithm $\textsf{A}$, a function i(n) and a constant $d\in {{\mathbb {N}}}$ such that

$$\begin{aligned}&{\mathop {\Pr }\limits _{\begin{array}{c} h\leftarrow \mathcal {H}, x_1,x_2 \leftarrow \left\{ 0,1\right\} ^{n}\\ z=(h,x_1,f(x_2)) \end{array}}\left[ \textsf{A}(g_i(z))\in g_i^{-1}(g_i(z))\right] } \ge n^{-d} \end{aligned}$$

(7)

for infinitely many $n\in {{\mathbb {N}}}$. Fix such n and consider the following algorithm $\mathsf{\widehat{A}}$. In the following we show $\mathsf{\widehat{A}}$ can be used to invert f.

That is, $\mathsf{\widehat{A}}$ tries to invert y using $\textsf{A}$ and only a prefix of $h(x_1, f(x_2))$. It does so by iterating over all the possible values of the missing input bits $h(f^{-1}(y),f(x_2))_{n-(4d+2\beta )\log n+1,\dots ,n+\log n}$ and every possible index $j\in [n+\log n]$. Since only $(4d+2\beta +1)\log n$ bits are missing, there are polynomially many guesses, and thus $\mathsf{\widehat{A}}$ runs in polynomial time. Let $x_1$ be some preimage of y and let $x_2$ be some element in $\left\{ 0,1\right\} ^n$. Note that when the guess w is equal to $h(x_1,f(x_2))_{n-(4d+2\beta )\log n+1,\dots ,n+\log n}$, and when the index j is equal to i, the value of $h,y,(z\circ w)_{1,\dots ,j-1}$ computed by the algorithm is equal to the output of $g_i(h,x_1,f(x_2))$. Thus, the success probability of $\mathsf{\widehat{A}}$ is at least that of $\textsf{A}$. Formally, we get that,

$$\begin{aligned}&{\mathop {\Pr }\limits _{h\leftarrow \mathcal {H}, x_1,x_2 \leftarrow \left\{ 0,1\right\} ^{n}}\left[ \mathsf{\widehat{A}}(h,f(x_1),h(x_1,f(x_2))_{1,\dots ,n-(4d+2\beta )\log n})\in f^{-1}(f(x_1))\right] }\nonumber \\ {}&\ge {\mathop {\Pr }\limits _{h\leftarrow \mathcal {H}, x_1,x_2 \leftarrow \left\{ 0,1\right\} ^{n}}\left[ \textsf{A}(g_i(h,x_1,f(x_2)))\in g_i^{-1}(g_i(h,x_1,f(x_2)))\right] }\nonumber \\&\ge n^{-d}. \end{aligned}$$

(8)

Next, we show that $\mathsf{\widehat{A}}$ can guess the value of $h(x_1,f(x_2))_{1,\dots ,n-(4d+2\beta )\log n}$. Indeed, recall that by the $\beta $-almost regularity of f, given any fixing of $f(x_1)$, the min-entropy of $x_1,f(x_2)$ is at least $n-2\beta \log n$. Thus, by the leftover hash lemma (Lemma 2.11, applied with $\epsilon = n^{-2d}\le n^{-d}/2$), $h(x_1,f(x_2))_{1,\dots ,n-(4d+2\beta )\log n}$ is $n^{-d}/2$ close to uniform given h and $f(x_1)$. Combining the above with Equation (8),

$$\begin{aligned}&{\mathop {\Pr }\limits _{h\leftarrow \mathcal {H}, x_1 \leftarrow \left\{ 0,1\right\} ^{n}, u \leftarrow \left\{ 0,1\right\} ^{n-(4d+2\beta )\log n}}\left[ \mathsf{\widehat{A}}(h,f(x_1),u)\in f^{-1}(f(x_1))\right] }\nonumber \\&={\mathop {{\text {E}}}\limits _{y\leftarrow f(\left\{ 0,1\right\} ^n)}\left[ {\mathop {\Pr }\limits _{\begin{array}{c} h\leftarrow \mathcal {H}, x_1\leftarrow f^{-1}(y),\\ u \leftarrow \left\{ 0,1\right\} ^{n-(4d+2\beta )\log n} \end{array}}\left[ \mathsf{\widehat{A}}(h,y,u)\in f^{-1}(f(x_1))\right] }\right] }\nonumber \\&\ge {\mathop {{\text {E}}}\limits _{y\leftarrow f(\left\{ 0,1\right\} ^n)}\left[ {\mathop {\Pr }\limits _{\begin{array}{c} h\leftarrow \mathcal {H}, x_1\leftarrow f^{-1}(y),\\ x_2 \leftarrow \left\{ 0,1\right\} ^{n} \end{array}}\left[ \mathsf{\widehat{A}}(h,y,h(x_1,f(x_2))_{1,\dots ,n-(4d+2\beta )\log n})\in f^{-1}(f(x_1))\right] } - n^{-d}/2\right] }\nonumber \\&= {\mathop {\Pr }\limits _{h\leftarrow \mathcal {H}, x_1,x_2 \leftarrow \left\{ 0,1\right\} ^{n}}\left[ \mathsf{\widehat{A}}(h,f(x_1),h(x_1,f(x_2))_{1,\dots ,n-(4d+2\beta )\log n})\in f^{-1}(f(x_1))\right] } - n^{-d}/2\nonumber \\&\ge n^{-d}/2. \end{aligned}$$

(9)

Finally, let $\textsf{Inv}$ be the algorithm that given $f(x_1)$ samples $h\leftarrow \mathcal {H}$ and $u \leftarrow \left\{ 0,1\right\} ^{n-(4d+2\beta )\log n}$, and executes $\mathsf{\widehat{A}}$. By Equation (9) $\textsf{Inv}$ inverts $f(x_1)$ successfully with probability at least $n^{-d}/2$ for uniformly sampled $x_1 \in \left\{ 0,1\right\} ^n$, for infinitely many $n\in {{\mathbb {N}}}$, which is a contradiction. $\square $
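The algorithm $\mathsf{\widehat{A}}$ is described above only in words, so the following Python sketch is a reconstruction from that description: it enumerates every guess w for the missing hash bits and every index j, runs $\textsf{A}$ on the resulting prefix, and returns the first candidate that is a preimage of y. The function name `a_hat`, the convention that `A` returns either `None` or a triple `(h, x1, y2)`, and the parameter `missing_bits` are illustrative assumptions.

```python
from itertools import product

def a_hat(A, f, h, y, z, missing_bits):
    """Try to invert y = f(x_1) given h and a prefix z of h(x_1, f(x_2)).

    missing_bits plays the role of (4d + 2*beta + 1) * log n, so the outer
    loop has 2**missing_bits iterations, which is polynomial in n.
    """
    total = len(z) + missing_bits            # n + log n in the paper's notation
    for w in product([0, 1], repeat=missing_bits):
        full = list(z) + list(w)             # candidate for the full hash value
        for j in range(1, total + 1):
            guess = A(h, y, full[:j - 1])    # candidate input for g_j
            if guess is None:
                continue
            _, x1, _ = guess
            if f(x1) == y:                   # any preimage of y is a success
                return x1
    return None
```

Whenever the guess w and the index j match the true values, the input handed to `A` equals $g_i(h,x_1,f(x_2))$, so `a_hat` succeeds at least as often as `A` does, as in Equation (8).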

We are now ready to prove Lemma 3.2. The proof is straightforward from Claim 3.3 together with Lemmas 2.5 and 2.6.

Proof of Lemma 3.2

Assume toward a contradiction that Lemma 3.2 does not hold. That is, there exist a PPT algorithm $\textsf{D}$ and a constant $c\in {{\mathbb {N}}}$ such that

$$\begin{aligned} \left| {\mathop {\Pr }\limits _{\begin{array}{c} x_1\leftarrow \left\{ 0,1\right\} ^{n},\\ h\leftarrow \mathcal {H}, u \leftarrow \left\{ 0,1\right\} ^{n+\log n} \end{array}}\left[ \textsf{D}(h,f(x_1),u)=1\right] }-{\mathop {\Pr }\limits _{\begin{array}{c} x_1,x_2 \leftarrow \left\{ 0,1\right\} ^{n},\\ h \leftarrow \mathcal {H} \end{array}}\left[ \textsf{D}(h,f(x_1),h(x_1,f(x_{2})))=1\right] }\right| \ge n^{-c} \end{aligned}$$

(10)

for infinitely many $n\in {{\mathbb {N}}}$. We assume without loss of generality that for infinitely many $n\in {{\mathbb {N}}}$ it holds that

$$\begin{aligned} {\mathop {\Pr }\limits _{\begin{array}{c} x_1\leftarrow \left\{ 0,1\right\} ^{n},\\ h\leftarrow \mathcal {H}, u \leftarrow \left\{ 0,1\right\} ^{n+\log n} \end{array}}\left[ \textsf{D}(h,f(x_1),u)=1\right] }-{\mathop {\Pr }\limits _{\begin{array}{c} x_1,x_2 \leftarrow \left\{ 0,1\right\} ^{n},\\ h \leftarrow \mathcal {H} \end{array}}\left[ \textsf{D}(h,f(x_1),h(x_1,f(x_{2})))=1\right] }\ge n^{-c} \end{aligned}$$

(11)

as otherwise we can flip the output of $\textsf{D}$. By Lemma 2.6 there is an oracle-aided PPT algorithm $\textsf{P}$ such that for infinitely many $n\in {{\mathbb {N}}}$ and $i=i(n)$ it holds that

$$\begin{aligned} {\mathop {\Pr }\limits _{\begin{array}{c} x_1,x_2 \leftarrow \left\{ 0,1\right\} ^{n},\\ h \leftarrow \mathcal {H} \end{array}}\left[ \textsf{P}^{\textsf{D}}(h, f(x_1),h(x_1,f(x_2))_{1,\dots ,i-1})=h(x_1,f(x_2))_i\right] }\ge 1/2 + n^{-c-4}. \end{aligned}$$

Recall that, by definition, $h,f(x_1),h(x_1,f(x_{2}))_{1,\dots , i-1} = g_{i}(h,x_1,f(x_2))$. Additionally, by our choice of the family $\mathcal {H}$, $h(x_1,f(x_{2}))_{i}$ is the ${\text {GL}}$ predicate of the function $g_{i}(h,x_1,f(x_2))$.^{Footnote 9} Thus, the above contradicts Claim 3.3 and Lemma 2.5. $\square $

4 The UOWHF Construction

In this section we prove the security of our UOWHF construction. We start with a full description of the construction. Let $f:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^n$ be an almost-regular one-way function, let t be a parameter and let $\mathcal {H}= \left\{ m :m \in \left\{ 0,1\right\} ^{2n\times ( n-\log n)}\right\} $ be the 2-universal family induced by the set of matrices of size $2n\times (n-\log n)$.^{Footnote 10}

The function $C:\mathcal {H}\times \left\{ 0,1\right\} ^{ n\cdot t}\rightarrow \mathcal {H}\times \left\{ 0,1\right\} ^{(t-1)\cdot (n-\log n)+2n}$ is given by

$$\begin{aligned} C\big (h,x_1,\dots , x_t\big )= h,f(x_1),h(x_1,f(x_{2})), \dots ,h(x_{t-1},f(x_{t})),x_t. \end{aligned}$$

Let $k=\log \left| \mathcal {H}\right| + n\cdot t$. For a string $z\in \left\{ 0,1\right\} ^k$, let $C_z(w):=C(w\oplus z)$. Our main theorem for this part is stated as follows.

Theorem 4.1

(Main theorem for UOWHF) Let $f:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^n$ be an almost-regular one-way function and let $t(n) \ge n/\log n+2$ be some polynomial. Then ${\mathcal {F}}_k=\left\{ C_z\right\} _{z\in \left\{ 0,1\right\} ^k}$ is a family of universal one-way hash functions with key length $k=O(n^2+n\cdot t(n))$ and output length $O(n^2+n\cdot t(n))$. Furthermore, for every $z\in \left\{ 0,1\right\} ^k$, $C_z$ uses t non-adaptive calls to f.
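The functions C and $C_z$ can be sketched in Python as follows. As before, this is an illustration under stated assumptions: bit strings are lists of 0/1 values, `toy_f` is a SHA-256 based placeholder standing in for f, the input w and key z are flat lists of k bits with the matrix h stored first in row-major order, and the names `uowhf` and `uowhf_keyed` are hypothetical.

```python
import hashlib
import secrets

def toy_f(x):
    """Placeholder for f on n-bit inputs (n <= 256); an assumption for the demo."""
    digest = hashlib.sha256(bytes(x)).digest()
    bits = [(byte >> k) & 1 for byte in digest for k in range(8)]
    return bits[:len(x)]

def apply_matrix(m, v):
    """Return v . m over GF(2)."""
    out = [0] * len(m[0])
    for bit, row in zip(v, m):
        if bit:
            out = [a ^ b for a, b in zip(out, row)]
    return out

def uowhf(h, xs, f=toy_f):
    """C(h, x_1..x_t) = (h, f(x_1), h(x_1, f(x_2)), ..., h(x_{t-1}, f(x_t)), x_t)."""
    ys = [f(x) for x in xs]       # t calls, all fixed in advance (non-adaptive)
    blocks = [apply_matrix(h, xs[j] + ys[j + 1]) for j in range(len(xs) - 1)]
    return h, ys[0], blocks, xs[-1]

def uowhf_keyed(z_bits, w_bits, n, out_len, f=toy_f):
    """C_z(w) = C(w XOR z), parsing the masked string as (h, x_1, ..., x_t)."""
    masked = [a ^ b for a, b in zip(w_bits, z_bits)]
    rows = 2 * n
    h = [masked[r * out_len:(r + 1) * out_len] for r in range(rows)]
    rest = masked[rows * out_len:]
    xs = [rest[i:i + n] for i in range(0, len(rest), n)]
    return uowhf(h, xs, f)
```

For example, with $n=8$, $\log n=3$ and $t=5$, the input contains $n\cdot t=40$ bits besides h, while the output contains $(t-1)(n-\log n)+2n=36$ bits besides h, so the function is shrinking for these parameters.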

In the rest of this section, we prove Theorem 4.1. Note that by Lemma 2.9 in order to prove Theorem 4.1, it is enough to show that it is hard to find a collision of C for a random input. The main lemma of this part is the following one, which essentially states that no efficient algorithm can find a collision in a simpler function, $\widehat{C}(h,x_1,x_2) = h, f(x_1), h(x_1,f(x_2))$. Note that $\widehat{C}$ itself is not a UOWHF: it is not shrinking, and we are only interested in collisions $(h,x'_1,x'_2)$ in which $f(x_2) \ne f(x'_2)$.

Lemma 4.2

Let $f:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^n$ be an almost-regular one-way function. For every PPT algorithm $\textsf{A}$, there exists a negligible function $\nu $, such that

$$\begin{aligned}&{\mathop {\Pr }\limits _{\begin{array}{c} h\leftarrow \mathcal {H}, x_1,x_2 \leftarrow \left\{ 0,1\right\} ^n,\\ (x'_1,x'_2)\leftarrow \textsf{A}(h,x_1,x_2) \end{array}}\left[ f(x_1)=f(x'_1)\wedge f(x_2)\ne f(x'_2) \wedge h(x_1,f(x_2)) = h(x'_1,f(x'_2)) \right] }\\&\quad \le \nu (n). \end{aligned}$$

We prove Lemma 4.2 below, but first let us prove the security of C using Lemma 4.2. The proof is by reduction, stated in the next claim. Informally, we show that an algorithm that breaks the security of C can be used in order to find a collision in the function $\widehat{C}$ defined above.

Claim 4.3

There exists an oracle-aided PPT algorithm $\textsf{A}$ such that the following holds. Let f be a one-way function, $t\in {\text {poly}}$ and C be the function described above. Let $n\in {{\mathbb {N}}}$, $\alpha \in [0,1]$ and let $\textsf{ColFinder}$ be an algorithm such that

$$\begin{aligned} {\mathop {\Pr }\limits _{w\leftarrow \mathcal {H}\times (\left\{ 0,1\right\} ^n)^t, w' \leftarrow \textsf{ColFinder}(w)}\left[ w' \ne w \wedge C(w)=C(w')\right] } = \alpha . \end{aligned}$$

Then,

$$\begin{aligned}&{\mathop {\Pr }\limits _{\begin{array}{c} h\leftarrow \mathcal {H}, x_1,x_2 \leftarrow \left\{ 0,1\right\} ^n,\\ (x'_1,x'_2)\leftarrow \textsf{A}^{\textsf{ColFinder}}(h,x_1,x_2) \end{array}}\left[ f(x_1)=f(x'_1)\wedge f(x_2)\ne f(x'_2) \wedge h(x_1,f(x_2)) = h(x'_1,f(x'_2)) \right] }\\&\quad \ge \alpha /t-\nu (n), \end{aligned}$$

where $\nu $ is a negligible function, depending only on f and t.

The proof of Theorem 4.1 is now immediate.

Proof of Theorem 4.1

Let f, t and $C_z$ be as in Theorem 4.1. It is clear that $C_z$ is efficiently computable for every $z\in \left\{ 0,1\right\} ^k$, and that C is shrinking since $\log \left| \mathcal {H}\right| + n\cdot t > \log \left| \mathcal {H}\right| + (t-1)\cdot (n-\log n)+2n$ for $t\ge n/\log n+2$.

Next, we show that C is collision resistant on random inputs. Assume toward a contradiction that there exist a PPT $\textsf{ColFinder}$ and $p \in {\text {poly}}$ such that

$$\begin{aligned} {\mathop {\Pr }\limits _{\begin{array}{c} w\leftarrow \mathcal {H}\times (\left\{ 0,1\right\} ^n)^t,\\ w' \leftarrow \textsf{ColFinder}(w) \end{array}}\left[ w' \ne w \wedge C(w)=C(w')\right] } \ge 1/p(n) \end{aligned}$$

for infinitely many $n \in {{\mathbb {N}}}$. Then, by Claim 4.3, for infinitely many $n \in {{\mathbb {N}}}$ it holds that

$$\begin{aligned}&{\mathop {\Pr }\limits _{\begin{array}{c} h\leftarrow \mathcal {H}, x_1,x_2 \leftarrow \left\{ 0,1\right\} ^n,\\ (x'_1,x'_2)\leftarrow \textsf{A}^{\textsf{ColFinder}}(h,x_1,x_2) \end{array}}\left[ f(x_1)=f(x'_1)\wedge f(x_2)\ne f(x'_2) \wedge h(x_1,f(x_2)) = h(x'_1,f(x'_2)) \right] }\\&\qquad \ge 1/(t\cdot p(n))-\nu (n)\\&\qquad \ge 1/(2t\cdot p(n)). \end{aligned}$$

Note that by the choice of t, $1/(2t\cdot p(n))$ is not negligible, and that since both $\textsf{A}$ and $\textsf{ColFinder}$ are efficient, $\textsf{A}^{\textsf{ColFinder}}(\cdot )$ can be efficiently implemented. Thus, the above contradicts Lemma 4.2. $\square $

4.1 Proving Claim 4.3

We next prove Claim 4.3. The next simple claim will be useful in the proof, as it states that given $(h,x_1,\dots ,x_t)$, with high probability there is no collision $(h,x'_1,\dots ,x'_t)$ of C in which for some $j\in [t-1]$ it holds that $x_j\ne x'_j$ while $f(x_j)=f(x'_j)$ and $f(x_{j+1})=f(x'_{j+1})$.

Claim 4.4

For every one-way function f and polynomial t, there exists a negligible function $\nu $ such that the following holds. For every $x_1,\dots , x_t \in \left\{ 0,1\right\} ^n$,

$$\begin{aligned}&{\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \forall j \in [t-1],\ \forall x'_j \in f^{-1}(f(x_{j}))\setminus \left\{ x_j\right\} \text { it holds that } h(x'_j,f(x_{j+1})) \ne h(x_j,f(x_{j+1}))\right] }\\&\quad \ge 1-\nu (n). \end{aligned}$$

Proof

Fix $x_1,\dots , x_t \in \left\{ 0,1\right\} ^n$, $j\in [t-1]$ and $x'_j\in f^{-1}(f(x_{j})){\setminus } \left\{ x_j\right\} $. Since $\mathcal {H}$ is 2-universal, it holds that

$$\begin{aligned} {\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ h(x'_j,f(x_{j+1}))= h(x_j,f(x_{j+1}))\right] }= n/2^n. \end{aligned}$$

By the union bound,

$$\begin{aligned}&{\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \exists j\in [t-1], x'_j \in f^{-1}(f(x_{j}))\setminus \left\{ x_j\right\} \text { s.t. } h(x'_j,f(x_{j+1}))=h(x_j,f(x_{j+1}))\right] } \\ \le&\sum _{j\in [t-1]}\sum _{x'_j \in f^{-1}(f(x_{j}))\setminus \left\{ x_j\right\} } {\mathop {\Pr }\limits _{h\leftarrow \mathcal {H}}\left[ h(x'_j,f(x_{j+1}))=h(x_j,f(x_{j+1}))\right] }\\ \le&t(n)\cdot |f^{-1}(f(x_{j}))| \cdot n/ 2^{n}. \end{aligned}$$

Since f is a one-way function, by Claim 2.3 it holds that $|f^{-1}(f(x_{j}))|\le 2^n\cdot neg(n)$ for every $j\in [t-1]$, and thus the claim follows. $\square $
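
Spelled out, the last step is the following short calculation, where $neg$ denotes the negligible function given by Claim 2.3 (a routine check, not an additional result):

$$\begin{aligned} t(n)\cdot 2^n\cdot neg(n)\cdot n/2^n = t(n)\cdot n\cdot neg(n). \end{aligned}$$

Since t is a polynomial, the right-hand side is negligible, so one may take $\nu (n):=t(n)\cdot n\cdot neg(n)$.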

Proof of Lemma 4.3

Let f, t, n, $\alpha $ and $\textsf{ColFinder}$ be as in Lemma 4.3. Let $\textsf{A}$ be the following algorithm.

We next show that with all but negligible probability over the choice of $w=(h, x_1,\dots , x_t)$, the following must hold. For every $w'=(h',x'_1,\dots , x'_t)$ with $w\ne w'$ and $C(w)=C(w')$, there exists some $i \in [t-1]$ such that $f(x_i)=f(x'_i)$ and $f(x_{i+1})\ne f(x'_{i+1})$. The lemma then follows easily.

Indeed, fix such w and $w'$. First note that since $C(w)=C(w')$, it holds that $h=h'$. Let j be the first index for which $x_j \ne x'_j$, and observe that by the definition of C, $j \in [t-1]$. We split into cases:

If $f(x_j)\ne f(x'_j)$, then $j>1$ (since $C(w)=C(w')$ implies that $f(x_1)=f(x'_1)$) and for $i=j-1$ it holds that $f(x_i)=f(x'_i)$ and $f(x_{i+1})\ne f(x'_{i+1})$.
For the other case, assume that $f(x_j)=f(x'_j)$. By Claim 4.4, with all but negligible probability over the choice of w, it holds that $h(x_j,f(x_{j+1})) \ne h(x'_j,f(x_{j+1}))$. Since $C(w)=C(w')$ implies that $h(x_j,f(x_{j+1})) = h(x'_j,f(x'_{j+1}))$, it must hold that $f(x_{j+1})\ne f(x'_{j+1})$. We get that for $i=j$, it holds that $f(x_i)=f(x'_i)$ and $f(x_{i+1})\ne f(x'_{i+1})$.

Since i is chosen uniformly in Algorithm 4, and since the distribution of $h,z_1,\dots , z_t$ in Algorithm 4 is uniform for every $i\in [t-1]$ and uniformly chosen input $h,x_1,x_2$, we conclude that the success probability of $\textsf{A}^{\textsf{ColFinder}}$ is at least $(\alpha -neg(n))/t$. $\square $
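
The reduction analysed above can be sketched in code. The sketch assumes, based on the analysis in the proof, that $\textsf{A}$ plants its input pair $(x_1,x_2)$ at a uniformly chosen pair of adjacent positions $(i,i+1)$ of an otherwise uniform tuple $z_1,\dots ,z_t$, calls $\textsf{ColFinder}$ once, and reads off positions i and $i+1$ of the answer. The function and parameter names are hypothetical, and `col_finder` stands for any callable that returns a pair (hash key, tuple) or `None`.

```python
import random

def reduction_A(col_finder, h, x1, x2, n, t, rng):
    """Plant (x1, x2) at a uniform adjacent pair of positions and query ColFinder once."""
    i = rng.randrange(t - 1)                      # 0-based index of the planted pair
    zs = [rng.getrandbits(n) for _ in range(t)]   # uniform n-bit strings z_1..z_t
    zs[i], zs[i + 1] = x1, x2                     # overwrite positions i and i+1
    result = col_finder(h, zs)
    if result is None:                            # ColFinder failed
        return None
    _h_prime, zs_prime = result
    # Whether this pair satisfies the collision predicate is part of the
    # probability analysis; the sketch simply returns the two coordinates.
    return zs_prime[i], zs_prime[i + 1]

# Usage with a dummy "collision finder" that flips the lowest bit of every entry.
rng = random.Random(1)
dummy = lambda h, zs: (h, [z ^ 1 for z in zs])
print(reduction_A(dummy, "key", 5, 9, n=4, t=6, rng=rng))
```

Because the planted inputs are themselves uniform, the tuple handed to `col_finder` is uniformly distributed for every choice of i, which is the property the proof relies on.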

4.2 Proving Lemma 4.2

We now prove Lemma 4.2. For the rest of this section, fix $\beta \ge 0$, and a $\beta $-almost-regular one-way function f. In order to prove the lemma, we show how to invert the one-way function f using an algorithm that contradicts the lemma. Formally,

Lemma 4.5

There exists a PPT oracle-aided algorithm $\textsf{Inv}$ such that the following holds. Let $n\in {{\mathbb {N}}}$, $\alpha \in [0,1]$ and let $\textsf{A}$ be an algorithm such that

$$\begin{aligned} {\mathop {\Pr }\limits _{\begin{array}{c} h\leftarrow \mathcal {H}, x_1,x_2 \leftarrow \left\{ 0,1\right\} ^n,\\ (x'_1,x'_2)\leftarrow \textsf{A}(h,x_1,x_2) \end{array}}\left[ f(x_1)=f(x'_1)\wedge f(x_2)\ne f(x'_2) \wedge h(x_1,f(x_2)) = h(x'_1,f(x'_2)) \right] } =\alpha . \end{aligned}$$

Then,

$$\begin{aligned} {\mathop {\Pr }\limits _{x \leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{Inv}^\textsf{A}(f(x)) \in f^{-1}(f(x))\right] } \ge \alpha ^2\cdot n^{-2\beta -2}\cdot 2^{-12}. \end{aligned}$$

The proof of Lemma 4.2 is immediate from Lemma 4.5, as ${\mathop {\Pr }\limits _{x \leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{Inv}^\textsf{A}(f(x)) \in f^{-1}(f(x))\right] }$ must be negligible for every PPT $\textsf{A}$.

Proof of Lemma 4.2

Assume toward contradiction that there exists a PPT algorithm $\textsf{A}$ and $p\in {\text {poly}}$ such that

$$\begin{aligned} {\mathop {\Pr }\limits _{\begin{array}{c} h\leftarrow \mathcal {H}, x_1,x_2 \leftarrow \left\{ 0,1\right\} ^n,\\ (x'_1,x'_2)\leftarrow \textsf{A}(h,x_1,x_2) \end{array}}\left[ f(x_1)=f(x'_1)\wedge f(x_2)\ne f(x'_2) \wedge h(x_1,f(x_2)) = h(x'_1,f(x'_2)) \right] } \ge 1/p(n) \end{aligned}$$

for infinitely many $n\in {{\mathbb {N}}}$. Then, by Lemma 4.5 it holds that

$$\begin{aligned} {\mathop {\Pr }\limits _{x \leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{Inv}^\textsf{A}(f(x)) \in f^{-1}(f(x))\right] } \ge 1/p(n)^2\cdot n^{-2\beta -2}\cdot 2^{-12} \end{aligned}$$

for infinitely many $n\in {{\mathbb {N}}}$, which is a contradiction to f being a one-way function. $\square $

The rest of this part is dedicated to proving Lemma 4.5. Let n, $\alpha $ and $\textsf{A}$ be as in Lemma 4.5. In the following we assume that $\textsf{A}$ outputs either a valid pair $(x'_1,x'_2)$, that is, a pair with $f(x_1)=f(x'_1)\wedge f(x_2)\ne f(x'_2) \wedge h(x_1,f(x_2)) = h(x'_1,f(x'_2))$, or $(\bot ,\bot )$. This is without loss of generality, since the validity of a pair can be checked efficiently. For $x_1,x_2$ and h, we define

$$\begin{aligned} {{\mathcal {G}}}_{h,x_1,x_2}:=\left\{ (x'_1,y) \in f^{-1}(f(x_1)) \times \textsf{Im}(f) ~:h(x_1,f(x_2))=h(x'_1,y) \right\} . \end{aligned}$$

For ease of notation, we say that $x \in {{\mathcal {G}}}_{h,x_1,x_2}$ if there exists $y\in \textsf{Im}(f) ~$ such that $(x,y)\in {{\mathcal {G}}}_{h,x_1,x_2}$. Let $\textsf{Inv}$ be the following algorithm. Note that $\textsf{Inv}$ can be implemented efficiently, by the constructability of $\mathcal {H}$.

That is, in order to invert its input y, $\textsf{Inv}$ samples $x_1,x_2$ and h. It then uses $\textsf{A}$ in order to find $x'_1$ with $f(x'_1)=f(x_1)$. Lastly, it samples $h'$ with $h'(x_1,f(x_2))=h'(x'_1,y)$ and uses $\textsf{A}$ in order to find a collision to $h',x_1,x_2$. By the choice of $h'$, a possible collision is $(h',x'_1,f^{-1}(y))$. We observe that if $\textsf{A}$ finds such a collision, $\textsf{Inv}$ successfully inverted y.
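
The description above can be turned into a toy sketch. Everything concrete here is an assumption made for illustration: the 2-to-1 stand-in `f` (which is not one-way), the random-matrix hash family over GF(2), the helper `sample_hash_colliding` (one possible way to sample $h'$ subject to a prescribed collision, playing the role of constructibility), and the exhaustive-search adversary `brute_force_A`.

```python
import random

N = 4   # toy input length n
M = 2   # toy hash output length, n - log n for n = 4

def f(x):
    """Toy stand-in for f (NOT one-way): 2-to-1 on N-bit inputs."""
    return x >> 1

def parity(v):
    return bin(v).count("1") & 1

def apply_hash(h, x, y):
    """h(x, y) = M * (x || y) over GF(2); h is a list of 2N-bit row masks."""
    z = (x << N) | y
    return tuple(parity(row & z) for row in h)

def sample_hash(rng):
    return [rng.getrandbits(2 * N) for _ in range(M)]

def sample_hash_colliding(rng, u, v):
    """Sample h' uniformly subject to h'(u) = h'(v): each row is orthogonal to d = u xor v."""
    d = ((u[0] << N) | u[1]) ^ ((v[0] << N) | v[1])
    rows = sample_hash(rng)
    if d == 0:                       # u == v, so every h' collides
        return rows
    pivot = d & -d                   # lowest set bit of d
    return [r ^ pivot if parity(r & d) else r for r in rows]

def brute_force_A(h, x1, x2):
    """Exhaustive-search adversary: returns a valid pair (x1', x2') or None."""
    target = apply_hash(h, x1, f(x2))
    for a in range(2 ** N):
        for b in range(2 ** N):
            if f(a) == f(x1) and f(b) != f(x2) and apply_hash(h, a, f(b)) == target:
                return a, b
    return None

def inv(A, y, rng):
    """Try to invert y with two calls to A, following the description in the text."""
    x1, x2 = rng.getrandbits(N), rng.getrandbits(N)
    h = sample_hash(rng)
    first = A(h, x1, x2)             # first call: obtain x1' with f(x1') = f(x1)
    if first is None:
        return None
    x1_prime = first[0]
    h_prime = sample_hash_colliding(rng, (x1, f(x2)), (x1_prime, y))
    second = A(h_prime, x1, x2)      # second call: hope for (x1', preimage of y)
    if second is None:
        return None
    candidate = second[1]
    return candidate if f(candidate) == y else None

rng = random.Random(0)
y = f(rng.getrandbits(N))
print(inv(brute_force_A, y, rng))    # None, or some x with f(x) == y
```

The sketch succeeds only when the second call happens to return a pair whose second coordinate is a preimage of y, which is the event the analysis below lower-bounds.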

For $x_1,x_2 \in \left\{ 0,1\right\} ^n$, $x'_1 \in f^{-1}(f(x_1))$ and $y \in \textsf{Im}(f) ~$, let

$$\begin{aligned} p_\textsf{A}(x_1,x_2,x'_1,y) :=&{\mathop {\Pr }\limits _{ h' \leftarrow \mathcal {H}}\left[ \textsf{A}(h',x_1,x_2)\in \left\{ x'_1\right\} \times f^{-1}(y) \mid h'(x_1,f(x_2))=h'(x'_1,y) \right] }\\ =&{\mathop {\Pr }\limits _{ h' \leftarrow \mathcal {H}}\left[ \textsf{A}(h',x_1,x_2)\in \left\{ x'_1\right\} \times f^{-1}(y) \mid (x'_1,y)\in {{\mathcal {G}}}_{h',x_1,x_2}\right] } \end{aligned}$$

and define $p_\textsf{A}(x_1,x_2,\bot ,y)=0$. By the above observation, it holds that

$$\begin{aligned}&{\mathop {\Pr }\limits _{x \leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{Inv}^\textsf{A}(f(x)) \in f^{-1}(f(x))\right] }\ge {\mathop {{\text {E}}}\limits _{\begin{array}{c} h \leftarrow \mathcal {H}, x_1,x_2 \leftarrow \left\{ 0,1\right\} ^n\\ y \leftarrow f(\left\{ 0,1\right\} ^n) \\ (x'_1,x'_2) \leftarrow A(h,x_1,x_2) \end{array}}\left[ p_\textsf{A}(x_1,x_2,x'_1,y)\right] } \end{aligned}$$

(12)

and thus it is enough to bound the latter. We bound it using the following two claims. The first shows that it is enough to bound the probability that $\textsf{A}$ outputs $(x'_1,\cdot )$. The second claim bounds the last probability.

Claim 4.6

For every $x_1,x_2\in \left\{ 0,1\right\} ^n$ and $x' \in f^{-1}(f(x_1))$ the following holds.

$$\begin{aligned}&{\mathop {{\text {E}}}\limits _{y \leftarrow f(\left\{ 0,1\right\} ^n)}\left[ p_\textsf{A}(x_1,x_2,x',y)\right] }\ge {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ \textsf{A}(h',x_1,x_2)=(x',\cdot ) \mid x'\in {{\mathcal {G}}}_{h',x_1,x_2}\right] }\cdot n^{-\beta -1}\cdot 2^{-10}. \end{aligned}$$

Proof

Fix $x_1,x_2 \in \left\{ 0,1\right\} ^n$ and $x' \in f^{-1}(f(x_1))$, and for every $h\in \mathcal {H}$, let $\textsf{A}(h):=\textsf{A}(h,x_1,x_2)$ and ${{\mathcal {G}}}_{h}:={{\mathcal {G}}}_{h,x_1,x_2}$. Then, by the definition of $p_\textsf{A}$, it holds that

$$\begin{aligned}&{\mathop {{\text {E}}}\limits _{y \leftarrow f(\left\{ 0,1\right\} ^n)}\left[ p_\textsf{A}(x_1,x_2,x',y)\right] }\\&={\mathop {{\text {E}}}\limits _{y \leftarrow f(\left\{ 0,1\right\} ^n)}\left[ {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ \textsf{A}(h')\in \left\{ x'\right\} \times f^{-1}(y) \mid (x',y)\in {{\mathcal {G}}}_{h'}\right] }\right] }\\&= {\mathop {{\text {E}}}\limits _{y \leftarrow f(\left\{ 0,1\right\} ^n)}\left[ \frac{ {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ (x',y)\in {{\mathcal {G}}}_{h'}\wedge \textsf{A}(h')\in \left\{ x'\right\} \times f^{-1}(y) \mid x'\in {{\mathcal {G}}}_{h'}\right] }}{{\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ (x',y)\in {{\mathcal {G}}}_{h'}\mid x'\in {{\mathcal {G}}}_{h'}\right] }}\right] }\\&= {\mathop {{\text {E}}}\limits _{y \leftarrow f(\left\{ 0,1\right\} ^n)}\left[ \frac{ {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ \textsf{A}(h')\in \left\{ x'\right\} \times f^{-1}(y) \mid x'\in {{\mathcal {G}}}_{h'}\right] }}{{\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ (x',y)\in {{\mathcal {G}}}_{h'}\mid x'\in {{\mathcal {G}}}_{h'}\right] }}\right] }\\&= {\mathop {{\text {E}}}\limits _{y \leftarrow f(\left\{ 0,1\right\} ^n)}\left[ {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ \textsf{A}(h')\in \left\{ x'\right\} \times f^{-1}(y) \mid x'\in {{\mathcal {G}}}_{h'}\right] } \cdot \frac{{\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ x'\in {{\mathcal {G}}}_{h'}\right] }}{{\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ (x',y)\in {{\mathcal {G}}}_{h'}\right] }}\right] }.\\ \end{aligned}$$

Since by our assumption on A, for every $(x',y)$ with $\Pr \left[ \textsf{A}(h)\in \left\{ x'\right\} \times f^{-1}(y)\right] >0$ it holds that $(x',y) \ne (x_1,f(x_2))$, we get that for every such pair ${\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ (x',y)\in {{\mathcal {G}}}_{h'}\right] }= n/2^n$.

Recall that the family $\mathcal {H}$ is approximately flat. That is,

$$\begin{aligned} {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ \exists y\in \textsf{Im}(f) ~\text { s.t. } h'(x_1,f(x_2))= h'(x',y)\right] } \ge 2^{-10}\cdot \min \left\{ \left| \textsf{Im}(f) ~\right| \cdot 2^{-(n-\log n)}, 1\right\} . \end{aligned}$$

Continuing the calculation,

$$\begin{aligned}&{\mathop {{\text {E}}}\limits _{y \leftarrow f(\left\{ 0,1\right\} ^n)}\left[ p_\textsf{A}(x_1,x_2,x',y)\right] }\\&= \sum _{y \in \textsf{Im}(f) ~} {\mathop {\Pr }\limits _{x\leftarrow \left\{ 0,1\right\} ^n}\left[ f(x)=y\right] } \cdot {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ \textsf{A}(h')\in \left\{ x'\right\} \times f^{-1}(y) \mid x'\in {{\mathcal {G}}}_{h'}\right] } \cdot \frac{2^n}{ n} \cdot {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ x'\in {{\mathcal {G}}}_{h'}\right] }\\&\ge \sum _{y \in \textsf{Im}(f) ~}\frac{1}{\left| \textsf{Im}(f) ~\right| \cdot n^\beta } \cdot {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ \textsf{A}(h')\in \left\{ x'\right\} \times f^{-1}(y) \mid x'\in {{\mathcal {G}}}_{h'}\right] } \cdot \frac{2^n}{ n} \cdot {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ x'\in {{\mathcal {G}}}_{h'}\right] }\\&= \frac{1}{\left| \textsf{Im}(f) ~\right| \cdot n^\beta }\cdot \frac{2^n}{ n} \cdot {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ x'\in {{\mathcal {G}}}_{h'}\right] }\cdot \sum _{y \in \textsf{Im}(f) ~} {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ \textsf{A}(h')\in \left\{ x'\right\} \times f^{-1}(y) \mid x'\in {{\mathcal {G}}}_{h'}\right] } \\&= \frac{2^n}{\left| \textsf{Im}(f) ~\right| \cdot n^{\beta +1}} \cdot {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ x'\in {{\mathcal {G}}}_{h'}\right] }\cdot {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ \textsf{A}(h')=(x',\cdot ) \mid x'\in {{\mathcal {G}}}_{h'}\right] } \\&\ge \frac{2^n}{\left| \textsf{Im}(f) ~\right| \cdot n^{\beta +1}}\cdot 2^{-10} \cdot \min \left\{ \left| \textsf{Im}(f) ~\right| \cdot 2^{-(n-\log n)}, 1\right\} \cdot {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ \textsf{A}(h')=(x',\cdot ) \mid x'\in {{\mathcal {G}}}_{h'}\right] } \\&\ge n^{-\beta -1}\cdot 2^{-10}\cdot {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ 
\textsf{A}(h')=(x',\cdot ) \mid x'\in {{\mathcal {G}}}_{h'}\right] } \end{aligned}$$

where the first inequality holds since f is $\beta $-almost-regular, and the second since $\mathcal {H}$ is approximately flat. $\square $
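
For the last step of the chain, the following short calculation may help (it uses only $2^{-(n-\log n)}=n/2^n$, $\left| \textsf{Im}(f) ~\right| \le 2^n$ and $n\ge 1$):

$$\begin{aligned} \frac{2^n}{\left| \textsf{Im}(f) ~\right| }\cdot \min \left\{ \left| \textsf{Im}(f) ~\right| \cdot 2^{-(n-\log n)}, 1\right\} = \min \left\{ n, \frac{2^n}{\left| \textsf{Im}(f) ~\right| }\right\} \ge 1. \end{aligned}$$

Multiplying by the remaining factor $n^{-\beta -1}\cdot 2^{-10}$ gives the stated bound.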

The next claim uses Lemma 2.16 in order to show that in a random execution of $\textsf{Inv}$, $\textsf{A}$ has a good probability to output the same element $x'_1$ in Items 2 and 4.

Claim 4.7

For every $x_1,x_2 \in \left\{ 0,1\right\} ^n$ the following holds. Let $\alpha _{x_1,x_2}:={\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \textsf{A}(h,x_1,x_2)\ne \bot \right] }$. Then,

$$\begin{aligned}&\sum _{x'_1 \in f^{-1}(f(x_1))}{\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \textsf{A}(h,x_1,x_2)=(x'_1,\cdot )\right] }&\cdot {{\mathop {\Pr }\limits _{\begin{array}{c} h' \leftarrow \mathcal {H} \end{array}}\left[ \textsf{A}(h',x_1,x_2)=(x'_1,\cdot ) \mid x'_1 \in {{\mathcal {G}}}_{h',x_1,x_2}\right] }}\\&\quad \ge \alpha ^2_{x_1,x_2}\cdot n^{-\beta -1}/4. \end{aligned}$$

Proof

Fix $x_1,x_2\in \left\{ 0,1\right\} ^n$, and let $\alpha _{x_1,x_2}$ be as in Claim 4.7. Let $\alpha _1 := {\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \textsf{A}(h,x_1,x_2)=(x_1,\cdot ) \right] }$ and let $\alpha _2 :={\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \textsf{A}(h,x_1,x_2)\notin \left\{ (x_1,\cdot ),\bot \right\} \right] }$. Notice that $\alpha _{x_1,x_2} = \alpha _1+\alpha _2$.

Define $\widetilde{\textsf{A}}(h)$ to be the algorithm that outputs the first coordinate of $\textsf{A}$’s output ($\textsf{A}(h,x_1,x_2)_1$) if it is different from $x_1$, or $\bot $ otherwise. Let ${{\mathcal {G}}}_{h}:={{\mathcal {G}}}_{h,x_1,x_2}$. Note that by the assumption on $\textsf{A}$, whenever $\widetilde{\textsf{A}}$ does not output $\bot $, its output is an element of $S(h)=\left\{ x \in {{\mathcal {G}}}_{h,x_1,x_2}:x\ne x_1\right\} $. We get that $\alpha _{2}={\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \widetilde{\textsf{A}}(h)\ne \bot \right] }$. Let $\Omega = f^{-1}(f(x_1)){\setminus } \left\{ x_1\right\} $. It holds that

$$\begin{aligned} \sum _{x'_1 \in f^{-1}(f(x_1))}&{\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \textsf{A}(h,x_1,x_2)=(x'_1,\cdot )\right] }\cdot {{\mathop {\Pr }\limits _{\begin{array}{c} h' \leftarrow \mathcal {H} \end{array}}\left[ \textsf{A}(h',x_1,x_2)=(x'_1,\cdot ) \mid x'_1 \in {{\mathcal {G}}}_{h',x_1,x_2}\right] }}\\&\begin{aligned} =\sum _{x'_1\in \Omega }&{\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \textsf{A}(h,x_1,x_2)=(x'_1,\cdot )\right] }\cdot \\&\quad {{\mathop {\Pr }\limits _{\begin{array}{c} h' \leftarrow \mathcal {H} \end{array}}\left[ \textsf{A}(h',x_1,x_2)=(x'_1,\cdot ) \mid x'_1 \in {{\mathcal {G}}}_{h',x_1,x_2}\right] }}\\&+{\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \textsf{A}(h,x_1,x_2)=(x_1,\cdot )\right] }\cdot \\&\quad {{\mathop {\Pr }\limits _{\begin{array}{c} h' \leftarrow \mathcal {H} \end{array}}\left[ \textsf{A}(h',x_1,x_2)=(x_1,\cdot ) \mid x_1 \in {{\mathcal {G}}}_{h',x_1,x_2}\right] }} \end{aligned}\\&\begin{aligned} =\sum _{x'_1\in \Omega }&{\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \widetilde{\textsf{A}}(h)=x'_1\right] }\cdot {{\mathop {\Pr }\limits _{\begin{array}{c} h' \leftarrow \mathcal {H} \end{array}}\left[ \widetilde{\textsf{A}}(h')=x'_1 \mid x'_1 \in {{\mathcal {G}}}_{h',x_1,x_2}\right] }}\\&+{\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \textsf{A}(h,x_1,x_2)=(x_1,\cdot )\right] }\cdot {{\mathop {\Pr }\limits _{\begin{array}{c} h' \leftarrow \mathcal {H} \end{array}}\left[ \textsf{A}(h',x_1,x_2)=(x_1,\cdot ) \right] }} \end{aligned}\\&=\sum _{x'_1\in \Omega }{\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \widetilde{\textsf{A}}(h)=x'_1\right] }\cdot {\mathop {\Pr }\limits _{\begin{array}{c} h' \leftarrow \mathcal {H} \end{array}}\left[ \widetilde{\textsf{A}}(h')=x'_1 \mid x'_1 \in S(h')\right] }+\alpha _1^2, \end{aligned}$$

where the second equality holds by the definition of $\widetilde{\textsf{A}}$ and since $x_1$ is always a member of ${{\mathcal {G}}}_{h,x_1,x_2}$. We next show that

$$\begin{aligned} \sum _{x'_1\in \Omega }{\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \widetilde{\textsf{A}}(h)=x'_1\right] }\cdot {\mathop {\Pr }\limits _{\begin{array}{c} h' \leftarrow \mathcal {H} \end{array}}\left[ \widetilde{\textsf{A}}(h')=x'_1 \mid x'_1 \in S(h')\right] } \ge \alpha _2^2 \cdot n^{-\beta -1}. \end{aligned}$$

(13)

Indeed, assume that $\Omega $ is not empty, as otherwise the above holds trivially. We observe that for every $x \in \Omega $,

$$\begin{aligned} 0<{\mathop {\Pr }\limits _{h'\leftarrow \mathcal {H}}\left[ x \in S(h')\right] }\le \left| \textsf{Im}(f) ~\right| \cdot n/2^n \le n^{\beta +1}/\left| f^{-1}(f(x))\right| \le n^{\beta +1}/\left| \Omega \right| . \end{aligned}$$

(14)

Thus we can use Lemma 2.16, with ${\mathcal {X}}=\mathcal {H}$ in order to get Equation (13).

Combining the above, we conclude that

$$\begin{aligned}&\sum _{x'_1\in f^{-1}(f(x_1))}{\mathop {\Pr }\limits _{h \leftarrow \mathcal {H}}\left[ \textsf{A}(h,x_1,x_2)=(x'_1,\cdot )\right] }\cdot {{\mathop {\Pr }\limits _{\begin{array}{c} h' \leftarrow \mathcal {H} \end{array}}\left[ \textsf{A}(h',x_1,x_2)=(x'_1,\cdot ) \mid x'_1 \in {{\mathcal {G}}}_{h',x_1,x_2}\right] }}\\&\quad \ge \alpha _2^2 \cdot n^{-\beta -1} + \alpha _1^2. \end{aligned}$$

The claim follows since either $\alpha _1$ or $\alpha _2$ is at least $\alpha _{x_1,x_2}/2$. $\square $
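
In more detail, the final step is the following case analysis (a routine check, using $\alpha _1+\alpha _2=\alpha _{x_1,x_2}$ and $n^{-\beta -1}\le 1$):

$$\begin{aligned} \alpha _1\ge \alpha _{x_1,x_2}/2&\;\Longrightarrow \; \alpha _2^2 \cdot n^{-\beta -1} + \alpha _1^2 \ge \alpha _1^2 \ge \alpha ^2_{x_1,x_2}/4 \ge \alpha ^2_{x_1,x_2}\cdot n^{-\beta -1}/4,\\ \alpha _2\ge \alpha _{x_1,x_2}/2&\;\Longrightarrow \; \alpha _2^2 \cdot n^{-\beta -1} + \alpha _1^2 \ge \alpha _2^2\cdot n^{-\beta -1} \ge \alpha ^2_{x_1,x_2}\cdot n^{-\beta -1}/4. \end{aligned}$$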

We are now ready to prove Lemma 4.5.

Proof of Lemma 4.5

For fixed $x_1$ and $x_2$ let $\alpha _{x_1,x_2}$ be as in Claim 4.7. We start by showing that

$$\begin{aligned} {\mathop {\Pr }\limits _{x \leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{Inv}^\textsf{A}(f(x)) \in f^{-1}(f(x))\right] }\ge {\mathop {{\text {E}}}\limits _{x_1,x_2 \leftarrow \left\{ 0,1\right\} ^n}\left[ \alpha ^2_{x_1,x_2}\right] }\cdot n^{-2\beta -2}\cdot 2^{-12}. \end{aligned}$$

(15)

Indeed, by Equation (12),

$$\begin{aligned}&{\mathop {\Pr }\limits _{x \leftarrow \left\{ 0,1\right\} ^n}\left[ \textsf{Inv}^\textsf{A}(f(x)) \in f^{-1}(f(x))\right] }\ge {\mathop {{\text {E}}}\limits _{\begin{array}{c} h \leftarrow \mathcal {H}, x_1,x_2 \leftarrow \left\{ 0,1\right\} ^n\\ y \leftarrow f(\left\{ 0,1\right\} ^n) \\ (x'_1,x'_2) \leftarrow A(h,x_1,x_2) \end{array}}\left[ p_\textsf{A}(x_1,x_2,x'_1,y)\right] }\\&={\mathop {{\text {E}}}\limits _{x_1,x_2 \leftarrow \left\{ 0,1\right\} ^n}\left[ {\mathop {{\text {E}}}\limits _{\begin{array}{c} h \leftarrow \mathcal {H}, y \leftarrow f(\left\{ 0,1\right\} ^n), \\ (x'_1,x'_2) \leftarrow A(h,x_1,x_2) \end{array}}\left[ p_\textsf{A}(x_1,x_2,x'_1,y)\right] }\right] }, \end{aligned}$$

and thus it is enough to show that for every fixed $x_1,x_2\in \left\{ 0,1\right\} ^n$,

$$\begin{aligned} {\mathop {{\text {E}}}\limits _{\begin{array}{c} h \leftarrow \mathcal {H}, y \leftarrow f(\left\{ 0,1\right\} ^n), \\ (x'_1,x'_2) \leftarrow A(h,x_1,x_2) \end{array}}\left[ p_\textsf{A}(x_1,x_2,x'_1,y)\right] } \ge \alpha ^2_{x_1,x_2}\cdot n^{-2\beta -2}\cdot 2^{-12}. \end{aligned}$$

Indeed, recall that by definition, $p_\textsf{A}(x_1,x_2,\bot ,y) =0$. Therefore,

$$\begin{aligned}&{\mathop {{\text {E}}}\limits _{\begin{array}{c} h \leftarrow \mathcal {H}, y \leftarrow f(\left\{ 0,1\right\} ^n), \\ (x'_1,x'_2) \leftarrow A(h,x_1,x_2) \end{array}}\left[ p_\textsf{A}(x_1,x_2,x'_1,y)\right] }\\&=\sum _{x'_1\in f^{-1}(f(x_1))}{\mathop {\Pr }\limits _{h\leftarrow \mathcal {H}}\left[ A(h,x_1,x_2)=(x'_1,\cdot )\right] }\cdot {\mathop {{\text {E}}}\limits _{ y \leftarrow f(\left\{ 0,1\right\} ^n)}\left[ p_\textsf{A}(x_1,x_2,x'_1,y)\right] }\\&\ge \sum _{x'_1\in f^{-1}(f(x_1))}{\mathop {\Pr }\limits _{h\leftarrow \mathcal {H}}\left[ A(h,x_1,x_2)=(x'_1,\cdot )\right] }\cdot {\mathop {\Pr }\limits _{h' \leftarrow \mathcal {H}}\left[ A(h',x_1,x_2)=(x'_1,\cdot ) \mid x'_1\in {{\mathcal {G}}}_{h',x_1,x_2}\right] }\cdot n^{-\beta -1}\cdot 2^{-10}\\&\ge \alpha ^2_{x_1,x_2}\cdot n^{-2\beta -2}\cdot 2^{-12}. \end{aligned}$$

Here the equality holds by the assumption that $\textsf{A}$ always outputs a valid collision, or $\bot $. The first inequality holds by Claim 4.6 and the second by Claim 4.7.

We are now left to bound ${\mathop {{\text {E}}}\limits _{x_1,x_2 \leftarrow \left\{ 0,1\right\} ^n}\left[ \alpha ^2_{x_1,x_2}\right] }\cdot n^{-2\beta -2}\cdot 2^{-12}$. Observe that by definition ${\mathop {{\text {E}}}\limits _{x_1,x_2 \leftarrow \left\{ 0,1\right\} ^n}\left[ \alpha _{x_1,x_2}\right] } = \alpha $, and thus by Jensen's inequality, it holds that ${\mathop {{\text {E}}}\limits _{x_1,x_2 \leftarrow \left\{ 0,1\right\} ^n}\left[ \alpha ^2_{x_1,x_2}\right] } \ge \alpha ^2$, which concludes the proof. $\square $
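
For this particular use, the inequality ${\text {E}}[\alpha ^2_{x_1,x_2}]\ge \alpha ^2$ is just the non-negativity of the variance. Writing ${\text {E}}$ for the expectation over $x_1,x_2 \leftarrow \left\{ 0,1\right\} ^n$, a one-line check is

$$\begin{aligned} {\text {E}}\left[ \alpha ^2_{x_1,x_2}\right] -\alpha ^2 = {\text {E}}\left[ \left( \alpha _{x_1,x_2}-\alpha \right) ^2\right] \ge 0. \end{aligned}$$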

Notes

For a one-way function f and pairwise independent hash functions $h_1,\dots ,h_k$, the k-th randomized iteration of f is $f\circ h_k \circ \dots \circ f\circ h_1 \circ f$.
For a regular function $f:\left\{ 0,1\right\} ^n\rightarrow \left\{ 0,1\right\} ^n$ with ${\mathcal {Y}}=Image(f)$, the function $g:\left\{ 0,1\right\} ^n\times {\mathcal {Y}}\rightarrow {\mathcal {Y}}$ defined by $g(x,y)=f(x\oplus y)$ is a regular function with regularity parameter $2^n$.
We ignore low-order terms for this introduction.
By [17], $\Omega (n)$ calls are necessary for any black-box construction. Since for non-adaptive constructions the uniformly random calls seem the only reasonable way to use the one-way function, such construction needs at least $\Omega (n^2)$ input bits. We admit it is only a vague explanation.
The assumption that f is length-preserving is made for simplicity and is not crucial for our constructions.
For this reason we need to output the last input $x_t$ in our UOWHF construction.
Actually, we need to show that the function g is hard to invert on outputs sampled from a specific distribution. This is sufficient for applying the Goldreich–Levin theorem, see Lemma 2.5.
By taking $\mathcal {H}= \left\{ h_m :m \in \left\{ 0,1\right\} ^{2n\times ( \log ^2 n + \log n)}, h\in {{\mathcal {G}}}\right\} $ where ${{\mathcal {G}}}=\left\{ g :\left\{ 0,1\right\} ^{2n} \rightarrow \left\{ 0,1\right\} ^{n-\log ^2 n}\right\} $ is arbitrary 2-universal family, and $h_m(z):=h(z)\circ m(z)$, the seed length can be reduced up to $O(n\cdot t)$.
Note that if $i\le n-\omega (\log n)$ there is no need for GL. Indeed, by the leftover hash lemma, the first bits of h are statistically close to uniform.
Any approximately flat, constructible, and 2-universal hash family will suffice. Such a family with a smaller size, if one exists, can be used in order to reduce the key length to as little as $O(n \cdot t)$.
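
The randomized iterate defined in the first note above can be sketched as follows. This is only an illustration under stated assumptions: the toy function `toy_f` is not one-way, and the affine maps over GF(2) produced by `sample_affine_hash` are one standard choice of pairwise independent family, not necessarily the one used in the cited works. All names are hypothetical.

```python
import random

def toy_f(x):
    """Toy stand-in for f on 4-bit inputs (NOT one-way)."""
    return (3 * x + 1) % 16

def sample_affine_hash(n, rng):
    """Sample h(x) = A*x xor b over GF(2), with A a random n x n bit matrix."""
    rows = [rng.getrandbits(n) for _ in range(n)]
    shift = rng.getrandbits(n)
    def h(x):
        out = 0
        for row in rows:
            out = (out << 1) | (bin(row & x).count("1") & 1)
        return out ^ shift
    return h

def randomized_iterate(f, hashes, x):
    """Compute f(h_k(... f(h_1(f(x))) ...)) for hashes = [h_1, ..., h_k]."""
    y = f(x)
    for h in hashes:
        y = f(h(y))        # each call to f depends on the previous output
    return y

rng = random.Random(0)
hashes = [sample_affine_hash(4, rng) for _ in range(3)]
print(randomized_iterate(toy_f, hashes, 5))
```

The loop makes the adaptivity visible: every call to `f` after the first takes as input a value derived from the previous call, in contrast to the non-adaptive constructions of this work.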

References

R. Agrawal, Y.-H. Chen, T. Horel, S. Vadhan, Unifying computational entropies via Kullback–Leibler divergence, in Annual International Cryptology Conference. (Springer, 2019), pp. 831–858
S. Ames, R. Gennaro, M. Venkitasubramaniam, The generalized randomized iterate and its application to new efficient constructions of UOWHFs from regular one-way functions, in International Conference on the Theory and Application of Cryptology and Information Security. (Springer, 2012), pp. 154–171
K. Barhum, T. Holenstein, A cookbook for black-box separations and a recipe for UOWHFs, in Theory of Cryptography Conference. (Springer, 2013), pp. 662–679
K. Barhum, U. Maurer, UOWHFs from OWFs: Trading regularity for efficiency, in International Conference on Cryptology and Information Security in Latin America. (Springer, 2012), pp. 234–253
M. Blum, S. Micali, How to generate cryptographically strong sequences of pseudorandom bits. SIAM J. Comput. 13(4), 850–864 (1984)
R. Gennaro, Y. Gertner, J. Katz, L. Trevisan, Bounds on the efficiency of generic cryptographic constructions. SIAM J. Comput. 35(1), 217–246 (2005)
O. Goldreich, R. Impagliazzo, L. Levin, R. Venkatesan, D. Zuckerman, Security preserving amplification of hardness, in Proceedings [1990] 31st Annual Symposium on Foundations of Computer Science. IEEE (1990), pp. 318–326
O. Goldreich, H. Krawczyk, M. Luby, On the existence of pseudorandom generators. SIAM J. Comput. 22(6), 1163–1175 (1993)
O. Goldreich, L. A. Levin, A hard-core predicate for all one-way functions, in Proceedings of the twenty-first annual ACM symposium on Theory of computing. (1989), pp. 25–32
I. Haitner, D. Harnik, O. Reingold, Efficient pseudorandom generators from exponentially hard one-way functions, in International Colloquium on Automata, Languages, and Programming. (Springer, 2006), pp. 228–239
I. Haitner, D. Harnik, O. Reingold, On the power of the randomized iterate, in Annual International Cryptology Conference. (Springer, 2006), pp. 22–40
I. Haitner, T. Holenstein, O. Reingold, S. Vadhan, H. Wee, Universal one-way hash functions via inaccessible entropy, in Annual International Conference on the Theory and Applications of Cryptographic Techniques. (Springer, 2010), pp. 616–637
I. Haitner, O. Reingold, S. Vadhan, Efficiency improvements in constructing pseudorandom generators from one-way functions. SIAM J. Comput. 42(3), 1405–1430 (2013)
I. Haitner, O. Reingold, S. Vadhan, H. Wee, Inaccessible entropy, in Proceedings of the forty-first annual ACM symposium on Theory of computing. (2009), pp. 611–620
J. Håstad, R. Impagliazzo, L. A. Levin, M. Luby, A pseudorandom generator from any one-way function. SIAM J. Comput. 28(4), 1364–1396 (1999)
T. Holenstein, Pseudorandom generators from one-way functions: A simple construction for any hardness, in Theory of Cryptography Conference. (Springer, 2006), pp. 443–461
T. Holenstein, M. Sinha, Constructing a pseudorandom generator requires an almost linear number of calls, in 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science. IEEE, (2012), pp. 698–707
R. Impagliazzo, L. A. Levin, M. Luby, Pseudo-random generation from one-way functions, in Proceedings of the twenty-first annual ACM symposium on Theory of computing. (1989), pp. 12–24
M. Naor, M. Yung, Universal one-way hash functions and their cryptographic applications, in Proceedings of the twenty-first annual ACM symposium on Theory of computing. (1989), pp. 33–43
J. Rompel, One-way functions are necessary and sufficient for secure signatures, in Proceedings of the twenty-second annual ACM symposium on Theory of computing. (1990), pp. 387–394
S. Vadhan, C. J. Zheng, Characterizing pseudoentropy and simplifying pseudorandom generator constructions, in Proceedings of the forty-fourth annual ACM symposium on Theory of computing. (2012), pp. 817–836
A. C. Yao, Theory and application of trapdoor functions, in 23rd Annual Symposium on Foundations of Computer Science (SFCS 1982). IEEE, (1982), pp. 80–91
Y. Yu, D. Gu, X. Li, J. Weng, (Almost) optimal constructions of UOWHFs from 1-to-1, regular one-way functions and beyond, in Annual Cryptology Conference. (Springer, 2015), pp. 209–229
Y. Yu, D. Gu, X. Li, J. Weng, The randomized iterate, revisited - almost linear seed length PRGs from a broader class of one-way functions, in Theory of Cryptography Conference. (Springer, 2015), pp. 7–35
Y. Yu, X. Li, J. Weng, Pseudorandom generators from regular one-way functions: New constructions with improved parameters. Theor. Comput. Sci. 569, 58–69 (2015)


Acknowledgements

We are thankful to Iftach Haitner and Salil Vadhan for very useful discussions. We also thank the anonymous reviewers for their comments.

Funding

Open access funding provided by Tel Aviv University.

Author information

Authors and Affiliations

School of Computer Science, Tel-Aviv University, Tel Aviv, Israel
Noam Mazor
Department of Computer Science, University of Southern California, Los Angeles, USA
Jiapeng Zhang

Corresponding author

Correspondence to Noam Mazor.

Additional information

Communicated by Stefano Tessaro.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

N. Mazor: Research supported by Israel Science Foundation grant 666/19 and the Blavatnik Interdisciplinary Cyber Research Center at Tel-Aviv University.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Mazor, N., Zhang, J. Simple Constructions from (Almost) Regular One-Way Functions. J Cryptol 37, 25 (2024). https://doi.org/10.1007/s00145-024-09507-4

Received: 02 July 2022
Revised: 04 December 2023
Accepted: 23 April 2024
Published: 30 May 2024
DOI: https://doi.org/10.1007/s00145-024-09507-4
