1 Introduction

In recent years, there has been interest in the study of cryptographic primitives that are implemented by local functions, that is, functions in which each output bit depends on a constant number of input bits. This study has been in large part spurred by the discovery that, under widely accepted cryptographic assumptions, local functions can achieve rich forms of cryptographic functionality, ranging from one-wayness and pseudorandom generation to semantic security and existential unforgeability [6].

Local functions have simple structure: They can be described by a sparse input–output dependency graph and a sequence of small predicates applied at each output. Besides allowing efficient parallel evaluation, this simple structure makes local functions amenable to analysis and gives hope for proving highly non-trivial statements about them. Given that the cryptographic functionalities that local functions can achieve are quite complex, it is very interesting and appealing to try to understand which properties of local functions (namely graphs and predicates) are necessary and sufficient for them to implement such functionalities.

In this work, we focus on the study of local pseudorandom generators with large stretch. We give evidence that for most graphs, all but a handful of “degenerate” predicates yield pseudorandom generators with output length \(m = n^{1 + \varepsilon }\) for some constant \(\varepsilon > 0\). Conversely, we show that for almost all graphs, degenerate predicates are not secure even against linear distinguishers. Taken together, these results expose a dichotomy: Every predicate is either very hard or very easy, in the sense that it either yields a generator for almost all graphs or fails to do so for almost all graphs.

1.1 Easy, Sometimes Hard, and Almost Always Hard Predicates

Recall that a pseudorandom generator is a length increasing function \(f:\{0,1\}^n \rightarrow \{0,1\}^m\) such that no efficiently computable test can distinguish with noticeable advantage between the value \(f(x)\) and a randomly chosen \(y\in \{0,1\}^m\), when \(x\in \{0,1\}^n\) is chosen at random. The additive stretch of \(f\) is defined to be the difference between its output length \(m\) and its input length \(n\).

In the context of constructing local pseudorandom generators of superlinear stretch, we may assume without loss of generality that all outputs apply the same predicate \(P:\{0,1\}^d \rightarrow \{0,1\}\).Footnote 1 We are interested in understanding which \(d\)-local functions \(f_{G, P}:\{0,1\}^n \rightarrow \{0,1\}^m\), described by a graph \(G\) and a predicate \(P\), are pseudorandom generators. For a predicate \(P\), we will say

  • \(P\) is easy if \(f_{G, P}\) is not pseudorandom (for a given class of adversaries) for every \(G\),

  • \(P\) is sometimes hard if \(f_{G, P}\) is pseudorandom for some \(G\), and

  • \(P\) is almost always hard if \(f_{G, P}\) is pseudorandom for a \(1-o(1)\) fraction of \(G\).Footnote 2

Cryan and Miltersen [17] and Mossel et al. [28] identified several classes of predicates that are easy for polynomial-time algorithms when the stretch is a sufficiently large linear function. These include (1) unbalanced predicates, (2) linear predicates, (3) predicates that are biased toward one input (i.e., \(\Pr _w[P(w) = w_i] \ne \frac{1}{2}\)), and (4) predicates that are biased toward a pair of inputs (i.e., \(\Pr _w[P(w)=w_i\oplus w_j]\ne \frac{1}{2}\)). We call such predicates degenerate. By case analysis, it can be shown that degenerate predicates include all predicates of locality at most four [17, 28].
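The case analysis for locality at most four can also be verified by exhaustive search. Below is a minimal sketch (in Python, with illustrative names) of the equivalent formulation used later in this paper: no 4-bit predicate is simultaneously 2-resilient (uncorrelated with every parity of at most two inputs) and of algebraic degree at least 2.

```python
# Sketch: exhaustively verify that every predicate of locality d = 4 is degenerate,
# i.e., no 4-bit predicate is both 2-resilient and of algebraic degree >= 2.
d = 4
N = 1 << d  # 16 truth-table entries

def parity(x):
    return bin(x).count("1") & 1

def is_2_resilient(tt):
    # All Fourier coefficients alpha_T with |T| <= 2 must vanish:
    # sum_x (-1)^(P(x) + <x, T>) == 0 (T = 0 covers balancedness).
    return all(
        sum((-1) ** (tt[x] ^ parity(x & T)) for x in range(N)) == 0
        for T in range(N) if bin(T).count("1") <= 2
    )

def degree(tt):
    # Algebraic normal form via the Moebius transform over F_2.
    anf = list(tt)
    for i in range(d):
        for m in range(N):
            if m & (1 << i):
                anf[m] ^= anf[m ^ (1 << i)]
    return max((bin(m).count("1") for m in range(N) if anf[m]), default=0)

non_degenerate = []
for code in range(1 << N):  # all 2^16 predicates on 4 bits
    tt = [(code >> x) & 1 for x in range(N)]
    if is_2_resilient(tt) and degree(tt) >= 2:
        non_degenerate.append(tt)
print(len(non_degenerate))  # 0
```

The empty outcome matches Siegenthaler's trade-off between resiliency and algebraic degree.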

On the positive side, Mossel et al. [28] also gave examples of five-bit predicates that are sometimes (exponentially) hard against linear distinguishers. Applebaum et al. [5] show that when the locality is sufficiently large, almost always hard predicates against linear distinguishers exist.
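To make "sometimes hard" concrete, the following sketch checks that the 5-bit predicate \(w_1\oplus w_2\oplus w_3\oplus (w_4\wedge w_5)\) (our illustrative choice; [28] discuss predicates of this flavor) escapes all four degenerate classes, i.e., is 2-resilient and has algebraic degree 2.

```python
# Sketch: the 5-bit predicate P(w) = w1 + w2 + w3 + w4*w5 (over F_2) is
# non-degenerate: 2-resilient and of algebraic degree 2.
d = 5
N = 1 << d

def parity(x):
    return bin(x).count("1") & 1

def P(x):  # bit i of x encodes w_{i+1}
    return ((x >> 0) ^ (x >> 1) ^ (x >> 2) ^ ((x >> 3) & (x >> 4))) & 1

# 2-resiliency: zero correlation with every parity of at most two inputs.
for T in range(N):
    if bin(T).count("1") <= 2:
        assert sum((-1) ** (P(x) ^ parity(x & T)) for x in range(N)) == 0

# Algebraic degree via the Moebius (ANF) transform: the monomial w4*w5 has degree 2.
anf = [P(x) for x in range(N)]
for i in range(d):
    for m in range(N):
        if m & (1 << i):
            anf[m] ^= anf[m ^ (1 << i)]
deg = max(bin(m).count("1") for m in range(N) if anf[m])
print(deg)  # 2
```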

Pseudorandomness against linear distinguishers means that there is no subset of output bits whose XOR has noticeable bias. This notion, due to Naor and Naor [29], was advocated in the context of local pseudorandom generators by Cryan and Miltersen [17]. A bit more formally, for a function \(f:\{0,1\}^n\rightarrow \{0,1\}^{m}\), we let

$$\begin{aligned} \mathsf {bias}(f)=\max _{L}\left| \Pr [L(f(\mathcal {U}_n))=1]-\Pr [L(\mathcal {U}_m)=1] \right| , \end{aligned}$$

where the maximum is taken over all affine functions \(L:\mathbb {F}_2^m\rightarrow \mathbb {F}_2\). A small-bias generator is a function \(f\) for which \(\mathsf {bias}(f)\) is small (preferably negligible) as a function of \(n\).
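On toy instances, \(\mathsf {bias}(f)\) can be computed exhaustively. A minimal sketch (function names and parameters are illustrative; shifting a test by a constant does not change its advantage, so it suffices to scan parity tests):

```python
# Exhaustive bias computation for a toy f: {0,1}^n -> {0,1}^m, with inputs and
# outputs packed into integers. For a non-trivial parity L, Pr[L(U_m) = 1] = 1/2.
def bias(f, n, m):
    outs = [f(x) for x in range(1 << n)]
    best = 0.0
    for mask in range(1, 1 << m):  # non-trivial linear tests L(y) = <mask, y>
        p1 = sum(bin(y & mask).count("1") & 1 for y in outs) / (1 << n)
        best = max(best, abs(p1 - 0.5))
    return best

# A toy "generator" that copies x and appends the parity of x: the linear test
# that XORs all m output bits is constant, so the bias is maximal (1/2).
def copy_plus_parity(x):
    return (x << 1) | (bin(x).count("1") & 1)

print(bias(copy_plus_parity, 3, 4))  # 0.5
```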

1.2 Our Results

We fully classify predicates by showing that all predicates that are not known to be easy are almost always hard.

Theorem 1.1

(Non-degenerate predicates are hard) Let \(P:\{0,1\}^d\rightarrow \{0,1\}\) be any non-degenerate predicate. Then, for every \(\varepsilon <1/4\) and \(m=n^{1+\varepsilon }\):

$$\begin{aligned} \Pr _{G}\left[ \mathsf {bias}(f_{G,P})>\delta (n)\right] <\delta (n), \end{aligned}$$

where \(\delta (n)=\exp (-\Omega (n^{1/4-\varepsilon }))\) and \(G\) is randomly chosen from all \(d\)-regular hypergraphs with \(n\) nodes (representing the inputs) and \(m\) hyperedges (representing the outputs).

The theorem shows that, even when locality is large, the only easy predicates are degenerate ones, and there are no other “sources of easiness” other than ones that already appear in predicates of locality \(4\) or less.

Conversely, we show that degenerate predicates are easy for linear distinguishers (as opposed to general polynomial-time distinguishers).

Theorem 1.2

(Linear tests break degenerate predicates) For every \(m=n+\Omega (n)\) and every degenerate predicate \(P:\{0,1\}^d\rightarrow \{0,1\}\),

$$\begin{aligned} \Pr _{G}\left[ \mathsf {bias}(f_{G,P})\ge n^{-c}\right] \ge 1-o(1), \end{aligned}$$

where \(c>0\) is a constant (depending on \(d\)) and \(G\) is randomly chosen from all \(d\)-regular hypergraphs with \(n\) nodes and \(m\) hyperedges.

The proof of Theorem 1.2 mainly deals with degenerate predicates that are correlated with a pair of their inputs. In this case, we show that the nonlinear distinguisher which was previously used in [28] and was based on a semi-definite program for MAX-2-LIN [21] can be replaced with a simple linear distinguisher. (The proof for other degenerate predicates follows from previous works).

Taken together, Theorems 1.1 and 1.2 expose a dichotomy: A predicate is either easy (fails for almost all graphs) or hard (succeeds for almost all graphs). One possible interpretation of our results is that, from a designer’s point of view, a strong emphasis should be put on the choice of the predicate, while the choice of the input–output dependency graph may be less crucial (since if the predicate is appropriately chosen then most graphs yield a small-bias generator). In some sense, this means that constructions of local pseudorandom generators with large stretch are robust: As long as the graph \(G\) is “typical,” any non-degenerate predicate can be used (our proof classifies explicitly what a typical family of graphs is and in addition shows that even a mixture of different non-degenerate predicates would work).

1.3 Why Polynomial Stretch?

While Applebaum et al. [6] give strong evidence that local pseudorandom generators exist, the stretch their construction achieves is only sublinear (\(m=n+n^{1-\varepsilon }\)). In contrast, the regime of large (polynomial or even linear) stretch is not as well understood, and the only known constructions are based on nonstandard assumptions. (See Sect. 1.5.)

Large-stretch local generators are known to have several applications in cryptography and complexity, such as secure computation with constant overhead [25] and strong (average-case) inapproximability results for constraint-satisfaction problems [7]. These results are not known to follow from other (natural) assumptions. It should be mentioned that it is possible to convert small polynomial stretch of \(m=n^{1+\varepsilon }\) into arbitrary (fixed) polynomial stretch of \(m=n^c\) at the expense of constant blow-up in the locality. (This follows from standard techniques, see [4] for details). Hence, it suffices to focus on the case of \(m=n^{1+\varepsilon }\) for some fixed \(\varepsilon \).
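The amplification step mentioned above can be sketched as a simple self-composition (a hedged outline of the standard technique; the actual transformation and its security proof appear in [4]). Given a \(d\)-local generator family \(f_n:\{0,1\}^n\rightarrow \{0,1\}^{n^{1+\varepsilon }}\), define

$$\begin{aligned} f^{(k)}=f_{n^{(1+\varepsilon )^{k-1}}}\circ \cdots \circ f_{n^{1+\varepsilon }}\circ f_{n}:\{0,1\}^n\rightarrow \{0,1\}^{n^{(1+\varepsilon )^k}}. \end{aligned}$$

Each output of \(f^{(k)}\) depends on at most \(d^k\) inputs, and pseudorandomness is preserved by a standard hybrid argument; taking \(k=\lceil \log c/\log (1+\varepsilon )\rceil \) gives output length at least \(n^{c}\) with locality \(d^k=O(1)\).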

The proof of Theorem 1.1 yields exponentially small bias when \(m=O(n)\), and sub-exponential bias for \(m=n^{1+\varepsilon }\) where \(\varepsilon <1/4\). We do not know whether this is tight, but it can be shown that some non-degenerate predicates become easy (to break on a random graph) when the output length is \(m=n^2\) or even \(m=n^{3/2}\). In general, it seems that when \(m\) grows, the number of hard predicates of locality \(d\) decreases, till the point \(m^{\star }\) where all predicates become easy. (By [28], \(m^{\star }\le n^{d/2}\).) It will be interesting to obtain a classification for larger output lengths, and to find out whether a similar dichotomy happens there as well.

1.4 Why Small-Bias?

Small-bias generators are a strict relaxation of cryptographic pseudorandom generators in that the tests \(L:\mathbb {F}_2^m\rightarrow \mathbb {F}_2\) are restricted to be affine (as opposed to arbitrary efficiently computable functions). Even though affine functions are, in general, fairly weak distinguishers, handling them is a necessary first step toward achieving cryptographic pseudorandomness. In particular, affine functions are used extensively in cryptanalysis and security against them already rules out an extensive class of attacks.

For local pseudorandom generators with linear stretch, Cryan and Miltersen conjectured that affine distinguishers are as powerful as polynomial-time distinguishers [17]. In Sect. 5, we attempt to support this view by showing that resilience against small-bias, by itself, leads to robustness against other classes of attacks.

Small-bias generators are also of interest in their own right, being used as building blocks in constructions that achieve stronger forms of pseudorandomness. These include constructions of local cryptographic pseudorandom generators [4, 7], as well as pseudorandom generators that fool low-degree polynomials [14], small-space computations [24], and read-once formulas [11].

1.5 Related Work

The function \(f_{G,P}\) was introduced by Goldreich [22] who conjectured that when \(m=n\), one-wayness should hold for a random graph and a random predicate. This view is supported by the results of [3, 16, 20, 22, 26, 27, 30] who show that a large class of algorithms (including ones that capture DPLL-based heuristics) fail to invert \(f_{G,P}\) in polynomial time.

In the linear regime, i.e., when \(m=n+\Omega (n)\), it is shown in [12] that if the predicate is degenerate, the function \(f_{G,P}\) can be inverted in polynomial time. (This strengthens the results of [17, 28], which only give distinguishers.) Recently, a strong self-amplification theorem was proved in [13], showing that for \(m=n+\Omega _d(n)\), if \(f_{G,P}\) is hard to invert over a tiny (sub-exponentially small) fraction of the inputs with respect to sub-exponential time algorithms, then the same function is actually hard to invert over almost all inputs (with respect to sub-exponential time algorithms).

Pseudorandom generators with sub-linear stretch can be implemented by \(4\)-local functions based on standard intractability assumptions (e.g., hardness of factoring, discrete-log, or lattice problems) [6], or even \(3\)-local functions based on the intractability of decoding random linear codes [8]. However, it is unknown how to extend this result to polynomial or even linear stretch since all known stretch amplification procedures introduce a large (polynomial) overhead in the locality. In fact, for the special case of \(4\)-local functions (in which each output depends on at most 4 input bits), there is a provable separation: Although such functions can compute sub-linear pseudorandom generators [6], they cannot achieve polynomial stretch [17, 28].

Alekhnovich [1] conjectured that for \(m=n+\Theta (n)\), the function \(f_{G,P}\) is pseudorandom for a random graph and when \(P\) is a randomized predicate which computes \(z_1\oplus z_2\oplus z_3\) and with some small probability \(p<\frac{1}{2}\) flips the result. Although this construction does not lead directly to a local function (due to the use of noise), it was shown in [7] that it can be derandomized and transformed into a local construction with linear stretch. (The restriction to linear stretch holds even if one strengthens Alekhnovich’s assumption to \(m=\mathrm{poly}(n)\).)

More recently, [4] showed that the pseudorandomness of \(f_{G,P}\) with respect to a random graph and output length \(m\) can be reduced to the one-wayness of \(f_{H,P}\) with respect to a random graph \(H\) and related output length \(m'\). The current paper complements this result as it provides a criterion for choosing the predicate \(P\).Footnote 3

2 Techniques and Ideas

In this section, we give an overview of the proof of our Theorem 1.1. Let \(f:\{0,1\}^n\rightarrow \{0,1\}^{m}\) be a \(d\)-local function where each output bit is computed by applying some \(d\)-local predicate \(P:\{0,1\}^d \rightarrow \{0,1\}\) to an (ordered) subset of the inputs \(S\subseteq [n]\).Footnote 4 Any such function can be described by a list of \(m\) \(d\)-tuples \(G=(S_1,\ldots ,S_m)\) and the predicate \(P\). Under this convention, we let \(f_{G,P}:\{0,1\}^n\rightarrow \{0,1\}^{m}\) denote the corresponding \(d\)-local function.

We view \(G\) as a \(d\)-regular hypergraph with \(n\) nodes (representing inputs) and \(m\) hyperedges (representing outputs) each of size \(d\). (We refer to such a graph as an \((m,n,d)\)-graph.) Since we are mostly interested in polynomial stretch, we think of \(m\) as \(n^{1+\varepsilon }\) for some fixed \(\varepsilon >0\), e.g., \(\varepsilon =0.1\).
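For concreteness, sampling an \((m,n,d)\)-graph and evaluating \(f_{G,P}\) can be sketched as follows (all names and parameters are illustrative):

```python
import random

# Toy sketch: sample a random (m, n, d)-graph as a list of m ordered d-tuples
# of distinct input nodes, and evaluate f_{G,P} on an input x in {0,1}^n.
def random_graph(m, n, d, rng):
    return [tuple(rng.sample(range(n), d)) for _ in range(m)]

def f(G, P, x):
    # Each output bit applies P to the projection of x onto one hyperedge.
    return [P(*(x[i] for i in S)) for S in G]

rng = random.Random(0)
n, d = 12, 5
m = 20  # think m = n^{1+eps}; kept tiny here
G = random_graph(m, n, d, rng)
P = lambda w1, w2, w3, w4, w5: w1 ^ w2 ^ w3 ^ (w4 & w5)  # a non-degenerate predicate
x = [rng.randrange(2) for _ in range(n)]
y = f(G, P, x)
assert len(y) == m and set(y) <= {0, 1}
```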

We would like to show that for almost all \((m,n,d)\)-graphs \(G\), the function \(f_{G,P}\) fools all linear tests, where \(P\) is non-degenerate. Following [28], we distinguish between light linear tests, which depend on fewer than \(k=\Omega (n^{1-2\varepsilon })\) outputs, and heavy tests, which depend on at least \(k\) outputs.

Recall that a non-degenerate predicate satisfies two forms of “nonlinearity”: (1) (2-resilient) \(P\) is uncorrelated with any linear function that involves fewer than 3 variables; and (2) (degree 2) the algebraic degree of \(P\) as a polynomial over \(\mathbb {F}_2\) is at least 2. Both properties are classical design criteria which are widely used in practical cryptanalysis (cf. [31]). It turns out that the first property fools light tests and the second property fools heavy tests.

2.1 Fooling Light Tests

Our starting point is a result of [28] which shows that if the predicate is the parity predicate \(\oplus \) and the graph is a good expander, the output of \(f_{G,\oplus }(\mathcal {U}_n)\) perfectly fools all light linear tests. In terms of expectation, this can be written as

$$\begin{aligned} {{\mathrm{\mathsf {E}}}}_x[L(f_{G,\oplus }(x))]=0, \end{aligned}$$

where we think of \(\{0,1\}\) as \(\left\{ \pm 1\right\} \), and let \(L:\left\{ \pm 1\right\} ^m\rightarrow \left\{ \pm 1\right\} \) be a light linear test. Our key insight is that the case of a general predicate \(P\) can be reduced to the case of linear predicates.

More precisely, let \(\xi \) denote the outcome of the test \(L(f_{G,P}(x))\). Then, by looking at the Fourier expansion of the predicate \(P\), we can write \(\xi \) as a convex combination over the reals of exponentially many summands of the form \(\xi _i=L(f_{G_{i},\oplus }(x))\) where the graphs \(G_{i}\) are subgraphs of \(G\). (The exact structure of \(G_i\) is determined by the Fourier representation of \(P\).) When \(x\) is uniformly chosen, the random variable \(\xi \) is a weighted sum (over the reals) of many dependent random variables \(\xi _i\)’s. We show that if \(G\) has sufficiently high vertex expansion (every not too large set of hyperedges covers many vertices) then the expectation of each summand \(\xi _i\) is zero, and so, by the linearity of expectation, the expectation of \(\xi \) is also zero.

When the predicate is 2-resilient, the size of each hyperedge of \(G_{i}\) is at least 3, and therefore, if every \(3\)-uniform subgraph of \(G\) is a good expander, \(f_{G,P}\) (perfectly) passes all light linear tests. Most graphs \(G\) satisfy this property. We emphasize that the argument crucially relies on the perfect bias of XOR predicates, as there are exponentially many summands. (See Sect. 3.1 for full details.)

2.2 Fooling Heavy Tests

Consider a heavy test which involves \(t\ge k\) outputs. Switching back to zero-one notation, assume that the test outputs the value \(\xi =P(x_{S_1})+ \cdots + P(x_{S_t}) \pmod 2\) where \(x\mathop {\leftarrow }\limits ^{R}\mathcal {U}_n\). Our goal is to show that \(\xi \) is close to a fair coin. For this, it suffices to show that the sum \(\xi \) can be rewritten as the sum (over \(\mathbb {F}_2\)) of \(\ell \) random variables

$$\begin{aligned} \xi =\xi _1+ \cdots + \xi _{\ell } \pmod 2, \end{aligned}$$
(1)

where each random variable \(\xi _i\) is an independent non-constant coin, i.e., \(\Pr [\xi _i=1]\in [2^{-d},1-2^{-d}]\). In this case, the statistical distance between \(\xi \) and a fair coin is exponentially small (in \(\ell \)), and we are done as long as \(\ell \) is large enough.
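The quantitative fact used here is the standard piling-up computation: the XOR of independent bits with \(\Pr [\xi _i=1]=p_i\) has bias \(\frac{1}{2}\prod _i|1-2p_i|\). A quick numeric sketch (parameters illustrative):

```python
from itertools import product

# Piling-up check: the bias of the XOR of independent biased bits equals
# (1/2) * prod |1 - 2 p_i|, hence decays exponentially in the number of bits.
def xor_bias(ps):
    # Exact Pr[XOR of the bits = 1] by enumerating all outcomes.
    p1 = 0.0
    for bits in product([0, 1], repeat=len(ps)):
        pr = 1.0
        for b, p in zip(bits, ps):
            pr *= p if b else 1 - p
        if sum(bits) % 2 == 1:
            p1 += pr
    return abs(p1 - 0.5)

ps = [1 / 32] * 8  # eight coins, each with Pr[1] = 2^{-d} for d = 5
closed_form = 0.5 * (1 - 2 / 32) ** 8
assert abs(xor_bias(ps) - closed_form) < 1e-12
print(closed_form)  # ~0.298
```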

In order to partition \(\xi \), let us look at the hyperedges \(S_1,\ldots ,S_t\) which are involved in the test. As a first attempt, let us collect \(\ell \) distinct “independent” hyperedges that do not share a single common variable. Renaming the edges, we can write \(\xi \) as

$$\begin{aligned} \left( P(x_{T_1})+ \cdots + P(x_{T_{\ell }})\right) + \left( P(x_{S_{\ell +1}})+\cdots + P(x_{S_t})\right) \pmod 2, \end{aligned}$$

where the first \(\ell \) random variables are indeed statistically independent. However, the last \(t-\ell \) hyperedges violate statistical independence as they may be correlated with more than one of the first \(\ell \) hyperedges. This is the case, for example, if \(S_{j}\) has a non-empty intersection with both \(T_i\) and \(T_r\). This problem is fixed by collecting \(\ell \) “strongly-independent” hyperedges \(T_1,\ldots , T_{\ell }\) for which every \(S_j\) intersects at most a single \(T_i\). (Such a big set is likely to exist since \(t\) is sufficiently large.) In this case, for any fixing of the variables outside the \(T_i\)’s, the random variable \(\xi \) can be partitioned into \(\ell \) independent random variables of the form \(\xi _i=P(x_{T_i})+\sum P(x_{S_j})\), where the sum ranges over the \(S_j\)’s that intersect \(T_i\). This property (which is a relaxation of Eq. 1) still suffices to achieve our goal, as long as the \(\xi _i\)’s are non-constant.

To prove the latter, we rely on the fact that \(P\) has algebraic degree at least 2. Specifically, let us assume that each \(S_j\) and each \(T_i\) have no more than a single common input node. (This condition can typically be met at the expense of throwing away a small number of the \(T_i\)’s.) In this case, the random variable \(\xi _i=P(x_{T_i})+\sum P(x_{S_j})\) cannot be constant, as the first summand is a degree-2 polynomial in \(x_{T_i}\) and each of the last summands contains at most a single variable from \(T_i\). Hence, \(\xi _i\) is a non-trivial polynomial whose degree is lower-bounded by 2. This completes the argument. Interestingly, nonlinearity is used only to prove that the \(\xi _i\)’s are non-constant. Indeed, linear predicates fail exactly for large tests for which the \(\xi _i\)’s become fixed due to local cancellations. (See Sect. 3.2 for details.)

2.3 Proving Theorem 1.2

When \(P\) is a degenerate predicate and \(G\) is random, the existence of a linear distinguisher follows by standard arguments. The cases of linear or biased \(P\) are trivial, and the case of bias toward one input was analyzed by Cryan and Miltersen. When \(P\) is biased toward a pair of inputs, say the first two, we think of \(P\) as an “approximation” of the parity \(x_1 \oplus x_2\) of its first two inputs. If \(P\) happened to be the predicate \(x_1 \oplus x_2\), one could find a short “cycle” of output bits that, when XORed together, causes the corresponding input bits to cancel out. In general, as long as the outputs along the cycle do not share any additional input bits, the output of the test will be biased, with bias exponential in the length of the cycle. In Sect. 4, we show that a random \(G\) is likely to have such short cycles, and so the corresponding linear test will be biased.
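A toy instance of the cycle distinguisher (the predicate and cycle are our illustrative choices): with \(P(w)=w_1\oplus w_2\oplus (w_3\wedge w_4)\), which agrees with \(w_1\oplus w_2\) with probability \(3/4\), XORing three outputs whose designated pairs form the cycle \((x_1,x_2),(x_2,x_3),(x_3,x_1)\) cancels the \(x_i\)’s and leaves three independent AND terms, giving bias \(\frac{1}{2}(1/2)^3=1/16\):

```python
from itertools import product

# Cycle distinguisher sketch for a pair-biased predicate: P agrees with
# w1 xor w2 with probability 3/4, and the extra inputs of each output are fresh.
def P(w1, w2, w3, w4):
    return w1 ^ w2 ^ (w3 & w4)

# Outputs along the cycle (x0,x1), (x1,x2), (x2,x0), with fresh inputs x3..x8.
edges = [(0, 1, 3, 4), (1, 2, 5, 6), (2, 0, 7, 8)]
n = 9
count = 0
for x in product([0, 1], repeat=n):
    test = 0
    for S in edges:
        test ^= P(*(x[i] for i in S))
    count += test
bias = abs(count / 2 ** n - 0.5)
print(bias)  # 0.0625 = 1/16: the XOR of the outputs along the cycle is biased
```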

3 Non-degenerate Predicates are Hard

In this section, we prove Theorem 1.1. We follow the outline described in Sect. 2 and handle light linear tests and heavy linear tests separately.

3.1 Fooling Light Tests

In this section, we show that if the predicate \(P\) is \(2\)-resilient (see definition below) and the graph \(G\) is a good expander, the function \(f_{G,P}\) is \(k\)-wise independent, and in particular fools linear tests of weight smaller than \(k\). We will need the following definitions.

Lossless expansion. Let \(G\) be an \((m,n,d)\)-graph. We will say \(G\) is \((k, t)\)-expanding (\(1 \le k \le m, 1 \le t \le d\)) if for every \(\ell \le k\), every collection of \(\ell \) distinct hyperedges of \(G\) covers more than \(t\ell \) distinct vertices. We say \(G\) is \((k, a)\)-linear (\(1 \le a \le d\)) if for every collection of \(k\) distinct hyperedges \(S_1, \dots , S_k\) and every collection of subsets \(T_1 \subseteq S_1, \dots , T_k \subseteq S_k\) where \(\left| T_1 \right| , \dots , \left| T_k \right| \ge a\), the indicator vectors of \(T_1, \dots , T_k\) are linearly independent over \(\mathbb {F}_2\) (as vectors in \(\mathbb {F}_2^n\)).
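Both properties can be checked by brute force on tiny instances; the sketch below (illustrative parameters only, exponential-time checkers) also exercises, on one toy graph, the implication established in Lemma 3.3 that expansion implies linearity:

```python
from itertools import combinations, product

# Brute-force checkers for the two graph properties (tiny instances only).
def is_expanding(G, k, t):
    # every collection of l <= k distinct hyperedges covers more than t*l vertices
    return all(
        len(set().union(*map(set, C))) > t * l
        for l in range(1, k + 1)
        for C in combinations(G, l)
    )

def is_linear(G, k, a):
    # Equivalent "no zero-sum" form: no nonempty collection of <= k edges admits
    # subsets T_i of size >= a whose indicator vectors sum to zero over F_2.
    for l in range(1, k + 1):
        for C in combinations(G, l):
            choices = [
                [T for sz in range(a, len(S) + 1) for T in combinations(S, sz)]
                for S in C
            ]
            for Ts in product(*choices):
                acc = 0
                for T in Ts:
                    for v in T:
                        acc ^= 1 << v  # XOR of indicator vectors
                if acc == 0:
                    return False
    return True

# A toy (4, 6, 3)-graph in which every pair of hyperedges shares at most one vertex.
G = [(0, 1, 2), (2, 3, 4), (4, 5, 0), (1, 3, 5)]
d, a, k = 3, 2, 2
assert is_expanding(G, k, d - a / 2)   # (2, 2)-expanding
assert is_linear(G, k, a)              # hence (2, 2)-linear, as Lemma 3.3 predicts
```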

Fourier coefficients. The Fourier expansion of a predicate \(P:\{0,1\}^d\rightarrow \left\{ \pm 1\right\} \) is given by \(\sum _{T\subseteq [d]}\alpha _T \chi _T\) where \(\chi _T(x_1,\ldots ,x_d)=(-1)^{\sum _{i\in T} x_i}\) is the parity of the coordinates in the set \(T\). The predicate is \(a\)-resilient if \(\alpha _T\) is zero for every \(T\) of size at most \(a\).

The following lemma shows that resiliency combined with \((k,a)\)-linearity leads to \(k\)-wise independence.

Lemma 3.1

If \(P\) is \((a-1)\)-resilient and the \((m,n,d)\)-graph \(G\) is \((k,a)\)-linear, then \(f_{G,P}\) is a \(k\)-wise independent generator, i.e., the \(m\) r.v.’s \((y_1,\ldots ,y_m)=f_{G,P}(\mathcal {U}_n)\) are \(k\)-wise independent.

To prove the lemma, we will employ the following fact which follows from Vazirani’s XOR lemma (cf. [23]).

Fact 3.2

A sequence of \(\pm 1\) random variables \((y_1,\ldots ,y_t)\) is \(k\)-wise independent if for every \(\ell \le k\) and every \(i_1 < i_2 < \cdots < i_{\ell }\) it holds that \({{\mathrm{\mathsf {E}}}}[y_{i_1}\cdot \cdots \cdot y_{i_{\ell }}]=0.\)

We can now prove Lemma 3.1.

Proof of Lemma 3.1

We will use the following notation: For a hyperedge \(S=(i_1,\ldots ,i_d)\) and a set \(T\subseteq [d]\), we define the \(T\)-projection of \(S\), denoted by \(S_{T}\), to be the set \(\left\{ i_j: j\in T\right\} \).

Fix \(\ell \le k\) outputs of \(f_{G,P}\), and let \(S_1,\ldots ,S_{\ell }\) be the corresponding hyperedges. By Fact 3.2, we should show that \({{\mathrm{\mathsf {E}}}}_x [\prod _i P(x_{S_i})]=0\). For every \(x\in \{0,1\}^n\) we have:

$$\begin{aligned} \prod _{i=1}^{\ell } P(x_{S_i})= \prod _{i=1}^{\ell } \sum _{T\subseteq [d], |T|\ge a}\alpha _{T}\chi _T(x_{S_i})=\sum _{\mathbf {T}=(T_1,\ldots ,T_{\ell }),\left| T_i \right| \ge a} \prod _i \alpha _{T_i}\chi _{S_{i,T_i}}(x). \end{aligned}$$

Hence, by the linearity of expectation, it suffices to show that

$$\begin{aligned} {{\mathrm{\mathsf {E}}}}_x\left[ \prod _i \chi _{S_{i,T_i}}(x)\right] =0, \end{aligned}$$

for every \((T_1,\ldots ,T_{\ell })\) where \(T_i\subseteq [d],\left| T_i \right| \ge a\). (Recall that the \(\alpha _{T_i}\)’s are constants.) Observe that \(\prod _i \chi _{S_{i,T_i}}(x)\) is just a parity function which, by \((k,a)\)-linearity, is non-trivial. Since every non-trivial parity function has expectation zero, the claim follows. \(\square \)

Next, we show that \((k,a)\)-linearity is implied by expansion, and a random graph is likely to be expanding.

Lemma 3.3

Let \(d\ge 3\) be a constant. Let \(\Delta \le \sqrt{n}/\log n\) and \(3\le a\le d\).

  1. Every \((m, n, d)\)-graph which is \((k, d - a/2)\)-expanding is also \((k,a)\)-linear.

  2. A random \((\Delta n, n, d)\)-graph is \((\alpha n/\Delta ^2, d - a/2)\)-expanding whp, where \(\alpha \) is a constant that depends on \(a\) and \(d\).Footnote 5

Proof

If \(G\) is not \((k, a)\)-linear, then there exists a nonempty collection of \(\ell \le k\) hyperedges \(S_1, \dots , S_\ell \) of \(G\) and subsets \(T_1 \subseteq S_1, \dots , T_\ell \subseteq S_\ell \), \(\left| T_i \right| \ge a\), such that the indicator vectors of \(T_1, \dots , T_\ell \) sum up to zero over \(\mathbb {F}_2^n\). Therefore, every vertex covered by one of \(T_1, \dots , T_\ell \) must be covered at least twice, and so \(T_1, \dots , T_\ell \) can cover at most \(\frac{1}{2}(\left| T_1 \right| + \dots + \left| T_\ell \right| )\) vertices. On the other hand, the total number of vertices covered by \(S_1 - T_1, \dots , S_\ell - T_\ell \) can be at most \(\left| S_1 - T_1 \right| + \dots + \left| S_\ell - T_\ell \right| \). Therefore, the collection \(S_1, \dots , S_\ell \) covers at most

$$\begin{aligned} \frac{1}{2} (\left| T_1 \right| + \dots + \left| T_\ell \right| ) + (\left| S_1 - T_1 \right| + \dots + \left| S_\ell - T_\ell \right| )&= d\ell - \frac{1}{2} (\left| T_1 \right| + \dots + \left| T_\ell \right| )\\&\le (d - a/2) \ell \end{aligned}$$

vertices of \(G\). Thus, \(G\) is not \((k, d - a/2)\)-expanding.

The second item follows by a standard probabilistic calculation. Set \(t = d - a/2\), so that a collection of \(\ell \) hyperedges violating \((k,t)\)-expansion covers at most \(t\ell \) vertices. For \(\ell \le k\), we upper bound the probability that there exists a non-expanding collection of size \(\ell \), i.e., the probability that there exist a set of hyperedges \(A\) of size \(\ell \) and a set of vertices \(B\) of size \(\ell t\) such that all the vertices covered by \(A\) belong to \(B\), by a union bound:

$$\begin{aligned} \left( {\begin{array}{c}\Delta n\\ \ell \end{array}}\right) \cdot \left( {\begin{array}{c}n\\ t \ell \end{array}}\right) \cdot \Bigl (\frac{\ell t}{n}\Bigr )^{d \ell }&\le \Bigl (\frac{e\Delta n}{\ell } \Bigr )^{\ell } \cdot \Bigl (\frac{en}{t \ell } \Bigr )^{t \ell } \cdot \Bigl (\frac{\ell t}{n}\Bigr )^{d \ell } = \Bigl (\frac{e^{t+1}\Delta n}{\ell }\Bigr )^\ell \cdot \Bigl (\frac{\ell t}{n}\Bigr )^{(a/2)\ell }\\&= \biggl (\frac{e^{t+1}t^{a/2}\Delta }{(n/\ell )^{a/2-1}}\biggr )^{\ell }. \end{aligned}$$

where \(e\) denotes the base of the natural logarithm and the inequality follows by the well-known upper-bound \(\left( {\begin{array}{c}n\\ k\end{array}}\right) \le \left( \frac{e n}{k}\right) ^k\). Using the assumption \(a \ge 3\), we can upper bound the last expression by \(p_\ell = (c_{d,a} \Delta \sqrt{\ell /n})^\ell \), where \(c_{d, a}\) is a constant that depends on \(d\) and \(a\) only. Now observe that

  • For \(\ell = 1, 2, 3\) we have \(p_\ell = O(1/\log n)\),

  • For \(4\le \ell \le 10 \log n\) using \(\Delta \le \sqrt{n}/\log n\) we obtain \(p_\ell \le (c_{d, a}\sqrt{\ell }/\log n)^\ell = O(1/(\log n)^2)\), and

  • For \(10 \log n \le \ell \le \alpha n/\Delta ^2\), we have \(p_\ell \le (c_{d,a} \sqrt{\alpha })^\ell = O(1/n^{10})\) for \(\alpha = 1/(2c_{d,a})^2\).

Summing up the contributions of the \(p_\ell \) to the failure probability, we conclude that the probability that \(G\) is not \((\alpha n/\Delta ^2, d - a/2)\)-expanding is \(o(1)\). \(\square \)
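The displayed chain of estimates can be sanity-checked numerically for one concrete (arbitrary) parameter setting with \(t=d-a/2\):

```python
from math import comb, e

# Numeric sanity check of the union-bound estimate for one parameter setting
# with t = d - a/2, l*t an integer, and l*t <= n. The values are illustrative.
n, d, a = 1000, 5, 3
t = d - a / 2                # 3.5
Delta, l = 4, 10             # Delta <= sqrt(n)/log(n) holds for this n
lt = int(t * l)              # 35

lhs = comb(Delta * n, l) * comb(n, lt) * (lt / n) ** (d * l)
rhs = (e ** (t + 1) * t ** (a / 2) * Delta / (n / l) ** (a / 2 - 1)) ** l
assert lhs <= rhs
```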

By combining the lemmas, we obtain the following corollary.

Corollary 3.4

If \(P\) is 2-resilient and \(m=\Delta n\) for constant \(\Delta \), then whp over the choice of an \((m,n,d)\)-graph \(G\), the function \(f_{G,P}\) is \(k\)-wise independent for \(k=\Omega (n)\). If \(\Delta =n^{\varepsilon }\), the above holds with \(k=\Omega (n^{1-2\varepsilon } )\).

By taking \(\varepsilon <1/4\), we obtain that 2-resiliency suffices for \(\omega (\sqrt{n})\)-wise independence with high probability.

3.2 Fooling Heavy Tests

In this section, we show that if the predicate \(P\) is nonlinear and the graph \(G\) has large sets of “independent” hyperedges, the function \(f_{G,P}\) fools linear tests of weight larger than \(k\). Formally, we will need the following notion of independence.

\((k,\ell ,b)\)-independence. Let \(\mathcal{S}\) be a collection of \(k\) distinct hyperedges. A subset \(\mathcal{T}\subseteq \mathcal{S}\) of \(\ell \) distinct hyperedges is an \((\ell ,b)\)-independent set of \(\mathcal{S}\) if the following two properties hold: (1) Every pair of distinct hyperedges in \(\mathcal{T}\) is at distance at least \(2\), namely for every pair \(T_i\ne T_j \in \mathcal{T}\) and every \(S\in \mathcal{S}\),

$$\begin{aligned} T_i\cap S=\emptyset \text { or } T_j\cap S=\emptyset ; \end{aligned}$$

and (2) For every \(T_i\in \mathcal{T}\) and \(S\ne T_i\) in \(\mathcal{S}\) we have

$$\begin{aligned} |T_i\cap S|< b. \end{aligned}$$

A graph is \((k,\ell ,b)\)-independent if every set of hyperedges of size larger than \(k\) has an \((\ell ,b)\)-independent set.

Our key lemma shows that good independence and large algebraic degree guarantee resistance against heavy linear tests.

Lemma 3.5

If an \((m,n,d)\)-graph \(G\) is \((k,\ell ,b)\)-independent and \(P\) has an algebraic degree of at least \(b\), then every linear test of size at least \(k\) has bias of at most \(\frac{1}{2}e^{-2\ell /2^d}\).

Proof

Fix some test \(\mathcal{S}=(S_1,\ldots ,S_k)\) of size \(k\), and let \(\mathcal{T}=(T_{1},\ldots ,T_{\ell })\) be an \((\ell ,b)\)-independent set of \(\mathcal{S}\). Fix an arbitrary assignment \(\sigma \) for all the input variables which do not participate in any of the \(T_i\)’s and choose the other variables uniformly at random. In this case, we can partition the output of the test \(y\) into \(\ell \) summands over \(\ell \) disjoint blocks of variables, namely

$$\begin{aligned} y=\sum _{i\in [k]} P(x_{S_i})=\sum _{i\in [\ell ]} z_i(x_{T_i}), \end{aligned}$$

where

$$\begin{aligned} z_i(x_{T_i})=P(x_{T_i})+\sum _{S\ne T_i:\, S\cap T_i\ne \emptyset } P(x_{S\cap T_i},\sigma _{S\setminus T_i}), \end{aligned}$$

and the sums are over \(\mathbb {F}_2\). We need two observations: (1) The random variables \(z_i\)’s are statistically independent (as each of them depends on a disjoint block of inputs); and (2) the r.v. \(z_i\) is non-constant and, in fact, it takes each of the two possible values with probability at least \(2^{-d}\). To prove the latter fact, it suffices to show that \(z_i(x)\) is a non-constant polynomial (over \(\mathbb {F}_2\)) of degree at most \(d\). Indeed, recall that \(z_i\) is the sum of the polynomial \(P(x_{T_i})\) whose degree is in \([b,d]\), and polynomials of the form \(P(x_{S\cap T_i},\sigma _{S\setminus T_i})\) whose degree is smaller than \(b\) (as \(|S\cap T_i|<b\)). Therefore, the degree of \(z_i\) is in \([1,d]\).

To conclude the proof, we note that the parity of \(\ell \) independent coins, each with expectation in \((\delta ,1-\delta )\), has bias of at most \(\frac{1}{2}(1-2\delta )^{\ell }\). (See, e.g., [28]). \(\square \)

We want to show that a random graph is likely to be \((k,\ell ,2)\)-independent.

Lemma 3.6

For every positive \(\varepsilon \) and \(\delta \), a random \((n^{1+\varepsilon },n,d)\)-graph is, whp, \((n^{2\varepsilon +\delta },n^{\delta /2},2)\)-independent.

Proof

We will need the following claim. Call a hyperedge \(S\) \(b\)-intersecting if there exists another hyperedge \(S'\) in the graph for which \(|S'\cap S|\ge b\). We first bound the number of \(b\)-intersecting hyperedges.

Claim 3.7

Let \(b\) be a constant. Then, in a random \((m=n^{1+\varepsilon },n,d)\)-graph, whp, the number of \(b\)-intersecting hyperedges is at most \(n^{2(1+\varepsilon )-b}\log n\).

Hence, whp, at most \(O(n^{2\varepsilon }\log n)\) of the hyperedges are 2-intersecting, and for \(\varepsilon <1/4\) there are \(o(\sqrt{n})\) such hyperedges.

Proof (of Claim 3.7)

Let \(X\) be the random variable that counts the number of \(b\)-intersecting hyperedges. First, we bound the expectation of \(X\) by \(m^2 d^{2b}/n^b=d^{2b}\cdot n^{2(1+\varepsilon )-b}\). To prove this, it suffices to bound the expected number of pairs \(S_i,S_j\) which \(b\)-intersect. Each such pair \(b\)-intersects with probability at most \(d^{2b}/n^b\), and so, by linearity of expectation, the expected number of intersecting pairs is at most \(m^2 d^{2b}/n^b\). Now, by applying Markov’s inequality, we have that \(\Pr [X>\frac{\log n}{d^{2b}} {{\mathrm{\mathsf {E}}}}[X]]<d^{2b}/\log n=o(1)\), and the claim follows. (A stronger concentration can be obtained via a martingale argument.) \(\square \)

We can now prove Lemma 3.6. Assume, without loss of generality, that \(\varepsilon <1\). First observe that, whp, all the input nodes in \(G\) have degree at most \(2dn^{\varepsilon }\); indeed, by a multiplicative Chernoff bound, the probability that a single node has larger degree is exponentially small in \(n^{\varepsilon }\). We condition on this event and on the event that there are no more than \(r=n^{2\varepsilon }\log n\) \(2\)-intersecting hyperedges. Fix a set of \(k=n^{2\varepsilon +\delta }\) hyperedges. We extract an \((\ell ,2)\)-independent set by throwing away the \(2\)-intersecting hyperedges, and then by iteratively inserting a hyperedge \(T\) into the independent set and removing all the hyperedges \(S\) that share a node with \(T\), as well as the hyperedges that share a node with such an \(S\). At the beginning, we remove at most \(r\) hyperedges, and in each iteration, we remove at most \((2d^2n^{\varepsilon })^2\) hyperedges; hence, there are at least \(\ell \ge \frac{k-r}{4d^4n^{2\varepsilon }}>n^{\delta /2}\) hyperedges in the independent set. \(\square \)
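The extraction procedure in the proof amounts to a simple greedy algorithm, sketched below; the hyperedges in the usage example are illustrative and not drawn from a random graph:

```python
def extract_independent(hyperedges):
    """Greedy extraction of an (l,2)-independent subset, as in the proof sketch.

    Step 1: discard every 2-intersecting hyperedge (one sharing >= 2 nodes
    with another hyperedge).  Step 2: repeatedly keep a hyperedge T and
    discard all hyperedges within distance 2 of T (those sharing a node
    with T, and those sharing a node with such a neighbor)."""
    edges = [frozenset(S) for S in hyperedges]
    survivors = [S for i, S in enumerate(edges)
                 if not any(len(S & T) >= 2 for j, T in enumerate(edges) if i != j)]
    pool, independent = list(survivors), []
    while pool:
        T = pool.pop(0)
        independent.append(T)
        neighbors = [S for S in pool if S & T]
        pool = [S for S in pool
                if not (S & T) and not any(S & N for N in neighbors)]
    return independent

# Toy instance: {0,1,2} and {0,1,13} 2-intersect and are discarded up front;
# {0,3,9} is removed when {3,4,5} is inserted, since they share node 3.
picked = extract_independent([{0, 1, 2}, {3, 4, 5}, {6, 7, 8},
                              {0, 3, 9}, {10, 11, 12}, {0, 1, 13}])
assert [sorted(S) for S in picked] == [[3, 4, 5], [6, 7, 8], [10, 11, 12]]
```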

Combining the lemmas together we get:

Corollary 3.8

Fix some positive \(\varepsilon \) and \(\delta \). If \(P\) has algebraic degree at least \(2\) and \(m=n^{1+\varepsilon }\), then, whp over the choice of a random \((m,n,d)\)-graph, the function \(f_{G,P}\) has sub-exponential bias (i.e., at most \(\exp (-\Omega (n^{\delta }))\)) against linear tests of size at least \(n^{2\varepsilon +2\delta }\).

By combining Corollaries 3.4 and 3.8, we obtain Theorem 1.1.

4 Linear Tests Break Degenerate Predicates

In this section, we prove Theorem 1.2; that is, we show that the assumptions that \(P\) is nonlinear and 2-resilient are necessary for \(P\) to be a hard predicate. Clearly, the assumption that \(P\) is nonlinear is necessary even when \(m = n + 1\).

When \(m \ge Kn\) for a sufficiently large constant \(K\) (depending on \(d\)), it follows from work of Cryan and Miltersen [17] that if \(P\) is not 1-resilient, then for any \(f:\{\pm 1\}^n \rightarrow \{\pm 1\}^m\), the output of \(f\) is distinguishable from uniform with constant advantage by some linear test. When \(P\) is 1-resilient but not 2-resilient, Mossel, Shpilka, and Trevisan show that \(f\) is distinguishable from uniform by a polynomial-time algorithm, but not by one that implements a linear test.

Here, we show that if \(P\) is not 2-resilient, then the output of \(f_{G,P}\) is distinguishable by linear tests with non-negligible advantage with high probability over the choice of \(G\).

Claim 4.1

Let \(K>4\) and \(d\in \mathbb {N}\) be constants. Assume that the predicate \(P:\{0,1\}^d\rightarrow \{0,1\}\) is unbiased and 1-resilient but not 2-resilient, i.e., \(\left| {{\mathrm{\mathsf {E}}}}[P(z)z_1z_2] \right| = \alpha > 0\). Then for every \(\ell = o(\log n)\), with probability \(1 - (2^{-\Omega (\ell )} + d\ell /n)\) over the choice of a random \((Kn,n,d)\)-graph \(G\), there exists a linear test that distinguishes the output of \(f_{G,P}\) from random with advantage \(\alpha ^\ell \).

Proof

Let \(H\) be the directed graph with vertices \(\{1, \dots , n\}\) where every hyperedge \((i_1, i_2, \dots , i_d)\) in \(G\) induces the edge \((i_1, i_2)\) in \(H\).

Let \(\ell \) be the length of the shortest directed cycle in \(H\), and without loss of generality assume that this cycle consists of the inputs \(1, 2, \dots , \ell \) in that order. Let \(z_i\) denote the output that involves inputs \(i\) and \(i + 1\), for \(i\) ranging from \(1\) to \(\ell \) (where \(i + 1\) is taken modulo \(\ell \)), and let \(S_i\) be the corresponding hyperedge. With probability at least \(1 - d\ell /n\), each input \(i \in \{1, \dots , \ell \}\) does not participate in any hyperedge besides \(S_{i-1}\) and \(S_{i}\), and all other inputs participate in at most one of the hyperedges \(S_1, \dots , S_\ell \).

We now calculate the bias of the linear test that computes \(z_1 \oplus \dots \oplus z_\ell \). For simplicity, we will assume that \(d = 3\); larger values of \(d\) can be handled analogously but the notation is more cumbersome. We will denote the entries in \(S_i\) by \(i\), \(i+1\) and \(i'\). Then, the Fourier expansion of \(z_i(x_{S_i})\) has the form

$$\begin{aligned} z_i(x_{S_i}) = \alpha x_ix_{i+1} + \beta x_ix_{i'} + \gamma x_{i+1}x_{i'} + \delta x_ix_{i+1}x_{i'}. \end{aligned}$$

The expectation \({{\mathrm{\mathsf {E}}}}[z_1(x_{S_1})\dots z_\ell (x_{S_\ell })]\) can be written as a sum of \(4^\ell \) products of monomials drawn from the above expansions. The only product that does not vanish is the one containing all the \(\alpha \)-terms, namely

$$\begin{aligned} {{\mathrm{\mathsf {E}}}}\Bigl [\prod \nolimits _{i=1}^\ell \alpha x_ix_{i+1}\Bigr ] = \alpha ^\ell . \end{aligned}$$

All the other products of monomials contain at least one unique term of the form \(x_{i'}\), and this causes the expectation to vanish.

It remains to argue that with high probability \(\ell \) is not too large. We show that with probability \(1 - O((4/K)^{\ell })\), \(H\) has a directed cycle of length \(\ell \), as long as \(\ell < \log _{2K}(n/4)\). Let \(X\) denote the number of directed cycles of length \(\ell \) in \(H\). The number of potential directed cycles of length \(\ell \) in \(H\) is \(n(n-1)\cdots (n-\ell + 1) \ge (n - \ell )^{\ell }\). Each of these occurs in \(H\) with probability at least

$$\begin{aligned} (Kn)(Kn - 1)\dots (Kn - \ell + 1) \Bigl (\frac{1}{n(n-1)}\Bigr )^\ell \Bigl (1 - \frac{1}{n(n-1)}\Bigr )^{Kn - \ell } \ge \Bigl (\frac{Kn - \ell }{n^2}\Bigr )^{\ell }. \end{aligned}$$

Therefore, \({{\mathrm{\mathsf {E}}}}[X] \ge (K/4)^{\ell }\). The variance can be upper bounded as follows. The number of pairs of cycles of length \(\ell \) that intersect in \(i\) edges is at most \(\left( {\begin{array}{c}\ell \\ i\end{array}}\right) n^{2\ell - i - 1}\), and the covariance of the indicators for these cycles is at most \((K/n)^{2\ell - i}\). Adding all the covariances up as \(i\) ranges from \(1\) to \(\ell \), it follows that

$$\begin{aligned} {{\mathrm{\mathsf {Var}}}}[X] \le {{\mathrm{\mathsf {E}}}}[X] + \sum _{i=1}^\ell \left( {\begin{array}{c}\ell \\ i\end{array}}\right) n^{2\ell - i - 1} \Bigl (\frac{K}{n}\Bigr )^{2\ell - i} \le {{\mathrm{\mathsf {E}}}}[X] + \frac{2^{\ell }K^{2\ell }}{n}. \end{aligned}$$

By Chebyshev’s inequality,

$$\begin{aligned} \Pr [X = 0] \le \frac{{{\mathrm{\mathsf {Var}}}}[X]}{{{\mathrm{\mathsf {E}}}}[X]^2} < \frac{2}{{{\mathrm{\mathsf {E}}}}[X]} \end{aligned}$$

as long as \(\ell < \log _{2K}(n/4)\). \(\square \)
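The advantage \(\alpha ^\ell \) computed in the proof can be checked by exhaustive enumeration on a toy instance. The predicate below, \(P(x_1,x_2,x_3,x_4)=x_1\oplus x_2\oplus (x_3\wedge x_4)\), is our own illustrative choice (not taken from the text): it is unbiased and 1-resilient but not 2-resilient, with \(\alpha = 1/2\). Placing it on a cycle of \(\ell \) outputs, with a fresh pair of auxiliary inputs per output, the test \(z_1 \oplus \dots \oplus z_\ell \) has bias exactly \(\alpha ^\ell \):

```python
from itertools import product

def cycle_test_bias(l):
    """Exact bias of the linear test z_1 xor ... xor z_l, where output
    z_i = x_i xor x_{(i+1) mod l} xor (a_i and b_i) with fresh inputs
    a_i, b_i per output (a predicate of locality d = 4)."""
    n_bits = 3 * l  # l cycle inputs x_i plus l fresh pairs (a_i, b_i)
    even = odd = 0
    for bits in product((0, 1), repeat=n_bits):
        x, rest = bits[:l], bits[l:]
        t = 0
        for i in range(l):
            a, b = rest[2 * i], rest[2 * i + 1]
            t ^= x[i] ^ x[(i + 1) % l] ^ (a & b)
        if t == 0:
            even += 1
        else:
            odd += 1
    return abs(even - odd) / (even + odd)

# The cycle terms x_i xor x_{i+1} telescope to zero, leaving the xor of the
# l independent AND gates, each with bias alpha = 1/2; the test bias is alpha^l.
assert cycle_test_bias(3) == 0.5 ** 3
```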

5 Implications of Small-Bias

For local functions with large stretch, small bias seems like a good approximation for cryptographic pseudorandomness. Specifically, we are not aware of any local function \(f_{G,P}\) with linear stretch that fools linear distinguishers but can be distinguished by some polynomial-time adversary. One may conjecture that if \(f_{G,P}\) fools linear adversaries for most graphs, then it also fools polynomial-time adversaries. In other words, local functions are too simple to “separate” the two notions. We attempt to support this view by showing that the small-bias property, by itself, leads to robustness against other classes of attacks.

First, we observe that, for local functions, \(k\)-wise independence follows directly from \(\varepsilon \)-bias. (This is not the case for non-local functions.)

Lemma 5.1

Let \(f:\{0,1\}^n\rightarrow \{0,1\}^{m}\) be a \(d\)-local function which is \(2^{-kd}\)-biased. Then, it is also \(k\)-wise independent.

Proof

Assume toward a contradiction that \(f\) is not \(k\)-wise independent. Then, there exists a set of \(k\) outputs \(T\) and a linear distinguisher \(L\) for which

$$\begin{aligned} \varepsilon =\left| \Pr _{y\mathop {\leftarrow }\limits ^{R}f(\mathcal {U}_n)}[L(y_{T})=1]-\Pr [L(\mathcal {U}_k)=1] \right| >0, \end{aligned}$$

where \(y_T\) denotes the restriction of the string \(y\) to the indices in \(T\). Since \(f\) is \(d\)-local, \(y_{T}\) is determined by at most \(kd\) input bits, and therefore \(\varepsilon \) is a positive integer multiple of \(2^{-kd}\); in particular, \(\varepsilon \ge 2^{-kd}\). \(\square \)
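The granularity argument can be illustrated on a toy 2-local function (our own hypothetical example, not one from the text): the restriction of \(f\) to \(k\) outputs is determined by at most \(kd\) input bits, so every event probability is an integer multiple of \(2^{-kd}\), and hence any nonzero bias is at least \(2^{-kd}\).

```python
from itertools import product

D = 2  # locality: each output below reads at most D input bits

def f(x):
    # Hypothetical 2-local function f: {0,1}^4 -> {0,1}^3.
    return (x[0] ^ x[1], x[1] & x[2], x[2] | x[3])

def restriction_probs(T):
    """Exact distribution of y_T = f(x)_T for uniform x in {0,1}^4."""
    counts = {}
    for x in product((0, 1), repeat=4):
        y = f(x)
        key = tuple(y[i] for i in T)
        counts[key] = counts.get(key, 0) + 1
    return {key: c / 16 for key, c in counts.items()}

# k outputs of a D-local function read at most k*D input bits, so every
# probability is an integer multiple of 2^-(k*D); any nonzero deviation
# from uniform is therefore at least 2^-(k*D).
k = 2
probs = restriction_probs(T=(0, 1))
assert all((p * 2 ** (k * D)).is_integer() for p in probs.values())
```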

Note that the proof of our main theorem establishes \(k\)-wise independence as an intermediate step (Sect. 3.1). However, the above lemma is stronger in the sense that it holds for every fixed graph and every output length, including ones that are not covered by the main theorem.

By plugging in known results about \(k\)-wise independent distributions, it immediately follows that if a local function is sufficiently small-biased, then it is pseudorandom against \(\mathbf {AC^0}\) circuits [15], linear threshold functions over the reals [18], and degree-2 threshold functions over the reals [19].

Moreover, attacks on local functions, which are actively studied in the context of algorithms for constraint-satisfaction problems, appear to be based mainly on “local” heuristics (DPLL, message-passing algorithms, random-walk-based algorithms) or on linearization [9]. Hence, it appears that in the context of local functions, the small-bias property already covers all “standard” attacks. We support this intuition by showing that small-biased local functions (on a random-looking input–output graph) are not merely \(k\)-wise independent, but have a stronger property: Even after reading an arbitrary set of \(t\) outputs, the posterior distribution on every set of \(\ell \) inputs, while not uniform, still has \(h\) bits of min-entropy. We refer to this property as \((t,\ell ,h)\)-robustness.

Lemma 5.2

Suppose that \(P\) is a predicate for which \(f_{G,P}:\{0,1\}^n\rightarrow \{0,1\}^{m}\) is \(k\)-wise independent, whp over the choice of a random \((m,n,d)\) graph \(G\). Then, whp over the choice of a random \((m'=\Omega (m),n,d)\) graph \(H\), the function \(f_{H,P}:\{0,1\}^n\rightarrow \{0,1\}^{m'}\) is \((t=\Omega (k),\ell ,h)\)-robust, where \(h=\min \left( \ell , \Omega (m\cdot (\ell /n)^d), \Omega (k)\right) \).

(See Sect. 6 for more details and proof.) Robustness holds with polynomial parameters \((t=n^{\alpha },\ell =n^{\beta },h=n^{\gamma })\) when \(m=n^{1+\varepsilon }\), and with linear parameters when \(m=O(n)\). The notion of robustness is the main technical tool used by Cook et al. [16] to prove that myopic backtracking algorithms cannot invert \(f_{G,P}\) in polynomial time (for the case \(m=n\)). By Lemma 5.2, robustness follows directly “for free” from small-bias, and thus, we can derive a similar lower bound for larger output lengths (but for a smaller class of predicates). (See Sect. 6 for details.)