Abstract
Local pseudorandom generators expand a short random string into a long pseudorandom string, such that each output bit depends on a constant number d of input bits. Due to its extreme efficiency features, this intriguing primitive enjoys a wide variety of applications in cryptography and complexity. In the polynomial regime, where the seed is of size n and the output of size \(n^{\textsf {s}}\) for \(\textsf {s}> 1\), the only known solution, commonly known as Goldreich’s PRG, proceeds by applying a simple d-ary predicate to public random size-d subsets of the bits of the seed.
While the security of Goldreich’s PRG has been thoroughly investigated, with a variety of results deriving provable security guarantees against classes of attacks in some parameter regimes and necessary criteria to be satisfied by the underlying predicate, little is known about its concrete security and efficiency. Motivated by its numerous theoretical applications and the hope of getting practical instantiations for some of them, we initiate a study of the concrete security of Goldreich’s PRG, and evaluate its resistance to cryptanalytic attacks. Along the way, we develop a new guess-and-determine-style attack, and identify new criteria which refine existing criteria and capture the security guarantees of candidate local PRGs in a more fine-grained way.
Keywords
 Pseudorandom generators
 Algebraic attacks
 Guess-and-determine
 Gröbner basis
1 Introduction
One of the most fundamental problems in cryptography is the question of what makes an efficiently computable function hard to invert. The quest for the simplest design which leads to a primitive resisting all known attacks is at the heart of both symmetric and asymmetric cryptography: while we might be able to build seemingly secure primitives by relying on more and more complex designs to thwart cryptanalysis attempts, such a “security by obscurity” approach is unsatisfying. Instead, as advocated almost two decades ago by Goldreich [Gol00], we should seek to construct the simplest possible function that we do not know how to invert efficiently. Only this way, Goldreich argued, can we better understand what really underlies the security of cryptographic constructions.
Random Local Functions. In an attempt to tackle this fundamental problem, Goldreich suggested a very simple candidate one-way function as a promising target for cryptanalysis: let (n, m) be integers, and let \((\sigma ^1, \ldots , \sigma ^m)\) be a list of m subsets of [n], such that each subset is of small size: for any \(i\le m\), \(|\sigma ^i| = c(n)\), where \(c(n) \ll n\) (in actual instantiations, c(n) can for example be logarithmic in n, or even constant). Fix a simple predicate \(P:{\{0,1\}} ^{c(n)} \mapsto {\{0,1\}} \), and define the function \(f:{\{0,1\}} ^n\mapsto {\{0,1\}} ^m\) as follows: on input \(x\in {\{0,1\}} ^n\), for any subset \(\sigma \) of [n], let \(x[\sigma ]\) denote the subset of the bits of x indexed by \(\sigma \). Compute f(x) as \(P(x[\sigma ^1])\cdots P(x[\sigma ^m])\) (that is, f(x) is computed by applying the predicate P to all subsets of the bits of x indexed by the sets \(\sigma ^1, \ldots , \sigma ^m\)). We call random local functions the functions obtained by instantiating this template.
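The template above can be illustrated with a minimal sketch. The helper names (`sample_subsets`, `random_local_function`, `parity_and`), the 0-based indexing, the toy 3-ary predicate and the tiny parameters are assumptions made for illustration only; they are not taken from Goldreich's proposal and offer no security.

```python
import random

def sample_subsets(n, m, c, rng):
    """Sample m ordered subsets of {0, ..., n-1}, each made of c distinct indices."""
    return [rng.sample(range(n), c) for _ in range(m)]

def random_local_function(x, subsets, predicate):
    """Compute f(x): apply the predicate to the bits of x selected by each subset."""
    return [predicate([x[i] for i in sigma]) for sigma in subsets]

def parity_and(bits):
    """Toy 3-ary predicate x1 + x2*x3 over F_2 (hypothetical, illustration only)."""
    return bits[0] ^ (bits[1] & bits[2])

rng = random.Random(0)          # fixed seed so the example is reproducible
n, m, c = 16, 24, 3             # toy sizes, far too small for any security
subsets = sample_subsets(n, m, c, rng)
x = [rng.randrange(2) for _ in range(n)]
y = random_local_function(x, subsets, parity_and)
```

Each output bit reads only c input bits, which is what makes the function local; the subsets are public, and only x is secret.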
In his initial proposal, Goldreich advocated instantiating the above methodology with \(m\approx n\) and \(c(n) = O(\log n)\), and conjectured that if the subsets \((\sigma ^1,\ldots ,\sigma ^m)\) form an expander graph^{Footnote 1}, and for an appropriate choice of the predicate P, it should be infeasible to invert the above function f in polynomial time. While setting c(n) to \(O(\log n)\) offers stronger security guarantees, the more extreme design choice \(c(n) = O(1)\) (also discussed in Goldreich’s paper) enhances the above candidate with an appealing feature: it enjoys constant input locality (which puts it into the complexity class \(\mathsf {NC} ^0\)), hence it is highly parallelizable (it can be computed in constant parallel time). It appeared in subsequent works that a stronger variant of Goldreich’s conjecture, which considers \(m \gg n\) and claims that f is in fact a pseudorandom generator, was of particular interest; we will elaborate on this later on.
Local Pseudorandom Generators. The question of whether cryptographic primitives can exist in weak complexity classes such as \(\mathsf {NC} ^0\) has attracted a lot of attention in the cryptographic community. A primitive of particular interest, which has been the focus of most works on the subject, is the notion of pseudorandom generators (PRGs), which are functions \(G:{\{0,1\}} ^n\mapsto {\{0,1\}} ^m\) extending a short random seed into a longer, pseudorandom string. The existence of PRGs in \(\mathsf {NC} ^0\) was first considered by Cryan and Miltersen in [CM01]. Remarkably, it was shown by Applebaum, Ishai, and Kushilevitz [AIK04, AIK08] that cryptographically secure pseudorandom generators (with linear stretch \(m = O(n)\)) exist in a complexity class as low as \(\mathsf {NC} _4^0\) (the class of constant-depth, poly-size circuits where each output bit depends on at most 4 input bits), under widely believed standard assumptions for the case of PRGs with sublinear stretch (such as factorization, or discrete logarithm), and under a specific intractability assumption related to the hardness of decoding “sparsely generated” linear codes, for the case of PRGs with linear stretch. While this essentially settled the question of the existence of linear-stretch PRGs in \(\mathsf {NC} ^0\), an intriguing open question remained: could PRGs in \(\mathsf {NC} ^0\) have polynomial stretch, \(m = \mathsf {poly} (n)\)?
Some early negative results were given by Cryan and Miltersen [CM01] (who ruled out the existence of PRGs in \(\mathsf {NC} ^0_3\) with stretch \(m > 4n\)) and Mossel, Shpilka, and Trevisan [MST03] (who ruled out the existence of PRGs in \(\mathsf {NC} ^0_4\) with stretch \(m > 24n\)). The authors of [CM01] also conjectured that any candidate PRG with superlinear stretch in \(\mathsf {NC} ^0\) would be broken by simple, linear distinguishing tests^{Footnote 2}; this conjecture was refuted in [MST03], who gave a concrete candidate PRG in \(\mathsf {NC} ^0\), by instantiating a random local function with \(c = 5\), and the predicate
\[P_5(x_1,x_2,x_3,x_4,x_5) = x_1 + x_2 + x_3 + x_4x_5,\]
where \(+\) denotes addition in \(\mathbb {F}_2\), i.e. the xor.
They proved that this PRG fools linear tests, even when m is a (sufficiently small) polynomial in n. By the previously mentioned negative result on PRGs in \(\mathsf {NC} ^0_4\), this candidate PRG, which has locality 5, achieves the best possible locality. Recently, there has been a renewed interest in the study of this local PRG, now commonly known as Goldreich’s PRG, and its generalizations [BQ09, App12, OW14, CEMT14, App15, ABR16, AL16, IPS08, LV17, BCG+17].
1.1 Implications of Polynomial-Stretch Local Pseudorandom Generators
The original motivation for the study of local pseudorandom generators was the intriguing possibility of designing cryptographic primitives that can be evaluated in constant time, using polynomially many cores. While this is already a strong motivation in itself, it was observed in several works that the existence of (poly-stretch) local PRGs had a number of non-trivial implications, and is at the heart of feasibility results for several high-end cryptographic primitives. We provide below a brief overview.

Secure computation with constant computational overhead. In the recent work [IKOS08], the authors explored the possibility of computing cryptographic primitives with essentially optimal efficiency, namely, constant overhead over a naive insecure implementation of the same task. One of their main results establishes the existence of constant-overhead two-party computation protocols for any boolean circuit, assuming the existence of poly-stretch local PRGs (and oblivious transfers). In a recent work [ADI+17a], this result was extended to arithmetic circuits, using an arithmetic generalization of local PRGs.

Indistinguishability obfuscation (iO). Introduced in the seminal paper of Barak et al. [BGI+01], iO is a primitive that has received considerable attention from the crypto community in the past years, as a long sequence of works starting with [SW14] has demonstrated that iO has tremendous theoretical implications, to the point that it is often referred to as being a “crypto-complete” primitive. All known candidate constructions of iO rely, directly or indirectly, on a primitive called a k-linear map, for some degree k. Recently, a sequence of papers (culminating with [LT17]) has attempted to find out the minimal k for which a k-linear map would imply the existence of iO (with the ultimate goal of reaching \(k=2\), as bilinear maps are well-understood objects). These works have established a close relation between this value k and the existence of pseudorandom generators with poly-stretch and locality k.^{Footnote 3}

MPC-friendly primitives. Historically, the design of symmetric cryptographic primitives (such as block ciphers, pseudorandom generators, and pseudorandom functions) has been motivated by efficiency considerations (memory consumption, hardware compatibility, ease of implementation, ...). The field of multiparty computation (MPC), where parties want to jointly evaluate a function on secret inputs, has led to the emergence of new efficiency considerations: the efficiency of secure evaluation of symmetric primitives is strongly related to parameters such as the circuit depth of the primitive, and the number of its AND gates. This observation has motivated the design of MPC-friendly symmetric primitives in several recent works (\(\textit{e.g.}\) [ARS+15, CCF+16, MJSC16, GRR+16]). Local pseudorandom generators make very promising candidate MPC-friendly PRGs (and lead, through the GGM transform [GGM84], to promising candidates for MPC-friendly pseudorandom functions). Secure evaluation of such symmetric primitives enjoys a wide variety of applications.

Cryptographic capsules. In [BCG+17], Boyle et al. studied the recently introduced primitive of homomorphic secret sharing (HSS). An important implication of HSS is that, assuming the existence of a local PRG with poly-stretch, one can obtain multiparty computation protocols in the preprocessing model^{Footnote 4} where the amount of communication between the parties is considerably smaller than the circuit size of the function, by constructing a primitive called cryptographic capsule which, informally, allows compressing correlated (pseudo)random coins. MPC protocols with low-communication preprocessing have numerous appealing applications; however, the efficiency of the constructions of cryptographic capsules strongly depends on the locality and seed size of the underlying local PRG (both should be as small as possible to get a reasonably efficient instantiation).
In addition to the above (non-exhaustive) overview, we note that the existence of poly-stretch local pseudorandom generators also has interesting complexity-theoretic implications. For example, they have been shown in [AIK08] to imply strong (tight) bounds on the average-case inapproximability of constraint satisfaction problems such as Max-3SAT.
1.2 On the Security of Goldreich’s PRG
In this section, we provide a brief overview of the state of the art regarding the security of local pseudorandom generators. For a more detailed and well-written overview dating from 2015, we refer the reader to [App15].
Positive Results: Security Against Classes of Attacks. The seminal paper of Goldreich [Gol00] made some preliminary observations on necessary properties for a local one-way function. Namely, the predicate P must satisfy some non-degeneracy properties, such as being non-linear (otherwise, one could invert the function using Gaussian elimination). It also noted that to avoid a large class of natural “backtracking” attacks, which make a guess on the values of bit inputs based on local observations and attempt to combine many local solutions into a global solution, the subsets \((S_1, \ldots , S_m)\) should be sufficiently expanding: for some k, every k subsets should cover \(k + \varOmega (n)\) elements of [n]. The security of Goldreich’s candidate one-way function against a large class of backtracking algorithms was formally analyzed in [AHI05, CEMT14], where it was proven that two restricted types of backtracking algorithms (called “drunk” and “myopic” backtracking algorithms) take exponential time to invert the function (with high probability). They also ran experiments to heuristically evaluate its security against SAT solvers (and observed experimentally an exponential increase in running time as a function of the input length).
The pseudorandomness of random local functions was originally analyzed in [MST03]. They proved (among other results) that the random local function instantiated with the predicate \(P_5: (x_1,x_2,x_3,x_4,x_5)\mapsto x_1+ x_2+ x_3+x_4x_5\) fools all \(\mathbb {F} _2\)-linear distinguishers for a stretch up to \(m(n) = n^{1.25-\varepsilon }\) (for an arbitrarily small constant \(\varepsilon \)). This result was later extended to a larger stretch \(n^{1.5-\varepsilon }\) in [OW14]. In the same paper, the authors proved that this candidate PRG is also secure against a powerful class of attacks, the Lasserre/Parrilo semidefinite programming (SDP) hierarchy, up to the same stretch. Regarding security against \(\mathbb {F} _2\)-linear attacks, a general dichotomy theorem was proven in [ABR12], which identified a class of non-degenerate predicates and showed that for most graphs, a local PRG instantiated with a non-degenerate predicate is secure against linear attacks, and for most graphs, a local PRG instantiated with a degenerate predicate is insecure against linear distinguishers. In general, to fool \(\mathbb {F} _2\)-linear distinguishers, the predicate should have high algebraic degree (in particular, a random local function instantiated with a degree-\(\ell \) predicate cannot be pseudorandom for a stretch \(\ell \) (\(m\equiv n^\ell \)), as it is broken by a straightforward Gaussian elimination attack).
Being pseudorandom seems to be a much stronger security property than being one-way. Nevertheless, in the case of random local functions, it was shown in [App12] that the existence of local pseudorandom generators follows from the existence of one-way random local functions (with sufficiently large output size).
Negative Results. The result of O’Donnell and Witmer [OW14] regarding security against SDP attacks is almost optimal, as attacks from this class are known to break the candidate for a stretch \(\varTheta (n^{1.5}\log n)\). More generally, optimizing SDP attacks leads to a poly-time inversion algorithm for any predicate P which is (even slightly) correlated with some number c of its inputs, as soon as the output size exceeds \(m \in \varOmega (n^{c/2}+n\log n)\) [OW14, App15]. Therefore, a good predicate should have high resiliency (i.e. it should be k-wise independent, for a k as large as possible). This result shows, in particular, that a random local function with a constant locality d and with an output size \(m > \mathsf {poly} (d)\cdot n\) is insecure when instantiated with a uniformly random predicate P. Combining this observation with the result of Siegenthaler [Sie84], which studied the correlation of d-ary predicates, gives a poly-time inversion algorithm for any random local function implemented with a d-ary predicate, and with an output size \(m \in \varOmega (n^{1/2\lfloor 2d/3\rfloor }\log n)\).
Bogdanov and Qiao [BQ09] studied the security of random local functions when the output is sufficiently larger than the input (i.e., \(m \ge Dn\), for a large constant D). They proved that for sufficiently large D, inverting a random local function could be reduced to finding an approximate inverse (i.e. finding any \(x'\) which is close to the inverse x in Hamming distance), by showing how to invert the function with high probability given an advice \(x'\) close to x. For random local functions with an output size polynomial in n, \(m = n^{\textsf {s}}\) for some \(\textsf {s}\), this leads to a subexponential-time attack [App15]: fix a parameter \(\varepsilon \), assign random values to the \((1-2\varepsilon )n\) first inputs, and create a list that enumerates over all possible assignments for the remaining \(2\varepsilon n\) variables. Then, with good probability, the list contains a value \(x'\) that agrees with the preimage x on a \((1/2+\varepsilon )\) fraction of the coordinates. By applying the reduction of [BQ09], using each element of the list as an advice string, one recovers the preimage in time \(\mathsf {poly} (n)\cdot 2^{2\varepsilon n}\) provided that \(m = \varOmega (n/\varepsilon ^{2d})\) (d is the arity of the predicate P). In the case of the 5-ary predicate \(P_5\), this leads to an attack in subexponential time \(2^{O(n^{1-(\textsf {s}-1)/2d})}\) (\(\textit{e.g.}\) using \(\textsf {s}= 1.45\) gives an attack in time \(2^{O(n^{0.955})}\)).
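The exponent of this attack can be rederived with a short calculation. The following is only a sketch: it ignores the constants hidden in the \(\varOmega \) and O notations and takes the smallest \(\varepsilon \) allowed by the condition on m.

```latex
% Condition of the reduction, with output size m = n^s:
%   n^s = \Omega(n / \varepsilon^{2d})
\[
  \varepsilon^{2d} \approx n^{1-\textsf{s}}
  \quad\Longrightarrow\quad
  \varepsilon \approx n^{-(\textsf{s}-1)/2d}
  \quad\Longrightarrow\quad
  2^{2\varepsilon n} = 2^{O\left(n^{1-(\textsf{s}-1)/2d}\right)} .
\]
% Numerical instance for P_5 (d = 5) and s = 1.45:
\[
  1 - \frac{\textsf{s}-1}{2d} = 1 - \frac{0.45}{10} = 0.955 .
\]
```

The exponent stays close to 1 for any stretch of practical interest, which is why this attack is only slightly subexponential.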
By the previous observations, we know that the predicate of a random local function must have high resiliency and high algebraic degree to lead to a pseudorandom function. A natural question is whether this characterization is also sufficient; this question was answered negatively in [AL16], who proved that a predicate must also have high bit-fixing degree to fool linear attacks.^{Footnote 5} In particular, this observation disproved a previous conjecture of Applebaum that XOR-AND predicates (which are natural generalizations of the predicate \(P_5\)) could lead to local PRGs with stretch greater than 2 that fool all linear tests (see [AL16, Corollary 1.3]).
In the same work, Applebaum and Lovett considered the class of algebraic attacks on local pseudorandom functions, which are incomparable to linear attacks. An algebraic attack against a function \(f:{\{0,1\}} ^n\mapsto {\{0,1\}} ^m\) starts with an output y and uses it to initialize a system of polynomial equations over the input variables \(x=(x_1,\ldots ,x_n)\). The system is further manipulated and extended until a solution is found or until the system is refuted. Applebaum and Lovett proved that a predicate must also have high rational degree to fool algebraic attacks (a predicate P has rational degree e if e is the smallest integer for which there exist degree-e polynomials Q and R, not both zero, such that \(PQ=R\)). Indeed, if \(e<\textsf {s}\) then P is not \(\textsf {s}\)-pseudorandom against algebraic attacks (see [AL16], Theorem 1.4). In the symmetric cryptography community, the rational degree is known as the algebraic immunity criterion on Boolean functions, which underlies the so-called algebraic attacks on stream ciphers [CM03, Cou03]. An algebraic immunity of e implies an r-bit fixing degree greater than or equal to \(e-r\) ([DGM05], Proposition 1), so that a high algebraic immunity guarantees both high rational degree and high bit-fixing degree. Since the algebraic degree is equivalent to the 0-bit fixing degree, this leads to the following characterization: a predicate of a random local function must have high resiliency and high algebraic immunity. In light of this characterization, the authors of [AL16] suggested the XOR-MAJ predicate as a promising candidate for building high-stretch local PRGs, the majority function having optimal algebraic immunity [DMS05].
Security Against Subexponential Attacks. While there is a large body of work that studied the security of random local functions, leading to a detailed characterization of the parameters and predicates that lead to insecure instantiations, relatively little is known on the exact security of local PRGs instantiated with non-degenerate parameters. In particular, most papers only prove that some classes of poly-time attacks provably fail to break candidate local PRGs; however, these results do not preclude the possible existence of non-trivial subexponential attacks (specifically, these poly-time attacks do not “degrade gracefully” into subexponential attacks when appropriate parameters are chosen for the PRG; instead, they always provably fail). To our knowledge, the only results in this regard are the proof from [AHI05, CEMT14] that many backtracking-type attacks require exponential time to invert a random local function, and the subexponential-time attack arising from the work of Bogdanov and Qiao [BQ09]. However, as we saw above, the latter attack only gives a slightly subexponential algorithm, in time \(2^{O(n^{1-(\textsf {s}-1)/2d})}\) for a d-ary predicate, and an \(n^{\textsf {s}}\)-stretch local PRG.
1.3 Our Goals and Results
In this work, we continue the study of the most common candidate local pseudorandom generators. However, we significantly depart from the approach of previous works, in that we wish to analyze the concrete security of local PRGs. To our knowledge, all previous works were only concerned with establishing asymptotic security guarantees for candidate local PRGs, without providing any insight on, \(\textit{e.g.}\), which parameters can be conjectured to lead to a primitive with a given bit-security. Our motivations for conducting this study are twofold.

Several recent results, which we briefly overviewed in Sect. 1.1, indicate that (poly-stretch) local PRGs enjoy important theoretical applications. However, the possibility of instantiating these applications with concrete PRG candidates remains unclear, as their efficiency quickly deteriorates with the parameters of the underlying PRG. For example, the iO scheme of [LT17] requires only low-degree multilinear maps and therefore might be a viable approach to obtain efficiency improvements in iO constructions (as candidate high-degree multilinear maps are prohibitively expensive); however, it has a cost cubic in the seed size of a poly-stretch local PRG, which renders it practical only if we can safely use local PRGs with reasonably small seeds. Overall, we believe that there is a growing need for a better understanding of the exact efficiency of candidate local PRGs, and providing concrete estimations can prove helpful for researchers willing to understand which efficiency could potentially be obtained for local-PRG-based primitives.

At a more theoretical level, previous works on (variants of) Goldreich’s PRG have identified criteria which characterize the predicates susceptible to lead to secure local PRGs. Identifying such criteria is particularly relevant to the initial goal set out by Goldreich in [Gol00], which is to understand which characteristics of a function are the source of its cryptographic hardness, by designing the simplest possible candidate that resists all attacks we know of. However, existing criteria only distinguish predicates leading to insecure instances from those leading to instances for which no polynomial-time attack is known. We believe that it is also of particular relevance to this fundamental question to find criteria which capture in a more fine-grained way the cryptographic hardness of random local functions.
Our Results. We provide new cryptanalytic insights on the security of Goldreich’s pseudorandom generator.

A new subexponential attack on Goldreich’s PRG. We start by devising a new attack on Goldreich’s PRG. Our attack relies on a guess-and-determine technique, in the spirit of the recent attack [DLR16] on the FLIP family of stream ciphers [MJSC16]. The complexity of our attack is \(2^{O(n^{2-\textsf {s}})}\) where \(\textsf {s}\) is the stretch and n is the seed size. This complements O’Donnell and Witmer’s result [OW14] showing that Goldreich’s PRG is likely to be secure for stretch up to 1.5, with a more fine-grained complexity estimation. We implemented our attack and provide experimental results regarding its concrete efficiency, for various seed size and stretch parameters.

Generalization. We generalize the previous attack to a large class of predicates, which are divided into two parts, a linear part and a non-linear part, XORed together. This captures all known candidate generalizations of Goldreich’s PRG. Our attack takes subexponential time as soon as the stretch of the PRG is strictly above one. Importantly, our attack does not depend on the locality of the predicate, but only on the number of variables involved in the non-linear part. In a recent work [AL16], Applebaum and Lovett put forth an explicit candidate local PRG (of the form XOR-MAJ), as a concrete target for cryptanalytic effort. Our attack gives a new subexponential algorithm for attacking this candidate.

Extending the Applebaum-Lovett polynomial-time algebraic attack. Applebaum and Lovett recently established that local pseudorandom generators can be broken in polynomial time, as long as the stretch \(\textsf {s}\) of the PRG is greater than the rational degree e of its predicate. We extend this result as follows: we show that the seed of a large class of local PRGs (which include all existing candidates) can be recovered in polynomial time whenever \(\textsf {s}\ge e - \log N_e/\log n\), where e is the rational degree, n is the seed size, and \(N_e\) is the number of independent annihilators of the predicate^{Footnote 6} of degree at most e.

Linearization and Gröbner attack. We complement our study with an analysis of the efficiency of algebraic attacks à la Gröbner on Goldreich’s PRG. While it is known that Goldreich’s PRG (and its variants) provably resists such attacks for appropriate choices of (asymptotic) parameters [AL16], little is known about its exact security against such attacks for concrete choices of parameters. We evaluated the concrete security of Goldreich’s PRG against an order-two linearization attack. The existence of such an attack allows deriving bounds on Gröbner basis performance. Using an implemented proof of concept, we introduce heuristic bounds for vulnerable parameters.
As illustrated by our attacks, both the number of annihilators of the predicate and the r-bit fixing algebraic immunity play an important role in the security of Goldreich’s PRG. These criteria were overlooked in all previous works on local PRGs. Last but not least, our concrete analysis indicates that Gröbner basis attacks, although provably “ruled out” asymptotically, matter when studying the vulnerabilities of Goldreich’s PRG, and the security of concrete instances.
1.4 Organization of the Paper
Section 2 introduces necessary preliminaries on predicates and local pseudorandom generators. Section 3 describes and analyzes a guess-and-determine attack on Goldreich’s PRG instantiated with the predicate \(P_5\); the proofs are given in the full version of our paper [CDM+18]. Section 4 extends this attack to all predicates of the form XOR-MAJ; again, the proofs are given in the full version. The full version also describes an order-2 linearization attack on Goldreich’s PRG, considers the case of using Goldreich’s PRG with ordered subsets (as was initially advocated in [Gol00]) and provides indications that this weakens its concrete security, improves the theorem of Applebaum and Lovett by taking into account the number of annihilators of the predicate, and contains the missing proofs on collisions.
2 Preliminaries
Throughout this paper, n denotes the size of the seed of the PRGs considered. A probabilistic polynomial time algorithm (PPT, also denoted efficient algorithm) runs in time polynomial in the parameter n. A positive function f is negligible if for any polynomial p there exists a bound \(B>0\) such that, for any integer \(k\ge B\), \(f(k)\le 1/{\vert p(k)\vert }\). An event depending on n occurs with overwhelming probability when its probability is at least \(1-\mathsf {negl} (n)\) for a negligible function \(\mathsf {negl} \). Given an integer k, we write [k] to denote the set \(\{1, \ldots , k\}\). Given a finite set S, the notation \(X{\mathop {\leftarrow }\limits ^{{}_\$}}S\) means a uniformly random assignment of an element of S to the variable X. Given a string \(x\in {\{0,1\}} ^k\) for some k and a subset \(\sigma \) of [k], we let \(x[\sigma ]\) denote the subsequence of the bits of x whose indices belong to \(\sigma \). Moreover, the i-th bit of \(x[\sigma ]\) will be denoted by \(x_{\sigma _i}\).
2.1 Hypergraphs
Hypergraphs generalize the standard notion of graphs (which are defined by a set of nodes and a set of edges, an edge being a pair of nodes) to a more general object defined by a set of nodes and a set of hyperedges, each hyperedge being an arbitrary subset of the nodes. We define an (n, m, d)-hypergraph G to be a hypergraph with n vertices and m hyperedges, each hyperedge having cardinality d. The hyperedges are assumed to be ordered from 1 to m, and each hyperedge \(\{i_1, i_2, \ldots , i_d\}\) is ordered and satisfies \(i_j \ne i_k\) for all \(j\le d\), \(k\le d\), \(j\ne k\). We will consider hypergraphs satisfying some expansion property, defined below.
Definition 1
(Expander Graph). An (n, m, d)-hypergraph G, denoted \((\sigma ^1, \ldots , \sigma ^m)\), is \((\alpha ,\beta )\)-expanding if for any \(S\subset [m]\) such that \(|S|\le \alpha \cdot m\), it holds that \(|\cup _{i\in S} \sigma ^i| \ge \beta \cdot |S| \cdot d\).
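For very small hypergraphs, the expansion condition of Definition 1 can be checked exhaustively. The sketch below is illustrative only: the function name `is_expanding` is an assumption, the search is exponential in \(\alpha m\), and the example hypergraph is a hypothetical toy instance.

```python
from itertools import combinations
import math

def is_expanding(hyperedges, alpha, beta):
    """Brute-force test of (alpha, beta)-expansion for a small hypergraph.

    hyperedges: list of m tuples, each holding d distinct vertex indices.
    Checks every non-empty set S of at most alpha*m hyperedges and verifies
    that the union of its hyperedges has at least beta*|S|*d vertices.
    """
    m = len(hyperedges)
    d = len(hyperedges[0])
    # Small tolerance guards against floating-point error in alpha * m.
    max_size = math.floor(alpha * m + 1e-9)
    for size in range(1, max_size + 1):
        for subset in combinations(range(m), size):
            covered = set()
            for i in subset:
                covered.update(hyperedges[i])
            if len(covered) < beta * size * d:
                return False
    return True

# Toy (6, 4, 3)-hypergraph: any two hyperedges jointly cover 5 vertices.
toy = [(0, 1, 2), (2, 3, 4), (4, 5, 0), (1, 3, 5)]
```

With these hyperedges, pairs cover 5 of the 6 possible vertex slots, so the check succeeds for \(\beta = 0.8\) and fails for \(\beta = 0.9\) when \(\alpha = 0.5\).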
2.2 Predicates
The constructions of local pseudorandom generators that we will consider in this work rely on predicates satisfying some specific properties. Formally, a predicate P of arity d is a function \(P:{\{0,1\}} ^d\mapsto {\{0,1\}} \). We define below the two properties that were shown to be necessary for instantiating local PRGs:

Resiliency. A predicate P is k-resilient if it has no non-trivial correlation with any linear combination of up to k of its inputs. An example of predicate with maximal resiliency is the parity predicate (i.e., the predicate which xors all its inputs).

Algebraic Immunity. A predicate P has algebraic immunity e, denoted \(\mathsf {AI}(P)=e\), if e is the minimal degree of a nonzero function g such that \(Pg=0\) (or \((P+1)g=0\)) on all inputs. A local PRG built from a predicate of algebraic immunity e cannot be pseudorandom with a stretch \(n^e\) due to algebraic attacks.
Note that the algebraic immunity (also referred to as rational degree in [AL16]) implies a lower bound on the degree and on the bit-fixing degree. Moreover, a high algebraic immunity implies at least the same degree. Hence, from now on, resiliency and algebraic immunity are considered as the relevant criteria for evaluating the security of Goldreich’s PRG.
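Both criteria can be computed by exhaustive search for predicates of small arity. The following sketch is an illustration under stated assumptions: all function names are hypothetical, resiliency is computed through the Walsh transform with the convention that a k-resilient predicate is also balanced (an unbalanced predicate gets resiliency -1), and algebraic immunity is found by testing, degree by degree, whether a nonzero polynomial vanishes on the support of P or of P+1.

```python
from itertools import combinations, product

def truth_table(predicate, d):
    """Map every input in {0,1}^d to the predicate's output bit."""
    return {x: predicate(x) & 1 for x in product((0, 1), repeat=d)}

def resiliency(predicate, d):
    """Largest k such that all Walsh coefficients of weight <= k vanish."""
    table = truth_table(predicate, d)
    for k in range(d + 1):
        for support in combinations(range(d), k):
            walsh = sum(
                (-1) ** (value ^ (sum(x[i] for i in support) & 1))
                for x, value in table.items()
            )
            if walsh != 0:
                return k - 1
    return d  # not reached: by Parseval, some coefficient is nonzero

def gf2_rank(rows):
    """Rank over F_2 of row vectors encoded as integer bitmasks."""
    pivots = {}  # leading bit position -> reduced row with that leading bit
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                break
            row ^= pivots[top]
    return len(pivots)

def has_annihilator(points, d, degree):
    """True if a nonzero polynomial of degree <= degree vanishes on all points."""
    monomials = [s for r in range(degree + 1) for s in combinations(range(d), r)]
    rows = []
    for x in points:
        row = 0
        for j, mono in enumerate(monomials):
            if all(x[i] for i in mono):  # monomial evaluates to 1 at x
                row |= 1 << j
        rows.append(row)
    # A non-trivial kernel of the evaluation matrix is a nonzero annihilator.
    return gf2_rank(rows) < len(monomials)

def algebraic_immunity(predicate, d):
    """Minimal degree of a nonzero g with P*g = 0 or (P+1)*g = 0."""
    table = truth_table(predicate, d)
    ones = [x for x, v in table.items() if v == 1]
    zeros = [x for x, v in table.items() if v == 0]
    for e in range(d + 1):
        if has_annihilator(ones, d, e) or has_annihilator(zeros, d, e):
            return e
    return d

def p5(x):
    """The predicate P_5 = x1 + x2 + x3 + x4*x5 over F_2."""
    return x[0] ^ x[1] ^ x[2] ^ (x[3] & x[4])
```

The search is exponential in the arity d, so it is only meant for predicates such as \(P_5\); calling `resiliency(p5, 5)` and `algebraic_immunity(p5, 5)` evaluates both criteria for that predicate.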
We define a particular family of predicates which have been considered as a potential instantiation:
Definition 2
(\({\mathsf {XOR}}_{\ell } \mathsf {M}_{k}\) predicates). We call \({\mathsf {XOR}}_{\ell } \mathsf {M}_{k}\) predicate a predicate P of arity \(\ell +k\) such that M is a predicate of arity k and:
\[P(x_1,\ldots ,x_\ell ,z_1,\ldots ,z_k) = \sum _{i=1}^{\ell } x_i + M(z_1,\ldots ,z_k),\]
where the sum is taken over \(\mathbb {F}_2\).
We also define a subfamily of \({\mathsf {XOR}}_{\ell } \mathsf {M}_{k}\) predicates, which have been considered in [AL16]:
Definition 3
(\({\mathsf {XOR}}_{\ell } \mathsf {MAJ}_{k}\) predicates). We call \({\mathsf {XOR}}_{\ell } \mathsf {MAJ}_{k}\) predicate a predicate P of arity \(\ell +k\) such that P is a \({\mathsf {XOR}}_{\ell } \mathsf {M}_{k}\) predicate such that M is the majority function in k variables:
\[M(z_1,\ldots ,z_k) = 1 \iff \mathsf {w}_H(z_1,\ldots ,z_k) > \lfloor k/2\rfloor \]
where \(\mathsf {w}_H\) denotes the Hamming weight.
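The two families can be sketched in a few lines. The names `xor_m`, `majority` and `xor_maj` are hypothetical, and the tie-breaking rule for even k (output 1 only on a strict majority of ones) is an assumption of this sketch.

```python
def xor_m(bits, ell, inner):
    """XOR_ell M_k predicate: xor of the first ell bits, plus inner(remaining bits)."""
    linear_part = sum(bits[:ell]) % 2
    return linear_part ^ (inner(bits[ell:]) & 1)

def majority(z):
    """Majority of the bits z; ties (even length) give 0 by assumption."""
    return 1 if sum(z) > len(z) // 2 else 0

def xor_maj(bits, ell, k):
    """XOR_ell MAJ_k predicate on ell + k input bits."""
    assert len(bits) == ell + k
    return xor_m(bits, ell, majority)

def p5(bits):
    """P_5 viewed as an XOR_3 M_2 predicate whose inner predicate is AND."""
    return xor_m(bits, 3, lambda z: z[0] & z[1])
```

Writing \(P_5\) as `xor_m` with a 2-bit AND makes explicit that it belongs to the same family, with a linear part on 3 variables and a non-linear part on 2 variables.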
2.3 Pseudorandom Generators
Definition. A pseudorandom generator is a deterministic process that expands a short random seed into a longer sequence, so that no efficient adversary can distinguish this sequence from a uniformly random string of the same length. Formally,
Definition 4
(Pseudorandom Generator). An m(n)-stretch pseudorandom generator, for a polynomial m, is an efficient uniform deterministic algorithm \(\mathsf {PRG} \) which, on input a seed \(x \in {\{0,1\}} ^n\), outputs a string \(y \in {\{0,1\}} ^{m(n)}\). It satisfies the following security notion: for any probabilistic polynomial-time adversary \(\mathsf {Adv} \),
$$\Pr \left[ x {\mathop {\leftarrow }\limits ^{{}_\$}} {\{0,1\}} ^n,\ y \leftarrow \mathsf {PRG} (x) : \mathsf {Adv} (\mathsf {pp} , y) = 1\right] \approx \Pr \left[ y {\mathop {\leftarrow }\limits ^{{}_\$}} {\{0,1\}} ^{m(n)} : \mathsf {Adv} (\mathsf {pp} , y) = 1\right] .$$
Here \(\approx \) denotes that the absolute value of the difference of the two probabilities is negligible in the security parameters, and \(\mathsf {pp} \) stands for the public parameters of the \(\mathsf {PRG}\). For any \(n\in \mathbb {N} \), we denote by \(\mathsf {PRG} _n\) the function \(\mathsf {PRG} \) restricted to n-bit inputs. A pseudorandom generator \(\mathsf {PRG} \) is d-local (for a constant d) if for any \(n\in \mathbb {N} \), every output bit of \(\mathsf {PRG} _n\) depends on at most d input bits.
Goldreich’s Pseudorandom Generator. Goldreich’s candidate local PRGs form a family \(\mathsf {F} _{G,P}\) of local PRGs: \(\mathsf {PRG} _{G,P}:{\{0,1\}} ^n\mapsto {\{0,1\}} ^m\), parametrized by an (n, m, d)-hypergraph \(G = (\sigma ^1, \ldots , \sigma ^m)\) (where \(m = m(n)\) is polynomial in n), and a predicate \(P:{\{0,1\}} ^d\mapsto {\{0,1\}} \), defined as follows: on input \(x\in {\{0,1\}} ^n\), \(\mathsf {PRG} _{G,P}\) returns the m-bit string \((P(x_{\sigma _1^1},\ldots ,x_{\sigma _d^1}),\ldots ,P(x_{\sigma _1^m},\ldots ,x_{\sigma _d^m}))\).
Conjecture 1
(Informal). If G is a sufficiently expanding (n, m, d) hypergraph and P is a predicate with sufficiently high resiliency and high algebraic immunity, then the function \(\mathsf {PRG} _{G,P}\) is a secure pseudorandom generator.
Note that picking a hypergraph G uniformly at random suffices to ensure that it will be expanding with probability \(1-o(1)\). However, picking a random graph will always give a non-negligible probability of having an insecure PRG. To see that, observe that when the locality d is constant, a random hypergraph G will have two hyperedges containing the same vertices with probability \(1/\mathsf {poly} (n)\); for any such graph G, the output of \(\mathsf {PRG} _{G,P}\) on a random input can be trivially distinguished from random. Therefore, the security of random local functions is usually formulated non-uniformly, by stating that for a \(1-o(1)\) fraction of all hypergraphs G (and appropriate choice of P), no poly-time adversary should be able to distinguish the output of \(\mathsf {PRG} _{G,P}\) from random with non-negligible probability.
Fixed Hypergraph Versus Random Hypergraphs. Goldreich’s candidate local pseudorandom generators require a sufficiently expanding hypergraph. Unfortunately, building concrete graphs satisfying the appropriate expansion properties is a nontrivial task. Indeed, all known concrete constructions of expanding bipartite hypergraphs fail to achieve parameters which would allow the construction of a PRG with constant locality. Therefore, to our knowledge, in all works using local PRGs (see e.g. [IKOS08, App13, Lin17, ADI+17b, BCG+17]), it is always assumed (implicitly or explicitly) that the hypergraph G of the PRG is picked uniformly at random (which makes it sufficiently expanding with probability \(1-o(1)\), even in the constant-locality setting) in a one-time setup phase. Therefore, this is the setting we assume for our cryptanalysis.
Notations. In the first part of this work, we focus on the predicate \(P_5\), assuming that the subsets \(\sigma ^1,...,\sigma ^m\) are random subsets. The predicate \(P_5\) can be regarded as a Boolean function of five variables:
$$P_5(x_1,x_2,x_3,x_4,x_5) = x_1+x_2+x_3+x_4x_5.$$
The predicate \(P_5\) has algebraic degree 2 and an algebraic immunity of 2, and is 2-resilient. Let n be the size of the input, i.e. the number of initial random bits. We define the stretch \(\textsf {s}\) and denote the size m of the output as \(m=n^\textsf {s}\). Let \(x_1,\ldots , x_n \in \mathbb {F}_2\) be the input random bits and \(y_1,\ldots ,y_m \in \mathbb {F}_2\) be the output bits. The m public equations \(E_i\) for \(1\le i\le m\) are drawn as follows:

a subsequence of [n] of size 5 is chosen uniformly at random. Let us call it
$$\sigma ^i = [\sigma _1^i,\sigma _2^i,\sigma _3^i,\sigma _4^i,\sigma _5^i].$$ 
\(E_i\) is the quadratic equation of the form
$$x_{\sigma _1^i} + x_{\sigma _2^i}+x_{\sigma _3^i}+x_{\sigma _4^i}x_{\sigma _5^i} = y_i.$$
The public system \(\varSigma \) that we consider is then defined with the m equations, that is \((E_i)_{1\le i\le m}\).
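The generation of the public system can be sketched as follows. This is a minimal illustration and not the authors' proof of concept: the function names, the 0-based indexing of the seed bits, the toy parameters and the fixed random seed are our own assumptions. Each \(\sigma ^i\) is drawn as five distinct indices in random order, which corresponds to the unordered case.

```python
import random

def p5(z):
    """P5(z1, ..., z5) = z1 + z2 + z3 + z4*z5 over F_2."""
    return (z[0] + z[1] + z[2] + z[3] * z[4]) % 2

def sample_hypergraph(n, m, rng):
    """m random 5-subsets of {0, ..., n-1}, each kept in random order."""
    return [tuple(rng.sample(range(n), 5)) for _ in range(m)]

def goldreich_prg(x, hypergraph, predicate=p5):
    """Apply the predicate to the seed bits selected by each hyperedge."""
    return [predicate([x[v] for v in sigma]) for sigma in hypergraph]

# Toy parameters, far too small for any security.
n, s = 64, 1.4
m = round(n ** s)
rng = random.Random(0)  # fixed seed, only to make the sketch reproducible
seed = [rng.randrange(2) for _ in range(n)]
G = sample_hypergraph(n, m, rng)   # public
y = goldreich_prg(seed, G)         # public output; the pairs (G[i], y[i]) form the system
```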
Ordered and Unordered. There are two different cases to consider:

1.
(Ordered case) \(\sigma ^i\) is ordered, i.e. \(\sigma _1^i<\sigma _2^i<\sigma _3^i<\sigma _4^i<\sigma _5^i\).

2.
(Unordered case) The order of the elements of \(\sigma ^i\) is arbitrary.
In the core of the paper, we consider the unordered case; in the full version of our paper [CDM+18], we provide evidence that the vulnerabilities are even more significant in the ordered case.
Matrix Inversion Complexity. Our attacks require a sparse matrix inversion algorithm. We consider Wiedemann’s algorithm [Wie86], the complexity of which is \(O(n^2)\) in our context, since our matrices have fewer than \(d \cdot n\) nonzero elements. Other algorithms could be used, but the complexity of our attacks would have to be modified accordingly.
3 Guess and Determine Cryptanalysis of Goldreich’s PRG with \(P_5\)
In this section, we describe a new subexponential seed recovery attack on Goldreich’s PRG when instantiated with the predicate \(P_5\). Our attack is a guess-and-determine-like attack, a widely used technique in symmetric cryptanalysis [HR00, EJ00]. As an example, a similar attack [DLR16] was mounted on the preliminary version of the stream cipher FLIP [MJSC16] (which can be interpreted as an instance of Goldreich’s PRG with linear locality and fixed security parameters). The idea of guessing elements before performing an algebraic analysis was also introduced in [Bet11] under the name of hybrid attacks. In the following, we sketch a similar idea applied to the highly structured Goldreich’s PRG.
3.1 Overview of the Attack
Using the above notations, we further make the following observations on Goldreich’s PRG instantiated with \(P_5\).
Observations

Quasi-linearity. If either \(x_{\sigma _4^i}\) or \(x_{\sigma _5^i}\) is known, then the corresponding equation becomes a linear equation. This is the main vulnerability that we use to mount our attack.

Collisions. If two equations have the same monomial of degree 2, then the sum of these equations becomes linear (details are given in Sect. 3.2). Using this phenomenon, we can also get linear equations. We first analyze the number c of pairs of equations that share a monomial of degree 2. We refer to this phenomenon as a collision.
Definition 5
(Collision). A collision is a couple \((i,j)\in [m]^2\) such that \(i\ne j\) and \(\{ {\sigma _4^i},{\sigma _5^i}\}=\{ {\sigma _4^j},{\sigma _5^j}\}\).
Combining both observations, a subexponential attack can be derived. The main idea is to find linear equations using collisions and quasi-linearity.
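Both observations can be sketched in a few lines. The representation below is our own assumption, not the paper's: a hyperedge is a 5-tuple of 0-based indices, and a linear equation is a pair (set of variable indices, right-hand side bit). Grouping by hashing is used here for brevity; a sort-based grouping would work equally well.

```python
from collections import defaultdict

def linearize_by_guess(sigma, y, guesses):
    """Quasi-linearity: if x_{sigma_4} or x_{sigma_5} is guessed, return the
    resulting linear equation (variables, rhs); otherwise return None."""
    s1, s2, s3, s4, s5 = sigma
    if s4 in guesses:
        known, other = s4, s5
    elif s5 in guesses:
        known, other = s5, s4
    else:
        return None
    variables = {s1, s2, s3}
    if guesses[known]:          # the product x_known * x_other collapses to x_other
        variables ^= {other}
    return frozenset(variables), y

def collision_equations(hypergraph, y):
    """Collisions: add each equation to the first one sharing its quadratic
    monomial. A group of t colliding equations yields t - 1 linear equations."""
    groups = defaultdict(list)
    for i, sigma in enumerate(hypergraph):
        groups[frozenset(sigma[3:])].append(i)
    equations = []
    for indices in groups.values():
        first = indices[0]
        for j in indices[1:]:
            variables = set(hypergraph[first][:3]) ^ set(hypergraph[j][:3])
            equations.append((frozenset(variables), y[first] ^ y[j]))
    return equations
```

Variables appearing in both linear parts cancel in the sum, which is why the symmetric difference is used.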
The Attack

step 1. Find all collisions and derive the corresponding linear equations. Let c be the number of linear equations obtained with this step.

step 2. Take a small subset of \(\ell \) variables in \(\{x_1,\ldots ,x_n\}\), called \(x_{i_1},\ldots ,x_{i_\ell }\), such that by guessing them, \(n-c\) new equations are generated (\(\ell \) is formally defined in Definition 6).

step 3. For all \(2^\ell \) possible values of \((x_{i_1},\ldots ,x_{i_\ell })\), build the system of at least n linear equations, solve it^{Footnote 7}, find a candidate seed and check if that candidate matches the public evaluation of the PRG. If so, then it is the secret seed and the guess is correct.
Definition 6
(Number of guesses \(\ell \)). Let an instance of Goldreich’s PRG be generated with n variables and m equations. Let c be the number of collisions. Let us define \(\ell \) as a sufficient number of guesses required to build \(n-c\) linear equations.
The above attack works as long as the systems of linear equations obtained in steps 2 and 3 above contain an invertible subsystem of size sufficiently large to recover the seed. In our experiments, this was always the case. We formalize this observation with a combinatorial hypothesis: define \(\mathcal {D} _n\) to be the distribution over \(\mathbb {F} _2^{n\times n}\) obtained by sampling the hypergraph of Goldreich’s PRG at random (with \(d = 5\)), finding c linear equations from the collisions, taking the smallest subset of variables which suffices to recover \(n' \ge n-c\) additional linear equations, guessing at random the value of these variables, and outputting the \(n\times n\) matrix \(A_n\) of the linear system (if \(n' > n\), we truncate to n equations for simplicity).
Hypothesis 1
There exists a constant \(\gamma \) such that for every sufficiently large \(n\in \mathbb {N} \), the matrix \(A_n\) contains with overwhelming probability an invertible subsystem of \(\gamma \cdot n\) equations, where the probability is taken over the coins of \(A_n{\mathop {\leftarrow }\limits ^{{}_\$}}\mathcal {D} _n\).
In the full version of this work [CDM+18], we provide a detailed analysis of Hypothesis 1. Specifically:

By applying the result of [BQ09], which describes a poly-time seed recovery attack given an approximate preimage of the PRG, we formally show that Hypothesis 1 implies that our attack succeeds with overwhelming probability.

We conduct detailed experiments. In our experiments, the matrix \(A_n\) always contains an invertible subsystem of \(\gamma \cdot n\) equations, with \(\gamma > 0.9\).

We show that Hypothesis 1 is related to well-established conjectures in mathematics, related to the distribution of the rank of random sparse matrices. Unfortunately, formally proving Hypothesis 1, even under some heuristics (e.g. replacing \(\mathcal {D} _n\) by the uniform distribution over sparse matrices), appears to be a highly nontrivial mathematical problem, which requires techniques far out of the scope of the current paper.

Finally, we show that our attack can be modified to (provably) break the pseudorandomness of Goldreich’s PRG, without having to rely on any unproved hypothesis. Hence, Hypothesis 1 seems to be only necessary for showing that our attack breaks the one-wayness of Goldreich’s PRG.
In the next part, we give more details of our attack and we prove that the complexity of this attack will always be smaller than
We later introduce experimental results in Sect. 3.3.
3.2 Complexity Analysis and Details
Assessing the Number of Collisions. As previously noticed, collisions can be used to build linear equations. For example, let us assume we have the following two equations in \(\varSigma \):
then adding Eqs. (1) and (2) gives us the following linear equation:
However, we stress that if we had a third colliding equation:
then we could only produce a single other linear equation (w.l.o.g. (1) + (3)), since the other combination ((2) + (3)) would be linearly equivalent to the two previous linear equations.
Hence, this problem can be seen as a balls-into-bins problem: m balls are randomly thrown into \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) \) bins and we want to know how many balls on average hit a bin that already contains at least one ball. Indeed, this number will approximate the value c of the algorithm.
Proposition 1
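(Average number of collisions). Let n be the number of variables, and m be the number of equations, let C be the random variable counting the number of collisions on the degree two monomials in the whole system. Then, the average number of collisions is:
$$\mathbb {E}[C] = m - \left( {\begin{array}{c}n\\ 2\end{array}}\right) \left( 1 - \left( 1 - \frac{1}{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }\right) ^m\right) .$$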
The proof of this proposition is given in the full version [CDM+18]. Table 1 gives the evaluation of this formula for some set of parameters. Our experimental results (see Sect. 3.3) corroborate these expectations and show that the number of collisions is always very close to this expected average.
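In the balls-into-bins model above, the expected number of balls landing in an already occupied bin equals m minus the expected number of non-empty bins, that is \(m - N(1-(1-1/N)^m)\) with \(N = \left( {\begin{array}{c}n\\ 2\end{array}}\right) \). The following sketch evaluates this standard closed form; the function name is ours, and the formula is our reading of the model rather than a quotation of the full version.

```python
from math import comb

def expected_collisions(n, m):
    """Expected number of balls (equations) that land in an already
    occupied bin (degree-two monomial), for m balls and C(n, 2) bins."""
    bins = comb(n, 2)
    expected_nonempty_bins = bins * (1 - (1 - 1 / bins) ** m)
    return m - expected_nonempty_bins

# n = 1024 and s = 1.4, i.e. m = 1024^1.4 = 2^14 equations.
print(expected_collisions(1024, 2 ** 14))
```

For \(n=1024\) and \(\textsf {s}=1.4\), this evaluates to roughly 254, in line with the average reported in Sect. 3.3.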
We now assess the complexity of the first step.
Lemma 1
In the worst case, Step 1 has complexity \(O(m \cdot \log (m))\).
The proof is given in the full version of our paper [CDM+18].
Finding the Smallest Subset of Guesses. The dominant term of the complexity of our attack is given by the number of guesses \(\ell \) we have to make in the second step. Thus, minimizing \(\ell \) is important. Consequently, the variables of the seed that we guess are those appearing the most in the monomials of degree two. The worst case then happens when the instance of the PRG is such that there is no best set of guesses. In this specific unlikely setting, each guess generates exactly the same number of linear equations. Here, we bound the number of guesses with the minimum number of guesses for a worst case system.
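One possible way to select the guessed variables is a greedy strategy: repeatedly pick the variable that appears most often in the degree-two monomials of the equations that are still quadratic. The sketch below is an assumption on our side (the paper only states that the most frequent variables are guessed); the function name and the recounting after each pick are ours.

```python
from collections import Counter

def choose_guesses(hypergraph, target):
    """Greedily pick variables until at least `target` equations become
    linear, or until no quadratic equation is left."""
    remaining = list(hypergraph)   # equations that are still quadratic
    guessed, linearized = [], 0
    while linearized < target and remaining:
        # Count occurrences in the quadratic part (positions 4 and 5) only.
        counts = Counter(v for sigma in remaining for v in sigma[3:])
        best, _ = counts.most_common(1)[0]
        guessed.append(best)
        still_quadratic = [s for s in remaining if best not in s[3:]]
        linearized += len(remaining) - len(still_quadratic)
        remaining = still_quadratic
    return guessed, linearized
```

In the attack, `target` would be \(n-c\), and the length of the returned list plays the role of \(\ell \).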
Proposition 2
(Number of guesses). For any instance with n variables, m equations and c collisions, an upper bound on the sufficient number of guesses required to build \(n-c\) linear equations is:
The proof is given in the full version of our paper [CDM+18]. Finally, Eq. 4 can be approximated with
We show in Sect. 3.3 that the experimental results are much better: this theoretical worst-case bound is far from what we observe in practice. Some explanations of this gap are given in the full version of our paper.
The complexity of Step 2 is given by the following lemma.
Lemma 2
Step 2 has complexity \(O(\ell \cdot m)\) which is \(O(n^2)\) with Eq. 5 estimation.
The proof is given in the full version of our paper.
Solving the Linear System. Now, \(\ell \) variables \(\{x_{i_1},\ldots ,x_{i_\ell }\}\) are chosen to be guessed and an exhaustive search over all the \(2^\ell \) values of these variables is necessary. For every possible guess, one can try to solve the linear equations collected in the previous steps. In the case that more than n equations are collected, the system is overdetermined and thus may not be solvable. If so, then the guess is incorrect, else we obtain a candidate seed. This candidate can be either confirmed or rejected using the public quadratic system and the public output of the PRG. If the candidate is rejected, then the guess is also incorrect. However, if the candidate matches the public evaluation of the PRG, then the candidate seed is the secret seed with overwhelming probability^{Footnote 8} and the search can be stopped.
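The solving step for one guess can be sketched as follows. For brevity, this sketch uses dense Gaussian elimination over \(\mathbb {F}_2\) on integer bitmasks instead of Wiedemann's sparse algorithm, so it does not reflect the \(O(n^2)\) cost discussed in Sect. 2. The equation format (set of 0-based variable indices, right-hand side bit) and the choice of setting free variables to 0 are our own assumptions.

```python
def solve_gf2(equations, n):
    """Solve a linear system over GF(2). Returns one solution as a list of
    n bits, or None if the system is inconsistent (wrong guess)."""
    pivots = {}  # pivot variable -> (row bitmask, rhs)
    for variables, rhs in equations:
        mask = sum(1 << v for v in variables)
        while mask:
            p = mask.bit_length() - 1
            if p not in pivots:
                pivots[p] = (mask, rhs)
                break
            pivot_mask, pivot_rhs = pivots[p]
            mask ^= pivot_mask
            rhs ^= pivot_rhs
        else:
            if rhs:            # the row reduced to 0 = 1
                return None
    x = [0] * n                # free variables default to 0
    for p in sorted(pivots):   # every other variable of a pivot row has index < p
        mask, rhs = pivots[p]
        value = rhs
        for v in range(p):
            if (mask >> v) & 1:
                value ^= x[v]
        x[p] = value
    return x
```

A candidate returned by `solve_gf2` still has to be checked against the public output of the PRG, as described above.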
The complexity of this attack is given by the following lemma.
Lemma 3
The complexity of Step 3 is
which is also the asymptotic complexity of the full attack.
The proof is given in the full version of our paper [CDM+18].
3.3 Experiment
Distribution of the Number of Collisions. The theoretical results of Table 1 are verified in practice, as shown in Fig. 1 for the particular case of \(n=1024\) and \(s=1.4\). As predicted by the analytical formula, the number of collisions is very close to 254 on average. Moreover, our experimental results are very dense around the average, suggesting that the distribution has a low variance.
Implementation of the Attack. Since the study of this paper is the concrete security of Goldreich’s PRG, it is important to practically check if the attack presented in Sect. 3.1 can be efficient when implemented. For this purpose, we provide a proof of concept in Python.
One can note that the practical attack should be on average more efficient than assessed theoretically. Indeed, the asymptotic complexity of Proposition 3 is estimated in the worst case and pessimistic approximations were made on \(n-c\) and on the value of \(\ell \). Hence, we ran this attack for different stretches and different values of n, and we indeed observed that the average complexity is much smaller than the expected complexity. Table 2 represents the theoretical number of guesses necessary to recover the seed and Table 3 represents the average number of guesses actually needed in the experiment. Moreover, we also noticed that the number of guesses needed to invert the system has a very low variance, as shown in Fig. 2.
With this experiment, we were able to estimate the practical security of Goldreich’s PRG against the guess and determine approach with 80 bits of security. Indeed, for one instance of the PRG, the complexity of the seed recovery can be easily derived from the number \(\ell \) of guesses as \(2^{\ell }n^{{\omega }}\). So to assess the 80 bits security, one can evaluate the average number of guesses necessary for one choice of \((n,\textsf {s})\) and check if the complexity is lower than \(2^{80}\). For that, for 30 values of \(n \in [2^{7}, 2^{14}] \), we delimited the smallest stretch for which the average number of guesses allows an 80-bit attack. Each average was computed over 1000 measurements, which is sufficient because the variance was very small. Figure 3 represents the limit on vulnerable \((n,\textsf {s})\) parameters. Above the line, the parameters are on average insecure against the guess and determine attack.
Candidate Nonvulnerable Parameters. We were able to estimate the practical range of parameters that appear to resist this attack. To assess them, we estimated the number of guesses necessary and deduced the bit security. With many measurements (1024 for each set of parameters), we could find the limit stretch for parameters that are not vulnerable to our attack. The couples \((n,\textsf {s})\) that possess the maximal \(\textsf {s}\) with an expected security of 80 or 128 bits^{Footnote 9} are conjectured to be the limit for non vulnerable parameters. These couples^{Footnote 10} are represented by the two lines in Fig. 4.
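The check described above amounts to comparing \(\ell + \omega \log _2 n\) with the target security level. A minimal sketch follows; the function names are ours, and the exponent \(\omega \) is left as a parameter since its value depends on the linear algebra algorithm used.

```python
from math import log2

def attack_cost_bits(ell, n, omega):
    """log2 of the seed recovery cost 2^ell * n^omega."""
    return ell + omega * log2(n)

def is_vulnerable(ell, n, omega, security_bits=80):
    """True if the attack costs fewer than 2^security_bits operations."""
    return attack_cost_bits(ell, n, omega) < security_bits

# Hypothetical numbers: 55 guesses on a 1024-bit seed, with omega = 2 as an
# assumption matching the O(n^2) sparse solver mentioned in Sect. 2.
print(attack_cost_bits(55, 1024, 2.0), is_vulnerable(55, 1024, 2.0))
```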
We also introduce certain parameters in Table 4 as challenges for improving the cryptanalysis of Goldreich’s PRG. These parameters correspond to choices of the seed size and the stretch which cannot be broken in less than \(2^{80}\) (resp. \(2^{128}\)) operations with the attacks of this paper. Further study is required to assess confidence in the security level given by these parameters.
3.4 Other Algebraic Cryptanalysis
To complement this attack, we also made an analysis of the efficiency of algebraic attacks with Gröbner basis on Goldreich’s PRG. While it is known that Goldreich’s PRG (and its variants) provably resists such attacks for appropriate choices of (asymptotic) parameters ([AL16], Theorem 5.5), little is known about its exact security against such attacks for concrete choices of parameters.
Since Goldreich’s PRG is far from a Boolean random quadratic system, the performance of a Gröbner basis strategy is hard to assess with the existing theory. In order to give an intuition on how Gröbner basis algorithms would behave on Goldreich’s PRG with predicate \(P_5\), we provide in the full version of our paper [CDM+18] an easy-to-understand order two linearization attack. This polynomial attack leads to a practical seed recovery for certain parameters \((n,\textsf {s})\) and we derive a heuristic bound for vulnerable \((n,\textsf {s})\) for 80 bits of security. The existence of such an attack allows us to estimate bounds on the performance of Gröbner basis algorithms: from the performance and complexity of this linearization attack, measured with an implemented proof of concept, we derive a heuristic bound on vulnerable \((n,\textsf {s})\) parameters against a Gröbner basis technique. We refer the reader to the full version of our paper for the complete analysis.
3.5 Conclusion
We described in this section a guess and determine attack against Goldreich’s PRG. In the full version of our paper, we complement this result with an analysis of the security of Goldreich’s PRG against an order 2 linearization attack (à la Gröbner). Figure 5 represents the range of parameters for which Goldreich’s PRG is conjectured to have 80 bits of security against those two attacks. As illustrated in the graph, the guess and determine approach targets more parameters for low n while the linearization attack performs better for \(n > 4000\). Although Goldreich’s PRG is conjectured to be theoretically secure for a stretch approaching 1.5 by an arbitrary constant, our analysis shows that a very large seed must be used to achieve at least 80 bits of security with such stretch. In particular, if a stretch of 1.4 is needed, no seed smaller than 5120 bits should be used. Similarly, for a stretch as small as 1.1, the seed must be at least 512 bits long.
4 Generic Attacks Against Goldreich’s PRG
Beyond the predicate \(P_5\) we investigate the security of other predicates for higher stretches, and show that the considered criteria are not sufficient to determine the security. In the full version of our paper, we prove that the number of independent annihilators of the predicate has to be taken into account. Hence, the algebraic immunity is not enough, as we provide a new bound on the stretch that refines the theorem of Applebaum and Lovett. On the other hand, we provide in this section an improvement of the guess and determine technique, combined with an algebraic attack. This generalization can be seen as a hybrid attack as defined in [Bet11].
4.1 A Subexponential-Time Algorithm
The theorem of Applebaum and Lovett for polynomial-time algorithms regarding algebraic attacks can be improved, as shown in the full version of our paper. In this section, we focus on subexponential-time algorithms. The idea here is to generalize our initial attack of Sect. 3 against the \(\mathsf {PRG}\) instantiated with the predicate \(P_5\), to all other considered predicates. Therefore we generalize the attack to all \({\mathsf {XOR}}_{\ell } \mathsf {M}_{k}\) predicates and then more particularly to the \({\mathsf {XOR}}_{\ell } \mathsf {MAJ}_{k}\) predicates.
The Principle. Let n be the size of the seed of the \(\mathsf {PRG}\) with stretch \(\textsf {s}\), and let P be a predicate with locality d. The general idea is to guess r variables of the seed, and solve the corresponding system of equations for each possible value of those r bits. For each equation obtained, an equation of smaller or equal degree can be derived using the principle of the algebraic immunity. Then, the complexity of the attack mainly depends on the values of r and the algebraic immunity of the functions we obtain. It corresponds to the general principle of algebraic attacks with guess and determine [MJSC16], for which we can refine the complexity in the particular case of \({\mathsf {XOR}}_{\ell } \mathsf {M}_{k}\) predicates. We begin by considering the complexity of an attack targeting the degree of the \(\mathsf {M}\) predicate after guessing some bits, based on the following remark:
Remark 1
As soon as \(k-1\) variables among the k variables of \(\mathsf {M}\) are fixed, a linear equation can be found, as the output of \(\mathsf {M}\) depends on only one variable and as \(\mathsf {XOR}_{\ell }\) is linear.
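The remark can be checked directly: once \(k-1\) inputs of \(\mathsf {M}\) are fixed, the remaining one-variable function z ↦ M(..., z, ...) is necessarily of the form \(a + b\cdot z\) over \(\mathbb {F}_2\). The helper below is a hypothetical illustration (its name and interface are ours): it returns the free position together with the coefficients a and b.

```python
def restrict_to_one_variable(M, k, fixed):
    """Fix k-1 inputs of the k-ary predicate M (dict position -> bit) and
    return (free, a, b) such that M = a + b * x_free on the free input."""
    (free,) = [i for i in range(k) if i not in fixed]  # exactly one free position

    def g(z):
        return M([z if i == free else fixed[i] for i in range(k)])

    a = g(0)
    b = g(0) ^ g(1)
    return free, a, b

# Nonlinear part of P5: M(z1, z2) = z1 * z2, so k = 2.
AND = lambda z: z[0] & z[1]
print(restrict_to_one_variable(AND, 2, {1: 1}))  # guessing the second input as 1
print(restrict_to_one_variable(AND, 2, {1: 0}))  # guessing the second input as 0
```

Adding the linear \(\mathsf {XOR}_{\ell }\) part to \(a + b\cdot z\) gives the linear equation used by the attack.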
The Attack. Our subexponential time algorithm works as follows:

step 1. Fix r variables of the seed \((x_{i_1},\ldots ,x_{i_r})\), with \(r\in O\left( n^{\frac{k-\textsf {s}}{k-1}}\right) \).

step 2. For all \(2^r\) possible values of \(x_{i_1},\ldots ,x_{i_r}\), recover the corresponding linear system of equations.

step 3. Solve the system in \((n-r)^{\omega }\) operations; if there is a contradiction go back to step 2, otherwise add the solution to the list.

step 4. Return the list of solutions.
This attack works as long as the system of linear equations obtained in step 3 above contains an invertible subsystem of size sufficiently large to recover the seed. We then apply Hypothesis 1 with \(A_n\) being the linear system obtained by guessing at random the \(2^r\) possible values of \(x_{i_1},\ldots ,x_{i_r}\).
Complexity Analysis. The complexity is dominated by Step 3: since we repeat this step \(2^r\) times (we have to solve a system of linear equations of size \(n-r\) for each possible value of the r bits), the complexity of this algorithm is subexponential: \(O(n^{{\omega }}2^r)\). The final complexity is determined by the following proposition:
Proposition 3
For an overwhelming proportion of Goldreich’s \(\mathsf {PRG}\) instantiated with a \({\mathsf {XOR}}_{\ell } \mathsf {M}_{k}\) predicate, under Hypothesis 1 on step 2 system, the complexity order of the previous algorithm can be approximated by:
The proof is given in the full version of our paper.
Remark 2
It is important to notice that the complexity of this attack does not depend directly on the locality, but only on the number k of variables that appear in the nonlinear part \(\mathsf {M}\); hence, it improves the complexity of [BQ09]. Indeed, the generic complexity of Bogdanov and Qiao is roughly \(O(2^{n^{1-(\textsf {s}-1)/2d}})\) where d denotes the locality, whereas our algorithm has a complexity in \(O\left( n^{\omega }\cdot 2^{n^{1-(\textsf {s}-1) / (k-1)}}\right) \), with \(k-1 < d\), by definition of k.
Moreover, the predicate requires a high resiliency to avoid linear attacks, and one of the most natural constructions to build a resilient function is to add an independent linear part to a function. It corresponds to the \({\mathsf {XOR}}_{\ell } \mathsf {M}_{k}\) predicates, which have a resiliency of at least \(\ell -1\) given by the xor part. It is also possible to build resilient functions differently, which seems to be a better choice regarding this attack. For the case of \(P_5\), we have \(k=2\), which gives an attack in \(O(n^{\omega }2^{n^{2-\textsf {s}}})\).
Possible Improvement. This algorithm only relies on the number of variables of the nonlinear part, but not on its algebraic immunity. Instead of fixing variables in order to obtain linear equations in the nonlinear part of a \({\mathsf {XOR}}_{\ell } \mathsf {M}_{k}\) predicate, an attacker can fix variables in order to recover equations of degree greater than 1. Indeed, using the algebraic immunity of the \(\mathsf {M}\) predicate, the attacker can recover such equations by fixing fewer than k bits in the \(\mathsf {M}\) part. By doing so, it appears that the relevant criterion regarding this attack is no longer the algebraic immunity, nor the r-bit fixing degree defined in [AL16], but a generalization of the two. The efficiency of the attack will depend on the algebraic immunity of the predicates obtained after doing some guesses, and on the probability of getting predicates (in fewer variables) with this algebraic immunity (or smaller). A lower bound on the algebraic immunity that can be obtained with r guesses is given by the r-bit fixing algebraic immunity (introduced first in terms of recurrent algebraic immunity in [MJSC16] to bound the complexity of algebraic attacks combined with guess and determine) defined in the following sense:
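To make the comparison of Remark 2 concrete, the two exponents of n appearing inside \(2^{n^{(\cdot )}}\) can be evaluated numerically. This is a back-of-the-envelope sketch: the function names are ours, and the hidden constants and polynomial factors of the asymptotic notation are ignored.

```python
def bq09_exponent(s, d):
    """Exponent e in the generic bound 2^(n^e) with e = 1 - (s - 1) / (2d)."""
    return 1 - (s - 1) / (2 * d)

def guess_and_determine_exponent(s, k):
    """Exponent e in 2^(n^e) with e = 1 - (s - 1) / (k - 1) = (k - s) / (k - 1)."""
    return 1 - (s - 1) / (k - 1)

# P5 has locality d = 5 and a nonlinear part on k = 2 variables.
s, n = 1.4, 1024
for name, e in [("BQ09", bq09_exponent(s, 5)),
                ("guess and determine", guess_and_determine_exponent(s, 2))]:
    print(f"{name}: n^{e:.2f} = {n ** e:.0f} guessed bits (up to constants)")
```

For \(P_5\) with \(\textsf {s}=1.4\), the exponents are 0.96 and 0.6 respectively, the latter matching \(2-\textsf {s}\).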
Definition 7
(r-bit fixing algebraic immunity). Let f be a Boolean function with d variables. For any \(0\le r \le d\), and \(b=(b_1,\ldots ,b_r)\in \{0,1\}^r\), \(i=(i_1,\ldots ,i_r) \in [d]^r\) such that \(i_1<i_2<\cdots <i_r\), we denote by \(f_{(b,i)}\) the restriction of f where the r variables indexed by \(i_1,\ldots ,i_r\) are fixed to the values \(b_1,\ldots ,b_r\). Then f has r-bit fixing algebraic immunity a if
$$\min _{(b,i)} \mathsf {AI}(f_{(b,i)}) = a,$$
where \(\mathsf {AI}\) denotes the algebraic immunity.
For the case of \({\mathsf {XOR}}_{\ell } \mathsf {M}_{k}\) predicates we prove in the full version of our paper [CDM+18] an upper bound on the r-bit fixing algebraic immunity. Thereafter, determining the number of predicates with this algebraic immunity that could be reached by guessing r variables will lead to other subexponential time algorithms. The description and analysis of this algorithm applied on \({\mathsf {XOR}}_{\ell } \mathsf {M}_{k}\) predicates is given in the full version of our paper. However, this algorithm only generalizes the result given by the first algorithm as it considers systems of equations of degree greater than one. But it does not assume any property on the \(\mathsf {M}\) predicate, and leads us to consider the maximum algebraic immunity that can be provided by this part when some variables are fixed. Considering the principle of the r-bit fixing algebraic immunity, we can try to find guesses which lower this algebraic immunity, leading to an attack with even better complexity.
In the following, we show on the XOR-MAJ predicates how taking into account only specific values of the guessed bits (but changing the positions that we guess) makes it possible to target a low algebraic immunity with enough equations.
Application to XOR-MAJ Predicates. In the previous algorithms, we fix r bits that never change, but we test all possible values for those bits. However, it might be of interest to change the bits that we guess, by taking into account a specific value for those bits, such that we decrease more drastically the degree of the equations that we get. Using the notations of Definition 7, it boils down to finding values of \(b\in \{0,1\}^r\) such that \(\mathsf {AI}(f_{(b,i)})\) is low for enough i.
Let us consider the \(\mathsf {XOR}_{\ell }\mathsf {MAJ}_{k}\) predicate (Definition 3); our initial algorithm breaks the construction with complexity \(O(n^{\omega }2^{n^{(k-\textsf {s})/(k-1)}})\), and its generalization with complexity \(O\left( 2^{ n^{\frac{1+j-\textsf {s}+\lceil (k-j)/2\rceil }{j}}} n^{{\omega }\left( \left\lceil \frac{k-j}{2} \right\rceil +1\right) }\right) \) for every integer j such that \(1\le j \le k\). Moreover, this algorithm is an improvement only for bigger stretches. In the following, we change the way we make our guesses, in order to capture how the r-bit fixing algebraic immunity is a relevant criterion.
In these algorithms, one can notice that fixing j bits among the k variables that appear in the majority function can yield equations of different degrees, depending on the value of the bits that are guessed: fixing \(\left\lceil \frac{k}{2} \right\rceil \) bits all to 0 (or all to 1) directly yields linear equations. Indeed, for the majority function, if strictly more than half of the bits are assumed to be zero, then the corresponding output has to be 0 by definition of the majority, and respectively 1 if all these bits are ones. On the other hand, fixing a quarter of the bits to one and a quarter of the bits to zero yields another majority function over the remaining half of the bits, which is clearly nonlinear for k big enough.
Hence, instead of fixing r bits and guessing all possible values of those bits, we choose r bits, guess that those bits are all one or all zero, and repeat this until the guess is right (the position of the r guessed variables changes, not the value). This particular guess-and-determine is exactly what Duval, Lallemand and Rotella investigated in [DLR16] on the FLIP family of stream ciphers (and whose complexity can be bounded through the r-bit fixing algebraic immunity, [MJSC16] Sect. 3.4).
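This dependence on the guessed values can be observed on a small example. The sketch below enumerates the outputs of the majority function once some inputs are fixed; it assumes an odd k and a strict majority (output 1 when more than half of the inputs are 1), and the helper names are ours.

```python
from itertools import product

def maj(bits):
    """Strict majority: 1 iff more than half of the bits are 1 (odd length assumed)."""
    return int(2 * sum(bits) > len(bits))

def restricted_outputs(k, fixed):
    """Set of values taken by MAJ_k when the positions in `fixed`
    (dict position -> bit) are set and the others range over all values."""
    free = [i for i in range(k) if i not in fixed]
    outputs = set()
    for z in product((0, 1), repeat=len(free)):
        full = dict(fixed)
        full.update(zip(free, z))
        outputs.add(maj([full[i] for i in range(k)]))
    return outputs

k = 5
print(restricted_outputs(k, {0: 0, 1: 0, 2: 0}))  # ceil(k/2) zeros: constant output
print(restricted_outputs(k, {0: 1, 1: 1, 2: 1}))  # ceil(k/2) ones: constant output
print(restricted_outputs(k, {0: 0, 1: 1}))        # balanced guess: still non-constant
```

With the balanced guess, the restriction is the majority of the three remaining bits, so no linear equation is obtained directly.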
Description of the Algorithm

step 1. Fix randomly r variables of the seed \((x_{i_1},\ldots ,x_{i_r})\).

step 2. Assume that all of them are equal to zero, solve the corresponding linear system, add the solution to the list.

step 3. Assume that all the r variables are equal to one, solve the corresponding linear system, add the solution to the list.

step 4. If in the solution list there is one with no contradiction with the \(\mathsf {PRG}\) output, output the solution as the seed. Otherwise, empty the list and go back to Step 1.
As for the first algorithm, we assume that Hypothesis 1 is verified with \(A_n\) representing the linear systems of Step 2 and 3.
Complexity Analysis. The complexity is dominated by the number of repetitions of Step 2 and Step 3. We determine it through the following proposition:
Proposition 4
For an overwhelming proportion of Goldreich’s \(\mathsf {PRG}\) instantiated with a \(\mathsf {XOR}_{\ell }\mathsf {MAJ}_{k}\) predicate, under Hypothesis 1 for Step 2 and 3 systems, the seed can be recovered in time complexity of order:
The proof is given in the full version of our paper.
This algorithm captures something different from the previous ones, as it shows that one has to consider all possible choices of guesses in order to evaluate exactly the security of such constructions. In other words, it shows that the r-bit fixing algebraic immunity is exactly the relevant criterion to resist our attack, as it defines the smallest algebraic immunity that can be considered for an attack. However, one must also take into account the probability that a corresponding guess happens on the equations. Hence there exists a trade-off between the choice of the good guesses, and the probability that the corresponding equation of small degree can be derived.
4.2 Open Questions
The attacks and their variants described here raise many open questions. For the polynomial-time algorithm using the number of linearly independent annihilators, we do not take into account some dependencies between different equations, as explained in the full version of our paper [CDM+18]. Hence, the condition on the stretch that we gave could be improved by considering dependencies between the subsets.
For the subexponential-time attack that uses the r-bit fixing algebraic immunity, we do not know whether the bound given in the full version of our paper is tight, that is, whether there exist predicates such that fixing any r bits still yields Boolean functions on fewer variables that reach the maximal algebraic immunity. In other words, is it possible to have a perfect predicate with respect to the r-bit fixing algebraic immunity, which is the relevant criterion in this context?
Moreover, this bound does not depend on the values of the guessed bits, although these may matter, as shown for the XOR-MAJ predicate. For example, the Boolean function \(x_0+x_1x_2x_3x_4\) has algebraic immunity 2; fixing \(x_1=1\) yields a Boolean function that still has algebraic immunity 2, whereas fixing \(x_1=0\) directly gives an equation of degree 1. Hence, not all guesses are equivalent, which implies that different choices of guesses could improve the complexity of our subexponential-time algorithm, depending strongly on the predicate.
Last but not least, how can the first idea of using different annihilators improve the subexponential-time algorithms based on guess-and-determine?
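The example with \(x_0+x_1x_2x_3x_4\) can be checked by brute force. The sketch below is an illustration rather than the authors' tooling. It computes the algebraic immunity of a small Boolean function as the least degree d for which a nonzero polynomial of degree at most d vanishes on the support of f or of f + 1. The helper names (`monomials`, `gf2_rank`, `has_annihilator`, `algebraic_immunity`) are hypothetical, and the exhaustive search is practical only for a handful of variables.

```python
from itertools import combinations, product

def monomials(n, d):
    """All monomials of degree <= d in n variables, as tuples of variable indices."""
    return [c for k in range(d + 1) for c in combinations(range(n), k)]

def gf2_rank(rows):
    """Rank over GF(2) of rows encoded as Python ints (bitmasks)."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot                       # lowest set bit is the pivot column
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def has_annihilator(points, n, d):
    """True if some nonzero polynomial of degree <= d vanishes on all `points`."""
    mons = monomials(n, d)
    rows = []
    for x in points:
        row = 0
        for j, m in enumerate(mons):
            if all(x[i] for i in m):               # monomial m evaluates to 1 at x
                row |= 1 << j
        rows.append(row)
    return gf2_rank(rows) < len(mons)

def algebraic_immunity(f, n):
    """Least degree of a nonzero annihilator of f or of f + 1."""
    pts = list(product((0, 1), repeat=n))
    ones = [x for x in pts if f(x) == 1]
    zeros = [x for x in pts if f(x) == 0]
    for d in range(n + 1):
        if has_annihilator(ones, n, d) or has_annihilator(zeros, n, d):
            return d
    return n

f = lambda x: x[0] ^ (x[1] & x[2] & x[3] & x[4])
f_x1_one = lambda x: x[0] ^ (x[1] & x[2] & x[3])   # x_1 = 1, remaining variables renumbered
f_x1_zero = lambda x: x[0]                          # x_1 = 0, kept as a function of 4 variables

print(algebraic_immunity(f, 5),
      algebraic_immunity(f_x1_one, 4),
      algebraic_immunity(f_x1_zero, 4))             # prints: 2 2 1
```

For instance, \((x_0+1)(x_1+1)\) is a degree-2 annihilator of the original function, while no affine function annihilates it or its complement.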
Notes
 1.
The subsets form an expander graph if for some k, every k subsets cover \(k+\varOmega (n)\) elements of [n]. In practice, it suffices to pick the subsets \((\sigma ^1, \ldots , \sigma ^m)\) at random once and for all to guarantee that they will be expanding except with o(1) probability.
 2.
A linear test attempts to distinguish a string from random by checking whether the XOR of a subset of the bits of the string is biased toward either 0 or 1.
 3.
The locality requirement can in fact be weakened to a related notion of block locality.
 4.
In this model, n parties securely compute a function f on private inputs \((x_1,\ldots , x_n)\); in the preprocessing phase, the parties have access to f (but not to the input), and generate some preprocessing material. Then, in the online phase, the parties execute an information-theoretically secure protocol to compute f(x), using the preprocessed material. MPC protocols in the preprocessing model are among the most promising candidates for getting practical solutions to the multiparty computation problem.
 5.
A predicate P has r-bit fixing degree e if the minimal degree of the restriction of P obtained by fixing r inputs is e.
 6.
An annihilator of a predicate P is a nonzero polynomial Q such that \(Q\cdot P = 0\).
 7.
If more than n linear equations are recovered from Steps 1 and 2, the system is unlikely to be solvable for an incorrect guess. In that case, it is not necessary to check whether the public output matches the candidate seed.
 8.
It is very unlikely that two seeds give the same output by evaluating the same quadratic system. Even if this happens, the procedure still finds an equivalent seed, which makes the system insecure.
 9.
We actually took a margin of \(10\%\) to take into account the possible improvements of our implementation.
 10.
This curve should not be extrapolated because outside of its range, Gröbner attacks seem more powerful, see Fig. 5.
References
Applebaum, B., Bogdanov, A., Rosen, A.: A dichotomy for local small-bias generators. In: Cramer, R. (ed.) TCC 2012. LNCS, vol. 7194, pp. 600–617. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-28914-9_34
Applebaum, B., Bogdanov, A., Rosen, A.: A dichotomy for local small-bias generators. J. Cryptol. 29(3), 577–596 (2016)
Applebaum, B., Damgård, I., Ishai, Y., Nielsen, M., Zichron, L.: Secure arithmetic computation with constant computational overhead. Cryptology ePrint Archive, Report 2017/617 (2017). http://eprint.iacr.org/2017/617
Applebaum, B., Damgård, I., Ishai, Y., Nielsen, M., Zichron, L.: Secure arithmetic computation with constant computational overhead. In: Katz, J., Shacham, H. (eds.) CRYPTO 2017. LNCS, vol. 10401, pp. 223–254. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63688-7_8
Alekhnovich, M., Hirsch, E.A., Itsykson, D.: Exponential lower bounds for the running time of DPLL algorithms on satisfiable formulas. J. Autom. Reason. 35(1–3), 51–72 (2005)
Applebaum, B., Ishai, Y., Kushilevitz, E.: Cryptography in NC\(^0\). In: 45th FOCS, pp. 166–175. IEEE Computer Society Press, October 2004
Applebaum, B., Ishai, Y., Kushilevitz, E.: On pseudorandom generators with linear stretch in NC\(^0\). Comput. Complex. 17(1), 38–69 (2008)
Applebaum, B., Lovett, S.: Algebraic attacks against random local functions and their countermeasures. In: 48th ACM STOC, pp. 1087–1100. ACM Press, June 2016
Applebaum, B.: Pseudorandom generators with long stretch and low locality from random local one-way functions. In: 44th ACM STOC, pp. 805–816. ACM Press, May 2012
Applebaum, B.: Pseudorandom generators with long stretch and low locality from random local one-way functions. SIAM J. Comput. 42(5), 2008–2037 (2013)
Applebaum, B.: The cryptographic hardness of random local functions - survey. Cryptology ePrint Archive, Report 2015/165 (2015). http://eprint.iacr.org/2015/165
Albrecht, M.R., Rechberger, C., Schneider, T., Tiessen, T., Zohner, M.: Ciphers for MPC and FHE. In: Oswald, E., Fischlin, M. (eds.) EUROCRYPT 2015, Part I. LNCS, vol. 9056, pp. 430–454. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46800-5_17
Boyle, E., Couteau, G., Gilboa, N., Ishai, Y., Orrù, M.: Homomorphic secret sharing: optimizations and applications. In: ACM CCS 2017, pp. 2105–2122. ACM Press (2017)
Bettale, L.: Cryptanalyse algebrique: outils et applications, Ph.D. thesis (2011)
Barak, B., et al.: On the (Im)possibility of obfuscating programs. In: Kilian, J. (ed.) CRYPTO 2001. LNCS, vol. 2139, pp. 1–18. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44647-8_1
Bogdanov, A., Qiao, Y.: On the security of Goldreich’s one-way function. In: Dinur, I., Jansen, K., Naor, J., Rolim, J. (eds.) APPROX/RANDOM 2009. LNCS, vol. 5687, pp. 392–405. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03685-9_30
Canteaut, A., et al.: Stream ciphers: a practical solution for efficient homomorphic-ciphertext compression. In: Peyrin, T. (ed.) FSE 2016. LNCS, vol. 9783, pp. 313–333. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-52993-5_16
Couteau, G., Dupin, A., Méaux, P., Rossi, M., Rotella, Y.: On the concrete security of Goldreich’s pseudorandom generator (2018)
Cook, J., Etesami, O., Miller, R., Trevisan, L.: On the one-way function candidate proposed by Goldreich. ACM Trans. Comput. Theor. (TOCT) 6(3), 14 (2014)
Cryan, M., Miltersen, P.B.: On pseudorandom generators in NC\(^0\). In: Sgall, J., Pultr, A., Kolman, P. (eds.) MFCS 2001. LNCS, vol. 2136, pp. 272–284. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44683-4_24
Courtois, N.T., Meier, W.: Algebraic attacks on stream ciphers with linear feedback. In: Biham, E. (ed.) EUROCRYPT 2003. LNCS, vol. 2656, pp. 345–359. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-39200-9_21
Courtois, N.T.: Fast algebraic attacks on stream ciphers with linear feedback. In: Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 176–194. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-45146-4_11
Dalai, D.K., Gupta, K.C., Maitra, S.: Cryptographically significant boolean functions: construction and analysis in terms of algebraic immunity. In: Gilbert, H., Handschuh, H. (eds.) FSE 2005. LNCS, vol. 3557, pp. 98–111. Springer, Heidelberg (2005). https://doi.org/10.1007/11502760_7
Duval, S., Lallemand, V., Rotella, Y.: Cryptanalysis of the FLIP family of stream ciphers. In: Robshaw, M., Katz, J. (eds.) CRYPTO 2016. LNCS, vol. 9814, pp. 457–475. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53018-4_17
Dalai, D.K., Maitra, S., Sarkar, S.: Basic theory in construction of Boolean functions with maximum possible annihilator immunity. Cryptology ePrint Archive, Report 2005/229 (2005). http://eprint.iacr.org/2005/229
Ekdahl, P., Johansson, T.: SNOW - a new stream cipher. In: Proceedings of First NESSIE Workshop, Heverlee (2000)
Goldreich, O., Goldwasser, S., Micali, S.: How to construct random functions (extended abstract). In: 25th FOCS, pp. 464–479. IEEE Computer Society Press, October 1984
Goldreich, O.: Candidate one-way functions based on expander graphs. Cryptology ePrint Archive, Report 2000/063 (2000). http://eprint.iacr.org/2000/063
Grassi, L., Rechberger, C., Rotaru, D., Scholl, P., Smart, N.P.: MPC-friendly symmetric key primitives. In: ACM CCS 2016, pp. 430–443. ACM Press, October 2016
Hawkes, P., Rose, G.G.: Exploiting multiples of the connection polynomial in word-oriented stream ciphers. In: Okamoto, T. (ed.) ASIACRYPT 2000. LNCS, vol. 1976, pp. 303–316. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-44448-3_23
Ishai, Y., Kushilevitz, E., Ostrovsky, R., Sahai, A.: Cryptography with constant computational overhead. In: 40th ACM STOC, pp. 433–442. ACM Press, May 2008
Ishai, Y., Prabhakaran, M., Sahai, A.: Secure arithmetic computation with no honest majority. Cryptology ePrint Archive, Report 2008/465 (2008)
Lin, H.: Indistinguishability obfuscation from SXDH on 5-linear maps and locality-5 PRGs. In: Katz, J., Shacham, H. (eds.) CRYPTO 2017, Part I. LNCS, vol. 10401, pp. 599–629. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63688-7_20
Lin, H., Tessaro, S.: Indistinguishability obfuscation from trilinear maps and blockwise local PRGs. In: Katz, J., Shacham, H. (eds.) CRYPTO 2017, Part I. LNCS, vol. 10401, pp. 630–660. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63688-7_21
Lombardi, A., Vaikuntanathan, V.: Limits on the locality of pseudorandom generators and applications to indistinguishability obfuscation. In: Kalai, Y., Reyzin, L. (eds.) TCC 2017, Part I. LNCS, vol. 10677, pp. 119–137. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70500-2_5
Méaux, P., Journault, A., Standaert, F.-X., Carlet, C.: Towards stream ciphers for efficient FHE with low-noise ciphertexts. In: Fischlin, M., Coron, J.-S. (eds.) EUROCRYPT 2016, Part I. LNCS, vol. 9665, pp. 311–343. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49890-3_13
Mossel, E., Shpilka, A., Trevisan, L.: On \(\epsilon \)-biased generators in NC\(^0\). In: 44th FOCS, pp. 136–145. IEEE Computer Society Press, October 2003
O’Donnell, R., Witmer, D.: Goldreich’s PRG: evidence for near-optimal polynomial stretch. In: IEEE 29th Conference on Computational Complexity (CCC), pp. 1–12. IEEE (2014)
Siegenthaler, T.: Correlation-immunity of nonlinear combining functions for cryptographic applications (corresp.). IEEE Trans. Inf. Theor. 30(5), 776–780 (1984)
Sahai, A., Waters, B.: How to use indistinguishability obfuscation: deniable encryption, and more. In: 46th ACM STOC, pp. 475–484. ACM Press, May/June 2014
Wiedemann, D.: Solving sparse linear equations over finite fields. IEEE Trans. Inf. Theor. 32(1), 54–62 (1986)
Acknowledgments
We thank Jean-Pierre Tillich and Benny Applebaum for useful discussions and observations. We are also indebted to Guénaël Renault for fruitful discussions about Gröbner basis approaches, and to the reviewers of ASIACRYPT for their useful comments. This research has been partially funded by ANRT under the programs CIFRE N 2015/1158 and 2016/1583. We acknowledge the support of the French Programme d’Investissement d’Avenir under national project RISQ P141580. The first author was supported by ERC grant 724307 (project PREP-CRYPTO). The fifth author was partially supported by the French Agence Nationale de la Recherche through the BRUTUS project under Contract ANR-14-CE28-0015.
Copyright information
© 2018 International Association for Cryptologic Research
About this paper
Cite this paper
Couteau, G., Dupin, A., Méaux, P., Rossi, M., Rotella, Y. (2018). On the Concrete Security of Goldreich’s Pseudorandom Generator. In: Peyrin, T., Galbraith, S. (eds) Advances in Cryptology – ASIACRYPT 2018. Lecture Notes in Computer Science, vol 11273. Springer, Cham. https://doi.org/10.1007/978-3-030-03329-3_4
DOI: https://doi.org/10.1007/978-3-030-03329-3_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03328-6
Online ISBN: 978-3-030-03329-3
eBook Packages: Computer Science, Computer Science (R0)