Unifying and generalizing known lower bounds via geometric complexity theory

We show that most arithmetic circuit lower bounds and relations between lower bounds naturally fit into the representation-theoretic framework suggested by geometric complexity theory (GCT), including: the partial derivatives technique (Nisan-Wigderson), the results of Razborov and Smolensky on $AC^0[p]$, multilinear formula and circuit size lower bounds (Raz et al.), the degree bound (Strassen, Baur-Strassen), the connected components technique (Ben-Or), depth 3 arithmetic circuit lower bounds over finite fields (Grigoriev-Karpinski), lower bounds on permanent versus determinant (Mignon-Ressayre, Landsberg-Manivel-Ressayre), lower bounds on matrix multiplication (B\"{u}rgisser-Ikenmeyer) (these last two were already known to fit into GCT), the chasms at depth 3 and 4 (Gupta-Kayal-Kamath-Saptharishi; Agrawal-Vinay; Koiran), matrix rigidity (Valiant) and others. That is, the original proofs, with what is often just a little extra work, already provide representation-theoretic obstructions in the sense of GCT for their respective lower bounds. This enables us to expose a new viewpoint on GCT, whereby it is a natural unification and broad generalization of known results. It also shows that the framework of GCT is at least as powerful as known methods, and gives many new proofs-of-concept that GCT can indeed provide significant asymptotic lower bounds. This new viewpoint also opens up the possibility of fruitful two-way interactions between previous results and the new methods of GCT; we provide several concrete suggestions of such interactions. For example, the representation-theoretic viewpoint of GCT naturally provides new properties to consider in the search for new lower bounds.


Introduction
Geometric complexity theory (GCT) is a program towards lower bounds, such as P ≠ NP, using algebraic geometry and representation theory (see [Mul11] for an overview, and references therein). In this paper, we show that most arithmetic circuit lower bounds naturally fit into the representation-theoretic framework used in GCT. We also show that part of the representation-theoretic approach is necessary, that this approach illuminates lower bounds even when it is not strictly necessary, and that it may in fact be the easiest approach to proving circuit lower bounds. GCT thus provides a unifying and generalizing framework for many known lower bounds. This representation-theoretic viewpoint opens the door for new potentially fruitful two-way interactions between previous results and new progress in (geometric) complexity theory (see Sections 1.1 and 4.2 for details).
This paper presupposes no knowledge of representation theory on the part of the reader. In fact, we use previous lower bounds together with our new viewpoint to motivate the use and definitions of representation theory and algebraic geometry in complexity theory.
Essentially any lower bound proof $C_{\text{hard}} \not\subseteq C_{\text{easy}}$ between nonuniform complexity classes proceeds by finding some "useful" property, which applies to every function in $C_{\text{easy}}$, but not to every function in $C_{\text{hard}}$. The first part of the GCT program suggests the use of properties of a certain type, namely (linear-)invariant properties defined by the vanishing of polynomials, which we capture in the notion of "separating module" (Definition 2.10). Recall that a property Π is linear-invariant if for every function $f$ on $n$ variables, $f(x)$ has Π if and only if $f(Ax)$ has Π for every invertible $n \times n$ change of variables $A$. In this paper we show that most known arithmetic circuit lower bounds in fact use separating modules, including:
• Lower bounds on restricted depth 3 arithmetic circuits in characteristic zero [NW97]
• Lower bounds on (unrestricted) depth 3 arithmetic circuits over finite fields [GK98]
• The recent lower bounds on depth 4 arithmetic circuits with bottom fan-in $O(\sqrt{n})$ [GKKS12]
• Lower bounds on multilinear formula size [Raz09]
• The degree bound of Strassen [Str73] and Baur-Strassen [BS83] (see below)
• Lower bounds on real (semi-)algebraic decision trees [BO83, Yao97]
• Lower bounds on bounded depth Boolean circuits [Raz87, Smo87]
• The best known lower bounds ($n^2/2$) on permanent versus determinant [MR04] (already shown to use a separating module [LMR10])
• Many lower bounds on matrix multiplication (already shown to use a separating module [BI12, Str87])
We expect that results which use similar techniques can be shown to use separating modules as well, such as [Raz06, RSY08, RY09, SW01, GR00, Yao91, BLY92]. We also observe that many relations between lower bounds yield relations between separating modules.
In other words, when lower bound A implies lower bound B and A is proved using a separating module, that yields a separating module for lower bound B:
• Lower bounds on partial derivatives imply lower bounds [BS83]
• Matrix rigidity implies circuit lower bounds [Val77]
• The chasm at depth 4 [AV08, Koi12] and the recent chasm at depth 3 [GKKS13]
• Tensor-rank lower bounds imply formula size lower bounds [Raz10]
Finally, in Section 3 we argue that the use of invariant properties is essentially necessary, and we give heuristic arguments that the use of separating modules is by far the easiest way to prove arithmetic circuit lower bounds. Thus separating modules are the first approach to try, and indeed may be the only approach that is easy enough that it will ever be carried out. We can already give one such heuristic argument: most arithmetic circuit lower bounds already use separating modules.
This new viewpoint makes new tools available, and suggests new conjectures and directions to better understand complexity classes and lower bounds. However, we do not provide new proofs of any of the above results. Our paper is similar in some ways to Natural Proofs [RR97] or Razborov's papers [Raz95a, Raz95b] on bounded arithmetic, in that we offer a meta-observation about many lower bounds. This involves digging into the details of the proofs of known lower bounds to understand them in a particular way, which is sometimes trivial but sometimes requires new insights. These previous meta-results have shown that a new viewpoint can be quite fruitful; for example, by working in the framework of bounded arithmetic, Razborov was able to come up with a beautiful new proof of the Switching Lemma [Raz95a]. Despite this new proof of a lower bound against $AC^0$, the fundamental message of the papers [RR97, Raz95a, Raz95b] was negative, giving barriers to proving strong lower bounds, whereas the message of this paper is positive, suggesting a route to proving lower bounds: a route that most arithmetic circuit lower bounds have already begun to traverse.
In Section 1.1, we discuss some of the implications of this work. We postpone further details of the implications until Section 4, as they are difficult to discuss properly without definitions and a full example in mind. We give the definitions and an example of how a previous lower bound fits into this new viewpoint in Section 2. In Section 3 we argue for the necessity of invariant properties and the feasibility and utility of separating modules, especially in comparison with other possible approaches. Section 4 contains further discussion and implications. In particular, we discuss the relation of this viewpoint to the larger GCT program: separating modules are only the very beginning of the GCT approach. We also discuss lower bounds which don't seem to fit into this framework (mostly those based on uniform hierarchy theorems), and we suggest some concrete directions for future research to push forward both our understanding of GCT, and our understanding of known lower bounds and the complexity classes they consider. We also discuss in what way Boolean lower bounds fit into this framework. In Sections 5 and 6 we prove that the results mentioned above use separating modules. However, if the reader is willing to take the above lists on faith, the significance of this paper can be understood without reading these last two sections in detail.
1.1. Implications. Our unifying viewpoint suggests the possibility of a fruitful two-way interplay between the methods currently being leveraged in GCT against major open problems like permanent versus determinant and P versus NP, and already hard-won knowledge for lower bounds on more tractable problems. Although we can state some of these possible interactions now, they will become clearer after the example in the next section, and we discuss further implications in Section 4.
First, the representation-theoretic viewpoint suggests where to look for new properties that might yield lower bounds. Even for lower bounds that are already essentially tight, the representation theory suggests how we might get new proofs of these lower bounds or otherwise understand them better.
Second, the representation-theoretic viewpoint suggests new conjectures, directions, and techniques that may prove fruitful; see, for example, the last paragraph of Section 4.1 and the open questions in Sections 4.2 and 4.3.
Third, by showing that previous lower bounds and GCT share a common representation-theoretic viewpoint, we reveal many new contexts in which it might hopefully be easier to develop the tools and techniques of algebraic geometry and representation theory needed for the GCT approach to bigger problems such as permanent versus determinant or P versus NP.
Fourth, it is often asked how difficult it is to re-prove known lower bounds using GCT. The viewpoint in this paper reveals that most of the old proofs already give representation-theoretic knowledge crucial to the GCT approach, in the form of separating modules. There is, however, a difference between separating modules and the geometric obstructions defined in [MS08]. Upgrading the previous lower bounds to yield such geometric obstructions is one of the open questions we discuss in more detail in Section 4.1. This is one of the ways in which GCT suggests how we might understand previous lower bounds better, even ones that are essentially tight.
For now, we mention just one more point: the representation-theoretic viewpoint replaces the amorphous notion of "useful property" with the specific mathematical notion of separating module. In Section 3 we argue that this is in some sense without loss of generality. This reduces an amorphous search for new useful properties to a comparatively feasible search for separating modules, which can even be made computational (see Appendix B and Section 4 for more).

Definitions and a motivating example
Most nonuniform lower bounds $C_{\text{hard}} \not\subseteq C_{\text{easy}}$ are proved by finding a property shared by all functions in the "easy" class $C_{\text{easy}}$ that some function $f \in C_{\text{hard}}$ does not have. The goal of this section is to introduce a representation-theoretic formalization of the types of properties used by most arithmetic circuit lower bounds, namely (linear-)invariant properties defined by polynomials.
2.1. Properties defined by polynomials. Throughout the definitions and motivation, we will use the example of the space $\mathrm{Poly}^2(x, y) = \{ax^2 + bxy + cy^2 \mid a, b, c \in \mathbb{F}\}$ of degree 2 homogeneous polynomials in two variables¹ over some field² $\mathbb{F}$, and the expression $b^2 - 4ac$. The space $\mathrm{Poly}^2(x, y)$ in this running example should be thought of as analogous to the space of polynomials we care about, like the determinant, permanent, etc. (which are points in $\mathrm{Poly}^n(x_{11}, x_{12}, \dots, x_{nn})$), but is small enough that we can carry out computations completely by hand, and the definitions in this context should already be familiar to the reader.
¹ The notation $\mathrm{Poly}^d(x_1, \dots, x_n)$ is not standard. We use it because it is clear and mnemonic. For reference we give the standard notation from the literature in Appendix D.
² In some of these examples, it may be necessary to restrict the characteristic of the field. In all of our actual results we specify the field more carefully.
Recall that $b^2 - 4ac = 0$ if and only if $ax^2 + bxy + cy^2$ is a perfect square³ $(\alpha x + \beta y)^2$ for some constants $\alpha, \beta \in \mathbb{F}$. We thus view $b^2 - 4ac \overset{?}{=} 0$ as a test for the property of being a perfect square, and we say that this property is defined by the (vanishing of the) polynomial $b^2 - 4ac$.
Note that here we consider $b^2 - 4ac$ not just as an expression, but as a polynomial in the variables $a, b, c$, which are the coefficients of the polynomials $ax^2 + bxy + cy^2$. Because there are two different spaces of polynomials here, we find it useful to give different names to them. We refer to polynomials such as $ax^2 + bxy + cy^2 \in \mathrm{Poly}^2(x, y)$ with $a, b, c$ constants as input polynomials: these are polynomials in the "input variables" $x, y$, and are also themselves inputs for the property tests. We refer to polynomials such as $b^2 - 4ac$ as test polynomials: these are polynomials whose variables are the coefficients of the input polynomials, and define a test for some property of input polynomials.
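The distinction between input polynomials and test polynomials can be made concrete in a few lines of code. The following is a minimal sketch, not taken from the paper: it assumes an input polynomial is stored as a dictionary from exponent vectors to coefficients, and the helper names `discriminant` and `square_of_linear_form` are hypothetical.

```python
from fractions import Fraction

def discriminant(f):
    """Test polynomial b^2 - 4ac, evaluated at an input polynomial.

    `f` maps exponent vectors (i, j) of x^i y^j to coefficients, so the
    input polynomial a x^2 + b xy + c y^2 is {(2, 0): a, (1, 1): b, (0, 2): c}.
    """
    a = f.get((2, 0), 0)
    b = f.get((1, 1), 0)
    c = f.get((0, 2), 0)
    return b * b - 4 * a * c

def square_of_linear_form(alpha, beta):
    """Coefficients of the input polynomial (alpha x + beta y)^2."""
    return {(2, 0): alpha * alpha, (1, 1): 2 * alpha * beta, (0, 2): beta * beta}

perfect_square = square_of_linear_form(Fraction(3), Fraction(-2))  # (3x - 2y)^2
not_a_square = {(2, 0): 1, (0, 2): 1}                              # x^2 + y^2

print(discriminant(perfect_square))  # 0: the test polynomial vanishes
print(discriminant(not_a_square))    # -4: the test polynomial does not vanish
```

Here the variables of `discriminant` are the coefficients stored in the dictionary, not the input variables x and y, which is exactly the sense in which it is a test polynomial.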
We index monomials by their exponent vectors $e \in \mathbb{Z}^n_{\ge 0}$ and write $x^e \overset{\text{def}}{=} x_1^{e_1} \cdots x_n^{e_n}$; we denote the corresponding coefficient by $a_e$, and then write any polynomial as $f(x) = \sum_{e \in \mathbb{Z}^n_{\ge 0}} a_e x^e$ (only finitely many terms will be nonzero). If $p \in \mathbb{C}[(a_e)_{e \in \mathbb{Z}^n_{\ge 0}}]$ is a test polynomial and $f = \sum_e \alpha_e x^e$ is an input polynomial, we write $p(f)$ for the evaluation of $p$ in which each test variable $a_e$ is set to the corresponding coefficient $\alpha_e \in \mathbb{F}$ of $f$.
Remark 2.2. Readers familiar with algebraic geometry will note that a property defined by test polynomials is exactly the same thing as an algebraic subset of the vector space $\mathrm{Poly}^d(x_1, \dots, x_n)$. This is an algebro-geometric viewpoint on complexity. We discuss this further in Section 3. For now we note that such algebro-geometric notions of complexity have been used before: border rank for matrix multiplication and "infinitesimal approximation" in GCT are both algebro-geometric notions of complexity in this sense.
Remark 2.3. By Hilbert's Basis Theorem, any property defined by polynomials can be defined by finitely many polynomials.
2.2. Linear-invariant properties defined by polynomials. Kayal [Kay11, Sec. 5.2] observes that several lower bounds use linear-invariant properties at their core, and in fact this observation was the starting point for this paper. In this paper we extend this observation in two directions simultaneously: (1) we observe that most arithmetic circuit lower bounds use (linear-)invariant properties defined by polynomials (Definition 2.1), allowing us to make the connection with representation theory and GCT, and (2) we extend the observation to most arithmetic circuit lower bounds.
Definition 2.4. A property Π of (input) polynomials is linear-invariant if for every polynomial $f(x_1, \dots, x_n)$ and every invertible linear change of variables $A \in GL_n(\mathbb{F})$,
$$f(x) \text{ has property } \Pi \iff f(Ax) \text{ has property } \Pi.$$
³ Equivalently, and perhaps more familiarly, $b^2 - 4ac = 0$ if and only if $ax^2 + bx + c$ has a double root.
Example 2.5. The property of being a perfect square is linear-invariant: $f(x) = g(x)^2$ if and only if $f(Ax) = g(Ax)^2$ for any invertible linear change of variables $A$. As explained in the previous section, in the case of $f(x, y)$ homogeneous of degree 2, this property is defined by the vanishing of the test polynomial $b^2 - 4ac$.
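As a sanity check on Example 2.5, one can verify numerically that the vanishing of $b^2 - 4ac$ is unchanged by an invertible change of variables. The sketch below is illustrative only: it assumes the convention that $A$ with rows $(\alpha, \beta)$ and $(\gamma, \delta)$ substitutes $x \mapsto \alpha x + \beta y$, $y \mapsto \gamma x + \delta y$, and the function names are hypothetical. It checks the classical identity that the discriminant of $f(Ax)$ equals $\det(A)^2$ times the discriminant of $f$.

```python
import random

def substitute(coeffs, A):
    """Coefficients (a', b', c') of f(Ax) for f = a x^2 + b xy + c y^2.

    Assumed convention: A = ((alpha, beta), (gamma, delta)) substitutes
    x -> alpha x + beta y and y -> gamma x + delta y.
    """
    a, b, c = coeffs
    (al, be), (ga, de) = A
    a2 = a * al * al + b * al * ga + c * ga * ga
    b2 = 2 * a * al * be + b * (al * de + be * ga) + 2 * c * ga * de
    c2 = a * be * be + b * be * de + c * de * de
    return (a2, b2, c2)

def disc(coeffs):
    """The test polynomial b^2 - 4ac."""
    a, b, c = coeffs
    return b * b - 4 * a * c

def det(A):
    (al, be), (ga, de) = A
    return al * de - be * ga

random.seed(0)
checked = 0
while checked < 100:
    A = tuple(tuple(random.randint(-5, 5) for _ in range(2)) for _ in range(2))
    if det(A) == 0:
        continue  # only invertible changes of variables are allowed
    f = tuple(random.randint(-5, 5) for _ in range(3))
    # The discriminant is rescaled by det(A)^2, so its vanishing is preserved.
    assert disc(substitute(f, A)) == det(A) ** 2 * disc(f)
    checked += 1

square = (4, 12, 9)  # (2x + 3y)^2
A = ((1, 2), (3, 5))
print(disc(square), disc(substitute(square, A)))  # 0 0
```

Since $\det(A) \neq 0$, the rescaling factor never turns a nonzero value into zero or vice versa, which is the invariance the example asserts.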
Example 2.6. The dimension of the space of all partial derivatives of a homogeneous polynomial $f$ is a linear-invariant property. The $k$-th order partial derivatives of $f$ are linearly independent from its $\ell$-th order partial derivatives for $k \neq \ell$, so we may prove this for each $k$ separately. Consider the partial derivative $\frac{\partial^k f}{\partial x_{i_1} \cdots \partial x_{i_k}}(x)$. When we transform the variables $x$ by $A$, we change both the variables with respect to which the derivatives are being taken and the variables at which the partial derivative is being evaluated. The fact that the former kind of transformation does not change the dimension of the space of partial derivatives follows from the usual "directional derivative" formula from multivariable calculus. The latter kind of transformation also does not change the dimension of a space of polynomials. We will see below that this property is also defined by polynomials.
The notion of a linear-invariant property defined by polynomials is embodied in the following definition. To make the definition clear we first introduce one more bit of notation. Each linear change of input variables $B \in GL_n(\mathbb{F})$ defines a linear map $\mathrm{Coeff}_B$ from $\mathrm{Poly}^d(x_1, \dots, x_n)$ to itself: $B$ sends $f(x) = \sum_e a_e x^e$ to $f(Bx) = \sum_e a'_e x^e$. In other words, $\mathrm{Coeff}_B$ is the map taking the coefficient vector $(a_e)_{e \in \mathbb{Z}^n_{\ge 0}}$ to the new coefficient vector $\mathrm{Coeff}_B((a_e)_e) = (a'_e)_e$. It is a standard fact, easily verified, that $\mathrm{Coeff}_B$ is linear⁴. Thus $B$ induces a linear map $\mathrm{Coeff}_B$ on the coefficients of input polynomials, which are in turn the variables of test polynomials. Then $\mathrm{Coeff}_B$ induces a linear map on test polynomials, taking $p((a_e)_e)$ to $p(\mathrm{Coeff}_B((a_e)_e))$.
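For the running example of binary quadratics, the map induced on coefficients by a change of variables can be written down as an explicit 3 × 3 matrix, which makes its linearity visible. The sketch below is an illustration under stated assumptions: the basis order $(x^2, xy, y^2)$, the substitution convention $x \mapsto \alpha x + \beta y$, $y \mapsto \gamma x + \delta y$, and the function names are all choices made here, not the paper's.

```python
def coeff_matrix(B):
    """Matrix of Coeff_B on binary quadratics in the basis (x^2, xy, y^2).

    Assumed convention: B = [[alpha, beta], [gamma, delta]] substitutes
    x -> alpha x + beta y and y -> gamma x + delta y. Multiplying this matrix
    by the column vector (a, b, c) gives the coefficients of f(Bx).
    """
    (al, be), (ga, de) = B
    return [
        [al * al, al * ga, ga * ga],                    # new coefficient of x^2
        [2 * al * be, al * de + be * ga, 2 * ga * de],  # new coefficient of xy
        [be * be, be * de, de * de],                    # new coefficient of y^2
    ]

def apply(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(M, N):
    """Matrix-matrix product for lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

A = [[1, 1], [0, 1]]  # x -> x + y, y -> y
B = [[2, 0], [0, 1]]  # x -> 2x,    y -> y
f = [1, 0, 0]         # the input polynomial x^2

print(apply(coeff_matrix(A), f))  # [1, 2, 1], i.e. (x + y)^2

# Substituting B first and then A is the single substitution BA:
# g(x) = f(Bx) gives g(Ax) = f(BAx).
assert matmul(coeff_matrix(A), coeff_matrix(B)) == coeff_matrix(matmul(B, A))
```

The entries of the matrix are polynomials in the entries of $B$ but do not depend on $f$, so the action on coefficient vectors is linear, as claimed in the text.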
Definition 2.7. A test $GL_n(\mathbb{F})$-module⁵ is a finite-dimensional vector space $T$ of test polynomials, say with basis $\{p_1, \dots, p_k\}$, such that for each $1 \le i \le k$ and each $B \in GL_n(\mathbb{F})$, $p_i(\mathrm{Coeff}_B((a_e)_e))$ lies in $T$.
We say a test module $T$ vanishes on an input polynomial $f$ if every test polynomial $p \in T$ vanishes at $f$. The set of input polynomials at which a given test module vanishes is a linear-invariant set, which we can think of as a linear-invariant property:
Fact 2.8. There is a many-to-one correspondence between test $GL_n(\mathbb{F})$-modules and linear-invariant properties defined by polynomials.
That is, each linear-invariant property defined by polynomials is defined by some test $GL_n(\mathbb{F})$-module, and each test $GL_n(\mathbb{F})$-module defines a linear-invariant property⁶. The proof involves only basic observations regarding group actions and algebraic sets (see Appendix A).
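Example 2.9. The vector space spanned by the test polynomial $b^2 - 4ac$ is a one-dimensional test $GL_2(\mathbb{F})$-module. For let $f(x, y) = ax^2 + bxy + cy^2$ and $A = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$, and write $f(Ax) = a'x^2 + b'xy + c'y^2$. A direct computation shows that $(b')^2 - 4a'c' = \det(A)^2 (b^2 - 4ac)$, which is a scalar multiple of $b^2 - 4ac$ and hence lies in its span.
2.3. Separating modules and a first example.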
Definition 2.10. A separating module⁷ for the lower bound $C_{\text{hard}} \not\subseteq C_{\text{easy}}$ is a test module $T$ such that $T$ vanishes on every function in $C_{\text{easy}}$, but does not vanish at some function $f_{\text{hard}} \in C_{\text{hard}}$.
The main thesis of this paper is that most arithmetic circuit lower bounds already use separating modules. We now demonstrate this with an example, by showing that Theorem 0 of Nisan and Wigderson [NW97] uses a separating module. We first recall their definitions and result. In the next section we argue that the existence of a separating module was in some sense necessary.
An arithmetic circuit is homogeneous if every gate in the circuit computes a homogeneous polynomial. The $d$-th elementary symmetric function in $n$ variables is the sum of all multilinear monomials of degree $d$ and is denoted $e_{d,n}$.
When d = cn for any 0 < c ≤ 1/4, this lower bound is exponential in n.
Proof outline. The key property they consider is the dimension of the space of all partial derivatives (of all orders) of a function. We denote this space $\partial(f)$. First, they show that $\dim \partial(C) \le s 2^d$ for any homogeneous depth 3 arithmetic circuit $C$ of size $s$ computing a polynomial of degree $d$. Next, they show that $\dim \partial(e_{2d,n}) \ge \binom{n}{d}$. Combining these inequalities, one gets $s 2^{2d} \ge \binom{n}{d} \ge (n/d)^d$.
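For readers who want the last step of the outline spelled out, here is the arithmetic as a short derivation. It uses only the two inequalities quoted in the outline, applied to a homogeneous depth 3 circuit $C$ of size $s$ that computes $e_{2d,n}$ (a polynomial of degree $2d$), together with the standard estimate $\binom{n}{d} \ge (n/d)^d$:

```latex
\[
  s\,2^{2d} \;\ge\; \dim \partial(C) \;=\; \dim \partial(e_{2d,n})
  \;\ge\; \binom{n}{d} \;\ge\; \left(\frac{n}{d}\right)^{d}
  \quad\Longrightarrow\quad
  s \;\ge\; \frac{1}{4^{d}}\left(\frac{n}{d}\right)^{d}
    \;=\; \left(\frac{n}{4d}\right)^{d}.
\]
```

For instance, taking $d = n/8$ gives $s \ge 2^{n/8}$.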
Proposition 2.12. There is a separating module for the lower bound of Theorem 2.11.
Proof. Let $\Pi(r)$ denote the property "$\dim \partial(f) \le r$." We argued in Example 2.6 that $\dim \partial(f)$ is a linear-invariant property for homogeneous $f$. We now show that this property is defined by a test $GL_n(\mathbb{F})$-module, and hence that the above proof yields a separating module. Let $f(x) = \sum_e a_e x^e$ be a homogeneous polynomial of degree $d$ (the only nonzero terms in the sum are those for which $\sum_i e_i = d$) and consider the following matrix $M_f$. The columns of $M_f$ are indexed by the monomials of degree $\le d$, and the rows of $M_f$ are indexed by the partial derivative operators (these are in bijective correspondence with monomials, but we refer to them this way to keep track of which is which). The entry in the $\partial^k / \partial x_{i_1} \cdots \partial x_{i_k}$ row and the $x^e$ column is the coefficient of $x^e$ in $\partial^k f / \partial x_{i_1} \cdots \partial x_{i_k}$. Note that this coefficient is some linear combination of the coefficients $a_e$ of $f$.
Then the dimension of $\partial(f)$ is the same as the (row) rank of $M_f$. It is a standard fact from linear algebra that $M_f$ has rank $\le r$ if and only if all the $(r + 1) \times (r + 1)$ minors of $M_f$ vanish. Each such minor is a degree $r + 1$ polynomial of the entries of $M_f$, which are themselves linear combinations of the coefficients $a_e$ of $f$. Hence each such minor is a test polynomial of degree $r + 1$. Let $T(r)$ denote the linear span of these minors. We have just shown that (the vanishing of the test polynomials in) $T(r)$ defines the property $\Pi(r)$.
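The quantity $\dim \partial(f)$ can be computed directly for small examples as the rank of a matrix of derivative coefficients. The following sketch works over the rationals (so it assumes characteristic zero) and, unlike $M_f$ in the proof, keeps only one row per distinct nonzero derivative; duplicate and zero rows do not affect the rank. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def derivative(poly, i):
    """Partial derivative with respect to variable i.

    Polynomials are dicts mapping exponent tuples to nonzero coefficients.
    """
    out = {}
    for e, coef in poly.items():
        if e[i] > 0:
            new_e = e[:i] + (e[i] - 1,) + e[i + 1:]
            out[new_e] = out.get(new_e, 0) + coef * e[i]
    return out

def all_partials(poly, n):
    """f together with its distinct nonzero partial derivatives of every order."""
    seen, frontier, result = set(), [poly], []
    while frontier:
        p = frontier.pop()
        key = frozenset(p.items())
        if not p or key in seen:
            continue
        seen.add(key)
        result.append(p)
        frontier.extend(derivative(p, i) for i in range(n))
    return result

def rank(rows):
    """Rank over the rationals by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def dim_partials(poly, n):
    """Dimension of the span of all partial derivatives of poly."""
    partials = all_partials(poly, n)
    monomials = sorted({e for p in partials for e in p})
    return rank([[p.get(e, 0) for e in monomials] for p in partials])

def elementary_symmetric(d, n):
    """e_{d,n}: the sum of all multilinear monomials of degree d in n variables."""
    return {tuple(1 if i in s else 0 for i in range(n)): 1
            for s in combinations(range(n), d)}

print(dim_partials(elementary_symmetric(2, 4), 4))  # 6
print(dim_partials({(2, 0): 1}, 2))                 # 3: span of x^2, x, 1
```

For $e_{2,4}$ the six independent derivatives are the polynomial itself, the four linear forms $\sum_{j \neq i} x_j$, and the constant 1; this is consistent with the bound $\dim \partial(e_{2d,n}) \ge \binom{n}{d}$, which for $d = 1$, $n = 4$ reads $6 \ge 4$.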
In particular, Π(r) is a linear-invariant property defined by polynomials. By Fact 2.8 Π(r) is defined by some test module, which is thus a separating module. However, we can argue further that T (r) itself is a test GL n (F)-module, and hence a separating module for the lower bound of Theorem 2.11.
In Example 2.6 we essentially showed that $M_{f(Ax)}$ is related to $M_{f(x)}$ by left and right multiplication by some matrices related to $A$ (in a similar way to how $\mathrm{Coeff}_A$ is related to $A$). It is a standard fact about minors that the $(r + 1) \times (r + 1)$ minors of $B M_f C$ are linear combinations of the $(r + 1) \times (r + 1)$ minors of $M_f$. Hence for any test polynomial $p \in T(r)$, $p \circ \mathrm{Coeff}_A$ is also in $T(r)$. Thus $T(r)$ is a separating module for Theorem 2.11. As with everything in complexity, in fact what we have is a family of separating modules. Namely, if we consider $e_{2d,n}$ with $d = n/8$, then $T(2^{n/8})$ vanishes at every polynomial computed by a depth 3 homogeneous circuit of degree $n/4$ and size at most $2^{n/8}$, but does not vanish at $e_{n/4,n}$.
2.4. Generalizations. For other lower bounds it is useful to generalize some of the above notions.
First, we can allow input objects other than input polynomials. For example, in the context of matrix rigidity it will be useful to consider input matrices. Regardless of the input objects, we still speak of test polynomials. In the case of input matrices, test polynomials are then polynomials whose variables are the coordinates a ij of the input matrices. In the context of Boolean functions, we often first represent a function by its unique multilinear polynomial, and then work in the context of input polynomials. But one could imagine a more direct representation in terms of something like "input circuits." In the context of the degree bound [Str73,BS83] and the connected components sorting lower bound [BO83], the input objects are (semi-)algebraic sets, given by their defining polynomial (in)equalities. The variables for the test polynomials are then the coefficients of the equations defining the algebraic sets.
Second, we can allow other types of invariance besides linear invariance. For example, we can hardly imagine a complexity measure or lower bound proof that depends on the order or names of the variables. Hence all properties used in complexity can be expected to be permutation-invariant: $f(x_1, \dots, x_n)$ has the property if and only if $f(x_{\pi(1)}, \dots, x_{\pi(n)})$ has the property, for any permutation $\pi$. We then speak of test $S_n$-modules, and the analog of Fact 2.8 holds (see Fact A.1). Note that $S_n$-modules are still defined as vector spaces; the use of vector spaces in the definition of test module was not specific to $GL_n$. We will use permutation-invariance in the contexts of matrix rigidity and multilinear formulas and circuits, as these concepts are not linear-invariant but they are permutation-invariant.
Another type of invariance that often arises is affine invariance. Here we generalize from linear transformations $x \mapsto Ax$ to affine transformations $x \mapsto Ax + b$, with $A \in GL_n(\mathbb{F})$ and $b \in \mathbb{F}^n$. The group of all such transformations is the affine general linear group $AGL_n(\mathbb{F})$. We then speak of affine-invariant properties and test $AGL_n(\mathbb{F})$-modules. Again, the analog of Fact 2.8 holds (see Fact A.1).
When the invariance is understood from context, we may simply refer to test modules and separating modules without reference to a particular group.

On the necessity and utility of separating modules and border complexity
In Section 3.1 we argue that the use of invariant properties is essentially necessary. In Section 3.2 we discuss situations where furthermore the use of separating modules is essentially necessary. Although not all complexity classes are defined by the vanishing of test polynomials, in Section 3.3 we argue that all nonuniform complexity classes, including Boolean ones, are "constructible" by test polynomials (see Definition 3.4). Finally, in Appendix B we give a heuristic argument as to why separating modules are likely to be the easiest way to prove lower bounds against constructible complexity classes, and shed light on a complexity class even when their use is not strictly necessary. Hence separating modules should be a first approach to try. We defer this final argument to an appendix only because it is heuristic, somewhat technical, and possibly contentious, and we do not wish to distract from the main points of the paper. However, one argument for this which we can already state is that most arithmetic circuit lower bounds already use separating modules, as shown in this paper.
Throughout this section and Appendix B, we only discuss nonuniform lower bounds. If $C$ is a nonuniform complexity class, then $C_n$ denotes the functions in $C$ with $n$ inputs. By a "property" in general, we mean a set of input polynomials, or more generally input objects.
3.1. Invariant properties are necessary. First we show that if $C_n$ is invariant under some group $G$ (such as $GL_n$, $S_n$, etc.), then any property used to prove a lower bound against $C_n$ can be transformed into a $G$-invariant property that proves the same lower bound. Then we argue that essentially all "naturally occurring" complexity classes and complexity measures are permutation-invariant, and many are linear- or affine-invariant.
If any property can be used against a $G$-invariant class, a $G$-invariant property can. Suppose property Π is used to prove a lower bound⁸ against $C_n$ by showing that $C_n \subseteq \Pi$ and $f_{\text{hard},n} \notin \Pi$. Let $\Pi_G$ denote the unique maximum $G$-invariant subset contained in Π; this exists by Zorn's Lemma, as an arbitrary union of $G$-invariant subsets is $G$-invariant. As $C_n$ is $G$-invariant, by the definition of $\Pi_G$ we have $C_n \subseteq \Pi_G$. The $G$-invariant property $\Pi_G$ then proves the same lower bound as Π, as $f_{\text{hard},n} \notin \Pi \supseteq \Pi_G \supseteq C_n$.
Essentially all complexity classes are permutation-invariant. All complexity measures and complexity classes we are aware of are permutation-invariant: they do not depend on the names or order of the variables. Indeed, we imagine that any complexity class or measure that was not permutation-invariant would be quite perverse, as the complexity of computing a function should really not depend on whether its variables are called $x_1, \dots, x_n$ or $a, b, c, \dots$, or $x_n, \dots, x_1$. Thus we can expect that any lower bound uses a permutation-invariant property, at the very least.
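The passage from Π to its maximum $G$-invariant subset $\Pi_G$ described above can be illustrated on a toy finite example. The sketch below is purely illustrative and is not from the paper: it takes $G = S_3$ acting on Boolean functions of three variables (stored as truth tables), a small permutation-invariant "easy" class, and a deliberately non-invariant property; all names are hypothetical.

```python
from itertools import permutations, product

n = 3
inputs = list(product((0, 1), repeat=n))  # all points of {0,1}^n
G = list(permutations(range(n)))          # the symmetric group S_n

def table(fn):
    """Truth table of a Boolean function, as a tuple indexed like `inputs`."""
    return tuple(fn(x) for x in inputs)

def act(pi, f):
    """Truth table of x -> f(x_{pi(0)}, ..., x_{pi(n-1)})."""
    lookup = dict(zip(inputs, f))
    return tuple(lookup[tuple(x[pi[i]] for i in range(n))] for x in inputs)

def orbit(f):
    return {act(pi, f) for pi in G}

def invariant_core(prop):
    """Largest G-invariant subset of prop: keep f iff its whole orbit lies in prop."""
    return {f for f in prop if orbit(f) <= prop}

# A permutation-invariant "easy" class: constants and single variables.
easy = {table(lambda x: 0), table(lambda x: 1)}
easy |= {table(lambda x, i=i: x[i]) for i in range(n)}

extra = table(lambda x: x[0] & x[1])        # its orbit also contains x0 AND x2, etc.
hard = table(lambda x: x[0] ^ x[1] ^ x[2])  # parity

prop = easy | {extra}        # a useful but non-invariant property
core = invariant_core(prop)  # plays the role of Pi_G

print(len(prop), len(core))  # 6 5
```

In this toy case the core still contains the whole easy class, still excludes the hard function, and is closed under the group action, mirroring the chain $f_{\text{hard},n} \notin \Pi \supseteq \Pi_G \supseteq C_n$.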
Many complexity classes, particularly algebraic ones, are furthermore linear- or affine-invariant. For example, arithmetic circuit size does not change by more than an additive difference of $n$ after a linear or affine transformation⁹. Additionally, circuit depth increases by at most 1; for circuits whose bottom gates are linear combination gates, the depth need not increase at all. For example, transformations are as powerful as parity). This is in line with Kayal's initial observation [Kay11, Sec. 5.2] that several known lower bounds use affine-invariant properties, and with our observations in this paper.
Hence, for all naturally occurring nonuniform complexity classes, if any property can be used to prove a lower bound, a permutation-invariant property can be used.

3.2. Test polynomials and border complexity.
A complexity class $C_n$ is typically not defined by the vanishing of some test polynomials. Hence when we prove a lower bound against $C_n$ using test polynomials, we in fact prove a lower bound against the slightly larger class which we denote $\overline{C_n}$ and refer to as "border-$C_n$," in line with normal usage in other contexts (the overline is for Zariski closure; see Definition 3.4). Standard results in algebraic geometry (e.g., [Mum76, Thm. 2.33], [BCS97, §20.6]) imply that $\overline{C_n}$ consists of all functions $f$ which can be written as a limit¹⁰ of functions in $C_n$.
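A standard small illustration of such a limit, which is not taken from this paper, may help fix ideas. Take the class of polynomials in $x, y$ over $\mathbb{C}$ that can be written as a sum of two cubes of linear forms. The polynomial $x^2 y$ is not of this form, yet it is a limit of polynomials that are:

```latex
\[
  \frac{1}{3t}\Bigl[(x + t y)^3 - x^3\Bigr]
  \;=\; x^2 y + t\,x y^2 + \frac{t^2}{3}\,y^3
  \;\xrightarrow[\;t \to 0\;]{}\; x^2 y .
\]
```

For every $t \neq 0$ the left-hand side is a sum of two cubes of linear forms (the scalars $\pm 1/(3t)$ can be absorbed into the linear forms over $\mathbb{C}$), so $x^2 y$ lies in the border of this class without lying in the class itself.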
In the next section we show that C is not too far from border-C. In Appendix B we argue that proving lower bounds against border-C is likely to be the easiest way to prove lower bounds against C, despite being a formally stronger statement. Here we present examples where there is known to be little or no difference, and begin arguing for the utility of border complexity.
⁸ For readers familiar with Natural Proofs [RR97], note that we are using the complementary notion of "useful property" here. They use properties Π that are disjoint from $C_n$, whereas we use properties Π that completely contain $C_n$. By taking the complements of sets, the two viewpoints are equivalent. We chose our viewpoint because it has nicer algebro-geometric properties, as in Appendix B.
⁹ In the model where addition gates can compute linear or affine combinations; in the weaker model where addition gates are just addition gates, the size still does not change by more than $O(n^2)$.
¹⁰ Over $\mathbb{C}$ the notion of limit is defined in the usual manner. Over, say, $\mathbb{F}_p$, we say a function $f$ is a limit of points in $C_n$ if there is a one-dimensional family of functions $f_t$ such that $f_t$ is well defined and in $C_n$ for all but finitely many values of $t \in \mathbb{F}_p$, and $f_0 = f$. There is one additional technical condition here, but we omit it since it does not affect our discussion.
Example 3.1 (Matrix multiplication). In the context of matrix multiplication the typical complexity measure is tensor rank, which is essentially the number of non-scalar multiplications needed to multiply two matrices. Tensor rank is known to agree with the total number of arithmetic operations up to a constant factor. The corresponding border complexity measure is called border rank, or sometimes "approximative complexity," first introduced by Bini, Capovani, Lotti, and Romani [BCRL79]. In general, border rank can be smaller than tensor rank. However, Bini [Bin80] showed that the exponent of matrix multiplication calculated with tensor rank (the smallest $\omega$ such that $n \times n$ matrix multiplication has tensor rank $O(n^{\omega})$) is the same as the exponent calculated with border rank. Thus, although border rank and tensor rank are not equal, they give the same asymptotic answer for matrix multiplication. Furthermore, the use of border rank has greatly increased our understanding of both upper and lower bounds for matrix multiplication. One of the main tools for finding efficient algorithms for matrix multiplication is Schönhage's asymptotic sum inequality [Sch81], which shows that an upper bound on border rank implies an upper bound on tensor rank. Conversely, most lower bounds on matrix multiplication seem to have a border rank lower bound at their heart. For example, Landsberg [Lan08, §6] showed that Bläser's tensor rank lower bound [Blä99] (the then best known bound) implicitly uses the same key lemma that Strassen used [Str83] to give a border rank lower bound. The currently best known lower bound on tensor rank [Lan12, MR12] also uses techniques from the best known lower bound on border rank [LO11].
Example 3.2 (Permanent versus determinant). In the context of permanent versus determinant, the typical complexity measure is determinantal complexity: the size of the smallest matrix M (x) with linear combinations of the variables x for entries such that det(M (x)) = perm(x). Mulmuley and Sohoni [MS01] use the analogous notion of border determinantal complexity, which they refer to as "infinitesimal approximative" complexity. Independently, Bürgisser, Landsberg, Manivel, and Weyman [BLMW11, Prop. 9.4.3] and the author [Gro12, Prop. 3.5.4] show that under certain fairly general circumstances the border determinantal complexity only differs from the determinantal complexity by a polynomial, and state a conjecture which would imply this is always the case. Thus border complexity here is not as far from standard complexity as it may at first seem.
In contrast, Mulmuley and Sohoni [MS01, §4.2] give an example of a function which has border determinantal complexity poly(n) but which may have super-polynomial determinantal complexity. Such functions exhibit a difference in the difficulties of resolving the complexity of matrix multiplication and resolving the permanent versus determinant problem. Nonetheless, they conjecture [MS01, Conj. 4.3] that no VNP-hard function has polynomial border determinantal complexity. One might also guess that for quasi-polynomial complexity there is no difference, that is, that the following question has a positive answer:
Open Question 3.3. Does polynomial, or more generally quasi-polynomial, border determinantal complexity imply quasi-polynomial determinantal complexity? Equivalently, is $\overline{VP_{ws}} \subseteq VQP$, or more generally $\overline{VQP} = VQP$?
Either way, as all of our current techniques give bounds on border complexity, Question 3.3 is an archetype of a fundamental question of the difference between the way complexity classes are usually defined and the methods we use for proving lower bounds against them.
Because of the above results and the prevalent use of test polynomials in known lower bounds, as well as the arguments in Appendix B, we submit that border complexity in general, and not only in the context of matrix multiplication, is a natural and useful measure of complexity from the perspective of lower bounds (and, in the context of matrix multiplication, upper bounds as well!).
3.3. Nonuniform complexity classes are constructible by test polynomials. Over any field, if $C_n$ is defined by test polynomials, say $C_n = \{f \mid t_1(f) = t_2(f) = \cdots = t_k(f) = 0\} \subseteq \mathrm{Poly}_{d(n)}(x_1, \ldots, x_n)$, then $f_{hard,n} \notin C_n$ if and only if there is some $1 \leq i \leq k$ such that $t_i(f_{hard,n}) \neq 0$. For such classes, the use of test polynomials is necessary and sufficient to prove a lower bound. However, most complexity classes are not defined by test polynomials in this manner. We will argue here that all naturally occurring complexity classes are nonetheless "constructible" by test polynomials (definition below). In Appendix B we argue that test polynomials (and hence, via Section 3.1 and Fact 2.8, separating modules) are nonetheless incredibly useful for understanding such constructible (invariant) complexity classes.
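As a toy instance of a class defined by test polynomials (our own illustration, not one of the classes studied in this paper), consider binary quadratic forms $ax^2 + bxy + cy^2$ over an algebraically closed field of characteristic other than 2. The forms that are squares of linear forms are exactly those on which the single test polynomial $t(a, b, c) = b^2 - 4ac$ vanishes. The sketch below, with hypothetical function names, evaluates this test polynomial on integer coefficient vectors and reports whether a candidate hard form is separated from the class.

```python
def discriminant_test(a, b, c):
    """Test polynomial t(a, b, c) = b^2 - 4ac in the coefficients of a*x^2 + b*x*y + c*y^2."""
    return b * b - 4 * a * c

def square_of_linear_form(alpha, beta):
    """Coefficient vector (a, b, c) of the quadratic form (alpha*x + beta*y)^2."""
    return (alpha * alpha, 2 * alpha * beta, beta * beta)

def is_separated(coefficients, test_polynomials):
    """True if some test polynomial does not vanish at the given input polynomial."""
    return any(t(*coefficients) != 0 for t in test_polynomials)

# The form x*y has coefficient vector (0, 1, 0); it is not the square of a linear form.
xy_is_hard = is_separated((0, 1, 0), [discriminant_test])
```

Here `xy_is_hard` records that the test polynomial is nonzero at $xy$, which is precisely the certificate that $xy$ lies outside the class.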
Definition 3.4 (Zariski, i. e. algebro-geometric, topology). A set defined by the vanishing of test polynomials is called (Zariski-)closed. A set is constructible if it can be constructed from closed sets by taking complements, unions, and intersections.
The closure of a set $S$ is the smallest closed set containing $S$, and is denoted $\overline{S}$. If $S$ is a Zariski-constructible set over $\mathbb{C}$, then its Zariski-closure coincides with its closure in the usual complex topology (see, e. g., [Mum76, Thm. 2.33]). Note that the closure $\overline{S}$ is the set of all points which cannot be separated from $S$ by test polynomials.
The main insight of this section is a corollary to Chevalley's constructibility theorem. To state this theorem, we need one more concept. A map ϕ : A → B between constructible sets is called algebraic if its graph {(a, ϕ(a))|a ∈ A} is a closed subset of A × B. Equivalently, let x 1 , . . . , x n be coordinates on B, not necessarily independent; then ϕ is algebraic if and only if for each i, x i (ϕ(a)) can be expressed as a polynomial in the coordinates of a ∈ A.
Chevalley's Theorem is most concisely stated for Noetherian rings, but we will not need their definition here. For our purposes it suffices that this includes Z, Z/nZ, rings of algebraic integers, all fields, polynomial rings, and quotients of polynomial rings.
Theorem 3.5 (Chevalley's Theorem; see Footnote 11). Over any Noetherian ring the image of any algebraic map is constructible.
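A standard textbook example (not from this paper) shows why "constructible" cannot be strengthened to "closed" in Chevalley's Theorem: the image of the algebraic map $(x, y) \mapsto (x, xy)$ is $\{(a, b) \mid a \neq 0\} \cup \{(0, 0)\}$, which over an infinite field is neither closed nor open. The Python sketch below only checks this set-theoretic description by brute force over a small prime field $\mathbb{F}_p$ (where every subset is trivially closed, so the check says nothing about topology); the function names are hypothetical and `p` is assumed to be prime.

```python
def image_of_map(p):
    """Image of (x, y) -> (x, x*y) on the plane over the prime field F_p."""
    return {(x, (x * y) % p) for x in range(p) for y in range(p)}

def constructible_description(p):
    """The set {a != 0} union {(0, 0)}, a Boolean combination of closed conditions."""
    return {(a, b) for a in range(p) for b in range(p) if a != 0 or b == 0}

def descriptions_agree(p):
    """Compare the image with its description as a constructible set."""
    return image_of_map(p) == constructible_description(p)
```

The description uses one inequality ($a \neq 0$) and one system of equations ($a = b = 0$), which is the shape of set that Definition 3.4 calls constructible.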
We are not aware of any nonuniform complexity classes, algebraic or otherwise, that do not belong to one of the classes described in the following corollary:
Corollary 3.6. Let $C$ be a nonuniform complexity class; then $C_n$ is (Zariski-)constructible if any of the following hold:
(1) $|C_n|$ is finite; or
(2) $C$ is closed under simple (resp. linear, resp. affine) projections, and contains a problem that is complete under simple (resp. linear, resp. affine) projections; or
(3) $C_n$ is defined by a class of circuits that are restricted to have one of finitely many (a number which may grow with $n$) shapes. Here by the "shape" of a circuit, we mean the underlying directed acyclic graph together with operators labeling the internal nodes; or
(4) More generally, $C_n$ is first-order definable in the language of rings over a Noetherian ring, or in the language of ordered rings over an ordered Noetherian ring.
A "simple projection" here means any map that sends each variable $x_i$ to a constant $\alpha$ or to a constant multiple of a variable $\alpha y_j$. A linear projection sends each $x_i$ to a linear combination of variables $\sum_j \alpha_{ij} y_j$, and an affine projection additionally allows an additive constant: $x_i \mapsto \alpha_{i0} + \sum_j \alpha_{ij} y_j$.
Condition (3) includes circuit classes defined in terms of fan-in, size, depth, or connectivity properties like skew or weakly-skew.
Proof.
(1) Any finite set is defined by the vanishing of test polynomials, i. e. it is closed, hence constructible.
(2) The set of simple (resp. linear, resp. affine) projections is closed, as we show below; denote this set by R, for "reductions." If f n is a complete function, and F is the space of input functions (objects, etc.), then define a map ϕ : R → F by ϕ(r) = r(f n ). From the definition of projection, it is easily seen that ϕ is algebraic. Then C n is the image of ϕ, hence is constructible by Chevalley's Theorem.
The set of linear (resp. affine) projections from functions on n variables to functions on m variables is just the set of m × n (resp. (m + 1) × n) matrices, so is closed. The set of simple projections is the subset of affine projections defined by the property that each column of the (m + 1) × n matrix has at most one nonzero entry. The latter condition is equivalent to the condition that the product of any two entries from a given column vanishes, hence the set of simple projections is closed.
(3) For each circuit shape $G$, the set of circuits of that shape is $\mathbb{F}^N$ where $N$ is the number of edges whose endpoints are linear combination gates. Let $\mathrm{Ckt}_G$ denote this space, and let $\varphi_G : \mathrm{Ckt}_G \to \mathrm{Poly}_d(x_1, \ldots, x_n)$ be the map which takes each circuit of shape $G$ to the function it computes. It is easily seen that $\varphi_G$ is algebraic, so its image is constructible by Chevalley's Theorem. Then $C_n$ is the union over finitely many shapes $G$ of $\mathrm{Im}(\varphi_G)$. As a union of constructible sets is constructible, so is $C_n$.
(4) A first-order definable set is defined by some first-order formula. For quantifier-free formulas, this is exactly a set defined by a logical combination of equalities and inequalities, namely a constructible set. The only tricky part is then to handle quantifiers. By replacing a universal quantifier ∀x by ¬∃x¬ and noting that the complement of a constructible set is constructible, we need only handle existential quantifiers. If ϕ(x) is a first-order formula without quantifiers, let C ′ denote the set of those x that satisfy ϕ(x). Then the set defined by ∃x 0 ϕ(x) is equal to the image of C ′ under the projection which sends (x 0 , x 1 , . . . , x n ) → (x 1 , . . . , x n ). By Chevalley's Theorem, the image of this projection is constructible.
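The closedness argument for simple projections in part (2) of the proof can be made concrete. The sketch below, with hypothetical function names, treats a projection as a matrix with integer entries and compares the polynomial condition (all products of two entries from a common column vanish) with the direct condition (each column has at most one nonzero entry). It assumes entries from an integral domain, which is what makes the two conditions equivalent.

```python
from itertools import combinations

def column_products(matrix):
    """All products of two distinct entries taken from the same column."""
    num_rows, num_cols = len(matrix), len(matrix[0])
    return [matrix[i][j] * matrix[k][j]
            for j in range(num_cols)
            for i, k in combinations(range(num_rows), 2)]

def satisfies_polynomial_conditions(matrix):
    """Closed condition: every product of two entries from one column is zero."""
    return all(value == 0 for value in column_products(matrix))

def at_most_one_nonzero_per_column(matrix):
    """Direct condition defining simple projections among affine ones."""
    num_cols = len(matrix[0])
    return all(sum(1 for row in matrix if row[j] != 0) <= 1 for j in range(num_cols))

simple = [[0, 2], [3, 0], [0, 0]]      # one nonzero entry in each column
not_simple = [[1, 0], [1, 0], [0, 5]]  # first column has two nonzero entries
```

Each entry of `column_products` is a degree-2 test polynomial in the matrix entries, so the set of simple projections is cut out by finitely many polynomial equations.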
Note that if a circuit class is defined as the image of some map-as nearly all of them are, as in conditions (2) and (3)-finding its representation as a union of differences of closed sets may be difficult, even uncomputable. However, over finite fields this is a finite problem, hence computable, and over algebraically closed fields or real closed fields quantifier elimination algorithms such as Tarski's [Tar48] make this process effective.
Remark 3.7. The (Zariski-)closures of classes satisfying condition (2) of Corollary 3.6 for linear or affine projections are orbit closures for $GL_n$, respectively $AGL_n$. Much of the current research in GCT studies the orbit closures associated to the permanent, determinant, and matrix multiplication. Considering their structure as orbit closures rather than just $G$-invariant sets facilitates their study greatly, much as the existence of complete problems facilitates the study of a complexity class. In this paper we show that by extending our viewpoint to all $G$-invariant complexity classes and not just orbit closures, GCT becomes much more general and far-reaching.
4. Discussion, relation to the GCT program, and future directions
In this paper, we show that most arithmetic circuit lower bounds and implications between lower bounds fit naturally into the representation-theoretic framework suggested by geometric complexity theory, specifically in the form of separating modules. In this section we discuss further implications of this connection, as well as which lower bounds seem to not fit into this framework (ones which are essentially uniform), the status of lower bounds in positive characteristic, and the relation between this work and the larger GCT program.
Eric Allender observed that all the lower bounds mentioned here use Razborov-Rudich-natural [RR97] properties (see Footnote 12), and asked whether this was just a coincidence. In light of the generality of separating modules (Section 3), we believe that it is indeed a coincidence, and has more to do with the fact that most known results use such properties than it has to do with any inherent limitations of the representation-theoretic viewpoint. Indeed, there is evidence that the GCT approach over $\mathbb{C}$ philosophically (see Footnote 12) avoids the Razborov-Rudich barrier (the author's thesis [Gro12, Sec. 3.4.3] contains an overview of such evidence).
4.1. Relation to Geometric Complexity Theory. To state how the separating modules used in this paper differ from the geometric obstructions defined in Mulmuley and Sohoni [MS08], and to discuss possible further interactions between previous results and geometric complexity theory, we first recall two standard definitions from representation theory, as applied to test modules. Test $G$-modules for any group $G$ (such as $GL_n(\mathbb{F})$, $S_n$, etc.) are, in particular, representations of $G$; indeed, the term "module" is often used interchangeably with "representation" (see Appendix C for more on the terminology). When we consider a test $G$-module as just a representation of $G$ (equivalently, as a $G$-module), we forget that it consists of test polynomials, and only remember that it is a vector space and how the elements of $G$ move vectors around within this vector space.
A classical theorem (see, e. g., [FH91]) says that over an algebraically closed field of characteristic zero, every $GL_n$- or $S_n$-module is a direct sum (as representations, that is, as vector spaces) of irreducible submodules. In particular, this implies that if there is a separating module for a lower bound over $\mathbb{C}$, there is an irreducible separating module. We could have included irreducibility in the definition of test module for this reason, but chose not to in order to keep the definition simple and to avoid complications over other fields, especially finite fields. The property of splitting into a direct sum of irreducible submodules is known as "complete reducibility." It is known to fail in general for $AGL_n$-modules, even over $\mathbb{C}$, and for $GL_n$- and $S_n$-modules in positive characteristic.
Definition 4.2. Two (test) $GL_n$-modules $T_1, T_2$ are equivalent (as representations) if there is a bijective linear map $L : T_1 \to T_2$ such that for all $A \in GL_n$ and all test polynomials $p \in T_1$, $L(A \cdot p) = A \cdot L(p)$.
This definition is purely representation-theoretic, in that it ignores the "test polynomial" structure of the test modules, and treats them only as representations. Because this notion of equivalence forgets the underlying polynomials of the test modules, it is possible, and is likely to be the generic situation, for two equivalent test $GL_n$-modules to define distinct linear-invariant properties. Nonetheless, the term "equivalent" is standard in representation theory, so we use it here.
To discuss the geometric obstructions of GCT, we work over $\mathbb{C}$. By complete reducibility, the space of all test polynomials can be written as a direct sum of test $GL_n(\mathbb{C})$-modules. If we group these modules by their equivalence classes, we may write the space of all test polynomials as the direct sum $\bigoplus_{\lambda} \bigoplus_{i=1}^{m_\lambda} T_{\lambda,i}$, where the $\lambda$'s index the irreducible equivalence classes. (An equivalence class is called irreducible if any representation in this class is irreducible.) It turns out that each equivalence class $\lambda$ can only occur amongst a specific degree $d(\lambda)$ of test polynomials, and since the space of test polynomials of any fixed degree is finite-dimensional, each $m_\lambda$ is finite. Moreover, the numbers $m_\lambda$ are independent of the choice of direct sum. We refer to $m_\lambda$ as the multiplicity of the equivalence class $\lambda$ in the space of test polynomials.
12 The Boolean properties satisfy the Razborov-Rudich conditions, and although there is no known algebraic analog of the Razborov-Rudich barrier, the algebraic properties mentioned in the previous sections seem like they ought to fulfill the requirements of such an analog, were it to exist.
If $C$ is a linear-invariant complexity class, and we consider the space of all test polynomials that vanish everywhere on $C$ (hence on $\overline{C}$, see Section 3.2), we may write this space as $\bigoplus_{\lambda} \bigoplus_{i=1}^{m_\lambda(C)} T_{\lambda,i}$.
Definition 4.3 (Mulmuley and Sohoni [MS08]; see Footnote 13). A multiplicity obstruction for the lower bound $C_{hard} \not\subseteq C_{easy}$ is an irreducible equivalence class $\lambda$ such that $m_\lambda(C_{easy}) > m_\lambda(C_{hard})$. An occurrence obstruction or geometric obstruction for $C_{hard} \not\subseteq C_{easy}$ is a multiplicity obstruction which further has $m_\lambda(C_{easy}) = m_\lambda$, that is, every test module equivalent to $\lambda$ vanishes on $C_{easy}$.
The existence of a multiplicity obstruction $\lambda$ implies the existence of a separating module, as then there must be some test $GL_n(\mathbb{C})$-module of type $\lambda$ that vanishes on $C_{easy}$ but not on $C_{hard}$. These are referred to as "obstructions" because they obstruct the inclusion $C_{hard} \subseteq C_{easy}$, much as a $K_5$-minor obstructs a planar embedding of a graph.
One advantage of considering multiplicities rather than test modules is that it opens the possibility of using purely representation-theoretic techniques to understand the multiplicities, as is being pursued in GCT (e. g., [Bla12,BCI09,BMS11,ASS09]). To see how this is possible-that is, how one can discuss multiplicity obstructions without reference to actual test polynomials or modules thereof-we must mention a bit more about the representation theory of GL n and S n . Over C, the irreducible representations of these groups have been classified for over 100 years (see, e. g., [FH91]). The equivalence classes of irreducible representations are in bijective correspondence with integer partitions-partitions with at most n parts in the case of GL n (C), and partitions of the number n in the case of S n . The use of partitions enables us to talk about the multiplicities m λ and m λ (C) without reference to any particular (test) module. This is just one of the advantages of the representation-theoretic viewpoint; we discuss two other advantages below.
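To make the labeling by partitions tangible, the following sketch (our own, with hypothetical function names) enumerates the partitions of a number, which label the irreducible representations of $S_n$ over $\mathbb{C}$, and filters those with at most a given number of parts, which is the constraint stated above for $GL_n(\mathbb{C})$.

```python
def partitions(n, max_part=None):
    """Yield the partitions of n as weakly decreasing tuples of positive integers."""
    if max_part is None or max_part > n:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(max_part, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def symmetric_group_labels(n):
    """Partitions of n: one label per irreducible representation of S_n over C."""
    return list(partitions(n))

def partitions_with_at_most_parts(size, max_parts):
    """Partitions of `size` having at most `max_parts` parts."""
    return [lam for lam in partitions(size) if len(lam) <= max_parts]
```

For example, `symmetric_group_labels(4)` lists the five partitions of 4, matching the five irreducible representations of $S_4$, and `partitions_with_at_most_parts(4, 2)` keeps only `(4,)`, `(3, 1)`, and `(2, 2)`.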

4.2. Understanding old lower bounds better (even tight ones!). In this paper, we show that most arithmetic circuit lower bounds yield separating modules, but typically just one separating module for each lower bound. While this suffices for the lower bound, considering other separating modules that can be used for a given lower bound (or non-separating test modules) may give deeper insight. Indeed, by Fact 2.8, this is equivalent to knowing which other invariant properties defined by polynomials can be used (or not) for a lower bound. Understanding which (invariant) properties a complexity class has is surely a task worth undertaking, even for lower bounds that are already tight or as good as we want.
However, trying to understand all such test modules is quite an enormous task. It does not just ask for new proofs of old lower bounds (for example, just asking for a single new separating module for the lower bound) but rather asks to understand, in some sense, all possible proofs of a given lower bound. Instead, the difference between separating modules and multiplicity obstructions suggests a more feasible step in this direction which may well be within reach:
Open Question 4.4. Upgrade the proofs of lower bounds mentioned in this paper from separating modules to multiplicity (or stronger: occurrence) obstructions.
As a first step towards Open Question 4.4, which the author hopes to make the subject of future work, we have:
Open Question 4.5. Determine the labels (partitions, see Section 4.1) of the separating modules in the lower bounds mentioned in this paper.
13 Although only geometric obstructions were explicitly defined in [MS08], multiplicity obstructions were essentially defined there: see the sentence just before [MS08, Def. 1.2].
4.3. The role of explicitness and constructivity. Mulmuley [Mul10] and Williams [Wil13] have both argued for the necessity of constructive methods in proving lower bounds. We can use the representation-theoretic viewpoint to give a further argument for explicitness, albeit a heuristic one. It also allows us to quantify the explicitness or constructivity of known proofs in various ways.
Suppose we are trying to prove $C_{hard} \not\subseteq C_{easy}$. In Section 3 and Appendix B we argue that this is likely to be done using separating modules. If such a separating module exists, it should furthermore be the case that a random test module that vanishes on $C_{easy}$ does not vanish on $C_{hard}$, and hence is a separating module, for some notion of "random" which can probably be made precise. However, to prove the existence of a separating module unconditionally (that is, without assuming the lower bound we are trying to prove) one seems to need a more explicit description of the separating module. This is related to the recent results of Mulmuley [Mul12] linking derandomization with algorithms for computational problems in algebraic geometry.
One measure of constructivity is the degree and number (dimension) of test polynomials used. As in the context of Razborov-Rudich [RR97] and Williams [Wil13], we should expect to measure this degree as a function of something like the size of the truth table of the input polynomials involved. In an algebraic context, we might replace truth table size by the number of monomials. For polynomials of degree $O(n)$ in poly(n) variables, the number of monomials is $2^{O(n \log n)}$, which is comparable to truth table size.
Another more delicate measure of constructivity is the complexity of verifying that a given test module (perhaps from a specific subset of test modules) is indeed a separating module. This is related to our discussion in Appendix B.
Using the fact that partitions classify the irreducible representations of $GL_n$ or $S_n$ over $\mathbb{C}$, we get another measure of constructivity. In general, the dimension of an irreducible representation can be exponential in the bit-size of its corresponding partition, so the partition can serve as a succinct label of an equivalence class of representations. One can then consider the computational complexity of constructing from $0^n$ a partition corresponding to a multiplicity (or occurrence) obstruction for a nonuniform lower bound at input length $n$. Mulmuley conjectures [Mul10] that this construction problem can be solved in P for occurrence obstructions in the context of permanent versus determinant and NP versus P/poly. In fact, Mulmuley suggests that finding a polynomial-time algorithm to verify whether a given $\lambda_n$ is the label of an obstruction is a crucial first step towards proving the existence of obstructions unconditionally. This suggests a strengthening of Open Question 4.4:
Open Question 4.6. Upgrade the lower bounds mentioned in this paper to multiplicity obstructions where the label $\lambda_n$ of the obstruction at input length $n$ can be computed in poly(n)-time.
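The gap between the size of a partition and the dimension of the representation it labels can be seen already for $S_n$, using the hook length formula (a standard fact not stated in this paper): the dimension of the irreducible representation labeled by a partition $\lambda$ of $n$ is $n!$ divided by the product of the hook lengths of $\lambda$. The sketch below is a minimal implementation with hypothetical function names; it assumes the partition is given as a weakly decreasing tuple.

```python
from math import factorial

def hook_lengths(shape):
    """Hook lengths of all cells of the Young diagram of a partition."""
    if not shape:
        return []
    # conjugate[j] is the number of rows that reach column j.
    conjugate = [sum(1 for row in shape if row > j) for j in range(shape[0])]
    hooks = []
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1           # cells to the right in the same row
            leg = conjugate[j] - i - 1  # cells below in the same column
            hooks.append(arm + leg + 1)
    return hooks

def symmetric_group_irrep_dimension(shape):
    """Dimension of the irreducible S_n-representation labeled by `shape`."""
    hook_product = 1
    for h in hook_lengths(shape):
        hook_product *= h
    return factorial(sum(shape)) // hook_product
```

For the two-row partitions $(k, k)$ the dimensions are the Catalan numbers, which grow exponentially in $k$, while the partition itself is described by two integers of $O(\log k)$ bits each.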
Note that resolving Question 4.5 would provide natural candidates for labels λ that might be multiplicity obstructions. Both Question 4.5 and this one seem within reach, especially given the recent occurrence obstructions constructed by Bürgisser and Ikenmeyer [BI12] in the context of matrix multiplication.
The more general question of verification is also interesting: Open Question 4.7. For any of the lower bounds mentioned here, what is the complexity of verifying multiplicity or occurrence obstructions? That is, given λ n , what is the complexity of verifying that λ n is indeed a multiplicity obstruction?
Observation 4.8. For any Boolean circuit lower bound against a permutation-invariant complexity class-which includes all natural classes, see Section 3.1-there is a separating S n -module.
Proof. By Fact A.1 for S n , we only need to argue that the complexity class is defined by test polynomials. As the space of Boolean functions on n variables is finite, every property of n-variable Boolean functions is finite and hence defined by test polynomials over F 2 .
In particular, any nonuniform Boolean lower bound implies the existence of a separating module. Despite the fact that the above observation says that separating modules can be used without loss of generality for Boolean circuit lower bounds, we find this observation alone somewhat unsatisfying. However, as with the results of Razborov, Smolensky, and Grigoriev-Razborov over finite fields (see Section 5.3), we believe that many Boolean circuit lower bounds in fact yield separating modules in a very direct and natural manner.
Even without having verified this for many known Boolean lower bounds, we can begin to argue why we expect this to be the case. By the discussion in Appendix B, it is reasonable to expect that lower bounds use properties Π which are naturally defined by some logical combination of the vanishing of some polynomials and the non-vanishing of other polynomials. We already know that the properties used can be defined by the vanishing of some test polynomials; the key here is the naturality (in the usual sense of the word, not the Razborov-Rudich sense).
Putting this logical combination into disjunctive normal form, $\Pi$ can be naturally expressed as a union of properties of the form $\Pi_i \setminus \Pi'_i = \Pi_i \cap \Pi'^{c}_i$, where each $\Pi_i$ and $\Pi'_i$ is defined by the vanishing of test polynomials and $\Pi'^{c}_i$ denotes the complement of $\Pi'_i$. Say $\Pi'_i$ is defined by the vanishing of the test polynomials $f_1(x_1, \ldots, x_n) = \cdots = f_k(x) = 0$. Then its complement is most naturally defined by the non-vanishing of at least one of the $f_i$. However, the complement $\Pi'^{c}_i$ can also be defined by the vanishing of the single polynomial $\prod_{i=1}^{k} (f_i(x) - 1)$. Furthermore, by applying $x_i^2 = x_i$, we may take the degree of this single polynomial to be at most $\min\{n, \sum_{i=1}^{k} \deg(f_i)\}$. In terms of constructivity, we thus do not lose much by considering $\Pi'^{c}_i$ as being defined by the vanishing rather than non-vanishing of test polynomials: the single polynomial defining $\Pi'^{c}_i$ has low degree, and there is only one such polynomial, so the number of polynomials used to define the property also does not increase.
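The replacement of non-vanishing by vanishing over $\mathbb{F}_2$ can be checked by brute force on small examples. In the sketch below (our own illustration; the function names and the two sample polynomials are hypothetical), polynomials are represented as Python functions evaluated modulo 2, and the two descriptions of the complement are compared on all of $\mathbb{F}_2^n$.

```python
from itertools import product

def complement_by_nonvanishing(polys, n):
    """Points of F_2^n at which at least one of the polynomials is nonzero."""
    return {x for x in product((0, 1), repeat=n)
            if any(f(x) % 2 != 0 for f in polys)}

def complement_by_single_polynomial(polys, n):
    """Points of F_2^n at which the product of the (f_i - 1) vanishes."""
    def single_polynomial(x):
        value = 1
        for f in polys:
            value = (value * (f(x) - 1)) % 2
        return value
    return {x for x in product((0, 1), repeat=n) if single_polynomial(x) == 0}

# Two sample polynomials in three variables: x0*x1 and x1 + x2.
sample = [lambda x: x[0] * x[1], lambda x: x[1] + x[2]]
agree = complement_by_nonvanishing(sample, 3) == complement_by_single_polynomial(sample, 3)
```

The agreement relies on the fact that over $\mathbb{F}_2$ a value is nonzero exactly when it equals 1, which is the use of finiteness discussed in the text.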
Remark 4.9. A similar idea works over any finite field $\mathbb{F}_q$: use $\prod_{0 \neq \alpha \in \mathbb{F}_q} (f_i(x) - \alpha)$ in place of $f_i(x) - 1$, reduce by $x_i^q = x_i$, and the resulting degree is at most $(q - 1) \min\{n, \sum_i \deg(f_i)\}$. One might argue that using $\prod_i (f_i(x) - 1) = 0$ rather than the non-vanishing of some $f_i$ is unnatural, or violates the technique or idea of the lower bound proof that used property $\Pi$. However, if this were really the case, then the lower bound proof would hold for the vanishing/non-vanishing of some $f_i$ as formal polynomials, and hence would work over fields larger than $\mathbb{F}_2$, and in particular would hold over the algebraic closure $\overline{\mathbb{F}_2}$. With the exception of the results mentioned in Section 5.3, we are not aware of Boolean lower bounds that extend to any such fields. In this sense, the use of finiteness in Observation 4.8 seems less of a kludge to us, and more an essential feature of the current techniques for Boolean circuit lower bounds.

4.5. Other lower bounds? Although we have obviously not considered all known lower bounds, we have considered quite a wide cross-section of them in this paper. Of the lower bounds which we actively tried to fit into this framework but have not yet been able to do so, most use heavily machine-based diagonalization. Examples include the (non)deterministic time and space hierarchies [HS65, Coo73], uniform lower bounds on the permanent [All99, AG94, KP09], time-space trade-offs for SAT [For00, FLvMV05, DvMW11, Wil08, Wil06, BW12], $\Sigma_2 P \cap \Pi_2 P \not\subseteq SIZE(n^k)$ [Kan82], and the related result $MA_{EXP} \not\subseteq P/poly$ [BFT98].
Remark 4.10. Although from one viewpoint Kannan's result rests crucially on the nonuniform circuit-size hierarchy (a counting argument), for the purposes of this discussion the key fact he shows is that a uniform $\Sigma_4 P$-machine is powerful enough to use the circuit-size hierarchy to diagonalize against $SIZE(n^k)$. The same remark applies to the result $MA_{EXP} \not\subseteq P/poly$, as it uses Kannan's result in an essential way.
The recent lower bound $NEXP \not\subseteq ACC^0$ [Wil11] provides an interesting crucible. It is a nonuniform lower bound against a permutation-invariant Boolean complexity class, hence by Observation 4.8 there exists a separating $S_n$-module proving $NEXP \not\subseteq ACC^0$. However, the proof uses the nondeterministic time hierarchy in a seemingly crucial way. Extracting a natural separating module from Williams's proof may be a first step towards extending the representation-theoretic framework to include uniform lower bounds.
One very interesting technique which we have not yet been able to fit into the representation-theoretic framework and which is only partially uniform comes from Jansen and Santhanam [JS12, JS13]. The key property they use is the existence of Z hitting sets whose bit descriptions can be encoded by small uniform (or at least succinct [JS13]) circuits. This combination of algebraic (hitting sets) and Boolean (bit descriptions) frameworks in the same breath makes it difficult to even formulate their proofs in a single algebraic setting, let alone translate them into separating modules.
Finally, Shannon's counting argument [Sha49] also seems difficult to put into this representation-theoretic framework. Again, by Observation 4.8 there exists a separating $S_n$-module for this lower bound. However, finding a natural separating module seems difficult, as Shannon counts the functions in $C_{easy}$ ($SIZE(2^n/n)$ in this case), rather than using some property shared by these functions. This is not necessarily a weakness of the framework however: one of the messages we take from Razborov and Rudich [RR97] is that such simple counting arguments cannot work to prove the strong lower bounds we desire. Indeed, Kadish and Landsberg [KL12] point out that getting a lower bound on the determinantal complexity of a generic polynomial is an important first step towards new lower bounds for permanent versus determinant; a lower bound on generic polynomials remains open.
4.6. Finite fields and positive characteristic. There is a mismatch between the current lower bounds over finite fields and the standard techniques of algebraic geometry. The issue is that all the current lower bounds over finite fields that we are aware of depend crucially not just on positive characteristic, but on the size of the field. This means that none of the current lower bounds over finite fields extend to the algebraic closure $\overline{\mathbb{F}_p}$. This is in contrast to the usual approach to finite fields in algebraic geometry, which is (roughly) to first work over their algebraic closures $\overline{\mathbb{F}_q}$, where algebraic geometry and representation theory are nicer, and then to pass to the $\mathbb{F}_q$-points (see Footnote 14). In particular, over $\overline{\mathbb{F}_q}$ Hilbert's Nullstellensatz holds and every matrix admits an eigenvector. This process is exactly analogous to (but more complicated than) considering complex solutions, eigenvectors, etc. in order to study equations, matrices, etc. over $\mathbb{R}$.
As we already mentioned, even if the characteristic is held constant but the field size is allowed to grow at a modest pace with the size of the input, the current lower bounds seem to disappear completely. The essential issue here seems to be that the method of approximations is typically used to "throw away" points which are in the complement of an algebraic set. Over finite fields, one then argues that these "erroneous points" are not too numerous, but over any infinite field, almost all points will be "erroneous," as an algebraic set has dimension strictly smaller than that of the ambient space.
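The counting phenomenon behind this paragraph can be observed numerically: for a fixed nonzero polynomial, the fraction of points of $\mathbb{F}_p^n$ lying on its zero set shrinks as $p$ grows, so the points off the algebraic set come to dominate. The sketch below is our own illustration with hypothetical names; it assumes `p` is prime and counts zeros by brute force.

```python
from fractions import Fraction
from itertools import product

def zero_fraction(poly, p, num_vars):
    """Fraction of points of F_p^num_vars at which poly vanishes modulo p."""
    points = product(range(p), repeat=num_vars)
    zeros = sum(1 for point in points if poly(*point) % p == 0)
    return Fraction(zeros, p ** num_vars)

def xy(x, y):
    """The polynomial x*y, whose zero set is the union of the two coordinate axes."""
    return x * y

# The zero set of x*y has 2p - 1 points out of p^2.
fractions_by_prime = {p: zero_fraction(xy, p, 2) for p in (2, 5, 101)}
```

Over $\mathbb{F}_2$ the zero set of $xy$ covers three quarters of the plane, while over $\mathbb{F}_{101}$ it covers under two percent, in line with the remark that over an infinite field almost all points lie off any proper algebraic set.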
14 The $\mathbb{F}_q$-points can be recovered from $\overline{\mathbb{F}_q}$ as the fixed points of the Frobenius map $x \mapsto x^q$, just as $\mathbb{R}$-points can be recovered from $\mathbb{C}$ as the fixed points of the complex conjugation map. The dynamics of the Frobenius map are often very useful.
It thus seems to us that the limits of our knowledge are not so much in finding lower bounds for depth 3 arithmetic circuits in characteristic zero, as is often stated, but for finding lower bounds for depth 3 arithmetic circuits over any given infinite field, including $\overline{\mathbb{F}_p}$. The chasm at depth 4 [AV08, Koi12] holds over an arbitrary field, but these observations lead us to wonder:
Open Question 4.11. Is there a chasm at depth 3 over the algebraically closed field $\overline{\mathbb{F}_p}$ for any constant prime $p > 0$?
The current chasm at depth 3 [GKKS13] only seems to work in characteristic zero or over a field of (growing) characteristic greater than the degree $d$ of the polynomial, as they use a trick of Fischer [Fis94] which requires dividing by $2^{d-1} d!$.
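One commonly stated form of the identity behind this trick, which we assume is the one meant here, writes a monomial as a signed sum of $d$-th powers of linear forms: $x_1 x_2 \cdots x_d = \frac{1}{2^{d-1} d!} \sum_{\epsilon \in \{\pm 1\}^{d-1}} \left(\prod_{i=2}^{d} \epsilon_i\right) (x_1 + \epsilon_2 x_2 + \cdots + \epsilon_d x_d)^d$. The division by $2^{d-1} d!$ is what fails in small positive characteristic. The sketch below, with a hypothetical function name, evaluates the right-hand side at rational points so that the identity can be checked numerically.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def signed_power_sum(values):
    """Right-hand side of the identity above, evaluated at the given numbers."""
    d = len(values)
    total = 0
    for signs in product((1, -1), repeat=d - 1):
        # The first variable always has sign +1; the others range over all signs.
        linear_form = values[0] + sum(s * v for s, v in zip(signs, values[1:]))
        total += prod(signs) * linear_form ** d
    # This division is only possible when 2^(d-1) * d! is invertible in the field.
    return Fraction(total, 2 ** (d - 1) * factorial(d))
```

For instance, `signed_power_sum([2, 3, 5])` evaluates four cubes of linear forms and recovers the product $2 \cdot 3 \cdot 5$.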

Most arithmetic circuit lower bounds yield separating modules
In this section we show how all of the bounds listed in the introduction give separating modules. Rather than recalling all of these proofs and stating a separate proposition for the existence of a separating module for each of these bounds (as in Section 2), we use a more concise format. Furthermore, we have not included all the results from every paper we consider, but only a representative result from each paper (or sometimes, from each technique). However, we believe that the other results in these papers and using these techniques also yield separating modules.

Nisan-Wigderson partial derivatives
Hard function: Elementary symmetric function $e_{n/4,n}$
Complexity class: Homogeneous depth 3 arithmetic circuits in characteristic zero
Lower bound: Size $2^{\Omega(n)}$ [NW97]
Invariance: $\mathbb{F}$-linear ($GL_n(\mathbb{F})$), characteristic zero
Separating module: The $(r + 1) \times (r + 1)$ minors of the partial derivative matrix $M_f$, as in the proof of Proposition 2.12.
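The quantity controlled by these minors is the rank of a matrix whose rows are the coefficient vectors of the partial derivatives of $f$, that is, the dimension of the space $\partial(f)$ spanned by all partial derivatives. The following sketch computes this dimension over the rationals for small examples. It is our own minimal illustration: polynomials are stored as dictionaries from exponent tuples to coefficients, the function names are hypothetical, and the indexing of the matrix may differ from the conventions of Proposition 2.12.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def derivative(poly, var):
    """Partial derivative of a polynomial stored as {exponent tuple: coefficient}."""
    result = {}
    for exps, coeff in poly.items():
        if exps[var] > 0:
            lowered = exps[:var] + (exps[var] - 1,) + exps[var + 1:]
            result[lowered] = result.get(lowered, 0) + coeff * exps[var]
    return result

def all_partials(poly, num_vars):
    """All nonzero iterated partial derivatives of poly, including poly itself."""
    seen, frontier = [], [poly]
    while frontier:
        current = frontier.pop()
        if not current or current in seen:
            continue
        seen.append(current)
        frontier.extend(derivative(current, v) for v in range(num_vars))
    return seen

def rank(rows):
    """Rank of a rational matrix by Gaussian elimination."""
    rows = [[Fraction(entry) for entry in row] for row in rows]
    found = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(found, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for i in range(len(rows)):
            if i != found and rows[i][col] != 0:
                factor = rows[i][col] / rows[found][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[found])]
        found += 1
    return found

def partial_derivative_dimension(poly, num_vars):
    """Dimension of the span of all partial derivatives of poly."""
    partials = all_partials(poly, num_vars)
    monomials = sorted({m for q in partials for m in q})
    return rank([[q.get(m, 0) for m in monomials] for q in partials])

def power_of_variable_sum(num_vars, degree):
    """The polynomial (x_1 + ... + x_n)^degree via multinomial coefficients."""
    poly = {}
    for exps in product(range(degree + 1), repeat=num_vars):
        if sum(exps) == degree:
            coeff = factorial(degree)
            for e in exps:
                coeff //= factorial(e)
            poly[exps] = coeff
    return poly
```

On three variables, the monomial $x_1 x_2 x_3$ has an 8-dimensional space of partial derivatives, while the power of a linear form $(x_1 + x_2 + x_3)^3$ has only a 4-dimensional one. Gaps of this kind between a hard polynomial and the building blocks of a circuit class are what a rank bound, and hence the vanishing of minors, detects.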

Permanent versus depth 4
Hard function: $\mathrm{perm}_n$
Complexity class: Depth 4 $\Sigma\Pi\Sigma\Pi$ arithmetic circuits with bottom fan-in $O(\sqrt{n})$
Lower bound: Size $2^{\Omega(\sqrt{n})}$ [GKKS12]
Invariance: $\mathbb{C}$-linear ($GL_{n^2}(\mathbb{C})$)
Separating module: The outline of the proof of this lower bound is very similar to that for the Nisan-Wigderson lower bound above. However, the key property used here is slightly more complicated. Rather than considering the dimension of the space of partial derivatives $\partial(f)$, they consider the dimension of the space of shifted partial derivatives, which are products of polynomials of some degree $\ell$ with the partial derivatives of $f$. Following their notation, we write $\partial^{=k}(f)_{\leq \ell}$ for the space of $k$-th order partial derivatives multiplied by polynomials of degree $\leq \ell$. As in the above case, we build a matrix $M_f$ whose rank is exactly the dimension of $\partial^{=k}(f)_{\leq \ell}$, and then the $r \times r$ minors of this matrix provide the separating module, for appropriately specified $r$, $k$, and $\ell$.
As above, the columns of $M_f$ will be indexed by monomials $x^e$, and the rows will be indexed by pairs $(x^d, \partial^c)$ of a monomial and a partial derivative operator. (Here $c \in \mathbb{Z}^n_{\geq 0}$, and $\partial^c$ denotes $\partial/\partial x_1^{c_1} \cdots \partial x_n^{c_n}$.) Then we proceed as in the above case.
For this next result, we need a basic fact about sets of polynomials. Given two test modules $V$ and $W$, we define their product $V \cdot W$ as the linear span of the pairwise products of their elements: $V \cdot W = \mathrm{span}\{vw \mid v \in V, w \in W\}$.
Fact 5.1. The disjunction (union) of two invariant properties defined by test polynomials is again an invariant property defined by test polynomials.
Proof. Let V, W be test modules. First one verifies that V · W is a test module. Then V · W defines the union of the properties defined by V and W . For let f be an input polynomial. If every test polynomial t ∈ V vanishes at f , then so does every test polynomial in V · W . Similarly for W . Conversely, if some test polynomial t 1 ∈ V does not vanish at f , and some test polynomial t 2 ∈ W does not vanish at f , then t 1 t 2 ∈ V · W does not vanish at f .
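The argument in this proof can be checked on a toy instance. The sketch below is only an illustration: the two test modules, the representation of test polynomials as Python callables, and the finite grid of sample points are all hypothetical choices.

```python
from itertools import product

# Hypothetical toy setup: inputs are points (a, b); each test module is given
# by a list of spanning test polynomials, represented as callables.
V = [lambda p: p[0]]        # vanishes exactly when a == 0
W = [lambda p: p[1] - 1]    # vanishes exactly when b == 1

def product_module(V, W):
    """Spanning set of V.W: all pairwise products of the spanning polynomials."""
    return [lambda p, v=v, w=w: v(p) * w(p) for v, w in product(V, W)]

def vanishes(module, point):
    """A module vanishes at a point iff all of its spanning polynomials do."""
    return all(t(point) == 0 for t in module)

VW = product_module(V, W)
grid = [(a, b) for a in range(-2, 3) for b in range(-2, 3)]

# V.W vanishes at a point exactly when V vanishes there or W vanishes there.
union_ok = all(
    vanishes(VW, p) == (vanishes(V, p) or vanishes(W, p)) for p in grid
)
```

Checking only a spanning set suffices here, because a linear span vanishes at a point exactly when each spanning element does.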

Multilinear formulas
Hard function: $\det_n$ or $\mathrm{perm}_n$
Complexity class: (Syntactic) multilinear formulas in characteristic zero
Lower bound: Size $n^{\Omega(\log n)}$ [Raz09]
Invariance: permutation ($S_n$)
Separating module: Raz combines the above ideas on $\dim \partial(f)$ with random restrictions, making the separating module here slightly more complicated than in the above examples. Raz explicitly defines a matrix of partial derivatives, similar to that in the above two examples, which he also denotes $M_f$. The random restrictions Raz uses (see [Raz09,§5]) take the form $\rho(x_i, x_j, x_k, x_\ell) = (1, 1, y_m, z_m)$, where the $i, j, k, \ell$ used are of a particular form, and the image may be re-ordered in one of two possible ways. In particular, for each input length $n$ there are only finitely many such restrictions to consider.
He then shows a lower bound on $\operatorname{rk} M_{\det(\rho(X))}$ and $\operatorname{rk} M_{\mathrm{perm}(\rho(X))}$ under any such restriction $\rho$, and using a probabilistic argument shows that there exists a restriction making $\operatorname{rk} M_{f(\rho(X))}$ small when $f$ is computed by a multilinear formula of size $n^{o(\log n)}$. Hence the property he is using is that there exists a restriction $\rho$ as in his §5 which makes $\operatorname{rk} M_{f(\rho(X))} \le r$ for appropriately chosen $r$.
For a given restriction $\rho$, we get a test $S_n$-module $V_\rho$ consisting of the $(r+1) \times (r+1)$ minors of $M_{f(\rho(X))}$. This test $S_n$-module vanishes if and only if $\operatorname{rk} M_{f(\rho(X))} \le r$. The separating module is then the product over all (finitely many) $\rho$ of the $V_\rho$ (cf. Fact 5.1).
Remark 5.2. Although bounding the rank of a matrix of partial derivatives is linear-invariant, the property of being multilinear is not linear-invariant, though it is permutation-invariant. Hence, despite using a bound on the dimension of partial derivatives, it was to be expected that at some point in the proof a property would be used that was only permutation-invariant and not linear-invariant. Although Raz uses multilinearity elsewhere in his proof, even in the brief outline above we see that the type of random restrictions used is only permutation-invariant, and not linear-invariant.

5.2. Methods using properties of (semi-)algebraic varieties. For methods such as the degree bound [Str73,BS83] and the connected components technique [BO83], the most natural input objects to use are themselves (semi-)algebraic varieties. In other words, we need to replace the input space $\mathrm{Poly}^d(x)$ with a space whose points correspond to varieties. Such spaces have been constructed in (semi-)algebraic geometry, but their construction is not as elementary as in the above results. In both cases the basic idea is that the input objects will in fact be systems of equations (which, in turn, define algebraic sets), and the test variables are then the coefficients of these systems of equations.
Surprisingly, the use of these "parameter spaces of algebraic sets" makes putting these results into the representation-theoretic viewpoint technically more complicated than the above results, despite the fact that these bounds were discovered considerably earlier.

The degree bound
Hard function: Computing all elementary symmetric functions $e_{1,n}, \dots, e_{n,n}$ together
Complexity class: Arithmetic circuits over an infinite field
Lower bound: Size $\Omega(n \log n)$ [Str73]
Invariance: $F$-affine ($AGL_n(F)$), $F$ infinite
Separating module: The key property used here is the degree of a projective algebraic set. Although the degree has a nice geometric definition (in characteristic zero), here we recall the algebraic definition as it lends itself more readily to the definition of the separating module. Let $V$ be an algebraic subset of $\mathbb{P}(F^n)$, and let $I \subseteq F[x_1, \dots, x_n]$ be the homogeneous ideal of all polynomials that vanish on $V$. In particular, $I$ can be written as the direct sum $\bigoplus_d I_d$ of its homogeneous components $I_d$, which consist of those polynomials in $I$ that are homogeneous of degree exactly $d$. The Hilbert function of $I$ is $h_I(d) = \dim_F \big(F[x_1, \dots, x_n]_d / I_d\big)$, where $F[x_1, \dots, x_n]_d$ denotes the space of homogeneous polynomials of degree $d$. Hilbert showed (see, e. g., [CLO97,§9.3] or [Eis95, Thm. 1.11]) that for all sufficiently large $d$, $h_I(d)$ agrees with a polynomial $p_I(d)$, which is referred to as the Hilbert polynomial of $I$ or $V$. The degree of $V$ is then the leading coefficient of the Hilbert polynomial $p_I(d)$ (normalized by the factor $D!$, where $D = \dim V$ is the degree of $p_I$).
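As a small worked instance of the algebraic definition, the sketch below computes the Hilbert function of a hypersurface and recovers its degree from the leading coefficient of the Hilbert polynomial. It assumes the standard convention that the Hilbert function in degree $d$ is the dimension of the degree-$d$ part of the quotient ring, and that the ideal is principal, generated by a homogeneous polynomial of degree `delta`. The function names are illustrative, and the leading coefficient is multiplied by $D!$ (with $D$ the dimension of the hypersurface) to obtain the degree.

```python
from fractions import Fraction
from math import comb, factorial

def hilbert_function_hypersurface(n, delta, d):
    """dim of the degree-d part of F[x_1..x_n]/(g), for homogeneous g of degree delta."""
    total = comb(d + n - 1, n - 1)                      # all degree-d forms
    ideal_part = comb(d - delta + n - 1, n - 1) if d >= delta else 0  # multiples of g
    return total - ideal_part

def leading_coefficient(values, degree):
    """Leading coefficient of a polynomial of the given degree, from consecutive
    values, via the degree-th finite difference divided by degree!."""
    diffs = list(values)
    for _ in range(degree):
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return Fraction(diffs[0], factorial(degree))

# A cubic surface in projective 3-space: n = 4 variables, delta = 3, dimension D = 2.
n, delta, D = 4, 3, 2
values = [hilbert_function_hypersurface(n, delta, d) for d in range(3, 8)]
degree_of_variety = leading_coefficient(values, D) * factorial(D)
```

In this example the Hilbert polynomial is $\tfrac{3}{2}d^2 + \tfrac{3}{2}d + 1$, so the normalized leading coefficient recovers the degree 3 of the defining cubic.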
For the input space, we may use either the Chow variety (see, e. g., [Dan94, Ch. 3, §7]) or the Hilbert scheme (see, e. g., [Gro95]). The Chow variety is essentially the "space of projective algebraic sets," and the Hilbert scheme is essentially the "space of homogeneous ideals in F[x 1 , . . . , x n ]." The Chow variety is in fact a disjoint union over pairs (d, D) of the variety of projective algebraic sets of degree d and dimension D. Similarly, the Hilbert scheme is the disjoint union over Hilbert polynomials p(·) of the scheme of homogeneous ideals with Hilbert polynomial p I = p. In either case, showing that two varieties have different degrees then amounts to showing that these varieties, as points in the space of varieties, live in different connected components of the Chow variety or Hilbert scheme.
Finally, being in a given component of a variety (or scheme) is defined by the vanishing of some (test) polynomials. As the Hilbert polynomial, and in particular the degree and dimension, is an affine invariant of a projective algebraic set, the components of the Chow variety and Hilbert scheme are also affine-invariant. Hence, by the analog of Fact 2.8 for affine invariance, there is a separating module.

Algebraic decision trees for sorting
Hard function: Element distinctness (note that element distinctness reduces to sorting)
Complexity class: Real semi-algebraic decision trees
Lower bound: Depth $\Omega(n \log n)$ [BO83]
Invariance: $\mathbb{R}$-affine ($AGL_n(\mathbb{R})$)
Separating module: The key property used here is the number of connected components of a semi-algebraic variety, that is, a subset of $\mathbb{R}^n$ defined by a collection of polynomial equalities and inequalities. The number of connected components is clearly affine-invariant; we recall here how Hardt's Triviality Theorem implies that it is in fact defined by a collection of test polynomial equalities and inequalities. The use of inequalities here is unavoidable: see Remark 5.3 below.
A special case of Hardt's Triviality Theorem [Har80] (see, e. g., [BPR06, §5.8] for a textbook treatment) says that for any continuous semi-algebraic map $\pi : S \to \mathbb{R}^N$ from a semi-algebraic set $S \subseteq \mathbb{R}^n$, there is a finite partition of $\mathbb{R}^N$ into semi-algebraic sets $\mathbb{R}^N = \bigcup_{i=1}^{k} T_i$ such that for each $i$ and every $x \in T_i$, $T_i \times \pi^{-1}(x)$ is semi-algebraically homeomorphic to $\pi^{-1}(T_i)$. In particular, this implies that for each $i$, if $x, y \in T_i$ then $\pi^{-1}(x)$ and $\pi^{-1}(y)$ have the same number of connected components.
Now, consider a collection of polynomial equalities and inequalities of degree $\le d$ in $n$ variables $x_1, \dots, x_n$:
$$
\begin{aligned}
\sum_e a_{1,e} x^e + a_1 = 0,\ &\dots,\ \sum_e a_{m,e} x^e + a_m = 0 \\
\sum_e a_{m+1,e} x^e + a_{m+1} \ge 0,\ &\dots,\ \sum_e a_{m+s,e} x^e + a_{m+s} \ge 0 \\
\sum_e a_{m+s+1,e} x^e + a_{m+s+1} > 0,\ &\dots,\ \sum_e a_{h,e} x^e + a_h > 0
\end{aligned}
\tag{1}
$$
We may consider the $a_{i,e}$ and $a_i$ as variables rather than constants; suppose in total there are $N$ such variables. Then the $x_i$ are coordinates on $\mathbb{R}^n$ and the $a_{i,e}$ are coordinates on $\mathbb{R}^N$. Equations (1) thus define a semi-algebraic subset $S \subseteq \mathbb{R}^n \times \mathbb{R}^N$. Let $\tilde\pi : \mathbb{R}^n \times \mathbb{R}^N \to \mathbb{R}^N$ be the projection onto the second factor, and let $\pi : S \to \mathbb{R}^N$ be the restriction of $\tilde\pi$ to $S$. For any given numerical values $a \in \mathbb{R}^N$, let $V_a \subseteq \mathbb{R}^n$ denote the semi-algebraic subset defined by (1). Then $\pi^{-1}(a) = V_a \times \{a\} \cong V_a$ (where $\cong$ here denotes semi-algebraic homeomorphism).
Finally, by Hardt's Triviality Theorem, there is a semi-algebraic partition $\mathbb{R}^N = \bigcup_{i=1}^{k} T_i$ such that for any $a$ and $a'$ in the same $T_i$, $V_a$ and $V_{a'}$ have the same number of connected components. Hence, the collection of equations of the form (1) that define a semi-algebraic variety with $c$ connected components is the semi-algebraic set $\bigcup \{T_i : \pi^{-1}(a) \text{ has } c \text{ connected components for all } a \in T_i\}$. As the property of having $c$ connected components is invariant under affine transformations of the $x_i$ ($AGL_n(\mathbb{R})$), this union of $T_i$ is also affine-invariant (under the induced action of the same $AGL_n(\mathbb{R})$, not under the larger $AGL_N(\mathbb{R})$), and hence is defined by some affine-invariant collection of equalities and inequalities (by an analog of Fact 2.8).
Remark 5.3. The use of inequalities here is necessary. The vanishing of some test polynomials would not suffice, even when the semi-algebraic variety is defined only by equalities. This can be seen even in the simple case of the number of connected components defined by a quadratic: over $\mathbb{R}$ the number of connected components of the algebraic set $\{x \in \mathbb{R} : ax^2 + bx + c = 0\}$ is zero if and only if $b^2 - 4ac < 0$ and is at most one if and only if $b^2 - 4ac \le 0$. The set $\{(a, b, c) : b^2 - 4ac < 0\}$ is not defined by the vanishing of some polynomials, for it has dimension 3, but the only 3-dimensional subset of $\mathbb{R}^3$ defined by the vanishing of polynomials is $\mathbb{R}^3$ itself. Hence inequalities are necessary.
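The quadratic example can be spelled out directly. The following sketch assumes a genuine quadratic (leading coefficient nonzero) and uses an illustrative function name; it counts connected components of the real zero set via the sign of the discriminant, and checks that a small box of coefficients around $(1, 0, 1)$ lies entirely in the region with no real roots, which illustrates why that region is full-dimensional.

```python
def real_root_components(a, b, c):
    """Number of connected components of {x in R : a x^2 + b x + c = 0}, for a != 0."""
    if a == 0:
        raise ValueError("expects a genuine quadratic (a != 0)")
    disc = b * b - 4 * a * c
    if disc < 0:
        return 0      # no real roots: the zero set is empty
    if disc == 0:
        return 1      # one double root: a single point
    return 2          # two distinct roots: two isolated points

# Every coefficient vector in a small box around (1, 0, 1) still has no real
# roots, so the set {b^2 - 4ac < 0} contains an open subset of R^3.
steps = (-0.1, 0.0, 0.1)
box_all_empty = all(
    real_root_components(1 + da, db, 1 + dc) == 0
    for da in steps for db in steps for dc in steps
)
```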
Remark 5.4. Note that the above lower bound implies the same lower bound for decision trees for element distinctness (and sorting) over C. However, over C the connected components argument does not work directly, because semi-algebraic varieties over C tend to have fewer connected components than over R. In particular, the semialgebraic variety corresponding to element distinctness over C has just a single connected component. Hence although the lower bound holds over C, we would still only get a separating AGL n (R)-module.

Algebraic decision trees for k-equals
Hard function: k-equals (are at least k of the inputs equal?) Complexity class: Real semi-algebraic decision trees Lower bound: Depth Ω(n log(n/k)) [Yao97] Invariance: R-affine (AGL n (R)) Separating module: The key property used here is a lower bound on any Betti number, rather than just the number of connected components (=the 0-th Betti number). As the Betti numbers are invariant under homeomorphism, essentially the same argument as above using Hardt's Triviality Theorem works for this result.
5.3. The method of approximations over finite fields. Here we give two representative examples of how results that use the method of approximation for circuits over finite fields yield separating modules. Results using similar properties, such as those of Grigoriev-Razborov [GR00], should similarly yield separating modules.

Razborov-Smolensky
Hard function: MOD 3 Complexity class: AC 0 [2] Lower bound: Exponential size [Raz87,Smo87] Invariance: F 2 -affine (AGL n (F 2 )) Separating module: Every AC 0 [2] circuit computes a polynomial function over F 2 , so we use . . . , x n )/ x 2 1 = x 1 , . . . , x 2 n = x n as the space of input functions (using Ω we follow Smolensky's notation). Note that here we consider two functions equal if they are equal when evaluated on all F 2 points. In other words, we are considering functions on F 2 , rather than formal polynomials whose coefficients are in F 2 . Every function over F 2 can be represented by a unique multilinear polynomial; when we refer to MOD 3 we mean its corresponding F 2 -multilinear polynomial.
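The statement that every function on $\mathbb{F}_2^n$ is represented by a unique multilinear polynomial can be made explicit by Möbius inversion. The sketch below is an illustration only; in particular, the convention that $\mathrm{MOD}_3$ outputs 1 exactly when the number of ones is divisible by 3 is an assumption made here, and the function names are hypothetical.

```python
from itertools import product

def anf(f, n):
    """Monomials of the unique multilinear F_2-polynomial agreeing with f.
    Returns {frozenset of variable indices: 1} for each monomial present."""
    coeffs = {}
    for point in product((0, 1), repeat=n):
        support = frozenset(i for i, b in enumerate(point) if b)
        # Mobius inversion over F_2: XOR f over all points below `point`.
        c = 0
        for sub in product((0, 1), repeat=n):
            if all(s <= p for s, p in zip(sub, point)):
                c ^= f(sub)
        if c:
            coeffs[support] = 1
    return coeffs

def evaluate(coeffs, point):
    """Evaluate the multilinear polynomial at a 0/1 point, modulo 2."""
    return sum(all(point[i] for i in mono) for mono in coeffs) % 2

def mod3(x):
    """Assumed convention: 1 iff the number of ones is divisible by 3."""
    return 1 if sum(x) % 3 == 0 else 0

n = 4
mod3_poly = anf(mod3, n)
mod3_degree = max(len(mono) for mono in mod3_poly)
agrees_everywhere = all(
    evaluate(mod3_poly, p) == mod3(p) for p in product((0, 1), repeat=n)
)
```

The coefficients produced this way are $\mathbb{F}_2$-linear combinations of the values of the function, which is the sense in which conditions on coefficients become linear test polynomials.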
Fix a depth $k$ and a constant $\lambda$. For our purposes, the key property used here is: There exists a subset $\Gamma \subseteq \mathbb{F}_2^n$ (for "good") of size at least $2^n - 2^{n-r}$ such that $f$ agrees with a polynomial of degree $\le (2\lambda r)^k$ on the points in $\Gamma$. Smolensky [Smo87,Lem. 2] shows that this holds for any function computed by a depth $k$ circuit with parity gates for $r = o(n^{1/(2k)})$, but not for $\mathrm{MOD}_3$. This condition is clearly $GL_n(\mathbb{F}_2)$-invariant.
For any $\Gamma \subseteq \mathbb{F}_2^n$, let $I_\Gamma$ be the ideal of polynomials that vanish everywhere on $\Gamma$. When we mod out the space of functions by $I_\Gamma$, this is the same as only considering the values a function takes on $\Gamma$. Then $f$ agrees with a polynomial of degree $\le d = (2\lambda r)^k$ on the points in $\Gamma$ if and only if all of the coefficients of monomials of degree $> d$ of $f \pmod{I_\Gamma}$ vanish. As the map $\Omega_{d,n} \to \Omega_{d,n}/I_\Gamma$ is linear, the coefficients of $f \pmod{I_\Gamma}$ are linear combinations of the coefficients of $f$, and we are asking that certain such linear combinations vanish. Let $T_\Gamma$ be the test module consisting of these linear combinations. Finally, for an appropriate choice of $r$, by Fact 5.1, $\prod_\Gamma T_\Gamma$ is the desired separating module, where the product is taken over all (finitely many) subsets $\Gamma \subseteq \mathbb{F}_2^n$ of size $\ge 2^n - 2^{n-r}$.

Depth 3 arithmetic circuits over finite fields
Hard function: Determinant or permanent
Complexity class: Depth 3 arithmetic circuits over the finite field $\mathbb{F}_q$
Lower bound: Exponential size [GK98]
Invariance: $\mathbb{F}_q$-linear ($GL_n(\mathbb{F}_q)$)
Separating module: As above, the key property here will use an existential quantifier over some finite collection of subsets $S$ of $\mathbb{F}_q^n$, which will turn into a big product of test modules over all possible choices for $S$. Beyond that, the condition here is quite a bit more complicated than above.
Here, we work in the space of formal polynomials over F q , namely Poly d Fq (x 11 , x 12 , . . . , x nn ). To describe the key property we introduce some notation. Given σ ∈ GL n (F q ) and any function f = f (X), we denote f (σX) by f σ = f σ (X). For any set F of functions, write F σ = {f σ |f ∈ F }. Let ∂ ≤r (f ) denote the linear span of all the partial derivatives of f of order ≤ r. Finally, combining these notations, we have ∂ ≤r (f ) σ = {g σ |g ∈ ∂ ≤r (f )}.
The key property of a function $f \in \mathrm{Poly}^d_{\mathbb{F}_q}(x_1, \dots, x_n)$ is then, for appropriate choices of all the parameters involved, that there exists a subset $S \subseteq GL_n(\mathbb{F}_q)$ of size $\le s$ such that there is a nonzero function $g(X)$ in the intersection $\bigcap_{\sigma \in S} \partial^{\le r}(f)^{\sigma}$ such that $g(A) = 0$ for all $A \in GL_n(\mathbb{F}_q)$. Again, this property is readily seen to be $GL_n(\mathbb{F}_q)$-invariant. Let us verify that it is defined by test polynomials.
For now, fix a subset $S \subseteq GL_n(\mathbb{F}_q)$. For each $\sigma \in S$, we compute a linear basis of $\partial^{\le r}(f)^{\sigma}$. The coefficients of each such basis function will be linear combinations of the coefficients of $f$ (that is, of the test variables). This follows from the usual fact about partial derivatives, and the fact that for any $\sigma \in GL_n(\mathbb{F}_q)$ and any function $h$, the coefficients of $h^\sigma$ are linear combinations of the coefficients of $h$. Next, we take the intersection over all $\sigma \in S$ of these subspaces. Again, a linear basis for the resulting intersection will consist of polynomials whose coefficients are linear combinations of the test variables. Let us denote this intersection $\Lambda$. Now observe that the collection of all $g$ such that $g(A) = 0$ for all $A \in GL_n(\mathbb{F}_q)$ is an ideal $I$ in the space of polynomials (of degree $\le d$ for some $d$), and in particular is a linear subspace thereof. Then the property is satisfied exactly if $I \cap \Lambda \neq 0$.
The system of linear equations defining $I \cap \Lambda$ has coefficients which are either linear combinations of the coefficients of $f$ (coming from the equations defining the linear space $\Lambda$) or constants (coming from the equations defining $I$). If this system of equations had the same number of variables as equations we could require that just the $n \times n$ determinant of the system vanishes. As the system is likely to have more equations than variables, we must require that all the $n \times n$ minors of this system vanish. These $n \times n$ minors form a test module $T_S$, and then, as above, the separating module is $\prod_S T_S$, where the product is over all $S$ of appropriate size.
Remark 5.5. Aside from the more obvious uses of finiteness (not just finite characteristic) in the above proofs, in the Grigoriev-Karpinski proof, the property they use becomes vacuous over any infinite field $F$: the only polynomial in $n^2$ variables that vanishes everywhere on $GL_n(F)$ is the zero polynomial. For further discussion of these issues see Section 4.6.

5.4. Results previously known to give separating modules.

Permanent versus determinant
Hard function: perm n Complexity class: Linear projections of det m Lower bound: m ≥ n 2 /2 [MR04]; also border determinantal complexity n 2 /2 [LMR10] Invariance: C-linear (GL m 2 (C)) Separating module: The key property used by Mignon and Ressayre [MR04] is the rank of the Hessian matrix of a function. Recall that the Hessian of a function f (x 1 , . . . , x n ) is the n × n matrix Hess(f ) whose (i, j) entry is the second partial derivative ∂ 2 f /∂x i ∂x j . They show a lower bound on rk Hess(perm) and an upper bound on rk Hess(det). Note that the entries of Hess(det) are themselves functions; the upper bound on rk Hess(det) that they prove does not hold at all matrices X, but only at those matrices where det(X) = 0. This is enough for them to prove the lower bound, but makes it complicated to extract a separating module from their proof.
If the upper bound held for all X, then the minors of the Hessian matrix would span a separating module, as in the Nisan-Wigderson partial derivatives technique above. Instead, the condition they use is that det(X) divides the r × r minors of Hess(det) (for r = 2n + 1). Landsberg, Manivel, and Ressayre [LMR10] find polynomial equations that vanish exactly on the pairs of polynomials (f, g) such that f divides g (amongst other achievements), resolving a surprisingly old question in algebraic geometry. They then construct a separating module by using these equations with f = det and g the minors of Hess(det).

Relations between lower bounds yield relations between separating modules
Baur-Strassen: computing partial derivatives [BS83]
Assumption: Computing $(\partial f/\partial x_1, \dots, \partial f/\partial x_n)$ requires arithmetic circuits of size $s$
Consequence: Computing $f$ requires arithmetic circuits of size $s/3$
Invariance: $F$-linear ($GL_n(F)$), any infinite field
Separating module implication: Let $\varphi$ be the map from $\mathrm{Poly}^d(x_1, \dots, x_n)$ to the Chow variety or Hilbert scheme (see The Degree Bound above), defined as follows. $\varphi(f)$ is the variety (ideal) defined by $\partial f/\partial x_1, \dots, \partial f/\partial x_n$. Recall that $A \in GL_n(F)$ acts on the Hilbert scheme by taking the ideal $\langle g_1(x), \dots, g_k(x) \rangle$ to the ideal $\langle g_1(Ax), \dots, g_k(Ax) \rangle$; let us denote the latter by $A \cdot \langle g_1(x), \dots, g_k(x) \rangle$. Similarly, $A \in GL_n(F)$ acts on $\mathrm{Poly}^d(x)$ by sending $f(x)$ to $f(Ax)$. Then $\varphi$ is $GL_n(F)$-equivariant, in that
$$\varphi(f(Ax)) = A \cdot \varphi(f(x)) \quad \text{for all } A \in GL_n(F).$$
If $T$ is a test module which vanishes on $\{\varphi(g) : \varphi(g) \text{ has arithmetic circuits of size} \le s\}$, but not on $\varphi(f)$, then $\varphi^*(T) \stackrel{\text{def}}{=} \{t \circ \varphi : t \in T\}$ is a vector space of test polynomials which vanishes at all $g \in \mathrm{Poly}^d(x)$ that have circuits of size $\le s/3$, but not at $f$. The $GL_n(F)$-equivariance of $\varphi$ implies that $\varphi^*(T)$ is in fact a test $GL_n(F)$-module.

Tensor rank to formula size [Raz10]
Assumption: $t_n \in (F^n)^{\otimes r(n)}$ has tensor rank $\ge n^{r(n)(1-o(1))}$ for some $\omega(1) \le r(n) \le O(\log n / \log\log n)$
Consequence: The polynomial $f_n$ which is the symmetrization of $t_n$ requires super-polynomial size arithmetic formulas. Also, by the completeness of the permanent, $\mathrm{perm}_n$ requires super-polynomial size arithmetic formulas (attributed to Yehudayoff, [Raz10, Footnote 2])
Invariance: $F$-linear ($GL_n(F)$), $F$ arbitrary
Separating module implication: Raz uses the standard symmetrization map from tensors $(F^n)^{\otimes r}$ (we think of these as degree $r$ homogeneous noncommutative polynomials) to $\mathrm{Poly}^r(x_1, \dots, x_n)$. In particular, to show an arithmetic formula size lower bound on some $f_n \in \mathrm{Poly}^r(x)$, it suffices to show a tensor rank lower bound on any noncommutative version $t_n$ of $f_n$ (that is, $f_n$ is the result of symmetrizing $t_n$). In particular, we are free to use the standard embedding (NB: in the opposite direction compared to the above) $\varphi : \mathrm{Poly}^r(x) \hookrightarrow (F^n)^{\otimes r}$, which takes the monomial $x_{i_1} \cdots x_{i_r}$ to the tensor $\frac{1}{r!} \sum_{\pi \in S_r} x_{i_{\pi(1)}} \otimes \cdots \otimes x_{i_{\pi(r)}}$. Raz's results imply that the image, under $\varphi$, of the set of polynomials that have small formulas is contained in the set of tensors of low tensor rank. It is a standard fact from multilinear algebra that the embedding $\varphi$ is $GL_n(F)$-equivariant (see the Baur-Strassen implication above). Hence, if a test module $T$ is used to show a lower bound on the tensor rank (and hence, border rank, see Section 3.2) of some $\varphi(f)$, then $\{t \circ \varphi : t \in T\}$ is a test module which implies the stated lower bound on the arithmetic formula size of $f$.
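The two maps in this implication, the symmetrization from tensors to polynomials and the embedding of polynomials into symmetric tensors, can be sketched on monomials as follows. This is an illustration under stated assumptions: coefficients are rationals (so that division by $r!$ makes sense), tensors and polynomials are stored as dictionaries keyed by index tuples, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def embed_monomial(indices):
    """Image of the monomial x_{i_1}...x_{i_r} under the embedding into tensors:
    the average of the basis tensors over all orderings of the indices."""
    r = len(indices)
    tensor = {}
    for perm in permutations(range(r)):
        key = tuple(indices[p] for p in perm)
        tensor[key] = tensor.get(key, Fraction(0)) + Fraction(1, factorial(r))
    return tensor

def symmetrize(tensor):
    """Symmetrization: send each basis tensor to the commutative monomial with the
    same indices, stored under the sorted index tuple."""
    poly = {}
    for key, c in tensor.items():
        mono = tuple(sorted(key))
        poly[mono] = poly.get(mono, Fraction(0)) + c
    return poly
```

Symmetrizing the image of a monomial returns that monomial with coefficient 1, so the embedding is a one-sided inverse of symmetrization.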

Chasm at Depth 4 [AV08, Koi12]
Assumption: $f$ requires depth 4 arithmetic circuits of size $2^{\omega(\sqrt{n} \log^2 n)}$
Consequence: $f$ requires arithmetic circuits of super-polynomial size
Invariance: $F$-affine ($AGL_n(F)$), $F$ arbitrary
Separating module implication: They show that the set of functions computable by arithmetic circuits of polynomial size is contained in the set of functions computable by depth 4 circuits of size $2^{O(\sqrt{n} \log^2 n)}$. Hence, if a separating module vanishes on the latter set, it also vanishes on the former.

Chasm at Depth 3 [GKKS13]
Assumption: $f$ requires depth 3 arithmetic circuits of size $2^{\omega(\sqrt{n} \log^{3/2} n)}$
Consequence: $f$ requires arithmetic circuits of super-polynomial size
Invariance: $F$-affine ($AGL_n(F)$), characteristic zero or characteristic $> \deg f$
Separating module implication: Same as above, but with different bounds and not over arbitrary fields. See Section 4.6 for a discussion of this issue.

Matrix rigidity to linear circuits [Val77]
Assumption: The $n \times n$ matrix $A_n$ has rigidity $R_{A_n}(n/2) \ge \Omega(n^{1+\varepsilon})$
Consequence: The linear function $x \mapsto A_n x$ does not have linear circuits of simultaneous size $O(n)$ and depth $O(\log n)$
Invariance: permutation ($S_n \times S_n$)
Separating module implication: Here the ambient (input) space is the space $M_n(F)$ of $n \times n$ matrices. Valiant [Val77,Cor. 6.3] showed the set of matrices $A_n$ whose associated linear functions $x \mapsto A_n x$ can be computed by linear circuits of size $O(n)$ and depth $O(\log n)$ (simultaneously) is contained in the set of matrices of low rigidity. Hence any test module which vanishes on the set of matrices with low rigidity but not on some matrix $A$ will also vanish on the set of matrices that can be computed in size $O(n)$ and depth $O(\log n)$ by linear circuits.
As the concept of rigidity involves the number of entries of a matrix that must be changed to drop its rank, this concept is only permutation-invariant: we may multiply $A_n$ on the left and right by permutation matrices without affecting its rank or rigidity. We note that, despite the fact that the non-rigid matrices do not form an algebraic set, some of the most successful results on matrix rigidity to date use the algebro-geometric approach (essentially, test polynomials) [KLPS09] (see also [LTV03] for more on the geometry).
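For very small matrices, rigidity can be computed by brute force, which also makes the permutation invariance easy to observe. The sketch below works over $\mathbb{F}_2$ purely for convenience (changing an entry then means flipping a bit); this choice of field and the function names are assumptions of the example, and the search is exponential in the number of entries, so it is only meant for toy sizes.

```python
from itertools import combinations

def rank_gf2(matrix):
    """Rank over F_2 of a 0/1 matrix, using rows packed into integers."""
    rows = [sum(bit << j for j, bit in enumerate(row)) for row in matrix]
    rank = 0
    for col in range(len(matrix[0])):
        pivot = next((i for i in range(rank, len(rows)) if (rows[i] >> col) & 1), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and (rows[i] >> col) & 1:
                rows[i] ^= rows[rank]
        rank += 1
    return rank

def rigidity_gf2(matrix, r):
    """Minimum number of entries that must be changed to bring the rank down to <= r."""
    cells = [(i, j) for i in range(len(matrix)) for j in range(len(matrix[0]))]
    for k in range(len(cells) + 1):
        for flips in combinations(cells, k):
            changed = [row[:] for row in matrix]
            for i, j in flips:
                changed[i][j] ^= 1
            if rank_gf2(changed) <= r:
                return k
    raise AssertionError("unreachable: the zero matrix has rank 0")

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# The identity with its rows cyclically permuted.
permuted = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Permuting rows or columns of a matrix permutes the positions of the changed entries without altering their number, so both matrices above have the same rigidity for every target rank.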

Acknowledgments
The author would like to thank Scott Aaronson, Eric Allender, Saugata Basu

Let $G$ be any finite or algebraic group acting algebraically on an input space. There is a many-to-one correspondence between test $G$-modules and $G$-invariant properties defined by the vanishing of test polynomials.
For readers unfamiliar with algebraic geometry, we note that GL n (F) and AGL n (F) are both algebraic groups. All of the situations considered in this paper satisfy the hypotheses above.
For readers familiar with algebraic geometry but perhaps not with algebraic groups: an algebraic group is an algebraic set that is also a group, and where the multiplication map $G \times G \to G$ and inversion map $G \to G$ are both algebraic maps. In particular, all finite groups are algebraic. (If you are concerned that $GL_n$ is a Zariski-open subset of $F^{n^2}$, consider $GL_n$ instead as the algebraic set $\{(A, z) \in F^{n^2} \times F : z \det(A) = 1\}$.)

Proof. Let $V$ denote the input space (input polynomials, matrices, etc.), and suppose that $T$ is a test $G$-module with basis $t_1, \dots, t_k$. Let $\Pi_T$ denote the corresponding property, namely $\Pi_T = \{v \in V : t(v) = 0 \ \forall t \in T\}$. $\Pi_T$ is defined by test polynomials (namely, those in $T$). To see that $\Pi_T$ is $G$-invariant, suppose that $v \in \Pi_T$ and $g \in G$, and consider the point $gv$. By the defining property of a test $G$-module, if $t(x) \in T$, then $t(gx) \in T$ for all $g \in G$. Let $t'(x) = t(gx)$. As $t' \in T$ and $v \in \Pi_T$, we have $t'(v) = 0$ by the definition of $\Pi_T$. But then $0 = t'(v) = t(gv)$, as desired. Hence $\Pi_T$ is a $G$-invariant property defined by test polynomials.
Conversely, suppose that $\Pi \subseteq V$ is a $G$-invariant property defined by test polynomials. By Hilbert's Basis Theorem, $\Pi$ is defined by the vanishing of only finitely many test polynomials, say $t_1, \dots, t_k$. If $G$ is finite, then it is clear that the collection of polynomials $GT \stackrel{\text{def}}{=} \{t_i(g(x)) : 1 \le i \le k,\ g \in G\}$ is finite. If $G$ is algebraic, then it is a standard fact from algebraic geometry that the linear span of $GT$ is finite-dimensional, even though $G$ itself may be infinite. It is clear from the construction that the linear span of $GT$ is a test $G$-module. It remains to show that $\Pi$ is exactly the set of input points on which $GT$ vanishes. Let us denote the latter set by $\Pi_{GT}$. By the previous direction, $\Pi_{GT}$ is $G$-invariant.
We will show that for arbitrary $\Pi$ defined by test polynomials in $T$ (not necessarily $G$-invariant), $\Pi_{GT}$ is the unique maximum $G$-invariant subset of $\Pi$. Hence, if $\Pi$ itself is $G$-invariant, then $\Pi = \Pi_{GT}$. Suppose $\Pi'$ is a $G$-invariant subset of $\Pi$. In particular, every test polynomial $t \in T$ vanishes on every $v \in \Pi'$. We must show that for arbitrary $g$, $t(gx)$ also vanishes on every $v \in \Pi'$. As $\Pi'$ is $G$-invariant, $v \in \Pi'$ implies that $gv \in \Pi'$ for every $g \in G$. Hence $t(gv) = 0$ for every $v \in \Pi'$. Thus $\Pi' \subseteq \Pi_{GT}$. As this holds for arbitrary $G$-invariant subsets $\Pi'$ of $\Pi$, $\Pi_{GT}$ is the unique maximum $G$-invariant subset of $\Pi$, and thus is equal to $\Pi$ if $\Pi$ itself is $G$-invariant.
It is clear that the map sending a test G-module T to the property Π T is well-defined, and hence is at worst many-to-one. Over an algebraically closed field, Hilbert's Nullstellensatz implies that two test G-modules T 1 and T 2 define the same property Π if and only if they generate the same radical ideal. Hence, we cannot expect this map to be one-to-one.
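The construction of $GT$ in the proof can be illustrated for a finite group. The sketch below is a toy example only: the group is the symmetric group permuting two coordinates, the single test polynomial and the finite grid of sample points are hypothetical, and test polynomials are represented as Python callables.

```python
from itertools import permutations, product

def act(perm, point):
    """Permute the coordinates of a point."""
    return tuple(point[i] for i in perm)

def orbit_tests(tests, n):
    """GT: every test polynomial precomposed with every coordinate permutation."""
    return [lambda p, t=t, g=g: t(act(g, p))
            for t in tests for g in permutations(range(n))]

def zero_set(tests, points):
    """Sample points on which all the given test polynomials vanish."""
    return {p for p in points if all(t(p) == 0 for t in tests)}

n = 2
points = set(product(range(-2, 3), repeat=n))
T = [lambda p: p[0]]                      # the property x_0 = 0, not invariant
Pi = zero_set(T, points)
Pi_GT = zero_set(orbit_tests(T, n), points)

# Largest invariant subset of Pi: points whose whole orbit stays inside Pi.
largest_invariant = {
    p for p in Pi if all(act(g, p) in Pi for g in permutations(range(n)))
}
```

Here the property defined by $T$ is not invariant, and the zero set of $GT$ shrinks it to its largest invariant subset, which on this grid is the origin alone.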

Appendix B. The utility of separating modules
In Section 3.1 we argued that invariant properties can be used to prove lower bounds without loss of generality. In Section 3.3 we argued that for all naturally occurring nonuniform complexity classes C, C n is constructible, and furthermore is typically the image of some simple algebraic map from some F N . We now give a heuristic argument that the easiest way to prove a lower bound against such sets is by using a test polynomial, and hence, for invariant classes, a separating module. Even when the use of separating modules is not formally necessary, it thus helps illuminate any (constructible) nonuniform complexity class.
If $C_n$ is closed, then test polynomials are necessary and sufficient to prove $f_{\mathrm{hard},n} \notin C_n$ (see Section 3.3). For the sake of discussion, suppose that $C_n$ is not closed, but is the next simplest kind of constructible set: $C_n$ is the difference $A_n \setminus B_n$ of two closed sets $A_n, B_n$. By what is essentially disjunctive normal form, every constructible set is a union of such differences.
Without loss of generality, we may assume that $A_n = \overline{C_n}$ is the Zariski-closure of $C_n$, and that $B_n \subseteq A_n$. Equivalently, $B_n = \overline{C_n} \setminus C_n$ is the boundary of $C_n$.
Two approaches to show $f_{\mathrm{hard},n} \notin C_n$ immediately suggest themselves: (1) show that $f_{\mathrm{hard},n} \notin A_n = \overline{C_n}$; or (2) show that $f_{\mathrm{hard},n} \in B_n$. As $B_n = \overline{C_n} \setminus C_n$ might be complicated, a third approach is (3) to find a closed set $D_n$ containing $f_{\mathrm{hard},n}$ such that $D_n$ is disjoint from $C_n$. Each of these approaches of course requires some insight: in general, (1) requires finding a test polynomial with the desired properties, (2) requires finding all test polynomials that vanish on $B_n$, or at least a set of test polynomials whose vanishing defines $B_n$, and (3) requires finding the set $D_n$ along with all the test polynomials that vanish on $D_n$, or at least a set of test polynomials that defines $D_n$. Of course, we say "in general" here because it is always possible that, for example, $B_n$ might have some structure that can be exploited so that showing $f_{\mathrm{hard},n} \in B_n$ might be done without recourse to such test polynomials. However, at this level of heuristic argument, we cannot speculate on anything other than the general case.
In the general case (that is, barring some miraculous leap of ingenuity, which of course we cannot rule out) we can compare the a priori difficulty of these approaches:
(1) requires finding a single test polynomial $t$, verifying that $t$ vanishes on $C_n$ (which implies that it vanishes on $\overline{C_n} = A_n$), and verifying that $t(f_{\mathrm{hard},n}) \neq 0$.
(2) requires finding or knowing a set $t_1, \dots, t_k$ of test polynomials whose vanishing defines $B_n$ and then verifying that $t_i(f_{\mathrm{hard},n}) = 0$ for all $1 \le i \le k$.
(3) requires constructing $D_n$, along with a defining set $t_1, \dots, t_k$ of test polynomials, verifying that $t_i(f_{\mathrm{hard},n}) = 0$ for all $1 \le i \le k$, and verifying that $D_n$ is disjoint from $C_n$.
First, there is the obvious difference that (1) only requires finding a single polynomial and verifying its properties, whereas both (2) and (3) require finding a whole set of polynomials and verifying their properties. Furthermore, in most such situations the number of polynomials needed in (2) and (3) will be exponential in $n$: in all the examples we are aware of except for those in Remark B.1, the sets $A_n, B_n, C_n, D_n$ have dimension $\mathrm{poly}(n)$ and live in a space like $\mathrm{Poly}^{O(n)}(x_1, \dots, x_n)$ of dimension $2^{\Theta(n \log n)}$, which implies that any defining set of test polynomials must consist of at least $2^{\Theta(n \log n)} - \mathrm{poly}(n) = 2^{\Theta(n \log n)}$ test polynomials.
Remark B.1. In the case of n × n matrix multiplication the ambient space has dimension n 6 , and in the case of matrix rigidity the ambient space has dimension n 2 , so the above point is not an issue. However, it may be telling that even in these cases, the approach via test polynomial seems to be the most successful so far. In the case of matrix multiplication, this corresponds to border rank (see Section 3.2), which has been successfully used for upper bounds as well as lower bounds. In the case of matrix rigidity, see, e. g., [KLPS09,LTV03].
Second, we can use the complexity of the corresponding verification problems as a heuristic guide to the mathematical difficulty of the associated proofs. For starters, given a test polynomial t, it is easy to evaluate t(f ) for any explicitly given f .
(1) Verifying that t vanishes on C n is essentially a coRP problem. If C n is the image of a simple algebraic map ϕ from some F N , as most complexity classes are (see Section 3.3), we can generate random points of C n by choosing random points in F N and applying ϕ. In all situations we are aware of N ≤ poly(n).
(2) Verifying that $f_{\mathrm{hard},n} \in B_n$ requires verifying that $t_i(f_{\mathrm{hard},n}) = 0$ for a defining set of test polynomials $T$. We argued above that in most situations, $T$ must consist of exponentially many test polynomials.
(3) Even if $D_n$ is chosen to be defined by only $\mathrm{poly}(n)$ test polynomials $t_1, \dots, t_{\mathrm{poly}(n)}$, thus avoiding the difficulty of (2), verifying that $D_n$ is disjoint from $C_n = \mathrm{Im}(\varphi_n)$ reduces to deciding whether a variety given by equations is empty or not. Namely, the equations $t_i(\varphi(x)) = 0$ for $1 \le i \le k$ define the closed set $\varphi^{-1}(D_n)$, which is empty if and only if $D_n$ is disjoint from $C_n$. Deciding whether a closed set given by equations is empty or not is the computational problem of Hilbert's Nullstellensatz (HN), which is NP-hard in general. As the $\varphi$ are quite simple, if we treat the defining equations $t_1, \dots, t_{\mathrm{poly}(n)}$ as the input to our verification problem, the verification problem here is likely to be as hard as the general case of HN.
Also note that the fewer test polynomials that are needed to define $D_n$, the larger its dimension is, and hence the less likely it is to be disjoint from $C_n$. This makes it seem unlikely that one could in fact find a $D_n$ described by few test polynomials that is disjoint from $C_n$ and contains $f_{\mathrm{hard},n}$, let alone that the corresponding instance of HN would not be a hard instance. Either way, we find the following complexities of the corresponding general verification problems very suggestive:
(1) coRP.
(2) At least exponential time, as there are at least this many defining equations for $B_n$.
Finally, in the absence of a brilliant insight to construct a D n that has exponential dimension and yet is both disjoint from C n and avoids the difficulty of HN, the easiness of verification in (1) suggests that a relatively feasible computational approach is possible using a brute force search for test modules, whereas this is not the case for approaches (2) and (3).
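The randomized verification described in item (1) above can be sketched as follows. Everything specific here is a hypothetical toy instance chosen for illustration: the class is the set of $2 \times 2$ matrices of rank at most one, parametrized as outer products, the test polynomial is the determinant, and arithmetic modulo a fixed prime stands in for a large field.

```python
import random

P = 2_147_483_647  # the Mersenne prime 2^31 - 1; arithmetic mod P stands in for a large field

def phi(u, v):
    """Hypothetical parametrization: (u, v) -> the rank-one 2x2 matrix u v^T."""
    return [[u[0] * v[0] % P, u[0] * v[1] % P],
            [u[1] * v[0] % P, u[1] * v[1] % P]]

def det2(m):
    """Candidate test polynomial: the 2x2 determinant."""
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % P

def probably_vanishes_on_image(test, trials=50, seed=0):
    """One-sided randomized check that `test` vanishes on the image of phi."""
    rng = random.Random(seed)
    for _ in range(trials):
        u = [rng.randrange(P) for _ in range(2)]
        v = [rng.randrange(P) for _ in range(2)]
        if test(phi(u, v)) != 0:
            return False  # witness found: the test does not vanish on the image
    # No witness found: a nonzero low-degree polynomial rarely vanishes at a
    # random point, so "vanishes on the image" is correct with high probability.
    return True
```

A negative answer is always correct, since it comes with an explicit witness point, while a positive answer can be wrong only with small probability; this one-sided error is what places the verification problem in coRP.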

Appendix C. Discussion of terminology
The new terminology we introduced in this paper was far from arbitrary; here we explain our reasons for choosing the terminology we did. A test GL n -module is, in particular, a representation of GL n . Indeed, the word "module" is often used interchangeably with "representation" in representation theory. In our setting, it has the additional connotation of a "module of tests" in the sense of computer programming. We believe the phrase "test module" is new.
Separating $GL_n(\mathbb{C})$-modules are essentially equivalent to the "HWV obstructions" of Bürgisser and Ikenmeyer [BI12]. In particular, the smallest $GL_n(\mathbb{C})$-module containing an HWV obstruction is a separating module, and every separating $GL_n(\mathbb{C})$-module contains some HWV obstruction (see [BI12,Prop. 3.3]). We use our terminology as it generalizes (see Section 2.4) to other groups for which the highest weight theory does not apply, and we believe it is simpler to understand for expository purposes; in particular, it does not require knowing anything about Lie theory and the theory of highest weights. However, for certain approaches to certain lower bounds there are technical advantages to considering the highest weight vectors directly, as in [BI12].