Euclidean Distance Degree and Mixed Volume

We initiate a study of the Euclidean distance degree in the context of sparse polynomials. Specifically, we consider a hypersurface $f=0$ defined by a polynomial $f$ that is general given its support, such that the support contains the origin. We show that the Euclidean distance degree of $f=0$ equals the mixed volume of the Newton polytopes of the associated Lagrange multiplier equations. We discuss the implication of our result for computational complexity and give a formula for the Euclidean distance degree when the Newton polytope is a rectangular parallelepiped.


Introduction
Let $X \subset \mathbb{R}^n$ be a real algebraic variety. For a point $u \in \mathbb{R}^n \setminus X$, consider the following computational problem:
$$\text{compute the critical points of } d_X : X \to \mathbb{R},\quad x \mapsto \|u - x\|, \tag{1}$$
where $\|u - x\| = \sqrt{(u - x)^T (u - x)}$ is the Euclidean distance on $\mathbb{R}^n$. Seidenberg [25] observed that if $X$ is nonempty, then it contains a solution to (1). He used this observation in an algorithm for deciding if $X$ is empty. Hauenstein [14] pointed out that solving (1) provides a point on each connected component of $X$. So the solutions to (1) are also useful in the context of learning the topological configuration of $X$. From the point of view of optimization, the problem (1) is a relaxation of the optimization problem of finding a point $x \in X$ that minimizes the Euclidean distance to $u$. A prominent example of this is low-rank matrix approximation, which can be solved by computing the singular value decomposition. In general, computing the critical points of the Euclidean distance between $X$ and $u$ is a difficult task in nonlinear algebra.
We consider the problem (1) when $X \subset \mathbb{R}^n$ is a real algebraic hypersurface defined by a single real polynomial:
$$X = V_{\mathbb{R}}(f) := \{x \in \mathbb{R}^n \mid f(x) = 0\}, \qquad f \in \mathbb{R}[x_1, \dots, x_n].$$
The critical points of the distance function $d_X$ from (1) are called ED-critical points. They can be found by solving the associated Lagrange multiplier equations. This is a system of polynomial equations defined as follows.
Let us write $\partial_i$ for the operator of partial differentiation with respect to the variable $x_i$, so that $\partial_i f := \frac{\partial f}{\partial x_i}$, and also write $\nabla f(x) = (\partial_1 f(x), \dots, \partial_n f(x))$ for the vector of partial derivatives of $f$ (its gradient).
The Lagrange multiplier equations are the following system of $n+1$ polynomial equations in the $n+1$ variables $(\lambda, x_1, \dots, x_n)$:
$$L_{f,u}(\lambda, x) := \begin{bmatrix} f(x) \\ \nabla f(x) - \lambda (u - x) \end{bmatrix} = 0, \tag{2}$$
where $\lambda$ is an auxiliary variable (the Lagrange multiplier). We consider the number of complex solutions to $L_{f,u}(\lambda, x) = 0$. For general $u$, this number is called the Euclidean Distance Degree (EDD) [9] of the hypersurface $f = 0$:
$$\mathrm{EDD}(f) := \text{number of solutions to } L_{f,u}(\lambda, x) = 0 \text{ in } \mathbb{C}^{n+1} \text{ for general } u. \tag{3}$$
Here, "general" means for all $u$ in the complement of a proper algebraic subvariety of $\mathbb{R}^n$. In the following, when referring to $\mathrm{EDD}(f)$ we will simply speak of the EDD of $f$. Figure 1 shows the solutions to $L_{f,u}(\lambda, x) = 0$ for a biquadratic polynomial $f$.

Figure 1. The curve $X = V(x_1^2 x_2^2 - 3x_1^2 - 3x_2^2 + 5) \subset \mathbb{R}^2$ is in blue and $u = (0.025, 0.2)$ is in green. The 12 red points are the critical points of the distance function $d_X$; that is, they are the $x$-values of the solutions to $L_{f,u}(\lambda, x) = 0$. In this example, the Euclidean distance degree is 12, so all complex solutions are in fact real.
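As a small hand computation illustrating the Lagrange multiplier equations, consider the unit circle $f = x_1^2 + x_2^2 - 1$. This example is not taken from the text above and is chosen here only for concreteness. For $u \neq 0$ the system can be solved directly:

```latex
\begin{align*}
  f &= x_1^2 + x_2^2 - 1 = 0, \\
  2x_i - \lambda (u_i - x_i) &= 0 \quad (i = 1, 2)
    \;\Longrightarrow\;
    x_i = c\, u_i \quad\text{with}\quad c := \frac{\lambda}{2 + \lambda}, \\
  c^2 \,(u_1^2 + u_2^2) &= 1
    \;\Longrightarrow\;
    c = \pm \frac{1}{\sqrt{u_1^2 + u_2^2}}, \qquad
    \lambda = \frac{2c}{1 - c}.
\end{align*}
```

The value $\lambda = -2$ is excluded because it would force $u = 0$, and $c \neq 1$ as long as $u$ does not lie on the circle. So for general $u$ there are exactly two solutions $(\lambda, x)$, which is consistent with the circle having Euclidean distance degree 2.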
Determining the Euclidean Distance degree is of interest in applied algebraic geometry, but also in related areas such as learning and optimization, because, as we will discuss in Section 3, our results on the EDD of f have implications for the computational complexity of solving the problem (1).
There is a subtle point about EDD(f ). The definition in (3) does not need us to assume that V R (f ) is a hypersurface in R n . In fact, V R (f ) can even be empty. Rather, EDD(f ) is a property of the complex hypersurface X C := V C (f ). We will therefore drop the assumption of V R (f ) being a real hypersurface in the following. Nevertheless, the reader should keep in mind that for the applications discussed at the beginning of this paper the assumption is needed. We will come back to those applications only in Sections 3.2 and 3.3.
In the foundational paper [9], the Euclidean distance degree of f was related to the polar classes of X C , and there are other formulas [1] involving characteristic classes of X C . In this paper we give a new formula for the EDD.
Our main result is Theorem 1 in the next section. We show that, if f is sufficiently general given its support A with 0 ∈ A, then EDD(f ) is equal to the mixed volume of the Newton polytopes of L f,u (λ, x). This opens new paths for computing EDDs by using tools from convex geometry. We demonstrate this in Section 6 and compute the EDD of a general hypersurface whose Newton polytope is a rectangular parallelepiped.
Our proof strategy relies on Bernstein's Other Theorem (Proposition 6) below. This result gives an effective method for proving that the number of solutions to a system of polynomial equations can be expressed as a mixed volume. We hope our work sparks a new line of research that exploits this approach in other applications, not just EDD.

Statement of main results
We give a new formula for the Euclidean distance degree that takes into account the monomials in f. In Section 6 we work this out in the special case when the Newton polytope of f is a rectangular parallelepiped.
Before stating our main results we have to introduce notation: A vector $a = (a_1, \dots, a_n)$ of nonnegative integers is the exponent of a monomial $x^a := x_1^{a_1} \cdots x_n^{a_n}$, and a polynomial $f \in \mathbb{C}[x_1, \dots, x_n]$ is a linear combination of monomials. The set $A$ of exponents of monomials that appear in $f$ is its support. The Newton polytope of $f$ is the convex hull of its support. Given polytopes $Q_1, \dots, Q_m$ in $\mathbb{R}^m$, we write $\mathrm{MV}(Q_1, \dots, Q_m)$ for their mixed volume. This was defined by Minkowski; its definition and properties are explained in [12, Sect. IV.3], and we revisit them in Section 6. Our main result expresses $\mathrm{EDD}(f)$ in terms of mixed volume.
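The notions of support and Newton polytope can be made concrete in the plane. The following is a minimal sketch, assuming a polynomial is stored as a dictionary from exponent vectors to coefficients (a representation chosen here for illustration) and using the standard monotone-chain convex hull; the helper name `newton_polygon` is hypothetical.

```python
def newton_polygon(support):
    """Vertices of the convex hull of a planar support, in counterclockwise order."""
    pts = sorted(set(support))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


# The biquadratic polynomial x1^2*x2^2 - 3*x1^2 - 3*x2^2 + 5 from Figure 1,
# stored as {exponent: coefficient}.
f = {(2, 2): 1, (2, 0): -3, (0, 2): -3, (0, 0): 5}
support = set(f)                    # the set A of exponents
vertices = newton_polygon(support)  # corners of the square [0, 2] x [0, 2]
print(vertices)
```

For this polynomial the support already consists of the four corners of a square, so the Newton polytope is the rectangle with side lengths 2 and 2.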
We denote by P, P 1 , . . . , P n the Newton polytopes of the Lagrange multiplier equations L f,u (λ, x) from (2). That is, P is the Newton polytope of f , and P i is the Newton polytope of ∂ i f − λ(u i − x i ). Observe that P, P 1 , . . . , P n are polytopes in R n+1 , because L f,u (λ, x) has n + 1 variables λ, x 1 , . . . , x n .
We state our first main result. The proof is given in Section 4.
Theorem 1. If $f$ is a polynomial whose support $A$ contains $0$, then
$$\mathrm{EDD}(f) \le \mathrm{MV}(P, P_1, \dots, P_n),$$
where $P$ is the Newton polytope of $f$ and $P_i$ is the Newton polytope of $\partial_i f - \lambda(u_i - x_i)$. There is a dense open subset $U$ of polynomials with support $A$ such that if $f \in U$, this inequality is an equality.
In the following, we refer to polynomials f ∈ U as general given the support A.
Since P, P 1 , . . . , P n are the Newton polytopes of the entries in L f,u , Bernstein's Theorem [4] implies the inequality in Theorem 1 (commonly known as the BKK bound; see also [10]). Our proof of Theorem 1 appeals to a theorem of Bernstein which gives conditions that imply equality in the BKK bound. These conditions require the facial systems to be empty.
Our next main result is an application of Theorem 1. We compute $\mathrm{EDD}(f)$ when the Newton polytope of the defining equation of $X$ is the rectangular parallelepiped
$$B(a) := [0, a_1] \times \cdots \times [0, a_n], \tag{4}$$
where $a := (a_1, \dots, a_n)$ is a list of positive integers.
For each $k = 1, \dots, n$, let
$$e_k(a) := \sum_{1 \le i_1 < \cdots < i_k \le n} a_{i_1} \cdots a_{i_k}$$
be the $k$-th elementary symmetric polynomial in $n$ variables evaluated at $a$. The next theorem is our second main result.
Theorem 2. Let $a = (a_1, \dots, a_n)$. If $f \in \mathbb{R}[x_1, \dots, x_n]$ has Newton polytope $B(a)$, then
$$\mathrm{EDD}(f) \le \sum_{k=1}^{n} k!\, e_k(a).$$
There is a dense open subset $U$ of the space of polynomials with Newton polytope $B(a)$ such that for $f \in U$, this inequality is an equality.
Note that there is a conceptual change when passing from Theorem 1 to Theorem 2. Theorem 1 is formulated in terms of the support of $f$, whereas Theorem 2 concerns the Newton polytope of $f$. This is because the equality in Theorem 2 needs the Newton polytope of the partial derivative $\partial_i f$ to be $B(a_1, \dots, a_i - 1, \dots, a_n)$.
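The bound of Theorem 2 is easy to evaluate. The sketch below assumes the bound is the sum over $k$ of $k!\, e_k(a)$, that is, the sum of the normalized volumes of all coordinate projections of $B(a)$; the function names are hypothetical.

```python
from itertools import combinations
from math import factorial, prod


def elementary_symmetric(a, k):
    """k-th elementary symmetric polynomial evaluated at the tuple a."""
    return sum(prod(c) for c in combinations(a, k))


def edd_box_bound(a):
    """Sum over k = 1..n of k! * e_k(a) for side lengths a = (a_1, ..., a_n)."""
    n = len(a)
    return sum(factorial(k) * elementary_symmetric(a, k) for k in range(1, n + 1))


# Biquadratic case a = (2, 2): 1! * (2 + 2) + 2! * (2 * 2) = 12,
# which matches the 12 critical points reported for the curve in Figure 1.
print(edd_box_bound((2, 2)))
```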

Remark 3.
Observe that for $1 \le i_1 < \cdots < i_k \le n$, if we project $B(a)$ onto the coordinate subspace indexed by $i_1, \dots, i_k$, we obtain $B(a_{i_1}, \dots, a_{i_k})$. Thus the product $a_{i_1} \cdots a_{i_k}$ is the $k$-dimensional Euclidean volume of this projection and $k!\, a_{i_1} \cdots a_{i_k}$ is the normalized volume of this projection. On the other hand, $e_k(a) = \sum_{1 \le i_1 < \cdots < i_k \le n} a_{i_1} \cdots a_{i_k}$. This observation implies an appealing interpretation of the formula of Theorem 2: It is the sum of the normalized volumes of all coordinate projections of the rectangular parallelepiped $B(a)$. ⋄

Remark 4 (Complete Intersections). Experiments with HomotopyContinuation.jl [7] suggest that a similar formula involving mixed volumes should hold for general complete intersections. That is, for $X = \{x \in \mathbb{R}^n \mid f_1(x) = \cdots = f_k(x) = 0\}$ such that $\dim X = n - k$ and $f_1, \dots, f_k$ are general given their Newton polytopes. The Lagrange multiplier equations (2) become $f_1(x) = \cdots = f_k(x) = 0$ and $J\lambda - (u - x) = 0$, where $\lambda = (\lambda_1, \dots, \lambda_k)$ is now a vector of variables, and $J = (\nabla f_1, \dots, \nabla f_k)$ is the $n \times k$ Jacobian matrix. We leave this general case of $k > 1$ for further research. ⋄

2.1.
Acknowledgments. The first and the second author would like to thank the organizers of the Thematic Einstein Semester on Algebraic Geometry: "Varieties, Polyhedra, Computation" in the Berlin Mathematics Research Center MATH+. This thematic semester included a research retreat where the first and the second author first discussed the relation between Euclidean Distance Degree and Mixed Volume, inspired by results in [8]. The first author would like to thank Sascha Timme for discussing the ideas in Section 3.3.

2.2.
Outline. In Section 3 we explain implications of Theorem 1 for computational complexity in the context of using the polyhedral homotopy for solving the Lagrange multiplier equations L f,u = 0 for the problem (1). In Section 4, we explain Bernstein's conditions, and give a proof of Theorem 1. The proof relies on a lemma asserting that the facial systems of L f,u are empty. Section 5 is devoted to proving this lemma. The arguments that are used in this proof are explained on an example at the end of Section 4. We conclude in Section 6 with a proof of Theorem 2.

Implications for computational complexity
We discuss the implications of Theorem 1 for the computational complexity of computing critical points of the Euclidean distance (1).

3.1.
Polyhedral homotopy is optimal for EDD. Polynomial homotopy continuation is an algorithmic framework for numerically solving polynomial equations which builds upon the following basic idea: Consider the system of $m$ polynomials $F(x) = (f_1(x), \dots, f_m(x)) = 0$ in variables $x = (x_1, \dots, x_m)$. The approach to solve $F(x) = 0$ is to generate another system $G(x)$ (the start system) whose zeros are known. Then, $F(x)$ and $G(x)$ are joined by a homotopy, which is a system $H(x, t)$ of polynomials in $m+1$ variables with $H(x, 1) = G(x)$ and $H(x, 0) = F(x)$. Differentiating $H(x, t) = 0$ with respect to $t$ leads to an ordinary differential equation called the Davidenko equation. The ODE is solved by standard numerical continuation methods, with the zeros of $G(x)$ as initial values. This process is usually called path-tracking or continuation. For details see [26].
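The idea of joining a start system to a target system can be sketched in one variable. The following toy tracker is an illustration under simplifying assumptions, not the polyhedral homotopy or any particular software: it uses the straight-line homotopy $H(x,t) = (1-t)F(x) + t\gamma G(x)$ with a fixed complex constant $\gamma$ (an assumption made here to keep the paths away from singularities), and it replaces an ODE solver for the Davidenko equation by small steps in $t$ followed by Newton correction. All names are hypothetical.

```python
def track_paths(F, dF, G, dG, starts, gamma=0.6 + 0.8j, steps=1000, newton_iters=5):
    """Follow each zero of G (at t = 1) to a zero of F (at t = 0)."""
    ends = []
    for x in starts:
        x = complex(x)
        for k in range(1, steps + 1):
            t = 1.0 - k / steps              # move t from 1 down to 0
            for _ in range(newton_iters):    # Newton corrector on H(., t)
                h = (1 - t) * F(x) + t * gamma * G(x)
                dh = (1 - t) * dF(x) + t * gamma * dG(x)
                x -= h / dh
        ends.append(x)
    return ends


# Target system F(x) = x^2 - 3x + 2 and start system G(x) = x^2 + 1,
# whose zeros +i and -i are known in advance.
F = lambda x: x * x - 3 * x + 2
dF = lambda x: 2 * x - 3
G = lambda x: x * x + 1
dG = lambda x: 2 * x

zeros = track_paths(F, dF, G, dG, starts=[1j, -1j])
print(sorted(zeros, key=lambda z: z.real))
```

In this example the start system has as many zeros as the target, and each path ends at a distinct zero of the target, so no path is wasted.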
One instance of this framework is the polyhedral homotopy of Huber and Sturmfels [16]. It provides a start system $G(x)$ for polynomial homotopy continuation and a homotopy $H(x, t)$ such that the following holds: Let $Q_1, \dots, Q_m$ be the Newton polytopes of $F(x)$. Then, for all $t \in (0, 1]$ the system of polynomials $H(x, t)$ has $\mathrm{MV}(Q_1, \dots, Q_m)$ isolated zeros (at $t = 0$ this can fail, because the input $F(x) = H(x, 0)$ may have fewer than $\mathrm{MV}(Q_1, \dots, Q_m)$ isolated zeros). Polyhedral homotopy is implemented in many polynomial homotopy continuation software packages; for instance in HomotopyContinuation.jl [7], HOM4PS [19] and PHCPack [28].
Theorem 1 implies that polyhedral homotopy is optimal for computing ED-critical points in the following sense: If we assume that the continuation of zeros has unit cost, then the complexity of solving a system of polynomial equations $F(x) = 0$ by polynomial homotopy continuation is determined by the number of paths that have to be tracked. We say that a homotopy is optimal if the following three properties hold: (1) the start system $G(x)$ has as many zeros as the input $F(x)$; (2) all continuation paths end in a zero of $F(x)$; and (3) for all zeros of $F(x)$ there is a continuation path which converges to it. In an optimal homotopy none of the continuation paths have to be sorted out, so the number of paths that need to be tracked is as small as possible. At the same time all zeros of the input will be computed.
We now have the following consequence of Theorem 1.
Corollary 5. If f is generic given its support A with 0 ∈ A, polyhedral homotopy is optimal for solving L f,u = 0.
Corollary 5 is one of the few known instances of a structured problem for which we have an optimal homotopy available.
In our definition of optimal homotopy we ignored the computational complexity of path-tracking in polyhedral homotopy. We want to emphasize that this is an important part of contemporary research. We refer to Malajovich's work [20, 21, 22].

3.2.
Computing real points on real algebraic sets. Hauenstein [14] observed that solving the Lagrange multiplier equations $L_{f,u} = 0$ gives at least one point on each connected component of the real algebraic set $X = V_{\mathbb{R}}(f) = \{x \in \mathbb{R}^n \mid f(x) = 0\}$. Indeed, every real solution to $L_{f,u} = 0$ corresponds to a critical point of the distance function from (1), and every connected component of $X$ contains at least one such critical point.
Corollary 5 shows that polyhedral homotopy provides an optimal start system for Hauenstein's approach. Specifically, Corollary 5 implies that when using polyhedral homotopy in the algorithm in [14, Section 2.1], one does not need to distinguish between the sets E 1 (= continuation paths which converge to a solution to $L_{f,u} = 0$) and E (= continuation paths which diverge). This reduces the complexity of Hauenstein's algorithm. Hauenstein places his work in the context of complexity in real algebraic geometry [2, 3, 23, 25].

3.3.
Certification of ED-critical points. We consider a posteriori certification for polynomial homotopy continuation: Zeros are certified after and not during the (inexact) numerical continuation. Implementations using exact arithmetic [15,18] or interval arithmetic [6,18,24] are available. In particular, box interval arithmetic in C n is powerful in combination with our results. We explain this.
Box interval arithmetic in the complex numbers is arithmetic with intervals of the form $X + iY := \{x + iy \mid x \in X,\ y \in Y\}$, where $X, Y \subset \mathbb{R}$ are real intervals. Box interval arithmetic in $\mathbb{C}^n$ uses products of such intervals. Corollary 5 implies that, if $f$ is generic given its support, all continuation paths in polyhedral homotopy converge to a zero of $L_{f,u}$. Conversely, for all zeros of $L_{f,u}$, there is a continuation path which converges to it. Therefore, if we can certify that each numerical approximation of a zero of $L_{f,u}$ that we have computed using polyhedral homotopy corresponds to a true zero, and if we can certify that those true zeros are pairwise distinct, we have provably obtained all zeros of $L_{f,u}$. Furthermore, if we compute box intervals in $\mathbb{C}^{n+1}$ which provably contain the zeros of $L_{f,u}$, we can use those intervals to certify whether a zero is real (see [6, Lemma 4.8]) or whether it is not real (by checking whether the intervals intersect the real line; this is a property of box intervals).
If it is possible to classify reality for all zeros, we can take the intervals $\{r_1, \dots, r_k\} \subset \mathbb{R}^n$ which contain the real critical points of the distance function $d_X$ from (1). The $r_j$ are obtained from the coordinate projection $(\lambda, x) \mapsto x$ of the intervals containing the real zeros of $L_{f,u}$. Setting $d_j := \{d_X(s) \mid s \in r_j\}$ gives a set of intervals $\{d_1, \dots, d_k\} \subset \mathbb{R}$. If there exists $d_i$ such that $d_i \cap d_j = \emptyset$ and $\min d_i < \min d_j$ for all $j \neq i$, then this is a proof that the minimal value of $d_X$ is contained in $d_i$ and that the minimizer for $d_X$ is contained in $r_i$.
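The final selection step can be sketched as follows, assuming each distance interval $d_j$ is available as a closed interval given by a pair of endpoints (a representation and a function name chosen here for illustration). For closed intervals, being disjoint from $d_j$ and having a smaller minimum is the same as lying entirely below $d_j$.

```python
def certified_minimizer(distance_intervals):
    """Index i such that d_i lies strictly below every other d_j, or None.

    distance_intervals: list of (lo, hi) pairs describing closed intervals.
    """
    for i, (_, hi_i) in enumerate(distance_intervals):
        if all(hi_i < lo_j
               for j, (lo_j, _) in enumerate(distance_intervals) if j != i):
            return i
    return None


# Hypothetical enclosures of d_X on three boxes r_1, r_2, r_3.
print(certified_minimizer([(0.9, 1.1), (0.2, 0.3), (1.0, 1.5)]))
```

When two of the lowest intervals overlap, the function returns `None`, which corresponds to the situation where the enclosures are not yet tight enough to certify the minimizer.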

Bernstein's Theorem
The relation between the number of solutions to a polynomial system and mixed volume is given by Bernstein's Theorem [4].
Let $g_1, \dots, g_m \in \mathbb{C}[x_1, \dots, x_m]$ be $m$ polynomials with Newton polytopes $Q_1, \dots, Q_m$. Let $(\mathbb{C}^\times)^m$ be the complex torus of $m$-tuples of nonzero complex numbers and $\#V_{\mathbb{C}^\times}(g_1, \dots, g_m)$ be the number of isolated solutions to $g_1 = \cdots = g_m = 0$ in $(\mathbb{C}^\times)^m$, counted by their algebraic multiplicities. Bernstein's Theorem [4] asserts that
$$\#V_{\mathbb{C}^\times}(g_1, \dots, g_m) \le \mathrm{MV}(Q_1, \dots, Q_m), \tag{5}$$
and the inequality becomes an equality when each $g_i$ is general given its support. The restriction of the domain to $(\mathbb{C}^\times)^m$ is because Bernstein's Theorem concerns Laurent polynomials, in which the exponents in a monomial are allowed to be negative.
An important special case of Bernstein's Theorem was proven earlier by Kushnirenko. Suppose that the polynomials $g_1, \dots, g_m$ all have the same Newton polytope. This means that $Q_1 = \cdots = Q_m$. We write $Q$ for this single polytope. Then, the mixed volume in (5) is $\mathrm{MV}(Q, \dots, Q) = m!\, \mathrm{Vol}(Q)$, where $\mathrm{Vol}(Q)$ is the $m$-dimensional Euclidean volume of $Q$. Kushnirenko's Theorem [17] states that if $g_1, \dots, g_m$ are general polynomials with Newton polytope $Q$, then
$$\#V_{\mathbb{C}^\times}(g_1, \dots, g_m) = m!\, \mathrm{Vol}(Q).$$
That the mixed volume becomes the normalized Euclidean volume when the polytopes are equal is one of three properties which characterize mixed volume, the others being symmetry and multiadditivity. This is explained in [12, Sect. IV.3] and recalled in Section 6.
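In the plane, these properties can be checked numerically. The sketch below relies on the standard identity $\mathrm{MV}(P, Q) = \mathrm{Vol}(P + Q) - \mathrm{Vol}(P) - \mathrm{Vol}(Q)$ for two planar polytopes, where $P + Q$ is the Minkowski sum; this identity follows from symmetry, multiadditivity and the normalization $\mathrm{MV}(Q, Q) = 2\,\mathrm{Vol}(Q)$. The function names are hypothetical and polytopes are given by finite sets of lattice points.

```python
def hull(points):
    """Convex hull vertices of planar points, counterclockwise (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def twice_area(points):
    """Twice the Euclidean area of the convex hull (shoelace formula)."""
    v = hull(points)
    n = len(v)
    return abs(sum(v[i][0] * v[(i + 1) % n][1] - v[(i + 1) % n][0] * v[i][1]
                   for i in range(n)))


def mixed_volume_2d(A, B):
    """MV(conv A, conv B) = Vol(A + B) - Vol(A) - Vol(B) for planar lattice sets."""
    minkowski = {(a[0] + b[0], a[1] + b[1]) for a in A for b in B}
    return (twice_area(minkowski) - twice_area(A) - twice_area(B)) // 2


triangle = lambda d: {(0, 0), (d, 0), (0, d)}
square = {(0, 0), (2, 0), (0, 2), (2, 2)}

print(mixed_volume_2d(triangle(2), triangle(3)))  # product of the degrees
print(mixed_volume_2d(square, square))            # 2! times the area of the square
```

For two dense polynomials of degrees 2 and 3 the mixed volume is the Bézout number 6, and for equal polytopes it reduces to the normalized volume, as in Kushnirenko's Theorem.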
The inequality (5) is called the BKK bound [5]. The key step in proving it is what we call Bernstein's Other Theorem. This a posteriori gives the condition under which the inequality (5) is strict (equivalently, when it is an equality). We explain that. Let $g = \sum_{a \in A} c_a x^a$ be a polynomial with support $A \subset \mathbb{Z}^m$. For $w \in \mathbb{Z}^m$, define $h_w(A)$ to be the minimum value of the linear function $x \mapsto w \cdot x$ on the set $A$ and write $A_w$ for the subset of $A$ on which this minimum occurs. This is the face of $A$ exposed by $w$. We write
$$g_w := \sum_{a \in A_w} c_a x^a \tag{6}$$
for the restriction of $g$ to $A_w$. For $w \in \mathbb{Z}^m$ and a system $G = (g_1, \dots, g_m)$ of $m$ polynomials, the facial system is $G_w := ((g_1)_w, \dots, (g_m)_w)$. We state Bernstein's Other Theorem [4, Theorem B].
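The definitions of $h_w(A)$, the exposed face $A_w$ and the restriction $g_w$ translate directly into code. This is a minimal sketch, assuming polynomials are stored as dictionaries from exponent tuples to coefficients; the function names and the sample coefficients are hypothetical.

```python
def exposed_face(support, w):
    """Return (h_w(A), A_w): the minimum of w . a on A and the set where it is attained."""
    dot = lambda a: sum(wi * ai for wi, ai in zip(w, a))
    h = min(dot(a) for a in support)
    return h, {a for a in support if dot(a) == h}


def restrict(g, w):
    """Restriction g_w of g = {exponent: coefficient} to the face exposed by w."""
    _, face = exposed_face(set(g), w)
    return {a: c for a, c in g.items() if a in face}


# A polynomial with support {(0,0), (0,1), (1,1), (2,1), (1,0)} and
# arbitrary illustrative coefficients.
g = {(0, 0): 3, (0, 1): 7, (1, 1): -2, (2, 1): 5, (1, 0): 4}

h, face = exposed_face(set(g), (0, -1))
print(h, sorted(face))
print(restrict(g, (0, -1)))
```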
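Proposition 6 (Bernstein's Other Theorem). Let $G = (g_1, \dots, g_m)$ be a system of Laurent polynomials in variables $x_1, \dots, x_m$. For each $i = 1, \dots, m$, let $A_i$ be the support of $g_i$ and $Q_i = \mathrm{conv}(A_i)$ its Newton polytope. Then
$$\#V_{\mathbb{C}^\times}(g_1, \dots, g_m) < \mathrm{MV}(Q_1, \dots, Q_m)$$
if and only if there is a nonzero $w \in \mathbb{Z}^m$ such that the facial system $G_w$ has a solution in $(\mathbb{C}^\times)^m$. Otherwise, $\#V_{\mathbb{C}^\times}(g_1, \dots, g_m) = \mathrm{MV}(Q_1, \dots, Q_m)$.
While this statement is similar to Bernstein's formulation, we use its contrapositive, that the number of solutions equals the mixed volume when no facial system has a solution. We use Bernstein's Other Theorem when $G = L_{f,u}$ and $m = n+1$. For this, we must show that for a general polynomial $f$ with support $A \subset \mathbb{N}^n$, all the solutions to $L_{f,u} = 0$ lie in $(\mathbb{C}^\times)^{n+1}$ and no facial system $(L_{f,u})_w = 0$ for $0 \neq w \in \mathbb{Z}^{n+1}$ has a solution in $(\mathbb{C}^\times)^{n+1}$. The latter is given by the next theorem, which is proved in Section 5.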
Theorem 7. Suppose that f is general given its support A, that 0 ∈ A, and that u ∈ C n is general. For any nonzero w ∈ Z n+1 , the facial system (L f,u ) w has no solutions in (C × ) n+1 .
Using this theorem we can now prove Theorem 1.
Proof of Theorem 1. Suppose that $f \in \mathbb{C}[x_1, \dots, x_n]$ is general given its support $A$ and that $0 \in A$. We may also suppose that $u \in \mathbb{C}^n \setminus V(f)$ is general. By Theorem 7, no facial system $(L_{f,u})_w$ has a solution. By Bernstein's Other Theorem, the Lagrange multiplier equations $L_{f,u} = 0$ have $\mathrm{MV}(P, P_1, \dots, P_n)$ solutions in $(\mathbb{C}^\times)^{n+1}$. It remains to show that there are no other solutions to the Lagrange multiplier equations.
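For this, we use standard dimension arguments, such as [13, Theorem 11.12], and freely invoke the generality of $f$. Consider the incidence variety
$$S_f := \{(u, \lambda, x) \in \mathbb{C}^n_u \times \mathbb{C}_\lambda \times \mathbb{C}^n_x \mid L_{f,u}(\lambda, x) = 0\},$$
which is an affine variety. As $f = 0$ is an equation in $L_{f,u} = 0$, this is a subvariety of $\mathbb{C}^n_u \times \mathbb{C}_\lambda \times X_{\mathbb{C}}$. Write $\pi$ for the projection of $S_f$ to $X_{\mathbb{C}}$ and let $x \in X_{\mathbb{C}}$. The fiber $\pi^{-1}(x)$ over $x$ is
$$\{(u, \lambda) \in \mathbb{C}^n_u \times \mathbb{C}_\lambda \mid \nabla f(x) = \lambda (u - x)\}.$$
As $f$ is general, $X_{\mathbb{C}}$ is smooth, so that $\nabla f(x) \neq 0$ and we see that $\lambda \neq 0$ and $u \neq x$. Thus $u = x + \frac{1}{\lambda} \nabla f(x)$. This identifies the fiber $\pi^{-1}(x)$ with $\mathbb{C}^\times_\lambda$, proving that $S_f \to X_{\mathbb{C}}$ is a $\mathbb{C}^\times$-bundle, and thus is irreducible of dimension $n$.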
Let $Z \subset X_{\mathbb{C}}$ be the set of points of $X_{\mathbb{C}}$ that do not lie in $(\mathbb{C}^\times)^n$ and hence lie on some coordinate plane. As $f$ is irreducible and $f(0) \neq 0$, we see that $Z$ has dimension $n-2$, and its inverse image $\pi^{-1}(Z)$ in $S_f$ has dimension $n-1$. The image $W$ of $\pi^{-1}(Z)$ under the projection to $\mathbb{C}^n_u$ consists of those points $u \in \mathbb{C}^n_u$ which have a solution $(x, \lambda)$ to $L_{f,u}(\lambda, x) = 0$ with $x \notin (\mathbb{C}^\times)^n$. Since $W$ has dimension at most $n-1$, this shows that for general $u$ all solutions to $L_{f,u}(\lambda, x) = 0$ lie in $(\mathbb{C}^\times)^{n+1}$ (we already showed that $\lambda \neq 0$).
This completes the proof of Theorem 1.

4.1.
Application of Bernstein's other theorem. To illustrate Theorem 7, let us consider two facial systems of the Lagrange multiplier equations in an example. Let $\partial_i A$ be the support of $\partial_i f$. It depends upon the support $A$ of $f$ and the index $i$ in the following way. Let $e_i := (0, \dots, 0, 1, 0, \dots, 0)$ be the $i$th standard basis vector (1 is in position $i$). To obtain $\partial_i A$ from $A \subset \mathbb{N}^n$, first remove all points $a \in A$ with $a_i = 0$, then shift the remaining points by $-e_i$. The support of $\partial_i f - \lambda(u_i - x_i)$ is obtained by adding $e_0$ and $e_i + e_0$ to $\partial_i A$. Throughout the paper we associate to $\lambda$ the exponent with index 0.
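The recipe for these supports can be written out as a short sketch. It assumes generic coefficients (so that no term cancels), stores the exponent of $\lambda$ at index 0 as in the convention above, and uses a 0-based index `i` for the variable $x_{i+1}$; the function names and the sample support are chosen here for illustration.

```python
def derivative_support(A, i):
    """Support of the partial derivative in the i-th variable (0-based):
    drop the points with a_i = 0 and shift the remaining points by -e_i."""
    return {a[:i] + (a[i] - 1,) + a[i + 1:] for a in A if a[i] > 0}


def lagrange_support(A, i):
    """Support of d_i f - lambda * (u_i - x_i), with the lambda-exponent at index 0."""
    n = len(next(iter(A)))
    zero = (0,) * n
    e_i = zero[:i] + (1,) + zero[i + 1:]
    lifted = {(0,) + a for a in derivative_support(A, i)}  # terms without lambda
    return lifted | {(1,) + zero, (1,) + e_i}               # lambda*u_i and lambda*x_i


# Illustrative support in two variables.
A = {(0, 0), (0, 1), (1, 1), (2, 1), (1, 0)}
print(sorted(derivative_support(A, 0)))
print(sorted(lagrange_support(A, 0)))
```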
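Consider the polynomial in two variables,
$$f = c_{00} + c_{10} x_1 + c_{01} x_2 + c_{11} x_1 x_2 + c_{21} x_1^2 x_2.$$
Its support is $A = \{(0, 0), (0, 1), (1, 1), (2, 1), (1, 0)\}$ and its Newton polytope is $P = \mathrm{conv}(A)$, which is a trapezoid. Figure 2 shows the Newton polytope $P$ along with the Newton polytopes of $\partial_1 f - \lambda(u_1 - x_1)$ and $\partial_2 f - \lambda(u_2 - x_2)$. These are polytopes in $\mathbb{R}^3$; we plot the exponents of the Lagrange multiplier $\lambda$ in the (third) vertical direction in Figure 2, and in this example the weight vectors $w$ are written in the same order $(x_1, x_2, \lambda)$.

The faces exposed by $w = (0, 1, 0)$ are shown in red in Figure 3. The corresponding facial system is
$$c_{00} + c_{10} x_1 = 0, \qquad c_{10} - \lambda(u_1 - x_1) = 0, \qquad c_{01} + c_{11} x_1 + c_{21} x_1^2 - \lambda u_2 = 0.$$
This is a system of three equations in the two unknowns $x_1$ and $\lambda$. The first two equations determine $x_1$ and $\lambda$, and substituting these values into the third gives a relation among the coefficients of $f$ and $u$, which does not hold for $f, u$ general. The proof of Theorem 7 is divided into three cases, and one involves such triangular systems, which are independent of some of the variables.

The faces exposed by $w = (0, -1, 1)$ are shown in red in Figure 4. The corresponding facial system is
$$f_w = c_{01} x_2 + c_{11} x_1 x_2 + c_{21} x_1^2 x_2 = 0, \qquad \partial_1 f_w = 0, \qquad \partial_2 f_w + \lambda x_2 = 0.$$
Observe that $h_w(A) = -1$ and that we have
$$-f_w = -x_2\, \partial_2 f_w. \tag{7}$$
This is an instance of Euler's formula for quasihomogeneous polynomials (Lemma 9). If $(\lambda, x)$ is a solution to $(L_{f,u})_w = 0$, then the third equation becomes $\partial_2 f_w = -\lambda x_2$. Substituting this into (7) gives $0 = -f_w = \lambda x_2^2$, which has no solutions in $(\mathbb{C}^\times)^3$. One of the cases in the proof of Theorem 7 exploits Euler's formula in a similar way. ⋄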

The facial systems of the Lagrange multiplier equations are empty
Before giving a proof of Theorem 7, we present two lemmas to help understand the support of f and its interaction with derivatives of f , and then make some observations about the facial system (L f,u ) w .
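Let $f \in \mathbb{C}[x_1, \dots, x_n]$ be a polynomial with support $A \subset \mathbb{N}^n$, which is the set of the exponents of monomials of $f$. We assume that $0 \in A$. As before we write $\partial_i A \subset \mathbb{N}^n$ for the support of the partial derivative $\partial_i f$. For $w \in \mathbb{Z}^n$, the linear function $x \mapsto w \cdot x$ takes minimum values on $A$ and on $\partial_i A$, which we denote by
$$h^* := \min_{a \in A} w \cdot a \qquad\text{and}\qquad h^*_i := \min_{a \in \partial_i A} w \cdot a. \tag{8}$$
(We suppress the dependence on $w$.) Since $0 \in A$, we have $h^* \le 0$. Also, if $h^* = 0$ and if there is some $a \in A$ with $a_i > 0$, then $w_i \ge 0$. Recall that the subsets of $A$ and $\partial_i A$ where the linear function $x \mapsto w \cdot x$ is minimized are their faces exposed by $w$,
$$A_w := \{a \in A \mid w \cdot a = h^*\} \qquad\text{and}\qquad (\partial_i A)_w := \{a \in \partial_i A \mid w \cdot a = h^*_i\}. \tag{9}$$
As in (6), we denote by $f_w$ the restriction of $f$ to $A_w$, and similarly $(\partial_i f)_w$ denotes the restriction of the partial derivative $\partial_i f$ to $(\partial_i A)_w$. We write $\partial_i f_w$ for $\partial_i(f_w)$, the $i$th partial derivative of $f_w$. Our proof of Theorem 7 uses the following two results.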
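Lemma 8. Let $i \in \{1, \dots, n\}$. If $\partial_i f_w \neq 0$, then $(\partial_i f)_w = \partial_i f_w$ and $h^*_i = h^* - w_i$.
Proof. Let us define $A^\circ := A \cap \{a \in \mathbb{R}^n \mid a_i \ge 1\}$ and the same for $A^\circ_w$, and recall that $e_i$ is the $i$th standard basis vector in $\mathbb{R}^n$ and the exponent of $x_i$. Then $\partial_i A = A^\circ - e_i$, and the support of $\partial_i f_w$ is $A^\circ_w - e_i$. Since $A^\circ \subset A$, we have
$$h^*_i + w_i = \min_{a \in A^\circ} w \cdot a \ge h^*. \tag{10}$$
If $\partial_i f_w \neq 0$, then $A^\circ_w$ is nonempty, so that
$$\min_{a \in A^\circ} w \cdot a \le \min_{a \in A^\circ_w} w \cdot a = h^*. \tag{11}$$
Therefore, the inequalities in both (10) and (11) are equalities and we have $h^* = h^*_i + w_i$. Moreover, the minimum of $w \cdot a$ over $A^\circ$ is then attained exactly on $A^\circ_w$, so that $(\partial_i A)_w = A^\circ_w - e_i$ and $(\partial_i f)_w = \partial_i f_w$.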
The restriction f w of f to the face of A exposed by w is quasihomogeneous with respect to the weight w, and thus it satisfies a weighted version of Euler's formula.
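Lemma 9 (Euler's formula for quasihomogeneous polynomials). For $w \in \mathbb{Z}^n$ we have
$$\sum_{i=1}^{n} w_i\, x_i\, \partial_i f_w = h^* \cdot f_w.$$
Proof. For a monomial $x^a$ with $a \in \mathbb{Z}^n$ and $i = 1, \dots, n$, we have that $x_i\, \partial_i x^a = a_i\, x^a$. The statement follows because for $a \in A_w$ (the support of $f_w$), $w \cdot a = h^*$.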
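Euler's formula can be checked on a small example with exact integer arithmetic. The sketch below assumes polynomials are stored as dictionaries from exponent tuples to coefficients; the helper names and the sample polynomial are hypothetical.

```python
def partial(f, i):
    """Partial derivative of f = {exponent: coefficient} in the i-th variable (0-based)."""
    out = {}
    for a, c in f.items():
        if a[i] > 0:
            b = a[:i] + (a[i] - 1,) + a[i + 1:]
            out[b] = out.get(b, 0) + c * a[i]
    return out


def euler_lhs(f_w, w):
    """Sum over i of w_i * x_i * d_i f_w, with zero coefficients removed."""
    total = {}
    for i, wi in enumerate(w):
        for a, c in partial(f_w, i).items():
            b = a[:i] + (a[i] + 1,) + a[i + 1:]   # multiply by x_i
            total[b] = total.get(b, 0) + wi * c
    return {a: c for a, c in total.items() if c != 0}


# A polynomial supported on the face {(0,1), (1,1), (2,1)}, which is exposed
# by w = (0, -1) inside the support {(0,0), (0,1), (1,1), (2,1), (1,0)}.
f_w = {(0, 1): 7, (1, 1): -2, (2, 1): 5}
w, h_star = (0, -1), -1

lhs = euler_lhs(f_w, w)
rhs = {a: h_star * c for a, c in f_w.items()}
print(lhs == rhs)
```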
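Our proof of Theorem 7 investigates facial systems $(L_{f,u})_w$ for $0 \neq w \in \mathbb{Z}^{n+1}$ with the aim of showing that for $f$ general given its support $A$, no facial system has a solution. Recall from (2) that the Lagrange multiplier equations for the Euclidean distance problem are
$$L_{f,u}(\lambda, x) = \begin{bmatrix} f(x) \\ \partial_1 f(x) - \lambda(u_1 - x_1) \\ \vdots \\ \partial_n f(x) - \lambda(u_n - x_n) \end{bmatrix} = 0.$$
Fix $0 \neq w = (v, w_1, \dots, w_n) \in \mathbb{Z}^{n+1}$. The first coordinate of $w$ is $v \in \mathbb{Z}$. It has index 0 and corresponds to the variable $\lambda$.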
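The first entry of the facial system $(L_{f,u})_w$ is $f_w$. The shape of the remaining entries depends on $w$ as follows. Recall from (8) that $h^* := \min_{a \in A} w \cdot a$ and $h^*_i := \min_{a \in \partial_i A} w \cdot a$. As $v$ and $v + w_i$ are the weights of the monomials $\lambda u_i$ and $\lambda x_i$, respectively, there are seven possibilities for each of these remaining entries,
$$(\partial_i f - \lambda(u_i - x_i))_w = \begin{cases}
(\partial_i f)_w & \text{if } h^*_i < \min\{v, v + w_i\}, \\
(\partial_i f)_w - \lambda(u_i - x_i) & \text{if } h^*_i = v \text{ and } w_i = 0, \\
(\partial_i f)_w - \lambda u_i & \text{if } h^*_i = v \text{ and } w_i > 0, \\
(\partial_i f)_w + \lambda x_i & \text{if } h^*_i = v + w_i \text{ and } w_i < 0, \\
-\lambda(u_i - x_i) & \text{if } h^*_i > v \text{ and } w_i = 0, \\
-\lambda u_i & \text{if } h^*_i > v \text{ and } w_i > 0, \\
\lambda x_i & \text{if } h^*_i > v + w_i \text{ and } w_i < 0.
\end{cases} \tag{12}$$
For a subset $I \subset \{1, \dots, n\}$ and a vector $u \in \mathbb{C}^n$, let $u_I := \{u_i \mid i \in I\}$ be the components of $u$ indexed by $i \in I$. We similarly write $w_I$ for $w \in \mathbb{Z}^n$ and $x_I$ for variables $x \in \mathbb{C}^n$, and write $\mathbb{C}^I$ for the corresponding subspace of $\mathbb{C}^n$.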
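The count of seven possibilities can be confirmed by a small enumeration. The sketch below uses only the weights named above: $h^*_i$ for the terms of $\partial_i f$, $v$ for $\lambda u_i$ and $v + w_i$ for $\lambda x_i$. The facial part consists of the terms of minimal weight; the function name and the string labels are hypothetical.

```python
def facial_terms(h_i, v, w_i):
    """Which of the three groups of terms of d_i f - lambda*(u_i - x_i)
    attain the minimal weight and hence survive in the facial system."""
    weights = {"d_i f": h_i, "lambda*u_i": v, "lambda*x_i": v + w_i}
    m = min(weights.values())
    return frozenset(name for name, wt in weights.items() if wt == m)


# Enumerate a small grid of integer weights and collect the distinct outcomes.
cases = {facial_terms(h, v, w)
         for h in range(-2, 3) for v in range(-2, 3) for w in range(-2, 3)}
print(len(cases))
for case in sorted(cases, key=lambda s: (len(s), sorted(s))):
    print(sorted(case))
```

Every nonempty subset of the three groups occurs, which gives the seven cases; the two single-term outcomes `lambda*u_i` and `lambda*x_i` are monomials.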
We recall Theorem 7, before we give a proof.
Theorem 7. Suppose that f is general given its support A, that 0 ∈ A, and that u ∈ R n is general. For any nonzero w ∈ Z n+1 , the facial system (L f,u ) w has no solutions in (C × ) n+1 .
Proof. Let $0 \neq w = (v, w_1, \dots, w_n) \in \mathbb{Z}^{n+1}$. As before, $v$ corresponds to the variable $\lambda$ and $w_i$ to $x_i$. We argue by cases that depend upon $w$ and $A$, showing that in each case, for a general polynomial $f$ with support $A$, the facial system has no solutions in $(\mathbb{C}^\times)^{n+1}$. Note that the last two possibilities in (12) do not occur as they give monomials.
We distinguish three cases.
Case 1 (the constant case): Suppose that ∂ i f w = 0 for all 1 ≤ i ≤ n. Then f w is the constant term of f . Since 0 ∈ A, this is nonvanishing for f general, and the facial system (L f,u ) w has no solutions.
For the next two cases we may assume that there is a partition $I \sqcup J = \{1, \dots, n\}$ with $I$ nonempty such that $\partial_i f_w \neq 0$ for $i \in I$ and $\partial_j f_w = 0$ for $j \in J$. By Lemma 8 we have
$$h^*_i = h^* - w_i \quad\text{for all } i \in I. \tag{13}$$
As $j \in J$ implies that $\partial_j f_w = 0$, we see that if $a \in A_w$, then $a_J = 0$. This implies that $f_w \in \mathbb{C}[x_I]$ is a polynomial in only the variables $x_I$.
Case 2 (triangular systems): Suppose that for $i \in I$, $w_i \ge 0$, that is, $w_I \ge 0$. We claim that this implies $w_I = 0$. To see this, let $a \in A_w$. Then $a_J = 0$. We have $0 \ge h^* = w \cdot a = w_I \cdot a_I \ge 0$.
Thus $h^* = w_I \cdot a_I = 0$, which implies that $0 \in A_w$. Let $i \in I$. Since $\partial_i f_w \neq 0$, there exists some $a \in A_w$ with $a_i > 0$. Since $w_I \cdot a_I = 0$ for all $a \in A_w$ and $w_I \ge 0$, every summand $w_i a_i$ vanishes, and we conclude that $w_i = 0$.
Let $i \in I$. By Lemma 8, we have $h^*_i = h^* - w_i$, so that $h^*_i = h^* = 0$, and we also have $(\partial_i f)_w = \partial_i f_w$. As $w_i = 0$, the possibilities from (12) become
$$(\partial_i f - \lambda(u_i - x_i))_w = \begin{cases}
\partial_i f_w & \text{if } v > 0, \\
\partial_i f_w - \lambda(u_i - x_i) & \text{if } v = 0, \\
-\lambda(u_i - x_i) & \text{if } v < 0.
\end{cases}$$
We consider three subcases of $v < 0$, $v > 0$, and $v = 0$ in turn. Suppose first that $v < 0$ and that $(\lambda, x) \in (\mathbb{C}^\times)^{n+1}$ is a solution to $(L_{f,u})_w$. As $\lambda \neq 0$ and we have $\lambda(u_i - x_i) = 0$ for all $i \in I$, we conclude that $x_I = u_I$. Since $f_w \in \mathbb{C}[x_I]$ is a general polynomial of support $A_w$ and $u$ is general, we do not have $f_w(u_I) = 0$. Thus $(L_{f,u})_w$ has no solutions when $v < 0$. Suppose next that $v > 0$. Then the subsystem of $(L_{f,u})_w = 0$ involving $f_w$ and the equations indexed by $I$ is
$$f_w = \partial_i f_w = 0, \quad\text{for } i \in I. \tag{14}$$
As $f_w \in \mathbb{C}[x_I]$, the system of equations (14) implies that the hypersurface $V(f_w) \subset (\mathbb{C}^\times)^I$ is singular. However, since $f_w$ is general, this hypersurface must be smooth. Thus $(L_{f,u})_w$ has no solutions when $v > 0$.
The third subcase of $v = 0$ is more involved. When $v = 0$, the subsystem of $(L_{f,u})_w$ consisting of $f_w$ and the equations indexed by $I$ is
$$f_w = \partial_i f_w - \lambda(u_i - x_i) = 0, \quad\text{for } i \in I. \tag{15}$$
As $f_w \in \mathbb{C}[x_I]$ and $0 \in A_w$, this is the system $L_{f_w, u_I}$ in $\mathbb{C}_\lambda \times \mathbb{C}^I$ for the critical points of Euclidean distance from $u_I \in \mathbb{C}^I$ to the hypersurface $V(f_w) \subset \mathbb{C}^I$. Thus $(L_{f,u})_w$ is triangular; to solve it, we first solve (15), and then consider the equations in $(L_{f,u})_w$ indexed by $J$.
Since ∂ j f w = 0 for j ∈ J , the remaining equations are independent of u I and f w . We will see that they are also triangular.
If $a \in A \setminus A_w$, then $w \cdot a > 0$ as $h^* = 0$. Let $j \in J$. We earlier observed that if $b \in A_w$ then $b_j = 0$ and we defined $h^*_j$ to be the minimum $\min\{w \cdot a \mid a \in \partial_j A\}$. If $a \in \partial_j A$, then $a + e_j \in A$ has positive $j$th coordinate, so that $a + e_j \in A \setminus A_w$ and $w \cdot (a + e_j) > 0$. This implies that $w \cdot a > -w_j$. Taking the minimum over $a \in \partial_j A$ implies that $h^*_j > -w_j$. Consider now the members of the facial system $(L_{f,u})_w$ indexed by $j \in J$. Since $v = 0$ and $h^*_j > -w_j$, the second and fourth possibilities for $(\partial_j f - \lambda(u_j - x_j))_w$ in (12) do not occur. Recall that the last two possibilities also do not occur. As $v = 0$, we have three cases
$$(\partial_j f - \lambda(u_j - x_j))_w = \begin{cases}
(\partial_j f)_w & \text{if } h^*_j < \min\{0, w_j\}, \\
(\partial_j f)_w - \lambda u_j & \text{if } h^*_j = 0 \text{ and } w_j > 0, \\
-\lambda(u_j - x_j) & \text{if } h^*_j > 0 \text{ and } w_j = 0.
\end{cases} \tag{16}$$
If the first case holds for some $j \in J$, then as $h^*_j > -w_j$, we have $w_j > 0$. Since $w_j \ge 0$ in the other cases, we have $w_j \ge 0$ for all $j \in J$. As we showed earlier that $w_I = 0$, we have $w \ge 0$. But then as $\partial_j A \subset \mathbb{N}^n$, we have $h^*_j \ge 0$ for all $j \in J$. In particular, the first case in (16), in which $h^*_j < 0$, does not occur. Thus the only possibilities for the $j$th component of $(L_{f,u})_w$ are the second or the third cases in (16), so that $w_J \ge 0$.
Let us further partition J according to the vanishing of w j , K := {k ∈ J | w k = 0} and M := {m ∈ J | w m > 0} .
Every component of w M is positive and w I = w K = 0. Moreover, the second entry in (16) shows that h * m = 0 for all m ∈ M. We conclude from this that no variable in x M occurs in (∂ m f ) w , for any m ∈ M.
Let us now consider solving $(L_{f,u})_w$, using triangularity. Let $(\lambda, x_I)$ be a solution to the subsystem (15) for critical points of the Euclidean distance from $u_I$ to $V(f_w)$ in $\mathbb{C}^I$. We may assume that $\lambda \neq 0$ as $f_w$ is general. Then the subsystem corresponding to $K$ gives $x_k = u_k$ for $k \in K$. Let $m \in M$. Since $(\partial_m f)_w$ only involves $x_I$ and $x_K$, substituting these values into $(\partial_m f)_w$ gives a constant, which cannot be equal to $\lambda u_m$ for general $u_m \in \mathbb{C}$. As $w \neq 0$, we cannot have $M = \emptyset$, so this last case occurs. Thus $(L_{f,u})_w$ has no solutions when $v = 0$.
Case 3 (using the Euler formula): Let us now consider the case where there is some index i ∈ I with w i < 0 and suppose that the facial system (L f,u ) w has a solution. Let i ∈ I be an index with w i < 0. As the the facial system has a solution, the last possibility in (12) for (∂ i f − λ(u i − x i )) w does not occur. Thus either first or the fourth possibility occurs.
For any i ∈ I, we have h * i = h * − w i < v − w i , by (13). Thus if w i ≥ 0, then h * i < v. As we obtained the same inequality when w i < 0, we conclude that for all i ∈ I we have h * i < v. Thus only the first or the fourth possibility in (12) occurs for i ∈ I. That is, (17) ( These cases further partition I into sets K and M, where For k ∈ K the corresponding equation in (L f,u ) w = 0 is ∂ k f w = 0 and for m ∈ M it is ∂ m f w + λx m = 0. If M = ∅, then K = I and the subsystem of (L f,u ) w consisting of f w and the equations indexed by I is (14), which has no solutions as we already observed. Now suppose that M = ∅. Define w * := min{w i | i ∈ I}. Then w * < 0. Moreover, by (17) we have that if m ∈ M, then w m = 1 2 (h * − v). Thus, w m = w * for every m ∈ M. Suppose that (λ, x) is a solution to (L f,u ) w . For k ∈ K, we have ∂ k f w (x) = 0 and for m ∈ M, we have that ∂ m f w (x) = −λx m . Then by Lemma 9, we get The last equality uses that I = K ⊔ M. Since λ = 0 and w * = 0, we have m∈M x 2 m = 0. Let Q be this quadratic form. Then the point x I lies on both hypersurfaces V(f w ) and V(Q).
Since ∂_k f_w(x_I) = ∂_k Q = 0 for k ∈ K and 2∂_m f_w(x_I) = −λ∂_m Q for m ∈ M, we see that the two hypersurfaces meet non-transversely at x_I. But this contradicts f_w being general. Thus there are no solutions to (L_{f,u})_w = 0 in this last case.
This completes the proof of Theorem 7.

The Euclidean distance degree of a rectangular parallelepiped
Let a = (a_1, …, a_n) be a vector of nonnegative integers and recall from (4) the definition of the rectangular parallelepiped:

$$B(a) \;:=\; [0,a_1]\times\cdots\times[0,a_n] \;\subset\; \mathbb{R}^n .$$

We consider the EDD of a general polynomial whose Newton polytope is B(a), with the goal of proving Theorem 2.
Recall that e_i := (0, …, 1, …, 0) is the ith standard unit vector in R^n (the unique 1 is in the ith position). The 0th unit vector e_0 corresponds to the variable λ. Let f be a general polynomial with Newton polytope B(a). Then the Newton polytope of the partial derivative ∂_i f is B(a_1, …, a_i − 1, …, a_n).
For each i = 1, …, n, let P_i(a) be the convex hull of B(a_1, …, a_i − 1, …, a_n) and the two points e_0 and e_0 + e_i. Then P_i(a) is the Newton polytope of ∂_i f − λ(u_i − x_i). Consequently, B(a), P_1(a), …, P_n(a) are the Newton polytopes of the Lagrange multiplier equations (2).
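To make the definition of P_i(a) concrete, here is a minimal Python sketch that lists points whose convex hull is P_i(a). The function names `box_vertices` and `P_generators` are ours, not the paper's. The coordinate order (λ, x_1, …, x_n) and the requirement a_i ≥ 1 (so that ∂_i f is nonzero) are assumptions made for the illustration.

```python
from itertools import product

def box_vertices(a):
    """Vertices of B(a) = [0,a_1] x ... x [0,a_n]; duplicates merge when some a_i = 0."""
    return sorted(set(product(*[(0, ai) for ai in a])))

def P_generators(a, i):
    """Points whose convex hull is P_i(a), in coordinates (lambda, x_1, ..., x_n).

    i is 1-based. Assumes a_i >= 1, so that B(a_1, ..., a_i - 1, ..., a_n) is defined.
    """
    if a[i - 1] < 1:
        raise ValueError("this sketch assumes a_i >= 1")
    shrunk = list(a)
    shrunk[i - 1] -= 1
    # Exponents of the partial derivative: lambda-exponent 0.
    points = [(0,) + v for v in box_vertices(shrunk)]
    # e_0 (the monomial lambda) and e_0 + e_i (the monomial lambda * x_i).
    e0 = (1,) + (0,) * len(a)
    e0_plus_ei = tuple(1 if k in (0, i) else 0 for k in range(len(a) + 1))
    return points + [e0, e0_plus_ei]

if __name__ == "__main__":
    for p in P_generators((1, 2), 1):
        print(p)
```

The list returned is a generating set, not necessarily the vertex set; some of the points may lie in the convex hull of the others.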
Recall that for each 1 ≤ k ≤ n, e_k(a) is the elementary symmetric polynomial of degree k evaluated at a. It is the sum of all square-free monomials of degree k in a_1, …, a_n. Let us write

$$E(a) \;:=\; \sum_{k=1}^{n} k!\, e_k(a) .$$

The main result in this section is the following mixed volume computation. It and Theorem 1 together imply Theorem 2.

Theorem 10. With these definitions, MV(B(a), P_1(a), …, P_n(a)) = E(a).

Our proof of Theorem 10 occupies Section 6.3, and it depends upon lemmas and definitions collected in Sections 6.1 and 6.2. One technical lemma from Section 6.2 is proven in Section 6.4.

The mixed volume MV(Q_1, …, Q_m) of polytopes Q_1, …, Q_m in R^m is symmetric in its arguments and is normalized so that MV(Q, …, Q) = m! Vol(Q). It also has the following property.

Multiadditivity. If Q′_1 is another polytope in R^m, then MV(Q_1 + Q′_1, Q_2, …, Q_m) = MV(Q_1, Q_2, …, Q_m) + MV(Q′_1, Q_2, …, Q_m).

Mixed volume decomposes as a product when the polytopes possess a certain triangularity (see [27, Lem. 6] or [11, Thm. 1.10]). We use a special case. For a positive integer b, write [0, b e_i] for the interval of length b along the ith axis in R^m. For each j = 1, …, m, let π_j : R^m → R^{m−1} be the projection along the coordinate direction j.

Lemma 11. Let Q_1, …, Q_{m−1} be polytopes in R^m, let b be a positive integer, and let 1 ≤ j ≤ m. Then MV(Q_1, …, Q_{m−1}, [0, b e_j]) = b · MV(π_j(Q_1), …, π_j(Q_{m−1})).

Proof. We paraphrase the proof in [11], which is bijective and algebraic. Consider a system g_1, …, g_m of general polynomials with Newton polytopes Q_1, …, Q_{m−1}, [0, b e_j], respectively. As g_m is a univariate polynomial of degree b in x_j, the equation g_m(x_j) = 0 has b solutions. For each solution x*_j, if we substitute x_j = x*_j in g_1, …, g_{m−1}, then we obtain general polynomials with Newton polytopes π_j(Q_1), …, π_j(Q_{m−1}). Thus there are MV(π_j(Q_1), …, π_j(Q_{m−1})) solutions to our original system for each of the b solutions to g_m(x_j) = 0.

For a vector a = (a_1, …, a_m), the rectangular parallelepiped B(a) ⊂ R^m has Euclidean volume a_1 ⋯ a_m, the product of its side lengths.
As before, P_i(a) is the convex hull of B(a_1, …, a_i − 1, …, a_m) and e_0 + [0, e_i]. Define Pyr(a) to be the pyramid with base the rectangular parallelepiped B(a) and apex e_0; this is the convex hull of B(a) and e_0. For each j = 1, …, m we have the projection π_j : R^m → R^{m−1} along the jth coordinate, so that π_j(a) = (a_1, …, a_{j−1}, a_{j+1}, …, a_m). We then have that π_j(B(a)) = B(π_j(a)). The following is immediate from the definitions.
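Applying symmetry and Lemma 13 with m = n−1, this is a_j (1 + E(π_j(a))). Thus the mixed volume (18) is

$$\sum_{j=1}^{n} a_j\Bigl(1 + \sum_{k=1}^{n-1} k!\, e_k(\pi_j(a))\Bigr) \;=\; E(a) ,$$

where the equality uses Σ_j a_j = e_1(a) and Σ_j a_j e_k(π_j(a)) = (k+1) e_{k+1}(a), as each square-free monomial of degree k+1 arises exactly k+1 times in the latter sum.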
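The recursion used in this last step can be checked numerically. The sketch below assumes the definition E(a) = Σ_{k=1}^{n} k! e_k(a), with E of the empty vector equal to 0, and tests that Σ_j a_j (1 + E(π_j(a))) = E(a) on a few integer vectors. The function names are ours and the sample vectors are arbitrary.

```python
from itertools import combinations
from math import factorial, prod

def elementary_symmetric(a, k):
    """e_k(a): sum of all square-free monomials of degree k in the entries of a."""
    return sum(prod(c) for c in combinations(a, k))

def E(a):
    """E(a) = sum_{k=1}^{n} k! * e_k(a); the empty vector gives 0."""
    return sum(factorial(k) * elementary_symmetric(a, k)
               for k in range(1, len(a) + 1))

def recursion_rhs(a):
    """sum_j a_j * (1 + E(pi_j(a))), where pi_j deletes the j-th coordinate."""
    return sum(aj * (1 + E(a[:j] + a[j + 1:])) for j, aj in enumerate(a))

if __name__ == "__main__":
    for a in [(1,), (1, 1), (2, 2), (1, 2, 3), (2, 0, 5, 1)]:
        assert recursion_rhs(a) == E(a)
        print(a, E(a))
```

For instance, a = (1, 1) gives E(a) = 1!·2 + 2!·1 = 4, and a = (1, 2, 3) gives E(a) = 6 + 22 + 36 = 64.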
This finishes the proof of Theorem 10.

6.4. Proof of Lemma 13

We use Bernstein's Theorem, showing that a general polynomial system with support Pyr(a), P_1(a), …, P_m(a) has 1 + E(a) solutions in the torus (C^×)^{m+1}, where a = (a_1, …, a_m) is a vector of positive integers.
A general polynomial with Newton polytope Pyr(a) has the form cλ + f, where f has Newton polytope B(a) and c ≠ 0. Here, λ is a variable with exponent e_0. Dividing by c, we may assume that the polynomial is monic in λ. Similarly, as P_i(a) is the convex hull of B(a_1, …, a_i − 1, …, a_m) and e_0 + [0, e_i], a general polynomial with support P_i(a) may be assumed to have the form λℓ_i(x_i) + f_i(x), where f_i has Newton polytope B(a_1, …, a_i − 1, …, a_m) and ℓ_i(x_i) := c_i + x_i for a general constant c_i. Replacing f and each f_i by its negative, which is again general, we may therefore assume that a general system of polynomials with the given support has the form

$$\lambda - f(x),\quad \lambda\,\ell_1(x_1) - f_1(x),\quad \ldots,\quad \lambda\,\ell_m(x_m) - f_m(x), \tag{19}$$

where f is a general polynomial with Newton polytope B(a) and for each i = 1, …, m, f_i is a general polynomial with Newton polytope B(a_1, …, a_i − 1, …, a_m). We show that 1 + E(a) is the number of common zeros in (C^×)^{m+1} of the polynomials in (19).
Using the first polynomial to eliminate λ from the rest shows that solving the system (19) is equivalent to solving the system

$$F\,:\quad \ell_1(x_1)\, f(x) - f_1(x),\quad \ldots,\quad \ell_m(x_m)\, f(x) - f_m(x), \tag{20}$$

which is in the variables x_1, …, x_m, as z ↦ (f(z), z) is a bijection between the solutions z to (20) and the solutions to (19). We show that the number of common zeroes to (20) is 1 + E(a), when f, f_1, …, f_m are general given their Newton polytopes.
Unlike the system (19), the system F is not general given its support. Nevertheless, we will show that no facial system has any solutions. Then, by Bernstein's Other Theorem, its number of solutions is the corresponding mixed volume, which we now compute.
Since the Newton polytope of ℓ_j f − f_j is B(a) + [0, e_j], this mixed volume is

$$\mathrm{MV}\bigl(B(a)+[0,e_1],\,\ldots,\,B(a)+[0,e_m]\bigr) \;=\; \sum_{I\subseteq\{1,\ldots,m\}} |I|!\,\prod_{i\in I} a_i \;=\; 1 + E(a) .$$

To see this, first observe that the second equality is the definition of E(a), as the summand for I = ∅ is 1. For the first equality, consider expanding the mixed volume using multilinearity. This will have summands indexed by the subsets I of {1, …, m} where in the summand indexed by I, we choose B(a) in the positions in I and [0, e_j] when j ∉ I. A repeated application of Lemma 11 shows that this summand is MV(B(a_I), …, B(a_I)), as projecting a from the coordinates j ∉ I gives a_I. This term is |I|! ∏_{i∈I} a_i, by the normalization property of mixed volume.
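This count can be cross-checked without the subset expansion. The Python sketch below computes the mixed volume of axis-parallel boxes from the standard inclusion-exclusion (polarization) formula MV(Q_1, …, Q_m) = Σ_{∅≠S} (−1)^{m−|S|} Vol(Σ_{i∈S} Q_i), normalized so that MV(Q, …, Q) = m! Vol(Q), and compares it with 1 + E(a) for the boxes B(a) + [0, e_j]. It assumes E(a) = Σ_k k! e_k(a), represents a box by its tuple of side lengths, and uses function names of our own choosing.

```python
from itertools import combinations
from math import factorial, prod

def E(a):
    """E(a) = sum_{k=1}^{m} k! * e_k(a), with e_k the elementary symmetric polynomial."""
    return sum(factorial(k) * sum(prod(c) for c in combinations(a, k))
               for k in range(1, len(a) + 1))

def box_mixed_volume(boxes):
    """Mixed volume of m axis-parallel boxes in R^m, each given by its side lengths.

    Uses MV = sum over nonempty S of (-1)^(m-|S|) * Vol(Minkowski sum of boxes in S).
    The Minkowski sum of axis-parallel boxes is the box whose side lengths add.
    """
    m = len(boxes)
    total = 0
    for size in range(1, m + 1):
        sign = (-1) ** (m - size)
        for S in combinations(range(m), size):
            sides = [sum(boxes[i][d] for i in S) for d in range(m)]
            total += sign * prod(sides)
    return total

def shifted_boxes(a):
    """Side lengths of B(a) + [0, e_j] for j = 1, ..., m."""
    return [tuple(ai + (1 if i == j else 0) for i, ai in enumerate(a))
            for j in range(len(a))]

if __name__ == "__main__":
    for a in [(1,), (2, 3), (1, 1, 1), (2, 1, 4, 3)]:
        assert box_mixed_volume(shifted_boxes(a)) == 1 + E(a)
        print(a, 1 + E(a))
```

The loop runs over all 2^m − 1 nonempty subsets, so this is only meant for small m.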
We now show that no facial system of (20) has any solutions. Since each Newton polytope is a rectangular parallelepiped B(a) + [0, e_j], its proper faces are exposed by nonzero vectors w ∈ {−1, 0, 1}^m, and each exposes a different face.
Let w ∈ {−1, 0, 1}^m and suppose that w ≠ 0. We first consider the face of B(a) exposed by w. This is a rectangular parallelepiped whose ith coordinate is [0, a_i] if w_i = 0, 0 if w_i = 1, and a_i if w_i = −1.
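In the same manner as (9), we define B(a)_w := {b* ∈ B(a) | w · b* = min_{b∈B(a)} w · b}, and similarly, we define (B(a) + [0, e_j])_w for each j = 1, …, m. Then, writing δ_{ij} for the Kronecker delta,

$$\text{the $i$th coordinate of } (B(a)+[0,e_j])_w \text{ is } \begin{cases} [0,\, a_i+\delta_{ij}] & \text{if } w_i = 0,\\ 0 & \text{if } w_i = 1,\\ a_i+\delta_{ij} & \text{if } w_i = -1. \end{cases} \tag{21}$$

As ℓ_j = c_j + x_j, we also have

$$(\ell_j)_w \;=\; \begin{cases} \ell_j & \text{if } w_j = 0,\\ c_j & \text{if } w_j = 1,\\ x_j & \text{if } w_j = -1. \end{cases}$$

The Newton polytope of f_i has ith coordinate the interval [0, a_i − 1] and for j ≠ i its jth coordinate is the interval [0, a_j]. The Newton polytope of ℓ_i · f differs in that its ith coordinate is the interval [0, a_i + 1]. We get

$$(\ell_i f - f_i)_w \;=\; \begin{cases} \ell_i\, f_w - (f_i)_w & \text{if } w_i = 0,\\ c_i\, f_w - (f_i)_w & \text{if } w_i = 1,\\ x_i\, f_w & \text{if } w_i = -1, \end{cases} \tag{22}$$

and for f_i general (f_i)_w ≠ 0 when w_i ≠ −1.

Let α be the number of coordinates of w equal to 0, β be the number of coordinates equal to 1, and set γ := m − α − β, which is the number of coordinates of w equal to −1. The faces (B(a) + [0, e_j])_w exposed by w have dimension α, by (21), so the facial system F_w of (20) is effectively in α variables.

Suppose first that γ > 0. Since on (C^×)^m each variable x_i is nonzero, by (22) the facial system F_w is equivalent to

$$f_w = 0 \quad\text{and}\quad (f_i)_w = 0 \ \text{ for all } i \text{ with } w_i \neq -1 .$$

As these are nonzero and general given their support, and there are α + β + 1 > α of them, we see that F_w has no solutions.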
If γ = 0, then β > 0. Consider the subfamily F of systems of the form (20) where f = 0, but the f_i remain general. Then the facial system F_w is equivalent to the system {(f_i)_w | w_i ≠ −1} of α + β > α polynomials which are nonzero and general given their support, so that F_w has no solutions.
As the condition that F w has no solutions is an open condition in the space of all systems (19), this implies that for a general system (19) with corresponding system F (20), no facial system F w has a solution. This completes the proof of the lemma.