
A representation theoretic explanation of the Borcea–Brändén characterization

Abstract

In 2009, Borcea and Brändén characterized all linear operators on multivariate polynomials which preserve the property of being non-vanishing (stable) on products of prescribed open circular regions. We give a representation theoretic interpretation of their findings, which generalizes and simplifies their result and leads to a conceptual unification of many related results in polynomial stability theory. At the heart of this unification is a generalized Grace’s theorem which addresses polynomials whose roots are all contained in some real interval or ray. This generalization allows us to extend the Borcea–Brändén result to characterize a certain subclass of the linear operators which preserve such polynomials.

Introduction

In 1914, Pólya and Schur [14] characterized the set of diagonal linear operators on polynomials which preserve real-rootedness. Since this seminal paper, much work has been done in extending this characterization to other classes of linear operators. This program in essence came to a close in 2009 with a paper of Borcea and Brändén [1], which gave a complete characterization of linear operators on polynomials which preserve real-rootedness.

Their real-rootedness preservation characterization is derived from a more general result pertaining to stable polynomials. Given \(\Omega \subset {\mathbb {C}}^m\), we say that a polynomial \(f \in {\mathbb {C}}[x_1,\ldots ,x_m]\) is \(\Omega \)-stable if f does not vanish in \(\Omega \). Further, f is real stable if it has real coefficients and is \({\mathcal {H}}_+^m\)-stable, where \({\mathcal {H}}_+\subset {\mathbb {C}}\) is the open upper half-plane. (We also denote the open lower half-plane by \({\mathcal {H}}_-\subset {\mathbb {C}}\).) We additionally use the terms weakly \(\Omega \)-stable and weakly real stable if we allow \(f \equiv 0\). Finally, we write \(f \in {\mathbb {C}}^\lambda [x_1,\ldots ,x_m]\) for \(\lambda \equiv (\lambda _1,\ldots ,\lambda _m)\) if f is of degree at most \(\lambda _k\) in \(x_k\). We are then led to the following problems for \({\mathbb {K}}\in \{{\mathbb {C}},{\mathbb {R}}\}\), generalized from the Pólya-Schur characterization:

Problem 1

Characterize linear operators \(T: {\mathbb {K}}^\lambda [x_1,\ldots ,x_m] \rightarrow {\mathbb {K}}[x_1,\ldots ,x_m]\) preserving weak \(\Omega \)-stability.

Problem 2

Characterize linear operators \(T: {\mathbb {K}}[x_1,\ldots ,x_m] \rightarrow {\mathbb {K}}[x_1,\ldots ,x_m]\) preserving weak \(\Omega \)-stability.

In [1], Borcea and Brändén were able to solve these problems in many cases. In particular, they solved both problems for \({\mathbb {K}}= {\mathbb {R}}\) and \(\Omega = {\mathcal {H}}_+^m\), where \(m=1\) corresponds to the case of preservation of real-rooted polynomials. For \({\mathbb {K}}= {\mathbb {C}}\), they were able to solve Problem 1 for \(\Omega \) that is any product of open circular regions in \({\mathbb {C}}\).

In this paper we will only be concerned with Problem 1, for which we now state the solution from [1]. Given a linear operator \(T: {\mathbb {K}}^\lambda [x_1,\ldots ,x_m] \rightarrow {\mathbb {K}}[x_1,\ldots ,x_m]\), a polynomial \({{\,\mathrm{Symb}\,}}_{BB}(T)\) called the (Borcea–Brändén) symbol is associated to T. Specifically, the symbol is a polynomial in \({\mathbb {K}}^{\lambda \sqcup \lambda }[x_1,\ldots ,x_m, z_1,\ldots ,z_m]\) (i.e., of 2m variables), where \(\lambda \sqcup \lambda := (\lambda _1,\ldots ,\lambda _m,\lambda _1,\ldots ,\lambda _m)\). The crucial feature of the symbol is that it shares certain stability properties with its associated linear operator, which yields the characterizations stated in the following results. (We will express these results in more detail in Sects. 6 and 7.)

Theorem 1.1

(Borcea–Brändén) Fix a linear operator \(T: {\mathbb {C}}^\lambda [x_1,\ldots ,x_m] \rightarrow {\mathbb {C}}[x_1,\ldots ,x_m]\) which has image of dimension greater than one. Then, T maps \({\mathcal {H}}_+^m\)-stable polynomials to weakly \({\mathcal {H}}_+^m\)-stable polynomials if and only if \({{\,\mathrm{Symb}\,}}_{BB}(T)\) is \({\mathcal {H}}_+^{2m}\)-stable.

Theorem 1.2

(Borcea–Brändén) Fix a linear operator \(T: {\mathbb {R}}^\lambda [x_1,\ldots ,x_m] \rightarrow {\mathbb {R}}[x_1,\ldots ,x_m]\) which has image of dimension greater than two. Then, T maps real stable polynomials to weakly real stable polynomials if and only if either \({{\,\mathrm{Symb}\,}}_{BB}(T)\) or \({{\,\mathrm{Symb}\,}}_{BB}(T^-)\) is real stable, where \(T^-(p) := T(p(-x_1,\ldots ,-x_m))\).

To deal with other products of circular regions, one then conjugates T by certain Möbius transformations and applies Theorem 1.1 to the conjugated operator. Unfortunately though, this is a tedious process which has to be done each time a new stability region is to be considered. Additionally, the image dimension restrictions give rise to degeneracy cases which have to be dealt with separately. Both of these issues obscure the connection between an operator and its symbol.

In this paper, we present a new conceptual approach to the Borcea–Brändén characterization via the representation theory of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\). In particular, we derive a new symbol (denoted \({{\,\mathrm{Symb}\,}}\)) in a natural way, and our definition eliminates the issues discussed above. This is seen in the following results, which are our simplified and generalized versions of the Borcea–Brändén characterizations. Note that for the sake of simplicity, we have omitted a few details here regarding non-convex circular regions. Specifically, circular regions of \({\mathbb {C}}\) should be thought of as lying in the Riemann sphere, so that complements of discs contain the point at \(\infty \).

Theorem 6.2

Fix a linear operator \(T: {\mathbb {C}}^\lambda [x_1,\ldots ,x_m] \rightarrow {\mathbb {C}}^\alpha [x_1,\ldots ,x_l]\), a product of all open or all closed circular regions \(\Omega _0 = C_1 \times \cdots \times C_m\), and a product of sets \(\Omega _1 := S_1 \times \cdots \times S_l\). Further, denote \({\widetilde{\Omega }}_0 := ({\mathbb {C}}{\setminus } C_1) \times \cdots \times ({\mathbb {C}}{\setminus } C_m)\). Up to certain degree and convexity (of \(C_k\)) restrictions, we have that T maps \(\Omega _0\)-stable polynomials to nonzero \(\Omega _1\)-stable polynomials if and only if \({{\,\mathrm{Symb}\,}}(T)\) is \(({\widetilde{\Omega }}_0 \times \Omega _1)\)-stable.

Theorem 7.2

Fix a linear operator \(T: {\mathbb {R}}^\lambda [x_1,\ldots ,x_m] \rightarrow {\mathbb {R}}^\alpha [x_1,\ldots ,x_l]\). Up to certain degree restrictions, T maps real stable polynomials to nonzero real stable polynomials if and only if \({{\,\mathrm{Symb}\,}}(T)\) is either \(({\overline{{\mathcal {H}}_-}}^m \times {\mathcal {H}}_+^l)\)-stable or \(({\overline{{\mathcal {H}}_-}}^m \times {\mathcal {H}}_-^l)\)-stable.

We summarize the specific improvements that this and our other related results give over the Borcea–Brändén characterization as follows.

  1.

    Different stability regions can be considered using the same symbol. The symbol we define in this paper is universal: for example, it gives stability-preservation information for any product of open circular regions. The Borcea–Brändén symbol, on the other hand, required the application of Möbius transformations. In addition, our symbol also allows for the output stability region to be chosen independently of the input stability region. While this does not literally improve the result, it does allow for quicker computations. In particular, see Examples 6.3 and 6.4 where classical polynomial convolution results are easily derived from our framework.

  2.

    Our characterization does not require any degeneracy condition. Our results characterize operators which preserve (strong) stability rather than weak stability. As seen above, this slightly stronger notion of stability enables us to eliminate any image dimension degeneracy condition, as required in the Borcea–Brändén characterizations (Theorems 1.1 and 1.2). This demonstrates a cleaner link between an operator and its symbol.

  3.

    Closed circular regions and projectively convex regions can be considered. The symbol we define in this paper handles products of open circular regions, as well as products of closed circular regions. (In [12], Melamud proves a result similar to the Borcea–Brändén characterization for closed circular regions.) Further, we are also able to consider more general projectively convex regions (circular regions with portions of their boundary; also called generalized circular regions, see [18] and [17]) in Proposition 6.5. This allows us to determine stability-preservation information about real intervals and half-lines. It also turns out, somewhat surprisingly, that our symbol can handle products of any sets as possible output space stability regions (as seen in Theorem 6.2 above).

In the process of generalizing the Borcea–Brändén characterization we develop a general algebraic framework which also encompasses many of the classical polynomial tools. This framework aims to motivate classical results and provide intuition for the connection between a stability preserving operator and its symbol.

The main idea

A major purpose of this paper is to explain a certain conceptual thread in the history of polynomial stability theory: that it is often possible to determine general stability information from restricted sets of polynomials. For example, the Pólya-Schur and Borcea–Brändén characterizations derive from a single polynomial (i.e., the symbol) stability properties of a whole collection of polynomials in the output of a given linear operator. Additionally, the Grace–Walsh–Szegő coincidence theorem says that stability information of any polynomial can be determined from its polarization, which is of degree at most one in each variable.

As it turns out, these sorts of phenomena can be explained using relatively basic algebraic and representation theoretic concepts. We view \({\mathbb {C}}^n[x]\) as a representation of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) via the standard action, given as follows. For \(\phi \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) and \(f \in {\mathbb {C}}^n[x]\), we define:

$$\begin{aligned} (\phi \cdot f)(x) := f(\phi ^{-1}x) \end{aligned}$$

Here \(\phi ^{-1}\) acts on \(x \in {\mathbb {C}}\) as a Möbius transformation, or equivalently \(\phi \) acts on the roots of f. (Similarly, \({\mathbb {C}}^\lambda [x_1,\ldots ,x_m]\) can be viewed as a representation of \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\) via this action in each variable.) Under this interpretation, important maps like polarization, projection, the apolarity form, and even the symbol turn out to be invariant under these \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) actions. This leads us to a conceptual thesis: \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant maps transfer stability information.

The goal of this paper is then to explicate and answer the most important question related to this thesis: what does it mean for the symbol map to be \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant and how does it transfer stability information? To answer this, we consider the following standard ideas relating spaces of linear operators to tensor products.

Let \(W_1,W_2\) be two finite dimensional representations of a group G, and let \({{\,\mathrm{Hom}\,}}(W_1,W_2)\) denote the space of linear maps from \(W_1\) to \(W_2\). Then, \({{\,\mathrm{Hom}\,}}(W_1,W_2) \cong W_1^* \boxtimes W_2\) (the outer tensor product) can be viewed as a representation of \(G \times G\). If we further have a nondegenerate G-invariant bilinear form on \(W_1\), then we also have \(W_1 \cong W_1^*\). This leads to the following identification:

$$\begin{aligned} {{\,\mathrm{Hom}\,}}(W_1,W_2) \cong W_1^* \boxtimes W_2 \cong W_1 \boxtimes W_2 \end{aligned}$$

If \(W_1\) and \(W_2\) are spaces of polynomials, each in m variables, then their tensor product \(W_1 \boxtimes W_2\) is isomorphic to a larger space of polynomials in 2m variables. That is, a linear operator between polynomial spaces \(W_1\) and \(W_2\) can be associated to some polynomial in double the variables, via the above identification of representations. This is precisely the idea of the symbol of an operator.

Let’s see how this works in the univariate case. Consider \({\mathbb {C}}^n[x]\) as a representation of the group \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\), as described above. It is then a standard result that the classical bilinear apolarity form is invariant under the action of Möbius transformations. That is, the apolarity form is an \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant bilinear form on \({\mathbb {C}}^n[x]\). This form, applied to \(f,g \in {\mathbb {C}}^n[x]\) with coefficients \(f_k,g_k\), is defined as follows:

$$\begin{aligned} \langle f,g \rangle ^n := \sum _{k=0}^n \left( {\begin{array}{c}n\\ k\end{array}}\right) ^{-1} (-1)^k f_k g_{n-k} \end{aligned}$$

With this, we obtain the identification described above: \({{\,\mathrm{Hom}\,}}({\mathbb {C}}^n[x],{\mathbb {C}}^m[x]) \cong {\mathbb {C}}^n[x] \boxtimes {\mathbb {C}}^m[x] \cong {\mathbb {C}}^{(n,m)}[x,z]\). The final piece of the puzzle is then to find a way to transfer stability information through this identification of representations. The key result to this end is the classical Grace’s theorem:

Theorem

[8] Let \(f,g \in {\mathbb {C}}^n[x]\) be polynomials of degree exactly n. Further, let C be some open or closed circular region such that f is C-stable and g is \(({\mathbb {C}}{\setminus } C)\)-stable. Then, \(\langle f,g \rangle ^n \ne 0\).

That is, the apolarity form not only provides the link between a linear operator and its symbol, but also captures stability information. So, whatever stability claims we can make about polynomials in \({\mathbb {C}}^{(n,m)}[x,z]\) can then be seamlessly transferred to corresponding linear operators in \({{\,\mathrm{Hom}\,}}({\mathbb {C}}^n[x],{\mathbb {C}}^m[x])\). From this we are able to recover the Borcea–Brändén characterization. Additionally, all of the theory here relating stability and the representation theory of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) can be generalized to multivariate polynomials in a straightforward manner. The details will be discussed in Sect. 3.
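To make the apolarity form and Grace's theorem concrete, the following is a minimal Python sketch (our own illustration, not part of the original paper) that evaluates \(\langle f,g \rangle ^n\) from the formula above in exact rational arithmetic. The function names `apolarity` and `evaluate`, and the convention that `f[k]` stores the coefficient of \(x^k\), are assumptions made for this example. It checks one pair of polynomials separated by the unit circle, one classical apolar pair, and the elementary identity \(\langle (x-a)^n, g \rangle ^n = (-1)^n g(a)\), which follows directly from expanding \((x-a)^n\).

```python
from fractions import Fraction
from math import comb


def apolarity(f, g):
    """Apolarity form <f, g>^n for coefficient lists of length n + 1.

    Assumed convention: f[k] is the coefficient of x^k, so a polynomial of
    degree less than n is padded with trailing zeros.
    """
    if len(f) != len(g):
        raise ValueError("f and g must lie in the same space C^n[x]")
    n = len(f) - 1
    return sum(
        Fraction((-1) ** k) * Fraction(f[k]) * Fraction(g[n - k]) / comb(n, k)
        for k in range(n + 1)
    )


def evaluate(g, a):
    """Evaluate the polynomial with coefficient list g at the point a."""
    return sum(Fraction(c) * Fraction(a) ** k for k, c in enumerate(g))


# Take C to be the open unit disc.
f = [6, -5, 1]               # (x - 2)(x - 3): no zeros in C
g = [Fraction(-1, 4), 0, 1]  # (x - 1/2)(x + 1/2): no zeros outside C
grace_value = apolarity(f, g)  # nonzero, as Grace's theorem predicts

# x^2 - 1 and x^2 + 1 pair to zero, so by Grace's theorem no circular
# region C can make the first C-stable and the second stable on C's complement.
apolar_value = apolarity([-1, 0, 1], [1, 0, 1])

# Pairing against (x - a)^n recovers (-1)^n g(a).
a, n = 3, 2
power = [comb(n, k) * (-a) ** (n - k) for k in range(n + 1)]  # (x - 3)^2
evaluation_value = apolarity(power, g)
```

Here `grace_value` equals 23/4, `apolar_value` equals 0, and `evaluation_value` agrees with `evaluate(g, 3)`. The last identity hints at why the form can detect zeros: pairing against a power of a linear form is evaluation up to sign.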

In a similar fashion, other important maps also have \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariance properties (e.g., polarization and projection, as used in the Grace–Walsh–Szegő coincidence theorem, explicitly give the isomorphisms of a classical representation theoretic result; see Appendix B). A main feature of our conceptual thesis is that it allows for a unification of many seemingly related results in polynomial stability theory. A crucial point to make then is that Grace’s theorem is at the heart of this unification. Accordingly, a significant portion of this paper is devoted to discussing it.

A generalized Grace’s theorem and interval- and ray-rootedness

In [2], Borcea and Brändén are able to prove a multivariate Grace’s theorem using their operator characterization. In this paper we will prove the multivariate version from scratch, and then use it to derive a new characterization of stability-preserving linear operators. In addition, we generalize it to projectively convex regions, which consist of an open circular region with a portion of its boundary (see Sect. 4.2). We state our new result as follows. Note that this result can be seen as an extension of the generalized Grace’s theorem given in Corollary 4.4 of [17].

Theorem 5.1

Fix \(\lambda \in {\mathbb {N}}_0^m\) and \(f,g \in {\mathbb {C}}^\lambda [x_1,\ldots ,x_m]\) such that f and g both have a nonzero term of degree \(\lambda \). Also, denote \(C := {\mathcal {H}}_+\cup \overline{{\mathbb {R}}_+}\) and \({\widetilde{C}} := {\mathcal {H}}_-\cup \overline{{\mathbb {R}}_-}\). If f is \(C^m\)-stable and g is \({\widetilde{C}}^m\)-stable, then \(\langle f,g \rangle ^\lambda \ne 0\).

This result can, for instance, give stability information about positive- and negative-rooted polynomials. Since the apolarity form is invariant under the action of Möbius transformations, we immediately obtain similar statements regarding the union of any open circular region and a portion of its boundary. Notice also that, unlike the classical Grace’s theorem, the stability regions C and \({\widetilde{C}}\) have non-empty intersection.

In the vein of this extension, we provide a new characterization of a certain class of linear operators which preserve ray- and interval-rootedness. The problem of classifying all such operators is still open in general (see e.g., the end of [3]). Here, we solve this problem for a restricted class of operators: namely, operators which both preserve weak real-rootedness and also preserve ray- or interval-rootedness. Our main result in this direction is stated as follows, where a polynomial is called J-rooted when all of its roots are in J:

Theorem 7.8

Fix a linear operator \(T: {\mathbb {R}}^n[x] \rightarrow {\mathbb {R}}^m[x]\) which has image of dimension greater than two. Further, let \(I,J \subseteq {\mathbb {R}}\) be intervals or rays. Up to certain degree restrictions, T preserves weak real-rootedness and maps I-rooted polynomials to nonzero J-rooted polynomials if and only if \({{\,\mathrm{Symb}\,}}(T)\) is either \(({\mathcal {H}}_-\cup I) \times ({\overline{{\mathcal {H}}_+}} {\setminus } J)\)-stable or \(({\mathcal {H}}_-\cup I) \times ({\overline{{\mathcal {H}}_-}} {\setminus } J)\)-stable.

In Sect. 7.4, this result is stated in a more restricted manner as the degeneracy condition (image dimension) and degree restrictions end up being more tedious than in the other results. Corollary 7.9 and further explication then give the result as stated here.

As a final note, all of the results given here in the introduction are stated slightly differently in Sects. 5, 6, 7. In particular, the notation \(V(\lambda )\) is used in place of \({\mathbb {C}}^\lambda [x_1,\ldots ,x_m]\), and reference is made to \(\mathbb {CP}^1\) (i.e., the Riemann sphere). This notation has to do with consideration of “roots at infinity”, which allows us to remove degree restrictions and avoid reference to convex circular regions. We discuss this rigorously in Sect. 2.2.

A roadmap

We now describe the content of the remainder of this paper. In Sect. 2, we discuss the use of homogeneous polynomials via the notation V(n) and \(V(\lambda )\), and we describe the relation of these spaces to the notion of roots at infinity.

In Sect. 3, we explicate some very basic representation theory of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\). We then demonstrate how the apolarity form and the symbol arise as natural constructs in this context. Results like the symbol lemma (Lemma 3.7) and the \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariance of the apolarity form are stated here.

In Sect. 4, we discuss some classical and some new polynomial stability theory results, and their multivariate analogues, in the homogeneous context. We also extend Laguerre’s theorem to projectively convex regions (generalized circular regions), which later allows us to prove results regarding polynomials which have all their roots in a given interval.

In Sect. 5, we state and prove our generalized Grace’s theorem. We also discuss other stability regions to which the theorem applies, and consider symbols of linear operators given by evaluation at a particular point. We call these polynomials evaluation symbols, as they turn out to play a crucial role in the proofs of the operator characterizations.

In Sect. 6, we finally state and prove our improved characterizations of stability-preserving operators. We also demonstrate how the Borcea–Brändén characterizations can be seen (with a bit of work) to be corollaries of our characterizations. We then provide examples of the use of our results. In particular, we show how stability results related to classical polynomial convolutions can be immediately recovered.

In Sect. 7, we state and prove the analogous characterization of strong real stability-preserving operators. As with complex operators, we show how the Borcea–Brändén characterization can be obtained as a corollary. In this section, we also state and prove our characterization of operators which preserve both weak real-rootedness as well as interval- (or ray-) rootedness.

In Appendix A, we explicate more of the representation theory of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) in a polynomial-minded way. Specifically, we prove a few standard tensor product decomposition results which more clearly demonstrate how this theory connects to the notion of apolarity.

In Appendix B, we discuss how polarization and the Grace–Walsh–Szegő coincidence theorem fit into the framework presented in this paper. We also demonstrate that the classical isomorphism \(V(n) \cong {\text {Sym}}^n(V(1))\) (for representations of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)) can be realized as the polarization map and therefore has a stability-theoretic interpretation. While this discussion is important to the conceptual thesis stated above, we place it in an appendix as it is not utilized elsewhere.

Preliminaries

Here, we discuss basic notation and results related to polynomials and stability. In particular, we discuss in more detail the notation and consequences related to the use of homogeneous polynomials in place of usual univariate and multivariate polynomials. Then, we state a number of basic stability results in the language of homogeneous polynomials.

Notation

Let \([n] := \{1,2,\ldots ,n\}\) and \((1^m) := (1,\ldots ,1) \in {\mathbb {N}}_0^m\). For \(\lambda = (\lambda _1,\ldots ,\lambda _m) \in {\mathbb {N}}_0^m\), we define:

$$\begin{aligned} {\mathbb {C}}^\lambda [x_1,\ldots ,x_m] := \left\{ f \in {\mathbb {C}}[x_1,\ldots ,x_m] : \deg _{x_k}(f) \le \lambda _k\right\} \end{aligned}$$

That is, elements of \({\mathbb {C}}^\lambda [x_1,\ldots ,x_m]\) are of degree at most \(\lambda _k\) in the variable \(x_k\). In particular, we call polynomials in \({\mathbb {C}}^{(1^m)}[x_1,\ldots ,x_m]\) multi-affine. We will also use the shorthand \({\mathbb {C}}^n[x]\) to refer to univariate polynomials of degree at most n.

Now we define similar spaces of polynomials which are homogeneous in pairs of variables. These polynomials should be seen as per-variable homogenizations of polynomials of the spaces defined above. For \(\lambda = (\lambda _1,\ldots ,\lambda _m) \in {\mathbb {N}}_0^m\) and \({\mathbb {K}}= {\mathbb {C}}\) or \({\mathbb {K}}= {\mathbb {R}}\), we define:

$$\begin{aligned} V_{\mathbb {K}}(\lambda )&= V_{\mathbb {K}}(\lambda _1,\ldots ,\lambda _m) \\ {}&:= \left\{ p \in {\mathbb {K}}[x_1,y_1,\ldots ,x_m,y_m] : p \text { is homogeneous of degree } \lambda _k \text{ in } x_k, y_k \text { for each } k \in [m] \right\} \end{aligned}$$

We also use the shorthand \(V(\lambda ) = V_{\mathbb {C}}(\lambda )\). As above, we call polynomials in \(V(1^m)\) multi-affine. The notation used here is generalized from what is typically used to denote the irreducible representations of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\). As it turns out, spaces of homogeneous polynomials in two variables can be used to define these representations. This will be made more precise in Sect. 3.

We let \(\mathbb {CP}^1\) denote the projective space of lines in \({\mathbb {C}}^2\), and we will also identify this space with the Riemann sphere. Note that \(\mathbb {CP}^1\) can also be considered as a compact version of \({\mathbb {C}}\) with one extra point added at infinity. We will often identify \(\mathbb {CP}^1\) with \({\mathbb {C}}\) (up to this extra point) via stereographic projection. A circular region is then an open or closed disc, half-plane, or complement of a disc in \({\mathbb {C}}\). The set of circular regions is transitive under the action of Möbius transformations. We also use the name circular regions to refer to the stereographic projections into \(\mathbb {CP}^1\). Closed half-planes and complements of discs projected into \(\mathbb {CP}^1\) will contain the point at infinity. Throughout, we will let \(\partial S\) denote the boundary of (the closure of) the set S in \(\mathbb {CP}^1\) and let \(S^\circ \) denote the interior of S.

There is also a natural ordering structure on \({\mathbb {N}}_0^m\), along with a few basic operations that will be used throughout. Fix \(\lambda = (\lambda _1,\ldots ,\lambda _m)\) and \(\alpha = (\alpha _1,\ldots ,\alpha _m)\) in \({\mathbb {N}}_0^m\), and fix \(\beta = (\beta _1,\ldots ,\beta _l)\) in \({\mathbb {N}}_0^l\). We say \(\alpha \le \lambda \) whenever \(\alpha _k \le \lambda _k\) for all \(k \in [m]\). We define \(\lambda + \alpha := (\lambda _1+\alpha _1,\ldots ,\lambda _m+\alpha _m)\), \(|\lambda | := \lambda _1 + \cdots + \lambda _m\), and \(\lambda \sqcup \beta := (\lambda _1,\ldots ,\lambda _m,\beta _1,\ldots ,\beta _l) \in {\mathbb {N}}_0^{m+l}\).

We also make use of a number of shorthands. Fix \(\mu ,\lambda \in {\mathbb {N}}_0^m\) such that \(\mu \le \lambda \). We define \(x^\mu \in {\mathbb {C}}^\lambda [x_1,\ldots ,x_m]\) via \(x^\mu := x_1^{\mu _1}x_2^{\mu _2} \cdots x_m^{\mu _m}\). Similarly, we define \(\partial _x^\mu := \partial _{x_1}^{\mu _1} \partial _{x_2}^{\mu _2} \cdots \partial _{x_m}^{\mu _m}\), where \(\partial _{x_k} := \frac{\partial }{\partial x_k}\). When considering \(V(\lambda )\), we define \(y^\mu \) and \(\partial _y^\mu \) in the same way. Further, we define \(\lambda ! := \lambda _1! \cdots \lambda _m!\) and \(\left( {\begin{array}{c}\lambda \\ \mu \end{array}}\right) := \frac{\lambda !}{\mu !(\lambda -\mu )!} = \left( {\begin{array}{c}\lambda _1\\ \mu _1\end{array}}\right) \cdots \left( {\begin{array}{c}\lambda _m\\ \mu _m\end{array}}\right) \). Finally, we denote \((-1)^\mu := (-1)^{\mu _1} \cdots (-1)^{\mu _m} = (-1)^{|\mu |}\).
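As a quick illustration of these shorthands (a worked example of our own, with the values \(\lambda = (2,3)\) and \(\mu = (1,2)\) chosen arbitrarily), we have \(\mu \le \lambda \) and:

$$\begin{aligned} x^\mu&= x_1 x_2^2, \qquad \partial _x^\mu = \partial _{x_1} \partial _{x_2}^2, \qquad |\mu | = 3, \qquad (-1)^\mu = -1, \\ \lambda !&= 2! \cdot 3! = 12, \qquad \binom{\lambda }{\mu } = \frac{12}{(1! \cdot 2!)(1! \cdot 1!)} = \binom{2}{1} \binom{3}{2} = 6. \end{aligned}$$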

Homogeneous polynomials

The usual degree-n homogenization of a polynomial \(f \in {\mathbb {C}}^n[x]\) is defined on monomials as follows and is extended linearly.

$$\begin{aligned} \begin{aligned} {{\,\mathrm{Hmg}\,}}_n : {\mathbb {C}}^n[x]&\rightarrow V(n) \\ x^k&\mapsto x^ky^{n-k} \end{aligned} \end{aligned}$$

More generally, for \(\lambda \in {\mathbb {N}}_0^m\) the degree-\(\lambda \) homogenization is defined on monomials as follows and is extended linearly.

$$\begin{aligned} \begin{aligned} {{\,\mathrm{Hmg}\,}}_\lambda : {\mathbb {C}}^\lambda [x_1,\ldots ,x_m]&\rightarrow V(\lambda )\\ x^{\mu }&\mapsto x^{\mu }y^{\lambda -\mu } \end{aligned} \end{aligned}$$

\({\mathbb {C}}^n[x]\) and V(n) are isomorphic as vector spaces via \({{\,\mathrm{Hmg}\,}}_n\), and we will mainly utilize bivariate homogeneous polynomials in V(n) over the usual univariate polynomials in \({\mathbb {C}}^n[x]\). What homogeneity gets us is a simplification of a number of issues related to the fact that polynomials in \({\mathbb {C}}^n[x]\) have at most n zeros. Specifically, it is more natural to think of the missing zeros (when the number of zeros is less than n) as being “at infinity”. Certain results require premises restricting to convex regions or to polynomials of degree exactly n (e.g., the classical Grace–Walsh–Szegő coincidence theorem), and such details vanish when considering homogeneous polynomials with possible roots at infinity. Another way to say this is that we consider polynomials in V(n) to have exactly n roots in \(\mathbb {CP}^1\), which can also be thought of as the Riemann sphere. We also consider polynomials \(p(x,y) = p(x_1,y_1,\ldots ,x_m,y_m) \in V(\lambda )\) to have zeros in \((\mathbb {CP}^1)^m\), where each pair \((x_k,y_k)\) corresponds to a single factor of \(\mathbb {CP}^1\) in \((\mathbb {CP}^1)^m\).
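A small worked example of our own may help to fix the idea of roots at infinity. Take \(f(x) = x + 1\), regarded as an element of \({\mathbb {C}}^2[x]\), so that f has one fewer zero than the ambient degree. Applying \({{\,\mathrm{Hmg}\,}}_2\) to each monomial gives:

$$\begin{aligned} {{\,\mathrm{Hmg}\,}}_2(f)(x,y) = x y + y^2 = y (x + y) \end{aligned}$$

The factor \(x + y\) vanishes on the line through \((-1,1)\), which corresponds to the finite zero \(-1\) of f. The factor y vanishes on the line through \((1,0)\), which is the point at infinity of \(\mathbb {CP}^1\). So \({{\,\mathrm{Hmg}\,}}_2(f)\) has exactly two roots in \(\mathbb {CP}^1\), with the "missing" root of f placed at infinity.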

We use the notation \((a:b) \in \mathbb {CP}^1\), which is meant to carry the connotation of a ratio; that is, (a : b) should feel like a/b. We also use the notation \((a:b) = \big ((a_1:b_1), \ldots , (a_m:b_m)\big ) \in (\mathbb {CP}^1)^m\). Note that this connotation aligns with the idea of considering the zeros of polynomials to be in \((\mathbb {CP}^1)^m\), given the following equality. Defining \(p := {{\,\mathrm{Hmg}\,}}_\lambda (f)\) for a given polynomial \(f \in {\mathbb {C}}^\lambda [x_1,\ldots ,x_m]\), we have:

$$\begin{aligned} p(x,y) = p(x_1,y_1,\ldots ,x_m,y_m) = \prod _k y_k^{\lambda _k} \cdot f(x_1/y_1,\ldots ,x_m/y_m) = y^\lambda \cdot f(x/y) \end{aligned}$$
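The identity above can be checked numerically. The following Python sketch is our own illustration and assumes a hypothetical sparse representation in which a polynomial is a dictionary from exponent tuples \(\mu \) to coefficients; the helper names `homogenize`, `eval_f`, and `eval_p` are not from the paper. It applies \({{\,\mathrm{Hmg}\,}}_\lambda \) monomial by monomial and compares \(p(x,y)\) with \(y^\lambda \cdot f(x/y)\) at one rational point.

```python
from fractions import Fraction


def homogenize(f, lam):
    """Hmg_lambda: send x^mu to x^mu * y^(lambda - mu), extended linearly.

    f maps exponent tuples mu to coefficients; the result maps pairs
    (mu, lambda - mu) of exponent tuples to the same coefficients.
    """
    p = {}
    for mu, coeff in f.items():
        if len(mu) != len(lam) or any(m > l for m, l in zip(mu, lam)):
            raise ValueError("f must have degree at most lambda_k in x_k")
        nu = tuple(l - m for m, l in zip(mu, lam))
        p[(mu, nu)] = coeff
    return p


def eval_f(f, x):
    """Evaluate f at the point x = (x_1, ..., x_m)."""
    total = Fraction(0)
    for mu, coeff in f.items():
        term = Fraction(coeff)
        for xk, mk in zip(x, mu):
            term *= Fraction(xk) ** mk
        total += term
    return total


def eval_p(p, x, y):
    """Evaluate a homogenized polynomial at (x_1, y_1, ..., x_m, y_m)."""
    total = Fraction(0)
    for (mu, nu), coeff in p.items():
        term = Fraction(coeff)
        for xk, yk, mk, nk in zip(x, y, mu, nu):
            term *= Fraction(xk) ** mk * Fraction(yk) ** nk
        total += term
    return total


lam = (2, 1)
f = {(0, 0): 1, (1, 1): -3, (2, 0): 5}  # 1 - 3*x1*x2 + 5*x1^2
p = homogenize(f, lam)                  # y1^2*y2 - 3*x1*x2*y1 + 5*x1^2*y2

x, y = (2, 7), (3, 5)
lhs = eval_p(p, x, y)
ratio = tuple(Fraction(a, b) for a, b in zip(x, y))
rhs = Fraction(y[0]) ** lam[0] * Fraction(y[1]) ** lam[1] * eval_f(f, ratio)
```

Both `lhs` and `rhs` equal 19 at this point. The check only makes sense where every \(y_k\) is nonzero; points with some \(y_k = 0\) are exactly where roots at infinity live.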

Finally, we give an important definition which will essentially replace the notion of a monic polynomial for homogeneous polynomials.

Definition 2.1

Given \(p \in V(\lambda )\), we say that p is top-degree monic if the coefficient of \(x^\lambda \) in p equals 1. In particular, if \(p \in V(n)\) is top-degree monic, then p has no roots at infinity.

Homogeneous polynomials as representations

In this section, we will discuss some basic representation theory of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) and show how the apolarity form and the notion of the symbol of an operator arise naturally in the representation theoretic context. Most of the representation theory we use in this section is very basic. There are a number of references which discuss the theory in full detail, albeit with different goals in mind. Typically this is done via the theory of Lie groups and algebras, as in [6] and in [10].

As a note, most of the content of this section is less relevant to the analytic questions associated to polynomials. Rather, it serves as the foundational structure for a new approach to Grace’s theorem and results concerning stability-preserving operators. For this reason, we believe it worthwhile to explicate key aspects of this foundation and their connection to analytic results. Pushing further into this connection may lead to new results beyond the scope of this paper.

The action of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)

Given \((\alpha :\beta ) \in \mathbb {CP}^1\) and \(\phi \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})\), we define the usual action of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) on \(\mathbb {CP}^1\) via \(\phi \cdot (\alpha :\beta ) := \phi \left( {\begin{array}{c}\alpha \\ \beta \end{array}}\right) \). That is, \(\phi \) acts by matrix multiplication on the vector \(\left( {\begin{array}{c}\alpha \\ \beta \end{array}}\right) \). Equivalently, \(\phi \) acts as its corresponding Möbius transformation on \(\mathbb {CP}^1\). Note that, as with Möbius transformations, \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) acts transitively on circular regions in \(\mathbb {CP}^1\).

This then induces an action on V(n) by acting on the roots (in \(\mathbb {CP}^1\)) of polynomials in V(n). Given \(p \in V(n)\) and \(\phi \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})\), this action is defined via:

$$\begin{aligned} (\phi \cdot p)(x,y) := p\left( \phi ^{-1} \left( {\begin{array}{c}x\\ y\end{array}}\right) \right) \equiv (p \circ \phi ^{-1})(x,y) \end{aligned}$$

We can define a similar action of \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\) on \(V(\lambda )\), for \(\lambda \in {\mathbb {N}}_0^m\). Specifically, given \(p \in V(\lambda )\) and \((\phi _1,\ldots ,\phi _m) \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})^m\), this action is defined via:

$$\begin{aligned} \big ((\phi _1,\ldots ,\phi _m) \cdot p\big )(x_1,y_1,\ldots ,x_m,y_m) := p\left( \phi _1^{-1} \left( {\begin{array}{c}x_1\\ y_1\end{array}}\right) , \ldots , \phi _m^{-1}\left( {\begin{array}{c}x_m\\ y_m\end{array}}\right) \right) \end{aligned}$$
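The univariate action on V(n) can be sketched in a few lines of Python. This is our own illustration under stated assumptions: a form \(p \in V(n)\) is stored as a list whose entry `p[k]` is the coefficient of \(x^k y^{n-k}\), a matrix \(\phi \) is a pair of rows `((a, b), (c, d))` with integer entries and determinant 1, and the helper names `convolve`, `power`, `act`, and `matmul` are hypothetical. Since \(\phi ^{-1}\) sends (x, y) to \((dx - by, -cx + ay)\), the code substitutes these linear forms into p.

```python
def convolve(u, v):
    """Multiply two binary forms given as coefficient lists in powers of x."""
    out = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] += a * b
    return out


def power(form, k):
    """k-th power of a binary form (the constant form 1 when k = 0)."""
    result = [1]
    for _ in range(k):
        result = convolve(result, form)
    return result


def act(phi, p):
    """Compute phi . p = p o phi^{-1} on coefficient lists."""
    (a, b), (c, d) = phi
    if a * d - b * c != 1:
        raise ValueError("phi must have determinant 1")
    n = len(p) - 1
    new_x = [-b, d]  # the linear form d*x - b*y
    new_y = [a, -c]  # the linear form -c*x + a*y
    out = [0] * (n + 1)
    for k, coeff in enumerate(p):
        term = convolve(power(new_x, k), power(new_y, n - k))
        for i, t in enumerate(term):
            out[i] += coeff * t
    return out


def matmul(phi, psi):
    """Product of two 2x2 matrices given as pairs of rows."""
    (a, b), (c, d) = phi
    (e, f), (g, h) = psi
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


shift = ((1, 1), (0, 1))          # Mobius map z -> z + 1
moved = act(shift, [6, -5, 1])    # (x - 2y)(x - 3y) becomes (x - 3y)(x - 4y)

psi = ((1, 0), (1, 1))
p = [6, -5, 1]
composed = act(shift, act(psi, p))
direct = act(matmul(shift, psi), p)
```

In this example `moved` is `[12, -7, 1]`, so the roots 2 and 3 are carried to 3 and 4, matching the statement that \(\phi \) acts on roots. The lists `composed` and `direct` coincide, which reflects that using \(\phi ^{-1}\) in the definition yields a left action.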

These actions turn V(n) and \(V(\lambda )\) into representations of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) and \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\), respectively. These are precisely the finite dimensional irreducible representations of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) and \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\) (see Lecture 11 of [6]), and so they are the basic building blocks of the \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) representation theory. Actions on V(n) and \(V(\lambda )\) can be extended to tensor products in the usual way, and in this paper we will make use of both inner and outer tensor products. We now briefly discuss tensor product actions for those less familiar.

The outer tensor product of \(V(\lambda _k)\), denoted \(V(\lambda _1) \boxtimes \cdots \boxtimes V(\lambda _m)\), is a representation of \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\) with action by \((\phi _1,\ldots ,\phi _m)\) on simple tensors given as follows:

$$\begin{aligned} (\phi _1, \cdots , \phi _m) \cdot (p_1 \boxtimes \cdots \boxtimes p_m) := (\phi _1 \cdot p_1) \boxtimes (\phi _2 \cdot p_2) \boxtimes \cdots \boxtimes (\phi _m \cdot p_m) \end{aligned}$$

This implies that \(V(\lambda )\) and \(V(\lambda _1) \boxtimes \cdots \boxtimes V(\lambda _m)\) are isomorphic as representations, and this fact will be used when we define the symbol in Sect. 3.3.

The inner tensor product of \(V(\lambda _k)\), denoted \(V(\lambda _1) \otimes \cdots \otimes V(\lambda _m)\), is a representation of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) with action by \(\phi \) on simple tensors given as follows:

$$\begin{aligned} \phi \cdot (p_1 \otimes \cdots \otimes p_m) := (\phi \cdot p_1) \otimes (\phi \cdot p_2) \otimes \cdots \otimes (\phi \cdot p_m) \end{aligned}$$

While \(V(\lambda )\) and \(V(\lambda _1) \otimes \cdots \otimes V(\lambda _m)\) are isomorphic as vector spaces, they are representations of different groups (\(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\) and \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) respectively). The inner tensor product relates to invariants of multiple polynomials with respect to a single \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) action. For instance, the apolarity form takes two distinct polynomials as input, and it is a classical result that this form is invariant with respect to a single action by Möbius transformation. As it turns out, this form can be viewed as an \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant map on an inner tensor product of polynomial spaces. It will therefore be important for us to understand these inner tensor products in a little more detail.

An important invariant map, and apolarity

To aid in our investigation of inner tensor products of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) representations, we now define an important \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant linear map, denoted by D. This map has a long history in invariant theory, and we touch on this below.

Proposition 3.1

The linear map \(D := (\partial _x \otimes \partial _y - \partial _y \otimes \partial _x): V(n+1) \otimes V(m+1) \rightarrow V(n) \otimes V(m)\) is \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant.

Proof

It suffices to check this on simple tensors. Fix \(\phi = \left[ {\begin{matrix} a &{} b \\ c &{} d \end{matrix}}\right] \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})\), \(p \in V(n+1)\), and \(q \in V(m+1)\). We compute:

$$\begin{aligned} \begin{aligned} (\phi ^{-1} \circ D \circ \phi ) (p \otimes q)&= (\phi ^{-1} \circ D) (p(dx-by, -cx+ay) \otimes q(dx-by, -cx+ay)) \\&= (d\partial _x-c\partial _y)p \otimes (-b\partial _x+a\partial _y)q - (-b\partial _x+a\partial _y)p \otimes (d\partial _x-c\partial _y)q \\&= (ad-bc)(\partial _x p \otimes \partial _y q - \partial _y p \otimes \partial _x q) \\&= D (p \otimes q) \end{aligned} \end{aligned}$$

That is, \(D \circ \phi = \phi \circ D\). \(\square \)

Proposition 3.2

The multiplication map \(V(n) \otimes V(m) \xrightarrow {\times } V(n+m)\) is \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant.

Proof

Trivial. \(\square \)

Powers of the D map actually appear in the literature under a few different names. The first comes from invariant theory, where the application of the map

$$\begin{aligned} V(n) \otimes V(m) \xrightarrow {D^r} V(n-r) \otimes V(m-r) \xrightarrow {\times } V(n+m-2r) \end{aligned}$$

to polynomials \(p \in V(n)\) and \(q \in V(m)\) is called the \(r\text {th}\) transvectant of p and q. This map is also the result of the \(r\text {th}\) iteration of Cayley’s \(\Omega \) process. These notions are discussed, for example, in chapters 4 and 5 of [13], where they are used to explicitly compute invariants and covariants of forms. In particular, the invariance of the Jacobian (\(1\text {st}\) transvectant map applied to \(p \otimes q\)) and the Hessian (\(2\text {nd}\) transvectant map applied to \(p \otimes p\)) can be determined in this way.
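As a small worked illustration (the polynomials are chosen here only for concreteness), take \(p = x^2\) and \(q = y^2\) in V(2). The first transvectant is:

$$\begin{aligned} D(p \otimes q) = \partial _x p \otimes \partial _y q - \partial _y p \otimes \partial _x q = 2x \otimes 2y \xrightarrow {\times } 4xy \end{aligned}$$

This is the Jacobian determinant \(\partial _x p \cdot \partial _y q - \partial _y p \cdot \partial _x q\) of the pair (p, q). Similarly, expanding \(D^2 = \partial _x^2 \otimes \partial _y^2 - 2\partial _x\partial _y \otimes \partial _x\partial _y + \partial _y^2 \otimes \partial _x^2\) and applying it to \(p \otimes p\) for any \(p \in V(n)\) gives, after multiplication:

$$\begin{aligned} 2\left( \partial _x^2 p \cdot \partial _y^2 p - (\partial _x \partial _y p)^2\right) \end{aligned}$$

This is twice the Hessian determinant of p.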

Additionally, the \(n\text {th}\) transvectant of \(p,q \in V(n)\) is used to define a notion of apolarity (see, e.g., [5] and [4]), and this notion corresponds to the classical one used in Grace’s theorem. In fact, one of the original formulations of Grace’s theorem can be found in Grace and Young’s 1903 book, The Algebra of Invariants [9]. This suggests a connection between invariant theory and the analytic consequences of apolarity theory via the D map, and we will indeed see this map play a crucial role in the proof of Grace’s theorem (Theorem 5.1).

We are now ready to define the homogeneous apolarity form via the D map. This form and its \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariance are then the next main step toward the definition of the symbol of an operator. In the next section, we will use this bilinear form to define an important construction called the dual of a representation. This will serve as the link to viewing spaces of linear operators as representations themselves.

Definition 3.3

We call the \(n\text {th}\) transvectant

$$\begin{aligned} V(n) \otimes V(n) \xrightarrow {D^n} V(0) \otimes V(0) \cong {\mathbb {C}}\end{aligned}$$

the apolarity form of V(n). This is the unique (up to scalar) nondegenerate \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant bilinear form on V(n), and therefore it is the homogenization of the classical apolarity form.
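The invariance of this form can be checked numerically on small examples. Writing \(p = \sum _k p_k x^k y^{n-k}\) and \(q = \sum _k q_k x^k y^{n-k}\), expanding \(D^n\) by the binomial theorem gives \(D^n(p \otimes q) = n! \sum _{j=0}^n (-1)^j j!(n-j)!\, p_{n-j} q_j\). The following is a minimal sketch under that coefficient convention. The helper names `poly_mul`, `poly_pow`, `act`, and `apolarity`, as well as the sample matrix and polynomials, are hypothetical choices made for illustration.

```python
from math import factorial


def poly_mul(f, g):
    """Multiply binary forms; index i of a list holds the coefficient of x**i * y**(deg - i)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out


def poly_pow(f, e):
    """Raise a binary form to a nonnegative integer power by repeated multiplication."""
    out = [1]
    for _ in range(e):
        out = poly_mul(out, f)
    return out


def act(phi, p):
    """Return phi . p = p(phi^{-1}(x, y)) for phi = ((a, b), (c, d)) with ad - bc = 1."""
    (a, b), (c, d) = phi
    n = len(p) - 1
    new_x = [-b, d]  # the linear form d*x - b*y
    new_y = [a, -c]  # the linear form -c*x + a*y
    out = [0] * (n + 1)
    for k, pk in enumerate(p):
        term = poly_mul(poly_pow(new_x, k), poly_pow(new_y, n - k))
        for i, t in enumerate(term):
            out[i] += pk * t
    return out


def apolarity(p, q):
    """Evaluate D^n(p (x) q) using the coefficient formula stated in the lead-in."""
    n = len(p) - 1
    total = sum(
        (-1) ** j * factorial(j) * factorial(n - j) * p[n - j] * q[j]
        for j in range(n + 1)
    )
    return factorial(n) * total


phi = ((2, 1), (1, 1))  # determinant 2*1 - 1*1 = 1
p = [1, 2, 3]           # y^2 + 2xy + 3x^2
q = [4, 0, -1]          # 4y^2 - x^2
before = apolarity(p, q)
after = apolarity(act(phi, p), act(phi, q))
assert before == after
```

Integer arithmetic keeps the comparison exact, so no floating-point tolerance is needed.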

We now want to extend this definition to act on \(V(\lambda ) \otimes V(\lambda )\) for \(\lambda \in {\mathbb {N}}_0^m\). Note that for \(p \in V(\lambda )\), we have m pairs of variables given by \(p(x_1,y_1,\ldots ,x_m,y_m)\), which allows us to naturally define:

$$\begin{aligned} D^\lambda := \prod _{i=1}^m (\partial _{x_i} \otimes \partial _{y_i} - \partial _{y_i} \otimes \partial _{x_i})^{\lambda _i} \end{aligned}$$

With this, we can define the apolarity form for \(V(\lambda )\) as follows.

Definition 3.4

We call the map

$$\begin{aligned} V(\lambda ) \otimes V(\lambda ) \xrightarrow {D^\lambda } V(0^m) \otimes V(0^m) \cong {\mathbb {C}}\end{aligned}$$

the apolarity form of \(V(\lambda )\). This is the unique (up to scalar) nondegenerate \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\)-invariant bilinear form on \(V(\lambda )\), and therefore it is the homogenization of the multivariate apolarity form defined by Borcea and Brändén in [2].

Since \(V(0) \otimes V(0) \cong {\mathbb {C}}\) and \(V(0^m) \otimes V(0^m) \cong {\mathbb {C}}\), we will often consider the maps \(D^n\) and \(D^\lambda \) to output an element of \({\mathbb {C}}\). And as a final note, we do not justify here the claims of uniqueness and nondegeneracy stated above. Proving these claims involves decomposing \(V(n) \otimes V(n)\) and \(V(\lambda ) \otimes V(\lambda )\) into their irreducible components, and we leave this work to Appendix A for the interested reader (see Corollaries A.9 and A.12 specifically).

The symbol of an operator

Given representations \(V(\lambda )\) and \(V(\alpha )\) (for \(\lambda \in {\mathbb {N}}_0^m\) and \(\alpha \in {\mathbb {N}}_0^l\)) of \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\) and \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^l\) respectively, the space of linear maps between \(V(\lambda )\) and \(V(\alpha )\) can be viewed as a representation of \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^{m+l}\) in a standard way. This space of linear maps is denoted \({{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\). As discussed previously, we will now use the apolarity form defined above to construct a representation isomorphism between \({{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\) and \(V(\lambda \sqcup \alpha )\) (which is a space of polynomials in \(m+l\) variables). This will lead us to a natural definition for the symbol of an operator.

The significance of this isomorphism will come from the fact that stability results about \(V(\lambda \sqcup \alpha )\) will transfer to \({{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\) via the symbol lemma (Lemma 3.7) stated below. We will see in Sect. 6.2 that this lemma and Grace’s theorem almost immediately imply a characterization of stability-preserving operators which is similar to that of Borcea and Brändén.

To this end, consider the standard representation isomorphism \({{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha )) \cong V(\lambda )^* \boxtimes V(\alpha )\), given by \(T \mapsto \sum _{\mu \le \lambda } (x^\mu y^{\lambda -\mu })^* \boxtimes T(x^\mu y^{\lambda -\mu })\), where \(V(\lambda )^*\) is the dual representation of \(V(\lambda )\). We omit here the details regarding explicit definitions of the action of (products of) \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) on \({{\,\mathrm{Hom}\,}}\) and dual representations. Instead, we utilize the fact that the apolarity form provides an \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\)-invariant isomorphism between \(V(\lambda )\) and the dual representation \(V(\lambda )^*\), as stated in the following result.

Proposition 3.5

For any \(\lambda \in {\mathbb {N}}_0^m\), there is an \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\)-invariant isomorphism \(V(\lambda )^* \rightarrow V(\lambda )\) given by \((x^\mu y^{\lambda -\mu })^* \mapsto \left( {\begin{array}{c}\lambda \\ \mu \end{array}}\right) (-1)^{\mu } x^{\lambda -\mu } y^\mu \).

Proof

We use the apolarity form to determine the isomorphism. In particular, up to scalar \((x^\mu y^{\lambda -\mu })^*\) maps to an element \(p \in V(\lambda )\) such that \((x^\mu y^{\lambda -\mu })^* = D^\lambda (p \otimes \cdot )\). We compute:

$$\begin{aligned} D^\lambda (p \otimes x^\alpha y^{\lambda -\alpha }) = \alpha !(\lambda -\alpha )! \left( {\begin{array}{c}\lambda \\ \alpha \end{array}}\right) (-1)^{\alpha } \partial _x^{\lambda -\alpha } \partial _y^\alpha p = \lambda ! (-1)^{\alpha } \partial _x^{\lambda -\alpha } \partial _y^\alpha p \end{aligned}$$

Picking \(p(x,y) := (\lambda !)^{-2}\left( {\begin{array}{c}\lambda \\ \mu \end{array}}\right) (-1)^{\mu } x^{\lambda -\mu } y^\mu \) achieves the desired equality exactly, and therefore \((x^\mu y^{\lambda -\mu })^* \mapsto \left( {\begin{array}{c}\lambda \\ \mu \end{array}}\right) (-1)^{\mu } x^{\lambda -\mu } y^\mu \) is an \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\)-invariant isomorphism. \(\square \)
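For example, in the univariate case \(\lambda = 1\) the proposition gives \(y^* \mapsto x\) (for \(\mu = 0\)) and \(x^* \mapsto -y\) (for \(\mu = 1\)), and here \((\lambda !)^{-2} = 1\). As a direct check with \(D = \partial _x \otimes \partial _y - \partial _y \otimes \partial _x\), we compute:

$$\begin{aligned} D(x \otimes x) = 0, \qquad D(x \otimes y) = 1, \qquad D(-y \otimes x) = 1, \qquad D(-y \otimes y) = 0 \end{aligned}$$

So \(D(x \otimes \cdot ) = y^*\) and \(D(-y \otimes \cdot ) = x^*\), in agreement with the stated map.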

With this, we consider the following string of \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^{m+l}\)-invariant isomorphisms:

$$\begin{aligned} {{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha )) \rightarrow V(\lambda )^* \boxtimes V(\alpha ) \rightarrow V(\lambda ) \boxtimes V(\alpha ) \rightarrow V(\lambda \sqcup \alpha ) \end{aligned}$$

The first map is the standard isomorphism discussed above, the second map is induced by the previous proposition, and the third map is given by the discussion of outer tensor products in Sect. 3.1. This string of maps is explicitly defined on a given linear operator via:

$$\begin{aligned} \begin{aligned} T&\mapsto \sum _{\mu \le \lambda } (z^\mu w^{\lambda -\mu })^* \boxtimes T(x^\mu y^{\lambda -\mu }) \\&\mapsto \sum _{\mu \le \lambda } \left( {\begin{array}{c}\lambda \\ \mu \end{array}}\right) (-1)^{\mu } z^{\lambda -\mu } w^{\mu } \boxtimes T(x^\mu y^{\lambda -\mu }) \\&\mapsto \sum _{\mu \le \lambda } \left( {\begin{array}{c}\lambda \\ \mu \end{array}}\right) z^{\lambda -\mu } (-w)^{\mu } \cdot T(x^\mu y^{\lambda -\mu }) \end{aligned} \end{aligned}$$

Here, T acts only on the x and y variables, and z and w are the \(\lambda \) variables in \(V(\lambda \sqcup \alpha )\). This gives the desired isomorphism between \({{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\) and \(V(\lambda \sqcup \alpha )\), and hence we refer to this map as the \({{\,\mathrm{Symb}\,}}\) map.

Definition 3.6

For \(\lambda \in {\mathbb {N}}_0^m\) and \(\alpha \in {\mathbb {N}}_0^l\), we define the following \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^{m+l}\)-invariant isomorphism:

$$\begin{aligned} {{\,\mathrm{Symb}\,}}: {{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha )) \rightarrow V(\lambda \sqcup \alpha ) \\ T \mapsto T\left[ (zy-xw)^\lambda \right] = {{\,\mathrm{Hmg}\,}}_{(\lambda ,\alpha )}\left( T[(z-x)^\lambda ]\right) = \sum _{\mu \le \lambda } \left( {\begin{array}{c}\lambda \\ \mu \end{array}}\right) z^{\lambda -\mu } (-w)^{\mu } \cdot T(x^\mu y^{\lambda -\mu }) \end{aligned}$$

We call \({{\,\mathrm{Symb}\,}}(T)\) the (universal) symbol of T.

This expression bears a striking resemblance to the symbol used by Borcea and Brändén in [1], which motivates the use of the name “symbol” here. (In fact, \({{\,\mathrm{Symb}\,}}\) is almost the homogenization of the Borcea–Brändén symbol.) In Sect. 6, \({{\,\mathrm{Symb}\,}}\) will allow us to reduce the study of \({{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\) to the study of \(V(\lambda \sqcup \alpha )\) via the next lemma. We refer to this next result as the symbol lemma, and it demonstrates the fundamental connection between an operator T, its symbol, and the apolarity form. Note that the computation done here in the proof of this lemma is in a sense redundant. The operator \({{\,\mathrm{Symb}\,}}\) was essentially defined such that \({{\,\mathrm{Symb}\,}}(T)\) acts as T via \(D^\lambda \).

Lemma 3.7

(Symbol Lemma) Fix \(\lambda \in {\mathbb {N}}_0^m\), \(\alpha \in {\mathbb {N}}_0^l\), and a linear operator \(T \in {{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\). For \(q \in V(\lambda )\) and \(r \in V(\alpha )\), we have:

$$\begin{aligned} D^\lambda ({{\,\mathrm{Symb}\,}}(T) \otimes q \cdot r) = (\lambda !)^2 T(q) \otimes r \end{aligned}$$

Proof

Letting \(q_\mu \) be the coefficient of the \(x^\mu y^{\lambda -\mu }\) term of q, we compute:

$$\begin{aligned} \begin{aligned} D^\lambda ({{\,\mathrm{Symb}\,}}(T) \otimes q \cdot r)&= D^\lambda \left( \sum _{\mu \le \lambda } \left( {\begin{array}{c}\lambda \\ \mu \end{array}}\right) x^{\lambda -\mu } (-y)^{\mu } \cdot T(x^\mu y^{\lambda -\mu }) \otimes q \cdot r\right) \\&= \sum _{\mu \le \lambda } \left( {\begin{array}{c}\lambda \\ \mu \end{array}}\right) ^2 (\lambda -\mu )! \mu ! \cdot T(x^\mu y^{\lambda -\mu }) \otimes (\partial _x^\mu \partial _y^{\lambda -\mu } q) r \\&= (\lambda !)^2 \sum _{\mu \le \lambda } T(x^\mu y^{\lambda -\mu }) \otimes q_\mu \cdot r \\&= (\lambda !)^2 T(q) \otimes r \\ \end{aligned} \end{aligned}$$

\(\square \)
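As a small worked instance of the symbol lemma (with the operator chosen here for illustration), let \(T = \partial _x: V(2) \rightarrow V(1)\), so that \(\lambda = 2\) and \(\alpha = 1\). Its symbol is:

$$\begin{aligned} {{\,\mathrm{Symb}\,}}(T) = \partial _x\left[ (zy-xw)^2\right] = \partial _x\left[ z^2y^2 - 2zwxy + w^2x^2\right] = -2zwy + 2w^2x \end{aligned}$$

Now take \(q = x^2\) and \(r = 1\). Here \(D^2 = \partial _z^2 \otimes \partial _y^2 - 2\partial _z\partial _w \otimes \partial _x\partial _y + \partial _w^2 \otimes \partial _x^2\) acts on the (z, w) variables of the symbol and on the variables of q. Since \(\partial _y q = 0\), only the last term contributes:

$$\begin{aligned} D^2({{\,\mathrm{Symb}\,}}(T) \otimes q) = \partial _w^2(-2zwy + 2w^2x) \cdot \partial _x^2(x^2) = 4x \cdot 2 = 8x = (2!)^2\, T(x^2) \end{aligned}$$

The x appearing in the output is the variable of \(V(\alpha )\), on which \(D^2\) does not act.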

Polynomial stability theory

Given \(\lambda \in {\mathbb {N}}_0^m\), a polynomial \(p(x_1,y_1,\ldots ,x_m,y_m) \in V(\lambda )\) is said to be stable if it does not vanish in \({\mathcal {H}}_+^m \subset (\mathbb {CP}^1)^m\). More generally, p is said to be \(\Omega \)-stable if it does not vanish in \(\Omega \). As above, we say p is weakly \(\Omega \)-stable if possibly \(p \equiv 0\). Almost all results related to zero location of polynomials can then be translated into statements about stability properties of polynomials and stability preservation properties of operations applied to polynomials.

A linear operator T is said to preserve weak \(\Omega \)-stability if T(p) is \(\Omega \)-stable or identically zero for all \(\Omega \)-stable p. Further, a real linear operator T preserves weak real stability if the same holds for real stable polynomials. In [1], Borcea and Brändén were concerned with classifying such weak stability preserving operators. As seen in their main characterization results (Theorems 1.1 and 1.2), allowing the zero polynomial leads to a degeneracy condition in their characterization.

In order to remove this condition, we define a slightly different notion of stability: we say a linear operator T preserves (strong) \(\Omega \)-stability if T(p) is \(\Omega \)-stable and nonzero for all \(\Omega \)-stable p. Similarly, we say a real linear operator T preserves (strong) real stability if the same holds for real stable polynomials. Most of the main results of this paper rely on this notion of strong stability preservation, and we will demonstrate how it relates to weak stability preservation in Sects. 6.3 and 7.3.

Polar derivatives

A crucial tool of classical stability theory is the polar derivative. In particular, this notion leads to Laguerre’s theorem (Proposition 4.6), which is the main lemma toward Grace’s theorem. By passing to homogeneous polynomials the polar derivative becomes conceptually simpler, and this in turn sheds further light on the general connection to \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariance and the D map. One example of this, as we will see below, is that the polar derivative can be defined as the conjugation of \(\partial _x\) by some \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) action.

Given some “pole” \(x_0 \in {\mathbb {C}}\), the polar derivative with respect to \(x_0\) of \(f \in {\mathbb {C}}^n[x]\) is classically defined as follows.

$$\begin{aligned} (d_{x_0}f)(x) := nf(x) - (x-x_0)f'(x) \end{aligned}$$

Since the term of degree n cancels out, the resulting polynomial is of degree at most \(n-1\). It is typically said that this operator generalizes the ordinary derivative in the sense that \(\lim _{x_0 \rightarrow \infty } x_0^{-1}d_{x_0}f(x) = f'(x)\). However, this operator also generalizes the ordinary derivative in a more natural way, which we see by passing to V(n).

Fix any \(\phi = \begin{bmatrix} a &{} b \\ c &{} d \end{bmatrix} \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})\). Define the pole of \(\phi \) to be \((-d:c) \in \mathbb {CP}^1\). For \(p \in V(n)\), we then define the polar derivative with respect to \(\phi \) as follows.

$$\begin{aligned} d_\phi p := (\phi ^{-1}\partial _x\phi )p = -(-d\partial _x + c\partial _y)p \end{aligned}$$

Notice that \(d_\phi \) depends only on \(\phi ^{-1}\left( {\begin{array}{c}-1\\ 0\end{array}}\right) = \left( {\begin{array}{c}-d\\ c\end{array}}\right) \). With this, the pole of \(\phi \) should be interpreted as the element of \(\mathbb {CP}^1\) that \(\phi \) sends to \(\infty = (-1:0)\).

This definition of the polar derivative with respect to \(\phi \) is at the very least a natural one, as it can be simply described as the conjugation of \(\partial _x\) by the action of \(\phi \). The following result then shows that this is actually the correct definition.

Proposition 4.1

Fix \(\phi \equiv \begin{bmatrix} a &{} b \\ c &{} d \end{bmatrix} \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) with pole \((-d:c)\), and define \(x_0 := \frac{-d}{c}\) (for \(c \ne 0\)). Then:

$$\begin{aligned} d_\phi \circ {{\,\mathrm{Hmg}\,}}_n = {{\,\mathrm{Hmg}\,}}_{n-1} \circ (-c \cdot d_{x_0}) \end{aligned}$$

That is, the polar derivative \(d_\phi \) on V(n) is the homogenization of the classical polar derivative \(d_{x_0}\) on \({\mathbb {C}}^n[x]\) (up to scalar).

Proof

Straightforward computation. \(\square \)
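The computation can be sketched in coefficients. If \(f = \sum _k f_k x^k\), then \({{\,\mathrm{Hmg}\,}}_n(f) = \sum _k f_k x^k y^{n-k}\) has the same coefficient list, and \(d_\phi = d\partial _x - c\partial _y\) depends only on c and d. The following minimal check compares both sides of the identity for one sample polynomial. The function names `classical_polar` and `homogeneous_polar` and the sample values are hypothetical choices made for illustration.

```python
from fractions import Fraction


def classical_polar(f, x0):
    """Coefficients of n*f(x) - (x - x0)*f'(x), where f[k] multiplies x**k and n = len(f) - 1."""
    n = len(f) - 1
    # The x**n terms cancel, so only the coefficients of x**0 .. x**(n-1) are returned.
    return [(n - k) * f[k] + x0 * (k + 1) * f[k + 1] for k in range(n)]


def homogeneous_polar(p, c, d):
    """Coefficients of (d*dx - c*dy) p, where p[k] multiplies x**k * y**(n-k)."""
    n = len(p) - 1
    return [d * (k + 1) * p[k + 1] - c * (n - k) * p[k] for k in range(n)]


f = [Fraction(v) for v in (-1, 0, 1)]  # f(x) = x^2 - 1, so n = 2
c, d = Fraction(1), Fraction(-2)       # pole (-d : c) = (2 : 1)
x0 = -d / c                            # x0 = 2

lhs = homogeneous_polar(f, c, d)                 # d_phi applied to Hmg_n(f)
rhs = [-c * v for v in classical_polar(f, x0)]   # Hmg_{n-1} applied to -c * d_{x0} f
assert lhs == rhs
```

Exact rational arithmetic via `Fraction` is used so that the comparison does not depend on rounding.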

As mentioned above, \(d_\phi \) depends only on \((-d:c)\), the pole of \(\phi \). So given any pole in \(\mathbb {CP}^1\), we can actually choose \(\phi \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) to be a rotation of the Riemann sphere (i.e., \(\mathbb {CP}^1\)). This then gives the following intuitive description of the polar derivative.

Remark 4.2

Fix a rotation \(\phi \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) with pole \((-d:c)\). The polar derivative \(d_\phi \) then acts on \(p \in V(n)\) in the following way. First, consider the zeros of p as being placed in the Riemann sphere via stereographic projection. Next, rotate the sphere via \(\phi \), which moves \((-d:c)\) to infinity at the top of the sphere. Apply the derivative to the new polynomial given by the new locations of the zeros. Finally, undo the original rotation via \(\phi ^{-1}\), which moves infinity back to the pole \((-d:c)\).

Projective convexity and Laguerre’s theorem

Circular regions play a key role in Grace’s theorem and its corollaries. The main reason for this is Laguerre’s theorem, which essentially says that polar derivatives with respect to points of a circular region preserve stability for that circular region. This theorem in turn relies on the Gauss–Lucas theorem, which deals with convex regions.

A circular region in \({\mathbb {C}}\) is defined to be a disc, half-plane, or complement of a disc, and such a circular region can be either open or closed. The generalization of circular regions to \(\mathbb {CP}^1\) is the obvious one: a circular region in \(\mathbb {CP}^1\) is a set in \(\mathbb {CP}^1\) whose stereographic projection is a circular region in \({\mathbb {C}}\). Note that \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) acts transitively on the set of all circular regions in \({\mathbb {C}}\) or in \(\mathbb {CP}^1\). We now state a lemma toward Laguerre’s theorem, which gets at the heart of the importance of circular regions.

Lemma 4.3

Let \(C \subseteq \mathbb {CP}^1\) be a circular region, and let \(\phi \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) be such that its pole is not in C. Then, the stereographic projection of \(\phi \cdot C\) is convex.

Proof

Let \((x_0:y_0) \notin C\) be the pole of \(\phi \). Then, \(\phi \) maps \((x_0:y_0)\) to \(\infty \in \mathbb {CP}^1\) and maps C to another circular region. Since \((x_0:y_0) \notin C\) implies \(\infty \notin \phi \cdot C\), the stereographic projection of \(\phi \cdot C\) is either an open half-plane or is bounded away from \(\infty \). Since \(\phi \cdot C\) is a circular region, it must be convex. \(\square \)

This then leads to a natural extension of the notion of a circular region.

Definition 4.4

Given \(C \subseteq \mathbb {CP}^1\), we say that C is projectively convex if for every \(\phi \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) with pole not in C, the stereographic projection of \(\phi \cdot C\) is convex.

We now classify all projectively convex sets in \(\mathbb {CP}^1\) in the following. This result has been demonstrated before in [18], where projectively convex regions are referred to as generalized circular regions.

Proposition 4.5

(Zervos) Let \(C \subseteq \mathbb {CP}^1\) be projectively convex. Then, \(C = C^\circ \cup \gamma \), where \(C^\circ \) is an open circular region which is the interior of C, and \(\gamma \) is a connected subset of the boundary of \(C^\circ \). In particular, projective convexity is preserved under taking complements.

So, one example of a projectively convex set which is not quite a circular region is \({\mathcal {H}}_+\cup {\mathbb {R}}_+\). Another is \({\mathcal {H}}_+\cup [0,1]\). Yet another (albeit after a bit of consideration) is \({\mathcal {H}}_+\cup (-\infty ,0) \cup (1,\infty ]\). We now state a homogeneous version of Laguerre’s theorem, extended to projectively convex sets.

Proposition 4.6

(Laguerre) Let \(C \subseteq \mathbb {CP}^1\) be projectively convex, and fix \(\phi \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})\). If the pole of \(\phi \) is in C, then \(d_\phi \) preserves strong C-stability.

Proof

Gauss–Lucas and the fact that \(\mathbb {CP}^1 {\setminus } C\) is projectively convex give the result. Specifically, for C-stable \(p \in V(n)\) consider \(\phi \cdot p\), which is stable in \(\phi \cdot C \ni \infty \). Letting B be the complement of C, the dehomogenization of this polynomial is then of degree exactly n with all of its roots in the stereographic projection of \(\phi \cdot B\). By projective convexity, \(\phi \cdot B\) is convex and therefore Gauss–Lucas implies \(\partial _x(\phi \cdot p)\) is \(\phi \cdot C\)-stable and not identically zero. Applying \(\phi ^{-1}\) then implies \(d_\phi p = (\phi ^{-1} \partial _x \phi ) p\) is C-stable. \(\square \)
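To illustrate with a simple case (chosen here for concreteness), let \(C = {\mathcal {H}}_+\) and \(p = x^2 - y^2 \in V(2)\), which is the homogenization of \(f(x) = x^2 - 1\) and is C-stable since its roots are \((\pm 1:1)\). For a pole \((x_0:1)\) with \(x_0 \in {\mathcal {H}}_+\), Proposition 4.1 says that \(d_\phi p\) is a nonzero scalar multiple of the homogenization of:

$$\begin{aligned} (d_{x_0}f)(x) = 2(x^2-1) - (x-x_0) \cdot 2x = 2x_0x - 2 \end{aligned}$$

The only root of this polynomial is \(1/x_0 = \overline{x_0}/|x_0|^2 \in {\mathcal {H}}_-\), so \(d_\phi p\) is nonzero and C-stable, as the proposition predicts.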

Corollary 4.7

Let \(C_k \subseteq \mathbb {CP}^1\) be projectively convex regions for \(k \in [m]\), and fix \(\phi \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})\). If the pole of \(\phi \) is in \(C_{k_0}\), then \(d_\phi \) acting on the variables \((x_{k_0},y_{k_0})\) preserves strong \((C_1 \times \cdots \times C_m)\)-stability.

Proof

Follows from the fact that taking derivatives in some variables commutes with evaluation in the others. Specifically, \(p \in V(\lambda )\) is \((C_1 \times \cdots \times C_m)\)-stable iff \(p \ne 0\) for all evaluations in \(C_1 \times \cdots \times C_m\). So, evaluating p in all variables in that product of sets except \((x_{k_0},y_{k_0})\) gives us a \(C_{k_0}\)-stable polynomial in \(V(\lambda _{k_0})\). Applying the previous proposition then gives the result. \(\square \)

Real stable polynomials

We now give a number of classical real stability results, along with a few results from [1] and [2]. Additionally, we state these results for homogeneous polynomials in \(V_{\mathbb {R}}(\lambda )\), taking roots at infinity into account. The results of this section will come into play mainly in Sect. 7, where we discuss real linear operators and operators preserving interval- and ray-rootedness.

The first result we will need for our considerations of \(V_{\mathbb {R}}(\lambda )\) is a version of the Hermite–Biehler theorem, often called the Hermite–Kakeya–Obreschkoff theorem. We state here without proof the multivariate version essentially used in Theorem 1.9 of [1] (see also §2.4 of [16]). First we need a definition.

Definition 4.8

We say that \(p,q \in V_{\mathbb {R}}(\lambda )\) are in proper position, denoted \(p \ll q\), if \(q + ip\) is weakly \({\mathcal {H}}_+^m\)-stable (equivalently, if \(p + iq\) is weakly \({\mathcal {H}}_-^m\)-stable).

Proposition 4.9

(Multivariate Hermite–Biehler) For \(p,q \in V_{\mathbb {R}}(\lambda )\), \(ap + bq\) is weakly real stable for all \(a,b \in {\mathbb {R}}\) if and only if either \(p \ll q\) or \(q \ll p\).

This result will be crucial to our consideration of real polynomials and real stability (as it was in [1]). Its main use for us in this direction is made explicit in the following.

Lemma 4.10

Fix \(\lambda \in {\mathbb {N}}_0^m\), \(\alpha \in {\mathbb {N}}_0^l\), and a linear operator \(T: V(\lambda ) \rightarrow V(\alpha )\) which restricts to a real linear operator from \(V_{\mathbb {R}}(\lambda )\) to \(V_{\mathbb {R}}(\alpha )\). If T preserves weak real stability and \(p \in V(\lambda )\) is stable, then T(p) is either \({\mathcal {H}}_+^l\)-stable, \({\mathcal {H}}_-^l\)-stable, or identically zero.

Proof

By the Hermite–Biehler theorem, there exist \(q,r \in V_{\mathbb {R}}(\lambda )\) such that \(p = q + ir\) and \(a q + b r\) is real stable or zero for all \(a,b \in {\mathbb {R}}\). So, \(a T(q) + b T(r)\) is real stable or zero for all \(a,b \in {\mathbb {R}}\). By Hermite–Biehler again, \(T(p) = T(q) + iT(r)\) is either \({\mathcal {H}}_+^l\)-stable, \({\mathcal {H}}_-^l\)-stable, or identically zero. \(\square \)

The next two results are from [1], the first of which gives an equivalent characterization for a polynomial to be a scalar multiple of a real stable polynomial. This result will be specifically used in Sect. 7 to generalize complex operator theoretic stability results to the real stability case.

Lemma 4.11

[1, Proposition 4.1] Let \(p \in V_{\mathbb {C}}(\lambda )\) be both \({\mathcal {H}}_+^m\)-stable and \({\mathcal {H}}_-^m\)-stable. Then, p is a (complex) scalar multiple of a real stable polynomial. In particular, if nonzero \(q,r \in V_{\mathbb {R}}(\lambda )\) are such that \(q \ll r\) and \(r \ll q\), then r is a (real) scalar multiple of q.

The next result provides the degeneracy cases in the Borcea–Brändén characterizations (recall the dimension restrictions of Theorems 1.1 and 1.2). We will use this result to explicate the link between our operator characterization and the Borcea–Brändén characterization (see Lemmas 6.11 and 7.3).

Lemma 4.12

[1, Lemma 3.2] Let \(W \subseteq V_{\mathbb {K}}(\lambda )\) be a \({\mathbb {K}}\)-vector subspace (for \({\mathbb {K}}= {\mathbb {C}}\) or \({\mathbb {K}}= {\mathbb {R}}\)) consisting only of weakly stable (resp. weakly real stable) polynomials. We have:

(a) If \({\mathbb {K}}= {\mathbb {C}}\), then \(\dim (W) \le 1\).

(b) If \({\mathbb {K}}= {\mathbb {R}}\), then \(\dim (W) \le 2\).

By applying appropriate Möbius transformations, note that (a) of the above lemma can be generalized to \((C_1 \times \cdots \times C_m)\)-stable polynomials for any open circular regions \(C_1,\ldots ,C_m \subseteq \mathbb {CP}^1\).
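A simple example (given here for illustration) shows how the two bounds differ. Take W to be the span of x and y, which is all of \(V_{\mathbb {K}}(1)\). Over \({\mathbb {C}}\), W contains \(x - iy\), and:

$$\begin{aligned} (x - iy)\big |_{(x:y) = (i:1)} = i - i = 0 \end{aligned}$$

Since \((i:1) \in {\mathcal {H}}_+\), this element is not stable, consistent with (a). Over \({\mathbb {R}}\), every nonzero \(ax + by\) with \(a,b \in {\mathbb {R}}\) vanishes only at the real point \((-b:a)\) and is therefore real stable, so the bound in (b) is attained.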

We now state the last result of this section, which refines the Hermite–Biehler theorem for top-degree monic polynomials in \(V_{\mathbb {R}}(n)\). This refinement comes through the notion of interlacing polynomials and is much closer to the original statement of the classical Hermite–Biehler theorem (e.g., see Theorem 6.3.4 in [15]).

Lemma 4.13

For top-degree monic \(p,q \in V_{\mathbb {R}}(n)\), \(p \ll q\) if and only if the roots of p and q (denoted in increasing order by \((\alpha _k:1)\) and \((\beta _k:1)\), respectively) interlace on the real line in the following way:

$$\begin{aligned} \alpha _1 \le \beta _1 \le \alpha _2 \le \beta _2 \le \cdots \le \alpha _n \le \beta _n \end{aligned}$$

Further, if these equivalent conditions hold, then \(\ll \) gives a total order on the top-degree monic elements of the span of p and q in \(V_{\mathbb {R}}(n)\). This order is equivalently defined via the order of the \(k\text {th}\) largest roots, for any \(k \in [n]\) such that \(\alpha _k \ne \beta _k\).

Proof

The fact that \(p \ll q\) is equivalent to interlacing roots is the classical univariate Hermite–Biehler theorem. That q has larger roots than p follows from the fact that the \((n-1)\text {st}\) derivative of \(q + ip\) must be \({\mathcal {H}}_+\)-stable. Since both polynomials are top-degree monic, this \((n-1)\text {st}\) derivative will be a complex linear combination of two linear terms. This complex linear combination is given as follows, where \(s_q\) and \(s_p\) denote the respective sums of the roots of q and p:

$$\begin{aligned} \partial _x^{n-1} \big (q(x,y) + ip(x,y)\big )= & {} (n! \cdot x - (n-1)! \cdot s_q y) + i(n! \cdot x - (n-1)! \cdot s_p y) \\= & {} n!(1+i)\left( x - \frac{s_q + i s_p}{n(1 + i)} y\right) \end{aligned}$$

Since this polynomial is \({\mathcal {H}}_+\)-stable, it must be that \(\frac{s_q + i s_p}{n(1 + i)} \in {\overline{{\mathcal {H}}_-}}\). We further compute:

$$\begin{aligned} {\overline{{\mathcal {H}}_-}} \ni \frac{s_q + i s_p}{n(1 + i)} = \frac{(s_q + i s_p)(1-i)}{2n} = \frac{s_q + s_p + i(s_p-s_q)}{2n} \end{aligned}$$

Therefore \(s_q \ge s_p\), which is the same as saying that the sum of the roots of q is larger than that of p. Since we already know that the roots of q and p interlace, this implies that q has larger roots than p.

As for the total ordering property, let r and s be two polynomials in the real span of p and q. Any real linear combination of these polynomials is then a real linear combination of p and q (and hence is real-rooted), and Hermite–Biehler implies either \(r \ll s\) or \(s \ll r\). By the above interlacing condition, it is straightforward to see that this total order is given by looking at the order of the \(k\text {th}\) roots, for any \(k \in [n]\). \(\square \)
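The smallest case makes the lemma concrete (the polynomials are chosen here for illustration). Let \(p = x\) and \(q = x - y\) in \(V_{\mathbb {R}}(1)\), with roots \(\alpha _1 = 0\) and \(\beta _1 = 1\). Then \(q + ip = (1+i)x - y\), which vanishes only where:

$$\begin{aligned} \frac{x}{y} = \frac{1}{1+i} = \frac{1-i}{2} \in {\mathcal {H}}_- \end{aligned}$$

Hence \(q + ip\) is \({\mathcal {H}}_+\)-stable and \(p \ll q\), in agreement with \(\alpha _1 \le \beta _1\) and with \(s_q = 1 \ge 0 = s_p\). In the other direction, \(p + iq = (1+i)x - iy\) vanishes at \(\frac{x}{y} = \frac{i}{1+i} = \frac{1+i}{2} \in {\mathcal {H}}_+\), so \(q \ll p\) fails.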

Grace’s theorem

We now prove the multivariate homogeneous Grace’s theorem for some specific projectively convex regions and then derive a few important corollaries. These corollaries will be almost immediate once Grace’s theorem has been proven, and yet will quickly yield stronger results regarding linear operators in the next section.

In the usual proof of the classical univariate Grace’s theorem, reference to linear factors of \(f \in {\mathbb {C}}^n[x]\) is necessary. This makes generalization to \({\mathbb {C}}^\lambda [x_1,\ldots ,x_m]\) difficult, as multivariate polynomials do not necessarily have any linear factors. In our new proof, we are able to avoid reference to linear terms by using particular features of the D map. This means that our proof method works for any \(\lambda \).

Theorem 5.1

Fix \(\lambda \in {\mathbb {N}}_0^m\) and \(p,q \in V(\lambda )\). Also, denote \(C := {\mathcal {H}}_+\cup \overline{{\mathbb {R}}_+}\) and \({\widetilde{C}} := {\mathcal {H}}_-\cup \overline{{\mathbb {R}}_-}\), where the closures are considered to be in \(\mathbb {CP}^1\). If p is \(C^m\)-stable and q is \({\widetilde{C}}^m\)-stable, then \(D^\lambda (p \otimes q) \ne 0\).

Proof

We prove the theorem by induction on degree. For \(\lambda \equiv 0\), the result is obvious. For \(|\lambda | \ge 1\), we can assume WLOG that \(\lambda _1 \ge 1\) by permuting the variables. Define \(\delta _1 := (1,0,0,\ldots ,0) \in {\mathbb {N}}_0^m\).

Since C and \({\widetilde{C}}\) are projectively convex, Corollary 4.7 implies \((a\partial _{x_1} + b\partial _{y_1})p\) is \(C^m\)-stable for all \((a:b) \in C\) and \((c\partial _{x_1} + d\partial _{y_1})q\) is \({\widetilde{C}}^m\)-stable for all \((c:d) \in {\widetilde{C}}\). To obtain a contradiction, we assume \(D^\lambda (p \otimes q) = 0\). For \(\alpha \in {\mathcal {H}}_+\cup {\mathbb {R}}_+ \subset C\) (equivalently, \(-\alpha \in {\mathcal {H}}_-\cup {\mathbb {R}}_- \subset {\widetilde{C}}\)), this gives:

$$\begin{aligned} \begin{aligned} D^{\lambda -\delta _1}\big ((\alpha \partial _{x_1} + \partial _{y_1})p \otimes (\alpha \partial _{x_1} - \partial _{y_1})q\big )&= \alpha ^2 D^{\lambda -\delta _1}(\partial _{x_1} p \otimes \partial _{x_1} q) - D^{\lambda -\delta _1}(\partial _{y_1} p \otimes \partial _{y_1} q) - \alpha D^\lambda (p \otimes q) \\&= \alpha ^2 D^{\lambda -\delta _1}(\partial _{x_1} p \otimes \partial _{x_1} q) - D^{\lambda -\delta _1}(\partial _{y_1} p \otimes \partial _{y_1} q) \end{aligned} \end{aligned}$$

By induction and the stability properties discussed above, we have \(D^{\lambda -\delta _1}(\partial _{x_1} p \otimes \partial _{x_1} q) \ne 0\), \(D^{\lambda -\delta _1}(\partial _{y_1} p \otimes \partial _{y_1} q) \ne 0\), and \(D^{\lambda -\delta _1}\big ((\alpha \partial _{x_1} + \partial _{y_1})p \otimes (\alpha \partial _{x_1} - \partial _{y_1})q\big ) \ne 0\). This implies:

$$\begin{aligned} \alpha ^2 D^{\lambda -\delta _1}(\partial _{x_1} p \otimes \partial _{x_1} q) - D^{\lambda -\delta _1}(\partial _{y_1} p \otimes \partial _{y_1} q) \ne 0 \Longrightarrow \alpha ^2 \ne \frac{D^{\lambda -\delta _1}(\partial _{y_1} p \otimes \partial _{y_1} q)}{D^{\lambda -\delta _1}(\partial _{x_1} p \otimes \partial _{x_1} q)} \in {\mathbb {C}}{\setminus } \{0\} \end{aligned}$$

However, we can pick \(\alpha \in {\mathcal {H}}_+\cup {\mathbb {R}}_+\) such that \(\alpha ^2\) is any value of \({\mathbb {C}}{\setminus } \{0\}\) we want, including that of \(\frac{D^{\lambda -\delta _1}(\partial _{y_1} p \otimes \partial _{y_1} q)}{D^{\lambda -\delta _1}(\partial _{x_1} p \otimes \partial _{x_1} q)}\). This contradiction gives the result. \(\square \)
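The last step uses that squaring maps \({\mathcal {H}}_+\cup {\mathbb {R}}_+\) onto \({\mathbb {C}}{\setminus } \{0\}\): writing \(w = re^{i\theta }\) with \(\theta \in [0,2\pi )\), the choice \(\alpha = \sqrt{r}e^{i\theta /2}\) has argument in \([0,\pi )\). The following is a minimal numerical sketch of this choice. The function names are ours (hypothetical, not from the text), and the membership test uses a floating-point tolerance.

```python
import cmath
import math


def alpha_with_square(w: complex) -> complex:
    """Return alpha with alpha**2 == w and argument in [0, pi).

    Hypothetical helper: it picks the square root lying in the open upper
    half-plane or on the positive real ray, as in the proof of Theorem 5.1.
    """
    if w == 0:
        raise ValueError("w must be nonzero")
    r, theta = cmath.polar(w)      # theta lies in (-pi, pi]
    if theta < 0:
        theta += 2 * math.pi       # move the argument into [0, 2*pi)
    return cmath.rect(math.sqrt(r), theta / 2)


def in_upper_half_plane_or_positive_ray(z: complex, tol: float = 1e-12) -> bool:
    """Membership test for H_+ ∪ R_+ up to the tolerance tol (an assumption)."""
    return z.imag > tol or (abs(z.imag) <= tol and z.real > 0)


# Sample targets in C \ {0}; each admits a square root in H_+ ∪ R_+.
samples = [4, -1, 1j, -1j, 3 - 4j]
alphas = [alpha_with_square(w) for w in samples]
```

Each entry of `alphas` squares back to the corresponding sample, which is exactly the freedom needed to force the contradiction.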

Other regions

We now generalize the above theorem to other regions via \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) action and topological considerations. Theorem 5.7 can then be considered our most general form of Grace’s theorem. First though, we define two new notions in order to simplify the rest of this section.

Definition 5.2

Fix \(m \in {\mathbb {N}}_0\) and any sets \(S_1,S_2 \subseteq (\mathbb {CP}^1)^m\). We call \((S_1,S_2)\) a Grace pair if: for all \(\lambda \in {\mathbb {N}}_0^m\) and \(p,q \in V(\lambda )\) such that p is \(S_1\)-stable and q is \(S_2\)-stable, we have that \(D^\lambda (p \otimes q) \ne 0\). That is, if Grace’s theorem holds for \(S_1\) and \(S_2\).

Definition 5.3

We say that a Grace pair is disjoint if it is of the form \((C_1 \times \cdots \times C_m, B_1 \times \cdots \times B_m)\) and \(C_k\) and \(B_k\) are disjoint for all \(k \in [m]\).

This yields the following restatement of the above theorem.

Corollary 5.4

For any \(m \in {\mathbb {N}}_0\), \((({\mathcal {H}}_+\cup \overline{{\mathbb {R}}_+})^m, ({\mathcal {H}}_-\cup \overline{{\mathbb {R}}_-})^m)\) is a Grace pair.

The sets considered above intersect at 2 points (0 and \(\infty \)), and this ends up being crucial to the proof. So, in order to extend to the full generality of Grace’s theorem, we will need to find such points even when the stability sets of two polynomials p and q do not a priori intersect at all. To this end, we give the following lemmas.

Lemma 5.5

Fix \(\lambda \in {\mathbb {N}}_0^m\) and any closed circular regions \(C_1,\ldots ,C_m \subset \mathbb {CP}^1\). Let \(p \in V(\lambda )\) be \((C_1 \times \cdots \times C_m)\)-stable. There exist open circular regions \(U_1,\ldots ,U_m\) such that \(C_k \subset U_k\) for all \(k \in [m]\) and p is \((U_1 \times \cdots \times U_m)\)-stable.

Proof

Follows from compactness of \(\mathbb {CP}^1\) and closedness of \(C_1 \times \ldots \times C_m\) and of the zero set of p. \(\square \)

For the next lemma, note that the boundary of any circular region C is topologically equivalent to the unit circle in \({\mathbb {C}}\) (i.e., the boundary of the unit disc). With this, we call a portion of the boundary of C open if it is open when considered as a subset of the unit circle. Further recall the characterization of projectively convex regions given by Proposition 4.5.

Lemma 5.6

Fix \(n \in {\mathbb {N}}_0\) and any projectively convex \(C \equiv C^\circ \cup \gamma \subset \mathbb {CP}^1\), where \(C^\circ \) is an open circular region and \(\gamma \) is a connected portion of its boundary. Let \(p \in V(n)\) be C-stable. There is an open connected subset \(\Gamma \) of the boundary of \(C^\circ \) such that \(\gamma \subseteq \Gamma \) and p is \((C^\circ \cup \Gamma )\)-stable.

Proof

Let \(\partial C^\circ \) denote the boundary of (the closure of) \(C^\circ \), and let \(S \subseteq \mathbb {CP}^1\) be the intersection of \(\partial C^\circ \) and the zero set of p. Since the zero set of p is closed, we have that S is closed in \(\partial C^\circ \). And further, \(\gamma \cap S = \varnothing \) by assumption. Defining \(\Gamma \) to be the connected component of \(\partial C^\circ {\setminus } S\) containing \(\gamma \) then gives the result. \(\square \)

Using these lemmas and the \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariance of the apolarity form, we obtain the following generalization of Grace’s theorem. Here, (ii) and (iii) give the multivariate Grace’s theorem proven in [2].

Theorem 5.7

For \(m \in {\mathbb {N}}_0\) and \(C_1,\ldots ,C_m,B_1,\ldots ,B_m \subseteq \mathbb {CP}^1\), we have that \((C_1 \times \cdots \times C_m, B_1 \times \cdots \times B_m)\) is a Grace pair for the following regions.

  1. (i)

    For all \(k \in [m]\), \(C_k\) and \(B_k\) are projectively convex, \(C_k \cup B_k = \mathbb {CP}^1\), and \(C_k \cap B_k\) is exactly two points.

  2. (ii)

    For all \(k \in [m]\), \(C_k\) is a closed circular region, \(B_k\) is an open circular region, and \(C_k \cup B_k = \mathbb {CP}^1\).

  3. (iii)

    For all \(k \in [m]\), \(C_k\) is an open circular region, \(B_k\) is a closed circular region, and \(C_k \cup B_k = \mathbb {CP}^1\).

  4. (iv)

    For \(m = 1\), \(C_1\) and \(B_1\) are projectively convex and \(C_1 \cup B_1 = \mathbb {CP}^1\).

Proof

(i). By Proposition 4.5, every projectively convex region in \(\mathbb {CP}^1\) is the union of an open circular region and a portion of its boundary. Since \(C_k \cup B_k = \mathbb {CP}^1\) and \(C_k \cap B_k\) is exactly two points, we then must have that \(C_k = \phi _k \cdot ({\mathcal {H}}_+\cup \overline{{\mathbb {R}}_+})\) and \(B_k = \phi _k \cdot ({\mathcal {H}}_-\cup \overline{{\mathbb {R}}_-})\) for some \(\phi _k \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})\). Since \(D^\lambda \) is \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\)-invariant, the result follows from Theorem 5.1.

(ii). Fix \(p,q \in V(\lambda )\). If p is \((C_1 \times \cdots \times C_m)\)-stable and q is \((B_1 \times \cdots \times B_m)\)-stable, then Lemma 5.5 implies p is \((U_1 \times \cdots \times U_m)\)-stable for some open circular regions \(U_1,\ldots ,U_m\) such that \(C_k \subset U_k\) for all \(k \in [m]\). Since \(C_k \cup B_k = \mathbb {CP}^1\), we then have that \(U_k \cap B_k\) is open and nonempty. Since \(U_k\) and \(B_k\) are circular regions, their intersection in fact contains an open annulus or open strip in \({\mathbb {C}}\). Therefore we may slightly shrink \(U_k\) and \(B_k\) to get closed circular regions \(U_k'\) and \(B_k'\) such that \(U_k' \cup B_k' = \mathbb {CP}^1\) and \(U_k' \cap B_k' = \partial U_k' = \partial B_k'\), where \(\partial B_k'\) denotes the boundary of \(B_k'\). We can then further remove portions of the respective boundaries of \(U_k'\) and \(B_k'\) to get projectively convex regions \(U_k''\) and \(B_k''\) such that \(U_k'' \cup B_k'' = \mathbb {CP}^1\) and \(U_k'' \cap B_k''\) is exactly two points. Since \(U_k'' \subset U_k\) and \(B_k'' \subset B_k\), we have that p is \((U_1'' \times \cdots \times U_m'')\)-stable and q is \((B_1'' \times \cdots \times B_m'')\)-stable. Therefore (i) implies \(D^\lambda (p \otimes q) \ne 0\), and this implies (ii).

(iii). Same argument as (ii).

(iv). Let \(p,q \in V(n)\) be such that p is \(C_1\)-stable and q is \(B_1\)-stable. Defining \(B_1' := \mathbb {CP}^1 {\setminus } C_1 \subseteq B_1\), we further have that q is \(B_1'\)-stable. So WLOG we may assume that \(B_1 = B_1'\). Note that this implies \(\partial C_1^\circ = \partial B_1^\circ \); that is, the boundaries coincide. If \(C_1\) is a circular region then so is \(B_1\), and therefore \(D^n(p \otimes q) \ne 0\) by (ii) or (iii). This implies (iv) in this case.

Otherwise by Proposition 4.5, we have that \(C_1 = C_1^\circ \cup \gamma _1\) where \(C_1^\circ \) is an open circular region and \(\varnothing \ne \gamma _1 \subseteq \partial C_1^\circ \). Analogously we have \(B_1 = B_1^\circ \cup \psi _1\) with \(\varnothing \ne \psi _1 \subseteq \partial B_1^\circ = \partial C_1^\circ \). Lemma 5.6 then implies there exist \(\Gamma _1\) and \(\Psi _1\), which are open portions of the boundary of \(C_1^\circ \), such that \(\gamma _1 \subseteq \Gamma _1\), \(\psi _1 \subseteq \Psi _1\), p is \((C_1^\circ \cup \Gamma _1)\)-stable, and q is \((B_1^\circ \cup \Psi _1)\)-stable. Since \(\Gamma _1 \cup \Psi _1 = \partial C_1^\circ \), we then further can find closed subsets \(\Gamma _1' \subset \Gamma _1\) and \(\Psi _1' \subset \Psi _1\) such that \(\Gamma _1' \cup \Psi _1' = \partial C_1^\circ \) and \(\Gamma _1' \cap \Psi _1'\) is exactly two points. Therefore \(D^n(p \otimes q) \ne 0\) by (i), and this implies (iv). \(\square \)

Notice that (ii) and (iii) in this result do not allow for mixed open and closed stability regions. That is, all of the \(C_k\) must be open and all of the \(B_k\) closed, or vice versa. We show that this particular point cannot be ignored, using the following example.

Example 5.8

Let \(\lambda = (1,1,1)\), denote \(E := \mathbb {CP}^1 {\setminus } {\overline{{\mathbb {D}}}}\), and consider the polynomial \(p := x_1x_2x_3 - y_1y_2y_3 = {{\,\mathrm{Hmg}\,}}_\lambda (x_1x_2x_3 - 1)\). First, it is easy to see that \(D^\lambda (p \otimes p) = 0\). Also, p is \({\mathbb {D}}^3\)-stable and \(E^3\)-stable, but it is not \({\overline{{\mathbb {D}}}}^3\)-stable nor \({\overline{E}}^3\)-stable (zero at \((x_k,y_k) = (1,1)\) for \(k \in [3]\)). That is, the fact that \(D^\lambda (p \otimes p) = 0\) does not contradict (ii) or (iii) of the previous theorem.

On the other hand, p is both \(({\overline{{\mathbb {D}}}} \times {\mathbb {D}}\times {\mathbb {D}})\)-stable and \((E \times {\overline{E}} \times {\overline{E}})\)-stable. This shows that \(({\overline{{\mathbb {D}}}} \times {\mathbb {D}}\times {\mathbb {D}}, E \times {\overline{E}} \times {\overline{E}})\) is not a Grace pair. That is, mixed open and closed stability regions cannot be included in (ii) and (iii) of the previous theorem.

Whether the two-point intersection condition can be removed from (i) seems to be a more subtle point. It would be quite nice if this condition could be removed, but it is unclear whether or not this is possible.

Evaluation symbols

One way to interpret the stability properties of a given polynomial is via the stability-preservation properties of a particular type of linear operator: the evaluation map. That is, the map which evaluates a polynomial \(p(x,y)\) at \((a,b) \ne (0,0)\) preserves strong \(\{(a:b)\}\)-stability, where \(\{(a:b)\}\) is a subset of \(\mathbb {CP}^1\) consisting of a single point. Further, given \(\lambda \in {\mathbb {N}}_0^m\) and \((a,b) = (a_1,b_1,\ldots ,a_m,b_m) \in {\mathbb {C}}^{2m}\) (with \((a_j,b_j) \ne (0,0)\) for all j), we can define the corresponding evaluation map as an element of \({{\,\mathrm{Hom}\,}}(V(\lambda ),V(0))\) since \(V(0) \cong {\mathbb {C}}\). This allows us to obtain symbols for evaluation maps, and these play an important role in our linear operator characterization.

Definition 5.9

Fix \(\lambda \in {\mathbb {N}}_0^m\) and \((a,b) = (a_1,b_1,\ldots ,a_m,b_m) \in {\mathbb {C}}^{2m}\) such that \((a_j,b_j) \ne (0,0)\) for all \(j \in [m]\). Let \({{\,\mathrm{ev}\,}}_{(a,b)}: V(\lambda ) \rightarrow V(0) \cong {\mathbb {C}}\) be the evaluation operator which maps p to \(p(a,b) = p(a_1,b_1,\ldots ,a_m,b_m)\). We call \({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)}) \in V(\lambda )\) the evaluation symbol with root \((a,b)\). Further:

$$\begin{aligned} {{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)}) = \prod _{j=1}^m (b_jx_j - a_jy_j)^{\lambda _j} =: (bx - ay)^\lambda \end{aligned}$$

The main significance of this notion comes from the following result, which is essentially just a restatement of the symbol lemma (Lemma 3.7) for evaluation symbols.

Lemma 5.10

(Evaluation Symbol Lemma) Fix \(\lambda \in {\mathbb {N}}_0^m\), \(p \in V(\lambda )\), and \((a,b) = (a_1,b_1,\ldots ,a_m,b_m) \in {\mathbb {C}}^{2m}\) such that \((a_j,b_j) \ne (0,0)\) for all \(j \in [m]\). Considering \((bx-ay)^\lambda \), the evaluation symbol with root \((a,b)\), we have:

$$\begin{aligned} D^\lambda ((bx-ay)^\lambda \otimes p) = (\lambda !)^2 p(a,b) \otimes 1 = (\lambda !)^2 p(a,b) \end{aligned}$$

In what follows, we will extend Grace’s theorem in a number of ways, mainly relying on the previous lemma and the symbol lemma itself. As we will see, the representation theoretic mentality combined with repeated use of the symbol lemma will yield many of the results of this paper with surprising simplicity.

We now obtain an interesting corollary of Grace’s theorem, making use of the notion of a disjoint Grace pair. This particular formulation of the theorem will serve as a model for our linear operator characterization in Sect. 6.2.

Corollary 5.11

Fix \(\lambda \in {\mathbb {N}}_0^m\), \(q \in V(\lambda )\), and any disjoint Grace pair \((C_1 \times \cdots \times C_m, B_1 \times \cdots \times B_m)\). Then the following are equivalent.

  1. (i)

    \(D^\lambda (p \otimes q) \ne 0\) for all \((C_1 \times \cdots \times C_m)\)-stable \(p \in V(\lambda )\).

  2. (ii)

    \(D^\lambda (p \otimes q) \ne 0\) for all \((C_1 \times \cdots \times C_m)\)-stable evaluation symbols \(p \in V(\lambda )\).

  3. (iii)

    q is \((B_1 \times \cdots \times B_m)\)-stable.

Proof

(i) \(\Rightarrow \) (ii) Immediate.

(ii) \(\Rightarrow \) (iii) Fix any \((a,b) = (a_1,b_1,\ldots ,a_m,b_m) \in {\mathbb {C}}^{2m}\) such that \((a_k:b_k) \in B_k\) for all \(k \in [m]\). Since \(B_k\) and \(C_k\) are disjoint, we have that \((a_k:b_k) \not \in C_k\) for all k, and therefore \({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)}) = (bx-ay)^\lambda \) is an evaluation symbol which is \((C_1 \times \cdots \times C_m)\)-stable. The evaluation symbol lemma given above then implies:

$$\begin{aligned} 0 \ne D^\lambda ({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)}) \otimes q) = (\lambda !)^2 q(a,b) \otimes 1 = (\lambda !)^2 q(a,b) \end{aligned}$$

That is, \(q(a,b) \ne 0\) for any \((a,b) \in {\mathbb {C}}^{2m}\) such that \((a:b) \in B_1 \times \cdots \times B_m\), and this implies q is \((B_1 \times \cdots \times B_m)\)-stable.

(iii) \(\Rightarrow \) (i) This follows immediately from the definition of Grace pair (Definition 5.2). \(\square \)

Stability properties of complex linear operators

In [1], Borcea and Brändén were concerned with classifying the class of weak \(\Omega \)-stability preserving operators, where \(\Omega \) is some product of open circular regions. What they found is that an operator preserves weak \(\Omega \)-stability if a particular associated polynomial (what they called the symbol) is \(\Omega \)-stable. However, the “only if” direction does not necessarily hold. In particular, there are some weak \(\Omega \)-stability preserving operators for which the corresponding symbol is not \(\Omega \)-stable. They then showed that this could only happen under very specific circumstances: the operator must have image of dimension at most one.

Here, we will characterize all strong \(\Omega \)-stability preserving linear operators (for a bit more general \(\Omega \)), as well as linear operators which map between different stability regions. And, as it turns out, the extra premise of strong stability preservation is exactly what is needed to have symbol stability be an equivalent condition. In a way, this makes sense: weak \(\Omega \)-stability preservation counts the zero polynomial as \(\Omega \)-stable, which in turn corresponds to potential zeros of the symbol in the region of stability. This does not happen with strong stability preservation, allowing for a more straightforward characterization.

First though, let’s take a closer look at the Borcea–Brändén characterization of weak stability-preserving linear operators.

Weak stability preservation

Borcea and Brändén define the following symbol:

$$\begin{aligned} {{\,\mathrm{Symb}\,}}_{BB}(T) := T[(x+z)^\lambda ] = \sum _{\mu \le \lambda } \left( {\begin{array}{c}\lambda \\ \mu \end{array}}\right) z^{\lambda -\mu } T(x^\mu ) \end{aligned}$$

They then obtain the following characterization of stability-preserving linear operators.

Theorem 1.1 (Borcea–Brändén) Fix \(\lambda \in {\mathbb {N}}_0^m\) and any linear operator \(T: {\mathbb {C}}^\lambda [x_1,\ldots ,x_m] \rightarrow {\mathbb {C}}[x_1,\ldots ,x_m]\). The following are equivalent.

  1. (i)

    T maps \({\mathcal {H}}_+^m\)-stable polynomials to weakly \({\mathcal {H}}_+^m\)-stable polynomials.

  2. (ii)

    One of the following holds:

    1. (a)

      \({{\,\mathrm{Symb}\,}}_{BB}(T)\) is \({\mathcal {H}}_+^{2m}\)-stable.

    2. (b)

      T has image of dimension at most one, and is of the form

      $$\begin{aligned} T: p \mapsto q \cdot \psi (p) \end{aligned}$$

      where \(q \in {\mathbb {C}}[x_1,\ldots ,x_m]\) is \({\mathcal {H}}_+^m\)-stable, and \(\psi \) is some linear functional.

Using our terminology, this is a characterization of weak stability-preserving linear operators. This fantastic result perhaps has but one unfortunate piece: the degeneracy condition (ii)(b). Its necessity is demonstrated in the following.

Example 6.1

Define \(T: {\mathbb {C}}^n[x] \rightarrow {\mathbb {C}}[x]\) via:

$$\begin{aligned} T: \sum _{k=0}^n \left( {\begin{array}{c}n\\ k\end{array}}\right) a_k x^k \mapsto (a_n + a_{n-2}) x^n \end{aligned}$$

This operator obviously preserves weak \({\mathcal {H}}_+\)-stability. We then have that \({{\,\mathrm{Symb}\,}}_{BB}(T) = (z^2 + 1)x^n\), which is not \({\mathcal {H}}_+^2\)-stable.
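For completeness, here is a short worked computation of this symbol from the definition of \({{\,\mathrm{Symb}\,}}_{BB}\), assuming \(n \ge 2\). Writing \(x^k = \binom{n}{k} a_k x^k\) with \(a_k = \binom{n}{k}^{-1}\) and all other coefficients zero, we get \(T(x^n) = x^n\), \(T(x^{n-2}) = \binom{n}{n-2}^{-1} x^n\), and \(T(x^k) = 0\) for every other k. Therefore:

$$\begin{aligned} {{\,\mathrm{Symb}\,}}_{BB}(T) = \sum _{k=0}^n \binom{n}{k} z^{n-k} T(x^k) = x^n + \binom{n}{n-2} z^2 \cdot \binom{n}{n-2}^{-1} x^n = (z^2 + 1)x^n \end{aligned}$$

This polynomial vanishes at \(z = i \in {\mathcal {H}}_+\) for every value of x, which is why it fails to be \({\mathcal {H}}_+^2\)-stable. On the other hand, every output of T is a scalar multiple of \(x^n\), which is either identically zero or vanishes only at \(x = 0 \notin {\mathcal {H}}_+\).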

As we will see below, this condition can be removed once we only consider strong stability-preserving operators. So then, maybe strong stability is the more natural notion? However “natural” it is, unfortunately it leaves out operators one might wish to consider. The most fundamental of such operators is the derivative operator \(\partial _x\). While \(\partial _x\) preserves strong \({\overline{{\mathcal {H}}_+}}\)-stability, it only preserves weak \({\mathcal {H}}_+\)-stability. Specifically, \(1 \in {\mathbb {C}}^n[x]\) is \({\mathcal {H}}_+\)-stable (all its roots are at \(\infty \)), but \(\partial _x 1 \equiv 0\). With this, one obviously wants to be able to include weak stability preserving operators in any characterization of \({\mathcal {H}}_+\)-stability preserving operators. We discuss how to use our strong stability preservation characterization to deal with operators like \(\partial _x\) in Example 6.8.

Strong stability preservation

We now state one of our main characterization results, the strong stability preservation characterization. We then derive the Borcea–Brändén characterization as a corollary.

Theorem 6.2

Fix \(\lambda \in {\mathbb {N}}_0^m\), \(\alpha \in {\mathbb {N}}_0^l\), a linear operator \(T \in {{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\), any disjoint Grace pair \((C_1 \times \cdots \times C_m, B_1 \times \cdots \times B_m)\), and any sets \(S_1,\ldots ,S_l \subseteq \mathbb {CP}^1\). The following are equivalent.

  1. (i)

    T maps \((C_1 \times \cdots \times C_m)\)-stable polynomials to nonzero \((S_1 \times \cdots \times S_l)\)-stable polynomials.

  2. (ii)

    T maps \((C_1 \times \cdots \times C_m)\)-stable evaluation symbols to nonzero \((S_1 \times \cdots \times S_l)\)-stable polynomials.

  3. (iii)

    \({{\,\mathrm{Symb}\,}}(T)\) is \((B_1 \times \cdots \times B_m) \times (S_1 \times \cdots \times S_l)\)-stable.

One should notice the generality of this result in terms of stability regions. First note that any disjoint Grace pair can be considered, without altering the symbol in any way (e.g., via conjugation by Möbius transformations). And further, the output sets that can be considered have no restrictions whatsoever. The power of these extra features can be seen in the following examples, which demonstrate classical results regarding polynomial convolutions in a very symbol-oriented way.

Example 6.3

Fix \(p,q \in V(n)\), so that \((z_j:1)\) are the roots in \(\mathbb {CP}^1\) of q for \(j \in [n]\). So, q has no roots at \(\infty \). The additive (Walsh) convolution of p and q is defined via:

$$\begin{aligned} p *_+^n q := \frac{1}{n!} \sum _{k=0}^n \partial _x^k p \cdot (\partial _x^{n-k} q)(0,1) \end{aligned}$$

With this, \(T_q(p) := p *_+^n q\) is a linear operator in \({{\,\mathrm{Hom}\,}}(V(n),V(n))\), and we have:

$$\begin{aligned} {{\,\mathrm{Symb}\,}}(T_q) = \prod _{j=1}^n (xw - (z + z_jw)y) = {{\,\mathrm{Hmg}\,}}_{(n,n)}\left[ \prod _{j=1}^n (x - (z + z_j))\right] \end{aligned}$$

Let \(C \subset \mathbb {CP}^1\) be any projectively convex region, and define \(S := \bigcup _j (C + z_j)\). If we order the input variables of \({{\,\mathrm{Symb}\,}}(T_q)\) as \((z,w,x,y)\), it is then straightforward to show that \({{\,\mathrm{Symb}\,}}(T_q)\) is \(C \times (\mathbb {CP}^1 {\setminus } S)\)-stable. (First deal with possible \((x:y) = (1:0)\) or \((z:w) = (1:0)\) cases, and then assume \(y = w = 1\) to simplify the remaining cases.) Applying the previous theorem, this implies \(T_q\) maps polynomials with roots in C to polynomials with roots in S. (This is Theorem 5.3.1 in [15].) Picking \(C = {\overline{{\mathcal {H}}_-}}\) and real-rooted q implies \(T_q\) maps \({\mathcal {H}}_+\)-stable polynomials to \({\mathcal {H}}_+\)-stable polynomials. Restricting to \(p \in V_{\mathbb {R}}(n)\) then shows that \(T_q\) preserves real-rootedness.

Example 6.4

Fix \(p,q \in V(n)\), so that \((z_j:1) \ne 0\) are the roots of q for \(j \in [n]\). So, q has no roots at 0 or \(\infty \). The multiplicative (Grace–Szegő) convolution of p and q (with coefficients \(p_k\) and \(q_k\), respectively) is defined via:

$$\begin{aligned} p *_\times ^n q := \sum _{k=0}^n \left( {\begin{array}{c}n\\ k\end{array}}\right) ^{-1} (-1)^k p_k q_k x^k y^{n-k} \end{aligned}$$

With this, \(T_q(p) := p *_\times ^n q\) is a linear operator in \({{\,\mathrm{Hom}\,}}(V(n),V(n))\), and we have:

$$\begin{aligned} {{\,\mathrm{Symb}\,}}(T_q) = \prod _{j=1}^n (xw - z_jzy) = {{\,\mathrm{Hmg}\,}}_{(n,n)}\left[ \prod _{j=1}^n (x - z_j z)\right] \end{aligned}$$

Let \(C \subset \mathbb {CP}^1\) be any projectively convex region, and define \(S := \bigcup _j (z_j \cdot C)\). If we order the input variables of \({{\,\mathrm{Symb}\,}}(T_q)\) as \((z,w,x,y)\), it is then straightforward to show that \({{\,\mathrm{Symb}\,}}(T_q)\) is \(C \times (\mathbb {CP}^1 {\setminus } S)\)-stable. (As above, first deal with possible \((x:y) = (1:0)\) or \((z:w) = (1:0)\) cases, and then assume \(y = w = 1\) to simplify the remaining cases.) Applying the previous theorem, this implies \(T_q\) maps polynomials with roots in C to polynomials with roots in S. (This is Theorem 3.4.1d in [15].) Picking \(C = {\mathcal {H}}_-\cup {\mathbb {R}}_+\) and q with only positive roots implies \(T_q\) maps \(({\mathcal {H}}_+\cup \overline{{\mathbb {R}}_-})\)-stable polynomials to \(({\mathcal {H}}_+\cup \overline{{\mathbb {R}}_-})\)-stable polynomials. Restricting to \(p \in V_{\mathbb {R}}(n)\) then shows that \(T_q\) preserves positive-rootedness.

In order to prove the above theorem, we need an operator-theoretic corollary to Grace’s theorem. The following result is the main motivation for the symbol lemma (Lemma 3.7), and demonstrates just how closely Grace’s theorem relates to stability properties of linear operators. Further, it gives a slightly stronger result in one direction of the above characterization, as Grace pair disjointness is not a required premise.

Proposition 6.5

Fix \(\lambda \in {\mathbb {N}}_0^m\), \(\alpha \in {\mathbb {N}}_0^l\), a linear operator \(T \in {{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\), any Grace pair \((C_1 \times \cdots \times C_m, B_1 \times \cdots \times B_m)\), and any sets \(S_1,\ldots ,S_l \subseteq \mathbb {CP}^1\). If \({{\,\mathrm{Symb}\,}}(T)\) is \((B_1 \times \cdots \times B_m) \times (S_1 \times \cdots \times S_l)\)-stable, then T maps \((C_1 \times \cdots \times C_m)\)-stable polynomials to nonzero \((S_1 \times \cdots \times S_l)\)-stable polynomials.

Proof

Fix any \((C_1 \times \cdots \times C_m)\)-stable \(q \in V(\lambda )\) and any \((c,d) = (c_1,d_1,\ldots ,c_l,d_l) \in {\mathbb {C}}^{2l}\) such that \((c_j:d_j) \in S_j\) for all \(j \in [l]\). Let \(k_{\lambda ,\alpha } := (-1)^\alpha (\lambda !)^2(\alpha !)^2\). The evaluation symbol lemma (Lemma 5.10) and the symbol lemma (Lemma 3.7) then give us the following expression of T(q) evaluated at \((c,d) = (c_1,d_1,\ldots ,c_l,d_l)\):

$$\begin{aligned} \begin{aligned} k_{\lambda ,\alpha } T(q)(c,d)&= k_{\lambda ,0} D^\alpha \big (T(q) \otimes (dx-cy)^\alpha \big ) \\&= D^{\lambda \sqcup \alpha }\big ({{\,\mathrm{Symb}\,}}(T) \otimes q \cdot (dx-cy)^\alpha \big ) \\&= k_{0,\alpha } D^\lambda \big ({{\,\mathrm{Symb}\,}}(T)(z,w,c,d) \otimes q(z,w)\big ) \end{aligned} \end{aligned}$$

In the last expression above, \(D^\lambda \) acts on the variables \((z,w) = (z_1,w_1,\ldots ,z_m,w_m)\). Since \(r(z,w) := {{\,\mathrm{Symb}\,}}(T)(z,w,c,d)\) is \((B_1 \times \cdots \times B_m)\)-stable and \(q(z,w)\) is \((C_1 \times \cdots \times C_m)\)-stable, we have that the last expression above is nonzero by definition of Grace pair (Definition 5.2). This implies T(q) is \((S_1 \times \cdots \times S_l)\)-stable. \(\square \)

With this, we now give the proof of Theorem 6.2.

Proof of Theorem 6.2

The statement of this result, as well as its proof, is quite similar to that of the evaluation symbol version of Grace’s theorem given in Corollary 5.11. We explicitly give the proof anyway, as it is rather short and straightforward.

(i) \(\Rightarrow \) (ii). Immediate.

(ii) \(\Rightarrow \) (iii). Fix \((a,b) = (a_1,b_1,\ldots ,a_m,b_m) \in {\mathbb {C}}^{2m}\) such that \((a_j:b_j) \in B_j\) for all \(j \in [m]\), and fix \((c,d) = (c_1,d_1,\ldots ,c_l,d_l) \in {\mathbb {C}}^{2l}\) such that \((c_j:d_j) \in S_j\) for all \(j \in [l]\). Let \(k_{\lambda ,\alpha } := (-1)^\alpha (\lambda !)^2(\alpha !)^2\). Since \(B_j\) and \(C_j\) are disjoint, we have that \((a_j:b_j) \not \in C_j\) for all j, and therefore \({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)}) = (bx-ay)^\lambda \in V(\lambda )\) is an evaluation symbol which is \((C_1 \times \cdots \times C_m)\)-stable. Using the evaluation symbol lemma (Lemma 5.10) and the symbol lemma (Lemma 3.7), we compute:

$$\begin{aligned} (-1)^\lambda k_{\lambda ,\alpha } {{\,\mathrm{Symb}\,}}(T)(a,b,c,d)&= D^{\lambda \sqcup \alpha }\big ({{\,\mathrm{Symb}\,}}(T) \otimes (bx-ay)^\lambda (dx-cy)^\alpha \big ) \\&= k_{\lambda ,\alpha } T[(bx-ay)^\lambda ](c,d) \end{aligned}$$

By (ii) the last expression is nonzero, and thus \({{\,\mathrm{Symb}\,}}(T)(a,b,c,d) \ne 0\). This implies \({{\,\mathrm{Symb}\,}}(T)\) is \((B_1 \times \cdots \times B_m) \times (S_1 \times \cdots \times S_l)\)-stable.

(iii) \(\Rightarrow \) (i). Proposition 6.5 above. \(\square \)

As mentioned above, the previous proposition gives a slightly stronger result in the (symbol stability \(\Rightarrow \) operator stability) direction. Using it, we revisit the additive and multiplicative convolutions with a more algebraic/symbolic mentality.

Example 6.6

By Definition 3.6, the \({{\,\mathrm{Symb}\,}}\) map gives a bijection between certain spaces of linear operators and polynomials. So, we can uniquely define a linear operator by giving its symbol. Using this idea, we specify \(T \in {{\,\mathrm{Hom}\,}}(V(n,n), V(n))\) by defining its symbol in \(V(n,n,n)\) with variables \((z,w)\), \((t,s)\), \((x,y)\) as follows:

$$\begin{aligned} {{\,\mathrm{Symb}\,}}(T) := {{\,\mathrm{Hmg}\,}}_{(n,n,n)}\left[ (x - (z+t))^n\right] = (xws - (zs + tw)y)^n \end{aligned}$$

Now, let us consider the additive convolution \(*_+^n\) as an element of \({{\,\mathrm{Hom}\,}}(V(n,n),V(n))\) in the following way. Since \(V(n,n) \cong V(n) \boxtimes V(n)\), we define \(*_+^n\) on elements \(p \boxtimes q \in V(n) \boxtimes V(n)\) via \(*_+^n(p \boxtimes q) := p *_+^n q\) and extend linearly. We then compute \({{\,\mathrm{Symb}\,}}(*_+^n)\) as follows:

$$\begin{aligned} {{\,\mathrm{Symb}\,}}(*_+^n) = *_+^n\left[ (zy-xw)^n \boxtimes (ty-xs)^n\right] = (xws - (zs + tw)y)^n \end{aligned}$$

That is, \(*_+^n\) is the operator that has our desired symbol. Fixing any \(a,b,c,d \in {\mathbb {R}}\) such that \(a<b\) and \(c<d\), we define the sets \(C_1 := {\overline{{\mathcal {H}}_+}} {\setminus } (a,b)\), \(C_2 := {\overline{{\mathcal {H}}_+}} {\setminus } (c,d)\), \(B_1 := {\mathcal {H}}_-\cup [a,b]\), \(B_2 := {\mathcal {H}}_-\cup [c,d]\), and \(S := {\overline{{\mathcal {H}}_+}} {\setminus } [a+c,b+d]\). Proposition 6.5 then implies \(p *_+^n q\) has all its roots in \([a+c,b+d]\) whenever \(p,q \in V_{\mathbb {R}}(n)\) have all their roots in \((a,b)\) and \((c,d)\), respectively. For real-rooted \(p,q\) of degree n, this implies:

$$\begin{aligned} {{\,\mathrm{minroot}\,}}(p) + {{\,\mathrm{minroot}\,}}(q) \le {{\,\mathrm{minroot}\,}}(p *_+^n q) \le {{\,\mathrm{maxroot}\,}}(p *_+^n q) \le {{\,\mathrm{maxroot}\,}}(p) + {{\,\mathrm{maxroot}\,}}(q) \end{aligned}$$

Notice that we actually get a bit more. For \((C_1 \times C_2)\)-stable \(r := \sum _j p_j \boxtimes q_j \in V(n) \boxtimes V(n) \cong V(n,n)\), we have that \(*_+^n[r]\) is S-stable. That is, \(*_+^n\) has stability properties as an operator in \({{\,\mathrm{Hom}\,}}(V(n,n),V(n))\), not just as a convolution between two polynomials in V(n).
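As a numerical sanity check of the root bounds above, the following sketch implements the dehomogenized (\(y = 1\)) form of the additive convolution formula from Example 6.3 using NumPy. The helper name `additive_convolution` and the sample polynomials are our own illustrative choices, and roots are only compared up to floating-point error.

```python
import math

import numpy as np
from numpy.polynomial import Polynomial


def additive_convolution(p: Polynomial, q: Polynomial, n: int) -> Polynomial:
    """Return (1/n!) * sum_k p^(k)(x) * q^(n-k)(0) for polynomials of degree <= n.

    Hypothetical helper: this is the formula of Example 6.3 with y set to 1.
    """
    derivs_p, derivs_q = [p], [q]
    for _ in range(n):
        derivs_p.append(derivs_p[-1].deriv())
        derivs_q.append(derivs_q[-1].deriv())
    total = Polynomial([0.0])
    for k in range(n + 1):
        # q^(n-k)(0) is a scalar weight on the k-th derivative of p.
        total = total + derivs_p[k] * float(derivs_q[n - k](0.0))
    return total / math.factorial(n)


# Illustrative real-rooted inputs of degree 3 (our choice of roots).
p = Polynomial.fromroots([1.0, 2.0, 4.0])
q = Polynomial.fromroots([0.0, 3.0, 5.0])
r = additive_convolution(p, q, 3)
roots = r.roots()

lower = 1.0 + 0.0   # minroot(p) + minroot(q)
upper = 4.0 + 5.0   # maxroot(p) + maxroot(q)
print(roots, lower, upper)
```

For these inputs the computed roots can be compared against `lower` and `upper` directly. The special case \(p = (x-a)^n\), \(q = (x-b)^n\) returns \((x-(a+b))^n\), which matches the symbol computation above.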

Example 6.7

As in the previous example, we can consider the multiplicative convolution \(*_\times ^n\) as an element of \({{\,\mathrm{Hom}\,}}(V(n,n),V(n))\) by defining \(*_\times ^n(p \boxtimes q) := p *_\times ^n q\) on elements \(p \boxtimes q \in V(n) \boxtimes V(n) \cong V(n,n)\) and extending linearly. We then compute its symbol in \(V(n,n,n)\) with variables \((z,w)\), \((t,s)\), \((x,y)\) as follows:

$$\begin{aligned} {{\,\mathrm{Symb}\,}}(*_\times ^n) = *_\times ^n\left[ (zy-xw)^n \boxtimes (ty-xs)^n\right] = (xws - zty)^n = {{\,\mathrm{Hmg}\,}}_{(n,n,n)}\left[ (x-zt)^n\right] \end{aligned}$$

Fixing any \(a,b,c,d \in {\mathbb {R}}_+\) such that \(0<a<b\) and \(0<c<d\), we define the sets \(C_1\), \(C_2\), \(B_1\), and \(B_2\) as in the previous example. We then define \(S := {\overline{{\mathbb {R}}}} {\setminus } [ac, bd]\). Proposition 6.5 then implies \(p *_\times ^n q\) has all its real roots in \([ac,bd]\) whenever \(p,q \in V_{\mathbb {R}}(n)\) have all their roots in \((a,b)\) and \((c,d)\), respectively. (Notice that we could not apply the proposition if \({\mathcal {H}}_+\subset S\) or \({\mathcal {H}}_-\subset S\).) Since Example 6.4 implies \(p *_\times ^n q\) is positive-rooted (and hence, real-rooted) whenever p and q are, this implies:

$$\begin{aligned} {{\,\mathrm{minroot}\,}}(p) \cdot {{\,\mathrm{minroot}\,}}(q) \le {{\,\mathrm{minroot}\,}}(p *_\times ^n q) \le {{\,\mathrm{maxroot}\,}}(p *_\times ^n q) \le {{\,\mathrm{maxroot}\,}}(p) \cdot {{\,\mathrm{maxroot}\,}}(q) \end{aligned}$$

As in the previous example, we also obtain stability properties for \(*_\times ^n\) as an operator in \({{\,\mathrm{Hom}\,}}(V(n,n),V(n))\), and not just as a polynomial convolution.
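A small sanity check of this example can be sketched in Python. The coefficient formula used below is an assumption chosen so that \((x-z)^n\) and \((x-t)^n\) convolve to \((x-zt)^n\), in agreement with the dehomogenized symbol computed above: for \(p = \sum_k p_k x^k\) and \(q = \sum_k q_k x^k\), we take \(p *_\times ^n q = \sum_k (-1)^{n-k} \binom{n}{k}^{-1} p_k q_k x^k\). The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, sqrt

def poly_from_roots(roots):
    """Monic polynomial with the given roots; coefficients listed from degree 0 upward."""
    coeffs = [Fraction(1)]
    for root in roots:
        new = [Fraction(0)] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i + 1] += c
            new[i] -= root * c
        coeffs = new
    return coeffs

def multiplicative_convolution(p, q):
    """Assumed definition: coefficientwise product, rescaled by (-1)^(n-k) / binom(n, k)."""
    n = len(p) - 1
    return [(-1) ** (n - k) * p[k] * q[k] / comb(n, k) for k in range(n + 1)]

# Symbol check: (x - 2)^4 convolved with (x - 3)^4 should be (x - 6)^4.
symbol_ok = multiplicative_convolution(
    poly_from_roots([2] * 4), poly_from_roots([3] * 4)
) == poly_from_roots([6] * 4)

# Root bounds in degree 2, using the quadratic formula.
p = poly_from_roots([1, 2])   # minroot 1, maxroot 2
q = poly_from_roots([3, 5])   # minroot 3, maxroot 5
r = multiplicative_convolution(p, q)
disc = float(r[1] ** 2 - 4 * r[2] * r[0])
lo = (-float(r[1]) - sqrt(disc)) / 2
hi = (-float(r[1]) + sqrt(disc)) / 2
print(symbol_ok, 1 * 3 <= lo <= hi <= 2 * 5)
```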

Using similar techniques, we can also circumvent the issue that arises from the fact that \(\partial _x\) only preserves weak stability.

Example 6.8

For fixed \(n \ge 1\), consider the operator \(\partial _x \in {{\,\mathrm{Hom}\,}}(V(n),V(n-1))\). We compute:

$$\begin{aligned} {{\,\mathrm{Symb}\,}}(\partial _x) = \partial _x[(zy-xw)^n] = -nw(zy-xw)^{n-1} = {{\,\mathrm{Hmg}\,}}_{(n,n-1)}\left[ -n(z-x)^{n-1}\right] \end{aligned}$$

For any \(a,b \in {\mathbb {R}}\) such that \(a<b\), it is straightforward to see that \({{\,\mathrm{Symb}\,}}(\partial _x)\) is \((C \times B)\)-stable for \(C := {\mathcal {H}}_-\cup (a,b)\) and \(B := {\overline{{\mathcal {H}}_+}} {\setminus } (a,b)\), where the variables are ordered \((z,w)\), \((x,y)\). (Notice that this does not hold when \(\infty \in C\), due to the w factor in the symbol.) Since \((C,B)\) is a disjoint Grace pair, Theorem 6.2 implies \(\partial _x\) preserves strong B-stability.

With this, let \(f \in {\mathbb {C}}^n[x]\) be a \({\mathcal {H}}_+\)-stable polynomial of degree \(1 \le m \le n\), and let \(p \in V(m)\) be its degree-m homogenization. Then p has no roots at infinity, and therefore there exists \(a < b\) such that p is \(\big ({\overline{{\mathcal {H}}_+}} {\setminus } (a,b)\big )\)-stable. The previous discussion implies \(\partial _x p\) is \(\big ({\overline{{\mathcal {H}}_+}} {\setminus } (a,b)\big )\)-stable, and in particular \(\partial _x p\) is \({\mathcal {H}}_+\)-stable. Since \(\partial _x\) commutes with homogenization, this also implies \(\partial _x f\) is \({\mathcal {H}}_+\)-stable.
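As a minimal illustration of this conclusion (a worked instance chosen here, not taken from the text above), consider the \({\mathcal {H}}_+\)-stable polynomial with roots \(-i\) and \(-2i\):

$$\begin{aligned} f(x) = (x+i)(x+2i) = x^2 + 3ix - 2, \qquad \partial _x f(x) = 2x + 3i \end{aligned}$$

The only root of \(\partial _x f\) is \(-\tfrac{3}{2}i \in {\mathcal {H}}_-\), so \(\partial _x f\) is again \({\mathcal {H}}_+\)-stable, as one also expects from the Gauss–Lucas theorem.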

Other issues related to weak stability preservation can be dealt with in a similar way, by considering stability regions with small intervals in \({\overline{{\mathbb {R}}}}\) about \(\infty \) attached. More generally though, the Borcea–Brändén characterization ends up being a corollary of Theorem 6.2, which we discuss and demonstrate now.

Deriving the complex Borcea–Brändén characterization

As mentioned above, we hope to obtain the Borcea–Brändén characterization from our strong stability characterization given in Theorem 6.2. To this end, we state two corollaries to Theorem 6.2, which look (naively) as close to the Borcea–Brändén characterization as possible. Let \(C^c\) denote the complement of C in \(\mathbb {CP}^1\).

Corollary 6.9

Fix \(\lambda ,\alpha \in {\mathbb {N}}_0^m\), a linear operator \(T \in {{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\), and a Grace pair of the form \((C_1 \times \cdots \times C_m, C_1^c \times \cdots \times C_m^c)\). The following are equivalent.

  1. (i)

    T preserves strong \((C_1 \times \cdots \times C_m)\)-stability.

  2. (ii)

    \({{\,\mathrm{Symb}\,}}(T)\) is \((C_1^c \times \cdots \times C_m^c) \times (C_1 \times \cdots \times C_m)\)-stable.

Corollary 6.10

Fix \(\lambda ,\alpha \in {\mathbb {N}}_0^m\) and a linear operator \(T \in {{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\). T preserves strong stability iff \({{\,\mathrm{Symb}\,}}(T)\) is \(({\overline{{\mathcal {H}}_-}}^m \times {\mathcal {H}}_+^m)\)-stable.

In Theorem 1.1, the analogous “if” direction of the previous corollary is paraphrased as follows: T preserves weak stability if the Borcea–Brändén symbol of T is stable. To see how this statement relates, we restate the definition of the Borcea–Brändén symbol:

$$\begin{aligned} {{\,\mathrm{Symb}\,}}_{BB}(T) := T\left[ (z+x)^\lambda \right] = \sum _{\mu \le \lambda } \left( {\begin{array}{c}\lambda \\ \mu \end{array}}\right) z^{\lambda -\mu } T(x^\mu ) \end{aligned}$$

Notice that by applying \(z \mapsto -z\) and homogenizing, we obtain (up to scalar) the universal symbol \({{\,\mathrm{Symb}\,}}(T)\) defined in this paper. The crucial difference then is the fact that the Borcea–Brändén “if” direction deals only with open upper half-planes, whereas the previous corollary requires closed half-plane stability of \({{\,\mathrm{Symb}\,}}(T)\) in the first m pairs of variables. That is, the required premises of the “if” direction of the previous corollary are strictly stronger than those of the Borcea–Brändén result.
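To make the conversion concrete, we can carry it out for \(T = \partial _x \in {{\,\mathrm{Hom}\,}}(V(n),V(n-1))\), using the symbol computed in Example 6.8. This is a worked check under the conventions already stated, with \(y = w = 1\) as the dehomogenization:

$$\begin{aligned} {{\,\mathrm{Symb}\,}}_{BB}(\partial _x) = \partial _x\left[ (z+x)^n\right] = n(z+x)^{n-1} \ \xrightarrow {\ z \mapsto -z\ }\ n(x-z)^{n-1}, \qquad {{\,\mathrm{Symb}\,}}(\partial _x)\big |_{y=w=1} = -n(z-x)^{n-1} = (-1)^n \, n(x-z)^{n-1} \end{aligned}$$

The two expressions agree up to the scalar \((-1)^n\).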

These two results can be reconciled, however, which we now demonstrate. The following result provides the main link to the Borcea–Brändén characterization, and it can be intuitively described as follows: with the exception of having a one-dimensional range, a linear operator which maps \((C_1 \times \cdots \times C_m)\)-stable polynomials to weak \((B_1 \times \cdots \times B_m)\)-stable polynomials can only have zeros on the boundary of the set of \((C_1 \times \cdots \times C_m)\)-stable polynomials.

Lemma 6.11

Fix \(\lambda ,\alpha \in {\mathbb {N}}_0^m\), a linear operator \(T \in {{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\), and any open circular regions \(C_1,\ldots ,C_m,B_1,\ldots ,B_m \subseteq \mathbb {CP}^1\). The following are equivalent.

  1. (i)

    T maps \((C_1 \times \cdots \times C_m)\)-stable polynomials to weakly \((B_1 \times \cdots \times B_m)\)-stable polynomials.

  2. (ii)

    One of the following holds:

    1. (a)

      T maps \((\overline{C_1} \times \cdots \times \overline{C_m})\)-stable polynomials to nonzero \((B_1 \times \cdots \times B_m)\)-stable polynomials.

    2. (b)

      \(T \equiv p_0 \cdot \psi \) for some weakly \((B_1 \times \cdots \times B_m)\)-stable polynomial \(p_0 \in V(\alpha )\) and some linear functional \(\psi \).

Proof

By appropriate \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) action, we can assume WLOG that \(C_k = B_k = {\mathbb {D}}\), the unit disc, for all \(k \in [m]\).

(i) \(\Rightarrow \) (ii). We show that if (a) is not the case, then (b) must hold. It follows from (i) that T maps \({\overline{{\mathbb {D}}}}^m\)-stable polynomials to (possibly identically zero) \({\mathbb {D}}^m\)-stable polynomials. So, if (a) is not the case, we have that \(T(p) \equiv 0\) for some \({\overline{{\mathbb {D}}}}^m\)-stable polynomial \(p \in V(\lambda )\).

The rest of the argument is essentially the proof of necessity found in [1] for Theorem 1.1. Since the set of nonzero \({\overline{{\mathbb {D}}}}^m\)-stable polynomials is open in \(V(\lambda )\), there is some ball \(B(p) \subset V(\lambda )\) centered at p such that B(p) contains only \({\overline{{\mathbb {D}}}}^m\)-stable polynomials. So, T[B(p)] is an open set in the image of T containing 0 and otherwise consisting of \({\mathbb {D}}^m\)-stable polynomials. Therefore, the image of T is a vector space consisting of \({\mathbb {D}}^m\)-stable polynomials. Lemma 4.12 (and appropriate \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) action) then implies the image of T is of dimension \(\le 1\), and (b) follows.

(ii) \(\Rightarrow \) (i). If (b) holds, then (i) is immediate. Otherwise, fix \(p \in V(\lambda )\) such that p is \({\mathbb {D}}^m\)-stable. For all \(n \in {\mathbb {N}}\), define:

$$\begin{aligned} p_n := p((1-n^{-1})x_1, y_1, (1-n^{-1})x_2, y_2, \ldots , (1-n^{-1})x_m, y_m) \end{aligned}$$

So, \(p_n\) is \({\overline{{\mathbb {D}}}}^m\)-stable for all n, and \(\lim _{n \rightarrow \infty } p_n = p\) coefficient-wise. By (ii), \(T(p_n)\) is \({\mathbb {D}}^m\)-stable for all n, and by continuity, \(T(p) = \lim _{n \rightarrow \infty } T(p_n)\). Hurwitz’s theorem then implies T(p) is either identically zero or \({\mathbb {D}}^m\)-stable. \(\square \)
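The scaling step in (ii) \(\Rightarrow \) (i) can be seen in the simplest case \(m = 1\), \(\lambda = 1\). The following instance is an illustration chosen here:

$$\begin{aligned} p = x - y, \qquad p_n = (1-n^{-1})x - y, \qquad p_n(x_0,y_0) = 0 \iff (x_0:y_0) = \left( \tfrac{n}{n-1} : 1\right) \quad (n \ge 2) \end{aligned}$$

Here p vanishes only at \((1:1)\), which lies on the boundary of \({\mathbb {D}}\), so p is \({\mathbb {D}}\)-stable but not \({\overline{{\mathbb {D}}}}\)-stable. Each \(p_n\) has its root at \(\tfrac{n}{n-1} > 1\) and is therefore \({\overline{{\mathbb {D}}}}\)-stable, while \(p_n \rightarrow p\) coefficient-wise.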

This lemma then yields the following corollaries to Theorem 6.2. Applying the necessary maps to convert \({{\,\mathrm{Symb}\,}}\) to \({{\,\mathrm{Symb}\,}}_{BB}\) as discussed above, these results give precisely the Borcea–Brändén characterization proven in Theorem 1.1 and more generally in Theorem 6.3 of [1]. In particular, Corollary 6.12 can be seen as a unification of the complex characterization results of [1].

Corollary 6.12

Fix \(\lambda ,\alpha \in {\mathbb {N}}_0^m\), a linear operator \(T \in {{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\), and any open circular regions \(C_1, \ldots , C_m\). The following are equivalent.

  1. (i)

    T preserves weak \((C_1 \times \cdots \times C_m)\)-stability.

  2. (ii)

    One of the following holds:

    1. (a)

      \({{\,\mathrm{Symb}\,}}(T)\) is \((\overline{C_1}^c \times \cdots \times \overline{C_m}^c) \times (C_1 \times \cdots \times C_m)\)-stable.

    2. (b)

      \(T \equiv p_0 \cdot \psi \) for some weakly \((C_1 \times \cdots \times C_m)\)-stable polynomial \(p_0 \in V(\alpha )\) and some linear functional \(\psi \).

Proof

The result follows from Lemma 6.11 and Theorem 6.2 applied to an operator T which maps \((\overline{C_1} \times \cdots \times \overline{C_m})\)-stable polynomials to nonzero \((C_1 \times \cdots \times C_m)\)-stable polynomials. \(\square \)

Corollary 6.13

Fix \(\lambda ,\alpha \in {\mathbb {N}}_0^m\) and a linear operator \(T \in {{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\). T preserves weak stability iff one of the following holds:

  1. (a)

    \({{\,\mathrm{Symb}\,}}(T)\) is \(({\mathcal {H}}_-^m \times {\mathcal {H}}_+^m)\)-stable.

  2. (b)

    \(T \equiv p_0 \cdot \psi \) for some weakly stable polynomial \(p_0 \in V(\alpha )\) and some linear functional \(\psi \).

Notice that our naive guess at strong stability results which emulate the Borcea–Brändén characterization (Corollaries 6.9 and 6.10) was incorrect. We actually needed to consider closed circular stability regions \(\overline{C_k}\), so that their complements in \(\mathbb {CP}^1\) would be open (i.e., to ensure Grace pair disjointness, which is required to apply Theorem 6.2). We see this play out in condition (ii)(a) of Corollary 6.12.

Stability properties of real linear operators

Borcea and Brändén also characterized the class of weak real stability preserving linear operators. As in the complex case, they showed that weak real stability preservation of a linear operator T is almost equivalent to real stability of the associated symbol \({{\,\mathrm{Symb}\,}}_{BB}(T)\). We have to say “almost equivalent” here because there are certain weak real stability preserving operators for which the corresponding symbol is not real stable. As before, this implies a certain dimension restriction: such operators must have image of dimension at most two.

We will now characterize all strong real stability preserving linear operators. As above, strong real stability preservation will serve to eliminate the degeneracy condition of the Borcea–Brändén characterization. In this section, we duplicate the outline of our previous discussion on complex operators, making use of arguments similar to those found in [1] to fill in the gaps.

Further, we also obtain a characterization of a certain class of operators which preserve ray- and interval-rootedness. The question of a full characterization of such operators is as of yet still an open problem (see [3]). Here, we answer this question for operators which preserve both strong ray- or interval-rootedness as well as weak real-rootedness.

Weak real stability preservation

Borcea and Brändén obtain the following characterization of weak real stability preserving linear operators. Recall the notion of proper position (denoted by \(\ll \)) given in Definition 4.8.

Theorem 1.2 (Borcea–Brändén)

Fix \(\lambda \in {\mathbb {N}}_0^m\) and any linear operator \(T: {\mathbb {R}}_\lambda [x_1,\ldots ,x_m] \rightarrow {\mathbb {R}}[x_1,\ldots ,x_m]\). The following are equivalent.

  1. (i)

    T maps real stable polynomials to weakly real stable polynomials.

  2. (ii)

    One of the following holds:

    1. (a)

      \({{\,\mathrm{Symb}\,}}_{BB}(T)\) is \({\mathcal {H}}_+^{2m}\)-stable.

    2. (b)

      \({{\,\mathrm{Symb}\,}}_{BB}(T)\) is \(({\mathcal {H}}_-^m \times {\mathcal {H}}_+^m)\)-stable.

    3. (c)

      T has image of dimension at most two, and is of the form

      $$\begin{aligned} T: p \mapsto q \cdot \psi _1(p) + r \cdot \psi _2(p) \end{aligned}$$

      where \(q,r \in {\mathbb {R}}[x_1,\ldots ,x_m]\) are weakly real stable such that \(q \ll r\), and \(\psi _1,\psi _2\) are real linear functionals.

As in the case of complex operators, the degeneracy condition (ii)(c) is the result of allowing weak real stability preserving operators. We now give an example which demonstrates its necessity.

Example 7.1

Define \(T: {\mathbb {R}}_n[x] \rightarrow {\mathbb {R}}[x]\) via:

$$\begin{aligned} T: \sum _{k=0}^n \left( {\begin{array}{c}n\\ k\end{array}}\right) a_k x^k \mapsto a_nx^n + a_{n-2}x^{n-1} = (a_nx + a_{n-2})x^{n-1} \end{aligned}$$

This operator obviously preserves weak real stability. We then have that \({{\,\mathrm{Symb}\,}}_{BB}(T) = (z^2+x)x^{n-1}\), which is neither \({\mathcal {H}}_+^2\)-stable nor \(({\mathcal {H}}_-\times {\mathcal {H}}_+)\)-stable.
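Both failures of stability can be witnessed by explicit zeros. The Python sketch below is an illustration only, with hypothetical function names. It applies T to \((z+x)^n\) coefficient by coefficient for numeric z and evaluates the result at x. The witness points \((z,x) = (-1+i, 2i) \in {\mathcal {H}}_+^2\) and \((z,x) = (1-i, 2i) \in {\mathcal {H}}_-\times {\mathcal {H}}_+\) are chosen so that \(z^2 = -2i = -x\).

```python
from math import comb

def apply_T(coeffs):
    """Apply T to p = sum_k coeffs[k] x^k of degree at most n = len(coeffs) - 1 (n >= 2).

    Writing coeffs[k] = binom(n, k) * a_k, the image is a_n x^n + a_{n-2} x^(n-1).
    """
    n = len(coeffs) - 1
    a = [c / comb(n, k) for k, c in enumerate(coeffs)]
    out = [0] * (n + 1)
    out[n] = a[n]
    out[n - 1] = a[n - 2]
    return out

def bb_symbol_at(z, x, n):
    """Evaluate Symb_BB(T) = T[(z + x)^n] at the numeric point (z, x)."""
    coeffs = [comb(n, k) * z ** (n - k) for k in range(n + 1)]  # (z + x)^n expanded in x
    image = apply_T(coeffs)
    return sum(c * x ** k for k, c in enumerate(image))

n = 4
zero_in_upper_upper = bb_symbol_at(-1 + 1j, 2j, n)   # z and x both in the upper half-plane
zero_in_lower_upper = bb_symbol_at(1 - 1j, 2j, n)    # z in the lower, x in the upper half-plane

# Agreement with the closed form (z^2 + x) x^(n-1) at an arbitrary test point.
z0, x0 = 0.5 + 2j, -1 + 0.3j
closed_form_gap = abs(bb_symbol_at(z0, x0, n) - (z0 ** 2 + x0) * x0 ** (n - 1))
print(abs(zero_in_upper_upper), abs(zero_in_lower_upper), closed_form_gap)
```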

Again, the degeneracy condition is required for the characterization but obscures the connection between an operator and its symbol. To remove it, we now turn to our characterization of strong real stability preserving operators.

Strong real stability preservation

We state and prove our strong real stability preservation characterization here, and then derive the Borcea–Brändén characterization as a corollary. The proof here takes a bit more work than in the complex case, and will rely on many of the real stability results discussed in Sect. 4.3. This extra work is essentially taken from the proof of Theorem 1.2 found in [1].

Theorem 7.2

Fix \(\lambda \in {\mathbb {N}}_0^m\), \(\alpha \in {\mathbb {N}}_0^l\), and a linear operator \(T \in {{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\) such that T restricts to a real linear operator from \(V_{\mathbb {R}}(\lambda )\) to \(V_{\mathbb {R}}(\alpha )\). The following are equivalent.

  1. (i)

    T preserves strong real stability.

  2. (ii)

    \({{\,\mathrm{Symb}\,}}(T)\) is either \(({\overline{{\mathcal {H}}_-}}^m \times {\mathcal {H}}_+^l)\)-stable or \(({\overline{{\mathcal {H}}_-}}^m \times {\mathcal {H}}_-^l)\)-stable.

Proof

(i) \(\Rightarrow \) (ii). Fixing \((z_0:w_0) \in {\overline{{\mathcal {H}}_-}}^m\), we have that \((w_0x - z_0y)^\lambda \) is \({\mathcal {H}}_+^m\)-stable. If \((z_0:w_0) \in {\overline{{\mathbb {R}}}}^m\), then \(T[(w_0x - z_0y)^\lambda ]\) is nonzero and real stable by assumption. Combining the symbol lemma (Lemma 3.7) and the evaluation symbol lemma (Lemma 5.10), this implies \({{\,\mathrm{Symb}\,}}(T)(z_0,w_0,x,y) = (-1)^\lambda T[(w_0x - z_0y)^\lambda ]\) is both \({\mathcal {H}}_+^l\)-stable and \({\mathcal {H}}_-^l\)-stable.

On the other hand, suppose \((z_0:w_0) \not \in {\overline{{\mathbb {R}}}}^m\). By Lemma 4.10, we have that \(T[(w_0x - z_0y)^\lambda ]\) is either \({\mathcal {H}}_+^l\)-stable, \({\mathcal {H}}_-^l\)-stable, or zero. Now suppose there are \((z_0:w_0), (z_0':w_0') \in {\overline{{\mathcal {H}}_-}}^m {\setminus } {\overline{{\mathbb {R}}}}^m\) such that \(T[(w_0x - z_0y)^\lambda ]\) is \({\mathcal {H}}_+^l\)-stable and \(T[(w_0'x - z_0'y)^\lambda ]\) is \({\mathcal {H}}_-^l\)-stable. By a homotopy argument, there exists \((z_0'':w_0'') \in {\overline{{\mathcal {H}}_-}}^m {\setminus } {\overline{{\mathbb {R}}}}^m\) such that \(T[(w_0''x - z_0''y)^\lambda ]\) is \(({\mathcal {H}}_+^l \cup {\mathcal {H}}_-^l)\)-stable or zero. By Lemma 4.11, \(T[c_0(w_0''x - z_0''y)^\lambda ]\) is either real stable or zero for some complex scalar \(c_0 \ne 0\).

Let \(c_0(w_0''x - z_0''y)^\lambda = q(x,y) + ir(x,y)\) for \(q,r \in V_{\mathbb {R}}(\lambda )\), which are both real stable or zero by Hermite-Biehler (Proposition 4.9). Note further that \(r \not \equiv 0\) since \((z_0'':w_0'') \not \in {\overline{{\mathbb {R}}}}^m\). However, since \(T(q + ir) = T(q) + iT(r)\) is real stable or zero and T restricts to a real linear operator, it must be that \(T(r) \equiv 0\). This contradicts the fact that T strongly preserves real stability.

So, \({{\,\mathrm{Symb}\,}}(T)(z_0,w_0,x,y) = (-1)^\lambda T[(w_0x - z_0y)^\lambda ]\) is either \({\mathcal {H}}_+^l\)-stable (in the \((x,y)\) variables) for all \((z_0:w_0) \in {\overline{{\mathcal {H}}_-}}^m {\setminus } {\overline{{\mathbb {R}}}}^m\), or \({\mathcal {H}}_-^l\)-stable for all \((z_0:w_0) \in {\overline{{\mathcal {H}}_-}}^m {\setminus } {\overline{{\mathbb {R}}}}^m\). Combining this with the \((z_0:w_0) \in {\overline{{\mathbb {R}}}}^m\) case, we have that \({{\,\mathrm{Symb}\,}}(T)\) is either \(({\overline{{\mathcal {H}}_-}}^m \times {\mathcal {H}}_+^l)\)-stable or \(({\overline{{\mathcal {H}}_-}}^m \times {\mathcal {H}}_-^l)\)-stable.

(ii) \(\Rightarrow \) (i). By the complex stability characterization (Theorem 6.2), T maps \({\mathcal {H}}_+^m\)-stable polynomials to either nonzero \({\mathcal {H}}_+^l\)-stable polynomials or nonzero \({\mathcal {H}}_-^l\)-stable polynomials. Since T restricts to a real linear operator on \(V_{\mathbb {R}}(\lambda )\), T preserves strong real stability. \(\square \)

As a final note, the “homotopy argument” used in the previous proof is not quite that of the proof found in [1], though it is similar. Here, one just needs to be a bit more careful about the precise homotopy with respect to points at infinity.

Deriving the real Borcea–Brändén characterization

As in the complex case, we now obtain the Borcea–Brändén weak real stability characterization as a corollary to our strong real stability characterization given in Theorem 7.2. To this end, we start by giving a sort of real stability version of Lemma 6.11. The proof of this lemma is similar in spirit to that of the strong real stability characterization given above.

Lemma 7.3

Fix \(\lambda ,\alpha \in {\mathbb {N}}_0^m\) and a linear operator \(T \in {{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\) such that T restricts to a real linear operator from \(V_{\mathbb {R}}(\lambda )\) to \(V_{\mathbb {R}}(\alpha )\). The following are equivalent.

  1. (i)

    T preserves weak real stability.

  2. (ii)

    One of the following holds:

    1. (a)

      T maps \({\overline{{\mathcal {H}}_+}}^m\)-stable polynomials to nonzero \({\mathcal {H}}_+^m\)-stable polynomials.

    2. (b)

      T maps \({\overline{{\mathcal {H}}_+}}^m\)-stable polynomials to nonzero \({\mathcal {H}}_-^m\)-stable polynomials.

    3. (c)

      T has image of dimension at most two, and is of the form

      $$\begin{aligned} T: p \mapsto q \cdot \psi _1(p) + r \cdot \psi _2(p) \end{aligned}$$

      where \(q,r \in V_{\mathbb {R}}(\alpha )\) are weakly real stable such that \(q \ll r\), and \(\psi _1,\psi _2\) are real linear functionals.

Proof

(i) \(\Rightarrow \) (ii). By the complex characterization (Theorem 6.2), we only need to consider evaluation symbols when demonstrating (a) or (b). For any \((z_0:w_0) \in {\mathcal {H}}_-^m\), Lemma 4.10 then implies \(T[(w_0x-z_0y)^\lambda ]\) is either \({\mathcal {H}}_+^m\)-stable, \({\mathcal {H}}_-^m\)-stable, or identically zero. We now show that (c) holds if (a) and (b) do not.

If neither (a) nor (b) holds for evaluation symbols, then there exist \((z_0:w_0),(z_0':w_0') \in {\mathcal {H}}_-^m\) such that \(T[(w_0x-z_0y)^\lambda ]\) is \({\mathcal {H}}_+^m\)-stable or zero and \(T[(w_0'x-z_0'y)^\lambda ]\) is \({\mathcal {H}}_-^m\)-stable or zero. As in the proof of the strong real stability characterization (Theorem 7.2), a homotopy argument implies there exists \((z_0'':w_0'') \in {\mathcal {H}}_-^m\) such that \(T[(w_0''x-z_0''y)^\lambda ]\) is \(({\mathcal {H}}_+^m \cup {\mathcal {H}}_-^m)\)-stable or zero. Lemma 4.11 then implies \(T[c_0(w_0''x-z_0''y)^\lambda ]\) is real stable or zero, for some complex scalar \(c_0 \ne 0\).

For the sake of simplicity, we denote \(q_0(x,y) = q_0(x_1,y_1,\ldots ,x_m,y_m) := c_0(w_0''x-z_0''y)^\lambda \). Since the set of \({\overline{{\mathcal {H}}_+}}^m\)-stable polynomials is open in \(V(\lambda )\), let B(0) be some open ball centered at 0 such that \(q_0 + iB(0)\) consists of nonzero \({\overline{{\mathcal {H}}_+}}^m\)-stable polynomials. So, for any \(r_0 \in B(0)\), Lemma 4.10 implies \(T(q_0) + iT(r_0)\) is either \({\mathcal {H}}_+^m\)-stable, \({\mathcal {H}}_-^m\)-stable, or zero. Hermite-Biehler then implies \(T(r_0)\) is real stable or zero whenever \(r_0 \in B(0) \cap V_{\mathbb {R}}(\lambda )\). Therefore, \(T[V_{\mathbb {R}}(\lambda )]\) consists of real stable polynomials, and (c) follows from Lemma 4.12.

(ii) \(\Rightarrow \) (i). If (c) holds, then (i) follows from Hermite-Biehler. Otherwise, suppose WLOG that (a) holds. We can then use an argument similar in spirit to that of Lemma 6.11 to show that T maps \({\mathcal {H}}_+^m\)-stable polynomials to weakly \({\mathcal {H}}_+^m\)-stable polynomials. Since T restricts to a real operator, this implies (i). \(\square \)

As in Lemma 6.11, we use the previous lemma to link the characterizations of weak and strong stability preserving operators as follows. Applying the necessary maps to convert \({{\,\mathrm{Symb}\,}}(T)\) to \({{\,\mathrm{Symb}\,}}_{BB}(T)\) below gives essentially the characterization of weak real stability preserving operators given in Theorem 1.2.

Corollary 7.4

Fix \(\lambda ,\alpha \in {\mathbb {N}}_0^m\) and a linear operator \(T \in {{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\) such that T restricts to a real linear operator from \(V_{\mathbb {R}}(\lambda )\) to \(V_{\mathbb {R}}(\alpha )\). The following are equivalent.

  1. (i)

    T preserves weak real stability.

  2. (ii)

    One of the following holds:

    1. (a)

      \({{\,\mathrm{Symb}\,}}(T)\) is \(({\mathcal {H}}_-^m \times {\mathcal {H}}_+^m)\)-stable.

    2. (b)

      \({{\,\mathrm{Symb}\,}}(T)\) is \(({\mathcal {H}}_-^m \times {\mathcal {H}}_-^m)\)-stable.

    3. (c)

      T has image of dimension at most two, and is of the form

      $$\begin{aligned} T: p \mapsto q \cdot \psi _1(p) + r \cdot \psi _2(p) \end{aligned}$$

      where \(q,r \in V_{\mathbb {R}}(\alpha )\) are weakly real stable such that \(q \ll r\), and \(\psi _1,\psi _2\) are real linear functionals.

Proof

Apply the complex characterization (Theorem 6.2) to conditions (ii)(a) and (ii)(b) of Lemma 7.3 above. \(\square \)

Ray and interval stability

We now apply the above results to projectively convex regions of the form \({\mathcal {H}}_+\cup J^c\), where \(J \subset {\overline{{\mathbb {R}}}}\) is some connected set. From this, we obtain a classification of operators which both preserve strong J-rootedness and weak real-rootedness (a polynomial \(p \in V(n)\) is J-rooted if all its roots lie in J). This of course does not completely solve the open problem of providing a classification of interval- and ray-stability preserving operators (see, e.g., [3]). However, it does seem to be the natural corollary obtained by applying proof methods similar to those of [1].

That said, we now proceed to prove the main result of this subsection, Theorem 7.8. We first start with a short-hand definition in order to simplify the proof.

Definition 7.5

Fix \(\lambda ,\alpha \in {\mathbb {N}}_0^m\) and a linear operator \(T \in {{\,\mathrm{Hom}\,}}(V(\lambda ),V(\alpha ))\) such that T restricts to a real linear operator and preserves weak real stability. We say T is degenerate if it satisfies condition (ii)(c) of Corollary 7.4.

We now prove two lemmas. The first is straightforward, but rather interesting in its own right.

Lemma 7.6

Fix a closed bounded interval \(J \subset {\mathbb {R}}\) and a subspace \(W \subseteq V_{\mathbb {R}}(n)\) consisting of weakly real-rooted polynomials. Let \(S \subseteq W\) denote the subset of top-degree monic J-rooted polynomials. There exist \(p,q \in S\) such that \(p \ll q\) and S is the convex hull of p and q.

Proof

Lemma 4.12 implies W is of dimension at most two, and so then Lemma 4.13 implies the relation \(\ll \) is a total order on S. Applying the root ordering property of Lemma 4.13, the closedness of S implies there are \(p,q \in S\) such that \(p \ll q\) and \(p \ll r \ll q\) for all \(r \in S\). Basic sign arguments and the fact that S is contained in the span of \(\{p,q\}\) then imply S is the convex hull of \(\{p,q\}\). \(\square \)

The second lemma is perhaps less straightforward in terms of proof, but follows from the following intuitive idea: an open ball in some complex subspace of polynomials yields, roughly speaking, an open ball of zeros.

Lemma 7.7

Fix \(n,m \in {\mathbb {N}}_0\) and a linear operator \(T \in {{\,\mathrm{Hom}\,}}(V(n),V(m))\) which restricts to a real linear operator and preserves weak real-rootedness. If there exist some \({\overline{{\mathcal {H}}_+}}\)-stable \(p_0 \in V(n)\) and some \((x_0:y_0) \in {\overline{{\mathbb {R}}}}\) such that \(T(p_0)(x_0,y_0) = 0\), then one of the following holds:

  1. (a)

    \(T(p_0)\) is real-rooted or identically zero.

  2. (b)

    \(T(p)(x_0,y_0) = 0\) for all \(p \in V(n)\).

Proof

Let \(q_0,r_0 \in V_{\mathbb {R}}(m)\) be such that \(T(p_0) = q_0 + ir_0\). Also suppose that \(T(p_0) \not \equiv 0\) and that (b) does not hold, and let \(p_1\) be such that \(T(p_1)(x_0,y_0) \ne 0\). WLOG, we may also assume \(p_1 \in V_{\mathbb {R}}(n)\) by considering its real or imaginary part. We will now prove that \(T(p_0)\) must be real-rooted.

First, suppose further that \(T(p_0)\) has a multiple root at \((x_0,y_0)\). For small fixed \(\epsilon \), \(p_0 + \epsilon p_1\) is \({\overline{{\mathcal {H}}_+}}\)-stable and so Lemma 4.10 implies \(T(p_0 + \epsilon p_1)\) is either \({\mathcal {H}}_+\)-stable or \({\mathcal {H}}_-\)-stable. Hermite-Biehler (Proposition 4.9) then implies \(q_0 + \epsilon T(p_1)\) and \(r_0\) have interlacing roots. However, since T restricts to a real linear operator, it must be that \(q_0\) and \(r_0\) both have a multiple root at \((x_0,y_0)\). The fact that \(q_0 + \epsilon T(p_1)\) has no root at \((x_0,y_0)\) yields a contradiction, as interlacing is then impossible.

Otherwise, \(T(p_0)\) has a simple root at \((x_0,y_0)\). Define \(R \in {{\,\mathrm{Hom}\,}}(V(n),V(1))\) via \(R := d_\phi ^{n-1} \circ T\), where \(\phi \in {{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) is such that \((x_0:y_0)\) is the pole of \(\phi \). We have that \(R(p_0)(x_0,y_0) = 0\), but \(R(p_0) \not \equiv 0\) since the root is simple. Further, \(R(p_1)(x_0,y_0) \ne 0\), and therefore R is a surjective continuous linear map. By the open mapping theorem, there exists a one-real-dimensional curve \(\Gamma \subset V(n)\) through \(p_0\), for which \(R(\Gamma )\) contains elements with root in \({\mathcal {H}}_+\) on one side of \(p_0\) (call this side \(\Gamma _+\)) and elements with root in \({\mathcal {H}}_-\) on the other side (call it \(\Gamma _-\)). So by Laguerre’s theorem (Proposition 4.6), elements of \(T(\Gamma _+)\) have some roots in \({\mathcal {H}}_+\) and elements of \(T(\Gamma _-)\) have some roots in \({\mathcal {H}}_-\). Since polynomials near \(p_0\) are \({\overline{{\mathcal {H}}_+}}\)-stable, Lemma 4.10 implies elements of \(T(\Gamma \cap B_\epsilon (p_0))\) are all \({\mathcal {H}}_+\)-stable or \({\mathcal {H}}_-\)-stable for some small ball \(B_\epsilon (p_0)\) about \(p_0\). So elements of \(T(\Gamma _+ \cap B_\epsilon (p_0))\) are \({\mathcal {H}}_-\)-stable and elements of \(T(\Gamma _- \cap B_\epsilon (p_0))\) are \({\mathcal {H}}_+\)-stable, and therefore \(T(p_0)\) is real-rooted. \(\square \)

We now prove our main result on ray- and interval-stability preserving operators. First we state the theorem for closed bounded output intervals, as it clarifies the proof quite a bit. We will then extend the result to other connected regions in \({\overline{{\mathbb {R}}}}\).

Theorem 7.8

Fix \(n,m \in {\mathbb {N}}_0\) and a linear operator \(T \in {{\,\mathrm{Hom}\,}}(V(n),V(m))\) which restricts to a real linear operator. Further, let \(I \subseteq {\mathbb {R}}\) be any interval, and let \(J \subset {\mathbb {R}}\) be any closed bounded interval. The following are equivalent.

  1. (i)

    T preserves weak real-rootedness and maps I-rooted polynomials to nonzero J-rooted polynomials.

  2. (ii)

    One of the following holds:

    1. (a)

      \({{\,\mathrm{Symb}\,}}(T)\) is \(({\mathcal {H}}_-\cup I) \times ({\overline{{\mathcal {H}}_+}} {\setminus } J)\)-stable.

    2. (b)

      \({{\,\mathrm{Symb}\,}}(T)\) is \(({\mathcal {H}}_-\cup I) \times ({\overline{{\mathcal {H}}_-}} {\setminus } J)\)-stable.

    3. (c)

      T has image of dimension at most two, and is of the form

      $$\begin{aligned} T: p \mapsto q \cdot \psi _1(p) + r \cdot \psi _2(p) \end{aligned}$$

      where \(q,r \in V_{\mathbb {R}}(m)\) are top-degree monic and weakly J-rooted such that \(q \ll r\), and \(\psi _1\) and \(\psi _2\) are real linear functionals such that \(\psi _1(p) \cdot \psi _2(p) \ge 0\) (not both zero) holds for any I-rooted p.

Proof

(i) \(\Rightarrow \) (ii). Suppose T is nondegenerate. So, \({{\,\mathrm{Symb}\,}}(T)\) is either \(({\mathcal {H}}_-\times {\mathcal {H}}_+)\)-stable or \(({\mathcal {H}}_-\times {\mathcal {H}}_-)\)-stable by Corollary 7.4. By Lemma 7.3, either T maps \({\overline{{\mathcal {H}}_+}}\)-stable evaluation symbols entirely to nonzero \({\mathcal {H}}_+\)-stable polynomials or entirely to nonzero \({\mathcal {H}}_-\)-stable polynomials. If for some \((z_0:w_0) \in {\mathcal {H}}_-\) we have that \(T[(w_0x-z_0y)^n]\) has a root in \({\overline{{\mathbb {R}}}}\), then we can apply the previous lemma. If condition (a) of the lemma holds, then \(T[(w_0x-z_0y)^n]\) is real-rooted or identically zero. The proof of Lemma 7.3 then implies T is degenerate, a contradiction. Otherwise condition (b) of the lemma holds, and therefore the real roots of \(T[(w_0x-z_0y)^n]\) must be in J. So in fact, T maps \({\overline{{\mathcal {H}}_+}}\)-stable evaluation symbols entirely to nonzero \(({\overline{{\mathcal {H}}_+}} {\setminus } J)\)-stable polynomials or entirely to nonzero \(({\overline{{\mathcal {H}}_-}} {\setminus } J)\)-stable polynomials. Finally, T maps I-rooted evaluation symbols to nonzero \(({\overline{{\mathcal {H}}_+}} {\setminus } J)\)-stable and \(({\overline{{\mathcal {H}}_-}} {\setminus } J)\)-stable polynomials by assumption. The complex characterization (Theorem 6.2) then implies (a) or (b).

Otherwise, T is degenerate and \(T[V_{\mathbb {R}}(n)]\) consists entirely of real-rooted polynomials. Condition (c) follows from Lemma 7.6.

(ii) \(\Rightarrow \) (i). By Corollary 7.4, T preserves weak real-rootedness. If (a) or (b) holds, then the complex characterization (Theorem 6.2) and the fact that T restricts to a real operator imply T maps I-rooted polynomials to nonzero J-rooted polynomials.

Otherwise (c) holds. For any real-rooted p, let \(\lambda (p)\) and \(\mu (p)\) denote the largest and smallest roots of p, respectively. Since \(q,r\) are top-degree monic, every convex combination of q and r has all its roots in the interval \([\mu (q), \lambda (r)] \subseteq J\). Since \(\psi _1 \cdot \psi _2 \ge 0\) (not both zero) holds for I-rooted polynomials, we have that T maps I-rooted polynomials to nonzero J-rooted polynomials. \(\square \)
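As a small worked instance of the convex combination step (an illustration chosen here, written in dehomogenized form and assuming the convention that \(q \ll r\) places the roots of q to the left of the interlacing roots of r), take \(q = (x-1)(x-3)\) and \(r = (x-2)(x-4)\):

$$\begin{aligned} \tfrac{1}{2}q + \tfrac{1}{2}r = \tfrac{1}{2}(x^2 - 4x + 3) + \tfrac{1}{2}(x^2 - 6x + 8) = x^2 - 5x + \tfrac{11}{2}, \qquad x = \tfrac{5 \pm \sqrt{3}}{2} \approx 1.63,\ 3.37 \end{aligned}$$

Both roots lie in \([\mu (q), \lambda (r)] = [1,4]\).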

Notice that this result immediately holds for other closed, connected regions \(I,J \subset {\overline{{\mathbb {R}}}}\) by the action of some appropriate \(\phi \in {{\,\mathrm{SL}\,}}_2({\mathbb {R}})\). In fact, one can directly apply the action of \(\phi \) to conditions (ii)(a) and (ii)(b), due to the fact that our definition of the “universal” symbol works for any projectively convex regions. The only significant change comes when applying \(\phi \) to condition (ii)(c). Further, the only issue with (ii)(c) as it is written now is the requirement that \(q\) and \(r\) be top-degree monic polynomials. Having zeros at infinity, for instance, means that a polynomial cannot ever be top-degree monic (as the leading homogeneous coefficient is 0). There are ways to rewrite (ii)(c) that avoid this problem, but it is probably more intuitive to state the result as above and apply \(\phi \in {{\,\mathrm{SL}\,}}_2({\mathbb {R}})\).

Additionally, the result holds for open and half-open bounded intervals \(J \subset {\mathbb {R}}\), with a bit of tweaking to condition (ii)(c). (Again, the universality of the symbol means that (ii)(a) and (ii)(b) remain unchanged.) We state this in the following, where the action of \(\phi \in {{\,\mathrm{SL}\,}}_2({\mathbb {R}})\) can be used to obtain similar results regarding open and half-open connected regions in \({\overline{{\mathbb {R}}}}\).

Corollary 7.9

The previous theorem holds when \(J \subset {\mathbb {R}}\) is an open (or half-open) bounded interval, given the following alterations to condition (ii)(c): if the image of T is of dimension exactly two, then \(p_1,p_2 \in V_{\mathbb {R}}(m)\) are top-degree monic \({\overline{J}}\)-rooted polynomials such that the largest root of \(p_1\) and the smallest root of \(p_2\) are in J (for \(p_1 \ll p_2\)), and \(\psi _1 \ne 0\) (resp. \(\psi _2 \ne 0\)) whenever \(p_2\) (resp. \(p_1\)) is not J-rooted.

Proof

The condition that the largest root of \(p_1\) and the smallest root of \(p_2\) are in J (and the fact that \(p_1 \ll p_2\)) implies that \(\alpha p_1 + \beta p_2\) is J-rooted for all \(\alpha ,\beta > 0\). Applying Lemma 7.6 to \({\overline{J}}\) completes the proof. \(\square \)

We now give a few examples. The first demonstrates the necessity of the premise that T preserves weak real-rootedness.

Example 7.10

Consider the operator \(T_n: V(n) \rightarrow V(n)\) defined via:

$$\begin{aligned} T_n: x^ky^{n-k} \mapsto {{\,\mathrm{Hmg}\,}}_n[x(x-1)(x-2)\cdots (x-k+1)] \end{aligned}$$

By Proposition 7.31 in [7], \(T_n\) preserves positive-rootedness for all n. However, \(T_2\) does not preserve real-rootedness, for example. In particular:

$$\begin{aligned} T_2(x^2+2xy+y^2) = x(x-y) + 2xy + y^2 = x^2 + xy + y^2 \end{aligned}$$

We now compute the symbol of \(T_2\):

$$\begin{aligned} {{\,\mathrm{Symb}\,}}(T_2) = T_2[(xw-zy)^2] = x(x-y)w^2 - 2xzyw + z^2y^2 = {{\,\mathrm{Hmg}\,}}_{(2,2)}[(x-z)^2 - x] \end{aligned}$$

Notice that for \(x = -1\), we have that \((-1-z)^2 + 1\) is not real-rooted as a polynomial in z. Therefore, \({{\,\mathrm{Symb}\,}}(T_2)\) is neither \(({\mathcal {H}}_-\cup (0,\infty )) \times ({\overline{{\mathcal {H}}_+}} {\setminus } (0,\infty ))\)-stable nor \(({\mathcal {H}}_-\cup (0,\infty )) \times ({\overline{{\mathcal {H}}_-}} {\setminus } (0,\infty ))\)-stable when the variables are ordered \((z,w), (x,y)\). That is, the operator \(T_2\) does not contradict the previous theorem.
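The computation of \(T_2[(x+y)^2]\) above can be reproduced mechanically. The following Python sketch works with the dehomogenized operator (setting \(y = 1\)), so that \(x^k\) is sent to the falling factorial \(x(x-1)\cdots (x-k+1)\). The helper names `poly_mul`, `falling_factorial` and `apply_T` are illustrative choices and do not come from the paper.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def falling_factorial(k):
    """Coefficients of x(x-1)...(x-k+1); the empty product (k = 0) is 1."""
    result = [Fraction(1)]
    for j in range(k):
        result = poly_mul(result, [Fraction(-j), Fraction(1)])
    return result

def apply_T(coeffs):
    """Dehomogenized T_n: send x^k to the k-th falling factorial, extended linearly."""
    out = [Fraction(0)] * len(coeffs)
    for k, c in enumerate(coeffs):
        for i, a in enumerate(falling_factorial(k)):
            out[i] += c * a
    return out

# (x + 1)^2 = 1 + 2x + x^2 is real-rooted; its image is 1 + x + x^2.
image = apply_T([1, 2, 1])
c, b, a = image
discriminant = b * b - 4 * a * c  # negative discriminant: no real roots
print(image, discriminant)
```

By contrast, applying `apply_T` to the positive-rooted input \((x-1)^2\), that is `apply_T([1, -2, 1])`, gives \(x^2 - 3x + 1\), whose two roots are real and positive, in line with the preservation of positive-rootedness.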

In the second example, we demonstrate root preservation properties of \(f(\partial _x)\) for real-rooted f. These are standard results of the classical theory: see, e.g., Corollary 5.4.1 in [15].

Example 7.11

For any real-rooted \(f \in {\mathbb {C}}[x]\), consider the operator \(D_f \in {{\,\mathrm{Hom}\,}}(V(n),V(n))\) defined via \(D_f: g \mapsto f(y\partial _x)g\) (i.e., the homogenized version of \(f(\partial _x)\)). To determine properties of this operator, we first write:

$$\begin{aligned} f(y\partial _x) = c_0 \prod _{j=1}^m (y\partial _x - \alpha _j) \end{aligned}$$

Here, the \(\alpha _j \in {\mathbb {R}}\) are the roots of f. Next, we compute the symbol of \((y\partial _x - \alpha _j) \in {{\,\mathrm{Hom}\,}}(V(n),V(n))\) for \(j \in [m]\):

$$\begin{aligned} \begin{aligned} {{\,\mathrm{Symb}\,}}(y\partial _x - \alpha _j)&= (y\partial _x - \alpha _j)(zy-xw)^n \\&= -(\alpha _j (zy-xw) + nwy)(zy-xw)^{n-1} \\&= {{\,\mathrm{Hmg}\,}}_{(n,n)}\left[ -(\alpha _j(z-x)+n)(z-x)^{n-1}\right] \end{aligned} \end{aligned}$$

We now have three cases, depending on the sign of \(\alpha _j\). If \(\alpha _j > 0\), we have that \({{\,\mathrm{Symb}\,}}(y\partial _x - \alpha _j)\) is both \(({\mathcal {H}}_-\cup [a,\infty ]) \times ({\overline{{\mathcal {H}}_+}} {\setminus } [a,\infty ])\)-stable and \(({\mathcal {H}}_-\cup [-\infty ,a]) \times ({\overline{{\mathcal {H}}_+}} {\setminus } [-\infty ,a+\frac{n}{\alpha _j}])\)-stable for any \(a \in {\mathbb {R}}\). (As usual, we order the variables \((z,w), (x,y)\).) Using Theorem 7.8 and the discussion following the proof, this implies \((y\partial _x - \alpha _j)\) preserves \([a,\infty ]\)-rootedness and maps \([-\infty ,a]\)-rooted polynomials to \([-\infty ,a+\frac{n}{\alpha _j}]\)-rooted polynomials. So, if the (non-infinite) roots of g are contained in the interval \([b,c]\), then the (non-infinite) roots of \((y\partial _x - \alpha _j)g\) are contained in the interval \([b,c+\frac{n}{\alpha _j}]\).

If \(\alpha _j < 0\), we have that \({{\,\mathrm{Symb}\,}}(y\partial _x - \alpha _j)\) is both \(({\mathcal {H}}_-\cup [a,\infty ]) \times ({\overline{{\mathcal {H}}_+}} {\setminus } [a+\frac{n}{\alpha _j},\infty ])\)-stable and \(({\mathcal {H}}_-\cup [-\infty ,a]) \times ({\overline{{\mathcal {H}}_+}} {\setminus } [-\infty ,a])\)-stable for any \(a \in {\mathbb {R}}\). As above, this implies \((y\partial _x - \alpha _j)\) preserves \([-\infty ,a]\)-rootedness and maps \([a,\infty ]\)-rooted polynomials to \([a+\frac{n}{\alpha _j},\infty ]\)-rooted polynomials. So, if the (non-infinite) roots of g are contained in the interval \([b,c]\), then the (non-infinite) roots of \((y\partial _x - \alpha _j)g\) are contained in the interval \([b+\frac{n}{\alpha _j},c]\).

Finally, for \(\alpha _j = 0\), the operator \((y\partial _x - \alpha _j) = y\partial _x\) weakly preserves any interval in which the (non-infinite) roots reside. The main difference for this case is that \(y\partial _x\) only preserves weak real-rootedness. Combining these three cases, we are led to the following root preservation property of \(f(\partial _x): {\mathbb {C}}^n[x] \rightarrow {\mathbb {C}}^n[x]\). Let \(\alpha _j^+\) and \(\alpha _j^-\) be the positive and negative roots of f, respectively. We then have the following, which refers to non-infinite roots:

$$\begin{aligned} f(\partial _x): [b,c]\text {-rooted} \rightarrow \left[ b + \sum _j \frac{n}{\alpha _j^-}, c + \sum _j \frac{n}{\alpha _j^+}\right] \text {-rooted} \end{aligned}$$

If f has zeros at 0, then \(f(\partial _x)\) may map some nonzero \([b,c]\)-rooted polynomials to 0. Otherwise, \(f(\partial _x)\) is invertible on \({\mathbb {C}}^n[x]\).
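To illustrate the single-factor case \(\alpha _j > 0\), one can take \(g = (x-c)^n\), for which \((\partial _x - \alpha )g = (x-c)^{n-1}\big (n - \alpha (x-c)\big )\). The only new root is \(c + \frac{n}{\alpha }\), which is exactly the right endpoint of the interval given above. The following Python sketch checks this identity for one choice of parameters; the values of `n`, `c`, `alpha` and all helper names are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def power(base, n):
    """Return base^n as a coefficient list."""
    result = [Fraction(1)]
    for _ in range(n):
        result = poly_mul(result, base)
    return result

def apply_shifted_derivative(g, alpha):
    """Return g' - alpha * g, padded to the same length as g."""
    dg = [i * g[i] for i in range(1, len(g))] + [Fraction(0)]
    return [d - alpha * a for d, a in zip(dg, g)]

def evaluate(p, x):
    """Evaluate a coefficient list at x by Horner's rule."""
    value = Fraction(0)
    for coeff in reversed(p):
        value = value * x + coeff
    return value

n, c, alpha = 4, Fraction(2), Fraction(1, 2)
g = power([-c, Fraction(1)], n)                      # g = (x - c)^n
image = apply_shifted_derivative(g, alpha)           # (d/dx - alpha) g
# Expected factorization: (x - c)^(n-1) * ((n + alpha*c) - alpha*x)
expected = poly_mul(power([-c, Fraction(1)], n - 1), [n + alpha * c, -alpha])
extra_root = c + n / alpha                           # the one root that moved
print(image == expected, extra_root, evaluate(image, extra_root))
```

In this instance the bound \(c + \frac{n}{\alpha _j}\) is attained, so for a single linear factor it cannot be improved in general.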

References

1. Borcea, J., Brändén, P.: The Lee–Yang and Pólya–Schur programs. I. Linear operators preserving stability. Inventiones Mathematicae 177(3), 541–569 (2009)

2. Borcea, J., Brändén, P.: The Lee–Yang and Pólya–Schur programs. II. Theory of stable polynomials and applications. Commun. Pure Appl. Math. 62(12), 1595–1631 (2009)

3. Borcea, J., Brändén, P.: Pólya–Schur master theorems for circular domains and their boundaries. Ann. Math. 170(1), 465–492 (2009)

4. Brennan, J.P., Chipalkatti, J.V., Fossum, R.M.: Apolarity and covariant forms. Ill. J. Math. 51(1), 21–27 (2007)

5. Ehrenborg, R., Rota, G.-C.: Apolarity and canonical forms for homogeneous polynomials. Eur. J. Comb. 14(3), 157–181 (1993)

6. Fulton, W., Harris, J.: Representation Theory: A First Course, vol. 129. Springer Science and Business Media, Berlin (2013)

7. Fisk, S.: Polynomials, roots, and interlacing. arXiv preprint arXiv:math/0612833 (2006)

8. Grace, J.H.: The zeros of a polynomial. In: Proc. Cambridge Philos. Soc., vol. 11, pp. 352–357 (1902)

9. Grace, J.H., Young, A.: The Algebra of Invariants. Chelsea Pub. Co., New York (1903)

10. Humphreys, J.: Introduction to Lie Algebras and Representation Theory, vol. 9. Springer Science and Business Media, Berlin (2012)

11. Kowalski, E.: An Introduction to the Representation Theory of Groups, vol. 155. American Mathematical Society, Providence (2014)

12. Melamud, E.: Linear operators on polynomials preserving roots in open circular domains. Proc. Am. Math. Soc. 143(12), 5213–5218 (2015)

13. Olver, P.J.: Classical Invariant Theory, vol. 44. Cambridge University Press, Cambridge (1999)

14. Pólya, G., Schur, J.: Über zwei Arten von Faktorenfolgen in der Theorie der algebraischen Gleichungen. Journal für die Reine und Angewandte Mathematik 144, 89–113 (1914)

15. Rahman, Q.I., Schmeisser, G.: Analytic Theory of Polynomials, vol. 26. Oxford University Press, Oxford (2002)

16. Wagner, D.: Multivariate stable polynomials: theory and applications. Bull. Am. Math. Soc. 48(1), 53–84 (2011)

17. Zaheer, N.: On polar relations of abstract homogeneous polynomials. Trans. Am. Math. Soc. 218, 115–131 (1976)

18. Zervos, S.P.: Aspects modernes de la localisation des zéros des polynômes d’une variable. Annales scientifiques de l’École Normale Supérieure 77(4), 303–410 (1960)

Acknowledgements

We would like to thank Nick Ryder for many graduate school conversations on stable polynomials and linear stability preservers. We would also like to thank an anonymous referee for a thorough reading of this paper and many helpful comments.

Author information

Corresponding author

Correspondence to Jonathan Leake.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Tensor product decomposition of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) representations

In this appendix, we discuss in detail the decomposition of inner tensor products of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) and \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\) representations. The results given here are for the most part standard, and they are typically presented via the theory of Lie groups and algebras (e.g., in [6] and [10]). Here though, we discuss these results in terms of the polynomial spaces V(n) and \(V(\lambda )\).

That said, the first results we state demonstrate the importance of V(n) and \(V(\lambda )\) in the representation theory of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\). In fact, these representations are precisely the irreducible representations of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) and \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\), respectively (see Lecture 11 of [6], and also Proposition 2.3.23 of [11]). We will not make full use of this fact but will need the following simpler results.

Proposition A.1

For all \(n \in {\mathbb {N}}_0\), we have that V(n) is an irreducible representation of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) of dimension \(n+1\).

Proposition A.2

For all \(\lambda \in {\mathbb {N}}_0^m\), we have that \(V(\lambda ) \cong V(\lambda _1) \boxtimes \cdots \boxtimes V(\lambda _m)\) is an irreducible representation of \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\) of dimension \(\prod _{i}(\lambda _i + 1)\).

In particular, outer tensor products of irreducible representations of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) are irreducible representations of \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\). On the other hand, inner tensor products are not irreducible and their decomposition leads to a natural definition of the apolarity form (see Sect. 3.2). We now set out to compute these decompositions, which are often given as exercises in the literature (see, e.g., Exercise 11.11 of [6]).

Decomposition of \(V(n) \otimes V(m)\)

Fix \(n,m \in {\mathbb {N}}_0\). We now consider the representation of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) given by the inner tensor product, \(V(n) \otimes V(m)\). The importance of the tensor product comes from the fact that it relates to consideration of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant bilinear forms like the apolarity form. In particular, the decomposition of the tensor product as a sum of irreducible representations (Proposition A.7) will show us exactly how the D map (see Proposition 3.1) can be used to define the apolarity form in the representation theoretic context (Definition 3.3).

We begin with an important \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant map.

Proposition A.3

Let x and y denote the linear maps defined on V(k) via multiplication by x and y, respectively. The linear map \(U := (x \otimes y - y \otimes x): V(n) \otimes V(m) \rightarrow V(n+1) \otimes V(m+1)\) is \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant.

Proof

Straightforward computation, e.g., on simple tensors. \(\square \)

We then use this U map to show that \(D^k\) is not the zero map when \(k \le m,n\).

Lemma A.4

For \(k \le m \le n\), consider the map \(DU^k: V(n-k) \otimes V(m-k) \rightarrow V(n-1) \otimes V(m-1)\). We have:

$$\begin{aligned} DU^k = U^kD + k(n+m-k+1)U^{k-1} \end{aligned}$$

Proof

Follows from the fact that \((\partial _x x - x\partial _x)p = p\) and \((\partial _y y - y\partial _y)p = p\). \(\square \)

Corollary A.5

For \(k \le m \le n\), consider the map \(D^kU^k: V(n-k) \otimes V(m-k) \rightarrow V(n-k) \otimes V(m-k)\). We have:

$$\begin{aligned} D^kU^k(x^{n-k} \otimes x^{m-k}) = \frac{k!(n+m-k+1)!}{(n+m-2k+1)!} (x^{n-k} \otimes x^{m-k}) \ne 0 \end{aligned}$$

In particular, \(x^{n-k} \otimes x^{m-k}\) is in the image of \(D^k: V(n) \otimes V(m) \rightarrow V(n-k) \otimes V(m-k)\).

Proof

Apply the previous lemma k times, and use the fact that \(D(x^{n-k} \otimes x^{m-k}) = 0\). \(\square \)
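Lemma A.4 and Corollary A.5 can be checked by direct computation on monomials. The Python sketch below assumes the unnormalized form \(D = \partial _x \otimes \partial _y - \partial _y \otimes \partial _x\); this is an assumption about Proposition 3.1, chosen because it is consistent with the commutation relation of Lemma A.4. A basis element \(x^iy^j \otimes x^ky^l\) is stored under the key `(i, j, k, l)`, and all function names are illustrative.

```python
from math import factorial

def add_term(out, key, value):
    """Add value to out[key], dropping entries that cancel to zero."""
    total = out.get(key, 0) + value
    if total:
        out[key] = total
    else:
        out.pop(key, None)

def U(t):
    """Apply x (tensor) y - y (tensor) x to a dict of monomial coefficients."""
    out = {}
    for (i, j, k, l), c in t.items():
        add_term(out, (i + 1, j, k, l + 1), c)
        add_term(out, (i, j + 1, k + 1, l), -c)
    return out

def D(t):
    """Apply d/dx (tensor) d/dy - d/dy (tensor) d/dx (assumed, unnormalized form of D)."""
    out = {}
    for (i, j, k, l), c in t.items():
        if i and l:
            add_term(out, (i - 1, j, k, l - 1), c * i * l)
        if j and k:
            add_term(out, (i, j - 1, k - 1, l), -c * j * k)
    return out

def iterate(op, times, t):
    """Apply op to t the given number of times."""
    for _ in range(times):
        t = op(t)
    return t

def combine(s, t, factor):
    """Return s + factor * t."""
    out = dict(s)
    for key, c in t.items():
        add_term(out, key, factor * c)
    return out

def lemma_a4_holds(n, m, k):
    """Check D U^k = U^k D + k(n+m-k+1) U^(k-1) on a basis of V(n-k) (tensor) V(m-k)."""
    scalar = k * (n + m - k + 1)
    for i in range(n - k + 1):
        for a in range(m - k + 1):
            v = {(i, n - k - i, a, m - k - a): 1}
            lhs = D(iterate(U, k, v))
            rhs = combine(iterate(U, k, D(v)), iterate(U, k - 1, v), scalar)
            if lhs != rhs:
                return False
    return True

def corollary_a5_constant(n, m, k):
    """The scalar k!(n+m-k+1)!/(n+m-2k+1)! from Corollary A.5."""
    return factorial(k) * factorial(n + m - k + 1) // factorial(n + m - 2 * k + 1)

n, m = 5, 3
print(all(lemma_a4_holds(n, m, k) for k in range(1, m + 1)))
for k in range(1, m + 1):
    v = {(n - k, 0, m - k, 0): 1}  # x^(n-k) (tensor) x^(m-k)
    print(k, iterate(D, k, iterate(U, k, v)) == {(n - k, 0, m - k, 0): corollary_a5_constant(n, m, k)})
```

If the paper's D carries a different normalization, the scalars in both checks would rescale accordingly, but the nonvanishing conclusion of Corollary A.5 is unaffected.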

We will use this fact about the image of \(D^k\) to determine the decomposition of \(V(n) \otimes V(m)\) into irreducible components. We will also need the following fundamental representation theory result.

Lemma A.6

(Schur’s Lemma) Let \(V,V',W\) be representations of a group G, and suppose \(V,V'\) are irreducible. Then:

  1. (i)

    Any G-invariant map \(\pi : W \rightarrow V\) is either surjective or the zero map.

  2. (ii)

    Any G-invariant map \(\iota : V \rightarrow W\) is either injective or the zero map.

  3. (iii)

    Any G-invariant map \(\psi : V \rightarrow V'\) is either an isomorphism or the zero map.

Further, if \(V,V'\) are vector spaces over an algebraically closed field, then \(\psi \) is unique up to scalar.

Applying Schur’s lemma to each of the \(r\text {th}\) transvectants (discussed at the end of Sect. 3.2) yields the desired representation decomposition.

Proposition A.7

Let \(m \le n\). We have the following decomposition of \(V(n) \otimes V(m)\), as a representation of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\), into irreducible components.

$$\begin{aligned} V(n) \otimes V(m) \cong \bigoplus _{r \le m} V(n + m - 2r) \end{aligned}$$

In particular, \(V(n) \otimes V(n) \cong V(2n) \oplus V(2n-2) \oplus \cdots \oplus V(2) \oplus V(0)\).

Proof

For each \(r \in {\mathbb {N}}_0\) with \(r \le m\), consider the \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant \(r\text {th}\) transvectant map:

$$\begin{aligned} V(n) \otimes V(m) \xrightarrow {D^r} V(n-r) \otimes V(m-r) \xrightarrow {\times } V(n+m-2r) \end{aligned}$$

By Corollary A.5, this map is not the zero map, as \(x^{n-r} \cdot x^{m-r} = x^{n+m-2r}\) is in its image. Since \(V(n+m-2r)\) is irreducible by Proposition A.1, Schur’s lemma implies this map is surjective. This in turn implies \(V(n) \otimes V(m) \cong V(n+m-2r) + W_r\) (as a representation) for some subspace \(W_r\). Since this holds for all \(r \le m\), we actually have

$$\begin{aligned} V(n) \otimes V(m) \cong W + \bigoplus _{r \le m} V(n + m - 2r) \end{aligned}$$

for some subspace W. The sum of irreducible components here is direct, as any two distinct irreducible components must intersect trivially. To show that we can set \(W = 0\), we use the following dimension argument:

$$\begin{aligned} \sum _{r=0}^m \dim (V(n+m-2r)) = \sum _{r=0}^m (n+m-2r+1) = (n+1)(m+1) = \dim (V(n) \otimes V(m)) \end{aligned}$$

This completes the proof. \(\square \)
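As a complementary sanity check, which is not part of the argument above, one can compare weight multisets on both sides of the decomposition. Under the diagonal torus of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\), the monomial basis of V(n) has weights \(n, n-2, \ldots , -n\), and the weights of a tensor product are the pairwise sums. The Python sketch below, with illustrative function names, confirms that both sides carry the same weights with the same multiplicities for small n and m.

```python
from collections import Counter

def weights(n):
    """Torus weights of V(n): n, n-2, ..., -n."""
    return [n - 2 * i for i in range(n + 1)]

def tensor_weights(n, m):
    """Weight multiset of V(n) (tensor) V(m): all pairwise sums of weights."""
    return Counter(a + b for a in weights(n) for b in weights(m))

def decomposition_weights(n, m):
    """Weight multiset of the direct sum of V(n+m-2r) for 0 <= r <= min(n, m)."""
    total = Counter()
    for r in range(min(n, m) + 1):
        total.update(weights(n + m - 2 * r))
    return total

def dimension_count_holds(n, m):
    """The dimension identity used in the proof of Proposition A.7."""
    return sum(n + m - 2 * r + 1 for r in range(min(n, m) + 1)) == (n + 1) * (m + 1)

pairs = [(n, m) for n in range(7) for m in range(n + 1)]
print(all(tensor_weights(n, m) == decomposition_weights(n, m) for n, m in pairs))
print(all(dimension_count_holds(n, m) for n, m in pairs))
```

Matching weight multisets imply matching dimensions, so this check is slightly stronger than the dimension count alone.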

Along with the stated decomposition, we also obtain something else: the \(r\text {th}\) transvectant is a projection from \(V(n) \otimes V(m)\) onto the irreducible component \(V(n+m-2r)\). Schur’s lemma and the tensor product decomposition then imply this projection is actually unique up to scalar. In a similar way, Schur’s lemma also implies D and U must restrict to either a unique isomorphism or the zero map on each irreducible component of \(V(n) \otimes V(m)\). In the following, we determine exactly what happens on each component.

Theorem A.8

Consider the decomposition \(V(n) \otimes V(m) \cong V(n+m) \oplus V(n+m-2) \oplus \cdots \oplus V(n-m+2) \oplus V(n-m)\). The maps

$$\begin{aligned}&U: V(n) \otimes V(m) \rightarrow V(n+1) \otimes V(m+1) \\&D: V(n+1) \otimes V(m+1) \rightarrow V(n) \otimes V(m) \end{aligned}$$

restrict to \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant isomorphisms from \(V(n+m-2r)\) to \(V(n+m-2r)\) for all \(0 \le r \le m \le n\). Additionally, D restricts to the zero map on \(V(n+m+2)\).

Proof

By Schur’s lemma, the claim immediately follows if U is injective and D is surjective. That D is surjective follows from the fact that the transvectant maps \(\times \circ D^r\) are projections onto each of the irreducible components of \(V(n) \otimes V(m)\) for \(1 \le r \le m+1\). That U is injective follows from the fact that \(U(v) = 0\) implies \(v=0\). One can see this by lexicographically ordering the basis \(\{x^jy^{n-j} \otimes x^ky^{m-k} : 0 \le j \le n, 0 \le k \le m\}\) and considering the highest component of a given \(v \in V(n) \otimes V(m)\). \(\square \)

Our main application of this theory is given as follows. Consider the \(n\text {th}\) transvectant map \(\times \circ D^n: V(n) \otimes V(n) \rightarrow V(0) \cong {\mathbb {C}}\), which is nonzero by the previous theorem. This map can be interpreted as an \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant bilinear form on V(n). It turns out that the apolarity bilinear form used in Grace’s theorem also has this property, and this justifies the following definition.

Definition 3.3

We call the \(n\text {th}\) transvectant

$$\begin{aligned} V(n) \otimes V(n) \xrightarrow {D^n} V(0) \otimes V(0) \xrightarrow {\times } V(0) \cong {\mathbb {C}}\end{aligned}$$

the apolarity form of V(n).

Corollary A.9

The apolarity form is the unique (up to scalar) nondegenerate \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant bilinear form on V(n).

Decomposition of \(V(\lambda ) \otimes V(\mu )\)

Fix \(\lambda ,\mu \in {\mathbb {N}}_0^m\), and let \(V(\lambda )\) and \(V(\mu )\) denote the irreducible representations of \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\) given by the outer tensor products:

$$\begin{aligned} V(\lambda ) \cong V(\lambda _1) \boxtimes \cdots \boxtimes V(\lambda _m) \qquad \qquad \qquad V(\mu ) \cong V(\mu _1) \boxtimes \cdots \boxtimes V(\mu _m) \end{aligned}$$

We next generalize the above results to the inner tensor product of these two representations, \(V(\lambda ) \otimes V(\mu )\). In particular, we determine the decomposition of this tensor product and define a multivariate apolarity form. Note that these statements strictly generalize the previous analogous statements.

Proposition A.10

(cf. Proposition A.7) Let \(\mu \le \lambda \). We have the following decomposition of \(V(\lambda ) \otimes V(\mu )\), as a representation of \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\), into irreducible components.

$$\begin{aligned} V(\lambda ) \otimes V(\mu ) \cong \bigoplus _{\alpha \le \mu } V(\lambda + \mu - 2\alpha ) \end{aligned}$$

Proof

We compute:

$$\begin{aligned} \begin{aligned} V(\lambda ) \otimes V(\mu )&\cong \big (V(\lambda _1) \boxtimes \cdots \boxtimes V(\lambda _m)\big ) \otimes \big (V(\mu _1) \boxtimes \cdots \boxtimes V(\mu _m)\big ) \\&\cong \big (V(\lambda _1) \otimes V(\mu _1)\big ) \boxtimes \cdots \boxtimes \big (V(\lambda _m) \otimes V(\mu _m)\big ) \\&\cong \left( \bigoplus _{\alpha _1 \le \mu _1} V(\lambda _1 + \mu _1 - 2\alpha _1)\right) \boxtimes \cdots \boxtimes \left( \bigoplus _{\alpha _m \le \mu _m} V(\lambda _m + \mu _m - 2\alpha _m)\right) \\&\cong \bigoplus _{\alpha \le \mu } V(\lambda + \mu - 2\alpha ) \end{aligned} \end{aligned}$$

The last step uses the distributive law for sums and tensor products of representations. \(\square \)
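The multi-index decomposition can be enumerated explicitly, and its dimensions compared with \(\dim V(\lambda ) \cdot \dim V(\mu )\). The following Python sketch does this for one sample pair; the choice \(\lambda = (3,2)\), \(\mu = (1,2)\) and the function names are illustrative assumptions.

```python
from itertools import product
from math import prod

def dim(label):
    """Dimension of V(label), namely the product of (label_i + 1)."""
    return prod(l + 1 for l in label)

def components(lam, mu):
    """Labels lam + mu - 2*alpha for all alpha <= mu entrywise, assuming mu <= lam."""
    assert all(u <= l for l, u in zip(lam, mu)), "requires mu <= lam entrywise"
    return [tuple(l + u - 2 * a for l, u, a in zip(lam, mu, alpha))
            for alpha in product(*(range(u + 1) for u in mu))]

lam, mu = (3, 2), (1, 2)
labels = components(lam, mu)
print(labels)
print(sum(dim(label) for label in labels), dim(lam) * dim(mu))
```

For this sample the six components are \(V(4,4), V(4,2), V(4,0), V(2,4), V(2,2), V(2,0)\), with total dimension 72 on both sides.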

Theorem A.11

(cf. Theorem A.8) For any \(\beta \in {\mathbb {N}}_0^m\), define \(U^\beta := U^{\beta _1} \boxtimes \cdots \boxtimes U^{\beta _m}\). Define \(D^\beta \) similarly. For any \(\mu \le \lambda \in {\mathbb {N}}_0^m\), the maps

$$\begin{aligned}&U^\beta : V(\lambda ) \otimes V(\mu ) \rightarrow V(\lambda + \beta ) \otimes V(\mu + \beta ) \\&D^\beta : V(\lambda + \beta ) \otimes V(\mu + \beta ) \rightarrow V(\lambda ) \otimes V(\mu ) \end{aligned}$$

restrict to \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\)-invariant isomorphisms on the components of \(V(\lambda ) \otimes V(\mu ) \cong \bigoplus _{\alpha \le \mu } V(\lambda + \mu - 2\alpha )\). Finally, \(D^\beta \) restricts to the zero map on the other irreducible components of \(V(\lambda + \beta ) \otimes V(\mu + \beta )\).

Proof

Follows by induction on \(\beta \), using Theorem A.8. \(\square \)

Definition 3.4

(cf. Definition 3.3) We call the map

$$\begin{aligned} V(\lambda ) \otimes V(\lambda ) \xrightarrow {D^\lambda } V(0^m) \otimes V(0^m) \xrightarrow {\times } V(0^m) \cong {\mathbb {C}}\end{aligned}$$

the apolarity form of \(V(\lambda )\).

Corollary A.12

(cf. Corollary A.9) The apolarity form is the unique (up to scalar) nondegenerate \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\)-invariant bilinear form on \(V(\lambda )\).

The Grace–Walsh–Szegő coincidence theorem

A classical result in the representation theory of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\) is the fact that \(V(n) \cong {\text {Sym}}^n(V(1))\). Here, \({\text {Sym}}^n(V(1))\) denotes the set of symmetric tensors in \(V(1)^{\otimes n}\), or alternatively, the set of symmetric elements in \(V(1^n)\). That is, there is some \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\)-invariant injection from V(n) to \(V(1)^{\otimes n}\), and by our conceptual thesis this map should transfer stability information. In fact, this idea is formalized in the Grace–Walsh–Szegő coincidence theorem, and the injective map is known as the polarization map.

Polarization and projection

For polynomials of degree \(m \le n\), the degree-n polarization map is defined on monomials as follows and is extended linearly.

$$\begin{aligned} \begin{aligned} \Pi _n^\uparrow : {\mathbb {C}}^n[x]&\rightarrow {\mathbb {C}}^{(1^n)}[x_1,\ldots ,x_n] \\ x^k&\mapsto \frac{1}{n!} \sum _{\sigma \in S_n} \prod _{j=1}^k x_{\sigma (j)} \end{aligned} \end{aligned}$$

This definition can be extended to homogeneous polynomials in V(n) by composing with \({{\,\mathrm{Hmg}\,}}_n^{-1}\) and \({{\,\mathrm{Hmg}\,}}_{(1^n)}\). The map \(\Pi _n^\uparrow \) has a left inverse \(\Pi _n^\downarrow \), called the projection map, which we define as follows.

$$\begin{aligned} \begin{aligned} \Pi _n^\downarrow : {\mathbb {C}}^{(1^n)}[x_1,\ldots ,x_n]&\rightarrow {\mathbb {C}}^n[x] \\ f(x_1,x_2,\ldots ,x_n)&\mapsto f(x,x,\ldots ,x) \end{aligned} \end{aligned}$$

That is, \(\Pi _n^\downarrow \circ \Pi _n^\uparrow \) is the identity map. Similarly, this definition can be extended to homogeneous polynomials by composing with \({{\,\mathrm{Hmg}\,}}_{(1^n)}^{-1}\) and \({{\,\mathrm{Hmg}\,}}_n\).

It is well-known that \(\Pi _n^\uparrow \) is an injective linear map onto the subspace of symmetric multi-affine polynomials. This fact then extends to homogeneous polynomials, where the terms symmetric and multi-affine each refer to pairs of homogeneous variables. Further, one can define multivariate polarization and projection maps via composition: \(\Pi _\lambda ^\uparrow := \Pi _{\lambda _m}^\uparrow \circ \cdots \circ \Pi _{\lambda _1}^\uparrow \) and \(\Pi _\lambda ^\downarrow := \Pi _{\lambda _m}^\downarrow \circ \cdots \circ \Pi _{\lambda _1}^\downarrow \). Injectivity then automatically extends to \(\Pi _\lambda ^\uparrow \), and \(\Pi _\lambda ^\downarrow \circ \Pi _\lambda ^\uparrow \) is the identity map.
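The univariate maps are easy to experiment with. In the Python sketch below, a multi-affine polynomial in \(x_1,\ldots ,x_n\) is stored as a dictionary from subsets S of indices to the coefficient of \(\prod _{j \in S} x_j\). The function `polarize` follows the permutation-sum definition of \(\Pi _n^\uparrow \) literally, and `project` sets all variables equal. The data representation and the function names are illustrative choices, not notation from the paper.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial

def polarize(coeffs):
    """Degree-n polarization: x^k -> (1/n!) * sum over sigma of x_sigma(1) ... x_sigma(k)."""
    n = len(coeffs) - 1
    out = {}
    for k, c in enumerate(coeffs):
        if c == 0:
            continue
        for sigma in permutations(range(n)):
            key = frozenset(sigma[:k])  # variables appearing in this term
            out[key] = out.get(key, Fraction(0)) + Fraction(c, factorial(n))
    return out

def project(multi_affine, n):
    """Projection: substitute x_1 = ... = x_n = x and collect powers of x."""
    out = [Fraction(0)] * (n + 1)
    for subset, c in multi_affine.items():
        out[len(subset)] += c
    return out

p = [3, 0, -2, 5, 1]            # 3 - 2x^2 + 5x^3 + x^4, viewed in degree n = 4
polar = polarize(p)
print(project(polar, 4) == p)   # projection is a left inverse of polarization
# Each k-subset appears with weight 1 / C(n, k) in the polarization of x^k.
print(polarize([0, 0, 1, 0, 0])[frozenset({0, 1})] == Fraction(1, comb(4, 2)))
```

The second check reflects the standard fact that \(\Pi _n^\uparrow x^k\) equals the elementary symmetric polynomial \(e_k(x_1,\ldots ,x_n)\) divided by \(\binom{n}{k}\).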

These two maps arise naturally in the theory of polynomials in general, and play an important role in the theory of stability, via the Grace–Walsh–Szegő coincidence theorem as well as in the proof of the Borcea–Brändén characterization of linear operators. The next result shows they also have representation theoretic importance.

Proposition B.1

Fix \(\lambda \in {\mathbb {N}}_0^m\), and view \(V(\lambda ) \cong V(\lambda _1) \boxtimes \cdots \boxtimes V(\lambda _m)\) and \(V(1^\lambda ) \cong V(1)^{\otimes \lambda _1} \boxtimes \cdots \boxtimes V(1)^{\otimes \lambda _m}\) as representations of \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\). The maps \(\Pi _\lambda ^\uparrow : V(\lambda ) \rightarrow V(1^\lambda )\) and \(\Pi _\lambda ^\downarrow : V(1^\lambda ) \rightarrow V(\lambda )\) are \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\)-invariant.

Proof

Note that by proving the result for \(\Pi _n^\uparrow \) and \(\Pi _n^\downarrow \) with \(m=1\), the general result follows since \(\Pi _\lambda ^\uparrow \) and \(\Pi _\lambda ^\downarrow \) are compositions of such maps. To prove it for \(m=1\), note that the set of symmetric elements in \(V(1)^{\otimes n} \cong V(1^n)\) is invariant under the diagonal action of \({{\,\mathrm{SL}\,}}_2({\mathbb {C}})\). Further, since V(n) is irreducible of dimension \(n+1\) and \(V(1)^{\otimes n}\) has a single irreducible component of dimension \(n+1\), Schur’s lemma implies the result. \(\square \)

This result then has a few corollaries which will help to shed light on results related to polarization and the apolarity form. The first will be useful in elucidating the representation theoretic ties to the Grace–Walsh–Szegő coincidence theorem below.

Lemma B.2

Fix \(\lambda \in {\mathbb {N}}_0^m\). Then the apolarity form commutes with polarization up to scalar. That is:

$$\begin{aligned} D^\lambda = D^{(1^\lambda )} \circ (\Pi _\lambda ^\uparrow \otimes \Pi _\lambda ^\uparrow ) \end{aligned}$$

Proof

The map \(D^{(1^\lambda )} \circ (\Pi _\lambda ^\uparrow \otimes \Pi _\lambda ^\uparrow ): V(\lambda ) \otimes V(\lambda ) \rightarrow {\mathbb {C}}\) is an \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\)-invariant bilinear form on \(V(\lambda )\). By uniqueness (see Corollary A.12), this then must equal \(D^\lambda \) up to scalar. \(\square \)

The content of this result is the fact that we have commutativity even though \(D^{(1^\lambda )}\) is a priori the apolarity form with respect to a different group action than that of \(D^\lambda \) (i.e., \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^{|\lambda |}\) instead of \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^m\)). That said, it should be noted that the analogous commutativity statement with the projection map \(\Pi _\lambda ^\downarrow \) does not hold (unless of course, one restricts to the image of \(\Pi _\lambda ^\uparrow \)).

The purpose of this result is then to demonstrate the connection between a polynomial and its polarization. In particular, if Grace’s theorem gives stability information via the apolarity form, then the previous result shows that the polarizations of those polynomials will have the same stability information. We prove this rigorously in Corollary B.5.

Proposition B.1 also leads to one of the crucial results used in the proof of the Borcea–Brändén characterization of linear operators (Lemma 2.5 in [1]). It relies on the notion of “the polarization of an operator”, given by \(T \mapsto \Pi _\alpha ^\uparrow \circ T \circ \Pi _\lambda ^\downarrow \) (see §2.2 in [1]). We do not make explicit use of this result, but we state it here to demonstrate that operator polarization has a representation theoretic interpretation similar to that of the usual polynomial polarization.

Proposition B.3

The symbol of the polarization of an operator T is the polarization of the symbol of T.

Proof

Using Proposition B.1 and Definition 3.6, it is straightforward to see that all the maps involved are injective \(({{\,\mathrm{SL}\,}}_2({\mathbb {C}}))^{2m}\)-invariant linear maps (i.e., polarization of polynomials, polarization of operators, the \({{\,\mathrm{Symb}\,}}\) map). The result then follows from a dimension argument and Schur’s lemma, in a way similar to that of the proof of Proposition B.1. \(\square \)

The coincidence theorem

The Grace–Walsh–Szegő coincidence theorem has strong ties to Grace’s theorem, and most books and surveys on the subject state the two results side by side. Some books (e.g., [15]) even go so far as to demonstrate their equivalence, perhaps with other results involving typical polynomial convolutions. Here, we will state and prove the general multivariate version of the theorem in terms of homogeneous polynomials, making use of evaluation symbols and Grace’s theorem (Theorem 5.1).

First though, consider the following corollary to the symbol lemma (Lemma 3.7) which is similar in spirit to the evaluation symbol lemma (Lemma 5.10). Note that when applied to \(p \in V(n)\) with \(m=1\), this result has the following intuitive statement as a corollary: \(D^n(q \otimes p)\) is equal to the evaluation of \(\Pi _n^\uparrow p\) at the roots of q.

Lemma B.4

Fix \(\lambda \in {\mathbb {N}}_0^m\), \(p \in V(\lambda )\), and any \((a:b) \in (\mathbb {CP}^1)^{|\lambda |}\), more explicitly defined as follows:

$$\begin{aligned}&(a:b) \equiv \big ((a_{1,1}:b_{1,1}), \ldots , (a_{1,\lambda _1}:b_{1,\lambda _1}), \ldots , (a_{m,1}:b_{m,1}), \ldots , (a_{m,\lambda _m}:b_{m,\lambda _m})\big ) \\&\in (\mathbb {CP}^1)^{\lambda _1 + \cdots + \lambda _m} \end{aligned}$$

We have the following:

$$\begin{aligned} (\Pi _\lambda ^\uparrow p)(a,b) = D^\lambda \left( \Pi _\lambda ^\downarrow \big ({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)})\big ) \otimes p\right) = D^\lambda \left( \prod _{k=1}^m \prod _{j=1}^{\lambda _k} (b_{k,j}x_k - a_{k,j}y_k) \otimes p \right) \end{aligned}$$

Proof

We first prove the case of \(m=1\) and \(p \in V(n)\), given as follows:

$$\begin{aligned} (\Pi ^\uparrow _n p)(a,b) = D^n\left( \Pi _n^\downarrow \big ({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)})\big ) \otimes p\right) = D^n\left( \prod _{j=1}^n (b_jx - a_jy) \otimes p \right) \end{aligned}$$

The second equality follows immediately from the definition of \(\Pi _n^\downarrow \) and of \({{\,\mathrm{ev}\,}}_{(a,b)}\) (Definition 5.9). For the first equality, Lemma B.2 implies:

$$\begin{aligned} D^n\left( \Pi _n^\downarrow \big ({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)})\big ) \otimes p\right) = D^{(1^n)}\left( \Pi _n^\uparrow \circ \Pi _n^\downarrow \big ({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)})\big ) \otimes \Pi _n^\uparrow p\right) \end{aligned}$$

By definition of \(\Pi _n^\uparrow \), both \(\Pi _n^\uparrow \circ \Pi _n^\downarrow \big ({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)})\big )\) and \(\Pi _n^\uparrow p\) are symmetric in pairs of variables. Further, we can explicitly compute:

$$\begin{aligned} \Pi _n^\uparrow \circ \Pi _n^\downarrow \big ({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)})\big ) = \frac{1}{n!} \sum _{\sigma \in S_n} \prod _{j=1}^n (b_jx_{\sigma (j)} - a_jy_{\sigma (j)}) \end{aligned}$$

This expression and the fact that \(\Pi _n^\uparrow p\) is symmetric then imply:

$$\begin{aligned}&D^{(1^n)}\left( \Pi _n^\uparrow \circ \Pi _n^\downarrow \big ({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)})\big ) \otimes \Pi _n^\uparrow p\right) \\&\quad = \frac{1}{n!} \sum _{\sigma \in S_n} D^{(1^n)}\left( \prod _{j=1}^n (b_jx_{\sigma (j)} - a_jy_{\sigma (j)}) \otimes \Pi _n^\uparrow p\right) \\&\quad = D^{(1^n)}\left( \prod _{j=1}^n (b_jx_j - a_jy_j) \otimes \Pi _n^\uparrow p\right) \end{aligned}$$

The last expression is then equal to \((\Pi ^\uparrow _n p)(a,b)\) by the evaluation symbol lemma (Lemma 5.10).

For the general case of \(p \in V(\lambda )\), the same argument applies with one change: \(\Pi _\lambda ^\uparrow \circ \Pi _\lambda ^\downarrow \big ({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)})\big )\) and \(\Pi _\lambda ^\uparrow p\) are no longer fully symmetric in pairs of variables, but are instead invariant under the action of \(S_{\lambda _1} \times \cdots \times S_{\lambda _m}\) on the \(|\lambda |\) pairs of variables. The average over \(S_n\) is accordingly replaced by an average over this product group, and the rest of the argument goes through as before.

\(\square \)
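
As a concrete check of the symmetrization step in the proof (a worked instance added for illustration), take \(m=1\) and \(n=2\), so that \(\Pi _2^\downarrow \big ({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)})\big ) = (b_1x - a_1y)(b_2x - a_2y)\). Assuming the usual normalization of the polarization map on monomials, namely \(x^2 \mapsto x_1x_2\), \(xy \mapsto \tfrac{1}{2}(x_1y_2 + x_2y_1)\), and \(y^2 \mapsto y_1y_2\), expanding and polarizing term by term gives:

$$\begin{aligned} \Pi _2^\uparrow \big ((b_1x - a_1y)(b_2x - a_2y)\big )&= b_1b_2\,x_1x_2 - \tfrac{1}{2}(a_1b_2 + a_2b_1)(x_1y_2 + x_2y_1) + a_1a_2\,y_1y_2 \\&= \tfrac{1}{2}\Big [(b_1x_1 - a_1y_1)(b_2x_2 - a_2y_2) + (b_1x_2 - a_1y_2)(b_2x_1 - a_2y_1)\Big ] \end{aligned}$$

The second line is exactly the average over \(S_2\) of \(\prod _{j=1}^2 (b_jx_{\sigma (j)} - a_jy_{\sigma (j)})\), in agreement with the explicit formula used in the proof.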

Generally speaking, the above lemma demonstrates the strong connection between the apolarity form and the polarization map. We now utilize this to prove the coincidence theorem.

Corollary B.5

(Grace–Walsh–Szegő) Fix \(\lambda \in {\mathbb {N}}_0^m\), \(p \in V(\lambda )\), and any disjoint Grace pair \((C_1 \times \cdots \times C_m, B_1 \times \cdots \times B_m)\). If p is \((C_1 \times \cdots \times C_m)\)-stable, then \(\Pi _\lambda ^\uparrow p\) is \((C_1^{\lambda _1} \times \cdots \times C_m^{\lambda _m})\)-stable.

Proof

To prove the contrapositive, suppose \(\Pi _\lambda ^\uparrow p\) is not \((C_1^{\lambda _1} \times \cdots \times C_m^{\lambda _m})\)-stable. That is, suppose \((\Pi _\lambda ^\uparrow p)(a,b) = 0\) for some \((a,b) = (a_{1,1},b_{1,1},\ldots ,a_{m,\lambda _m},b_{m,\lambda _m}) \in {\mathbb {C}}^{2|\lambda |}\) such that \((a_{j,k}:b_{j,k}) \in C_j\) for all \(j \in [m]\) and \(k \in [\lambda _j]\). By the previous lemma, this implies:

$$\begin{aligned} D^\lambda \left( \Pi _\lambda ^\downarrow \big ({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)})\big ) \otimes p\right) = 0 \end{aligned}$$

By disjointness of \(C_j\) and \(B_j\) for all \(j \in [m]\), we then have that \((a_{j,k}:b_{j,k}) \not \in B_j\) for all \(j, k\). Therefore \({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)})\) is \((B_1^{\lambda _1} \times \cdots \times B_m^{\lambda _m})\)-stable (see Definition 5.9). This implies \(\Pi _\lambda ^\downarrow \big ({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)})\big )\) is \((B_1 \times \cdots \times B_m)\)-stable. Since \(D^\lambda \left( \Pi _\lambda ^\downarrow \big ({{\,\mathrm{Symb}\,}}({{\,\mathrm{ev}\,}}_{(a,b)})\big ) \otimes p\right) = 0\), the definition of Grace pair (Definition 5.2) then implies p must not be \((C_1 \times \cdots \times C_m)\)-stable. \(\square \)

By Theorem 5.7, this implies the coincidence theorem for circular regions when \(m > 1\) and for any projectively convex regions when \(m = 1\). Notice that there is no reference made to degree or convexity restrictions (compare this to Theorems 1.1 and 1.2 in [2]). As discussed above, this is one of the main benefits of using homogeneous polynomials and interpreting zeros as lying in \(\mathbb {CP}^1\) and \((\mathbb {CP}^1)^m\).
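
The coincidence theorem can be sanity-checked numerically in its classical, dehomogenized form. The sketch below is an illustration only and rests on several assumptions that are not in the text: it sets every \(y_j = 1\), takes \(m = 1\), uses the open unit disk as the region, and writes the polarization of \(p(x) = \sum _k c_k x^k\) as \(\sum _k c_k e_k(x_1,\ldots ,x_n)/\binom{n}{k}\), where \(e_k\) is the elementary symmetric polynomial. The names `polarize`, `elementary_symmetric`, and `smallest_modulus` are hypothetical helpers chosen for this example.

```python
import cmath
import itertools
import math
import random


def elementary_symmetric(points, k):
    """Sum of the products over all k-element subsets of `points`."""
    return sum(math.prod(subset) for subset in itertools.combinations(points, k))


def polarize(coeffs):
    """Polarize p(x) = sum_k coeffs[k] * x**k, viewed as having degree n = len(coeffs) - 1.

    Returns the symmetric multi-affine function P with P(x, ..., x) = p(x).
    """
    n = len(coeffs) - 1

    def polarized(points):
        if len(points) != n:
            raise ValueError(f"expected {n} points, got {len(points)}")
        return sum(
            c * elementary_symmetric(points, k) / math.comb(n, k)
            for k, c in enumerate(coeffs)
        )

    return polarized


def smallest_modulus(polarized, n, trials=1000, seed=0, radius=0.999):
    """Smallest |P(z_1, ..., z_n)| seen over random points of the disk |z| <= radius."""
    rng = random.Random(seed)

    def random_point():
        # The square root makes the sample uniform in area; any sampling scheme would do.
        return cmath.rect(radius * math.sqrt(rng.random()), rng.uniform(0.0, 2.0 * math.pi))

    return min(
        abs(polarized([random_point() for _ in range(n)])) for _ in range(trials)
    )


# p(x) = (x - 2)(x - 3)(x + 2.5) = x^3 - 2.5 x^2 - 6.5 x + 15 has no zeros in the unit disk.
P = polarize([15, -6.5, -2.5, 1])
print(smallest_modulus(P, 3))
```

Every root of this \(p\) has modulus at least 2, so \(|p| > 3\) on the unit disk. The classical coincidence theorem says each value of the polarization on the disk is also a value of \(p\) on the disk, so the printed minimum should stay above 3. Random sampling of this kind can only fail to find a counterexample; it does not prove stability.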

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Leake, J. A representation theoretic explanation of the Borcea–Brändén characterization. Math. Z. (2021). https://doi.org/10.1007/s00209-021-02825-4
