1 Introduction

The Fejér–Riesz theorem on the factorization of a non-negative trigonometric polynomial in one variable as the hermitian square of an analytic polynomial is now over 100 years old. It has become an essential tool in both pure mathematics and engineering, especially in signal processing. There have been numerous generalizations, and an especially keen interest in finding a two variable analogue. This is provided here, where it is proved that operator valued non-negative trigonometric polynomials in two variables can be factored as a finite sum of hermitian squares of analytic polynomials, with control over the number and degrees of the polynomials in the factorization.

Trigonometric polynomials are Laurent polynomials in commuting variables \(z=(z_1,\dots ,z_r)\) on the r-torus \(\mathbb {T}^r\) (\(\mathbb {T}\) the unit circle in the complex plane). This paper is concerned with polynomials that take their values in the bounded operators \({\mathfrak L}({\mathfrak H})\) on a Hilbert space \({\mathfrak H}\). If \(n=(n_1,\dots ,n_r)\) is an r-tuple of integers, the shorthand \(z^n\) is used for \(z_1^{n_1}\cdots z_r^{n_r}\), and \(-n\) for \((-n_1,\dots ,-n_r)\). Write \(z_j^*\) for \(z_j^{-1}\). Then any trigonometric polynomial has the form \(p(z) = \sum _n a_n z^n\), where the sum has only finitely many non-zero terms, and \(p^*(z):= \sum _n a_n^* z^{*\,n} = \sum _n a_n^* z^{-n}\). Denote by \(\textrm{deg}\, p = d = (d_1,\dots ,d_r)\) the degree of a trigonometric polynomial p; that is, \(d_j\) is the largest value of \(|n_j|\) for which \(a_n \ne 0\). A trigonometric polynomial where all terms have powers n lying in the positive orthant (that is, each \(n_j \ge 0\)) is called an analytic polynomial.

The \({\mathfrak L}({\mathfrak H})\) valued trigonometric polynomials have a natural operator system structure. Those polynomials which satisfy \(p^* = p\) are termed hermitian. The positive cone \(\mathscr {C}\) consists of those hermitian polynomials p for which \(p(z) \ge 0\) for all \(z \in \mathbb {T}^r\); that is, such that p(z) is a positive operator for all z. Such p are termed positive, and for all z and \(f\in {\mathfrak H}\), \(\left<p(z)f,f\right> \ge 0\). If instead for some \(\epsilon > 0\), \(\left<p(z)f,f\right> \ge \epsilon \Vert f\Vert ^2\) for all z and all f (equivalently, \(p(z) \ge \epsilon 1\)), p is said to be strictly positive. If \(p = q^*q\), q an analytic polynomial, then p is called an hermitian square. Finite sums of such squares are positive, and it is natural to wonder if this describes all elements of \(\mathscr {C}\). For polynomials in one variable, the Fejér–Riesz theorem gives a positive answer.

Here it is shown that when considering operator valued polynomials in two variables over a finite dimensional Hilbert space, once again there is a description of an element in the positive cone as a finite sum of squares of analytic polynomials, both extending and improving known results for strictly positive polynomials. Observe that requiring polynomials \(q_j\) to be analytic when \(p = \sum q_j^*q_j\) is not particularly restrictive, since \(z^{N_j} q_j\) will be an analytic polynomial for sufficiently large \(N_j\) and \(p = \sum (z^{N_j} q_j)^* (z^{N_j} q_j)\). This sort of cancellation is the source of many of the challenges in this context.

For \({\mathfrak L}({\mathfrak H})\) valued trigonometric polynomials in r variables, the cone \(\mathscr {C}\) is archimedean, with order unit the constant polynomial 1 taking the value of the identity \(1\in {\mathfrak L}({\mathfrak H})\) for all z—that is, for any hermitian p, there is a positive constant \(\alpha \) such that \(\alpha 1 \pm p \in \mathscr {C}\). There is the obvious generalization to \(M_n(\mathbb {C})\otimes {\mathfrak L}({\mathfrak H})\) valued polynomials. The cone is in general not closed, so attention is usually restricted to the set \(\mathscr {P}_N\) of hermitian polynomials of degree less than or equal to some fixed \(N = (N_1,\dots , N_r)\), with norm closed positive cone \(\mathscr {C}_N\). However, by the cancellation noted above, hermitian squares can have reduced degree, so it might not be possible to factor an element of \(\mathscr {C}_N\) using analytic polynomials from \(\mathscr {P}_N\).

For polynomials taking values in \(\mathbb {R}\), such factorization problems are central to real algebraic geometry. It is always possible to express a scalar valued hermitian trigonometric polynomial in terms of real polynomials,

$$\begin{aligned} p = \sum _n a_n z^n = \sum _n \textrm{Re}\,(a_n z^n), \qquad x_j = \textrm{Re}\, z_j,\ y_j = \textrm{Im}\, z_j. \end{aligned}$$

Since p is hermitian and, on \(\mathbb {T}^r\), each \(\textrm{Re}\,(a_n z^n)\) is a real polynomial in the \(x_j\) and \(y_j\) (as \(z_j = x_j + iy_j\) and \(z_j^{-1} = x_j - iy_j\) there), it follows that p is a real polynomial in 2r variables. The r-torus \(\mathbb {T}^r\) is the set consisting of those points in \(\mathbb {R}^{2r}\) satisfying the 2r constraints \({\{\pm (1-(x_j^2 + y_j^2)) \ge 0\}}_{j=1}^r\). These describe a compact semialgebraic set; that is, a compact set given in terms of a finite collection of polynomial inequalities. A fundamental problem then is: given a semialgebraic set such as \(\mathbb {T}^r\), succinctly characterize the elements of the positive cone over this set (generally in terms of “sums of squares”). Such a description is termed a Positivstellensatz. See for example [13, 15].

Recall that in one variable, the Hardy space of functions over the open unit disk \(\mathbb {D}\) with values in a Hilbert space \({\mathfrak H}\), is \(H^2_{{\mathfrak H}}(\mathbb {D}) = \{ \sum _{n=0}^\infty c_n z^n{:}\, \sum _n \Vert c_n\Vert ^2 < \infty \}\). An \({\mathfrak L}({\mathfrak H})\) valued function f over \(\mathbb {D}\) is called a multiplier if for all \(g\in H^2_{{\mathfrak H}}(\mathbb {D})\), \(f\cdot g \in H^2_{{\mathfrak H}}(\mathbb {D})\), where the product is taken pointwise. A standard result is that the space of multipliers of \(H^2_{{\mathfrak H}}(\mathbb {D})\) equals \(H^\infty _{{\mathfrak L}({\mathfrak H})}(\mathbb {D})\), the bounded \({\mathfrak L}({\mathfrak H})\) valued analytic functions on the disk. Obviously, this space contains the analytic polynomials. A multiplier f is said to be outer if the closure of \(f\cdot H^2_{{\mathfrak H}}(\mathbb {D})\) equals \(H^2_{{\mathfrak L}}(\mathbb {D})\), where \({\mathfrak L}\) is a closed subspace of \({\mathfrak H}\). In the scalar setting, for a polynomial f this is equivalent to none of the zeros lying in \(\mathbb {D}\).

As originally formulated, the Fejér–Riesz theorem concerns the factorization of positive scalar valued trigonometric polynomials in one complex variable [6]. It was later proved by Rosenblum [18] (see also [19, 20]) that for any Hilbert space \({\mathfrak H}\) the theorem remains true for \({\mathfrak L}({\mathfrak H})\) valued trigonometric polynomials, again in one variable.

Theorem 1.1

(Fejér–Riesz Theorem) A positive \({\mathfrak L}({\mathfrak H})\) valued trigonometric polynomial p in a single variable z of degree d can be factored as \(p=q^*q\), where q is an outer \({\mathfrak L}({\mathfrak H})\) valued polynomial of degree d.

That is, in one variable, the cone \(\mathscr {C}_d\) in \(\mathscr {P}_d\) equals the set of hermitian squares of (outer) analytic polynomials in \(\mathscr {P}_d\).
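As a purely illustrative numerical aside (not part of the argument), the scalar factorization can be computed from the roots of \(z^d p(z)\): when p is strictly positive these avoid the circle and pair up as \(r \leftrightarrow 1/\bar{r}\), and taking the reflections of the roots inside the disk produces the outer factor. The helper name and normalization below are assumptions of the sketch.

```python
import numpy as np

def fejer_riesz_scalar(a):
    """Given coefficients a = [a_{-d}, ..., a_0, ..., a_d] of a strictly
    positive trigonometric polynomial p on the unit circle, return the
    coefficients [q_0, ..., q_d] of an outer analytic q with p = |q|^2."""
    a = np.asarray(a, dtype=complex)
    # z^d p(z) has coefficient a[k] on z^k; np.roots wants highest power first.
    roots = np.roots(a[::-1])
    inside = roots[np.abs(roots) < 1]        # roots pair up as r <-> 1/conj(r)
    # q takes the reflected roots, all outside the closed disk, so q is outer.
    q = np.poly(1.0 / np.conj(inside))[::-1]
    # q is unique up to a unimodular constant; fix the scale via p(1) = |q(1)|^2.
    scale = np.sqrt(np.polyval(a[::-1], 1.0).real) / abs(np.polyval(q[::-1], 1.0))
    return scale * q
```

For instance, \(p(z) = \tfrac{1}{2}z^{-1} + \tfrac{5}{4} + \tfrac{1}{2}z\) returns \(q(z) = 1 + \tfrac{1}{2}z\), recovering \(p = |1 + z/2|^2\).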

There is also a weaker multivariable version of the Fejér–Riesz theorem for strictly positive trigonometric polynomials [4] (see also [21]).

Theorem 1.2

(Multivariable Fejér–Riesz Theorem) A strictly positive \({\mathfrak L}({\mathfrak H})\) valued trigonometric polynomial p of degree d in r variables \(z = (z_1,\dots , z_r)\), can be factored as a finite sum \(p= \sum q_i^*q_i\), where each \(q_i\) is an analytic \({\mathfrak L}({\mathfrak H})\) valued polynomial. If \(p \ge \epsilon 1\), \(\epsilon > 0\), then the number of polynomials in the sum and their degrees can be bounded in terms of \(\epsilon \).

Unsurprisingly, because of the cancellations mentioned above, the degree bounds go to infinity as \(\epsilon \) goes to 0. Examination of the proofs yields no obvious choice of closed cone containing all strictly positive polynomials of a fixed degree, even in two variables.

Suppose \(\mathscr {R} = \{r_j\}_{j = 1}^n\) is a set of real polynomials on \(\mathbb {R}^r\) and \(\mathscr {S} = \{x\in \mathbb {R}^r{:}\, r_j(x) \ge 0 \text { for all }r_j \in \mathscr {R}\}\) is a semialgebraic set. Let \(\Delta \) be the set of functions from \(\{1,\dots ,n\}\) to \(\{0,1\}\). On \(\mathscr {S}\), there are various collections of polynomials which are non-negative. Besides the (finite) sums of squares of polynomials \(\sum _i q_i^2\), which are non-negative everywhere, one also has (again with finite sums)

  • the quadratic module: \(\sum _j r_j \sum _i q_{ji}^2\); and

  • the preordering: \(\sum _{\delta \in \Delta } \prod _j r_j^{\delta (j)}\sum _i q_{\delta i}^2\).

In the scalar setting, the multivariable Fejér–Riesz theorem is a special case of Schmüdgen’s theorem [25] (see also [1]).

Theorem 1.3

(Schmüdgen’s theorem) Let \(\mathscr {S}\) be a compact semialgebraic set in \(\mathbb {R}^r\) described by a finite set \(\mathscr {R}\) of polynomials. Any polynomial strictly positive over \(\mathscr {S}\) is in the preordering.

There are variations and refinements of this result. For example, if for each coordinate \(x_j\), \(\mathscr {R}\) contains the function \(c_j - x_j^2\) for some \(c_j > 0\), then Putinar proved that strictly positive polynomials are in the quadratic module over \(\mathscr {R}\) [16]. In addition, if in Schmüdgen’s theorem the set \(\mathscr {R}\) contains at most two polynomials, the preordering can be replaced by the quadratic module [15]. Cimprič [2] and Hol and Scherer [24] have extended some of these results to matrix valued polynomials.

Matters become more complicated when positivity replaces strict positivity. Both Schmüdgen’s and Putinar’s theorems are known not to hold then. Indeed, Scheiderer has shown that if the dimension of a compact semialgebraic set is 3 or more, there will always be positive polynomials which are not in the preordering [23]. On the other hand, he also proved that under mild restrictions, for a compact two dimensional semialgebraic set, all positive scalar valued polynomials are in the preordering [23]. The latter implies in particular that in two variables, any real valued positive trigonometric polynomial is a sum of squares of analytic polynomials.

The results of Schmüdgen, Putinar, and Scheiderer (as well as the various generalizations mentioned) are intimately tied to the study of real fields, with no known analogue when the polynomial coefficients are allowed to come from a ring of operators. Even the matrix valued results in [2, 24] are likewise connected, since they are proved by reducing to the scalar valued case and applying the known theorems. For this reason, it is particularly striking that most proofs of the operator Fejér–Riesz theorems take a purely analytic approach to the problem, though they say nothing about polynomials vanishing on \(\mathbb {T}^r\) (that is, there is no Nullstellensatz).

This motivates the following generalization of a part of Scheiderer’s work, a factorization theorem for positive matrix valued trigonometric polynomials in two variables.

Theorem

(Two variable Fejér–Riesz theorem) Let \({\mathfrak K}\) be a Hilbert space with \(\dim {\mathfrak K}< \infty \). Given a positive \({\mathfrak L}({\mathfrak K})\) valued trigonometric polynomial Q in two variables of degree \((d_1,d_2)\), there exists \(d_1' < \infty \) such that Q can be factored as a sum of at most \(2d_2\) hermitian squares of \({\mathfrak L}({\mathfrak K})\) valued, analytic polynomials with degrees bounded by \((d_1',2d_2-1)\).

The case where either \(d_1\) or \(d_2\) equals 0 can be excluded as it is covered by the single variable Fejér–Riesz theorem. The principal techniques are a hybrid of Rosenblum’s approach to the operator Fejér–Riesz theorem using the Lowdenslager criterion and an elaboration of the Schur complement approach found in [3, 4]. Though these methods are now well known, there are several new twists. As usual, a trigonometric polynomial is associated with a Toeplitz operator, though in the two variable setting, this becomes a Toeplitz operator of Toeplitz operators. Positive trigonometric polynomials correspond to finite degree positive Toeplitz operators (that is, having only finitely many nonzero diagonals). These can be viewed as either being “bi-infinite” in each variable (indexed from \(-\infty \) to \(+\infty \)) or “singly infinite” (indexed from 0 to \(+\infty \)), or even some mixture of these. In the singly infinite case, the multiplication operators identified to the variables are commuting isometries (in fact, unilateral shifts), while in the bi-infinite case they are commuting unitaries (bilateral shifts). The interplay between these turns out to be important here.

Collecting entries of a selfadjoint finite degree Toeplitz operator into large enough blocks, one gets a tridiagonal Toeplitz operator. Calling the main diagonal entry A and the off-diagonal entries B and \(B^*\), this Toeplitz operator is positive if and only if there is a positive operator M such that \(\begin{pmatrix} A-M &{}\quad B^* \\ B &{}\quad M \end{pmatrix} \ge 0\), and in this case the set of all such M forms a norm closed, bounded, and convex set \(\mathscr {M}\) [3]. The elements of \(\mathscr {M}\) play a similar role to the Schur complement supported on the (1, 1) entry of the Toeplitz operator, which happens to be the largest element of \(\mathscr {M}\). The Schur complement gives rise to the outer factorization (unique up to multiplication by a diagonal unitary) of the positive Toeplitz matrix. The smallest element of \(\mathscr {M}\) gives the “co-outer” factorization. Of particular interest will be those extremal polynomials for which \(\mathscr {M}\) is a singleton.

In two variables, A and B can be rearranged to be Toeplitz themselves. A complication then arises in that the elements of \(\mathscr {M}\) may not consist solely of Toeplitz operators, and in particular, the largest element will in general not be Toeplitz. Nevertheless, there will always be a closed convex subset \(\mathscr {M}_T\) of \(\mathscr {M}\) consisting of Toeplitz operators. The extremal case when \(\mathscr {M}_T\) (rather than \(\mathscr {M}\)) is a singleton is of central importance. It is proved that A can always be replaced by a Toeplitz \(\hat{A} \le A\) so that \(\mathscr {M}_T = \{\hat{M}\}\) is a singleton. However, while \(\hat{A}\) and \(\hat{M}\) will be associated to bounded trigonometric functions, it is not evident that they necessarily have finite degree, and so correspond to polynomials.

Another difficulty then arises. While the Fejér–Riesz theorem guarantees the existence of a factorization for a positive single variable trigonometric polynomial, it is well known that not all positive trigonometric functions can be factored as a hermitian square of an analytic function. Some restriction, as in for example Szegő’s theorem [19], is usually needed. Despite this, it will follow by minimality that \(\hat{A}-\hat{M}\) and \(\hat{M}\) have outer factorizations \(E^*E\) and \(F^*F\). It will also imply that \(B = F^*GE\), where \(G = V^*_FV_E\) is Toeplitz with a bi-infinite unitary extension, and \(V_E\) and \(V_F\) inner (so analytic and isometric). This, with the degree bounds for B, then leads to degree bounds for E and F when \({\mathfrak H}\) is finite dimensional, and so for \(\hat{A}\) and \(\hat{M}\). An argument using the one variable Fejér–Riesz theorem then finishes the proof of the two variable theorem.

Why is it not possible to use the same ideas to factor positive trigonometric polynomials in three or more variables? While the theorem presented is about trigonometric polynomials in two variables, frequent use is made of results concerning polynomials in one variable, and outer factorizations of these, especially in showing that the degrees of \(\hat{A}\) and \(\hat{M}\) mentioned above are finite. Outerness (interpreted appropriately) does not necessarily apply to analytic factors in two variables. Indeed, there are examples of positive polynomials in three or more variables which cannot be factored as a sum of squares of polynomials, indirectly indicating that there will exist positive polynomials in two variables without outer factorizations. Outer factorizations of multivariable trigonometric polynomials are explored further in [4].

2 Toeplitz and analytic operators, and their relation to trigonometric polynomials

A shift operator is an isometry S on a Hilbert space \({\mathfrak H}\) with trivial unitary component in its Wold decomposition. It is then natural to write for some Hilbert space \({\mathfrak G}\), \({\mathfrak H}= {\mathfrak G}\oplus {\mathfrak G}\oplus \cdots \) (identified with the Hardy space \(H^2({\mathfrak G})\)), and

$$\begin{aligned} S (h_0,h_1,\dots )^t = (0,h_0,h_1,\dots ) ^t \end{aligned}$$

when the elements of \({\mathfrak H}\) are written as column vectors (here “t” indicates transpose). The multiplicity of S is then \(\dim {\mathfrak G}= \dim \ker S^*\).

Fix a shift S. If \(T,A\in {\mathfrak L}({\mathfrak H})\), say that T is Toeplitz if \(S^*TS=T\), and that A is analytic if \(AS=SA\). To distinguish this case from that of Toeplitz operators on \(L^2\) spaces introduced below, such operators will be referred to as being singly infinite Toeplitz operators. Viewed as an operator on \(H^2({\mathfrak G})\), pre-multiplication of T by \(S^*\) deletes the first row of T and shifts T upwards by one row, while pre-multiplication by S shifts T downwards by a row and sets the first row entries to 0. Likewise, post-multiplication by S deletes the first column and shifts left by one column. Hence as matrices with entries in \({\mathfrak L}({\mathfrak G})\), such Toeplitz and analytic operators have the forms

$$\begin{aligned} T = \begin{pmatrix} T_0 &{}\quad T_{-1} &{}\quad T_{-2} &{}\quad \cdots \\ T_1 &{}\quad T_0 &{}\quad T_{-1} &{}\quad \ddots \\ T_2 &{}\quad T_1 &{}\quad T_0 &{}\quad \ddots \\ \vdots &{}\quad \ddots &{}\quad \ddots &{}\quad \ddots \end{pmatrix}, \qquad A = \begin{pmatrix} A_0 &{}\quad 0 &{}\quad 0 &{}\quad \cdots \\ A_1 &{}\quad A_0 &{}\quad 0 &{}\quad \ddots \\ A_2 &{}\quad A_1 &{}\quad A_0 &{}\quad \ddots \\ \vdots &{}\quad \ddots &{}\quad \ddots &{}\quad \ddots \end{pmatrix}. \end{aligned}$$
(2.1)

An analytic operator A is termed outer if \(\overline{\textrm{ran}}\,A\) is a subspace of \({\mathfrak H}\) of the form \(H^2({\mathfrak F})\) for some closed subspace \({\mathfrak F}= \overline{\textrm{ran}}\,A_0\) of \({\mathfrak G}\); equivalently, \(\overline{\textrm{ran}}\,A = \bigoplus _0^\infty \overline{\textrm{ran}}\,A_0\) reduces S.

So far it has been assumed that the entries \(T_j\) of the Toeplitz operators T are in \({\mathfrak L}({\mathfrak G})\), and while this is necessary if \(T\ge 0\), it is also natural to more generally consider Toeplitz and analytic operators where \(T_j \in {\mathfrak L}({\mathfrak G}_1,{\mathfrak G}_2)\), \({\mathfrak G}_1\), \({\mathfrak G}_2\), Hilbert spaces.

Now consider Laurent and analytic polynomials \(Q(z) = \sum _{k=-d_-}^{d_+} Q_k z^k\) and \(P(z) = \sum _{k=0}^d P_k z^k\) on \(\mathbb {T}\) with coefficients in \({\mathfrak L}({\mathfrak G})\). Refer to \(\deg _-(Q) = d_-\), \(\deg _+(Q) = d_+\), and \(\deg (Q) = d_- + d_+\) as the degrees of Q, and \(\deg (P) = d\) as the degree of P, assuming that \(Q_{d_+}\), \(Q_{-d_-}\), and \(P_d\) are nonzero while \(Q_{d_+ + \ell }\), \(Q_{-d_- - \ell }\), and \(P_{d+\ell }\) are zero for \(\ell \ge 1\). The formulas

$$\begin{aligned} T_Q = \begin{pmatrix} Q_0 &{}\quad Q_{-1} &{}\quad Q_{-2} &{}\quad \cdots \\ Q_1 &{}\quad Q_0 &{}\quad Q_{-1} &{}\quad \ddots \\ Q_2 &{}\quad Q_1 &{}\quad Q_0 &{}\quad \ddots \\ \vdots &{}\quad \ddots &{}\quad \ddots &{}\quad \ddots \end{pmatrix}, \qquad T_P = \begin{pmatrix} P_0 &{}\quad 0 &{}\quad 0 &{}\quad \cdots \\ P_1 &{}\quad P_0 &{}\quad 0 &{}\quad \ddots \\ P_2 &{}\quad P_1 &{}\quad P_0 &{}\quad \ddots \\ \vdots &{}\quad \ddots &{}\quad \ddots &{}\quad \ddots \end{pmatrix} \end{aligned}$$
(2.2)

then define bounded linear operators on \({\mathfrak H}\). The operator \(T_Q\) is Toeplitz, while \(T_P\) is analytic, and \(d_-\), \(d_+\), and d are likewise called the degrees of \(T_Q\) and \(T_P\). Even if Q and P are not polynomials, but are nevertheless bounded functions, the operators \(T_Q\) and \(T_P\) will be bounded. Moreover,

  • \(Q(z) \ge 0\) for all \(z\in \mathbb {T}\) if and only if \(T_Q \ge 0\);

  • \(Q(z) = P(z)^*P(z)\) for all \(z \in \mathbb {T}\) if and only if \(T_Q = T_P^*T_P\).
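These equivalences can be tested numerically on finite truncations: for an analytic P of degree d, the compression of \(T_P^*T_P\) agrees with that of \(T_{P^*P}\) except in the last d rows and columns, where the truncation cuts off the sums. A small sketch (the helper below is illustrative, not notation from the text):

```python
import numpy as np

def analytic_toeplitz(p, N):
    """N x N truncation of the analytic (lower triangular) Toeplitz operator
    with symbol P(z) = sum_k p[k] z^k."""
    T = np.zeros((N, N), dtype=complex)
    for k, c in enumerate(p):
        T += c * np.eye(N, k=-k)   # coefficient p[k] on the k-th subdiagonal
    return T

p = np.array([1.0, 0.5])                 # P(z) = 1 + z/2, degree d = 1
q = np.convolve(np.conj(p[::-1]), p)     # Q = P^*P has coefficients [1/2, 5/4, 1/2]
N, d = 6, 1
TP = analytic_toeplitz(p, N)
# truncation of T_Q, with Q_{j-k} in entry (j, k)
TQ = sum(q[k + d] * np.eye(N, k=-k) for k in range(-d, d + 1))
G = TP.conj().T @ TP                     # agrees with TQ away from the edge
```

Away from the truncation edge the products match, and the compressions of \(T_Q\) are positive semidefinite, reflecting \(Q \ge 0\) on \(\mathbb {T}\).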

Recall that \(Q(z) \ge 0\) means that for all \(g\in {\mathfrak G}\), \(\left<Q(z)g,g\right> \ge 0\), while \(Q(z) > 0\) if for some \(\epsilon > 0\), \(\left<Q(z)g,g\right> \ge \epsilon \Vert g\Vert ^2\). Write \( Q \ge 0\) and \(Q > 0\) for \(Q(z) \ge 0\), respectively \(Q(z) \ge \epsilon 1\) for some \(\epsilon > 0\), holding for all \(z\in \mathbb {T}\).

An analytic function P(z) is outer if the analytic Toeplitz operator \(T_P\) is outer. The Fejér–Riesz theorem (Theorem 1.1) can be restated in terms of Toeplitz operators: A positive Toeplitz operator \(T \in {\mathfrak L}(H^2({\mathfrak G}))\) of finite degree \(d = d_+ = d_-\) has the form \(F^*F\), where \(F\in {\mathfrak L}(H^2({\mathfrak G}))\) is an outer analytic operator of the same degree d as T.

While the Fejér–Riesz theorem states that any bounded positive Toeplitz operator of finite degree has an outer, and hence analytic, factorization, the same need not be true if the degree is not finite. Some additional condition needs to be imposed (see [19], especially Section 3.4, and [5, Lemma 2.3]). The next result is a mild generalization of Lowdenslager’s criterion, which ensures an outer (and so analytic) factorization. First, some notation. If \(T \in {\mathfrak L}(H^2({\mathfrak G}))\) is positive and Toeplitz (so \(T = S^*TS\), S the unilateral shift) and \(T = F^*F\) is a factorization, then \(FS = S_FF\) defines an isometry \(S_F\) on \(\overline{\textrm{ran}}\,F\), since \(\Vert FSh\Vert ^2 = \left<S^*TSh,h\right> = \left<Th,h\right> = \Vert Fh\Vert ^2\); this \(S_F\) is referred to as the Lowdenslager isometry associated to F. Here, unlike in [19], the term “factorization” does not imply analyticity of F.

Lemma 2.1

(Lowdenslager criterion) Let \(T \in {\mathfrak L}(H^2({\mathfrak G}))\) be positive and Toeplitz. The following are equivalent:

  1. (i)

    There is a factorization \(T = F^*F\) where F is outer; 

  2. (ii)

    For some factorization \(T = G^*G\), the Lowdenslager isometry \(S_G\) is a unilateral shift; 

  3. (iii)

    For every factorization \(T = H^*H\), the Lowdenslager isometry \(S_H\) is a unilateral shift.

In this case, the multiplicities of all the Lowdenslager isometries are equal.

The proof that (i) implies (iii) and (ii) implies (i) is essentially the same as for the classical Lowdenslager criterion [19].

A consequence of the Lowdenslager criterion is that if E is analytic then it has an inner-outer factorization; that is, \(E = VF\), where F is outer and V is inner (so analytic and isometric).

One can extend singly infinite Toeplitz operators to \({\tilde{\mathfrak H}}= \dots \oplus {\mathfrak G}\oplus {\mathfrak G}\oplus {\mathfrak G}\oplus \dots = L^2({\mathfrak G})\) simply by continuing each of the diagonals. The shift operator becomes the unitary bilateral shift. The resulting bi-infinite Toeplitz operator is positive if and only if the same is true for the corresponding singly infinite Toeplitz operator.

All other notions considered so far carry over naturally to the multi-index/multivariable setting. Only the two index/variable case is examined, the version for three or more then being evident. Suppose that \(S_2\) is a shift operator on \({\mathfrak H}= \bigoplus _{j_2 = 0}^\infty {\mathfrak G}\), and that \(S_1\) is a shift operator on \({\mathfrak G}= \bigoplus _{j_1 = 0}^\infty {\mathfrak K}\). If T is a Toeplitz operator on \({\mathfrak H}\) with the property that each \(T_{j_2}\) is a Toeplitz operator on \({\mathfrak G}\), say that T is a bi-Toeplitz operator (or multi-Toeplitz more generally). Call T bi-analytic (respectively, multi-analytic) if T is analytic and each \(T_{j_2}\) is analytic. It is sometimes convenient to shift back and forth to the bi-infinite Toeplitz setting in one of the variables. If T is a bi-Toeplitz operator on \(\bigoplus _{j_1 = 0}^\infty \bigoplus _{j_2 = 0}^\infty {\mathfrak K}\), the entries are naturally labeled by two indices, \((j_1,j_2)\). Let \({\tilde{\mathfrak G}}= \bigoplus _{j_2 = 0}^\infty {\mathfrak K}\) and \({\tilde{\mathfrak H}}= \bigoplus _{j_1 = 0}^\infty \bigoplus _{j_2 = 0}^\infty {\mathfrak K}= \bigoplus _{j_1 = 0}^\infty {\tilde{\mathfrak G}}\). The indices of T can be interchanged to get another operator \(\tilde{T}\) on \({\tilde{\mathfrak H}}\). The exchange is implemented via a permutation of rows and columns corresponding to conjugation with the unitary operator \(W{:}\,{\tilde{\mathfrak H}}\rightarrow {\mathfrak H}\) having the identity 1 in the entries labeled with \(((j_1,j_2),(j_2,j_1))\) and 0 elsewhere.
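On a finite \(n_1 n_2\) dimensional truncation, the exchange unitary W is simply the permutation matrix implementing a reshape and transpose of the coordinates; the following sketch (with illustrative sizes) makes this concrete.

```python
import numpy as np

# Finite model of the index-exchange unitary W: the basis vector labeled
# (j1, j2) (lexicographic index j1*n2 + j2) is sent to the one labeled (j2, j1).
n1, n2 = 3, 4
W = np.zeros((n1 * n2, n1 * n2))
for j1 in range(n1):
    for j2 in range(n2):
        W[j2 * n1 + j1, j1 * n2 + j2] = 1.0

v = np.arange(float(n1 * n2))
# Applying W is a reshape to n1 x n2 followed by a transpose.
exchanged = v.reshape(n1, n2).T.ravel()
```

Conjugating a truncated bi-Toeplitz operator by W then interchanges its two indices.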

As in the single variable setting, there are Laurent and analytic polynomials with coefficients in \({\mathfrak L}({\mathfrak K})\), but now in \(z = (z_1,z_2)\), where \(z_1\) and \(z_2\) commute. These look like

$$\begin{aligned} Q(z) = \sum _{k_2=-m_2}^{m_2} \left( \sum _{k_1=-m_1}^{m_1} Q_{k_2,k_1} z_1^{k_1}\right) z_2^{k_2} \quad \text {and}\quad P(z) = \sum _{k_2=0}^{m_2} \left( \sum _{k_1=0}^{m_1} P_{k_2,k_1} z_1^{k_1}\right) z_2^{k_2}. \end{aligned}$$

Set \(Q_{k_2,k_1} =0\) whenever \(k_1\notin [-m_1,m_1]\) or \(k_2\notin [-m_2,m_2]\), and set \(P_{k_2,k_1}=0\) whenever \(k_1\notin [0,m_1]\) or \(k_2\notin [0,m_2]\). This results in trigonometric polynomials in the variable \(z_2\) with coefficients which are trigonometric polynomials in the variable \(z_1\). Much as before, the formulas

$$\begin{aligned} T_Q = (Q_{j_2-k_2,j_1-k_1})_{(j_2,j_1),\, (k_2,k_1) \in \mathbb {N}\times \mathbb {N}}, \qquad T_P = (P_{j_2-k_2,j_1-k_1})_{(j_2,j_1),\, (k_2,k_1) \in \mathbb {N}\times \mathbb {N}} \end{aligned}$$
(2.3)

define bounded operators on \({\mathfrak H}\), the first being bi-Toeplitz and the second bi-analytic. If indices are interchanged and \({\tilde{T}}_Q\) and \({\tilde{T}}_P\) are viewed as operators on \({\tilde{\mathfrak H}}\), this amounts to taking Q and P as polynomials in \(z_1\) with coefficients which are polynomials in \(z_2\). The pairs \((m_{1,\pm },m_{2,\pm })\) are referred to as the degrees of Q (equivalently, of \(T_Q\)) when they are the smallest non-negative integers such that \(Q_{k_2,k_1} = 0\) whenever \(k_1\notin [-m_{1,-},m_{1,+}]\) or \(k_2\notin [-m_{2,-},m_{2,+}]\). When Q is positive, \(m_{1,+} = m_{1,-}\) and \(m_{2,+} = m_{2,-}\), so these degrees are unambiguously written as \((m_1,m_2)\).

In analogy with the one variable case,

  • \(Q(z) \ge 0\) for all \(z\in \mathbb {T}^2\) if and only if \(T_Q \ge 0\);

  • \(Q(z) = \sum _j P_j(z)^*P_j(z)\) for all \(z \in \mathbb {T}^2\) if and only if \(T_Q = \sum _j T_{P_j}^*T_{P_j}\).
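As a numerical illustration of the first equivalence, take a separable \(Q(z_1,z_2) = q(z_1)q(z_2)\); the truncated bi-Toeplitz operator is then a Kronecker product of one variable Toeplitz truncations. The helper below is an assumption of the sketch, not notation from the text.

```python
import numpy as np

def toeplitz_trunc(c, d, N):
    """N x N truncation of the Toeplitz operator of the one variable symbol
    sum_{k=-d}^{d} c[k+d] z^k (entry (j, m) holds the coefficient of z^{j-m})."""
    return sum(c[k + d] * np.eye(N, k=-k) for k in range(-d, d + 1))

# Q(z1, z2) = q(z1) q(z2) with q(z) = z^{-1}/2 + 5/4 + z/2 = |1 + z/2|^2 >= 0
c = np.array([0.5, 1.25, 0.5])
T1 = toeplitz_trunc(c, 1, 5)
TQ = np.kron(T1, T1)     # bi-Toeplitz truncation: outer index j2, inner index j1

z = np.exp(1j * np.linspace(0, 2 * np.pi, 16, endpoint=False))
q_vals = (0.5 / z + 1.25 + 0.5 * z).real
Q_vals = np.outer(q_vals, q_vals)        # Q sampled on a grid in T^2
```

Positivity of Q on the grid is matched by positive semidefiniteness of the truncations of \(T_Q\).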

3 Schur complements

Schur complements play an essential role in several proofs of the operator Fejér–Riesz theorem [3, 4]. A survey of their use in this way can be found in [5]. Here is the definition.

Definition 3.1

Let \({\mathfrak H}\) be a Hilbert space and \(0 \le T\in {\mathfrak L}({\mathfrak H})\). Let \({\mathfrak K}\) be a closed subspace of \({\mathfrak H}\), and \(P_{\mathfrak K}\in {\mathfrak L}({\mathfrak H},{\mathfrak K})\) the orthogonal projection of \({\mathfrak H}\) onto \({\mathfrak K}\). Then there is a unique operator \(0 \le M = M(T,{\mathfrak K})\in {\mathfrak L}({\mathfrak K})\) called the Schur complement of T supported on \({\mathfrak K}\), such that

  1. (i)

    \(T-P_{\mathfrak K}^* M P_{\mathfrak K}\ge 0\);

  2. (ii)

    if \(\widetilde{M} \in {\mathfrak L}({\mathfrak K})\), \(\widetilde{M} \ge 0\), and \(T-P_{\mathfrak K}^* \widetilde{M} P_{\mathfrak K}\ge 0\), then \(\widetilde{M} \le M\).

There are several equivalent means of obtaining the Schur complement. For example, if \(T = \begin{pmatrix} A &{} B^* \\ B &{} C \end{pmatrix}\) on \({\mathfrak K}\oplus {\mathfrak K}^\bot \), M is found by

$$\begin{aligned} \left<Mf,f\right> = \inf _{g\in {\mathfrak K}^\bot } \left< \begin{pmatrix} A &{}\quad B^* \\ B &{}\quad C \end{pmatrix} \begin{pmatrix} f \\ g \end{pmatrix}, \begin{pmatrix} f \\ g \end{pmatrix}\right>, \qquad f\in {\mathfrak K}. \end{aligned}$$
(3.1)
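For positive finite block matrices the infimum in (3.1) is attained and gives the familiar formula \(M = A - B^*C^+B\), with \(C^+\) the Moore–Penrose pseudoinverse. A numerical sketch of Definition 3.1 under this standard identification (the helper name is illustrative):

```python
import numpy as np

def schur_complement(T, k):
    """Schur complement of the positive matrix T = [[A, B*], [B, C]]
    supported on its first k coordinates, computed as A - B* C^+ B."""
    A, Bs = T[:k, :k], T[:k, k:]
    B, C = T[k:, :k], T[k:, k:]
    return A - Bs @ np.linalg.pinv(C) @ B

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4))
T = X.T @ X                       # a positive semidefinite test matrix
M = schur_complement(T, 2)
E = T.copy()
E[:2, :2] -= M                    # property (i): T - P* M P >= 0
```

Property (ii), maximality, is also visible numerically: enlarging M by any \(\epsilon > 0\) on the supported block destroys positivity.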

Suppose that T is a positive Toeplitz operator of finite degree \(d = d_+ = d_-\) on \({\mathfrak H}= \bigoplus _0^\infty {\mathfrak K}\). By grouping the entries into \(d\times d\) sub-matrices, T can be taken to be tridiagonal. Write \({\tilde{\mathfrak K}}\) for \(\bigoplus _0^{d-1} {\mathfrak K}\) and view \({\mathfrak H}= \bigoplus _0^\infty {\tilde{\mathfrak K}}\). Then

$$\begin{aligned} T = \begin{pmatrix} A &{}\quad B^* &{}\quad 0 &{}\quad \cdots \\ B &{}\quad A &{}\quad B^* &{}\quad \ddots \\ 0 &{}\quad B &{}\quad A &{}\quad \ddots \\ \vdots &{}\quad \ddots &{}\quad \ddots &{}\quad \ddots \end{pmatrix}. \end{aligned}$$
(3.2)

Let \(M_+\) denote the Schur complement supported on the first copy of \({\tilde{\mathfrak K}}\). Then by (3.1) and the Toeplitz structure of T (see [3, 5]),

$$\begin{aligned} \begin{pmatrix} A-M_+ &{}\quad B^* \\ B &{}\quad M_+ \end{pmatrix} \ge 0. \end{aligned}$$
(3.3)

In fact, \(T\ge 0\) if and only if there is some \(M\ge 0\) such that the inequality (3.3) holds with M in place of \(M_+\). In this case, write \(\mathscr {M}\) for the set of positive operators M satisfying (3.3). This set is norm closed and convex with maximal element equal to the Schur complement \(M_+\). There is also a minimal element \(M_-\) which is constructed by finding the maximal element \(N_+\) such that

$$\begin{aligned} \begin{pmatrix} N_+ &{}\quad B^* \\ B &{}\quad A-N_+ \end{pmatrix} \ge 0 \end{aligned}$$

and setting \(M_- = A - N_+\). Evidently, \(N_+\) is the Schur complement supported on the first copy of \({\tilde{\mathfrak K}}\) in

$$\begin{aligned} \begin{pmatrix} A &{}\quad B &{}\quad 0 &{}\quad \cdots \\ B^* &{}\quad A &{}\quad B &{}\quad \ddots \\ 0 &{}\quad B^* &{}\quad A &{}\quad \ddots \\ \vdots &{}\quad \ddots &{}\quad \ddots &{}\quad \ddots \end{pmatrix}. \end{aligned}$$
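In the scalar case (A and B numbers) these extremal elements are easy to compute: eliminating the tridiagonal truncations from the bottom up gives the decreasing recursion \(m \leftarrow A - |B|^2/m\), started at \(m = A\), whose iterates are the Schur complements of the finite truncations and whose limit is \(M_+\); the same recursion applied to the reflected operator gives \(N_+\), and \(M_- = A - N_+\). A sketch under the assumption \(A \ge 2|B|\) (positivity of the symbol), with an illustrative helper name:

```python
import numpy as np

def m_plus(a, b, iters=500):
    """Limit of the bottom-up elimination recursion m <- a - |b|^2 / m for the
    scalar tridiagonal Toeplitz operator with diagonal a and off-diagonal b.
    The iterates decrease to the maximal element M_+ of (3.3); assumes a >= 2|b|."""
    m = a
    for _ in range(iters):
        m = a - abs(b) ** 2 / m
    return m

a, b = 1.25, 0.5          # symbol 1.25 + cos(theta) >= 0 on the circle
mp = m_plus(a, b)         # here M_+ = 1, and M_- = a - m_plus(a, b) = 1/4
```

For this example \(\mathscr {M}\) is the interval \([\tfrac{1}{4},1]\), and the matrix in (3.3) with \(M = M_+\) is positive semidefinite and singular.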

In the context of positive Toeplitz operators, Schur complements have a certain inheritance property, in that if \(T_Q\) is Toeplitz on \(H^2({\mathfrak G})\) and \(M_n(T_Q)\) is the Schur complement supported on the upper left \(n\times n\) corner of \(T_Q\), then \(M_n(M_{n+1}(T_Q)) = M_n(T_Q)\); that is, the Schur complement on the upper left \(n\times n\) corner of the Schur complement on the upper left \((n+1)\times (n+1)\) corner of \(T_Q\) is the same as the Schur complement on the upper left \(n\times n\) corner of \(T_Q\). In addition, if \(\deg T_Q \le n\), then

$$\begin{aligned} M_{n+1}(T_Q) = \begin{pmatrix} Q_0 &{}\quad Q_1^* &{}\quad \cdots &{}\quad Q_n^* \\ Q_1 &{}\quad &{}\quad &{}\quad \\ \vdots &{}\quad &{}\quad M_n(T_Q) &{}\quad \\ Q_n &{}\quad &{}\quad &{}\quad \end{pmatrix}, \end{aligned}$$
(3.4)

where \(Q_j = 0\) for \(j > \deg T_Q\). This enables the construction of the Fejér–Riesz factorization in the one variable case, via

$$\begin{aligned} R_n(T)&= M_{n+1}(T_Q) - \begin{pmatrix} &{}\quad &{}\quad &{}\quad 0 \\ &{}\quad M_n(T_Q) &{}\quad &{}\quad \vdots \\ &{}\quad &{}\quad &{}\quad 0 \\ 0 &{}\quad \cdots &{}\quad 0 &{}\quad 0 \end{pmatrix}\\&= \begin{pmatrix} P_d^* &{}\quad 0 &{}\quad \cdots &{}\quad 0 \\ \vdots &{}\quad P_d^* &{}\quad \ddots &{}\quad \vdots \\ P_0^* &{}\quad \vdots &{}\quad \ddots &{}\quad 0 \\ 0 &{}\quad P_0^* &{}\quad \ddots &{}\quad P_d^* \\ \vdots &{}\quad \ddots &{}\quad \ddots &{}\quad \vdots \\ 0 &{}\quad \cdots &{}\quad 0 &{}\quad P_0^* \end{pmatrix} \begin{pmatrix} P_d &{}\quad \cdots &{}\quad P_0 &{}\quad 0 &{}\quad \cdots &{}\quad 0 \\ 0 &{}\quad P_d &{}\quad \cdots &{}\quad P_0 &{}\quad \ddots &{}\quad \vdots \\ \vdots &{}\quad \ddots &{}\quad \ddots &{}\quad \ddots &{}\quad \ddots &{}\quad 0 \\ 0 &{}\quad \cdots &{}\quad 0 &{}\quad P_d &{}\quad \cdots &{}\quad P_0 \end{pmatrix}. \end{aligned}$$

See [4, 5] for more details.
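
In finite dimensions the inheritance property is the classical quotient property of Schur complements, and it can be checked numerically. A minimal sketch (NumPy; the helper `schur_upper_left` and the choice of symbol are illustrative, not taken from the text) verifies \(M_2(M_3(T)) = M_2(T)\) for a positive definite Toeplitz section of \(3 + z + z^{-1}\):

```python
import numpy as np

def schur_upper_left(T, n):
    """Schur complement of the hermitian positive definite matrix T supported
    on its upper-left n x n corner: A - B* C^{-1} B for T = [[A, B*], [B, C]]."""
    A, Bstar, C = T[:n, :n], T[:n, n:], T[n:, n:]
    return A - Bstar @ np.linalg.solve(C, Bstar.conj().T)

# Positive definite Toeplitz section: the symbol 3 + z + z^{-1} >= 1 on the circle.
N = 8
T = 3 * np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)

M2 = schur_upper_left(T, 2)   # M_2(T)
M3 = schur_upper_left(T, 3)   # M_3(T)

# Inheritance (quotient) property: M_2(M_3(T)) = M_2(T).
assert np.allclose(schur_upper_left(M3, 2), M2)
```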

A particularly interesting situation, termed extremal, occurs when \(\mathscr {M} = \{M\}\), a singleton. In this case, if there are factorizations \(A - M = E^*E\) and \(M = F^*F\), then \(B = F^*UE\), where U is unitary from \(\overline{\textrm{ran}}\,E\) to \(\overline{\textrm{ran}}\,F\). However, such a factorization of B with a unitary for some element \(M\in \mathscr {M}\) does not necessarily guarantee extremality. In order to examine this more carefully, the following test is introduced.

Lemma 3.2

Suppose that

$$\begin{aligned} \mathscr {M} = \left\{ M{:}\, \begin{pmatrix} A - M &{}\quad B^* \\ B &{}\quad M \end{pmatrix} \ge 0\right\} \end{aligned}$$

is non-empty with maximal element \(M_+\), minimal element \(M_-\), and \(M_0 = \tfrac{1}{2}(M_+ + M_-)\). Let \(A-M_\pm = E_\pm ^*E_\pm \) and \(A-M_0 = E_0^*E_0\), \(M_\pm = F_\pm ^*F_\pm \), \(M_0 = F_0^*F_0\), and \(B = F_\pm ^*G_\pm E_\pm = F_0^*G_0 E_0\), where \(G_\pm : \overline{\textrm{ran}}\,E_\pm \rightarrow \overline{\textrm{ran}}\,F_\pm \) and \(G_0: \overline{\textrm{ran}}\,E_0 \rightarrow \overline{\textrm{ran}}\,F_0\) are contractions. The set \(\mathscr {M}\) is a singleton if and only if the operators \(G_+\), \(G_-\) and \(G_0\) are unitary.

Proof

Let \(M_+\), \(M_-\) be the maximal and minimal elements of \(\mathscr {M}\). Suppose that \(G_+\), \(G_-\) and \(G_0\) are the corresponding contractions as in the statement of the lemma. It is straightforward to verify that \(G_+\) is an isometry and \(G_-\) is a co-isometry. Hence if \(\mathscr {M}\) is a singleton, \(G_+ = G_- = G_0\) is unitary.

Conversely, assume that for every \(M\in \{ M_+,M_-,M_0\}\), there are factorizations as in the statement of the lemma, where the operators \(G_+\), \(G_-\) and \(G_0\) are unitary. Without loss of generality, by absorbing \(G_\pm \) into \(E_\pm \) or \(F_\pm \), it is possible to take \(G_\pm = 1\) on \(\overline{\textrm{ran}}\,E_\pm = \overline{\textrm{ran}}\,F_\pm \).

Since \(M_- \le M_+\), there exist contractions \(H_-{:}\,\overline{\textrm{ran}}\,E_- \rightarrow \overline{\textrm{ran}}\,E_+\), \(H_+{:}\,\overline{\textrm{ran}}\,F_+ \rightarrow \overline{\textrm{ran}}\,F_-\) with dense ranges such that \(E_+ = H_- E_-\) and \(F_- = H_+ F_+\). Let \(H_- = V_- |H_-|\) be the polar decomposition. Then \(E_+^*E_+ = E_+^*V_-V_-^*E_+\), so replacing \(E_+\) by \(V_-^*E_+\) if necessary, \(\overline{\textrm{ran}}\,E_+ = \overline{\textrm{ran}}\,E_-\) and \(H_-\) is a positive contraction. Similarly, one can take \(\overline{\textrm{ran}}\,F_+ = \overline{\textrm{ran}}\,F_-\) and \(H_+\) a positive contraction. So \(\overline{\textrm{ran}}\,E_+ = \overline{\textrm{ran}}\,E_- = \overline{\textrm{ran}}\,F_+ = \overline{\textrm{ran}}\,F_-\) and

$$\begin{aligned} \begin{aligned} B&= F_-^* E_- = F_+^* H_+ E_- \\&= F_+^* E_+ = F_+^* H_- E_-. \end{aligned} \end{aligned}$$

Consequently, \(H_+ = H_-\). Denote this operator by H.

The set \(\mathscr {M}\) is convex, so

$$\begin{aligned} \begin{aligned} 0&\le \begin{pmatrix} A - \tfrac{1}{2}(M_+ + M_-) &{}\quad B^* \\ B &{}\quad \tfrac{1}{2}(M_+ + M_-) \end{pmatrix} \\&= \frac{1}{2}\left[ \begin{pmatrix} E_+^*E_+ &{}\quad B^* \\ B &{}\quad F_+^*F_+ \end{pmatrix} + \begin{pmatrix} E_-^*E_- &{}\quad B^* \\ B &{}\quad F_-^*F_- \end{pmatrix} \right] \\&= \frac{1}{2}\left[ \begin{pmatrix} E_-^*H^2E_- &{}\quad E_-^*HF_+ \\ F_+^*HE_- &{}\quad F_+^*F_+ \end{pmatrix} + \begin{pmatrix} E_-^*E_- &{}\quad E_-^*HF_+ \\ F_+^*HE_- &{}\quad F_+^*H^2F_+ \end{pmatrix} \right] \\&= \frac{1}{2} \begin{pmatrix} E_-^* &{}\quad 0 \\ 0 &{}\quad F_+^* \end{pmatrix} \begin{pmatrix} 1+H^2 &{}\quad 2H \\ 2H &{}\quad 1+H^2 \end{pmatrix} \begin{pmatrix} E_- &{}\quad 0 \\ 0 &{}\quad F_+ \end{pmatrix}. \end{aligned} \end{aligned}$$

Recall that \(M_0 = \tfrac{1}{2}(M_+ + M_-)\), \(A-M_0 = E_0^*E_0\), \(M_0 = F_0^*F_0\), and \(B = F_0^*G_0E_0\), where by assumption, \(G_0:\overline{\textrm{ran}}\,E_0 \rightarrow \overline{\textrm{ran}}\,F_0\) is unitary. Then the Schur complement supported on the top left corner of \(\begin{pmatrix} A - M_0 &{}\quad B^* \\ B &{}\quad M_0 \end{pmatrix}\) is zero. Thus the Schur complement of the top left corner of \(\begin{pmatrix} 1+H^2 &{} 2H \\ 2H &{} 1+H^2 \end{pmatrix}\) must also be zero. Since \(1+H^2\) is invertible, it is a standard fact that this Schur complement equals \((1+H^2) - 4H(1+H^2)^{-1}H\); multiplying through by \(1+H^2\), which commutes with H, gives \(1 - 2H^2 + H^4 = 0\). Since \(1 - 2H^2 + H^4 = (1-H^2)^2\) and \(H \ge 0\), the spectrum of H is \(\{1\}\), and so \(H = 1\). From this it follows that \(M_+ = M_-\); that is, \(\mathscr {M}\) is a singleton. \(\square \)
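
The final step can be seen in a scalar model, where H is multiplication by \(h \in [0,1]\): the Schur complement of the top left entry of \(\begin{pmatrix} 1+h^2 &{} 2h \\ 2h &{} 1+h^2 \end{pmatrix}\) is \((1-h^2)^2/(1+h^2)\), which vanishes precisely at \(h = 1\). A quick numerical check (NumPy; purely illustrative, not part of the proof):

```python
import numpy as np

def schur_complement_model(h):
    """Schur complement of the top left entry of [[1+h^2, 2h], [2h, 1+h^2]]."""
    return (1 + h**2) - 4 * h**2 / (1 + h**2)

for h in [0.0, 0.3, 0.7, 1.0]:
    # Agrees with the factored form (1 - h^2)^2 / (1 + h^2).
    assert np.isclose(schur_complement_model(h), (1 - h**2) ** 2 / (1 + h**2))

# The complement vanishes only at h = 1, mirroring H = 1 in the operator setting.
assert schur_complement_model(1.0) == 0.0
assert all(schur_complement_model(h) > 0 for h in [0.0, 0.5, 0.99])
```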

The next theorem shows that even if \(\mathscr {M}\) is not a singleton, it is possible to replace A by \(\hat{A} \le A\) so that it is.

Theorem 3.3

Suppose that the set

$$\begin{aligned} \mathscr {M} = \left\{ M{:}\, \begin{pmatrix} A - M &{}\quad B^* \\ B &{}\quad M \end{pmatrix} \ge 0\right\} \end{aligned}$$

is not empty. Then there exists \(\hat{A} \le A\) such that the set

$$\begin{aligned} \hat{\mathscr {M}} = \left\{ M{:}\, \begin{pmatrix} \hat{A} - M &{}\quad B^* \\ B &{}\quad M \end{pmatrix} \ge 0\right\} \end{aligned}$$

is a singleton.

Proof

Assume that the set \(\mathscr {M}\) is not a singleton. Then \(\mathscr {M}\) has distinct maximal and minimal elements \(M_+\) and \(M_-\). Set \(M_0 = \tfrac{1}{2}(M_+ + M_-)\). By Lemma 3.2, for \(M_*\) one of these three, with factorizations \(A-M_* = E^*E\), \(M_* = F^*F\), and \(B = F^*GE\), the contraction \(G: \overline{\textrm{ran}}\,E \rightarrow \overline{\textrm{ran}}\,F\) is non-unitary.

The Schur complement supported on the top left corner of \(\begin{pmatrix} A-M_* &{}\quad B^* \\ B &{}\quad M_* \end{pmatrix}\) is \(D_+ = E^*(1-G^*G)E\), while that on the lower right corner is \(D_- = F^*(1-GG^*)F\). So if G is not isometric, then \(D_+ \ne 0\), while if it is not co-isometric, then \(D_- \ne 0\).

In the first case, when G is not isometric, set \(M_1 = M_*\) and \(A_1 = A - D_+ \ge 0\). Then

$$\begin{aligned} \begin{pmatrix} A_1 - M_1 &{}\quad B^* \\ B &{}\quad M_1 \end{pmatrix} = \begin{pmatrix} (A-M_*) - D_+ &{}\quad B^* \\ B &{}\quad M_* \end{pmatrix} \ge 0. \end{aligned}$$

Likewise, if G is not co-isometric, set \(M_1 = M_* - D_-\) and \(A_1 = A - D_- = (A-M_*) + (M_*-D_-) \ge 0\). Then

$$\begin{aligned} \begin{pmatrix} A_1 - M_1 &{}\quad B^* \\ B &{}\quad M_1 \end{pmatrix} = \begin{pmatrix} A-M_* &{}\quad B^* \\ B &{}\quad M_*-D_- \end{pmatrix} \ge 0. \end{aligned}$$

In either case, \(A_1 \lneq A\) and

$$\begin{aligned} \mathscr {M}_1 := \left\{ M{:} \begin{pmatrix} A_1 - M &{}\quad B^* \\ B &{}\quad M \end{pmatrix} \ge 0\right\} \subseteq \mathscr {M}. \end{aligned}$$

Let \(\mathscr {A}\) be the set of all \(A' \le A\) for which \(\begin{pmatrix} A'-M &{}\quad B^* \\ B &{}\quad M \end{pmatrix} \ge 0\) for some \(M \ge 0\). If \(\mathscr {C} \subset \mathscr {A}\) is a decreasing chain, then the elements of \(\mathscr {C}\) converge strongly to \(A_0 \ge 0\). The corresponding maximal choices of M for the elements of the chain themselves form a decreasing chain, and so they too converge strongly. Hence there exists M such that \(\begin{pmatrix} A_0 - M &{}\quad B^* \\ B &{}\quad M \end{pmatrix} \ge 0\); that is, \(A_0 \in \mathscr {A}\). An application of Zorn’s lemma gives a minimal \(\hat{A} \in \mathscr {A}\). The set of M such that \(\begin{pmatrix} \hat{A}-M &{}\quad B^* \\ B &{}\quad M \end{pmatrix} \ge 0\) is a singleton, since otherwise the construction given above yields \(A' \lneq \hat{A}\). \(\square \)

Now consider the two index/variable case. Let \(Q(z) = \sum _{j_2 = -d_2}^{d_2} {\hat{Q}}_{j_2}(z_1) z_2^{j_2}\) be an \({\mathfrak L}({\mathfrak K})\) valued trigonometric polynomial (since the positive case will be of interest, the simplified form for the maximal and minimal indices is used). As in the last section, there is an associated Toeplitz operator \(T_{Q_1}(z_1):= ({\hat{Q}}_{j_2 - k_2})_{j_2, k_2 = 0}^\infty \), the entries of which are \({\mathfrak L}({\mathfrak K})\) valued trigonometric polynomials \({\hat{Q}}_{j_2} = \sum Q_{j_1,j_2} z_1^{j_1}\) of degree at most \(d_1\). Following the example of the single variable case, group these into \(d_2 \times d_2\) submatrices. Then with \({\tilde{\mathfrak K}}= M_{d_2}(\mathbb {C}) \otimes {\mathfrak K}\), the entries are \({\mathfrak L}({\tilde{\mathfrak K}})\) valued trigonometric polynomials in \(z_1\). This in turn is equivalent to a tridiagonal Toeplitz operator T as in (3.2), with A and B Toeplitz operators with entries in \({\mathfrak L}({\mathfrak G})= {\mathfrak L}(\bigoplus _0^\infty {\tilde{\mathfrak K}})\), where

$$\begin{aligned} \begin{aligned} \deg _0 A&:= \deg _+ A = \deg _- A = \sup \{\deg _\pm {\hat{Q}}_{j_2}{:}\, 0 \le j_2 \le d_2-1\} \\ \deg _\pm B&= \sup \{\deg _\pm {\hat{Q}}_{j_2}{:}\, 1 \le j_2 \le d_2\}, \qquad \deg B =\deg _+ B + \deg _- B. \end{aligned} \end{aligned}$$
(3.5)

Just to emphasize, the bi-Toeplitz operator obtained in this way has outer level corresponding to the variable \(z_2\) and inner level corresponding to \(z_1\). In other words, the Toeplitz operators which are the entries of the tridiagonal Toeplitz operator correspond to functions in the variable \(z_1\).
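
To make the regrouping concrete, the following sketch (NumPy; the helper functions are illustrative and not part of the paper) assembles a finite section of \(T_Q\) with the outer level indexed by \(z_2\) and Toeplitz blocks in \(z_1\), and checks positivity for \(Q = |1+z_1z_2|^2\):

```python
import numpy as np

def toeplitz_section(coeffs, N):
    """N x N finite section of the Toeplitz operator with scalar symbol
    sum_k coeffs[k] z^k; coeffs is a dict {power: coefficient}."""
    return np.array([[coeffs.get(j - k, 0) for k in range(N)]
                     for j in range(N)], dtype=complex)

def bi_toeplitz_section(Q, N1, N2):
    """Finite section of T_Q for Q(z1, z2) = sum Q[(j1, j2)] z1^j1 z2^j2.
    Outer level indexed by z2; the (j, k) entry is the Toeplitz section
    in z1 of the coefficient polynomial Q_hat_{j-k}(z1)."""
    inner = {}
    for (j1, j2), c in Q.items():
        inner.setdefault(j2, {})[j1] = c
    return np.block([[toeplitz_section(inner.get(j - k, {}), N1)
                      for k in range(N2)] for j in range(N2)])

# Q = |1 + z1*z2|^2 = 2 + z1*z2 + z1^{-1} z2^{-1}, positive on the 2-torus.
Q = {(0, 0): 2.0, (1, 1): 1.0, (-1, -1): 1.0}
T = bi_toeplitz_section(Q, N1=4, N2=4)
assert np.allclose(T, T.conj().T)              # hermitian
assert np.linalg.eigvalsh(T).min() > -1e-8     # positive finite section
```

Positivity of the finite section follows because it is a compression of the positive operator \(T_Q\).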

It has been assumed that the operators A and B are singly infinite Toeplitz operators. If they are instead replaced with the corresponding bi-infinite Toeplitz operators (so acting on \({\mathfrak G}= L^2({\tilde{\mathfrak K}})\))—call them \(\tilde{A}\) and \(\tilde{B}\) and the resulting tridiagonal operator \(\tilde{T}\)—then T is positive if and only if \(\tilde{T}\) is positive. The operator \(\tilde{T}\) is therefore a singly infinite Toeplitz operator with coefficients which are bi-infinite Toeplitz operators. Write \(\tilde{S}\) for the bilateral shift on \(L^2({\tilde{\mathfrak K}})\) and suppose that \({\tilde{M}}_+\) is the Schur complement appearing in the resulting version of (3.3). Since \(\tilde{A}\) and \(\tilde{B}\) are bi-infinite Toeplitz operators, they are invariant under conjugation with either \(\tilde{S}\) or \({\tilde{S}}^*\). Consequently,

$$\begin{aligned} \begin{pmatrix} \tilde{A} - \tilde{S}{\tilde{M}}_+{\tilde{S}}^* &{} {\tilde{B}}^* \\ \tilde{B} &{} \tilde{S}{\tilde{M}}_+{\tilde{S}}^* \end{pmatrix} \ge 0 \qquad \text {and} \qquad \begin{pmatrix} \tilde{A} - {\tilde{S}}^*{\tilde{M}}_+ \tilde{S} &{} {\tilde{B}}^* \\ \tilde{B} &{} {\tilde{S}}^*{\tilde{M}}_+ \tilde{S} \end{pmatrix} \ge 0. \end{aligned}$$
(3.6)

Hence, by maximality, \({\tilde{S}}^*{\tilde{M}}_+ \tilde{S} \le {\tilde{M}}_+\) and \(\tilde{S}{\tilde{M}}_+{\tilde{S}}^* \le {\tilde{M}}_+\). On the other hand, \(\tilde{S}\) is unitary, so conjugating both sides of the first of these inequalities by \(\tilde{S}\) gives \({\tilde{M}}_+ \le \tilde{S}{\tilde{M}}_+{\tilde{S}}^*\), and so equality holds. Equality holds likewise for the other inequality. In other words, the Schur complement in this case is Toeplitz. The same argument works with the minimal element \({\tilde{M}}_-\). In neither case is it necessary to assume that \(\tilde{A}\) and \(\tilde{B}\) have finite degrees. There is no immediate guarantee that the degree of \({\tilde{M}}_\pm \) is finite, even if the degrees of \(\tilde{A}\) and \(\tilde{B}\) are.

The discussion is summarized in the next lemma.

Lemma 3.4

Suppose that \(\tilde{A}\) and \(\tilde{B}\) are bounded bi-infinite Toeplitz operators with entries in \({\mathfrak L}({\tilde{\mathfrak K}})\) (they need not be of finite degree). Let \(\tilde{\mathscr {M}} = \{\tilde{M}{:}\, \begin{pmatrix} \tilde{A}-\tilde{M} &{}\quad {\tilde{B}}^* \\ \tilde{B} &{}\quad \tilde{M} \end{pmatrix} \ge 0 \}\), and assume that \(\tilde{\mathscr {M}}\) is non-empty. Let \(\tilde{\mathscr {M}}_T\) be the (non-empty) subset of elements of \(\tilde{\mathscr {M}}\) which are Toeplitz. Then the maximal and minimal elements \({\tilde{M}}_+\) and \({\tilde{M}}_-\) of \(\tilde{\mathscr {M}}\) belong to \(\tilde{\mathscr {M}}_T\), where they are again the maximal and minimal elements. The set \(\tilde{\mathscr {M}}_T\) is closed and convex.

There is a refined version of Theorem 3.3 for bi-infinite Toeplitz operators.

Theorem 3.5

Let \(\tilde{A}\) and \(\tilde{B}\) be bi-infinite Toeplitz operators (they need not be of finite degree), and suppose that the set

$$\begin{aligned} \tilde{\mathscr {M}} = \left\{ \tilde{M}{:}\, \begin{pmatrix} \tilde{A} - \tilde{M} &{}\quad {\tilde{B}}^* \\ \tilde{B} &{}\quad \tilde{M} \end{pmatrix} \ge 0\right\} \end{aligned}$$

is not empty. Then there exists a minimal bi-infinite Toeplitz operator \(0 \le \hat{A} \le \tilde{A}\) such that the set

$$\begin{aligned} \hat{\mathscr {M}} = \left\{ \tilde{M}{:}\, \begin{pmatrix} \hat{A} - \tilde{M} &{}\quad {\tilde{B}}^* \\ \tilde{B} &{}\quad \tilde{M} \end{pmatrix} \ge 0\right\} = \{\hat{M}\}, \end{aligned}$$

is a singleton. In this case, \(\hat{M}\) is a bounded bi-infinite Toeplitz operator.

Proof

This is essentially a repeat of the proof of Theorem 3.3, taking into account Lemma 3.4, which guarantees that the operator \(M_*\) in the proof of Theorem 3.3 is bi-infinite Toeplitz. The same argument used for Lemma 3.4 implies that the operators \(D_\pm \) are bi-infinite Toeplitz operators, and so \(A_1\) and \(M_1\) are as well. Strong limits of bounded sequences of bi-infinite Toeplitz operators are bi-infinite Toeplitz operators, so the decreasing chains considered there have lower bounds in the class, and the result again follows from an application of Zorn’s lemma. \(\square \)

The arguments just used with bi-infinite Toeplitz operators do not work in the singly infinite setting with the unilateral shift. Indeed, the Schur complement in this case will generally not be Toeplitz. If it is, it can be shown by arguments to follow that it is possible to factor the bivariate trigonometric polynomial with analytic polynomials of the same degrees, and there are well known examples for which this is not possible [3]. Nevertheless, restricting back to singly infinite operators, the following is obtained.

Lemma 3.6

Suppose that A and B are bounded (singly infinite) Toeplitz operators with entries in \({\mathfrak L}({\tilde{\mathfrak K}})\) (they need not be of finite degree). Let \(\mathscr {M} = \{M{:}\, \begin{pmatrix} A-M &{}\quad B^* \\ B &{}\quad M \end{pmatrix} \ge 0 \}\), and assume that \(\mathscr {M}\) is non-empty. Then there exists a closed, convex subset \(\mathscr {M}_T\) of \(\mathscr {M}\) with maximal and minimal elements, all the elements of which are Toeplitz. Furthermore, there exists a minimal Toeplitz operator \(0 \le \hat{A} \le A\) such that the set

$$\begin{aligned} \hat{\mathscr {M}}_T = \left\{ \hat{M}{:}\, \begin{pmatrix} \hat{A} - \hat{M} &{}\quad B^* \\ B &{}\quad \hat{M} \end{pmatrix} \ge 0\right\} \end{aligned}$$

is a singleton.

The elements of \(\mathscr {M}_T\) come from restricting the set \(\tilde{\mathscr {M}}_T\) of operators \(\tilde{M}\) in the bi-infinite setting satisfying

$$\begin{aligned} \begin{pmatrix} \tilde{A} - \tilde{M} &{}\quad {\tilde{B}}^* \\ \tilde{B} &{}\quad \tilde{M} \end{pmatrix} \ge 0. \end{aligned}$$

The set \(\mathscr {M}_T\) will therefore have maximal and minimal elements since \(\tilde{\mathscr {M}}\) does, and will be a singleton when \(\tilde{\mathscr {M}}\) is.

In general, positive Toeplitz operators need not have outer factorizations. The next theorem provides a useful exception, and is the key ingredient in the proof of the main result.

Theorem 3.7

Suppose that A and B are bounded (singly infinite) Toeplitz operators with entries in \({\mathfrak L}({\tilde{\mathfrak K}})\), that B has finite degree \((d_+,d_-)\) on \(H^2({\mathfrak H})\) (that is, the smallest values such that \(BS^{d_+}\) and \(B^* S^{d_-}\) are analytic), and assume that \(\mathscr {M} = \{M{:}\, \begin{pmatrix} A-M &{}\quad B^* \\ B &{}\quad M \end{pmatrix} \ge 0 \} \ne \emptyset \). Let \(\hat{A}\) be the minimal Toeplitz operator and \(\hat{M}\) the corresponding unique Toeplitz operator such that \(\begin{pmatrix} \hat{A} - \hat{M} &{}\quad B^* \\ B &{}\quad \hat{M} \end{pmatrix} \ge 0\). Then there exist outer E and F such that \(\hat{A}-\hat{M} = E^*E\) and \(\hat{M} = F^*F\). Furthermore, there are inner \(V_E\), \(V_F\) having bi-infinite unitary extensions with equal ranges such that for \(G = V_F^* V_E\), \(B = F^*GE\).

Proof

The proof proceeds as follows. First, it is proved that \(\hat{A}-\hat{M}\) and \(\hat{M}\) have outer factorizations. Then it is shown that the operator \(R = \begin{pmatrix} \hat{A} - \hat{M} &{}\quad B^* \\ B &{}\quad \hat{M} \end{pmatrix}\) itself has an outer factorization, and this implies the existence of analytic \(R_E\) and \(R_F\) such that \(R = \begin{pmatrix} R_E^* \\ R_F^* \end{pmatrix} \begin{pmatrix} R_E&\quad R_F \end{pmatrix}\). Inner-outer factorizations of \(R_E\) and \(R_F\) yield inner operators \(V_E\) and \(V_F\), and then uniqueness of \(\hat{M}\) implies that for the bi-infinite extensions of these, \({\tilde{V}}_F^*{\tilde{V}}_E\) is unitary.

Let \(\tilde{A}\) and \(\tilde{M}\) be the bi-infinite extensions of \(\hat{A}\) and \(\hat{M}\) to \(L^2({\mathfrak H})\), \(\tilde{B}\) that of B, and \(P_0\) the orthogonal projection of \(L^2({\mathfrak H})\) onto \(H^2({\mathfrak H})\). For the time being, choose any factorizations \(\tilde{A} - \tilde{M} = {\tilde{E}}^*\tilde{E}\) and \(\tilde{M} = {\tilde{F}}^*\tilde{F}\). By Lemma 3.2, \(\tilde{B} = {\tilde{F}}^*\tilde{G}\tilde{E}\), where \(\tilde{G}: \overline{\textrm{ran}}\,\tilde{E} \rightarrow \overline{\textrm{ran}}\,\tilde{F}\) is unitary. Set \(E = \tilde{E}P_0^*\), \(F = \tilde{F}P_0^*\), \(G = P^*_{\overline{\textrm{ran}}\,(\tilde{F}P_0)} \tilde{G} P_{\overline{\textrm{ran}}\,(\tilde{E}P_0)}\). Then \(\hat{A} - \hat{M} = E^*E\), \(\hat{M} = F^*F\), and \(B = F^*GE\).

Since \(\hat{A} - \hat{M}\) and \(\hat{M}\) are Toeplitz, there exist Lowdenslager isometries \(V_E\) and \(V_F\) such that \(V_E E = E S\) and \(V_F F = F S\) (see Lemma 2.1 and the paragraph preceding it). Let \(V_E = S_E \oplus U_E\), \(V_F = S_F \oplus U_F\) be Wold decompositions, where \(S_E\), \(S_F\) are shift operators and \(U_E\), \(U_F\) are unitary. The operator \(B^* S^{d_-}\) is analytic, so for all \(j \in \mathbb {N}\), \(S^j (E^* G^* V_F^{d_-}) = (E^* G^* V_F^{d_-})V_F^j\). Hence if \(Q_j\) is the projection onto \(\bigoplus _{i = 0}^j {\mathfrak H}\), then for any j, \( Q_j (E^* G^* V_F^{d_-})V_F^{j+1} = 0\). Thus with \(\mathscr {N}_F = {\textrm{ran}}\,U_F\) and \(j \ge 0\),

$$\begin{aligned} \{0\} = Q_j S^{j+1} (E^* G^* V_F^{d_-}) \mathscr {N}_F = Q_j E^* G^* V_F^{d_- + j+1} \mathscr {N}_F = Q_j E^* G^* \mathscr {N}_F, \end{aligned}$$

and so \(E^* G^* \mathscr {N}_F = \{0\}\).

Let \(P_F\) be the orthogonal projection with range equal to \(\overline{\textrm{ran}}\,F \ominus {\textrm{ran}}\,U_F\). Since \(S_F P_F F = P_F F S\), \(F^*P_F F \ge 0\) is Toeplitz. By the above calculations, \(B = F^* P_F G E\) and \(\Vert G\Vert \le 1\), so

$$\begin{aligned}&\begin{pmatrix} \hat{A} - F^*P_F F &{}\quad B^* \\ B &{}\quad F^*P_F F \end{pmatrix} \ge \begin{pmatrix} \hat{A} - \hat{M} &{}\quad B^* \\ B &{}\quad F^*P_F F \end{pmatrix} \\&= \begin{pmatrix} E^*E &{}\quad E^* G^* P_F F \\ F^* P_F G E &{}\quad (P_F F)^* P_F F \end{pmatrix} \ge 0. \end{aligned}$$

By uniqueness of \(\hat{M}\), \(P_F = 1_{\overline{\textrm{ran}}\,F}\) and \(\mathscr {N}_F = \{0\}\). Hence F is analytic from \(H^2({\mathfrak H})\) to \(H^2({\mathfrak H}_F) = \bigoplus _{0}^\infty (\ker S_F^*)\). Since \(\overline{\textrm{ran}}\,F = H^2({\mathfrak H}_F)\), F is outer.

A similar argument with \(BS^{d_+}\) shows that for \(\mathscr {N}_E = {\textrm{ran}}\,U_E\), \(F^* G \mathscr {N}_E = 0\), and so if \(P_E\) is the orthogonal projection with range equal to \(\overline{\textrm{ran}}\,E \ominus {\textrm{ran}}\,U_E\), then \(E^*P_EE\) is Toeplitz, \(B = F^* G P_E E\), and

$$\begin{aligned} \begin{pmatrix} (\hat{A} - E^*(1_E - P_E)E) - \hat{M} &{}\quad B^* \\ B &{}\quad \hat{M} \end{pmatrix}= & {} \begin{pmatrix} E^*E - E^*(1_E - P_E)E &{}\quad B^* \\ B &{}\quad \hat{M} \end{pmatrix}\\= & {} \begin{pmatrix} (P_E E)^* (P_E E) &{}\quad E^* P_E G^* F \\ F^* G P_E E &{}\quad F^*F \end{pmatrix} \ge 0. \end{aligned}$$

Minimality of \(\hat{A}\) then yields that \(\mathscr {N}_E = \{0\}\) and E is analytic, and in fact outer, from \(H^2({\mathfrak H})\) to \(\overline{\textrm{ran}}\,E = H^2({\mathfrak H}_E) = \bigoplus _{0}^\infty (\ker S_E^*)\). By Lowdenslager’s criterion (Lemma 2.1), E and F may be taken to be outer on \(H^2({\mathfrak H})\), and this is what is done. We denote the closure of their ranges by \(H^2({\mathfrak E})\) and \(H^2({\mathfrak F})\), respectively.

The equalities \(F^*GE = B = S^*BS = F^*S^*GS E\) and the fact that E and F are outer imply that G is Toeplitz. Furthermore, if \(\tilde{S}\), \(\tilde{S}_E\), \(\tilde{S}_F\) are the bilateral (and hence unitary) extensions of S, \(S_E\), and \(S_F\), then \(\tilde{E}\tilde{S} = \tilde{S}_E \tilde{E}\), \({\tilde{F}}^*\tilde{S}_F = \tilde{S} {\tilde{F}}^*\), and \(\tilde{G}\tilde{S}_E = \tilde{S}_F \tilde{G}\). By construction, G is unique, \(\tilde{G}: L^2({\mathfrak E}) \rightarrow L^2({\mathfrak F})\) is unitary, and \(\tilde{B} = {\tilde{F}}^* \tilde{G} \tilde{E}\). Decompose \(L^2({\mathfrak H}) = H^2({\mathfrak H})^\bot \oplus H^2({\mathfrak H})\), \(L^2({\mathfrak E}) = H^2({\mathfrak E})^\bot \oplus H^2({\mathfrak E})\), and \(L^2({\mathfrak F}) = H^2({\mathfrak F})^\bot \oplus H^2({\mathfrak F})\). With respect to these decompositions,

$$\begin{aligned} \tilde{F}^* = \begin{pmatrix} {F'}^* &{}\quad Q_F^* \\ 0 &{}\quad F^* \end{pmatrix}, \qquad \tilde{G} = \begin{pmatrix} G' &{}\quad D \\ D' &{}\quad G \end{pmatrix}, \qquad \text {and} \qquad \tilde{E} = \begin{pmatrix} E' &{} 0 \\ Q_E &{} E \end{pmatrix}. \end{aligned}$$

Matrix multiplication verifies that \(\tilde{B}\) is the bi-infinite extension of \(B = F^*GE\). Because F is outer, \(\overline{\textrm{ran}}\,\tilde{F} = L^2({\mathfrak F})\) and \(\overline{\textrm{ran}}\,F' = H^2({\mathfrak F})^\bot \).

Set

$$\begin{aligned} W := \begin{pmatrix} D &{}\quad 0 \\ G &{}\quad 1 \end{pmatrix} : H^2({\mathfrak E}) \oplus H^2({\mathfrak F}) \rightarrow L^2({\mathfrak F}). \end{aligned}$$

The entries of W are Toeplitz and the left column is isometric. Also, \({\textrm{ran}}\,W = {\textrm{ran}}\,D \oplus H^2({\mathfrak F})\). Since \(\tilde{G}\tilde{S}_E = \tilde{S}_F \tilde{G}\), \(\tilde{S}_F W = W (S_E \oplus S_F)\). The columns of

$$\begin{aligned} \begin{pmatrix} D \\ G \end{pmatrix} = \begin{pmatrix} \vdots &{}\quad \ddots &{}\quad \ddots &{}\quad \ddots \\ G_{-2} &{}\quad G_{-3} &{}\quad G_{-4} &{}\quad \ddots \\ G_{-1} &{}\quad G_{-2} &{}\quad G_{-3} &{}\quad \ddots \\ \hline G_0 &{}\quad G_{-1} &{}\quad G_{-2} &{}\quad \ddots \\ G_1 &{}\quad G_0 &{}\quad G_{-1} &{}\quad \ddots \\ \vdots &{}\quad \ddots &{}\quad \ddots &{}\quad \ddots \\ \end{pmatrix} \end{aligned}$$

are isometric with orthogonal ranges. Let \({\mathfrak L}\) be the range of the first column. Then \({\textrm{ran}}\,\begin{pmatrix} D \\ G \end{pmatrix} = \bigoplus _0^\infty {\mathfrak L}\) and \(\tilde{S}_F\) acts as a unilateral shift on this space, as well as on \(\{0\} \oplus H^2({\mathfrak F})\). Set \(S_W = \tilde{S}_F | \overline{\textrm{ran}}\,W\). Since \(\tilde{S}_F W = W \begin{pmatrix} S_E &{} 0 \\ 0 &{} S_F \end{pmatrix}\), \(S_W\) maps \(\overline{\textrm{ran}}\,W\) isometrically into itself. The goal now is to show that this is a unilateral shift.

Suppose that

$$\begin{aligned} f= \begin{pmatrix} f_1 \\ f_2 \end{pmatrix} \in \bigcap _{n=0}^\infty {\tilde{S}_F}^n \overline{\textrm{ran}}\,W. \end{aligned}$$

Then for all n, there exists \(h_n = \begin{pmatrix} h_{1n} \\ h_{2n} \end{pmatrix} \in H^2({\mathfrak E}) \oplus H^2({\mathfrak F})\) such that

$$\begin{aligned} f - \begin{pmatrix} DS_E^n h_{1n} \\ G S_E^n h_{1n} + S_F^n h_{2n} \end{pmatrix} = f - W \begin{pmatrix} S_E^n &{} 0 \\ 0 &{} S_F^n \end{pmatrix} h_n \\ = f - {\tilde{S}_F}^n \begin{pmatrix} D h_{1n} \\ Gh_{1n} + h_{2n} \end{pmatrix} = f - {\tilde{S}_F}^n W h_n \rightarrow 0. \end{aligned}$$

Since \(\tilde{S}_F\) is unitary, \({\tilde{S}_F}^{*\,n} f - W h_n \rightarrow 0\). As \({\tilde{S}_F}^{*\,n}\) has the form \(\begin{pmatrix} {S'_F}^{*\,n} &{} Q_{F,n} \\ 0 &{} {S_F}^{*\,n} \end{pmatrix}\) and \(S_F\) is a unilateral shift, comparing second components gives \({S_F}^{*\,n} f_2 - (Gh_{1n} + h_{2n}) \rightarrow 0\) with \({S_F}^{*\,n} f_2 \rightarrow 0\), and so \(h_{2n} \rightarrow -Gh_{1n}\). Hence \(h_n - \begin{pmatrix} h_{1n} \\ -Gh_{1n} \end{pmatrix} \rightarrow 0\), and so

$$\begin{aligned} f - {\tilde{S}_F}^n W \begin{pmatrix} h_{1n} \\ -Gh_{1n} \end{pmatrix} = f - W \begin{pmatrix} S_E^n &{}\quad 0 \\ 0 &{}\quad S_F^n \end{pmatrix} \begin{pmatrix} h_{1n} \\ {S_F}^{*\,n}GS_E^nh_{1n} \end{pmatrix}\\ = f - \begin{pmatrix} DS_E^n h_{1n} \\ (1-S_F^n{S_F}^{*\,n}) G S_E^n h_{1n} \end{pmatrix} \rightarrow 0. \end{aligned}$$

For any n, the first n entries of \((1-S_F^n{S_F}^{*\,n}) G S_E^n h_{1n}\) are 0, so the limit as \(n\rightarrow \infty \) gives \(f_2 = 0\).

In the Wold decomposition of \(S_W\), the unitary part acts on a reducing subspace \(\mathscr {U} = \bigcap S_W^n\overline{\textrm{ran}}\,W\), which is a subspace of \(H^2({\mathfrak F})^\bot \oplus \{0\}\) by what was just shown. Since \(S_W\mathscr {U} = \mathscr {U}\) and \(\overline{\textrm{ran}}\,W\) is invariant for \(\tilde{S}_F\), it follows that \(\mathscr {U}\) reduces \(\tilde{S}_F\). Thus for all n, \({\tilde{S}_F}^n \begin{pmatrix} f_1 \\ 0 \end{pmatrix} \in \mathscr {U}\) has the form \(\begin{pmatrix} x_n \\ 0 \end{pmatrix}\). Therefore

$$\begin{aligned} \begin{pmatrix} f_1 \\ 0 \end{pmatrix} = {\tilde{S}_F}^{*\,n}{\tilde{S}_F}^n \begin{pmatrix} f_1 \\ 0 \end{pmatrix} = \begin{pmatrix} \begin{pmatrix} \vdots \\ f_{1,n+1} \\ f_{1n} \\ 0 \\ \vdots \\ 0 \end{pmatrix}\\ 0 \end{pmatrix}, \end{aligned}$$

and consequently \(f_1 = 0\). So \(\mathscr {U} = \{0\}\); that is, \(S_W\) is a unilateral shift and \(S_W W = W \begin{pmatrix} S_E &{}\quad 0 \\ 0 &{}\quad S_F \end{pmatrix}\). Write \(H^2({\mathfrak W})\) for \(\overline{\textrm{ran}}\,W\) under the action of \(S_W\).

As a result,

$$\begin{aligned} H:= W \begin{pmatrix} E &{}\quad 0 \\ 0 &{}\quad F \end{pmatrix} : H^2({\mathfrak H}) \oplus H^2({\mathfrak H}) \rightarrow H^2({\mathfrak W}) \end{aligned}$$

is analytic since the matrix on the right is outer, so has dense range in \(H^2({\mathfrak E}) \oplus H^2({\mathfrak F})\), the space on which W is defined. A simple calculation shows that

$$\begin{aligned} R:= \begin{pmatrix} \hat{A} - \hat{M} &{}\quad B^* \\ B &{}\quad \hat{M} \end{pmatrix} = H^*H. \end{aligned}$$

Decomposing \(H = \begin{pmatrix} H_1&H_2 \end{pmatrix}\), \(H_1, H_2: H^2({\mathfrak H}) \rightarrow H^2({\mathfrak W})\) are analytic with \(H_1^* H_1 = \hat{A} - \hat{M} = E^*E\) and \(H_2^* H_2 = \hat{M} = F^*F\) outer factorizations. Hence \(H_1 = V_EE\), \(H_2 = V_FF\), with \(V_E: H^2({\mathfrak E}) \rightarrow H^2({\mathfrak W})\), \(V_F: H^2({\mathfrak F}) \rightarrow H^2({\mathfrak W})\) inner, and so in \(B = F^* G E\), \(G = V_F^*V_E\).

Let \({\tilde{V}}_E, {\tilde{V}}_F\) be the bi-infinite extensions of \(V_E, V_F\), respectively. By assumption, \(\tilde{G} = {\tilde{V}}_F^* {\tilde{V}}_E\) is unitary, so \({\textrm{ran}}\,{\tilde{V}}_E = {\textrm{ran}}\,{\tilde{V}}_F\). Set \({\mathfrak M}= \overline{{\textrm{ran}}\,V_E \vee {\textrm{ran}}\,V_F}\). Since \(V_E\) and \(V_F\) are analytic, \(\tilde{S}_{\mathfrak H}{\mathfrak M}= S_{\mathfrak H}{\mathfrak M}\subseteq {\mathfrak M}\), which means that \(S_0:= S_{\mathfrak H}| {\mathfrak M}\) is a shift operator. Setting \({\mathfrak K}= \ker S_0^*\), it follows that \({\mathfrak M}= H^2({\mathfrak K})\) and \(S_0 = S_{\mathfrak K}\). Then \(L^2({\mathfrak K}) = {\textrm{ran}}\,\tilde{V}_E = {\textrm{ran}}\,\tilde{V}_F\). In other words, \(\tilde{V}_E\) and \(\tilde{V}_F\) map unitarily onto \(L^2({\mathfrak K})\), and \(V_E\) and \(V_F\) are analytic and isometric into \(H^2({\mathfrak K})\). Consequently, \(V_EE\) and \(V_FF\) map analytically into \(H^2({\mathfrak K})\), \(\hat{A}-\hat{M} = E^*V_E^*V_EE\), \(\hat{M} = F^*V_F^*V_FF\), and \(B = F^*V_F^*V_EE\). \(\square \)

The question of degree bounds in the previous theorem is addressed next.

Theorem 3.8

Assume \({\mathfrak H}\) is finite dimensional and that the conditions of the last theorem hold. Then the operators \(\hat{A}-\hat{M}\), \(\hat{M}\), E, and F have finite degree.

Proof

The notation of the proof of the last result is maintained. While \({\mathfrak K}\subset {\textrm{ran}}\,\tilde{V}_E\), this space need not be contained in \({\textrm{ran}}\,V_E\). However, the Toeplitz structure of \(V_E\) compensates for this.

Choose an orthonormal basis \(\{f_k\}_{k=1}^d\) for \({\mathfrak K}\), where \(d < \infty \); it is clear from the definition that \(d = \dim {\mathfrak K}\le \dim {\mathfrak H}\). Fix \(0< \epsilon < \frac{1}{\sqrt{d}}\). Let \(\{\tilde{g}_k\}\) be an orthonormal subset of \(L^2({\mathfrak E})\) such that \(\tilde{V}_E \tilde{g}_k = f_k\). Set \({\tilde{\mathfrak G}}= \bigvee _k \tilde{g}_k\). Write \(P_{{\mathfrak E}}\) for the orthogonal projection from \(L^2({\mathfrak E})\) onto \(H^2({\mathfrak E})\). For \(m_k\) sufficiently large, \(\Vert \tilde{S}_{{\mathfrak E}}^{m_k} \tilde{g}_k - P_{{\mathfrak E}} (\tilde{S}_{{\mathfrak E}}^{m_k} \tilde{g}_k) \Vert < \epsilon \). Let \(m = \max _k m_k\). Then \(\Vert \tilde{S}_{{\mathfrak E}}^m \tilde{g}_k - P_{{\mathfrak E}} (\tilde{S}_{{\mathfrak E}}^m \tilde{g}_k) \Vert < \epsilon \) for all k as well. Set \(g_k = P_{{\mathfrak E}} (\tilde{S}_{{\mathfrak E}}^m \tilde{g}_k)\) and \({\mathfrak G}= \bigvee _k g_k\).

Note that the mth rows of \(V_E\) and \(\tilde{V}_E\) are of the form

$$\begin{aligned} \begin{aligned} K_m&= \begin{pmatrix} V_m&\quad V_{m-1}&\quad \cdots&\quad V_0&\quad 0&\quad \cdots \end{pmatrix} \\ \tilde{K}_m&= \begin{pmatrix} \cdots&\quad V_{m+2}&\quad V_{m+1}&\quad V_m&\quad V_{m-1}&\quad \cdots&\quad V_0&\quad 0&\quad \cdots \end{pmatrix}. \\ \end{aligned} \end{aligned}$$

Let \(L_m = \begin{pmatrix} \cdots V_{m+2}&V_{m+1} \end{pmatrix}\), so that \(\tilde{K}_m = \begin{pmatrix} L_m&\quad K_m \end{pmatrix}\). By the above construction, \(h_k:= K_m g_k \in {\mathfrak K}\), and \(\Vert h_k - f_k \Vert < \epsilon \) for all k.

If \(h:= \sum _k \alpha _k h_k = 0\), then for \(f:= \sum _k \alpha _k f_k\),

$$\begin{aligned} \Vert f\Vert = \Vert h-f\Vert = \left\| \sum _k \alpha _k (h_k - f_k)\right\| \le \epsilon \sum _k |\alpha _k| \le \epsilon \sqrt{d \sum _k |\alpha _k|^2} = \epsilon \sqrt{d} \Vert f\Vert . \end{aligned}$$

Hence \(\Vert f\Vert = 0\), and thus \(\alpha _k = 0\) for all k. It follows then that \(\bigvee _k h_k = {\mathfrak K}\), meaning that \({\textrm{ran}}\,K_m = {\mathfrak K}\). Note that it will also be the case that \({\textrm{ran}}\,K_n = {\mathfrak K}\) for all \(n \ge m\).
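
The inequality chain is Cauchy–Schwarz (\(\sum _k |\alpha _k| \le \sqrt{d}\,(\sum _k |\alpha _k|^2)^{1/2}\)) together with the choice \(\epsilon < 1/\sqrt{d}\): vectors within \(\epsilon \) of an orthonormal basis of a d-dimensional space still span it. A small numerical illustration (NumPy; the data are randomly generated and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
eps = 0.9 / np.sqrt(d)                     # any eps < 1/sqrt(d) works

# Orthonormal basis f_1, ..., f_d of C^d (as columns of F).
F = np.linalg.qr(rng.standard_normal((d, d)))[0]

# Perturb each basis vector by strictly less than eps in norm.
P = rng.standard_normal((d, d))
H = F + eps * 0.99 * P / np.linalg.norm(P, axis=0, keepdims=True)

# Each column moved by < eps, yet the perturbed vectors still span C^d.
assert np.all(np.linalg.norm(H - F, axis=0) < eps)
assert np.linalg.matrix_rank(H) == d
```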

The operator \(BS^{d_+}\) is analytic, so for \(R_F = V_F F\), \(R_F^* S_{{\mathfrak K}}^{d_+}\) is analytic on \(\overline{\textrm{ran}}\,V_E E = {\textrm{ran}}\,V_E\), which is invariant under \(S_{{\mathfrak K}}\) by analyticity of \(V_E\). In other words,

$$\begin{aligned} R_F^* S_{{\mathfrak K}}^{d_+} S_{{\mathfrak K}} | {\textrm{ran}}\,V_E = S_{{\mathfrak H}} R_F^* S_{{\mathfrak K}}^{d_+} | {\textrm{ran}}\,V_E. \end{aligned}$$

In particular then, since \(S_{{\mathfrak K}}^m H^2({\mathfrak K}) \subseteq {\textrm{ran}}\,V_E\),

$$\begin{aligned} R_F^* S_{{\mathfrak K}}^{d_+ + m} S_{{\mathfrak K}} |_{H^2({\mathfrak K})} = S_{{\mathfrak H}} R_F^* S_{{\mathfrak K}}^{d_+ + m} |_{H^2({\mathfrak K})}. \end{aligned}$$

Hence \(R_F^* S_{{\mathfrak K}}^{d_+ + m}\) is Toeplitz and analytic (so lower triangular). The operator \(R_F^*\) is co-analytic (upper triangular), hence \(R_F\) has degree at most \(d_+ + m\), as does F since it is outer. Therefore as \(\hat{M} = R_F^*R_F\), \(\deg \hat{M} \le d_+ + m\).

An identical argument using analyticity of \(B^*S^{d_-}\) shows that for some \(n \ge 0\), \(E^*V_E^*S_{{\mathfrak K}}^{d_- + n}\) is analytic. As a consequence, for \(R_E = V_E E\), \(\deg R_E \le d_- + n\), and likewise \(\deg E \le d_- + n\) since E is outer. Since \(\hat{A} - \hat{M} = R_E^* R_E\), its degree is also bounded by \(d_- + n\). \(\square \)

4 A two variable factorization theorem

This section contains the proof of the main result. First some notation. Suppose that \(T_Q\) is a Toeplitz operator associated to a trigonometric polynomial Q of bi-degree \((d_1,d_2)\) in the variables \((z_1,z_2)\), and that \(T_Q\) has degree \(d_2\) with coefficients \(R_0, \dots , R_{d_2}\) which are themselves Toeplitz of degree bounded by \(d_1\). Write \(d_A^\pm = \max \{\deg _\pm R_j,\ j = 0,\dots , d_2-1\}\), \(d_A = \max \{d_A^+, d_A^-\}\), and \(d_B^\pm = \max \{\deg _\pm R_j,\ j = 1,\dots , d_2\}\), \(d_B = \max \{d_B^+, d_B^-\}\).

Recall that in one variable, the Fejér–Riesz theorem (1.1) gives a factorization of any operator valued trigonometric polynomial with very strong degree bounds. In two variables, however, even in the scalar valued case there are no uniform bounds depending only on the degree of the polynomial being factored (as implied by Section 5 of [22]). By taking a direct sum of scalar polynomials of the same degree whose factorization degrees grow without bound, one obtains an operator valued trigonometric polynomial of finite degree with no factorization in terms of polynomials of bounded degree. However, if the coefficients are operators on a finite dimensional Hilbert space, polynomial factorization is possible.

Theorem 4.1

Let \({\mathfrak K}\) be a Hilbert space with \(\dim {\mathfrak K}< \infty \). Given a positive \({\mathfrak L}({\mathfrak K})\) valued trigonometric polynomial Q in two variables of degree \((d_1,d_2)\), there exists \({d'}_1 < \infty \) such that Q can be factored as a sum of at most \(2d_2\) hermitian squares of \({\mathfrak L}({\mathfrak K})\) valued analytic polynomials with degrees bounded by \(({d'}_1,2d_2-1)\).

Of course the roles of \(z_1\) and \(z_2\) can be reversed, potentially altering the number and degrees of the polynomials in the factorization.

Proof of Theorem 4.1

As observed above, the trigonometric polynomial Q is associated to a bi-Toeplitz operator \(T_Q\) in such a way that \(T_Q\) is Toeplitz of degree \(d_2\) with coefficients \(R_0, \dots , R_{d_2}\) which are Toeplitz of finite degrees bounded by \(d_A\), \(d_B^\pm \), and \(d_B\) as defined above.

Collect the coefficients into \(d_2\times d_2\) blocks to form a tridiagonal Toeplitz operator. The blocks as they stand are not Toeplitz. However, there is a unitary conjugation (call the unitary W) collecting the (ij)th entries into \(d_2\times d_2\) blocks acting on a space \({\tilde{\mathfrak K}}= \bigoplus _1^{d_2} {\mathfrak K}\). This results in a positive tridiagonal Toeplitz operator

$$\begin{aligned} \begin{pmatrix} A & B^* & 0 & \cdots \\ B & A & B^* & \ddots \\ 0 & B & \ddots & \ddots \\ \vdots & \ddots & \ddots & \ddots \end{pmatrix}, \end{aligned}$$

with entries A and B which are themselves Toeplitz on \(\bigoplus _0^\infty {\tilde{\mathfrak K}}\). Furthermore, \(\deg _\pm A = d_A\), and \(\deg _\pm B = d_B^\pm \).
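The reblocking step can be illustrated numerically in a scalar toy case (where, unlike the operator valued situation in the proof, no unitary conjugation W is needed): a banded Toeplitz matrix of degree \(d_2\), partitioned into \(d_2\times d_2\) blocks, is block tridiagonal with constant Toeplitz blocks A and B. A minimal sketch with placeholder coefficients \(r_0,\dots ,r_{d_2}\):

```python
import numpy as np

d2, nblocks = 3, 5            # band half-width and number of d2 x d2 blocks
n = d2 * nblocks
r = [1.0, 0.5, 0.25, 0.125]   # placeholder coefficients r_0, ..., r_{d2}

# Banded symmetric Toeplitz matrix T with T[i, j] = r_{|i-j|} for |i-j| <= d2.
T = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if abs(i - j) <= d2:
            T[i, j] = r[abs(i - j)]

# Partition T into d2 x d2 blocks: blocks[p, q] = T[p*d2:(p+1)*d2, q*d2:(q+1)*d2].
blocks = T.reshape(nblocks, d2, nblocks, d2).transpose(0, 2, 1, 3)

A = blocks[0, 0]   # constant diagonal block (itself Toeplitz)
B = blocks[1, 0]   # constant subdiagonal block (itself Toeplitz)

for p in range(nblocks):
    for q in range(nblocks):
        if p == q:
            assert np.array_equal(blocks[p, q], A)
        elif p == q + 1:
            assert np.array_equal(blocks[p, q], B)
        elif q == p + 1:
            assert np.array_equal(blocks[p, q], B.T)
        else:
            assert np.allclose(blocks[p, q], 0)
print("block tridiagonal with constant Toeplitz blocks A, B")
```

The assertions confirm that all diagonal blocks coincide, all off-diagonal blocks are B or its adjoint, and everything beyond the first block band vanishes.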

By Lemma 3.6, there are positive Toeplitz operators \(\hat{A}\) and \(\hat{M}\) on \(\bigoplus _0^\infty {\tilde{\mathfrak K}}\) with \(A \ge \hat{A} \ge \hat{M}\), where \(\hat{A}\) is a minimal positive Toeplitz operator such that

$$\begin{aligned} \begin{pmatrix} \hat{A} - M & B^* \\ B & M \end{pmatrix} \ge 0 \end{aligned}$$

for some \(M \ge 0\), and \(M = \hat{M}\) is the unique positive Toeplitz operator for which this holds with \(\hat{A}\). By Theorem 3.8, \(m:= \deg (\hat{M}) < \infty \), with \(m \ge d_B\).

The entries of the operator \(\begin{pmatrix} A - \hat{M} & B^* \\ B & \hat{M} \end{pmatrix} \ge 0\) are Toeplitz of degree at most \({d'}_1 = \max \{d_A, m\}\). These can then be collected into \(2 \times 2\) blocks, giving a Toeplitz operator with entries acting on \({\tilde{\mathfrak K}}\oplus {\tilde{\mathfrak K}}\) of degree at most \({d'}_1\). Applying the one variable Fejér–Riesz theorem, there is a factorization in terms of analytic operators of degree at most \({d'}_1\), and hence

$$\begin{aligned} L= \begin{pmatrix} A - \hat{M} & B^* \\ B & \hat{M} \end{pmatrix} = (F_{ij})^*_{i,j=1,2}(F_{ij})_{i,j=1,2}, \end{aligned}$$

where each \(F_{ij}\) is analytic of degree at most \({d'}_1\).
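For intuition, the one variable factorization invoked here can be computed explicitly in the scalar case by spectral factorization: the roots of \(z^d p(z)\) pair off across the unit circle, and one factor is built from the roots in the closed disk. A minimal numerical sketch with placeholder data \(p = |1 + 0.5z|^2\) (the normalization at \(z = 1\) assumes \(p(1) > 0\), which holds here):

```python
import numpy as np

# Scalar illustration of the one variable Fejér-Riesz factorization via
# spectral factorization.  Placeholder data: p = |1 + 0.5 z|^2, that is,
# a_{-1} = a_1 = 0.5 and a_0 = 1.25, so the factor has degree d = 1.
a = {-1: 0.5, 0: 1.25, 1: 0.5}
d = 1

# z^d p(z) is an ordinary polynomial of degree 2d; its roots pair off as
# (w, 1/conj(w)) across the unit circle.  Build q from the roots in the
# closed disk (for an outer factor one would take the reflected roots).
coeffs = [a[k] for k in range(d, -d - 1, -1)]   # highest power of z first
inside = [w for w in np.roots(coeffs) if abs(w) <= 1]
q = np.poly(inside)                             # monic, inside roots only

# Fix the scalar multiple by matching values at z = 1 (assumes p(1) > 0),
# then verify p = |q|^2 on a grid of points on the circle.
theta = np.linspace(0.0, 2.0 * np.pi, 400)
z = np.exp(1j * theta)
p_vals = sum(c * z**k for k, c in a.items()).real
scale = np.sqrt(p_vals[0]) / abs(np.polyval(q, 1.0))
assert np.allclose(p_vals, (scale * abs(np.polyval(q, z))) ** 2)
print("factored p as |q|^2 with deg q =", len(q) - 1)
```

Here the quadratic \(zp(z)\) has roots \(-2\) and \(-1/2\); keeping the inside root \(-1/2\) and rescaling recovers a degree one factor, in line with the degree bound of the one variable theorem.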

Conjugate the terms of L with the adjoint of the unitary operator W from the second paragraph of the proof. The operator \(F_{ij}\) becomes a \(d_2 \times d_2\) operator \(N_{ij}\) with entries that are analytic with degrees bounded by \({d'}_1\) on \({\mathfrak G}\). Write \(M = W \hat{M} W^*\). Then

$$\begin{aligned}&\frac{1}{\sqrt{d_2}} \begin{pmatrix} \begin{pmatrix} R_0 & R_1^* & \cdots & R_{d_2-1}^* \\ R_1 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & R_1^* \\ R_{d_2-1} & \cdots & R_1 & R_0 \end{pmatrix} - M & \begin{pmatrix} R_{d_2}^* & 0 & \cdots & 0 \\ R_{d_2-1}^* & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ R_1^* & \cdots & R_{d_2-1}^* & R_{d_2}^* \end{pmatrix} \\ \begin{pmatrix} R_{d_2} & R_{d_2-1} & \cdots & R_1 \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & R_{d_2-1} \\ 0 & \cdots & 0 & R_{d_2} \end{pmatrix} & M \end{pmatrix} \\&\quad = (N_{ij})^*_{i,j = 1,2}(N_{ij})_{i,j = 1,2}. \end{aligned}$$

For \(m = 0, \dots , 2d_2-1\), define polynomials \(\tilde{F}_m\) in \(z_2\) of degree at most \(2d_2-1\) for which the coefficient of \(z_2^k\) is the kth entry of the mth column of \((N_{ij})\). Then

$$\begin{aligned} \tilde{Q} := \sum _{-d_2}^{d_2} R_k z_2^k = \sum _0^{2d_2-1} \tilde{F}_k^*\tilde{F}_k. \end{aligned}$$

Each \(R_j\) corresponds to a trigonometric polynomial in \(z_1\), and all of the entries \(N_{ijk\ell }\) of each \(N_{ij}\) correspond to analytic polynomials in \(z_1\) (here \(k,\ell \) run from 1 to \(d_2\), and these polynomials have degree at most \({d'}_1\)). Replace the entries of \(N_{ij}\) by the appropriate polynomials in \(z_1\), and write \(F_m\) for the resulting \(2d_2\) polynomials of degree at most \(({d'}_1,2d_2-1)\) in \((z_1,z_2)\). It follows from the discussion towards the end of Sect. 2 that \(Q= \sum _m F_m^* F_m\). \(\square \)

Can anything be said when \({\mathfrak H}\) is not finite dimensional? As noted in Theorem 1.2, if Q is strictly positive, then it has a factorization as a sum of hermitian squares of analytic polynomials. One could try to apply Theorem 3.7 in the proof of Theorem 4.1 to an operator valued polynomial which is positive but not strictly positive. The result will be a factorization as a sum of hermitian squares of analytic functions which need not be polynomials.

5 Conclusion

There is an aspect of the proof of Theorem 4.1 which is far from explicit, since the construction of \(\hat{M}\) there used a Zorn's lemma argument. However, on finite dimensional Hilbert spaces the degree of \(\hat{M}\) is bounded by Theorem 3.8, so it may nevertheless be possible to carry out the construction of this operator concretely in many instances. Undoubtedly, such an implementation will involve programming challenges.

There are numerous applications of Theorem 4.1. For example, via a Cayley transform it is possible to construct rational factorizations of positive matrix valued polynomials on \(\mathbb {R}^2\) [3]. For strictly positive polynomials over \(\mathbb {R}^n\), such a Cayley transform gives factorizations involving a restricted class of denominators, though it is known that for positive semidefinite polynomials, this class may fail to be finite [17] even if the degrees are bounded. The arguments from [3] and the result presented here imply that not only strictly positive, but also non-negative, matrix valued polynomials over \(\mathbb {R}^2\) can be factored using a restricted class of denominators, and that a finite set of denominators works for all polynomials of bounded degree.

A number of papers have looked at the problem of factorization for non-negative trigonometric polynomials in two variables, chiefly in the context of engineering problems such as filter design. These have tended to restrict attention to polynomials having factorizations from the class of stable polynomials; that is, polynomials with no zeros in the closed bidisk [7, 8, 12], or to those with no zeros in the closed disk crossed with the open disk [9], or with no zeros on a face of the bidisk [10]. Some address the scalar case, others the operator valued case.

As noted in the comment following the statement of Theorem 4.1, there are at least two factorizations possible in the two variable setting. Presumably this is part of some larger family of factorizations. Perhaps there is some special “central” factorization, though it is unclear by how much, if at all, the bounds on the number of polynomials and their degrees can be improved. Once the minimal \(\hat{A}\) is constructed, the bi-infinite extension of the operator \(\hat{M}\) is a Schur complement, hence maximal in this context. The ideas briefly discussed in Sect. 3 for using Schur complements to construct optimal Fejér–Riesz factorizations of one variable trigonometric polynomials might then improve the number and degree bounds for the polynomials in the two variable factorization.
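The maximality of the Schur complement invoked here can be illustrated in finite dimensions (a toy analogue only, with random placeholder data; the operators in the text are infinite matrices). For a positive definite block matrix, the Schur complement of the (2,2) block is the largest \(X \ge 0\) that can be subtracted from the (1,1) corner while preserving positivity:

```python
import numpy as np

# Finite dimensional toy of Schur complement maximality.  The matrix P is a
# random placeholder; a small multiple of the identity guarantees positivity.
rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))
P = G @ G.T + 0.1 * np.eye(4)          # positive definite 4 x 4 matrix
P11, P12 = P[:2, :2], P[:2, 2:]
P21, P22 = P[2:, :2], P[2:, 2:]

# Schur complement of P22 in P: the largest X >= 0 such that
# P - diag(X, 0) remains positive semidefinite.
S = P11 - P12 @ np.linalg.solve(P22, P21)

def psd(M, tol=1e-9):
    """Positive semidefinite test via the smallest eigenvalue."""
    return np.min(np.linalg.eigvalsh((M + M.T) / 2)) >= -tol

def deflate(X):
    """Subtract X from the (1,1) corner of P."""
    Z = np.zeros((2, 2))
    return P - np.block([[X, Z], [Z, Z]])

assert psd(S) and psd(deflate(S))
# Subtracting anything strictly larger destroys positivity.
assert not psd(deflate(S + 1e-3 * np.eye(2)))
print("Schur complement is the maximal admissible corner decrement")
```

The deflated matrix \(P - \mathrm{diag}(S, 0)\) is positive semidefinite and singular, while any strictly larger decrement produces a negative Schur complement and hence an indefinite matrix.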

Finally, there are other related Positivstellensätze in the non-commutative setting which have not been touched upon (see, for example, [11, 14, 26]). What is observed, though, is that in contrast to the commutative case, the less restrictive nature of non-commutativity allows for factorization regardless of the number of variables.