1 Introduction

Interpolation with Blaschke products has a large literature. In this paper, we focus exclusively on boundary interpolation, that is, interpolation with Blaschke products when the nodes lie on the unit circle. Necessarily, the prescribed values lie on the unit circle too. The possibility of such interpolation was established by Cantor and Phelps [3]. Jones and Ruscheweyh [15] sharpened that result and showed that any m pairs of data can be interpolated with a Blaschke product of degree at most \(m-1\).

We denote the set of Blaschke products of degree m by

$$\begin{aligned} \mathbf {B}_m := \left\{ \gamma \prod _{j=1}^m \frac{z-a_j}{1-\overline{a_j} z}\; :\ \gamma ,a_1,\ldots ,a_m\in \mathbb {C}, |\gamma |= 1, |a_1|,\ldots ,|a_m|<1 \right\} . \end{aligned}$$
(1)

Here, \(\mathbf {B}_0\) consists of constants (with modulus 1). We also write

$$\begin{aligned} \mathbf {B}_{\le m} := \bigcup _{k=0}^m \mathbf {B}_k. \end{aligned}$$
(2)
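As a quick numerical illustration of (1), the following minimal Python sketch (NumPy assumed; the helper `blaschke` is ours, not from the literature) checks that a product of the above form is unimodular on \(\mathbb {T}\):

```python
import numpy as np

def blaschke(z, gamma, zeros):
    """Evaluate gamma * prod_j (z - a_j) / (1 - conj(a_j) * z)."""
    B = gamma * np.ones_like(z, dtype=complex)
    for a in zeros:
        B *= (z - a) / (1 - np.conj(a) * z)
    return B

rng = np.random.default_rng(0)
# random zeros a_j with |a_j| < 1 and a unimodular constant gamma
zeros = 0.9 * (rng.uniform(-0.5, 0.5, 3) + 1j * rng.uniform(-0.5, 0.5, 3))
gamma = np.exp(1j * rng.uniform(0, 2 * np.pi))
z = np.exp(1j * np.linspace(0, 2 * np.pi, 200))   # points on the unit circle
assert np.allclose(np.abs(blaschke(z, gamma, zeros)), 1.0)  # |B| = 1 on T
```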

Theorem 1

(Jones–Ruscheweyh, 1987) Let \(0\le \varphi _1<\varphi _2<\ldots<\varphi _{m}<2\pi \) and \(\psi _1,\psi _2,\ldots ,\psi _m\in [0,2\pi )\). Then there exists a Blaschke product \(B\in \mathbf {B}_{\le m-1}\) such that \(B(\exp (i \varphi _j))=\exp (i \psi _j)\), \(j=1,2,\ldots ,m\).

There are several proofs of this result, and a very nice overview can be found in the excellent paper [17] by Semmler and Wegert from 2006. Earlier references include (as a non-exhaustive list) [1, 6,7,8,9,10,11,12,13, 18, 19], and it is also worth mentioning the books [4, 5].

The outline of the paper is the following. After this introduction, we present a new proof in detail. The proof consists of two main steps: first, we parametrize Blaschke products and transform the boundary Blaschke interpolation problem into interpolation of real values by special real rational functions. After this transformation, we have two sets: one coming from the Blaschke product representation, and another coming from the interpolation data. These sets are subsets of a high dimensional real Euclidean space, and the assertion of the Jones–Ruscheweyh theorem is equivalent to the statement that their intersection is non-empty. We show this by employing tools from real algebraic geometry; the main tool in this step is a Positivstellensatz of Prestel and Delzell. Finally, we present some technical lemmas and their proofs in Sect. 5. It would be interesting to compare our approach with that of Semmler and Wegert, and also to investigate the structure of solutions in a future paper.

2 First Part of the Proof

In this part, we rephrase our problem completely. First, we apply the Cayley transform to finite Blaschke products. Let us mention that Semmler and Wegert used this approach and established a natural description (see [17, Lem. 3]). As we need the exact dependence of the coefficients on the zeros of the Blaschke product, we detail this step. In this way, we transform Blaschke products (of degree at most m, denoted by \(\mathbf {B}_{\le m}\)) into a special subset of real rational functions (denoted by \(\mathbf {H}_m\)). Here the coordinates represent the coefficients, and we use a natural choice to exclude the ambiguity caused by multiplying the numerator and the denominator by the same constant. We also transform the interpolation data, and this yields a homogeneous set of data (\(\mathbf {S}\), the solution set). If the number of interpolation pairs (n) and the maximal degree of the Blaschke products (m) differ by one only (i.e. \(n=m+1\)), then there is a nice description of the solution set \(\mathbf {S}\). Formulating this description (a linear parametrization of \(\mathbf {S}\)) finishes the rephrasing of the problem in terms of real algebraic equations and is the last step of the first part.

2.1 Parametrizing the Blaschke Products

We use the unit disk \(\mathbb {D}:=\{z\in \mathbb {C}:\ |z|<1\}\) and the unit circle \(\mathbb {T}:=\{z\in \mathbb {C}:\ |z|=1\}\). We introduce a parametrization as follows. Let

$$\begin{aligned} \mathbf {E}_m&:= \mathbb {T}\times \mathbb {D}^m \subset \mathbb {C}^{m+1}, \\ \mathbf {F}_m&:= \mathbb {T}\times \left( \mathbb {D}\cup \mathbb {T}\right) ^m \subset \mathbb {C}^{m+1} \end{aligned}$$

where the closure of \(\mathbf {E}_m\) is \(\mathbf {F}_m\).

Now we investigate how \(\mathbf {E}_m\) and \(\mathbf {F}_m\) can be used to parametrize Blaschke products. Consider the parametrization mapping

$$\begin{aligned} \mathcal {P}_m : \mathbf {F}_m \rightarrow \mathbf {B}_{\le m}, \quad (\delta ,a_1,\ldots ,a_m) \mapsto \delta ^2 \prod _{j:\ |a_j|<1} \frac{z-a_j}{1-\overline{a_j}z} \ \prod _{j:\ |a_j|=1} (-a_j) \end{aligned}$$
(3)

where we set

$$\begin{aligned} \gamma =\delta ^2 \end{aligned}$$
(4)

for convenience (see (10), (11) and Lemma 1 below). Note that if \(a\rightarrow a_0\), \(|a|<1\), \(|a_0|=1\), then \((z-a)/(1-{\overline{a}}z)\rightarrow -a_0\) locally uniformly in \(\mathbb {D}\), and also on \((\mathbb {D}\cup \mathbb {T}){\setminus }\{a_0\}\). Hence, \(\mathcal {P}_m\) is a continuous mapping (from the Euclidean topology to the topology of locally uniform convergence), see also Lemma 3. Roughly speaking, \(\mathcal {P}_m\) maps \(\mathbf {E}_m\) and \(\mathbf {F}_m\) to Blaschke products of degree m and of degree at most m, respectively. More precisely,

$$\begin{aligned} \mathcal {P}_m\left( \mathbf {E}_m\right) = \mathbf {B}_m \text { and } \mathcal {P}_m\left( \mathbf {F}_m\right) = \mathbf {B}_{\le m}. \end{aligned}$$
(5)
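The limit used above can also be tested numerically; a small sketch (Python/NumPy assumed, with an arbitrarily chosen boundary point and test point):

```python
import numpy as np

a0 = np.exp(0.7j)                 # a boundary point, |a0| = 1
z = 0.3 - 0.4j                    # a fixed test point in D, z != a0
for t in [0.9, 0.99, 0.999, 0.9999]:
    a = t * a0                    # a -> a0 radially, |a| < 1
    factor = (z - a) / (1 - np.conj(a) * z)
    print(t, abs(factor + a0))    # distance to -a0, shrinks to 0
```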

2.2 The Cayley Transform

In this section, we describe how the Cayley transform acts on Blaschke products of degree at most m. We also need the inverse Cayley transform. We denote the Cayley transform by \(z=T(u)\):

$$\begin{aligned} z=T(u)&=\frac{i-u}{i+u}, \\ u=T^{-1}(z)&=i\frac{1-z }{1+z }. \end{aligned}$$

The set of real rational functions of degree at most m is:

$$\begin{aligned} \mathbf {H}_m := \left\{ \frac{P(u)}{Q(u)}: \ P(u),Q(u)\in \mathbb {R}[u],\ \deg (P),\deg (Q)\le m,\ Q\not \equiv 0\right\} . \end{aligned}$$

It is important that if \(B\in \mathbf {B}_{\le m}\) is a Blaschke product, then

$$\begin{aligned} H(u):= T^{-1} \circ B \circ T(u) = \frac{P(u)}{Q(u)} \end{aligned}$$

for some polynomials P, Q with real coefficients and \(\deg (P), \deg (Q)\le m\). This follows immediately from [17, Lem. 3] (Representation Lemma); we reprove and investigate it in detail in the next subsection.
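Before detailing the computation, this claim can be checked numerically; a sketch (Python/NumPy assumed, with arbitrarily chosen data), testing that \(T^{-1}\circ B\circ T\) is real-valued on the real line:

```python
import numpy as np

T    = lambda u: (1j - u) / (1j + u)      # Cayley transform
Tinv = lambda z: 1j * (1 - z) / (1 + z)   # inverse Cayley transform

def blaschke(z, gamma, zeros):
    B = gamma
    for a in zeros:
        B = B * (z - a) / (1 - np.conj(a) * z)
    return B

u = np.linspace(-3, 3, 25)                # real points
H = Tinv(blaschke(T(u), np.exp(0.3j), [0.5 + 0.2j, -0.1j]))
assert np.all(np.abs(H.imag) < 1e-9 * (1 + np.abs(H)))  # H(u) is real
```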

2.3 Structure of the Coefficients After the Cayley Transform

Later we also need the structure of the coefficients of P and Q when B is a Blaschke product (\(|\gamma |=1\)), so we detail this calculation.

Let \(B\in \mathbf {B}_m\) and write

$$\begin{aligned} H(u)&= T^{-1}\left( B(T(u)) \right) = i\frac{ 1- \gamma \prod _{j=1}^m \frac{z-a_j}{1-\overline{a_j} z} }{ 1+ \gamma \prod _{j=1}^m \frac{z-a_j}{1-\overline{a_j}z}} \\ {}&= i\frac{ \prod _j (1-\overline{a_j}z) - \gamma \prod _j (z-a_j) }{ \prod _j (1-\overline{a_j}z) + \gamma \prod _j (z-a_j) } \\&= i \frac{ \prod _j \frac{u(1+\overline{a_j})+i-\overline{a_j}i}{i+u} - \gamma \prod _j \frac{u(-1-a_j) +i - a_j i}{i+u} }{ \prod _j \frac{u(1+\overline{a_j})+i-\overline{a_j}i}{i+u} + \gamma \prod _j \frac{u(-1-a_j) +i - a_j i}{i+u} } \\ {}&= i \frac{ \prod _j \left( u(1+\overline{a_j}) +i-\overline{a_j}i\right) - \gamma \prod _j \left( u(-1-a_j) +i - a_j i\right) }{ \prod _j \left( u(1+\overline{a_j}) +i-\overline{a_j}i\right) + \gamma \prod _j \left( u(-1-a_j) +i - a_j i\right) }. \end{aligned}$$

To simplify this, we introduce

$$\begin{aligned} C(u)&:= \prod _{j=1}^m \left( u(1+\overline{a_j}) +i-\overline{a_j}i\right) , \nonumber \\ D(u)&:= \prod _{j=1}^m \left( u(-1-a_j) +i - a_j i\right) \end{aligned}$$
(6)

so we can write

$$\begin{aligned} H(u)= \frac{i C(u) -i \gamma D(u)}{ C(u) + \gamma D(u)}. \end{aligned}$$
(7)

Denote the coefficients of C and D by \(c_j\) and \(d_j\) respectively:

$$\begin{aligned} C(u)=\sum _{j=0}^m c_j u^j,\quad D(u)=\sum _{j=0}^m d_j u^j. \end{aligned}$$

Note that

$$\begin{aligned} d_j=d_j(a_1,\ldots ,a_m)\in \mathbb {C}[a_1,\ldots ,a_m] \end{aligned}$$
(8)

are (holomorphic) polynomials in \(a_1,\ldots ,a_m\).

We also have \(\overline{D(u)}=(-1)^m C(u)\) (when \(u\in \mathbb {R}\)), therefore

$$\begin{aligned} c_j = (-1)^m \overline{d_j} \end{aligned}$$
(9)

for \(j=0,1,\ldots ,m\). We can express the leading coefficients, \(c_m=\prod _{j=1}^m (1+\overline{a_j})\) and \(d_m=\prod _{j=1}^m (-1-a_j)\).

We are going to express P and Q using C, D and (7). First, the leading coefficient of \(i C(u)-i \gamma D(u)\) is

$$\begin{aligned} i c_m - i\gamma d_m = i (-1)^m \overline{d_m} - i \gamma d_m \end{aligned}$$

and the leading coefficient of \(C(u)+\gamma D(u)\) is

$$\begin{aligned} c_m + \gamma d_m = (-1)^m \overline{d_m} + \gamma d_m. \end{aligned}$$

We use Lemma 1 (with \(W=(-1)^m d_m\); this applies since \(|\gamma |=1\)) and Lemma 2; Lemmas 1 and 2 can be found in Sect. 5. This way we get that

$$\begin{aligned} P(u)&= \left( \frac{1}{\sqrt{\gamma }} C - \sqrt{\gamma } D\right) \cdot {\left\{ \begin{array}{ll} 1, &{}\text { if } m \text { is odd},\\ i, &{}\text { if } m \text { is even}, \end{array}\right. } \end{aligned}$$
(10)
$$\begin{aligned} Q(u)&= \left( \frac{1}{\sqrt{\gamma }} C + \sqrt{\gamma } D \right) \cdot {\left\{ \begin{array}{ll} -i, &{}\text { if } m \text { is odd},\\ 1, &{}\text { if } m \text { is even} \end{array}\right. } \end{aligned}$$
(11)

are polynomials with real coefficients and

$$\begin{aligned} H(u)=\frac{P(u)}{Q(u)}. \end{aligned}$$

Let us remark here that P(u) and Q(u) cannot have common zeros. Otherwise, if \(u\in \mathbb {C}\) is such that \(P(u)=0\) and \(Q(u)=0\), then, by (10) and (11), \(C(u)=0\) and \(D(u)=0\) too. Considering the definitions of C(u) and D(u), there should be \(a_j, a_k\) with \(|a_j|< 1\), \(|a_k|< 1\) and \(i(\overline{a_j}-1)/(\overline{a_j}+1)=i(a_k-1)/(-a_k-1)\); this is a contradiction, since the left-hand side lies in the lower half-plane and the right-hand side in the upper half-plane. This is also in accord with [17, Rem. 1].

Up to this point, either branch of the square root (\(\pm \sqrt{\gamma }\)) can be used. From now on we use (4), that is, \(\sqrt{\gamma }=\delta \), and we write \(\delta \) below.

Continuing the calculation for the coefficients of \(P(u)= \sum _{j=0}^m p_j u^j\) (and using \(|\delta |=1\) again), we have

$$\begin{aligned} p_j = \left( \frac{1}{\delta } c_j - \delta d_j\right) \cdot {\left\{ \begin{array}{ll} 1 \\ i \end{array}\right. } = {\left\{ \begin{array}{ll} -2\text { Re}(\delta d_j), &{}\text { if } m \text { is odd},\\ 2\text { Im}(\delta d_j), &{}\text { if } m \text { is even}. \end{array}\right. } \end{aligned}$$
(12)

The coefficients of \(Q(u)=\sum _{j=0}^m q_j u^j\) are

$$\begin{aligned} q_j = \left( \frac{1}{\delta }c_j + \delta d_j\right) \cdot {\left\{ \begin{array}{ll} -i \\ 1 \end{array}\right. } = {\left\{ \begin{array}{ll} 2{\text {Im}}(\delta d_j), &{}\text { if } m \text { is odd},\\ 2{\text {Re}}(\delta d_j), &{}\text { if } m \text { is even}. \end{array}\right. } \end{aligned}$$
(13)
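The identities (12) and (13) can be verified numerically. The following sketch (Python/NumPy assumed; the data are chosen arbitrarily, with m = 3 odd) builds the \(d_j\) by polynomial multiplication, forms \(p_j, q_j\) as above, and compares \(P/Q\) with \(T^{-1}\circ B\circ T\) at a few real points:

```python
import numpy as np

zeros = [0.5 + 0.2j, -0.3 + 0.1j, 0.1 - 0.6j]    # m = 3 (odd), |a_j| < 1
m, delta = len(zeros), np.exp(0.4j)
gamma = delta**2                                  # as in (4)

Dpoly = np.array([1.0 + 0j])                      # coefficients, descending
for a in zeros:                                   # D(u) = prod((-1-a)u + i(1-a))
    Dpoly = np.convolve(Dpoly, [-1 - a, 1j * (1 - a)])
d = Dpoly[::-1]                                   # d[j] = coefficient of u^j

p = -2 * np.real(delta * d) if m % 2 else 2 * np.imag(delta * d)  # (12)
q = 2 * np.imag(delta * d) if m % 2 else 2 * np.real(delta * d)   # (13)

T, Tinv = lambda u: (1j - u) / (1j + u), lambda z: 1j * (1 - z) / (1 + z)
B = lambda z: gamma * np.prod([(z - a) / (1 - np.conj(a) * z) for a in zeros])
for u in [-1.3, -0.2, 0.4, 2.1]:
    lhs = np.polyval(p[::-1], u) / np.polyval(q[::-1], u)
    assert abs(lhs - Tinv(B(T(u)))) < 1e-10       # P/Q = T^{-1} o B o T
```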

2.4 Parametrization of the Real Rational Functions

We investigate the parametrization of rational functions. In the previous section, the \(p_j\)’s and \(q_j\)’s were polynomials in the zeros of the Blaschke product, namely the coefficients of the transformed rational function. In this paragraph, slightly abusing the notation, we use \(p_j\) and \(q_j\) as free variables standing for the coefficients of a rational function. Afterward, we use them again as polynomials. Let

$$\begin{aligned} \mathbf {A}:=&\ \mathbb {R}^{2m+2}, \\ \mathbf {A}_1:=&\left\{ (p_m,\ldots ,p_1,p_0,q_m,\ldots ,q_1,q_0) \in \mathbf {A}: \ q_m^2+\cdots +q_1^2+q_0^2\ne 0 \right\} . \end{aligned}$$

It is standard that

$$\begin{aligned} (p_m,\ldots ,p_1,p_0,q_m,\ldots ,q_1,q_0)\in \mathbf {A}_1 \mapsto H(u)=\frac{\sum _{j=0}^m p_j u^j}{\sum _{j=0}^m q_j u^j} \in \mathbf {H}_m \end{aligned}$$

is a surjective but not injective mapping: if \(\mathbf {p} \in \mathbf {A}_1\subset \mathbb {R}^{2m+2}\), then \(c \mathbf {p}\) determines the same rational function for every \(c\in \mathbb {R}{\setminus }\{0\}\). In other words, the coefficients of the numerator and denominator of a rational function from \(\mathbf {H}_m\) are not uniquely determined (unless some type of normalization is imposed on the numerator and denominator).

Therefore we define the coefficients directly:

$$\begin{aligned} \mathcal {L}_{m,\mathbb {C}}:\ \mathbf {F}_m \rightarrow \mathbf {A}, \quad (\delta ,a_1,\ldots ,a_m) \mapsto (p_m,\ldots ,p_1,p_0,q_m,\ldots ,q_1,q_0), \\ \text {where if } m \text { is odd, then } p_j=-2{\text {Re}}(\delta d_j),\ q_j=2{\text {Im}}(\delta d_j),\\ \text {and if } m \text { is even, then } p_j=2{\text {Im}}(\delta d_j),\ q_j=2{\text {Re}}(\delta d_j), \end{aligned}$$

where we used (12), (13) and (6). We know that

$$\begin{aligned} d_j=\sum _{J\subset \{1,\ldots ,m\}, |J|=j} \left( \prod _{k\in J} (-1-a_k) \right) \left( \prod _{\ell \in J^c} (i-a_\ell i) \right) \end{aligned}$$
(14)

where \(J^c=\{1,2,\ldots ,m\}{\setminus } J\); in particular, if \(j=0\), then \(J=\emptyset \) and the sum consists of only one term, hence

$$\begin{aligned} d_0=\prod _{\ell =1}^m (i-a_\ell i). \end{aligned}$$

Obviously, \(\mathcal {L}_{m,\mathbb {C}}(\mathbf {F}_m) \subset \mathbf {A}_1\).
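Formula (14) can be cross-checked against a direct expansion of D(u); a sketch (Python/NumPy assumed, with m = 2 chosen for brevity):

```python
import numpy as np
from itertools import combinations

zeros = [0.2 + 0.5j, -0.4 - 0.1j]                 # m = 2
m = len(zeros)

Dpoly = np.array([1.0 + 0j])                      # expand D(u) directly
for a in zeros:
    Dpoly = np.convolve(Dpoly, [-1 - a, 1j * (1 - a)])
d = Dpoly[::-1]                                   # ascending coefficients

for j in range(m + 1):                            # compare with (14)
    total = 0
    for J in combinations(range(m), j):
        Jc = [k for k in range(m) if k not in J]
        total += np.prod([-1 - zeros[k] for k in J]) * \
                 np.prod([1j - zeros[l] * 1j for l in Jc])
    assert abs(total - d[j]) < 1e-12
```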

Finally, we switch to real algebraic language, in particular, we use new variables as follows

$$\begin{aligned} a_j&= x_j + i y_j, \text { where } x_j^2 +y_j^2 \le 1, x_j,y_j\in \mathbb {R}, j=1,\ldots ,m, \end{aligned}$$
(15)
$$\begin{aligned} \delta&= \delta _1 + i \delta _2, \text { where } \delta _1^2 +\delta _2^2 = 1, \delta _1,\delta _2 \in \mathbb {R}. \end{aligned}$$
(16)

With these substitutions, we introduce

$$\begin{aligned} \mathbf {F}_{m,\mathbb {R}}&:= \big \{ (\delta _1,\delta _2,x_1,y_1,\ldots ,x_m,y_m) \in \mathbb {R}^{2m+2}: \\&\qquad \quad \delta _1^2 +\delta _2^2 = 1,\ x_j^2 + y_j^2 \le 1,\ j=1,\ldots ,m \big \} \end{aligned}$$

and

$$\begin{aligned} \mathcal {L}_{m}: \ \mathbf {F}_{m,\mathbb {R}} \rightarrow \mathbf {A}\end{aligned}$$

and as above, \(\mathcal {L}_{m}(\mathbf {F}_{m,\mathbb {R}}) \subset \mathbf {A}_1\).

Let \(U_\ell \) and \(V_\ell \) be the following polynomials

$$\begin{aligned} \mathcal {L}_m : \ (\delta _1,\delta _2,x_1,y_1, \ldots , x_m,y_m) \mapsto (U_m,\ldots ,U_1,U_0,V_m,\ldots ,V_1,V_0) \end{aligned}$$

where \(U_\ell ,V_\ell \in \mathbb {R}[\delta _1,\delta _2,x_1,y_1,\ldots ,x_m,y_m]\), and actually

$$\begin{aligned} U_\ell (\delta _1,\delta _2,x_1,y_1,\ldots ,x_m,y_m)&= p_\ell (\delta _1+i \delta _2, x_1+i y_1,\ldots ,x_m + i y_m), \end{aligned}$$
(17)
$$\begin{aligned} V_\ell (\delta _1,\delta _2,x_1,y_1,\ldots ,x_m,y_m)&= q_\ell (\delta _1+i \delta _2, x_1+i y_1,\ldots ,x_m + i y_m), \end{aligned}$$
(18)

for \(\ell =0,1,\ldots ,m\). Note that for \(\ell =0,1,\ldots ,m\)

$$\begin{aligned} U_\ell ={\left\{ \begin{array}{ll} -2\delta _1 {\text {Re}}(d_\ell ) + 2 \delta _2 {\text {Im}}(d_\ell ), &{}\text { if } m \text { is odd}, \\ 2\delta _1 {\text {Im}}(d_\ell ) +2 \delta _2{\text {Re}}(d_\ell ), &{}\text { if } m \text { is even} \end{array}\right. } \end{aligned}$$
(19)

and

$$\begin{aligned} V_\ell ={\left\{ \begin{array}{ll} 2\delta _1 {\text {Im}}(d_\ell ) +2 \delta _2{\text {Re}}(d_\ell ), &{}\text { if } m \text { is odd}, \\ 2\delta _1 {\text {Re}}(d_\ell ) -2 \delta _2 {\text {Im}}(d_\ell ), &{}\text { if } m \text { is even}. \end{array}\right. } \end{aligned}$$
(20)

Here \(d_\ell \) depends on \(a_1,\ldots ,a_m\), while \(U_\ell \) and \(V_\ell \) depend on \(\delta _1,\delta _2,x_1,y_1,\ldots ,x_m,y_m\); the variables are connected by (15) and (16).

2.5 Applying Cayley Transform on the Interpolation Data

Here we consider the interpolation data and transform it with the Cayley transform.

Suppose that pairwise distinct \(z_1,\ldots ,z_n\in \mathbb {C}\) with \(|z_1|=\ldots =|z_n|=1\) and values \(w_1,\ldots ,w_n\in \mathbb {C}\) with \(|w_1|=\ldots =|w_n|=1\) are given. We transform these and consider

$$\begin{aligned} u_j := T^{-1}(z_j), \quad v_j := T^{-1}(w_j). \end{aligned}$$

Note that if \(w_j=-1\) for some j, then \(v_j=\infty \); also if \(z_j=-1\) for some j, then \(u_j=\infty \). By appropriate rotations, this can be avoided. To be precise, let \(\omega ,\chi \in \mathbb {C}\), \(|\omega |=|\chi |=1\) such that none of \(\omega w_1, \ldots , \omega w_n\) is equal to \(-1\) and none of \(z_1 \chi ,\ldots , z_n \chi \) is equal to \(-1\). Then we find a Blaschke product \(B(\cdot )\) so that \(B(z_j \chi )=\omega w_j\) for \(j=1,\ldots ,n\) where

$$\begin{aligned} B(z)=\gamma \prod _{j=1}^m \frac{z-a_j}{1-\overline{a_j}z}. \end{aligned}$$

Then just take

$$\begin{aligned} {\widetilde{B}}(z):=\omega ^{-1} B(z\chi ) = \omega ^{-1} \gamma \chi ^{m} \prod _{j=1}^m \frac{z-\, a_j{\overline{\chi }}}{1-\overline{a_j {\overline{\chi }}} \, z} \end{aligned}$$

and this will interpolate \(w_j\) at \(z_j\): \({\widetilde{B}}(z_j)=\omega ^{-1}B(z_j\chi )=w_j\), \(j=1,\ldots ,n\).
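A numerical sketch of this rotation trick (Python/NumPy assumed; the sample node, zeros and rotations below are arbitrary):

```python
import numpy as np

def blaschke(z, gamma, zeros):
    B = gamma
    for a in zeros:
        B = B * (z - a) / (1 - np.conj(a) * z)
    return B

gamma, zeros = np.exp(0.2j), [0.3 + 0.4j, -0.5j]
omega, chi = np.exp(1.1j), np.exp(-0.7j)
zj = np.exp(0.9j)                                  # a sample node on T
wj = blaschke(zj * chi, gamma, zeros) / omega      # then B(zj*chi) = omega*wj

# Btilde(z) = omega^{-1} B(z*chi) is a Blaschke product with zeros a_j/chi
zeros2 = [a / chi for a in zeros]
gamma2 = gamma * chi**len(zeros) / omega
assert abs(blaschke(zj, gamma2, zeros2) - wj) < 1e-12   # Btilde(zj) = wj
```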

Therefore, we may assume that all \(u_j\) and \(v_j\) are finite, i.e.

$$\begin{aligned} u_1,\ldots ,u_n,v_1,\ldots ,v_n\in \mathbb {R}, \nonumber \\ u_1,\ldots ,u_n \text { are pairwise distinct. } \end{aligned}$$
(21)

2.6 Introducing Two Real Sets Coming from Blaschke Products and Interpolation Data

Here we would like to find real polynomials

$$\begin{aligned} P(u)=\sum _{\ell =0}^m \alpha _\ell u^\ell \quad \text {and} \quad Q(u)=\sum _{\ell =0}^m \beta _\ell u^\ell , \end{aligned}$$

\(Q(u)\not \equiv 0\) such that

$$\begin{aligned} P(u_j)- v_j Q(u_j)=0,\ j=1,\ldots ,n. \end{aligned}$$
(22)

This equation is equivalent to \(P(u_j)/Q(u_j)=v_j\), provided that

$$\begin{aligned} Q(u_j)\ne 0. \end{aligned}$$
(23)

We will return to this condition later.

An equivalent form of (22) is

$$\begin{aligned} \sum _{\ell =0}^m \big ( \alpha _\ell - v_j \beta _\ell \big ) u_j^\ell =0, \ j=1,\ldots ,n. \end{aligned}$$
(24)

Note that this is a homogeneous linear system in \(\alpha _0,\beta _0,\ldots ,\alpha _m,\beta _m\), so it always has a (trivial) solution, and nontrivial solutions exist as soon as \(n<2m+2\). Let \(\mathbf {U}\) be the following Vandermonde matrix

$$\begin{aligned} \mathbf {U}:= \left( u_j^\ell \right) _{j=1,2,\ldots ,n;\ \ell =0,1,\ldots ,m}= \begin{pmatrix} 1 &{}\quad u_1 &{}\quad \ldots &{}\quad u_1^m \\ 1 &{}\quad u_2 &{}\quad \ldots &{}\quad u_2^m \\ \vdots &{} &{} &{} \\ 1 &{}\quad u_{n} &{}\quad \ldots &{}\quad u_{n}^m \end{pmatrix}, \end{aligned}$$

and put \(\mathbf {D}:=\mathrm {diag}(-v_1,\ldots ,-v_n)\) and \(\mathbf {v}:=(\alpha _0,\alpha _1,\ldots ,\alpha _m,\beta _0,\beta _1,\ldots ,\beta _m)^{\top }\) (where \(\cdot ^{\top }\) denotes transpose, hence \(\mathbf {v}\) is a column vector) and introduce

$$\begin{aligned} \mathbf {M}:= (\mathbf {U} \; \mathbf {D}\mathbf {U}) \in \mathbb {R}^{n\times (2m+2)} \end{aligned}$$

for short. Hence, (24) can be written as

$$\begin{aligned} \mathbf {M}\mathbf {v}=0. \end{aligned}$$
(25)

We assume that

$$\begin{aligned} m=n-1. \end{aligned}$$
(26)

So \(\mathbf {U}\) is a square matrix, \(\mathbf {U}\in \mathbb {R}^{n\times n}\), and since \(u_1,\ldots ,u_n\) are pairwise distinct (see (21)), \(\mathbf {U}\) is non-singular. Note that \(\mathbf {M}\) can be thought of as a linear mapping from \(\mathbf {A}=\mathbb {R}^{2m+2}\) to \(\mathbb {R}^n\). Therefore,

$$\begin{aligned} \mathrm {rank}(\mathbf {M}) = n. \end{aligned}$$
(27)

Using the rank-nullity theorem, this implies that

$$\begin{aligned} \mathbf {S}:= \mathrm {ker}(\mathbf {M}) \text { has dimension } n. \end{aligned}$$
(28)
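A small sketch (Python/NumPy assumed, with n = 3 and arbitrary transformed data) illustrating (25)-(28):

```python
import numpy as np

n = 3                                          # m = n - 1 = 2
u = np.array([-1.0, 0.5, 2.0])                 # pairwise distinct real nodes
v = np.array([0.7, -1.2, 0.4])                 # transformed values
U = np.vander(u, n, increasing=True)           # Vandermonde: columns 1, u, u^2
M = np.hstack([U, -v[:, None] * U])            # M = (U  DU), D = diag(-v)
assert np.linalg.matrix_rank(M) == n           # (27): rank M = n
assert M.shape[1] - np.linalg.matrix_rank(M) == n   # (28): dim ker M = n
```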

Note that \(\mathbf {S}=\mathbf {S}(u_1,\ldots ,u_n,v_1,\ldots ,v_n)\subset \mathbf {A}\) depends on the interpolation data (i.e. on \(z_1,\ldots ,z_n\) and \(w_1,\ldots ,w_n\)) and contains all the solutions of (24), while the “geometrical representation” of general Blaschke products,

$$\begin{aligned} \mathbf {G}_{\le m}:= \mathcal {L}_m(\mathbf {F}_{m,\mathbb {R}}), \end{aligned}$$

is independent of the interpolation data.

We return to the condition (23). It holds whenever

$$\begin{aligned} \mathbf {v}=(\alpha _0,\alpha _1,\ldots ,\alpha _m,\beta _0,\beta _1,\ldots ,\beta _m)\in \mathbf {G}_{\le m} \end{aligned}$$

hence the corresponding \(P(u)=\sum _{j=0}^m \alpha _j u^j\), \(Q(u)=\sum _{j=0}^m \beta _j u^j\) satisfy \(P(u)/Q(u)=T^{-1}\circ B\circ T(u)\) for some Blaschke product \(B\in \mathbf {B}_{\le m}\). Therefore P and Q cannot have common zeros; in particular, if we had \(Q(u_j)=0\), then (22) would force \(P(u_j)=0\) as well, a contradiction.

Note that \(\mathbf {G}_{\le m}\subset \mathbf {A}_1\), moreover

$$\begin{aligned} \mathbf {G}_{\le m}&\subset \big \{(p_m,\ldots ,p_1,p_0,q_m,\ldots ,q_1,q_0) \in \mathbf {A}: \\&\quad \quad p_m^2+\ldots +p_1^2+p_0^2\ne 0, \ q_m^2+\ldots +q_1^2+q_0^2\ne 0 \big \}. \end{aligned}$$

Also, observe that \(\mathbf {S}{\setminus }\{0\}\subset \mathbf {A}_1\). Indeed, if \(\mathbf {v}\in \mathbf {S}{\setminus }\{0\}\) and \(\mathbf {v}\in \mathbf {A}{\setminus }\mathbf {A}_1=\mathbf {A}_1^c\), then \(\mathbf {v}\) has the form \((\alpha _0,\ldots ,\alpha _m,0,\ldots ,0)\) where \((\alpha _0,\ldots ,\alpha _m)\ne 0\). Substituting this into (24), we get that \(\mathbf {U} (\alpha _0,\ldots ,\alpha _m)^\top =0\) which contradicts the fact that \(\mathbf {U}\) is invertible.

It is straightforward to see that the Jones–Ruscheweyh theorem is equivalent to

$$\begin{aligned} \mathbf {S}\cap \mathbf {G}_{\le m} \ne \emptyset . \end{aligned}$$
(29)

2.7 Parametrizing the Set Coming from Interpolation Data

We now parametrize \(\mathbf {S}\). In the previous section, the \(\alpha _j\)’s and \(\beta _j\)’s were unknowns coming from a subspace: \((\alpha _0,\alpha _1,\ldots ,\alpha _m,\beta _0,\beta _1,\ldots ,\beta _m)\in \mathbf {S}\). As we parametrize \(\mathbf {S}\), we use the same symbols, \(\alpha _j=\alpha _j(t_1,\ldots ,t_n)\), \(\beta _j=\beta _j(t_1,\ldots ,t_n)\), to keep the notation simple. Using \(t=(t_1,\ldots ,t_n)\in \mathbb {R}^n\), we consider

$$\begin{aligned} t\in \mathbb {R}^n \mapsto \bigg (\alpha _0(t),\ldots , \alpha _m(t), \beta _0(t),\ldots , \beta _m(t) \bigg ) \in \mathbf {A}=\mathbb {R}^{2m+2} \end{aligned}$$

where \(\alpha _0,\alpha _1,\ldots ,\alpha _m\) and \(\beta _0,\beta _1,\ldots ,\beta _m\) are linear polynomials without constant terms, chosen so that the image of this mapping is exactly \(\mathbf {S}\) (this is possible by (28), and the mapping is then a bijection onto \(\mathbf {S}\)):

$$\begin{aligned} \alpha _0(0,\ldots ,0)=\ldots = \alpha _m(0,\ldots ,0)= \beta _0(0,\ldots ,0)=\ldots = \beta _m(0,\ldots ,0)=0.\nonumber \\ \end{aligned}$$
(30)

Note that (29) is equivalent to the statement that the system

$$\begin{aligned} \left. \begin{array}{ll} U_\ell (\delta _1,\delta _2,x_1,y_1,\ldots ,x_m,y_m) =&{} \alpha _\ell (t_1,\ldots ,t_n), \\ V_\ell (\delta _1,\delta _2,x_1,y_1,\ldots ,x_m,y_m) =&{} \beta _\ell (t_1,\ldots ,t_n), \end{array} \quad \ell =0,1,\ldots ,m \right\} \end{aligned}$$
(31)

has a solution under the conditions

$$\begin{aligned} x_j^2 + y_j^2 \le 1, \ j=1,\ldots ,m, \quad \delta _1^2+\delta _2^2=1 \end{aligned}$$
(32)

where \(\delta _1,\delta _2,x_1,y_1,\ldots ,x_m,y_m,t_1,\ldots ,t_n\in \mathbb {R}\).

This reformulation is expressed in terms of real algebraic geometry only.

To exploit the dimension condition (28), we introduce the matrix A by collecting the coefficients of \(\alpha _0,\alpha _1,\ldots ,\alpha _m,\beta _0,\beta _1,\ldots ,\beta _m\). Let \(A\in \mathbb {R}^{2n \times n}\) be the matrix for which

$$\begin{aligned} A t^\top = \begin{pmatrix} \alpha _0(t) \\ \beta _0(t) \\ \alpha _1(t) \\ \beta _1(t) \\ \vdots \\ \alpha _m(t) \\ \beta _m(t) \end{pmatrix} \end{aligned}$$
(33)

where \(t^\top \) is a column vector. Now we use the dimension condition (28). Hence, it is standard (see e.g. [14, p. 13, 0.4.6 (f)]) that there is an invertible matrix \(B_1\in \mathbb {R}^{n\times n}\) and a set \(I_1\subset \{1,2,\ldots ,2n\}\) with \(|I_1|=n\) such that the rows of \(AB_1\) with indices from \(I_1\) form the identity matrix of size n.

For ease of notation, we do not introduce new variables for \(B_1^{-1} t^\top \); that is, we assume that the rows of A with indices from \(I_1\) give the identity matrix. For simplicity, we label the elements of \(I_1\) by \(j(\cdot )\): \(j(1),\ldots ,j(n)\) are distinct and \(I_1=\{j(1),\ldots ,j(n)\}\). Put \(I_2:=\{1,2,\ldots ,2n\}{\setminus } I_1\) for the remaining row indices.

For convenience, we introduce

$$\begin{aligned} \left. \begin{aligned} L_{2\ell +1}(t):=&\ \alpha _\ell (t), \\ L_{2\ell +2}(t):=&\ \beta _\ell (t) \end{aligned} \right\} \end{aligned}$$
(34)

for \(\ell =0,1,\ldots ,m\). Therefore

$$\begin{aligned} L_{j(k)}=t_k,\quad k=1,2,\ldots ,n. \end{aligned}$$
(35)

3 A Second Transformation Applied on the Two Sets

Our ultimate goal is to show that the two sets (\(\mathbf {S}\) and \(\mathbf {G}_{\le m}\)) coming from the two “sides” of the problem have non-empty intersection, i.e. that (29) holds.

In this section, we apply a second transformation which is an adapted form of the rational parametrization of the unit circle. This transformation changes the occurring polynomials and reveals a crucial property of the polynomials (see (44), (45), and (46)). We exploit this property with a Positivstellensatz of Delzell and Prestel which is a special description of positive polynomials on compact, semialgebraic sets. Instead of sums of squares of polynomials, it features higher powers of polynomials.

3.1 Describing the Second Transformation

We will transform our system of equations by substituting variables and replacing equations so that we can apply a Positivstellensatz.

It is known that

$$\begin{aligned} \tau \mapsto \left( \frac{1-\tau ^2}{1+\tau ^2}, \frac{2\tau }{1+\tau ^2}\right) \end{aligned}$$

is a rational parametrization of the unit circle, more precisely, it is a bijective mapping from \(\mathbb {R}\) to \(\{(x,y)\in \mathbb {R}^2: x^2 + y^2 =1\}{\setminus }\{(-1,0)\}\). Similarly,

$$\begin{aligned} (\sigma ,r) \mapsto \left( \frac{1-\sigma ^2}{1+\sigma ^2}, \frac{2 r \sigma }{1+\sigma ^2} \right) \end{aligned}$$

is a rational mapping from \([0,\infty )\times [-1,1]\) onto \(\{(x,y)\in \mathbb {R}^2: x^2 + y^2 \le 1\}{\setminus }\{(-1,0)\}\); it is bijective from \((0,\infty )\times [-1,1]\) to \(\{(x,y)\in \mathbb {R}^2: x^2 + y^2 \le 1\} {\setminus }\{(-1,0),(1,0)\}\), and \(\{0\}\times [-1,1]\) is mapped to the point (1, 0). We also use \(\sigma =s/(1-s)\) and \(\tau =u/(1-u^2)\): if s runs over the interval (0, 1), then \(\sigma \) runs over \((0,\infty )\), and if u runs over \((-1,1)\), then \(\tau \) runs over the real numbers. Therefore we consider the composite mapping:

$$\begin{aligned} u\mapsto \left( \frac{u^4-3 u^2+1}{u^4-u^2+1}, -\frac{2 u (u^2-1)}{u^4-u^2+1} \right) \end{aligned}$$
(36)

which is a bijective mapping from the open interval \((-1,1)\) to \(\{(x,y)\in \mathbb {R}^2: x^2 + y^2 =1\}{\setminus }\{(-1,0)\}\) and maps \(-1\) and \(+1\) to the point \((-1,0)\). Similarly, we consider

$$\begin{aligned} (s,r) \mapsto \left( \frac{1-2 s}{1-2 s + 2s^2}, - \frac{2 r s (-1+s)}{1-2 s + 2s^2} \right) \end{aligned}$$
(37)

which has the following mapping properties. It maps \((0,1)\times (-1,1)\) bijectively onto the open unit disk \(\{(x,y)\in \mathbb {R}^2:\ x^2+y^2<1\}\); it is also a bijective mapping from \((0,1)\times [-1,1]\) to \(\{(x,y)\in \mathbb {R}^2:\ x^2+y^2\le 1\}{\setminus }\{(1,0),(-1,0)\}\); it maps \(\{1\}\times [-1,1]\) to the point \((-1,0)\) and \(\{0\}\times [-1,1]\) to the point (1, 0).

Based on these, we introduce the following substitutions:

$$\begin{aligned} x_j= \frac{1-2 s_j}{1-2 s_j + 2s_j^2}, \quad y_j= - \frac{2 r_j s_j (-1+s_j)}{1-2 s_j + 2s_j^2}, \qquad j=1,2,\ldots ,m, \end{aligned}$$
(38)
$$\begin{aligned} \delta _1 = \frac{u^4-3 u^2+1}{u^4-u^2+1}, \quad \delta _2= -\frac{2 u (u^2-1)}{u^4-u^2+1}. \end{aligned}$$
(39)
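A quick sampling check of (38) and (39) (Python/NumPy assumed): the pair \((\delta _1,\delta _2)\) always lies on the unit circle and \((x_j,y_j)\) in the closed unit disk:

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.uniform(-1, 1, 1000)
s = rng.uniform(0, 1, 1000)
r = rng.uniform(-1, 1, 1000)

d1 = (u**4 - 3 * u**2 + 1) / (u**4 - u**2 + 1)
d2 = -2 * u * (u**2 - 1) / (u**4 - u**2 + 1)
x = (1 - 2 * s) / (1 - 2 * s + 2 * s**2)
y = -2 * r * s * (-1 + s) / (1 - 2 * s + 2 * s**2)

assert np.allclose(d1**2 + d2**2, 1)           # (39): on the unit circle
assert np.all(x**2 + y**2 <= 1 + 1e-12)        # (38): in the closed disk
```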

We will apply these substitutions and, to obtain polynomials, multiply by the denominators. To be precise, we focus on the structure of \(d_\ell \) and use formula (14) with the substitutions (15), (16) and the substitutions above, so we can write for \(\ell =0,1,\ldots ,m\)

$$\begin{aligned} d_\ell&= \sum _{J:\ |J|=\ell } (-1)^\ell i^{m-\ell } \prod _{\mu \in J} (1+a_\mu ) \prod _{\nu \in J^c}(1-a_\nu ) \\ {}&= (-1)^\ell i^{m-\ell } \sum _J \prod _{\mu \in J} \left( 1+ \frac{1-2 s_\mu }{1-2 s_\mu + 2s_\mu ^2} +i \frac{-2 r_\mu s_\mu (-1+s_\mu )}{1-2 s_\mu + 2s_\mu ^2} \right) \\&\quad \cdot \prod _{\nu \in J^c} \left( 1- \frac{1-2 s_\nu }{1-2 s_\nu + 2s_\nu ^2} - i \frac{-2 r_\nu s_\nu (-1+s_\nu )}{1-2 s_\nu + 2s_\nu ^2} \right) \\ {}&= \frac{(-1)^\ell i^{m-\ell } }{ \prod _{j=1}^m \left( 1-2s_j+ 2s_j^2\right) } \sum _J \prod _{\mu \in J} 2\left( 1-2 s_\mu + s_\mu ^2 +i r_\mu s_\mu (1-s_\mu ) \right) \\&\quad \cdot \prod _{\nu \in J^c} \left( 2 s_\nu ^2 + 2 i r_\nu s_\nu (-1+s_\nu )\right) \\ {}&= \frac{2^m (-1)^\ell i^{m-\ell } }{ S_s} \sum _J \prod _{\mu \in J} (1-s_\mu )\left( 1-s_\mu + i r_\mu s_\mu \right) \prod _{\nu \in J^c} s_\nu \left( s_\nu + i r_\nu (s_\nu -1)\right) \end{aligned}$$

where

$$\begin{aligned} S_s:= \prod _{j=1}^m \left( 1-2s_j+ 2s_j^2\right) . \end{aligned}$$
(40)

Slightly rewriting it, we have

$$\begin{aligned} d_\ell = \frac{(-1)^\ell 2^m}{S_s} \sum _J \left( \prod _{\nu \in J^c} s_\nu \right) i^{m-\ell } \left( \prod _{\nu \in J^c} \big ( s_\nu + i r_\nu (s_\nu -1)\big ) \right) \left( \prod _{\mu \in J} (1-s_\mu )\left( 1-s_\mu + i r_\mu s_\mu \right) \right) \end{aligned}$$
(41)

and, in particular,

$$\begin{aligned} d_m=\frac{(-2)^m}{S_s} \prod _{j=1}^m (1-s_j)(1-s_j + i r_j s_j). \end{aligned}$$
(42)

We also use

$$\begin{aligned} S_u := 1 -u^2 + u^4, \qquad S:= S_s S_u. \end{aligned}$$

Observe that \(S_u\ge 3/4\) (for \(u\in [-1,1]\)) and \(1-2s_j+2s_j^2\ge 1/2\), \(j=1,\ldots ,m\) hence \(S\ge 3/2^{m+2}\).

We rewrite the polynomials \(U_\ell \) and \(V_\ell \)’s using substitutions (38) and (39). So we introduce

$$\begin{aligned}&R_{2\ell +1}(u,s_1,s_2,\ldots ,s_m,r_1,r_2,\ldots ,r_m) \\ {}&\qquad := S\; U_\ell \Big ( \frac{u^4-3 u^2+1}{u^4-u^2+1}, -\frac{2 u (u^2-1)}{u^4-u^2+1}, \frac{1-2 s_1}{1-2 s_1 + 2s_1^2}, - \frac{2 r_1 s_1 (-1+s_1)}{1-2 s_1 + 2s_1^2}, \\&\qquad \qquad \qquad \quad \ldots , \frac{1-2 s_m}{1-2 s_m + 2s_m^2}, - \frac{2 r_m s_m (-1+s_m)}{1-2 s_m + 2s_m^2}\Big ), \\&R_{2\ell +2}(u,s_1,s_2,\ldots ,s_m,r_1,r_2,\ldots ,r_m) \\ {}&\qquad := S\; V_\ell \Big ( \frac{u^4-3 u^2+1}{u^4-u^2+1}, -\frac{2 u (u^2-1)}{u^4-u^2+1}, \frac{1-2 s_1}{1-2 s_1 + 2s_1^2}, - \frac{2 r_1 s_1 (-1+s_1)}{1-2 s_1 + 2s_1^2}, \\&\qquad \qquad \qquad \quad \ldots , \frac{1-2 s_m}{1-2 s_m + 2s_m^2}, - \frac{2 r_m s_m (-1+s_m)}{1-2 s_m + 2s_m^2}\Big ) \end{aligned}$$

for \(\ell =0,1,\ldots ,m\). Note that the \(R_j\)’s are polynomials from \(\mathbb {R}[u,s_1,r_1,\ldots ,s_m,r_m]\): the factor S clears all the denominators introduced by the substitutions. For simplicity, we define

$$\begin{aligned} j_0 := {\left\{ \begin{array}{ll} 2m+1, &{} \text { if } m \text { is even},\\ 2m+2, &{} \text { if } m \text { is odd},\\ \end{array}\right. } \qquad j_1:= {\left\{ \begin{array}{ll} 2m+2, &{} \text { if } m \text { is even},\\ 2m+1, &{} \text { if } m \text { is odd}. \end{array}\right. } \end{aligned}$$

It is very important that the \(R_j\), \(j=1,2,\ldots ,2m\), have no constant terms (because of the first factor on the right of (41)):

$$\begin{aligned} R_j(0,\ldots ,0)=0, \quad j=1,2,\ldots ,2m. \end{aligned}$$

In fact, more is true: using the substitution

$$\begin{aligned} \mathbf {Z}: s_1=\ldots =s_m=0 \end{aligned}$$
(43)

we can also write that

$$\begin{aligned} R_j|_\mathbf {Z} = 0, \qquad j=1,2,\ldots , 2m. \end{aligned}$$
(44)

The last two \(R_j\)’s, namely \(R_{2m+1}\) and \(R_{2m+2}\) behave differently. The expressions (19) and (20) for \(U_m\) and \(V_m\) show that

  • if m is odd, then

    $$\begin{aligned} U_m&= \frac{(-2)^{m+1}}{S_s} \left( \delta _1(1+\ldots )+ \delta _2\cdot \ldots \right) , \quad \\ V_m&= \frac{2 (-2)^{m}}{S_s} \left( \delta _1\cdot \ldots + \delta _2(1+\ldots ) \right) , \end{aligned}$$
  • if m is even, then

    $$\begin{aligned} U_m&= \frac{2(-2)^{m}}{S_s} \left( \delta _1\cdot \ldots + \delta _2 (1+\ldots ) \right) ,\\ V_m&= \frac{2 (-2)^{m}}{S_s} \left( \delta _1(1+\ldots )+ \delta _2\cdot \ldots \right) \end{aligned}$$

where each \(\ldots \) stands for terms that are multiplied by some \(s_j\).

Taking into account the four displayed formulas above and the substitutions (38) and (39), we can write

$$\begin{aligned} R_{j_0}|_\mathbf {Z} =&\ (-1)^{m+1} 2^{m+2} u(u^2-1), \end{aligned}$$
(45)
$$\begin{aligned} R_{j_1}|_\mathbf {Z} =&\ 2^{m+1}(u^4-3u^2+1). \end{aligned}$$
(46)

Hence \(R_{j_0}\) has zero constant term and \(R_{j_1}\) has non-zero constant term. These observations will be crucial for the argument later.

The system (31) with conditions (32) is equivalent to the system

$$\begin{aligned} \left. \begin{aligned} S \, L_j = R_j, \quad j=1,2,\ldots ,2m, \\ S \, L_{j_0} = R_{j_0}, \\ S \, L_{j_1} = R_{j_1} \end{aligned} \right\} \end{aligned}$$
(47)

with the condition that \((u,s_1,\ldots ,s_m,r_1,\ldots ,r_m)\) are from the set

$$\begin{aligned} W := \big \{ (u,s_1,\ldots ,s_m,r_1,\ldots ,r_m) \in \mathbb {R}^{2m+1}:\ -1\le u \le 1, \nonumber \\ 0\le s_1,\ldots ,s_m\le 1,\ -1\le r_1,\ldots ,r_m\le 1 \big \}. \end{aligned}$$
(48)

We remark that W is compact and \(L_j\in \mathbb {R}[t_1,\ldots ,t_n]\) and \(R_j\in \mathbb {R}[u,s_1,r_1,\ldots ,s_m,r_m]\).

For simplicity, we introduce \(X_1=u\), \(X_{1+j}=s_j\), \(j=1,2,\ldots ,m\), \(X_{m+1+j}=r_j\), \(j=1,2,\ldots ,m\), and

$$\begin{aligned} \mathbf {x}= (u,s_1,\ldots ,s_m,r_1,\ldots ,r_m). \end{aligned}$$

3.2 Eliminating the Parametrizing Auxiliary Variables

Since there is an identity submatrix within the matrix A (coming from the coefficients of \(L_j\)’s; see the definition of A, (33), and that of \(L_k\)’s, (34) and (35)) there exists an invertible matrix \(B_2\in \mathbb {R}^{2n\times 2n}\) such that with

$$\begin{aligned} \begin{pmatrix} {\widetilde{L}}_1 \\ {\widetilde{L}}_2 \\ \vdots \\ {\widetilde{L}}_{2n} \end{pmatrix} := B_2 \begin{pmatrix} L_1 \\ L_2 \\ \vdots \\ L_{2n} \end{pmatrix} \end{aligned}$$
(49)

we have

$$\begin{aligned} {\widetilde{L}}_{j} = {\left\{ \begin{array}{ll} 0,&{} \text { if } j\in I_2, \\ t_k,&{} \text { if } j\in I_1,\ j=j(k). \end{array}\right. } \end{aligned}$$
(50)

This is a simple elimination on the left-hand sides (using row operations on A instead of column operations which we used in Sect. 2.7). We transform the right-hand sides accordingly, hence we introduce \({\widetilde{R}}_1,{\widetilde{R}}_2,\ldots ,{\widetilde{R}}_{2n}\) as

$$\begin{aligned} \begin{pmatrix} {\widetilde{R}}_1 \\ {\widetilde{R}}_2 \\ \vdots \\ {\widetilde{R}}_{2n} \end{pmatrix} := B_2 \begin{pmatrix} R_1 \\ R_2 \\ \vdots \\ R_{2n} \end{pmatrix}. \end{aligned}$$
(51)

The system (47) with condition (48) is equivalent to

$$\begin{aligned} S\, t_k =&\ {\widetilde{R}}_{j(k)}(\mathbf {x}), \quad k=1,2,\ldots ,n, \end{aligned}$$
(52)
$$\begin{aligned} 0 =&\ {\widetilde{R}}_j(\mathbf {x}), \quad j\in I_2 \end{aligned}$$
(53)

with conditions (48), that is, when \(\mathbf {x}=(u,s_1,\ldots ,s_m,r_1,\ldots ,r_m) \in W\).

Introduce

$$\begin{aligned} {\widetilde{W}}_1 :=&\big \{ (\mathbf {x},t)\in \mathbb {R}^{3n-1}:\ \mathbf {x}\in W, \ St_k ={\widetilde{R}}_{j(k)}(\mathbf {x}), \ k=1,2,\ldots ,n \big \}, \\ {\widetilde{W}}_2 :=&\big \{ (\mathbf {x},t)\in \mathbb {R}^{3n-1}:\ \mathbf {x}\in W, \ 0 ={\widetilde{R}}_{j}(\mathbf {x}), \ j\in I_2 \big \}, \\ {\widetilde{W}}:=&\ {\widetilde{W}}_1\cap {\widetilde{W}}_2, \\ W_2 :=&\big \{ \mathbf {x}\in W:\ (\mathbf {x},t) \in {\widetilde{W}}_2 \text { for some } t\in \mathbb {R}^n \big \}. \end{aligned}$$

Using (52), we can express the \(t_k\)’s in terms of \(u,s_1,\ldots ,s_m,r_1,\ldots ,r_m\) as follows

$$\begin{aligned} t_k = \frac{{\widetilde{R}}_{j(k)}(\mathbf {x})}{S}, \qquad k=1,\ldots ,n. \end{aligned}$$
(54)

Therefore, if \(\mathbf {x}\in W_2\), then there is a unique \(t\in \mathbb {R}^n\) such that \((\mathbf {x},t)\in {\widetilde{W}}\), because of the following. Note that the \({\widetilde{R}}_j(\mathbf {x})\) are polynomials independent of \(t_1,\ldots ,t_n\). Furthermore, \(1-u^2+u^4 \ge 3/4\) when \(u\in [-1,1]\), and \(S\ge 3/2^{m+2}\) also holds, so (54) determines t uniquely. Therefore, for all \(\mathbf {x}\in W\), \({\widetilde{R}}_j(\mathbf {x})/S(\mathbf {x})\) is continuous and bounded.

This implies that \({\widetilde{W}}\) is compact.

Hence, the t’s coming from a solution are bounded, i.e. there is \(M_0>0\) such that for all \(t\in \mathbb {R}^n\) such that \((\mathbf {x},t)\in {\widetilde{W}}\), we have \(|t_j| \le M_0\), \(j=1,2,\ldots ,n\).

4 Application of the Positivstellensatz

We are going to use a form of the Positivstellensatz which can be found in the book of Prestel and Delzell [16]. Briefly, it concerns sums of even powers and compact semialgebraic sets.

First, we introduce our new notation. Then we apply the Positivstellensatz to find a solution. This indirect argument features a step-by-step simplification of the representation (61) provided by the Positivstellensatz. As the first step of the simplification, we apply a substitution (63) which turns the representation into a univariate identity (68). Then a careful comparison of the leading terms and degrees leads to an even more simplified identity (69). Finally, exploiting the special structure of the equation (comparing (72) and (71)) leads to a contradiction. We remark that in this section we do not use the polynomials \(c_0,c_1,\ldots ,c_m\) from Sect. 2.3 (we use \(c_0,c_1\) as new symbols).

As the next step, we set

$$\begin{aligned} N:=8. \end{aligned}$$

Recall that \(\mathbf {x}=(u,s_1,\ldots ,s_m,r_1,\ldots ,r_m)\) and for unifying the notation, we introduce the following:

$$\begin{aligned} h_1(\mathbf {x})&:= (1+X_1^2)(1-X_1^6) =(-1) u^{8}-u^6 +u^2+1, \end{aligned}$$
(55)
$$\begin{aligned} h_{j+1}(\mathbf {x})&:=X_{j+1}^{N-1}(1-X_{j+1}) =(-1) s_j^{N} + s_j^{N-1}, \quad j=1,\ldots ,m, \end{aligned}$$
(56)
$$\begin{aligned} h_{1+m+j}(\mathbf {x})&:= 1-X_{1+m+j}^{N}=(-1) r_j^{N} +1, \qquad j=1,\ldots ,m, \end{aligned}$$
(57)
$$\begin{aligned} h_{2m+2}(t)&:= M_0-\big ( t_1^{N} + \ldots + t_n^{N}\big ). \end{aligned}$$
(58)

Here, enlarging \(M_0\) if necessary, we may assume that \(t_1^{N}+\ldots +t_n^{N}\le M_0\) whenever \((\mathbf {x},t)\in {\widetilde{W}}\). Put \(N_1:=2m+2\).

Note that \(h_1(\mathbf {x})\ge 0\) if and only if \(u\in [-1,1]\), \(h_{1+j}(\mathbf {x})\ge 0\) if and only if \(s_j\in [0,1]\), and \(h_{1+m+j}(\mathbf {x})\ge 0\) if and only if \(r_j\in [-1,1]\). Hence, the set

$$\begin{aligned} {\widehat{W}}:= \big \{ (\mathbf {x},t)\in \mathbb {R}^{3n-1}: \ h_j(\mathbf {x},t)\ge 0, j=1,\ldots ,N_1 \big \} \end{aligned}$$
(59)

is a compact, semialgebraic set.
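The sign characterizations above are easy to test by sampling; a sketch (Python/NumPy assumed):

```python
import numpy as np

x = np.linspace(-2, 2, 2001)
h1 = (1 + x**2) * (1 - x**6)          # h_1 with u = x
hs = x**7 * (1 - x)                   # h_{1+j} with s_j = x  (N = 8)
hr = 1 - x**8                         # h_{1+m+j} with r_j = x
assert np.array_equal(h1 >= 0, np.abs(x) <= 1)
assert np.array_equal(hs >= 0, (x >= 0) & (x <= 1))
assert np.array_equal(hr >= 0, np.abs(x) <= 1)
```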

Introduce

$$\begin{aligned} f(\mathbf {x},t):= \sum _{j=1}^{2m} \left( SL_j-R_j\right) ^2 + \Big ( S L_{j_0} - R_{j_0} \Big )^2 + \Big ( S L_{j_1} - R_{j_1} \Big )^2. \end{aligned}$$
(60)

Obviously \(f\ge 0\), and \(f(\mathbf {x},t)=0\) at some \((\mathbf {x},t)\in {\widehat{W}}\) if and only if \((\mathbf {x},t)\) is a solution of (47).

As the next step, we apply a form of the Positivstellensatz, more precisely [16, Thm. 7.3.11, p. 174]. We verify its conditions now. The highest homogeneous parts of \(h_1,\ldots ,h_{N_1}\) are \(-u^8, -s_1^8,\ldots ,-s_m^8, -r_1^8,\ldots ,-r_m^8, -(t_1^8+ \cdots +t_n^8)\), respectively, and it is easy to see that at least one of them is negative at every \((\mathbf {x},t)\in \mathbb {R}^{3n-1}{\setminus }\{(0,\ldots ,0)\}\), i.e. condition (7.3.11.1) is satisfied. Of course, they all have the same degree, N, and the set (59) is compact. The theorem states that if \(f>0\) on \({\widehat{W}}\), then f is in the (Archimedean) module of level N generated by \(h_1,h_2,\ldots ,h_{N_1}\), i.e.

$$\begin{aligned} f(\mathbf {x},t) = \sigma _0 + \sigma _1 h_1 +\ldots + \sigma _{N_1} h_{N_1} \end{aligned}$$
(61)

where \(\sigma _0,\sigma _1,\ldots ,\sigma _{N_1}\) are sums of N-th powers, i.e. they are from

$$\begin{aligned} \sum \nolimits ^{N}[\mathbf {x},t]:= \left\{ \sum _{j=1}^k P_j^N\ :\ P_1,\ldots ,P_k \in \mathbb {R}[\mathbf {x},t] \right\} . \end{aligned}$$
(62)

Assume, indirectly, that \(f>0\) on \({\widehat{W}}\); this implies that (61) holds.

As the next step, we apply a substitution to (61) in order to simplify it. The expression on the right of (61) after the substitution \(\mathbf {Y}\), where

$$\begin{aligned} \mathbf {Y}:\ s_1=\ldots =s_m=0,\ r_1=\ldots =r_m=0,\ t_1=\ldots =t_n=0 \end{aligned}$$
(63)

has the following structure. Obviously,

$$\begin{aligned} h_1|_\mathbf {Y}=&\ h_1 = (u^2+1) (1-u^6), \\ h_{1+j}|_\mathbf {Y}=&\ 0, \qquad j=1,2,\ldots ,m, \\ h_{1+m+j}|_\mathbf {Y}=&\ 1, \qquad j=1,2,\ldots ,m, \\ h_{2+2m}|_\mathbf {Y}=&\ M_0 \text { (const).} \end{aligned}$$

Also, if \(\sigma \in \sum ^N[\mathbf {x},t]\), then \(\sigma |_\mathbf {Y}\in \sum ^N[u]\). Therefore the right-hand side will have this form:

$$\begin{aligned} \sigma _0 +\sum _{j=1}^{N_1} \sigma _j h_j \big |_\mathbf {Y} = \sigma _0 + \sigma _1\; (u^2+1)(1-u^6) \end{aligned}$$
(64)

where, slightly abusing the notation, we write \(\sigma |_\mathbf {Y}=\sigma \); the terms \(\sigma _{1+m+j}\cdot 1\) and \(\sigma _{2m+2}\cdot M_0\) are sums of N-th powers again, so they are absorbed into \(\sigma _0\) (note that \(c\,P^N=(c^{1/N}P)^N\) for \(c>0\)).

The substitution greatly simplifies the left-hand side of (61):

$$\begin{aligned} f_1(u):=f|_\mathbf {Y}&= \big ( (S L_{j_0} - R_{j_0}) |_\mathbf {Y} \big )^2 + \big ( (S L_{j_1} - R_{j_1}) |_\mathbf {Y} \big )^2 \nonumber \\&= \big ( 2^{m+2} u(u^2-1) \big )^2 + \big (2^{m+1}(u^4-3u^2+1) \big )^2 \end{aligned}$$
(65)

where we used (45), (46) and also (44).

As the next step, we rewrite \(u^2+1\) with a result of Berr and Wörmann. Obviously, \(u^2+1\) is strictly positive on \([-1,1]\), so it can be written as

$$\begin{aligned} u^2 +1 = \tau _0 + \tau _1 (1-u^6) +\cdots + \tau _7 (1-u^6)^7 \end{aligned}$$
(66)

where \(\tau _0,\tau _1,\ldots ,\tau _7\in \sum \nolimits ^N[u]\) (recall \(N=8\)). This representation follows from [2, Ex. 4.5, p. 834]. It would be interesting to establish this expansion directly.

Using (66), we rewrite the right-hand side of (64):

$$\begin{aligned}&\sigma _0 + \sigma _1 (u^2+1) (1-u^6) \nonumber \\ {}&\qquad = \sigma _0 + (\sigma _1 \tau _0) (1-u^6) +\cdots + (\sigma _1 \tau _6) (1-u^6)^7 + (\sigma _1 \tau _7) (1-u^6)^8 \nonumber \\ {}&\qquad = \bigg (\sigma _0 + (\sigma _1 \tau _7) (1-u^6)^8 \bigg ) + (\sigma _1 \tau _0) (1-u^6) +\cdots + (\sigma _1 \tau _6) (1-u^6)^7 \nonumber \\ {}&\qquad = {\widetilde{\sigma }}_0 + {\widetilde{\sigma }}_1 (1-u^6) +\cdots + {\widetilde{\sigma }}_7 (1-u^6)^7 \end{aligned}$$
(67)

where

$$\begin{aligned} {\widetilde{\sigma }}_0&:=\sigma _0 + (\sigma _1 \tau _7) (1-u^6)^8 \in \sum \nolimits ^N[u],\\ {\widetilde{\sigma }}_1&:=\sigma _1 \tau _0\in \sum \nolimits ^N[u],\\&\vdots \\ {\widetilde{\sigma }}_7&:=\sigma _1 \tau _6\in \sum \nolimits ^N[u]. \end{aligned}$$

As the next step, we collect the results of substitution and simplification. Using (61), (66) and (67), the right-hand side is relatively simple, while regarding the left-hand side, we use (65) and we write

$$\begin{aligned} f_1(u) = {\widetilde{\sigma }}_0 + {\widetilde{\sigma }}_1 (1-u^6) +\cdots + {\widetilde{\sigma }}_7 (1-u^6)^7. \end{aligned}$$
(68)

To compare the two sides, we need the powers of \(1-u^6\):

$$\begin{aligned} (1-u^6)^1&= (-1) u^6 +1, \\ (1-u^6)^2&= u^{12}-2u^6 +1 = u^4 u^N -2 u^6 +1, \\ (1-u^6)^3&= (-1) u^{18} + 3 u^{12} -3 u^6 +1 \\ {}&= (-1) u^2\, (u^2)^N + 3 u^4\, u^N - 3 u^6 +1, \\ (1-u^6)^4&= u^{24} - 4 u^{18} +6 u^{12} -4u^{6} +1 \\&= (u^3)^N - 4 u^2\, (u^2)^N+ 6 u^4\, u^N - 4 u^6 +1, \\ (1-u^6)^5&= (-1) u^{30} +5 u^{24} -10 u^{18} +10 u^{12} - 5 u^6 +1 \\&= (-1) u^6\, (u^3)^N + 5 (u^3)^N -10 u^2\, (u^2)^N +10 u^4\, u^N -5 u^6 +1, \\ (1-u^6)^6&= u^{36} -6 u^{30} +15 u^{24} -20 u^{18} + 15 u^{12} - 6 u^6 +1 \\&= u^4\, (u^4)^N -6 u^6\, (u^3)^N +15 (u^3)^N \\ {}&\quad -20 u^2 \, (u^2)^N +15 u^4\, u^N -6u^6 +1, \\ (1-u^6)^7&= (-1) u^{42} +7 u^{36} -21 u^{30} +35 u^{24} -35 u^{18} +21 u^{12} -7 u^6 +1 \\&= (-1) u^2\, (u^5)^N +7 u^4\, (u^4)^N -21 u^6\, (u^3)^N \\ {}&\quad +35 (u^3)^N -35 u^2\, (u^2)^N +21 u^4\, u^N -7 u^6 +1. \end{aligned}$$
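These rewritten forms (each monomial of degree above 8 expressed through an N-th power, N = 8) can be machine-checked; a SymPy sketch:

```python
import sympy as sp

u = sp.symbols('u')
N = 8
rewritten = {
    2: u**4 * u**N - 2*u**6 + 1,
    3: -u**2 * (u**2)**N + 3*u**4 * u**N - 3*u**6 + 1,
    4: (u**3)**N - 4*u**2 * (u**2)**N + 6*u**4 * u**N - 4*u**6 + 1,
    5: -u**6 * (u**3)**N + 5*(u**3)**N - 10*u**2 * (u**2)**N
       + 10*u**4 * u**N - 5*u**6 + 1,
    6: u**4 * (u**4)**N - 6*u**6 * (u**3)**N + 15*(u**3)**N
       - 20*u**2 * (u**2)**N + 15*u**4 * u**N - 6*u**6 + 1,
    7: -u**2 * (u**5)**N + 7*u**4 * (u**4)**N - 21*u**6 * (u**3)**N
       + 35*(u**3)**N - 35*u**2 * (u**2)**N + 21*u**4 * u**N - 7*u**6 + 1,
}
for j, expr in rewritten.items():
    assert sp.expand((1 - u**6)**j - expr) == 0
```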

We investigate the degrees and the leading terms in (68). On the left-hand side, \(\deg f_1=N=8\). The right-hand side is more involved. We remark that \(\deg {\widetilde{\sigma }}_j=N k_j\) for some \(k_j\in \mathbb {N}\) and the leading coefficient \(\mathrm {lc}({\widetilde{\sigma }}_j)\) of \({\widetilde{\sigma }}_j\) is positive. We investigate the degrees of \({\widetilde{\sigma }}_j (1-u^6)^j\), \(j=0,1,\ldots ,7\), modulo N. Consider the following groups of powers: \(\{0,4\}\), \(\{1,5\}\), \(\{2,6\}\) and \(\{3,7\}\). In each group the degrees are the same mod N and the signs of the leading coefficients agree; e.g. if we take \(\{1,5\}\), then \(\deg {\widetilde{\sigma }}_1 (1-u^6)^1= N k_1 + 6\), \(\deg {\widetilde{\sigma }}_5 (1-u^6)^5= N k_5 + 30= N (k_5 + 3) + 6\) and the signs of the leading coefficients are the same: \(\mathrm {sign}\,\mathrm {lc}\big ( {\widetilde{\sigma }}_1 (1-u^6)^1\big ) =\mathrm {sign}\,\mathrm {lc}\big ( {\widetilde{\sigma }}_5 (1-u^6)^5\big )=-1\). Since the signs are the same, the leading terms within a group cannot cancel. Also, since the degrees of \({\widetilde{\sigma }}_j (1-u^6)^j\) are different modulo \(N=8\) across the groups, the leading terms of different groups cannot cancel either. Hence

$$\begin{aligned} \deg ( f_1 ) = \max \bigg ( \deg {\widetilde{\sigma }}_0, \deg {\widetilde{\sigma }}_1 (1-u^6), \deg {\widetilde{\sigma }}_2 (1-u^6)^2, \ldots , \deg {\widetilde{\sigma }}_7 (1-u^6)^7 \bigg ). \end{aligned}$$

Taking into account that \(\deg (f_1)=8\), this can happen only when

$$\begin{aligned} {\widetilde{\sigma }}_2=\ldots ={\widetilde{\sigma }}_7=0 \qquad \text{ and }\qquad \deg {\widetilde{\sigma }}_1=0, \end{aligned}$$

i.e. \({\widetilde{\sigma }}_1\) is a constant.

Finally, we obtain that

$$\begin{aligned} f_1 = {\widetilde{\sigma }}_0 +c_1 (1-u^6) \end{aligned}$$
(69)

where \(c_1\ge 0\).

So we have

$$\begin{aligned} f_1(u) = \sum _{j=1}^{N_2} A_j^{N}(u) +c_1 (1-u^6) \end{aligned}$$
(70)

where \(N_2\) is a positive integer and \(A_j(u)\) are polynomials. Again, degree considerations show that the \(A_j\)’s must be constant or linear polynomials, so we can write

$$\begin{aligned} f_1(u)= c_0 + \sum _{j=1}^{N_3} \lambda _j (u-\zeta _j)^N +c_1 (1-u^6) \end{aligned}$$
(71)

where \(\lambda _j>0\), \(\zeta _j\in \mathbb {R}\) are pairwise different and \(c_0\in \mathbb {R}\) and \(N_3\le N_2\).

We finally reach a contradiction by showing that (71) cannot hold. Note that

$$\begin{aligned} f_1(u)&= 4^{m+1}\bigg (4 u^2 (u-1)^2 (u+1)^2 +(u^4-3u^2+1)^2 \bigg ) \nonumber \\&= 4^{m+1}\bigg ( u^8 -2u^6 +3u^4 -2u^2 +1 \bigg ) \end{aligned}$$
(72)

hence \(f_1(\cdot )\) is even, \(f_1(-u)=f_1(u)\). Exploiting this (and that N is even), we write

$$\begin{aligned} f_1(u)=\frac{f_1(u)+f_1(-u)}{2} = c_0+\sum _{j=1}^{N_3} \lambda _j \frac{ (u-\zeta _j)^N +(u+\zeta _j)^N }{2} +c_1 (1-u^6).\nonumber \\ \end{aligned}$$
(73)

Observe that all coefficients of \(\left( (u-\zeta _j)^N +(u+\zeta _j)^N\right) /2\) are non-negative (e.g. the coefficient of \(u^2\) is \(28\zeta _j^6\)). Therefore the coefficient of \(u^2\) on the left-hand side of (73) is \(4^{m+1} (-2)<0\) (see (72)), while on the right-hand side of (73) it is \(\sum _{j=1}^{N_3} \lambda _j \cdot 28\zeta _j^6 \ge 0\). This gives a contradiction.
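Both ingredients of this contradiction are elementary to verify; a SymPy sketch (with m fixed to 3 for concreteness):

```python
import sympy as sp

u, zeta = sp.symbols('u zeta', real=True)
m = 3
f1 = sp.expand((2**(m+2)*u*(u**2 - 1))**2 + (2**(m+1)*(u**4 - 3*u**2 + 1))**2)
target = sp.expand(4**(m+1)*(u**8 - 2*u**6 + 3*u**4 - 2*u**2 + 1))
assert f1 == target                          # this is (72)

sym = sp.expand(((u - zeta)**8 + (u + zeta)**8) / 2)
assert sym.coeff(u, 2) == 28*zeta**6         # coefficient of u^2
```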

Therefore, we have \(f\not >0\). Obviously \(f\ge 0\). These two imply that f must have a zero on \({\widehat{W}}\): there is \((\mathbf {x},t)\in {\widehat{W}}\) with \(f(\mathbf {x},t)=0\), so the system (52) and (53) with \(\mathbf {x}\in W\) has a solution. In turn, this implies that \({\widetilde{W}} \ne \emptyset \), hence (47) has a solution from W (see (48)), which implies that (31) has a solution under the conditions (32); that is, (29) holds, which is equivalent to the assertion of Theorem 1. Therefore the proof is complete.

5 Some Technical Lemmas

Lemma 1

Let \(W\in \mathbb {C}\), \(W\ne 0\), \(\gamma \in \mathbb {C}\), \(|\gamma |=1\). Then there exists \(f\in \mathbb {C}\), \(|f|=1\) such that

$$\begin{aligned} i\frac{{\overline{W}} +\gamma W}{{\overline{W}} -\gamma W} = \frac{f\left( {\overline{W}} +\gamma W\right) }{\frac{f}{i} \left( {\overline{W}} -\gamma W\right) } \end{aligned}$$

and the numerator and denominator on the right are real, i.e.

$$\begin{aligned} {\text {Im}}\left( f\left( {\overline{W}} +\gamma W\right) \right) =0, \quad {\text {Im}}\left( \frac{f}{i} \left( {\overline{W}} -\gamma W\right) \right) =0. \end{aligned}$$

In particular, \(f=1/\sqrt{\gamma }\) will do. Furthermore, a similar identity also holds: there exists \(f\in \mathbb {C}\), \(|f|=1\) such that

$$\begin{aligned} i\frac{{\overline{W}} -\gamma W}{{\overline{W}} +\gamma W} = \frac{f\left( {\overline{W}} -\gamma W\right) }{\frac{f}{i} \left( {\overline{W}} +\gamma W\right) } \end{aligned}$$

and again, the numerator and the denominator on the right are real. In this case, \(f=i/\sqrt{\gamma }\) is a good choice.

Proof

We use a separate set of notation in this proof.

To see the first assertion, write W, \(\gamma \) and f in polar form: \(W=re^{i\omega }\), where \(r>0\), \(\gamma =e^{i\alpha }\) and \(f=e^{i\varphi }\). Then

$$\begin{aligned} {\text {Im}}\left( f\left( {\overline{W}} +\gamma W\right) \right)&= r {\text {Im}}\left( e^{i(\varphi -\omega )}+ e^{i(\varphi +\omega +\alpha )}\right) \\ {}&= r\left( \sin (\varphi -\omega ) + \sin (\varphi +\omega +\alpha )\right) \\&= 2r \sin \left( \varphi +\frac{\alpha }{2}\right) \cos \left( \omega +\frac{\alpha }{2}\right) , \end{aligned}$$

and similarly for the denominator,

$$\begin{aligned} {\text {Im}}\left( \frac{f}{i} \left( {\overline{W}} -\gamma W\right) \right)&= r {\text {Im}}\left( e^{i(-\pi /2 +\varphi -\omega )} - e^{i(-\pi /2+\varphi +\omega +\alpha )} \right) \\&= r \left( \sin \left( -\frac{\pi }{2} + \varphi - \omega \right) - \sin \left( -\frac{\pi }{2} +\varphi +\omega + \alpha \right) \right) \\ {}&= -2r\sin (\omega + \frac{\alpha }{2}) \cos \left( -\frac{\pi }{2} +\varphi +\frac{\alpha }{2}\right) \\&= -2r \sin (\omega + \frac{\alpha }{2}) \sin \left( \varphi +\frac{\alpha }{2}\right) . \end{aligned}$$

So we have to find \(\varphi \) for the given \(\alpha \) and \(\omega \) such that

$$\begin{aligned} \sin \left( \varphi +\frac{\alpha }{2}\right) \cos \left( \omega +\frac{\alpha }{2}\right) =0 \quad \text {and}\quad \sin \left( \omega + \frac{\alpha }{2}\right) \sin \left( \varphi +\frac{\alpha }{2}\right) =0. \end{aligned}$$

So \(\varphi =-\alpha /2\) will do.

For the second assertion, a similar argument yields that \(\varphi =(\pi -\alpha )/2\) will do.

\(\square \)
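A numerical sketch of Lemma 1 (Python/NumPy assumed), with the choice \(f=1/\sqrt{\gamma }\) for the first identity:

```python
import numpy as np

rng = np.random.default_rng(2)
for _ in range(5):
    W = rng.normal() + 1j * rng.normal()           # W != 0 almost surely
    alpha = rng.uniform(0, 2 * np.pi)
    gamma = np.exp(1j * alpha)                     # |gamma| = 1
    f = np.exp(-1j * alpha / 2)                    # f = 1/sqrt(gamma)
    num = f * (np.conj(W) + gamma * W)
    den = (f / 1j) * (np.conj(W) - gamma * W)
    assert abs(num.imag) < 1e-12 and abs(den.imag) < 1e-12
```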

Lemma 2

Let PQ be complex polynomials without common zeros. Assume that \(H(u)=P(u)/Q(u)\) is a real rational function, i.e. if \(u\in \mathbb {R}\) and H(u) is finite, then \(H(u)\in \mathbb {R}\). Also assume that the leading coefficients of P and Q are real. Then all the coefficients of P and Q are real.

Proof

We prove it by induction as follows. Write \(P(u)= a u^n + P_1(u)\) and \(Q(u)= b u^m + Q_1(u)\) where \(\deg (P_1)<\deg (P)\) and \(\deg (Q_1)<\deg (Q)\). By the assumptions, \(a,b\in \mathbb {R}\), \(a\ne 0\), \(b\ne 0\).

If \(n\ge m>0\), we can write

$$\begin{aligned} H(u)=\frac{P(u)}{Q(u)}= \frac{a u^n +P_1(u)}{b u^m + Q_1(u)} = \frac{a}{b}u^{n-m} + \frac{P(u)-\frac{a}{b}u^{n-m} Q(u)}{Q(u)} \end{aligned}$$

which implies that with \(P_2(u):=P(u)- \frac{a}{b} u^{n-m} Q(u)\), \(P_2(u)/Q(u)\) is a real rational function. Note that \(n_1:=\deg (P_2)<\deg (P)=n\). Denote the leading coefficient of \(P_2\) by c, \(c\in \mathbb {C}\), \(c\ne 0\). It is standard to see that

$$\begin{aligned} \lim _{u\rightarrow \infty } \frac{u^m}{u^{n_1}}\frac{P_2(u)}{Q(u)} = \frac{c}{b} \end{aligned}$$

where the left-hand side is real and b on the right is also real. Hence the leading coefficient c of \(P_2\) is real.

If \(m> n>0\), then consider \(1/H(u)=Q(u)/P(u)\) which is again a real rational function.

If \(m=0\), then \(H(u)=P(u)/Q(u)\) is actually a polynomial. Also, \(Q(u)=Q(0)\in \mathbb {R}\), hence H(u) is real-valued on \(\mathbb {R}\), and it is standard that a polynomial which is real-valued on \(\mathbb {R}\) must have real coefficients.

Finally, if \(n=0\), then consider 1/H(u) and this way we reduce this case to the case discussed in the previous paragraph.

\(\square \)

We also need the following lemma ([17, Lem. 2]).

Lemma 3

For any sequence \((B_n)\) of Blaschke products of degree m there exist a Blaschke product B of degree \(k\le m\) and a subsequence of \((B_n)\) which converges to B locally uniformly on a set which contains all points of \( \mathbb {D}\cup \mathbb {T}\) with the possible exception of at most \(m-k\) boundary points.