1 Introduction

Let \(\omega =\{\omega _k\}_{k \in \mathbb {N}}\) be a positive sequence. The weighted Hardy space with weight \(\omega \) is the space \(H^2_\omega \) of holomorphic functions f over the unit disk \({\mathbb {D}}\) of the complex plane, given by \(f(z) = \sum _{k=0}^{\infty } a_k z^k\) with

$$\begin{aligned} \left\Vert {f}\right\Vert ^2_\omega := \sum _{k=0}^{\infty } \left| a_k\right| ^2 \omega _k < \infty . \end{aligned}$$
(1)

A special subclass is formed by the spaces with weights \(\omega _k = (k + 1)^\alpha \), for some \(\alpha \in \mathbb {R}\), usually referred to as Dirichlet-type spaces. The values \(\alpha =-1,0,1\) receive their own names and symbols: the Bergman space \(A^2\), the Hardy space \(H^2\) and the Dirichlet space \({\mathcal {D}}\). We refer the reader to the monographs [8, 11, 12] respectively for the basics about these spaces.

The shift operator S is the operator defined on \(H^2_\omega \) by \(Sf(z)=zf(z)\) and we say that a (closed) subspace \(M \subset H^2_\omega \) is invariant if \(SM\subset M\). In the theory of invariant subspaces, to which all the above references dedicate a significant effort, a natural question is that of identifying cyclic functions, that is, functions that are elements of a space but are contained in no proper invariant subspace. If a function has zeros inside the disk, it is simple to disprove its cyclicity, while if it extends analytically beyond the closed disk and has no zeros there, it is automatically cyclic. Here, we focus on the critical functions, in the sense of having some zeros on the boundary but none inside the disk.

In the last decade, an approach to the study of cyclic functions has been developed [2] based on the observation that the function 1 is always cyclic, and therefore a function \(f \in H^2_\omega \) is cyclic if and only if there exists a sequence of polynomials such that

$$\begin{aligned}\Vert 1-p_nf\Vert _\omega \rightarrow 0, \text { as } n \rightarrow \infty ,\end{aligned}$$

thus motivating the study of the following family of polynomials. Denote by \({\mathcal {P}}_n\) the space of polynomials of degree at most n.

Definition 1.1

Let \(f \in H^2_\omega \), and \(n \in \mathbb {N}\). We say that \(p_n\) is the optimal polynomial approximant (or o.p.a.) to 1/f of degree less than or equal to n if \(\Vert 1-p_nf\Vert _\omega \le \Vert 1-qf\Vert _\omega \) for any \(q \in {\mathcal {P}}_n\).

If f is not identically null, the existence and uniqueness of o.p.a. follow from the fact that \(1-p_n f\) is the vector joining 1 with its orthogonal projection onto the finite-dimensional subspace \({\mathcal {P}}_n f\). Already in [2], it is hinted that the distribution of zeros of o.p.a., or of \(\{1-p_nf\}_{n\in \mathbb {N}}\) where \(p_n\) are the o.p.a. to 1/f, may hide relevant information about the cyclicity of the function f. In the present article, our goal is to understand this distribution of the zeros of \(1-p_nf\) for large values of n and simple functions f. In [3], functions that have no zeros inside the domain \(\mathbb {D}\) but with zeros on its boundary were studied. The authors there showed that for such functions, the zeros of o.p.a. can only accumulate at points of the boundary \(\mathbb {T}\) and that indeed every point of the circle is such an accumulation point. Here we will complete that information, showing that if \(f \in {\mathcal {P}}_d\) is critical (in the sense mentioned before), then (a subsequence of) its o.p.a. \(\{p_n\}_{n\in \mathbb {N}}\) have the property that the zeros of \(1-p_nf\) asymptotically equidistribute over the unit circle, in the following (weak) sense:

Denote by Z(q) the zero set of a polynomial q of degree \(d \in \mathbb {N}\), and going forward we denote by \(\mu _q\) the measure

$$\begin{aligned}\mu _q:= \frac{1}{d} \sum _{z_j \in Z(q)} \delta _{z_j}.\end{aligned}$$

Finally, denote by \(\nu _E\) the uniform measure over E. We say that the zeros of the family of polynomials \(\{q_n\}_{n\in \mathbb {N}}\) are asymptotically equidistributed over E if \(\mu _{q_n}\) converges weakly to \(\nu _E\).
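Since \(p_nf\) is the orthogonal projection of 1 onto \({\mathcal {P}}_n f\), the o.p.a. can be computed numerically as a weighted least-squares problem in coefficient space. The following sketch is only an illustration of the definition (the helper name `opa` and the use of numpy are our own choices, not part of the original development): it builds the matrix whose columns hold the coefficients of \(z^j f(z)\) and fits the constant function 1 in the \(\omega \)-norm.

```python
import numpy as np

def opa(f, omega, n):
    """Degree-n optimal polynomial approximant to 1/f in H^2_omega.

    f     : coefficients of the polynomial f, in ascending powers of z.
    omega : weight sequence (at least n + deg(f) + 1 terms).
    Minimizes ||1 - p f||_omega over polynomials p of degree <= n by a
    weighted least-squares fit, i.e. the orthogonal projection above.
    """
    d = len(f) - 1
    m = n + d + 1                        # 1 - p f has degree at most n + d
    G = np.zeros((m, n + 1), dtype=complex)
    for j in range(n + 1):
        G[j:j + d + 1, j] = f            # column j: coefficients of z^j f(z)
    w = np.sqrt(np.asarray(omega[:m], dtype=float))
    e0 = np.zeros(m, dtype=complex)
    e0[0] = 1.0                          # coefficients of the constant 1
    # scaling rows by sqrt(omega_k) turns the omega-norm into the 2-norm
    coef, *_ = np.linalg.lstsq(G * w[:, None], e0 * w, rcond=None)
    return coef
```

For instance, in \(H^2\) (\(\omega _k \equiv 1\)) with \(f(z) = 1-z\), solving the normal equations by hand gives \(p_1(z) = \frac{2}{3} + \frac{1}{3}z\), which the sketch reproduces.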

Throughout the present text, we will assume a few properties for our weights:

Definition 1.2

We say that \(\omega = \{\omega _k\}_{k\in {\mathbb {N}}}\subset {\mathbb {R}}^+\) is a weight whenever it is a monotone sequence, normalized to have \(\omega _0 = 1\), satisfying

$$\begin{aligned} \lim _{n \rightarrow \infty } \frac{\omega _n}{ \omega _{n - \sqrt{n}}} = 1, \end{aligned}$$
(2)

as well as

$$\begin{aligned} \sum _{k=1}^\infty \frac{1}{\omega _k} = + \infty . \end{aligned}$$
(3)

It is relevant to notice that the Dirichlet-type spaces (with \(\alpha \le 1\), so that (3) holds) do meet our assumptions. The first condition guarantees that ratios of weights converge sufficiently fast to 1. Moreover, one can check that this condition implies the more natural

$$\begin{aligned} \lim _{k\rightarrow \infty } \frac{\omega _k}{\omega _{k+1}} = 1. \end{aligned}$$
(4)

In particular, both the shift and its left-inverse are bounded operators in all \(H^2_\omega \) spaces and functions in \(H^2_\omega \) are naturally associated to the unit disk as their domain of analyticity. The condition (3) is shown in [10] to distinguish between the cases in which simple critical functions like \(f(z)=1-z\) are cyclic or not (we choose the case in which they are, when cyclicity is a richer phenomenon). We do need some doubling conditions like monotonicity or (2) for technical reasons, in order to avoid spaces with pathological multiplicative behavior. In addition, notice that monotone sequences meeting (2) and (3) can’t grow too fast: indeed, we must have that for each \(\varepsilon >0\) there exists some \(C_\varepsilon \) such that

$$\begin{aligned} \sup _{k \in [0,n]} \frac{\omega _n}{\omega _k} \le C_\varepsilon n^{1+\varepsilon }. \end{aligned}$$
(5)

Our main result is the following:

Theorem 1.3

Let f be a critical polynomial with simple zeros, and let \(p_n\) be the n-th o.p.a. to 1/f in \(H^2_\omega \) for a weight \(\omega \). Then, for some subsequence \(\{n_k\}_{k\in \mathbb {N}} \subset \mathbb {N}\), the zeros of \(\{1-p_{n_k}f\}_{k\in \mathbb {N}}\) are asymptotically equidistributed on \(\mathbb {T}\).

Our reasoning will make strong use of the summary of techniques for (weak) asymptotic equidistribution in [14]. According to E. A. Rakhmanov [13], the development of the programme to study a cyclic function through its o.p.a. will require understanding the strong asymptotics of the zeros, but we consider our contribution an initial step in that direction. To establish this theorem, we will expand upon the work in [4], where the boundary behaviour of o.p.a. was studied for f, a polynomial of degree d with simple roots. Some results were proved there in full generality, and some others exclusively for the Hardy and Bergman spaces. Here we will need to extend some of those results to general spaces. In particular, we will prove results which generalize Theorems 1.7 and 1.8 in [4] from the setting of the Hardy space \(H^2\) and the Bergman space \(A^2\) to the weighted spaces defined above. To do so, we need to introduce one more function space. Denote by \({\mathcal {A}}(\mathbb {T})\) the Wiener algebra, that is, the space of holomorphic functions over the disk with absolutely summable coefficients, where the norm of a function \(g(z) = \sum _{k \in \mathbb {N}} a_k z^k\) is given by

$$\begin{aligned}\Vert g\Vert _{{\mathcal {A}}(\mathbb {T})} = \sum _{k \in \mathbb {N}} \left| a_k\right| < \infty .\end{aligned}$$

The Wiener algebra is formed by functions with well defined boundary values, and convergence in its norm implies uniform convergence over the closed unit disk. In some of the cited works, the disk algebra is referred to instead, and we warn that this is a different class of functions. The first result we generalize takes the following form:

Theorem 1.4

Let f be a polynomial with simple zeros such that \(Z(f) \cap {\mathbb {D}} = \emptyset \), and let \(p_n\) be the n-th o.p.a. to 1/f in \(H^2_\omega \). Then there exists a constant \(C > 0\) such that for all \(n \in {\mathbb {N}}\),

$$\begin{aligned} \left\Vert {1 - p_n f}\right\Vert _{{\mathcal {A}}(\mathbb {T})} \le C. \end{aligned}$$
(6)

Notice that (6) can't be improved to the left-hand side converging to 0: since \(f(z_i) = 0\) at each zero \(z_i\) of f on \(\mathbb {T}\), we have \((1-p_nf)(z_i) = 1\), so these values can't converge to 0.

The second result we need to extend deals with pointwise convergence of \(1-p_nf\) outside of the zero set of f. The answer is contained in the following theorem:

Theorem 1.5

Let f be a polynomial with simple zeros such that \(Z(f) \cap \mathbb {D}= \emptyset \), and let \(p_n\) be the n-th o.p.a. to 1/f in \(H^2_\omega \). Then

$$\begin{aligned} 1 - p_n f \rightarrow 0 \text { as } n \rightarrow \infty , \end{aligned}$$

uniformly on compact subsets of \({\overline{\mathbb {D}}} \backslash Z(f)\).

In Sect. 2 we show how to derive Theorems 1.4 and 1.5. These will be used in Sect. 3 to establish Theorem 1.3. These proofs will depend on some technical lemmas that we will state in the relevant place but leave for Sect. 4. We conclude with some remarks on future research in Sect. 5.

2 Wiener norm and pointwise convergence

The spaces we are considering, \(H^2_\omega \), are examples of Reproducing Kernel Hilbert Spaces (RKHS) over the disk with an inner product defined by

$$\begin{aligned} \langle f, g \rangle _\omega = \sum _{k=0}^{\infty }a_k{\overline{b}}_k\omega _k, \end{aligned}$$

where \(f(z) = \sum _{k=0}^{\infty } a_kz^k\), \(g(z) = \sum _{k=0}^{\infty } b_kz^k\). See [10] for the details. RKHS have the special property that norm convergence implies pointwise convergence for points in the common domain of the space (in our case, \(\mathbb {D}\)). The reproducing kernel \(k(z,b)\) at a point \(b \in \mathbb {D}\) is given by

$$\begin{aligned} k(z,b) = \sum _{k=0}^{\infty } \frac{{\overline{b}}^k z^k}{\omega _k}. \end{aligned}$$

Moreover, if we focus on polynomials of degree at most n, we can project the reproducing kernel onto \({\mathcal {P}}_n\), by simply truncating it, to obtain \(k_n(z,b)\), with

$$\begin{aligned} k_n(z,b) = \sum _{k=0}^{n} \frac{{\overline{b}}^k z^k}{\omega _k}, \end{aligned}$$

which is the reproducing kernel in the subspace \({\mathcal {P}}_n\) of \(H^2_\omega \) (with the inherited norm). We also denote

$$\begin{aligned} S_{n} := \sum _{k=0}^{n} \frac{1}{\omega _k}. \end{aligned}$$

Note that \(S_{n}\) can be interpreted as the value of \(k_n(z,z)\) if this formula were evaluated at a point \(z \in {\mathbb {T}}\), and it gives us an upper bound for the modulus that the reproducing kernels \(k_n(z,b)\) can attain on the closed unit disk. From now on, whenever we write \({\hat{g}}(k)\) we mean the Taylor coefficient of order k of the function g, and when we write \(v^t\) we mean the transpose of v.

A key result about o.p.a. in our context was proved in [4], Corollary 1.2, which provides a closed formula for all the coefficients of all degrees for \(1-p_nf\):

Lemma 2.1

Let f be a monic polynomial of degree d with simple zeros \(z_1, \dots , z_{d}\) that lie in \(\mathbb {C}\backslash \{0\}\), \(p_n\) the n-th o.p.a. to 1/f in \(H^2_\omega \), \(d_{k,n} = (\widehat{1-p_nf})(k)\), \(v_0 = (1, \dots , 1) \in \mathbb {C}^d\), and \(E = E_{Z,n} := (e_{l,m})_{l,m=1}^d\) be the matrix whose coefficients are given by \(e_{l,m} = k_{n+d}(z_l, z_m)\). Then there exists a unique vector \(A_n = (A_{1,n}, \dots , A_{d,n})\) such that for \(k = 0, \dots , n+d\) we have

$$\begin{aligned} d_{k,n} = \frac{1}{\omega _k}\sum _{i=1}^d A_{i,n}{\overline{z}}_i^k. \end{aligned}$$
(7)

Moreover,

$$\begin{aligned}A_n^t = E^{-1} \cdot v_0^t.\end{aligned}$$

From now on, we will assume that f has simple zeros \(z_1, \dots , z_{d_1} \in {\mathbb {T}}\), \(d_1 \ge 1\), and \(z_{d_1 + 1}, \dots , z_{d} \in {\mathbb {C}}\backslash \overline{{\mathbb {D}}}\), and that they are ordered, that is, \(|z_i| \le |z_{i+1}|\).
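The closed formula of Lemma 2.1 can be checked numerically by solving the least-squares problem for \(p_n\) directly and comparing the Taylor coefficients of \(1-p_nf\) with (7). The sketch below is purely illustrative (the weight \(\omega _k = k+1\), the chosen zeros and the degree are our own assumptions): it uses a cubic f with two zeros on \(\mathbb {T}\) and one outside.

```python
import numpy as np

# Illustrative data: two zeros on the circle, one outside; omega_k = k + 1.
omega = lambda k: float(k + 1)
z = np.array([np.exp(1j), np.exp(-1j), 2.0 + 0j])
f = np.poly(z)[::-1]                     # ascending coefficients of the monic f
d, n = len(z), 5
m = n + d + 1
w = np.sqrt(np.array([omega(k) for k in range(m)]))
G = np.zeros((m, n + 1), dtype=complex)
for j in range(n + 1):
    G[j:j + d + 1, j] = f                # column j: coefficients of z^j f(z)
e0 = np.zeros(m, dtype=complex)
e0[0] = 1.0
coef, *_ = np.linalg.lstsq(G * w[:, None], e0 * w, rcond=None)
resid = e0 - G @ coef                    # the coefficients d_{k,n} of 1 - p_n f

# Closed formula: E_{l,m} = k_{n+d}(z_l, z_m), A = E^{-1} v_0, then (7).
E = np.array([[sum((np.conj(zm) * zl) ** k / omega(k) for k in range(m))
               for zm in z] for zl in z])
A = np.linalg.solve(E, np.ones(d, dtype=complex))
pred = np.array([sum(A[i] * np.conj(z[i]) ** k for i in range(d)) / omega(k)
                 for k in range(m)])     # agrees with resid
```

The agreement of `resid` and `pred` reflects the fact that the truncated kernels \(k_{n+d}(\cdot ,z_i)\) span the orthogonal complement of \({\mathcal {P}}_nf\) inside \({\mathcal {P}}_{n+d}\).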

In order to prove Theorems 1.4 and 1.5, we need bounds for \(A_{i,n}\), to then apply the previous lemma. We need some preliminary results.

First we need to control the size of truncated reproducing kernels, as these provide the elements of the matrix E. The case \(|{\overline{z}}_j z_i| > 1\) is contained in the following lemma:

Lemma 2.2

Let \(z \in \mathbb {C}\) with \(|z| > 1\), and \(N \in \mathbb {N}\). Then

$$\begin{aligned}\sum _{k=0}^{N} \frac{z^k}{\omega _k} = C(N,z)\frac{z^{N+1}}{\omega _N}\end{aligned}$$

where

$$\begin{aligned}\lim _{N\rightarrow \infty } C(N,z) = \frac{1}{z-1}.\end{aligned}$$

Proof

Define

$$\begin{aligned} C(N,z) = \frac{\sum _{k=0}^N \frac{z^k}{\omega _k}}{\frac{z^{N+1}}{\omega _N}} = \sum _{k=0}^N \frac{\omega _Nz^{-N-1+k}}{\omega _k}. \end{aligned}$$

Under the change of variables \(x= \frac{1}{z}\), we can rewrite this as

$$\begin{aligned} C(N,x) = \sum _{k=0}^N \frac{\omega _N x^{N+1-k}}{\omega _k} = x \sum _{k=0}^N \frac{\omega _N x^{k}}{\omega _{N-k}}, \end{aligned}$$

where \(|x| < 1\). Now let \(\epsilon > 0\) be fixed. We split the sum into two parts

$$\begin{aligned} C(N,x) = x\sum _{k=0}^{\sqrt{N}}\frac{\omega _N}{\omega _{N-k}}x^k + x^{\sqrt{N} + 1}\sum _{k=\sqrt{N}+1}^{N}\frac{\omega _N}{\omega _{N-k}}x^{k-\sqrt{N}}. \end{aligned}$$

Therefore, we can compute the difference

$$\begin{aligned} \begin{aligned} \left| C(N,x) - \frac{x}{1-x}\right| \le \left| C(N,x) - x\sum _{k=0}^{\sqrt{N}}x^k\right| + \left| x\sum _{k=0}^{\sqrt{N}}x^k - \frac{x}{1-x}\right| . \end{aligned} \end{aligned}$$

The second term of the right-hand side can be made smaller than an arbitrary \(\epsilon \) by taking a large value of N. Meanwhile, the first term can easily be bounded above by

$$\begin{aligned} |x|\left| \sum _{k=0}^{\sqrt{N}}\left( \frac{\omega _N}{\omega _{N-k}}-1\right) x^k\right| + |x|^{\sqrt{N}+1}\left| \sum _{k=\sqrt{N}+1}^{N}\frac{\omega _N}{\omega _{N-k}}x^{k-\sqrt{N}}\right| . \end{aligned}$$

From condition (2) and the fact that the sequence of weights is monotone, it follows that choosing N large enough, we can derive

$$\begin{aligned} \sup _{k \in \{0, \dots , \sqrt{N}\}} \left| \frac{\omega _N}{\omega _{N-k}} - 1 \right| \le \epsilon (1-|x|). \end{aligned}$$

Then we can bound

$$\begin{aligned} |x|\left| \sum _{k=0}^{\sqrt{N}}\left( \frac{\omega _N}{\omega _{N-k}}-1\right) x^k\right| \le \epsilon . \end{aligned}$$

We are finally left with the term over \(k \ge \sqrt{N}\), which also tends to zero: it is exponentially suppressed by the factor \(|x|^{\sqrt{N}+1}\), while the growth of the weight ratios is at most polynomial by (5). Hence this term too can be made smaller than \(\epsilon \) for sufficiently large N. Reversing the change \(x = \frac{1}{z}\), the result follows. \(\square \)
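The limit in Lemma 2.2 is easy to probe numerically. The sketch below (an illustration under the assumed Dirichlet-type weight \(\omega _k = k+1\), which satisfies Definition 1.2) evaluates \(C(N,z)\) through the substitution \(x = 1/z\) used in the proof, which avoids computing the huge powers \(z^k\) directly.

```python
# Sanity check of the limit C(N, z) -> 1/(z - 1), with omega_k = k + 1
# (an illustrative choice of weight).
def C(N, z, omega):
    # C(N, z) rewritten in the variable x = 1/z, as in the proof.
    x = 1.0 / z
    return x * sum(omega(N) * x ** k / omega(N - k) for k in range(N + 1))

omega = lambda k: float(k + 1)
approx = [C(N, 1.5, omega) for N in (10, 100, 1000)]
# the values drift towards 1/(1.5 - 1) = 2
```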

Notice that here the Szegő kernel (the reproducing kernel for the classical Hardy space \(H^2\)) plays a universal role among weighted Hardy spaces. We are also interested in the size of \(k_n(z_i,z_j)\) when \(z_i \ne z_j\), \(|{\overline{z}}_j z_i| \le 1\). Its size relative to \(S_n\) is determined in the next lemma.

Lemma 2.3

Let \(z_1, z_2 \in {\overline{\mathbb {D}}}\), \(z_1 \ne z_2\). Then

$$\begin{aligned}\lim _{n \rightarrow \infty } \frac{k_n(z_1,z_2)}{S_n} = 0.\end{aligned}$$

Proof

By definition \(\left| k_n(z_1,z_2) \right| = \left| \sum _{k=0}^n \frac{{\overline{z}}_2^k z_1^k}{\omega _k} \right| \). Applying summation by parts and calling \(A_l = \sum _{k=0}^l {\overline{z}}_2^k z_1^k = \frac{1-{\overline{z}}_2^{l+1}z_1^{l+1}}{1-{\overline{z}}_2 z_1}\), it follows that

$$\begin{aligned} \left| \sum _{k=0}^n \frac{{\overline{z}}_2^k z_1^k}{\omega _k} \right|&= \left| \frac{A_n}{\omega _n} - \sum _{k=0}^{n-1}A_k\left( \frac{1}{\omega _{k+1}} - \frac{1}{\omega _k}\right) \right| \\&\le \frac{2}{|1-{\overline{z}}_2 z_1|}\left( \frac{1}{\omega _n} + \sum _{k=0}^{n-1}\left| \frac{1}{\omega _{k+1}} - \frac{1}{\omega _k}\right| \right) . \end{aligned}$$

Using that the sequence of weights is monotone, we can see

$$\begin{aligned} \left| k_n(z_1,z_2) \right| \le \frac{2}{|1-{\overline{z}}_2 z_1|}\left( \frac{2}{\omega _n} + 1 \right) . \end{aligned}$$

Dividing by \(S_n\), the limit condition (4) together with the divergence of \(S_n\) as \(n \rightarrow \infty \) guarantees that the right-hand side is \(o(S_n)\), which proves the claim. \(\square \)
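Lemma 2.3 is also easy to visualize numerically. In the Hardy-space case \(\omega _k \equiv 1\) (an illustrative choice), \(k_n(z_1,z_2)\) is a geometric sum, bounded uniformly in n, while \(S_n = n+1\) diverges:

```python
import numpy as np

# Off-diagonal truncated kernel versus S_n in H^2 (omega_k = 1, illustrative).
z1, z2 = np.exp(1j * 0.7), np.exp(-1j * 0.3)
q = np.conj(z2) * z1                    # |q| = 1 and q != 1 since z1 != z2
ratios = []
for n in (10, 100, 1000):
    kn = (1 - q ** (n + 1)) / (1 - q)   # k_n(z1, z2) = sum of q^k, k = 0..n
    ratios.append(abs(kn) / (n + 1))    # S_n = n + 1 in this space
# the ratios decay roughly like 1/n
```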

We can now state an estimate for the size of \(A_{i,n}\). Its proof will require the use of the intermediate Lemma 2.5.

Lemma 2.4

As \(n \rightarrow \infty \), the coefficients \(A_{i,n}\) meet the following rates of decay:

$$\begin{aligned} A_{i,n} \in {\left\{ \begin{array}{ll} O\left( \frac{1}{S_{n+d}}\right) \text { for } 1 \le i \le d_1 \\ O\left( \frac{1}{S_{n+d}|z_i|^{n+d+1}}\right) \text { for } d_1 < i \le d.\\ \end{array}\right. } \end{aligned}$$

Consequently, for each \(1 \le i \le d_1\), there exists a constant \(C_i\), independent of n, such that

$$\begin{aligned} \sum _{k=0}^{n+d}\left| A_{i,n}\frac{\overline{z_i}^k}{\omega _k} \right| \le C_i, \end{aligned}$$

while for \(d_1 < i \le d\), we have

$$\begin{aligned} \sum _{k=0}^{n+d}\left| A_{i,n}\frac{\overline{z_i}^k}{\omega _k} \right| \rightarrow 0 \end{aligned}$$

as \(n \rightarrow \infty \).

Before we can show Lemma 2.4, we need to estimate some determinants from below, in particular that of the matrix E in the statement of Lemma 2.1. Such a bound is reasonable to expect, since E is a positive definite matrix, but quantitative lower estimates for determinants are necessarily delicate. This use of the determinants to bound \(A_{i,n}\) is in the spirit of what is done in Lemma 3.3 and Lemma 5.4 from [4] for the \(H^2\) and \(A^2\) spaces respectively. Our lower estimate for the determinant is also based on a similar technique in Lemma 3.2 from the same article.

Lemma 2.5

Let \(1 \le d_1 \le d\) be integers, and \(\{z_i\}_{i=1}^d \subset {\mathbb {C}}\) be distinct points with \(|z_i| = 1\) for \(1 \le i \le d_1\) and \(|z_i| > 1\) for \(d_1 < i \le d\). If \(E := (e_{l,m})^d_{l,m=1}\) with \(e_{l,m} = k_{n+d}(z_l, z_m)\), then there exists a constant \(\delta > 0\), independent of n, such that for every n,

$$\begin{aligned} \det (E) \ge \delta \frac{S_{n+d}^{d_1}}{\omega _{n+d}^{d - d_1}} \prod _{l = 1}^d |z_l|^{2(n+d+1)}. \end{aligned}$$
(8)

Proof

In what follows, we use the notation \({\mathscr {S}}\) to denote the set of all permutations of the indices \(\{1,\dots , d\}\), \({{\,\mathrm{sgn}\,}}{(\sigma )}\) to denote the parity of a particular permutation and id to refer to the identity permutation. By the definition of determinant,

$$\begin{aligned} \det {(E)} = \sum _{\sigma \in {\mathscr {S}}}\left[ {{\,\mathrm{sgn}\,}}{(\sigma )}\prod _{l=1}^d e_{l, \sigma (l)}\right] = \sum _{\sigma \in {\mathscr {S}}}\left[ {{\,\mathrm{sgn}\,}}{(\sigma )}\prod _{l=1}^d \left( \sum _{k=0}^{n+d} \frac{{\overline{z}}_{\sigma (l)}^k z_l^k }{\omega _k}\right) \right] . \end{aligned}$$

We can decompose this sum depending on the number of indices that a given permutation fixes. Recall that when \(i = 1, \dots , d_1\) then \(|z_i| = 1\) while \(|z_i| > 1\) otherwise. Let \({\mathscr {A}}\) be the set of permutations such that \(\sigma (i) = i\) for every \(1 \le i \le d_1\) and for each \(0 \le j \le d_1\) let \({\mathscr {B}}_j\) be the set of permutations that fix exactly j of the indices in the set \(\{1, \dots , d_1\}\). Then,

$$\begin{aligned} \det {(E)}&= \prod _{l=1}^d \left( \sum _{k=0}^{n+d} \frac{|z_l|^{2k}}{\omega _k}\right) + \sum _{\sigma \in {\mathscr {A}}\backslash \{id\}} {{\,\mathrm{sgn}\,}}{(\sigma )} \prod _{l=1}^d \left( \sum _{k=0}^{n+d} \frac{{\overline{z}}_{\sigma (l)}^k z_l^k}{\omega _k}\right) \\&\quad + \sum _{j=0}^{d_1 - 1} \sum _{\sigma \in {\mathscr {B}}_j} {{\,\mathrm{sgn}\,}}{(\sigma )}\prod _{l=1}^d \left( \sum _{k=0}^{n+d} \frac{{\overline{z}}_{\sigma (l)}^k z_l^k}{\omega _k}\right) . \end{aligned}$$

Now, we can first study the sums inside the products. When \(l = \sigma (l) \le d_1\), then

$$\begin{aligned} \sum _{k=0}^{n+d} \frac{|z_l|^{2k}}{\omega _k} = S_{n+d}. \end{aligned}$$

Meanwhile, when \(l \not = \sigma (l)\) but both are smaller than or equal to \(d_1\), we can invoke Lemma 2.3, yielding

$$\begin{aligned}\sum _{k=0}^{n+d} \frac{{\overline{z}}_{\sigma (l)}^k z_l^k}{\omega _k} \in o\left( S_{n+d}\right) \end{aligned}$$

as n grows to \(\infty \). Finally, if l or \(\sigma (l)\) is bigger than \(d_1\), then \(|{\overline{z}}_{\sigma (l)} z_l| > 1\), therefore, using Lemma 2.2, we get

$$\begin{aligned} \sum _{k=0}^{n+d} \frac{{\overline{z}}_{\sigma (l)}^k z_l^k}{\omega _k} = C(l, \sigma , n) \frac{{\overline{z}}_{\sigma (l)}^{n+d+1} z^{n+d+1}_l }{\omega _{n+d}}, \end{aligned}$$

where

$$\begin{aligned} C(l,\sigma ,n) \rightarrow \frac{1}{{\overline{z}}_{\sigma (l)} z_{l} - 1}, \text { as }n \rightarrow \infty . \end{aligned}$$

Therefore, the first summand in the expression for the determinant is

$$\begin{aligned} \frac{S_{n+d}^{d_1}}{\omega _{n+d}^{d-d_1}} \left( \prod _{l=d_1 + 1}^d |z_l|^{2(n+d+1)}C(l,id, n) \right) . \end{aligned}$$
(9)

We can also compute the second summand corresponding to the permutations in \({\mathscr {A}}\),

$$\begin{aligned} \sum _{\sigma \in {\mathscr {A}}\backslash \{id\}} {{\,\mathrm{sgn}\,}}{(\sigma )} \frac{S_{n+d}^{d_1}}{\omega _{n+d}^{d-d_1}}\left( \prod _{l=d_1 + 1}^d ({\overline{z}}_{\sigma (l)} z_l)^{n+d+1}C(l,\sigma , n) \right) . \end{aligned}$$
(10)

Finally, the summands corresponding to the permutations in \({\mathscr {B}}_j\) consist of sums similar to those in (10), except involving the lower powers \(S_{n+d}^j\) and products over some subsets of the indices l.

Notice that for \(\sigma \in {\mathscr {A}}\), since \(\sigma \) is bijective from \(\{d_1 + 1, \dots , d\}\) to itself, we have

$$\begin{aligned} \prod _{l=d_1 + 1}^d ({\overline{z}}_{\sigma (l)} z_l)^{n+d+1} = \prod _{l = d_1 + 1}^d |z_l|^{2(n+d+1)} = \prod _{l = 1}^d |z_l|^{2(n+d+1)}. \end{aligned}$$

Therefore, we see that the determinant of E is the product of two factors, the first one being

$$\begin{aligned} \frac{S_{n+d}^{d_1}}{\omega _{n+d}^{d - d_1}} \prod _{l = 1}^d |z_l|^{2(n+d+1)}\end{aligned}$$

and the second one being

$$\begin{aligned} \prod _{l = d_1 + 1}^d C(l, id, n) + \sum _{\sigma \in {\mathscr {A}}\backslash \{id\}} {{\,\mathrm{sgn}\,}}{(\sigma )} \prod _{l = d_1 + 1}^d C(l, \sigma , n) + r(n), \end{aligned}$$
(11)

where \(r(n) \rightarrow 0\) as \(n \rightarrow \infty \). Note that E is a Gram matrix, so \(\det (E) > 0\) for all n. Moreover, since \(r(n) \rightarrow 0\), the limit of (11) coincides with the determinant of the matrix \(B := (b_{l,m})_{l,m = d_1 + 1}^d\), defined by \(b_{l,m}= \frac{1}{{\overline{z}}_m z_l - 1}\), and, as was shown in [4], this matrix is positive definite, so that \(\det (B) > 0\). Therefore, the second factor is strictly positive for all n and converges to \(\det (B) > 0\), so it is bounded below by a constant \(\delta > 0\). Hence we obtain the desired result. \(\square \)
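The lower bound (8) can be probed numerically. In the Hardy-space case \(\omega _k \equiv 1\) (chosen for illustration, with our own choice of zeros), the quotient of \(\det (E)\) by the first factor should converge to \(\det (B)\); with a single zero outside the disk, \(\det (B) = \frac{1}{|z_3|^2 - 1}\):

```python
import numpy as np

# det(E) against the leading factor in (8), for omega_k = 1 (illustration):
# d = 3 zeros, d_1 = 2 on the circle, and one zero at radius 1.8.
z = np.array([np.exp(0.7j), np.exp(-0.4j), 1.8 + 0j])
d, d1, n = 3, 2, 200
m = n + d + 1                            # kernels truncate at degree n + d
E = np.array([[sum((np.conj(zm) * zl) ** k for k in range(m))
               for zm in z] for zl in z])
S = float(m)                             # S_{n+d} = n + d + 1 in this space
lead = S ** d1 * np.prod(np.abs(z) ** (2 * m))   # the omega-power equals 1
ratio = np.linalg.det(E).real / lead
# ratio approaches det(B) = 1 / (1.8**2 - 1), about 0.446
```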

With this control on the determinant, we are finally ready to give bounds for the size of \(A_{i,n}\). We will combine Lemmas 2.2, 2.3, and 2.5 and the proof of a similar result in [4].

Proof of Lemma 2.4

The coefficients \(A_{i,n}\) are the solutions to the linear system \(E \cdot A_n^t = v_0^t\), where \(v_0 := (1, \dots , 1) \in {\mathbb {C}}^d\). Therefore, by Cramer’s rule we have that \(A_{i,n} = \frac{\det (E^{(i)})}{\det (E)}\) where \(E^{(i)}\) is obtained replacing the i-th column of E by \(v_0^t\).

Now if \(1 \le i \le d_1\), arguing as in Lemma 2.5, since the kernel sums only appear for \(\sigma (l) \not = i\), the highest power of \(S_{n+d}\) that can appear in any term of the expansion of \(\det (E^{(i)})\) is \(S_{n+d}^{d_1 - 1}\), multiplied by a product that is bounded above by a constant multiple of \(\frac{1}{\omega _{n+d}^{d - d_1}}\prod _{l=1}^d |z_l|^{2(n+d+1)}\). There thus exists a positive constant \(C_1\) such that

$$\begin{aligned} \left| \det (E^{(i)})\right| \le C_1 \frac{S_{n+d}^{d_1-1}}{\omega _{n+d}^{d - d_1}} \prod _{l = 1}^d |z_l|^{2(n+d+1)}. \end{aligned}$$

Applying Lemma 2.5 gives that for \(1 \le i \le d_1\), we have that \(A_{i,n} \in O\left( \frac{1}{S_{n+d}}\right) \).

In the case \( d_1 < i \le d\) arguing as in Lemma 2.5,

$$\begin{aligned}&\det {(E^{(i)})} = \frac{S_{n+d}^{d_1}}{\omega _{n+d}^{d-d_1-1}}\prod _{\begin{array}{c} l = d_1 + 1 \\ l \not = i \end{array}}^d |z_l|^{2(n+d+1)} C(l,id,n) \end{aligned}$$
(12)
$$\begin{aligned}&\quad +\sum _{\sigma \in {\mathscr {A}} \backslash \{id\}} {{\,\mathrm{sgn}\,}}{(\sigma )} \frac{S_{n+d}^{d_1}}{\omega _{n+d}^{d-d_1-1}} \prod _{\begin{array}{c} l = d_1 + 1 \\ \sigma (l) \not = i \end{array}}^d \left( {\overline{z}}_{\sigma (l)}z_l\right) ^{n+d+1} C\left( l,\sigma ,n\right) \end{aligned}$$
(13)
$$\begin{aligned}&\quad + R(n) \end{aligned}$$
(14)

where R(n) denotes the remainder terms. Now, (12) is missing a term of order \(\frac{1}{\omega _{n+d}}|z_i|^{2(n+d+1)}\), while in (13) each product is missing a term of order \(\frac{1}{\omega _{n+d}}|{\overline{z}}_i z_{i^*}|^{n+d+1}\) where \(i^* = \sigma ^{-1}(i) >d_1\). Finally, in (14), the highest power of \(S_{n+d}\) that appears is \(S_{n+d}^{d_1 - 1}\), and each product is missing at least one term of order \(|z_i|^{n+d+1}\). Therefore after dividing by \(\det {(E)}\) and using Lemma 2.5 we can conclude that \(A_{i,n}\) has an order of decay at most

$$\begin{aligned} O\left( \frac{\omega _{n+d}}{|z_i|^{2(n+d+1)}} + \frac{\omega _{n+d}}{|z_i|^{n+d+1} |z_{d_1 + 1}|^{n+d+1}} + \frac{1}{S_{n+d}|z_i|^{n+d+1}}\right) , \end{aligned}$$

where we have used that \(|z_{d_1+1}| \le |z_{d_1 + j}|\), \(j = 1, \dots , d-d_1\), combined with \(\sigma ^{-1}(i) > d_1\). Therefore, given that the growth of \(\omega _{n+d}\) is not exponential, it follows that \(A_{i,n} \in O\left( \frac{1}{S_{n+d}|z_i|^{n+d+1}}\right) \) as desired.

For the second part of the lemma for \(1 \le i \le d_1\), there is a constant \(C_i\) such that

$$\begin{aligned} \sum _{k=0}^{n+d} \left| A_{i,n} \frac{{\overline{z}}_i^k}{\omega _k}\right| \le \frac{C_i}{S_{n+d}} \sum _{k=0}^{n+d} \frac{|{\overline{z}}_i|^k}{\omega _k} = C_i, \end{aligned}$$

and for \(d_1 < i \le d\) we have, using Lemma 2.2,

$$\begin{aligned} \sum _{k=0}^{n+d} \left| A_{i,n} \frac{{\overline{z}}_i^k}{\omega _k}\right| = |A_{i,n}| C(n+d,|{\overline{z}}_i|) \frac{|{\overline{z}}_i|^{n+d+1}}{\omega _{n+d}} \end{aligned}$$

where \(C(n+d,|{\overline{z}}_i|) \rightarrow \frac{1}{|{\overline{z}}_i| - 1}\). Combining the bound \(A_{i,n} \in O\left( \frac{1}{S_{n+d}|z_i|^{n+d+1}}\right) \) with the fact that \(\frac{1}{S_{n+d} \omega _{n+d}} \rightarrow 0\) as \(n\rightarrow \infty \), we obtain

$$\begin{aligned} |A_{i,n}| C(n+d,|{\overline{z}}_i|) \frac{|{\overline{z}}_i|^{n+d+1}}{\omega _{n+d}} \rightarrow 0. \end{aligned}$$

\(\square \)

Now we are ready to show the validity of Theorem 1.4.

Proof of Theorem 1.4

Recall from Lemma 2.1 that \((1-p_nf)(z) = \sum _{k=0}^{n+d} d_{k,n} z^k\), where \(d_{k,n} = \frac{1}{\omega _k}\sum _{i=1}^d A_{i,n} {\overline{z}}_i^k\). Therefore, we can estimate the Wiener norm as:

$$\begin{aligned} \left\Vert {1 - p_n f}\right\Vert _{{\mathcal {A}}(\mathbb {T})} \le \sum _{i=1}^d \sum _{k=0}^{n+d} \left| A_{i,n} \frac{{\overline{z}}_i^k}{\omega _k}\right| . \end{aligned}$$

Now invoking Lemma 2.4, we can conclude that

$$\begin{aligned} \left\Vert {1 - p_n f}\right\Vert _{{\mathcal {A}}(\mathbb {T})} \le \sum _{i=1}^{d_1} C_i + o(1) \le C < \infty \end{aligned}$$

for some positive constant C, as desired. \(\square \)

The proof of Theorem 1.5 can also be given at this point.

Proof of Theorem 1.5

Let \(K \subset \overline{{\mathbb {D}}} \backslash \{z_1, \dots , z_d\}\) be a compact set. Then, by Lemma 2.1, for each \(z \in K\) we have

$$\begin{aligned} (1-p_nf)(z)&= \sum _{k=0}^{n+d} \left( \sum _{i=1}^d A_{i,n} \frac{{\overline{z}}_i^k}{\omega _k}\right) z^k = \sum _{i=1}^d A_{i,n} \left( \sum _{k=0}^{n+d} \frac{{\overline{z}}_i^k z^k}{\omega _k}\right) \\&= \sum _{i=1}^d A_{i,n} k_{n+d}(z, z_i). \end{aligned}$$

Now, for \(1 \le i \le d_1\), we have \(A_{i,n} \in O\left( \frac{1}{S_{n+d}}\right) \) by Lemma 2.4, while by Lemma 2.3, \(k_{n+d}(z, z_i) \in o(S_{n+d})\) uniformly on \(z \in K\) as we are avoiding a neighbourhood of \(z_i\), \(1 \le i \le d_1\). Therefore these terms go uniformly to zero on K.

For the case \(d_1 < i \le d\), one can see that \(k_{n+d}(z, z_i) \in O\left( \frac{|z_i|^{n+d+1}}{\omega _{n+d}}\right) \) uniformly on K. Again, by Lemma 2.4, \(A_{i,n} \in O\left( \frac{1}{S_{n+d}|z_i|^{n+d+1}}\right) \). As a result, these terms also go uniformly to zero on K. \(\square \)

3 Distribution of zeros

We are going to show the proof of Theorem 1.3. We will exploit classical results from approximation theory, for which we were inspired by the nice summary of techniques in [14]. For the rest of the article, for a polynomial P, \(\mu _P\) will denote the measure formed as the average of the delta measures at the zeros of P, and \(\nu _E\) will denote the uniform distribution measure over the set E. We base our approach on a classical result of Erdős and Turán, which states that if a monic polynomial is small on the unit circle and its constant coefficient is not too small, then its zeros cluster uniformly around the unit circle. There are multiple ways of quantifying these conditions for a monic polynomial. For our case, given a polynomial P we will study

$$\begin{aligned} H(P) = \max _{|z| = 1} \frac{|P(z)|}{\sqrt{|P(0)|}}. \end{aligned}$$

With respect to this H, the Erdős-Turán result is stated as follows:

Theorem 3.1

Let \(\{P_n\}_{n \in \mathbb {N}}\) be a family of monic polynomials, such that \(H(P_n) = e^{o(n)}\). Then

$$\begin{aligned} \lim _{n \rightarrow \infty } \mu _{P_{n}} = \nu _{\mathbb {T}}, \end{aligned}$$

where the convergence is in the weak sense.

See [14]. It will turn out that we have plenty of room to establish the applicability of this theorem in our context, but we prefer to obtain the best estimates we can, in order to promote further research. To apply the result to our case, we just need to re-normalize our polynomials into monic ones. We focus on \(P_n = 1 - p_nf\), which is a polynomial of degree at most \(n + d\), and study the polynomial \(\frac{P_n}{d_{n + d, n}}\), which is monic and has the same roots as \(P_n\). We have that

$$\begin{aligned} H\left( \frac{P_n}{d_{n+d,n}}\right) = \max _{|z| = 1} \frac{|P_n(z)|}{\sqrt{|d_{0,n}d_{n+d,n}|}} \le \frac{C}{\sqrt{|d_{0,n}d_{n+d,n}|}} \end{aligned}$$

where we used Theorem 1.4 to bound the values of \(P_n\) on the closed unit disk. Therefore, in order to bound the values of H, we need to find lower bounds for \(d_{0,n}\) and \(d_{n+d,n}\). The first was already obtained in [10], Theorem 4.1, which shows that

$$\begin{aligned}\Vert 1-p_n f\Vert ^2 \approx \frac{1}{S_{n+d}},\end{aligned}$$

where \(A \approx B\) means that there exists a constant C such that \( C^{-1} B \le A \le CB\). This is enough for determining \(d_{0,n}\) up to a constant since \(p_nf\) is the orthogonal projection of 1 onto \({\mathcal {P}}_n f\) and thus

$$\begin{aligned}\Vert 1-p_nf\Vert ^2 = \left\langle 1-p_nf, 1-p_nf \right\rangle = \left\langle 1-p_nf, 1 \right\rangle = d_{0,n}.\end{aligned}$$
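The identity \(d_{0,n} = \Vert 1-p_nf\Vert ^2\) (recall \(\omega _0 = 1\)) is easy to confirm numerically. The following sketch does so for \(f(z) = 1-z\) with the Dirichlet weight \(\omega _k = k+1\) (an illustrative choice on our part):

```python
import numpy as np

# Check d_{0,n} = ||1 - p_n f||^2 for f(z) = 1 - z, omega_k = k + 1.
omega = np.arange(1.0, 20.0)            # omega_k = k + 1 (illustrative)
f = np.array([1.0, -1.0])
n, d = 12, 1
m = n + d + 1
G = np.zeros((m, n + 1))
for j in range(n + 1):
    G[j:j + d + 1, j] = f               # column j: coefficients of z^j f(z)
w = np.sqrt(omega[:m])
e0 = np.zeros(m); e0[0] = 1.0
coef, *_ = np.linalg.lstsq(G * w[:, None], e0 * w, rcond=None)
resid = e0 - G @ coef                   # coefficients of 1 - p_n f
norm_sq = float(np.sum(omega[:m] * resid ** 2))
d0 = float(resid[0])                    # equals norm_sq, since omega_0 = 1
```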

Only \(d_{n+d,n}\) remains to be correctly estimated. Let us see how this works with an example that will illustrate both the difficulties and the solution for this problem. After this we will provide a general proof.

Example 3.2

Let f be a degree two monic polynomial with two distinct roots on the unit circle. Without loss of generality, and under a rotation if necessary, we may assume that \(z_1 = e^{i\theta } = {\overline{z}}_2\) are these two roots and \(0 < \theta \le \pi /2\).

Under these conditions, the matrix E as defined in Lemma 2.1 takes the form

$$\begin{aligned} E = \begin{pmatrix} S_{n+2} &{} k_{n+2}(e^{i\theta }, e^{-i\theta }) \\ k_{n+2}(e^{-i\theta }, e^{i\theta }) &{} S_{n+2} \end{pmatrix}. \end{aligned}$$

Therefore, it follows that

$$\begin{aligned} E^{-1} = \frac{1}{\det {(E)}}\begin{pmatrix} S_{n+2} &{} -k_{n+2}(e^{i\theta }, e^{-i\theta }) \\ -k_{n+2}(e^{-i\theta }, e^{i\theta }) &{} S_{n+2} \end{pmatrix}. \end{aligned}$$

Hence,

$$\begin{aligned} A_{1,n} = \frac{1}{\det {(E)}} \left( S_{n+2} -k_{n+2}(e^{i\theta }, e^{-i\theta }) \right) \\ A_{2,n} = \frac{1}{\det {(E)}} \left( S_{n+2} -k_{n+2}(e^{-i\theta }, e^{i\theta }) \right) . \end{aligned}$$

By Lemma 2.1, we have that \(d_{n+2, n}\) is equal to

$$\begin{aligned} \frac{2}{\omega _{n+2} \det {(E)}} \left[ S_{n+2}\cos {(\theta (n+2))} - \mathfrak {R}{\left( k_{n+2}(e^{i\theta }, e^{-i\theta })e^{-i\theta (n+2)} \right) }\right] . \end{aligned}$$

Now, for any \(0 < \theta \le \pi /2\), there exist an increasing sequence of natural numbers \(\{n_k\}_{k=0}^\infty \) and a positive constant \(\delta \) with \(|\cos {(\theta (n_k + 2))}| > \delta \) for all k. For this sequence it follows that

$$\begin{aligned} |d_{n_k + 2, n_k}| \ge \frac{2 \delta S_{n_k + 2}(1 + o(1))}{\omega _{n_k+2} S_{n_k+2}^2(1 + o(1)) } \ge \frac{C}{\omega _{n_k+2} S_{n_k+2}} \end{aligned}$$

for some constant \(C > 0\), which provides a lower bound along a subsequence of \(d_{n+2, n}\). Therefore, it follows that

$$\begin{aligned} H\left( \frac{P_{n_k}}{d_{n_k+2, n_k}}\right) = O\left( S_{n_k + 2} \sqrt{\omega _{n_k + 2}}\right) \implies H\left( \frac{P_{n_k}}{d_{n_k+2, n_k}}\right) = e^{o(n_k)}, \end{aligned}$$

so we can apply Theorem 3.1 to this subsequence.
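The existence of such a cosine subsequence is easy to observe numerically. A brute-force sketch (our illustration, with sample angles and \(\delta = 1/2\)):

```python
import math

# For sample angles theta, count the indices n <= N with
# |cos(theta * (n + 2))| > delta; a positive proportion of indices
# qualifies, which is where the subsequence {n_k} comes from.
N, delta = 2000, 0.5
hits = {}
for theta in (0.3, 1.0, math.pi / 2):
    hits[theta] = [n for n in range(N + 1)
                   if abs(math.cos(theta * (n + 2))) > delta]
    print(theta, len(hits[theta]))  # a substantial fraction of all n
```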

Note that, unless we delve into more details, we can’t avoid working with subsequences as we can’t control \(|\cos (\theta (n+2))|\) under general conditions. To make the situation manageable, we will need Lemma 3.5 below.

Before computing a lower bound for \(d_{n+d,n}\) that is valid in general, some preparation is required. We state here several technical lemmas and leave their proofs for the next section.

Lemma 3.3

Let f be a monic polynomial of degree d with simple zeros \(Z(f) \cap \mathbb {D}= \emptyset \ne Z(f) \cap \mathbb {T}= \{z_i\}_{i=1}^{d_1}\) and \(\{z_i\}_{i=d_1+1}^{d}= Z(f) \backslash {\overline{\mathbb {D}}}\). Let \(p_n\) be the n-th o.p.a. to 1/f. Define \(v := (v_m)_{m=1}^{d-d_1}\), \(v_m = 1/{\overline{z}}_{m+d_1}\), \(B := (b_{l,m})_{l,m=1}^{d-d_1}\), \(b_{l,m} = \frac{1}{{\overline{z}}_{m+d_1} z_{l+d_1}-1}\) and \(s := (s_l)_{l=1}^{d-d_1}\), \(s_l = \sum _{j=1}^{d_1} \frac{{\overline{z}}_j^{n+d+1}}{{\overline{z}}_j z_{l+d_1} - 1}\). Then there exists a positive constant C such that

$$\begin{aligned} d_{n+d, n} = \widehat{1 - p_{n}f}(n+d) = \frac{C}{\omega _{n+d}S_{n+d}}\left( G_n + o(1)\right) \end{aligned}$$

where

$$\begin{aligned} G_{n} = \begin{vmatrix}\sum _{j=1}^{d_1} {\overline{z}}_j^{n+d}&v\\s^t&B\end{vmatrix}. \end{aligned}$$

The next lemma concerns the value of the determinant \(G_n\) in the special case where there is only one zero on the unit circle, which we take to be \(z_1 = 1\).

Lemma 3.4

Let \(z_2, \dots , z_d \in {\mathbb {C}}\backslash \overline{{\mathbb {D}}}\) be distinct, with \(d \ge 2\). Then

$$\begin{aligned} G = \begin{vmatrix}1&v\\s^t&B\end{vmatrix} \not = 0, \end{aligned}$$

where \(v := (v_m)_{m=2}^{d}\), \(v_m = 1/{\overline{z}}_{m}\), \(B := (b_{l,m})_{l,m=2}^{d}\), \(b_{l,m} = \frac{1}{{\overline{z}}_{m} z_{l}-1}\) and \(s := (s_l)_{l=2}^{d}\), \(s_l = \frac{1}{z_{l} - 1}\).

The third of the auxiliary lemmas, inspired by Example 3.2, deals with the need to control the size of trigonometric expressions. The statement is immediate, for instance, when all the angles are rational multiples of \(\pi \):

Lemma 3.5

Let \(\theta _1, \dots , \theta _n \in [-\pi , \pi )\). Then, there exists an increasing sequence \(\{n_k\}_{k=0}^\infty \subset {\mathbb {N}}\) such that, for every \(1 \le j \le n\),

$$\begin{aligned} \lim _{k \rightarrow \infty } n_k \theta _j \equiv 0 \mod 2\pi . \end{aligned}$$

Our last lemma, for now, is about linear combinations of powers of unimodular complex numbers.

Lemma 3.6

Let \(\theta _1, \dots , \theta _n \in [0, 2\pi )\) be distinct, let \(C_1, \dots , C_n\) be arbitrary complex numbers and let \(N \in {\mathbb {N}}\). Then

$$\begin{aligned} \sum _{k=1}^n C_k e^{im\theta _k} = 0, \forall m \ge N, m \in {\mathbb {N}} \end{aligned}$$

if and only if \(C_1 = C_2 = \dots = C_n = 0\).

Using these lemmas, we can finally prove Theorem 1.3.

Proof of Theorem 1.3

It suffices to show that there exists an increasing sequence of natural numbers \(\{n_k\}_{k=0}^\infty \) such that \(|d_{n_k + d, n_k}| \ge \frac{\delta }{\omega _{n_k+d}S_{n_k+d}}\) for some \(\delta > 0\). Under that condition,

$$\begin{aligned} H\left( \frac{P_{n_k}}{d_{n_k+d, n_k}}\right) = O\left( S_{n_k + d}\sqrt{\omega _{n_k + d}}\right) \implies H\left( \frac{P_{n_k}}{d_{n_k+d, n_k}}\right) = e^{o(n_k)}, \end{aligned}$$

and the theorem follows from Theorem 3.1. Now, by Lemma 3.3, it is enough to show that there exist a subsequence \(\{n_k\}_{k=0}^\infty \) and \(\delta > 0\) such that \(|G_{n_k}| > \delta \).

For that, note that the previous determinant can be simply rewritten as

$$\begin{aligned} G_n = \sum _{j=1}^{d_1}{\overline{z}}_j^{n+d+1}\begin{vmatrix} 1/{\overline{z}}_j&v\\s^t_j&B\end{vmatrix}, \end{aligned}$$

where \(s_j = (s_{j,l})_{l=1}^{d-d_1}\), \(s_{j,l} = \frac{1}{{\overline{z}}_j z_{l+d_1} - 1}\). We distinguish two cases, according to whether or not there are zeros outside the unit circle.

In the case where there are no zeros outside the unit circle, we simply get \(G_n = \sum _{j=1}^{d_1} {\overline{z}}_j^{n+d} = \sum _{j=1}^{d_1} e^{-i\theta _j(n+d)}\). Applying Lemma 3.5, we can find \(\{n_k\}_{k=0}^\infty \) such that this sum approximates \(\sum _{j=1}^{d_1} e^{-i0} = d_1\).

When there are zeros outside the unit circle, rotating the plane if necessary, we may assume that \(z_1 = 1\), so that we can rewrite

$$\begin{aligned} G_n = G + \sum _{j=2}^{d_1}{\overline{z}}_j^{n+d+1}\begin{vmatrix} 1/{\overline{z}}_j&v\\s^t_j&B\end{vmatrix}, \end{aligned}$$

where G is as described in Lemma 3.4. Therefore, we can apply Lemma 3.6 to conclude that there exists N such that \(G_N \not = 0\) and use Lemma 3.5 to find a sequence \(\{n_k\}_{k=0}^\infty \) such that \(G_{n_{k}} \rightarrow G_N \not = 0\), proving the theorem. \(\square \)
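As a numerical illustration of the bound just proved (a sketch for a sample f, under the assumptions \(\omega _k = 1\) and \(S_n = n+1\), i.e. the unweighted space \(H^2\)): for \(f(z) = (z-1)(z-2)\) we have \(d_1 = 1\) and \(z_1 = 1\), so no subsequence is needed and \(|d_{n+d,n}| \, \omega _{n+d} S_{n+d}\) should stabilize at a nonzero value.

```python
import numpy as np

# Sketch: for f(z) = (z - 1)(z - 2) = 2 - 3z + z^2 in H^2 (omega_k = 1,
# S_n = n + 1), compute d_{n+d,n} = hat(1 - p_n f)(n+d) via least squares
# and check that |d_{n+d,n}| * S_{n+d} stabilizes at a nonzero value.
f = np.array([2.0, -3.0, 1.0])  # coefficients in ascending powers of z
d = len(f) - 1

def top_coefficient(n):
    # Column j of A holds the coefficients of z^j * f.
    A = np.zeros((n + d + 1, n + 1))
    for j in range(n + 1):
        A[j:j + d + 1, j] = f
    e0 = np.zeros(n + d + 1)
    e0[0] = 1.0
    c, *_ = np.linalg.lstsq(A, e0, rcond=None)  # minimizes ||1 - p f||
    return (e0 - A @ c)[-1]  # coefficient of z^{n+d} in 1 - p_n f

vals = {n: abs(top_coefficient(n)) * (n + d + 1) for n in (50, 100, 200)}
print(vals)  # roughly constant for large n
```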

4 Proof of technical lemmas

Now we concentrate on the task of establishing Lemma 3.3, for which we will use all the notation in its statement as well as the notation from Lemma 2.1. The proof will rest on one more auxiliary result.

We also denote by \(B_{(j)}^{(i)}\) the matrix obtained by removing the i-th column of the B matrix and adding the vector \(s_j := \left( s_{j,l}\right) _{l=1}^{d-d_1}\), \(s_{j,l} = \frac{{\overline{z}}_j^{n+d+1}}{\overline{z_j} z_{d_1 + l} - 1}\) as its first column. With this notation, we have the following:

Lemma 4.1

The inverse of E is given by

$$\begin{aligned} E^{-1} = \frac{S_{n+d}^{d_1 - 1}}{\det (E)} \frac{\prod _{l=d_1 + 1}^{d} |z_l|^{2(n+d+1)}}{\omega _{n+d}^{d-d_1}} R \end{aligned}$$

where \(R = (R_{i,j})_{i,j = 1}^d\) satisfies, for large values of n, the following:

  • For \(1 \le i \le d_1\),

    $$\begin{aligned} R_{i,i} = \det {(B)} + o(1) \end{aligned}$$

    and \(R_{i,j} \in o(1)\) when \(i \ne j\).

  • For \(d_1 < i \le d\) and \(1 \le j \le d_1\),

    $$\begin{aligned} R_{i,j} = \frac{(-1)^{d_1 + i}}{{\overline{z}}_i^{n+d+1}} \left[ \det {\left( B^{(i)}_{(j)}\right) } + o(1)\right] . \end{aligned}$$
  • Otherwise, for \(d_1 < j \le d\), we have \(R_{i,j} \in o\left( \frac{1}{{\overline{z}}_i^{n+d+1}}\right) .\)

Proof

By Cramer’s rule,

$$\begin{aligned} E^{-1} = \frac{1}{\det (E)} \begin{pmatrix}E_{1,1}&E_{1,2}& \dots & E_{1,d} \\ E_{2,1} &E_{2,2}& \dots & E_{2,d} \\ \vdots & \vdots &\ddots & \vdots \\ E_{d,1} & E_{d,2} & \dots & E_{d,d} \end{pmatrix}^T \end{aligned}$$

with each \(E_{i,j}\) the appropriate cofactor. We first focus on the columns that correspond to the zeros on the unit circle, that is, \(E_{i,j}\) with \(1 \le j \le d_1\). Noticing that \(e_{i,j} = k_{n+d}(z_i, z_j)\) and using the definition of a cofactor as a determinant, it follows that the diagonal terms are given by

$$\begin{aligned} E_{j,j} = \sum _{\begin{array}{c} \sigma \in {\mathscr {S}}\\ \sigma (j) = j \end{array}}\left[ {{\,\mathrm{sgn}\,}}(\sigma )\prod _{\begin{array}{c} l=1 \\ l \ne j \end{array}}^{d} k_{n+d}(z_l, z_{\sigma (l)})\right] . \end{aligned}$$

Let us decompose this sum depending on the number of indices a given permutation fixes. As in Lemma 2.5, let \({\mathscr {A}}\) be the set of all permutations such that \(\sigma (l) = l\) for every \(1 \le l \le d_1\), and, for each \(0 \le k < d_1-1\), let \({\mathscr {B}}_k\) be the set of permutations that fix exactly k of the indices in the set \(\{1, \dots , d_1\}\backslash \{j\}\). Then \(E_{j,j}\) may be decomposed as the sum of two summands, namely

$$\begin{aligned} \sum _{\begin{array}{c} \sigma \in {\mathscr {A}}\\ \sigma (j) = j \end{array}}{{\,\mathrm{sgn}\,}}(\sigma )\prod _{\begin{array}{c} l=1 \\ l \ne j \end{array}}^{d} k_{n+d}(z_l, z_{\sigma (l)}) + \sum _{k=0}^{d_1-2}\sum _{\begin{array}{c} \sigma \in {\mathscr {B}}_k\\ \sigma (j) = j \end{array}}{{\,\mathrm{sgn}\,}}(\sigma )\prod _{\begin{array}{c} l=1 \\ l \ne j \end{array}}^{d} k_{n+d}(z_l, z_{\sigma (l)}). \end{aligned}$$
(15)

For the first summand in (15), we have that if \(l = \sigma (l) < d_1\) we get, by definition,

$$\begin{aligned} k_{n+d}(z_l, z_l) = S_{n+d} \end{aligned}$$

and when \(l > d_1\) or \(\sigma (l) > d_1\), we get, by Lemma 2.2

$$\begin{aligned} k_{n+d}(z_l, z_{\sigma (l)}) = C(n, \sigma , l)\frac{{\overline{z}}^{n+d+1}_{\sigma (l)} z^{n+d+1}_l}{\omega _{n+d}}, \end{aligned}$$

with \(C(n, \sigma , l)\) such that

$$\begin{aligned} C(n, \sigma , l) \rightarrow \frac{1}{{\overline{z}}_{\sigma (l)} z_l-1} \text { as } n \rightarrow \infty . \end{aligned}$$

Therefore the first summand of (15) can be expressed as

$$\begin{aligned} S_{n+d}^{d_1 - 1} \frac{\prod _{l=d_1 + 1}^{d} |z_l|^{2(n+d+1)}}{\omega _{n+d}^{d-d_1}}\left( \sum _{\begin{array}{c} \sigma \in {\mathscr {A}}\\ \sigma (j) = j \end{array}}{{\,\mathrm{sgn}\,}}(\sigma )\prod _{\begin{array}{c} l=d_1 + 1 \end{array}}^{d} C(n, \sigma , l) \right) . \end{aligned}$$
(16)

Now, for the second term in (15), we notice that it consists of sums similar to (16) but with a factor of \(S_{n+d}\) missing. That factor could be replaced, in some cases, by factors of the form \(k_{n+d}(z_l, z_{\sigma (l)})\) with \( l \ne \sigma (l)\), \(l, \sigma (l) \le d_1\). However, by Lemma 2.3, these make a small contribution in comparison with \(S_{n+d}\). We also note that

$$\begin{aligned} C(n, \sigma , l) = \frac{1}{{\overline{z}}_{\sigma (l)} z_l-1}[1 + r(n,\sigma , l)] \text { with } r(n, \sigma , l) \rightarrow 0 \text { as } n \rightarrow \infty . \end{aligned}$$

Combining all the results above we get that \( E_{j,j}\) is equal to

$$\begin{aligned} S_{n+d}^{d_1 - 1} \frac{\prod _{l=d_1 + 1}^{d} |z_l|^{2(n+d+1)}}{\omega _{n+d}^{d-d_1}}\left( \sum _{\begin{array}{c} \sigma \in {\mathscr {A}}\\ \sigma (j) = j \end{array}}{{\,\mathrm{sgn}\,}}(\sigma )\prod _{\begin{array}{c} l=d_1 + 1 \end{array}}^{d} \frac{1}{{\overline{z}}_{\sigma (l)} z_l-1} + r(n)\right) , \end{aligned}$$

where \(r(n) \rightarrow 0\) as \(n \rightarrow \infty \). Finally, the left-most term in brackets is just \(\det {(B)}\), as a permutation \(\sigma \in {\mathscr {A}}\) is also a bijection of the set \(\{d_1 + 1, \dots , d\}\), so we have exactly the definition of the determinant.

For the non-diagonal terms, as before, we have

$$\begin{aligned} E_{i,j} = \sum _{\begin{array}{c} \sigma \in {\mathscr {S}}\\ \sigma (i) = j \end{array}}\left[ {{\,\mathrm{sgn}\,}}(\sigma )\prod _{\begin{array}{c} l=1 \\ l \ne i \end{array}}^{d} k_{n+d}(z_l, z_{\sigma (l)})\right] . \end{aligned}$$

A similar argument shows that, with respect to \(E_{j,j}\), each term is missing a factor of \(S_{n+d}\), yielding

$$\begin{aligned} E_{i,j} \in o(E_{j,j}). \end{aligned}$$

Next, we focus on the columns that correspond to the zeros outside the unit disk, that is, \(E_{i,j}\) with \(d_1 < j \le d\). We divide these cofactors into two cases, one for zeros on the unit circle, and one for the rest. In the first case, we have \(1 \le i \le d_1\), so that, using the definition of the cofactor, we get

$$\begin{aligned} E_{i,j} = \sum _{\begin{array}{c} \sigma \in {\mathscr {S}}\\ \sigma (i) = j \end{array}}\left[ {{\,\mathrm{sgn}\,}}(\sigma )\prod _{\begin{array}{c} l=1 \\ l \ne i \end{array}}^{d} k_{n+d}(z_l, z_{\sigma (l)})\right] . \end{aligned}$$

We further divide this sum into two terms: permutations in the set \({\mathscr {A}}_i\) and those in \({\mathscr {B}}_k\). Here the permutations in \({\mathscr {A}}_i\) fix all the indices in \(\{1, \dots , d_1\} \backslash \{i\}\), and \({\mathscr {B}}_{k}\) is defined as before, again excluding i. For the sum of the terms in \({\mathscr {A}}_i\) we get

$$\begin{aligned} S_{n+d}^{d_1-1}\sum _{\begin{array}{c} \sigma \in {\mathscr {A}}_i\\ \sigma (i) = j \end{array}}\left[ {{\,\mathrm{sgn}\,}}(\sigma )k_{n+d}(z_j, z_{\sigma (j)})\prod _{\begin{array}{c} l=d_1+1\\ l \ne i \end{array}}^{d} k_{n+d}(z_l, z_{\sigma (l)})\right] . \end{aligned}$$

Using again Lemma 2.2 and a reasoning analogous to the one before we get

$$\begin{aligned}&S_{n+d}^{d_1-1}\frac{\prod _{l=d_1 + 1}^{d} |z_l|^{2(n+d+1)}}{\omega _{n+d}^{d-d_1}} \left( \frac{{\overline{z}}_i}{{\overline{z}}_j}\right) ^{n+d+1} \\&\quad \cdot \sum _{\begin{array}{c} \sigma \in {\mathscr {A}}_i\\ \sigma (i) = j \end{array}}\left[ \frac{{{\,\mathrm{sgn}\,}}(\sigma )}{{\overline{z}}_{\sigma (j)} z_j-1} \prod _{\begin{array}{c} l=d_1+1 \\ l \ne i \end{array}}^{d} \frac{1}{{\overline{z}}_{\sigma (l)} z_l-1} + r \right] , \end{aligned}$$

where \(r=r(n, l, \sigma )\). Compared with the previous cases, a factor \({\overline{z}}_j^{n+d+1}\) is missing, as we have eliminated the j-th column of the E matrix, and there is an extra factor \({\overline{z}}_i^{n+d+1}\), since the i-th column was not eliminated in this case. For the summands corresponding to \({\mathscr {B}}_{k}\), one can easily check with a similar procedure that they are smaller than those in \({\mathscr {A}}_i\) by a factor of \(S_{n+d}\). We can therefore conclude that \(E_{i,j}\) is approximately the product of the two factors

$$\begin{aligned} S_{n+d}^{d_1-1}\frac{\prod _{l=d_1 + 1}^{d} |z_l|^{2(n+d+1)}}{\omega _{n+d}^{d-d_1}}\left( \frac{(-1)^{d_1 + j}}{{{\overline{z}}_j}^{n+d+1}}\right) \end{aligned}$$

and

$$\begin{aligned}\sum _{\begin{array}{c} \sigma \in {\mathscr {A}}_i\\ \sigma (i) = j \end{array}}\left[ (-1)^{d_1 + j } {{\,\mathrm{sgn}\,}}(\sigma )\frac{{\overline{z}}_i^{n+d+1}}{{\overline{z}}_{\sigma (j)} z_j-1} \prod _{\begin{array}{c} l=d_1+1 \\ l \ne i \end{array}}^{d} \frac{1}{{\overline{z}}_{\sigma (l)} z_l-1} + r(n, \sigma , l)\right] . \end{aligned}$$

This second factor is easily checked to be \(\det {\left( B^{(j)}_{(i)}\right) }\), where the factor \((-1)^{d_1 + j}\) appears because we are considering a larger permutation, which changes the sign of \(\sigma \) when regarded as a bijection on the reduced set.

Finally, for the case \(d_1 < i \le d\), a similar expansion of the determinant shows that each sum is missing an exponential factor with respect to the previous terms \(E_{i,j}\). Thus, in this case,

$$\begin{aligned} E_{i,j} \in o\left( S_{n+d}^{d_1-1}\frac{\prod _{l=d_1 + 1}^{d} |z_l|^{2(n+d+1)}}{{\overline{z}}_i^{n+d+1}\omega _{n+d}^{d-d_1}}\right) \end{aligned}$$

and the lemma follows by transposing the matrix. \(\square \)

We can finally establish Lemma 3.3.

Proof of Lemma 3.3

By Lemma 2.1 we have that

$$\begin{aligned} d_{n+d, n} = \frac{1}{\omega _{n+d}} \sum _{i=1}^d A_{i,n} {\overline{z}}_i^{n+d}, \end{aligned}$$

where \(A^t_n = E^{-1} \cdot v_0^t\). Now, applying Lemma 4.1 it follows that

$$\begin{aligned} A_{i,n} = \frac{S_{n+d}^{d_1 - 1}}{\det (E)} \frac{\prod _{l=d_1 + 1}^{d} |z_l|^{2(n+d+1)}}{\omega _{n+d}^{d-d_1}} \sum _{j=1}^d R_{i,j}. \end{aligned}$$

We can plug this value of \(A_{i,n}\) into the previous formula, which leads us to

$$\begin{aligned} d_{n+d, n} = \frac{S_{n+d}^{d_1 - 1}}{\det (E)} \frac{\prod _{l=d_1 + 1}^{d} |z_l|^{2(n+d+1)}}{\omega _{n+d}^{d - d_1+1}} \sum _{i,j=1}^d R_{i,j} {\overline{z}}_i^{n+d}. \end{aligned}$$

We first deal with the multiplicative term. Lemma 2.5 tells us that

$$\begin{aligned} \det {(E)} = C S_{n+d}^{d_1}\frac{\prod _{l=d_1 + 1}^{d} |z_l|^{2(n+d+1)}}{\omega _{n+d}^{d-d_1}} (1 + o(1)) \end{aligned}$$

for some positive constant C. We can substitute this value into the formula for \(d_{n+d,n}\) to obtain

$$\begin{aligned} d_{n+d, n} = \frac{C}{S_{n+d} \omega _{n+d}}(1 + o(1))\sum _{i,j=1}^d R_{i,j} {\overline{z}}_i^{n+d}. \end{aligned}$$

We can divide the sum \(\sum _{i,j=1}^d R_{i,j} {\overline{z}}_i^{n+d}\) into three terms, as

$$\begin{aligned} \sum _{i=1}^{d_1} \sum _{j=1}^d R_{i,j} {\overline{z}}_i^{n+d} + \sum _{i=d_1+1}^{d} \sum _{j=1}^{d_1} R_{i,j} {\overline{z}}_i^{n+d} + \sum _{i=d_1+1}^{d} \sum _{j=d_1+1}^{d} R_{i,j} {\overline{z}}_i^{n+d}. \end{aligned}$$

For the first of these three sums, we can directly apply Lemma 4.1 to get

$$\begin{aligned} \sum _{i=1}^{d_1} \sum _{j=1}^d R_{i,j} {\overline{z}}_i^{n+d}&= \sum _{i=1}^{d_1} R_{i,i} {\overline{z}}_i^{n+d} + \sum _{i=1}^{d_1} \sum _{j \not = i}^d R_{i,j} {\overline{z}}_i^{n+d} \\&= \left( \sum _{i=1}^{d_1}\overline{z_i}^{n+d}\right) \det {(B)} + o(1). \end{aligned}$$

Using Lemma 4.1 again, we see that the second summand is equal to

$$\begin{aligned} \sum _{i=1}^{d-d_1} \frac{(-1)^{i}}{{\overline{z}}_{i+d_1}} \left( \sum _{j=1}^{d_1} \det {\left( B^{(i + d_1)}_{(j)}\right) }\right) + o(1). \end{aligned}$$

The last summand is simply o(1), since the values \(R_{i,j}\) are of order \(o(1/{\overline{z}}^{n+d+1}_{i})\) while \(|z_i| > 1\) for \(d_1 < i \le d\). The result follows. \(\square \)

We are also ready for a proof of Lemma 3.4.

Proof of Lemma 3.4

Let us first express every quantity in terms of \(a_j = \frac{1}{{\overline{z}}_j} \in {\mathbb {D}}\), \(j = 2, \dots , d\), so that \(v_m = a_m\), \(b_{l,m} = \frac{{\overline{a}}_l a_m}{1 - {\overline{a}}_{l} a_{m}}\) and \(s_l = \frac{{\overline{a}}_l}{1 - {\overline{a}}_l}\). By multilinearity of the determinant, we can extract a factor \({\overline{a}}_l\) from the l-th row and a factor \(a_m\) from the m-th column, yielding

$$\begin{aligned} G = \left( \prod _{j=2}^d |a_j|^2 \right) \begin{vmatrix} 1&v_0\\ p^t&H \end{vmatrix}, \end{aligned}$$

where \(v_0 = (1, \dots , 1) \in {\mathbb {C}}^{d-1}\), \(p = \left( p_l \right) _{l=2}^d\), \(p_l = \frac{1}{1 - {\overline{a}}_l}\) and \(H = \left( h_{l,m}\right) _{l,m=2}^{d}\), \(h_{l,m} = \frac{1}{1 - {\overline{a}}_l a_m}\). The first product is never zero, since \(a_j \not = 0\) for \(j=2, \dots , d\), so it only remains to prove that the determinant on the right-hand side is non-zero.
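The factorization above is easy to confirm numerically (a sketch with randomly chosen points \(a_j \in {\mathbb {D}} \setminus \{0\}\); the sample size and seed are arbitrary):

```python
import numpy as np

# Check that G = (prod |a_j|^2) * det([[1, v0], [p^t, H]]) for random a_j
# in the unit disk, with v_m = a_m, b_{l,m} = conj(a_l) a_m / (1 - conj(a_l) a_m),
# s_l = conj(a_l) / (1 - conj(a_l)), p_l = 1 / (1 - conj(a_l)) and
# h_{l,m} = 1 / (1 - conj(a_l) a_m), following the definitions in the text.
rng = np.random.default_rng(0)
k = 4  # number of points a_2, ..., a_d (so d = k + 1)
a = (0.1 + 0.8 * rng.random(k)) * np.exp(2j * np.pi * rng.random(k))

al = a.conj()[:, None]  # conj(a_l), varying down the rows
am = a[None, :]         # a_m, varying across the columns

M1 = np.zeros((k + 1, k + 1), dtype=complex)  # the matrix defining G
M1[0, 0] = 1.0
M1[0, 1:] = a                                  # v
M1[1:, 0] = a.conj() / (1 - a.conj())          # s
M1[1:, 1:] = (al * am) / (1 - al * am)         # B
G = np.linalg.det(M1)

M2 = np.zeros((k + 1, k + 1), dtype=complex)  # the factored matrix
M2[0, 0] = 1.0
M2[0, 1:] = 1.0                                # v_0
M2[1:, 0] = 1 / (1 - a.conj())                 # p
M2[1:, 1:] = 1 / (1 - al * am)                 # H
rhs = np.prod(np.abs(a) ** 2) * np.linalg.det(M2)

print(abs(G - rhs))  # should be numerically zero
```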

The determinant is the value at \(z=1\) of the function g given by the orthogonal projection in \(H^2\) of the constant function 1 onto \({{\,\mathrm{Span}\,}}\{k_{a_j}(z)\}_{j=2}^d =: V\), where \(k_w(z) = \frac{1}{1 - {\overline{w}}z}\) is the Szegö kernel at w evaluated at z (\(w,z \in {\mathbb {D}}\)).

From this property of projections, it is standard that

$$\begin{aligned} g(1) = 1 - \prod _{j=2}^d \left( -{\overline{a}}_j\right) \left( \frac{1 - a_j}{1 - {\overline{a}}_j}\right) . \end{aligned}$$

Now notice that \(\left| \frac{1 - a_j}{1 - {\overline{a}}_j} \right| = 1\), while \(|a_j| < 1\), so the product in the right hand side is strictly smaller than 1. In other words \(g(1) \not = 0\). \(\square \)

The proposition we need about angles is elementary, being a particular case of Kronecker’s approximation theorem, but we provide a proof for completeness:

Proof of Lemma 3.5

Without loss of generality, we may assume that all the \(\theta _i\) are different. Moreover, from now on, when talking about angles we restrict ourselves to the interval \([-\pi , \pi )\) with the corresponding identification mod \(2\pi \). We first treat the case of two angles: given \(\epsilon > 0\), it is enough to find \(K \in {\mathbb {N}}\) such that the angles \(K\theta _1\) and \(K\theta _2\) both lie in \([-\epsilon , \epsilon ]\).

Fix \(T \in {\mathbb {N}}\) such that \(1/T \le \epsilon \), and choose \(k_1\) (which exists by the pigeonhole principle) such that \(\theta '_1 \equiv k_1\theta _1\) satisfies \(\theta _1' \in \left[ \frac{-1}{2T^3}, \frac{1}{2T^3}\right] \). Then \(\{s\theta _{1}'\}_{s=1}^{2T^2}\) is a collection of \(2T^2\) points, all of which lie in \([-\epsilon , \epsilon ]\). Consider now the set \(\{k_1s\theta _2\}_{s=1}^{2T^2}\), which has \(2T^2\) elements. By the pigeonhole principle, there exist \(s_1\) and \(s_2\) such that \(|(s_1 - s_2)k_1\theta _2| \le \frac{1}{T} \le \epsilon \). Taking \(k_2 = |s_1 - s_2|\) and \(K = k_1k_2\), the result follows, since \(|K\theta _1| \le 2T^2 |\theta _1'| \le 1/T \le \epsilon \) as well.

The general case follows automatically via induction. \(\square \)
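The construction above can be tried out numerically. The following sketch (our illustration) does not reproduce the pigeonhole bookkeeping, but simply searches for the smallest multiplier K that brings two sample angles close to \(0 \pmod {2\pi }\):

```python
import math

def residue(x):
    """Distance from x to the nearest integer multiple of 2*pi."""
    r = x % (2 * math.pi)
    return min(r, 2 * math.pi - r)

def find_K(thetas, eps, limit=10**6):
    """Smallest K <= limit with residue(K * theta) <= eps for every theta."""
    for K in range(1, limit + 1):
        if all(residue(K * t) <= eps for t in thetas):
            return K
    raise ValueError("no K found below the search limit")

thetas = (math.sqrt(2), math.sqrt(3))  # sample angles
K = find_K(thetas, eps=0.05)
print(K, [residue(K * t) for t in thetas])
```

Iterating this search with \(\epsilon \rightarrow 0\) produces the increasing sequence \(\{n_k\}\) of the lemma.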

Lastly, we must understand linear combinations of powers of unimodular complex numbers:

Proof of Lemma 3.6

Suppose, for the sake of contradiction, that there is some \(C_i \not = 0\) such that

$$\begin{aligned} \sum _{k=1}^n C_k e^{im\theta _k} = 0, \forall m \ge N. \end{aligned}$$

Without loss of generality, we can assume that \(C_1 \not = 0\) and rearrange the equation to get

$$\begin{aligned} \sum _{k=2}^n \left( -C_k/C_1\right) e^{im(\theta _k - \theta _1)} = 1, \forall m \ge N. \end{aligned}$$

As this is true for all \(m \ge N\), we can sum these equalities over m to get

$$\begin{aligned} R + 1&= \sum _{m=N}^{N+R} 1 = \sum _{m=N}^{N+R} \sum _{k=2}^n (-C_k/C_1) e^{im(\theta _k - \theta _1)} \\&= \sum _{k=2}^n (-C_k/C_1) \sum _{m=N}^{N+R} e^{im(\theta _k - \theta _1)} \le \sum _{k=2}^n |C_k/C_1| \frac{2}{|1 - e^{i(\theta _k - \theta _1)}|}. \end{aligned}$$

Letting \(R \rightarrow \infty \) yields the desired contradiction, as the right-hand side is a constant independent of R. \(\square \)
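The contradiction rests on the fact that the geometric sums on the right stay bounded while the left-hand side grows with R; a quick numerical sketch with a sample angle difference:

```python
import cmath

# Partial sums sum_{m=N}^{N+R} e^{i m delta} are bounded by 2 / |1 - e^{i delta}|
# uniformly in R, whereas the left-hand side of the displayed chain grows like R.
delta = 2.0  # sample value of theta_k - theta_1, not a multiple of 2*pi
N = 5
bound = 2 / abs(1 - cmath.exp(1j * delta))
sums = [abs(sum(cmath.exp(1j * m * delta) for m in range(N, N + R + 1)))
        for R in (10, 100, 1000, 10000)]
print(bound, sums)  # every partial sum stays below the bound
```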

5 Further remarks

From where we stand, it is not difficult to draw a few further directions of research.

One may consider whether our results also hold for the zeros of the o.p.a. \(\{p_n\}_{n\in \mathbb {N}}\) rather than those of \(1-p_nf\). This is the equidistribution suggested in the original paper [2], and perhaps an interlacing result would yield this additional asymptotic information.

In some of our theorems, the appearance of subsequences is somewhat surprising. One may wonder whether the propositions hold without taking subsequences, but the calculations suggest that there are some obstructions. Perhaps this is an actual opportunity: it seems plausible that one could construct special functions f as limits of polynomials \(f_n\) for which the corresponding subsequences have larger and larger gaps, making f bear some special property with regard to cyclicity, precisely because no subsequence of the natural numbers would work for all \(f_n\).

Another natural line of research could focus on the strong asymptotics suggested by Rakhmanov, or on the value distribution or level sets of \(1-p_nf\) over \(\mathbb {T}\) in general. At least in the Dirichlet space, this can be linked to capacities and to the norm of \(1-p_nf\) in a way that connects directly with the study of cyclic functions.

Finally, it seems desirable to remove our requirement that the zeros of the function under study be simple. Although we believe this to be possible, we have decided not to include such complications here. The recent article by Felder and Le [9] shows how to deal with multiple zeros in a very similar context.