1 Introduction

This is a continuation of the recent work [26], to which it adds some details of constructive analysis. The purpose of both articles is to compute the moments of the “cloud” of a positive measure with compact support in the plane, starting from the moments of the original measure. This moment conversion eliminates the components of the measure which are singular with respect to Lebesgue area, informally regarded as “outliers”, and charges the cloud with uniform mass. The restriction to two real dimensions is essential to the approximation scheme we propose, for two reasons: complex variables enter heavily into the picture, and so does a refined spectral analysis for pairs of self-adjoint operators with trace-class commutator.

The present article has two interlaced motivations: first, to identify a 2D analogue of the classical and still rapidly evolving spectral analysis on the line or the circle, based on moment information, involving orthogonal polynomials and Christoffel-Darboux kernels; and second, the current renewed interest, of a data-analysis flavor, in separating outliers from the cloud in multidimensional point distributions. Our viewpoint is to treat moments as observed data and to extract from them, by simple analytic transforms, the moments of the cloud. We only mention, without expanding on them in this work, the various well-known methods of reconstruction from moments.

A landmark contribution to polynomial approximation theory in the complex domain is Thomson’s Theorem [31]. It asserts that complex polynomials are either dense in the Lebesgue space \(L^{2}(\mu )\) associated to a positive Borel measure \(\mu \) with compact support in the plane, or there exist \(L^{2}(\mu )\)-bounded point evaluations for polynomials. More specifically, Thomson’s Theorem provides a decomposition of the measure \(\mu \) into a part which is singular with respect to area, where polynomials are dense, and measures which carry an open set of point evaluations; see Theorem 2.2 below for the precise statement. The multiplier by the complex variable \(S_{\mu }= M_{z}\), acting on the closure of polynomials \(P^{2}(\mu )\), is consequently decomposed into a direct sum of a normal operator and a collection of irreducible subnormal, cyclic operators with a continuum of eigenvalues for their adjoints: \(S_{\mu }= N \oplus S_{1} \oplus S_{2} \oplus \ldots \). In very elementary terms, the main idea we exploit in this article is: the self-commutator

$$ [S_{\mu }^{\ast }, S_{\mu }] = 0 \oplus [S_{1}^{\ast }, S_{1}] \oplus [S_{2}^{\ast }, S_{2}] \oplus \ldots $$

does not “see” the normal part, that is the outliers to the cloud.

The spectral analysis of the analytic Toeplitz operator \(S_{\mu }\) is rich, and it provides the numerical procedure we propose for eliminating its normal direct summand via traces of commutators. A second notable result we invoke is due to Berger and Shaw [5], namely the self-commutator \([S^{\ast }_{\mu }, S_{\mu }]\) is trace class. This opens a wide array of tools specific to the notion of principal function, a refined spectral invariant extending the Fredholm index across the spectrum. A trace formula discovered in the 1970s by Helton and Howe and an equivalent determinant formula independently obtained by Carey and Pincus provide effective formulas linking the moments of the principal function (in our case the characteristic function of the cloud of the measure \(\mu \)) and traces of commutators of smooth functions applied to \(S_{\mu }\). The third deep result we rely on is due to Carey and Pincus, asserting that the principal function of a subnormal operator is integer valued [8].

This machinery was put in action in our previous article [26] with a proposed computational/transformation scheme:

moments of \(\mu \) \(\longmapsto \) moments of the uniform mass on the cloud of \(\mu \).

We expand below a quantitative analysis of the algorithm, with error bounds expressed in the entries of the Hessenberg matrix representing \(S_{\mu }\) with respect to the filtration of \(P^{2}(\mu )\) by finite dimensional subspaces consisting of polynomials. The main transform involves solely limits of some explicit expressions involving truncated Christoffel-Darboux kernels of the original measure \(\mu \). Not surprisingly, the principal error estimate depends on the asymptotics of the singular numbers of Hankel’s operator \((I-P)M_{z}^{\ast }P\), where \(P : L^{2}(\mu ) \longrightarrow P^{2}(\mu )\) denotes the orthogonal projection. The approximation scheme can be adapted to orthogonal function systems other than polynomials, with an appropriate sequence of finite rank projections converging strongly to the identity operator. We remain at an all-inclusive level, applicable to any positive measure with compact support, leaving for future studies the adaptation of this general framework to concrete situations. The examples at the end of the article indicate a few openings in this direction.

To make the present note accessible to a larger group of readers, we (re)expose below the main ingredients, with precise reference to the sources.

Acknowledgments. We are grateful to Bernhard Beckermann for helpful comments on the asymptotics of the Hessenberg matrix associated to a uniform mass distribution in the plane.

2 Preliminaries

Throughout this note \(\mathbb{C}[z]\) stands for the algebra of polynomials in one variable with complex coefficients. Let \(\mu \) be a positive Borel measure with compact support in the complex plane. We assume the support of \(\mu \) is not finite.

2.1 Orthogonal Polynomials

We recall a few facts and conventions referring to complex orthogonal polynomials, as for instance appearing in [29]. The closure of complex polynomials in \(L^{2}(\mu )\) is denoted \(P^{2}(\mu )\), with associated orthogonal projection \(P\). The multiplication \(M\) by the complex variable \(z \in \mathbb{C}\) is a bounded linear transform of \(L^{2}(\mu )\) which leaves invariant the subspace \(P^{2}(\mu )\):

$$ P M P = M P. $$

The linear operator \(S_{\mu }= M|_{P^{2}(\mu )}: P^{2}(\mu ) \longrightarrow P^{2}(\mu )\) is called subnormal. It is also a Toeplitz operator:

$$ (S_{\mu }f)(z) = P (w f(w))(z), \quad f \in P^{2}(\mu ). $$

Sometimes we simply write \(S = S_{\mu }\). The spectrum of \(M\) coincides with the closed support \({\mathrm{supp}}(\mu )\) of the measure \(\mu \), while the spectrum of \(S\) can be larger, containing in addition some connected components of \(\mathbb{C}\setminus {\mathrm{supp}}(\mu )\).

The constant function \(\mathbf{1}\) is a cyclic vector for \(S\), producing the finite dimensional filtration (Krylov subspaces):

$$ \mathbb{C}_{n}[z] = \{ f \in \mathbb{C}[z],\ \deg f \leq n \} = { \mathrm{span}} \{ S^{j} \mathbf{1}, \ 0 \leq j \leq n \}. $$

We denote by \(p_{n}(z)\) the associated complex orthogonal polynomials:

$$ \langle p_{j}, p_{k} \rangle = \int p_{j} \overline{p_{k}} d\mu = \delta _{jk}, \qquad \deg p_{k} =k, \quad j,k \geq 0. $$

The operator \(S\) has a distinguished Hessenberg matrix representation with respect to the orthonormal basis \((p_{n})_{n=0}^{\infty }\) of \(P^{2}(\mu )\):

$$ \langle z p_{j}, p_{k} \rangle = h_{kj}, \quad j,k \geq 0, $$

observing the automatic vanishing relations

$$ h_{kj} =0, \quad j+1 < k. $$

The adjoint operator is represented by the matrix:

$$ \langle S^{\ast }p_{j}, p_{k} \rangle = \langle M^{\ast }p_{j}, p_{k} \rangle = \langle p_{j} , S p_{k} \rangle = \overline{h_{jk}}. $$

It is customary to normalize the leading coefficient of \(p_{n}\) to be a positive number:

$$ p_{n}(z) = \gamma _{n} z^{n} + \ldots , \quad \gamma _{n} > 0, \ n \geq 0. $$

The matrix representing \(S\) has the form:

$$ H = \begin{bmatrix} h_{00} & h_{01} & h_{02} & h_{03}& \ldots \\ h_{10}&h_{11}& h_{12} & h_{13} & \ldots \\ 0& h_{21}&h_{22} &h_{23}& \\ 0&0&h_{32}&h_{33}& \ldots \\ \vdots & & \ddots & \ddots \end{bmatrix} . $$

If the measure \(\mu \) has support on the real line, then \(S\) is a self-adjoint operator, hence the matrix \(H\) is symmetric, with real values, with only three non-vanishing diagonals: a well charted Jacobi matrix framework.
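As a numerical illustration of the Hessenberg representation, the following sketch runs an Arnoldi-type Gram-Schmidt iteration on a discrete measure. The discrete measure is only a stand-in for a general \(\mu \): it has finite support, so only degrees below the number of nodes make sense. The function name `arnoldi_hessenberg` and both test measures are assumptions made for this example and are not taken from the article.

```python
import cmath
import math


def arnoldi_hessenberg(nodes, weights, n):
    """Entries h[k][j] = <z p_j, p_k>, 0 <= j <= n, 0 <= k <= n + 1,
    for the discrete measure sum_i weights[i] * delta_{nodes[i]}.
    Requires n + 2 <= number of distinct nodes."""
    def inner(f, g):
        return sum(w * a * b.conjugate() for w, a, b in zip(weights, f, g))

    total = sum(weights)
    # p_0 is the constant 1 / sqrt(mu(C)); polynomials are stored by node values.
    basis = [[1.0 / math.sqrt(total) + 0j for _ in nodes]]
    h = [[0j] * (n + 1) for _ in range(n + 2)]
    for j in range(n + 1):
        v = [z * p for z, p in zip(nodes, basis[j])]  # values of z * p_j
        for k in range(j + 1):
            h[k][j] = inner(v, basis[k])
            v = [a - h[k][j] * b for a, b in zip(v, basis[k])]
        norm = math.sqrt(inner(v, v).real)
        h[j + 1][j] = norm + 0j  # positive, so p_{j+1} has positive leading coefficient
        basis.append([a / norm for a in v])
    return h


# Equal masses at the 16th roots of unity: the monomials are orthonormal,
# so the computed matrix is the truncated shift.
circle = [cmath.exp(2j * math.pi * i / 16) for i in range(16)]
h_circle = arnoldi_hessenberg(circle, [1 / 16] * 16, 4)

# Real nodes: the computed matrix is tridiagonal and symmetric (Jacobi case).
h_real = arnoldi_hessenberg([-1.0, -0.5, 0.0, 0.5, 1.0], [0.2] * 5, 2)
```

Entries below the first subdiagonal are never written and stay zero, which mirrors the automatic vanishing relations; the real-node case recovers the three-diagonal structure mentioned above.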

For every non-negative integer \(n\) one can speak of the reproducing kernel (known as the Christoffel-Darboux kernel) of the finite dimensional subspace \(\mathbb{C}_{n}[z]\):

$$ K_{n}(z,w) = \sum _{j=0}^{n} p_{j} (z) \overline{p_{j}(w)}, $$

characterized by the identity:

$$ \langle K_{n}(z,w), f(w) \rangle = \left \{ \textstyle\begin{array}{c@{\quad }r} f(z), &f \in \mathbb{C}_{n}[z], \\ 0, & {\mathrm{deg}} f >n. \end{array}\displaystyle \right . $$

The Christoffel function of order \(n\) is

$$ \Lambda _{n} (z) = \frac{1}{K_{n}(z,z)} = \inf \{ \| f \|^{2}, \ f(z) =1, \ \deg f \leq n \}. $$

These numerical values decrease with \(n\), and the limit

$$ \Lambda ^{\mu }(z) = \Lambda (z) = \inf _{n} \Lambda _{n} (z) $$

is known as the Christoffel function associated to the measure \(\mu \), evaluated at the point \(z\).
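For orientation, here is a minimal sketch of the monotone limit in a case where everything is explicit. It assumes \(\mu \) is area measure on the unit disk, for which \(p_{j}(z) = \sqrt{(j+1)/\pi }\, z^{j}\); this test measure and the name `christoffel_disk` are choices made for the example only.

```python
import math


def christoffel_disk(z, n):
    """Lambda_n(z) = 1 / K_n(z, z) for area measure on the unit disk,
    using p_j(z) = sqrt((j + 1) / pi) * z**j."""
    r2 = abs(z) ** 2
    kernel = sum((j + 1) / math.pi * r2 ** j for j in range(n + 1))
    return 1.0 / kernel


# Inside the disk the values decrease to pi * (1 - |z|^2)^2 > 0.
values = [christoffel_disk(0.5, n) for n in range(60)]
limit_inside = math.pi * (1 - 0.5 ** 2) ** 2

# Outside the closed disk the values decrease to zero.
outside = christoffel_disk(1.5, 60)
```

In this example the limit is strictly positive exactly at the points of the open disk, and it vanishes at points outside the closed disk.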

Incidentally, the leading coefficient of the orthogonal polynomial \(p_{n}\) has a similar variational interpretation:

$$ \frac{1}{\gamma _{n}} = \inf \{ \| z^{n} - f(z) \|, \ \deg f \leq n-1 \}, \quad n \geq 1. $$

2.2 Trace and Determinant Formulas

The analytic subnormal operator \(S = \mathit{PMP}\) satisfies the hyponormal commutator inequality \([S^{\ast }, S] \geq 0\). Indeed, for any element \(f \in P^{2}(\mu )\):

$$\begin{aligned} &\langle [S^{\ast }, S]f, f \rangle = \| S f\|^{2} - \| S^{\ast }f\|^{2} = \\ &\quad \| w f(w) \|^{2} - \| P \overline{w} f(w) \|^{2} \geq \| w f(w) \|^{2} - \| \overline{w}f(w) \|^{2} = 0. \end{aligned}$$

A theorem of Berger and Shaw [5] asserts that the above commutator is trace-class: \({\mathrm{Tr}} [S^{\ast },S] < \infty \). A different proof of the Berger-Shaw Theorem is due to Voiculescu [32], in an article putting this very trace bound in the context of general perturbation theory, with a specific link to the modulus of quasi-triangularity of a Hilbert space linear transform.

For any polynomial \(p \in \mathbb{C}[z,\overline{z}]\), one can define an ordered functional calculus (traditionally called hereditary functional calculus) \(p(S, S^{\ast })\), by arranging all powers of \(S^{\ast }\) to the left of the powers of \(S\), in every monomial. Then for a pair of polynomials \(p,q \in \mathbb{C}[z,\overline{z}]\) one proves by degree induction that the commutator \([p(S,S^{\ast }), q(S,S^{\ast })]\) is trace-class. A remarkable observation due to Helton and Howe [15] asserts that the bilinear form

$$ (p,q) \mapsto {\mathrm{Tr}}[p(S,S^{\ast }), q(S,S^{\ast })] $$

depends linearly, and it is continuous in the sense of distributions, on the Jacobian

$$ J(p,q) = \frac{\partial p}{\partial \overline{z}} \frac{\partial q}{\partial {z}} - \frac{\partial q}{\partial \overline{z}} \frac{\partial p}{\partial {z}}. $$

Further on, it was established by Carey and Pincus as a byproduct of a decade of groundbreaking discoveries [8] that there exists a Borel measurable set \(\Sigma (\mu )\) satisfying

$$ {\mathrm{Tr}}[p(S,S^{\ast }), q(S,S^{\ast })] = \frac{1}{\pi } \int _{\Sigma (\mu )} J(p,q) dA, \quad p, q \in \mathbb{C}[z,\overline{z}]. $$
(2.1)

Above, \(dA\) stands for Lebesgue measure in \(\mathbb{R}^{2}\).

Definition 2.1

The cloud of a positive Borel measure \(\mu \) with compact support in ℂ is the measurable set \(\Sigma (\mu )\) appearing in trace formula (2.1).

Note that the set \(\Sigma (\mu )\) is only determined up to area null-sets. In practice it is more appropriate to speak about the class \([\chi _{\Sigma (\mu )}]\) of its characteristic function in \(L^{1}(\mathbb{C}, dA)\).

The equivalent formulation as an infinite determinant formula goes back to the origins of Carey and Pincus work:

$$\begin{aligned} &\det [ (S-w) (S^{\ast }-\overline{z}) (S-w)^{-1}(S^{\ast }-\overline{z})^{-1}] = \\ &\quad \exp \biggl( \frac{-1}{\pi } \int _{\Sigma (\mu )} \frac{\mathrm{dA(\zeta )}}{(\zeta -w)(\overline{\zeta }-\overline{z})}\biggr), \quad |z|, |w| > \| S \|. \end{aligned}$$
(2.2)

All results above touch the surface of the theory of hyponormal operators, in particular with reference to the principal function of a semi-normal operator. We do not expand here the details, referring to the monograph [18] for complete proofs and historical comments. In the present article we focus on the reconstruction of the cloud \(\Sigma (\mu )\) contained in the polynomial convex hull of the support of the measure \(\mu \). Note for instance that

$$ {\mathrm{Tr}}[S^{\ast }, S] = \frac{1}{\pi } {\mathrm{Area}}\ \Sigma (\mu ), $$

hence the operator \(S\) is normal if and only if \({\mathrm{Area}} \ \Sigma (\mu ) =0\). We will see shortly that this in turn is equivalent to the density of complex polynomials in \(L^{2}(\mu )\), a central topic in approximation theory.
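The normalization \(1/\pi \) in the identity above can be checked by hand on a weighted shift. The sketch below assumes \(\mu \) is area measure on the unit disk, so that \(S p_{k} = \sqrt{(k+1)/(k+2)}\, p_{k+1}\) and the cloud is the disk itself; it sums the diagonal entries of \([S^{\ast a} S^{a-1}, S]\), whose trace should equal \(\frac{1}{\pi }\int _{\mathbb{D}} a |z|^{2a-2} dA = 1\) for every \(a \geq 1\). The test measure and all function names are assumptions of this example.

```python
from fractions import Fraction


def bergman_weight_sq(k):
    """|h_{k+1,k}|^2 = (k + 1) / (k + 2) for area measure on the unit disk."""
    return Fraction(k + 1, k + 2)


def diag_entry(a, n):
    """n-th diagonal entry of [S*^a S^(a-1), S] in the basis (p_n)."""
    def c(m):
        # c(m) = <S*^a S^a p_m, p_m>: product of a consecutive squared weights.
        if m < 0:
            return Fraction(0)
        prod = Fraction(1)
        for k in range(m, m + a):
            prod *= bergman_weight_sq(k)
        return prod
    # <S S*^a S^(a-1) p_n, p_n> equals c(n - 1), and vanishes for n = 0.
    return c(n) - c(n - 1)


def partial_trace(a, n_max):
    """Sum of the first n_max + 1 diagonal entries (telescopes to c(n_max))."""
    return sum(diag_entry(a, n) for n in range(n_max + 1))


# a = 1 is the self-commutator: partial sums approach Area / pi = 1.
gap = 1 - partial_trace(1, 99)  # Fraction(1, 101)
```

The partial sums equal \((N+1)/(N+a+1)\), so they increase to \(1\) at a rate of order \(1/N\) in this example.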

For the limited aim of the present work, we confine ourselves to reproducing some well-known computations specific to Hankel operators. The general theory of Hankel operators is superbly exposed in [25]. The following observations appeared in a slightly different form in [7], but they are not uncommon in the general spectral analysis of Toeplitz operators, see for instance [34].

More precisely, the self-commutator of the subnormal operator \(S = \mathit{PMP} = \mathit{MP}\) can be factored as:

$$\begin{aligned} &[S^{\ast }, S] = [\mathit{PM}^{\ast }P, \mathit{PMP}] = P M^{\ast }P M P - \mathit{PMP} M^{\ast }P = \\ &\quad P M^{\ast }M P - \mathit{PMP} M^{\ast }P = P M M^{\ast }P - \mathit{PMP} M^{\ast }P = \\ &\quad P M (I-P) M^{\ast }P = [P, M] [M^{\ast }, P] = [P, M] [P, M]^{\ast }. \end{aligned}$$

Note that \(\mathit{MP} = \mathit{PMP}\) and \(\mathit{PM}^{\ast }= \mathit{PM}^{\ast }P\). Therefore the linear operator (known as a Hankel operator)

$$ T = (I-P)M^{\ast }P : P^{2}(\mu ) \longrightarrow L^{2}(\mu ) \ominus P^{2}( \mu ) $$

is Hilbert-Schmidt: \({\mathrm{Tr}} \ T^{\ast }T < \infty \). We can express this fact in two different ways:

$$ \sum _{k=0}^{\infty }\| T p_{k} \|^{2} = \sum _{k=0}^{\infty }\| \overline{z} p_{k}(z) - P [\overline{w} p_{k}(w)](z) \|^{2} < \infty , $$
(2.3)

or, there exists a kernel function \(L \in L^{2}(\mathbb{C}\times \mathbb{C}, \mu \otimes \mu )\) satisfying

$$ [T f ](z) = \int L(z, w) f(w) d\mu (w), \quad f \in P^{2}(\mu ), $$

and in particular

$$ {\mathrm{Tr}} \ T^{\ast }T = \int |L(z,w)|^{2} d \mu (z) d \mu (w). $$

The singular numbers of \(T\) will play a central role below.

2.3 Function Theory

The fine structure of the subnormal operator \(S = S_{\mu }\) was elucidated by a rather recent discovery by Jim Thomson. We state in full the main theorem.

Theorem 2.2

Thomson

Let \(\mu \) be a positive Borel measure, compactly supported on ℂ. There exists a Borel partition \(\Delta _{0}, \Delta _{1}, \ldots \) of the closed support of \(\mu \) with the following properties:

1) \(P^{2}(\mu ) = L^{2}(\mu _{0}) \oplus P^{2}(\mu _{1}) \oplus P^{2}( \mu _{2}) \oplus \ldots \), where \(\mu _{j} = \mu |_{\Delta _{j}}\), \(j \geq 0\);

2) Every operator \(S_{\mu _{j}}, \ j \geq 1\), is irreducible with spectral picture:

$$ \sigma (S_{\mu _{j}}) \setminus \sigma _{\mathrm{ess}}(S_{\mu _{j}}) = G_{j}, \quad \text{simply connected}, $$

and

$$ {\mathrm{supp}} \mu _{j} \subset \overline{G_{j}}, \quad j \geq 1; $$

3) If \(\mu _{0} = 0\), then any element \(f \in P^{2}(\mu )\) which vanishes \([\mu ]\)-a.e. on \(G = \cup _{j} G_{j}\) is identically zero.

The proof appeared in [31], for \(L^{p}\) spaces, \(1 \leq p < \infty \). A conceptually simpler proof appears in [6]. The central position of Thomson’s Theorem was immediately recognized in the monograph [9] (published almost simultaneously with the original article).

Corollary 2.3

The cloud \(\Sigma (\mu )\) of a positive measure \(\mu \) with compact support in ℂ is empty if and only if the complex polynomials are dense in \(L^{2}(\mu )\). In case \(\Sigma (\mu )\) is non-empty, it contains interior points: \(G_{j} \subset \Sigma (\mu )\), \(j \geq 1\).

The operator \(S_{\mu _{0}} = M_{z} \in {\mathcal{L}}(L^{2}(\mu _{0}))\) is the single normal component of \(S_{\mu }\) while the summands \(S_{\mu _{j}}\) collect the non-normal behavior of \(S_{\mu }\). For every index \(j, j \geq 1\), the theorem asserts that \(S_{\mu _{j}}\) admits a continuum of eigenvalues of multiplicity one, filling the simply connected open set \(G_{j}\):

$$ \lambda \in G_{j} \quad \Rightarrow \quad [\ker (S_{\mu _{j}} -\lambda ) = 0, \ \dim \ker (S_{\mu _{j}}^{\ast }- \overline{\lambda }) = 1]. $$

Moreover, it is known that the range of \(S_{\mu _{j}}-\lambda \) is closed for such spectral parameters. The corresponding eigenvectors span \(P^{2}(\mu _{j})\), to the extent that this functional Hilbert space carries a reproducing kernel. This feature is detected by the local Christoffel function:

$$ \Lambda ^{\mu _{j}}(\lambda ) : = \inf \{ \|f \|^{2}_{\mu _{j}}; f \in \mathbb{C}[z], \ f(\lambda ) =1\} >0, $$

and ultimately by the full Christoffel function:

$$ \Lambda (\lambda ) = \Lambda ^{\mu }(\lambda ) = \inf \{ \|f \|^{2}_{ \mu }; f \in \mathbb{C}[z], \ f(\lambda ) =1\} >0, \quad \lambda \in G_{j}, \ j \geq 1. $$
(2.4)

Corollary 2.4

The cloud of a positive measure with compact support \(\mu \) is non-empty if and only if there exists a point \(\lambda \in \mathbb{C}\) satisfying \(\Lambda ^{\mu }(\lambda ) > 0\).

Szegö’s theory of orthogonal polynomials on the unit circle \(\mathbb{T}\) provides a clear cut picture: if \({\mathrm{supp}}(\mu ) \subset \mathbb{T}\), then \(\Sigma (\mu )\) is either empty or equal to the full disk \(\overline{\mathbb{D}}\). The latter case is characterized by Szegö’s condition:

$$ \int _{\mathbb{T}}\biggl| \log \frac{d \mu }{d \theta }\biggr| d \theta < \infty , $$

or equivalently by the Christoffel function test: for at least one point \(\lambda \in \mathbb{D}\) one has

$$ \Lambda ^{\mu }(\lambda ) >0, $$

and then the same is true for all \(z \in \mathbb{D}\). See for instance [1].

Another, related and notable particular case is represented by measures \(\mu \) subject to the finiteness condition

$$ {\mathrm{rank}} [S^{\ast }, S] < \infty . $$

This situation was fully analyzed by McCarthy and Yang [19, 20] with the following conclusion: Thomson’s decomposition admits finitely many irreducible summands (that is \(0 \leq j \leq n\)) with the normal part represented by a positive measure \(\mu _{0}\) which is singular with respect to harmonic measure on every \(G_{j}, 1 \leq j \leq n\), and does not put mass on \(\cup _{j=1}^{n} G_{j}\); in addition every simply connected component \(G_{j}\) is a quadrature domain. In their theorem a complete description of the measures \(\mu _{j}, 1 \leq j \leq n\), is provided. Moreover, under this finite rank condition, the intersection of any two distinct sets \(\overline{G_{j}} \cap \overline{G_{k}}\) contains at most one single point, and the union \(F = \cup _{j=1}^{n} \overline{G_{j}}\) is polynomially convex. In other terms the complement \(\mathbb{C}\setminus F\) is connected.

A bounded open set \(\Omega \) of the complex plane is called a quadrature domain for analytic functions if there is a distribution of finite support \(\tau \) in \(\Omega \) (combination of point masses and their derivatives), such that

$$ \int _{\Omega }f {\mathrm{d A}} = \tau (f), \quad f \in L^{1}_{a}(\Omega , { \mathrm{d A}}), $$

where \(L^{1}_{a}(\Omega , {\mathrm{d A}})\) stands for the space of analytic functions in \(\Omega \) which are Lebesgue integrable. The order of a quadrature domain is the number of nodes, counting multiplicity, in the above cubature formula. A connected quadrature domain has an irreducible real algebraic boundary. The simplest example is of course a disk. Indeed, on the unit disk

$$ \int _{\mathbb{D}}f(z) dA(z) = \pi f(0), \quad f \in L^{1}_{a}( \mathbb{D}, {\mathrm{d A}}) $$

by Gauss' mean value theorem. Any simply connected quadrature domain is a conformal image of the disk by a rational function, for instance a cardioid or a connected lemniscate of degree four. An informative and accessible survey of quadrature domains, with ramifications to potential theory, inverse geophysical problems, quantum physics and operator theory, is [12].
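The quadrature identity on the unit disk is easy to test numerically. The following sketch uses a midpoint rule in polar coordinates; the grid sizes, the integrands and the name `disk_integral` are illustrative assumptions. It also shows that the identity fails for a non-analytic integrand.

```python
import cmath
import math


def disk_integral(f, n_r=200, n_t=64):
    """Midpoint rule in polar coordinates for the area integral of f over the unit disk."""
    total = 0j
    for i in range(n_r):
        r = (i + 0.5) / n_r
        for k in range(n_t):
            t = 2 * math.pi * (k + 0.5) / n_t
            total += f(r * cmath.exp(1j * t)) * r  # r is the polar Jacobian
    return total * (1.0 / n_r) * (2 * math.pi / n_t)


def analytic(z):
    return cmath.exp(z) + 3 * z ** 2


def not_analytic(z):
    return abs(z) ** 2


approx = disk_integral(analytic)
exact = math.pi * analytic(0)              # one node at the origin, weight pi
control = disk_integral(not_analytic)      # close to pi / 2, not pi * 0
```

For the analytic integrand the angular average at each radius already equals the value at the origin, which is why a coarse grid suffices here.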

We record these facts under a precise statement.

Proposition 2.5

Let \(\mu \) be a positive Borel measure with compact support in ℂ. The cloud \(\Sigma (\mu )\) appearing in trace formula (2.1) contains every open set \(G_{j}, \ j \neq 0\), appearing in Thomson’s decomposition of the operator \(S_{\mu }\).

In case \({\mathrm{rank}} [S^{\ast }_{\mu }, S_{\mu }] < \infty \) there are only finitely many summands (\(1 \leq j \leq n\)) and the cloud \(\Sigma (\mu )\) coincides up to an area null-set with the quadrature domain \(\Omega = G_{1} \cup \ldots \cup G_{n}\).

The nature of the measure \(\mu \) is also quite explicit in the case of finite rank self-commutator, as follows. Let \(r : \mathbb{D}\longrightarrow \Omega \) denote a conformal rational map onto a bounded quadrature domain \(\Omega \). Let \(\tau \) be an absolutely continuous measure with respect to harmonic measure \(\omega \) for \(\Omega \), supported on \(\partial \Omega \) and satisfying

$$ \int _{\partial \Omega } \biggl|\log \biggl(\frac{ d\tau }{d\omega }\biggr)\biggr| d\omega < \infty . $$

Let \(\mu = \tau + \nu \), where \(\nu \) is a positive, finitely supported measure on \(\Omega \). Then complex polynomials are not dense in \(L^{2}(\mu )\) and the operator \(S_{\mu }\) is a typical irreducible subnormal operator possessing finite rank self-commutator, cf. Theorem 1.12 in [20]. In this case \(\overline{\Omega }\) coincides with the cloud \(\Sigma (\mu )\).

3 Finite Rank Approximation

3.1 Commutator Inequalities

Let \(\phi \in {\mathcal{C}}^{\infty }(\mathbb{C})\). The operator of multiplication by \(\phi \) on \(L^{2}(\mu )\) is denoted \(M_{\phi }\) or by the same letter if there is no confusion. Obviously \(M_{\phi }\) is bounded. Jacobi’s identity:

$$ [M^{\ast }, [\phi , P]] + [\phi , [P,M^{\ast }]] + [P, [M^{\ast }, \phi ]] =0, $$

together with \([M^{\ast }, \phi ] = 0\) (multiplication operators commute), yields

$$ [M^{\ast }, [\phi , P]] = [\phi , [M^{\ast }, P]]. $$

In other terms, for an element \(f \in L^{2}(\mu )\) we find:

$$\begin{aligned} &\overline{z} ([\phi , P] f(w) ) (z) - ([\phi , P] \overline{w} f(w))(z) = \\ &\quad \phi (z) ([M^{\ast }, P] f(w) ) (z) - ([M^{\ast }, P] \phi (w) f(w))(z). \end{aligned}$$

From here we deduce at the level of integral kernels:

$$ ([\phi , P] f)(z) = \int L(z,w) \frac{ \phi (z)-\phi (w)}{\overline{z} - \overline{w}} f(w) d \mu (w), \quad f \in L^{2}(\mu ). $$
(3.1)

In particular, the commutator \([\phi , P]\) is Hilbert-Schmidt for every smooth function \(\phi \). Observe that for \(\phi \) complex analytic and \(f \in P^{2}(\mu )\) one has \([\phi , P] f = \phi P f - P (\phi f) = \phi f - \phi f = 0\).

One step further, we can write explicitly the integral kernel of the commutator appearing in trace formula (2.1). Given a polynomial \(p \in \mathbb{C}[z,\overline{z}]\), one may either compute \(p(S,S^{\ast })\) via the ordered functional calculus or take \(P M_{p} P\); the two results differ by a trace-class operator:

$$ p(S,S^{\ast }) - P M_{p} P \in {\mathcal{C}}_{1}. $$

Hence the trace of \([p(S,S^{\ast }), S]\) is not affected by such a rearrangement of terms. Pure algebra yields:

$$\begin{aligned} &[P M_{p} P, S] = P M_{p} M P - P M P M_{p} P = \\ &\quad P (M M_{p} - M P M_{p}) P = P M (I-P) M_{p} P = [P, M] [ p, P]. \end{aligned}$$

Summing up,

$$ {\mathrm{Tr}} [p(S,S^{\ast }), S] = \iint | L(z,w)|^{2} \frac{ p(z,\overline{z})-p(w,\overline{w})}{\overline{z} - \overline{w}} d \mu (z) d\mu (w), \quad p \in \mathbb{C}[z, \overline{z}]. $$
(3.2)

The above formula extends to all Toeplitz operators \(T_{\phi }= P M_{\phi }P\) with a smooth symbol. In addition, the trace formula (2.1) yields:

$$\begin{aligned} &{\mathrm{Tr}} [T_{\phi }, S] = \frac{1}{\pi } \int _{\Sigma (\mu )} \frac{\partial \phi }{\partial \overline{z}} dA = \\ &\quad \iint | L(z,w)|^{2} \frac{ \phi (z)-\phi (w)}{\overline{z} - \overline{w}} d \mu (z) d \mu (w), \quad \phi \in {\mathcal{C}}^{\infty }(\mathbb{C}). \end{aligned}$$

One can deduce from above a few qualitative properties of the integral kernel \(L(z,w)\), by exploiting the distributional sense of the above identity. In the present article we are concerned only with some approximation estimates. The ad-hoc notation

$$ (\Delta \phi )(z,w) = \frac{ \phi (z)-\phi (w)}{\overline{z} - \overline{w}} $$

is adopted in the following pages, as well as the Besov type semi-norm

$$ | \phi |_{F} = \inf _{p \in \mathbb{C}[z]} \| \Delta ( \phi - p) \|_{ \infty , F \times F}, $$

where \(F\) is a closed subset of the complex plane. Identity (3.1) implies, for every smooth function \(\phi \):

$$ \| [\phi , P] \|_{\mathit{HS}} \leq \| L \|_{2, \mu \times \mu } | \phi |_{{ \mathrm{supp}} (\mu )}. $$

Let \(Q\) denote an orthogonal projection of \(L^{2}(\mu )\) onto a closed subspace of \(P^{2}(\mu )\). For a fixed smooth function \(\phi \) one finds, along the above computations:

$$ {\mathrm{Tr}}\ Q [T_{\phi }, S] Q = {\mathrm{Tr}}\ Q [S, P] [T_{\phi }, P] = {\mathrm{Tr}} \ Q [S, P] [\phi , P] = {\mathrm{Tr}} \ Q [M, P] [\phi , P]. $$

The truncated operator \([M, P]^{\ast }Q\) has finite rank and it is represented by an integral kernel:

$$ ([M, P]^{\ast }Q f) (z) = \int L_{Q}(z,w) f(w) d \mu (w), \quad f \in P^{2}( \mu ). $$

More importantly,

$$\begin{aligned} &|{\mathrm{Tr}} \ Q [T_{\phi }, S] Q - {\mathrm{Tr}} \ [T_{\phi }, S] | = | {\mathrm{Tr}} (I-Q)[M,P] [\phi , P] | \leq \\ &\quad \| (I-Q)[M,P] \|_{\mathit{HS}} \| [\phi , P] \|_{\mathit{HS}} \leq \| (I-Q)[M,P] \|_{\mathit{HS}} \| [M^{\ast }, P] \|_{\mathit{HS}} | \phi |_{{\mathrm{supp}} (\mu )}. \end{aligned}$$

Recall also that

$$ \| [M^{\ast }, P] \|^{2}_{\mathit{HS}} = {\mathrm{Tr}} [S^{\ast }, S] = \frac{ {\mathrm{Area}} \Sigma (\mu )}{\pi }. $$

In conclusion we have proved the following result.

Theorem 3.1

Let \(\mu \) be a positive Borel measure with compact support in ℂ and let \(\phi \in {\mathcal{C}}^{\infty }(\mathbb{C})\). For an orthogonal projection \(Q\) of \(L^{2}(\mu )\) onto a closed subspace of \(P^{2}(\mu )\) the estimate

$$ |{\mathrm{Tr}} \ Q [T_{\phi }, S] Q - {\mathrm{Tr}} \ [T_{\phi }, S] | \leq \sqrt{ \frac{ {\mathrm{Area}} \Sigma (\mu )}{\pi }} \| (I-Q)[M,P] \|_{\mathit{HS}} \ |\phi |_{{ \mathrm{supp}} (\mu )}. $$
(3.3)

holds true.

Remark that one can express the distortion factor above in function space norm:

$$ \| (I-Q) [M,P] \|_{\mathit{HS}} = \| L - L_{Q} \|_{2, \mu \otimes \mu }. $$

The particular case of a finite rank projection \(Q\) onto a subspace of polynomials will be of interest in the next sections.

The Hankel operator \(T = [M^{\ast }, P] = (I-P) M^{\ast }P: P^{2}(\mu ) \longrightarrow L^{2}( \mu ) \ominus P^{2}(\mu )\) is Hilbert-Schmidt. Its Schmidt expansion

$$ (I-P)M^{\ast }P= \sum _{j=0}^{\infty }\kappa _{j} g_{j} \langle \cdot , f_{j} \rangle $$
(3.4)

identifies the singular numbers \((\kappa _{j})\) together with the eigenvectors of its modulus:

$$ [S^{\ast }, S] f_{j} = T^{\ast }T f_{j} = \kappa _{j}^{2} f_{j}, \quad j \geq 0. $$

By convention we include here the null vectors of \(T\), so that \((f_{j})_{j=0}^{\infty }\) is an orthonormal basis of \(P^{2}(\mu )\). Similarly, the \(g_{j}\) are eigenvectors of \(T T^{\ast }\), forming an orthonormal basis of \(P^{2}(\mu )^{\perp }\). The Courant-Fischer min-max principle implies the following bound.

Corollary 3.2

Under the conditions of Theorem 3.1, let \(\kappa _{0} \geq \kappa _{1} \geq \ldots \geq 0\) denote the singular numbers of Hankel’s operator \([M^{\ast }, P] = (I-P)M^{\ast }P\). If \(Q\) is a projection of finite rank \(d\) onto a subspace of \(P^{2}(\mu )\), then

$$ \| [M^{\ast }, P] (I-Q) \|^{2}_{\mathit{HS}} \geq \sum _{j \geq d} \kappa _{j}^{2}, $$

and equality is attained if \(Q\) is the projection onto the span of the vectors \(f_{0}, f_{1}, \ldots , f_{d-1}\).
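To get a feeling for the size of the tail \(\sum _{j \geq d} \kappa _{j}^{2}\), consider again, as an assumption of this example only, area measure on the unit disk. There \([S^{\ast }, S]\) is diagonal in the basis \((p_{j})\), so \(f_{j} = p_{j}\) and \(\kappa _{j}^{2} = |h_{j+1,j}|^{2} - |h_{j,j-1}|^{2}\). The sketch below evaluates truncated tails in exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction


def kappa_sq(j):
    """kappa_j^2 = |h_{j+1,j}|^2 - |h_{j,j-1}|^2 for the unit-disk area measure."""
    upper = Fraction(j + 1, j + 2)
    lower = Fraction(j, j + 1)  # equals 0 when j == 0
    return upper - lower


def tail(d, cutoff):
    """sum over d <= j <= cutoff of kappa_j^2, a truncation of the optimal bound."""
    return sum(kappa_sq(j) for j in range(d, cutoff + 1))


# The sum telescopes: tail(d, N) = (N + 1) / (N + 2) - d / (d + 1),
# so the full tail equals 1 / (d + 1).
example = tail(4, 999)
```

In this example \(\kappa _{j}^{2} = 1/((j+1)(j+2))\), so the best rank \(d\) truncation leaves a squared Hilbert-Schmidt error of exactly \(1/(d+1)\).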

Although it might be inaccessible numerically to have the complete Schmidt expansion of the operator \(T\), we pretend for a moment that expansion (3.4) is known. Let \(p \in \mathbb{C}[z,\overline{z}]\), so that

$$ [P M_{p} P, S] = P M (I-P) M_{p} P $$

as before. Then

$$ \langle P M (I-P) M_{p} P f_{j}, f_{j} \rangle = \langle p f_{j} , T f_{j} \rangle = \kappa _{j} \langle p f_{j} , g_{j} \rangle $$

hence

$$ {\mathrm{Tr}} [P M_{p} P, S] = \lim _{n} \sum _{j=0}^{n} \kappa _{j} \int p(z, \overline{z}) f_{j}(z) \overline{g_{j}(z)} d\mu (z). $$

This observation is effective in case \({\mathrm{rank}} [S_{\mu }^{\ast }, S_{\mu }] < \infty \) when

$$ \sum _{j=0}^{\infty }\kappa _{j} f_{j}(z) \overline{g_{j}(z)} = \sum _{j=0}^{d} \kappa _{j} f_{j}(z) \overline{g_{j}(z)} = \overline{ L(z,z)} $$

is an integrable function with respect to \(\mu \).

These computations lead to a theoretical result, illustrating the key role played by the asymptotics of the singular numbers of the operator \(T\).

Theorem 3.3

Let \(\mu \) be a positive Borel measure with compact support in ℂ and let \(p \in \mathbb{C}[z,\overline{z}]\). Assume that the singular numbers of Hankel’s operator \([M^{\ast },P]\) are in \(\ell ^{1}\). Then \(L(z,z) \in L^{1}(\mu )\) and

$$ \frac{1}{\pi } \int _{\Sigma (\mu )} \frac{\partial p}{\partial \overline{z}} (z,\overline{z}) dA(z) = \int p(z,\overline{z}) \overline{L(z,z)} d\mu (z). $$
(3.5)

Needless to say, there are ample studies and criteria assuring that a Hankel operator is trace-class, see [25].

3.2 Weak Approximation of the Integral Kernel

In general, the integral kernel \(L(z,w)\) representing the Hilbert-Schmidt operator \((I-P)M^{\ast }P : P^{2}(\mu ) \longrightarrow L^{2}(\mu ) \ominus P^{2}( \mu )\) is only an element of \(L^{2}(\mu \times \mu )\). Its structure is illuminated by an approximation in the weak topology of \(L^{2}(\mu \times \mu )\) which involves truncated Christoffel-Darboux kernels.

To this aim, let \(P_{n}\) denote the orthogonal projection of \(L^{2}(\mu )\) onto \(\mathbb{C}_{n}[z]\), fix an element \(f \in P^{2}(\mu )\) and remark that

$$ (I-P) M^{\ast }P f = \lim _{n} (I-P_{n}) M^{\ast }P f = \lim _{n} M^{\ast }P_{n+1} f - \lim _{n} P_{n} M^{\ast }P f. $$

But \(P_{n} M^{\ast }= P_{n} M^{\ast }P_{n+1}\), hence the strong operator topology convergence

$$ (I-P) M^{\ast }P f = \lim _{n} ( M^{\ast }P_{n+1} - P_{n} M^{\ast }) f $$

holds true.

Proposition 3.4

The integral kernel \(L(z,w)\) representing the Hankel operator \((I-P)M^{\ast }P\) admits the approximation:

$$ \lim _{n} [ \overline{z} K^{\mu }_{n+1}(z,w) - \overline{w} K^{\mu }_{n}(z,w)] = L(z,w) $$
(3.6)

in the weak topology of \(L^{2}(\mu \times \mu )\).

This observation is consistent with

$$ \lim _{n} \frac{\partial M^{\ast }P_{n+1} f}{\partial \overline{z}} = \frac{\partial (I-P)M^{\ast }P f}{\partial \overline{z}} = Pf $$

in the even weaker sense of distributions.

3.3 Polynomial Approximation

We specialize below the finite rank approximation of the commutator \([M^{\ast }, P]\) to the filtration by polynomial subspaces \(\mathbb{C}_{n}[z]\) labelled by the degree. The notation introduced in the previous sections is unchanged: \(P_{n}\) is the orthogonal projection onto \(\mathbb{C}_{n}[z]\), the Christoffel-Darboux kernel of order \(n\) is \(K_{n}(z,w)\), and the Hessenberg matrix associated to the orthogonal polynomials \((p_{j})\) is \([h_{jk}]\).

The rate of convergence in Hilbert-Schmidt norm of \([M^{\ast },P]P_{n}\) to \([M^{\ast }, P]\) controls, via Theorem 3.1, the convergence of the finite central truncations in the trace formula. For a fixed degree \(j \geq 0\) one finds

$$\begin{aligned} &\| [M^{\ast }, P] p_{j} \|^{2} = \| (I-P) M^{\ast }p_{j} \|^{2} = \| \overline{z} p_{j} \|^{2} - \| P ( \overline{z} p_{j} ) \|^{2} = \| z p_{j} \|^{2} - \| P ( \overline{z} p_{j} ) \|^{2} = \\ &\quad \sum _{k \leq j+1} |h_{kj}|^{2} - \sum _{k \geq j-1} |h_{jk}|^{2}. \end{aligned}$$

That is, one obtains the squared norm of the \(j\)-th column of the Hessenberg matrix minus the squared norm of the \(j\)-th row. Therefore

$$ \| [M^{\ast },P]P_{n} \|^{2}_{\mathit{HS}} = \sum _{j \leq n} \| [M^{\ast }, P] p_{j} \|^{2} = |h_{n+1,n}|^{2} - \sum _{j \leq n< k} |h_{jk}|^{2}. $$
(3.7)

We know that as a function of \(n\) this is an increasing sequence, converging to \({\mathrm{Tr}} [S^{\ast }, S] = \frac{ {\mathrm{Area}} \Sigma (\mu )}{\pi }\).
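
As a numerical illustration of (3.7), the following Python sketch builds a finite section of the Hessenberg matrix from a truncated Gram matrix of moments, using a Cholesky factorization. The test measure (area measure on a disk of centre `c` and radius `R`), the helper names `disk_moment` and `hessenberg_from_gram`, and the choice of Cholesky factorization are illustrative assumptions, not part of the text. For this particular measure the entries above the diagonal vanish, so the right-hand side of (3.7) should reduce to \(|h_{n+1,n}|^{2} = R^{2} \frac{n+1}{n+2}\), increasing to \(R^{2} = \frac{{\mathrm{Area}}}{\pi }\).

```python
import numpy as np
from math import comb, pi

def disk_moment(j, k, c, R):
    """<z^j, z^k> for area measure on the disk |z - c| < R (assumed test measure)."""
    return sum(comb(j, a) * comb(k, a) * c ** (j - a) * np.conj(c) ** (k - a)
               * pi * R ** (2 * a + 2) / (a + 1)
               for a in range(min(j, k) + 1))

def hessenberg_from_gram(G):
    """Return H with H[k, i] = <z p_i, p_k>, 0 <= k <= n+1, 0 <= i <= n,
    from the Gram matrix G[j, k] = <z^j, z^k>, 0 <= j, k <= n+1."""
    L = np.linalg.cholesky(G)          # G = L L^H, so z^j = sum_k L[j, k] p_k
    C = np.linalg.inv(L)               # p_i = sum_j C[i, j] z^j
    n1 = G.shape[0] - 1                # n + 1
    return (C[:n1, :n1] @ L[1:, :]).T  # shape (n+2, n+1)

c, R, n = 0.3 + 0.2j, 1.5, 5
G = np.array([[disk_moment(j, k, c, R) for k in range(n + 2)]
              for j in range(n + 2)])
H = hessenberg_from_gram(G)

# Sub-diagonal entries h_{i+1,i}; here the upper entries vanish, so their
# squares are the right-hand side of (3.7).
sub = np.array([H[i + 1, i].real for i in range(n + 1)])
hs_norms = sub ** 2
print(hs_norms, R ** 2)
```

Gram matrices of monomials become ill-conditioned quickly, so this direct approach is only meant for small truncation orders.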

At the level of integral formulas one puts in motion the Christoffel-Darboux kernel. Denoting by \(L_{n}(z,w)\) the integral kernel of \([M^{\ast },P]P_{n}\) one finds

$$ L_{n}(z,w) = \int L(z,\zeta ) K_{n}(\zeta ,w) d\mu (\zeta ), \quad n \geq 0, $$

and one step further, approximating the full projection \(P\) by \(P_{N}\) with \(N \rightarrow \infty \) we infer a formula in terms only of the moments of the original measure \(\mu \).

Proposition 3.5

The integral kernel \(L_{n}\) of the finite rank operator \([M^{\ast },P]P_{n}\) admits the representation

$$\begin{aligned} &L_{n}(z,w) = \lim _{N} L_{N,n}(z,w), \\ &L_{N,n}(z,w) : = \int K_{N}(z,\zeta ) (\overline{z} - \overline{\zeta }) K_{n}(\zeta , w) d\mu (\zeta ) \end{aligned}$$
(3.8)

with the error estimate

$$\begin{aligned} &\| L_{n}(z,w) - L_{N,n}(z,w) \|^{2}_{2, \mu \otimes \mu } = \\ &\quad \| (P_{N} - P) M^{\ast }P_{n} \|^{2}_{\mathit{HS}} = \sum _{j \leq n< N < k} |h_{jk}|^{2}. \end{aligned}$$
(3.9)

Given a polynomial \(R(z,\overline{z}) \in \mathbb{C}[z, \overline{z}]\) it will be convenient to separate the powers of \(\overline{z}\):

$$ R(z,\overline{z}) = R_{0}(z) + R_{1}(z) \overline{z} + \ldots + R_{d}(z) \overline{z}^{d}. $$

The identity

$$ [ R(S,S^{\ast }), S] = [S^{\ast }, S] R_{1}(S) + [S^{\ast 2}, S] R_{2}(S) + \ldots + [S^{\ast d}, S] R_{d}(S) $$

implies, thanks to the relation \(\mathit{PMP} = \mathit{MP}\) and the cyclic invariance of trace:

$$ {\mathrm{Tr}} [ R(S,S^{\ast }), S] = {\mathrm{Tr}} \ P M (I-P) M^{\ast }R_{1}(M) + \ldots + {\mathrm{Tr}} \ P M (I-P) M^{\ast d} R_{d}(M). $$

Denoting

$$ \tilde{R}(\overline{z}; w, \overline{w}) = \sum _{j=1}^{d} \frac{\overline{z}^{j} - \overline{w}^{j}}{\overline{z} - \overline{w}} R_{j}(w), $$

formula 3.2 becomes:

$$ {\mathrm{Tr}} [ R(S,S^{\ast }), S] = \iint |L(z,w)|^{2} \tilde{R}( \overline{z}; w, \overline{w}) d\mu (z) d\mu (w). $$

Notice that \(\tilde{R}\) is a polynomial in all variables and its restriction to the diagonal \(z=w\) coincides with \(\frac{\partial R}{\partial \overline{z}} (z,\overline{z})\).
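
As a quick worked check (an illustration, not taken from the source), take \(R(z,\overline{z}) = z \overline{z}^{2}\), so that \(d=2\), \(R_{2}(z) = z\) and \(R_{0} = R_{1} = 0\). Then

$$ \tilde{R}(\overline{z}; w, \overline{w}) = \frac{\overline{z}^{2} - \overline{w}^{2}}{\overline{z} - \overline{w}} w = (\overline{z} + \overline{w}) w, $$

and on the diagonal \(z = w\) this equals \(2 z \overline{z} = \frac{\partial R}{\partial \overline{z}} (z,\overline{z})\), as expected.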

The main result follows. We denote by \(\hat{\sigma }\) the polynomial convex hull of a closed set \(\sigma \subset \mathbb{C}\).

Theorem 3.6

Let \(\mu \) be a positive Borel measure with compact support \(\sigma \) in ℂ, with associated Hessenberg matrix \((h_{jk})_{j,k=0}^{\infty }\) and Christoffel-Darboux kernels \(K_{n}(z,w), n \geq 0\). The moments of the area measure supported by the cloud of \(\mu \) can be computed by the formula:

$$\begin{aligned} & \frac{1}{\pi } \int _{\Sigma (\mu )} \frac{\partial R}{\partial \overline{z}} (z,\overline{z}) dA(z) = \\ &\quad \lim_{n} \lim_{N} \int R(z,\overline{z}) \biggl[z K_{n} (z,z) - \int K_{n} (z,\zeta )\zeta K_{N}(\zeta , z) d\mu (\zeta )\biggr] d\mu (z) \end{aligned}$$
(3.10)

where \(R \in \mathbb{C}[z,\overline{z}]\).

For fixed values of \(n < N\) the error \(\epsilon _{N,n} = \epsilon _{N,n}(\mu )\) in the above limit satisfies:

$$ \epsilon _{N,n}^{2} \leq \frac{ {\mathrm{Area}}\ \hat{\sigma }}{\pi } \| \tilde{R}\|^{2}_{\infty , \sigma \times \sigma } \biggl[ \biggl(\sum _{j >n} s_{j} \biggr) + \sum _{j \leq n< N < k} |h_{jk}|^{2} \biggr], $$

where

$$ s_{j} = h_{j+1,j}^{2} - h_{j,j-1}^{2} + \sum _{\ell < j < k} (|h_{ \ell j}|^{2}- |h_{jk}|^{2}) $$

and \(\pi \sum _{j=0}^{\infty }s_{j} = {\mathrm{Area}}\ \Sigma (\mu )\).

Proof

By relaxing the two projectors \(P\) appearing in the trace formula

$$\begin{aligned} &\frac{1}{\pi } \int _{\Sigma (\mu )} \frac{\partial R}{\partial \overline{z}} (z,\overline{z}) dA(z) = { \mathrm{Tr}} [P M_{R} P, P M P] = \\ &\quad {\mathrm{Tr}} P M (I-P) M_{R} P = {\mathrm{Tr}} P M (I-P) M_{R} , \end{aligned}$$

one finds the approximate values

$$ {\mathrm{Tr}} P_{n} M (I-P_{N}) M_{R} = {\mathrm{Tr}} R P_{n} M (I-P_{N}), $$

which provide the kernels (3.6) in the statement. The error bound is derived from (3.3) and the Hilbert-Schmidt norm identity

$$\begin{aligned} &\| (I-P_{N}) M^{\ast }P_{n} - (I-P) M^{\ast }P \|_{\mathit{HS}}^{2} = \\ &\quad \| (P-P_{N}) M^{\ast }P_{n} \|^{2}_{\mathit{HS}} + \| (I-P) M^{\ast }(P-P_{n}) \|^{2}_{\mathit{HS}}. \end{aligned}$$

The expressions in terms of Hessenberg matrix entries are consequences of identities (3.7) and (3.9). □

The Helton and Howe trace formula (2.1) is coordinate free and, moreover, invariant under additive Hilbert-Schmidt perturbations \(S+L\) of the operator \(S\), provided the self-commutator \([S^{\ast }+ L^{\ast }, S+L]\) remains trace-class [32]. In particular, an adapted sequence of finite rank orthogonal projections converging strongly to the identity operator may well improve the estimates in Theorem 3.6. We discuss an example in this direction.

Corollary 3.7

Let \(\mu \) be a positive measure of compact support and let \(\nu \) be a finite atomic measure supported by the complement of the polynomial hull of \({\mathrm{supp}} \mu \). There exists a constant \(\rho >1\) with the property

$$ \epsilon ^{2}_{N,n}(\mu + \nu ) \leq \epsilon ^{2}_{N,n}(\mu ) + O( \rho ^{-n}), \ n \rightarrow \infty , $$
(3.11)

with the second term independent of \(N >n\).

Proof

By finite induction and normalization we can assume \(\nu \) is a Dirac mass at a point \(a\).

The map

$$ J: P^{2}(\mu + \nu ) \longrightarrow P^{2}(\mu ) \oplus P^{2}(\nu ), \quad f \mapsto (f,f), $$

is isometric and has dense range by Runge's approximation theorem. Hence \(J\) is a unitary transformation. Denote by \(\pi _{n}\) the orthogonal projection of \(P^{2}(\mu + \nu )\) onto \(\mathbb{C}_{n}[z]\).

The essential Hilbert-Schmidt norm involved in the bound of \(\epsilon _{N,n}(\mu + \nu )\) is

$$ \Biggl\| \begin{pmatrix} I - P & 0 \\ 0 & 1-1 \end{pmatrix} \begin{pmatrix} M^{\ast }& 0 \\ 0 & \overline{a}\end{pmatrix} \pi _{n} \Biggr\| _{\mathit{HS}}. $$

Fix a positive integer \(n\), and consider the orthogonal decomposition of \(J \mathbb{C}_{n}[z]\):

$$ (K^{\mu }_{n}(z,a), K^{\mu }_{n}(a,a)) \mathbb{C}\oplus J ( (z-a) \mathbb{C}_{n-1}[z]). $$

The normalized vector

$$ \biggl(\frac{K^{\mu }_{n}(z,a)}{\sqrt{K^{\mu }_{n}(a,a) + K^{\mu }_{n}(a,a)^{2}}}, \frac{K^{\mu }_{n}(a,a)}{\sqrt{K^{\mu }_{n}(a,a) + K^{\mu }_{n}(a,a)^{2}}}\biggr) $$

and an orthonormal basis of the subspace \((z-a)\mathbb{C}_{n-1}[z] \subset P^{2}(\mu )\), simultaneously orthonormal via the unitary map \(J\) in \(P^{2}(\mu ) \oplus P^{2}(\nu )\), produce the estimate

$$ \biggl\| \begin{pmatrix} (I - P)M^{\ast }& 0 \\ 0 & 0 \end{pmatrix} \pi _{n}\biggr\| ^{2}_{\mathit{HS}} \leq $$
$$ \biggl\| (I-P) M^{\ast }\frac{K^{\mu }_{n}(z,a)}{\sqrt{K^{\mu }_{n}(a,a) + K^{\mu }_{n}(a,a)^{2}}} \biggr\| ^{2} + \| (I-P)M^{\ast }P_{n}\|^{2}. $$

Since the point \(a\) does not belong to the polynomial convex hull of \({\mathrm{supp}}(\mu )\), the variational definition of the Christoffel function \(\Lambda ^{\mu }_{n} (a) \) implies

$$ \Lambda ^{\mu }_{n} (a) \leq C \rho ^{-n}, $$

for some constants \(C>0\) and \(\rho >1\). The identity

$$ \biggl\| \frac{K^{\mu }_{n}(z,a)}{\sqrt{K^{\mu }_{n}(a,a) + K^{\mu }_{n}(a,a)^{2}}} \biggr\| ^{2}_{\mu }= \frac{K^{\mu }_{n}(a,a)}{K^{\mu }_{n}(a,a) + K^{\mu }_{n}(a,a)^{2}} = \frac{\Lambda ^{\mu }_{n}(a)}{1+ \Lambda ^{\mu }_{n}(a)} $$

completes the proof. □

4 Padé Type Approximation Scheme in 2D

The reconstruction of the cloud of a measure \(\mu \) from its moments can be completed via different paths: geometric tomography, Bergman space methods [14] (if applicable), identification of potential real-algebraic boundary [17], curve fitting along the boundary [2], or by exploiting the same hyponormal operators tools invoked in the previous sections. We reproduce from our preceding article a few details on the latter approximation scheme.

Let \(g \in L^{1}_{\mathrm{comp}}(\mathbb{C}, dA)\), \(0 \leq g \leq 1\), be a fixed shade function, and think of it as the characteristic function of the shade \(\Sigma (\mu )\) of a measure. We consider the moments

$$ a_{k\ell } = \int _{\mathbb{C}}\zeta ^{k} \overline{\zeta }^{\ell }g( \zeta ) {\mathrm{dA}}(\zeta ), \quad k,\ell \geq 0, $$

given. They can be organized in the exponential of a formal generating series:

$$ E_{g}(w,z) = \exp \biggl[\frac{-1}{\pi } \sum _{k,\ell =0}^{\infty }\frac{a_{k\ell }}{w^{k+1} \overline{z}^{\ell +1}}\biggr]. $$

This is of course the power expansion at infinity of the double Cauchy integral appearing in (2.2). Denote in short:

$$ E_{g}(w,z) = \exp \biggl( \frac{-1}{\pi } \int _{\mathbb{C}} \frac{ g(\zeta ) {\mathrm{dA}}(\zeta )}{(\zeta -w)(\overline{\zeta }-\overline{z})}\biggr), \quad z,w \in \mathbb{C}, \ z\neq w. $$

We recall a few of the properties of the exponential transform \(E_{g}\):

a). The function \(E_{g}\) can be extended by continuity to \(\mathbb{C}^{2}\) by assuming the value \(E_{g}(z,z) = 0\) whenever \(\int _{\mathbb{C}}\frac{ g(\zeta ) {\mathrm{dA}}(\zeta )}{|\zeta -z|^{2}} = \infty \);

b). The function \(E_{g}(w,z)\) is analytic in \(w \in \mathbb{C}\setminus {\mathrm{supp}} (g)\) and antianalytic in \(z \in \mathbb{C}\setminus {\mathrm{supp}} (g)\);

c). The kernel \(1-E_{g}(w,z)\) is positive semi-definite in \(\mathbb{C}^{2}\);

d). The behavior at infinity contains as a first term the Cauchy transform of \(g\):

$$ E_{g}(w,z) = 1 - \frac{1}{\overline{z}} \biggl[\frac{-1}{\pi } \int _{\mathbb{C}} \frac{ g(\zeta ) {\mathrm{dA}}(\zeta )}{\zeta -w}\biggr] + O\biggl(\frac{1}{|z|^{2}}\biggr), \quad |z| \rightarrow \infty . $$

The case of a characteristic function \(g= \chi _{\Omega }\) of a bounded domain \(\Omega \) is particularly relevant for our note. In this case we simply write \(E_{\Omega }\) instead of \(E_{g}\), and we record the following properties (all proved and well commented in [11]).

1). The equation

$$ \frac{\partial E_{\Omega }(w,z)}{\partial \overline{w}} = \frac{E_{\Omega }(w,z)}{\overline{w}-\overline{z}} $$

holds for \(w \in \Omega \) and \(z \in \mathbb{C}\setminus \overline{\Omega }\);

2). The function \(E_{\Omega }(w,z)\) extends analytically/antianalytically from \((\mathbb{C}\setminus \overline{\Omega })^{2}\) across real analytic arcs of the boundary of \(\Omega \);

3). Assume \(\partial \Omega \) is piecewise smooth. Then \(z \mapsto E_{\Omega }(z,z)\) is a superharmonic function on the complement of \(\Omega \), with value 1 at infinity, vanishing on \(\overline{\Omega }\) and satisfying

$$ E_{\Omega }(z,z) \approx {\mathrm{dist}} (z, \partial \Omega ) $$

for \(z \in \mathbb{C}\setminus \overline{\Omega }\) close to \(\partial \Omega \).

The kernel \(E_{\Omega }\) is also characterized by a Riemann-Hilbert factorization, see [11]. The feature which turns the exponential transform into a suitable shape reconstruction from moments tool is its rationality on quadrature domains, in complete parallelism to the rationality of Cauchy transforms of point masses on the line, or the rationality of exponential of Cauchy transforms of union of intervals.

We reproduce the rational reconstruction procedure. Let \(d\) be a fixed integer and let \((a_{k\ell })_{k,\ell =0}^{d}\) be a non-negative matrix of potential moments of a “shade function” \(g(z), 0 \leq g \leq 1\). Consider the truncated exponential transform

$$\begin{aligned} &F(w,z) = \exp \biggl[\frac{-1}{\pi } \sum _{k,\ell =0}^{d} \frac{a_{k\ell }}{w^{k+1} \overline{z}^{\ell +1}}\biggr] = \\ &\quad 1 - \sum _{m,n=0}^{\infty }\frac{b_{mn}}{w^{m+1} \overline{z}^{n+1}}. \end{aligned}$$

It is known that \((b_{mn})_{m,n=0}^{d}\) is also a non-negative definite matrix, subject to some additional constraints [18]. A necessary and sufficient condition that \((a_{k\ell })_{k,\ell =0}^{d}\) represent the moments of a quadrature domain of order \(d\) is

$$ \det (b_{mn})_{m,n=0}^{d} =0, $$

or equivalently the existence of a monic polynomial \(P(z)\) of degree \(d\) and a rational function of the form

$$ R_{d}(w,z) = 1- \frac{\sum _{m,n=0}^{d-1} c_{mn} w^{m} \overline{z}^{n}}{P(w) \overline{P(z)}}, $$

such that, at infinity

$$ F(w,z) - R_{d}(w,z) = O \biggl( \frac{1}{w^{d+1} \overline{z}^{d}}, \frac{1}{w^{d} \overline{z}^{d+1}}\biggr). $$

The reader will recognize above a typical 2D Padé approximation scheme. Moreover, for any shade function \(g\), the exponential transform \(E_{g}\) coincides with \(E_{\Omega }\), where \(\Omega \) is a quadrature domain if and only if

$$ E_{g}(w,z) = 1- \frac{\sum _{m,n=0}^{d-1} c_{mn} w^{m} \overline{z}^{n}}{P(w) \overline{P(z)}}, \quad |z|, |w| \gg 1. $$

In this case the zeros of \(P\) coincide with the quadrature nodes, while the numerator is the irreducible defining polynomial of the boundary of \(\Omega \):

$$ \partial \Omega \subset \biggl\{ z \in \mathbb{C}; \ \sum _{m,n=0}^{d-1} c_{mn} z^{m} \overline{z}^{n} = |P(z)|^{2} \biggr\} . $$

The above Padé approximation procedure was proposed for the reconstruction of planar shapes in [13], with additional details in [11].

5 Examples

5.1 Reconstruction of a Disk via Its Exponential Transform

Simple and well known as the example below may be, it illustrates the two-dimensional Padé scheme just discussed.

A disk \(B = \{ z \in \mathbb{C}; |z-c| < R\}\) has the exponential transform

$$ E_{B}(z,w) = 1 - \frac{R^{2}}{(z-c)(\overline{w}-\overline{c})} $$

detectable from initial moments:

$$ a_{00} = \pi R^{2}, $$
$$ a_{10} = \int _{|z-c| \leq R} z dA(z) = \pi R^{2} c = \overline{a_{01}}, $$
$$ a_{11} = \int _{|z-c| \leq R} |z|^{2} dA(z) = 2\pi \int _{0}^{R} (|c|^{2} + r^{2}) r dr = \pi R^{2} |c|^{2} + \pi \frac{R^{4}}{2}. $$

The truncated exponential transform is:

$$\begin{aligned} &\exp \biggl[- \frac{R^{2}}{z \overline{w} } - \frac{R^{2} \overline{c}}{z\overline{w}^{2}} - \frac{R^{2} c}{z^{2} \overline{w}} - \frac{R^{2} |c|^{2} + \frac{R^{4}}{2}}{z^{2} \overline{w}^{2}}\biggr] = \\ &\quad 1- \frac{R^{2}}{z \overline{w} } - \frac{R^{2} \overline{c}}{z\overline{w}^{2}} - \frac{R^{2} c}{z^{2} \overline{w}} - \frac{R^{2} |c|^{2}}{z^{2} \overline{w}^{2}} + O \biggl( \frac{1}{z^{3}}, \frac{1}{\overline{w}^{3}} \biggr). \end{aligned}$$

Whence

$$ b_{00} = R^{2}, \qquad b_{10} = R^{2} c,\qquad b_{01} = R^{2} \overline{c}, \qquad b_{11} = R^{2} |c|^{2}. $$

The vanishing determinant \(b_{00}b_{11}-b_{10}b_{01} = 0\) and the linear dependence of the columns of the matrix \((b_{k\ell })_{k,\ell =0}^{1}\) identify the monic factor \(P(z) = z-c\) in the denominator \(P(z)\overline{P(w)}\) of the rational approximant of the full exponential transform. Finally, as in the one dimensional diagonal Padé approximation scheme, one finds:

$$\begin{aligned} &(z-c)(\overline{w}-\overline{c}) \biggl[1- \frac{R^{2}}{z \overline{w} } - \frac{R^{2} \overline{c}}{z\overline{w}^{2}} - \frac{R^{2} c}{z^{2} \overline{w}} - \frac{R^{2} |c|^{2}}{z^{2} \overline{w}^{2}} \biggr] = \\ &\quad (z-c)(\overline{w}-\overline{c}) - R^{2} + O \biggl(\frac{1}{z^{2}}, \frac{1}{\overline{w}^{2}}\biggr). \end{aligned}$$
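
The computation above can be replayed numerically. The following Python sketch expands the truncated exponential transform as a bivariate power series in \(X = 1/z\), \(Y = 1/\overline{w}\), reads off the matrix \((b_{mn})\), and recovers the centre and radius of the disk from the four initial moments. The helper names `mul_trunc` and `exp_transform_coeffs` and the sample values of `c` and `R` are assumptions made for illustration only.

```python
import numpy as np

def mul_trunc(A, B):
    """Product of two bivariate series truncated to the shape of A.
    A[p, q] is the coefficient of X^p Y^q, with X = 1/z, Y = 1/conj(w)."""
    n = A.shape[0]
    C = np.zeros_like(A)
    for p in range(n):
        for q in range(n):
            C[p:, q:] += A[p, q] * B[:n - p, :n - q]
    return C

def exp_transform_coeffs(a):
    """b[m, n] in F = 1 - sum b[m, n] X^(m+1) Y^(n+1), where
    F = exp(-(1/pi) sum a[k, l] X^(k+1) Y^(l+1)), 0 <= k, l <= d."""
    n = a.shape[0] + 1
    E = np.zeros((n, n), dtype=complex)
    E[1:, 1:] = -a / np.pi
    F = np.zeros((n, n), dtype=complex)
    F[0, 0] = 1.0
    term = F.copy()
    for k in range(1, n):      # E^k has no terms of degree < k in X or in Y
        term = mul_trunc(term, E) / k
        F += term
    return -F[1:, 1:]

c, R = 0.5 + 0.25j, 2.0        # sample disk (arbitrary values)
a = np.array([[np.pi * R**2, np.pi * R**2 * np.conj(c)],
              [np.pi * R**2 * c, np.pi * R**2 * abs(c)**2 + np.pi * R**4 / 2]])
b = exp_transform_coeffs(a)
det = b[0, 0] * b[1, 1] - b[1, 0] * b[0, 1]
c_rec = b[1, 0] / b[0, 0]      # zero of the monic factor P(z) = z - c
R_rec = np.sqrt(b[0, 0].real)
print(abs(det), c_rec, R_rec)
```

Here `a[k, l]` stores \(a_{k\ell }\), so `a[1, 0]` is the moment of \(\zeta \) and `a[0, 1]` the moment of \(\overline{\zeta }\).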

5.2 Rotationally Invariant Measures

Let \(\rho \) denote a positive Borel measure on the interval \([0,1]\) with \(1 \in {\mathrm{supp}}(\rho )\). That is \(\rho ([\delta , 1]) >0\) for every \(\delta < 1\). The induced rotationally invariant measure \(\mu \) acts on smooth functions \(\phi (x,y) \) by the formula

$$ \int \phi d\mu = \int _{0}^{1} \int _{0}^{2\pi } \phi (r \cos \theta , r \sin \theta ) \frac{d\theta }{2\pi } d\rho (r). $$

The support of the measure \(\mu \) contains full circles, hence it is not finite. The complex monomials are orthogonal in \(L^{2}(\mu )\), with norms

$$ \frac{1}{\gamma _{k}^{2}} = \| z^{k} \|^{2}_{2,\mu } = \int r^{2 k} d \rho (r), \quad k \geq 0. $$

The orthonormal polynomials are

$$ p_{k}(z) = \gamma _{k} z^{k}, \quad k \geq 0, $$

so that the multiplier \(S = M_{z}\) acts as a weighted shift:

$$ S p_{k} = h_{k+1,k} p_{k+1}, \quad k \geq 0. $$

Note that \((\frac{1}{\gamma _{k}^{2}})_{k=0}^{\infty }\) are the moments of a positive measure defined on \([0,1]\) (specifically \(d\rho (\sqrt{r})\)), and

$$ h_{k+1,k} = \frac{\gamma _{k}}{\gamma _{k+1}} > 0, \quad k \geq 0. $$

All other entries in the associated Hessenberg matrix \(H\) are equal to zero. In short, \(H\) is a subnormal weighted shift.

The spectrum of \(H\) is well understood: it coincides with the closed unit disk. Moreover, the elements of \(P^{2}(\mu )\) are analytic functions in the open disk, subject to a growth condition imposed by the coefficients \(\gamma _{k}\), cf. [27]. In particular, \(P^{2}(\mu ) \neq L^{2}(\mu )\) and every \(\lambda \in \mathbb{D}\) is a bounded point evaluation for \(P^{2}(\mu )\). That is, the cloud of \(\mu \) is the full disk: \(\Sigma (\mu ) = \overline{\mathbb{D}}\).

We give for completeness a few details. Condition \([S^{\ast }, S] \geq 0\) reads

$$ \| S p_{k} \|^{2} \geq \|S^{\ast }p_{k} \|^{2}, $$

or equivalently

$$ h^{2}_{k+1,k} \geq h^{2}_{k, k-1}, \quad k \geq 1. $$

In addition

$$ \lim _{k} h_{k+1,k} =1. $$

Indeed,

$$ S^{\ast }S p_{k} = h^{2}_{k+1,k} p_{k}, \quad k \geq 0, $$

and \(\| S^{\ast }S \| = \| S \|^{2} = \| M \|^{2} = 1\).

According to Theorem 3.6, the rate of convergence of the approximation scheme is dictated by the remainder:

$$ \biggl[\biggl(\sum _{j >n} s_{j}\biggr) + \sum _{j \leq n< N < k} |h_{jk}|^{2}\biggr] = 1- h_{n+1,n}^{2}. $$

In general, for Hausdorff moment sequences such as \((\frac{1}{\gamma _{k}^{2}})_{k=0}^{\infty }\) above, the convergence rate of consecutive quotients is known:

$$ h^{2}_{k+1,k} = \frac{\gamma ^{2}_{k}}{\gamma ^{2}_{k+1}} = 1 - O\biggl( \frac{1}{k}\biggr) $$

A multi-fractal gauge, known as the local dimension of a measure, quantifies these asymptotics, cf. [10, 23].

The remarkable feature of this class of examples is that all (normalized) rotationally invariant measures share the same cloud.
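
To make the \(O(\frac{1}{k})\) rate concrete, the sketch below evaluates the weights \(h^{2}_{k+1,k}\) and the remainder \(1 - h^{2}_{n+1,n}\) for two radial measures. Both choices of \(\rho \) (normalized area measure, \(d\rho = 2r\,dr\), and the uniform measure \(d\rho = dr\) on \([0,1]\)) and the helper name `shift_weights_sq` are assumptions made for illustration.

```python
import numpy as np

def shift_weights_sq(radial_moment, n):
    """h_{k+1,k}^2 = gamma_k^2 / gamma_{k+1}^2 = m_{k+1} / m_k for 0 <= k <= n,
    where m_k = int r^(2k) d rho(r) = 1 / gamma_k^2."""
    m = np.array([radial_moment(k) for k in range(n + 2)], dtype=float)
    return m[1:] / m[:-1]

# Two assumed radial measures on [0, 1], both of total mass 1:
area = lambda k: 1.0 / (k + 1)          # d rho = 2 r dr
uniform = lambda k: 1.0 / (2 * k + 1)   # d rho = dr

n = 1000
for name, mom in (("2r dr", area), ("dr", uniform)):
    h2 = shift_weights_sq(mom, n)
    remainder = 1.0 - h2                # 1 - h_{k+1,k}^2 for each k
    print(name, remainder[n], n * remainder[n])
```

In both cases the weights increase to 1 and the remainder behaves like a constant multiple of \(\frac{1}{n}\), in line with the common cloud \(\overline{\mathbb{D}}\).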

5.3 A Uniform Mass Cloud Plus Finitely Many Point Masses

Let \(\Omega \subset \mathbb{C}\) be a bounded, connected and simply connected domain with smooth boundary. The uniform area mass \(\nu = \chi _{\Omega }dA\) distributed on \(\Omega \) offers one of the best understood asymptotic analyses of complex orthogonal polynomials, with a century-old history, see for instance [30]. In this case \(P^{2}(\nu )\) coincides with the Bergman space \(L^{2}_{a}(\Omega )\), that is, the collection of all analytic functions in \(\Omega \), square summable with respect to area. The reproducing kernel

$$ K^{\Omega }(z,w) = \sum _{j=0}^{\infty }p_{j}(z) \overline{p_{j}(w)} $$

converges in \(\Omega \times \Omega \) to an analytic/anti-analytic positive definite kernel, known as the Bergman kernel of \(\Omega \). If \(\phi : \Omega \longrightarrow \mathbb{D}\) denotes a conformal mapping, then

$$ K^{\Omega }(z,w) = \frac{\phi '(z) \overline{\phi '(w)}}{\pi (1- \phi (z) \overline{\phi (w)})^{2}}, \quad z, w \in \Omega . $$

Therefore, the integral kernel representing \((I-P)M^{\ast }P\) is precisely

$$ L(z,w) = (\overline{z} - \overline{w}) K^{\Omega }(z,w). $$

The asymptotics of the singular numbers of this integral operator (known as a big Hankel operator) have been thoroughly studied [16]. In general, one carries all computations to the unit disk, via the inverse conformal mapping \(\psi = \phi ^{-1} : \mathbb{D}\longrightarrow \Omega \). The integral kernel

$$ L_{1}(u,v) = \frac{ \overline{\psi (u)}- \overline{\psi (v)}}{\pi (1- u \overline{v})^{2}}, \quad u,v \in \mathbb{D} $$

gives rise to a unitarily equivalent integral operator to \((I-P)M^{\ast }P\), this time acting on \(L^{2}(\mathbb{D}, dA)\). Within this framework, a theorem due to Nowak [22] asserts that the singular numbers \(\kappa _{j}\) of the Hankel operator \((I-P)M^{\ast }P\) associated to a domain \(\Omega \) with smooth boundary satisfy \(\kappa _{j} = O(\frac{1}{j})\). Consequently the best approximation by a sequence of finite rank projections \(Q_{n}, \ {\mathrm{rank}} \ Q_{n} = n\), yields the error \(\kappa _{n+1}^{2} + \kappa _{n+2}^{2} + \ldots \), that is

$$ \| (I-P) M^{\ast }(P-Q_{n})\|_{\mathit{HS}}^{2} = O \biggl(\frac{1}{n}\biggr). $$

We show that this estimate is sharp for an ellipse, with respect to the polynomial filtration. More precisely, let \(E\) be the ellipse whose complement is described by Joukowski’s map

$$ F(z) = \frac{1}{2} \biggl( e^{c} z + \frac{1}{e^{c} z}\biggr), \quad |z| > 1, $$

and parameter \(c >0\). The associated Hessenberg matrix is three diagonal, due to the special structure of the orthogonal polynomials with respect to area measure on \(E\). Indeed,

$$ p_{j}(z) = 2 \sqrt{\frac{j+1}{\pi }} \frac{1}{\sqrt{\rho ^{j+1} - \rho ^{-j-1}}} U_{j}(z), \quad j \geq 0, $$

where \(\rho = e^{2c}\) and \(U_{j}\) denotes the Chebyshev polynomial of the second kind, see for details [21] pg. 259. We adopt the convention \(p_{-1} =0\) as well as \(U_{-1} =0\). The three term recurrence relation for Chebyshev polynomials

$$ 2 z U_{j}(z) = U_{j+1}(z) + U_{j-1}(z), \quad j \geq 0, $$

implies

$$ z p_{j}(z) = \frac{1}{2} \sqrt{\frac{j+1}{j+2}} \sqrt{ \frac{ \rho ^{j+2}-\rho ^{-j-2}}{\rho ^{j+1} - \rho ^{-j-1}}} p_{j+1} + \frac{1}{2} \sqrt{\frac{j+1}{j}} \sqrt{ \frac{ \rho ^{j}-\rho ^{-j}}{\rho ^{j+1} - \rho ^{-j-1}}} p_{j-1}. $$

In other terms, \(h_{jj}= 0\),

$$ h_{j+1,j} = \frac{1}{2} \sqrt{\frac{j+1}{j+2}} \sqrt{ \frac{ \rho ^{j+2}-\rho ^{-j-2}}{\rho ^{j+1} - \rho ^{-j-1}}} $$

and

$$ h_{j-1,j} = \frac{1}{2} \sqrt{\frac{j+1}{j}} \sqrt{ \frac{ \rho ^{j}-\rho ^{-j}}{\rho ^{j+1} - \rho ^{-j-1}}}. $$

In order to evaluate the approximation rate in Theorem 3.6 we remark that \(\sum _{j \leq n< N < k} |h_{jk}|^{2} =0\) whenever \(N>n\), hence

$$ \frac{1}{\pi } {\mathrm{Area \ E}} - (h_{n+1,n}^{2} -h_{n,n+1}^{2}) $$

dictates, up to the stated constants, the rate of convergence. Since \(\frac{1}{\pi } {\mathrm{Area}}\ E = \frac{1}{4} (\rho - \rho ^{-1})\) with \(\rho >1\), we find the error estimate:

$$ \frac{1}{4} \biggl(\rho - \frac{1}{\rho }\biggr) - \frac{1}{4} \biggl[ \frac{n+1}{n+2} \rho \frac{1-\rho ^{-2n-4}}{1-\rho ^{-2n -2}} - \frac{n+2}{n+1} \frac{1}{\rho } \frac{1-\rho ^{-2n-2}}{1-\rho ^{-2n -4}}\biggr] = O \biggl( \frac{1}{n}\biggr). $$
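
The \(O(\frac{1}{n})\) behaviour for the ellipse can be checked numerically. The sketch below evaluates the error directly from the entries \(h_{n+1,n}\) and \(h_{n,n+1}\) given above and compares \(n\) times the error with the value \(\frac{1}{4}(\rho + \rho ^{-1})\) suggested by a first order expansion in \(\frac{1}{n}\). The function name `ellipse_error`, the choice \(c = \frac{1}{2}\), and the stated limit are assumptions of this illustration, not claims from the source.

```python
import math

def ellipse_error(n, rho):
    """Area(E)/pi - (h_{n+1,n}^2 - h_{n,n+1}^2) for the ellipse with rho = e^{2c} > 1."""
    # (rho^{n+2} - rho^{-n-2}) / (rho^{n+1} - rho^{-n-1}) = rho * q
    q = (1 - rho ** (-2 * n - 4)) / (1 - rho ** (-2 * n - 2))
    h_low_sq = 0.25 * (n + 1) / (n + 2) * rho * q      # h_{n+1,n}^2
    h_up_sq = 0.25 * (n + 2) / (n + 1) / (rho * q)     # h_{n,n+1}^2
    return 0.25 * (rho - 1 / rho) - (h_low_sq - h_up_sq)

rho = math.exp(1.0)            # c = 1/2, an arbitrary choice
limit = 0.25 * (rho + 1 / rho)
for n in (10, 100, 1000, 10000):
    print(n, n * ellipse_error(n, rho), limit)
```

The rewritten ratio `q` avoids forming the large powers \(\rho ^{n}\) explicitly, which keeps the evaluation stable for large `n`.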

Conformal and quasi-conformal mapping techniques led recently to sharp estimates of the Hessenberg matrix entries of the subnormal multiplier \(S_{\nu }= M_{z}\), acting on \(P^{2}(\nu )\), [4]. This article represents the highest point of several decades of accumulated studies, by many authors. We mention the main setting.

Let

$$ F(z) = c_{-1} z + c_{0} + c_{1} \frac{1}{z} + c_{2} \frac{1}{z^{2}} + \cdots $$

denote the conformal mapping of the exterior of the closed unit disk onto \(\mathbb{C}\setminus \overline{\Omega }\). It is customary to normalize \(F\) by the condition \(c_{-1} >0\). Theorem 1.2 of [4] asserts that there exists a constant \(\beta \geq 1\) so that the Hessenberg matrix of \(S_{\nu }\) is asymptotically close to the Toeplitz matrix \(\mathcal{T}\) with entries \(c_{-1}, c_{0}, c_{1}, \ldots \). More specifically:

$$ \biggl| h_{n-k,n} - \sqrt{\frac{n+1}{n-k+1}} c_{k}\biggr| = O\biggl( \frac{1}{n^{\beta }}\biggr), \quad n \rightarrow \infty , $$

where \(O\) depends on \(k \geq -1\). In particular, the only non-zero under-diagonal terms satisfy:

$$ \biggl|h_{n+1,n} - \sqrt{\frac{n+1}{n+2}} c_{-1}\biggr| = O\biggl( \frac{1}{n^{\beta }}\biggr), \quad n \rightarrow \infty . $$

That is

$$ |h_{n+1,n} - c_{-1}| = O\biggl(\frac{1}{n}\biggr). $$

The value of the constant \(\beta \) depends on the regularity of the boundary of \(\Omega \), [4].

Remark that the series \(\sum _{\ell \geq 1} \ell |c_{\ell }|^{2}\) converges via a well known area estimate, in its turn a consequence of Stokes formula:

$$ \frac{1}{\pi } \ {\mathrm{Area}\, \Omega } = |c_{-1}|^{2} - |c_{1}|^{2} - 2 |c_{2}|^{2} - 3 |c_{3}|^{2} - \cdots . $$

Just for validation: this is nothing else than the trace of the self-commutator of the corresponding Toeplitz matrix \(\mathcal{T}\).
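
As a worked check against the ellipse considered earlier (an illustration, not taken from the source): Joukowski's map \(F(z) = \frac{1}{2} ( e^{c} z + \frac{1}{e^{c} z})\) has \(c_{-1} = \frac{e^{c}}{2}\), \(c_{1} = \frac{e^{-c}}{2}\) and all other coefficients equal to zero, so the area formula gives

$$ \frac{1}{\pi } \ {\mathrm{Area}}\, E = \frac{e^{2c}}{4} - \frac{e^{-2c}}{4} = \frac{1}{4} \biggl(\rho - \frac{1}{\rho }\biggr), \quad \rho = e^{2c}, $$

which agrees with the area of an ellipse of semi-axes \(\cosh c\) and \(\sinh c\), namely \(\pi \cosh c \sinh c\).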

Recall from Theorem 3.6 that the orthogonal projection \(P_{n}\) onto \(\mathbb{C}_{n}[z]\) satisfies the identity

$$ \| (I-P)M^{\ast }P_{n} \|^{2}_{\mathit{HS}} = h_{n+1,n}^{2} - \sum _{j \leq n< k} |h_{jk}|^{2}. $$

In conclusion,

$$ \lim _{n} \sum _{j \leq n< k} |h_{jk}|^{2} = |c_{1}|^{2} + 2 |c_{2}|^{2} + 3 |c_{3}|^{2} + \cdots . $$

The yet unknown rate of convergence in the latter limit is not expected to be better than \(O(\frac{1}{n})\), as the ellipse case shows.

One of the first studies of estimates of orthogonal polynomials in the Bergman space setting is due to Carleman, in the case of real analytic boundaries, see for instance [30]. Without entering into details, we mention that in this scenario there exists \(\rho >1\), depending on the geometry of \(\partial \Omega \) (how far the Schwarz function of this curve analytically extends inside \(\Omega \)), such that the decay in formula (3.9) is geometric:

$$ \sum _{j \leq n < N < k} |h_{jk}|^{2} = O \biggl(\frac{1}{\rho ^{N}}\biggr), $$

where \(O\) depends on \(n\).

In general, Corollary 3.7 allows one to estimate the error term in the moment approximation formula (3.6) even after adding finitely many point masses outside the polynomial convex hull of \(\overline{\Omega }\).

5.4 Finite Rank Self-Commutator

Even the simplest, finite rank self-commutator scenario raises challenging approximation theory questions. In view of the structural theorem of McCarthy and Yang, we have to focus in this case on a rational conformal map \(r : \mathbb{D}\longrightarrow \Omega \) of the disk onto a bounded quadrature domain, and the push forward measure \(\mu = r_{\ast }( d\theta + \nu )\), where \(d\theta \) is arc length on the unit circle and \(\nu \) is a finite atomic, positive measure supported by \(\mathbb{D}\). Then the operator \(S_{\mu }= \mathit{PM}_{z} P\) is a cyclic, irreducible subnormal with finite rank self-commutator, and vice-versa [19, 20]. Let the Schmidt expansion of \((I-P)M^{\ast }P\) be:

$$ (I-P)M^{\ast }P= \sum _{j=0}^{d} \kappa _{j} g_{j} \langle \cdot , f_{j} \rangle , $$

with \(d\) finite. Every eigenfunction \(f_{j}, \ 0 \leq j \leq d\), of the self-commutator \([S^{\ast },S]\) annihilates a finite codimension ideal of the ring of analytic functions defined on \(\Omega \) (see [19]). Let \(a_{1}, \ldots ,a_{p} \in \Omega \) denote the support of this ideal. Therefore every \(f_{j}\) is a linear combination of the corresponding point evaluation functionals \(k_{a_{1}}, \ldots , k_{a_{p}}\). Due to the definition of the measure \(\mu \), these point evaluation functionals are push forwards via \(r\) of point evaluation functionals with respect to the measure \(d\theta + \nu \). The latter can be explicitly computed recursively (via the so-called Uvarov transform). We indicate only one step of this transform, corresponding to the measure \(d\theta + \delta _{\alpha }\) with \(\alpha \in \mathbb{D}\):

$$ K^{d\theta + \delta _{\alpha }}(z,w) = \frac{1}{2 \pi } \biggl[ \frac{1}{1-z\overline{w}} + \frac{C}{(1-z \overline{\alpha })(1-\alpha \overline{w})}\biggr], \quad z,w \in \mathbb{D}, $$

where \(C\) is a constant. More details can be found in [28]. We infer from these formulas that every evaluation functional \(K^{d\theta + \nu }(z,w)\) analytically extends as a function of \(z\), across \(\partial \mathbb{D}\), for a fixed value of \(w \in \mathbb{D}\). The same analytic continuation feature carries to \(k_{a_{1}}, \ldots , k_{a_{p}}\) provided the boundary of \(\Omega \) is smooth. Recall that the only singular points in the boundary of a quadrature domain are inner cusps [12]. We discuss a generic situation.

Proposition 5.1

Let \(S\) be a cyclic subnormal operator with finite-rank self-commutator, so that its spectrum is the closure of a finite union \(\Omega \) of quadrature domains, plus a finite number of points. If the boundary of \(\Omega \) is smooth, then there exists \(\rho >1\), so that the error in Theorem 3.6 satisfies:

$$ \| (I-P)M^{\ast }(P-P_{n}) \|_{\mathit{HS}} = O \biggl(\frac{1}{\rho ^{n}}\biggr), \quad n \rightarrow \infty . $$

Proof

The non-degenerate case assumed in the statement implies that \(\partial \Omega \) is a real analytic curve without singularities. Let \(f_{1}, \ldots ,f_{p}\) denote the eigenfunctions of \([S^{\ast }, S]\). We just proved that \(f_{1}, \ldots ,f_{p}\) are analytic functions defined in a neighborhood of \(\overline{\Omega }\).

The full Hilbert-Schmidt norm of the main Hankel operator is

$$ \| (I-P)M^{\ast }P \|^{2}_{\mathit{HS}} = \sum _{j=0}^{p} \| (I-P)M^{\ast }f_{j} \|^{2}, $$

while the error of interest is

$$ \| (I-P)M^{\ast }(P-P_{n}) \|^{2}_{\mathit{HS}} = \sum _{j=0}^{p} \| (I-P)M^{\ast }(f_{j} - P_{n} f_{j})\|^{2}. $$

It remains to prove that there exists a constant \(\rho >1\), so that, for every \(j, 0 \leq j \leq p\):

$$ \inf \{ \| f_{j} - h \|_{2,\mu }, \ h \in \mathbb{C}_{n}[z] \} = O \biggl( \frac{1}{\rho ^{n}}\biggr), \quad n \rightarrow \infty . $$

Since the complement of \(\overline{\Omega }\) is connected, a theorem of Russell and Walsh ensures the above decay, even with respect to the uniform norm on \(\overline{\Omega }\), see [33] Section 4.7. □

The case of a non-smooth boundary is different, and more intriguing. We consider a simple example. Let \(r(z) = (z-1)^{2}\) be the conformal map of the disk onto the cardioid \(\Omega \), a quadrature domain of order two. The point \(0 = r(1)\) is a singular point of \(\Omega \), where the boundary has an inner cusp.

Let \(U = M_{z}\) denote the unilateral shift on the Hardy space \(H^{2}(\mathbb{D}) = P^{2}(\mathbb{T}, d\theta )\). The monomials \(1, z, z^{2} , \ldots \) form an orthonormal basis with respect to the normalized arc length measure \(\frac{d \theta }{2\pi }\). The operator \(S_{\mu }\) corresponding to the push forward measure \(r_{\ast }\frac{d \theta }{2\pi }\) is unitarily equivalent to \((U-I)^{2}\). The action of \(U\) on the basis is by shifting the indices, \(U z^{n} = z^{n+1}\), \(n \geq 0\), with \(U^{\ast }z^{n+1} = z^{n}\) and \(U^{\ast }{\mathbf{1}} = 0\). One immediately computes the self-commutator in this Hilbert space representation:

$$\begin{aligned} & [S^{\ast }_{\mu }, S_{\mu }] = [ I - 2 U^{\ast }+ U^{\ast 2}, I - 2 U + U^{2}] = \\ &\quad 4 [U^{\ast }, U] - 2 [U^{\ast }, U^{2}] - 2[U^{\ast 2}, U] + [U^{\ast 2}, U^{2}] \end{aligned}$$

remarking that every commutator in the last expression annihilates \(z^{k}, \ k \geq 2\). Therefore the two eigenfunctions \(f_{0}\), \(f_{1}\) of \([S^{\ast }_{\mu }, S_{\mu }]\) are linear combinations of \({\mathbf{1}}\) and \(z\). In view of the preceding proof, the rate of convergence in Theorem 3.6 is a factor of

$$ \delta _{n} = \inf \{ \| z - h((1-z)^{2})\|_{2, d\theta }, \ h \in \mathbb{C}_{n}[u] \}, $$

clearly equivalent to the best polynomial approximation (in the corresponding Lebesgue space norm) of the function \(\sqrt{w}\) on the boundary of \(\Omega \). Remark that the function \(\sqrt{w}\) is analytic and continuous on the closure \(\overline{\Omega }\) of the cardioid, but it is not analytic in a neighborhood of it. The same theorem of Russell and Walsh and a refined Bernstein-Markov inequality imply that \(\delta _{n}\) converges to zero at a slower than geometric rate. We do not expand here the technical details related to the cusp singularity in the boundary, see for instance [3]. The exact asymptotic decay of \(\delta _{n}\) remains unknown to us. It is worth mentioning that a slight perturbation in the definition of the measure \(\mu \), for instance

$$ \mu _{\epsilon }= (r_{\epsilon })_{\ast }d\theta , \qquad r_{\epsilon }(z) = (z-1- \epsilon )^{2}, \quad \epsilon >0 $$

remains within the class of rank-two self-commutator, this time with a cloud possessing a smooth boundary.
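
The self-commutator of the cardioid example can be verified on finite matrices. The sketch below truncates the unilateral shift to the first `m` basis vectors (the truncation size is an arbitrary assumption) and computes \([S^{\ast }, S]\) for \(S = (U-I)^{2}\). Only the last two rows and columns are affected by the truncation, so the leading block is exact. Its trace should equal \(\frac{1}{\pi } {\mathrm{Area}}\, \Omega = 6\), the value given by the area formula \(\pi \sum n |a_{n}|^{2}\) applied to \(r(z) = z^{2} - 2z + 1\).

```python
import numpy as np

m = 12                                  # truncation size (arbitrary)
U = np.diag(np.ones(m - 1), k=-1)       # truncated shift: U e_n = e_{n+1}
I = np.eye(m)
S = (U - I) @ (U - I)                   # compression of (U - I)^2
C = S.T @ S - S @ S.T                   # [S^*, S]; entries are real, so S^* = S^T
block = C[:m - 2, :m - 2]               # discard rows/columns touched by truncation
print(block[:2, :2])                    # action on span{1, z}
print(np.trace(block))
```

The only non-zero entries of `block` sit in the leading 2 by 2 corner, which reflects the fact that the two eigenfunctions are linear combinations of \({\mathbf{1}}\) and \(z\).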

5.5 Non-smooth Clouds

Given the full generality of our approach, a vast array of pathologies enters into the picture. We simply make the reader aware of such pitfalls by means of one of the simplest examples, stressing that the cloud of a measure is only contained in its closed support.

Let \(\Gamma \) be a Jordan curve in the complex plane with positive area, as constructed for instance by Osgood [24]. Let \(\Omega \) be the interior of \(\Gamma \). A celebrated theorem of Carleman, see for instance [30], asserts that complex polynomials are dense in the associated Bergman space. In our notation \(L^{2}_{a}(\Omega ) = P^{2}(\chi _{\Omega }dA)\). Denote \(\mu = \chi _{\Omega }dA\) and \(S_{\mu }\) the associated subnormal operator, equal to the multiplication by \(z\) on this space. The set of bounded point evaluations for \(\mu \) is equal to \(\Omega \), also equal to the non-essential spectrum of \(S_{\mu }\). The spectrum of \(S_{\mu }\) is equal to \(\overline{\Omega }\). We know that the principal function \(g\) of \(S_{\mu }\) is equal to the characteristic function of \(\overline{\Omega }\), modulo area null-sets. Therefore the cloud \(\Sigma (\mu )\) is equal to \(\overline{\Omega }\), and

$$ {\mathrm{Area}} (\Sigma (\mu ) \setminus \Omega ) >0, $$

while

$$ \mu (\Sigma (\mu ) \setminus \Omega ) =0. $$