1 Introduction

In the framework of noncommutative geometry [3], we study the pure states on the Toeplitz matrices as a metric space. The results of this letter can be understood with little prior knowledge, however. By \(n\times n\) Toeplitz matrices we mean those matrices in \(M_n (\mathbb {C})\) for which each descending diagonal is constant, i.e., the matrices T for which \(T_{i,j} = T_{i+1, j+1}\). We denote this set by \(C(S^1)^{(n)}\), as these Toeplitz matrices can also be seen as a truncation of \(C(S^1)\), as we will explain later on. The reader need only know that these matrices inherit a structure of positivity from \(M_n(\mathbb {C})\), which allows us to talk about positive linear functionals. Likewise, states on the Toeplitz system can be defined as those positive linear functionals \(\varphi \) for which \(\varphi (I)=1\), and pure states as the extreme points in the state space. We denote these spaces by \(\mathcal {S}(C(S^1)^{(n)})\) and \(\mathcal {P}(C(S^1)^{(n)})\), respectively. The states on \(C(S^1)^{(n)}\) can be equipped with a metric that metrises the weak\(^*\)-topology via the Connes distance formula, which is defined as

$$d_n(\varphi , \psi ) = \sup _{T \in C(S^1)^{(n)}} \{\left|\varphi (T)-\psi (T)\right| : \left\Vert [D_n,T]\right\Vert \le 1\}, $$

where \(D_n\) denotes the matrix \(D_n := {\text {diag}}(1, \dots , n)\). Importantly, this is completely analogous to a formula that defines the Monge–Kantorovich metric on the state space of \(C(S^1)\),

$$ d(\varphi , \psi ) = \sup _{f \in C^\infty (S^1)} \{\left|\varphi (f)-\psi (f)\right| : \Vert [D,f]\Vert \le 1 \}, $$

where \(D= -i\frac{d}{dx}\) is the Dirac operator on \(S^1\), which metrises the weak\(^*\)-topology on \(\mathcal {S}(C(S^1))\) [11]. In this language, the main result of this letter is that the metric spaces \((\mathcal {P}(C(S^1)^{(n)}), d_n)\) converge in the Gromov–Hausdorff sense to the metric space \((\mathcal {S}(C(S^1)), d)\).
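The constraint \(\Vert [D_n,T]\Vert \le 1\) in the formula for \(d_n\) can be made concrete. Since \(D_n\) is diagonal, the commutator has entries \(([D_n,T])_{i,j} = (i-j)T_{i,j}\), so it is again a Toeplitz matrix. The following is a minimal Python sketch of this observation; the helper names `is_toeplitz` and `commutator_with_D` are illustrative choices and do not appear in the letter.

```python
def is_toeplitz(T, tol=1e-12):
    """Return True if every descending diagonal of the square matrix T is constant."""
    n = len(T)
    return all(abs(T[i][j] - T[i + 1][j + 1]) <= tol
               for i in range(n - 1) for j in range(n - 1))


def commutator_with_D(T):
    """Compute [D_n, T] = D_n T - T D_n for D_n = diag(1, ..., n)."""
    n = len(T)
    d = [k + 1 for k in range(n)]  # diagonal entries 1, ..., n
    return [[d[i] * T[i][j] - T[i][j] * d[j] for j in range(n)] for i in range(n)]


# A 3x3 Toeplitz matrix, fixed by its first column and first row.
T = [[2, 1j, 5],
     [-1j, 2, 1j],
     [7, -1j, 2]]
C = commutator_with_D(T)  # entrywise equal to (i - j) * T[i][j]
```

The diagonal of the commutator vanishes, which reflects that adding a multiple of the identity to T changes neither \(\left|\varphi (T)-\psi (T)\right|\) nor the constraint.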

The reader who is familiar with noncommutative geometry may be interested to know how the above relates to prior work in that area. Related to physical applications of noncommutative geometry, extensive study has been made of spectral geometry for which only part of the spectral data is available [4, 6, 14]. Indeed, detectors in physical experiments are limited and thus only give information up to a certain energy level with finite resolution. Suggested first by F. D’Andrea, F. Lizzi, and P. Martinetti [6] and put in the language of operator systems in an article by A. Connes and W.D. van Suijlekom [4], the correct theoretical framework for such truncations seems to be to truncate a spectral triple \((\mathcal {A}, H, D)\) with a projection P onto a part of the spectrum of D resulting in the triple \((P\mathcal {A}P, PH, PD)\). The Toeplitz matrices arise as exactly such a truncation of the algebra \(\mathcal {A}=C(S^1)\) in the spectral triple of the circle. Note also that a first result on Gromov–Hausdorff convergence of state spaces in this context is put forward in [6]. A more detailed explanation of the connection between the results presented in this letter and noncommutative geometry can be found in the Master’s thesis [8].

Before continuing on, the author would like to thank Walter van Suijlekom for his invaluable guidance.

2 Preliminaries

These preliminaries consist of two parts, namely preliminaries on Gromov–Hausdorff convergence and on truncated geometry on the circle.

2.1 Gromov–Hausdorff convergence

In the spirit of work by M. Rieffel [12], we are interested in studying limits of sequences of metric spaces in the Gromov–Hausdorff sense. The Gromov–Hausdorff distance is a pseudo-metric on the class of all compact metric spaces and is zero if and only if two spaces are isometric [1, Theorem 7.3.30]. Hence, it functions as a metric on the isometry classes of compact metric spaces, although one has to be careful to avoid set-theoretic paradoxes in such a description. In any case, it gives a useful notion of convergence and there are several techniques that can be used to prove Gromov–Hausdorff convergence of sequences of compact metric spaces.

For a subset S in a metric space, denote the r-neighborhood of S by \(U_r(S)\), i.e.,

$$\begin{aligned}U_r(S) = \bigcup _{x\in S} B_r(x),\end{aligned}$$

where \(B_r(x)\) is the open ball of radius r and center x.

Definition 2.1

Let A and B be subsets of a metric space. The Hausdorff distance between A and B is defined as

$$\begin{aligned}d_H(A,B) = \inf \{ r > 0 : A \subseteq U_r(B) \text{ and } B \subseteq U_r(A)\} .\end{aligned}$$

Definition 2.2

Let X and Y be metric spaces. The Gromov–Hausdorff distance between X and Y is defined as the infimum of all \(r > 0\) such that there exists a metric space Z with subsets \(X', Y' \subseteq Z\) isometric to X and Y, respectively, with \(d_H(X', Y') < r\), where \(d_H(X', Y')\) is the Hausdorff distance between \(X'\) and \(Y'\). We will denote the Gromov–Hausdorff distance by \(d_{GH}(X, Y)\) in the rest of this letter as well.

These definitions are exactly [1, Definition 7.3.1] and [1, Definition 7.3.10].
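For finite sets, the infimum in Definition 2.1 reduces to the larger of the two one-sided distances \(\max _{a} \min _{b} d(a,b)\) and \(\max _{b} \min _{a} d(a,b)\). The sketch below computes the Hausdorff distance this way; it is only an illustration for finite subsets, and the function name `hausdorff_distance` and the sample sets are assumptions made for the example.

```python
def hausdorff_distance(A, B, dist):
    """Hausdorff distance between non-empty finite sets A and B.

    For finite sets the infimum over r equals the larger of the two
    one-sided distances max_a min_b dist(a, b) and max_b min_a dist(a, b).
    """
    a_to_b = max(min(dist(a, b) for b in B) for a in A)
    b_to_a = max(min(dist(a, b) for a in A) for b in B)
    return max(a_to_b, b_to_a)


def abs_dist(x, y):
    """Usual metric on the real line."""
    return abs(x - y)


A = [0.0, 1.0]
B = [0.0, 0.5, 3.0]
d_AB = hausdorff_distance(A, B, abs_dist)  # the point 3.0 is at distance 2.0 from A
```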

Definition 2.3

Let X and Y be two sets. A total onto correspondence between X and Y is a set \(\mathfrak {R} \subseteq X \times Y\) such that for every \(x \in X\) there exists at least one \(y \in Y\) with \((x,y) \in \mathfrak {R}\) and similarly for every \(y \in Y\) there exists an \(x \in X\) with \((x,y) \in \mathfrak {R}\).

Note that the above definition is simply called a correspondence in [1, Chapter 7], although this is not usual terminology. To prevent any confusion, we will stick to calling the above total onto correspondences.

Definition 2.4

Let \(\mathfrak {R}\) be a total onto correspondence between metric spaces X and Y. The distortion of \(\mathfrak {R}\) is defined by

$$\begin{aligned}{\text {dis }} \mathfrak {R} = \sup \bigg \{\left|d_X(x, x') - d_Y(y, y')\right| : (x,y), (x', y') \in \mathfrak {R} \bigg \}.\end{aligned}$$

Theorem 2.5

For any two metric spaces X and Y

$$\begin{aligned}d_{GH}(X,Y) = \frac{1}{2} \inf _{\mathfrak {R}}({\text {dis }} \mathfrak {R}),\end{aligned}$$

where the infimum is taken over all total onto correspondences \(\mathfrak {R}\) between X and Y.

Proof

See [1, Theorem 7.3.25]. \(\square \)
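To illustrate Definition 2.4 and Theorem 2.5 on finite metric spaces, the following sketch computes the distortion of one particular total onto correspondence. By Theorem 2.5, half of this value is an upper bound for \(d_{GH}(X,Y)\); it need not be the infimum. The names `distortion` and `is_total_onto` and the sample spaces are hypothetical.

```python
def is_total_onto(R, X, Y):
    """Check that every x in X and every y in Y occurs in some pair of R."""
    return {x for x, _ in R} == set(X) and {y for _, y in R} == set(Y)


def distortion(R, d_X, d_Y):
    """Distortion of a finite correspondence R, given as a list of pairs (x, y)."""
    return max(abs(d_X(x, xp) - d_Y(y, yp)) for (x, y) in R for (xp, yp) in R)


def line(a, b):
    """Usual metric on the real line, used for both spaces."""
    return abs(a - b)


X = [0.0, 1.0, 2.0]
Y = [0.0, 2.2]
R = [(0.0, 0.0), (1.0, 0.0), (2.0, 2.2)]

dis_R = distortion(R, line, line)
gh_upper_bound = dis_R / 2  # upper bound on d_GH(X, Y) from this one correspondence
```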

As a direct corollary, there is a similar approach of using \(\varepsilon \)-isometries.

Definition 2.6

Let X be a metric space and \(\varepsilon > 0\). A set \(S \subseteq X\) is called an \(\varepsilon \)-net if \({\text {dist}}(x,S) \le \varepsilon \) for every \(x \in X\).

Definition 2.7

Let X and Y be metric spaces and \(f: X \rightarrow Y\) an arbitrary map. The distortion of f is defined by

$$\begin{aligned}{\text {dis}}f = \sup _{x_1, x_2 \in X} |d_Y (f(x_1), f(x_2)) - d_X (x_1, x_2)|.\end{aligned}$$

Definition 2.8

Let X and Y be metric spaces and \(\varepsilon > 0\). A (possibly non-continuous) map \(f: X \rightarrow Y\) is called an \(\varepsilon \)-isometry if \({\text {dis}}f \le \varepsilon \) and f(X) is an \(\varepsilon \)-net in Y.

Corollary 2.9

Let X and Y be metric spaces and \(\varepsilon > 0\).

  1. If \(d_{GH}(X, Y) < \varepsilon \), then there exists a \(2\varepsilon \)-isometry from X to Y.

  2. If there exists an \(\varepsilon \)-isometry from X to Y, then \(d_{GH}(X, Y) < 2\varepsilon \).

Proof

See [1, Corollary 7.3.28]. \(\square \)
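Definitions 2.6 to 2.8 can be checked mechanically when X and Y are finite. The sketch below is one way to do so, under the assumption that the spaces are given as Python lists with metric functions; all names are illustrative.

```python
def map_distortion(f, X, d_X, d_Y):
    """dis f: supremum of |d_Y(f(x1), f(x2)) - d_X(x1, x2)| over pairs in X."""
    return max(abs(d_Y(f(x1), f(x2)) - d_X(x1, x2)) for x1 in X for x2 in X)


def is_eps_net(S, Y, d_Y, eps):
    """Check that every point of Y lies within eps of the set S."""
    return all(min(d_Y(y, s) for s in S) <= eps for y in Y)


def is_eps_isometry(f, X, Y, d_X, d_Y, eps):
    """Check both conditions of an eps-isometry: small distortion and eps-net image."""
    image = [f(x) for x in X]
    return map_distortion(f, X, d_X, d_Y) <= eps and is_eps_net(image, Y, d_Y, eps)


def line(a, b):
    return abs(a - b)


X = [0.0, 1.0, 2.0]
Y = [0.0, 0.9, 2.1, 2.3]
table = {0.0: 0.0, 1.0: 0.9, 2.0: 2.1}
f = table.get  # a map X -> Y, not assumed continuous or surjective
```

Here `is_eps_isometry(f, X, Y, line, line, 0.25)` holds, so the second part of Corollary 2.9 gives \(d_{GH}(X,Y) < 0.5\) for this example.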

More details on the Gromov–Hausdorff metric can be found in Chapter 7 of A Course in Metric Geometry [1] by D. Burago, I. Burago and S. Ivanov.

2.2 Truncated geometry on the circle

As mentioned in the introduction, the Toeplitz matrices arise as a truncation of the algebra of continuous functions \(C(S^1)\). Indeed, \(C(S^1)\) has a basis \(\{e_n(t) = e^{int} \}_{n \in \mathbb {Z}}\), consisting of all eigenfunctions of the Dirac operator \(D = -i \frac{d}{dx}\) on \(S^1\). We can then consider the projection onto \({\text {span}}_\mathbb {C}\{e_1, \cdots , e_n\}\), which we denote by \(P_n\). For any function \(f \in C^\infty (S^1)\), the action of \(P_nfP_n\) on the finite-dimensional Hilbert space \(\mathrm {span}_\mathbb {C} \{e_k\}_{k=1}^n\) can be represented as the \(n \times n\) Toeplitz matrix

$$\begin{aligned}\left( \begin{array}{ccccc} \hat{f}(0)&{} \hat{f}(-1) &{} \hat{f}(-2) &{} \cdots &{} \hat{f}(-n+1) \\ \hat{f}(1) &{} \hat{f}(0) &{} \hat{f}(-1) &{} \cdots &{} \hat{f}(-n+2)\\ \hat{f}(2) &{} \hat{f}(1) &{} \hat{f}(0) &{} \cdots &{} \hat{f}(-n+3)\\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ \hat{f}(n-1) &{} \hat{f}(n-2) &{} \hat{f}(n-3) &{} \cdots &{} \hat{f}(0)\\ \end{array} \right) .\end{aligned}$$

This is the reason why we denote the set of Toeplitz matrices by \(C(S^1)^{(n)} \).
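In code, the matrix above is determined by the rule that the entry in row i and column j is \(\hat{f}(i-j)\). The following sketch builds it from a function returning Fourier coefficients; the names `truncation_matrix` and `cosine_coeff`, and the choice \(f(t) = 2\cos (t)\), are assumptions made for illustration.

```python
def truncation_matrix(fourier_coeff, n):
    """n x n matrix of P_n f P_n: the entry (i, j) is the coefficient hat{f}(i - j)."""
    return [[fourier_coeff(i - j) for j in range(n)] for i in range(n)]


def cosine_coeff(k):
    """Fourier coefficients of f(t) = 2 cos(t) = e^{it} + e^{-it}."""
    return 1.0 if abs(k) == 1 else 0.0


# For this f the truncation is the tridiagonal matrix with zero diagonal
# and ones on the first sub- and superdiagonal.
T = truncation_matrix(cosine_coeff, 4)
```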

It has already been proven in [14] that the state spaces \(\mathcal {S}(C(S^1)^{(n)})\) converge to \(\mathcal {S}(C(S^1))\) in the Gromov–Hausdorff sense. To prove our result, a main ingredient of [14] will be used in this letter as well, which is the map

$$\begin{aligned} R_n: C(S^1)&\rightarrow C(S^1)^{(n)}\\ f \quad&\mapsto P_n f P_n. \end{aligned}$$

Lemma 2.10

The map

$$\begin{aligned} R_n^*: \mathcal {S}(C(S^1)^{(n)})\rightarrow & {} \mathcal {S}(C(S^1))\\ \tau \quad \quad\mapsto & {} \quad \tau \circ R_n \end{aligned}$$

is well-defined and satisfies \({\text {dis}} R_n^* \rightarrow 0\) as \(n \rightarrow \infty \).

Proof

See [14]. \(\square \)

Next, let us give a description of the pure states on \(C(S^1)^{(n)}\). This has already been done via a duality of the Toeplitz operator system with the Fejér-Riesz operator system [4], but here we will use a more direct approach.

A very useful ingredient for this is the following decomposition theorem from 1911, proven by C. Carathéodory and L. Fejér [2]. We introduce the notation

$$\begin{aligned}f_z = \frac{1}{\sqrt{n}} \ \big (\begin{array}{ccccc} 1&z&z^2&\cdots&z^{n-1} \end{array}\big ) \in \mathbb {C}^n,\end{aligned}$$

which is a column of a so-called Vandermonde matrix.

Theorem 2.11

Any positive Toeplitz matrix \(T \in C(S^1)^{(n)}\) of rank \(r \le n-1\) can be uniquely decomposed as \(T = \sum _{k=1}^{r} d_k \vert f_{\lambda _k}\rangle \langle f_{\lambda _k}\vert \) where \(d_1, \dots , d_r > 0\) and \(\lambda _1, \dots , \lambda _r \in S^1\). This is called the Vandermonde decomposition. If the rank of T is n, this decomposition is still possible but not unique.

Proof

See [15]. \(\square \)
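The easy direction of the Vandermonde decomposition can be checked numerically: any sum \(\sum _k d_k \vert f_{\lambda _k}\rangle \langle f_{\lambda _k}\vert \) with \(\lambda _k \in S^1\) is a Hermitian Toeplitz matrix, since the entry (i, j) of \(\vert f_\lambda \rangle \langle f_\lambda \vert \) is \(\lambda ^{i-j}/n\). The sketch below only assembles such a sum; it does not compute the decomposition of a given matrix, and the weights and angles are arbitrary sample values.

```python
import cmath


def f_vec(z, n):
    """The vector f_z = n^{-1/2} (1, z, ..., z^{n-1})."""
    scale = 1 / n ** 0.5
    return [scale * z ** k for k in range(n)]


def vandermonde_sum(weights, angles, n):
    """Assemble sum_k d_k |f_{lambda_k}><f_{lambda_k}| with lambda_k = exp(i * angle_k)."""
    T = [[0j] * n for _ in range(n)]
    for d, theta in zip(weights, angles):
        f = f_vec(cmath.exp(1j * theta), n)
        for i in range(n):
            for j in range(n):
                T[i][j] += d * f[i] * f[j].conjugate()
    return T


# Rank 2 example in dimension 4 with weights d = (2.0, 0.5).
T = vandermonde_sum([2.0, 0.5], [0.3, 2.0], 4)
```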

While we use this classical theorem to classify the pure states on the Toeplitz operator system, the same result can also be derived by the aforementioned operator system duality, see [4, Theorem 4.14].

Proposition 2.12

A state on \(C(S^1)^{(n)}\) is pure if and only if it is a vector state \(\varphi _\xi : T \mapsto \langle \xi , T\xi \rangle ,\) where \(\xi = (\xi _0, \xi _1, \dots , \xi _{n-1}) \in \mathbb {C}^n\) is a unit vector such that the polynomial \(Q_\xi (z) := \sum _k \xi _k z^{n-k-1}\) has all its zeroes on \(S^1\).

Proof

First of all, any pure state on \(C(S^1)^{(n)}\) admits an extension to a pure state on \(M_n(\mathbb {C})\) [4, Fact 2.9]. Since all pure states on \(M_n(\mathbb {C})\) are vector states, any pure state on \(C(S^1)^{(n)}\) must likewise be of the form \(\varphi _\xi : T \mapsto \langle \xi , T\xi \rangle \) for some unit vector \(\xi \in \mathbb {C}^n\).

By the Vandermonde decomposition of Toeplitz matrices, we can write any positive \(T \in C(S^1)^{(n)}\) in the form

$$\begin{aligned}\sum _{k=1}^{r} d_k \vert f_{\lambda _k}\rangle \langle f_{\lambda _k}\vert \end{aligned}$$

with \(d_1, \dots , d_{r} \ge 0\); this is Theorem 2.11. Observe that on \(S^1\), we have \(\overline{\lambda } = \lambda ^{-1}\) and so

$$\begin{aligned}\langle f_\lambda , \xi \rangle = \sum _{k=0}^{n-1} \xi _k \lambda ^{-k} = \lambda ^{-n+1} Q_\xi (\lambda ).\end{aligned}$$

Therefore,

$$\begin{aligned}\varphi _\xi (T) = \langle \xi , T\xi \rangle = \sum _{k=1}^{r} d_k \left|\langle f_{\lambda _k}, \xi \rangle \right|^2 = \sum _{k=1}^{r} d_k \left|Q_\xi (\lambda _k)\right|^2.\end{aligned}$$

Now let us investigate for what \(\xi \) this state is pure. Note that \(\varphi _\xi \) is pure if and only if

$$\begin{aligned}\varphi _\omega (T) \le \varphi _\xi (T) \quad \forall 0\le T\in C(S^1)^{(n)}\end{aligned}$$

implies that \(\omega \in \mathbb {C}\xi \) (see [10, p. 144]), and by the above calculation this is equivalent to

$$\begin{aligned}\left|Q_\omega (z)\right| \le \left|Q_\xi (z)\right| \quad \forall z \in S^1\end{aligned}$$

implying that \(\omega \in \mathbb {C}\xi \).

We claim that if \(\left|Q_\omega (z)\right| \le \left|Q_\xi (z)\right|\) for all \(z \in S^1\) and \(\lambda \in S^1\) is a root of \(Q_\xi \), then \(\lambda \) must also be a root of \(Q_\omega \) with at least the same multiplicity. The first part of this claim is immediate. To prove the second part, suppose the multiplicity of the root \(\lambda \) is m for \(Q_\omega \) but strictly more than that for \(Q_\xi \). Then we must still have

$$\begin{aligned}\left|\frac{Q_\omega (z)}{(z-\lambda )^m}\right| \le \left|\frac{Q_\xi (z)}{(z-\lambda )^m}\right|\end{aligned}$$

on \(S^1\), but with the right-hand side evaluating to zero at \(z = \lambda \) whereas the left-hand side does not, we arrive at a contradiction.

Therefore, if \(Q_\xi \) has all its roots on \(S^1\), the above observation combined with the fact that \(Q_\omega \) and \(Q_\xi \) are polynomials of the same degree leads to the conclusion that \(\left|Q_\omega (z)\right| \le \left|Q_\xi (z)\right| \forall z\in S^1\) implies that \(Q_\omega \in \mathbb {C}Q_\xi \). Hence indeed \(\varphi _\xi \) is pure, which completes one direction of the proposition.

For the other direction, suppose that \(\xi \in \mathbb {C}^n\) is a vector such that the polynomial \(Q_\xi \) has roots \(\lambda _1, \cdots , \lambda _{n-1} \) (counted with multiplicities), of which say \(\lambda _{n-1}\) is not an element of \(S^1\). Then \(\left|z-\lambda _{n-1}\right|\) attains a minimum on \(S^1\) which is strictly greater than zero, say \(\delta > 0\). Choose any \(\lambda \in S^1\) and note that \(\left|z-\lambda \right| \le 2\) on \(S^1\). Then

$$\begin{aligned} \left|Q_\xi (z)\right|^2&= \big |c \prod _{k=1}^{n-1} (z-\lambda _k)\big |^2\\&\ge \frac{\delta ^2}{4}\big \vert c (z-\lambda ) \prod _{k=1}^{n-2} (z-\lambda _k)\big |^2, \end{aligned}$$

so the polynomial \(\frac{\delta }{2} c (z-\lambda ) \prod _{k=1}^{n-2} (z-\lambda _k) =: \sum _k \omega _k z^{n-k-1}\) corresponds to some \(\omega \in \mathbb {C}^n\) with the property that

$$\begin{aligned}\langle \omega , T\omega \rangle \le \langle \xi , T\xi \rangle \end{aligned}$$

for all positive \(T \in C(S^1)^{(n)}\). Clearly, \(Q_\omega \) is not a scalar multiple of \(Q_\xi \) so \(\omega \not \in \mathbb {C}\xi \), and hence the vector state \(\varphi _\xi \) is not pure. \(\square \)
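The key identity in the proof, relating \(\langle f_\lambda , \xi \rangle \) to \(Q_\xi (\lambda )\), can be verified numerically. The sketch below takes the inner product to be conjugate-linear in its first argument and keeps the normalization factor \(1/\sqrt{n}\) of \(f_\lambda \) explicit; that factor is a positive constant and plays no role in the purity argument. The vector \(\xi \) and the point \(\lambda \) are arbitrary sample values.

```python
import cmath


def Q(xi, z):
    """Evaluate Q_xi(z) = sum_k xi_k z^{n-k-1}."""
    n = len(xi)
    return sum(x * z ** (n - k - 1) for k, x in enumerate(xi))


def inner_f(lam, xi):
    """<f_lam, xi> with f_lam = n^{-1/2} (1, lam, ..., lam^{n-1}),
    conjugate-linear in the first argument."""
    n = len(xi)
    return sum((lam ** k).conjugate() * x for k, x in enumerate(xi)) / n ** 0.5


xi = [0.5, -0.5j, 0.5, 0.5]  # a unit vector in C^4
lam = cmath.exp(0.7j)        # a point on the unit circle
n = len(xi)

lhs = inner_f(lam, xi)
rhs = lam ** (-(n - 1)) * Q(xi, lam) / n ** 0.5
```

Since \(\left|\lambda \right| = 1\), the two sides also agree in absolute value with \(\left|Q_\xi (\lambda )\right|/\sqrt{n}\), which is what enters the formula for \(\varphi _\xi (T)\).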

Corollary 2.13

The pure states of \(C(S^1)^{(n)}\) are exactly the linear functionals of the form \(P_nfP_n \mapsto \! \int _{S^1} f \ \left|Q_\xi \right|^2 \ d\lambda \), where \(\xi \in \mathbb {C}^n\) is a unit vector and \(Q_\xi := \sum _{k=0}^{n-1}\xi _k z^{n-k-1}\) has all its roots on \(S^1\).

Proof

According to Proposition 2.12, the pure states on \(C(S^1)^{(n)}\) are given by \(T \mapsto \langle \xi , T\xi \rangle ,\) where the unit vector \(\xi \in \mathbb {C}^{n}\) is such that the polynomial \(Q_\xi \) has all its roots on \(S^1\). A short calculation gives that

$$\begin{aligned}\langle \xi , P_nfP_n \xi \rangle = \sum _{\left|j\right|\le n-1} (\xi ^* * \xi )_j \ \hat{f}(-j),\end{aligned}$$

where \((v * w)_j = \sum _{k=\max (0, -j)}^{\min (n-1,n-j-1)} v_{-k} w_{j+k}\) is the discrete convolution product and \((v^*)_j = \overline{v_{n-j-1}} \). By noting that \((\xi ^* * \xi )_j\) is exactly the jth Fourier coefficient of \(\left|Q_\xi \right|^2\), we can use the Plancherel Theorem [13, Theorem 9.13] to conclude that

$$\begin{aligned} \langle \xi , P_nfP_n \xi \rangle = \int _{S^1} f \ \left|Q_\xi \right|^2 \ d\lambda . \end{aligned}$$

This corollary essentially characterizes the pure states \(\tau \) of \(C(S^1)^{(n)}\) by the Radon–Nikodym derivative of \(R_n^*(\tau )\) as a state (i.e., probability measure) on \(C(S^1)\). This works since the Fourier basis \(\{e_n\}_{n \in \mathbb {Z}}\) forms an orthonormal basis of \(C(S^1)\), hence for a function \(g \in \mathrm {span}_\mathbb {C} \{e_1, \dots e_n \}\) there is some ambiguity in considering a state \(\tau : P_nfP_n \mapsto \int _{S^1} fg d\lambda \) on \(C(S^1)^{(n)}\) or its pullback \(R_n^*\tau : f \mapsto \int _{S^1} fg d\lambda \). We will exploit this, though with care in order to prevent confusion.

Notation. When considering a pure state \(\tau \) on \(C(S^1)^{(n)}\), we will somewhat abusively refer to the Radon–Nikodym derivative of \(R_n^*\tau \), with respect to the normalized Haar-measure \(d\lambda \) on \(S^1\), as \(\frac{d\tau }{d\lambda }\) (instead of \(\frac{dR_n^*\tau }{d\lambda }\)) because this function uniquely defines the pure state \(\tau \). Conversely, if f is a function of a form that defines a pure state on \(C(S^1)^{(n)}\), we will denote that pure state by \(\tau _{f}\). In summary, \(\frac{d\tau _f}{d\lambda } = f\).

Proposition 2.14

If \(\tau \) is a pure state on \(C(S^1)^{(n)}\), then

$$\begin{aligned}\frac{d\tau }{d\lambda }(t) = \left|Q_\xi \right|^2(e^{it}) = c \prod _{j=1}^{n-1} (2-2\cos (t-\theta _j)),\end{aligned}$$

where \(e^{i\theta _j}\) are the roots of the polynomial \(Q_\xi := \sum _k \xi _k z^{n-k-1}\) and \(c\in \mathbb {R}\) is a scaling factor such that \(\frac{d\tau }{d\lambda }\) integrates to 1. Likewise, any function of this form defines a pure state.

Proof

According to Corollary 2.13, any pure state \(\tau \) on \(C(S^1)^{(n)}\) corresponds to a function of the form \(\left|Q_\xi \right|^2\). Denoting the roots of \(Q_\xi \) by \(e^{i\theta _j}\), there must be a constant c such that

$$\begin{aligned} \left|Q_\xi (e^{it})\right|^2&= c\prod _{j=1}^{n-1} \left|e^{it} - e^{i \theta _j}\right|^2\\&= c\prod _{j=1}^{n-1} (2 - 2 \cos (t-\theta _j)). \end{aligned}$$

Furthermore, \(\left|Q_\xi \right|^2\) must integrate to 1 because \(\left\Vert Q_\xi \right\Vert _2 = \left\Vert \xi \right\Vert \) since the Fourier transform is unitary.

Conversely, the computation above shows that any function of the form \( c\prod _{j=1}^{n-1} (2 - 2 \cos (t-\theta _j))\) is equal to \(\left|Q_\xi (e^{it})\right|^2\) for some vector \(\xi \in \mathbb {C}^{n}\) such that \(Q_\xi \) has all its zeroes on \(S^1\). And again because \(\left\Vert Q_\xi \right\Vert _2 = \left\Vert \xi \right\Vert \), \(\left|Q_\xi \right|^2\) integrating to 1 implies that \(\xi \) is a unit vector. According to Corollary 2.13, this function therefore indeed defines a pure state on \(C(S^1)^{(n)}\). \(\square \)
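The two ingredients of Proposition 2.14, the identity \(\left|e^{it}-e^{i\theta }\right|^2 = 2-2\cos (t-\theta )\) and the normalizing constant c, can be checked with a few lines of Python. In this sketch the constant is obtained from a uniform Riemann sum over the circle, which is exact for trigonometric polynomials of degree below the number of sample points; the function names and the sample angles are illustrative assumptions.

```python
import cmath
import math


def density_unnormalized(t, thetas):
    """Product of the factors 2 - 2 cos(t - theta_j)."""
    return math.prod(2 - 2 * math.cos(t - th) for th in thetas)


def modulus_squared(t, thetas):
    """Product of |e^{it} - e^{i theta_j}|^2, i.e. |Q(e^{it})|^2 for a monic Q."""
    z = cmath.exp(1j * t)
    return math.prod(abs(z - cmath.exp(1j * th)) ** 2 for th in thetas)


def normalizing_constant(thetas, samples=256):
    """Constant c such that c * density integrates to 1 against normalized Haar measure."""
    mean = sum(density_unnormalized(2 * math.pi * k / samples, thetas)
               for k in range(samples)) / samples
    return 1 / mean


c_one_root = normalizing_constant([0.4])        # a single factor integrates to 2
c_two_roots = normalizing_constant([0.4, 2.0])  # depends on the angle difference 1.6
```

For a single root the constant is \(c = 1/2\) regardless of \(\theta \), which corresponds to the case \(n = 2\).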

3 Convergence to \(\mathcal {S}(C(S^1))\)

We will now prove the convergence of \(\mathcal {P}(C(S^1)^{(n)})\) to \(\mathcal {S}(C(S^1))\) by establishing \(\varepsilon \)-isometries between these spaces, employing Corollary 2.9. The candidate maps we propose are the maps \(R_n^*\) for which we already have that \({\text {dis}} R_n^* \rightarrow 0\) (Lemma 2.10). If we can now establish that \(R_n^* (\mathcal {P}(C(S^1)^{(n)}))\) is an \(\varepsilon \)-net if we choose n large enough, Gromov–Hausdorff convergence follows directly. First, we will prove that all states on the circle can be approximated by the pullback of pure states in \(\mathcal {P}(C(S^1)^{(n)})\). Next, we check that this can be done uniformly in n.

3.1 Approximating states

In this subsection, three steps of increasing difficulty will be carried out to prove that any state on the circle can be approximated by the pullbacks of pure states on the truncated circle.

  1. We approximate any state \(\psi \) on \(C(S^1)\) by a state \(\sum _{i=1}^m t_i \text {ev}_{\lambda _i}\) on \(C(S^1)\), which is a convex combination of evaluations at the m-th roots of unity;

  2. We then approximate any such convex combination of evaluations by a state

    $$\begin{aligned}\sum _{j=1}^m \frac{ \frac{d\varphi }{d\lambda }(\lambda _j)}{\sum _{i=1}^m \frac{d\varphi }{d\lambda }(\lambda _i) } \text {ev}_{\lambda _j}\end{aligned}$$

    on \(C(S^1)\), where \(\lambda _j\) are the m-th roots of unity and \(\varphi \) is a pure state on \(C(S^1)^{(n)}\);

  3. Finally, we approximate any state of that particular form by a state \(R_n^* \chi \) on \(C(S^1)\) where \(\chi \) is a pure state on \(C(S^1)^{(n)}\).

Recall that a pure state \(\varphi \) on \(C(S^1)^{(n)}\) is uniquely characterized by the Radon–Nikodym derivative with respect to the normalized Haar-measure on \(S^1\) of \(R_n^*\varphi \), which is a state (i.e., a probability measure) on \(C(S^1)\), and that we denote this Radon–Nikodym derivative \(\frac{d\varphi }{d\lambda }\) instead of \(\frac{dR_n^*\varphi }{d\lambda }\) to ease notation. See also Subsect. 2.2. For the third and most important step, the essential property of the pure state space we will exploit is that it is possible to multiply these Radon–Nikodym derivatives to construct new pure states, as can be seen from the form of these functions in Proposition 2.14.
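The multiplicative property mentioned above amounts to the fact that \(\left|Q_\xi \right|^2 \left|Q_\eta \right|^2 = \left|Q_\xi Q_\eta \right|^2\), and that the roots of a product are the roots of the factors. In terms of coefficient vectors, multiplying polynomials is a convolution, and vectors of lengths \(n_1\) and \(n_2\) give a vector of length \(n_1 + n_2 - 1\). A minimal sketch follows; the helper names are hypothetical, and the resulting vector still has to be rescaled to a unit vector to define a state.

```python
import cmath


def Q(xi, z):
    """Evaluate Q_xi(z) = sum_k xi_k z^{n-k-1} (coefficients in descending degree)."""
    n = len(xi)
    return sum(x * z ** (n - k - 1) for k, x in enumerate(xi))


def poly_mul(xi, eta):
    """Coefficient vector of Q_xi * Q_eta, again in descending degree."""
    out = [0j] * (len(xi) + len(eta) - 1)
    for i, a in enumerate(xi):
        for j, b in enumerate(eta):
            out[i + j] += a * b
    return out


xi = [1, -1]   # Q_xi(z) = z - 1, root at 1
eta = [1, 1]   # Q_eta(z) = z + 1, root at -1
prod = poly_mul(xi, eta)  # z^2 - 1, roots at 1 and -1, all on the circle

z = cmath.exp(0.9j)
lhs = abs(Q(prod, z)) ** 2
rhs = abs(Q(xi, z)) ** 2 * abs(Q(eta, z)) ** 2
```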

The first step in the outlined scheme is by far the easiest. Because the roots of unity are dense in \(S^1\) and the standard topology on \(S^1\) coincides with the weak\(^*\)-topology on the pure states, which in turn is induced by the Monge–Kantorovich metric [11], it follows that the set

$$\begin{aligned}\left\{ \sum _{i=1}^m t_i \text {ev}_{\lambda _i} \ : \ m \in \mathbb {N}, \ 0 \le t_i \le 1, \sum _{i=1}^m t_i = 1, \lambda _i^m = 1 \right\} ,\end{aligned}$$

i.e., convex combinations of evaluations at the roots of unity, is dense in \(\mathcal {S}(C(S^1))\) with respect to the topology induced by the Monge–Kantorovich metric.
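For a finitely supported state, the first step can be made explicit by moving each atom to the nearest m-th root of unity. Each atom then travels along an arc of length at most \(\pi /m\), and by the standard transport estimate the Monge–Kantorovich distance between the original and the snapped state is at most the largest such arc. The sketch below handles only this finitely supported case, which is an assumption of the example; a general state first needs to be approximated by finitely supported ones. The function name and the sample atoms are illustrative.

```python
import math


def snap_to_roots_of_unity(atoms, m):
    """Move each atom (angle, weight) to the nearest m-th root of unity.

    Returns the weights t_0, ..., t_{m-1} carried by the angles 2*pi*j/m
    and the largest arc length by which any atom was moved.
    """
    step = 2 * math.pi / m
    t = [0.0] * m
    max_shift = 0.0
    for angle, w in atoms:
        a = angle % (2 * math.pi)
        j = round(a / step)            # index of the nearest root, possibly m
        max_shift = max(max_shift, abs(a - j * step))
        t[j % m] += w                  # the index m wraps around to the root 1
    return t, max_shift


atoms = [(0.1, 0.25), (3.0, 0.5), (6.2, 0.25)]  # a probability measure with 3 atoms
t, max_shift = snap_to_roots_of_unity(atoms, 8)
```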

The second step can be done in a single lemma. In spirit, this lemma should be compared with the assertion that one can always fit a polynomial P of degree \(n-1\) (or higher, if desired) through n prescribed points. Here, we must choose a pure state \(\varphi \) such that the function

$$\begin{aligned}\frac{\frac{d\varphi }{d\lambda }}{\sum _{i=1}^m \frac{d\varphi }{d\lambda }(\lambda _i)}\end{aligned}$$

evaluates (approximately) to \(t_i\) at the points \(\lambda _i\).

Lemma 3.1

Let \(\lambda _1, ..., \lambda _m\) be the solutions of \(\lambda ^m = 1\) (the m-th roots of unity), and take any state of the form \(\sum _{i=1}^m t_i \text {ev}_{\lambda _i}\) with \(\sum _{i=1}^m t_i = 1\) and \(t_i \ge 0\) for all i. Then for every \(l \in \mathbb {Z}_{\ge 0}\) and \(\varepsilon >0\) we can find \(\varphi \in \mathcal {P}(C(S^1)^{(m+1+l)})\) such that on \(C(S^1)\)

$$\begin{aligned}d\left( \sum _{i=1}^m t_i \text {ev}_{\lambda _i}, \sum _{j=1}^m \frac{\frac{d\varphi }{d\lambda }(\lambda _j)}{\sum _{i=1}^m \frac{d\varphi }{d\lambda }(\lambda _i)} \text {ev}_{\lambda _j} \right) < \varepsilon .\end{aligned}$$

Proof

Let us first prove the case \(l = 0\). Take the pure state \(\varphi _N\) on \(C(S^1)^{(m+1)}\) defined by \(\frac{d\varphi _N}{d\lambda }(t) = c\prod _{j=1}^m (1-\cos (t-\lambda _j + \sqrt{\frac{2t_j}{N}}))\). This indeed corresponds to a pure state in \(\mathcal {P}(C(S^1)^{(m+1)})\) according to Proposition 2.14. Consider what happens at the points \(\lambda _i\) if we take N large.

By Taylor expansion of the factors, it is a quick calculation to see that

$$\begin{aligned} \frac{d\varphi _N}{d\lambda }(\lambda _i)= & {} c\left( \frac{t_i}{N}+ \mathcal {O}\left( \frac{1}{N^{3/2}}\right) \right) \left( \prod _{j \not = i} \left( (1-\cos (\lambda _i-\lambda _j)) + \mathcal {O}\left( \frac{1}{\sqrt{N}}\right) \right) \right) \\= & {} c\frac{t_i}{N}\prod _{j \not = i} (1-\cos (\lambda _i-\lambda _j)) + \mathcal {O}\left( \frac{1}{N^{3/2}}\right) . \end{aligned}$$

Notice that \(c\prod _{j \not = i} (1-\cos (\lambda _i-\lambda _j))\) has the same value for all \(\lambda _i\) by symmetry. Hence, if we pass these values to the projective space \(\mathbb {R}P^{m-1}\) we end up with the ratio

$$\begin{aligned} \left[ \frac{d\varphi _N}{d\lambda }(\lambda _1) : \cdots : \frac{d\varphi _N}{d\lambda }(\lambda _m)\right]&= \left[ \frac{t_1}{N} + \mathcal {O}\left( \frac{1}{N^{3/2}}\right) : \cdots : \frac{t_m}{N} + \mathcal {O}\left( \frac{1}{N^{3/2}}\right) \right] \\&= \left[ t_1 + \mathcal {O}\left( \frac{1}{\sqrt{N}}\right) : \cdots : t_m + \mathcal {O}\left( \frac{1}{\sqrt{N}}\right) \right] . \end{aligned}$$

It then follows that the vectors

$$\begin{aligned}\frac{1}{\sum _{j=1}^m \frac{d\varphi _N}{d\lambda }(\lambda _j)}\left( \frac{d\varphi _N}{d\lambda }(\lambda _1), \dots , \frac{d\varphi _N}{d\lambda }(\lambda _m)\right) \end{aligned}$$

converge to \((t_1, \dots , t_m)\) in \(\mathbb {R}^m\) as \(N \rightarrow \infty \). Therefore, the states \(\frac{1}{\sum _{j=1}^m \frac{d\varphi _N}{d\lambda }(\lambda _j)} \sum _{j=1}^m \frac{d\varphi _N}{d\lambda }(\lambda _j) \text {ev}_{\lambda _j}\) converge to \(\sum _{j=1}^m t_j \text {ev}_{\lambda _j}\) in the weak\(^*\)-topology. Again we can use that the Monge–Kantorovich metric induces the weak\(^*\)-topology to conclude that we can choose N such that

$$d\left( \sum _{i=1}^m t_i \text {ev}_{\lambda _i}, \sum _{j=1}^m \frac{\frac{d\varphi _N}{d\lambda }(\lambda _j)}{\sum _{i=1}^m \frac{d\varphi _N}{d\lambda }(\lambda _i)} \text {ev}_{\lambda _j} \right) < \varepsilon .$$

The cases \(l \ge 1\) follow more or less immediately. If we choose some point \(\mu \) on the circle that is not equal to any of the \(\lambda _j\), we can guarantee that \(1-\cos (t-\mu )\) does not vanish at the points \(\lambda _1, \dots , \lambda _m\). For any \(l \in \mathbb {N}\), we can take \(\varphi _N \in \mathcal {P}(C(S^1)^{(m+1)})\) such that the ratio \( [\frac{d\varphi _N}{d\lambda }(\lambda _1) : \cdots : \frac{d\varphi _N}{d\lambda }(\lambda _m)]\) is arbitrarily close to

$$\begin{aligned}\left[ \frac{t_1}{(1-\cos (\lambda _1 - \mu ))^l} : \cdots : \frac{t_m}{(1-\cos (\lambda _m - \mu ))^l}\right] \end{aligned}$$

by the argument for the case \(l=0\) above. Then \(\frac{d\varphi _N}{d\lambda } (1-\cos (t- \mu ))^l\) defines the pure state (up to scaling) in \(\mathcal {P}(C(S^1)^{(m+1+l)})\) that satisfies the statement in the lemma. \(\square \)
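The case \(l = 0\) of the lemma is easy to observe numerically. The sketch below evaluates the unnormalized density \(\prod _j (1-\cos (t-\theta _j + \sqrt{2t_j/N}))\) at the angles \(\theta _i = 2\pi i/m\) of the m-th roots of unity and normalizes the resulting values; the constant c cancels in this ratio. The function name, the target weights, and the value of N are assumptions chosen for illustration.

```python
import math


def boundary_weights(t, N):
    """Normalized values of the density of phi_N at the m-th roots of unity.

    t is the list of target weights t_1, ..., t_m; the returned list
    approaches t as N grows, with an error of order 1 / sqrt(N).
    """
    m = len(t)
    theta = [2 * math.pi * j / m for j in range(m)]

    def density(s):
        return math.prod(1 - math.cos(s - theta[j] + math.sqrt(2 * t[j] / N))
                         for j in range(m))

    values = [density(th) for th in theta]
    total = sum(values)
    return [v / total for v in values]


t = [0.5, 0.3, 0.2]
w = boundary_weights(t, 10 ** 6)
```

At \(\theta _i\) the i-th factor is approximately \(t_i/N\), while the remaining factors are nearly independent of i, which is why the normalized values recover the weights.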

For the third and final step, we need to prove that the states of this type can be approximated by the pullback of some pure state on the truncated circle. To accomplish that, we will need the following propositions.

Proposition 3.2

Let \(K\) be a compact subset of \(\mathbb {R}^n\) and let \(f \in C(K)\) be a positive function attaining its maximum at a unique point \(x_0\). Then the sequence of linear functionals \(\left( \tau _n\right) _{n \in \mathbb {N}}\) defined by \(\tau _n: g \mapsto \int _K \frac{f^n}{\left\Vert f^n\right\Vert _1} g dx\) converges to \( \text {ev}_{x_0}\) in the weak\(^*\)-topology on \(C(K)^*\).

Proof

Denote the maximum of f by M. For every \(\varepsilon > 0\), the set \(U_\varepsilon := f^{-1} (M - \varepsilon , M]\) is an open neighborhood of \(x_0\) in K. Outside this neighborhood \(\frac{f^n}{\left\Vert f^n\right\Vert _1} \xrightarrow {n \rightarrow \infty } 0\) uniformly, since

$$\begin{aligned} \left\Vert f^n\right\Vert _1 \ge \int _{U_{\varepsilon /2}} f^n dx > \left|U_{\varepsilon /2}\right| (M - \varepsilon /2)^n, \end{aligned}$$

and so for \(x \not \in U_\varepsilon \)

$$\begin{aligned} \frac{f^n}{\left\Vert f^n\right\Vert _1} (x) \le \frac{1}{\left|U_{\varepsilon /2}\right|} \left( \frac{M-\varepsilon }{M - \varepsilon /2}\right) ^n.\end{aligned}$$

Therefore,

$$\begin{aligned} \left|\tau _n(g) - \text {ev}_{x_0}(g)\right|&=\left|\int _K \frac{f^n}{\left\Vert f^n\right\Vert _1} g dx - \text {ev}_{x_0}(g)\right|\\&= \left|\int _K \frac{f^n}{\left\Vert f^n\right\Vert _1} (g - g(x_0)) dx\right|\\&\le \int _{K-U_\varepsilon } \frac{f^n}{\left\Vert f^n\right\Vert _1} \left|g - g(x_0)\right|dx + \left|\int _{U_\varepsilon } \frac{f^n}{\left\Vert f^n\right\Vert _1} (g-g(x_0)) dx\right|\\&\le \underbrace{\int _{K-U_\varepsilon } \frac{f^n}{\left\Vert f^n\right\Vert _1} \left|g - g(x_0)\right|dx}_{\xrightarrow {n \rightarrow \infty } 0} + \underbrace{\sup _{x \in U_\varepsilon } \left|g(x)-g(x_0)\right|}_{ \xrightarrow {\varepsilon \rightarrow 0} 0} \underbrace{\int _{U_\varepsilon } \frac{f^n}{\left\Vert f^n\right\Vert _1}dx}_{\le 1}. \end{aligned}$$

Since the second term becomes small as \(\varepsilon \rightarrow 0\) independent of n, we see that this converges to 0 indeed. \(\square \)
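The concentration described by Proposition 3.2 can be observed on a one-dimensional example. The sketch below takes \(K = [-1,1]\), \(f(x) = 2 - x^2\) with its unique maximum at \(x_0 = 0\), and \(g = \cos \), and approximates \(\tau _n(g)\) by a midpoint rule. These choices, the function name `tau`, and the number of sample points are assumptions made for illustration only.

```python
import math


def tau(n, f, g, a, b, samples=20001):
    """Approximate tau_n(g) = (int f^n g dx) / (int f^n dx) on [a, b] by a midpoint rule."""
    h = (b - a) / samples
    xs = [a + (k + 0.5) * h for k in range(samples)]
    weights = [f(x) ** n for x in xs]
    return sum(w * g(x) for w, x in zip(weights, xs)) / sum(weights)


def f(x):
    """Positive on [-1, 1] with a unique maximum at x0 = 0."""
    return 2 - x * x


g = math.cos

# Distance between tau_n(g) and ev_{x0}(g) = g(0) for increasing n.
errors = [abs(tau(n, f, g, -1.0, 1.0) - g(0.0)) for n in (10, 100, 400)]
```

In this example \(f^n\) behaves like a Gaussian of variance roughly \(1/n\) around \(x_0\), so the error decays at a rate of order \(1/n\).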

Proposition 3.3

The convex combinations \(\sum _{j=1}^m \frac{1}{m} \text {ev}_{\lambda _j}\), where \(\lambda _j\) are the solutions of \(\lambda ^m = 1\), are weak\(^*\)-limits in \(\mathcal {S}(C(S^1))\) of sequences \((R_{n(m+1)}^*\tau _n)_{n\in \mathbb {N}}\) with \(\tau _n \in \mathcal {P}(C(S^1)^{(n(m+1))})\).

Proof

Take the polynomial \(Q_\xi {:=} \sum _{k} \xi _k z^{m-k} {=} \frac{1}{\sqrt{2}}( 1{-}z^{m})\), i.e., \(\xi {=} \frac{1}{\sqrt{2}}(-1, 0, ..., 0, 1)\). Then the function

$$\begin{aligned}g_m(t) := \left|Q_\xi (e^{it})\right|^2 = 1 - \cos (mt)\end{aligned}$$

defines a pure state \(\tau _{g_m}\) on \(C(S^1)^{(m+1)}\) in the manner of Proposition 2.12. Likewise, due to Proposition 2.14, the function \(\frac{(g_m)^n}{\left\Vert (g_m)^n\right\Vert _1}\) defines a pure state \(\tau _n:=\tau _{\frac{(g_m)^n}{\left\Vert (g_m)^n\right\Vert _1}}\) on \(C(S^1)^{(n(m+1))}\).

Note that \(g_m\) reaches its maximum in the m points \(\lambda _j\), denote the roots in between by \(\mu _j\) (to be precise, each \(\mu _j\) is the rotation of \(\lambda _j\) by \(\pi /m\)). By symmetry of \(g_m\), \(\left\Vert (g_m\chi _{[\mu _j, \mu _{j+1}]})^n\right\Vert _1 = \frac{1}{m} \left\Vert (g_m)^n\right\Vert _1\). Hence,

$$\begin{aligned}\frac{(g_m)^n}{\left\Vert (g_m)^n\right\Vert _1} = \frac{1}{m} \sum _{j=1}^m \frac{(g_m\chi _{[\mu _j, \mu _{j+1}]})^n}{\left\Vert (g_m\chi _{[\mu _j, \mu _{j+1}]})^n\right\Vert _1},\end{aligned}$$

and by applying Proposition 3.2 on all these terms, we conclude that

$$\begin{aligned}R_{n(m+1)}^*\tau _n \xrightarrow {w^*} \sum _{j=1}^m \frac{1}{m} \text {ev}_{\lambda _j}. \end{aligned}$$

Therefore, convex combinations of this type can indeed be approximated by pure states of the truncated circle. We will now use one more trick, which is to multiply the convergent sequence of pure states of the proposition above with the Radon–Nikodym derivative of another pure state, which results in a new sequence of pure states that converges to what we need.

Proposition 3.4

Let \(\lambda _1, ..., \lambda _m\) be the solutions of \(\lambda ^m = 1\), and take any \(\varphi \in \mathcal {P}(C(S^1)^{(k)})\) such that \(\frac{d\varphi }{d\lambda }(\lambda _j) \not =0\) for at least one j. Then

$$\begin{aligned} \sum _{j=1}^m \frac{\frac{d\varphi }{d\lambda }(\lambda _j)}{\sum _{i=1}^m\frac{d\varphi }{d\lambda }(\lambda _i) } \text {ev}_{\lambda _j}\end{aligned}$$

is the weak\(^*\)-limit of a sequence \((R_{k+n(m+1)}^*\chi _n)_{n \in \mathbb {N}}\) with \(\chi _n \in \mathcal {P}(C(S^1)^{(k+n(m+1))})\).

Proof

Consider the space of linear functionals \(C(S^1)^*\). Observe that on \(C(S^1)^*\):

  1. The map

    $$\begin{aligned} M_f^*: C(S^1)^*&\rightarrow C(S^1)^*\\ \tau&\mapsto \tau \circ M_f, \end{aligned}$$

    where \(M_f\) indicates multiplication by f, is weak\(^*\)-continuous. This is trivial, since if \(\tau _n \xrightarrow {w^*} \tau \), then by definition \(\tau _n(fg) \rightarrow \tau (fg)\) for all \(g \in C(S^1)\) so \(M_f^* \tau _n \xrightarrow {w^*} M_f^* \tau \).

  2. If \(\tau _n \xrightarrow {w^*} \tau \), then by definition also \(\tau _n(1) \rightarrow \tau (1)\). As scalar multiplication is weak\(^*\)-continuous

    $$\begin{aligned}\frac{\tau _n}{\tau _n(1)} \xrightarrow {w^*} \frac{\tau }{\tau (1)},\end{aligned}$$

    provided that these scalars are nonzero.

Take \(\varphi \in \mathcal {P}(C(S^1)^{(k)})\). For this proof, we will simplify notation by denoting the linear functional \(g \mapsto \int _{S^1} fg \ \! d\lambda \) on \(C(S^1)\) simply by f. As seen in the proof of Proposition 3.3, if we define \(g_m(t) = 1-\cos (mt)\), we have that

$$\begin{aligned}\frac{(g_m)^n}{\left\Vert (g_m)^n\right\Vert _1} \xrightarrow {w^*} \sum _{j=1}^m \frac{1}{m} \text {ev}_{\lambda _j}.\end{aligned}$$

If we apply observation 1 to this sequence with the map \(M_{\frac{d\varphi }{d\lambda }}^*\), we get that

$$\begin{aligned}\frac{(g_m)^n \frac{d\varphi }{d\lambda }}{\left\Vert (g_m)^n\right\Vert _1} \xrightarrow {w^*} \sum _{j=1}^m \frac{\frac{d\varphi }{d\lambda }(\lambda _j)}{m}\text {ev}_{\lambda _j}.\end{aligned}$$

When \(\frac{d\varphi }{d\lambda }(\lambda _j) \not =0\) for at least one j, all these are nonzero positive linear functionals on \(C(S^1)\) so evaluating these functionals at 1 gives a nonzero scalar.

By observation 2, we can therefore conclude that

$$\begin{aligned}\frac{(g_m)^n \frac{d\varphi }{d\lambda }}{\left\Vert (g_m)^n \frac{d\varphi }{d\lambda }\right\Vert _1} \xrightarrow {w^*} \sum _{j=1}^m \frac{\frac{d\varphi }{d\lambda }(\lambda _j)}{\sum _{i=1}^m \frac{d\varphi }{d\lambda }(\lambda _i) } \text {ev}_{\lambda _j}.\end{aligned}$$

Finally, the functional \(f \mapsto \int _{S^1} f \frac{(g_m)^n \frac{d\varphi }{d\lambda }}{\left\Vert (g_m)^n \frac{d\varphi }{d\lambda }\right\Vert _1} \ \! d\lambda \) is exactly \(R_{k+(n(m+1))}^* \chi _n\) for \(\chi _n\) the pure state on \(C(S^1)^{(k+(n(m+1)))}\), defined via Proposition 2.14 by

$$\begin{aligned}\frac{d\chi _n}{d\lambda } = \frac{(g_m)^n \frac{d\varphi }{d\lambda }}{\left\Vert (g_m)^n \frac{d\varphi }{d\lambda }\right\Vert _1}. \end{aligned}$$
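The coefficients of the limit state in Proposition 3.4 are easy to compute for a concrete pure state. The following is a minimal sketch, assuming the density \(\frac{d\varphi }{d\lambda }\) has the product form \(c \prod _j (1-\cos (t-\theta _j))\) that appears in Corollary 3.7; the function names and the sample angles are hypothetical and chosen only for illustration.

```python
import math

def pure_state_density(t, thetas):
    """Unnormalised density prod_j (1 - cos(t - theta_j)); the constant c cancels later."""
    value = 1.0
    for theta in thetas:
        value *= 1.0 - math.cos(t - theta)
    return value

def limit_weights(thetas, m):
    """Coefficients of ev_{lambda_j} in the limit state, with lambda_j = exp(2 pi i j / m)."""
    values = [pure_state_density(2.0 * math.pi * j / m, thetas) for j in range(m)]
    total = sum(values)
    if total == 0.0:
        # Excluded by the hypothesis that the density is nonzero at some lambda_j.
        raise ValueError("density vanishes at every m-th root of unity")
    return [v / total for v in values]

# Hypothetical example: zeros at angles 0 and 1, evaluated at the 4th roots of unity.
weights = limit_weights([0.0, 1.0], 4)
```

In this example the density vanishes at \(\lambda = 1\), so the corresponding coefficient is zero, while the remaining coefficients are positive and sum to one.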

We have completed all the steps that were described at the beginning of this subsection. Combined, these give the following proposition.

Proposition 3.5

Given any state \(\psi \in \mathcal {S}(C(S^1))\) and \(\varepsilon >0\), there exists \(N \in \mathbb {N}\) such that for any \(n \ge N\) we can find a pure state \(\chi _n \in \mathcal {P}(C(S^1)^{(n)})\) with \(d(\psi , R_n^* (\chi _n)) < \varepsilon \), where d is the Monge–Kantorovich metric on \(\mathcal {S}(C(S^1))\).

Proof

The proof of this proposition is nothing but the execution of the steps as described at the start of this section. There is one subtlety, however, so we do not omit the proof altogether.

Take \(\psi \in \mathcal {S}(C(S^1))\) and \(\varepsilon >0\). Using Lemma 3.1 and the observation that the set

$$\begin{aligned}\left\{ \sum _{i=1}^n t_i \text {ev}_{\lambda _i} \ : \ 0 \le t_i \le 1, \sum _{i=1}^n t_i = 1, \lambda _i^n = 1 \right\} \end{aligned}$$

is dense in \(\mathcal {S}(C(S^1))\) with respect to the weak\(^*\)-topology, we can choose a (non-pure) state on \(C(S^1)\)

$$\begin{aligned}\rho = \sum _{j=1}^m \frac{\frac{d\varphi }{d\lambda }(\lambda _j)}{\sum _{i=1}^m \frac{d\varphi }{d\lambda }(\lambda _i)} \text {ev}_{\lambda _j}\end{aligned}$$

with \(\varphi \) a pure state in \(\mathcal {P}(C(S^1)^{(m+1)})\) such that \(d(\rho , \psi ) < \varepsilon \).
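The density of convex combinations of evaluations at roots of unity can be made concrete for finitely supported states. The sketch below, which is an illustration and not part of the proof, moves each atom of a discrete probability measure to the nearest nth root of unity. Each atom travels an arc length of at most \(\pi /n\), and the cost of this transport plan is an upper bound for the Monge–Kantorovich distance between the original and the snapped measure. The function name and the random test data are assumptions made for the example.

```python
import math
import random

def snap_to_roots(angles, weights, n):
    """Push a finitely supported probability measure on S^1 onto the n-th roots of unity."""
    step = 2.0 * math.pi / n
    t = [0.0] * n          # convex coefficients t_i of ev_{lambda_i}
    cost = 0.0             # total transport cost of the snapping map
    for angle, w in zip(angles, weights):
        a = angle % (2.0 * math.pi)
        i = round(a / step) % n                    # index of a nearest root of unity
        diff = abs(a - i * step)
        cost += w * min(diff, 2.0 * math.pi - diff)  # arc length moved, at most pi / n
        t[i] += w
    return t, cost

# Hypothetical test measure: 50 random atoms with random normalised weights.
random.seed(0)
angles = [random.uniform(0.0, 2.0 * math.pi) for _ in range(50)]
raw = [random.random() for _ in range(50)]
weights = [r / sum(raw) for r in raw]
t, cost = snap_to_roots(angles, weights, 16)
```

The coefficients `t` are nonnegative and sum to one, and `cost` is bounded by \(\pi /16\) regardless of the measure that is snapped.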

According to Proposition 3.4, we can find a sequence \((\chi _n)_{n\in \mathbb {N}}\) with \(\chi _n \in \mathcal {P}(C(S^1)^{(m+n(m+1))})\) such that \(R_{m+n(m+1)}^*(\chi _n)\) converges to \(\rho \) in the Monge–Kantorovich metric. However, we have to ‘fill in the gaps’ to show that the distance of \(\psi \) to the intermediate pure state spaces also becomes arbitrarily small.

We can exploit Proposition 3.1, which shows that for any \(l\in \mathbb {Z}_{\ge 0}\) we can also find a pure state of higher ‘degree’ \(\varphi _l \in \mathcal {P}(C(S^1)^{(m+1+l)})\) such that the state \(\rho _l\), built from \(\varphi _l\) in the same way as \(\rho \) is built from \(\varphi \), satisfies \(d(\rho _l, \psi ) < \varepsilon \). The argument above can then be repeated to find a sequence \(\chi _{l,n} \in \mathcal {P}(C(S^1)^{(l+m+n(m+1))})\) converging to \(\rho _l\). Choosing \(l=0, ..., m\) results in a finite number of interlacing, separate sequences, which we can combine into one sequence of pure states \((\chi _n)_{n\in \mathbb {N}}\) such that \(\chi _n \in \mathcal {P}(C(S^1)^{(n)})\). For this Frankenstein sequence, \(R_n^* (\chi _n)\) might not converge, but there does exist an N such that \(n\ge N\) implies \(d(R_n^*(\chi _n), \psi ) < 2\varepsilon \). As \(\varepsilon \) was arbitrary, this proves the proposition. \(\square \)

3.2 Uniformity

The result of the previous section means we can approximate all elements in \(\mathcal {S}(C(S^1))\) by pullbacks of elements in \(\mathcal {P}(C(S^1)^{(n)})\). In order to show Gromov–Hausdorff convergence, it remains to be shown that this approximation can be done uniformly so that \(\mathcal {P}(C(S^1)^{(n)})\) forms an \(\varepsilon \)-net in \(\mathcal {S}(C(S^1))\). A simple argument suffices, since \(\mathcal {S}(C(S^1))\) is weak\(^*\)-compact.

Proposition 3.6

For every \(\varepsilon >0\), there exists \(N\in \mathbb {N}\) such that for \(n\ge N\), \(R_n^* (\mathcal {P}(C(S^1)^{(n)}))\) forms an \(\varepsilon \)-net in \(\mathcal {S}(C(S^1))\).

Proof

By the Banach–Alaoglu Theorem [5, Theorem V.3.1], the unit ball of \(C(S^1)^*\) is weak\(^*\)-compact. The set \(\mathcal {S}(C(S^1))\), as a weak\(^*\)-closed subset of the unit ball, is then compact as well. Hence, given \(\varepsilon > 0\), we can find a finite number of \(\psi _1, ..., \psi _m \in \mathcal {S}(C(S^1))\) such that the balls \(B_{\varepsilon }(\psi _i)\) cover \(\mathcal {S}(C(S^1))\).

According to Proposition 3.5, for each \(\psi _i\) we can find \(N_i \in \mathbb {N}\) such that for \( n \ge N_i\) there exists a pure state \(\varphi _i \in \mathcal {P}(C(S^1)^{(n)})\) such that \(d(\psi _i, R_n^*(\varphi _i)) < \varepsilon \). Thus, we have that for any \(\chi \in B_{\varepsilon }(\psi _i)\) and \(n \ge N_i\),

$$\begin{aligned} {\text {dist}}(\chi , R_n^* (\mathcal {P}(C(S^1)^{(n)}))) \le d(\chi , \psi _i) +{\text {dist}}(\psi _i, R_n^* (\mathcal {P}(C(S^1)^{(n)}))) < \varepsilon + \varepsilon = 2 \varepsilon .\end{aligned}$$

Now take any \(\chi \in \mathcal {S}(C(S^1))\). Because

$$\begin{aligned}\mathcal {S}(C(S^1)) \subseteq \bigcup _{i=1}^m B_{\varepsilon }(\psi _i),\end{aligned}$$

\(\chi \) must be an element of the ball \(B_{\varepsilon }(\psi _i)\) for some i. For \(n\ge N:= \max _i N_i\) we have, by the calculation above, that

$$\begin{aligned}{\text {dist}}(\chi , R_n^* (\mathcal {P}(C(S^1)^{(n)}))) < 2\varepsilon . \end{aligned}$$

Hence, \(R_n^* (\mathcal {P}(C(S^1)^{(n)}))\) forms a \(2\varepsilon \)-net in \(\mathcal {S}(C(S^1))\) for \(n \ge N\). As \(\varepsilon > 0\) was arbitrary, this proves the proposition. \(\square \)

Corollary 3.7

The set of measures on \(S^1\) whose Radon–Nikodym derivative with respect to the normalized Haar measure on \(S^1\) is of the form \(c \prod _{j=1}^n (1-\cos (t-\theta _j))\) is dense in the set of positive Borel measures on \(S^1\) with respect to the vague topology.

Proof

Denote the set of positive Borel measures on \(S^1\) by \(\mathcal {M}^+(S^1)\). By Proposition 3.6, we see that for the set

$$\begin{aligned}A:= \bigcup _{n=1}^\infty \left\{ \mu \in \mathcal {M}^+(S^1) : \frac{d\mu }{d\lambda } = c \prod _{j=1}^n (1-\cos (t-\theta _j)), c\in \mathbb {R}_{\ge 0}, \theta _1, \dots , \theta _n \in [0,2\pi )\right\} , \end{aligned}$$

we have for any \(\mu \in \mathcal {M}^+(S^1)\) that \({\text {dist}}(A, \mu ) = 0\). Hence, A is dense in \(\mathcal {M}^+(S^1)\) with respect to the weak\(^*\)-topology, i.e., the vague topology. \(\square \)

Proposition 3.6 also concludes the proof of the Gromov–Hausdorff convergence of \(\mathcal {P}(C(S^1)^{(n)})\) to \(\mathcal {S}(C(S^1))\) as metric spaces.

Theorem 3.8

The metric spaces \(\mathcal {P}(C(S^1)^{(n)})\) converge to the metric space \(\mathcal {S}(C(S^1))\) in the Gromov–Hausdorff sense.

Proof

Corollary 2.9, Lemma 2.10 and Proposition 3.6 combined immediately give the result. \(\square \)

4 Recovering \(S^1\)

It may come as a surprise that the limit of the spaces \(\mathcal {P}(C(S^1)^{(n)})\) is \(\mathcal {S}(C(S^1))\), and not \(\mathcal {P}(C(S^1)) \cong S^1\). However, since by this identification \(S^1\) is a subset of \(\mathcal {S}(C(S^1))\), it must also be possible to recover \(S^1\) as a Gromov–Hausdorff limit of a sequence of subsets of pure states. In this section, we will demonstrate this.

Lemma 4.1

The Fejér kernel rotated by \(\lambda = e^{i\theta }\)

$$\begin{aligned}f_n^\lambda (x) = \sum _{\left|k\right|\le n-1}\left( 1 - \frac{\left|k\right|}{n} \right) e^{ik(\theta -x)},\end{aligned}$$

defines a pure state on \(C(S^1)^{(n)}\) in the sense of Corollary 2.13.

Proof

Take the polynomial \(\sum _{k=0}^{n-1} z^{k}\). Observe that

$$\begin{aligned}(z-1)\sum _{k=0}^{n-1} z^{k} = z^{n} -1,\end{aligned}$$

and hence the roots of \(\sum _{k=0}^{n-1} z^{k}\) are precisely the nth roots of unity with the exception of 1 itself. In fact, \(\frac{1}{\sqrt{n}}\sum _{k=0}^{n-1} z^{k} = Q_\xi (z)\) where \(\xi \) is the vector \(\xi = \frac{1}{\sqrt{n}} (1, ..., 1) \in \mathbb {C}^n\), and \(\left|Q_\xi \right|^2\) defines a pure state. A simple calculation gives that, for \(\left|z\right| = 1\),

$$\begin{aligned}\left|Q_\xi (z)\right|^2 = \sum _{\left|k\right|\le n-1} \left( 1-\frac{\left|k\right|}{n} \right) z^k,\end{aligned}$$

hence the function \(\sum _{\left|k\right|\le n-1}\left( 1 - \frac{\left|k\right|}{n} \right) e^{ikx} = \sum _{\left|k\right|\le n-1}\left( 1 - \frac{\left|k\right|}{n} \right) e^{-ikx} \in C(S^1)\) defines a pure state on \(C(S^1)^{(n)}\). Rotations of this pure state are then also pure states by rotational invariance of \(S^1\). \(\square \)
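The identity between \(\left|Q_\xi \right|^2\) and the Fejér kernel can be checked numerically. The following is a small sanity check, assuming \(Q_\xi (z) = \sum _k \xi _k z^k\) with \(\xi = \frac{1}{\sqrt{n}}(1, ..., 1)\); the function names and the choice \(n = 7\) are made up for the example.

```python
import cmath
import math

def fejer_from_vector(n, x):
    """|Q_xi(e^{ix})|^2 for xi = (1, ..., 1) / sqrt(n), assuming Q_xi(z) = sum_k xi_k z^k."""
    z = cmath.exp(1j * x)
    q = sum(z ** k for k in range(n)) / math.sqrt(n)
    return abs(q) ** 2

def fejer_from_series(n, x):
    """sum over |k| <= n-1 of (1 - |k|/n) e^{ikx}; real because the coefficients are even in k."""
    total = sum((1.0 - abs(k) / n) * cmath.exp(1j * k * x) for k in range(-(n - 1), n))
    return total.real

n = 7
grid = [2.0 * math.pi * j / 400 for j in range(400)]
# Largest pointwise difference between the two expressions on the grid.
max_gap = max(abs(fejer_from_vector(n, x) - fejer_from_series(n, x)) for x in grid)
# Grid average, which approximates the integral against the normalised Haar measure.
mean_value = sum(fejer_from_series(n, x) for x in grid) / len(grid)
```

Both expressions agree up to rounding error, and the average over the grid equals 1, consistent with the kernel defining a state.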

Compare these pure states \(\tau _{f_n^\lambda }\) to the states on \(C(S^1)\) denoted \(\Psi _{x,N}^\sharp \) in [6, Sect. 5.4]. The relation between these is that \(R_n^*(\tau _{f_n^\lambda }) = \Psi _{\lambda , n}^\sharp \). These Fejér states recover the entire circle, which we will show in the next proposition. Note the similarity with [6, Proposition 5.11] and [6, Proposition 5.12]. The difference is that in this letter we are talking about the intrinsic distance on the truncated circle, so we have to add a small step to move between the intrinsic distance and the distance on the whole spectral triple.

Proposition 4.2

Define the subsets \(\mathcal {F}_n \subset \mathcal {P}(C(S^1)^{(n)})\) by

$$\begin{aligned}\mathcal {F}_n := \{\tau _{f^\lambda _n} : \lambda \in S^1\},\end{aligned}$$

where \(\tau _{f_n^\lambda }\) are the states defined by Fejér kernels as in Lemma 4.1. Then the sequence of metric spaces \((\mathcal {F}_n, d_n)\) converges to \((S^1, d)\) in the Gromov–Hausdorff sense.

Proof

Define the sets

$$\begin{aligned}\mathfrak {R}_n = \{(\tau _{f^\lambda _n}, \lambda ): \lambda \in S^ 1\} \subset \mathcal {F}_n \times S^1.\end{aligned}$$

Because the elements of \(\mathcal {F}_n\) are labeled by \(S^1\), the projections of \(\mathfrak {R}_n\) onto the first and second coordinate are both surjective, making these sets total onto correspondences (Definition 2.3). We now want to show that the distortion of these correspondences converges to zero, in order to use Theorem 2.5.
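To make the notion of distortion concrete, the sketch below computes it for a correspondence between two finite subsets of the circle with the arc-length metric. It uses the standard definition, the supremum of \(\left|d_X(x,x') - d_Y(y,y')\right|\) over all pairs \((x,y), (x',y')\) in the correspondence, together with the standard fact that half the distortion bounds the Gromov–Hausdorff distance from above. The two point sets and the pairing rule are hypothetical and serve only as an example; they are not the sets \(\mathfrak {R}_n\) of the proof.

```python
import math

def arc_distance(a, b):
    """Geodesic distance between the points e^{ia} and e^{ib} on the unit circle."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def distortion(pairs, dist_x, dist_y):
    """Largest mismatch |dist_x(x, x') - dist_y(y, y')| over pairs in the correspondence."""
    return max(
        abs(dist_x(x1, x2) - dist_y(y1, y2))
        for (x1, y1) in pairs
        for (x2, y2) in pairs
    )

n = 12
coarse = [2.0 * math.pi * i / n for i in range(n)]            # n equally spaced points
fine = [2.0 * math.pi * j / (4 * n) for j in range(4 * n)]    # 4n equally spaced points
# Pair every fine point with a nearest coarse point; both projections are surjective.
pairs = [(coarse[round(j / 4) % n], y) for j, y in enumerate(fine)]
dis = distortion(pairs, arc_distance, arc_distance)
```

Since every fine point lies within arc length \(\pi /n\) of its partner, the distortion is at most \(2\pi /n\), so the two sets are within Gromov–Hausdorff distance \(\pi /n\) of each other.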

Because the Fejér kernel is a good kernel [9, Chapter 2], the states \(R_n^*(\tau _{f_n^\lambda })\) converge to \(\text {ev}_\lambda \) in \(\mathcal {S}(C(S^1))\) as \(n \rightarrow \infty \). It thus follows immediately that

$$\begin{aligned}\lim _{n\rightarrow \infty } d(R_n^*(\tau _{f_n^\lambda }), R_n^*(\tau _{f_n^\mu })) = d(\lambda , \mu ),\end{aligned}$$

and because of Lemma 2.10 also

$$\begin{aligned} \lim _{n \rightarrow \infty } d_n(\tau _{f_{n}^\lambda }, \tau _{f_{n}^\mu }) = d(\lambda ,\mu ).\end{aligned}$$

We now want to estimate

$$\begin{aligned}\sup _{\lambda , \mu \in S^1}\left|d(R_n^*(\tau _{f_n^\lambda }), R_n^*(\tau _{f_n^\mu })) - d(\lambda ,\mu )\right|.\end{aligned}$$

By definition,

$$\begin{aligned}d(R_n^*(\tau _{f_n^\lambda }), R_n^*(\tau _{f_n^\mu })) = \sup _{g \in C^\infty (S^1)}\left\{ \left|R_n^*(\tau _{f_n^\lambda })(g) - R_n^*(\tau _{f_n^\mu })(g)\right| : \left\Vert g'\right\Vert _{\infty } \le 1\right\} .\end{aligned}$$

As \(R_n^*(\tau _{f_n^\lambda })\) and \(R_n^*(\tau _{f_n^\mu })\) are both states, we can freely subtract a constant function from g. Since a function g with \(\left\Vert g'\right\Vert _\infty \le 1\) varies by at most \(\pi \) over the circle, we might as well impose the extra condition \(\left\Vert g\right\Vert _\infty \le \pi \):

$$\begin{aligned}d(R_n^*(\tau _{f_n^\lambda }), R_n^*(\tau _{f_n^\mu })) = \sup _{g\in C^\infty (S^1)}\left\{ \left|R_n^*(\tau _{f_n^\lambda })(g) - R_n^*(\tau _{f_n^\mu })(g)\right| : \left\Vert g'\right\Vert _{\infty } \le 1, \left\Vert g\right\Vert _\infty \le \pi \right\} .\end{aligned}$$

Observe that for any smooth function g with \(\left\Vert g'\right\Vert _\infty \le 1\), \(\left\Vert g\right\Vert _\infty \le \pi \),

$$\begin{aligned} \left|R_n^*(\tau _{f_n^\lambda })(g) - R_n^*(\tau _{f_n^\mu })(g)\right|&\le \left|R_n^*(\tau _{f_n^\lambda })(g) - g(\lambda )\right| + \left|R_n^*(\tau _{f_n^\mu })(g)-g(\mu )\right| + \left|g(\lambda ) - g(\mu )\right|\\&\le \left|R_n^*(\tau _{f_n^\lambda })(g) - g(\lambda )\right| + \left|R_n^*(\tau _{f_n^\mu })(g)-g(\mu )\right| + d(\lambda , \mu ), \end{aligned}$$

and furthermore if we denote the \(\varepsilon \)-neighborhood of \(\lambda \) in \(S^1\) by \(U^\varepsilon \),

$$\begin{aligned} \left|R_n^*(\tau _{f_n^\lambda })(g) - g(\lambda )\right|&\le \frac{1}{2\pi } \int _{S^1} f_n^\lambda (x) \left|g(x) - g(\lambda )\right| dx\\&= \frac{1}{2\pi }\int _{S^1 \setminus U^\varepsilon } f_n^\lambda (x) \left|g(x) - g(\lambda )\right| dx + \frac{1}{2\pi }\int _{U^\varepsilon } f_n^\lambda (x) \left|g(x) - g(\lambda )\right| dx\\&\le \int _{S^1 \setminus U^\varepsilon } f_n^\lambda (x) dx + \varepsilon . \end{aligned}$$

Note that this last estimate is independent of g and even of \(\lambda \) by rotational invariance. Collecting the statements above, taking any \(\varepsilon > 0\) gives

$$\begin{aligned}\sup _{\lambda , \mu \in S^1}\left|d(R_n^*(\tau _{f_n^\lambda }), R_n^*(\tau _{f_n^\mu })) - d(\lambda ,\mu )\right| \le 2 \int _{S^1 \setminus U^\varepsilon } f_n^\lambda (x) dx + 2 \varepsilon .\end{aligned}$$

By the properties of the Fejér kernel [9, Chapter 2], we may conclude that \(\int _{S^1 \setminus U^\varepsilon } f_n^\lambda (x) dx\) converges to zero and therefore

$$\begin{aligned}\lim _{n \rightarrow \infty } {\text {dist }} \mathfrak {R}_n = 0, \end{aligned}$$

so by Theorem 2.5

$$\begin{aligned} \lim _{n \rightarrow \infty }d_{GH}(\mathcal {F}_n, S^1) = 0. \end{aligned}$$
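The decay of the tail integral \(\int _{S^1 \setminus U^\varepsilon } f_n^\lambda (x) dx\) used in the proof can be illustrated numerically. The sketch below takes \(\lambda = 1\), which is no restriction by rotational invariance, and uses the standard closed form \(\frac{1}{n}\left( \sin (nx/2)/\sin (x/2)\right) ^2\) of the Fejér kernel. The Riemann-sum resolution, the value \(\varepsilon = 0.1\), and the function names are assumptions made for the example.

```python
import math

def fejer_closed_form(n, x):
    """Fejer kernel (1/n) * (sin(n x / 2) / sin(x / 2))^2, valid for x not a multiple of 2 pi."""
    return (math.sin(n * x / 2.0) / math.sin(x / 2.0)) ** 2 / n

def tail_mass(n, eps, grid=20000):
    """Riemann-sum approximation of the integral of f_n over the points with |x| >= eps."""
    dx = 2.0 * math.pi / grid
    total = 0.0
    for j in range(grid):
        x = -math.pi + j * dx
        if abs(x) >= eps:          # skip the eps-neighbourhood of lambda = 1
            total += fejer_closed_form(n, x) * dx
    return total

# Tail mass outside the 0.1-neighbourhood for increasing truncation size n.
tails = [tail_mass(n, 0.1) for n in (10, 100, 1000)]
```

For fixed \(\varepsilon \) the computed tail masses decrease as n grows, in line with the pointwise bound \(f_n(x) \le 1/(n \sin ^2(\varepsilon /2))\) for \(\varepsilon \le \left|x\right| \le \pi \).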

As a final remark, this proposition is comparable to a result proven by L. Glaser and A. Stern [7], which asserts that the pure state space of any spectral triple is the Gromov–Hausdorff limit of ‘localized’ (not necessarily pure) states on the truncated spectral triple.