1 Introduction

It is the purpose of this paper to add a new perspective to the central limit theorem for linear eigenvalue statistics. The main objects are the eigenvalues \(\lambda _1,\ldots ,\lambda _n\) of a random real symmetric or complex Hermitian matrix Z. Given a test function f, the linear statistic of these eigenvalues, denoted by \(X_1^{(n)}(f)\), is \({\text {tr}}(f(Z))= f(\lambda _1)+\cdots +f(\lambda _n)\). For many distributions of eigenvalues and smooth enough functions, we have, after centering, the convergence in distribution to a normal random variable:

$$\begin{aligned} {\text {tr}}(f(Z)) - \mathbb {E}[{\text {tr}}(f(Z)) ] = \sum _{k=1}^n \big ( f(\lambda _k) - \mathbb {E}[f(\lambda _k)] \big ) \xrightarrow [n \rightarrow \infty ]{d} \mathcal {N}\left( 0,\sigma _1^2(f)\right) . \end{aligned}$$
(1.1)

Over the last two decades, CLTs for linear eigenvalue statistics have grown into a highly active field of study within random matrix theory. To give a partial overview, the convergence in (1.1) was proven for invariant matrix models or orthogonal polynomial ensembles [5, 9, 17, 18, 25,26,27, 36, 42], for general Wigner or Wishart matrices [8, 10, 28, 34, 43], for matrices of compact groups [24, 44], and for non-Hermitian matrices [33, 41]. Comparing (1.1) with classical CLTs, for example for sums of independent random variables, it is remarkable that no additional scaling factor \(n^{-1/2}\) is needed. This phenomenon is usually attributed to the strong dependence structure of the eigenvalues. Indeed, the classical orthogonal polynomial ensembles have a joint eigenvalue density involving the Vandermonde determinant \(\Delta (\lambda ) = \prod _{i<j} |\lambda _i-\lambda _j|\), which leads to a repulsion of eigenvalues. It was shown in [12, 45], however, that in general the variance of the linear eigenvalue statistic does not remain bounded for non-smooth test functions f.

One sees a very different picture when, instead of the trace, we consider the fluctuations of an individual matrix element \(f(Z)_{1,1}\). Limit theorems for such entries have been considered in [29, 30, 35, 37]. The random variable \(f(Z)_{1,1}\) depends not only on the distribution of the eigenvalues, but also on the eigenvectors. We will assume the matrix of eigenvectors to be Haar-distributed on the orthogonal group (for real Z) or on the unitary group (for complex Z), and to be independent of the eigenvalues. This is satisfied for the prominent case of unitarily invariant ensembles (see Sect. 2.1). Then, the central limit theorem takes the form

$$\begin{aligned} \sqrt{n}(f(Z)_{1,1} - \mathbb {E} [ f(Z)_{1,1}] ) \xrightarrow [n \rightarrow \infty ]{d} \mathcal {N}\left( 0,\sigma _0^2(f)\right) . \end{aligned}$$
(1.2)

Unlike for the full trace, an additional scaling is necessary. Although one might expect \(f(Z)_{1,1}\) to scale as \(\frac{1}{n} {\text {tr}}f(Z)\), the fluctuations of the former random variable are much larger. We remark that in our setting the convergence (1.2) is in fact a consequence of (1.1) (see Theorem 2.3).

In this paper, we show that we can in some sense interpolate between the different CLTs in (1.2) and (1.1) by summing a varying number of diagonal elements. The main object of interest is thus the partial trace \(X_t^{(n)}(f)\), defined by

$$\begin{aligned} X_t^{(n)}(f) = \sum _{i=1}^{\lfloor tn\rfloor } f(Z)_{i,i} = \sum _{k=1}^n w_{k,t}^{(n)}f(\lambda _k) , \end{aligned}$$
(1.3)

which is a weighted version of the linear eigenvalue statistic \({\text {tr}}f(Z)\), where the weights \(w_{k,t}^{(n)}\) are squared norms of projections of the eigenvectors (see (2.3)). In our main result, Theorem 2.2, we show that in a setting where the convergence (1.1) of the linear eigenvalue statistic holds, the process

$$\begin{aligned} \big ( X_t^{(n)}(f)- \mathbb {E}[X_t^{(n)}(f)] \big )_{t\in [0,1]} \end{aligned}$$
(1.4)

converges as \(n\rightarrow \infty \) in distribution to a centered Gaussian process. The variance of the limit process at time t is given by

$$\begin{aligned} (t -t^2) \sigma _0^2 (f) + t^2 \sigma _1^2(f) = t \big [ (1-t) \sigma _0^2(f) + t \sigma _1^2(f)\big ] . \end{aligned}$$
(1.5)

That is, the fluctuations interpolate between the limit variances of the CLTs in (1.2) and (1.1) and, unless \(\sigma _0^2 (f)=\sigma _1^2 (f)\), the limit is not a Brownian motion.
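To make the interpolation (1.5) concrete, consider the GOE in the normalization of Sect. 2.1 with test function \(f(x)=x^2\): there \(\nu \) is the semicircle law with \(\nu (x^2)=1\) and \(\nu (x^4)=2\), so \(\sigma _0^2(f)=2\) and, by (2.8), \(\sigma _1^2(f)=4\), and the limit variance (1.5) becomes \(2t+2t^2\). The following minimal simulation sketch (our own illustration, not part of the arguments of this paper; all variable names and sample sizes are our choices) estimates the variance of the partial trace at several values of t:

```python
import numpy as np

rng = np.random.default_rng(0)

def goe(n):
    # GOE matrix normalized so that the spectrum fills the semicircle support [-2, 2]
    a = rng.standard_normal((n, n))
    return (a + a.T) / np.sqrt(2 * n)

n, reps = 200, 500
ts = np.array([0.25, 0.5, 0.75, 1.0])
idx = np.floor(ts * n).astype(int) - 1      # position of the floor(t*n)-th diagonal entry
samples = np.empty((reps, ts.size))
for r in range(reps):
    z = goe(n)
    # partial traces X_t(f) for f(x) = x^2: cumulative sums of the diagonal of Z^2
    samples[r] = np.cumsum(np.diag(z @ z))[idx]
for t, v in zip(ts, samples.var(axis=0)):
    print(f"t = {t:.2f}: empirical variance {v:.2f}, limit {2*t + 2*t**2:.2f}")
```

At \(t=1\), this recovers the variance \(\sigma _1^2(f)=4\) of the full linear eigenvalue statistic, while for small t the variance is dominated by the term \(t\sigma _0^2(f)\).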

A core argument in the proof is the independence of eigenvalues and eigenvectors. Assuming a convergence as in (1.1), the main task is then to handle the fluctuations induced by summing a varying number of entries of the eigenvector matrix. The main ingredient for this is a functional limit theorem for sums over subblocks of Haar-distributed matrices proven by [6, 16]. This result itself relies on a powerful theorem of [32], which allows one to evaluate higher-order cumulants for entries of Haar matrices. Our strategy also allows us to prove a functional CLT for (1.4), when instead of the mean \(\mathbb {E}[X_t^{(n)}(f)]\), one centers by the expectation conditioned on the eigenvalues. The result is the quenched convergence in Theorem 2.1, which gives a convergence in distribution under the law of the eigenvector matrices, valid for almost all (sequences of) eigenvalues. With this centering, the limit process is a Brownian bridge. This also shows that the results are not restricted to the random matrix setting, but could also be viewed in the framework of randomly weighted sums, when the weights are coming from Haar-distributed matrices as in (1.3). For example, the functional CLT of Theorem 2.1 is also true for deterministic sequences \(\lambda _i\) or more general point processes; see Remark 2.4.

Convergence of partial traces has been considered before in several papers for particular distributions of random matrices. If Z is unitary and f is the identity, a functional limit theorem for the partial trace has been proven in [14]. A more general way of summing entries of unitary matrices was considered in [15]. In [40], real symmetric matrices are considered and the statement of Theorem 2.1 is proven under a strong moment condition on the \(\lambda _i\), using zonal polynomials. Using the arguments of Sect. 3.2, this would lead to a convergence of (1.4), again under higher moment conditions.

This paper is structured as follows. In Sect. 2, we state and discuss our main assumptions and state our results. The proofs can be found in Sect. 3 and a lengthy variance computation is contained in Sect. 4.

2 Random Ensembles and Main Results

Let us begin with a closer look at the partial trace. When \(Z=Z^{(n)}\) is an \(n\times n\) complex Hermitian matrix, by the spectral theorem we may write \(Z^{(n)}= U^{(n)}\Lambda ^{(n)}(U^{(n)})^*\), where \(U^{(n)}\) is an \(n\times n\) unitary matrix, \(\Lambda ^{(n)}\) is real diagonal with the eigenvalues \(\lambda _1,\ldots , \lambda _n\) on the diagonal and \(A^*\) denotes the conjugate transpose of A. If \(Z^{(n)}\) is real symmetric, \(U^{(n)}\) is orthogonal instead. With this decomposition, we have for the partial trace as defined in (1.3):

$$\begin{aligned} X_t^{(n)}(f) = \sum _{i=1}^{\lfloor tn\rfloor } f(Z)_{i,i} = \sum _{i=1}^{\lfloor tn\rfloor } \big (U^{(n)}f(\Lambda ^{(n)}) (U^{(n)})^*\big )_{i,i} = \sum _{i=1}^{\lfloor tn\rfloor } \sum _{k=1}^n |U^{(n)}_{i,k}|^2 f(\lambda _k) . \end{aligned}$$
(2.1)

The main object of our study is then the random nonnegative finite measure \(X_t^{(n)}\) defined by

$$\begin{aligned} X_t^{(n)}= \sum _{k=1}^n w_{k,t}^{(n)} \delta _{\lambda _k}, \end{aligned}$$
(2.2)

where \(\delta _z\) is the Dirac measure at z and the weights are given by

$$\begin{aligned} w_{k,t}^{(n)} = \sum _{i=1}^{\lfloor tn\rfloor } |U^{(n)}_{i,k}|^2 . \end{aligned}$$
(2.3)

Here and in the following, \(\mu (f)\) is shorthand for \(\int f \, d\mu \) for a measure \(\mu \). Note that the total mass of \(X_t^{(n)}\) is given by \(\lfloor tn\rfloor \). The representation (2.2) shows that statements about the partial trace are in fact statements about a weighted version of the classical empirical eigenvalue distribution, which we denote by \(\hat{\mu }^{(n)}\) and which corresponds to all weights being equal to \(n^{-1}\). In (2.2), the weight of \(\lambda _k\) is the squared norm of the first \(\lfloor tn\rfloor \) entries of the corresponding eigenvector. Setting \(t=1\), all weights in (2.2) become 1, so that \(\hat{\mu }^{(n)}= \frac{1}{n}X_1^{(n)}\). In other words, \(n \hat{\mu }^{(n)}(f)\) is the linear eigenvalue statistic.
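As a quick sanity check of the identity (2.1), the following minimal sketch (our own; the matrix, the test function and all names are arbitrary choices of ours) compares the sum of diagonal entries of \(f(Z)\) with the weighted eigenvalue sum of (2.2):

```python
import numpy as np

rng = np.random.default_rng(1)
n, t = 8, 0.5
a = rng.standard_normal((n, n))
z = (a + a.T) / np.sqrt(2 * n)           # any real symmetric matrix
lam, u = np.linalg.eigh(z)               # Z = U diag(lam) U^T
f = np.cos                               # any test function

k = int(np.floor(t * n))
fz = u @ np.diag(f(lam)) @ u.T           # f(Z) via the functional calculus
lhs = np.sum(np.diag(fz)[:k])            # sum of f(Z)_{ii} for i <= floor(t*n)
w = (u[:k, :] ** 2).sum(axis=0)          # weights w_{k,t} of (2.3)
rhs = w @ f(lam)                         # weighted eigenvalue sum of (2.2)
assert np.isclose(lhs, rhs)
```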

Another prominent eigenvalue measure is the spectral measure \(\mu _1^{(n)}\) of the pair \((Z^{(n)},e_1)\), defined by the functional calculus relation \(\mu _1^{(n)}(f) = e_1^*f(Z^{(n)})e_1 = f(Z^{(n)})_{1,1}\). That is, the CLT in (1.2) is in fact a statement about \(\mu _1^{(n)}(f)\). The spectral measure can be obtained from the partial trace by \(\mu _1^{(n)}= X_{1/n}^{(n)}\). Although for classical ensembles of random matrices, the measures \(\hat{\mu }^{(n)}\) and \(\mu _1^{(n)}\) have the same limit in probability as \(n\rightarrow \infty \), the fluctuations around this limit are very different, which becomes evident in the different central limit theorems in (1.1) (for \(n \hat{\mu }^{(n)}\)) and in (1.2) (for \(\mu _1^{(n)}\)). The additional randomness of the weights in the spectral measure leads to substantially larger fluctuations. Let us remark that a similar behavior can be observed on the scale of large deviations: While \(\hat{\mu }^{(n)}\) satisfies a large deviation principle with speed \(n^2\), see [4] or [1], for \(\mu _1^{(n)}\) this is reduced to speed n [20, 21].

2.1 Assumptions

In order to present the results for complex and real matrices in a unified expression, we follow the classical notation of [19] and introduce the parameter \(\beta \), where \(\beta =1\) if \(U^{(n)}\) is real and orthogonal and \(\beta =2\) if \(U^{(n)}\) is complex and unitary. Let \(\beta '=\beta /2\). We will always make the following assumption:

(A1) The matrices \(U^{(n)}\) and \(\Lambda ^{(n)}\) are independent and \(U^{(n)}\) is Haar-distributed on the unitary group (\(\beta =2\)) or on the orthogonal group (\(\beta =1\)).

Under assumption (A1), we can write the distribution of \((U^{(n)},\Lambda ^{(n)})\) as \(\mathbb {P} = \mathbb {P}_H \otimes \mathbb {P}_\Lambda \), where \(\mathbb {P}_H\) is the Haar measure on the unitary (\(\beta =2\)) or orthogonal (\(\beta =1\)) group and \( \mathbb {P}_\Lambda \) is the distribution of the eigenvalues. We denote expectation with respect to \(\mathbb {P}_H\) and \(\mathbb {P}_\Lambda \) by \(\mathbb {E}_H\) and \(\mathbb {E}_\Lambda \), respectively. Without loss of generality, assume that all matrices \((U^{(n)},\Lambda ^{(n)})\) for \(n\ge 1\) are defined on a common probability space. We continue to write \(\mathbb {P}= \mathbb {P}_H \otimes \mathbb {P}_\Lambda \) for the distribution of the sequences. Any convergence in distribution will be under \(\mathbb {P}\) unless we specify otherwise. While the distribution of \(U^{(n)}\) is completely specified by (A1), we need the empirical measure of the eigenvalues to converge to a deterministic limit. For all results apart from Theorem 2.1, we additionally assume a CLT for the linear eigenvalue statistic. Note that the next two assumptions are also conditions on the test function \(f:\mathbb {R} \rightarrow \mathbb {R}\).

(A2) There exists a deterministic probability measure \(\nu \), such that \(\hat{\mu }^{(n)}\) converges weakly to \(\nu \) \(\mathbb {P}_\Lambda \)-almost surely. Furthermore, \(\mathbb {P}_\Lambda \)-almost surely, \(\hat{\mu }^{(n)}(f)\) converges to \(\nu (f) \) and \(\hat{\mu }^{(n)}(f^2)\) converges to \(\nu (f^2)\).

(A3) There exists a \(\sigma _1^2(f)\in [0,\infty )\) such that

    $$\begin{aligned} X_1^{(n)}(f) - \mathbb {E} [X_1^{(n)}(f)] \xrightarrow [n \rightarrow \infty ]{d} \mathcal {N}(0,\sigma _1^2(f)) . \end{aligned}$$

Let us comment on the assumptions above. Suppose the matrix \(Z^{(n)}\) is distributed with density proportional to

$$\begin{aligned} \exp \{ - \tfrac{1}{2}n\beta {\text {tr}}V(X) \} \end{aligned}$$
(2.4)

with respect to the Lebesgue measure in each independent real entry in X. The potential \(V:\mathbb {R} \rightarrow (-\infty ,\infty ]\) is supposed to be continuous and satisfy the growth (or confinement) condition:

$$\begin{aligned} \liminf _{|x| \rightarrow \infty } \frac{V(x)}{2 \log |x|} > 1. \end{aligned}$$
(2.5)

The density (2.4) implies that assumption (A1) is satisfied and that the eigenvalues have a joint density proportional to

$$\begin{aligned} \prod _{i<j} |\lambda _i-\lambda _j|^{\beta } \prod _{i=1}^n \exp \{ - \tfrac{1}{2}n\beta V(\lambda _i) \} \end{aligned}$$
(2.6)

with respect to the Lebesgue measure on \(\mathbb {R}^n\) (see [31]). It follows from the large deviation principle of [4] that the empirical eigenvalue distribution \(\hat{\mu }^{(n)}\) converges exponentially fast to a compactly supported measure \(\nu \). Since the probability of deviating from the limit in the weak topology decays exponentially fast, the weak convergence holds almost surely on any joint probability space. That is, assumption (A2) is satisfied for any continuous and bounded f. If moreover \(\nu \) is supported by a single interval and the effective potential

$$\begin{aligned} \mathcal {J}_V(x) = \tfrac{1}{2}V(x)- \int \log | x-\xi | \, \mathrm{d}\nu (\xi ) \end{aligned}$$
(2.7)

attains its infimum only on the support of \(\nu \), then the largest and smallest eigenvalues each satisfy a large deviation principle [2, 3]. This implies that the probability of the extremal eigenvalues deviating from the support of \(\nu \) decays exponentially, and one easily obtains that (A2) holds also for continuous f growing at most polynomially at infinity.

Turning to assumption (A3), it was shown in [9] that the CLT holds for quite general \(\beta \)-ensembles with density (2.6), when \(\mathcal {J}_V\) attains its infimum only on the support of \(\nu \), and when V, f are sufficiently smooth. We remark that the authors of [9] center by \(\int f\, \mathrm{d}\nu \), but their control of exponential moments allows one to center as in (A3). For the classical cases of the Gaussian orthogonal ensemble (GOE, \(\beta =1\)) and the Gaussian unitary ensemble (GUE, \(\beta =2\)) with potential \(V(x) = \tfrac{1}{2}x^2\), the limit measure \(\nu \) is the semicircle law with Lebesgue density \(\tfrac{1}{2\pi }\sqrt{4-x^2}\mathbb {1}_{[-2,2]}(x)\), and (A3) holds for any \(f\in C^1(\mathbb {R})\) growing at most polynomially, with limiting variance

$$\begin{aligned} \sigma _1^2(f) = \frac{1}{2\beta \pi ^2} \int _{-2}^2 \int _{-2}^2 \left( \frac{f(x)-f(y)}{x-y}\right) ^2 \frac{4-xy}{\sqrt{4-x^2} \sqrt{4-y^2} }\, \mathrm{d}x \, \mathrm{d}y ; \end{aligned}$$
(2.8)

see [39], Chapter 3.2.
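As an illustration of (2.8), the following sketch (our own check; the grid size and sample counts are arbitrary choices) evaluates the double integral for \(f(x)=x^2\) via the substitution \(x=2\cos \theta \), for which the difference quotient is simply \(x+y\), and compares it with the sampled variance of \({\text {tr}}Z^2\) over GUE matrices in the normalization (2.4); both should be close to 2:

```python
import numpy as np

beta = 2                                      # GUE
m = 400
theta = (np.arange(m) + 0.5) * np.pi / m      # midpoint grid on (0, pi)
x = 2 * np.cos(theta)                         # x = 2cos(theta): dx/sqrt(4 - x^2) becomes dtheta
X, Y = np.meshgrid(x, x)
integrand = (X + Y) ** 2 * (4 - X * Y)        # (f(x)-f(y))/(x-y) = x + y for f(x) = x^2
sigma1_sq = integrand.sum() * (np.pi / m) ** 2 / (2 * beta * np.pi ** 2)
print(sigma1_sq)                              # approx 2.0

rng = np.random.default_rng(2)
n, reps = 150, 400
vals = []
for _ in range(reps):
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    z = (a + a.conj().T) / (2 * np.sqrt(n))   # GUE with density (2.4), V(x) = x^2/2
    vals.append(np.trace(z @ z).real)
print(np.var(vals))                           # also approx 2.0
```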

As already mentioned in the introduction, the CLT in (A3) (and also assumption (A2)) is proven not only for random matrices with density (2.4), but for a large variety of models, for example general Wigner or Wishart matrices. In general, such matrices do not have a Haar-distributed matrix of eigenvectors, so that assumption (A1) fails to hold. However, given a random matrix \(Z^{(n)}\) satisfying (A2) and (A3), we may take \(U^{(n)}\) Haar-distributed on the orthogonal or unitary group and define \(\tilde{Z}^{(n)}=U^{(n)}Z^{(n)}(U^{(n)})^*\). Then, the matrix \(\tilde{Z}^{(n)}\) trivially has a Haar-distributed matrix of eigenvectors independent of the eigenvalues. Assumptions (A2) and (A3) continue to hold, since the eigenvalues are unchanged, so that \(\tilde{Z}^{(n)}\) satisfies all three assumptions.
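A minimal sketch of this randomization step (the Haar sampler follows the usual QR-with-sign-correction recipe; the input matrix is an arbitrary choice of ours):

```python
import numpy as np

rng = np.random.default_rng(3)

def haar_orthogonal(n):
    # Haar orthogonal matrix: QR of a Ginibre matrix, columns sign-corrected by diag(R)
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

n = 5
a = rng.standard_normal((n, n))
z = a + a.T                       # e.g., a Wigner matrix not satisfying (A1)
u = haar_orthogonal(n)
z_tilde = u @ z @ u.T             # same eigenvalues, Haar-distributed eigenvectors
assert np.allclose(np.linalg.eigvalsh(z), np.linalg.eigvalsh(z_tilde))
```

The sign correction is needed because the raw QR factorization of numerical libraries does not return a Haar-distributed Q.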

Finally, let us remark that the method in the present paper also works if a weak convergence as in (A3) holds with a non-Gaussian limit, but to stay within the framework of CLTs for the linear eigenvalue statistic, we restrict the presentation to the Gaussian case.

2.2 Results

As described in the introduction, the main objective is to show how the fluctuations of the linear eigenvalue statistic emerge from summing individual matrix elements. We consider the process

$$\begin{aligned} \mathcal {X}^{(n)}(f) = \big ( X_t^{(n)}(f)- \mathbb {E}[X_t^{(n)}(f)] \big )_{t\in [0,1]} \end{aligned}$$
(2.9)

as a random element of \(\mathcal {D}([0,1])\), equipped with the Skorokhod topology and the Borel \(\sigma \)-algebra. We write

$$\begin{aligned} \mathcal {X}^{(n)}(f) = \mathcal {W}^{(n)}(f) + \mathcal {Z}^{(n)}(f) , \end{aligned}$$
(2.10)

where

$$\begin{aligned} \mathcal {W}^{(n)}_t(f) = \mathcal {X}_t^{(n)}(f) - \mathbb {E}_H \left[ \mathcal {X}_t^{(n)}(f)\right] , \qquad 0\le t\le 1 \end{aligned}$$
(2.11)

is the process centered with respect to \(\mathbb {P}_H\) and

$$\begin{aligned} \mathcal {Z}^{(n)}_t(f) = \mathbb {E}_H \left[ \mathcal {X}_t^{(n)}(f)\right] - \mathbb {E}\left[ \mathcal {X}_t^{(n)}(f)\right] , \qquad 0\le t\le 1 . \end{aligned}$$
(2.12)

Since \(\mathbb {E}_H[|{U^{(n)}_{i,k}}|^2] = 1/n\), we have by (2.2) and (2.3)

$$\begin{aligned} \mathcal {W}^{(n)}_t(f) = \sum _{k=1}^n \sum _{i=1}^{\lfloor tn\rfloor } \big ( |{U^{(n)}_{i,k}}|^2-\tfrac{1}{n} \big ) f(\lambda _k) \end{aligned}$$
(2.13)

and

$$\begin{aligned} \mathcal {Z}^{(n)}_t(f) = \frac{\lfloor tn\rfloor }{n} \big ( X_1^{(n)}(f) -\mathbb {E}[X_1^{(n)}(f)]\big ) . \end{aligned}$$
(2.14)

Our first main result is then the following functional limit theorem for the process \(\mathcal {W}^{(n)}(f)\). Note that assumption (A3) is not needed for this part.

Theorem 2.1

Suppose (A1) and (A2) are satisfied. Then, \(\mathbb {P}_\Lambda \)-almost surely, as \(n\rightarrow \infty \), the process \(\mathcal {W}^{(n)}(f)\) converges in distribution under \(\mathbb {P}_H\) toward \(\sigma _0(f) B\), where B is a standard Brownian bridge and \(\sigma _0^2(f) = \tfrac{2}{\beta }(\nu (f^2)-\nu (f)^2)\).
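To illustrate the quenched nature of this statement, the following minimal sketch (our own; it uses the deterministic support points \(\lambda _k = k/n\) and \(f(x)=x\), anticipating Remark 2.4, so that \(\nu \) is the uniform distribution on [0,1] and \(\sigma _0^2(f) = 2(\tfrac{1}{3}-\tfrac{1}{4}) = \tfrac{1}{6}\) for \(\beta =1\)) estimates the variance of \(\mathcal {W}^{(n)}_{1/2}(f)\), which should approach \(\sigma _0^2(f)\cdot \tfrac{1}{2}\cdot \tfrac{1}{2} = \tfrac{1}{24}\):

```python
import numpy as np

rng = np.random.default_rng(4)

def haar_orthogonal(n):
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

n, reps = 100, 2000
lam = np.arange(1, n + 1) / n                     # deterministic eigenvalues (see Remark 2.4)
vals = []
for _ in range(reps):
    u = haar_orthogonal(n)
    w = (u[: n // 2, :] ** 2).sum(axis=0) - 0.5   # centered weights w_{k,1/2} - E_H[w_{k,1/2}]
    vals.append(w @ lam)                          # W_{1/2}^{(n)}(f) with f(x) = x
print(np.var(vals), 1 / 24)                       # empirical vs. limit variance
```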

Theorem 2.1 shows that the entries of the eigenvector matrix \(U^{(n)}\) are the main source of the fluctuations of \(\mathcal {W}^{(n)}(f)\), and this process is asymptotically independent of the eigenvalues, and therefore of \(\mathcal {Z}^{(n)}(f)\). Since, by assumption (A3), \( \mathcal {Z}_t^{(n)}(f)\) converges to t times a Gaussian random variable (recall (2.14)), this results in the convergence of the sum, which leads to our second main result.

Theorem 2.2

Under assumptions (A1), (A2) and (A3), the process \(\mathcal {X}^{(n)}(f) \) converges as \(n\rightarrow \infty \) in distribution toward the continuous centered Gaussian process \(\mathcal {X}(f)\) with covariance

$$\begin{aligned} {\text {Cov}}(\mathcal {X}_s(f),\mathcal {X}_t(f)) = (t\wedge s -ts) \sigma _0^2 (f) + ts \sigma _1^2(f) , \end{aligned}$$

with \(\sigma _0^2(f)\) as in Theorem 2.1 and \(\sigma _1^2(f)\) as in (A3).

We may also consider a CLT for an individual element of the trace, as in (1.2). On this level too, similar to Theorem 2.2, we see the different effects that the weights and the eigenvalues have on the fluctuations. Our next result concerns the asymptotic normality of

$$\begin{aligned} \sqrt{n}(\mu _{1}^{(n)}(f) - \mathbb {E}[ \mu _1^{(n)}(f)])&= \sqrt{n}(\mu _{1}^{(n)}(f) - \hat{\mu }^{(n)}(f)) \nonumber \\&\quad +\sqrt{n}( \hat{\mu }^{(n)}(f) - \mathbb {E}[\hat{\mu }^{(n)}(f)]) , \end{aligned}$$
(2.15)

where we recall that \(\mu _{1}^{(n)}= X_{1/n}^{(n)}\) and \(\hat{\mu }^{(n)}= \frac{1}{n}X_1^{(n)}\), as defined at the beginning of Sect. 2. The random weights are responsible for the weak convergence of the first term on the right-hand side, while under (A3) the second term has fluctuations of smaller order and vanishes in the limit. Moreover, although both terms depend on the eigenvalues, they are asymptotically independent.

Theorem 2.3

Assume that (A1) and (A2) hold with \(f\in \mathcal {C}_b(\mathbb {R})\). Then

$$\begin{aligned} \sqrt{n}(\mu _{1}^{(n)}(f) - \hat{\mu }^{(n)}(f)) \xrightarrow [n \rightarrow \infty ]{d} \mathcal {N}(0,\sigma _0^2(f)) , \end{aligned}$$

where \(\sigma _0^2(f) = \tfrac{2}{\beta }(\nu (f^2)-\nu (f)^2)\). If additionally \(\sqrt{n} (\hat{\mu }^{(n)}(f)-\mathbb {E}[\hat{\mu }^{(n)}(f)])\xrightarrow [n \rightarrow \infty ]{d} \mathcal {N}(0,\hat{\sigma }^2(f))\) with \(\hat{\sigma }^2(f)\in [0,\infty )\), then

$$\begin{aligned} \sqrt{n}(\mu _{1}^{(n)}(f) - \mathbb {E}[\mu _1^{(n)}(f)]) \xrightarrow [n \rightarrow \infty ]{d} \mathcal {N}(0,\sigma _0^2(f)+\hat{\sigma }^2(f)) . \end{aligned}$$

In particular, if (A3) holds, then this convergence follows with \(\hat{\sigma }^2(f)=0\).

Let us remark that for Z a random matrix satisfying (A1) and (A2), the first convergence in Theorem 2.3 may be rewritten as

$$\begin{aligned} \sqrt{n}(f(Z)_{1,1} - \tfrac{1}{n} {\text {tr}}f(Z)) \xrightarrow [n \rightarrow \infty ]{d} \mathcal {N}(0,\sigma _0^2(f)) \end{aligned}$$
(2.16)

and if the distribution of Z also satisfies (A3), then the second convergence in Theorem 2.3 is equivalent to (1.2).
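Numerically, (2.16) is easy to observe; a minimal sketch for the GOE and \(f(x)=x^2\) (our own illustration; here \(\sigma _0^2(f) = 2(\nu (x^4)-\nu (x^2)^2) = 2\) for \(\beta =1\)):

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 300, 1000
vals = []
for _ in range(reps):
    a = rng.standard_normal((n, n))
    z = (a + a.T) / np.sqrt(2 * n)     # GOE in the normalization of Sect. 2.1
    fz = z @ z                         # f(Z) for f(x) = x^2
    vals.append(np.sqrt(n) * (fz[0, 0] - np.trace(fz) / n))
print(np.var(vals))                    # approx sigma_0^2(f) = 2
```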

Remark 2.4

The weak convergence in Theorem 2.1 can be seen as a quenched convergence, valid for almost all realizations of sequences of eigenvalues. It demonstrates that after centering with respect to \(\mathbb {P}_H\), the origin of the random fluctuations of the partial trace lies solely in the weights (2.3), that is, in the eigenvector matrix. The eigenvalues give only a deterministic contribution in the limit, depending only on the equilibrium measure \(\nu \). It is therefore not relevant for the result that the \(\lambda _i\) are eigenvalues of a random matrix. Instead, Theorem 2.1 holds for any randomly weighted measure as in (2.2) with weights (2.3). For example, one could replace the support points of this measure with i.i.d. random variables, or realizations of a point process, as long as they are independent of the weights and assumption (A2) holds. The same remark applies to the first convergence in Theorem 2.3. It does not require (A3), and although it is not explicitly stated, the convergence holds under \(\mathbb {P}_H\) for \(\mathbb {P}_\Lambda \)-almost all support points of the random measure \(\mu _1^{(n)}\).

3 Proofs

3.1 Proof of Theorem 2.1

3.1.1 Representation by a Bivariate Process

We will first show the statement of Theorem 2.1 for piecewise constant functions h with

$$\begin{aligned} h(\lambda ) = \sum _{m=1}^M \gamma _m \mathbb {1}_{ (a_{m},b_{m}]}(\lambda ), \end{aligned}$$
(3.1)

for some real \(\gamma _m, 1\le m\le M\), and \(a_1<b_1\le a_2< \ldots \le b_M\) such that \(\nu ((-\infty ,\cdot ])\) is continuous at all \(a_i,b_i\). The last condition only excludes countably many points for the choice of \(a_i,b_i\) and in particular still allows one to approximate any \(f\in L^2(\nu )\). Let \(U^{(n)}\) be a sequence of \(n\times n\) unitary or orthogonal Haar-distributed matrices. We denote by \(\widetilde{\mathcal {W}}^{(n)}\) a process indexed by subsets \(A\times B\) of \(\{1,\ldots ,n\}^2\), such that

$$\begin{aligned} \widetilde{\mathcal {W}}^{(n)}_{A,B} = \sum _{i,j=1}^n \big ( |U^{(n)}_{i,j}|^2 - \tfrac{1}{n}\big ) \mathbb {1}_A(i)\mathbb {1}_B(j) . \end{aligned}$$

If A and/or B are of the form \(\{1,\ldots ,\lfloor tn\rfloor \}\) with \(t\in [0,1]\), we replace the corresponding index by t.

We consider \((\widetilde{\mathcal {W}}^{(n)}_{s,t})_{s,t\in [0,1]}\) as a random element of \(\mathcal {D}([0,1]^2)\), the multidimensional version of the Skorokhod space. \(\mathcal {D}([0,1]^2)\) contains all \(X:[0,1]^2\rightarrow \mathbb {R}\) which are “continuous from the north-east” and have existing limits in each quadrant, i.e., \(\lim _{s\searrow s_0,t\searrow t_0} X(s,t)= X(s_0,t_0)\) and \(\lim _{s\searrow s_0,t\nearrow t_0} X(s,t)\), \(\lim _{s\nearrow s_0,t\searrow t_0} X(s,t)\) and \(\lim _{s\nearrow s_0,t\nearrow t_0} X(s,t)\) exist. We endow \(\mathcal {D}([0,1]^2)\) with a generalization of Skorokhod’s \(J_1\)-metric defined by

$$\begin{aligned} d(X,Y)&= \inf _{\lambda _1,\lambda _2} \max \left\{ \sup _{s\in [0,1]} |\lambda _1(s)-s| ,\sup _{t\in [0,1]} |\lambda _2(t)-t| ,\right. \nonumber \\&\qquad \qquad \qquad \qquad \left. \sup _{s,t\in [0,1]} |X(\lambda _1(s),\lambda _2(t))-Y(s,t)|\right\} , \end{aligned}$$
(3.2)

where the infimum is taken over all continuous bijections \(\lambda _i:[0,1]\rightarrow [0,1]\) fixing 0. Then, as in the one-dimensional case, \(\mathcal {D}([0,1]^2)\) with metric (3.2) is separable and although it is not complete, there is an equivalent metric such that \(\mathcal {D}([0,1]^2)\) becomes complete; see Section 5 of [46] or [11], Section 3. As in the classical case laid out in Section 12 of [7], convergence with respect to the metric (3.2) toward a continuous limit actually implies convergence in supremum norm.

It was shown in [16] that, for suitable index sets, \(\widetilde{\mathcal {W}}^{(n)}\) converges to \(\sqrt{2/\beta } \mathcal {B}\), where \(\mathcal {B}\) is a bivariate tied-down Brownian bridge, a centered Gaussian process on \([0,1]^2\) with continuous paths and covariance

$$\begin{aligned} \mathbb {E}[\mathcal {B}(s,t)\mathcal {B}(s',t')]= (s\wedge s'-ss')(t\wedge t'-tt') . \end{aligned}$$
(3.3)

Theorem 3.1

([16], Thm 1.1) As \(n\rightarrow \infty \), the process \((\widetilde{\mathcal {W}}^{(n)}_{s,t})_{s,t\in [0,1]}\) converges in distribution under \(\mathbb {P}_H\) to \(\sqrt{2/\beta } \mathcal {B}\), with \(\mathcal {B}\) a bivariate tied-down Brownian bridge.

Now consider h as in (3.1), then

$$\begin{aligned} \mathcal {W}^{(n)}_t(h)&= \sum _{k=1}^n \sum _{i=1}^{\lfloor tn\rfloor } \big ( |U_{i,k}|^2-\tfrac{1}{n} \big ) h(\lambda _k) \nonumber \\&= \sum _{m=1}^M \gamma _m \sum _{k=1}^n \sum _{i=1}^n \big ( |U_{i,k}|^2-\tfrac{1}{n} \big ) \mathbb {1}_{ (a_m,b_m]}(\lambda _k) \mathbb {1}_{\{1,\ldots ,\lfloor tn\rfloor \} }(i) \nonumber \\&=\sum _{m=1}^M \gamma _m \widetilde{\mathcal {W}}^{(n)}_{t,A_m} , \end{aligned}$$
(3.4)

with \(A_m = \{ k\, |\, \lambda _k \in (a_m,b_m] \}\). For \(s\in \mathbb {R}\), let

$$\begin{aligned} F^{(n)}(s) = \tfrac{1}{n}|\{\lambda _k^{(n)}: \lambda _k^{(n)}\le s\}| \end{aligned}$$
(3.5)

be the normalized number of eigenvalues \(\le s\), then we claim that

$$\begin{aligned} \mathcal {W}^{(n)}(h) {\mathop {=}\limits ^{\mathbb {P}_H}} \widetilde{\mathcal {W}}^{(n)}(h) := \sum _{m=1}^M \gamma _m \big ( \widetilde{\mathcal {W}}^{(n)}_{\cdot , F^{(n)}(b_{m})} - \widetilde{\mathcal {W}}^{(n)}_{\cdot , F^{(n)}(a_{m})} \big )\, , \end{aligned}$$
(3.6)

where \({\mathop {=}\limits ^{\mathbb {P}_H}}\) denotes equality in distribution under \(\mathbb {P}_H\). To see this, let \(\pi \) be a permutation of \(\{1,\ldots ,n\}\) such that \(\lambda _{\pi (1)}\le \cdots \le \lambda _{\pi (n)}\). If \(\Pi \) is the \(n\times n\) permutation matrix with entries \(\Pi _{i,j} = \mathbb {1}_{\pi (i)=j}\), then \(\Pi \) is orthogonal. By the invariance of the Haar measure, we have \(U^{(n)}{\mathop {=}\limits ^{\mathbb {P}_H}}U^{(n)}\Pi \), which implies that

$$\begin{aligned} \mathcal {W}_t^{(n)}(h)&{\mathop {=}\limits ^{\mathbb {P}_H}} \sum _{j=1}^n \sum _{i=1}^{\lfloor t n\rfloor } \big ( |({U^{(n)}}\Pi )_{i,j}|^2-\tfrac{1}{n} \big ) h(\lambda _j) \nonumber \\&= \sum _{j=1}^n \sum _{i=1}^{\lfloor t n\rfloor } \big ( |{U^{(n)}_{i,\pi ^{-1}(j)}}|^2-\tfrac{1}{n} \big ) h(\lambda _j) \nonumber \\&= \sum _{j=1}^n \sum _{i=1}^{\lfloor t n\rfloor } \big ( |{U^{(n)}_{i,j}}|^2-\tfrac{1}{n} \big ) h(\lambda _{\pi (j)}) \nonumber \\&= \sum _{m=1}^M \gamma _m \sum _{j=1}^n \sum _{i=1}^{\lfloor t n\rfloor } \big ( |{U^{(n)}_{i,j}}|^2-\tfrac{1}{n} \big ) \mathbb {1}_{\{a_m<\lambda _{\pi (j)}\le b_m \}} , \end{aligned}$$
(3.7)

and the last line equals the right-hand side of (3.6). The equality in distribution in (3.7) also holds when both sides are viewed as functions of t, which implies (3.6). We are now almost in a position to apply Theorem 3.1.

3.1.2 A Subordination Argument

Assumption (A2) implies the \(\mathbb {P}_\Lambda \)-almost sure convergence of \(F^{(n)}(s)\) defined in (3.5) to \(F(s) = \nu ((-\infty ,s])\) for all \(s\in S=\{a_1,b_1,\ldots ,a_M,b_M\}\). Together with Theorem 3.1, this will yield the convergence of \(\widetilde{\mathcal {W}}^{(n)}\) at random time points given by \(F^{(n)}\), and we show in this section the convergence

$$\begin{aligned} ({\mathcal {W}}_t^{(n)}(h))_{t\in [0,1]} \xrightarrow [n \rightarrow \infty ]{d} \big ( \mathcal {W}_t(h)\big )_{t\in [0,1]} := \left( \sum _{m=1}^M \gamma _m \sqrt{\tfrac{2}{\beta }}\big ( \mathcal {B}_{t,F(b_{m})} - \mathcal {B}_{t,F(a_{m})} \big ) \right) _{t\in [0,1]} \end{aligned}$$
(3.8)

\(\mathbb {P}_\Lambda \)-almost surely in distribution under \(\mathbb {P}_H\). Recall that by (3.6) we have \({\mathcal {W}}_t^{(n)}(h) {\mathop {=}\limits ^{\mathbb {P}_H}} \widetilde{\mathcal {W}}_t^{(n)}(h)\). We defined all matrices \(U^{(n)}, n\ge 1\), and therefore also all \(\widetilde{\mathcal {W}}^{(n)}, n\ge 1\), on a common probability space. By the Skorokhod representation theorem, there exists a modification of this space such that \(\widetilde{\mathcal {W}}^{(n)}\rightarrow \sqrt{2/\beta } \mathcal {B}\) almost surely with respect to a measure we again denote by \(\mathbb {P}_H\). The product structure implied by assumption (A1) allows us to extend this to a product space with law \(\mathbb {P}_H\otimes \mathbb {P}_\Lambda \) such that

$$\begin{aligned} \big ((\widetilde{\mathcal {W}}_{s,t}^{(n)})_{s,t\in [0,1]} , (F^{(n)}(s))_{s\in S} \big ) \xrightarrow [n \rightarrow \infty ]{} \left( \left( \sqrt{\tfrac{2}{\beta }}\mathcal {B}_{s,t}\right) _{s,t\in [0,1]}, (F(s))_{s\in S} \right) \end{aligned}$$
(3.9)

\(\mathbb {P}_H\otimes \mathbb {P}_\Lambda \)-almost surely in \(\mathcal {D}([0,1]^2)\times \mathbb {R}^{2M}\). By (3.6), we need to consider

$$\begin{aligned}&\sup _{t\in [0,1]} \big | \widetilde{\mathcal {W}}_t^{(n)}(h) - \mathcal {W}_t(h)\big | \nonumber \\&\quad = \sup _{t\in [0,1]} \left| \sum _{m=1}^M \gamma _m \big ( \widetilde{\mathcal {W}}^{(n)}_{t,F^{(n)}(b_{m})} - \widetilde{\mathcal {W}}^{(n)}_{t,F^{(n)}(a_{m})} \big ) \right. \nonumber \\&\qquad \qquad \qquad \qquad \left. - \sum _{m=1}^M \gamma _m \sqrt{\tfrac{2}{\beta }} \big ( \mathcal {B}_{t,F(b_{m})} - \mathcal {B}_{t,F(a_{m})} \big )\right| \nonumber \\&\quad \le \sum _{m=1}^M |\gamma _m| \left( \sup _{t\in [0,1]} \left| \widetilde{\mathcal {W}}^{(n)}_{t,F^{(n)}(b_{m})} - \sqrt{\tfrac{2}{\beta }}\mathcal {B}_{t,F(b_{m})} \right| \right. \nonumber \\&\qquad \qquad \qquad \qquad \left. + \sup _{t\in [0,1]} \left| \widetilde{\mathcal {W}}^{(n)}_{t,F^{(n)}(a_{m})} - \sqrt{\tfrac{2}{\beta }}\mathcal {B}_{t,F(a_{m})} \right| \right) . \end{aligned}$$
(3.10)

An individual supremum in (3.10) can then be bounded as

$$\begin{aligned}&\sup _{t\in [0,1]} \left| \widetilde{\mathcal {W}}^{(n)}_{t,F^{(n)}(s)} - \sqrt{\tfrac{2}{\beta }} \mathcal {B}_{t,F(s)} \right| \nonumber \\&\quad \le \sup _{t\in [0,1]} \left| \widetilde{\mathcal {W}}^{(n)}_{t,F^{(n)}(s)} - \sqrt{\tfrac{2}{\beta }}\mathcal {B}_{t,F^{(n)}(s)} \right| +\sqrt{\tfrac{2}{\beta }} \sup _{t\in [0,1]} \left| \mathcal {B}_{t,F^{(n)}(s)} - \mathcal {B}_{t,F(s)} \right| \end{aligned}$$
(3.11)

with \(s\in S\). Since \(\mathcal {B}\) has uniformly continuous paths, the convergence of Theorem 3.1 holds with respect to the supremum norm on \(\mathcal {D}([0,1]^2)\), which implies that the first term in (3.11) vanishes as \(n\rightarrow \infty \). Since \(F^{(n)}(s) \rightarrow F(s)\) for \(s\in S\) and using again the uniform continuity of \(\mathcal {B}\), the second term vanishes as well. By the bound in (3.10), the convergence \(\widetilde{\mathcal {W}}^{(n)}(h) \rightarrow \mathcal {W}(h)\) follows \(\mathbb {P}_H\otimes \mathbb {P}_\Lambda \)-almost surely in \(\mathcal {D}([0,1])\). The product structure of the extended probability space then implies that for any bounded continuous G we get

$$\begin{aligned} \mathbb {E}_H\big [G( \mathcal {W}^{(n)}(h)) \big ] =\mathbb {E}_H\big [G( \widetilde{\mathcal {W}}^{(n)}(h)) \big ] \xrightarrow [n \rightarrow \infty ]{} \mathbb {E}\big [G( \mathcal {W}(h)) \big ] \end{aligned}$$
(3.12)

\(\mathbb {P}_\Lambda \)-almost surely, that is, (3.8) holds.

Since \(\mathcal {B}\) is a centered Gaussian process with continuous paths, the same holds for \(\mathcal {W}(h)\). To calculate the covariance, we first note that according to (3.3),

$$\begin{aligned}&\mathrm {Cov}\left( \gamma _m \sqrt{\tfrac{2}{\beta }}(\mathcal {B}_{s,F(b_m) } - \mathcal {B}_{s,F(a_{m}) }),\gamma _\ell \sqrt{\tfrac{2}{\beta }}(\mathcal {B}_{t,F(b_\ell )} - \mathcal {B}_{t,F(a_{\ell }) })\right) \nonumber \\&\quad = \gamma _m\gamma _\ell \tfrac{2}{\beta } (s\wedge t-st)\big [ F(b_m)\wedge F(b_\ell ) - F(b_m)F(b_\ell ) \nonumber \\&\qquad -(F(b_m)\wedge F(a_\ell ) - F(b_m)F(a_\ell )) \nonumber \\&\qquad -(F(a_m)\wedge F(b_\ell ) - F(a_m)F(b_\ell ))+F(a_m)\wedge F(a_\ell ) - F(a_m)F(a_\ell )\big ] . \end{aligned}$$
(3.13)

For \(m\ne \ell \), the minima in (3.13) all cancel and this reduces to

$$\begin{aligned}&\gamma _m\gamma _\ell \tfrac{2}{\beta }(s\wedge t-st)\big [ - (F(b_m)-F(a_m))(F(b_\ell )-F(a_\ell ))\big ] \\&\quad = - (s\wedge t-st) \tfrac{2}{\beta } \int \gamma _m \mathbb {1}_{(a_m,b_m]} \mathrm{d}\nu \cdot \int \gamma _\ell \mathbb {1}_{(a_\ell ,b_\ell ]} \mathrm{d}\nu , \end{aligned}$$

while for \(m=\ell \) we get

$$\begin{aligned}&\gamma _m^2 \tfrac{2}{\beta } (s\wedge t-st)\big [ (F(b_m)-F(a_m))- (F(b_m)-F(a_m))^2\big ] \\&\quad = (s\wedge t-st) \tfrac{2}{\beta } \left[ \int \gamma _m^2 \mathbb {1}_{(a_m,b_m]} \mathrm{d}\nu - \left( \int \gamma _m \mathbb {1}_{(a_m,b_m]} \mathrm{d}\nu \right) ^2 \right] . \end{aligned}$$

Summing over \(m,\ell \), this yields for the covariance

$$\begin{aligned} \mathrm {Cov} (\mathcal {W}_s(h),\mathcal {W}_t(h)) = (s\wedge t-st) \tfrac{2}{\beta } \left[ \int h^2\mathrm{d}\nu - \left( \int h\mathrm{d}\nu \right) ^2 \right] . \end{aligned}$$
(3.14)

That is, \(\mathcal {W}(h)= \sigma _0 (h) B\), with B a standard Brownian bridge. It remains to replace the elementary function h in (3.1) by an arbitrary f.

3.1.3 Extension to General f

Let \(f\in L^2(\nu )\), G be a bounded uniformly continuous functional from \(\mathcal {D}([0,1])\) to \(\mathbb {R}\), and \(\varepsilon >0\). Denote by d the Skorokhod \(J_1\)-metric on \(\mathcal {D}([0,1])\), defined analogously to (3.2), and let \(\delta <1\) be so small that \(d(X,Y) \le \delta \) implies \(|G(X) - G(Y)| \le \varepsilon \). In order to extend the convergence of \(\mathcal {W}^{(n)}(h)\) to \(\mathcal {W}^{(n)}(f)\), we need an a priori estimate on the distance between the processes \(\mathcal {W}^{(n)}(h)\) and \(\mathcal {W}^{(n)}(f)\). The proof of the following lemma is postponed to the end of this section.

Lemma 3.2

There exists a constant \(c>0\), such that for \(\eta >0\) and \(g:\mathbb {R}\rightarrow \mathbb {R}\) measurable,

$$\begin{aligned} \limsup _{n\rightarrow \infty } \mathbb {P}_H \left( \sup _{t\in [0,1]}|\mathcal {W}^{(n)}_t(g)| > \eta \right) \le \limsup _{n\rightarrow \infty }\frac{c}{\eta ^2} (\hat{\mu }^{(n)}(g^2) - \hat{\mu }^{(n)}(g)^2) , \end{aligned}$$

\(\mathbb {P}_\Lambda \)-almost surely.

Note that for g satisfying (A2), the upper bound in Lemma 3.2 is at most \(c\sigma _0^2(g)/\eta ^2\). We may approximate f by a piecewise constant function \(h=h_\varepsilon \) as in (3.1), such that \(||f-h||_{L^2(\nu )}\le \delta ^2\varepsilon \). We want to apply Lemma 3.2 with \(g=f-h\); however, in order to control the upper bound we need to control \(\hat{\mu }^{(n)}(fh)\). For this, we write \(f=f_+-f_-\) with \(f_\pm \ge 0\) and assume that the positive and negative parts \(f_+\) and \(f_-\) are approximated by \(h_+\) and \(h_-\), respectively, with \(h_\pm \ge 0\) and such that \(h_\pm \le f_\pm \). Then, we can estimate

$$\begin{aligned} \hat{\mu }^{(n)}(fh) = \hat{\mu }^{(n)}(f_+h_+)+\hat{\mu }^{(n)}(f_-h_-)\ge \hat{\mu }^{(n)}(h_+^2)+\hat{\mu }^{(n)}(h_-^2) = \hat{\mu }^{(n)}(h^2), \end{aligned}$$
(3.15)

such that

$$\begin{aligned} \hat{\mu }^{(n)}((f-h)^2) \le \hat{\mu }^{(n)}(f^2-h^2) . \end{aligned}$$
(3.16)

By assumption (A2), \(\hat{\mu }^{(n)}(f^2)\rightarrow \nu (f^2)\) \(\mathbb {P}_\Lambda \)-almost surely and the elementary form of h in (3.1) implies \(\hat{\mu }^{(n)}(h^2)\rightarrow \nu (h^2)\) as well. Hence, the right-hand side of (3.16) converges \(\mathbb {P}_\Lambda \)-almost surely to \(\nu (f^2-h^2)\le 2 ||f-h||_{L^2(\nu )}||f||_{L^2(\nu )}\). We have \(\mathbb {P}_\Lambda \)-almost surely

$$\begin{aligned} \mathbb {E}_H [|G(\mathcal {W}^{(n)}(f))-G(\mathcal {W}^{(n)}(h_\varepsilon ))|]&\le \varepsilon + 2 ||G||_\infty \mathbb {P}_H\big ( d(\mathcal {W}^{(n)}(f),\mathcal {W}^{(n)}(h_\varepsilon ))>\delta \big ) \nonumber \\&\le \varepsilon + 2 ||G||_\infty \mathbb {P}_H\big ( ||\mathcal {W}^{(n)}(f)-\mathcal {W}^{(n)}(h_\varepsilon )||_\infty >\delta \big ) , \end{aligned}$$
(3.17)

so that we obtain from Lemma 3.2 with \(g=f-h_\varepsilon \) and (3.16)

$$\begin{aligned} \limsup _{n\rightarrow \infty } \big | \mathbb {E}_H[G(\mathcal {W}^{(n)}(f))]-\mathbb {E}[G(\mathcal {W}(h_\varepsilon ))] \big |&\le \varepsilon +2||G||_\infty \delta ^{-2} \nu (f^2-h_\varepsilon ^2) \nonumber \\&\le \varepsilon +4||G||_\infty \delta ^{-2} ||f-h_\varepsilon ||_{L^2(\nu )}||f||_{L^2(\nu )} \nonumber \\&\le \varepsilon +4\varepsilon ||G||_\infty ||f||_{L^2(\nu )} . \end{aligned}$$
(3.18)

Furthermore, if we set \(\mathcal {W}(f) = \sigma _0 (f) B\), then \(\mathcal {W}(h_\varepsilon )\) and \(\mathcal {W}(f)\) are Gaussian processes with covariance \((s\wedge t -st)\sigma _0^2(h_\varepsilon )\) and \((s\wedge t -st)\sigma _0^2(f)\), respectively, and since \(h_\varepsilon \rightarrow f\) in \(L^2(\nu )\) as \(\varepsilon \rightarrow 0\),

$$\begin{aligned} \mathcal {W}(h_\varepsilon ) = \sigma _0 (h_\varepsilon ) B \xrightarrow [\varepsilon \rightarrow 0 ]{} \sigma _0 (f) B=\mathcal {W}(f) . \end{aligned}$$
(3.19)

The combination of (3.18) and (3.19) shows that we may replace h in (3.12) by any \(f\in L^2(\nu )\), so that \(\mathcal {W}^{(n)}(f)\) converges to \(\mathcal {W}(f)\) in distribution under \(\mathbb {P}_H\), for \(\mathbb {P}_\Lambda \)-almost all \(\lambda \). \(\square \)

Proof of Lemma 3.2:

We write

$$\begin{aligned} \mathcal {W}_t^{(n)}(g) = \sum _{j=1}^n \sum _{i=1}^{\lfloor tn\rfloor } \big ( |U_{i,j}|^2-\tfrac{1}{n} \big ) g(\lambda _j) = \sum _{i=1}^{\lfloor tn\rfloor } Y_{i,n}, \end{aligned}$$

where

$$\begin{aligned} Y_{i,n}= \sum _{j=1}^n \big ( |U_{i,j}|^2-\tfrac{1}{n} \big ) g(\lambda _j) . \end{aligned}$$

By the invariance of the Haar distribution, the vector of increments \((Y_{1,n},\ldots ,Y_{n,n})\) is exchangeable under \(\mathbb {P}_H\), meaning that any permutation of the \(Y_{i,n}\) has the same distribution. Corollary 2 in [38] shows that there exists a universal constant \(c>0\), such that

$$\begin{aligned} \mathbb {P}_H \left( \sup _{1\le k\le n} \left| \sum _{i=1}^{k} Y_{i,n}\right|> \eta \right) \le c \mathbb {P}_H \left( \left| \sum _{i=1}^{\lfloor n/2 \rfloor } Y_{i,n}\right| > \eta /c \right) \end{aligned}$$

and the right-hand side can be bounded by \(c^3\mathbb {E}_H[\mathcal {W}^{(n)}_{1/2}(g)^2]/\eta ^2\). The calculations in Sect. 4, more precisely taking the \(\limsup \) in (4.9), show that this upper bound implies the statement of Lemma 3.2. \(\square \)

3.2 Proof of Theorem 2.2

The functional CLT for the process \(\mathcal {X}^{(n)}(f) = \mathcal {W}^{(n)}(f) + \mathcal {Z}^{(n)}(f)\) is a direct consequence of the almost sure convergence in distribution of \(\mathcal {W}^{(n)}(f)\) in Theorem 2.1 and of the CLT in (A3), which implies the convergence of \(\mathcal {Z}^{(n)}(f)\). To combine these two limits, we use the following lemma.

Lemma 3.3

Let \((\Omega _1\times \Omega _2, \mathcal {G}, \mathbb {P}_1\otimes \mathbb {P}_2)\) be a probability space and \(X^{(n)}:\Omega _1\times \Omega _2\rightarrow \Omega '\) and \(Y^{(n)}:\Omega _2 \rightarrow \Omega '\) random variables, where \(\Omega '\) is a separable metric space with Borel \(\sigma \)-algebra. If \(Y^{(n)}\) converges to Y in distribution under \(\mathbb {P}_2\) and

$$\begin{aligned} \mathbb {E}_1[F(X^{(n)})] \xrightarrow [n \rightarrow \infty ]{} \mathbb {E}[F(X)] \end{aligned}$$
(3.20)

\(\mathbb {P}_2\)-almost surely for any bounded continuous \(F:\Omega '\rightarrow \mathbb {R}\), where \(\mathbb {E}_1,\mathbb {E}\) is the expectation with respect to \(\mathbb {P}_1, \mathbb {P}_1\otimes \mathbb {P}_2\), respectively, then

$$\begin{aligned} (X^{(n)},Y^{(n)}) \xrightarrow [n \rightarrow \infty ]{d} (X,Y) \end{aligned}$$
(3.21)

in distribution under \(\mathbb {P}_1\otimes \mathbb {P}_2\), with X and Y independent.

Proof

The main observation is that functions \(\mathcal {F}:\Omega '\times \Omega '\rightarrow \mathbb {R}\) with \(\mathcal {F}(x,y)=F(x)G(y)\) and F, G bounded continuous are sufficient to determine convergence in distribution; see Lemma 4.1 in [22]. For such F and G, we have

$$\begin{aligned}&\big | \mathbb {E}[F(X^{(n)})G(Y^{(n)})] - \mathbb {E}[F(X)]\mathbb {E}[G(Y)] \big | \\&\quad \le \big | \mathbb {E}[(F(X^{(n)})-\mathbb {E}[F(X)]) G(Y^{(n)})]\big | + \big | \mathbb {E}[F(X)] (\mathbb {E}[G(Y^{(n)})] - \mathbb {E}[G(Y)]) \big | \\&\quad = \big | \mathbb {E}_2[(\mathbb {E}_1[F(X^{(n)})]-\mathbb {E}[F(X)]) G(Y^{(n)})]\big | + \big | \mathbb {E}[F(X)] (\mathbb {E}[G(Y^{(n)})] - \mathbb {E}[G(Y)]) \big | . \end{aligned}$$

The first term vanishes by dominated convergence using (3.20), and the second one by the convergence of \(Y^{(n)}\) under \(\mathbb {P}_2\). \(\square \)

To prove Theorem 2.2, set \(\mathbb {P}_1=\mathbb {P}_H, \mathbb {P}_2=\mathbb {P}_\Lambda \), and \(X^{(n)}= (\mathcal {W}_t^{(n)}(f))_{t\in [0,1]}\), \(Y^{(n)}= (\mathcal {Z}_t^{(n)}(f))_{t\in [0,1]}\). Then, by Theorem 2.1, the convergence (3.20) holds with limit \(X = (\mathcal {W}_t(f))_{t\in [0,1]}\) and by assumption (A3), \(Y^{(n)}\) converges in distribution under \(\mathbb {P}_2\) to \(Y=(t\mathcal {Z}(f))_{t\in [0,1]}\), with \(\mathcal {Z}(f) \sim \mathcal {N} (0,\sigma _1^2(f))\) (recall (2.14)). Then, Lemma 3.3 implies the convergence

$$\begin{aligned} (\mathcal {X}_t^{(n)}(f))_{t\in [0,1]} = \big ( \mathcal {W}_t^{(n)}(f) + \mathcal {Z}_t^{(n)}(f) \big )_{t\in [0,1]} \xrightarrow [n \rightarrow \infty ]{d} \big ( \mathcal {W}_t(f) + t\mathcal {Z}(f) \big )_{t\in [0,1]} \end{aligned}$$
(3.22)

under \(\mathbb {P}\), with \(\mathcal {W}_t(f)\) and \(\mathcal {Z}(f)\) independent. This is the convergence claimed in Theorem 2.2. \(\square \)

3.3 Proof of Theorem 2.3

Let \(\beta '=\beta /2\). It follows from the Haar distribution of \(U^{(n)}\) that the vector of weights \((|U^{(n)}_{1,1}|^2, \ldots ,|U^{(n)}_{1,n}|^2)\) has a homogeneous Dirichlet distribution \({\text {Dir}}_n(\beta ')\), which is defined by the Lebesgue density of the first \(n-1\) coordinates, proportional to

$$\begin{aligned} \big ( x_1 \cdots x_{n-1}(1-x_1-\cdots -x_{n-1}) \big )^{\beta ' -1} \mathbb {1}_{\{ x_i > 0, x_1+\cdots +x_{n-1}<1 \} } . \end{aligned}$$

The uniform distribution on the standard simplex thus corresponds to \(\beta =2\). We will prove the CLT for weights following the general distribution \({\text {Dir}}_n(\beta ')\) for any \(\beta '>0\), since it makes no difference in the proof. The starting point is the observation that the Dirichlet distribution can be generated by self-normalizing a vector of independent gamma random variables. More precisely, let \(\gamma _1, \ldots , \gamma _n\) be independent and identically gamma-distributed with parameters \((\beta ',1)\) and mean \(\beta '\); then

$$\begin{aligned} \left( \frac{\gamma _1}{\gamma _1+ \cdots + \gamma _n}, \ldots , \frac{\gamma _n}{\gamma _1+ \cdots + \gamma _n}\right) \sim {\text {Dir}}_n(\beta '). \end{aligned}$$
(3.23)
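A minimal sketch of the construction (3.23) (our own check; \(\beta '=1/2\) matches the squared first row of a Haar orthogonal matrix, which is a normalized vector of squared i.i.d. Gaussians):

```python
import numpy as np

rng = np.random.default_rng(6)

def dirichlet_via_gammas(n, beta_prime, size):
    # normalize i.i.d. Gamma(beta', 1) variables as in (3.23)
    g = rng.gamma(beta_prime, 1.0, size=(size, n))
    return g / g.sum(axis=1, keepdims=True)

n, size = 4, 50000
d = dirichlet_via_gammas(n, 0.5, size)
x = rng.standard_normal((size, n))
row = x ** 2 / (x ** 2).sum(axis=1, keepdims=True)     # squared first row of a Haar orthogonal matrix
print(d.mean(axis=0), row.mean(axis=0))                # both approx 1/n
print((d ** 2).mean(axis=0), (row ** 2).mean(axis=0))  # matching second moments
```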

The moment generating function of \(\gamma _1\) is given for \(t<1\) as

$$\begin{aligned} \mathbb {E} \big [ e^{t \gamma _1} \big ] = (1-t)^{-\beta '}. \end{aligned}$$
(3.24)

Define the nonnegative measure

$$\begin{aligned} \tilde{\mu }_1^{(n)}= \frac{1}{n\beta '}\sum _{k=1}^n \gamma _k \delta _{\lambda _k} , \end{aligned}$$
(3.25)

then by (3.23) the normalized measure \(\tilde{\mu }_1^{(n)}\cdot \tilde{\mu }_1^{(n)}(1)^{-1}\) has the same distribution as \({\mu }_1^{(n)}\). We first prove the convergence with \(\mu _1^{(n)}\) replaced by \(\tilde{\mu }_1^{(n)}\). Assume without loss of generality that \(\nu (f)=0\). The moment generating function with respect to \(\mathbb {P}_H\) is

$$\begin{aligned}&\mathbb {E}_H\left[ \exp \left\{ t \sqrt{n} (\tilde{\mu }_1^{(n)}(f)-\hat{\mu }^{(n)}(f) ) \right\} \right] \nonumber \\&\quad = \prod _{k=1}^n\mathbb {E}_H\left[ \exp \left\{ t (\sqrt{n}\beta ')^{-1} \gamma _k f(\lambda _k) \right\} \right] \exp \{ -t \sqrt{n}^{-1}f(\lambda _k) \} \nonumber \\&\quad = \prod _{k=1}^n \big ( 1- t (\sqrt{n}\beta ')^{-1} f(\lambda _k) \big )^{-\beta '} \exp \{ -t \sqrt{n}^{-1}f(\lambda _k) \} \nonumber \\&\quad = \exp \left\{ \sum _{k=1}^n \left( -\beta ' \log \big ( 1- t (\sqrt{n}\beta ')^{-1} f(\lambda _k) \big ) -t \sqrt{n}^{-1}f(\lambda _k) \right) \right\} , \end{aligned}$$
(3.26)

where we used the independence of the \(\gamma _k\) and the independence of weights and eigenvalues, and where we take \(|t|< \beta '||f||^{-1}_\infty \). Expanding the logarithm as \(\log (1+x) = x -x^2/2 + r(x)\) with \(|r(x)|\le |x|^3 \) for \(|x|\le 1/2\), this gives

$$\begin{aligned} \mathbb {E}_H \left[ \exp \left\{ t \sqrt{n} (\tilde{\mu }_1^{(n)}(f)-\hat{\mu }^{(n)}(f) ) \right\} \right] = \exp \left\{ t^2/2 (\beta ')^{-1} \hat{\mu }^{(n)}(f^2) + R_n(t,f) \right\} , \end{aligned}$$
(3.27)

with \(|R_n(t,f)|\le \sqrt{n}^{-1} (\beta ')^{-2} |t|^3||f||_\infty ^3\) for n large enough. By assumption (A2), \(\hat{\mu }^{(n)}(f^2)\) converges to \(\nu (f^2)=\nu (f^2)-\nu (f)^2\) almost surely with respect to \(\mathbb {P}_\Lambda \). Since \(\hat{\mu }^{(n)}(f^2)\) and \(R_n(t,f)\) are uniformly bounded (for t and f fixed), we have by dominated convergence

$$\begin{aligned} \lim _{n\rightarrow \infty } \mathbb {E}\left[ \exp \left\{ t \sqrt{n} (\tilde{\mu }_1^{(n)}(f)-\hat{\mu }^{(n)}(f) ) \right\} \right] = \exp \left\{ t^2/2 (\beta ')^{-1} \nu (f^2) \right\} , \end{aligned}$$
(3.28)

that is,

$$\begin{aligned} \sqrt{n} (\tilde{\mu }_1^{(n)}(f)-\hat{\mu }^{(n)}(f) ) \xrightarrow [n \rightarrow \infty ]{d} \mathcal {N}(0,(\beta ')^{-1} \nu (f^2)) . \end{aligned}$$
(3.29)

In order to come back to the original measure \(\mu _1^{(n)}\), which has the same distribution as \(\tilde{\mu }_1^{(n)}\cdot \tilde{\mu }_1^{(n)}(1)^{-1}\), we write

$$\begin{aligned} \sqrt{n} \big (\mu _1^{(n)}(f)-\hat{\mu }^{(n)}(f) \big )&= \sqrt{n} \big (\tilde{\mu }_1^{(n)}(f)-\hat{\mu }^{(n)}(f) \big )\tilde{\mu }_1^{(n)}(1)^{-1} \nonumber \\&\quad + \sqrt{n} \hat{\mu }^{(n)}(f)\big ( \tilde{\mu }_1^{(n)}(1)^{-1} -1 \big ) . \end{aligned}$$
(3.30)

By the strong law of large numbers, \(\tilde{\mu }_1^{(n)}(1)\) converges almost surely to \(\mathbb {E}[\tilde{\mu }_1^{(n)}(1)]= \mathbb {E}[(\beta ')^{-1} \gamma _1] = 1\). So to conclude the convergence (3.29) with \(\tilde{\mu }_1^{(n)}\) replaced by \({\mu }_1^{(n)}\), it suffices to show that the second term in (3.30) vanishes in probability. Since \(\hat{\mu }^{(n)}(f)\) converges almost surely to \(\nu (f)=0\), this will follow if \(\sqrt{n}(\tilde{\mu }_1^{(n)}(1)-1)\) is bounded in \(L^2(\mathbb {P})\), which is easily checked by

$$\begin{aligned} n \mathbb {E}\big [ (\tilde{\mu }_1^{(n)}(1)-1)^2 \big ] = n \mathbb {E} \left[ \left( \frac{1}{n} \sum _{i=1}^n ((\beta ')^{-1}\gamma _i-1)\right) ^2\right] = (\beta ')^{-2} {\text {Var}}(\gamma _1) = (\beta ')^{-1} . \end{aligned}$$
(3.31)

Hence, the last term in (3.30) vanishes in probability and by (3.29) the left-hand side of (3.30) converges to \( \mathcal {N}(0,(\beta ')^{-1} \nu (f^2))\) in distribution. Since \(\nu (f)=0\), we have \((\beta ')^{-1}\nu (f^2)=\tfrac{2}{\beta }(\nu (f^2)-\nu (f)^2)=\sigma _0^2(f)\), which proves the first convergence in Theorem 2.3.
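The convergence (3.29) can also be checked directly by simulation, without any matrices; a minimal sketch (our own, with deterministic support points and \(\beta '=1/2\)):

```python
import numpy as np

rng = np.random.default_rng(7)
bp = 0.5                                     # beta' = 1/2, the orthogonal case
n, reps = 2000, 4000
lam = np.arange(1, n + 1) / n                # deterministic support points
f = lam - lam.mean()                         # centered test values, nu(f) = 0
vals = []
for _ in range(reps):
    g = rng.gamma(bp, 1.0, n)                # i.i.d. Gamma(beta', 1) weights
    mu_tilde = (g / (n * bp)) @ f            # tilde mu_1^{(n)}(f) of (3.25)
    vals.append(np.sqrt(n) * (mu_tilde - f.mean()))
print(np.var(vals), (f ** 2).mean() / bp)    # empirical vs. (beta')^{-1} nu(f^2)
```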

The second convergence in Theorem 2.3 will follow from Lemma 3.3. To apply it to the present setting, we may set \(\mathbb {P}_1=\mathbb {P}_H\), \(\mathbb {P}_2= \mathbb {P}_\Lambda \),

$$\begin{aligned} X^{(n)}= \sqrt{n} \big (\mu _1^{(n)}(f)-\hat{\mu }^{(n)}(f) \big ), \qquad Y^{(n)}= \sqrt{n} (\hat{\mu }^{(n)}(f)-\mathbb {E}[\hat{\mu }^{(n)}(f)]) . \end{aligned}$$
(3.32)

By assumption, \(Y^{(n)}\) converges in distribution under \(\mathbb {P}_\Lambda \) to \(Y\sim \mathcal {N}(0,\hat{\sigma }^2(f))\). From (3.27), we get for any \(t\in (-\beta '||f||^{-1}_\infty ,\beta '||f||^{-1}_\infty )\)

$$\begin{aligned} \lim _{n\rightarrow \infty } \mathbb {E}_H\left[ \exp \left\{ t X^{(n)}\right\} \right] = \exp \left\{ t^2/2 (\beta ')^{-1} {\nu (f^2)} \right\} , \end{aligned}$$
(3.33)

\(\mathbb {P}_\Lambda \)-almost surely. Since the moment generating functions are continuous in t, the \(\mathbb {P}_\Lambda \)-null sets can be chosen independently of t, that is, \(\mathbb {P}_\Lambda \)-almost surely, (3.33) holds for all \(t\in (-\beta '||f||^{-1}_\infty ,\beta '||f||^{-1}_\infty )\) simultaneously. Therefore, \(X^{(n)}\) converges \(\mathbb {P}_\Lambda \)-almost surely in distribution under \(\mathbb {P}_H\) to a limit \(X\sim \mathcal {N}(0,\sigma _0^2(f))\), whose distribution does not depend on the realization of the sequence of eigenvalues, that is, (3.20) holds. Lemma 3.3 then implies the convergence of \(X^{(n)}+Y^{(n)}\) to \(X+Y\). Noting that \(\mathbb {E}[\hat{\mu }^{(n)}(f)]=\mathbb {E}[ \mu _1^{(n)}(f)]\), this finishes the proof. \(\square \)

4 Calculation of the Covariance

In this section, we prove that in the setting of Theorem 2.1 for \(s,t \in [0,1]\),

$$\begin{aligned} \lim _{n\rightarrow \infty } {\text {Cov}}_H \big (\mathcal {X}_s^{(n)}(f),\mathcal {X}_t^{(n)}(f) \big ) = (s\wedge t -st ) \sigma _0^2(f) \end{aligned}$$
(4.1)

\(\mathbb {P}_\Lambda \)-almost surely, where \({\text {Cov}}_H\) denotes the covariance with respect to \(\mathbb {P}_H\). Note that in (3.14) we already computed the covariance of \(\mathcal {W}(h)\). However, this computation was only valid for elementary functions h, and to extend it to general functions, we needed the more general variance estimate in the proof of Lemma 3.2. Proving (4.1) requires computing some mixed moments of entries of the eigenvector matrix \(U^{(n)}\), where for the sake of a lighter notation, we drop the superscript. We recall that if \(U=(U_{i,j})_{i,j}\) is Haar-distributed on the unitary (\(\beta =2\)) or the orthogonal (\(\beta =1\)) group, \((|U_{i,1}|^2,\ldots ,|U_{i,n}|^2)\) is \({\text {Dir}}_n(\beta ')\) distributed. Each \(|U_{i,j}|^2\) then follows a beta distribution with parameters \((\beta ',\beta '(n-1))\) and therefore

$$\begin{aligned} \mathbb {E}_H[|U_{i,j}|^2] = \frac{1}{n},\quad \mathbb {E}_H[|U_{i,j}|^4] = \frac{1+\beta '}{n(\beta 'n+1)} . \end{aligned}$$
(4.2)

Moreover, if \(j\ne k\), then \((|U_{i,j}|^2,|U_{i,k}|^2)\) is Dirichlet-distributed with parameters \((\beta ',\beta ',\beta '(n-2))\), which implies

$$\begin{aligned} \mathbb {E}_H[|U_{i,j}|^2|U_{i,k}|^2] = \frac{\beta '}{n(\beta 'n+1)} . \end{aligned}$$
(4.3)

If additionally \(m\ne i\), then using

$$\begin{aligned} \mathbb {E}_H[|U_{i,j}|^2]&= \sum _{m'=1}^n \mathbb {E}_H[|U_{i,j}|^2|U_{m',k}|^2] = \mathbb {E}_H[|U_{i,j}|^2|U_{i,k}|^2] \nonumber \\&\quad + (n-1) \mathbb {E}_H[|U_{i,j}|^2|U_{m,k}|^2] , \end{aligned}$$
(4.4)

we see that by (4.2) and (4.3)

$$\begin{aligned} \mathbb {E}_H[|U_{i,j}|^2|U_{m,k}|^2] = \frac{(n-1)\beta '+1}{n(n-1)(\beta 'n+1)} . \end{aligned}$$
(4.5)

The identities (4.2)-(4.5) can also be obtained from [13], or for \(\beta =2\) from Proposition 4.2.3 of [23]. Now let \(s,t\in [0,1]\) and set \(s_n = \lfloor sn \rfloor \), \(t_n = \lfloor tn \rfloor \). If \(s\in \{0,1\}\) (or \(t\in \{0,1\}\)), then \(\mathcal {X}_s^{(n)}(f)\) (or \(\mathcal {X}_t^{(n)}(f)\)) is \(\mathbb {P}_H\)-almost surely constant, since it vanishes for \(s=0\) and depends only on the eigenvalues for \(s=1\), so (4.1) is trivially true. So assume that \(s,t \in (0,1)\), and n is so large that \(s_n,t_n\ge 1\). Without loss of generality, let \(s \le t\). Then, we get by (2.2) for the mixed moment with respect to \(\mathbb {P}_H\)

$$\begin{aligned} \mathbb {E}_H \left[ X^{(n)}_s(f) X^{(n)}_t(f)\right]&= \mathbb {E}_H \left[ \left( \sum _{i=1}^n \sum _{l=1}^{s_n} |U_{l,i}|^2 f(\lambda _i) \right) \left( \sum _{j=1}^n \sum _{m=1}^{t_n} |U_{m,j}|^2 f(\lambda _j) \right) \right] \\&= \sum _{l=1}^{s_n} \sum _{m=1}^{t_n} \sum _{i,j=1}^n \mathbb {E}_H \left[ |U_{l,i}|^2 |U_{m,j}|^2\right] f(\lambda _i)f(\lambda _j) . \end{aligned}$$

Since \(s_n\le t_n\), this sum can be decomposed into the parts \(l\ne m\le s_n\), \(l=m\le s_n\) and \(l\le s_n<m\le t_n\) as

$$\begin{aligned}&\sum _{l,m=1, l\ne m}^{s_n} \sum _{i,j=1}^n \mathbb {E}_H \left[ |U_{l,i}|^2 |U_{m,j}|^2\right] f(\lambda _i)f(\lambda _j) \end{aligned}$$
(4.6)
$$\begin{aligned}&\quad + \sum _{l=1}^{s_n} \sum _{i,j=1}^n \mathbb {E}_H \left[ |U_{l,i}|^2 |U_{l,j}|^2\right] f(\lambda _i)f(\lambda _j) \end{aligned}$$
(4.7)
$$\begin{aligned}&\quad + \sum _{l=1}^{s_n} \sum _{m=s_n+1}^{t_n}\sum _{i,j=1}^n \mathbb {E}_H \left[ |U_{l,i}|^2 |U_{m,j}|^2\right] f(\lambda _i)f(\lambda _j) . \end{aligned}$$
(4.8)

In (4.6) and (4.8), we have \(l\ne m\) and the inner sum over i, j is equal to

$$\begin{aligned}&\sum _{i=1}^n \mathbb {E}_H [ |U_{l,i}|^2 |U_{m,i}|^2 ] f(\lambda _i)^2 + \sum _{i\ne j} \mathbb {E}_H [ |U_{l,i}|^2 |U_{m,j}|^2 ] f(\lambda _i)f(\lambda _j) \\&\quad = \frac{\beta '}{n(\beta 'n+1)} \sum _{i=1}^n f(\lambda _i)^2 + \frac{(n-1)\beta '+1}{n(n-1)(\beta 'n+1)} \sum _{i\ne j} f(\lambda _i)f(\lambda _j) \\&\quad = \frac{\beta '}{n(\beta 'n+1)} X^{(n)}_1(f^2) + \frac{(n-1)\beta '+1}{n(n-1)(\beta 'n+1)} (X^{(n)}_1(f)^2- X^{(n)}_1(f^2) ) \\&\quad = \frac{(n-1)\beta '+1}{n(n-1)(\beta 'n+1)} X^{(n)}_1(f)^2 - \frac{1}{n(n-1)(\beta 'n+1)} X^{(n)}_1(f^2) . \end{aligned}$$

For the inner sum in (4.7), where \(l=m\), we get

$$\begin{aligned}&\sum _{i=1}^n \mathbb {E}_H [ |U_{l,i}|^4 ] f(\lambda _i)^2 + \sum _{i\ne j} \mathbb {E}_H [ |U_{l,i}|^2 |U_{l,j}|^2 ] f(\lambda _i)f(\lambda _j) \\&\quad = \frac{1+\beta '}{n(\beta 'n+1)} \sum _{i=1}^n f(\lambda _i)^2 + \frac{\beta '}{n(\beta 'n+1)} \sum _{i\ne j} f(\lambda _i)f(\lambda _j) \\&\quad = \frac{\beta '}{n(\beta 'n+1)} X^{(n)}_1(f)^2 + \frac{1}{n(\beta 'n+1)} X^{(n)}_1(f^2) . \end{aligned}$$

Summing over l and m, we obtain

$$\begin{aligned}&\mathbb {E}_H \left[ X^{(n)}_s(f) X^{(n)}_t(f)\right] \\&\quad = (s_n(s_n-1)+s_n(t_n-s_n)) \\&\qquad \qquad \left( \frac{(n-1)\beta '+1}{n(n-1)(\beta 'n+1)} X^{(n)}_1(f)^2 - \frac{1}{n(n-1)(\beta 'n+1)} X^{(n)}_1(f^2)\right) \\&\qquad + s_n \left( \frac{\beta '}{n(\beta 'n+1)} X^{(n)}_1(f)^2 + \frac{1}{n(\beta 'n+1)} X^{(n)}_1(f^2) \right) . \end{aligned}$$

From this, we have to subtract the product of the expectations, which we expand as

$$\begin{aligned} \mathbb {E}_H \big [ X^{(n)}_s(f)\big ] \mathbb {E}_H \big [ X^{(n)}_t(f) \big ]&= \frac{s_nt_n}{n^2} X^{(n)}_1(f)^2 = \frac{s_n(t_n-1)}{n^2} X^{(n)}_1(f)^2 \\&\quad + \frac{s_n}{n^2} X^{(n)}_1(f)^2 . \end{aligned}$$

For the covariance, these terms combine conveniently:

$$\begin{aligned}&\mathrm {Cov}_H(X^{(n)}_s(f) , X_t^{(n)}(f)) \nonumber \\&\quad = s_n(t_n-1) \left( \left( \frac{(n-1)\beta '+1}{n(n-1)(\beta 'n+1)} - \frac{1}{n^2}\right) X^{(n)}_1(f)^2 \right. \nonumber \\&\qquad \qquad \qquad \qquad \qquad \left. - \frac{1}{n(n-1)(\beta 'n+1)} X^{(n)}_1(f^2)\right) \nonumber \\&\qquad + s_n \left( \left( \frac{\beta '}{n(\beta 'n+1)} - \frac{1}{n^2}\right) X^{(n)}_1(f)^2 + \frac{1}{n(\beta 'n+1)} X^{(n)}_1(f^2) \right) \nonumber \\&\quad = \frac{s_n(t_n-1)}{n(n-1)} \left( \frac{1}{n(\beta 'n+1)} X^{(n)}_1(f)^2 - \frac{1}{\beta 'n+1} X^{(n)}_1(f^2)\right) \nonumber \\&\qquad + \frac{s_n}{n} \left( \frac{-1}{n(\beta 'n+1)} X^{(n)}_1(f)^2 + \frac{1}{\beta 'n+1} X^{(n)}_1(f^2) \right) . \end{aligned}$$
(4.9)

Now, \(\mathbb {P}_\Lambda \)-almost surely, \(\frac{1}{n} X^{(n)}_1(f) \rightarrow \nu (f) \) and \(\frac{1}{n} X^{(n)}_1(f^2)\rightarrow \nu (f^2)\) by (A2), so that, as \(n\rightarrow \infty \), (4.9) converges to

$$\begin{aligned}&st (\beta ')^{-1} \left( \nu (f)^2 - \nu (f^2)\right) + s (\beta ')^{-1} \left( \nu (f^2) - \nu (f)^2\right) \\&\quad = (s\wedge t -st ) (\beta ')^{-1} \left( \nu (f^2) - \nu (f)^2\right) , \end{aligned}$$

which is precisely the right-hand side of (4.1), since \((\beta ')^{-1}(\nu (f^2)-\nu (f)^2) = \sigma _0^2(f)\).