1 Introduction

We consider the empirical spectral measure of random Laplacian-type matrices of the form

$$\begin{aligned} L_n=A_n-D_n \end{aligned}$$
(1.1)

where \(A_n=(A_{ij})_{i,j=1}^{n}\) is an \(n\times n\) real symmetric random matrix with independent entries up to symmetry, and \(D_n\) is a diagonal matrix with \((D_n)_{ii}=\sum _{j=1}^n A_{ij}\). When \(A_n\) is a Wigner matrix, i.e. \(A_n\) has independent entries up to symmetry with mean zero and variance \(\frac{1}{n}\), the empirical spectral measure of \(A_n\) converges to Wigner’s semicircle law, the empirical spectral measure of \(D_n\) converges to a standard Gaussian distribution, and it was shown in [10] that the empirical spectral measure of \(L_n\) converges to the free convolution of the semicircle law and the standard real Gaussian measure. In this paper, we will consider \(A_n\) such that the diagonal entries of \(D_n\) converge in distribution, not to the Gaussian distribution, but rather to a non-Gaussian infinitely divisible distribution. This model will include Lévy matrices, sometimes referred to as heavy-tailed Wigner matrices, where the entries of \(A_n\) are independent up to symmetry but have infinite second moment; see Subsection 1.1 for more details. Another important example arises when \(A_n\) is the adjacency matrix of an Erdős–Rényi random graph where the expected degree of any vertex remains fixed as the number of vertices goes to infinity. These \(A_n\) fall into the class of Lévy–Khintchine matrices, a generalization of Lévy matrices defined by Jung in [24]; see Subsection 1.2 for more on these matrices.

The term Laplacian comes from graph theory, where the combinatorial Laplacian of a graph with vertex set \(\{1,2,\dots ,n\}\) is defined by:

$$\begin{aligned} L_{ij}={\left\{ \begin{array}{ll} \deg (i),\,\text { if } i=j\\ -1,\,\text { if } i\sim j\\ 0,\,\text { if } i\not \sim j, \end{array}\right. } \end{aligned}$$
(1.2)

where \(i\sim j\) if \(\{i,j\}\) is an edge in the graph and \(\deg (i)\) is the number of edges incident to a vertex i. The combinatorial Laplacian is the negative of what we refer to as the Laplacian. If the entries of \(A_n\) are almost surely nonnegative, then \(L_n\) is the infinitesimal generator of a (random) continuous time random walk, and for this reason, \(L_n\) is referred to as a Markov matrix in some of the literature. We use the term Laplacian throughout. Spectral properties of real symmetric random Laplacian matrices have been studied in [4, 10, 12, 13, 16, 17, 21, 22, 23], and for non-symmetric random Laplacian matrices in [6], when the entries of \(A_n\) are in the domain of attraction of either a real or complex Gaussian random variable. Owing to the widespread use of graph Laplacians, this list is necessarily incomplete. In these light-tailed cases, the limiting spectral measure has a particularly nice free probabilistic interpretation (see [27] for an introduction to free probability and random matrices). In [10], Bryc, Dembo, and Jiang proved the following:

Theorem 1.1

(Theorem 1.3 in [10]) Let \(\{X_{ij}:j\ge i\ge 1 \}\) be a collection of i.i.d. real random variables with \(\mathbb {E}X_{12}=0\) and \(\mathbb {E}X_{12}^2=1\), \(X_{ij}=X_{ji}\) for \(1\le i\le j\), and let \(A_n=\left( X_{ij}/\sqrt{n}\right) _{i,j=1}^n\) be a random real symmetric matrix. With probability one, the empirical spectral measure of the matrix \(L_n\) defined in (1.1) converges weakly to the free additive convolution of the semicircle and standard Gaussian measures.

The analogous free probabilistic limit was established in [6] for non-symmetric \(A_n\). While some of the above references study sparse Laplacian matrices, none consider random Laplacian matrices with heavy-tailed entries or sparse Laplacian matrices where the expected number of nonzero entries in a row is uniformly bounded in n.
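To make Theorem 1.1 concrete, the following minimal simulation sketch (ours, not from [10]; the matrix size and histogram parameters are illustrative) samples a Gaussian Wigner matrix, forms \(L_n\) as in (1.1), and computes its spectrum; a histogram of the eigenvalues approximates the free additive convolution of the semicircle and standard Gaussian measures.

```python
# Minimal simulation sketch of Theorem 1.1 (illustrative parameters).
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = rng.standard_normal((n, n))
A = (np.triu(X) + np.triu(X, 1).T) / np.sqrt(n)  # symmetric, entries X_ij / sqrt(n)
L = A - np.diag(A.sum(axis=1))                   # L_n = A_n - D_n as in (1.1)
eigs = np.linalg.eigvalsh(L)
density, edges = np.histogram(eigs, bins=60, density=True)  # approximates the free convolution
```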

Many of the tools and techniques we employ were developed in the study of heavy-tailed real symmetric, or Lévy, matrices by Bordenave, Caputo, and Chafaï in [7]. Lévy matrices were introduced in [14] as heavy-tailed versions of Wigner matrices. For the purposes of this paper, an important distinction between Lévy and Wigner matrices is that the row sums of a Wigner matrix converge in distribution to a Gaussian random variable, while the row sums of a Lévy matrix converge to an \(\alpha \)-stable distribution for \(0<\alpha <2\). The techniques in [7] were extended by Jung in [24] to random matrices whose row sums converge in distribution to an infinitely divisible distribution.

1.1 Lévy Matrices

Lévy matrices are the heavy-tailed analogue of Wigner matrices, where the entries are independent up to symmetry but fail to have a finite second moment.

Definition 1.2

A real symmetric random matrix X is a Lévy matrix if the diagonal entries are zero, the entries above the diagonal are independent and identically distributed (i.i.d.) copies of a real random variable \(\xi \), and there exist \(\theta \in [0,1]\) and \(\alpha \in (0,2)\) such that

  1. (i)

    \(\lim \limits _{t\rightarrow \infty }\frac{\mathbb {P}(\xi \ge t)}{\mathbb {P}(|\xi |\ge t)}=\theta ,\)

  2. (ii)

    \(\mathbb {P}(|\xi |\ge t)=t^{-\alpha }L(t)\) for all \(t\ge 1\), where L is a slowly varying function, i.e. \(L(tx)/L(t)\rightarrow 1\) as \(t\rightarrow \infty \) for any \(x>0\).

The conditions in Definition 1.2 are the same conditions for \(\xi \) to be in the domain of attraction of an \(\alpha \)-stable distribution. Unlike with Wigner matrices, the natural scaling on an \(n\times n\) Lévy matrix X is not \(\sqrt{n}\), but instead

$$\begin{aligned} a_n:=\inf \left\{ x:\mathbb {P}(|X_{12}|>x)\le n^{-1}\right\} . \end{aligned}$$
(1.3)

For an \(n\times n\) Lévy matrix \(X_n\), the matrix \(A_n\) in equation (1.1) will be defined as \(A_n:=a_n^{-1}X_n\). We will refer to \(A_n\) as a normalized Lévy matrix.
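As an illustration (our sketch, under the assumption of pure Pareto tails \(\mathbb {P}(|\xi |>t)=t^{-\alpha }\) for \(t\ge 1\), so that \(L\equiv 1\) and the quantile in (1.3) is exactly \(a_n=n^{1/\alpha }\)), a normalized Lévy matrix can be sampled as follows.

```python
# Sketch: sampling a normalized Levy matrix under a pure Pareto tail
# assumption, where (1.3) gives a_n = n**(1/alpha) exactly.
import numpy as np

rng = np.random.default_rng(1)
n, alpha = 1000, 0.5
a_n = n ** (1 / alpha)                          # inf{x : P(|X_12| > x) <= 1/n}
U = 1.0 - rng.random((n, n))                    # uniform on (0, 1]
X = rng.choice([-1.0, 1.0], size=(n, n)) * U ** (-1 / alpha)  # Pareto tails, theta = 1/2
X = np.triu(X, 1)
X = X + X.T                                     # symmetric with zero diagonal
A = X / a_n                                     # normalized Levy matrix A_n
```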

1.2 Lévy–Khintchine Matrices

Jung in [24] defined a generalization of Lévy matrices. Instead of assuming the entries are in the domain of attraction of an \(\alpha \)-stable distribution, the entries are allowed to be in the domain of attraction of an arbitrary infinitely divisible distribution.

Definition 1.3

A sequence of real symmetric random matrices \(\{A_n\}_{n\ge 1}\) is called a Lévy–Khintchine random matrix ensemble with characteristics \((\sigma ^2,b,m)\) if for each n, \(A_n=(A_{ij}^{(n)})_{i,j=1}^n\) is \(n\times n\), the diagonal entries of \(A_n\) are 0, the off-diagonal entries are i.i.d. up to symmetry, and \(\sum _{j=1}^n A_{1j}^{(n)}\) converges in distribution as \(n\rightarrow \infty \) to a random variable Y with

$$\begin{aligned} \log (\mathbb {E}e^{itY})=-\frac{1}{2}t^2\sigma ^2+itb+\int _\mathbb {R}\left( e^{itx}-1-\frac{itx}{1+x^2}\right) \textrm{d}m(x) \end{aligned}$$
(1.4)

for all \(t\in \mathbb {R}\), where m is a measure on \(\mathbb {R}\) with \(m(\{0\})=0\) satisfying

$$\begin{aligned} \int _\mathbb {R}1\wedge |x|^2\,\textrm{d}m(x)<\infty . \end{aligned}$$
(1.5)

Remark 1.4

It is worth noting that the distribution of \(A_{12}^{(n)}\) may change with n. However, for many important examples \(A_{12}^{(n)}\) is either a rescaling of a fixed random variable or is the product of a fixed random variable and a Bernoulli random variable where only the Bernoulli random variable is changing with n.

A random variable Y satisfying (1.4) is said to have an infinitely divisible distribution with characteristics \((\sigma ^2,b,m)\), and (1.4) is referred to as the Lévy–Khintchine representation of Y. When \(\sigma =0\), Y is called purely non-Gaussian and has an important connection to Poisson point processes with intensity measure m outlined in Propositions 3.6 and 3.7. Section 3 additionally contains background on Poisson point processes.
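When m is finite, the connection can be seen directly by sampling: the sum of the points of a Poisson point process with intensity measure m is a compound Poisson, hence infinitely divisible, random variable. The sketch below (ours; the choice \(m=\lambda \delta _1\) anticipates Remark 2.3 (ii)) produces Poisson(\(\lambda \)) samples in exactly this way.

```python
# Sketch: for finite m, the sum of the points of a PPP with intensity m is
# compound Poisson, hence infinitely divisible.  Here m = lam * delta_1,
# so every point equals 1 and the sum is exactly Poisson(lam).
import numpy as np

rng = np.random.default_rng(2)
lam = 3.0

def sample_sum():
    N = rng.poisson(lam)        # number of points, N ~ Pois(m(R \ {0}))
    return np.sum(np.ones(N))   # all points sit at x = 1

samples = np.array([sample_sum() for _ in range(10_000)])
print(samples.mean(), samples.var())  # both approximately lam
```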

1.3 Notation

Throughout this paper, we use \(\Rightarrow \) to denote weak convergence of probability measures, convergence in distribution of random variables, and vague convergence of finite measures. For an \(n\times n\) real symmetric matrix M, the eigenvalues will always be considered in non-increasing order \(\lambda _1\ge \lambda _2\ge \dots \ge \lambda _n\). We define the empirical spectral measure of an \(n\times n\) real symmetric matrix M to be the probability measure

$$\begin{aligned} \mu _{M}=\frac{1}{n}\sum _{i =1}^n\delta _{\lambda _i}, \end{aligned}$$
(1.6)

where \(\delta _x\) is the Dirac delta measure at x.

A coupling of two probability measures \(\mu _1\) and \(\mu _2\) is a random tuple \((X,Y)\) such that X is \(\mu _1\) distributed and Y is \(\mu _2\) distributed. The symbol \(\overset{d}{=}\) will be used to denote equality in distribution of random variables and \({\mathcal {L}}(X)\) will be used to denote the distribution of a random variable X. For two complex-valued square integrable random variables \(\xi \) and \(\psi \), we define the covariance between \(\xi \) and \(\psi \) as \({{\,\textrm{Cov}\,}}(\xi ,\psi ):=\mathbb {E}[(\xi -\mathbb {E}\xi )\overline{(\psi -\mathbb {E}\psi )}]\).

Throughout we will consider Poisson point processes on \({\bar{\mathbb {R}}}\setminus \{0\}\), the one point compactification of \(\mathbb {R}\) with the origin removed, with some intensity measure m. We will consider both finite and infinite measures, so for convenience we will denote the points of this process by \(\{y_i\}_{i\ge 1}\) for general m where \(y_i=0\) for any i greater than an appropriate (possibly identically infinite) Poisson random variable, and when considering a specific finite measure m, we will denote the points by \(\{y_i\}_{i=1}^N\) for a Poisson random variable N. See Sect. 3 for details on Poisson point processes.

For a topological space E, let \(C_K(E)\) denote the set of real-valued continuous functions on E with compact support. We will use \(\mathbb {C}_+\) to denote the set of complex numbers with strictly positive imaginary part. For a probability measure \(\mu \) on \(\mathbb {R}\), we define the function \(s_\mu :\mathbb {C}_+\rightarrow \mathbb {C}_+\) by

$$\begin{aligned} s_\mu (z)=\int _\mathbb {R}\frac{1}{x-z}\textrm{d}\mu (x), \end{aligned}$$
(1.7)

and refer to \(s_\mu \) as the Stieltjes transform of \(\mu \).

We will use asymptotic notation (\(O, o, \Theta \), etc.) under the assumption that \(n\rightarrow \infty \) unless otherwise stated. \(X=O(Y)\) if \(X\le CY\) for an absolute constant \(C>0\) and all \(n \ge C\), \(X=o(Y)\) if \(X\le C_nY\) for \(C_n\rightarrow 0\), \(X=\Theta (Y)\) if \(cY\le X\le CY\) for absolute constants \(C,c>0\) and all \(n \ge C\), and \(X\sim Y\) if \(X/Y\rightarrow 1\).
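The objects in (1.6) and (1.7) are simple to compute numerically; the helper below is our own illustrative sketch.

```python
# Sketch: the Stieltjes transform (1.7) of the empirical spectral
# measure (1.6) of a real symmetric matrix M, evaluated at z in C_+.
import numpy as np

def stieltjes_empirical(M, z):
    eigs = np.linalg.eigvalsh(M)      # (1.6) puts mass 1/n at each eigenvalue
    return np.mean(1.0 / (eigs - z))

z = 0.5 + 1.0j
print(stieltjes_empirical(np.eye(4), z), 1.0 / (1.0 - z))  # identity: s(z) = 1/(1-z)
```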

2 Main Results

Throughout we will assume \(A_n=(A_{ij}^{(n)})_{i,j=1}^n\) is the n-th element of a Lévy–Khintchine random matrix ensemble with characteristics (0, bm), \(D_n\) is a diagonal matrix with \((D_n)_{ii}=\sum _{j=1}^n A_{ij}^{(n)}\), and

$$\begin{aligned} L_n=A_n-D_n. \end{aligned}$$
(2.1)

Definition 2.1

Let \(\{A_n\}_{n\ge 1}\) be a Lévy–Khintchine random matrix ensemble with characteristics (0, bm) and for each \(n\ge 1\) let \(V_{1n}\ge V_{2n}\ge \dots \ge V_{(n-1)n}\) be the order statistics of \(\{|A_{2n}^{(n)}|,|A_{3n}^{(n)}|,\dots ,|A_{nn}^{(n)}|\}\). \(\{A_n\}_{n\ge 1}\) satisfies Condition C1 if:

  • The Poisson point process with intensity measure m is almost surely summable.

  • \((V_{jn})_{j\ge 1}\) is almost surely uniformly integrable in n, i.e.

    $$\begin{aligned} \lim \limits _{k\rightarrow \infty }\sup _{n> k}\sum _{i=k+1}^{n-1} V_{in}=0, \end{aligned}$$
    (2.2)

    almost surely.

  • There exist \(\varepsilon >0\) and \(C>0\) such that

    $$\begin{aligned} m(\{x\in \mathbb {R}:|x|\ge t\})\le Ct^{-\varepsilon }, \end{aligned}$$
    (2.3)

    and

    $$\begin{aligned} n\mathbb {P}\left( |A_{12}^{(n)}|\ge t\right) \le Ct^{-\varepsilon } \end{aligned}$$
    (2.4)

    for all \(t>1/4\) and for every \(n\in \mathbb {N}\).

Remark 2.2

From Campbell’s formula (Lemma 3.3), almost sure summability of a Poisson point process with intensity measure m is equivalent to

$$\begin{aligned} \int _{\mathbb {R}\setminus \{0\}}|x|\wedge 1 \textrm{d}m(x)<\infty . \end{aligned}$$
(2.5)

Remark 2.3

Some interesting and important examples of random matrices satisfying Condition C1 include:

  1. (i)

    \(A_n=a_n^{-1}X_n\) for a Lévy matrix \(X_n\) with \(\alpha \in (0,1)\) and \(a_n\) as defined in (1.3). In this case, \(m=m_\alpha \) where \(m_\alpha \) is the measure on \(\mathbb {R}\) with density

    $$\begin{aligned} f(x)=\alpha |x|^{-(1+\alpha )}\left( \theta \textbf{1}_{\{x>0\}}+(1-\theta )\textbf{1}_{\{x<0\}} \right) , \end{aligned}$$

    for \(\alpha \) and \(\theta \) as in Definition 1.2.

  2. (ii)

    The adjacency matrix \(A_n\) of an Erdős–Rényi random graph G(np) with \(np\rightarrow \lambda \in (0,\infty )\). In this case, the row sums of \(A_n\) converge to Poisson random variables so that in (1.4) \(m=\lambda \delta _1\) and \(b=\lambda /2\).

  3. (iii)

    The matrix \(A_n=\frac{1}{\sqrt{\lambda }}E_n\circ X_n\) where \(E_n\) is the adjacency matrix of an Erdős–Rényi random graph G(np) with \(np\rightarrow \lambda \in (0,\infty )\), \(X_n\) is chosen from the Gaussian Orthogonal Ensemble (GOE), and \(\circ \) is the Hadamard product of matrices. In this case, \(m=\lambda G_\lambda \) where \(G_\lambda \) is the centered Gaussian probability measure with variance \(\frac{1}{\lambda }\).

The first two points of Condition C1 will be important for handling the diagonal entries of \(L_n\). (2.5) implies a Poisson point process with intensity measure m is almost surely summable, which is stronger than the almost sure square summability implied by (1.5). An essential component of the proofs in Sect. 6 is the convergence of the empirical point process of entries in a row of \(A_n\) to a Poisson point process with intensity measure m; however, convergence as point processes does not necessarily imply convergence of the row sum to the sum of the Poisson point process. The uniform integrability assumption (2.2) will allow us in Sect. 6 to conclude that the row sums converge to the sum of the Poisson point process with intensity measure m. The last point is a technical assumption needed in the proof of the main theorem given below. Heuristically, the last point of Condition C1 states that the infinitely divisible random variable Y in Definition 1.3 has at least \(t^{-\varepsilon }\) tail decay, and that this tail assumption holds entry-wise uniformly in n. The assumption in (2.4) is technical and used to prove tightness of the empirical spectral measures, but it is perhaps not necessary, and there may be room for refinement. The choice of 1/4 in the final condition is arbitrary; any positive constant would be sufficient.
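For concreteness, the following sketch (ours; n and \(\lambda \) are illustrative) builds example (ii) of Remark 2.3, the Laplacian of a very sparse Erdős–Rényi graph, whose empirical spectral measure approximates the limiting measure \(\mu _m\) of Theorem 2.4 below.

```python
# Sketch: the Laplacian of G(n, lam/n) from Remark 2.3 (ii); here
# m = lam * delta_1 and the row sums are approximately Poisson(lam).
import numpy as np

rng = np.random.default_rng(3)
n, lam = 3000, 2.0
upper = np.triu(rng.random((n, n)) < lam / n, 1)  # independent edges, i < j
A = (upper | upper.T).astype(float)               # adjacency matrix, zero diagonal
L = A - np.diag(A.sum(axis=1))                    # L_n = A_n - D_n as in (2.1)
eigs = np.linalg.eigvalsh(L)                      # empirical measure approximates mu_m
```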

Theorem 2.4

(Eigenvalue Convergence for Laplacian Lévy–Khintchine matrices) Let \(\{A_n\}_{n\ge 1}\) be a Lévy–Khintchine random matrix ensemble with characteristics (0, bm) all defined on the same probability space satisfying Condition C1, and for every \(n\in \mathbb {N}\) let \(L_n\) be defined by (2.1). Then, there exists a deterministic probability measure \(\mu _m\) depending only on m such that a.s. \(\mu _{L_n}\) converges weakly to \(\mu _m\), as \(n\rightarrow \infty \).

While the random matrices satisfying Condition C1 may appear very different for different m, a general description of \(\mu _m\) is available through its Stieltjes transform and a recursive distributional equation (RDE). A recursive distributional equation is an equation of the form

$$\begin{aligned} X\overset{d}{=}g((Y_1,X_1),(Y_2,X_2),\dots ) \end{aligned}$$
(2.6)

where \(\{X_n\}_{n=1}^\infty \) are i.i.d. copies of X and \(\{Y_n\}_{n=1}^\infty \) is some sequence of random variables independent from \(\{X_n\}_{n=1}^\infty \). While we do not use existing results from the literature, we did find the survey [1] and the unpublished manuscript [2] helpful for better understanding RDEs and contraction arguments in proving uniqueness of solutions. We encourage the interested reader to begin there for more information on RDEs.

Theorem 2.5

(Recursive Distributional Equation for Stieltjes Transform of \(\mu _m\)) Let \(\mu _m\) be the limiting deterministic measure from Theorem 2.4, and let \(s_m(z)=\int _\mathbb {R}\frac{1}{x-z}\textrm{d}\mu _m(x)\) be the Stieltjes transform of \(\mu _m\). Then for every \(z\in \mathbb {C}_+\), \(s_m(z)=\mathbb {E}s_\varnothing (z)\) where \(s_\varnothing \) is the Stieltjes transform of a random probability measure. Moreover, the distribution of \(s_\varnothing \) is the unique distribution on the space of Stieltjes transforms of probability measures such that

$$\begin{aligned} s_\varnothing (z)\overset{d}{=}-\left( z-\sum _{j=1}^\infty \frac{y_j}{s_j(z)y_j-1} \right) ^{-1}\quad \mathrm{for\, all\,} z\in \mathbb {C}_+, \end{aligned}$$
(2.7)

where \(\{y_j\}_{j\ge 1}\) is a Poisson point process with intensity measure m and \(\{s_j\}\) is a collection of i.i.d. copies of \(s_\varnothing \) independent from the point process.

Theorems 2.4 and 2.5 give that the limiting empirical spectral measure of \(L_n\) is uniquely determined by a Poisson point process with intensity measure m. For the examples outlined in Remark 2.3, we will now give some more explicit descriptions of the corresponding point processes.

  1. (i)

    Let \(E_1,E_2,\dots \) be a sequence of independent exponential random variables with mean 1 and \(\Gamma _k=E_1+\cdots +E_k\). Additionally let \(\varepsilon _1,\varepsilon _2,\dots \) be a sequence of i.i.d. random variables such that

    $$\begin{aligned} \mathbb {P}(\varepsilon _1=1)=\theta =1-\mathbb {P}(\varepsilon _1=-1). \end{aligned}$$

    Then (see [15] Proposition 2) the collection \(\{\varepsilon _k\Gamma _k^{-1/\alpha }\}_{k\ge 1}\) is a Poisson point process with intensity measure \(m_\alpha \), the measure arising for Lévy matrices, example (i) in Remark 2.3.

  2. (ii)

    For the Laplacian of very sparse random graphs, discussed in Remark 2.3 (ii), the Poisson point process is quite simple. Let N be a Poisson random variable with mean \(\lambda \) and for \(k\ge 1\) define \(y_k\) by

    $$\begin{aligned} y_k={\left\{ \begin{array}{ll} 1,\, k\le N,\\ 0,\, k>N \end{array}\right. }. \end{aligned}$$

    Then, \(\{y_k\}_{k\ge 1}\) is a Poisson point process with intensity measure \(\lambda \delta _1\).

  3. (iii)

    For a very sparse GOE matrix described in example (iii) in Remark 2.3, let \(Y_1,Y_2,\dots \) be independent standard real Gaussian random variables, and let N be a Poisson random variable with mean \(\lambda \). Define

    $$\begin{aligned} y_k={\left\{ \begin{array}{ll} \frac{Y_k}{\sqrt{\lambda }},\, k\le N\\ 0,\, k>N \end{array}\right. }. \end{aligned}$$

    Then, \(\{y_k\}_{k\ge 1}\) is a Poisson point process with intensity measure \(\lambda G_\lambda \). This example is explored a bit further in Theorem 2.7.
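Each of these descriptions translates directly into a sampling routine. The sketch below (ours; the truncation level K for the infinite process in (i) is an illustrative choice) generates all three point processes.

```python
# Sketch: samplers for the three Poisson point processes described above.
import numpy as np

rng = np.random.default_rng(4)

def ppp_stable(alpha, theta, K=1000):
    """(i): points eps_k * Gamma_k**(-1/alpha); truncated at K points."""
    gamma = np.cumsum(rng.exponential(1.0, size=K))      # Gamma_k = E_1 + ... + E_k
    eps = rng.choice([1.0, -1.0], size=K, p=[theta, 1.0 - theta])
    return eps * gamma ** (-1.0 / alpha)

def ppp_sparse_graph(lam):
    """(ii): N ~ Pois(lam) points, all equal to 1 (intensity lam * delta_1)."""
    return np.ones(rng.poisson(lam))

def ppp_sparse_goe(lam):
    """(iii): N ~ Pois(lam) i.i.d. N(0, 1/lam) points (intensity lam * G_lam)."""
    return rng.standard_normal(rng.poisson(lam)) / np.sqrt(lam)

print(ppp_stable(0.5, 0.7)[:5], ppp_sparse_graph(2.0), ppp_sparse_goe(2.0))
```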

RDE (2.7) can be written as:

$$\begin{aligned} s_\varnothing (z)\overset{d}{=}-\left( z+\sum _{j=1}^\infty y_j-\sum _{j=1}^\infty \frac{y_j^2s_j(z)}{s_j(z)y_j-1} \right) ^{-1}. \end{aligned}$$
(2.8)

If we consider a diagonal matrix \({\tilde{D}}_n\) independent from \(A_n\) with independent entries \(({\tilde{D}}_{n})_{ii}\overset{d}{=}(D_n)_{ii}\) and the matrix \({\tilde{L}}_n=A_n-{\tilde{D}}_n\), then the work below leading up to the existence of (2.7) could be adapted in a straightforward way to arrive at a corresponding RDE for \({\tilde{L}}_n\). Specifically,

$$\begin{aligned} s_\varnothing (z)\overset{d}{=}-\left( z+\sum _{j=1}^\infty {\tilde{y}}_j-\sum _{j=1}^\infty y_j^2s_j(z) \right) ^{-1}, \end{aligned}$$
(2.9)

where \(\{{\tilde{y}}_j\}\) is an independent copy of the point process \(\{y_j\}\), independent of \(\{s_j\}\). For light-tailed \(A_n\), Theorem 1.1 gives that the limiting spectral measure of \(L_n\) is the free additive convolution of the semicircle measure and the Gaussian measure. This is the same limiting spectral measure as for \(A_n-{\tilde{D}}_n\), where the diagonal \({\tilde{D}}_n\) is independent of \(A_n\). In contrast, the differences between equations (2.8) and (2.9) suggest that for Lévy–Khintchine \(A_n\), the dependence between \(A_n\) and \(D_n\) can be seen in the limiting measure \(\mu _m\).
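Although we treat (2.7) analytically below, RDEs of this form are often approximated numerically by population dynamics: maintain a pool of samples of \(s_\varnothing (z)\) and repeatedly resample the right-hand side of (2.7). The sketch below is such a heuristic (ours, not part of the paper's arguments), run for the measure \(m=\lambda \delta _1\) of Remark 2.3 (ii); the pool size, number of sweeps, and \(\lambda \) are illustrative.

```python
# Population-dynamics sketch for RDE (2.7) with m = lam * delta_1.
# Under the uniqueness in Theorem 2.5, the pool mean estimates s_m(z).
import numpy as np

rng = np.random.default_rng(5)

def solve_rde(z, lam=2.0, pool_size=2000, sweeps=100):
    pool = np.full(pool_size, -1.0 / z)           # complex dtype inferred from z
    for _ in range(sweeps):
        for i in range(pool_size):
            y = np.ones(rng.poisson(lam))         # points of the PPP, all equal to 1
            s = rng.choice(pool, size=y.size)     # i.i.d. copies of s_0(z) from the pool
            pool[i] = -1.0 / (z - np.sum(y / (s * y - 1.0)))
    return pool.mean()                            # estimate of s_m(z) = E s_0(z)

print(solve_rde(0.0 + 0.5j))
```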

2.1 Outline

Section 3 contains a brief introduction to Poisson point processes, and important results on these processes used throughout. In Sects. 4 and 5, we define local convergence for operators on \(\ell ^2(V)\) for a countable set V and use the measure m to build a random operator L. In Sect. 6, we show that \(L_n\) converges locally in distribution to L, and then, in Sect. 7 we upgrade this to convergence of the empirical spectral measures. Finally in Sect. 8, we show that the Stieltjes transform of the limiting empirical spectral measure can be described as the expected value of the unique solution to (2.7). In the appendices, we prove almost sure tightness of the collection \(\{\mu _{L_n}\}_{n\ge 1}\) and list some technical lemmas. We end this section with two corollaries of Theorem 2.5. The first is a continuity result for the map \(m\mapsto \mu _m\). In the second, we use (2.7) to recover the free convolution of a semicircle and a standard Gaussian measure from the limiting empirical measure of very sparse random matrices.

2.2 Corollaries of Theorem 2.5

The first corollary of Theorem 2.5 concerns continuity of the mapping \(m\mapsto \mu _m\) where \(\mu _m\) is the limiting measure of Theorem 2.4. Uniqueness of the solution to the RDE in Theorem 2.5 is crucial to the proof of Corollary 2.6.

Corollary 2.6

Let \({\bar{\mathbb {R}}}\) denote the one point compactification of \(\mathbb {R}\). Let \(\{m_n\}_{n\in \mathbb {N}\cup \{\infty \}}\) be a collection of measures on \(\mathbb {R}\) such that

$$\begin{aligned} \int _{\mathbb {R}} 1\wedge |x|\, \textrm{d}m_n(x)<\infty , \end{aligned}$$

for all \(n\in \mathbb {N}\cup \{\infty \}\). Let \(\mu _{m_1},\mu _{m_2},\dots \) and \(\mu _{m_\infty }\) be the deterministic limiting measures described in Theorem 2.4 for a Lévy–Khintchine random matrix ensemble with characteristics \((0,b,m_1), (0,b,m_2),\dots \) and \((0,b,m_\infty )\), respectively. If for any \(f\in C_K({\bar{\mathbb {R}}}{\setminus }\{0\})\),

$$\begin{aligned} \int _{\mathbb {R}}f\textrm{d}m_n\rightarrow \int _\mathbb {R}f\textrm{d}m_\infty , \end{aligned}$$

and for any \(\varepsilon >0\)

$$\begin{aligned} \lim \limits _{k\rightarrow \infty }\sup _{n\ge 1}\mathbb {P}\left( \sum _{j=k}^\infty |y_j^{(n)}|>\varepsilon \right) = 0, \end{aligned}$$
(2.10)

where for each \(n\in \mathbb {N}\), \(\{y_j^{(n)}\}_{j=1}^\infty \) is a Poisson point process with intensity measure \(m_n\), then \(\mu _{m_n}\) converges weakly to \(\mu _{m_\infty }\) as \(n\rightarrow \infty \).

Proof

Let \(s_n\) be the Stieltjes transform of \(\mu _{m_n}\), and let \(r_n\) be the random Stieltjes transform solving (2.7) corresponding to \(m_n\), so that \(s_n=\mathbb {E}r_n\). By Lemma B.2, in order to prove Corollary 2.6 it is enough to show that pointwise \(\lim _{n\rightarrow \infty }s_n=\mathbb {E}r\), where r solves RDE (2.7) corresponding to \(m_\infty \). Take any subsequence \(\{s_{{n_k}}\}_{n_k}\) of \(\{s_{n}\}_{n}\), with corresponding subsequence \(\{r_{{n_k}}\}_{n_k}\) of \(\{r_{n}\}_{n}\). From Lemma B.3, it follows that \(\{r_{n_k}\}\) is tight in the space of analytic functions on \(\mathbb {C}_+\) with the topology of uniform convergence on compact subsets, and we pass to a further subsequence \(n_k'\) converging to another random analytic function r(z). As \(\{r_{n_k}\}\) is almost surely uniformly bounded on compact subsets, it follows that r is almost surely bounded on compact subsets. For any fixed \(z\in \mathbb {C}_+\), it follows by the dominated convergence theorem that

$$\begin{aligned} \lim _{n_k'\rightarrow \infty }s_{n_k'}(z)=\lim _{n_k'\rightarrow \infty }\mathbb {E}r_{n_k'}(z)=\mathbb {E}r(z). \end{aligned}$$
(2.11)

Corollary 2.6 then follows if r is a random Stieltjes transform solution to RDE (2.7) corresponding to \(m_\infty \).

To this end, for \(n\in \mathbb {N}\cup \{\infty \}\) let \(\Pi _n\) be a Poisson random measure with intensity measure \(m_n\). For a positive function \(f\in C_K({\bar{\mathbb {R}}}{\setminus }\{0\})\), \(1-e^{-f(x)}\) is also a continuous function with compact support. Thus,

$$\begin{aligned} \lim _{n\rightarrow \infty }\exp \left( \int _\mathbb {R}1-e^{-f(x)}\textrm{d}m_n(x) \right) =\exp \left( \int _\mathbb {R}1-e^{-f(x)}\textrm{d}m_\infty (x) \right) . \end{aligned}$$
(2.12)

It follows from Propositions 3.4 and 3.5 that \(\Pi _n\) converges in distribution to \(\Pi _\infty \). For \(n\in \mathbb {N}\) let \(\{y_j^{(n)}\}_{j\ge 1}\) be the points of the process \(\Pi _n\) and \(\{y_j\}_{j\ge 1}\) the points of the process \(\Pi _\infty \). The points may be ordered such that for every \(j\in \mathbb {N}\), \(y_j^{(n)}\) converges in distribution to \(y_j\) (see Section 2 of [15] for more details). In fact, from (2.10) and Lemma 1 of [15], \(\{y_j^{(n)}\}_{j\ge 1}\) converges in distribution to \(\{y_j\}_{j\ge 1}\) in \(\ell ^1(\mathbb {N})\). Using Skorokhod’s representation theorem, we may put \(\{r_{n_k'}\}\), \(\{\Pi _{n_k'}\}\), \(\Pi _\infty \) and r on a single probability space such that all the above convergences in distribution are almost sure, and

$$\begin{aligned} \lim \limits _{n\rightarrow \infty }\sum _{j=1}^\infty |y_j^{(n)}-y_j|=0 \end{aligned}$$
(2.13)

almost surely. For fixed \(z\in \mathbb {C}_+\),

$$\begin{aligned} r(z)&=\lim _{n_k'\rightarrow \infty }r_{n_k'}(z)\nonumber \\&=\lim _{n_k'\rightarrow \infty }-\left( z-\sum _{j=1}^\infty \frac{y_j^{(n_k')}}{r_{n_k'}^{(j)}(z)y_j^{(n_k')}-1} \right) ^{-1}\nonumber \\&=-\left( z-\lim _{n_k'\rightarrow \infty }\sum _{j=1}^\infty \frac{y_j^{(n_k')}}{r_{n_k'}^{(j)}(z)y_j^{(n_k')}-1} \right) ^{-1}\nonumber \\&=-\left( z-\sum _{j=1}^\infty \frac{y_j}{r_j(z)y_j-1} \right) ^{-1}, \end{aligned}$$
(2.14)

for independent copies \(r_1,r_2,\dots \) of r, where the last equality follows from (2.13). Thus, r is an analytic solution to RDE (2.7). From (2.14) and the almost sure boundedness of r on compact subsets of \(\mathbb {C}_+\), it follows that almost surely

$$\begin{aligned} \lim \limits _{t\rightarrow \infty }itr(it)=-1, \end{aligned}$$
(2.15)

and thus r is almost surely the Stieltjes transform of a probability measure. From (2.11) and the uniqueness of the solution to RDE (2.7), it follows that for any \(z\in \mathbb {C}_+\)

$$\begin{aligned} \lim _{n_k'\rightarrow \infty }s_{n_k'}(z)=s_{\infty }(z). \end{aligned}$$
(2.16)

As the subsequence \(n_k\) was arbitrary, it follows that \(s_n\) converges pointwise to \(s_\infty \) and \(\mu _{m_n}\) converges weakly to \(\mu _{m_\infty }\) as \(n\rightarrow \infty \). \(\square \)

Theorem 2.7 considers the \(\lambda \rightarrow \infty \) limit of example (iii) in Remark 2.3. The limiting measure is the same limiting measure found in Theorem 1.1. The works of Jiang [22] and Chatterjee and Hazra [13] established Theorem 1.1 for sparse random matrices where the expected number of nonzero entries in a row tends to infinity with the size of the matrix. Theorem 2.7, when combined with Theorem 2.4 and Remark 2.3 (iii), can then be interpreted as splitting this limit: first \(n\rightarrow \infty \), and then the expected number of nonzero entries in a row tends to infinity.

Theorem 2.7

Let \(G_\lambda \) denote the Gaussian probability measure with mean 0 and variance \(\frac{1}{\lambda }\), and let \(m_\lambda =\lambda G_\lambda \). If \(\mu _{m_\lambda }\) is the deterministic limiting probability measure from Theorem 2.4, then \(\mu _{m_\lambda }\) converges weakly to the free convolution of the semicircle distribution and the standard real Gaussian distribution, as \(\lambda \rightarrow \infty \).

Proof

Denote the free convolution of a standard semicircle measure and standard Gaussian measure by \(\text {SC}\boxplus G_1\). It is known [5] that the Stieltjes transform, \(s_{\text {fc}}\), of \(\text {SC}\boxplus G_1\) can be defined as the unique solution to

$$\begin{aligned} s_{\text {fc}}(z)=\int _\mathbb {R}\frac{1}{x-z-s_{\text {fc}}(z)}\frac{1}{\sqrt{2\pi }}e^{-x^2/2}\textrm{d}x \end{aligned}$$
(2.17)

satisfying \({\text {Im}}(s_{\text {fc}}(z))\ge 0\) and \(s_{\text {fc}}(z)\sim -z^{-1}\) as \(z\rightarrow \infty \). If \(s_\lambda \) is the Stieltjes transform of \(\mu _{m_\lambda }\), then from Theorem 2.5 we know \(s_\lambda (z)=\mathbb {E}r_\lambda (z)\) where \(r_\lambda \) satisfies the RDE

$$\begin{aligned} r_\lambda (z)\overset{d}{=}-\left( z-\sum _{j=1}^N\frac{y_j}{r_j(z)y_j-1} \right) ^{-1}, \end{aligned}$$
(2.18)

where \(N\sim \text {Pois}(\lambda )\), \(\{y_j\}_{j=1}^\infty \) are i.i.d. Gaussian random variables with mean zero and variance \(\frac{1}{\lambda }\), and \(\{r_j\}_{j=1}^\infty \) are i.i.d. copies of \(r_\lambda \), independent of the collection \(\{y_j\}_{j=1}^\infty \). We will instead use the equivalent recursive distributional equation

$$\begin{aligned} r_\lambda (z)\overset{d}{=}-\left( z+\frac{1}{\sqrt{\lambda }}\sum _{j=1}^Ny_j +\frac{1}{\lambda }\sum _{j=1}^N\frac{r_j(z)y_j^2}{1-r_j(z)y_j/\sqrt{\lambda }} \right) ^{-1}, \end{aligned}$$
(2.19)

where \(\{y_j\}_{j=1}^\infty \) are i.i.d. standard real Gaussian random variables. Fix \(z\in \mathbb {C}_+\). We first consider the sum \(S_\lambda =\frac{1}{\sqrt{\lambda }}\sum _{j=1}^Ny_j\). For \(t\in \mathbb {R}\), define

$$\begin{aligned} \varphi _{S_\lambda }(t)&:=\mathbb {E}\exp (itS_\lambda )\\&=\sum _{k=0}^{\infty }\left( e^{-\frac{t^2}{2\lambda } }\right) ^k\frac{\lambda ^ke^{-\lambda }}{k!}\\&=\exp \left( {-\lambda }\right) \exp \left( {\lambda e^{-\frac{t^2}{2\lambda }}}\right) \\&=\exp \left( -\lambda +\lambda -\frac{t^2}{2}+O\left( \frac{1}{\lambda }\right) \right) , \end{aligned}$$

where here and throughout the proof asymptotic notation is as \(\lambda \rightarrow \infty \). Thus, \(S_\lambda \) converges to a standard real Gaussian random variable as \(\lambda \rightarrow \infty \).

We will compare the sum \(\frac{1}{\lambda }\sum _{j=1}^N\frac{r_j(z)y_j^2}{1-r_j(z)y_j/\sqrt{\lambda }}\) to increasingly simpler sums. The first comparison is to the sum \(\frac{1}{\lambda }\sum _{j=1}^Nr_{j}(z)y_j^2\). Notice that

$$\begin{aligned} \left| \frac{1}{\lambda }\sum _{j=1}^N\frac{r_j(z)y_j^2}{1-r_j(z)y_j/\sqrt{\lambda }}-\frac{1}{\lambda }\sum _{j=1}^Nr_{j}(z)y_j^2 \right|&=\left| \frac{1}{\lambda ^{3/2}}\sum _{j=1}^N\frac{r_j(z)^2y_j^3}{1-r_j(z)y_j/\sqrt{\lambda }} \right| \\&\le \frac{1}{{\text {Im}}(z)^2\lambda ^{3/2}}\sum _{j=1}^N\frac{|y_j|^3}{|1-r_j(z)y_j/\sqrt{\lambda }|}\\&\le \frac{2}{{\text {Im}}(z)^2\lambda ^{3/2}}\sum _{j=1}^N|y_j|^3\\&\quad +\frac{1}{{\text {Im}}(z)^2\lambda ^{3/2}}\sum _{j=1}^N\frac{|y_j|^3}{|1-r_j(z)y_j/\sqrt{\lambda }|}\textbf{1}_{{A_{j,\lambda }}}, \end{aligned}$$

where \(\textbf{1}_{{A_{j,\lambda }}}\) is the indicator of the event \(A_{j,\lambda }=\{|y_j|\ge \sqrt{\lambda }{\text {Im}}(z)/2\}\). We will now show both pieces of this bound converge in probability to zero. From Lemma 3.3,

$$\begin{aligned} \lim \limits _{\lambda \rightarrow \infty }\mathbb {E}\frac{1}{\lambda ^{3/2}}\sum _{j=1}^N|y_j|^3 =\lim \limits _{\lambda \rightarrow \infty }\frac{\mathbb {E}|y_1|^3}{\sqrt{\lambda }}=0. \end{aligned}$$

From standard tail estimates of Gaussian random variables, we have that

$$\begin{aligned} \lim \limits _{\lambda \rightarrow \infty }\mathbb {P}\left( \sum _{j=1}^N\textbf{1}_{{A_{j,\lambda }}}\ne 0 \right)&=\lim \limits _{\lambda \rightarrow \infty }\sum _{k=0}^\infty \mathbb {P}\left( \sum _{j=1}^k\textbf{1}_{{A_{j,\lambda }}}\ne 0 \right) e^{-\lambda }\frac{\lambda ^k}{k!}\\&\le \lim \limits _{\lambda \rightarrow \infty }\sum _{k=0}^\infty k\mathbb {P}\left( |y_1|\ge \sqrt{\lambda }{\text {Im}}(z)/2\right) e^{-\lambda }\frac{\lambda ^k}{k!}\\&\le \lim \limits _{\lambda \rightarrow \infty }\sum _{k=0}^\infty k\frac{2}{\sqrt{2\pi \lambda }{\text {Im}}(z)}e^{-\lambda {\text {Im}}(z)^2/8} e^{-\lambda }\frac{\lambda ^k}{k!}\\&\le \lim \limits _{\lambda \rightarrow \infty }Ce^{-c\lambda }\sqrt{\lambda }, \end{aligned}$$

for some positive constants \(C,c>0\) independent of \(\lambda \). Thus \(\frac{1}{\lambda }\sum _{j=1}^N\frac{r_j(z)y_j^2}{1-r_j(z)y_j/\sqrt{\lambda }}-\frac{1}{\lambda }\sum _{j=1}^Nr_{j}(z)y_j^2\Rightarrow 0\) as \(\lambda \rightarrow \infty \). Next we compare to the sum \(\frac{1}{\lambda }\sum _{j=1}^N(\mathbb {E}r_{\lambda }(z))y_j^2\). To this end, let \(Z_j=r_j(z)y_j^2-(\mathbb {E}r_\lambda (z))y_j^2\), and consider the Taylor expansion of the characteristic function of the real part of \(\frac{1}{\lambda }\sum _{j=1}^NZ_j\)

$$\begin{aligned} \lim \limits _{\lambda \rightarrow \infty }\mathbb {E}\exp \left( it\frac{1}{\lambda }\sum _{j=1}^N{\text {Re}}(Z_j)\right)&=\lim \limits _{\lambda \rightarrow \infty }\sum _{k=0}^\infty \left[ \mathbb {E}\exp \left( it\frac{1}{\lambda }{\text {Re}}(Z_1)\right) \right] ^ke^{-\lambda }\frac{\lambda ^k}{k!}\\&=\lim \limits _{\lambda \rightarrow \infty }\exp \left( -\lambda +\lambda +it\mathbb {E}{\text {Re}}(Z_1)+O(1/\lambda ) \right) \\&=1. \end{aligned}$$

Here we used that \(\mathbb {E}Z_1=0\), which holds since \(r_1\) and \(y_1\) are independent, \(r_1\overset{d}{=}r_\lambda \), and \(\mathbb {E}y_1^2=1\). An identical argument applies to the imaginary part, and we see that \(\frac{1}{\lambda }\sum _{j=1}^NZ_j\) converges in probability to zero. It is also straightforward to show \(\frac{1}{\lambda }\sum _{j=1}^Ny_j^2\Rightarrow 1\), and thus \(\frac{1}{\lambda }\sum _{j=1}^N(\mathbb {E}r_{\lambda }(z))y_j^2-\mathbb {E}r_{\lambda }(z)\) converges in probability to zero. These three comparisons lead to

$$\begin{aligned} \frac{1}{\lambda }\sum _{j=1}^N\frac{r_j(z)y_j^2}{1-r_j(z)y_j/\sqrt{\lambda }}-\mathbb {E}r_\lambda (z)&=\frac{1}{\lambda }\sum _{j=1}^N\frac{r_j(z)y_j^2}{1-r_j(z)y_j/\sqrt{\lambda }} -\frac{1}{\lambda }\sum _{j=1}^Nr_{j}(z)y_j^2\\&\quad +\frac{1}{\lambda }\sum _{j=1}^Nr_{j}(z)y_j^2-\frac{1}{\lambda }\sum _{j=1}^N\mathbb {E}(r_{\lambda }(z))y_j^2\\&\quad +\mathbb {E}(r_{\lambda }(z))\left( \frac{1}{\lambda }\sum _{j=1}^Ny_j^2-1\right) , \end{aligned}$$

which converges in distribution to 0. Since this limit is a constant, we may conclude that jointly

$$\begin{aligned} \left( \frac{1}{\sqrt{\lambda }}\sum _{j=1}^Ny_j,\frac{1}{\lambda }\sum _{j=1}^N\frac{r_j(z)y_j^2}{1-r_j(z)y_j/\sqrt{\lambda }}-\mathbb {E}r_\lambda (z)\right) \Rightarrow (Y,0), \end{aligned}$$
(2.20)

where Y is a standard Gaussian random variable.

Let \(\{\lambda _n\}_{n=1}^\infty \) be an arbitrary increasing sequence of positive real numbers going to infinity and let \(\{\lambda _{n_k}\}\) be an arbitrary subsequence. From Lemma B.3, \(\{r_{\lambda _{n_k}}\}_{n_k}\) is tight as a family of random analytic functions on \(\mathbb {C}_+\) with the topology of uniform convergence on compact subsets, and thus, there exists a further subsequence \(\lambda _{n_{k'}}\) such that \( r_{\lambda _{n_{k'}}}(z)\rightarrow {\tilde{r}}(z)\) for some random analytic function \({\tilde{r}}\). Fix \(z\in \mathbb {C}_+\); it follows from the dominated convergence theorem that \(\mathbb {E}r_{\lambda _{n_{k'}}}(z)\rightarrow \mathbb {E}{\tilde{r}}(z)=:r(z)\) for some deterministic limit r(z). As \(z\in \mathbb {C}_+\) was arbitrary, it follows from the above convergence in distribution and the continuous mapping theorem that

$$\begin{aligned} r(z)&=\lim \limits _{n_{k'}\rightarrow \infty }\mathbb {E}r_{\lambda _{n_{k'}}}(z)\\&=-\lim \limits _{n_{k'}\rightarrow \infty }\mathbb {E}\left( z+\frac{1}{\sqrt{\lambda _{n_{k'}}}}\sum _{j=1}^Ny_j+\frac{1}{\lambda _{n_{k'}}}\sum _{j=1}^N\frac{r_{\lambda _{n_{k'}}}(z)y_j^2}{1-r_{\lambda _{n_{k'}}}(z)y_j/\sqrt{\lambda _{n_{k'}}}} \right) ^{-1}\\&=\mathbb {E}\frac{-1}{z+Y+r(z)}\\&=\int _\mathbb {R}\frac{1}{x-z-r(z)}\frac{1}{\sqrt{2\pi }}e^{-x^2/2}\textrm{d}x, \end{aligned}$$

pointwise on \(\mathbb {C}_+\). Thus, \(r(z)=s_{\text {fc}}(z)\) along every one of these further subsequences of \(\{\lambda _{n_{k}}\}\), and \(s_\lambda (z)=\mathbb {E}r_\lambda (z)\rightarrow s_{\text {fc}}(z)\). By Lemma B.2, this pointwise convergence of the Stieltjes transforms implies \(\mu _{m_\lambda }\) converges weakly to \(\text {SC}\boxplus G_1\) as \(\lambda \rightarrow \infty \). \(\square \)

The matrix \(X_n\) in Remark 2.3 (iii) has Gaussian entries, and for convenience, we stated Theorem 2.7 for the corresponding measure \(\lambda G_\lambda \). However, the proof can be adapted in a straightforward way to the analogous measures corresponding to \(X_n\) from Remark 2.3 (iii) having entries with mean zero, variance \(\frac{1}{\lambda }\), and three finite moments.

3 Poisson Point Processes and Infinitely Divisible Distributions

This section contains a brief overview of Poisson point processes and their relation to infinitely divisible distributions. It includes results used throughout the paper, whose proofs can be found in [15, 25, 28, 29].

We will assume throughout the section that \(E={\bar{\mathbb {R}}}\setminus \{0\}\) with the relative topology, where \({\bar{\mathbb {R}}}\) is the one point compactification of \(\mathbb {R}\). Many of the results below can be extended to higher-dimensional Euclidean space and other sufficiently nice topological spaces. It is also worth noting that we consider \({\bar{\mathbb {R}}}\setminus \{0\}\) for the purpose of defining the appropriate notion of a compact subset of \(\mathbb {R}\), i.e. one which is closed and bounded away from 0. However, many of the results below could be stated on \(\mathbb {R}\).

Denote by \({\mathcal {M}}(E)\) the set of simple point Radon measures

$$\begin{aligned} \mu =\sum _{x\in S}\delta _x, \end{aligned}$$
(3.1)

where S is a multiset, and \(\mu \) is such that \(\mu \left( (-\infty ,-r)\cup (r,\infty ) \right) <\infty \) for any \(r>0\). Denote by \({\mathcal {H}}(E)\) the set of supports corresponding to measures in \({\mathcal {M}}(E)\), where the support of a simple point measure \(\mu \) is taken to be the multiset S. The elements of \({\mathcal {H}}(E)\) are called configurations.

Intuitively, a point process should be a random collection of points on some space. From this intuition, we want to define a point process to be a random variable taking values in \({\mathcal {H}}(E)\); however, to do so one would need to define an appropriate \(\sigma \)-algebra on \({\mathcal {H}}(E)\). This is done through the space \({\mathcal {M}}(E)\), as spaces of measures can be topologized in a straightforward way using an appropriate class of test functions. From this topology, one can then take the Borel \(\sigma \)-algebra to make \({\mathcal {M}}(E)\) a measurable space in which our random variables can take values. To this end, we say a sequence of measures \(\mu _n\) in \({\mathcal {M}}(E)\) converges vaguely to a measure \(\mu \) if for any continuous compactly supported f on E

$$\begin{aligned} \int _E f\textrm{d}\mu _n\rightarrow \int _E f\textrm{d}\mu . \end{aligned}$$

We refer to the topology on \({\mathcal {M}}(E)\) defined through vague convergence as the vague topology. In fact, \({\mathcal {M}}(E)\) with the vague topology is a Polish space, and thus separable and completely metrizable.

The vague topology on \({\mathcal {M}}(E)\) can be pushed forward to configurations \({\mathcal {H}}(E)\) by considering the bijection, F, from simple point Radon measures to configurations defined by \(F(\mu )=S\) in (3.1). Then, vague convergence \(\mu _n\rightarrow \mu \) corresponds to pointwise convergence of the configurations in some labeling of the elements of the multisets \(F(\mu _n)\), i.e. there exists labeling \(F(\mu _n)=\{x^{(1)}_n,x^{(2)}_n,\dots \}\), where \(x^{(1)}_n,x^{(2)}_n,\dots \) need not be distinct, such that for all \(k\in \mathbb {N}\), \(x_n^{(k)}\rightarrow x^{(k)}\).

With \({\mathcal {M}}(E)\) appropriately topologized, we define a simple point process N as a measurable mapping from a probability space \((\Omega ,{\mathcal {F}},\mathbb {P})\) to \(({\mathcal {M}}(E),{\mathcal {B}}({\mathcal {M}}(E)))\), where \({\mathcal {B}}({\mathcal {M}}(E))\) is the Borel \(\sigma \)-algebra defined by the vague topology.

Definition 3.1

Let m be a Borel measure on E. A point process N is a Poisson point process with intensity measure m if we have the following:

  1. 1.

    For any Borel set A, N(A) is a Poisson random variable with mean m(A).

  2. 2.

    If \(A_1,\dots ,A_k\) are pairwise disjoint Borel sets, then \(N(A_1),\dots ,N(A_k)\) are independent random variables.

Lemma 3.2 is a technical result for Poisson point processes with intensity measure satisfying (2.5). We do not believe that the result is novel; however, we are unable to find a reference in the point process literature. The proof below is a modification of the proof for Lemma A.4 in [7].

Lemma 3.2

Let m be a measure on \(\mathbb {R}\setminus \{0\}\) such that

$$\begin{aligned} \int _{\mathbb {R}}1\wedge |x|\textrm{d}m(x)<\infty , \end{aligned}$$
(3.2)

and let \(\{y_k\}_{k= 1}^N\) be a Poisson point process with intensity measure m where \(N=\infty \) a.s. if \(m(\mathbb {R}\setminus \{0\})=\infty \). If \(\tau _\kappa =\inf \left\{ t\ge 0:\sum _{k =t+1}^N y_{(k)}^2\le \kappa \right\} \) where \(y_{(1)}\ge y_{(2)}\ge \dots \) is a non-increasing ordering of \(\{y_k\}_{k= 1}^N\), then \(\mathbb {E}\tau _\kappa <\infty \) for all \(\kappa >0\) and \(\mathbb {E}\tau _\kappa \rightarrow 0\) as \(\kappa \rightarrow \infty \).

Proof

The fact that \(\tau _\kappa \) is almost surely finite follows from the integrability condition on m. Additionally \(\mathbb {P}(\tau _\kappa =0)=\mathbb {P}\left( \sum _{k =1}^N y_k^2\le \kappa \right) \), which clearly converges to 1 as \(\kappa \rightarrow \infty \); since \(\tau _\kappa \) is non-increasing in \(\kappa \), once we know \(\mathbb {E}\tau _\kappa <\infty \) for every \(\kappa >0\), dominated convergence gives \(\mathbb {E}\tau _\kappa \rightarrow 0\). Thus, it is sufficient to prove \(\mathbb {E}\tau _\kappa <\infty \) for all \(\kappa > 0\). Let \(0<r_t<1\) be some monotonically decreasing function of t such that \(r_t\rightarrow 0\) as \(t\rightarrow \infty \), \(S_t=\sum _{k=1}^N y_k^2\textbf{1}_{\{|y_k|\le r_t\}}\), and \(\Pi =\sum _{k =1}^N\delta _{y_k}\). Define the event \(A_t=\{\Pi ((-\infty ,-r_t)\cup (r_t,\infty ))\ge t\}\). On \(A_t^c\) the collection of points summed over in the definition of \(\tau _\kappa \) is a strict subset of the collection of points summed over in the definition of \(S_t\). Then,

$$\begin{aligned} \{\tau _\kappa>t\}&\subseteq \left( \{\tau _\kappa>t\}\cap A_t^c\right) \cup \left( \{\tau _\kappa >t\}\cap A_t\right) \\&\subseteq \left( \{S_t\ge \kappa \}\cap A_t^c\right) \cup A_t \\&\subseteq \{S_t\ge \kappa \}\cup A_t. \end{aligned}$$

We will now show for appropriate \(r_t\) the probabilities of the events \(\{S_t\ge \kappa \}\cup \{\Pi ((-\infty ,-r_t)\cup (r_t,\infty ))\ge t\}\) are summable in \(t\in \mathbb {N}\).

For a Poisson random variable X with mean \(\lambda \), \(\mathbb {P}(X\ge t)\le \exp (-t\log (\frac{t}{\lambda e}))\). Letting \(E_t=\mathbb {E}\Pi ((-\infty ,-r_t)\cup (r_t,\infty ))=m((-\infty ,-r_t)\cup (r_t,\infty ))<\infty \) we have

$$\begin{aligned} \mathbb {P}\left( \Pi ((-\infty ,-r_t)\cup (r_t,\infty ))\ge t\right) \le \exp \left( -t\log \left( \frac{t}{E_te}\right) \right) . \end{aligned}$$
(3.3)

For the event \(\{S_t\ge \kappa \}\) notice that for every \(\theta >0\),

$$\begin{aligned} \mathbb {P}(S_t\ge \kappa )\le \exp (-\theta \kappa )\mathbb {E}\exp (\theta S_t), \end{aligned}$$

and from Campbell’s formula, Lemma 3.3,

$$\begin{aligned} \mathbb {E}\exp (\theta S_t)&=\exp \left( \int _{-r_t}^{r_t}(e^{\theta x^2}-1)\textrm{d}m(x) \right) \\&\le C\exp \left( \theta \int _{-r_t}^{r_t}x^2\textrm{d}m(x) \right) . \end{aligned}$$

Letting \(\theta =\theta _t =\left( \int _{-r_t}^{r_t}x^2\textrm{d}m(x) \right) ^{-1}\) the above gives us

$$\begin{aligned} \mathbb {P}(\tau _\kappa >t)\le \exp \left( -t\log \left( \frac{t}{E_te}\right) \right) +C'\exp (-\theta _t\kappa ). \end{aligned}$$
(3.4)

Notice the integrability assumption on m implies

$$\begin{aligned} \varepsilon m\left( (-\infty ,-\varepsilon )\cup (\varepsilon ,\infty ) \right) \le C \end{aligned}$$
(3.5)

for all \(\varepsilon \in (0,1)\) and some constant C. From the integrability assumption, we also have that

$$\begin{aligned} \int _{-1}^1 |x|\,\textrm{d}m(x)=I \end{aligned}$$
(3.6)

for some \(0<I<\infty \). From this, we see that

$$\begin{aligned} \int _{-r_t}^{r_t}x^2\,\textrm{d}m(x)\le r_tI, \end{aligned}$$
(3.7)

and \(\theta _t\ge (Ir_t)^{-1}\). Taking \(r_t=\frac{c}{t}\) for some \(c>0\) gives \(\theta _t\ge (cI)^{-1}t\). For an appropriate choice of \(c>0\) (3.5) and the definition of \(E_t\) imply that \(E_t\le t/2e\). Thus, from (3.4) we get

$$\begin{aligned} \mathbb {E}\tau _\kappa =\sum _{t=0}^{\infty }\mathbb {P}(\tau _\kappa > t)<\infty . \end{aligned}$$
(3.8)

This completes the proof. \(\square \)

The next lemma, known as Campbell’s formula, is a fundamental result in the theory of point processes and gives a description of the functions belonging almost surely to \(L^1(\Pi )\) for a Poisson random measure \(\Pi \).

Lemma 3.3

(Campbell’s Formula, Sect. 3.2 in [26]) Let \(\Pi \) be a Poisson point process on E with intensity measure m. Let \(u:E\rightarrow \mathbb {R}\) be a measurable function. Then with probability 1,

$$\begin{aligned} S=\int _Eu(x)\textrm{d}\Pi (x)<\infty \end{aligned}$$
(3.9)

if and only if

$$\begin{aligned} \int _E 1\wedge |u(x)| \, \textrm{d}m(x)<\infty . \end{aligned}$$
(3.10)

If either of the above integrals is finite, then

$$\begin{aligned} \mathbb {E}\exp (\theta S)=\exp \left( \int _E\left( e^{\theta u(x)}-1\right) \textrm{d}m(x) \right) \end{aligned}$$
(3.11)

for any \(\theta \in \mathbb {C}\) for which the integral on the right-hand side is finite. Moreover,

$$\begin{aligned} \mathbb {E}\int _Eu(x)\textrm{d}\Pi (x)=\int _Eu(x)\textrm{d}m(x), \end{aligned}$$
(3.12)

whenever \(u\ge 0\) or \(\int |u(x)|\textrm{d}m(x)<\infty \).
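As a sanity check, the first-moment formula (3.12) is easy to verify by Monte Carlo for a finite intensity measure. In the sketch below (ours) we take m to be three times the uniform probability measure on [1, 2] and \(u(x)=x^2\), so both sides equal \(3\int _1^2x^2\,\textrm{d}x=7\).

```python
# Monte Carlo check of Campbell's formula (3.12) for a finite measure:
# m = 3 * Uniform[1, 2] and u(x) = x**2, so E S = 3 * (7/3) = 7.
import numpy as np

rng = np.random.default_rng(6)
mass, trials = 3.0, 20_000

def sample_S():
    N = rng.poisson(mass)                 # N ~ Pois(m(E))
    x = rng.uniform(1.0, 2.0, size=N)     # points i.i.d. from m / m(E)
    return np.sum(x ** 2)                 # S = integral of u against the process

print(np.mean([sample_S() for _ in range(trials)]), mass * (2.0**3 - 1.0**3) / 3.0)
```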

The next two propositions, which can be found in Chapter 5 of [29], characterize Poisson point processes and convergence of point processes in terms of the Laplace functionals appearing in (3.11).

Proposition 3.4

(Theorem 5.1 in [29]) A point process \(\Pi \) on E is a Poisson point process with intensity measure m if and only if

$$\begin{aligned} \mathbb {E}\exp \left( -\int _Eu(x)\textrm{d}\Pi (x) \right) =\exp \left( \int _E(e^{- u(x)}-1)\ \textrm{d}m(x)\right) , \end{aligned}$$

for every bounded positive measurable function u.

Proposition 3.5

(Theorem 5.2 in [29]) Let \(\Pi _1,\Pi _2,\dots \) and \(\Pi _\infty \) be simple point processes on E. Then \(\Pi _n\Rightarrow \Pi _\infty \) as \(n\rightarrow \infty \) if and only if

$$\begin{aligned} \mathbb {E}\exp \left( -\int _Eu(x)\textrm{d}\Pi _n(x) \right) \rightarrow \mathbb {E}\exp \left( -\int _Eu(x)\textrm{d}\Pi _\infty (x) \right) , \end{aligned}$$

as \(n\rightarrow \infty \) for every bounded positive measurable function u.

The next two propositions outline the connection between Poisson point processes and triangular arrays converging to infinitely divisible distributions. For \(0<h<1\), define

$$\begin{aligned} \sigma ^2_h:=\sigma ^2+\int _{|x|\le h}x^2\textrm{d}m(x),\quad \text { and }\quad b_h:=b-\int _{h<|x| }\frac{x}{1+x^2}\textrm{d}m(x). \end{aligned}$$

Proposition 3.6

(Corollary 15.16 in [25]) Suppose \(\{X_{ni}:1\le i\le n\}_{n\ge 1}\) is a triangular array of random variables such that each row consists of i.i.d. random variables. Then, the sum

$$\begin{aligned} \sum _{i =1}^n X_{ni}, \end{aligned}$$

converges in distribution to an infinitely divisible random variable with characteristics \((\sigma ^2,b,m)\) as \(n\rightarrow \infty \) if and only if for every \(0<h<1\) which is not an atom of m

  • \(n\mathbb {P}(X_{n1}\in \cdot )\Rightarrow m(\cdot )\) on \({\overline{\mathbb {R}}}\setminus \{0\}\),

  • \(n\mathbb {E}\left[ X_{n1}^2\textbf{1}_{\{|X_{n1}|\le h\}}\right] \rightarrow \sigma _h^2\), and

  • \(n\mathbb {E}\left[ X_{n1}\textbf{1}_{\{|X_{n1}|\le h\}} \right] \rightarrow b_h\),

as \(n\rightarrow \infty \).

Proposition 3.7

(Theorem 5.3 in [29]) Suppose \(\{X_{ni}:1\le i\le n\}_{n\ge 1}\) is a triangular array of random variables on \({\overline{\mathbb {R}}}\setminus \{0\}\) such that each row consists of i.i.d. random variables. Let N be a Poisson point process with intensity measure m. Then,

$$\begin{aligned} \sum _{i =1}^n\delta _{X_{ni}}\Rightarrow N \end{aligned}$$

as \(n\rightarrow \infty \) if and only if

$$\begin{aligned} n\mathbb {P}(X_{n1}\in \cdot )\Rightarrow m(\cdot ) \end{aligned}$$

as \(n\rightarrow \infty \).
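In the sparse-graph setting of Remark 2.3 (ii), Proposition 3.7 is easy to observe numerically: take \(X_{ni}=1\) with probability \(\lambda /n\) and 0 otherwise (zeros contribute no point on \({\overline{\mathbb {R}}}\setminus \{0\}\)); then \(n\mathbb {P}(X_{n1}=1)\rightarrow \lambda \), and the number of points the row process places at 1 converges to a Poisson(\(\lambda \)) random variable. A small sketch (ours, illustrative parameters):

```python
# Sketch: the row point processes of G(n, lam/n) converging to a PPP with
# intensity lam * delta_1; the point count at 1 is Binomial(n, lam/n),
# which approaches Poisson(lam).
import numpy as np

rng = np.random.default_rng(7)
n, lam, trials = 5000, 2.0, 2000
counts = (rng.random((trials, n)) < lam / n).sum(axis=1)
print(counts.mean(), counts.var())   # both approach lam, as for Poisson(lam)
```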

4 Operators on \(\ell ^2(V)\)

Let V be a countable set and let \(\ell ^2(V)\) denote the Hilbert space defined by the inner product

$$\begin{aligned} \langle \phi ,\psi \rangle :=\sum _{u\in V}{\bar{\phi }}_u\psi _u,\quad \phi _u=\langle \delta _u,\phi \rangle , \end{aligned}$$

where \(\delta _u\) is the unit vector supported on \(u\in V\). Let \({\mathcal {D}}(V)\) denote the dense subset of \(\ell ^2(V)\) of vectors with finite support. Let \((w_{uv})_{u,v\in V}\) be a collection of real numbers with \(w_{uv}=w_{vu}\) such that for all \(u\in V\),

$$\begin{aligned} \sum _{v\in V}|w_{uv}|^2<\infty . \end{aligned}$$

We then define a symmetric linear operator A with domain \({\mathcal {D}}(V)\) by

$$\begin{aligned} \langle \delta _u,A\delta _v\rangle =\langle \delta _v,A\delta _u\rangle =w_{uv}. \end{aligned}$$
(4.1)

Definition 4.1

(Local Convergence) Suppose \((A_n)\) is a sequence of bounded operators on \(\ell ^2(V)\) and A is a linear operator on \(\ell ^2(V)\) with domain \({\mathcal {D}}(A)\supset {\mathcal {D}}(V)\). For any \(u,v\in V\), we say that \((A_n,u)\) converges locally to (Av), and write

$$\begin{aligned} (A_n,u)\rightarrow (A,v), \end{aligned}$$

if there exists a sequence of bijections \(\sigma _n:V\rightarrow V\) such that \(\sigma _n(v)=u\) and, for all \(\phi \in {\mathcal {D}}(V)\),

$$\begin{aligned} \sigma _n^{-1}A_n\sigma _n\phi \rightarrow A\phi , \end{aligned}$$

in \(\ell ^2(V)\), as \(n\rightarrow \infty \).

Here we use \(\sigma _n\) for the bijection on V and the corresponding linear isometry defined in the obvious way. This notion of convergence is useful for random matrices for two reasons. First, we will make a choice on how to define the action of an \(n\times n\) matrix on \(\ell ^2(V)\), and the bijections \(\sigma _n\) help ensure the choice of location for the support of the matrix does not matter. Second, local convergence also gives convergence of the resolvent operator at the distinguished points \(u,v\in V\). This comes down to the fact that local convergence is strong operator convergence, up to the isometries. See [8] for details.

Theorem 4.2

(Theorem 2.2 in [7]) If \((A_n)_{n=1}^\infty \) and A are self-adjoint operators such that \((A_n,u)\) converges locally to (Av) for some \(u,v\in V\), then, for all \(z\in \mathbb {C}_+\),

$$\begin{aligned} \langle \delta _u,(A_n-z)^{-1}\delta _u\rangle \rightarrow \langle \delta _v,(A-z)^{-1}\delta _v\rangle \end{aligned}$$
(4.2)

as \(n\rightarrow \infty \).

To apply this to random operators, we say that \((A_n,u)\rightarrow (A,v)\) in distribution if there exists a sequence of random bijections \(\sigma _n\) such that \(\sigma _n^{-1}A_n\sigma _n\phi \rightarrow A\phi \) in distribution for every \(\phi \in {\mathcal {D}}(V)\).

5 Poisson Weighted Infinite Tree

Let \(\rho \) be a positive Radon measure on \(\mathbb {R}\setminus \{0\}\). \(\textbf{PWIT}(\rho )\) is the random infinite weighted rooted tree defined as follows. The vertex set of the tree is identified with \(\mathbb {N}^f:=\bigcup _{k\in \mathbb {N}\cup \{0\}}\mathbb {N}^k\) by indexing the root as \(\mathbb {N}^0=\varnothing \), the offspring of the root as \(\mathbb {N}\) and, more generally, the offspring of some \(v\in \mathbb {N}^k\) as \((v1),(v2),\cdots \in \mathbb {N}^{k+1}\). Define T as the tree on \(\mathbb {N}^f\) with edges between parents and offspring. Let \(\{\Xi _v\}_{v\in \mathbb {N}^f}\) be independent realizations of a Poisson point process with intensity measure \(\rho \). Let \(\Xi _\varnothing =\{y_1,y_2,\dots \}\) be ordered such that \(|y_1|\ge |y_2|\ge \cdots \) with the convention \(y_i=0\) for all i large enough if \(\rho (\mathbb {R}\setminus \{0\})<\infty \), and assign the weight \(y_i\) to the edge between \(\varnothing \) and i, assuming such an ordering is possible. More generally assign the weight \(y_{vi}\) to the edge between v and vi where \(\Xi _v=\{y_{v1},y_{v2},\dots \}\) and \(|y_{v1}|\ge |y_{v2}|\ge \cdots ,\) again with the convention \(y_{vi}=0\) for all i larger than \(\Xi _v(\mathbb {R}\setminus \{0\})\) if \(\rho (\mathbb {R}\setminus \{0\})<\infty \).

For a measure m on \(\mathbb {R}\setminus \{0\}\) satisfying (1.5) and a realization of \(\textbf{PWIT}(m)\) define the linear operator A on \({\mathcal {D}}(\mathbb {N}^f)\) by the formulas

$$\begin{aligned} \langle \delta _v,A\delta _{vk}\rangle =\langle \delta _{vk},A\delta _v\rangle =y_{vk} \end{aligned}$$
(5.1)

and \(\langle \delta _v,A\delta _u\rangle =0\) otherwise. From (1.5) one can see that the points in \(\Xi _v\) are almost surely square summable for every \(v\in \mathbb {N}^f\), and thus A is a well-defined linear operator on \({\mathcal {D}}(\mathbb {N}^f)\), though possibly unbounded on \(\ell ^2(\mathbb {N}^f)\).

5.1 Poisson Weighted Infinite Tree with Loops

The Poisson weighted infinite tree has been utilized in [7, 8, 9, 11, 24] to study the empirical spectral distribution of heavy-tailed random matrices by showing the random matrices converge to the operator defined by (5.1) for an appropriate measure m. One key feature of those matrices is that the diagonal elements are negligible when compared to the largest entries in a row or column. This will not be the case for the Laplacian matrix \(L_n\); thus we will need to define an operator on a slightly modified graph.

Let m be a measure on \(\mathbb {R}\setminus \{0\}\) such that

$$\begin{aligned} \int _{\mathbb {R}\setminus \{0\}}|x|\wedge 1 \textrm{d}m(x)<\infty . \end{aligned}$$
(5.2)

Define the Poisson weighted infinite tree with loops \(\textbf{PWITL}(m)\) as the random weighted graph with vertex set \(\mathbb {N}^f\) and edge set \(E\cup \bigcup _{v\in \mathbb {N}^f}\{v,v\}\) where E is the edge set of \(\textbf{PWIT}(m)\). The weights on edges in E of \(\textbf{PWITL}(m)\) are the weights on edges in E of \(\textbf{PWIT}(m)\), while the weight on a loop \(\{v,v\}\) is

$$\begin{aligned} y_{vv}:=-y_{ul}-\sum _{k=1}^\infty y_{vk}, \end{aligned}$$
(5.3)

where \(ul=v\) if v is not \(\varnothing \) and the weight on \(\{\varnothing ,\varnothing \}\) is

$$\begin{aligned} y_{\varnothing \varnothing }:=-\sum _{k =1}^\infty y_k. \end{aligned}$$
(5.4)

(5.2) is enough to guarantee \(y_{vv}\) is a well-defined random variable; see Lemma 3.3. Define the operator L by

$$\begin{aligned} \langle \delta _v,L\delta _{vk}\rangle =\langle \delta _{vk},L\delta _v\rangle =y_{vk}\quad \text { and }\quad \langle \delta _v,L\delta _v\rangle =y_{vv}, \end{aligned}$$
(5.5)

and \(\langle \delta _v,L\delta _u\rangle =0\) otherwise. In this case, we say L is the operator associated with \(\textbf{PWITL}(m)\).

We will show the sequence \(\{(L_n,1)\}_{n\ge 1}\) converges locally in distribution to \((L,\varnothing )\) where L is the linear operator on \(\ell ^2(\mathbb {N}^f)\) associated to the \(\textbf{PWITL}(m)\).
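The following finite truncation (our sketch, not a construction used in the proofs) may help in visualizing L: keep only the B largest-modulus points of each \(\Xi _v\) down to depth H, and form the loop weights (5.3) and (5.4) from the kept points only, which neglects a small tail when m integrates \(|x|\wedge 1\). Here m is the finite measure \(\lambda G_\lambda \) of Remark 2.3 (iii); B, H, and \(\lambda \) are illustrative.

```python
# Sketch: a depth-H, branching-B truncation of the PWITL(m) operator (5.5)
# for m = lam * G_lam, assembled as a finite symmetric matrix.
import numpy as np

rng = np.random.default_rng(8)
B, H, lam = 3, 2, 2.0

def top_points():
    """B largest-modulus points of a PPP with intensity lam * G_lam."""
    pts = rng.standard_normal(rng.poisson(lam)) / np.sqrt(lam)
    return pts[np.argsort(-np.abs(pts))][:B]

verts, children = [()], {}
for depth in range(H):
    for v in [u for u in verts if len(u) == depth]:
        pts = top_points()
        kids = [v + (k,) for k in range(1, len(pts) + 1)]
        children[v] = list(zip(kids, pts))
        verts.extend(kids)

idx = {v: i for i, v in enumerate(verts)}
L = np.zeros((len(verts), len(verts)))
for v, kid_list in children.items():
    for u, y in kid_list:
        L[idx[v], idx[u]] = L[idx[u], idx[v]] = y  # edge weights as in (5.1)
for v in verts:
    L[idx[v], idx[v]] = -L[idx[v]].sum()           # loop weights (5.3)-(5.4), truncated
```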

5.2 Self-Adjointness

In this section, we review and apply a criterion established by Bordenave, Caputo, and Chafaï in [7] for unbounded operators to be essentially self-adjoint. There are two minor issues which prevent immediately applying their results to the operator L associated with \(\textbf{PWITL}(m)\). The first is that they consider operators whose skeletons are trees, not trees with loops; this is easy to overcome. The second is that, in applying the criterion, they consider only point processes associated with \(\alpha \)-stable distributions and not more general infinitely divisible distributions; this is overcome by Lemma 3.2.

Proposition 5.1

(Lemma A.3 in [7]) Let A be a linear operator on \(\ell ^2(\mathbb {N}^f)\) defined by (4.1). We say \(u\sim v\) if \(u=v, u=vk,\) or \(v=uk\) for some \(k\in \mathbb {N}\). Assume \(w_{uv}=0\) if \(u\not \sim v\). Suppose there exists a constant \(\kappa >0\) and sequence of finite connected subsets \(S_n\subset \mathbb {N}^f\), such that \(S_n\subset S_{n+1}\), \(\mathbb {N}^f=\bigcup _{n\in \mathbb {N}} S_n\), and for every n and \(v\in S_n\),

$$\begin{aligned} \sum _{u\notin S_n:u\sim v}|w_{uv}|^2\le \kappa . \end{aligned}$$
(5.6)

Then A is essentially self-adjoint.

Proof

Proposition 5.1 is not stated identically to Lemma A.3 in [7]; however, the only added assumption is that vertices are connected to themselves, so that the graph of the skeleton of A is a tree with loops rather than a tree. The step in the proof given in [7] which uses the tree structure is the fact that if \(v\in S_n\), \(u\notin S_n\), \(u\sim v\), and \(v'\in S_n{\setminus }\{v\}\), then \(u\not \sim v'\), which is also true for a tree with loops. \(\square \)

Proposition 5.2

(Proposition A.2 in [7]) Let m be a measure on \(\mathbb {R}\setminus \{0\}\) satisfying (1.5). Let \(\{\Xi _v \}_{v\in \mathbb {N}^f}\) be a collection of Poisson point processes on \(\mathbb {R}\setminus \{0\}\) with intensity measure m. Let \(\Xi _\varnothing =\{y_1,y_2,\dots \}\) be ordered such that \(|y_1|\ge |y_2|\ge \cdots \), and \(\Xi _v=\{y_{v1},y_{v2},\dots \}\) be ordered such that \(|y_{v1}|\ge |y_{v2}|\ge \cdots \), with the convention that the \(y_{vk}\) or \(y_k\) are eventually zero if \(m(\mathbb {R}\setminus \{0\})<\infty \). Additionally, let \(\{y_{vv}\}_{v\in \mathbb {N}^f}\) be a collection of real random variables. Define the symmetric linear operator A on \(\ell ^2(\mathbb {N}^f)\) by

$$\begin{aligned} \langle \delta _v,A\delta _{vk}\rangle =\langle \delta _{vk},A\delta _v\rangle =y_{vk},\quad \text { and }\quad \langle \delta _v,A\delta _v\rangle =y_{vv}, \end{aligned}$$

and \(\langle \delta _u,A\delta _v\rangle =0\) otherwise. Then, with probability 1, A is essentially self-adjoint.

Proof

Proposition 5.2 may not initially appear to be Proposition A.2 in [7]; however, the proofs are essentially identical. The only minor difference is replacing Lemma A.4 in [7] with Lemma 3.2. \(\square \)

6 Local Convergence for the Laplacian of Lévy–Khintchine Matrices

For an \(n\times n\) matrix M, extend M to a bounded operator on \(\ell ^2(\mathbb {N}^f)\) as follows. For \(1\le i,j\le n\), let \(\langle \delta _i,M\delta _j\rangle =M_{ij}\), and \(\langle \delta _u,M\delta _v\rangle =0\) otherwise.

Theorem 6.1

Let \(L_n\) be the matrix defined by (2.1), where \(\{A_n \}_{n\ge 1}\) is a Lévy–Khintchine random matrix ensemble satisfying C1. Let L be the linear operator on \(\ell ^2(\mathbb {N}^f)\) associated with \(\textbf{PWITL}(m)\). Then, in distribution, \((L_n,1)\rightarrow (L,\varnothing )\) as \(n\rightarrow \infty \).

The rest of this section is devoted to the proof of Theorem 6.1. Before considering \((L_n,1)\), we begin by showing \((A_n,1)\) converges to \((L+D,\varnothing )\) where D is a diagonal operator. This follows from the work of Jung in [24]. We include the proof to establish notation and for the convenience of the reader. We define a network as a graph with edge weights taking values in some normed space. To begin, let \(G_n\) be the complete network, without loops, on \(\{1,\dots ,n\}\) whose weight on edge \(\{i,j\}\) equals \(\xi ^n_{ij}\) for some collection \((\xi ^n_{ij})_{1\le i< j\le n}\) of random variables taking values in some normed space. Now consider the rooted network \((G_n,1)\) with the distinguished vertex 1. For any realization \((\xi _{ij}^n)\), and for any \(B,H\in \mathbb {N}\) such that \((B^{H+1}-1)/(B-1)\le n\), we will define a finite rooted subnetwork \((G_n,1)^{B,H}\) of \((G_n,1)\) whose vertex set coincides with a B-ary tree of depth H. To this end, we partially index the vertices of \((G_n,1)\) as elements in

$$\begin{aligned} J_{B,H}:=\bigcup _{l=0}^H\{1,\dots ,B\}^l\subset \mathbb {N}^f, \end{aligned}$$

the indexing being given by an injective map \(\sigma _n\) from \(J_{B,H}\) to \(V_n:=\{1,\dots ,n\}\). We set \(I_\varnothing :=\{1\}\) and give the root the index \(\sigma _n^{-1}(1)=\varnothing \). The vertex \(v\in V_n\setminus I_\varnothing \) is given the index \((k)=\sigma _n^{-1}(v)\), \(1\le k\le B\), if \(\xi ^n_{1,v}\) has the k-th largest norm among \(\{\xi _{1j}^n, j\ne 1 \}\), ties being broken by lexicographic order. This defines the first generation, and we let \(I_1\) be the union of \(I_\varnothing \) and this generation. If \(H\ge 2\), repeat this process for the vertex labeled (1) on \(V_n{\setminus } I_1\), ordering \(\{\xi _{(1)j}^n\}_{j\in V_n{\setminus } I_1}\) to get \(\{11,12,\dots ,1B\}\). Define \(I_2\) to be the union of \(I_1\) and this new collection. Repeat again for \((2),(3),\dots ,(B)\) to get the second generation, and so on. Call this vertex set \(V_n^{B,H}=\sigma _n (J_{B,H})\).
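The indexing procedure above is entirely algorithmic, and the following sketch implements it for a concrete weight matrix; the function name and the breadth-first bookkeeping are ours, and ties are simply left to the sort order.

import numpy as np

def index_tree(X, B, H, root=0):
    # Greedy indexing of (G_n, root)^{B,H}: the children of each indexed
    # vertex are its B largest-|weight| neighbours among the vertices not
    # indexed at an earlier step, processed generation by generation.
    n = X.shape[0]
    sigma = {(): root}      # v in J_{B,H}, stored as a tuple  ->  sigma_n(v)
    used = {root}
    frontier = [()]
    for _ in range(H):
        next_frontier = []
        for v in frontier:
            i = sigma[v]
            candidates = sorted((j for j in range(n) if j not in used),
                                key=lambda j: -abs(X[i, j]))
            for k, j in enumerate(candidates[:B], start=1):
                sigma[v + (k,)] = j
                used.add(j)
                next_frontier.append(v + (k,))
        frontier = next_frontier
    return sigma

rng = np.random.default_rng(seed=3)
n = 200
X = rng.standard_cauchy((n, n))   # heavy-tailed stand-in for the weights
X = np.triu(X, 1)
X = X + X.T                       # symmetric with zero diagonal
print(index_tree(X, B=2, H=2))    # the map sigma_n on J_{2,2}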

For a realization T of \(\textbf{PWITL}(m)\), recall we assign the weight \(y_{vk}\) to the edge \(\{v,vk\}\) and the weight \(y_{vv}\) to the loop \(\{v,v\}\). Then \((T,\varnothing )\) is a rooted network. Call \((T,\varnothing )^{B,H}\) the finite rooted subnetwork obtained by restricting \((T,\varnothing )\) to the vertex set \(J_{B,H}\) and the edge set without the loops. If an edge is not present in \((T,\varnothing )^{B,H}\), assign it the weight 0. We say a sequence \((G_n,1)^{B,H}\), for fixed B and H, converges in distribution, as \(n\rightarrow \infty \), to \((T,\varnothing )^{B,H}\) if the joint distribution of the weights converges weakly.

Let \(\xi _{ij}^n=L_{ij}=A_{ij}^{(n)}\), where \(L_{ij}\) is the ij-th entry of \(L_n\), for \(1\le i<j\le n\). We aim to show that, with this choice of weights \((\xi ^n_{ij})_{1\le i< j\le n}\), for fixed B and H, \((G_n,1)^{B,H}\) converges weakly to \((T,\varnothing )^{B,H}\).

Order the elements of \(J_{B,H}\) generation by generation, and lexicographically within each generation, i.e. \(\varnothing \prec 1\prec 2\prec \cdots \prec B\prec 11\prec 12\prec \cdots \prec B\cdots B\). For \(v\in J_{B,H}\), let \({\mathcal {O}}_v\) denote the offspring of v in \((G_n,1)^{B,H}\). By construction, \(I_\varnothing =\{1\}\) and \(I_v=\sigma _n\left( \bigcup _{w\prec v}{\mathcal {O}}_w \right) \), where the union is over w strictly preceding v. Thus at every step of the indexing procedure we order the weights of neighboring edges not already considered at a previous step, and hence for all v,

$$\begin{aligned} (\xi _{\sigma _n(v),j }^n)_{j\notin I_v}\overset{d}{=}(\xi _{1j}^n)_{1< j\le n-|I_v|}. \end{aligned}$$

Note that, by independence, Proposition 3.7 still holds if one takes the sum of Dirac measures at the random variables indexed by \(\{1,\dots ,n \}\setminus I\) for any fixed finite set I. Thus, by Proposition 3.7, the weights from a fixed parent to its offspring in \((G_n,1)^{B,H}\) converge weakly to those of \((T,\varnothing )^{B,H}\), and by independence this extends to joint convergence. Recall \((G_n,1)^{B,H}\) is a complete graph and not a tree with loops; thus, it remains to show the edges in \((G_n,1)^{B,H}\) which were not considered in the sorting procedure converge to 0. This was shown for heavy-tailed weights in [7] and for more general Lévy–Khintchine weights in [24].
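The point process convergence invoked here is easy to observe numerically. Under the hypothetical assumption that the entries of a row are Pareto with tail index \(\alpha \) and the row is scaled by \(n^{-1/\alpha }\), the largest entry matches, in distribution, the largest point \(\Gamma _1^{-1/\alpha }\) of the limiting Poisson process; the sketch below compares medians (means are infinite for \(\alpha =1/2\)).

import numpy as np

rng = np.random.default_rng(seed=4)
alpha, n, trials = 0.5, 100_000, 500

top_row = np.empty(trials)
top_pwit = np.empty(trials)
for t in range(trials):
    row = rng.pareto(alpha, size=n)        # tail P(entry > x) ~ x^{-alpha}
    top_row[t] = row.max() * n ** (-1.0 / alpha)
    top_pwit[t] = rng.exponential() ** (-1.0 / alpha)   # Gamma_1^{-1/alpha}

# Both medians approximate (log 2)^{-1/alpha}, about 2.08 for alpha = 1/2.
print(np.median(top_row), np.median(top_pwit))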

Let L be the operator associated to \(\textbf{PWITL}(m)\). For fixed B and H, let \(\sigma _n^{B,H}\) be the map \(\sigma _n\) above associated with \((G_n,1)^{B,H}\), and arbitrarily extend \(\sigma _n^{B,H}\) to a bijection on \(\mathbb {N}^f\), where \(V_n\) is considered in the natural way as a subset of the offspring of \(\varnothing \). From the Skorokhod representation theorem, we may assume \((G_n,1)^{B,H}\) converges almost surely to \((T,\varnothing )^{B,H}\). Thus, there are sequences \(B_n,H_n\) tending to infinity and \({\hat{\sigma }}_n:=\sigma _n^{B_n,H_n}\) such that for any pair \(v,w\in \mathbb {N}^f\) with \(w\ne v\), \(\xi _{{\hat{\sigma }}_n(v),{\hat{\sigma }}_n(w)}^n\) converges almost surely to

$$\begin{aligned} {\left\{ \begin{array}{ll} y_{vk},\quad \text {if } w=vk \text { for some } k\\ y_{wk},\quad \text {if } v=wk \text { for some } k\\ 0,\quad \ \ \,\text {otherwise.} \end{array}\right. } \end{aligned}$$

Thus for any \(u,v\in \mathbb {N}^f\) with \(u\ne v\)

$$\begin{aligned} \langle \delta _u, {\hat{\sigma }}_n^{-1}L_n{\hat{\sigma }}_n\delta _v\rangle =\langle \delta _u, {\hat{\sigma }}_n^{-1}A_n{\hat{\sigma }}_n\delta _v\rangle \rightarrow \langle \delta _u,(L+D)\delta _v\rangle =\langle \delta _u,L\delta _v\rangle \end{aligned}$$
(6.1)

almost surely. We now consider the diagonal elements. Let \(u\in \mathbb {N}^f\) and take \(B=H=k\) for some \(k\in \mathbb {N}\) large enough that \(u\in J_{k,k}\). From the above we know almost surely

$$\begin{aligned} \sum _{v\in J_{k,k}} \xi ^n_{(u),(v)}\rightarrow \sum _{v\in J_{k,k}, v\sim u} y_v. \end{aligned}$$
(6.2)

If \(v\notin J_{k,k}\), then \(\xi ^n_{(u),(v)}\rightarrow 0\) almost surely, and by the uniform summability condition of C1 we have almost surely

$$\begin{aligned} \sum _{v\notin J_{k,k}} \xi ^n_{(u),(v)}\rightarrow 0. \end{aligned}$$
(6.3)

As k was arbitrary, we have that almost surely for any \(v\in \mathbb {N}^f\)

$$\begin{aligned} \langle \delta _v,{\hat{\sigma }}_n^{-1}L_n{\hat{\sigma }}_n\delta _v\rangle \rightarrow \langle \delta _v,L\delta _v\rangle . \end{aligned}$$
(6.4)

By linearity, it suffices to show for every \(v\in \mathbb {N}^f\) that \({\hat{\sigma }}_n^{-1}L_n{\hat{\sigma }}_n\delta _v\rightarrow L\delta _v\) in \(\ell ^2(\mathbb {N}^f)\), i.e.

$$\begin{aligned} \sum _{u\in \mathbb {N}^f}\left[ \langle \delta _u,{\hat{\sigma }}_n^{-1}L_n{\hat{\sigma }}_n\delta _v\rangle -\langle \delta _u,L\delta _v\rangle \right] ^2\rightarrow 0. \end{aligned}$$
(6.5)

We have shown \(\langle \delta _u,{\hat{\sigma }}_n^{-1}L_n{\hat{\sigma }}_n\delta _v\rangle \rightarrow \langle \delta _u,L\delta _v\rangle \) almost surely for every \(u\in \mathbb {N}^f\); thus (6.5) holds provided \(\left\{ \langle \delta _u,{\hat{\sigma }}_n^{-1}L_n{\hat{\sigma }}_n\delta _v\rangle \right\} _{u\in \mathbb {N}^f}\) is uniformly square-summable, which follows from the uniform summability of C1. This completes the proof of Theorem 6.1.

We will need the following extension of Theorem 6.1.

Theorem 6.2

Let \(L_n\) be the matrix defined by (2.1) for \(\{A_n \}_{n\ge 1}\), a Lévy–Khintchine random matrix ensemble satisfying C1. If L and \(L'\) are two independent copies of the linear operator on \(\ell ^2(\mathbb {N}^f)\) associated to \(\textbf{PWITL}(m)\), then, in distribution, \((L_n\oplus L_n,(1,2))\rightarrow (L\oplus L',(\varnothing ,\varnothing ))\) as \(n\rightarrow \infty \).

Proof

Using Proposition 2.6 in [7] and the arguments above, we can construct isometries \(\sigma _n\) on \(\ell ^2(\mathbb {N}^f)\oplus \ell ^2(\mathbb {N}^f)\) such that for any \(v\in \mathbb {N}^f,\ \sigma _n^{-1}(L_n\oplus L_n)\sigma _n(\delta _v,0)\rightarrow (L\oplus L')(\delta _v,0)\) and \(\sigma _n^{-1}(L_n\oplus L_n)\sigma _n(0,\delta _v)\rightarrow (L\oplus L')(0,\delta _v)\) in \(\ell ^2(\mathbb {N}^f)\oplus \ell ^2(\mathbb {N}^f)\) almost surely. The result then follows by linearity. \(\square \)

7 Resolvent Convergence and the Proof of Theorem 2.4

Theorem 7.1

Let \(s_{L_n}(z)\) be the Stieltjes transform of \(\mu _{L_n}\), and let \(s_{\varnothing }(z)\) be the Stieltjes transform of the measure \(\mu _\varnothing \) defined by

$$\begin{aligned} \langle \delta _\varnothing ,f(L)\delta _\varnothing \rangle =\int _\mathbb {R}f\textrm{d}\mu _\varnothing \end{aligned}$$
(7.1)

for any continuous bounded function \(f:\mathbb {R}\rightarrow \mathbb {C}\), where f(L) is defined by the continuous functional calculus. Then,

$$\begin{aligned} \lim \limits _{n\rightarrow \infty }\mathbb {E}s_{L_n}(z)=\mathbb {E}s_{\varnothing }(z) \end{aligned}$$
(7.2)

for every \(z\in \mathbb {C}_+\).

Proof

For \(z\in \mathbb {C}_+\), we define the operators

$$\begin{aligned} R_n(z)=(L_n-z)^{-1}, \end{aligned}$$
(7.3)

and

$$\begin{aligned} R(z)=(L-z)^{-1}. \end{aligned}$$
(7.4)

Additionally for \(u,v\in \mathbb {N}^f\), we define the functions \(R_n(z)_{uv},R(z)_{uv}:\mathbb {C}_+\rightarrow \mathbb {C}\) by

$$\begin{aligned} R_n(z)_{uv}:=\langle \delta _u,R_n(z)\delta _v\rangle ,\quad \text { and }\quad R(z)_{uv}:=\langle \delta _u,R(z)\delta _v\rangle . \end{aligned}$$
(7.5)

From Proposition 5.2, L is self-adjoint with probability 1. Thus from Theorem 4.2 and Theorem 6.1

$$\begin{aligned} R_n(z)_{11}=\langle \delta _1,(L_n-z)^{-1}\delta _1\rangle \Rightarrow \langle \delta _\varnothing , (L-z)^{-1}\delta _\varnothing \rangle =R(z)_{\varnothing \varnothing }. \end{aligned}$$
(7.6)

For every \(z\in \mathbb {C}_+\), \(R_n(z)_{11}\) and \(R(z)_{\varnothing \varnothing }\) are bounded in modulus by \(1/{\text {Im}}(z)\), thus

$$\begin{aligned} \lim \limits _{n\rightarrow \infty }\mathbb {E}R_n(z)_{11}=\mathbb {E}R(z)_{\varnothing \varnothing }. \end{aligned}$$
(7.7)

By definition \(s_\varnothing (z)=R(z)_{\varnothing \varnothing }\), while

$$\begin{aligned} s_{L_n}(z)=\frac{1}{n}{{\,\textrm{tr}\,}}(L_n-z)^{-1}=\frac{1}{n}\sum _{i =1}^n R_n(z)_{ii}. \end{aligned}$$
(7.8)

It is clear from the matrix of cofactors method of inversion that \(R_n(z)_{ii}\overset{d}{=}R_n(z)_{jj}\) for every \(i,j\in [n]\). Thus,

$$\begin{aligned} \mathbb {E}s_{L_n}(z)&=\frac{1}{n}\sum _{i =1}^n\mathbb {E}R_n(z)_{ii}\\&=\frac{1}{n}\sum _{i =1}^n\mathbb {E}R_n(z)_{11}\\&=\mathbb {E}R_n(z)_{11}. \end{aligned}$$

This completes the proof. \(\square \)
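The exchangeability identity \(\mathbb {E}s_{L_n}(z)=\mathbb {E}R_n(z)_{11}\) used above is straightforward to test numerically; a sketch under a hypothetical Lévy-type model with symmetric Pareto entries scaled by \(n^{-1/\alpha }\) (the model and its parameters are ours, chosen only for illustration):

import numpy as np

rng = np.random.default_rng(seed=5)
alpha, n, z, trials = 0.5, 300, 1.0j, 200

s_vals = np.empty(trials, dtype=complex)  # s_{L_n}(z) = (1/n) tr (L_n - z)^{-1}
r11 = np.empty(trials, dtype=complex)     # R_n(z)_{11}
for t in range(trials):
    A = rng.pareto(alpha, (n, n)) * rng.choice([-1.0, 1.0], (n, n))
    A = np.triu(A, 1) * n ** (-1.0 / alpha)
    A = A + A.T                           # symmetric, zero diagonal
    L = A - np.diag(A.sum(axis=1))        # L_n = A_n - D_n as in (1.1)
    R = np.linalg.inv(L - z * np.eye(n))
    s_vals[t] = np.trace(R) / n
    r11[t] = R[0, 0]

print(s_vals.mean())   # the two Monte Carlo averages nearly agree
print(r11.mean())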

7.1 Proof of Theorem 2.4

We are now ready to complete the proof of Theorem 2.4. From Lemma A.1, \(\{\mu _{L_n}\}_{n\ge 1}\) is almost surely tight. Consider the Stieltjes transform \(s_{L_n}\) of \(\mu _{L_n}\). From tightness and Lemma B.2, it is enough to prove that almost surely there exists a probability measure with Stieltjes transform s such that for any subsequence \(\{n_k\}\)

$$\begin{aligned} \lim _{n_k\rightarrow \infty }s_{L_{n_k}}(z)=s(z), \end{aligned}$$
(7.9)

for all \(z\in \mathbb {C}_+\). We know from Theorem 7.1 that for all \(z\in \mathbb {C}_+\)

$$\begin{aligned} \lim \limits _{n\rightarrow \infty }\mathbb {E}s_{L_n}(z)=\mathbb {E}s_{\varnothing }(z). \end{aligned}$$
(7.10)

We now upgrade this to almost sure convergence of \(s_{L_n}(z)\) to \(\mathbb {E}s_{\varnothing }(z)\). For \(z\in \mathbb {C}_+\)

$$\begin{aligned} \mathbb {E}|s_{L_n}(z)-\mathbb {E}s_{\varnothing }(z)|\le \mathbb {E}|s_{L_n}(z)-\mathbb {E}s_{L_n}(z)|+|\mathbb {E}s_{L_n}(z)-\mathbb {E}s_{\varnothing }(z)|. \end{aligned}$$

For \(z\in \mathbb {C}_+\)

$$\begin{aligned} \mathbb {E}|s_{L_n}(z)-\mathbb {E}s_{L_n}(z)|=\mathbb {E}\left| \frac{1}{n}\sum _{i =1}^n\left( R_n(z)_{ii}-\mathbb {E}R_n(z)_{ii}\right) \right| , \end{aligned}$$
(7.11)

and by the exchangeability of the matrix entries

$$\begin{aligned} \mathbb {E}\left[ \left| \frac{1}{n}\sum _{i =1}^n R_n(z)_{ii}-\mathbb {E}R_n(z)_{ii} \right| ^2 \right]&=\frac{1}{n}\mathbb {E}\left| R_n(z)_{11}-\mathbb {E}R_n(z)_{11} \right| ^2\\&\qquad +\frac{n-1}{n}{{\,\textrm{Cov}\,}}(R_n(z)_{11},R_n(z)_{22})\\&\le \frac{1}{n \, {\text {Im}}(z)^2}\\&\qquad +\frac{n-1}{n}{{\,\textrm{Cov}\,}}(R_n(z)_{11},R_n(z)_{22}). \end{aligned}$$

From Theorems 4.2 and 6.2, we know \(R_{n}(z)_{11}\) and \(R_n(z)_{22}\) are asymptotically independent random variables bounded uniformly in n, and thus asymptotically uncorrelated. From this, we get

$$\begin{aligned} \lim \limits _{n\rightarrow \infty }\mathbb {E}\left[ \left| \frac{1}{n}\sum _{i =1}^n R_n(z)_{ii}-\mathbb {E}R_n(z)_{ii} \right| ^2 \right] =0, \end{aligned}$$

and

$$\begin{aligned} \lim \limits _{n\rightarrow \infty }\mathbb {E}|s_{L_n}(z)-\mathbb {E}s_{\varnothing }(z)|=0. \end{aligned}$$
(7.12)

Taking \(\mu _m=\mathbb {E}\mu _\varnothing \) completes the proof of Theorem 2.4.
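For the reader who wishes to see Theorem 2.4 at work, the following sketch (same hypothetical Pareto model as in the sketch above) computes bulk quantiles of the empirical spectral measure of \(L_n\) for two independent samples; for large n these nearly agree, reflecting convergence to the deterministic limit \(\mu _m\).

import numpy as np

alpha, n = 0.5, 500

def esd_quantiles(seed):
    rng = np.random.default_rng(seed)
    A = rng.pareto(alpha, (n, n)) * rng.choice([-1.0, 1.0], (n, n))
    A = np.triu(A, 1) * n ** (-1.0 / alpha)
    A = A + A.T                            # symmetric, zero diagonal
    L = A - np.diag(A.sum(axis=1))         # L_n = A_n - D_n
    return np.quantile(np.linalg.eigvalsh(L), [0.1, 0.25, 0.5, 0.75, 0.9])

print(esd_quantiles(6))
print(esd_quantiles(7))   # an independent sample; bulk quantiles are close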

8 Proof of Theorem 2.5

We will follow the approach of [7] and take advantage of the tree structure on \(\mathbb {N}^f\) to arrive at (2.7) before proving uniqueness. Let L be the operator associated to \(\textbf{PWITL}(m)\); we have already seen that \(s_m(z)=\mathbb {E}s_\varnothing (z)\), where

$$\begin{aligned} s_\varnothing (z)=\langle \delta _\varnothing ,(L-z)^{-1}\delta _\varnothing \rangle . \end{aligned}$$
(8.1)

We now decompose the operator L as

$$\begin{aligned} L=C+\bigoplus _{k=1}^\infty L_k \end{aligned}$$
(8.2)

where

$$\begin{aligned} \langle \delta _k,C\delta _\varnothing \rangle =\langle \delta _\varnothing ,C\delta _k\rangle =\langle \delta _k,L\delta _\varnothing \rangle \nonumber \\ \langle \delta _k,C\delta _k\rangle =-y_k\nonumber \\ \langle \delta _\varnothing ,C\delta _\varnothing \rangle =\langle \delta _\varnothing ,L\delta _\varnothing \rangle \end{aligned}$$
(8.3)

and for every \(k\in \mathbb {N}\), \(L_k\) is supported on \(k\mathbb {N}^f=\{kv\in \mathbb {N}^f: v\in \mathbb {N}^f \}\). Note \(\langle \delta _v, C\delta _u\rangle =0\) for any combination of \(u,v\in \mathbb {N}^f\) not listed in (8.3). Under this decomposition, \(\{L_k\}_{k\ge 1}\) is a collection of i.i.d. random operators, each equal in distribution, up to an isometry, to L. For convenience, define the operator \({\tilde{L}}\) by

$$\begin{aligned} {\tilde{L}}:=\bigoplus _{k=1}^\infty L_k, \end{aligned}$$
(8.4)

and the operators \(R(z):=(L-z)^{-1}\) and \({\tilde{R}}(z):=({\tilde{L}}-z)^{-1}\) for \(z\in \mathbb {C}_+\). From (8.2) we get the resolvent identity

$$\begin{aligned} {\tilde{R}}(z)CR(z)={\tilde{R}}(z)-R(z). \end{aligned}$$
(8.5)

Additionally, write \(R_{uv}(z):=\langle \delta _u, R(z)\delta _v\rangle \) and \({\tilde{R}}_{uv}(z):=\langle \delta _u, {\tilde{R}}(z)\delta _v\rangle \). Note \({\tilde{R}}_{\varnothing \varnothing }(z)=-z^{-1}\), \({\tilde{R}}_{kl}(z)=0\) for all \(k,l\in \mathbb {N}\) with \(k\ne l\), and \({\tilde{R}}_{\varnothing k}(z)=0={\tilde{R}}_{k\varnothing }(z)\) for all \(k\in \mathbb {N}\).
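Since (8.5) is purely algebraic, a finite-dimensional sanity check may reassure the reader; here arbitrary symmetric matrices stand in for \({\tilde{L}}\) and C (this verifies only the identity itself, nothing about the infinite-dimensional operators).

import numpy as np

rng = np.random.default_rng(seed=7)
n, z = 6, 1.0 + 2.0j

Lt = rng.standard_normal((n, n)); Lt = Lt + Lt.T   # stand-in for L-tilde
C = rng.standard_normal((n, n)); C = C + C.T       # stand-in for C
L = Lt + C                                         # the decomposition (8.2)

R = np.linalg.inv(L - z * np.eye(n))
Rt = np.linalg.inv(Lt - z * np.eye(n))
# (8.5): R-tilde C R = R-tilde - R; the residual is at machine precision.
print(np.max(np.abs(Rt @ C @ R - (Rt - R))))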

From (8.5), one immediately gets

$$\begin{aligned} \langle \delta _k, {\tilde{R}}(z)CR(z)\delta _\varnothing \rangle =-R_{k\varnothing }(z). \end{aligned}$$
(8.6)

It also follows that

$$\begin{aligned} \langle \delta _k, {\tilde{R}}(z)CR(z)\delta _\varnothing \rangle&=\left\langle \delta _k, {\tilde{R}}(z)C\sum _{v\in \mathbb {N}^f}R_{v\varnothing }(z)\delta _v\right\rangle \\&=\left\langle \delta _k, {\tilde{R}}(z)R_{\varnothing \varnothing }(z)y_{\varnothing \varnothing }\delta _\varnothing \right\rangle +\left\langle \delta _k, {\tilde{R}}(z)\sum _{l\in \mathbb {N}}R_{\varnothing \varnothing }(z)y_l\delta _l\right\rangle \\ {}&\quad +\left\langle \delta _k, {\tilde{R}}(z)\sum _{j\in \mathbb {N}}R_{j\varnothing }(z)y_j\delta _\varnothing \right\rangle -\left\langle \delta _k, {\tilde{R}}(z)\sum _{j\in \mathbb {N}}R_{j\varnothing }(z)y_j\delta _j\right\rangle \\&=0+{\tilde{R}}_{kk}(z)y_kR_{\varnothing \varnothing }(z)+0-{\tilde{R}}_{kk}(z)y_kR_{k\varnothing }(z). \end{aligned}$$

Rearranging we arrive at

$$\begin{aligned} R_{k\varnothing }(z)=\frac{{\tilde{R}}_{kk}(z)y_kR_{\varnothing \varnothing }(z)}{{\tilde{R}}_{kk}(z)y_k-1}. \end{aligned}$$
(8.7)

A similar computation for \(\langle \delta _\varnothing , {\tilde{R}}(z)CR(z)\delta _\varnothing \rangle \) gives

$$\begin{aligned} {\tilde{R}}_{\varnothing \varnothing }(z)-R_{\varnothing \varnothing }(z)={\tilde{R}}_{\varnothing \varnothing }(z)y_{\varnothing \varnothing }R_{\varnothing \varnothing }(z)+\sum _{j=1}^\infty {\tilde{R}}_{\varnothing \varnothing }(z)y_{j}R_{j\varnothing }(z). \end{aligned}$$
(8.8)

Combining (8.7) and (8.8) gives

$$\begin{aligned} {\tilde{R}}_{\varnothing \varnothing }(z)&=R_{\varnothing \varnothing }(z) +{\tilde{R}}_{\varnothing \varnothing }(z)y_{\varnothing \varnothing }R_{\varnothing \varnothing }(z)+\sum _{j=1}^\infty {\tilde{R}}_{\varnothing \varnothing }(z)y_{j}R_{j\varnothing }(z)\\&=R_{\varnothing \varnothing }(z)+{\tilde{R}}_{\varnothing \varnothing }(z) y_{\varnothing \varnothing }R_{\varnothing \varnothing }(z)+\sum _{j=1}^\infty {\tilde{R}}_{\varnothing \varnothing }(z)y_{j}\frac{{\tilde{R}}_{jj}(z)y_jR_{\varnothing \varnothing }(z)}{{\tilde{R}}_{jj}(z)y_j-1}\\&=R_{\varnothing \varnothing }(z) \left[ 1+{\tilde{R}}_{\varnothing \varnothing }(z)y_{\varnothing \varnothing }+\sum _{j=1}^\infty {\tilde{R}}_{\varnothing \varnothing }(z)y_{j}\frac{{\tilde{R}}_{jj}(z)y_j}{{\tilde{R}}_{jj}(z)y_j-1}\right] , \end{aligned}$$

which, along with \({\tilde{R}}_{\varnothing \varnothing }(z)=-z^{-1}\), implies

$$\begin{aligned} R_{\varnothing \varnothing }(z)=-\left( z-y_{\varnothing \varnothing }-\sum _{j=1} ^\infty \frac{{\tilde{R}}_{jj}(z)y_j^2}{{\tilde{R}}_{jj}(z)y_j-1} \right) ^{-1}. \end{aligned}$$
(8.9)

Noting \(y_{\varnothing \varnothing }=-\sum _{j=1}^\infty y_j\) gives (2.7). Note that for \(j\in \mathbb {N}\), \({\tilde{R}}_{jj}(z)\) depends only on z and \(L_j\), and hence \(\{{\tilde{R}}_{jj}(z)\}_{j\in \mathbb {N}}\) is a collection of i.i.d. random variables independent of \(\{y_j\}_{j\in \mathbb {N}}\).
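Equation (2.7) is a recursive distributional equation for the law of \(s_\varnothing (z)\), and it can be solved numerically by population dynamics: keep a pool of samples of the law and repeatedly replace the pool by the right-hand side of (8.10) evaluated on draws from it. A sketch at the single point \(z=i\), again under the hypothetical \(\alpha \)-stable intensity with the point process truncated at its K largest points:

import numpy as np

rng = np.random.default_rng(seed=8)
alpha, z = 0.5, 1.0j
pool_size, K, sweeps = 1000, 200, 50

pool = np.full(pool_size, 1.0j)   # 1j is a legitimate Stieltjes value at z = i
for _ in range(sweeps):
    Gamma = np.cumsum(rng.exponential(size=(pool_size, K)), axis=1)
    y = Gamma ** (-1.0 / alpha) * rng.choice([-1.0, 1.0], size=(pool_size, K))
    s = rng.choice(pool, size=(pool_size, K))   # i.i.d. draws from the pool
    # the map (8.10): s' = -(z - sum_j y_j / (s_j y_j - 1))^{-1}
    pool = -1.0 / (z - np.sum(y / (s * y - 1.0), axis=1))

print(pool.mean())   # Monte Carlo approximation of E s_o(z) = s_m(i)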

8.1 Uniqueness

In this section, we prove uniqueness of the solution to (2.7) from Theorem 2.5. While the argument is technical, at its core is a contraction argument. We will show the map T defined below in (8.10) contracts, in an appropriate metric, two fixed points belonging to a nice subset of all probability measures on the space of Stieltjes transforms. We then extend this result to any two potential fixed points by moving from this metric to a functional separating distinct points.

Let \({\mathcal {S}}\) be the set of Stieltjes transforms of probability measures on \(\mathbb {R}\) and \({\mathcal {P}}({\mathcal {S}})\) be the set of probability measures on \({\mathcal {S}}\). Define \(T:{\mathcal {P}}({\mathcal {S}})\rightarrow {\mathcal {P}}({\mathcal {S}})\) as follows: for \(\mu \in {\mathcal {P}}({\mathcal {S}})\)

$$\begin{aligned} T(\mu ):={\mathcal {L}}\left( -\left( z-\sum _{j=1}^N\frac{y_j}{s_j(z)y_j-1} \right) ^{-1}\right) , \end{aligned}$$
(8.10)

where \(\{s_j\}\) are i.i.d. with distribution \(\mu \), \(\{y_j\}_{j=1}^\infty \) is a Poisson point process with a fixed intensity measure m independent of the collection \(\{s_j\}_{j\ge 1}\), N is a Poisson random variable with mean \(m(\mathbb {R}\setminus \{0\})\) such that \(y_j=0\) if \(j>N\) (with \(N=\infty \) when this mass is infinite), and \({\mathcal {L}}(X)\) is the law of a random variable X. Thus the distribution of \(s_\varnothing \) is a fixed point of T, and we aim to show it is the unique fixed point. The notion of distance for which T contracts fixed points will involve the infimum over all couplings of these fixed point measures. Let \(\mu _1,\mu _2\in {\mathcal {P}}({\mathcal {S}})\) be two fixed points of T and let (r(z), s(z)) be an arbitrary coupling of \(\mu _1\) and \(\mu _2\). Additionally, let \(\mu _r\) and \(\mu _s\) be the random probability measures on \(\mathbb {R}\) defined uniquely by

$$\begin{aligned} s(z)=\int _\mathbb {R}\frac{1}{x-z}\textrm{d}\mu _s(x)\text { and }r(z)=\int _\mathbb {R}\frac{1}{x-z}\textrm{d}\mu _r(x), \end{aligned}$$

for all \(z\in \mathbb {C}_+\). For now, we will assume there exists \(M\in \mathbb {N}\) such that almost surely \(\mu _r([-M,M])\ge \frac{1}{2}\) and \(\mu _s([-M,M])\ge \frac{1}{2}\). This assumption will be removed later. As r and s are analytic functions on the upper half plane, we will consider them only on the box

$$\begin{aligned} {\mathcal {C}}_M=\left\{ z\in \mathbb {C}: |{\text {Re}}(z)|\le \frac{1}{2},\, f_m(M)\le {\text {Im}}(z)\le f_m(M)+1 \right\} , \end{aligned}$$
(8.11)

where \(f_m\) is a positive increasing function on \(\mathbb {N}\) such that \(f_m(M)\rightarrow \infty \) as \(M\rightarrow \infty \), which will be chosen later to satisfy (8.17) below. Note for \(z\in {\mathcal {C}}_M\), the assumption on \(\mu _r\) and \(\mu _s\) implies \({\text {Im}}(r(z)),{\text {Im}}(s(z))\ge \frac{1}{2}\frac{{\text {Im}}(z)}{(M+\frac{1}{2})^2+{\text {Im}}(z)^2}\). Let \(\{(r_j,s_j) \}_{j=1}^N\) be i.i.d. copies of (r(z), s(z)). Define the random functions \({\tilde{s}}\) and \({\tilde{r}}\), pointwise on \(\mathbb {C}_+\) and the sample space, by

$$\begin{aligned} {\tilde{s}}(z)=-\left( z-\sum _{j=1}^N\frac{y_j}{s_j(z)y_j-1} \right) ^{-1}\text { and }{\tilde{r}}(z)=-\left( z-\sum _{j=1}^N\frac{y_j}{r_j(z)y_j-1} \right) ^{-1}, \end{aligned}$$
(8.12)

where \(\{y_j\}_{j=1}^\infty \) is a Poisson point process with intensity measure m independent of the collection \(\{(r_j,s_j) \}_{j=1}^\infty \). If \(\mu _1\) and \(\mu _2\) are fixed points of T, then \(({\tilde{r}},{\tilde{s}})\) is a coupling of \(\mu _1\) and \(\mu _2\). We show that

$$\begin{aligned} \mathbb {E}\sup _{z\in {\mathcal {C}}_M}|{\tilde{r}}(z)-{\tilde{s}}(z)|\le \frac{4}{5}\mathbb {E}\sup _{z\in {\mathcal {C}}_M}|r(z)-s(z)| \end{aligned}$$

for an appropriate choice of \(f_m(M)\) independent of the coupling (r, s). First note

$$\begin{aligned} \mathbb {E}\sup _{z\in {\mathcal {C}}_M}|{\tilde{r}}(z)-{\tilde{s}}(z)|&\le \mathbb {E}\sup _{z\in {\mathcal {C}}_M} \frac{1}{{\text {Im}}(z)^2}\sum _{j=1}^N\frac{|r_j(z)-s_j(z)|y_j^2}{|r_j(z)y_j-1||s_j(z)y_j-1|}\nonumber \\&\le \mathbb {E}\sum _{j=1}^N\sup _{z\in {\mathcal {C}}_M}\frac{1}{{\text {Im}}(z)^2} \frac{|r_j(z)-s_j(z)|y_j^2}{|r_j(z)y_j-1||s_j(z)y_j-1|}. \end{aligned}$$
(8.13)

To handle the denominator, we will consider separately the points \(y_j\) of small modulus, for which \(|r_j(z)y_j|\) and \(|s_j(z)y_j|\) are small, and the few points of large modulus, for which \(|{\text {Im}}(r_j(z)y_j)|\) and \(|{\text {Im}}(s_j(z)y_j)|\) are bounded below. Let \({\hat{m}}\) be equal to m with support restricted to \([-f_m(M)/2,f_m(M)/2]\) and \({\tilde{m}}:=m-{\hat{m}}\). Decompose the point process \(\{y_j\}_{j=1}^N\) into two independent Poisson point processes \(\{{\hat{y}}_j\}^{{\hat{N}}}_{j=1}\) and \(\{{\tilde{y}}_j\}^{{\tilde{N}}}_{j=1}\) with intensity measures \({\hat{m}}\) and \({\tilde{m}}\), respectively. We will divide the sum in (8.13) into two sums over these point processes. To begin, note for \(z\in {\mathcal {C}}_M\), \(|s_j(z){\hat{y}}_j-1||r_j(z){\hat{y}}_j-1|\ge 1/4\) and thus

$$\begin{aligned}&\mathbb {E}\sum _{j=1}^{{\hat{N}}}\sup _{z\in {\mathcal {C}}_M}\frac{1}{{\text {Im}}(z)^2}\frac{|r_j(z)-s_j(z) |{\hat{y}}_j^2}{|r_j(z){\hat{y}}_j-1||s_j(z){\hat{y}}_j-1|}\\&\le \mathbb {E}\sum _{j=1}^{{\hat{N}}} \sup _{z\in {\mathcal {C}}_M}\frac{4}{{\text {Im}}(z)^2}|r_j(z)-s_j(z)|{\hat{y}}_j^2\\&\le \frac{4}{f_m(M)^2}\mathbb {E}\sum _{j=1}^{{\hat{N}}}\sup _{z\in {\mathcal {C}}_M}|r_j(z)-s_j(z)|{\hat{y}}_j^2\\&=\frac{4}{f_m(M)^2}\mathbb {E}\sup _{z\in {\mathcal {C}}_M}|r_1(z)-s_1(z)|I_{M,m}, \end{aligned}$$

where \(I_{M,m}=\int _{-f_m(M)/2}^{f_m(M)/2}y^2\textrm{d}m(y)\) and the last equality follows from Lemma 3.3 and independence. From (2.3)

$$\begin{aligned} \int _{-f_m(M)/2}^{f_m(M)/2}y^2\textrm{d}m(y)&\le \int _{-1}^1 y^2\ \textrm{d}m(y)+\int _{1\le |y|\le \sqrt{f_m(M)}}y^2\ \textrm{d}m(y)\\&\qquad + \int _{\sqrt{f_m(M)}\le |y|\le f_m(M)/2}y^2\ \textrm{d}m(y)\\&\le C'+C'f_m(M)+ \frac{f_m(M)^2}{4}m(\{y: |y|\ge \sqrt{f_m(M)}\})\\&\le Cf_m(M)^{2-\varepsilon /2} \end{aligned}$$

where \(\varepsilon >0\) is from (2.3), and \(C,C'>0\) are constants which depend only on the measure m. Thus,

$$\begin{aligned} \mathbb {E}\sum _{j=1}^{{\hat{N}}}\sup _{z\in {\mathcal {C}}_M}\frac{1}{{\text {Im}}(z)^2} \frac{|r_j(z)-s_j(z)|{\hat{y}}_j^2}{|r_j(z){\hat{y}}_j-1||s_j(z){\hat{y}}_j-1|} \le \frac{C'}{f_m(M)^{\varepsilon /2}}\mathbb {E}\sup _{z\in {\mathcal {C}}_M}|r_1(z)-s_1(z)|.\nonumber \\ \end{aligned}$$
(8.14)

To handle the other sum, first note \(|s_j(z){\tilde{y}}_j-1||r_j(z){\tilde{y}}_j-1|\ge {\tilde{y}}_j^2 C_{z,M}^{-1}\), where

$$\begin{aligned} C_{z,M}=\frac{4\left( \left( M+\frac{1}{2}\right) ^2+{\text {Im}}(z)^2\right) ^2}{{\text {Im}}(z)^2}. \end{aligned}$$

Then,

$$\begin{aligned}&\mathbb {E}\sum _{j=1}^{{\tilde{N}}}\sup _{z\in {\mathcal {C}}_M}\frac{1}{{\text {Im}}(z)^2}\frac{|r_j(z)-s_j(z) |{\tilde{y}}_j^2}{|r_j(z){\tilde{y}}_j-1||s_j(z){\tilde{y}}_j-1|}\nonumber \\&\le \mathbb {E}\sum _{j=1}^{{\tilde{N}}} \sup _{z\in {\mathcal {C}}_M}\frac{C_{z,M}}{{\text {Im}}(z)^2}|r_j(z)-s_j(z)|\nonumber \\&\le \frac{C_{i(f_m(M)+1),M}}{f_m(M)^2}\mathbb {E}\sum _{j=1}^{{\tilde{N}}}\sup _{z\in {\mathcal {C}}_M}|r_j(z)-s_j(z)|\nonumber \\&= \frac{C_{i(f_m(M)+1),M}}{f_m(M)^2}\mathbb {E}{\tilde{N}}\,\mathbb {E}\sup _{z\in {\mathcal {C}}_M}|r_1(z)-s_1(z)|, \end{aligned}$$
(8.15)

where the final equality follows from Lemma 3.3. Finally, combining (8.13), (8.14), and (8.15) gives

$$\begin{aligned} \mathbb {E}\sup _{z\in {\mathcal {C}}_M}|{\tilde{r}}(z)-{\tilde{s}}(z)|\le \left( \frac{C'}{f_m(M)^{\varepsilon /2}}+\frac{C_{i(f_m(M)+1),M}}{f_m(M)^2}\mathbb {E}{\tilde{N}} \right) \mathbb {E}\sup _{z\in {\mathcal {C}}_M}|r_1(z)-s_1(z)|.\nonumber \\ \end{aligned}$$
(8.16)

Notice this coefficient is independent of the coupling and depends only on M, \(f_m\), and m. From the definition of \(C_{z,M}\), we have that \(C_{z+i,M}/{\text {Im}}(z)^2\rightarrow 4\) as \({\text {Im}}(z)\rightarrow \infty \). We also have that \(\mathbb {E}{\tilde{N}}=m\left( (-\infty ,-f_m(M)/2)\cup (f_m(M)/2,\infty ) \right) \le Cf_m(M)^{-\varepsilon }\). We choose \(f_m\) to be such that

$$\begin{aligned} \left( \frac{C'}{f_m(M)^{\varepsilon /2}}+\frac{C_{i(f_m(M)+1),M}}{f_m(M)^2}\mathbb {E}{\tilde{N}} \right) \le \frac{4}{5}, \end{aligned}$$
(8.17)

for each \(M\in \mathbb {N}\). As the left-hand side of (8.17) is decreasing in \(f_m(M)\), \(f_m\) may be chosen to be increasing and unbounded.
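To make the choice of \(f_m\) concrete, the following sketch computes an admissible non-decreasing, unbounded \(f_m\) for a hypothetical stable-like intensity with tail \(m(\{|x|>t\})=t^{-\alpha }\) (so that one may take \(\varepsilon =\alpha \) in (2.3)); the constant \(C'\) is a stand-in.

alpha = 0.5     # hypothetical tail: m(|x| > t) = t^{-alpha}, so epsilon = alpha
C_prime = 1.0   # stand-in for the constant C' in (8.17)

def C_zM(im_z, M):
    # C_{z,M} from the text; it depends on z only through Im(z)
    return 4.0 * ((M + 0.5) ** 2 + im_z ** 2) ** 2 / im_z ** 2

def lhs(f, M):
    # left-hand side of (8.17) for this intensity
    EN = (f / 2.0) ** (-alpha)             # E N-tilde = m(|x| > f/2)
    return C_prime / f ** (alpha / 2.0) + C_zM(f + 1.0, M) / f ** 2 * EN

def next_f(M, prev):
    # smallest power of 2 at least prev with lhs <= 4/5; starting each
    # search at the previous value keeps the resulting f_m non-decreasing,
    # and since lhs grows with M at fixed f, f_m is unbounded
    f = prev
    while lhs(f, M) > 0.8:
        f *= 2.0
    return f

vals, f = [], 1.0
for M in range(1, 8):
    f = next_f(M, f)
    vals.append(f)
print(vals)   # an admissible choice of f_m(1), ..., f_m(7)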

Next we remove the assumption that, for some M, almost surely \(\mu _r\) and \(\mu _s\) have half their mass in \([-M,M]\). For a positive, increasing, unbounded function \(f_m\) on \(\mathbb {N}\), we define the function \(d_{f_m}:{\mathcal {P}}({\mathcal {S}})^2\rightarrow [0,\infty )\) by

$$\begin{aligned} d_{f_m}(\mu _1,\mu _2)=\inf _{(r,s)\in {\mathcal {C}}(\mu _1,\mu _2)}\mathbb {E}\sum _{M=1}^{\infty }\sup _{z\in {\mathcal {C}}_M}|r(z)-s(z)|/2^M, \end{aligned}$$
(8.18)

and the function \(d'_{f_m}:{\mathcal {P}}({\mathcal {S}})^2\rightarrow [0,\infty )\) by

$$\begin{aligned} d'_{f_m}(\mu _1,\mu _2)=\inf _{(r,s)\in {\mathcal {C}}(\mu _1,\mu _2)}\mathbb {E}\sum _{M=1}^{\infty }\sup _{z\in {\mathcal {C}}_M}|r(z)-s(z)|\textbf{1}_{{A_M}}/2^M, \end{aligned}$$

where \(\textbf{1}_{{A_M}}\) is the indicator function of the event

$$\begin{aligned} A_M=\left\{ \mu _r([-M,M])\ge \frac{1}{2} \text { and } \mu _s([-M,M])\ge \frac{1}{2} \right\} , \end{aligned}$$

\({\mathcal {C}}_M\) is the set defined by (8.11), and \({\mathcal {C}}(\mu _1,\mu _2)\) is the set of all couplings of \(\mu _1\) and \(\mu _2\) for \(\mu _1,\mu _2\in {\mathcal {P}}({\mathcal {S}})\). It is straightforward to check that \(\rho :{\mathcal {S}}\times {\mathcal {S}}\rightarrow [0,\infty )\) defined by

$$\begin{aligned} \rho (s,r)=\sum _{M=1}^{\infty }\sup _{z\in {\mathcal {C}}_M}|r(z)-s(z)|/2^M \end{aligned}$$
(8.19)

is a metric on \({\mathcal {S}}\), and thus \(d_{f_m}\) is the \(1^{\text {st}}\)-Wasserstein metric on \({\mathcal {P}}({\mathcal {S}})\) associated to \(\rho \) (see [18] Chapter 11 for details). Let \(\mu _1\) and \(\mu _2\) be two fixed points of T, and let (s, r) be a coupling of \(\mu _1\) and \(\mu _2\) such that

$$\begin{aligned} \mathbb {E}\sum _{M=1}^{\infty }\sup _{z\in {\mathcal {C}}_M}|r(z)-s(z)|\textbf{1}_{{A_M}}/2^M\le \frac{10}{9}d'_{f_m}(\mu _1,\mu _2), \end{aligned}$$
(8.20)

and let \({\tilde{s}}\) and \({\tilde{r}}\) be built from i.i.d. copies of (s, r) as in (8.12). Using the specific coupling \(({\tilde{s}},{\tilde{r}})\), (8.16), and (8.20) we get

$$\begin{aligned} d'_{f_m}(\mu _1,\mu _2)&\le \mathbb {E}\sum _{M=1}^{\infty }\sup _{z\in {\mathcal {C}}_M}|{\tilde{r}}(z)-{\tilde{s}}(z)|\textbf{1}_{{A_M}}/2^M\\&\le \frac{4}{5}\mathbb {E}\sum _{M=1}^{\infty }\sup _{z\in {\mathcal {C}}_M}|r(z)-s(z)|\textbf{1}_{{A_M}}/2^M\\&\le \frac{8}{9}d'_{f_m}(\mu _1,\mu _2), \end{aligned}$$

and thus \(d'_{f_m}(\mu _1,\mu _2)=0\).

If \(d'_{f_m}\) were a metric, it would be immediate that \(\mu _1=\mu _2\); however, it is not clear that this is the case. The only property of a metric needed is that \(d'_{f_m}\) separates distinct points in \({\mathcal {P}}({\mathcal {S}})\), and thus we conclude the proof using the following lemma.

Lemma 8.1

Fix a positive, increasing, unbounded function \(f_m:\mathbb {N}\rightarrow \mathbb {R}\). Then \(d'_{f_m}(\mu _1,\mu _2)=0\) if and only if \(d_{f_m}(\mu _1,\mu _2)=0\), where \(d_{f_m}\) is the metric defined in (8.18).

Proof

Assume \(d'_{f_m}(\mu _1,\mu _2)=0\) and fix \(\varepsilon >0\). Since the marginal laws of any coupling (r, s) of \(\mu _1\) and \(\mu _2\) are fixed, there exists \(M_0\in \mathbb {N}\) such that for any \(M\ge M_0\) and any coupling (r, s) one has

$$\begin{aligned} \mathbb {P}(A_M)\ge 1-\varepsilon . \end{aligned}$$
(8.21)

We have that

$$\begin{aligned} \inf _{(r,s)\in {\mathcal {C}}(\mu _1,\mu _2)}\mathbb {E}\sup _{z\in {\mathcal {C}}_{M_0}}|r(z)-s(z)|\textbf{1}_{{A_{M_0}}}=0, \end{aligned}$$
(8.22)

and thus we can find a sequence of couplings \(\{(r_n,s_n)\}_{n=1}^\infty \) such that

$$\begin{aligned} \lim \limits _{n\rightarrow \infty }\mathbb {E}\sup _{z\in {\mathcal {C}}_{M_0}}|r_n(z)-s_n(z)|\textbf{1}_{{A_{M_0}^n}}=0, \end{aligned}$$
(8.23)

where \(A_{M_0}^n=\{\mu _r^n([-M_0,M_0])\ge \frac{1}{2} \text { and } \mu _s^n([-M_0,M_0])\ge \frac{1}{2} \}\) and \(\mu _r^n\) and \(\mu _s^n\) are the random probability measures associated to \(r_n\) and \(s_n\). Let \(M_1\) be such that \(f_m(M_1)>\frac{2}{\varepsilon }\), and hence, since any Stieltjes transform is bounded in modulus by \(1/{\text {Im}}(z)\), for any \(M\ge M_1\)

$$\begin{aligned} \sup _{z\in {\mathcal {C}}_M}|r(z)-s(z)|\le \varepsilon , \end{aligned}$$
(8.24)

for any Stieltjes transforms r and s.

We will now extend the convergence in (8.23) to the supremum over the larger compact set \(\tilde{{\mathcal {C}}}=\cup _{j=1}^{M_1}{\mathcal {C}}_j\). The \(L^{1}\)-convergence to zero of the random variables \(\sup _{z\in {\mathcal {C}}_{M_0}}|r_n(z)-s_n(z)|\textbf{1}_{{A_{M_0}^n}}\) in (8.23) implies convergence in probability to zero. Thus, we can find a subsequence converging almost surely to zero, and without loss of generality, we denote this subsequence \(\{\sup _{z\in {\mathcal {C}}_{M_0}}|r_n(z)-s_n(z)|\textbf{1}_{{A_{M_0}^n}}\}_{n=1}^\infty \). Let \(G=\{\lim _{n\rightarrow \infty }\sup _{z\in {\mathcal {C}}_{M_0}}|r_n(z)-s_n(z)|\textbf{1}_{{A_{M_0}^n}}=0\}\), and decompose G into \(G_1=\{\omega \in G: \textbf{1}_{{A_{M_0}^n}}(\omega )=1 \text { i.o.}\}\) and \(G_0=G{\setminus } G_1\). Clearly on \(G_0\) the random variables \(\sup _{z\in \tilde{{\mathcal {C}}}}|r_n(z)-s_n(z)|\textbf{1}_{{A_{M_0}^n}}\) are eventually identically 0. For \(\omega \in G_1\), we consider the further subsequence \(\{n_k\}\) such that \(\textbf{1}_{{A_{M_0}^{n_k}}}(\omega )=1\) for all k. For this outcome \(\omega \), \(\{(r_{n_k}-s_{n_k})(\omega )\}_{k=1}^\infty \) is a sequence of complex analytic functions on \(\mathbb {C}_+\), uniformly bounded on compact subsets of \(\mathbb {C}_+\), converging uniformly to 0 on a set with an accumulation point. Thus, applying the Vitali convergence theorem for analytic functions, Lemma B.1, we get that \(\sup _{z\in \tilde{{\mathcal {C}}}}|r_{n_k}(z)-s_{n_k}(z)|(\omega )\rightarrow 0\) as \(k\rightarrow \infty \). From the above and the bounded convergence theorem, we get

$$\begin{aligned} \lim \limits _{n\rightarrow \infty }\mathbb {E}\sup _{z\in \tilde{{\mathcal {C}}}}|r_n(z)-s_n(z)|\textbf{1}_{{A_{M_0}^n}}=0. \end{aligned}$$
(8.25)

Combining (8.21), (8.24), and (8.25), we obtain

$$\begin{aligned} d_{f_m}(\mu _1,\mu _2)&=\inf _{(r,s)\in {\mathcal {C}}(\mu _1,\mu _2)}\mathbb {E}\sum _{M=1}^{\infty }\sup _{z\in {\mathcal {C}}_M}|r(z)-s(z)|/2^M\\&\le \inf _{(r,s)\in {\mathcal {C}}(\mu _1,\mu _2)}\mathbb {E}\sum _{M=1}^{\infty }\left( \sup _{z\in {\mathcal {C}}_M}|r(z)-s(z)|\textbf{1}_{{A_{M_0}}}/2^M+\frac{2}{f_m(1)}\mathbb {P}(A_{M_0}^c)/2^M\right) \\&<\inf _{(r,s)\in {\mathcal {C}}(\mu _1,\mu _2)}\mathbb {E}\sum _{M=1}^{M_1}\sup _{z\in \tilde{{\mathcal {C}}}}|r(z)-s(z)|\textbf{1}_{{A_{M_0}}}/2^M+\varepsilon +\frac{2}{f_m(1)}\varepsilon \\&\le \inf _{(r,s)\in {\mathcal {C}}(\mu _1,\mu _2)}\mathbb {E}\sup _{z\in \tilde{{\mathcal {C}}}}|r(z)-s(z)|\textbf{1}_{{A_{M_0}}}+\left( 1+\frac{2}{f_m(1)}\right) \varepsilon \\&=\left( 1+\frac{2}{f_m(1)}\right) \varepsilon . \end{aligned}$$

As \(\varepsilon >0\) was arbitrary we have \(d_{f_m}(\mu _1,\mu _2)=0\). For the other direction, note

$$\begin{aligned} d'_{f_m}(\mu _1,\mu _2)&=\inf _{(r,s)\in {\mathcal {C}}(\mu _1,\mu _2)}\mathbb {E}\sum _{M=1}^{\infty }\sup _{z\in {\mathcal {C}}_M}|r(z)-s(z)|\textbf{1}_{{A_M}}/2^M\\&\le \inf _{(r,s)\in {\mathcal {C}}(\mu _1,\mu _2)}\mathbb {E}\sum _{M=1}^{\infty }\sup _{z\in {\mathcal {C}}_M}|r(z)-s(z)|/2^M\\&=d_{f_m}(\mu _1,\mu _2), \end{aligned}$$

for any \(\mu _1,\mu _2\). \(\square \)