SPECTRUM OF LÉVY-KHINTCHINE RANDOM LAPLACIAN MATRICES

Abstract. We consider the spectrum of random Laplacian matrices of the form $L_n = A_n - D_n$, where $A_n$ is a real symmetric random matrix and $D_n$ is a diagonal matrix whose entries equal the corresponding row sums of $A_n$. If $A_n$ is a Wigner matrix with entries in the domain of attraction of a Gaussian distribution, the empirical spectral measure of $L_n$ is known to converge to the free convolution of a semicircle distribution and a standard real Gaussian distribution. We consider real symmetric random matrices $A_n$ with independent entries (up to symmetry) whose row sums converge to a purely non-Gaussian infinitely divisible distribution; these fall into the class of Lévy-Khintchine random matrices first introduced by Jung [Trans. Am. Math. Soc., 370, (2018)]. Our main result shows that the empirical spectral measure of $L_n$ converges almost surely to a deterministic limit. A key step in the proof is to use the purely non-Gaussian nature of the row sums to build a random operator to which $L_n$ converges in an appropriate sense. This operator leads to a recursive distributional equation uniquely describing the Stieltjes transform of the limiting empirical spectral measure.


Introduction
We consider the empirical spectral measure¹ of random Laplacian-type matrices of the form
$$L_n = A_n - D_n \tag{1.1}$$
where $A_n = (A_{ij})_{i,j=1}^n$ is an $n \times n$ real symmetric random matrix with independent entries up to symmetry, and $D_n$ is a diagonal matrix with $(D_n)_{ii} = \sum_{j=1}^n A_{ij}$. When $A_n$ is a Wigner matrix, i.e. $A_n$ has independent entries up to symmetry with mean zero and variance $1/n$, the empirical spectral measure of $A_n$ converges to Wigner's semicircle law, the empirical spectral measure of $D_n$ converges to a standard Gaussian distribution, and it was shown in [10] that the empirical spectral measure of $L_n$ converges to the free convolution of the semicircle law and the standard real Gaussian measure. In this paper we will consider $A_n$ such that the diagonal entries of $D_n$ converge in distribution, not to the Gaussian distribution, but rather to a non-Gaussian infinitely divisible distribution. This model includes Lévy matrices, sometimes referred to as heavy-tailed Wigner matrices, where the entries of $A_n$ are independent up to symmetry but have infinite second moment; see Subsection 1.1 for more details. Another important example arises when $A_n$ is the adjacency matrix of an Erdős-Rényi random graph where the expected degree of any vertex remains fixed as the number of vertices goes to infinity. These $A_n$ fall into the class of Lévy-Khintchine matrices, a generalization of Lévy matrices defined by Jung in [24]; see Subsection 1.2 for more on these matrices.

S. O'Rourke has been supported in part by NSF CAREER grant DMS-2143142.
¹ The definition of the empirical spectral measure and other notation used throughout is established in Subsection 1.3.
The term Laplacian comes from graph theory, where the combinatorial Laplacian of a graph with vertex set $\{1, 2, \ldots, n\}$ is defined by
$$(\Delta)_{ij} = \begin{cases} \deg(i), & i = j, \\ -\mathbf{1}_{\{i \sim j\}}, & i \neq j, \end{cases}$$
where $i \sim j$ if $\{i, j\}$ is an edge in the graph and $\deg(i)$ is the number of edges incident to a vertex $i$. The combinatorial Laplacian is the negative of what we refer to as the Laplacian. If the entries of $A_n$ are almost surely nonnegative, then $L_n$ is the infinitesimal generator of a (random) continuous-time random walk, and for this reason $L_n$ is referred to as a Markov matrix in some of the literature. We use the term Laplacian throughout. Spectral properties of real symmetric random Laplacian matrices have been studied in [4,10,12,13,16,17,21,22,23], and for non-symmetric random Laplacian matrices in [6], when the entries of $A_n$ are in the domain of attraction of either a real or complex Gaussian random variable. Given the widespread use of graph Laplacians, this list is surely incomplete. In these light-tailed cases the limiting spectral measure has a particularly nice free probabilistic interpretation (see [27] for an introduction to free probability and random matrices). In [10] Bryc, Dembo, and Jiang proved the following:

Theorem 1.1 (Theorem 1.3 in [10]). Let $\{X_{ij} : j \geq i \geq 1\}$ be a collection of i.i.d. real random variables with $\mathbb{E}X_{12} = 0$ and $\mathbb{E}X_{12}^2 = 1$, let $X_{ij} = X_{ji}$ for $1 \leq i \leq j$, and let $A_n = (X_{ij}/\sqrt{n})_{i,j=1}^n$ be a random real symmetric matrix. With probability one, the empirical spectral measure of the matrix $L_n$ defined in (1.1) converges weakly to the free additive convolution of the semicircle and standard Gaussian measures.
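As a concrete sanity check, the definition (1.1) can be simulated directly. The following sketch is our own illustration, not part of the paper's argument: it builds $L_n$ for a Gaussian Wigner matrix and verifies the structural identities behind the definition.

```python
import numpy as np

# Numerical illustration: build the Laplacian L_n = A_n - D_n of (1.1) for a
# Gaussian Wigner matrix and compute its spectrum, whose n -> infinity limit
# is the free convolution of the semicircle and Gaussian laws (Theorem 1.1).
rng = np.random.default_rng(0)
n = 500
X = rng.standard_normal((n, n))
A = (np.triu(X, 1) + np.triu(X, 1).T) / np.sqrt(n)  # symmetric, zero diagonal
D = np.diag(A.sum(axis=1))                          # row sums on the diagonal
L = A - D
eigs = np.linalg.eigvalsh(L)

# Sanity checks: L is symmetric and annihilates the all-ones vector,
# since each diagonal entry of D is the corresponding row sum of A.
assert np.allclose(L, L.T)
assert np.allclose(L @ np.ones(n), 0.0)
```

A histogram of `eigs` approximates the free convolution of the semicircle and standard Gaussian measures for large $n$.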
The analogous free probabilistic limit was established in [6] for non-symmetric A n .While some of the above references study sparse Laplacian matrices, none consider random Laplacian matrices with heavy-tailed entries or sparse Laplacian matrices where the expected number of nonzero entries in a row is uniformly bounded in n.
Many of the tools and techniques we employ were developed in the study of heavy-tailed real symmetric, or Lévy, matrices by Bordenave, Caputo, and Chafaï in [7]. Lévy matrices were introduced in [14] as heavy-tailed versions of Wigner matrices. For the purposes of this paper, an important distinction between Lévy and Wigner matrices is that the row sums of a Wigner matrix converge in distribution to a Gaussian random variable, while the row sums of a Lévy matrix converge to an $\alpha$-stable distribution for $0 < \alpha < 2$. The techniques in [7] were extended by Jung in [24] to random matrices whose row sums converge in distribution to an infinitely divisible distribution.
1.1. Lévy Matrices. Lévy matrices are the heavy-tailed analogue of Wigner matrices, where the entries are independent up to symmetry but fail to have two finite moments.
The conditions in Definition 1.2 are the same as the conditions for $\xi$ to be in the domain of attraction of an $\alpha$-stable distribution. Unlike with Wigner matrices, the natural scaling for an $n \times n$ Lévy matrix $X$ is not $\sqrt{n}$, but instead
$$a_n := \inf\{ t > 0 : n\,\mathbb{P}(|\xi| > t) \leq 1 \}. \tag{1.3}$$
For an $n \times n$ Lévy matrix $X_n$, the matrix $A_n$ in equation (1.1) will be defined as $A_n := a_n^{-1} X_n$. We will refer to $A_n$ as a normalized Lévy matrix.

1.2. Lévy-Khintchine Matrices. Jung in [24] defined a generalization of Lévy matrices. Instead of assuming the entries are in the domain of attraction of an $\alpha$-stable distribution, the entries are in the domain of attraction of any infinitely divisible distribution.
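For intuition, this normalization can be sketched numerically. In the snippet below the exact Pareto tail and the resulting closed form $a_n = n^{1/\alpha}$ are illustrative assumptions chosen so the tail equation is solvable by hand; they are not the paper's general definition.

```python
import numpy as np

# Sketch under stated assumptions: entries with an exact Pareto tail
# P(|xi| > t) = t^(-alpha) for t >= 1 lie in the domain of attraction of an
# alpha-stable law, and the tail equation n * P(|xi| > a_n) = 1 then gives
# a_n = n^(1/alpha). Both the tail and this closed form are illustrative.
rng = np.random.default_rng(1)
alpha, n = 0.5, 300
U = rng.random((n, n))
signs = rng.choice([-1.0, 1.0], size=(n, n))
X = signs * U ** (-1.0 / alpha)         # |X_ij| is Pareto(alpha)
X = np.triu(X, 1) + np.triu(X, 1).T     # symmetrize, zero diagonal
a_n = n ** (1.0 / alpha)                # equals n^2 when alpha = 1/2
A = X / a_n                             # normalized Levy matrix A_n
# A handful of huge entries dominate each row, unlike the Wigner case.
largest = np.abs(A).max()
```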
is $n \times n$, the diagonal entries of $A_n$ are 0, the non-diagonal entries are i.i.d. up to symmetry, and for all $t \in \mathbb{R}$ the entries satisfy a Lévy-Khintchine-type limit, where $m$ is a measure on $\mathbb{R}$ with $m(\{0\}) = 0$ satisfying the integrability condition (1.5).

1.3. Notation. Throughout this paper we use $\Rightarrow$ to denote weak convergence of probability measures, convergence in distribution of random variables, and vague convergence of finite measures. For an $n \times n$ real symmetric matrix $M$, the eigenvalues will always be considered in non-increasing order $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$. We define the empirical spectral measure of an $n \times n$ real symmetric matrix $M$ to be the probability measure
$$\mu_M := \frac{1}{n} \sum_{i=1}^n \delta_{\lambda_i},$$
where $\delta_x$ is the Dirac delta measure at $x$. A coupling of two probability measures $\mu_1$ and $\mu_2$ is a random tuple $(X, Y)$ such that $X$ is $\mu_1$-distributed and $Y$ is $\mu_2$-distributed. The symbol $\stackrel{d}{=}$ will be used to denote equality in distribution of random variables, and $\mathcal{L}(X)$ will be used to denote the distribution of a random variable $X$. For two complex-valued square-integrable random variables $\xi$ and $\psi$, we define the covariance between $\xi$ and $\psi$ as $\mathrm{Cov}(\xi, \psi) := \mathbb{E}\big[(\xi - \mathbb{E}\xi)\overline{(\psi - \mathbb{E}\psi)}\big]$. Throughout, we will consider Poisson point processes on $\overline{\mathbb{R}} \setminus \{0\}$, the one-point compactification of $\mathbb{R}$ with the origin removed, with some intensity measure $m$. We will consider both finite and infinite measures, so for convenience we will denote the points of this process by $\{y_i\}_{i \geq 1}$ for general $m$, where $y_i = 0$ for any $i$ greater than an appropriate (possibly identically infinite) Poisson random variable, and when considering a specific finite measure $m$ we will denote the points by $\{y_i\}_{i=1}^N$ for a Poisson random variable $N$.
For a topological space $E$, let $C_K(E)$ denote the set of real-valued continuous functions on $E$ with compact support. We will use $\mathbb{C}^+$ to denote the set of complex numbers with strictly positive imaginary part. For a probability measure $\mu$ on $\mathbb{R}$ we define the function $s_\mu : \mathbb{C}^+ \to \mathbb{C}^+$ by
$$s_\mu(z) := \int_{\mathbb{R}} \frac{1}{x - z} \, d\mu(x),$$
and refer to $s_\mu$ as the Stieltjes transform of $\mu$.
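The Stieltjes transform of an empirical spectral measure is the normalized trace of the resolvent, a standard identity used repeatedly below; the following sketch verifies it numerically.

```python
import numpy as np

# Standard identity: for the empirical spectral measure mu_M of a real
# symmetric n x n matrix M and z in C^+,
#   s_{mu_M}(z) = (1/n) tr (M - z I)^{-1}.
rng = np.random.default_rng(2)
n = 100
M = rng.standard_normal((n, n))
M = (M + M.T) / 2
z = 0.3 + 1.0j
s_resolvent = np.trace(np.linalg.inv(M - z * np.eye(n))) / n
eigs = np.linalg.eigvalsh(M)
s_spectral = np.mean(1.0 / (eigs - z))      # integral of 1/(x - z) d mu_M
assert np.isclose(s_resolvent, s_spectral)
assert s_resolvent.imag > 0                 # Stieltjes transforms map C^+ to C^+
```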
We will use asymptotic notation (O, o, Θ, etc.) under the assumption that n → ∞ unless otherwise stated.

Throughout we will assume the following.

Definition 2.1. Let $\{A_n\}_{n \geq 1}$ be a Lévy-Khintchine random matrix ensemble with characteristics $(0, b, m)$ and, for each $n$, let $L_n$ be defined by (2.1) almost surely.

Condition C1 includes, in particular, the following tail assumption:
• There exist $\varepsilon > 0$ and $C > 0$ such that the tail bound (2.4) holds for all $t > 1/4$ and for every $n \in \mathbb{N}$.
Remark 2.2. Some interesting and important examples of random matrices satisfying Condition C1 include:
(i) $A_n = a_n^{-1} X_n$ for a Lévy matrix $X_n$ with $\alpha \in (0, 1)$ and $a_n$ as defined in (1.3). In this case $m = m_\alpha$, where $m_\alpha$ is the measure on $\mathbb{R}$ whose density is determined by $\alpha$ and $\theta$ as in Definition 1.2.
(ii) The adjacency matrix $A_n$ of an Erdős-Rényi random graph $G(n, p)$ with $np \to \lambda \in (0, \infty)$. In this case the row sums of $A_n$ converge to Poisson random variables and $m = \lambda \delta_1$.
(iii) The matrix $E_n \circ X_n$, where $E_n$ is the adjacency matrix of an Erdős-Rényi random graph $G(n, p)$ with $np \to \lambda \in (0, \infty)$, $X_n$ is chosen from the Gaussian Orthogonal Ensemble (GOE), and $\circ$ is the Hadamard product of matrices. In this case $m = \lambda G_\lambda$, where $G_\lambda$ is the centered Gaussian probability measure with variance $1/\lambda$.

The first two points of Condition C1 will be important for handling the diagonal entries of $L_n$: (2.2) implies that a Poisson point process with intensity measure $m$ is almost surely summable, which is stronger than the almost sure square summability implied by (1.5), and (2.3) implies that the row sums converge to the sum of the Poisson point process with intensity measure $m$. The last point is a technical assumption needed in the proof of the main theorem given below. Heuristically, the last point of Condition C1 states that the infinitely divisible random variable $Y$ in Definition 1.3 has at least $t^{-\varepsilon}$ tail decay, and this tail assumption holds entry-wise uniformly in $n$. The assumption in (2.5) is technical and used to prove tightness of the empirical spectral measures, but it is perhaps not necessary and there may be room for refinement. The choice of $1/4$ in the final condition is arbitrary; any positive constant would be sufficient.
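Example (ii) is easy to simulate. The sketch below is an illustration of our own, not taken from the paper: it checks that the degrees of a sparse Erdős-Rényi graph concentrate near $\lambda$, consistent with the Poisson limit of the row sums.

```python
import numpy as np

# Example (ii) in code: the adjacency matrix of G(n, p) with np -> lambda has
# row sums that are approximately Poisson(lambda), matching the intensity
# measure lambda * delta_1.
rng = np.random.default_rng(3)
n, lam = 2000, 3.0
p = lam / n
upper = rng.random((n, n)) < p
A = (np.triu(upper, 1) + np.triu(upper, 1).T).astype(float)  # 0/1, no loops
L = A - np.diag(A.sum(axis=1))
degrees = A.sum(axis=1)
assert abs(degrees.mean() - lam) < 0.5     # mean degree concentrates near lambda
assert np.allclose(L @ np.ones(n), 0.0)    # row sums of the Laplacian vanish
```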

Theorem 2.3 (Eigenvalue Convergence for Laplacian Lévy-Khintchine matrices).
Let $\{A_n\}_{n \geq 1}$ be a Lévy-Khintchine random matrix ensemble with characteristics $(0, b, m)$, all defined on the same probability space, satisfying Condition C1, and for every $n \in \mathbb{N}$ let $L_n$ be defined by (2.1). Then there exists a deterministic probability measure $\mu_m$, depending only on $m$, such that almost surely $\mu_{L_n}$ converges weakly to $\mu_m$ as $n \to \infty$.
While the random matrices satisfying Condition C1 may appear very different for different $m$, a general description of $\mu_m$ is available through its Stieltjes transform and a recursive distributional equation (RDE). A recursive distributional equation is an equation of the form
$$X \stackrel{d}{=} f\big(\{X_n\}_{n=1}^\infty, \{Y_n\}_{n=1}^\infty\big),$$
where $\{X_n\}_{n=1}^\infty$ are i.i.d. copies of $X$ and $\{Y_n\}_{n=1}^\infty$ is some sequence of random variables independent from $\{X_n\}_{n=1}^\infty$. While we do not use existing results from the literature, we did find the survey [1] and the unpublished manuscript [2] helpful for better understanding RDEs and contraction arguments in proving uniqueness of solutions. We encourage the interested reader to begin there for more information on RDEs.

Theorem 2.4. Let $s_m(z) = \int_{\mathbb{R}} \frac{1}{x - z}\,d\mu_m(x)$ be the Stieltjes transform of $\mu_m$. Then for every $z \in \mathbb{C}^+$, $s_m(z) = \mathbb{E} s_\emptyset(z)$, where $s_\emptyset$ is the Stieltjes transform of a random probability measure. Moreover, the distribution of $s_\emptyset$ is the unique distribution on the space of Stieltjes transforms of probability measures satisfying the RDE (2.7), where $\{y_j\}_{j \geq 1}$ is a Poisson point process with intensity measure $m$ and $\{s_j\}$ is a collection of i.i.d. copies of $s_\emptyset$ independent from the point process.
Theorems 2.3 and 2.4 give that the limiting empirical spectral measure of L n is uniquely determined by a Poisson point process with intensity measure m.For the examples outlined in Remark 2.2 we will now give some more explicit descriptions of the corresponding point processes.
(i) For the normalized Lévy matrices of example (i) in Remark 2.2, let $E_1, E_2, \ldots$ be a sequence of independent exponential random variables with mean 1, let $\Gamma_k := E_1 + \cdots + E_k$, and let $\theta_1, \theta_2, \ldots$ be a sequence of i.i.d. random variables as in Definition 1.2. Then (see [15] Proposition 2) the collection $\{y_k\}_{k \geq 1}$ built from $\{\Gamma_k\}$ and $\{\theta_k\}$ is a Poisson point process with intensity measure $m_\alpha$.
(ii) For the Erdős-Rényi example, let $N$ be a Poisson random variable with mean $\lambda$ and set $y_k = 1$ for $k \leq N$ and $y_k = 0$ otherwise. Then $\{y_k\}_{k \geq 1}$ is a Poisson point process with intensity measure $\lambda\delta_1$.
(iii) For the very sparse GOE matrix described in example (iii) in Remark 2.2, let $Y_1, Y_2, \ldots$ be independent standard real Gaussian random variables, and let $N$ be a Poisson random variable with mean $\lambda$. Define $y_k = Y_k/\sqrt{\lambda}$ for $k \leq N$ and $y_k = 0$ otherwise. Then $\{y_k\}_{k \geq 1}$ is a Poisson point process with intensity measure $\lambda G_\lambda$. This example is explored a bit further in Theorem 2.6 below.

RDE (2.7) can be written in the equivalent form (2.8). If we consider a diagonal matrix $\tilde{D}_n$ independent from $A_n$ with independent entries $(\tilde{D}_n)_{ii} \stackrel{d}{=} (D_n)_{ii}$, and the matrix $\tilde{L}_n = A_n - \tilde{D}_n$, the work below leading up to the existence of (2.7) could be adapted in a straightforward way to arrive at a corresponding RDE (2.9) for $\tilde{L}_n$, where $\{\tilde{y}_j\}$ is an independent copy of the point process $\{y_j\}$, independent of $\{s_j\}$. For light-tailed $A_n$, Theorem 1.1 gives that the limiting spectral measure of $L_n$ is the free additive convolution of the semicircle measure and the Gaussian measure. This is the same limiting spectral measure as for $A_n - \tilde{D}_n$ with $A_n$ independent of $\tilde{D}_n$. In contrast, the differences between equations (2.8) and (2.9) suggest that for Lévy-Khintchine $A_n$, the dependence between $A_n$ and $D_n$ can be seen in the limiting measure $\mu_m$.
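The finite-intensity constructions in (ii) and (iii) can be sketched in a few lines, using the standard fact that a Poisson point process with finite intensity $\lambda\nu$, $\nu$ a probability measure, consists of $\mathrm{Pois}(\lambda)$ i.i.d. draws from $\nu$.

```python
import numpy as np

# Sketch of the point processes in (ii) and (iii): for a finite intensity
# measure m = lambda * nu with nu a probability measure, a Poisson point
# process with intensity m is N ~ Poisson(lambda) i.i.d. draws from nu.
rng = np.random.default_rng(4)
lam = 3.0

# (ii) intensity lambda * delta_1: Poisson(lambda)-many points, all equal to 1.
N = rng.poisson(lam)
y_er = np.ones(N)

# (iii) intensity lambda * G_lambda with G_lambda = N(0, 1/lambda):
# Poisson(lambda)-many i.i.d. N(0, 1/lambda) points.
N2 = rng.poisson(lam)
y_goe = rng.standard_normal(N2) / np.sqrt(lam)

assert np.all(y_er == 1.0) and len(y_er) == N
assert y_goe.shape == (N2,)
```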
2.1. Outline. In Sections 3 and 4 we define local convergence for operators on $\ell^2(V)$ for a countable set $V$ and use the measure $m$ to build a random operator $L$. In Section 5 we show that $L_n$ converges locally in distribution to $L$, and then in Section 6 we upgrade this to convergence of the empirical spectral measures.
Finally in Section 7 we show the Stieltjes transform of the limiting empirical spectral measure can be described as the expected value of the unique solution to (2.7).In the appendices we prove almost sure tightness of the collection {µ Ln } n≥1 and list some technical lemmas.We end this section with two corollaries of Theorem 2.4.
The first is a continuity result for the map m → µ m .In the second we use (2.7) to recover the free convolution of a semicircle and a standard Gaussian measure from the limiting empirical measure of very sparse random matrices.

2.2. Corollaries of Theorem 2.4. The first corollary of Theorem 2.4 concerns continuity of the mapping $m \mapsto \mu_m$, where $\mu_m$ is the limiting measure of Theorem 2.3. Uniqueness of the solution to the RDE in Theorem 2.4 is crucial to the proof of Corollary 2.5 below.
Corollary 2.5. Suppose the measures $m_1, m_2, \ldots$ and $m_\infty$ satisfy (2.10) and, for any $\varepsilon > 0$, (2.11), where for each $n \in \mathbb{N}$, $\{y_j^{(n)}\}_{j \geq 1}$ is a Poisson point process with intensity measure $m_n$. Then $\mu_{m_n}$ converges weakly to $\mu_{m_\infty}$ as $n \to \infty$, where $\mu_{m_1}, \mu_{m_2}, \ldots$ and $\mu_{m_\infty}$ are the deterministic limiting measures described in Theorem 2.3 for Lévy-Khintchine random matrix ensembles with characteristics $(0, b, m_1), (0, b, m_2), \ldots$ and $(0, b, m_\infty)$ respectively.

Proof. Let $s_n$ be the Stieltjes transform of $\mu_{m_n}$. Let $\{\mu_{m_{n_k}}\}_{n_k}$ be a subsequence of $\{\mu_{m_n}\}_n$, and let $r_{n_k}(z)$ be the random Stieltjes transforms solving RDE (2.7) for the measures $m_{n_k}$. From Lemma B.7 it follows that $\{r_{n_k}\}$ is tight in the space of analytic functions on $\mathbb{C}^+$ with the topology of uniform convergence on compact subsets, and we pass to a further subsequence $n_k'$ converging to another random analytic function $r(z)$. As $\{r_{n_k}\}$ is almost surely uniformly bounded on compact subsets, it follows that $r$ is almost surely bounded on compact subsets. For any fixed $z \in \mathbb{C}^+$, it follows by the dominated convergence theorem that $\lim_{k \to \infty} \mathbb{E} r_{n_k'}(z) = \mathbb{E} r(z)$. Corollary 2.5 then follows if $r$ is a random Stieltjes transform solving RDE (2.7) corresponding to $m_\infty$. To this end, let $\{\Pi_n\}_{n=1}^\infty$ be Poisson random measures with intensity measures $m_n$. For a positive function $f \in C_K(\overline{\mathbb{R}} \setminus \{0\})$, $1 - e^{-f(x)}$ is also a continuous function with compact support (2.12). It follows from Theorems 5.1 and 5.2 in [28] that $\Pi_n$ converges in distribution to $\Pi_\infty$. For $n \in \mathbb{N}$ let $\{y_j^{(n)}\}_{j \geq 1}$ be the points of the process $\Pi_n$ and $\{y_j\}_{j \geq 1}$ the points of the process $\Pi_\infty$. The points may be ordered such that for every $j \in \mathbb{N}$, $y_j^{(n)}$ converges in distribution to $y_j$ (see Section 2 of [15] for more details). In fact, from (2.10) and Lemma 1 of [15], these random variables may be coupled with $r$ on a single probability space such that all the above convergences in distribution are almost sure, and for independent copies $r_1, r_2, \ldots$ of $r$ one obtains (2.14), where the last equality follows from (2.13).
Thus $r$ is an analytic solution to RDE (2.7). From (2.14) and the almost sure boundedness of $r$ on compact subsets of $\mathbb{C}^+$, it follows that the required normalization holds almost surely, and thus $r$ is almost surely the Stieltjes transform of a probability measure. From (2.11) and the uniqueness of the solution to RDE (2.7), it follows that $\mathbb{E}r(z) = s_\infty(z)$ for any $z \in \mathbb{C}^+$. As the subsequence $n_k$ was arbitrary, it follows that $s_n$ converges pointwise to $s_\infty$ and $\mu_{m_n}$ converges weakly to $\mu_{m_\infty}$ as $n \to \infty$.
Theorem 2.6 below considers the $\lambda \to \infty$ limit of example (iii) in Remark 2.2. The limiting measure is the same limiting measure found in Theorem 1.1. The works of Jiang [22] and Chatterjee and Hazra [13] established Theorem 1.1 for sparse random matrices where the expected number of nonzero entries in a row tends to infinity with the size of the matrix. Theorem 2.6, when combined with Theorem 2.3 and Remark 2.2 (iii), can then be interpreted as splitting the limit: first $n \to \infty$, and then the expected number of nonzero entries tends to infinity.
Theorem 2.6. Let $G_\lambda$ denote the Gaussian probability measure with mean 0 and variance $1/\lambda$, and let $m_\lambda = \lambda G_\lambda$. If $\mu_{m_\lambda}$ is the deterministic limiting probability measure from Theorem 2.3, then $\mu_{m_\lambda}$ converges weakly to the free convolution of the semicircle distribution and the standard real Gaussian distribution, as $\lambda \to \infty$.
Proof. Denote the free convolution of a standard semicircle measure and standard Gaussian measure by $\mathrm{SC} \boxplus G_1$. It is known [5] that the Stieltjes transform $s_{\mathrm{fc}}$ of $\mathrm{SC} \boxplus G_1$ can be defined as the unique solution to a certain self-consistent equation. By Theorem 2.4, the Stieltjes transform $r_\lambda$ of $\mu_{m_\lambda}$ satisfies RDE (2.7) with $N \sim \mathrm{Pois}(\lambda)$, $\{y_j\}_{j=1}^\infty$ i.i.d. Gaussian random variables with mean zero and variance $1/\lambda$, and $\{r_j\}_{j=1}^\infty$ i.i.d. copies of $r_\lambda$, independent of the collection $\{y_j\}_{j=1}^\infty$. We will instead use the equivalent recursive distributional equation in which $\{y_j\}_{j=1}^\infty$ are i.i.d. standard real Gaussian random variables. Fix $z \in \mathbb{C}^+$. We first consider the sum $S_\lambda := \lambda^{-1/2} \sum_{j=1}^N y_j$, where here and throughout the proof asymptotic notation is as $\lambda \to \infty$. Thus $S_\lambda$ converges to a standard real Gaussian random variable as $\lambda \to \infty$.
We will compare the sum $\frac{1}{\lambda}\sum_{j=1}^N r_j(z) y_j^2$ with simpler sums, splitting on the events $A_{j,\lambda} = \{|y_j| \geq \sqrt{\lambda}\,\mathrm{Im}(z)/2\}$, with $\mathbf{1}_{A_{j,\lambda}}$ the indicator of $A_{j,\lambda}$. We will now show both pieces of this bound converge in probability to zero. From Lemma B.2 and standard tail estimates for Gaussian random variables we have that $\mathbb{P}(A_{j,\lambda}) \leq C e^{-c\lambda}$ for some positive constants $C, c > 0$ independent of $\lambda$. Next we compare to the sum $\frac{1}{\lambda}\sum_{j=1}^N (\mathbb{E}r_\lambda(z)) y_j^2$. To this end let $Z_j = r_j(z) y_j^2 - (\mathbb{E}r_\lambda(z)) y_j^2$, and consider the Taylor expansion of the characteristic function of the real part of $\frac{1}{\lambda}\sum_{j=1}^N Z_j$. An identical argument applies to the imaginary part, and we see that $\frac{1}{\lambda}\sum_{j=1}^N Z_j$ converges in distribution to 0. Since this limit is a constant, we may conclude the joint convergence in distribution of the two sums, where $Y$ is a standard Gaussian random variable. Let $\{\lambda_n\}_{n=1}^\infty$ be an arbitrary increasing sequence of positive real numbers going to infinity and let $\{\lambda_{n_k}\}$ be an arbitrary subsequence. From Lemma B.7, $\{r_{\lambda_{n_k}}\}_{n_k}$ is tight as a family of random analytic functions on $\mathbb{C}^+$ with the topology of uniform convergence on compact subsets, and thus there exists a further subsequence $\lambda_{n_{k'}}$ such that $r_{\lambda_{n_{k'}}}(z) \to r(z)$ for some random analytic function $r$. Fix $z \in \mathbb{C}^+$; it follows from the dominated convergence theorem that $\mathbb{E}r_{\lambda_{n_{k'}}}(z) \to \mathbb{E}r(z) =: \bar{r}(z)$ for some deterministic limit $\bar{r}(z)$. As $z \in \mathbb{C}^+$ was arbitrary, it follows from the above convergence in distribution and the continuous mapping theorem that the self-consistent equation holds pointwise on $\mathbb{C}^+$. Thus $\bar{r}(z) = s_{\mathrm{fc}}(z)$ along every one of these further subsequences of $\{\lambda_{n_k}\}$, and $s_\lambda(z) = \mathbb{E}r_\lambda(z) \to s_{\mathrm{fc}}(z)$. By Lemma B.6 this pointwise convergence of the Stieltjes transforms implies $\mu_{m_\lambda}$ converges weakly to $\mathrm{SC} \boxplus G_1$ as $\lambda \to \infty$.
The matrix $X_n$ in Remark 2.2 (iii) has Gaussian entries, and for convenience we stated Theorem 2.6 for the corresponding measure $\lambda G_\lambda$. However, the proof can be adapted in a straightforward way to the analogous measures corresponding to $X_n$ from Remark 2.2 (iii) having entries with mean zero, variance $1/\lambda$, and three finite moments.
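As an informal numerical companion to Theorem 2.6, one can compare the finite-$n$ spectra of the sparse Hadamard model and a dense Wigner Laplacian at large $\lambda$. The loose check below is our own illustration and proves nothing; it only confirms that both spectra live on the same scale.

```python
import numpy as np

# Crude illustrative check: for large lambda, the Laplacian of the sparse
# model E_n * X_n (Hadamard product) from Remark 2.2 (iii) should have a
# spectrum resembling that of a dense Wigner Laplacian, consistent with the
# lambda -> infinity statement of Theorem 2.6.
rng = np.random.default_rng(6)
n, lam = 1000, 50.0
G = rng.standard_normal((n, n)) / np.sqrt(lam)       # entries ~ N(0, 1/lambda)
E = (rng.random((n, n)) < lam / n).astype(float)     # sparse Bernoulli mask
X = E * G                                            # Hadamard product
A_sparse = np.triu(X, 1) + np.triu(X, 1).T
L_sparse = A_sparse - np.diag(A_sparse.sum(axis=1))

W = rng.standard_normal((n, n)) / np.sqrt(n)
A_dense = np.triu(W, 1) + np.triu(W, 1).T
L_dense = A_dense - np.diag(A_dense.sum(axis=1))

e_sparse = np.linalg.eigvalsh(L_sparse)
e_dense = np.linalg.eigvalsh(L_dense)
# Both spectra live on the same O(1) scale (a loose check only).
assert np.abs(e_sparse).max() < 12 and np.abs(e_dense).max() < 12
```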

Acknowledgment
The first author thanks Yizhe Zhu for pointing out reference [29].

Operators on $\ell^2(V)$
Let $V$ be a countable set and let $\ell^2(V)$ denote the Hilbert space defined by the inner product
$$\langle \phi, \psi \rangle := \sum_{u \in V} \phi(u)\overline{\psi(u)},$$
and for $u \in V$ let $\delta_u$ be the unit vector supported on $u$. Let $\mathcal{D}(V)$ denote the dense subset of $\ell^2(V)$ of vectors with finite support. Let $(w_{uv})_{u,v \in V}$ be a collection of real numbers with $w_{uv} = w_{vu}$ such that for all $u \in V$,
$$\sum_{v \in V} w_{uv}^2 < \infty.$$
We then define a symmetric linear operator $A$ with domain $\mathcal{D}(V)$ by
$$\langle \delta_u, A\delta_v \rangle = w_{uv}. \tag{3.1}$$
For any $u, v \in V$ we say that $(A_n, u)$ converges locally to $(A, v)$, and write $(A_n, u) \to (A, v)$, if there exists a sequence of bijections $\sigma_n : V \to V$ such that $\sigma_n(v) = u$ and, for all $\phi \in \mathcal{D}(V)$, $\sigma_n^{-1} A_n \sigma_n \phi \to A\phi$ in $\ell^2(V)$. Here we use $\sigma_n$ for both the bijection on $V$ and the corresponding linear isometry defined in the obvious way. This notion of convergence is useful for random matrices for two reasons. First, we will make a choice of how to define the action of an $n \times n$ matrix on $\ell^2(V)$, and the bijections $\sigma_n$ help ensure the choice of location for the support of the matrix does not matter. Second, local convergence also gives convergence of the resolvent operator at the distinguished points $u, v \in V$. This comes down to the fact that local convergence is strong operator convergence, up to the isometries. See [8] for details.
Theorem 3.2. If $\{A_n\}_{n=1}^\infty$ and $A$ are self-adjoint operators such that $(A_n, u)$ converges locally to $(A, v)$ for some $u, v \in V$, then, for all $z \in \mathbb{C}^+$,
$$\langle \delta_u, (A_n - z)^{-1} \delta_u \rangle \to \langle \delta_v, (A - z)^{-1} \delta_v \rangle$$
as $n \to \infty$.
To apply this to random operators, we say that $(A_n, u) \to (A, v)$ in distribution if there exists a sequence of random bijections $\sigma_n$ such that $\sigma_n^{-1} A_n \sigma_n \phi \to A\phi$ in distribution for every $\phi \in \mathcal{D}(V)$.

Poisson weighted infinite tree
Let $\rho$ be a positive Radon measure on $\overline{\mathbb{R}} \setminus \{0\}$. $\mathrm{PWIT}(\rho)$ is the random infinite weighted rooted tree defined as follows. The vertex set of the tree is identified with $\mathbb{N}^f := \bigcup_{k \in \mathbb{N} \cup \{0\}} \mathbb{N}^k$ by indexing the root as $\mathbb{N}^0 = \emptyset$, the offspring of the root as $\mathbb{N}$, and, more generally, the offspring of some $v \in \mathbb{N}^k$ as $(v1), (v2), \ldots \in \mathbb{N}^{k+1}$. Define $T$ as the tree on $\mathbb{N}^f$ with edges between parents and offspring. Let $\{\Xi_v\}_{v \in \mathbb{N}^f}$ be independent realizations of a Poisson point process with intensity measure $\rho$. Let $\Xi_\emptyset = \{y_1, y_2, \ldots\}$ be ordered such that $|y_1| \geq |y_2| \geq \cdots$, with the convention $y_i = 0$ for all $i$ large enough if $\rho(\overline{\mathbb{R}} \setminus \{0\}) < \infty$, and assign the weight $y_i$ to the edge between $\emptyset$ and $i$, assuming such an ordering is possible. More generally, assign the weight $y_{vi}$ to the edge between $v$ and $vi$, where $\Xi_v = \{y_{v1}, y_{v2}, \ldots\}$ is ordered in the same way. For a measure $m$ on $\overline{\mathbb{R}} \setminus \{0\}$ satisfying (1.5) and a realization of $\mathrm{PWIT}(m)$, define the linear operator $A$ on $\mathcal{D}(\mathbb{N}^f)$ by the formulas
$$\langle \delta_v, A\delta_{vi} \rangle = \langle \delta_{vi}, A\delta_v \rangle = y_{vi} \tag{4.1}$$
and $\langle \delta_v, A\delta_u \rangle = 0$ otherwise. From (1.5) one can see that the points in $\Xi_v$ are almost surely square summable for every $v \in \mathbb{N}^f$, and thus $A$ is a well-defined linear operator on $\mathcal{D}(\mathbb{N}^f)$, though it is possibly unbounded on $\ell^2(\mathbb{N}^f)$.
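For a finite intensity measure the PWIT can be sampled directly. The sketch below is our own illustration: it grows a depth-truncated tree for $m = \lambda\delta_1$ (example (ii)), where offspring counts are $\mathrm{Pois}(\lambda)$ and all edge weights equal 1.

```python
import numpy as np

# Sketch: a depth-truncated realization of PWIT(m) for the finite intensity
# measure m = lambda * delta_1. Vertices are tuples encoding genealogy from
# the root (); each vertex has Poisson(lambda) offspring, and every edge
# weight equals 1 because all points of m sit at 1.
rng = np.random.default_rng(5)
lam, depth = 3.0, 3

def grow(v, d, tree, weights):
    if d == 0:
        tree[v] = []
        return
    kids = [v + (k,) for k in range(1, rng.poisson(lam) + 1)]
    tree[v] = kids
    for child in kids:
        weights[(v, child)] = 1.0   # edge weight from the point process
        grow(child, d - 1, tree, weights)

tree, weights = {}, {}
grow((), depth, tree, weights)
assert () in tree
assert all(w == 1.0 for w in weights.values())
```

For a general finite $m$ one would instead draw $\mathrm{Pois}(m(\overline{\mathbb{R}} \setminus \{0\}))$ i.i.d. weights from the normalized measure at each vertex.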

4.1. Poisson weighted infinite tree with loops. The Poisson weighted infinite tree has been utilized in [7,8,9,11,24] to study the empirical spectral distribution of heavy-tailed random matrices by showing the random matrices converge to the operator defined by (4.1) for an appropriate measure $m$. One key feature of those matrices is that the diagonal elements are negligible when compared to the largest entries in a row or column. This will not be the case for the Laplacian matrix $L_n$; thus we will need to define an operator on a slightly modified graph.
Let $m$ be a measure on $\overline{\mathbb{R}} \setminus \{0\}$ satisfying the summability condition (2.2). Define the Poisson weighted infinite tree with loops $\mathrm{PWITL}(m)$ as the random weighted graph with vertex set $\mathbb{N}^f$ and edge set $E \cup \bigcup_{v \in \mathbb{N}^f} \{v, v\}$, where $E$ is the edge set of $\mathrm{PWIT}(m)$. The weights on edges in $E$ of $\mathrm{PWITL}(m)$ are the weights on edges in $E$ of $\mathrm{PWIT}(m)$, while the weight on a loop $\{v, v\}$ is
$$y_{vv} := -\Big( y_{ul} + \sum_{k=1}^\infty y_{vk} \Big),$$
where $ul = v$ if $v$ is not $\emptyset$, and the weight on $\{\emptyset, \emptyset\}$ is
$$y_{\emptyset\emptyset} := -\sum_{k=1}^\infty y_k.$$
Define the operator $L$ on $\mathcal{D}(\mathbb{N}^f)$ from the weights of $\mathrm{PWITL}(m)$ as in (4.1), together with $\langle \delta_v, L\delta_v \rangle = y_{vv}$, and $\langle \delta_v, L\delta_u \rangle = 0$ otherwise. In this case we say $L$ is the operator associated to $\mathrm{PWITL}(m)$.
We will show the sequence $\{(L_n, 1)\}_{n \geq 1}$ converges locally in distribution to $(L, \emptyset)$, where $L$ is the linear operator on $\ell^2(\mathbb{N}^f)$ associated to $\mathrm{PWITL}(m)$.
4.2. Self-adjointness. In this section we review and apply a criterion established by Bordenave, Caputo, and Chafaï in [7] for unbounded operators to be essentially self-adjoint. There are two minor issues which prevent immediately applying their results to the operator $L$ associated to $\mathrm{PWITL}(m)$. First, they consider operators whose skeletons are trees, and not trees with loops. This is easy to overcome. The second obstacle is that, in the application of the criterion, they consider only point processes associated to $\alpha$-stable distributions and not more general infinitely divisible distributions. This is overcome by the establishment of Lemma B.1.

Proposition 4.1 (Lemma A.3 in [7]). Let $A$ be a linear operator on $\ell^2(\mathbb{N}^f)$ defined by (3.1). We say $u \sim v$ if $u = v$, $u = vk$, or $v = uk$ for some $k \in \mathbb{N}$. Assume $w_{uv} = 0$ if $u \not\sim v$. Suppose there exists a constant $\kappa > 0$ and a sequence of finite connected subsets $S_n \subset \mathbb{N}^f$ such that $S_n \subset S_{n+1}$, $\mathbb{N}^f = \bigcup_{n \in \mathbb{N}} S_n$, and for every $n$ and $v \in S_n$,
$$\sum_{u \notin S_n : u \sim v} w_{uv}^2 \leq \kappa. \tag{4.6}$$
Then $A$ is essentially self-adjoint.

Proof. Proposition 4.1 is not stated identically to Lemma A.3 in [7]; however, the only added assumption is that vertices are connected to themselves, so that the graph of the skeleton of $A$ is not a tree. The step in the proof given in [7] which uses the tree structure is the fact that if $v \in S_n$, $u \sim v$, $u \notin S_n$, and $v' \in S_n \setminus \{v\}$, then $u \not\sim v'$, which is also true for a tree with loops.

Proposition 4.2. The operator $L$ associated to $\mathrm{PWITL}(m)$ is almost surely essentially self-adjoint.

Proof. While Proposition 4.2 may not initially appear to be Proposition A.2 in [7], the proofs are identical, as Lemma A.4 in [7] is extended to the setting considered here in Lemma B.1 below.

Local convergence for the Laplacian of Lévy-Khintchine matrices
For an $n \times n$ matrix $M$, extend $M$ to a bounded operator on $\ell^2(\mathbb{N}^f)$ as follows. For $1 \leq i, j \leq n$, let $\langle \delta_i, M\delta_j \rangle = M_{ij}$, and let $\langle \delta_u, M\delta_v \rangle = 0$ otherwise.

Theorem 5.1. Let $L_n$ be the matrix defined by (2.1) for $\{A_n\}_{n \geq 1}$ a Lévy-Khintchine random matrix ensemble satisfying Condition C1, and let $L$ be the linear operator on $\ell^2(\mathbb{N}^f)$ associated to $\mathrm{PWITL}(m)$. Then, in distribution, $(L_n, 1) \to (L, \emptyset)$ as $n \to \infty$.
The rest of this section is devoted to the proof of Theorem 5.1. Before considering $(L_n, 1)$ we begin by showing that $(A_n, 1)$ converges to $(L + D, \emptyset)$, where $D$ is a diagonal operator. This follows from the work of Jung in [24]; we include the proof to establish notation and for the convenience of the reader. We define a network as a graph with edge weights taking values in some normed space. To begin, let $G_n$ be the complete network, without loops, on $\{1, \ldots, n\}$ whose weight on edge $\{i, j\}$ equals $\xi_{ij}^n$ for some collection $(\xi_{ij}^n)_{1 \leq i < j \leq n}$ of random variables taking values in some normed space. Now consider the rooted network $(G_n, 1)$ with the distinguished vertex 1. For any realization $(\xi_{ij}^n)$, and for any $B, H \in \mathbb{N}$ such that $(B^{H+1} - 1)/(B - 1) \leq n$, we will define a finite rooted subnetwork $(G_n, 1)_{B,H}$ of $(G_n, 1)$ whose vertex set coincides with a $B$-ary tree of depth $H$. To this end we partially index the vertices of $(G_n, 1)$ as elements of $J_{B,H}$, the vertex set of the $B$-ary tree of depth $H$, the indexing being given by an injective map $\sigma_n$ from $J_{B,H}$ to $V_n := \{1, \ldots, n\}$. We set $I_\emptyset := \{1\}$, and the index of the root is $\sigma_n(\emptyset) = 1$. The vertex labeled $(k)$ is the index $j$ such that $\xi_{1j}^n$ has the $k$-th largest norm value among $\{\xi_{1j}^n, j \neq 1\}$, ties being broken by lexicographic order. This defines the first generation, and we let $I_1$ be the union of $I_\emptyset$ and this generation. If $H \geq 2$, repeat this process for the vertex labeled $(1)$ on $V_n \setminus I_1$ to order $\{\xi_{(1)j}^n\}_{j \in V_n \setminus I_1}$ to get $\{11, 12, \ldots, 1B\}$. Define $I_2$ to be the union of $I_1$ and this new collection. Repeat again for $(2), (3), \ldots, (B)$ to get the second generation, and so on. Call this vertex set $V_n^{B,H} = \sigma_n(J_{B,H})$.
(To help keep track of notation in this section, note that $v = \sigma_n(w) \in V_n$ if $w \in J_{B,H}$.) For a realization $T$ of $\mathrm{PWITL}(m)$, recall we assign the weight $y_{vk}$ to the edge $\{v, vk\}$ and the weight $y_{vv}$ to the edge $\{v, v\}$. Then $(T, \emptyset)$ is a rooted network. Call $(T, \emptyset)_{B,H}$ the finite rooted subnetwork obtained by restricting $(T, \emptyset)$ to the vertex set $J_{B,H}$ and the edge set without the loops. If an edge is not present in $(T, \emptyset)_{B,H}$, assign it the weight 0. We say a sequence $(G_n, 1)_{B,H}$, for fixed $B$ and $H$, converges in distribution, as $n \to \infty$, to $(T, \emptyset)_{B,H}$ if the joint distribution of the weights converges weakly. Let $\xi_{ij}^n := L_{ij}$, where $L_{ij}$ is the $ij$-th entry of $L_n$ for $1 \leq i < j \leq n$. We aim to show, with this choice of weights $(\xi_{ij}^n)_{1 \leq i < j \leq n}$, that for fixed $B, H$ the networks $(G_n, 1)_{B,H}$ converge weakly to $(T, \emptyset)_{B,H}$.
Order the elements of $J_{B,H}$ lexicographically, writing $w \prec v$ when $w$ strictly precedes $v$. At every step of the indexing procedure we order the weights of neighboring edges not already considered at a previous step; the indices already used before step $v$ are $\bigcup_{w \prec v} O_w$, where $O_w$ denotes the collection of indices chosen at step $w$ and $w \prec v$ must be strict in this union. Note that by independence, Proposition B.4 still holds if one takes the sum of Dirac measures at the random variables over $\{1, \ldots, n\} \setminus I$ for any fixed finite set $I$. Thus by Proposition B.4 the weights from a fixed parent to its offspring in $(G_n, 1)_{B,H}$ converge weakly to those of $(T, \emptyset)_{B,H}$. By independence we can extend this to joint convergence. Recall $(G_n, 1)_{B,H}$ is a complete graph and not a tree with loops. Thus it remains to show the edges in $(G_n, 1)_{B,H}$ which were not considered in the sorting procedure converge to 0. This was shown for heavy-tailed weights in [7] and for more general Lévy-Khintchine weights in [24].
Let $L$ be the operator associated to $\mathrm{PWITL}(m)$. For fixed $B, H$ let $\sigma_n^{B,H}$ be the map $\sigma_n$ above associated to $(G_n, 1)_{B,H}$, and arbitrarily extend $\sigma_n^{B,H}$ to a bijection on $\mathbb{N}^f$, where $V_n$ is considered in the natural way as a subset of the offspring of $\emptyset$. From the Skorokhod representation theorem we may assume $(G_n, 1)_{B,H}$ converges almost surely to $(T, \emptyset)_{B,H}$. Thus there are sequences $B_n, H_n$ tending to infinity and $\tilde{\sigma}_n := \sigma_n^{B_n,H_n}$ such that for any pair $v, w \in \mathbb{N}^f$ with $w \neq v$, $\xi^n_{\tilde{\sigma}_n(v), \tilde{\sigma}_n(w)}$ converges almost surely to
$$\begin{cases} y_{vk}, & \text{if } w = vk \text{ for some } k, \\ y_{wk}, & \text{if } v = wk \text{ for some } k, \\ 0, & \text{otherwise.} \end{cases}$$

Thus for any $\phi \in \mathcal{D}(\mathbb{N}^f)$, the off-diagonal part of $\tilde{\sigma}_n^{-1} L_n \tilde{\sigma}_n \phi$ converges almost surely. We now consider the diagonal elements. Let $u \in \mathbb{N}^f$ and let $B = H = k$ for some $k \in \mathbb{N}$ such that $u \in J_{k,k}$. From the above we know the almost sure limits of the off-diagonal weights incident to $\tilde{\sigma}_n(u)$. From linearity it then suffices to show that the diagonal entries converge to the loop weights $y_{uu}$ for every $u \in \mathbb{N}^f$; this follows from the uniform summability in Condition C1. This completes the proof of Theorem 5.1.
We will need the following extension of Theorem 5.1.
Theorem 5.2. Let $L_n$ be the matrix defined by (2.1) for $\{A_n\}_{n \geq 1}$ a Lévy-Khintchine random matrix ensemble satisfying Condition C1. If $L$ and $L'$ are two independent copies of the linear operator on $\ell^2(\mathbb{N}^f)$ associated to $\mathrm{PWITL}(m)$, then, in distribution, $\big((L_n, 1), (L_n, 2)\big) \to \big((L, \emptyset), (L', \emptyset)\big)$.

Proof. Using Proposition 2.6 in [7] and the arguments above, we can construct isometries under which the convergence holds almost surely. The result then follows by linearity.

Theorem 6.1. With the notation above, $\lim_{n \to \infty} \mathbb{E}s_{L_n}(z) = \mathbb{E}s_\emptyset(z)$ for every $z \in \mathbb{C}^+$.
Proof. For $z \in \mathbb{C}^+$ we define the resolvent operators $R_n(z) := (L_n - z)^{-1}$ and $R(z) := (L - z)^{-1}$. From Proposition 4.2, $L$ is self-adjoint with probability 1. Thus from Theorem 3.2 and Theorem 5.1, $R_n(z)_{11}$ converges in distribution to $R(z)_{\emptyset\emptyset}$. For every $z \in \mathbb{C}^+$, $R_n(z)_{11}$ and $R(z)_{\emptyset\emptyset}$ are bounded, thus $\mathbb{E}R_n(z)_{11} \to \mathbb{E}R(z)_{\emptyset\emptyset}$. By definition $s_\emptyset(z) = R(z)_{\emptyset\emptyset}$, while it is clear from the matrix-of-cofactors method of inversion and the exchangeability of the matrix entries that $\mathbb{E}s_{L_n}(z) = \mathbb{E}R_n(z)_{11}$. This completes the proof.

We know from Theorem 6.1 that for all $z \in \mathbb{C}^+$,
$$\lim_{n \to \infty} \mathbb{E}s_{L_n}(z) = \mathbb{E}s_\emptyset(z). \tag{6.10}$$
We now upgrade this to almost sure convergence of $s_{L_n}(z)$ to $\mathbb{E}s_\emptyset(z)$. By the exchangeability of the matrix entries, the variance of $s_{L_n}(z)$ is controlled by the covariance of $R_n(z)_{11}$ and $R_n(z)_{22}$. From Theorems 3.2 and 5.2 we know $R_n(z)_{11}$ and $R_n(z)_{22}$ are asymptotically independent random variables bounded uniformly in $n$, and thus asymptotically uncorrelated. From this the variance of $s_{L_n}(z)$ vanishes, and $s_{L_n}(z)$ converges almost surely to $\mathbb{E}s_\emptyset(z)$. Taking $\mu_m = \mathbb{E}\mu_\emptyset$ completes the proof of Theorem 2.3.

Proof of Theorem 2.4
We will follow the approach of [7] and take advantage of the tree structure on N^f to arrive at (2.7) before proving uniqueness. Let L be the operator associated to PWIT_L(m); we have already seen that s_m(z) = E s_∅(z). We now decompose the operator L into the root row and the subtree operators {L_k}_{k≥1}, with all other entries, for any other combination of u, v ∈ N^f, equal to zero. Under this decomposition {L_k}_{k≥1} is a collection of i.i.d. random operators, each equal in distribution, up to an isometry, to L. For convenience define the decoupled operator L̃ and its resolvent R̃(z), and denote R_{uv}(z) := ⟨δ_u, R(z)δ_v⟩ and R̃_{uv}(z) := ⟨δ_u, R̃(z)δ_v⟩. Note R̃_{∅∅}(z) = −z^{−1}, R̃_{kl}(z) = 0 for all k, l ∈ N with k ≠ l, and R̃_{∅k}(z) = 0 = R̃_{k∅}(z) for all k ∈ N. From (7.5) one immediately gets a resolvent identity; rearranging, we arrive at a recursion for R_{∅∅}(z), and a similar computation applies at δ_k. Noting y_{∅∅} = −∑_{j=1}^∞ y_j gives (2.7). Note that for j ∈ N, R̃_{jj}(z) depends only on z and L_j, and hence {R̃_{jj}(z)}_{j∈N} is a collection of i.i.d. random variables independent of {y_j}_{j∈N}.

7.1. Uniqueness. In this section we prove uniqueness of the solution to (2.7) from Theorem 2.4. While the argument is technical, its core is a contraction approach. We will show that the map T defined below in (7.10) contracts, in an appropriate metric, two fixed points belonging to a nice subset of all probability measures on the space of Stieltjes transforms. We then extend this result to any two potential fixed points by moving from this metric to a functional separating distinct points.
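The algebraic backbone of the rearrangement just described is the standard block-inversion (Schur complement) identity; the identification of the blocks with the PWIT data below is our paraphrase of the computation, not a quotation of (7.5):

```latex
% Standard block-inversion (Schur complement) identity: for
% M = \begin{pmatrix} a & b^{T} \\ b & D \end{pmatrix}
% with D invertible and a - b^{T} D^{-1} b \neq 0,
\[
  (M^{-1})_{11} = \bigl(a - b^{T} D^{-1} b\bigr)^{-1}.
\]
% Heuristic identification with the PWIT data (our paraphrase):
% a = y_{\emptyset\emptyset} - z, b = (y_k)_{k \ge 1}, D = \widetilde{L} - z,
% whose resolvent \widetilde{R}(z) is diagonal across the subtrees, giving
\[
  R_{\emptyset\emptyset}(z)
  = \Bigl( y_{\emptyset\emptyset} - z
      - \sum_{k \ge 1} y_k^{2}\, \widetilde{R}_{kk}(z) \Bigr)^{-1}.
\]
```

The vanishing of the off-diagonal entries R̃_{kl}(z) noted above is what collapses the quadratic form b^T D^{-1} b to a single sum over the offspring of the root.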
Let S be the set of Stieltjes transforms of probability measures on R and P(S) the set of probability measures on S. Define T : P(S) → P(S) as follows: for µ ∈ P(S), {s_j} are i.i.d. with distribution µ, {y_j}_{j≥1} is a Poisson point process with a fixed intensity measure m independent of the collection {s_j}_{j≥1}, N is a Poisson random variable with mean m(R) such that y_j = 0 if j > N, and L(X) denotes the law of a random variable X. Thus the distribution of s_∅ is a fixed point of T, and we aim to show it is the unique fixed point. The notion of distance for which T contracts fixed points will involve the infimum over all couplings of these fixed-point measures. Let µ_1, µ_2 ∈ P(S) be two fixed points of T and let (s(z), r(z)) be an arbitrary coupling of µ_1 and µ_2. Additionally let µ_r and µ_s be the random probability measures on R defined uniquely for all z ∈ C+. For now we will assume there exists M ∈ N such that, almost surely, µ_r and µ_s have half their mass in [−M, M]; this assumption will be removed later. As r and s are analytic functions on the upper half-plane we will consider them only on the box C_M, where f_m is a positive increasing function on N such that f_m(M) → ∞ as M → ∞, to be chosen later to satisfy (7.17). To handle the denominator we will consider separately the points where Re(r_j(z)y_j) is small and the few points where Im(r_j(z)y_j) is large. Let m̂ be equal to m with support restricted to [−f_m(M)/2, f_m(M)/2] and m̃ := m − m̂. Decompose the point process {y_j}_{j=1}^N into two independent Poisson point processes {ŷ_j}_{j=1}^{N̂} and {ỹ_j}_{j=1}^{Ñ} with intensity measures m̂ and m̃ respectively. We will divide the sum in (7.13) into two sums over these point processes. To begin, note that for z ∈ C_M, |s_j(z)ŷ_j − 1||r_j(z)ŷ_j − 1| ≥ 1/4, and thus a bound follows where ε > 0 is from (2.4), and C, C′ > 0 are constants which depend only on the measure m. The final equality follows from Lemma B.2.
Finally, combining (7.14) and (7.15) gives (7.16). Notice this coefficient is independent of the coupling and depends only on M, f_m, and m. From the definition of C_{z,M} we have that C_{z+i,M}/Im(z)² → 4 as Im(z) → ∞. We also have (7.17) for each M ∈ N. As the left-hand side of (7.17) is decreasing in f_m(M), f_m may be chosen to be increasing and unbounded.
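Schematically, the role of the coefficient from (7.16) and of condition (7.17) in what follows is the usual contraction step; writing c for the contraction coefficient obtained from (7.16) (our shorthand, not the paper's notation):

```latex
% If two fixed points of T satisfy a coupling bound with coefficient c < 1,
\[
  d'_{f_m}(\mu_1,\mu_2) \;\le\; c\, d'_{f_m}(\mu_1,\mu_2)
  \quad\text{with } c < 1
  \quad\Longrightarrow\quad
  d'_{f_m}(\mu_1,\mu_2) = 0 .
\]
```

It then remains only to show that d′_{f_m} separates distinct points of P(S), which is the content of the lemma concluding the section.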
Next we remove the assumption that, for some M, almost surely µ_r and µ_s have half their mass in [−M, M]. For a positive, increasing, unbounded function f_m on N, we define the function d_{f_m} : P(S)² → [0, ∞) and the function d′_{f_m}, where 1_{A_M} is the indicator function of the event A_M, C_M is the set defined by (7.11), and C(µ_1, µ_2) is the set of all couplings of µ_1 and µ_2 for µ_1, µ_2 ∈ P(S). It is straightforward to check that the map ρ : S × S → [0, ∞) so defined is a metric on S, and thus d_{f_m} is the 1st-Wasserstein metric on P(S) (see [18], Chapter 11, for details). Let µ_1 and µ_2 be two fixed points of T. Let (s, r) be a coupling of µ_1 and µ_2 achieving (7.20), and let s and r be built from i.i.d. copies of (s, r) as in (7.12). Using the specific coupling (s, r), (7.16), and (7.20), we obtain a contraction bound. If d′_{f_m} were a metric it would be immediate that µ_1 = µ_2; however, it is not clear this is the case. The only property of a metric needed is that d′_{f_m} separates distinct points in P(S), and thus we conclude the proof using the following lemma.

Proof. Assume d′_{f_m}(µ_1, µ_2) = 0, fix ε > 0, and note there exists M_0 ∈ N such that for any M ≥ M_0 and any coupling (r, s) the tail bound holds. Since the infimum over couplings vanishes, we can find a sequence of couplings (r_n, s_n), with µ^n_r and µ^n_s the random probability measures associated to r_n and s_n. Let M_1 be such that f_m(M_1) > 1/(2ε), and hence the corresponding bound holds for any Stieltjes transforms r and s.
We will now extend the convergence in (7.23) to the supremum over the larger compact set C̄ = ∪_{j=1}^{M_1} C_j. The L¹-convergence to zero of the random variables sup_{z∈C_{M_0}} |r_n(z) − s_n(z)| 1_{A^n_{M_0}} in (7.23) implies convergence in probability to zero. Thus we can find a subsequence converging almost surely to zero, and without loss of generality we denote this subsequence in the same way. Outside the event G_1 the indicators 1_{A^n_{M_0}} are eventually identically 0. For ω ∈ G_1 we consider the further subsequence {n_k} such that 1_{A^{n_k}_{M_0}}(ω) = 1 for all k. For this outcome the supremum is eventually at most ε.
As ε > 0 was arbitrary we have d_{f_m}(µ_1, µ_2) = 0. For the other direction the analogous bound holds. This completes the proof.
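A fixed point of a recursive distributional equation of this type can be approximated numerically by "population dynamics" (pool-based Monte Carlo iteration of T at a single point z ∈ C+). The sketch below is ours and, since the closed form of (2.7) is not reproduced here, it uses a schematic semicircle-type update s ← −(z + Σ_j y_j² s_j)^{−1} with Exp(1) weights as a stand-in for the true update rule; all function names are hypothetical.

```python
import math
import random


def sample_poisson(rng, lam):
    # Knuth's method for a Poisson(lam) sample.
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def population_dynamics(z, rate=1.0, pool_size=1000, sweeps=20000, seed=0):
    """Approximate a fixed point of a map T on laws of Stieltjes-transform
    values at a fixed z in C+.  Placeholder update (NOT equation (2.7)):
        s <- -1 / (z + sum_j y_j^2 * s_j),
    with a Poisson(rate) number of Exp(1) weights y_j and s_j drawn
    uniformly from the current pool."""
    rng = random.Random(seed)
    pool = [complex(0.0, 1.0)] * pool_size  # s(z) = i is a valid initial value
    for _ in range(sweeps):
        total = z
        for _ in range(sample_poisson(rng, rate)):
            y = rng.expovariate(1.0)              # weight y_j
            total += y * y * pool[rng.randrange(pool_size)]
        pool[rng.randrange(pool_size)] = -1.0 / total
    return sum(pool, 0j) / pool_size


s = population_dynamics(2.0j)
# The update preserves Im(s) > 0, as any Stieltjes-transform value must.
```

Since Im(z) > 0 and each y_j² s_j has nonnegative imaginary part, every pool entry keeps a positive imaginary part, so the iteration never divides by zero.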
Lemma B.2 (Campbell's Formula, Section 3.2 in [26]). Let Π be a Poisson point process on a measurable space (X, 𝒳) with intensity measure m, and let u : X → R be a measurable function. Then conclusions (B.8)–(B.11) hold for the integral of u against Π.

For a sequence of positive measures µ_n with Stieltjes transforms s_{µ_n}: s_{µ_n}(z) → s(z) for all z ∈ C+ if and only if there exists a positive measure µ with Stieltjes transform s such that µ_n converges to µ vaguely.
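The expectation identity in Campbell's formula (cf. (B.11)) admits a quick numerical spot check. A minimal sketch, assuming a homogeneous Poisson process of rate lam on [0, 1] (so m is lam times Lebesgue measure) and u(x) = x, for which ∫ u dm = lam/2; all names below are ours.

```python
import math
import random


def sample_poisson(rng, lam):
    # Knuth's method for a Poisson(lam) sample.
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def campbell_sum(rng, lam, u):
    # One realization of sum_{x in Pi} u(x) for a homogeneous
    # Poisson process Pi of rate lam on [0, 1].
    n = sample_poisson(rng, lam)
    return sum(u(rng.random()) for _ in range(n))


rng = random.Random(42)
lam, trials = 10.0, 20000
estimate = sum(campbell_sum(rng, lam, lambda x: x) for _ in range(trials)) / trials
# Campbell's formula predicts E sum u(x_i) = lam * int_0^1 x dx = lam / 2 = 5.0
```

With 20000 trials the Monte Carlo standard error is roughly 0.013, so the estimate should sit well within 0.1 of 5.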
For the following lemma, we use the notation of [29].For a connected open domain D ⊂ C, let H(D) be the space of analytic functions on D equipped with the topology of uniform convergence on compact subsets of D.
Lemma B.7 (Proposition 2.5 in [29]). Let X_1, X_2, X_3, ... be a sequence of random analytic functions on a connected open set D ⊂ C, with probability distribution measures µ_{X_1}, µ_{X_2}, ... on H(D). If for every compact K ⊂ D the sequence {max_{z∈K} |X_n(z)|}_{n≥1} is tight, then {µ_{X_n}}_{n≥1} is tight in the space of probability measures on H(D).

Remark 1.4. It is worth noting that the distribution of A^(n)_{12} may change with n. However, for many important examples A^(n)_{12} is either a rescaling of a fixed random variable or the product of a fixed random variable and a Bernoulli random variable, where only the Bernoulli random variable changes with n. A random variable Y satisfying (1.4) is said to have an infinitely divisible distribution with characteristics (σ², b, m), and (1.4) is referred to as the Lévy-Khintchine representation of Y. When σ = 0, Y is called purely non-Gaussian and has an important connection to Poisson point processes with intensity measure m, outlined in Propositions B.3 and B.4.
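The simplest purely non-Gaussian infinitely divisible laws are compound Poisson. For m = lam·δ_1, with σ = 0 and the drift b chosen to cancel the compensator, the Lévy-Khintchine exponent reduces to E e^{itY} = exp(lam(e^{it} − 1)), i.e. Y ~ Poisson(lam). A Monte Carlo spot check of this exponent, with all names ours:

```python
import cmath
import math
import random


def sample_poisson(rng, lam):
    # Knuth's method for a Poisson(lam) sample.
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


rng = random.Random(7)
lam, t, n = 2.0, 0.7, 100_000
# Empirical characteristic function of Poisson(lam) at t ...
empirical = sum(cmath.exp(1j * t * sample_poisson(rng, lam)) for _ in range(n)) / n
# ... versus the Levy-Khintchine prediction exp(lam * (e^{it} - 1)).
theory = cmath.exp(lam * (cmath.exp(1j * t) - 1.0))
```

The two complex numbers should agree up to Monte Carlo error of order n^{-1/2} ≈ 0.003.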

Theorem 2.4 (Recursive Distributional Equation for the Stieltjes Transform of µ_m). Let µ_m be the limiting deterministic measure from Theorem 2.3 and let s_m(z) = ∫_R (x − z)^{−1} dµ_m(x).
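For concreteness, the Stieltjes transform s_µ(z) = ∫_R (x − z)^{−1} dµ(x) can be evaluated numerically for the semicircle law mentioned in the introduction, where the closed form s(z) = (−z + √(z² − 4))/2 (upper-half-plane branch) is available for comparison; the code and all names in it are ours.

```python
import math


def semicircle_stieltjes(z_im, points=200_000):
    """Midpoint-rule approximation of s(i * z_im) = int_{-2}^{2}
    rho_sc(x) / (x - i * z_im) dx for the semicircle density
    rho_sc(x) = sqrt(4 - x^2) / (2 * pi)."""
    z = complex(0.0, z_im)
    h = 4.0 / points
    total = 0j
    for k in range(points):
        x = -2.0 + (k + 0.5) * h          # midpoint of the k-th subinterval
        rho = math.sqrt(4.0 - x * x) / (2.0 * math.pi)
        total += rho / (x - z) * h
    return total


s = semicircle_stieltjes(1.0)
# Closed form at z = i: s(i) = i * (sqrt(5) - 1) / 2, purely imaginary.
```

The real part vanishes by the symmetry of the grid, and the imaginary part matches the closed form (√5 − 1)/2 ≈ 0.618 to quadrature accuracy.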

We compare the sum (1/λ) ∑_{j=1}^N r_j(z)y_j² / (1 − r_j(z)y_j/√λ) to increasingly simpler sums. The first comparison is to the sum (1/λ) ∑_{j=1}^N r_j(z)y_j²; notice that the difference of the two converges in probability to zero. It is also straightforward to show that (1/λ) ∑_{j=1}^N (r_j(z) − r_λ(z))y_j² converges in probability to zero, and that (1/λ) ∑_{j=1}^N r_λ(z)y_j² − E r_λ(z) converges in probability to zero. These three comparisons lead to

Definition 3.1 (Local Convergence). Suppose (A_n) is a sequence of bounded operators on ℓ²(V) and A is a linear operator on ℓ²(V) with domain D(A) ⊃ D(V).
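As a toy illustration of this mode of convergence (our example, not the paper's PWIT setting): the adjacency operators of cycles C_n converge locally to the adjacency operator of Z, in the sense that matrix elements ⟨δ_u, A_n^k δ_v⟩ stabilize for fixed u, v, k once n is large. All names below are ours.

```python
def apply_cycle(vec, n):
    # Adjacency operator of the n-cycle applied to a finitely supported
    # vector, represented as a dict {vertex: coefficient}.
    out = {}
    for i, c in vec.items():
        for j in ((i - 1) % n, (i + 1) % n):
            out[j] = out.get(j, 0) + c
    return out


def apply_line(vec):
    # Adjacency operator of Z applied to a finitely supported vector.
    out = {}
    for i, c in vec.items():
        for j in (i - 1, i + 1):
            out[j] = out.get(j, 0) + c
    return out


def return_moment(apply_fn, k):
    # <delta_0, A^k delta_0>: the number of closed walks of length k at 0.
    v = {0: 1}
    for _ in range(k):
        v = apply_fn(v)
    return v.get(0, 0)


# For k = 4 and n > 2k the cycle cannot 'feel' its global structure:
# both moments count the C(4, 2) = 6 closed walks of length 4 on Z,
# while for n = 4 wraparound walks contribute two extra closed walks.
```

For n ≤ 2k the moments differ, which is exactly the failure of local convergence at that scale.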
→ 0 almost surely. By the uniform summability condition of C1 we have, almost surely, ∑_{v∉J_{k,k}} ξ^n_{σ_n(u),σ_n(v)} → 0. (5.3) As k was arbitrary, we have that almost surely, for any

Resolvent convergence and the proof of Theorem 2.3

Theorem 6.1. Let s_{L_n}(z) be the Stieltjes transform of µ_{L_n} and let s_∅(z) be the Stieltjes transform of the measure µ_∅ defined by

⟨δ_∅, f(L)δ_∅⟩ = ∫_R f dµ_∅ (6.1)

for any continuous bounded function f : R → C, where f(L) is defined by the continuous functional calculus. Then

lim_{n→∞} E s_{L_n}(z) = E s_∅(z) (6.2)
a sequence of complex analytic functions on C+, uniformly bounded on compact subsets of C+, converging to 0 on a set with an accumulation point. Thus, applying the Vitali convergence theorem for analytic functions, Lemma B.5, we get that sup_{z∈C̄} |r_{n_k}(z) − s_{n_k}(z)|(ω) → 0 as n_k → ∞. From the above and the bounded convergence theorem we get lim_{n→∞} E sup_{z∈C̄} |r_n(z) − s_n(z)| 1_{A^n_{M_0}} = 0. Combining (7.21), (7.24), and (7.25), we obtain

∫_X |u(x)| dΠ(x) is finite almost surely (B.8)

if and only if

∫_X min(u(x), 1) dm(x) < ∞. (B.9)

If either of the above integrals is finite then, writing S := ∫_X u(x) dΠ(x),

E exp(θS) = exp( ∫_X (e^{θu(x)} − 1) dm(x) ) (B.10)

for any θ ∈ C for which the integral on the right-hand side is finite. Moreover,

E ∫_X u(x) dΠ(x) = ∫_X u(x) dm(x), (B.11)

whenever u ≥ 0 or ∫ |u(x)| dm(x) < ∞.

For 0 < h < 1 define

σ²_h := σ² + ∫_{|x|≤h} x² dm(x),  and  b_h := b − ∫_{h<|x|} x/(1 + x²) dm(x).

Proposition B.3 (Corollary 15.16 in [25]). Suppose {X_ni : 1 ≤ i ≤ n}_{n≥1} is a triangular array of random variables such that each row consists of i.i.d. random variables. Then the sum ∑_{i=1}^n X_ni converges in distribution to an infinitely divisible random variable with characteristics (σ², b, m) as n → ∞ if and only if for every 0 < h < 1 which is not an atom of m:
• nP(X_n1 ∈ ·) ⇒ m(·) on R \ {0},
• nE[X²_n1 1_{|X_n1|≤h}] → σ²_h, and
• nE[X_n1 1_{|X_n1|≤h}] → b_h,
as n → ∞.

Proposition B.4 (Theorem 5.3 in [28]). Suppose {X_ni : 1 ≤ i ≤ n}_{n≥1} is a triangular array of random variables on R \ {0} such that each row consists of i.i.d. random variables. Let N be a Poisson point process with intensity measure m. Then ∑_{i=1}^n δ_{X_ni} ⇒ N as n → ∞ if and only if nP(X_n1 ∈ ·) ⇒ m(·) as n → ∞.

Lemma B.5 (Vitali's convergence theorem for analytic functions, Lemma 2.14 in [3]). Let f_1, f_2, ... be analytic in D, a connected open set of C, satisfying |f_n(z)| ≤ M for every n and z ∈ D, and suppose f_n(z) converges as n → ∞ for each z in a subset of D having an accumulation point in D. Then there exists a function f analytic in D for which f_n(z) → f(z) for all z ∈ D. Moreover, on any set bounded by a contour interior to D, the convergence is uniform. Though Stieltjes transforms are not uniformly bounded on C+, it is straightforward to apply Lemma B.5 to them by considering first C^{+,m} = {z ∈ C : Im(z) > 1/m} and letting m → ∞.
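Proposition B.3 can be illustrated with the Bernoulli array X_ni ~ Bernoulli(lam/n): here nP(X_n1 ∈ ·) → lam·δ_1 =: m, while both truncated-moment conditions give σ²_h = 0 and b_h = 0 for h < 1, so the row sums converge to a purely non-Gaussian limit, namely Poisson(lam). A Monte Carlo sketch, with all names ours:

```python
import random


def row_sum(rng, n, lam):
    # Row sum of the triangular array X_ni ~ Bernoulli(lam / n), i.i.d. in i.
    return sum(1 for _ in range(n) if rng.random() < lam / n)


rng = random.Random(1)
n, lam, trials = 300, 3.0, 10_000
sums = [row_sum(rng, n, lam) for _ in range(trials)]
mean = sum(sums) / trials
var = sum((s - mean) ** 2 for s in sums) / trials
# A Poisson(3) limit predicts mean ~ 3 and variance ~ 3.
```

The exact pre-limit law is Binomial(n, lam/n), whose variance lam(1 − lam/n) is already within 1% of the Poisson value at n = 300.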