1 Introduction

In this article, we are concerned with the limiting spectral behavior of a new class of random sample covariance matrices introduced in [28]. Consider a large \(N\times p\) matrix \({\mathbf {X}}\) with i.i.d. random entries, which may be used as a model of a large dataset. If the entries \([{\mathbf {X}}]_{i,j}\) are known a priori to be centered, then the \(p\times p\) sample covariance matrix of features is given by

$$\begin{aligned} \frac{1}{N} {\mathbf {X}}^\dagger {\mathbf {X}} \end{aligned}$$

The bulk behavior of such a matrix is studied through its empirical spectral distribution (ESD), the probability measure placing equal point masses at each of its eigenvalues. In practice, many datasets exhibit strong multicollinearity when p and N are comparably large. In this scenario, \(p/N \sim O(1)\), and we choose to model \({\mathbf {X}}\) as a large random matrix with a stable rectangular shape. In what follows, we implicitly take \(N = N(p)\) as a function of the asymptotic parameter \(p \in {\mathbb {N}}\).

Definition 1

We say that a sequence of \(N\times p\) random matrices \({\mathbf {X}}_p\) is a \(\lambda \)-shaped ensemble if the collection of entries \([{\mathbf {X}}_p]_{i,j}\) for \(p \in {\mathbb {N}}\), \(1 \le i \le N\), and \(1 \le j \le p\) are jointly independent, and if \(N/p \rightarrow \lambda \in (0,\infty )\) as \(p\rightarrow \infty \).

Describing the limiting spectral properties of matrices like \(\frac{1}{N} {\mathbf {X}}_p^\dagger {\mathbf {X}}_p\) and its variants is a long-standing problem in random matrix theory. We say that a sequence of square random matrices has a limiting spectral distribution if their (random) ESDs converge weakly to some probability measure almost surely. The study of the limiting spectral distribution in the pure-noise case, where the entries \([{\mathbf {X}}_p]_{i,j}\) are i.i.d. with finite variance, was initiated by Marčenko and Pastur [17]. The criteria on \({\mathbf {X}}_p\) that ensure \(\frac{1}{N}{\mathbf {X}}_p^\dagger {\mathbf {X}}_p\) follows the Marčenko–Pastur law have a long history [3, 11, 21, 25], with one branch culminating in the generous conditions of Tao and Vu [23]: that the collection of entries across the asymptotic parameter p shares some uniformly bounded \((2+\epsilon )\)-moment.

As the conditions for the Marčenko–Pastur law continued to weaken, its “universality” inspired a number of covariance matrix cleaning techniques, not the least of which were applied to financial data [2, 10, 13]. As the motivation behind these techniques was the shape and bounds of the Marčenko–Pastur law, it was suggested that real datasets would exhibit this bulk shape when some volatility in the data could be attributed to noise that was approximately Gaussian. As financial data are ubiquitously non-Gaussian, however, efforts were made to extend the law to the heavier-tailed setting. Along these lines, Biroli et al. [8] investigated an ensemble of random matrices with Student’s t column norms, while Guionnet and collaborators began a program of limit theorems for matrices with i.i.d. heavy-tailed entries [4, 5, 7, 9].

In [28], it was shown that intraday equity data in various markets fail to match the scalability implied by Marčenko–Pastur with distinct values of \(\lambda \). Appealing to the heavy-tailed setting does not resolve the issue: The heavy-tailed pure-noise Marčenko–Pastur law described in [4] produces large eigenvalues with heavy tails themselves, which contradicts well-observed phenomena in extreme asset returns [16, 18]. In order to address these concerns, the author considered instead a sequence of random matrices \({\mathbf {X}}_p\) whose columns are drawn from the fluctuations in a stochastic process \(X_t\) over a fixed interval [0, T]. Specifically, after discretizing the interval [0, T] into a series of \(N+1\) points \(t_i = i \cdot \frac{T}{N}\) with \(0 \le i \le N\), we let the entries \([{\mathbf {X}}_p]_{i,j}\) follow the distributions:

$$\begin{aligned}{}[{\mathbf {X}}_p]_{i,j} {\mathop {=}\limits ^{d}} X_{t_i} - X_{t_{i-1}} \end{aligned}$$

In this way, each column of \({\mathbf {X}}_p\) is understood to represent the fluctuations of an independent copy of the process \(X_t\) over [0, T].

If we continue to impose the condition that entries of \({\mathbf {X}}_p\) are i.i.d. for each p, then \(X_t\) must be a stochastic process with independent and time-invariant increments. Such properties specify that \(X_t\) is a Lévy process, and the entries of \({\mathbf {X}}_p\) will therefore follow an infinitely divisible distribution. This correspondence leads naturally to the following model.

Definition 2

(Sample Lévy Covariance Ensemble) Let \((\mu ,\lambda )\) be a pair consisting of an infinitely divisible distribution \(\mu \in \text {ID}(*)\) and a shape parameter \(\lambda \in (0,\infty )\). A sample Lévy covariance ensemble (SLCE) \({\mathbf {C}}_p\) driven by data \((\mu ,\lambda )\) is a sequence of \(p\times p\) Wishart-type random matrices \({\mathbf {C}}_p = {\mathbf {X}}_p^\dagger {\mathbf {X}}_p\), where \({\mathbf {X}}_p\) is a \(\lambda \)-shaped rectangular ensemble whose entries follow the distributions:

$$\begin{aligned} {{\mathscr {L}}}\left( [{\mathbf {X}}_p]_{i,j}\right) =\mu ^{*1/p}, \ \ \ \ \ \begin{array}{l} 1 \le i \le N \\ 1 \le j \le p \end{array} \end{aligned}$$
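As an illustration (not part of the formal development), the following minimal Python sketch samples the spectrum of an SLCE, taking \(\mu \) to be a standard Gamma law: Gamma distributions are infinitely divisible, and \(\mu ^{*1/p}\) is obtained by dividing the shape parameter by p. All sizes here are arbitrary choices of ours.

```python
import numpy as np

# Minimal sketch of Definition 2: an SLCE C_p = X_p^dagger X_p driven by
# (mu, lambda) with mu = Gamma(shape=1, scale=1), so that mu^{*1/p} is
# Gamma(shape=1/p, scale=1).  Sizes are illustrative assumptions.
rng = np.random.default_rng(0)
p, lam = 500, 2.0
N = int(lam * p)
X = rng.gamma(shape=1.0 / p, scale=1.0, size=(N, p))
C = X.T @ X                     # real entries: dagger is the transpose
eigs = np.linalg.eigvalsh(C)    # the ESD places mass 1/p at each eigenvalue
print(eigs.min(), eigs.max())
```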

The main contribution of this paper is the spectral convergence of SLCE matrices in Theorem 1, which extends Lemma 2 introduced in [28] to cover the case of an SLCE driven by an arbitrary Lévy process.

Theorem 1

Let \({\mathbf {C}}_p\) be an SLCE with data \((\mu ,\lambda ) \in \text {ID}(*)\times (0,\infty )\). Then, there exists a unique probability distribution, denoted by \(\Lambda _\lambda (\mu )\), such that the limiting spectral distribution of \({\mathbf {C}}_p\) is \(\Lambda _\lambda (\mu )\). Furthermore, \(\Lambda _\lambda (\mu )\) is weakly continuous in its arguments, where continuity in \(\mu \) means sequential continuity.

Our model intersects with the world of non-commutative probability in the following way. A trivial but far-reaching property of sample covariance matrices is that they can be decomposed in terms of blocks of their observations. Writing

$$\begin{aligned} {\mathbf {X}}_p = \begin{bmatrix} \begin{array}{c} {\mathbf {X}}_{p,1} \\ \hline {\mathbf {X}}_{p,2}\\ \hline \vdots \\ \hline {\mathbf {X}}_{p,K} \end{array} \end{bmatrix} \end{aligned}$$

where \({\mathbf {X}}_{p,k}\) consists of rows \(\lfloor N\frac{k-1}{K}+1\rfloor \) to \(\lfloor N\frac{k}{K}\rfloor \), we then have

$$\begin{aligned} {\mathbf {C}}_p = {\mathbf {X}}_p^\dagger {\mathbf {X}}_p = \sum _{k=1}^K {\mathbf {X}}_{p,k}^\dagger {\mathbf {X}}_{p,k} = \sum _{k=1}^K {\mathbf {C}}_{p,k} \end{aligned}$$
(1)

Each \({\mathbf {C}}_{p,k}\) is an SLCE with parameters \((\mu ,\lambda /K)\). The expression of \({\mathbf {C}}_p\) as an independent sum of matrices with identical limiting spectral distributions parallels the classical case of infinite divisibility. The recent result of Au et al. [1], connecting operator-valued free probability and permutation-invariant random matrices, makes this precise: The matrices \({\mathbf {C}}_{p,k}\) in decomposition (1) are independent and permutation invariant and therefore asymptotically free with amalgamation over the subalgebra of diagonal matrices. Prior to this result, techniques from free probability were typically restricted to unitarily invariant ensembles of random matrices. This development provides a rich framework for us to understand \({\mathbf {C}}_p\) as asymptotically modeling a non-commutative Lévy process in an operator-valued \(^*\)-algebra.
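The identity (1) can also be checked numerically in a few lines; the following sketch (an aside, with arbitrary sizes) splits the rows of \({\mathbf {X}}_p\) into K blocks and confirms the sum.

```python
import numpy as np

# Sketch: check the block decomposition (1), C_p = sum_k C_{p,k},
# by splitting the rows of X_p into K nearly equal blocks.
rng = np.random.default_rng(0)
p, N, K = 100, 200, 4
X = rng.gamma(shape=1.0 / p, scale=1.0, size=(N, p))
blocks = np.array_split(X, K, axis=0)     # rows N(k-1)/K+1, ..., Nk/K
C_sum = sum(Xk.T @ Xk for Xk in blocks)   # the sum of the C_{p,k}
print(np.allclose(X.T @ X, C_sum))        # True
```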

Theorem 2

For any essentially bounded infinitely divisible distribution \(\mu \in \text {ID}_b(*)\), there exists an operator-valued \(^*\)-probability space \(({{\mathscr {A}}},\tau ,{{\mathscr {D}}},\Delta )\) and a \(\Delta \)-free Lévy process \(x_t \in {{\mathscr {A}}}\) (in the sense of Definitions 6 and 8) such that

$$\begin{aligned} {{\mathscr {L}}}(x_t) = \Lambda _t(\mu ) \end{aligned}$$

The outline of the article is as follows. In Sect. 2, we sketch the foundations of Lévy processes in order to establish the decomposition in Lemma 1. The full reference for this section is the treatise by Sato [19]. The proof of Theorem 1 is provided at the end of Sect. 3, followed by some immediate corollaries in Sect. 4. In Sect. 5, we use the SLCE to construct an operator-valued \(^*\)-algebra and an accompanying \(\Delta \)-free Lévy process with prescribed moments.

One might ask where these results fit into the larger context of random covariance matrices. Our ensembles lie outside the domain of attraction for Marčenko–Pastur (and the analogous heavy-tailed case [4]) because their entries are independent but not identically distributed across the asymptotic parameter p. Because of the increasing roughness of the entries, the matrices fail to meet the conditions of [23] and others. On the other hand, they fit well into the world of covariance matrices with exploding moments [6, 14]. In these works, the i.i.d. entries of the data \({\mathbf {X}}_p\) have normalized even moments with some prescribed behavior, following the covariance form of the Zakharevich condition [27]. Under our normalization, this condition is equivalent to limits of the form

$$\begin{aligned} p{\mathbb {E}}\big [ [{\mathbf {X}}_p]_{i,j}^{2n} \big ] \xrightarrow {p\rightarrow \infty } c_{2n} \end{aligned}$$

for some sequence \(c_{2n}\) indexed over the even integers. From the proof of Theorem 1, we have the following: If \(c_{2n} \in [0,\infty ]\) is the sequence of even cumulants of an infinitely divisible probability distribution, then it can be realized as the Zakharevich sequence of an ensemble of random covariance matrices. We note that this includes the sequences of even moments of arbitrary probability distributions, through the moment–cumulant correspondence given by compound Poisson processes.
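As a concrete instance of this limit, suppose for illustration that \(X_t\) is a Gamma subordinator, \(X_t \sim \text {Gamma}(t,1)\), so that \(\kappa _n[X_1] = (n-1)!\). The exact Gamma moments then exhibit the convergence directly; the helper function below is ours, not from [6] or [27].

```python
import math

# Sketch: the limit p * E[X_{1/p}^{2n}] -> kappa_{2n}[X_1], assuming a
# Gamma subordinator X_t ~ Gamma(shape=t, scale=1).  For X ~ Gamma(a, 1),
# E[X^k] = a (a+1) ... (a+k-1), and kappa_n[X_1] = (n-1)!.
def gamma_moment(a, k):
    out = 1.0
    for i in range(k):
        out *= a + i
    return out

n = 3
for p in (10, 100, 1000, 10_000):
    print(p, p * gamma_moment(1.0 / p, 2 * n))
print("kappa_{2n}[X_1] =", math.factorial(2 * n - 1))  # 120
```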

2 Decomposition of Lévy Processes

Throughout, if X is a real-valued random variable, then we write \({{\mathscr {L}}}(X)\) for the law of X, a probability distribution on \({\mathbb {R}}\). Equality in distribution \(X {\mathop {=}\limits ^{d}} Y\) is shorthand for equality in law, \({{\mathscr {L}}}(X) = {{\mathscr {L}}}(Y)\).

Definition 3

A (classical) Lévy process \(X_t\) is a stochastically continuous càdlàg process such that

  1. \(X_0 {\mathop {=}\limits ^{d}} 0\)

  2. \(X_t\) is real-valued for all \(t\ge 0\)

  3. For any sequence \(0 \le t_0 \le t_1 \le \cdots \le t_n\), all increments \(X_{t_k} - X_{t_{k-1}}\) are jointly independent.

  4. For all \(t,s\ge 0\), we have the time invariance of distributions \(X_{t+s} - X_s {\mathop {=}\limits ^{d}} X_t\).

Definition 4

A probability distribution \(\mu \) is said to be (classically) infinitely divisible, \(\mu \in \text {ID}(*)\), if for any \(n \in {\mathbb {N}}\), there exists a distribution denoted by \(\mu ^{*1/n}\) such that

$$\begin{aligned} \mu = \underbrace{\mu ^{* 1/n} * \mu ^{* 1/n} * \cdots * \mu ^{* 1/n}}_{n\text { times}} \end{aligned}$$

Here, the symbol \(*\) stands for the additive convolution of probability measures. Similarly, a random variable X is said to be \(\text {ID}(*)\) if for any \(n \in {\mathbb {N}}\), we can write

$$\begin{aligned} X {\mathop {=}\limits ^{d}} X_1^{(n)} + X_2^{(n)} + \cdots + X_n^{(n)} \end{aligned}$$

for \(X_i^{(n)}\) i.i.d.

Theorem 3

(Lévy–Khintchine representation [19]) There is a one-to-one correspondence between \(\text {ID}(*)\) distributions and Lévy processes, such that each \(\mu \in \text {ID}(*)\) can be realized as the distribution of a Lévy process \(X_t\) at unit time \(t=1\). Furthermore, the cumulant generating function

$$\begin{aligned}\psi _{X_t}(\theta ) = \log {\mathbb {E}}\left[ e^{i \theta X_t}\right] \end{aligned}$$

for a Lévy process \(X_t\) is well defined as a continuous function of \(\theta \in {\mathbb {R}}\) and has a representation given by

$$\begin{aligned} \frac{1}{t}\psi _{X_t}(\theta ) = ia \theta - \frac{b^2}{2} \theta ^2 + \int _{{\mathbb {R}}} \left[ e^{i\theta x} - 1 - i\theta x \mathbbm {1}_{[-1,1]}(x) \right] \mathrm{d}\Pi (x) \end{aligned}$$
(2)

The unique triplet \((a,b,\Pi )\), called the data of \(X_t\), consists of constants \(a \in {\mathbb {R}}\) and \(b \ge 0\), and a nonnegative Borel measure \(\Pi \) on \({\mathbb {R}}\) with no atom at zero, such that for any \(\epsilon > 0\) (or, equivalently, for only \(\epsilon = 1\)), we have

$$\begin{aligned} \Pi \big ({\mathbb {R}}\backslash [-\epsilon ,\epsilon ]\big )< \infty , \ \ \ \ \ \ \ \ \ \ \int _{-\epsilon }^\epsilon x^2 \ \mathrm{d}\Pi (x) < \infty \end{aligned}$$

The Borel measure \(\Pi \) is called the Lévy measure of \(X_t\). We write \(\mu ^{*t} = {{\mathscr {L}}}(X_t)\), well defined for all \(t \ge 0\).

We recall that the cumulants \(\kappa _n\) are defined in terms of the moment–cumulant formula, which states that for a random variable X with finite moments \(m_j[X] = {\mathbb {E}}[X^j]\) up to order n, they are the unique values \(\kappa _j[X]\) such that the following n equations are satisfied:

$$\begin{aligned} m_j[X] = \sum _{\pi } \prod _{B\in \pi } \kappa _{|B|}[X], \ \ \ \ \ j = 1,2,\ldots ,n \end{aligned}$$
(3)

Here, each sum runs over all partitions \(\pi \) of the set \(\{1,2,\ldots ,j\}\), and the elements \(B \in \pi \) are the blocks of the partition, subsets of \(\{1,2,\ldots ,j\}\).
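The formula can be evaluated mechanically by enumerating set partitions; the following sketch (our illustration, with hypothetical helper names) recovers the fourth moment \(m_4 = 3\) of a standard Gaussian, whose only nonzero cumulant is \(\kappa _2 = 1\).

```python
from itertools import combinations
from math import prod

def partitions(s):
    """Yield all set partitions of the list s, as lists of blocks."""
    s = list(s)
    if not s:
        yield []
        return
    first, rest = s[0], s[1:]
    for k in range(len(rest) + 1):
        for comb in combinations(rest, k):
            remaining = [x for x in rest if x not in comb]
            for part in partitions(remaining):
                yield [[first, *comb]] + part

def moment_from_cumulants(kappa, j):
    """m_j[X] = sum over partitions pi of prod_{B in pi} kappa_{|B|}[X]."""
    return sum(prod(kappa[len(B)] for B in pi) for pi in partitions(range(j)))

# Standard Gaussian: kappa_2 = 1 and all other cumulants vanish, so m_4
# equals the number of pair partitions of {1,2,3,4}, which is 3.
kappa = {1: 0.0, 2: 1.0, 3: 0.0, 4: 0.0}
print(moment_from_cumulants(kappa, 4))  # 3.0
```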

Corollary 1

The cumulants \(\kappa _n[X_t]\), when they are finite, are given by the expressions

$$\begin{aligned} \kappa _n[X_t] = t \int _{\mathbb {R}} x^n \mathrm{d}\Pi (x), \ \ \ \ \ n \ge 3 \end{aligned}$$
(4)

Definition 5

A Lévy process \(X_t\) with data \((a,b,\Pi )\) is said to be essentially bounded if the support of \(\Pi \) is contained in some bounded interval \([-B,B]\) with \(B>0\). If \(\mu \in \text {ID}(*)\) is infinitely divisible such that \({{\mathscr {L}}}(X_1) = \mu \), then we say \(\mu \) is essentially bounded if \(X_t\) is. We write \(\text {ID}_b(*)\) for the set of all essentially bounded infinitely divisible distributions.

Sato [19] provides a thorough account of essentially bounded Lévy processes. If the support of \(\Pi \) is contained in \((-\infty ,B]\) for some minimal \(B\ge 0\), then the super-exponential moments

$$\begin{aligned} {\mathbb {E}}\left[ e^{\beta X_t \log X_t} \ \big | \ X_t> 0 \right] = {\mathbb {E}}\left[ X_t^{\beta X_t} \ \big | \ X_t > 0 \right] \end{aligned}$$

are finite for all \(0< \beta < 1/B\) and infinite for all \(\beta > 1/B\), independent of \(t>0\). The cumulant generating function \(\psi _{X_t}(\theta )\) can also be extended to an entire function in the argument \(\theta \in {\mathbb {C}}\) when \(X_t\) is essentially bounded. The key property of essentially bounded processes is that they are precisely those Lévy processes whose cumulants grow at most exponentially in size, in the sense that the sequence

$$\begin{aligned} \root n \of {\left| \kappa _n[X_t]\right| }\end{aligned}$$

is bounded.

The following lemma shows that all Lévy processes can be decomposed into the independent sum of an essentially bounded process and a compound Poisson process with arbitrarily small probability of activation. Recall that a compound Poisson process is a Lévy process realized by the random sum

$$\begin{aligned} P_t {\mathop {=}\limits ^{d}} \sum _{j=1}^{N_{rt}} \zeta _j \end{aligned}$$

where \(\zeta _j\) are i.i.d. random variables drawn from some fixed distribution, and \(N_{t}\) is a standard Poisson process which is independent of the \(\zeta _j\). The value \(r>0\) is called the rate of \(P_t\), and when \(N_{rt} = 0\) (the sum is empty) we say that \(P_t\) failed to activate. We note that the probability of this event is equal to

$$\begin{aligned}{\mathbb {P}}[N_{rt} = 0] = {\mathbb {P}}\left[ \text {Poisson with parameter { rt} is zero}\right] = e^{-rt}\end{aligned}$$

Compound Poisson processes have a convenient description in terms of their jumps \(\zeta _j\): Their cumulant generating functions are given by

$$\begin{aligned} \frac{1}{t}\psi _{P_t}(\theta ) = r \left( {\mathbb {E}}[ e^{i \theta \zeta _1} ] -1 \right) = r \int _{\mathbb {R}} \left( e^{i\theta x} - 1 \right) \mathrm{d}F_{\zeta _1}(x) \end{aligned}$$
(5)

where \({\mathbb {E}}[ e^{i \theta \zeta _1} ]\) is the characteristic function of the jump distribution \({{\mathscr {L}}}(\zeta _1)\).
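Sampling from this description is straightforward, and the activation probability \(e^{-rt}\) can be checked empirically; the sketch below is an illustration of ours, with standard-normal jumps as an arbitrary choice.

```python
import numpy as np

# Sketch: sample increments of a compound Poisson process P_t with rate r
# and (as an arbitrary illustrative choice) standard-normal jumps, then
# check the activation probability P[N_{rt} = 0] = exp(-r t).
rng = np.random.default_rng(1)
r, t, trials = 2.0, 0.25, 50_000
counts = rng.poisson(r * t, size=trials)                 # N_{rt} for each trial
increments = np.array([rng.normal(size=c).sum() for c in counts])
print((counts == 0).mean(), np.exp(-r * t))              # both close to 0.6065
print(increments.var(), r * t)                           # kappa_2 = r t E[zeta^2]
```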

Lemma 1

Let \(X_t\) be a Lévy process, and let \(r>0\) be a fixed constant. Then, there exists a decomposition

$$\begin{aligned} X_t {\mathop {=}\limits ^{d}} X_t^{b} + P_t \end{aligned}$$

where \(X_t^{b}\) is an essentially bounded Lévy process, and \(P_t\) is an independent compound Poisson process with rate less than or equal to r.

Proof

Let \((a,b,\Pi )\) be the data for the Lévy process \(X_t\). Since \(\Pi \) is a Borel measure with \(\Pi ({\mathbb {R}}\backslash [-1,1]) < \infty \), it is inner regular on \({\mathbb {R}}\backslash [-1,1]\), and we have that

$$\begin{aligned} \lim _{B \rightarrow \infty } \Pi ([-B,B] \backslash [-1,1]) = \Pi ({\mathbb {R}}\backslash [-1,1]) \end{aligned}$$

Now, simply choose some \(B>1\) such that

$$\begin{aligned}\Pi ({\mathbb {R}}\backslash [-1,1]) - \Pi ([-B,B] \backslash [-1,1]) < r\end{aligned}$$

We can now write \(\Pi (A) = \Pi (A \cap [-B,B]) + \Pi (A \cap [-B,B]^c)\) for any Borel set \(A \subseteq {\mathbb {R}}\). It is clear that the nonnegative Borel measures \(\Pi ^b\) and \(\Pi ^P\) defined as \(\Pi ^b(\cdot ) = \Pi (\cdot \cap [-B,B])\) and \(\Pi ^P(\cdot ) = \Pi (\cdot \cap [-B,B]^c)\) are also Lévy measures; therefore, we can write \(X_t\) as the independent sum

$$\begin{aligned} X_t {\mathop {=}\limits ^{d}} X_t^b + P_t \end{aligned}$$

where \(X_t^b\) is an essentially bounded Lévy process with data \((a,b,\Pi ^b)\), and \(P_t\) is a Lévy process with data \((0,0,\Pi ^P)\). If \(\Pi ^P({\mathbb {R}}) > 0\), then \(P_t\) has cumulant generating function

$$\begin{aligned} \frac{1}{t} \psi _{P_t}(\theta ) = \int _{{\mathbb {R}}} \left[ e^{i\theta x} - 1\right] \mathrm{d}\Pi ^P(x) \end{aligned}$$

This is precisely the form of a compound Poisson process with rate \(\Pi ^P({\mathbb {R}})<r\) and jump distribution given by the probability distribution \(\Pi ^P({\mathbb {R}})^{-1} \Pi ^P(\cdot )\). \(\square \)

3 Proof of Main Results

Lemma 2

(Zitelli [28]) Every SLCE driven by data \((\mu ,\lambda )\) where \(\mu \in \text {ID}_b(*)\) has a limiting spectral distribution \(\Lambda _\lambda (\mu )\). Furthermore, \(\Lambda _\lambda (\mu )\) is weakly continuous in its arguments, where continuity in \(\mu \) means sequential continuity.

Proof

We aim to show that the \(\lambda \)-shaped rectangular ensemble \({\mathbf {X}}_p\) appearing in the definition of the SLCE matrices \({\mathbf {C}}_p\) can be scaled in order to satisfy the Zakharevich condition found in Benaych–Georges and Cabanal–Duvillard [6, Theorem 3.2]. Specifically, we set

$$\begin{aligned} {\mathbf {Y}}_p = \sqrt{p} {\mathbf {X}}_p \end{aligned}$$

so that \({\mathbf {C}}_p = \frac{1}{p} {\mathbf {Y}}_p^\dagger {\mathbf {Y}}_p\). We let \(X_t\) denote a Lévy process such that \(\mu = {{\mathscr {L}}}(X_1)\). Then, the entries of \({\mathbf {Y}}_p\) follow the distribution of \(\sqrt{p} X_{1/p}\).

To show convergence of covariance matrices, it is sufficient to consider the Zakharevich condition on the even moments only:

$$\begin{aligned} \frac{ {\mathbb {E}}\left[ \big | \left[ {\mathbf {Y}}_p\right] _{1,1} \big |^{2n} \right] }{p^{2n/2-1}} = \frac{ {\mathbb {E}}\left[ \left| \sqrt{p} X_{1/p} \right| ^{2n} \right] }{p^{2n/2-1}} = p \ {\mathbb {E}}\left[ X_{1/p}^{2n} \right] \end{aligned}$$
(6)

For \(n \in {\mathbb {N}}\) fixed, the term on the right is simply p times the \(2n^{\text {th}}\) moment of \(X_{1/p}\). By the moment–cumulant formula (3), this moment can be expressed as a sum of products of the form

$$\begin{aligned} \prod _{j=1}^{2n} \kappa _j\left[ X_{1/p}\right] ^{k_j} = \prod _{j=1}^{2n} \frac{1}{p^{k_j}}\kappa _j\left[ X_{1}\right] ^{k_j} = \frac{1}{p^{\sum _{j=1}^{2n} k_j}}\prod _{j=1}^{2n} \kappa _j\left[ X_{1}\right] ^{k_j} \end{aligned}$$

with \(k_j \in \{0,1,2,\ldots ,2n\}\) such that

$$\begin{aligned} \sum _{j=1}^{2n} j \cdot k_j = 2n \end{aligned}$$
(7)

Therefore, we can write (6) as a sum of terms of the form

$$\begin{aligned} p^{1-\sum _{j=1}^{2n} k_j} \prod _{j=1}^{2n} \kappa _{j}\left[ X_{1}\right] ^{k_j} \end{aligned}$$

Condition (7) guarantees that \(1-\sum _{j=1}^{2n} k_j \le 0\). The terms for which \(1-\sum _{j=1}^{2n} k_j < 0\) converge to zero as \(p\rightarrow \infty \). There is only one choice for which \(1-\sum _{j=1}^{2n} k_j = 0\), namely \(k_j = 0\) for all \(j\) except for \(k_{2n} = 1\). Therefore, as \(p\rightarrow \infty \) we have

$$\begin{aligned} \frac{ {\mathbb {E}}\left[ \big | \left[ {\mathbf {Y}}_p\right] _{1,1} \big |^{2n} \right] }{p^{2n/2-1}} \rightarrow \kappa _{2n}\left[ X_1\right] \end{aligned}$$
(8)
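For example, in the simplest case \(n=1\), the two partitions of \(\{1,2\}\) give

$$\begin{aligned} p\,{\mathbb {E}}\left[ X_{1/p}^{2} \right] = p \left( \kappa _2\left[ X_{1/p}\right] + \kappa _1\left[ X_{1/p}\right] ^2 \right) = \kappa _2\left[ X_1\right] + \frac{1}{p}\kappa _1\left[ X_1\right] ^2 \rightarrow \kappa _2\left[ X_1\right] \end{aligned}$$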

Since \(X_t\) is essentially bounded, the sequence \(\kappa _{2n}\left[ X_1\right] ^{1/(2n)}\) is bounded, and the conditions of [6, Theorem 3.2] are met. We denote the limiting distribution by \(\Lambda _\lambda (\mu )\), as it depends only on \(\lambda \) and the even cumulants of the distribution \(\mu \).

Continuity in the parameter \(\lambda \) follows similarly from the same reference. For sequential continuity in \(\mu \), consider a sequence of essentially bounded processes \(X_t^{(j)}\) whose laws at unit time converge weakly: Then, for each fixed \(n \in {\mathbb {N}}\), the cumulants \(\kappa _{2n}\left[ X_1^{(j)}\right] \) converge. By the continuity of the limit in the cumulant sequence \({\mathbf {c}}\) in [6], we have continuity of the limiting distribution as desired. \(\square \)

Proof of Theorem 1

As above, we take a Lévy process \(X_t\) such that \([{\mathbf {X}}_p]_{i,j} {\mathop {=}\limits ^{d}} X_{1/p}\). Our goal is to decompose the matrices \({\mathbf {X}}_p\) into the sum of two independent components, one of which is driven by an essentially bounded process and the other of which is low rank with high probability. By Lemma 1, we have a decomposition into the sum of independent processes

$$\begin{aligned} X_t {\mathop {=}\limits ^{d}} X_t^{b} + P_t \end{aligned}$$

where \(X_t^{b}\) is an essentially bounded process and \(P_t\) is a compound Poisson process with arbitrarily small rate \(r>0\). Therefore, we can write each matrix \({\mathbf {X}}_p\) as

$$\begin{aligned} {\mathbf {X}}_p = \widetilde{{\mathbf {X}}}_p + {\mathbf {P}}_p \end{aligned}$$

where the entries of \(\widetilde{{\mathbf {X}}}_p\) are i.i.d. following the distribution of \(X^b_{1/p}\) and the entries of \({\mathbf {P}}_p\) are i.i.d. following the distribution of \(P_{1/p}\). Note that this equality is not simply in distribution, as we treat the entries of \({\mathbf {X}}_p\) as being generated by summing the independent entries of \(\widetilde{{\mathbf {X}}}_p\) and \({\mathbf {P}}_p\). It follows that

$$\begin{aligned} \text {rank} \left( {\mathbf {X}}_p^\dagger {\mathbf {X}}_p - \widetilde{{\mathbf {X}}}_p^\dagger \widetilde{{\mathbf {X}}}_p\right)&\le 2 \cdot \left[ \text {Number of columns of }{\mathbf {X}}_p\text { that are different from }\widetilde{{\mathbf {X}}}_p\right] \\&\le 2 \cdot \big [\text {Number of columns of }{\mathbf {P}}_p\text { that are nonzero}\big ] \end{aligned}$$

Each column of \({\mathbf {P}}_p\) has N independent compound Poisson entries, each of which fails to activate with probability \(e^{-r/p}\). The probability that all N entries fail to activate is

$$\begin{aligned} {\mathbb {P}}\left[ \text {A specified column of }{\mathbf {P}}_p\text { is all zero} \right] \ge \left( e^{-r/p}\right) ^N = e^{-r\lambda } \end{aligned}$$

This allows us to treat the number of columns of \({\mathbf {X}}_p\) that differ from those of \(\widetilde{{\mathbf {X}}}_p\) as being bounded above by a binomial random variable with p trials and probability of success \(q \le 1 - e^{-r\lambda }\). Using the Chernoff bound on binomial random variables, we get that

$$\begin{aligned} {\mathbb {P}}\left[ \text {rank} \left( {\mathbf {X}}_p^\dagger {\mathbf {X}}_p - \widetilde{{\mathbf {X}}}_p^\dagger \widetilde{{\mathbf {X}}}_p\right) \ge 4 p \left( 1- e^{-r\lambda } \right) \right] \le \left( \frac{e}{4}\right) ^{p\left( 1- e^{-r\lambda } \right) } \end{aligned}$$

Since \(\left( \frac{e}{4}\right) ^{\left( 1- e^{-r\lambda } \right) } < 1\), it follows that

$$\begin{aligned} \sum _{p=1}^\infty \left( \left( \frac{e}{4}\right) ^{\left( 1- e^{-r\lambda } \right) }\right) ^p < \infty \end{aligned}$$

By Borel–Cantelli, the inequality

$$\begin{aligned} \text {rank} \left( {\mathbf {X}}_p^\dagger {\mathbf {X}}_p - \widetilde{{\mathbf {X}}}_p^\dagger \widetilde{{\mathbf {X}}}_p\right) \ge 4 p \left( 1- e^{-r\lambda } \right) \end{aligned}$$

occurs only finitely many times almost surely.
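As a numerical aside, the counting step above is easy to reproduce in simulation; the sketch below (with arbitrary parameter choices of ours) compares the number of activated columns of \({\mathbf {P}}_p\) with the binomial mean \(p(1-e^{-r\lambda })\).

```python
import numpy as np

# Sketch: count the columns of P_p containing at least one activated
# entry, and compare with the binomial mean p(1 - exp(-r*lambda)).
rng = np.random.default_rng(3)
p, lam, r = 400, 2.0, 0.1
N = int(lam * p)
activations = rng.poisson(r / p, size=(N, p))   # Poisson counts per entry
activated_cols = int((activations.sum(axis=0) > 0).sum())
print(activated_cols, p * (1 - np.exp(-r * lam)))
```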

Let \(\epsilon > 0\) be given, and choose \(r>0\) such that \(4\left( 1 - e^{-r\lambda }\right) < \epsilon \). For this choice of \(r>0\),

$$\begin{aligned}\text {rank} \left( {\mathbf {X}}_p^\dagger {\mathbf {X}}_p - \widetilde{{\mathbf {X}}}_p^\dagger \widetilde{{\mathbf {X}}}_p\right) \le 4 p \left( 1- e^{-r\lambda } \right) < p \epsilon \end{aligned}$$

almost surely for large enough p. By Lemma 2, we know that \(\widetilde{{\mathbf {X}}}_p^\dagger \widetilde{{\mathbf {X}}}_p\) has a limiting spectral distribution, which we will denote by \(\mu ^\epsilon \). Since this can be done for any \(\epsilon >0\), it follows from the lemma of Benaych–Georges and Cabanal–Duvillard [6, Lemma 12.2] that a limiting distribution exists for \({\mathbf {X}}_p^\dagger {\mathbf {X}}_p\) as well and is given by the weak limit \(\lim _{\epsilon \rightarrow 0^+} \mu ^\epsilon \).

Continuity in the arguments can be derived from the continuity in the case of essentially bounded processes. Let \((\mu ^{(j)},\lambda _j)\) be a sequence of data converging to some \((\mu ,\lambda )\), where we take \(\mu ^{(j)} \rightarrow \mu \) in distribution. If we let \(\Pi ^{(j)}\) be the Lévy measures of \(\mu ^{(j)}\), it follows [19] that for any fixed \(B>1\)

$$\begin{aligned} \Pi ^{(j)} \left( {\mathbb {R}}\backslash [-B,B]\right) \rightarrow \Pi \left( {\mathbb {R}}\backslash [-B,B]\right) \end{aligned}$$
(9)

Therefore, if \(\epsilon >0\) is given and some \(r>0\) is chosen as above, a \(B>1\) can be chosen uniformly across all \(j \in {\mathbb {N}}\). To see this, take some \(B_1 > 1\) as in Lemma 1 so that

$$\begin{aligned} \Pi \left( {\mathbb {R}}\backslash [-B_1,B_1]\right) < \frac{r}{2} \end{aligned}$$

Since (9) holds, there is some \(J \in {\mathbb {N}}\) such that \(j>J\) implies that

$$\begin{aligned} \left| \Pi ^{(j)} \left( {\mathbb {R}}\backslash [-B_1,B_1]\right) - \Pi \left( {\mathbb {R}}\backslash [-B_1,B_1]\right) \right| < \frac{r}{2} \end{aligned}$$

and therefore \(\Pi ^{(j)} \left( {\mathbb {R}}\backslash [-B_1,B_1]\right) < r\) for \(j>J\). Now, for the finitely many \(j \le J\), we choose \(B>B_1\) so that the same condition holds, and we have \(\Pi ^{(j)} \left( {\mathbb {R}}\backslash [-B,B]\right) < r\) for all \(j \in {\mathbb {N}}\). Now, since weak convergence is metrizable, choosing an appropriate metric and applying the triangle inequality gives the result. \(\square \)

4 Consequences of Theorem 1

Corollary 2

If \(\mu \in \text {ID}(*)\) does not follow a normal distribution, then for every \(\lambda \in (0,\infty )\), the probability measure \(\Lambda _\lambda (\mu )\) is distinct from the Marčenko–Pastur distribution and has an unbounded right tail. If \(\mu \in \text {ID}_b(*)\), then \(\Lambda _\lambda (\mu )\) has exponential moments of all orders.

Proof

This follows from the precise statement of the theorem of Benaych–Georges and Cabanal–Duvillard [6, Theorem 3.2]. \(\square \)

By (2), a Lévy process with \(a = 0\) is symmetric precisely when \(\Pi \) is. It follows that every Lévy process can be symmetrized by considering a new process whose Lévy measure is given by

$$\begin{aligned} \Pi ^{s}(A) = \frac{1}{2} \left( \Pi (A) + \Pi (-A)\right) \end{aligned}$$

for any Borel set \(A\subseteq {\mathbb {R}}\) which does not contain a neighborhood of zero, where \(-A = \{x \in {\mathbb {R}} : -x \in A\}\). Both the process \(X_t\) and the resulting symmetric process \(X^\text {s}_t\) can be simultaneously approximated in distribution by essentially bounded Lévy processes with identical even cumulants. As the limiting spectral distribution of the SLCE is independent of the odd cumulants of the original process, it is invariant under the operation of symmetrization.

Corollary 3

Suppose \(\mu ,\nu \in \text {ID}(*)\) are infinitely divisible with Lévy measures \(\Pi _\mu \) and \(\Pi _\nu \). If \(\Pi _\mu ^s = \Pi _\nu ^s\), then \(\Lambda _\lambda (\mu ) = \Lambda _\lambda (\nu )\) for all \(\lambda \in (0,\infty )\).

Proof

In the proof of Lemma 2, the limiting distribution relies only on the even cumulants of the essentially bounded process. As in the proof of Theorem 1, we take Lévy processes \(X_t\) and \(Y_t\) such that \({{\mathscr {L}}}(X_1) = \mu \) and \({{\mathscr {L}}}(Y_1)=\nu \). The decomposition of both processes \(X_t\) and \(Y_t\) into the independent sums \(X^b_t + P_t\) and \(Y^b_t + P_t'\) relies on truncating the Lévy measures \(\Pi _\mu \) and \(\Pi _\nu \) to sets \([-B,B]\). By (4), the even cumulants of the essentially bounded components are identical under the stated condition after choosing \(B>0\) for the two processes simultaneously. Therefore, the limiting distributions for the matrices \(\widetilde{{\mathbf {X}}}_p^\dagger \widetilde{{\mathbf {X}}}_p\) and \(\widetilde{{\mathbf {Y}}}_p^\dagger \widetilde{{\mathbf {Y}}}_p\) are both equal to some \(\mu ^\epsilon \). Since the limiting distributions for both ensembles are weak limits of \(\mu ^\epsilon \), they are equal. \(\square \)

By equality of the nonzero eigenvalues of \({\mathbf {X}}_p^\dagger {\mathbf {X}}_p\) and \({\mathbf {X}}_p{\mathbf {X}}_p^\dagger \), together with the observation that \({\mathbf {X}}_p{\mathbf {X}}_p^\dagger \) is itself an SLCE with data \((\mu ^{*\lambda },1/\lambda )\) (when \(N = \lambda p\), its entries follow \(\mu ^{*1/p} = (\mu ^{*\lambda })^{*1/N}\)), we have immediately that for \(\lambda \ge 1\)

$$\begin{aligned} \Lambda _{1/\lambda }(\mu ^{*\lambda }) = \left( 1 - \frac{1}{\lambda }\right) \delta _0 + \frac{1}{\lambda }\Lambda _\lambda (\mu )\end{aligned}$$
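Indeed, the two Gram matrices share their nonzero eigenvalues, so their ESDs differ only in the mass placed at zero:

$$\begin{aligned} \text {ESD}\left( {\mathbf {X}}_p{\mathbf {X}}_p^\dagger \right) = \left( 1 - \frac{p}{N}\right) \delta _0 + \frac{p}{N}\, \text {ESD}\left( {\mathbf {X}}_p^\dagger {\mathbf {X}}_p\right) \end{aligned}$$

and taking \(p\rightarrow \infty \) with \(p/N \rightarrow 1/\lambda \) yields the display above.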

The parallels to the Marčenko–Pastur law are clear. The following corollary shows a similar correspondence for a recent result of Shlyakhtenko and Tao [20] concerning the interpretation of the (scalar-valued) free semigroup \(\mu ^{\boxplus t}\) in terms of unitarily invariant minors of large matrices.

Corollary 4

Let \({\mathbf {C}}_p\) be an SLCE with data \((\mu ,\lambda )\), with \({\mathbf {X}}_p\) as in Definition 2. For \(k \in [1,\infty )\), let \({\mathbf {p}}_p\) and \({\mathbf {p}}_p'\) denote sequences of \(p\times p\) and \(N\times N\) random diagonal matrices, respectively, whose diagonal entries are in \(\{0,1\}\) and such that

$$\begin{aligned} \lim _{p\rightarrow \infty } \frac{1}{p} \text {Tr}[{\mathbf {p}}_p] = \lim _{p\rightarrow \infty } \frac{1}{N} \text {Tr}[{\mathbf {p}}_p'] = \frac{1}{k} \end{aligned}$$

Then, \({\mathbf {X}}_p^\dagger {\mathbf {p}}'_p {\mathbf {X}}_p\) has a limiting spectral distribution \(\Lambda _{\lambda /k}(\mu )\) and \({\mathbf {p}}_p {\mathbf {C}}_p {\mathbf {p}}_p\) has a limiting spectral distribution

$$\begin{aligned} \left( 1 - \frac{1}{k}\right) \delta _0 + \Lambda _{k\lambda }(\mu ^{*1/k}) \end{aligned}$$

5 Amalgamated Free Lévy Processes

Definition 6

In this article, we define an operator-valued \(^*\)-probability space (sometimes called an algebraic probability space, see [1, 24]) as a collection of data \(({{\mathscr {A}}},\tau ,{{\mathscr {D}}},\Delta )\) such that

  1. \({{\mathscr {A}}}\) is a \(^*\)-algebra.

  2. The pair \(({{\mathscr {A}}},\tau )\) is a \(^*\)-probability space, which is to say that \(\tau \) is a faithful tracial linear form on \({{\mathscr {A}}}\): \(\tau [1] = 1\), \(\tau [a^*a] = 0\) implies \(a=0\), and \(\tau [ab]=\tau [ba]\) for all \(a,b \in {{\mathscr {A}}}\). \(\tau \) is called the expectation on \({{\mathscr {A}}}\).

  3. \({{\mathscr {D}}}\subseteq {{\mathscr {A}}}\) is a \(^*\)-subalgebra.

  4. The conditional expectation \(\Delta :{{\mathscr {A}}}\rightarrow {{\mathscr {D}}}\) is a unital linear map such that \(\Delta [d_1ad_2]=d_1\Delta [a]d_2\) for any \(a \in {{\mathscr {A}}}\) and \(d_1,d_2\in {{\mathscr {D}}}\).

  5. The expectation and conditional expectation are compatible, such that \(\tau [\Delta [a]] = \tau [a]\) for all \(a \in {{\mathscr {A}}}\).

We take \({{\mathscr {D}}}\langle x_1,\ldots , x_n \rangle \) to be the algebra generated by \({{\mathscr {D}}}\) and the elements \(x_i \in {{\mathscr {A}}}\). We say that the (univariate) distribution of an element \(x \in {{\mathscr {A}}}\) is the collection of multi-linear maps

$$\begin{aligned} \mu _x = \Big \{&m_n^x : {{\mathscr {D}}}^{n-1} \rightarrow {{\mathscr {D}}} \ : \ m_n^x(d_1,\ldots ,d_{n-1}) = \Delta \left[ xd_1xd_2x \ldots xd_{n-1}x \right] , n \in {\mathbb {N}} \Big \} \end{aligned}$$

Due to the compatibility of trace and conditional expectation, the classical moments of an element x can be recovered as

$$\begin{aligned} m_n[x]:= \tau \left[ m_n^x(1,\ldots ,1)\right] = \tau [x^n] \end{aligned}$$

When x is self-adjoint and the sequence of moments \(m_n[x]\) uniquely specifies a real-valued probability distribution \(\mu \), we write \({{\mathscr {L}}}(x) = \mu \).

Definition 7

We say that a family of subalgebras \({{\mathscr {A}}}_1,\ldots ,{{\mathscr {A}}}_K \subseteq {{\mathscr {A}}}\) containing \({{\mathscr {D}}}\) are free with amalgamation over \({{\mathscr {D}}}\) (or simply \(\Delta \)-free) if

$$\begin{aligned} \Delta [x_1x_2 \cdots x_M] = 0 \end{aligned}$$

whenever \(x_m \in {{\mathscr {A}}}_{i_m}\) are such that \(\Delta [x_m] = 0\) and \(i_m \ne i_{m+1}\) for \(1 \le m \le M-1\). We say that a family of K elements \(x_1,\ldots ,x_K \in {{\mathscr {A}}}\) are \(\Delta \)-free if the subalgebras \({{\mathscr {D}}}\langle x_j\rangle \) are \(\Delta \)-free.

Definition 8

A \(\Delta \)-free Lévy process \(x_t\) on an operator-valued \(^*\)-probability space \(({{\mathscr {A}}},\tau ,{{\mathscr {D}}},\Delta )\) is a map \(t \mapsto x_t \in {{\mathscr {A}}}\) such that

  1. \(x_0 = 0 \in {{\mathscr {A}}}\)

  2. \(x_t\) is self-adjoint for all \(t\ge 0\)

  3. For any sequence \(0 = t_0 \le t_1 \le \cdots \le t_n\), all increments \(x_{t_k} - x_{t_{k-1}} \in {{\mathscr {A}}}\) are \(\Delta \)-free.

  4. For \(t,s\ge 0\), we have the time invariance of distributions \(\mu _{x_t} = \mu _{x_{t+s} - x_s}\).

Definition 9

Let \({\mathbf {x}} = (x_j)_{j \in J}\) where J is some index set. A word in \({\mathbf {x}}\) is a word on the alphabet of symbols \(\{x_j\}_{j \in J}\). A bracketed word in \({\mathbf {x}}\) is a word enclosed in functional bracket symbols \(\Delta [\) and ]. The set of \(\Delta \)-monomials in \({\mathbf {x}}\) is the quotient of the smallest monoid containing words in \({\mathbf {x}}\), stable under the bracketing operation, with the relations

  1. \(\Delta [e] \sim e\) for the empty word e

  2. For all words \(w_1,w_2,w_3\),

     $$\begin{aligned} \Delta \Big [ \Delta [w_1] \ w_2 \ \Delta [w_3]\Big ] \sim \Delta [w_1] \Delta [w_2] \Delta [w_3] \end{aligned}$$
     (10)

We take \({\mathbb {C}}\langle {\mathbf {x}}\rangle _\Delta \) to be the space of formal linear combinations of \(\Delta \)-monomials in \({\mathbf {x}}\), the so-called \(\Delta \)-polynomials. This is a \(^*\)-algebra in the natural way: The product of monomials is the concatenation of words and bracketed words, and the involution of a monomial reverses the order of its words and bracketed words.

Note that in the construction of our algebra \({\mathbb {C}}\langle {\mathbf {x}}\rangle _\Delta \), we are implicitly assuming that the \(x_j\) are self-adjoint. When the indeterminates \({\mathbf {x}}\) are clear, we write \(D\subseteq {\mathbb {C}}\langle {\mathbf {x}}\rangle _\Delta \) for the subalgebra of elements w such that \(\Delta [w] = w\). Similarly, \(D\langle x_j\rangle \) is the subalgebra generated by bracketed words and the indeterminate \(x_j\). Elements of \(D\langle x_j\rangle \) are precisely linear combinations of alternating monomials of the form

$$\begin{aligned} x_j^{r_1} \ \Delta [w_1] \ x_j^{r_2} \ \Delta [w_2]\cdots \Delta [w_{L}] \ x_j^{r_{L+1}} \end{aligned}$$

where \(w_l \in {\mathbb {C}}\langle {\mathbf {x}}\rangle _\Delta \) and \(r_l \in {\mathbb {N}}\).

In the context of \(p\times p\) matrices, we write \(\Delta [{\mathbf {A}}]\) for the diagonal of \({\mathbf {A}}\), that is, the matrix such that

$$\begin{aligned}{}[\Delta [{\mathbf {A}}]]_{i,j} = \delta _{i,j} [{\mathbf {A}}]_{i,j} \end{aligned}$$

Let \({\mathbf {c}} = ({\mathbf {C}}_j)_{j \in J}\) be a family of \(p\times p\) matrices. If \(q \in {\mathbb {C}}\langle {\mathbf {x}}\rangle _\Delta \) is a \(\Delta \)-polynomial on the same index set J, then we let \(q({\mathbf {c}})\) denote the \(p\times p\) matrix formed by linear combinations of matrix products and applications of the diagonal projection \(\Delta \).
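Concretely, evaluating \(q({\mathbf {c}})\) involves nothing more than matrix products and diagonal extraction; the sketch below is an illustration of ours, using the hypothetical \(\Delta \)-monomial \(q(x_1,x_2) = x_1 \Delta [x_2x_1] x_2\).

```python
import numpy as np

# Sketch: evaluating a Delta-polynomial on p x p matrices, where
# Delta[A] keeps the diagonal of A and zeroes the rest.  The monomial
# q(x1, x2) = x1 Delta[x2 x1] x2 is a hypothetical example.
def delta(A):
    return np.diag(np.diag(A))

rng = np.random.default_rng(2)
p = 4
C1 = rng.normal(size=(p, p)); C1 = C1 + C1.T   # self-adjoint inputs
C2 = rng.normal(size=(p, p)); C2 = C2 + C2.T
q_of_c = C1 @ delta(C2 @ C1) @ C2              # the matrix q(c)
print(np.trace(q_of_c) / p)                    # normalized trace, as in Lemma 3
```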

Lemma 3

(Asymptotic \(\Delta \)-Freeness of SLCE Families) Let \({\mathbf {c}}_p = ({\mathbf {C}}_p^{(1)},\ldots ,{\mathbf {C}}_p^{(K)})\) denote a family of K independent SLCEs \({\mathbf {C}}_p^{(k)}\) with data \((\mu _k,\lambda _k)\), where the \(\mu _k \in \text {ID}_b(*)\) are essentially bounded. If \(q \in {\mathbb {C}}\langle {\mathbf {x}}\rangle _\Delta \) is a \(\Delta \)-polynomial in the indeterminates \({\mathbf {x}} = (x_1,\ldots ,x_K)\), then the following limit exists and is finite:

$$\begin{aligned} \lim _{p\rightarrow \infty } {\mathbb {E}}\left[ \frac{1}{p} \text {Tr}\left[ q({\mathbf {c}}_p) \right] \right] \end{aligned}$$

Furthermore, let \(q_m \in {\mathbb {C}}\langle {\mathbf {x}}\rangle _{\Delta }\) for \(m = 1,\ldots ,M\) be a collection of \(\Delta \)-polynomials, and let \(\epsilon _p\) be a sequence of \(p\times p\) diagonal matrices of the form

$$\begin{aligned} \epsilon _p = \Delta \left[ \Big ( q_1({\mathbf {c}}_p) - \Delta [q_1({\mathbf {c}}_p)] \Big ) \cdots \Big ( q_M({\mathbf {c}}_p) - \Delta [q_M({\mathbf {c}}_p)] \Big )\right] \end{aligned}$$

Then, if \(q_m \in D\langle x_{i_m} \rangle \) and \(i_m \ne i_{m+1}\) for each \(1 \le m \le M-1\), it follows that

$$\begin{aligned} \lim _{p\rightarrow \infty } {\mathbb {E}}\left[ \frac{1}{p} \text {Tr}\left[ \epsilon _p^\dagger \epsilon _p \right] \right] = 0 \end{aligned}$$

Proof

By the independence of the entries of each \({\mathbf {X}}_p^{(k)}\), we see that the \({\mathbf {C}}_p^{(k)}\) are permutation invariant, which is to say that

$$\begin{aligned} {\mathbf {C}}_p^{(k)} {\mathop {=}\limits ^{d}} {\mathbf {S}}_p^\dagger {\mathbf {C}}_p^{(k)} {\mathbf {S}}_p \end{aligned}$$

for any \(p\times p\) permutation matrix \({\mathbf {S}}_p\). By Male’s result on heavy covariance matrices with exploding moments [14, Corollary 2.9], and using the Zakharevich condition (8) for SLCEs driven by essentially bounded Lévy processes, it follows that the limit

$$\begin{aligned} \lim _{p\rightarrow \infty }{\mathbb {E}}\left[ \frac{1}{p} \text {Tr}\left[ q({\mathbf {c}}_p) \right] \right] \end{aligned}$$

exists and is finite for any \(\Delta \)-polynomial \(q \in {\mathbb {C}}\langle {\mathbf {x}}\rangle _\Delta \). By Au et al. [1, Theorem 1.3], families of permutation invariant matrices are asymptotically \(\Delta \)-free in probability. Specifically, the result shows that we have

$$\begin{aligned} \lim _{p\rightarrow \infty }{\mathbb {E}}\left[ \frac{1}{p} \text {Tr}\left[ \epsilon _p^\dagger \epsilon _p \right] \right] = 0 \end{aligned}$$
(11)

for \(\epsilon _p\) as above. \(\square \)

We mention that our use of [14] in order to show the existence of the limit for a \(\Delta \)-polynomial follows from the larger theory of graph operations on families of random matrices, detailed fully in the recent monograph [15]. As the proof of [14, Theorem 2.3] encompasses the so-called graph monomials, the inclusion \({\mathbb {C}}\langle {\mathbf {x}}\rangle \subset {\mathbb {C}}\langle {\mathbf {x}}\rangle _\Delta \subset {\mathbb {C}}{{\mathscr {G}}} \langle {\mathbf {x}}\rangle \) shows that the limit exists for \(\Delta \)-polynomials as well.

Theorem 4

Let \(\mu _k \in \text {ID}_b(*)\) and \(\lambda _k \in (0,\infty )\) for \(k=1,\ldots ,K\). Then, there exists an operator-valued \(^*\)-probability space \(({{\mathscr {A}}},\tau ,{{\mathscr {D}}},\Delta )\) with self-adjoint elements \(x_k \in {{\mathscr {A}}}\) such that \(x_k\) are \(\Delta \)-free and \({{\mathscr {L}}}(x_k) = \Lambda _{\lambda _k}(\mu _k)\).

Proof

For each pair \((\mu _k,\lambda _k)\), we let \({\mathbf {C}}_p^{(k)}\) denote an SLCE such that the entries of each \({\mathbf {X}}_p^{(k)}\) are jointly independent of all others across the family. We write \({\mathbf {c}}_p = ({\mathbf {C}}^{(1)}_p,\ldots ,{\mathbf {C}}^{(K)}_p)\) for the K-tuple of \(p\times p\) self-adjoint random matrices.

Let \({{\mathscr {A}}}_0 = {\mathbb {C}}\langle {\mathbf {x}}\rangle _\Delta \) be the space of \(\Delta \)-polynomials in the indeterminates \({\mathbf {x}} = \{x_k\}_{k=1,\ldots ,K}\), as above. We define a tracial state \(\tau \) on the \(\Delta \)-polynomials of \({{\mathscr {A}}}_0\) in the following way:

$$\begin{aligned} \tau [q] = \lim _{p \rightarrow \infty } {\mathbb {E}}\left[ \frac{1}{p} \text {Tr}\left[ q({\mathbf {c}}_p) \right] \right] \end{aligned}$$

This limit exists by Lemma 3. Now, consider \(Z_\tau = \{q \in {{\mathscr {A}}}_0 : \tau [q^*q]=0\}\). By Cauchy–Schwarz on \(\tau \), we have that for \(n>1\)

$$\begin{aligned} \tau [ (q^*q)^n ] \ = \tau [\left( (q^*q)^{n-1} q^*\right) q] \le \sqrt{ \tau [ q(q^*q)^{n-1}(q^*q)^{n-1}q^* ] } \sqrt{\tau [q^*q]} \end{aligned}$$

which shows that \(\tau [(q^*q)^{n}] = 0\) for all \(n \in {\mathbb {N}}\). Now, if \(q \in Z_\tau \) and \(r \in {{\mathscr {A}}}_0\), we have

$$\begin{aligned} \tau [(qr)^*qr] = \tau [(q^*q)(rr^*)] \le \sqrt{\tau [(q^*q)^2]} \sqrt{\tau [(r^*r)^2]} \end{aligned}$$

A similar argument on the other side shows that \(Z_\tau \) is a two-sided ideal in \({{\mathscr {A}}}_0\), and so we let \({{\mathscr {A}}} = {{\mathscr {A}}}_0 \big / Z_\tau \). Take \({{\mathscr {D}}}\) to be the set of equivalence classes of elements from \(D\subseteq {\mathbb {C}}\langle {\mathbf {x}}\rangle _\Delta \), which is to say the elements \(q \in {{\mathscr {A}}}\) such that \(\Delta [q] = q\). The data \(({{\mathscr {A}}},\tau ,{{\mathscr {D}}},\Delta )\) form an operator-valued \(^*\)-probability space, following from properties of the matrix trace, the matrix diagonal, and (10).

We now consider the monomials \(q_k({\mathbf {x}}) = x_k\). These are self-adjoint, and their classical moments can be computed via the expressions

$$\begin{aligned} \tau [x_k^n] = \lim _{p\rightarrow \infty } {\mathbb {E}}\left[ \frac{1}{p} \text {Tr}\left[ \left( {\mathbf {C}}_p^{(k)}\right) ^n\right] \right] \end{aligned}$$

By the weak convergence of the ESD of an SLCE in Theorem 1, this equals the \(n^{\text {th}}\) moment of \(\Lambda _{\lambda _k}(\mu _k)\). The existence of exponential moments of \(\Lambda _{\lambda _k}(\mu _k)\) from Corollary 2 guarantees that it is completely determined by its moment sequence, so we have \({{\mathscr {L}}}(x_k) = \Lambda _{\lambda _k}(\mu _k)\).

To see that the elements \(x_k\) are \(\Delta \)-free, let \(q_m \in {{\mathscr {A}}}\) for \(m = 1,\ldots ,M\) be such that \(q_m \in {{\mathscr {D}}}\langle x_{i_m}\rangle \) with \(i_m \ne i_{m+1}\) for each \(1 \le m \le M-1\). These \(q_m\) are precisely those in the equivalence classes of elements of \(D\langle x_{i_m}\rangle \). Setting

$$\begin{aligned} \epsilon = \Delta \left[ \Big ( q_1 - \Delta [q_1] \Big ) \cdots \Big ( q_M - \Delta [q_M] \Big )\right] , \end{aligned}$$

we have that

$$\begin{aligned} \tau \left[ \epsilon ^*\epsilon \right] = \lim _{p\rightarrow \infty } {\mathbb {E}}\left[ \frac{1}{p} \text {Tr} \left[ \epsilon ^*({\mathbf {c}}_p) \epsilon ({\mathbf {c}}_p) \right] \right] \end{aligned}$$

Applying Lemma 3 to \(\epsilon _p = \epsilon ({\mathbf {c}}_p)\), this expression is zero, and so \(\epsilon = 0\). \(\square \)

The realization of \({{\mathscr {A}}}\) as an operator algebra is possible using the GNS construction of Takesaki [22]. We consider the inner product \(\langle q_1,q_2 \rangle = \tau [q_2^*q_1]\) on \({{\mathscr {A}}}\), and its Hilbert space completion \({\mathfrak {H}}\). In the non-Gaussian case, the left action \(q({\mathbf {x}}) \mapsto x_kq({\mathbf {x}})\) of the indeterminates \(x_k\) on \({{\mathscr {A}}}\) is necessarily unbounded by Corollary 2. Instead, \({{\mathscr {A}}}\) forms an \(O^*_p\)-algebra [12] and can be embedded as a \(^*\)-subalgebra of the operators affiliated with the von Neumann algebra \(B({\mathfrak {H}})\). In the operator-valued setting, such algebras can be studied through the non-commutative Cauchy transform [26]; however, we do not pursue this further. These results provide the framework to prove the existence of \(\Delta \)-free Lévy processes derived from the SLCE, as stated in Theorem 2.

Proof of Theorem 2

As above, we consider \({\mathbb {C}}\langle {\mathbf {x}}\rangle _\Delta \), where \({\mathbf {x}} = \{x_t\}_{t \in [0,\infty )}\) are indeterminates indexed by a continuous parameter \(t \ge 0\). Let \(\{z_{p,i,j}\}_{p,i,j \in {\mathbb {N}}}\) denote an infinite-dimensional array of independent random variables such that \({{\mathscr {L}}}(z_{p,i,j}) = \mu ^{*\frac{1}{p}}\).

We define the tracial state as follows. For an element \(q \in {\mathbb {C}}\langle {\mathbf {x}}\rangle _\Delta \), let \(0\le t_1\le \cdots \le t_K\) be the indices of its indeterminates, and set \(t_0 = 0\). For each \(p \in {\mathbb {N}}\) and \(0\le k \le K\), take \({\mathbf {X}}^{(k)}_p\) to be the \(\lceil t_K p\rceil \times p\) matrix with entries \([{\mathbf {X}}_p^{(k)}]_{i,j} = z_{p,i,j}\) when \(1\le i \le \lceil t_k p\rceil \), and zero otherwise. Similarly, let \({\mathbf {C}}_p^{(k)} = \left( {\mathbf {X}}_p^{(k)}\right) ^\dagger {\mathbf {X}}_p^{(k)}\), an SLCE with data \((\mu ,t_k)\). Setting \({\mathbf {c}}_p = ({\mathbf {C}}_p^{(1)},\ldots ,{\mathbf {C}}_p^{(K)})\), we define

$$\begin{aligned} \tau [q] = \lim _{p \rightarrow \infty } {\mathbb {E}}\left[ \frac{1}{p} \text {Tr}\left[ q({\mathbf {c}}_p) \right] \right] \end{aligned}$$

As in Theorem 4, this is well defined, and we take \({{\mathscr {A}}}={\mathbb {C}}\langle {\mathbf {x}}\rangle _\Delta \big / Z_\tau \), quotienting by the two-sided ideal \(Z_\tau \).

We define our non-commutative Lévy process to be the map \(t \mapsto p_t \in {{\mathscr {A}}}\), where \(p_t({\mathbf {x}}) = x_t\) is a monomial in the single indeterminate \(x_t\). Note that the \({\mathbf {C}}_p^{(k)}\) need not satisfy any freeness condition, but for any \(t_k\) with \(k = 1,\ldots ,K\) as above, we have

$$\begin{aligned} {\mathbf {X}}_p^{(k)} - {\mathbf {X}}_p^{(k-1)} = \begin{bmatrix} {\mathbf {0}}_{1}^{(k)} \\ \hline \hat{{\mathbf {X}}}_p^{(k)} \\ \hline {\mathbf {0}}_{2}^{(k)} \end{bmatrix} \end{aligned}$$

where \(\hat{{\mathbf {X}}}_p^{(k)}\) is a \((\lceil t_{k}p\rceil - \lceil t_{k-1}p\rceil )\times p\) matrix with i.i.d. entries following the distribution \(\mu ^{*\frac{1}{p}}\), \({\mathbf {0}}_{1}^{(k)}\) is a \(\lceil t_{k-1}p\rceil \times p\) matrix of zeros, and \({\mathbf {0}}_2^{(k)}\) is a \((\lceil t_{K}p\rceil - \lceil t_kp \rceil ) \times p\) matrix of zeros. Writing \(\hat{{\mathbf {C}}}_p^{(k)} = \left( \hat{{\mathbf {X}}}_p^{(k)}\right) ^\dagger \hat{{\mathbf {X}}}_p^{(k)}\), it follows that

$$\begin{aligned} {\mathbf {C}}_p^{(k)} = \sum _{j=1}^k \hat{{\mathbf {C}}}_p^{(j)} \end{aligned}$$

where the \(\hat{{\mathbf {C}}}_p^{(j)}\) are independent, permutation-invariant \(p\times p\) SLCE matrices with data \((\mu ,t_j-t_{j-1})\). Applying the asymptotic \(\Delta \)-freeness to these \(\hat{{\mathbf {C}}}_p^{(j)}\) as in Theorem 4, it follows that the increments \(x_{t_k}-x_{t_{k-1}} \in {{\mathscr {A}}}\) are \(\Delta \)-free. \(\square \)