Abstract
In the first part (Bourgade et al., Local circular law for random matrices, preprint, arXiv:1206.1449, 2012) of this article series, Bourgade, Yau and the author of this paper proved a local version of the circular law up to the finest scale \(N^{-1/2+ {\varepsilon }}\) for non-Hermitian random matrices at any point \(z \in \mathbb {C}\) with \(||z| - 1| > c \) for any constant \(c>0\) independent of the size of the matrix. In the second part (Bourgade et al., The local circular law II: the edge case, preprint, arXiv:1206.3187, 2012), they extended this result to include the edge case \( |z|-1={{\mathrm{o}}}(1)\), under the main assumption that the third moments of the matrix elements vanish. (Without the vanishing third moment assumption, they proved that the circular law is valid near the spectral edge \( |z|-1={{\mathrm{o}}}(1)\) up to scale \(N^{-1/4+ {\varepsilon }}\).) In this paper, we will remove this assumption, i.e. we prove a local version of the circular law up to the finest scale \(N^{-1/2+ {\varepsilon }}\) for non-Hermitian random matrices at any point \(z \in \mathbb {C}\).
1 Introduction and main result
The circular law in random matrix theory describes the macroscopic limiting spectral measure of normalized non-Hermitian matrices with independent entries. Its origin goes back to the work of Ginibre [12], who found the joint density of the eigenvalues of such Gaussian matrices. More precisely, for an \(N\times N\) matrix with independent entries \(\frac{1}{\sqrt{N}}z_{ij}\) such that \(z_{ij}\) is identically distributed according to the measure \(\mu _g=\frac{1}{\pi }e^{-|z|^2}{\, \mathrm d A}(z)\) (\({\, \mathrm d A}\) denotes the Lebesgue measure on \(\mathbb {C}\)), its eigenvalues \(\mu _1,\dots ,\mu _N\) have a probability density proportional to \(\prod _{i<j}|\mu _i-\mu _j|^2\, e^{-N\sum _{k=1}^N|\mu _k|^2}\)
with respect to the Lebesgue measure on \(\mathbb {C}^{N}\). These random spectral measures define a determinantal point process with the explicit kernel (see [12]) \(K_N(z_1,z_2)=\frac{N}{\pi }\,e^{-\frac{N}{2}(|z_1|^2+|z_2|^2)}\sum _{\ell =0}^{N-1}\frac{(Nz_1\bar{z}_2)^{\ell }}{\ell !}\)
with respect to the Lebesgue measure on \(\mathbb {C}\). This integrability property allowed Ginibre to derive the circular law for the eigenvalues, i.e., \(\frac{1}{N}\rho _1^{(N)}\) converges to the uniform measure on the unit disk, \(\frac{1}{\pi }\mathbf {1}_{|z|\leqslant 1}{\, \mathrm d A}(z)\).
This limiting law also holds for real Gaussian entries [9], for which a more detailed analysis was performed in [6, 11, 19].
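As a quick numerical illustration of the circular law (not part of the argument; the matrix size, seed, and thresholds below are arbitrary choices), one can sample a complex Ginibre matrix and check that its spectrum fills the unit disk roughly uniformly:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000
# Complex Ginibre matrix: i.i.d. entries with mean 0 and variance 1/N.
X = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)
mu = np.linalg.eigvals(X)

# Circular law: almost all eigenvalues satisfy |mu| <= 1 + o(1).
inside = np.mean(np.abs(mu) <= 1.05)
print(f"fraction of eigenvalues with |mu| <= 1.05: {inside:.3f}")

# Uniformity check: the disk of radius 1/2 has a quarter of the area of the
# unit disk, so it should contain about a quarter of the eigenvalues.
frac_half = np.mean(np.abs(mu) <= 0.5)
print(f"fraction with |mu| <= 1/2: {frac_half:.3f}  (circular law predicts 0.25)")
```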
For non-Gaussian entries, Girko [13] argued that the macroscopic limiting spectrum is still given by (1.3). His main insight is commonly known as the Hermitization technique, which converts the convergence of complex empirical measures into the convergence of logarithmic transforms of a family of Hermitian matrices. If we denote the original non-Hermitian matrix by \(X\) and the eigenvalues of \(X\) by \(\mu _j\), then for any \({C}^2\) function \(F\) we have the identity \(\frac{1}{N}\sum _{j=1}^N F(\mu _j)=\frac{1}{4\pi N}\int \Delta F(z)\log \det \big ((X^*-z^*)(X-z)\big ){\, \mathrm d A}(z).\)
Due to the logarithmic singularity at \(0\), it is clear that the small eigenvalues of the Hermitian matrix \((X^* - z^* ) (X-z) \) play a special role. A key question is to estimate the small eigenvalues of \((X^* - z^* ) (X-z)\), or in other words, the small singular values of \( (X-z)\). This problem was not treated in [13], but the gap was remedied in a series of papers. First, Bai [3] treated the logarithmic singularity assuming bounded density and bounded high moments for the entries of the matrix (see also [4]). Lower bounds on the smallest singular values were given by Rudelson and Vershynin [17, 18]; subsequently, Tao and Vu [20], Pan and Zhou [16] and Götze and Tikhomirov [14] weakened the moment and smoothness assumptions for the circular law, until the optimal \(\text{ L }^2\) assumption, under which the circular law was proved in [21]. On the other hand, Wood [23] showed that the circular law also holds for sparse random \(n\times n\) matrices where each entry is nonzero with probability \( n^{\alpha -1}\), where \(0<\alpha \leqslant 1\).
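Numerically, the link between the eigenvalues of \(X\) and the singular values of \(X-z\) is the identity \(\sum _j \log |\mu _j-z|=\log |\det (X-z)|=\frac{1}{2}\sum _j\log \lambda _j\big ((X^*-z^*)(X-z)\big )\). A small sanity check of this identity (an illustration only; the size and the point \(z\) are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200
X = rng.standard_normal((N, N)) / np.sqrt(N)
z = 0.3 + 0.1j

# log|det(X - z)| computed from the eigenvalues mu_j of X ...
mu = np.linalg.eigvals(X)
lhs = np.sum(np.log(np.abs(mu - z)))

# ... and from the singular values of X - z, i.e. the spectrum of the
# Hermitization (X - z)^* (X - z):  log|det| = (1/2) sum_j log lambda_j.
s = np.linalg.svd(X - z * np.eye(N), compute_uv=False)
rhs = np.sum(np.log(s))

print(lhs, rhs)
```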
In the first part of this article [7], Bourgade, Yau and the author of this paper proved a local version of the circular law, up to the optimal scale \(N^{-1/2 + {\varepsilon }}\), in the bulk of the spectrum. In the second part [8], they extended this result to include the edge case, under the assumption that the third moments of the matrix elements vanish. (Without the vanishing third moment assumption, they also proved that the circular law is valid near the spectral edge \( |z|-1={{\mathrm{o}}}(1)\) up to scale \(N^{-1/4+ {\varepsilon }}\).) This vanishing third moment condition is also the main assumption in Tao and Vu’s work on the local circular law [22]. In the current paper, we will remove this assumption, i.e. we prove a local version of the circular law up to the finest scale \(N^{-1/2+ {\varepsilon }}\) for non-Hermitian random matrices at any point \(z \in \mathbb {C}\).
More precisely, we consider an \(N \times N\) matrix \(X\) with independent real centered entries with variance \( N^{-1}\). Let \(\mu _j, j\in [\![\!1, N\!]\!]\), denote the eigenvalues of \(X\). To state the local circular law, we first define the notion of stochastic domination.
Definition 1.1
Let \(W=W^{(N)}\) be a family of random variables and \(\Psi =\Psi ^{(N)}\) be a family of deterministic parameters. We say that \(W\) is stochastically dominated by \(\Psi \) if for any \( \sigma > 0\) and \(D > 0\) we have \(\mathbb {P}\big (|W|> N^{\sigma }\Psi \big )\leqslant N^{-D}\)
for sufficiently large \(N\). We denote this stochastic domination property by \(W\prec \Psi \).
Furthermore, let \(U^{(N)}\) be a possibly \(N\)-dependent parameter set. We say that \(W(u)\) is stochastically dominated by \(\Psi (u)\) uniformly in \(u\in U^{(N)}\) if for any \( \sigma > 0\) and \(D > 0\) we have \(\sup _{u\in U^{(N)}}\mathbb {P}\big (|W(u)|> N^{\sigma }\Psi (u)\big )\leqslant N^{-D}\)
for sufficiently large \(N\) (which may depend on \(\sigma \) and \(D\)).
Note In most cases in this paper, \(U^{(N)}\) is chosen as the product of the index set \(1\leqslant i, j\leqslant N\) and some compact set in \(\mathbb {C}^2\).
In this paper, as in [7, 8] and [22], we assume that the probability distributions of the matrix elements satisfy the following uniform subexponential decay property: \(\sup _{i,j}\mathbb {P}\big (|\sqrt{N}X_{ij}|\geqslant \lambda \big )\leqslant \vartheta ^{-1}e^{-\lambda ^{\vartheta }}\quad (\lambda >0)\)
for some constant \(\vartheta >0\) independent of \(N\). This condition can of course be weakened to the assumption that sufficiently high moments are bounded, but then the error estimates in the following theorem would be weakened as well.
Note Most constants appearing in this work may depend on \(\vartheta \), but we will not emphasize this dependence in the proof.
Let \(f:\mathbb {C}\rightarrow \mathbb {R}\) be a fixed smooth compactly supported function, and let \(f_{z_0}(\mu )=N^{2s}f(N^s(\mu -z_0))\), where \(z_0\) may depend on \(N\) and \(s\) is a fixed scaling parameter in \([0,1/2]\). Let \(D\) denote the unit disk. Theorem 2.2 of [7] and Theorems 1.2 and 1.3 of [8] assert that the following estimate holds (note that \(\Vert f_{z_0}\Vert _1=O(1)\)): \(\left|\frac{1}{N}\sum _j f_{z_0}(\mu _j)-\frac{1}{\pi }\int _{D} f_{z_0}(z){\, \mathrm d A}(z)\right|\prec N^{-1+2s}\)
[7] if \(||z_0|-1|>c\) for some \(c>0\) independent of \(N\), or [8] if the third moments of the matrix entries vanish. This implies that the circular law holds after zooming in to scale \(N^{-1/2+{\varepsilon }}\) (\({\varepsilon }>0\)) under these conditions. In particular, there are neither clusters of eigenvalues nor holes in the spectrum at such scales. We note that in [7] and [8] the scaling parameter was denoted by \(a\), but the letter \(a\) will be used as a fixed index in this work.
We aim at understanding the circular law for any \(z_0\in \mathbb {C}\) without the vanishing third moment assumption. The following theorem is our main result.
Theorem 1.2
(Local circular law) Let \(X\) be an \(N\times N\) matrix with independent centered entries of variance \(1/N\). Suppose that the distributions of the matrix elements satisfy the subexponential decay property (1.7). Let \(f_{z_0}\) be defined as previously [above (1.8)] and let \(D\) denote the unit disk. Then for any \(s \in [0,1/2]\) and any \(z_0\in \mathbb {C}\), we have \(\left|\frac{1}{N}\sum _j f_{z_0}(\mu _j)-\frac{1}{\pi }\int _{D} f_{z_0}(z){\, \mathrm d A}(z)\right|\prec N^{-1+2s}.\)
Notice that the main new assertion of (1.9) concerns the case \(|z_0|-1={{\mathrm{o}}}(1)\) with nonvanishing third moments, since the other cases were proved in [7] and [8].
Remark We believe that the assumption of subexponential decay can be replaced with a finite (high) moment condition by using the method in [15], which provides a necessary and sufficient condition for the Tracy–Widom law of Wigner matrices.
Remark Shortly after the preprint [7] appeared, a version of the local circular law (both in the bulk and near the edge) was proved by Tao and Vu [22] under the assumption that the first three moments of the matrix entries match those of a Gaussian distribution, i.e., the third moments vanish.
In the next section we will introduce our main strategy and improvements.
2 Proof of Theorem 1.2
Proof of Theorem 1.2
If \(s=0\) (global case) or \(||z_0|-1|\ne {{\mathrm{o}}}(1)\) (bulk case), Theorem 1.2 has been proved in Theorem 1.3 of [8] and Theorem 2.2 of [7], respectively. Furthermore, it is easy to see that the result in Theorem 1.2 for \(s=1/2\) follows from the ones for \(s<1/2\) (thanks to the extra \(\sigma \) room we have in the definition of stochastic domination). Hence in this proof we can assume that \(s\in (0,1/2)\) and \(||z_0|-1|={{\mathrm{o}}}(1)\).
In the edge case, our Theorem 1.2 was proved in Theorem 1.2 of [8] under the vanishing third moment assumption. Hence the goal of this paper is to improve the proof of Theorem 1.2 of [8]. One can easily check that in the proof of Theorem 1.2 of [8], the condition \(\mathbb {E}X^3_{ij}=0\) was only used in Lemma 2.13 of [8]. Therefore, we only need to prove a stronger version of Lemma 2.13 in [8], i.e., one without the vanishing third moment condition. More precisely, it only remains to prove the following Lemma 2.2. (Here we use the same notation as in [8], except for the scaling parameter.) \(\square \)
Before stating Lemma 2.2, i.e., the stronger version of Lemma 2.13 of [8], we introduce some definitions and notations. First, we define \(Y_z:=X-zI,\)
where \(I\) is the identity operator. In the following, we use the notation \(A\sim B\) when \(c B \leqslant |A|\leqslant c^{-1}B\), where \(c>0\) is independent of \(N\). For any matrix \(M\), we denote by \(M^T\) the transpose of \(M\) and by \(M^*\) its Hermitian conjugate. Usually we choose \(z -z_0\sim N^{-s}\), hence we define the scaled parameter \(\xi :=N^{s}(z-z_0)\).
Let \(\lambda _j(z)\) be the \(j\)th eigenvalue (in increasing order) of \(Y^*_zY_z\). For \(w\in \mathbb {C}\) with \({{\mathrm{Im}}}w>0\), define the Green function of \(Y^*_z Y_z\) and its trace by \(G(w):=(Y_z^*Y_z-w)^{-1}, \quad m(w,z):=\frac{1}{N}{{\mathrm{Tr}}}\,G(w).\)
Let \( m_\mathrm{c}:=m_\mathrm{c}(w,z)\) be the unique solution of \(m_\mathrm{c}=\frac{1+m_\mathrm{c}}{-w(1+m_\mathrm{c})^2+|z|^2}\)
with positive imaginary part. As proved in [7] and [8], in appropriate regions of \((w, z)\), \(m(w,z)\) converges to \(m_\mathrm{c}(w,z)\) pointwise with high probability as \(N\rightarrow \infty \). Let \(\rho _c\) be the measure whose Stieltjes transform is \(m_\mathrm{c} \). This measure is compactly supported and \({{\mathrm{supp}}}\rho _c=[\max \{0, \lambda _-\}, \lambda _+]\), where \(\lambda _{\pm }:=\frac{(\alpha \pm 3)^3}{8(\alpha \pm 1)},\quad \alpha :=\sqrt{1+8|z|^2},\)
Note that \(\lambda _-\) has the same sign as \(|z|-1\). It is well known that \( \rho _\mathrm{c} (x,z)\) can be obtained from its Stieltjes transform \(m_\mathrm{c}(x+{\mathrm {i}}\eta ,z)\) via \(\rho _\mathrm{c}(x,z)=\frac{1}{\pi }\lim _{\eta \downarrow 0}{{\mathrm{Im}}}\, m_\mathrm{c}(x+{\mathrm {i}}\eta ,z).\)
(Some basic properties of \(m_\mathrm{c}\) and \(\rho _\mathrm{c}\) were discussed in Section 2.2 of [8].)
Definition 2.1
\(\phi , \chi , I\) and \(Z_{X,\mathrm{c}}^{(f)}\)
Let \(h(x)\) be a smooth increasing function supported on \([1,+\infty )\) with \(h(x)=1\) for \(x\geqslant 2 \) and \(h(x)=0\) for \(x\leqslant 1\). For any \({\varepsilon }>0\), define \(\phi \) on \(\mathbb {R}_+\) as follows (note that \(\lambda _+\) depends on \(z\)):
Let \( \chi \) be a smooth cutoff function supported in \( [-1,1]\) with bounded derivatives and \( \chi (y) = 1\) for \(|y| \leqslant 1/2\). Recall that \({\, \mathrm d A}\) denotes the Lebesgue measure on \(\mathbb {C}\). For any fixed function \(g\) defined on \(\mathbb {C}\), we define:
and
Note The condition \(E\geqslant N^{-2+2{\varepsilon }}\) was not part of the definition of \(I\) used in [8], but this condition is clearly implied by \(\phi '(E)\ne 0\); i.e., our new \(I\) does not change the value of \(Z_{X,\mathrm{c}}^{(g)}\). One can also easily check:
With these notations and definitions, we claim the following main lemma. It is a stronger version of Lemma 2.13 in [8], i.e., one without the vanishing third moment condition.
Lemma 2.2
Under the assumptions of Theorem 1.2, there exists a constant \(C>0\) such that for any small enough \({\varepsilon }>0\) (independent of \(N)\), if \(||z_0|-1|\leqslant {\varepsilon }\) and \(s\in (0, 1/2)\), then
where \( c_f\) is a constant depending only on the function \(f\).
As mentioned above, in the proof of Theorem 1.2 of [8], the vanishing third moment condition was only used in Lemma 2.13 of [8]. Therefore, with the improved Lemma 2.2, one can obtain our main result, Theorem 1.2, as in [8].\(\square \)
In the next step, Lemma 2.2 will be reduced to Lemma 2.4.
We note that the bounds proved in [8] for the \(G_{ij}\)’s are not strong enough for our purpose in this paper, and it seems impossible to improve these bounds in general. On the other hand, though the behavior of the \(G\)’s and \(\mathcal G\)’s is unstable in the region \(|m|\leqslant (N\eta )^{-1}\), they are very stable in the region \(|m|\gg (N\eta )^{-1}\), where many stronger bounds can be derived. Therefore, in the following proof, we separate \(Z_{X,c}\) into two parts: the part coming from the region \(|m|\leqslant (N\eta )^{-1}\) and the part coming from the region \(|m|\gg (N\eta )^{-1}\). The first part can be easily bounded, since \(m\) is small there, and so is its contribution to \(Z_{X,c}\). For the second part, we will apply the Green’s function comparison method (first introduced in [10] for generalized Wigner matrices) and prove our new stronger bounds in the region \(|m|\gg (N\eta )^{-1}\).
On the other hand, the old Green’s function comparison method is not enough for our purpose, which is also the reason that the authors of [8] needed the extra assumption on the third moments of the matrix entries. In this work, we will introduce an improved Green’s function comparison method, which provides an extra factor \(N^{-1/2} \) compared with the previous method. This idea was motivated by the work in [5].
Definition 2.3
\(t_X\) and \(A_X^{(f)}\)
For an \(N\times N\) matrix \(X\), we define \(t_X:=N^{-{\varepsilon }}N\eta \,{{\mathrm{Re}}}\,m(w,z)\).
Now we extend the function \(h\) defined in Definition 2.1 to the whole real line by \(h(x)=h(-x)\), still using the same notation \(h(x)\). With these notations, we define:
where \(z=z_0+N^{-s}\xi , w=E+i\eta , \phi =\phi _{{\varepsilon },z}\) and \(t_X= t_X( {\varepsilon }, w, z)\).
Note that the only difference between \(A_X^{(f)}\) and \(Z_{X,\mathrm{c}}^{(f)}\) is the factor \(h(t_X)\) in front of \({{\mathrm{Re}}}m\). Hence the difference between \(A_X^{(f)}\) and \(Z_{X,\mathrm{c}}^{(f)}\) only comes from the region where \(h(t_X)\ne 1\), i.e., \(|{{\mathrm{Re}}}m| \leqslant 2N^{{\varepsilon }} (N\eta )^{-1}\). Therefore, by the definition of \(\phi \) we have
where we used \(|(1-h(t_X)){{\mathrm{Re}}}m|\leqslant 2 N^{\varepsilon }(N\eta )^{-1}\).
Proof of Lemma 2.2
With (2.8), it only remains to prove the following lemma.
Lemma 2.4
Under the assumptions of Theorem 1.2, there exists a constant \(C>0\) such that for any small enough \({\varepsilon }>0\) (independent of \(N)\), if \(||z_0|-1|\leqslant {\varepsilon }\) and \( s\in (0, 1/2)\), then
where \( c_f\) is a constant depending only on the function \(f\).
In the next subsection, we will introduce the basic idea of proving Lemma 2.4. The rigorous proof will start from Sect. 3.
2.1 Basic strategy of proving Lemma 2.4
Before we give the complete proof of this lemma, we introduce the basic idea and the main improvement in the remainder of this section. Lemma 2.2 was proved in [8] under the vanishing third moment condition. Together with (2.8), that result implies that if the \(X_{ij}\)’s are Gaussian variables for all \(1\leqslant i,j\leqslant N\), then for any fixed \(p\in 2\mathbb {N}\),
We note that \(A_{X}^{(f)}\) is basically a linear functional of \(m(w,z)\). Hence, as in [8], we will apply the Green function comparison method to show that for sufficiently large \(N\),
for any two ensembles \(X\) and \(X'\) whose matrix elements satisfy the conditions of Theorem 1.2. To complete the proof of Lemma 2.4, we will choose \(X'\) to be the Ginibre ensemble, whose matrix elements are Gaussian variables, and \(X\) to be the general ensemble in Lemma 2.4. Combining (2.9) and (2.10) with Markov’s inequality, one immediately obtains Lemma 2.4.
In applying the Green function comparison method, we estimate expectation values of functionals of \(Y , G=(Y^* Y -w)^{-1}\) and \(\mathcal G=(YY^* -w)^{-1}\), i.e., \(\mathbb {E}F(Y, G, \mathcal G)\). In [8] and most previous applications of the Green function comparison method, one can only bound the expectation values of these functionals by their stochastic dominations. For example, in [8], for \(i\ne j\) and \(|w|^{1/2}\ll (N\eta )\), one has
With this stochastic domination, the authors of [8] obtained \(|\mathbb {E}(Y G)_{ij}|\leqslant N^{\sigma } \) for any \(\sigma >0\). In the present paper, under the condition \(|{{\mathrm{Re}}}m|\gg (\eta N)^{-1}\), i.e., \(h(t_X)>0\), we will first show an improved bound: for \(i\ne j\) and \(|w|^{1/2}\ll (N\eta )\),
Then, using a new idea in the Green’s function comparison method, we will show that the expectation value of this term carries an extra factor \(N^{-1/2}\), i.e., for any \(\sigma >0\),
This extra factor \(N^{-1/2}\) plays a key role in our new proof. A similar method was used in [5].
Now we explain the basic idea behind proving (2.11)-type bounds, i.e., where the extra \(N^{-1/2}\) factor comes from. For simplicity we assume \(X_{ij}\in \mathbb {R}\). Let \(Y^{(i,i)}_z\) be the matrix obtained by removing the \(i\)-th row and column of \(Y_z\), and define
We write \(h(t_X)(Y_zG)_{ij}\) as a polynomial of the \(i\)-th row/column of \(X\): \(X_{ik}, X_{ki}\) (\(1\leqslant k\leqslant N\)), \(G^{(i,i)}\) and \( \mathcal G^{(i,i)}\), i.e.,
where \(P\) is a polynomial. By definition, \(X_{ik}, X_{ki}\) are independent of \(G^{(i,i)}\) and \(\mathcal G^{(i,i)}\). We will show that in this polynomial the degree of every monomial w.r.t. the \(X_{ik}\)’s and \(X_{ki}\)’s is always odd. Therefore, in taking the expectation value, with the assumptions \(\mathbb {E}X_{ij}=0\) and \(|\mathbb {E}(X _{ij})^n|\leqslant O(N^{-n/2})\), one gains an extra factor \(N^{-1/2}\). The following simple example shows why the odd powers give an extra factor \(N^{-1/2}\). Suppose we estimate \(\mathbb {E}\sum _{kst} X_{ik}G^{(i,i)}_{kl}X_{is}G^{(i,i)}_{st}X_{it}\). Since \(X_{ik}, X_{ki}\) are independent of \(G^{(i,i)}\) and \(\mathcal G^{(i,i)}\), \(\mathbb {E}X_{ij}=0\) and \(|\mathbb {E}(X _{ij})^n|\leqslant O(N^{-n/2})\), the nonzero contributions only come from the terms where \(k=s=t\); therefore \(\mathbb {E}\sum _{kst} X_{ik}G^{(i,i)}_{kl}X_{is}G^{(i,i)}_{st}X_{it}=\sum _{k}\mathbb {E}\big [(X_{ik})^3\big ]\,\mathbb {E}\,G^{(i,i)}_{kl}G^{(i,i)}_{kk}=O(N\cdot N^{-3/2})=O(N^{-1/2}),\) provided the entries of \(G^{(i,i)}\) are \(O(1)\).
On the other hand, without taking the expectation, this term can only be bounded without the \(N^{-1/2}\) factor (by large deviation theory).
Note One will not see this \(N^{-1/2}\) factor if the degree is even, e.g., for \(\mathbb {E}\sum X_{is}G^{(i,i)}_{st}X_{it}\). Based on this new idea, the main task in proving Lemma 2.4 and (2.11)-type bounds is to write the functionals of the \(Y_z\)’s, \(G\)’s and \(\mathcal G\)’s as polynomials of \(X_{ik}, X_{ki}\) (\(1\leqslant k\leqslant N\)), \(G^{(i,i)}\) and \(\mathcal G^{(i,i)}\) for some \(1\leqslant i\leqslant N\) (up to negligible error), and to count the degree of each monomial.
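For the even-degree example just mentioned, the analogous computation gains nothing; schematically (assuming, as above, that the entries of \(G^{(i,i)}\) are \(O(1)\)):

```latex
% Even degree: the pairing forces s = t, so only the second moment E[X^2] = N^{-1}
% appears, and it is exactly matched by the N terms of the sum.
\mathbb{E}\sum_{s,t} X_{is}G^{(i,i)}_{st}X_{it}
  = \sum_{s}\mathbb{E}\big[(X_{is})^2\big]\,\mathbb{E}\,G^{(i,i)}_{ss}
  = N\cdot N^{-1}\cdot O(1) = O(1),
```

which is of the same order as the typical size of the sum itself, so taking the expectation brings no extra smallness in the even case.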
3 Proof of Lemma 2.4
In this section, we apply the Green’s function comparison method to prove Lemma 2.4. We will see that the key input for proving Lemma 2.4 is Lemma 3.2. This new lemma is similar to (3.62)–(3.63) of [8], but without the vanishing third moment assumption. More precisely, (3.62)–(3.63) of [8] are similar to (3.4) of this work, and Lemma 3.2 is the key step in proving (3.4). The proof of Lemma 3.2 will start from Sect. 4. In [8], (3.62)–(3.63) could be easily proved by bounding the expectation values of these terms by their stochastic dominations. In this paper, as explained in Sect. 2.1, we will introduce a new comparison method to show that the contribution coming from the third moments of the \(X_{ij}\)’s carries an extra factor \(N^{-1/2}\) in expectation, i.e., Lemma 3.2.
First of all, we state the following lemma. It will be used to estimate expectation values of some random variables (like \(\widetilde{A} A \,v^p\) in the following lemma) which are stochastically dominated, but not \(L_\infty \) bounded.
Lemma 3.1
Let \(v=v^{(N)}\) be a family of centered random variables with variance \(1/N\), satisfying the subexponential decay property (1.7). Let \(\widetilde{A}=\widetilde{A}^{(N)}\) and \(A=A^{(N)}\) be families of random variables. Suppose \(A\prec 1\) and \(A=\sum _{n=0}^{C} A_n\,v^n\), where \(|A_n|\leqslant N^C\) for some fixed constant \(C>0\). We also assume that \(\widetilde{A}\) is independent of \(v\) and \(|\widetilde{A}|\leqslant N^C\) for some \(C>0\). Then for any fixed \( p\in \mathbb {N}\) and fixed (small) \(\delta >0\),
for large enough \(N\).
Note Here \(A\) or the \(A_n\)’s may depend on \(v\).
Proof of Lemma 3.1
By Definition 1.1, the assumption \(A\prec 1\), and the fact that \(v\) has subexponential decay (1.7), for any fixed (small) \(\delta >0\) and (large) \(D>0\) there is an event \(\Omega \) such that \(\mathbb {P}(\Omega )\geqslant 1-N^{-D}\) and
Then
where in the second inequality we used the Cauchy–Schwarz inequality. Choosing \(D\) large enough, we complete the proof of Lemma 3.1. \(\square \)
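Schematically, the argument above can be summarized as follows (a reconstruction from the stated assumptions; the constant \(C'\) absorbs the polynomial a priori bounds on \(\widetilde A\), the \(A_n\)’s and the moments of \(v\)):

```latex
% On Omega one has |A| <= N^delta; off Omega, use the L_infty bounds and Cauchy-Schwarz.
\big|\mathbb{E}\,\widetilde{A} A v^{p}\big|
 \leqslant \big|\mathbb{E}\,\mathbf{1}_{\Omega}\widetilde{A} A v^{p}\big|
          +\big|\mathbb{E}\,\mathbf{1}_{\Omega^{c}}\widetilde{A} A v^{p}\big|
 \leqslant N^{\delta}\,\mathbb{E}\big|\widetilde{A} v^{p}\big|
          +\mathbb{P}(\Omega^{c})^{1/2}\big(\mathbb{E}\,|\widetilde{A} A v^{p}|^{2}\big)^{1/2}
 \leqslant N^{\delta}\,\mathbb{E}\big|\widetilde{A} v^{p}\big|+N^{-D/2}N^{C'},
```

and choosing \(D\) large enough makes the last term smaller than any fixed power of \(N^{-1}\).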
In the following proofs, some error terms satisfy the assumptions on \(A\) in the above lemma, except that these terms may be stochastically dominated by some deterministic numbers (not necessarily 1). We will bound these terms with Lemma 3.1. Hence, for simplicity, we define the following set. For any centered random variable \(v\) with variance \(1/N\) satisfying the subexponential decay property (1.7), we define
Now we return to prove Lemma 2.4.
Proof of Lemma 2.4
For simplicity, we assume that the matrix entries of \(X\) are real numbers. Let \(X\) and \(X'\) be two ensembles which satisfy the assumptions of Theorem 1.2. To prove Lemma 2.4, as explained at the beginning of Sect. 2.1 [near (2.10)], one only needs to show that for any fixed small enough \({\varepsilon }>0\), \(s\in (0,1/2)\), and \(p\in 2\mathbb {N}\), if \(||z_0|-1|\leqslant {\varepsilon }\) then
for large enough \(N\). For integer \(k\), \(0\leqslant k\leqslant N^2\), define the following matrix \(X_k\) interpolating between \(X'\) and \(X \): \((X_k)_{ij}:=X_{ij}\) if \((i-1)N+j\leqslant k\), and \((X_k)_{ij}:=X'_{ij}\) otherwise.
Note that \(X'=X_0\) and \(X =X_{N^2}\). As one can see, the difference between \(X_k\) and \(X_{k-1}\) is just one matrix entry. We denote the index of this entry by \((a,b):=(a_k, b_k)\) (\(a_k, b_k\in \mathbb {Z}, 1\leqslant a_k, b_k\leqslant N\)), where \(k=(a_k-1)N+b_k\).
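The bookkeeping of this interpolation can be sketched as follows (illustrative only; the helper `interpolating_matrix` is not from the paper, and we use the natural reading that \(X_k\) takes its first \(k\) entries, in row-major order with \(k=(a_k-1)N+b_k\), from \(X\) and the rest from \(X'\)):

```python
import numpy as np

def interpolating_matrix(Xp, X, k):
    """Return X_k: the first k entries (row-major) taken from X, the rest from Xp."""
    N = X.shape[0]
    # Entry (a, b) (1-based) has row-major index k = (a-1)*N + b, i.e. flat index k-1.
    mask = (np.arange(N * N) < k).reshape(N, N)
    return np.where(mask, X, Xp)

rng = np.random.default_rng(2)
N = 5
Xp = rng.standard_normal((N, N)) / np.sqrt(N)   # plays the role of X'
X = rng.standard_normal((N, N)) / np.sqrt(N)

X0 = interpolating_matrix(Xp, X, 0)        # X_0 = X'
Xlast = interpolating_matrix(Xp, X, N * N)  # X_{N^2} = X

# X_k and X_{k-1} differ in exactly one entry, (a_k, b_k) with k = (a_k-1)N + b_k.
k = 7
ak, bk = (k - 1) // N + 1, (k - 1) % N + 1
diff = interpolating_matrix(Xp, X, k) != interpolating_matrix(Xp, X, k - 1)
print(ak, bk, np.argwhere(diff))
```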
Furthermore, we define \(t_{X_{k-1}}, t_{X_{k}}, A^{(f)}_{X_{k-1} }, A^{(f)}_{X_{k} }\) with \(X_{k-1}\) and \(X_k\), as in Definition 2.3. We are going to show that if this matrix entry lies on the diagonal, i.e., \(a=b\), then
otherwise, i.e., \(a\ne b\),
for sufficiently large \(N\) (independent of \(k\)). Clearly, (3.3) and (3.4) imply (3.2).
We are going to compare these functionals corresponding to \(X_k\) and \(X_{k-1}\) with a third one, corresponding to the matrix \(\widetilde{Q}\) defined hereafter, whose \((a,b)\) entry is deterministic. We define the following \(N\times N\) matrices (hereafter, \(Y_\ell =X_\ell -z I\), \(\ell = k\) or \(k-1\)):
Furthermore, we define \(t_{\widetilde{Q}}, A^{(f)}_{\widetilde{Q} }\) with \(\widetilde{Q}\), as in Definition 2.3. To prove (3.3) and (3.4), we will estimate \( A^{(f)}_{X_{k-1 } } - A^{(f)}_{\widetilde{Q} } \) and \( A^{(f)}_{X_{k } } - A^{(f)}_{\widetilde{Q} } \).
First we introduce the notation
We note that, by Cauchy’s interlacing theorem, for some \(C>0\),
holds for any \(w\) and \(z\). One can also prove it with [7, (6.6)], which shows that for \(N\times N\) matrices \(Y\) and \(Y'\) differing from each other in only \(O(1)\) columns or rows, one has \(|{{\mathrm{Tr}}}\,(Y^*Y-wI)^{-1} -{{\mathrm{Tr}}}\,((Y')^*Y'-wI)^{-1}|= O(\eta ^{-1})\). With (3.13), we get
To estimate \(A^{(f)}_{X_{k-1}}-A^{(f)}_{\widetilde{Q} }\), from (2.7), we have
where \(z=z_0+N^{-s}\xi \), \(w=E+i\eta \), \(\phi =\phi _{{\varepsilon },z}\). Recall that \(t_{X_{k-1}}\) and \(t_{\widetilde{Q}}\) are defined with \(m_S\) and \(m_R\), respectively. Applying Taylor expansion to the term \( h(t_{X_{k-1}}){{\mathrm{Re}}}m_S-h(t_{\widetilde{Q}}){{\mathrm{Re}}}m_R \) in (3.15), and letting \(h^{(n)}\) denote the \(n\)th derivative of \(h\), we have
where \(B_n(\widetilde{Q})\) (\(1\leqslant n\leqslant 3\)) and \(B_4(X_{k-1}, \widetilde{Q}) \) are defined as
where \(\zeta \) is between \(t_{X_{k-1}}\) and \(t_{\widetilde{Q}}\), and only depends on \(t_{X_{k-1}}, t_{\widetilde{Q}}\) and \(h\). As one can see, \(B_1, B_2\) and \(B_3\) are independent of \(v_{ab}\). From the definitions of the \(B \)’s, we note that if \(n\geqslant 1\), then
Therefore, with \(|h|\leqslant 1\), we obtain the following uniform bounds for the \(B\)’s:
To estimate \(m_S-m_R\) in (3.16), we study the difference between \(m_S\) and \(m_R\) in the parameter set:
In (3.59) of [8] and the discussion below (3.61) of [8], it was proved that with the notations:
the difference between \({{\mathrm{Re}}}m_S\) and \({{\mathrm{Re}}}m_R\), i.e., \(\frac{1}{N}{{\mathrm{Re}}}{{\mathrm{Tr}}}S-\frac{1}{N}{{\mathrm{Re}}}{{\mathrm{Tr}}}R\), can be written as (recall \(v_{ab}=X'(a,b)\))
where \(P_4(X_{k-1}, \widetilde{Q})\) depends on \(X_{k-1}\) and \(\widetilde{Q}\), and the \(P\)’s can be bounded as
uniformly for \((k, z, w)\) in (3.19). In [8] the uniformity was not emphasized, but it can be easily checked. From (3.7)–(3.10) and the definition of \(P_{i}(\widetilde{Q})\), \(1\leqslant i\leqslant 3\), we can see that \(P_{i}(\widetilde{Q})\), \(1\leqslant i\leqslant 3\), depend only on \(\widetilde{Q}\) and are independent of \(v_{ab}\).
Now we collect some simple bounds on the \(P_i\)’s. For the \(L_\infty \) norm, it is easy to prove from the definitions that the following inequalities always hold:
for any \((k, z, w)\) in (3.19) and some fixed constant \(C>0\). Then with the definition in (3.20), we also have that for any \((k, z, w)\) in (3.19) and some constant \(C>0\)
Expanding \(S\) around \(R\) using the identity \(S= (R^{-1}+(Y_k^*Y_k- Q^* Q) )^{-1}\), we obtain that for any fixed \(m\in \mathbb N\)
Let \(m=5\) in (3.24). Now we take \(\frac{1}{N}{{\mathrm{Re}}}{{\mathrm{Tr}}}\) on both sides of (3.24) and compare it with (3.21). Since \(Y_k^*Y_k- Q^* Q=v _{ab}(\mathbf{e}_{ba}Q)+v_{ab}(Q^*\mathbf{e}_{ab})+v^2_{ab}\mathbf{e}_{bb}\), we can see that for \(1\leqslant l\leqslant 3\), \(P_l(\widetilde{Q})\) is the coefficient of the \((v_{ab})^l\) term on the r.h.s. of \(\frac{1}{N}{{\mathrm{Re}}}{{\mathrm{Tr}}}\) (3.24), and
Similarly, using this expansion (\(m=5\)), and the fact:
and \(\partial _w S, \partial _z S=O(N^C)\), we can improve (3.22) to the following:
Inserting (3.16) and (3.21) into (3.15), we write \(A^{(f)}_{X_{k-1} }-A^{(f)}_{\widetilde{Q} }\) as a polynomial of \(v_{ab}\) as follows.
where
where \(B_n=B_n({\widetilde{Q}}), P_n=P_n(\widetilde{Q})\) (\(1\leqslant n\leqslant 3\)), \(B_4=B_4(X_{k-1}, \widetilde{Q})\) and \(P_4=P_4(X_{k-1}, \widetilde{Q})\). We note: \( \fancyscript{P}_1(\widetilde{Q}), \fancyscript{P}_2(\widetilde{Q})\) and \(\fancyscript{P}_3(\widetilde{Q})\) are independent of \(v_{ab}\).
Replacing \(X_{k-1}\) with \(X_{k}\), by the same method we obtain (here \(v_{ab}\) is replaced with \(u_{ab}\))
From (3.18) and (3.26), it is easy to check that \(\fancyscript{P}_1, \fancyscript{P}_2, \fancyscript{P}_3\prec 1\) uniformly for \(1\leqslant k\leqslant N^2\). For the \(L_\infty \) bound, with (3.23), they are bounded by \(N^C\) for some \(C\). Similarly, we can obtain \(\fancyscript{P}_4\prec 1\). With (3.25), we have \(\fancyscript{P}_4 (X_{k-1}, \widetilde{Q})\in \mathcal M_C(v_{ab})\) and \(\fancyscript{P}_4 (X_{k}, \widetilde{Q})\in \mathcal M_C(u_{ab})\). So far, we have proved
uniformly hold for \(1\leqslant k\leqslant N^2\).
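The expansion of \(S\) around \(R\) used in (3.24) is a finite Neumann series with exact remainder: from \(S=(R^{-1}+\Delta )^{-1}\) with \(\Delta :=Y_k^*Y_k-Q^*Q\), iterating the resolvent identity \(S=R-R\Delta S\) gives \(S=\sum _{l=0}^{m}(-R\Delta )^{l}R+(-R\Delta )^{m+1}S\). A small numerical sketch of this exact identity (the size, seed and perturbed entry are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 6
w = 1.0 + 0.5j  # spectral parameter with Im w > 0, so the resolvents exist

Q = rng.standard_normal((N, N)) / np.sqrt(N)
R = np.linalg.inv(Q.conj().T @ Q - w * np.eye(N))

# Perturb a single (a, b) entry, as in the text: Y_k = Q + v e_ab.
v, a, b = 0.1, 2, 4
Yk = Q.copy()
Yk[a, b] += v
Delta = Yk.conj().T @ Yk - Q.conj().T @ Q
S = np.linalg.inv(Yk.conj().T @ Yk - w * np.eye(N))

# Finite Neumann series with exact remainder:
# S = sum_{l=0}^{m} (-R Delta)^l R + (-R Delta)^{m+1} S.
m = 5
M = -R @ Delta
series = sum(np.linalg.matrix_power(M, l) @ R for l in range(m + 1))
remainder = np.linalg.matrix_power(M, m + 1) @ S
print(np.max(np.abs(S - (series + remainder))))
```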
Now we return to prove (3.3) and (3.4). First we write
We insert (3.27) and (3.29) into the r.h.s. and write it in the following form
where \(\mathcal A_{m}\) only contains \(A^{(f)}_{\tilde{Q} }, \fancyscript{P}_{i}(\widetilde{Q})\) (\(i=1,2,3\)) and \(\fancyscript{P}_4 (X_{k-1}, \widetilde{Q})\), and \(\mathcal B_{m}\) only contains \(A^{(f)}_{\tilde{Q} }, \fancyscript{P}_{i}(\widetilde{Q})\) (\(i=1,2,3\)) and \(\fancyscript{P}_4 (X_{k}, \widetilde{Q})\). For example,
where \(C_{p,n}\) (\(1\leqslant n\leqslant 3\)) are constants depending only on \(p\). Since the first two moments of \(v_{ab}\) and \(u_{ab}\) coincide, \(u_{ab},v_{ab}\) are independent of \( \widetilde{Q}\), and \(\mathcal A_{1}=\mathcal B_1\), \(\mathcal A_{2}=\mathcal B_2\) only contain \(A^{(f)}_{\tilde{Q} }, \fancyscript{P}_{i}(\widetilde{Q})\) (\(i=1,2,3\)), we have
Recalling the definitions of \(\mathcal A_m\) and \(\mathcal B_m\) from (3.31), for the terms with \(m\geqslant 4\), using (3.30) and Lemma 3.1 we get
Therefore, with \(\mathcal A_3=\mathcal B_3\),
Similarly, using (3.30), (3.32), \(A_{\widetilde{Q}}^{(f)} =O( N^C)\) and Lemma 3.1, we have
As in [8, (3.64)], using Hölder’s inequality and the bound (3.14), we have
Then, combining (3.33)–(3.35), we obtain (3.3). (Note: \(p\in 2\mathbb {N}\).)
To prove (3.4), we claim the following lemma, which provides the stronger bound on the expectation value of the r.h.s. of (3.32).
Lemma 3.2
Assume \(1\leqslant a\ne b\leqslant N\). Let \(X\) be defined as in Theorem 1.2, except that \(X_{ab} =0\). For any fixed small enough \({\varepsilon }>0\), if \(||z_0|-1|\leqslant {\varepsilon }\) and \(s\in (0,1/2)\), define \(A^{(f)}_{X }, P_i(X), B_i(X), \fancyscript{P}_i(X), i=1,2,3\) as in (2.7), (3.20), (3.17) and (3.28). (More precisely \(\widetilde{Q}, Q, R\) in (3.20) and (3.17) will be replaced with \(X, Y=X-zI\) and \((Y^*Y-wI)\) respectively.) Then
uniformly for \((a,b)\).
We now finish the proof of (3.4); Lemma 3.2 itself will be proved in the next section. Inserting Lemma 3.2 and (3.32) into (3.33), as in (3.34), we obtain that if \(a\ne b\), then
Together with (3.33) and (3.35), we obtain (3.4). Clearly, (3.3) and (3.4) imply (3.2), and we complete the proofs of Lemmas 2.4 and 2.2. \(\square \)
4 Proof of Lemma 3.2
Lemma 3.2 bounds the expectation values of some polynomials of \(A^{(f)}_{X}\) and \(\fancyscript{P}_{1,2,3} (X)\). Roughly speaking, Lemma 3.2 shows that the expectation values of these polynomials are smaller than their stochastic dominations by a factor \(N^{-1/2}\). (Note: \(a\) and \(b\) appear in the definitions of the \(P\)’s and \(B\)’s, and the \(\fancyscript{P}\)’s are defined via the \(P\)’s and \(B\)’s.) As introduced in the second part of Sect. 2.1 [below (2.11)], the main strategy for exhibiting this extra factor is
-
writing them (up to negligible error) as polynomials of the \(X_{ak}\)’s, \(X_{ka}\)’s (\(1\leqslant k\leqslant N\)), \(G^{(a,a)}\) and \( \mathcal G^{(a,a)}\), which are defined as
$$\begin{aligned} G^{ (a,a)}:=((Y^{ (a,a)}_z)^*Y^{ (a,a)}_z-w)^{-1}, \quad \mathcal G^{ (a,a)}:=(Y^{ (a,a)}_z(Y^{ (a,a)}_z)^*-w)^{-1} \end{aligned}$$and \(Y^{(a,a)}:=Y^{(a,a)}_z\) is the matrix obtained by removing the \(a\)-th row and column of \(Y_z\).
-
showing that the degrees of the monomials w.r.t. the \(X_{ak}\)’s and \(X_{ka}\)’s in the above polynomials are always odd (except for \(X_{aa}\)).
First of all, in Lemmas 4.5 and 4.7, we introduce some polynomials having the properties we need for Lemma 3.2, i.e., their expectation values carry an extra factor \(N^{-1/2}\) compared with their stochastic dominations. In the next subsection, we introduce some \(\mathcal F\) sets, whose elements are the “basic” polynomials in our proof, i.e., the building blocks of the polynomials in Lemmas 4.5 and 4.7.
4.1 Basic polynomials and their properties
We first introduce some notations.
Definition 4.1
\(X^{(\mathbb T, \mathbb U)}, Y^{(\mathbb T, \mathbb U)}, G^{(\mathbb T, \mathbb U)}\) and \(\mathcal G^{(\mathbb T, \mathbb U)}\).
Let \(\mathbb T, \mathbb U\) be subsets of \(\{1,2,\ldots , N\}\). Then we define \(Y^{(\mathbb T, \mathbb U)}\) as the \( (N-|\mathbb U|)\times ( N-|\mathbb T|) \) matrix obtained by removing all columns of \(Y\) indexed by \(i \in \mathbb T\) and all rows of \(Y\) indexed by \(i \in \mathbb U\). Notice that we keep the labels of the indices of \(Y\) when defining \(Y^{(\mathbb T, \mathbb U)}\). In the same way, we define \(X^{(\mathbb T, \mathbb U)}\) from \(X\).
Let \(\mathbf{{y}}_i\) be the \(i \)-th column of \(Y\) and \(\mathbf{{y}}^{(\mathbb S)}_i\) be the vector obtained by removing \(\mathbf{{y}}_i (j) \) for all \( j \in \mathbb S\). Similarly, we define \(\mathrm{y}_i\) to be the \(i \)-th row of \(Y\). Define
By definition, \(m^{(\emptyset , \emptyset )} = m\). Since the eigenvalues of \(Y^* Y \) and \(Y Y^*\) are the same except for the zero eigenvalues, it is easy to check that
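The fact used here is elementary linear algebra; a sketch for a general rectangular matrix \(A\) (our notation, not the paper's):

```latex
% For an m-by-n matrix A, the nonzero spectra of A^*A and AA^* coincide,
% so the two resolvent traces differ only through the zero eigenvalues:
\operatorname{Tr}(A^*A - w)^{-1} - \operatorname{Tr}(AA^* - w)^{-1}
  \;=\; (n - m)\cdot\frac{1}{0 - w}
  \;=\; -\,\frac{n - m}{w}.
% Applied to A = Y^{(\mathbb T,\mathbb U)}, of size
% (N-|\mathbb U|) x (N-|\mathbb T|), this relates the normalized traces of
% G^{(\mathbb T,\mathbb U)} and \mathcal G^{(\mathbb T,\mathbb U)} up to an
% explicit correction of size (|\mathbb U|-|\mathbb T|)/(N|w|).
```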
For \(|\mathbb U|=| \mathbb T|\), we define
There is a crude bound for \((m_G^{(\mathbb T, \mathbb U)}-m)\) proved in (6.6) of [7]:
Definition 4.2
(Notations for general sets) As usual, if \(x\in \mathbb {R}\) or \(\mathbb {C}\), and \(\mathcal S\) is a set of random variables, then \(x \mathcal S\) denotes the following set
For two sets \(\mathcal S_1\) and \(\mathcal S_2\) of random variables, we define the following set as
Let \(\mathcal S=\mathcal S^{(N)}\) be a family of sets and let \(s=s^{(N)}\) be a family of sums of elements of \(\mathcal S\). We write \(s\in _n \mathcal S\) if and only if \( s^{(N)}\) can be written as the sum of \(O(1)\) elements of \(\mathcal S^{(N)}\) for every \( N\), i.e.,
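In symbols, our reading of this definition (a hedged formalization; the displayed formula was not preserved in this copy):

```latex
s \in_n \mathcal S
\quad\Longleftrightarrow\quad
\exists\, m = O(1) \text{ independent of } N:\qquad
s^{(N)} \;=\; \sum_{i=1}^{m} s_i^{(N)},
\qquad s_i^{(N)} \in \mathcal S^{(N)} \text{ for every } N.
```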
Definition 4.3
(Definition of \(\mathcal F_0, \mathcal F_1, \mathcal F_{1/2}\) and \(\mathcal F \)) For fixed indices \(a,b\) and ensemble \(X\) in Lemma 3.2, we define \(\mathcal F_0\) as the set of random variables (depending on \(X\)) which are stochastically dominated by 1 and independent of any \(X_{ak}\) and \(X_{ka} (1\leqslant k\leqslant N)\), i.e.,
Note \(\mathcal F_0\) depends on \(a\), but not on \(b\). One example of an element of \(\mathcal F_0\) is \({{\mathrm{Tr}}}X-X_{aa}\).
For simplicity, we define
Next we define \(\mathcal F_1\) as the union of the set \(( N^{ 1/2}X_{aa} \mathcal F_0)\) and the sets of some quadratic forms as follows
(Note it is \(X _{ka}V_{kl}X_{la}\) or \(X _{ak}V_{kl}X_{al}\) in the first line and \(X_{ak}V_{kl}X_{la}\) in the second line, and the diagonal terms in the second case are allowed to be larger than the others by a factor \(N^{1/2}\).)
Furthermore, we define \(\mathcal F\) as the set of the following random variables
where \(\left( \mathcal F_1\right) ^n\) represents the set of the products of \(n\) elements in \(\mathcal F_1\). For simplicity, sometimes we write \(\mathcal F\)
i.e., with the subscript being the empty set \(\emptyset \).
Similarly, we define
Note The total number of factors \(X_{ak}\) and \(X_{ka}\) (\(1\leqslant k\leqslant N\)) in each monomial of an element of \(\mathcal F\) is always even. On the other hand, this number for \(\mathcal F_{1/2}\cdot \mathcal F\) is always odd. By the definition, it is easy to see that
and
Examples By definition, \(|G^{(a,a)}_{kl}|\leqslant \eta ^{-1}\) for any \(k,l\ne a\). Hence we have
and if \(\eta =O(1)\)
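The deterministic input behind these example bounds is the trivial resolvent estimate; a sketch (a standard fact, with \(H:=(Y^{(a,a)}_z)^*Y^{(a,a)}_z\) Hermitian and \(w=E+\mathrm i\eta\)):

```latex
|G^{(a,a)}_{kl}|
  \;\leqslant\; \|(H - w)^{-1}\|
  \;=\; \max_{\lambda \in \operatorname{spec}(H)} \frac{1}{|\lambda - w|}
  \;\leqslant\; \frac{1}{|\operatorname{Im} w|}
  \;=\; \frac{1}{\eta},
```

since every eigenvalue \(\lambda\) of the Hermitian matrix \(H\) is real, so \(|\lambda - w| \geqslant \eta\).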
Definition 4.4
(Uniformness) Let \(F_T, T\in \mathcal T_N\) be a family of random variables, where \(\mathcal T_N\) is a parameter set which may depend on \(N\). We say
are uniform for all \(T\in \mathcal T_N\), if the following two uniform conditions hold.
-
(i)
There exist uniform integers \(m\) and \(n\) independent of \(N\) such that for all \(T\in \mathcal T_N\), we can write \(F_T\) as the sum of \(m\) elements in \((\mathcal F_0\cup \left( \mathcal F_1 \right) ^n)\), i.e.,
$$\begin{aligned} F_T=\sum _{i=1}^m F_{T, i}, \quad F_{T,i}\in \mathcal F_0\cup \left( \mathcal F_1 \right) ^n. \end{aligned}$$ -
(ii)
All of the stochastic domination relations, i.e., \(\prec \), appearing in all \(F_T\)’s (\(\,T\in \mathcal T_N\)) hold uniformly.
Similarly, for \(\mathcal F_0, \mathcal F_{1/2}\) and \(\mathcal F_{1}\), we call
uniformly for all \(T\in \mathcal T_N\), if there exists a uniform \(m\) independent of \(N\) such that
and the above uniform condition (ii) holds.
More generally, if \(\mathcal F_\alpha \) is one of \(\mathcal F_0,\;\mathcal F_{1/2},\;\mathcal F_1,\;\mathcal F\), and likewise \(\mathcal F_\beta \), i.e., \(\alpha , \beta =0, 1/2, 1\) or \(\emptyset \), we say
uniformly for all \(T\in \mathcal T_N\) if there exists a uniform \(m\) independent of \(N\) such that \(F_T\) can be written as the sum of \(m\) terms in \(\mathcal F_\alpha \cdot \mathcal F_\beta \), i.e.,
and
hold uniformly for all \(T\in \mathcal T_N\).
Furthermore, with fixed \(D>0\) and random (or deterministic) variable \(a_T\), we say
uniformly for all \(T\in \mathcal T_N\) if \(F_T\) can be written as
where
hold uniformly for all \(T\in \mathcal T_N\). Similarly, one can define
as above.
Now we estimate the expectation values of elements in \(\mathcal F\cdot \mathcal F_{1/2}\). Let \(F_{1/2}\in \mathcal F_{1/2}, F\in \mathcal F\). With large deviation theory, we can only obtain
But we will show that the elements in \( \mathcal F_{1/2}\cdot \mathcal F \) may have much smaller expectation values.
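The mechanism behind this gain can be illustrated by a toy Monte Carlo computation (a sketch only: the variables \(x_k\) below stand in for the rescaled entries \(N^{1/2}X_{ak}\), and the centered-exponential law is our illustrative choice, not an assumption of the paper). An odd-degree polynomial of centered independent entries has typical size \(O(1)\) but expectation of order \(N^{-1/2}\):

```python
import numpy as np

rng = np.random.default_rng(0)
N, trials = 100, 50_000

# Centered entries with a nonzero third moment: x = Exp(1) - 1,
# so E[x] = 0, E[x^2] = 1, E[x^3] = 2.
x = rng.exponential(1.0, size=(trials, N)) - 1.0

# F = (N^{-1/2} sum_k x_k) * (N^{-1} sum_k x_k^2) has odd total degree
# in the x_k's; its typical size (standard deviation) is of order one.
F = x.sum(axis=1) / np.sqrt(N) * ((x ** 2).sum(axis=1) / N)

# E[F] = N^{-1/2} * E[x^3]: smaller than the typical size by N^{-1/2}.
print("typical size :", F.std())    # order 1
print("expectation  :", F.mean())   # close to 2 / sqrt(N) = 0.2
```

For \(N=100\) the expectation is \(N^{-1/2}\mathbb E[x^3] = 0.2\) while the standard deviation of \(F\) stays of order one: the mean is suppressed relative to the typical size by exactly the factor \(N^{-1/2}\) exploited in Lemma 4.5.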
Lemma 4.5
For fixed indices \(a,b\) and ensemble \(X\) in Lemma 3.2, let \(F_0\) and \(F\) be two random variables bounded by \(N^C\) for some \(C\), i.e.,
We assume that
Then we have
for any fixed \(D>0\).
Proof of Lemma 4.5
For simplicity, we assume \(F\in \mathcal F_{1/2}\cdot \mathcal F\) (not \(\in _n\)). The general case can be proved with the same method. Furthermore, by definition, \( \mathbb {E}F_0F=0\) if \( F\in \mathcal F_{1/2}\cdot \mathcal F_0\). Hence one only needs to prove the following case: for some fixed \(m, F\in \mathcal F_{1/2}\cdot (\mathcal F_1)^m\), i.e., \(F\) can be written as the product of one element of \(\mathcal F_{1/2}\) and \(m\) elements of \(\mathcal F_1\), i.e.,
By definition, \( F_{1/2} F_1F_2F_3\cdots F_m\) can be considered as a polynomial in the \(X_{ak}\)’s and \(X_{ka}\)’s (\(1\leqslant k\leqslant N\)), whose coefficients are independent of the \(a\)-th row and column of \(X\). Then, we can decompose \(F\) as
where \(k_i\)’s are all different in the summation, and \(\mathcal A\Big ( \{k_i\}_{ i=1}^n, \{s_i\}_{ i=1}^n,\{t_i\}_{ i=1}^n\Big )\) is the coefficient of \( \prod _{i=1}^n (X_{ak_i})^{s_i} (X_{k_ia})^{t_i} \) and it is independent of the \(a\)-th row and column of \(X\). We separate the parameter region into two cases.
First case \(k_i\ne a \) for all \(1\leqslant i\leqslant n\). By definition of \(\mathcal F_1\), we have
where the last factor comes from the \(N^{1/2}\) factor in the definition of \(\mathcal F_1\) (see the \(N^{1/2}\sum _{k }^{(a)}X_{ak}V_{kk}X_{ka}\) term in the definition of \(\mathcal F_1\)).
Second case \(k_j=a \) for some \(1\leqslant j\leqslant n\). Since the \(k_i\)’s are all distinct, the other \(k_i\)’s are not equal to \(a\). Letting \(s_j=s, t_j=0\), we have
By the definition of \(\mathcal F_1\) and \(\mathcal F\), we know that for any \(\delta >0\) and \(D>0\), there exists a probability set \(\Omega \), independent of the \(a\)-th row and column of \(X\), such that \(\mathbb {P}(\Omega )\geqslant 1-N^{-D}\) and the \(\prec \)’s in (4.7) and (4.8) can be replaced with \(\leqslant \). More precisely,
With this \(\Omega \) and \(|F_0|+|F|\leqslant N^C\), we have
Hence to prove (4.5), we only need to bound \(\mathbb {E}\mathbf{1}_\Omega F_0F \). For the first case, i.e., \(k_i\ne a\) (\(1\leqslant i\leqslant n\)), using (4.9), and the fact that \(F_0\) and \(\Omega \) are independent of the \(a\)-th row and column of \(X\), we have
for any \(\delta >0\), where the factor \((N^{-1/2})^{-2}=N\) comes from the summation over \(k_i: 1\leqslant k_i\leqslant N\). It is easy to check:
Therefore, with some constants \(C_m\) only depending on \(m\), for any \(\delta >0\) we have
Similarly for the second case: without loss of generality, we assume \(k_1=a\). Then as above, using (4.9), and the fact \(\Omega \) is independent of the \(a\)-th row and column of \(X\), we have
Combining (4.11) and (4.12), we obtain
Then together with (4.10), we obtain (4.5) and complete the proof of Lemma 4.5. \(\square \)
Now we slightly extend the above lemma. Instead of assuming \( F \in _n \mathcal F_{1/2}\cdot \mathcal F\), we assume that \( F \in _n \mathcal F_{1/2}\cdot \mathcal F+O_\prec (N^{-D})\) for some fixed \(D>0\).
Corollary 4.6
For fixed indices \(a,b\) and ensemble \(X\) in Lemma 3.2, let \(F_0\) and \(F\) be two random variables bounded by \(N^C\) for some \(C\), i.e.,
We assume that
and for some fixed \(D>0\),
Then we have
Proof of Corollary 4.6
Write
Here the superscripts \(M\) and \(e\) stand for \(main\) and \(error\). (Note \(F^M\) and \(F^e\) are not assumed to be bounded by \(N^C\); otherwise the proof would be much simpler.) For simplicity, we assume that \(F^M\in \mathcal F_{1/2}(\mathcal F_1)^m\) for some \(m\geqslant 0\). We repeat the same argument as above, i.e., from (4.6) to (4.9). Then for any (small) \(\delta >0\) and (large) \(\widetilde{D}>0\), there exists a probability set \(\Omega \), independent of the \(a\)-th row and column of \(X\), such that \(\mathbb {P}(\Omega )\geqslant 1-N^{-\widetilde{D}}\) and (4.9) holds. Next we write
where we used \(|F_0|+|F|\leqslant N^C\) and (4.13).
Now we bound \(\left| \mathbb {E}\,1_{\Omega } F_0 F^e \right| \). By the definition of \(\prec \) again, for any (small) \(\delta >0\) and (large) \(\widetilde{D}>0\), there exists \(\widetilde{\Omega }\) such that \(\mathbb {P}(\widetilde{\Omega })\geqslant 1-N^{-\widetilde{D}}\) and
With this \(\widetilde{\Omega }\), and \(|F_0|+|F|\leqslant N^C\) we write
For the last term, we note that by the definition of \(\Omega \) we can simply bound the terms in \(\mathcal F\) which are independent of the \(a\)-th row and column of \(X\) by \(N^{0.1}\). Then using the assumption \(F^M\in \mathcal F_{1/2}(\mathcal F_1)^m\), we have
Together with the Cauchy–Schwarz inequality and the subexponential decay property (1.7), we obtain that for some constant \(C_m>0\),
Inserting it into (4.15), choosing large enough \(\widetilde{D}\), we obtain (4.14) and complete the proof. \(\square \)
More generally, if \(F_T\in _n \mathcal F_{1/2}\cdot \mathcal F\) holds uniformly for \(T\in \mathcal T\), then Corollary 4.6 can be extended to the following integrated version.
Lemma 4.7
For fixed indices \(a,b\) and ensemble \(X\) in Lemma 3.2, let \(F_T\) be a family of random variables such that for some deterministic \(x_T\) and uniform \(D>0\)
hold uniformly for \(T\in \mathcal T=\mathcal T_N\), i.e., \(F_{T}=F^M_T+F_T^e\) and
hold uniformly for \(T\in \mathcal T=\mathcal T_N\). Here we assume that \(\cup _N\mathcal T_N\) can be covered by a compact set in \( \mathbb {R}^p\) for some \(p\in \mathbb N\); this compact set and \(p\) are independent of \(N\).
We also assume that \(|x_T|+|F_T|\leqslant N^C\) for some uniform \(C>0\). Let \(F_0\) be a random variable satisfying \(F_0\in N^C\mathcal F_0\) and \(|F_0|\leqslant N^C\). Then
Proof of Lemma 4.7
Since \(F_0\) and \(F_T\) are bounded by \(N^C\), one can exchange the order of integration and expectation, i.e.,
Then with the uniformness, one can easily extend the proof of Lemma 4.5 and Corollary 4.6, and prove this lemma. \(\square \)
4.2 Proof of Lemma 3.2
Lemmas 4.5 and 4.7 are the key observations for the proof of Lemma 3.2. To prove Lemma 3.2, we claim the following Lemma 4.9, which shows that the terms in Lemma 3.2 can be represented by \(\mathcal F\) and \(\mathcal F_{1/2}\cdot \mathcal F\) (with a negligible error term). We first introduce a cutoff function on \({{\mathrm{Re}}}m^{(a,a)}\). (Recall the definition in Definition 4.1.)
Definition 4.8
Define \(\chi _a\) as
Note By definition and (4.3), \(h(t_X)>0\) implies \(\chi _a=1\), and for any \(|{\mathbb {U}}|+|{\mathbb {T}}|=O(1)\), we have
Lemma 4.9
Recall \(X^{(a,a)}\) and \(m^{(a,a)}\) defined in Definition 4.1. Under the assumption of Lemma 3.2, for any fixed large \(D>0\), we have
hold uniformly for
We postpone the proof of this lemma to the next section. In the remainder of this section, we will prove Lemma 3.2 with Lemma 4.9. First we introduce a simple lemma for the calculation of \(\mathcal F\) sets.
Lemma 4.10
Let \(A\) and \(B\) be two random variables stochastically dominated by \(N^C\) for some \(C>0\), i.e., \(|A|+|B|\prec N^C\). If for some \(D>0\) and random variables \(A_0\) and \(B_0\) we have
then
Proof
By assumption,
With \(|A|+|B|\prec N^C\), we obtain (4.22). \(\square \)
Now we return to finish the proof of Lemma 3.2.
Proof of Lemma 3.2
For simplicity, we introduce the notation \(\widetilde{A}(w,z)\) as
First as in (3.30), (3.18) and (3.23), one can see that there exists uniform \(C>0\), such that
With \(A^{(f)}_{X }= A^{(f)}_{X ^{(a,a)}}+ (A^{(f)}_{X }-A^{(f)}_{X ^{(a,a)}})\), we write
Recalling the definitions in (2.7), (3.28) and (3.17), for fixed \(l\), with the notation \(\widetilde{A}(w,z)\) and (4.18), we can write:
where \({\, \mathrm d}T=\prod _i {\, \mathrm d}E_i{\, \mathrm d}\eta _i{\, \mathrm d}A(\xi _i)\) and \(\mathcal T=(I_{\varepsilon }\times {{\mathrm{supp}}}f)^{l+3}\). Using (4.23), Lemmas 4.9 and 4.10, for any fixed \(D>0\), we have
uniformly for \(T\in \mathcal T\). Applying Lemma 4.7 by choosing
and \(\mathcal T=(I_{\varepsilon }\times {{\mathrm{supp}}}f)^{l+3}\), with (4.24) and (4.25), we obtain
for any fixed \(D>0\). Then using Hölder’s inequality, we have
Similarly, one can prove
It follows from (4.3) that \(m-m^{(a,a)}=O((N\eta )^{-1})\). Then it is easy to check that \(|A^{(f)}_{X^{(a,a)} }-A^{(f)}_{X}|\leqslant C\). Inserting this into (4.26), we complete the proof of Lemma 3.2.
\(\square \)
5 Polynomialization of Green’s functions
As shown in the previous sections, to complete the proof of Theorem 1.2, it only remains to prove Lemma 4.9. In this section, we prove Lemma 4.9, i.e., we write the terms in (4.19) as polynomials in \(\mathcal F\) or \(\mathcal F_{1/2}\cdot \mathcal F\) (up to a negligible error). Since the uniformness can be easily checked, we will only focus on fixed \(a, b, z, w\):
First we need to write the individual matrix elements of the \(G\)’s and \(\mathcal G\)’s as polynomials of this type. To do so, we start by deriving some bounds on the \(G\)’s under the condition:
Note This condition is guaranteed by \(\chi _a>0, h(t_X)>0\) or \(h(t_{X^{(a,a)}})>0\).
5.1 Preliminary lemmas
This subsection summarizes some elementary results from [7] and [8]. Note that all the inequalities in this subsection hold uniformly for bounded \(z\) and \(w\). Furthermore, they hold without the condition (5.1).
Recall the definitions of \(Y^{(U,T)}, G^{(U,T)}, \mathcal G^{(U,T)}, \mathbf{{y}}_i\) and \(\mathrm{y}_i\) in the Definition 4.1.
Lemma 5.1
(Relation between \(G, G^{(\mathbb T,\emptyset )}\) and \(G^{( \emptyset , \mathbb T)})\) For \(i,j \ne k \ (i = j\) is allowed) we have
and
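For orientation, the first family of identities is the standard Schur-complement resolvent expansion; in the present notation it can be sketched as follows (with \(\mathbb T=\emptyset\); a well-known identity, since \((Y^{(k,\emptyset)})^*Y^{(k,\emptyset)}\) is exactly \(Y^*Y\) with the \(k\)-th row and column removed):

```latex
G_{ij} \;=\; G^{(k,\emptyset)}_{ij} \;+\; \frac{G_{ik}\,G_{kj}}{G_{kk}},
\qquad i,j \ne k \quad (i=j \text{ allowed}),
```

and iterating this relation expresses \(G\) in terms of resolvents with more columns of \(Y\) removed.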
Definition 5.2
In the following, \(\mathbb {E}_{\mathrm{y}_i}\) and \(\mathbb {E}_{\mathbf{{y}}_i} \) denote the expectations w.r.t. \(\mathrm{y}_i\) and \(\mathbf{{y}}_i\) respectively. For any \(\mathbb T\subset [\![\!1,N\!]\!]\), we introduce the notations
and
Recall that by our convention \(\mathbf{{y}}_i\) is an \(N\times 1 \) column vector and \(\mathrm{y}_i\) is a \(1\times N \) row vector. For simplicity we will write
Lemma 5.3
(Identities for \(G, \mathcal G, Z\) and \(\mathcal Z)\) For any \( \mathbb T\subset [\![\!1,N\!]\!]\), we have
where, by definition, \(\mathcal G_{ii}^{(i,\mathbb T)}=0\) if \(i\in \mathbb T\). Similar results hold for \(\mathcal G\):
Definition 5.4
(\(\zeta \)-High probability events) Define
Let \(\zeta > 0\). We say that an \(N\)-dependent event \(\Omega \) holds with \(\zeta \) -high probability if there is some constant \(C\) such that
for large enough \(N\). Furthermore, we say that \(\Omega (u)\) holds with \(\zeta \) -high probability uniformly for \(u\in U_N\), if there is some uniform constant \(C\) such that
for uniformly large enough \(N\).
Note Usually we choose \(\zeta \) to be 1. By the definition, if some event \(\Omega \) holds with \(\zeta \)-high probability for some \(\zeta >0\), then \(\Omega \) holds with probability larger than \(1-N^{-D}\) for any \(D>0\).
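This implication can be checked directly. With the choice \(\varphi = (\log N)^{\log \log N}\) used in this article series (our assumption for the elided definition in (5.8)), for any fixed \(\zeta, C, D > 0\):

```latex
\mathbb P(\Omega^c) \;\leqslant\; N^{C} e^{-\varphi^{\zeta}},
\qquad
\varphi^{\zeta} \;=\; (\log N)^{\zeta \log\log N} \;\gg\; (C+D)\log N
\quad \text{for large } N,
```

so that \(\mathbb P(\Omega^c) \leqslant N^{C} e^{-(C+D)\log N} = N^{-D}\) for large enough \(N\).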
Lemma 5.5
(Large deviation estimate) Let \(X\) be defined as in Theorem 1.2. For any \(\zeta >0\), there exists \(Q_\zeta >0\) such that for \(\mathbb T\subset [\![\!1,N\!]\!], |\mathbb T| \leqslant N/2\) the following estimates hold with \( \zeta \)-high probability uniformly for \(1\leqslant i,j\leqslant N, |w|+ |z| \leqslant C \):
Furthermore, for \(i\ne j\), letting \(\mathbb {E}_{\mathrm{y}_i\mathrm{y}_j}=\mathbb {E}_{\mathrm{y}_i}\mathbb {E}_{\mathrm{y}_j}\) and \(\mathbb {E}_{\mathbf{{y}}_i\mathbf{{y}}_j}=\mathbb {E}_{\mathbf{{y}}_i}\mathbb {E}_{\mathbf{{y}}_j}\), we have
where
Lemma 5.6
Let \(X\) be defined as in Theorem 1.2. Suppose \( |w|+ |z| \leqslant C. \) For any \(\zeta >0\), there exists \(C_\zeta \) such that if the assumption
holds then the following estimates hold
with \( \zeta \)-high probability uniformly for \( |w|+ |z| \leqslant C \).
5.2 Improved bounds on \(G\,\)’s
The next lemma gives the bounds on \(G, \mathcal G\) and \(m\) under the condition (5.1). Note: together with (4.3), this implies that for any \(U, T\) with \(|U|+|T|=O(1)\),
Before giving the rigorous proof of the bounds on \(G, \mathcal G\), we provide a rough picture of the sizes of these terms under the condition (5.1), \(w\in I_{\varepsilon }\) and \(||z|-1|\leqslant 2{\varepsilon }\). We note that the typical size of \(G^{({\mathbb {U}},{\mathbb {T}})}_{kl}\) depends heavily on whether \(k=l\) and on whether \(k, l\) lie in \({\mathbb {U}}, {\mathbb {T}}\).
-
(i)
If \(k=l\notin {\mathbb {U}}\cup {\mathbb {T}}\), the typical size of \(G^{({\mathbb {U}},{\mathbb {T}})}_{kk}(w,z)\) is \(m(w,z)=\frac{1}{N}{{\mathrm{Tr}}}G(w,z)\).
-
(ii)
If \(k\ne l\), and \(k, l\notin {\mathbb {U}}\cup {\mathbb {T}}\), the typical size of \(G^{({\mathbb {U}},{\mathbb {T}})}_{kl}(w,z)\) is \(\sqrt{|m|/(N\eta )}\).
-
(iii)
If \(\{k, l\}\cap {\mathbb {U}}\ne \emptyset \), then \(G^{({\mathbb {U}},{\mathbb {T}})}_{kl}=0\). This follows from the definition, and it is worth emphasizing:
$$\begin{aligned} \{k, l\}\cap {\mathbb {U}}\ne \emptyset \implies G^{({\mathbb {U}},{\mathbb {T}})}_{kl}=\mathcal G^{({\mathbb {T}},{\mathbb {U}})}_{kl}=0 \end{aligned}$$(5.20) -
(iv)
If \( k=l\in {\mathbb {T}}\), then the typical size of \(G^{({\mathbb {U}},{\mathbb {T}})}_{kk}\) is \(|wm|^{-1}\).
-
(v)
If \(k\ne l\), and \( k \in {\mathbb {T}}\) and \(l\notin {\mathbb {T}}\), then the typical size of \(G^{({\mathbb {U}},{\mathbb {T}})}_{kl}\) is \((|w^{1/2}m |)^{-1} \sqrt{|m|/(N\eta )}\).
-
(vi)
If \(k\ne l\), and \( k,l \in {\mathbb {T}}\) then the typical size of \(G^{({\mathbb {U}},{\mathbb {T}})}_{kl}\) is \({|wm^2 |}^{-1} \sqrt{|m|/(N\eta )}\).
-
(vii)
With the definition of \(G^{({\mathbb {U}}, {\mathbb {T}})}\) and \(\mathcal G^{({\mathbb {T}}, {\mathbb {U}})}\) in Definition 4.1, one can easily see that \(\mathcal G^{({\mathbb {T}}, {\mathbb {U}})}_{kl}\) has the same typical size as \(G^{({\mathbb {U}}, {\mathbb {T}})}_{kl}\) (here the superscript of \(\mathcal G\) is \(({\mathbb {T}}, {\mathbb {U}})\), not \(({\mathbb {U}}, {\mathbb {T}})\)).
We note: \(m\) is bounded by \((\log N)^{C}|w|^{-1/2}\) in (5.18) (no better bound is obtained in this paper), but we believe that it could be much smaller, since \(m_c\) is much smaller.
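For quick reference, the heuristics (i)–(vi) above can be collected in a single display (with the shorthand \(\Psi := \sqrt{|m|/(N\eta)}\), our notation):

```latex
G^{({\mathbb U},{\mathbb T})}_{kl} \;\sim\;
\begin{cases}
  m, & k=l \notin {\mathbb U}\cup{\mathbb T},\\[2pt]
  \Psi, & k\ne l,\ \ k,l \notin {\mathbb U}\cup{\mathbb T},\\[2pt]
  0, & \{k,l\}\cap {\mathbb U} \ne \emptyset,\\[2pt]
  |wm|^{-1}, & k=l \in {\mathbb T},\\[2pt]
  |w^{1/2}m|^{-1}\,\Psi, & k\ne l,\ \ k\in{\mathbb T},\ l\notin{\mathbb T},\\[2pt]
  |wm^{2}|^{-1}\,\Psi, & k\ne l,\ \ k,l\in{\mathbb T},
\end{cases}
```

and, by (vii), \(\mathcal G^{({\mathbb T},{\mathbb U})}_{kl}\) has the same typical size as \(G^{({\mathbb U},{\mathbb T})}_{kl}\).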
Lemma 5.7
Let \(X\) be defined as in Theorem 1.2. Let \({\varepsilon }\) be a small enough positive number, \(||z^2|-1|\leqslant 2{\varepsilon }\) and \(w\in I_{\varepsilon }\) [see the definition in (2.5)]. Suppose (5.1) holds, i.e., \(|{{\mathrm{Re}}}m(w,z)|\geqslant \frac{1}{4}N^{\varepsilon }( N\eta )^{-1}\), in \(\Omega =\Omega ({\varepsilon }, w, z)\). Then there exist \(\widetilde{\Omega }\subset \Omega \) and \(C>0\) such that \(\widetilde{\Omega }\) holds in \(\Omega \) with 1-high probability [see the definition in (5.9)] uniformly for \(z, w\) with \(||z^2|-1|\leqslant 2{\varepsilon }\) and \(w\in I_{\varepsilon }\), and the following bounds hold in \(\widetilde{\Omega }\) for any \(1\leqslant i \ne j\leqslant N\). (Here \(A\sim B\) denotes that there exists \(C>0\) such that \(C^{-1}|B|\leqslant |A|\leqslant C|B|\).)
Furthermore, with the symmetry and the definition of \(G^{({\mathbb {U}}, {\mathbb {T}})}\) and \(\mathcal G^{({\mathbb {T}}, {\mathbb {U}})}\), these bounds also hold under the following switch
Proof of Lemma 5.7
In the following proof, we only focus on the fixed \(z, w, i\) and \(j\), since the uniformness can be easily checked.
We choose \(\zeta =1\). Because \(\varphi \ll N^{\varepsilon }\) for any fixed \({\varepsilon }>0\) [see (5.8)] and in this lemma \(w\in I_{\varepsilon }\), one can easily check that the assumption of this lemma implies the conditions of Lemma 5.6, i.e.,
Therefore we can use all of the results (with \(\zeta =1\)) of Lemma 5.6 in the following proof.
1. We first prove (5.21). The condition (5.1) implies that \(|\frac{1}{N}\sum _{i}{{\mathrm{Re}}}G_{ii}|\geqslant \frac{1}{4}N^{{\varepsilon }}(N\eta )^{-1}\), so there exists \(i: 1\leqslant i\leqslant N\) such that \(|G_{ii}|\geqslant \frac{1}{4}N^{{\varepsilon }}(N\eta )^{-1} \). Together with (5.16), this implies that \(|\mathcal G^{(i,\emptyset )}_{ii}|\leqslant |w|^{-1} N^{-\frac{4}{5}{\varepsilon }}N\eta \) with \(1\)-high probability in \(\Omega \). Inserting this into (5.6) with \(\mathbb T=i\) and using \(G^{(i,i)}_{ii}=0\) from (5.20), we have
Applying (5.10) to bound \(Z_i^{(i)}\) with \(\mathbb T=i\), using Schwarz’s inequality and the fact \(G^{(i,i)}_{ii}=0\) again, we obtain
holds with 1-high probability in \(\Omega \). Together with (5.34), it implies that with 1-high probability in \(\Omega \),
Then replacing \(m^{(i,i)}\) with \(m\) by (4.3), we obtain (5.21).
2. For (5.22), first using (4.3) and (5.21), we have that for any \(i: 1\leqslant i\leqslant N\)
holds with 1-high probability in \(\Omega \). Together with the \(\mathcal Z\) version of (5.35):
we obtain (5.22).
3. For (5.23), it follows from (5.4) with \(\mathbb T=i\), (5.20) and (5.22).
4. Now we prove (5.24). Suppose (5.21), (5.23) and (5.10) hold in \(\Omega _0\subset \Omega \). From our previous results, \(\Omega _0\) holds with 1-high probability in \(\Omega \). Now we prove that (5.24) holds in \(\Omega _0\). First we assume that \(|1+m|\leqslant 3\), since otherwise (5.24) clearly holds. Together with (5.21), this implies that \((N\eta )^{-1}\leqslant 3N^{-\frac{1}{2}{\varepsilon }}\). Using (4.3) and \(|1+m|\leqslant 3\), we obtain \(|1+m^{(i,i)}|\leqslant 4\) and \(|m_G^{(\emptyset , i)}|\leqslant 5\). With (5.23), the bound \(|1+m^{(i,i)}|\leqslant 4\) implies \(|G^{( \emptyset , i)}_{ii}|\geqslant |5w|^{-1}\). The assumption \(w\in I_{\varepsilon }\) implies \(|w|\leqslant {\varepsilon }\) [see the definition of \(I_{\varepsilon }\) in (2.5)]. Then applying (5.10) to \(Z_i\), and using \(||z|-1|\leqslant 2{\varepsilon }\) and the bounds just proved on \((N\eta )^{-1}, m_G^{( \emptyset , i)}\) and \(G^{(\emptyset , i)}_{ii}\), we obtain that in \(\Omega _0\),
Together with \(|G^{( \emptyset , i)}_{ii}| \geqslant |5w|^{-1}\) and the assumption \(||z|-1|\leqslant 2{\varepsilon }\) and \(|w|\leqslant {\varepsilon }\), we have
Now inserting (5.38) into the identity (5.6) with \(\mathbb T=\emptyset \), using \(|m_G^{( \emptyset , i)}|\leqslant 5\), and \(|w|\leqslant {\varepsilon }\) again, we obtain that
Then together with (5.37) and (5.23), in \(\Omega _0\), we have
Combining (5.21) and (4.3), we have
Inserting it into (5.40), we have
It is easy to extend this result to the following one:
holds in a probability set \(\widetilde{\Omega }\subset \Omega \) such that \(\widetilde{\Omega }\) holds with \(1\)-high probability in \(\Omega \). Since \(m=\frac{1}{N}\sum _i\mathcal G_{ii}\), for small enough \({\varepsilon }\), with \(|w|\leqslant {\varepsilon }\) and \(||z^2|-1|\leqslant 2{\varepsilon }\), (5.42) implies that
This completes the proof of (5.24).
We note: combining (4.3), (5.1), (5.21) and (5.24), we have for any \( |U|,\; |T|=O(1),\)
5. For (5.25), it follows from (5.23) (with \(\mathcal G^{( i,\emptyset )}_{ii}\) in the l.h.s.), (5.43) and (5.16).
6. For (5.26), first using (5.5), (5.12), (5.13) and (5.20), we obtain that
holds with 1-high probability in \(\Omega \). Applying (5.16) on \(X^{(i,i)}\) instead of \(X\), we obtain that
Recall (5.1) implies (5.19). Applying (5.25) on \(G^{(i,i)}_{jj}\), we have that
holds with 1-high probability in \(\Omega \). Then inserting (5.45), (5.46), (5.23) and (5.43) into (5.44), with (5.18) we obtain (5.26).
7. For (5.27), from (5.3), we have
On the other hand, (5.6) and (5.13) show that (a similar result can be seen in (6.18) of [7])
Then
Since the \(X_{jk}\)’s \((1\leqslant k\leqslant N)\) are independent of \(G^{(\emptyset , j)}\), using the large deviation lemma (see, e.g., [7, Lemma 6.7]), as in (3.44) of [8], we have that with 1-high probability,
Inserting this bound, (5.25), (5.26) and (5.43) into (5.47), we have
i.e.,
It implies that
Then with (5.15) and (5.18), it implies
and we obtain (5.27).
8. For (5.28), using (5.4) and (5.20), we have
Using (5.10) and (5.20) again, we can bound \(\mathcal Z_i^{(ij)}\) as
Together with (5.43) and (5.21), we obtain (5.28).
9. For (5.29), using (5.5), (5.12) and (5.13), we obtain that
Furthermore, with (5.7), (5.11), (5.20) and (5.43), we have
Here these two bounds hold with 1-high probability. As in (5.46), applying (5.23) to \(\mathcal G^{(ij,i )}_{jj}\), with (5.43) we have
with 1-high probability in \(\Omega \). With (5.25), (5.28), (4.3) and (5.21), we also have
For the \(G_{jj}^{( i,\emptyset )}\) in (5.49), as in (5.47) and (5.48), with (5.20), we have
Then applying (5.25) on \(G_{jj}^{( i,i)}\), and applying (5.23) on \(\mathcal G_{ii}^{( i,\emptyset )}\), with (5.43) we obtain that
Inserting these bounds into (5.49) and (5.50), we obtain (5.29).
10. For (5.30), using (5.10) (with \(\mathbb T=\emptyset \)), (5.23) and (5.43), we have
holds with 1-high probability in \(\Omega \). Together with (5.18), we obtain
Together with (5.25) and (5.18), we have
Then with (2.6), we obtain (5.30).
11. For (5.31), we note that (5.24) implies \(|m|\geqslant (\log N)^{-1}\). Then with (5.43), we obtain (5.31). \(\square \)
5.3 Polynomialization of Green’s functions
In this subsection, using the bounds we proved in the last subsection, we write the \(G\)’s and \(\mathcal G\)’s as polynomials in \(\mathcal F\) and \(\mathcal F_{1/2}\cdot \mathcal F\) (with negligible error).
We note: in Lemmas 3.2 and 4.9 we assumed \(X_{ab}=0\), but the bounds proved in Lemmas 5.6 and 5.7 still hold for this type of \(X\); a similar detailed argument was given in Remark 3.8 of [8].
Lemma 5.8
Lemmas 5.6 and 5.7 still hold if one enforces \(X_{st}=0\) for some fixed \(1\leqslant s,t\leqslant N\).
Note Here \(s,t\) are allowed to be equal to the \(i,j\) in Lemmas 5.6 and 5.7. For example, from (5.29), we have \(|G_{st}|\leqslant \varphi ^Cm^{1/2}(N\eta )^{-1/2}\), even if \(X_{st}=0\).
By the definitions of \(A_X^{(f)}, \fancyscript{P}(X)\)’s, \(B(X)\)’s and \(P(X)\)’s, one can see that the values of \(A_X^{(f)}, \fancyscript{P}_i(X)\) (\(i=1,2,3\)) would not change if one replaced the \(G\)’s inside with \(\chi _a G\)’s. Therefore, instead of \(G\)’s, we will write \(\chi _aG\) as some polynomials in \(\mathcal F\) and \(\mathcal F_{1/2}\cdot \mathcal F\) (with negligible errors).
Definition 5.9
For simplicity, we define the notations:
We collect some basic properties of these quantities in the following lemma.
Lemma 5.10
Under the assumption of Lemma 3.2, for \(z, w\): \(||z^2|-1|\leqslant 2{\varepsilon }\) and \(w\in I_{\varepsilon }\), there exists uniform \(C>0\) such that
hold with 1-high probability.
Proof of Lemma 5.10
We note \(\chi _a=1\) implies the condition (5.1). Hence the results in Lemma 5.7 hold with 1-high probability. First from (5.31) and \(|w|\geqslant \eta \), we have the first and the third inequalities of (5.54), and the first inequality of (5.55). The second inequality in (5.54) follows from (5.18) and (5.43). It also implies the second inequality of (5.57). Combining the second inequality of (5.54) with (2.6), we obtain the second inequality in (5.55). For (5.56), one can easily check this identity by the definition of \(\beta \) and \(\gamma \). For the first inequality of (5.57), it follows from (5.21) and (5.43). \(\square \)
Definition 5.11
Under the assumption of Lemma 3.2, for \(w\in I_{\varepsilon }, ||z|-1|\leqslant 2{\varepsilon }\) and \(s, k\ne a\), we define \(S_{ks}\) and \(\widetilde{S}_{sk}\) as the random variables which are independent of the \(a\)-th row and column of \(X\) and
With (5.5), one can obtain their explicit expressions, e.g.,
Similarly, we define \(\mathcal S_{ks}\) and \(\widetilde{\mathcal S}_{sk}\) as random variables which are independent of the \(a\)-th row and column of \(X\) and
As one can see, \(S, \widetilde{S}, \mathcal S\) and \(\widetilde{\mathcal S}\) have the same behavior. We collect some basic properties of these quantities in the following lemma.
Lemma 5.12
We assume that \(||z|-1|\leqslant 2{\varepsilon }, w\in I_{\varepsilon }, k\ne a\) and \(X\) satisfies the assumption of Lemma 3.2. For some \(C>0\), with 1-high probability, we have
and likewise for \(\widetilde{S}, \mathcal S\) and \(\widetilde{\mathcal S}\). Recalling the definition of the \(\mathcal F\)’s in Definition 4.3, for some \(C>0\), we have
and
Furthermore, (5.58), (5.59) and (5.60) hold uniformly for \(||z|-1|\leqslant 2{\varepsilon }, w\in I_{\varepsilon }\) and \(k,s: k,s\ne a, 1\leqslant k,s \leqslant N\).
Note With (5.59), we also have
Proof of Lemma 5.12
Since the uniformness is easy to check, we will only focus on fixed \(z, w, s\) and \(k\).
1. For (5.58): the condition \(\chi _a\ne 0\) implies that we can apply Lemma 5.7 to \(X^{(a,a)}\). Recall that these bounds also hold under the exchange (5.32). Then the bounds (5.25) and (5.26) imply that for \(s\ne k\),
hold with 1-high probability. Similarly, it follows from (5.23) and (5.43) that for \(s=k\)
holds with 1-high probability. Then with the explicit expression of \(S_{ks}\) in Definition 5.11, we have that
holds with 1-high probability. Since the \(X_{tk}\)’s (\(1\leqslant t\leqslant N\)) are independent of the \(\mathcal G^{(ak,a)}_{st}\)’s, using the large deviation lemma (see, e.g., [7, Lemma 6.7]), as in (3.44) of [8], we have that
holds with 1-high probability. Applying Lemma 5.7 on the \(X^{(a ,a )}\) again, from (5.27), we have that
holds with 1-high probability. Together with the first part of (5.62), (5.63) and (5.64), we obtain that
holds with 1-high probability. At last, with (5.18) and (5.43), we obtain (5.58).
2. For (5.59), we recall the definition of \(\mathcal F\) in Definition 4.3, especially the two \(N^{1/2}\) factors in \(\mathcal F\). It is easy to see that (5.59) follows from the first inequality of (5.55) and the bounds on \(S\) in (5.58).
3. For (5.60): since (5.58) also holds for \(\widetilde{S}\), with the first inequality of (5.55) we have that
holds with 1-high probability. Together with the definition of \(\mathcal F\), we obtain (5.60).
\(\square \)
Now we introduce a method to track and show the dependence of random variables on the indices. First we give a simple example to show the basic idea. Let \(A_{kl}, 1\leqslant k,l\leqslant N\) be a family of random variables:
By definition of \(\mathcal F\) and \(\mathcal F_0\), we can say,
But the first part of the r.h.s. of (5.66), i.e., \(\frac{G^{(a,a)}_{kk}}{|G^{(a,a)}_{kk}|}\) only depends on the first index \(k\), the second part \(\frac{G^{(a,a)}_{ll}}{|G^{(a,a)}_{ll}|}\) only depends on the second index \(l\) and the third part is independent of the indices. Therefore, we prefer to write it as
More precisely, \(A_{kl}\in \mathcal F_0^{[k]}\cdot \mathcal F_0^{[l]}\cdot \mathcal F_1^{[\emptyset ]} \) means that \( A_{kl}=F_1(k)F_2(l)F_3 \) with \(F_1(k)\in \mathcal F_0, F_2(l)\in \mathcal F_0, F_3\in \mathcal F_1\), where \(F_{1}(k)\) depends only on the index \(k\), \(F_{2}(l)\) depends only on the index \(l\), and \(F_{3} \) does not depend on any index.
For the general case, to indicate how random variables depend on the indices, we introduce the following notation.
Definition 5.13
Let \(A_I\) be a family of random variables, where \(I\) is an index vector not including the index \(a\). We write
where each \( I_i\) is a part of \(I\), if and only if there exist \( F_{ i}^{[I_i]}\in \mathcal F _{\alpha _i}\) such that \(A_I=\prod _i F_{ i}^{[I_i]}\) and each \(F_{ i}^{[I_i]}\) depends only on the indices in \( I_i \).
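In display form, the definition reads:

```latex
A_I \in \prod_i \mathcal F_{\alpha_i}^{[I_i]}
\quad\Longleftrightarrow\quad
A_I = \prod_i F_i^{[I_i]},
\qquad F_i^{[I_i]} \in \mathcal F_{\alpha_i},
```

where each \(F_i^{[I_i]}\) depends only on the indices in \(I_i\).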
For the example in (5.66), we write \(A_{kl}\in \mathcal F_0^{[k]}\cdot \mathcal F_0^{[l]}\cdot \mathcal F_1^{[\emptyset ]} \), where \(I=(k, l)\), \(I_1=(k)\), \(I_2=(l)\), \(I_3=(\emptyset )\), \(\alpha _1=\alpha _2=0\) and \(\alpha _3=1\).
The following lemma shows that the \(G\)'s can be written as polynomials in the \(\mathcal F\)'s.
Lemma 5.14
For simplicity, we introduce the notation:
i.e., (Note: no sum)
Let \(w\in I_{\varepsilon }\) and \(||z|-1|\leqslant 2{\varepsilon }\). Under the assumption of Lemma 3.2, for any \(D>0\), we have
and
For any \(k\ne a\), we have
and,
For any \(k, l\ne a\), we have
Furthermore, (5.68)–(5.72) hold uniformly for \(||z|-1|\leqslant 2{\varepsilon }, w\in I_{\varepsilon }\) and \(1\leqslant k,l\ne a\leqslant N\).
Proof of Lemma 5.14
Since the uniformity can easily be checked, in the following proof we focus only on fixed \(w, z, k\) and \(l\). Recalling (4.18) and (5.33), with the assumptions \(w\in I_{\varepsilon }\) and \(||z|-1|\leqslant 2{\varepsilon }\), we know that the results in Lemmas 5.6 and 5.7 hold under the assumption of this lemma. Furthermore, these results also hold for \(X^{(a,a)}\) (instead of \(X\)).
1. We first prove (5.68). Applying Lemma 5.7 to \(X^{(a,a)}\), with (5.25), (5.29) and the first inequality of (5.57), we have
Then with
and \(\alpha :=\chi _am^{(a,a)}\), we have
From (5.6) and (5.20) with \(i=a, {\mathbb {T}}=a\), we have
Then with (5.22), for any \({\varepsilon }, D>0\), there exists \(C_{{\varepsilon }, D}\) depending on \({\varepsilon }\) and \(D\), such that
holds with 1-high probability. Hence with (5.43) and \(\chi _aZ_{a}^{(a)} \in _n m^{(a,a)} \mathcal F\) in (5.75), we obtain that
which implies (5.68), using the fact that \(\mathcal G^{(a, \emptyset )}_{aa }\) and \( G^{( \emptyset , a)}_{aa }\) have the same behavior.
2. Now we prove (5.69). From (5.30) and (5.4), with \(i=a\) and \(T=\emptyset \), for any \({\varepsilon }, D>0\), there exists \(C_{{\varepsilon }, D}\) depending on \({\varepsilon }\) and \(D\) such that with 1-high probability,
Note that \(1+ m_\mathcal G^{(a, \emptyset )}+ |z|^2 \mathcal G^{(a, \emptyset )}_{aa}\) is independent of the \(a\)-th column of \(X\), but depends on the \(a\)-th row of \(X\). By definition, we write
Now we claim that for any \(D\),
and
Combining (5.79), (5.80) and (5.77), we obtain (5.69).
2.a We prove (5.79) first. Using the \(\mathcal G\) version of (5.61) and (5.76), we can write the first two terms of the r.h.s. of (5.78) as
Similarly for the third term of the r.h.s. of (5.78), using (5.2), we can write it as
Using (5.75), (5.61) and (5.76), we obtain
For the fourth term of the r.h.s. of (5.78), using (5.2), we have
Together with (5.60), it implies that
and
Now inserting these bounds back into (5.78) and using the relations between \(\alpha , \beta \) and \(\gamma \) in (5.54) and (5.55), we obtain (5.79).
2.b Now we prove (5.80). With (5.83) and
we have
We write this denominator as
With (5.22), (5.43), (5.18), we can bound the first term in the second bracket as follows:
holds with 1-high probability. Together with (5.60) and (5.55), with 1-high probability, we can bound the second bracket of (5.85) as
On the other hand, we claim that for some \(C>0\), the following inequality holds with 1-high probability:
If (5.87) does not hold, then \(\chi _a=1\) and \(1+m^{(a,a)}=(-|z|+O((\log N)^{-C}))w^{-1/2}\). With (4.3), (5.21) and \(||z|-1|\leqslant 2{\varepsilon }\), we obtain
It follows from \(1+m^{(a,a)}=(-|z|+O((\log N)^{-C}))w^{-1/2}\) and (5.23) that
Inserting them into (5.10), with (2.6), we have
Now inserting (5.88), (5.89) and (5.90) into (5.4), we obtain \(|G_{aa}|\geqslant (\log N)^{C-1}|w|^{-1/2}\) for any \(C>0\), which contradicts (5.15). Therefore, (5.87) must hold for some \(C>0\).
Recall that the denominator of the r.h.s. of (5.84) equals the sum of the l.h.s. of (5.86) and (5.87) [see (5.85)]. Then inserting (5.86) and (5.87) into (5.85), we have that for any fixed \(D\), there exists \(C_{{\varepsilon }, D}\) such that with \(1\)-high probability,
For the terms in (5.91), we apply (5.75) to \((XG^{(a,a)}X^T)_{aa}\) and \(Z^{(a)}_a\), (5.24) to \((1+m^{(a,a)})\), (5.60) to \(\left( X \widetilde{\mathcal S}\mathcal S X^T \right) _{aa}\), (5.55) to \(\gamma \), and (5.87) to the denominator of (5.91); we obtain
With the bounds of \(\alpha \) and \(\gamma \) in (5.54), (5.55) and (5.57), it implies (5.80). Combining (5.79), (5.80) and (5.77), we obtain (5.69).
3. For (5.70), it follows directly from Definition 5.11, (5.58) and Definition 5.13.
4. Now we prove (5.71). First with (5.6) and (5.13), we have
Applying (5.3) on \(G_{ak}\) with \(i=a\), recalling \(Y=X-zI\), we have
Writing the first term in the r.h.s. as \(G^{(\emptyset ,a)}_{ak}\mathcal G_{aa}(\mathcal G_{aa})^{-1}\) and applying (5.92) on \((\mathcal G_{aa})^{-1}\), we can write the first three terms in the r.h.s. of (5.93) as
Therefore
Using the fact that \(\alpha \beta =\chi _a|w|^{-1}\), and inserting (5.68)–(5.70), (5.81), (5.82) and (5.61) into (5.94), we have
More precisely, here what we used is the \(G\)-version of (5.81), (5.82), i.e.,
They follow from (5.81), (5.82) and the symmetry between \(G\) and \(\mathcal G\).
Next using (5.57) and the definitions of \(\alpha , \beta \) and \(\gamma \), we have \(\alpha +\beta \gamma \prec \sqrt{\frac{\alpha }{N\eta }}\) and we write
For \( (XG^{(\emptyset ,a)})_{ak}\) in (5.95), using (5.2), for \( k\ne a\) we have (note: \(s\) can be \(a\))
Together with (5.61), (5.68) and (5.70), it implies that
It follows from (5.73) (note: \( |w^{-1/2}|\gamma =\alpha ^{1/2} (N\eta ) ^{-1/2} \)) that
Inserting it into (5.96), with Lemma 5.10, we obtain
Together with (5.95), we obtain (5.71).
5. Now we prove (5.72). With (5.97), (5.70) and Lemma 5.10, we have
Together with (5.3), (5.92), (5.68) and (5.56), we can write \(G_{kl}\) as follows:
Furthermore, with (5.2), (5.68), (5.70) and (5.56), we can write \(G^{(\emptyset ,a)}_{kl}\) as
Therefore, together with (5.99), it implies (5.72). \(\square \)
Next, we write the terms appearing in Lemma 4.9 as polynomials in \(\mathcal F\), \(\mathcal F_{1/2}\) and \(\mathcal F_{1/2}\cdot \mathcal F\) (with appropriate deterministic coefficients and negligible error terms).
Lemma 5.15
Let \(w\in I_{\varepsilon }\) and \(||z|-1|\leqslant 2{\varepsilon }\). Under the assumption of Lemma 3.2, for any fixed large \(D>0\), with \(\chi _a\) defined in (4.17) and \(F_{ 0, X}^{[k]} \) defined in (5.67), we have that for \(k\ne a\)
Proof of Lemma 5.15
1. For (5.100), using (5.72) and (5.69), we have
Here for the last \(\in _n\), we used
Then with (5.54) and (5.55), we obtain (5.100).
2. For (5.101), it follows from (5.72), \(\mathcal F_{1/2}\cdot \mathcal F_{1/2}\subset \mathcal F\), \(X_{ab}=0\) and \(X_{ba}\in \mathcal F_{1/2}\) that
If \(\chi _a\ne 0\), then (5.1) holds for \(m^{(a,a)}\) and Lemma 5.7 holds for \(G^{(a,a)}\). With (5.25), this implies that \(\chi _aG^{(a,a)}_{bb} \prec \alpha \). Then using Lemma 5.10, we obtain (5.101).
3. For (5.102), with (5.3) and (5.92), we can write it as
Then using (5.69), (5.68) and (5.81), we obtain (5.102).
4. Now we prove (5.103). With (5.3) and (5.92) again, we write it as
Then using (5.98) and (5.69), we obtain (5.103).
5. For (5.104), by definition, we write \((YG^2 )_{ab}\) as
Then using (5.103) and (5.72), with \(X_{ab}=0\), we get
With (5.108) and Lemma 5.10, we obtain
Then applying (5.73) to \( G^{(a,a)}_{kb}\), we obtain \( \chi _aG^{(a,a)}_{kb}\in \left( |w|^{-1/2}\gamma +\delta _{kb}\alpha \right) \mathcal F_0^{[k,b]}\). Now with
and Lemma 5.10 again, we get
Similarly, with (5.102), (5.71) and Lemma 5.10 again, we obtain
and we obtain (5.104).
6. For (5.105), we write \((G^2)_{aa}\) as
where for the second \(\in _n\), we used (5.69) and (5.54). As in (5.111), using (5.71) and (5.108), we have
Then with Lemma 5.10, we obtain (5.105).
7. For (5.106), we write it as
With (5.72), (5.108) and \(\sum _k \mathcal F^{[k,b]}_\alpha \in N\mathcal F^{[b]}_\alpha \) (\(\alpha = 0, 1/2, \emptyset \)), after a tedious calculation, we get
Then using Lemma 5.10, \(\mathcal F^{[b]}_{1/2}\mathcal F^{[b]}_{1/2}\in \mathcal F, X_{ba}\mathcal F^{[b]}_{1/2}\in \mathcal F\) and \(X_{ba}^2\in \mathcal F\), we obtain
Similarly, using (5.72) and Lemma 5.10, we have
Using (5.72), \(\mathcal F_{0, X}\in \mathcal F_{1/2}\) and \(\mathcal F_{1/2}\cdot \mathcal F_{1/2}\subset \mathcal F\), we have \(\chi _a G_{ba}, \chi _aG_{ab}\in \chi _a( \frac{1}{N\eta }+\beta )\mathcal F \). Together with Lemma 5.10, it implies that
which completes the proof of (5.106).
8. For (5.107), it follows from
together with (5.69) and (5.105). \(\square \)
Now we are ready to prove Lemma 4.9, which is the key lemma in the proof of our main result.
5.4 Proof of Lemma 4.9
First, with \(m-m^{(a,a)}=O((N\eta )^{-1})\) [see (4.3)] and the definition of \(\chi _a\), for any fixed \(D>0\), with 1-high probability we can write \(h(t_X)\) as
where the constant \(C_{{\varepsilon }, D}\) depends on \({\varepsilon }\) and \(D\), and \(h^{(k)}\) is the \(k\)-th derivative of \(h\). Using (5.100) and the fact that \(h\) is smooth and supported in \([1, 2]\), we obtain
and
Note that \(\mathbf{1}(|t_{X^{(a,a)}}|\leqslant 2)=\mathbf{1}(|{{\mathrm{Re}}}\,m^{(a,a)}|\leqslant 2N^{{\varepsilon }}(N\eta )^{-1})\). Similarly, one can prove
Using (5.112), (5.113) and (5.100), we have
It implies (4.19).
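The first step above expands \(h(t_X)\) around \(t_{X^{(a,a)}}\). Schematically, with an illustrative truncation order \(K\) (the exact order and the remainder constant \(C_{\varepsilon, D}\) depend on \(D\)), the expansion reads:

```latex
h(t_X) \;=\; \sum_{k=0}^{K} \frac{h^{(k)}\big(t_{X^{(a,a)}}\big)}{k!}
\,\big(t_X - t_{X^{(a,a)}}\big)^{k} \;+\; \mathrm{O}\big(C_{\varepsilon, D}\, N^{-D}\big),
```

with the remainder controlled, with 1-high probability, by \(m-m^{(a,a)}=O((N\eta )^{-1})\).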
For (4.20), recall \(B_m(X)\) is defined as
Then using (5.112), (5.114) and (5.100), we obtain (4.20).
Similarly, for (4.21), the terms appearing in the definition (3.20) have all been bounded in (5.104), (5.69), (5.106), (5.107) and (5.101). With a simple calculation, one can obtain (4.21) and complete the proof. \(\square \)
Notes
For the sake of notational simplicity we do not consider complex entries in this paper, but the statements and proofs are similar.
References
Ameur, Y., Hedenmalm, H., Makarov, N.: Fluctuations of eigenvalues of random normal matrices. Duke Math. J. 159, 31–81 (2011)
Ameur, Y., Ortega-Cerdà, J.: Beurling–Landau densities of weighted Fekete sets and correlation kernel estimates, preprint, arXiv:1110.0284 (2011)
Bai, Z.D.: Circular law. Ann. Probab. 25(1), 494–529 (1997)
Bai, Z.D., Silverstein, J.: Spectral Analysis of Large Dimensional Random Matrices, Mathematics Monograph Series, 2nd edn. Science Press, Beijing (2006)
Bloemendal, A., Erdős, L., Knowles, A., Yau, H.-T., Yin, J.: Isotropic local laws for sample covariance and generalized Wigner matrices, preprint, arXiv:1308.5729 (2013)
Borodin, A., Sinclair, C.D.: The Ginibre ensemble of real random matrices and its scaling limits. Commun. Math. Phys. 291(1), 177–224 (2009)
Bourgade, P., Yau, H.-T., Yin, J.: Local circular law for random matrices, preprint, arXiv:1206.1449 (2012)
Bourgade, P., Yau, H.-T., Yin, J.: The local circular law II: the edge case, preprint, arXiv:1206.3187 (2012)
Edelman, A.: The probability that a random real Gaussian matrix has \(k\) real eigenvalues, related distributions, and the circular law. J. Multivariate Anal. 60(2), 203–232 (1997)
Erdős, L., Yau, H.-T., Yin, J.: Bulk universality for generalized Wigner matrices. Probab. Theory Relat. Fields, preprint, arXiv:1001.3453 (2010)
Forrester, P.J., Nagao, T.: Eigenvalue statistics of the real Ginibre ensemble. Phys. Rev. Lett. 99 (2007)
Ginibre, J.: Statistical ensembles of complex, quaternion, and real matrices. J. Math. Phys. 6, 440–449 (1965)
Girko, V.L.: The circular law. Teor. Veroyatnost. i Primenen. 29(4), 669–679 (1984)
Götze, F., Tikhomirov, A.: The circular law for random matrices. Ann. Probab. 38(4), 1444–1491 (2010)
Lee, J., Yin, J.: A necessary and sufficient condition for edge universality of Wigner matrices. Duke Math. J. (2013, to appear)
Pan, G., Zhou, W.: Circular law, extreme singular values and potential theory. J. Multivariate Anal. 101(3), 645–656 (2010)
Rudelson, M.: Invertibility of random matrices: Norm of the inverse. Ann. Math. 168(2), 575–600 (2008)
Rudelson, M., Vershynin, R.: The Littlewood–Offord problem and invertibility of random matrices. Adv. Math. 218(2), 600–633 (2008)
Sinclair, C.D.: Averages over Ginibre’s ensemble of random real matrices. Int. Math. Res. Notices IMRN, no. 5 (2007)
Tao, T., Vu, V.: Random matrices: the circular law. Commun. Contemp. Math. 10(2), 261–307 (2008)
Tao, T., Vu, V.: Random matrices: universality of ESDs and the circular law, with an appendix by Manjunath Krishnapur. Ann. Probab. 38(5), 2023–2065 (2010)
Tao, T., Vu, V.: Random matrices: universality of local spectral statistics of non-Hermitian matrices, preprint, arXiv:1206.1893 (2012)
Wood, P.: Universality and the circular law for sparse random matrices. Ann. Appl. Probab. 22(3), 1266–1300 (2012)
J. Yin was partially supported by NSF grants DMS-1001655 and DMS-1207961.
Yin, J. The local circular law III: general case. Probab. Theory Relat. Fields 160, 679–732 (2014). https://doi.org/10.1007/s00440-013-0539-3
Keywords
- Local circular law
- Universality
Mathematics Subject Classification (2010)
- 15B52
- 82B44