1 Introduction

In the seminal paper [31] Wigner introduced random self-adjoint matrices, \(\mathbf {H}=\mathbf {H}^*\), with centered, identically distributed and independent entries (subject to the symmetry constraint). He proved that the empirical density of the eigenvalues converges to the semicircle distribution. Wigner also conjectured that the distribution of the distance between consecutive eigenvalues (gap statistics) is universal, i.e., it is the same as in the Gaussian model. His revolutionary observation was that these universality phenomena hold for much larger classes of physical systems and that only the basic symmetry type determines the local spectral statistics. It is generally believed, but mathematically unproven, that random matrix theory (RMT), among many other examples, also describes the local statistics of random Schrödinger operators in the delocalized regime and of quantizations of chaotic classical Hamiltonians.

The first rigorous results on the local eigenvalue statistics in the bulk spectrum were given by Dyson, Mehta and Gaudin in the 60’s. These concerned the Gaussian models and identified their local correlation functions. According to Wigner’s universality hypothesis, these statistics should hold independently of the common law of the matrix elements. This conjecture was resolved recently in a series of works. The strongest result on Wigner matrices in the bulk spectrum is Theorem 7.2 in [13]; see [19, 30] for a summary of the history and related results. In fact, the three-step approach developed in [14, 17, 20] also applies to generalized Wigner matrices that allow for non-identically distributed matrix elements as long as the variance matrix \(s_{ij}:= \mathbbm {E}|h_{ij}|^2\) is stochastic, i.e. \(\sum _j s_{ij}=1\) (in particular, independent of the index i). The stochasticity of \(\mathbf {S}\) guarantees that the eigenvalue density is given by the semicircle law and that the diagonal elements \(G_{ii} = G_{ii}(z) \) of the resolvent

$$\begin{aligned} \begin{aligned} \varvec{\mathrm {G}}(z) = (\mathbf {H}-z)^{-1}, \quad \mathrm {Im}\, z>0, \end{aligned} \end{aligned}$$
(1.1)

become not only deterministic but also independent of i as the matrix size N goes to infinity. They asymptotically satisfy a system of self-consistent equations

$$\begin{aligned} \begin{aligned} -\frac{1}{ G_{ii}} \,\approx \, z+ \sum _{j} s_{ij} G_{jj}, \end{aligned} \end{aligned}$$
(1.2)

that reduces to a particularly simple scalar equation

$$\begin{aligned} \begin{aligned} -\frac{1}{m} = z + m, \end{aligned} \end{aligned}$$
(1.3)

for the common value \(m\approx G_{ii}\) for all i as \(N\rightarrow \infty \). The solution \( m = m(z) \) of (1.3) is the Stieltjes transform of the Wigner semicircle law.
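For orientation, (1.3) can be solved explicitly; the following standard computation (added here for the reader's convenience) recovers the familiar formula. Multiplying (1.3) by m gives \(m^2+zm+1=0\), hence

$$\begin{aligned} m(z) \,=\, \frac{-z+\sqrt{z^2-4}}{2}, \qquad z \in \mathbb {H}, \end{aligned}$$

where the square root is the branch with \(\sqrt{z^2-4}\sim z\) as \(|z|\rightarrow \infty \); this choice yields \(\mathrm {Im}\,m(z)>0\) and \(m(z)\sim -1/z\) at infinity. On the real line one finds \(\mathrm {Im}\,m(\tau ) = \tfrac{1}{2}\sqrt{(4-\tau ^2)_+}\), so that \(\pi ^{-1}\mathrm {Im}\,m\) is the semicircle density supported on \([-2,2]\).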

In this paper we consider a general variance matrix \(\varvec{\mathrm {S}}\) without stochasticity condition. We show that the approximate self-consistent Eq. (1.2) still holds, but it does not simplify to a scalar equation. In fact, \(G_{ii}\) remains i-dependent even as \(N\rightarrow \infty \) and it is close to the solution \(m_i \) of the Quadratic Vector Equation (QVE)

$$\begin{aligned} \begin{aligned} -\frac{1}{ m_i} = z+ \sum _{j} s_{ij} m_j, \end{aligned} \end{aligned}$$
(1.4)

under the additional condition that \(\mathrm {Im}\, m_i > 0\).

In the context of random matrices, the importance of this equation was realized by Girko [23], Shlyakhtenko [29], and Khorunzhy and Pastur [25]; see also Guionnet [24] as well as Anderson and Zeitouni [5, 6]. However, no detailed study had been initiated. In the companion paper [1] we analyzed (1.4) in full detail. See also Section 3 of [2] for how the QVE is related to other random matrix models. We showed that \(\langle m \rangle := N^{-1}\sum _i m_i \) is the Stieltjes transform of a probability density \(\rho \) that is supported on a finite number of intervals, inside which it is a real analytic function. We also described the behavior of \( \rho \) near the edges of its support; it features only square root or cubic root (cusp) singularities, together with an explicit one-parameter family of profiles interpolating between them as a gap in the support closes.

The main result of the current paper is the universality of the local eigenvalue statistics in the bulk for Wigner-type matrices with a general variance matrix (cf. Theorem 1.16). This extends Wigner’s vision towards full universality by considering a much larger class of matrix ensembles than previously studied. In particular, we demonstrate that local statistics, as expected, are fully independent of the global density. This fact has already been established for very general \(\beta \)-ensembles in [10] (see also [8, 28]) and for additively deformed Wigner ensembles having a density with a single interval support [27]. Our class admits a general variance matrix and allows for densities with several intervals (we do not, however, consider non-centered distributions here; an extension to matrices with non-centered entries on the diagonal may be incorporated in our analysis with additional technical effort).

Following the three-step approach, we first prove local laws for \(\varvec{\mathrm {G}}\) on the scale \(\eta =\mathrm {Im}\, z \gg N^{-1}\), i.e. down to the optimal scale just slightly above the eigenvalue spacing. This is the main and novel part of our analysis. The previous proofs (see [14] for a pedagogical presentation) heavily relied on properties of the semicircle law, especially on its square root edge singularity. With possible cubic root singularities and small gaps in the support of \(\rho \) an additional scale appears which needs to be controlled. The second step is to prove universality for Wigner-type matrices with a tiny Gaussian component via Dyson Brownian motion (DBM). The method of local relaxation flow, introduced first in [16, 17], also heavily relies on the semicircle law since it requires that the global density remain unchanged along DBM. In [18] a new method was developed to localize the DBM that proves universality of the gap statistics around a fixed energy \(\tau \) in the bulk, assuming that the local law holds near \(\tau \) (we remark that a similar result was obtained independently in [26]). Since Wigner-type matrices were one of the main motivations for [18], it was formulated such that it could be directly applied once the local laws are available. Finally, the third step is a perturbation result to remove the tiny Gaussian component using the Green function comparison method that first appeared in [20] and can be applied to our case basically without any modifications.

In a separate paper [3] we apply the results of this work and [1] to treat Gaussian random matrices with correlated entries. Assuming translation invariance of the correlation structure in these Gaussian matrix ensembles we prove an optimal local law, bulk universality and non-trivial decay of off-diagonal resolvent entries.

1.1 Set-up and main results

Let \(\mathbf {H}^{(N)}\in {\mathbb {C}}^{N \times N}\) be a sequence of self-adjoint random matrices. In particular, if the entries of \(\mathbf {H}\) are real then \(\mathbf {H}^{(N)}\) is symmetric. The matrix ensemble \(\mathbf {H}=\mathbf {H}^{(N)}\) is said to be of Wigner-type if its entries \(h_{i j}\) are independent for \(i \le j\) and centered, i.e.,

$$\begin{aligned} \begin{aligned} \mathbbm {E}h_{ij} \,=\, 0 \quad \text { for all } \quad i,j \,=\, 1,\dots ,N. \end{aligned} \end{aligned}$$
(1.5)

The dependence of \(\mathbf {H}\) and other quantities on the dimension N will be suppressed in our notation. The matrix of variances, \(\mathbf {S}=(s_{i j})_{i,j=1}^N\), is defined through

$$\begin{aligned} \begin{aligned} s_{ij} := \mathbbm {E}|h_{i j}|^2. \end{aligned} \end{aligned}$$
(1.6)

It is symmetric with non-negative entries. In [1] it was shown that for every such matrix \(\mathbf {S}\) the quadratic vector equation (QVE),

$$\begin{aligned} \begin{aligned} -\, \frac{1}{m_i(z)} \,=\, z+\sum _{j=1}^N s_{ij} m_j(z), \quad \text {for all } \quad i=1,\dots ,N \text { and } z \in \mathbb {H}, \end{aligned} \end{aligned}$$
(1.7)

for a function \(\mathbf {m}=(m_1,\dots ,m_N) : \mathbb {H}\rightarrow \mathbb {H}^N\) on the complex upper half plane, \(\mathbb {H}=\{z\in {\mathbb {C}}: \mathrm {Im}z >0 \}\), has a unique solution. The main result of this paper is to establish the local law for Wigner-type matrices, i.e. that for large N the resolvent, \(\mathbf {G}(z) = (\mathbf {H}-z)^{-1}\), with spectral parameter \(z= \tau +\mathrm {i}\eta \in \mathbb {H}\), is close to the diagonal matrix, \(\mathrm {diag}(\mathbf {m}(z))\), as long as \(\eta \gg N^{-1}\). As a consequence, we obtain rigidity estimates on the eigenvalues and complete delocalization for the eigenvectors. Combining this information with the Dyson–Brownian motion, we obtain universality of the eigenvalue gap statistics in the bulk.
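To make the QVE (1.7) concrete, here is a minimal numerical sketch (ours, not part of the paper) that approximates \(\mathbf {m}(z)\) for a given variance matrix \(\mathbf {S}\) by iterating the map \(\mathbf {m}\mapsto -(z+\mathbf {S}\mathbf {m})^{-1}\) componentwise; for spectral parameters whose imaginary part is not too small this simple fixed-point iteration converges to the unique solution in practice. The function name `solve_qve` and all numerical parameters are our own choices.

```python
import numpy as np

def solve_qve(S, z, tol=1e-10, max_iter=10_000):
    """Approximate the solution m(z) of the QVE  -1/m_i = z + (S m)_i
    by componentwise fixed-point iteration (numerical sketch only)."""
    m = np.full(S.shape[0], -1.0 / z, dtype=complex)  # initial guess m_i ~ -1/z
    for _ in range(max_iter):
        m_new = -1.0 / (z + S @ m)
        if np.max(np.abs(m_new - m)) < tol:
            break
        m = m_new
    return m_new

# Example: the flat profile s_ij = 1/N is stochastic, so all m_i coincide
# with the Stieltjes transform of the semicircle law, cf. (1.3).
N = 1000
S = np.full((N, N), 1.0 / N)
m = solve_qve(S, 0.5 + 0.05j)
print(np.allclose(m, m[0]))   # True: the solution is independent of i here
```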

We now list the assumptions on the variance matrices \(\mathbf {S}=\mathbf {S}^{(N)}\). The restrictions on \(\mathbf {S}\) are controlled by three model parameters, \(p, P >0\) and \(L \in \mathbb {N}\), which do not depend on N. These parameters will remain fixed throughout this paper.

  1. (A)

    For all N the matrix \(\mathbf {S}\) is flat, i.e.,

    $$\begin{aligned} \begin{aligned} s_{i j} \;\le \; \frac{1}{N}, \quad i,j=1,\dots ,N. \end{aligned} \end{aligned}$$
    (1.8)
  2. (B)

    For all N the matrix \(\mathbf {S}\) is uniformly primitive, i.e.,

    $$\begin{aligned} \begin{aligned} (\varvec{\mathrm {S}}^{L})_{i j} \,\ge \, \frac{p}{N}, \quad i,j=1,\dots ,N. \end{aligned} \end{aligned}$$
    (1.9)
  3. (C)

    For all N the matrix \(\mathbf {S}\) induces a bounded solution of the QVE, i.e., the unique solution \(\mathbf {m}\) of (1.7) corresponding to \(\mathbf {S}\) is bounded,

    $$\begin{aligned} \begin{aligned} |m_i(z)|\;\le \; P, \quad i=1,\dots ,N, \quad z \in \mathbb {H}. \end{aligned} \end{aligned}$$
    (1.10)

Remark 1.1

(Boundedness and normalization) The assumption on the boundedness of \(\mathbf {m}\) is an implicit condition in the sense that it can be checked only after solving (1.7). In Theorem 6.1 of [1] we list sufficient, explicitly checkable conditions on \(\mathbf {S}\), which ensure (1.10). We also remark that the assumption (1.8) can be replaced by \(s_{ij} \le C/N \) for some positive constant C. This will lead to a rescaling (cf. Remark 2.2 of [1]) of \(\mathbf {m}\). We pick the normalization \( C=1 \) just for convenience.

Remark 1.2

(Primitivity) The primitivity condition (1.9) excludes some important models, e.g. matrices of the form

$$\begin{aligned} \varvec{\mathrm {H}} = \begin{pmatrix} \varvec{\mathrm {0}} &{} \varvec{\mathrm {X}} \\ \varvec{\mathrm {X^*}} &{}\varvec{\mathrm {0}} \end{pmatrix}, \end{aligned}$$

whose nonzero eigenvalues are, up to sign, the singular values of \(\varvec{\mathrm {X}}\), i.e. the square roots of the eigenvalues of the Gram matrix \(\varvec{\mathrm {X}}\varvec{\mathrm {X}}^*\), where \(\varvec{\mathrm {X}}\) has independent centered entries with an arbitrary variance profile. Condition (B) is not a mere technicality: Gram matrices may have a singularity in the spectrum near 0 (often referred to as the ‘hard edge’) that requires separate treatment, and even away from 0 some new ideas are needed. The complete analysis is presented in [4], where we prove local laws for Gram matrices.

In addition to the assumptions on the variances of \(h_{i j}\), we also require uniform boundedness of higher moments. This leads to another basic model parameter, \(\underline{\mu }=(\mu _1, \mu _2, \dots )\), which is a sequence of non-negative real numbers.

  1. (D)

    For all N the entries \(h_{i j}\) of the random matrix \(\mathbf {H}\) have bounded moments,

    $$\begin{aligned} \begin{aligned} \mathbbm {E}| h_{i j} |^k \,\le \, \mu _k s_{i j}^{k/2}, \quad k\in \mathbb {N},\;i,j=1, \dots , N. \end{aligned} \end{aligned}$$
    (1.11)

In order to state our main result, in the next corollary we collect a few facts about the solution of the QVE that are proven in [1]. Although these properties are sufficient for the formulation of our results, for their proofs we will need much more precise information about the solution of the QVE. Theorems 4.1 and 4.2 summarize everything that is needed from [1] besides the existence and uniqueness of the solution of the QVE. In particular, the statement of Corollary 1.3 follows easily from Theorem 4.1 below.

Corollary 1.3

(Solution of QVE) Suppose \( \varvec{\mathrm {S}} \) satisfies (A)–(C). Let \(\mathbf {m} : \mathbb {H}\rightarrow \mathbb {H}^N\) be the solution of the QVE (1.7) corresponding to \(\mathbf {S}\). Then \(\mathbf {m}\) is analytic and has a continuous extension (denoted again by \(\mathbf {m}\)) to the closed upper half plane, \(\mathbf {m}: \overline{\mathbb {H}}\rightarrow \overline{\mathbb {H}}^N\), with \(\overline{\mathbb {H}}:=\mathbb {H}\cup \mathbb {R}\). The function \(\rho : \mathbb {R}\rightarrow [0,\infty )\), defined by

$$\begin{aligned} \begin{aligned} \rho (\tau )\,:=\, \lim _{\eta \downarrow 0}\frac{1}{\pi N}\sum _{i=1}^N \mathrm {Im}m_i(\tau +\mathrm {i}\eta ), \end{aligned} \end{aligned}$$
(1.12)

is a probability density. Its support is contained in \([-2,2]\) and is a union of closed disjoint intervals

$$\begin{aligned} \begin{aligned} {{\mathrm{supp}}}\rho \,=\, \bigcup _{k=1}^K[ \alpha _k,\beta _k],\quad \text { where }\quad \alpha _k<\beta _k< \alpha _{k+1}. \end{aligned} \end{aligned}$$
(1.13)

There exists a positive constant \(\delta _*\), depending only on the model parameters p, P and L, such that the sizes of these intervals are bounded from below by

$$\begin{aligned} \begin{aligned} \beta _k-\alpha _k\,\ge \, 2 \delta _*. \end{aligned} \end{aligned}$$
(1.14)

Note that (1.14) provides a lower bound on the length of the intervals that constitute \({{\mathrm{supp}}}\rho \), while the length of the gaps, \(\alpha _{k+1}-\beta _k\), between neighboring intervals can be arbitrarily small. Figure 1 shows a typical shape of the density of states. In particular, \(\rho \) may have gaps in its support and may have additional zeros (cusps) in the interior of \({{\mathrm{supp}}}\rho \). However, the behavior of \( \rho \) on the domain \( \rho \le \varepsilon \), for some sufficiently small \( \varepsilon > 0 \), can be completely characterized by universal shape functions. For more details see Theorem 2.6 of [1].

Definition 1.4

(Density of states) The function \(\rho \) defined in (1.12) is called the density of states. Its harmonic extension to the upper half plane

$$\begin{aligned} \begin{aligned} \rho ( \tau + \mathrm {i}\eta )\,&:=\, \int _\mathbb {R}\Pi _\eta (\tau -\sigma ) \rho (\sigma ) \mathrm {d}\sigma , \\ \Pi _\eta (\tau )\,&:=\, \frac{1}{\pi } \frac{\eta }{\tau ^2+\eta ^{ 2}}; \quad \tau \in \mathbb {R},\quad \eta > 0, \end{aligned} \end{aligned}$$
(1.15)

is again denoted by \(\rho \). With a slight abuse of notation we still write \({{\mathrm{supp}}}\rho \), as in (1.13), for the support of the density of states as a function on the real line.

The density of states will be shown to be the eigenvalue distribution of \(\mathbf {H}\) in the large N limit on the macroscopic scale. For any fixed values \(\tau _1,\tau _2 \in \mathbb {R}\) with \(\tau _1<\tau _2\) it satisfies

$$\begin{aligned} \begin{aligned} \lim _{N\rightarrow \infty } \frac{\big |{{\mathrm{Spec}}}( \mathbf {H}^{(N)})\cap [ \tau _1,\tau _2 ]\,\big |}{N\int _{\tau _1}^{\tau _2} \rho ^{(N)}(\tau )\,\mathrm {d}\tau }\,=\, 1, \end{aligned} \end{aligned}$$
(1.16)

provided the denominator does not vanish in the limit. The identity (1.16) motivates the terminology of density of states for the function \(\rho \). The harmonic extension of \(\rho \) to \(\mathbb {H}\) is a version of the density of states that is smoothed out on the scale \(\eta \). It satisfies the identity \(\rho (z)= (\pi N)^{-1}\sum _k \mathrm {Im}m_k(z)\) not just for \(z \in \mathbb {R}\) (cf. (1.12)) but for all \(z \in \overline{\mathbb {H}}\), and it will be used in the statement of our main result, Theorem 1.7.
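For the reader's convenience we record why this identity extends from the real line to \(\mathbb {H}\); it is an immediate consequence of the Stieltjes transform representation of \(m_k\) stated in Corollary 3.2 below. Indeed, for \(z=\tau +\mathrm {i}\eta \in \mathbb {H}\),

$$\begin{aligned} \mathrm {Im}\, m_k(\tau +\mathrm {i}\eta )\,=\, \int _\mathbb {R}\frac{\eta \, p_k(\sigma )\,\mathrm {d}\sigma }{(\tau -\sigma )^2+\eta ^2} \,=\, \pi \int _\mathbb {R}\Pi _\eta (\tau -\sigma )\, p_k(\sigma )\,\mathrm {d}\sigma , \end{aligned}$$

so averaging over k and using \(\rho =\langle \mathbf {p} \rangle \) reproduces the harmonic extension (1.15).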

Fig. 1 The density of states may have gaps, cusps and local minima. It is always a symmetric function around zero, i.e., \(\rho (\tau )=\rho (-\tau )\).

In fact, Theorem 1.7 implies a local version of (1.16), where the length of the interval \([\tau _1,\tau _2]\) may converge to zero as N tends to infinity. Our estimates, and thus the proven speed of convergence, depend on the distance of the interval to the edges of \({{\mathrm{supp}}}\rho \) and, in that case, even on the length of the closest gap. We introduce a function \(\Delta : \mathbb {R}\rightarrow [0,\infty )\), which encodes this relation.

Definition 1.5

(Local gap size) Let \(\alpha _k\) and \(\beta _k\) be the edges of the support of the density of states (cf. (1.13)) and \(\delta _*\) the constant introduced in Corollary 1.3. Then for any \(\delta \in [0,\delta _*)\) we set

$$\begin{aligned} \begin{aligned} \Delta _\delta (\tau )\,:=\, {\left\{ \begin{array}{ll} \alpha _{k+1}-\beta _k &{} \text {if } \beta _k-\delta \le \tau \le \alpha _{k+1}+\delta \text { for some } k=1,\dots ,K-1,\\ \;1 &{} \text {if } \tau \le \alpha _1+\delta \text { or } \tau \ge \beta _{K}-\delta ,\\ \;0 &{} \text { otherwise.} \end{array}\right. } \end{aligned} \end{aligned}$$
(1.17)

Finally, we fix an arbitrarily small tolerance exponent \(\gamma \in (0,1)\). This number will stay fixed throughout this paper in the same fashion as the model parameters P, p, L and \(\underline{\mu }\). Our main result is stated for spectral parameters \(z=\tau +\mathrm {i}\eta \) whose imaginary parts satisfy

$$\begin{aligned} \begin{aligned} \eta \,\ge \, N^{\gamma -1}. \end{aligned} \end{aligned}$$
(1.18)

For a compact statement of the main theorem we define the notion of stochastic domination, introduced in [12, 14]. This notion is designed to compare sequences of random variables in the large N limit up to small powers of N on high probability sets.

Definition 1.6

(Stochastic domination) Suppose \(N_0: (0,\infty )^2\rightarrow \mathbb {N}\) is a given function, depending only on the model parameters p, P, L and \(\underline{\mu }\), as well as on the tolerance exponent \(\gamma \). For two sequences, \(\varphi =(\varphi ^{(N)})_N\) and \(\psi =(\psi ^{(N)})_N\), of non-negative random variables we say that \(\varphi \) is stochastically dominated by \(\psi \) if for all \(\varepsilon >0\) and \(D>0\),

$$\begin{aligned} \begin{aligned} \mathbbm {P}\Bigl ( \varphi ^{(N)} > N^\varepsilon \psi ^{(N)}\Bigr ) \;\le \; N^{-D},\quad N \ge N_0(\varepsilon ,D). \end{aligned} \end{aligned}$$
(1.19)

In this case we write \(\varphi \prec \psi \).

Basic properties of the stochastic domination that are used extensively in this paper are listed in Lemma A.1. The threshold \( N_0(\varepsilon ,D) = N_0(\varepsilon ,D;P,p,L,\underline{\mu },\gamma ) \) will always be an explicit function whose value will be increased at various points of the paper, though we will not track its precise form. This happens only finitely many times, ensuring that \(N_0\) stays finite. The threshold is uniform in all other parameters, e.g. in the spectral parameter z, as well as in the indices \( i,j, \dots \) of the matrix entries, that the sequences \(\varphi \) and \(\psi \) may depend on. Typically we will not mention the existence of \(N_0\); it is implicit in the notation \(\varphi \prec \psi \). As an example, we see that the bounded moment condition, (D), implies

$$\begin{aligned} | h_{ij} | \prec N^{-1/2} . \end{aligned}$$

Actually, the function \(N_0\) depends only on finitely many moment parameters \((\mu _1,\dots , \mu _{M})\) instead of the entire sequence \(\underline{\mu }\), where the number of required moments, \(M =M(\varepsilon ,D;P,p,L,\gamma )\), is an explicit function.
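To illustrate both the displayed bound and the origin of the number of required moments M, here is the standard verification (a routine computation, not spelled out in the text): by Markov's inequality together with (1.11) and (1.8),

$$\begin{aligned} \mathbbm {P}\bigl ( | h_{ij} | > N^{\varepsilon } N^{-1/2}\bigr ) \,\le \, \frac{\mathbbm {E}| h_{ij} |^{k}}{N^{k(\varepsilon -1/2)}} \,\le \, \mu _k\, s_{ij}^{k/2}\, N^{k(1/2-\varepsilon )} \,\le \, \mu _k\, N^{-k\varepsilon } \,\le \, N^{-D}, \end{aligned}$$

provided \(k > D/\varepsilon \) and N is large enough (depending on \(\mu _k\)); in particular only finitely many moments enter.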

Now we are ready to state our main result on the local law. Suppose \(\mathbf {H}=\mathbf {H}^{(N)}\) is a sequence of self-adjoint random matrices with the corresponding sequence \(\mathbf {S}=\mathbf {S}^{(N)}\) of variance matrices and \(\rho =\rho ^{(N)}\) the induced sequence of densities of states. Recall that \(\delta _*\) is the positive constant, depending only on p, P and L, introduced in Corollary 1.3, and that \(\Delta _\delta \) is defined as in Definition 1.5.

Theorem 1.7

(Local law) Suppose that assumptions (A)–(D) are satisfied and fix an arbitrary \(\gamma \in (0,1)\). There is a deterministic function \(\kappa =\kappa ^{(N)}: \mathbb {H}\rightarrow (0,\infty ]\) such that uniformly for all \(z=\tau +\mathrm {i}\eta \in \mathbb {H}\) with \(\eta \ge N^{\gamma -1}\) the resolvents (1.1) of the random matrices \(\mathbf {H}=\mathbf {H}^{(N)}\) satisfy

$$\begin{aligned} \begin{aligned} \max _{i,j}| G_{ij}(z)-m_i(z) \delta _{ij} | \,\prec \, \sqrt{\frac{\rho (z)}{N \eta }} +\frac{1}{N \eta } + \min \left\{ \, {\frac{1}{\sqrt{N \eta }}, \frac{\kappa (z)}{N \eta }} \,\right\} . \end{aligned} \end{aligned}$$
(1.20)

Furthermore, for any sequence of deterministic vectors \(\mathbf {w} = \mathbf {w}^{(N)} \in {\mathbb {C}}^N\) with \( \max _{i}|w_i|\le 1 \) the averaged resolvent diagonal has an improved convergence rate,

$$\begin{aligned} \begin{aligned} \left|\frac{1}{N}\sum _{i=1}^N \overline{w}_i\,\bigl (G_{ii}(z)-m_i(z)\bigr ) \right| \,\prec \, \min \left\{ \, {\frac{1}{\sqrt{N \eta }}, \frac{\kappa (z)}{N\eta }} \,\right\} . \end{aligned} \end{aligned}$$
(1.21)

In particular, for \(w_i=1\) this implies

$$\begin{aligned} \begin{aligned} \left|\frac{1}{N} \mathrm {Im}\mathrm {Tr}\,\mathbf {G}(z)-\pi \rho (z) \right| \,\prec \, \min \left\{ \, {\frac{1}{\sqrt{N \eta }}, \frac{\kappa (z)}{N\eta }} \,\right\} . \end{aligned} \end{aligned}$$
(1.22)

The function \( \kappa \) may be chosen to be

$$\begin{aligned} \begin{aligned} \kappa (z)\,=\, \frac{1}{\Delta (\tau )^{1/3}+\rho (z)}, \end{aligned} \end{aligned}$$
(1.23)

where \(\Delta =\Delta _\delta \), with some \(\delta \in (0,\delta _*)\) that depends only on the model parameters p, P and L.

In the regime, where z is not too close to the support of the density of states in the sense that

$$\begin{aligned} \begin{aligned} ( \Delta (\tau )^{1/3}+ \rho (z) )\,\mathrm {dist}( z, {{\mathrm{supp}}}\rho ) \;\ge \, \frac{N^\gamma }{(N \eta )^2}, \end{aligned} \end{aligned}$$
(1.24)

\(\kappa \) may be improved to

$$\begin{aligned} \kappa (z)= & {} \frac{\eta }{\mathrm {dist}(z, {{\mathrm{supp}}}\rho )\,( \Delta (\tau )^{1/3}+\rho (z))} \nonumber \\&\quad + \,\frac{1}{N\eta \mathrm {dist}(z, {{\mathrm{supp}}}\rho )^{1/2} ( \Delta (\tau )^{1/3}+ \rho (z))^{1/2}}. \end{aligned}$$
(1.25)

The size of \( \rho (z) \) is described in (4.5) below. Theorem 1.7 can be localized to a spectral interval \( I \subset \mathbb {R}\), i.e., the statements hold for \( \mathrm {Re}\,z \in I \) provided (1.10) applies for \( z \in I + \mathrm {i}(0,\infty ) \). In particular, in the bulk of the spectrum Theorem 1.7 simplifies considerably.

Corollary 1.8

(Local law in the bulk) Assume (A), (B) and (D) with \( L = 1 \). Suppose there is a constant \( \rho _*> 0 \) and an interval \( I \subset {{\mathrm{supp}}}\,\rho \) such that \( \rho (\tau ) \ge \rho _*\) for all \( \tau \in I\). Then uniformly for all \( z = \tau + \mathrm {i}\eta \), with \( \tau \in I \) and \( \eta \ge N^{\gamma -1} \), and non-random \( \varvec{\mathrm {w}} \in {\mathbb {C}}^N \) satisfying \( \max _i |w_i |\le 1 \), the local laws hold

$$\begin{aligned} \begin{aligned} \max _{i, j=1}^N\,| G_{ij}(z)-m_i(z) \delta _{ij} | \,&\prec \, \frac{1}{\sqrt{N \eta }}, \quad \text {and } \\ \left|\frac{1}{N}\sum _{i=1}^N \overline{w}_i\,\bigl (G_{ii}(z)-m_i(z)\bigr ) \right| \,&\prec \, \frac{1}{N \eta }, \end{aligned} \end{aligned}$$
(1.26)

where \( \rho _*\) is considered as an additional model parameter.

Here the additional assumption \( L = 1 \) is only used to guarantee (cf. (i) of Theorem 6.1 in [1]) that the solution \( \varvec{\mathrm {m}}(z) \) of the QVE stays bounded around \( z = 0 \). Indeed, if z is bounded away from zero then (i) of Lemma 5.4 in [1] implies \( ||\varvec{\mathrm {m}} ||_\infty := \max _i| m_i |\) is bounded by a constant independent of N in the bulk of the spectrum. Therefore, if \( \mathrm {dist}(I,0) \ge \delta \) for some \( \delta > 0 \), or \( \sup \{ {\,|| \varvec{\mathrm {m}}(z) ||_\infty :\mathrm {Re}\,z \in I } \} \le P \) is known for some \( P < \infty \), then the assumption \( L=1 \) can be removed, and (1.26) holds with \( \delta \) or P , respectively, considered as model parameters.

Theorem 1.7 generalizes the previous local laws for stochastic variance matrices \(\mathbf {S}\) (see [14] and references therein). It is valid for densities \(\rho \) with an edge behavior different from the square root growth that is known from Wigner’s semicircular law. In particular, singularities that interpolate between a square root and a cubic root are possible. In the bulk of the support of the density of states, i.e., where \(\rho \) is bounded away from zero, the function \(\kappa \) is bounded. The same is true near the edges, unless the nearby gap is small. The bound deteriorates near small gaps in the support of \(\rho \).

In applications, the sequence \(\mathbf {S}=\mathbf {S}^{(N)}\) satisfying (A)–(C) may be constructed by discretizing a piecewise 1/2-Hölder continuous limit function (cf. Remark 6.2 in [1]). As a particularly simple example, suppose f is a smooth, non-negative, symmetric function on \([0,1]^2\), i.e. \(f(x,y)=f(y,x)\), with a positive diagonal, \(f(x,x)>0\). Then the sequence of variance matrices,

$$\begin{aligned} s_{i j}^{(N)}\,:=\, \frac{1}{N} f\left( \frac{i}{N},\frac{j}{N}\right) ,\quad i,j = 1,\dots ,N, \end{aligned}$$

satisfies conditions (A)–(C). The validity of (C) can be verified by using the general criteria (cf. Theorem 2.10 and Theorem 6.1 of [1]) for uniform boundedness. In this case the solution, \(\mathbf {m}=(m_1,\dots ,m_N)\), of the QVE converges to a limit in the sense that

$$\begin{aligned} \sup _{z\in \mathbb {H}} \max _{i=1}^N \big |m_i(z)-m( i/N;z) \big |\,\rightarrow \, 0, \end{aligned}$$

where \(m:[0,1]\times \overline{\mathbb {H}} \rightarrow \overline{\mathbb {H}}\) is the solution of the continuous QVE,

$$\begin{aligned} -\frac{1}{m(x;z)}\,=\, z + \int _0^1 f(x,y) m(y;z) \mathrm {d}y,\quad x \in [0,1],\; z \in \overline{\mathbb {H}} . \end{aligned}$$

Continuous QVEs such as this one fall into the class of general QVEs thoroughly analyzed in the companion paper [1]. In particular, the stability analysis applies and the density of states converges to a limit

$$\begin{aligned} \rho ^{(N)}(\tau )\,\rightarrow \, \frac{1}{\pi }\int _0^1\mathrm {Im}m(x;\tau ) \mathrm {d}x. \end{aligned}$$
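As a concrete illustration of this construction (our own sketch, for orientation only), the following code discretizes a sample profile f of our choosing, computes the density of states from the QVE via the fixed-point iteration sketched after (1.7), and compares it with the eigenvalue histogram of one sampled real symmetric Wigner-type matrix.

```python
import numpy as np

def solve_qve(S, z, n_iter=2000):
    """Fixed-point iteration for the QVE, as sketched after (1.7)."""
    m = np.full(S.shape[0], -1.0 / z, dtype=complex)
    for _ in range(n_iter):
        m = -1.0 / (z + S @ m)
    return m

# Smooth, symmetric profile with positive diagonal and 0 < f <= 1,
# so that (A) and (B) (with L = 1) hold for s_ij = f(i/N, j/N)/N.
f = lambda x, y: 0.6 + 0.4 * np.cos(np.pi * (x - y))

N = 600
grid = np.arange(1, N + 1) / N
S = f(grid[:, None], grid[None, :]) / N

# Density of states, smoothed on the scale eta (cf. (1.15)).
eta = 0.05
taus = np.linspace(-3.0, 3.0, 121)
rho = np.array([solve_qve(S, t + 1j * eta).imag.sum() / (np.pi * N) for t in taus])

# One sample of a real symmetric Wigner-type matrix with E h_ij^2 = s_ij.
rng = np.random.default_rng(0)
W = rng.normal(size=(N, N)) * np.sqrt(S)
H = np.triu(W) + np.triu(W, 1).T
evals = np.linalg.eigvalsh(H)

# The eigenvalue histogram should roughly match rho for large N, cf. (1.16).
hist, edges = np.histogram(evals, bins=40, range=(-3.0, 3.0), density=True)
print(np.max(np.abs(hist - np.interp((edges[:-1] + edges[1:]) / 2, taus, rho))))
```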

We introduce a notion for expressing that events hold with high probability in the limit as N tends to infinity.

Definition 1.9

(Overwhelming probability) Suppose \(N_0: (0,\infty )\rightarrow \mathbb {N}\) is a given function, depending only on the model parameters p, P, L and \(\underline{\mu }\), as well as on the tolerance exponent \(\gamma \). For a sequence \(A=(A^{(N)})_N\) of random events we say that A holds asymptotically with overwhelming probability (a.w.o.p.) if for all \(D>0\):

$$\begin{aligned} \begin{aligned} \mathbbm {P}(A^{(N)})\,\ge \, 1-N^{-D},\quad N \ge N_0(D). \end{aligned} \end{aligned}$$
(1.27)

There is a simple connection between the notions of stochastic domination and asymptotically overwhelming probability. For two sequences \(A = A^{(N)}\) and \(B = B^{(N)}\) the statement ‘A implies B a.w.o.p.’ is equivalent to \(\mathbbm {1}_A \prec \mathbbm {1}_B\), where the threshold \(N_0\), implicit in the stochastic domination, does not depend on \(\varepsilon \), i.e., \(N_0(\varepsilon ,D)=N_0(D)\).
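For the reader's convenience, this equivalence can be checked in one line (a routine verification): since the indicator functions only take the values 0 and 1 and \(N^\varepsilon \ge 1\),

$$\begin{aligned} \bigl \{ \mathbbm {1}_{A^{(N)}} > N^{\varepsilon }\, \mathbbm {1}_{B^{(N)}} \bigr \} \,=\, A^{(N)}\setminus B^{(N)}, \end{aligned}$$

so (1.19), applied to the pair \((\mathbbm {1}_A,\mathbbm {1}_B)\) with an \(\varepsilon \)-independent threshold, states exactly that \(\mathbbm {P}(A^{(N)}\setminus B^{(N)}) \le N^{-D}\) for \(N\ge N_0(D)\).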

We denote by \(\lambda _1\le \dots \le \lambda _N\) the eigenvalues of the random matrix \(\mathbf {H}\). The following corollary shows that the eigenvalue distribution converges to the density of states as N tends to infinity.

Corollary 1.10

(Convergence of cumulative eigenvalue distribution) Assume (A)–(D). Then uniformly for all \(\tau \in \mathbb {R}\) the cumulative empirical eigenvalue distribution approaches the integrated density of states,

$$\begin{aligned} \begin{aligned} \left| \#\{ i: \lambda _i \le \tau \}- N\int _{-\infty }^{\tau }\rho (\omega ) \mathrm {d}\omega \, \right| \,\prec \, \min \left\{ \, { \frac{1}{\Delta (\tau )^{1/3}+\rho (\tau )} ,\, N^{1/5} } \,\right\} . \end{aligned} \end{aligned}$$
(1.28)

Furthermore, for an arbitrary tolerance exponent \(\gamma \in (0,1)\) there are no eigenvalues away from the support of the density of states,

$$\begin{aligned} \begin{aligned} \max _{k=0}^K\,\# \bigl \{ {\,i: \beta _k+\delta _k< \lambda _i < \alpha _{k+1}-\delta _k} \bigl \} \;=\,0 \quad \text {a.w.o.p.}, \end{aligned} \end{aligned}$$
(1.29)

where we interpret \(\beta _0:=-\infty \), \(\alpha _{K+1}:=+\infty \) and \(\delta _k\) is defined as \(\delta _0:=\delta _K:=N^{\gamma -2/3}\), as well as

$$\begin{aligned} \begin{aligned} \delta _k\,:=\,\frac{N^{\gamma }}{(\alpha _{k+1}-\beta _k)^{1/3}N^{2/3}}\;, \quad k=1,\dots ,K-1. \end{aligned} \end{aligned}$$
(1.30)

Based on (1.16) we define the index, \( i(\tau )\), of an eigenvalue that we expect to be located close to the spectral parameter \( \tau \) by

$$\begin{aligned} \begin{aligned} i(\tau )\,:=\, \bigg \lceil N \int _{-\infty }^\tau \rho (\omega ) \mathrm {d}\omega \bigg \rceil . \end{aligned} \end{aligned}$$
(1.31)

Here, \(\lceil \omega \rceil \) denotes the smallest integer that is greater than or equal to \(\omega \) for any \(\omega \in \mathbb {R}\).

Corollary 1.11

(Rigidity of eigenvalues) Assume (A)–(D), and let \(\gamma \in (0,1)\) be an arbitrary tolerance exponent. Denote

$$\begin{aligned} \begin{aligned} \varepsilon _k \,:=\, N^{\gamma }\min \left\{ \, { \frac{1}{N^{3/5}}\;, \frac{1}{(\alpha _{k+1}-\beta _k)^{1/9}N^{2/3} }\,} \,\right\} ,\quad k = 1,\dots ,K-1, \end{aligned} \end{aligned}$$
(1.32)

and \(\varepsilon _0:=\varepsilon _K:=N^{\gamma -2/3} \). Then uniformly for every

$$\begin{aligned} \begin{aligned} \tau \,\in \, \bigcup _{k=1}^K\bigl [ \alpha _k + \varepsilon _{k-1} , \beta _k-\varepsilon _k\bigr ], \end{aligned} \end{aligned}$$
(1.33)

the eigenvalues satisfy the rigidity

$$\begin{aligned} \begin{aligned} | \lambda _{i(\tau )}- \tau | \,\prec \, \min \left\{ \, { \frac{1}{( \Delta (\tau )^{1/3}+\rho (\tau ) ) \rho (\tau ) N },\frac{1}{ N^{ 3/5} }\,} \,\right\} . \end{aligned} \end{aligned}$$
(1.34)

Furthermore, if \( \tau \) is close to the extreme edge, \( \tau \in ( \alpha _1,\alpha _1 +\varepsilon _0) \) or \( \tau \in (\beta _K-\varepsilon _K,\beta _K ] \), then

$$\begin{aligned} \begin{aligned} | \lambda _{i(\tau )}- \tau | \,\prec \, N^{-2/3} . \end{aligned} \end{aligned}$$
(1.35)

Finally, if \(\tau \in (\beta _k-\varepsilon _k,\alpha _{k+1}+\varepsilon _k)\) for some \( 1 \le k \le K-1\), then the corresponding eigenvalue is close to an internal edge in the sense that

$$\begin{aligned} \begin{aligned} \lambda _{i(\tau )}\,\in \, \bigl [ \beta _k-2 \varepsilon _k , \beta _k+\delta _k\bigr ] \cup \bigl [ \alpha _{k+1}-\delta _k , \alpha _{k+1}+2 \varepsilon _k\bigr ] \quad \text {a.w.o.p.}, \end{aligned} \end{aligned}$$
(1.36)

where \( \delta _k \) is defined in (1.30).

Remark 1.12

(Eigenvalues outside \( {{\mathrm{supp}}}\rho \)) The statements (1.35) and (1.36) are an immediate consequence of (1.34) and (1.29). They simply express the fact that the \(\mathcal {O}(N^\varepsilon )\) eigenvalues very close to the edges are found in the space that is left for them by the other eigenvalues, for which the rigidity statement (1.34) applies. For an illustration see Fig. 2. We also note that results of this type date back to at least [7] (in the sample covariance context).

Theorem 1.13

(Anisotropic law) Assume (A)–(D) and fix arbitrary \( \gamma > 0 \). Then uniformly for all \(z= \tau +\mathrm {i}\eta \in \mathbb {H}\) with \(\eta \ge N^{\gamma -1}\), and for any two deterministic \( \ell ^2\)-unit vectors \( \mathbf {w},\mathbf {v} \) we have

$$\begin{aligned} \begin{aligned} \left| \sum _{i,j=1}^N \overline{w}_i G_{i j}(z)v_j- \sum _{i=1}^N m_i(z) \overline{w}_i v_i\, \right| \;\prec \; \sqrt{\frac{\rho (z)}{N \eta }}+\frac{1}{N \eta } + \min \left\{ \, {\frac{1}{\sqrt{ N\eta }} , \frac{\kappa (z)}{N\eta }} \,\right\} , \end{aligned} \end{aligned}$$
(1.37)

where \(\kappa \) is the function from Theorem 1.7.

Corollary 1.14

(Delocalization of eigenvectors) Assume (A)–(D) and fix arbitrary \( \gamma > 0 \). Let \( \varvec{\mathrm {u}}^{(i)} \in {\mathbb {C}}^N \) be the \( \ell ^2\)-normalized eigenvector of \(\varvec{\mathrm {H}}\) corresponding to the eigenvalue \(\lambda _i\). All eigenvectors are delocalized in the sense that for any deterministic unit vector \(\varvec{\mathrm {b}}\in {\mathbb {C}}^N\) we have

$$\begin{aligned} \begin{aligned} \big | \varvec{\mathrm {b}}\cdot \varvec{\mathrm {u}}^{(i)} \big | \,\prec \, \frac{1}{\sqrt{N }} . \end{aligned} \end{aligned}$$
(1.38)

In particular, the eigenvectors are completely delocalized, i.e., \(|| \mathbf {u}^{(i)} ||_\infty = \max _j | u^{(i)}_j |\prec N^{-1/2}\).

Definition 1.15

(q-full random matrix) We say that \(\mathbf {H}\) is q-full for some \(q>0\) (independent of N) if either of the following holds:

  • \(\mathbf {H}\) is real symmetric and \(\mathbbm {E}h_{ij}^2 \ge q/N\) for all \(i,j=1,\dots ,N\);

  • \(\mathbf {H}\) is complex hermitian and for all \(i,j=1,\dots ,N\) the real symmetric \(2 \times 2\)-matrix,

    $$\begin{aligned} \varvec{\mathrm {\sigma }}_{i j}\,:=\, \left( \begin{array}{cc} \mathbbm {E}(\mathrm {Re}h_{i j})^2&{}\mathbbm {E}(\mathrm {Re}h_{i j})(\mathrm {Im}h_{i j}) \\ \mathbbm {E}(\mathrm {Re}h_{i j})(\mathrm {Im}h_{i j}) &{}\mathbbm {E}(\mathrm {Im}h_{i j})^2 \end{array} \right) , \end{aligned}$$

    is strictly positive definite and satisfies \( \varvec{\mathrm {\sigma }}_{i j}\ge q/N\) in the sense of quadratic forms.

If \( \varvec{\mathrm {H}} \) is real symmetric, then the q-fullness of \( \varvec{\mathrm {H}} \) is equivalent to the property (B) with \( L = 1 \) and \( q = p \). On the other hand, in the complex hermitian case the q-fullness condition is stronger than a lower bound on \( \mathbbm {E}\,|h_{ij} |^2 = s_{ij}\), and it cannot be captured by the matrix \( \varvec{\mathrm {S}} \) alone.

Theorem 1.16

(Universality) Suppose (A) and (D) hold, and \( \varvec{\mathrm {H}}\) is q -full. Then for all \(\varepsilon >0\), \(n \in \mathbb {N}\) and all smooth compactly supported observables \( F:\mathbb {R}^n \rightarrow \mathbb {R}\), there are two positive constants C and c , depending on \( \varepsilon ,q \) and F in addition to the model parameters, such that for any \(\tau \in \mathbb {R}\) with \(\rho (\tau )\ge \varepsilon \) the local eigenvalue distribution is universal,

$$\begin{aligned}&\left| \,\mathbbm {E}F\left( \left( N \rho ( \lambda _{i(\tau )} ) ( \lambda _{i(\tau )}-\lambda _{i(\tau )+j}) \right) _{j=1}^n \right) \right. \nonumber \\&\quad \left. -\, \mathbbm {E}_\mathrm{G} F\left( \left( N \rho _\mathrm{sc}(0) ( \lambda _{\lceil {N/2}\rceil }-\lambda _{\lceil {N/2}\rceil +j}) \right) _{j=1}^n \right) \right| \,\le \, C N^{-c}. \end{aligned}$$

Here, \(\mathbbm {E}_\mathrm{G}\) denotes the expectation with respect to the standard Gaussian ensemble, i.e., with respect to GUE and GOE in the cases of complex Hermitian and real symmetric \(\mathbf {H}\), respectively, and \(\rho _\mathrm{sc}(0)=1/(2 \pi )\) is the value of Wigner’s semicircle law at the origin.

Fig. 2 Notations of Corollary 1.11: at the edges of a gap of length \( \Delta \) in \( {{\mathrm{supp}}}\rho \) the bound on the eigenvalue fluctuation is \( \delta _k \) inside the gap and \(\varepsilon _k\) inside the support.

This theorem concerns the universality in the bulk. With the help of our local law one may also prove a weaker version of the universality at the edges (including the internal edges). Since our local law, Theorem 1.7, is optimal at the edges, a direct application of the Green function comparison theorem from Section 6 of [22] (with straightforward adjustments) shows edge universality in the sense that the edge statistics may depend only on the second moments encoded in the matrices \( \varvec{\mathrm {\sigma }}_{ij} \). In particular, it is the same as the edge statistics of a Wigner-type matrix with centered Gaussian entries with coinciding second moments. This argument holds for the extreme edges as well as for the internal edges. However, it does not yet prove the Tracy–Widom law, i.e. that the edge statistics are independent even of the variances \(\varvec{\mathrm {S}}\).

Convention 1.17

(Constants and comparison relation) We use the convention that every positive constant with a lower star index, such as \(\delta _*\), \(c_*\) and \(\lambda _*\), explicitly depends only on the model parameters P, p and L from (B)–(D). These dependencies can be reconstructed from the proofs, but we will not follow them. Constants \(c, c_1, c_2, \dots , C, C_1, C_2, \dots \) also depend only on P, p and L. They will have a local meaning within a specific proof.

For two non-negative functions \(\varphi \) and \(\psi \) depending on a set of parameters \(u \in U\), we use the comparison relation

$$\begin{aligned} \begin{aligned} \varphi \;\gtrsim \; \psi , \end{aligned} \end{aligned}$$
(1.39)

if there exists a positive constant c, depending explicitly on P, p and L such that \(\varphi (u) \ge c \psi (u)\) for all \(u \in U\). The notation \(\psi \sim \varphi \) means that both \(\psi \lesssim \varphi \) and \(\psi \gtrsim \varphi \) hold true. In this case we say that \(\psi \) and \(\varphi \) are comparable. We also write \(\psi = \varphi +\mathcal {O}(\vartheta )\), if \(|\psi -\varphi |\lesssim \vartheta \).

We denote the normalized scalar product between two vectors \(\mathbf {u},\mathbf {w} \in {\mathbb {C}}^N\) and the average of a vector by

$$\begin{aligned} \begin{aligned} \langle \mathbf {u},\mathbf {w} \rangle \,:=\, \frac{1}{N}\sum _{i=1}^N \overline{u}_i w_i,\quad \text {and}\quad \langle \mathbf {w} \rangle \,:=\, \frac{1}{N}\sum _{i=1}^N w_i, \end{aligned} \end{aligned}$$
(1.40)

respectively. Note that with this convention \(|\langle \varvec{\mathrm {u}},\varvec{\mathrm {u}} \rangle | = N^{-1}\Vert \varvec{\mathrm {u}}\Vert ^2_{\ell ^2} \).

2 Bound on the random perturbation of the QVE

We will make the following standing assumption for the rest of this paper; it is always assumed to hold unless explicitly stated otherwise:

  • The assumptions (A)–(D) hold true and an arbitrary tolerance exponent \(\gamma \in (0,1)\) is fixed.

We introduce the notation \(\mathbf {G}^{(V)}\) for the resolvent of the matrix \(\mathbf {H}^{(V)}\), which is identical to \(\mathbf {H}\) except that the rows and columns corresponding to the indices in \(V \subseteq \{1,\dots ,N \}\) are removed. The enumeration of the indices is kept, even though \(\mathbf {G}^{(V)}\) has a lower dimension.

The diagonal elements of the resolvent, \(\mathbf {g}:=(G_{11},\dots ,G_{NN})\), satisfy the perturbed quadratic vector equation

$$\begin{aligned} \begin{aligned} -\,\frac{1}{g_i(z)} \,=\, z + \sum _{j=1}^N s_{ij} g_j(z) + d_i(z), \end{aligned} \end{aligned}$$
(2.1)

for all \(z \in \mathbb {H}\) and \(i=1,\dots ,N\). The random perturbation \(\mathbf {d}=(d_1,\dots ,d_N)\) is given by

$$\begin{aligned} d_k \;:= & {} \; \sum ^{(k)}_{i\ne j} h_{ki}G^{(k)}_{ij}h_{jk} \,+\, \sum ^{(k)}_i ( |h_{ki} |^2-s_{ki}) G^{(k)}_{ii} \nonumber \\&\quad -\, \sum ^{(k)}_i s_{ki} \frac{G_{ik}G_{ki}}{g_k} \,-\, h_{kk} \,-\, s_{kk} g_k . \end{aligned}$$
(2.2)

Here and in the following, the upper indices on the sums indicate which indices are not summed over. For the proof of this simple identity as well as (2.3) below via the Schur complement formula we refer to [14]. As in (2.2) we will often omit the dependence on the spectral parameter z in our notation, i.e., \(G_{i j}=G_{i j}(z)\), \(d_k=d_k(z)\), etc.
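For orientation we recall the Schur complement formula behind (2.1)–(2.3) (the detailed derivation is in [14]): for any k,

$$\begin{aligned} \frac{1}{G_{kk}} \,=\, h_{kk}-z-\sum ^{(k)}_{i,j} h_{ki}\, G^{(k)}_{ij} h_{jk} . \end{aligned}$$

Splitting the double sum into its diagonal part \(i=j\) and its off-diagonal part, centering \(|h_{ki} |^2\) around \(s_{ki}\), and replacing \(G^{(k)}_{ii}\) by \(G_{ii}=g_i\) with the help of the resolvent identity (2.9) below then yields (2.1) with the five terms in (2.2).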

We will now derive an upper bound on \(||\mathbf {d} ||_\infty = \max _i|d_i|\), provided \(|g_i-m_i|\) is bounded by a small constant. At the same time we will control the off-diagonal elements \(G_{k l}\) of the resolvent. These satisfy the identity

$$\begin{aligned} \begin{aligned} G_{kl} \,=\, G_{kk} G^{(k)}_{ll} \sum _{i,j}^{(k l)} h_{ki} G^{(kl)}_{ij}h_{jl} \,-\,G_{kk}G^{(k)}_{ll}h_{kl}, \end{aligned} \end{aligned}$$
(2.3)

for \(k\ne l\). The strategy in what follows below is that (2.2) and (2.3) are used to improve a rough bound on the entries of the resolvent \(\mathbf {G}\) to get the correct bounds on the random perturbation and the off-diagonal resolvent elements. Later, in Sect. 3, the stability of the QVE under the small perturbation, \(\mathbf {d}\), will provide the improved bound on the diagonal elements, \(G_{ii}-m_i=g_i-m_i\).

We introduce a short notation for the difference between \(\mathbf {g}\) and the solution \(\mathbf {m}\) of the unperturbed Eq. (1.7),

$$\begin{aligned} \begin{aligned} \Lambda _{\mathrm {d}}(z) \,&:=\, \max _i|G_{ii}(z)-m_i(z)|, \\ \Lambda _{\mathrm {o}}(z) \,&:=\, \max _{i\ne j } |G_{ij}(z) |, \\ \Lambda (z)\,&:=\, \max \big \{\Lambda _{\mathrm {d}}(z) ,\Lambda _{\mathrm {o}}(z) \big \} . \end{aligned} \end{aligned}$$
(2.4)

The following lemma is analogous to Lemma 5.2 in [14] with minor modifications. For completeness, we repeat these arguments. One small modification is that our estimates also cover the regime where |z| is large. To keep the formulas short we denote

$$\begin{aligned}{}[z]\,:=\, 1+|z| . \end{aligned}$$

The dependence of the upcoming error bounds on [z] is not always optimal, and this dependence is not kept in the statement of our main result, Theorem 1.7, either. In fact, the regime \([z]\sim 1\) is the most interesting, since our results show that the spectrum of \(\mathbf {H}\) lies a.w.o.p. inside a compact interval (cf. Corollary 1.10). On first reading we therefore recommend thinking of \([z]\) as 1 in most of our proofs. The [z]-dependence is used mainly to propagate a bound from the regime of very large imaginary part of the spectral parameter (\(\mathrm {Im}z \ge N^{ 5}\)) to the entire domain on which Theorem 1.7 holds.

Lemma 2.1

(Bound on perturbation) There is a small positive constant \(\lambda _* \sim 1\), such that uniformly for all spectral parameters \(z=\tau +\mathrm {i}\eta \in \mathbb {H}\) with \(\eta \ge N^{\gamma -1}\):

$$\begin{aligned} |d_k(z)| \,\mathbbm {1}\big ( \Lambda (z) \le \lambda _*/[z] \big ) \;&\prec \; [z]^{-2} \sqrt{\frac{\mathrm {Im}\langle \mathbf {g}(z)\rangle }{N \eta }}+\frac{1}{\sqrt{N}}, \end{aligned}$$
(2.5a)
$$\begin{aligned} \Lambda _{\mathrm {o}}(z) \,\mathbbm {1}\big ( \Lambda (z) \le \lambda _*/[z] \big ) \;&\prec \;[z]^{-2} \left( \sqrt{\frac{\mathrm {Im}\langle \mathbf {g}(z)\rangle }{N \eta }}+\frac{1}{\sqrt{N}} \right) . \end{aligned}$$
(2.5b)

For the proof of this lemma we will need an additional property of the solution of the QVE that is a corollary of Theorem 4.1, where all properties of \(\mathbf {m}\) taken from [1] are summarized.

Corollary 2.2

(Bounds on solution) The absolute value of the solution of the QVE satisfies

$$\begin{aligned} \begin{aligned} | m_i(z) | \,\sim \, [z]^{-1}, \quad z\in \mathbb {H},\; i=1,\dots ,N . \end{aligned} \end{aligned}$$
(2.6)

Proof of Lemma 2.1

Here we use the three large deviation estimates,

$$\begin{aligned} \left|\,\sum ^{(k)}_{i\ne j} h_{ki} G^{(k)}_{ij}h_{jk} \right| \;&\prec \; \left( \, \sum ^{(k)}_{i\ne j} s_{ki} s_{jk} \big |G^{(k)}_{ij}\big |^2 \right) ^{1/2} , \end{aligned}$$
(2.7a)
$$\begin{aligned} \left|\,\sum ^{(k l)}_{i,j} h_{ki} G^{(k l)}_{ij}h_{jl} \right| \;&\prec \; \left( \, \sum ^{(k l)}_{i,j} s_{ki} s_{jl} \big |G^{(k l)}_{ij}\big |^2 \right) ^{1/2} , \end{aligned}$$
(2.7b)
$$\begin{aligned} \left|\,\sum ^{(k)}_i \bigl ( |h_{ki} |^2-s_{ki}\bigr ) G^{(k)}_{ii} \right| \,&\prec \, \left( \, \sum ^{(k)}_i s_{ki}^2 \big |G^{(k)}_{ii}\big |^2 \right) ^{1/2} . \end{aligned}$$
(2.7c)

Since \(\mathbf {G}^{(V)}\) is independent of the rows and columns of \(\mathbf {H}\) with indices in V, these estimates follow directly from the large deviation bounds in Appendix C of [14]. Furthermore, we use

$$\begin{aligned} \begin{aligned} | h_{ij} |\,\prec \, N^{-1/2},\quad s_{i j}\,\le \, N^{-1}, \end{aligned} \end{aligned}$$
(2.8)

where the latter inequality is just assumption (1.8) and the bound on \(h_{i j}\) follows from (1.11). We remark that the stochastic domination in (2.7) and (2.8) is uniform in k, l and i, j, respectively, i.e., the threshold function \( N_0 \) in Definition 1.6 does not depend on i, j, k, l.

We will now show that the removal of a few rows and columns in \(\mathbf {H}\) will only have a small effect on the entries of the resolvent. The general resolvent identity,

$$\begin{aligned} \begin{aligned} G_{i j} \,=\, G_{i j}^{(k)} + \frac{G_{i k}G_{k j}}{G_{kk}},\quad k \not \in \{i,j\}, \end{aligned} \end{aligned}$$
(2.9)

leads to the bound

$$\begin{aligned} \begin{aligned} \big |G_{i j}^{(k)} - G_{i j}\big |\,\mathbbm {1}\big ( \Lambda \le \lambda _*/[z] \big ) \,=\, \frac{| G_{i k} G_{k j} |}{|g_k|} \,\mathbbm {1}\big ( \Lambda \le \lambda _*/[z] \big ) \,\lesssim \, [z]\Lambda _{\mathrm {o}}^2. \end{aligned} \end{aligned}$$
(2.10)

In the inequality we used that \(|m_k(z)|\sim [z]^{-1}\) (cf. Corollary 2.2), \(|g_k| = |m_k|+\mathcal {O}(\Lambda )\) and that \(\lambda _*\) is chosen to be small enough. We use (2.10) in a similar calculation for \(G_{i j}^{(l)}\) and find that on the event where \(\Lambda \le \lambda _*/[z]\),

$$\begin{aligned} \begin{aligned} \big |G_{i j}^{(k l)} - G_{i j}^{(l)}\big | \,=\, \frac{\big | G_{i k}^{(l)} G_{k j}^{(l)} \big |}{\big |G_{k k}^{(l)}\big |} \,\lesssim \, \frac{ (\,| G_{i k}| + \mathcal {O}( [z] \Lambda _{\mathrm {o}}^2 )\,) (\,|G_{k j} | + \mathcal {O}( [z] \Lambda _{\mathrm {o}}^2 )\,) }{| g_k | + \mathcal {O}( [z] \Lambda _{\mathrm {o}}^2 ) }. \end{aligned} \end{aligned}$$
(2.11)

Again using (2.10) and that the denominator of the last expression is comparable to \([z]^{-1}\), we conclude

$$\begin{aligned} \begin{aligned} |G_{ij}^{(kl)} - G_{i j}|\,\mathbbm {1}\big ( \Lambda \le \lambda _*/[z] \big ) \,\lesssim \, [z] \Lambda _{\mathrm {o}}^2, \end{aligned} \end{aligned}$$
(2.12)

provided \(\lambda _*\) is small. Therefore, we see that it is possible to remove one or two upper indices from \(G_{i j}\) at the price of an error term whose size is at most of order \([z]\Lambda _{\mathrm {o}}^2\).

We have now collected all necessary ingredients and use them to estimate all the terms in (2.2) one by one. We start with the first summand. By (2.7a) we find

$$\begin{aligned} \begin{aligned} \left|\,\sum ^{(k)}_{i\ne j} h_{ki}G^{(k)}_{ij}h_{jk} \right|^2 \,\prec \, \sum ^{(k)}_{i\ne j} s_{ki} s_{jk} \big |G^{(k)}_{ij}\big |^2 \,\le \, \frac{1}{N^2}\sum ^{(k)}_{i\ne j} \big |G^{(k)}_{ij}\big |^2 . \end{aligned} \end{aligned}$$
(2.13)

With the help of (2.10) we remove the upper index from \(G_{ij}^{(k)}\) and get

$$\begin{aligned} \begin{aligned} \left|\,\sum ^{(k)}_{i\ne j} h_{ki}G^{(k)}_{ij}h_{jk} \right|^2 \mathbbm {1}\big ( \Lambda \le \lambda _*/[z] \big ) \,\prec \, \bigl ( \Lambda _{\mathrm {o}}^2 \,+\,[z]^2\Lambda _{\mathrm {o}}^4 \bigr ) \mathbbm {1}\bigl ( \Lambda \le \lambda _*/[z] \bigr ) \,\lesssim \,\Lambda _{\mathrm {o}}^2 . \end{aligned} \end{aligned}$$
(2.14)

For the second summand in (2.2) we use the large deviation bound for the diagonal, (2.7c), and find that

$$\begin{aligned} \begin{aligned} \left|\,\sum ^{(k)}_i (|h_{ki} |^2-s_{ki}) G^{(k)}_{ii} \right|^2 \,\prec \; \sum ^{(k)}_i s_{ki}^2 \big |G^{(k)}_{ii}\big |^2 \,\le \, \frac{1}{N^2}\sum ^{(k)}_i \big |G^{(k)}_{ii}\big |^2. \end{aligned} \end{aligned}$$
(2.15)

By removing the upper index again we estimate

$$\begin{aligned} \begin{aligned} \big |G^{(k)}_{ii}\big |\,\mathbbm {1}\big ( \Lambda \le \lambda _*/[z] \big ) \,\lesssim \, |m_i|+\Lambda _{\mathrm {d}}+ [z]\Lambda _{\mathrm {o}}^2. \end{aligned} \end{aligned}$$
(2.16)

We use this in (2.15) and for sufficiently small \(\lambda _*\) we arrive at

$$\begin{aligned} \begin{aligned} \left|\,\sum ^{(k)}_i (|h_{ki} |^2-s_{ki}) G^{(k)}_{ii} \right|^2 \mathbbm {1}\big ( \Lambda \le \lambda _*/[z] \big ) \;\prec \; \frac{1}{[z]^2 N}. \end{aligned} \end{aligned}$$
(2.17)

The third summand in (2.2) is estimated directly by

$$\begin{aligned} \begin{aligned} \left|\,\sum ^{(k)}_i s_{ki} \frac{G_{ik}G_{ki}}{g_k} \right|\,\mathbbm {1}\big ( \Lambda \le \lambda _*/[z] \big ) \,\le \, \frac{\Lambda _{\mathrm {o}}^2}{|g_k|} \,\mathbbm {1}\big ( \Lambda \le \lambda _*/[z] \big ) \,\lesssim \, \Lambda _{\mathrm {o}} . \end{aligned} \end{aligned}$$
(2.18)

We combine the estimates for the individual terms (2.14), (2.17), (2.18) and (2.8). Altogether we conclude that

$$\begin{aligned} \begin{aligned} |d_k| \,\mathbbm {1}\big ( \Lambda \le \lambda _*/[z] \big ) \,\prec \, \Lambda _{\mathrm {o}}(z)+\frac{1}{\sqrt{N}} . \end{aligned} \end{aligned}$$
(2.19)

We will now derive in a similar fashion a stochastic domination bound for the off-diagonal error term \(\Lambda _{\mathrm {o}}\). Afterwards, we will combine the two bounds and infer the claim of the lemma. For the off-diagonal error term we proceed along the same lines as for \(|d_k|\), using (2.3) instead of (2.2). For \(k \ne l\) we find

$$\begin{aligned} \begin{aligned} |G_{kl}|^2 \,\prec \, |g_k|^2\big |G_{ll}^{(k)} \big |^2 \left( \frac{1}{N^2} \sum ^{(k l)}_{i, j} \big |G^{(k l)}_{ij}\big |^2+\frac{1}{N} \right) . \end{aligned} \end{aligned}$$
(2.20)

Here, we applied the large deviation bound (2.7b). Using the Ward identity for the resolvent \(\mathbf {G}^{(kl)}\),

$$\begin{aligned} \begin{aligned} \sum ^{(kl)}_{j} \big |G^{(kl)}_{ij}\big |^2 \,=\, \frac{\mathrm {Im}\,G_{i i}^{(kl)}}{\eta }, \end{aligned} \end{aligned}$$
(2.21)

and (2.10) for removing the upper index of \(G_{ll}^{(k)}\) we get

$$\begin{aligned} \begin{aligned} |G_{kl}|^2\,\mathbbm {1}\big ( \Lambda \le \lambda _*/[z] \big ) \,\prec \, [z]^{-4} \left( \frac{1}{N^2\eta }\sum _{i}^{(k l)} \mathrm {Im}G_{ii}^{(k l)}+\frac{1}{N} \right) . \end{aligned} \end{aligned}$$
(2.22)

We remove the upper indices from \(G_{ii}^{(k l)}\) and end up with

$$\begin{aligned} \begin{aligned} \Lambda _{\mathrm {o}} \,\mathbbm {1}\big ( \Lambda \le \lambda _*/[z] \big ) \,\prec \, [z]^{-2}\left( \sqrt{\frac{\mathrm {Im}\,\langle \mathbf {g} \rangle }{N \eta }}+ \sqrt{\frac{[z]}{N\eta }}\;\Lambda _{\mathrm {o}}+\frac{1}{\sqrt{N}} \right) . \end{aligned} \end{aligned}$$
(2.23)

The bound remains true without the summand containing \(\Lambda _{\mathrm {o}}\) on the right hand side, since this term can be absorbed into the left hand side, as its coefficient is bounded by \(N^{-\gamma /2}\), while on the left \(\Lambda _{\mathrm {o}}\) is not multiplied by a small coefficient. Putting (2.19) and (2.23) together yields the desired result (2.5). \(\square \)
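As a quick numerical sanity check of the Ward identity (2.21), which holds for the resolvent of any self-adjoint matrix, one may run the following sketch (illustration only, not part of the argument; the test matrix and all parameters are our own choices):

```python
import numpy as np

# Ward identity:  sum_j |G_ij|^2 = Im(G_ii) / eta  for  G = (H - z)^{-1}.
rng = np.random.default_rng(1)
N = 300
A = rng.normal(size=(N, N)) / np.sqrt(N)
H = (A + A.T) / np.sqrt(2)                  # a real symmetric test matrix
z = 0.3 + 0.01j
G = np.linalg.inv(H - z * np.eye(N))
lhs = np.sum(np.abs(G) ** 2, axis=1)        # row sums of |G_ij|^2
rhs = G.diagonal().imag / z.imag            # Im G_ii / eta
print(np.max(np.abs(lhs - rhs)))            # vanishes up to numerical error
```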

3 Local law away from local minima

In this section we will use the stability of the QVE to establish the main result away from the local minima of the density of states inside its support, i.e. away from the set

$$\begin{aligned} \begin{aligned} {\mathbb {M}}\,:=\, \bigl \{ {\tau \in {{\mathrm{supp}}}\rho : \;\tau \text { is the location of a local minimum of } \rho } \bigl \} . \end{aligned} \end{aligned}$$
(3.1)

The case where z is close to \({\mathbb {M}}\) requires a more detailed analysis, which is given in Sect. 4. At the end of this section we will also sketch the proof of Corollary 1.8. The main result of this section is the following.

Proposition 3.1

(Local law away from local minima) Let \(\delta _*\) be any positive constant, depending only on the model parameters p, P and L. Then, uniformly for all \( z=\tau \,+\,\mathrm {i}\eta \) with \(\eta \ge N^{\gamma -1}\) and \(\mathrm {dist}(z,{\mathbb {M}})\ge \delta _*\), we have

$$\begin{aligned}{}[z]^2\Lambda _{\mathrm {d}}(z)+||\mathbf {d}(z) ||_\infty \,&\prec \, [z]^{-2}\sqrt{\frac{\rho (z)}{N \eta }}+\frac{\;[z]^{-6} }{N\eta }\,+\frac{1}{\sqrt{N}}\;, \end{aligned}$$
(3.2a)
$$\begin{aligned} \Lambda _{\mathrm {o}}(z) \,&\prec \, [z]^{-2}\sqrt{\frac{\rho (z)}{N\eta }} + \frac{\;[z]^{-4} }{N\eta }\,+\frac{\;[z]^{-2} }{\sqrt{N}}. \end{aligned}$$
(3.2b)

Furthermore, on the same domain, for any sequence of deterministic vectors \(\mathbf {w}=\mathbf {w}^{(N)} \in {\mathbb {C}}^N\) with the uniform bound, \(||\mathbf {w} ||_\infty \le 1\), we have

$$\begin{aligned} \begin{aligned} |\langle \mathbf {w} ,\mathbf {g}(z) -\mathbf {m}(z) \rangle |\;\prec \; [z]^{-3} \frac{\rho (z) }{N\eta }+\frac{\,[z]^{-7} }{(N\eta )^2\,}+\frac{\;[z]^{-2} }{N} \;. \end{aligned} \end{aligned}$$
(3.3)

This proposition, combined with the properties of \( \rho \) given in Theorem 4.1 later, yields the local law (Theorem 1.7) away from the set \({\mathbb {M}}\). Indeed, using \( \rho (z) \gtrsim [z]^{-2}\eta \) (cf. relations (4.5) below) and \( \kappa (z) \ge 0 \) we see that (3.2) implies (1.20).

In order to see that the averaged local law (1.21) also follows from (3.3), we split the domain \( \{ {z\in \mathbb {H}:\mathrm {dist}(z,{\mathbb {M}}) \ge \delta _*} \} \) into three subdomains that are considered separately. To this end, let \( B_0 \) and \( B_1 \) be the upper bounds on \(\kappa \) from (1.23) and (1.25), respectively.

First we consider the regime \( \eta \ge \delta _*/2 \). Using \( \Delta ^{1/3}+\rho \lesssim 1 \) we see that \( B_0 \gtrsim 1 \). Similarly, we get \( B_1 \gtrsim \eta [z]^{-1} \). Since \( (N\eta )^{-1}B_k \), \( k=0,1\), are both bigger than the right hand side of (3.3), we obtain (1.21) for \( \eta \ge \delta _*/2 \).

Now we consider the regime \( \eta \le \delta _*/2 \), which is split into two cases depending on whether \( \mathrm {dist}(\mathrm {Re}\,z,{{\mathrm{supp}}}\rho ) = 0 \), or not. In the former case \([z] \lesssim 1 \) and \( \mathrm {dist}(z,{{\mathrm{supp}}}\rho ) = \eta \), and (4.5a) implies \( \rho (\mathrm {Re}\,z) \sim 1 \). Feeding these estimates into (1.23) and (1.25) yields \( B_0 \sim 1 \) and \( B_1 \gtrsim 1 \). These imply (1.21).

Finally, suppose \( \mathrm {dist}(\mathrm {Re}\,z,{{\mathrm{supp}}}\rho ) \ge \delta _*/2 \) and \(\eta \le \delta _*/2 \). In this regime \(\Delta \sim 1 \) (cf. (1.17)), while (4.5f) implies \(\rho \sim \eta \,[z]^{-2}\). Hence, \( B_0 \sim 1 \) and \( B_1 \gtrsim \eta \,[z]^{-1} \), and \( (N\eta )^{-1}\min \{ {B_0,B_1} \} \ge [z]^{-1}N^{-1} \). By comparing with the right hand side of (3.3) we conclude that (1.21) applies for all \( \mathrm {dist}(z,{\mathbb {M}}) \ge \delta _*\).

The proof of Proposition 3.1 uses a continuity argument in z. In particular, continuity of the solution of the QVE is needed. The statement of the following corollary is part of the properties of \(\mathbf {m}\) listed in Theorem 4.1 below.

Corollary 3.2

(Stieltjes-transform representation) For every \(i=1, \dots , N\) there is a probability density \(p_i:\mathbb {R}\rightarrow [0,\infty )\) with support in \([-2,2]\) such that \(m_i\) is the Stieltjes-transform of this density, i.e.,

$$\begin{aligned} \begin{aligned} m_i(z)\,=\, \int _\mathbb {R}\frac{p_i(\tau ) \mathrm {d}\tau }{\tau -z},\quad z \in \mathbb {H}. \end{aligned} \end{aligned}$$
(3.4)

The solution of the QVE is uniformly Hölder-continuous,

$$\begin{aligned} \begin{aligned} ||\mathbf {m}(z_1)-\mathbf {m}(z_2) ||_\infty \,\lesssim \,|z_1- z_2|^{1/3},\quad z_1,z_2 \in \overline{\mathbb {H}}. \end{aligned} \end{aligned}$$
(3.5)

Since each \(m_i\) is the Stieltjes transform of \(p_i\) and extends continuously to the real line, its imaginary part on \(\mathbb {H}\) is the harmonic extension to the complex upper half plane of its own restriction to the real line. Therefore, \(\mathrm {Im}m_i(\tau )=\pi p_i(\tau )\) for \(\tau \in \mathbb {R}\). The density of states is the average of the probability densities \(p_i\), i.e., \(\rho =\langle \mathbf {p} \rangle \).

Since we will estimate the difference, \(\mathbf {g}-\mathbf {m}\), we start by deriving an equation for this quantity. Using the QVE for \(\mathbf {m}\) and the perturbed Eq. (2.1) for \(\mathbf {g}\) we find

$$\begin{aligned} g_i-m_i \,= & {} \, -\,\frac{1}{z+(\mathbf {S}\mathbf {g})_i+d_i} +\frac{1}{z+(\mathbf {S}\mathbf {m})_i} \nonumber \\ \,= & {} \, \frac{(\mathbf {S}(\mathbf {g}-\mathbf {m}))_i+d_i}{(z+(\mathbf {S}\mathbf {g})_i+d_i)(z+(\mathbf {S}\mathbf {m})_i)} \nonumber \\ \,= & {} \, m_i^2(\mathbf {S}(\mathbf {g}-\mathbf {m}))_i+m_i (g_i-m_i)(\mathbf {S}(\mathbf {g}-\mathbf {m}))_i+m_i\, g_i\, d_i. \end{aligned}$$
(3.6)
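In the last step of (3.6) we used the two defining relations (spelled out here for convenience)

$$\begin{aligned} \frac{1}{z+(\mathbf {S}\mathbf {m})_i} \,=\, -\,m_i, \qquad \frac{1}{z+(\mathbf {S}\mathbf {g})_i+d_i} \,=\, -\,g_i , \end{aligned}$$

so the product of the two denominators equals \((g_i m_i)^{-1}\); writing \(g_i=m_i+(g_i-m_i)\) in the resulting expression \(m_i g_i\bigl ( (\mathbf {S}(\mathbf {g}-\mathbf {m}))_i+d_i\bigr )\) gives the last line.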

Rearranging the terms leads to

$$\begin{aligned} \begin{aligned} \big ( (\mathbf{1 }-\mathrm {diag}(\mathbf {m})^2 \mathbf {S})(\mathbf {g}-\mathbf {m}) \big )_i \,\,&=\,\, m_i (g_i-m_i) ( \mathbf {S}(\mathbf {g}-\mathbf {m}))_i \\&\quad +m_i^2\, d_i+m_i\,( g_i - m_i)\, d_i. \end{aligned} \end{aligned}$$
(3.7)
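
The rearrangement consists of moving the term \(m_i^2(\mathbf {S}(\mathbf {g}-\mathbf {m}))_i\) to the left hand side and splitting \(g_i = m_i + (g_i-m_i)\) in the last term of (3.6), i.e.,

$$\begin{aligned} m_i\, g_i\, d_i \,=\, m_i^2\, d_i + m_i\,(g_i-m_i)\, d_i . \end{aligned}$$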

In the proof of Proposition 3.1 we will view (3.7) as a quadratic equation for \(\mathbf {g}-\mathbf {m}\) and we use its stability to bound \(\Lambda _{\mathrm {d}}\) in terms of \(||\mathbf {d} ||_\infty \). We will now demonstrate this effect in the case when z is far away from the support of the density of states.

Lemma 3.3

(Stability far away from support) For \(z \in \mathbb {H}\) with \(|z|\ge 10\), we have

$$\begin{aligned} \begin{aligned} \Lambda _{\mathrm {d}}(z) \mathbbm {1}(\Lambda _{\mathrm {d}}(z) \le 4 |z|^{-1} ) \!\;\lesssim \;\! |z |^{-2}||\mathbf {d}(z) ||_\infty . \end{aligned} \end{aligned}$$
(3.8)

Furthermore, there is a matrix valued function \(\mathbf {T}:\mathbb {H}\rightarrow {\mathbb {C}}^{N \times N}\), depending only on \(\varvec{\mathrm {S}} \) and satisfying the uniform bound \(||\mathbf {T}(z) ||_{\infty \rightarrow \infty }\lesssim 1\), such that for all \(\mathbf {w} \in {\mathbb {C}}^N\) and \(|z|\ge 10\) the averaged difference between \(\mathbf {g}\) and \(\mathbf {m}\) satisfies the improved bound

$$\begin{aligned}&\big |\bigl \langle \mathbf {w},\mathbf {g}(z)-\mathbf {m} (z) \bigr \rangle \big | \mathbbm {1}\big ( \Lambda _{\mathrm {d}}(z) \le 4 |z|^{-1} \big ) \nonumber \\&\quad \lesssim \;\!\! |z |^{-2} \big ( ||\mathbf {w} ||_\infty ||\mathbf {d}(z) ||_{\infty }^2+ |\langle \mathbf {T}(z) \mathbf {w},\mathbf {d}(z) \rangle | \big ) . \end{aligned}$$
(3.9)

For a matrix \( \varvec{\mathrm {A}} \) we denote by \( ||\varvec{\mathrm {A}} ||_{\infty \rightarrow \infty } \) the operator norm of \( \varvec{\mathrm {w}} \mapsto \varvec{\mathrm {A}}\varvec{\mathrm {w}} \) on \( \ell ^\infty \).

Proof

Since the matrix \(\mathbf {S}\) is flat (cf. (1.8)), it satisfies the norm bound \(||\mathbf {S} ||_{\infty \rightarrow \infty }\le 1\).

We also have the trivial bound \(|m_i(z)|\le 1/\mathrm {dist}(z,{{\mathrm{supp}}}\rho ) \le 2 |z|^{-1}\le 1/5\) at our disposal. This follows directly from the Stieltjes transform representation (3.4). In particular,

$$\begin{aligned} \begin{aligned} ||( \mathbf{1 }-\mathrm {diag}(\mathbf {m})^2 \mathbf {S} )^{-1} ||_{\infty \rightarrow \infty } \,\le \, 2, \end{aligned} \end{aligned}$$
(3.10)

from the geometric series; indeed, \( ||\mathrm {diag}(\mathbf {m})^2 \mathbf {S} ||_{\infty \rightarrow \infty } \le (1/5)^2\, ||\mathbf {S} ||_{\infty \rightarrow \infty } \le 1/25 \), so the Neumann series for the inverse converges and its norm is bounded by \( \sum _{k\ge 0} 25^{-k}\le 2 \). By inverting the matrix \(\mathbf{1 }-\mathrm {diag}(\mathbf {m})^2 \mathbf {S}\) and using the trivial bound on \(\mathbf {m}\) in (3.7) we find

$$\begin{aligned} \begin{aligned} \Lambda _{\mathrm {d}}(z) \,\le \, 4 \Bigl (\, |z|^{-1}\Lambda _{\mathrm {d}}(z)^2+|z|^{-1}\Lambda _{\mathrm {d}}(z) ||\mathbf {d}(z) ||_\infty +2 |z|^{-2}||\mathbf {d}(z) ||_\infty \Bigr ) . \end{aligned} \end{aligned}$$
(3.11)

Using the bound inside the indicator function from (3.8) and \(|z|\ge 10\) the assertion (3.8) of the lemma follows.
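
Spelled out: on the event \(\Lambda _{\mathrm {d}}\le 4 |z|^{-1}\) the first two terms on the right hand side of (3.11) satisfy \(4 |z|^{-1}\Lambda _{\mathrm {d}}^2\le 16 |z|^{-2}\Lambda _{\mathrm {d}}\) and \(4 |z|^{-1}\Lambda _{\mathrm {d}}||\mathbf {d} ||_\infty \le 16 |z|^{-2}||\mathbf {d} ||_\infty \), so that

$$\begin{aligned} \Lambda _{\mathrm {d}} \,\le \, 16 |z|^{-2}\Lambda _{\mathrm {d}}+24 |z|^{-2}||\mathbf {d} ||_\infty . \end{aligned}$$

Since \(16 |z|^{-2}\le 4/25\) for \(|z|\ge 10\), the first term can be absorbed into the left hand side, yielding \(\Lambda _{\mathrm {d}}\le 30\, |z|^{-2}||\mathbf {d} ||_\infty \) on this event.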

The bound for the average, (3.9), follows by taking the inverse of \(\mathbf{1 }-\mathrm {diag}(\mathbf {m})^2 \mathbf {S}\) on both sides of (3.7) and using (3.8) and \(|m_i|\sim |z|^{-1}\).

For the proof of Proposition 3.1 we use the stability of (3.7) also close to \({{\mathrm{supp}}}\rho \). This requires more care and is carried out in detail in [1]. The result of that analysis is Theorem 4.2. Here we will only need the following consequence of that theorem and (4.5a).

Corollary 3.4

(Stability away from minima) Suppose \(\delta _*\) is an arbitrary positive constant, depending only on the model parameters p, P and L. Let \(\mathbf {d}: \mathbb {H}\rightarrow {\mathbb {C}}^N\), \(\mathbf {g}: \mathbb {H}\rightarrow ({\mathbb {C}}\backslash \{ {0} \})^N\) be arbitrary vector valued functions on the complex upper half plane that satisfy

$$\begin{aligned} \begin{aligned} -\frac{1}{g_i(z)} \,=\, z + \sum _{j=1}^N s_{i j} g_j(z) +d_i(z), \quad z \in \mathbb {H}. \end{aligned} \end{aligned}$$
(3.12)

There exists a positive constant \(\lambda _* \sim 1\), such that the QVE is stable away from \({\mathbb {M}}\),

$$\begin{aligned} \begin{aligned} || \mathbf {g}(z)-\mathbf {m}(z) ||_\infty \,\mathbbm {1}\big ( ||\mathbf {g}(z)-\mathbf {m}(z) ||_\infty \le \lambda _* \big ) \,&\lesssim \, \,||\mathbf {d}(z) ||_\infty , \\&\quad z \in \mathbb {H},\; \mathrm {dist}(z,{\mathbb {M}})\ge \delta _* . \end{aligned} \end{aligned}$$
(3.13)

Furthermore, there is a matrix valued function \(\mathbf {T}:\mathbb {H}\rightarrow {\mathbb {C}}^{N \times N}\), depending only on \(\varvec{\mathrm {S}}\) and satisfying the uniform bound \(||\mathbf {T}(z) ||_{\infty \rightarrow \infty }\lesssim 1\), such that for all \(\mathbf {w} \in {\mathbb {C}}^N\),

$$\begin{aligned} \begin{aligned} |\langle \mathbf {w},\mathbf {g}(z)-\mathbf {m} (z)\rangle | \mathbbm {1}\big ( ||\mathbf {g}(z)-\mathbf {m}(z) ||_\infty \le \lambda _* \big ) \,\lesssim \,||\mathbf {w} ||_\infty || \mathbf {d}(z) ||_{\infty }^2+ |\langle \mathbf {T}(z) \mathbf {w},\mathbf {d}(z) \rangle |, \end{aligned} \end{aligned}$$
(3.14)

for \(z \in \mathbb {H}\) with \(\mathrm {dist}(z,{\mathbb {M}})\ge \delta _*\).

Furthermore, the following fluctuation averaging result is needed. It was first established for generalized Wigner matrices with Bernoulli distributed entries in [21].

Theorem 3.5

(Fluctuation averaging) For any \(z \in \mathbb {H}\) with \( \mathrm {Im}\,z \ge N^{\gamma -1} \), and any sequence of deterministic vectors \(\mathbf {w}=\mathbf {w}^{(N)} \in {\mathbb {C}}^N\) with the uniform bound \(||\mathbf {w} ||_\infty \le 1\), the following holds true: If \(\Lambda _{\mathrm {o}}(z) \prec \Phi /[z]^2\) for some deterministic (N-dependent) \(\Phi \le N^{-\gamma /3}\) and \(\Lambda (z)\prec N^{-\gamma /3}/(1+|z|)\), then

$$\begin{aligned} \begin{aligned} |\langle \mathbf {w},\mathbf {d}(z) \rangle | \,\prec \, [z]^{-1} \Phi ^2 + \frac{1}{N}, \end{aligned} \end{aligned}$$
(3.15)

where \( \varvec{\mathrm {d}}(z) \) is defined in (2.2).

Proof

The proof directly follows the one given in [14]. We only mention some minor necessary modifications. Let \(Q_kX:=X-\mathbbm {E}[X|\mathbf {H}^{(k)}]\) be the complementary projection to the conditional expectation of a random variable X given the matrix \(\mathbf {H}^{(k)}\), in which the k-th row and column are removed. From the definition of \(\mathbf {d}\) in (2.2) and Schur’s complement formula in the form,

$$\begin{aligned} \begin{aligned} \frac{1}{G_{kk}}\,=\, h_{kk}-z-\sum _{i,j}^{(k)}h_{k i }G_{i j}^{(k)}h_{j k}, \end{aligned} \end{aligned}$$
(3.16)

we infer the identity

$$\begin{aligned} d_k\,=\, - Q_k\frac{1}{G_{kk}}-s_{kk}G_{kk}- \sum _{i}^{(k)}s_{k i}\frac{G_{i k}G_{k i}}{G_{kk}}. \end{aligned}$$

In particular, we have that a.w.o.p.

$$\begin{aligned} \Big |d_k + Q_k \frac{1}{G_{kk}}\Big |\,\lesssim \;\frac{\;[z]^{-1} }{N} \,+ [z]\Lambda _{\mathrm {o}}^2. \end{aligned}$$
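
This estimate follows from the identity for \(d_k\) above, together with the flatness of \(\mathbf {S}\) (cf. (1.8)), which gives \(s_{kk}\lesssim N^{-1}\) and \(\sum _{i} s_{ki}\lesssim 1\), and the fact that the hypothesis \(\Lambda (z)\prec N^{-\gamma /3}/(1+|z|)\) combined with \(|m_k(z)|\sim [z]^{-1}\) yields \(|G_{kk}(z)|\sim [z]^{-1}\) a.w.o.p.; schematically,

$$\begin{aligned} |s_{kk}G_{kk}|\,\lesssim \, \frac{\;[z]^{-1} }{N}, \qquad \Bigl |\, \sum _{i}^{(k)}s_{k i}\,\frac{G_{i k}G_{k i}}{G_{kk}}\,\Bigr | \,\le \, \frac{\Lambda _{\mathrm {o}}^2}{|G_{kk}|}\,\sum _{i} s_{ki} \,\lesssim \, [z]\,\Lambda _{\mathrm {o}}^2 \quad \text {a.w.o.p.} \end{aligned}$$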

Thus, proving (3.15) reduces to showing

$$\begin{aligned} \left| \frac{1}{N}\sum _{k=1}^N \overline{w}_k Q_k \frac{1}{G_{kk}}\, \right| \,\prec \; [z]^{-1} \Phi ^2 + \frac{1}{N}. \end{aligned}$$

In the setting where \(\mathbf {H}\) is a generalized Wigner matrix and \(|z|\le 10\) this bound is precisely the content of Theorem 4.7 from [14].

The a priori bound used in the proof of that theorem is replaced by

$$\begin{aligned} \begin{aligned} \left| Q_k \frac{1}{G^{(V)}_{kk}} \right| \,\prec \, \Lambda _{\mathrm {o}} + \frac{1}{\sqrt{N}}, \end{aligned} \end{aligned}$$
(3.17)

for any \(V \subseteq \{1, \dots ,N\}\) with N-independent size. This bound is proven in the same way as (2.19). Here, the \(N_0\) hidden in the stochastic domination depends on the size |V| of the index set. Following the proof of Theorem 4.7 given in [14] with (3.17) and tracking the z-dependence,

$$\begin{aligned} \frac{1}{|G_{kk}^{(V)}(z)|}\,\prec \, [z], \end{aligned}$$

yields the fluctuation averaging, Theorem 3.5. \(\square \)

Proof of Proposition 3.1

Let us show first that (3.3) follows directly from (3.2) by applying the fluctuation averaging, Theorem 3.5. Indeed, (3.2) provides a deterministic bound on the off-diagonal error, \(\Lambda _{\mathrm {o}}\), which is needed to apply the fluctuation averaging to the second terms on the right hand sides of (3.14) and (3.9). It also shows that the indicator functions on the left hand sides of (3.14) and (3.9) are a.w.o.p. nonzero. The stability bound (3.9), valid in the large \( |z | \) regime, is necessary to get the correct [z]-factors in (3.3). Thus, (3.3) is proven, provided (3.2) is true.

The proof of (3.2) is split into two different regimes. In the first regime the absolute value of z is large, \(|z|\ge N^{ 5}\). In this case we only use weak a priori bounds on the resolvent elements and on the entries of \(\mathbf {d}\); together with Lemma 3.3 they suffice to prove (3.2). In the second regime, \(|z|\le N^{ 5}\), we use a continuity argument. We will establish a gap in the possible values that the continuous function \(z\mapsto [z]\Lambda (z)\) might have. Here, the stability result Corollary 3.4 is used. With the help of Lemma A.2 in the appendix, we use this gap to propagate the bound from \(|z|=N^{ 5}\) to the whole domain where \(|z|\le N^{ 5}\), \(\eta \ge N^{\gamma -1}\) and z stays away from \({\mathbb {M}}\).

Regime 1: Let \(|z|\ge N^{ 5}\). We show that the indicator functions in the statement of Lemma 2.1 are a.w.o.p. not vanishing. We start by showing that the diagonal contribution, \(\Lambda _{\mathrm {d}}\), to \(\Lambda \) is sufficiently small. The reduced resolvent elements for an arbitrary \(V \subseteq \{1, \dots ,N\}\) satisfy

$$\begin{aligned} \begin{aligned} |G^{(V)}_{i j}(z)|\,\le \,\eta ^{-1}\,\le \, N^{1-\gamma }. \end{aligned} \end{aligned}$$
(3.18)

From this and the definition of \(\mathbf {d}\) in (2.2) we read off the a priori bound,

$$\begin{aligned} \begin{aligned} ||\mathbf {d}(z) ||_\infty \,\prec \,N^{ 2-\gamma }. \end{aligned} \end{aligned}$$
(3.19)

Here, we used the general resolvent identity (2.9) in the form \(G_{i k}G_{k i}=g_k(g_i-G_{ii}^{(k)})\). Since \(\mathbf {g}\) satisfies the perturbed QVE (2.1) and \(|\sum _{j=1}^N s_{ij} g_j(z) + d_i(z)|\prec N^{2-\gamma }\) from (3.19) and (3.18) we conclude that uniformly for \(|z|\ge N^{ 2}\) we have

$$\begin{aligned} \begin{aligned} |g_k(z)|\,\le \, 2 |z|^{-1},\quad \text {a.w.o.p.} \end{aligned} \end{aligned}$$
(3.20)
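
Indeed, combining the perturbed QVE (2.1) with the a.w.o.p. bound \(|\sum _{j} s_{kj} g_j(z) + d_k(z)|\le N^{ 2-\gamma /2}\), which follows from (3.18) and (3.19), we get for N sufficiently large

$$\begin{aligned} |g_k(z)| \,=\, \frac{1}{\bigl |\, z+ \sum _{j} s_{kj} g_j(z) + d_k(z) \,\bigr |} \,\le \, \frac{1}{|z|-N^{ 2-\gamma /2}} \,\le \, \frac{2}{|z|}, \quad |z|\ge N^{ 2}. \end{aligned}$$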

With the trivial bound \(|m_i(z)|\le 1/\mathrm {dist}(z, {{\mathrm{supp}}}\rho )\le 2 |z|^{-1}\) on the solution of the QVE we infer that on this domain \(\Lambda _{\mathrm {d}} \le ||\mathbf {g} ||_\infty + ||\mathbf {m} ||_\infty \le 4 |z|^{-1}\), i.e., the indicator function in (3.8) is a.w.o.p. non-zero. Therefore, uniformly for \(|z|\ge N^{ 2}\), Lemma 3.3 yields

$$\begin{aligned} \begin{aligned} \Lambda _{\mathrm {d}}(z) \,\lesssim \, |z|^{-2} ||\mathbf {d}(z) ||_\infty \,\le \, N^{-\gamma /2} |z|^{-1} , \quad \text {a.w.o.p.} \end{aligned} \end{aligned}$$
(3.21)

In the last inequality we have used (3.19) in the form \( ||\varvec{\mathrm {d}} ||_\infty \le N^{-\gamma /2}N^2 \) a.w.o.p. (cf. Definitions 1.6 and 1.9) and the extra factor \( [z]^{-2} \) on the right hand side of (3.8). Thus, for \(|z|\ge N^{ 2}\) the diagonal contribution to \(\Lambda \) does not play a role in the indicator function in the statement of Lemma 2.1.

Now we derive a similar bound for the off-diagonal contribution \(\Lambda _{\mathrm {o}}\). Using the resolvent identity (2.9) for \(i=j\) again, the bound \(|h_{i j}|\prec N^{-1/2}\) on the entries of the random matrix and the a priori bound on the reduced resolvent elements, (3.18), in the expansion formula (2.3) yields

$$\begin{aligned} \begin{aligned} |G_{k l}(z)|\,\prec \, \big (\,|g_k(z) g_l(z)|+|G_{k l}(z)G_{l k}(z)|\,\big ) N^{ 2-\gamma },\quad |G_{k l}(z)|\,\prec \,|g_k(z)| N^{ 3-\gamma }, \end{aligned} \end{aligned}$$
(3.22)

for \(k \ne l\). With the bound (3.20) we conclude that

$$\begin{aligned} \begin{aligned} \Lambda _{\mathrm {o}}(z) \;\prec \; |z|^{-2} N^{2-\gamma }+ |z|^{-1} N^{ 5-2\gamma }\Lambda _{\mathrm {o}}(z),\quad |z| \ge N^{ 2}. \end{aligned} \end{aligned}$$
(3.23)

Thus, \(\Lambda _{\mathrm {o}}\prec N^{-3}|z|^{-1}\) on the domain where \(|z|\ge N^{ 5}\). We conclude that Lemma 2.1 applies in this regime even without the indicator functions in the formulas (2.5). We use the bound from this lemma for the norm of \(\mathbf {d}\) and the off-diagonal contribution, \(\Lambda _{\mathrm {o}}\), to \(\Lambda \), while we use the first inequality in (3.21) for the diagonal contribution, \(\Lambda _{\mathrm {d}}\). In this way, we get

$$\begin{aligned} |z|^2\Lambda _{\mathrm {d}}+||\mathbf {d} ||_\infty\prec & {} |z|^{-2}\sqrt{\frac{\rho }{N \eta }}\,+|z|^{-2}\sqrt{\frac{\Lambda _{\mathrm {d}}}{N \eta }}+\frac{1}{\sqrt{N}}, \nonumber \\ |z|^2\Lambda _{\mathrm {o}}\prec & {} \sqrt{\frac{\rho }{N \eta }}\,+\,\sqrt{\frac{\Lambda _{\mathrm {d}}}{N \eta }}+\frac{1}{\sqrt{N}}, \end{aligned}$$
(3.24)

where we also used \(g_k = m_k + \mathcal {O}(\Lambda _{\mathrm {d}})\). Applying the weighted Cauchy-Schwarz inequality, \( \sqrt{\alpha \beta }\le \theta \,\alpha +\theta ^{-1}\beta \), we find for any \(\varepsilon \in (0,\gamma )\) that the right hand side of the first inequality can be estimated further by

$$\begin{aligned} |z|^2\Lambda _{\mathrm {d}}+||\mathbf {d} ||_\infty \prec |z|^{-2}\sqrt{\frac{\rho }{N \eta }}+N^{-\varepsilon } |z|^2\Lambda _{\mathrm {d}}+ |z|^{-6}\frac{N^\varepsilon }{N \eta }+\frac{1}{\sqrt{N}}. \end{aligned}$$
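
Explicitly, the weighted Cauchy-Schwarz step is applied with \(\alpha = |z|^{2}\Lambda _{\mathrm {d}}\), \(\beta = |z|^{-6}/(N\eta )\) and \(\theta =N^{-\varepsilon }\):

$$\begin{aligned} |z|^{-2}\sqrt{\frac{\Lambda _{\mathrm {d}}}{N \eta }} \,=\, \sqrt{\alpha \beta } \,\le \, N^{-\varepsilon } |z|^{2}\Lambda _{\mathrm {d}}+ N^{\varepsilon }\,\frac{ |z|^{-6} }{N \eta } . \end{aligned}$$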

The term \(N^{-\varepsilon }|z|^2\Lambda _{\mathrm {d}}\) can be absorbed into the left hand side. Moreover, by the definition of stochastic domination, and since \(\varepsilon \) is arbitrarily small, the remaining factor \(N^\varepsilon \) on the right hand side can be replaced by 1 without affecting the validity of the bound (cf. (i) and (ii) of Lemma A.1). In this way we arrive at

$$\begin{aligned} |z|^2\Lambda _{\mathrm {d}}+||\mathbf {d} ||_\infty \,\prec \, |z|^{-2}\sqrt{\frac{\rho }{N \eta }}+\frac{\; |z|^{-6} }{N \eta }\,+\frac{1}{\sqrt{N}} . \end{aligned}$$

For the bound on the off-diagonal error term we plug this result into (3.24) and get

$$\begin{aligned} \Lambda _{\mathrm {o}}\,\prec \,|z|^{-2}\sqrt{\frac{\rho }{N\eta }} + \frac{\,|z|^{-6} }{N\eta }+\frac{\,|z|^{-3} }{N^{1/4}}\sqrt{\frac{1}{N\eta }}+\frac{\,|z|^{-2} }{\sqrt{N}}\;. \end{aligned}$$

Regime 2: Now let \(|z|\le N^{ 5}\) and suppose that \(\delta _*\) is a positive constant, depending only on the model parameters p, P and L. The diagonal contribution, \(\Lambda _{\mathrm {d}}\), satisfies

$$\begin{aligned} \begin{aligned} \Lambda _{\mathrm {d}}(z)\,\mathbbm {1}\big (\Lambda _{\mathrm {d}}(z) \le \lambda _*/[z]\big ) \;\lesssim \; [z]^{-2}||\mathbf {d}(z) ||_\infty , \end{aligned} \end{aligned}$$
(3.25)

according to (3.8) in Lemma 3.3 (for \(|z|\ge 10\)) and (3.13) from Corollary 3.4 (for \(|z|\le 10\)), where \(\lambda _*\) is a sufficiently small positive constant.

We will now establish a gap in the possible values of \(\Lambda (z) \) by showing (cf. (3.29) below) that the right hand side of (3.25) is much less than \( \lambda _*/[z] \). To this end we estimate the norm of \(\mathbf {d}\) in (3.25) by Lemma 2.1 and also use the bound on the off-diagonal contribution, \(\Lambda _{\mathrm {o}}\), from the same lemma,

$$\begin{aligned} \begin{aligned} \big ( [z]^2\Lambda _{\mathrm {d}}+||\mathbf {d} ||_\infty \big )\,\mathbbm {1}\big (\Lambda \le \lambda _*/[z]\big ) \;&\!\!\prec \;\!\! [z]^{-2}\sqrt{\frac{\mathrm {Im}\langle \mathbf {g}\rangle }{N \eta }}+\frac{1}{\sqrt{N}}, \\ [z]^2\Lambda _{\mathrm {o}}\,\mathbbm {1}\big (\Lambda \le \lambda _*/[z]\big ) \;&\!\!\prec \;\!\! \sqrt{\frac{\mathrm {Im}\langle \mathbf {g}\rangle }{N \eta }}+\frac{1}{\sqrt{N}}. \end{aligned} \end{aligned}$$
(3.26)

Now we use \( \mathrm {Im}\langle \varvec{\mathrm {g}} \rangle = \pi \rho + \mathrm {Im}\langle \mathbf {g}-\varvec{\mathrm {m}} \rangle \lesssim \rho + \Lambda _{\mathrm {d}} \) to estimate the first terms on the right hand side of (3.26):

$$\begin{aligned} \sqrt{\frac{\mathrm {Im}\langle \varvec{\mathrm {g}} \rangle }{N\eta }} \,\lesssim \, \sqrt{\frac{\pi \rho }{N\eta }} + \sqrt{ \frac{1}{N\eta }\,\Lambda _{\mathrm {d}}} . \end{aligned}$$

Using again the weighted Cauchy-Schwarz inequality in the second term yields

$$\begin{aligned} \big ( [z]^2\Lambda _{\mathrm {d}}+||\mathbf {d} ||_\infty \big )\,\mathbbm {1}\big (\Lambda \le \lambda _*/[z]\big ) \,\prec \, [z]^{-2}\sqrt{\frac{\rho }{N \eta }}+[z]^{-6}\frac{N^\varepsilon }{N\eta }+\frac{1}{\sqrt{N}}+N^{-\varepsilon } [z]^2\Lambda _{\mathrm {d}} . \end{aligned}$$

The term \(N^{-\varepsilon }[z]^2\Lambda _{\mathrm {d}}\) can be absorbed (cf. (ii) of Lemma A.1) into the left hand side and we arrive at

$$\begin{aligned} \begin{aligned} \big ( [z]^2\Lambda _{\mathrm {d}}+||\mathbf {d} ||_\infty \big )\,\mathbbm {1}\big (\Lambda \le \lambda _*/[z]\big )\,\prec \, [z]^{-2}\sqrt{\frac{\rho }{N \eta }}+\frac{\;[z]^{-6} }{N\eta }\,+\frac{1}{\sqrt{N }}. \end{aligned} \end{aligned}$$
(3.27)

For the off-diagonal error terms we plug this into the second bound of (3.26) after using \(\mathrm {Im}\langle \mathbf {g} \rangle \lesssim \rho + \Lambda _{\mathrm {d}}\) and get

$$\begin{aligned} \begin{aligned} \Lambda _{\mathrm {o}} \;\prec \; [z]^{-2}\sqrt{\frac{\rho }{N\eta }} +\frac{\; [z]^{-6} }{N\eta }\,+\frac{\;[z]^{-3} }{N^{1/4}}\sqrt{\frac{1}{N\eta }}+\frac{\;[z]^{-2} }{\sqrt{N}}. \end{aligned} \end{aligned}$$
(3.28)

In particular, we combine (3.27) and (3.28) to establish a gap in the values that \(\Lambda \) can take,

$$\begin{aligned} \begin{aligned} \Lambda \,\mathbbm {1}\big (\Lambda \le \lambda _*/[z]\big )\,\prec \,[z]^{-1}N^{-\gamma /2} . \end{aligned} \end{aligned}$$
(3.29)

Here we used \(\eta \ge N^{\gamma -1}\). This shows that either \( \Lambda \ge \lambda _*/[z] \) or \( \Lambda \le N^{-\gamma /4}/[z] \) a.w.o.p.
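
In more detail, (3.29) is obtained from (3.27) and (3.28) by using \(\rho \lesssim 1\) (a consequence of \(|m_i|\lesssim 1\)), \([z]\ge 1\) and \(N\eta \ge N^{\gamma }\):

$$\begin{aligned} [z]\,\Lambda \,\mathbbm {1}\big (\Lambda \le \lambda _*/[z]\big ) \,\prec \, \sqrt{\frac{\rho }{N \eta }} + \frac{1}{N\eta } + \frac{1}{N^{1/4}\sqrt{N\eta }} + \frac{1}{\sqrt{N}} \,\lesssim \, N^{-\gamma /2} . \end{aligned}$$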

Now we apply Lemma A.2 on the connected domain

$$\begin{aligned} \Bigl \{ {z \in \mathbb {H}: \mathrm {Im}z \ge N^{\gamma -1}, \mathrm {dist}(z,{\mathbb {M}})\ge \delta _*, |z|\le N^5} \Bigr \} , \end{aligned}$$

with the choices

$$\begin{aligned} \begin{aligned} \varphi (z)\,:=\,[z] \Lambda (z),\quad \Phi (z)\,:=\, N^{-\gamma /3}, \quad z_0\,:=\, \mathrm {i}N^{ 5}. \end{aligned} \end{aligned}$$
(3.30)

The continuity condition (A.1) of the lemma for these two functions follows from the Hölder-continuity, (3.5), of the solution of the QVE and the weak continuity of the resolvent elements,

$$\begin{aligned} \begin{aligned} |G_{i j}(z_1) - G_{i j}(z_2)| \,\le \, \frac{|z_1-z_2|}{(\mathrm {Im}z_1) (\mathrm {Im}z_2)} \,\le \, N^{ 2} |z_1-z_2|. \end{aligned} \end{aligned}$$
(3.31)

The condition (A.3) holds since, by (3.2) in the first regime, we have a.w.o.p. \(\varphi (z_0)\le \Phi (z_0)\). Finally, (3.29) implies a.w.o.p. \(\varphi \,\mathbbm {1}( \varphi \in [\Phi -N^{-1},\Phi ] )< \Phi -N^{-1}\) and thus (A.2). We infer that a.w.o.p. \(\varphi \le \Phi \). In particular, the indicator function in (3.27) is non-zero a.w.o.p. Thus, (3.27) and (3.28) imply (3.2) in the second regime. \(\square \)

We will now sketch the proof of Corollary 1.8. The set-up in this corollary differs slightly from the one used in the rest of this paper, because the uniform bound (assumption (C)) on the solution of (1.7) is not assumed. We therefore use additional information from [1] about \(\mathbf {m}\) in this more general setting.

Proof of Corollary 1.8

Since the boundedness assumption (C) on the solution of the QVE is dropped in this corollary, its proof starts by showing that nevertheless for some constant \(P>0\) we have

$$\begin{aligned} \begin{aligned} |m_i(z)|\,\le \, P,\quad i=1,\dots ,N,\; z \in I+ \mathrm {i}(0,\infty ). \end{aligned} \end{aligned}$$
(3.32)

In this setting the solution \(\mathbf {m}(z)\) is not guaranteed to be extendable as a Hölder-continuous function with N-independent Hölder-norm to \(z \in \overline{\mathbb {H}}\). The density of states, defined by (1.12), however, still has a Hölder-norm with Hölder-exponent 1/13 that is independent of N (cf. (i) of Proposition 7.1 and (i) of Theorem 6.1 in [1]). Here, we used \(L=1\) for the model parameter from assumption (B). Furthermore, (3.32) follows from the lower bound on the density of states together with (i) of Lemma 5.4 and (i) of Theorem 6.1 in [1]. For the proof of Proposition 3.1 we only used the properties of the solution of (1.7), valid for z in the entire complex upper half plane, that are listed in Corollaries 2.2, 3.2 and 3.4. These properties remain true for \(\mathrm {Re}z \in I \) (cf. Theorem 2.1, (i) of Theorem 2.12, (i) of Proposition 5.3 and Proposition 7.1 in [1]) if only (3.32) instead of (1.10) is satisfied. Thus, (3.2a), (3.2b) and (3.3) hold for \(z \in I+ \mathrm {i}[N^{\gamma -1},\infty )\) and Corollary 1.8 is proven. \(\square \)

4 Local law close to local minima

4.1 The solution of the QVE

In this subsection we state a few facts about the solution \(\mathbf {m}\) of the QVE (1.7) and about the stability of this equation against perturbations. These facts are summarized in two theorems that are taken from the companion paper [1]. The first theorem contains regularity properties of \(\mathbf {m}\). Furthermore, it provides lower and upper bounds on the imaginary part, \(\mathrm {Im}\langle \mathbf {m}\rangle =\pi \rho \), by explicit functions. It is a combination of the statements from Theorem 2.1, Theorem 2.4, Theorem 2.6 and Corollary A.1 of [1].
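
As a purely illustrative aside, and not as part of any proof, the QVE (1.7) can be solved numerically by a damped fixed-point iteration. The following minimal Python sketch assumes a hypothetical flat, symmetric variance matrix \(\mathbf {S}\) with row sums of order one and a spectral parameter with \(\mathrm {Im}\, z\) bounded away from zero (a regime in which the iteration converges in practice), and reads off \(\mathrm {Im}\langle \mathbf {m} \rangle /\pi \) as an approximation of the harmonic extension of the density of states.

```python
import numpy as np

def solve_qve(S, z, tol=1e-10, max_iter=10_000, damping=0.5):
    """Damped fixed-point iteration for the QVE  -1/m_i = z + (S m)_i.

    S : (N, N) symmetric, entrywise nonnegative variance matrix (flat, row sums ~ 1)
    z : complex spectral parameter with Im z > 0
    Returns the solution vector m(z), whose entries have positive imaginary part.
    """
    m = np.full(S.shape[0], -1.0 / z, dtype=complex)   # exact solution for S = 0
    for _ in range(max_iter):
        m_new = -1.0 / (z + S @ m)                     # one QVE iteration step
        if np.max(np.abs(m_new - m)) < tol:
            return m_new
        m = damping * m_new + (1.0 - damping) * m      # damping stabilizes the iteration
    return m

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N = 500
    S = rng.uniform(0.5, 1.5, size=(N, N)) / N         # hypothetical flat variance profile
    S = 0.5 * (S + S.T)
    m = solve_qve(S, z=0.3 + 0.05j)
    print("rho(z) ~", m.imag.mean() / np.pi)           # harmonic extension of the density of states
```

Both the iteration map and the damped average preserve the upper half plane componentwise, so the iterates remain admissible; for very small \(\mathrm {Im}\, z\) the convergence slows down and a smaller damping parameter may be needed.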

Theorem 4.1

(Solution of the QVE) Let the sequence \(\mathbf {S} = \mathbf {S}^{(N)}\) satisfy the assumptions (A)-(C). Then for every component, \(m_i: \mathbb {H}\rightarrow \mathbb {H}\), of the unique solution, \(\mathbf {m}=(m_1, \dots , m_N)\), of the QVE there is a probability density \(p_i:\mathbb {R}\rightarrow [0,\infty )\) with support in the interval \([-2,2]\), such that

$$\begin{aligned} \begin{aligned} m_i(z)\,=\, \int _\mathbb {R}\frac{p_i(\tau ) \mathrm {d}\tau }{\tau -z},\quad z \in \mathbb {H},\; i=1,\dots ,N. \end{aligned} \end{aligned}$$
(4.1)

The probability densities are comparable,

$$\begin{aligned} \begin{aligned} p_i(\tau )\,\sim \,p_j(\tau ),\quad \tau \in \mathbb {R}, \; i,j=1,\dots ,N. \end{aligned} \end{aligned}$$
(4.2)

The solution \(\mathbf {m}\) has a uniformly Hölder-continuous extension (denoted again by \(\mathbf {m}\)) to the closed complex upper half plane \(\overline{\mathbb {H}}=\mathbb {H}\cup \mathbb {R}\),

$$\begin{aligned} \begin{aligned} || \mathbf {m}(z_1)-\mathbf {m}(z_2) ||_\infty \,\lesssim \, |z_1-z_2|^{1/3}, \quad z_1,z_2 \in \overline{\mathbb {H}}. \end{aligned} \end{aligned}$$
(4.3)

Its absolute value satisfies

$$\begin{aligned} |m_i(z)|\,\sim \, [z]^{-1},\quad z \in \overline{\mathbb {H}},\; i=1,\dots ,N. \end{aligned}$$

Let \(\rho : \mathbb {R}\rightarrow [0,\infty ),\tau \mapsto \langle \varvec{\mathrm {p}}(\tau ) \rangle \) be the density of states, defined in (1.12). Then there exists a positive constant \(\delta _* \sim 1 \) such that the following holds true. The support of the density consists of \(K \sim 1 \) disjoint intervals of lengths at least \(2 \delta _*\), i.e.,

$$\begin{aligned} \begin{aligned} {{\mathrm{supp}}}\rho \,=\, \bigcup _{i=1}^{K}\, [ \alpha _i,\beta _i] ,\quad \text {where}\quad \beta _i -\alpha _i \ge 2 \delta _*,\quad \text {and}\quad \alpha _i< \beta _i< \alpha _{i+1} . \end{aligned} \end{aligned}$$
(4.4)

The size of the harmonic extension (1.15) of \(\rho \), up to constant factors, is given by explicit functions as follows. Let \(\eta \in [0,\delta _*]\).

  • Bulk: Close to the support of the density of states but away from the local minima in \({\mathbb {M}}\) (cf. (3.1)) the function \(\rho \) is comparable to 1, i.e.,

    $$\begin{aligned} \begin{aligned} \rho (\tau +\mathrm {i}\eta )\,\sim \, 1,\quad \tau \in {{\mathrm{supp}}}\rho ,\; \mathrm {dist}(\tau ,{\mathbb {M}})\ge \delta _*. \end{aligned} \end{aligned}$$
    (4.5a)
  • At an internal edge: At the edges \(\alpha _i, \beta _{i-1}\) with \(i =2, \dots , K\), in the direction where the support of the density of states continues, the size of \(\rho \) is

    $$\begin{aligned} \begin{aligned} \rho (\alpha _i+\omega + \mathrm {i}\eta ) \,\sim \, \rho (\beta _{i-1}-\omega + \mathrm {i}\eta ) \,&\sim \, \frac{(\omega +\eta )^{1/2}}{(\alpha _i-\beta _{i-1} +\omega +\eta )^{1/6}}, \end{aligned} \end{aligned}$$
    (4.5b)

    for all \(\omega \in [ 0 ,\delta _*].\)

  • Inside a gap: Between two neighboring edges \(\beta _{i-1}\) and \(\alpha _i\) with \(i =2, \dots , K\), the function \(\rho \) satisfies

    $$\begin{aligned} \begin{aligned} \rho (\beta _{i-1} + \omega + \mathrm {i}\eta )\,\sim \,\rho (\alpha _{i} - \omega + \mathrm {i}\eta ) \,\sim \,\frac{\eta }{(\alpha _i-\beta _{i-1}+\eta )^{1/6} (\omega +\eta )^{1/2}}, \end{aligned} \end{aligned}$$
    (4.5c)

    for all \(\omega \in [0,(\alpha _i-\beta _{i-1})/2]\).

  • Around an extreme edge: At the extreme points \( \alpha _1\) and \( \beta _{K} \) of \( {{\mathrm{supp}}}\rho \) the density of states grows like a square root,

    $$\begin{aligned} \begin{aligned} \rho (\alpha _1+\omega + \mathrm {i}\eta ) \,\sim \, \rho (\beta _{K}-\omega + \mathrm {i}\eta ) \,\sim \, {\left\{ \begin{array}{ll} \displaystyle (\omega +\eta )^{1/2},\quad &{}\omega \in [ 0,\delta _* ], \\ \displaystyle \frac{\eta }{( |\omega |+\eta )^{1/2}}\;, &{}\omega \in [-\delta _*, 0 ]. \end{array}\right. } \end{aligned} \end{aligned}$$
    (4.5d)
  • Close to a local minimum: In a neighborhood of a local minimum in the interior of the support of the density of states, i.e., for \(\tau _0 \in {\mathbb {M}} \cap \mathrm{int}{{\mathrm{supp}}}\rho \), we have

    $$\begin{aligned} \begin{aligned} \rho (\tau _0+\omega + \mathrm {i}\eta ) \,\sim \, \rho (\tau _0)+ (|\omega |+\eta )^{1/3},\quad \omega \in [-\delta _*, \delta _* ]. \end{aligned} \end{aligned}$$
    (4.5e)
  • Away from the support: Away from the interval in which \({{\mathrm{supp}}}\rho \) is contained

    $$\begin{aligned} \begin{aligned} \rho (z)\,\sim \, \frac{\mathrm {Im}z}{|z|^2},\quad z \in \overline{\mathbb {H}},\;\mathrm {dist}(z,[\alpha _1,\beta _K])\,\ge \,\delta _*. \end{aligned} \end{aligned}$$
    (4.5f)

The next theorem shows that the QVE is stable under small perturbations, \(\mathbf {d}\), in the sense that once a solution of the perturbed QVE (4.6) is sufficiently close to \(\mathbf {m}\), then the difference between the two can be estimated in terms of \(||\mathbf {d} ||_\infty \). In [1] it is stated as Proposition 10.1.

Theorem 4.2

(Stability) There exists a scalar function \(\sigma : \overline{\mathbb {H}} \rightarrow [ 0 ,\infty )\), three vector valued functions \(\mathbf {s}, \mathbf {t}^{(1)}, \mathbf {t}^{(2)}: \overline{\mathbb {H}}\rightarrow {\mathbb {C}}^N\), a matrix valued function \(\mathbf {T}:\overline{\mathbb {H}}\rightarrow {\mathbb {C}}^{N\times N}\), all depending only on \(\mathbf {S}\), and a positive constant \(\lambda _*\), depending only on the model parameters p, P and L, such that for two arbitrary vector valued functions \(\mathbf {d}: \mathbb {H}\rightarrow {\mathbb {C}}^N\) and \(\mathbf {g}: \mathbb {H}\rightarrow ({\mathbb {C}}\backslash \{ {0} \})^N\) that satisfy

$$\begin{aligned} \begin{aligned} -\frac{1}{g_i(z)} \,=\, z + \sum _{j=1}^N s_{i j} g_j(z) +d_i(z), \quad z \in \mathbb {H}, \end{aligned} \end{aligned}$$
(4.6)

the difference between \(\mathbf {g} = \varvec{\mathrm {g}}(z)\) and \(\mathbf {m} = \varvec{\mathrm {m}}(z)\) is bounded in terms of

$$\begin{aligned} \begin{aligned} \Theta = \Theta (z)\,:=\, \big |\bigl \langle \mathbf {s}(z) , \mathbf {g}(z)-\mathbf {m}(z) \bigr \rangle \big |,\quad z \in \mathbb {H}, \end{aligned} \end{aligned}$$
(4.7)

in the following two ways. On the whole complex upper half plane

$$\begin{aligned} || \mathbf {g}-\mathbf {m} ||_\infty \mathbbm {1}\big ( ||\mathbf {g}-\mathbf {m} ||_\infty \le \lambda _*\big ) \;&\lesssim \; \Theta \,+\,||\mathbf {d} ||_\infty , \end{aligned}$$
(4.8)
$$\begin{aligned} |\langle \mathbf {w},\mathbf {g}-\mathbf {m} \rangle | \mathbbm {1}\big ( ||\mathbf {g}-\mathbf {m} ||_\infty \le \lambda _*\big ) \;&\lesssim \; ||\mathbf {w} ||_\infty \Theta +||\mathbf {w} ||_\infty ||\mathbf {d} ||_\infty ^2+|\langle \mathbf {T}\mathbf {w},\mathbf {d} \rangle |, \end{aligned}$$
(4.9)

for any non-random \(\varvec{\mathrm {w}} \in {\mathbb {C}}^N \). The scalar function \(\Theta : \overline{\mathbb {H}} \rightarrow [ 0 ,\infty ) \) satisfies a cubic equation

$$\begin{aligned} \begin{aligned} \big | \Theta ^3+\pi _2 \Theta ^2+\pi _1 \Theta \big | \mathbbm {1}\big ( ||\mathbf {g}-\mathbf {m} ||_\infty \le \lambda _*\big ) \;\lesssim \; ||\mathbf {d} ||_\infty ^2+|\langle \mathbf {t}^{(1)},\mathbf {d} \rangle |\,+\,|\langle \mathbf {t}^{(2)},\mathbf {d} \rangle | . \end{aligned} \end{aligned}$$
(4.10)

The coefficients \(\pi _1,\pi _2:\mathbb {H}\rightarrow {\mathbb {C}}\) may depend on \(\varvec{\mathrm {S}}\) and \(\varvec{\mathrm {g}}\). They satisfy

$$\begin{aligned} |\pi _1(z)| \,&\sim \, \frac{\mathrm {Im}z}{\rho (z)}+\rho (z) ( \rho (z)+\sigma (z)), \end{aligned}$$
(4.11a)
$$\begin{aligned} |\pi _2(z)| \,&\sim \, \rho (z)+\sigma (z), \end{aligned}$$
(4.11b)

for all \( z \in \mathbb {H}\). Moreover, the functions \(\sigma \), \(\mathbf {s}\), \(\mathbf {t}^{(1)}, \mathbf {t}^{(2)}\) and \(\mathbf {T}\) are regular in the sense that

$$\begin{aligned}&\displaystyle | \sigma (z_1)-\sigma (z_2) | \,+\, || \mathbf {s}(z_1)-\mathbf {s}(z_2) || \;\lesssim \; |z_1-z_2 |^{1/3}, \quad z_1,z_2 \in \overline{\mathbb {H}}, \end{aligned}$$
(4.12)
$$\begin{aligned}&\displaystyle \sigma (z) \,+\, || \mathbf {s}(z) ||_\infty \,+\, || \mathbf {t}^{(1)}(z) ||_\infty \,+\, || \mathbf {t}^{(2)}(z) ||_\infty \,+\, ||\mathbf {T}(z) ||_{\infty \rightarrow \infty } \;\lesssim \; 1, \quad z \in \overline{\mathbb {H}}.\nonumber \\ \end{aligned}$$
(4.13)

Furthermore, the function \( \sigma \) is related to the density of states by

$$\begin{aligned} \sigma (\alpha _i) \,&\sim \, \sigma (\beta _{i-1}) \,\sim \,(\alpha _i-\beta _{i-1})^{1/3}, \quad i=2,\dots ,K, \end{aligned}$$
(4.14a)
$$\begin{aligned} \sigma (\alpha _1)\,&\sim \, \sigma (\beta _{K}) \,\sim \,1, \end{aligned}$$
(4.14b)
$$\begin{aligned} \sigma (\tau _0)\,&\lesssim \; \rho (\tau _0)^2, \quad \tau _0 \in {\mathbb {M}}\backslash \{\alpha _i,\beta _i\} . \end{aligned}$$
(4.14c)

We warn the reader that in this paper \( \Theta \) and \( \sigma \) denote the absolute values of the quantities denoted by the same symbols in Proposition 10.1 of [1]. The function \(\sigma \) appears naturally in the analysis of the QVE. Analogous to the more explicitly constructed function \(\Delta \) from Definition 1.5, at an edge the value of \(\sigma ^{ 3}\) encodes the size of the corresponding gap in \({{\mathrm{supp}}}\rho \). At the local minima in \({\mathbb {M}}\backslash \{\alpha _i,\beta _i\}\) the value of \(\sigma ^{ 3}\) is small, provided the density of states has a small value at the minimum. In this sense it is again analogous to \(\Delta \), which vanishes at these internal minima.

4.2 Coefficients of the cubic equation

The stability of the QVE near the points in \({\mathbb {M}}\) requires a careful analysis of the cubic equation (4.10) for \(\Theta \) from Theorem 4.2. For this, we will provide a more explicit description of the upper and lower bounds from (4.11) on the coefficients, \(\pi _1\) and \(\pi _2\), of the cubic equation.

Proposition 4.3

(Behavior of the coefficients) There exist \( \delta _*,c_*\sim 1 \) such that for all \(\eta \in [0,\delta _*]\) the coefficients, \(\pi _1\) and \(\pi _2\), of the cubic Eq. (4.10) satisfy the following bounds.

  • Around an internal edge: At the edges \(\alpha _i, \beta _{i-1}\) of the gap with length \(\Delta :=\alpha _i-\beta _{i-1}\) for \(i =2, \dots , K\), we have

    $$\begin{aligned} | \pi _1(\alpha _i+\omega + \mathrm {i}\eta ) | \,&\sim \, | \pi _1(\beta _{i-1}-\omega + \mathrm {i}\eta ) | \,\sim \, ( |\omega |+\eta )^{1/2}( |\omega |+\eta +\Delta )^{1/6}, \nonumber \\ | \pi _2(\alpha _i+\omega + \mathrm {i}\eta ) | \,&\sim \, | \pi _2(\beta _{i-1}-\omega + \mathrm {i}\eta ) | \,\sim \, ( |\omega |+\eta +\Delta )^{1/3}, \nonumber \\&\qquad \omega \in [-c_*\Delta ,\delta _*] . \end{aligned}$$
    (4.15a)
  • Well inside a gap: Between two neighboring edges \(\beta _{i-1} \) and \(\alpha _i\) of the gap with length \(\Delta :=\alpha _i-\beta _{i-1}\) for \(i =2, \dots , K\), the first coefficient, \(\pi _1\), satisfies

    $$\begin{aligned} \begin{aligned} | \pi _1(\alpha _i-\omega + \mathrm {i}\eta ) | \,\sim \, | \pi _1(\beta _{i-1}+\omega + \mathrm {i}\eta ) | \,\sim \, ( \eta +\Delta )^{2/3}, \quad \omega \in \left[ c_*\Delta , \frac{\Delta }{2}\right] . \end{aligned} \end{aligned}$$
    (4.15b)

    The second coefficient, \(\pi _2\), satisfies the upper bounds,

    $$\begin{aligned} \begin{aligned} \begin{array}{c} | \pi _2(\alpha _i-\omega + \mathrm {i}\eta ) | \;\lesssim \; ( \eta +\Delta )^{1/3}, \\ | \pi _2(\beta _{i-1}+\omega + \mathrm {i}\eta ) | \;\lesssim \; ( \eta +\Delta )^{1/3}, \end{array} \quad \omega \in \left[ c_*\Delta , \frac{\Delta }{2}\right] . \end{aligned} \end{aligned}$$
    (4.15c)
  • Around an extreme edge: Around the extreme points \( \alpha _1 \) and \( \beta _{K} \) of \( {{\mathrm{supp}}}\rho \), we have

    $$\begin{aligned} \begin{aligned} \begin{array}{l} | \pi _1(\alpha _1+\omega + \mathrm {i}\eta ) | \,\sim \, | \pi _1(\beta _{K}-\omega + \mathrm {i}\eta ) | \,\sim \, (\omega +\eta )^{1/2} \\ | \pi _2(\alpha _1+\omega + \mathrm {i}\eta ) | \,\sim \, | \pi _2(\beta _{K}-\omega + \mathrm {i}\eta ) | \,\sim \, 1, \end{array} \quad \omega \in [-\delta _*,\delta _*]. \end{aligned} \end{aligned}$$
    (4.15d)
  • Close to a local minimum: In a neighborhood of the local minimum in the interior of the support of the density of states, i.e. for \(\tau _0 \in {\mathbb {M}} \cap \mathrm{int}{{\mathrm{supp}}}\rho \), we have

    $$\begin{aligned} \begin{aligned} \begin{array}{l} | \pi _1(\tau _0+\omega + \mathrm {i}\eta ) | \,\sim \, \rho (\tau _0)^2+ (|\omega |+\eta )^{2/3}, \\ | \pi _2(\tau _0+\omega + \mathrm {i}\eta ) | \,\sim \, \rho (\tau _0)+ (|\omega |+\eta )^{1/3}, \end{array} \quad \omega \in [-\delta _*, \delta _* ]. \end{aligned} \end{aligned}$$
    (4.15e)

Proof

The proof is split according to the cases above. In each case we combine the general formulas (4.11) with the knowledge about the harmonic extension, \(\rho \), of the density of states from Theorem 4.1 and about the behavior of the positive Hölder-continuous function, \(\sigma \), at the minima in \({\mathbb {M}}\) from (4.14). The positive constant \(\delta _*\) is chosen to be at most as large as the constant with the same name in Theorem 4.1. We start with the simplest case.

Around an extreme edge: By the Hölder-continuity of \(\sigma \) (cf. (4.12)) and because \(\sigma \) is comparable to 1 at the points \(\alpha _1\) and \(\beta _K\) (cf. (4.14)), this function is comparable to 1 in the whole \(\delta _*\)-neighborhood of the extreme edges. Thus, using (4.11) inside this neighborhood, we find

$$\begin{aligned} | \pi _1(z)|\,\sim \, \frac{\mathrm {Im}z}{\rho (z)}+\rho (z),\quad | \pi _2(z)|\,\sim \, 1. \end{aligned}$$

The claim now follows from the behavior of \(\rho \), given in Theorem 4.1, inside this domain.
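
For instance, on the side inside the support, plugging (4.5d) into the display above gives

$$\begin{aligned} | \pi _1(\alpha _1+\omega + \mathrm {i}\eta ) | \,\sim \, \frac{\eta }{(\omega +\eta )^{1/2}}+(\omega +\eta )^{1/2} \,\sim \, (\omega +\eta )^{1/2},\quad \omega \in [ 0,\delta _* ], \end{aligned}$$

since \(\eta \le \omega +\eta \); the outside direction \(\omega \in [-\delta _*, 0 ]\) is handled in the same way, now with \(\rho \sim \eta /( |\omega |+\eta )^{1/2}\).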

Close to a local minimum: In this case \(\rho +\sigma \) is comparable to \(\rho \). In fact, using the 1/3-Hölder-continuity of \(\sigma \) (cf. (4.12)) and its bound at the minimum \(\tau _0\in {\mathbb {M}}\) (cf. (4.14)) we find

$$\begin{aligned} \rho (z)\,\le \, \rho (z)+\sigma (z) \,\lesssim \, \rho (z)+\rho (\tau _0)^2+ |z-\tau _0|^{1/3}\,\sim \, \rho (z),\quad |z-\tau _0|\le \delta _*.\nonumber \\ \end{aligned}$$
(4.16)

In the last relation we used the behavior (4.5e) of \(\rho \) from Theorem 4.1. By (4.11) we conclude that inside the \(\delta _*\)-neighborhood of \(\tau _0\),

$$\begin{aligned} \begin{aligned} | \pi _1(z)|\,\sim \, \frac{\mathrm {Im}z}{\rho (z)}+\rho (z)^2,\quad |\pi _2(z)|\,\sim \, \rho (z). \end{aligned} \end{aligned}$$
(4.17)

Using the upper and lower bounds on \(\rho (z)\) again gives the desired result (4.15e).

Around an internal edge: First we prove the bounds on \(|\pi _2|\), starting from (4.11). The upper bound simply uses the 1/3-Hölder-continuity of \(\sigma \) and its behavior at the edge points,

$$\begin{aligned} \begin{aligned} |\pi _2(z)|\,\sim \, \rho (z)+\sigma (z)\,\lesssim \, \rho (z)+ \Delta ^{1/3}+|z-\tau _0|^{1/3}, \end{aligned} \end{aligned}$$
(4.18)

where \(\tau _0\) is one of the edge points \(\alpha _i\) or \(\beta _{i-1}\). The claim follows from plugging in the size of \(\rho \) from the two corresponding domains in Theorem 4.1, i.e., the domain close to an edge, (4.5b), and the domain inside a gap, (4.5c).

For the lower bound we consider two different regimes. In the first case z is close to the edge point, \(|z-\tau _0| \le c \Delta \), for some small positive constant c, depending only on the model parameters p, P and L. We find

$$\begin{aligned} | \pi _2(z)|\,\sim \, \rho (z)+\sigma (z)\,\gtrsim \, \rho (z)+ \Delta ^{1/3} -C |z-\tau _0|^{1/3}\,\sim \,\rho (z)+ \Delta ^{1/3}, \end{aligned}$$

provided c is small enough. This bound coincides with the lower bound on \(\pi _2\) in (4.15a), once the size of \(\rho \) from (4.5b) is used.

In the second regime, \(|z-\tau _0| \ge c \Delta \), we simply use \(|\pi _2(z)|\gtrsim \rho (z)\) from (4.11). If \(\mathrm {Re}z \in {{\mathrm{supp}}}\rho \), then the size of \(\rho \) from (4.5b) yields the desired lower bound. If, on the other hand, \(\mathrm {Re}z\) lies inside a gap of \({{\mathrm{supp}}}\rho \), then we use the freedom of choosing the constant \(c_*\) in Proposition 4.3. Suppose \(c_*\le c/2\). Then \(|z-\tau _0| \ge c \Delta \) and \(|\mathrm {Re}z -\tau _0|\le c_*\Delta \) imply \(\mathrm {Im}z \gtrsim \Delta \) and

$$\begin{aligned} \rho (z)\,\sim \,(\mathrm {Im}z)^{1/3}\,\gtrsim \,\Delta ^{1/3}+|z-\tau _0|^{1/3}. \end{aligned}$$

This finishes the proof of the upper and lower bound on \(|\pi _2|\) on this domain. For the claim about \(|\pi _1|\) we plug the result about \(|\pi _2|\) and the size of \(\rho \) into

$$\begin{aligned} \begin{aligned} | \pi _1|\,\sim \, \frac{\mathrm {Im}z}{\rho (z)}+ \rho (z) |\pi _2(z)| . \end{aligned} \end{aligned}$$
(4.19)

Well inside a gap: For the upper bound on \(|\pi _2|\) we simply use (4.18) again, which follows from (4.12) and (4.14). The comparison relation for \(|\pi _1|\) now follows from (4.19) again. For the lower bound, \(|\pi _1|\gtrsim \mathrm {Im}z /\rho \) and (4.5c) from Theorem 4.1 are sufficient. This finishes the proof of the proposition. \(\square \)

4.3 Rough bound on \(\Lambda \) close to local minima

In the following lemma we will see that a.w.o.p. \(\Lambda \le c\) for any fixed, arbitrarily small constant \(c>0\). Since the local law away from \({\mathbb {M}}\) is already shown in Proposition 3.1, we may restrict to bounded z in the following. From here on until the end of Section 4 we assume \(|z|\le 10\).

Lemma 4.4

(Rough bound) Let \(\lambda _*\) be a positive constant. Then, uniformly for all \(z= \tau + \mathrm {i}\eta \in \mathbb {H}\) with \(\eta \ge N^{\gamma -1}\), the function \(\Lambda \) is small,

$$\begin{aligned} \begin{aligned} \Lambda (z)\, \le \, \lambda _*\quad \text {a.w.o.p.} \end{aligned} \end{aligned}$$
(4.20)

Proof

Away from the local minima in \({\mathbb {M}}\) the claim follows from (3.2) in Proposition 3.1. We will therefore prove that \(\Lambda \) is smaller than any fixed positive constant in some \(\delta \)-neighborhood of \({\mathbb {M}}\). We will use the freedom to choose the size \(\delta \sim 1 \) of these neighborhoods as small as we like.

Let us sketch the upcoming argument. Close to the points in \({\mathbb {M}}\) we make use of Theorem 4.2. Using Lemma 2.1, we will see that the right hand side of the cubic equation in \(\Theta \), (4.10), is smaller than a small negative power, \(N^{-\varepsilon }\), of N, provided \(\Lambda \) is bounded by a small constant, \(\Lambda \le \lambda _*\). This will imply that \(\Theta \) itself is small and through (4.8) that the bound on \(\Lambda \) can be improved to \(\Lambda \le \lambda _*/2\). In this way we establish a gap in the possible values that the continuous function \(\Lambda \) can take. Lemma A.2 in the appendix is then used to propagate the bound on \(\Lambda \) from Proposition 3.1 into the \(\delta \)-neighborhoods of the points in \({\mathbb {M}}\).

Now we start the detailed proof from the fact that \(\Theta \) satisfies the cubic equation (4.10), whose right hand side is bounded by \(C||\mathbf {d} ||_\infty \) for some constant C, depending only on the model parameters. Note that \(||\mathbf {d} ||_\infty \lesssim 1\) as long as \(\Lambda \le \lambda _*\) because in this case \(|m_i|\sim 1\), \(|g_i|\sim 1\) and \(\mathbf {g}\) satisfies the perturbed QVE with perturbation \(\mathbf {d}\). From the definition of \(\Theta \) in (4.7) and the uniform bound on \(\mathbf {s}\) from (4.13), we get \(\Theta \lesssim \Lambda \). Since the coefficient \(|\pi _2|\) is uniformly bounded (cf. (4.11)), the cubic equation for \(\Theta \) implies the three bounds

$$\begin{aligned} \Theta \,\mathbbm {1}( \Lambda \le \varepsilon _1, |\pi _1|\ge C_1\varepsilon _1) \;&\lesssim \; \frac{ ||\mathbf {d} ||_\infty }{| \pi _1|}\;, \end{aligned}$$
(4.21a)
$$\begin{aligned} \Theta \,\mathbbm {1}( \Lambda \le \varepsilon _2, |\pi _2|\ge C_2\varepsilon _2) \;&\lesssim \;\frac{|\pi _1|}{|\pi _2|}+ \frac{||\mathbf {d} ||_\infty ^{1/2}}{|\pi _2|^{1/2}}\;, \end{aligned}$$
(4.21b)
$$\begin{aligned} \Theta \,\mathbbm {1}( \Lambda \le \lambda _*) \;&\lesssim \; |\pi _2|+ \sqrt{|\pi _1|}+||\mathbf {d} ||_\infty ^{1/3} . \end{aligned}$$
(4.21c)

Here, \(\varepsilon _1,\varepsilon _2 \in (0,\lambda _*) \), with \( \lambda _*\in (0,1) \) from Theorem 4.2, are arbitrary constants and \( C_1,C_2>0 \) depend only on the model parameters. We prove (4.21b); the other two bounds are obtained similarly. First we show that under the assumptions \( \Lambda \le \varepsilon _2 \) and \(|\pi _2 | \ge C_2\varepsilon _2 \) the second order term \( \pi _2\Theta ^2 \) is at least three times larger than \( \Theta ^3 \) provided \( C_2 \sim 1 \) is chosen to be sufficiently large. Indeed, since \( \Theta \le ||\varvec{\mathrm {s}} ||_\infty \Lambda \le ||\varvec{\mathrm {s}} ||_\infty \varepsilon _2 \) and \( |\pi _2 | \ge C_2\varepsilon _2 \), it suffices to choose \( C_2 \ge 3 ||\varvec{\mathrm {s}} ||_\infty \sim 1 \). Here we have also used (4.13). Next we compare the second order term to the linear term \( \pi _1\Theta \). We may assume that \(\Theta \ge 3 |\pi _1/\pi _2 | \), otherwise (4.21b) holds trivially. Together with \( |\pi _2 |\Theta ^2 \ge 3 \Theta ^3 \) proved above this implies that the second order term \( \pi _2\Theta ^2 \) dominates the left hand side of (4.10). Combining this with \( | \langle \varvec{\mathrm {t}}^{(j)},\varvec{\mathrm {d}} \rangle | \lesssim ||\varvec{\mathrm {d}} ||_\infty \) (cf. (4.13)) on the right hand side of (4.10), hence yields

$$\begin{aligned} \begin{aligned} \frac{1}{3}|\pi _2 |\Theta ^2 \,\le \, | \Theta ^3+\pi _2\Theta ^2+\pi _1\Theta | \,\lesssim \, ||\varvec{\mathrm {d}} ||_\infty . \end{aligned} \end{aligned}$$
(4.22)

In order to satisfy the constraint of (4.10) we have also used \( \varepsilon _2 \le \lambda _*\). This together with (4.22) yields (4.21b).
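
For orientation we also record the dichotomy behind (4.21c): if both \( \Theta \ge 3 |\pi _2 | \) and \( \Theta ^2\ge 3 |\pi _1 | \), then the cubic term dominates the left hand side of (4.10), so that on the event \( \Lambda \le \lambda _* \), using \( ||\mathbf {d} ||_\infty \lesssim 1 \) as noted above,

$$\begin{aligned} \tfrac{1}{3}\,\Theta ^3 \,\le \, \big | \Theta ^3+\pi _2 \Theta ^2+\pi _1 \Theta \big | \,\lesssim \, ||\mathbf {d} ||_\infty , \end{aligned}$$

i.e. \( \Theta \lesssim ||\mathbf {d} ||_\infty ^{1/3} \); otherwise \( \Theta \lesssim |\pi _2 |+\sqrt{|\pi _1 |} \) holds trivially.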

Let \(\delta \in (0,1)\) be another constant to be chosen later which depends only on the model parameters p, P, and L. We split \({\mathbb {M}}\) into four subsets, which are treated separately,

$$\begin{aligned} \begin{aligned}&{\mathbb {M}}_1(\delta ) := \bigl \{ {\tau _0 \in {\mathbb {M}}\backslash \partial {{\mathrm{supp}}}\rho : \rho (\tau _0)> \delta ^{1/3}} \bigl \} , \\&{\mathbb {M}}_2(\delta ) := \bigl \{ {\tau _0 \in \partial {{\mathrm{supp}}}\rho : \Delta (\tau _0)> \delta ^{1/2}} \bigl \} , \\&{\mathbb {M}}_3(\delta ) := \bigl \{ {\tau _0 \in {\mathbb {M}}\backslash \partial {{\mathrm{supp}}}\rho : \rho (\tau _0)\le \delta ^{1/3}} \bigl \} , \\&{\mathbb {M}}_4(\delta ) := \bigl \{ {\tau _0 \in \partial {{\mathrm{supp}}}\rho : \Delta (\tau _0)\le \delta ^{1/2} } \bigl \} . \end{aligned} \end{aligned}$$

The function \(\Delta \) is from Definition 1.5 and its value is simply the length of the gap at the point \(\tau _0 \in \partial {{\mathrm{supp}}}\rho \) where it is evaluated. We also define the \(\delta \)-neighborhoods of these subsets,

$$\begin{aligned} \mathbb {D}_k(\delta )\,:=\, \bigl \{ {z \in \mathbb {H}:\; \mathrm {dist}(z,{\mathbb {M}}_k(\delta ))\le \delta } \bigl \} ,\quad k=1,2,3,4. \end{aligned}$$

As an immediate consequence of the upper and lower bounds on the coefficients, \(\pi _1\) and \(\pi _2\), presented in Proposition 4.3, we see that

$$\begin{aligned} |\pi _1(z)|\gtrsim & {} \delta ^{2/3},\quad z \in \mathbb {D}_1(\delta ), \end{aligned}$$
(4.23a)
$$\begin{aligned} |\pi _1(z)|\lesssim & {} \delta ^{1/2},\quad |\pi _2(z)|\,\gtrsim \,\delta ^{1/6},\quad z \in \mathbb {D}_2(\delta ), \end{aligned}$$
(4.23b)
$$\begin{aligned} |\pi _1(z)|\lesssim & {} \delta ^{1/2},\quad |\pi _2(z)|\,\lesssim \,\delta ^{1/6},\quad z \in \mathbb {D}_3(\delta )\cup \mathbb {D}_4(\delta ). \end{aligned}$$
(4.23c)

On \(\mathbb {D}_2(\delta )\) only the regimes around an internal edge, (4.15a), and around an extreme edge, (4.15d), are relevant. The case well inside the gap, (4.15b) and (4.15c), does not apply for small enough \(\delta \), since \(\Delta (\tau _0)>\delta ^{1/2}\) but \(|z-\tau _0|\le \delta \).

Now we make a choice for the two constants \(\varepsilon _1\) and \(\varepsilon _2\). We express them in terms of \(\delta \) as

$$\begin{aligned} \varepsilon _1\,:=\,\delta ,\quad \varepsilon _2\,:=\, \delta ^{1/5}. \end{aligned}$$

We pair the bounds on \(\Theta \) from (4.21) with the corresponding bounds from (4.23) on the coefficients of the cubic equation. For small enough \(\delta \) the conditions on \(\pi _1\) in (4.21a) and \(\pi _2\) in (4.21b) are automatically satisfied by the choice of \(\varepsilon _1\) and \(\varepsilon _2\), as well as the upper and lower bounds from (4.23a), (4.23b) and (4.23c). Thus, for small enough \(\delta \) we end up with

$$\begin{aligned} \Theta (z) \mathbbm {1}( \Lambda (z)\le \delta )&\lesssim \delta ^{-2/3} || \mathbf {d}(z) ||_\infty , \quad z \in \mathbb {D}_1(\delta ), \\ \Theta (z) \mathbbm {1}( \Lambda (z)\le \delta ^{1/5} )&\lesssim \delta ^{1/3}+\delta ^{-1/12}|| \mathbf {d}(z) ||_\infty ^{1/2}, \quad z \in \mathbb {D}_2(\delta ), \\ \Theta (z) \mathbbm {1}( \Lambda (z)\le \lambda _*)&\lesssim \delta ^{1/6}+|| \mathbf {d}(z) ||_\infty ^{1/3}, \quad z \in \mathbb {D}_3(\delta )\cup \mathbb {D}_4(\delta ) . \end{aligned}$$

At this stage we use Lemma 2.1 in the form of \(||\mathbf {d} ||_\infty \prec N^{-\gamma /2}\) on the set where \(\Lambda \le \lambda _*/10\), say, and (4.8) from Theorem 4.2. We may choose \(\lambda _*\) to be sufficiently small compared to the constants with the same name from these two statements. Furthermore, we choose \(\delta \) so small that \(\delta ^{1/5}\le \lambda _*\). Since \( ||\varvec{\mathrm {d}} ||_\infty \le N^{-\gamma /2+c} \) a.w.o.p. for an arbitrary \( c > 0 \) we obtain

$$\begin{aligned} \Lambda (z) \,\mathbbm {1}(\Lambda (z)\le \delta )&\lesssim \delta ^{-2/3}N^{-\gamma /3}, z \in \mathbb {D}_1(\delta ), \end{aligned}$$
(4.24a)
$$\begin{aligned} \text {a.w.o.p.} \quad \Lambda (z)\,\mathbbm {1}(\Lambda (z)\le \delta ^{1/5})&\lesssim \delta ^{1/3}+\delta ^{-1/12}N^{-\gamma /5}, \quad z \in \mathbb {D}_2(\delta ), \end{aligned}$$
(4.24b)
$$\begin{aligned} \Lambda (z)\,\mathbbm {1}(\Lambda (z)\le \lambda _*)&\lesssim \delta ^{1/6}+N^{-\gamma /7}, z \in \mathbb {D}_3(\delta )\cup \mathbb {D}_4(\delta ) . \end{aligned}$$
(4.24c)

The right hand sides, including the constants from the comparison relation, can be made smaller than any given constant \(\lambda _*\) by choosing \(\delta =\delta _*\), depending only on the model parameters, small enough and N sufficiently large. Furthermore, (4.24) establish a gap in the possible values that \(\Lambda \) can take on the \(\delta _*\)-neighborhood of any point in \({\mathbb {M}}\). By Proposition 3.1 we have the bound \(\Lambda \prec N^{-\gamma /2}\) outside these \(\delta _*\)-neighborhoods and thus also for at least one point in the boundary of each neighborhood. Now we apply Lemma A.2 to each neighborhood and in this way we propagate the bound \(\Lambda \le \lambda _*\) to every point z in the \(\delta _*\)-neighborhood of \({\mathbb {M}}\) with \(\mathrm {Im}z \ge N^{\gamma -1}\). \(\square \)

4.4 Proof of Theorem 1.7

According to Proposition 3.1 the local law, Theorem 1.7, holds outside the \(\delta _*\)-neighborhoods of the points in \({\mathbb {M}}\). It remains to show that it is true inside these neighborhoods as well. From here on we assume that \(z\in \mathbb {H}\) satisfies \(\mathrm {dist}(z,{\mathbb {M}})\le \delta _*\) and \(\mathrm {Im}z \ge N^{\gamma -1}\). Let \(\tau _0\in {\mathbb {M}}\) be one of the closest points to z in \({\mathbb {M}}\), i.e.,

$$\begin{aligned} |z-\tau _0|\,=\, \mathrm {dist}(z,{\mathbb {M}}) . \end{aligned}$$

When \( \tau _0 \in \partial {{\mathrm{supp}}}\rho \) we denote by \( \theta = \theta (\tau _0) \in \{ {\pm 1} \} \) the direction that points towards the gap in \( {{\mathrm{supp}}}\rho \) at \( \tau _0 \). In case \(\tau _0 \notin \partial {{\mathrm{supp}}}\rho \) we make the arbitrary choice \(\theta :=+1\), i.e.,

$$\begin{aligned} \theta := {\left\{ \begin{array}{ll} -1 &{} \text { if } \; \tau _0 \in \{\alpha _i\}, \\ +1 &{} \text { if } \; \tau _0 \in \{\beta _i\}, \\ +1 &{} \text { if } \; \tau _0 \in {\mathbb {M}} \backslash \partial {{\mathrm{supp}}}\rho . \end{array}\right. } \end{aligned}$$

The minimum \( \tau _0 \) will be considered fixed in the following analysis. We parametrize z as follows in the neighborhood of \(\tau _0 \in {\mathbb {M}}\):

$$\begin{aligned} \begin{aligned} z \,=\, \tau _0 + \theta \omega + \mathrm {i}\eta , \end{aligned} \end{aligned}$$
(4.25)

where \( \eta \in (0,\delta _*]\) and \(\omega \in [-\delta _*, \delta _*]\). We will then prove the local law in the form

$$\begin{aligned} \Lambda (z)&\prec \sqrt{\frac{\rho (z)}{N \eta }} + \frac{1}{N \eta }+ \mathcal {E}(\omega ,\eta ), \end{aligned}$$
(4.26a)
$$\begin{aligned} \big |\bigl \langle \mathbf {w}, \mathbf {g}(z)-\mathbf {m}(z) \bigr \rangle \big |&\prec \mathcal {E}(\omega ,\eta ), \end{aligned}$$
(4.26b)

where the positive error function \(\mathcal {E}: [-\delta _*, \delta _*] \times (0,\delta _*] \rightarrow (0,\infty )\) is given as the unique positive solution of the explicit cubic equation (4.30) below.

To define \(\mathcal {E}\) we introduce explicit auxiliary functions \(\widetilde{\pi }_1\), \(\widetilde{\pi }_2\) and \(\widetilde{\rho }\) that are comparable in size to the corresponding functions \(\pi _1\), \(\pi _2\) and \(\rho \). The reason for using these auxiliary quantities for the definition of \(\mathcal {E}\) instead of the original ones is twofold. Firstly, in this way \(\mathcal {E}\) will be an explicit function instead of one that is implicitly defined through the solution of the QVE. The function \(\mathcal {E}\) is explicit in the sense that there is a formula for the solution of the cubic equation that defines it and the coefficients are given by the explicit functions \(\widetilde{\pi }_1\), \(\widetilde{\pi }_2\) and \(\widetilde{\rho }\). Secondly, \(\mathcal {E}\) will be monotonic in its second variable, \(\eta \). This property will be used later. The definition of the three auxiliary functions will be different, depending on whether \(\tau _0\) is in the boundary of the support of the density of states or not. Recall the definition (1.17) of \(\Delta _\delta (\tau ) \).

  • Edge: Suppose \(\tau _0 \in \partial {{\mathrm{supp}}}\rho \), i.e., \(\tau _0\) is an edge of a gap of size \(\Delta :=\Delta _{0}(\tau _0)\) in the support of the density of states or an extreme edge. Then we define the three explicit functions

    $$\begin{aligned} \widetilde{\rho }(\omega ,\eta ) \,&:=\, {\left\{ \begin{array}{ll} \displaystyle \frac{(|\omega |+\eta )^{1/2}}{(\Delta +|\omega |+\eta )^{1/6}} ,\quad &{}\omega \in \bigl [-\delta _*, 0 \bigr ], \\ \displaystyle \frac{\eta }{(\Delta +\eta )^{1/6}(\omega +\eta )^{1/2}} ,\quad &{}\omega \in \bigl [ 0 ,c_*\Delta \bigr ], \\ \displaystyle \frac{\eta }{(\Delta +\eta )^{2/3}} ,\quad &{} \displaystyle \omega \in \left[ c_*\Delta , \frac{\Delta }{2}\right] . \end{array}\right. } \end{aligned}$$
    (4.27a)
    $$\begin{aligned} \widetilde{\pi }_1(\omega ,\eta ) \,&:=\, {\left\{ \begin{array}{ll} (|\omega |+\eta )^{1/2}(|\omega |+\eta +\Delta )^{1/6},\quad &{}\omega \in \bigl [-\delta _*, 0 \bigr ], \\ (\omega +\eta )^{1/2}(\Delta +\eta )^{1/6},\quad &{}\omega \in \bigl [ 0 ,c_*\Delta \bigr ], \\ (\Delta +\eta )^{2/3},\quad \displaystyle &{}\omega \in \left[ c_*\Delta , \frac{\Delta }{2}\right] \end{array}\right. } \end{aligned}$$
    (4.27b)
    $$\begin{aligned} \widetilde{\pi }_2(\omega ,\eta ) \,&:=\, {\left\{ \begin{array}{ll} (|\omega |+\eta +\Delta )^{1/3},\quad &{}\omega \in \bigl [-\delta _*, 0 \bigr ], \\ (\Delta +\eta )^{1/3},\quad &{}\omega \in \bigl [ 0 ,c_*\Delta \bigr ], \\ (\Delta +\eta )^{1/3},\quad &{} \displaystyle \omega \in \left[ c_*\Delta , \frac{\Delta }{2}\right] \end{array}\right. } \end{aligned}$$
    (4.27c)

    Here, \( c_* \sim 1 \) is the constant from Proposition 4.3.

  • Internal minimum: If \(\tau _0 \in {\mathbb {M}} \backslash \partial {{\mathrm{supp}}}\rho \), then we define for \( \omega \in [-\delta _*,\delta _*]\) the three functions

    $$\begin{aligned} \widetilde{\rho }(\omega ,\eta ) \,&:=\ \rho (\tau _0)+ (|\omega |+\eta )^{1/3}, \end{aligned}$$
    (4.28a)
    $$\begin{aligned} \widetilde{\pi }_1(\omega ,\eta ) \,&:=\, \rho (\tau _0)^2+ (|\omega |+\eta )^{2/3}, \end{aligned}$$
    (4.28b)
    $$\begin{aligned} \widetilde{\pi }_2(\omega ,\eta ) \,&:=\, \rho (\tau _0)+ (|\omega |+\eta )^{1/3}, \end{aligned}$$
    (4.28c)

By design (cf. Proposition 4.3 and Theorem 4.1) these functions satisfy

$$\begin{aligned} \begin{aligned} \rho (\tau _0 + \theta \omega + \mathrm {i}\eta ) \sim \widetilde{\rho }(\omega ,\eta ),\quad \text {and}\quad | \pi _k(\tau _0 + \theta \omega + \mathrm {i}\eta ) | \sim \widetilde{\pi }_k(\omega ,\eta ), \end{aligned} \end{aligned}$$
(4.29)

except in one special case where the second bound does not hold, namely when \(k=2\), \(\tau _0 \in \partial {{\mathrm{supp}}}\rho \) and \(\omega \in [c_*\Delta ,\Delta /2]\). In this case only the direction \(|\pi _2| \lesssim \widetilde{\pi }_2\) is true (cf. (4.15c)).

We fix a positive constant \(\widetilde{\varepsilon }\in (0, \gamma /16)\). The value of the function \(\mathcal {E}\) at \((\omega , \eta )\) is then defined to be the unique positive solution of the cubic equation

$$\begin{aligned} \begin{aligned}&\mathcal {E}(\omega , \eta )^3 + \widetilde{\pi }_2(\omega , \eta ) \mathcal {E}(\omega , \eta )^2 + \widetilde{\pi }_1(\omega , \eta ) \mathcal {E}(\omega , \eta ) \\&\quad = N^{8 \widetilde{\varepsilon }}\;\frac{\mathcal {E}(\omega , \eta )}{N\eta } +\frac{\widetilde{\rho }(\omega , \eta )}{N\eta } +\frac{1}{(N\eta )^2}. \end{aligned} \end{aligned}$$
(4.30)

With the choices (1.23) and (1.25) for \( \kappa = \kappa (z) \) we have

$$\begin{aligned} \begin{aligned} \mathcal {E}\,\le \, N^{9 \widetilde{\varepsilon }}\min \left\{ \, {\frac{1}{\sqrt{N \eta }}, \frac{\kappa }{N\eta }} \,\right\} , \end{aligned} \end{aligned}$$
(4.31)

for any \(N \ge N_0\), where the threshold \(N_0\) here depends on \(\widetilde{\varepsilon }\) in addition to p, P, L, \(\underline{\mu }\) and \(\gamma \). The inequality (4.31) is verified by plugging its right hand side into (4.30) in place of \(\mathcal {E}\) and checking that in each regime the resulting right hand side of (4.30) is smaller than the resulting left hand side; since the difference of the left and right hand sides of (4.30), viewed as a function of \(\mathcal {E}\), is negative at \(\mathcal {E}=0\) and has exactly one positive zero, this implies that \(\mathcal {E}\) is indeed bounded by the right hand side of (4.31). The factor of \(N^{9 \widetilde{\varepsilon }}\) in (4.31) can be absorbed in the stochastic domination in (4.26). Thus (4.26) becomes (1.20) and (1.21) of Theorem 1.7.
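
As a sanity check, and again purely as an illustration, the unique positive root of (4.30) can be computed numerically for given (hypothetical) values of \(\widetilde{\pi }_1\), \(\widetilde{\pi }_2\) and \(\widetilde{\rho }\); there is exactly one positive real root by Descartes' rule of signs, since the constant term of the cubic is negative while the two leading coefficients are nonnegative. The sketch below selects this root via numpy.

```python
import numpy as np

def error_function_E(pi1, pi2, rho, N, eta, eps_tilde=0.01):
    """Unique positive solution E of the cubic (4.30):
       E^3 + pi2*E^2 + pi1*E = N**(8*eps_tilde)*E/(N*eta) + rho/(N*eta) + 1/(N*eta)**2.

    eps_tilde plays the role of the constant from (0, gamma/16); its value here is hypothetical.
    """
    Neta = N * eta
    coeffs = [1.0, pi2, pi1 - N**(8 * eps_tilde) / Neta, -rho / Neta - 1.0 / Neta**2]
    roots = np.roots(coeffs)
    # exactly one real positive root exists; select it
    return max(r.real for r in roots if abs(r.imag) < 1e-8 and r.real > 0)

# hypothetical bulk-type input: pi1 ~ pi2 ~ rho ~ 1 and N*eta = 100
print(error_function_E(pi1=1.0, pi2=1.0, rho=1.0, N=10**4, eta=10**-2))
```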

Before we start the proof of the local law (4.26), let us motivate the definition of \(\mathcal {E}\). As a consequence of Lemma 4.4 the indicator function equals one a.w.o.p. in the statement of Lemma 2.1. Thus, uniformly in the \(\delta _*\)-neighborhood of \(\tau _0\) we have

$$\begin{aligned} \begin{aligned} ||\mathbf {d} ||_{\infty } +\Lambda _{\mathrm {o}} \,\prec \, \sqrt{\frac{ \rho +|\langle \varvec{\mathrm {g}}-\varvec{\mathrm {m}} \rangle |}{N\eta }} + \frac{1}{\sqrt{ N }} . \end{aligned} \end{aligned}$$
(4.32)

Here we used \(\mathrm {Im}\langle \varvec{\mathrm {g}} \rangle \lesssim \rho + |\langle \mathbf {g}-\mathbf {m} \rangle |\). Since at the end the local law implies \(|\langle \varvec{\mathrm {g}}-\varvec{\mathrm {m}} \rangle |\prec \mathcal {E}\), heuristically we may replace \(|\langle \varvec{\mathrm {g}}-\varvec{\mathrm {m}} \rangle |\) in (4.32) by \(\mathcal {E}\). In this case, from the fluctuation averaging, Theorem 3.5, we would be able to conclude that for any deterministic vector \(\mathbf {w}\) with bounded entries,

$$\begin{aligned} \begin{aligned} ||\mathbf {d} ||_{\infty }^2 + |\langle \mathbf {w},\mathbf {d} \rangle | \;\prec \; \frac{\mathcal {E}}{N\eta }+\frac{\rho }{N\eta }+\frac{1}{(N\eta )^2}. \end{aligned} \end{aligned}$$
(4.33)

Up to the technical factor of \(N^{8 \widetilde{\varepsilon }}\) the right hand side coincides with the right hand side of the cubic equation (4.30) defining \(\mathcal {E}\). On the other hand, the right hand side of the cubic equation (4.10) for the quantity \(\Theta \) from Theorem 4.2 is of the same form as the left hand side of (4.33). Therefore, we infer

$$\begin{aligned} \begin{aligned} | \Theta ^3+\pi _2 \Theta ^2+\pi _1 \Theta |\,\prec \,\frac{\mathcal {E}}{N\eta }+\frac{\rho }{N\eta }+\frac{1}{(N\eta )^2}. \end{aligned} \end{aligned}$$
(4.34)

We will argue that, on appropriately chosen domains, one of the three summands in the cubic expression in \(\Theta \) always dominates the other two by far. Therefore, the error function \(\mathcal {E}\), defined by (4.30), is essentially the best bound on \(\Theta \) that one may hope to deduce from (4.34). Indeed, since \(\Theta \) is by definition an average of \(\mathbf {g}-\mathbf {m}\), we expect \(\Theta \prec \mathcal {E}\).

We will now prove (4.26). To this end we gradually improve the bound on \(\Theta \). Fix some \(\varepsilon \in (0,\widetilde{\varepsilon })\). The sequence of deterministic bounds on this quantity is defined as

$$\begin{aligned} \begin{aligned} \Phi _0:= 1, \quad \Phi _{k+1}:= \max \bigl \{ { N^{- \varepsilon } \Phi _k , \, N^{ 9 \varepsilon }\mathcal {E}} \bigl \} . \end{aligned} \end{aligned}$$
(4.35)
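As a small aside, the recursion (4.35) is elementary: starting from \(\Phi _0=1\), each step shrinks the bound by \(N^{-\varepsilon }\) until the target value \(N^{9\varepsilon }\mathcal {E}\) is reached, and since (4.30) forces \(\mathcal {E}\gtrsim (N\eta )^{-2}\ge N^{-2}\), this takes only \(O(1/\varepsilon )\) steps. A toy computation (with purely illustrative values of N, \(\varepsilon \) and \(\mathcal {E}\)) makes the uniform step count explicit:

```python
import math

def steps_to_target(E, N, eps):
    """Iterate (4.35): Phi_0 = 1, Phi_{k+1} = max(N^-eps * Phi_k, N^(9 eps) * E);
    return the number of steps until the target N^(9 eps) * E is reached."""
    target = N ** (9 * eps) * E
    Phi, steps = 1.0, 0
    while Phi > target:
        Phi = max(N ** (-eps) * Phi, target)
        steps += 1
    return steps

N, eps = 10 ** 6, 0.05
# E >~ (N eta)^{-2} >= N^{-2}, so the step count is O(1/eps), uniformly in (omega, eta).
for E in [1e-2, 1e-4, 1e-6]:
    print(f"E = {E:.0e}: target reached after {steps_to_target(E, N, eps)} steps "
          f"(crude bound: {math.ceil(2 / eps)})")
```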

From here on until the end of this section the threshold function \(N_0\) from the definition of the stochastic domination (cf. Definition 1.6) as well as the definition of 'a.w.o.p.' (cf. Definition 1.9) may depend on \(\varepsilon \) in addition to p, P, L, \(\underline{\mu }\) and \(\gamma \). At the end of the proof we will remove this dependence. The following lemma is essential for doing one step in the upcoming iteration.

Lemma 4.5

(Improving bound through cubic) Suppose that for all \(z\in \tau _0 + [-\delta _*,\delta _*]+\mathrm {i}[N^{\gamma -1},\delta _*]\) and some \(k \in \mathbb {N}\) the quantity \(\Theta (z)\) from (4.7) fulfills

$$\begin{aligned} \begin{aligned} \big |\,\Theta (z)^3 +\pi _2(z) \Theta (z)^2+\pi _1(z) \Theta (z)\, \big | \,\prec \, \frac{\rho (z)+\Phi _k(\omega ,\eta )}{N\eta }+\frac{1}{(N\eta )^2}. \end{aligned} \end{aligned}$$
(4.36)

Then \(\Theta (z)\prec \Phi _{k+1}(\omega ,\eta )\).

We will postpone the proof of this lemma until the end of this section. First we show how to use this result to prove the local law (Theorem 1.7). Fix an integer \(k\ge 0\) and assume that \(\Theta +|\langle \varvec{\mathrm {g}}-\varvec{\mathrm {m}} \rangle |\prec \Phi _k\) is already proven. For \(k=0\) this follows from the rough bound on \(\Lambda \) in Lemma 4.4, \(\Lambda \prec 1=\Phi _0\). As an induction step we show that \(\Theta +|\langle \varvec{\mathrm {g}}-\varvec{\mathrm {m}} \rangle |\prec \Phi _{k+1} \).

From (4.32) we see that

$$\begin{aligned} \begin{aligned} ||\mathbf {d} ||_{\infty } + \Lambda _{\mathrm {o}} \,\prec \, \sqrt{\frac{\rho +\Phi _k}{N\eta }}+\frac{1}{N\eta }. \end{aligned} \end{aligned}$$
(4.37)

The right hand side is a deterministic bound on the off-diagonal error \(\Lambda _{\mathrm {o}}\). Therefore the fluctuation averaging (Theorem 3.5) is applicable to \( \langle \varvec{\mathrm {t}}^{(1)},\varvec{\mathrm {d}} \rangle \) and \( \langle \varvec{\mathrm {t}}^{(2)},\varvec{\mathrm {d}} \rangle \) on the right hand side of the cubic equation (4.10),

$$\begin{aligned} \big |\langle \varvec{\mathrm {t}}^{(j)},\varvec{\mathrm {d}} \rangle \big | \,\prec \, \left( \sqrt{\frac{\rho +\Phi _k}{N\eta }}+\frac{1}{N\eta }\right) ^2, \end{aligned}$$

where \( N^{-1} \) from (3.15) has been neglected since \( \rho \gtrsim \eta \). In this way we see that the hypothesis (4.36) of Lemma 4.5 is satisfied. Using the lemma, the bound on \(\Theta \) improves to

$$\begin{aligned} \begin{aligned} \Theta (z)\,\prec \, \Phi _{k+1}(\omega ,\eta ). \end{aligned} \end{aligned}$$
(4.38)

In order to improve the bound on \(| \langle \varvec{\mathrm {g}} - \varvec{\mathrm {m}} \rangle |\) as well, we use the bound (4.9) from Theorem 4.2 for averages of \(\mathbf {g}-\mathbf {m}\) against bounded vectors. Since by Lemma 4.4 the deviation function \(\Lambda \) is bounded by a small constant, the indicator function in (4.9) is a.w.o.p. non-zero. Choosing \(\mathbf {w}=(1, \dots ,1)\), we find that

$$\begin{aligned} \begin{aligned} |\langle \mathbf {g}-\mathbf {m} \rangle | \,\lesssim \, \Theta + ||\mathbf {d} ||_\infty ^2 + |\langle \widetilde{\mathbf { w}}, \mathbf {d} \rangle |,\quad \text {a.w.o.p.}, \end{aligned} \end{aligned}$$
(4.39)

where \(\widetilde{\mathbf {w}}=\mathbf {T}\mathbf {w}\) is a bounded, \(||\widetilde{\mathbf {w}} ||_\infty \lesssim 1\), deterministic vector. Together with the bound (4.37) we apply the fluctuation averaging (Theorem 3.5) again,

$$\begin{aligned} \begin{aligned} |\langle \mathbf {g}-\mathbf {m} \rangle | \;\prec \; \Phi _{k+1} + \frac{\rho +\Phi _k}{N\eta }+\frac{1}{(N\eta )^2} \;\lesssim \; N^{-\varepsilon }\Phi _k+\Phi _{k+1} \;\lesssim \; \Phi _{k+1} . \end{aligned} \end{aligned}$$
(4.40)

This concludes one step in the iteration, i.e., we have shown \(\Theta +|\langle \varvec{\mathrm {g}}-\varvec{\mathrm {m}} \rangle |\prec \Phi _{k+1}\).

We repeat this step finitely many times and each time improve \(\Phi _k\) by a factor of \(N^{-\varepsilon }\) until it reaches its target value \(N^{9 \varepsilon }\mathcal {E}\) and is not improved anymore. Note that all constants in our estimates, explicit and hidden, depend only on the model parameters and \( \varepsilon \). In particular, the number of steps needed is uniform in \( (\omega ,\eta )\). At that stage we have

$$\begin{aligned} \Theta +|\langle \mathbf {g}-\mathbf {m} \rangle | \,\prec _{\varepsilon }\, N^{ 9 \varepsilon }\mathcal {E}, \end{aligned}$$

where the subscript \(\varepsilon \) indicates that the threshold \(N_0\) from the stochastic domination may depend on \(\varepsilon \). But since \(\varepsilon > 0 \) was arbitrary, we infer (cf. (i) of Lemma A.1) that \( \Theta +|\langle \varvec{\mathrm {g}}-\varvec{\mathrm {m}} \rangle |\prec \mathcal {E}\), where from now until the start of the proof of Lemma 4.5 below the stochastic domination is \(\varepsilon \)-independent. By (4.32) we conclude

$$\begin{aligned} \begin{aligned} ||\mathbf {d} ||_{\infty } + \Lambda _{\mathrm {o}}\,\prec \, \sqrt{\frac{\rho }{N \eta }}+\frac{1}{N \eta }+ \mathcal {E}. \end{aligned} \end{aligned}$$
(4.41)

For the bound on the diagonal contribution, \(\Lambda _{\mathrm {d}}\), we use (4.8) to get

$$\begin{aligned} \Lambda _{\mathrm {d}}\,\lesssim \, \Theta + ||\mathbf {d} ||_{\infty } \,\prec \, \sqrt{\frac{\rho }{N \eta }}+\frac{1}{N \eta } + \mathcal {E}. \end{aligned}$$

Finally, with the help of (4.9), (4.41) and the fluctuation averaging, we prove the bound on averages of \(\mathbf {g}-\mathbf {m}\) against any bounded, \(||\mathbf {w} ||_\infty \le 1\), deterministic vector,

$$\begin{aligned} |\langle \mathbf {w},\mathbf {g}-\mathbf {m} \rangle | \,\prec \, \frac{\rho }{N \eta }+\frac{1}{(N \eta )^2}+\Theta \,\prec \,\frac{\rho }{N \eta }+\frac{1}{(N \eta )^2}+\mathcal {E}. \end{aligned}$$

This finishes the proof of Theorem 1.7 apart from the proof of Lemma 4.5 which we will tackle now.

Proof of Lemma 4.5

The spectral parameter \(z=\tau _0 + \theta \omega + \mathrm {i}\eta \) lies inside the \(\delta _*\)-neighborhood of \(\tau _0\). We fix \(\omega \in [-\delta _*,\delta _*]\) and show that the claim holds for any choice of \(\eta \in [N^{\gamma -1},\delta _*]\). We split the interval of possible values of \(\eta \) into two or three regimes, depending on the case we are treating.

  • Edge: If \(\tau _0 \in \partial {{\mathrm{supp}}}\rho \) is an edge of a gap of size \(\Delta :=\Delta _{0}(\tau _0)\), then we define

    $$\begin{aligned} \begin{aligned} I_1(\omega )\,&:=\, \left\{ \, {\eta \in [N^{\gamma -1},\delta _*]:\, \frac{(|\omega |+\eta )^{1/2}}{(|\omega |+\eta +\Delta )^{1/6}} \ge N^{-5 \varepsilon }\Phi _k(\omega ,\eta ) } \,\right\} , \\ I_2(\omega )\,&:=\, \left\{ \eta \in [N^{\gamma -1},\delta _*]:\, N^{5 \varepsilon }\frac{(|\omega |+\eta )^{1/2}}{(|\omega |+\eta +\Delta )^{1/6}} \le \Phi _k(\omega ,\eta ) \right. \\&\quad \left. \le N^{2 \varepsilon }(|\omega |+\eta +\Delta )^{1/3}\right\} , \\ I_3(\omega )\,&:=\, \left\{ \, { \eta \in [N^{\gamma -1},\delta _*]:\, (|\omega |+\eta +\Delta )^{1/3} \le N^{-2 \varepsilon }\Phi _k(\omega ,\eta ) } \,\right\} . \end{aligned} \end{aligned}$$

    If either of the two regimes \(I_l(\omega )\) with \(l=2,3 \) consists of a single point only, then we set \(I_l(\omega ):=\emptyset \).

  • Internal minimum: If \(\tau _0 \in {\mathbb {M}} \backslash \partial {{\mathrm{supp}}}\rho \), then we set \(I_2(\omega ):=\emptyset \) and define

    $$\begin{aligned} \begin{aligned} I_1(\omega )\,&:=\, \Bigl \{ { \eta \in [N^{\gamma -1},\delta _*]: \, \rho (\tau _0)+(|\omega |+\eta )^{1/3} \ge N^{-2 \varepsilon }\Phi _k(\omega ,\eta )} \Bigr \} , \\ I_3(\omega )\,&:=\, \Bigl \{ { \eta \in [N^{\gamma -1},\delta _*]: \, \rho (\tau _0)+(|\omega |+\eta )^{1/3} \le N^{-2 \varepsilon }\Phi _k(\omega ,\eta )} \Bigr \} . \end{aligned} \end{aligned}$$

    If \(I_3(\omega )\) consists of a single point only, then we set \(I_3(\omega ):=\emptyset \).

Fig. 3: The shaded area is forbidden for \( \Theta \). Since the continuous function \( \Theta \) lies below this region at \( \eta = \delta _*\), it stays below it for any \( \eta \ge N^{\gamma -1} \), hence proving \( \Theta \le \Phi _{k+1}\).

In the cubic equation (4.30), used to define the error function \(\mathcal {E}\), the coefficients \(\widetilde{\pi }_1\) and \(\widetilde{\pi }_2\) on the left hand side are monotonically increasing functions of \(\eta \). The linear and the constant coefficient of \(\mathcal {E}\) on the right hand side are monotonically decreasing in \(\eta \). Thus, \(\mathcal {E}\) itself is a monotonically decreasing function of \(\eta \). From this fact and the definition of the regimes \(I_1\), \(I_2\) and \(I_3\) we see that \(I_1=[\eta _1,\delta _*]\), \(I_2=[\eta _2,\eta _1]\) and \(I_3=[N^{\gamma -1},\eta _2]\) for some \(\eta _1, \eta _2 \in [N^{\gamma -1},\delta _*]\). Here, we interpret \(I_2=\emptyset \) if \(\eta _1\le \eta _2\) and \(I_3=\emptyset \) if \(\eta _2 \le N^{\gamma -1}\).

Now we define a z-dependent indicator function

$$\begin{aligned} \begin{aligned}&\chi (\omega ,\eta ) \\&\quad :=\, {\left\{ \begin{array}{ll} \mathbbm {1}\Bigl ( N^{-7\varepsilon }\Phi _k(\omega ,\eta ) \le \Theta (\tau _0 + \theta \omega + \mathrm {i}\eta ) \le N^{-6 \varepsilon }\Phi _k(\omega ,\eta ) \Bigr ) &{} \text { if } \;\eta \in I_{1}(\omega ) \\ \mathbbm {1}\Bigl ( N^{-4\varepsilon }\Phi _k(\omega ,\eta ) \le \Theta (\tau _0 + \theta \omega + \mathrm {i}\eta ) \le N^{-3\varepsilon }\Phi _k(\omega ,\eta ) \Bigr ) &{} \text { if } \;\eta \in I_{2}(\omega ) \\ \mathbbm {1}\Bigl ( N^{-\varepsilon }\Phi _k(\omega ,\eta ) \le \Theta (\tau _0 + \theta \omega + \mathrm {i}\eta ) \le \Phi _k(\omega ,\eta ) \Bigr ) &{} \text { if }\; \eta \in I_{3}(\omega ) \end{array}\right. }. \end{aligned} \end{aligned}$$
(4.42)

This function restricts the values of \(\Theta \) to a small interval just below the deterministic control parameter \(\Phi _k\). We will prove that \(\Theta \) cannot take these values, i.e. \(\chi =0\) a.w.o.p. Figure 3 illustrates this argument. Compared to Figure 6.1 in [14] we see that instead of two there are now three domains, \(I_1(\omega )\), \(I_2(\omega )\) and \(I_3(\omega )\), to be distinguished. The reason for this extra complication is that (4.10) is cubic in \(\Theta \), compared to the quadratic equation for [v] that appeared in the proof of Lemma 6.2 in [14]. To see that \(\chi =0\), first note that the choice of the domains \(I_l\) ensures that, whenever \(\chi \) does not vanish, there is always one summand on the left hand side of the cubic equation (4.10) for \(\Theta \) which dominates the two others by a factor \(N^\varepsilon \). In fact, by construction we have:

The random functions \(\Theta \) and \(\chi \) satisfy a.w.o.p.

$$\begin{aligned} \begin{aligned}&\Bigl (\Theta (z)^3 + \widetilde{\pi }_2(\omega ,\eta ) \Theta (z)^2+ \widetilde{\pi }_1(\omega ,\eta ) \Theta (z) \Bigr )\,\chi (\omega , \eta ) \\&\quad \lesssim \; \Bigl |\Theta (z)^3 +\pi _2(z) \Theta (z)^2+\pi _1(z) \Theta (z) \Bigr |\;. \end{aligned} \end{aligned}$$
(4.43)

We will verify this fact at the end of the proof of this lemma. For now we simply use it. First we combine the assumption (4.36) of the lemma and (4.43) to obtain

$$\begin{aligned} N^{-\varepsilon }( \Theta ^3 + \widetilde{\pi }_2\Theta ^2 + \widetilde{\pi }_1\Theta )\;\chi \,\le \, \frac{ \widetilde{\rho }+\Phi _k}{N\eta } + \frac{1}{(N\eta )^2} \quad \text {a.w.o.p.} \end{aligned}$$

Here we also gave up a factor of \(N^\varepsilon \) to get an inequality instead of the stochastic domination, and replaced \( \rho \) by the comparable quantity \( \widetilde{\rho } \). By the definition of the indicator function \(\chi \) we have \(\Theta \chi \ge N^{-7 \varepsilon } \Phi _k \chi \). Using this to bound the left hand side, and that \( \varepsilon \le \widetilde{\varepsilon } \), we obtain

$$\begin{aligned} \big ({\mathcal {R}}^3 +\widetilde{\pi }_2 {\mathcal {R}}^2+\widetilde{\pi }_1 {\mathcal {R}} \big ) \chi\le & {} N^{8 \widetilde{\varepsilon }}\frac{{\mathcal {R}}}{N\eta }+ \frac{\widetilde{\rho }}{N\eta }+\frac{1}{(N\eta )^2},\quad \text {a.w.o.p.}, \\ {\mathcal {R}}:= & {} N^{-8 \varepsilon }\Phi _k . \end{aligned}$$

Comparing this with the defining Eq. (4.30) for \( \mathcal {E}\) we conclude that a.w.o.p. \( N^{-8 \varepsilon }\Phi _k \chi \le \mathcal {E}\).

On the other hand, by the definition of \(\Phi _k\) in (4.35) we know that \(\Phi _k> N^{8 \varepsilon }\mathcal {E}\). These two inequalities yield

$$\begin{aligned} \begin{aligned} \chi (\omega ,\eta )\,=\, 0,\quad \eta \in [N^{\gamma -1},\delta _*],\quad \text {a.w.o.p.} \end{aligned} \end{aligned}$$
(4.44)

Now we successively, for \(l=1,2,3\), apply Lemma A.2 on the connected domains \(\tau _0 + \theta \omega + \mathrm {i}I_l(\omega )\) with the choices \(\varphi :=\Theta \) and

$$\begin{aligned} \Phi (\tau _0 + \theta \omega +\mathrm {i}\eta ):= & {} {\left\{ \begin{array}{ll} N^{-6 \varepsilon }\Phi _k(\omega ,\eta ) &{} \text { if }\; l=1, \\ N^{-3 \varepsilon }\Phi _k(\omega ,\eta ) &{} \text { if }\; l=2, \\ \Phi _k(\omega ,\eta ) &{} \text { if }\; l=3, \end{array}\right. } \\ z_0:= & {} {\left\{ \begin{array}{ll} \tau _0 + \theta \omega +\mathrm {i}\delta _* &{} \text { if }\; l=1, \\ \tau _0 + \theta \omega +\mathrm {i}\eta _1 &{} \text { if }\; l=2, \\ \tau _0 + \theta \omega +\mathrm {i}\eta _2 &{} \text { if }\; l=3, \end{array}\right. } \end{aligned}$$

where as explained after the definition of \(I_1\), \(I_2\) and \(I_3\) above we have \(I_1=[\eta _1,\delta _*]\), \(I_2=[\eta _2,\eta _1]\) and \(I_3=[N^{\gamma -1},\eta _2]\). The condition (A.1) of the lemma is satisfied by the definition of \(\Theta \) in (4.7), the Hölder-continuity of the solution of the QVE, the weak Lipschitz-continuity of \(\mathbf {g}\) with Lipschitz-constant \(N^{ 2}\) and the Hölder-continuity of \(\mathbf {s}\) from (4.12). The gap condition, (A.2), holds because of (4.44) and the definition of \(\chi \) and \(\Phi \) for an appropriate choice of the exponent \(D_3\).

The condition, \(\varphi (z_0) \le \Phi (z_0)\) a.w.o.p., necessary for the application of Lemma A.2 on the first domain, \(\tau _0 + \theta \omega + \mathrm {i}I_1(\omega )\), is obtained from Proposition 3.1. With Lemma A.2 we propagate the bound to all \(z \in \tau _0 + \theta \omega + \mathrm {i}I_1(\omega )\). Now we apply Lemma A.2 on the second domain \(\tau _0 + \theta \omega + \mathrm {i}I_2(\omega )\), provided \(I_2(\omega )\) is not empty. The bound (A.3) for the new \(z_0=\tau _0 + \theta \omega + \mathrm {i}\eta _1\) is obtained from the previous step. Finally, we apply Lemma A.2 to \(\tau _0 + \theta \omega + \mathrm {i}I_3(\omega )\), in case it is not empty, with the new choice \(z_0=\tau _0 + \theta \omega + \mathrm {i}\eta _2\). In total, we applied the lemma at most three times. Through this procedure we prove that a.w.o.p. \(\Theta (z) \le \Phi (z)\) for all \(z \in \tau _0 + \theta \omega + \mathrm {i}[N^{\gamma -1},\delta _*]\). On the third domain, \(\tau _0 + \theta \omega + \mathrm {i}I_3(\omega )\), we use that a.w.o.p. \(\chi =0\) (cf. (4.44)) and thus a.w.o.p. \(\Theta (z) \le N^{-\varepsilon }\Phi _k\). Altogether we showed that in the \(\delta _*\)-neighborhood of \(\tau _0\),

$$\begin{aligned} \text {a.w.o.p.}\quad \Theta (z)\,\le \, N^{-\varepsilon }\Phi _k\,\le \, \Phi _{k+1}. \end{aligned}$$

This finishes the proof of Lemma 4.5 up to verifying the claim (4.43).

For the proof of (4.43) one verifies case by case that on \(I_1\) the term \(\widetilde{\pi }_1 \Theta \sim |\pi _1| \Theta \) is bigger than the two other terms, \(\widetilde{\pi }_2 \Theta ^2\) and \(\Theta ^3\), by a factor of \(N^\varepsilon \). On \(I_3\), if it is not empty, the term \(\Theta ^3\) is the biggest. On \(I_2\), if it is not empty, we have \(|\pi _2| \sim \widetilde{\pi }_2\) and the term \(\widetilde{\pi }_2 \Theta ^2\) is the biggest, again by a factor of \(N^\varepsilon \). More specifically, when \( \eta \in I_j \) and \( \chi = \chi (\omega ,\eta ) = 1 \) we show

$$\begin{aligned} \big |\Theta ^3 + \pi _2\Theta ^2 +\pi _1\Theta \big | \,\sim \, |\pi _j |\Theta ^j \,\sim \, \widetilde{\pi }_j\Theta ^j \,\sim \; \Theta ^3 + \widetilde{\pi }_2\Theta ^2 +\widetilde{\pi }_1\Theta , \end{aligned}$$

where \( \pi _3 = \widetilde{\pi }_3 := 1 \). As an example we demonstrate these relations in a few cases:

  • Well inside a gap: If \(\tau _0 \in \partial {{\mathrm{supp}}}\rho \) and \(\omega \in [c_*\Delta , \Delta /2]\) then \(I_2(\omega )=\emptyset \). We now check that on \(I_1(\omega )\) the linear term in \(\Theta \) is the biggest while on \(I_3(\omega )\) the cubic term dominates. First, let \(\eta \in I_1(\omega )\). Then the following chain of inequalities holds,

    $$\begin{aligned} \widetilde{\pi }_1 \Theta \,\sim \, |\pi _1| \Theta \,\sim \, (\Delta +\eta )^{2/3} \Theta\gtrsim & {} N^{-5 \varepsilon } (\Delta +\eta )^{1/3} \Phi _k \Theta \,\sim \, N^{-5 \varepsilon } \widetilde{\pi }_2 \Phi _k \Theta \\\gtrsim & {} N^{-10 \varepsilon } \Phi _k^2 \Theta . \end{aligned}$$

    Here, we used (4.29), (4.15b), the definition of \(I_1(\omega )\) and (4.27c) in the form \(\widetilde{\pi }_2 \sim (\Delta +\eta )^{1/3}\). Now we can use \( \chi \) to replace \( \Phi _k \) by \( \Theta \). By definition of \(\chi \) and since \(\widetilde{\pi }_k \gtrsim |\pi _k|\) for \(k=1,2\) we also get

    $$\begin{aligned} N^{-5 \varepsilon } \widetilde{\pi }_2 \Phi _k \Theta \chi \ge N^{\varepsilon } \widetilde{\pi }_2 \Theta ^2 \chi \,\gtrsim \, N^{\varepsilon } |\pi _2|\Theta ^2 \chi , \quad N^{-10 \varepsilon } \Phi _k^2 \Theta \chi \ge N^{2 \varepsilon } \Theta ^3 \chi . \end{aligned}$$

    We conclude that on \(I_1(\omega )\) the linear term in \(\Theta \) dominates the others,

    $$\begin{aligned} \widetilde{\pi }_1 \Theta \chi \,\gtrsim \,N^{\varepsilon }(\Theta ^3+ \widetilde{\pi }_2 \Theta ^2) \chi . \end{aligned}$$

    Suppose now that \(\eta \in I_3(\omega )\). In this case, using the choice of the indicator function \(\chi \),

    $$\begin{aligned} \Theta ^3 \chi \,\ge \, N^{-\varepsilon }\Phi _k \Theta ^2 \chi \,\ge \, N^{-2 \varepsilon }\Phi _k^2 \Theta \chi . \end{aligned}$$

    By definition of \(I_3(\omega )\) and (4.27c) we find that

    $$\begin{aligned} N^{-\varepsilon }\Phi _k \Theta ^2\gtrsim & {} N^{\varepsilon }(\Delta +\eta )^{1/3} \Theta ^2 \,\sim \, N^{\varepsilon }\widetilde{\pi }_2 \Theta ^2, \\ N^{-2 \varepsilon }\Phi _k^2 \Theta\gtrsim & {} N^{2 \varepsilon }(\Delta +\eta )^{2/3} \Theta \,\sim \, N^{2 \varepsilon }\widetilde{\pi }_1 \Theta . \end{aligned}$$

    Altogether we find that the cubic term dominates the two others,

    $$\begin{aligned} \Theta ^3 \chi \gtrsim N^\varepsilon (\widetilde{\pi }_2 \Theta ^2+\widetilde{\pi }_1 \Theta ) \chi . \end{aligned}$$
  • Inside a gap close to an edge on \(I_2\): If \(\tau _0 \in \partial {{\mathrm{supp}}}\rho \), \(\omega \in [0,c_*\Delta ]\) and \(\eta \in I_2(\omega )\), then we will show the quadratic term in \(\Theta \) dominates the two other terms. We have

    $$\begin{aligned} |\pi _2| \Theta ^2\sim \widetilde{\pi }_2 \Theta ^2\,\sim \, (\Delta +\eta )^{1/3} \Theta ^2\,\gtrsim \, N^{-2 \varepsilon }\Phi _k \Theta ^2, \end{aligned}$$

    where in the inequality we used the definition of \(I_2(\omega )\). The choice of \(\chi \) guarantees that \(\Phi _k \chi \ge N^{3 \varepsilon } \Theta \chi \). Thus, the quadratic term is larger than the cubic term by a factor of \(N^\varepsilon \). On the other hand

    $$\begin{aligned} (\Delta +\eta )^{1/3} \Theta ^2 \chi \,&\gtrsim \,N^{-4 \varepsilon }(\Delta +\eta )^{1/3}\Phi _k \Theta \\&\gtrsim \, N^\varepsilon (\omega +\eta )^{1/2}(\Delta +\eta )^{1/6} \Theta \,\sim \, N^\varepsilon \widetilde{\pi }_1 \Theta \\&\sim \, N^\varepsilon |\pi _1|\Theta . \end{aligned}$$

    Here, in the first inequality we used the indicator function \(\chi \) and in the second inequality the definition of \(I_2(\omega )\). Altogether, we arrive at

    $$\begin{aligned} \widetilde{\pi }_2 \Theta ^2 \chi \,\gtrsim \, N^\varepsilon (\Theta ^3+ \widetilde{\pi }_1 \Theta ) \chi . \end{aligned}$$
  • Internal minimum on \(I_1\): If \(\tau _0 \in {\mathbb {M}} \backslash \partial {{\mathrm{supp}}}\rho \) and \(\eta \in I_1(\omega )\), then the linear term is the biggest,

    $$\begin{aligned}&|\pi _1|\Theta \,\sim \, \widetilde{\pi }_1 \Theta \,\sim \, \big (\rho (\tau _0)^2 + (|\omega |+\eta )^{2/3}\big ) \Theta \\&\quad \gtrsim \, N^{-2 \varepsilon }\big (\rho (\tau _0) + (|\omega |+\eta )^{1/3}\big )\Phi _k \Theta . \end{aligned}$$

    Here, we used (4.29) and the definitions of \(\widetilde{\pi }_1\) and \(I_1(\omega )\), respectively. Since \(\Phi _k \chi \ge N^{6 \varepsilon }\Theta \chi \), this, together with the definition of \(\widetilde{\pi }_2 \), shows that the linear term is larger than the quadratic term by a factor of \(N^{4 \varepsilon }\). In order to compare the linear with the cubic term we estimate further. By definition of \(I_1(\omega )\),

    $$\begin{aligned} N^{-2 \varepsilon }\bigl (\,\rho (\tau _0) + (|\omega |+\eta )^{1/3}\,\bigr ) \Phi _k \Theta \,\ge \, N^{-4 \varepsilon } \Phi _k^2 \Theta . \end{aligned}$$

    Again we use the lower bound on \(\Phi _k \chi \) and get

    $$\begin{aligned} N^{-4 \varepsilon } \Phi _k^2 \Theta \chi \ge \, N^{8 \varepsilon }\Theta ^3 \chi . \end{aligned}$$

    Thus we showed that on the domain \(I_1(\omega )\)

    $$\begin{aligned} \widetilde{\pi }_1 \Theta \, \chi \,\gtrsim \, N^\varepsilon \, ( \Theta ^3+\widetilde{\pi }_2 \Theta ^2 ) \chi . \end{aligned}$$

The other cases are proven similarly. This completes the proof of (4.43). \(\square \)

5 Rigidity and delocalization of eigenvectors

5.1 Proof of Corollary 1.10

Here we explain how the local law, Theorem 1.7, is used to estimate the difference between the cumulative density of states and the eigenvalue distribution function of the random matrix \(\mathbf {H}\). The following auxiliary result shows that the difference between two probability measures can be estimated in terms of the difference of their respective Stieltjes transforms. For completeness the proof is given in the appendix. It uses a Cauchy-integral formula that was also applied in the construction of the Helffer-Sjöstrand functional calculus (cf. [11]) and it appeared in different variants in [15, 20, 21].

Lemma 5.1

(Bounding measures by Stieltjes transforms) There is a universal constant \(C>0\), such that for any two probability measures, \(\nu _1\) and \(\nu _2\), on the real line and any three numbers \(\eta _1, \eta _2,\varepsilon \in (0,1]\) with \(\varepsilon \ge \max \{\eta _1,\eta _2\}\), the difference between the two measures evaluated on the interval \([\tau _1,\tau _2]\subseteq \mathbb {R}\), with \(\tau _1< \tau _2\), satisfies

$$\begin{aligned} \begin{aligned}&\big | \nu _1([\tau _1,\tau _2])\,-\,\nu _2([\tau _1,\tau _2]) \big | \\&\quad \le \; C\,\Bigl (\, \nu _1([\tau _1-\eta _1,\tau _1]) \,+\, \nu _1([\tau _2,\tau _2+\eta _2]) \,+\, J_1 +J_2+J_3 \Bigr ) . \end{aligned} \end{aligned}$$
(5.1)

Here, the three contributions to the error, \(J_1\), \(J_2\) and \(J_3\), are defined as

$$\begin{aligned} \begin{aligned} J_1\,&:=\, \int _{\tau _1-\eta _1}^{\tau _1} \mathrm {d}\omega \,\biggl ( \mathrm {Im}m_{\nu _1}(\omega +\mathrm {i}\eta _1)+|m_{\nu _1-\nu _2}(\omega +\mathrm {i}\eta _1)| \\&\quad +\frac{1}{\eta _1} \int _{\eta _1}^{2 \varepsilon } \mathrm {d}\eta | m_{\nu _1-\nu _2}(\omega +\mathrm {i}\eta )| \biggr ) \,, \\ J_2\,&:=\, \int _{\tau _2}^{\tau _2+\eta _2} \mathrm {d}\omega \,\biggl ( \mathrm {Im}m_{\nu _1}(\omega +\mathrm {i}\eta _2)+|m_{\nu _1-\nu _2}(\omega +\mathrm {i}\eta _2)| \\&\quad +\frac{1}{\eta _2} \int _{\eta _2}^{2 \varepsilon } \mathrm {d}\eta | m_{\nu _1-\nu _2}(\omega +\mathrm {i}\eta )| \biggr ) \,, \\ J_3\,&:=\, \frac{1}{\varepsilon }\int _{\tau _1-\eta _1}^{\tau _2+\eta _2} \mathrm {d}\omega \int _{\varepsilon }^{2 \varepsilon }\mathrm {d}\eta | m_{\nu _1-\nu _2}(\omega +\mathrm {i}\eta )|\,, \end{aligned} \end{aligned}$$
(5.2)

where \(m_\nu \) denotes the Stieltjes transform of \(\nu \) for any signed measure \(\nu \).

We will now apply this lemma to prove Corollary 1.10 with the choices of the measures

$$\begin{aligned} \begin{aligned} \nu _1(\mathrm {d}\omega )\,:=\, \rho (\omega )\mathrm {d}\omega ,\quad \text {and}\quad \nu _2(\mathrm {d}\omega )\,:=\, \frac{1}{N}\sum _{i=1}^N\delta _{\lambda _i}(\mathrm {d}\omega ) . \end{aligned} \end{aligned}$$
(5.3)
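As a numerical aside (not needed for the proof), the two Stieltjes transforms compared through Lemma 5.1 can be inspected directly on a sampled Wigner-type matrix: \(m_{\nu _1}=\langle \mathbf {m} \rangle \) is obtained from the QVE (1.4) by a naive fixed-point iteration, and \(m_{\nu _2}(z)=N^{-1}\mathrm {Tr}\,(\mathbf {H}-z)^{-1}\) from the sampled spectrum; by the local law the two should be close, up to an error of order \((N\eta )^{-1}\). The variance profile, the Gaussian entries and the fixed-point scheme below are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000

# An arbitrary non-stochastic symmetric variance profile with row sums of order one.
x = np.arange(1, N + 1) / N
S = (1.0 + np.outer(x, x)) / N

def solve_qve(S, z, iters=3000):
    """Naive fixed-point iteration for the QVE (1.4): m_i = -1/(z + (S m)_i), Im m_i > 0."""
    m = np.full(S.shape[0], 1j, dtype=complex)
    for _ in range(iters):
        m = -1.0 / (z + S @ m)
    return m

# Sample a real symmetric H with independent centered Gaussian entries of variance S_ij.
A = rng.normal(size=(N, N)) * np.sqrt(S)
H = np.triu(A) + np.triu(A, 1).T

z = 0.3 + 0.05j
m_nu1 = np.mean(solve_qve(S, z))                       # Stieltjes transform of nu_1
m_nu2 = np.mean(1.0 / (np.linalg.eigvalsh(H) - z))     # Stieltjes transform of nu_2
print("m_nu1(z) =", m_nu1, "   m_nu2(z) =", m_nu2)
```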

As a first step we show that a.w.o.p. there are no eigenvalues with absolute value larger than or equal to 10, i.e.,

$$\begin{aligned} \begin{aligned} \#\{ {\,i: |\lambda _i|\ge 10 } \}\,=\,0\quad \text {a.w.o.p.} \end{aligned} \end{aligned}$$
(5.4)

We focus on the eigenvalues \(\lambda _i \ge 10\). The ones with \(\lambda _i \le -10\) are treated in the same way. We will show first that there are no eigenvalues in a small interval around \(\tau \) with \(\tau \ge 10\). In fact, we prove that for \(\gamma \in (0,1/3)\),

$$\begin{aligned} \begin{aligned} \# \bigl \{ {\,i: \tau \le \lambda _i\le \tau + N^{-1}} \bigl \} \;\prec \, N^{-\gamma } . \end{aligned} \end{aligned}$$
(5.5)

For this we apply Lemma 5.1 with the same choices of the measures \(\nu _1\) and \(\nu _2\) as in (5.3) and with

$$\begin{aligned} \begin{aligned} \eta _1\,:=\, \eta _2\,:=\,\varepsilon \,:=\, N^{\gamma -1},\quad \tau _1\,:=\, \tau ,\quad \tau _2\,:=\, \tau +N^{-1}. \end{aligned} \end{aligned}$$
(5.6)

Theorem 1.7 takes the form

$$\begin{aligned} \begin{aligned} \big | \langle \mathbf {g}(\omega +\mathrm {i}\eta )\rangle -\langle \mathbf {m}(\omega +\mathrm {i}\eta ) \rangle \big |\,\prec \, \frac{1}{N} +N^{-2 \gamma }, \quad (\omega ,\eta ) \in \Gamma , \end{aligned} \end{aligned}$$
(5.7)

where \( \Gamma := [\tau -N^{\gamma -1},\tau + 2N^{\gamma -1}] \times [N^{\gamma -1},2N^{\gamma -1}] \). Here we used \( \kappa (\omega +\mathrm {i}\eta ) \lesssim \eta _1 + (N\eta )^{-1} \), which follows because we are well outside \( {{\mathrm{supp}}}\rho \subset [-2,2 ] \); hence \( \Delta (\omega ) = 1 \) by (1.17), so the condition (1.24) holds and (1.25) is applicable.

Using the definition of stochastic domination (Definition 1.6), the basic union bound, and part (iii) of Lemma A.1 we see that the estimate (5.7) holds even with the supremum over \( (\omega ,\eta ) \in \widehat{\Gamma } \), where \( \widehat{\Gamma } := (N^{-10}\mathbb {Z})^2 \cap \Gamma \) is a fine grid of spacing \( N^{-10} \) with \( | \widehat{\Gamma } | \le N^{20} \). Using the Lipschitz-continuity of \(z \mapsto \langle \varvec{\mathrm {g}}(z) \rangle \) with Lipschitz-constant bounded by \(N^2\), as well as the uniform 1/3-Hölder-continuity of \( z \mapsto \langle \mathbf {m}(z) \rangle \), we can extend the supremum over \( \widehat{\Gamma } \) to the entire domain \( \Gamma \), i.e.,

$$\begin{aligned} \sup _{(\omega ,\eta ) \in \Gamma } \big | \langle \mathbf {g}(\omega +\mathrm {i}\eta )\rangle -\langle \mathbf {m}(\omega +\mathrm {i}\eta ) \rangle \big |\,\prec \, \frac{1}{N} +N^{-2 \gamma }. \end{aligned}$$

Plugging this bound into the definitions of \(J_1\), \(J_2\) and \(J_3\) from (5.2), and using (5.1) together with the fact that \(\rho =0\) in this regime, proves (5.5).

Since the left hand side of (5.5) is a non-negative integer, we conclude that a.w.o.p. there are no eigenvalues in an interval of length \(N^{-1}\) to the right of \(\tau \). A union bound over such intervals, as \(\tau \) varies, then implies that

$$\begin{aligned} \#\{\,i: 10\le \lambda _i\le N\}\,=\, 0\quad \text {a.w.o.p.} \end{aligned}$$

The eigenvalues larger than N are treated by the following simple argument,

$$\begin{aligned} \max _{i=1}^N\lambda _i^2 \,\le \, \sum _{i=1}^N \lambda _i^2\,=\, \sum _{i,j=1}^N|h_{ij}|^2\,\prec \, N. \end{aligned}$$

Thus (5.4) holds true.
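The last display is the deterministic bound \(\max _i \lambda _i^2\le \mathrm {Tr}\, \mathbf {H}^2 = \sum _{i,j}|h_{ij}|^2\) combined with the observation that \(\sum _{i,j}|h_{ij}|^2\) concentrates around its expectation \(\sum _{i,j}s_{ij}\), which is of order N. A short numerical sanity check, with an illustrative flat variance profile, reads:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 2000
S = np.full((N, N), 1.0 / N)                       # illustrative flat variance profile
A = rng.normal(size=(N, N)) * np.sqrt(S)
H = np.triu(A) + np.triu(A, 1).T                   # real symmetric sample

lam = np.linalg.eigvalsh(H)
print("max_i lambda_i^2 =", lam.max() ** 2)                      # O(1) in this example
print("sum_i lambda_i^2 =", (lam ** 2).sum(), " vs  N =", N)     # Tr H^2 = sum_ij |h_ij|^2
```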

Now we apply Lemma 5.1 to prove (1.28). In case \(|\tau |\ge 10\) the bound (1.28) follows because a.w.o.p. there are no eigenvalues of \(\mathbf {H}\) with absolute value larger than or equal to 10. Thus, we fix \(\tau \in (-10,10)\) and make the choices

$$\begin{aligned} \begin{aligned} \eta _1:= \eta _2:= N^{\gamma -1},\quad \tau _1:= -10,\quad \tau _2:= \tau ,\quad \varepsilon :=1. \end{aligned} \end{aligned}$$
(5.8)

Again we use (1.21) from Theorem 1.7, the Lipschitz-continuity of \(\langle \mathbf {g}\rangle \) and the Hölder-continuity of \(\langle \mathbf {m}\rangle \) to see that uniformly for all \(\eta \ge N^{\gamma -1}\),

$$\begin{aligned} \sup _{\omega \in [0,\eta _1]}\big | \langle \mathbf {g}(\tau _1-\omega +\mathrm {i}\eta ) \rangle -\langle \mathbf {m}(\tau _1-\omega +\mathrm {i}\eta ) \rangle \big | \prec \, \frac{1}{N} +\frac{1}{(N\eta )^2}. \end{aligned}$$

Here we evaluated \(\Delta (\tau _1)=1\) and thus \(\kappa \lesssim \eta + (N\eta )^{-1}\). With \(J_1\) defined as in (5.2) we infer \(J_1 \prec N^{-1}\). Theorem 1.7 also implies the bound

$$\begin{aligned} \sup _{\omega \in [-20,20]}\sup _{\eta \in [1,2]}\big | \langle \mathbf {g}(\omega +\mathrm {i}\eta ) \rangle -\langle \mathbf {m}(\omega +\mathrm {i}\eta ) \rangle \big | \prec \, \frac{1}{N}, \end{aligned}$$

since in this regime \(\kappa \lesssim 1\), thus showing that \(J_3 \prec N^{-1}\). We are left with estimating the three terms constituting \(J_2\). The first and second of these terms are estimated trivially by using the boundedness of their integrands. Therefore, we conclude that

$$\begin{aligned} \begin{aligned} \left|\int _{-10}^\tau \rho (\omega )\mathrm {d}\omega -\frac{\#\{\,i : \, -10 \le \lambda _i\le \tau \}}{N} \right| \,\prec \, N^{\gamma -1} + R(\tau ), \end{aligned} \end{aligned}$$
(5.9)

where the error term, R, is defined as

$$\begin{aligned} \begin{aligned} R(\tau )&:= N^{1-\gamma }\int _{0}^{N^{\gamma -1}} \mathrm {d}\omega \int _{N^{\gamma -1}}^{2} \mathrm {d}\eta \, \\&\quad \min \left\{ \, {\frac{1}{N \eta (\Delta (\tau +\omega )^{1/3}+ \rho (\tau +\omega +\mathrm {i}\eta ))},\,\frac{1}{\,(N \eta )^{1/2}}} \,\right\} . \end{aligned} \end{aligned}$$
(5.10)

This expression is derived by using the bound (1.23) on \(\kappa \) for the integrand of the third contribution to \(J_2\).

To estimate R further we distinguish three cases, depending on whether \(\tau \) is away from \({\mathbb {M}}\), close to an edge or close to a local minimum in the interior of \({{\mathrm{supp}}}\rho \). In each of these cases we prove

$$\begin{aligned} \begin{aligned} R(\tau )\,\prec \, \min \left\{ \, { \frac{1}{N (\Delta (\tau )^{1/3}+ \rho (\tau ))},\, \frac{1}{N^{4/5}\,} } \,\right\} . \end{aligned} \end{aligned}$$
(5.11)

Away from \({\mathbb {M}}\): In case \(\mathrm {dist}(\tau ,{\mathbb {M}})\ge \delta _*\), with \(\delta _*\) the size of the neighborhood around the local minima from Theorem 4.1, we have \(\Delta ^{1/3}+\rho \sim 1\) and thus the \(\eta \)-integral in (5.10) yields a factor comparable to \(N^{-1}\log N\). Thus, \(R(\tau )\prec N^{-1}\), and hence (5.11) holds.

Close to an edge: Let \(\mathrm {dist}(\tau ,\{\alpha _k,\beta _k\})\le \delta _*\). Then from the size of \(\rho \) at an internal edge, at the extreme edges and inside the gap (cf. (4.5b), (4.5d) and (4.5c) from Theorem 4.1) we see that

$$\begin{aligned} \Delta (\tau +\omega )^{1/3}+ \rho (\tau +\omega +\mathrm {i}\eta )\,\sim \,\big ( \Delta (\tau ) + \mathrm {dist}(\tau , \{\alpha _k,\beta _k\})+\eta \big )^{1/3}, \end{aligned}$$

for any \(\omega \in [0,N^{\gamma -1}]\) and \(\eta \in [N^{\gamma -1},2]\). With this the size of R is given by

$$\begin{aligned} R(\tau )\,\sim \, \int _{N^{\gamma -1}}^{2} \mathrm {d}\eta \min \left\{ \, { \frac{1}{N \eta (\Delta (\tau )+ \mathrm {dist}(\tau , \{\alpha _k,\beta _k\})+\eta )^{1/3}},\frac{1}{\,(N \eta )^{1/2}} } \,\right\} . \end{aligned}$$

Integrating over \(\eta \) yields that

$$\begin{aligned} R(\tau )\,\lesssim \,\min \left\{ \, {\frac{ \log N}{N(\Delta (\tau )+ \mathrm {dist}(\tau , \{\alpha _k,\beta _k\}))^{1/3}},\frac{1}{N^{4/5}}} \,\right\} . \end{aligned}$$

Now (5.11) follows by using the size of \(\rho \) from Theorem 4.1 again.

Close to an internal local minimum: Suppose \(|\tau -\tau _0|\le \delta _*\) for some \(\tau _0 \in {\mathbb {M}} \backslash \partial {{\mathrm{supp}}}\rho \). Then using the size of \(\rho \) from (4.5e) of Theorem 4.1 we see that

$$\begin{aligned} R(\tau )\,\sim \, \int _{N^{\gamma -1}}^{2} \mathrm {d}\eta \min \left\{ \, { \frac{1}{N \eta (\rho (\tau _0)+ |\tau -\tau _0|^{1/3}+\eta ^{1/3})},\,\frac{1}{(N \eta )^{1/2}} } \,\right\} . \end{aligned}$$

The bound (5.11) follows by performing the integration over \(\eta \).

This finishes the proof of (5.11). We insert this bound into (5.9) and use that \(\gamma \) was arbitrary. Thus, we find

$$\begin{aligned} \left|\, \int _{-10}^\tau \rho (\omega ) \mathrm {d}\omega \;-\;\frac{\#\{\,i : \, -10 \le \lambda _i\le \tau \}}{N}\, \right| \;\prec \; \min \left\{ \, { \frac{1}{N (\Delta (\tau )^{1/3}+ \rho (\tau ))},\, \frac{1}{N^{4/5}} } \,\right\} . \end{aligned}$$

This finishes the proof of (1.28) since there are no eigenvalues below \(-10\).

Now we prove (1.29). Let \(\tau \in \mathbb {R}\backslash {{\mathrm{supp}}}\rho \). Suppose that for some \(k=1, \dots ,K\) we have \(|\tau -\beta _k| = \mathrm {dist}(\tau , \partial {{\mathrm{supp}}}\rho )\). The case when \(\tau \) is closer to the set \(\{\alpha _k\}\) than to \(\{\beta _k\}\) is treated similarly. Suppose further that

$$\begin{aligned} \tau \,\ge \, \alpha _k + \delta _k, \end{aligned}$$

where \(\delta _k\) are defined as in (1.30) and \(\delta _0=N^{\gamma -2/3}\). Note that there is nothing to show if \(k>1\) and the size of the gap, \(\alpha _k-\beta _{k-1}\), is smaller than \(2 \delta _k\), i.e., if such a \(\tau \) does not exist. In particular, we have \(\alpha _k-\beta _{k-1}=\Delta (\tau )\gtrsim N^{-1/2}\). We will show that a.w.o.p. there are no eigenvalues in an interval of length \(N^{-2/3}\) to the right of \(\tau \), i.e.

$$\begin{aligned} \begin{aligned} \# \bigl \{ {\,i:\, \tau \le \lambda _i \le \tau +N^{-2/3}} \bigl \} \;=\, 0 \quad \text {a.w.o.p.} \end{aligned} \end{aligned}$$
(5.12)

We apply Lemma 5.1 with the same choices of the measures \(\nu _1\) and \(\nu _2\) as in (5.3). Additionally, we set

$$\begin{aligned} \begin{aligned} \eta _1\,:=\, \eta _2\,:=\,\varepsilon \,:=\, N^{-2/3},\quad \tau _1\,:=\, \tau ,\quad \tau _2\,:=\, \tau +N^{-2/3}. \end{aligned} \end{aligned}$$
(5.13)

We use the local law, Theorem 1.7, to estimate the difference of the Stieltjes transforms of the two measures appearing in the integrands of the three error terms, \(J_1\), \(J_2\) and \(J_3\) from (5.2). By the definition of \(\delta _k\) the condition (1.24) is satisfied inside the integrals and we use the improved bound, (1.25), on \(\kappa \). Indeed, we find

$$\begin{aligned} \sup \big | \langle \mathbf {g}(\omega +\mathrm {i}\eta )\rangle -\langle \mathbf {m}(\omega +\mathrm {i}\eta ) \rangle \big |\,\prec \, \frac{1}{N\delta _k\Delta (\tau )^{1/3}} +\frac{1}{N^{2/3} \delta _k^{1/2}\Delta (\tau )^{1/6}} \;, \end{aligned}$$

where the supremum is taken over \(\omega \in [\tau - N^{-2/3}, \tau + 2N^{-2/3}]\) and \(\eta \in [N^{-2/3}, 2N^{-2/3}]\). With this, the definition of \(\delta _k\) and the size of \(\rho \) from (4.5c) and (4.5d) we infer

$$\begin{aligned} J_1+J_2 +J_3\,\prec \, N^{-1-\gamma /2} . \end{aligned}$$

From this (5.12) follows. The claim, (1.29), is now a consequence of a simple union bound taken over the events in (5.12) with different choices of \(\tau \). This finishes the proof of Corollary 1.10.

5.2 Proof of Corollary 1.11

Here we show how the rigidity, Corollary 1.11, follows from Corollary 1.10. Fix a \(\tau \in [\alpha _1,\beta _K]\). We define the random fluctuation to the left, \(\delta _-\), and to the right, \(\delta _+\), of the eigenvalue \(\lambda _{i(\tau )}\) as

$$\begin{aligned} \delta _+(\tau ):= & {} \inf \biggl \{\, \delta \ge 0 :\, 2+\Big | \# \bigl \{ {\,i:\, \lambda _i \le \tau +\delta } \bigl \} \,-\,N\int _{-\infty }^{\tau +\delta }\rho (\omega ) \mathrm {d}\omega \Big | \nonumber \\&\le N \int _{\tau }^{\tau +\delta }\rho (\omega ) \mathrm {d}\omega \,\biggr \} \end{aligned}$$
(5.14)
$$\begin{aligned} \delta _-(\tau ):= & {} \inf \biggl \{\, \delta \ge 0 :\, 1+ \Big | \# \bigl \{ {\,i:\, \lambda _i \le \tau -\delta } \bigl \} \,-\,N\int _{-\infty }^{\tau -\delta }\rho (\omega ) \mathrm {d}\omega \Big | \nonumber \\&\le N \int _{\tau -\delta }^{\tau } \rho (\omega ) \mathrm {d}\omega \,\biggr \} \,. \end{aligned}$$
(5.15)

We show now that with this definition,

$$\begin{aligned} \begin{aligned} \lambda _{i(\tau )} \in \bigl [ \tau -\delta _-(\tau ),\tau +\delta _+(\tau )\bigr ] . \end{aligned} \end{aligned}$$
(5.16)

We start with the upper bound on \(\lambda _{i(\tau )}\). By the definition of \(i(\tau )\) we find the inequality

$$\begin{aligned} \# \bigl \{ {\,i:\, \lambda _i \le \lambda _{i(\tau )}} \bigl \}= & {} i(\tau )\,\le \, 1+N\int _{-\infty }^{\tau }\rho (\omega )\mathrm {d}\omega \\= & {} 1+ N\int _{-\infty }^{\tau +\delta _+} \rho (\omega )\mathrm {d}\omega - N\int _{\tau }^{\tau +\delta _+} \rho (\omega )\mathrm {d}\omega . \end{aligned}$$

The definition of \(\delta _+=\delta _+(\tau )\) implies that

$$\begin{aligned} \# \bigl \{ {\,i:\, \lambda _i \le \lambda _{i(\tau )}} \bigl \} < \#\{\,i:\, \lambda _i \le \tau +\delta _+\}. \end{aligned}$$

By monotonicity of the cumulative eigenvalue distribution, we conclude that \( \lambda _{i(\tau )}\le \tau +\delta _+\). Thus, the upper bound is proven.

Now we show the lower bound. We start similarly,

$$\begin{aligned} \# \bigl \{ {\,i:\, \lambda _i \le \lambda _{i(\tau )}} \bigl \}= & {} i(\tau ) \,\ge \, N\int _{-\infty }^{\tau }\rho (\omega )\mathrm {d}\omega \\= & {} N\int _{-\infty }^{\tau -\delta _-} \rho (\omega )\mathrm {d}\omega \,+\, N\int _{\tau -\delta _-}^{\tau }\rho (\omega )\mathrm {d}\omega . \end{aligned}$$

By definition of \(\delta _-\) we get

$$\begin{aligned} \# \bigl \{ {\,i:\, \lambda _i \le \lambda _{i(\tau )}} \bigl \} \;\,\ge \, 1+\, \liminf _{\varepsilon \downarrow 0}\# \bigl \{ {\,i:\, \lambda _i \le \tau -\delta _--\varepsilon } \bigl \} . \end{aligned}$$

Here the \(\liminf \) is necessary, since the cumulative eigenvalue distribution is not continuous from the left. We conclude that \(\lambda _{i(\tau )} \ge \tau -\delta _--\varepsilon \) for all \(\varepsilon >0\) and therefore the lower bound is proven.
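As an illustration only (not part of the argument), the inclusion (5.16) can be tested numerically. The sketch below uses a standard Wigner matrix, so that \(\rho \) is the semicircle density with explicit distribution function F; in the general Wigner-type case one would instead integrate the density obtained from the QVE. The infima in (5.14)–(5.15) are approximated by a grid search, and \(i(\tau )\) is taken to be \(\lceil N\int _{-\infty }^\tau \rho \rceil \), which is consistent with the two inequalities for \(i(\tau )\) used above (the precise definition is the one from Corollary 1.11).

```python
import numpy as np

rng = np.random.default_rng(2)
N = 2000
A = rng.normal(size=(N, N)) / np.sqrt(N)
H = (A + A.T) / np.sqrt(2)                        # GOE-type sample, rho = semicircle
lam = np.sort(np.linalg.eigvalsh(H))

def F(t):
    """Cumulative semicircle distribution F(t) = integral of rho up to t."""
    t = np.clip(t, -2.0, 2.0)
    return (t * np.sqrt(4.0 - t ** 2) / 2.0 + 2.0 * np.arcsin(t / 2.0) + np.pi) / (2.0 * np.pi)

def count(t):
    """Number of eigenvalues <= t."""
    return np.searchsorted(lam, t, side="right")

def delta_plus(tau, grid=np.linspace(0.0, 1.0, 20001)):
    """Approximate the infimum in (5.14) by scanning a grid of delta-values."""
    for d in grid:
        if 2.0 + abs(count(tau + d) - N * F(tau + d)) <= N * (F(tau + d) - F(tau)):
            return d
    return grid[-1]

def delta_minus(tau, grid=np.linspace(0.0, 1.0, 20001)):
    """Approximate the infimum in (5.15)."""
    for d in grid:
        if 1.0 + abs(count(tau - d) - N * F(tau - d)) <= N * (F(tau) - F(tau - d)):
            return d
    return grid[-1]

tau = 0.7
i_tau = int(np.ceil(N * F(tau)))                  # assumed choice of i(tau), cf. Corollary 1.11
dp, dm = delta_plus(tau), delta_minus(tau)
print(f"lambda_i(tau) = {lam[i_tau - 1]:.4f} in [{tau - dm:.4f}, {tau + dp:.4f}] ?",
      tau - dm <= lam[i_tau - 1] <= tau + dp)
```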

Now we start with the proof of (1.34). For this we show that for any \(\tau \) that is well inside the support of the density of states, i.e., that satisfies (1.33), we have

$$\begin{aligned} \begin{aligned} \delta _-(\tau )+\delta _+(\tau )\,\prec \, \delta ,\quad \delta := \min \left\{ \, { \frac{1}{\rho (\tau )(\Delta (\tau )^{1/3}+ \rho (\tau ))N},\frac{1}{N^{3/5}} } \,\right\} . \end{aligned} \end{aligned}$$
(5.17)

If \(\tau \) is in the bulk, i.e., \(\mathrm {dist}(\tau ,{\mathbb {M}})\ge \delta _*\), then \(\delta \sim N^{-1}\) and thus (5.17) follows from (1.28). We distinguish the two remaining cases, namely whether \(\tau \) is close to an edge or to a local minimum in the interior of \({{\mathrm{supp}}}\rho \).

Close to an edge: Suppose that \(\tau \in [\beta _k-\delta _*,\beta _k-\varepsilon _k]\). The case when \(\tau \) is closer to \(\{\alpha _k\}\) than to \(\{\beta _k\}\) is treated similarly. By the definition of \(\varepsilon _k\) in (1.32) and by the size of \(\rho \) from (4.5d) and (4.5b) in Theorem 4.1 we see that \(\varepsilon _k \gtrsim N^\gamma \delta \). Using Corollary 1.10 we find for any \(\varepsilon \in (0,\gamma /2)\) that

$$\begin{aligned} \left|\,\# \bigl \{ {\,i:\, \lambda _i \le \tau +N^\varepsilon \delta } \bigl \} \,-\,N\int _{-\infty }^{\tau +N^\varepsilon \delta } \rho (\omega ) \mathrm {d}\omega \, \right| \prec \, \min \Bigl \{ { \bigl (\Delta (\tau )+\beta _k-\tau \bigr )^{-1/3}, N^{1/5} } \Bigr \} . \end{aligned}$$

On the other hand

$$\begin{aligned} N \int _{\tau }^{\tau +N^\varepsilon \delta }\rho (\omega )\mathrm {d}\omega\sim & {} \frac{N^{1+\varepsilon }\delta (\beta _k-\tau )^{1/2}}{(\Delta (\tau )+\beta _k-\tau )^{1/6}} \\\gtrsim & {} N^\varepsilon \min \Bigl \{ { \bigl (\Delta (\tau )+\beta _k-\tau \bigr )^{-1/3}, N^{1/5} } \Bigr \} . \end{aligned}$$

Here we used the size of \(\rho \) from Theorem 4.1, the definition of \(\delta \) and \(\beta _k-\tau \ge \varepsilon _k\). Since \(\varepsilon \) was arbitrary we conclude that \(\delta _+(\tau )\prec \delta \). The bound, \(\delta _-(\tau )\prec \delta \), is shown in the same way.

Close to internal local minima: Suppose \(|\tau -\tau _0|\le \delta _*\) for some \(\tau _0 \in {\mathbb {M}}\backslash \partial {{\mathrm{supp}}}\rho \). Then by (4.5e) with \(\Delta (\tau _0)=0\) and the definition of \(\delta \) in (5.17) we have

$$\begin{aligned} \delta \,\sim \, \min \left\{ \, { \frac{1}{(\rho (\tau _0)^3+|\tau -\tau _0|)^{2/3}N},\frac{1}{N^{3/5}} } \,\right\} . \end{aligned}$$

We apply (1.28) from Corollary 1.10 and, using (4.5e) again, we get

$$\begin{aligned}&\left| \# \bigl \{ {\,i:\, \lambda _i \le \tau +N^\varepsilon \delta } \bigl \} \,-\,N \int _{-\infty }^{\tau +N^\varepsilon \delta }\rho (\omega ) \mathrm {d}\omega \, \right| \nonumber \\&\quad \prec \; \min \Bigl \{ { \bigl ( \rho (\tau _0)^3+|\tau +N^\varepsilon \delta -\tau _0|\bigr )^{-1/3}, N^{1/5} } \Bigr \} \,. \end{aligned}$$
(5.18)

On the other hand, we find

$$\begin{aligned} \begin{aligned} N \int _{\tau }^{\tau +N^\varepsilon \delta }\rho (\omega ) \mathrm {d}\omega \;\sim \; N^{1+\varepsilon }\delta \,\Bigl (\,\rho (\tau _0)^3 + |\tau -\tau _0|+N^\varepsilon \delta \,\Bigr )^{1/3} . \end{aligned} \end{aligned}$$
(5.19)

We will now verify that for large enough N,

$$\begin{aligned} \begin{aligned}&N^{\varepsilon /2}\min \Bigl \{ { \bigl (\,\rho (\tau _0)^3+|\tau +N^\varepsilon \delta -\tau _0| \bigr )^{-1/3}, N^{1/5} } \Bigr \} \\&\quad \lesssim N^{1+\varepsilon }\delta \,\Bigl ( \rho (\tau _0)^3+|\tau -\tau _0|+N^\varepsilon \delta \Bigr )^{1/3} . \end{aligned} \end{aligned}$$
(5.20)

We distinguish three cases. First let us consider the regime where \(\rho (\tau _0)^3+|\tau -\tau _0|\le N^{-3/5}\). Then we have \(\delta = N^{-3/5}\) and

$$\begin{aligned} N^{1+\varepsilon }\delta \bigl ( \rho (\tau _0)^3+|\tau -\tau _0|+N^\varepsilon \delta \,\bigr )^{1/3}\,\sim \, N^{4\varepsilon /3}N^{1/5}. \end{aligned}$$

Now we treat the situation where \(N^{-3/5}<\rho (\tau _0)^3+|\tau -\tau _0|\le N^{3\varepsilon /2-3/5}\). In this case

$$\begin{aligned} N^{1+\varepsilon }\delta \,\bigl (\,\rho (\tau _0)^3+|\tau -\tau _0|+N^\varepsilon \delta \,\bigr )^{1/3} \,\gtrsim \, \frac{N^\varepsilon }{( \rho (\tau _0)^3+|\tau -\tau _0| )^{1/3}}\,\ge \, N^{\varepsilon /2}N^{1/5}. \end{aligned}$$

Finally, we consider \(\rho (\tau _0)^3+|\tau -\tau _0|> N^{3\varepsilon /2-3/5}\). Then for large enough N we find on the one hand

$$\begin{aligned} \min \Bigl \{ { \bigl ( \rho (\tau _0)^3+|\tau +N^\varepsilon \delta -\tau _0|\bigr )^{-1/3}, N^{1/5} } \Bigr \} \,\sim \, \frac{1}{( \rho (\tau _0)^3+|\tau -\tau _0| )^{1/3}}, \end{aligned}$$

and on the other hand

$$\begin{aligned} N^{1+\varepsilon }\,\delta \,\bigl ( \rho (\tau _0)^3+|\tau -\tau _0|+N^\varepsilon \delta \,\bigr )^{1/3}\,\gtrsim \, \frac{N^\varepsilon }{( \rho (\tau _0)^3+|\tau -\tau _0| )^{1/3}}. \end{aligned}$$

Thus, (5.20) holds true and since \(\varepsilon \) was arbitrary, we infer from (5.18) and (5.19) that \(\delta _+(\tau )\prec \delta \). Along the same lines we prove \(\delta _-(\tau )\prec \delta \). Thus (5.17) and with it (1.34) are proven.

The statement about the fluctuation of the eigenvalues at the leftmost edge, (1.35), follows directly from (1.34) and (1.29) in Corollary 1.10. Indeed, for \(\tau \in [\alpha _1,\alpha _1+\varepsilon _0)\) we have \(\lambda _{i(\tau )}\le \lambda _{i(\alpha _1+\varepsilon _0)}\), and from (1.34) with \(\Delta (\tau )=1\), as well as \(\rho (\alpha _1+\varepsilon _0)\sim \varepsilon _0^{1/2}\), and from the definition of \(\varepsilon _0\) we see that

$$\begin{aligned} \lambda _{i(\alpha _1+\varepsilon _0)}\,\le \, \alpha _1 +\varepsilon _0+N^{\gamma -2/3}\,\le \, \tau +2 N^{\gamma -2/3}\quad \text {a.w.o.p.} \end{aligned}$$

On the other hand, (1.29) shows that a.w.o.p. \(\lambda _{i(\tau )}\ge \alpha _1-N^{\gamma -2/3}\). Since \(\gamma \) was arbitrary, (1.35) follows. The rigidity at the rightmost edge is proven along the same lines.

The claim, (1.36), about the remaining eigenvalues follows from a similar argument. For \(\tau \in (\beta _k-\varepsilon _k,\alpha _{k+1}+\varepsilon _k)\), as a consequence of (1.29), we have

$$\begin{aligned} \lambda _{i(\tau )} \,\in \, \bigl [ \lambda _{ i(\beta _k-\varepsilon _k)}, \beta _k+\delta _k\bigr ] \cup \bigl [ \alpha _{k+1}-\delta _k , \lambda _{ i(\alpha _{k+1}+\varepsilon _k)}\bigr ] \quad \text {a.w.o.p.} \end{aligned}$$

From (1.34) and the definition of \(\varepsilon _k\) we infer \(\lambda _{i(\beta _k-\varepsilon _k)}\ge \beta _k-2 \varepsilon _k \) a.w.o.p., as well as \(\lambda _{i(\alpha _{k+1}+\varepsilon _k)} \le \alpha _{k+1}+2 \varepsilon _k\) a.w.o.p., which finishes the proof of (1.36).

5.3 Proof of Corollary 1.14

The delocalization of eigenvectors is a simple consequence of the anisotropic local law, Theorem 1.13, via the argument from [14]. Expressing the resolvent in the eigenbasis, we have

$$\begin{aligned} \begin{aligned} \varvec{\mathrm {b}} \cdot \varvec{\mathrm {G}}(z)\varvec{\mathrm {b}}\,=\, \sum _{i=1}^N \frac{| \varvec{\mathrm {b}} \cdot \varvec{\mathrm {u}}^{(i)} |^2 }{\lambda _i \,-\,z}, \end{aligned} \end{aligned}$$
(5.21)

where \(\varvec{\mathrm {u}}^{(i)} \) is the \( \ell ^{ 2}\)-normalized eigenvector corresponding to the eigenvalue \(\lambda _i\). We evaluate this at \(z:=\lambda _k+\mathrm {i}N^{\gamma -1}\) with \(\gamma >0\) as in the statement of Theorem 1.13. The anisotropic local law implies that \(\varvec{\mathrm {b}} \cdot \varvec{\mathrm {G}}(z)\varvec{\mathrm {b}}\) is also uniformly bounded. Hence we get

$$\begin{aligned} 1\gtrsim \mathrm {Im}\;\varvec{\mathrm {b}} \cdot \varvec{\mathrm {G}}(z)\varvec{\mathrm {b}} \ge N^{1-\gamma } | \varvec{\mathrm {b}} \cdot \varvec{\mathrm {u}}^{(k)} |^2, \end{aligned}$$

by keeping only a single summand \( i = k \) from (5.21). As \(\gamma >0\) was arbitrary we conclude that

$$\begin{aligned} | \varvec{\mathrm {b}} \cdot \varvec{\mathrm {u}}^{(k)} | \,\prec \, N^{-1/2}, \end{aligned}$$

uniformly in k.
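The inequality chain above is elementary and can be checked in a small simulation; the sketch below (a GOE-type sample and a coordinate test vector \(\mathbf {b}\), both illustrative choices) evaluates (5.21) at \(z=\lambda _k+\mathrm {i}\eta \) with \(\eta =N^{\gamma -1}\) and keeps only the \(i=k\) summand.

```python
import numpy as np

rng = np.random.default_rng(3)
N, gamma = 1000, 0.1
A = rng.normal(size=(N, N)) / np.sqrt(N)
H = (A + A.T) / np.sqrt(2)                       # GOE-type sample matrix
lam, U = np.linalg.eigh(H)                       # columns of U are the eigenvectors u^(i)

b = np.zeros(N); b[0] = 1.0                      # deterministic unit test vector
k, eta = N // 2, N ** (gamma - 1.0)
z = lam[k] + 1j * eta

overlaps = U.T @ b                               # the scalar products b . u^(i)
bGb = np.sum(np.abs(overlaps) ** 2 / (lam - z))  # spectral representation (5.21)

# Im(1/(lambda_i - z)) = eta/((lambda_i - lambda_k)^2 + eta^2) >= 0, so the i = k
# summand alone already gives |b.u^(k)|^2 / eta = N^{1-gamma} |b.u^(k)|^2.
print("Im b.G(z)b            =", bGb.imag)
print("N^{1-gamma}|b.u_k|^2  =", N ** (1 - gamma) * np.abs(overlaps[k]) ** 2)
print("N |b.u_k|^2           =", N * np.abs(overlaps[k]) ** 2, " (delocalization: O(1))")
```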

6 Anisotropic law and universality

6.1 Proof of Theorem 1.13

Given the entrywise local law, Theorem 1.7, the proof of the anisotropic law follows exactly as in Section 7 of [9], where the same argument was presented for generalized Wigner matrices (this argument itself mimicked the detailed proof of the isotropic law for sample covariance matrices in Section 5 of [9]). The only difference is that in our case \(G_{ii}(z)\) is close to \(m_i(z)\), the i-th component of the solution to the QVE, which now genuinely depends on i, while in [9] we had \( G_{ii}(z)\approx m_{\mathrm {sc}}(z)\) for every i, where \(m_{\mathrm {sc}}(z)\) is the solution to (1.3). However, the diagonal resolvent elements played no essential role in [9]. We now explain the small modifications.

Recall from Section 5.2 of [9] that by polarization it is sufficient to prove (1.37) for \(\ell ^{ 2}\)-normalized vectors \(\mathbf {w}= \mathbf {v}\). We can then write

$$\begin{aligned} \sum _{i,j=1}^N \overline{v}_i\, G_{ij} v_j -\sum _{i=1}^N m_i |v_i|^2 \,=\, \sum _i (G_{ii}-m_i)|v_i|^2 + {\mathcal {Z}} ,\quad \mathcal {Z}:= \sum _{i\ne j}^N \overline{v_i} G_{ij} v_j . \end{aligned}$$

The first term containing the diagonal elements \(G_{ii}\) is clearly bounded by the right hand side of (1.37) by Theorem 1.7. This is the first instance where the nontrivial i-dependence of \(m_i\) is used.

The main technical part of the proof in [9] is then to control \( \mathcal {Z} \), the contribution of the off-diagonal terms. We can follow this proof in our case to the letter; the nontrivial i-dependence of \(m_i\) requires a slight modification only at one point. To see this, we recall the main structure of the proof. For any even p, the moment

$$\begin{aligned} \begin{aligned} \mathbbm {E}|\mathcal{Z}|^p \,=\, \mathbbm {E} \sum _{b_{11}\ne b_{12}} \cdots \sum _{b_{p1}\ne b_{p2}} \left( \; \prod _{k = 1}^{p/2} \overline{v}_{b_{k1}} G_{b_{k1}b_{k2}}v_{b_{k2}}\right) \left( \prod _{ k=p/2+1}^{p} \overline{v}_{b_{k1}} G^*_{b_{k1}b_{k2}}v_{b_{k2}} \right) , \end{aligned} \end{aligned}$$
(6.1)

is computed. Let us concentrate on a fixed summand in (6.1) and let \(B=\{b_{k1}\}\cup \{b_{k2}\}\) be the set of \(\mathbf {v}\)-indices appearing in that term. Using the resolvent identity (2.9) we successively expand the resolvents until each of them appears in a maximally expanded form, where every resolvent entry is of the form \( G^{(B\backslash ab)}_{ab} \), for some \( a,b \in B \) (cf. Definition 5.4 of [9]). Each time a maximally expanded off-diagonal element is produced we use (2.3). Finally, unless we end up with an expression that contains a very large number of off-diagonal resolvent entries (such trivial leaves are treated separately in Subsection 5.11 of [9]), we apply (3.16) to expand the remaining maximally expanded diagonal resolvent entries. This way we end up with an expression where only the resolvent entries of the type \( G^{(B)}_{ij} \), with \( i,j \notin B \), appear. In other words, the \( \varvec{\mathrm {v}} \)-indices and the indices of the resolvent entries are completely decoupled; only explicit products of entries of \( \varvec{\mathrm {H}} \) represent the connections between them. We can now take the partial expectation w.r.t. the rows and columns of these h-terms. In this way we guarantee that each index in B appears at least twice as a value of \( b_{k1} \) or \( b_{k2}\) in (6.1), i.e., the entries of \( \varvec{\mathrm {v}} \) must be at least paired, and therefore the 2p-fold summation in (6.1) effectively becomes at most a p-fold summation. This reduces the uncontrolled \(\ell ^1\)-norm of \( \varvec{\mathrm {v}} \) to \(\ell ^{ q}\)-norms of \( \varvec{\mathrm {v}}\), with \( q \ge 2\), which are bounded by one by normalization.

Along this procedure it is only in the treatment of the maximally expanded diagonal resolvent elements appearing in the non-trivial leaves (cf. Subsection 5.12 of [9]) that we need to slightly adjust the proof to the setting where \( \varvec{\mathrm {S}} \) is not stochastic. Using the QVE (1.7) and Schur's formula, similarly as in (3.16), we obtain a representation in which all the dependence on the B-columns and -rows of \( \varvec{\mathrm {H}} \) is explicit:

$$\begin{aligned} \begin{aligned} \frac{1}{G^{(B\backslash b)}_{bb}} \,&=\, \frac{1}{m_b} \,-\, \sum _{i,j}^{(B)} \Bigl (\,h_{bi} G^{(B)}_{ij} h_{jb}-s_{bi} m_i \delta _{ij}\Bigr ) \,+ \sum _{a \in B} s_{ba}m_a \,+\,h_{bb}, \quad b \in B . \end{aligned} \end{aligned}$$
(6.2)

This formula replaces (5.41) from [9]. Taking the inverse of this formula and expanding around the leading term \( m_b \), we get a geometric series representation for \( G^{(B\backslash b)}_{bb} \) in terms of powers of the last three terms in (6.2). The resulting formula is analogous to (5.42) in [9]. The geometric series converges because the last three terms on the right hand side of (6.2) are much smaller than \( |1/m_b |\sim 1 \) a.w.o.p. Indeed, the last two terms in (6.2) are of size \( N^{-1} \) and \( N^{-1/2+c} \) a.w.o.p., respectively. The double sum in (6.2) is small by the large deviation estimates (2.7a)–(2.7c), similarly as in the proof of Lemma 2.1. When estimating the diagonal sum \(i=j\), we note that \( |G^{(B)}_{ii}-m_i | \) is small by first estimating \( |G^{(B)}_{ii}-G_{ii} | \) similarly to (2.12), and then using the local law, Theorem 1.7, to see that also \( |G_{ii}-m_i | \) is small.

The proof in [9] did not use the specific form of the subtracted term \(s_{bi}m_i \delta _{ij} \) in (6.2), just the fact that the subtraction made (2.7c) applicable for the double summation in (6.2). After this slight modification, the rest of the proof in [9] goes through without any further changes.

6.2 Proof of Theorem 1.16

For the proof of Theorem 1.16 we follow the method developed in [14, 17, 20]. Theorem 2.1 from [18] was designed for proving universality for random matrices with a small independent Gaussian component and a density of states that may differ from Wigner’s semicircle law. The main theorem in [18] asserts that if local laws hold in a sufficiently strong sense, then bulk universality holds locally for matrices with a small Gaussian component. We remark that a similar approach, which can also easily be used to conclude bulk universality from Theorem 1.7, was independently developed in [26]; here we follow [18]. In Section 2.5 of [18] a recipe was given for how to use this theorem to establish universality for a quite general class of random matrix models even without the Gaussian component, as long as uniform local laws on the optimal scale are known and the matrix satisfies the appropriate q-fullness condition (cf. Definition 1.15) that allows for an application of the moment matching (Lemma 6.5 in [20]) and the Green’s function comparison theorem (Theorem 2.3 in [20]).

Let \( \varvec{\mathrm {H}} \) be the Wigner-type matrix satisfying the hypotheses of Theorem 1.16, for which universality is to be proven. Let \( \tau \) be a bulk point of \( \rho \), so that \( \rho (\tau ) \ge \varepsilon \) for some \( \varepsilon > 0 \), and let \( I := [\tau -\delta ,\tau +\delta ] \) be a neighborhood of size \( \delta \sim 1 \) around \( \tau \). Following the above recipe, it remains to show that the local law holds for the random matrices

$$\begin{aligned} \mathbf {H}_t\,=\, \mathrm {e}^{-t/2} \mathbf {H}_0 + (1- \mathrm {e}^{-t} )^{ 1/2} \mathbf {U} , \end{aligned}$$

uniformly in both \(t \in [0,T] \) and the spectral parameters \( z \in I +\mathrm {i}[N^{\gamma -1},\infty ) \). Here T is a small negative power of N, i.e., \(T=N^{-\xi }\) for some \(\xi > 0 \), such that \( \varvec{\mathrm {H}} \) and \( \varvec{\mathrm {H}}_T \) are close in the four moment comparison sense (cf. Theorem 2.3 of [20]), and \( \varvec{\mathrm {U}}\) is a standard GUE/GOE random matrix. The random matrix \( \varvec{\mathrm {H}}_0\) has independent entries, is independent of \(\mathbf {U}\), and has a variance matrix

$$\begin{aligned} \mathbf {S}_0:= \mathrm {e}^{T}\mathbf {S} - (\mathrm {e}^{T}-1) \mathbf {S}_\mathrm{G}, \end{aligned}$$

with \( \varvec{\mathrm {S}}\) and \(\varvec{\mathrm {S}}_\mathrm{G}\) denoting the variance matrices of \( \varvec{\mathrm {H}} \) and the standard GUE/GOE-matrix, respectively. It follows that the variance matrix of \(\mathbf {H}_t\) is

$$\begin{aligned} \mathbf {S}_t\,=\, \mathrm {e}^{-t}\mathbf {S}_0+(1-\mathrm {e}^{-t}) \mathbf {S}_{\mathrm {G}}, \end{aligned}$$

and hence \( \varvec{\mathrm {S}}_T = \mathbf {S} \) as required by the moment matching.
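Indeed, spelling out the last identity,

$$\begin{aligned} \mathbf {S}_T \,=\, \mathrm {e}^{-T}\mathbf {S}_0+(1-\mathrm {e}^{-T}) \mathbf {S}_{\mathrm {G}} \,=\, \mathbf {S} - \mathrm {e}^{-T}(\mathrm {e}^{T}-1) \mathbf {S}_{\mathrm {G}}+(1-\mathrm {e}^{-T}) \mathbf {S}_{\mathrm {G}} \,=\, \mathbf {S}. \end{aligned}$$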

We will now show that the matrices \( \varvec{\mathrm {H}}_t \) satisfy the hypotheses of Corollary 1.8 uniformly in t. Since \( T=N^{-\xi } \) is small, the variance matrices \( \varvec{\mathrm {S}}_t \) are all small perturbations of \( \varvec{\mathrm {S}} \). In particular, \( \varvec{\mathrm {S}}_t \), \( t \in [0,T] \), are q/2-full.

Next we show that the interval I is inside the bulk of \( \varvec{\mathrm {H}}_t \). To this end, we consider the QVE associated to the variance matrix \( \varvec{\mathrm {S}}_t \),

$$\begin{aligned} -\frac{1}{ m_{t ; i}} \,=\, z + (\varvec{\mathrm {S}}\varvec{\mathrm {m}}_t)_i + d_i, \quad \varvec{\mathrm {d}} \,=\, ( \varvec{\mathrm {S}}_t-\varvec{\mathrm {S}} ) \varvec{\mathrm {m}}_t, \end{aligned}$$

as a perturbation of the original QVE with \( \varvec{\mathrm {S}} = \varvec{\mathrm {S}}_T \). In order to use our stability results we show \(||\varvec{\mathrm {d}} ||_\infty \lesssim T \). Since \( \varvec{\mathrm {H}}_t \) is q / 2-full we have \( s_{t;ij} \ge q/2 \) and hence using (i) of Theorem 6.1 of [1] we see that there is a constant \( \delta ' \sim 1 \) such that \( ||\varvec{\mathrm {m}}_t(z) ||_\infty \sim 1 \) uniformly for \( |\mathrm {Re}\,z | \le \delta ' \). Moreover, the structural \( \mathrm {L}^2\)-bound from Theorem 2.1 of [1] implies

$$\begin{aligned} \frac{||\varvec{\mathrm {m}}_t(z) ||^2_{\ell ^2}}{N} = \frac{1}{N}\sum _{i = 1}^N | m_{t ; i}(z) |^2\,\le \, \frac{4}{|z |^2},\quad z \in \mathbb {H},\; t \in [0,T]. \end{aligned}$$

Combining these estimates we see that \( \sup _{t,z}||\varvec{\mathrm {m}}_t(z) ||^2_{\ell ^2} \lesssim N \), and consequently the perturbation is small in the uniform norm: \( ||\varvec{\mathrm {d}} ||_\infty \lesssim N\sup _{i,j}|s_{t;ij}-s_{ij} | \lesssim N^{-\xi } \). Applying the stability (Theorem 4.2 or Theorem 2.12 from [1]) of the QVE associated to \( \varvec{\mathrm {S}}\) we conclude that \( || \varvec{\mathrm {m}}_t(z)-\varvec{\mathrm {m}}(z) ||_\infty \lesssim N^{-\xi }\varepsilon ^{-2} \), and hence \( \rho _t(\omega ) \ge \varepsilon /2\) for \(\omega \in I \) and all t , provided N is sufficiently large.
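The stability used in this last step can also be observed numerically: solving the QVE for \( \varvec{\mathrm {S}} \) and for the perturbed profile \( \varvec{\mathrm {S}}_0 \) with a naive fixed-point scheme (as in the earlier illustrative sketch), the two solutions stay uniformly close, in line with the bound \( || \varvec{\mathrm {m}}_t(z)-\varvec{\mathrm {m}}(z) ||_\infty \lesssim N^{-\xi }\varepsilon ^{-2} \). The profile and the value of T below are illustrative choices only.

```python
import numpy as np

N, T = 1000, 0.05                               # T stands in for N^{-xi}; sample value
x = np.arange(1, N + 1) / N
S = (1.0 + np.outer(x, x)) / N                  # illustrative profile of H = H_T
S_G = np.full((N, N), 1.0 / N)                  # flat GUE-type profile
S_0 = np.exp(T) * S - (np.exp(T) - 1.0) * S_G   # profile of H_0

def solve_qve(V, z, iters=5000):
    """Naive fixed-point iteration for the QVE with variance matrix V."""
    m = np.full(V.shape[0], 1j, dtype=complex)
    for _ in range(iters):
        m = -1.0 / (z + V @ m)
    return m

z = 0.4 + 0.02j                                 # bulk spectral parameter
m, m_0 = solve_qve(S, z), solve_qve(S_0, z)
print("max_ij |S_0 - S|     =", np.abs(S_0 - S).max(), "  (order T/N)")
print("||m_0 - m||_inf      =", np.abs(m_0 - m).max(), "  (small, of order T here)")
print("rho(Re z) ~ Im<m>/pi =", np.mean(m.imag) / np.pi)
```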

The moment condition (D) is automatically satisfied uniformly for every \(\mathbf {H}_t\) by construction. Since the condition (A) is merely a matter of normalization, we have now shown that the matrices \(\varvec{\mathrm {H}}_t\) satisfy the hypotheses of Corollary 1.8 uniformly in t. Thus \( \varvec{\mathrm {H}}_t \) satisfy the local law uniformly in \(t \in [0,T] \) and \( z \in I +\mathrm {i}[N^{\gamma -1},\infty ) \). This finishes the proof of universality.