Abstract
We consider \(N\times N\) Hermitian random matrices H consisting of blocks of size \(M\ge N^{6/7}\). The matrix elements are i.i.d. within the blocks, close to a Gaussian in the four moment matching sense, but their distribution varies from block to block to form a block-band structure, with an essential band width M. We show that the entries of the Green’s function \(G(z)=(H-z)^{-1}\) satisfy the local semicircle law with spectral parameter \(z=E+\mathbf {i}\eta \) down to the real axis for any \(\eta \gg N^{-1}\), using a combination of the supersymmetry method inspired by Shcherbina (J Stat Phys 155(3): 466–499, 2014) and the Green’s function comparison strategy. Previous estimates were valid only for \(\eta \gg M^{-1}\). The new estimate also implies that the eigenvectors in the middle of the spectrum are fully delocalized.
1 Introduction
Grassmann integration with supersymmetric (SUSY) methods is ubiquitous in the physics literature on random quantum systems, see e.g. the basic monograph of Efetov [7]. This approach is especially effective for analyzing the Green's function in the middle of the bulk spectrum with spectral parameter close to the real axis, i.e. precisely in the regime where other methods often fail. The main algebraic strength lies in the fact that Gaussian integrals with Grassmann variables counterbalance the determinants obtained in the partition functions of complex Gaussian integrals. This greatly simplifies algebraic manipulations, as has been demonstrated in several papers, see e.g. the proof of the absolutely continuous spectrum on the Bethe lattice by Klein [17] or the bounds on the Lyapunov exponents for random walks in random environment at the critical energy by Wang [29]. However, in theoretical physics Grassmann integration is also commonly used as an analytic tool by performing a saddle point analysis on the superspace coordinatized by complex and Grassmann variables. Since Grassmann variables lack the concept of size, the rigorous justification of this very appealing idea is notoriously difficult.
Initiated by Spencer (see [26] for a summary) and starting with the paper [4] by Disertori, Pinson and Spencer, only a handful of mathematical papers have succeeded in exploiting this powerful tool in an essentially analytic way. We still lack the mathematical framework of a full-fledged analysis on the superspace that would enable us to translate physics arguments into proofs directly, but a combination of refined algebraic identities from physics (such as the superbosonization formula) and careful analysis has yielded results that are currently inaccessible to more standard probabilistic methods. In this paper we present such results on random band matrices, which surpass a well-known limitation of the recently developed probabilistic techniques for proving local versions of the celebrated Wigner semicircle law. We start by introducing the physical motivation of our model.
The Hamiltonian of quantum systems on a graph with vertex set \(\varGamma \) is a self-adjoint matrix \(H= (h_{ab})_{a,b\in \varGamma }, H=H^*\). The matrix elements \(h_{ab}\) represent the quantum transition rates from vertex a to vertex b. Disordered quantum systems have random matrix elements. We assume they are centered, \(\mathbb {E}h_{ab} =0\), and independent subject to the basic symmetry constraint \(h_{ab} = \bar{h}_{ba}\). The variance \(\sigma _{ab}^2: = \mathbb {E} |h_{ab}|^2\) represents the strength of the transition from a to b and we use a scaling where the norm \(\Vert H\Vert \) is typically of order 1. The simplest case is the mean field model, where \(h_{ab}\) are identically distributed; this is the standard Wigner matrix ensemble [31]. The other prominent example is the Anderson model [2] or random Schrödinger operator, \(H= \varDelta +V\), where the kinetic energy \(\varDelta \) is the (deterministic) graph Laplacian and the potential \(V= ( V_x)_{x\in \varGamma }\) is an on-site multiplication operator with random multipliers. If \(\varGamma \) is a discrete \(\mathsf {d}\)-dimensional torus then only a few matrix elements \(h_{ab}\) are nonzero and they connect nearest neighbor points in the torus, \(\text {dist}(a,b)\le 1\). This is in sharp contrast to the mean field character of the Wigner matrices.
Random band matrices naturally interpolate between the mean field Wigner matrices and the short range Anderson model. They are characterized by a parameter M, called the band width, such that the matrix elements \(h_{ab}\) for \(\text {dist}(a,b)\ge M\) are zero or negligible. If M is comparable with the diameter L of the system then we are in the mean field regime, while \(M\sim 1\) corresponds to the short range model.
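For orientation, such a band structure is easy to simulate. The following sketch (our own illustration, not the block model analyzed in this paper) samples a Hermitian random band matrix on the discrete torus with band width M, normalized so that the row variances sum to approximately 1 and hence \(\Vert H\Vert = O(1)\).

```python
import numpy as np

def band_matrix(N, M, rng):
    """Sample a Hermitian random band matrix on the torus Z/NZ:
    h_ab = 0 whenever dist(a, b) >= M, and the nonzero entries are
    centered complex Gaussians scaled so the row variances sum to ~1."""
    i, j = np.indices((N, N))
    dist = np.minimum(np.abs(i - j), N - np.abs(i - j))   # torus distance
    mask = dist < M                                       # 2M - 1 nonzeros per row
    A = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
    A = np.where(mask, A, 0) / np.sqrt(2 * M - 1)
    return (A + A.conj().T) / np.sqrt(2)                  # Hermitian symmetrization

rng = np.random.default_rng(0)
H = band_matrix(200, 20, rng)
assert np.allclose(H, H.conj().T)
assert H[0, 100] == 0          # dist(0, 100) = 100 >= M: outside the band
```

With M comparable to N the mask covers (almost) the whole matrix and one recovers the mean field regime; with M of order 1 only nearest neighbors are connected.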
The Anderson model exhibits a metal-insulator phase transition: at high disorder the system is in the localized (insulator) regime, while at small disorder it is in the delocalized (metallic) regime, at least in \(\mathsf {d}\ge 3\) dimensions and away from the spectral edges. The localized regime is characterized by exponentially decaying eigenfunctions and off diagonal decay of the Green’s function, while in the complementary regime the eigenfunctions are supported in the whole physical space. In terms of the localization length \(\ell \), the characteristic length scale of the decay, the localized regime corresponds to \(\ell \ll L\), while in the delocalized regime \(\ell \sim L\). Starting from the basic papers [1, 15], the localized regime is well understood, but the delocalized regime is still an open mathematical problem for the \(\mathsf {d}\)-dimensional torus.
Let \(N=L^{\mathsf {d}}\) be the number of vertices in the discrete torus. Since the eigenvectors of the mean field Wigner matrices are always delocalized [13, 14], while the short range models are localized, by varying the parameter M in the random band matrix, one expects a (de)localization phase transition. Indeed, for \(\mathsf {d}=1\) it is conjectured (and supported by non rigorous supersymmetric calculations [16]) that the system is delocalized for broad bands, \(M\gg N^{1/2}\), and localized for \(M\ll N^{1/2}\). The optimal power 1/2 has not yet been achieved from either side. Localization has been shown for \(M\ll N^{1/8}\) in [23], while delocalization in a certain weak sense for most eigenvectors was proven for \(M\gg N^{4/5}\) in [11]. Interestingly, for a special Gaussian model even the sine kernel behavior of the 2-point correlation function of the characteristic polynomials could be proven down to the optimal band width \(M\gg N^{1/2}\), see [19, 21]. Note that the sine kernel is consistent with delocalization but does not imply it. We remark that our discussion concerns the bulk of the spectrum; the transition at the spectral edge is much better understood. In [25] it was shown with the moment method that the edge spectrum follows the Tracy–Widom distribution, characteristic of mean field models, for \(M\gg N^{5/6}\), but it yields a different distribution for narrow bands, \(M\ll N^{5/6}\).
Delocalization is closely related to estimates on the diagonal elements of the resolvent \(G(z)=(H-z)^{-1}\) at spectral parameters with small imaginary part \(\eta =\mathsf {Im}\, z\). Indeed, if \(G_{ii}(E+i\eta )\) is bounded for all i and all \(E\in {\mathbb R }\), then each \(\ell ^2\)-normalized eigenvector \(\mathbf{{u}}\) of H is delocalized on scale \(\eta ^{-1}\) in the sense that \(\max _i |u_i|^2 \lesssim \eta \), i.e. \(\mathbf{u}\) is supported on at least \(\eta ^{-1}\) sites. In particular, if \(G_{ii}\) can be controlled down to the scale \(\eta \sim 1/N\), then the system is in the completely delocalized regime.
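The mechanism behind this implication is a one-line spectral decomposition argument. Writing \(\lambda_k\) for the eigenvalues and \(u_k(i)\) for the eigenvector components,

```latex
\mathsf{Im}\, G_{ii}(E+\mathbf{i}\eta)
  \;=\; \sum_{k}\frac{\eta\,|u_k(i)|^{2}}{(\lambda_k-E)^{2}+\eta^{2}}
  \;\ge\; \frac{|u_{k_0}(i)|^{2}}{\eta}
  \qquad \text{at } E=\lambda_{k_0},
```

so a bound \(|G_{ii}(\lambda_{k_0}+\mathbf{i}\eta)|\le C\) forces \(|u_{k_0}(i)|^2\le C\eta \) for every site i and every eigenvector \(\mathbf{u}_{k_0}\).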
For band matrices with band width M, or even under the more general condition \( \sigma _{ab}^2\le M^{-1}\), the boundedness of \(G_{ii}\) was shown down to scale \(\eta \gg M^{-1}\) in [14] (see also [12]). If \(M\gg N^{1/2}\), it is expected that \(G_{ii}\) remains bounded even down to \(\eta \gg N^{-1}\), the typical eigenvalue spacing and the smallest relevant scale in the model. However, the standard approach [12, 14] via the self-consistent equations for the Green's function does not seem to work for \(\eta \le 1/M\); the fluctuation is hard to control. The more subtle approach using the self-consistent matrix equation in [11] could prove delocalization and an off-diagonal Green's function profile consistent with the conventional quantum diffusion picture, but it was valid only for relatively large \(\eta \), far from \(M^{-1}\). Moment methods, even with a delicate renormalization scheme [24], could not break the barrier \(\eta \sim M^{-1}\) either.
In this paper we attack the problem differently: with supersymmetric (SUSY) techniques. Our main result is that \(G_{ii}(z)\) is bounded, and the local semicircle law holds, for any \(\eta \gg N^{-1}\), i.e. down to the optimal scale, provided the band width is not too small, \(M\gg N^{6/7}\), and under two technical assumptions. First, we consider a generalization of Wegner's n-orbital model [22, 30], namely, we assume that the band matrix has a block structure, i.e. it consists of \(M\times M\) blocks and the matrix elements within each block have the same distribution. This assumption is essential to reduce the number of integration variables in the supersymmetric representation, since, roughly speaking, each \(M\times M\) block will be represented by a single supermatrix with 16 complex or Grassmann variables. Second, we assume that the distribution of the matrix elements matches a Gaussian up to four moments in the spirit of [28]. Supersymmetry heavily relies on Gaussian integration; in fact, all mathematically rigorous works on random band matrices with the supersymmetry method assume that the matrix elements are Gaussian, see [4–6, 19–21, 26, 27]. The Green's function comparison method [14] allows one to compare the Green's functions of two matrix ensembles provided that the distributions match up to four moments and that \(G_{ii}\) are bounded. This was an important motivation to reach the optimal scale \(\eta \gg N^{-1}\).
In the next subsections we introduce the model precisely and state our main results. Our supersymmetric analysis was inspired by [20], but our observable, \(G_{ab}\), requires a partly different formalism, in particular we use the singular version of the superbosonization formula [3]. Moreover, our analysis is considerably more involved since we consider relatively narrow bands. In Sect. 1.3, we explain our novelties compared with [20].
1.1 Matrix model
Let \(H_N=(h_{ab})\) be an \(N\times N\) random Hermitian matrix, in which the entries are independent (up to symmetry), centered, complex variables. In this paper, we are concerned with \(H_N\) possessing a block band structure. To define this structure explicitly, we set the additional parameters \(M\equiv M(N)\) and \(W\equiv W(N)\) satisfying
$$\begin{aligned} N=MW. \end{aligned}$$
For simplicity, we assume that both M and W are integers. Let \(S=(\mathfrak {s}_{jk})\) be a \(W\times W\) symmetric matrix, which will be chosen as a weighted Laplacian of a connected graph on W vertices. Now, we decompose \(H_N\) into \(W\times W\) blocks of size \(M\times M\) and relabel
$$\begin{aligned} h_{ab}\equiv h_{jk,\alpha \beta }, \end{aligned}$$
where \(j\equiv j(a)\) and \(k\equiv k(b)\) are the spatial indices that describe the location of the block containing \(h_{ab}\) and \(\alpha \equiv \alpha (a)\) and \(\beta \equiv \beta (b)\) are the orbital indices that describe the location of the entry in the block. More specifically, we have
$$\begin{aligned} j(a)=\Big \lceil \frac{a}{M}\Big \rceil ,\qquad \alpha (a)=a-(j(a)-1)M, \end{aligned}$$
and analogously for \(k(b)\) and \(\beta (b)\).
We will call \((j(a),\alpha (a))\) (resp. \((k(b),\beta (b))\)) the spatial-orbital parametrization of a (resp. b). Moreover, we assume
$$\begin{aligned} \mathbb {E}|h_{ab}|^2=\frac{1}{M}\,\widetilde{\mathfrak {s}}_{j(a)k(b)}. \end{aligned}$$(1.1)
That means, the variance profile of the random matrix \(\sqrt{M}H_N\) is given by
$$\begin{aligned} \widetilde{S}=(\widetilde{\mathfrak {s}}_{jk})_{j,k=1}^W:=I+S, \end{aligned}$$(1.2)
in which each entry represents the common variance of the entries in the corresponding block of \(\sqrt{M}H_N\).
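To make the block structure concrete, here is a small numerical sketch (our own; the profile \(\widetilde{S}\) below is a toy doubly stochastic choice, not one mandated by the assumptions). Every entry of block (j, k) receives the common variance \(\widetilde{\mathfrak {s}}_{jk}/M\), so each row of variances of H sums to 1.

```python
import numpy as np

rng = np.random.default_rng(1)
W, M = 4, 50
N = W * M                       # number of blocks W, block size M, N = M * W

# Toy symmetric doubly stochastic profile (our own choice): nearest-neighbour
# weight a on a ring of W blocks, 1 - 2a on the diagonal.
a = 0.2
S_tilde = (1 - 2 * a) * np.eye(W)
for j in range(W):
    S_tilde[j, (j + 1) % W] += a
    S_tilde[j, (j - 1) % W] += a

# Entry h_ab, with a in block j and b in block k, has variance S_tilde[j,k] / M.
std = np.sqrt(np.kron(S_tilde, np.ones((M, M))) / M)
X = std * (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
L = np.tril(X, -1)
H = L + L.conj().T + np.diag(std.diagonal() * rng.standard_normal(N))

assert np.allclose(H, H.conj().T)                  # Hermitian
assert np.allclose((std ** 2).sum(axis=1), 1.0)    # row variances sum to 1
```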
1.2 Assumptions and main results
In the sequel, for a matrix \(A=(a_{ij})\) and index sets \(\mathsf {I}\) and \(\mathsf {J}\), we introduce the notation \(A^{(\mathsf {I}|\mathsf {J})}\) for the submatrix obtained by deleting the ith row and jth column of A for all \(i\in \mathsf {I}\) and \(j\in \mathsf {J}\). We will adopt the abbreviation
$$\begin{aligned} A^{(\mathsf {I})}:=A^{(\mathsf {I}|\mathsf {I})},\qquad A^{(i)}:=A^{(\{i\}|\{i\})}. \end{aligned}$$
In addition, we use \(||A||_{\max }:=\max _{i,j}|a_{ij}|\) to denote the max norm of A. Throughout the paper, we need some assumptions on S.
Assumption 1.1
(On S) Let \(\mathcal {G}=(\mathcal {V},\mathcal {E})\) be a connected simple graph with \(\mathcal {V}=\{1,\ldots , W\}\). Assume that S is a \(W\times W\) symmetric matrix satisfying the following four conditions.
(i) S is a weighted Laplacian on \(\mathcal {G}\), i.e. for \(i\ne j\), we have \(\mathfrak {s}_{ij}>0\) if \(\{i,j\}\in \mathcal {E}\) and \(\mathfrak {s}_{ij}=0\) if \(\{i,j\}\not \in \mathcal {E}\), and for the diagonal entries, we have
$$\begin{aligned} \mathfrak {s}_{ii}=-\sum _{j:j\ne i}\mathfrak {s}_{ij},\quad \forall \; i=1,\ldots ,W. \end{aligned}$$
(ii) \(\widetilde{S}\) defined in (1.2) is strictly diagonally dominant, i.e. there exists a constant \(c_0>0\) such that
$$\begin{aligned} 1+2\mathfrak {s}_{ii}>c_0,\quad \forall \; i=1,\ldots , W. \end{aligned}$$
(iii) For the discrete Green's functions, we assume that there exist positive constants C and \(\gamma \) such that
$$\begin{aligned} \max _{i=1,\ldots , W}||(S^{(i)})^{-1}||_{\max }\le CW^{\gamma }. \end{aligned}$$(1.4)
(iv) There exists a spanning tree \(\mathcal {G}_0=(\mathcal {V},\mathcal {E}_0)\subset \mathcal {G}\) on which the weights are bounded below, i.e. for some constant \(c>0\), we have
$$\begin{aligned} \mathfrak {s}_{ij}\ge c, \quad \forall \; \{i,j\}\in \mathcal {E}_0. \end{aligned}$$
Remark 1.2
From Assumption 1.1 (ii) and the Laplacian structure in (i), we easily see that
$$\begin{aligned} \max _{i=1,\ldots ,W}|\mathfrak {s}_{ii}|<\frac{1-c_0}{2}. \end{aligned}$$
Later, in Lemma 7.4, we will see that \(||(S^{(i)})^{-1}||_{\max }\le CW^2\) always holds. Hence, we may assume \(\gamma \le 2\).
Example 1.1
Let \(\varDelta \) be the standard discrete Laplacian on the \(\mathsf {d}\)-dimensional torus \([1,\mathfrak {w}]^{\mathsf {d}}\cap \mathbb {Z}^{\mathsf {d}}\), with periodic boundary condition, where \(\mathfrak {w}=W^{1/\mathsf {d}}\). Here by standard we mean that the weights on the edges are all 1. Now let \(S=a\varDelta \) for some positive constant \(a<1/(4\mathsf {d})\). It is then easy to check that Assumption 1.1 (i), (ii) and (iv) are satisfied. In addition, if \(\mathsf {d}=1\), it is well known that one can choose \(\gamma =1\) in Assumption 1.1 (iii). For \(\mathsf {d}\ge 3\), one can choose \(\gamma =0\). For \(\mathsf {d}=2\), one can choose \(\gamma =\varepsilon \) for an arbitrarily small constant \(\varepsilon \). We refer to [8] for more details.
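A quick numerical check of conditions (i)–(iii) for the \(\mathsf {d}=1\) example (our own script; the numerical constants are illustrative only):

```python
import numpy as np

W, a = 10, 0.2                  # d = 1, so the constraint is a < 1/4
S = np.zeros((W, W))
for i in range(W):
    S[i, (i + 1) % W] = a       # weights on the edges of the periodic torus
    S[i, (i - 1) % W] = a
    S[i, i] = -2 * a            # Laplacian diagonal: minus the sum of the weights

# (i): every row of the weighted Laplacian sums to zero
assert np.allclose(S.sum(axis=1), 0.0)
# (ii): 1 + 2 s_ii = 1 - 4a is bounded below by a positive constant
assert np.all(1 + 2 * np.diag(S) > 0.1)
# (iii) with gamma = 1: the max norm of (S^{(i)})^{-1} is O(W)
assert np.abs(np.linalg.inv(S[1:, 1:])).max() < 10 * W
```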
For simplicity, we also introduce the notation
where \(\mathbf {1}_M\) is the M-dimensional vector whose components are all 1 and \(\widetilde{S}\) is the variance matrix in (1.2). It is elementary that
Our assumption on M depends on the constant \(\gamma \) in Assumption 1.1 (iii).
Assumption 1.3
(On M) We assume that there exists a (small) positive constant \(\varepsilon _1\) such that
Remark 1.4
A direct consequence of (1.8) and \(N=MW\) is
In particular, when \(\gamma =1\), one has \(M\gg N^{6/7}\). Actually, through a more involved analysis, (1.8) [or (1.9)] can be further improved; at least for \(\gamma \le 1\), we expect that \(M\gg N^{4/5}\) is enough. However, we will not pursue this direction here.
Besides Assumption 1.1 on the variance profile of H, we need to impose some additional assumption on the distribution of its entries. To this end, we temporarily employ the notation \(H^g=(h^g_{ab})\) to represent a random block band matrix with Gaussian entries, satisfying (1.1), Assumptions 1.1 and 1.3.
Assumption 1.5
(On distribution) We assume that for each \(a,b\in \{1,\ldots , N\}\), the moments of the entry \(h_{ab}\) match those of \(h_{ab}^g\) up to the 4th order, i.e.
$$\begin{aligned} \mathbb {E}\, h_{ab}^{s}\,\bar{h}_{ab}^{t}=\mathbb {E}\,(h^{g}_{ab})^{s}\,(\bar{h}^{g}_{ab})^{t},\qquad 0\le s+t\le 4. \end{aligned}$$(1.10)
In addition, we assume the distribution of \(h_{ab}\) possesses a subexponential tail, namely, there exist positive constants \(c_1\) and \(c_2\) such that for any \(\tilde{\gamma }>0\),
$$\begin{aligned} \mathbb {P}\left( |h_{ab}|\ge \tilde{\gamma }^{c_1}M^{-1/2}\right) \le c_2e^{-\tilde{\gamma }} \end{aligned}$$(1.11)
holds uniformly for all \(a,b=1,\ldots , N\).
The four moment condition (1.10) in the context of random matrices first appeared in [28].
To state our results, we will need the following notion of comparison of two random sequences, which was introduced in [9, 12].
Definition 1.6
(Stochastic domination) For some possibly N-dependent parameter set \(\mathsf {U}_N\), and two families of random variables \(\mathsf {X}=(\mathsf {X}_N(u): N\in \mathbb {N},u\in \mathsf {U}_{N})\) and \(\mathsf {Y}=(\mathsf {Y}_N(u): N\in \mathbb {N},u\in \mathsf {U}_N)\), we say that \(\mathsf {X}\) is stochastically dominated by \(\mathsf {Y}\), if for all \(\varepsilon '>0\) and \(D>0\) we have
$$\begin{aligned} \sup _{u\in \mathsf {U}_N}\mathbb {P}\left( \mathsf {X}_N(u)>N^{\varepsilon '}\mathsf {Y}_N(u)\right) \le N^{-D} \end{aligned}$$
for all sufficiently large \(N\ge N_0(\varepsilon ', D)\). In this case we write \(\mathsf {X}\prec \mathsf {Y}\).
The set \(\mathsf {U}_N\) is omitted from the notation \(\mathsf {X}\prec \mathsf {Y}\). Whenever we want to emphasize the role of \(\mathsf {U}_N\), we say that \(\mathsf {X}_N(u)\prec \mathsf {Y}_N(u)\) holds for all \(u\in \mathsf {U}_N\). For example, by (1.1) and Assumption 1.5, we have
$$\begin{aligned} |h_{ab}|\prec M^{-1/2}. \end{aligned}$$
Note that here \(\mathsf {U}_N=\{u=(a,b): a,b=1,\ldots ,N\}\). In some applications, we also use this notation for random variables without any parameter or with a fixed parameter, i.e. the set of parameters \(\mathsf {U}_N\) plays no role.
Note that \(\widetilde{S}\) is doubly stochastic. It is known that the empirical eigenvalue distribution of \(H_N\) converges to the semicircle law, whose density function is given by
$$\begin{aligned} \varrho _{sc}(x)=\frac{1}{2\pi }\sqrt{(4-x^2)_+}\,, \end{aligned}$$
see [14] for instance. We denote the Green's function of \(H_N\) by
$$\begin{aligned} G(z):=(H_N-z)^{-1},\qquad z=E+\mathbf {i}\eta \in \mathbb {C}^+, \end{aligned}$$
and its (a, b) matrix element is \(G_{ab}(z)\). Throughout the paper, we will always use E and \(\eta \) to denote the real and imaginary part of z without further mention. In addition, for simplicity, we suppress the subscript N from the notation of the matrices here and there. The Stieltjes transform of \(\varrho _{sc}(x)\) is
$$\begin{aligned} m_{sc}(z):=\int _{\mathbb {R}}\frac{\varrho _{sc}(x)}{x-z}\,\mathrm{d}x=\frac{-z+\sqrt{z^2-4}}{2}, \end{aligned}$$
where we chose the branch of the square root with positive imaginary part for \(z\in \mathbb {C}^+\). Note that \(m_{sc}(z)\) is a solution to the following self-consistent equation
$$\begin{aligned} m_{sc}(z)=\frac{1}{-z-m_{sc}(z)}. \end{aligned}$$
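Assuming the standard normalization in which \(\varrho _{sc}\) is supported on \([-2,2]\), the Stieltjes transform and its self-consistent equation can be checked numerically; the following sketch (our own, for illustration only) implements the branch choice explicitly.

```python
import numpy as np

def m_sc(z):
    """Stieltjes transform of the semicircle law on [-2, 2]:
    m_sc(z) = (-z + sqrt(z^2 - 4)) / 2, with the square-root branch
    chosen so that Im m_sc(z) > 0 for Im z > 0."""
    s = np.sqrt(z * z - 4 + 0j)
    if s.imag * z.imag < 0:      # flip to the root with the right sign
        s = -s
    return (-z + s) / 2

z = 0.3 + 0.01j
m = m_sc(z)
# m_sc solves the self-consistent equation m = 1 / (-z - m),
# equivalently m^2 + z*m + 1 = 0
assert abs(m - 1 / (-z - m)) < 1e-12
assert m.imag > 0
```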
The semicircle law also holds in a local sense, see Theorem 2.3 in [12]. For simplicity, we cite this result with a slight modification adjusted to our assumption.
Proposition 1.7
(Erdős, Knowles, Yau, Yin, [12]) Let H be a random block band matrix satisfying Assumptions 1.1, 1.3 and 1.5. Then, for any fixed small positive constants \(\kappa \) and \(\varepsilon \), we have
$$\begin{aligned} \max _{a,b}|G_{ab}(z)-\delta _{ab}m_{sc}(z)|\prec \sqrt{\frac{1}{M\eta }} \end{aligned}$$(1.15)
uniformly for all \(z=E+\mathbf {i}\eta \) with \(|E|\le 2-\kappa \) and \(\eta \ge M^{-1}N^{\varepsilon }\).
Remark 1.8
We remark that Theorem 2.3 in [12] was established under the more general assumptions \(\sum _k \sigma _{jk}^2=1\) and \(\sigma _{jk}^2\le C/M\). In particular, the block structure on the variance profile is not needed. In addition, Theorem 2.3 in [12] also covers the edges of the spectrum, which will not be discussed in this paper. We also refer to [14] for a previous result, see Theorem 2.1 therein.
Our aim in this paper is to extend the local semicircle law to the regime \(\eta \gg N^{-1}\) and replace M with N in (1.15). More specifically, we will work in the following set, defined for an arbitrarily small constant \(\kappa >0\) and any sufficiently small positive constant \( \varepsilon _2:=\varepsilon _2(\varepsilon _1)\),
$$\begin{aligned} \mathbf {D}(N,\kappa ,\varepsilon _2):=\left\{ z=E+\mathbf {i}\eta \in \mathbb {C}^+:\, |E|\le \sqrt{2}-\kappa ,\; N^{-1+\varepsilon _2}\le \eta \le M^{-1}N^{\varepsilon _2}\right\} . \end{aligned}$$(1.16)
Throughout the paper, we will assume that \(\varepsilon _2\) is much smaller than \(\varepsilon _1\), see (1.8) for the latter. Specifically, there exists some large enough constant C such that \(\varepsilon _2\le \varepsilon _1/C\).
Theorem 1.9
(Local semicircle law) Suppose that H is a random block band matrix satisfying Assumptions 1.1, 1.3 and 1.5. Let \(\kappa \) be an arbitrarily small positive constant and \(\varepsilon _2\) be any sufficiently small positive constant. Then
$$\begin{aligned} \max _{a,b}|G_{ab}(z)-\delta _{ab}m_{sc}(z)|\prec \sqrt{\frac{1}{N\eta }} \end{aligned}$$(1.17)
for all \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\).
Remark 1.10
In fact, (1.17) together with the fact that \(G_{ab}(z)\) and \(m_{sc}(z)\) are Lipschitz functions of z with Lipschitz constant \(\eta ^{-2}\) implies that the estimate holds uniformly in z in the following stronger sense: for all \(\varepsilon '>0\) and \(D>0\),
$$\begin{aligned} \mathbb {P}\left( \bigcap _{z\in \mathbf {D}(N,\kappa ,\varepsilon _2)}\left\{ \max _{a,b}|G_{ab}(z)-\delta _{ab}m_{sc}(z)|\le N^{\varepsilon '}\sqrt{\frac{1}{N\eta }}\right\} \right) \ge 1- N^{-D} \end{aligned}$$(1.18)
for all sufficiently large N.
Remark 1.11
The restriction \(|E|\le \sqrt{2}-\kappa \) in (1.16) is technical. We believe the result can be extended to the whole bulk regime of the spectrum, i.e., \(|E|\le 2-\kappa \). The upper bound of \(\eta \) in (1.16) is also technical. However, for \(\eta > M^{-1}N^{\varepsilon _2}\), one can control the Green’s function by (1.15) directly.
Let \(\lambda _1,\ldots ,\lambda _N\) be the eigenvalues of \(H_N\). We denote by \(\mathbf {u}_i:=(u_{i1},\ldots , u_{iN})\) the normalized eigenvector of \(H_N\) corresponding to \(\lambda _i\). From Theorem 1.9, we can also get the following delocalization property for the eigenvectors.
Theorem 1.12
(Complete delocalization) Let H be a random block band matrix satisfying Assumptions 1.1, 1.3 and 1.5. Then, for any arbitrarily small constant \(\kappa >0\), we have
$$\begin{aligned} \max _{i:\, |\lambda _i|\le \sqrt{2}-\kappa }\;\max _{j=1,\ldots ,N}|u_{ij}|^2\prec \frac{1}{N}. \end{aligned}$$(1.19)
Remark 1.13
We remark that delocalization in a certain weak sense was proven in [11] for an even more general class of random band matrices if \(M\gg N^{4/5}\). However, Theorem 1.12 asserts delocalization for all bulk eigenvectors in a very strong sense (supremum norm), while Proposition 7.1 of [11] states that most eigenvectors are delocalized in the sense that their substantial support cannot be too small.
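As a numerical illustration of the sup-norm statement (our own sketch; it uses the mean-field case \(M=N\) as a convenient stand-in, not the block band ensemble of the theorem), eigenvector sup norms are indeed of size \(N^{-1+o(1)}\):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 300
# GUE-like mean-field matrix (M = N), normalized so the spectrum is O(1)
A = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
H = (A + A.conj().T) / np.sqrt(2 * N)
_, U = np.linalg.eigh(H)                 # columns: l2-normalized eigenvectors
sup2 = (np.abs(U) ** 2).max(axis=0)      # sup-norm squared of each eigenvector
# complete delocalization: every eigenvector is spread over ~N sites
assert sup2.max() < 30 * np.log(N) / N
```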
1.3 Outline of the proof strategy and novelties
In this section, we briefly outline the strategy for the proof of Theorem 1.9.
The first step, which is the main task of the whole proof, is to establish the following Theorem 1.15, namely, an a priori estimate of the Green's function in the Gaussian case. For technical reasons, we need the following slight modification of Assumption 1.3 to state the result.
Assumption 1.14
(On M) Let \(\varepsilon _1\) be the small positive constant in Assumption 1.3. We assume
In the regime \(M\ge N(\log N)^{-10}\), (1.17) follows from (1.15) directly anyway.
Theorem 1.15
Assume that H is a Gaussian block band matrix, satisfying Assumptions 1.1 and 1.14. Let n be any fixed positive integer. Let \(\kappa \) be an arbitrarily small positive constant and \(\varepsilon _2\) be any sufficiently small positive constant. There is \(N_0=N_0(n)\), such that for all \(N\ge N_0\) and all \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\), we have
$$\begin{aligned} \mathbb {E}|G_{ab}(z)|^{2n}\le N^{C_0} \end{aligned}$$(1.21)
for some positive constant \(C_0\) independent of n and z.
Remark 1.16
A much more delicate analysis can show that the prefactor \(N^{C_0}\) can be improved to some n-dependent constant \(C_n\). We refer to Sect. 12 for further comments on this issue.
Using the definition of stochastic domination in Definition 1.6, a simple Markov inequality shows that (1.21) implies
$$\begin{aligned} |G_{ab}(z)|\prec 1. \end{aligned}$$(1.22)
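To spell out the Markov step (a sketch, assuming (1.21) is the moment bound \(\mathbb {E}|G_{ab}(z)|^{2n}\le N^{C_0}\) of Theorem 1.15): for any \(\varepsilon '>0\) and \(D>0\),

```latex
\mathbb{P}\big(|G_{ab}(z)| > N^{\varepsilon'}\big)
  \;\le\; N^{-2n\varepsilon'}\,\mathbb{E}|G_{ab}(z)|^{2n}
  \;\le\; N^{C_0 - 2n\varepsilon'}
  \;\le\; N^{-D}
```

as soon as \(n\ge (C_0+D)/(2\varepsilon ')\); since \(C_0\) does not depend on n, the exponent n may be chosen this large, which yields \(|G_{ab}(z)|\prec 1\).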
The proof of Theorem 1.15 is the main task of our paper. We will use the supersymmetry method. We partially rely on the arguments from Shcherbina's work [20] concerning universality of the local 2-point function, and we develop new techniques to treat our observable, high moments of the entries of G(z), in a more general setting. We will comment on the novelties later in this subsection.
The second step is to generalize Theorem 1.15 from the Gaussian case to general distributions satisfying Assumption 1.5, via the Green's function comparison strategy initiated in [14], see Lemma 2.1 below.
The last step is to use Lemma 2.1 and its Corollary 2.2 to prove our main theorems. Using (1.22) above to bound the error term in the self-consistent equation for the Green’s function, we can prove Theorem 1.9 by a continuity argument in z, with the aid of the initial estimate for large \(\eta \) provided in Proposition 1.7. Theorem 1.12 will then easily follow from Theorem 1.9.
The second and the last steps are carried out in Sect. 2. The main body of this paper, Sects. 3–11, is devoted to the proof of Theorem 1.15.
One of the main novelties of this work is to combine the supersymmetry method with the Green's function comparison strategy in order to go beyond the Gaussian ensemble, which was so far the only random band matrix ensemble amenable to the supersymmetry method, as mentioned at the beginning. The comparison strategy requires an a priori control on the individual matrix elements of the Green's function with high probability (see (1.22)); this is one of our main motivations behind Theorem 1.15.
Although we consider a different observable than [20], many technical aspects of the supersymmetric analysis overlap with [20]. For the convenience of the reader, we now briefly describe the strategy of [20], and highlight the main novelties of our work.
In [20], the author considers the 2-point correlation function of the trace of the resolvent of the Gaussian block band matrix H, with the variance profile \(\widetilde{S}=1+a\varDelta \), under the assumption \(M\sim N\) (note that we use M instead of W of [20] for the size of the blocks). The 2-point correlation function can be expressed in terms of a superintegral of a superfunction \(F(\{\breve{\mathcal {S}}_i\}_{i=1}^W)\) of a collection of \(4\times 4\) supermatrices \(\breve{\mathcal {S}}_i:=\mathcal {Z}^*_i\mathcal {Z}_i\). Here, for each i, \(\mathcal {Z}_i=(\varPsi _{1,i},\varPsi _{2,i},\varPhi _{1,i},\varPhi _{2,i})\) is an \(M\times 4\) matrix and \(\mathcal {Z}^*_i\) is its conjugate transpose, where \(\varPsi _{1,i}\) and \(\varPsi _{2,i}\) are Grassmann M-vectors whilst \(\varPhi _{1,i}\) and \(\varPhi _{2,i}\) are complex M-vectors. Then, by using the superbosonization formula in the nonsingular case (\(M\ge 4\)) from [18], one can transform the superintegral of \(F(\{\breve{\mathcal {S}}_i\}_{i=1}^W)\) into a superintegral of \(F(\{\mathcal {S}_i\}_{i=1}^W)\), where each \(\mathcal {S}_i\) is a supermatrix akin to \(\breve{\mathcal {S}}_i\), but consisting of only 16 independent variables (either complex or Grassmann). We will call the integral representation of the observable obtained after applying the superbosonization formula the final integral representation. Schematically, it has the form
$$\begin{aligned} \int \mathrm{d}\mathcal {S}_c\; \mathsf {g}(\mathcal {S}_c)\, e^{M\mathsf {f}_c(\mathcal {S}_c)}\int \mathrm{d}\mathcal {S}_g\; e^{\mathsf {f}_g(\mathcal {S}_g,\mathcal {S}_c)} \end{aligned}$$(1.23)
for some functions \(\mathsf {g}(\cdot ), \mathsf {f}_c(\cdot )\) and \(\mathsf {f}_g(\cdot )\), where we used the abbreviation \(\mathcal {S}:=\{\mathcal {S}_i\}_{i=1}^W\), and \(\mathcal {S}_c\) and \(\mathcal {S}_g\) represent the collections of all complex variables and all Grassmann variables in \(\mathcal {S}\), respectively. Here, \(\mathsf {g}(\mathcal {S}_c)\) and \(\mathsf {f}_c(\mathcal {S}_c)\) are complex functions and \(\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_c)\) will mostly be regarded as a function of the Grassmann variables with the complex variables as its parameters. The number of variables (either complex or Grassmann) in the final integral representation then turns out to be of order W, which is much smaller than the original order N. In fact, in [20] it is assumed that \(W=O(1)\), although the author also mentions the possibility of dealing with the case \(W\sim N^{\varepsilon }\) for some small positive \(\varepsilon \), see the remark below Theorem 1 therein.
Performing a saddle point analysis for the complex measure \(\exp \{M\mathsf {f}_c(\mathcal {S}_c)\}\), one can restrict the integral to a small vicinity of some saddle point, say, \(\mathcal {S}_c=\mathcal {S}_{c0}\). It turns out that \(\mathsf {f}_c(\mathcal {S}_{c0})=0\) and \(\mathsf {f}_c(\mathcal {S}_c)\) decays quadratically away from \(\mathcal {S}_{c0}\). Consequently, by plugging in the saddle point \(\mathcal {S}_{c0}\), one can estimate \(\mathsf {g}(\mathcal {S}_c)\) by \(\mathsf {g}(\mathcal {S}_{c0})\) directly. However, \(\exp \{M\mathsf {f}_c(\mathcal {S}_c)\}\) and \(\exp \{\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_c)\}\) have to be expanded around the saddle point. Roughly speaking, in some vicinity of \(\mathcal {S}_{c0}\), the expansions read
$$\begin{aligned} e^{M\mathsf {f}_c(\mathcal {S}_c)}=e^{-\mathbf {u}'{\mathbb {A}}\mathbf {u}}\big (1+\mathsf {e}_c(\mathbf {u})\big ),\qquad e^{\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_c)}=\mathsf {p}(\varvec{\rho },\varvec{\tau },\mathbf {u})\, e^{-\varvec{\rho }'{\mathbb {H}}\varvec{\tau }}. \end{aligned}$$(1.24)
Here \(\mathbf {u}\) is a real vector of dimension O(W), which is essentially a vectorization of \(\sqrt{M}(\mathcal {S}_c-\mathcal {S}_{c0})\); \(\mathsf {e}_c(\mathbf {u})=o(1)\) is some error term; \(\varvec{\rho }\) and \(\varvec{\tau }\) are two Grassmann vectors of dimension O(W); \({\mathbb {H}}\) is a complex matrix (cf. (9.26)), and \(\mathbb {A}\) is a complex matrix with positive-definite Hermitian part (the explicit form of \(\mathbb {A}\) can be read off from (8.30)). Moreover, \(\mathbb {A}\) is closely related to \(\mathbb {H}\) in the sense that the determinant of a certain minor of \(\mathbb {H}\) (with two rows and two columns removed) is proportional to the square root of the determinant of \( \mathbb {A}\), up to trivial factors. In addition, \(\mathsf {p}(\varvec{\rho },\varvec{\tau },\mathbf {u})\) is the expansion of \(\exp \{\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_c)-\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_{c0})\}\), which possesses the form
$$\begin{aligned} \mathsf {p}(\varvec{\rho },\varvec{\tau },\mathbf {u})=\sum _{\ell \ge 0}M^{-\ell /2}\,\mathsf {p}_\ell (\varvec{\rho },\varvec{\tau },\mathbf {u}), \end{aligned}$$(1.25)
where \(\mathsf {p}_\ell (\varvec{\rho },\varvec{\tau },\mathbf {u})\) is a polynomial of degree \(2\ell \) in the components of \(\varvec{\rho }\) and \(\varvec{\tau }\), regarding \(\mathbf {u}\) as fixed parameters. Now, keeping the leading order term of \(\mathsf {p}(\varvec{\rho },\varvec{\tau },\mathbf {u})\) and discarding the remainder terms, one obtains the final estimate of the integral by taking the Gaussian integral over \(\mathbf {u}, \varvec{\rho }\) and \(\varvec{\tau }\). This completes the summary of [20].
Similarly to [20], we also use the superbosonization formula to reduce the number of variables and perform the saddle point analysis on the resulting integral. However, owing to the following three main aspects, our analysis is significantly different from [20].
(Different observable) Our objective is to compute high moments of a single entry of the Green's function. By using Wick's formula (see Proposition 3.1), we express \(\mathbb {E}|G_{jk}|^{2n}\) in terms of a superintegral of a superfunction of the form
$$\begin{aligned} \tilde{F}\left( \{\varPsi _{a,j},\varPsi ^*_{a,j},\varPhi _{a,j}, \varPhi ^*_{a,j}\}_{\begin{array}{c} a=1,2;\\ j=1,\ldots ,W \end{array}}\right) :=\left( \bar{\phi }_{1,q,\beta }\phi _{1,p,\alpha }\bar{\phi }_{2,p,\alpha }\phi _{2,q,\beta }\right) ^nF(\{\breve{\mathcal {S}}_i\}_{i=1}^W) \end{aligned}$$for some \(p,q\in \{1,\ldots , W\}\) and \(\alpha ,\beta \in \{1,\ldots ,M\}\), where \(\phi _{1,p,\alpha }\) is the \(\alpha \)th coordinate of \(\varPhi _{1,p}\), and the others are defined analogously. Unlike in [20], \(\tilde{F}\) is not a function of \(\{\breve{\mathcal {S}}_i\}_{i=1}^W\) only. Hence, using the superbosonization formula to change \(\breve{\mathcal {S}}_i\) to \(\mathcal {S}_i\) directly is not feasible in our case. In order to handle the factor \(\big (\bar{\phi }_{1,q,\beta }\phi _{1,p,\alpha }\bar{\phi }_{2,p,\alpha }\phi _{2,q,\beta }\big )^n\), the main idea is to split off certain rank-one supermatrices from \(\breve{\mathcal {S}}_p\) and \(\breve{\mathcal {S}}_q\) so that this factor can be expressed in terms of the entries of these rank-one supermatrices. Then we use the superbosonization formula, not only in the nonsingular case from [18] but also in the singular case from [3], to change and reduce the variables, resulting in the final integral representation of \(\mathbb {E}|G_{jk}|^{2n}\). Though this final integral representation, very schematically, is still of the form (1.23), due to the decomposition of the supermatrices \(\breve{\mathcal {S}}_p\) and \(\breve{\mathcal {S}}_q\), it is considerably more complicated than its counterpart in [20]. In particular, the function \(\mathsf {g}(\mathcal {S}_c)\) differs from its counterpart in [20], and its estimate at the saddle point requires a different argument.
(Small band width) In [20], the author considers the case where the band width M is comparable with N, i.e. the number of blocks W is finite. Though the derivation of the 2-point correlation function is highly nontrivial even with such a large band width, our objective, the local semicircle law and delocalization of the eigenvectors, can be proved for \(M\sim N\) in a manner similar to the Wigner case (\(M=N\)), see [12, 14]. In this work, we deal with a much smaller band width in order to go beyond the results in [12, 14], see Assumption 1.3. Several main difficulties stemming from a narrow band width can be explained heuristically as follows.
First, let us focus on the integral over the small vicinity of the saddle point, in which the exponential functions in the integrand of (1.23) approximately take the form (1.24).
We regard the first term in (1.24) as a complex Gaussian measure of dimension O(W). When \(W\sim 1\), one can discard the error term \(\mathsf {e}_c(\mathbf {u})\) directly and perform the Gaussian integral over \(\mathbf {u}\), owing to the fact that \(\int \mathrm{d}\mathbf {u}\exp \{-\mathbf {u}'\mathsf {Re}(\mathbb {A})\mathbf {u}\}|\mathsf {e}_c(\mathbf {u})|=o(1)\). However, such an estimate is no longer valid when \(W\sim N^{\varepsilon }\) (say), because the normalization of the measure \(\exp \{-\mathbf {u}'\mathsf {Re}(\mathbb {A})\mathbf {u}\}\) might be exponentially larger than that of \(\exp \{-\mathbf {u}'\mathbb {A}\mathbf {u}\}\). In order to handle this issue, in Sect. 8.2 we will perform a second deformation of the contours of the variables in \(\mathbf {u}\), following the steepest descent paths exactly, whereby we transform the complex Gaussian measure into a real one (cf. (8.45)), so that the error term of the integral can be controlled.
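The contour-deformation point can be illustrated by a one-dimensional toy computation (this is only an illustration, not the actual argument of Sect. 8.2): for a complex coefficient a with \(\mathsf {Re}\,a>0\), the normalization \(\sqrt{\pi /\mathsf {Re}\,a}\) of the real measure \(e^{-\mathsf {Re}(a)u^2}\) exceeds the modulus of \(\int _{\mathbb {R}}e^{-au^2}\,\mathrm{d}u=\sqrt{\pi /a}\), while substituting the steepest descent direction \(u=e^{-\mathbf {i}\arg (a)/2}t\) turns the integrand into a genuinely real Gaussian with the same value. A minimal numerical sketch in Python (all numbers are illustrative):

```python
import numpy as np

# Toy 1-D illustration of the steepest-descent deformation:
# I(a) = \int_R exp(-a u^2) du = sqrt(pi/a) for Re(a) > 0,
# while the *real* measure exp(-Re(a) u^2) has normalization sqrt(pi/Re(a)).
a = 1.0 + 2.0j
t = np.linspace(-10.0, 10.0, 40001)
dt = t[1] - t[0]

# direct integration along the real axis (oscillatory integrand)
direct = np.sum(np.exp(-a * t**2)) * dt

# along the steepest descent path u = e^{-i arg(a)/2} t the integrand is real
phase = np.exp(-1j * np.angle(a) / 2)
rotated = phase * np.sum(np.exp(-abs(a) * t**2)) * dt

exact = np.sqrt(np.pi / a)   # principal branch
```

Both quadratures agree with \(\sqrt{\pi /a}\), whereas \(\sqrt{\pi /\mathsf {Re}\,a}\) is strictly larger in modulus; in dimension O(W) this discrepancy between \(\det \mathsf {Re}(\mathbb {A})\) and \(|\det \mathbb {A}|\) can be exponentially large in W, which is why the deformation is performed before taking any absolute values.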
Now, we turn to the second term in (1.24). When \(W\sim 1\), there are only finitely many Grassmann variables. Hence, the complex coefficient of each term in the polynomial \(\mathsf {p}(\varvec{\rho }, \varvec{\tau },\mathbf {u})\), which is of order \(M^{-\ell /2}\) for some \(\ell \in \mathbb {N}\) (see (1.25)), actually controls the magnitude of the integral of this term against the Gaussian measure \(\exp \{-\varvec{\rho }'\mathbb {H}\varvec{\tau }\}\). Consequently, in the case \(W\sim 1\), it suffices to keep the leading order term (according to the power \(M^{-\ell /2}\)), discard the others trivially, and compute the Gaussian integral over \(\varvec{\rho }\) and \(\varvec{\tau }\) explicitly. However, when \(W\sim N^{\varepsilon }\) (say), in light of Wick's formula (3.2) and the fact that the coefficients are of order \(M^{-\ell /2}\), the order of the integral of each term of \(\mathsf {p}(\varvec{\rho }, \varvec{\tau },\mathbf {u})\) against the Gaussian measure reads \(M^{-\ell /2}\det \mathbb {H}^{(\mathsf {I}|\mathsf {J})}\) for some index sets \(\mathsf {I}\) and \(\mathsf {J}\) and some \(\ell \in \mathbb {N}\). Due to the fact that \(W\sim N^{\varepsilon }\), the determinant \(\det \mathbb {H}^{(\mathsf {I}|\mathsf {J})}\) is typically exponential in W. Hence, it is much more complicated to determine and compare the orders of the integrals of all \(e^{O(W)}\) terms. In Sect. 9.1, in particular using Assumption 1.1 (iii) and Lemma 9.4, we perform a unified estimate for the integrals of all the terms, rather than simply estimating each of them by \(M^{-\ell /2}\).
In addition, the analysis of the integral away from the vicinity of the saddle point in our work is also quite different from [20]. Indeed, the integral over the complement of the vicinity can be ignored trivially in [20], since each factor in the integrand of (1.23) is of order 1, so gaining any o(1) factor for the integrand outside the vicinity suffices for the estimate. However, in our case, either \(\exp \{M\mathsf {f}_c(\mathcal {S}_c)\}\) or \(\int \mathrm{d} \mathcal {S}_g \exp \{\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_c)\}\) is essentially exponential in W. This fact forces us to provide an a priori bound for \(\int \mathrm{d} \mathcal {S}_g \exp \{\mathsf {f}_g(\mathcal {S}_g, \mathcal {S}_c)\}\) in the full domain of \(\mathcal {S}_c\) rather than only in the vicinity of the saddle point. This step will be carried out in Sect. 6. In addition, in Sect. 7, we will analyze the tail behavior of the measure \(\exp \{M\mathsf {f}_c(\mathcal {S}_c)\}\), in order to control the integral away from the vicinity of the saddle point.
-
(General variance profile \(\widetilde{S}\)) In [20], the author considered the special case \(S=a\varDelta \) with \(a<1/4\mathsf {d}\). We generalize the discussion to more general weighted Laplacians S satisfying Assumption 1.1, which include, as a special case, the standard Laplacian \(\varDelta \) for any fixed dimension \(\mathsf {d}\).
1.4 Notation and organization
Throughout the paper, we will need some notation. First, we use U(r) to denote the unitary group of degree r; similarly, U(1, 1) denotes the group of \(2\times 2\) matrices Q obeying
Furthermore, we denote
Recalling the real part E of z, we will frequently need the following two parameters
Correspondingly, we define the following four matrices
We remark here \(D_\pm \) does not mean “\(D_+\) or \(D_-\)”. In addition, we introduce the matrix
For simplicity, we introduce the following notation for some domains used throughout the paper.
For an \(\ell \times \ell \) Hermitian matrix A, we use \(\lambda _1(A)\le \cdots \le \lambda _\ell (A)\) to denote its ordered eigenvalues. For a possibly N-dependent parameter set \(\mathsf {U}_N\) and two families of complex functions \(\{a_N(u): N\in \mathbb {N}, u\in \mathsf {U}_N\}\) and \(\{b_N(u): N\in \mathbb {N}, u\in \mathsf {U}_N\}\), if there exists a constant \(C>1\) such that \(C^{-1}|b_N(u)|\le |a_N(u)|\le C |b_N(u)|\) holds uniformly in N and u, we write \(a_N(u)\sim b_N(u)\). Conventionally, we use \(\{\mathbf {e}_i:i=1,\ldots , \ell \}\) to denote the standard basis of \(\mathbb {R}^{\ell }\), in which the dimension \(\ell \) is suppressed for simplicity. For real quantities a and b, we use \(a\wedge b\) and \(a\vee b\) to denote \(\min \{a,b\}\) and \(\max \{a,b\}\), respectively.
Throughout the paper, \(c, c', c_1, c_2, C, C', C_1, C_2\) represent some generic positive constants that are possibly n-dependent and may differ from line to line. In contrast, we use \(C_0\) to denote some generic positive constant independent of n.
The paper is organized as follows. In Sect. 2, we prove Theorem 1.9 and Theorem 1.12, assuming Theorem 1.15. The proof of Theorem 1.15 will be carried out in Sects. 3–11. More specifically, in Sect. 3, we use the supersymmetric formalism to represent \(\mathbb {E}|G_{ij}|^{2n}\) in terms of a superintegral, in which the integrand can be factorized into several functions; Sect. 4 is devoted to a preliminary analysis of these functions; Sects. 5–10 carry out the different steps of the saddle point analysis, whose organization will be further clarified at the end of Sect. 5; Sect. 11 is devoted to the final proof of Theorem 1.15, by summing up the discussions in Sects. 3–10. In Sect. 12, we comment on how to remove the prefactor \(N^{C_0}\) in (1.21). At the end of the paper, we also collect some frequently used symbols in a table, for the convenience of the reader.
2 Proofs of Theorem 1.9 and Theorem 1.12
Assuming Theorem 1.15, we prove Theorems 1.9 and 1.12 in this section. First, (1.21) can be generalized to matrices with general entry distributions satisfying the four moment matching condition, via the Green's function comparison strategy.
Lemma 2.1
Assume that H is a random block band matrix, satisfying Assumptions 1.1, 1.5 and 1.14. Let \(\kappa \) be an arbitrarily small positive constant and \(\varepsilon _2\) be any sufficiently small positive constant. There is \(N_0=N_0(n)\), such that for all \(N\ge N_0\) and all \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\), we have
for some positive constant \(C_0\) uniform in n and z.
By the definition of stochastic domination in Definition 1.6, we can get the following corollary immediately.
Corollary 2.2
Under the assumptions of Lemma 2.1, we have
In the sequel, we first prove Lemma 2.1 from Theorem 1.15 via the Green's function comparison strategy. Then we prove Theorem 1.9, using Lemma 2.1. Finally, we show that Theorem 1.12 follows easily from Theorem 1.9.
2.1 Green’s function comparison: Proof of Lemma 2.1
To show (2.1), we use Lindeberg's replacement strategy to compare the Green's functions of the Gaussian case and the general case. That is, we will replace the entries of \(H^g\) by those of H one by one, and compare the Green's functions step by step. Choose and fix a bijective ordering map
Then we use \(H_k\) to denote the \(N\times N\) random Hermitian matrix whose \((\imath ,\jmath )\)th entry is \(h_{\imath \jmath }\) if \(\varpi (\imath ,\jmath )\le k\), and is \(h^g_{\imath \jmath }\) otherwise. In particular, we have \(H_0=H^g\) and \(H_{\varsigma (N)}=H\). Correspondingly, we define the Green's functions by
Fix k and denote
Then, we write
where \(H_k^0\) is obtained via replacing \(h_{ab}\) and \(h_{ba}\) by 0 in \(H_k\) (or replacing \(h^g_{ab}\) and \(h^g_{ba}\) by 0 in \(H_{k-1}\)). In addition, we denote
Set \(\varepsilon _3\equiv \varepsilon _3(\gamma ,\varepsilon _1)\) to be a sufficiently small positive constant, satisfying (say)
where \(\gamma \) is from Assumption 1.1 (iii) and \(\varepsilon _1\) is from (1.8). For simplicity, we introduce the following parameters for \(\ell =1,\ldots , \varsigma (N)\) and \(\imath ,\jmath =1,\ldots , N\),
where C is a positive constant. Here we used the notation \(\delta _{\mathsf {I}\mathsf {J}}=1\) if the index sets \(\mathsf {I}\) and \(\mathsf {J}\) coincide and \(\delta _{\mathsf {I}\mathsf {J}}=0\) otherwise. It is easy to see that for \(\eta \le M^{-1}N^{\varepsilon _2}\), we have
by using (1.9). Now, we compare \(G_{k-1}(z)\) and \(G_k(z)\). We will prove the following lemma.
Lemma 2.3
Suppose that the assumptions in Lemma 2.1 hold. Additionally, we assume that for some sufficiently small positive constant \(\varepsilon _3\) satisfying (2.5),
Let \(n\in \mathbb {N}\) be any given integer. Then, if
we also have
for any \(k=1,\ldots , \varsigma (N)\).
Proof of Lemma 2.3
Fix k and omit the argument z from now on; all formulas are understood to hold for all \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2) \). First, under the conditions (2.8) and (2.9), we show that
To see this, we use the expansion with (2.4)
which implies that for a sufficiently small \(\varepsilon '>0\) and a sufficiently large constant \(D>0\)
where the first step follows from (1.13), (2.8), Definition 1.6 and the trivial bound \(\eta ^{-1}\) for the Green's functions. Now, using (2.9), (2.7) and Hölder's inequality, we have
In addition, for sufficiently small \(\varepsilon '\), it is easy to check that there exists a constant \(c>0\) such that
for \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\), in light of the fact that \(M\gg N^{\frac{4}{5}}\), cf. (1.9). Substituting (2.13) and (2.14) into (2.12) and choosing D sufficiently large, we easily get the bound (2.11).
Now, recall (2.4) again and expand \(G_{k-1}(z)\) and \(G_k(z)\) around \(G_k^0(z)\), namely
We always choose m to be sufficiently large, depending on \(\varepsilon _3\) but independent of N. Then, we can write
where
First, by taking m sufficiently large, from (2.8) and (1.13), we have the trivial bound
For \(\mathsf {R}_{\ell ,\imath \jmath }\) and \(\mathsf {S}_{\ell ,\imath \jmath }\), we split the discussion into the off-diagonal case and the diagonal case. In the case \(\imath \ne \jmath \), we keep the first and the last factors of the terms in the expansions of \(((G_k^0 \mathsf {V}_{ab})^{\ell } G_k^0)_{\imath \jmath }\) and \(((G_k^0 \mathsf {W}_{ab})^{\ell } G_k^0)_{\imath \jmath }\), namely, \((G_k^0)_{\imath \jmath '}\) and \((G_k^0)_{\imath '\jmath }\) for some \(\imath ',\jmath '=a\) or b, and bound the factors in between by using (1.13) and (2.8), resulting in the bound
For \(\imath =\jmath \), we only keep the first factor of the terms in the expansions of \(((G_k^0 \mathsf {V}_{ab})^{\ell } G_k^0)_{\imath \imath }\) and \(((G_k^0 \mathsf {W}_{ab})^{\ell } G_k^0)_{\imath \imath }\), and bound the others by using (1.13) and (2.8), resulting in the bound
Observe that, in case \(\imath \ne \jmath \), if \(\{\imath ,\jmath \}\ne \{a,b\}\), at least one of \((G_k^0)_{\imath \jmath '}\) and \((G_k^0)_{\imath '\jmath }\) is an off-diagonal entry of \(G_k^0\) for \(\imath ',\jmath '=a\) or b.
Now we compare the 2nth moment of \(|(G_{k-1})_{\imath \jmath }|\) and \(|(G_k)_{\imath \jmath }|\). At first, we write
By substituting the expansion (2.16) into (2.21), we can write
where \(\mathbf {A}(\imath ,\jmath )\) is the sum of the terms which depend only on \(H_k^0\) and the first four moments of \(h_{ab}\), and \(\mathbf {R}_d(\imath ,\jmath )\) is the sum of all the other terms. We claim that \(\mathbf {R}_d(\imath ,\jmath )\) satisfies the bound
for some positive constant C independent of n. Now, we verify (2.23). According to (2.11) and the fact that the sequence \(\mathsf {R}_{1,\imath \jmath },\ldots , \mathsf {R}_{m,\imath \jmath }, \widetilde{\mathsf {R}}_{m+1,\imath \jmath }\), as well as \(\mathsf {S}_{1,\imath \jmath },\ldots , \mathsf {S}_{m,\imath \jmath }, \widetilde{\mathsf {S}}_{m+1,\imath \jmath }\), decreases by a factor of \(N^{\varepsilon _3}/\sqrt{M}\) in magnitude, it is not difficult to check that the leading order terms of \(\mathbf {R}_{k-1}(\imath ,\jmath )\) are of the form
with some \(p,q_\ell ,q'_\ell \in \mathbb {N}\) such that
and the leading order terms of \(\mathbf {R}_{k}(\imath ,\jmath )\) possess the same form (with \(\mathsf {R}\) replaced by \(\mathsf {S}\)). Every other term contains at least 6 factors of \(h_{ab}\) or \(h_{ab}^g\) or their conjugates; thus their sizes are typically controlled by \(M^{-3}(N\eta )^{-n}\), i.e. they are subleading. Hence, it suffices to bound (2.24).
Now, the five factors of \(h_{ab}\) or \(h_{ba}\) within the \(\mathsf {R}_{\ell ,\imath \jmath }\)'s in (2.24) are independent of the rest and are estimated by \(M^{-5/2}\). For the remaining factors from \(G^0_k\), we use (2.11) to bound 2n of them and use (2.8) to bound the rest. In the case that \(\imath \ne \jmath \) and \(\{\imath ,\jmath \}\ne \{a,b\}\), by the discussion above, we must have an off-diagonal entry of \(G_k^0\) in the product \((G_k^0)_{\imath \jmath '} (G_k^0)_{\imath ' \jmath }\) for any choice of \(\imath ',\jmath '=a\) or b. Then, in the bound for \(\mathsf {R}_{\ell ,\imath \jmath }\) in (2.19), for each \((G_k^0)_{\imath \jmath '} (G_k^0)_{\imath '\jmath }\), we keep the off-diagonal entry and bound the other by \(N^{\varepsilon _3}\) from assumption (2.8). Hence, by using (2.19) and (2.25), we see that for some \(\imath _r,\jmath _r\in \{\imath ,\jmath ,a,b\}\) with \(\imath _r\ne \jmath _r, r=1,\ldots , \sum (q_\ell +q'_\ell )\), the following bound holds
where the last step follows from (2.11) and Hölder's inequality. In the case \(\imath \ne \jmath \) but \(\{\imath ,\jmath \}=\{a,b\}\), we keep one entry in the product \((G_k^0)_{\imath \jmath '} (G_k^0)_{\imath ' \jmath }\) and bound the other by \(N^{\varepsilon _3}\). We remark that in this case the entry being kept can be either diagonal or off-diagonal. Consequently, for some \(\imath _r,\jmath _r\in \{\imath ,\jmath ,a,b\},r=1,\ldots ,\sum (q_\ell +q'_\ell )\), we have the bound
by using (2.11) and Hölder’s inequality again. Hence, we have shown (2.23) in the case of \(\imath \ne \jmath \). For \(\imath =\jmath \), it is analogous to show
by using (2.11), (2.20) and Hölder's inequality. This verifies (2.23). Consequently, by Assumption 1.5, (2.22) and (2.23), we have
which, together with the assumption (2.9) for \(\mathbb {E}|(G_{k-1})_{\imath \jmath }|^{2n}\) and the definition of the \(\widehat{\varTheta }_{\ell ,\imath \jmath }\)'s in (2.6), yields (2.10). This completes the proof of Lemma 2.3. \(\square \)
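The one-step replacement comparison above can be sketched numerically. The following toy Python snippet (real symmetric matrices of illustrative size \(N=40\) with \(\eta =1\), not the band matrices of Assumption 1.1) builds the interpolating sequence \(H_k\) from an ordering of the index pairs, checks the endpoint identities \(H_0=H^g\) and \(H_{\varsigma (N)}=H\), and verifies that a single replacement step moves the Green's function by no more than the deterministic resolvent bound \(\Vert G\Vert \Vert \varDelta H\Vert \Vert G\Vert \le \Vert \varDelta H\Vert /\eta ^2\):

```python
import numpy as np

rng = np.random.default_rng(0)
N, z = 40, 1.0 + 1.0j              # toy size; eta = Im z = 1

# H^g: Gaussian entries; H: Rademacher entries (matching first two moments)
A = rng.standard_normal((N, N)); Hg = (A + A.T) / np.sqrt(2 * N)
B = rng.choice([-1.0, 1.0], size=(N, N)); H = (B + B.T) / np.sqrt(2 * N)

# bijective ordering map "varpi" on the index pairs (i, j), i <= j
pairs = [(i, j) for i in range(N) for j in range(i, N)]

def H_k(k):
    """Entries with varpi(i, j) <= k come from H, the rest from H^g."""
    M = Hg.copy()
    for (i, j) in pairs[:k]:
        M[i, j] = H[i, j]; M[j, i] = H[j, i]
    return M

G = lambda M: np.linalg.inv(M - z * np.eye(N))

assert np.allclose(H_k(0), Hg) and np.allclose(H_k(len(pairs)), H)

# one replacement step: |G_k - G_{k-1}| <= ||G_k|| ||Delta H|| ||G_{k-1}||,
# and ||G|| <= 1/eta = 1 here
k = len(pairs) // 2
step = np.abs(G(H_k(k)) - G(H_k(k - 1))).max()
bound = np.linalg.norm(H_k(k) - H_k(k - 1), 2)
assert step <= bound + 1e-12
```

The actual proof refines this trivial bound by expanding each step to high order in the replaced entry and matching four moments, so that the \(\varsigma (N)=O(N^2)\) steps accumulate only a negligible total error.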
To show (2.1), we also need the following lemma.
Lemma 2.4
Suppose that the assumptions in Lemma 2.1 hold. Fix the indices \(a,b\in \{1,\ldots N\}\). Let \(H^0\) be a matrix obtained from H with its (a, b)th entry replaced by 0. Then, if for some \(\eta _0\ge 1/N\) there exists
then we also have
Proof of Lemma 2.4
The proof is almost the same as the discussion on pages 2311–2312 in [10]. For the convenience of the reader, we sketch it below. First, according to the discussion below (4.28) in [10], for any \(\imath ,\jmath =1,\ldots , N\), we have
Now, we set \(k_1:=\max \{k: 2^k\eta <\eta _0\}\) and \(k_2:=\max \{k: 2^k\eta <1\}\). According to our assumption, both \(k_1\) and \(k_2\) are of the order \(\log N\). Now, we have
where in the second step, we used the fact that the function \(y\mapsto y\mathsf {Im}G_{\ell \ell } (E+\mathbf {i}y)\) is monotonically increasing, the condition (2.29) and the fact \(\eta \le \eta _0\). Hence, we conclude the proof of Lemma 2.4. \(\square \)
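The monotonicity used in the second step follows from the spectral decomposition \(y\,\mathsf {Im}\, G_{\ell \ell }(E+\mathbf {i}y)=\sum _{k} y^2|u_{k\ell }|^2/\big ((\lambda _k-E)^2+y^2\big )\), each summand being increasing in y. A quick numerical sanity check on a toy real symmetric matrix (all sizes illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 30
A = rng.standard_normal((N, N)); H = (A + A.T) / np.sqrt(2 * N)
E = 0.3

def y_im_G(y, l=0):
    G = np.linalg.inv(H - (E + 1j * y) * np.eye(N))
    return y * G[l, l].imag

# y -> y * Im G_ll(E + iy) is nondecreasing: each spectral summand
# y^2 |u_kl|^2 / ((lambda_k - E)^2 + y^2) is increasing in y
ys = np.logspace(-3, 1, 50)
vals = [y_im_G(y) for y in ys]
assert all(v2 >= v1 - 1e-12 for v1, v2 in zip(vals, vals[1:]))
```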
Now, with Theorem 1.15, Lemmas 2.3 and 2.4, we can prove Lemma 2.1.
Proof of Lemma 2.1
The proof relies on the following bootstrap argument, namely, we show that once
holds for \(\eta \ge \eta _0\) with \(\eta _0\in [N^{-1+\varepsilon _2+\varepsilon _3},M^{-1}N^{\varepsilon _2}]\), it also holds for \(\eta \ge \eta _0N^{-\varepsilon _3}\) for any \(\varepsilon _3\) satisfying (2.5). Assuming (2.30) holds for \(\eta \ge \eta _0\), we see that
Consequently, for \(\eta \ge \eta _0\), we also have
Therefore, (2.29) holds. Then, by Lemma 2.4, we see that (2.8) holds for \(\eta \ge \eta _0N^{-\varepsilon _3}\). Furthermore, by Lemma 2.3 and Theorem 1.15 for \(G_0\), i.e. the Gaussian case, one can get that for any given n,
Note that since (2.32) holds for any given n, we get (2.30) for \(M^{-1}N^{\varepsilon _2}\ge \eta \ge \eta _0N^{-\varepsilon _3}\).
Now we start from \(\eta _0=M^{-1}N^{\varepsilon _2}\). By Proposition 1.7 we see that (2.30) holds for all \(\eta \ge \eta _0\). Then we can use the bootstrap argument above finitely many times to show (2.30) holds for all \(\eta \ge N^{-1+\varepsilon _2}\). Consequently, we have (2.8) for all \(\eta \ge N^{-1+\varepsilon _2}\). Then, Lemma 2.1 follows from Lemma 2.3 and Theorem 1.15 immediately. \(\square \)
2.2 Proof of Theorem 1.9
Without loss of generality, we can assume that \(M\le N(\log N)^{-10}\); otherwise, Proposition 1.7 implies (1.17) immediately. We only need to consider the diagonal entries \(G_{ii}\) below, since the bound for the off-diagonal entries of G(z) is implied by (2.1) directly. For simplicity, we introduce the notation
To bound \(\varLambda _d\), a key technical input is the estimate for the quantity
which is given in the following lemma.
Lemma 2.5
Suppose that H satisfies Assumptions 1.1, 1.5 and 1.14. We have
for all \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\).
The proof of Lemma 2.5 will be postponed. Using Lemma 2.5, we see that, with high probability, (2.33) is a small perturbation of the self-consistent equation for \(m_{sc}\), i.e. (1.14), in view of the fact that \(\sum _{a}\sigma _{ai}^2=1\). To control \(\varLambda _d\), we use a continuity argument from [12].
We remind the reader that in the sequel, the parameter set of the stochastic domination is always \(\mathbf {D}(N,\kappa ,\varepsilon _2)\), without further mention. We need to show that
and first we claim that it suffices to show that
Indeed, if (2.36) were proven, we see that with high probability either \(\varLambda _d>N^{-\frac{\varepsilon _2}{4}}\) or \(\varLambda _d\prec (N\eta )^{-\frac{1}{2}}\le N^{-\frac{\varepsilon _2}{2}}\) for \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\). That means, there is a gap in the possible range of \(\varLambda _d\). Now, choosing \(\varepsilon \) in (1.15) to be sufficiently small, we are able to get for \(\eta =M^{-1}N^{\varepsilon _2}\),
Since \(\varLambda _d\) is continuous in z, we see that with high probability \(\varLambda _d\) can only stay on one side of the gap; that is, (2.35) holds. The rigorous details of this argument involve considering a fine discrete grid of the z-parameter and using that G(z) is Lipschitz continuous (albeit with a large Lipschitz constant \(1/\eta \)). The details can be found in Section 5.3 of [12].
Hence, it remains to verify (2.36). The proof of (2.36) is almost the same as that of Lemma 3.5 in [14]. For the convenience of the reader, we sketch it below without reproducing the details. We set
We also denote . By the assumption \(\varLambda _d\le N^{-\frac{\varepsilon _2}{4}}\), we have
Now we rewrite (2.33) as
By using (2.34), Lemma 5.1 in [14], and the assumption \(\varLambda _d\le N^{-\frac{\varepsilon _2}{4}}\), we can show that
One can refer to the derivation of (5.14) in [14] for more details. Averaging over i for (2.39) and (2.40) leads to
where
Plugging (2.38) and (2.34) into (2.42) yields
Applying (2.43), the fact \(|\bar{m}(z)-m_{sc}(z)|\le \varLambda _d\le N^{-\frac{\varepsilon _2}{4}}\), and Lemma 5.2 in [14] to (2.41), we have
where in the first step we have used the fact that \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\) is away from the edges of the semicircle law. Now, we combine (2.39), (2.40) and (2.41), resulting in
We just take the above identity as the definition of \(\mathsf {w}_i\). Analogously, we set . Then (2.42) and (2.45) imply
where the second step follows from the fact \(|z+m_{sc}(z)|\ge 1\) in \(\mathbf {D}(N,\kappa ,\varepsilon _2)\) (see (5.1) in [14] for instance), (2.43) and (2.44), and in the last step we used (2.44) again.
Now, using the fact \(m_{sc}^2(z)=(m_{sc}(z)+z)^{-2}\) (see (1.14)), we rewrite (2.45) in terms of the matrix \(\mathcal {T}\) introduced in (1.6) as . Consequently, we have
Then for \(z\in \mathbf {D}(N,\kappa ,\varepsilon _2)\), using (1.7) and Proposition A.2 (ii) in [12] (with \(\delta _-=1\) and \(\theta >c\)), we can get
Plugging (2.48) and (2.46) into (2.47) yields
where the second step follows from (2.34). Then (2.38) further implies that , which together with (2.44) and (2.34) also implies
Hence
This completes the proof of Theorem 1.9.
Proof of Lemma 2.5
First, recalling the notation defined in (1.3), we denote the Green's function of \(H^{(i)}\) by
with a slight abuse of notation. For simplicity, we omit the variable z from the notation below. First, we recall the elementary identity obtained from Schur's complement formula, namely,
where we used the notation \(\mathbf {h}_i^{\langle i\rangle }\) to denote the ith column of H, with the ith component deleted. Now, we use the identity for \(a,b\ne i\) (see Lemma 4.5 in [12] for instance),
By using (1.11) and the large deviation estimate for the quadratic form (see Theorem C.1 of [12] for instance), we have
which implies that
where we have used the fact that \(\sum _{a}\sigma _{ai}^2=1\) in the first inequality above. Plugging (1.22) and (2.52) into (2.50) and using Corollary 2.2 we obtain
which implies
In addition, (1.22), (2.50) and (2.53) lead to the fact that
Now, using (2.33), (2.49), (2.51) and (2.54), we can see that
This completes the proof of Lemma 2.5. \(\square \)
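The Schur complement identity invoked in (2.50) is, in its standard form, \(G_{ii}(z)=\big (h_{ii}-z-\langle \mathbf {h}_i^{\langle i\rangle }, G^{(i)}(z)\mathbf {h}_i^{\langle i\rangle }\rangle \big )^{-1}\), where \(G^{(i)}\) is the Green's function of the minor \(H^{(i)}\). A direct numerical check on a toy Hermitian matrix (size and spectral parameter are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
N, z, i = 25, 0.2 + 0.5j, 7
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
H = (A + A.conj().T) / np.sqrt(4 * N)          # toy Hermitian matrix

G = np.linalg.inv(H - z * np.eye(N))

keep = [a for a in range(N) if a != i]
Hi = H[np.ix_(keep, keep)]                     # minor H^{(i)}
Gi = np.linalg.inv(Hi - z * np.eye(N - 1))     # Green's function G^{(i)}
hi = H[keep, i]                                # i-th column, i-th entry deleted

# Schur complement: G_ii = (h_ii - z - <h_i, G^{(i)} h_i>)^{-1}
schur = 1.0 / (H[i, i] - z - hi.conj() @ Gi @ hi)
assert abs(G[i, i] - schur) < 1e-10
```

In the proof above, the quadratic form \(\langle \mathbf {h}_i^{\langle i\rangle }, G^{(i)}\mathbf {h}_i^{\langle i\rangle }\rangle \) concentrates around \(\sum _a \sigma _{ai}^2 G^{(i)}_{aa}\) by the large deviation estimate, which is what turns this exact identity into the perturbed self-consistent equation (2.33).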
2.3 Proof of Theorem 1.12
With Theorem 1.9 at hand, we can prove Theorem 1.12 routinely. First, according to the uniform bound (1.18), we have
which implies that
due to the fact that \(m_{sc}(z)\sim 1\). Recalling the normalized eigenvector \(\mathbf {u}_i=(u_{i1},\ldots , u_{iN})\) corresponding to \(\lambda _i\), and using the spectral decomposition, we have
For any \(|\lambda _i|\le \sqrt{2}-\kappa \), we set \(E=\lambda _i\) on the r.h.s. of (2.57) and use (2.56) to bound its l.h.s. Then we obtain \({||\mathbf {u}_{i}||^2_\infty }/{\eta }\prec 1\). Choosing \(\eta =N^{-1+\varepsilon _2}\) above and using the fact that \(\varepsilon _2\) can be arbitrarily small, we get (1.19). This completes the proof of Theorem 1.12.
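The spectral decomposition step can be made concrete: \(\mathsf {Im}\, G_{jj}(\lambda _i+\mathbf {i}\eta )=\sum _k \eta |u_{kj}|^2/\big ((\lambda _k-\lambda _i)^2+\eta ^2\big )\ge |u_{ij}|^2/\eta \), so any bound on the diagonal entries \(\mathsf {Im}\, G_{jj}\) immediately bounds \(\Vert \mathbf {u}_i\Vert ^2_\infty \). A toy numerical check (sizes and \(\eta \) are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
N, eta = 50, 1e-2
A = rng.standard_normal((N, N)); H = (A + A.T) / np.sqrt(2 * N)
lam, U = np.linalg.eigh(H)                 # columns of U are eigenvectors

i = N // 2                                 # a bulk eigenvalue
G = np.linalg.inv(H - (lam[i] + 1j * eta) * np.eye(N))

# Im G_jj(lambda_i + i eta) >= |u_ij|^2 / eta for every j, hence
# ||u_i||_infty^2 <= eta * max_j Im G_jj
imdiag = np.diag(G).imag
assert np.all(imdiag >= np.abs(U[:, i])**2 / eta - 1e-12)
assert np.max(np.abs(U[:, i]))**2 <= eta * imdiag.max() + 1e-12
```

With \(\mathsf {Im}\, G_{jj}\prec 1\) for \(\eta =N^{-1+\varepsilon _2}\), this inequality is exactly what yields \(\Vert \mathbf {u}_i\Vert ^2_\infty \prec N^{-1+\varepsilon _2}\), i.e. complete delocalization.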
3 Supersymmetric formalism and integral representation for the Green’s function
In this section, we represent \(\mathbb {E}|G_{ij}(z)|^{2n}\) for the Gaussian case by a superintegral. The final representation is stated in (3.31). We adopt the convention that, for any real integration variable below, the region of integration is always \(\mathbb {R}\), unless specified otherwise.
3.1 Gaussian integrals and superbosonization formulas
Let \(\varvec{\phi }=(\phi _1,\ldots , \phi _k)'\) be a vector with complex components and \(\varvec{\psi }=(\psi _1,\ldots ,\psi _k)'\) be a vector with Grassmann components. In addition, let \(\varvec{\phi }^*\) and \(\varvec{\psi }^*\) be the conjugate transposes of \(\varvec{\phi }\) and \(\varvec{\psi }\), respectively. We recall the following well-known formulas for Gaussian integrals.
Proposition 3.1
(Gaussian integrals or Wick’s formulas)
-
(i)
Let \(\mathrm {A}\) be a \(k\times k\) complex matrix with positive-definite Hermitian part, i.e. \(\mathsf {Re}\,\mathrm {A}>0\). Then for any \(\ell \in \mathbb {N}\) and \(i_1,\ldots , i_\ell , j_1,\ldots , j_\ell \in \{1,\ldots , k\}\), we have
$$\begin{aligned} \int \prod _{a=1}^k\frac{\mathrm{d}\mathsf {Re}\phi _a \mathrm{d}\mathsf {Im}\phi _a}{\pi }\;\exp \{-\varvec{\phi }^*\mathrm {A}\varvec{\phi }\}\prod _{b=1}^\ell \bar{\phi }_{i_b}\phi _{j_b}=\frac{1}{\det \mathrm {A}} \; \sum _{\sigma \in \mathbb {P}(\ell )} \prod _{b=1}^\ell (\mathrm {A}^{-1})_{j_b,i_{\sigma (b)}}, \nonumber \\ \end{aligned}$$(3.1)where \(\mathbb {P}(\ell )\) is the permutation group of degree \(\ell \).
-
(ii)
Let \(\mathrm {B}\) be any \(k\times k\) matrix. Then for any \(\ell \in \{ 0,\ldots , k\}\), any \(\ell \) distinct integers \(i_1,\ldots , i_\ell \) and another \(\ell \) distinct integers \(j_1,\ldots , j_\ell \in \{1,\ldots , k\}\), we have
$$\begin{aligned} \int \prod _{a=1}^k\mathrm{d}\bar{\psi }_a \mathrm{d}\psi _a\; \exp \{-\varvec{\psi }^*\mathrm {B}\varvec{\psi }\}\prod _{b=1}^\ell \bar{\psi }_{i_b}\psi _{j_b} =(-1)^{\ell +\sum _{\alpha =1}^\ell (i_\alpha +j_\alpha )}\det \mathrm {B}^{(\mathsf {I}|\mathsf {J})}, \nonumber \\ \end{aligned}$$(3.2)where \(\mathsf {I}=\{i_1,\ldots , i_\ell \}\), and \(\mathsf {J}=\{j_1,\ldots , j_\ell \}\).
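Both formulas can be verified directly in small cases. The Python sketch below checks (3.1) with \(k=2\), \(\ell =1\) by Monte Carlo sampling from the normalized complex Gaussian measure (whose covariance is \(\mathrm {A}^{-1}\)), and checks (3.2) with \(k=2\) for \(\ell =0\) (giving \(\det \mathrm {B}\)) and \(\ell =1\) via a symbolic implementation of the Grassmann algebra. Sign conventions for the Berezin integral vary in the literature; the one coded below is chosen so as to reproduce (3.2) in the instances checked. All matrices and sizes are illustrative:

```python
import numpy as np

# ---- (i) complex Gaussian Wick formula (3.1), checked by Monte Carlo ----
A = np.array([[2.0, 0.5], [0.5, 1.5]])        # Hermitian, positive definite
Ainv = np.linalg.inv(A)
rng = np.random.default_rng(4)
w, V = np.linalg.eigh(A)
zeta = (rng.standard_normal((400_000, 2))
        + 1j * rng.standard_normal((400_000, 2))) / np.sqrt(2)
phi = (zeta / np.sqrt(w)) @ V.T               # samples with E[phi phi^*] = A^{-1}
mc = np.mean(phi[:, 1] * phi[:, 0].conj())    # l = 1: E[conj(phi_1) phi_2]
assert abs(mc - Ainv[1, 0]) < 2e-2            # = (A^{-1})_{j_1 i_1} = (A^{-1})_{21}

# ---- (ii) Grassmann Gaussian integral (3.2), checked symbolically ----
# generators: psi_a -> id 2(a-1), bar(psi)_a -> id 2(a-1)+1, for a = 1, ..., k
def mul(p, q):
    """Product of Grassmann polynomials, stored as {sorted id-tuple: coeff}."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = m1 + m2
            if len(set(m)) < len(m):
                continue                       # repeated generator vanishes
            inv = sum(m[i] > m[j] for i in range(len(m))
                      for j in range(i + 1, len(m)))
            key = tuple(sorted(m))
            out[key] = out.get(key, 0) + c1 * c2 * (-1) ** inv
    return out

def deriv(p, g):
    """Left derivative with respect to generator g."""
    out = {}
    for m, c in p.items():
        if g in m:
            i = m.index(g)
            out[m[:i] + m[i + 1:]] = out.get(m[:i] + m[i + 1:], 0) + c * (-1) ** i
    return out

def berezin(p, k):
    """Integral against prod_{a=1}^k d(bar psi_a) d(psi_a), rightmost first."""
    for a in range(k, 0, -1):
        p = deriv(deriv(p, 2 * (a - 1)), 2 * (a - 1) + 1)
    return p.get((), 0)

def gexp(B):
    """Expansion of exp(-psi^* B psi) in the Grassmann algebra."""
    k = len(B)
    N = {}
    for a in range(k):
        for b in range(k):
            for m, c in mul({(2 * a + 1,): -B[a][b]}, {(2 * b,): 1.0}).items():
                N[m] = N.get(m, 0) + c
    total, term, fact = {(): 1.0}, {(): 1.0}, 1.0
    for j in range(1, k + 1):                  # N is nilpotent: N^{k+1} = 0
        term, fact = mul(term, N), fact * j
        for m, c in term.items():
            total[m] = total.get(m, 0) + c / fact
    return total

B = [[1.5, 0.7], [-0.3, 2.0]]
detB = 1.5 * 2.0 - 0.7 * (-0.3)
# l = 0: the Grassmann Gaussian integral equals det B
assert abs(berezin(gexp(B), 2) - detB) < 1e-12
# l = 1, i_1 = 1, j_1 = 2: (-1)^{1+1+2} det B^{({1}|{2})} = B_{21}
assert abs(berezin(mul(gexp(B), {(1, 2): 1.0}), 2) - B[1][0]) < 1e-12
```

The \(\ell =0\) case of (3.2) is precisely the determinant that counterbalances the \(1/\det \mathrm {A}\) produced by the complex Gaussian integral in (3.1), which is the algebraic mechanism behind the supersymmetry method mentioned in the introduction.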
Now, we introduce the superbosonization formula for superintegrals. Let \(\varvec{\chi }=(\chi _{ij})\) be an \(\ell \times r\) matrix with Grassmann entries, \(\mathbf {f}=(f_{ij})\) be an \(\ell \times r\) matrix with complex entries. In addition, we denote their conjugate transposes by \(\varvec{\chi }^*\) and \(\mathbf {f}^*\) respectively. Let F be a function of the entries of the matrix
Let \(\mathcal {A}(\varvec{\chi },\varvec{\chi }^*)\) be the Grassmann algebra generated by the \(\chi _{ij}\)'s and \(\bar{\chi }_{ij}\)'s. Then we can regard F as a function defined on a complex vector space, taking values in \(\mathcal {A}(\varvec{\chi },\varvec{\chi }^*)\). Hence, we can and do view \(F(\mathcal {S}(\mathbf {f},\mathbf {f}^*;\varvec{\chi },\varvec{\chi }^*))\) as a polynomial in the \(\chi _{ij}\)'s and \(\bar{\chi }_{ij}\)'s, in which the coefficients are functions of the \(f_{ij}\)'s and \(\bar{f}_{ij}\)'s. From this viewpoint, we state the assumption on F as follows.
Assumption 3.2
Suppose that \(F(\mathcal {S}(\mathbf {f},\mathbf {f}^*;\varvec{\chi },\varvec{\chi }^*))\) is a holomorphic function of the \(f_{ij}\)'s and \(\bar{f}_{ij}\)'s when they are regarded as independent variables, and that F is a Schwartz function of the \(\mathsf {Re}f_{ij}\)'s and \(\mathsf {Im}f_{ij}\)'s; by this we mean that all of the coefficients of \(F(\mathcal {S}(\mathbf {f},\mathbf {f}^*;\varvec{\chi },\varvec{\chi }^*))\), as functions of the \(f_{ij}\)'s and \(\bar{f}_{ij}\)'s, possess the above properties.
Proposition 3.3
(Superbosonization formula for the nonsingular case, [18]) Suppose that F satisfies Assumption 3.2. For \(\ell \ge r\), we have
where \(\mathbf {x}=(x_{ij})\) is a unitary matrix; \(\mathbf {y}=(y_{ij})\) is a positive-definite Hermitian matrix; \(\varvec{\omega }\) and \(\varvec{\xi }\) are two Grassmann matrices, and all of them are \(r\times r\). Here
and \(\mathrm{d}\hat{\mu }(\cdot )\) is defined by
under the parametrization induced by the eigendecomposition, namely,
Here \(\mathrm{d}\mu (V)\) is the Haar measure on \(\mathring{U}(r)\), and \(\varDelta (\cdot )\) is the Vandermonde determinant. In addition, the integral w.r.t. \(\mathbf {x}\) ranges over U(r), that w.r.t. \(\mathbf {y}\) ranges over the cone of positive-definite matrices.
For the singular case, i.e. \(r>\ell \), we only state the formula for the case \(r=2\) and \(\ell =1\), which is enough for our purpose. We refer to formula (11) in [3] for the result in a more general setting.
Proposition 3.4
(Superbosonization formula for the singular case, [3]) Suppose that F satisfies Assumption 3.2. If \(r=2\) and \(\ell =1\), we have
where y is a positive variable; \(\mathbf {x}\) is a 2-dimensional unitary matrix; \(\varvec{\omega }=(\omega _1,\omega _2)'\) and \(\varvec{\xi }=(\xi _1,\xi _2)\) are two vectors with Grassmann components. In addition, \(\mathbf {w}\) is a unit vector, which can be parameterized by
Moreover, the differentials are defined as
In addition, the integral w.r.t. \(\mathbf {x}\) ranges over U(2).
3.2 Initial representation
For \(a=1,2\) and \(j=1,\ldots ,W\), we set
For each j and each \(a\), \(\varPhi _{a,j}\) is a vector with complex components, and \(\varPsi _{a,j}\) is a vector with Grassmann components. In addition, we use \(\varPhi ^*_{a,j}\) and \(\varPsi ^*_{a,j}\) to denote the conjugate transposes of \(\varPhi _{a,j}\) and \(\varPsi _{a,j}\), respectively. Analogously, we adopt the notation \(\varPhi ^*_{a}\) and \(\varPsi ^*_{a}\) for the conjugate transposes of \(\varPhi _{a}\) and \(\varPsi _{a}\), respectively. We have the following integral representation for the moments of the Green's function.
Lemma 3.5
For any \(p,q=1,\ldots , W\) and \(\alpha , \beta =1,\ldots , M\), we have
where
Proof
By using Proposition 3.1 (i) with \(\ell =n\) and Proposition 3.1 (ii) with \(\ell =0\), we can get (3.5). \(\square \)
3.3 Averaging over the Gaussian random matrix
Recall the variance profile \(\widetilde{S}\) in (1.2). Now, we take the expectation of the Green's function, i.e. we average over the random matrix. By an elementary Gaussian integral, we get
where \(J:=\text {diag}(1,-1)\) and \(Z:=\text {diag}(z,\bar{z})\), and for each \(j=1,\ldots , W\), the matrices \(\breve{X}_j, \breve{Y}_j, \breve{\varOmega }_j\) and \(\breve{\varXi }_j\) are \(2\times 2\) blocks of a supermatrix, namely,
Remark 3.6
The derivation of (3.6) from (3.5) is quite standard. We refer to the proof of (2.14) in [20] for more details and will not reproduce it here.
3.4 Decomposition of the supermatrices
From now on, we split the discussion into the following three cases
-
(Case 1): Entries in the off-diagonal blocks, i.e. \(p\ne q\),
-
(Case 2): Off-diagonal entries in the diagonal blocks, i.e. \(p=q, \quad \alpha \ne \beta \),
-
(Case 3): Diagonal entries, i.e. \(p=q, \quad \alpha =\beta \).
For each case, we will perform a decomposition for the supermatrix \(\breve{\mathcal {S}}_j\) (\(j=p\) or q). For a vector \(\mathbf {v}\) and some index set \(\mathsf {I}\), we use \(\mathbf {v}^{\langle \mathsf {I}\rangle }\) to denote the subvector obtained by deleting the ith component of \(\mathbf {v}\) for all \(i\in \mathsf {I}\). Then, we adopt the notation
Here, for \(\mathsf {A}=\breve{X}_j, \breve{Y}_j, \breve{\varOmega }_j\) or \(\breve{\varXi }_j\), the notation \(\mathsf {A}^{\langle \mathsf {I}\rangle }\) is defined via replacing \(\varPhi _{a,j}, \varPsi _{a,j}, \varPhi _{a,j}^*\) and \(\varPsi _{a,j}^*\) by \(\varPhi _{a,j}^{\langle \mathsf {I}\rangle }, \varPsi _{a,j}^{\langle \mathsf {I}\rangle }, (\varPhi _{a,j}^*)^{\langle \mathsf {I}\rangle }\) and \((\varPsi _{a,j}^*)^{\langle \mathsf {I}\rangle }\), respectively, for \(a=1,2\), in the definition of \(\mathsf {A}\). In addition, the notation \(\mathsf {A}^{[i]}\) is defined via replacing \(\varPhi _{a,j}, \varPsi _{a,j}, \varPhi _{a,j}^*\) and \(\varPsi _{a,j}^*\) by \(\phi _{a,j,i}, \psi _{a,j,i}, \bar{\phi }_{a,j,i}\) and \(\bar{\psi }_{a,j,i}\) respectively, for \(a=1,2\), in the definition of \(\mathsf {A}\). Moreover, for \(\mathsf {A}=\breve{\mathcal {S}}_j, \breve{X}_j, \breve{Y}_j, \breve{\varOmega }_j\) or \(\breve{\varXi }_j\), we will simply abbreviate \(\mathsf {A}^{\langle \{a,b\}\rangle }\) and \(\mathsf {A}^{\langle \{a\}\rangle }\) by \(\mathsf {A}^{\langle a,b\rangle }\) and \(\mathsf {A}^{\langle a\rangle }\), respectively. Note that \(\breve{\mathcal {S}}_j^{[i]}\) is of rank-one.
Recalling the spatial-orbital parametrization for the rows or columns of H, it is easy to see from the block structure that for any \(\alpha ,\alpha '\in \{1,\ldots , M\}\), exchanging the \((j,\alpha )\)th row with the \((j,\alpha ')\)th row and simultaneously exchanging the corresponding columns will not change the distribution of H.
For Case 1, due to the symmetry mentioned above, we can assume \(\alpha =\beta =1\). Then we extract two rank-one supermatrices from \(\breve{\mathcal {S}}_p\) and \(\breve{\mathcal {S}}_q\) such that the quantities \(\bar{\phi }_{2,p,1}\phi _{1,p,1}\) and \(\bar{\phi }_{1,q,1}\phi _{2,q,1}\) can be expressed in terms of the entries of these supermatrices. More specifically, we decompose the supermatrices
Consequently, we can write
For Case 2, due to symmetry, we can assume that \(\alpha =1, \beta =2\). Then we extract two rank-one supermatrices from \(\breve{\mathcal {S}}_p\), namely,
Consequently, we can write
Finally, for Case 3, due to symmetry, we can assume that \(\alpha =1\). Then we extract only one rank-one supermatrix from \(\breve{\mathcal {S}}_p\), namely,
Consequently, we can write
Since the discussions for the three cases are similar, we will only present the details for Case 1. More specifically, in the remaining part of this section and in Sect. 4 to Sect. 10, we will treat Case 1 only. In Sect. 11, we will sum up the discussions in the previous sections and explain how to adapt them to Case 2 and Case 3, resulting in a final proof of Theorem 1.15.
3.5 Variable reduction by superbosonization formulae
We will work with Case 1. Recall the decomposition (3.7). We use the superbosonization formulae to reduce the number of variables. We shall treat \(\breve{\mathcal {S}}_k (k\ne p,q)\) and \(\breve{\mathcal {S}}_j^{\langle 1\rangle } (j=p,q)\) on an equal footing and use the formula (3.3) with \(r=2,\ell =M\) for the former and \(r=2,\ell =M-1\) for the latter, while we separate the terms \(\breve{\mathcal {S}}_j^{[1]} (j=p,q)\) and use the formula (3.4). For simplicity, we introduce the notation
Accordingly, we will use \(\widetilde{X}_j, \widetilde{\varOmega }_j, \widetilde{\varXi }_j\) and \(\widetilde{Y}_j\) to denote four blocks of \(\widetilde{\mathcal {S}}_j\). With this notation, we can rewrite (3.6) with \(\alpha =\beta =1\) as
where the first factor \((\cdots )^n\) of the integrand is the observable and all other factors constitute a normalized measure, written in a somewhat complicated form according to the decomposition from Sect. 3.4.
Now, we apply the superbosonization formulae (3.3) and (3.4) to the supermatrices \(\breve{\mathcal {S}}_k (k\ne p,q), \breve{\mathcal {S}}_j^{\langle 1\rangle }\) and \(\breve{\mathcal {S}}_j^{[1]} (j=p,q)\) one by one, changing to the reduced variables as
Here, for \(j=1,\ldots , W\), \(X_j\) is a \(2\times 2\) unitary matrix; \(Y_j\) is a \(2\times 2\) positive-definite matrix; \(\varOmega _j=(\omega _{j,\alpha \beta })\) and \(\varXi _j=(\xi _{j,\alpha \beta })\) are \(2\times 2\) Grassmann matrices. For \(k=p\) or \(q\), \(X_k^{[1]}\) is a \(2\times 2\) unitary matrix; \(y_k^{[1]}\) is a positive variable; \(\varvec{\omega }_{k}^{[1]}=(\omega _{k,1}^{[1]}, \omega _{k,2}^{[1]})'\) is a column vector with Grassmann components; \(\varvec{\xi }_{k}^{[1]}=(\xi _{k,1}^{[1]},\xi _{k,2}^{[1]})\) is a row vector with Grassmann components. In addition, for \(k=p,q\),
Then by using superbosonization formulae, we arrive at the representation
where we used the notation \(\mathbf {y}^{[1]}:=(y_p^{[1]},y_q^{[1]}), \mathbf {w}^{[1]}:=(\mathbf {w}_p^{[1]}, \mathbf {w}_q^{[1]})\). The differentials are defined by
Now we change the variables as \(X_jJ\rightarrow X_j,Y_jJ \rightarrow B_j,\varOmega _jJ\rightarrow \varOmega _j, \varXi _jJ\rightarrow \varXi _j\) and perform the scaling \(X_j\rightarrow -MX_j, B_j\rightarrow MB_j, \varOmega _j\rightarrow \sqrt{M} \varOmega _j\) and \(\varXi _j\rightarrow \sqrt{M}\varXi _j\). Consequently, we can write
where the functions in the integrand are defined as
with
In (3.17), the regions of the \(X_j\)'s and \(X_k^{[1]}\)'s are all U(2), and those of the \(B_j\)'s are the set of matrices A satisfying \(AJ>0\). We remind the reader here that if we parametrize the unitary matrix \(X_j\) according to its eigendecomposition, the scaling \(X_j\rightarrow -MX_j\) is equivalent to changing the contour of the eigenvalues of \(X_j\) from the unit circle \(\varSigma \) to \(\frac{1}{M}\varSigma \), up to the orientation. Afterwards, we deform the contour back to \(\varSigma \) in (3.17). This is possible since the only singularity of the integrand in (3.17) in the variables of the eigenvalues of \(X_j\) is at 0, cf. the matrix \(X_j^{-1}\) in the factor \(\mathcal {P}( \varOmega , \varXi , X, B)\).
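The contour deformation back to \(\varSigma \) rests on the following standard consequence of Cauchy's theorem (our restatement, under the stated condition that 0 is the only singularity):

```latex
\oint_{r\varSigma} f(z)\,\mathrm{d}z \;=\; \oint_{\varSigma} f(z)\,\mathrm{d}z
\qquad \text{for all } r>0,\quad
f \text{ holomorphic on } \mathbb{C}\setminus\{0\},
```

since the integrand is holomorphic on the closed annulus between the two circles.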
3.6 Parametrization for X, B
Similarly to the discussion in [20], we start with some preliminary parameterization. At first, we do the eigendecomposition
where
Further, we introduce
In particular, we have \(V_1=T_1=I\). Now, we parameterize \(P_1, Q_1, V_j\) and \(T_j\) for all \(j=2,\ldots , W\) as follows
Under the parametrization above, we can express the corresponding differentials as follows.
where
In addition, for simplicity, we do the change of variables
Note that the Berezinian of such a change is 1. After this change, \(\mathcal {P}( \varOmega , \varXi , X,B,\mathbf {y}^{[1]},\mathbf {w}^{[1]})\) turns out to be independent of \(P_1\) and \(Q_1\).
To adapt to the new parametrization, we change the notation
We recall here that K(X) does not depend on \(P_1\); likewise, L(B) does not depend on \(Q_1\). Moreover, according to the change (3.27), we have
and
Consequently, using (3.26), from (3.17) we can write
where we introduced the notation
In (3.31), the regions of \(V_j\)’s are all \(\mathring{U}(2)\), and those of \(T_j\)’s are all \(\mathring{U}(1,1)\). Observe that all Grassmann variables are inside the integrand of the integral \(\mathsf {A}(\hat{X}, \hat{B}, V, T)\). Hence, (3.31) separates the saddle point calculation from the observable \(\mathsf {A}(\hat{X}, \hat{B}, V, T)\).
To facilitate the discussions in the remaining part, we introduce some additional terms and notation here. Henceforth, we will employ the notation \((X^{[1]})^{-1}=\{(X^{[1]}_p)^{-1}, (X^{[1]}_q)^{-1}\}\) and \((\mathbf {y}^{[1]})^{-1}=\{(y_p^{[1]})^{-1}, (y_q^{[1]})^{-1}\}\) for the collection of inverse matrices and reciprocals, respectively. For a matrix or a vector A under discussion, we will use the term A-variables to refer to all the variables parametrizing it. For example, \(\hat{X}_j\)-variables means \(x_{j,1}\) and \(x_{j,2}\), and \(\hat{X}\)-variables refers to the collection of all \(\hat{X}_j\)-variables. Analogously, we can define the terms T-variables, \(\mathbf {y}^{[1]}\)-variables, \(\varOmega \)-variables and so on. We use another term, A-entries, to refer to the non-zero entries of A. Note that \(\hat{X}_j\)-variables are just \(\hat{X}_j\)-entries. However, for \(T_j\), they are different, namely,
Analogously, we will also use the term T-entries to refer to the collection of all \(T_j\)-entries. Then V-entries, \(\mathbf {w}^{[1]}\)-entries, etc. are defined in the same manner. It is easy to check that \(Q_1^{-1}\)-entries are the same as \(Q_1\)-entries, up to a sign; likewise, \(T_j^{-1}\)-entries are the same as \(T_j\)-entries, for all \(j=2,\ldots , W\).
Moreover, to simplify the notation, we make the convention here that we will frequently use a dot to represent all the arguments of a function. That means, for instance, we will write \(\mathcal {P}( \varOmega , \varXi , \hat{X}, \hat{B}, V, T)\) as \(\mathcal {P}(\cdot )\) if there is no confusion. Analogously, we will also use the abbreviation \(\mathcal {Q}(\cdot ), \mathcal {F}(\cdot ), \mathsf {A}(\cdot )\), and so on.
Let \(\mathbf {a}:=\{a_1,\ldots , a_\ell \}\) be a set of variables; we will adopt the notation
to denote the class of all multivariate polynomials \(\mathfrak {p}(\mathbf {a})\) in the arguments \(a_1,\ldots , a_\ell \) such that the following three conditions are satisfied: (i) The total number of the monomials in \(\mathfrak {p}(\mathbf {a})\) is bounded by \(\kappa _1\); (ii) the coefficients of all monomials in \(\mathfrak {p}(\mathbf {a})\) are bounded by \(\kappa _2\) in magnitude; (iii) the power of each \(a_i\) in each monomial is bounded by \(\kappa _3\), for all \(i=1,\ldots , \ell \). For example, \(5b_{j,1}^{-1}+3b_{j,1}t_j^2+1\in \mathfrak {Q}\big (\{b_{j,1}^{-1}, b_{j,1}, t_j\}; 3, 5, 2\big )\). In addition, we define the subset of \(\mathfrak {Q}(\mathbf {a}; \kappa _1, \kappa _2, \kappa _3)\), namely,
consisting of those polynomials in \(\mathfrak {Q}(\mathbf {a}; \kappa _1, \kappa _2, \kappa _3)\) whose degree is bounded by \(\kappa _3\), i.e. the total degree of each monomial is bounded by \(\kappa _3\). For example, \(5b_{j,1}^{-1}+3b_{j,1}t_j^2+1\in \mathfrak {Q}_{\text {deg}}\big (\{b_{j,1}^{-1}, b_{j,1}, t_j\}; 3, 5, 3\big )\).
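As a concrete illustration of the two polynomial classes, the following sketch (our own bookkeeping, not from the paper) represents a polynomial as a list of (coefficient, exponent-dictionary) pairs and checks the three conditions defining \(\mathfrak {Q}\), plus the total-degree condition defining \(\mathfrak {Q}_{\text {deg}}\):

```python
def in_Q(monomials, k1, k2, k3, total_degree=False):
    """monomials: list of (coeff, {variable: power}) pairs.
    Checks: (i) at most k1 monomials, (ii) |coeff| <= k2,
    (iii) each individual power <= k3; with total_degree=True,
    condition (iii) is replaced by: total degree <= k3."""
    if len(monomials) > k1:
        return False
    for coeff, powers in monomials:
        if abs(coeff) > k2:
            return False
        if total_degree:
            if sum(powers.values()) > k3:
                return False
        elif any(p > k3 for p in powers.values()):
            return False
    return True

# The paper's example 5*b^{-1} + 3*b*t^2 + 1, with b^{-1} treated as
# a separate variable "b_inv":
p = [(5, {"b_inv": 1}), (3, {"b": 1, "t": 2}), (1, {})]
print(in_Q(p, 3, 5, 2))                     # True: in Q(.; 3, 5, 2)
print(in_Q(p, 3, 5, 3, total_degree=True))  # True: in Q_deg(.; 3, 5, 3)
```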
4 Preliminary discussion on the integrand
In this section, we perform a preliminary analysis on the factors of the integrand in (3.17). Recall the matrix \(\mathfrak {I}\) defined in (1.29).
4.1 Factor \(\exp \{-M(K(\hat{X},V)+L(\hat{B},T))\}\)
Recall the parametrization of \(\hat{B}_j, \hat{X}_j, T_j\) and \(V_j\) in (3.23) and (3.25), as well as the matrices defined in (1.28). According to the discussion in [20], there are three types of saddle points of this function, namely,
-
Type I : For each j, \(\displaystyle (\hat{B}_j, T_j,\hat{X}_j)=(D_{\pm }, I, D_{\pm })\quad \text {or} \quad (D_{\pm }, I, D_{\mp })\), with \(\theta _j\in \mathbb {L}\), and \(v_j=0\) if \(\hat{X}_j=\hat{X}_1\), \(v_j=1\) if \(\hat{X}_j\ne \hat{X}_1\).
-
Type II : For each j, \(\displaystyle (\hat{B}_j, T_j,\hat{X}_j)=(D_{\pm }, I, D_{+})\) and \(V_j\in \mathring{U}(2)\).
-
Type III : For each j, \(\displaystyle (\hat{B}_j, T_j,\hat{X}_j)=(D_{\pm }, I, D_{-})\) and \(V_j\in \mathring{U}(2)\).
(Actually, since \(\theta _j\) and \(v_j\) vary over continuous sets, it would be more appropriate to use the term saddle manifolds.) We will see that the main contribution to the integral (3.17) comes from some small vicinities of the Type I saddle points. At first, by the definition in (3.24), we have \(V_1=I\). If we regard the \(\theta _j\)'s in the parametrization of the \(V_j\)'s as fixed parameters, it is easy to see that there are in total \(2^W\) choices of Type I saddle points. Furthermore, the contributions from all the Type I saddle points are the same, since one can always apply the transformation \(V_j\rightarrow \mathfrak {I} V_j\) or \((\hat{X}_j, P_j)\rightarrow (\mathfrak {I}\hat{X}_j\mathfrak {I}, \mathfrak {I}P_j)\) for several j to change one saddle point into another. That means, for Type I saddle points, it suffices to consider
-
Type I’ : For each j, \(\displaystyle (\hat{B}_j, T_j,\hat{X}_j, V_j)=(D_{\pm }, I, D_{\pm }, I)\).
Therefore, the total contribution to the integral (3.17) from all Type I saddle points is \(2^W\) times that from the Type I’ saddle point.
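The \(2^W\) count can be checked with a toy enumeration (ours, not from the paper): with the \(\theta _j\)'s regarded as fixed, each of the W blocks independently takes one of two sign patterns, so the Type I saddle points are in bijection with sign sequences of length W:

```python
from itertools import product

W = 4
# For each j, (B_j, T_j, X_j) is (D_pm, I, D_pm) or (D_pm, I, D_mp):
# two choices per block once the theta_j's are fixed parameters.
saddles = list(product(["(D_pm, I, D_pm)", "(D_pm, I, D_mp)"], repeat=W))
print(len(saddles))  # 16 == 2**4
```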
Following the discussion in [20], we will show in Sect. 5 that both \(K(\hat{X},V)-K(D_{\pm }, I)\) and \(L(\hat{B},T)-L(D_{\pm }, I)\) have positive real parts, bounded from below by some positive quadratic forms, which allows us to perform the saddle point analysis. In addition, it will be seen that in a vicinity of the Type I' saddle point, \(\exp \{-M(K(\hat{X},V)+L(\hat{B},T))\}\) is approximately Gaussian.
4.2 Factor \(\mathcal {Q}( \varOmega , \varXi , \varvec{\omega }^{[1]},\varvec{\xi }^{[1]}, P_1, Q_1, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]})\)
The function \(\mathcal {Q}(\cdot )\) contains both the \( \varOmega , \varXi \)-variables from \(\mathcal {P}(\cdot )\), and the \(P_1, Q_1, X^{[1]},\mathbf {y}^{[1]}, \mathbf {w}^{[1]}\)-variables from \(\mathcal {F}(\cdot )\). In addition, note that in the integrand in (3.17), \(\mathcal {Q}(\cdot )\) is the only factor containing the \(\varvec{\omega }^{[1]}\) and \(\varvec{\xi }^{[1]}\)- variables. Hence, we can compute the integral
The explicit formula for \(\mathsf {Q}( \cdot )\) is complicated and irrelevant for us. From (3.30) and the definition of the Grassmann integral, it is not difficult to see that \(\mathsf {Q}( \cdot )\) is a polynomial of the \((X^{[1]})^{-1}, (\mathbf {y}^{[1]})^{-1}, \mathbf {w}^{[1]}, P_1, Q_1, \varOmega \) and \(\varXi \)-entries. In principle, for each monomial in the polynomial \(\mathsf {Q}(\cdot )\), we can combine the Grassmann variables with \(\mathcal {P}(\cdot )\) and then perform the integral over the \(\varOmega \) and \(\varXi \)-variables, whilst we combine the complex variables with \(\mathcal {F}(\cdot )\) and perform the integral over the \(X^{[1]}, \mathbf {y}^{[1]}, \mathbf {w}^{[1]}, P_1\) and \(Q_1\)-variables. A formal discussion of \(\mathsf {Q}(\cdot )\) will be given in Sect. 6.1. However, the terms from \(\mathsf {Q}(\cdot )\) turn out to be irrelevant in our proof. Therefore, in the arguments involving \(\mathsf {Q}(\cdot )\), a typical strategy that we will adopt is the following: we first neglect \(\mathsf {Q}(\cdot )\) and carry out the discussion for \(\mathcal {P}(\cdot )\) and \(\mathcal {F}(\cdot )\) separately; at the end, we make the necessary comments on how to slightly modify the discussion to take \(\mathsf {Q}(\cdot )\) into account.
4.3 Factor \(\mathcal {P}( \varOmega , \varXi , \hat{X}, \hat{B}, V, T)\)
We will mainly regard \(\mathcal {P}(\cdot )\) as a function of the \(\varOmega \) and \(\varXi \)-variables. As mentioned above, we also have some \(\varOmega \) and \(\varXi \)-variables from the irrelevant term \(\mathsf {Q}(\cdot )\). But we temporarily ignore them and proceed as if the integral over the \( \varOmega \) and \(\varXi \)-variables reads
We shall estimate \(\mathsf {P}(\cdot )\) in three different regions: (1) the complement of the vicinities of the saddle points; (2) the vicinity of Type I saddle point; (3) the vicinities of Type II and III saddle points, which will be done in Sects. 6.2, 9.1 and 10.1, respectively. (Definition 5.5 gives the precise definition of the vicinities.) In each case we will decompose the function \(\mathcal {P}(\cdot )\) as a product of a Gaussian measure and a multivariate polynomial of Grassmann variables. Consequently, we can employ (3.2) to perform the integral of this polynomial against the Gaussian measure, whereby \(\mathsf {P}(\cdot )\) can be estimated.
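The exact form of (3.2) is not displayed in this chunk, but the standard identities behind such computations — a Gaussian Berezin integral produces a determinant (rather than the inverse determinant of the complex Gaussian case), and polynomial insertions are evaluated by Wick pairing — read, with the usual ordering convention for the Berezin measure:

```latex
\int \prod_{i=1}^{\ell}\mathrm{d}\bar\chi_i\,\mathrm{d}\chi_i\,
  \exp\Big\{-\sum_{i,j}\bar\chi_i A_{ij}\chi_j\Big\} \;=\; \det A,
\qquad
\big\langle \chi_i\bar\chi_j \big\rangle_A \;=\; (A^{-1})_{ij}.
```

Higher-order insertions reduce to determinants of minors of \(A^{-1}\) by Wick's theorem, which is presumably how (3.2) is applied to the polynomial factors here.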
4.4 Factor \(\mathcal {F}(\hat{X},\hat{B}, V, T, P_1, Q_1, X^{[1]}, \mathbf {y}^{[1]},\mathbf {w}^{[1]})\)
Observe that \(\mathcal {F}\) is the only term containing the energy scale \(\eta \). As in the previous discussion of \(\mathcal {P}(\cdot )\), here we also ignore the \(P_1, Q_1, X^{[1]},\mathbf {y}^{[1]}, \mathbf {w}^{[1]}\)-variables from the irrelevant term \(\mathsf {Q}(\cdot )\) temporarily, and investigate the integral
We shall also estimate \(\mathsf {F}(\cdot )\) in three different regions: (1) the complement of the vicinities of the saddle points; (2) the vicinity of Type I saddle point; (3) the vicinities of Type II and III saddle points, which will be done in Sects. 6.3, 9.2 and 10.2, respectively.
In particular, when we restrict the \(\hat{X}, \hat{B}, V\) and T-variables to the vicinity of the Type I saddle points, the above integral can be performed approximately, resulting in our main term, a factor of order \(1/(N\eta )^{n+2}\). This step will be done in Sect. 9. It is instructive to give a heuristic sketch of this calculation. First, we plug the Type I saddle points into (4.3). We will show that the integral of \(f(\cdot )\) approximately reads
which is the easy part. Then, recalling the definition of \(g(\cdot )\) in (3.21) and the parameterization (3.15), we will show that the integral of \(g(\cdot )\) approximately reads
where in the second step above we used the fact that
We notice that the factor \(e^{\mathbf {i}n\sigma _p^{[1]}}e^{-\mathbf {i}n\sigma _q^{[1]}}\) in (4.4) actually comes from the term \(\Big (\big (\mathbf {w}^{[1]}_q(\mathbf {w}^{[1]}_q)^*\big )_{12}\big (\mathbf {w}^{[1]}_p(\mathbf {w}^{[1]}_p)^*\big )_{21}\Big )^n\) in (3.21). This factor brings a strong oscillation to the integrand in the integral (4.4). In Case 2, an analogous factor will appear, resulting in the same estimate as (4.4). However, in Case 3, such an oscillating factor is absent, and the estimate for the counterpart of the integral in (4.4) is of order \(1/N\eta \) instead of \(1/(N\eta )^{n+1}\). The detailed analysis will be presented in Sects. 10 and 11.
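The mechanism behind the gain from the oscillating factor is the elementary orthogonality \(\int _0^{2\pi } e^{\mathbf {i}n\sigma }\,\mathrm {d}\sigma =0\) for integer \(n\ne 0\), which a quick numerical check confirms (a toy illustration, not part of the paper's argument):

```python
import cmath
import math

def circle_average(n, N=1000):
    """Riemann sum of exp(i*n*sigma) over [0, 2*pi) with N points."""
    h = 2 * math.pi / N
    return h * sum(cmath.exp(1j * n * 2 * math.pi * k / N) for k in range(N))

print(abs(circle_average(3)))  # ~0: the oscillation cancels
print(abs(circle_average(0)))  # ~6.2832 = 2*pi: no oscillation, full mass survives
```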
5 Saddle points and vicinities
In this section, we study the saddle points of \(K(\hat{X},V)\) and \( L(\hat{B},T)\) and deform the contours of the \(\hat{B}\)-variables to pass through the saddle points. Then we introduce and classify some small vicinities of these saddle points. The derivation of the saddle points of \(K(\hat{X},V)\) and \( L(\hat{B},T)\) in Sects. 5.1 and 5.2 below is essentially the same as the counterpart in [20]; the only difference is that we are working under a more general setting on S. Hence, in Sects. 5.1 and 5.2, we just sketch the discussion, list the results, and make the necessary modifications to adapt to our setting. In the sequel, we employ the notation
As mentioned above, later we will also need to deform the contours and discuss the integral over some vicinities of the saddle points; thus it is convenient to introduce a notation for the integral over specific domains. To this end, for \(a=1,2\), we use \(\mathbf {I}^b_{a}\) and \(\mathbf {I}^x_a\) to denote generic domains of \(\mathbf {b}_a\) and \(\mathbf {x}_a\), respectively. Analogously, we use \(\mathbf {I}^t\) and \(\mathbf {I}^v\) to represent generic domains of \(\mathbf {t}\) and \(\mathbf {v}\), respectively. These domains will be specified later. Now, for some collection of domains, we introduce the notation
For example, we can write (3.31) as
which is the integral over the full domain.
5.1 Saddle points of \(L(\hat{B},T)\)
We introduce the function
Recalling the definition of \(L(\cdot )\) in (3.18), the decomposition of \(B_j\)’s in (3.22) and the definition of \(T_j\)’s in (3.24), we can write
where we used the notation introduced in (5.1), and the functions \(\ell (\cdot )\) and \(\ell _S(\cdot )\) are defined as
Following the discussion in [20] with slight modification (see Section 3 therein), we see that for \(|E|\le \sqrt{2}-\kappa \), the saddle point of \( L(\hat{B},T)\) is
where \(D_{\pm }\) is defined in (1.28). For simplicity, we will write (5.7) as \((\hat{B},T)=(D_{\pm },I)\) in the sequel. Observe that
We introduce the notation
where \(\ell (a_+)\) represents the value of \(\ell (\mathbf {a})\) at the point \(\mathbf {a}=(a_+,\ldots , a_+)\), and \(\ell (a_-)\) is defined analogously. Correspondingly, we adopt the notation
where the second step is implied by (5.5), (5.8) and (5.9). Now, for each \(j=1,\ldots ,W\), we deform the contours of \(b_{j,1}\) and \(b_{j,2}\) to
to pass through the saddle points of \(\hat{B}\)-variables, based on the following lemma which will be proved in Sect. 7.
Lemma 5.1
With the notation introduced in (5.2), we have
We introduce the notation
Along the new contours, we have the following lemma.
Lemma 5.2
Suppose that \(|E|\le \sqrt{2}-\kappa \). Let \(\mathbf {b}_{1}\in \varGamma ^W, \mathbf {b}_{2}\in \bar{\varGamma }^W\). We have
for some positive constant c.
Proof
Since \(|E|\le \sqrt{2}-\kappa \), we have \(\mathsf {Re}(b_{j,1}+b_{j,2})(b_{k,1}+b_{k,2})\ge 0\) for \(\mathbf {b}_{1}\in \varGamma ^W\) and \(\mathbf {b}_{2}\in \bar{\varGamma }^W\), thus \(\mathsf {Re}\ell _S(\hat{B},T)\ge 0\), in light of the definition in (5.6). Consequently, according to (5.10), it suffices to prove
for some positive constant c. To see this, we observe the following identities obtained via elementary calculation,
which together with \(|E|\le \sqrt{2}-\kappa \) and (1.5) imply (5.14) immediately. The same identity holds if we replace \(\mathring{\ell }_{++}(\mathbf {b}_1)\) and the \(r_{j,1}\)'s by \(\mathring{\ell }_{--}(\mathbf {b}_2)\) and the \(r_{j,2}\)'s in (5.15), respectively. This completes the proof of Lemma 5.2. \(\square \)
5.2 Saddle points of \(K(\hat{X},V)\)
Analogously, recalling the definition in (5.6), we can write
where \(\ell (\cdot )\) is defined in the first line of (5.6) and \(\ell _S(\hat{X},V)\) is defined as
Analogously to the notation \(L(D_{\pm },I)\), we will use \(K(D_{\pm },I)\) to represent the value of \(K(\hat{X},V)\) at \((\hat{X}_j,V_j)=(D_{\pm }, I)\) for all \(j=1,\ldots , W\). In addition, \(K(D_{+},I)\) and \(K(D_{-},I)\) are defined in the same manner. Observing that
we have
Moreover, we employ the notation
We will need the following elementary observations that are easy to check from (5.19) and (5.6)
In addition, we introduce the \(W\times W\) matrix
and the \(2W\times 2W\) matrices
where \(\mathbb {S}^v\) depends on the V-variables according to (5.22). Here we regard V-variables as fixed parameters. Due to the fact \(|(V_kV_j^*)_{12}|\in \mathbb {I}\), it is easy to see that \(\mathbb {S}^v\) is a weighted Laplacian of a graph with 2W vertices. In particular, \(\mathbb {S}^v\le 0\). By the definition (5.22), one can see that \(S^v_{ii}=0\) for all \(i=1,\ldots , W\). Consequently, we can obtain
Similarly to (1.5), we get
where \(c_0\) is the constant in Assumption 1.1 (ii). Moreover, it is not difficult to see from the definitions in (5.17), (5.22) and (5.23) that
where we used the notation \(\mathbf {x}:=(\mathbf {x}_1',\mathbf {x}_2')'\). Now let
Then, recalling the parametrization of \(V_j\)’s in (3.25), we have the following lemma.
Lemma 5.3
Assume that \(x_{j,1}, x_{j,2}\in \varSigma \) for all \(j=1,\ldots , W\). We have
for some positive constant c. In addition, \(\mathsf {Re}\mathring{K}(\hat{X},V)\) attains its minimum 0 at the following three types of saddle points
-
Type I : For each j, \(\hat{X}_j=D_{\pm }\quad \text {or} \quad D_{\mp }\), with \(\theta _j\in \mathbb {L}\), and \(v_j=0\) if \(\hat{X}_j=\hat{X}_1\), \(v_j=1\) if \(\hat{X}_j\ne \hat{X}_1\),
-
Type II : For each j, \(\hat{X}_j=D_{+},V_j\in \mathring{U}(2)\),
-
Type III : For each j, \(\hat{X}_j=D_{-},V_j\in \mathring{U}(2)\),
which are the restrictions of the three types of saddle points in Sect. 4.1 to the \(\hat{X}\) and V-variables.
Remark 5.4
The Type I saddle points of \((\hat{X},V)\) are exactly those points satisfying
In Lemma 5.3, we wrote them in terms of \(\hat{X}_j, v_j\) and \(\theta _j\) in order to match the parameterization in (3.23) and (3.25).
Proof
By (5.16), (5.25), the definitions of the functions \(\ell (\cdot )\) in (5.7) and \(\Bbbk (\cdot )\) in (5.4), we can write
By using (5.26) and the fact \(|x_{j,a}|=1\) for all \(j=1,\ldots , W\) and \(a=1,2\), we can obtain via elementary calculation
In light of the fact \(\mathbb {S}^v\le 0\) and (5.24), we have
Applying (5.29) to the r.h.s. of (5.28) yields (5.27).
Then what remains is to show that \(\mathsf {Re}\mathring{K}(\hat{X},V)\) attains its minimum 0 at the three types of points listed in Lemma 5.3. This step can be verified in the same manner as the counterpart in [20]; hence, we omit it here and refer to the proof of Lemma 2 in [20] for more details. This completes the proof of Lemma 5.3. \(\square \)
5.3 Vicinities of the saddle points
Having studied the saddle points of \(L(\hat{B},T)\) and \(K(\hat{X},V)\), we then introduce some small vicinities of them. To this end, we introduce the quantity
for a small positive constant \(\varepsilon _0\) which will be chosen later. Let \(\mathbf {a}=(a_{1},\ldots , a_{W})\in \mathbb {C}^W\) be any complex vector. In the sequel, for any \(d\in \mathbb {C}\), we adopt the notation
Now, we define the domains \(\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+, \varUpsilon ^x_-\) and \(\varUpsilon _S\) as follows
where the superscripts b and x indicate that these will be domains of the corresponding variables. In order to define the vicinities of the Type I saddle points properly, we introduce a permutation \(\epsilon _j\) of \(\{1,2\}\) for each triple \((x_{j,1}, x_{j,2}, v_j)\). Specifically, recalling the fact that \(u_j=\sqrt{1-v_j^2}\) from (3.25), we define
Denoting by \(\varvec{\epsilon }=(\epsilon _1,\ldots , \epsilon _W)\) and \(\varvec{\epsilon }(a)=(\epsilon _1(a),\ldots , \epsilon _W(a))\) for \(a=1,2\), we set
With this notation, we now define the Type I, II, and III vicinities, parameterized by \((\mathbf {b}_1,\mathbf {b}_2,\mathbf {x}_1,\mathbf {x}_2,\mathbf {t},\mathbf {v})\), of the corresponding types of saddle points. We also define a special case of the Type I vicinity, namely the Type I' vicinity, corresponding to the Type I' saddle point defined in Sect. 4.1.
Definition 5.5
-
Type I vicinity : \(\big (\mathbf {b}_1,\mathbf {b}_2,\mathbf {x}_{\varvec{\epsilon }(1)},\mathbf {x}_{\varvec{\epsilon }(2)},\mathbf {t},\mathbf {v}_{\varvec{\epsilon }}\big )\in \varUpsilon ^b_+\times \varUpsilon ^b_-\times \varUpsilon ^x_+\times \varUpsilon ^x_-\times \varUpsilon _S\times \varUpsilon _S \) for some \(\varvec{\epsilon }\).
-
Type I’ vicinity : \(\big (\mathbf {b}_1,\mathbf {b}_2,\mathbf {x}_1,\mathbf {x}_2,\mathbf {t},\mathbf {v}\big )\in \varUpsilon ^b_+\times \varUpsilon ^b_-\times \varUpsilon ^x_+\times \varUpsilon ^x_-\times \varUpsilon _S\times \varUpsilon _S \).
-
Type II vicinity : \(\big (\mathbf {b}_1,\mathbf {b}_2,\mathbf {x}_1,\mathbf {x}_2,\mathbf {t},\mathbf {v}\big )\in \varUpsilon ^b_+\times \varUpsilon ^b_-\times \varUpsilon ^x_+\times \varUpsilon ^x_+\times \varUpsilon _S\times \mathbb {I}^{W-1}\).
-
Type III vicinity : \(\big (\mathbf {b}_1,\mathbf {b}_2,\mathbf {x}_1,\mathbf {x}_2,\mathbf {t},\mathbf {v}\big )\in \varUpsilon ^b_+\times \varUpsilon ^b_-\times \varUpsilon ^x_-\times \varUpsilon ^x_-\times \varUpsilon _S\times \mathbb {I}^{W-1}\).
In the following discussion, the parameter \(\varepsilon _0\) in \(\varTheta \) is allowed to be different from line to line. However, given \(\varepsilon _1\) in (1.8), we shall always choose \(\varepsilon _2\) in (1.16) and \(\varepsilon _0\) in (5.30) according to the rule
for some sufficiently large \(C>0\). Consequently, by Assumption 1.14 we have
To prove Theorem 1.15, we split the task into three steps. The first step is to exclude the integral outside the vicinities. Specifically, we will show the following lemma.
Lemma 5.6
Under Assumptions 1.1 and 1.14, we have,
Remark 5.7
The first three terms on the r.h.s. of (5.36) correspond to the integrals over vicinities of the Type I, II, and III saddle points, respectively. Note that for the first term, we have used the fact that the total contribution of the integral over the Type I vicinity is \(2^W\) times that over the Type I’ vicinity.
The second step is to estimate the integral over the Type I vicinity. We have the following lemma.
Lemma 5.8
Under Assumptions 1.1 and 1.14, there exists some positive constant \(C_0\) uniform in n and some positive number \(N_0=N_0(n)\) such that for all \(N\ge N_0\),
The last step is to show that the integrals over the Type II and III vicinities are also negligible.
Lemma 5.9
Under Assumptions 1.1 and 1.14, there exists some positive constant c such that,
Therefore, the remaining task is to prove Lemmas 5.1, 5.6, 5.8 and 5.9. For the convenience of the reader, we outline the organization of the subsequent part as follows.
First, the proofs of Lemmas 5.1 and 5.6 require a discussion of the bound on the integrand, especially on the term \(\mathsf {A}(\cdot )\), which contains the integral over all the Grassmann variables. To this end, we perform a crude analysis of the function \(\mathsf {A}(\cdot )\) in Sect. 6 in advance, with which we are able to prove Lemmas 5.1 and 5.6 in Sect. 7. Then, we can restrict ourselves to the integral over the vicinities, i.e., prove Lemmas 5.8 and 5.9. In Sect. 8, we will analyze the factor \(\exp \{-M(\mathring{K}(\hat{X},V)+\mathring{L}(\hat{B},T))\}\), which is approximately Gaussian in the Type I vicinity. The key step is to deform the contours of the \(\hat{X}\) and \(\hat{B}\)-variables again, to the steepest descent paths exactly, whereby we can control the error terms in the integrand and prove Lemma 5.8 in Sect. 9. In the Type II and III vicinities, we only deform the contours of the \(\hat{B}\)-variables again and bound \(\exp \{-M\mathring{K}(\hat{X},V)\}\) by its absolute value directly. This turns out to be enough for our proof of Lemma 5.9, which is given in Sect. 10.
6 Crude bound on \(\mathsf {A}(\hat{X}, \hat{B}, V, T)\)
In this section, we provide a bound on the function \(\mathsf {A}(\cdot )\) in terms of the \(\hat{B}, T\)-variables, which holds on all the domains under discussion in the sequel. Here, by crude bound we mean a bound of order \(\exp \{O(WN^{\varepsilon _2})\}\), which will be specified in Lemma 6.1 below. By the definition in (3.32), we see that \(\mathsf {A}(\cdot )\) is an integral of the product of \(\mathcal {Q}(\cdot ), \mathcal {P}(\cdot )\) and \(\mathcal {F}(\cdot )\). As mentioned in Sect. 4.2, a typical procedure we will adopt is to ignore \(\mathcal {Q}(\cdot )\) at first and then estimate the integrals of \(\mathcal {P}(\cdot )\) and \(\mathcal {F}(\cdot )\), which are denoted by \(\mathsf {P}(\cdot )\) and \(\mathsf {F}(\cdot )\), respectively [see (4.2) and (4.3)]; finally, we make the necessary comments on how to modify the bounding scheme to take \(\mathcal {Q}(\cdot )\) into account, whereby we can get the desired bound for \(\mathsf {A}(\cdot )\).
For the sake of simplicity, from now on, we will use the notation
Moreover, we introduce the domains
By the assumption that \(|E|\le \sqrt{2}-\kappa \) in (1.16), it is easy to see that \(|\arg \omega |\le \pi /4-c\) for all \(\omega \in \mathbb {K}\cup \bar{\mathbb {K}}\), where c is some positive constant depending on \(\kappa \). Our aim is to show the following lemma.
Lemma 6.1
Suppose that \(\mathbf {b}_1,\mathbf {b}_2,\mathbf {x}_1,\mathbf {x}_2\in \mathbb {K}^W\times \bar{\mathbb {K}}^W\times \widehat{\varSigma }^W\times \widehat{\varSigma }^W\). Under the assumption of Theorem 1.15, we have
Remark 6.2
Obviously, using the terminology introduced at the end of Sect. 3.6, we have
6.1 Integral of \(\mathcal {Q}\)
In this section, we investigate the function
Recall \(\mathfrak {Q}_{\text {deg}}(\mathbf {a}; \kappa _1, \kappa _2, \kappa _3)\) defined at the end of Sect. 3.6, the parameterization in (3.15) and (3.25) and the notation introduced in (6.1). We have the following lemma.
Lemma 6.3
If we regard \(\sigma , {v}_p^{[1]}, {v}_q^{[1]}, P_1\) and \((X^{[1]})^{-1}\)-entries as fixed parameters, we have
where \(\mathfrak {S}\) is the set of variables defined by
Proof
Note that \(\mathcal {Q}(\cdot )\) can be regarded as a function of the Grassmann variables in \(\varvec{\omega }^{[1]}\) and \(\varvec{\xi }^{[1]}\). Hence, by the definition in (3.30), it is a polynomial of these variables with bounded degree. In addition, we can always combine the factor \(1/\sqrt{M}\) with \(\varOmega _j\) and \(\varXi _j\)-variables in the first factor of the r.h.s. of (3.30). Then it is not difficult to check that
where
By the definition in (6.4), \(\mathsf {Q}(\cdot )\) is the integral of \(\mathcal {Q}(\cdot )\) over the \(\varvec{\omega }^{[1]}\) and \(\varvec{\xi }^{[1]}\)-variables. Now, we regard all the other variables in \(\mathfrak {S}'\), except \(\varvec{\omega }^{[1]}\) and \(\varvec{\xi }^{[1]}\)-variables, as parameters. By the definition of Grassmann integral, we know that only the coefficient of the highest order term \(\prod _{k=p,q}\prod _{a=1,2}\omega ^{[1]}_{k,a}\xi ^{[1]}_{k,a}\) in \(\mathcal {Q}(\cdot )\) survives after integrating \(\varvec{\omega }^{[1]}\) and \(\varvec{\xi }^{[1]}\)-variables out. Then, it is easy to see (6.5) from (6.6), completing the proof.
\(\square \)
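The mechanism used in the last step of the proof — after a Berezin integral over all Grassmann generators, only the coefficient of the top monomial survives — can be mimicked with a minimal toy Grassmann algebra (our own illustration, with 0-based generator labels):

```python
from itertools import product

def gmul(a, b):
    """Multiply two Grassmann-algebra elements. Elements are dicts:
    key = tuple of strictly increasing generator indices,
    value = (commuting) coefficient."""
    out = {}
    for (ka, ca), (kb, cb) in product(a.items(), b.items()):
        if set(ka) & set(kb):
            continue  # chi_i * chi_i = 0 (nilpotency)
        arr = list(ka) + list(kb)
        sign = 1
        # bubble-sort the indices; each adjacent swap flips the sign
        for i in range(len(arr)):
            for j in range(len(arr) - 1):
                if arr[j] > arr[j + 1]:
                    arr[j], arr[j + 1] = arr[j + 1], arr[j]
                    sign = -sign
        key = tuple(arr)
        out[key] = out.get(key, 0) + sign * ca * cb
        if out[key] == 0:
            del out[key]
    return out

def berezin(a, n):
    """Berezin integral over all n generators: only the coefficient of
    the top monomial chi_0 * ... * chi_{n-1} survives."""
    return a.get(tuple(range(n)), 0)

chi0, chi1 = {(0,): 1}, {(1,): 1}
elem = {(): 7, (0,): 2, (1,): 3, (0, 1): 5}  # 7 + 2*chi0 + 3*chi1 + 5*chi0*chi1
print(berezin(elem, 2))  # 5: the top coefficient
```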
6.2 Integral of \(\mathcal {P}\)
In this subsection, we temporarily ignore the \(\varOmega \) and \(\varXi \)-variables from \(\mathsf {Q}(\cdot )\), and estimate \(\mathsf {P}(\hat{X}, \hat{B}, V, T)\) defined in (4.2). Recalling \(r_{j,1}\) and \(r_{j,2}\) defined in (5.12), we can formulate our estimate as follows.
Lemma 6.4
Suppose that the assumptions in Lemma 6.1 hold. We have
Proof
We start with one factor from \(\mathcal {P}(\cdot )\) (see (3.29)), namely
Here \(\mathfrak {p}_\ell (\cdot )\) is a polynomial in \(\hat{X}_j^{-1}, \hat{B}_j^{-1}, V_j, T_j, \varOmega _j\) and \(\varXi _j\)-entries with bounded degree and bounded coefficients. Moreover, if we regard \(\mathfrak {p}_\ell (\cdot )\) as a polynomial of the \(\varOmega _j\) and \(\varXi _j\)-entries, it is homogeneous of degree \(2\ell \): the total degree in the \(\varOmega _j\)-variables is \(\ell \), and hence that in the \(\varXi _j\)-entries is also \(\ell \). More specifically, we can write
where we used the notation in (6.1) and denoted \(\varvec{\alpha }=(\alpha _{1},\ldots , \alpha _{\ell })\) and \(\varvec{\beta }=(\beta _{1},\ldots , \beta _{\ell })\). It is easy to verify that \(\varvec{\varpi }_j\) is of the form (6.8) by taking a Taylor expansion w.r.t. the Grassmann variables. The expansion in (6.8) terminates at \(\ell =4\), owing to the fact that there are in total 8 Grassmann variables from \(\varOmega _j\) and \(\varXi _j\). In addition, it is also easy to check that \(\mathfrak {p}_{\ell ,\varvec{\alpha },\varvec{\beta }}(\cdot )\) is a polynomial in the \(\hat{X}_j^{-1}, \hat{B}_j^{-1}, V_j, T_j\)-entries with bounded degree and bounded coefficients, which implies that there exist two positive constants \(C_1\) and \(C_2\) such that
uniformly in \(\ell , \varvec{\alpha }\) and \(\varvec{\beta }\). Here we used the fact that \(\hat{X}_{j}^{-1}\) and \(V_j\)-entries are all bounded and \(T_j\)-entries are bounded by \(1+t_j\).
Now, we go back to the definition of \(\mathcal {P}(\cdot )\) in (3.29) and study the last factor. Similarly to the discussion above, it is easy to see that for \(k=p\) or q,
where \(\hat{\mathfrak {p}}_0(\cdot )=\det \hat{X}_k/\det \hat{B}_k\) and \(\hat{\mathfrak {p}}_{\ell ,\varvec{\alpha },{\varvec{\beta }}}(\cdot )\)’s are some polynomials of \(\hat{X}_k, \hat{B}_k^{-1}, V_k, T_k\)-entries with bounded degree and bounded coefficients. Similarly, we have
for some positive constants \(C_1\) and \(C_2\).
According to the definitions in (6.8) and (6.10), we can rewrite (3.29) as
In light of the discussion above, \(\prod _{j=1}^W\varvec{\varpi }_j\prod _{k=p,q}\hat{\varvec{\varpi }}_k\) is a polynomial of \(\hat{X}^{-1}, \hat{B}^{-1}, V, T, \varOmega \) and \(\varXi \)-entries, in which each monomial is of the form
where we used the notation
and \(\mathfrak {q}_{\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta }}}(\cdot )\) is a polynomial of \(\hat{X}, \hat{X}^{-1}, \hat{B}^{-1}, V\) and T-entries. Moreover, all the entries of \(\mathbf {\ell }, {\varvec{\alpha }}\) and \({\varvec{\beta }}\) are bounded by 4. By (6.9) and (6.11), we have
In addition, it is easy to see that the number of summands of the form (6.13) in \(\prod _{j=1}^W\varvec{\varpi }_j\prod _{k=p,q}\hat{\varvec{\varpi }}_k\) is bounded by \(e^{O(W)}\).
Now, we define the vectors
where \(\varvec{\omega }_\alpha =(\omega _{1,\alpha },\ldots , \omega _{W,\alpha })\) and \(\varvec{\xi }_\alpha =(\xi _{1,\alpha },\ldots , \xi _{W,\alpha })\) for \(\alpha =1,2,3,4\). Here we used the notation (6.1). In addition, we introduce the matrix
It is easy to check \(\sum _{j,k}\tilde{\mathfrak {s}}_{jk} Tr \varOmega _j\varXi _k=\mathbf {\Omega }\widetilde{\mathbb {H}}\mathbf {\Xi }'\). By using the Gaussian integral formula for the Grassmann variables (3.2), we see that for each \(\mathbf {\ell }, \varvec{\alpha }\) and \(\varvec{\beta }\), we have
for some index sets \(\mathsf {I}\) and \(\mathsf {J}\) with \(|\mathsf {I}|=|\mathsf {J}|\). By Assumption 1.1 (i) and (ii), we see that the 2-norm of each row of \(\widetilde{S}\) is O(1). Consequently, by using Hadamard’s inequality, we have
Therefore, (6.12)–(6.18) and the bound \(e^{O(W)}\) for the total number of summands of the form (6.13) in \(\prod _{j=1}^W\varvec{\varpi }_j\prod _{k=p,q}\hat{\varvec{\varpi }}_k\) imply (6.7). This completes the proof.
\(\square \)
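For reference, the version of Hadamard's inequality invoked in the proof above reads: for any matrix \(A=(a_{ij})\in \mathbb {C}^{n\times n}\),

$$\begin{aligned} |\det A|\le \prod _{i=1}^{n}\Big (\sum _{j=1}^{n}|a_{ij}|^2\Big )^{\frac{1}{2}}. \end{aligned}$$

In particular, if every row of A has Euclidean norm O(1), then \(|\det A|\le e^{O(n)}\); applied to the submatrices indexed by \(\mathsf {I}\) and \(\mathsf {J}\), this is the mechanism producing the \(e^{O(W)}\)-type bounds above.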
6.3 Integral of \(\mathcal {F}\)
In this subsection, we also temporarily ignore the \(X^{[1]},\mathbf {y}^{[1]},\mathbf {w}^{[1]},P_1,Q_1\)-variables from \(\mathsf {Q}(\cdot )\), and estimate \(\mathsf {F}(\hat{X},\hat{B}, V, T)\) defined in (4.3). We have the following lemma.
Lemma 6.5
Suppose that the assumptions in Lemma 6.1 hold. We have
Proof
Recalling the decomposition of \(\mathcal {F}(\cdot )\) in (3.19) together with the parameterization in (3.29), we will study the integrals
separately. Recalling the convention at the end of Sect. 3, we use \(f(\cdot )\) and \(g(\cdot )\) to represent the integrands above. One can refer to (3.20) and (3.21) for the definitions. From the assumption \(\eta \le M^{-1}N^{\varepsilon _2}\), we see
since \(P_1, V, \hat{X}, X^{[1]}\)-variables are all bounded and \(|\det X_p^{[1]}|, |\det X_q^{[1]}|\sim 1\) when \(\mathbf {x}_1,\mathbf {x}_2\in \widehat{\varSigma }\).
For \(\mathbb {G}(\hat{B},T)\), we use the facts
to estimate trivially several terms, whereby we can get the bound
Here \(\mathsf {Re}(B_j)=Q_1^{-1}T_j^{-1} \mathsf {Re}(\hat{B}_j) T_jQ_1\). Hence, integrating \(y_p^{[1]}\) and \(y_q^{[1]}\) out yields
for some positive constants C and \(C_1\) depending on n, where we used the elementary facts that \(\tilde{\mathfrak {s}}_{kk}\ge c\) for some positive constant c and
Now, note that
In addition, it is also easy to see \(\lambda _1(T_j)=s_j-t_j\) and \(\lambda _1(Q_1)=s-t\), according to the definitions in (3.25). Now, by the fact \(JA^{-1}=AJ\) for any \(A\in \mathring{U}(1,1)\), we have
Consequently, we can get
by recalling the facts \(s^2-t^2=1\) and \(s_j^2-t_j^2=1\). Therefore, combining (6.25), (6.27) and (6.29), we have
Now, what remains is to estimate the exponential function in (6.30). By elementary calculation from (6.28) we obtain
Observe that
It implies that
for some positive constant c, where we have used the fact \(\mathsf {Re}b_{j,\alpha }\ge cr_{j,\alpha }\) for all \(j=1,\ldots , W\) and \(\alpha =1,2\), in light of the assumption \(|E|\le \sqrt{2}-\kappa \) and the definition of \(\mathbb {K}\) in (6.2). Plugging (6.31) into (6.30), estimating \((s+t)^2\le 2(1+2t^2)\), and integrating t out, we can crudely bound
Now, we use the trivial bounds
and
Inserting (6.33) and (6.34) into (6.32) and integrating out the remaining variables yields
Combining (6.22) and (6.35), we obtain the bound (6.19). This completes the proof of Lemma 6.5. \(\square \)
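Two elementary facts used in the proof above may be worth recording. First, the bound \((s+t)^2\le 2(1+2t^2)\) follows from \(s^2-t^2=1\):

$$\begin{aligned} (s+t)^2\le 2(s^2+t^2)=2(1+2t^2), \end{aligned}$$

using \(2st\le s^2+t^2\) in the first step and \(s^2=1+t^2\) in the second. Second, the identity \(JA^{-1}=AJ\) for \(A\in \mathring{U}(1,1)\) can be checked directly; as a sketch, assuming the parameterization \(A=\begin{pmatrix} s &{} te^{\mathbf {i}\sigma }\\ te^{-\mathbf {i}\sigma } &{} s \end{pmatrix}\) with \(s^2-t^2=1\) and \(J=\text {diag}(1,-1)\) (the precise parameterization is the one fixed in (3.25)), one has \(\det A=1\), hence

$$\begin{aligned} JA^{-1}=\begin{pmatrix} s &{} -te^{\mathbf {i}\sigma }\\ te^{-\mathbf {i}\sigma } &{} -s \end{pmatrix}=AJ. \end{aligned}$$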
6.4 Summing up: Proof of Lemma 6.1
In the discussions in Sects. 6.2 and 6.3, we ignored the irrelevant factor \(\mathsf {Q}(\cdot )\). However, it is easy to modify the discussion slightly to take this factor into account, whereby we can prove Lemma 6.1.
Proof
First, by the definition in (6.4), we can rewrite (3.32) as
Now, by the conclusion \(\kappa _1=W^{O(1)}\) in Lemma 6.3, it suffices to consider one term in \(\mathsf {Q}(\cdot )\), which is a monomial of the form \(\mathfrak {p}(t,s, (y^{[1]}_p)^{-1},(y^{[1]}_q)^{-1})\mathfrak {q}(\varOmega , \varXi )\), regarding \(\sigma , {v}_p^{[1]}, {v}_q^{[1]}, P_1\)-variables, \(X^{[1]}\)-variables and \(\mathbf {w}^{[1]}\)-variables as bounded parameters. Here \(\mathfrak {p}(\cdot )\) is a monomial of \(t,s, (y^{[1]}_p)^{-1},(y^{[1]}_q)^{-1}\) and \(\mathfrak {q}(\cdot )\) is a monomial of \(\varOmega ,\varXi \)-variables, both with bounded coefficients and bounded degrees, according to the fact \(\kappa _2,\kappa _3=O(1)\) in Lemma 6.3. Now we define
By repeating the discussions in Sects. 6.2 and 6.3 with slight modifications, we can easily see that (6.7) and (6.19) hold as well, if we replace \(\mathsf {P}(\cdot )\) and \(\mathsf {F}(\cdot )\) by \(\mathsf {P}_{\mathfrak {q}}(\cdot )\) and \(\mathsf {F}_{\mathfrak {p}}(\cdot )\), respectively. This completes the proof of Lemma 6.1. \(\square \)
7 Proofs of Lemmas 5.1 and 5.6
In this section, with the aid of Lemma 6.1, we prove Lemmas 5.1 and 5.6. A key problem is that the domain of \(\mathbf {t}\)-variables is not compact. This forces us to analyze the exponential function
carefully for any fixed \(\hat{B}\)-variables. Recall the definition of the sector \(\mathbb {K}\) in (6.2). For \(\mathbf {b}_1\in \mathbb {K}^W\) and \(\mathbf {b}_2\in \bar{\mathbb {K}}^W\), we have
From now on, we regard \(\mathbb {M}(\mathbf {t})\) as a measure of the \(\mathbf {t}\)-variables and study it in the following two regions separately:
Roughly speaking, when \(\mathbf {t}\in \mathbb {I}^{W-1}\), we will see that \(\mathbb {M}(\mathbf {t})\) can be bounded pointwise by a Gaussian measure. More specifically, we have the following lemma.
Lemma 7.1
With the notation above, we have
Proof
Using the definition of \(\ell _S(\hat{B},T)\) in (5.6) and \(\mathfrak {A}(\hat{B})\) in (7.2) and the fact \(|(T_kT_j^{-1})_{12}|=|s_jt_ke^{\mathbf {i}\sigma _k}-s_kt_je^{\mathbf {i}\sigma _j}|\), we have
A simple estimate using \(s_j^2=1+t_j^2\) shows that
Notice that the assumption \(\mathbf {t}\in \mathbb {I}^{W-1}\) was used only in the last inequality. By (7.3), (7.4) and the definition (7.1), Lemma 7.1 follows immediately. \(\square \)
However, the behavior of \(\mathbb {M}(\mathbf {t})\) for \(\mathbf {t}\in \mathbb {R}_+^{W-1}{\setminus }\mathbb {I}^{W-1}\) is much more delicate. We will not attempt a pointwise control of \(\mathbb {M}(\mathbf {t})\) in this region. Instead, we will bound the integral of \(\mathfrak {q}(\mathbf {t})\) against \(\mathbb {M}(\mathbf {t})\) over this region, for any given monomial \(\mathfrak {q}(\cdot )\) of interest. More specifically, recalling the definition of \(\varTheta \) in (5.30) and the spanning tree \(\mathcal {G}_0=(\mathcal {V},\mathcal {E}_0)\) in Assumption 1.1, and additionally setting
we have the following lemma.
Lemma 7.2
Let \(\mathfrak {q}(\mathbf {t})=\prod _{j=2}^Wt_j^{n_j}\) be a monomial of \(\mathbf {t}\)-variables, with powers \(n_j=O(1)\) for all \(j=2,\ldots , W\). We have
Remark 7.3
Roughly speaking, Lemma 7.2 shows that the integral of \(\mathfrak {q}(\mathbf {t})\) against the measure \(\mathbb {M}(\mathbf {t})\) over the region \(\mathbb {R}_+^{W-1}\setminus \mathbb {I}^{W-1}\) is exponentially small, owing to the fact that \(\varTheta ^2\gg W^2\log N\).
We postpone the proof of Lemma 7.2 to the end of this section. In the sequel, we first prove Lemmas 5.1 and 5.6 with the aid of Lemmas 6.1, 7.1 and 7.2. Before commencing the formal proofs, we mention two basic facts, formulated as the following lemma.
Lemma 7.4
Under Assumption 1.1, we have the following two facts.
-
For the smallest eigenvalue of \(-S^{(1)}\), there exists some positive constant c such that
$$\begin{aligned} \lambda _1(-S^{(1)})\ge {c}/{W^2}. \end{aligned}$$(7.7) -
Let \(\varvec{\varrho }=(\rho _2,\ldots , \rho _{W})'\) be a real vector and \(\rho _1=0\). If there is at least one \(\alpha \in \{2,\ldots ,W\}\) such that \(\varrho _\alpha \ge \varTheta /\sqrt{M}\), then we have
$$\begin{aligned} \sum _{j,k} \mathfrak {s}_{jk} (\varrho _j-\varrho _k)^2\ge {\varTheta }/{M}. \end{aligned}$$(7.8)
Proof
Let \(\varvec{\varrho }=(\rho _2,\ldots , \rho _{W})'\) be a real vector and \(\rho _1=0\). Now, we assume \(|\rho _\alpha |=\max _{\beta =2,\ldots , W}|\rho _\beta |\). Then
where the second step follows from Assumption 1.1 (iv) and the Cauchy–Schwarz inequality. Analogously, we have \(\sum _{j,k} \mathfrak {s}_{jk} (\varrho _j-\varrho _k)^2\ge {c\varrho _\alpha ^2 }/{W}\), which implies (7.8) by the definition of \(\varTheta \) in (5.30). This completes the proof. \(\square \)
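The role of the Cauchy–Schwarz inequality in the second step above can be made explicit by a telescoping sketch: if \(1=j_0, j_1, \ldots , j_m=\alpha \) is the path in the spanning tree \(\mathcal {G}_0\) joining 1 to \(\alpha \) (so that \(m\le W-1\)), then, since \(\rho _1=0\),

$$\begin{aligned} \rho _\alpha ^2=\Big (\sum _{i=1}^{m}(\rho _{j_i}-\rho _{j_{i-1}})\Big )^2\le m\sum _{i=1}^{m}(\rho _{j_i}-\rho _{j_{i-1}})^2\le (W-1)\sum _{\{j,k\}\in \mathcal {E}_0}(\rho _j-\rho _k)^2. \end{aligned}$$

Combined with the lower bound on \(\mathfrak {s}_{jk}\) along the edges of \(\mathcal {E}_0\) from Assumption 1.1 (iv), this controls \(\rho _\alpha ^2\) by \(CW\sum _{j,k}\mathfrak {s}_{jk}(\rho _j-\rho _k)^2\), as used in the proof.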
Recalling the notation defined in (5.2) and the facts \(|x_{j,a}|=1\) and \(|b_{j,a}|=r_{j,a}\) for all \(j=1,\ldots , W\) and \(a=1,2\), for any sequence of domains, we have
In addition, according to Lemma 6.1, we have
for some polynomial \(\tilde{\mathfrak {p}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t})\) with positive coefficients, and
7.1 Proof of Lemma 5.1
Since the domains of the \(\mathbf {x}_1, \mathbf {x}_2\) and \(\mathbf {v}\)-variables, namely \(\varSigma ^{W}, \varSigma ^W\) and \(\mathbb {I}^{W-1}\), will not be involved throughout the proof, we simply represent them by \(*\)'s in order to simplify the notation.
Now, we introduce the following contours with the parameter \(\mathfrak {D}\in \mathbb {R}_+\),
In addition, we recall the sector \(\mathbb {K}\) defined in (6.2). Then, trivially, we have \(\mathbb {R}_+,\varGamma , \mathcal {L}_\mathfrak {D}\subset \mathbb {K}\) and \(\mathbb {R}_+, \bar{\varGamma }, \bar{\mathcal {L}}_\mathfrak {D}\subset \bar{\mathbb {K}}\) for all \(\mathfrak {D}\in \mathbb {R}_+\).
Observe that the integrand in (5.2) is an analytic function of the \(\hat{B}\)-variables. To see this, we can go back to the integral representation (3.17) and the definitions of L(B) and \(\mathcal {P}(\varOmega ,\varXi ,X,B)\) in (3.18). Since \(\exp \{M\log \det B_j\}=(\det B_j)^M\), the logarithmic terms in L(B) do not actually produce any singularity in the integrand in (3.17); they can also compensate the factors \(b_{j,a}^{-\ell }\) from \(\mathcal {P}(\varOmega ,\varXi ,X,B)\) since \(\ell \ll M\). Consequently, we have
Hence, to prove Lemma 5.1, it suffices to prove the following lemma.
Lemma 7.5
Suppose that \(|E|\le \sqrt{2}-\kappa \). As \(\mathfrak {D}\rightarrow \infty \), the following convergences hold:
Proof
For simplicity, we use the notation
Now, recall the definition of the function \(\ell (\mathbf {a})\) in (5.6) and the representation of \(L(\hat{B},T)\) in (5.5). Hence, in light of the definition of \(\mathbb {M}(\mathbf {t})\) in (7.1), we have
By the assumption \(|E|\le \sqrt{2}-\kappa \), we see that \(\mathsf {Re}b_{j,a}b_{k,a}>0\) for all \(b_{j,a},b_{k,a}\in \mathbb {K}\cup \bar{\mathbb {K}}\). Consequently, when \(b_{j,1}\in \mathbb {K}\) and \(b_{j,2}\in \bar{\mathbb {K}}\) for all \(j=1,\ldots , W\), we have
where we used Assumption 1.1 (ii) and the fact that \((-1)^{a+1}E\mathsf {Im}b_{j,a}\ge 0\).
Now, when \((\mathbf {b}_1,\mathbf {b}_2)\in \mathbf {I}_\mathfrak {D}^{b,i}\) for \(i=1,2, 3\), we have \(\sum _{a=1,2}\sum _{j} r_{j,a}^2 \ge c \mathfrak {D}^2\) for some positive constant c, which implies the trivial fact
Consequently, we can get from (7.12), (7.13) and (7.14) that for some positive constant c,
holds in \(\mathbf {I}_\mathfrak {D}^{b,i}\) for \(i=1,2,3\). In addition, by the boundedness of V and \(\hat{X}\)-variables, we can get the trivial bound \(MK(\hat{X},V)=O(N)\). Hence, from (7.9) and (7.10) we see that the quantities in Lemma 7.5 (i), (ii) and (iii) can be bounded by the following integral with \(i=1,2,3\), respectively,
According to the facts \(\kappa _1=e^{O(W)}\) and \(\kappa _2=O(1)\) in (7.11), it suffices to consider one monomial, say \(\tilde{\mathfrak {q}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t})\), instead of \(\tilde{\mathfrak {p}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t})\). Then, bounding the \(t_j\)'s by 1 trivially in the region \(\mathbf {t}\in \mathbb {I}^{W-1}\) and using Lemma 7.2 in the region \(\mathbf {t}\in \mathbb {R}_+^{W-1}\setminus \mathbb {I}^{W-1}\), we can first integrate the \(\mathbf {t}\)-variables out; then, performing the Gaussian integral over the \(\hat{B}\)-variables, it is not difficult to get the estimate
for \(i=1,2,3\). Here we remark that after integrating the \(\mathbf {t}\)-variables out, we get a singularity \((\min _{j,a} r_{j,a})^{-CW^2}\) from the factor \((1+\mathfrak {L}^{-\frac{1}{2}})^{O(W^2)}\) according to (7.6), which can be compensated by the factor \(\prod _{a}\prod _{j}r_{j,a}^M\) owing to the fact \(M\gg W^2\). This completes the proof. \(\square \)
7.2 Proof of Lemma 5.6
Plugging the first identity of (5.21) and (7.10) into (7.9), we can write
where \(\tilde{\mathfrak {p}}(\mathbf {r},\mathbf {r}^{-1},\mathbf {t})\) is specified in (7.11).
Lemma 5.6 immediately follows from the following two lemmas.
Lemma 7.6
Under Assumptions 1.1 and 1.14, we have
Lemma 7.7
Under Assumptions 1.1 and 1.14, we have
In the sequel, we prove Lemmas 7.6 and 7.7.
Proof of Lemma 7.6
Recall (7.16) with the choice of the integration domains
To simplify the integral on the r.h.s. of (7.16), we use the fact \(\mathsf {Re}\mathring{K}(\hat{X},V)\ge 0\) implied by (5.27), together with the facts that the \(\mathbf {x}\) and \(\mathbf {v}\)-variables are bounded by 1. Consequently, we can eliminate the integral over \(\mathbf {x}\) and \(\mathbf {v}\)-variables from the integral on the r.h.s. of (7.16). Moreover, according to (7.11), it suffices to prove
instead, for some monomial \(\tilde{\mathfrak {q}}(\cdot )\) in \(\tilde{\mathfrak {p}}(\cdot )\). The proof of (7.19) is then just an application of the first inequality of (5.13), Lemma 7.2 and an elementary Gaussian integral. We omit the details here. \(\square \)
To prove Lemma 7.7, we split the exponential function into two parts. We use one part to control the integral, and the other will be estimated by its magnitude. More specifically, we shall prove the following two lemmas.
Lemma 7.8
Under Assumptions 1.1 and 1.14, we have
Lemma 7.9
If \((\mathbf {b}_1,\mathbf {b}_2,\mathbf {x}_1,\mathbf {x}_2,\mathbf {t}, \mathbf {v})\in \varGamma ^W\times \bar{\varGamma }^W\times \varSigma ^W\times \varSigma ^W\times \mathbb {I}^{W-1}\times \mathbb {I}^{W-1}\), but not in any of the Types I, II, III vicinities in Definition 5.5, we have
With Lemmas 7.8 and 7.9, we can prove Lemma 7.7.
Proof of Lemma 7.7
For the sake of simplicity, in this proof, we temporarily use \(\mathcal {I}_{\text {full}}\) to represent the l.h.s. of (7.18), i.e. the integral over the full domain, and use \(\mathcal {I}_{I}, \mathcal {I}_{II}\) and \(\mathcal {I}_{III}\) to represent the first three terms on the r.h.s. of (7.18). Now, combining (7.16), (7.21) and (7.20), we see that,
in light of the definition of \(\varTheta \) in (5.30) and the assumption (5.34). This completes the proof of Lemma 7.7. \(\square \)
Proof of Lemma 7.8
First, again, the polynomial \(\tilde{\mathfrak {p}}(\cdot )\) in the integrand can be replaced by some monomial \(\tilde{\mathfrak {q}}(\cdot )\) in the discussion, owing to the fact that \(\kappa _1=\exp \{O(W)\}\) in (7.11). The proof is then similar to that of Lemma 7.6, but much simpler, since the \(\mathbf {t}\)-variables are now bounded by 1. Consequently, we can eliminate the \(\hat{X}, \mathbf {t}\) and \(\mathbf {v}\)-variables from the integral directly and use the trivial bounds
where the latter is from (5.13). Then, an elementary Gaussian integral leads to the conclusion immediately. \(\square \)
Proof of Lemma 7.9
According to (5.13) and (5.27), both \(M\mathsf {Re}\mathring{L}(\hat{B},T)\) and \(M\mathsf {Re}\mathring{K}(\hat{X},V)\) are nonnegative on the full domain. Hence, it suffices to show that one of them is larger than \(\varTheta \) outside the Type I, II, III vicinities. For the former, using the first inequality of (5.13), (7.3) and (7.4), we have
Then it is easy to see that \( M\mathsf {Re}\mathring{L}(\hat{B},T)\ge \varTheta \) if \((\mathbf {b}_1,\mathbf {b}_2,\mathbf {t})\not \in \varUpsilon ^b_+\times \varUpsilon ^b_-\times \varUpsilon _S\), by the definitions in (5.32).
Now we turn to \(M\mathsf {Re}\mathring{K}(\hat{X},V)\). Recall the definition of the \(\vartheta _j\)'s in (5.26). If there is some \(j\in \{1,\ldots ,2W\}\) such that \((\sin \vartheta _j-E/2)^2> \varTheta /M\), we can get \(M\mathsf {Re}\mathring{K}(\hat{X},V)>\varTheta \) immediately, by using (5.27). Hence, in the sequel, it suffices to consider the case \(\Big (\sin \vartheta _j-{E}/{2}\Big )^2\le {\varTheta }/{M}\) for all \({j=1,\ldots , 2W}\), which implies
Now, recall the notation defined in (5.33). We claim that it suffices to focus on the following three subcases of (7.23), corresponding to the Type I, II and III saddle points.
-
(i)
There is a sequence of permutations of \(\{1,2\}, \varvec{\epsilon }=(\epsilon _1,\ldots , \epsilon _W)\), such that \(||\arg (a_+^{-1}\mathbf {x}_{\varvec{\epsilon }(1)})||_\infty ^2\le {\varTheta }/{M}\) and \(||\arg (a_-^{-1}\mathbf {x}_{\varvec{\epsilon }(2)})||_\infty ^2\le {\varTheta }/{M}\),
-
(ii)
For \(a=1,2\), we have \(||\arg (a_+^{-1}\mathbf {x}_{a})||_\infty ^2\le {\varTheta }/{M}\).
-
(iii)
For \(a=1,2\), we have \(||\arg (a_-^{-1}\mathbf {x}_{a})||_\infty ^2\le {\varTheta }/{M}\).
For those \(\hat{X}\)-variables which satisfy (7.23) but do not belong to any of the cases (i), (ii) or (iii) listed above, one can actually get \(M\mathsf {Re}\mathring{K}(\hat{X},V)>cM\gg \varTheta \). We explain this estimate for the following case, namely, that there is some pair \(\{i,j\}\in \mathcal {E}\) such that
all the other cases can be handled analogously. Using (5.27), we have
Now, by the assumption (7.24), we have
Consequently, from (7.25) and the definition (5.23) we have
where the last step follows from Assumption 1.1 (iv). Therefore, we have \(M\mathsf {Re}\mathring{K}(\hat{X},V)>cM\).
Hence, we can focus on the cases (i), (ii) and (iii). Note that in case (i), we actually defined a vicinity of \((\mathbf {x}_{\varvec{\epsilon }(1)},\mathbf {x}_{\varvec{\epsilon }(2)})\) in terms of the \(\ell ^\infty \)-norm rather than the \(\ell ^2\)-norm of \(\mathbf {x}_{\varvec{\epsilon }(1)}\) and \(\mathbf {x}_{\varvec{\epsilon }(2)}\); thus the vicinity is larger than \(\varUpsilon _+^{x}\times \varUpsilon _-^x\). Without loss of generality, we assume that \(\epsilon _1=\mathrm {id}\), the identity. Then, by performing the transform \((\hat{X}_j,V_j)\rightarrow (\mathfrak {I}\hat{X}_j\mathfrak {I}, \mathfrak {I}V_j)\) for those j with \(\epsilon _j\ne \mathrm {id}\), we can transform \((\mathbf {x}_{\varvec{\epsilon }(1)}, \mathbf {x}_{\varvec{\epsilon }(2)}, \mathbf {v}_{\varvec{\epsilon }})\) to \((\mathbf {x}_1,\mathbf {x}_2,\mathbf {v})\) for every permutation sequence \(\varvec{\epsilon }\). Hence, it suffices to assume \(\epsilon _j=\mathrm {id}\) for all \(j=1,\ldots , W\) and consider the case
Now, if \((\mathbf {x}_1,\mathbf {x}_2)\not \in \varUpsilon _+^x\times \varUpsilon _-^x\), we can use (5.27) again to see that
where in the second step we used the facts \(||a_+^{-1}\mathbf {x}_1||_\infty <{\varTheta }/{M}\) and \(||a_-^{-1}\mathbf {x}_2||_\infty <{\varTheta }/{M}\), and in the last step we used the definition in (5.32). Now, if \((\mathbf {x}_1,\mathbf {x}_2)\in \varUpsilon _+^x\times \varUpsilon _-^x\) but \(\mathbf {v}\not \in \varUpsilon _S\), we go back to (5.20). It is easy to check that \(-\mathring{\ell }_{++}(\mathbf {x}_1)-\mathring{\ell }_{+-}(\mathbf {x}_2)\ge 0\) for \((\mathbf {x}_1,\mathbf {x}_2)\in \varUpsilon _+^x\times \varUpsilon _-^x\). Hence it suffices to estimate \(\ell _S(\hat{X},V)\). By the definition in (5.17) and the fact \(x_{j,1}-x_{j,2}=a_+-a_-+o(1)\) for all \(j=1,\ldots , W\), we have
where the last step follows from the same argument as (7.4). Consequently, \(M\ell _S(\hat{X},V)\ge \varTheta \) if \(\mathbf {v}\not \in \varUpsilon _S\). Hence, we have finished the discussion of case (i). For cases (ii) and (iii), the proofs are much easier, since it suffices to discuss the \(\hat{X}\)-variables; we omit the details here. This completes the proof of Lemma 7.9. \(\square \)
7.3 Proof of Lemma 7.2
Let \(\mathbb {I}^c=\mathbb {R}_+{\setminus } \mathbb {I}\). Now we consider the domain sequence \(\mathbf {\mathbb {J}}=(\mathbb {J}_2,\ldots , \mathbb {J}_{W})\in \{\mathbb {I}, \mathbb {I}^c\}^{W-1}\). We decompose the integral in Lemma 7.2 as follows
Note that the total number of choices of \(\mathbf {\mathbb {J}}\) in the sum above is \(2^{W-1}-1\). It suffices to consider one of these sequences \(\mathbf {\mathbb {J}}\in \{\mathbb {I}, \mathbb {I}^c\}^{W-1}\) in which there is at least one i such that \(\mathbb {J}_i=\mathbb {I}^c\).
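The count \(2^{W-1}-1\) is immediate:

$$\begin{aligned} \big |\{\mathbb {I},\mathbb {I}^c\}^{W-1}\big |=2^{W-1},\qquad \#\big \{\mathbf {\mathbb {J}}: \exists \, i \text { such that } \mathbb {J}_i=\mathbb {I}^c\big \}=2^{W-1}-1, \end{aligned}$$

since the decomposition of \(\mathbb {R}_+^{W-1}\setminus \mathbb {I}^{W-1}\) excludes exactly one sequence, namely the all-\(\mathbb {I}\) sequence corresponding to the region \(\mathbb {I}^{W-1}\) itself.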
Recall the spanning tree \(\mathcal {G}_0=(\mathcal {V},\mathcal {E}_0)\) in Assumption 1.1. The simplest case is that there exists a linear spanning tree (a path) \(\mathcal {G}_0\) with
We first present the proof in this simplest case.
Now, we only keep the edges in the path \(\mathcal {E}_0\), i.e. the terms with \(k=j-1\) in (7.3); we also trivially discard the term \(1/(1+2t_{j}^2)\) from the sum \(1/(1+2t_{j-1}^2)+1/(1+2t_{j}^2)\) in the estimate (7.4) (the first inequality); and finally we bound all \(M\mathfrak {A}(\hat{B})\mathfrak {s}_{j-1,j}/4\) from below by \(\mathfrak {L}\) defined in (7.5). That is, we use the bound
Consequently, we have
Note that, as a function of \(\mathbf {t}\), \(\breve{\mathbb {M}}_j(\mathbf {t})\) only depends on \(t_{j-1}\) and \(t_{j}\).
Having fixed \(\mathbf {\mathbb {J}}\), assume that k is the largest index such that \(\mathbb {J}_{k}=\mathbb {I}^c\), i.e. \(t_{k+1},\ldots , t_W\in \mathbb {I}\). Then, using the fact \(t_1=0\) and \(t_k\ge 1\), it is not difficult to check
by employing the elementary facts
Now, we split \(\prod _{j=2}^W\breve{\mathbb {M}}_j(\mathbf {t})\) into two parts. We use one to control the integral, and the other will be estimated by (7.32). Specifically, substituting (7.32) into (7.31) we have
Therefore, what remains is to estimate the integral in (7.34), which can be done by elementary Gaussian integrals, step by step. More specifically, using (7.33) and the change of variables \(t_j/t_{j-1}-1\rightarrow t_j\) in case \(t_{j-1}\in \mathbb {I}^c\) and \(t_j-t_{j-1}\rightarrow t_j\) in case \(t_{j-1}\in \mathbb {I}\), it is elementary to see that for any \(\ell =O(W)\),
Starting from \(j=W\) and using (7.35) to integrate (7.34) successively, the exponent of \(t_j\) increases linearly (since \(n_j=O(1)\)), thus we can get
Then (7.6) follows from the definition of \(\mathfrak {L}\) in (7.5) and (5.35). This completes the proof of (7.6) when the spanning tree is given by (7.29).
Now, we consider a more general spanning tree \(\mathcal {G}_0\) and regard 1 as its root. We start from the generalization of (7.30), namely,
Here we adopt the convention that \(\text {dist}(1,i)=\text {dist}(1,j)-1\) for all \(\{i,j\}\in \mathcal {E}_0\), where \(\text {dist}(a,b)\) represents the distance between a and b. Now, if there is \(k'\) such that \(\mathbb {J}_{k'}= \mathbb {I}^c\), we can prove the following analogue of (7.32), namely,
Consequently, we can get the analogue of (7.34) by replacing the \(\breve{\mathbb {M}}_j(\mathbf {t})\)'s by the \(\breve{\mathbb {M}}_{i,j}(\mathbf {t})\)'s. Finally, integrating the \(t_j\)'s out successively, from the leaves to the root 1, yields the same conclusion, i.e. (7.6), for a general \(\mathcal {G}_0\). This completes the proof of Lemma 7.2. \(\square \)
8 Gaussian measure in the vicinities
From now on, we can restrict ourselves to the Type I, II and III vicinities. As a preparation of the proofs of Lemmas 5.8 and 5.9, we will show in this section that the exponential function
is approximately a Gaussian measure (unnormalized).
8.1 Parametrization and initial approximation in the vicinities
We change the \(\mathbf {x}, \mathbf {b}, \mathbf {t}, \mathbf {v}\)-variables to a new set of variables, namely, \(\mathring{\mathbf {x}}, \mathring{\mathbf {b}}, \mathring{\mathbf {t}}\) and \(\mathring{\mathbf {v}}\). The precise definition of \(\mathring{\mathbf {x}}\) differs between the vicinities. To distinguish the parameterizations, we set \(\varkappa =\pm , +\), or \(-\), corresponding to the Type I, II or III vicinity, respectively. Recall \(D_\varkappa \) from (1.28). For each j and each \(\varkappa \), we then set
If \(\varkappa =\pm \), we also need to parameterize \(v_j\) by
Correspondingly, we set the vectors
Accordingly, recalling the quantity \(\varTheta \) from (5.30), we introduce the domains
We remind the reader that, as mentioned above, the small constant \(\varepsilon _0\) in \(\mathring{\varUpsilon }\) and \(\mathring{\varUpsilon }_S\) may differ from line to line, subject to (5.34). Now, by the definition of the Type I', II and III vicinities in Definition 5.5 and the parametrization in (8.2) and (8.3), we can redefine the vicinities as follows.
Definition 8.1
We can redefine three types of vicinities as follows.
-
Type I’ vicinity : \(\big (\mathring{\mathbf {b}}_1,\mathring{\mathbf {b}}_2,\mathring{\mathbf {x}}_{1},\mathring{\mathbf {x}}_{2},\mathring{\mathbf {t}},\mathring{\mathbf {v}}\big )\in \mathring{\varUpsilon }\times \mathring{\varUpsilon }\times \mathring{\varUpsilon }\times \mathring{\varUpsilon } \times \mathring{\varUpsilon }_S\times \mathring{\varUpsilon }_S \), with \(\varkappa =\pm \).
-
Type II vicinity : \(\big (\mathring{\mathbf {b}}_1,\mathring{\mathbf {b}}_2,\mathring{\mathbf {x}}_{1},\mathring{\mathbf {x}}_{2},\mathring{\mathbf {t}},\mathbf {v}\big )\in \mathring{\varUpsilon }\times \mathring{\varUpsilon }\times \mathring{\varUpsilon }\times \mathring{\varUpsilon } \times \mathring{\varUpsilon }_S\times \mathbb {I}^{W-1}\), with \(\varkappa =+\).
-
Type III vicinity : \(\big (\mathring{\mathbf {b}}_1,\mathring{\mathbf {b}}_2,\mathring{\mathbf {x}}_{1},\mathring{\mathbf {x}}_{2},\mathring{\mathbf {t}},\mathbf {v}\big )\in \mathring{\varUpsilon }\times \mathring{\varUpsilon }\times \mathring{\varUpsilon }\times \mathring{\varUpsilon } \times \mathring{\varUpsilon }_S\times \mathbb {I}^{W-1}\), with \(\varkappa =-\).
We recall from (7.8) the fact
Now, we use the representation (5.2). Then, for the Type I vicinity, we change \(\mathbf {x},\mathbf {b},\mathbf {t},\mathbf {v}\)-variables to \(\mathring{\mathbf {x}},\mathring{\mathbf {b}},\mathring{\mathbf {t}},\mathring{\mathbf {v}}\)-variables according to (8.2) with \(\varkappa =\pm \), thus
For the Type II or III vicinities, i.e. \(\varkappa =+\) or \(-\), we change \(\mathbf {x},\mathbf {b},\mathbf {t}\)-variables to \(\mathring{\mathbf {x}},\mathring{\mathbf {b}},\mathring{\mathbf {t}}\)-variables. Consequently, we have
We will also need the following facts
if \(\mathbf {x}_{1},\mathbf {x}_2\in \widehat{\varSigma }^W, b_{j,1}=a_++o(1), b_{j,2}=-a_-+o(1)\) and \(t_j=o(1)\), for all \(j=1,\ldots , W\), which always holds in these types of vicinities. The first estimate in (8.8) is trivial, and the second follows from Lemma 6.1.
Now, we approximate (8.1) in the vicinities. We introduce the matrices \(\mathcal {E}_+(\vartheta )=\text {diag}(e^{\mathbf {i}\vartheta }, e^{-\mathbf {i}\vartheta })\mathfrak {I}\) and \(\mathcal {E}_-(\vartheta )=\text {diag}( e^{\mathbf {i}\vartheta }, -e^{-\mathbf {i}\vartheta } )\mathfrak {I}\) for any \(\vartheta \in \mathbb {L}\). Then, with the parameterization above, expanding \(\hat{X}_j\) in (3.23) and \(T_j\) in (3.25) up to the second order, we can write
For \(\varkappa =\pm \), we also expand \(V_j\) in (3.25) up to the second order, namely,
We take (8.9) and (8.10) as the definitions of \(R^x_j, R^t_j\) and \(R^v_j\). Note that \(R^x_j\) is actually \(\varkappa \)-dependent. However, this dependence is irrelevant for our analysis and is therefore suppressed in the notation. It is elementary that
Recall the facts (5.10) and (5.20). In addition, in light of (5.18)–(5.20), we can also represent \(\mathring{K}(\hat{X},V)\) in the following two alternative ways
We will use the representations of \(\mathring{K}(\hat{X},V)\) in (8.12) and (8.13) for Type II and III vicinities respectively. In addition, we introduce the matrices
Then, we have the following lemma.
Lemma 8.2
With the parametrization in (8.9), we have the following approximations.
-
Let \(\mathring{\mathbf {b}}_1, \mathring{\mathbf {b}}_2\in \mathbb {C}^{W}\) and \(||\mathring{\mathbf {b}}_1||_\infty , ||\mathring{\mathbf {b}}_2||_\infty =o(\sqrt{M})\), we have
$$\begin{aligned}&M\left( \mathring{\ell }_{++}(\mathbf {b}_1)+\mathring{\ell }_{--}(\mathbf {b}_2)\right) =\frac{1}{2}\mathring{\mathbf {b}}_1'\mathbb {A}_+\mathring{\mathbf {b}}_1+\frac{1}{2}\mathring{\mathbf {b}}_2'\mathbb {A}_-\mathring{\mathbf {b}}_2+R^b,\nonumber \\&\quad \quad R^b=O\left( \frac{\sum _{a=1,2}||\mathring{\mathbf {b}}_{a}||_3^3}{\sqrt{M}}\right) . \end{aligned}$$(8.15) -
Let \(\varkappa =\pm \) and \(\mathring{\mathbf {x}}_1,\mathring{\mathbf {x}}_2\in \mathbb {C}^{W}\) and \(||\mathring{\mathbf {x}}_1||_\infty ,||\mathring{\mathbf {x}}_2||_\infty =o(\sqrt{M})\), we have
$$\begin{aligned}&M\left( -\mathring{\ell }_{++}(\mathbf {x}_1)-\mathring{\ell }_{+-}(\mathbf {x}_2)\right) =\frac{1}{2}\mathring{\mathbf {x}}_1'\mathbb {A}_+\mathring{\mathbf {x}}_1+\frac{1}{2}\mathring{\mathbf {x}}_2'\mathbb {A}_-\mathring{\mathbf {x}}_2+R^x_\pm ,\nonumber \\&\quad R^x_\pm =O\left( \frac{\sum _{a=1,2}||\mathring{\mathbf {x}}_{a}||_3^3}{\sqrt{M}}\right) . \end{aligned}$$(8.16) -
In the Type II vicinity, we have
$$\begin{aligned} M\left( -\mathring{\ell }_{++}(\mathbf {x}_1)-\mathring{\ell }_{++}(\mathbf {x}_2)\right) =\frac{1}{2}\mathring{\mathbf {x}}_1'\mathbb {A}_+\mathring{\mathbf {x}}_1+\frac{1}{2}\mathring{\mathbf {x}}_2'\mathbb {A}_+\mathring{\mathbf {x}}_2+R^x_+,\quad R^x_+=O\left( \frac{\varTheta ^{\frac{3}{2}}}{\sqrt{M}}\right) . \nonumber \\ \end{aligned}$$(8.17) -
In the Type III vicinity, we have
$$\begin{aligned} M\left( -\mathring{\ell }_{+-}(\mathbf {x}_1)-\mathring{\ell }_{+-}(\mathbf {x}_2)\right) =\frac{1}{2}\mathring{\mathbf {x}}_1'\mathbb {A}_-\mathring{\mathbf {x}}_1+\frac{1}{2}\mathring{\mathbf {x}}_2'\mathbb {A}_-\mathring{\mathbf {x}}_2+R^x_-,\quad R^x_-=O\left( \frac{\varTheta ^{\frac{3}{2}}}{\sqrt{M}}\right) . \nonumber \\ \end{aligned}$$(8.18)
Here \(R^b, R^x_\pm , R^x_+\) and \(R^x_-\) are remainder terms of the Taylor expansion of the function \(\ell (\mathbf {a})\) defined in (5.6).
Proof
This follows easily from the Taylor expansion of the function \(\ell (\mathbf {a})\). \(\square \)
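The mechanism behind these remainder bounds can be illustrated in one scalar coordinate. If \(f\) is a smooth function with a critical point at 0 and a bounded third derivative nearby (a generic stand-in for a component of \(\ell (\mathbf {a})\) at the saddle point, not the precise definition (5.6)), then for \(a=o(\sqrt{M})\),
$$\begin{aligned} Mf\Big (\frac{a}{\sqrt{M}}\Big )=M\Big (f(0)+\frac{f''(0)}{2}\frac{a^2}{M}+O\Big (\frac{|a|^3}{M^{3/2}}\Big )\Big )=Mf(0)+\frac{f''(0)}{2}a^2+O\Big (\frac{|a|^3}{\sqrt{M}}\Big ). \end{aligned}$$
Summing such errors over the W coordinates produces remainders of the form \(O(||\mathring{\mathbf {a}}||_3^3/\sqrt{M})\), as in (8.15)–(8.18).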
Then, according to (5.10), (5.20), (8.12) and (8.13), what remains is to approximate \(M\ell _S(\hat{B},T)\) and \(M\ell _S(\hat{X},V)\) in the vicinities. Recalling the definition in (5.6) and the parameterization in (8.2), we can rewrite
We take the above equation as the definition of \(R^{t,b}\). Now, we set
and change the variables and the measure as
In the Type I’ vicinity, we can do the same thing for \(M\ell _S(\hat{X},V)\), namely,
where \(R^{v,x}_\pm \) is the remainder term. Then we set
and change the variables and measure as
Now, we introduce the vectors
With this notation, we can rewrite (8.19) and (8.22) as
According to (8.21) and (8.24), we can express (8.6) as an integral over the \(\mathring{\mathbf {b}}, \mathring{\mathbf {x}}, \mathring{\varvec{\tau }}\) and \(\mathring{\varvec{\upsilon }}\)-variables. However, we need to specify the domains of the \(\mathring{\varvec{\tau }}\) and \(\mathring{\varvec{\upsilon }}\)-variables in advance. Our aim is to restrict the integral to the domains
To see that the truncation from the domain \((\mathring{\mathbf {t}},\mathring{\mathbf {v}},\varvec{\sigma },\varvec{\theta })\in \mathring{\varUpsilon }_S\times \mathring{\varUpsilon }_S\times \mathbb {L}^{W-1}\times \mathbb {L}^{W-1}\) to (8.26) is harmless in the integral (8.6), we need to bound \(R^{t,b}\) and \(R^{v,x}_\pm \) in terms of \(\varvec{\tau }'_aS^{(1)}\varvec{\tau }_a\) and \(\varvec{\upsilon }'_aS^{(1)}\varvec{\upsilon }_a\), respectively. By (8.5), we have \(s_j=1+O(\varTheta ^{2}/M)\) for all \(j=2,\ldots , W\), which also implies
Then, by the definition of \(R^{t,b}\) in (8.19), it is not difficult to check that in the Type I’ vicinity, we have
The same estimate holds if we replace \(R^{t,b}\) by \(R^{v,x}_\pm \) and \(\varvec{\tau }_a\) by \(\varvec{\upsilon }_a\). Then it is obvious that if one of \(\varvec{\tau }_1, \varvec{\tau }_2, \varvec{\upsilon }_1\) and \(\varvec{\upsilon }_2\) is not in \(\mathring{\varUpsilon }_S\), we obtain (7.21). Hence, using (8.8), we can discard the integral outside the vicinity, analogously to the proof of Lemma 7.7. More specifically, in the sequel, we can and do assume (8.26). Plugging (8.26) into (8.28) then yields the bound
By the discussion above, for the Type I vicinity, we can write (8.6) as
where the error term stems from the truncation of the domain \((\mathring{\mathbf {t}},\mathring{\mathbf {v}}, \varvec{\sigma },\varvec{\theta })\in \mathring{\varUpsilon }_S\times \mathring{\varUpsilon }_S\times \mathbb {L}^{W-1}\times \mathbb {L}^{W-1}\) to \((\varvec{\tau }_1, \varvec{\tau }_2,\varvec{\upsilon }_1, \varvec{\upsilon }_2)\in \mathring{\varUpsilon }_S\times \mathring{\varUpsilon }_S\times \mathring{\varUpsilon }_S\times \mathring{\varUpsilon }_S\).
Now, for the Type II and III vicinities, the discussion on \(\ell _S(\hat{B},T)\) is of course the same. For \(\ell _S(\hat{X},V)\), we make the following approximation. For the Type II vicinity, using the notation in (5.22), we can write
It is easy to see that
Recall \(\mathbb {S}^v\) defined in (5.23). Set
Combining (5.16), (5.19), (8.17) and (8.31) we obtain
where
Analogously, for the Type III vicinity, we can write
where
Consequently, by (8.12) and (8.13) we can write (8.7) for \(\varkappa =+,-\) as
8.2 Steepest descent paths in the vicinities
In order to estimate the integrals (8.30) and (8.36) properly, we need to control the various remainder terms in (8.30) and (8.36) so as to reduce these integrals to Gaussian ones. The final result is collected in Proposition 8.4 at the end of this section. As a preparation, we shall further deform the contours of the \(\mathring{\mathbf {b}}\)-variables and \(\mathring{\mathbf {x}}\)-variables to the steepest descent paths. We mainly provide the discussion for the \(\mathring{\mathbf {b}}\)-variables; that for the \(\mathring{\mathbf {x}}\)-variables is analogous.
For simplicity, in this section, we assume \(0\le E\le \sqrt{2}-\kappa \); the case \(-\sqrt{2}+\kappa \le E\le 0\) can be discussed similarly. We introduce the eigendecomposition of S as
Note that \(\mathsf {U}\) is an orthogonal matrix, so its entries are all real. Now, we perform the change of coordinates
Obviously, for the differentials, we have \(\prod _{j=1}^W \mathrm{d}\mathring{b}_{j,a}=\prod _{j=1}^W \mathrm{d} c_{j,a}\) for \(a=1,2\). In addition, for the domains, it is elementary to see
Now, we introduce the notation
and set the diagonal matrices
By the assumption \(0\le E\le \sqrt{2}-\kappa \) and (1.5), it is not difficult to check
With the notation introduced above, we have
To simplify the following discussion, we enlarge the domain of the \(\mathbf {c}\)-variables to
Obviously, \(\mathring{\varUpsilon }\subset \varUpsilon _\infty \). It is easy to check that (7.21) also holds when \(\mathbf {c}_a\in \varUpsilon _\infty {\setminus }\mathring{\varUpsilon }\) for either \(a=1\) or 2, according to (8.37), thus such a modification of the domain will only produce an error term of order \(O(\exp \{-\varTheta \})\) in the integral (8.30), by using (8.8).
Now we do the scaling \(\mathbf {c}_1\rightarrow \mathbb {D}_+\mathbf {c}_{1}\) and \(\mathbf {c}_2\rightarrow \mathbb {D}_-\mathbf {c}_{2}\). Consequently, we have
thus
Accordingly, we should adjust the change of differentials as
In addition, the domain of \(\mathbf {c}_1\) should be changed from \(\varUpsilon _\infty \) to \(\prod _{j=1}^W \mathbb {J}_j^+\) and that of \(\mathbf {c}_2\) should be changed from \(\varUpsilon _\infty \) to \(\prod _{j=1}^W \mathbb {J}_j^-\), where
Now, we consider the integrand in (8.30) as a function of \(\mathbf {c}\)-variables on the disks, namely,
For \(\mathbf {c}_1\in \prod _{j=1}^W\mathbb {O}_j^+\) and \(\mathbf {c}_2\in \prod _{j=1}^W\mathbb {O}_j^-\), by (8.38) and (8.40) we have
Here we used the elementary fact \(||U\mathbf {a}||_\infty \le \sqrt{W}||\mathbf {a}||_\infty \) for any \(\mathbf {a}\in \mathbb {C}^W\) and any unitary matrix U. Then, we deform the contour of \(c_{j,1}\) from \(\mathbb {J}_j^+\) to
for each \(j=1,\ldots , W\), where
It is not difficult to see \(\mathsf {Re}c_{j,1}^2\ge \varTheta \) for \( c_{j,1}\in (-\varSigma _j^+)\cup \varSigma _j^+\), by (8.38). Consequently, by (8.41), we have
Then using (8.8), we can get rid of the integral over \(\varSigma _j^+\) and \(-\varSigma _j^+\), analogously to the discussion in Sect. 7. Similarly, we can apply the same argument to \(\mathbf {c}_2\). Consequently, we can restrict the \(\mathbf {c}_1\) and \(\mathbf {c}_2\) integrals from \(\mathbf {c}_1\in \prod _{j=1}^W \mathbb {J}_j^+\) and \(\mathbf {c}_2\in \prod _{j=1}^W \mathbb {J}_j^-\) to the domains \(\mathbf {c}_1\in \prod _{j=1}^W\mathbb {L}_j^+\) and \(\mathbf {c}_2\in \prod _{j=1}^W \mathbb {L}_j^-\).
By (8.15), (8.42) and the fact \(||\mathbf {a}||_3^3\le ||\mathbf {a}||_\infty ||\mathbf {a}||_2^2\) for any vector \(\mathbf {a}\), we see that
for some positive constant C, where in the last step we also used the fact that \(||\mathbf {b}_a||_2= O(||\mathbf {c}_a||_2)\) for \(a=1,2\), which is implied by (8.40) and (8.38). Consequently, we have
This allows us to go one step further and truncate \(\mathbf {c}_1\) and \(\mathbf {c}_2\) according to their 2-norms, namely
Similarly to the discussion in the proof of Lemma 7.7, such a truncation will only produce an error of order \(\exp \{-\varTheta \}\) to the integral, by using (8.8).
Now, analogously to (8.40), we can change \(\mathring{\mathbf {x}}\)-variables to \(\mathbf {d}\)-variables, defined by
Thus accordingly, we change the differentials
In addition, like (8.44), we deform the domain to \(\mathbf {d}_1,\mathbf {d}_2\in \mathring{\varUpsilon }\). Finally, using the fact \(\det \mathbb {D}_+\mathbb {D}_-=1/\sqrt{\det \mathbb {A}_+\mathbb {A}_-}\), from (8.30) we arrive at the representation
in which the \(\mathbf {x}\) and \(\mathring{\mathbf {x}}\)-variables should be regarded as functions of the \(\mathbf {d}\)-variables; likewise, the \(\mathbf {b}\) and \(\mathring{\mathbf {b}}\)-variables should be regarded as functions of the \(\mathbf {c}\)-variables.
Now, in the Type II and III vicinities, we only do the change of coordinates for the \(\mathring{\mathbf {b}}\)-variables, which is enough for our purpose. Consequently, we have
By (8.38), it is then easy to see
We keep the terminology “Type I’, II and III vicinities” for the slightly modified domains defined in terms of \(\mathbf {c}, \mathbf {d}, \varvec{\tau }\) and \(\varvec{\upsilon }\)-variables. More specifically, we redefine the vicinities as follows.
Definition 8.3
We slightly modify Definition 8.1 as follows.
-
Type I’ vicinity: \(\,\,\,\,\,\displaystyle \mathbf {c}_1,\mathbf {c}_2,\mathbf {d}_1,\mathbf {d}_2\in \mathring{\varUpsilon }, \,\,\,\,\, \varvec{\tau }_1,\varvec{\tau }_2,\varvec{\upsilon }_1,\varvec{\upsilon }_2\in \mathring{\varUpsilon }_S\).
-
Type II vicinity: \(\,\,\,\,\,\displaystyle \mathbf {c}_1,\mathbf {c}_2,\mathring{\mathbf {x}}_1,\mathring{\mathbf {x}}_2\in \mathring{\varUpsilon }, \,\,\,\,\, \varvec{\tau }_1,\varvec{\tau }_2\in \mathring{\varUpsilon }_S,\,\,\,\,\, V_j\in \mathring{U}(2)\) for all \(j=2,\ldots , W\),
\(\qquad \text {where the } \mathring{\mathbf {x}}\text {-variables are defined in}\) (8.9) with \(\varkappa =+\).
-
Type III vicinity: \(\,\,\,\,\,\displaystyle \mathbf {c}_1,\mathbf {c}_2,\mathring{\mathbf {x}}_1,\mathring{\mathbf {x}}_2\in \mathring{\varUpsilon }, \,\,\,\,\, \varvec{\tau }_1,\varvec{\tau }_2\in \mathring{\varUpsilon }_S,\,\,\,\,\, V_j\in \mathring{U}(2)\) for all \(j=2,\ldots , W\),
\(\qquad \text {where the } \mathring{\mathbf {x}}\text {-variables are defined in}\) (8.9) with \(\varkappa =-\).
Now, recall the remainder terms \(R^b, R^x_{\pm }, R^x_+\) and \(R^x_-\) in Lemma 8.2, \(R^{t,b}\) and \(R^{v,x}_\pm \) in (8.25), \(R_{+}^{v,x}\) in (8.31) and \(R_{-}^{v,x}\) in (8.34). In light of (8.47), the bounds on these remainder terms are the same as those obtained in Sect. 8.1. For the convenience of the reader, we collect them as the following proposition.
Proposition 8.4
Under Assumptions 1.1 and 1.14, we have the following estimates in the vicinities.
Proof
Note that (i) can be obtained from (8.28), (ii) is implied by (8.32) and (8.35), and (iii) follows from Lemma 8.2. This completes the proof. \(\square \)
Analogously, in the vicinities, \(||\mathring{\mathbf {b}}_1||_{2}^2, ||\mathring{\mathbf {b}}_2||_{2}^2, ||\mathring{\mathbf {x}}_1||_2^2, ||\mathring{\mathbf {x}}_2||_2^2, ||\mathring{\mathbf {t}}||_\infty \) and \(||\mathring{\mathbf {v}}||_\infty \) are still bounded by \(\varTheta \).
9 Integral over the Type I vicinities
With (8.45), we estimate the integral over the Type I vicinity in this section. At first, in the Type I’ vicinity, we have \(||\mathring{\mathbf {x}}_a||_\infty =O(\varTheta ^{\frac{1}{2}})\) and \(||\mathring{\mathbf {b}}_a||_\infty =O(\varTheta ^{\frac{1}{2}})\) for \(a=1,2\). Consequently, according to the parametrization in (8.2), we can get
Hence, what remains is to estimate the function \(\mathsf {A}(\hat{X}, \hat{B}, V, T)\). We have the following lemma.
Lemma 9.1
Suppose that the assumptions in Theorem 1.15 hold. In the Type I’ vicinity, for any given positive integer n, there is \(N_0=N_0(n)\), such that for all \(N\ge N_0\) we have
for some positive constant \(C_0\) and some integer \(\ell =O(1)\), both of which are independent of n.
With (8.45), (9.1) and Lemma 9.1, we can prove Lemma 5.8.
Proof of Lemma 5.8
Using (8.45), (9.1), Lemma 9.1, Proposition 8.4 with (5.35), the fact \(\det \mathbb {A}_+=\overline{\det \mathbb {A}_-}\) and the trivial estimate \(M\varTheta ^2W^{C_0}/{(N\eta )^{\ell }}\le N^{C_0}\) for sufficiently large constant \(C_0\), we have
Then, by an elementary Gaussian integral, we obtain (5.37). This completes the proof of Lemma 5.8. \(\square \)
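For the reader's convenience, the elementary Gaussian integral invoked here is the standard multivariate formula: for a complex symmetric matrix \(\mathbb {A}\in \mathbb {C}^{W\times W}\) with positive definite real part,
$$\begin{aligned} \int _{\mathbb {R}^W}\exp \Big \{-\frac{1}{2}\mathbf {c}'\mathbb {A}\mathbf {c}\Big \}\,\mathrm {d}\mathbf {c}=\frac{(2\pi )^{W/2}}{\sqrt{\det \mathbb {A}}}, \end{aligned}$$
where the square root is taken on the branch obtained by continuous deformation from a real positive definite matrix. This is consistent with the factors \(\det \mathbb {A}_\pm \) appearing in the computation above.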
The remaining part of this section will be dedicated to the proof of Lemma 9.1. Recall the definitions of the functions \(\mathsf {A}(\cdot ), \mathsf {Q}(\cdot ), \mathsf {P}(\cdot )\) and \(\mathsf {F}(\cdot )\) in (3.32), (4.1), (4.2) and (4.3). Using the strategy in Sect. 6 again, we ignore the irrelevant factor \(\mathsf {Q}(\cdot )\) at the beginning. Hence, we bound \(\mathsf {P}(\cdot )\) and \(\mathsf {F}(\cdot )\) first, and modify the bounding procedure slightly to take \(\mathsf {Q}(\cdot )\) into account at the end, resulting in a proof of Lemma 9.1.
9.1 \(\mathsf {P}(\hat{X}, \hat{B}, V, T)\) in the Type I’ vicinity
Our aim, in this section, is to prove the following lemma.
Lemma 9.2
Suppose that the assumptions in Theorem 1.15 hold. In the Type I’ vicinity, we have
Before commencing the formal proof, we introduce more notation below. At first, we introduce
where the bound holds in the Type I’ vicinity.
Recall (6.12), with \(\varvec{\varpi }_j\) defined in (6.8) and \(\hat{\varvec{\varpi }}_j\) in (6.10). Now, we write
where
The second step of (9.4) follows from the Taylor expansion of the logarithmic function. Now, we expand the first factor of (9.4) around the Type I’ saddle point, namely
We take (9.6) as the definition of \(\varDelta _j\), which is of the form
for some function \(\mathring{\mathfrak {p}}_{j,\alpha ,\beta }\) of \(\mathring{\mathbf {x}}, \mathring{\mathbf {b}}, \mathring{\mathbf {v}}\) and \(\mathring{\mathbf {t}}\)-variables, satisfying
One can check (9.7) easily by using (8.9)–(8.11). Analogously, we can also write
where
The bound on \(\mathring{\mathfrak {p}}_{\ell ,j,\varvec{\alpha },\varvec{\beta }}\) in (9.9) follows from the fact that all the \(V_j, \hat{X}_j^{-1}, T_j, T_j^{-1}\) and \(\hat{B}_j^{-1}\)-entries are bounded in the Type I’ vicinity, uniformly in j. Consequently, we can write for \(j\ne p,q\)
In a similar manner, we can also write for \(k=p,q\),
where \(\hat{\mathfrak {p}}_0(\cdot )=\det \hat{X}_k/\det \hat{B}_k\), which is introduced in (6.10), and \(\mathring{\mathfrak {q}}_{\ell ,j,\varvec{\alpha },\varvec{\beta }}\) is some function of \(\hat{X}, \hat{B}, V\) and T-variables, satisfying the bound
Obviously, we have \(\hat{\mathfrak {p}}_0(\cdot )=O(1)\) in the Type I’ vicinity.
Now, in order to distinguish \(\ell ,\varvec{\alpha } \) and \(\varvec{\beta }\) for different j, we index them as \(\ell _j, \varvec{\alpha }_j\) and \(\varvec{\beta }_j\), where
In addition, we define
Let \(||\mathbf {\ell }||_1=\sum _{j=1}^W\ell _j\) be the 1-norm of \(\mathbf {\ell }\). Note that \({\varvec{\alpha }}\) and \({\varvec{\beta }}\) are \(||\mathbf {\ell }||_1\)-dimensional. With these notations, using (6.12), (9.4), (9.6), (9.10) and (9.11) we have the representation
where we made the convention
According to (9.12) and (9.14), we have
In addition, we can decompose the sum
It is easy to see
Moreover, it is obvious that
Therefore, it suffices to investigate the integral
for each combination \((\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta }})\), and then sum over \((\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta }})\) to obtain the estimate of \(\mathsf {P}(\hat{X},\hat{B},V,T)\). Specifically, we have the following lemma.
Lemma 9.3
With the notation above, we have
Moreover, we have
We postpone the proof of Lemma 9.3 and prove Lemma 9.2 first.
Proof of Lemma 9.2
By (4.2), (9.13) and (9.19) and the fact that \(\hat{\mathfrak {p}}_0(\cdot )=O(1)\), we have
Substituting the bounds (9.15), (9.18) and (9.20) into (9.21) yields
Now, from (9.3) we have \(\prod _{j=1}^W (1+\mathring{\kappa }_j)^{\ell _j}\le \varTheta ^{||\mathbf {\ell }||_1}\), which can absorb the irrelevant factor \(e^{O(||\mathbf {\ell }||_1)}\). Using (9.16), (9.17), we have
where the last step follows from (5.30) and (5.35). Now, substituting (9.23) into (9.22), we can complete the proof of Lemma 9.2. \(\square \)
Hence, what remains is to prove Lemma 9.3. We will need the following technical lemma whose proof is postponed.
Lemma 9.4
For any index sets \(\mathsf {I},\mathsf {J}\subset \{ 1,\ldots , W\}\) with \(|\mathsf {I}|=|\mathsf {J}|=\mathfrak {m}\ge 1\), we have the following bounds for the determinants of the submatrices of \(S, \mathbb {A}_+\) and \(\mathbb {A}_-\) defined in (8.14).
-
For \((\mathbb {A}_+)^{(\mathsf {I}|\mathsf {J})}\) and \((\mathbb {A}_-)^{(\mathsf {I}|\mathsf {J})}\), we have
$$\begin{aligned} {|\det (\mathbb {A}_+)^{(\mathsf {I}|\mathsf {J})}|}/{|\det \mathbb {A}_+|}\le 1,\quad {|\det (\mathbb {A}_-)^{(\mathsf {I}|\mathsf {J})}|}/{|\det \mathbb {A}_-|}\le 1. \end{aligned}$$(9.24) -
For \(S^{(\mathsf {I}|\mathsf {J})}\), we have
$$\begin{aligned} {|\det S^{(\mathsf {I}|\mathsf {J})}|}/{|\det S^{(1)}|}\le (\mathfrak {m}-1)!(2W^\gamma )^{(\mathfrak {m}-1)}. \end{aligned}$$(9.25)
Proof of Lemma 9.3
Recall the definition in (6.16). Furthermore, we introduce the matrix
Using the fact \(a_+a_-=-1\), we can write
Now, by the Gaussian integral of the Grassmann variables in (3.2), we see that
for some index sets \(\mathsf {I}, \mathsf {J}\subset \{1,\ldots , 4W\}\) determined by \({\varvec{\alpha }}\) and \({\varvec{\beta }}\) such that
Here we mention that (9.27) may fail when at least two components of \(\varvec{\alpha }_j\) coincide for some j. But in this case \(\mathfrak {P}_{\mathbf {\ell },{\varvec{\alpha }},{\varvec{\beta } }}=0\), since \(\chi ^2=0\) for any Grassmann variable \(\chi \).
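The Grassmann Gaussian integration formulas behind these determinant identities are standard; we state them schematically (the precise measure and ordering conventions are those of (3.2), and the sign depends on the ordering of the factors). For Grassmann variables \(\chi _1,\ldots ,\chi _n,\bar{\chi }_1,\ldots ,\bar{\chi }_n\) and any complex \(n\times n\) matrix A,
$$\begin{aligned} \int \prod _{j=1}^n\mathrm {d}\bar{\chi }_j\mathrm {d}\chi _j\, e^{-\bar{\chi }'A\chi }=\det A,\qquad \int \prod _{j=1}^n\mathrm {d}\bar{\chi }_j\mathrm {d}\chi _j\,\Big (\prod _{i\in \mathsf {I}}\chi _i\prod _{j\in \mathsf {J}}\bar{\chi }_j\Big )\, e^{-\bar{\chi }'A\chi }=\pm \det A^{(\mathsf {J}|\mathsf {I})}. \end{aligned}$$
In particular, inserting a monomial in the Grassmann variables removes the corresponding rows and columns from the determinant, which is the origin of the minors \(\det \mathbb {H}^{(\mathsf {I}|\mathsf {J})}\).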
Now, obviously, there exist index sets \(\mathsf {I}_\alpha , \mathsf {J}_\alpha \subset \{1,\ldots , W\}\) for \(\alpha =1,\ldots , 4\) such that
It suffices to consider the case \(|\mathsf {I}_\alpha |=|\mathsf {J}_\alpha |\) for all \(\alpha =1,2,3,4\). Otherwise, \(\det \mathbb {H}^{(\mathsf {I}|\mathsf {J})}\) is obviously 0, in light of the block structure of \(\mathbb {H}\), see the definition (9.26). Now, note that, since \(\det S=0\), we have
For more general \(\mathbf {\ell }\), by Lemma 9.4, we have
Then, by the fact \(|\det \mathbb {A}_+\mathbb {A}_-|=|\det \mathbb {A}_+|^2\), we can conclude the proof of Lemma 9.3. \(\square \)
To prove Lemma 9.4, we will need the following lemma.
Lemma 9.5
For the weighted Laplacian S, we have
Remark 9.6
A direct consequence of (9.28) is \(\det S^{(1)}=\cdots =\det S^{(W)}\).
Proof of Lemma 9.5
Without loss of generality, we assume \(j>i\) in the sequel. We introduce the matrices
It is not difficult to check
Then, by the fact \(\det P_{ij}E_j=(-1)^{j-i}\), we can get the conclusion. \(\square \)
Proof of Lemma 9.4
At first, by the definition in (8.14), (1.5) and the fact \(\mathsf {Re}a_+^2=\mathsf {Re}a_-^2>0\), it is easy to see that the singular values of \(\mathbb {A}_+\) and \(\mathbb {A}_-\) are all larger than 1. With the aid of the rectangular matrix \((\mathbb {A}_+)^{(\mathsf {I}|\emptyset )}\) as an intermediate matrix, we can use the Cauchy interlacing property twice to see that the kth largest singular value of \((\mathbb {A}_+)^{(\mathsf {I}|\mathsf {J})}\) is always smaller than the kth largest singular value of \(\mathbb {A}_+\). Consequently, we have the first inequality of (9.24). In the same manner, we can get the second inequality of (9.24).
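The interlacing step can be made explicit. Deleting a single row or column from a matrix shifts each ordered singular value by at most one position, i.e. \(\sigma _{k+1}(A)\le \sigma _k(B)\le \sigma _k(A)\) whenever B is obtained from A by removing one row or one column. Iterating this over the \(\mathfrak {m}\) deleted rows and the \(\mathfrak {m}\) deleted columns, and using that all singular values of \(\mathbb {A}_+\) are at least 1, we get
$$\begin{aligned} \frac{|\det (\mathbb {A}_+)^{(\mathsf {I}|\mathsf {J})}|}{|\det \mathbb {A}_+|}=\frac{\prod _{k=1}^{W-\mathfrak {m}}\sigma _k\big ((\mathbb {A}_+)^{(\mathsf {I}|\mathsf {J})}\big )}{\prod _{k=1}^{W}\sigma _k(\mathbb {A}_+)}\le \prod _{k=W-\mathfrak {m}+1}^{W}\sigma _k(\mathbb {A}_+)^{-1}\le 1. \end{aligned}$$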
Now, we prove (9.25). At first, we address the case that \(\mathsf {I}\cap \mathsf {J}\ne \emptyset \). In light of Remark 9.6, without loss of generality, we assume that \(1\in \mathsf {I}\cap \mathsf {J}\). Then \(S^{(\mathsf {I}|\mathsf {J})}\) is a submatrix of \(S^{(1)}\). Therefore, we can find two permutation matrices P and Q, such that
where \(\mathrm {D}=S^{(\mathsf {I}|\mathsf {J})}\). Now, by the Schur complement formula, we know that
Moreover, \((\mathrm {A}-\mathrm {B}\mathrm {D}^{-1}\mathrm {C})^{-1}\) is the \((|\mathsf {I}|-1)\) by \((|\mathsf {I}|-1)\) upper-left corner of \((PS^{(1)}Q)^{-1}=Q^{-1}(S^{(1)})^{-1}P^{-1}\). That means \(\det S^{(\mathsf {I}|\mathsf {J})}/\det S^{(1)}\) is, up to a sign, the determinant of a submatrix of \((S^{(1)})^{-1}\) (of dimension \(|\mathsf {I}|-1\)). Then, by Assumption 1.1 (iii), we can easily get
Now, for the case \(\mathsf {I}\cap \mathsf {J}=\emptyset \), we can fix one \(i\in \mathsf {I}\) and \(j\in \mathsf {J}\). Due to (9.28), it suffices to consider
By a similar discussion, one can see that (9.30) is the determinant of a submatrix of \((S^{(i|j)})^{-1}\) of dimension \(|\mathsf {I}|-1\). Hence, it suffices to bound the entries of \((S^{(i|j)})^{-1}\). From (9.29) we have
Then, it is elementary to see that the entries of \((S^{(i|j)})^{-1}\) are bounded by \(2W^\gamma \), in light of Assumption 1.1 (iii). Consequently, we have
which implies (9.25). This completes the proof of Lemma 9.4. \(\square \)
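For reference, the Schur complement identity used in the proof reads: if \(\mathrm {D}\) is invertible, then
$$\begin{aligned} \det \begin{pmatrix} \mathrm {A} &{} \mathrm {B}\\ \mathrm {C} &{} \mathrm {D} \end{pmatrix}=\det \mathrm {D}\cdot \det \big (\mathrm {A}-\mathrm {B}\mathrm {D}^{-1}\mathrm {C}\big ), \end{aligned}$$
and the upper-left block of the inverse of this block matrix is \((\mathrm {A}-\mathrm {B}\mathrm {D}^{-1}\mathrm {C})^{-1}\). Both facts were applied above to \(PS^{(1)}Q\) with \(\mathrm {D}=S^{(\mathsf {I}|\mathsf {J})}\).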
9.2 \(\mathsf {F}(\hat{X},\hat{B}, V, T)\) in the Type I’ vicinity
Neglecting the \(X^{[1]}, \mathbf {y}^{[1]}\) and \(\mathbf {w}^{[1]}\)-variables in \(\mathsf {Q}(\cdot )\) at first, we investigate the integral \(\mathsf {F}(\hat{X},\hat{B}, V, T)\) in the Type I’ vicinity in this section. We have the following lemma.
Lemma 9.7
Suppose that the assumptions in Theorem 1.15 hold. In the Type I’ vicinity, we have
Recalling the functions \(\mathbb {G}(\hat{B},T)\) and \(\mathbb {F}(\hat{X},V)\) defined in (6.20) and (6.21), we further introduce
Then, we have the decomposition
Hence, we can estimate \(\mathring{\mathbb {F}}(\hat{X},V)\) and \(\mathring{\mathbb {G}}(\hat{B},T)\) separately in the sequel.
9.2.1 Estimate of \(\mathring{\mathbb {F}}(\hat{X},V)\)
Lemma 9.8
Suppose that the assumptions in Theorem 1.15 hold. In the Type I’ vicinity, we have
Proof
Using (8.9) and (8.10), we can write
where the remainder term represents a \(2\times 2\) matrix whose max-norm is bounded by \(\varTheta /\sqrt{M}\). Using (9.36) and recalling \(N=MW\) yields
Substituting (9.37) into (3.20) and (6.21), we can write
Recalling the parametrization of \(P_1\) in (3.25), we have
Consequently, we have
Since the \(X^{[1]}\)-variables are all bounded and \(|\det X_k^{[1]}|=1\) for \(k=p,q\), it is easy to see that
This completes the proof. \(\square \)
9.2.2 Estimate of \(\mathring{\mathbb {G}}(\hat{B},T)\)
Recall the definition of \(\mathring{\mathbb {G}}(\hat{B},T)\) from (9.33), (6.20) and (3.21). In this section, we will prove the following lemma.
Lemma 9.9
Suppose that the assumptions in Theorem 1.15 hold. In the Type I’ vicinity, we have
Note that \(y_p^{[1]}, y_q^{[1]}\) and t in the parametrization of \(Q_1\) (see (3.25)) are not bounded. We shall therefore first truncate them at appropriate thresholds, whereby we can neglect some irrelevant terms in the integrand and simplify the integral. More specifically, we will perform the truncations
Accordingly, we set
where we have used the parameterization of \(\mathbf {w}^{[1]}\) in (3.15). We will prove the following lemma.
Lemma 9.10
Suppose that the assumptions in Theorem 1.15 hold. In the Type I’ vicinity, we have
for some positive constant \(\varepsilon \).
Proof
At first, by (6.26)–(6.29), we have for any j,
for some positive constant c, where the last step follows from the facts that \(\mathsf {Re}b_{j,1}, \mathsf {Re}b_{j,2}=\mathsf {Re}a_++o(1)\) and \(t_j=o(1)\) in the Type I’ vicinity. In addition, it is not difficult to get
which implies that
Note that the second and third factors in the definition of \(g(\cdot )\) in (3.21) can be bounded by 1, according to (6.23). Then, as a consequence of (9.41) and (9.42), we have
for some positive constants C, c and \(c'\). By integrating \(y_p^{[1]}\) and \(y_q^{[1]}\) out first, we can easily see that the first truncation in (9.39) only produces an error of order \(O(\exp \{-N^{\varepsilon }\})\) in the integral \(\mathring{\mathbb {G}}(\hat{B},T)\), for some positive constant \(\varepsilon =\varepsilon (\varepsilon _2)\), by the assumption \(\eta \ge N^{-1+\varepsilon _2}\) in (1.16). Then one can substitute the first bound in (9.39) into the last factor on the r.h.s. of (9.43), thus
We can also perform the second truncation in (9.39) in the integral \(\mathring{\mathbb {G}}(\hat{B},T)\), up to an error of order \(O(\exp \{-N^{\varepsilon }\})\), for some positive constant \(\varepsilon \). This completes the proof of Lemma 9.10. \(\square \)
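The truncation errors above are instances of the elementary Gaussian tail bound: for any \(c>0\) and threshold \(T>0\),
$$\begin{aligned} \int _{|y|\ge T}e^{-cy^2}\,\mathrm {d}y\le \frac{1}{cT}\,e^{-cT^2}. \end{aligned}$$
Thus any truncation threshold growing like a small power of N, combined with the Gaussian decay in (9.43), yields an error of order \(O(\exp \{-N^{\varepsilon }\})\); the identification of the decay rate c with the precise weight in (9.43) is schematic here.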
With the aid of Lemma 9.10, it suffices to work on \(\widehat{\mathbb {G}}(\hat{B},T)\) in the sequel. We have the following lemma.
Lemma 9.11
We have
Proof of Lemma 9.11
Recall the parameterization of \(\mathbf {w}^{[1]}_k\) in (3.15) again. To simplify the notation, we set
Similarly to (9.36), using \(t=o(1)\) from (9.39), we have the expansion
Consequently, we have
In addition, for \(k=p,q\), using the fact \(\sum _{j}\tilde{\mathfrak {s}}_{jk}=1\), we have
where \(R_k\) is a \(2\times 2\) matrix independent of \(Y_k^{[1]}\), satisfying \(||R_k||_{\max }=O(1)\).
Observe that the term in (9.44) is obviously independent of \(\mathbf {w}^{[1]}\)-variables. In addition, for \(k=p\) or q, we have
and for \(k,\ell =p\) or q, we have
Moreover, we have
Substituting (9.44), (9.45) and (9.46)–(9.48) into the definition of \(g(\cdot )\) in (3.21) and reordering the factors properly, we can write the integrand in (9.40) as
where the last factor is independent of the \(\mathbf {w}^{[1]}\)-variables. Here, we put the factors containing \(\sigma _p^{[1]}\) and \(\sigma _q^{[1]}\) together, namely, the first two lines on the r.h.s. of (9.49). For further discussion, we write for \(k=p,q\)
where \(\mathfrak {r}_{k}^+ , \mathfrak {r}_k^-\) and \(\mathfrak {r}_k\) are all polynomials of \({u}_k^{[1]}\) and \({v}_k^{[1]}\), with bounded degree and bounded coefficients, in light of \(||R_k||_{\max }=O(1)\), the definition of \(Y_k^{[1]}\) in (3.14) and the parametrization in (3.15).
Now, we start to estimate the integral (9.40) by using (9.49). We deal with the integral over \(\sigma _p^{[1]}\) and \(\sigma _q^{[1]}\) at first. These variables are collected in the integral of the form
with integers \(\ell _1\) and \(\ell _2\) independent of n. Note that according to (9.49), it suffices to consider \(\mathcal {I}_\sigma (0,0)\) for the proof of (9.38). We study \(\mathcal {I}_\sigma (\ell _1,\ell _2)\) for general \(\ell _1\) and \(\ell _2\) here, which will be used later.
Now, we set
In addition, we introduce
Obviously, when (9.39) is satisfied, we have
With the aid of the notation defined in (9.50) and (9.51), we can write
We have the following lemma.
Lemma 9.12
Under the truncation (9.39), we have
for some positive constant C.
Proof
At first, by Taylor expansion, we have
Now, for any \(m_1,m_2\in \mathbb {Z}\), we denote
Setting
and using (9.55), we can rewrite (9.54) as
For simplicity, we employ the notation
Consequently, by (9.58) and (9.56) we obtain
for some positive constant C, where in the last step we used the fact \(|c_{k,1}|<1, |c_{k,2}|< 1\), which can be seen directly from the definition in (9.53), the truncations in (9.39) and the assumption \(\eta \le M^{-1}N^{\varepsilon _2}\). Analogously, we also have \(|{c_{p,q}}/M|<1\). According to the definitions (9.57) and (9.59), we have
Hence, by using \(|c_{k,1}|<1, |c_{k,2}|< 1\) and \(|{c_{p,q}}/{M}|<1\), we have the trivial bound
Therefore, the proof is completed by using (9.53). \(\square \)
Now, we return to the proof of Lemma 9.11. Applying (9.49) and Lemma 9.12 with \(\ell _1=\ell _2=0\) to (9.40), and integrating out the bounded variables \({v}_p^{[1]}, {v}_q^{[1]}\) and \(\sigma \), we can get
where the last two factors come from the facts
In (9.61) we used the fact \(({u}_k^{[1]})^2+({v}_k^{[1]})^2=1\). Now, we integrate \(y_p^{[1]}\) and \(y_q^{[1]}\) out. Consequently, by the definition in (9.52), we have
where in the last step we have used the assumption \(\eta \le M^{-1}N^{\varepsilon _2}\) in (1.16), Assumption 1.14, the definition of \(\varTheta \) in (5.30) and the fact \(N=MW\). This completes the proof of Lemma 9.11. \(\square \)
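The \(y\)-integrals above are controlled by the elementary Gaussian moment formula: for \(c>0\) and any nonnegative integer k,
$$\begin{aligned} \int _{\mathbb {R}}|y|^k e^{-cy^2}\,\mathrm {d}y=\Gamma \Big (\frac{k+1}{2}\Big )\,c^{-\frac{k+1}{2}}, \end{aligned}$$
so each power of \(y_p^{[1]}\) or \(y_q^{[1]}\) in the integrand costs at most a factor of order \(c^{-1/2}\). This is the mechanism producing the negative powers of \(N\eta \); the identification of c with a quantity of order \(N\eta \) is schematic, the precise Gaussian weight being the one inherited from (9.43).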
Finally, we can prove Lemma 9.9, and further prove Lemma 9.7.
Proof of Lemma 9.9
This is a direct consequence of Lemmas 9.10 and 9.11. \(\square \)
Proof of Lemma 9.7
This is a direct consequence of (9.34), Lemma 9.8 and Lemma 9.9. \(\square \)
9.3 Summing up: Proof of Lemma 9.1
In this section, we slightly modify the discussions in Sects. 9.1 and 9.2 to prove Lemma 9.1. The combination of Lemmas 9.2 and 9.7 would directly imply Lemma 9.1 if the \(\mathsf {Q}(\cdot )\) factor were not present in the definition of \(\mathsf {A}(\cdot )\). Now we should take \(\mathsf {Q}(\cdot )\) into account. This argument is similar to the corresponding discussion in Sect. 6.4.
Proof of Lemma 9.1
At first, we observe that \(\kappa _1, \kappa _2\) and \(\kappa _3\) in (6.5) are obviously independent of n. Then, by the fact \(\kappa _1=W^{O(1)}\), it suffices to consider one monomial of the form
where the degrees of \(\mathfrak {p}_1(\cdot ), \mathfrak {p}_2(\cdot )\) and \(\mathfrak {q}(\cdot )\) are all O(1) and independent of n, in light of the fact \(\kappa _3=O(1)\) in (6.5). In particular, the orders of \((y^{[1]}_p)^{-1}\) and \((y^{[1]}_q)^{-1}\) are not larger than 2, which can easily be seen from the definition of \(\mathcal {Q}(\cdot )\) in (3.19).
Now, we reuse the notation \(\mathsf {P}_{\mathfrak {q}}(\hat{X}, \hat{B}, V, T)\) and \(\mathsf {F}_{\mathfrak {p}}(\hat{X}, \hat{B}, V, T)\) in (6.36), by redefining them as
It is easy to check \(\mathcal {P}(\cdot )\mathfrak {q}(\cdot )\) also has an expansion of the form in (9.13). Hence, the bound in (9.2) holds for \(\mathsf {P}_{\mathfrak {q}}(\cdot )\) as well. For \(\mathsf {F}_{\mathfrak {p}}(\cdot )\), the main modification is to use Lemma 9.12 with general \(\ell _1\) and \(\ell _2\) independent of n, owing to the function \(\mathfrak {p}_2(\cdot )\). In addition, by the truncations in (9.39), we can bound \(\mathfrak {p}_1(\cdot )\) by some constant C. Hence, it suffices to replace n by \(n+\ell _3\) in the proof of Lemma 9.11. Finally, we can get
with some finite integer \(\ell _3\) independent of n. This completes the proof of Lemma 9.1. \(\square \)
10 Integral over the Type II and III vicinities
In this section, we prove Lemma 5.9. We only present the discussion for \(\mathcal {I}(\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_+, \varUpsilon ^x_+, \varUpsilon _S,\mathbb {I}^{W-1})\), i.e. the integral over the Type II vicinity; the discussion of \(\mathcal {I}(\varUpsilon ^b_+, \varUpsilon ^b_-, \varUpsilon ^x_-, \varUpsilon ^x_-, \varUpsilon _S,\mathbb {I}^{W-1})\) is analogous. We start from (8.46). Similarly, we shall provide an estimate for the integrand. At first, under the parameterization (8.2) with \(\varkappa =+\), we see that
Then, what remains is to estimate \(\mathsf {A}(\hat{X}, \hat{B}, V, T)\). Our aim is to prove the following lemma.
Lemma 10.1
Suppose that the assumptions in Theorem 1.15 hold. In the Type II vicinity, we have
for some positive constant c.
With the aid of (10.1) and Lemma 10.1, we can prove Lemma 5.9.
Proof of Lemma 5.9
Recall (8.46). First, by the definition of \(\mathbb {A}_+^v\) in (8.33), by (5.24), and by the fact that \(\mathsf {Re}\,a_+^2>0\), we can see that
for all \(\{V_j\}_{j=2}^W\in (\mathring{U}(2))^{W-1}\). Substituting (5.21), (10.1), (10.2), (10.3) and the estimates in Proposition 8.4 into (8.46) yields
where we absorbed several factors into \(\exp \{-cN\eta \}\) and enlarged the integration domains to the full ones. Then, using the trivial facts
and performing the Gaussian integral for the remaining variables, we can get
Observe that
Moreover, by Assumption 1.1 (ii), we see that \(|\mathfrak {s}_{ii}|\le (1-c_0)/2\) for some small positive constant \(c_0\). Consequently, since \(S^{(1)}\) is negative definite, we have
by Hadamard’s inequality. Substituting (10.5) and (10.6) into (10.4) yields
for some positive constant \(\delta \). This proves the first part of Lemma 5.9; the second part can be proved analogously. \(\square \)
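For the reader's convenience, the Hadamard step above can be spelled out. The following is a sketch, under the assumption (suggested by the surrounding bounds, not restated here) that \(S^{(1)}\) is a \((W-1)\times (W-1)\) matrix with diagonal entries \(\mathfrak {s}_{ii}\):

```latex
% Hadamard's inequality: for a positive definite matrix $A=(a_{ij})$ of size $m$,
%   $\det A \le \prod_{i=1}^{m} a_{ii}$.
% Since $S^{(1)}$ is negative definite, $-S^{(1)}$ is positive definite, whence
|\det S^{(1)}| \;=\; \det\bigl(-S^{(1)}\bigr)
  \;\le\; \prod_{i} \bigl(-\mathfrak{s}_{ii}\bigr)
  \;\le\; \Bigl(\tfrac{1-c_0}{2}\Bigr)^{W-1},
% using $|\mathfrak{s}_{ii}| \le (1-c_0)/2$ from Assumption 1.1 (ii).
```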
In the sequel, we prove Lemma 10.1. As before, we exclude the factor \(\mathsf {Q}(\cdot )\) from the discussion at first.
10.1 \(\mathsf {P}(\hat{X}, \hat{B}, V, T)\) in the Type II vicinity
Lemma 10.2
Suppose that the assumptions in Theorem 1.15 hold. In the Type II vicinity, we have
Proof
We will follow the strategy in Sect. 9.1. We regard all V-variables as fixed parameters. Now, we define the function
Then, we recall the representation (9.4) and the definition of \(\varDelta _{\ell ,j}\) in (9.5). We still adopt the representation (9.8). It is easy to see that the bound (9.9) for \(\mathring{\mathfrak {p}}_{\ell ,j,\varvec{\alpha },\varvec{\beta }}\) also holds in the Type II vicinity. The main difference is the first factor on the r.h.s. of (9.4), which we expand around the saddle point as
We take the formula above as the definition of \(\widehat{\varDelta }_j\), which is of the form
where \(\hat{p}_{j,\alpha ,\beta }\) is a function of \(\hat{X}, \hat{B}, V\) and T-variables, satisfying
Let \(\widehat{\mathbb {H}}=(a_+^{-2}\mathbb {A}_+)\oplus S\oplus (a_+^{-2}\mathbb {A}_+)\oplus S\). Recalling the notation in (6.16), we can write
Now, replacing \(\varDelta _{1,j}\) by \(\widehat{\varDelta }_{1,j}\), \(\mathring{\kappa }_j\) by \(\mathring{\iota }_j\), and \(\mathbb {H}\) by \(\widehat{\mathbb {H}}\) in the proof of Lemma 9.2, we can carry out the proof of Lemma 10.2 in the same way. We leave the details to the reader. \(\square \)
10.2 \(\mathsf {F}(\hat{X}, \hat{B}, V, T)\) in the Type II vicinity
Lemma 10.3
Suppose that the assumptions in Theorem 1.15 hold. In the Type II vicinity, we have
Proof
Recall the decomposition (9.34). Lemma 9.9 is still applicable, so it suffices to estimate \(\mathring{\mathbb {F}}(\hat{X},V)\). In the Type II vicinity, it is easy to see that
Consequently, by the assumption on \(\eta \), we have
From (3.20) we can also see that all the other factors of \(f(P_1, V, \hat{X}, X^{[1]})\) are O(1). Hence, by the definition (9.33), we have \(\mathring{\mathbb {F}}(\hat{X},V)=O(\exp \{-(a_+-a_-)N\eta \})\), which together with Lemma 9.9 yields the conclusion. \(\square \)
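The factor \(\exp \{-(a_+-a_-)N\eta \}\) obtained here beats any polynomial prefactor. A minimal sketch, assuming \(a_+-a_->0\) is an order-one constant and \(N\eta \ge N^{\varepsilon }\) for some fixed \(\varepsilon >0\) (an assumption consistent with the regime \(\eta \gg N^{-1}\)):

```latex
% With $c := a_+ - a_- > 0$ of order one and $N\eta \ge N^{\varepsilon}$,
e^{-(a_+-a_-)N\eta} \;\le\; e^{-cN^{\varepsilon}}
  \;=\; O\bigl(N^{-D}\bigr) \quad \text{for any fixed } D>0,
% so this factor absorbs any polynomial in $N$ arising elsewhere.
```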
10.3 Summing up: Proof of Lemma 10.1
Analogously to the Type I case, we slightly modify the proofs of Lemmas 10.2 and 10.3 in order to take \(\mathsf {Q}(\cdot )\) into account. The proof can then be performed in the same manner as that of Lemma 9.1. We omit the details.
11 Proof of Theorem 1.15
The conclusion for Case 1 is a direct consequence of the discussions in Sects. 3.5–10. More precisely, by using Lemmas 5.1, 5.6, 5.8 and 5.9, we can get (1.21) immediately.
The proofs of Case 2 and Case 3 can be performed analogously, with the slight modifications stated below. In Case 2, we modify the discussion in Sects. 3.5–10 for Case 1 according to the decomposition of supermatrices in (3.9). First, in (3.12) and (3.13), for \(A=\breve{\mathcal {S}}, \breve{{X}}, \breve{{Y}}, \breve{{\varOmega }}\) or \(\breve{{\varXi }}\), we replace \(A_p^{\langle 1\rangle }\) and \(A_q^{\langle 1\rangle }\) by \(A_p^{\langle 1,2\rangle }\) and \(A_q\) respectively, and replace \(A_q^{[1]}\) by \(A_p^{[2]}\). In addition, in the last three lines of (3.13), we also replace \(\tilde{s}_{jq}\) by \(\tilde{s}_{jp}\), and replace \(\tilde{s}_{pq}\) and \(\tilde{s}_{qp}\) by \(\tilde{s}_{pp}\); in the first line, we replace \(\bar{\phi }_{1,q,1}\phi _{1,p,1}\bar{\phi }_{2,p,1}\phi _{2,q,1}\) by \(\bar{\phi }_{1,p,2}\phi _{1,p,1}\bar{\phi }_{2,p,1}\phi _{2,p,2}\). Then, in (3.14) and (3.15), for \(A=X, Y, \varOmega , \varXi , \varvec{\omega }, \varvec{\xi }, \mathbf {w}, y, \tilde{u}, \tilde{v}\) or \(\sigma \), we replace \(A_q^{[1]}\) by \(A_p^{[2]}\). With these modifications, it is easy to check that the proof in Sects. 3.5–10 applies to Case 2 as well. The main point is that we can still gain the factor \(1/(N\eta )^{n+1}\) from the integral of \(g(\cdot )\) defined in (3.21) (with \(y_q^{[1]}\) and \(\mathbf {w}_q^{[1]}\) replaced by \(y_p^{[2]}\) and \(\mathbf {w}_p^{[2]}\)). Heuristically, we can go back to (4.4) and replace \(\sigma _q^{[1]}\) by \(\sigma _p^{[2]}\) therein; it is then quite clear that the same estimate holds. Consequently, Lemmas 5.1, 5.6, 5.8 and 5.9 still hold under the replacement of variables described above. Hence, (1.21) holds in Case 2.
In Case 3, we can also mimic the discussion for Case 1 with slight modifications. We again start from (3.12) and (3.13). For \(A=\breve{\mathcal {S}}, \breve{{X}}, \breve{{Y}}, \breve{{\varOmega }}, \breve{{\varXi }}, \varvec{\omega }\) and \(\varvec{\xi }\), we replace \(A_q^{\langle 1\rangle }\) by \(A_q\), and replace \(A_q^{[1]}\) by 0. In addition, in the first line of (3.13), we replace \(\bar{\phi }_{1,q,1}\phi _{1,p,1}\bar{\phi }_{2,p,1}\phi _{2,q,1}\) by \(\bar{\phi }_{1,p,1}\phi _{1,p,1}\bar{\phi }_{2,p,1}\phi _{2,p,1}\). Consequently, after using the superbosonization formula, we get the factor \((y_p^{[1]}|(\mathbf {w}_p^{[1]}(\mathbf {w}_p^{[1]})^*)_{12}|)^{2n}\) instead of \((y_p^{[1]}y_q^{[1]}(\mathbf {w}^{[1]}_q(\mathbf {w}^{[1]}_q)^*)_{12}(\mathbf {w}^{[1]}_p(\mathbf {w}^{[1]}_p)^*)_{21})^n\) in (3.16). Then, for the superdeterminant terms
we only keep the factors with \(k=p\) and delete those with \(k=q\). Moreover, we also replace \(A_q^{[1]}\) by 0 for \(A=X, Y, \varOmega , \varXi , \varvec{\omega }, \varvec{\xi }, \mathbf {w}, y, \tilde{u}, \tilde{v}\) or \(\sigma \) in (3.16). In addition, \(\mathrm{d}A^{[1]}\) is redefined as the differential of the \(A_p^{[1]}\)-variables only, for \(A=X, \mathbf {y}, \mathbf {w}, \varvec{\omega }\) and \(\varvec{\xi }\). One can check step by step that such a modification does not require any essential change to our discussion for Case 1. In particular, note that our modification has nothing to do with the saddle point analysis of the Gaussian measure \(\exp \{-M(K(\hat{X},V)+L(\hat{B},T))\}\). Moreover, the term \(\mathcal {P}(\cdot )\) in (3.29) can be redefined by deleting the factor with \(k=q\) in its last term. Such a modification does not change our analysis of \(\mathcal {P}(\cdot )\). In addition, the irrelevant term \(\mathcal {Q}(\cdot )\) can be redefined accordingly: we delete the factor with \(k=q\) in the last term of (3.30) and replace \(A_q^{[1]}\) by 0 for \(A=\varOmega , \varXi , \varvec{\omega }, \varvec{\xi }, \mathbf {w}, y\). It is routine to check that Lemma 6.3 still holds under such a modification. Analogously, we can redefine the functions \(\mathcal {F}(\cdot ), f(\cdot )\) and \(g(\cdot )\) in (3.19)–(3.21). Now, the main difference between Case 3 and Cases 1 and 2 is that the factor \((y_p^{[1]}|(\mathbf {w}_p^{[1]}(\mathbf {w}_p^{[1]})^*)_{12}|)^{2n}\) no longer produces oscillation in the integral of \(g(\cdot )\). Heuristically, the counterpart of (4.4) in Case 3 reads
Hence, (1.21) holds in Case 3 as well. This completes the proof of Theorem 1.15. \(\square \)
12 Comment on the prefactor \(N^{C_0}\) in (1.21)
In the proof of (1.21), we used \(N^{C_0}\) to replace \(M\varTheta ^2W^{C_0}/(N\eta )^{\ell }\) (see the proof of Lemma 5.8). However, the latter bound is still artificial: it can be improved to some n-dependent constant \(C_n\) via a more delicate analysis of \(\mathsf {A}(\cdot )\), i.e. the integral of \(\mathcal {P}(\cdot )\mathcal {Q}(\cdot )\mathcal {F}(\cdot )\). Such an improvement stems from cancellations in the Gaussian integral. First, a finer analysis shows that the factor \(\mathcal {Q}(\cdot )\) can indeed be ignored, in the sense that it does not play any role in estimating the order of \(\mathbb {E}|G_{ij}(z)|^{2n}\). Hence, for simplicity, we focus on the product \(\mathsf {P}(\cdot )\mathsf {F}(\cdot )\) instead of \(\mathsf {A}(\cdot )\). Then, we go back to Lemmas 9.2 and 9.7. Recall the decomposition (9.34). A more careful analysis of \(\mathsf {F}(\cdot )\) leads to the following expansion, up to subleading order terms of the factors \(\mathring{\mathbb {G}}(\cdot )\) and \(\mathring{\mathbb {F}}(\cdot )\),
where the \(\mathsf {l}_j(\cdot )\)’s and \(\mathsf {l}'_j(\cdot )\)’s are linear combinations of the arguments. Analogously, one can write down the leading order term of \(\mathsf {P}(\cdot )\) explicitly in terms of \(\mathring{\mathbf {x}}, \mathring{\mathbf {b}}, \mathring{\mathbf {t}}\) and \(\mathring{\mathbf {v}}\). It can then be seen that the leading order term of \(\mathsf {P}(\cdot )\) is a linear combination of \(\mathring{x}_{j,1}\mathring{x}_{k,2}, \mathring{b}_{j,1}\mathring{b}_{k,2}, \mathring{x}_{j,\alpha }\mathring{b}_{k,\beta }, \upsilon _{j,\alpha }\tau _{k,\beta }\) for \(j,k=1,\ldots , W\) and \(\alpha ,\beta =1,2\), in which all coefficients are of order 1/M. Observe that the Gaussian integral in (8.45) kills the linear terms. Consequently, in the expansion (12.1), the first term that survives the Gaussian integral is
Replacing \(\mathsf {A}(\cdot )\) by the product of the leading order term of \(\mathsf {P}(\cdot )\) and (12.2) in the integral (8.45) and taking the Gaussian integral over \(\mathbf {c}, \mathbf {d}, \varvec{\tau }\) and \(\varvec{\upsilon }\)-variables yield the true order \(1/(N\eta )^n\), without additional N-dependent prefactors.
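The cancellation mechanism invoked above is the elementary fact that centered Gaussian integrals annihilate odd-degree terms. Schematically (a generic statement, not specific to the measure in (8.45)):

```latex
% For a centered Gaussian vector $g\sim N(0,\Sigma)$ on $\mathbb{R}^m$
% and any linear functional $\ell$,
\mathbb{E}\,\ell(g) \;=\; 0, \qquad \mathbb{E}\,[g_i g_j] \;=\; \Sigma_{ij},
% so after the Gaussian integral only even-degree terms of the expansion
% survive; pairing the order-$1/M$ quadratic leading term of $\mathsf{P}(\cdot)$
% with (12.2) then yields the order $1/(N\eta)^n$ without extra $N$-powers.
```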
Table of symbols
For the convenience of the reader, in the following table we collect some frequently used symbols followed by the locations where they are defined.
| Symbol | Defined in | Symbol | Defined in | Symbol | Defined in |
| --- | --- | --- | --- | --- | --- |
| \(a_+, a_-\) | (1.27) | \(\hat{X}_j,\hat{B}_j \) | (3.23) | \(\mathbb {S},\mathbb {S}^{v}\) | (5.23) |
| \(u,s,u_j,s_j\) | (3.25) | \(P_1,Q_1\) | (3.25) | \(\varTheta \) | (5.30) |
| \(\tau _{j,1},\tau _{j,2} \) | (8.20) | \(V_j, T_j\) | (3.24) | \(\mathbb {I}, \mathbb {L}, \varSigma , \varGamma \) | (1.30) |
| \(\upsilon _{j,1}, \upsilon _{j,2}\) | (8.23) | \(\mathbb {A}_+,\mathbb {A}_-\) | (8.14) | \(\varUpsilon ^b_{\pm },\varUpsilon ^x_{\pm }, \varUpsilon _S\) | (5.32) |
| \(\Bbbk (a)\) | (5.4) | \(\mathbb {A}_+^v,\mathbb {A}_-^v\) | (8.33) | \(\mathring{\varUpsilon }, \mathring{\varUpsilon }_S\) | (8.4) |
| \(D_{\pm }, D_{\mp }, D_+, D_-\) | (1.28) | \(\mathbb {H}\) | (9.26) | \(\varUpsilon _{\infty }\) | (8.39) |
References
Aizenman, M., Molchanov, S.: Localization at large disorder and at extreme energies: an elementary derivation. Commun. Math. Phys. 157, 245–278 (1993)
Anderson, P.: Absence of diffusion in certain random lattices. Phys. Rev. 109, 1492–1505 (1958)
Bunder, J.E., Efetov, K.B., Kravtsov, V.E., Yevtushenko, O.M., Zirnbauer, M.R.: Superbosonization formula and its application to random matrix theory. J. Stat. Phys. 129(5–6), 809–832 (2007)
Disertori, M., Pinson, H., Spencer, T.: Density of states for random band matrices. Commun. Math. Phys. 232(1), 83–124 (2002)
Disertori, M., Spencer, T.: Anderson localization for a supersymmetric sigma model. Commun. Math. Phys. 300(3), 659–671 (2010)
Disertori, M., Spencer, T., Zirnbauer, M.R.: Quasi-diffusion in a 3D supersymmetric hyperbolic sigma model. Commun. Math. Phys. 300(2), 435–486 (2010)
Efetov, K.: Supersymmetry in Disorder and Chaos. Cambridge University Press, Cambridge (1997)
Ellis, R.B.: Discrete Green’s Functions for Products of Regular Graphs (2003). arXiv:math/0309080
Erdős, L., Knowles, A., Yau, H.-T.: Averaging fluctuations in resolvents of random band matrices. Ann. Henri Poincaré 14(8), 1837–1926 (2013)
Erdős, L., Knowles, A., Yau, H.-T., Yin, J.: Spectral statistics of Erdős-Rényi graphs I: local semicircle law. Ann. Prob. 41(3B), 2279–2375 (2013)
Erdős, L., Knowles, A., Yau, H.-T., Yin, J.: Delocalization and diffusion profile for random band matrices. Commun. Math. Phys. 323, 367–416 (2013)
Erdős, L., Knowles, A., Yau, H.-T., Yin, J.: The local semicircle law for a general class of random matrices. Electron. J. Probab 18(59), 1–58 (2013)
Erdős, L., Schlein, B., Yau, H.-T.: Local semicircle law and complete delocalization for Wigner random matrices. Commun. Math. Phys. 287, 641–655 (2009)
Erdős, L., Yau, H.-T., Yin, J.: Bulk universality for generalized Wigner matrices. Probab. Theory Relat. Fields 154(1–2), 341–407 (2012)
Fröhlich, J., Spencer, T.: Absence of diffusion in the Anderson tight binding model for large disorder or low energy. Commun. Math. Phys. 88, 151–184 (1983)
Fyodorov, Y.V., Mirlin, A.D.: Scaling properties of localization in random band matrices: a \(\sigma \)-model approach. Phys. Rev. Lett. 67, 2405–2409 (1991)
Klein, A.: Extended states in the Anderson model on the Bethe lattice. Adv. Math. 133(1), 163–184 (1998)
Littelmann, P., Sommers, H.J., Zirnbauer, M.R.: Superbosonization of invariant random matrix ensembles. Commun. Math. Phys. 283(2), 343–395 (2008)
Shcherbina, T.: On the second mixed moment of the characteristic polynomials of 1D band matrices. Commun. Math. Phys. 328(1), 45–82 (2014)
Shcherbina, T.: Universality of the local regime for the block band matrices with a finite number of blocks. J. Stat. Phys. 155(3), 466–499 (2014)
Shcherbina, T.: Universality of the second mixed moment of the characteristic polynomials of the 1D band matrices: real symmetric case (2014). arXiv:1410.3084
Schäfer, L., Wegner, F.: Disordered system with n orbitals per site: Lagrange formulation, hyperbolic symmetry, and Goldstone modes. Z. Phys. B 38, 113–126 (1980)
Schenker, J.: Eigenvector localization for random band matrices with power law band width. Commun. Math. Phys. 290, 1065–1097 (2009)
Sodin, S.: An estimate for the average spectral measure of random band matrices. J. Stat. Phys. 144(1), 46–59 (2011)
Sodin, S.: The spectral edge of some random band matrices. Ann. Math. 172(3), 2223–2251 (2010)
Spencer, T.: SUSY Statistical Mechanics and Random Band Matrices, Quantum Many Body Systems. Springer, Berlin Heidelberg (2012)
Spencer, T., Zirnbauer, M.R.: Spontaneous symmetry breaking of a hyperbolic sigma model in three dimensions. Commun. Math. Phys. 252(1–3), 167–187 (2004)
Tao, T., Vu, V.: Random matrices: universality of local eigenvalue statistics. Acta Math. 206(1), 127–204 (2011)
Wang, W.M.: Mean field upper and lower bounds on Lyapunov exponents. Am. J. Math. 124, 851–878 (2002)
Wegner, F.J.: Disordered system with \(n\) orbitals per site: \(n\rightarrow \infty \) limit. Phys. Rev. B 19, 783–792 (1979)
Wigner, E.: Characteristic vectors of bordered matrices with infinite dimensions. Ann. Math. 62, 548–564 (1955)
Acknowledgments
Open access funding provided by Institute of Science and Technology (IST Austria). The authors are very grateful to the anonymous referees for careful reading and valuable comments, which helped to improve the organization.
Z. Bao was supported by ERC Advanced Grant RANMAT No. 338804; L. Erdős was partially supported by ERC Advanced Grant RANMAT No. 338804.
Bao, Z., Erdős, L. Delocalization for a class of random block band matrices. Probab. Theory Relat. Fields 167, 673–776 (2017). https://doi.org/10.1007/s00440-015-0692-y