A central limit theorem for products of random matrices and GOE statistics for the Anderson model on long boxes

We consider products of random matrices that are small, independent identically distributed perturbations of a fixed matrix $T_0$. Focusing on the eigenvalues of $T_0$ of a particular size, we obtain an SDE limit in a critical scaling. Previous results required $T_0$ to be a (conjugated) unitary matrix, so it could not have eigenvalues of different modulus. From the result we can also obtain a limit SDE for the Markov process given by the action of the random products on the flag manifold. Applying the result to random Schr\"odinger operators, we improve a result of Valk\'o and Vir\'ag showing GOE statistics for the rescaled eigenvalue process of a sequence of Anderson models on long boxes. In particular we solve a problem posed in their work.


1 Introduction and results
The goal of this paper is to study scaling limits of random matrix products $T_{\lambda,n} T_{\lambda,n-1} \cdots T_{\lambda,1}$ with $\lambda \to 0$, where the $T_{\lambda,n}$ are perturbations of a fixed $d \times d$ matrix of the form
$$T_{\lambda,n} = T_0 + \lambda V_{\lambda,n} + \lambda^2 W_\lambda. \qquad (1.1)$$
Matrix products of this kind are used in the study of quasi-one-dimensional random Schrödinger operators, where the large eigenvalues are related to so-called hyperbolic channels. Indeed, this case is the main motivating example, and we introduce it in Section 1.2. When $T_0$ is (a multiple of) a unitary matrix, results of this type have been established in that context [BR, VV1, VV2] (see also [SS2]), and the limiting process is described by a stochastic differential equation (SDE). In [KVV, VV1] the SDE limit was used to study the limiting eigenvalue statistics of such random Schrödinger operators in a critical scaling $\lambda^2 n = t$. We extend this result and obtain a limit for the rescaled eigenvalue process in the presence of hyperbolic channels as well (cf. Theorem 1.3).
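As a toy numerical illustration of the setup (1.1) and the critical scaling $\lambda^2 n = t$, the following sketch builds such a product; the concrete choice of $T_0$ (a rotation) and of i.i.d. Gaussian perturbations $V_{\lambda,n}$ is purely illustrative, and the lower-order term $\lambda^2 W_\lambda$ is omitted.

```python
import numpy as np

def random_product(T0, lam, n, rng):
    """Product T_{lam,n} ... T_{lam,1} with T_{lam,k} = T0 + lam * V_k,
    where the V_k are i.i.d. centered perturbations (lam^2 * W omitted)."""
    d = T0.shape[0]
    X = np.eye(d)
    for _ in range(n):
        V = rng.standard_normal((d, d))
        X = (T0 + lam * V) @ X
    return X

rng = np.random.default_rng(0)
theta = 0.3
T0 = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]])  # illustrative unitary T0

# critical scaling: lambda**2 * n = t is held fixed as lambda -> 0
t = 1.0
for lam in (0.2, 0.1, 0.05):
    n = int(t / lam**2)
    X = random_product(T0, lam, n, rng)
```

In the unitary case sketched here, the product already has an SDE limit after removing the free rotation; the point of the paper is what replaces this when $T_0$ also has eigenvalues off the unit circle.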
In particular, we solve Problem 3 raised in [VV1] and obtain limiting GOE statistics for the Anderson model on sequences of long boxes (cf. Theorem 1.2) with appropriate scalings. We essentially reduce the proof to a situation where it remains to analyze the same family of SDEs as in [VV1]. Deriving the GOE statistics then relies on the work of Erdős, Schlein, Yau and Yin [ESYY, EYY], but we do not repeat these steps, which are carried out in [VV1]. The results are stated in Section 1.2.
Random matrix ensembles such as the Gaussian Orthogonal Ensemble were introduced by Wigner [Wi] to model the observed repulsion between eigenvalues in large nuclei. The local statistics are given by the Sine$_1$ kernel, see e.g. [Me]. This type of repulsion statistics is expected for many randomly disordered systems of the same symmetry class (time-reversal symmetry) that have delocalized eigenfunctions. This is referred to as universality. Most models with rigorously proved universal bulk behavior are themselves ensembles of random matrices, e.g. [DG, ESY, Joh, TV].
Recently, T. Shcherbina proved universal GUE statistics (Gaussian Unitary Ensemble) for random block band matrix ensembles that in some sense interpolate between the classical matrix ensembles and Anderson models [Shc].
The Anderson model was introduced by P. W. Anderson to describe disordered media like randomly doped semiconductors [And]. It is given by the Laplacian plus a random independent identically distributed potential, and it has significantly less randomness than the matrix ensemble models. For large disorder, and at the edge of the spectrum, the Anderson model in $\mathbb Z^d$ or $\mathbb R^d$ localizes [FS, DLS, SW, CKM, AM, Aiz, Klo] and has Poisson statistics [Mi, Wa, CGK, GK]. For small disorder in the bulk of the spectrum, localization and Poisson statistics appear in one- and quasi-one-dimensional systems [GMP, KuS, CKM, Lac, KLS] (except if prevented by a symmetry [SS3]) and are expected (but not proved) in 2 dimensions. Delocalization for the Anderson model was first rigorously proved on regular trees (Bethe lattices) [Kl] and has been extended to several infinite-dimensional tree-like graphs [Kl, ASW, FHS, AW, KLW, FHH, KS, Sa2, Sa3, Sha]. In dimension 3 and higher one expects delocalized eigenfunctions (absolutely continuous spectrum) for small disorder, and the eigenvalue statistics of large boxes should approximate the GOE by universality.
However, proving any of these statements in $\mathbb Z^d$ or $\mathbb R^d$ for $d \ge 3$ remains a great mathematical challenge.
In Theorem 1.3 we consider the limiting eigenvalue process of quasi-one-dimensional models in a critical scaling limit $\lambda^2 n = t$ constant (at band edges one has a different scaling, as mentioned in Section 1.4). In this scaling limit, localization effects and Poisson statistics are not seen, and a description through an SDE arises. As mentioned above, previous works [VV1, BR] had to modify the original Anderson model to avoid hyperbolic channels. We show that the hyperbolic channels only shift the eigenvalues but do not affect the local statistics; in particular we solve Problem 3 raised in [VV1]. In fact, fixing the width and base energy, the local eigenvalue statistics only depends on the number of so-called elliptic channels. This can be seen as a universality statement in itself. Increasing the number of elliptic channels and choosing appropriate sequences of models, the GOE statistics arises.
As a byproduct of this work we prove a conjecture from [Sa1], showing that there is an SDE limit for the reduced transfer matrices in the presence of hyperbolic channels.
The papers [BR, VV1, VV2] are restricted to the important special case where all eigenvalues of $T_0$ have the same absolute value (and there are no Jordan blocks). The novelty of this work is to handle eigenvalues of $T_0$ of different absolute value; the application to Schrödinger operators comes from applying Theorem 1.1 to the transfer matrices and following the calculations until we arrive at precisely the same family of SDEs as in [VV1].
Let us briefly explain why this is not a trivial extension. If $T_0$ (or $A T_0 A^{-1}$ for some matrix $A$) is unitary, one simply has to remove the free evolution from the random products. To illustrate this, let for now $X_{\lambda,n} = T_{\lambda,n} T_{\lambda,n-1} \cdots T_{\lambda,1}$. Then,
$$T_0^{-n} X_{\lambda,n} = \big(\mathbf 1 + \lambda\, T_0^{-n} V_{\lambda,n} T_0^{\,n-1} + \lambda^2\, T_0^{-n} W_\lambda T_0^{\,n-1}\big)\big(T_0^{-(n-1)} X_{\lambda,n-1}\big).$$
Conjugations like $T_0^{-n} (V_{\lambda,n} T_0^{-1}) T_0^{\,n}$ simply lead, in the limit, to an averaging effect over the compact group generated by the unitary $T_0$ in the drift and diffusion terms. Adapting techniques of Stroock and Varadhan [SV] and Ethier and Kurtz [EK] to this situation (cf. Proposition 23 in [VV2]), one directly obtains an SDE limit for $T_0^{-n} X_{\lambda,n}$ in the scaling $\lambda^2 n = t$. If $T_0$ has eigenvalues of different sizes, then generically some entries of $T_0^{-n} W_\lambda T_0^{\,n-1}$, and the variance of some entries of $T_0^{-n} V_{\lambda,n} T_0^{\,n-1}$, will grow exponentially in $n$. This destroys any hope of a limiting process. One may instead consider a process $U^{-n} X_{\lambda,n}$, where $U$ is a unitary that just counteracts the fast rotations. But then different directions still grow at different exponential rates even for the free evolution, and simply projecting to some subspace, $P\, U^{-n} T_0^{-n} X_{\lambda,n}$, does not work either. The trick lies in finding a projection which cuts off the exponential growth of the free evolution without spoiling the convergence of the random evolution to drift and diffusion terms.
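The obstruction can be seen in a two-line numerical experiment (with an arbitrary fixed perturbation $V$): conjugation by powers of a unitary preserves norms, while conjugation by powers of a matrix with eigenvalues of different modulus blows up some entries exponentially.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
V = rng.standard_normal((2, 2))   # a fixed perturbation, for illustration

# unitary T0: the conjugated perturbation keeps the same Frobenius norm
theta = 0.7
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
conj_unitary = np.linalg.matrix_power(U, -n) @ V @ np.linalg.matrix_power(U, n - 1)

# T0 with eigenvalues of different modulus: the (1,2)-entry of
# T0^{-n} V T0^{n-1} is multiplied by 2**(n-1) and explodes
T0 = np.diag([1.0, 2.0])
conj_hyp = np.linalg.matrix_power(T0, -n) @ V @ np.linalg.matrix_power(T0, n - 1)
```

This exponential growth of single entries is what rules out a naive SDE limit for the full matrix product.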
The correct way to handle the exponentially growing directions is to take a Schur complement.
The exponentially decreasing directions tend to zero and do not matter, and the directions of size 1 lead to a limit. The exponentially growing directions have a non-trivial effect and lead to an additional drift term. As the Schur complement itself is not a Markov process, it is better to consider it as part of a quotient of $X_{\lambda,n}$ modulo a certain subgroup of $\mathrm{GL}(d)$. One then still needs several estimates to handle the inverses appearing in the Schur complement, and the error terms, before one can apply a modification of Proposition 23 in [VV2] (cf. Proposition A.1).
Although we cannot take an SDE limit of the entire matrix as indicated above, it is possible to describe the limit of its action on Grassmannians and flag manifolds. The limit processes live in certain submanifolds that are stable under the free, non-random dynamics of $T_0$. This result is related to the numerical calculations in [RS], which considered the action of the transfer matrices on the so-called Lagrangian planes, or Lagrangian Grassmannian (an invariant submanifold of a Grassmannian). The limiting submanifold corresponds to the 'freezing' of some phases related to the hyperbolic channels. In the scaling limit, only a motion of the part corresponding to the so-called elliptic channels can be seen, and it is described by an SDE.
We will also study the case of non-diagonalizable Jordan blocks. These can be dealt with by a λ-dependent basis change which leads to a different critical scaling, see Section 1.4. In the Schrödinger case such Jordan blocks appear at band-edges and we give an example for a Jordan block of size 2d for general d.
First, in Section 1.1 we will explain the main theorem, which is a limit SDE for products of random matrices as in (1.1). The proof is given in Sections 2 and 3. In Section 1.2 we will explain the consequences for random Schrödinger operators, our main application. Further details for the proofs are given in Section 5. Section 1.3 will explain the correlations between SDE limits corresponding to different sizes of eigenvalues of $T_0$ and the related limit for the action on the flag manifold; the proofs are found in Section 4. Section 1.4 will explain how to handle Jordan blocks of $T_0$ and gives an example of a finite range random Schrödinger operator where a Jordan block of size $2d$ appears at a band edge.

1.1 General SDE limits
Without loss of generality we focus on the eigenvalues with absolute value 1 and assume that $T_0$ has no Jordan blocks for eigenvalues of size 1. Next, we conjugate the matrices $T_{\lambda,n}$ to put $T_0$ in Jordan form. We may write it as a block diagonal matrix of dimension $d_0 + d_1 + d_2$ of the form
$$T_0 = \begin{pmatrix} \Gamma_0 & & \\ & U & \\ & & \Gamma_2^{-1} \end{pmatrix}, \qquad (1.2)$$
where $U$ is a unitary, and $\Gamma_0$ and $\Gamma_2$ have spectral radius smaller than 1. The block $\Gamma_0$ corresponds to the exponentially decaying directions and the block $\Gamma_2^{-1}$ to the exponentially growing directions of $T_0$.
The only way the matrix product $T_{\lambda,n} \cdots T_{\lambda,1}$ can have a continuous limiting evolution is if we compensate for the macroscopic rotations given by $U$ (as in [BR, VV1, VV2]). Hence define
$$X_{\lambda,n} = R^{-n}\, T_{\lambda,n} \cdots T_{\lambda,1}\, X_0, \qquad R = \begin{pmatrix} \mathbf 1_{d_0} & & \\ & U & \\ & & \mathbf 1_{d_2} \end{pmatrix}, \qquad (1.3)$$
where $X_0$ is some initial condition and $\mathbf 1_d$ is the identity matrix of dimension $d$.
In most of the following calculations we will use a subdivision into blocks of sizes $d_0 + d_1$ and $d_2$.
Let us define the Schur complement $\widehat X_{\lambda,n}$, of size $(d_0+d_1) \times (d_0+d_1)$, by writing $X_{\lambda,n}$ in blocks:
$$X_{\lambda,n} = \begin{pmatrix} A_{\lambda,n} & B_{\lambda,n} \\ C_{\lambda,n} & D_{\lambda,n} \end{pmatrix}, \qquad \widehat X_{\lambda,n} = A_{\lambda,n} - B_{\lambda,n} D_{\lambda,n}^{-1} C_{\lambda,n}.$$
If $X_{\lambda,n}$ and $D_{\lambda,n}$ are both invertible, then $\widehat X_{\lambda,n} = (P X_{\lambda,n}^{-1} P^*)^{-1}$, where $P$ is the projection to the first $d_0+d_1$ coordinates. Note that invertibility of $D_{\lambda,n}$ is required to define $\widehat X_{\lambda,n}$. Therefore, we demand the starting value $D_0$, the corresponding block of $X_0$, to be invertible. The first important observation, explained in Section 2, is that the pair $(\widehat X_{\lambda,n}, Z_{\lambda,n})$ with $Z_{\lambda,n} = B_{\lambda,n} D_{\lambda,n}^{-1}$ is a Markov process. Therefore, it will be more convenient to study this pair.
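The identity behind this definition is the standard block-inverse formula: the Schur complement $A - BD^{-1}C$ is the inverse of the top-left block of $X^{-1}$. A quick numerical check, with random blocks and arbitrarily chosen sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
d01, d2 = 3, 2        # block sizes d0 + d1 and d2 (illustrative values)
X = rng.standard_normal((d01 + d2, d01 + d2))
A, B = X[:d01, :d01], X[:d01, d01:]
C, D = X[d01:, :d01], X[d01:, d01:]

schur = A - B @ np.linalg.inv(D) @ C      # Schur complement of D in X

# block-inverse identity: schur^{-1} equals P X^{-1} P^T, with P the
# projection onto the first d01 coordinates
P = np.hstack([np.eye(d01), np.zeros((d01, d2))])
top_left = P @ np.linalg.inv(X) @ P.T
assert np.allclose(np.linalg.inv(schur), top_left)
```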
We need the following assumptions.
Assumptions. We assume that for some constants $\epsilon > 0$, $\lambda_0 > 0$ certain moment bounds hold, and furthermore that the limits of first and second moments in (1.8) exist. In order to state the main theorem, we need to subdivide $V_{\lambda,n}$ and $W_\lambda$ into blocks of sizes $d_0$, $d_1$, $d_2$.
We denote the $d_j \times d_k$ blocks by $V_{\lambda,jk}$ and $W_{\lambda,jk}$, respectively (1.9). The covariances of the $d_1 \times d_1$ block $V_{\lambda,11}$ will be important. A useful way to encode the covariances of centered matrix-valued random variables $A$ and $B$ is to consider matrix-valued linear functions of the form $M \mapsto \mathbb E(AMB)$: choosing matrices $M$ with one entry 1 and all other entries zero, one can read off $\mathbb E(A_{ij} B_{kl})$ and $\mathbb E(A_{ij} \overline{B_{kl}})$ directly. Let us therefore define $g$ and $\hat g$ as in (1.10). Furthermore, the lowest order drift term of the limit will come from the lowest order Schur complement and hence contains some influence of the exponentially growing directions; therefore, let $V$ be defined as in (1.11). By the assumption (1.8) above, these limits exist.
$\Lambda_t$ is a $d_1 \times d_1$ matrix-valued process, the solution of an SDE, and $B_t$ is a complex matrix Brownian motion (i.e. $B_t$ is Gaussian) with explicit covariances. Here, $\mathcal U$ denotes the compact abelian group generated by the unitary $U$, i.e. the closure of the set of all powers of $U$, and $du$ denotes the Haar measure on $\mathcal U$.
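The Haar average over the closed group generated by $U$ can be approximated by Cesàro averages along the orbit $U^k$, $k = 0, 1, \ldots$. In the following sketch $U$ rotates by an irrational angle, so the average of $u M u^*$ kills the off-diagonal entries of $M$; the concrete $U$ and $M$ are illustrative.

```python
import numpy as np

theta = np.sqrt(2) * np.pi   # irrational rotation: the closed group <U> is a circle
U = np.diag([np.exp(1j * theta), np.exp(-1j * theta)])
M = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=complex)

# Cesaro average of u M u^* along the orbit u = U^k approximates the Haar average
N = 20000
avg = np.zeros((2, 2), dtype=complex)
u = np.eye(2, dtype=complex)
for _ in range(N):
    avg += u @ M @ u.conj().T
    u = U @ u
avg /= N

# for this U the Haar average keeps the diagonal of M and kills the off-diagonal
assert np.allclose(avg, np.diag(np.diag(M)), atol=1e-2)
```

This is the averaging effect over the compact group that enters the drift and diffusion coefficients of the limiting SDE.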
Remark. (i) The analogous theorem holds in the situation $d_2 = 0$ (no exponentially growing directions). In this case the matrices $B_{\lambda,n}$, $C_{\lambda,n}$ and $D_{\lambda,n}$ do not exist and one simply has $\widehat X_{\lambda,n} = A_{\lambda,n} = X_{\lambda,n}$. In this case some of the estimates in the proof can actually be simplified, as one does not need to work with the process $B_{\lambda,n} D_{\lambda,n}^{-1}$ and no inverse is required.
(ii) The theorem also holds in the case $d_0 = 0$, i.e. no exponentially decaying directions. In this case one simply has $X_t = \Lambda_t X_{11}$.
(iii) The theorem does not hold for $t = 0$, and indeed it looks contradictory for small $t$. However, the exponentially decaying directions go to zero exponentially fast, so that one obtains, for sufficiently small $\alpha$, the initial conditions for the limiting process.
(iv) When defining the process $X_{\lambda,n}$ one may want to subtract some of the oscillating terms in the growing and decaying directions as well, i.e. one may want to replace $R$ in (1.3) by a unitary $\hat R$ written in blocks of sizes $d_0$, $d_1$, $d_2$, respectively. Then let
$$\hat X_{\lambda,n} = \hat R^{-n} R^n X_{\lambda,n} = \begin{pmatrix} \hat A_{\lambda,n} & \hat B_{\lambda,n} \\ \hat C_{\lambda,n} & \hat D_{\lambda,n} \end{pmatrix}$$
and define the corresponding Schur complement $\hat A_{\lambda,n} - \hat B_{\lambda,n} \hat D_{\lambda,n}^{-1} \hat C_{\lambda,n}$ as well as $\hat Z_{\lambda,n} = \hat B_{\lambda,n} \hat D_{\lambda,n}^{-1}$. Simple algebra then shows that, for $n \to \infty$, one obtains the same limit, where $X_t$ is the exact same process as in Theorem 1.1.
In Section 2 we will develop the evolution equations for the process $(\widehat X_{\lambda,n}, Z_{\lambda,n})$, together with some crucial estimates. In Section 3 we will then obtain the limiting stochastic differential equations as in Theorem 1.1, using Proposition A.1 in Appendix A for the convergence of Markov processes to SDE limits. The reader interested in the proofs can continue with Section 2 at this point.

converge to the Sine$_1$ point process. The latter process is the large-$n$ limit of the random set of eigenvalues of the $n \times n$ Gaussian Orthogonal Ensemble near 0.

1.2 Eigenvalue limits for random Schrödinger operators and the GOE
A version of such predictions was proved rigorously [VV1] for subsequences $n_i \gg d_i \to \infty$, $\lambda_i^2 n_i \to 0$, but only near energies $E_i$ tending to zero. In a modified model where the edges in the $d$ direction get weight $r < 1$, the proof of [VV1] works for almost all energies in the range $(-2+2r, 2-2r)$. Proving such claims for almost all energies of the original model (1.16) presented a challenge, the main motivation for the present paper. For better comparison with [VV1] let us re-introduce the weight $r$. It is natural to think of operators like (1.16) as acting on a sequence $\psi = (\psi_1, \ldots, \psi_n)$ of $d$-vectors. So given the weight $r$, let us define the $nd \times nd$ matrix $H_{\lambda,n,d}$ by
$$(H_{\lambda,n,d}\,\psi)_k = \psi_{k-1} + \psi_{k+1} + (r Z_d + \lambda V_k)\,\psi_k,$$
with the notational convention that $\psi_0 = \psi_{n+1} = 0$. Here, $Z_d$ is the adjacency matrix of a path with $d$ vertices and the $V_k$ are i.i.d. real diagonal matrices. Then we obtain the following:

Theorem 1.2. For any fixed $r > 0$ and almost every energy $E \in (-2-2r, 2+2r)$ there exist sequences $n_k \gg d_k \to \infty$, $\sigma_k^2 := \lambda_k^2 n_k \to 0$, such that the process of eigenvalues of $n_k (H_{\lambda_k, n_k, d_k} - E)$ converges to the Sine$_1$ process. In particular, the level statistics corresponds to GOE statistics in this limit.
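A minimal sketch of the model, assuming the reading of the definition suggested by the text: diagonal blocks $r Z_d + \lambda V_k$ for the transverse direction (weight $r$ on the transverse edges) and identity hopping between consecutive slices; the function name and parameter values are illustrative.

```python
import numpy as np

def anderson_box(n, d, lam, r, rng):
    """nd x nd Anderson Hamiltonian on an n x d box:
    (H psi)_k = psi_{k-1} + psi_{k+1} + (r*Z_d + lam*V_k) psi_k,
    with psi_0 = psi_{n+1} = 0, Z_d the path adjacency matrix,
    and V_k i.i.d. real diagonal matrices."""
    Zd = np.diag(np.ones(d - 1), 1) + np.diag(np.ones(d - 1), -1)
    H = np.zeros((n * d, n * d))
    for k in range(n):
        s = slice(k * d, (k + 1) * d)
        H[s, s] = r * Zd + lam * np.diag(rng.standard_normal(d))
        if k + 1 < n:
            t = slice((k + 1) * d, (k + 2) * d)
            H[s, t] = np.eye(d)   # hopping between slices k and k+1
            H[t, s] = np.eye(d)
    return H

rng = np.random.default_rng(3)
H = anderson_box(50, 5, 0.1, 1.0, rng)
evals = np.linalg.eigvalsh(H)   # real spectrum, near [-2-2r, 2+2r] for small lam
```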
Theorem 1.2 resolves Problem 3 posed in [VV1]. There one has r < 1 and E ∈ (−2 + 2r, 2 − 2r) or r = 1 and a sequence of energies converging to 0. (Note that this interval is smaller than the one in Theorem 1.2 and in fact empty for r ≥ 1). Theorem 1.2 applies to the exact Anderson model r = 1 with any fixed energy in the interior (−4, 4) of the spectrum of the discrete two-dimensional Laplacian. It also applies in the case r > 1. This is because hyperbolic channels can now be handled for the SDE limit. The exact definition of 'elliptic' and 'hyperbolic' channels will be given below. Overcoming this difficulty was the main motivation for this work.
Essentially, only the elliptic channels play a role in the limit of the eigenvalue process. It is thus important to consider sequences of models in which the number of elliptic channels goes to infinity. Indeed, one can obtain GOE statistics even for a sequence of energies $E_k$ approaching the edge of the spectrum, $|E| = 2 + 2r$. For this, one needs the sequence $d_k$ to grow fast enough that the number of elliptic channels at energy $E_k$ grows.
In the sequel we will study the limiting eigenvalue process for $n \to \infty$ with $\lambda^2 n$ constant and $d$ fixed, for more general random $nd \times nd$ matrices as in (1.19). Here, $A$ is a general Hermitian matrix, and the $V_k$ are general i.i.d. Hermitian matrices. We drop the index $d$ now, as $d$ will be fixed from now on, and sometimes we may also drop the index $n$. Moreover, for simplicity, we can assume that $A$ is diagonal; indeed, this can be achieved by a conjugation. The eigenvalue equation $H_\lambda \psi = E\psi$ is a recursion that can be written in matrix form as
$$\begin{pmatrix} \psi_{k+1} \\ \psi_k \end{pmatrix} = T^E_{\lambda,k} \begin{pmatrix} \psi_k \\ \psi_{k-1} \end{pmatrix}, \qquad T^E_{\lambda,k} = \begin{pmatrix} E - A - \lambda V_k & -\mathbf 1 \\ \mathbf 1 & 0 \end{pmatrix}. \qquad (1.20)$$
The $T^E_{\lambda,k}$ are called transfer matrices. For the limiting eigenvalue process we obtain the following:

Theorem 1.3. Let $E$ be an energy such that the unperturbed transfer matrix $T^E_{0,k}$ (which is independent of $k$) is diagonalizable and has $2d_e > 0$ eigenvalues of absolute value 1. (In the notions introduced below this means we have $d_e > 0$ elliptic, $d_h = d - d_e \ge 0$ hyperbolic and no parabolic channels.) Consider the process $\mathcal E_{\sigma,n}$ of eigenvalues of $n(H_{\sigma/\sqrt n,\,n} - E)$ and let $n_k$ be an increasing sequence such that $Z^{n_k} \to Z_*$ for $k \to \infty$, with $Z$ being the unitary, diagonal $d_e \times d_e$ matrix defined in (1.21). Then $\mathcal E_{\sigma,n_k}$ converges to the zero process of the determinant of a $d_e \times d_e$ matrix, where the $2d_e \times 2d_e$ matrix process $\Lambda_t = \Lambda_{\sigma,\varepsilon,t}$ satisfies an SDE in $t$ (with $\sigma$, $\varepsilon$ fixed) as given in Proposition 1.4.
Now, $E$ is an eigenvalue of $H_{\lambda,n}$ if there is a nonzero solution $(\psi_1, \psi_n)$ of
$$\begin{pmatrix} 0 \\ \psi_n \end{pmatrix} = T_n \cdots T_1 \begin{pmatrix} \psi_1 \\ 0 \end{pmatrix},$$
equivalently, when the determinant of the top left $d \times d$ block of $T_n \cdots T_1$ vanishes. So we can study the eigenvalue equation through the products which are the focus of Theorem 1.3. The matrices $T_k$ satisfy the definition of elements of the hermitian symplectic group $\mathrm{HSp}(2d)$; in particular, they are all invertible. The $T_k$ are all perturbations of the noiseless matrix $T_* = T^E_{0,1}$. This matrix is also block diagonal with $d$ blocks of size 2, and the eigenvalues of $T^E_{0,1}$ are exactly the $2d$ solutions of the $d$ quadratics
$$z^2 - (E - a_j)\,z + 1 = 0, \qquad j = 1, \ldots, d,$$
so the solutions lie on the real line or on the complex unit circle, depending on whether $|E - a_j|$ is more or less than two. We call the corresponding generalized eigenspaces of $T_* = T^E_{0,1}$ elliptic ($< 2$), parabolic ($= 2$) and hyperbolic ($> 2$) channels. Elliptic and hyperbolic channels correspond to two-dimensional eigenspaces, while parabolic channels correspond to a size-2 Jordan block. Traditionally, this terminology refers to the solutions of the noiseless ($\lambda = 0$) recursion that are supported in these subspaces for every coordinate $\psi_n$.
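The channel classification can be checked directly from the quadratic: its two roots have product 1, so they are either a conjugate pair on the unit circle ($|E - a_j| < 2$) or a real pair $z$, $1/z$ ($|E - a_j| > 2$). A small sketch (the helper name is illustrative):

```python
import numpy as np

def channel_type(E, a):
    """Classify the channel at transverse eigenvalue a for energy E via the
    quadratic z**2 - (E - a)*z + 1 = 0 (eigenvalues of the 2x2 block of T0)."""
    x = abs(E - a)
    if x < 2:
        return "elliptic"      # conjugate roots on the unit circle
    if x > 2:
        return "hyperbolic"    # real roots z and 1/z with |z| != 1
    return "parabolic"         # double root +-1, a size-2 Jordan block

# for |E - a| < 2 both roots indeed lie on the unit circle
E, a = 0.5, 1.0
z = np.roots([1.0, -(E - a), 1.0])
assert channel_type(E, a) == "elliptic" and np.allclose(np.abs(z), 1.0)
assert channel_type(0.0, 3.0) == "hyperbolic"
```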
Pick an energy $E$ such that there are no parabolic channels and at least one elliptic channel.
Correspondingly, we define the hyperbolic eigenvalues $\gamma_j$ and elliptic eigenvalues $z_j$ of $T_*$, as well as the associated diagonal matrices. In order to complete the description of the limiting eigenvalue process, we need to consider a family of limiting SDEs obtained by varying the energy in the correct scaling. More precisely, we define a $2d_e \times 2d_e$ unitary matrix $U$ and a $2d \times 2d$ matrix $Q$ such that $Q$ diagonalizes $T_*$ to a form as in (1.2), as used for Theorem 1.1. The scaling $\varepsilon \lambda^2$ of the energy means that a unit interval of $\varepsilon$ should contain a constant order of eigenvalues. In order to get limiting SDEs we consider a Schur complement and a projection of it as before, defining $2d_e \times 2d_e$ matrices. Then by Theorem 1.1 we obtain the correlated family (with parameters $\sigma$, $\varepsilon$) of limiting processes. These limiting processes correspond to the ones in Theorem 1.3 for $\sigma = 1$; for general $\sigma$ one has to replace $V_k$ by $\sigma V_k$ in (1.23).
Remark. For $\varepsilon = 0$, up to some conjugation, the matrix $T_{0,\lambda,n}$ corresponds to the reduced transfer matrix as introduced in [Sa1] for the scattering of a block of finite length $n$, described by $H_\lambda$, inserted into a cable of infinite length ('$n = \infty$') described by $H_0$. Thus we obtain that in the limit $\lambda^2 n = \mathrm{const.}$, $n \to \infty$, the process of the reduced transfer matrix as defined in [Sa1] is described by an SDE, proving Conjecture 1 in [Sa1].
In order to express the limit SDEs explicitly, we need to split the potential $V_1$ into its hyperbolic and elliptic parts. Moreover, we define $S_h$ by a corresponding average, where $dz$ denotes the Haar measure on the compact abelian group generated by the diagonal, unitary matrix $Z$. As we will see, $S_h$ gives rise to a drift term coming from the hyperbolic channels; in fact, this is the only influence of the hyperbolic channels on the limit process.
Moreover, to simplify expressions, we will be interested in one specific case.
Definition. We say that the matrix $Z = \mathrm{diag}(z_1, \ldots, z_{d_e})$ with $|z_j| = 1$, $\mathrm{Im}(z_j) > 0$ is chaotic if the conditions below hold for all $i, j, k, l \in \{1, \ldots, d_e\}$.

Brownian motions, independent of $\varepsilon$ and $\sigma$, with $A_t^* = A_t$, $C_t^* = C_t$ and certain covariances.
(ii) If $A$ and the $V_n$ are real symmetric, then we obtain additional symmetries.
(iii) If $Z$ is chaotic, then $B_t$ is independent of $A_t$ and $C_t$. Also, $A_t$ and $C_t$ have the same distribution.
Moreover, with the subscript $t$ dropped, we have the following covariances for any $i, j, k, l$; all other covariances are obtained from $A_t = A_t^*$, $C_t = C_t^*$.

1.3 Correlations along different directions and SDE limit on the flag manifold
Here we will use the same notation as in Section 1.1. When $T_0$ has eigenvalues of absolute value $c$ different from 1, and $T_0$ is diagonalized (or in Jordan form) so that the corresponding eigenspaces are also spanned by coordinate vectors and have no Jordan blocks, then we can apply Theorem 1.1 to the products of the $T_{\lambda,n}/c$. Moreover, the convergence in law holds jointly for the processes corresponding to the magnitudes 1 and $c$ (and in fact all magnitudes). To complete the picture, we just have to specify the covariance structure of the driving matrix-valued Brownian motions for the two processes. Towards this, we use the block $V^{(c)}_{\lambda,11}$, the corresponding $d_1(c) \times d_1(c)$ block of $V_{\lambda,n}$. Similarly, we define $h_{cc'}$ and $\hat h_{cc'}$ for any two absolute values $c$, $c'$ (see also (1.10)).
As before, we also need the $d_1(c) \times d_1(c)$ unitaries $U_c$ (like $U$ in (1.2)) so that $T_0$ restricted to the eigenspaces of magnitude $c$ acts like $c U_c$.
Theorem 1.5. The convergence of Theorem 1.1 holds jointly along all eigenspaces corresponding to absolute values $c$ of eigenvalues of $T_0$ whose eigenspaces carry no Jordan block. We denote the corresponding process for the magnitude $c$ by $\Lambda^{(c)}_t$. Then the covariance of the driving Brownian motions $B$, $B'$ for the magnitudes $c$, $c'$ is given explicitly.

If the eigenvalues of $T_0$ are of different absolute value, then the matrix product process grows in different directions at different exponential rates. Hence there is no hope of a matrix limit of the process that captures all the directions and all the different SDE limits $\Lambda^{(c)}_t$ simultaneously. The above picture still holds when we add the perturbations and consider the products $T_{\lambda,n} \cdots T_{\lambda,1}$.
So nothing interesting happens in this case. Things become more interesting when there is more than one eigenvalue of $T_0$ with a given absolute value. If this holds for the top one, then the direction of the image of a typical vector becomes dependent on the randomness, even in the limit.
The deterministic dynamics only gives that the vector will be in the subspace spanned by the eigenvectors corresponding to the top absolute value. In this sense, the different exponential rates will still determine certain subspaces of the flag in the limit so that the limiting process will be in a specific submanifold that is invariant and attracting under the action of T 0 .
Our next theorem shows how this happens. More precisely, we will consider a flag which is typical for the behavior of powers of T 0 . This happens if the k-dimensional spaces of the flag do not include directions that are spanned by subsets of eigenvectors of T 0 corresponding to eigenvalues of lower order. The matrix products applied to this flag will give a flag-valued process. This is described in Theorem 1.6.
As only invertible matrices act on a flag, suppose that for small $\lambda$ all $T_{\lambda,n}$ are invertible with probability one, i.e. there is $\lambda_0$ such that for all $0 \le \lambda < \lambda_0$, $\mathbb P(T_{\lambda,n} \text{ is invertible for all } n) = 1$.
Suppose further that $T_0$ is diagonalizable and that we choose a basis such that $T_0$ has the block diagonal form (1.30). This is an attractor under the deterministic dynamics given by the action of $T_0$, and the set of points in $\mathcal F$ that is attracted is given by flags represented as in (1.31), with $a_j \in \mathrm{GL}(d(c_j))$ for all $j$ and arbitrary remaining entries. To counteract all the rotations, we define a unitary $R$ as before.

Theorem 1.6. Let $T_0$ be as in (1.30) and let $[F_0] \in \mathcal F_a$ be represented in the form described in (1.31). Furthermore let $F_{\lambda,n} = R^{-n} T_{\lambda,n} \cdots T_{\lambda,1} F_0$.
Then, for fixed $t > 0$ and $n \to \infty$, the flag-valued process $F_{\lambda,n}$ converges in distribution. Here, the $\Lambda^{(c_j)}_t$ are the correlated processes for the different magnitudes $c_j$ of eigenvalues of $T_0$, whose correlations are described in Theorem 1.5. The theorem is proved in Section 4.2.
Remark. If $T_0$ cannot be brought into the form (1.30), then in general one still obtains the SDE limits on the Grassmannians $G(p,d)$ for $d_2 < p \le d_2 + d_1$, as in the proof; $G(p,d)$ denotes the space of $p$-dimensional subspaces of $\mathbb C^d$.
The interesting point in the theorem is that the process $R^{-n} T_n \cdots T_1$ does not have a distributional limit, while its quotient by $\Delta(d)$ does.

Question. For which subgroups $\Delta'$ of $\mathrm{GL}_d$ does the process $R^{-n} T_n \cdots T_1 / \Delta'$ have a distributional limit?

When $\Delta'$ is algebraic and the quotient $\mathrm{GL}_d/\Delta'$ is compact, then $\Delta'$ contains a conjugate of $\Delta(d)$; therefore, $\mathrm{GL}_d/\Delta'$ can be seen as a quotient of the flag manifold itself and there is such a limit. In Section 2 we will see that the distributional limit of the pair $(\widehat X_{\lambda,n}, Z_{\lambda,n})$ can also be seen as a distributional limit on a certain quotient.

1.4 Jordan blocks and critical scaling
Without loss of generality we will focus on the eigenvalues of $T_0$ of size 1. Let us introduce the notation $J_k$ for the standard $k \times k$ Jordan block with eigenvalue 1, and $N_k$ for the standard Jordan block with eigenvalue 0, i.e.
$$J_k = \mathbf 1_k + N_k, \qquad (N_k)_{j,j+1} = 1 \ \text{for } 1 \le j \le k-1, \quad (N_k)_{j,l} = 0 \ \text{otherwise.}$$
If a Jordan block of the form $e^{i\theta} J_k$ appears in (a possible conjugation of) $T_0$, then we will perform a $\lambda$-dependent conjugation. This trick was already used in [SS1] to analyze the Lyapunov exponent and density of states at a band edge of a one-dimensional Schrödinger operator. The main point is the following observation. Define the $\lambda$-dependent, diagonal $k \times k$ matrices
$$S_{\lambda,\alpha} = \mathrm{diag}\big(1, \lambda^\alpha, \lambda^{2\alpha}, \ldots, \lambda^{(k-1)\alpha}\big). \qquad (1.32)$$
Now, using blocks of sizes $d_0$, $d_1$, $d_2$ as before, let $T_0$ be as in (1.33), with $\Gamma_1$ and $\Gamma_2$ having spectral radius smaller than one as before. Conjugating $T_{\lambda,n}$ by $S_{\lambda,\alpha}$ will give a new drift term of order $\lambda^\alpha$ coming from (1.32), but it also brings a diffusion term of order $\lambda^{1-(d_1-1)\alpha}$ from conjugating $\lambda V_{\lambda,n}$. The diffusion thus has order $\lambda^{2-2(d_1-1)\alpha}$, and the most interesting SDE limit arises from balancing the new drift term against the diffusion term, i.e. $\alpha = 2 - 2(d_1-1)\alpha$, leading to $\alpha = \alpha(d_1) = 2/(2d_1-1)$. For smaller $\alpha$ the drift term dominates, and for larger $\alpha$ the diffusion term dominates.
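The bookkeeping behind the exponent $\alpha(d_1) = 2/(2d_1-1)$ can be verified numerically, assuming the diagonal scaling $S_{\lambda,\alpha} = \mathrm{diag}(1, \lambda^\alpha, \ldots, \lambda^{(k-1)\alpha})$ (consistent with the orders of the drift and diffusion terms stated in the text): conjugation turns $J_k$ into $\mathbf 1 + \lambda^\alpha N_k$, while the lower left entry of $S_{\lambda,\alpha}^{-1} V S_{\lambda,\alpha}$ is amplified by $\lambda^{-(k-1)\alpha}$.

```python
import numpy as np

k, lam = 3, 1e-3
alpha = 2.0 / (2 * k - 1)                    # critical exponent alpha(d1) for d1 = k
S = np.diag(lam ** (alpha * np.arange(k)))   # S = diag(1, lam^a, ..., lam^{(k-1)a})
N = np.diag(np.ones(k - 1), 1)               # nilpotent part N_k
J = np.eye(k) + N                            # Jordan block J_k, eigenvalue 1

# the free part becomes 1 + lam^alpha * N_k: a drift of order lam^alpha
conj = np.linalg.inv(S) @ J @ S
assert np.allclose(conj, np.eye(k) + lam**alpha * N)

# the lower left corner of S^{-1} V S is amplified by lam^{-(k-1)*alpha},
# so lam * S^{-1} V S has a diffusion term of order lam^{1-(k-1)*alpha}
Vmat = np.arange(1.0, k * k + 1).reshape(k, k)   # illustrative fixed V
conjV = np.linalg.inv(S) @ Vmat @ S
assert np.isclose(conjV[k - 1, 0], Vmat[k - 1, 0] * lam ** (-(k - 1) * alpha))

# balancing: the squared diffusion order equals the drift order at alpha(k)
assert np.isclose(2 - 2 * (k - 1) * alpha, alpha)
```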
In fact, only the lower left corner entry of the middle $d_1 \times d_1$ block of $S_{\lambda,\alpha}^{-1} V_{\lambda,n} S_{\lambda,\alpha}$ will be of order $\lambda^{\alpha/2}$; all other terms from the conjugation will be at least of order $\lambda^{3\alpha/2}$. Hence, for the case as in (1.33) we find (1.34). Furthermore, $V_{11,n}$ has only one entry $v_n$, in the lower left corner, and the $v_n$ are i.i.d. random variables with mean zero. Therefore, application of Theorem 1.1 gives an SDE limit in the scaling $\lambda^{\alpha(k)} n = \lambda^{2/(2k-1)} n = t$:

Theorem 1.7. Let $T_{\lambda,n}$ be given as in (1.1) and let the assumptions as on page 6 and (1.33) be satisfied. Moreover let $S_{\lambda,\alpha}$ be defined as above with $\alpha = \alpha(d_1) = 2/(2d_1-1)$. Let $X_{\lambda,n} = R^{-n} S_{\lambda,\alpha}^{-1} T_{\lambda,n} \cdots T_{\lambda,1} S_{\lambda,\alpha} X_0$ with $X_0$ as before, and let $\widehat X_{\lambda,n}$ be the corresponding Schur complement as before. Then the limit (1.35) holds, where $B_t$ is a complex Brownian motion with explicit covariances. Note that for a vector $x(t) = \Lambda_t x(0)$, equation (1.35) is equivalent to a differential equation for the first entry $x_1$ involving $x_1^{(j)}$, the $j$th derivative of $x_1$, where $B'$ is the (distributional) derivative of the Brownian motion term.
Remark. The original drift term coming from $\lambda^2 W_\lambda$ is of too low order after the conjugation with $S_{\lambda,\alpha}$ to matter in the limit. If one wants an additional drift term on the right hand side of (1.36), coming from an added term $\lambda^\beta W$, then the conjugation $S_{\lambda,\alpha}^{-1} \lambda^\beta W S_{\lambda,\alpha}$ needs to produce a term of order $\lambda^{2/(2d_1-1)} = \lambda^\alpha$. If $W$ is not zero in the lower left corner of the corresponding $d_1 \times d_1$ block for the SDE limit, then one needs $\beta - (d_1-1)\alpha = \alpha$, i.e. $\beta = d_1 \alpha = 2d_1/(2d_1-1)$.
Jordan blocks do appear at so-called band edges of transfer matrices of one-dimensional random Schrödinger operators with finite range hopping. Similarly to Section 1.2, consider the random family of real symmetric matrices $H^{(d)}_{\lambda,n}$ acting on $\mathbb C^n \ni \psi = (\psi_1, \ldots, \psi_n)$, where $\psi_j = 0$ for $j < 1$ and $j > n$. We may sometimes drop the index $n$. The $v_k$ are independent, identically distributed real random variables with variance $\mathbb E(v_k^2) = 1$. Note that for $d = 1$ this operator corresponds to (1.19) with $A = -2$. The eigenvalue equation $H^{(d)}_\lambda \psi = E\psi$ can be rewritten as $\hat\psi_k = (T + (E - \lambda v_k)\, S)\, \hat\psi_{k-1}$, where $\hat\psi_k = (\psi_{k+d}, \psi_{k+d-1}, \ldots, \psi_{k-d+1})^\top$ and $S$ and $T$ are $2d \times 2d$ matrices given by: $S_{1,d} = 1$ and all other entries of $S$ are zero; $T_{1,k} = (-1)^{k+1} \binom{2d}{2d-k}$, $T_{j,j-1} = 1$ for $j \ge 2$, and all other entries of $T$ are zero.
For $E = 0$ and $\lambda = 0$ the transfer matrix $T$ is equivalent to a Jordan block of maximum size for the eigenvalue 1. (In fact, $E = 0$ is at the edge of the spectrum of the operator $H^{(d)}$.) In order to bring it into Jordan form, let us define the Pascal-triangle type matrix $M$ by $M_{jk} = \binom{2d-j}{k-1}$ for $k + j \le 2d+1$ and zero for all other entries; then one has $M^{-1}_{jk} = (-1)^{j+k+1} \binom{j-1}{2d-k}$ for $k + j \ge 2d+1$ and all other entries zero.
Then some calculation shows $M^{-1} T M = J_{2d}$, where $J_{2d}$ is the Jordan matrix defined above. For the conjugation of the whole transfer matrix $T + (E - \lambda v_k) S$ we also need to calculate $M^{-1} S M$; in particular, its lower left corner has the entry 1. As above, let $\alpha = \frac{2}{4d-1}$ and, as in the remark, scale energy differences by $E = \epsilon \lambda^{2d\alpha}$. Then, for any vector $x \in \mathbb C^{2d}$ we obtain a limiting equation, where $B'$ is the distributional derivative of a standard, real, one-dimensional Brownian motion.
Following the arguments of [KVV], or the arguments of the proof of Theorem 1.3, one could show that (along suitable subsequences, so that the boundary conditions converge) the eigenvalue process of $n^{2d} H^{(d)}_{n^{-1/\alpha},\,n}$ with $\alpha = 2/(4d-1)$ converges to the process of eigenvalues of the random operator $\partial_x^{2d} - B'$ acting on the interval $[0,1]$ with appropriate boundary conditions. For periodic boundary conditions this is a generalization of the random Hill operator (the case $d = 1$).
Then the assumptions imply corresponding bounds for small $\lambda$. For the proof of Theorem 1.1 we will fix some time $T > 0$ and obtain the SDE limit up to time $T$, which is fixed but arbitrary. Let us first show the following.
Now let $s$ be such that $1-s$ lies between $2/(6+\epsilon)$ and $1/3$, and define the truncated random variable $\bar Y_{\lambda,n}$. By the choice of $s$, $(1-s) > \frac{2}{6+\epsilon} > \frac{1}{5+\epsilon}$, and we obtain the corresponding bound, and similarly for $\lambda \to 0$. Thus, using $\bar Y_{\lambda,n}$ instead of $Y_{\lambda,n}$ in (1.13), (1.14) and (1.15) does not change the quantities $V$, $g(M)$ and $\hat g(M)$. Hence, the SDE limits mentioned in Theorem 1.1 for $Y_{\lambda,n}$ and $\bar Y_{\lambda,n}$ are the same.
Let us assume that Theorem 1.1 holds for $\widehat Y_{\lambda,n}$ and deduce its validity for $Y_{\lambda,n}$ by showing that we obtain the same limit SDE. From (2.3) one obtains an estimate defining constants $c > 0$ and $\delta > 0$ such that
$$\mathbb{P}\bigl(\|Y_{\lambda,n}\| > K\lambda^{s-1} \ \text{for some}\ n = 1, 2, \dots, \lfloor \lambda^{-2} T \rfloor\bigr) \;\le\; T c \lambda^{\delta},$$
which approaches zero as λ → 0. Therefore, introducing the stopping time $T_\lambda := \min\{n : Y_{\lambda,n} \neq \widehat Y_{\lambda,n}\}$ and considering the stopped process $X_{\lambda, n \wedge T_\lambda}$, one obtains the same SDE limits. But the stopped processes coincide whether one uses $Y_{\lambda,n}$ or $\widehat Y_{\lambda,n}$, and therefore both lead to the same SDE limit.
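The origin of the constants c and δ can be spelled out. Assuming (the displays around (2.3) are not reproduced here) that (2.3) provides a moment bound of order $6+\epsilon$, which is consistent with the requirement $(1-s) > 2/(6+\epsilon)$ above, Markov's inequality and a union bound give:

```latex
\[
  \mathbb{P}\bigl(\|Y_{\lambda,n}\| > K\lambda^{s-1}\bigr)
  \;\le\; \frac{\mathbb{E}\,\|Y_{\lambda,n}\|^{6+\epsilon}}{(K\lambda^{s-1})^{6+\epsilon}}
  \;=\; c\,\lambda^{(1-s)(6+\epsilon)}
  \;=\; c\,\lambda^{2+\delta},
  \qquad \delta := (1-s)(6+\epsilon) - 2 \,>\, 0,
\]
so that summing over the $\lfloor \lambda^{-2} T \rfloor$ steps yields
\[
  \mathbb{P}\bigl(\|Y_{\lambda,n}\| > K\lambda^{s-1}
  \ \text{for some}\ n \le \lfloor \lambda^{-2} T \rfloor\bigr)
  \;\le\; \lambda^{-2} T \cdot c\,\lambda^{2+\delta}
  \;=\; T c\,\lambda^{\delta}.
\]
```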
Thus we may assume equation (2.2) without loss of generality and we will do so from now on.
Moreover, as the spectral radii of $\Gamma_0$ and $\Gamma_2$ are smaller than 1, we may, after a change of basis, assume the following. Before obtaining the evolution equations, we first establish that the pair $(X_{\lambda,n}, Z_{\lambda,n})$ is a Markov process. Let us define the following subgroup of $GL(d, \mathbb{C})$.
Now call $X_1$ and $X_2$ equivalent, $X_1 \sim X_2$, if $X_1 = X_2 Q$ for some $Q \in G$. As different representatives differ by multiplication from the right, multiplication from the left defines an action on the equivalence classes. Therefore, the evolution of the equivalence classes $[X_{\lambda,n}]_\sim$ is a Markov process. From the representation above we see that the equivalence class $[X_{\lambda,n}]_\sim$ is determined by the pair $(X_{\lambda,n}, Z_{\lambda,n})$.
Let us further introduce the following commuting matrices of size $d_0 + d_1$. Note that R is unitary and that (2.7) holds for R as defined in (1.3). As $\Gamma_0^n$ is exponentially decaying, we refer to the $d_0$-dimensional subspace corresponding to this matrix block as the decaying directions of $T_0^n$. Similarly, the $d_2$-dimensional subspace corresponding to the entry $\Gamma_2^{-n}$ is referred to as the growing directions.
The evolution of $X_{\lambda,n}$ is given by
$$X_{\lambda,n} = R^{-n} T_{\lambda,n} R^{n-1} X_{\lambda,n-1}. \qquad (2.8)$$
Therefore, let us introduce the corresponding block decomposition; here, A, B, C, D are used as indices to indicate that we use the same subdivision of the matrix as when defining $A_{\lambda,n}$, $B_{\lambda,n}$, $C_{\lambda,n}$ and $D_{\lambda,n}$.
The action on the equivalence class of $X_{\lambda,n-1}$ gives the evolution of the pair. Transforming the matrix on the right-hand side into the form (2.5), we can read off the evolution equations (2.10). For more detailed calculations we use the expansions (2.11); from (1.1), (1.2), (2.6) and (2.7) one finds (2.12). We will first consider the Markov process $Z_{\lambda,n}$ and denote its starting point by $Z_0 = B_0 D_0^{-1}$.
Remark. Note that the estimates show that $T^C_{\lambda,n} Z_{\lambda,n-1} + T^D_{\lambda,n}$ is invertible. Using
$$D_{\lambda,n} = T^C_{\lambda,n} B_{\lambda,n-1} + T^D_{\lambda,n} D_{\lambda,n-1} = \bigl(T^C_{\lambda,n} Z_{\lambda,n-1} + T^D_{\lambda,n}\bigr) D_{\lambda,n-1},$$
it follows inductively that $D_{\lambda,n}$ is invertible as well. Hence, $X_{\lambda,n}$ and $Z_{\lambda,n}$ are always well defined for small λ under assumption (2.2).
Hence, under the assumptions of Theorem 1.1 they will be well defined up to $n = \lfloor T\lambda^{-2} \rfloor$ with probability going to one as λ → 0.
Lemma 2.3. Let $\mathbb{E}_{X,Z}$ denote the conditional expectation given that $X_{\lambda,n-1} = X$ and $Z_{\lambda,n-1} = Z$.
By induction, the claimed bound follows for small λ and $n < T\lambda^{-2}$. As all norms are equivalent, this finishes the proof.
3 Limit for the process $X_{\lambda,n}$

We need to split the $(d_0 + d_1) \times (d_0 + d_1)$ matrix $X_{\lambda,n}$ into two blocks; let $P_0$, $P_1$ denote the corresponding projections. Then, using (2.6), one finds
$$P_0 S = \Gamma_0 P_0, \quad P_1 S = P_1, \quad P_0 R^n = P_0, \quad P_1 R^n = U^n P_1. \qquad (3.1)$$
Proposition 3.1. There is a function K(T) such that the stated moment bound holds for all $n < T\lambda^{-2}$. In particular, the corresponding limit holds for any function $f(n) \in \mathbb{N}$ with $\lim_{n\to\infty} f(n) = \infty$. Multiplying (2.25) by $P_0$ from the left and $P_0^*$ from the right, taking expectations and using the bound (2.26) gives the one-step estimate, where the bound for the error term is uniform in n for $n < \lambda^{-2} T$. Induction yields the stated result.
In order to use Proposition A.1 we need to consider stopped processes. For any λ, let $T_K$ be the stopping time when $\|P_1 X_{\lambda,n}\|$ becomes larger than K. We define the stopped processes by
$$P_1 X^K_{\lambda,n} := P_1 X_{\lambda, T_K \wedge n}, \qquad P_0 X^K_{\lambda,n} := P_0 X_{\lambda,n} \cdot \mathbf{1}_{n \le T_K}, \qquad Z^K_{\lambda,n} := Z_{\lambda,n} \cdot \mathbf{1}_{n \le T_K},$$
where $\mathbf{1}_{n \le T_K} = 1$ for $n \le T_K$ and $\mathbf{1}_{n \le T_K} = 0$ for $n > T_K$.
(3.6) Therefore, using the functions $h, \hat h$ as defined in (1.10) and W as defined in (1.11), one finds the stated expansions, in which the error terms o(λ) and o(1) are uniform in the limit λ → 0. Note that $W = P_1 W_0 P_1^*$ with $W_0$ as in (2.23).
Letting $u = U^{-n} = (U^n)^*$, the terms $uWU^*u^*$, $uUh(u^\top M u)U^*u^*$ and $uUh(u^* M u)U^*u^*$ appear in (3.11), (3.12) and (3.13), respectively. On the compact abelian group $\mathcal{U}$ generated by the unitary U these functions can be averaged. Let $f(n) \in \mathbb{N}$ with $\lim_{n\to\infty} f(n) = \infty$ and $\lim_{n\to\infty} f(n) n^{-s} = 0$; then, by Propositions 3.1 and 3.2, we find for large enough K that
$$X^K_{1/\sqrt{n},\, f(n)} \Longrightarrow \begin{pmatrix} 0 & 0 \\ 0 & \mathbf{1}_{d_1} \end{pmatrix} X_0,$$
where we used a subdivision into blocks of sizes $d_0$ and $d_1$. For the sake of concreteness let us set $f(n) = \lfloor n^\alpha \rfloor$ with some $0 < \alpha < s$.
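The averaging along the group generated by U can be seen in a small numerical example: for a diagonal unitary with generic phases, the Cesàro means of $U^{-n} M U^{n}$ converge to the Haar average, which kills all off-diagonal entries. The phases and the matrix M below are made up for the illustration:

```python
import numpy as np

# phases whose differences are irrational multiples of 2*pi, so the closed
# group generated by U acts on each off-diagonal entry as a dense rotation
theta = np.array([0.0, 1.0, np.sqrt(2.0)])
M = np.arange(1, 10, dtype=complex).reshape(3, 3)

# (U^{-n} M U^{n})_{jk} = exp(-i n (theta_j - theta_k)) * M_{jk};
# average this phase factor over n = 0, ..., N-1
N = 200000
n = np.arange(N)
phase = (theta[:, None] - theta[None, :]).ravel()
factor = np.exp(-1j * n[:, None] * phase[None, :]).mean(axis=0).reshape(3, 3)
avg = factor * M

# the Haar average over the closure of {U^n} keeps only the diagonal of M
assert np.allclose(avg, np.diag(np.diag(M)), atol=1e-3)
```

The geometric-sum bound $|\frac1N\sum_{n<N} e^{-in\varphi}| \le 2/(N|1-e^{i\varphi}|)$ controls the rate of this convergence.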
From (3.5), together with Proposition 3.3, we see that for m → ∞ the stopped processes converge,
$$X^K_{1/\sqrt{m},\, \lfloor tm \rfloor} \Longrightarrow \begin{pmatrix} 0 & 0 \\ 0 & \Lambda^K_t \end{pmatrix} X_0,$$
uniformly for 0 < t < T, where $\Lambda^K_t = \Lambda_{t \wedge T_K}$ denotes the stopped version of the process $\Lambda_t$ as described in Theorem 1.1, with stopping time $T_K$ when $\|P_1 \Lambda_t X_0\|$ exceeds K. As we have this convergence for all such stopping times $T_K$, $\|P_1 \Lambda_t X_0\|$ is almost surely finite, and as the final time T was arbitrary, one obtains the claimed convergence.

In this section we will obtain the correlations between the SDE limits for different sizes of eigenvalues of $T_0$ and prove Theorem 1.5. We follow all the calculations above for the limit processes $X^{(c)}_t$ as in Corollary 3.3 (considering the matrices $\frac{1}{c} T_{\lambda,n}$ instead of $T_{\lambda,n}$) and define all the same objects as above with a superscript (c).
In particular, we define the random variables $Y^{(c)}_{\lambda,n}$ as in Section 1.3; the factor $\frac{1}{c}$ comes from the fact that we have to consider $\frac{1}{c} T_{\lambda,n}$ in the calculations above in order to define $Y^{(c)}_{\lambda,n}$. Similar to (3.12) and (3.13), one obtains analogous expansions; by (3.19), this leads to the stated expressions for the functions $g_{cc'}$, $\hat g_{cc'}$ as in (1.28) and (1.29). Using Proposition A.1, this gives the correlation between the limit processes $X^{(c)}_t$ and $X^{(c')}_t$ as described in Theorem 1.5.

The action on the flag manifold
In this subsection we prove Theorem 1.6; recall the notation introduced above. Let G(p, d) denote the Grassmannian manifold of p-dimensional subspaces of $\mathbb{C}^d$, and note that $F(p) \in G(p, d)$. As F can be seen as a submanifold of $\prod_{p=1}^{d} G(p, d)$, it will be sufficient to prove the convergence in G(p, d) jointly for any (fixed) p.
As the action of T and cT on F or G(p, d) is the same, we may for fixed p rescale the matrices such that $d_2 < p \le d_2 + d_1$ in the sense of the definitions of $d_1, d_2$ in Section 1.1. (Note that this basically means $c_j = 1$ for some j, $d_2 = d(c_1) + d(c_2) + \dots + d(c_{j-1})$ and $d_1 = d(c_j)$.) Using blocks of size $d_0 + d_1$ and $d_2$ and representing $[F_0] \in F$ as above, we find a representation in which $D_0$, $a_{00}$ and $a_{11}$ are invertible. Note that in fact $a_{11} = a_j$ for some j in the notation above, that $a_{00}$ contains the $a_k$ for $k > j$, and that $D_0$ contains the $a_k$ for $k < j$. So we can choose $X_0 = F_0$ and consider the processes $X_{\lambda,n}$ as above; then $F_{\lambda,n}$, in terms of representatives in G(p, d), is equivalent to the corresponding expression in $X_{\lambda,n}$. Note that by the proof of Theorem 1.1 the inverse $D^{-1}_{\lambda,n}$ exists for small λ (with sufficiently high probability) and therefore, as we consider invertible matrices here, $X_{\lambda,n}$ is also invertible. As $X_{1/\sqrt{n},\, \lfloor tn \rfloor}$ converges with $\Lambda_t$ invertible, we find for $n \sim \lambda^{-2}$ and large n that the block matrix formed from $\mathbf{1}_{d_0}$, $X_{\lambda,n}$ and $\mathbf{1}_{d_1}$ above is invertible. Hence, the right-hand side of (4.3) represents the same p-dimensional subspace, and by Theorem 1.1 we find (4.5). The last equation is easy to see if one realizes that the last p column vectors end somewhere inside the $a_{11}$ term and therefore indeed span the same p-dimensional subspace as $F_t$. Looking at this convergence jointly in p, we obtain the correlations as in Theorem 1.5.
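The mechanism behind (4.5) — a fixed matrix with gaps between its eigenvalue moduli pushes any generic p-plane toward its dominant invariant subspace — can be seen in a toy computation (diagonal toy matrix, no randomness):

```python
import numpy as np

T0 = np.diag([3.0, 2.0, 0.5, 0.1])   # eigenvalue moduli 3 > 2 > 0.5 > 0.1
p = 2
rng = np.random.default_rng(1)
F = np.linalg.qr(rng.standard_normal((4, p)))[0]   # random p-frame

for _ in range(40):
    F = np.linalg.qr(T0 @ F)[0]      # act by T0 and renormalize the representative

# the p-plane converges to the span of the top-p eigenvectors e_1, e_2
P = F @ F.conj().T                   # orthogonal projection onto span(F)
Ptop = np.diag([1.0, 1.0, 0.0, 0.0])
assert np.allclose(P, Ptop, atol=1e-8)
```

The error decays like $(0.5/2)^n$, the ratio across the relevant modulus gap, which is why the renormalized representative stabilizes so quickly.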

Application to random Schrödinger operators
In this section we show the results from Section 1.2 and use the notation introduced there.

Limit SDE
In this subsection we prove Proposition 1.4 and stick to the case σ = 1; the general case is the trivial modification where $V_1$ below is replaced by $\sigma V_1$. In order to describe the limit SDE, we split the potentials into the corresponding hyperbolic and elliptic blocks and recall the relevant definitions. Next, note that we chose the sign of $S_\Gamma$ so that $S_\Gamma > 0$ is a positive diagonal matrix. In the notation introduced in Section 1 and used for Theorem 1.1 we have $\Gamma_2 = \Gamma$ as in (1.26). Also recall from (1.26) that we defined the positive, Hermitian matrix whose average is taken with respect to dz, the Haar measure on $\mathcal{Z}$, the group of diagonal, unitary matrices generated by Z. In order to calculate the drift term, note that the mixed averages vanish: we have $|z_i z_j| = 1$ and $\operatorname{Im}(z_i) > 0$, $\operatorname{Im}(z_j) > 0$, which implies $z_i z_j \neq 1$ for any $i, j \in \{1, \dots, d_e\}$. Therefore, application of Theorem 1.1 gives the limit SDE. In order to express the covariances described by (1.12) in more detail, recall $Z = \operatorname{diag}(z_1, \dots, z_{d_e})$, $|z_j| = 1$, where $z_{jj}$ is the j-th diagonal entry of the diagonal matrix $z \in \mathcal{Z}$ and the $n_j$ are integers. This leads to the stated covariances and correlations between the Brownian motions. This shows part (i) of Proposition 1.4 for σ = 1; changing $V_1$ to $\sigma V_1$ immediately gives the general case. If $V_e$ is almost surely real, which is the case if $O^* V_1 O$ is almost surely real, then one has $C_t = A_t$ and $B_t = B_t^\top$, giving part (ii). Part (iii) of Proposition 1.4 follows from using the chaoticity assumption in the equations for the covariances.
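The vanishing of the mixed averages in the drift computation is an instance of the orthogonality of characters. A one-line version, with $\mathcal{Z}$ the closed group generated by Z as above:

```latex
\[
  \int_{\mathcal{Z}} \chi(z)\, dz
  \;=\; \chi(z_0) \int_{\mathcal{Z}} \chi(z)\, dz
  \qquad (z_0 \in \mathcal{Z}),
\]
by translation invariance of the Haar measure, so $\int_{\mathcal{Z}} \chi \, dz = 0$
as soon as $\chi(z_0) \neq 1$ for some $z_0 \in \mathcal{Z}$. Applied to the character
$\chi(z) = z_{ii}\, z_{jj}$ with $z_0 = Z$, where $\chi(Z) = z_i z_j \neq 1$ because
$\operatorname{Im}(z_i), \operatorname{Im}(z_j) > 0$, this gives
\[
  \int_{\mathcal{Z}} z_{ii}\, z_{jj}\, dz \;=\; 0,
  \qquad i, j \in \{1, \dots, d_e\}.
\]
```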

Limiting eigenvalue process
In this subsection we prove Theorem 1.3 and restrict, without loss of generality, to the case σ = 1. We will not need the precise form of the limit SDE, but it is important how this SDE is obtained; therefore we need to look at the matrix parts giving the Schur complement as in the proof of Theorem 1.1. Hence, using U and $T_{\varepsilon,\lambda,[1,n]}$ as above, let $X_{\varepsilon,\lambda,n} = R^{-n} T_{\varepsilon,\lambda,[1,n]} X_0$. (5.14) Using blocks of sizes $d_h + 2d_e$ and $d_h$, let
$$X_{\varepsilon,\lambda,n} = \begin{pmatrix} A_{\varepsilon,\lambda,n} & B_{\varepsilon,\lambda,n} \\ C_{\varepsilon,\lambda,n} & D_{\varepsilon,\lambda,n} \end{pmatrix}$$
and define the Schur complement $\widehat{X}_{\varepsilon,\lambda,n} = A_{\varepsilon,\lambda,n} - B_{\varepsilon,\lambda,n} D^{-1}_{\varepsilon,\lambda,n} C_{\varepsilon,\lambda,n}$.
(5.15) Then, by Theorem 1.1,
$$\widehat{X}_{\varepsilon,\, 1/\sqrt{n},\, \lfloor tn \rfloor} \Longrightarrow \begin{pmatrix} 0 & 0 \\ 0 & \Lambda_{\varepsilon,t} \end{pmatrix},$$
with the process $\Lambda_{\varepsilon,t} = \Lambda_{1,\varepsilon,t}$ as in Theorem 1.3 and in (1.25). Let us further define, for t > 0, the matrices $M_{\varepsilon,\lambda,n}$ and $\Theta^*_n$, $\Theta_0$. An energy $E + \lambda^2 \varepsilon$ is an eigenvalue of $H_{\lambda,n}$ precisely if there is a solution of the eigenvalue equation with $\psi_0 = 0$ and $\psi_{n+1} = 0$, i.e. if and only if the corresponding determinant vanishes. As $T^{E+\lambda^2\varepsilon}_{\lambda,[1,n]} = Q R^n X_{\varepsilon,\lambda,n} X_0^{-1} Q^{-1}$, this is equivalent to
$$\det\bigl(\Theta^*_n X_{\varepsilon,\lambda,n} \Theta_0 M_{\varepsilon,\lambda,n}\bigr) = 0. \qquad (5.21)$$
Along a subsequence $n_k$ of the positive integers where $Z_{n_k+1}$ converges to some $Z_*$, we find convergence in law, uniformly on compact sets in ε. As the determinant is a holomorphic function, the zero processes of the determinants also converge in law. This proves Theorem 1.3 for σ = 1; a proper rescaling immediately implies the result for general σ.
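The determinant condition (5.21) is a boundary-value reformulation of the eigenvalue equation. In the scalar case with Dirichlet conditions it reduces to a familiar statement, sketched here for a 1-D Schrödinger operator (not the matrix-valued setup of the theorem; the size n = 8 and the potential are made up):

```python
import numpy as np

def transfer_product(E, v):
    """Product of one-step transfer matrices for the eigenvalue equation
    -psi_{k+1} - psi_{k-1} + v_k psi_k = E psi_k."""
    T = np.eye(2)
    for vk in v:
        T = np.array([[vk - E, -1.0], [1.0, 0.0]]) @ T
    return T

rng = np.random.default_rng(3)
v = rng.uniform(-1, 1, size=8)
H = np.diag(v) - np.diag(np.ones(7), 1) - np.diag(np.ones(7), -1)

# E is a Dirichlet eigenvalue iff the (1,1) entry of the transfer product
# vanishes: the product applied to (psi_1, psi_0) = (1, 0) returns
# (psi_{n+1}, psi_n), and the boundary condition is psi_{n+1} = 0
for E in np.linalg.eigvalsh(H):
    assert abs(transfer_product(E, v)[0, 0]) < 1e-8
```

In the quasi-1-D case the scalar entry is replaced by the determinant of a matrix built from the transfer product and the boundary condition matrices, exactly as in (5.21).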

Limiting GOE statistics
In this subsection we prove Theorem 1.2 by reduction to the work in [VV1]. Without loss of generality we focus on energies E smaller than 0 and consider r = 1; the more general case needs some more care and notation in the subdivision into elliptic and hyperbolic channels, but the main calculations remain the same. We need to consider the SDE limit described above a bit more precisely for this particular Anderson model as in (1.19) with $A = Z_d$ and $V_n$ as in (1.17).
In Proposition 1.4, especially for the definitions of $V_h$, $V_e$ and $V_{he}$, it was assumed that A is diagonal. So in order to use these calculations we need to diagonalize $Z_d$ and see how this unitary transformation changes $V_n$.
Let E be such that Z is chaotic; then by Proposition 1.4 (iii) we need to consider the following covariances. Here, by $|O_i|^2$ we denote the vector $(|O_{k,i}|^2)_{k=1,\dots,d}$, and $\langle \cdot, \cdot \rangle$ denotes the scalar product. As stated in [VV1], one finds
$$(d+1)\,\bigl\langle |O_i|^2, |O_j|^2 \bigr\rangle = \begin{cases} 3/2 & \text{for } i = j \\ 1 & \text{for } i \neq j \end{cases}. \qquad (5.30)$$
Let us further calculate the drift contribution Q from the hyperbolic channels as introduced above.
Using chaoticity, it is not hard to see from (5.11) that Q is diagonal. Moreover, one has (5.31), and it follows that Q is a multiple of the identity matrix, more precisely $Q = q\,\mathbf{1}$ as in (5.32). Thus, using Proposition 1.4, we obtain SDE limits of the form $d\Lambda_{\sigma,\varepsilon,t} = S(\varepsilon - \sigma^2 q)\cdots$, where $A_t$ and $B_t$ are independent matrix Brownian motions, $A_t$ Hermitian and $B_t$ complex symmetric; cf. (5.35). Except for the additional drift $\sigma^2 q$, which can be seen as a shift in ε, this is the exact same SDE as it appears in [VV1]. In fact, the matrix S here corresponds to $iS_2$ as in [VV1], and the process there corresponds to the process above conjugated by $|S|^{1/2}$.
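The overlap identity (5.30) can be verified directly for the sine eigenvectors $O_{k,i} = \sqrt{2/(d+1)}\,\sin(\pi k i/(d+1))$ diagonalizing the discrete Laplacian on a path of d sites; the check below is for non-resonant index pairs (pairs with $2i = d+1$ or $i+j = d+1$ are left aside here):

```python
import numpy as np

d = 7
k = np.arange(1, d + 1)
# sine eigenvector matrix of the path-graph Laplacian on d sites
O = np.sqrt(2.0 / (d + 1)) * np.sin(np.pi * np.outer(k, k) / (d + 1))

# columns are orthonormal
assert np.allclose(O.T @ O, np.eye(d))

sq = O ** 2                        # the vectors |O_i|^2 as columns
g = (d + 1) * (sq.T @ sq)          # (d+1) <|O_i|^2, |O_j|^2>

i, j = 0, 1                        # i = 1, j = 2: 2i != d+1 and i+j != d+1
assert np.isclose(g[i, i], 1.5)    # 3/2 on the diagonal
assert np.isclose(g[i, j], 1.0)    # 1 off the diagonal
```

The exact values come from the averages $\overline{\sin^4} = 3/8$ and $\overline{\sin^2 \sin^2} = 1/4$ over a full period.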
Thus, from here on, the proof of the $\mathrm{Sine}_1$ kernel and of the GOE statistics follows precisely the arguments in [VV1].

A A limit theorem for Markov processes
We need the following variation of Proposition 23 in [VV2].
as well as a sequence of "good" subsets $G_m$ of $\mathbb{R}^d$. Let $Y^m_n(x)$ be distributed as the increment $X^m_{n+1} - x$ given $X^m_n = x \in G_m$. Let $d' \le d$, and let $\tilde{x}$ denote the first $d'$ coordinates of x; these are the coordinates that will be relevant in the limit. Also let $\tilde{b}^m$ denote the first $d'$ coordinates of $b^m$, let $\tilde{a}^m$ be the upper left $d' \times d'$ block of $a^m$, and assume that $\mathbb{P}(X^m_n \in G_m \text{ for all } n \ge f(m)) \to 1$. Then $(\tilde{X}^m_{\lfloor mt \rfloor},\, 0 < t \le T)$ converges in law to the unique solution of the SDE $d\tilde{X} = \tilde{b}\, dt + \tilde{a}\, dB$, $\tilde{X}(0) = \tilde{X}_0$.
Proof. This is essentially Proposition 23 in [VV2]. The first difference is that the coordinates $d'+1, \dots, d$ of the $X^m$ do not appear in the limiting process. A careful examination of the proof of that proposition shows that it is not necessary to assume that all coordinates appear in the limit, as long as the auxiliary coordinates do not influence the variance and drift asymptotics.
The second difference is the introduction of the "good" set G m , possibly a proper subset of R d .
Since we assume that the processes $X^m$ stay in $G_m$ with probability tending to one, we can apply Proposition 23 of [VV2] to $X^m$ stopped when it leaves this set. Then the probability that the stopped process differs from the original tends to zero, completing the proof.
The third difference is the weak convergence of $\tilde{X}^m_{f(m)}$ instead of $\tilde{X}^m_0$, and that we have the bound in (A.2) only for $n \ge f(m)$. Note that for the Markov family $\bar{X}^m_\ell := X^m_{\max(\ell, f(m))}$ all the same conditions apply with f(m) = 0, and the initial conditions converge weakly. Moreover, for any fixed t > 0 and m large enough one has $\bar{X}^m_{\lfloor mt \rfloor} = X^m_{\lfloor mt \rfloor}$.
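The drift and variance conditions of Proposition A.1 can be illustrated in a minimal linear example: a chain with drift $b(x) = -x$ and diffusion $a = 1$, whose mean and variance at time $\lfloor mt \rfloor$ converge to those of the Ornstein-Uhlenbeck solution. The example uses exact moment recursions rather than simulation and is an illustration only, not part of the proof:

```python
import numpy as np

def chain_moments(m, x0=1.0, t=1.0):
    """Exact mean and variance of X^m_{floor(mt)} for the linear chain
    X_{n+1} = X_n - X_n/m + xi_n/sqrt(m) with xi_n i.i.d. N(0,1)."""
    mean, var = x0, 0.0
    for _ in range(int(m * t)):
        mean *= 1.0 - 1.0 / m                       # drift b(x) = -x
        var = (1.0 - 1.0 / m) ** 2 * var + 1.0 / m  # diffusion a = 1
    return mean, var

# Ornstein-Uhlenbeck limit dX = -X dt + dB at time t = 1, X(0) = 1:
ou_mean = np.exp(-1.0)
ou_var = (1.0 - np.exp(-2.0)) / 2.0

mean, var = chain_moments(m=100000)
assert abs(mean - ou_mean) < 1e-4
assert abs(var - ou_var) < 1e-4
```

Here the one-step drift is $b/m$ and the one-step variance is $a^2/m$, exactly the normalization in which the proposition produces the SDE $dX = b\,dt + a\,dB$.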