Band edge localization beyond regular Floquet eigenvalues

We prove that Anderson localization near band edges of ergodic random Schr\"odinger operators with periodic background potential in $L^2(\mathbb{R}^d)$ in dimension two and larger is universal. By this we mean that Anderson localization holds without extra assumptions on the random variables and independently of regularity or degeneracy of the Floquet eigenvalues of the background operator. Our approach is based on an initial scale estimate the proof of which avoids Floquet theory altogether and uses instead an interplay between quantitative unique continuation and large deviation estimates. Furthermore, our reasoning is sufficiently flexible to prove this initial scale estimate in a non-ergodic setting, which promises to be an ingredient for understanding band edge localization also in these situations.


Introduction and results
The Anderson model dates back to the work of Anderson in 1958 [And58] in condensed matter physics, who argued that the presence of disorder will drastically change the dynamics of electrons in a solid. This has triggered a huge research activity in mathematics and physics during the past 60 years. We refer to the monographs [PF92,Sto01,Ves08,AW15] for an overview of the mathematics literature. While Anderson's original work was on a lattice model, analogous phenomena have since been studied for continuum Schrödinger operators. The prototypical model investigated in this context is the ergodic Alloy-type or continuum Anderson model
\[ H^{\mathrm{erg}}_\omega = -\Delta + V_{\mathrm{per}} + \sum_{j \in \mathbb{Z}^d} \omega_j u(\cdot - j) \]
in $L^2(\mathbb{R}^d)$, where $V_{\mathrm{per}}$ is a $\mathbb{Z}^d$-periodic potential, $\omega = (\omega_j)_{j \in \mathbb{Z}^d}$ is a sequence of independent and identically distributed random variables with bounded density, and $u$ is a bump function modelling the effective potential around a single atom. Under mild assumptions, this random family of self-adjoint operators has almost sure spectrum, which means that there exists a set $\Sigma \subset \mathbb{R}$ such that for almost every realization of $\omega$ the random operator $H^{\mathrm{erg}}_\omega$ has spectrum $\Sigma$. The general philosophy is that randomness leads to Anderson localization, at least in a neighbourhood of the edges of $\Sigma$, or, in dimension one, on the whole of $\Sigma$. Anderson localization in an interval $I \subset \Sigma$ means that the spectrum of $H^{\mathrm{erg}}_\omega$ within $I$ is almost surely only of pure point type with exponentially decaying eigenfunctions. This is in stark contrast to the background operator $H_{\mathrm{per}}$, which has only absolutely continuous spectrum and no eigenvalues. There also exist stronger notions of localization such as dynamical localization, see, e.g., [Kle08] for an overview. One standard method to prove localization is the so-called multi-scale analysis, and typically, if one can run the multi-scale analysis and prove Anderson localization, then dynamical localization follows with only minor modifications, see [GK01] for details.
For the sake of simplicity we will content ourselves with Anderson localization here, although dynamical localization will hold as well.
The edge of $\Sigma$ where localization is most tangible is the bottom of the spectrum. More challenging is the situation where the almost sure spectrum $\Sigma$ has a band structure. The spectrum of operators of the form $-\Delta + V_{\mathrm{per}}$ typically has a band structure, as can be seen by Floquet theory, see e.g. [Kuc16] for an overview. When an ergodic random potential $V_\omega$ is added, the almost sure spectrum $\Sigma$ of $H^{\mathrm{erg}}_\omega$ will inherit this band structure, provided that $V_\omega$ is small enough that not all spectral gaps close. It is near these band edges of $\Sigma$ that we prove Anderson localization.
In dimension d = 1, randomness will immediately lead to full localization on the whole spectrum [GMP77]. In dimensions two and larger localization in a neighbourhood of the bottom of the spectrum is proved in different models [HM84,KLNS12,GHK07,Kle13,NTTV18]. These results were essentially based on so-called Lifshitz tails at the bottom of the spectrum, which imply an initial scale estimate (ISE), a major ingredient for the above mentioned multi-scale analysis. At internal band edges, however, the validity of such Lifshitz tails on a general scope is a more complicated issue.
Localization at internal band edges was intensively studied in the second half of the 1990s. First results, among them [BCH97] by Barbaroux, Combes, and Hislop, additionally required a power-like decay of the distribution of the random variables $\omega_j$ near their extreme values, for which however no physical justification is given. It rather seems to be a technical assumption necessary for the proposed proof, which avoids using Lifshitz tails and instead works with simpler classical product probabilities. Kirsch, Stollmann, and Stolz [KSS98] study a similar setting but use different techniques.
The fundamental task of understanding Lifshitz tails at internal band edges was approached by Klopp [Klo99]. Lifshitz tails at $E_0$ mean that the integrated density of states $N(\cdot)$ of $H^{\mathrm{erg}}_\omega$ satisfies
\[ \lim_{E \searrow E_0} \frac{\log \lvert \log ( N(E) - N(E_0) ) \rvert}{\log (E - E_0)} = - \frac{d}{2}. \tag{1.1} \]
Klopp proved that Lifshitz tails occur for the random operator $H^{\mathrm{erg}}_\omega$ if and only if the background operator $H_{\mathrm{per}}$ has regular Floquet eigenvalues near these edges, which means that these edges are generated by a quadratic extremum of an eigenvalue curve in the so-called dispersion relation. This implies an initial scale estimate that, together with a Wegner estimate, can then be used to start the multi-scale analysis and prove localization [Ves02]. Thus, there are two natural questions: Firstly, is it possible that Floquet eigenvalues of $H_{\mathrm{per}}$ are not regular? Secondly, what does the integrated density of states of $H^{\mathrm{erg}}_\omega$ look like if $H_{\mathrm{per}}$ exhibits a non-quadratic Floquet minimum?
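To make the double-logarithmic meaning of a Lifshitz exponent concrete, the following minimal sketch evaluates the ratio $\log\lvert\log n(\varepsilon)\rvert / \log \varepsilon$ for a toy profile $n(\varepsilon) = C \varepsilon^{p} \exp(-\varepsilon^{-d/2})$, where $\varepsilon$ plays the role of $E - E_0$. The profile, the constants `C` and `p`, and the function name are illustrative assumptions and are not taken from any of the cited works; the point is only that polynomial prefactors do not affect the limit $-d/2$.

```python
import math

def lifshitz_ratio(d: int, eps: float, C: float = 3.0, p: float = 2.0) -> float:
    """Return log|log n(eps)| / log(eps) for the toy profile
    n(eps) = C * eps**p * exp(-eps**(-d/2)).

    We work with log n directly, since n itself underflows for small eps.
    """
    log_n = math.log(C) + p * math.log(eps) - eps ** (-d / 2)
    return math.log(abs(log_n)) / math.log(eps)

if __name__ == "__main__":
    for d in (2, 3):
        for eps in (1e-1, 1e-2, 1e-4):
            print(d, eps, lifshitz_ratio(d, eps))
```

As $\varepsilon$ decreases, the printed ratios approach $-d/2$; the prefactor $C \varepsilon^{p}$ only contributes a lower-order correction.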
The first question is answered in dimension one and in dimension two for "small" potentials by [CdV91], where it is proved that regularity of Floquet eigenvalues is generic (it occurs in a precise sense for almost all choices of the potential $V_{\mathrm{per}}$, but there are exceptional cases where it does not). In higher dimensions there are partial results, e.g. [KR00], which states that potentials for which band edges are attained by a single Floquet eigenvalue are generic, but the question whether regular Floquet eigenvalues are generic in all dimensions is still open [Kuc16, Conjecture 5.25].
The second question was studied by Klopp and Wolff [KW02]. Therein it is shown in two space dimensions that even if a proper Lifshitz tail does not occur for the integrated density of states for the random operator, a weaker version of (1.1) with −d/2 replaced by −α for some α > 0 always holds. Such an asymptotic still implies an initial scale estimate and thus localization, see Theorems 0.3 and 0.4 in [KW02]. Consequently, this suggested a strategy for proving universality of band edge localization as it does not seem impossible to extend [KW02] to arbitrary dimensions. On the other hand, the reasoning in [KW02] is explicitly two-dimensional and it relies on tools from analytic geometry such as the Newton diagram, which would at least introduce additional technical complications in higher dimensions. We are not aware of any progress made towards universality of band edge localization following this strategy since the early 2000s.
Our main result of this note, Theorem 1.1, proves that in dimension d ≥ 2, band edge localization always occurs, independently of the regularity or degeneracy of the Floquet eigenvalues and of Lifshitz tails. Recall that d ≥ 2 is not a restriction here since in dimension d = 1 one anyway has the stronger full localization. Our main novelty is a robust initial scale estimate the proof of which does not rely on Floquet theory and makes no use of periodicity. Instead, the recent scale-free quantitative unique continuation principle (UCP) for spectral projectors [NTTV18] is used in combination with the observation that certain favourable configurations of the random potential have overwhelming probability. Quantitative unique continuation is a technique which has been introduced to the random Schrödinger operators community in [BK05], and has since found various applications in the theory of random Schrödinger operators [BK13, RMV13, Kle13, NTTV18, TT18, Täu18, DGM19, Geb19].
Freed from the burdens of periodicity and ergodicity, we can even state our initial scale estimate in a more general context of non-ergodic Schrödinger operators in Theorem 2.2. There has recently been some activity on localization for non-ergodic operators [RM12,RMV13,Kle13]. The existence of almost sure band edges for such operators is a somewhat tricky business. In our context this is bypassed by Hypothesis (H3') below, see also Remark 2.1.
The initial scale estimate of Theorem 2.2 might be used as an induction anchor for the multi-scale analysis for non-ergodic operators, but one would have to combine this with considerations on the multi-scale analysis in the non-ergodic setting and on almost sure statements on the spectrum. This is a subject for future investigations. We sketch however in Examples 2.4 and 2.5 some situations of non-ergodic random Schrödinger operators where we think that the initial scale estimate is useful.
The paper is organized as follows: In Section 1.1, the notation and the ergodic model are introduced whereas Section 1.2 presents the main result, Theorem 1.1, on band edge localization. After that, Section 2 presents Theorem 2.2, the initial scale estimate in the non-ergodic setting. Finally, Section 3 is devoted to the proof of Theorem 2.2.
1.1. The model. We always work in dimension $d \geq 2$. For $L > 0$ and $x \in \mathbb{R}^d$, we denote by $\Lambda_L(x)$ the open hypercube in $\mathbb{R}^d$ of side length $L$, centered at $x$. If $x = 0$, we simply write $\Lambda_L$. Similarly, we denote by $B_\delta(x)$ the open ball of radius $\delta$, centered at $x$, and if $x = 0$ we just write $B_\delta$. Given a measurable subset $A \subset \mathbb{R}^d$ we write $\chi_A$ for the characteristic function of this set.
We consider a Z d -ergodic random Schrödinger operator H erg exponentially decaying in local L p norms, and with a uniform lower bound on some open subset. More precisely, there are c, δ, δ 1 , δ 2 > 0 such that The random variables ω j are independent and identically distributed on a probability space (Ω, P) with bounded density and support equal to the in- The reason why we assume the potential V per in (H1) to be bounded is that this is a requirement of the quantitative unique continuation principle for spectral projectors [NTTV18], a major ingredient in the proof. There have been recent works removing this boundedness assumption [KT16b,KT16a], but since this is not the main focus of this work, we refrain from pursuing this path further here. Apart from that, the above model is essentially the one discussed in [Ves02].  To the best of our knowledge, Theorem 1.1 provides the first proof of Anderson localization near band edges in the continuum without additional assumptions. In particular, it does not require regularity of Floquet eigenvalues as in [Ves02].
The core of the proof of Theorem 1.1 is a so-called initial scale estimate, which in the situation of the theorem takes the following form (see also Corollary 2.3 below): For all $q > 0$ and $\alpha \in (0, 1)$ there exists $L_0 \in \mathbb{N}$ such that for all $L \in \mathbb{N}$ with $L \geq L_0$ we have where $H^{\mathrm{erg}}_{\omega,L}$ denotes the restriction of $H^{\mathrm{erg}}_\omega$ onto $L^2(\Lambda_L)$ with periodic boundary conditions. Theorem 1.1 then follows from (1.2) and the Wegner estimate via the standard multi-scale analysis. This line of argument is spelled out explicitly in [Ves02]. Therefore, we omit it and just content ourselves with the proof of the initial scale estimate (1.2). Since its proof does not rely on periodicity or ergodicity, we prove a more general initial scale estimate for not necessarily ergodic random Schrödinger operators $H_\omega = H_0 + V_\omega$ in Theorem 2.2 below.
Let us conclude this section by briefly explaining the main idea of the proof of the initial scale estimate (1.2); the full proof in the more general non-ergodic situation can be found in Section 3 below. The quantitative unique continuation principle implies that eigenvalues of $-\Delta + V$ will move up by some positive amount if the potential $V$ is increased by some $c > 0$ on a (in general disconnected) subset $U$ with typical distance $l$ between its components. However, the price to pay is that the lifting is very small and scales unfavourably with increasing $l$: the eigenvalues will be lifted by an amount proportional to $c \exp(-l^{4/3+\varepsilon})$. On the one hand, this seems to be too weak for the polynomial bound in (1.2). On the other hand, by large deviation arguments, favourable configurations of the random potential appear with overwhelming probability for large $l$: the random potential will be larger than $c$ on a set $U$ with typical distance $l$ between its components with probability $1 - \exp(-l^d)$. Since $\exp(-l^{4/3+\varepsilon})$ decays more slowly than $\exp(-l^d)$ in dimensions $d \geq 2$, we can trade the large deviations bound against the meager eigenvalue lifting from unique continuation and conclude the statement.
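The trade-off described above can be checked numerically. The following sketch is a heuristic illustration under stated assumptions, not the argument of Section 3: it uses the exponent $7/5$ that appears in the lifting bound of Lemma 3.2, chooses the (real-valued, unrounded) scale $l = (\alpha \log L)^{5/7}$ so that $\exp(-l^{7/5}) = L^{-\alpha}$, ignores all constant prefactors, and bounds the failure probability by a union bound $(L/l)^d (1-\kappa)^{l^d}$ over boxes. All function names are hypothetical. Since the relevant $L$ are astronomically large, everything is computed in terms of $\log L$.

```python
import math

def log_failure_bound(log_L: float, alpha: float, kappa: float, d: int) -> float:
    """Logarithm of (L/l)^d * (1 - kappa)^(l^d) for l = (alpha * log L)^(5/7).

    The scale l is chosen such that exp(-l^(7/5)) = L^(-alpha);
    integer rounding of l and constant prefactors are ignored (assumption).
    """
    l = (alpha * log_L) ** (5 / 7)
    log_boxes = d * (log_L - math.log(l))   # log of the number of boxes (L/l)^d
    return log_boxes + (l ** d) * math.log(1 - kappa)

def beats_polynomial(log_L: float, alpha: float, kappa: float, d: int, q: float) -> bool:
    """True if the failure bound is at most L^(-q)."""
    return log_failure_bound(log_L, alpha, kappa, d) <= -q * log_L

if __name__ == "__main__":
    for d in (1, 2, 3):
        for log_L in (50.0, 1e3, 1e6):
            print(d, log_L, beats_polynomial(log_L, alpha=0.5, kappa=0.5, d=d, q=1.0))
```

In this toy computation the polynomial bound is eventually met for $d \geq 2$, because $l^d$ grows like $(\log L)^{5d/7}$ with $5d/7 > 1$, whereas for $d = 1$ the exponent $5/7 < 1$ is too small and the bound fails at every scale.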

An initial scale estimate for non-ergodic random Schrödinger operators
For a random operator $H_\omega = H_0 + V_\omega$, $L > 0$, and $x \in \mathbb{R}^d$, we denote by $H_{\omega,L,x}$ the restriction of $H_\omega$ to $L^2(\Lambda_L(x))$ with a fixed choice of either Dirichlet, Neumann, or periodic boundary conditions. We formulate the following hypotheses:
(H1') The background operator is of the form $H_0 = -\Delta + V_0$, where $V_0$ is a bounded and real-valued potential.
(H2') The random potential is of the form $V_\omega = \sum_{j \in (G\mathbb{Z})^d} \omega_j u_j$, $G > 0$. Here, $(u_j)_{j \in (G\mathbb{Z})^d}$ is a family of functions in $L^p$ with $p > d/2$. In addition, there are $c, \delta_1, \delta_2 > 0$ and $\delta \in (0, G/2]$ such that for every $j \in (G\mathbb{Z})^d$ there exists Moreover, the random variables $\omega_j$ are independent on a probability space $(\Omega, \mathbb{P})$, with values contained in the interval $[0, 1]$ almost surely, and there are $\eta, \kappa > 0$ such that $\mathbb{P}[\omega_j \geq \eta] \geq \kappa$ for all $j \in (G\mathbb{Z})^d$. For notational purposes we define
(2) It is easy to see that due to the exponential decay in (2.1) the potential $W$ is locally $L^p$ with a uniform bound on the $L^p$-norm on cubes of side length one. Since $p > d/2$ and $d \geq 2$, this guarantees that $W$ (and consequently also $V_\omega$ almost surely) belongs to the Kato class in $\mathbb{R}^d$ and is thus infinitesimally form bounded with respect to $H_0$, see, e.g., [CFKS87, Section 1.2].
(3) Hypothesis (H3') implies that for every $(L, x) \in M_b$ the number of eigenvalues of $H_{\omega,L,x}$ below $b$ is almost surely constant, whence the infimum of the spectrum of $H_{\omega,L,x}$ in $[b, \infty)$ can experience no "jumps" when the random potential is varied; see the proof of Lemma 3.2 below for more details.
The following theorem is the core of the present paper:
Theorem 2.2 (ISE for non-ergodic random Schrödinger operators). Assume (H1'), (H2'), and (H3'). Then, for all $q > 0$ and $\alpha \in (0, 1)$ there exists $L_0 \in G\mathbb{N}$ such that for all (L,
It is clear that (H1)-(H2) are a particular case of (H1')-(H2') with $V_0 = V_{\mathrm{per}}$ and $u_j = u(\cdot - j)$. We also note that Hypotheses (H1')-(H2') define a generalization of the crooked Anderson model and the Delone model, for which Wegner estimates are known [Kle13,RMV13]. Let us comment on the connection between the gap conditions (H3) and (H3'): For $\mathbb{Z}^d$-periodic operators $H_{\mathrm{per}}$, it is a consequence of Floquet theory that periodic restrictions of $H_{\mathrm{per}}$ to boxes $\Lambda_L$ of integer side length will respect any gap $(a', b') \subset \rho(H_{\mathrm{per}})$ of the full-space operator, i.e. they have no eigenvalues in $(a', b')$. We also note that if the potential is $\mathbb{Z}^d$-periodic and symmetric under all coordinate reflections, then Dirichlet and Neumann restrictions to boxes $\Lambda_L$ will respect such gaps as well. This can be seen by extending Neumann and Dirichlet eigenfunctions by symmetric or antisymmetric reflections, respectively, to a box of side length $2L$, on which they must satisfy periodic boundary conditions and, thus, respect the gap. As a consequence, (H3) will imply (H3'). In fact, if we assume (H3), the function $W$ in (H3') will be $\mathbb{Z}^d$-periodic and the almost sure spectrum of $H^{\mathrm{erg}}_\omega$ will be equal to $\Sigma = \bigcup_{t \in [0,1]} \sigma(-\Delta + V_{\mathrm{per}} + tW)$, see, e.g., [Sto01, Lemma 1.4.1]. By periodicity, finite volume restrictions of these operators onto boxes of integer side length will respect the gap $(a, b)$. In summary, we have seen that (1.2) is a particular case of Theorem 2.2, and we obtain the following corollary:
Corollary 2.3 (Ergodic ISE at band edges). Assume (H1), (H2), and (H3).
Then for all $q > 0$ and $\alpha \in (0, 1)$ there exists $L_0 \in \mathbb{N}$ such that for all $L \in \mathbb{N}$ with $L \geq L_0$ we have
We are not going to engage in the multi-scale analysis in the non-ergodic setting here, but we nevertheless sketch some non-ergodic situations where Theorem 2.2 might be useful:
Example 2.4. Let the background operator $H_0 = H_{\mathrm{per}} + V_1$ consist of a periodic operator $H_{\mathrm{per}}$ with a non-negative, bounded, and fast decaying perturbation $V_1$. Let $H_{\mathrm{per}}$ have a gap $(a', b')$ and assume that the potential $W$, which is an almost sure upper bound on the random potential $V_\omega$, is small. Then, the system is no longer ergodic, but for almost all $j \in (G\mathbb{Z})^d$ and for all $L$ such that the bulk of $V_1$ is not contained in $\Lambda_L(j)$, some interval $(\tilde{a}, b') \subset (a', b')$ is in the resolvent set of $H_{\omega,L,x}$. Therefore, the initial scale estimate from Theorem 2.2 holds. This model is reasonably close to the ergodic case and (assuming that all random variables $\omega_j$ have support near $0$) it is relatively easy to see that the random operator will have well-defined lower band edges. Thus, one expects to find Anderson localization near these band edges.
We note however that an alternative approach in this situation can be found in [DGH + 18], where it is proved that localization is robust under perturbations by a fast decaying potential.
Our second example concerns the situation where the background potential is periodic and the random potential is ergodic, but the two lattices are incommensurate:
Example 2.5. Let $G_1, G_2 > 0$ be incommensurate (i.e. $G_1/G_2 \notin \mathbb{Q}$), let the background operator $H_0$ be $(G_1\mathbb{Z})^d$-periodic, and let the random potential $V_\omega$ be $(G_2\mathbb{Z})^d$-ergodic. Then the random operator $H_0 + V_\omega$ is no longer ergodic and arguments using Floquet theory break down. However, assuming that $H_0$ has a spectral gap, that $V_\omega$ is sufficiently small, and that the support of the random variables $\omega_j$ contains $0$, a lower band edge will persist, and Theorem 2.2 might be used to prove Anderson localization in its neighbourhood via the multi-scale analysis.

Proof of Theorem 2.2
By scaling, we may assume $G = 1$. Furthermore, for notational convenience, we assume that $\mathbb{N} \times \{0\} \subset M_b$ and therefore only prove the statement for $x = 0$ and sufficiently large $L$, writing $H_{\omega,L}$ and $H_{0,L}$ instead of $H_{\omega,L,x}$ and $H_{0,L,x}$.
The proof of Theorem 2.2 relies on the scale-free quantitative unique continuation principle [NTTV] given in Proposition 3.1 below. We start by introducing some notation: Given $l > 0$ and $\delta \in (0, l/2)$, a sequence $Z = (y_j)_{j \in (l\mathbb{Z})^d}$ in $\mathbb{R}^d$ is called $(l, \delta)$-equidistributed if $B_\delta(y_j) \subset \Lambda_l(j)$ for all $j \in (l\mathbb{Z})^d$. If $Z$ is $(l, \delta)$-equidistributed and $L > 0$, we write
\[ S_{Z,L} := \bigcup_{j \in (l\mathbb{Z})^d} B_\delta(y_j) \cap \Lambda_L . \]
Proposition 3.1 (Scale-free unique continuation principle [NTTV18,NTTV]). Let $V \in L^\infty(\mathbb{R}^d)$, $l \in \mathbb{N}_{\mathrm{odd}}$, $L \in \mathbb{N}$ with $l \leq L$, and $\delta \in (0, l/2)$. Denote the restriction of $-\Delta + V$ to $\Lambda_L$ with Dirichlet, Neumann, or periodic boundary conditions by $H_L$. Let $Z$ be an $(l, \delta)$-equidistributed sequence. Then, for every $E \in \mathbb{R}$ and every $\varphi \in \operatorname{Ran}(\chi_{(-\infty,E]}(H_L))$ we have where $N > 0$ is a constant that only depends on the dimension.
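The definition of an $(l, \delta)$-equidistributed sequence can be tested mechanically on a finite window of the lattice. The sketch below is only an illustration with hypothetical helper names: it uses that an open ball $B_\delta(y)$ is contained in the open cube $\Lambda_l(j)$ exactly when $\lvert y_i - j_i \rvert + \delta \leq l/2$ in every coordinate $i$, and it checks this for finitely many sites, whereas the definition in the text concerns the whole lattice $(l\mathbb{Z})^d$.

```python
from itertools import product

def ball_in_cube(y, j, l, delta):
    """Check B_delta(y) ⊂ Lambda_l(j) for the open ball and the open cube."""
    return all(abs(yi - ji) + delta <= l / 2 for yi, ji in zip(y, j))

def is_equidistributed(points, l, delta):
    """points maps lattice sites j in (lZ)^d (a finite window) to the points y_j."""
    return all(ball_in_cube(y, j, l, delta) for j, y in points.items())

def shifted_lattice(l, d, n, shift):
    """Hypothetical example family: y_j = j + shift for j = l*m, m in {-n,...,n}^d."""
    sites = (tuple(l * mi for mi in m) for m in product(range(-n, n + 1), repeat=d))
    return {j: tuple(ji + si for ji, si in zip(j, shift)) for j in sites}

if __name__ == "__main__":
    good = shifted_lattice(l=3, d=2, n=1, shift=(1.0, -1.0))
    bad = shifted_lattice(l=3, d=2, n=1, shift=(1.2, 0.0))
    print(is_equidistributed(good, l=3, delta=0.5))
    print(is_equidistributed(bad, l=3, delta=0.5))
```

In the first family each ball of radius $0.5$ just fits inside its cube of side length $3$, while in the second family the shift $1.2$ pushes the balls across the cube boundary.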
In the situation of Theorem 2.2, let $J(\omega) := \{k \in \mathbb{Z}^d : \omega_k \geq \eta\} \subset \mathbb{Z}^d$ for $\omega \in \Omega$ and consider for $l \leq L$ the event The main idea is that if $\omega \in A_{l,L}$ then we can pick $k_j \in \Lambda_l(j) \cap J(\omega)$, where $j$ runs over $(l\mathbb{Z})^d \cap \Lambda_{2L}$, such that the corresponding points $x_{k_j}$ from Hypothesis (H2') are part of an $(l, \delta)$-equidistributed sequence. The scale-free unique continuation principle then implies the following eigenvalue lifting estimate:
Lemma 3.2. There is $l_0 \in \mathbb{N}$, depending only on $\delta$, $b$, $d$, $c$, $\eta$, and $\|V_0\|_\infty$, such that for all $L \in \mathbb{N}$ and $l \in \mathbb{N}_{\mathrm{odd}}$ with $L \geq l \geq l_0$ and all $\omega \in A_{l,L}$ we have there is nothing to prove. So, from now on assume that Since $H_{0,L} \leq H_{0,L} + \eta c \chi_{S_{Z,L}} \leq H_{\omega,L} \leq H_{0,L} + W_L$, the minimax principle for eigenvalues implies that for every $k \in \mathbb{N}$ the $k$-th eigenvalues, counted from the bottom of the spectrum, satisfy (3.5) that no eigenvalues can enter $(a, b)$ when passing from $H_{0,L} + \eta c \chi_{S_{Z,L}}$ to $H_{\omega,L}$ either. In particular, there exists $k_0 \in \mathbb{N}$ such that $\lambda_{k_0}(H_{0,L})$, $\lambda_{k_0}(H_{0,L} + \eta c \chi_{S_{Z,L}})$, and $\lambda_{k_0}(H_{\omega,L})$ denote the lowest eigenvalue in $[b, \infty)$ of the respective operators. Therefore, it suffices to prove that
\[ \lambda_{k_0}(H_{0,L} + \eta c \chi_{S_{Z,L}}) \geq \lambda_{k_0}(H_{0,L}) + \eta c \exp(-l^{7/5}). \tag{3.6} \]
We are ready to prove Theorem 2.2.
From Lemma 3.2 we deduce It remains to give a lower bound on $\mathbb{P}[A_{l,L}]$. To this end, note that since $l \in \mathbb{N}_{\mathrm{odd}}$, we have for each $j \in \Lambda_L \cap (l\mathbb{Z})^d$ that $\#(\Lambda_l(j) \cap \mathbb{Z}^d) = l^d$. Thus which, in view of (3.8), proves the claim.
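The two elementary ingredients of this last step can be illustrated with a short computation. The sketch below rests on an assumed reading of the event $A_{l,L}$, namely that each box $\Lambda_l(j)$ in a finite family of `n_boxes` boxes contains at least one site $k \in \mathbb{Z}^d$ with $\omega_k \geq \eta$; all function names are hypothetical. It first verifies the count $\#(\Lambda_l(0) \cap \mathbb{Z}^d) = l^d$ for odd $l$ by enumeration, and then compares the union bound $1 - n_{\mathrm{boxes}} (1-\kappa)^{l^d}$ with the exact probability in the special case where the sites are independent and $\mathbb{P}[\omega_k \geq \eta] = \kappa$ holds with equality.

```python
from itertools import product

def lattice_points_in_cube(l: int, d: int) -> int:
    """Count the points of Z^d in the open cube Lambda_l(0) of side length l."""
    half = l / 2
    candidates = product(range(-l, l + 1), repeat=d)
    return sum(1 for k in candidates if all(abs(ki) < half for ki in k))

def union_bound(n_boxes: int, l: int, d: int, kappa: float) -> float:
    """Lower bound 1 - n_boxes * (1 - kappa)^(l^d) for the assumed event."""
    return 1 - n_boxes * (1 - kappa) ** (l ** d)

def exact_probability(n_boxes: int, l: int, d: int, kappa: float) -> float:
    """Exact value for disjoint boxes with l^d independent sites each,
    assuming P[omega_k >= eta] = kappa with equality (illustrative assumption)."""
    return (1 - (1 - kappa) ** (l ** d)) ** n_boxes

if __name__ == "__main__":
    print(lattice_points_in_cube(3, 2), lattice_points_in_cube(4, 2))
    print(union_bound(100, 3, 2, 0.5), exact_probability(100, 3, 2, 0.5))
```

For even $l$ the open cube contains only $(l-1)^d$ lattice points, which is one way to see why odd $l$ is convenient here. The exact probability always dominates the union bound, by Bernoulli's inequality.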
Remark 3.3. The proof of Theorem 2.2 merely relies on the fact that configurations for which the potential is larger than $\eta c$ on an $(l, \delta)$-equidistributed set within $\Lambda_L$ have overwhelming probability. Therefore, the proof transfers verbatim to other models which share this feature, such as the random breather model [TV15,SV17,NTTV18].