1 Introduction and main results

In this paper, we refine the compactness theory for gradient Ricci shrinkers in general dimensions. A smooth, connected, complete, n-dimensional Riemannian manifold \((M^n,g)\) is called a gradient Ricci shrinker if there exists a function \(f:M\rightarrow {\mathbb {R}}\), called the potential of the shrinker, such that

$$\begin{aligned} \textrm{Ric}_g + \nabla ^2_g f = \tfrac{1}{2}g. \end{aligned}$$
(1.1)

This notion, introduced by Hamilton in [21], naturally generalises the concept of positive Einstein manifolds (satisfying (1.1) with \(f\equiv const.\)). Gradient shrinkers have been very heavily studied, particularly in the last two decades. It is not hard to see that (1.1) is equivalent to \(g(t):=(1-t)\phi _t^*{g}\) satisfying Hamilton’s Ricci flow equation

$$\begin{aligned} \partial _t g(t) = -2\textrm{Ric}_{g(t)}, \end{aligned}$$

where \(\phi _t\) is the family of diffeomorphisms generated by \((1-t)^{-1}\nabla f\) with \(\phi _0={{\,\textrm{id}\,}}_M\). That is, a gradient shrinker evolves under Ricci flow only by diffeomorphisms and scaling and becomes singular at time \(t=1\). Hence, gradient shrinkers yield some of the most basic examples of singular Ricci flows.
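That (1.1) implies the flow equation can be checked by a direct computation. The following is only a sketch, using the standard facts \(\tfrac{d}{dt}\phi _t^*g = \phi _t^*(\mathcal {L}_{X_t}g)\) for the generating vector field \(X_t=(1-t)^{-1}\nabla f\), the identity \(\mathcal {L}_{\nabla f}\,g = 2\nabla ^2_g f\), and the invariance of the Ricci tensor under diffeomorphisms and constant rescalings of the metric:

$$\begin{aligned} \partial _t g(t)&= -\phi _t^*g + (1-t)\,\phi _t^*\big (\mathcal {L}_{(1-t)^{-1}\nabla f}\, g\big ) = \phi _t^*\big (-g + 2\nabla ^2_g f\big )\\&= -2\,\phi _t^*\textrm{Ric}_g = -2\,\textrm{Ric}_{g(t)}, \end{aligned}$$

where the third equality is (1.1) multiplied by \(-2\).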

Their importance however stems from the fact that gradient shrinkers model finite time singularities of the Ricci flow. As shown by Enders, Topping, and the first author [18], for a so-called Type I Ricci flow \((M,g(t))_{t\in [0,T)}\), a sequence of parabolic rescalings \(\big (M,g_j(t):= \lambda ^{-2}_j g\big (T + \lambda ^2_j(t-1)\big ),p\big )\) with scaling factors \(\lambda _j \rightarrow 0\) will subconverge smoothly in the pointed Cheeger–Gromov sense to a gradient shrinker which is non-trivial (i.e. non-flat) if and only if p is a singular point. We also refer the interested reader to an earlier result of Naber [26] (without the non-triviality statement) and the work of Mantegazza and the first author [24] for an alternative proof which yields additional information about the entropy of the limiting gradient shrinker. Recently, in his spectacular trilogy [4,5,6], Bamler generalised this blow-up result to the case of general Ricci flows without the Type I assumption. Instead, one must work with a new concept of weak convergence and limiting gradient shrinkers that may have a co-dimension 4 singular set. In the special case of dimension \(n=4\), Bamler has shown that one obtains orbifold Cheeger–Gromov convergence to an orbifold Ricci shrinker with isolated singularities modelled on \({\mathbb {R}}^n/\Gamma \) for some finite \(\Gamma \subset O\left( n\right) \). This yields a parabolic version of the (4-dimensional) shrinker compactness result by Haslhofer and the first author [22, 23] which we will recall now.

It is now well known (see [22]) that every gradient shrinker comes with a natural basepoint, namely a point \(p:= \mathop {\mathrm {arg\,min}}\limits _M f\) where the potential attains its minimum. Such a point always exists and the distance between two such points is bounded by a constant depending only on the dimension. From p, the potential grows like one-quarter distance squared and the volume growth of geodesic balls around p is at most Euclidean, see Sect. 2 for more details. It is therefore always possible to normalise f by adding a constant so that

$$\begin{aligned} \int _M \left( 4\pi \right) ^{-\frac{n}{2}}e^{-f}dV_g = 1. \end{aligned}$$
(1.2)

In this article, we always assume that the potential has been normalised this way. The gradient shrinker then has a well defined entropy,

$$\begin{aligned} \mu \left( g\right) = \mathcal {W}\left( g,f\right) = \int _M\left( \left| \nabla f\right| _g^2 + \textrm{R}_g + f -n\right) \left( 4\pi \right) ^{-\frac{n}{2}}e^{-f}dV_g>-\infty . \end{aligned}$$

The entropy, introduced by Perelman in [28], is non-decreasing along a general Ricci flow (in the compact case or under some technical assumptions) and assuming a lower bound for the entropy of singularity models is therefore quite natural. Together with a local scalar curvature bound, which is always available for gradient shrinkers, such an entropy bound rules out local collapsing.
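As a simple consistency check of the normalisation (1.2) and of the entropy formula (a standard example, not part of the argument of this paper), consider the Gaussian shrinker \(({\mathbb {R}}^n, g_{\textrm{eucl}}, f)\) with \(f(x)=\tfrac{1}{4}\left| x\right| ^2\), which satisfies (1.1) since \(\textrm{Ric}=0\) and \(\nabla ^2 f=\tfrac{1}{2}g_{\textrm{eucl}}\). Here \(\left| \nabla f\right| ^2 = \tfrac{1}{4}\left| x\right| ^2\), and the Gaussian integrals \(\int _{{\mathbb {R}}^n} (4\pi )^{-\frac{n}{2}}e^{-\left| x\right| ^2/4}dx = 1\) and \(\int _{{\mathbb {R}}^n} \left| x\right| ^2(4\pi )^{-\frac{n}{2}}e^{-\left| x\right| ^2/4}dx = 2n\) give

$$\begin{aligned} \mu (g_{\textrm{eucl}}) = \int _{{\mathbb {R}}^n}\left( \tfrac{1}{4}\left| x\right| ^2 + 0 + \tfrac{1}{4}\left| x\right| ^2 - n\right) \left( 4\pi \right) ^{-\frac{n}{2}}e^{-\left| x\right| ^2/4}dx = \tfrac{1}{2}\cdot 2n - n = 0. \end{aligned}$$

In this model case no additive constant is needed in (1.2), and \(f - \mu \) is exactly one-quarter of the squared distance to the basepoint.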

The main compactness theorem for n-dimensional Ricci shrinkers from [22] (and its improvement from [23] that shows the condition (1.3) always holds in dimension \(n=4\)) then states the following.

Theorem 1.1

(Theorem 1.1 in [22] and Theorem 1.1 in [23]) Let \(n\ge 4\) and let \((M_i,g_i,f_i)\) be a sequence of n-dimensional gradient Ricci shrinkers with entropy uniformly bounded below \(\mu (g_i)\ge {\underline{\mu }}>-\infty \). If \(n>4\), then assume in addition that we have uniform local energy bounds,

$$\begin{aligned} \int _{B_{g_i}(p_i,r)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \le E\left( r\right) < \infty , \quad \forall i,r. \end{aligned}$$
(1.3)

Then \(\left( M_i,g_i,f_i,p_i\right) \) subconverges to an orbifold Ricci shrinker \((M_\infty ,g_\infty ,f_\infty ,p_\infty )\) in the pointed orbifold Cheeger–Gromov sense where \(p_i:= \mathop {\mathrm {arg\,min}}\limits _{M_i} f_i\).

In particular, this means that a subsequence converges in the pointed Gromov–Hausdorff sense and in the smooth Cheeger–Gromov sense away from the isolated point singularities, see Sect. 2 for precise definitions of the different notions of convergence as well as for the definition of an orbifold Ricci shrinker. We denote the set of isolated singularities by \(\mathcal {Q}\).

This compactness result generalised earlier shrinker compactness theorems for compact shrinkers by Cao-Sesum [15], Weber [34], and Zhang [36] that furthermore rely on additional conditions such as pointwise curvature bounds or positivity assumptions for the curvature. This type of orbifold compactness theorem goes back to the fundamental work on sequences of Einstein manifolds by Anderson, Bando–Kasue–Nakajima, and Tian [2, 9, 27, 32]. See also Uhlenbeck [33] and Cheeger–Naber [17].

Our aim is to further extend Theorem 1.1 by investigating precisely what happens at the points where orbifold singularities form. In addition to the work cited above, our main results are in particular inspired by bubbling theorems for Einstein manifolds by Anderson–Cheeger [3] and Bando [7, 8]. Our first main result is the following.

Theorem 1.2

(Bubble tree convergence) Let \(n\ge 4\), \(\left( M_i, g_i, f_i, p_i\right) \) be a sequence of n-dimensional oriented gradient Ricci shrinkers as in Theorem 1.1, and \(\mathcal {Q}\) be the set of orbifold points of the limiting orbifold Ricci shrinker \(\left( M_\infty ,g_\infty ,f_\infty \right) \). Then, given \(q\in \mathcal {Q}\), there exist point-scale sequences \(\{(q^k_i, r^k_i)\}_{k=1}^{N_q}\) where \(M_i \ni q^k_i \rightarrow q\), \(r^k_i\rightarrow 0\), and ALE bubbles \(\{(V^k,h^k,q_\infty ^k)\}_{k=1}^{N_q}\) (see Definition 1.4), such that, up to passing to a subsequence, the following is true.

  1.

    For all \(k\ne \ell \), we have

    $$\begin{aligned} \frac{r^k_i}{r^\ell _i} + \frac{r^\ell _i}{r^k_i} + \frac{d_{g_i}(q^k_i, q^\ell _i)}{r^k_i + r^\ell _i} \rightarrow \infty \end{aligned}$$

    as \(i\rightarrow \infty \).

  2.

    For every fixed \(1\le k\le N_q\), the pointed rescaled manifolds \((M_i, (r^k_i)^{-2}g_i,q^k_i)\) converge in the pointed orbifold Cheeger-Gromov sense to \((V^k,h^k,q_\infty ^k)\) as \(i\rightarrow \infty \).

  3.

    Given any other sequences \(M_i \ni q_i \rightarrow q\) and \(\varrho _i \rightarrow 0\) such that

    $$\begin{aligned} \min _{k=1,\dots ,N_q} \Big (\frac{\varrho _i}{r^k_i} + \frac{r^k_i}{\varrho _i} + \frac{d_{g_i}(q_i,q^k_i)}{\varrho _i + r^k_i}\Big )\rightarrow \infty \end{aligned}$$

    then the pointed rescaled manifolds \((M_i, (\varrho _i)^{-2}g_i,q_i)\) converge to a flat limit.

  4.

    The number of ALE bubbles forming is locally finite, in particular for every \(r\ge 2\) there exists \(N=N({\underline{\mu }}, E(2r))\) such that \(\sum _{q \in \mathcal {Q}_r} N_q \le N\), where \(\mathcal {Q}_r:= \mathcal {Q}\cap B_{g_\infty }(p_\infty ,r)\).

  5.

    Finally, the following energy identity holds:

    $$\begin{aligned} \lim _{i \rightarrow \infty } \int _{B_{g_i}(p_i,r)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}= & {} \int _{B_{g_\infty }(p_\infty ,r)} \left| \textrm{Rm}_{g_\infty }\right| ^{n/2}_{g_\infty } dV_{g_\infty } \\{} & {} + \sum _{q \in \mathcal {Q}_r} \sum _{k = 1}^{N_q} \int _{V^k}\left| \textrm{Rm}_{h^k}\right| ^{n/2}_{h^k} dV_{h^k}, \end{aligned}$$

    whenever \(r \ge 2\) is such that \(\mathcal {Q}\cap \partial B_{g_\infty }(p_\infty ,r) = \emptyset \).

Bubble tree constructions as in Theorem 1.2 are an important tool in the study of geometric PDEs and have been employed in a variety of situations. In addition to the work on Einstein manifolds cited above, we would like to mention the classical works of Sacks–Uhlenbeck [29] for harmonic maps, as well as the articles by Brezis–Coron [12] and Struwe [30] for certain elliptic systems, all of which have inspired us.

From a technical point of view, our proof of Theorem 1.2 differs from the ones by Anderson–Cheeger [3] and Bando [7, 8] in that we start the process from the deepest bubble (or leaf bubble which corresponds to the smallest scale) and then work our way outwards, while their argument goes the other direction. This follows the first author’s bubbling analysis for minimal surfaces obtained jointly with Sharp in [13], as well as the beautiful work of Chang–Qing–Yang in [16].

Furthermore, working with non-compact manifolds, Theorem 1.2 only states that the number of orbifold points in \(\mathcal {Q}_r = \mathcal {Q}\cap B_{g_\infty }(p_\infty , r)\) is bounded by \(N=N({\underline{\mu }}, E(2r))\). This means that if one wishes to apply the result for large r, or even the entire shrinker, the number of orbifold points can become arbitrarily large and as a consequence, the bubble tree construction in Sect. 5 may not terminate.

A final important difference to the work of Anderson–Cheeger is Point 5 of Theorem 1.2, which is not present in [3]. While proving an energy inequality is relatively easy, proving the claimed energy identity requires a more delicate argument to show that no energy is lost in the intermediate regions between the different bubble scales. A significant part of the present paper therefore focuses on these so-called neck regions, see in particular Sects. 3 and 4. Once the energy identity is proved, it can easily be translated into an identity for the Euler characteristic:

Corollary 1.3

Under the assumptions of Theorem 1.2 and using the same notation, we have the identity

$$\begin{aligned} \lim _{i \rightarrow \infty } \chi (B_{g_i}(p_i,r)) = \chi (B_{g_\infty }(p_\infty ,r)\setminus \mathcal {Q}_r) + \sum _{q \in \mathcal {Q}_r} \sum _{k = 1}^{N_q} \chi (V^k \setminus {\mathcal {Q}}^k) \end{aligned}$$
(1.4)

where \({\mathcal {Q}}^k\) is the (possibly empty) set of orbifold points of the ALE bubble \((V^k,h^k)\).

The concept of an ALE bubble is defined as follows.

Definition 1.4

(ALE bubble) A manifold (or an orbifold with finitely many singularities) \((M^n, g)\) with one end is asymptotically locally Euclidean (ALE) of order \(\tau > 0\) if there is a compact set \(K \subset M^n\), a constant \(R > 0\), a finite group \(\Gamma \subset O\left( n\right) \) acting freely on \({\mathbb {R}}^n \setminus B(0,R)\), as well as a \(C^\infty \) diffeomorphism \(\psi : M^n {\setminus } K \rightarrow \left( {\mathbb {R}}^n {\setminus } B(0,R)\right) /\Gamma \) such that the following estimates hold:

$$\begin{aligned} (\varphi ^*g)_{ij}(x)&= \delta _{ij} + O (\left| x\right| ^{-\tau })\\ \partial ^k (\varphi ^*g)_{ij}(x)&= O (\left| x\right| ^{-\tau -k}), \quad \forall k\ge 1 \end{aligned}$$

for all \(x \in {\mathbb {R}}^n \setminus B(0,R)\). Here \(\varphi := \psi ^{-1} \circ \pi \) where \(\pi : {\mathbb {R}}^n \rightarrow {\mathbb {R}}^n/\Gamma \) is the natural projection. We say that an n-dimensional manifold (or orbifold with finitely many singularities) is an ALE bubble, if it is complete and non-compact with one end, Ricci-flat, non-flat with bounded \(L^{n/2}\) Riemannian curvature, and ALE of order \(n-1\) in general. If \(n = 4\) or the manifold/orbifold is Kähler then we require it to be ALE of order n. (In Definition 5.1, we will further distinguish between leaf and intermediate bubbles.)

A further consequence of Theorem 1.2 is a (local) diffeomorphism finiteness result for Ricci shrinkers.

Corollary 1.5

(Local diffeomorphism finiteness) Let \({\mathcal {M}}\) denote the collection of n-dimensional gradient Ricci shrinkers \((M,g,f)\) with entropy uniformly bounded below \(\mu (g) \ge {\underline{\mu }} > -\infty \) and uniform local energy bounds as in (1.3) whenever \(n>4\). Moreover, for any \(r>0\), set \({\mathcal {M}}^r\) to be the collection of \(M^r:=M \cap B_{g}(p,r)\), where \(M\in {\mathcal {M}}\) and \(p:= \mathop {\mathrm {arg\,min}}\limits _{M} f\). Then \({\mathcal {M}}^r\) contains only a finite number of diffeomorphism types.

The corollary shows in particular that the collection of closed Ricci shrinkers with a uniform upper bound on the diameter as well as uniform lower entropy bounds and uniform energy bounds contains only a finite number of diffeomorphism types. Recently, Munteanu–Wang [25] have shown that such diameter bounds follow from the other assumptions. Proving a global diffeomorphism finiteness result is a more delicate issue; even in the case where one knows that all orbifold singularities form in a compact region (which is for example the case under a scalar curvature bound), one still needs to control the number of ends of the shrinkers, a problem which we will study elsewhere.

The paper is organised as follows: in Sect. 2, we recall some basic concepts and collect some facts about gradient Ricci shrinkers before proving a blow-up version of Theorem 1.1 (see Theorem 2.6). In Sect. 3, we first show a connectedness result for small annuli in Ricci shrinkers (Lemma 3.1) which implies that bubbles have precisely one end (Corollary 3.3), and then we proceed to prove a neck theorem (Theorem 3.4) controlling the geometry of intermediate regions in our bubbling result. Section 4 is dedicated to proving an energy estimate in these neck regions (Theorem 4.5) via an improved Kato inequality for Ricci shrinkers. Theorem 1.2 is then proved in Sect. 5. In Sect. 6, we prove the two corollaries from above.

2 A blow-up version of the compactness theorem

Let us start this section with the precise notions of pointed Gromov–Hausdorff convergence and pointed orbifold Cheeger–Gromov convergence and a quick overview of the main results from [22, 23].

Definition 2.1

(Pointed Gromov–Hausdorff convergence) A pointed map \(f: (X,p) \rightarrow (Y,q)\) between two metric spaces \(\left( X, d_X, p\right) \), \(\left( Y,d_Y,q\right) \) is an \(\varepsilon \)-pointed Gromov–Hausdorff approximation (\(\varepsilon \)-pGHA) if it is almost an isometry and almost onto in the following sense:

  1. (i)

    \(\left| d_X(x_1, x_2) - d_Y(f(x_1), f(x_2))\right| \le \varepsilon \), for all \(x_1,x_2 \in B_{d_X}(p,1/\varepsilon )\),

  2. (ii)

    for all \(y \in B_{d_Y}(q,1/\varepsilon )\) there exists \(x\in B_{d_X}(p,1/\varepsilon )\) with \(d_Y(y,f(x))\le \varepsilon \).

We say \(\left( X_i,p_i\right) \rightarrow \left( Y,q\right) \) as \(i\rightarrow \infty \) in the pointed Gromov–Hausdorff sense if

$$\begin{aligned} d_{\textrm{pGH}}((X_i,p_i), (Y,q))&:= \inf \{\varepsilon > 0 : \exists \, \varepsilon \text {-pGHA } f_1:(X_i,p_i) \rightarrow (Y,q) \text { and } \\&\quad f_2:(Y,q) \rightarrow (X_i,p_i)\}\\&\quad \rightarrow 0 \qquad (i\rightarrow \infty ). \end{aligned}$$

As explained in [22], Lemma 2.1 and Lemma 2.2, under the normalisation (1.2), one obtains the following growth condition for the potential f from the basepoint p,

$$\begin{aligned} \frac{1}{4}(d(x,p)-5n)_{+}^2 \le f(x) - \mu (g) \le \frac{1}{4}(d(x,p)+\sqrt{2n})^2. \end{aligned}$$

This in turn implies the volume growth estimate

$$\begin{aligned} {{\,\textrm{Vol}\,}}_g(B_g(p,r)) \le V_0 r^n, \quad \forall r>0 \end{aligned}$$
(2.1)

with \(V_0\) being a constant depending only on the dimension of the shrinker. Finally, similar to Perelman’s non-collapsing result, under the entropy bound \(\mu (g)\ge {\underline{\mu }}>-\infty \) one obtains for every r the existence of \(v_0 =v_0(r,n,{\underline{\mu }})\) such that

$$\begin{aligned} {{\,\textrm{Vol}\,}}_g(B_g(q,\delta )) \ge v_0 \delta ^n, \end{aligned}$$
(2.2)

for every ball \(B_g(q,\delta )\subset B_g(p,r)\), \(0<\delta \le 1\), see Lemma 2.3 in [22]. Pointed Gromov–Hausdorff convergence of a sequence of Ricci shrinkers \((M_i,g_i,f_i,p_i)\) with entropy \(\mu (g_i)\ge {\underline{\mu }}>-\infty \) to a complete metric space \((M_\infty ,d_\infty ,p_\infty )\) then follows directly from (2.1)–(2.2) and Gromov’s compactness theorem, see Theorem 2.4 in [22] for details.

The main work of [22, 23] then goes into improving the regularity of the convergence and of the limit metric space \(M_\infty \).

Definition 2.2

(Orbifold Ricci shrinker) A complete metric space \(M_\infty \) is called an orbifold Ricci shrinker if it is a smooth Ricci shrinker away from a locally finite set \(\mathcal {Q}\) of singular points and at every \(q\in \mathcal {Q}\), \(M_\infty \) is modelled on \({\mathbb {R}}^n/\Gamma \) for a finite group \(\Gamma \subset O(n)\). Moreover, there exists an associated covering \({\mathbb {R}}^n \supset B(0,\varrho ){\setminus } \{0\} {\mathop {\rightarrow }\limits ^{\pi }} U {\setminus } \{q\}\) of some neighbourhood \(U\subset M_\infty \) of q such that \((\pi ^*g_\infty ,\pi ^*f_\infty )\) can be extended smoothly to a gradient shrinker over the origin.

Definition 2.3

(Pointed Orbifold Cheeger–Gromov convergence) A sequence of gradient shrinkers \(\left( M^n_i, g_i, f_i, p_i\right) \) converges to an orbifold gradient shrinker \(\left( M^n_\infty , g_\infty , f_\infty , p_\infty \right) \) in the pointed orbifold Cheeger–Gromov sense if the following properties hold:

  1.

    There exist a locally finite set \({\mathcal {Q}} \subset M_\infty \), an exhaustion of \(M_\infty {\setminus } {\mathcal {Q}}\) by open sets \(U_i\), and smooth embeddings \(\varphi _i: U_i \rightarrow M_i\) such that \(\left( \varphi ^*_i g_i, \varphi ^*_i f_i\right) \) converges to \(\left( g_\infty , f_\infty \right) \) in the \(C^\infty _{\textrm{loc}}\)-sense on \(M_\infty {\setminus } {\mathcal {Q}}\).

  2.

    Each of the above maps \(\varphi _i\) can be extended to an \(\varepsilon \)-pGHA, and these extensions yield a convergent sequence \(\left( M_i, d_i, p_i\right) \rightarrow \left( M_\infty , d_\infty , p_\infty \right) \) in the pointed Gromov–Hausdorff sense.

Pointed orbifold Cheeger–Gromov convergence to a Ricci-flat orbifold is defined analogously.

The main result of [22] improves the Gromov–Hausdorff convergence to orbifold Cheeger–Gromov convergence under the energy bound (1.3) and the main result of [23] shows that the energy bound assumption is in fact always satisfied in dimension \(n=4\), see Theorem 1.1 from the introduction. It turns out that the set \(\mathcal {Q}\) is the same in the two above definitions, meaning that the convergence is bad (or non-smooth) around a point \(q\in M_\infty \) if and only if \(M_\infty \) is non-smooth at q. We can therefore use the expressions that q is a singular point or a point of bad convergence interchangeably. A key ingredient in the proof of this improved convergence result is an \(\varepsilon \)-regularity theorem that follows from local Sobolev constant bounds via a Moser iteration argument. We recall these two results here as we will need them later.

Lemma 2.4

(Local Sobolev constant bounds, Lemma 3.2 in [22]) There exist \(C_S(r)<\infty \) and \(\delta _0(r)>0\) depending on r, n and \({\underline{\mu }}\), such that for every gradient shrinker with normalised weighted volume and \(\mu (g) \ge {\underline{\mu }} > -\infty \), and for every ball \(B_g(x,\delta ) \subset B_g(p,r)\) with \(0 < \delta \le \delta _0(r)\) and \(p:= \mathop {\mathrm {arg\,min}}\limits _M f\), we have

$$\begin{aligned} \left\| \varphi \right\| _{L^{2^*}} \le C_S(r)\left\| \nabla \varphi \right\| _{L^2}, \end{aligned}$$

for all functions \(\varphi \in C^1_c(B_g(x,\delta ))\), where \(2^*=\tfrac{2n}{n-2}\).

Lemma 2.5

(\(\varepsilon \)-Regularity, Lemma 3.3 in [22]) There exist \(\varepsilon _{\textrm{reg}}(r), \delta _0(r)>0\), and \(K_\ell (r) < \infty \), all depending on r, n and \({\underline{\mu }}\), such that for every gradient shrinker with normalised weighted volume and \(\mu (g) \ge {\underline{\mu }} > -\infty \), and for every ball \(B_g(x,\delta ) \subset B_g(p,r)\) with \(0 < \delta \le \delta _0(r)\) and \(p:= \mathop {\mathrm {arg\,min}}\limits _M f\), we have the implication

$$\begin{aligned} \left\| \textrm{Rm}_g\right\| _{L^{n/2}(B_g(x,\delta ))} < \varepsilon _{\textrm{reg}}(r) \Longrightarrow \sup \limits _{B_g(x,\delta /4)} \left| \nabla ^\ell \textrm{Rm}_g\right| _g \le \frac{K_\ell (r)}{\delta ^{2+\ell }} \left\| \textrm{Rm}_g\right\| _{L^{n/2}(B_g(x,\delta ))}. \end{aligned}$$

Under the energy bound (1.3), for a large r and a small \(\delta >0\), there can only be finitely many disjoint \(\delta \)-balls in \(B_g(p,r)\) that contain energy more than \(\varepsilon _{\textrm{reg}}(r)\). In light of Lemma 2.5, away from these balls we get \(C^\infty \) estimates for the curvatures and hence smooth convergence. Hence the singular points \(q\in \mathcal {Q}\) are exactly characterised by the condition

$$\begin{aligned} \exists q_i\in M_i \text { with } q_i\rightarrow q, \exists \delta _i\rightarrow 0 \text { such that }\left\| \textrm{Rm}_{g_i}\right\| _{L^{n/2}(B_{g_i}(q_i,\delta _i))} \ge \varepsilon _{\textrm{reg}}(r). \end{aligned}$$
(2.3)

Our first new result is a blow-up version of Theorem 1.1 stating that if we rescale the metrics of our sequence of Ricci shrinkers with \(\lambda _i^{-2}\) (where \(\lambda _i\rightarrow 0\)) we still obtain orbifold Cheeger–Gromov convergence. In order to allow us to apply this result flexibly in different situations below, we prove a rather general theorem which does not yet make a statement about whether or not the limit is flat and whether or not it has singular points – properties that will in particular depend on the precise choice of q and the scaling factors \(\lambda _i\).

Theorem 2.6

(Blow-up version of Theorem 1.1) Let \(\left( M_i, g_i, f_i, p_i\right) \) be a sequence of n-dimensional gradient Ricci shrinkers with entropy uniformly bounded below \(\mu \left( g_i\right) \ge {\underline{\mu }} > -\infty \) and, if \(n > 4\), locally bounded energy as in Theorem 1.1. Let \(q\in M_\infty \) and let \(M_i \ni q_i \rightarrow q\) and \(\lambda _i \rightarrow 0\). Then the rescaled sequence \((M_i, \widetilde{g}_i = \lambda _i^{-2}g_i, q_i)\) subconverges in the pointed (orbifold) Cheeger–Gromov sense to a complete, non-compact, Ricci-flat manifold or orbifold with isolated singularities \((V, h, q_\infty )\) which has bounded \(L^{n/2}\) Riemannian curvature and is ALE of order \(n-1\) in general and ALE of order n if either \(n = 4\) or \((V,h)\) is Kähler. Finally, the singular points of V are characterised by (2.3) for the rescaled metrics \(\widetilde{g}_i\).

Proof

The proof consists of checking that after rescaling we can essentially still follow the same arguments as in the original proof of Theorem 1.1 to obtain orbifold Cheeger–Gromov convergence and in checking the claimed properties of the limiting manifold or orbifold.

First, choose some \(r \ge 2\) such that \(q\in B_{g_\infty }(p_\infty , r)\). By picking i sufficiently large, we may assume that \(q_i \in B_{g_i}(p_i,r+1)\) and thus \(B_{g_i}(q_i,1) \subset B_{g_i}(p_i,r+2) \subset B_{g_i}(p_i,2r)\), which by (2.1) implies that \({{\,\textrm{Vol}\,}}_{g_i} B_{g_i}(q_i,1) \le V_0(2r)^n\) independently of i. Clearly, these unit balls \(B_{g_i}(q_i,1)\) with respect to the original metrics correspond to the larger and larger balls \(B_{\widetilde{g}_i}(q_i, \lambda _i^{-1})\) with respect to the rescaled metrics. Using also (2.2), we therefore see that there are constants \(v_1, V_1\) depending only on r, n and \({\underline{\mu }}\), such that

$$\begin{aligned} v_1 s^n \le {{\,\textrm{Vol}\,}}_{\widetilde{g}_i} B_{\widetilde{g}_i}(q_i,s) \le V_1 s^n \end{aligned}$$
(2.4)

whenever i is sufficiently large so that \(s<\lambda _i^{-1}\). In particular, this controls the number of small balls that can be placed disjointly in a large ball and thus implies pointed Gromov–Hausdorff convergence to a complete length space by Gromov’s compactness theorem. Clearly this limit space is non-compact.
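The lower bound in (2.4) can be made explicit as follows. This is a sketch, and the choice \(v_1 = v_0(2r,n,{\underline{\mu }})\) is one admissible constant assumed here for illustration rather than one fixed in the text. Since \(B_{\widetilde{g}_i}(q_i,s) = B_{g_i}(q_i,\lambda _i s)\) and \(dV_{\widetilde{g}_i} = \lambda _i^{-n}dV_{g_i}\), applying (2.2) inside \(B_{g_i}(p_i,2r)\) with \(\delta =\lambda _i s\le 1\) gives

$$\begin{aligned} {{\,\textrm{Vol}\,}}_{\widetilde{g}_i} B_{\widetilde{g}_i}(q_i,s) = \lambda _i^{-n}\, {{\,\textrm{Vol}\,}}_{g_i} B_{g_i}(q_i,\lambda _i s) \ge \lambda _i^{-n}\, v_0\, (\lambda _i s)^n = v_0\, s^n. \end{aligned}$$

The powers of \(\lambda _i\) cancel, which is why the resulting constant does not depend on i.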

Next, we note that Lemma 2.5 still holds for the rescaled metrics \(\widetilde{g}_i\) and for balls \(B_{\widetilde{g}_i}(x,\delta )\) such that \(0<\lambda _i \delta \le \delta _0(r)\). This is obtained by scaling \(\widetilde{g}_i\), applying the \(\varepsilon \)-regularity lemma for shrinkers, and then scaling back. More precisely, we have

$$\begin{aligned} \left\| \text {Rm}_{\widetilde{g}_i}\right\| _{L^{n/2}(B_{\widetilde{g}_i}(x,\delta ))}&< \varepsilon _{\text {reg}}(r)\Longleftrightarrow \left\| \text {Rm}_{g_i}\right\| _{L^{n/2}(B_{g_i}(x,\lambda _i\delta ))} < \varepsilon _{\text {reg}}(r)\nonumber \\ \Longrightarrow&{} \sup \limits _{B_{g_i}(x,\lambda _i\delta /4)} \left| \nabla ^\ell \text {Rm}_{g_i}\right| _{g_i}\le \frac{K_\ell (r)}{(\lambda _i\delta )^{2+\ell }} \left\| \text {Rm}_{g_i}\right\| _{L^{n/2}(B_{g_i}(x,\lambda _i\delta ))}\nonumber \\\Longleftrightarrow&{} \sup \limits _{B_{\widetilde{g}_i}(x,\delta /4)} \left| \nabla ^\ell \text {Rm}_{\widetilde{g}_i}\right| _{\widetilde{g}_i} \le \frac{K_\ell (r)}{\delta ^{2+\ell }} \left\| \text {Rm}_{\widetilde{g}_i}\right\| _{L^{n/2}(B_{\widetilde{g}_i}(x,\delta ))}.\nonumber \\ \end{aligned}$$
(2.5)

So for the rescaled metrics we have the exact same implication as in Lemma 2.5, the advantage being that we can potentially work with much larger balls, a fact that we will use in the neck theorem below to conclude flatness of the limit.
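The equivalences in (2.5) rest on the following elementary scaling identities for \(\widetilde{g}_i = \lambda _i^{-2}g_i\), recorded here as a brief sketch:

$$\begin{aligned} B_{\widetilde{g}_i}(x,\delta )&= B_{g_i}(x,\lambda _i\delta ), \qquad \left| \nabla ^\ell \textrm{Rm}_{\widetilde{g}_i}\right| _{\widetilde{g}_i} = \lambda _i^{2+\ell } \left| \nabla ^\ell \textrm{Rm}_{g_i}\right| _{g_i}, \qquad dV_{\widetilde{g}_i} = \lambda _i^{-n}\, dV_{g_i},\\ \int _{B_{\widetilde{g}_i}(x,\delta )} \left| \textrm{Rm}_{\widetilde{g}_i}\right| ^{n/2}_{\widetilde{g}_i} dV_{\widetilde{g}_i}&= \int _{B_{g_i}(x,\lambda _i\delta )} \lambda _i^{n} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i}\, \lambda _i^{-n}\, dV_{g_i} = \int _{B_{g_i}(x,\lambda _i\delta )} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}. \end{aligned}$$

In particular, the \(L^{n/2}\)-norm of the curvature is scale invariant, and multiplying the second line of (2.5) by \(\lambda _i^{2+\ell }\) yields the third.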

Endowed with such an \(\varepsilon \)-regularity result, we can conclude exactly as in [22] to improve the regularity of the limit to an orbifold \((V,h)\) with isolated singularities and the convergence to pointed orbifold Cheeger–Gromov convergence. We refer the reader to Section 3 of [22] and the associated references for more details. In the exact same way as described above, the orbifold points of V are exactly the points where the convergence is bad and these points are characterised by an energy concentration as in (2.3) for the rescaled metrics \(\widetilde{g}_i\). As the bounded \(L^{n/2}\) Riemannian curvature of the limit \((V,h)\) is an obvious consequence of the local energy bound (1.3), it only remains to show that the limit is Ricci-flat and satisfies the ALE condition.

To prove the former property, note that the rescaling changes (1.1) to

$$\begin{aligned} \textrm{Ric}_{\widetilde{g}_i} + \nabla ^2_{\widetilde{g}_i} f_i = \frac{\lambda ^2_i}{2}\widetilde{g}_i. \end{aligned}$$

Hence, away from the points of bad convergence, \((V,h)\) satisfies the steady soliton equation

$$\begin{aligned} \textrm{Ric}_h + \nabla ^2 f = 0 \end{aligned}$$

for some function \(f:V\rightarrow {\mathbb {R}}\). Since any Ricci shrinker \((M_i,g_i,f_i,p_i)\) satisfies

$$\begin{aligned} 0 \le R_{g_i}(x) \le f_i(x) - \mu (g_i) \le \frac{1}{4}(d_{g_i}(x,p_i)+\sqrt{2n})^2, \end{aligned}$$

see for example (2.11) in [22], and the rescaled metrics satisfy \(R_{\widetilde{g}_i} = \lambda ^2_i R_{g_i}\), we also conclude that \((V,h)\) is scalar-flat. If \((V,h)\) is a smooth manifold (and thus a smooth steady soliton), then it satisfies

$$\begin{aligned} \left\langle \nabla f, \nabla R_h\right\rangle = \Delta R_h + 2 \left| \textrm{Ric}_h\right| ^2 \end{aligned}$$

and we can directly conclude Ricci-flatness from scalar-flatness. This argument does not directly go through if there are orbifold singularities, but we can work instead with the evolution equation for the scalar curvature on the shrinkers \((M_i,g_i,f_i)\), namely

$$\begin{aligned} R_{g_i} + \left\langle \nabla f_i, \nabla R_{g_i}\right\rangle = \Delta R_{g_i} + 2\left| \textrm{Ric}_{g_i}\right| ^2_{g_i} \end{aligned}$$

and pass to a limit after rescaling to conclude that the limit is Ricci-flat.
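One way to make this limiting argument explicit is the following sketch. Using \(R_{g_i} = \lambda _i^{-2}R_{\widetilde{g}_i}\) and multiplying the above equation by \(\lambda _i^4\), it becomes, in terms of the rescaled metrics,

$$\begin{aligned} \lambda _i^2 R_{\widetilde{g}_i} + \left\langle \nabla f_i, \nabla R_{\widetilde{g}_i}\right\rangle _{\widetilde{g}_i} = \Delta _{\widetilde{g}_i} R_{\widetilde{g}_i} + 2\left| \textrm{Ric}_{\widetilde{g}_i}\right| ^2_{\widetilde{g}_i}. \end{aligned}$$

By (3.3), \(\left| \nabla f_i\right| _{\widetilde{g}_i} = \lambda _i \left| \nabla f_i\right| _{g_i} \rightarrow 0\) uniformly on \(B_{g_i}(p_i,2r)\), while \(R_{\widetilde{g}_i}\) converges smoothly to \(R_h \equiv 0\) on compact subsets of the regular part of V. The first three terms therefore tend to zero, leaving \(\left| \textrm{Ric}_h\right| ^2_h = 0\) on the regular part.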

In the final step, we want to apply the following theorem which will yield the desired ALE condition.

Theorem 2.7

(Theorem 1.5 in [9]) Let \((V^n, h)\) with \(n \ge 4\) be a Ricci-flat manifold or a Ricci-flat orbifold with isolated singularities such that for some \(x\in V\) and \(v>0\), we have

$$\begin{aligned} {{\,\textrm{Vol}\,}}_h(B_h(x,s)) \ge vs^n, \qquad \forall s>0 \end{aligned}$$

as well as

$$\begin{aligned} \int _V \left| \textrm{Rm}_h\right| ^{n/2}_h dV_h \le C < \infty . \end{aligned}$$

Then \((V^n, h)\) is ALE of order \(n-1\). If \(n = 4\) or \((V^n, h)\) is Kähler then it is ALE of order n.

In order to apply this theorem, we pick \(x\in V\) to be the limit \(q_\infty \) of the points \(q_i\) (whether this is a point of good or bad convergence does not matter). Then note that the volume growth assumption follows by passing to a limit in (2.4) while the integral condition follows by

$$\begin{aligned} \int _V \left| \textrm{Rm}_h\right| ^{n/2}_h dV_h&= \lim _{s\rightarrow \infty }\int _{B_h(q_\infty ,s)} \left| \textrm{Rm}_h\right| ^{n/2}_h dV_h \le \lim _{s\rightarrow \infty } \liminf _{i\rightarrow \infty } \int _{B_{\widetilde{g}_i}(q_i,s)} \left| \textrm{Rm}_{\widetilde{g}_i}\right| ^{n/2}_{\widetilde{g}_i} dV_{\widetilde{g}_i}\\&= \lim _{s\rightarrow \infty } \liminf _{i\rightarrow \infty } \int _{B_{g_i}(q_i,\lambda _i s)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \le \int _{B_{g_i}(q_i,1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}\\&\le \int _{B_{g_i}(p_i,2r)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \le E(2r) < \infty . \end{aligned}$$

On the second line, we used that for any s we may take i large enough so that \(s<\lambda _i^{-1}\) and on the last line we used the uniform local energy bound (1.3) which we assumed for \(n>4\) and which, as mentioned previously, is always automatically satisfied for \(n=4\) by the work in [23]. This completes the proof of Theorem 2.6. \(\square \)

We also recall the following sufficient condition for flatness of the limit.

Proposition 2.8

(Bando’s gap result, [7, 8]) There exists \(\varepsilon _{\textrm{gap}}(n)>0\) such that the following holds. Let \((V^n,h)\) be a limit orbifold arising in Theorem 2.6 with

$$\begin{aligned} \int _V \left| \textrm{Rm}_h\right| ^{n/2}_h dV_h < \varepsilon _{\textrm{gap}}. \end{aligned}$$
(2.6)

Then \((V,h)\) is flat, i.e. \(\textrm{Rm}_h \equiv 0\) in the regular part of V.

3 Small annuli in Ricci shrinkers and a neck theorem

Let \({\bar{A}}_{s_1,s_2}(q)\) denote the closed geodesic annulus centered at some point \(q \in M\). That is,

$$\begin{aligned} {\bar{A}}_{s_1,s_2}(q):= \overline{B_g(q, s_2)} \setminus B_g(q, s_1). \end{aligned}$$

Furthermore, let \(A_{s_1,s_2}(q)\) denote a connected component of \({\bar{A}}_{s_1,s_2}(q)\) such that

$$\begin{aligned} A_{s_1, s_2}(q) \cap \partial B_g(q, s_2) \ne \emptyset . \end{aligned}$$
(3.1)

The first lemma of this section, which is important for the neck theorem below, shows that for sufficiently small annuli \({\bar{A}}_{s_1,s_2}(q)\) there exists only one such component \(A_{s_1,s_2}(q)\). We will also prove a diameter bound for this component. These results are Ricci shrinker versions of results from [1, 3] for manifolds with pointwise Ricci bounds. We only consider annuli lying fully inside \(B_g(p, 2r)\) for some fixed \(r\ge 2\) and, in order to guarantee that we can later apply our results to our sequence of shrinkers, we make sure that all constants only depend on this r as well as possibly on n and \({\underline{\mu }}\).

The key tool to prove these results is the Bakry–Émery volume comparison theorem (Theorem 1.2 in [35]) which implies that for a gradient Ricci shrinker there exists \(a=a(r,n)\) such that

$$\begin{aligned} \frac{{{\,\textrm{Vol}\,}}_f(B_g(q, \sigma _2))}{{{\,\textrm{Vol}\,}}_f(B_g(q, \sigma _1))} \le a(r,n)\,\frac{\sigma _2^n}{\sigma _1^n} \end{aligned}$$
(3.2)

whenever \(0<\sigma _1\le \sigma _2 \le 1\) and \(q \in B_g(p,r+1)\). Here, \(p:= \mathop {\mathrm {arg\,min}}\limits _M f\) is the basepoint of the shrinker, \(r\ge 2\), and \({{\,\textrm{Vol}\,}}_f (\Omega ):= \int _\Omega e^{-f}dV\). The constant a can in fact simply be chosen to be \(a=e^b\) where b is a bound on \(\left| \nabla f\right| \) on the ball \(B_g(p, 2r)\). Such a bound follows from the auxiliary equation for Ricci shrinkers

$$\begin{aligned} R_g + \left| \nabla f\right| ^2_g - f = -\mu (g), \end{aligned}$$

which, together with \(R_g \ge 0\), yields

$$\begin{aligned} 0 \le \left| \nabla f\right| ^2_g \le R_g + \left| \nabla f\right| ^2_g = f - \mu (g) \le \frac{1}{4}(d(x,p)+\sqrt{2n})^2. \end{aligned}$$
(3.3)

So we can for example pick \(a(r,n)=\exp (r+\sqrt{n})\). There is also a corresponding version of (3.2) for annuli, obtained in the proof of Theorem 1.2 in [35]. The version we will use below can be written as

$$\begin{aligned} \frac{{{\,\textrm{Vol}\,}}_f({\bar{A}}_{\sigma _0,\sigma _2}(q))}{{{\,\textrm{Vol}\,}}_f({\bar{A}}_{\sigma _0,\sigma _1}(q))} \le a(r,n)\,\frac{\sigma _2^n-\sigma _0^n}{\sigma _1^n-\sigma _0^n}, \end{aligned}$$
(3.4)

whenever \(0<\sigma _0<\sigma _1\le \sigma _2 \le 1\) and \(q \in B_g(p,r+1)\).
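To make the choice of \(a(r,n)\) above concrete, here is a short worked bound. It is only a sketch, assuming nothing beyond (3.3) and \(x \in B_g(p,2r)\), so that \(d(x,p)<2r\):

$$\begin{aligned} \left| \nabla f\right| _g(x) \le \tfrac{1}{2}\big (d(x,p)+\sqrt{2n}\big ) \le \tfrac{1}{2}\big (2r+\sqrt{2n}\big ) = r+\sqrt{n/2} \le r+\sqrt{n}. \end{aligned}$$

Hence \(b=r+\sqrt{n}\) is an admissible bound on \(\left| \nabla f\right| \) in \(B_g(p,2r)\), consistent with the choice \(a=e^b=\exp (r+\sqrt{n})\). Note that for \(q\in B_g(p,r+1)\) and \(\sigma \le 1\) the balls \(B_g(q,\sigma )\) lie in \(B_g(p,r+2)\subseteq B_g(p,2r)\), since \(r\ge 2\).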

Lemma 3.1

(Small annuli in Ricci shrinkers) Given \(n\ge 4\), \(r\ge 2\) and \({\underline{\mu }}>-\infty \), there exist constants \(0<\zeta _0<\tfrac{1}{3}\) and \(C_0<\infty \) such that the following holds: let \((M,g,f)\) be a complete, connected n-dimensional gradient Ricci shrinker with entropy bounded below by \(\mu (g) \ge {\underline{\mu }}\). Let \(p:= \mathop {\mathrm {arg\,min}}\limits _M f\) and \(q \in B_g(p, r+1)\). If

$$\begin{aligned} s_2 \le \tfrac{1}{4} \quad \text {and}\quad s_1 = \zeta s_2 \quad \text {where}\quad \zeta \le \zeta _0 \end{aligned}$$

then the annulus \({\bar{A}}_{s_1,s_2}(q)\) contains at most one connected component that meets \(\partial B_g(q, s_2)\) in the sense of (3.1) and, if such a component exists, any two points in \(A_{s_1, s_2}(q) \cap \partial B_g(q, s_2)\) can be connected by a curve lying in \(A_{s_1, 2s_2}(q)\) of length at most \(C_0s_2\).

Remark 3.2

Obviously any two points in \(A_{s_1, s_2}(q) \cap \partial B_g(q, s_2)\) can be connected in M by a curve of length \(2s_2\) (by passing via q), but the lemma excludes curves getting near q. By a more careful argument, one can prove an actual intrinsic diameter bound for \({\bar{A}}_{s_1, s_2}(q)\), but the claimed property has a simple proof and is sufficient for us.

Proof

Assume for a contradiction that there are at least two such components \(\{D_i\}\). We may assume that \(D_1\) is such that for any \(i \ne 1\) we have \({{\,\textrm{Vol}\,}}_f(D_1) \le {{\,\textrm{Vol}\,}}_f (D_i)\) which implies

$$\begin{aligned} {{\,\textrm{Vol}\,}}_f({\bar{A}}_{s_1, s_2}(q)) \le 2{{\,\textrm{Vol}\,}}_f\Big (\bigcup _{i\ne 1} D_i\Big ). \end{aligned}$$

Now choose \(x \in D_1 \cap \partial B_g(q,s_2)\). Note that any minimising geodesic \(\gamma (t)\) in M from x to some component \(D_i\) (with \(i\ne 1\)) has length at most \(2s_2\) and intersects \(B_g(q,s_1)\) for some \(t_0 \in [s_2 - s_1, s_2 + s_1]\). Let \(A^0_{u,v}(x)\) be the set of all points on such a geodesic with \(t \in [u,v]\). Note the inclusion relation

$$\begin{aligned} \bigcup _{i\ne 1} D_i \subseteq A^0_{s_2 - s_1, 2s_2}(x). \end{aligned}$$
(3.5)

Also, the triangle inequality together with \(\gamma (t_0)\in B_g(q,s_1)\) yields

$$\begin{aligned} A^0_{s_2 - s_1, s_2+s_1}(x) \subseteq B_g(q, 3s_1). \end{aligned}$$
(3.6)

Combining (3.4)–(3.6), we obtain

$$\begin{aligned} \frac{{{\,\textrm{Vol}\,}}_f({\bar{A}}_{s_1, s_2}(q))}{{{\,\textrm{Vol}\,}}_f(B_g(q, 3s_1))}&\le \frac{2{{\,\textrm{Vol}\,}}_f\Big (\bigcup _{i\ne 1} D_i\Big )}{{{\,\textrm{Vol}\,}}_f(B_g(q, 3s_1))}\\&\le \frac{2{{\,\textrm{Vol}\,}}_f(A^0_{s_2 - s_1, 2s_2}(x))}{{{\,\textrm{Vol}\,}}_f(A^0_{s_2 - s_1, s_2+s_1}(x))}\\&\le 2a(n,r)\frac{(2\zeta ^{-1})^n -(\zeta ^{-1}-1)^n}{(\zeta ^{-1}+1)^n -(\zeta ^{-1}-1)^n}\\&\le C_0 \zeta ^{-1} + C_1 \end{aligned}$$

for constants \(C_0\) and \(C_1\) depending only on n and \(a(r,n)\). On the other hand, we also have

$$\begin{aligned} \frac{{{\,\textrm{Vol}\,}}_f({\bar{A}}_{s_1, s_2}(q))}{{{\,\textrm{Vol}\,}}_f(B_g(q, 3s_1))} \ge \frac{{{\,\textrm{Vol}\,}}_f(B_g(q, s_2))}{{{\,\textrm{Vol}\,}}_f(B_g(q, 3s_1))} - 1 \ge \frac{v_0 s_2^n}{3^n V_0 s_1^n}-1 = \frac{v_0}{3^nV_0} \zeta ^{-n}-1, \end{aligned}$$

where \(V_0\) and \(v_0\) are the constants from (2.1)–(2.2). Clearly this yields a contradiction when \(\zeta _0\) is sufficiently small (and hence \(\zeta ^{-1}\ge \zeta _0^{-1}\) sufficiently large), showing that there can be at most one connected component \(A_{s_1,s_2}(q)\) that meets \(\partial B_g(q, s_2)\).
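The contradiction can be made quantitative by an elementary estimate, given here as a sketch; the shorthand \(t:=\zeta ^{-1}\ge 3\) is introduced only for this illustration. All terms in the expansion \((t+1)^n-(t-1)^n = 2\sum _{k\ \textrm{odd}}\left( {\begin{array}{c}n\\ k\end{array}}\right) t^{n-k}\) are positive, so \((t+1)^n-(t-1)^n\ge 2n\,t^{n-1}\), and therefore

$$\begin{aligned} \frac{(2t)^n-(t-1)^n}{(t+1)^n-(t-1)^n} \le \frac{2^n t^n}{2n\, t^{n-1}} = \frac{2^{n-1}}{n}\, t. \end{aligned}$$

Combining the upper and lower bounds for the volume ratio would then require

$$\begin{aligned} \frac{v_0}{3^nV_0}\, \zeta ^{-n}-1 \le \frac{2^n a(r,n)}{n}\, \zeta ^{-1}, \end{aligned}$$

which fails for all sufficiently small \(\zeta \), because the left-hand side grows like \(\zeta ^{-n}\) while the right-hand side grows only like \(\zeta ^{-1}\).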

It remains to prove the claimed diameter bound. In order to do so, pick a maximal family of points \(x_j \in A_{s_1, s_2}(q) \cap \partial B_g(q, s_2)\) such that \(B_j:= B_g(x_j, \xi s_2)\) are disjoint for \(\xi :=\tfrac{1}{2}(1-\zeta )\) and set \({\hat{B}}_j:=B_g(x_j, 2\xi s_2)\). Clearly if \({\hat{B}}_j \cap {\hat{B}}_k \ne \emptyset \), then \(x_j\) and \(x_k\) can be joined by a curve in \({\bar{A}}_{s_1, 2s_2}(q)\) of length at most \(4\xi s_2\). This uses in particular that all \({\hat{B}}_j\) are disjoint from \(B_g(q,s_1)\) by definition of \(\xi \). By maximality, \(\{{\hat{B}}_j\}\) cover \(A_{s_1, s_2}(q) \cap \partial B_g(q, s_2)\) and therefore any two points in \(A_{s_1, s_2}(q) \cap \partial B_g(q, s_2)\) can be joined by a curve in \({\bar{A}}_{s_1, 2s_2}(q)\) of length at most \(4\xi s_2 \cdot \#\{x_j\}\). It remains to estimate the number of points in the family \(\{x_j\}\).

Note the inclusion

$$\begin{aligned} B_j = B_g(x_j, \xi s_2) \subseteq B_g(q, (1 + \xi )s_2) \subseteq B_g(x_j, (2 + \xi )s_2), \end{aligned}$$

which by (3.2) yields

$$\begin{aligned} \frac{{{\,\textrm{Vol}\,}}_f (B_g(q,(1+\xi )s_2))}{{{\,\textrm{Vol}\,}}_f B_j} \le \frac{{{\,\textrm{Vol}\,}}_f (B_g(x_j,(2+\xi )s_2))}{{{\,\textrm{Vol}\,}}_f B_j} \le a(n,r)\left( \frac{2 + \xi }{\xi }\right) ^n \end{aligned}$$

for each j. In particular, the number of disjoint \(B_j\) lying in \(B_g(q, (1 + \xi )s_2)\) is bounded by \(a(n,r)(\tfrac{2 + \xi }{\xi })^n\).

Combining with the above, we see that any two points in \(A_{s_1, s_2}(q) \cap \partial B_g(q, s_2)\) can be joined by a curve in \({\bar{A}}_{s_1, 2s_2}(q)\) of length at most \(4\xi a(n,r)(\tfrac{2 + \xi }{\xi })^n \cdot s_2\). An explicit constant \(C_0\) can easily be obtained from \(0<\zeta <\tfrac{1}{3}\) which yields \(\tfrac{1}{3}<\xi <\tfrac{1}{2}\). \(\square \)
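For illustration, one admissible explicit constant can be read off as follows; this is a sketch and not necessarily the sharpest choice. From \(\tfrac{1}{3}<\xi <\tfrac{1}{2}\) we get \(4\xi <2\) and \(\tfrac{2+\xi }{\xi } = 1+\tfrac{2}{\xi }<7\), so that

$$\begin{aligned} 4\xi \, a(n,r)\left( \frac{2 + \xi }{\xi }\right) ^n < 2\cdot 7^n\, a(n,r). \end{aligned}$$

Hence \(C_0 = 2\cdot 7^n\, a(n,r)\) would suffice in the length bound of Lemma 3.1.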

Corollary 3.3

(One end) Any limit manifold or orbifold \((V,h)\) obtained in the blow-up version of the compactness theorem, Theorem 2.6, has one end.

Proof

For sufficiently small \(\lambda _i\), the annuli \({\bar{A}}_{\lambda _i,\sqrt{\lambda _i}}(q_i)\) satisfy the assumptions of Lemma 3.1 and therefore have only one connected component meeting the outer boundary. This immediately implies that \((V,h)\) has only one end. \(\square \)

In the remainder of this section, we prove a so-called neck theorem. In bubbling results, one usually encounters three different types of regions: regions where energy concentrates (and bubbles form), regions where there is no such concentration (and the convergence is smooth), and finally the intermediate or neck regions. The following result about these intermediate regions is a Ricci shrinker version of Theorem 1.8 in [3] for manifolds with pointwise Ricci bounds.

Theorem 3.4

(Neck theorem for Ricci shrinkers) Let \(n\ge 4\), \(r\ge 2\), \({\underline{\mu }} > -\infty \), \(k\in {\mathbb {N}}\) and \(\varepsilon >0\) be given constants. Then there exist \(\varepsilon _{\textrm{neck}}>0\), \(\sigma _1>0\) and \(\gamma <\infty \) such that the following holds.

Let \((M,g,f)\) be a complete n-dimensional gradient Ricci shrinker such that \(\mu (g) \ge {\underline{\mu }}\) and the local energy bounds (1.3) are satisfied if \(n>4\). Take \(q \in B_g(p,r+1)\) where \(p:= \mathop {\mathrm {arg\,min}}\limits _M f\). Let \(A_{s_1, s_2}(q) \subset M\) be the unique connected component of the geodesic annulus \({\bar{A}}_{s_1, s_2}(q)\) which satisfies the condition \(A_{s_1, s_2}(q) \cap \partial B_g(q,s_2) \ne \emptyset \) (according to Lemma 3.1) and with

$$\begin{aligned} s_2 \le \sigma _1, \qquad s_1 \le \varepsilon _{\textrm{neck}} s_2. \end{aligned}$$
(3.7)

Finally, assume that

$$\begin{aligned} \int _{A_{s_1, s_2}(q)} \left| \textrm{Rm}_g\right| ^{n/2}_g dV_g \le \varepsilon _{\textrm{neck}}. \end{aligned}$$
(3.8)

Then there is some \(\Gamma \subset O(n)\) acting freely on \(S^{n-1}\) with \(\left| \Gamma \right| \le \gamma \) and an \(\varepsilon \)-quasi-isometry \(\psi \) with

$$\begin{aligned} A_{(\varepsilon _{\textrm{neck}}^{-1/2} + \varepsilon )s_1, (\varepsilon _{\textrm{neck}}^{1/2}-\varepsilon )s_2}(q) \subset \psi \Big ({\mathcal {C}}_{\varepsilon _{\textrm{neck}}^{-1/2}s_1, \varepsilon _{\textrm{neck}}^{1/2}s_2}(S^{n-1}/\Gamma )\Big ) \subset A_{(\varepsilon _{\textrm{neck}}^{-1/2} - \varepsilon )s_1, (\varepsilon _{\textrm{neck}}^{1/2}+\varepsilon )s_2}(q)\nonumber \\ \end{aligned}$$
(3.9)

such that for all \({\mathcal {C}}_{\frac{1}{2}s,s} (S^{n-1}/\Gamma ) \subset {\mathcal {C}}_{\varepsilon _{\textrm{neck}}^{-1/2}s_1, \varepsilon _{\textrm{neck}}^{1/2}s_2}(S^{n-1}/\Gamma )\) in local coordinates one has

$$\begin{aligned} \left| (\psi ^*(s^{-2}g))_{ij} - \delta _{ij}\right| _{C^k} \le \varepsilon . \end{aligned}$$
(3.10)

Proof

The proof is in two steps. We first prove the following claim:

Claim 3.5

There exist \(\varepsilon _{\textrm{neck}}, \sigma _1, \gamma \) such that for each s as in the statement of the theorem, (3.9)–(3.10) hold for some \(\psi _s: {\mathcal {C}}_{\frac{1}{2}s,s}(S^{n-1} / \Gamma _s) \rightarrow A_{\varepsilon _{\textrm{neck}}^{-1/2} s_1, \varepsilon _{\textrm{neck}}^{1/2} s_2}(q)\) where \(\psi _s, \Gamma _s\) may a priori depend on s.

Proof

Assume towards a contradiction that the claim is not true. Then for given \(n\ge 4\), \(r\ge 2\), \({\underline{\mu }} > -\infty \), \(k\in {\mathbb {N}}\) and \(\varepsilon >0\) there exist sequences \(\varepsilon _i\rightarrow 0\), \(\sigma _i\rightarrow 0\) and a family of complete n-dimensional Ricci shrinkers \((M_i,g_i,f_i)\) containing annuli \(A_{s^i_1, s^i_2}(q_i)\) satisfying the assumptions of the theorem but containing some sub-annuli \(A_{\frac{1}{2}s_i,s_i}(q_i)\) (with \(s_i\in [2\varepsilon _i^{-1/2}s^i_1, \varepsilon _i^{1/2}s^i_2]\)) that, after rescaling the metric by \(\widetilde{g}_i:=s_i^{-2}g_i\), are not \(\varepsilon \)-close in the \(C^k\) topology to an annular portion of any cone \({\mathcal {C}}(S^{n-1}/\Gamma )\).

By Theorem 2.6, we can take a pointed orbifold Cheeger–Gromov limit of \((M_i,\widetilde{g}_i,f_i,q_i)\) converging to an orbifold \((V,h,q_\infty )\). By the condition \(s_i\in [2\varepsilon _i^{-1/2}s^i_1, \varepsilon _i^{1/2}s^i_2]\), we obtain for every \(\ell \in {\mathbb {N}}\) that for sufficiently large i we have \(A_{\frac{1}{2\ell }s_i, \ell s_i}(q_i) \subset A_{s^i_1, s^i_2}(q_i)\) and therefore

$$\begin{aligned} \int _{A_{\frac{1}{2\ell }s_i, \ell s_i}(q_i)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \le \varepsilon _i \rightarrow 0. \end{aligned}$$

In particular, after rescaling \(\widetilde{g}_i:=s_i^{-2}g_i\), there are no points of bad convergence on the annulus \(A^{\widetilde{g}_i}_{(2\ell )^{-1}\!,\, \ell }(q_i)\). Repeating this for larger and larger \(\ell \), we see that the convergence is smooth away from \(q_\infty \), i.e. \((V,h)\) has at most one orbifold point. Moreover, using the argument from (2.5) applied to larger and larger balls, we obtain that the limit is flat.
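The rescaling step rests on the scale invariance of the \(L^{n/2}\) curvature energy, which we record as a short computation. For \(\widetilde{g}=s^{-2}g\) with \(s>0\), one has \(\left| \textrm{Rm}_{\widetilde{g}}\right| _{\widetilde{g}} = s^{2}\left| \textrm{Rm}_g\right| _g\), \(dV_{\widetilde{g}} = s^{-n}dV_g\) and \(B_{\widetilde{g}}(q,\rho ) = B_g(q,s\rho )\), so that for any measurable \(\Omega \subset M\)

$$\begin{aligned} \int _\Omega \left| \textrm{Rm}_{\widetilde{g}}\right| ^{n/2}_{\widetilde{g}} dV_{\widetilde{g}} = \int _\Omega s^{n}\left| \textrm{Rm}_g\right| ^{n/2}_g\, s^{-n}\, dV_g = \int _\Omega \left| \textrm{Rm}_g\right| ^{n/2}_g dV_g. \end{aligned}$$

In particular, the annulus \(A^{\widetilde{g}_i}_{(2\ell )^{-1}\!,\, \ell }(q_i)\) coincides as a set with \(A_{\frac{1}{2\ell }s_i, \ell s_i}(q_i)\) and carries the same energy, which is at most \(\varepsilon _i\).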

Following the proof of Theorem 1.8 in [3], respectively Section 5 of [9] (which we can certainly do because the Ricci-flatness of the limit gives pointwise Ricci bounds on the annuli \(A^{\widetilde{g}_i}_{(2\ell )^{-1}\!,\, \ell }(q_i)\)), we see that therefore the limit \((V,h)\) must be a Euclidean cone, i.e. there exists some \(\Gamma \subset O(n)\) acting freely on \(S^{n-1}\) such that \((V,h) = {\mathcal {C}}(S^{n-1}/\Gamma )\). As the convergence is smooth away from the origin, we obtain the desired contradiction and hence the claim holds true. \(\square \)

Having obtained \(\psi _s\) and groups \(\Gamma _s\), all that remains is to rule out the possible s dependence. This is the second step of the proof.

Claim 3.6

The subgroup \(\Gamma _s\) is independent of s and, after slight modifications, some (or all) of the maps \(\psi _s\) can be combined to yield the map \(\psi \) in the statement of the theorem.

Proof

We know there are constants \(\varepsilon _{\textrm{neck}}, \sigma _1, \gamma \) such that for any s with \(2\varepsilon _{\textrm{neck}}^{-1/2} s_1 \le s \le \varepsilon _{\textrm{neck}}^{1/2}s_2\), the annulus \(A_{\frac{1}{2}s,s}(q)\) is \(\varepsilon \)-quasi-isometric and \(\varepsilon s^{-k}\)-close in the \(C^k\) sense to an annular region in a cone, \({\mathcal {C}}_{\frac{1}{2}s,s}(S^{n-1}/\Gamma _s)\). Now take \(\varepsilon \) sufficiently small and, for some fixed s, take \(s'\) very close to s. On its maximal domain of definition \(\psi ^{-1}_{s'} \circ \psi _s\) is a \(2\varepsilon \)-quasi-isometry. Therefore \(\Gamma _s\) is locally constant and thus independent of s.

Once we know that the cone \({\mathcal {C}}(S^{n-1}/\Gamma )\) is fixed, we can let \(t_i:= (\frac{2}{3})^{i-1}\varepsilon _{\textrm{neck}}^{1/2}s_2\) and set \(\psi _i = \psi _{t_i}\). These maps \(\psi _i\) almost agree after radial scaling and hence, after a further slight modification, can then be connected piecewise to yield the map \(\psi \), precisely following the argument from [3], page 241. \(\square \)
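As a quick check on the choice of the ratio \(\tfrac{2}{3}\), note that consecutive radial ranges overlap, which is presumably what makes the comparison and gluing of the maps possible. With \(t_{i+1}=\tfrac{2}{3}t_i\) we have

$$\begin{aligned} \big [\tfrac{1}{2}t_{i+1}, t_{i+1}\big ] = \big [\tfrac{1}{3}t_i, \tfrac{2}{3}t_i\big ], \qquad \big [\tfrac{1}{3}t_i, \tfrac{2}{3}t_i\big ] \cap \big [\tfrac{1}{2}t_i, t_i\big ] = \big [\tfrac{1}{2}t_i, \tfrac{2}{3}t_i\big ], \end{aligned}$$

so the radial ranges of \(\psi _i\) and \(\psi _{i+1}\) share an interval whose endpoints have ratio \(\tfrac{4}{3}\), independently of i.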

Combining the two claims, the theorem is proved. \(\square \)

4 Improved Kato inequality and energy estimate in necks

The main estimate of this section, Theorem 4.5, which is to some extent inspired by work of Bando and Bando–Kasue–Nakajima on Einstein manifolds [7, 9], will allow us to show that energy does not concentrate in a neck region during the bubble tree construction.

The following proposition, on which the energy estimate from Theorem 4.5 is based, can be seen as a purely analytical result, which requires only the uniform local Sobolev constant bounds from Lemma 2.4.

Proposition 4.1

(Annulus estimate) Let \(n\ge 4\), \(r\ge 2\), \({\underline{\mu }} > -\infty \) and \(\alpha >1\) be given constants. Then there exist \(\varepsilon _{\textrm{ann}}>0\), \(\sigma _2>0\) and \(C_2<\infty \) such that the following holds.

Let \((M,g,f)\) be an n-dimensional gradient Ricci shrinker with \(\mu (g) \ge {\underline{\mu }}\). Take \(q \in B_g(p,r+1)\) where \(p:= \mathop {\mathrm {arg\,min}}\limits _M f\) and let \(A_{s_1, s_2}(q) \subset M\) be the unique connected component of the geodesic annulus \({\bar{A}}_{s_1, s_2}(q)\) which satisfies the condition \(A_{s_1, s_2}(q) \cap \partial B_g(q,s_2) \ne \emptyset \) (according to Lemma 3.1) and with

$$\begin{aligned} s_2 \le \sigma _2, \qquad s_1 \le \tfrac{1}{4} s_2. \end{aligned}$$

Finally, let \(u,v\) be non-negative functions such that \(\Delta _f u = \Delta u - \left\langle \nabla f,\nabla u\right\rangle \ge - uv\) and suppose that \(v \in L^{\frac{n}{2}}\) with

$$\begin{aligned} \int _{A_{s_1,s_2}(q)} v^{\frac{n}{2}} dV_g \le \varepsilon _{\textrm{ann}} \end{aligned}$$
(4.1)

and \(u \in L^\alpha \). Then for \(\gamma =\frac{n}{n-2}\), we have

$$\begin{aligned} \int _{A_{s_1,s_2}(q)} u^{\alpha \gamma } dV_g \le C_2 \int _{A_{s_1, 2s_1}(q)\, \cup \, A_{\frac{1}{2}s_2, s_2}(q)}u^{\alpha \gamma } dV_g. \end{aligned}$$

Proof

The first part of the proof (up to (4.4) below) is related to the first step in a standard Moser iteration or epsilon regularity argument with some extra work to take care of the \(\nabla f\) terms coming from the drift Laplacian.

We will work with a cutoff function \(0\le \varphi \le 1\) with compact support in \(A_{s_1,s_2}(q) \subset B_g(p,2r)\) which we will determine more precisely further below. We have

$$\begin{aligned} -\int _M \varphi ^2 u^{\alpha -1}\Delta u \, dV_g&= \int _M \varphi ^2\nabla u^{\alpha -1} \nabla u \, dV_g + \int _M \nabla \varphi ^2 \cdot u^{\alpha -1} \nabla u \, dV_g\\&= \tfrac{4(\alpha -1)}{\alpha ^2} \int _M \varphi ^2\left| \nabla u^{\alpha /2}\right| ^2 dV_g + \tfrac{4}{\alpha } \int _M \varphi \nabla \varphi \cdot u^{\alpha /2}\nabla u^{\alpha /2} dV_g \end{aligned}$$

Rearranging this and applying the differential inequality \(\Delta _f u = \Delta u - \left\langle \nabla f,\nabla u\right\rangle \ge -uv\), we get

$$\begin{aligned} \tfrac{4(\alpha -1)}{\alpha ^2} \int _M \varphi ^2 \left| \nabla u^{\alpha /2}\right| ^2 dV_g\le & {} \int _M \varphi ^2 u^\alpha v \, dV_g - \int _M \varphi ^2 u^{\alpha -1} \left\langle \nabla f, \nabla u\right\rangle dV_g \\{} & {} - \tfrac{4}{\alpha }\int _M \varphi \nabla \varphi \cdot u^{\alpha /2}\nabla u^{\alpha /2} dV_g. \end{aligned}$$

Using Young’s inequality, the last term on the right hand side can be estimated by

$$\begin{aligned} - \tfrac{4}{\alpha }\int _M \varphi \nabla \varphi \cdot u^{\alpha /2}\nabla u^{\alpha /2} dV_g \le \tfrac{2}{\alpha }\bigg [\tfrac{\alpha -1}{\alpha } \int _M \varphi ^2 \left| \nabla u^{\alpha /2}\right| ^2 dV_g + \tfrac{\alpha }{\alpha -1} \int _M \left| \nabla \varphi \right| ^2 u^\alpha dV_g\bigg ]. \end{aligned}$$

Hence, after absorption, we find

$$\begin{aligned} \tfrac{2(\alpha -1)}{\alpha ^2} \int _M \varphi ^2 \left| \nabla u^{\alpha /2}\right| ^2 dV_g\le & {} \int _M \varphi ^2 u^\alpha v \, dV_g + \tfrac{2}{\alpha -1}\int _M \left| \nabla \varphi \right| ^2 u^\alpha dV_g \nonumber \\{} & {} - \int _M \varphi ^2 u^{\alpha -1} \left\langle \nabla f, \nabla u\right\rangle dV_g. \end{aligned}$$
(4.2)
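The coefficients appearing in the computation above can be checked directly with the chain rule. The following lines are a verification sketch, using only \(\nabla u^{\alpha /2} = \tfrac{\alpha }{2}u^{\alpha /2-1}\nabla u\) where \(u>0\):

$$\begin{aligned} \nabla u^{\alpha -1}\cdot \nabla u&= (\alpha -1)u^{\alpha -2}\left| \nabla u\right| ^2 = \tfrac{4(\alpha -1)}{\alpha ^2}\left| \nabla u^{\alpha /2}\right| ^2,\\ \nabla \varphi ^2 \cdot u^{\alpha -1}\nabla u&= 2\varphi \nabla \varphi \cdot u^{\alpha -1}\nabla u = \tfrac{4}{\alpha }\,\varphi \nabla \varphi \cdot u^{\alpha /2}\nabla u^{\alpha /2},\\ \tfrac{4(\alpha -1)}{\alpha ^2} - \tfrac{2}{\alpha }\cdot \tfrac{\alpha -1}{\alpha }&= \tfrac{2(\alpha -1)}{\alpha ^2}. \end{aligned}$$

The last line is the absorption of the gradient term from Young's inequality into the left-hand side, and the remaining coefficient \(\tfrac{2}{\alpha }\cdot \tfrac{\alpha }{\alpha -1} = \tfrac{2}{\alpha -1}\) is the one in front of \(\int _M \left| \nabla \varphi \right| ^2u^\alpha dV_g\) in (4.2).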

Let us now estimate the term involving \(\nabla f\). Integrating by parts yields

$$\begin{aligned} - \int _M \varphi ^2 u^{\alpha -1} \left\langle \nabla f, \nabla u\right\rangle dV_g&= \int _M u^\alpha \left\langle \nabla \varphi ^2 , \nabla f \right\rangle dV_g \\&\quad + \int _M \varphi ^2 u \left\langle \nabla u^{\alpha -1}, \nabla f\right\rangle dV_g + \int _M \varphi ^2 u^\alpha \Delta f \, dV_g\\&= \int _M u^\alpha \left\langle \nabla \varphi ^2 , \nabla f \right\rangle dV_g + (\alpha -1)\int _M \varphi ^2 u^{\alpha -1} \left\langle \nabla f, \nabla u\right\rangle dV_g \\&\quad + \int _M \varphi ^2 u^\alpha \Delta f \, dV_g \end{aligned}$$

and thus after subtracting the second term on the right hand side

$$\begin{aligned} -\alpha \int _M \varphi ^2 u^{\alpha -1} \left\langle \nabla f, \nabla u\right\rangle dV_g&= \int _M u^\alpha \left\langle \nabla \varphi ^2 , \nabla f \right\rangle dV_g + \int _M \varphi ^2 u^\alpha \Delta f \, dV_g. \end{aligned}$$

We can therefore estimate, using Young’s inequality,

$$\begin{aligned} - \int _M \varphi ^2 u^{\alpha -1} \left\langle \nabla f, \nabla u\right\rangle dV_g&= \tfrac{1}{\alpha } \int _M u^\alpha \big (\left\langle \nabla \varphi ^2,\nabla f\right\rangle +\varphi ^2\Delta f\big ) dV_g\\&= \tfrac{1}{\alpha } \int _M u^\alpha \big (2\varphi \left\langle \nabla \varphi ,\nabla f\right\rangle +\varphi ^2\Delta f\big ) dV_g\\&\le \tfrac{1}{\alpha } \int _M \left| \nabla \varphi \right| ^2 u^\alpha dV_g + \tfrac{1}{\alpha } \int _M \varphi ^2 u^\alpha \big (\left| \nabla f\right| ^2+\Delta f\big ) dV_g\\&\le \tfrac{1}{\alpha } \int _M \left| \nabla \varphi \right| ^2 u^\alpha dV_g + \tfrac{C(r)}{\alpha } \int _M \varphi ^2 u^\alpha dV_g, \end{aligned}$$

where C(r) is a bound on \(\left| \nabla f\right| ^2+\Delta f\) inside \(B_g(p,2r)\). (Such a bound clearly exists: for \(\left| \nabla f\right| ^2\) we have derived it in (3.3) and, using the trace of the shrinker equation (1.1) and the fact that \(R_g \ge 0\), we also have \(\Delta f \le \tfrac{n}{2}\) everywhere.) Plugging this last estimate into (4.2), we obtain

$$\begin{aligned} \int _M \varphi ^2 \left| \nabla u^{\alpha /2}\right| ^2 dV_g&\le \tfrac{\alpha ^2}{2(\alpha -1)}\bigg [\int _M \varphi ^2 u^\alpha v\, dV_g+ \big (\tfrac{2}{\alpha -1}+\tfrac{1}{\alpha }\big )\int _M \left| \nabla \varphi \right| ^2 u^\alpha dV_g+ \tfrac{C(r)}{\alpha }\int _M \varphi ^2 u^\alpha dV_g\bigg ]\\&\le C \int _M \varphi ^2 u^\alpha v + \left| \nabla \varphi \right| ^2 u^\alpha + \varphi ^2 u^\alpha dV_g, \end{aligned}$$

where

$$\begin{aligned} C=C(n,r,{\underline{\mu }},\alpha )= \tfrac{\alpha ^2}{2(\alpha -1)}\max \Big \{1,\tfrac{2}{\alpha -1}+\tfrac{1}{\alpha }, \tfrac{C(r)}{\alpha }\Big \}. \end{aligned}$$
(4.3)
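An explicit admissible value for \(C(r)\) can be obtained as follows; this is a sketch based only on (3.3) and the traced shrinker equation \(R_g + \Delta f = \tfrac{n}{2}\). On \(B_g(p,2r)\) we have

$$\begin{aligned} \left| \nabla f\right| ^2_g + \Delta f \le \tfrac{1}{4}\big (2r+\sqrt{2n}\big )^2 + \tfrac{n}{2} - R_g \le \big (r+\sqrt{n/2}\big )^2 + \tfrac{n}{2}, \end{aligned}$$

so one may take \(C(r) = \big (r+\sqrt{n/2}\big )^2+\tfrac{n}{2}\), which also depends on n. Only an upper bound is needed here, since this quantity is integrated against \(\varphi ^2u^\alpha \ge 0\). Note also that, for fixed r, the prefactor \(\tfrac{\alpha ^2}{2(\alpha -1)}\) in (4.3) blows up as \(\alpha \searrow 1\).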

Next, combining this estimate with the uniform Sobolev inequality from Lemma 2.4 (which we can apply if \(\sigma _2 \le \delta _0(2r)\)) and the smallness assumption (4.1), and noting that \(2^*=2\gamma \), we conclude

$$\begin{aligned} \left( \int _M \big (\varphi u^{\alpha /2}\big )^{2\gamma }dV_g\right) ^{\frac{1}{\gamma }}&\le C_S \int _M \left| \nabla \left( \varphi u^{\alpha /2}\right) \right| ^2 dV_g\\&\le C_S \int _M \left| \nabla \varphi \right| ^2 u^\alpha + \varphi ^2\left| \nabla u^{\alpha /2}\right| ^2 dV_g\\&\le C_S (C+1) \int _M \varphi ^2 u^\alpha v + \left| \nabla \varphi \right| ^2 u^\alpha + \varphi ^2 u^\alpha dV_g\\&\le C_S (C+1) \bigg [\left( \int _M v^{\frac{n}{2}}dV_g\right) ^{\frac{2}{n}}\left( \int _M \varphi ^{2\gamma } u^{\alpha \gamma }dV_g\right) ^{\frac{1}{\gamma }} \\&\quad + \int _M \varphi ^2 u^\alpha + \left| \nabla \varphi \right| ^2 u^\alpha dV_g\bigg ]\\&\le C_S (C+1) \bigg [\varepsilon _{\textrm{ann}}^{2/n}\left( \int _M \big (\varphi u^{\alpha /2}\big )^{2\gamma }dV_g\right) ^{\frac{1}{\gamma }} + \int _M \varphi ^2 u^\alpha + \left| \nabla \varphi \right| ^2 u^\alpha dV_g\bigg ]. \end{aligned}$$

We can absorb the first term on the last line by taking \(\varepsilon _{\textrm{ann}}\) small enough. For example, letting \(\varepsilon _{\textrm{ann}}^{2/n} \le \frac{1}{2C_S(C+1)}\), we obtain

$$\begin{aligned} \left( \int _M \big (\varphi u^{\alpha /2}\big )^{2\gamma }dV_g\right) ^{\frac{1}{\gamma }} \le 2C_S(C+1) \int _M \varphi ^2 u^\alpha + \left| \nabla \varphi \right| ^2 u^\alpha dV_g. \end{aligned}$$
(4.4)
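Two elementary steps are used in the chain of inequalities leading to (4.4); we spell them out as a sketch. First, the Hölder exponents match because

$$\begin{aligned} \tfrac{2}{n} + \tfrac{1}{\gamma } = \tfrac{2}{n} + \tfrac{n-2}{n} = 1. \end{aligned}$$

Second, with the shorthand (introduced only for this illustration) \(X:=\big (\int _M (\varphi u^{\alpha /2})^{2\gamma }dV_g\big )^{1/\gamma }\), \(Y:=\int _M \varphi ^2u^\alpha + \left| \nabla \varphi \right| ^2u^\alpha dV_g\) and \(K:=C_S(C+1)\), the absorption reads

$$\begin{aligned} X \le K\varepsilon _{\textrm{ann}}^{2/n}X + KY \le \tfrac{1}{2}X + KY \quad \Longrightarrow \quad X \le 2KY, \end{aligned}$$

provided X is finite, so that \(\tfrac{1}{2}X\) may be subtracted from both sides.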

Now, choose \(0\le \varphi \le 1\) so that \(\varphi = 1\) on \(A_{2s_1,\frac{1}{2}s_2}(q)\), \(\varphi = 0\) on \(M {\setminus } A_{s_1,s_2}\left( q\right) \), and

$$\begin{aligned} \left| \nabla \varphi \right| \le {\left\{ \begin{array}{ll} \frac{C'}{s_1} &{} \text {on } A_{s_1,2s_1}(q) \\ \frac{C'}{s_2} &{} \text {on } A_{\frac{1}{2}s_2, s_2}(q) \end{array}\right. } \end{aligned}$$
(4.5)

for some universal constant \(C'<\infty \). Using (4.4), we get

$$\begin{aligned} \bigg (\int _{A_{s_1,s_2}(q)} u^{\alpha \gamma } dV_g\bigg )^{\frac{1}{\gamma }}&\le \bigg (\int _{A_{s_1,2s_1}(q)\, \cup \, A_{\frac{1}{2}s_2,s_2}(q)} u^{\alpha \gamma } dV_g\bigg )^{\frac{1}{\gamma }} + \bigg (\int _{A_{2s_1,\frac{1}{2}s_2}(q)} \big (u^{\alpha /2}\big )^{2\gamma }dV_g\bigg )^{\frac{1}{\gamma }}\\&\le \bigg (\int _{A_{s_1,2s_1}(q)\, \cup \, A_{\frac{1}{2}s_2,s_2}(q)} u^{\alpha \gamma } dV_g\bigg )^{\frac{1}{\gamma }} + \bigg (\int _M \big (\varphi u^{\alpha /2}\big )^{2\gamma }dV_g\bigg )^{\frac{1}{\gamma }}\\&\le \bigg (\int _{A_{s_1,2s_1}(q)\, \cup \, A_{\frac{1}{2}s_2,s_2}(q)} u^{\alpha \gamma } dV_g\bigg )^{\frac{1}{\gamma }} \\&\quad + 2C_S(C+1) \int _M \varphi ^2 u^\alpha + \left| \nabla \varphi \right| ^2 u^\alpha dV_g. \end{aligned}$$

Hölder’s inequality yields

$$\begin{aligned} \int _M \varphi ^2 u^\alpha dV_g \le {{\,\textrm{Vol}\,}}^{\frac{2}{n}}(A_{s_1,s_2}(q)) \cdot \bigg (\int _{A_{s_1,s_2}(q)} u^{\alpha \gamma } dV_g\bigg )^{\frac{1}{\gamma }} \end{aligned}$$

Hence, if \(\sigma _2\) is chosen sufficiently small such that \({{\,\textrm{Vol}\,}}^{\frac{2}{n}}(A_{s_1,s_2}(q)) \le \frac{1}{4C_S(C+1)}\) – which can be done due to the uniform volume growth estimate (2.1) – then this term can be absorbed, leading to

$$\begin{aligned} \bigg (\int _{A_{s_1,s_2}(q)} u^{\alpha \gamma } dV_g\bigg )^{\frac{1}{\gamma }} \le 2\bigg (\int _{A_{s_1,2s_1}(q)\, \cup \, A_{\frac{1}{2}s_2,s_2}(q)} u^{\alpha \gamma } dV_g\bigg )^{\frac{1}{\gamma }} + 4C_S(C+1) \int _M \left| \nabla \varphi \right| ^2 u^\alpha dV_g. \end{aligned}$$

Finally, applying Hölder’s inequality also to the last term, we find for some \(C''<\infty \)

$$\begin{aligned} \int _M \left| \nabla \varphi \right| ^2 u^\alpha dV_g\le & {} \left( \int _{\textrm{supp}(\nabla \varphi )} \left| \nabla \varphi \right| ^n dV_g\right) ^{\frac{2}{n}}\left( \int _{\textrm{supp}(\nabla \varphi )} u^{\alpha \gamma } dV_g\right) ^{\frac{1}{\gamma }}\\\le & {} C'' \bigg (\int _{A_{s_1,2s_1}(q)\, \cup \, A_{\frac{1}{2}s_2,s_2}(q)} u^{\alpha \gamma } dV_g\bigg )^{\frac{1}{\gamma }}. \end{aligned}$$

Here, we have used the volume growth estimate (2.1) and the assumption (4.5) for the last estimate. We therefore conclude

$$\begin{aligned} \bigg (\int _{A_{s_1,s_2}(q)} u^{\alpha \gamma } dV_g\bigg )^{\frac{1}{\gamma }} \le (2+4C_S(C+1)C'')\bigg (\int _{A_{s_1,2s_1}(q)\, \cup \, A_{\frac{1}{2}s_2,s_2}(q)} u^{\alpha \gamma } dV_g\bigg )^{\frac{1}{\gamma }} \end{aligned}$$

and hence the proposition is proved with \(C_2 = (2+4C_S(C+1)C'')^\gamma \). \(\square \)
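For illustration, the constant \(C''\) can be made explicit under the assumption that the volume growth estimate (2.1) takes the form \({{\,\textrm{Vol}\,}}(B_g(q,s))\le V_0s^n\) for the radii in question; the precise form of (2.1) is not restated here, so this is only a sketch. Since \(\textrm{supp}(\nabla \varphi ) \subseteq A_{s_1,2s_1}(q)\cup A_{\frac{1}{2}s_2,s_2}(q)\), the bounds (4.5) give

$$\begin{aligned} \int _{\textrm{supp}(\nabla \varphi )} \left| \nabla \varphi \right| ^n dV_g \le \Big (\frac{C'}{s_1}\Big )^n {{\,\textrm{Vol}\,}}(B_g(q,2s_1)) + \Big (\frac{C'}{s_2}\Big )^n {{\,\textrm{Vol}\,}}(B_g(q,s_2)) \le (C')^n V_0\,(2^n+1). \end{aligned}$$

Under this assumption one could take \(C'' = (C')^2\big (V_0(2^n+1)\big )^{2/n}\), which is independent of \(s_1\) and \(s_2\). This scale invariance is the reason for measuring \(\left| \nabla \varphi \right| \) in \(L^n\).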

Endowed with this proposition, we would now like to show that for small annuli \(A_{s_1, s_2}(q)\) (under assumptions similar to the ones in the neck theorem), the energy of the entire annulus can be estimated by the energy of the two dyadic annuli \(A_{s_1, 2s_1}(q)\) and \(A_{\frac{1}{2}s_2, s_2}(q)\). It is tempting to use the equation

$$\begin{aligned} \Delta _f \left| \textrm{Rm}\right| \ge - C\left| \textrm{Rm}\right| ^2, \end{aligned}$$
(4.6)

and try to apply Proposition 4.1 to \(u=\left| \textrm{Rm}\right| \), \(v=C\left| \textrm{Rm}\right| \) with \(\alpha \gamma =\frac{n}{2}\), but unfortunately, this does not work: for example if \(n=4\), we have \(\gamma =\frac{n}{n-2}=2=\frac{n}{2}\), so we would need to work with \(\alpha =1\), but the proposition crucially needs \(\alpha >1\) and, as can be clearly seen from (4.3), the constant \(C_2\) degenerates as \(\alpha \searrow 1\). It is therefore necessary to improve the differential inequality (4.6), which we will do in the following. A key ingredient for this is the following improved Kato inequality for gradient Ricci shrinkers.

Lemma 4.2

(Improved Kato inequality) There exists a constant \(\delta _K = \delta _K(n)>0\) such that the following holds. If \((M,g,f)\) is an n-dimensional oriented gradient Ricci shrinker, then

$$\begin{aligned} (1+\delta _K)\left| \nabla \left| \textrm{Rm}\right| \right| ^2 \le \left| \nabla \textrm{Rm}\right| ^2. \end{aligned}$$

Proof

One can deduce an improved Kato inequality from an explicit calculation similar to the work of Bando–Kasue–Nakajima [9] in the Einstein case. Here however, we will rely on a general framework, due to Branson [11] (see also Calderbank–Gauduchon–Herzlich [14] for a similar result with a quite different proof), for determining when an improved Kato inequality holds on an oriented manifold. Specifically, in order to apply Theorem 4 in [11] we need to consider a first order operator D and a tensor bundle T with sections \(\psi \). Then, if \(D^*D\) is elliptic when acting on T and \(D\psi = 0\) we will have an improved Kato inequality for \(\psi \) away from its zero set. Such conditions are typically satisfied for a curvature tensor because of the Bianchi identities and some extra structure, which in our case is the shrinker equation.

Branson’s framework requires one to work with an operator D which is the sum of generalised gradients (also called Stein–Weiss operators), two examples of which are the exterior derivative d and its adjoint \(d^*\) acting on differential forms, see [31]. Viewing \(\textrm{Rm}\) as a vector bundle valued 2-form in \(\Omega ^2\left( M, \textrm{End}\left( TM\right) \right) \) and taking d to be the exterior covariant derivative, we note that \(d \textrm{Rm}= 0\) by the second Bianchi identity.

Instead of \(d^*\), we would like to work with \(d^*_f = -{{\,\textrm{div}\,}}_f = - e^f {{\,\textrm{div}\,}}(e^{-f} \cdot )\), the adjoint of d with respect to \(e^{-f}dV_g\). This is the natural adjoint to work with in the shrinker setting, but it is not immediately clear if it is a Stein–Weiss operator. However we can use that forms are linear with respect to smooth functions which gives

$$\begin{aligned} {{\,\textrm{div}\,}}_f \textrm{Rm}\left( \cdot \right) = e^f {{\,\textrm{div}\,}}\left( e^{-f}\textrm{Rm}\left( \cdot ,\cdot \right) \right) = e^f {{\,\textrm{div}\,}}\left( \textrm{Rm}\left( e^{-f}\cdot , \cdot \right) \right) . \end{aligned}$$

Thus we are actually dealing with \({{\,\textrm{div}\,}}\), or equivalently \(d^*\), which we know is Stein–Weiss, except now we have applied a transformation to the domain of \(\textrm{Rm}\). However, since this transformation is conformal, the new domain is isomorphic to TM. This is enough for our purposes, since being Stein–Weiss is an algebraic property. Using the second Bianchi identity, the shrinker equation (1.1), and the commutator rule we can compute the following in coordinates:

$$\begin{aligned} \begin{aligned} {{\,\textrm{div}\,}}\textrm{Rm}&= \nabla _p \textrm{Rm}_{ij\ell p}\\&= \nabla _j \textrm{Ric}_{i\ell } - \nabla _i \textrm{Ric}_{j\ell }\\&= \nabla _j \big (\tfrac{1}{2}g_{i\ell } - \nabla _i \nabla _\ell f\big ) - \nabla _i \big (\tfrac{1}{2}g_{j\ell } - \nabla _j \nabla _\ell f\big )\\&= -\nabla _j \nabla _i \nabla _\ell f + \nabla _i \nabla _j \nabla _\ell f \\&= \textrm{Rm}_{ij\ell p}\nabla _p f. \end{aligned} \end{aligned}$$
(4.7)

This is equivalent to \(d^*_f \textrm{Rm}= 0\).
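The equivalence can be seen from the product rule, written out here as a brief sketch in the same coordinate conventions as (4.7):

$$\begin{aligned} ({{\,\textrm{div}\,}}_f \textrm{Rm})_{ij\ell } = e^f \nabla _p\big (e^{-f}\textrm{Rm}_{ij\ell p}\big ) = \nabla _p \textrm{Rm}_{ij\ell p} - \textrm{Rm}_{ij\ell p}\nabla _p f = 0, \end{aligned}$$

where the last equality is exactly (4.7).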

With all of this in mind, we take \(D = d + d^*_f\) and have \(D \textrm{Rm}= 0\). This also gives \(D^*D = \left( d + d^*_f\right) ^2 = \Delta ^H_f\) where \(\Delta ^H_f\) is the f-Hodge Laplacian, which is certainly elliptic. Therefore we can apply Theorem 4 in [11] to get the desired improved Kato inequality away from the set of points where \(\textrm{Rm}= 0\). However, such a set is empty on a non-trivial shrinker. This completes the proof.

To be precise, in dimension \(n=4\), we need to split 2-forms into their self-dual and anti-self-dual parts in order to obtain Stein–Weiss operators \(d_{\pm }\) and \((d^*_f)_{\pm }\), see Branson’s work in [10] for details. \(\square \)

As a corollary, we obtain the following improvement of (4.6).

Corollary 4.3

(Improved differential inequality for the Riemann tensor on a Ricci shrinker) There exists a constant \(C_K=C_K(n)<\infty \) such that for every n-dimensional oriented gradient Ricci shrinker \((M,g,f)\) and \(\delta _K(n)\) from the improved Kato inequality, we have

$$\begin{aligned} \Delta _f \left| \textrm{Rm}\right| ^{1-\delta _K} \ge -C_K \left| \textrm{Rm}\right| ^{2-\delta _K}, \end{aligned}$$
(4.8)

where \(\Delta _f u = \Delta u - \left\langle \nabla f,\nabla u\right\rangle \) denotes the drift Laplacian on \((M,g,f)\).

Proof

The proof is in two steps. We first show the following shrinker version of the evolution equation of the Riemann tensor along the Ricci flow.

Claim 4.4

The Riemann tensor on a gradient Ricci shrinker \((M,g,f)\) satisfies the following equation

$$\begin{aligned} \Delta _f \textrm{Rm}= \textrm{Rm}+ Q(\textrm{Rm}), \end{aligned}$$

where \(Q(\textrm{Rm})\) is a quadratic expression in \(\textrm{Rm}\).

Proof

In the argument below, the quadratic expression \(Q(\textrm{Rm})\) may change from line to line. Working in coordinates, we first note that using the commutator rule, the second Bianchi identity, and (4.7) we have

$$\begin{aligned} \nabla _p \nabla _p \textrm{Rm}_{ijk \ell }= & {} -\nabla _p \nabla _k \textrm{Rm}_{ij \ell p} - \nabla _p \nabla _\ell \textrm{Rm}_{ijpk}\nonumber \\= & {} -\nabla _k \nabla _p \textrm{Rm}_{ij \ell p} - \nabla _\ell \nabla _p \textrm{Rm}_{ijpk} + Q(\textrm{Rm})\nonumber \\= & {} \nabla _k \textrm{Rm}_{ji\ell p}\nabla _p f + \nabla _\ell \textrm{Rm}_{ijkp}\nabla _p f + \textrm{Rm}_{ji\ell p}\nabla _k \nabla _p f + \textrm{Rm}_{ijkp} \nabla _\ell \nabla _p f + Q(\textrm{Rm}).\nonumber \\ \end{aligned}$$
(4.9)

Using the second Bianchi identity for the terms involving first derivatives of the Riemann tensor yields \((\nabla _k \textrm{Rm}_{ji \ell p} + \nabla _\ell \textrm{Rm}_{ijkp})\nabla _p f = \nabla _p \textrm{Rm}_{ijk\ell } \nabla _p f\). The terms involving second derivatives of the shrinker potential are handled using the shrinker equation (1.1) one last time:

$$\begin{aligned} \textrm{Rm}_{ji \ell p}\nabla _k \nabla _p f + \textrm{Rm}_{ijkp} \nabla _\ell \nabla _p f&= \textrm{Rm}_{ji \ell p}\big (\tfrac{1}{2}g_{kp} - \textrm{Ric}_{kp}\big ) + \textrm{Rm}_{ijkp}\big (\tfrac{1}{2}g_{\ell p} - \textrm{Ric}_{\ell p}\big )\\&= \textrm{Rm}_{ijk \ell } + Q(\textrm{Rm}). \end{aligned}$$

Putting everything together, the claim follows. \(\square \)
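The final assembly in this proof can be sketched as follows. The display only combines (4.9) with the Bianchi step and the Hessian step above; the one assumption we make is that the drift Laplacian acts on tensors componentwise, i.e. \(\Delta _f \textrm{Rm}= \Delta \textrm{Rm}- \nabla _{\nabla f}\textrm{Rm}\).

$$\begin{aligned} \Delta \textrm{Rm}_{ijk\ell }&= \nabla _p \textrm{Rm}_{ijk\ell }\nabla _p f + \textrm{Rm}_{ijk\ell } + Q(\textrm{Rm}),\\ \Delta _f \textrm{Rm}_{ijk\ell }&= \Delta \textrm{Rm}_{ijk\ell } - \nabla _p f\, \nabla _p \textrm{Rm}_{ijk\ell } = \textrm{Rm}_{ijk\ell } + Q(\textrm{Rm}). \end{aligned}$$

Here the linear term arises as \(\tfrac{1}{2}\textrm{Rm}_{ji\ell k} + \tfrac{1}{2}\textrm{Rm}_{ijk\ell } = \textrm{Rm}_{ijk\ell }\), using the antisymmetry of \(\textrm{Rm}\) in each of its two index pairs, while the contractions with \(\textrm{Ric}\) are absorbed into \(Q(\textrm{Rm})\).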

Using the identity \(\nabla \left| \textrm{Rm}\right| = \left| \textrm{Rm}\right| ^{-1}\left\langle \nabla \textrm{Rm}, \textrm{Rm}\right\rangle \) and the improved Kato inequality, we compute

$$\begin{aligned} \Delta _f \left| \textrm{Rm}\right| ^{1-\delta _K}&= (1-\delta _K) \nabla \cdot (\left\langle \nabla \textrm{Rm}, \textrm{Rm}\right\rangle \left| \textrm{Rm}\right| ^{-1-\delta _K}) - (1-\delta _K)\left\langle \left\langle \nabla f, \nabla \textrm{Rm}\right\rangle , \textrm{Rm}\right\rangle \left| \textrm{Rm}\right| ^{-1-\delta _K}\\&= (1-\delta _K) \left\langle \Delta \textrm{Rm}, \textrm{Rm}\right\rangle \left| \textrm{Rm}\right| ^{-1-\delta _K} - (1-\delta _K)\left\langle \left\langle \nabla f, \nabla \textrm{Rm}\right\rangle , \textrm{Rm}\right\rangle \left| \textrm{Rm}\right| ^{-1-\delta _K}\\&\quad + (1-\delta _K)\left| \nabla \textrm{Rm}\right| ^2 \left| \textrm{Rm}\right| ^{-1-\delta _K} - (1-\delta _K)(1+\delta _K)\left\langle \nabla \textrm{Rm}, \textrm{Rm}\right\rangle \nabla \left| \textrm{Rm}\right| \left| \textrm{Rm}\right| ^{-2-\delta _K}\\&\ge (1-\delta _K) \left\langle \Delta _f \textrm{Rm}, \textrm{Rm}\right\rangle \left| \textrm{Rm}\right| ^{-1-\delta _K}. \end{aligned}$$
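The last inequality in this computation can be unpacked as follows. This is a sketch in which we assume that \(0<\delta _K<1\) and that the improved Kato inequality is available in the form \((1+\delta _K)\left| \nabla \left| \textrm{Rm}\right| \right| ^2 \le \left| \nabla \textrm{Rm}\right| ^2\), which is the form the computation requires. Substituting \(\left\langle \nabla \textrm{Rm}, \textrm{Rm}\right\rangle = \left| \textrm{Rm}\right| \nabla \left| \textrm{Rm}\right| \) into the two gradient terms gives, at points where \(\textrm{Rm}\ne 0\),

$$\begin{aligned}&(1-\delta _K)\left| \nabla \textrm{Rm}\right| ^2 \left| \textrm{Rm}\right| ^{-1-\delta _K} - (1-\delta _K)(1+\delta _K)\left| \nabla \left| \textrm{Rm}\right| \right| ^2 \left| \textrm{Rm}\right| ^{-1-\delta _K}\\&\quad = (1-\delta _K)\left| \textrm{Rm}\right| ^{-1-\delta _K}\Big ( \left| \nabla \textrm{Rm}\right| ^2 - (1+\delta _K)\left| \nabla \left| \textrm{Rm}\right| \right| ^2\Big ) \ge 0. \end{aligned}$$

These terms can therefore be dropped, and the two remaining terms combine to \((1-\delta _K)\left\langle \Delta _f \textrm{Rm}, \textrm{Rm}\right\rangle \left| \textrm{Rm}\right| ^{-1-\delta _K}\).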

Thus, using Claim 4.4 as well as the fact that \(\left\langle Q(\textrm{Rm}), \textrm{Rm}\right\rangle \ge - {\bar{C}}\left| \textrm{Rm}\right| ^3\) for some constant \({\bar{C}}={\bar{C}}(n)\), we find

$$\begin{aligned} \Delta _f \left| \textrm{Rm}\right| ^{1-\delta _K}&\ge (1-\delta _K)\left\langle \textrm{Rm}+ Q(\textrm{Rm}), \textrm{Rm}\right\rangle \left| \textrm{Rm}\right| ^{-1-\delta _K}\\&= (1-\delta _K)\left| \textrm{Rm}\right| ^{1-\delta _K} + (1-\delta _K)\left\langle Q(\textrm{Rm}), \textrm{Rm}\right\rangle \left| \textrm{Rm}\right| ^{-1-\delta _K}\\&\ge -(1-\delta _K){\bar{C}} \left| \textrm{Rm}\right| ^{2-\delta _K}. \end{aligned}$$

Hence the corollary follows by setting \(C_K:= (1-\delta _K){\bar{C}}\). \(\square \)
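The final chain of inequalities in this proof can be checked term by term. As a sketch, assume \(0<\delta _K<1\) and the pointwise bound \(\left| Q(\textrm{Rm})\right| \le {\bar{C}}\left| \textrm{Rm}\right| ^2\), which is our reading of the bound on \(Q(\textrm{Rm})\) used above. Then, by the Cauchy–Schwarz inequality,

$$\begin{aligned} (1-\delta _K)\left| \textrm{Rm}\right| ^{1-\delta _K}&\ge 0,\\ \left\langle Q(\textrm{Rm}), \textrm{Rm}\right\rangle \left| \textrm{Rm}\right| ^{-1-\delta _K}&\ge -\left| Q(\textrm{Rm})\right| \left| \textrm{Rm}\right| ^{-\delta _K} \ge -{\bar{C}}\left| \textrm{Rm}\right| ^{2-\delta _K}. \end{aligned}$$

Dropping the non-negative first term and multiplying the second estimate by \((1-\delta _K)\) yields (4.8) with \(C_K = (1-\delta _K){\bar{C}}\).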

We can now combine this improved differential inequality with Proposition 4.1 to obtain an energy estimate in neck regions for oriented gradient shrinkers as desired. This is the main result of this section.

Theorem 4.5

(Energy estimate in necks for Ricci shrinkers) Given \(n \ge 4\), \(r\ge 2\), and \({\underline{\mu }} > -\infty \), there exist \(\varepsilon _{\textrm{ee}}>0\), \(\sigma _3>0\) and \(C_3<\infty \) such that the following holds.

Let \((M,g,f)\) be an n-dimensional oriented gradient Ricci shrinker with \(\mu (g) \ge {\underline{\mu }}\). Take \(q \in B_g(p,r+1)\) where \(p:= \mathop {\mathrm {arg\,min}}\limits _M f\) and let \(A_{s_1, s_2}(q) \subset M\) be the unique connected component of the geodesic annulus \({\bar{A}}_{s_1, s_2}(q)\) which satisfies the condition \(A_{s_1, s_2}(q) \cap \partial B_g(q,s_2) \ne \emptyset \) (according to Lemma 3.1) and with

$$\begin{aligned} s_2 \le \sigma _3, \qquad s_1 \le \tfrac{1}{4} s_2. \end{aligned}$$
(4.10)

Finally, assume that

$$\begin{aligned} \int _{A_{s_1, s_2}(q)} \left| \textrm{Rm}_g\right| ^{n/2}_g dV_g \le \varepsilon _{\textrm{ee}}. \end{aligned}$$
(4.11)

Then we have

$$\begin{aligned} \int _{A_{s_1,s_2}(q)} \left| \textrm{Rm}_g\right| ^{n/2}_g dV_g \le C_3 \int _{A_{s_1, 2s_1}(q)\, \cup \, A_{\frac{1}{2}s_2, s_2}(q)}\left| \textrm{Rm}_g\right| ^{n/2}_g dV_g. \end{aligned}$$

Proof

We set \(u:=\left| \textrm{Rm}\right| ^{1-\delta _K}\) and \(v:=C_K\left| \textrm{Rm}\right| \), where \(\delta _K\) and \(C_K\) are from Corollary 4.3. Then (4.8) is equivalent to \(\Delta _f u \ge -uv\). Letting \(\sigma _3=\sigma _2\) and \(\varepsilon _{\textrm{ee}} = C_K^{-n/2}\varepsilon _{\textrm{ann}}\) (with \(\sigma _2\) and \(\varepsilon _{\textrm{ann}}\) given by Proposition 4.1), we have

$$\begin{aligned} \int _{A_{s_1,s_2}(q)} v^{\frac{n}{2}} dV_g \le C_K^{n/2}\varepsilon _{\textrm{ee}} \le \varepsilon _{\textrm{ann}} \end{aligned}$$

and moreover for \(\alpha :=\frac{n-2}{2(1-\delta _K)}>1\) the fact that \(\left| \textrm{Rm}\right| \in L^{\frac{n}{2}}\) (and therefore by Hölder’s inequality \(\left| \textrm{Rm}\right| \in L^{\frac{n-2}{2}}\)) shows that \(u\in L^\alpha \). We can therefore apply Proposition 4.1 which yields the claimed estimate as \(u^{\alpha \gamma }=\left| \textrm{Rm}\right| ^{n/2}\) with \(C_3=C_2(\alpha =\frac{n-2}{2(1-\delta _K)})\). \(\square \)
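The exponent bookkeeping behind this application of Proposition 4.1 can be made explicit. The following sketch only uses the definitions of \(u\) and \(\alpha \); the value of \(\gamma \) is not restated here, so the second identity records which value is needed for \(u^{\alpha \gamma }=\left| \textrm{Rm}\right| ^{n/2}\) to hold. In the Hölder step, \(\Omega \) is a hypothetical label for the region of integration, which we assume to have finite volume.

$$\begin{aligned} u^{\alpha }&= \left| \textrm{Rm}\right| ^{(1-\delta _K)\alpha } = \left| \textrm{Rm}\right| ^{\frac{n-2}{2}},\\ u^{\alpha \gamma }&= \left| \textrm{Rm}\right| ^{\frac{(n-2)\gamma }{2}} = \left| \textrm{Rm}\right| ^{\frac{n}{2}} \quad \text {if } \gamma = \tfrac{n}{n-2},\\ \int _{\Omega } \left| \textrm{Rm}\right| ^{\frac{n-2}{2}} dV_g&\le \Big ( \int _{\Omega } \left| \textrm{Rm}\right| ^{\frac{n}{2}} dV_g\Big )^{\frac{n-2}{n}} \textrm{Vol}_g(\Omega )^{\frac{2}{n}}. \end{aligned}$$

The last line is Hölder's inequality with the conjugate exponents \(\tfrac{n}{n-2}\) and \(\tfrac{n}{2}\). Note also that \(\alpha >1\) is equivalent to \(n > 4 - 2\delta _K\), which holds for all \(n\ge 4\) as soon as \(\delta _K>0\).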

5 Construction of the bubble tree

It is finally time to construct the bubble tree and prove Theorem 1.2. So let \(n\ge 4\) and let \((M_i,g_i,f_i)\) be a sequence of n-dimensional oriented gradient Ricci shrinkers with entropy uniformly bounded below \(\mu (g_i)\ge {\underline{\mu }}>-\infty \) and basepoints \(p_i=\mathop {\mathrm {arg\,min}}\limits _{M_i} f_i\). If \(n>4\), then additionally assume (1.3) – recall that for \(n=4\) this is always satisfied automatically. Finally, we also fix a small \(\varepsilon >0\), \(k\in {\mathbb {N}}\), and \(r\ge 2\) such that \(\mathcal {Q}\cap \partial B_{g_\infty }(p_\infty ,r) = \emptyset \) and let \(\varepsilon _{\textrm{neck}}\), \(\sigma _1\) and \(\gamma \) be the corresponding constants from Theorem 3.4.

By the arguments from Sect. 2 and in particular (2.3), we know that for each \(q^\ell \in \mathcal {Q}_r\) there are \(M_i \ni q^\ell _i \rightarrow q^\ell \) such that the convergence of \(B_{g_i}(p_i,r) {\setminus } \bigcup _{\ell } B_{g_i}(q^\ell _i,\delta )\) is smooth for any sufficiently small \(\delta \ll \delta _0\). In particular, we obtain

$$\begin{aligned} \lim _{\delta \rightarrow 0}\, \lim _{i \rightarrow \infty } \int _{B_{g_i}(p_i,r) \setminus \bigcup _{\ell } B_{g_i}(q^\ell _i,\delta )} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} = \int _{B_{g_\infty }(p_\infty ,r)} \left| \textrm{Rm}_{g_\infty }\right| ^{n/2}_{g_\infty } dV_{g_\infty }. \end{aligned}$$
(5.1)

In the following, we investigate what happens inside the \(\delta \)-balls. We only need to consider \(\delta \) sufficiently small so that the regions \(B_{g_\infty }(q^\ell ,10\cdot \delta ) \setminus \{q^\ell \}\) do not contain any other orbifold points, and i sufficiently large so that all \(B_{g_i}(q^\ell _i,\delta )\) are disjoint. This allows us to focus on a single orbifold point q.

Given such a point \(q \in \mathcal {Q}_r\), we fix a corresponding sequence \(M_i \ni q_i\rightarrow q\) along which the curvature concentrates in the sense of (2.3). The task is then to extract a (finite) number of point-scale sequences that detect all the ALE bubbles that form at q.

The first bubble: Let \({\bar{\varepsilon }}:=\min \{\varepsilon _{\textrm{reg}},\varepsilon _{\textrm{gap}},\varepsilon _{\textrm{neck}}, \varepsilon _{\textrm{ee}}\}\) where \(\varepsilon _{\textrm{reg}}\) is the constant from the \(\varepsilon \)-regularity result (Lemma 2.5), \(\varepsilon _{\textrm{gap}}\) is from Bando’s gap result (Proposition 2.8), \(\varepsilon _{\textrm{neck}}\) has been chosen above as in the neck theorem (Theorem 3.4), and \(\varepsilon _{\textrm{ee}}\) is from the energy estimate in necks (Theorem 4.5). Set

$$\begin{aligned} r_i^1:= \inf \Big \{ r>0 \, \Big | \int _{B_{g_i}(q,r)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \ge \frac{{\bar{\varepsilon }} }{2} \text { for some } B_{g_i}(q,r) \subseteq B_{g_i}(q_i,\delta ) \Big \} \end{aligned}$$

and let \(q_i^1\) be points in \(M_i\) such that \(B_{g_i}(q_i^1,r_i^1) \subseteq B_{g_i}(q_i,\delta )\) and

$$\begin{aligned} \int _{B_{g_i}(q_i^1,r_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \ge \frac{{\bar{\varepsilon }} }{2}. \end{aligned}$$

Clearly \(r^1_i \rightarrow 0\), otherwise there is no curvature concentration as described by (2.3). By Theorem 2.6 the rescaled sequence \((M_i, \widetilde{g}_i = (r_i^1)^{-2}g_i, q_i^1)\) subconverges in the pointed orbifold Cheeger–Gromov sense to a complete, non-compact, Ricci-flat limit \((V^1, h^1, q_\infty ^1)\) with bounded \(L^{n/2}\) Riemannian curvature and which is ALE of order \(n-1\) in general and ALE of order n if either \(n = 4\) or \((V^1,h^1)\) is Kähler. By Corollary 3.3, \((V^1,h^1)\) has one end. Moreover, by the choice of \(r_i^1\), any ball of radius \(r\le 1\) with respect to the rescaled metric \((r_i^1)^{-2}g_i\) (and contained in \(B_{{\tilde{g}}_i}(q_i,(r^1_i)^{-1}\delta )\)) has energy at most \({\bar{\varepsilon }} /2\) and hence the convergence and the limit are smooth everywhere by the characterisation of singular points (respectively points of bad convergence) in Theorem 2.6. We then conclude that

$$\begin{aligned} \int _{B_{h^1}(q_\infty ^1,1)} \left| \textrm{Rm}_{h^1}\right| ^{n/2}_{h^1} dV_{h^1} \ge \frac{{\bar{\varepsilon }} }{2} \end{aligned}$$

which implies the limit is non-flat and hence a (smooth) ALE bubble as in Definition 1.4. By smooth convergence, we conclude that

$$\begin{aligned} \lim _{R\rightarrow \infty } \lim _{i\rightarrow \infty } \int _{B_{g_i}(q_i^1,Rr_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} = \int _{V^1} \left| \textrm{Rm}_{h^1}\right| ^{n/2}_{h^1} dV_{h^1}. \end{aligned}$$
(5.2)

We have now extracted the deepest bubble corresponding to the smallest scale, motivating the following definition.

Definition 5.1

(Leaf and intermediate bubbles) An ALE bubble as in Definition 1.4 is called a leaf bubble if it is smooth. If instead it has finitely many orbifold singularities it is called an intermediate bubble.

If there is further curvature concentration, we continue to extract more point-scale sequences. We first set

$$\begin{aligned} N:= \frac{4E(2r)}{{\bar{\varepsilon }}} \end{aligned}$$
(5.3)

and note that since \(B_{g_i}(q_i,\delta )\subseteq B_{g_i}(p_i,2r)\) contains at most E(2r) energy and our method detects disjoint regions containing at least \({\bar{\varepsilon }}/4\) energy, the process will terminate after a finite number of steps \(N_q\le N\).
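The counting behind this bound can be written out in one line. In the sketch below, \(\Omega _i^1,\ldots ,\Omega _i^{N_q}\subseteq B_{g_i}(q_i,\delta )\) are hypothetical labels for the pairwise disjoint regions detected by the process, each carrying at least \({\bar{\varepsilon }}/4\) energy.

$$\begin{aligned} N_q\cdot \frac{{\bar{\varepsilon }}}{4} \le \sum _{k=1}^{N_q} \int _{\Omega _i^k} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \le \int _{B_{g_i}(p_i,2r)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \le E(2r) \quad \Longrightarrow \quad N_q \le \frac{4E(2r)}{{\bar{\varepsilon }}} = N. \end{aligned}$$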

The second bubble: First, in order to make sure we do not simply find the same bubble again, we pick \(K^1\gg 1\) large enough, so that

$$\begin{aligned} \int _{V^1 \setminus B_{h^1}(q_\infty ^1,K^1)} \left| \textrm{Rm}_{h^1}\right| ^{n/2}_{h^1} dV_{h^1} \le \frac{{\bar{\varepsilon }} }{10N}, \end{aligned}$$

with N given by (5.3), which is possible as \((V^1,h^1)\) has bounded \(L^{n/2}\) curvature. From this, we conclude that for any constant \(R>K^1\) we have

$$\begin{aligned} \int _{A_{K^1r_i^1, Rr_i^1}(q_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \le \frac{{\bar{\varepsilon }} }{8N} \end{aligned}$$
(5.4)

for sufficiently large i. We then set

$$\begin{aligned} r_i^2:= \inf \Big \{ r>0 \, \Big | \int _{B_{g_i}(q,r) \setminus B_{g_i}(q_i^1,K^1 r_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \ge \frac{{\bar{\varepsilon }} }{2} \text { for some } B_{g_i}(q,r) \subseteq B_{g_i}(q_i,\delta ) \Big \} \end{aligned}$$

and let \(q_i^2\) be points in \(M_i\) such that \(B_{g_i}(q_i^2,r_i^2) \subseteq B_{g_i}(q_i,\delta )\) and

$$\begin{aligned} \int _{B_{g_i}(q_i^2,r_i^2) \setminus B_{g_i}(q_i^1,K^1 r_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \ge \frac{{\bar{\varepsilon }} }{2}. \end{aligned}$$
(5.5)

Note that \(r^2_i \ge r^1_i\) by construction. We can assume that \(r^2_i \rightarrow 0\), otherwise there is no more curvature concentration and the process of extracting point-scale sequences stops. We first claim the following.

Claim 5.2

The point-scale sequences satisfy

$$\begin{aligned} \frac{r^2_i}{r^1_i} + \frac{d(q^1_i, q^2_i)}{r^2_i} \rightarrow \infty . \end{aligned}$$
(5.6)

Proof

If (5.6) is not true, then there is some number \(M > K^1\) such that

$$\begin{aligned} 1 \le \frac{r^2_i}{r^1_i} \le M, \qquad \frac{d(q^1_i, q^2_i)}{r^2_i} \le M \end{aligned}$$

and therefore

$$\begin{aligned} \frac{d(q^1_i, q^2_i)}{r^1_i} \le M^2. \end{aligned}$$

This implies that \(q^2_i \in B_{g_i}(q^1_i, M^2 r_i^1)\) and thus

$$\begin{aligned} B_{g_i}(q_i^2,r_i^2) \setminus B_{g_i}(q_i^1,K^1 r_i^1) \subseteq A_{K^1r_i^1, (M^2+M)r_i^1}(q_i^1). \end{aligned}$$

In particular, (5.4) and (5.5) now yield a contradiction (for \(R=M^2+M\)) and hence the claim must hold. \(\square \)
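The inclusion used in the last step is a direct consequence of the triangle inequality. As a sketch, identify the annulus with the corresponding difference of balls and let \(x\in B_{g_i}(q_i^2,r_i^2)\). Using \(r_i^2\le Mr_i^1\) and \(d(q_i^1,q_i^2)\le Mr_i^2\le M^2r_i^1\), we get

$$\begin{aligned} d(x,q_i^1) \le d(x,q_i^2) + d(q_i^2,q_i^1) < r_i^2 + M^2 r_i^1 \le (M^2+M)\, r_i^1. \end{aligned}$$

Since \(R=M^2+M>K^1\), the bounds (5.5) and (5.4) then give, for sufficiently large i,

$$\begin{aligned} \frac{{\bar{\varepsilon }}}{2} \le \int _{B_{g_i}(q_i^2,r_i^2) \setminus B_{g_i}(q_i^1,K^1 r_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \le \int _{A_{K^1r_i^1, Rr_i^1}(q_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \le \frac{{\bar{\varepsilon }}}{8N}, \end{aligned}$$

which is impossible as soon as \(N>\tfrac{1}{4}\). By (5.3) this holds whenever \(E(2r)>{\bar{\varepsilon }}/16\), which is the case here because the ball \(B_{g_i}(q_i^1,r_i^1)\subseteq B_{g_i}(p_i,2r)\) already carries energy at least \({\bar{\varepsilon }}/2\).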

Remark 5.3

An alternative approach is to not pass from \(B_{g_i}(q_i^1,r_i^1)\) to the larger balls \(B_{g_i}(q_i^1,K^1r_i^1)\) but instead mark and later discard the point-scale sequences that do not satisfy (5.6). Such a strategy was used in the bubble tree construction in [13].

We then distinguish two cases.

Case 1: We have

$$\begin{aligned} \frac{d(q^1_i, q^2_i)}{r^2_i} \rightarrow \infty . \end{aligned}$$

This is the easy case because the bubbles are forming separately. Indeed, the reader can easily verify that if we blow up using \((q^2_i,r^2_i)\) in a similar way as for \((q^1_i,r^1_i)\) above, we get the same conclusion and the first bubble will disappear off at infinity. In particular, we obtain another leaf bubble. Clearly, since \(r^2_i \ge r^1_i\), we also have

$$\begin{aligned} \frac{d(q^1_i, q^2_i)}{r^1_i} \rightarrow \infty , \end{aligned}$$

motivating the following definition.

Definition 5.4

(Separable bubbles) If \((q_i^k,r_i^k)\) and \((q_i^\ell ,r_i^\ell )\) are two point-scale sequences such that

$$\begin{aligned} \frac{d(q^k_i, q^\ell _i)}{r^k_i} \rightarrow \infty \quad \text {and}\quad \frac{d(q^k_i, q^\ell _i)}{r^\ell _i} \rightarrow \infty , \end{aligned}$$

then we say that the two associated bubbles \((V^k,h^k)\) and \((V^\ell ,h^\ell )\) are separable.

Case 2: For some M, we have

$$\begin{aligned} \frac{d(q^1_i, q^2_i)}{r^2_i} \le M < \infty . \end{aligned}$$
(5.7)

This is the much more delicate case as the bubbles will form on top of each other. We consider the rescaled sequence \((M_i, \widetilde{g}_i = (r_i^2)^{-2}g_i, q_i^2)\) which by Theorem 2.6 and Corollary 3.3 subconverges in the pointed orbifold Cheeger–Gromov sense to a complete, non-compact, Ricci-flat limit \((V^2, h^2, q_\infty ^2)\) with bounded \(L^{n/2}\) Riemannian curvature and with one end satisfying the required ALE condition. The assumption (5.7) shows that, by possibly passing to a further subsequence, \(q_i^1\) converge to some \({\hat{q}}^1_\infty \in V^2\) (with \(d({\hat{q}}^1_\infty ,q_\infty ^2)\le M\)). Since by (5.6) and (5.7) we know that \(r^1_i/r^2_i \rightarrow 0\), we have energy concentration for \(\widetilde{g}_i\) around \(q_i^1\) and hence the limit point \({\hat{q}}^1_\infty \in V^2\) is an orbifold point. By the choice of \(r^2_i\), we see that there are no other energy concentrations and hence no further orbifold singularities.
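The two facts used in this paragraph can be spelled out. The first line below shows how (5.6) and (5.7) force \(r_i^1/r_i^2\rightarrow 0\); the second line is a sketch of the resulting energy concentration, relying on the scale invariance of the \(L^{n/2}\) curvature energy under \(\widetilde{g}_i = (r_i^2)^{-2}g_i\) and on the lower bound for the energy in \(B_{g_i}(q_i^1,r_i^1)\) from the construction of the first bubble.

$$\begin{aligned} \frac{r^2_i}{r^1_i}&= \Big ( \frac{r^2_i}{r^1_i} + \frac{d(q^1_i, q^2_i)}{r^2_i}\Big ) - \frac{d(q^1_i, q^2_i)}{r^2_i} \ge \Big ( \frac{r^2_i}{r^1_i} + \frac{d(q^1_i, q^2_i)}{r^2_i}\Big ) - M \rightarrow \infty ,\\ \int _{B_{\widetilde{g}_i}(q_i^1,\, r_i^1/r_i^2)} \left| \textrm{Rm}_{\widetilde{g}_i}\right| ^{n/2}_{\widetilde{g}_i} dV_{\widetilde{g}_i}&= \int _{B_{g_i}(q_i^1,r_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \ge \frac{{\bar{\varepsilon }}}{2}. \end{aligned}$$

In other words, with respect to \(\widetilde{g}_i\) a fixed amount of energy sits in balls around \(q_i^1\) whose radii \(r_i^1/r_i^2\) tend to zero.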

Note that for any \(R\ge K^1\), by the choice of \(r^2_i\), we have

$$\begin{aligned} \int _{B_{g_i}(q_i^1,\frac{1}{R} r_i^2) \setminus B_{g_i}(q_i^1,R r_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} < \frac{{\bar{\varepsilon }} }{2} \end{aligned}$$

and hence, for sufficiently large i, \(A_{R r_i^1, \frac{1}{R} r_i^2}(q_i^1)\) satisfies the assumptions (3.7)–(3.8) of the neck theorem; we therefore call such an annulus a neck region. We claim the following.

Claim 5.5

No energy is concentrating in the neck region in the following sense:

$$\begin{aligned} \lim _{R\rightarrow \infty } \lim _{i\rightarrow \infty } \int _{A_{R r_i^1, \frac{1}{R} r_i^2}(q_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} = 0. \end{aligned}$$

Proof

It is clear that for \(R\rightarrow \infty \) and \(i\rightarrow \infty \) the innermost dyadic sub-annulus \(A_{Rr_i^1, 2Rr_i^1}(q_i^1)\) converges smoothly, after rescaling by \(2Rr_i^1\), to an annular portion of a flat cone \({\mathcal {C}}(S^{n-1}/\Gamma _1)\), where \(\Gamma _1\) is given by the asymptotic structure of the end of the ALE bubble \((V^1,h^1)\). Similarly, the outermost dyadic sub-annulus \(A_{\frac{1}{2R} r_i^2, \frac{1}{R} r_i^2}(q_i^1)\) converges smoothly, after rescaling by \(\frac{1}{R} r_i^2\), to an annular portion of a flat cone \({\mathcal {C}}(S^{n-1}/\Gamma _2)\) where \(\Gamma _2\) is given by the orbifold singularity structure at \({\hat{q}}^1_\infty \) in \((V^2,h^2)\). In particular, the energy of these inner- and outermost dyadic annuli is converging to zero.

If \(R>\varepsilon _{\textrm{neck}}^{-1/2}\), then by the neck theorem, for large i there exists an \(\varepsilon \)-quasi isometry from the entire neck region \(A_{Rr_i^1, \frac{1}{R}r_i^2}(q_i^1)\) to \({\mathcal {C}}_{Rr_i^1, \frac{1}{R}r_i^2}(S^{n-1}/\Gamma )\) for some \(\Gamma \) with \(\left| \Gamma \right| <\gamma \). This shows that \(\Gamma _1=\Gamma _2=\Gamma \). One might then show that it is possible to let \(\varepsilon \rightarrow 0\) as \(R\rightarrow \infty \), so that, after rescaling, one obtains (smooth) convergence of any dyadic sub-annulus \(A_{\frac{1}{2}s_i,s_i}(q_i^1) \subseteq A_{Rr_i^1, \frac{1}{R} r_i^2}(q_i^1)\) to a portion of \({\mathcal {C}}(S^{n-1}/\Gamma )\), and hence the energy on each such sub-annulus is converging to zero (as \(R\rightarrow \infty \), \(i\rightarrow \infty \)). But this is not sufficient to conclude the claim, as the number of dyadic sub-annuli could increase very fast with \(i\rightarrow \infty \), and hence a more careful argument is required – a delicate fact that is unfortunately ignored in some articles proving bubbling theorems. This is exactly where our energy estimate in annular neck regions from the last section comes into play. In fact, Theorem 4.5 shows that the energy over the entire neck region can be estimated by the energy of the innermost and outermost dyadic sub-annuli – for which we have just deduced that the energy converges to zero (as \(R\rightarrow \infty \) and \(i\rightarrow \infty \)). The claim therefore follows from this theorem. \(\square \)

Endowed with this claim, we continue the analysis of Case 2. First, we fix some \(R>K^1\) sufficiently large so that

$$\begin{aligned} \lim _{i\rightarrow \infty } \int _{A_{R r_i^1, \frac{1}{R} r_i^2}(q_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \le \frac{{\bar{\varepsilon }} }{8N}. \end{aligned}$$

Combining this with (5.4), we obtain

$$\begin{aligned} \lim _{i\rightarrow \infty }\int _{A_{K^1 r_i^1, \frac{1}{R} r_i^2}(q_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \le \frac{{\bar{\varepsilon }} }{8N}+\frac{{\bar{\varepsilon }} }{8N} = \frac{{\bar{\varepsilon }} }{4N} \le \frac{{\bar{\varepsilon }} }{4}. \end{aligned}$$

Finally, combining this with (5.5) shows that for this R, similarly as in the case of the first bubble

$$\begin{aligned} \int _{B_{h^2}(q_\infty ^2,1)\setminus B_{h^2}({\hat{q}}^1_\infty ,\frac{1}{R})} \left| \textrm{Rm}_{h^2}\right| ^{n/2}_{h^2} dV_{h^2} = \lim _{i\rightarrow \infty } \int _{B_{g_i}(q_i^2,r_i^2) \setminus B_{g_i}(q_i^1,\frac{1}{R}r_i^2)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \ge \frac{{\bar{\varepsilon }} }{4}, \end{aligned}$$

therefore \((V^2,h^2)\) is non-flat and thus an intermediate ALE bubble (see Definitions 1.4, 5.1). We might also say that \((V^2,h^2)\) is a parent of \((V^1,h^1)\).
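The lower bound \({\bar{\varepsilon }}/4\) used here comes from a simple subtraction. As a sketch, identify the annulus \(A_{K^1 r_i^1, \frac{1}{R} r_i^2}(q_i^1)\) with the difference of balls \(B_{g_i}(q_i^1,\frac{1}{R}r_i^2) \setminus B_{g_i}(q_i^1,K^1r_i^1)\), and take i large enough that \(K^1r_i^1<\frac{1}{R}r_i^2\). Since \(B_{g_i}(q_i^2,r_i^2) \setminus B_{g_i}(q_i^1,K^1 r_i^1)\) is contained in the union of \(B_{g_i}(q_i^2,r_i^2) \setminus B_{g_i}(q_i^1,\frac{1}{R}r_i^2)\) and this annulus, (5.5) gives

$$\begin{aligned} \int _{B_{g_i}(q_i^2,r_i^2) \setminus B_{g_i}(q_i^1,\frac{1}{R}r_i^2)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}&\ge \int _{B_{g_i}(q_i^2,r_i^2) \setminus B_{g_i}(q_i^1,K^1 r_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} - \int _{A_{K^1 r_i^1, \frac{1}{R} r_i^2}(q_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}\\&\ge \frac{{\bar{\varepsilon }}}{2} - \int _{A_{K^1 r_i^1, \frac{1}{R} r_i^2}(q_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}. \end{aligned}$$

Passing to the limit \(i\rightarrow \infty \) and using that the limit of the subtracted term is at most \({\bar{\varepsilon }}/4\) leaves at least \({\bar{\varepsilon }}/2-{\bar{\varepsilon }}/4={\bar{\varepsilon }}/4\).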

By smooth convergence, we have

$$\begin{aligned} \lim _{R\rightarrow \infty }\lim _{i\rightarrow \infty } \int _{B_{g_i}(q_i^2,Rr_i^2) \setminus B_{g_i}(q_i^1,\frac{1}{R}r_i^2)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} = \int _{V^2} \left| \textrm{Rm}_{h^2}\right| ^{n/2}_{h^2} dV_{h^2}. \end{aligned}$$
(5.8)

Therefore, combining (5.2), (5.8), and Claim 5.5, we obtain the energy estimate

$$\begin{aligned} \lim _{R\rightarrow \infty } \lim _{i\rightarrow \infty } \int _{B_{g_i}(q_i^2,Rr_i^2)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}&= \lim _{R\rightarrow \infty } \lim _{i\rightarrow \infty } \int _{B_{g_i}(q_i^2,Rr_i^2) \setminus B_{g_i}(q_i^1,\frac{1}{R} r_i^2)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}\\&\quad + \lim _{R\rightarrow \infty } \lim _{i\rightarrow \infty } \int _{B_{g_i}(q_i^1,\frac{1}{R}r_i^2) \setminus B_{g_i}(q_i^1,Rr_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}\\&\quad + \lim _{R\rightarrow \infty } \lim _{i\rightarrow \infty } \int _{B_{g_i}(q_i^1,Rr_i^1)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}\\&= \int _{V^2} \left| \textrm{Rm}_{h^2}\right| ^{n/2}_{h^2} dV_{h^2} + \int _{V^1} \left| \textrm{Rm}_{h^1}\right| ^{n/2}_{h^1} dV_{h^1}. \end{aligned}$$

In particular, all the energy is fully accounted for by the two bubbles detected.

Further bubbles: We then continue to extract more bubbles and to build bubble trees, a concept which is defined as follows.

Definition 5.6

(Bubble tree) A bubble tree T is a tree whose vertices are ALE bubbles and whose edges are neck regions. The single ALE end of each vertex is connected by a neck region (which it meets at its smaller boundary component) to its parent and possibly further ancestors toward the root bubble of the tree T, while at possibly finitely many isolated orbifold points it is connected by more necks (which it meets at their larger boundary components) to its children and possibly further descendants toward leaf bubbles of T. We say two bubble trees \(T_1\) and \(T_2\) are separable if their root bubbles are separable.

We proceed inductively, assuming that we have already extracted \((\ell -1)\) point-scale sequences and the associated bubbles that will form separable bubble trees \(\{T_j\}_{j\in J}\). After possibly re-labelling, we assume \(\{(V^j,h^j)\}_{j\in J}\) are their separable root bubbles and we can ignore all descendants for the argument that follows. Assume further that \(K^j\) are picked (as described for the first bubble above) such that for \(R>K^j\) we have

$$\begin{aligned} \int _{A_{K^jr_i^j, Rr_i^j}(q_i^j)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \le \frac{{\bar{\varepsilon }} }{8N}, \qquad \forall j\in J. \end{aligned}$$
(5.9)

We then set

$$\begin{aligned} r_i^\ell := \inf \Big \{ r>0 \, \Big | \int _{B_{g_i}(q,r) \setminus \bigcup _{j\in J} B_{g_i}(q_i^j,K^j r_i^j)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \ge \frac{{\bar{\varepsilon }} }{2} \text { for some } B_{g_i}(q,r) \subseteq B_{g_i}(q_i,\delta ) \Big \} \end{aligned}$$

and let \(q_i^\ell \) be points in \(M_i\) such that \(B_{g_i}(q_i^\ell ,r_i^\ell ) \subseteq B_{g_i}(q_i,\delta )\) and

$$\begin{aligned} \int _{B_{g_i}(q_i^\ell ,r_i^\ell ) \setminus \bigcup _{j\in J} B_{g_i}(q_i^j,K^j r_i^j)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \ge \frac{{\bar{\varepsilon }} }{2}. \end{aligned}$$
(5.10)

Note that \(r^\ell _i \ge r^j_i\) for each \(j \in J\) by construction. We can assume that \(r^\ell _i \rightarrow 0\), otherwise there is no more curvature concentration and the process of extracting point-scale sequences stops with the \((\ell -1)\) point-scale sequences already extracted.

As for the second bubble, there are now two cases.

Case 1: For all \(j \in J\)

$$\begin{aligned} \frac{d(q^j_i, q^\ell _i)}{r^\ell _i} \rightarrow \infty . \end{aligned}$$

In this case, just as in the case of the second bubble, if we blow up using \((q_i^\ell ,r_i^\ell )\), all other bubbles disappear at infinity and we obtain another leaf bubble which is separable from all trees \(T_j\) (thus forming a new tree consisting only of one vertex).

Case 2: For some \(j \in J\)

$$\begin{aligned} \frac{d(q^j_i, q^\ell _i)}{r^\ell _i} \le M^j < \infty . \end{aligned}$$
(5.11)

In this case, denote by \({\mathcal {J}}\subseteq J\) the set of indices j for which (5.11) holds. By assumption \({\mathcal {J}}\ne \emptyset \). We claim the following.

Claim 5.7

There exists \(\eta >0\) such that for each pair of indices \(j\ne k\) in \({\mathcal {J}}\) we have

$$\begin{aligned} \liminf _{i\rightarrow \infty } \frac{d(q^j_i, q^k_i)}{r^\ell _i} \ge 2\eta >0. \end{aligned}$$

Let us for the moment assume that the claim holds and continue. We consider the rescaled sequence \((M_i, \widetilde{g}_i = (r_i^\ell )^{-2}g_i, q_i^\ell )\) which by Theorem 2.6 and Corollary 3.3 subconverges in the pointed orbifold Cheeger–Gromov sense to a complete, non-compact, Ricci-flat limit \((V^\ell , h^\ell , q_\infty ^\ell )\) with bounded \(L^{n/2}\) Riemannian curvature and with one end satisfying the required ALE condition. By assumption, after possibly passing to a further subsequence, for each \(j\in {\mathcal {J}}\) the sequence \(q_i^j\) converges to some \({\hat{q}}^j_\infty \in V^\ell \) (with \(d({\hat{q}}^j_\infty ,q_\infty ^\ell )\le M^j\)) and by Claim 5.7 these limit points are distinct and at least distance \(\eta \) away from one another. This is the crucial ingredient that allows us to proceed essentially in the exact same way as if there was only one such point. More precisely, as for the second bubble, we can conclude that these points \({\hat{q}}^j_\infty \) are orbifold points of \((V^\ell ,h^\ell )\) and there are no other orbifold singularities. Furthermore, as in Claim 5.5, no energy is concentrating in the neck regions around \(q_i^j\), i.e.

$$\begin{aligned} \lim _{R\rightarrow \infty } \lim _{i\rightarrow \infty } \int _{A_{R r_i^j, \frac{1}{R} r_i^\ell }(q_i^j)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} = 0, \quad \forall j\in {\mathcal {J}}. \end{aligned}$$
(5.12)

In particular, for \(R>\max _{j\in {\mathcal {J}}} K^j\) sufficiently large, we obtain for every \(j\in {\mathcal {J}}\)

$$\begin{aligned} \lim _{i\rightarrow \infty } \int _{A_{R r_i^j, \frac{1}{R} r_i^\ell }(q_i^j)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} \le \frac{{\bar{\varepsilon }} }{8N}. \end{aligned}$$

Combining this with (5.9)–(5.10) and using the obvious estimate \(\left| {\mathcal {J}}\right| \le N\) then implies

$$\begin{aligned} \int _{B_{h^\ell }(q_\infty ^\ell ,1)\setminus \bigcup _{j\in {\mathcal {J}}} B_{h^\ell }({\hat{q}}^j_\infty ,\frac{1}{R})} \left| \textrm{Rm}_{h^\ell }\right| ^{n/2}_{h^\ell } dV_{h^\ell } \ge \frac{{\bar{\varepsilon }} }{2} - \left| {\mathcal {J}}\right| \cdot \frac{{\bar{\varepsilon }}}{4N} \ge \frac{{\bar{\varepsilon }}}{4}. \end{aligned}$$

Therefore \((V^\ell ,h^\ell )\) is non-flat and thus a new parent bubble of all the bubbles \((V^j,h^j)\) with \(j\in {\mathcal {J}}\). This means that the trees \(\{T_j\}_{j\in {\mathcal {J}}}\) will be combined to a single tree with the new root \((V^\ell ,h^\ell )\). Finally, as for the second bubble, we obtain the energy estimate

$$\begin{aligned}&\lim _{R\rightarrow \infty } \lim _{i\rightarrow \infty } \int _{B_{g_i}(q_i^\ell ,Rr_i^\ell )} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}\\&\quad = \lim _{R\rightarrow \infty } \lim _{i\rightarrow \infty } \int _{B_{g_i}(q_i^\ell ,Rr_i^\ell ) \setminus \bigcup _{j\in {\mathcal {J}}} B_{g_i}(q_i^j,\frac{1}{R} r_i^\ell )} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}\\&\qquad + \sum _{j\in {\mathcal {J}}} \lim _{R\rightarrow \infty } \lim _{i\rightarrow \infty } \int _{B_{g_i}(q_i^j,\frac{1}{R}r_i^\ell ) \setminus B_{g_i}(q_i^j,Rr_i^j)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}\\&\qquad + \sum _{j\in {\mathcal {J}}} \lim _{R\rightarrow \infty } \lim _{i\rightarrow \infty } \int _{B_{g_i}(q_i^j,Rr_i^j)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}\\&\quad = \int _{V^\ell } \left| \textrm{Rm}_{h^\ell }\right| ^{n/2}_{h^\ell } dV_{h^\ell } + \sum _{j\in {\mathcal {J}}} \sum _{V^k\in T_j} \int _{V^k} \left| \textrm{Rm}_{h^k}\right| ^{n/2}_{h^k} dV_{h^k}. \end{aligned}$$

In particular, all the energy is fully accounted for by \((V^\ell ,h^\ell )\) and all its descendants. This uses the fact that no energy concentrates in the new neck regions according to (5.12), as well as the inductive assumption that the energy in each \(B_{g_i}(q_i^j,Rr_i^j)\) is already fully accounted for by the bubbles in the tree \(T_j\).

It remains to prove the claim.

Proof of Claim 5.7

Assume towards a contradiction that there exists a subset \({\mathcal {J}}' \subseteq {\mathcal {J}}\) containing at least two indices such that, after possibly passing to a subsequence

$$\begin{aligned} \lim _{i\rightarrow \infty } \frac{d(q^j_i, q^k_i)}{r^\ell _i} =0, \quad \forall j,k\in {\mathcal {J}}'. \end{aligned}$$
(5.13)

Then we set

$$\begin{aligned} \mu _i =\min \{d(q^j_i, q^k_i) \mid j,k\in {\mathcal {J}}',\ j\ne k \} = d(q^{j_1}_i,q^{k_1}_i). \end{aligned}$$

As we started with separable trees by the inductive assumption, we have \(r^j_i/\mu _i \rightarrow 0\) for all \(j\in {\mathcal {J}}'\). Therefore, as in the previous argument, the rescaled sequence \((M_i, \widetilde{g}_i = (\mu _i)^{-2}g_i, q_i^{j_1})\) subconverges in the pointed orbifold Cheeger–Gromov sense to a complete, non-compact, Ricci-flat limit \((X,h)\) with one ALE end and at most \(\left| {\mathcal {J}}'\right| \) isolated orbifold singularities. Note also that there are at least two orbifold singularities (coming from the sequences \(q_i^{j_1}\) and \(q_i^{k_1}\)). On the other hand, by (5.13) we have \(r_i^\ell /\mu _i \rightarrow \infty \) and therefore \((X,h)\) has energy at most \({\bar{\varepsilon }}/2\) and is hence flat by Bando’s gap result (Proposition 2.8). But a flat ALE orbifold is either smooth or a flat cone with only one orbifold singularity, yielding the desired contradiction. The claim is proved. \(\square \)

Termination of the process and completion of the proof of Theorem 1.2: As already noted above, the process of finding new point-scale sequences for \(q\in \mathcal {Q}_r\) terminates after a finite number of steps \(N_q\le N\) because for each bubble we have found disjoint regions in each \(M_i\) containing at least \({\bar{\varepsilon }}/4\) energy, and by assumption the energy in the ball \(B_{g_i}(q_i,\delta )\) is uniformly bounded in i. We can therefore move on to the next orbifold point in \(\mathcal {Q}_r\) after a finite number of bubbles have been extracted.

By construction, Points 1.2 and 1.2 of Theorem 1.2 obviously hold. As we made sure that the \(\delta \)-balls around the sequences corresponding to different orbifold points in \(\mathcal {Q}_r\) are disjoint, Point 1.2 also follows immediately. Point 1.2 can be seen as follows: if a point-scale sequence \((q_i,\varrho _i)\) as in the theorem converged to a non-flat limit, then we would be able to detect a new region of energy concentration (disjoint from all the regions of our point-scale sequences), but this cannot happen as we have exhausted all such regions in our process. Hence, to complete the proof of the theorem, it only remains to prove the energy identity from Point 1.2.

We first note that at each singular point \(q\in \mathcal {Q}\) there is only one tree forming. This is proved with the exact same argument as for Claim 5.7. Assume that the tree forming at q consists of bubbles \(\{(V^k,h^k)\}_{k=1}^{N_q}\) and that its root bubble \((V^{N_q},h^{N_q})=:(V,h)\) is detected by a point-scale sequence \((q_i^{N_q},r_i^{N_q})=:(q_i,r_i)\). By the above construction, we already know that

$$\begin{aligned} \lim _{R\rightarrow \infty } \lim _{i\rightarrow \infty } \int _{B_{g_i}(q_i,Rr_i)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} = \sum _{k=1}^{N_q} \int _{V^k} \left| \textrm{Rm}_{h^k}\right| ^{n/2}_{h^k} dV_{h^k}. \end{aligned}$$

There is one further neck region connecting the tree to \(M_\infty \). As in Claim 5.5, we can show

$$\begin{aligned} \lim _{R\rightarrow \infty } \lim _{i\rightarrow \infty } \int _{B_{g_i}(q_i,\frac{1}{R})\setminus B_{g_i}(q_i,Rr_i)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} = 0, \end{aligned}$$

so that this neck also does not contribute to the total energy. Writing \(\delta =1/R\), we therefore conclude that

$$\begin{aligned} \lim _{\delta \rightarrow 0} \lim _{i\rightarrow \infty } \int _{B_{g_i}(q_i,\delta )} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} = \sum _{k=1}^{N_q} \int _{V^k} \left| \textrm{Rm}_{h^k}\right| ^{n/2}_{h^k} dV_{h^k}. \end{aligned}$$
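The step leading to this identity is a splitting of the ball into the tree region and the neck. As a sketch, for fixed R and all i large enough that \(Rr_i<1/R\) (which holds eventually, since the bubble scales \(r_i\) tend to zero), we have

$$\begin{aligned} \int _{B_{g_i}(q_i,\frac{1}{R})} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} = \int _{B_{g_i}(q_i,Rr_i)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i} + \int _{B_{g_i}(q_i,\frac{1}{R})\setminus B_{g_i}(q_i,Rr_i)} \left| \textrm{Rm}_{g_i}\right| ^{n/2}_{g_i} dV_{g_i}. \end{aligned}$$

Letting first \(i\rightarrow \infty \) and then \(R\rightarrow \infty \), the first term on the right tends to the sum of the bubble energies, while the second (neck) term tends to zero.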

The claimed energy identity now immediately follows by repeating this for all orbifold points in \(\mathcal {Q}\) and combining with (5.1). Note that the condition \(\mathcal {Q}\cap \partial B_{g_\infty }(p_\infty ,r) = \emptyset \) ensures that for sufficiently large i and sufficiently small \(\delta \), each \(B_{g_i}(q_i,\delta )\) will be fully contained in \(B_{g_i}(p_i,r)\), avoiding potential issues with capturing “half-bubbles”. This finishes the proof of Theorem 1.2.

6 Proofs of the corollaries from the introduction

We will first transform the energy identity into an identity for the Euler characteristic. This reinforces the notion that, while the formation of orbifold singularities can cause some topological degeneration, we can recover the lost topology in a quantitative and systematic way.

Proof of Corollary 1.3

As noted in Anderson’s work on Einstein manifolds [2], bubbling can be excluded if the dimension n is odd. In this setting we have \({\mathcal {Q}}_r, {\mathcal {Q}}^k = \emptyset \) and the result trivially holds. Therefore, we only need to consider the case when n is even. The proof is rather direct and will be clear for experts; we therefore only give the full details for the easiest case \(n = 4\) and then briefly point out the necessary modifications for higher dimensions.

One of the main ingredients is the Chern–Gauss–Bonnet theorem for compact 4-manifolds N with boundary \(\partial N\), namely

$$\begin{aligned} \begin{aligned} 32\pi ^2 \chi \left( N\right)&= \int _N \big (\left| \textrm{Rm}\right| ^2 - 4\left| \textrm{Ric}\right| ^2 + R^2\big ) dV\\&\quad + 16\int _{\partial N} \kappa _1 \kappa _2 \kappa _3 dA + 8\int _{\partial N} \big (\kappa _1 K_{23} + \kappa _2 K_{13} + \kappa _3 K_{12}\big ) dA, \end{aligned} \end{aligned}$$
(6.1)

see for example [19]. Here \(\kappa _a = \textrm{II}(e_a,e_a)\) are the principal curvatures of \(\partial N\) (hence \(\{e_1,e_2,e_3\}\) is an orthonormal basis of \(T\partial N\) diagonalising the second fundamental form), and \(K_{ab} = \textrm{Rm}\left( e_a,e_b,e_a,e_b\right) \) are the sectional curvatures of N. In particular, if \((V^k,h^k)\) is a Ricci-flat ALE orbifold with one ALE end whose fundamental group at infinity is \(\Theta _k\) and with a finite (possibly empty) discrete set of orbifold points \({\mathcal {Q}}^k=\{q_\infty ^{k,j}\}\) with isometry groups \(\{\Gamma _{k,j}\}\), respectively, then (6.1) implies the well-known formula

$$\begin{aligned} \frac{1}{32\pi ^2}\int _{V^k} \left| \textrm{Rm}_{h^k}\right| ^2_{h^k} dV_{h^k} = \chi (V^k\setminus {\mathcal {Q}}^k)-\frac{1}{\left| \Theta _k\right| }+\sum _{q_\infty ^{k,j}\in {\mathcal {Q}}^k} \frac{1}{\left| \Gamma _{k,j}\right| } \end{aligned}$$
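As a sanity check of this formula, it may help to evaluate the right-hand side in three standard situations (these examples are added here for illustration and are not part of the argument). For flat \({\mathbb {R}}^4\) the end is trivial and there are no orbifold points. For a flat cone \({\mathbb {R}}^4/\Gamma \) the punctured space is diffeomorphic to \(({\mathbb {S}}^3/\Gamma )\times (0,\infty )\), which has vanishing Euler characteristic, and the group at infinity agrees with the orbifold group. For the Eguchi–Hanson space on \(T^*{\mathbb {S}}^2\) we have \(\chi =2\), group \({\mathbb {Z}}_2\) at infinity and no orbifold points. The right-hand side then reads

$$\begin{aligned} {\mathbb {R}}^4:&\quad 1 - \frac{1}{1} = 0, \\ {\mathbb {R}}^4/\Gamma :&\quad 0 - \frac{1}{\left| \Gamma \right| } + \frac{1}{\left| \Gamma \right| } = 0, \\ \text {Eguchi--Hanson}:&\quad 2 - \frac{1}{2} = \frac{3}{2}. \end{aligned}$$

The first two values are consistent with the vanishing curvature of flat spaces, and the third corresponds to an energy of \(48\pi ^2\) in the normalisation of the formula above.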

Now fix \(r>0\). Take a sequence \((M_i,g_i, f_i,p_i)\) as in Theorem 1.2. Denote by \({\mathcal {Q}}_r = {\mathcal {Q}} \cap B_{g_\infty }(p_\infty , r)\) the orbifold points forming and fix some \(q\in {\mathcal {Q}}_r\). Then denote by \(\{(V^k,h^k)\}_{k=1}^{N_q}\) the ALE bubbles of the bubble tree \(T_q\). Intermediate bubbles will have a non-empty discrete set of orbifold points \({\mathcal {Q}}^k=\{q_\infty ^{k,j}\}\) (where the bubble tree is connected via neck regions to the children \(\{(V^j,h^j)\}\)) while for leaf bubbles \({\mathcal {Q}}^k=\emptyset \).

By the neck theorem, as explained in the proof of Claim 5.5, the fundamental group at infinity \(\Theta _j\) of each child bubble is the same as the orbifold group \(\Gamma _{k,j}\) at \(q_\infty ^{k,j}\), hence these terms cancel each other when summing over all bubbles and for the entire tree \(T_q\) at q we find

$$\begin{aligned} \frac{1}{32\pi ^2}\sum _{k=1}^{N_q} \int _{V^k} \left| \textrm{Rm}_{h^k}\right| ^2_{h^k} dV_{h^k} = \sum _{k = 1}^{N_q}\chi (V^k\setminus {\mathcal {Q}}^k)-\frac{1}{\left| \Theta _{N_q}\right| }. \end{aligned}$$
(6.2)

Here \(\Theta _{N_q}\) is the fundamental group at infinity of the root bubble \((V^{N_q},h^{N_q})\) of the tree \(T_q\).
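To see the cancellation in the simplest non-trivial case, consider (purely as an illustration) a tree with \(N_q=2\): a leaf bubble \((V^1,h^1)\) with \({\mathcal {Q}}^1=\emptyset \), attached to the root bubble \((V^2,h^2)\) at its single orbifold point \(q_\infty ^{2,1}\). Adding the formula above for both bubbles and using \(\Theta _1=\Gamma _{2,1}\) gives

$$\begin{aligned} \frac{1}{32\pi ^2}\sum _{k=1}^{2} \int _{V^k} \left| \textrm{Rm}_{h^k}\right| ^2_{h^k} dV_{h^k}&= \bigg (\chi (V^1)-\frac{1}{\left| \Theta _1\right| }\bigg ) + \bigg (\chi (V^2\setminus {\mathcal {Q}}^2)-\frac{1}{\left| \Theta _2\right| }+\frac{1}{\left| \Gamma _{2,1}\right| }\bigg )\\&= \chi (V^1) + \chi (V^2\setminus {\mathcal {Q}}^2) - \frac{1}{\left| \Theta _2\right| }, \end{aligned}$$

which is (6.2) for \(N_q=2\). In a general tree, the same cancellation occurs once for every neck between a child and its parent.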

Similarly, we also have

$$\begin{aligned}{} & {} \frac{1}{32\pi ^2}\bigg [\int _{B_{g_\infty }(p_\infty ,r)} \big (\left| \textrm{Rm}\right| ^2 - 4\left| \textrm{Ric}\right| ^2 + R^2\big ) dV_{g_\infty } + T(\partial B_{g_\infty }(p_\infty ,r))\bigg ]\\{} & {} \quad = \chi \big (B_{g_\infty }(p_\infty ,r) \setminus {\mathcal {Q}}_r\big ) + \sum _{q\in {\mathcal {Q}}_r} \frac{1}{\left| \Gamma _q\right| } \end{aligned}$$

where \(\Gamma _q\) is the finite isometry group associated to the orbifold point q and \(T(\partial B_{g_\infty }(p_\infty ,r))\) denotes the sum of the boundary integrals in (6.1) above. Using once again the neck theorem for the neck connecting the root bubble of the tree \(T_q\) to the smoothly converging body part, similarly as above, we find \(\Theta _{N_q} =\Gamma _q\). Hence, by Point 1.2 of Theorem 1.2 (respectively its version using the Chern–Gauss–Bonnet integrand \(\left| \textrm{Rm}\right| ^2 - 4\left| \textrm{Ric}\right| ^2 + R^2\), which holds with an identical proof; see Footnote 2), we conclude

$$\begin{aligned} \lim _{i \rightarrow \infty } \chi (B_{g_i}(p_i,r))&= \lim _{i\rightarrow \infty } \frac{1}{32\pi ^2}\bigg [\int _{B_{g_i}(p_i,r)} \big (\left| \textrm{Rm}\right| ^2 - 4\left| \textrm{Ric}\right| ^2 + R^2\big ) dV_{g_i} + T(\partial B_{g_i}(p_i,r))\bigg ]\\&=\frac{1}{32\pi ^2}\bigg [\int _{B_{g_\infty }(p_\infty ,r)} \big (\left| \textrm{Rm}\right| ^2 - 4\left| \textrm{Ric}\right| ^2 + R^2\big ) dV_{g_\infty } + T(\partial B_{g_\infty }(p_\infty ,r))\bigg ]\\&\quad + \frac{1}{32\pi ^2}\sum _{q\in {\mathcal {Q}}_r}\sum _{k=1}^{N_q} \int _{V^k} \left| \textrm{Rm}_{h^k}\right| ^2_{h^k} dV_{h^k}\\&= \chi (B_{g_\infty }(p_\infty ,r)\setminus {\mathcal {Q}}_r) + \sum _{q\in {\mathcal {Q}}_r} \frac{1}{\left| \Gamma _q\right| } + \sum _{q \in \mathcal {Q}_r} \bigg (\sum _{k = 1}^{N_q} \chi (V^k \setminus {\mathcal {Q}}^k)-\frac{1}{\left| \Theta _{N_q}\right| }\bigg )\\&= \chi (B_{g_\infty }(p_\infty ,r)\setminus {\mathcal {Q}}_r) + \sum _{q \in \mathcal {Q}_r} \sum _{k = 1}^{N_q} \chi (V^k \setminus {\mathcal {Q}}^k). \end{aligned}$$
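This identity is consistent with the additivity of the Euler characteristic under gluing. As a heuristic illustration only, assume that \({\mathcal {Q}}_r=\{q\}\) and that a single smooth bubble \((V^1,h^1)\) forms at q, so that the right-hand side is \(\chi (B_{g_\infty }(p_\infty ,r)\setminus \{q\}) + \chi (V^1)\). Topologically, one expects the ball in \(M_i\) to be a union of a body part A and a truncated bubble B overlapping in a collar diffeomorphic to \(({\mathbb {S}}^3/\Gamma _q)\times (a,b)\), and then

$$\begin{aligned} \chi (A\cup B) = \chi (A) + \chi (B) - \chi (A\cap B), \qquad \chi \big (({\mathbb {S}}^3/\Gamma _q)\times (a,b)\big ) = \chi ({\mathbb {S}}^3/\Gamma _q) = 0, \end{aligned}$$

since closed odd-dimensional manifolds have vanishing Euler characteristic. For instance, if the bubble is an Eguchi–Hanson space, the limit equals \(\chi (B_{g_\infty }(p_\infty ,r)\setminus \{q\}) + 2\).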

We have proved the result for \(n=4\). In higher even dimensions, we can use the Chern–Gauss–Bonnet formula for a compact manifold with boundary from Theorem 1.9.2 in [20]:

$$\begin{aligned} \chi \left( N\right) = \int _N C_n ~\varepsilon ^I_J~ {\mathcal {R}}^{I,n}_{J,1} dV_g + \int _{\partial N} \sum ^{n-1}_{l=0} C_{l,n}~ \varepsilon ^A_B~ {\mathcal {R}}^{A,2l}_{B,1}~ \textrm{II}^{A,n-1}_{B,2l+1} dA_g. \end{aligned}$$
(6.3)

Here

$$\begin{aligned} \varepsilon ^I_J&:= \varepsilon _{i_1 \dots i_l j_1 \dots j_l} \\ {\mathcal {R}}^{I,t}_{J,s}&:= \textrm{Rm}_{i_s i_{s+1} j_{s+1} j_s} \; \cdots \; \textrm{Rm}_{i_{t-1}i_t j_t j_{t-1}}\\ \textrm{II}^{I,t}_{J,s}&:= \textrm{II}_{i_s j_s} \; \cdots \; \textrm{II}_{i_t j_t}. \end{aligned}$$

where \(\varepsilon ^I_J\) is shorthand for the Levi–Civita symbol, \(\textrm{II}_{ij}\) is the second fundamental form of \(\partial N\), and \(I, J\) are \((n-1)\)-tuples of indices associated to an orthonormal basis \(\{e_{i_1}, \ldots , e_{i_l}, e_{j_1}, \ldots , e_{j_l}\}\) of \(T\partial N\) that diagonalises \(\textrm{II}_{ij}\). We also note the following:

  1.

    The first term in Eq. (6.3) is the integral over a sum of products of \(\frac{n}{2}\) Riemann curvature tensors and is bounded above by a multiple of the energy \(E\left( r\right) \). Similar to the energy identity in Point 1.2 of Theorem 1.2, one can prove an identity for these integrals.

  2.

    \(\varepsilon ^A_B ~ \textrm{II}^{A,n-1}_{B,1}\) is, up to a constant, the Gauss curvature of \(\partial N\):

    $$\begin{aligned} \varepsilon ^A_B ~ \textrm{II}^{A,n-1}_{B,1} = \left( n-1\right) !\prod ^{n-1}_{s=1} \kappa _s. \end{aligned}$$
    (6.4)

    where \(\kappa _s\) are the principal curvatures of \(\partial N\). However, as in the argument for \(n=4\), these terms are only needed at the boundary of \(B_{g_i}(p_i,r)\) (as all “inner” boundary terms near orbifold points will appear twice with opposite sign and thus cancel out).
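As a quick check of (6.4), one may read \(\varepsilon ^A_B\) as the generalised Kronecker delta \(\delta ^{a_1\dots a_{n-1}}_{b_1\dots b_{n-1}}\); this reading is an assumption made here for illustration, since the convention is not spelled out above. In a frame diagonalising the second fundamental form we have \(\textrm{II}_{ab}=\kappa _a\delta _{ab}\), so only \(B=A\) contributes, and the sum runs over the \((n-1)!\) orderings of the indices \(1,\ldots ,n-1\):

$$\begin{aligned} \varepsilon ^A_B ~ \textrm{II}^{A,n-1}_{B,1} = \sum _{\begin{array}{c} (a_1,\ldots ,a_{n-1})\\ \text {pairwise distinct} \end{array}} \kappa _{a_1}\cdots \kappa _{a_{n-1}} = \left( n-1\right) !\prod ^{n-1}_{s=1} \kappa _s. \end{aligned}$$

For \(n=4\) this gives \(6\,\kappa _1\kappa _2\kappa _3\), a multiple of the integrand of the first boundary term in (6.1).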

Using Eq. (6.3) the proof is therefore the same as the \(n=4\) case, up to dealing with the much more cumbersome notation. \(\square \)

Now we will use the bubble tree construction to prove the local diffeomorphism finiteness result.

Proof of Corollary 1.5

Fix \(r>0\). Take a sequence \((M_i,g_i, f_i,p_i)\) in \({\mathcal {M}}\) and assume for a contradiction that \(M_i \cap B_{g_i}(p_i,r)\) have pairwise distinct diffeomorphism types. By Theorem 1.1, we obtain pointed orbifold convergence to an orbifold shrinker \((M_\infty ,g_\infty ,f_\infty ,p_\infty )\). As before, let \({\mathcal {Q}}\) be the set of orbifold points. By possibly slightly enlarging r (without relabelling it), we can assume that \({\mathcal {Q}} \cap \partial B_{g_\infty }(p_\infty , r) = \emptyset \). We then set \({\mathcal {Q}}_r = {\mathcal {Q}} \cap B_{g_\infty }(p_\infty , r)\). By Theorem 1.2 we know that at each of the finitely many orbifold points \(q\in {\mathcal {Q}}_r\) a finite number of ALE bubbles \(\{(V^k,h^k)\}_{k=1}^{N_q}\) will be detected via point-scale sequences \((q_i^k,r_i^k)_{i\in {\mathbb {N}}}\), forming a bubble tree \(T_q\). Finally assume that the last bubble, i.e. \((V^{N_q},h^{N_q})\), is the root bubble of \(T_q\).

Next pick R sufficiently large, so that for each bubble \((V^k,h^k)\) we have \(R>K^k\) (where \(K^k\) come from the bubble tree construction in Sect. 5) and \(R>\varepsilon ^{-1/2}_{\textrm{neck}}\) (where \(\varepsilon _{\textrm{neck}}\) comes from the Neck Theorem 3.4).

Then each \(B_{g_i}(p_i, r)\) can be covered by finitely many of the following regions:

  1.

    Body Regions: These are the regions \(B_{g_i}(p_i, r) {\setminus } \bigcup _{q\in {\mathcal {Q}}_r} B_{g_i}\big (q^{N_q}_i, \tfrac{1}{2R}\big )\). By the construction in Sect. 5, we have smooth convergence

    $$\begin{aligned} \Big (B_{g_i}(p_i, r) \setminus \bigcup _{q\in {\mathcal {Q}}_r} B_{g_i}\big (q^{N_q}_i, \tfrac{1}{2R}\big ), g_i\Big ) \rightarrow \Big (B_{g_\infty }(p_\infty , r) \setminus \bigcup _{q\in {\mathcal {Q}}_r} B_{g_\infty }\big (q, \tfrac{1}{2R}\big ), g_\infty \Big ) \end{aligned}$$

    and hence these regions will eventually all be diffeomorphic to each other.

  2.

    Bubble Regions: Let q be an orbifold point and \((V^k,h^k)\) a fixed ALE bubble of the bubble tree \(T_q\). Denote by \(\{(V^j,h^j)\}_{j\in J}\) its children in the bubble tree \(T_q\) (with \(J=\emptyset \) for a leaf bubble). Then the corresponding bubble regions are \(B_{g_i}\big (q_i^k, 2R r_i^k\big ) {\setminus } \bigcup _{j\in J} B_{g_i}\big (q^j_i, \tfrac{1}{2R}r^k_i\big )\). After rescaling with \((r_i^k)^{-2}\), these regions will smoothly converge to a region in \(V^k \setminus \{q^{k,j}_\infty \}_{j \in J}\), more precisely

    $$\begin{aligned} \Big (B_{g_i}\big (q_i^k, 2R r_i^k\big ) \setminus \bigcup _{j\in J} B_{g_i}\big (q^j_i, \tfrac{1}{2R}r^k_i\big ), (r_i^k)^{-2} g_i\Big ) \rightarrow \Big ( B_{h^k}\big (q_\infty ^k,2R\big ) \setminus \bigcup _{j\in J} B_{h^k}\big (q^{k,j}_\infty ,\tfrac{1}{2R}\big ),h^k\Big ). \end{aligned}$$

    This implies that the bubble regions will eventually be diffeomorphic to each other and this argument works for each of the finitely many bubbles \((V^k,h^k)\).

  3.

    Neck Regions: Again, let \((V^k,h^k)\) be a fixed ALE bubble in a tree \(T_q\) detected by the point-scale sequence \((q_i^k,r_i^k)\). Denote by \((V^\ell ,h^\ell )\) its parent bubble, detected by \((q_i^\ell ,r_i^\ell )\), if it exists. If instead \((V^k,h^k) = (V^{N_q},h^{N_q})\) is a root bubble and therefore does not have a parent, then we set \((q_i^\ell ,r_i^\ell ):=(q_i^k,1)\). Then \(B_{g_i}(q^k_i, \tfrac{1}{R} r_i^\ell ) {\setminus } B_{g_i}\big (q^k_i, Rr_i^k\big )\) are the corresponding neck regions. By Theorem 3.4, for sufficiently large i these annular regions will be diffeomorphic to an annulus on the cone \({\mathcal {C}}({\mathbb {S}}^{n-1}/\Gamma )\) for some \(\Gamma \subset O(n)\) with \(\left| \Gamma \right| \le \gamma \) and thus in particular diffeomorphic to each other.
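For orientation, the overlaps between consecutive regions can be read off from the radii in the three definitions. Writing \(d_i:=d_{g_i}(q_i^k,\cdot )\) (a shorthand used only here), and ignoring the small balls removed around other children, the neck region of \((V^k,h^k)\) meets the bubble region of \((V^k,h^k)\) and the bubble region of its parent (or the body region, where \(r_i^\ell =1\)) in the annuli

$$\begin{aligned} \big \{ R r_i^k \le d_i< 2R r_i^k \big \} \qquad \text {and} \qquad \big \{ \tfrac{1}{2R} r_i^\ell \le d_i < \tfrac{1}{R} r_i^\ell \big \} , \end{aligned}$$

respectively. Both are dyadic annuli, and they are disjoint from each other as soon as \(2R r_i^k < \tfrac{1}{2R} r_i^\ell \), which one expects for large i because the child scale is much smaller than the parent scale.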

The regions above are defined in such a way that each (annular) neck region overlaps with a bubble region on its innermost dyadic annulus and with the corresponding parent bubble region (or the body region in the case of root bubbles) on its outermost dyadic annulus, giving controlled regions where the diffeomorphisms can be “glued together”. In particular, after possibly passing to a subsequence, for i sufficiently large \(M_i \cap B_{g_i}(p_i,r)\) are all diffeomorphic to each other. This then obviously also holds true for the original (not enlarged) r, which is the desired contradiction. \(\square \)