1 Introduction

Let us consider a compact smooth n-manifold M, and a family of immersions

$$\begin{aligned} F : M\times [0,T) \rightarrow {\mathbb {R}}^{n+k} \end{aligned}$$

which move by mean curvature flow, that is,

$$\begin{aligned} \partial _t F(x,t) = H(x,t) \end{aligned}$$

for each \((x,t) \in M\times [0,T)\) where H is the mean curvature vector. The mean curvature flow constitutes a system of quasilinear weakly parabolic partial differential equations for F, and since M is compact the flow must form a singularity in finite time. Singularity formation may be characterised analytically as follows: if we let A denote the second fundamental form and take T to be the maximal time then there holds

$$\begin{aligned} \limsup _{t\nearrow T} \sup _M |A|(\cdot ,t) = \infty . \end{aligned}$$

A profound and challenging problem is characterising and classifying the geometry of singularities, whose formation depends on the initial submanifold \(M_0\) where \(M_t:=F(M,t)\). In the seminal work of Huisken–Sinestrari [12], the singularities formed by mean convex codimension one solutions were shown to be weakly convex (White obtained a similar result for embedded mean convex solutions in [25]). It is natural to seek a corresponding theorem for solutions of higher codimension. However, we encounter a number of new difficulties, the foremost being that the second fundamental form and the mean curvature are vector-valued, and consequently there is no direct corresponding notion of mean convexity.

We thus use a different condition introduced by Andrews–Baker [1]. They showed that when \(n \ge 2\), the quadratic pinching condition

$$\begin{aligned} |A|^2 - c |H|^2 + a\le 0 \end{aligned}$$
(1.1)

is preserved for each \(c < \frac{4}{3n}\) and \(a > 0\). That is, if the condition is satisfied at the initial time, then it is satisfied by \(M_t\) for every \(t\in [0,T)\). We note that a compact hypersurface with positive mean curvature always satisfies (1.1) for some a and c. We will refer to submanifolds satisfying (1.1) with \(c <\tfrac{4}{3n}\) as being quadratically pinched. It is useful to note that, whenever \(n \le 4m\), we have \(\tfrac{4}{3n} \le \tfrac{1}{n-m}\), and so quadratic pinching implies

$$\begin{aligned} |A|^2 < \frac{1}{n-m}|H|^2. \end{aligned}$$

For hypersurfaces, this is a quadratic analogue of m-convexity.

Andrews–Baker showed that if \(c < \min \{\frac{4}{3n}, \frac{1}{n-1}\}\) the flow contracts quadratically pinched solutions to round points. This result is a high codimension generalisation of Huisken’s work on convex solutions of codimension one [14]. Recently the second-named author has constructed a flow with surgeries for solutions of dimension \(n \ge 5\) which are quadratically pinched with

$$\begin{aligned} c< {\left\{ \begin{array}{ll} \frac{3(n+1)}{2n(n+2)} &{} n = 5, 6, 7, \\ \frac{1}{n-2} &{} n \ge 8. \end{array}\right. } \end{aligned}$$

This generalises the surgery construction for two-convex hypersurface flows due to Huisken–Sinestrari [13]. An important ingredient in [22] (and in the present work) is the codimension estimate due to Naff [21], which implies that the singularities formed by a quadratically pinched solution are codimension one if

$$\begin{aligned} c< c_n : ={\left\{ \begin{array}{ll} \frac{3(n+1)}{2n(n+2)} &{} n = 5, 6, 7, \\ \frac{4}{3n} &{} n \ge 8. \end{array}\right. } \end{aligned}$$
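As a quick sanity check on the thresholds above, the following sketch (our own illustration, not taken from the cited works; all function names are hypothetical) evaluates them in exact rational arithmetic. It suggests that the two branches of \(c_n\) agree at \(n = 7\), that \(c_n\) coincides with the minimum of \(\frac{3(n+1)}{2n(n+2)}\) and \(\frac{4}{3n}\) for \(n \ge 5\), and that the surgery threshold never exceeds \(c_n\).

```python
from fractions import Fraction

def low_dim_branch(n):
    # 3(n+1) / (2n(n+2)), the branch stated above for n = 5, 6, 7
    return Fraction(3 * (n + 1), 2 * n * (n + 2))

def pinching_bound(n):
    # 4 / (3n), the bound below which quadratic pinching is preserved
    return Fraction(4, 3 * n)

def c_n(n):
    # Codimension-estimate threshold c_n as defined in the text (n >= 5)
    if n < 5:
        raise ValueError("c_n is only defined for n >= 5")
    return low_dim_branch(n) if n <= 7 else pinching_bound(n)

def surgery_threshold(n):
    # Threshold for the surgery construction as stated in the text (n >= 5)
    if n < 5:
        raise ValueError("the surgery threshold is only defined for n >= 5")
    return low_dim_branch(n) if n <= 7 else Fraction(1, n - 2)

if __name__ == "__main__":
    for n in range(5, 13):
        print(n, c_n(n), surgery_threshold(n))
```

Exact fractions are used so that the comparison at the crossover dimension \(n = 7\) is not affected by rounding.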

In this paper we show that a quadratically pinched mean curvature flow with \(c < c_n\) is asymptotically convex in a quantifiable manner. As the second fundamental form is vector-valued, we denote by \(\lambda _1 \le \dots \le \lambda _n\) the eigenvalues of its component in the principal normal direction \(\nu _1= \frac{H}{|H|}\) (see Sect. 2 for precise definitions).

Theorem 1.1

Let \(F:M\times [0,T) \rightarrow {\mathbb {R}}^{n+k}\) be a compact mean curvature flow of dimension \(n \ge 5\) which is quadratically pinched with \(c < c_n\). For every \(\varepsilon >0\) there exists a constant \(C_\varepsilon >0\) which depends only on n, \(\varepsilon \) and \(M_0\) such that

$$\begin{aligned} \lambda _1 \ge -\varepsilon |H| - C_\varepsilon \end{aligned}$$

on \(M_t\) for each \(t \in [0,T)\).

As \(\varepsilon >0\) is arbitrary, this shows that the negative part of the first eigenvalue in the principal normal direction does not grow as fast as |H|.

Let us comment on the proof of Theorem 1.1. Compared with the hypersurface case, the first new difficulty we encounter is the complicated algebraic structure of the zeroth-order reaction terms in the evolution of A. This is an artefact of the vastly more complicated structure of Simons’ identity in higher codimensions. The second difficulty is that the evolution of |H| is influenced by the torsion of the submanifold. Consequently, in higher codimensions the evolution of \(\lambda _1\) is influenced by various new zeroth-order reaction terms and first-order torsion terms which need to be controlled.

As in the hypersurface case, we cannot prove the convexity estimate using the maximum principle, so we need to generalise Huisken’s Stampacchia iteration to quadratically pinched higher codimension flows. However, the burden we place on this tool is much heavier. Crucially, we will show that the iteration procedure can be carried out entirely in the region of the submanifold where |H| is extremely large. In this region, by Naff’s estimate, the tensor \({\hat{A}}\) (the part of A orthogonal to H) is extremely small compared to H. This has two consequences for the troublesome terms in the evolution of \(\lambda _1\): First, the zeroth-order terms can be written as a hypersurface part plus small errors, and second, the torsion terms can be controlled by adding in a multiple of \(|{\hat{A}}|^2\) to produce a favourable Bochner-type term. However, introducing \(|{\hat{A}}|^2\) in this way produces yet more zeroth- and first-order error terms. We show that the first-order errors can be controlled by introducing the pinching quantity to produce yet another good Bochner-type term, while the zeroth-order errors can be absorbed in the Stampacchia procedure with the help of a Poincaré-type inequality. This then yields the convexity estimate.

1.1 Singularities

As a consequence of [21], our convexity estimate, and [5], we find that singularity models in quadratically pinched high codimension mean curvature flow are noncollapsed convex hypersurface solutions.

Theorem 1.2

Let \(F:M\times [0,T) \rightarrow {\mathbb {R}}^{n+k}\) be a compact mean curvature flow of dimension \(n \ge 5\) which is quadratically pinched with \(c < c_n\). Let \({\bar{F}} : {\bar{M}} \times (-\infty ,0] \rightarrow {\mathbb {R}}^{n+k}\) be a smooth ancient mean curvature flow which arises as a blow-up limit of F at the singular time T. There is an affine subspace \({\mathbb {R}}^{n+1} \subset {\mathbb {R}}^{n+k}\) and a family of open convex sets \(\Omega _t \subset {\mathbb {R}}^{n+1}\) such that \({\bar{M}}_t := {\bar{F}}({\bar{M}},t)\) coincides with \(\partial \Omega _t\) for all \(t \in (-\infty ,0]\). Moreover, \(\Omega _t\) is interior noncollapsed — at each point of \({\bar{M}}_t\), \(\Omega _t\) admits an interior ball of radius \(|{\bar{H}}|^{-1}\).

For the proof of Theorem 1.2 we refer to [5].

We note that Naff already established Theorem 1.2 for quadratically pinched solutions with \(c <\min \{\tfrac{3(n+1)}{2n(n+2)},\tfrac{1}{n-2}\}\) in [20]. In this case it also follows that every blow-up limit is uniformly two-convex, and is therefore a homothetically shrinking sphere or cylinder of the form \({\mathbb {R}}\times S^{n-1}\), or else a bowl soliton [3]. Theorem 1.2 opens up the possibility of obtaining a similar classification for general quadratically pinched flows.

1.2 Outline

The paper is set out as follows. In Sect. 2 we gather together the necessary evolution equations and technical tools. In particular, the first eigenvalue \(\lambda _1\) of the second fundamental form in the principal normal direction is not smooth but is locally Lipschitz and semiconcave, hence we show its evolution equation may be understood in a distributional sense. In Sect. 3 we obtain a Poincaré-type inequality which requires Simons’ identity for high codimension submanifolds. In Sect. 4, we complete the proof of the convexity estimate by generalising Huisken’s Stampacchia iteration to the setting of quadratically pinched high codimension flows. Finally, in Sect. 5 we use Theorem 1.2 to characterise type I and II singularities near the maximum of the curvature.

2 Evolution equations

Let \(F:M\times [0,T) \rightarrow {\mathbb {R}}^{n+k}\) solve mean curvature flow and write \(M_t := F(M,t)\). We recall from the work of Andrews–Baker [1] the following evolution equations for the second fundamental form and mean curvature vector. With respect to local orthonormal frames \(\{e_i\}\) and \(\{\nu _\alpha \}\) for the tangent and normal bundles,

$$\begin{aligned} \nabla _{\partial _t} A_{ij\alpha }&= \Delta A_{ij\alpha } + A_{ij\beta }A_{pq\beta } A_{pq\alpha } \\&\quad +A_{iq\beta }A_{qp\beta } A_{pj\alpha }+A_{jq\beta }A_{qp\beta } A_{pi\alpha } - 2 A_{ip\beta }A_{jq\beta }A_{pq\alpha } , \end{aligned}$$

and

$$\begin{aligned} \nabla _{\partial _t} H_{\alpha }&= \Delta H_{\alpha } + H_\beta A_{pq\beta } A_{pq \alpha }. \end{aligned}$$

From these equations we can compute that

$$\begin{aligned} (\partial _t -\Delta ) |A|^2&= - 2 |\nabla A|^2 + 2 | \langle A, A\rangle |^2 + 2 |R^{\perp }|^2 \\ (\partial _t - \Delta ) |H|^2&= - 2 |\nabla H|^2 + 2 |\langle A, H\rangle |^2 . \end{aligned}$$

We use \(R^{\perp }\) to denote the normal curvature, which is given by

$$\begin{aligned} R^\perp _{ij \alpha \beta } = A_{ip\alpha } A_{jp\beta }- A_{jp \alpha } A_{ip\beta }. \end{aligned}$$

Under the quadratic pinching assumption we have \(|H|>0\), so at any point on \(M_t\) we can choose a local orthonormal frame for the normal bundle which is such that

$$\begin{aligned} \nu _1 = \frac{H}{|H|}. \end{aligned}$$

We also use the notation

$$\begin{aligned} {\hat{A}} = A - \langle A, \nu _1\rangle \nu _1 = A - A_1 \nu _1 \end{aligned}$$

to denote the components of the second fundamental form orthogonal to the mean curvature vector, and write

$$\begin{aligned} h = \langle A, \nu _1 \rangle = A_1 \end{aligned}$$

for the scalar part of the mean curvature component of A. Hence A admits the decomposition

$$\begin{aligned} A = h \nu _1 + {\hat{A}}. \end{aligned}$$
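For later reference we record some elementary consequences of this decomposition; they follow directly from the definitions above and are included only as a reading aid. Since \({\hat{A}}\) is orthogonal to \(\nu _1\) and \(H = \operatorname {tr} A = |H| \nu _1\), we have

$$\begin{aligned} |A|^2 = |h|^2 + |{\hat{A}}|^2, \qquad \operatorname {tr} h = |H|, \qquad \operatorname {tr} {\hat{A}} = 0. \end{aligned}$$

In particular \(|h| \le |A|\) and \(|{\hat{A}}| \le |A|\), and the component \(H_1 = \langle H, \nu _1\rangle \) appearing below is simply \(|H|\).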

2.1 Pinching is preserved

With this notation in place, we can state the estimate proven by Andrews–Baker showing that quadratic pinching is preserved by the flow.

Lemma 2.1

([1], Sect. 3) Fix constants \(0<c < \frac{4}{3n}\) and \(a >0\) and let

$$\begin{aligned} {\mathcal {Q}} := |A|^2 - c|H|^2 + a. \end{aligned}$$

At every point in \(M\times [0,T)\) where \({\mathcal {Q}} \le 0\) there holds

(2.1)

Note that by Proposition 6 in [1] we have

$$\begin{aligned} |\nabla A|^2 \ge \frac{3}{n+2} |\nabla H|^2 \end{aligned}$$
(2.2)

so the gradient term on the right-hand side is nonpositive. At points where \({\mathcal {Q}}\le 0\), each of the zeroth-order reaction terms is also nonpositive. From now on we suppose the initial submanifold \(M_0\), and hence \(M_t\) for all \(t \in [0,T)\), is quadratically pinched with

$$\begin{aligned} c \le \frac{4}{3n} - \varepsilon _0, \qquad \varepsilon _0 >0. \end{aligned}$$

For ease of notation let us define

$$\begin{aligned} W := \bigg ( \frac{4}{3n} - \frac{\varepsilon _0}{2}\bigg ) |H|^2 - |A|^2, \qquad w:= W^\frac{1}{2}, \end{aligned}$$

and observe that by the quadratic pinching \(W \ge \frac{\varepsilon _0}{2} |H|^2\) on \(M_t\).

Lemma 2.2

At each point in \(M \times [0,T)\) we have the inequalities

$$\begin{aligned} (\partial _t - \Delta ) W \ge 2 |h|^2 W + \frac{(n+2)}{3} \varepsilon _0 |\nabla A|^2 \end{aligned}$$

and

$$\begin{aligned} (\partial _t - \Delta )w \ge |h|^2 w + \delta _0 \frac{|\nabla A|^2}{|H|}, \end{aligned}$$

where \(\delta _0 >0\) depends only on n and \(\varepsilon _0\).

Proof

From Lemma 2.1 we obtain

$$\begin{aligned} (\partial _t - \Delta ) W\ge 2 |h|^2 W + 2|\nabla A|^2 -2 \bigg ( \frac{4}{3n} - \frac{\varepsilon _0}{2}\bigg )|\nabla H|^2. \end{aligned}$$

Using (2.2) we estimate

$$\begin{aligned} |\nabla A|^2 - \bigg ( \frac{4}{3n} - \frac{\varepsilon _0}{2}\bigg )|\nabla H|^2&= \bigg (1 - \frac{(n+2)}{3} \bigg ( \frac{4}{3n} - \frac{\varepsilon _0}{2}\bigg )\bigg ) |\nabla A|^2\\&\quad +\bigg ( \frac{4}{3n} - \frac{\varepsilon _0}{2}\bigg ) \bigg ( \frac{n+2}{3} |\nabla A|^2 - |\nabla H|^2\bigg )\\&\ge \bigg (1 - \frac{4(n+2)}{9n} \bigg ) |\nabla A|^2 + \frac{(n+2)}{6} \varepsilon _0|\nabla A|^2\\&\ge \frac{(n+2)}{6} \varepsilon _0|\nabla A|^2, \end{aligned}$$

which gives the desired inequality for W. It follows that

$$\begin{aligned} (\partial _t - \Delta ) W^\frac{1}{2}&= \frac{1}{4 W^{3/2} } |\nabla W|^2 + \frac{1}{2 W^\frac{1}{2} } (\partial _t - \Delta ) W\\&\ge |h|^2 W^\frac{1}{2} + \frac{(n+2)}{6} \varepsilon _0 \frac{|\nabla A|^2}{W^\frac{1}{2}}, \end{aligned}$$

and since \(W \le \frac{4}{3n} |H|^2\) we have

$$\begin{aligned} (\partial _t - \Delta ) w&\ge |h|^2 w + \frac{(3n)^\frac{1}{2}}{2}\frac{(n+2)}{6} \varepsilon _0 \frac{|\nabla A|^2}{|H|}. \end{aligned}$$

Thus it suffices to take

$$\begin{aligned} \delta _0 = \frac{(3n)^\frac{1}{2}}{2}\frac{(n+2)}{6} \varepsilon _0 . \end{aligned}$$

\(\square \)
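The two algebraic facts used in this proof can be checked in exact arithmetic: the coefficient \(1 - \frac{4(n+2)}{9n}\) is nonnegative, and the gradient coefficient exceeds it by exactly \(\frac{(n+2)}{6}\varepsilon _0\). The following is a small illustrative sketch (the function names are ours and hypothetical), assuming \(n \ge 2\) and a sample value \(0< \varepsilon _0 < \frac{4}{3n}\).

```python
from fractions import Fraction
import math

def gradient_coefficient(n, eps0):
    # Coefficient of |grad A|^2 in the first line of the estimate:
    # 1 - (n+2)/3 * (4/(3n) - eps0/2)
    c_prime = Fraction(4, 3 * n) - eps0 / 2
    return 1 - Fraction(n + 2, 3) * c_prime

def lower_bound(n, eps0):
    # The bound (n+2)/6 * eps0 retained in the last line of the estimate
    return Fraction(n + 2, 6) * eps0

def delta_0(n, eps0):
    # delta_0 = sqrt(3n)/2 * (n+2)/6 * eps0, as chosen at the end of the proof
    return math.sqrt(3 * n) / 2 * float(lower_bound(n, eps0))

if __name__ == "__main__":
    for n in (2, 5, 8, 20):
        eps0 = Fraction(1, 10 * n)  # sample value with 0 < eps0 < 4/(3n)
        slack = gradient_coefficient(n, eps0) - lower_bound(n, eps0)
        # slack equals 1 - 4(n+2)/(9n) = (5n - 8)/(9n), nonnegative for n >= 2
        print(n, slack, delta_0(n, eps0))
```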

2.2 The evolution of h

From the equations for A and H, we readily compute that the projection \(\langle A, H\rangle \) satisfies

$$\begin{aligned} (\nabla _{\partial _t} - \Delta )(A_{ij\alpha } H_\alpha )&= -2\nabla _p A_{ij\alpha } \nabla _p H_\alpha + 2 H_\alpha A_{ij\beta }A_{pq\beta } A_{pq\alpha } \\&\quad +H_\alpha (A_{iq\beta }A_{qp\beta } A_{pj\alpha }+A_{jq\beta }A_{qp\beta } A_{pi\alpha } - 2 A_{ip\beta }A_{jq\beta }A_{pq\alpha } ). \end{aligned}$$

The first of the reaction terms can be split into a hypersurface and a codimension component, as follows:

$$\begin{aligned} 2 H_\alpha A_{ij\beta }A_{pq\beta } A_{pq\alpha }&= 2 A_{ij\beta } A_{pq\beta }h_{pq} H_1 \\&= 2 h_{ij}h_{pq}h_{pq}H_1 + 2 \sum _{\beta \ne 1} A_{ij\beta }A_{pq\beta } h_{pq} H_1 \\&= 2 h_{ij} H_1 |h|^2 + 2 \sum _{\beta \ne 1}^n {\hat{A}}_{ij\beta } {\hat{A}}_{pq\beta } h_{pq}H_1. \end{aligned}$$

Similarly, the remaining reaction terms can be written as

$$\begin{aligned}&H_\alpha (A_{iq\beta }A_{qp\beta } A_{pj\alpha }+A_{jq\beta }A_{qp\beta } A_{pi\alpha } - 2 A_{ip\beta }A_{jq\beta }A_{pq\alpha } )\\&\quad = h_{ip}h_{pq}h_{qj}H_1 + h_{jp}h_{pq}h_{qi}H_1 - 2 h_{ip}h_{jq}h_{pq}H_1\\&\qquad + \sum _{\beta \ne 1} A_{ip\beta } A_{pq \beta } h_{qj} H_1 + \sum _{\beta \ne 1}A_{jp\beta }A_{pq\beta }h_{qi}H_1 - 2 \sum _{\beta \ne 1} A_{ip\beta }A_{jq\beta }h_{pq}H_1 \\&\quad = \sum _{\beta \ne 1} {\hat{A}}_{ip\beta } {\hat{A}}_{pq \beta } h_{qj} H_1 + \sum _{\beta \ne 1}{\hat{A}}_{jp\beta }{\hat{A}}_{pq\beta }h_{qi}H_1 - 2 \sum _{\beta \ne 1} {\hat{A}}_{ip\beta }{\hat{A}}_{jq\beta }h_{pq}H_1. \end{aligned}$$

Therefore,

$$\begin{aligned} (\nabla _{\partial _t} - \Delta ) (A_{ij\alpha } H_\alpha )&= -2\nabla _p A_{ij\alpha } \nabla _p H_\alpha +2|h|^2 h_{ij} H_1 + 2 \sum _{\beta \ne 1}^n {\hat{A}}_{ij\beta } {\hat{A}}_{pq\beta } h_{pq}H_1\\&\quad +\sum _{\beta \ne 1} {\hat{A}}_{ip\beta } {\hat{A}}_{pq \beta } h_{qj} H_1 + \sum _{\beta \ne 1}{\hat{A}}_{jp\beta }{\hat{A}}_{pq\beta }h_{qi}H_1\\&\quad - 2 \sum _{\beta \ne 1} {\hat{A}}_{ip\beta }{\hat{A}}_{jq\beta }h_{pq}H_1. \end{aligned}$$

The quantity |H| satisfies

$$\begin{aligned} (\partial _t - \Delta ) |H|&= \frac{|\langle A, H\rangle |^2 }{|H|} - \frac{|\nabla H|^2}{|H|}+ \frac{1}{|H|^3} \langle H, \nabla _i H\rangle \langle H, \nabla _i H\rangle , \end{aligned}$$

so since

$$\begin{aligned} \frac{|\langle A, H\rangle |^2 }{|H|} = |\langle A, |H|^{-1} H\rangle |^2 |H|= |h|^2 |H| \end{aligned}$$

and

$$\begin{aligned} - \frac{|\nabla H|^2}{|H|}+ \frac{1}{|H|^3}&\langle H, \nabla _i H\rangle \langle H, \nabla _i H\rangle = - |H||\nabla \nu _1|^2, \end{aligned}$$

we have

$$\begin{aligned} (\partial _t - \Delta ) |H|&= |h|^2|H| - |H||\nabla \nu _1|^2. \end{aligned}$$
(2.3)
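For the reader's convenience we spell out the identity for the gradient terms used in deriving (2.3); it is a short computation from \(H = |H| \nu _1\) and \(|\nu _1| = 1\). Differentiating gives \(\nabla _i H = \nabla _i |H| \, \nu _1 + |H| \nabla _i \nu _1\) with \(\langle \nu _1, \nabla _i \nu _1 \rangle = 0\), so that

$$\begin{aligned} |\nabla H|^2 = |\nabla |H||^2 + |H|^2 |\nabla \nu _1|^2, \qquad \langle H, \nabla _i H\rangle = |H| \nabla _i |H|, \end{aligned}$$

and therefore

$$\begin{aligned} - \frac{|\nabla H|^2}{|H|}+ \frac{1}{|H|^3} \langle H, \nabla _i H\rangle \langle H, \nabla _i H\rangle = - \frac{|\nabla |H||^2}{|H|} - |H||\nabla \nu _1|^2 + \frac{|\nabla |H||^2}{|H|} = - |H||\nabla \nu _1|^2. \end{aligned}$$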

For a tensor \(B_{ij}\) divided by a positive scalar function f there holds

$$\begin{aligned} (\nabla _{\partial _t} - \Delta ) \frac{B_{ij}}{f} = \frac{1}{f} (\nabla _{\partial _t} - \Delta ) B_{ij} - \frac{B_{ij}}{f^2} (\partial _t - \Delta ) f + \frac{2}{f} \bigg \langle \nabla \frac{B_{ij}}{f}, \nabla f \bigg \rangle . \end{aligned}$$

Therefore, dividing \(\langle A, H\rangle \) by |H|, we obtain

$$\begin{aligned} (\nabla _{\partial _t} - \Delta ) h_{ij}&= |h|^2 h_{ij} +2 \sum _{\beta \ne 1}^n {\hat{A}}_{ij\beta } {\hat{A}}_{pq\beta } h_{pq} +\sum _{\beta \ne 1} {\hat{A}}_{ip\beta } {\hat{A}}_{pq \beta } h_{qj} \\&\quad + \sum _{\beta \ne 1}{\hat{A}}_{jp\beta }{\hat{A}}_{pq\beta }h_{qi} - 2 \sum _{\beta \ne 1} {\hat{A}}_{ip\beta }{\hat{A}}_{jq\beta }h_{pq}\\&\quad -2|H|^{-1} \langle \nabla A_{ij} ,\nabla H \rangle + h_{ij} |\nabla \nu _1|^2 +2 |H|^{-1} \langle \nabla h_{ij}, \nabla |H| \rangle . \end{aligned}$$

Let us introduce the abbreviation

$$\begin{aligned} T_{ij}&:= 2 \sum _{\beta \ne 1}^n {\hat{A}}_{ij\beta } {\hat{A}}_{pq\beta } h_{pq}+\sum _{\beta \ne 1} {\hat{A}}_{ip\beta } {\hat{A}}_{pq \beta } h_{qj} \nonumber \\&\quad + \sum _{\beta \ne 1}{\hat{A}}_{jp\beta }{\hat{A}}_{pq\beta }h_{qi} - 2 \sum _{\beta \ne 1} {\hat{A}}_{ip\beta }{\hat{A}}_{jq\beta }h_{pq}, \end{aligned}$$
(2.4)

so that we may write

$$\begin{aligned} (\nabla _{\partial _t} - \Delta ) h_{ij}&= |h|^2 h_{ij} +T_{ij} -2|H|^{-1} \langle \nabla A_{ij}, \nabla H \rangle \\&\quad + h_{ij} |\nabla \nu _1|^2 + 2 |H|^{-1} \langle \nabla h_{ij}, \nabla |H| \rangle . \end{aligned}$$

We simplify the gradient terms by decomposing

$$\begin{aligned} - 2\langle \nabla A_{ij} , \nabla H\rangle&= - 2\langle \nabla h_{ij} \nu _1 + h_{ij} \nabla \nu _1 + \nabla {\hat{A}}_{ij}, \nabla |H| \nu _1 + |H| \nabla \nu _1 \rangle \\&= - 2\langle \nabla h_{ij} , \nabla |H|\rangle - 2H_1 h_{ij} |\nabla \nu _1|^2 - 2\langle \nabla {\hat{A}}_{ij}, \nabla |H|\nu _1 \rangle \\&\quad - 2|H| \langle \nabla {\hat{A}}_{ij}, \nabla \nu _1 \rangle , \end{aligned}$$

and so obtain:

Lemma 2.3

At each point in \(M\times [0,T)\) there holds

$$\begin{aligned} (\nabla _{\partial _t} - \Delta ) h_{ij}&= |h|^2 h_{ij} +T_{ij}- h_{ij} |\nabla \nu _1|^2 - 2|H|^{-1} \langle \nabla {\hat{A}}_{ij}, \nabla |H| \nu _1 \rangle \\&\quad - 2\langle \nabla {\hat{A}}_{ij}, \nabla \nu _1 \rangle , \end{aligned}$$

where \(T_{ij}\) is the quantity defined in (2.4).
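To make the cancellation behind this lemma explicit: dividing the decomposition of \(-2\langle \nabla A_{ij}, \nabla H\rangle \) by \(|H|\) and using \(H_1 = |H|\), the gradient terms in the preceding evolution equation for \(h_{ij}\) combine as

$$\begin{aligned}&-2|H|^{-1} \langle \nabla A_{ij}, \nabla H \rangle + h_{ij} |\nabla \nu _1|^2 + 2 |H|^{-1} \langle \nabla h_{ij}, \nabla |H| \rangle \\&\quad = -2|H|^{-1} \langle \nabla h_{ij}, \nabla |H| \rangle - 2 h_{ij} |\nabla \nu _1|^2 - 2|H|^{-1} \langle \nabla {\hat{A}}_{ij}, \nabla |H| \nu _1 \rangle - 2\langle \nabla {\hat{A}}_{ij}, \nabla \nu _1 \rangle \\&\qquad + h_{ij} |\nabla \nu _1|^2 + 2 |H|^{-1} \langle \nabla h_{ij}, \nabla |H| \rangle \\&\quad = - h_{ij} |\nabla \nu _1|^2 - 2|H|^{-1} \langle \nabla {\hat{A}}_{ij}, \nabla |H| \nu _1 \rangle - 2\langle \nabla {\hat{A}}_{ij}, \nabla \nu _1 \rangle . \end{aligned}$$

The terms involving \(\langle \nabla h_{ij}, \nabla |H| \rangle \) cancel, which is why no first-order term in h itself remains.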

Since h is a symmetric bilinear form it has n real eigenvalues, which we denote by

$$\begin{aligned} \lambda _1 \le \dots \le \lambda _n. \end{aligned}$$

The smallest eigenvalue can be written as

$$\begin{aligned} \lambda _1(x,t) = \min _{|v| = 1} h(x,t) (v,v), \end{aligned}$$

and is therefore a locally Lipschitz continuous function on \(M \times [0,T)\). We will use the evolution equation for h to estimate \((\partial _t -\Delta )\lambda _1\), interpreted in an appropriate weak sense (cf. [25] and [17]).
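A simple model case illustrates why \(\lambda _1\) is Lipschitz but in general not smooth. The sketch below is our own illustration (the helper names are hypothetical and not from the works cited): for the family of symmetric \(2\times 2\) matrices \(\operatorname {diag}(t,-t)\) the smallest eigenvalue is \(-|t|\), which has a concave kink at \(t = 0\), where the two eigenvalues cross.

```python
import math

def smallest_eigenvalue(a, b, c):
    # Smallest eigenvalue of the symmetric matrix [[a, b], [b, c]],
    # i.e. the minimum of the quadratic form over unit vectors.
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return mean - radius

def lambda_1(t):
    # Model family h(t) = diag(t, -t); the eigenvalues cross at t = 0.
    return smallest_eigenvalue(t, 0.0, -t)

if __name__ == "__main__":
    for t in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(t, lambda_1(t))
    # The one-sided difference quotients at t = 0 have opposite signs,
    # so lambda_1 is Lipschitz but not differentiable there.
    step = 1e-3
    left = (lambda_1(0.0) - lambda_1(-step)) / step
    right = (lambda_1(step) - lambda_1(0.0)) / step
    print(left, right)
```

Being a minimum of smooth functions, the model function \(-|t|\) is concave, which matches the semiconcavity of \(\lambda _1\) used later in this section.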

Definition 2.4

Let \(f:M\times [0,T) \rightarrow {\mathbb {R}}\) be locally Lipschitz continuous and fix a point \((x_0,t_0) \in M \times (0,T)\). We say that a function \(\varphi \) is a lower support for f at \((x_0,t_0)\) if \(\varphi \) is \(C^2\) on the set \(B_{g(t_0)}(x_0,r) \times [-r^2 +t_0,t_0]\) for some \(r >0\) and there holds

$$\begin{aligned} f(x,t) \ge \varphi (x,t), \end{aligned}$$

with equality at \((x_0,t_0)\). If the inequality is reversed then \(\varphi \) is an upper support for f at \((x_0,t_0)\).

With this definition in place we have the following estimate:

Lemma 2.5

Fix \((x_0,t_0) \in M \times (0,T)\) and suppose \(\varphi \) is a lower support for \(\lambda _1\) at \((x_0,t_0)\). Then at \((x_0,t_0)\) there holds

$$\begin{aligned} (\partial _t - \Delta ) \varphi&\ge |h|^2 \varphi +T_{11}- \varphi |\nabla \nu _1|^2 - 2|H|^{-1} \langle \nabla {\hat{A}}_{11}, \nabla |H| \nu _1 \rangle - 2\langle \nabla {\hat{A}}_{11}, \nabla \nu _1 \rangle . \end{aligned}$$

Proof

We choose at the point \((x_0,t_0)\) an orthonormal basis of tangent vectors \(\{e_i\}\) which are such that

$$\begin{aligned} h(x_0,t_0) (e_i,e_i) = \lambda _i, \end{aligned}$$

and extend the \(\{e_i\}\) to a spatial neighbourhood of \(x_0\) by parallel transport with respect to \(g(t_0)\). We then extend to an orthonormal frame on a backward spacetime neighbourhood of \((x_0,t_0)\) by parallel transport with respect to the connection \(\nabla _{\partial _t}\). On this neighbourhood we can define a smooth function

$$\begin{aligned} \eta (x,t) := h(x,t)(e_1(x,t) ,e_1(x,t)). \end{aligned}$$

Observe that by the definition of \(\lambda _1\) there holds

$$\begin{aligned} \eta (x,t) \ge \lambda _1(x,t) \ge \varphi (x,t), \end{aligned}$$

with equality at \((x_0,t_0)\).

It follows that at the point \((x_0,t_0)\) we have

$$\begin{aligned} \partial _t \eta \le \partial _t \varphi , \qquad \Delta \eta \ge \Delta \varphi , \end{aligned}$$

hence

$$\begin{aligned} (\partial _t - \Delta ) \varphi \ge (\partial _t - \Delta ) \eta . \end{aligned}$$

At \((x_0,t_0)\) we compute

$$\begin{aligned} \partial _t \eta&= \nabla _{\partial _t} h(e_1, e_1) + 2 h(e_1, \nabla _{\partial _t} e_1)\\&= \Delta h_{11} +|h|^2 h_{11} +T_{11}- h_{11} |\nabla \nu _1|^2 - 2|H|^{-1} \langle \nabla {\hat{A}}_{11}, \nabla |H| \nu _1 \rangle - 2\langle \nabla {\hat{A}}_{11}, \nabla \nu _1 \rangle \\&= \Delta h_{11} +|h|^2 \varphi +T_{11}- \varphi |\nabla \nu _1|^2 - 2|H|^{-1} \langle \nabla {\hat{A}}_{11}, \nabla |H| \nu _1 \rangle - 2\langle \nabla {\hat{A}}_{11}, \nabla \nu _1 \rangle , \end{aligned}$$

and

$$\begin{aligned} \Delta h_{11}&= \nabla _i (\nabla _i (h_{11}) - 2 h(e_1, \nabla _i e_1))\\&=\Delta (h_{11}) - 2 h(e_1, \Delta e_1)\\&= \Delta \eta - 2 h (e_1, \Delta e_1). \end{aligned}$$

On the other hand, at \((x_0,t_0)\) there holds

$$\begin{aligned} \langle e_1, \Delta e_1\rangle = \nabla _k \langle e_1, \nabla _k e_1 \rangle =0, \end{aligned}$$

which shows that \(\Delta e_1\) is orthogonal to \(e_1\), so since h is diagonal at \((x_0,t_0)\) we obtain

$$\begin{aligned} \Delta h_{11} = \Delta \eta , \end{aligned}$$

and consequently

$$\begin{aligned} \partial _t \eta&= \Delta \eta +|h|^2 \varphi +T_{11}- \varphi |\nabla \nu _1|^2 - 2|H|^{-1} \langle \nabla {\hat{A}}_{11}, \nabla |H| \nu _1 \rangle - 2\langle \nabla {\hat{A}}_{11}, \nabla \nu _1 \rangle . \end{aligned}$$

It follows then that

$$\begin{aligned} (\partial _t - \Delta ) \varphi&\ge (\partial _t -\Delta )\eta \\&= |h|^2 \varphi +T_{11}- \varphi |\nabla \nu _1|^2 - 2|H|^{-1} \langle \nabla {\hat{A}}_{11}, \nabla |H| \nu _1 \rangle - 2\langle \nabla {\hat{A}}_{11}, \nabla \nu _1 \rangle \end{aligned}$$

at \((x_0,t_0)\) as required. \(\square \)

Eventually we will want to prove integral estimates for the function \(\lambda _1\). To do so we appeal to Alexandrov’s theorem, following Brendle [6] (see also [17]). We call a function \(f:M\times [0,T) \rightarrow {\mathbb {R}}\) locally semiconvex (resp. semiconcave) if about every \((x_0,t_0)\) there is a small open neighbourhood on which f can be expressed as the sum of a smooth and a convex (resp. concave) function.

Lemma 2.6

Let \(f:M\times [0,T) \rightarrow {\mathbb {R}}\) be locally semiconvex. Then f is twice differentiable almost everywhere in \(M\times [0,T)\), and if \(\varphi \) is a nonnegative Lipschitz function on M then for each \(t \in [0,T)\) there holds

$$\begin{aligned} \int _M \Delta f \cdot \varphi \, d\mu _t \le -\int _M \langle \nabla f, \nabla \varphi \rangle \,d\mu _t. \end{aligned}$$

Here \(\mu _t\) is the measure induced by the immersion \(F(\cdot ,t)\).

Proof

Choosing local coordinates and applying Alexandrov’s theorem [8, Sect. 6.4], we see that f has two derivatives at a.e. point in \(M\times [0,T)\). Furthermore, by [8, Sect. 6.3], for each \(t \in [0,T)\) there is a nonnegative singular Radon measure \(\chi \) on M with the property that

$$\begin{aligned} \int _M \Delta f \cdot \varphi \,d\mu _t + \int _M \varphi \,d\chi = - \int _M \langle \nabla f, \nabla \varphi \rangle \,d\mu _t \end{aligned}$$

for every \(\varphi \in C^2(M)\). Hence if \(\varphi \ge 0\) there holds

$$\begin{aligned} \int _M \Delta f \cdot \varphi \,d\mu _t \le - \int _M \langle \nabla f, \nabla \varphi \rangle \,d\mu _t. \end{aligned}$$

By approximation, the same inequality also holds if \(\varphi \) is only Lipschitz continuous. \(\square \)
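A one-dimensional model case, included here only as an illustration, shows why an inequality rather than an equality is to be expected. Take \(f(x) = |x|\) on \({\mathbb {R}}\), which is convex and twice differentiable away from the origin with \(f'' = 0\) there, and let \(\varphi \ge 0\) be smooth with compact support. Then

$$\begin{aligned} \int _{{\mathbb {R}}} f'' \varphi \, dx = 0 \le 2 \varphi (0) = - \int _{{\mathbb {R}}} f' \varphi ' \, dx, \end{aligned}$$

so in this example the singular part of the distributional second derivative is twice the Dirac mass at the origin, a nonnegative measure concentrated at the kink.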

Since h is smooth, on every small enough set in spacetime, \(\lambda _1\) can be expressed as the minimum over a set of smooth functions which is bounded in \(C^2\). This is sufficient to ensure that \(\lambda _1\) is locally semiconcave on \(M\times [0,T)\), so by the lemma we conclude that there is a set of full measure \(Q \subset M\times [0,T)\) where \(\lambda _1\) is twice differentiable.

Lemma 2.7

At each point in Q there holds

$$\begin{aligned} (\partial _t - \Delta ) \lambda _1&\ge |h|^2 \lambda _1 +T_{11}- \lambda _1 |\nabla \nu _1|^2 - 2|H|^{-1} \langle \nabla {\hat{A}}_{11}, \nabla |H| \nu _1 \rangle - 2\langle \nabla {\hat{A}}_{11}, \nabla \nu _1 \rangle . \end{aligned}$$

Proof

Fix a point \((x_0,t_0)\in Q\). Then \(\lambda _1\) admits a lower support \(\varphi \) at \((x_0,t_0)\), to which we can apply Lemma 2.5. Since \(\varphi (x_0,t_0) = \lambda _1(x_0,t_0)\), this gives the desired inequality. \(\square \)

Remark 2.8

Notice that the first of the gradient terms is nonnegative whenever \(\lambda _1 \le 0\), whereas the remaining gradient terms both contain \(\nabla {\hat{A}}\) as a factor. It is this structure which allows us to prove the convexity estimate.

2.3 The evolution of \(|{\hat{A}}|^2\)

The following evolution equation for \(|{\hat{A}}|^2\) was derived by Naff [21]:

We make use of the quantity

$$\begin{aligned} v := \frac{|{\hat{A}}|^2}{|H|}. \end{aligned}$$

Lemma 2.9

There is a positive constant \(C = C(n)\) such that

$$\begin{aligned} (\partial _t - \Delta ) v&\le C|A|^2 |{\hat{A}}| + C \frac{|{\hat{A}}|}{|H|} \frac{|\nabla A|^2}{|H|} - 2\frac{|\nabla {\hat{A}}|^2}{|H|} \end{aligned}$$

holds on \(M\times [0,T)\).

Proof

We use the formula

$$\begin{aligned} (\partial _t - \Delta ) \frac{f_1}{f_2} = \frac{1}{f_2} (\partial _t - \Delta ) f_1 - \frac{f_1}{f_2^2} (\partial _t - \Delta ) f_2 + \frac{2}{f_2} \bigg \langle \nabla \frac{f_1}{f_2}, \nabla f_2 \bigg \rangle \end{aligned}$$

to derive

$$\begin{aligned} (\partial _t - \Delta ) v&= \frac{1}{|H|} (\partial _t - \Delta ) |{\hat{A}}|^2 - \frac{|{\hat{A}}|^2}{|H|^2} (\partial _t - \Delta ) |H| + \frac{2}{|H|} \bigg \langle \nabla \frac{|{\hat{A}}|^2}{|H|}, \nabla |H|\bigg \rangle . \end{aligned}$$

We estimate

$$\begin{aligned} \frac{2}{|H|} \bigg \langle \nabla \frac{|{\hat{A}}|^2}{|H|}, \nabla |H|\bigg \rangle&= \frac{2}{|H|} \bigg \langle \frac{1}{|H|} \nabla |{\hat{A}}|^2 - \frac{|{\hat{A}}|^2}{|H|^2} \nabla |H|, \nabla |H|\bigg \rangle \\&= \frac{4}{|H|^2} {\hat{A}}_{ij}\langle \nabla {\hat{A}}_{ij}, \nabla |H|\rangle - 2 \frac{|{\hat{A}}|^2}{|H|^2} \frac{|\nabla |H||^2}{|H|}\\&\le C \frac{|{\hat{A}}|}{|H|} \frac{|\nabla A|^2}{|H|}. \end{aligned}$$

By (2.3) we have

$$\begin{aligned} - \frac{|{\hat{A}}|^2}{|H|^2}(\partial _t - \Delta ) |H|&= - \frac{|{\hat{A}}|^2}{|H|^2}( |h|^2 |H| - |H||\nabla \nu _1|^2) \le C\frac{|{\hat{A}}|}{|H|} \frac{|\nabla A|^2}{|H|}. \end{aligned}$$

Combining these inequalities, we obtain

$$\begin{aligned} (\partial _t - \Delta ) v&\le \frac{1}{|H|} (\partial _t - \Delta ) |{\hat{A}}|^2 + C \frac{|{\hat{A}}|}{|H|} \frac{|\nabla A|^2}{|H|}. \end{aligned}$$
(2.5)

We recall

and estimate

$$\begin{aligned} (\partial _t - \Delta ) |{\hat{A}}|^2&\le C|{\hat{A}}|^4 + 2 \sum _{i,j,\alpha } |R^{\perp }_{ij1 \alpha }|^2 - 2|\nabla {\hat{A}}|^2 + C \frac{|{\hat{A}}|}{|H|} |\nabla A|^2. \end{aligned}$$

Then since

$$\begin{aligned} R^\perp _{ij\alpha \beta } = A_{ip\alpha } A_{jp\beta } - A_{ip\beta }A_{jp\alpha } \end{aligned}$$

we can write

$$\begin{aligned} 2 \sum _{i,j,\alpha } |R^{\perp }_{ij1 \alpha }|^2 = 2 \sum _{i,j}\sum _{\alpha \ge 2} |h_{ip} {\hat{A}}_{jp\alpha } - {\hat{A}}_{ip\alpha } h_{jp}|^2 \end{aligned}$$

and use this to bound

$$\begin{aligned} (\partial _t - \Delta ) |{\hat{A}}|^2&\le C|A|^2 |{\hat{A}}|^2 + C \frac{|{\hat{A}}|}{|H|} |\nabla A|^2 - 2|\nabla {\hat{A}}|^2 . \end{aligned}$$

Substituting this inequality into (2.5) and using the quadratic pinching gives the desired estimate. \(\square \)
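For completeness, the last step can be written out as follows. Quadratic pinching gives \(|{\hat{A}}| \le |A| \le C|H|\) with \(C = C(n)\), so dividing the bound for \((\partial _t - \Delta ) |{\hat{A}}|^2\) by \(|H|\) yields

$$\begin{aligned} \frac{1}{|H|} (\partial _t - \Delta ) |{\hat{A}}|^2&\le C|A|^2 |{\hat{A}}| \frac{|{\hat{A}}|}{|H|} + C \frac{|{\hat{A}}|}{|H|} \frac{|\nabla A|^2}{|H|} - 2\frac{|\nabla {\hat{A}}|^2}{|H|} \\&\le C|A|^2 |{\hat{A}}| + C \frac{|{\hat{A}}|}{|H|} \frac{|\nabla A|^2}{|H|} - 2\frac{|\nabla {\hat{A}}|^2}{|H|}, \end{aligned}$$

where the constant C has been enlarged in the second line. Inserting this into (2.5) gives the inequality of Lemma 2.9.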

2.4 Modifying \(\lambda _1\)

We now form the quantity

$$\begin{aligned} f(x,t) := -\lambda _1(x,t) - \varepsilon w(x,t) + \Lambda v(x,t), \end{aligned}$$

where \(\varepsilon \) and \(\Lambda \) are positive constants to be chosen later. Combining the evolution equations for the three components we obtain:

Lemma 2.10

At each point in Q there holds

$$\begin{aligned} (\partial _t - \Delta )f&\le |h|^2 f + C(1+ \Lambda ) |A|^2 |{\hat{A}}| - ( f + \varepsilon w )|\nabla \nu _1|^2 \\&\quad - \bigg ( \frac{\varepsilon \delta _0}{2}- C\Lambda \frac{|{\hat{A}}|}{|H|} \bigg ) \frac{|\nabla A|^2}{|H|} - \bigg (2\Lambda - \frac{C}{\varepsilon \delta _0}\bigg )\frac{|\nabla {\hat{A}}|^2}{|H|}, \end{aligned}$$

where \(C = C(n)\).

Proof

At any point in Q we compute

$$\begin{aligned} (\partial _t - \Delta ) f = - (\partial _t - \Delta )\lambda _1 -\varepsilon (\partial _t - \Delta ) w+\Lambda (\partial _t - \Delta )v, \end{aligned}$$

so by Lemma 2.7,

$$\begin{aligned} (\partial _t - \Delta )f&\le -|h|^2 \lambda _1 -T_{11}+ \lambda _1 |\nabla \nu _1|^2 +2|H|^{-1} \langle \nabla {\hat{A}}_{11}, \nabla |H| \nu _1 \rangle \\&\quad + 2\langle \nabla {\hat{A}}_{11}, \nabla \nu _1 \rangle -\varepsilon (\partial _t - \Delta ) w+ \Lambda (\partial _t - \Delta )v. \end{aligned}$$

Inserting the estimates from Lemmas 2.2 and 2.9 we find that

$$\begin{aligned} (\partial _t - \Delta )f&\le |h|^2 (-\lambda _1 - \varepsilon w) -T_{11}+ \lambda _1 |\nabla \nu _1|^2 + 2|H|^{-1} \langle \nabla {\hat{A}}_{11}, \nabla |H| \nu _1 \rangle \\&\quad + 2\langle \nabla {\hat{A}}_{11}, \nabla \nu _1 \rangle -\varepsilon \delta _0 \frac{|\nabla A|^2}{|H|}+ \Lambda \bigg ( C|A|^2 |{\hat{A}}| + C \frac{|{\hat{A}}|}{|H|} \frac{|\nabla A|^2}{|H|} - 2\frac{|\nabla {\hat{A}}|^2}{|H|}\bigg ), \end{aligned}$$

where \(C = C(n)\). Using the definition of f and rearranging we obtain

$$\begin{aligned} (\partial _t - \Delta )f&\le |h|^2 (f - \Lambda v) -T_{11}+ C \Lambda |A|^2 |{\hat{A}}| + (- f - \varepsilon w + \Lambda v)|\nabla \nu _1|^2 \\&\quad + 2|H|^{-1} \langle \nabla {\hat{A}}_{11}, \nabla |H| \nu _1 \rangle + 2\langle \nabla {\hat{A}}_{11}, \nabla \nu _1 \rangle + C\Lambda \frac{|{\hat{A}}|}{|H|} \frac{|\nabla A|^2}{|H|}\\&\quad -\varepsilon \delta _0 \frac{|\nabla A|^2}{|H|} - 2\Lambda \frac{|\nabla {\hat{A}}|^2}{|H|}. \end{aligned}$$

Next we estimate

$$\begin{aligned} -T_{11}&= -2 \sum _{\beta \ne 1}^n {\hat{A}}_{11\beta } {\hat{A}}_{pq\beta } h_{pq} -\sum _{\beta \ne 1} {\hat{A}}_{1p\beta } {\hat{A}}_{pq \beta } h_{q1} \\&\quad -\sum _{\beta \ne 1}{\hat{A}}_{1p\beta }{\hat{A}}_{pq\beta }h_{q1} + 2 \sum _{\beta \ne 1} {\hat{A}}_{1p\beta }{\hat{A}}_{1q\beta }h_{pq}\\&\le C|A|^2 |{\hat{A}}| \end{aligned}$$

and

$$\begin{aligned} 2|H|^{-1} \langle \nabla {\hat{A}}_{11}, \nabla |H| \nu _1 \rangle + 2\langle \nabla {\hat{A}}_{11}, \nabla \nu _1 \rangle&\le C |H|^{-1} |\nabla {\hat{A}}| |\nabla A|\\&\le \frac{\varepsilon \delta _0}{2} \frac{|\nabla A|^2}{|H|} + \frac{C}{\varepsilon \delta _0} \frac{|\nabla {\hat{A}}|^2}{|H|} \end{aligned}$$

in order to obtain

$$\begin{aligned} (\partial _t - \Delta )f&\le |h|^2 (f - \Lambda v) + C(1+ \Lambda ) |A|^2 |{\hat{A}}| - ( f + \varepsilon w -\Lambda v)|\nabla \nu _1|^2 \\&\quad -\frac{\varepsilon \delta _0}{2} \frac{|\nabla A|^2}{|H|} + C\Lambda \frac{|{\hat{A}}|}{|H|} \frac{|\nabla A|^2}{|H|} + \bigg (\frac{C}{\varepsilon \delta _0} - 2\Lambda \bigg )\frac{|\nabla {\hat{A}}|^2}{|H|}. \end{aligned}$$

Finally, by bounding

$$\begin{aligned} \Lambda v |\nabla \nu _1|^2 \le C \Lambda \frac{|{\hat{A}}|}{|H|} \frac{|\nabla A|^2}{|H|} , \end{aligned}$$

we arrive at

$$\begin{aligned} (\partial _t - \Delta ) f&\le |h|^2 (f-\Lambda v) + C(1+ \Lambda ) |A|^2 |{\hat{A}}| - ( f + \varepsilon w )|\nabla \nu _1|^2 \\&\quad - \bigg ( \frac{\varepsilon \delta _0}{2}- C\Lambda \frac{|{\hat{A}}|}{|H|} \bigg ) \frac{|\nabla A|^2}{|H|} - \bigg (2\Lambda - \frac{C}{\varepsilon \delta _0}\bigg )\frac{|\nabla {\hat{A}}|^2}{|H|}. \end{aligned}$$

\(\square \)

3 A Poincaré inequality

In this section we establish a Poincaré-type inequality for the high codimension solution \(M_t\). The proof loosely follows [14, Lemma 5.4], in that we combine Simons’ identity with an integration by parts argument. We also incorporate an idea from [4, Theorem 3.1], where the authors symmetrise and then take the square of Simons’ identity to fully exploit the structure of the cubic zeroth-order terms.

Simons’ identity for high codimension submanifolds states that

$$\begin{aligned} \nabla _k \nabla _ l A _{ij\alpha }&= \nabla _i \nabla _j A_{kl \alpha } + A_{ kl \beta } A _{ ip \beta } A _{ jp \alpha } - A_{ ij \beta } A _{ kp \beta } A _{ lp \alpha }\\&\quad + A_{ jl \beta } A_{ ip \beta } A _{ kp \alpha } + A_{ jk \beta } A _{ ip \beta } A _{ lp \alpha }-A _{ il \beta } A _{ kp \beta } A _{ jp \alpha }- A_{ jl \beta }A _{ kp \beta }A _{ ip \alpha }. \end{aligned}$$

We symmetrise to get

$$\begin{aligned} \nabla _k \nabla _ l A _{ij\alpha } +\nabla _l\nabla _k A_{ji\alpha } =&\ \nabla _i \nabla _j A_{kl \alpha } + \nabla _j\nabla _i A_{lk\alpha } + E_{klij\alpha }, \end{aligned}$$

where

$$\begin{aligned} E_{klij\alpha }&= A_{ kl \beta } A _{ ip \beta } A _{ jp \alpha } + A_{lk\beta } A_{jp\beta } A_{ip\alpha }\\&\quad - A_{ ij \beta } A _{ kp \beta } A _{ lp \alpha }-A_{ji\beta } A_{lp\beta }A_{kp\alpha }\\&\quad + A_{ jl \beta } A_{ ip \beta } A _{ kp \alpha } +A_{ik\beta } A_{jp\beta }A_{lp\alpha }\\&\quad + A_{ jk \beta } A _{ ip \beta } A _{ lp \alpha }+ A_{il\beta } A_{jp\beta }A_{kp\alpha }\\&\quad -A _{ il \beta } A _{ kp \beta } A _{ jp \alpha } -A_{jk\beta }A_{lp\beta }A_{ip\alpha }\\&\quad - A_{ jl \beta } A _{ kp \beta }A _{ ip \alpha }-A_{ik\beta } A_{lp\beta }A_{jp\alpha }. \end{aligned}$$

Using the relation

$$\begin{aligned} R^\perp _{ij \alpha \beta } = A_{ip\alpha } A_{jp \beta } - A_{ip\beta } A_{jp\alpha } \end{aligned}$$

we can rewrite the components of E as

$$\begin{aligned} E_{klij\alpha }&= \ A_{kl\beta } ( A_{ip\beta } A_{jp \alpha } + A_{jp\beta } A_{ip\alpha } ) - A_{ij \beta } (A_{kp\beta }A_{lp \alpha }+A_{lp\beta } A_{kp\alpha } ) \\&\quad +A_{jl\beta } (A_{ip\beta } A_{kp\alpha } - A_{kp\beta }A_{ip\alpha } ) + A_{jk\beta } ( A_{ip\beta }A_{lp\alpha } - A_{lp\beta }A_{ip\alpha } ) \\&\quad +A_{ik\beta } (A_{lp\alpha }A_{jp\beta } - A_{lp\beta } A_{jp\alpha } ) + A_{il\beta }( A_{kp\alpha } A_{jp\beta } - A_{kp\beta } A_{jp \alpha })\\&= \ A_{kl\beta } ( A_{ip\beta } A_{jp \alpha } + A_{jp\beta } A_{ip\alpha } ) - A_{ij \beta } (A_{kp\beta }A_{lp \alpha }+A_{lp\beta } A_{kp\alpha } ) \\&\quad + A_{jl\beta } R^\perp _{ki\alpha \beta } + A _{ jk\beta } R^\perp _{li\alpha \beta } + A_{ik\beta } R^\perp _{lj\alpha \beta } + A _{il\beta }R^\perp _{ kj\alpha \beta }\\&= 2 A_{kl\beta } A_{ip\beta } A_{jp \alpha } - 2A_{ij \beta } A_{kp\beta }A_{lp \alpha } + A_{kl\beta } R^\perp _{ij\alpha \beta } - A_{ij\beta } R^\perp _{kl\alpha \beta }\\&\quad + A_{jl\beta } R^\perp _{ki\alpha \beta } + A _{ jk\beta } R^\perp _{li\alpha \beta } + A_{ik\beta } R^\perp _{lj\alpha \beta } + A _{il\beta }R^\perp _{ kj\alpha \beta }. \end{aligned}$$

Lemma 3.1

There is a positive constant \(C=C(n)\) such that

$$\begin{aligned} |E|^2 \ge 8|h|^2 {{\,\textrm{tr}\,}}(h^4) - 8 {{\,\textrm{tr}\,}}(h^3)^2 - C|A|^5|{\hat{A}}|. \end{aligned}$$

Proof

Let us decompose E as

$$\begin{aligned} E_{klij\alpha } = U_{klij\alpha } + V_{klij\alpha } \end{aligned}$$

where

$$\begin{aligned} U_{klij\alpha }&:= 2 A_{kl\beta } A_{ip\beta } A_{jp \alpha } - 2A_{ij \beta } A_{kp\beta }A_{lp \alpha } ,\\ V_{klij\alpha }&:= A_{kl\beta } R^\perp _{ij\alpha \beta } - A_{ij\beta } R^\perp _{kl\alpha \beta } + A_{jl\beta } R^\perp _{ki\alpha \beta }\\&\quad + A _{ jk\beta } R^\perp _{li\alpha \beta } + A_{ik\beta } R^\perp _{lj\alpha \beta } + A _{il\beta }R^\perp _{ kj\alpha \beta }. \end{aligned}$$

There then holds

$$\begin{aligned} |E|^2 = |U|^2 + 2 \langle U, V\rangle + |V|^2. \end{aligned}$$

Breaking U into components parallel and orthogonal to the mean curvature vector we obtain

$$\begin{aligned} U_{klij}&= 2 \langle A_{kl} ,A_{ip} \rangle A_{jp} - 2\langle A_{ij}, A_{kp}\rangle A_{lp}\\&= 2 h_{kl} h_{ip} A_{jp} - 2 h_{ij}h_{kp} A_{lp} + 2 \langle {\hat{A}}_{kl}, {\hat{A}}_{ip} \rangle A_{jp} - 2 \langle {\hat{A}}_{ij}, {\hat{A}}_{kp} \rangle A_{lp}\\&= 2 h_{kl} h_{ip} h_{jp} \nu _1 - 2 h_{ij}h_{kp} h_{lp} \nu _1+ 2 h_{kl} h_{ip} {\hat{A}}_{jp} - 2 h_{ij}h_{kp} {\hat{A}}_{lp}\\&\quad + 2 \langle {\hat{A}}_{kl}, {\hat{A}}_{ip} \rangle A_{jp} - 2 \langle {\hat{A}}_{ij}, {\hat{A}}_{kp} \rangle A_{lp}, \end{aligned}$$

hence

$$\begin{aligned} |U|^2&\ge 4 | h_{kl} h_{ip} h_{jp} - h_{ij}h_{kp} h_{lp} |^2 - C(n) |{\hat{A}}| |A|^5+ 4 |\langle {\hat{A}}_{kl}, {\hat{A}}_{ip} \rangle A_{jp} - \langle {\hat{A}}_{ij}, {\hat{A}}_{kp} \rangle A_{lp}|^2 \\&\ge 8 |h|^2 {{\,\textrm{tr}\,}}(h^4)-8{{\,\textrm{tr}\,}}(h^3)^2 - C |{\hat{A}}||A|^5. \end{aligned}$$

Substituting this back in we arrive at

$$\begin{aligned} |E|^2 \ge 8 |h|^2 {{\,\textrm{tr}\,}}(h^4)-8{{\,\textrm{tr}\,}}(h^3)^2 - 2 |U||V| - C |{\hat{A}}||A|^5. \end{aligned}$$

We may bound

$$\begin{aligned} |V| \le C|A||R^\perp |, \end{aligned}$$

and we have

$$\begin{aligned} R^{\perp }_{ij1\beta } = h_{ip} {\hat{A}}_{jp\beta } - {\hat{A}}_{ip\beta } h_{jp} \end{aligned}$$

and

$$\begin{aligned} R^{\perp }_{ij\alpha \beta } = {\hat{A}}_{ip\alpha } A_{jp\beta } - A_{ip\beta } {\hat{A}}_{jp\alpha },\qquad \alpha \ge 2. \end{aligned}$$

Combining these inequalities gives

$$\begin{aligned} |V| \le C |A|^2 |{\hat{A}}|, \end{aligned}$$

so since \(|U| \le C|A|^3\) we have

$$\begin{aligned} |E|^2 \ge 8 |h|^2 {{\,\textrm{tr}\,}}(h^4)-8{{\,\textrm{tr}\,}}(h^3)^2 - C|A|^5 |{\hat{A}}|. \end{aligned}$$

\(\square \)
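As a numerical sanity check on the algebra in the proof above, the following sketch tests the identity \(4|h_{kl}h_{ip}h_{jp} - h_{ij}h_{kp}h_{lp}|^2 = 8|h|^2 {{\,\textrm{tr}\,}}(h^4) - 8{{\,\textrm{tr}\,}}(h^3)^2\) on random symmetric matrices. It is only an illustration: the helper names, the matrix size and the random seed are assumptions made for this example and do not come from the argument itself.

```python
import itertools
import random


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def random_symmetric(n, rng):
    """Random symmetric n x n matrix with entries in [-1, 1]."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            m[i][j] = m[j][i] = rng.uniform(-1.0, 1.0)
    return m


def principal_part_norm_sq(h):
    """4 * sum over k,l,i,j of (h_kl (h^2)_ij - h_ij (h^2)_kl)^2."""
    n = len(h)
    h2 = matmul(h, h)  # (h^2)_ij = h_ip h_jp for symmetric h
    total = 0.0
    for k, l, i, j in itertools.product(range(n), repeat=4):
        entry = h[k][l] * h2[i][j] - h[i][j] * h2[k][l]
        total += entry * entry
    return 4.0 * total


def trace_expression(h):
    """8 |h|^2 tr(h^4) - 8 tr(h^3)^2, using |h|^2 = tr(h^2)."""
    h2 = matmul(h, h)
    h3 = matmul(h2, h)
    h4 = matmul(h2, h2)
    return 8.0 * trace(h2) * trace(h4) - 8.0 * trace(h3) ** 2


def max_identity_gap(trials=20, n=4, seed=0):
    """Largest discrepancy between the two sides over random samples."""
    rng = random.Random(seed)
    samples = (random_symmetric(n, rng) for _ in range(trials))
    return max(abs(principal_part_norm_sq(h) - trace_expression(h))
               for h in samples)
```

Calling `max_identity_gap()` should return a value at the level of floating-point rounding error, which is consistent with the second inequality in the bound for \(|U|^2\) being an equality on the purely codimension-one part.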

We are now ready to prove the Poincaré inequality. The proof does not actually use the fact that \(M_t\) moves by mean curvature, so this result can be viewed as a general statement about (quadratically pinched) high codimension submanifolds.

Proposition 3.2

Fix \(t \in [0,T)\) and let \(u:M\rightarrow {\mathbb {R}}\) be a nonnegative Lipschitz function which is supported in \(\{x:f(x,t)>0\}\). Then there is a positive constant \(C = C(n,\varepsilon _0,\varepsilon ,\Lambda )\) such that

$$\begin{aligned} \int _M |h|^2 u^2 \, d\mu _t&\le C \int _M u^2 \frac{|\nabla A|^2}{|A|^2} \, d\mu _t +C\int _M u |\nabla u| \frac{|\nabla A|}{|A|} \,d \mu _t + C \int _M |A| |{\hat{A}}| u^2\,d\mu _t. \end{aligned}$$

Proof

For a symmetric matrix B with eigenvalues \(\mu _i\) there holds

$$\begin{aligned} |B|^2 {{\,\textrm{tr}\,}}(B^4) - {{\,\textrm{tr}\,}}(B^3)^2&= \frac{1}{2} \sum _{i,j}( \mu _i^2 \mu _j^4 - \mu _i^3 \mu _j^3 )+\frac{1}{2} \sum _{i,j}( \mu _j^2 \mu _i^4 - \mu _i^3 \mu _j^3 )\\&=\frac{1}{2}\sum _{i,j} \mu _i^2 \mu _j^2 (\mu _i^2 + \mu _j^2 - 2 \mu _i \mu _j)\\&= \frac{1}{2}\sum _{i,j}\mu _i^2 \mu _j^2(\mu _i - \mu _j)^2. \end{aligned}$$

Observe that the right-hand side vanishes if and only if B is the second fundamental form of a codimension-one cylinder. Let us define

$$\begin{aligned} B(x,t) = h(x,t) - \Lambda v(x,t) g(x,t), \end{aligned}$$

which has as eigenvalues \(\mu _i = \lambda _i - \Lambda v\). In particular, the computation above shows that

$$\begin{aligned} |B|^2 {{\,\textrm{tr}\,}}(B^4) - {{\,\textrm{tr}\,}}(B^3)^2&\ge \mu _n^2 \mu _1^2 (\mu _n-\mu _1)^2\\&\quad =\lambda _n^2 \mu _1^2 (\mu _n - \mu _1)^2 - 2 \Lambda \lambda _n \mu _1^2 (\mu _n - \mu _1)^2 v\\&\qquad + \Lambda ^2 \mu _1^2 (\mu _n - \mu _1)^2 v^2\\&\quad \ge \frac{1}{C} |h|^2 \mu _1^2 (\mu _n - \mu _1)^2 - C |A|^5 |{\hat{A}}| \end{aligned}$$

where \(C = C(n,\Lambda )\). At any point where \(f(x,t) >0\) we have

$$\begin{aligned} \lambda _1(x,t) < - \varepsilon w(x,t) + \Lambda v(x,t), \end{aligned}$$

which is to say that \(\mu _1(x,t) \le - \varepsilon w(x,t)\). Furthermore, since

$$\begin{aligned} 0< |H|(x,t) = \lambda _1(x,t) + \dots + \lambda _n(x,t), \end{aligned}$$

we have \(\lambda _n(x,t) >0\), and hence

$$\begin{aligned} \mu _n(x,t) - \mu _1(x,t)&= \lambda _n(x,t) - \lambda _1(x,t) > \varepsilon w(x,t) - \Lambda v(x,t). \end{aligned}$$

If the right-hand side is nonnegative then we can square both sides to get an estimate of the form

$$\begin{aligned} (\mu _n(x,t) - \mu _1(x,t))^2&\ge \varepsilon ^2 w(x,t)^2 - C\varepsilon \Lambda |A||{\hat{A}}| \end{aligned}$$

where \(C = C(n)\). On the other hand if

$$\begin{aligned} \varepsilon w(x,t) - \Lambda v(x,t)< 0 \end{aligned}$$

then trivially there holds

$$\begin{aligned} (\mu _n(x,t) - \mu _1(x,t))^2 \ge 0 \ge \varepsilon ^2 w(x,t)^2 - \Lambda ^2 v(x,t)^2, \end{aligned}$$

so in either case we can bound

$$\begin{aligned} (\mu _n(x,t) - \mu _1(x,t))^2&\ge \varepsilon ^2 w(x,t)^2 - C |A||{\hat{A}}| \end{aligned}$$

with \(C = C(n,\varepsilon , \Lambda )\).

Putting these estimates together we find that on \(\{x:f(x,t)>0\}\) there holds

$$\begin{aligned} |B|^2 {{\,\textrm{tr}\,}}(B^4) - {{\,\textrm{tr}\,}}(B^3)^2&\ge \mu _n^2 \mu _1^2 (\mu _n-\mu _1)^2\\&\ge \frac{\varepsilon ^4}{C} |h|^2 w^4 - C |A|^5 |{\hat{A}}|, \end{aligned}$$

and since

$$\begin{aligned} |B|^2 {{\,\textrm{tr}\,}}(B^4) - {{\,\textrm{tr}\,}}(B^3)^2 \le |h|^2 {{\,\textrm{tr}\,}}(h^4) - {{\,\textrm{tr}\,}}(h^3)^2 + C |A|^5 |{\hat{A}}| \end{aligned}$$

we finally get

$$\begin{aligned} |h|^2 {{\,\textrm{tr}\,}}(h^4) - {{\,\textrm{tr}\,}}(h^3)^2 \ge C^{-1} |h|^2 |A|^4 - C | A|^5|{\hat{A}}| \end{aligned}$$

where \(C = C(n, \varepsilon _0, \varepsilon , \Lambda )\).

Combining this inequality with the result of the last lemma, we find that on \(\{x:f(x,t)>0\}\) there holds

$$\begin{aligned} |h|^2 |A|^4&\le C(|h|^2 {{\,\textrm{tr}\,}}(h^4) - {{\,\textrm{tr}\,}}(h^3)^2 )+ C| A|^5|{\hat{A}}|\\&\le C|E|^2 + C|A|^5 |{\hat{A}}|. \end{aligned}$$

Let u be a nonnegative Lipschitz function supported in \(\{x:f(x,t)>0\}\). Then we can multiply this inequality by \(|A|^{-4} u^2\) and integrate over M to get

$$\begin{aligned}&\int _M |h|^2 u^2 \,d\mu _t \\&\quad \le C \int _M |A|^{-4} u^2 |E|^2 + |A| |{\hat{A}}| u^2 \, d\mu _t\\&\quad = C \int _M |A|^{-4} u^2 E_{klij\alpha } (\nabla _k \nabla _ l A _{ij\alpha } +\nabla _l\nabla _k A_{ji\alpha } -\nabla _i \nabla _j A_{kl \alpha } - \nabla _j\nabla _i A_{lk\alpha }) \, d\mu _t\\&\qquad + C \int _M |A| |{\hat{A}}| u^2 \,d\mu _t. \end{aligned}$$

We are going to estimate each of the four Hessian terms on the right. Since each of these is handled in the same way, we only give the argument for the first one. Defining

$$\begin{aligned} T_k := |A|^{-4} u^2 E_{klij\alpha } \nabla _ l A _{ij\alpha }, \end{aligned}$$

we can write

$$\begin{aligned} |A|^{-4} u^2E_{klij\alpha } \nabla _k \nabla _ l A _{ij\alpha }&= \nabla _k T_k + 4 |A|^{-5} u^2 E_{klij\alpha } \nabla _ l A _{ij\alpha } \nabla _k |A| \\&\quad -2 |A|^{-4} u E_{klij\alpha }\nabla _ l A _{ij\alpha } \nabla _k u - |A|^{-4} u^2 \nabla _k E_{klij\alpha }\nabla _ l A _{ij\alpha }. \end{aligned}$$

The divergence term vanishes upon integration, and there is a purely dimensional constant C such that

$$\begin{aligned} |E| \le C|A|^3 , \qquad |\nabla E| \le C |A|^2 |\nabla A|, \qquad |\nabla |A|| \le C |\nabla A|, \end{aligned}$$

so making C a bit larger, we have

$$\begin{aligned}&\int _M |A|^{-4} u^2 E_{klij\alpha }\nabla _k \nabla _ l A _{ij\alpha } \, d\mu _t \le C \int _M u^2 |A|^{-5} |A|^3 |\nabla A|^2 \, d\mu _t\\&\quad + C \int _M u |\nabla u| |A|^{-4} |A|^3 |\nabla A| \,d \mu _t\\&\quad + C\int _M u^2 |A|^{-4} |A|^2 |\nabla A|^2 \,d\mu _t. \end{aligned}$$

Estimating the remaining Hessian terms in the same way and substituting back in we arrive at

$$\begin{aligned} \int _M |h|^2 u^2 \, d\mu _t&\le C \int _M u^2 \frac{|\nabla A|^2}{|A|^2} \, d\mu _t +C\int _M u |\nabla u| \frac{|\nabla A|}{|A|} \,d \mu _t + C \int _M |A| |{\hat{A}}| u^2\,d\mu _t. \end{aligned}$$

\(\square \)
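The eigenvalue identity at the start of the proof above, and the lower bound by the extreme eigenvalues drawn from it, can be spot-checked numerically. The sketch below does this for arbitrary real tuples standing in for the eigenvalues \(\mu _i\). The function names are hypothetical and chosen only for this illustration.

```python
import random


def trace_side(mu):
    """|B|^2 tr(B^4) - tr(B^3)^2 for a symmetric B with eigenvalues mu."""
    s2 = sum(m ** 2 for m in mu)
    s3 = sum(m ** 3 for m in mu)
    s4 = sum(m ** 4 for m in mu)
    return s2 * s4 - s3 ** 2


def pairwise_side(mu):
    """(1/2) * sum over i,j of mu_i^2 mu_j^2 (mu_i - mu_j)^2."""
    return 0.5 * sum(a * a * b * b * (a - b) ** 2 for a in mu for b in mu)


def extreme_pair_bound(mu):
    """mu_1^2 mu_n^2 (mu_n - mu_1)^2 for the smallest and largest eigenvalue."""
    lo, hi = min(mu), max(mu)
    return lo ** 2 * hi ** 2 * (hi - lo) ** 2


def random_eigenvalues(n, rng):
    return [rng.uniform(-2.0, 2.0) for _ in range(n)]
```

For instance, the tuple `[0.0, 0.0, 1.0, 1.0, 1.0]`, whose nonzero entries coincide as for a cylinder, makes both `trace_side` and `pairwise_side` vanish, while a generic tuple gives a strictly positive value bounded below by `extreme_pair_bound`.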

4 Stampacchia iteration

In this section we establish the convexity estimate by proving an a priori supremum estimate for the function

$$\begin{aligned} f_\sigma := \frac{f}{|H|^{1-\sigma }} \end{aligned}$$

where \(\sigma \in (0,1)\) is chosen small depending on n and \(M_0\). Recall from Lemma 2.10 that at each point in Q there holds

$$\begin{aligned} (\partial _t - \Delta )f&\le |h|^2 f + C(1+ \Lambda ) |A|^2 |{\hat{A}}| - ( f + \varepsilon w )|\nabla \nu _1|^2 \\&\quad - \bigg ( \frac{\varepsilon \delta _0}{2}- C\Lambda \frac{|{\hat{A}}|}{|H|} \bigg ) \frac{|\nabla A|^2}{|H|} - \bigg (2\Lambda - \frac{C}{\varepsilon \delta _0}\bigg )\frac{|\nabla {\hat{A}}|^2}{|H|}, \end{aligned}$$

where \(C = C(n)\). Let us fix

$$\begin{aligned} \Lambda = \frac{C}{2\varepsilon \delta _0} \end{aligned}$$

so that the last term vanishes. Then using

$$\begin{aligned} (\partial _t - \Delta ) |H|^{1-\sigma }&= (1-\sigma ) |h|^2 |H|^{1-\sigma } - (1-\sigma )|H|^{1-\sigma }|\nabla \nu _1|^2\\&\quad + \sigma (1-\sigma ) |H|^{-\sigma - 1} |\nabla |H||^2 \end{aligned}$$

we compute that

$$\begin{aligned} (\partial _t - \Delta ) f_\sigma&\le \sigma |h|^2 f_\sigma + C(1+ \Lambda ) |A|^2 \frac{|{\hat{A}}|}{|H|^{1-\sigma }} - \bigg (\sigma f_\sigma + \varepsilon \frac{w}{|H|^{1-\sigma } }\bigg )|\nabla \nu _1|^2 \\&\quad - \bigg ( \frac{\varepsilon \delta _0}{2}- C\Lambda \frac{|{\hat{A}}|}{|H|} \bigg ) |H|^\sigma \frac{|\nabla A|^2}{|H|^2} - \sigma (1-\sigma ) f_\sigma \frac{ |\nabla |H||^2}{|H|^2}\\&\quad + 2(1-\sigma ) \bigg \langle \nabla f_\sigma , \frac{\nabla |H|}{|H|} \bigg \rangle . \end{aligned}$$

Hence at points in \(Q \cap \{f_\sigma >0\}\) we have

$$\begin{aligned} (\partial _t - \Delta ) f_\sigma&\le \sigma |h|^2 f_\sigma +C(1+ \Lambda ) |A|^2 \frac{|{\hat{A}}|}{|H|^{1-\sigma }}- \bigg ( \frac{\varepsilon \delta _0}{2}- C\Lambda \frac{|{\hat{A}}|}{|H|} \bigg ) |H|^\sigma \frac{|\nabla A|^2}{|H|^{2}} \nonumber \\&\quad + 2(1-\sigma ) \bigg \langle \nabla f_\sigma , \frac{\nabla |H|}{|H|} \bigg \rangle , \end{aligned}$$
(4.1)

where \(C=C(n)\).

All of the computations until now apply to any quadratically pinched solution with

$$\begin{aligned} c \le \frac{4}{3n} - \varepsilon _0. \end{aligned}$$

From here on we assume \(n\ge 5\) and the more restrictive condition \(c \le c_n - \varepsilon _0\) where

$$\begin{aligned} c_n : ={\left\{ \begin{array}{ll} \frac{3(n+1)}{2n(n+2)} &{} n = 5, 6, 7, \\ \frac{4}{3n} &{} n \ge 8. \end{array}\right. } \end{aligned}$$

This is the range of pinching constants for which Naff’s codimension estimate is valid.

Theorem 4.1

([21]) Let \(F:M\times [0,T) \rightarrow {\mathbb {R}}^{n+k}\), \(n \ge 5\), be a quadratically pinched mean curvature flow with \(c \le c_n - \varepsilon _0\). Then there is a constant \(\eta = \eta (n,\varepsilon _0)\) in (0, 1) such that

$$\begin{aligned} \max _{M_t} \frac{|{\hat{A}}|^2}{|H|^{2-2\eta }} \le \max _{M_0} \frac{|{\hat{A}}|^2}{|H|^{2-2\eta }} \end{aligned}$$

for each \(t \in [0,T)\).

Hence if we set

$$\begin{aligned} L := \max _{M_0} |H| \end{aligned}$$

then the inequality

$$\begin{aligned}|{\hat{A}}|\le C L^\eta |H|^{1-\eta } \end{aligned}$$

holds on \(M_t\) for every \(t \in [0,T)\), where \(C=C(n)\). Inserting this estimate into (4.1) we find

$$\begin{aligned} (\partial _t - \Delta ) f_\sigma&\le \sigma |h|^2 f_\sigma +C(1+ \Lambda )L^\eta |A|^2 |H|^{\sigma -\eta }\nonumber \\&\quad - \bigg (\frac{\varepsilon \delta _0}{2}- C\Lambda L^\eta |H|^{-\eta }\bigg ) |H|^\sigma \frac{|\nabla A|^2}{|H|^{2}}+ 2(1-\sigma ) \bigg \langle \nabla f_\sigma , \frac{\nabla |H|}{|H|} \bigg \rangle \end{aligned}$$
(4.2)

on \(Q \cap \{f_\sigma >0\}\), where \(C=C(n)\).

4.1 \(L^p\)-estimates

For each \(k >0\) let us define

$$\begin{aligned} f_{\sigma ,k}(x,t) := \max \{ f_\sigma (x,t) -k,0\}. \end{aligned}$$

Using the Poincaré inequality we now establish an \(L^p\)-estimate for \(f_{\sigma ,k}\). In the codimension one case similar estimates have appeared in [14] and [12].

Proposition 4.2

There are positive constants \(p_0\) and \(\ell _0\) depending on n, \(\varepsilon _0\), \(\eta \), \(\varepsilon \) and \(\Lambda \), and a positive constant \(k_0 = k_0(n, \varepsilon _0, \eta , \varepsilon , \Lambda , L)\), with the following property. For every

$$\begin{aligned} p \ge p_0, \qquad \sigma \le \ell _0 p^{-\frac{1}{2}}, \qquad k \ge k_0, \end{aligned}$$

we have

$$\begin{aligned} \sup _{t \in [0,T)} \bigg ( \int _M f_{\sigma ,k}^p \,d\mu _t \bigg )\le C, \end{aligned}$$

where \(C = C(n, \varepsilon _0,\eta ,\varepsilon , \Lambda , L, \mu _0(M), T, k, \sigma , p)\).

Proof

Suppose for now that \(p_0 \ge 4\) and \(\ell _0 \le \eta \). Then the condition \(\sigma \le \ell _0 p^{-\frac{1}{2}}\) ensures that \(\sigma \le \eta /2\). On \({{\,\textrm{supp}\,}}(f_{\sigma ,k})\) we have

$$\begin{aligned} k < \frac{f}{|H|} |H|^\sigma \le C_0(n,\Lambda ) |H|^\sigma , \end{aligned}$$

so if we take \(k_0 \ge C_0\) and impose \(k \ge k_0\) then on \({{\,\textrm{supp}\,}}(f_{\sigma ,k})\) there holds

$$\begin{aligned} |H| \ge (k/C_0)^\frac{1}{\sigma } \ge \max \{k/C_0,1\}. \end{aligned}$$

Substituting this into (4.2) we find

$$\begin{aligned} (\partial _t - \Delta ) f_\sigma&\le \sigma |h|^2 f_\sigma +C(1+ \Lambda ) L^\eta |A|^2 |H|^{-\frac{\eta }{2}}- \bigg ( \frac{\varepsilon \delta _0}{2}- C_1 k^{-\eta } \bigg ) |H|^\sigma \frac{|\nabla A|^2}{|H|^{2}}\nonumber \\&\quad + 2(1-\sigma ) \bigg \langle \nabla f_\sigma , \frac{\nabla |H|}{|H|} \bigg \rangle \end{aligned}$$

on \(Q\cap {{\,\textrm{supp}\,}}(f_{\sigma , k})\), where \(C = C(n)\) and \(C_1 = C_1(n,\eta ,\Lambda ,L)\). Choosing \(k_0\) a bit larger so that

$$\begin{aligned} k_0 \ge \max \bigg \{1,C_0, \bigg ( \frac{4C_1}{\varepsilon \delta _0}\bigg )^{1/\eta } \bigg \} \end{aligned}$$

and using \(f/|H| \le C_0\), we find that on \(Q \cap {{\,\textrm{supp}\,}}(f_{\sigma ,k})\),

$$\begin{aligned} (\partial _t - \Delta ) f_\sigma&\le \sigma |h|^2 f_\sigma +C(1+ \Lambda )L^\eta |A|^2 |H|^{-\frac{\eta }{2}}- \frac{\varepsilon \delta _0}{4 C_0} f_\sigma \frac{|\nabla A|^2}{|H|^{2}}\nonumber \\&\quad + 2(1-\sigma ) \bigg \langle \nabla f_\sigma , \frac{\nabla |H|}{|H|} \bigg \rangle . \end{aligned}$$

By Young’s inequality we have

$$\begin{aligned} 2(1-\sigma ) \bigg \langle \nabla f_\sigma , \frac{\nabla |H|}{|H|} \bigg \rangle \le C_2 \frac{|\nabla f_\sigma |^2}{f_\sigma } + \frac{\varepsilon \delta _0}{8 C_0} f_\sigma \frac{|\nabla A|^2}{|H|^2} \end{aligned}$$

on \({{\,\textrm{supp}\,}}(f_\sigma )\), where \(C_2 = C_2(n,\varepsilon _0,\varepsilon , C_0)\). Hence on \(Q \cap {{\,\textrm{supp}\,}}(f_{\sigma ,k})\),

$$\begin{aligned} (\partial _t - \Delta ) f_\sigma&\le \sigma |h|^2 f_\sigma +C(1+ \Lambda ) L^\eta |A|^2 |H|^{-\frac{\eta }{2}}- \frac{\varepsilon \delta _0}{8 C_0} f_\sigma \frac{|\nabla A|^2}{|H|^{2}} + C_2 \frac{|\nabla f_\sigma |^2}{f_\sigma }. \end{aligned}$$

Applying the pinching we can bound

$$\begin{aligned} C(1+ \Lambda ) L^\eta |A|^2 |H|^{-\eta /2} \le C_3(n,\eta ,\Lambda ,L) |h|^2 |H|^{-\eta /2}, \end{aligned}$$

and by Young’s inequality

$$\begin{aligned} |H|^{-\eta /2} \le \frac{4 - \eta }{4}s^{4/(4-\eta )} + \frac{\eta }{4} \frac{1}{s^{4/\eta } } \frac{1}{|H|^2} \le s^{4/(4-\eta )} + \frac{1}{s^{4/\eta } } \frac{1}{|H|^2} \end{aligned}$$

for every positive s. Setting \(s = \sigma ^{(4-\eta )/4}\) gives

$$\begin{aligned} |H|^{-\eta /2} \le \sigma + \frac{1}{\sigma ^{(4-\eta )/\eta }} \frac{1}{|H|^2} \le \sigma + \sigma ^{-4/\eta } |H|^{-2}, \end{aligned}$$

so using the pinching we get

$$\begin{aligned} C(1+ \Lambda ) L^\eta |A|^2 |H|^{-\frac{\eta }{2}} \le C_3 \sigma |h|^2 + C_4 \sigma ^{-4/\eta } \end{aligned}$$

for some \(C_4 = C_4(n,\eta ,\Lambda ,L)\). Substituting back in, we have

$$\begin{aligned} (\partial _t - \Delta ) f_\sigma&\le \sigma |h|^2 f_\sigma + C_3\sigma |h|^2- c_0 f_\sigma \frac{|\nabla A|^2}{|H|^{2}} + C_2 \frac{|\nabla f_\sigma |^2}{f_\sigma } + C_4 \sigma ^{-4/\eta } \end{aligned}$$

on \(Q \cap {{\,\textrm{supp}\,}}(f_{\sigma ,k})\), where

$$\begin{aligned} c_0 := \frac{\varepsilon \delta _0}{8 C_0}. \end{aligned}$$

If \(\varphi \) is any nonnegative locally Lipschitz function supported in \({{\,\textrm{supp}\,}}(f_{\sigma ,k})\), then on almost every timeslice we can multiply the last inequality by \(\varphi \) and integrate to get

$$\begin{aligned}&\int _M \partial _t f_\sigma \cdot \varphi \,d\mu _t \le \int _M \Delta f_\sigma \cdot \varphi \,d\mu _t + \sigma \int _M |h|^2 f_\sigma \varphi \,d\mu _t + C_3 \sigma \int _M |h|^2 \varphi \, d\mu _t \\&\quad - c_0 \int _M f_\sigma \varphi \frac{|\nabla A|^2}{|H|^2} \,d\mu _t + C_2 \int _M \varphi \frac{|\nabla f_\sigma |^2}{f_\sigma }\,d\mu _t + C_4 \sigma ^{-4/\eta }\int _M \varphi \,d\mu _t. \end{aligned}$$

Since \(f_\sigma \) is a locally semiconvex function we can use Lemma 2.6 to integrate by parts, and so obtain

$$\begin{aligned} \int _M \partial _t f_\sigma \cdot \varphi \,d\mu _t&\le -\int _M \langle \nabla f_\sigma , \nabla \varphi \rangle \,d\mu _t + \sigma \int _M |h|^2 f_\sigma \varphi \,d\mu _t + C_3 \sigma \int _M |h|^2 \varphi \, d\mu _t \\&\quad - c_0 \int _M f_\sigma \varphi \frac{|\nabla A|^2}{|H|^2} \,d\mu _t + C_2 \int _M \varphi \frac{|\nabla f_\sigma |^2}{f_\sigma }\,d\mu _t + C_4 \sigma ^{-4/\eta }\int _M \varphi \,d\mu _t. \end{aligned}$$

We set \(\varphi = p f_{\sigma , k}^{p-1}\) in this inequality and use

$$\begin{aligned} \frac{d}{dt} \int _M f_{\sigma , k}^p \,d\mu _t = p \int _M \partial _t f_\sigma \cdot f_{\sigma ,k}^{p-1} \,d\mu _t - \int _M |H|^2 f_{\sigma ,k}^p\,d\mu _t \end{aligned}$$

to estimate

$$\begin{aligned}&\frac{d}{dt} \int _M f_{\sigma ,k}^p \,d\mu _t\\&\quad \le - p(p-1) \int _M f_{\sigma ,k}^{p-2}|\nabla f_\sigma |^2\,d\mu _t + \sigma p\int _M |h|^2 f_\sigma f_{\sigma ,k}^{p-1} \,d\mu _t \\&\qquad + C_3 \sigma p \int _M |h|^2 f_{\sigma ,k}^{p-1}\, d\mu _t - c_0 p\int _M f_\sigma f_{\sigma ,k}^{p-1} \frac{|\nabla A|^2}{|H|^2} \,d\mu _t\\&\qquad + C_2 p\int _M f_{\sigma ,k}^{p-1} \frac{|\nabla f_\sigma |^2}{f_\sigma }\,d\mu _t + C_4 \sigma ^{-4/\eta } p \int _M f_{\sigma ,k}^{p-1} \,d\mu _t \end{aligned}$$

for almost every \(t \in [0,T)\). Using that \(f_{\sigma ,k} = f_\sigma - k\) on \({{\,\textrm{supp}\,}}(f_{\sigma ,k})\) and rearranging slightly, this gives

$$\begin{aligned} \frac{d}{dt} \int _M f_{\sigma ,k}^p \,d\mu _t&\le - (p(p-1) - C_2p)\int _M f_{\sigma ,k}^{p-2}|\nabla f_\sigma |^2\,d\mu _t - c_0 p\int _M f_{\sigma ,k}^{p} \frac{|\nabla A|^2}{|H|^2} \,d\mu _t \\&\quad + \sigma p\int _M |h|^2 f_{\sigma ,k}^{p} \,d\mu _t + (C_3+k) \sigma p \int _M |h|^2 f_{\sigma ,k}^{p-1}\, d\mu _t \\&\quad + C_4 \sigma ^{-4/\eta } p \int _M f_{\sigma ,k}^{p-1} \,d\mu _t. \end{aligned}$$

Using Young’s inequality we estimate

$$\begin{aligned} (C_3 +k)\sigma p\int _M |h|^2 f_{\sigma ,k}^{p-1}\, d\mu _t&\le \sigma (p-1) \int _M |h|^2 f_{\sigma ,k}^p \,d\mu _t + (C_3+k)^p \sigma \int _M |h|^2 \, d\mu _t \end{aligned}$$

and

$$\begin{aligned} C_4 \sigma ^{-4/\eta } p \int _M f_{\sigma ,k}^{p-1} \,d\mu _t \le C_4 \sigma ^{-4/\eta } (p-1) \int _M f_{\sigma ,k}^{p} \,d\mu _t + C_4 \sigma ^{-4/\eta } \mu _t(M). \end{aligned}$$

Inserting these inequalities we arrive at

$$\begin{aligned} \frac{d}{dt} \int _M f_{\sigma ,k}^p \,d\mu _t&\le - (p(p-1) - C_2p)\int _M f_{\sigma ,k}^{p-2}|\nabla f_\sigma |^2\,d\mu _t - c_0 p\int _M f_{\sigma ,k}^{p} \frac{|\nabla A|^2}{|H|^2} \,d\mu _t \nonumber \\&\quad + 2\sigma p\int _M |h|^2 f_{\sigma ,k}^{p} \,d\mu _t + (C_3+k)^p \sigma \int _M |h|^2 \, d\mu _t \nonumber \\&\quad +C_4 \sigma ^{-4/\eta } p \int _M f_{\sigma ,k}^{p} \,d\mu _t + C_4 \sigma ^{-4/\eta } \mu _t(M). \end{aligned}$$
(4.3)

We now apply the Poincaré inequality with \(u = f_{\sigma , k}^{\frac{p}{2}}\) to obtain

$$\begin{aligned} \int _M |h|^2 f_{\sigma ,k}^p \, d\mu _t&\le C_5 \int _M f_{\sigma ,k}^p \frac{|\nabla A|^2}{|H|^2} \, d\mu _t +C_5 p \int _M f_{\sigma ,k}^{p-1} |\nabla f_\sigma | \frac{|\nabla A|}{|H|} \,d \mu _t\\&\quad + C_5 \int _M |A| |{\hat{A}}| f_{\sigma ,k}^p\,d\mu _t, \end{aligned}$$

where the constant \(C_5\) depends on n, \(\varepsilon _0\), \(\varepsilon \) and \(\Lambda \). Applying Young’s inequality we obtain

$$\begin{aligned} \int _M |h|^2 f_{\sigma ,k}^p \, d\mu _t&\le C_5(1+p^\frac{1}{2}) \int _M f_{\sigma ,k}^p \frac{|\nabla A|^2}{|H|^2} \, d\mu _t +C_5 p^\frac{3}{2} \int _M f_{\sigma ,k}^{p-2} |\nabla f_\sigma |^2 \,d \mu _t\\&\quad + C_5 \int _M |A| |{\hat{A}}| f_{\sigma ,k}^p\,d\mu _t. \end{aligned}$$

Inserting the codimension estimate and quadratic pinching we get

$$\begin{aligned} C_5 \int _M |A| |{\hat{A}}| f_{\sigma ,k}^p\,d\mu _t \le C_6(n, L, C_5) \int _M |h|^2 |H|^{-\eta } f_{\sigma ,k}^p\,d\mu _t, \end{aligned}$$

and we know that \(|H| \ge k/C_0\) on \({{\,\textrm{supp}\,}}(f_{\sigma ,k})\), so if we take

$$\begin{aligned} k_0 \ge \max \bigg \{1,C_0, \bigg ( \frac{4C_1}{\varepsilon \delta _0}\bigg )^{1/\eta } , C_0(2C_6)^{1/\eta }\bigg \} \end{aligned}$$

then

$$\begin{aligned} C_5 \int _M |A| |{\hat{A}}| f_{\sigma ,k}^p\,d\mu _t \le \frac{1}{2} \int _M |h|^2 f_{\sigma ,k}^p\,d\mu _t. \end{aligned}$$

In this case

$$\begin{aligned} \frac{1}{2} \int _M |h|^2 f_{\sigma ,k}^p \, d\mu _t&\le C_5(1+p^\frac{1}{2}) \int _M f_{\sigma ,k}^p \frac{|\nabla A|^2}{|H|^2} \, d\mu _t +C_5 p^\frac{3}{2} \int _M f_{\sigma ,k}^{p-2} |\nabla f_\sigma |^2 \,d \mu _t. \end{aligned}$$

Multiplying this inequality through by \(4\sigma p\) and substituting back into (4.3) gives

$$\begin{aligned} \frac{d}{dt} \int _M f_{\sigma ,k}^{p} \,d\mu _t&\le - (p(p-1)-C_2p - 4 C_5 \sigma p^\frac{5}{2}) \int _M f_{\sigma , k}^{p-2} |\nabla f_\sigma |^2 \,d\mu _t \\&\quad - (c_0 p - 4 C_5\sigma p -4 C_5\sigma p^\frac{3}{2} ) \int _M f_{\sigma ,k}^{p} \frac{|\nabla A|^2}{|H|^{2}}\,d\mu _t\\&\quad + (C_3+k)^p \sigma \int _M |h|^2 \, d\mu _t +C_4 \sigma ^{-4/\eta } p \int _M f_{\sigma ,k}^{p} \,d\mu _t \\&\quad + C_4 \sigma ^{-4/\eta } \mu _t(M). \end{aligned}$$

Now we insert the assumption \(\sigma \le \ell _0 p^{-\frac{1}{2}}\) and thus obtain

$$\begin{aligned} \frac{d}{dt} \int _M f_{\sigma ,k}^{p} \,d\mu _t&\le - (p(p-1)-C_2p -4 C_5 \ell _0 p^2) \int _M f_{\sigma , k}^{p-2} |\nabla f_\sigma |^2 \,d\mu _t \\&\quad - (c_0 p - 4 C_5\ell _0 p^\frac{1}{2} -4 C_5 \ell _0 p ) \int _M f_{\sigma ,k}^{p} \frac{|\nabla A|^2}{|H|^{2}}\,d\mu _t\\&\quad + (C_3+k)^p \sigma \int _M |h|^2 \, d\mu _t +C_4 \sigma ^{-4/\eta } p \int _M f_{\sigma ,k}^{p} \,d\mu _t \\&\quad + C_4 \sigma ^{-4/\eta } \mu _t(M). \end{aligned}$$

Decreasing \(\ell _0\) so that

$$\begin{aligned} \ell _0 \le \min \bigg \{\eta , \frac{c_0}{8C_5}, \frac{1}{8C_5} \bigg \} \end{aligned}$$

now gives

$$\begin{aligned} \frac{d}{dt} \int _M f_{\sigma ,k}^{p} \,d\mu _t&\le - (p^2/2 -p-C_2p ) \int _M f_{\sigma , k}^{p-2} |\nabla f_\sigma |^2 \,d\mu _t \\&\quad - (c_0 p /2 - 4 C_5\ell _0 p^\frac{1}{2} ) \int _M f_{\sigma ,k}^{p} \frac{|\nabla A|^2}{|H|^{2}}\,d\mu _t\\&\quad + (C_3+k)^p \sigma \int _M |h|^2 \, d\mu _t +C_4 \sigma ^{-4/\eta } p \int _M f_{\sigma ,k}^{p} \,d\mu _t \\&\quad + C_4 \sigma ^{-4/\eta } \mu _t(M). \end{aligned}$$

We can now take \(p_0\) large depending only on \(c_0\), \(C_2\) and \(C_5\) to ensure that for \(p \ge p_0\) both bracketed coefficients are nonnegative, so that the inequality

$$\begin{aligned} \frac{d}{dt} \int _M f_{\sigma ,k}^{p} \,d\mu _t&\le (C_3+k)^p \sigma \int _M |h|^2 \, d\mu _t +C_4 \sigma ^{-4/\eta } p \int _M f_{\sigma ,k}^{p} \,d\mu _t \\&\quad + C_4 \sigma ^{-4/\eta } \mu _t(M) \end{aligned}$$

holds for almost every \(t \in [0,T)\).
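The bookkeeping for \(\ell _0\) and \(p_0\) can be illustrated with a short numerical sketch. It evaluates the two coefficients of the gradient integrals as they appear directly after the Poincaré inequality is substituted into (4.3), and checks that they are nonnegative once \(\sigma \le \ell _0 p^{-\frac{1}{2}}\) and p is large. The numerical values used for \(c_0\), \(C_2\), \(C_5\) and \(\eta \) in the usage note are arbitrary stand-ins, and the explicit threshold in `choose_p0` is one admissible choice assumed for this example rather than a value taken from the proof.

```python
import math


def choose_ell0(eta, c0, C5):
    """The smallness condition imposed on ell_0 in the proof."""
    return min(eta, c0 / (8.0 * C5), 1.0 / (8.0 * C5))


def choose_p0(C2):
    """Hypothetical explicit threshold: p >= 4 and p^2/2 - p - C2*p >= 0."""
    return max(4.0, 2.0 * (1.0 + C2))


def gradient_coefficients(p, sigma, c0, C2, C5):
    """Coefficients multiplying the two good gradient integrals.

    grad_f multiplies the integral of f^{p-2} |grad f_sigma|^2 and
    grad_A multiplies the integral of f^p |grad A|^2 / |H|^2; both enter
    the evolution inequality with a minus sign.
    """
    grad_f = p * (p - 1.0) - C2 * p - 4.0 * C5 * sigma * p ** 2.5
    grad_A = c0 * p - 4.0 * C5 * sigma * p - 4.0 * C5 * sigma * p ** 1.5
    return grad_f, grad_A


def coefficients_are_good(p, eta, c0, C2, C5):
    """Check both signs at the largest admissible sigma = ell_0 / sqrt(p)."""
    sigma = choose_ell0(eta, c0, C5) / math.sqrt(p)
    grad_f, grad_A = gradient_coefficients(p, sigma, c0, C2, C5)
    return grad_f >= 0.0 and grad_A >= 0.0
```

With the stand-in values `eta=0.1`, `c0=0.01`, `C2=7.0`, `C5=50.0`, the call `coefficients_are_good(p, 0.1, 0.01, 7.0, 50.0)` can be evaluated for any `p` at least `choose_p0(7.0)`. Smaller values of \(\sigma \) only increase both coefficients, so testing the endpoint \(\sigma = \ell _0 p^{-\frac{1}{2}}\) is enough.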

Taking \(k_0\) a bit larger depending on n and \(C_3\), using \(k \ge k_0\) we can bound

$$\begin{aligned} \frac{d}{dt} \int _M f_{\sigma ,k}^{p} \,d\mu _t&\le 2^p k^p \sigma \int _M |H|^2 \, d\mu _t +C_4 \sigma ^{-4/\eta } p \int _M f_{\sigma ,k}^{p} \,d\mu _t \\&\quad + C_4 \sigma ^{-4/\eta } \mu _t(M). \end{aligned}$$

Since

$$\begin{aligned} \frac{d}{dt} \int _M 2^p k^p \sigma \, d\mu _t = - 2^p k^p \sigma \int _M |H|^2 \,d\mu _t \end{aligned}$$

this implies

$$\begin{aligned} \frac{d}{dt} \int _M f_{\sigma ,k}^{p} + 2^p k^p \sigma \,d\mu _t&\le C_4 \sigma ^{-4/\eta } p \int _M f_{\sigma ,k}^{p} \,d\mu _t + C_4 \sigma ^{-4/\eta } \mu _t(M)\\&= C_4 \sigma ^{-4/\eta } p \int _M f_{\sigma ,k}^{p} + p^{-1} \,d\mu _t . \end{aligned}$$

Hence the function

$$\begin{aligned} \varphi (t):= \int _M f_{\sigma ,k}^p +2^pk^p\sigma + p^{-1} \,d\mu _t \end{aligned}$$

satisfies

$$\begin{aligned} \varphi '(t) \le C_4 \sigma ^{-4/\eta } p \varphi (t) \end{aligned}$$

for almost every \(t \in [0,T)\). Since \(\varphi \) is Lipschitz continuous in time it follows that

$$\begin{aligned} \varphi (t) \le \varphi (0) \exp ( C_4 \sigma ^{-4/\eta } p t). \end{aligned}$$

In particular, \(\varphi \) can be bounded from above in terms of its value at the initial time, and the constants \(C_4\), \(\eta \), \(\sigma \), p and T. Recall that \(C_4\) depends only on n, \(\eta \), \(\Lambda \) and L. Also,

$$\begin{aligned} f_{\sigma , k}^{p} \le C_0^p |H|^{\sigma p}, \end{aligned}$$

so \(\varphi (0)\) can be bounded purely in terms of n, \(\Lambda \), \(\sigma \), p, L and \(\mu _0(M)\). This completes the proof. \(\square \)
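The Young's inequality step for \(|H|^{-\eta /2}\) in the proof above is purely scalar, so it can be checked on sample values. The sketch below compares \(|H|^{-\eta /2}\) with the sharp right-hand side and with the simplified bound \(\sigma + \sigma ^{-4/\eta }|H|^{-2}\), using \(s = \sigma ^{(4-\eta )/4}\). The function names and the relative tolerance are assumptions made for this illustration.

```python
def young_rhs(H, eta, s):
    """(4 - eta)/4 * s^{4/(4-eta)} + (eta/4) * s^{-4/eta} * H^{-2}."""
    return ((4.0 - eta) / 4.0) * s ** (4.0 / (4.0 - eta)) \
        + (eta / 4.0) * s ** (-4.0 / eta) * H ** -2.0


def simplified_rhs(H, eta, sigma):
    """sigma + sigma^{-4/eta} * H^{-2}."""
    return sigma + sigma ** (-4.0 / eta) * H ** -2.0


def young_chain_holds(H, eta, sigma, rel_tol=1e-12):
    """Check |H|^{-eta/2} <= Young bound <= simplified bound."""
    s = sigma ** ((4.0 - eta) / 4.0)  # the choice of s made in the proof
    left = H ** (-eta / 2.0)
    middle = young_rhs(H, eta, s)
    right = simplified_rhs(H, eta, sigma)
    return left <= middle * (1.0 + rel_tol) and middle <= right * (1.0 + rel_tol)
```

Here `H` plays the role of a positive value of \(|H|\), and `eta` and `sigma` are taken in (0, 1) as in the text. The second inequality in the chain uses only \(\sigma < 1\).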

4.2 The supremum estimate

Combining the \(L^p\)-estimates just established with the Michael–Simon Sobolev inequality [19], we obtain the following iteration inequality. The proof is very similar to that of Theorem 5.1 in [14]. However, we need to make some modifications, since our \(L^p\) estimate only holds for \(k \ge k_0\) (whereas the analogous estimate in Huisken’s work holds for all \(k \ge 0\)).

Proposition 4.3

There are positive constants \(p_1 \ge p_0\) and \(\ell _1 \le \ell _0\) depending on n, \(\varepsilon _0\), \(\eta \), \(\varepsilon \) and \(\Lambda \), and a positive constant \(k_1 \ge k_0\) depending on n, \(\varepsilon _0\), \(\eta \), \(\varepsilon \), \(\Lambda \) and L, with the following property. Suppose \(p \ge p_1\) and \(\sigma \le \ell _1 p^{-\frac{1}{2}}\) and set

$$\begin{aligned} A(k) := \int _0^T \int _{{{\,\textrm{supp}\,}}(f_{\sigma ,k}(\cdot ,t))} \,d\mu _t dt. \end{aligned}$$

For every \(h>k\ge k_1\) we have

$$\begin{aligned} A(h) \le \frac{C}{(h-k)^p} A(k)^{\gamma }, \end{aligned}$$

where \(\gamma >1\) depends on n and \(C = C(n, \varepsilon _0, \eta , \varepsilon , \Lambda , L, \mu _0(M), T, \sigma , p)\).

Proof

Suppose \(k \ge k_0\) and define

$$\begin{aligned} A(k,t) := {{\,\textrm{supp}\,}}(f_{\sigma ,k}(\cdot ,t)) \end{aligned}$$

for each \(t \in [0,T)\). We recall from the proof of Proposition 4.2 that for \(k \ge k_0\) there holds

$$\begin{aligned}&\frac{d}{dt} \int _M f_{\sigma ,k}^p \,d\mu _t\\&\quad \le - p(p-1) \int _M f_{\sigma ,k}^{p-2}|\nabla f_\sigma |^2\,d\mu _t + \sigma p\int _M |h|^2 f_\sigma f_{\sigma ,k}^{p-1} \,d\mu _t + C_3 \sigma p \int _M |h|^2 f_{\sigma ,k}^{p-1}\, d\mu _t \\&\qquad - c_0 p\int _M f_\sigma f_{\sigma ,k}^{p-1} \frac{|\nabla A|^2}{|H|^2} \,d\mu _t + C_2 p\int _M f_{\sigma ,k}^{p-1} \frac{|\nabla f_\sigma |^2}{f_\sigma }\,d\mu _t + C_4 \sigma ^{-4/\eta } p \int _M f_{\sigma ,k}^{p-1} \,d\mu _t \end{aligned}$$

for almost every \(t \in [0,T)\). For \(p \ge p_0\) we have

$$\begin{aligned} \frac{d}{dt}\int _M f_{\sigma ,k}^p \,d\mu _t + \frac{p^2}{4} \int _M f_{\sigma ,k}^{p-2} |\nabla f_\sigma |^2 \,d\mu _t&\le C_5 \int _{A(k,t)} |h|^2 f_\sigma ^p +|h|^2 f_{\sigma }^{p-1}+ f_{\sigma }^{p-1} \,d\mu _t, \end{aligned}$$

where \(C_5 = C_5(n,\eta ,\Lambda ,L,\sigma ,p)\). Recall that for \(k \ge k_0\) we have \(|H| \ge 1\) on \(A(k,t)\), so using the pinching and Young’s inequality we obtain

$$\begin{aligned} \frac{d}{dt}\int _M f_{\sigma ,k}^p \,d\mu _t + \frac{p^2}{4} \int _M f_{\sigma ,k}^{p-2} |\nabla f_\sigma |^2 \,d\mu _t&\le C_5 \int _{A(k,t)} |H|^2 f_\sigma ^p + |H|^2 \,d\mu _t, \end{aligned}$$

where we have made \(C_5\) larger as necessary. We can also assume \(k \ge 1\), in which case \(f_\sigma \ge 1\) on \(A(k,t)\) and there holds

$$\begin{aligned} \frac{d}{dt}\int _M f_{\sigma ,k}^p \,d\mu _t + \frac{p^2}{4} \int _M f_{\sigma ,k}^{p-2} |\nabla f_\sigma |^2 \,d\mu _t&\le C_5 \int _{A(k,t)} |H|^2 f_\sigma ^p \,d\mu _t. \end{aligned}$$

Set \(v_k := f_{\sigma ,k}^\frac{p}{2}\). Then from the last inequality we obtain

$$\begin{aligned} \frac{d}{dt}\int _M v_k^2 \,d\mu _t + \int _M |\nabla v_k|^2 \,d\mu _t&\le C_5 \int _{A(k,t)} |H|^2 f_\sigma ^p \,d\mu _t \end{aligned}$$

for almost every \(t \in [0,T)\). From the Michael–Simon Sobolev inequality and Hölder’s inequality, for any nonnegative function u on \(M_t\) there holds

$$\begin{aligned} \bigg ( \int _M u^{2q} \, d\mu _t \bigg )^\frac{1}{q} \le C(n) \int _M |\nabla u|^2\,d\mu _t + C(n) \bigg (\int _{{{\,\textrm{supp}\,}}(u)} |H|^n \,d\mu _t \bigg )^\frac{2}{n} \bigg ( \int _M u^{2q} \, d\mu _t \bigg )^\frac{1}{q} \end{aligned}$$

with \(q= \frac{n}{n-2}\) (recall we only consider \(n \ge 5\)). Setting \(u = v_k\) gives

$$\begin{aligned} \bigg ( \int _M v_k^{2q} \, d\mu _t \bigg )^\frac{1}{q} \le C(n) \int _M |\nabla v_k|^2\,d\mu _t + C(n) \bigg (\int _{A(k,t)} |H|^n \,d\mu _t \bigg )^\frac{2}{n} \bigg ( \int _M v_k^{2q} \, d\mu _t \bigg )^\frac{1}{q}. \end{aligned}$$

Observe that

$$\begin{aligned} \int _{A(k,t)} |H|^n \,d\mu _t&\le \frac{1}{k^p} \int _{A(k,t)} |H|^n f_{\sigma }^p \,d\mu _t\\&\quad = \frac{1}{k^p} \int _{A(k,t)} f_{\sigma '}^p \,d\mu _t \end{aligned}$$

where we have set \(\sigma ' := \sigma + n/p\). Hence

$$\begin{aligned} \int _{A(k,t)} |H|^n \,d\mu _t&\le \frac{1}{k^p} \int _{A(k,t)} (f_{\sigma '} - k_0 +k_0)^p \,d\mu _t\\&\quad \le \frac{2^{p-1}}{k^p} \int _M f_{\sigma ',k_0}^p + k_0^p \,d\mu _t. \end{aligned}$$
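The factor \(2^{p-1}\) in the last line comes from the convexity of \(s \mapsto s^p\) for \(p \ge 1\). As a brief check, for any \(a, b \ge 0\) we have

$$\begin{aligned} (a+b)^p = 2^p \bigg ( \frac{a+b}{2} \bigg )^p \le 2^p \cdot \frac{a^p + b^p}{2} = 2^{p-1} (a^p + b^p). \end{aligned}$$

Here we are assuming, as the notation suggests, that \(f_{\sigma ',k_0}\) denotes the positive part of \(f_{\sigma '} - k_0\). Under that assumption \(0 \le f_{\sigma '} \le f_{\sigma ',k_0} + k_0\) wherever \(f_{\sigma '}\) is nonnegative, and the inequality is applied with \(a = f_{\sigma ',k_0}\) and \(b = k_0\).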

We would like to use Proposition 4.2 to estimate the right-hand side. To this end we now take

$$\begin{aligned} p \ge \max \bigg \{p_0, \frac{4n^2}{\ell _0^2} \bigg \}, \qquad \sigma \le \frac{\ell _0}{2} p^{-\frac{1}{2}} \end{aligned}$$

so that

$$\begin{aligned} \sigma ' = \sigma + \frac{n}{p} \le \frac{\ell _0}{2}p^{-\frac{1}{2}} + \frac{n}{p^\frac{1}{2}} p^{-\frac{1}{2}} \le \ell _0 p^{-\frac{1}{2}}. \end{aligned}$$
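The last inequality above uses only the lower bound \(p \ge 4n^2/\ell _0^2\). Spelled out, the chain of implications is

$$\begin{aligned} p \ge \frac{4n^2}{\ell _0^2} \quad \Longrightarrow \quad p^{\frac{1}{2}} \ge \frac{2n}{\ell _0} \quad \Longrightarrow \quad \frac{n}{p^{\frac{1}{2}}} \le \frac{\ell _0}{2} \quad \Longrightarrow \quad \frac{n}{p} = \frac{n}{p^{\frac{1}{2}}}\, p^{-\frac{1}{2}} \le \frac{\ell _0}{2} p^{-\frac{1}{2}}, \end{aligned}$$

so that the pair \((p, \sigma ')\) satisfies the same smallness condition that is imposed on \((p,\sigma )\).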

Applying Proposition 4.2, we find

$$\begin{aligned} \int _{A(k,t)} |H|^n \,d\mu _t&\le C_6(n, \eta , \Lambda , L, \mu _0(M), T, k_0, \sigma , p) k^{-p}. \end{aligned}$$

Substituting this back into the Sobolev inequality gives

$$\begin{aligned} \bigg ( \int _M v_k^{2q} \, d\mu _t \bigg )^\frac{1}{q} \le C(n) \int _M |\nabla v_k|^2\,d\mu _t + C_6 k^{-2p/n} \bigg ( \int _M v_k^{2q} \, d\mu _t \bigg )^\frac{1}{q}, \end{aligned}$$

so by taking k sufficiently large (depending only on \(C_6\), p and n) we can ensure that

$$\begin{aligned} \frac{1}{2}\bigg ( \int _M v_k^{2q} \, d\mu _t \bigg )^\frac{1}{q} \le C(n) \int _M |\nabla v_k|^2\,d\mu _t . \end{aligned}$$

Inserting this inequality into

$$\begin{aligned} \frac{d}{dt}\int _M v_k^2 \,d\mu _t + \int _M |\nabla v_k|^2 \,d\mu _t&\le C_5 \int _{A(k,t)} |H|^2 f_\sigma ^p \,d\mu _t, \end{aligned}$$

we find

$$\begin{aligned} \frac{d}{dt}\int _M v_k^2 \,d\mu _t + \frac{1}{C_7}\bigg ( \int _M v_k^{2q} \, d\mu _t \bigg )^\frac{1}{q}&\le C_5 \int _{A(k,t)} |H|^2 f_\sigma ^p \,d\mu _t \end{aligned}$$

for almost every \(t \in [0,T)\) and some \(C_7 = C_7(n)\).

Observe that by choosing \(k_1 \ge k_0\) depending only on \(C_0\) and L, we can ensure that \(v_k\) vanishes identically on \(M_0\) for each \(k \ge k_1\). Integrating the last inequality in time then gives

$$\begin{aligned} \int _M v_k^2(\cdot ,\tau ) \,d\mu _\tau + \frac{1}{C_7} \int _0^\tau \bigg ( \int _M v_k^{2q} \, d\mu _t \bigg )^\frac{1}{q} \,dt&\le C_5 \int _0^\tau \int _{A(k,t)} |H|^2 f_\sigma ^p \,d\mu _t \,dt \end{aligned}$$

for each \(\tau \in [0,T)\). In particular,

$$\begin{aligned} \sup _{t \in [0,T)} \int _M v_k^2(\cdot , t) \, d\mu _t \le C_5 \int _0^T \int _{A(k,t)} |H|^2 f_\sigma ^p \,d\mu _t \,dt \end{aligned}$$

and

$$\begin{aligned} \int _0^T \bigg ( \int _M v_k^{2q} \, d\mu _t \bigg )^\frac{1}{q} \,dt \le C_5 C_7 \int _0^T \int _{A(k,t)} |H|^2 f_\sigma ^p \,d\mu _t \,dt, \end{aligned}$$

and by adding together these two inequalities we arrive at

$$\begin{aligned} \sup _{t \in [0,T)} \int _M v_k^2(\cdot , t) \, d\mu _t +\int _0^T \bigg ( \int _M v_k^{2q} \, d\mu _t \bigg )^\frac{1}{q} \,dt \le C_8 \int _0^T \int _{A(k,t)} |H|^2 f_\sigma ^p \,d\mu _t \,dt \end{aligned}$$

where \(C_8 := C_5(1+C_7)\). To exploit the second term on the left above, we use the following interpolation inequality for \(L^p\) spaces:

$$\begin{aligned} \Vert f\Vert _{q_0} \le \Vert f \Vert _r^{1-\theta } \Vert f\Vert ^\theta _q, \end{aligned}$$

where \(\theta \in (0,1)\) and \(\frac{1}{q_0} = \frac{\theta }{q} + \frac{1-\theta }{r}\). We apply this inequality with \(r =1\), \(q_0 = \frac{n+2}{n}\), and \(\theta = \frac{1}{q_0}\). This gives

$$\begin{aligned} \bigg (\int _{M} v_k^{2 q_0} \, d\mu _t\bigg )^\frac{1}{q_0}&\le \bigg (\int _{M} v_k^2 \,d\mu _t \bigg )^\frac{q_0-1}{q_0} \bigg (\int _M v_k^{2q} \, d\mu _t \bigg )^\frac{1}{q q_0}. \end{aligned}$$
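The choice of exponents can be verified directly. With \(r = 1\), \(q = \frac{n}{n-2}\) and \(\theta = \frac{1}{q_0} = \frac{n}{n+2}\), we compute

$$\begin{aligned} \frac{\theta }{q} + \frac{1-\theta }{r} = \frac{n}{n+2}\cdot \frac{n-2}{n} + \frac{2}{n+2} = \frac{n}{n+2} = \frac{1}{q_0}, \end{aligned}$$

as required. The interpolation inequality is applied to \(f = v_k^2\), so the exponents appearing on the right-hand side are \(1 - \theta = \frac{q_0 - 1}{q_0}\) and \(\frac{\theta }{q} = \frac{1}{q q_0}\).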

Raising both sides to the power \(q_0\), integrating in time and using Young’s inequality, we find

$$\begin{aligned} \int _0^T \int _{M} v_k^{2 q_0} \, d\mu _t dt&\le \bigg ( \sup _{t \in [0,T)} \int _M v_k^2 \,d\mu _t + \int _0^T \bigg (\int _M v_k^{2q}\,d\mu _t \bigg )^\frac{1}{q} \,dt \bigg )^{q_0}, \end{aligned}$$

and hence

$$\begin{aligned} \bigg (\int _0^T \int _{M} v_k^{2 q_0} \, d\mu _t \,dt\bigg )^\frac{1}{q_0}&\le \sup _{t \in [0,T)} \int _M v_k^2 \,d\mu _t + \int _0^T \bigg (\int _M v_k^{2q}\,d\mu _t \bigg )^\frac{1}{q} \,dt\\&\le C_8 \int _0^T \int _{A(k,t)} |H|^2 f_\sigma ^p \,d\mu _t \,dt. \end{aligned}$$

Next we use Hölder’s inequality to estimate

$$\begin{aligned} \int _0^T \int _{A(k,t)} |H|^2 f_\sigma ^p \,d\mu _t \,dt \le A(k)^{1-1/r} \bigg ( \int _0^T \int _{A(k,t)} (|H|^{2} f_{\sigma }^p)^r \,d\mu _t \,dt\bigg )^\frac{1}{r} \end{aligned}$$

where \(r >1\) is to be chosen later. In the end, r will depend only on \(q_0\) and hence only on n. Setting

$$\begin{aligned} \sigma '' := \sigma + \frac{2}{p}, \end{aligned}$$

we can write the last inequality as

$$\begin{aligned} \int _0^T \int _{A(k,t)} |H|^2 f_\sigma ^p \,d\mu _t dt&\le A(k)^{1-1/r} \bigg (\int _0^T \int _{A(k,t)} f_{\sigma ''}^{pr} \,d\mu _t dt\bigg )^\frac{1}{r}. \end{aligned}$$

To apply the \(L^p\)-estimate to the right-hand side, we need

$$\begin{aligned} pr \ge p_0, \qquad \sigma ''\le \ell _0 (pr)^{-\frac{1}{2}}. \end{aligned}$$

This can be achieved by taking \(p \ge p_1\) and \(\sigma \le \ell _1 p^{-\frac{1}{2}}\) where \(p_1\) is allowed to depend on \(p_0\), \(\ell _0\) and r, and \(\ell _1\) is allowed to depend on \(\ell _0\) and r. With this choice of parameters we can bound

$$\begin{aligned} \int _{A(k,t)} f_{\sigma ''}^{pr} \,d\mu _t \le 2^{pr-1}\int _M f_{\sigma '',k_0}^{pr} + k_0^{pr} \,d\mu _t \le C_9 \end{aligned}$$

for some \(C_9 = C_9(n, \eta , \Lambda , L, \mu _0(M), T, k_0, \sigma , p, r)\). Setting \(C_{10} := C_8 (C_9T)^\frac{1}{r}\), we now have

$$\begin{aligned} \bigg (\int _0^T \int _{M} v_k^{2 q_0} \, d\mu _t dt\bigg )^\frac{1}{q_0} \le C_{10} A(k)^{1-1/r}. \end{aligned}$$
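For concreteness, we record one admissible choice of \(p_1\) and \(\ell _1\) meeting the requirements \(pr \ge p_0\) and \(\sigma '' \le \ell _0 (pr)^{-\frac{1}{2}}\) imposed above. This is only a sketch of how the constants may be fixed, and other choices work equally well. Set

$$\begin{aligned} p_1 := \max \bigg \{ p_0, \frac{4n^2}{\ell _0^2}, \frac{16 r}{\ell _0^2} \bigg \}, \qquad \ell _1 := \frac{\ell _0}{2 r^{\frac{1}{2}}}. \end{aligned}$$

Then for \(p \ge p_1\) and \(\sigma \le \ell _1 p^{-\frac{1}{2}}\) we have \(pr \ge p \ge p_0\), and since \(p^{\frac{1}{2}} \ge 4 r^{\frac{1}{2}}/\ell _0\),

$$\begin{aligned} \sigma '' = \sigma + \frac{2}{p} \le \frac{\ell _0}{2 r^{\frac{1}{2}}} p^{-\frac{1}{2}} + \frac{2}{p^{\frac{1}{2}}} p^{-\frac{1}{2}} \le \frac{\ell _0}{2 r^{\frac{1}{2}}} p^{-\frac{1}{2}} + \frac{\ell _0}{2 r^{\frac{1}{2}}} p^{-\frac{1}{2}} = \ell _0 (pr)^{-\frac{1}{2}}. \end{aligned}$$

Since \(r > 1\) we also have \(\ell _1 \le \ell _0/2\), so the earlier conditions \(p \ge \max \{p_0, 4n^2/\ell _0^2\}\) and \(\sigma \le \frac{\ell _0}{2} p^{-\frac{1}{2}}\) remain in force.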

To finish, we observe that for \(h >k \ge k_1\) there holds

$$\begin{aligned} A(h)&\le \frac{1}{(h-k)^p} \int _0^T \int _M v_k^2 \,d\mu _t dt\\&\le \frac{1}{(h-k)^p} A(k)^{1-1/q_0} \bigg ( \int _0^T \int _M v_k^{2 q_0} \, d\mu _t dt \bigg )^\frac{1}{q_0}, \end{aligned}$$

and therefore,

$$\begin{aligned} A(h) \le \frac{C_{10}}{(h-k)^p} A(k)^{2- \frac{1}{q_0} -\frac{1}{r}}. \end{aligned}$$
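The two steps leading to this bound can be unpacked as follows, assuming (as the notation suggests) that \(f_{\sigma ,k}\) is the positive part of \(f_\sigma - k\). For \(h > k\) we have \(A(h,t) \subset A(k,t)\), and on \(A(h,t)\) there holds \(f_\sigma \ge h\), hence \(v_k^2 = f_{\sigma ,k}^p \ge (h-k)^p\). Integrating over \(A(h,t)\) and then in time gives

$$\begin{aligned} (h-k)^p A(h) \le \int _0^T \int _{A(h,t)} v_k^2 \,d\mu _t dt \le \int _0^T \int _M v_k^2 \,d\mu _t dt. \end{aligned}$$

Since \(v_k\) is supported in \(A(k,t)\), Hölder’s inequality with exponents \(q_0\) and \(\frac{q_0}{q_0-1}\) over the spacetime support produces the factor \(A(k)^{1-1/q_0}\). Combining this with the previously established bound by \(C_{10} A(k)^{1-1/r}\), the powers of \(A(k)\) add:

$$\begin{aligned} A(k)^{1-\frac{1}{q_0}} \cdot C_{10} A(k)^{1-\frac{1}{r}} = C_{10} A(k)^{2 - \frac{1}{q_0} - \frac{1}{r}}. \end{aligned}$$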

Since \(q_0 > 1\), we can choose r depending only on \(q_0\) to ensure that \(2 - 1/q_0 -1/r >1\). \(\square \)
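To make the choice of r explicit, recall that \(\frac{1}{q_0} = \frac{n}{n+2}\). Then

$$\begin{aligned} 2 - \frac{1}{q_0} - \frac{1}{r}> 1 \quad \Longleftrightarrow \quad \frac{1}{r} < 1 - \frac{n}{n+2} = \frac{2}{n+2} \quad \Longleftrightarrow \quad r > \frac{n+2}{2}. \end{aligned}$$

As an illustration (the specific value is our choice and is not needed elsewhere), taking \(r = n+2\) yields

$$\begin{aligned} \gamma = 2 - \frac{n}{n+2} - \frac{1}{n+2} = 1 + \frac{1}{n+2}, \end{aligned}$$

which is greater than one and depends only on n.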

We may now appeal to Stampacchia’s lemma (see for example Lemma B.1 in [16]) to obtain the following supremum estimate.

Corollary 4.4

There is a constant \(k_2 = k_2 (n,\varepsilon _0, \eta , \varepsilon , \Lambda , L, \mu _0(M), T)\) such that

$$\begin{aligned} f_{\sigma _0, k_2} \equiv 0 \end{aligned}$$

on \(M\times [0,T)\), where \(\sigma _0 := \ell _1 p_1^{-\frac{1}{2}}\) depends only on n, \(\varepsilon _0\), \(\eta \), \(\varepsilon \) and \(\Lambda \).
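For orientation, we sketch how the corollary follows from Proposition 4.3. The version of Stampacchia’s lemma we have in mind is the standard one, recalled here from the general literature rather than quoted from [16], so the reader should consult that reference for the precise formulation. Suppose \(\varphi : [k_1, \infty ) \rightarrow [0,\infty )\) is nonincreasing and satisfies \(\varphi (h) \le C (h-k)^{-\alpha } \varphi (k)^\gamma \) for all \(h> k \ge k_1\), where \(C, \alpha > 0\) and \(\gamma > 1\). Then \(\varphi (k_1 + d) = 0\), where

$$\begin{aligned} d^\alpha = C \varphi (k_1)^{\gamma - 1}\, 2^{\frac{\alpha \gamma }{\gamma - 1}}. \end{aligned}$$

We apply this with \(\varphi = A\), \(\alpha = p = p_1\) and \(\sigma = \sigma _0\). The function A is nonincreasing in k, and since area is nonincreasing under mean curvature flow,

$$\begin{aligned} A(k_1) \le \int _0^T \mu _t(M) \,dt \le T \mu _0(M). \end{aligned}$$

Thus \(A(k_2) = 0\) for \(k_2 := k_1 + d\), and d is bounded in terms of the constants listed in the corollary. By continuity, \(f_{\sigma _0,k_2}\) then vanishes identically.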

Recall that \(\eta \) depends only on n and \(\varepsilon _0\), and we chose \(\Lambda \) depending only on n, \(\varepsilon _0\) and \(\varepsilon \). Therefore, we have an estimate of the form

$$\begin{aligned} \frac{\lambda _1 + \varepsilon w}{|H|} \ge - C|H|^{-\sigma _0} -\Lambda \frac{|{\hat{A}}|^2}{|H|^2} \end{aligned}$$

on \(M\times [0,T)\), where \(C = C(n,\varepsilon _0, \varepsilon , L, \mu _0(M), T)\). Inserting the codimension estimate, we finally obtain

$$\begin{aligned} \frac{\lambda _1 + \varepsilon w}{|H|} \ge - C|H|^{-\sigma _0} -C|H|^{-2\eta }, \end{aligned}$$
(4.4)

where C has the same dependencies as before. From here, since \(\varepsilon \) can be made arbitrarily small, an application of Young’s inequality to the two lower-order terms on the right-hand side gives the convexity estimate of Theorem 1.1 (note that T can be bounded in terms of \(M_0\) by applying the maximum principle to the evolution equation of W).

5 Singularity formation

In the study of parabolic evolution equations it is natural to distinguish between singularities which form at different rates. For a solution of mean curvature flow \(F:M\times [0,T) \rightarrow {\mathbb {R}}^{n+k}\) where T is the maximal time, we say that a type I singularity forms as \(t \rightarrow T\) if there is a positive constant C such that

$$\begin{aligned} \max _{M_t} |A|^2 \le \frac{C}{T-t}. \end{aligned}$$

Note that this is the blow-up rate for solutions which shrink homothetically (such as shrinking spheres and cylinders). If on the other hand

$$\begin{aligned} \limsup _{t \nearrow T} \max _{M_t} \, (T-t)|A|^2 =\infty \end{aligned}$$

then the singularity forming at time T is said to be of type II.
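As a concrete instance of the type I rate, consider a round sphere of radius \(\rho (t)\) in some \({\mathbb {R}}^{n+1} \subset {\mathbb {R}}^{n+k}\). Its mean curvature has length \(n/\rho \), so the flow reduces to an ordinary differential equation for the radius:

$$\begin{aligned} \frac{d\rho }{dt} = -\frac{n}{\rho }, \qquad \rho (t) = \sqrt{2n(T-t)}, \qquad |A|^2 = \frac{n}{\rho ^2} = \frac{1}{2(T-t)}. \end{aligned}$$

The same computation for a cylinder \({\mathbb {R}}^m \times S^{n-m}\) gives \(\rho (t) = \sqrt{2(n-m)(T-t)}\) and \(|A|^2 = \frac{n-m}{\rho ^2} = \frac{1}{2(T-t)}\). In both cases the type I bound holds with \(C = \frac{1}{2}\).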

In his thesis [2], Baker showed that the only homothetically shrinking solutions satisfying the quadratic pinching condition are shrinking spheres and cylinders (the analogous result for mean convex solutions of codimension one was proven earlier by Huisken [15]). Using this result, Baker was able to show that blow-ups at type I singularities are homothetically shrinking spheres and cylinders, provided there is a fixed point \(x \in M\) such that \(|A|(x,t) \rightarrow \infty \) as \(t \rightarrow T\). He referred to such singularities as being ‘special’. It is not clear that all type I singularities are special in higher codimensions (in codimension one, this was proven by Stone for embedded flows [24], but the arguments do not generalise). However, we can use Theorem 1.2 to show that blow-ups at a general type I singularity are homothetically shrinking spheres or cylinders.

Corollary 5.1

Let \(F:M\times [0,T) \rightarrow {\mathbb {R}}^{n+k}\) be a maximal solution of mean curvature flow which is quadratically pinched with \(c<c_n\). Suppose a type I singularity forms at time T. Then there exists a sequence of rescalings of F that subconverges smoothly to a self-similarly shrinking cylinder solution.

Proof

Let \((x_j,t_j)\) be such that

$$\begin{aligned} |H|(x_j,t_j) = \max _{(x,t) \in M \times [0,t_j]} |H|(x,t). \end{aligned}$$

Let \(L_j := |H|(x_j,t_j)\) and consider the rescaled flows

$$\begin{aligned} F_j(x,t) := L_j(F(x,L_j^{-2}t +t_j) - F(x_j,t_j)), \qquad t\in [-L_j^2t_j, 0]. \end{aligned}$$

Writing \(A_j\) and \(H_j\) for the second fundamental form and mean curvature vector of \(F_j\), we have the global curvature bound \(|H_j|^2\le 1\), and hence \(|A_j|^2 \le c\). It is well known that for a compact solution of mean curvature flow, a global upper bound for |A| implies bounds on all of the higher derivatives of A. This follows from the estimates in [9] in the codimension-one case, and similar arguments work in higher codimensions (the details can be found in Sect. 4.3 of [2]). Standard compactness theorems therefore imply that there is a smooth solution

$$\begin{aligned} {\bar{F}} : {\bar{M}} \times (-\infty , 0] \rightarrow {\mathbb {R}}^{n+k} \end{aligned}$$

such that the sequence \(F_j\) subconverges to \({\bar{F}}\) in the local smooth sense. This follows for example from Hamilton’s compactness theorem [10], as is illustrated in Sect. 6.1 of [2]. Using Theorem 1.2, we conclude that \({\bar{M}}_t\) is convex and codimension one, and the type I assumption implies there is a \(C<\infty \) such that

$$\begin{aligned} |{\bar{H}}| \le \frac{C}{\sqrt{-t}} \end{aligned}$$

for all \(t <0\). We may therefore apply [18, Theorem 1.1] to conclude that \({\bar{M}}_t\) is a homothetically shrinking \({\mathbb {R}}^m \times S^{n-m}\). \(\square \)

We now turn to type II singularities. In [12] Huisken and Sinestrari used their convexity estimate to show that at a type II singularity, appropriate rescalings about the maximum of the curvature converge to a convex translating solution. Our convexity estimate can be used to generalise their result to higher codimensions.

Corollary 5.2

Let \(F:M\times [0,T) \rightarrow {\mathbb {R}}^{n+k}\) be a maximal solution of mean curvature flow which is quadratically pinched with \(c<c_n\). Suppose a type II singularity forms at time T. Then there exists a sequence of rescalings of F that subconverges smoothly to a codimension-one limiting flow which is convex, noncollapsed and moves by translation.

Proof of Corollary 5.2

Consider a sequence of times \({\tilde{t}}_j \rightarrow T\) and let \((x_j,t_j)\) be such that

$$\begin{aligned} ({\tilde{t}}_j - t_j) |H|^2(x_j,t_j) = \max _{M \times [0,{\tilde{t}}_j]} ({\tilde{t}}_j - t) |H|^2(x,t). \end{aligned}$$

Then we have

$$\begin{aligned} |H|^2(x_j, t_j) = \max _{M} |H|^2(x,t_j). \end{aligned}$$

By the type II assumption, for each \(K > 0\) there is a point \((y,\tau ) \in M\times [0,T)\) such that

$$\begin{aligned} (T-\tau ) |H|^2(y,\tau ) \ge K. \end{aligned}$$

If j is large enough so that \({\tilde{t}}_j > \tau \) then we have

$$\begin{aligned} ({\tilde{t}}_j - t_j) |H|^2(x_j,t_j) \ge ({\tilde{t}}_j - \tau )|H|^2(y,\tau ) \ge K - (T - {\tilde{t}}_j) |H|^2(y,\tau ). \end{aligned}$$

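Since \({\tilde{t}}_j \rightarrow T\), the last term on the right tends to zero as \(j \rightarrow \infty \). Hence if j is sufficiently large there holds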

$$\begin{aligned} ({\tilde{t}}_j - t_j)|H|^2(x_j, t_j) \ge K/2, \end{aligned}$$

and since K can be made arbitrarily large this shows that

$$\begin{aligned} ({\tilde{t}}_j - t_j)|H|^2(x_j, t_j) \rightarrow \infty . \end{aligned}$$

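It follows that \(t_j \rightarrow T\), since \({\tilde{t}}_j - t_j \le T\) and \(|H|\) is bounded on \(M \times [0,T']\) for every \(T' < T\).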

Let \(L_j^2 := |H|^2(x_j,t_j)\) and consider the sequence of rescaled solutions defined by

$$\begin{aligned} F_j(x,t) := L_j (F(x,L_j^{-2}t + t_j ) - F(x_j,t_j)), \qquad (x,t) \in M\times [- L_j^2t_j, L_j^2(T- t_j)), \end{aligned}$$

which satisfy the conditions

$$\begin{aligned} F_j(x_j,0) = 0, \qquad |H_j|^2(x_j,0) = 1, \end{aligned}$$

where \(H_j\) is the mean curvature vector of \(F_j\). More generally, for \(t \le L_j^2 ({\tilde{t}}_j - t_j)\) there holds

$$\begin{aligned} |H_j|^2(x,t)&= L_j^{-2} |H|^2(x, L_j^{-2} t + t_j) \\&=L_j^{-2} \frac{({\tilde{t}}_j - L_j^{-2} t -t_j ) |H|^2(x, L_j^{-2} t + t_j)}{{\tilde{t}}_j - L_j^{-2} t - t_j} \\&\le L_j^{-2} \frac{({\tilde{t}}_j - t_j) |H|^2(x_j, t_j) }{{\tilde{t}}_j - L_j^{-2} t - t_j}\\&= \frac{{\tilde{t}}_j - t_j}{{\tilde{t}}_j - t_j - L_j^{-2} t }. \end{aligned}$$

Therefore, for times \(t \le \delta L_j^2 ({\tilde{t}}_j - t_j)\) with \(\delta <1\) we have

$$\begin{aligned} \max _{M} |H_j|^2(\cdot ,t) \le \frac{1}{1-\delta }. \end{aligned}$$

Passing to a subsequence in j, we can guarantee that there is a sequence \(\tau _j \rightarrow \infty \) such that

$$\begin{aligned} \max _{M} |H_j|^2(\cdot , t) \le 1 + \frac{1}{j}, \qquad \forall \; t \in [-\tau _j, \tau _j]. \end{aligned}$$
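One way to arrange this, given as a sketch, is as follows. For \(t \le \delta L_j^2({\tilde{t}}_j - t_j)\) we have \(L_j^{-2} t \le \delta ({\tilde{t}}_j - t_j)\), and therefore

$$\begin{aligned} {\tilde{t}}_j - t_j - L_j^{-2} t \ge (1-\delta )({\tilde{t}}_j - t_j) \quad \Longrightarrow \quad \frac{{\tilde{t}}_j - t_j}{{\tilde{t}}_j - t_j - L_j^{-2} t} \le \frac{1}{1-\delta }. \end{aligned}$$

Choosing \(\delta = \delta _j := \frac{1}{j+1}\) gives

$$\begin{aligned} \frac{1}{1-\delta _j} = \frac{j+1}{j} = 1 + \frac{1}{j}. \end{aligned}$$

Since \(L_j^2({\tilde{t}}_j - t_j) \rightarrow \infty \), we may pass to a subsequence and relabel so that \(L_j^2({\tilde{t}}_j - t_j) \ge j(j+1)\), in which case the bound holds for all \(t \le j\) in the domain of \(F_j\). The left endpoint \(-L_j^2 t_j\) of that domain tends to \(-\infty \) because \(L_j \rightarrow \infty \) and \(t_j \rightarrow T > 0\), so we can take \(\tau _j := \min \{ j, L_j^2 t_j \}\).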

As in the proof of Corollary 5.1, we may extract a smooth limiting flow \({\bar{F}} :{\bar{M}} \times (-\infty ,\infty ) \rightarrow {\mathbb {R}}^{n+k}\) such that the \({\bar{M}}_t\) are convex hypersurfaces in a fixed \({\mathbb {R}}^{n+1} \subset {\mathbb {R}}^{n+k}\). Moreover, the scalar mean curvature \(|{\bar{H}}|\) is globally bounded from above by one, and this upper bound is attained at the spacetime origin. This is exactly the situation considered in Sect. 4 of [12]. The rigidity case of Hamilton’s Harnack inequality [11] implies that the family \({\bar{M}}_t\) moves by translation. \(\square \)