1 Introduction

We are interested in the local Hölder regularity of weak solutions to a class of parabolic equations involving a fractional p-Laplacian type operator:

$$\begin{aligned} \partial _t u + \mathscr {L} u=0\quad \text {weakly in}\> E_T, \end{aligned}$$
(1.1)

where \(E_T=E\times (0,T]\) for some open set \(E\subset \mathbb {R}^N\) and some \(T>0\), and the nonlocal operator \(\mathscr {L}\) is defined by

$$\begin{aligned} \mathscr {L}u(x,t)=\mathrm{P.V.}\int _{\mathbb {R}^N} K(x,y,t)\big |u(x,t) - u(y,t)\big |^{p-2} \big (u(x,t) - u(y,t)\big )\,\textrm{d}y, \end{aligned}$$
(1.2)

for some \(p>1\). Here, \(\mathrm{P.V.}\) denotes the principal value of the integral, and the kernel \(K:\mathbb {R}^N\times \mathbb {R}^N\times (0,T]\rightarrow [0,\infty )\) is a measurable function satisfying the following condition uniformly in t:

$$\begin{aligned} \frac{C_o}{|x-y|^{N+sp}}\le K(x,y,t)\equiv K(y,x,t)\le \frac{C_1}{|x-y|^{N+sp}}\quad \text {a.e.}\> x,\,y\in \mathbb {R}^N, \end{aligned}$$
(1.3)

for some positive \(C_o\), \(C_1\) and \(s\in (0,1)\).

Throughout this note, the parameters \(\{s, p, N, C_o, C_1\}\) are termed the data, and we use \(\varvec{\gamma }\) as a generic positive constant in various estimates that can be determined by the data only.

The formal definition of weak solution to (1.1)–(1.3) and notation can be found in Sect. 1.2. We proceed to present our main result as follows.

Theorem 1.1

Let u be a locally bounded, local, weak solution to (1.1)–(1.3) in \(E_T\) with \(p>1\). Then u is locally Hölder continuous in \(E_T\). More precisely, there exist constants \(\varvec{\gamma }>1\) and \(\beta \in (0,1)\) that can be determined a priori only in terms of the data, such that for any \(0<\varrho<R<\widetilde{R}\), there holds

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{(x_o,t_o)+Q_{\varrho }({\varvec{\omega }}^{2-p})} u\le {\varvec{\gamma }} {\varvec{\omega }} \left( \frac{\varrho }{R}\right) ^{\beta }, \end{aligned}$$

provided the cylinders \((x_o,t_o)+Q_{R}(\varvec{\omega }^{2-p})\subset (x_o,t_o)+Q_{\widetilde{R}}\) are included in \(E_T\), where

$$\begin{aligned} \varvec{\omega }=2\mathop {\mathrm{ess\,sup}}\limits _{(x_o,t_o)+Q_{\widetilde{R}}}|u| +\textrm{Tail}\big (u; (x_o,t_o)+Q_{\widetilde{R}}\big ). \end{aligned}$$

Remark 1.1

Theorem 1.1 has been formulated independently of any initial/boundary data. While local, the oscillation estimate bears global information via the tail of u; see (1.9). In particular, a solution is allowed to grow at infinity. If, on the other hand, u is globally bounded in \(\mathbb {R}^N\times (0,T)\), then \(\varvec{\omega }\) can be taken as the global bound. This occurs if, for instance, proper initial/boundary data are prescribed, cf. [4, 6]. In addition, if u is globally bounded in \(\mathbb {R}^N\times (-\infty ,T)\), then u is, a fortiori, a constant by the oscillation estimate.

Remark 1.2

Theorem 1.1 continues to hold for more general structures. For instance, one can consider the kernel satisfying

$$\begin{aligned} \frac{C_o \chi _{\{|x-y|\le 1\}}}{|x-y|^{N+sp}}\le K(x,y,t)\equiv K(y,x,t)\le \frac{C_1}{|x-y|^{N+sp}}\quad \text {a.e.}\> x,\,y\in \mathbb {R}^N. \end{aligned}$$

Also, proper lower order terms can be considered, cf. [8]. However, we will concentrate on the actual novelty and leave possible generalizations to the motivated reader.

Remark 1.3

Our approach only has a minimal requirement regarding fractional calculus; all relevant results are collected in Appendix A. However, DiBenedetto’s intrinsic scaling method is quite involved; monographs [9, 11, 25] provide a good account of the method. It would be instructive to first practice our arguments in the elliptic setting or in the linear parabolic setting.

1.1 Novelty and significance

The nonlocal elliptic operator \(\mathscr {L}\) as in (1.2) with a kernel like (1.3), especially when \(p=2\), has been a classical topic in probability, potential theory, harmonic analysis, etc. In addition, nonlocal partial differential equations arise from continuum mechanics and phase transition, from population dynamics, and from optimal control and game theory. We refer to [5, 14] for a source of motivations and applications.

Local regularity for the nonlocal elliptic operator with merely measurable kernels is well studied, cf. [8, 12, 13, 17, 22], to mention just a few. In [12, 13], localization techniques are developed in order to establish Hölder regularity and Harnack’s inequality for the elliptic operator. A logarithmic estimate plays a key role in [12, 13]. In contrast, [8] extends these results to functions in certain DeGiorgi classes and dispenses with the logarithmic estimate.

The parabolic nonlocal problem (1.1) has attracted growing interest recently; see [2, 3, 4, 6, 15, 16, 18, 19, 21, 23, 24, 26, 27], to mention just a few. As for local regularity, while the case \(p=2\) has been studied extensively, the case \(p\ne 2\) is largely open. Local regularity of supersolutions is studied in [2, 3]. Some local boundedness estimates are reported in [15, 23]. Meanwhile, an attempt is made in [15] to adapt the techniques of [12] and to show Hölder regularity for (1.1) with \(p>2\). However, we are unable to verify [15, (5.23)], which is based on a logarithmic estimate. When the kernel \(K(x,y,t)\) is exactly \(2|x-y|^{-N-sp}\) and \(p\ge 2\), explicit Hölder exponents are obtained in [4].

Our contribution lies in establishing Hölder regularity for the parabolic fractional p-Laplace type equation with merely measurable kernels for all \(p>1\). The approach is structural, in the sense that we dispense with any kind of comparison principle and do not rely on solving PDEs. More generally, we find that the Hölder regularity is in fact encoded in a family of energy estimates in Corollary 2.1, and tools like logarithmic estimates or exponential change of variables play no role in our arguments. This differs from the Moser approach, which keeps using the PDE with different testing functions. As such, the arguments are new even for \(p=2\) and hold the promise of a wider applicability, for instance, in Calculus of Variations.

Compared with the elliptic operator or the parabolic operator with \(p=2\), the local behavior of a solution to the parabolic p-Laplacian is markedly different: it has to be read in its own intrinsic geometry. This is the guiding idea in the local operator theory, cf. [7, 9, 10, 11, 25]. In terms of oscillation estimates, this idea leads to the construction of geometric sequences \(\{R_i\}\) and \(\{\varvec{\omega }_{i}\}\) connected by the intrinsic relation

$$\begin{aligned} \begin{array}{c} \displaystyle \mathop {\mathrm{ess\,osc}}\limits _{Q_{R_{i}}(\varvec{\omega }_{i}^{2-p})}u\le \varvec{\omega }_{i}. \end{array} \end{aligned}$$
(1.4)

The nonlocal theory developed here is no exception. However, the nonlocal character of (1.1) needs to be carefully handled in this intrinsic scaling scenario. A new component brought by the nonlocality of the operator is a proper control of the so-called tail—a nonlocal integral of the solution (see (1.9)). Precisely, we have

$$\begin{aligned} \textrm{Tail}\big (\big (u - \varvec{\mu }_i^{\pm }\big )_{\pm }; Q_{R_{i}}\big (\varvec{\omega }_{i}^{2-p}\big )\big ) \le \varvec{\gamma }\varvec{\omega }_i, \end{aligned}$$
(1.5)

where \( \varvec{\mu }_i^{\pm }\) denotes the supremum/infimum of u over \(Q_{R_{i}}(\varvec{\omega }_{i}^{2-p})\). In other words, the nonlocal tail is controlled by the local oscillation, if the intrinsic relation (1.4) is verified. The tail estimate (1.5) in turn allows us to reduce the oscillation in the next step, and so on. This induction procedure can be illustrated by

$$\begin{aligned} (1.4)_{i}^{i+1}\> \leftrightarrows \> (1.5)_{i}. \end{aligned}$$
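
To see how such an iteration produces a Hölder exponent, here is a minimal sketch under simplifying assumptions that are not taken from the text: suppose the sequences are exactly geometric, \(R_i=\lambda ^i R\) and \(\varvec{\omega }_i=(1-\eta )^i\varvec{\omega }\), for some hypothetical \(\lambda ,\eta \in (0,1)\) with \(\lambda <1-\eta \). Setting \(\beta =\ln (1-\eta )/\ln \lambda \in (0,1)\), so that \(\lambda ^\beta =1-\eta \), one finds for \(R_{i+1}<\varrho \le R_i\) that

$$\begin{aligned} \varvec{\omega }_i=(1-\eta )^{i}\varvec{\omega }=\frac{\lambda ^{(i+1)\beta }}{1-\eta }\,\varvec{\omega }<\frac{\varvec{\omega }}{1-\eta }\left( \frac{\varrho }{R}\right) ^{\beta }. \end{aligned}$$

Passing from this elementary bound to the intrinsic cylinders of Theorem 1.1 requires additional care, which this sketch does not attempt to reproduce.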

The local regularity theory for the nonlocal parabolic problem (1.1) with \(p\ne 2\) is still at its inception. We believe the techniques developed in this note are flexible enough and provide a handy toolkit that can be used to fruitfully attack more general nonlocal parabolic equations.

1.2 Definitions and notation

1.2.1 Function spaces

For \(p>1\) and \(s\in (0,1)\), we introduce the fractional Sobolev space \(W^{s,p}(\mathbb {R}^N)\) by

$$\begin{aligned} W^{s,p}(\mathbb {R}^N):=\bigg \{v \in L^p(\mathbb {R}^N):\, \int _{\mathbb {R}^N}\int _{\mathbb {R}^N}\frac{|v(x) - v(y)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y<\infty \bigg \}, \end{aligned}$$

which is endowed with the norm

$$\begin{aligned} \Vert v\Vert _{W^{s,p}(\mathbb {R}^N)}:=\left( \int _{\mathbb {R}^N}|v|^p\,\textrm{d}x\right) ^{\frac{1}{p}} + \left( \int _{\mathbb {R}^N}\int _{\mathbb {R}^N}\frac{|v(x)-v(y)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\right) ^{\frac{1}{p}}. \end{aligned}$$

Similarly, the fractional Sobolev space \(W^{s,p}(E)\) for a domain \(E\subset \mathbb {R}^N\) can be defined. Moreover, we denote

$$\begin{aligned} W^{s,p}_o(E):=\bigg \{v\in W^{s,p}(\mathbb {R}^N):\, v=0\>\>\text {a.e. in}\>\mathbb {R}^N\setminus E \bigg \}. \end{aligned}$$

These spaces admit imbedding into proper Lebesgue spaces; we collect some in Appendix A.

1.2.2 Notion of weak solution

A measurable function \(u:\,\mathbb {R}^N\times (0,T]\rightarrow \mathbb {R}\) satisfying

$$\begin{aligned} u\in C_{{\text {loc}}}\big (0,T;L^2_{{\text {loc}}}(E)\big )\cap L^p_{{\text {loc}}}\big (0,T; W^{s,p}_{{\text {loc}}}(E)\big ) \end{aligned}$$

is a local, weak sub(super)-solution to (1.1)–(1.3), if for every compact set \(\mathcal {K}\subset E\) and every sub-interval \([t_1,t_2]\subset (0,T]\), we have

$$\begin{aligned} \mathop {\mathrm{ess\,sup}}\limits _{t_1<t<t_2}\int _{\mathbb {R}^N}\frac{|u(x,t)|^{p-1}}{1+|x|^{N+sp}}\,\textrm{d}x<\infty \end{aligned}$$
(1.6)

and

$$\begin{aligned} \begin{aligned} \int _{\mathcal {K}}&u\varphi \,\textrm{d}x\bigg |_{t_1}^{t_2} -\int _{t_1}^{t_2}\int _{\mathcal {K}} u\partial _t\varphi \, \textrm{d}x\textrm{d}t+\int _{t_1}^{t_2} \mathscr {E}\big (u(\cdot , t), \varphi (\cdot , t)\big )\,\textrm{d}t\le (\ge )0 \end{aligned} \end{aligned}$$
(1.7)

where

$$\begin{aligned} \mathscr {E}:=\int _{\mathbb {R}^N}\int _{\mathbb {R}^N}K(x,y,t)\big |u(x,t) - u(y,t)\big |^{p-2} \big (u(x,t) - u(y,t)\big )\big (\varphi (x,t) - \varphi (y,t)\big )\,\textrm{d}y\textrm{d}x\end{aligned}$$

for all non-negative testing functions

$$\begin{aligned} \varphi \in W^{1,2}_{{\text {loc}}}\big (0,T;L^2(\mathcal {K})\big )\cap L^p_{{\text {loc}}}\big (0,T;W_o^{s,p}(\mathcal {K}) \big ). \end{aligned}$$
(1.8)

A function u that is both a local weak sub-solution and a local weak super-solution to (1.1)–(1.3) is a local weak solution.

Remark 1.4

As we are developing a local theory, the function space in (1.8) can be taken smaller. Namely, the notion of \(W_o^{s,p}(\mathcal {K})\) can be replaced by functions \(\varphi (\cdot , t)\in W^{s,p}(\mathbb {R}^N)\) with a compact support in \(\mathcal {K}\) for a.e. t.

Remark 1.5

To ensure the convergence of the global integral in (1.7), it suffices to replace the \(L^\infty \) norm in time appearing in the condition (1.6) by the \(L^1\) norm. However, the condition (1.6) as stated is already needed in deriving the energy estimate of Proposition 2.1.

1.2.3 Some notation

Throughout this note, we will use \(K_\varrho (x_o)\) to denote the ball of radius \(\varrho \) and center \(x_o\) in \(\mathbb {R}^N\), and the symbols

$$\begin{aligned} \left\{ \begin{aligned} (x_o,t_o)+Q_\varrho (\theta )&:=K_{\varrho }(x_o)\times (t_o-\theta \varrho ^{sp},t_o),\\ (x_o,t_o)+Q(R,S)&:=K_R(x_o)\times (t_o-S,t_o), \end{aligned}\right. \end{aligned}$$

to denote (backward) cylinders with the indicated positive parameters. When the context is unambiguous, we will omit the vertex \((x_o,t_o)\) from the symbols for simplicity. When \(\theta =1\), it is also omitted.
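
The time scale \(\theta \varrho ^{sp}\) can be motivated by a formal scaling computation. The following sketch is not part of the original argument, and the names \(v\) and \(\widetilde{K}\) are introduced here only for illustration. Given \(\varvec{\omega }>0\), set \(\theta =\varvec{\omega }^{2-p}\) and

$$\begin{aligned} v(x,t):=\frac{1}{\varvec{\omega }}\,u\big (x_o+\varrho x,\, t_o+\theta \varrho ^{sp}t\big ),\qquad \widetilde{K}(x,y,t):=\varrho ^{N+sp}K\big (x_o+\varrho x,\, x_o+\varrho y,\, t_o+\theta \varrho ^{sp}t\big ). \end{aligned}$$

Then \(\widetilde{K}\) satisfies (1.3) with the same constants \(C_o\) and \(C_1\), and a change of variables in (1.2) gives, at least formally,

$$\begin{aligned} \partial _t v+\widetilde{\mathscr {L}}v=\varrho ^{sp}\varvec{\omega }^{1-p}\big (\partial _t u+\mathscr {L}u\big )=0, \end{aligned}$$

where \(\widetilde{\mathscr {L}}\) is defined as in (1.2) with \(\widetilde{K}\) in place of \(K\), and the right-hand side is evaluated at the rescaled point. Under this map, \((x_o,t_o)+Q_\varrho (\varvec{\omega }^{2-p})\) corresponds to the unit cylinder \(Q_1\), and an oscillation of size \(\varvec{\omega }\) for u corresponds to an oscillation of size 1 for v.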

A nonlocal integral of u—termed the tail of u—inevitably appears in the theory, which we define as

$$\begin{aligned} \textrm{Tail}\big (u; Q(R,S)\big ):= \mathop {\mathrm{ess\,sup}}\limits _{t_o-S<t<t_o}\left( R^{sp}\int _{\mathbb {R}^N\setminus K_R(x_o)}\frac{|u(x,t)|^{p-1}}{|x-x_o|^{N+sp}}\,\textrm{d}x\right) ^{\frac{1}{p-1}}. \end{aligned}$$
(1.9)

For any \(Q(R,S)\subset E_T\), the finiteness of this tail is guaranteed by (1.6).
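
As a simple illustration of (1.9), consider the bounded case; this is a sketch in which \(M\) and \(|\mathbb {S}^{N-1}|\) (the surface measure of the unit sphere) are our own notation. If \(|u(\cdot ,t)|\le M\) a.e. in \(\mathbb {R}^N\setminus K_R(x_o)\) for a.e. \(t\in (t_o-S,t_o)\), then polar coordinates give

$$\begin{aligned} R^{sp}\int _{\mathbb {R}^N\setminus K_R(x_o)}\frac{M^{p-1}}{|x-x_o|^{N+sp}}\,\textrm{d}x =M^{p-1}\,|\mathbb {S}^{N-1}|\,R^{sp}\int _R^\infty r^{-1-sp}\,\textrm{d}r =\frac{|\mathbb {S}^{N-1}|}{sp}\,M^{p-1}, \end{aligned}$$

so that \(\textrm{Tail}\big (u; Q(R,S)\big )\le \big (|\mathbb {S}^{N-1}|/(sp)\big )^{\frac{1}{p-1}}M\), with a constant independent of \(R\). The factor \(R^{sp}\) in (1.9) is precisely what makes this bound scale invariant.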

2 Energy estimates

This section is devoted to energy estimates satisfied by local weak sub(super)-solutions to (1.1)–(1.3). We first introduce, for any \(k\in \mathbb {R}\), the truncated functions

$$\begin{aligned} (u-k)_+=\max \{u-k,0\}, \qquad (u-k)_-=\max \{-(u-k),0\}. \end{aligned}$$

In what follows, when we state “u is a sub(super)-solution...” and use \(``\pm "\) or \(``\mp "\) afterwards, we mean the sub-solution corresponds to the upper sign and the super-solution corresponds to the lower sign in the statement.
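
Two elementary facts about these truncations are used in the proof of Proposition 2.1; we record them as a short worked step, where the explicit value of \(c(p)\) is one admissible choice and not necessarily the one intended in the text. First, \(u-k=(u-k)_+-(u-k)_-\), so that whenever \(u(x,t)>k\ge u(y,t)\),

$$\begin{aligned} u(x,t)-u(y,t)=(u-k)_+(x,t)+(u-k)_-(y,t). \end{aligned}$$

Second, for \(a,\,b\ge 0\) and \(p>1\),

$$\begin{aligned} (a+b)^{p-1}\ge c(p)\big (a^{p-1}+b^{p-1}\big ),\qquad c(p)=\min \big \{1,\,2^{p-2}\big \}, \end{aligned}$$

which follows from the superadditivity of \(\tau \mapsto \tau ^{p-1}\) when \(p\ge 2\) and from its concavity when \(1<p<2\).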

Proposition 2.1

Let u be a local weak sub(super)-solution to (1.1)–(1.3) in \(E_T\). There exists a constant \(\varvec{\gamma } (C_o,C_1,p)>0\), such that for all cylinders \(Q(R,S) \subset E_T\), every \(k\in \mathbb {R}\), and every non-negative, piecewise smooth cutoff function \(\zeta (\cdot ,t)\) compactly supported in \( K_{R} \) for all \(t\in (t_o-S,t_o)\), there holds

$$\begin{aligned}&\int _{t_o-S}^{t_o}\int _{K_R}\int _{K_R}\min \big \{\zeta ^p(x,t),\zeta ^p(y,t)\big \} \frac{|w_\pm (x,t) - w_\pm (y,t)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\qquad +\iint _{Q(R,S)} \zeta ^p w_{\pm }(x,t)\,\textrm{d}x\textrm{d}t\left( \int _{ K_R} \frac{w^{p-1}_\mp (y,t)}{|x-y|^{N+sp}}\,\textrm{d}y\right) +\int _{K_R}\zeta ^p w^2_{\pm }(x,t)\,\textrm{d}x\bigg |_{t_o-S}^{t_o}\\&\quad \le \varvec{\gamma }\int _{t_o-S}^{t_o}\int _{K_R}\int _{K_R}\max \big \{w^p_{\pm }(x,t), w^p_{\pm }(y,t)\big \} \frac{|\zeta (x,t) - \zeta (y,t)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\qquad +\varvec{\gamma }\iint _{Q(R,S)} \zeta ^pw_{\pm }(x,t)\,\textrm{d}x\textrm{d}t\left( \mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} x\in {\text {supp}}\zeta (\cdot , t)\\ t\in (t_o-S,t_o) \end{array}} \int _{\mathbb {R}^N\setminus K_R}\frac{ w_{\pm }^{p-1}(y,t)}{|x-y|^{N+sp}}\,\textrm{d}y\right) \\&\qquad + \iint _{Q(R,S)} |\partial _t\zeta ^p|w_{\pm }^2(x,t)\,\textrm{d}x\textrm{d}t. \end{aligned}$$

Here, we have denoted \(w=u-k\) for simplicity.

Proof

We will only deal with the case of sub-solution as the other case is similar. Using \(\varphi =w_+\zeta ^p\) as a testing function in the weak formulation modulo a proper time mollification (cf. Appendix B), the last terms on the right/left-hand side of the energy estimate are rather standard. We only treat the integral resulting from the fractional diffusion part, which, due to the support of \(\zeta \) and symmetry of the integrand, can be split into two parts, that is,

$$\begin{aligned}&\int _{t_o-S}^{t_o}\int _{\mathbb {R}^N}\int _{\mathbb {R}^N} \big |u(x,t) - u(y,t)\big |^{p-2}\big (u(x,t) - u(y,t)\big )\big (\varphi (x,t) - \varphi (y,t)\big )\,\textrm{d}\mu \\&\quad =\int _{t_o-S}^{t_o}\int _{K_R}\int _{K_R} \big |u(x,t) - u(y,t)\big |^{p-2}\big (u(x,t) - u(y,t)\big )\big (\varphi (x,t) - \varphi (y,t)\big )\,\textrm{d}\mu \\&\qquad +2\int _{t_o-S}^{t_o}\int _{K_{R}}\int _{\mathbb {R}^N\setminus K_R} \big |u(x,t) - u(y,t)\big |^{p-2}\big (u(x,t) - u(y,t)\big ) \varphi (x,t) \,\textrm{d}\mu \\&\quad =:I_1+I_2, \end{aligned}$$

where we have used \(\textrm{d}\mu =K(x,y,t)\textrm{d}y\textrm{d}x\textrm{d}t\) for simplicity.

Let us manipulate the first integral, which is the leading term. To this end, we denote \(A_k=\big \{u(\cdot , t)>k\big \}\cap K_R\) for fixed \(t\in (t_o-S,t_o)\). Observe that if \(x\in A_k\) while \(y\in K_R{\setminus } A_k\), one obtains

$$\begin{aligned} \begin{aligned} \big |u(x,t)&- u(y,t)\big |^{p-2}\big (u(x,t) - u(y,t)\big )\big (\varphi (x,t) - \varphi (y,t)\big )\\&=\big (w_+(x,t)+w_-(y,t)\big )^{p-1} w_+(x,t)\zeta ^p(x,t)\\&\ge c(p) w_+^p(x,t) \zeta ^p(x,t)+c(p) w^{p-1} _-(y,t)w_+(x,t)\zeta ^p(x,t), \end{aligned} \end{aligned}$$
(2.1)

for some proper \(c=c(p)\). Whereas if \(x,y\in A_k\), we claim that

$$\begin{aligned} \begin{aligned} \big |u(x,t)&- u(y,t)\big |^{p-2}\big (u(x,t) - u(y,t)\big )\big (\varphi (x,t) - \varphi (y,t)\big )\\&\ge \tfrac{1}{2}\big |w_+(x,t)-w_+(y,t)\big |^{p} \max \big \{\zeta ^p(x,t), \zeta ^p(y,t)\big \}\\&\quad - \varvec{\gamma }(p)\max \big \{w_+^p(x,t), w_+^p(y,t)\big \}\big |\zeta (x,t)-\zeta (y,t)\big |^p. \end{aligned} \end{aligned}$$
(2.2)

To prove the claim, we may assume that \(u(x,t)\ge u(y,t)\) due to symmetry and write

$$\begin{aligned} \begin{aligned} \big |u(x,t)&- u(y,t)\big |^{p-2}\big (u(x,t) - u(y,t)\big )\big (\varphi (x,t) - \varphi (y,t)\big )\\&=\big (w_+(x,t)-w_+(y,t)\big )^{p-1} \big (w_+(x,t)\zeta ^p(x,t) - w_+(y,t)\zeta ^p(y,t)\big ). \end{aligned} \end{aligned}$$
(2.3)

If \(\zeta (x,t)\ge \zeta (y,t)\), the above display (2.3) is estimated below by

$$\begin{aligned} \big (w_+(x,t)-w_+(y,t)\big )^{p} \zeta ^p(x,t), \end{aligned}$$

and hence the claim follows. If, instead \(\zeta (x,t)<\zeta (y,t)\), then (2.3) can be written as

$$\begin{aligned} \big (w_+(x,t)-w_+(y,t)\big )^{p} \zeta ^p(y,t)-\big (w_+(x,t)-w_+(y,t)\big )^{p-1} w_+(x,t) \big (\zeta ^p(y,t) - \zeta ^p(x,t)\big ). \end{aligned}$$

To proceed, we need an elementary inequality:

$$\begin{aligned} a^p-b^p\le \varepsilon a^p + \frac{p^p}{\varepsilon ^{p-1}}(a-b)^p\quad \text {for}\> a\ge b\ge 0,\>\varepsilon >0. \end{aligned}$$
(2.4)

This simply follows from the mean value theorem and Young’s inequality:

$$\begin{aligned} a^p-b^p\le p a^{p-1}(a-b)\le \varepsilon a^p + \frac{p^p}{\varepsilon ^{p-1}}(a-b)^p. \end{aligned}$$
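
The inequality (2.4) can also be sanity-checked numerically. The following is a minimal sketch, not a proof: the function names, the sampling ranges and the seed are arbitrary choices made for illustration.

```python
import random


def gap(a, b, p, eps):
    """Right-hand side minus left-hand side of (2.4); expected to be >= 0."""
    lhs = a**p - b**p
    rhs = eps * a**p + p**p / eps ** (p - 1) * (a - b) ** p
    return rhs - lhs


def min_gap(trials=20000, seed=0):
    """Smallest gap over random samples with a >= b >= 0, eps > 0, p > 1."""
    rng = random.Random(seed)
    smallest = float("inf")
    for _ in range(trials):
        p = rng.uniform(1.01, 6.0)
        a = rng.uniform(0.0, 10.0)
        b = rng.uniform(0.0, a)  # enforces 0 <= b <= a
        eps = rng.uniform(1e-3, 5.0)
        smallest = min(smallest, gap(a, b, p, eps))
    return smallest
```

A non-negative value of `min_gap()` is consistent with (2.4) on the sampled range; it says nothing about parameters outside it.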

We may apply (2.4) with \(a=\zeta (y,t)\), \(b=\zeta (x,t)\) and

$$\begin{aligned} \varepsilon =\big (w_+(x,t)-w_+(y,t)\big )/2w_+(x,t) \end{aligned}$$

to obtain

$$\begin{aligned} \big (w_+(x,t)&-w_+(y,t)\big )^{p-1} w_+(x,t) \big (\zeta ^p(y,t) - \zeta ^p(x,t)\big )\\&\le \tfrac{1}{2}\big (w_+(x,t)-w_+(y,t)\big )^{p} \zeta ^p(y,t)+\varvec{\gamma }(p)w_+^p(x,t) \big (\zeta (y,t) - \zeta (x,t)\big )^p. \end{aligned}$$

Combining the last two estimates in (2.3), we obtain (2.2) in the case \(\zeta (x,t)<\zeta (y,t)\) also.

Employing (2.1) and (2.2) and properly adjusting c if necessary, the first integral \(I_1\) is estimated by

$$\begin{aligned} I_1&\ge c\int _{t_o-S}^{t_o}\int _{A_k}\int _{A_k} \big |w_+(x,t)-w_+(y,t)\big |^{p} \max \big \{\zeta ^p(x,t), \zeta ^p(y,t)\big \}\,\textrm{d}\mu \\&\quad +2c\int _{t_o-S}^{t_o}\int _{A_{k}}\int _{K_R\setminus A_k} \big |w_+(x,t)-w_+(y,t)\big |^{p} \zeta ^p(x,t) \,\textrm{d}\mu \\&\quad +2c\int _{t_o-S}^{t_o}\int _{A_{k}}\int _{K_R\setminus A_k} w_-^{p-1}(y,t) w_+(x,t)\zeta ^p(x,t) \,\textrm{d}\mu \\&\quad - \varvec{\gamma }(p)\int _{t_o-S}^{t_o}\int _{K_R}\int _{K_R}\max \big \{w_+^p(x,t), w_+^p(y,t)\big \}\big |\zeta (x,t)-\zeta (y,t)\big |^p\,\textrm{d}\mu \\&\ge c\int _{t_o-S}^{t_o}\int _{K_R}\int _{K_R} \big |w_+(x,t)-w_+(y,t)\big |^{p} \min \big \{\zeta ^p(x,t), \zeta ^p(y,t)\big \}\,\textrm{d}\mu \\&\quad +c\int _{t_o-S}^{t_o}\int _{K_R}\int _{K_R} w_-^{p-1}(y,t) w_+(x,t)\zeta ^p(x,t) \,\textrm{d}\mu \\&\quad - \varvec{\gamma }(p)\int _{t_o-S}^{t_o}\int _{K_R}\int _{K_R}\max \big \{w_+^p(x,t), w_+^p(y,t)\big \}\big |\zeta (x,t)-\zeta (y,t)\big |^p\,\textrm{d}\mu . \end{aligned}$$

Now let us treat the second integral \(I_2\), which yields the only nonlocal integral in the energy estimate. Indeed, we first estimate

$$\begin{aligned} -\big |u(x,t)&- u(y,t)\big |^{p-2}\big (u(x,t) - u(y,t)\big ) \big (u(x,t)-k\big )_+\\&\le \big (u(y,t)-u(x,t)\big )^{p-1}_+\big (u(x,t)-k\big )_+\\&\le \big (u(y,t)-k\big )^{p-1}_+\big (u(x,t)-k\big )_+. \end{aligned}$$

As a result, we may estimate by the condition (1.3) on the kernel K,

$$\begin{aligned} -I_2&\le 2\int _{t_o-S}^{t_o}\int _{K_{R}}\int _{\mathbb {R}^N\setminus K_R} w_+^{p-1}(y,t) w_+(x,t)\zeta ^p(x,t)\,\textrm{d}\mu \\&\le \varvec{\gamma } \iint _{Q(R,S)} \zeta ^pw_{+}(x,t)\,\textrm{d}x\textrm{d}t\left( \mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} x\in {\text {supp}}\zeta (\cdot , t)\\ t\in (t_o-S,t_o) \end{array}} \int _{\mathbb {R}^N\setminus K_R}\frac{ w_{+}^{p-1}(y,t)}{|x-y|^{N+sp}}\,\textrm{d}y\right) . \end{aligned}$$

Note that the finiteness of the above nonlocal integral is guaranteed by (1.6). This term will evolve into the tail term (1.9) in the forthcoming theory.

Finally, we can put all these estimates together and use the condition (1.3) on the kernel K to conclude. \(\square \)

The above energy estimate can be written in \(K_R\times (t_o- S, t)\) for any \(t\in (t_o- S, t_o)\). As usual, this will lead to an \(L^\infty \) estimate in the time variable on the left, due to the arbitrariness of t. Further, by choosing a proper cutoff function \(\zeta \), we derive the following two types of energy estimates from Proposition 2.1, which encode all the information needed to show Theorem 1.1.

Corollary 2.1

Let u be a local weak sub(super)-solution to (1.1)–(1.3) in \(E_T\). There exists a constant \(\varvec{\gamma } (C_o,C_1,p)>0\), such that for all cylinders \(Q(r,\tau )\subset Q(R,S) \subset E_T\), and every \(k\in \mathbb {R}\), there holds

$$\begin{aligned} \mathop {\mathrm{ess\,sup}}\limits _{t_o-\tau<t<t_o}\int _{K_r}&w^2_{\pm }(x,t)\,\textrm{d}x+ \int _{t_o-\tau }^{t_o}\int _{K_r}\int _{K_r} \frac{|w_\pm (x,t) - w_\pm (y,t)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\le \frac{\varvec{\gamma } R^{(1-s)p}}{(R-r)^p}\int _{t_o-S}^{t_o}\int _{K_R} w^p_{\pm }(x,t) \,\textrm{d}x\textrm{d}t\\&\quad +\frac{\varvec{\gamma } R^{N}}{(R-r)^{N+sp}}\iint _{Q(R,S)} w_{\pm }(x,t)\,\textrm{d}x\textrm{d}t\big [\textrm{Tail}\big (w_\pm ; Q(R,S)\big )\big ]^{p-1}\\&\quad + \frac{\varvec{\gamma }}{S-\tau }\iint _{Q(R,S)} w_{\pm }^2(x,t)\,\textrm{d}x\textrm{d}t\end{aligned}$$

and

$$\begin{aligned} \mathop {\mathrm{ess\,sup}}\limits _{t_o-S<t<t_o}\int _{K_r}&w^2_{\pm }(x,t)\,\textrm{d}x+ \int _{t_o-S}^{t_o}\int _{K_r}\int _{K_r} \frac{|w_\pm (x,t) - w_\pm (y,t)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\quad +\frac{1}{r^{N+sp}}\int _{t_o-S}^{t_o}\int _{K_r}\int _{K_r} w_{\pm }(x,t) w^{p-1}_\mp (y,t)\,\textrm{d}y\textrm{d}x\textrm{d}t\\&\le \int _{K_R} w^2_\pm (x,t_o-S)\,\textrm{d}x+\frac{\varvec{\gamma } R^{(1-s)p}}{(R-r)^p}\int _{t_o-S}^{t_o}\int _{K_R} w^p_{\pm }(x,t) \,\textrm{d}x\textrm{d}t\\&\quad +\frac{\varvec{\gamma } R^{N}}{(R-r)^{N+sp}}\iint _{Q(R,S)} w_{\pm }(x,t)\,\textrm{d}x\textrm{d}t\big [\textrm{Tail}\big (w_\pm ; Q(R,S)\big )\big ]^{p-1}. \end{aligned}$$

Here, we have denoted \(w=u-k\) for simplicity.

3 Preliminary tools

In this section, we collect the main modules of the proof of Hölder regularity. An important feature is that the tail term appears in these modules via an either-or form. This feature clarifies the role of the tail and greatly facilitates the delicate intrinsic scaling arguments to be unfolded in the next two sections. To streamline, we derive them from the energy estimates in Proposition 2.1. Nevertheless, it will be clear from their proofs that Corollary 2.1 actually suffices. We also stress that the arguments in this section are given in a unified fashion for all \(p>1\).

Throughout this section, let \(\mathcal {Q}:=K_R(x_o)\times (T_1,T_2]\) be a cylinder included in \(E_T\). We introduce numbers \(\varvec{\mu }^{\pm }\) and \(\varvec{\omega }\) satisfying

$$\begin{aligned} \varvec{\mu }^+\ge \mathop {\mathrm{ess\,sup}}\limits _{\mathcal {Q}}u, \quad \varvec{\mu }^-\le \mathop {\mathrm{ess\,inf}}\limits _{\mathcal {Q}} u, \quad \varvec{\omega }\ge \varvec{\mu }^+-\varvec{\mu }^-. \end{aligned}$$

The first result concerns a DeGiorgi type lemma. For simplicity, we omit the vertex \((x_o,t_o)\) from \(Q_\varrho (\theta )\).

Lemma 3.1

Let u be a locally bounded, local weak sub(super)-solution to (1.1)–(1.3) in \(E_T\). For some \( \delta ,\,\xi \in (0,1)\) set \(\theta =\delta (\xi \varvec{\omega })^{2-p}\) and assume \( Q_\varrho (\theta ) \subset \mathcal {Q}\). There exists a constant \(\nu \in (0,1)\) depending only on the data \(\{s, p, N, C_o, C_1\}\) and \(\delta \), such that if

$$\begin{aligned} \Big |\Big \{ \pm \big (\varvec{\mu }^{\pm }-u\big )\le \xi \varvec{\omega }\Big \}\cap Q_{\varrho }(\theta )\Big | \le \nu |Q_{\varrho }(\theta )|, \end{aligned}$$

then either

$$\begin{aligned} \left( \frac{\varrho }{R}\right) ^{\frac{sp}{p-1}} \textrm{Tail}\big ( \big (u - \varvec{\mu }^{\pm }\big )_{\pm }; \mathcal {Q}\big )>\xi \varvec{\omega }, \end{aligned}$$

or

$$\begin{aligned} \pm \big (\varvec{\mu }^{\pm }-u\big )\ge \tfrac{1}{2}\xi \varvec{\omega } \quad \text{ a.e. } \text{ in } Q_{\frac{1}{2}\varrho }(\theta ). \end{aligned}$$

Moreover, we have the dependence \(\nu \approx \delta ^q\) for some \( q>1\) depending on p and N.

Proof

It suffices to treat the case of a super-solution with \(\varvec{\mu }^-=0\). As a restatement, we will show that there exists \(\nu \), such that if

$$\begin{aligned} \left( \frac{\varrho }{R}\right) ^{\frac{sp}{p-1}} \textrm{Tail}(u_{-}; \mathcal {Q}) \le \xi \varvec{\omega } \end{aligned}$$

and if

$$\begin{aligned} \Big |\Big \{ u\le \xi \varvec{\omega }\Big \}\cap Q_{\varrho }(\theta )\Big | \le \nu |Q_{\varrho }(\theta )|, \end{aligned}$$

then

$$\begin{aligned} u \ge \tfrac{1}{2}\xi \varvec{\omega } \quad \text{ a.e. } \text{ in } Q_{\frac{1}{2}\varrho }(\theta ). \end{aligned}$$

We stress that \(\nu \) depends only on the data \(\{s, p, N, C_o, C_1\}\) and \(\delta \), and is independent of \(\xi \).

Upon a translation, we may assume \((x_o,t_o)=(0,0)\) and define for \(n\in \mathbb {N}\cup \{0\}\),

$$\begin{aligned} \left\{ \begin{array}{c} \displaystyle k_n= \frac{\xi \varvec{\omega }}{2} +\frac{\xi \varvec{\omega } }{2^{n+1}}, \\ \displaystyle \varrho _n=\frac{\varrho }{2}+\frac{\varrho }{2^{n+1}},\quad \tilde{\varrho }_n=\frac{\varrho _n+\varrho _{n+1}}{2},\\ \displaystyle \hat{\varrho }_n=\frac{3\varrho _n+ \varrho _{n+1}}{4},\quad \bar{\varrho }_n=\frac{\varrho _n+ 3\varrho _{n+1}}{4},\\ \displaystyle K_n=K_{\varrho _n}, \quad \widetilde{K}_n=K_{\tilde{\varrho }_n},\quad \widehat{K}_n=K_{{\hat{\varrho }}_n},\quad \overline{K}_n=K_{\bar{\varrho }_n},\\ \displaystyle Q_n=K_n\times \big (-\theta \varrho _n^{sp},0\big ),\quad \widetilde{Q}_n=\widetilde{K}_n\times \big (-\theta \tilde{\varrho }_n^{sp},0\big ),\\ \widehat{Q}_n=\widehat{K}_n\times \big (-\theta \hat{\varrho }_n^{sp},0\big ),\quad \overline{Q}_n=\overline{K}_n\times \big (-\theta \bar{\varrho }_n^{sp},0\big ). \end{array} \right. \end{aligned}$$

It is helpful to observe that

$$\begin{aligned} Q_{n+1}\subset \overline{Q}_n\subset \widetilde{Q}_n\subset \widehat{Q}_n\subset Q_n. \end{aligned}$$
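
The nesting of the radii, and the separation between \(\hat{\varrho }_n\) and \(\varrho _n\) that is used later for the nonlocal term, can be checked with a few lines of code. This is an illustrative sketch; the function names are ours. Since \(\theta r^{sp}\) is increasing in \(r\), nested radii also give nested time intervals.

```python
def radii(rho, n):
    """Return (rho_n, hat, tilde, bar, rho_{n+1}) from the definitions above."""
    r_n = rho / 2 + rho / 2 ** (n + 1)
    r_next = rho / 2 + rho / 2 ** (n + 2)
    hat = (3 * r_n + r_next) / 4
    tilde = (r_n + r_next) / 2
    bar = (r_n + 3 * r_next) / 4
    return r_n, hat, tilde, bar, r_next


def is_nested(rho, n):
    """True if rho_n > hat > tilde > bar > rho_{n+1}."""
    r = radii(rho, n)
    return all(r[i] > r[i + 1] for i in range(4))


def separation(rho, n):
    """The quantity 1 - hat/rho_n, bounded below by 2^{-(n+4)} in the proof."""
    r_n, hat, _, _, _ = radii(rho, n)
    return 1 - hat / r_n
```

For instance, `separation(1.0, 0)` equals `2 ** -4`, which is the case of equality in that lower bound.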

Introduce the cutoff function \(\zeta \) in \(Q_n\), vanishing outside \(\widehat{Q}_{n}\) and equal to the identity in \(\widetilde{Q}_{n}\), such that

$$\begin{aligned} |D\zeta |\le \frac{2^n}{\varrho }\quad \text { and }\quad |\partial _t \zeta |\le \frac{2^{psn}}{\theta \varrho ^{sp}}. \end{aligned}$$

Let us examine the energy estimate of Proposition 2.1 in this setting:

$$\begin{aligned} \mathop {\mathrm{ess\,sup}}\limits _{-\theta {\tilde{\varrho }}_n^{sp}<t<0}&\int _{\widetilde{K}_n} w^2_{-}(x,t)\,\textrm{d}x+ \int _{-\theta {\tilde{\varrho }}_n^{sp}}^{0}\int _{\widetilde{K}_n}\int _{\widetilde{K}_n} \frac{|w_-(x,t) - w_-(y,t)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\quad \le \varvec{\gamma }\int _{-\theta \varrho _n^{sp}}^{0}\int _{K_n}\int _{K_n}\max \big \{w^p_{-}(x,t), w^p_{-}(y,t)\big \} \frac{|\zeta (x,t) - \zeta (y,t)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\qquad +\varvec{\gamma }\iint _{Q_n} \zeta ^p w_{-}(x,t)\,\textrm{d}x\textrm{d}t\left( \mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} x\in \widehat{K}_n\\ t\in (-\theta \varrho _n^{sp},0) \end{array}} \int _{\mathbb {R}^N\setminus K_n}\frac{ w_{-}^{p-1}(y,t)}{|x-y|^{N+sp}}\,\textrm{d}y\right) \\&\qquad + \iint _{Q_n} |\partial _t\zeta ^p|w_{-}^2(x,t)\,\textrm{d}x\textrm{d}t. \end{aligned}$$

Recalling \(w_-=(u-k_n)_-\), we treat the three terms on the right-hand side of the energy estimate as follows. For the first term, we estimate

$$\begin{aligned} \int _{-\theta \varrho _n^{sp}}^{0}&\int _{K_n}\int _{K_n}\max \big \{w_{-}(x,t), w_{-}(y,t)\big \}^p \frac{|\zeta (x,t) - \zeta (y,t)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\le 2^{pn+1} \frac{(\xi \varvec{\omega })^p}{\varrho ^{sp}}\int _{-\theta \varrho _n^{sp}}^{0}\int _{K_n}\int _{K_n} \frac{ \chi _{\{u(x,t)<k_n\}} }{|x-y|^{N+(s-1)p}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\le \varvec{\gamma } 2^{pn} \frac{(\xi \varvec{\omega })^p}{\varrho ^{sp}} |A_n|, \end{aligned}$$

where we have defined \(A_n:=\{u<k_n\}\cap Q_n\).

For the second term, we observe that for \(|y|\ge \varrho _n\) and \(|x|\le {\hat{\varrho }}_n\), there holds

$$\begin{aligned} \frac{|y-x|}{|y|}\ge 1-\frac{{\hat{\varrho }}_n}{ \varrho _n}=\frac{1}{4}\left( \frac{\varrho _n-\varrho _{n+1}}{\varrho _n}\right) \ge \frac{1}{2^{n+4}}; \end{aligned}$$

consequently, recalling also \(u\ge \varvec{\mu }^-=0\) a.e. in \(\mathcal {Q}\) by assumption, we estimate

$$\begin{aligned} \iint _{Q_n}&\zeta ^pw_{-}(x,t)\,\textrm{d}x\textrm{d}t\left( \mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} x\in \widehat{K}_n\\ t\in (-\theta \varrho _n^{sp},0) \end{array}} \int _{\mathbb {R}^N\setminus K_n}\frac{ w_{-}^{p-1}(y,t)}{|x-y|^{N+sp}}\,\textrm{d}y\right) \\&\le \varvec{\gamma } 2^{(N+sp)n} \xi \varvec{\omega } |A_n| \left( \varvec{\gamma } \frac{(\xi \varvec{\omega })^{p-1}}{\varrho ^{sp}}+\mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} t\in (-\theta \varrho _n^{sp},0) \end{array}} \int _{\mathbb {R}^N\setminus K_R}\frac{ u_{-}^{p-1}(y,t)}{|y|^{N+sp}}\,\textrm{d}y\right) \\&= \varvec{\gamma } 2^{(N+sp)n} \frac{\xi \varvec{\omega }}{\varrho ^{sp}} |A_n| \left( \varvec{\gamma } (\xi \varvec{\omega })^{p-1} + \left( \frac{\varrho }{R}\right) ^{sp} [\textrm{Tail}(u_-; \mathcal {Q})]^{p-1} \right) \\&\le \varvec{\gamma } 2^{(N+sp)n} \frac{(\xi \varvec{\omega })^p}{\varrho ^{sp}} |A_n|. \end{aligned}$$

In the last line, we have enforced

$$\begin{aligned} \left( \frac{\varrho }{R}\right) ^{\frac{sp}{p-1}} \textrm{Tail}(u_{-}; \mathcal {Q}) \le \xi \varvec{\omega }. \end{aligned}$$
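
The second line of the estimate for the second term rests on a splitting that can be spelled out as follows; this is a sketch filling in routine steps, with constants not tracked. For \(|x|\le \hat{\varrho }_n\) and \(|y|\ge \varrho _n\), the lower bound on \(|y-x|/|y|\) gives \(|x-y|^{-N-sp}\le 2^{(n+4)(N+sp)}|y|^{-N-sp}\). Inside \(K_R\) we have \(w_-\le k_n\le \xi \varvec{\omega }\) because \(u\ge 0\) there, while outside \(K_R\) we only have \(w_-\le \xi \varvec{\omega }+u_-\). Hence, for a.e. \(t\in (-\theta \varrho _n^{sp},0)\),

$$\begin{aligned} \int _{\mathbb {R}^N\setminus K_n}\frac{w_-^{p-1}(y,t)}{|y|^{N+sp}}\,\textrm{d}y&\le \varvec{\gamma }(\xi \varvec{\omega })^{p-1}\int _{\mathbb {R}^N\setminus K_n}\frac{\textrm{d}y}{|y|^{N+sp}} +\varvec{\gamma }\int _{\mathbb {R}^N\setminus K_R}\frac{u_-^{p-1}(y,t)}{|y|^{N+sp}}\,\textrm{d}y\\&\le \varvec{\gamma }\frac{(\xi \varvec{\omega })^{p-1}}{\varrho ^{sp}}+\frac{\varvec{\gamma }}{R^{sp}}\big [\textrm{Tail}(u_-; \mathcal {Q})\big ]^{p-1}, \end{aligned}$$

where we used \(\varrho _n\ge \tfrac{1}{2}\varrho \) for the first integral and the definition (1.9) for the second.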

For the third term, it is quite standard to obtain

$$\begin{aligned} \iint _{Q_n} |\partial _t\zeta ^p |(u-k_n)_{-}^2\,\textrm{d}x\textrm{d}t\le \frac{2^{spn}}{\theta \varrho ^{sp}} (\xi \varvec{\omega })^2 |A_n|. \end{aligned}$$

Collecting these estimates on the right-hand side of the energy estimate, we arrive at

$$\begin{aligned} \mathop {\mathrm{ess\,sup}}\limits _{-\theta \tilde{\varrho }^{sp}_n<t<0}&\int _{\widetilde{K}_n} w_-^2\,\textrm{d}x+ \int _{-\theta \tilde{\varrho }^{sp}_n}^{0}\int _{\widetilde{K}_n}\int _{\widetilde{K}_n} \frac{|w_-(x,t) - w_-(y,t)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\le \varvec{\gamma } 2^{(N+2p)n}\frac{(\xi \varvec{\omega })^p}{\delta \varrho ^{sp}}|A_n|. \end{aligned}$$
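
The factor \(1/\delta \) in this bound originates solely from the term containing \(\partial _t\zeta \). Indeed, since \(\theta =\delta (\xi \varvec{\omega })^{2-p}\),

$$\begin{aligned} \frac{(\xi \varvec{\omega })^2}{\theta \varrho ^{sp}}=\frac{(\xi \varvec{\omega })^2}{\delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}}=\frac{(\xi \varvec{\omega })^p}{\delta \varrho ^{sp}}. \end{aligned}$$

As \(\delta \in (0,1)\) and the exponents \(pn\), \((N+sp)n\) and \(spn\) are all at most \((N+2p)n\), the three bounds combine into the stated one.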

Now set \(0\le \phi \le 1\) to be a cutoff function in \(\widetilde{Q}_n\), which vanishes outside \(\overline{Q}_n\), equals the identity in \(Q_{n+1}\) and satisfies \(|D\phi |\le 2^n/\varrho \). An application of the Hölder inequality and the Sobolev imbedding (cf. Proposition A.3 with \(d=2^{-n-4}\)) gives that

for some \({\varvec{b}}={\varvec{b}}(p,N)>1\). To obtain the last line, we used the triangle inequality

$$\begin{aligned} \big |w_-\phi (x,t)&- w_-\phi (y,t)\big |^p \\&\le c \big |w_-(x,t) - w_-(y,t) \big |^p \phi ^p(x,t) + c w^p_-(y,t) \big |\phi (x,t) - \phi (y,t)\big |^p, \end{aligned}$$

for some \(c=c(p)\), such that

$$\begin{aligned} \varrho ^{sp}&\int _{-\theta \tilde{\varrho }^{sp}_n}^{0}\int _{\widetilde{K}_n}\int _{\widetilde{K}_n} \frac{\big |w_-\phi (x,t) - w_-\phi (y,t)\big |^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\le c \varrho ^{sp}\int _{-\theta \tilde{\varrho }^{sp}_n}^{0}\int _{\widetilde{K}_n}\int _{\widetilde{K}_n} \frac{\big |w_-(x,t) - w_-(y,t)\big |^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\quad + c \varrho ^{sp}\int _{-\theta \tilde{\varrho }^{sp}_n}^{0}\int _{\widetilde{K}_n}\int _{\widetilde{K}_n} \frac{w^p_-(y,t)\big |\phi (x,t) - \phi (y,t)\big |^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\le c \varrho ^{sp}\int _{-\theta \tilde{\varrho }^{sp}_n}^{0}\int _{\widetilde{K}_n}\int _{\widetilde{K}_n} \frac{\big |w_-(x,t) - w_-(y,t)\big |^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\quad + \varvec{\gamma } 2^{pn} \iint _{\widetilde{Q}_n} w^p_-(y,t) \,\textrm{d}y\textrm{d}t. \end{aligned}$$

Plugging this into the second-to-last line and employing the above energy estimate, we obtain the last line. Notice also that, by Proposition A.3, there holds

$$\begin{aligned} \kappa :=1+\frac{2(\kappa _* -1)}{p\kappa _*}. \end{aligned}$$

In terms of \( {\varvec{Y}}_n=|A_n|/|Q_n|\), this estimate leads to the recursive inequality

$$\begin{aligned} \begin{aligned} {\varvec{Y}}_{n+1} \le&\varvec{\gamma } \delta ^{-\left( \frac{1}{\kappa }+\frac{\kappa _*}{\kappa _*\kappa p}\right) }(2{\varvec{b}})^n {\varvec{Y}}_n^{1+\frac{\kappa _* -1}{\kappa _*\kappa p}}, \end{aligned} \end{aligned}$$

for some generic constant \(\varvec{\gamma }\). Hence, by the fast geometric convergence, cf. [9, Chapter I, Lemma 4.1], there exists a positive constant \(\nu \) depending only on the data, such that \({\varvec{Y}}_n\rightarrow 0\) if we require that \({\varvec{Y}}_o\le \nu \). \(\square \)

The next lemma is a variant of the previous one, involving quantitative initial data.

Lemma 3.2

Let u be a locally bounded, local weak sub(super)-solution to (1.1)–(1.3) in \(E_T\). Let \(\xi \in (0,1)\). There exists a positive constant \(\nu _o\) depending only on the data \(\{s, p, N, C_o, C_1\}\) and independent of \(\xi \), such that if

$$\begin{aligned} \pm \big (\varvec{\mu }^{\pm }-u(\cdot , t_o)\big )\ge \xi \varvec{\omega } \quad \text { a.e. in } K_{\varrho }(x_o), \end{aligned}$$

then either

$$\begin{aligned} \left( \frac{\varrho }{R}\right) ^{\frac{sp}{p-1}} \textrm{Tail}\big (\big (u - \varvec{\mu }^{\pm }\big )_{\pm }; \mathcal {Q}\big ) >\xi \varvec{\omega }, \end{aligned}$$

or

$$\begin{aligned} \pm \big (\varvec{\mu }^{\pm }-u\big )\ge \tfrac{1}{2}\xi \varvec{\omega }\quad \text { a.e. in }K_{\frac{1}{2}\varrho }(x_o)\times \big (t_o, t_o+\nu _o(\xi \varvec{\omega })^{2-p}\varrho ^{sp}\big ], \end{aligned}$$

provided the cylinders are included in \(\mathcal {Q}\).

Proof

Assume \((x_o,t_o)=(0,0)\). It suffices to show the case of super-solutions with \(\varvec{\mu }^-=0\). Let us first examine the energy estimate of Proposition 2.1 in \(Q(R,S)\equiv K_{\varrho }\times (0,\theta \varrho ^{sp})\) for some \(\theta \) to be determined. Note that the time level \(t_o-S\) in Proposition 2.1 corresponds to \(t=0\) here. Let \(\zeta (x)\) be a time independent, piecewise smooth, cutoff function in \(K_\varrho \) that vanishes on \(\partial K_\varrho \). If we take the level \(k\le \xi \varvec{\omega }\), the spatial integral at \(t=0\) (i.e. the term at the time level \(t_o-S\) in Proposition 2.1) vanishes due to the assumption that \(u(\cdot , 0)\ge \xi \varvec{\omega }\) a.e. in \(K_{\varrho }\). The term involving \(\partial _t\zeta \) also vanishes since \(\zeta \) is independent of t. As a result, the energy estimate reads

$$\begin{aligned} \mathop {\mathrm{ess\,sup}}\limits _{0<t<\theta \varrho ^{sp}}&\int _{K_\varrho }\zeta ^p w_-^2\,\textrm{d}x+ \int _{0}^{\theta \varrho ^{sp}}\int _{K_\varrho }\int _{K_\varrho } \min \big \{\zeta ^p(x),\zeta ^p(y)\big \} \frac{|w_-(x,t) - w_-(y,t)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\quad \le \varvec{\gamma }\int _{0}^{\theta \varrho ^{sp}}\int _{K_\varrho }\int _{K_\varrho }\max \big \{w^p_{-}(x,t), w^p_{-}(y,t)\big \} \frac{|\zeta (x) - \zeta (y)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\qquad +\varvec{\gamma }\int _{0}^{\theta \varrho ^{sp}}\int _{K_\varrho } \zeta ^pw_{-}(x,t)\,\textrm{d}x\textrm{d}t\left( \mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} x\in {\text {supp}}\zeta \\ t\in (0,\theta \varrho ^{sp}) \end{array}} \int _{\mathbb {R}^N\setminus K_\varrho }\frac{ w_{-}^{p-1}(y,t)}{|x-y|^{N+sp}}\,\textrm{d}y\right) . \end{aligned}$$

Introduce \(k_n\), \(\varrho _n\), \(\tilde{\varrho }_n\), \(\hat{\varrho }_n\), \(\bar{\varrho }_n\), \(K_n\), \(\widetilde{K}_n\), \(\widehat{K}_n\) and \(\overline{K}_n\) as in Lemma 3.1. The only difference is that the cylinders \(Q_n\), \(\widetilde{Q}_n\), \(\widehat{Q}_n\) and \(\overline{Q}_n\) are now of forward type, i.e. \(Q_n=K_n\times (0,\theta \varrho ^{sp})\), \(\widetilde{Q}_n=\widetilde{K}_n\times (0,\theta \varrho ^{sp})\), \(\widehat{Q}_n=\widehat{K}_n\times (0,\theta \varrho ^{sp})\) and \(\overline{Q}_n=\overline{K}_n\times (0,\theta \varrho ^{sp})\). Note that while shrinking the base balls \(K_n\), \(\widehat{K}_n\), \(\widetilde{K}_n\) and \(\overline{K}_n\) along \(\varrho _n\), the height of the cylinders is fixed. For the piecewise smooth function \(\zeta (x)\) in \(K_n\), we choose it to vanish outside \(\widehat{K}_n\), be equal to 1 in \(\widetilde{K}_n\), and satisfy \(|D\zeta |\le 2^n/\varrho \). By a similar treatment of the right-hand side as in Lemma 3.1, after enforcing

$$\begin{aligned} \left( \frac{\varrho }{R}\right) ^{\frac{sp}{p-1}} \textrm{Tail}(u_{-}; \mathcal {Q}) \le \xi \varvec{\omega }, \end{aligned}$$

we may obtain that

$$\begin{aligned} \mathop {\mathrm{ess\,sup}}\limits _{0<t<\theta \varrho ^{sp} }&\int _{\widetilde{K}_n} w_-^2\,\textrm{d}x+ \int ^{\theta \varrho ^{sp}}_{0}\int _{\widetilde{K}_n}\int _{\widetilde{K}_n} \frac{|w_-(x,t) - w_-(y,t)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\le \varvec{\gamma } 2^{(N+2p)n}\frac{(\xi \varvec{\omega })^p}{\varrho ^{sp}}|A_n|, \end{aligned}$$

where \(A_n:=\{u<k_n\}\cap Q_n\).

Then we may proceed as in Lemma 3.1 to obtain the recursive inequality

$$\begin{aligned} \begin{aligned} {\varvec{Y}}_{n+1} \le&\varvec{\gamma }{\varvec{b}}^n \bigg [\frac{\theta }{(\xi \varvec{\omega })^{2-p}}\bigg ]^{\frac{\kappa _* -1}{\kappa _*\kappa p}}{\varvec{Y}}_n^{1+\frac{\kappa _* -1}{\kappa _*\kappa p}}, \end{aligned} \end{aligned}$$

where \( {\varvec{Y}}_n=|A_n|/|Q_n|\) and the constants \(\varvec{\gamma },\,{\varvec{b}}\) depend only on the data. Hence, by the fast geometric convergence, cf. [9, Chapter I, Lemma 4.1], there exists a positive constant \(\nu _o\) depending only on the data, such that if

$$\begin{aligned} {\varvec{Y}}_o\le \frac{\nu _o(\xi \varvec{\omega })^{2-p}}{\theta }, \end{aligned}$$

then \({\varvec{Y}}_n\rightarrow 0\). To finish the proof, we choose \(\theta =\nu _o(\xi \varvec{\omega })^{2-p}\). \(\square \)
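
For the reader's convenience, we sketch how the threshold arises; the abbreviations \(a\) and \(C\) are introduced here only for illustration, and all constants are generic. Writing \(a=\frac{\kappa _* -1}{\kappa _*\kappa p}\) and \(C=\varvec{\gamma }\big [\theta (\xi \varvec{\omega })^{p-2}\big ]^{a}\), the recursive inequality takes the form \({\varvec{Y}}_{n+1}\le C{\varvec{b}}^n{\varvec{Y}}_n^{1+a}\), and the fast geometric convergence lemma gives \({\varvec{Y}}_n\rightarrow 0\) provided \({\varvec{Y}}_o\le C^{-1/a}{\varvec{b}}^{-1/a^2}\). In the present setting this reads

$$\begin{aligned} C^{-\frac{1}{a}}{\varvec{b}}^{-\frac{1}{a^2}} =\varvec{\gamma }^{-\frac{1}{a}}{\varvec{b}}^{-\frac{1}{a^2}}\frac{(\xi \varvec{\omega })^{2-p}}{\theta } =:\nu _o\frac{(\xi \varvec{\omega })^{2-p}}{\theta }, \end{aligned}$$

so that \(\nu _o=\varvec{\gamma }^{-1/a}{\varvec{b}}^{-1/a^2}\) depends only on the data. With \(\theta =\nu _o(\xi \varvec{\omega })^{2-p}\) the requirement reduces to \({\varvec{Y}}_o\le 1\), which holds trivially because \(A_o\subset Q_o\).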

The following lemma propagates measure theoretical information forward in time.

Lemma 3.3

Let u be a locally bounded, local weak sub(super)-solution to (1.1)–(1.3) in \(E_T\). Introduce parameters \(\xi \) and \(\alpha \) in (0, 1). There exist \(\delta \) and \(\varepsilon \) in (0, 1), depending only on the data \(\{s, p, N, C_o, C_1\}\) and \(\alpha \), such that if

$$\begin{aligned} \Big |\Big \{ \pm \big (\varvec{\mu }^{\pm }-u(\cdot , t_o)\big )\ge \xi \varvec{\omega } \Big \}\cap K_{\varrho }(x_o)\Big | \ge \alpha \big |K_{\varrho }\big |, \end{aligned}$$

then either

$$\begin{aligned} \left( \frac{\varrho }{R}\right) ^{\frac{sp}{p-1}} \textrm{Tail}\big (\big (u - \varvec{\mu }^{\pm }\big )_{\pm }; \mathcal {Q}\big ) >\xi \varvec{\omega }, \end{aligned}$$

or

$$\begin{aligned} \Big |\Big \{ \pm \big (\varvec{\mu }^{\pm }-u(\cdot , t)\big )\ge \varepsilon \xi \varvec{\omega }\Big \} \cap K_{\varrho }(x_o)\Big | \ge \frac{\alpha }{2} |K_\varrho | \quad \text{ for } \text{ all } t\in \big (t_o,t_o+\delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}\big ], \end{aligned}$$

provided this cylinder is included in \(\mathcal {Q}\). Moreover, we may trace the dependences by \(\varepsilon \approx \alpha \) and \(\delta \approx \alpha ^{p+N+1}\).

Proof

We only show the case of super-solutions with \(\varvec{\mu }^-=0\). Assume \((x_o,t_o)=(0,0)\) without loss of generality. Use the energy estimate in Proposition 2.1 in the cylinder \(Q=K_{\varrho }\times (0,\delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}]\), with the truncation

$$\begin{aligned} w_- =(u-k)_- \quad \text {and}\quad k= \xi \varvec{\omega }. \end{aligned}$$

Choose a standard non-negative cutoff function \(\zeta (x,t)\equiv \zeta (x)\), independent of time, that equals 1 on \(K_{(1-\sigma )\varrho }\), with \(\sigma \in (0,1)\) to be chosen later, vanishes outside \(K_{(1-\frac{1}{2}\sigma )\varrho }\), and satisfies \(|D\zeta |\le 4(\sigma \varrho )^{-1}\). In such a case, the energy estimate yields that, for all \(0<t<\delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}\), there holds

$$\begin{aligned}&\int _{K_{(1-\sigma )\varrho }\times \{t\}} w^2_{-}\,\textrm{d}x-\int _{K_{\varrho }\times \{0\}} w^2_{-}\,\textrm{d}x\\&\quad \le \varvec{\gamma }\int _0^{\delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}}\int _{K_\varrho }\int _{K_\varrho }\max \big \{w^p_{-}(x,t), w^p_{-}(y,t)\big \} \frac{|\zeta (x) -\zeta (y)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\qquad +\varvec{\gamma }\iint _{Q} \zeta ^p w_{-}(x,t)\,\textrm{d}x\textrm{d}t\left( \mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} x\in {\text {supp}}\zeta \\ t\in (0, \delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}) \end{array}} \int _{\mathbb {R}^N\setminus K_\varrho }\frac{ w_{-}^{p-1}(y,t)}{|x-y|^{N+sp}}\,\textrm{d}y\right) . \end{aligned}$$

The first term on the right-hand side is estimated by

$$\begin{aligned} \int _0^{\delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}}&\int _{K_\varrho }\int _{K_\varrho }\max \big \{w^p_{-}(x,t), w^p_{-}(y,t)\big \} \frac{|\zeta (x) - \zeta (y)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\le \varvec{\gamma } [\delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp} ](\xi \varvec{\omega })^p\int _{K_\varrho }\int _{K_\varrho } \frac{|\zeta (x) - \zeta (y)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\\&\le \varvec{\gamma } [\delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp} ]\frac{(\xi \varvec{\omega })^p}{(\sigma \varrho )^p}\int _{K_\varrho }\int _{K_\varrho } \frac{1}{|x-y|^{N+(s-1)p}}\,\textrm{d}x\textrm{d}y\\&\le \varvec{\gamma }\frac{ \delta (\xi \varvec{\omega })^{2}}{\sigma ^p}|K_\varrho |. \end{aligned}$$
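
For orientation, the two ingredients behind the last two inequalities can be spelled out as follows. This is only a sketch with generic constants, using nothing beyond \(|D\zeta |\le 4(\sigma \varrho )^{-1}\) and \(s\in (0,1)\); the symbol \(d\) for the diameter of \(K_\varrho \) is introduced here for illustration. By the mean value theorem, \(|\zeta (x)-\zeta (y)|^p\le 4^p(\sigma \varrho )^{-p}|x-y|^p\), and for every \(x\in K_\varrho \), with \(d=\textrm{diam}(K_\varrho )\le \varvec{\gamma }\varrho \),

$$\begin{aligned} \int _{K_\varrho }\frac{\textrm{d}y}{|x-y|^{N+(s-1)p}} \le \int _{\{|z|<d\}}\frac{\textrm{d}z}{|z|^{N-(1-s)p}} \le \varvec{\gamma }\varrho ^{(1-s)p}, \end{aligned}$$

where the integrability at the origin is guaranteed by \((1-s)p>0\). Integrating in \(x\) over \(K_\varrho \) and collecting the prefactors, we arrive at

$$\begin{aligned} \big [\delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}\big ]\frac{(\xi \varvec{\omega })^p}{(\sigma \varrho )^p}\,\varvec{\gamma }\varrho ^{(1-s)p}|K_\varrho | =\varvec{\gamma }\frac{\delta (\xi \varvec{\omega })^{2}}{\sigma ^p}|K_\varrho |. \end{aligned}$$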

As for the second term, we observe that for \(x\in {\text {supp}}\zeta \subset K_{(1-\frac{1}{2}\sigma )\varrho }\) and \(y\in \mathbb {R}^N{\setminus } K_{\varrho }\), there holds

$$\begin{aligned} \frac{|x-y|}{|y|}\ge 1-\frac{|x|}{|y|}\ge \tfrac{1}{2}\sigma ; \end{aligned}$$

therefore, we estimate

$$\begin{aligned} \iint _{Q}&\zeta ^p w_{-}(x,t)\,\textrm{d}x\textrm{d}t\left( \mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} x\in {\text {supp}}\zeta \\ t\in (0, \delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}) \end{array}} \int _{\mathbb {R}^N\setminus K_\varrho }\frac{ w_{-}^{p-1}(y,t)}{|x-y|^{N+sp}}\,\textrm{d}y\right) \\&\le \varvec{\gamma } \frac{\xi \varvec{\omega } |Q|}{\sigma ^{N+sp}} \left( \mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} t\in (0, \delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}) \end{array}} \int _{\mathbb {R}^N\setminus K_\varrho }\frac{ w_{-}^{p-1}(y,t)}{|y|^{N+sp}}\,\textrm{d}y\right) \\&= \varvec{\gamma } \frac{\xi \varvec{\omega } |Q|}{\sigma ^{N+sp}} \mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} t\in (0, \delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}) \end{array}} \left( \int _{K_R\setminus K_\varrho }\frac{ w_{-}^{p-1}(y,t)}{|y|^{N+sp}}\,\textrm{d}y+ \int _{\mathbb {R}^N\setminus K_R}\frac{ w_{-}^{p-1}(y,t)}{|y|^{N+sp}}\,\textrm{d}y\right) \\&\le \varvec{\gamma } \frac{\xi \varvec{\omega } |Q|}{\sigma ^{N+sp}} \left( \varvec{\gamma } \frac{(\xi \varvec{\omega })^{p-1}}{\varrho ^{sp}}+\mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} t\in (0, \delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}) \end{array}} \int _{\mathbb {R}^N\setminus K_R}\frac{ u_{-}^{p-1}(y,t)}{|y|^{N+sp}}\,\textrm{d}y\right) \\&= \varvec{\gamma } \frac{\xi \varvec{\omega } |Q|}{\sigma ^{N+sp}\varrho ^{sp}} \left( \varvec{\gamma } (\xi \varvec{\omega })^{p-1} + \left( \frac{\varrho }{R}\right) ^{sp} [\textrm{Tail}(u_-; \mathcal {Q})]^{p-1} \right) \\&\le \varvec{\gamma } \frac{(\xi \varvec{\omega })^p}{\sigma ^{N+sp} \varrho ^{sp}} |Q|=\varvec{\gamma }\frac{\delta (\xi \varvec{\omega })^{2}}{\sigma ^{N+sp}}|K_\varrho |. \end{aligned}$$

Here, in the last line we have enforced

$$\begin{aligned} \left( \frac{\varrho }{R}\right) ^{\frac{sp}{p-1}} \textrm{Tail}(u_{-}; \mathcal {Q}) \le \xi \varvec{\omega }. \end{aligned}$$
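
To make the role of this enforcement explicit, we record the two elementary bounds used in the preceding chain; they are sketched under the assumption, as in Sect. 3, that \(\varvec{\mu }^-=0\) does not exceed the essential infimum of \(u\) over \(\mathcal {Q}\), so that \(0\le w_-\le \xi \varvec{\omega }\) in \(K_R\). For the part of the integral near \(K_\varrho \) we have

$$\begin{aligned} \int _{K_R\setminus K_\varrho }\frac{w_{-}^{p-1}(y,t)}{|y|^{N+sp}}\,\textrm{d}y \le (\xi \varvec{\omega })^{p-1}\int _{\mathbb {R}^N\setminus K_\varrho }\frac{\textrm{d}y}{|y|^{N+sp}} \le \varvec{\gamma }\frac{(\xi \varvec{\omega })^{p-1}}{\varrho ^{sp}}, \end{aligned}$$

whereas raising the enforced tail condition to the power \(p-1\) gives

$$\begin{aligned} \left( \frac{\varrho }{R}\right) ^{sp}\big [\textrm{Tail}(u_{-}; \mathcal {Q})\big ]^{p-1}\le (\xi \varvec{\omega })^{p-1}. \end{aligned}$$

Hence both contributions are of order \((\xi \varvec{\omega })^{p-1}\varrho ^{-sp}\), up to a generic constant.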

Therefore, combining all these estimates in the energy estimate, we have for all \(0<t<\delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}\), that

$$\begin{aligned} \int _{K_{(1-\sigma )\varrho }\times \{t\}} (u-\xi \varvec{\omega })^2_- \,\textrm{d}x&\le \int _{K_\varrho \times \{0\}} (u-\xi \varvec{\omega })^2_-\, \textrm{d}x+ \varvec{\gamma }\frac{\delta (\xi \varvec{\omega })^{2}}{\sigma ^{p+N}}|K_\varrho |\nonumber \\&\le (\xi \varvec{\omega })^{2}\Big [(1-\alpha ) +\varvec{\gamma }\frac{ \delta }{\sigma ^{p+N}}\Big ]|K_\varrho |. \end{aligned}$$

The left-hand side in the last display is estimated from below by

$$\begin{aligned} \int _{K_{(1-\sigma )\varrho }\times \{t\}} (u-\xi \varvec{\omega })^2_- \,\textrm{d}x&\ge \int _{K_{(1-\sigma )\varrho }\times \{t\}} (u-\xi \varvec{\omega })^2_- \chi _{\{u< \varepsilon \xi \varvec{\omega }\}} \,\textrm{d}x\\&\ge (1-\varepsilon )^2 (\xi \varvec{\omega })^{2}|A_{\varepsilon ,(1-\sigma )\varrho }(t)| \end{aligned}$$

where we have defined

$$\begin{aligned} A_{\varepsilon ,(1-\sigma )\varrho }(t):=\Big \{u(\cdot ,t)<\varepsilon \xi \varvec{\omega }\Big \}\cap K_{(1-\sigma )\varrho }. \end{aligned}$$

Notice also a simple fact that

$$\begin{aligned} |A_{\varepsilon ,\varrho }(t)|&=|A_{\varepsilon ,(1-\sigma )\varrho }(t)\cup (A_{\varepsilon ,\varrho }(t)-A_{\varepsilon ,(1-\sigma )\varrho }(t))|\\&\le |A_{\varepsilon ,(1-\sigma )\varrho }(t)|+|K_\varrho - K_{(1-\sigma )\varrho }|\\&\le |A_{\varepsilon ,(1-\sigma )\varrho }(t)|+N\sigma |K_\varrho |. \end{aligned}$$

Using the last two estimates, together with the energy estimate, we have for all \(0<t<\delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}\), that

$$\begin{aligned} |A_{\varepsilon ,\varrho }(t)|\le \frac{1}{(1-\varepsilon )^2}\Big [(1-\alpha ) +\varvec{\gamma }\frac{ \delta }{\sigma ^{p+N}} \Big ]|K_\varrho |+ N\sigma |K_\varrho |. \end{aligned}$$

To conclude the proof, we choose various parameters \(\sigma \), \(\delta \) and \(\varepsilon \) to satisfy:

$$\begin{aligned} N\sigma =\tfrac{1}{4}\alpha ,\quad \varvec{\gamma }\frac{ \delta }{\sigma ^{p+N}} =\tfrac{1}{8}\alpha , \quad \frac{1-\frac{7}{8}\alpha }{(1-\varepsilon )^2}\le 1-\tfrac{3}{4}\alpha . \end{aligned}$$

In this way, we obtain

$$\begin{aligned} |A_{\varepsilon ,\varrho }(t)|\le (1-\tfrac{1}{2}\alpha ) |K_\varrho | \end{aligned}$$

for all \(0<t<\delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}\), which is a restatement of the desired measure estimate. \(\square \)
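
One admissible choice of the parameters, recorded here only as a sketch to make the stated dependences visible, is

$$\begin{aligned} \sigma =\frac{\alpha }{4N},\qquad \delta =\frac{\alpha \sigma ^{p+N}}{8\varvec{\gamma }}=\frac{\alpha ^{p+N+1}}{8\varvec{\gamma }(4N)^{p+N}},\qquad \varepsilon =\frac{\alpha }{16}. \end{aligned}$$

Indeed, with these values the bracket \((1-\alpha )+\varvec{\gamma }\delta \sigma ^{-(p+N)}\) equals \(1-\tfrac{7}{8}\alpha \), and since

$$\begin{aligned} \big (1-\tfrac{3}{4}\alpha \big )\big (1-\tfrac{1}{8}\alpha \big )=1-\tfrac{7}{8}\alpha +\tfrac{3}{32}\alpha ^2\ge 1-\tfrac{7}{8}\alpha \quad \text {and}\quad \big (1-\tfrac{1}{16}\alpha \big )^2\ge 1-\tfrac{1}{8}\alpha , \end{aligned}$$

we obtain \(\big (1-\tfrac{7}{8}\alpha \big )(1-\varepsilon )^{-2}\le 1-\tfrac{3}{4}\alpha \). Adding the contribution \(N\sigma =\tfrac{1}{4}\alpha \) yields the bound \(1-\tfrac{1}{2}\alpha \), and the choices above are consistent with \(\varepsilon \approx \alpha \) and \(\delta \approx \alpha ^{p+N+1}\).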

The following measure shrinking lemma is usually a delicate part in the theory of the parabolic p-Laplacian. However, the term that involves mixed positive/negative truncations in the energy estimate greatly simplifies the argument. For ease of notation, we omit the vertex \((x_o,t_o)\) from \(Q_\varrho (\theta )\).

Lemma 3.4

Let u be a locally bounded, local weak sub(super)-solution to (1.1)–(1.3) in \(E_T\). Suppose that for some \(\alpha \) in (0, 1) and some \(\delta \), \(\sigma \) and \(\xi \) in \((0,\tfrac{1}{2})\), there holds

$$\begin{aligned} \Big |\Big \{ \pm \big (\varvec{\mu }^{\pm }-u(\cdot , t)\big )\ge \xi \varvec{\omega } \Big \}\cap K_{\varrho }(x_o)\Big | \ge \alpha \big |K_{\varrho }\big |\quad \text{ for } \text{ all } t\in \big (t_o-\delta (\sigma \xi \varvec{\omega })^{2-p}\varrho ^{sp}, t_o\big ]. \end{aligned}$$

Let \(\theta =\delta (\sigma \xi \varvec{\omega })^{2-p}\). There exists \(\varvec{\gamma }>0\) depending only on the data \(\{s, p, N, C_o, C_1\}\) and independent of \(\{\alpha , \delta , \sigma ,\xi \}\), such that either

$$\begin{aligned} \left( \frac{\varrho }{R}\right) ^{\frac{sp}{p-1}} \textrm{Tail}\big (\big (u - \varvec{\mu }^{\pm }\big )_{\pm }; \mathcal {Q}\big ) >\sigma \xi \varvec{\omega }, \end{aligned}$$

or

$$\begin{aligned} \Big |\Big \{ \pm \big (\varvec{\mu }^{\pm }-u\big )\le \sigma \xi \varvec{\omega } \Big \}\cap Q_{\varrho }(\theta )\Big | \le \varvec{\gamma } \frac{\sigma ^{p-1}}{\delta \alpha } |Q_{\varrho }(\theta )|, \end{aligned}$$

provided \(Q_{2\varrho }(\theta )\) is included in \(\mathcal {Q}\).

Proof

It suffices to show the case of super-solutions with \(\varvec{\mu }^-=0\). For simplicity, we assume \((x_o,t_o)=(0,0)\). Let us employ the energy estimate of Proposition 2.1 in \(K_{2\varrho }\times (-\theta \varrho ^{sp}, 0]\) with the truncation

$$\begin{aligned} w_- =(u-k)_- \quad \text {and}\quad k= \sigma \xi \varvec{\omega }, \end{aligned}$$

and introduce a cutoff function \(\zeta \) in \(K_{ 2\varrho }\) (independent of t) that is equal to 1 in \(K_{\varrho }\) and vanishes outside \(K_{\frac{3}{2}\varrho }\), such that \(|D\zeta |\le 4/\varrho \). Then, we obtain from Proposition 2.1 that

$$\begin{aligned} \iint _{Q_\varrho (\theta )}&w_{-}(y,t) \,\textrm{d}y\textrm{d}t\left( \int _{ K_{2\varrho }} \frac{w^{p-1}_+(x,t)}{|x-y|^{N+sp}}\,\textrm{d}x\right) \\&\le \varvec{\gamma }\int _{-\theta \varrho ^{sp}}^{0}\int _{K_{2\varrho }}\int _{K_{2\varrho }}\max \big \{w_{-}^p(x,t), w_{-}^p(y,t)\big \} \frac{|\zeta (x) - \zeta (y)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\quad +\varvec{\gamma }\iint _{Q_{2\varrho }(\theta )} \zeta ^p w_{-}(x,t)\,\textrm{d}x\textrm{d}t\left( \mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} x\in K_{\frac{3}{2}\varrho }\\ t\in (-\theta \varrho ^{sp},0) \end{array}} \int _{\mathbb {R}^N\setminus K_{2\varrho }}\frac{ w_{-}^{p-1}(y,t)}{|x-y|^{N+sp}}\,\textrm{d}y\right) \\&\quad + \int _{K_{2\varrho } } w_-^2(x,-\theta \varrho ^{sp}) \,\textrm{d}x. \end{aligned}$$

The first term on the right-hand side is estimated by

$$\begin{aligned} \int _{-\theta \varrho ^{sp}}^{0}&\int _{K_{2\varrho }}\int _{K_{2\varrho }}\max \big \{w_{-}^p(x,t), w_{-}^p(y,t)\big \} \frac{|\zeta (x) - \zeta (y)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\textrm{d}t\\&\le \theta \varrho ^{sp} (\sigma \xi \varvec{\omega })^p\int _{K_{2\varrho }}\int _{K_{2\varrho }} \frac{|\zeta (x) - \zeta (y)|^p}{|x-y|^{N+sp}}\,\textrm{d}x\textrm{d}y\\&\le \varvec{\gamma }\theta \varrho ^{sp} \frac{(\sigma \xi \varvec{\omega })^p}{\varrho ^p}\int _{K_{2\varrho }}\int _{K_{2\varrho }} \frac{1}{|x-y|^{N+(s-1)p}}\,\textrm{d}x\textrm{d}y\\&\le \varvec{\gamma }\theta \varrho ^{sp} \frac{(\sigma \xi \varvec{\omega })^p}{\varrho ^p}\frac{|K_{2\varrho }|}{\varrho ^{(s-1)p}}\\&\le \varvec{\gamma } \frac{(\sigma \xi \varvec{\omega })^p}{\varrho ^{sp}}|Q_{\varrho }(\theta )|, \end{aligned}$$

noticing that \(|K_{2\varrho }|=2^N |K_\varrho |\) in the last line. As for the second term, observe that for \(|x|\le \frac{3}{2}\varrho \) and \(|y|\ge 2\varrho \), there holds

$$\begin{aligned} \frac{|x-y|}{|y|}\ge 1-\frac{|x|}{|y|}\ge \frac{1}{4}; \end{aligned}$$

consequently, we estimate

$$\begin{aligned} \iint _{Q_{2\varrho }(\theta )}&\zeta ^p w_{-}(x,t)\,\textrm{d}x\textrm{d}t\left( \mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} x\in K_{\frac{3}{2}\varrho }\\ t\in (-\theta \varrho ^{sp},0) \end{array}} \int _{\mathbb {R}^N\setminus K_{2\varrho }}\frac{ w_{-}^{p-1}(y,t)}{|x-y|^{N+sp}}\,\textrm{d}y\right) \\&\le \varvec{\gamma }(\sigma \xi \varvec{\omega }) |Q_{2\varrho }(\theta )| \left( \mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} t\in (-\theta \varrho ^{sp},0) \end{array}} \int _{\mathbb {R}^N\setminus K_{2\varrho }}\frac{ w_{-}^{p-1}(y,t)}{|y|^{N+sp}}\,\textrm{d}y\right) \\&\le \varvec{\gamma }(\sigma \xi \varvec{\omega }) |Q_{2\varrho }(\theta )| \left( \varvec{\gamma }\frac{(\sigma \xi \varvec{\omega })^{p-1}}{\varrho ^{sp}}+\mathop {\mathrm{ess\,sup}}\limits _{\begin{array}{c} t\in (-\theta \varrho ^{sp},0) \end{array}} \int _{\mathbb {R}^N\setminus K_{R}}\frac{ u_{-}^{p-1}(y,t)}{|y|^{N+sp}}\,\textrm{d}y\right) \\&\le \varvec{\gamma }\frac{\sigma \xi \varvec{\omega }}{\varrho ^{sp}} |Q_{2\varrho }(\theta )| \left( \varvec{\gamma } (\sigma \xi \varvec{\omega })^{p-1}+\left( \frac{\varrho }{R}\right) ^{sp} [\textrm{Tail}(u_{-}; \mathcal {Q})]^{p-1} \right) \\&\le \varvec{\gamma } \frac{(\sigma \xi \varvec{\omega })^p}{\varrho ^{sp}}|Q_{\varrho }(\theta )|. \end{aligned}$$

Here, in the last line we have enforced

$$\begin{aligned} \left( \frac{\varrho }{R}\right) ^{\frac{sp}{p-1}} \textrm{Tail}(u_{-}; \mathcal {Q}) \le \sigma \xi \varvec{\omega }, \end{aligned}$$

and used \(|Q_{2\varrho }(\theta )|=2^{N+sp}|Q_{\varrho }(\theta )|\). The third term is standard:

$$\begin{aligned} \int _{K_{2\varrho } } w_-^2(x,-\theta \varrho ^{sp}) \,\textrm{d}x\le (\sigma \xi \varvec{\omega })^2|K_{2\varrho }|\le \varvec{\gamma } \frac{(\sigma \xi \varvec{\omega })^p}{\delta \varrho ^{sp}}|Q_{\varrho }(\theta )|, \end{aligned}$$

recalling that \(\theta =\delta (\sigma \xi \varvec{\omega })^{2-p}\) and noticing that \(|K_{2\varrho }|=2^N |K_\varrho |\).

Combining the above estimates, we see that the right-hand side of the energy estimate is bounded by

$$\begin{aligned} \varvec{\gamma } \frac{(\sigma \xi \varvec{\omega })^p}{\delta \varrho ^{sp}}|Q_{\varrho }(\theta )|, \end{aligned}$$

provided the tail estimate is enforced. We stress that \(\varvec{\gamma }\) depends only on the data \(\{s, p, N, C_o, C_1\}\) and is independent of \(\{\alpha , \delta , \sigma ,\xi \}\).

The left-hand side is estimated from below by restricting the integrals to smaller sets and by using the given measure theoretical information:

$$\begin{aligned} \iint _{Q_\varrho (\theta ) }&w_{-}(y,t) \chi _{\{u(y,t)\le \frac{1}{2}\sigma \xi \varvec{\omega }\}} \,\textrm{d}y\textrm{d}t\left( \int _{ K_{2\varrho }} \frac{w^{p-1}_+(x,t)\chi _{\{u(x,t)\ge \xi \varvec{\omega }\} }}{|x-y|^{N+sp}}\,\textrm{d}x\right) \\&\ge \tfrac{1}{2}\sigma \xi \varvec{\omega }\Big |\Big \{u\le \tfrac{1}{2}\sigma \xi \varvec{\omega }\Big \}\cap Q_\varrho (\theta )\Big | \left( \frac{ (\tfrac{1}{2}\xi \varvec{\omega })^{p-1}\alpha |K_\varrho |}{(4\varrho )^{N+sp}} \right) \\&=2^{-(2N+p+2sp)}\frac{(\xi \varvec{\omega })^p\alpha \sigma }{\varrho ^{sp}} \Big |\Big \{u\le \tfrac{1}{2}\sigma \xi \varvec{\omega }\Big \}\cap Q_\varrho (\theta )\Big |. \end{aligned}$$

Combining these estimates and properly adjusting relevant constants, we conclude the proof. \(\square \)
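
Explicitly, and only as a sketch with generic constants: since the integrand on the left-hand side of the energy estimate is non-negative, it dominates the restricted integral, and chaining the lower bound with the upper bound of the right-hand side gives

$$\begin{aligned} 2^{-(2N+p+2sp)}\frac{(\xi \varvec{\omega })^p\alpha \sigma }{\varrho ^{sp}} \Big |\Big \{u\le \tfrac{1}{2}\sigma \xi \varvec{\omega }\Big \}\cap Q_\varrho (\theta )\Big | \le \varvec{\gamma } \frac{(\sigma \xi \varvec{\omega })^p}{\delta \varrho ^{sp}}|Q_{\varrho }(\theta )|. \end{aligned}$$

Dividing by \((\xi \varvec{\omega })^p\alpha \sigma \varrho ^{-sp}\) and absorbing the factor \(2^{2N+p+2sp}\) into \(\varvec{\gamma }\), we find

$$\begin{aligned} \Big |\Big \{u\le \tfrac{1}{2}\sigma \xi \varvec{\omega }\Big \}\cap Q_\varrho (\theta )\Big | \le \varvec{\gamma }\frac{\sigma ^{p-1}}{\delta \alpha }|Q_{\varrho }(\theta )|. \end{aligned}$$

The factor \((\xi \varvec{\omega })^p\) cancels in this step, which is why \(\varvec{\gamma }\) is independent of \(\xi \).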

4 Proof of Theorem 1.1: \(1<p\le 2\)

4.1 Expansion of positivity

Suppose the cylinder \(\mathcal {Q}\) and the numbers \(\varvec{\mu }^{\pm }\) and \(\varvec{\omega }\) are defined as in Sect. 3. The key ingredient of the reduction of oscillation lies in the following expansion of positivity, which is valid for \(1<p\le 2\).

Proposition 4.1

Let u be a locally bounded, local, weak sub(super)-solution to (1.1)–(1.3) in \(E_T\), with \(1<p\le 2\). Suppose for some constants \(\alpha ,\xi \in (0,1)\), there holds

$$\begin{aligned} \Big |\Big \{\pm \big (\varvec{\mu }^{\pm }-u(\cdot , t_o)\big )\ge \xi \varvec{\omega } \Big \}\cap K_{\varrho }(x_o) \Big | \ge \alpha \big |K_\varrho \big |. \end{aligned}$$

Then there exist constants \(\delta ,\,\eta \in (0,1)\) depending only on the data \(\{s, p, N, C_o, C_1\}\) and \(\alpha \), such that either

$$\begin{aligned} \left( \frac{\varrho }{R}\right) ^{\frac{sp}{p-1}} \textrm{Tail}\big (\big (u - \varvec{\mu }^{\pm }\big )_{\pm }; \mathcal {Q}\big ) >\eta \xi \varvec{\omega }, \end{aligned}$$

or

$$\begin{aligned} \pm \big (\varvec{\mu }^{\pm }-u\big )\ge \eta \xi \varvec{\omega } \quad \text{ a.e. } \text{ in } K_{2\varrho }(x_o) \times \big ( t_o+\tfrac{1}{2} \delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}, t_o+\delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}\big ], \end{aligned}$$

provided

$$\begin{aligned} K_{4\varrho }(x_o)\times \big (t_o, t_o+\delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}\big ]\subset \mathcal {Q}. \end{aligned}$$

Moreover, we have \(\delta \approx \alpha ^{p+N+1}\) and \(\eta \approx \alpha ^q\) for some \(q>1\) depending on the data \(\{s, p, N, C_o, C_1\}\).

Proof

Assuming \((x_o,t_o)=(0,0)\) and \(\varvec{\mu }^{-}=0\) for simplicity, it suffices to deal with super-solutions. As a restatement, we need to show that there exist \(\delta \) and \(\eta \), such that if

$$\begin{aligned} \Big |\Big \{u(\cdot , 0) \ge \xi \varvec{\omega } \Big \}\cap K_{\varrho } \Big | \ge \alpha \big |K_\varrho \big | \end{aligned}$$

and if

$$\begin{aligned} \left( \frac{\varrho }{R}\right) ^{\frac{sp}{p-1}} \textrm{Tail}\big (u_{-}; \mathcal {Q}\big ) \le \eta \xi \varvec{\omega }, \end{aligned}$$

then

$$\begin{aligned} u\ge \eta \xi \varvec{\omega } \quad \text{ a.e. } \text{ in } K_{2\varrho } \times \big ( \tfrac{1}{2} \delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}, \delta (\xi \varvec{\omega })^{2-p}\varrho ^{sp}\big ]. \end{aligned}$$

We stress that \(\delta \) and \(\eta \) depend only on the data \(\{s, p, N, C_o, C_1\}\) and \(\alpha \), but are independent of \(\xi \).

Rewriting the measure theoretical information at the initial time \(t_o=0\) in the larger ball \(K_{4\varrho }\) and replacing \(\alpha \) by \(4^{-N}\alpha \), we can enforce

$$\begin{aligned} \left( \frac{\varrho }{R}\right) ^{\frac{sp}{p-1}} \textrm{Tail}(u_{-}; \mathcal {Q}) \le \xi \varvec{\omega }, \end{aligned}$$

and apply Lemma 3.3 to obtain \(\delta , \varepsilon \in (0,1)\) depending only on the data \(\{s, p, N, C_o, C_1\}\) and \(\alpha \), such that

$$\begin{aligned} \Big |\Big \{ u(\cdot , t) \ge \varepsilon \xi \varvec{\omega }\Big \} \cap K_{4\varrho } \Big | \ge \frac{\alpha }{2} 4^{-N} |K_{4\varrho }| \quad \text{ for } \text{ all } t\in \big (0, \delta (\xi \varvec{\omega })^{2-p}(4\varrho )^{sp}\big ]. \end{aligned}$$

This measure theoretical information for each slice of the time interval in turn allows us to apply Lemma 3.4 in the cylinders \((0,\bar{t})+Q_{4\varrho }(\delta (\sigma \varepsilon \xi \varvec{\omega })^{2-p})\) with an arbitrary \(\bar{t}\in \big (\delta (\sigma \varepsilon \xi \varvec{\omega })^{2-p}(4\varrho )^{sp}, \delta (\xi \varvec{\omega })^{2-p}(4\varrho )^{sp}\big ]\), and with \(\xi \) and \(\alpha \) there replaced by \(\varepsilon \xi \) and \(\tfrac{1}{2} 4^{-N}\alpha \). This is viable because \(\sigma \in (0,1)\) and \(\delta (\sigma \varepsilon \xi \varvec{\omega })^{2-p}\le \delta (\xi \varvec{\omega })^{2-p}\); consequently, we have

$$\begin{aligned} (0,\bar{t})+Q_{4\varrho }(\delta (\sigma \varepsilon \xi \varvec{\omega })^{2-p})\subset K_{4\varrho }\times \big (0, \delta (\xi \varvec{\omega })^{2-p}(4\varrho )^{sp}\big ] \end{aligned}$$

when \(\bar{t}\) ranges over the given interval. Note also that this step used the fact that \(p\le 2\).
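
The inclusion can be checked directly on the time intervals. The following sketch assumes the convention \((x_o,t_o)+Q_{\varrho }(\theta )=K_\varrho (x_o)\times (t_o-\theta \varrho ^{sp},t_o]\) from Sect. 1.2, and the abbreviation \(\theta '=\delta (\sigma \varepsilon \xi \varvec{\omega })^{2-p}\) is introduced here only for illustration:

$$\begin{aligned} (0,\bar{t})+Q_{4\varrho }(\theta ')=K_{4\varrho }\times \big (\bar{t}-\theta '(4\varrho )^{sp},\,\bar{t}\,\big ],\qquad \bar{t}-\theta '(4\varrho )^{sp}>0,\qquad \bar{t}\le \delta (\xi \varvec{\omega })^{2-p}(4\varrho )^{sp}. \end{aligned}$$

Here the two inequalities are exactly the constraints on \(\bar{t}\), while \(\theta '\le \delta (\xi \varvec{\omega })^{2-p}\) follows from \(\sigma \varepsilon <1\) and \(2-p\ge 0\).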

Letting \(\nu \) be determined in Lemma 3.1 in terms of the data and \(\delta \), we further choose \(\sigma \) according to Lemma 3.4 to satisfy

$$\begin{aligned} \varvec{\gamma } \frac{\sigma ^{p-1}}{\delta \alpha } \le \nu , \quad \text {i.e.}\quad \sigma \le \left( \frac{\nu \delta \alpha }{\varvec{\gamma }}\right) ^{\frac{1}{p-1}}. \end{aligned}$$

This choice is possible because \(\varvec{\gamma }\) of Lemma 3.4 is independent of \(\sigma \). Further enforcing

$$\begin{aligned} \left( \frac{\varrho }{R}\right) ^{\frac{sp}{p-1}} \textrm{Tail}\big ( u_{-}; \mathcal {Q}\big ) \le \sigma \varepsilon \xi \varvec{\omega }, \end{aligned}$$

such a choice of \(\sigma \) permits us to apply Lemma 3.1 in the cylinders \((0,\bar{t})+Q_{4\varrho }(\delta (\sigma \varepsilon \xi \varvec{\omega })^{2-p})\) with an arbitrary \(\bar{t}\in \big (\delta (\sigma \varepsilon \xi \varvec{\omega })^{2-p}(4\varrho )^{sp}, \delta (\xi \varvec{\omega })^{2-p}(4\varrho )^{sp}\big ]\), and with \(\xi \) there replaced by \(\sigma \varepsilon \xi \). Therefore, by arbitrariness of \(\bar{t}\) we conclude that

$$\begin{aligned} u\ge \tfrac{1}{2}\sigma \varepsilon \xi \varvec{\omega }\quad \text {a.e. in}\>K_{2\varrho } \times \big ( \delta (\sigma \varepsilon \xi \varvec{\omega })^{2-p}(4\varrho )^{sp}, \delta (\xi \varvec{\omega })^{2-p}(4\varrho )^{sp}\big ]. \end{aligned}$$

The proof is completed by defining \(\eta =\sigma \varepsilon \) and properly adjusting relevant constants in dependence of the data and \(\alpha \). \(\square \)

Based on Proposition 4.1, the remaining part is devoted to the proof of Theorem 1.1 for \(1<p\le 2\). All constants determined in the course of the proof are stable as \(p\rightarrow 2\).

4.2 The first step

For some \(Q_{\widetilde{R}}\subset E_T\) we introduce

$$\begin{aligned} \varvec{\omega }=2\mathop {\mathrm{ess\,sup}}\limits _{Q_{\widetilde{R}}} |u| +\textrm{Tail}(u; Q_{\widetilde{R}}) \end{aligned}$$

and \(Q_o=Q_R(\varvec{\omega }^{2-p})\). By properly shrinking R, we may assume that \(Q_o\subset Q_{\widetilde{R}}\) and set

$$\begin{aligned} \varvec{\mu }^+=\mathop {\mathrm{ess\,sup}}\limits _{Q_o}u, \qquad \varvec{\mu }^-=\mathop {\mathrm{ess\,inf}}\limits _{Q_o}u. \end{aligned}$$

Without loss of generality, we take \((x_o,t_o)=(0,0)\).

Then the following intrinsic relation holds true:

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{Q_R(\varvec{\omega }^{2-p})}u\le \varvec{\omega }. \end{aligned}$$
(4.1)

The choice of the reference cylinder \(Q_{\widetilde{R}}\) is made to verify (4.1), on which the subsequent arguments are based.

Let \(\delta \in (0,1)\) be determined in Proposition 4.1 with \(\alpha =\tfrac{1}{2}\). For some \(c\in (0,\tfrac{1}{4})\) to be chosen, define

$$\begin{aligned} \tau :=\delta (\tfrac{1}{4} \varvec{\omega })^{2-p}( c R)^{sp} \end{aligned}$$

and consider two alternatives

$$\begin{aligned} \left\{ \begin{array}{ll} \Big |\Big \{u\big (\cdot ,-\tau \big )-\varvec{\mu }^->\tfrac{1}{4} \varvec{\omega }\Big \} \cap K_{c R}\Big | \ge \tfrac{1}{2} |K_{c R}|,\\ \Big |\Big \{\varvec{\mu }^+ - u\big (\cdot ,-\tau \big )>\tfrac{1}{4} \varvec{\omega }\Big \} \cap K_{c R}\Big | \ge \tfrac{1}{2} |K_{c R}|. \end{array}\right. \end{aligned}$$

Assuming \(\varvec{\mu }^+ - \varvec{\mu }^-\ge \tfrac{1}{2}\varvec{\omega }\), one of the two alternatives must hold. The case \(\varvec{\mu }^+ - \varvec{\mu }^-<\tfrac{1}{2}\varvec{\omega }\) will be trivially incorporated into the forthcoming oscillation estimate (4.3).

Let us suppose the first alternative holds for instance. An appeal to Proposition 4.1 with \(\alpha =\tfrac{1}{2}\), \(\xi =\tfrac{1}{4}\) and \(\varrho =cR\) determines \(\eta \in (0,\tfrac{1}{2})\) and yields that either

$$\begin{aligned} c^{\frac{sp}{p-1}} \textrm{Tail}\big ( \big (u-\varvec{\mu }^{-}\big )_{-}; Q_o\big ) > \eta \varvec{\omega }, \end{aligned}$$
(4.2)

or

$$\begin{aligned} u-\varvec{\mu }^{-} \ge \eta \varvec{\omega } \quad \text {a.e. in}\> Q_{cR}(\tfrac{1}{2}\delta (\tfrac{1}{4} \varvec{\omega })^{2-p}), \end{aligned}$$

which, thanks to (4.1), gives the reduction of oscillation

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{Q_{cR}(\frac{1}{2}\delta (\frac{1}{4} \varvec{\omega })^{2-p})} u\le \big (1-\eta )\varvec{\omega }=:\varvec{\omega }_1. \end{aligned}$$
(4.3)
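
For completeness, the elementary computation behind (4.3) can be sketched as follows, writing \(Q'=Q_{cR}(\tfrac{1}{2}\delta (\tfrac{1}{4} \varvec{\omega })^{2-p})\) for brevity (a notation used only here). Since \(Q'\subset Q_o\), the lower bound \(u\ge \varvec{\mu }^-+\eta \varvec{\omega }\) a.e. in \(Q'\) and (4.1) give

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{Q'} u \le \varvec{\mu }^+-\big (\varvec{\mu }^-+\eta \varvec{\omega }\big ) =\mathop {\mathrm{ess\,osc}}\limits _{Q_o}u-\eta \varvec{\omega } \le (1-\eta )\varvec{\omega }. \end{aligned}$$

In the remaining case \(\varvec{\mu }^+ - \varvec{\mu }^-<\tfrac{1}{2}\varvec{\omega }\), the same bound holds because \(\tfrac{1}{2}\varvec{\omega }\le (1-\eta )\varvec{\omega }\) for \(\eta \in (0,\tfrac{1}{2})\).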

The number c is chosen to ensure that (4.2) does not happen. Indeed, we may first estimate

$$\begin{aligned} \textrm{Tail}\big ( \big (u-\varvec{\mu }^{-}\big )_{-}; Q_o\big ) \le \varvec{\gamma }\varvec{\omega }. \end{aligned}$$
(4.4)

This can be seen by the definitions of \(\varvec{\omega }\) and the tail,

$$\begin{aligned} \big [\textrm{Tail}&\big ( \big (u-\varvec{\mu }^{-}\big )_{-}; Q_o\big )\big ]^{p-1} =R^{sp} \mathop {\mathrm{ess\,sup}}\limits _{-\varvec{\omega }^{2-p}R^{sp}<t<0}\int _{\mathbb {R}^N\setminus K_{R}}\frac{\big (u-\varvec{\mu }^{-}\big )_{-}^{p-1}}{|x|^{N+sp}}\,\textrm{d}x\\&\le \varvec{\gamma } \varvec{\omega }^{p-1}+\varvec{\gamma } R^{sp} \mathop {\mathrm{ess\,sup}}\limits _{-\varvec{\omega }^{2-p}R^{sp}<t<0} \int _{\mathbb {R}^N\setminus K_R} \frac{u_{-}^{p-1}}{|x|^{N+sp}}\,\textrm{d}x\\&=\varvec{\gamma } \varvec{\omega }^{p-1}+\varvec{\gamma } R^{sp} \mathop {\mathrm{ess\,sup}}\limits _{-\varvec{\omega }^{2-p}R^{sp}<t<0} \bigg [ \int _{\mathbb {R}^N\setminus K_{\widetilde{R}}} \frac{u_{-}^{p-1}}{|x|^{N+sp}}\,\textrm{d}x+ \int _{K_{\widetilde{R}}\setminus K_{R}} \frac{u_{-}^{p-1}}{|x|^{N+sp}}\,\textrm{d}x\bigg ]\\&\le \varvec{\gamma }\varvec{\omega }^{p-1}. \end{aligned}$$

Then, using (4.4) we choose

$$\begin{aligned} c^{\frac{sp}{p-1}}\varvec{\gamma }\varvec{\omega }\le \eta \varvec{\omega },\quad \text {i.e.}\quad c\le \left( \frac{\eta }{\varvec{\gamma }}\right) ^{\frac{p-1}{sp}}, \end{aligned}$$
(4.5)

so that (4.2) does not occur. Note that (4.5) is not yet the final choice of c, which is subject to a further smallness requirement.

Next, we set \(R_1=\lambda R\) for some \(\lambda \le c\) to verify the set inclusion

$$\begin{aligned} Q_{R_1}(\varvec{\omega }_1^{2-p}) \subset Q_{cR}(\tfrac{1}{2}\delta (\tfrac{1}{4} \varvec{\omega })^{2-p}),\quad \text {i.e.}\quad \lambda \le 2^{\frac{2p-5}{sp}}\delta ^{\frac{1}{sp}}c. \end{aligned}$$
(4.6)

As a result of this inclusion and (4.3) we obtain

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{Q_{R_1}(\varvec{\omega }_1^{2-p})}u\le \varvec{\omega }_1, \end{aligned}$$

which plays the role of (4.1) in the next stage.

4.3 The induction

Now we may proceed by induction.

Suppose up to \(i=1,\cdots , j\), we have built

$$\begin{aligned} \left\{ \begin{array}{c} \displaystyle R_o=R,\quad R_i=\lambda R_{i-1}, \quad \varvec{\omega }_i=(1-\eta )\varvec{\omega }_{i-1}, \quad Q_i=Q_{R_i}(\varvec{\omega }_i^{2-p}),\\ \displaystyle \varvec{\mu }_i^+=\mathop {\mathrm{ess\,sup}}\limits _{Q_i}u, \quad \varvec{\mu }_i^-=\mathop {\mathrm{ess\,inf}}\limits _{Q_i}u, \quad \mathop {\mathrm{ess\,osc}}\limits _{Q_i}u\le \varvec{\omega }_i. \end{array} \right. \end{aligned}$$

The induction argument will show that the above oscillation estimate continues to hold for the \((j+1)\)-th step.

Let \(\delta \) be fixed as before, whereas \(c\in (0,1)\) is subject to a further choice. To reduce the oscillation in the next stage, we basically repeat what has been done in the first step, now with \(\varvec{\mu }^{\pm }_j\), \(\varvec{\omega }_j\), \(R_j\), \(Q_j\), etc. In fact, we define

$$\begin{aligned} \tau :=\delta (\tfrac{1}{4} \varvec{\omega }_j)^{2-p}( c R_j)^{sp} \end{aligned}$$

and consider two alternatives

$$\begin{aligned} \left\{ \begin{array}{ll} \Big |\Big \{u\big (\cdot ,-\tau \big )-\varvec{\mu }^-_j>\tfrac{1}{4} \varvec{\omega }_j\Big \} \cap K_{c R_j}\Big | \ge \tfrac{1}{2} |K_{c R_j}|,\\ \Big |\Big \{\varvec{\mu }^+_j - u\big (\cdot ,-\tau \big )>\tfrac{1}{4} \varvec{\omega }_j\Big \} \cap K_{c R_j}\Big | \ge \tfrac{1}{2} |K_{c R_j}|. \end{array}\right. \end{aligned}$$

As in the first step, we may assume \(\varvec{\mu }^+_j - \varvec{\mu }^-_j\ge \tfrac{1}{2}\varvec{\omega }_j\), so that one of the two alternatives must hold. Otherwise, the case \(\varvec{\mu }^+_j - \varvec{\mu }^-_j<\tfrac{1}{2}\varvec{\omega }_j\) can be trivially incorporated into the forthcoming oscillation estimate (4.8).

Suppose, for instance, that the first case holds. An application of Proposition 4.1 in \(Q_j\), with \(\alpha =\tfrac{1}{2}\), \(\xi =\tfrac{1}{4}\) and \(\varrho =cR_j\), yields (for the same \(\eta \) as before) that either

$$\begin{aligned} c^{\frac{sp}{p-1}} \textrm{Tail}\big ( \big (u-\varvec{\mu }_j^{-}\big )_{-}; Q_j\big ) > \eta \varvec{\omega }_j, \end{aligned}$$
(4.7)

or

$$\begin{aligned} u-\varvec{\mu }_j^{-} \ge \eta \varvec{\omega }_j \quad \text {a.e. in}\> Q_{cR_j}(\tfrac{1}{2}\delta (\tfrac{1}{4} \varvec{\omega }_j)^{2-p}), \end{aligned}$$

which, thanks to the j-th induction assumption, gives the reduction of oscillation

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{Q_{cR_j}(\frac{1}{2}\delta (\frac{1}{4} \varvec{\omega }_j)^{2-p})} u\le \big (1-\eta \big )\varvec{\omega }_j=:\varvec{\omega }_{j+1}. \end{aligned}$$
(4.8)

The final choice of c is made to ensure that (4.7) does not happen, independently of j. This hinges upon the following tail estimate:

$$\begin{aligned} \textrm{Tail}\big ( \big (u-\varvec{\mu }_j^{-}\big )_{-}; Q_j\big ) \le \varvec{\gamma }\varvec{\omega }_j. \end{aligned}$$
(4.9)

To prove this, we first rewrite the tail as follows:

$$\begin{aligned} \big [\textrm{Tail}&\big ( \big (u-\varvec{\mu }_j^{-}\big )_{-}; Q_j\big )\big ]^{p-1}=R_j^{sp}\mathop {\mathrm{ess\,sup}}\limits _{-\varvec{\omega }_j^{2-p}R_j^{sp}<t<0}\int _{\mathbb {R}^N\setminus K_j} \frac{\big (u-\varvec{\mu }_j^{-}\big )_{-}^{p-1}}{|x|^{N+sp}}\,\textrm{d}x\\&=R_j^{sp}\mathop {\mathrm{ess\,sup}}\limits _{-\varvec{\omega }_j^{2-p}R_j^{sp}<t<0} \bigg [ \int _{\mathbb {R}^N\setminus K_R} \frac{\big (u-\varvec{\mu }_j^{-}\big )_{-}^{p-1}}{|x|^{N+sp}}\,\textrm{d}x+ \sum _{i=1}^{j}\int _{K_{i-1}\setminus K_{i}} \frac{\big (u-\varvec{\mu }_j^{-}\big )_{-}^{p-1}}{|x|^{N+sp}}\,\textrm{d}x\bigg ]. \end{aligned}$$

Here, we denoted \(K_i=K_{R_i}\) for short. The first integral is estimated by using the definition of \(\varvec{\omega }\). Namely, for any \(t\in (-\varvec{\omega }_j^{2-p}R_j^{sp},0)\),

$$\begin{aligned} \int _{\mathbb {R}^N\setminus K_R} \frac{\big (u-\varvec{\mu }_j^{-}\big )_{-}^{p-1}}{|x|^{N+sp}}\,\textrm{d}x&\le \varvec{\gamma }\int _{\mathbb {R}^N\setminus K_R} \frac{|\varvec{\mu }_j^{-}|^{p-1}+u_{-}^{p-1}}{|x|^{N+sp}}\,\textrm{d}x\\&\le \varvec{\gamma }\frac{\varvec{\omega }^{p-1}}{R^{sp}}+\varvec{\gamma }\int _{K_{\widetilde{R}}\setminus K_R} \frac{u_{-}^{p-1}}{|x|^{N+sp}}\,\textrm{d}x+\varvec{\gamma }\int _{\mathbb {R}^N\setminus K_{\widetilde{R}}} \frac{u_{-}^{p-1}}{|x|^{N+sp}}\,\textrm{d}x\\&\le \varvec{\gamma }\frac{\varvec{\omega }^{p-1}}{R^{sp}}. \end{aligned}$$

The second integral, in turn, is estimated by using the simple fact that, for \(i=1,2,\cdots ,j\),

$$\begin{aligned} \big (u-\varvec{\mu }_j^{-}\big )_{-}\le \varvec{\mu }_j^{-} - \varvec{\mu }_{i-1}^{-}\le \varvec{\mu }_j^{+} - \varvec{\mu }_{i-1}^{-}\le \varvec{\mu }_{i-1}^{+} - \varvec{\mu }_{i-1}^{-} \le \varvec{\omega }_{i-1}\quad \text {a.e. in}\>Q_{i-1}. \end{aligned}$$

Namely, for any \(t\in (-\varvec{\omega }_j^{2-p}R_j^{sp},0)\),

$$\begin{aligned} \int _{K_{i-1}\setminus K_{i}} \frac{\big (u-\varvec{\mu }_j^{-}\big )_{-}^{p-1}}{|x|^{N+sp}}\,\textrm{d}x\le \varvec{\gamma } \frac{\varvec{\omega }_{i-1}^{p-1}}{R_i^{sp}}. \end{aligned}$$

Combine them and further estimate the tail by

$$\begin{aligned} \begin{aligned} \big [\textrm{Tail}\big ( \big (u-\varvec{\mu }_j^{-}\big )_{-}; Q_j\big )\big ]^{p-1}&\le \varvec{\gamma } R_j^{sp} \frac{\varvec{\omega }^{p-1}}{R^{sp}}+\varvec{\gamma } R_j^{sp} \sum _{i=1}^{j} \frac{\varvec{\omega }_{i-1}^{p-1}}{R_i^{sp}}\\&\le \varvec{\gamma } \varvec{\omega }_{j}^{p-1} \sum _{i=1}^{j} (1 - \eta )^{(j-i+1)(1-p)}\lambda ^{(j-i)sp}. \end{aligned} \end{aligned}$$

The summation in the last display is bounded by \(2(1-\eta )^{1-p}\) if we restrict the choice of \(\lambda \) by

$$\begin{aligned} (1-\eta )^{1-p}\lambda ^{sp}\le \tfrac{1}{2},\quad \text {i.e.}\quad \lambda \le 2^{-\frac{1}{sp}}(1-\eta )^{\frac{p-1}{sp}}. \end{aligned}$$
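
The bound on the sum can be checked directly. By construction, \(R_j=\lambda ^{j-i}R_i\) and \(\varvec{\omega }_{i-1}=(1-\eta )^{-(j-i+1)}\varvec{\omega }_j\), which is where the summands come from. Setting \(q:=(1-\eta )^{1-p}\lambda ^{sp}\le \tfrac{1}{2}\) (a shorthand introduced only for this remark) and \(k=j-i\), we find

$$\begin{aligned} \sum _{i=1}^{j} (1 - \eta )^{(j-i+1)(1-p)}\lambda ^{(j-i)sp} =(1-\eta )^{1-p}\sum _{k=0}^{j-1} q^{k} \le (1-\eta )^{1-p}\sum _{k=0}^{\infty } 2^{-k} =2(1-\eta )^{1-p}. \end{aligned}$$

The factor 2 is harmless, as it is absorbed into \(\varvec{\gamma }\), and the bound does not depend on j. The term stemming from \(\mathbb {R}^N\setminus K_R\) is covered as well: it equals \(\varvec{\gamma }(1-\eta )^{j(1-p)}\lambda ^{jsp}\varvec{\omega }_j^{p-1}\), which does not exceed \(\varvec{\gamma }\varvec{\omega }_j^{p-1}\) times the summand with \(i=1\) because \(\lambda <1\).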

Consequently, the tail estimate (4.9) is proven and (4.7) does not happen, if we choose c to verify

$$\begin{aligned} c^{sp}\varvec{\gamma } \le \eta ^{p-1},\quad \text {i.e.}\quad c\le \frac{1}{\varvec{\gamma }}\eta ^{\frac{p-1}{sp}}. \end{aligned}$$
(4.10)

The final choice of c is the smaller of the two bounds in (4.5) and (4.10).

Let \(R_{j+1}=\lambda R_j\) for some \(\lambda \in (0,1)\) to verify the set inclusion

$$\begin{aligned} Q_{R_{j+1}}\big (\varvec{\omega }_{j+1}^{2-p}\big ) \subset Q_{cR_j}\big (\tfrac{1}{2}\delta \big (\tfrac{1}{4} \varvec{\omega }_j\big )^{2-p}\big ),\quad \text {i.e.}\quad \lambda \le 2^{\frac{2p-5}{sp}}\delta ^{\frac{1}{sp}}c. \end{aligned}$$
(4.11)

Note that the choice of \(\lambda \) in the last display may have been adjusted from the previous one in (4.6) due to the possible change of c made in (4.10). The final choice of \(\lambda \) is

$$\begin{aligned} \lambda =\min \Big \{ 2^{-\frac{1}{sp}}(1-\eta )^{\frac{p-1}{sp}}, 2^{\frac{2p-5}{sp}}\delta ^{\frac{1}{sp}}c\Big \}. \end{aligned}$$

As a result of the inclusion (4.11) and (4.8) we obtain

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{Q_{R_{j+1}}(\varvec{\omega }_{j+1}^{2-p})}u\le \varvec{\omega }_{j+1}, \end{aligned}$$

which completes the induction argument. From now on, the deduction of a Hölder modulus of continuity becomes quite standard; cf. [9, Chapter III, Proposition 3.1].
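
For the reader's convenience, here is a sketch of that standard deduction in the present notation. The auxiliary ratio \(\widetilde{\lambda }\) is our own shorthand, and the resulting constants are not claimed to coincide with those of [9]. Since \(\varvec{\omega }_j=(1-\eta )^j\varvec{\omega }\), \(R_j=\lambda ^jR\) and \(p\le 2\), we have

$$\begin{aligned} \varvec{\omega }_j^{2-p}R_j^{sp}=\varvec{\omega }^{2-p}\big (\widetilde{\lambda }^{j}R\big )^{sp},\quad \text {where}\quad \widetilde{\lambda }:=(1-\eta )^{\frac{2-p}{sp}}\lambda \le \lambda . \end{aligned}$$

Hence \(Q_{\varrho }(\varvec{\omega }^{2-p})\subset Q_j\) whenever \(\varrho \le \widetilde{\lambda }^{j}R\). Given \(\varrho \in (0,R)\), pick the integer \(j\ge 0\) with \(\widetilde{\lambda }^{j+1}R<\varrho \le \widetilde{\lambda }^{j}R\) and define \(\beta \) through \(1-\eta =\widetilde{\lambda }^{\beta }\). Then

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{Q_{\varrho }(\varvec{\omega }^{2-p})} u\le \varvec{\omega }_j =(1-\eta )^{-1}\big (\widetilde{\lambda }^{j+1}\big )^{\beta }\varvec{\omega } \le (1-\eta )^{-1}\varvec{\omega }\left( \frac{\varrho }{R}\right) ^{\beta }, \end{aligned}$$

which is an estimate of the form asserted in Theorem 1.1 with \(\varvec{\gamma }=(1-\eta )^{-1}\); the exponent \(\beta \) can be decreased if necessary so that \(\beta \in (0,1)\).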

5 Proof of Theorem 1.1: \(p> 2\)

As in the previous section, we first introduce \(Q_{\widetilde{R}}\subset E_T\),

$$\begin{aligned} \varvec{\omega }=2\mathop {\mathrm{ess\,sup}}\limits _{Q_{\widetilde{R}}} |u| +\textrm{Tail}(u; Q_{\widetilde{R}}) \end{aligned}$$

and \(Q_o=Q_R(a\theta )\) for \(\theta =(\frac{1}{4}\varvec{\omega })^{2-p}\) and some \(a\in (0,1)\) to be determined. By properly shrinking R, we may assume that \(Q_o\subset Q_{\widetilde{R}}\) and set

$$\begin{aligned} \varvec{\mu }^+=\mathop {\mathrm{ess\,sup}}\limits _{Q_o}u, \qquad \varvec{\mu }^-=\mathop {\mathrm{ess\,inf}}\limits _{Q_o}u. \end{aligned}$$

Without loss of generality, we take \((x_o,t_o)=(0,0)\).

Then the following intrinsic relation holds:

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{Q_{R}\big (a\big (\frac{1}{4}\varvec{\omega }\big )^{2-p}\big )}u\le \varvec{\omega }. \end{aligned}$$
(5.1)

As before, the choice of \(Q_{\widetilde{R}}\) is made to ensure (5.1), on which the subsequent arguments are based.

Unlike the case \(1<p\le 2\), an expansion of positivity for the case \(p>2\) involves additional technical complications. To deal with this case, we instead refine DiBenedetto’s argument in [10]: on the one hand, the tail requires great care in this intrinsic scaling scenario; on the other hand, we dispense with any kind of logarithmic estimate and rely on the energy estimate alone.

5.1 The first alternative

In this section, we work with u as a super-solution near its infimum. Furthermore, we assume

$$\begin{aligned} \varvec{\mu }^+ -\varvec{\mu }^- >\tfrac{1}{2}\varvec{\omega }. \end{aligned}$$
(5.2)

The other case, \(\varvec{\mu }^+ -\varvec{\mu }^- \le \frac{1}{2}\varvec{\omega }\), will be considered later.

Suppose \(a,\, c\in (0,1)\) verify that \(a>2c^{sp}\) for the moment (which will be confirmed in (5.11) and (5.12)), and for some \(\bar{t}\in \big (-a\theta R^{sp}+\theta (cR)^{sp},0\big ]\), there holds

$$\begin{aligned} \Big |\Big \{u\le \varvec{\mu }^-+\tfrac{1}{4} \varvec{\omega }\Big \} \cap (0,\bar{t})+Q_{cR}(\theta )\Big |\le \nu |Q_{cR}(\theta )|, \end{aligned}$$
(5.3)

where \(\nu \) is the constant determined in Lemma 3.1 (with \(\delta =1\)) in terms of the data. According to Lemma 3.1 with \(\delta =1\), \(\xi =\frac{1}{4}\) and \(\varrho =cR\), we have either

$$\begin{aligned} c^{\frac{sp}{p-1}} \textrm{Tail}\big ( \big (u-\varvec{\mu }^{-}\big )_{-}; Q_o\big ) > \tfrac{1}{4} \varvec{\omega }, \end{aligned}$$
(5.4)

or

$$\begin{aligned} u\ge \varvec{\mu }^-+\tfrac{1}{8}\varvec{\omega } \quad \text{ a.e. } \text{ in } (0,\bar{t})+Q_{\frac{1}{2} cR}(\theta ). \end{aligned}$$
(5.5)

To proceed, we restrict c so that (5.4) does not happen. Indeed, since, according to the definition of \(\varvec{\omega }\), the tail can be easily estimated by (cf. (4.4))

$$\begin{aligned} \textrm{Tail}\big ( \big (u-\varvec{\mu }^{-}\big )_{-}; Q_o\big ) \le \varvec{\gamma }\varvec{\omega }, \end{aligned}$$

we impose

$$\begin{aligned} c^{\frac{sp}{p-1}} \varvec{\gamma }\varvec{\omega }\le \tfrac{1}{4}\varvec{\omega },\quad \text {i.e.}\quad c\le \left( \frac{1}{4\varvec{\gamma }}\right) ^{\frac{p-1}{sp}}. \end{aligned}$$
(5.6)

Note that this is not the final choice of c yet and it is subject to further smallness requirements in the course of the proof.

The pointwise estimate in (5.5) at \(t_*=\bar{t}- \theta (\tfrac{1}{2} cR)^{sp}\) allows us to apply Lemma 3.2 with \(\varrho =\tfrac{1}{2}cR\) and obtain that for some free parameter \(\xi _o\in (0,\tfrac{1}{8})\), either

$$\begin{aligned} (\tfrac{1}{2}c)^{\frac{sp}{p-1}} \textrm{Tail}\big ( \big (u-\varvec{\mu }^{-}\big )_{-}; Q_o\big ) > \xi _o \varvec{\omega }, \end{aligned}$$
(5.7)

or

$$\begin{aligned} u\ge \varvec{\mu }^-+\tfrac{1}{2}\xi _o \varvec{\omega } \quad \text{ a.e. } \text{ in } K_{\frac{1}{4}cR}\times \big (t_*, t_*+\nu _o(\xi _o\varvec{\omega })^{2-p}(\tfrac{1}{2}cR)^{sp}\big ]. \end{aligned}$$
(5.8)

The number \(\xi _o\) is chosen to fulfill

$$\begin{aligned} \nu _o(\xi _o\varvec{\omega })^{2-p}(\tfrac{1}{2}cR)^{sp}\ge a(\tfrac{1}{4}\varvec{\omega })^{2-p}R^{sp},\quad \text {i.e.}\quad \xi _o= \tfrac{1}{4}\left( \frac{\nu _o c^{sp}}{2^p a}\right) ^{\frac{1}{p-2}}. \end{aligned}$$
(5.9)
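
One can check that this choice indeed suffices. A short verification, using only \(\theta =(\tfrac{1}{4}\varvec{\omega })^{2-p}\) and \(s<1\), reads

$$\begin{aligned} \nu _o(\xi _o\varvec{\omega })^{2-p}(\tfrac{1}{2}cR)^{sp} =\nu _o\,\frac{2^p a}{\nu _o c^{sp}}\,(\tfrac{1}{4}\varvec{\omega })^{2-p}\,2^{-sp}c^{sp}R^{sp} =2^{(1-s)p}\,a\theta R^{sp}\ge a\theta R^{sp}. \end{aligned}$$

Moreover, since \(\bar{t}>-a\theta R^{sp}+\theta (cR)^{sp}\), the time level \(t_*=\bar{t}-\theta (\tfrac{1}{2}cR)^{sp}\) satisfies \(t_*>-a\theta R^{sp}\), and consequently \(t_*+\nu _o(\xi _o\varvec{\omega })^{2-p}(\tfrac{1}{2}cR)^{sp}>0\).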

In this way, the estimate (5.8) can be claimed up to \(t=0\) and yields the reduction of oscillation

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{Q_{\frac{1}{4}cR}(\theta )} u\le \big (1-\tfrac{1}{2}\xi _o \big )\varvec{\omega }. \end{aligned}$$
(5.10)

Note that in the dependence of \(\xi _o\), the constants a and c are still to be determined. For the moment, let us suppose c has been fixed. We then select a to ensure (5.7) does not occur. In fact, since the tail is bounded by \(\varvec{\gamma }\varvec{\omega }\) as before, we impose

$$\begin{aligned} \left( \tfrac{1}{2}c\right) ^{\frac{sp}{p-1}}\varvec{\gamma }\varvec{\omega }\le \xi _o \varvec{\omega }\equiv \tfrac{1}{4}\left( \frac{\nu _o c^{sp}}{2^p a}\right) ^{\frac{1}{p-2}} \varvec{\omega }, \end{aligned}$$

where we have employed the selection of \(\xi _o\) in (5.9). Consequently, the above display yields the relation of a and c, that is,

$$\begin{aligned} a=\frac{\nu _o}{\varvec{\gamma }}c^{\frac{sp}{p-1}}. \end{aligned}$$
(5.11)
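
The exponent in this relation can be traced as follows; this is a sketch in which \(\varvec{\gamma }\) changes from one occurrence to the next, as is customary for generic constants. Cancelling \(\varvec{\omega }\) and raising the imposed inequality to the power \(p-2>0\) gives

$$\begin{aligned} \varvec{\gamma }\, c^{\frac{sp(p-2)}{p-1}}\le \frac{\nu _o c^{sp}}{a},\quad \text {i.e.}\quad a\le \frac{\nu _o}{\varvec{\gamma }}c^{sp-\frac{sp(p-2)}{p-1}}=\frac{\nu _o}{\varvec{\gamma }}c^{\frac{sp}{p-1}}, \end{aligned}$$

and (5.11) simply takes the largest admissible value of a.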

Hence, by this choice, the estimate (5.7) does not occur and the reduction of oscillation (5.10) actually holds.

Moreover, the assumption \(a>2c^{sp}\) made at the beginning is verified, if we use (5.11) and further restrict c by

$$\begin{aligned} a=\frac{\nu _o}{\varvec{\gamma }}c^{\frac{sp}{p-1}}>2c^{sp},\quad \text {i.e.}\quad c<\left( \frac{\nu _o}{\varvec{\gamma }}\right) ^{\frac{p-1}{sp(p-2)}}. \end{aligned}$$
(5.12)

The number a will be eventually fixed via (5.11), once we determine c in the end.

5.2 The second alternative

In this section, we work with u as a sub-solution near its supremum. Suppose (5.3) does not hold for any \(\bar{t}\in \big (-a\theta R^{sp}+\theta (cR)^{sp},0\big ]\). Due to (5.2), this can be rephrased as

$$\begin{aligned} \Big |\Big \{\varvec{\mu }^+-u\ge \tfrac{1}{4} \varvec{\omega }\Big \} \cap (0,\bar{t})+Q_{cR}(\theta )\Big |> \nu |Q_{cR}(\theta )|. \end{aligned}$$
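
The rephrasing rests on a pointwise inclusion. Indeed, at any point where \(u\le \varvec{\mu }^-+\tfrac{1}{4}\varvec{\omega }\), assumption (5.2) gives

$$\begin{aligned} \varvec{\mu }^+-u\ge \varvec{\mu }^+-\varvec{\mu }^--\tfrac{1}{4}\varvec{\omega }>\tfrac{1}{2}\varvec{\omega }-\tfrac{1}{4}\varvec{\omega }=\tfrac{1}{4}\varvec{\omega }, \end{aligned}$$

so that \(\{u\le \varvec{\mu }^-+\tfrac{1}{4}\varvec{\omega }\}\subset \{\varvec{\mu }^+-u\ge \tfrac{1}{4}\varvec{\omega }\}\), and the failure of (5.3) passes to the larger set.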

Based on this, it is not hard to find some \(t_*\in \big [\bar{t}-\theta (cR)^{sp}, \bar{t}-\tfrac{1}{2}\nu \theta (cR)^{sp}\big ]\), such that

$$\begin{aligned} \Big |\Big \{\varvec{\mu }^+-u(\cdot , t_*)\ge \tfrac{1}{4} \varvec{\omega }\Big \} \cap K_{cR} \Big |> \tfrac{1}{2}\nu |K_{cR}|. \end{aligned}$$

Indeed, if the above inequality were not to hold for any s in the given interval, then

$$\begin{aligned} \Big |\Big \{\varvec{\mu }^+-u\ge \tfrac{1}{4}\varvec{\omega }\Big \}\cap (0,\bar{t})+Q_{cR}(\theta )\Big |&= \int _{\bar{t}-\theta (cR)^{sp}}^{\bar{t} -\frac{1}{2}\nu \theta (cR)^{sp}} \Big |\Big \{\varvec{\mu }^+-u(\cdot , s)\ge \tfrac{1}{4}\varvec{\omega }\Big \}\cap K_{cR}\Big |\,\textrm{d}s\\&\quad +\int ^{\bar{t}}_{\bar{t}-\frac{1}{2}\nu \theta (cR)^{sp}} \Big |\Big \{\varvec{\mu }^+-u(\cdot , s)\ge \tfrac{1}{4}\varvec{\omega }\Big \}\cap K_{cR}\Big |\,\textrm{d}s\\&\le \tfrac{1}{2}\nu |K_{cR}|\theta (cR)^{sp}\big (1-\tfrac{1}{2}\nu \big ) +\tfrac{1}{2}\nu \theta (cR)^{sp}|K_{cR} |\\&<\nu | Q_{cR}(\theta )|, \end{aligned}$$

which would yield a contradiction.
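
The arithmetic behind the last two lines of the display is elementary. The first time interval has length \((1-\tfrac{1}{2}\nu )\theta (cR)^{sp}\) and the second has length \(\tfrac{1}{2}\nu \theta (cR)^{sp}\); on the second one the measure of the set is simply bounded by \(|K_{cR}|\). Since \(|Q_{cR}(\theta )|=|K_{cR}|\theta (cR)^{sp}\), the two contributions add up to

$$\begin{aligned} \big [\tfrac{1}{2}\nu \big (1-\tfrac{1}{2}\nu \big )+\tfrac{1}{2}\nu \big ]|Q_{cR}(\theta )| =\nu \big (1-\tfrac{1}{4}\nu \big )|Q_{cR}(\theta )|<\nu |Q_{cR}(\theta )|. \end{aligned}$$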

Starting from this measure theoretical information, we may apply Lemma 3.3 (with \(\alpha =\tfrac{1}{2}\nu \) and \(\varrho =cR\)) to obtain \(\delta \) and \(\varepsilon \) depending on the data and \(\nu \), such that, for some free parameter \(\xi _1\in (0,\frac{1}{4})\), either

$$\begin{aligned} c^{\frac{sp}{p-1}} \textrm{Tail}\big ( \big (u-\varvec{\mu }^{+}\big )_{+}; Q_o\big ) >\xi _1\varvec{\omega }, \end{aligned}$$
(5.13)

or

$$\begin{aligned} \Big |\Big \{ \varvec{\mu }^{+}-u(\cdot , t) \ge \varepsilon \xi _1\varvec{\omega }\Big \} \cap K_{cR} \Big | \ge \frac{\alpha }{2} |K_{cR}| \> \text{ for } \text{ all } t\in \big (t_*,t_*+\delta (\xi _1\varvec{\omega })^{2-p}(cR)^{sp}\big ]. \end{aligned}$$
(5.14)

The number \(\xi _1\) is chosen to satisfy

$$\begin{aligned} \delta (\xi _1\varvec{\omega })^{2-p}(cR)^{sp}\ge \theta (cR)^{sp},\quad \text {i.e.}\quad \xi _1=\tfrac{1}{4}\delta ^{\frac{1}{p-2}}. \end{aligned}$$
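
With this choice the two time lengths in fact coincide, as a one-line check shows:

$$\begin{aligned} \delta (\xi _1\varvec{\omega })^{2-p}=\delta \cdot (\tfrac{1}{4})^{2-p}\delta ^{-1}\varvec{\omega }^{2-p}=(\tfrac{1}{4}\varvec{\omega })^{2-p}=\theta . \end{aligned}$$

Because \(t_*\ge \bar{t}-\theta (cR)^{sp}\), the time interval in (5.14) therefore reaches at least the level \(\bar{t}\).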

In this way, the measure theoretical information (5.14) can be claimed up to the time level \(\bar{t}\). The constant c is again chosen so small that (5.13) does not happen. Indeed, according to the definition of \(\varvec{\omega }\), just like in (4.4) the tail can be easily estimated by

$$\begin{aligned} \textrm{Tail}\big ( \big (u-\varvec{\mu }^{+}\big )_{+}; Q_o\big ) \le \varvec{\gamma }\varvec{\omega }. \end{aligned}$$

A simple calculation as before then gives

$$\begin{aligned} c\le \left( \frac{\xi _1}{\varvec{\gamma }}\right) ^{\frac{p-1}{sp}}. \end{aligned}$$
(5.15)

Consequently, the measure theoretical information (5.14) yields

$$\begin{aligned} \Big |\Big \{ \varvec{\mu }^{+}-u(\cdot , t) \ge \varepsilon \xi _1\varvec{\omega }\Big \} \cap K_{cR} \Big | \ge \frac{\alpha }{2} |K_{cR}| \> \text{ for } \text{ all } t\in \big (-a\theta R^{sp}+\theta (cR)^{sp},0\big ], \end{aligned}$$
(5.16)

thanks to the arbitrariness of \(\bar{t}\).

Given (5.16), we want to apply Lemma 3.4 with \(\delta =1\), \(\xi =\varepsilon \xi _1\) and \(\varrho =cR\) next. To this end, we first let \(\nu \) be fixed in Lemma 3.1 (with \(\delta =1\)) and choose \(\sigma \in (0,\tfrac{1}{2})\) to satisfy

$$\begin{aligned} \varvec{\gamma } \frac{\sigma ^{p-1}}{\alpha }\le \nu . \end{aligned}$$

This choice is possible because \(\varvec{\gamma }\) of Lemma 3.4 is independent of \(\sigma \). Then we use (5.11) and restrict c further to satisfy

$$\begin{aligned} a\theta R^{sp}-\theta (cR)^{sp}\ge (\sigma \varepsilon \xi _1\varvec{\omega })^{2-p}(cR)^{sp},\quad \text {i.e.}\quad c\le \left( \frac{\nu _o}{\varvec{\gamma }}\right) ^{\frac{p-1}{sp(p-2)}}(\sigma \varepsilon \xi _1)^{\frac{p-1}{sp}}. \end{aligned}$$
(5.17)
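
The restriction on c can be traced as follows; this is a sketch with a generic \(\varvec{\gamma }\) that may change from one occurrence to the next. Since \(a>2c^{sp}\), the left-hand side of the required inequality is at least \(\tfrac{1}{2}a\theta R^{sp}\), so it suffices to have

$$\begin{aligned} \tfrac{1}{2}a(\tfrac{1}{4}\varvec{\omega })^{2-p}R^{sp}\ge (\sigma \varepsilon \xi _1\varvec{\omega })^{2-p}(cR)^{sp},\quad \text {which holds if}\quad a\ge 2(\sigma \varepsilon \xi _1)^{2-p}c^{sp}, \end{aligned}$$

where we used \(4^{p-2}\ge 1\). Inserting \(a=\frac{\nu _o}{\varvec{\gamma }}c^{\frac{sp}{p-1}}\) from (5.11) and noting that \(sp-\frac{sp}{p-1}=\frac{sp(p-2)}{p-1}\), this becomes

$$\begin{aligned} c^{\frac{sp(p-2)}{p-1}}\le \frac{\nu _o}{\varvec{\gamma }}(\sigma \varepsilon \xi _1)^{p-2}, \end{aligned}$$

and taking the power \(\frac{p-1}{sp(p-2)}\) on both sides gives a condition of the form stated in (5.17).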

In this way, the measure theoretical information (5.16) gives that

$$\begin{aligned} \Big |\Big \{ \varvec{\mu }^{+}-u(\cdot , t)\ge \varepsilon \xi _1\varvec{\omega } \Big \}\cap K_{cR}\Big | \ge \frac{\alpha }{2} \big |K_{cR}\big |\quad \text{ for } \text{ all } t\in \big ( -(\sigma \varepsilon \xi _1\varvec{\omega })^{2-p}(cR)^{sp}, 0\big ], \end{aligned}$$

which allows us to implement Lemma 3.4. Namely, we have either

$$\begin{aligned} c^{\frac{sp}{p-1}} \textrm{Tail}\big ( \big (u-\varvec{\mu }^{+}\big )_{+}; Q_o\big ) >\sigma \varepsilon \xi _1\varvec{\omega } \end{aligned}$$
(5.18)

or

$$\begin{aligned} \Big |\Big \{ \varvec{\mu }^{+}-u \le \sigma \varepsilon \xi _1\varvec{\omega } \Big \}\cap Q_{cR}({\widetilde{\theta }})\Big | \le \nu \big |Q_{cR}({\widetilde{\theta }})\big |,\quad \text {where}\>{\widetilde{\theta }}=(\sigma \varepsilon \xi _1\varvec{\omega })^{2-p}. \end{aligned}$$

By Lemma 3.1 (with \(\delta =1\)), the last display yields

$$\begin{aligned} \varvec{\mu }^{+}-u \ge \tfrac{1}{2}\sigma \varepsilon \xi _1\varvec{\omega }\quad \text {a.e. in}\>Q_{\frac{1}{2}cR}({\widetilde{\theta }}), \end{aligned}$$

which in turn gives the reduction of oscillation

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{Q_{\frac{1}{2}cR}({\widetilde{\theta }})} u\le \big (1- \tfrac{1}{2}\sigma \varepsilon \xi _1\big )\varvec{\omega }. \end{aligned}$$
(5.19)

Once again, we may restrict the choice of c to ensure that (5.18) does not happen. Indeed, according to the definition of \(\varvec{\omega }\), just like in (4.4) the tail can be easily estimated by

$$\begin{aligned} \textrm{Tail}\big ( \big (u-\varvec{\mu }^{+}\big )_{+}; Q_o\big ) \le \varvec{\gamma }\varvec{\omega }. \end{aligned}$$

By a calculation similar to the previous ones, this amounts to requiring

$$\begin{aligned} c\le \left( \frac{\sigma \varepsilon \xi _1}{\varvec{\gamma }}\right) ^{\frac{p-1}{sp}}. \end{aligned}$$
(5.20)

Combining (5.10) and (5.19), we arrive at

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{Q_{\frac{1}{4}cR}( \theta )} u\le \big (1- \eta \big )\varvec{\omega }=:\varvec{\omega }_1, \end{aligned}$$
(5.21)

where

$$\begin{aligned} \eta =\min \big \{\tfrac{1}{2}\xi _o, \tfrac{1}{2}\sigma \varepsilon \xi _1\big \}. \end{aligned}$$

Now, set \(\theta _1=(\tfrac{1}{4}\varvec{\omega }_1)^{2-p}\) and \(R_1=\tfrac{1}{4} c R\). To prepare the induction, we need to verify the set inclusion

$$\begin{aligned} Q_{R_1}(a\theta _1)\subset Q_{\frac{1}{4}cR}( \theta ),\quad \text {i.e.}\quad a\le \big [\tfrac{1}{4}(1-\eta )\big ]^{p-2}. \end{aligned}$$

This, by the choice of a in (5.11), amounts to further requiring the smallness of c.
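
For orientation, here is a sketch of the comparison behind this inclusion. Both cylinders have the same spatial radius \(R_1=\tfrac{1}{4}cR\), so only the time lengths need to be compared, and

$$\begin{aligned} a\theta _1R_1^{sp}\le \theta R_1^{sp}\quad \Longleftrightarrow \quad a\le \frac{\theta }{\theta _1}=\left( \frac{\varvec{\omega }_1}{\varvec{\omega }}\right) ^{p-2}=(1-\eta )^{p-2}. \end{aligned}$$

The condition displayed above is stronger, since \(\big [\tfrac{1}{4}(1-\eta )\big ]^{p-2}\le (1-\eta )^{p-2}\) for \(p>2\), and is hence sufficient.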

As a result of this inclusion and (5.21), we obtain

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{Q_{R_1}( a\theta _1)} u\le \varvec{\omega }_1, \end{aligned}$$

which takes the place of (5.1) in the next stage. Note that this oscillation estimate also takes into account the reverse case of (5.2).

5.3 The induction

Now we may proceed by induction.

Suppose up to \(i=1,\cdots , j\), we have built

$$\begin{aligned} \left\{ \begin{array}{c} \displaystyle R_o=R,\quad R_i=\tfrac{1}{4}c R_{i-1}, \quad \varvec{\omega }_i=(1-\eta )\varvec{\omega }_{i-1},\\ \theta _i=\left( \tfrac{1}{4}\varvec{\omega }_i\right) ^{2-p}, \quad Q_i=Q_{R_i}(a\theta _i),\\ \displaystyle \varvec{\mu }_i^+=\mathop {\mathrm{ess\,sup}}\limits _{Q_i}u, \quad \varvec{\mu }_i^-=\mathop {\mathrm{ess\,inf}}\limits _{Q_i}u, \quad \mathop {\mathrm{ess\,osc}}\limits _{Q_i}u\le \varvec{\omega }_i. \end{array} \right. \end{aligned}$$

The induction argument will show that the above oscillation estimate continues to hold for the \((j+1)\)-th step.

As in the proof for \(1<p\le 2\), we can repeat all the previous arguments, now adapted to \(\varvec{\mu }^{\pm }_j\), \(\varvec{\omega }_j\), \(R_j\), \(\theta _j\), \(Q_j\), etc. In the end, we have a reduction of oscillation parallel to (5.21), that is,

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{Q_{\frac{1}{4}cR_j}( \theta _j)} u\le \big (1- \eta \big )\varvec{\omega }_j=:\varvec{\omega }_{j+1}. \end{aligned}$$
(5.22)

Now, setting \(\theta _{j+1}=(\tfrac{1}{4}\varvec{\omega }_{j+1})^{2-p}\) and \(R_{j+1}=\tfrac{1}{4} c R_j\), it is straightforward to verify the set inclusion

$$\begin{aligned} Q_{R_{j+1}}(a\theta _{j+1})\subset Q_{\frac{1}{4}cR_j}( \theta _j), \end{aligned}$$

which, by (5.22), implies

$$\begin{aligned} \mathop {\mathrm{ess\,osc}}\limits _{Q_{R_{j+1}}( a\theta _{j+1})} u \le \varvec{\omega }_{j+1}. \end{aligned}$$

A key step lies in determining c as done in (5.6), (5.12), (5.15), (5.17) and (5.20), so that the alternative involving the tail never occurs in the course of the argument, and hence (5.22) can be reached. This hinges upon the following estimate of the tail:

$$\begin{aligned} \textrm{Tail}\big (\big (u - \varvec{\mu }_j^{\pm }\big )_{\pm }; Q_j\big ) \le \varvec{\gamma }\varvec{\omega }_j. \end{aligned}$$

The computations leading to the above tail estimate can be performed like those leading to (4.9). We omit the details to avoid repetition. After the number c is determined independently of j, the number a is finally chosen via the relation (5.11). Hence the induction is completed and the derivation of a Hölder modulus of continuity follows.