1 Introduction

1.1 Background

Let \(a_{i,j} : [0,T] \times \mathbb {R}^d \rightarrow (0,\infty )\), \(i,j = 1,\dots ,d\), be bounded, measurable coefficients which satisfy the usual uniform ellipticity condition. A celebrated result by D.G. Aronson from 1967 says that the fundamental solution \(\Gamma (y,s;x,\eta )\) to the equation

$$\begin{aligned} \partial _t u - \partial _i(a_{i,j}\partial _j u) = 0,~~ \text { in } (\eta ,T) \times \mathbb {R}^d, \end{aligned}$$
(1.1)

satisfies the following two-sided estimate for all \(0 \le \eta< s < T\), and \(x,y \in \mathbb {R}^d\):

$$\begin{aligned} c_1 (s-\eta )^{-\frac{d}{2}}e^{-c_2 \frac{\vert x -y \vert ^2}{s - \eta }} \le \Gamma (y,s;x,\eta ) \le c_3 (s-\eta )^{-\frac{d}{2}}e^{-c_4 \frac{\vert x -y \vert ^2}{s - \eta }}, \end{aligned}$$
(1.2)

where \(c_1,c_2,c_3,c_4 > 0\) depend only on d and the ellipticity constants. In other words, the fundamental solution of the classical heat equation \(\partial _t u - \Delta u = 0\) bounds \(\Gamma \) from above and below, up to multiplicative constants in front of and inside the exponential, see [1, 2]. In this sense, the bounds (1.2) are robust in the class of second order divergence form operators with bounded, measurable, uniformly elliptic coefficients.

Aronson’s proof is closely related to the so-called De Giorgi-Nash-Moser theory for parabolic differential operators of second order with bounded, measurable and uniformly elliptic coefficients. It relies heavily on Hölder regularity estimates and the parabolic Harnack inequality for solutions to (1.1), see also [3].

The work [2] initiated numerous studies of estimates for fundamental solutions to parabolic equations in various contexts. An important feature of this line of research is that it connects partial differential equations with geometry, owing to the sensitivity of the heat kernel to the geometric properties of the underlying space. This phenomenon becomes apparent in the celebrated works [18, 41], where the method of Aronson was generalized to prove heat kernel estimates on complete Riemannian manifolds with nonnegative Ricci curvature. Some of their arguments have been further refined and generalized in [20], where an integral estimate for the heat kernel was established that is useful for proving the upper bound in (1.2). We refer the interested reader to [32, 39, 50] and the references therein for more detailed expositions of this topic.

1.2 Main results

The goal of this article is to extend Aronson’s proof of upper heat kernel estimates to integro-differential operators of the form

$$\begin{aligned} L_t u (x) = \text { p.v.} \int _{\mathbb {R}^d} (u(y)-u(x))k(t;x,y) \text {d}y, ~~ t \in (0,T),~ x \in \mathbb {R}^d. \end{aligned}$$

Such operators are determined by a jumping kernel \(k : (0,T) \times \mathbb {R}^d \times \mathbb {R}^d \rightarrow [0,\infty ]\) which is assumed to be symmetric, i.e., \(k(t;x,y) = k(t;y,x)\), and to satisfy the pointwise upper bound

$$\begin{aligned} k(t;x,y) \le \Lambda \vert x-y \vert ^{-d-\alpha }, ~~ t \in (0,T), ~ x,y \in \mathbb {R}^d, \end{aligned}$$
\((k_{\le })\)

for some given constant \(\Lambda > 0\), and \(\alpha \in (0,2)\), \(0 < T \le \infty \).

Moreover, we assume that there is \(\lambda > 0\) such that for every \(t \in (0,T)\), any ball \(B \subset \mathbb {R}^d\) and every \(v \in H^{\alpha /2}(B)\):

$$\begin{aligned} \int _{B} \int _{B} (v(x)-v(y))^2 k(t;x,y) \text {d}y \text {d}x \ge \lambda \int _{B} \int _{B} \frac{(v(x)-v(y))^2}{\vert x-y \vert ^{d+\alpha }} \text {d}y \text {d}x. \end{aligned}$$
\((\mathcal {E}_{\ge })\)

\(\mathcal {E}_{\ge }\) can be thought of as a coercivity assumption on k and is substantially weaker than a pointwise lower bound. We refer the reader to Sect. 2 for a more detailed discussion and to Sect. 4, where we explain a possible extension of our method and replace \(\mathcal {E}_{\ge }\) by a Faber-Krahn inequality.

We are now ready to state the main result of this article in the aforementioned setup. For a possible extension to doubling metric measure spaces and jumping kernels of mixed type, we refer to Theorem 4.1.

Theorem 1.1

Let \(k : (0,T) \times \mathbb {R}^d \times \mathbb {R}^d \rightarrow [0,\infty ]\) be symmetric and assume \(k_{\le }\), \(\mathcal {E}_{\ge }\). Let \(p(y,s;x,\eta )\) be the fundamental solution to the equation

$$\begin{aligned} \partial _t u - L_t u = 0, ~~ \text { in } (\eta ,T) \times \mathbb {R}^d, \end{aligned}$$
(1.3)

where \(\eta \in [0,T)\). Then there exists a constant \(c > 0\) depending on \(d,\alpha ,\lambda ,\Lambda \) such that for every \(0 \le \eta< s < T\), and \(x,y \in \mathbb {R}^d\):

$$\begin{aligned} p(y,s;x,\eta ) \le c(s-\eta )^{-\frac{d}{\alpha }}\left( 1 + \frac{\vert x-y \vert ^{\alpha }}{s-\eta }\right) ^{-\frac{d+\alpha }{\alpha }}. \end{aligned}$$
(1.4)

Estimate (1.4) states that the fundamental solution to \(\partial _t u - L_t u = 0\) possesses the same upper bound as the fundamental solution to the fractional heat equation \(\partial _t u + (-\Delta )^{\alpha /2} u = 0\), see [8]. For the latter equation, an explicit formula for the fundamental solution is known only in the case \(\alpha = 1\):

$$\begin{aligned} p(y,s;x,\eta ) = \frac{\Gamma \left( \frac{d+1}{2} \right) }{\pi ^{\frac{d+1}{2}}} \frac{s-\eta }{\left( (s-\eta )^2 + |x-y|^2\right) ^{\frac{d+1}{2}}}. \end{aligned}$$
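
This formula is consistent with (1.4): writing \(t = s-\eta \) and \(r = \vert x-y \vert \), and using that \((t^2+r^2)^{1/2}\) and \(t+r\) are comparable up to a factor \(\sqrt{2}\),

$$\begin{aligned} \frac{t}{(t^2+r^2)^{\frac{d+1}{2}}} \asymp \frac{t}{(t+r)^{d+1}} = t^{-d} \left( 1 + \frac{r}{t} \right) ^{-(d+1)}, \end{aligned}$$

which is the right-hand side of (1.4) with \(\alpha = 1\), up to constants; in particular, the on-diagonal value is comparable to \(t^{-d}\).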

It is well known that corresponding lower bounds do not hold under \(k_{\le }\) and \(\mathcal {E}_{\ge }\), since the latter condition does not rule out that k vanishes in certain cones of directions (see [34]).

Different versions of Theorem 1.1 already exist in the literature. Let us give a brief account of the history of heat kernel bounds for nonlocal operators. Two-sided estimates of the form (1.4) were established in [11, 13] using a probabilistic approach. There, the jumping kernel is assumed to be pointwise comparable to \(\vert x-y \vert ^{-d-\alpha }\), \(\alpha \in (0,2)\), from above and below. The analysis of the upper heat kernel estimate heavily relies on [16], where Davies’ method was extended to a more general setup, including jump processes. Building upon this, [9] derived (1.4) assuming \(k_{\le }\) and a Nash inequality.

In a series of articles, see [28,29,30], the analysis of heat kernel estimates was extended to metric measure spaces with walk dimension greater than 2. The authors were able to characterize upper heat kernel estimates, as well as two-sided heat kernel estimates in terms of equivalent conditions on the jumping kernels and the geometry of the underlying space. Their approach does not use the underlying stochastic process and is based on certain comparison inequalities of the corresponding heat semigroups relying on the parabolic maximum principle. Note that Davies’ method was extended to jumping kernels with jump index \(\alpha >2\) in [33, 48]. The aforementioned results assume certain homogeneity of the doubling measure space and do not deal with mixed-type jumping kernels.

In [14, 15, 17], upper and two-sided heat kernel estimates were investigated on doubling metric measure spaces for jumping kernels of mixed type. This approach also applies to cases where \(\alpha \ge 2\). We would like to draw the reader’s attention to Theorem 1.15 in [17]. In the case \(\alpha < 2\), it states that upper heat kernel estimates of the form (1.4) are equivalent to a pointwise upper bound on the jumping kernel together with a Faber-Krahn inequality, which can be understood as an implicit lower bound.

A major difference between our approach and [17] is that our method relies on purely analytic arguments, while [17] makes essential use of the corresponding stochastic process. In Theorem 4.1 we extend our approach to doubling metric measure spaces and jumping kernels of mixed type. Let us mention that we prove on-diagonal heat kernel estimates with the help of a parabolic \(L^\infty - L^1\)-estimate, see Lemma 4.2. This rather straightforward approach allows us to avoid truncation methods and the usage of the iteration techniques of [35].

In contrast to our setup, all jumping kernels in the results discussed above are assumed to be time-homogeneous. Note that it would require substantial effort to extend methods based on stochastic processes to situations with time-dependent jumping kernels. Heat kernel estimates for time-inhomogeneous jumping kernels were established in [43, 44] where the authors assume pointwise upper and lower bounds on the jumping kernel. Note that assuming pointwise lower bounds is more restrictive than \(\mathcal {E}_{\ge }\). The focus of these works lies on the treatment of an additional divergence-free drift of first order.

There are further results on heat kernel estimates for nonlocal operators, which are related to Theorem 1.1. For example, sharp two-sided estimates for jump processes on \(\mathbb {R}^d\) with upper scaling index not strictly less than 2 are established in [10]. In [36], heat kernel estimates for a certain class of jump processes with singular jumping measures are proved.

1.3 Strategy of proof

A main insight of Aronson’s proof for second order differential operators is the observation that solutions u to the Cauchy problem (1.1) satisfy the weighted \(L^2\)-estimate

$$\begin{aligned} \sup _{t \in (\eta ,s)} \int _{\mathbb {R}^d} H(t,x) u^2(t,x) \text {d}x \le \int _{\mathbb {R}^d} H(\eta ,x) u_0^2(x) \text {d}x \end{aligned}$$
(1.5)

for \(0 \le \eta< s < T\), whenever H satisfies

$$\begin{aligned} C \vert \nabla H^{1/2} \vert ^2 \le -\partial _t H, ~~ \text { in } (\eta ,s) \times \mathbb {R}^d, \end{aligned}$$
(1.6)

for a given number \(C > 0\) depending on the ellipticity constants. (1.6) is closely related to the famous Li-Yau inequality:

$$\begin{aligned} \vert \nabla \log w \vert ^2 \le \frac{d}{2 t} + \partial _t \log w. \end{aligned}$$
(1.7)

In fact, a direct computation reveals that the Gauss-Weierstrass kernel \(w(t,x) = t^{-\frac{d}{2}}e^{-\frac{\vert x \vert ^2}{4t}}\) satisfies (1.7) with equality. By a scaling argument, it becomes evident that (1.6) holds true for \(H(t,x) = (C[t])^{\frac{d}{2}} w(C[t],x-y) = \exp (-\frac{\vert x-y \vert ^2}{4 C [t]})\), where \([t] {:}{=} 2(s-\eta ) - (t-\eta )\) and \(y \in \mathbb {R}^d\) can be chosen arbitrarily.
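
For completeness, here is that computation (a two-line verification using only the explicit form of w): from \(\log w(t,x) = -\frac{d}{2}\log t - \frac{\vert x \vert ^2}{4t}\) we get

$$\begin{aligned} \vert \nabla \log w \vert ^2 = \left| -\frac{x}{2t} \right| ^2 = \frac{\vert x \vert ^2}{4t^2}, \qquad \partial _t \log w = -\frac{d}{2t} + \frac{\vert x \vert ^2}{4t^2}, \end{aligned}$$

so that \(\frac{d}{2t} + \partial _t \log w = \frac{\vert x \vert ^2}{4t^2} = \vert \nabla \log w \vert ^2\), i.e., (1.7) holds with equality for w.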

This insight suggests that some qualitative information on the decay of solutions to (1.1) is encoded in the weighted \(L^2\)-estimate (1.5). Indeed, by combining (1.5) with a localized \(L^{\infty }-L^2\)-estimate, as proved by Moser [45,46,47], one can estimate the value of the solution to the Cauchy problem at the center of a ball that lies outside the support of the initial data (see Theorem 3.2). From such an estimate, it is not difficult to deduce

$$\begin{aligned} \left( \int _{\mathbb {R}^d \setminus B_{\sigma }(y)} \Gamma ^2(y,s;z,\eta ) \text {d}z\right) ^{1/2} \le c (s-\eta )^{-\frac{d}{4}}e^{-\frac{\sigma ^2}{32C (s-\eta )}} \end{aligned}$$

for every \(\sigma > 0\) and \(0 \le \eta< s < T\) with \(s-\eta \le \sigma ^2\). Together with the on-diagonal estimate \(\Gamma (y,s;x,\eta ) \le c(s-\eta )^{-\frac{d}{2}}\), one deduces the upper bound in (1.2) via a standard argument.
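
For the reader’s convenience, we sketch this standard argument (with non-optimized constants): let \(\sigma {:}{=} \vert x-y \vert /2\) and \(t {:}{=} (s+\eta )/2\). Since every \(z \in \mathbb {R}^d\) satisfies \(\vert z-x \vert \ge \sigma \) or \(\vert z-y \vert \ge \sigma \), the reproducing (Chapman-Kolmogorov) property of \(\Gamma \) and the Cauchy-Schwarz inequality yield

$$\begin{aligned} \Gamma (y,s;x,\eta )&= \int _{\mathbb {R}^d} \Gamma (y,s;z,t) \Gamma (z,t;x,\eta ) \text {d}z\\&\le \left( \int _{\mathbb {R}^d \setminus B_{\sigma }(y)} \Gamma ^2(y,s;z,t) \text {d}z \right) ^{1/2} \left( \int _{\mathbb {R}^d} \Gamma ^2(z,t;x,\eta ) \text {d}z \right) ^{1/2} + \left( \text {the term with } x \text { and } y \text { interchanged} \right) . \end{aligned}$$

The first factor carries the exponential decay from the off-diagonal \(L^2\)-estimate, while \(\int _{\mathbb {R}^d} \Gamma ^2(z,t;x,\eta ) \text {d}z \le \left( \sup _z \Gamma (z,t;x,\eta ) \right) \int _{\mathbb {R}^d} \Gamma (z,t;x,\eta ) \text {d}z \le c(t-\eta )^{-\frac{d}{2}}\) by the on-diagonal estimate and the conservation of mass.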

We would like to point out that (1.7) is at the core of the works [18, 41], where it was used to derive a parabolic Harnack inequality on Riemannian manifolds with nonnegative Ricci curvature and to establish its equivalence to Gaussian heat kernel bounds.

Next, let us summarize how we adapt Aronson’s proof to integro-differential operators. First, we require a nonlocal analog of (1.5). We prove that there exist functions H that satisfy

$$\begin{aligned} C \Gamma ^{\alpha }_{\rho } (H^{1/2},H^{1/2}) \le -\partial _t H, ~~ \text { in } (\eta ,s) \times \mathbb {R}^d \end{aligned}$$
(1.8)

given \(C,\rho > 0\). Here, \(\Gamma ^{\alpha }_{\rho }\) denotes the \(\rho \)-truncated carré du champ operator of order \(\alpha \in (0,2)\), which is defined as follows:

$$\begin{aligned} \Gamma ^{\alpha }_{\rho }(f,f)(x) = \int _{B_{\rho }(x)} (f(x)-f(y))^2 \vert x-y \vert ^{-d-\alpha }\text {d}y, ~~ x \in \mathbb {R}^d. \end{aligned}$$
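
As a sanity check on this definition (and a fact we use only for orientation): for a Lipschitz function f with constant L, the truncation makes \(\Gamma ^{\alpha }_{\rho }(f,f)\) finite for every \(\alpha \in (0,2)\), since

$$\begin{aligned} \Gamma ^{\alpha }_{\rho }(f,f)(x) \le L^2 \int _{B_{\rho }(x)} \vert x-y \vert ^{2-d-\alpha } \text {d}y = \frac{L^2 \omega _{d-1}}{2-\alpha } \rho ^{2-\alpha }, \end{aligned}$$

where \(\omega _{d-1}\) denotes the surface measure of the unit sphere in \(\mathbb {R}^d\); without the truncation, the corresponding integral over \(\mathbb {R}^d\) may diverge for unbounded Lipschitz functions.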

By a careful choice of a function H satisfying (1.8), we deduce a nonlocal analog of (1.5) for solutions to the \(\rho \)-truncated Cauchy problem, see Lemma 3.1. The corresponding integro-differential operator only takes into account differences up to distance \(\rho \). In order to prove an a priori bound for the corresponding fundamental solution \(p_{\rho }\) (see Theorem 3.4), we derive a parabolic \(L^{\infty }-L^2\)-estimate in the spirit of [51], see Lemma 2.4. The estimate involves a nonlocal, truncated tail-term which requires special treatment. In a final step, we obtain the desired upper heat kernel estimate (1.4) for p by gluing together short and long jumps. Such an argument is by now standard in the theory of jump processes.

Last, we explain several difficulties that occur when avoiding the detour via the truncated jumping kernel.

First, the corresponding \(L^{\infty }-L^2\)-estimate involves a non-truncated tail-term which cannot be controlled without any further assumptions on k.

Second, finding suitable weight functions H which satisfy a non-truncated version of (1.8) is a challenging task in the light of the following observation: The corresponding nonlocal analog of the Li-Yau inequality, which would imply an estimate of the form (1.5) for solutions to (1.3), reads as follows:

$$\begin{aligned} \frac{\Gamma ^{\alpha }(w_{\alpha }^{1/2},w_{\alpha }^{1/2})}{w_{\alpha }} \le \frac{d}{\alpha t} + \partial _t \log (w_{\alpha }). \end{aligned}$$
(1.9)

However, one can show that the fundamental solution \(w_{\alpha }(t,x)\) to \(\partial _t u + (-\Delta )^{\alpha /2}u = 0\) does not satisfy (1.9). Let us give a quick proof of this fact. First, note that \(w_{\alpha }\) is a radial function and satisfies

$$\begin{aligned} \frac{d}{\alpha t} w_{\alpha } + \partial _t w_{\alpha } = -\frac{\vert x \vert }{\alpha t} \partial _{\vert x \vert } w_{\alpha }. \end{aligned}$$

For a proof of this identity, we refer to (2.5) in [52]. Consequently,

$$\begin{aligned} \frac{d}{\alpha t} + \partial _t \log w_{\alpha }(t,0) = 0, \end{aligned}$$

but this contradicts (1.9) since \(\Gamma ^{\alpha }(w_{\alpha }^{1/2},w_{\alpha }^{1/2})(t,0) > 0\) for \(t > 0\).

We would like to point out that modifications of the estimates (1.6) and (1.7) also hold true in the context of the porous medium equation. Such estimates are known as Aronson-Bénilan estimates (see [4, 40]). For a discussion of nonlinear fractional diffusion equations of porous medium type, we refer the interested reader to [12, 24, 53].

1.4 Outline

This article is organized as follows. In Sect. 2 we present several auxiliary results that are needed in our proof. Section 3 contains the derivation of the upper heat kernel bounds and the proof of Theorem 1.1. In Sect. 4, we explain how our method can be applied to jumping kernels of mixed type on metric measure spaces. In the appendix, we provide a proof of a gluing lemma which differs from [27] due to the time-inhomogeneity of the jumping kernels under consideration.

2 Preliminaries

In this section we provide several auxiliary results that will be required for the proof of Theorem 1.1 in Sect. 3.

Let \(k : (0,T) \times \mathbb {R}^d \times \mathbb {R}^d \rightarrow [0,\infty ]\) be a symmetric jumping kernel satisfying the pointwise upper bound \(k_{\le }\) and the coercivity condition \(\mathcal {E}_{\ge }\).

Let us make a few comments on assumption \(\mathcal {E}_{\ge }\): First of all, \(\mathcal {E}_{\ge }\) can be regarded as a nonlocal substitute for the classical uniform ellipticity condition for local operators. In fact, it is considerably weaker than a pointwise lower bound on the jumping kernel, since \(\mathcal {E}_{\ge }\) allows for jumping kernels that degenerate in certain directions, such as kernels that are supported on double cones. We refer the interested reader to [19, 21] for an investigation of such conditions. Since k is allowed to depend on time, our result also covers kernels that are supported on double cones with fixed apex, whose cone axes rotate continuously in time.
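
As an illustration (a hypothetical example in the spirit of the kernels studied in [19, 21], not an assumption used elsewhere in this article), one may take a unit vector \(e(t) \in \mathbb {S}^{d-1}\) depending continuously on t, fix \(\delta \in (0,1)\), and consider

$$\begin{aligned} k(t;x,y) = \vert x-y \vert ^{-d-\alpha } \mathbbm {1}_{\left\{ \left| \left\langle \tfrac{x-y}{\vert x-y \vert }, e(t) \right\rangle \right| \ge 1-\delta \right\} }. \end{aligned}$$

This kernel is symmetric, satisfies \(k_{\le }\) with \(\Lambda = 1\), and is supported on a double cone with axis \(e(t)\); the validity of coercivity conditions of the type \(\mathcal {E}_{\ge }\) for such cone-supported kernels is precisely what is investigated in [19, 21].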

A coercivity assumption like \(\mathcal {E}_{\ge }\) is crucial to our approach since it is needed for the \(L^{\infty }-L^2\)-estimate (Lemma 2.4) and also for the on-diagonal heat kernel bound (Theorem 2.3). In the literature, lower bounds on jumping kernels are often introduced through functional inequalities, e.g., in [9] or [17], where the authors assume a Nash or a Faber-Krahn inequality. We point out that such an assumption would also have been possible in this work, since the proofs of Lemma 2.4 and Theorem 2.3 can be changed accordingly (see Sect. 4). For a discussion of the equivalence of Nash and Faber-Krahn inequalities, we refer the reader to [17]. Moreover, we would like to mention the recent article [7], where the relation between \(L^1-L^{\infty }\) smoothing effects, on-diagonal upper heat kernel estimates, and functional inequalities is studied for fractional equations of porous medium type.

Let us point out a possible extension that was mentioned to us by a reviewer of this article. Our approach certainly allows one to track the constants in the respective estimates, as is done in [6]. In particular, one could treat time-dependent versions of the condition \(\mathcal {E}_{\ge }\) and establish sufficient conditions on the behavior of \(\lambda (t)\) as \(t \rightarrow \infty \).

For any \(\rho > 0\), we define the truncated jumping kernel \(k_{\rho }\) via

$$\begin{aligned} k_{\rho }(t;x,y) = k(t;x,y) \mathbbm {1}_{\{\vert x-y \vert \le \rho \}}(x,y). \end{aligned}$$

The associated integro-differential operator \(L_t^{\rho }\) is defined as

$$\begin{aligned} L_t^{\rho } u (x) = \text { p.v.} \int _{\mathbb {R}^d} (u(y)-u(x))k_{\rho }(t;x,y) \text {d}y. \end{aligned}$$

Definition 2.1

We say that a function \(u \in L^2_{loc}((\eta ,T);H^{\alpha /2}(\mathbb {R}^d))\) with \(\partial _t u \in L^1_{loc}((\eta ,T) ; L^2_{loc}(\mathbb {R}^d))\) solves the Cauchy problem associated with k in \((\eta ,T) \times \mathbb {R}^d\):

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t u - L_t u = 0, \quad \text { in } (\eta ,T) \times \mathbb {R}^d,\\ u(\eta ) = u_0 \in L^2(\mathbb {R}^d), \end{array}\right. } \end{aligned}$$
(2.1)

if for every \(\phi \in H^{\alpha /2}(\mathbb {R}^d)\) with \({{\,\textrm{supp}\,}}(\phi )\) compact, it holds

$$\begin{aligned} \int _{\mathbb {R}^d} \partial _t u(t,x) \phi (x) \text {d}x + \mathcal {E}_t(u(t),\phi )&= 0, ~~ \text { a.e. } t \in (\eta ,T), \end{aligned}$$
(2.2)
$$\begin{aligned} \Vert u(t) - u_0 \Vert _{L^2(\mathbb {R}^d)}&\rightarrow 0, ~~ \text { as } t \searrow \eta , \end{aligned}$$
(2.3)

where we write

$$\begin{aligned} \mathcal {E}_t(u,v) = \int _{\mathbb {R}^d} \int _{\mathbb {R}^d} (u(t,x) - u(t,y))(v(t,x)-v(t,y))k(t;x,y) \text {d}y \text {d}x \end{aligned}$$
(2.4)

for the family of energy forms \((\mathcal {E}_t)_{t \in (\eta ,T)}\) associated with k.

Solutions to the \(\rho \)-truncated Cauchy problem associated with k in \((\eta ,T) \times \mathbb {R}^d\)

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t u - L_t^{\rho } u &{}= 0, ~~ \text { in } (\eta ,T) \times \mathbb {R}^d,\\ u(\eta ) &{}= u_0 \in L^2(\mathbb {R}^d), \end{array}\right. } \end{aligned}$$
(2.5)

are defined in an analogous way, replacing k by \(k_{\rho }\).

Throughout this article, we will assume that the fundamental solutions \(p, p_{\rho } : (0,T) \times \mathbb {R}^d \times [0,T) \times \mathbb {R}^d \rightarrow [0,\infty ]\) to the equations \(\partial _t u - L_t u = 0\) and \(\partial _t u - L^{\rho }_t u = 0\) exist. p and \(p_{\rho }\) have the following properties for all \(0 \le \eta< t< s < T\), \(x,y \in \mathbb {R}^d\):

$$\begin{aligned} p(y,s;x,\eta ) = p(x,s;y,\eta )> 0&,~~ p_{\rho }(y,s;x,\eta ) = p_{\rho }(x,s;y,\eta ) > 0, \end{aligned}$$
(2.6)
$$\begin{aligned} \int _{\mathbb {R}^d} p(y,s;x,\eta ) \text {d}x = 1&, ~~ \int _{\mathbb {R}^d} p_{\rho }(y,s;x,\eta ) \text {d}x = 1, \end{aligned}$$
(2.7)
$$\begin{aligned} p(y,s;x,\eta ) = \int _{\mathbb {R}^d}p(y,s;z,t)&p(z,t;x,\eta )\text {d}z,~~ p_{\rho }(y,s;x,\eta )\nonumber \\&\quad = \int _{\mathbb {R}^d}p_{\rho }(y,s;z,t)p_{\rho }(z,t;x,\eta )\text {d}z. \end{aligned}$$
(2.8)

Moreover, the solutions u to (2.1) and \(u_{\rho }\) to (2.5) are unique and have the representation

$$\begin{aligned} u(s,y)= & {} \int _{\mathbb {R}^d} p(y,s;x,\eta ) u_0(x) \text {d}x,~~ u_{\rho }(s,y)\nonumber \\= & {} \int _{\mathbb {R}^d} p_{\rho }(y,s;x,\eta ) u_0(x) \text {d}x, ~~ s \in (\eta ,T), ~ y \in \mathbb {R}^d. \end{aligned}$$
(2.9)

In the following, we will denote the unique solutions to (2.1) and (2.5) by \(P_{\eta ,s}u_0\), and \(P^{\rho }_{\eta ,s}u_0\). \((P_{\eta ,s})_{s \in [\eta ,T)}\), and \((P^{\rho }_{\eta ,s})_{s \in [\eta ,T)}\) are called the heat semigroups associated with k, and \(k_{\rho }\).

In the time-homogeneous case, i.e., when k does not depend on t, the existence of \((P_{\eta ,s})_{s \in [\eta ,T)}\) and \((P^{\rho }_{\eta ,s})_{s \in [\eta ,T)}\) is guaranteed by symmetric Dirichlet form theory. The existence of the fundamental solution classically follows from so-called ultracontractivity estimates for the heat semigroup which are a consequence of Nash’s inequality. For time-inhomogeneous jumping kernels k which satisfy the following pointwise lower bound for some \(\lambda > 0\)

$$\begin{aligned} k(t;x,y) \ge \lambda \vert x-y \vert ^{-d-\alpha }, ~~ t \in (0,T), ~ x,y \in \mathbb {R}^d, \end{aligned}$$
(2.10)

the existence of the fundamental solutions p and \(p_{\rho }\) was proved in [43] by approximation of k through a sequence of smooth jumping kernels for which the desired properties follow from the theory of pseudo-differential operators. A similar result is proved in [37, 38] but under an additional smoothness assumption on t. Alternatively, the existence and uniqueness of solutions to (2.1) and (2.5) can be established by following the proof of Theorem 5.3 in [25], which is based on a parabolic version of the Lax-Milgram lemma (see Corollary 23.26 in [54]).

The following result explains the connection between p and \(p_{\rho }\) and is crucial to our approach. It is frequently used in the derivation of upper heat kernel bounds for \(\alpha \)-stable-like processes and goes back to a probabilistic construction carried out in [42]. An analytic proof via the parabolic maximum principle is derived in [27] (see also [30]). Since both proofs are known only in the time-homogeneous case, we provide a modified version of the argument in [27] in the appendix.

Lemma 2.2

Assume that k satisfies \(k_{\le }\), \(\mathcal {E}_{\ge }\). Then there exists \(c > 0\) such that for every \(\rho > 0\), \(0 \le \eta< s < T\), and \(x,y \in \mathbb {R}^d\) it holds

$$\begin{aligned} p(y,s;x,\eta )&\le p_{\rho }(y,s;x,\eta ) + c(s-\eta ) \rho ^{-d-\alpha }, \end{aligned}$$
(2.11)
$$\begin{aligned} p_{\rho }(y,s;x,\eta )&\le e^{c\rho ^{-\alpha }(s-\eta )} p(y,s;x,\eta ). \end{aligned}$$
(2.12)
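
Heuristically, the correction terms in (2.11) and (2.12) reflect that, by \(k_{\le }\), removing the jumps of size larger than \(\rho \) perturbs the generator by an operator that is bounded on \(L^{\infty }\):

$$\begin{aligned} \vert L_t u(x) - L_t^{\rho } u(x) \vert \le 2 \Vert u \Vert _{\infty } \int _{\mathbb {R}^d \setminus B_{\rho }(x)} k(t;x,y) \text {d}y \le \frac{2 \Lambda \omega _{d-1}}{\alpha } \Vert u \Vert _{\infty } \rho ^{-\alpha }. \end{aligned}$$

This perturbation of order \(\rho ^{-\alpha }\) is the source of the factor \(e^{c\rho ^{-\alpha }(s-\eta )}\) in (2.12), while the additive term in (2.11) carries the size \(\Lambda \rho ^{-d-\alpha }\) of the density of the individual long jumps.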

Next, we provide the so-called on-diagonal bound for the heat kernels p, and \(p_{\rho }\).

Theorem 2.3

(on-diagonal bound) Assume that k satisfies \(k_{\le }\), \(\mathcal {E}_{\ge }\). Then there exists \(c > 0\) depending on \(d,\alpha ,\lambda ,\Lambda \) such that for every \(\rho > 0\), \(0 \le \eta< s < T\), and \(x,y \in \mathbb {R}^d\) it holds

$$\begin{aligned} p(y,s;x,\eta )&\le c(s-\eta )^{-\frac{d}{\alpha }}, \end{aligned}$$
(2.13)
$$\begin{aligned} p_{\rho }(y,s;x,\eta )&\le ce^{c\rho ^{-\alpha }(s-\eta )}(s-\eta )^{-\frac{d}{\alpha }}. \end{aligned}$$
(2.14)

There are at least two ways to prove Theorem 2.3. One approach classically goes via Nash inequalities (see [17]) and can be traced back to Nash’s famous work [49]. This proof also works in the time-inhomogeneous setup (see [43]).

Another way to establish on-diagonal bounds goes via \(L^{\infty }-L^1\)-estimates. For this, we refer to Lemma 4.2, where such an estimate is proved in a more general setup. Observe that \((t,z) \mapsto p(y,t;z,\eta )\) solves \(\partial _t u - L_t u = 0\) in \((\eta ,T) \times \mathbb {R}^d\) for every \(y \in \mathbb {R}^d\). Therefore, by the \(L^{\infty }-L^1\)-estimate (4.4), for every \(0 \le \eta< s < T\) and \(x,y \in \mathbb {R}^d\):

$$\begin{aligned} p(y,s;x,\eta ) \le c (s-\eta )^{-\frac{d}{\alpha }} \sup _{t \in (\eta ,s)}\int _{\mathbb {R}^d} p(y,t;z,\eta ) \text {d}z \le c (s-\eta )^{-\frac{d}{\alpha }}, \end{aligned}$$

where we used (2.7) in the last step. (2.14) follows from Lemma 2.2.

The remainder of this section is devoted to proving an \(L^{\infty }-L^2\)-estimate for solutions to the truncated problem \(\partial _t u - L^{\rho }_t u = 0\) in a time-space cylinder \(I_R(t_0) \times B_R(x_0)\), where \(t_0 \in (0,T)\), \(x_0 \in \mathbb {R}^d\), and \(I_R(t_0) {:}{=} (t_0 - R^{\alpha },t_0) \subset (\eta ,T)\).

For truncated jumping kernels \(k_{\rho }\), it is possible to estimate the nonlocal tail-term by an \(L^2\)-norm over a ball with radius \(\rho \), see (2.16), without assuming any pointwise lower bounds on k or a UJS-type condition as in [5]. This comes at the cost of the suboptimal scaling factor \(( \rho ^{\alpha }/R^{\alpha })^{d/(2\alpha )}\) in the resulting estimate, which fortunately does not affect the proof of the final heat kernel estimate.

A similar result for solutions to elliptic equations was obtained in [17]. We follow the strategy outlined in [51] which is based on a nonlocal adaptation of De Giorgi’s iteration technique (see [22, 23]) but present the proof in all details due to our special treatment of the tail term.

Lemma 2.4

(truncated \(L^{\infty }-L^2\)-estimate) Assume that k satisfies \(k_{\le }\), \(\mathcal {E}_{\ge }\). There exists a constant \(C > 0\) depending on \(d,\alpha ,\lambda ,\Lambda \) such that for every \(t_0 \in (0,T)\), \(x_0 \in \mathbb {R}^d\), and \(\rho , R > 0\) with \(R \le \rho /2 \wedge t_0^{1/\alpha }\), and every subsolution u to \(\partial _t u - L_t^{\rho } u = 0\) in \(I_R(t_0) \times B_R(x_0)\) it holds:

$$\begin{aligned} \sup _{I_{R/2}(t_0) \times B_{R/2}(x_0)} u \le C \left( \frac{\rho ^{\alpha }}{R^{\alpha }} \right) ^{\frac{d}{2\alpha }} R^{-\frac{d}{2}} \sup _{t \in I_{R}(t_0)} \left( \int _{B_{2\rho }(x_0)} u^2(t,x) \text {d}x \right) ^{1/2}. \end{aligned}$$
(2.15)

For the definition of a subsolution to \(\partial _t u - L_t^{\rho } u = 0\) in \(I_R(t_0) \times B_R(x_0)\), we refer to the appendix.

Proof

To keep the notation simple, from now on we write \(I_R {:}{=} I_R(t_0)\) and \(B_R {:}{=} B_R(x_0)\). We do not provide the details for \(d = 1\). From the usual Caccioppoli inequality, see [51], we know that for every \(r, R\) with \(0 < r \le R \le \rho /2\) and every \(l > 0\), it holds

$$\begin{aligned}&\sup _{t \in I_R} \int _{B_R} w_l^2(t,x) \text {d}x + \int _{I_R} \int _{B_R}\int _{B_R}(w_l(s,x)-w_l(s,y))^2 k_{\rho }(s;x,y) \text {d}y \text {d}x \text {d}s\\&\le c_1\Bigg (\sigma (R,r) \int _{I_{R+r}} \int _{B_{R+r}} w_l^2(t,x) \text {d}x \text {d}t + \Vert w_l\Vert _{L^1(I_{R+r}\times B_{R+r})} \sup _{t \in I_{R+r}} \sup _{x \in B_{R + \frac{r}{2}}} \\&\int _{B_{R+r}^{c}} w_l(t,y) k_{\rho }(t;x,y) \text {d}y \Bigg ), \end{aligned}$$

where \(c_1 > 0\) is a constant, \(w_l = (u-l)_+\), and \(\sigma (r,R) = r^{-(\alpha \vee 1)}(R+r)^{(\alpha \vee 1) - \alpha }\). Define \(\kappa = 1 + \frac{\alpha }{d}\), and \(A(l,R) : = \vert \lbrace (t,x) \in I_{R} \times B_R : u(t,x) > l \rbrace \vert \). Then, by \(\mathcal {E}_{\ge }\) and the fractional Sobolev inequality:

$$\begin{aligned} \int _{I_{R}} \int _{B_R} w^2_l(t,x) \text {d}x \text {d}t&\le c_2\vert A(l,R) \vert ^{\frac{1}{\kappa '}} \left( \int _{I_R} \int _{B_R} w_l^{2\kappa }(s,x) \text {d}x \text {d}s \right) ^{\frac{1}{\kappa }}\\&\le c_3\vert A(l,R) \vert ^{\frac{1}{\kappa '}} \left( \left( \sup _{t \in I_R}\int _{B_R} w_l^2(t,x) \text {d}x\right) ^{\kappa -1}\int _{I_R} \left( \int _{B_R} w_l^{\frac{2d}{d-\alpha }}(s,x) \text {d}x\right) ^{\frac{d-\alpha }{d}} \text {d}s \right) ^{\frac{1}{\kappa }}\\&\le c_4\vert A(l,R) \vert ^{\frac{1}{\kappa '}} \Bigg (\sigma (R,r) \int _{I_{R+r}} \int _{B_{R+r}} w_l^2(t,x) \text {d}x \text {d}t\\&\qquad \qquad \qquad + \Vert w_l\Vert _{L^1(I_{R+r}\times B_{R+r})} \sup _{t \in I_{R+r}} \sup _{x \in B_{R + \frac{r}{2}}} \int _{B_{R+r}^{c}} w_l(t,y) k_{\rho }(t;x,y) \text {d}y \Bigg ) \end{aligned}$$

for some \(c_2,c_3,c_4 > 0\). Observe that for some constant \(c_5 > 0\), by \(k_{\le }\) and the Cauchy-Schwarz inequality,

$$\begin{aligned} \sup _{t \in I_{R+r}} \sup _{x \in B_{R + \frac{r}{2}}} \int _{B_{R+r}^{c}} w_l(t,y) k_{\rho }(t;x,y) \text {d}y \le c_5 r^{-d-\alpha } \rho ^{\frac{d}{2}} \sup _{t \in I_{R+r}} \left( \int _{B_{2\rho }(x_0)} w_l^2(t,y) \text {d}y \right) ^{1/2}, \end{aligned}$$
(2.16)

where we used that \(k_{\rho }(t;x,y)\) vanishes for \(\vert x-y \vert > \rho \), that \(\vert x-y \vert \ge r/2\) for \(x \in B_{R+\frac{r}{2}}\) and \(y \in B_{R+r}^{c}\), and that \(B_{\rho }(x) \subset B_{2\rho }(x_0)\) since \(R \le \rho /2\).

Let us now fix \(R \in (0,\rho /2]\) and define sequences \(l_i = M(1-2^{-i})\), for \(M > 0\) to be chosen later, \(r_i = 2^{-i-1}R\), \(R_{i+1} = R_i - r_{i+1}\), \(R_0 {:}{=} R\), \(A_i = \int _{I_{R_i}} \int _{B_{R_i}} w_{l_i}^2(t,x) \text {d}x \text {d}t\). Note that by definition: \(R/2 = \lim _{i \rightarrow \infty } R_i< \dots< R_2< R_1 < R_0 = R\), and \(l_i \nearrow M\), and \(\sigma (r_i,R_i) \le c_6 R^{-\alpha }2^{2i}\). Then, we deduce from the two lines above:

$$\begin{aligned} A_i \le \frac{c_9}{M^{\frac{2}{\kappa '}} R^{\alpha }} 2^{\gamma i} A_{i-1}^{1+ \frac{1}{\kappa '}}, \end{aligned}$$

where \(c_9 > 0\) collects the constants \(c_7, c_8 > 0\) arising from \(\vert A(l_i,R_{i-1}) \vert \le 2^{2i} M^{-2} A_{i-1}\), the bound on \(\sigma (r_i,R_i)\), and the tail estimate (2.16), and where \(\gamma = 2 + \frac{2}{\kappa '} + d + \alpha > 0\). Let us now choose

$$\begin{aligned} M = c_{11} \left( \frac{\rho ^{\alpha }}{R^{\alpha }} \right) ^{\frac{d}{2\alpha }} R^{-\frac{d}{2}} \sup _{t \in I_{R}} \left( \int _{B_{2\rho }(x_0)} u^2(t,x) \text {d}x \right) ^{1/2}, ~~ c_{11} {:}{=} c_{10}^{\frac{\kappa '}{2}} C^{\frac{\kappa '^2}{2}}, \end{aligned}$$

where \(c_{10} = 2c_9\), \(C = 2^{\gamma } > 1\). Consequently, it holds

$$\begin{aligned} A_i \le \frac{c_{10}}{M^{\frac{2}{\kappa '}} R^{\alpha } } C^i A_{i-1}^{1+ \frac{1}{\kappa '}}, ~~ A_0 \le C^{-\kappa '^2} \left( \frac{c_{10}}{R^{\alpha }M^{\frac{2}{\kappa '}}} \right) ^{-\kappa '}. \end{aligned}$$
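
The iteration is closed by a standard fast-convergence lemma, which we recall in a form adapted to our notation (a classical statement of the type of Lemma 7.1 in [31]): if \(B_0, B_1, \dots \ge 0\) satisfy \(B_{i+1} \le C_0 C^i B_i^{1+\varepsilon }\) for some \(C_0 > 0\), \(C > 1\), \(\varepsilon > 0\), and if \(B_0 \le C_0^{-\frac{1}{\varepsilon }} C^{-\frac{1}{\varepsilon ^2}}\), then

$$\begin{aligned} B_i \le C^{-\frac{i}{\varepsilon }} B_0 \rightarrow 0, ~~ \text { as } i \rightarrow \infty , \end{aligned}$$

as one checks by induction. In our situation, it is applied with \(\varepsilon = \frac{1}{\kappa '}\).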

By Lemma 7.1 in [31], we obtain \(A_i \rightarrow 0\) as \(i \rightarrow \infty \), and hence \(u \le M\) almost everywhere in \(I_{R/2} \times B_{R/2}\), i.e.,

$$\begin{aligned} \sup _{I_{R/2} \times B_{R/2}} u \le M \le c_{12} \left( \frac{\rho ^{\alpha }}{R^{\alpha }} \right) ^{\frac{d}{2\alpha }} R^{-\frac{d}{2}} \sup _{t \in I_{R}} \left( \int _{B_{2\rho }(x_0)} u^2(t,x) \text {d}x \right) ^{1/2} \end{aligned}$$

for some \(c_{11}, c_{12} > 0\) entering through the choice of M, as desired, where we used \(R \le \rho /2\). \(\square \)

3 Nonlocal Aronson method

In this section we prove Theorem 1.1. As is standard for proofs of heat kernel bounds for nonlocal operators, we first establish bounds for the heat kernel corresponding to the truncated jumping kernel and derive the estimate for the original jumping kernel by gluing together short and long jumps with the help of Lemma 2.2 in a second step.

The following lemma is a nonlocal version of (1.5). Presumably, estimates of this form are also of interest in the study of fractional porous medium equations and gradient flows.

Lemma 3.1

Assume that k satisfies \(k_{\le }\) and let \(\rho > 0\), \(0 \le \eta< s < T\). Let \(u \in L^{\infty }((\eta ,T) \times \mathbb {R}^d)\) be a solution to the \(\rho \)-truncated Cauchy problem (2.5) in \((\eta ,T) \times \mathbb {R}^d\). Then there exists a constant \(C = C(\Lambda )\) such that for every bounded function \(H : [\eta ,s] \times \mathbb {R}^d \rightarrow [0,\infty )\) satisfying

  • \(C\Gamma ^{\alpha }_{\rho }(H^{1/2},H^{1/2}) \le -\partial _t H\) in \((\eta ,s) \times \mathbb {R}^d\),

  • \(H^{1/2} \in L^2((\eta ,s);H^{\alpha /2}(\mathbb {R}^d))\),

the following estimate holds true:

$$\begin{aligned} \sup _{t \in (\eta ,s)} \int _{\mathbb {R}^d} H(t,x) u^2(t,x) \text {d}x \le \int _{\mathbb {R}^d} H(\eta ,x) u_0^2(x) \text {d}x. \end{aligned}$$
(3.1)

Proof

Let \(R \ge 2\) and \(\gamma _R \in C^{\infty }_c(\mathbb {R}^d)\) such that \(\gamma _R \equiv 1\) in \(B_{R-1}(0)\), \(\gamma _R \equiv 0\) in \(\mathbb {R}^d \setminus B_{R}(0)\), \(0 \le \gamma _R \le 1\), \(\vert \nabla \gamma _R\vert \le 2\). Consequently, \(\Gamma _{\rho }^{\alpha }(\gamma _R,\gamma _R)\) satisfies:

$$\begin{aligned} \Gamma _{\rho }^{\alpha }(\gamma _R,\gamma _R)(x) \le {\left\{ \begin{array}{ll} 0, ~~ x \in B_{\frac{R-1}{2}}(0),\\ c_1, ~~ x \in \mathbb {R}^d \setminus B_{\frac{R-1}{2}}(0)\\ \end{array}\right. } \end{aligned}$$
(3.2)

for some constant \(c_1 > 0\). We test the equation for u with the test function \(\phi = \gamma _R^2Hu\) and integrate in time over \((\eta ,\tau )\), where \(\tau \in (\eta ,s)\), and obtain:

$$\begin{aligned}{} & {} \int _{\eta }^{\tau } \int _{\mathbb {R}^d} (\partial _t u) \phi \text {d}x \text {d}t \\{} & {} \quad + \int _{\eta }^{\tau } \int _{\mathbb {R}^d}\int _{\mathbb {R}^d} (u(t,x)-u(t,z))(\phi (t,x)-\phi (t,z))k_{\rho }(t;x,z) \text {d}z \text {d}x \text {d}t = 0. \end{aligned}$$

Note that \(\phi \) is a valid test function since for every \(t \in (\eta ,s)\) by assumption it holds \(\gamma _R H(t)u(t) \in H^{\alpha /2}(\mathbb {R}^d)\). From \((\partial _t u)u = \frac{1}{2}\partial _t(u^2)\) and integration by parts:

$$\begin{aligned} \int _{\mathbb {R}^d} u^2(\tau ,x) \gamma _R^2(x) H(\tau ,x) \text {d}x&+ 2\int _{\eta }^{\tau } \int _{\mathbb {R}^d}\int _{\mathbb {R}^d} (u(t,x)-u(t,z))(\phi (t,x)\\&\quad -\phi (t,z))k_{\rho }(t;x,z) \text {d}z \text {d}x \text {d}t\\&=\int _{\mathbb {R}^d} u^2_0(x) \gamma _R^2(x) H(\eta ,x) \text {d}x \\&\quad + \int _{\eta }^{\tau } \int _{\mathbb {R}^d} u^2(t,x) \gamma _R^2(x) \partial _t H(t,x) \text {d}x \text {d}t. \end{aligned}$$

We treat the nonlocal term by making use of the following algebraic inequality:

$$\begin{aligned} (u_1-u_2)&(\gamma _1^2 H_1 u_1 - \gamma _2^2 H_2 u_2) \ge (\gamma _1 H^{1/2}_1 u_1 - \gamma _2 H^{1/2}_2 u_2)^2\\&- c_2 \left( (\gamma _1 - \gamma _2)^2 (H_1 + H_2)(u_1^2+u_2^2) + (H_1^{1/2}-H_2^{1/2})^2(\gamma _1^2+\gamma _2^2)(u_1^2+u_2^2) \right) , \end{aligned}$$

where \(c_2 > 0\). Its proof is based on the following two observations:

$$\begin{aligned} (u_1-u_2)(\gamma _1^2 H_1 u_1 - \gamma _2^2 H_2 u_2)&= (\gamma _1 H^{1/2}_1 u_1 - \gamma _2 H^{1/2}_2 u_2)^2 -u_1u_2(\gamma _1 H^{1/2}_1 - \gamma _2 H^{1/2}_2)^2,\\ u_1u_2(\gamma _1 H^{1/2}_1 - \gamma _2 H^{1/2}_2)^2&\le c_2(u_1^2+u_2^2)\left( (\gamma _1 - \gamma _2)^2 (H_1 + H_2) + (\gamma _1^2 \right. \\&\quad \left. + \gamma _2^2) (H^{1/2}_1 - H^{1/2}_2)^2 \right) . \end{aligned}$$
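The first identity can be verified by direct expansion; the following computation is ours and only records the cancellation of the mixed terms:

```latex
% Both sides expand to the same expression:
(u_1-u_2)(\gamma_1^2 H_1 u_1 - \gamma_2^2 H_2 u_2)
  = \gamma_1^2 H_1 u_1^2 + \gamma_2^2 H_2 u_2^2
    - u_1 u_2 \left( \gamma_1^2 H_1 + \gamma_2^2 H_2 \right),
% while, expanding the squares,
(\gamma_1 H_1^{1/2} u_1 - \gamma_2 H_2^{1/2} u_2)^2
  - u_1 u_2 (\gamma_1 H_1^{1/2} - \gamma_2 H_2^{1/2})^2
  = \gamma_1^2 H_1 u_1^2 + \gamma_2^2 H_2 u_2^2
    - u_1 u_2 \left( \gamma_1^2 H_1 + \gamma_2^2 H_2 \right),
% since the cross terms \mp 2\gamma_1\gamma_2 (H_1 H_2)^{1/2} u_1 u_2 cancel.
```

The second observation then follows from \(2ab \le a^2 + b^2\), applied after splitting \(\gamma _1 H_1^{1/2} - \gamma _2 H_2^{1/2} = (\gamma _1-\gamma _2)H_1^{1/2} + \gamma _2 (H_1^{1/2}-H_2^{1/2})\).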

Moreover, note that for \(R > 4\rho \) it holds \(\gamma _R^2(z) = \gamma _R^2(x) = 1\) for every \(x \in B_{\frac{R-1}{2}}(0)\) and \(z \in B_{\rho }(x)\). By the symmetry of k, condition \(k_{\le }\), and the observations above, we deduce:

$$\begin{aligned} \sup _{\tau \in (\eta ,s)}&\int _{\mathbb {R}^d} u^2(\tau ,x) \gamma _R^2(x) H(\tau ,x) \text {d}x \le \int _{\mathbb {R}^d} u^2_0(x) \gamma _R^2(x) H(\eta ,x) \text {d}x\\&\quad + \int _{\eta }^{s} \int _{\mathbb {R}^d} u^2(t,x) \gamma _R^2(x) \left( 2c_2\Lambda \Gamma ^{\alpha }_{\rho }(H^{1/2},H^{1/2})(t,x) + \partial _t H(t,x)\right) \text {d}x \text {d}t\\&\quad + 2c_2\Lambda \int _{\eta }^{s} \int _{\mathbb {R}^d}\int _{B_{\rho }(x)} u^2(t,x)\gamma _R^2(z)(H^{1/2}(t,x) - H^{1/2}(t,z))^2 \vert x-z \vert ^{-d-\alpha } \text {d}z \text {d}x \text {d}t\\&\quad + 2c_2\Lambda \int _{\eta }^{s}\int _{\mathbb {R}^d} u^2(t,x) H(t,x) \Gamma ^{\alpha }_{\rho }(\gamma _R,\gamma _R)(x) \text {d}x \text {d}t\\&\quad + 2c_2\Lambda \int _{\eta }^{s}\int _{\mathbb {R}^d}\int _{B_{\rho }(x)} u^2(t,z) H(t,x) (\gamma _R(x)-\gamma _R(z))^2 \vert x-z \vert ^{-d-\alpha } \text {d}z \text {d}x \text {d}t\\&\le \int _{\mathbb {R}^d} u^2_0(x) \gamma _R^2(x) H(\eta ,x) \text {d}x\\&\quad + \int _{\eta }^{s} \int _{\mathbb {R}^d} u^2(t,x) \gamma _R^2(x) \left( 4c_2\Lambda \Gamma ^{\alpha }_{\rho }(H^{1/2},H^{1/2})(t,x) + \partial _t H(t,x)\right) \text {d}x \text {d}t\\&\quad + 2c_2\Lambda \Vert u\Vert _{\infty }^2\int _{\eta }^{s} \int _{\mathbb {R}^d \setminus B_{\frac{R-1}{2}}(0)} \Gamma ^{\alpha }_{\rho }(H^{1/2},H^{1/2})(t,x) \text {d}x \text {d}t\\&\quad + 4c_2\Lambda \Vert u\Vert _{\infty }^2 \int _{\eta }^{s}\int _{\mathbb {R}^d} \Gamma ^{\alpha }_{\rho }(\gamma _R,\gamma _R)(x) H(t,x) \text {d}x \text {d}t. \end{aligned}$$

Now, assume that H satisfies the assumption with \(C = 4c_2\Lambda \). Then, using also (3.2), the following holds true:

$$\begin{aligned} \left( 4c_2\Lambda \Gamma ^{\alpha }_{\rho }(H^{1/2},H^{1/2})(t,x) + \partial _t H(t,x)\right)&\le 0, ~~ t \in (\eta ,s),~ x \in \mathbb {R}^d,\\ \int _{\eta }^s \int _{\mathbb {R}^d \setminus B_{\frac{R-1}{2}}(0)}\Gamma ^{\alpha }_{\rho }(H^{1/2},H^{1/2})(t,x) \text {d}x \text {d}t&\rightarrow 0, ~~\text {as } R \rightarrow \infty ,\\ \int _{\eta }^{s}\int _{\mathbb {R}^d} \Gamma ^{\alpha }_{\rho }(\gamma _R,\gamma _R)(x) H(t,x) \text {d}x \text {d}t \le c_1 \int _{\eta }^{s}\int _{\mathbb {R}^d \setminus B_{\frac{R-1}{2}}(0)}&H(t,x) \text {d}x \text {d}t \rightarrow 0, ~~ \text {as } R \rightarrow \infty . \end{aligned}$$

Since \(\gamma _R \rightarrow 1\) pointwise as \(R \rightarrow \infty \), it follows that

$$\begin{aligned} \sup _{\tau \in (\eta ,s)}\int _{\mathbb {R}^d} u^2(\tau ,x) H(\tau ,x) \text {d}x \le \int _{\mathbb {R}^d} u^2_0(x) H(\eta ,x) \text {d}x, \end{aligned}$$

as desired. \(\square \)

Our next goal is to establish the following auxiliary estimate:

Theorem 3.2

Assume that k satisfies \(k_{\le }\), \(\mathcal {E}_{\ge }\). Let \(y \in \mathbb {R}^d\), \(\sigma ,\rho > 0\), \(\eta \ge 0\). Let \(u_0 \in L^2(\mathbb {R}^d)\) be such that \(u_0 \equiv 0\) in \(B_{\sigma }(y)\). Assume that \(u \in L^{\infty }( (\eta ,T) \times \mathbb {R}^d)\) is a weak solution to \(\partial _t u - L_t^{\rho } u = 0\) in \((\eta ,T) \times \mathbb {R}^d\). Then there exist \(\nu> 1, C > 0\) depending on \(d,\alpha ,\lambda ,\Lambda \) such that for every \(s \in (\eta ,T)\) with \( s-\eta \le \frac{1}{4\nu }\rho ^{\alpha }\):

$$\begin{aligned} \vert u(s,y) \vert \le C (s-\eta )^{-\frac{d}{2\alpha }}2^{\frac{\sigma }{6\rho }} \left( \frac{\rho ^{\alpha }}{\nu (s-\eta )}\right) ^{-\frac{\sigma }{6\rho } + \frac{1}{2} + \frac{d}{2\alpha }} \Vert u_0\Vert _{L^2(\mathbb {R}^d)}. \end{aligned}$$

The idea of the proof of Theorem 3.2 is to find a suitable function H to which Lemma 3.1 applies. In the following, we construct such a function.

Given \(y \in \mathbb {R}^d\), \(\rho > 0\), \(0 \le \eta< s < T\), \(\nu > 1\) with \(s-\eta \le \frac{1}{4\nu }\rho ^{\alpha }\), we define \(H_{y,\rho ,\eta ,s,\nu } = H : [\eta ,s] \times \mathbb {R}^d \rightarrow [0,\infty )\) via

$$\begin{aligned} \begin{aligned} H(t,x)&{:}{=} \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]}\right) ^{-1} \wedge \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]}\right) ^{-\frac{\vert x-y \vert }{3\rho }}\\&= e^{-\log \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) \left( \frac{\vert x-y \vert }{3\rho } \vee 1\right) }. \end{aligned} \end{aligned}$$
(3.3)
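The equality of the two expressions in (3.3) is elementary; for the reader's convenience (this reformulation is ours):

```latex
% For a base a > 1 (here a = \rho^\alpha / (\nu[2(s-\eta)-(t-\eta)]) > 1,
% by the assumption s - \eta \le \rho^\alpha/(4\nu)) and m \ge 0,
% the map t \mapsto a^{-t} is decreasing, hence
a^{-1} \wedge a^{-m} = a^{-(1 \vee m)} = e^{-\log(a)\,(m \vee 1)},
% applied with m = |x-y|/(3\rho).
```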

Lemma 3.3

For every \(C > 0\), there exists \(\nu = \nu (d,\alpha , C) > 1\) such that for every \(y \in \mathbb {R}^d\), \(\rho > 0\), \(0 \le \eta< s < T\) with \(s-\eta \le \frac{1}{4\nu }\rho ^{\alpha }\), the function \(H_{y,\rho ,\eta ,s,\nu } = H\) defined above satisfies

$$\begin{aligned} C&\Gamma ^{\alpha }_{\rho }(H^{1/2},H^{1/2}) \le -\partial _t H,~~ \text { in }(\eta ,s) \times \mathbb {R}^d, \end{aligned}$$
(3.4)
$$\begin{aligned} H^{1/2}&\in L^2((\eta ,s);H^{\alpha /2}(\mathbb {R}^d)). \end{aligned}$$
(3.5)

Proof

Let \(y \in \mathbb {R}^d\), \(\rho > 0\), \(0 \le \eta< s <T\), with \(s-\eta \le \frac{1}{4\nu }\rho ^{\alpha }\), where \(\nu > 1\) is to be chosen later. Note that by assumption, \(\frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} > 1\) for every \(t \in [\eta ,s]\). Let \(t \in [\eta ,s]\) be fixed. We split the proof of (3.4) into three cases.

Case 1: \(\vert x-y \vert \le 2 \rho \).

In this case, trivially \(\Gamma ^{\alpha }_{\rho }(H^{1/2},H^{1/2})(t,x) = 0\), and

$$\begin{aligned} -\partial _t H(t,x) = -\partial _t \left( \frac{\nu [2(s-\eta ) - (t-\eta )]}{\rho ^{\alpha }} \right) = \nu \rho ^{-\alpha } > 0. \end{aligned}$$

Therefore, (3.4) holds true for any \(\nu > 0\).

Case 2: \(2\rho \le \vert x-y \vert \le 3\rho \).

In this case, \(-\partial _t H(t,x) = \nu \rho ^{-\alpha }\), as in Case 1. Moreover, let us fix \(x_0 \in B_{\rho }(x)\) with \(\vert x_0 - y\vert = 3\rho \) such that for every \(z \in \mathbb {R}^d \setminus B_{3\rho }(y)\) it holds \(\vert x_0-z \vert \le 2 \vert x-z \vert \). Such a point exists: choose \(x_0\) as a point on \(\partial B_{3\rho }(y)\) realizing \({{\,\textrm{dist}\,}}(x,\mathbb {R}^d \setminus B_{3\rho }(y))\); then \(\vert x_0 - x \vert \le \vert x - z\vert \) for every \(z \in \mathbb {R}^d \setminus B_{3\rho }(y)\), and the triangle inequality yields \(\vert x_0 - z\vert \le \vert x_0 - x\vert + \vert x - z\vert \le 2 \vert x-z \vert \). Note that \(H(t,x) = H(t,x_0)\) and \(B_{\rho }(x) \subset B_{2\rho }(x_0)\).

Therefore:

$$\begin{aligned} \Gamma ^{\alpha }_{\rho }(H^{1/2},H^{1/2})&(t,x) = \int _{B_{\rho }(x) \setminus B_{3\rho }(y)} \left( H^{1/2}(t,x) - H^{1/2}(t,z) \right) ^2 \vert x-z\vert ^{-d-\alpha } \text {d}z \\&\le c_1\int _{B_{2\rho }(x_0) \setminus B_{3\rho }(y)} \left( H^{1/2}(t,x_0) - H^{1/2}(t,z) \right) ^2 \vert x_0-z\vert ^{-d-\alpha } \text {d}z \\&\le c_1\int _{B_{2\rho }(x_0) \setminus B_{3\rho }(y)} \vert \nabla H^{1/2}(t,x_0) \vert ^2 \vert x_0 - z \vert ^{2-d-\alpha } \text {d}z\\&\le c_2 \vert \nabla H^{1/2}(t,x_0) \vert ^2 \rho ^{2-\alpha }\\&= c_2 \left( (6\rho )^{-1} \log \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) e^{-\log \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) \left( \frac{\vert x_0-y \vert }{6\rho }\right) }\right) ^2 \rho ^{2-\alpha } \\&= c_3 \rho ^{-\alpha } \left( \log \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) ^{-1/2}\right) ^2\\&\le c_4 \rho ^{-\alpha } \end{aligned}$$

for some \(c_1,c_2,c_3,c_4 > 0\). In the last step, we used the inequality \(\log (a) \le a^{1/2}\). Moreover, the estimate \(\left( H^{1/2}(t,x_0) - H^{1/2}(t,z) \right) ^2 \le \vert \nabla H^{1/2}(t,x_0) \vert ^2 \vert x_0 - z \vert ^{2}\) is justified since \(\sup _{z \in \mathbb {R}^d \setminus B_{3\rho }(y)} \vert \nabla H^{1/2}(t,z)\vert = \vert \nabla H^{1/2}(t,x_0)\vert \) due to \(x_0 \in \partial B_{3\rho }(y)\). Therefore, (3.4) holds true in this case for any \(\nu > c_4C\).
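The inequality \(\log (a) \le a^{1/2}\) in fact holds for every \(a > 0\); a one-line calculus check (our addition):

```latex
% Set f(a) := a^{1/2} - \log(a). Then
f'(a) = \tfrac{1}{2} a^{-1/2} - a^{-1} = 0 \iff a = 4,
\qquad
\min_{a > 0} f(a) = f(4) = 2 - \log 4 > 0,
% since f(a) \to \infty as a \to 0^+ and as a \to \infty.
```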

Case 3: \(\vert x-y \vert > 3\rho \).

In this case,

$$\begin{aligned} -\partial _t H(t,x) = \frac{\vert x-y \vert }{3\rho [2(s-\eta ) - (t-\eta )]}e^{-\log \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) \left( \frac{\vert x-y \vert }{3\rho }\right) }. \end{aligned}$$

Moreover:

$$\begin{aligned} \Gamma ^{\alpha }_{\rho }&(H^{1/2},H^{1/2})(t,x) = \int _{B_{\rho }(x)} \left( H^{1/2}(t,x) - H^{1/2}(t,z) \right) ^2 \vert x-z\vert ^{-d-\alpha } \text {d}z \\&\le \int _{B_{\rho }(x)} \left( e^{-\log \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) \left( \frac{\vert x-y \vert }{6\rho }\right) } - e^{-\log \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) \left( \frac{\vert z-y \vert }{6\rho }\right) } \right) ^2 \vert x-z\vert ^{-d-\alpha } \text {d}z\\&\le \int _{B_{\rho }(x)} \sup _{z \in B_{\rho }(x)} \left| \nabla e^{-\log \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) \left( \frac{\vert z-y \vert }{6\rho }\right) } \right| ^2 \vert x-z \vert ^{2-d-\alpha } \text {d}z\\&\le c_5 \left( (6\rho )^{-1} \log \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) e^{-\log \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) \left( \frac{\vert x-y \vert - \rho }{6\rho }\right) } \right) ^2 \rho ^{2-\alpha }\\&= c_6 \left[ \log \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) \right] ^2 \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) ^{1/3} e^{-\log \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) \left( \frac{\vert x-y \vert }{3\rho }\right) } \rho ^{-\alpha }\\&\le c_7 \frac{1}{\nu [2(s-\eta ) - (t-\eta )]}e^{-\log \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) \left( \frac{\vert x-y \vert }{3\rho }\right) }\\&\le c_7 \frac{\vert x-y \vert }{3\rho \nu [2(s-\eta ) - (t-\eta )]}e^{-\log \left( \frac{\rho ^{\alpha }}{\nu [2(s-\eta ) - (t-\eta )]} \right) \left( \frac{\vert x-y \vert }{3\rho }\right) } \end{aligned}$$

for some \(c_5,c_6,c_7 > 0\). In the third inequality, we used \(\vert z-y \vert \ge \vert x-y \vert - \rho \), and in the second to last step we applied the estimate \(\log (a) \le c a^{1/3}\), which holds with a constant \(c > 0\) independent of \(a > 1\). Therefore, by choosing \(\nu > c_7C\), (3.4) is satisfied also in this case.
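The constant in \(\log (a) \le c\,a^{1/3}\) can be made explicit by elementary calculus (our computation):

```latex
\frac{\mathrm{d}}{\mathrm{d}a}\left( \log(a)\, a^{-1/3} \right)
  = a^{-4/3}\left( 1 - \tfrac{1}{3}\log a \right) = 0
  \iff a = e^{3},
\qquad
\sup_{a > 1} \log(a)\, a^{-1/3} = 3e^{-1},
```

so the estimate holds with \(c = 3/e\).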

Together, we have proved (3.4). Finally, note that \(\Gamma ^{\alpha }_{\rho }(H^{1/2},H^{1/2}) \in L^1((\eta ,s) \times \mathbb {R}^d)\) since for \(\vert x-y \vert \ge 3\rho \), we computed above

$$\begin{aligned} \Gamma ^{\alpha }_{\rho }(H^{1/2},H^{1/2})(t,x) \le \vert x-y \vert c^{-\frac{\vert x-y \vert }{3\rho }}, \end{aligned}$$

where \(c > 1\) is a constant that might depend on \(\eta ,s,\rho \). This proves (3.5). \(\square \)

Having at hand the function H defined in (3.3), it is possible to establish Theorem 3.2.

Proof

(Proof of Theorem 3.2) The idea is to apply Lemma 3.1 with H as in Lemma 3.3. It follows that for every \(y \in \mathbb {R}^d\), \(0 \le \eta< s < T\) with \(s-\eta \le \frac{1}{4\nu }\rho ^{\alpha }\):

$$\begin{aligned} \sup _{\tau \in (\eta ,s)}\int _{B_{2\rho }(y)} u^2(\tau ,x) H(\tau ,x) \text {d}x{} & {} \le \sup _{\tau \in (\eta ,s)}\int _{\mathbb {R}^d} u^2(\tau ,x) H(\tau ,x) \text {d}x\\{} & {} \le \int _{\mathbb {R}^d \setminus B_{\sigma }(y)} u^2_0(x) H(\eta ,x) \text {d}x. \end{aligned}$$

Consequently,

$$\begin{aligned} \sup _{\tau \in (\eta ,s)}\int _{B_{2\rho }(y)} u^2(\tau ,x) \text {d}x \le \left( \frac{\sup _{x \in \mathbb {R}^d \setminus B_{\sigma }(y)} H(\eta ,x)}{\inf _{\tau \in (\eta ,s), x \in B_{2\rho }(y)} H(\tau ,x)} \right) \Vert u_0 \Vert _{L^2(\mathbb {R}^d)}^2. \end{aligned}$$
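The display above follows by bounding H from below on \(B_{2\rho }(y)\) and from above on the support of \(u_0\); spelled out (an intermediate step we add for readability):

```latex
\inf_{\substack{\tau \in (\eta,s) \\ x \in B_{2\rho}(y)}} H(\tau,x)
  \int_{B_{2\rho}(y)} u^2(\tau,x)\,\mathrm{d}x
\;\le\; \int_{\mathbb{R}^d} u^2(\tau,x) H(\tau,x)\,\mathrm{d}x
\;\le\; \sup_{x \in \mathbb{R}^d \setminus B_{\sigma}(y)} H(\eta,x)\,
  \Vert u_0 \Vert_{L^2(\mathbb{R}^d)}^2 .
```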

By the truncated \(L^{\infty }-L^2\)-estimate Lemma 2.4, applied with \(R = (s-\eta )^{1/\alpha }\), \(t_0 = s\), \(x_0 = y\):

$$\begin{aligned} \begin{aligned} \sup _{(\eta ',s) \times B_{\frac{1}{2}(s-\eta )^{1/\alpha }}(y)} u&\le c_1(s-\eta )^{-\frac{d}{2\alpha }} \left( \frac{\rho ^{\alpha }}{s-\eta } \right) ^{\frac{d}{2\alpha }} \sup _{\tau \in (\eta ,s)}\left( \int _{B_{2\rho }(y)} u^2(\tau ,x) \text {d}x \right) ^{1/2}\\&\le c_1(s-\eta )^{-\frac{d}{2\alpha }} \left( \frac{\rho ^{\alpha }}{s-\eta } \right) ^{\frac{d}{2\alpha }} \left( \frac{\sup _{x \in \mathbb {R}^d \setminus B_{\sigma }(y)} H(\eta ,x)}{\inf _{\tau \in (\eta ,s), x \in B_{2\rho }(y)} H(\tau ,x)} \right) ^{1/2} \Vert u_0 \Vert _{L^2(\mathbb {R}^d)} \end{aligned} \end{aligned}$$

for some \(c_1 > 0\), where \(\eta ' {:}{=} s - 2^{-\alpha }(s-\eta ) \in (\eta ,s)\).

Note that there exist \(c_2,c_3 > 0\) such that for \(x \in \mathbb {R}^d \setminus B_{\sigma }(y)\) it holds

$$\begin{aligned} H(\eta ,x) \le c_2 \left( \frac{\rho ^{\alpha }}{2\nu (s-\eta )}\right) ^{-\frac{\sigma }{3\rho }} \end{aligned}$$

and for \((\tau ,x) \in [\eta ,s] \times B_{2\rho }(y)\) we have:

$$\begin{aligned} H(\tau ,x)\ge e^{-\log \left( \frac{\rho ^{\alpha }}{\nu (s-\eta )} \right) \left( \frac{2\rho }{6\rho } \vee 1\right) } \ge c_3 e^{-\log \left( \frac{\rho ^{\alpha }}{\nu (s-\eta )} \right) } \ge c_3 \left( \frac{\rho ^{\alpha }}{\nu (s-\eta )}\right) ^{-1}. \end{aligned}$$

This follows directly from the definition of H and \(s-\eta \le \frac{1}{4\nu }\rho ^{\alpha }\). Together, we obtain

$$\begin{aligned} \vert u(s,y)\vert \le c_4(s-\eta )^{-\frac{d}{2\alpha }} 2^{\frac{\sigma }{6\rho }} \left( \frac{\rho ^{\alpha }}{\nu (s-\eta )}\right) ^{-\frac{\sigma }{6\rho } + \frac{1}{2} + \frac{d}{2\alpha }} \Vert u_0 \Vert _{L^2(\mathbb {R}^d)} \end{aligned}$$

for some \(c_4 > 0\), as desired. \(\square \)

Having proved Theorem 3.2, we are now in a position to establish upper off-diagonal bounds for \(p_{\rho }(y,s;x,\eta )\):

Theorem 3.4

Assume that k satisfies \(k_{\le }\), \(\mathcal {E}_{\ge }\). Then there exists \(c > 0\) depending on \(d,\alpha ,\lambda ,\Lambda \) such that for every \(\rho > 0\), \(0 \le \eta< s < T\), and \(x,y \in \mathbb {R}^d\) with \(s-\eta \le \frac{1}{4\nu }\rho ^{\alpha }\):

$$\begin{aligned} p_{\rho }(y,s;x,\eta ) \le c(s-\eta )^{-\frac{d}{\alpha }} 2^{\frac{\vert x-y \vert }{12\rho }} \left( \frac{\rho ^{\alpha }}{\nu (s-\eta )} \right) ^{-\frac{\vert x-y \vert }{12\rho } + \frac{1}{2} + \frac{d}{2\alpha }}. \end{aligned}$$
(3.6)

Proof

Note that the on-diagonal bound (2.14) and (2.7) immediately imply for every \(0 \le \eta< s < T\) with \(s-\eta \le \frac{1}{4\nu }\rho ^{\alpha }\) and \(x,y \in \mathbb {R}^d\):

$$\begin{aligned} \left( \int _{\mathbb {R}^d} p_{\rho }^2(z,s;x,\eta ) \text {d}z\right) ^{1/2} \le c_1 (s-\eta )^{-\frac{d}{2\alpha }} \end{aligned}$$
(3.7)

for some \(c_1 > 0\). On the other hand, from Theorem 3.2, it follows for every \(0 \le \eta< s < T\) with \(s-\eta \le \frac{1}{4\nu }\rho ^{\alpha }\) and \(x,y \in \mathbb {R}^d\):

$$\begin{aligned} \left( \int _{\mathbb {R}^d \setminus B_{\sigma }(y)} p_{\rho }^2(y,s;z,\eta ) \text {d}z\right) ^{1/2} \le c_2 (s-\eta )^{-\frac{d}{2\alpha }}2^{\frac{\sigma }{6\rho }} \left( \frac{\rho ^{\alpha }}{\nu (s-\eta )}\right) ^{-\frac{\sigma }{6\rho } + \frac{1}{2} + \frac{d}{2\alpha }} \end{aligned}$$
(3.8)

for some \(c_2 > 0\). To see this, one observes that \(u(t,x) = \int _{\mathbb {R}^d \setminus B_{\sigma }(y)} p_{\rho }(x,t;z,\eta )p_{\rho }(y,s;z,\eta )\text {d}z\) satisfies the assumptions of Theorem 3.2 with

$$\begin{aligned} u_0(x) = p_{\rho }(y,s;x,\eta )\mathbbm {1}_{\{\vert x-y \vert > \sigma \}}(x), ~~~~ u(s,y) = \int _{\mathbb {R}^d \setminus B_{\sigma }(y)} p_{\rho }^2(y,s;z,\eta ) \text {d}z. \end{aligned}$$

To prove (3.6), let us fix \(0 \le \eta< s < T\) with \(s-\eta \le \frac{1}{4\nu }\rho ^{\alpha }\), and \(x,y \in \mathbb {R}^d\). Then we define \(\sigma = \frac{1}{2}\vert x-y \vert \) and compute, using (2.8):

$$\begin{aligned} p_{\rho }(y,s;x,\eta )&= \int _{\mathbb {R}^d} p_{\rho }(y,s;z,(\eta +s)/2)p_{\rho }(z,(\eta +s)/2;x,\eta ) \text {d}z\\&= \int _{\mathbb {R}^d \setminus B_{\sigma }(y)} p_{\rho }(y,s;z,(\eta +s)/2)p_{\rho }(z,(\eta +s)/2;x,\eta ) \text {d}z\\&\quad + \int _{B_{\sigma }(y)} p_{\rho }(y,s;z,(\eta +s)/2)p_{\rho }(z,(\eta +s)/2;x,\eta ) \text {d}z\\&=: J_1+J_2. \end{aligned}$$

For \(J_1\), we compute, using (3.7), (3.8):

$$\begin{aligned} J_1&\le \left( \int _{\mathbb {R}^d \setminus B_{\sigma }(y)} p_{\rho }^2(y,s;z,(\eta +s)/2) \text {d}z \right) ^{1/2} \left( \int _{\mathbb {R}^d \setminus B_{\sigma }(y)} p_{\rho }^2(z,(\eta +s)/2;x,\eta ) \text {d}z \right) ^{1/2}\\&\le c_3(s-\eta )^{-\frac{d}{\alpha }} 2^{\frac{\vert x-y \vert }{12\rho }} \left( \frac{\rho ^{\alpha }}{\nu (s-\eta )} \right) ^{-\frac{\vert x-y \vert }{12\rho } + \frac{1}{2} + \frac{d}{2\alpha }} \end{aligned}$$

for some \(c_3 > 0\). For \(J_2\), observe that \(B_{\sigma }(y) \subset \mathbb {R}^d \setminus B_{\sigma }(x)\), and therefore by (2.6):

$$\begin{aligned} J_2&\le \left( \int _{\mathbb {R}^d \setminus B_{\sigma }(x)} p_{\rho }^2(y,s;z,(\eta +s)/2) \text {d}z \right) ^{1/2} \left( \int _{\mathbb {R}^d \setminus B_{\sigma }(x)} p_{\rho }^2(z,(\eta +s)/2;x,\eta ) \text {d}z \right) ^{1/2}\\&\le c_4(s-\eta )^{-\frac{d}{\alpha }} 2^{\frac{\vert x-y \vert }{12\rho }} \left( \frac{\rho ^{\alpha }}{\nu (s-\eta )} \right) ^{-\frac{\vert x-y \vert }{12\rho } + \frac{1}{2} + \frac{d}{2\alpha }} \end{aligned}$$

for some \(c_4 > 0\). Together, we obtain the desired result. \(\square \)

Bounds for the heat kernel \(p_{\rho }\) corresponding to the truncated jumping kernel imply bounds for p with the help of the gluing lemma (Lemma 2.2). The underlying argument is known among probabilists as “Meyer’s decomposition”. For an analytic proof relying on the parabolic maximum principle, we refer to the appendix.

We are now ready to provide the proof of our main result Theorem 1.1:

Proof of Theorem 1.1

Let \(x,y \in \mathbb {R}^d\) be fixed. By (2.13), it suffices to prove that there exist constants \(c_0, c_1 > 0\) such that for \(s-\eta \le c_0 \vert x-y \vert ^{\alpha }\) it holds

$$\begin{aligned} p(y,s;x,\eta ) \le c_1\frac{s-\eta }{\vert x-y \vert ^{d+\alpha }}. \end{aligned}$$

By Lemma 2.2 and \(k_{\le }\), we know that for every \(\rho > 0\), \(0 \le \eta< s < T\):

$$\begin{aligned} \begin{aligned} p(y,s;x,\eta )&\le p_{\rho }(y,s;x,\eta ) + c_2(s-\eta )\Vert k-k{\mathbbm {1}}_{\{\vert x-y \vert \le \rho \}}\Vert _{\infty }\\&\le p_{\rho }(y,s;x,\eta ) + c_3(s-\eta ) \rho ^{-d-\alpha } \end{aligned} \end{aligned}$$
(3.9)

for some \(c_2,c_3 > 0\). We choose \(\rho = \frac{\vert x-y \vert }{12}\left( \frac{d+\alpha }{\alpha } + \frac{1}{2} + \frac{d}{2\alpha }\right) ^{-1}\). Then by (3.6) and (3.9) it holds for \(s-\eta \le \frac{1}{4\nu }\rho ^{\alpha }\):

$$\begin{aligned} p(y,s;x,\eta ) \le c_4(s-\eta )\vert x-y \vert ^{-d-\alpha }, \end{aligned}$$

where \(c_4 > 0\), as desired. \(\square \)
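For completeness, we record the exponent bookkeeping behind the last step (our computation): with \(\beta := \frac{\vert x-y \vert }{12\rho } = \frac{d+\alpha }{\alpha } + \frac{1}{2} + \frac{d}{2\alpha }\), the exponent in (3.6) equals \(-\beta + \frac{1}{2} + \frac{d}{2\alpha } = -\frac{d+\alpha }{\alpha }\), so that

```latex
p_{\rho}(y,s;x,\eta)
  \le c\,(s-\eta)^{-\frac{d}{\alpha}}\, 2^{\beta}
      \left( \frac{\rho^{\alpha}}{\nu(s-\eta)} \right)^{-\frac{d+\alpha}{\alpha}}
  = c\, 2^{\beta} \nu^{\frac{d+\alpha}{\alpha}}\,(s-\eta)\,\rho^{-d-\alpha},
```

and \(\rho \) is a fixed multiple of \(\vert x-y \vert \), which yields the bound \(c_4(s-\eta )\vert x-y \vert ^{-d-\alpha }\).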

4 Extension: jumping kernels of mixed type on metric measure spaces

In this section, we discuss a possible extension of the nonlocal Aronson method to jumping kernels of mixed type. Moreover, we work on a general doubling metric measure space. The main result of this section is Theorem 4.1.

Let \((M,d)\) be a locally compact, separable metric space, and let \(\mu \) be a positive Radon measure with full support. We assume that \((M,d,\mu )\) satisfies the volume doubling property, i.e., there exist \(C > 0\), \(d \in \mathbb {N}\) such that

$$\begin{aligned} \frac{\mu (B_R(x))}{\mu (B_r(x))} \le C \left( \frac{R}{r} \right) ^d, ~~ x \in M, ~ 0 < r \le R. \end{aligned}$$
(VD)

Note that as a consequence of VD, for every \(\delta > 0\) there exist \(c_1,c_2 > 0\) such that for every \(R > 0\), \(x,y \in M\) with \(d(x,y) \le \delta R\): \(c_1 \mu (B_{R}(x)) \le \mu (B_{R}(y)) \le c_2 \mu (B_{R}(x))\).
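This consequence of VD follows by enlarging balls; a short verification (ours):

```latex
% If d(x,y) \le \delta R, then B_R(y) \subset B_{(1+\delta)R}(x), so by (VD)
\mu(B_R(y)) \le \mu\big(B_{(1+\delta)R}(x)\big) \le C (1+\delta)^d\, \mu(B_R(x)),
% and the lower bound follows by exchanging the roles of x and y.
```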

Moreover, let \(\phi : [0,\infty ) \rightarrow [0,\infty )\) be strictly increasing with \(\phi (0) = 0\), \(\phi (1) = 1\) and

$$\begin{aligned} C^{-1} \left( \frac{R}{r} \right) ^{\alpha _1} \le \frac{\phi (R)}{\phi (r)} \le C \left( \frac{R}{r} \right) ^{\alpha _2}, ~~ 0 < r \le R, \end{aligned}$$
(4.1)

for some constant \(C > 0\) and \(0< \alpha _1 \le \alpha _2 < 2\).
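Typical examples satisfying these conditions (our illustrations): \(\phi (r) = r^{\alpha }\) with \(\alpha \in (0,2)\), for which \(\alpha _1 = \alpha _2 = \alpha \), and scale functions of mixed order such as

```latex
\phi(r) = r^{\alpha_2}\,\mathbbm{1}_{\{r \le 1\}} + r^{\alpha_1}\,\mathbbm{1}_{\{r > 1\}},
\qquad 0 < \alpha_1 \le \alpha_2 < 2,
```

which is strictly increasing with \(\phi (0) = 0\), \(\phi (1) = 1\) and satisfies (4.1) with \(C = 1\).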

For a detailed discussion of the setup, we refer to [17].

Consider symmetric jumping kernels \(k : (0,T) \times M \times M \rightarrow \mathbb {R}\) satisfying for some \(\Lambda > 0\)

$$\begin{aligned} k(t;x,y) \le \Lambda \, \mu (B_{d(x,y)}(x))^{-1} \phi (d(x,y))^{-1}, ~~ t \in (0,T),~ x,y \in M, \end{aligned}$$
(\(k_{\le }\))

and assume that there is \(\mathcal {F} \subset L^2(M,\mu )\) such that \((\mathcal {E}_t,\mathcal {F})\) is a regular Dirichlet form on \(L^2(M,\mu )\) for every \(t \in (0,T)\), where for every \(u,v \in \mathcal {F}\),

$$\begin{aligned} \mathcal {E}_t(u,v) = \int _M \int _M (u(x)-u(y))(v(x)-v(y))k(t;x,y) \text {d}x \text {d}y \end{aligned}$$

is defined in the usual way. For simplicity, we write \(\text {d}x {:}{=} \mu (\text {d}x)\).

Moreover, we assume that the Faber-Krahn inequality holds true, i.e., that there exist \(c, \nu > 0\) such that for all \(t \in (0,T)\), \(R > 0\), \(x_0 \in M\), \(D \subset B_R(x_0)\) and every \(u \in \mathcal {F}\) with \(u \equiv 0\) in \(M \setminus D\):

$$\begin{aligned} \mathcal {E}_t(u,u) \ge c \phi (R)^{-1}\left( \frac{\mu (B_{R}(x_0))}{\mu (D)} \right) ^{\nu } \Vert u \Vert _{L^2(D)}^2. \end{aligned}$$
(FK)
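In the Euclidean model case (\(M = \mathbb {R}^d\), \(\mu \) the Lebesgue measure, \(\phi (r) = r^{\alpha }\)), FK reduces, up to constants, to the fractional Faber-Krahn inequality (our specialization):

```latex
\mathcal{E}_t(u,u) \ge c\, R^{-\alpha} \left( \frac{R^d}{\vert D \vert} \right)^{\nu}
  \Vert u \Vert_{L^2(D)}^2,
\qquad D \subset B_R(x_0),\ u \equiv 0 \text{ in } \mathbb{R}^d \setminus D.
```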

Let \(L_t\) be the operator associated with \(\mathcal {E}_t\) and \(p(y,s;x,\eta )\) be the fundamental solution to the equation \(\partial _t u - L_t u = 0\).

Theorem 4.1

Let \((M,d,\mu )\) and \(\phi \) be as above, and assume VD, (4.1). Assume that k satisfies \(k_{\le }\) and FK. Then there exists \(c > 0\) such that for every \(0 \le \eta< s < T\), \(x,y \in M\):

$$\begin{aligned} p(y,s;x,\eta ) \le c \left[ \mu (B_{\phi ^{-1}(s-\eta )}(x))^{-1} \wedge \frac{s-\eta }{\mu (B_{d(x,y)}(x)) \phi (d(x,y))} \right] . \end{aligned}$$
(4.2)
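For orientation (our specialization, not part of the statement): in the Euclidean \(\alpha \)-stable-like case \(M = \mathbb {R}^d\), \(\mu \) the Lebesgue measure, \(\phi (r) = r^{\alpha }\), the bound (4.2) reduces to the familiar form

```latex
p(y,s;x,\eta) \le c \left[ (s-\eta)^{-\frac{d}{\alpha}}
  \wedge \frac{s-\eta}{\vert x-y \vert^{d+\alpha}} \right],
```

which is consistent with the upper bound of Theorem 1.1.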

We remark that variants of Theorem 4.1 for time-homogeneous jumping kernels of mixed type on doubling metric measure spaces can be found in several articles, e.g., [14, 15, 17].

We will provide a proof of Theorem 4.1 below. A central ingredient in the proof is the \(L^{\infty }-L^2\)-estimate (4.8) for subsolutions to \(\partial _t u- L_t^{\rho } u = 0\). Its proof is similar to that of Lemma 2.4. However, we provide some details since this seems to be the first time that FK is used for nonlocal parabolic \(L^{\infty }-L^2\)-estimates. Moreover, we provide an \(L^{\infty }-L^1\)-estimate for subsolutions to \(\partial _t u - L_t u = 0\) which allows us to give a direct proof of the on-diagonal upper heat kernel estimate. In the elliptic case, \(L^{\infty }-L^2\)- and \(L^{\infty }-L^1\)-estimates are established via FK, for example, in [17].

Lemma 4.2

Let \((M,d,\mu )\) and k be as in Theorem 4.1. Then there exists \(C_1 > 0\) such that for every \(t_0 \in (0,T)\), \(x_0 \in M\), \(\rho , R > 0\) with \(R \le \rho /2 \wedge \phi ^{-1}(t_0)\) and every subsolution u to \(\partial _t u - L_t^{\rho } u = 0\) in \(I_R(t_0) \times B_R(x_0)\) it holds:

$$\begin{aligned} \sup _{I_{R/2}(t_0) \times B_{R/2}(x_0)} u \le C_1 \left( \frac{\mu (B_{\rho }(x_0))}{\mu (B_R(x_0))} \right) ^{\frac{1}{2}} \mu (B_R(x_0))^{-\frac{1}{2}} \sup _{t \in I_{R}(t_0)} \left( \int _{B_{2\rho }(x_0)} u^2(t,x) \text {d}x \right) ^{1/2}. \end{aligned}$$
(4.3)

Moreover, there exists \(C_2 > 0\) such that for every \(t_0 \in (0,T)\), \(x_0 \in M\), \(R \le \phi ^{-1}(t_0)\) and every subsolution u to \(\partial _t u - L_t u = 0\) in \(I_R(t_0) \times B_R(x_0)\), with \(u \ge 0\) in \(I_R(t_0) \times B_R(x_0)\), it holds:

$$\begin{aligned} \sup _{I_{R/2}(t_0) \times B_{R/2}(x_0)} u \le C_2 \mu (B_R(x_0))^{-1} \sup _{t \in I_{R}(t_0)} \int _{M} |u(t,x)| \text {d}x. \end{aligned}$$
(4.4)

We refer to Sect. 1 for the definition of a subsolution, with \(H^{\alpha /2}(\mathbb {R}^d)\) replaced by \(\mathcal {F}\).

Proof

First, we prove (4.3). Let \(l> k > 0\), \(0 < r \le R \le \rho /2\), \(A(l,R) : = \vert \lbrace (t,x) \in I_{R} \times B_R(x_0) : u(t,x) > l \rbrace \vert \). Let u be a subsolution to \(\partial _t u - L_t^{\rho } u = 0\). First, observe that for every \(t \in I_{R+r}\):

where \(c_1 > 0\), \(\sigma '(R,r) {:}{=} \sup _{x \in B_R(x_0)} \phi (r)^{-1} \frac{\mu (B_{\rho }(x_0))}{\mu (B_{r}(x))}\) and we used \(k_{\le }\). By Caccioppoli’s inequality and (4.7), every subsolution u to \(\partial _t u - L_t^{\rho } u = 0\) in \(I_R \times B_R(x_0)\) satisfies:

where \(c_2 > 0\), \(\sigma (R,r) = \phi (r)^{-1} \vee (\phi (R+r)-\phi (R))^{-1}\) and

where \(c_3,c_4 > 0\), \(\tau : M \rightarrow [0,1]\) is an arbitrary cutoff function such that \(\tau \equiv 1\) in \(B_R(x_0)\), \(\tau \equiv 0\) in \(M \setminus B_{R+\frac{r}{2}}(x_0)\), \(\tau \) is \(4 r^{-1}\)-Lipschitz, and we used FK. By combination of the foregoing two estimates, we obtain for some \(c_5 > 0\):

(4.5)

Let us now fix \(R \in (0,\rho /2]\) and define sequences \(l_i = M(1-2^{-i})\), for \(M > 0\) to be defined later, \(r_i = 2^{-i-1}R\), \(R_{i+1} = R_i - r_{i+1}\), \(R_0 {:}{=} R\), \(A_i = \int _{I_{R_i}} \int _{B_{R_i}(x_0)} w_{l_i}^2(t,x) \text {d}x \text {d}t\). We deduce:

for some \(\gamma ,c_6,c_7 > 0\), using that \(\sigma (R_i,r_i) \le c_8 2^{\gamma _1 i} \phi (R)^{-1}\), \(\sigma '(R_i,r_i) \le c_9 2^{\gamma _2 i} \phi (R)^{-1} \frac{\mu ( B_{\rho }(x_0))}{\mu ( B_{R}(x_0))}\) for some \(c_8,c_9,\gamma _1,\gamma _2 > 0\). The latter follows from the fact that for all \(x \in B_{R}(x_0)\): \(\frac{\mu (B_{\rho }(x_0))}{\mu (B_{r_i}(x))} = \frac{\mu (B_{\rho }(x_0))}{\mu (B_{R}(x))} \frac{\mu (B_{R}(x))}{\mu (B_{r_i}(x))} \le c_{10} 2^{\gamma _3 i}\frac{\mu (B_{\rho }(x_0))}{\mu (B_{R}(x_0))} \), where \(c_{10},\gamma _3 > 0\). Let us choose \(c_{11} = c_7 2^{1+\gamma }\), and

Hence:

$$\begin{aligned} A_i&\le (c_{11} M^{-2\nu } (\phi (R)\mu (B_{R}(x_0)))^{-\nu } ) 2^{\gamma i} A_{i-1}^{1+\nu }, ~~ A_0 \\&\le 2^{-\frac{\gamma }{\nu ^2}} (c_{11} M^{-2\nu }(\phi (R)\mu (B_{R}(x_0)))^{-\nu })^{-\frac{1}{\nu }}, \end{aligned}$$

and we can apply Lemma 7.1 in [31] to deduce that for some \(c_{12} > 0\):

This proves (4.3). Let us now demonstrate how to prove (4.4). Let u be a subsolution to \(\partial _t u - L_t u = 0\). First, we provide a different estimate of the tail term. For every \(t \in I_{R+r}\):

$$\begin{aligned} \sup _{x \in B_R(x_0)} \int _{M \setminus B_{R+ \frac{r}{2}}(x_0)} \vert u(t,y) \vert k(t;x,y) \text {d}y \le c_{13} {\widetilde{\sigma }}'(R,r)\int _{M} \vert u(t,y) \vert \text {d}y, \end{aligned}$$

where \({\widetilde{\sigma }}'(R,r) {:}{=} \sup _{x \in B_{R}(x_0)} \mu (B_r(x))^{-1} \phi (r)^{-1}\) and we applied \(k_{\le }\). As in (4.5), we get:

$$\begin{aligned} \int _{I_R}&\int _{B_R(x_0)} w_l^2(t,x) \text {d}x \text {d}t \le c_{14}\left( \int _{I_{R+r}} \int _{B_{R+r}(x_0)} w_k^2(t,x) \text {d}x \text {d}t\right) ^{1+\nu } \times \\&\times \frac{\phi (R+\frac{r}{2})}{\mu ( B_{R + \frac{r}{2}}(x_0))^{\nu }} (l-k)^{-2\nu } \left( \sigma (R,r) + \frac{{\widetilde{\sigma }}'(R,r)}{l-k} \sup _{t \in I_{R+r}} \int _{M} \vert u(t,x) \vert \text {d}x \right) ^{1+\nu } \end{aligned}$$

for some \(c_{14} > 0\). From now on, let \({\overline{R}} > 0\) be fixed. Moreover, let \(0< {\overline{R}}/2 \le r < R \le {\overline{R}}\) and define sequences \(l_i = M(1-2^{-i})\), for \(M > 0\) to be defined later, \(r_i = 2^{-i-1}(R-r)\), \(R_{i+1} = R_i - r_{i+1}\), \(R_0 {:}{=} R\), \(A_i = \int _{I_{R_i}} \int _{B_{R_i}(x_0)} w_{l_i}^2(t,x) \text {d}x \text {d}t\) and deduce

$$\begin{aligned} A_i&\le c_{16} \frac{2^{\gamma i}}{M^{2\nu }} \frac{\phi (R)}{\mu ( B_R(x_0))^{\nu }} \left( \frac{R}{R-r} \frac{1}{\phi (R-r)} + \frac{1}{M} \left( \frac{R}{R-r} \right) ^{d}\frac{\phi (R-r)^{-1}}{\mu ( B_{R}(x_0))} \sup _{t \in I_{R}} \int _{M} \vert u(t,x) \vert \text {d}x \right) ^{1+\nu } A_{i-1}^{1+\nu } \\&\le c_{17} \frac{2^{\gamma i}}{M^{2\nu }} \frac{\left( \frac{R}{R-r}\right) ^{\alpha _2+1}}{(\phi (R-r)\mu ( B_R(x_0)))^{\nu }} \left( 1 + \frac{1}{M} \frac{\left( \frac{R}{R-r} \right) ^{d-1}}{\mu ( B_{R}(x_0))} \sup _{t \in I_{R}} \int _{M} \vert u(t,x) \vert \text {d}x \right) ^{1+\nu }A_{i-1}^{1+\nu }, \end{aligned}$$

for \(c_{16},c_{17},\gamma > 0\), using (4.1) and that by VD: \(\mu (B_{R_i}(x_0)) \ge \mu (B_{{\overline{R}}/2}(x_0)) \ge c_{18} \mu (B_{{\overline{R}}}(x_0)) \ge c_{18}\mu (B_{R}(x_0))\), \(\sigma (R_i,r_i) \le c_{19}2^{\gamma _4 i}\frac{R}{R-r}\phi (R-r)^{-1}\), \({\widetilde{\sigma }}'(R_i,r_i) \le c_{20} 2^{\gamma _5 i} \phi (R-r)^{-1} \left( \frac{R}{R-r} \right) ^d \mu (B_R(x_0))^{-1}\). The latter follows from the fact that for all \(x \in B_R(x_0)\): \(\mu (B_{r_i}(x))^{-1} = \frac{\mu (B_R(x))}{\mu (B_{r_i}(x))} \mu (B_R(x))^{-1} \le c_{21} 2^{\gamma _6 i} \left( \frac{R}{R-r} \right) ^d \mu (B_R(x_0))^{-1}\). We choose \(c_{22} = c_{17}2^{1+\gamma }\) and

$$\begin{aligned} M {:}{=} \frac{\left( \frac{R}{R-r} \right) ^{d-1}}{\mu ( B_{R}(x_0))} \sup _{t \in I_{R}} \int _{M} \vert u(t,x) \vert \text {d}x + 2^{-\frac{\gamma }{2 \nu ^2}} c_{22}^{\frac{1}{2\nu }} \left[ \frac{\left( \frac{R}{R-r}\right) ^{\alpha _2+1}}{(\phi (R-r)\mu ( B_R(x_0)))^{\nu }} \right] ^{\frac{1}{2\nu }} A_0^{1/2} \end{aligned}$$

and deduce by arguments analogous to those in the first part of the proof:

$$\begin{aligned} \sup _{I_r \times B_r(x_0)} u&\le \left( \frac{R}{R-r} \right) ^{d-1}\frac{1}{\mu ( B_{R}(x_0))} \sup _{t \in I_{R}} \int _{M} \vert u(t,x) \vert \text {d}x\\&\quad + c_{23} \left[ \frac{\left( \frac{R}{R-r}\right) ^{\alpha _2+1}}{(\phi (R-r)\mu ( B_R(x_0)))^{\nu }}\right] ^{\frac{1}{2\nu }} \left( \int _{I_R} \int _{B_R(x_0)} u^2(t,x) \text {d}x \text {d}t \right) ^ {1/2}\\&= I_1 + I_2, \end{aligned}$$

where \(c_{23} > 0\). We further estimate, using \(u^2 \le u \vert u \vert \) on \(I_R \times B_R(x_0)\), (4.1), and Hölder's and Young's inequalities:

$$\begin{aligned} I_2&\le c_{24} \left[ \frac{\left( \frac{R}{R-r}\right) ^{\alpha _2+1}}{(\phi (R-r)\mu ( B_R(x_0)))^{\nu }}\right] ^{\frac{1}{2\nu }} \left( \phi (R) \sup _{t \in I_R} \int _M \vert u(t,x) \vert \text {d}x \, \sup _{I_R \times B_R(x_0)} u \right) ^{1/2}\\&\le \frac{1}{2} \sup _{I_R \times B_R(x_0)} u + c_{24} \left( \frac{R}{R-r} \right) ^{\frac{\alpha _2(1+\nu )+1}{\nu }} \frac{1}{\mu (B_R(x_0))} \sup _{t \in I_R} \int _M \vert u(t,x) \vert \text {d}x, \end{aligned}$$

where \(c_{24} > 0\). Together, we obtain

$$\begin{aligned} \sup _{I_r \times B_r(x_0)} u \le \frac{1}{2} \sup _{I_R \times B_R(x_0)} u + c_{25}\left( \frac{R}{R-r} \right) ^{\delta } \frac{1}{\mu (B_R(x_0))} \sup _{t \in I_R} \int _M \vert u(t,x) \vert \text {d}x \end{aligned}$$

for \(c_{25} > 0\) and \(\delta {:}{=} (d-1) \vee \frac{\alpha _2(1 + \nu ) + 1}{\nu }\). We can apply Lemma 1.1 in [26] to the estimate above and deduce that there exists \(c_{26} > 0\) such that for every \(0< {\overline{R}}/2 \le r < R \le {\overline{R}}\):

$$\begin{aligned} \sup _{I_r \times B_r(x_0)} u \le c_{26} \left( \frac{{\overline{R}}}{R-r} \right) ^{\delta } \frac{1}{\mu (B_{{\overline{R}}}(x_0))} \sup _{t \in I_{{\overline{R}}}} \int _M \vert u(t,x) \vert \text {d}x. \end{aligned}$$
(4.6)

Choosing \(r = {\overline{R}}/2, R = {\overline{R}}\) implies the desired result (4.4). \(\square \)
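Both iterations in the proof rely on the standard fast-geometric-convergence lemma, which we recall in the form we use it (cf. Lemma 7.1 in [31]; the formulation below is ours):

```latex
% If A_i \le K\, b^{i-1} A_{i-1}^{1+\nu} for all i \ge 1, with b > 1, and
% A_0 \le K^{-1/\nu}\, b^{-1/\nu^2}, then by induction
A_i \le b^{-i/\nu} A_0 \longrightarrow 0, \qquad i \to \infty.
```

Since \(A_i\) controls the measure of the super-level sets \(\{u > l_i\}\) with \(l_i \uparrow M\), the conclusion \(A_i \rightarrow 0\) forces \(u \le M\) on the limiting cylinder.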

Now, having established Lemma 4.2, we are ready to prove Theorem 4.1.

Proof of Theorem 4.1

First of all, we observe that by VD, (4.1), \(k_{\le }\):

$$\begin{aligned} \int _{B_{R}(x)} d(x,y)^{2} k(t;x,y) \text {d}y \le c R^2 \phi (R)^{-1}, ~~~~~~\int _{M \setminus B_{R}(x)} k(t;x,y) \text {d}y \le c \phi (R)^{-1} \end{aligned}$$
(4.7)

for every \(t \in (0,T)\), \(x \in M\), \(R > 0\). For a proof, see [17]. Given \(y \in M\), \(\rho > 0\), \(0 \le \eta< s < T\), \(\nu > 1\) with \(s-\eta \le \frac{1}{4\nu }\phi (\rho )\), we define \(H_{y,\rho ,\eta ,s,\nu } = H : [\eta ,s] \times M \rightarrow [0,\infty )\) via

$$\begin{aligned} H(t,x) {:}{=} \left( \frac{\phi (\rho )}{\nu [2(s-\eta ) - (t-\eta )]}\right) ^{-1} \wedge \left( \frac{\phi (\rho )}{\nu [2(s-\eta ) - (t-\eta )]}\right) ^{-\frac{d(x,y)}{3\rho }}. \end{aligned}$$
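As a plausibility check for the first of the two properties below (this computation is ours and only verifies the sign and size of \(-\partial _t H\); the full comparison with \(\Gamma ^{\phi }_{\rho }\) is carried out as in Lemma 3.3), abbreviate \(\tau (t) {:}{=} 2(s-\eta ) - (t-\eta )\) and \(A(t) {:}{=} \frac{\phi (\rho )}{\nu \tau (t)}\), so that \(H = A^{-1} \wedge A^{-\frac{d(\cdot ,y)}{3\rho }}\). For any fixed \(\beta > 0\),

$$\begin{aligned} \partial _t A(t) = \frac{\phi (\rho )}{\nu \tau (t)^2} = \frac{A(t)}{\tau (t)}, \qquad -\partial _t \left( A(t)^{-\beta } \right) = \beta A(t)^{-\beta -1} \partial _t A(t) = \beta \frac{A(t)^{-\beta }}{\tau (t)} \ge 0, \end{aligned}$$

so both branches of the minimum are nonincreasing in \(t\), and since \(\tau (t) \le 2(s-\eta )\), each branch even satisfies \(-\partial _t (A^{-\beta }) \ge \frac{\beta }{2(s-\eta )} A^{-\beta }\).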

With the help of (4.7), (4.1) it is easy to check along the lines of Lemma 3.3 that H satisfies the assumptions of Lemma 3.1, namely for every \(C > 0\) there exists \(\nu > 1\) such that

$$\begin{aligned} C \Gamma ^{\phi }_{\rho }(H^{1/2},H^{1/2}) \le -\partial _t H~~ \text { in } (\eta ,s) \times M,~~~~ H^{1/2} \in L^2((\eta ,s);\mathcal {F}), \end{aligned}$$

where

$$\begin{aligned} \Gamma ^{\phi }_{\rho }(H^{1/2},H^{1/2})(t,x) = \int _{B_{\rho }(x)} (H^{1/2}(t,x) - H^{1/2}(t,z))^2 \mu (B_{d(x,z)}(x))^{-1} \phi (d(x,z))^{-1} \text {d}z. \end{aligned}$$

Next, we recall from Lemma 4.2 that the following \(L^{\infty }-L^2\)-estimate holds true for local subsolutions u to \(\partial _t u - L_t^{\rho } u = 0\) in \(I_R(t_0) \times B_R(x_0)\), where \(I_R(t_0) = (t_0 - \phi (R),t_0)\):

$$\begin{aligned} \sup _{I_{R/2}(t_0) \times B_{R/2}(x_0)} u \le C \left( \frac{\mu (B_{\rho }(x_0))}{\mu (B_R(x_0))} \right) ^{\frac{1}{2}} \mu (B_R(x_0))^{-\frac{1}{2}} \sup _{t \in I_{R}(t_0)} \left( \int _{B_{2\rho }(x_0)} u^2(t,x) \text {d}x \right) ^{1/2}. \end{aligned}$$
(4.8)

Applying Lemma 3.1 with this choice of H, we obtain from (4.8) and the definition of H:

$$\begin{aligned} \vert u(s,y) \vert \le c_1 \left( \frac{\mu (B_{\rho }(y))}{\mu (B_{\phi ^{-1}(s-\eta )}(y))} \right) ^{\frac{1}{2}} \mu (B_{\phi ^{-1}(s-\eta )}(y))^{-\frac{1}{2}} 2^{\frac{\sigma }{6\rho }} \left( \frac{\phi (\rho )}{\nu (s-\eta )}\right) ^{-\frac{\sigma }{6\rho } + \frac{1}{2}} \Vert u_0 \Vert _{L^2(D)} \end{aligned}$$
(4.9)

for \(c_1 > 0\), where \(u,u_0\) are as in Lemma 3.1, \(0 \le \eta< s < T\) with \(s- \eta \le \frac{1}{4\nu } \phi (\rho )\), and \(\sigma > 0\).

Moreover, the following on-diagonal estimates hold for every \(0 \le \eta< s < T\), \(x,y \in M\), \(\rho > 0\):

$$\begin{aligned} p(y,s;x,\eta )&\le c \mu (B_{\phi ^{-1}(s-\eta )}(x))^{-1}, \end{aligned}$$
(4.10)
$$\begin{aligned} p_{\rho }(y,s;x,\eta )&\le c e^{c(s-\eta )\phi (\rho )^{-1}}\mu (B_{\phi ^{-1}(s-\eta )}(x))^{-1}, \end{aligned}$$
(4.11)

for some \(c > 0\). These estimates are proved in Lemma 5.1 and Section 4.4 in [17] using a stochastic approach. A more direct proof, using purely analytic tools, proceeds via the \(L^{\infty }-L^1\)-estimate (4.4).

In fact, given \(0 \le \eta< s < T\), \(x,y \in M\), we apply (4.4) to \((t,z) \mapsto p(y,t;z,\eta )\), choosing \(R {:}{=} \phi ^{-1}(s-\eta )\), \(x_0 {:}{=} x\), \(t_0 {:}{=} s\). Using (2.7), we obtain

$$\begin{aligned} p(y,s;x,\eta ) \le c \mu (B_{\phi ^{-1}(s-\eta )}(x))^{-1} \sup _{t \in I_{\phi ^{-1}(s-\eta )}(s)} \int _{M} p(y,t;z,\eta ) \text {d}z \le c \mu (B_{\phi ^{-1}(s-\eta )}(x))^{-1}, \end{aligned}$$

as desired. Estimate (4.11) is a direct consequence of (4.10) in light of (5.7) and (4.7). Note that the proof of (5.7) is written for \(M = \mathbb {R}^d\) but works in the same way in the current setup.

Combining (4.9) and (4.11), we derive as in Theorem 3.4 for every \(0 \le \eta< s < T\) with \(s- \eta \le \frac{1}{4\nu } \phi (\rho )\), \(x,y \in M\):

$$\begin{aligned} p_{\rho }(y,s;x,\eta )&\le c_2 \left( \frac{\rho }{\phi ^{-1}(s-\eta )} \right) ^{\frac{d}{2}} \mu (B_{\phi ^{-1}(s-\eta )}(x))^{-\frac{1}{2}}\\&\quad \times \mu (B_{\phi ^{-1}(s-\eta )}(y))^{-\frac{1}{2}} 2^{\frac{d(x,y)}{12\rho }} \left( \frac{\phi (\rho )}{\nu (s-\eta )}\right) ^{-\frac{d(x,y)}{12\rho } + \frac{1}{2}} \end{aligned}$$

for \(c_2 > 0\). Finally, we explain how to deduce off-diagonal bounds for p. We get from (5.6):

$$\begin{aligned} p(y,s;x,\eta ) \le p_{\rho }(y,s;x,\eta ) + \int _{\eta }^s P_{\tau }^{\rho } K_{\rho }(y) \text {d}\tau . \end{aligned}$$
(4.12)

We choose \(\rho = \frac{d(x,y)}{12}\left( \frac{d+\alpha _1}{\alpha _1} + \frac{1}{2} + \frac{d}{2\alpha _1}\right) ^{-1}\), so that \(\frac{d(x,y)}{12\rho } - \frac{1}{2} = \frac{d+\alpha _1}{\alpha _1} + \frac{d}{2\alpha _1} = \frac{3d}{2\alpha _1} + 1\), and obtain by VD, (4.1):

$$\begin{aligned}&p_{\rho }(y,s;x,\eta ) \\&\le c_2 \left( \frac{\rho }{\phi ^{-1}(s-\eta )} \right) ^{\frac{d}{2}} \left[ \mu (B_{\phi ^{-1}(s-\eta )}(x)) \mu (B_{\phi ^{-1}(s-\eta )}(y))\right] ^{-\frac{1}{2}} 2^{\frac{d(x,y)}{12\rho }} \left( \frac{\phi (\rho )}{\nu (s-\eta )}\right) ^{-\frac{d(x,y)}{12\rho } + \frac{1}{2}}\\&\le c_3 \left( \frac{d(x,y)}{\phi ^{-1}(s-\eta )}\right) ^{\frac{d}{2}} \left[ \mu (B_{\phi ^{-1}(s-\eta )}(x)) \mu (B_{\phi ^{-1}(s-\eta )}(y))\right] ^{-\frac{1}{2}} \left( \frac{\phi (d(x,y))}{s-\eta } \right) ^{-\frac{3d}{2\alpha _1} - 1} \\&\le c_4 \left( \frac{\phi ^{-1}(\phi (d(x,y)))}{\phi ^{-1}(s-\eta )}\right) ^{\frac{d}{2}} \left( \frac{\mu (B_{\phi ^{-1}(\phi (d(x,y)))}(x)) \mu (B_{\phi ^{-1}(\phi (d(x,y)))}(y))}{\mu (B_{\phi ^{-1}(s-\eta )}(x))\mu (B_{\phi ^{-1}(s-\eta )}(y))}\right) ^{\frac{1}{2}} \frac{\left( \frac{\phi (d(x,y))}{s-\eta } \right) ^{-\frac{3d}{2\alpha _1} - 1} }{\mu (B_{d(x,y)}(x))}\\&\le c_5 \left( \frac{\phi (d(x,y))}{s-\eta }\right) ^{\frac{3d}{2\alpha _1} } \left( \frac{\phi (d(x,y))}{s-\eta } \right) ^{-\frac{3d}{2\alpha _1} - 1} \mu (B_{d(x,y)}(x))^{-1}\\&= c_5\frac{s-\eta }{\mu (B_{d(x,y)}(x)) \phi (d(x,y))} \end{aligned}$$
(4.13)

for \(c_3,c_4,c_5 > 0\). Next, we estimate \(\int _{\eta }^s P_{\tau }^{\rho } K_{\rho }(y) \text {d}\tau \). For this, we compute by \(k_{\le }\) and VD:

$$\begin{aligned} \int _{\eta }^s P_{\tau }^{\rho } K_{\rho }(y) \text {d}\tau&= \sum _{k=1}^{\infty } \int _{\eta }^s P_{\tau }^{\rho } \left[ \mathbbm {1}_{B_{ck\rho }(y) \setminus B_{c(k-1)\rho }(y)} K_{\rho } \right] (y) \text {d}\tau \\&\le c_6 \phi (d(x,y))^{-1} \sum _{k=1}^{\infty } \int _{\eta }^s P_{\tau }^{\rho } \left[ \mathbbm {1}_{B_{ck\rho }(y) \setminus B_{c(k-1)\rho }(y)} \mu (B_{\rho }(\cdot ))^{-1} \right] (y) \text {d}\tau \\&\le c_7\mu (B_{\rho }(x))^{-1}\phi (d(x,y))^{-1} \sum _{k=1}^{\infty } k^{d}\int _{\eta }^s P_{\tau }^{\rho } \mathbbm {1}_{B_{ck\rho }(y) \setminus B_{c(k-1)\rho }(y)} (y) \text {d}\tau \end{aligned}$$

for \(c_6,c_7 > 0\), and \(c > 3 + \frac{6d}{\alpha _1}\). Using (4.9), VD and (4.1), we estimate for \(\tau \in (\eta ,s)\), \(k \ge 2\):

$$\begin{aligned} k^{d} P_{\tau }^{\rho } \mathbbm {1}_{B_{ck\rho }(y) \setminus B_{c(k-1)\rho }(y)} (y)&\le c_8 k^{d} \left( \frac{\mu (B_{k\rho }(y))}{\mu (B_{\phi ^{-1}(\tau -\eta )}(y))} \right) 2^{\frac{c(k-1)}{6}} \left( \frac{\phi (\rho )}{\nu (\tau -\eta )}\right) ^{-\frac{c(k-1)}{6} + \frac{1}{2}}\\&\le c_9 k^{2d} 2^{\frac{c(k-1)}{6}} \left( \frac{\phi (\rho )}{\nu (\tau -\eta )}\right) ^{-\frac{c(k-1)}{6} + \frac{1}{2} + \frac{d}{\alpha _1}}\\&\le c_{10} k^{2d} 2^{\frac{c(k-1)}{6}} 4^{-\frac{c(k-1)}{6} + \frac{1}{2} + \frac{d}{\alpha _1}} \end{aligned}$$

for \(c_8,c_9,c_{10} > 0\), where we used \(s- \eta \le \frac{1}{4\nu } \phi (\rho )\). From (2.7), it follows that

$$\begin{aligned} \int _{\eta }^s P_{\tau }^{\rho } K_{\rho }(y) \text {d}\tau&\le c_{11} \mu (B_{\rho }(x))^{-1} \phi (d(x,y))^{-1} \int _{\eta }^s \left( P^{\rho }_{\tau } \mathbbm {1}_{B_{c\rho }(y)}(y) + \sum _{k=2}^{\infty } k^{2d} 2^{\frac{ck}{6}} 4^{-\frac{ck}{6}}\right) \text {d}\tau \\&\le c_{12} \frac{s-\eta }{\mu (B_{d(x,y)}(x)) \phi (d(x,y))} \end{aligned}$$
(4.14)

for \(c_{11},c_{12} > 0\). Combining (4.12), (4.13), (4.14) we obtain the desired off-diagonal estimate for \(s-\eta \le \frac{1}{4\nu }\phi (\rho )\). Together with the on-diagonal estimate (4.10), we deduce the desired result. \(\square \)
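The convergence of the series \(\sum _{k \ge 2} k^{2d} 2^{\frac{ck}{6}} 4^{-\frac{ck}{6}}\) used in (4.14) rests on the identity \(2^{\frac{ck}{6}} 4^{-\frac{ck}{6}} = 2^{-\frac{ck}{6}}\), which decays geometrically in \(k\). A minimal numerical sanity check of this step (the values of \(d\) and \(\alpha _1\) below are illustrative choices, not taken from the text):

```python
# Sanity check (illustrative, not part of the proof): the series
# sum_{k >= 2} k^{2d} * 2^{ck/6} * 4^{-ck/6} from (4.14) converges,
# since 2^{ck/6} * 4^{-ck/6} = 2^{-ck/6} decays geometrically in k.

def partial_sum(d, c, n_terms):
    """Partial sum over k = 2, ..., n_terms + 1."""
    # The factor 2^{ck/6} * 4^{-ck/6} is evaluated in the combined
    # form 2^{-ck/6} to avoid floating-point overflow for large k.
    return sum(k ** (2 * d) * 2.0 ** (-c * k / 6)
               for k in range(2, n_terms + 2))

d, alpha_1 = 3, 1.0              # illustrative sample values
c = 3 + 6 * d / alpha_1 + 1      # any c > 3 + 6d/alpha_1, as in the proof

s_short, s_long = partial_sum(d, c, 200), partial_sum(d, c, 400)
print(s_long - s_short)          # negligible: the partial sums stabilize
```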

Remark

Note that the proof of Theorem 4.1 does not require the scaling argument from [14] since we are working with an on-diagonal estimate and an \(L^{\infty }-L^2\)-estimate that take into account the parabolic scaling of the corresponding equation, see also [15].