1 Introduction

It is well known that a priori Hölder estimates play a key role in the study of nonlinear equations. There is a wealth of literature on this topic for differential operators. In this paper, we study a priori Hölder estimates of parabolic functions for a large class of time-dependent nonlocal operators with (time-dependent) gradient drifts on \({\mathbb {R}}^d\)

$$\begin{aligned} {\fancyscript{L}}^b_t u ={\fancyscript{L}}_t u+b_t\cdot \nabla u. \end{aligned}$$

Here \(b_t(x)\) is an \({\mathbb {R}}^d\)-valued measurable function on \([0, \infty ) \times {\mathbb {R}}^d\) and \({\fancyscript{L}}_t\) is a time-dependent purely nonlocal operator given by

$$\begin{aligned} {\fancyscript{L}}_t u(x)=\int _{{\mathbb {R}}^d}[u(x+z)+u(x-z)-2u(x)]\kappa _t(x,z){\mathord {\mathrm{d}}}z, \end{aligned}$$
(1.1)

where \(\kappa _t(x,z)\) is symmetric in \(z\) (i.e., \(\kappa _t(x,z)=\kappa _t(x,-z)\)) and satisfies

$$\begin{aligned} \sup _{t,x}\int _{{\mathbb {R}}^d}(1\wedge |z|^2)\kappa _t(x,z){\mathord {\mathrm{d}}}z\leqslant C_1, \end{aligned}$$
(1.2)

and for some regularly varying function \(\phi \) with index \(\alpha \in [0,2]\) (see Definition 3.1 below),

$$\begin{aligned} \frac{c_1}{\phi (|z|)|z|^d}\leqslant \kappa _t(x,z)\leqslant&\frac{c_1^{-1}}{\phi (|z|)|z|^d} \quad \hbox {for } |z|\leqslant 3. \end{aligned}$$
(1.3)

Note that under condition (1.2), \({\fancyscript{L}}_t u(x)\) is well defined and bounded for every \(u\in C^2_b({\mathbb {R}}^d)\). Since \(\kappa _t (x, z)\) is symmetric in \(z\), we can rewrite \({\fancyscript{L}}_t u(x)\) in (1.1) as

$$\begin{aligned} {\fancyscript{L}}_t u(x)&= 2 \mathrm{p.v.} \int _{{\mathbb {R}}^d} \left( u(x+z)-u(x)\right) \kappa _t (x, z) {\mathord {\mathrm{d}}}z \\&= 2 \int _{{\mathbb {R}}^d} \left( u(x+z)-u(x)-\nabla u(x) \cdot z {\mathbf {1}}_{|z|\leqslant 1}\right) \kappa _t (x, z) {\mathord {\mathrm{d}}}z \end{aligned}$$

for \(u\in C^2_b ({\mathbb {R}}^d)\).

When \(\kappa _t (x, z)= c |z|^{-d-\alpha }\) for some suitable constant \(c>0\) and \(\alpha \in (0, 2)\), \({\fancyscript{L}}_t\) is just the usual fractional Laplacian \(\Delta ^{\alpha /2}\). Recently Silvestre [13] proved the following a priori Hölder estimate for the fractional-diffusion equation with drift

$$\begin{aligned} \partial _t u=\Delta ^{ {\alpha }/{2}}u+b_t\cdot \nabla u. \end{aligned}$$
(1.4)

There are constants \(C>0\) and \(\beta \in (0,1)\) such that for any classical solution \(u\) of Eq. (1.4), \(x, y\in {\mathbb {R}}^d\) with \(|x-y|\leqslant 1\) and \(0<s\leqslant t\leqslant 1\),

$$\begin{aligned} |u(t,x)-u(s,y)|\leqslant C \Vert u \Vert _{L^\infty ([0, 1]\times {\mathbb {R}}^d)} \, \frac{|x-y|^\beta +|t-s|^{\beta /\alpha }}{t^{\beta /\alpha }}, \end{aligned}$$
(1.5)

provided \(b\) is in the Hölder class \(C^{1-\alpha }\) when \(\alpha \in (0,1)\), and is bounded measurable when \(\alpha \in [1,2)\), where the constants \(C\) and \(\beta \) depend only on \(d,\alpha \), \(\Vert b\Vert _\infty \), as well as on the \((1-\alpha )\)-Hölder norm of \(b\) when \(\alpha \in (0,1)\). See [9] for a recent result on local regularity of solutions to

$$\begin{aligned} \Delta ^{ {\alpha }/{2}}u+b (x) \cdot \nabla u =f \end{aligned}$$

in Sobolev spaces with \(\alpha \in (0, 1)\).

In the literature, if \(\alpha \in (1,2)\), Eq. (1.4) is usually referred to as the subcritical case since the fractional Laplacian \(\Delta ^{{\alpha }/{2}}\) is of higher order than the gradient part \(b_t \cdot \nabla \). For \(\alpha =1\), it is called the critical case since the fractional Laplacian has the same order as the first order gradient term. For \(\alpha \in (0,1)\), it is known as the supercritical case because the fractional Laplacian is of lower order than the drift term, and the drift term can be stronger than the diffusion term at small scales. This explains why one needs \(b\) to be Hölder continuous for the above a priori estimate in the supercritical case. It should be noted that the following scaling property plays a crucial role in [13]: for \(\lambda >0\), if we let \(u^\lambda (t,x):=\lambda ^{-\alpha }u(\lambda ^\alpha t, \lambda x)\) and \(b^\lambda (t,x):=b(\lambda ^\alpha t, \lambda x)\), then \(u^\lambda \) satisfies

$$\begin{aligned} \partial _t u^\lambda =\Delta ^{{\alpha }/{2}}u^\lambda +\lambda ^{\alpha -1}b^\lambda \cdot \nabla u^\lambda . \end{aligned}$$

We mention that the Hölder estimate (1.5) has been used in the study of well-posedness of multidimensional critical Burgers equations in Zhang [14].

In this paper, we are concerned with the following nonlocal-diffusion equation with drift \(b\):

$$\begin{aligned} \partial _t u ={\fancyscript{L}}^b_t u ={\fancyscript{L}}_t u+b_t\cdot \nabla u. \end{aligned}$$
(1.6)

Following [11], we define

$$\begin{aligned} \Phi (r):=\left( \int ^2_r\frac{{\mathord {\mathrm{d}}}s}{s \phi (s)}\right) ^{-1},\ \ r\in (0,1), \end{aligned}$$
(1.7)

which is a continuous increasing function. The purpose of introducing this function \(\Phi \) is to deal with the case when \(\phi \) in (1.3) is a regularly varying function of order \(0\). It is known (see (3.5) below) that \(\lim _{r\rightarrow 0}\frac{\Phi (r)}{\phi (r)}=\alpha \). Thus \(\phi \) and \(\Phi \) are comparable when \(\alpha \in (0, 2]\) so we could use \(\phi \) in place of \(\Phi \) in this case. The function \(\Phi \) will be used to measure the modulus of continuity of parabolic functions; see Theorem 1.1. Such a result would be trivial if \(\lim _{r\rightarrow 0} \Phi ( r ) >0\). Thus without loss of generality, we will assume in this paper that

$$\begin{aligned} \lim _{r\rightarrow 0} \Phi ( r ) =0. \end{aligned}$$
(1.8)

Note that

$$\begin{aligned} \lim _{r\rightarrow 0}\Phi (r)=0\iff \int ^2_0\frac{{\mathord {\mathrm{d}}}s}{s\phi (s)} =\infty \mathop {\iff }\limits ^{(1.3)} \int _{{\mathbb {R}}^d}\kappa _t(x,z){\mathord {\mathrm{d}}}z=\infty . \end{aligned}$$
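The relation between \(\phi \) and \(\Phi \) can be illustrated numerically. The following is a minimal sketch, assuming the model case \(\phi (s)=s^\alpha \) with \(\alpha =1/2\). The helper name `intrinsic_scale` and the midpoint quadrature are illustrative choices and not part of the paper. It evaluates \(\Phi \) from (1.7) and prints the ratio \(\Phi (r)/\phi (r)\), which should approach \(\alpha \) as \(r\rightarrow 0\).

```python
import math

def intrinsic_scale(phi, r, n=20000):
    """Approximate Phi(r) = 1 / int_r^2 ds / (s * phi(s)) from (1.7).

    The substitution s = exp(v) turns the integral into
    int_{ln r}^{ln 2} dv / phi(exp(v)), evaluated by the midpoint rule.
    """
    lo, hi = math.log(r), math.log(2.0)
    h = (hi - lo) / n
    total = sum(1.0 / phi(math.exp(lo + (k + 0.5) * h)) for k in range(n))
    return 1.0 / (total * h)

alpha = 0.5                      # assumed index, chosen for illustration
phi = lambda s: s ** alpha       # model regularly varying function

for r in (1e-2, 1e-4, 1e-6):
    ratio = intrinsic_scale(phi, r) / phi(r)
    print(f"r = {r:.0e}   Phi(r)/phi(r) = {ratio:.5f}")
```

In this model case the integral can also be computed by hand, \(\Phi (r)=\alpha /(r^{-\alpha }-2^{-\alpha })\), which gives a direct check of the quadrature.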

In [11], Kassmann and Mimica called \(\Phi \) an intrinsic scaling function and obtained \(\Phi \)-Hölder regularity of harmonic functions for the (time-independent) nonlocal operator \({\fancyscript{L}}_0\) with \(b_0=0\) under conditions (1.2), (1.3) and (1.8). When \(\kappa _0 (x, z)\) satisfies (1.3) for all \(z\in {\mathbb {R}}^d\) with \(\phi (r)=r^\alpha \) for some \(0<\alpha <2\) and \(b_0=0\), Hölder regularity of harmonic functions of \({\fancyscript{L}}_0\) was first established in Bass and Levin [2]. For a priori Hölder estimates for symmetric jump processes, see, e.g., [5, 6].

In this paper, we use “\(:=\)” as a way of definition. Let \({\mathbb {R}}^+:=[0, \infty )\), and \(C^{\infty }_c ({\mathbb {R}}^+\times {\mathbb {R}}^d)\) the space of smooth functions with compact support in \({\mathbb {R}}^+\times {\mathbb {R}}^d \). A probability measure \({\mathbf {Q}}\) on the Skorokhod space \({\mathbb {D}}([0, \infty ); {\mathbb {R}}^+\times {\mathbb {R}}^d)\) is said to be a solution to the martingale problem for \(({\fancyscript{L}}_t^b, C^{\infty }_c({\mathbb {R}}^+\times {\mathbb {R}}^d))\) with initial value \((t, x)\in {\mathbb {R}}^+ \times {\mathbb {R}}^d\) if \({\mathbf {Q}}(Z_0=(t, x))=1\) and for every \(f\in C^{\infty }_c({\mathbb {R}}^+\times {\mathbb {R}}^d)\),

$$\begin{aligned} M^f_s:=f(s+t, X_s) -f(t, X_0)-\int _0^s (\partial _r+{\fancyscript{L}}_{t+r}^b) f(t+r, X_r) {\mathord {\mathrm{d}}}r \end{aligned}$$
(1.9)

is a \({\mathbf {Q}}\)-martingale. The martingale problem for \(({\fancyscript{L}}^b_t, C^\infty _c({\mathbb {R}}^+\times {\mathbb {R}}^d))\) with initial value \((t, x)\in {\mathbb {R}}^+\times {\mathbb {R}}^d\) is said to be well posed if it has a unique solution.

Throughout this paper, we assume that conditions (1.2), (1.3) and (1.8) hold. Here are our main results.

Theorem 1.1

Assume that the martingale problem for \(({\fancyscript{L}}^b_t, C^\infty _c ({\mathbb {R}}^+\times {\mathbb {R}}^d))\) is well posed for every initial value \((t, x)\in {\mathbb {R}}^+\times {\mathbb {R}}^d\). Let \(\Phi \) be defined by (1.7).

  1. (i)

    Suppose that   \(\displaystyle \liminf _{r\rightarrow 0}r/\Phi (r)=0\) and for some \(C_2>0\),

    $$\begin{aligned} \frac{\Phi (r)}{\Phi (s)}\leqslant C_2\frac{r}{s} \quad \hbox {for } \ 0<s\leqslant r\leqslant 1, \end{aligned}$$
    (1.10)

    and \(b_t (x)\) is continuous in \(x\) for each \(t>0\) having

    $$\begin{aligned} \Vert b/(1+|x|)\Vert _\infty :=\sup _{t\in [0, 1], x\in {\mathbb {R}}^d}|b_t(x)|/(1+|x|)<\infty \end{aligned}$$

    with

    $$\begin{aligned} |b_t(x)-b_t(y)|\leqslant C_b|x-y|/\Phi (|x-y|) \quad \hbox {for }\ |x-y|\leqslant 1 \end{aligned}$$
    (1.11)

    for some \(C_b>0\). Then there are constants \(\beta \in (0,1)\) and \(\lambda =\lambda (\Vert b/(1+|x|)\Vert _\infty )>0\) such that for any classical solution \(u\) of (1.6), and for any \(0<t\leqslant t_0\leqslant 1\) and \(|x_0-x|+ \lambda (t_0-t) \leqslant \Phi ^{-1}(t_0)\),

    $$\begin{aligned} |u(t_0,x_0)\!-\!u(t,x)|\leqslant \! 16 \Vert u\Vert _{L^\infty ([0, 1]\times {\mathbb {R}}^d)}\, t_0^{-\beta }((t_0-t)\!+\!\Phi (|x_0\!-\!x|\!+\!\lambda (t_0-t)))^\beta . \end{aligned}$$
    (1.12)
  2. (ii)

    Suppose   \(\displaystyle \liminf _{r\rightarrow 0}r/\Phi (r)>0\) and \(b\) is a bounded measurable function on \({\mathbb {R}}^+\times {\mathbb {R}}^d\). Then there is a constant \(\beta \in (0,1)\) such that for any classical solution \(u\) of (1.6), and for any \(0<t\leqslant t_0\leqslant 1\) and \(|x_0-x|\leqslant \Phi ^{-1}(t_0)\),

    $$\begin{aligned} |u(t_0,x_0)-u(t,x)|\leqslant 16 \Vert u\Vert _{L^\infty ([0, 1]\times {\mathbb {R}}^d)}\, t_0^{-\beta }((t_0-t)+\Phi (|x_0-x|))^\beta . \end{aligned}$$
    (1.13)

Remark 1.2

Condition (1.10) is automatically satisfied by (3.1) below when \(\phi \) (also \(\Phi \)) is regularly varying with \(\alpha \in [0,1)\). Moreover, condition (1.10) is satisfied when \(\Phi \) is comparable to a concave function \(\Psi \) with \(\Psi (0)=0\), as in this case \(r/\Psi (r)\) is increasing. The following table gives some examples of regularly varying functions \(\phi \) that satisfy conditions (1.2), (1.3) and (1.8), where \(\alpha \) stands for the index of the regularly varying function \(\phi \), and the last column gives the modulus of continuity in the spatial variable \(x\) required for the \({\mathbb {R}}^d\)-valued function \(b_t (x)\).

| Case | \(\alpha \) | \(\phi (s)\) | \(\Phi (r)\) | \(b\) |
| --- | --- | --- | --- | --- |
| (i) | \(0\) | \(\ln \frac{3}{s}\) | \(\asymp (\ln \ln \frac{3}{r})^{-1}\) | \(s\ln \ln \frac{3}{s}\) |
| (i) | \(0\) | \(1\) | \((\ln \frac{2}{r})^{-1}\) | \(s\ln \frac{2}{s}\) |
| (i) | \(0\) | \((\ln \frac{3}{s})^{-1}\) | \(\asymp (\ln \frac{3}{r})^{-2}\) | \(s(\ln \frac{3}{s})^2\) |
| (i) | \((0,1)\) | \(s^\alpha \) | \(\asymp r^{\alpha }\) | \(s^{1-\alpha }\) |
| (i) | \(1\) | \(s(\ln \frac{3}{s})^\beta ,\ \beta >0\) | \(\asymp r(\ln \frac{3}{r})^\beta \) | \((\ln \frac{3}{s})^{-\beta }\) |
| (ii) | \(1\) | \(s(\ln \frac{3}{s})^\beta ,\ \beta \leqslant 0\) | \(\asymp r(\ln \frac{3}{r})^\beta \) | \(1\) |
| (ii) | \((1,2)\) | \(s^\alpha \) | \(\asymp r^{\alpha }\) | \(1\) |
| (ii) | \(2\) | \(s^2(\ln \frac{3}{s})^{\beta },\ \beta >1\) | \(\asymp r^2(\ln \frac{3}{r})^{\beta }\) | \(1\) |
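The index-zero examples of Remark 1.2 can be checked numerically. The sketch below assumes a simple midpoint quadrature for (1.7). The helper `intrinsic_scale` and the closed forms are illustrative additions, the latter obtained here by integrating \(1/(s\phi (s))\) by hand. It compares the quadrature with those closed forms and prints \(r/\Phi (r)\), the modulus of continuity required of \(b\) in (1.11).

```python
import math

def intrinsic_scale(phi, r, n=20000):
    """Midpoint-rule approximation of Phi(r) = 1 / int_r^2 ds / (s * phi(s))."""
    lo, hi = math.log(r), math.log(2.0)
    h = (hi - lo) / n
    total = sum(1.0 / phi(math.exp(lo + (k + 0.5) * h)) for k in range(n))
    return 1.0 / (total * h)

# name -> (phi, closed form of Phi obtained by integrating 1/(s*phi(s)) by hand)
rows = {
    "phi = ln(3/s)": (
        lambda s: math.log(3.0 / s),
        lambda r: 1.0 / (math.log(math.log(3.0 / r)) - math.log(math.log(1.5))),
    ),
    "phi = 1": (
        lambda s: 1.0,
        lambda r: 1.0 / math.log(2.0 / r),
    ),
    "phi = 1/ln(3/s)": (
        lambda s: 1.0 / math.log(3.0 / s),
        lambda r: 2.0 / (math.log(3.0 / r) ** 2 - math.log(1.5) ** 2),
    ),
}

r = 1e-3
for name, (phi, closed_form) in rows.items():
    numeric = intrinsic_scale(phi, r)
    # r / Phi(r) is the modulus of continuity required of b in (1.11)
    print(f"{name:16s} Phi(r) = {numeric:.6f}  closed form = {closed_form(r):.6f}"
          f"  r/Phi(r) = {r / numeric:.6f}")
```

Up to the additive constants coming from the upper limit \(2\), the closed forms reproduce the orders \((\ln \ln \frac{3}{r})^{-1}\), \((\ln \frac{2}{r})^{-1}\) and \((\ln \frac{3}{r})^{-2}\) listed in the table.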

Remark 1.3

When \(\phi (r)=r^\alpha \) with \(\alpha \in (0, 2)\) (and so \(\Phi (r)\) is comparable to \(r^\alpha \)), condition \(\displaystyle \liminf _{r\rightarrow 0}r/\Phi (r)=0\) corresponds precisely to the supercritical case \(0<\alpha <1\), while \(\displaystyle \liminf _{r\rightarrow 0}r/\Phi (r)>0\) corresponds to \(1\leqslant \alpha <2\). So Theorem 1.1 contains the main results of Silvestre [13] as a particular case. Unlike [13], in the supercritical case (i), we do not need to assume that \(b\) is bounded. Moreover, we can deal with nonlocal operators \({\fancyscript{L}}^b_t\) that are not only more general but also time-dependent. In particular, taking \(b=0\), we obtain a priori \(\Phi \)-Hölder estimates for parabolic functions of the time-dependent nonlocal operator \({\fancyscript{L}}_t\). This contains as a special case a priori \(\Phi \)-Hölder estimates for harmonic functions \(u(x)\) of the nonlocal operator \({\fancyscript{L}}_0\), which is the main result of Kassmann and Mimica [11] when \(b_0=0\). \(\square \)

Remark 1.4

In this paper, we concentrate on a priori Hölder estimates for parabolic functions. For results on the well-posedness of the martingale problems for \({\fancyscript{L}}^b_t\), we refer the reader to [1, 7, 8] and the references therein. We remark here that if \(\kappa _t (x, z)\) is independent of \((t, x)\), then \({\fancyscript{L}}_t={\fancyscript{L}}_0\) is the generator of a Lévy process \(Y\). When \(b_t (x)\) is uniformly Lipschitz in \(x\), it is easy to show that for every initial datum \((t, x)\in {\mathbb {R}}^+\times {\mathbb {R}}^d\), the stochastic differential equation

$$\begin{aligned} {\mathord {\mathrm{d}}}X^t_s = {\mathord {\mathrm{d}}}Y_s + b_{t+s} (X^t_s) {\mathord {\mathrm{d}}}s \end{aligned}$$
(1.14)

has a unique strong solution \(X^t\) with \(X^t_0=x\). (This can be done as follows. For each \(\omega \in \Omega \), the ODE \({\mathord {\mathrm{d}}}Z^t_s = b_{t+s} (Z^t_s +Y_s(\omega )) {\mathord {\mathrm{d}}}s\) with \(Z^t_0 (\omega ) =x\) has a unique solution. Then \(X^t_s := Z^t_s+Y_s\) is the unique solution to SDE (1.14) with \(X^t_0=x\).) Hence in this case, the martingale problem for \(({\fancyscript{L}}^b_t, C^\infty _c ({\mathbb {R}}^+\times {\mathbb {R}}^d))\) is well posed and so Theorem 1.1 is applicable. It is important to note that the \(\Phi \)-Hölder estimate in Theorem 1.1 does not depend on the Lipschitz constant of \(b_t (x)\).

When the martingale problem for \(({\fancyscript{L}}^b_t, C^\infty _c ({\mathbb {R}}^+\times {\mathbb {R}}^d))\) is well posed for every initial value \((t, x)\in {\mathbb {R}}^+\times {\mathbb {R}}^d\), there is a space-time Hunt process \(Z_s=(V_0+s, X_s)\) having \(\partial _s + {\fancyscript{L}}^b_{V_0+s}\) as its infinitesimal generator. For any bounded classical solution \(u\) of (1.6), by Itô’s formula, \(u(s+t, X_s)\) is a \({\mathbf {P}}_{(t, x)}\)-martingale for every \((t, x)\in {\mathbb {R}}^+\times {\mathbb {R}}^d\). This is the only property of \(u\) that we use in the proof of Theorem 1.1. Hence the conclusion of Theorem 1.1 holds for any bounded function \(u\) on \({\mathbb {R}}^+\times {\mathbb {R}}^d\) such that \(u(s+t, X_s)\) is a \({\mathbf {P}}_{(t, x)}\)-martingale for every \((t, x)\in {\mathbb {R}}^+\times {\mathbb {R}}^d\). \(\square \)
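The pathwise construction of the strong solution to (1.14) described in Remark 1.4 can be sketched numerically: for a fixed driving path, solve the ODE for \(Z\) and add the path back. The following is only an illustration under several assumptions that are not in the paper: dimension \(d=1\), an explicit Euler scheme on a uniform grid, and a compound Poisson path as a stand-in driver (which has finite jump intensity and therefore does not satisfy (1.8)). The function names are hypothetical.

```python
import math
import random

def solve_pathwise(b, y_path, x0, t0, dt):
    """Euler scheme for dZ_s = b(t0 + s, Z_s + Y_s) ds, Z_0 = x0; returns X = Z + Y.

    y_path[k] is the given value of the driving path Y at time k * dt, with Y_0 = 0.
    """
    z = x0
    xs = [z + y_path[0]]
    for k in range(len(y_path) - 1):
        z += b(t0 + k * dt, z + y_path[k]) * dt
        xs.append(z + y_path[k + 1])
    return xs

def compound_poisson_path(n_steps, dt, rate, rng):
    """Stand-in driver (assumption): symmetric compound Poisson path on the grid."""
    y, path = 0.0, [0.0]
    for _ in range(n_steps):
        if rng.random() < rate * dt:   # at most one jump per step, valid for small dt
            y += rng.choice((-1.0, 1.0)) * rng.random()
        path.append(y)
    return path

rng = random.Random(0)
dt, n_steps = 1e-3, 1000
y = compound_poisson_path(n_steps, dt, rate=5.0, rng=rng)
x = solve_pathwise(lambda t, x: -x, y, x0=1.0, t0=0.0, dt=dt)  # Lipschitz drift b_t(x) = -x
print(f"X at time {n_steps * dt:.1f}: {x[-1]:.4f}")
```

The scheme never differentiates the driving path, which mirrors the point of the remark: only the Lipschitz continuity of \(b_t(x)\) in \(x\) is needed to solve the ODE path by path.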

The approach of this paper is purely probabilistic. Our tool is the time-inhomogeneous strong Markov process \(X\) determined by the solution of the martingale problem for \(({\fancyscript{L}}^b_t, C^\infty _c ({\mathbb {R}}^+\times {\mathbb {R}}^d))\). In Sect. 2, we prove an abstract result on Hölder continuity in terms of certain estimates on exit and hitting probabilities, which is motivated by the approaches in [2, 5]. This probabilistic approach has its origin in Krylov and Safonov [12] for diffusion processes associated with second order nondivergence form differential operators. In Sect. 3, we prove our main results by verifying the abstract conditions. The Lévy system of the strong Markov process \(X\) plays a key role in establishing these exit and hitting probabilities.

2 An Abstract Criterion for Hölder’s Regularity

Let \(\Omega \) be the space of càdlàg functions from \({\mathbb {R}}^+ =[0,\infty )\) to \({\mathbb {R}}^d\), which is endowed with the Skorokhod topology. Let \(X_s(\omega )=\omega _s\) be the coordinate process over \(\Omega \). Define the space-time process

$$\begin{aligned} Z_s:=(V_s, X_s),\ \ V_s:=V_0+s. \end{aligned}$$

Let \(\{{\fancyscript{F}}_s^0; s\geqslant 0\}\) be the natural filtration generated by \(X\). Suppose \(\{{\mathbb {P}}_{(t,x)}; t\geqslant 0, x\in {\mathbb {R}}^d\}\) is a family of probability measures over \((\Omega ,{\fancyscript{F}}^0_\infty )\) so that \(Z=(\Omega ,{\fancyscript{F}}^0_\infty ,{\fancyscript{F}}^0_s, Z_s, {\mathbb {P}}_{(t,x)})\) is a time-homogeneous strong Markov process with state space \({\mathbb {R}}^+\times {\mathbb {R}}^d\) satisfying

$$\begin{aligned} {\mathbb {P}}_{(t,x)}\big (Z_0=(t,x)\big )=1. \end{aligned}$$

Denote by \(\{{\fancyscript{F}}_s: s\geqslant 0\}\) the minimal augmented filtration of \(Z\). Note that under \({\mathbb {P}}_{(t, x)}\), \(\{X^t_s:=X_{s+t}; s\geqslant 0\}\) is a possibly time-inhomogeneous strong Markov process with

$$\begin{aligned} {\mathbb {P}}_{(t,x)}(X^t_s=x,\ s\in [0,t])=1. \end{aligned}$$

For a Borel set \(A\subset {\mathbb {R}}^+\times {\mathbb {R}}^d\), denote by \(\sigma _A,\tau _A\) the hitting time and exit time of \(A\), i.e.,

$$\begin{aligned} \sigma _A:=\inf \{s\geqslant 0: Z_s\in A\},\quad \tau _A:=\inf \{s\geqslant 0: Z_s\notin A\}. \end{aligned}$$

Definition 2.1

A nonnegative Borel measurable function \(u(t,x)\) on \({\mathbb {R}}^+\times {\mathbb {R}}^d\) is called \(Z\)-harmonic (or simply parabolic) in a relatively open subset \(D\) of \({\mathbb {R}}^+\times {\mathbb {R}}^d\) if for each relatively compact open subset \(A\subset D\) and every \((t,x)\in A\),

$$\begin{aligned} u(t,x)={\mathbb {E}}_{(t,x)}[u(Z_{\tau _A})]. \end{aligned}$$
(2.1)

Remark 2.2

Condition (2.1) is equivalent to the following. For any (\({\fancyscript{F}}_s\))-stopping time \(\tau \),

$$\begin{aligned} u(t,x)={\mathbb {E}}_{(t,x)}{\Big [}u(Z_{\tau \wedge \tau _A}){\Big ]}. \end{aligned}$$
(2.2)

Indeed, assume that (2.1) holds for any \((t,x)\in A\). In view of \(Z_\tau \in A\) on \(\{\tau <\tau _A\}\), we have

$$\begin{aligned} {\mathbf {1}}_{\tau <\tau _A}u(Z_\tau )={\mathbf {1}}_{\tau <\tau _A}{\mathbb {E}}_{Z_\tau }[u(Z_{\tau _A})]. \end{aligned}$$

Let \(\{\theta _t; t\geqslant 0\}\) be the usual shift operators on \(\Omega \). By the strong Markov property, we have

$$\begin{aligned} {\mathbb {E}}_{(t,x)}{\Big [}{\mathbf {1}}_{\tau <\tau _A}u(Z_\tau ){\Big ]}&={\mathbb {E}}_{(t,x)}{\Big [}{\mathbf {1}}_{\tau <\tau _A} {\mathbb {E}}_{Z_{\tau }}[u(Z_{\tau _A})]{\Big ]}\\&={\mathbb {E}}_{(t,x)}{\Big [}{\mathbf {1}}_{\tau <\tau _A}{\mathbb {E}}_{(t,x)}{\Big [}u(Z_{\tau _A}\circ \theta _\tau )|{\fancyscript{F}}_\tau {\Big ]}{\Big ]}\\&={\mathbb {E}}_{(t,x)}{\Big [}{\mathbf {1}}_{\tau <\tau _A}u(Z_{\tau _A}\circ \theta _\tau ){\Big ]}\quad \hbox {since } \{\tau <\tau _A\}\in {\fancyscript{F}}_\tau . \end{aligned}$$

Since \(\tau _A=\tau +\tau _A\circ \theta _\tau \) on \(\{\tau <\tau _A\}\) and \(Z_{\tau _A}\circ \theta _\tau =Z_{\tau +\tau _A\circ \theta _\tau }\), we obtain

$$\begin{aligned} {\mathbb {E}}_{(t,x)}{\Big [}{\mathbf {1}}_{\tau <\tau _A}u(Z_\tau ){\Big ]}={\mathbb {E}}_{(t,x)}{\Big [}{\mathbf {1}}_{\tau <\tau _A}u(Z_{\tau _A}){\Big ]}, \end{aligned}$$

which implies that

$$\begin{aligned} {\mathbb {E}}_{(t,x)}{\Big [}u(Z_{\tau \wedge \tau _A}){\Big ]}&={\mathbb {E}}_{(t,x)} {\Big [}{\mathbf {1}}_{\tau <\tau _A}u(Z_\tau ){\Big ]}+{\mathbb {E}}_{(t,x)}{\Big [}{\mathbf {1}}_{\{\tau \geqslant \tau _A\}} u(Z_{\tau _A}){\Big ]}\\&={\mathbb {E}}_{(t,x)}{\Big [}u(Z_{\tau _A}){\Big ]}=u(t,x). \end{aligned}$$

Let \(\Phi :(0,2)\rightarrow [0,\infty )\) be a continuous and strictly increasing function with \(\Phi (1)=1\). Write for \(r>0\),

$$\begin{aligned} B(r):=\{z\in {\mathbb {R}}^d: |z|<r\} \quad \hbox {and} \quad Q(r):=[0,\Phi (r))\times B(r). \end{aligned}$$

Define

$$\begin{aligned} \varphi _a(r):=\Phi ^{-1}(a\Phi (r)) \text{ for } \; a>0, \quad D_a(r):=Q(\varphi _a(r))\setminus Q(\varphi _{\sqrt{a}}(r)) \text{ for }\; a>1. \end{aligned}$$
(2.3)

Notice that

$$\begin{aligned} \varphi _1(r)=r \quad \text{ and } \quad a\mapsto \varphi _a(r) \text{ is } \text{ strictly } \text{ increasing. } \end{aligned}$$
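To make the rescaled radii in (2.3) concrete, the following minimal sketch computes \(\varphi _a(r)=\Phi ^{-1}(a\Phi (r))\) by bisection. It assumes the model scale function \(\Phi (r)=r^{\alpha }\) with \(\alpha =1/2\) (so that \(\Phi (1)=1\) and \(\varphi _a(r)=a^{1/\alpha }r\) in closed form). The helper names and the bracketing interval are illustrative choices.

```python
import math

def inverse_increasing(f, target, lo, hi, iters=200):
    """Bisection inverse of a continuous strictly increasing f on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def varphi(Phi, a, r, lo=1e-12, hi=2.0):
    """varphi_a(r) = Phi^{-1}(a * Phi(r)) as in (2.3); needs a * Phi(r) < Phi(hi)."""
    return inverse_increasing(Phi, a * Phi(r), lo, hi)

alpha = 0.5                       # assumed index, for illustration only
Phi = lambda r: r ** alpha        # model scale function with Phi(1) = 1

r = 0.01
for a in (1.0, 2.0, 4.0):
    print(f"a = {a}: varphi_a(r) = {varphi(Phi, a, r):.6f}, "
          f"closed form a^(1/alpha) r = {a ** (1.0 / alpha) * r:.6f}")
```

The cylinder \(Q(\varphi _a(r))\) then has time length \(\Phi (\varphi _a(r))=a\Phi (r)\), that is, \(a\) times the time length of \(Q(r)\).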

We make the following assumptions:

(H \(_1\) ) :

There exist constants \(C_{3}, C_{4}\geqslant 1\) such that for each \(a\geqslant C_{4}\) and \(r,R\in (0,1)\) with \(\varphi _a(r)\leqslant R\),

$$\begin{aligned} \sup _{(t_0,x_0)\in Q(r)}{\mathbb {P}}_{(t_0,x_0)}\Big (X_{\tau _{Q(r)}}\notin B(R)\Big )\leqslant C_{3}\frac{\Phi (r)}{\Phi (R)}. \end{aligned}$$
(2.4)
(H \(_2\) ) :

There is an increasing sequence of positive numbers \(\{a_k; k\geqslant 1\}\subset (1, \infty )\) with \(\lim _{k\rightarrow \infty } a_k=\infty \) such that for every \(a\in \{a_k; k\geqslant 1\}\), there exists a constant \(\gamma _a\in (0,a]\) so that for each \(r\in (0,1)\) with \(\varphi _a(r)\leqslant 1\), there is a Radon measure \(\mu _r\) over \(D_a(r)\) such that for any compact subset \(K\subset D_a(r)\) with \(\mu _r(K)\geqslant \frac{1}{3}\mu _r(D_a(r))\),

$$\begin{aligned} \inf _{(t_0,x_0)\in Q(r)}{\mathbb {P}}_{(t_0,x_0)}\Big (\sigma _K<\tau _{Q(\varphi _a(r))}\Big )\geqslant \frac{\gamma _a}{a}, \end{aligned}$$
(2.5)

and \(\lim _{k\rightarrow \infty } \gamma _{a_k}=\infty \).

Remark 2.3

If \(X_s\) is a continuous process, then (H \(_1\) ) is automatically satisfied.

Theorem 2.4

Under (H \(_1\) ) and (H \(_2\) ), there exists a constant \(\beta \in (0,1)\), which only depends on \(C_{3}, C_{4}\), and \(\gamma _a\), such that for each \(r\in (0,1)\), every bounded measurable function \(u\) on \([0, 1] \times {\mathbb {R}}^d\) that is parabolic in \(Q(r)\),

$$\begin{aligned} |u(t,x)-u(0,0)|\leqslant 8\left( \frac{t\vee \Phi (|x|)}{\Phi (r)}\right) ^\beta \Vert u\Vert _{L^\infty ([0, \Phi (r)]\times {\mathbb {R}}^d)} \quad \hbox {for } (t,x)\in Q(r). \end{aligned}$$
(2.6)

Proof

Our proof is adapted from Chen and Kumagai [5, Theorem 4.14]. Fix \(r\in (0, 1)\). Without loss of generality, we may assume \(0\leqslant u\leqslant 1\) on \([0, \Phi (r)]\times {\mathbb {R}}^d\). Otherwise, instead of \(u\), we may consider

$$\begin{aligned} \tilde{u}_t (x) =\frac{u_t (x)-\inf _{(s, y)\in [0, \Phi (r)]\times {\mathbb {R}}^d}u_s (y)}{\sup _{(s, y)\in [0, \Phi (r)]\times {\mathbb {R}}^d}u_s (y)-\inf _{(s, y)\in [0, \Phi (r)]\times {\mathbb {R}}^d}u_s (y)}. \end{aligned}$$

(i) Define for \(n\in {\mathbb {N}}\),

$$\begin{aligned} r_n:=\varphi _{a^{1-n}}(r),\quad s_n:=2 b^{1-n}, \end{aligned}$$

where \(a>1\) is taken from \(\{a_k; k\geqslant 1\}\) and \(b\in (1, 2)\); both are to be determined below. Observe that \(\Phi (r_n)=a\Phi (r_{n+1})\). Clearly,

$$\begin{aligned} \varphi _a(r_{n+1})=r_{n} \text{ and } r_n\downarrow 0, \ \ s_n\downarrow 0. \end{aligned}$$

For simplicity of notation, we write

$$\begin{aligned} Q_n:=Q(r_n),\quad M_n:=\sup _{Q_n}u,\quad m_n:=\inf _{Q_n}u. \end{aligned}$$

We are going to prove that the oscillation of \(u\) over \(Q_k\)

$$\begin{aligned} \hbox {osc}_{Q_k} u:=M_k-m_k\leqslant s_k, \quad k\in {\mathbb {N}}. \end{aligned}$$
(2.7)

If this is proven, then (2.6) follows. In fact, for any \((t,x)\in Q_1\), there is an \(n\in {\mathbb {N}}\) such that

$$\begin{aligned} (t,x)\in Q_n\setminus Q_{n+1}, \end{aligned}$$

which means that

$$\begin{aligned} \Phi (r_{n+1})\leqslant t<\Phi (r_n)=a \Phi (r_{n+1}) \quad \hbox {or}\quad r_{n+1}\leqslant |x|<r_n. \end{aligned}$$

In this case, we have

$$\begin{aligned} |u(t,x)-u(0,0)|\leqslant M_n-m_n\leqslant s_n=2b a^{-n\ln b/\ln a}\leqslant 2b\left( \frac{t\vee \Phi (|x|)}{\Phi (r)}\right) ^{\frac{\ln b}{\ln a}}, \end{aligned}$$

and (2.6) follows with \(\beta =\ln b/\ln a\).
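The passage from the oscillation decay (2.7) to the Hölder bound can be checked numerically. The sketch below is an illustration with hypothetical values of \(a\), \(b\) and \(\Phi (r)\). It writes \(m\) for \(t\vee \Phi (|x|)\), locates the index \(n\) with \((t,x)\in Q_n\setminus Q_{n+1}\) using \(\Phi (r_n)=a^{1-n}\Phi (r)\), and verifies \(s_n\leqslant 2b\,(m/\Phi (r))^{\beta }\) with \(\beta =\ln b/\ln a\).

```python
import math
import random

def holder_bound_holds(a, b, Phi_r, m):
    """Check s_n <= 2 b (m / Phi(r))^beta for the n with (t, x) in Q_n minus Q_{n+1}.

    m stands for t v Phi(|x|) with 0 < m <= Phi(r), and Phi(r_n) = a^(1-n) Phi(r).
    """
    beta = math.log(b) / math.log(a)
    n = 1
    while m < a ** (-n) * Phi_r:       # (t, x) still lies in Q_{n+1}
        n += 1
    s_n = 2.0 * b ** (1 - n)
    bound = 2.0 * b * (m / Phi_r) ** beta
    return s_n <= bound * (1.0 + 1e-12)   # small slack for floating-point rounding

a, b, Phi_r = 100.0, 1.1, 0.5          # hypothetical constants for illustration
rng = random.Random(1)
samples = [Phi_r * 10.0 ** (-9.0 * rng.random()) for _ in range(1000)]
print("beta =", math.log(b) / math.log(a))
print("bound holds on all samples:", all(holder_bound_holds(a, b, Phi_r, m) for m in samples))
```

The check relies only on the identity \(b^{-n}=a^{-n\beta }\) and on \(m\geqslant \Phi (r_{n+1})=a^{-n}\Phi (r)\), exactly as in the displayed chain of inequalities.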

(ii) We now prove (2.7) by an inductive argument. First of all, clearly,

$$\begin{aligned} M_1-m_1\leqslant 1\leqslant s_1=2,\ \ M_2-m_2\leqslant 1\leqslant s_2=2/b. \end{aligned}$$

Next suppose that \(M_k-m_k\leqslant s_k\) for all \(k=1,\ldots , n\). Define

$$\begin{aligned} A:=\Big \{z\in D_a(r_{n+1}): u(z)\leqslant \tfrac{m_n+M_n}{2}\Big \}. \end{aligned}$$

By considering \(1-u\) instead of \(u\) if necessary, we may assume that

$$\begin{aligned} \mu _{r_{n+1}}(A)\geqslant \tfrac{1}{2}\mu _{r_{n+1}}(D_a(r_{n+1})), \end{aligned}$$

where \(\mu _{r_{n+1}}\) is given in (H \(_2\) ). (Note that here we are interested only in the oscillation \(\hbox {osc}_{Q_k} u=M_k-m_k\), not in the exact values of \(M_k\) and \(m_k\).) Since \(\mu _{r_{n+1}}\) is regular, there is a compact subset \(K\subset A\) such that

$$\begin{aligned} \mu _{r_{n+1}}(K)\geqslant \tfrac{1}{3}\mu _{r_{n+1}}(D_a(r_{n+1})). \end{aligned}$$
(2.8)

For any \(\varepsilon >0\), let us choose \(z_1, z_2\in Q_{n+1}\) so that

$$\begin{aligned} u(z_1)\leqslant m_{n+1}+\varepsilon ,\ \ u(z_2)\geqslant M_{n+1}-\varepsilon . \end{aligned}$$

If one can show

$$\begin{aligned} u(z_2)-u(z_1)\leqslant s_{n+1}, \end{aligned}$$
(2.9)

then

$$\begin{aligned} M_{n+1}-m_{n+1}-2\varepsilon \leqslant s_{n+1}\Rightarrow M_{n+1}-m_{n+1}\leqslant s_{n+1}, \end{aligned}$$

and (2.7) is thus proven.

(iii) Now, we show (2.9). Since \(z_2\in Q_{n+1}\subset Q_n\), if we define \(\tau _n:=\tau _{Q_n}\), then by (2.2) we have

$$\begin{aligned} u(z_2)-u(z_1)&={\mathbb {E}}_{z_2}\Big [u(Z_{\tau _{n}\wedge \sigma _K})-u(z_1)\Big ] \nonumber \\&= \left( {\mathbb {E}}_{z_2}\Big [u(Z_{\sigma _K})\!-\!u(z_1); \sigma _K<\tau _{n}\Big ]\right. \nonumber \\&\quad \left. +{\mathbb {E}}_{z_2}\Big [u(Z_{\tau _{n}})-u(z_1); \sigma _K\geqslant \tau _{n};Z_{\tau _{n}}\in Q_{n-1}\Big ]\right) \nonumber \\&\quad +{\mathbb {E}}_{z_2}\Big [u(Z_{\tau _{n}})-u(z_1); \sigma _K\geqslant \tau _{n}, Z_{\tau _{n}}\notin Q_{n-1}\Big ]\nonumber \\&=:I_1+I_2. \end{aligned}$$
(2.10)

For \(I_1\), since \(u(z_1)\geqslant m_{n+1}\geqslant m_n\geqslant m_{n-1}\), by the inductive hypothesis we have

$$\begin{aligned} I_1&\leqslant \Big (\tfrac{m_n+M_n}{2}-m_n\Big ){\mathbb {P}}_{z_2}(\sigma _K<\tau _{n})+(M_{n-1}-m_{n-1}){\mathbb {P}}_{z_2}(\sigma _K\geqslant \tau _{n})\nonumber \\&\leqslant \tfrac{s_n}{2}{\mathbb {P}}_{z_2}(\sigma _K<\tau _{n})+s_{n-1}(1-{\mathbb {P}}_{z_2} (\sigma _K<\tau _{n}))\nonumber \\&\leqslant s_{n-1}(1-{\mathbb {P}}_{z_2}(\sigma _K<\tau _{n})/2)\leqslant s_{n+1}b^2(1-\gamma _a/(2a)), \end{aligned}$$
(2.11)

where the last step is due to (2.8) and (H \(_2\)). For \(I_2\), we similarly have

$$\begin{aligned} I_2&=\sum _{i=1}^{n-2}{\mathbb {E}}_{z_2}\Big [u(Z_{\tau _{n}})-u(z_1); \sigma _K\geqslant \tau _{n}, Z_{\tau _{n}}\in Q_{n-i-1}\setminus Q_{n-i}\Big ]\\&\qquad +{\mathbb {E}}_{z_2}\Big [u(Z_{\tau _{n}})-u(z_1); \sigma _K\geqslant \tau _{n}, Z_{\tau _{n}}\notin Q_1\Big ]\\&\leqslant \sum _{i=1}^{n-2}s_{n-i-1}{\mathbb {P}}_{z_2}\Big (Z_{\tau _{n}}\notin Q_{n-i}\Big )+{\mathbb {P}}_{z_2}\Big (Z_{\tau _{n}}\notin Q_1\Big ). \end{aligned}$$

Noticing that

$$\begin{aligned} {\mathbb {P}}_{z_2}\Big (Z_{\tau _{n}}\notin Q_{n-i}\Big )= {\mathbb {P}}_{z_2}\Big (X_{\tau _{Q(r_n)}}\notin B(r_{n-i})\Big ), \end{aligned}$$

by (H \(_1\) ), we further have for \(a > \max \{C_4, b\}\),

$$\begin{aligned} I_2&\leqslant C_{3}\sum _{i=1}^{n-2}s_{n-i-1}\frac{\Phi (r_{n})}{\Phi (r_{n-i})}+C_{3} \frac{\Phi (r_{n})}{\Phi (r_1)} =2C_{3}b^{2-n}\sum _{i=1}^{n-2}(b/a)^{i}+C_{3}a^{1-n}\\&\leqslant s_{n+1}b^2\left( \frac{C_{3}b}{a-b}+\frac{C_{3}}{2a}\right) , \end{aligned}$$

which together with (2.10) and (2.11), yields that

$$\begin{aligned} u(z_2)-u(z_1)\leqslant s_{n+1} b^2\left( 1-\frac{\gamma _a}{2a}+\frac{C_{3}b}{a-b}+\frac{C_{3}}{2a}\right) \leqslant s_{n+1} b^2 \left( 1-\frac{\gamma _a}{3a} \right) \leqslant s_{n+1} \end{aligned}$$

provided we take \(a=a_k\) large enough and \(b\) close to \(1\) as \(\lim _{k\rightarrow \infty } \gamma _{a_k}=\infty \). This completes the proof.\(\square \)
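The final choice of \(a\) and \(b\) in the proof can be illustrated numerically. The sketch below uses purely hypothetical values \(a=100\), \(\gamma _a=50\) and \(C_3=1\) (the paper does not specify any). It evaluates the factor multiplying \(s_{n+1}\) in the last display and scans \(b\in (1,2)\) for the largest value keeping that factor at most \(1\), which in turn gives an admissible exponent \(\beta =\ln b/\ln a\).

```python
import math

def contraction_factor(a, b, gamma_a, C3):
    """b^2 * (1 - gamma_a/(2a) + C3*b/(a-b) + C3/(2a)), the factor multiplying s_{n+1}."""
    return b * b * (1.0 - gamma_a / (2.0 * a) + C3 * b / (a - b) + C3 / (2.0 * a))

def largest_admissible_b(a, gamma_a, C3, steps=10000):
    """Scan b in (1, 2) on a uniform grid; return the largest b with factor <= 1, or None."""
    best = None
    for k in range(1, steps):
        b = 1.0 + k / steps
        if b < a and contraction_factor(a, b, gamma_a, C3) <= 1.0:
            best = b
    return best

a, gamma_a, C3 = 100.0, 50.0, 1.0      # hypothetical constants, not from the paper
b = largest_admissible_b(a, gamma_a, C3)
print("largest admissible b on the grid:", b)
print("resulting beta = ln b / ln a:", math.log(b) / math.log(a))
```

Since \(\gamma _a\leqslant a\), the bracket is at least \(1/2\), so the factor is increasing in \(b\) and the grid scan indeed finds the threshold. The example also shows why \(\gamma _{a_k}\rightarrow \infty \) is needed: the gain \(\gamma _a/(2a)\) must dominate the error terms of size \(C_3b/(a-b)+C_3/(2a)\).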

3 Proof of Theorem 1.1

We first recall the definition and properties of regularly varying functions.

Definition 3.1

A measurable and positive function \(\phi : (0,2)\rightarrow (0,\infty )\) is said to vary regularly at zero with index \(\alpha \in {\mathbb {R}}\) if for every \(\lambda >0\),

$$\begin{aligned} \lim _{r\rightarrow 0}\frac{\phi (\lambda r)}{\phi (r)}=\lambda ^\alpha . \end{aligned}$$

We call such \(\phi \) a regularly varying function. The class of all regularly varying functions with index \(\alpha \) is denoted by \({\fancyscript{R}}_\alpha \).

We list some properties of \(\phi \in {\fancyscript{R}}_\alpha \) for later use (cf. [3, pp. 25–28] and [11]).

Proposition 3.2

Let \(\alpha \geqslant 0\) and \(\phi \in {\fancyscript{R}}_\alpha \) be bounded away from \(0\) and \(\infty \) on any compact subset of \((0,2)\). For any \(\delta >0\), there is a constant \(C_{5}=C_{5}(\delta )\geqslant 1\) such that for all \(r,s\in (0,1]\),

$$\begin{aligned} \frac{\phi (r)}{\phi (s)}\leqslant C_{5}\max \left\{ \Big (\frac{r}{s}\Big )^{\alpha +\delta }, \Big (\frac{r}{s}\Big )^{\alpha -\delta }\right\} , \end{aligned}$$
(3.1)

and for any \(\beta >\alpha -1\),

$$\begin{aligned} \lim _{r\rightarrow 0}\frac{\phi (r)}{r^{\beta +1}}\int ^r_0 \frac{s^{\beta }}{\phi (s)}{\mathord {\mathrm{d}}}s&=(\beta -\alpha +1)^{-1},\end{aligned}$$
(3.2)
$$\begin{aligned} \lim _{r\rightarrow 0}r^{\beta +1-\alpha }\phi (r)\int ^2_r \frac{1}{s^{\beta +2-\alpha }\phi (s)}{\mathord {\mathrm{d}}}s&=(\beta -\alpha +1)^{-1}. \end{aligned}$$
(3.3)

Moreover, if we define

$$\begin{aligned} \Phi (r):=\left( \int ^2_r\frac{1}{\phi (s)s}{\mathord {\mathrm{d}}}s\right) ^{-1}, \end{aligned}$$
(3.4)

then \(\Phi \in {\fancyscript{R}}_\alpha \) and

$$\begin{aligned} \lim _{r\rightarrow 0}\frac{\Phi (r)}{\phi (r)}=\alpha . \end{aligned}$$
(3.5)

In particular, (3.1) and (3.2) also hold for \(\Phi \), and for some \(C_{6}> 1\),

$$\begin{aligned} \phi (2s)\leqslant C_{6} \phi (s) \quad \hbox {and} \quad \Phi (2s)\leqslant C_{6}\Phi (s) \quad \hbox { for } s\in (0,1/2). \end{aligned}$$
(3.6)

We now return to the setting in Sect. 1. By normalizing the function \(\phi \) in (1.3) by a constant multiple, we may and do assume the scale function \(\Phi \) defined by (1.7) has the property that \(\Phi (1)=1\). Consider the nonlocal operator \({\fancyscript{L}}^b_t \) in (1.6). We assume

(MP) :

The martingale problem for \(({\fancyscript{L}}^b_t, C^\infty _c ({\mathbb {R}}^+\times {\mathbb {R}}^d))\) is well posed for every initial value \((t, x)\in {\mathbb {R}}^+\times {\mathbb {R}}^d\).

Denote by \({\mathbb {P}}_{(t,x)}\) the law of the unique solution to the martingale problem for \(({\fancyscript{L}}^b_t, C^\infty _c ({\mathbb {R}}^+\times {\mathbb {R}}^d))\) with initial value \((t, x)\in {\mathbb {R}}^+\times {\mathbb {R}}^d\). By [10, Theorems 4.3.12 and 4.4.2], \(Z_s=(V_0+s, X_s)\) is a Hunt process with \({\mathbb {P}}_{(t,x)}(V_0=t \hbox { and } X_0=x)=1\) and so it has a Lévy system that describes the jumps of \(Z\). By an argument similar to that for [4, Theorem 2.6], we have the following.

Theorem 3.3

Assume (MP) holds. Then for any \((t,x)\in {\mathbb {R}}^+\times {\mathbb {R}}^d\), any nonnegative measurable function \(f\) on \({\mathbb {R}}^+ \times {\mathbb {R}}^d\times {\mathbb {R}}^d\) vanishing on \(\{(s, x, y)\in {\mathbb {R}}^+ \times {\mathbb {R}}^d\times {\mathbb {R}}^d: x=y\}\), and any \(({\fancyscript{F}}_s)\)-stopping time \(T\),

$$\begin{aligned} {\mathbb {E}}_{(t,x)}\! \left[ \sum _{s\leqslant T} f(s,X_{s-}, X_s) \!\right] \!=\! {\mathbb {E}}_{(t,x)}\! \left[ \int _0^T \left( \int _{{\mathbb {R}}^d} f(s,X_s, y) \kappa _{s+t}(X_s, y\!-\!X_s){\mathord {\mathrm{d}}}y\! \right) \! {\mathord {\mathrm{d}}}s \!\right] . \end{aligned}$$
(3.7)

Next we prove the following estimate, which implies (H \(_1\) ).

Lemma 3.4

Let \(C_{6}\) be as in (3.6). Under (1.2), (1.3) and (MP), there is a constant \(C_{7}\geqslant 1\) such that for all \(a\geqslant C_{6}\) and \(r,R\in (0,1)\) with \(\varphi _a(r)\leqslant R\),

$$\begin{aligned} \sup _{(t_0,x_0)\in Q(r)}{\mathbb {P}}_{(t_0,x_0)}\Big (X_{\tau _{Q(r)}}\notin B(R)\Big )\leqslant C_{7}\frac{\Phi (r)}{\Phi (R)}, \end{aligned}$$

where \(\Phi \) is defined by (3.4).

Proof

For simplicity of notation, we write \(z=(t_0,x_0)\). Note that \(r<\varphi _a(r)\) so we have by formula (3.7),

$$\begin{aligned}&{\mathbb {P}}_{z}\Big (X_{\tau _{Q(r)}}\notin B(R)\Big )={\mathbb {E}}_{z}\left( \sum _{0<s\leqslant \tau _{Q(r)}}{\mathbf {1}}_{\{X_{s-}\in B(r), X_s\in B(R)^c\}}\right) \\&={\mathbb {E}}_{z}\int ^{\tau _{Q(r)}}_0\!\!\!\!\int _{B(R)^c}\kappa _{s+t_0}(X_s, X_s-y){\mathord {\mathrm{d}}}y{\mathord {\mathrm{d}}}s\\&={\mathbb {E}}_{z}\int ^{\tau _{Q(r)}}_0\!\!\!\!\int _{B(2)\cap B(R)^c}\kappa _{s+t_0}(X_s, X_s-y){\mathord {\mathrm{d}}}y{\mathord {\mathrm{d}}}s\\&\quad +{\mathbb {E}}_{z}\int ^{\tau _{Q(r)}}_0\!\!\!\!\int _{B(2)^c}\kappa _{s+t_0}(X_s, X_s-y){\mathord {\mathrm{d}}}y{\mathord {\mathrm{d}}}s=:I_1+I_2. \end{aligned}$$

By (3.6), we have \(\varphi _a(r)\geqslant 2r\) for \(a\geqslant C_{6}\), which implies that

$$\begin{aligned} |x-y|\geqslant |y|-|x|\geqslant |y|/2 \quad \hbox {for } x\in B(r) \hbox { and } y\in B(R)^c\subset B(\varphi _a(r))^c. \end{aligned}$$

For \(I_1\), by (1.3) and (3.1) we have

$$\begin{aligned} I_1&\leqslant {\mathbb {E}}_{z}\int ^{\tau _{Q(r)}}_0\!\!\!\int _{B(2)\cap B(R)^c}\frac{c_1^{-1}}{\phi (|X_s-y|)|X_s-y|^d}{\mathord {\mathrm{d}}}y{\mathord {\mathrm{d}}}s\\&\leqslant C{\mathbb {E}}_{z}\tau _{Q(r)}\int _{B(2)\cap B(R)^c}\frac{{\mathord {\mathrm{d}}}y}{\phi (|y|)|y|^d}\leqslant C{\mathbb {E}}_{z}\tau _{Q(r)}/\Phi (R). \end{aligned}$$

On the other hand, by (1.2) we clearly have

$$\begin{aligned} I_2\leqslant {\mathbb {E}}_{z}\int ^{\tau _{Q(r)}}_0\!\!\!\!\int _{B(1)^c}\kappa _{s+t}(X_s, y){\mathord {\mathrm{d}}}y{\mathord {\mathrm{d}}}s\leqslant C_1{\mathbb {E}}_{z}\tau _{Q(r)}. \end{aligned}$$

Hence, by (1.8),

$$\begin{aligned} {\mathbb {P}}_{z}\Big (X_{\tau _{Q(r)}}\notin B(R)\Big )\leqslant {\mathbb {E}}_{z}\tau _{Q(r)}\Big (C_1+C/\Phi (R)\Big )\leqslant C_7{\mathbb {E}}_{z}\tau _{Q(r)}/\Phi (R), \end{aligned}$$

which yields the desired estimate since \(\tau _{Q(r)}\leqslant \Phi (r)\).\(\square \)
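
The absorption of the constant \(C_1\) in the last display can be made explicit. The following is only a sketch: it assumes that \(\Phi \) is increasing on \((0,1]\) with \(\Phi (1)<\infty \), which is how \(\Phi \) is used elsewhere in this section, and it does not reproduce the precise content of (1.8). For \(R\in (0,1)\),

$$\begin{aligned} C_1+\frac{C}{\Phi (R)}=\frac{C_1\Phi (R)+C}{\Phi (R)}\leqslant \frac{C_1\Phi (1)+C}{\Phi (R)}, \end{aligned}$$

so under these assumptions one admissible choice is \(C_7=C_1\Phi (1)+C\), enlarged to be at least 1 if necessary.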

Before verifying (H \(_2\) ), we need the following lemma.

Lemma 3.5

Let \(\Phi \) be defined by (3.4). Suppose that one of the following conditions holds:

  1. (i)

    \(\liminf _{r\rightarrow 0}r/\Phi (r)=0\) and for some \(C_2>0\),

    $$\begin{aligned} \frac{\Phi (r)}{\Phi (s)}\leqslant C_2\frac{r}{s},\quad 0<s\leqslant r\leqslant 1, \end{aligned}$$
    (3.8)

    and for some \(C_b>0\),

    $$\begin{aligned} |b_t(x)|\leqslant C_b|x|/\Phi (|x|), \quad |x|\leqslant 1. \end{aligned}$$
    (3.9)
  2. (ii)

    \(\liminf _{r\rightarrow 0}r/\Phi (r)>0\) and \(b\) is bounded measurable.

Then there exists a constant \(C_{8}\geqslant 1\) such that for all \(r\in (0,1)\), \(x_0\in B(r)\) and \(t_0\in [0,1]\),

$$\begin{aligned} {\mathbb {P}}_{(t_0,x_0)}(\tau _{B(x_0,r)}<t)\leqslant \frac{C_{8} t}{\Phi (r)},\quad t>0. \end{aligned}$$
(3.10)

In particular, for any \(a\geqslant C_{6}^2>1\) and \(r\in (0,1)\) with \(\varphi _a(r)\leqslant 1\),

$$\begin{aligned} \sup _{x_0\in B(r)}{\mathbb {P}}_{(t_0,x_0)} \Big (\tau _{Q(\varphi _{a}(r))}< \Phi (r)\Big )\leqslant \frac{C_{8}}{\sqrt{a}}, \end{aligned}$$
(3.11)

where \(C_6\) is the positive constant in (3.6).

Proof

Given \(f\in C^2_b({\mathbb {R}}^d)\) with \(f(0)=0\) and \(f(x)=1\) for \(|x|\geqslant 1\), set

$$\begin{aligned} f_r(x):=f((x-x_0)/r),\ \ r>0. \end{aligned}$$

By the optional stopping theorem,

$$\begin{aligned} {\mathbb {P}}_{(t_0,x_0)}\Big (\tau _{B(x_0,r)}<t\Big )\leqslant {\mathbb {E}}_{(t_0,x_0)}f_r\Big (X_{\tau _{B(x_0,r)}\wedge t}\Big ) ={\mathbb {E}}_{(t_0,x_0)}\int ^{\tau _{B(x_0,r)}\wedge t}_0{\fancyscript{L}}^b_{s+t_0} f_{r}(X_{s}){\mathord {\mathrm{d}}}s. \end{aligned}$$
(3.12)

On the other hand, by the definition of \({\fancyscript{L}}^b_s\) and (1.3), we have

$$\begin{aligned} |{\fancyscript{L}}^b_s f_r(x)|&= \left| \int _{{\mathbb {R}}^d}(f_r(x+z)+f_r(x-z)-2f_r(x))\kappa _s(x,z){\mathord {\mathrm{d}}}z+b_s(x)\cdot \nabla f_r(x)\right| \\&\leqslant C\int _{|z|\leqslant r}\frac{\Vert \nabla ^2 f_r\Vert _\infty }{\phi (|z|)|z|^{d-2}}{\mathord {\mathrm{d}}}z\\&+ C \!\int _{1\geqslant |z|\geqslant r}\frac{\Vert f_r\Vert _\infty }{\phi (|z|)|z|^{d}}{\mathord {\mathrm{d}}}z \!+\!\Vert f_r\Vert _\infty \int _{|z|\geqslant 1}\kappa _s(x,z){\mathord {\mathrm{d}}}z \!+\! \Vert \nabla f_r\Vert _\infty |b_s(x)|\\&\leqslant \frac{C}{r^2}\int ^r_0\frac{s{\mathord {\mathrm{d}}}s}{\phi (s)}+C\int ^1_r\frac{{\mathord {\mathrm{d}}}s}{\phi (s)s}+C+\frac{C|b_s(x)|}{r}\\&\leqslant \frac{C}{\phi (r)}+\frac{C}{\Phi (r)}+C+\frac{C|b_s(x)|}{r} \qquad \hbox {by (3.2) and (3.3).} \end{aligned}$$

Substituting this into (3.12) and using (3.5) and (1.8), we obtain

$$\begin{aligned} {\mathbb {P}}_{(t_0,x_0)}\Big (\tau _{B(x_0,r)}<t\Big )\leqslant \frac{Ct}{\Phi (r)} +{\mathbb {E}}_{(t_0,x_0)}\int ^{\tau _{B(x_0,r)}\wedge t}_0\frac{C|b_{s+t_0}(X_s)|}{r}{\mathord {\mathrm{d}}}s. \end{aligned}$$
(3.13)

In case (i), since

$$\begin{aligned} |x|\leqslant |x-x_0|+|x_0|\leqslant 2r \hbox { for } x\in B(x_0,r) \hbox { and } x_0\in B(r), \end{aligned}$$

(3.10) follows by (3.13) and

$$\begin{aligned} \frac{|b_{s+t_0}(x)|}{r}\mathop {\leqslant }\limits ^{(3.9)} \frac{C_b|x|}{r\Phi (|x|)}\mathop {\leqslant }\limits ^{(3.8)} \frac{2C_bC_2}{\Phi (2r)}\leqslant \frac{C}{\Phi (r)}. \end{aligned}$$
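
The middle and last inequalities in this chain can be unpacked as follows. This is a sketch that assumes \(x\ne 0\), that \(2r\leqslant 1\) (so that (3.8) is applicable to the pair \(|x|\leqslant 2r\)), and that \(\Phi \) is increasing. Applying (3.8) with \(s=|x|\) and with \(2r\) in place of \(r\) gives

$$\begin{aligned} \frac{\Phi (2r)}{\Phi (|x|)}\leqslant C_2\frac{2r}{|x|}, \quad \hbox {that is,}\quad \frac{|x|}{\Phi (|x|)}\leqslant \frac{2C_2 r}{\Phi (2r)}. \end{aligned}$$

Dividing by \(r\), multiplying by \(C_b\), and using \(\Phi (r)\leqslant \Phi (2r)\) then yields

$$\begin{aligned} \frac{C_b|x|}{r\Phi (|x|)}\leqslant \frac{2C_bC_2}{\Phi (2r)}\leqslant \frac{2C_bC_2}{\Phi (r)}. \end{aligned}$$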

In case (ii), (3.10) follows by (3.13) and \(|b_{s+t_0}(X_s)|\leqslant \Vert b\Vert _{L^\infty ({\mathbb {R}}^+\times {\mathbb {R}}^d)}\) as well as \(\frac{1}{r}\leqslant \frac{C}{\Phi (r)}\).

On the other hand, by (3.6), we have for any \(a\geqslant C_{6}^2\),

$$\begin{aligned} \Phi (2\varphi _{\sqrt{a}}(r))\leqslant C_{6} \sqrt{a}\Phi (r)\leqslant a\Phi (r), \end{aligned}$$

which implies that for \(x_0\in B(r)\) and \(x\in B(x_0,\varphi _{\sqrt{a}}(r))\),

$$\begin{aligned} |x|\leqslant |x-x_0|+|x_0|\leqslant \varphi _{\sqrt{a}}(r)+r\leqslant 2\varphi _{\sqrt{a}}(r)\leqslant \varphi _{a}(r). \end{aligned}$$

Hence \(B(x_0,\varphi _{\sqrt{a}}(r))\subset B(\varphi _a (r))\) and

$$\begin{aligned} {\mathbb {P}}_{(t_0,x_0)}\Big (\tau _{Q(\varphi _a(r))}\!<\!\Phi (r)\Big ) \leqslant {\mathbb {P}}_{(t_0,x_0)}\Big (\tau _{B(x_0,\varphi _{\sqrt{a}}(r))} < \Phi (r)\Big ) \mathop {\leqslant }\limits ^{(3.10)}\frac{C_{8}\Phi (r)}{\Phi (\varphi _{\sqrt{a}}(r))} \!=\!\frac{C_{8}}{\sqrt{a}}. \end{aligned}$$

The proof is complete.\(\square \)
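
The two uses of \(\varphi \) in the last part of the proof can be traced explicitly. The sketch below reads \(\varphi _a(r)\) as the number satisfying \(\Phi (\varphi _a(r))=a\Phi (r)\), which is the identity behind the final equality above, and it assumes that \(\Phi \) is strictly increasing, so that comparing values of \(\Phi \) is the same as comparing arguments. First,

$$\begin{aligned} \Phi (2\varphi _{\sqrt{a}}(r))\leqslant a\Phi (r)=\Phi (\varphi _a(r)) \quad \Longrightarrow \quad 2\varphi _{\sqrt{a}}(r)\leqslant \varphi _a(r). \end{aligned}$$

Second, applying (3.10) with \(t=\Phi (r)\) and with radius \(\varphi _{\sqrt{a}}(r)\) in place of \(r\) (note that \(x_0\in B(r)\subset B(\varphi _{\sqrt{a}}(r))\)),

$$\begin{aligned} \frac{C_{8}\Phi (r)}{\Phi (\varphi _{\sqrt{a}}(r))}=\frac{C_{8}\Phi (r)}{\sqrt{a}\,\Phi (r)}=\frac{C_{8}}{\sqrt{a}}. \end{aligned}$$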

For \(a>1\) and \(r\in (0,1)\) with \(\varphi _a(r)\leqslant 1\), let \(D_a(r)\) be defined by (2.3). Define a measure

$$\begin{aligned} \mu _r(A):=\int ^{\Phi (r)}_0\!\!\!\int _{{\mathbb {R}}^d}\frac{1_A(s,y)\Phi (|y|)}{\phi (|y|)|y|^d}{\mathord {\mathrm{d}}}y{\mathord {\mathrm{d}}}s,\ \ A\subset D_a(r). \end{aligned}$$

The above definition of \(\mu _r\) arises naturally when estimating \({\mathbb {P}}_{(t_0,x_0)}\Big (\sigma _K<\tau _{Q(\varphi _a(r))}\Big )\) from below using the Lévy system of \(Z\); see (3.16) below. Clearly, we have

$$\begin{aligned} \mu _r(D_a(r))&= \Phi (r)\int _{\varphi _{\sqrt{a}}(r)\leqslant |y|\leqslant \varphi _a(r) }\frac{\Phi (|y|)}{\phi (|y|)|y|^d}{\mathord {\mathrm{d}}}y =\omega _d\Phi (r)\int _{\varphi _{\sqrt{a}}(r)}^{\varphi _a(r) }\frac{\Phi (s)}{\phi (s)s}{\mathord {\mathrm{d}}}s \nonumber \\&= \omega _d \Phi (r) \int _{\varphi _{\sqrt{a}}(r)}^{\varphi _a(r) } \frac{1}{\Phi (s)} d\Phi (s) =\frac{1}{2} \omega _d\Phi (r)\ln a , \end{aligned}$$
(3.14)

where \(\omega _d\) is the surface area of the unit sphere in \({\mathbb {R}}^d\).
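
The last equality in (3.14) is a logarithmic integral. As a sketch, reading \(\varphi _a(r)\) through the relation \(\Phi (\varphi _a(r))=a\Phi (r)\) (the identity used at the end of the proof of Lemma 3.5):

$$\begin{aligned} \int _{\varphi _{\sqrt{a}}(r)}^{\varphi _a(r)}\frac{1}{\Phi (s)}\,d\Phi (s)&=\ln \Phi (\varphi _a(r))-\ln \Phi (\varphi _{\sqrt{a}}(r))\\&=\ln (a\Phi (r))-\ln (\sqrt{a}\,\Phi (r))=\ln a-\frac{1}{2}\ln a=\frac{1}{2}\ln a. \end{aligned}$$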

Lemma 3.6

Suppose (1.2), (1.3), (MP) and the assumptions of Lemma 3.5 hold. There exist \(a_0\geqslant 1\) and \(c_2\in (0,1)\) such that for each \(a\geqslant a_0\) and \(r\in (0,1)\) with \(\varphi _a(r)\leqslant 1\), and any compact subset \(K\subset D_a(r)\) with \(\mu _r(K)>\frac{1}{3}\mu _r(D_a(r))\),

$$\begin{aligned} \inf _{(t_0,x_0)\in Q(r)}{\mathbb {P}}_{(t_0,x_0)}\Big (\sigma _K<\tau _{Q(\varphi _a(r))}\Big )\geqslant c_2\frac{\ln a}{a}. \end{aligned}$$
(3.15)

In particular, condition (H \(_2\) ) holds.

Proof

Notice that

$$\begin{aligned} \Big \{Z_{\tau _{Q(\varphi _{\sqrt{a}}(r))}}\in K\Big \}\subset \Big \{\sigma _K<\tau _{Q(\varphi _a(r))}\Big \}. \end{aligned}$$

It suffices to prove that there are \(a_0\geqslant 1\) and \(c_2\in (0,1)\) such that for all \(a\geqslant a_0\) and any \((t_0,x_0)\in Q(r)\),

$$\begin{aligned} {\mathbb {P}}_{(t_0,x_0)}\Big (Z_{\tau _{Q(\varphi _{\sqrt{a}}(r))}}\in K\Big )\geqslant c_2\frac{\ln a}{a}. \end{aligned}$$

As \(\mu _r ( \partial Q(\varphi _{\sqrt{a}}(r)))=0\), by taking a suitable subset of \(K\) if needed, we may assume without loss of generality that \(K\cap \partial Q(\varphi _{\sqrt{a}}(r))=\emptyset \). Then

$$\begin{aligned} {\mathbf {1}}_K(Z_{\tau _{Q(\varphi _{\sqrt{a}}(r))}})=\sum _{0<s\leqslant \tau _{Q(\varphi _{\sqrt{a}}(r))}}{\mathbf {1}}_{X_{s-}\not =X_s}{\mathbf {1}}_K(Z_s). \end{aligned}$$

Hence by formula (3.7) and (1.3), we have

$$\begin{aligned} {\mathbb {P}}_{(t_0,x_0)}\Big (Z_{\tau _{Q(\varphi _{\sqrt{a}}(r))}}\in K\Big )&={\mathbb {E}}_{(t_0,x_0)}\int ^{\tau _{Q(\varphi _{\sqrt{a}}(r))}}_0 \!\!\!\int _{{\mathbb {R}}^d}{\mathbf {1}}_K(s,y) \kappa _s(X_s, X_s-y){\mathord {\mathrm{d}}}y{\mathord {\mathrm{d}}}s\\&\geqslant c_1{\mathbb {E}}_{(t_0,x_0)}\int ^{\tau _{Q(\varphi _{\sqrt{a}}(r))}}_0\!\!\!\int _{{\mathbb {R}}^d}\frac{{\mathbf {1}}_K(s,y)}{\phi (|X_s-y|)|X_s-y|^{d}}{\mathord {\mathrm{d}}}y{\mathord {\mathrm{d}}}s. \end{aligned}$$

Since \(|x-y|\leqslant 2|y|\) for \(x\in B(\varphi _{\sqrt{a}}(r))\) and \(y\notin B(\varphi _{\sqrt{a}}(r))\), by (3.1) and the definition of \(\mu _r\), we have

$$\begin{aligned} {\mathbb {P}}_{(t_0,x_0)}\Big (Z_{\tau _{Q(\varphi _{\sqrt{a}}(r))}}\in K\Big )&\geqslant c_3{\mathbb {E}}_{(t_0,x_0)}\left( \int ^{\tau _{Q(\varphi _{\sqrt{a}}(r))}}_0\!\!\! \int _{{\mathbb {R}}^d}\frac{{\mathbf {1}}_K(s,y)}{\phi (|y|)|y|^{d}}{\mathord {\mathrm{d}}}y{\mathord {\mathrm{d}}}s\right) \nonumber \\&= c_3{\mathbb {E}}_{(t_0,x_0)}\left( \int ^{\tau _{Q(\varphi _{\sqrt{a}}(r))}}_0\!\!\!\int _{{\mathbb {R}}^d}\frac{{\mathbf {1}}_K(s,y)}{\Phi (|y|)}\mu _r({\mathord {\mathrm{d}}}y,{\mathord {\mathrm{d}}}s)\right) \nonumber \\&\geqslant \frac{c_3\mu _r(K)}{\Phi (\varphi _a(r))}{\mathbb {P}}_{(t_0,x_0)}\Big (\tau _{Q(\varphi _{\sqrt{a}}(r))}\geqslant \Phi (r)\Big ), \end{aligned}$$
(3.16)

where the last inequality holds because \(|y|\leqslant \varphi _a(r)\) for \((s,y)\in K\subset D_a(r)\) and \(\Phi \) is increasing. Lastly, by \(\mu _r(K)\geqslant \frac{1}{3}\mu _r(D_a(r))\), (3.14) and (3.11), we obtain that for \(a\geqslant C^2_6\vee 4C_{8}^2=:a_0\),

$$\begin{aligned} {\mathbb {P}}_{(t_0,x_0)}\Big (Z_{\tau _{Q(\varphi _{\sqrt{a}}(r))}}\in K\Big )&\geqslant \frac{c_3\omega _d\ln a}{6a} \left( 1-{\mathbb {P}}_{(t_0,x_0)}\Big (\tau _{Q(\varphi _{\sqrt{a}}(r))}< \Phi (r)\Big )\right) \\&\geqslant \frac{c_3\omega _d\ln a}{6a}\left( 1-\frac{C_{8}}{\sqrt{a}}\right) \geqslant \frac{c_3\omega _d\ln a}{12a}. \end{aligned}$$

The proof is completed by taking \(c_2=\frac{c_3\omega _d}{12}\).\(\square \)
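
The constants in the final display can be checked directly. As a sketch, using \(\mu _r(K)\geqslant \frac{1}{3}\mu _r(D_a(r))\), (3.14), and the relation \(\Phi (\varphi _a(r))=a\Phi (r)\) (read off from the proof of Lemma 3.5):

$$\begin{aligned} \frac{c_3\mu _r(K)}{\Phi (\varphi _a(r))}\geqslant \frac{c_3}{a\Phi (r)}\cdot \frac{1}{3}\cdot \frac{1}{2}\,\omega _d\Phi (r)\ln a=\frac{c_3\omega _d\ln a}{6a}, \end{aligned}$$

and

$$\begin{aligned} a\geqslant 4C_{8}^2 \quad \Longrightarrow \quad \frac{C_{8}}{\sqrt{a}}\leqslant \frac{1}{2} \quad \Longrightarrow \quad 1-\frac{C_{8}}{\sqrt{a}}\geqslant \frac{1}{2}. \end{aligned}$$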

We can now present the

Proof of Theorem 1.1

Fix \(t_0>0\) and \(x_0\in {\mathbb {R}}^d\).

(i) In this case, by assumption \(b_t (x)\) is continuous in \(x\) and \(|b_t (x)|\leqslant C (1+|x|)\) for all \(t>0\) and \(x\in {\mathbb {R}}^d\). Thus, by the standard existence theory for ODEs, the following ODE admits at least one solution \(y_t\) for \(t\in [0, t_0]\):

$$\begin{aligned} \dot{y}_t=-b_{t_0-t}(x_0+y_t),\ \ y_0=0. \end{aligned}$$

Define

$$\begin{aligned} w(t,x):=u(t_0,x_0)-u(t_0-t,x_0+x+y_t) \end{aligned}$$

and

$$\begin{aligned} \tilde{b}_t(x):=b_t(x+x_0+y_t)-b_t(x_0+y_t). \end{aligned}$$

Then

$$\begin{aligned} \partial _t w+{\fancyscript{L}}^{\tilde{b}}_{t_0-t} w=0,\quad t\in [0,t_0). \end{aligned}$$

Notice that by (1.11),

$$\begin{aligned} |\tilde{b}_t(x)|\leqslant C_b |x|/\Phi (|x|) \quad \hbox {for } |x|\leqslant 1. \end{aligned}$$

By Lemmas 3.4-3.6 and Theorem 2.4, we have

$$\begin{aligned} |w(t,x)|&=|w(t,x)-w(0,0)|\\&\leqslant 8\left( \frac{t\vee \Phi (|x|)}{ t_0}\right) ^\beta \Vert w\Vert _{L^\infty ([0, t_0]\times {\mathbb {R}}^d)} \quad \hbox {for } (t,x)\in Q(\Phi ^{-1}(t_0)). \end{aligned}$$

By making the change of variables \(t_0-t={t^{\prime }}\) and \(x_0+x+y_t={x^{\prime }}\), and noticing that

$$\begin{aligned} |y_t|\leqslant \lambda t\ \text{ for } \text{ some } \lambda =\lambda (\Vert b/(1+|x|)\Vert _\infty )>0, \end{aligned}$$

we obtain the desired estimate (1.12).

(ii) In this case, define

$$\begin{aligned} w(t,x):=u(t_0,x_0)-u(t_0-t,x_0+x). \end{aligned}$$

Just as above, one can conclude that (1.13) holds.\(\square \)