1 Introduction

The classical maximum principle concerns second-order differential operators of elliptic or parabolic type. It is a basic property of solutions to boundary value problems for the associated elliptic or parabolic partial differential equations (PDEs) in a bounded domain; see [22, 24] for a general study of maximum principles. Classically, the maximum principle states that the maximum of a solution of a second-order elliptic or parabolic equation in a domain is attained on the boundary of that domain. In particular, the strong maximum principle says that if the solution achieves its maximum in the interior of the domain, then the solution must be constant, while the weak maximum principle asserts that the maximum is attained on the boundary but may also be attained in the interior. Let us also mention [19], where both weak and strong maximum principles for symmetric Markov generators are discussed via (local) Dirichlet forms. Moreover, a maximum principle for nonlocal operators generated by nonnegative kernels defined on topological groups acting continuously on a Hausdorff space was considered by Coville [7]. The strong maximum principle for semicontinuous viscosity solutions of fully nonlinear second-order parabolic integro-differential equations was studied in [5].

A fairly large class of Markov processes on \({\mathbb {R}}^d\) is governed analytically by their infinitesimal generators, called Lévy-type generators or pseudo-differential operators associated with negative definite symbols (cf. e.g. [11]), either via the martingale problem (cf. e.g. [14, 15, 16, 28, 29]) or via Dirichlet forms (cf. e.g. [9, 11, 17, 18]). According to [6, 11], these operators are usually integro-differential operators, i.e., nonlocal operators, consisting of a combination of second-order elliptic differential operators and integral operators of Lévy type. The nonlocal part corresponds to the jump component of a Markov process; in fact, it is an integral with respect to a jump measure.

The well-known Hille–Yosida theorem and the semigroup approach, which can be found in e.g. [12], provide an intrinsic link between Markov processes and partial differential equations, in particular second-order elliptic differential operators, as in the pioneering work of Feller in the early 1950s. The monograph [30] (see also the references therein) explores the functional analytic approach to constructing Markov processes in a prescribed region of \({\mathbb {R}}^d\), via the elliptic boundary value problems for the associated Lévy-type generators.

Owing to the integral operators they involve, the Lévy-type generators are nonlocal operators. The study of this kind of integro-differential operator was initiated by Waldenfels [32] in the 1960s. It was elucidated in [30] that a Markov process having such an operator as its infinitesimal generator admits the following physical picture: a Markovian particle moves both by jumps and continuously in a certain region of the state space \({\mathbb {R}}^d\).

The present paper is devoted to the weak and strong maximum principles for the following nonlocal parabolic Waldenfels operator \(-\frac{\partial }{\partial t} + L \):

$$\begin{aligned} \begin{aligned}&\Big (-\frac{\partial }{\partial t} + L\Big ) u(x,t)\\&\quad := -\frac{\partial u}{\partial t}(x,t) + \sum ^d_{j,k=1}a_{jk}(x,t) \frac{\partial ^2 u}{\partial x_j\partial x_k}(x,t) +\sum ^d_{j=1}b_{j}(x,t)\frac{\partial u}{\partial x_j}(x,t)+c(x,t)u(x,t) \\&\qquad +\int _{\mathbb {R}^d\setminus \{0\}}\Big [u(x+z,t)-u(x,t)-\sum ^d_{j=1}z_j \frac{\partial u}{\partial x_j}(x,t){\mathbf {1}}_{\{|z|<1\}}\Big ]\nu (t,x,dz), \end{aligned} \end{aligned}$$

where the kernel \(\{\nu (t,x,\cdot )\mid (x,t)\in \mathbb {R}^d\times [0,\infty )\}\) behaves as the jump measure for the associated Markov process. The operator L is called an elliptic Waldenfels operator. Note that Waldenfels operators L and \(-\frac{\partial }{\partial t} + L \) appear in the generator and in the Fokker–Planck equation, respectively, for a stochastic differential equation with Lévy motions [3, 8, 26, 27]. We would like to point out that Waldenfels operators also appear in nonlocal conservation laws [31]. Certain properties for diffusion generators perturbed by the nonlocal Laplacian operator have also been studied recently [1, 2].

We will prove new weak and strong maximum principles for the nonlocal parabolic operator \(-\frac{\partial }{\partial t} + L \), which do not require any “nondegeneracy” conditions. In order to cover the general case with either bounded or unbounded support of the jump measure \(\nu \), we will introduce two open sets D and E (with \(D \subset E\)), where D is the set where the maximum is achieved, and the stochastic process (“Markovian particle”) cannot jump from D to the complement of E.

As a preparation for proving these maximum principles, we will prove maximum principles for the nonlocal elliptic Waldenfels operator L. These maximum principles are important for the construction of Markov processes. In [30, Appendix C], weak and strong maximum principles for such elliptic Waldenfels operators were proven, but under a stringent condition, namely that the jump measure has bounded support. The results in [5] include a strong maximum principle for viscosity solutions of certain nonlinear nonlocal partial differential equations under a “nondegeneracy” condition.

The rest of this paper is organised as follows. In Sect. 2, we present our results on maximum principles for elliptic Waldenfels operators. As a corollary, we also obtain Hopf’s lemma on the sign of the gradient on the boundary. Section 3 is devoted to proving the maximum principles for parabolic Waldenfels operators. Some consequences and examples are presented in Sect. 4. Finally, in Sect. 5, we present the proofs of some technical lemmas for the sake of completeness.

2 Maximum principles for elliptic Waldenfels operators

In this section, we consider the weak and strong maximum principles for the elliptic Waldenfels operator L (decomposed into local and nonlocal components)

$$\begin{aligned} L := A+K, \end{aligned}$$
(1)

where A and K are defined as

$$\begin{aligned} \begin{aligned} Au(x) :=&\ \sum ^d_{j,k=1}a_{jk}(x)\frac{\partial ^2 u}{\partial x_j\partial x_k}(x)+\sum ^d_{j=1}b_{j}(x) \frac{\partial u}{\partial x_j}(x)+c(x)u(x), \\ Ku(x) :=&\int _{\mathbb {R}^d\setminus \{0\}}\Big [u(x+z)-u(x)-\sum ^d_{j=1}z_j \frac{\partial u}{\partial x_j}(x){\mathbf {1}}_{\{|z|<1\}}\Big ]\nu (x,dz). \end{aligned} \end{aligned}$$

Note that the coefficients are taken to be independent of time t. Moreover, the operator K is the nonlocal Laplacian operator \(-(-\Delta )^{\frac{\alpha }{2}}\) when the jump measure \(\nu \) is of \(\alpha \)-stable type; see [8, Ch. 7].

The elliptic Waldenfels operator L plays an important role [30] in the theory of Markov processes constructed in a given domain of \({\mathbb {R}}^d\). In that context, the second-order differential operator describes the diffusion part of the associated Markov process and the integral operator of Lévy type corresponds to the jump behavior of the Markov process. Finally, there is an assumption in that context which states that a Markovian particle cannot move by jumps from any interior point of a certain domain to the outside of the closure of that domain. For further remarks and discussions, we refer e.g. to Bony et al. [4] and Taira [30].

To cover more general situations, we introduce two open sets D and E in \(\mathbb {R}^d\), with \(D\subset E\) and E not necessarily bounded. As usual, we denote the boundary of D by \(\partial D\), its closure by \(\overline{D}:=D\cup \partial D\) and its complement by \(D^c:=\mathbb {R}^d\setminus D\).

We make the following assumptions:

  1.

    Continuity condition: \(a_{jk},b_j,c\in C(\overline{E})\) \((j,k=1,\ldots ,d).\)

  2.

    Symmetry condition: \(a_{jk}=a_{kj}\) \((j,k=1,\ldots ,d)\). Uniform ellipticity condition: there exists a constant \(\gamma >0\) such that

    $$\begin{aligned} \sum ^d_{j,k=1}a_{jk}(x)\xi _j\xi _k\ge \gamma |\xi |^2, \end{aligned}$$
    (2)

    for all \(x\in D\), \(\xi \in \mathbb {R}^d\).

  3.

    Lévy measures: The kernel \(\{\nu (x,\cdot )\mid x\in \mathbb {R}^d\}\) is a family of Lévy measures, namely, each \(\nu (x,\cdot )\) is a Borel measure on \({\mathbb {R}}^d\setminus \{0\}\) such that

    $$\begin{aligned} \sup _{x\in \mathbb {R}^d} \int _{{\mathbb {R}}^d\setminus \{0\}}(1\wedge |z|^2)\nu (x,dz)<\infty , \end{aligned}$$
    (3)

    and moreover, for fixed \(U\in {\mathcal {B}}({\mathbb {R}}^d\setminus \{0\})\), the mapping \(\mathbb {R}^d\ni x\rightarrow \nu (x,U)\in [0,\infty )\) is Borel measurable. Here we further assume that for each \(x\in D\) the measure \(\nu (x,\cdot )\) is supported in \({\overline{E}}-x:=\{y-x\mid y\in {\overline{E}}\}=\{z\mid x+z\in {\overline{E}}\}\), i.e.,

    $$\begin{aligned} \text {supp}\,\nu (x,\cdot )\subset {\overline{E}}-x,\quad \forall x\in D. \end{aligned}$$
    (4)
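To make the moment condition (3) concrete, the following minimal sketch evaluates the integral of \(1\wedge |z|^2\) for the one-dimensional kernel \(\nu (dz)=|z|^{-1-\alpha }dz\) with \(0<\alpha <2\), which is of the \(\alpha \)-stable type mentioned above. The restriction to \(d=1\), the omission of any normalising constant, and the function names are assumptions made for illustration only; they are not taken from the paper.

```python
import math


def levy_integral_closed_form(alpha):
    """Integral of min(1, z^2) * |z|^(-1-alpha) over R minus {0}, for 0 < alpha < 2."""
    if not 0.0 < alpha < 2.0:
        raise ValueError("alpha must lie in (0, 2)")
    small_jumps = 2.0 / (2.0 - alpha)  # 2 * int_0^1 z^(1-alpha) dz
    large_jumps = 2.0 / alpha          # 2 * int_1^inf z^(-1-alpha) dz
    return small_jumps + large_jumps


def levy_integral_midpoint(alpha, n=200_000):
    """Midpoint-rule approximation of the same integral.

    The tail over (1, inf) is mapped to (0, 1) by the substitution z = 1/s,
    which turns z^(-1-alpha) dz into s^(alpha-1) ds.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += s ** (1.0 - alpha)  # small jumps: z^2 * z^(-1-alpha)
        total += s ** (alpha - 1.0)  # large jumps after the substitution
    return 2.0 * h * total
```

Both contributions are finite exactly when \(0<\alpha <2\): the small-jump part requires \(\alpha <2\) and the large-jump part requires \(\alpha >0\). Since this kernel does not depend on x, the supremum in (3) is trivially finite in this toy case.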

Remark 2.1

The support condition (4) means, in probabilistic terms, that a Markovian particle cannot move by jumps from a point \(x\in D\) to the outside of \(\overline{E}\). The motivation for this condition is that the maximizer point will propagate between connected components of the set in which the subsolution achieves its maximum. The details are discussed in Remark 2.9 below. When the set E is the whole space \(\mathbb {R}^d\), \(\overline{E}-x\) is still the whole space, and then there are no extra restrictions on the support of each measure \(\nu (x,\cdot )\). In the case \(E=D\), the support condition reads \(\text { supp}\,\nu (x,\cdot )\subset \overline{D}-x\), and this is related to the assumption in [30] that a Markovian particle cannot move by jumps from a point \(x\in D\) to the outside of \(\overline{D}\).

For convenience, the notation \({\mathbf {a}}=(a_{jk})_{j,k=1,\ldots ,d}\) means that \({\mathbf {a}}\) is a matrix with (j, k)-th entry \(a_{jk}\), and \(b=(b_1,\ldots ,b_d)^T\) is regarded as a column vector. We also recall the gradient operator (for the space variable) \(\nabla _x=\big (\frac{\partial }{\partial x_1},\ldots ,\frac{\partial }{\partial x_d}\big )^T\) and the Hessian operator \(\nabla ^2_x=\nabla _x\otimes \nabla _x=\big (\frac{\partial ^2}{\partial x_j\partial x_k}\big )_{j,k=1,\ldots ,d}\), where \(\otimes \) means the tensor product. The variables or subscripts will be omitted when there is no ambiguity. Then we can rewrite the operator L as

$$\begin{aligned} \begin{aligned} Lu =&\ Au+Ku \\ =&\ \text {tr}[{\mathbf {a}}^T(\nabla ^2 u)]+b^T\nabla u+cu \\&\ +\int _{\mathbb {R}^d\setminus \{0\}}\big [u(\cdot +z)-u-z^T\nabla u\cdot {\mathbf {1}}_{\{|z|<1\}}\big ]\nu (\cdot ,dz), \end{aligned} \end{aligned}$$
(5)

where “tr” denotes the trace of a matrix. Both \(x^Ty\) and \(x\cdot y\), for two vectors \(x,y\in \mathbb {R}^d\), denote the scalar product. Moreover, we denote the positive and negative parts of a function u by \(u^+:=u\vee 0\) and \(u^-:=-(u\wedge 0)=(-u)\vee 0\), respectively. Then \(u=u^+-u^-\) and \(|u|=u^++u^-\).

In this section, L is the elliptic Waldenfels operator as defined in (1).

2.1 Weak maximum principle for elliptic case

We now prove the weak maximum principle.

Theorem 2.2

(Weak maximum principle for elliptic Waldenfels operators) Let D be an open and bounded set but not necessarily connected, and E be an open set satisfying \(D\subset E\). Assume that \(u\in C^2(D)\cap C(\overline{E})\), \(Lu\ge 0\) in D, and \(\text { supp}\,\nu (x,\cdot )\subset {\overline{E}}-x\) for each \(x\in D\).

  1.

    If \(c\equiv 0\) in D, then

    $$\begin{aligned} \sup _{\overline{E}}u=\sup _{\overline{E}\setminus D}u. \end{aligned}$$
  2.

    If \(c\le 0\) in D, then

    $$\begin{aligned} \sup _{\overline{E}}u\le \sup _{\overline{E}\setminus D}u^+. \end{aligned}$$

Here the supremum may be infinite.

Proof

Assertion 1. We first consider the case with the strict inequality

$$\begin{aligned} Lu>0\quad \text {in }D. \end{aligned}$$
(6)

Suppose, on the contrary, that \(\sup _{\overline{E}}u>\sup _{\overline{E}\setminus D}u\). Then there exists a point \(x^0\in D\) with \(u(x^0)=\sup _{\overline{E}}u\), and

$$\begin{aligned} u(x^0)=\max _{\overline{D}}u. \end{aligned}$$

Thus at the maximizer point \(x^0\), we have

$$\begin{aligned} \nabla u(x^0)&= 0, \end{aligned}$$
(7)
$$\begin{aligned} \nabla ^2u(x^0)&\le 0, \end{aligned}$$
(8)

where the last inequality means that the symmetric matrix \(\nabla ^2u(x^0)\) is negative semidefinite. In particular, \(\frac{\partial ^2 u}{\partial x_j^2}(x^0)\le 0\), \(j=1,\ldots ,d\). Since the matrix \({\mathbf {a}}=(a_{jk})\) is symmetric and positive definite at \(x^0\), there exists an orthogonal matrix P such that

$$\begin{aligned} P[{\mathbf {a}}(x^0)]P^T=\text {diag}(\lambda _1,\ldots ,\lambda _d), \end{aligned}$$

where “diag” means the diagonal matrix with diagonal entries \(\lambda _j>0,j=1,\ldots ,d\), which are eigenvalues of \({\mathbf {a}}(x^0)\). Then by changing variables \(y-x^0=P(x-x^0)\), we have

$$\begin{aligned} \begin{aligned} \nabla _x u&= P^T(\nabla _y u), \\ \nabla ^2_x u&= P^T(\nabla ^2_y u)P. \end{aligned} \end{aligned}$$

In light of (8), we find that at point \(x^0\),

$$\begin{aligned} \begin{aligned}&\text {tr}[{\mathbf {a}}^T(\nabla ^2_x u)] = \text {tr}[{\mathbf {a}}^TP^T(\nabla ^2_y u)P]=\text {tr}[P{\mathbf {a}}^TP^T(\nabla ^2_y u)] \\&\quad = \text {tr}[(P{\mathbf {a}}P^T)^T(\nabla ^2_y u)]= \sum _{j}\lambda _j\frac{\partial ^2u}{\partial y_j^2}\le 0. \end{aligned} \end{aligned}$$
(9)
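The trace computation in (9) can be sanity-checked numerically. The sketch below works in dimension \(d=2\) with plain Python lists; it builds \({\mathbf {a}}=P^T\text {diag}(\lambda _1,\lambda _2)P\) from a rotation matrix P, takes a negative semidefinite matrix \(H=-MM^T\) in place of \(\nabla ^2_xu(x^0)\), and returns both sides of the identity. All helper names and the choice \(d=2\) are illustrative assumptions, not part of the original argument.

```python
import math
import random


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]


def trace(A):
    return A[0][0] + A[1][1]


def rotation(theta):
    """An orthogonal 2x2 matrix, playing the role of P."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def check_trace_identity(lam, theta, M):
    """Return (tr[a^T H], sum_j lam_j * (P H P^T)_jj) with H = -M M^T."""
    P = rotation(theta)
    diag = [[lam[0], 0.0], [0.0, lam[1]]]
    a = matmul(transpose(P), matmul(diag, P))   # so that P a P^T = diag(lam)
    MMt = matmul(M, transpose(M))
    H = [[-MMt[i][j] for j in range(2)] for i in range(2)]  # negative semidefinite
    H_y = matmul(P, matmul(H, transpose(P)))    # Hessian in the rotated variables y
    lhs = trace(matmul(transpose(a), H))
    rhs = lam[0] * H_y[0][0] + lam[1] * H_y[1][1]
    return lhs, rhs
```

For nonnegative \(\lambda _1,\lambda _2\) the two returned values agree up to rounding and are nonpositive, because the diagonal entries of \(PHP^T=-(PM)(PM)^T\) are nonpositive.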

Thus, combining (7), (9) and the assumption \(c\equiv 0\), together with the fact that u attains a maximum at \(x^0\), we obtain that at \(x^0\),

$$\begin{aligned} \begin{aligned} Au(x^0)&= \text {tr}[{\mathbf {a}}^T(\nabla ^2 u)]+b^T\nabla u+cu\le 0, \\ Ku(x^0)&= \int _{\mathbb {R}^d\setminus \{0\}}\big [u(x^0+z)-u(x^0)-z^T\nabla u(x^0)\cdot {\mathbf {1}}_{\{|z|<1\}}\big ]\nu (x^0,dz) \\&= \int _{{\overline{E}}-x^0}\big [u(x^0+z)-u(x^0)\big ]\nu (x^0,dz) \\&\le 0. \end{aligned} \end{aligned}$$

Hence

$$\begin{aligned} Lu=Au+Ku\le 0 \quad \text {at }x^0. \end{aligned}$$
(10)

Therefore, we get a contradiction in light of (6) and (10), which leads to \(\sup _{\overline{E}}u=\sup _{\overline{E}\setminus D}u\).

For the general case that \(Lu\ge 0\), we introduce a function

$$\begin{aligned} u^\epsilon (x):=u(x)+\epsilon e^{-\beta x_1},\quad x\in \overline{E}, \end{aligned}$$
(11)

where \(\beta >0\) will be selected below and \(\epsilon \) is a positive parameter. Note that \(a_{11}\ge \gamma >0\), as follows by substituting \(\xi =e_1=(1,0,\ldots ,0)\) into condition (2). Then by Taylor expansion and the moment condition (3) on the kernel \(\nu \), we have

$$\begin{aligned} \begin{aligned} Lu^\epsilon =&\ Lu+\epsilon L(e^{-\beta x_1}) \\ \ge&\ \epsilon e^{-\beta x_1}\Big [\beta ^2a_{11}-\beta b_1+\int _{|z|\ge 1}\big (e^{-\beta z_1}-1\big )\nu (x,dz) \\&\qquad \quad \ +\int _{0<|z|<1}\big (e^{-\beta z_1}-1+\beta z_1\big )\nu (x,dz)\Big ] \\ \ge&\ \epsilon e^{-\beta x_1}\Big [\beta ^2a_{11}-\beta b_1-\int _{|z|\ge 1}\nu (x,dz)+\frac{1}{2} \beta ^2\int _{0<|z|<1}z_1^2e^{-\beta \theta z_1}\nu (x,dz)\Big ] \\ \ge&\ \epsilon e^{-\beta x_1}\Big [\beta ^2a_{11}-\beta b_1-\int _{|z|\ge 1}\nu (x,dz)+\frac{1}{2} \beta ^2e^{-\beta \theta }\int _{0<|z|<1}z_1^2\nu (x,dz)\Big ] \\ >&\ 0, \end{aligned} \end{aligned}$$

provided \(\beta >0\) is large enough, where \(\theta \) is a constant with \(0<\theta <1\).
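How large \(\beta \) must be can be made explicit under two additional bounds, which we introduce here only for illustration: a constant B with \(|b_1|\le B\) in D and a constant M with \(\int _{|z|\ge 1}\nu (x,dz)\le M\) for all x (the latter is finite by (3)). Dropping the nonnegative small-jump integral and using \(a_{11}\ge \gamma \), the bracket in the last display is at least \(\gamma \beta ^2-B\beta -M\), which is positive beyond the larger root of this quadratic. The names `b_bound` and `tail_mass` below are hypothetical stand-ins for B and M.

```python
import math


def beta_threshold(gamma, b_bound, tail_mass):
    """Largest root of gamma*beta^2 - b_bound*beta - tail_mass = 0.

    Any beta strictly above this value makes the quadratic lower bound positive.
    gamma is the ellipticity constant from (2); b_bound and tail_mass are the
    assumed bounds B and M described in the accompanying text.
    """
    if gamma <= 0 or b_bound < 0 or tail_mass < 0:
        raise ValueError("need gamma > 0, b_bound >= 0 and tail_mass >= 0")
    return (b_bound + math.sqrt(b_bound ** 2 + 4.0 * gamma * tail_mass)) / (2.0 * gamma)


def bracket_lower_bound(beta, gamma, b_bound, tail_mass):
    """Quadratic lower bound for the bracket once the small-jump term is dropped."""
    return gamma * beta ** 2 - b_bound * beta - tail_mass
```

For instance, with \(\gamma =1\), \(B=2\) and \(M=3\) the threshold equals 3, so any \(\beta >3\) suffices in that hypothetical situation.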

Then by the previous conclusion, \(\sup _{\overline{E}}u^\epsilon =\sup _{\overline{E}\setminus D}u^\epsilon \). Let \(\epsilon \rightarrow 0\) to find \(\sup _{\overline{E}}u=\sup _{\overline{E}\setminus D}u\) by the continuity. This proves Assertion 1.

Assertion 2. If \(u\le 0\) everywhere in D, the second assertion is trivially true. Hence we set \(D_+:=\{x\in D\mid u(x)>0\}\ne \emptyset \). Then

$$\begin{aligned} (L-c)u\ge -cu\ge 0 \quad \text {in }D_+. \end{aligned}$$

The new operator \(L-c\) has no zeroth-order term and consequently Assertion 1 implies that

$$\begin{aligned} \sup _{\overline{E}}u = \sup _{\overline{E}\setminus D_+}u=\big (\sup _{\overline{E}\setminus D}u\big )\vee \big (\sup _{D\setminus D_+}u\big )=\big (\sup _{\overline{E}\setminus D}u\big )\vee 0=\sup _{\overline{E}\setminus D}u^+. \end{aligned}$$

This completes the proof. \(\square \)

Remark 2.3

From the proof of Theorem 2.2, we have the following conclusions.

  1.

    In Assertion 1, if \(Lu>0\) in D, then u can either achieve its (finite) maximum only on \(\overline{E}\setminus D\) or be unbounded on \({\overline{E}}\).

  2.

    In Assertion 2, essentially the following equality holds according to the proof,

    $$\begin{aligned} \sup _{\overline{E}}u^+=\sup _{\overline{E}\setminus D}u^+, \end{aligned}$$

    even though Assertion 1 in Theorem 2.2 cannot be applied directly to \(u^+\), as it is not in \(C^2(D)\). In particular, if u takes positive values in D, or equivalently, \(D_+\ne \emptyset \), then we have

    $$\begin{aligned} \sup _{\overline{E}}u=\sup _{\overline{E}\setminus D}u^+. \end{aligned}$$

Remark 2.4

The proof of Theorem 2.2 still works if the matrix \({\mathbf {a}}=(a_{jk})\) is only positive semidefinite. Indeed, since the eigenvalues of \({\mathbf {a}}(x^0)\) are nonnegative (\(\lambda _j\ge 0, j=1,\ldots ,d\)), the inequality (9) still holds.

Remark 2.5

As in Remark 2.1, there are two special cases of Theorem 2.2, namely \(E=\mathbb {R}^d\) and \(E=D\). Taking the latter as an example, suppose that \(u\in C^2(D)\cap C(\overline{D})\), \(Lu\ge 0\) in D, and \(\text {supp}\,\nu (x,\cdot )\subset \overline{D}-x\) for each \(x\in D\), where D is open and bounded but not necessarily connected. Then the following conclusions hold:

  1.

    If \(c\equiv 0\) in D, then

    $$\begin{aligned} \max _{\overline{D}}u=\max _{\partial {D}}u. \end{aligned}$$
  2.

    If \(c\le 0\) in D, then

    $$\begin{aligned} \max _{\overline{D}}u\le \max _{\partial {D}}u^+. \end{aligned}$$

Corollary 2.6

Let D be an open and bounded set but not necessarily connected, and E be an open set satisfying \(D\subset E\). Assume that \(u\in C^2(D)\cap C(\overline{E})\), and \(\text { supp}\,\nu (x,\cdot )\subset {\overline{E}}-x\) for each \(x\in D\).

  1.

    If \(c\equiv 0\) and \(Lu\le 0\) both hold in D, then

    $$\begin{aligned} \inf _{\overline{E}}u=\inf _{\overline{E}\setminus D}u. \end{aligned}$$
  2.

    If \(c\le 0\) and \(Lu\le 0\) both hold in D, then

    $$\begin{aligned} \inf _{\overline{E}}u\ge -\sup _{\overline{E}\setminus D}u^-. \end{aligned}$$
  3.

    If \(c\le 0\) and \(Lu=0\) both hold in D, then

    $$\begin{aligned} \sup _{\overline{E}}|u|=\sup _{\overline{E}\setminus D}|u|. \end{aligned}$$

In all three expressions, the supremum and infimum may be infinite.

Proof

  1.

    Apply directly the first assertion of Theorem 2.2 to \(-u\).

  2.

    Apply the second assertion of Theorem 2.2 to \(-u\).

  3.

    Applying Statement 2 in Remark 2.3 to \(-u\), we have

    $$\begin{aligned} \sup _{\overline{E}}u^-=\sup _{\overline{E}\setminus D}u^-. \end{aligned}$$

Then it follows that

$$\begin{aligned} \sup _{\overline{E}}|u|=\big (\sup _{\overline{E}}u^+\big ) \vee \big (\sup _{\overline{E}}u^-\big )=\big (\sup _{\overline{E}\setminus D}u^+\big )\vee \big (\sup _{\overline{E}\setminus D}u^-\big )=\sup _{\overline{E}\setminus D}|u|. \end{aligned}$$

This completes the proof. \(\square \)

Going one step further, we suppose E is bounded and then apply Corollary 2.6 to \(u-v\), yielding the following corollary which is often used in applications.

Corollary 2.7

Let D be an open and bounded set but not necessarily connected, and E be an open set satisfying \(D\subset E\). Assume that \(u, v\in C^2(D)\cap C(\overline{E})\), \(c\le 0\) in D, and \(\text { supp}\,\nu (x,\cdot )\subset {\overline{E}}-x\) for each \(x\in D\).

  1.

    (Comparison Principle) If \(Lu\le Lv\) in D and \(u\ge v\) on \(\overline{E}\setminus D\), then \(u\ge v\) in \(\overline{E}\).

  2.

    (Uniqueness) If \(Lu=Lv\) in D and \(u=v\) on \(\overline{E}\setminus D\), then \(u=v\) in \(\overline{E}\).

Proof

The two results immediately follow by using the last two assertions of Corollary 2.6 for \(u-v\). \(\square \)

2.2 Strong maximum principle for elliptic case

This section is devoted to the strong maximum principle for the elliptic Waldenfels operator L.

Theorem 2.8

(Strong maximum principle for elliptic Waldenfels operator) Let D be an open and connected set but not necessarily bounded, and E be an open set satisfying \(D\subset E\). Assume that \(u\in C^2(D)\cap C(\overline{E})\), \(Lu\ge 0\) in D, and \(\text { supp}\,\nu (x,\cdot )\subset {\overline{E}}-x\) for each \(x\in D\). Moreover, assume that the mapping \(x\rightarrow \nu (x,\cdot )\) is continuous in D. If one of the following conditions holds:

  1.

    \(c\equiv 0\) in D and u achieves a (finite) maximum over \({\overline{E}}\) at an interior point in D;

  2.

    \(c\le 0\) in D and u achieves a (finite) nonnegative maximum over \({\overline{E}}\) at an interior point in D;

  3.

    u achieves a zero maximum over \({\overline{E}}\) at an interior point in D,

then u is constant on \(\overline{D}\).

Before proving this theorem, let us first give some comments on it.

Remark 2.9

The propagation of the maximizer point by translation of the measure support, mentioned in [5, 7], also occurs in our setting. That is, if the assumptions in Theorem 2.8 hold, then u is constant on the set \(\overline{\bigcup _{n=0}^\infty \varLambda _n}\), where the sets \(\varLambda _n\) are defined inductively, starting from a maximizer point \(x^0\in D\), by

$$\begin{aligned} \varLambda _0 = \{x^0\},\quad \varLambda _{n+1}= \bigcup _{x\in D\cap \varLambda _n}[\text { supp}\,\nu (x,\cdot )+x]. \end{aligned}$$

This result depends on the support of every measure \(\nu (x,\cdot )\); it can be easily proved by induction and continuity. It is noteworthy that in this scheme the set D need not be connected, since jumps from one connected component to another may occur when the measure supports overlap two or more connected components.

In conclusion, it is the integro-differential term, i.e., the jump term, that leads to the propagation of the maximizer point between those connected components. Therefore, we need to require that the Markovian particle can move by jumps only inside the set \(\overline{E}\), i.e., the support condition (4), in order to obtain the propagation of the maximizer point (over \(\overline{E}\)).
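The inductive scheme for the sets \(\varLambda _n\) can be mimicked on a discrete toy model. In the sketch below the state space is the integers, D is the union of two disjoint blocks, and every point carries the same finite jump support; these choices, as well as the names `propagate`, `in_D` and `jump_support`, are purely hypothetical and serve only to show how jumps carry the maximizer set from one component of D to another.

```python
def propagate(x0, in_D, jump_support, max_steps=1000):
    """Return the union of Lambda_0, Lambda_1, ... for a discrete toy model.

    in_D(x) tells whether x belongs to D; jump_support(x) lists the jumps z
    in the (finite) support of nu(x, .). Only points of D propagate further.
    """
    reached = {x0}
    frontier = {x0}  # points of the current layer that were not seen before
    for _ in range(max_steps):
        next_layer = set()
        for x in frontier:
            if in_D(x):
                next_layer.update(x + z for z in jump_support(x))
        new_points = next_layer - reached
        if not new_points:
            break  # revisiting old points adds nothing to the union
        reached |= new_points
        frontier = new_points
    return reached


def in_D(x):
    # Toy set D with two "connected components": {0,...,4} and {10,...,14}.
    return 0 <= x <= 4 or 10 <= x <= 14


def jump_support(x):
    # Toy jump support, identical for every x: steps of -1, +1 and +10.
    return (-1, 1, 10)


reached = propagate(2, in_D, jump_support)
```

Starting from the point 2 in the first block, the jump of size 10 carries the set into the second block, so `reached` contains all of D together with the neighbouring points outside D that are hit by a single jump.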

Remark 2.10

As shown in Remark 2.1, our results on the weak and strong maximum principles, formulated in Theorems 2.2 and 2.8, respectively, cover the situations where the support of the jump measure is either bounded or unbounded, in particular the cases \(E=D\) and \(E=\mathbb {R}^d\), whereas Taira [30] only considered the situation \(E=D\). Furthermore, our assumptions are less restrictive than Taira’s: in our work, connectedness is not needed for the weak maximum principle, while boundedness is not necessary for the strong maximum principle. Moreover, the continuity of the mapping \(x\rightarrow \nu (x,\cdot )\) is needed only for the strong maximum principle, not for the weak one.

As in the weak case, by applying Theorem 2.8 directly to \(-u\), one obtains the strong maximum principle for the converse case \(Lu\le 0\).

Corollary 2.11

Let D be an open and connected set but not necessarily bounded, and E be an open set satisfying \(D\subset E\). Assume that \(u\in C^2(D)\cap C(\overline{E})\), \(Lu\le 0\) in D, and \(\text { supp}\,\nu (x,\cdot )\subset {\overline{E}}-x\) for each \(x\in D\). Moreover, assume that the mapping \(x\rightarrow \nu (x,\cdot )\) is continuous in D. If one of the following conditions holds:

  1.

    \(c\equiv 0\) in D and u achieves a (finite) minimum over \({\overline{E}}\) at an interior point in D;

  2.

    \(c\le 0\) in D and u achieves a (finite) nonpositive minimum over \({\overline{E}}\) at an interior point in D;

  3.

    u achieves a zero minimum over \({\overline{E}}\) at an interior point in D,

then u is constant on \(\overline{D}\).

Now we start to prove Theorem 2.8.

Proof of Theorem 2.8

Suppose that \(u\not \equiv \max _{{\overline{E}}}u\) in D. Set \(D_<:=\{x\in D\mid u(x)<\max _{{\overline{E}}}u\}\ne \emptyset \). Since D is connected, we have \(\partial D_<\cap D\ne \emptyset \), and hence we can choose a point \(x^1\in D_<\) such that \(\text {dist}(x^1,\partial D_<\cap D)<\text {dist}(x^1,\partial D)\). Denote by B the largest ball centered at \(x^1\) with \(B\subset D_<\). Then \({\overline{B}}\subset D\) and there exists some point \(x^0\in \partial B\) with

$$\begin{aligned} u(x^0)=\max _{{\overline{E}}}u>u(x),\quad \forall x\in B. \end{aligned}$$

Since u achieves its maximum at \(x^0\in D\), we have \(\nabla u(x^0)=0\). We will derive a contradiction by proving that

$$\begin{aligned} \frac{\partial u}{\partial \mathbf{n}}(x^0)>0, \end{aligned}$$
(12)

where \(\mathbf{n}\) is the unit outer normal vector of B at \(x^0\). Then by this contradiction, u must be constant within D, and the result follows by continuity. Now the rest of the proof is devoted to (12). We divide it into three steps.

Step 1 The closed set \({\overline{B}}\) is a d-dimensional \(C^2\)-differentiable manifold with boundary. Let \((U,{\mathbf {\Phi }})\) be a coordinate chart near \(x^0\), where U is a relatively open neighborhood of \(x^0\) in \({\overline{B}}\) and \({\mathbf {\Phi }}\) is a \(C^2\)-diffeomorphism from U onto its image in the closed upper half-space \(\mathbb {H}^d_+:=\{y\in {\mathbb {R}}^d\mid y_d\ge 0\}\), with inverse \({\mathbf {\Phi }}^{-1}\). Then \({\mathbf {\Phi }}\) is an embedding whose rank at \(x^0\) equals d; equivalently, if we denote by \(J{\mathbf {\Phi }}\) the Jacobian matrix of \({\mathbf {\Phi }}\), i.e., \(J{\mathbf {\Phi }}:=\nabla _x{\mathbf {\Phi }}\), then \(J{\mathbf {\Phi }}\) is non-degenerate. As a result, the tangent mapping \({\mathbf {\Phi }}_*\) induced by \({\mathbf {\Phi }}\) at the point \(x^0\) is an isomorphism.

Now we consider the function u restricted to U. We define \({\hat{u}}(y):=u({\mathbf {\Phi }}^{-1}(y))\), \(y\in {\mathbf {\Phi }}(U)\). Then \({\hat{u}}\) attains its maximum over \({\mathbf {\Phi }}(U)\subset \mathbb {H}^d_+\) at \(y^0={\mathbf {\Phi }}(x^0)\). Hence at the maximizer point \(y^0\),

$$\begin{aligned} \frac{\partial {\hat{u}}}{\partial y_j}=0,\quad j=1,\ldots ,d-1. \end{aligned}$$
(13)

We also denote the image tangent vector of \(\frac{\partial }{\partial {\mathbf {n}}}\) under tangent mapping \({\mathbf {\Phi }}_*\) by

$$\begin{aligned} \frac{\partial }{\partial \hat{{\mathbf {n}}}}:= {\mathbf {\Phi }}_*\Big (\frac{\partial }{\partial {\mathbf {n}}}\Big ). \end{aligned}$$

We compute at \(y^0\) (or \(x^0\))

$$\begin{aligned} \begin{aligned}&\frac{\partial {\hat{u}}}{\partial \hat{{\mathbf {n}}}} = \bigg \langle {\mathbf {\Phi }}_*\Big (\frac{\partial }{\partial {\mathbf {n}}}\Big ), d{\hat{u}}\bigg \rangle = \bigg \langle \frac{\partial }{\partial {\mathbf {n}}},{\mathbf {\Phi }}^*(d{\hat{u}})\bigg \rangle \\&\quad = \bigg \langle \frac{\partial }{\partial {\mathbf {n}}}, d({\hat{u}}\circ {\mathbf {\Phi }})\bigg \rangle = \bigg \langle \frac{\partial }{\partial {\mathbf {n}}},du\bigg \rangle = \frac{\partial u}{\partial {\mathbf {n}}}=0, \end{aligned} \end{aligned}$$
(14)

where \({\mathbf {\Phi }}^*\) denotes the cotangent mapping induced by \({\mathbf {\Phi }}\) at the point \(x^0\), and \(\langle \cdot ,\cdot \rangle \) is the dual pairing between the tangent space and the cotangent space at \(y^0\) (or \(x^0\)). Now recall that \({\mathbf {\Phi }}_*\) is an isomorphism. The tangent vector \(\frac{\partial }{\partial \hat{{\mathbf {n}}}}\) is linearly independent of \(\{\frac{\partial }{\partial y_j}\mid j=1,\ldots ,d-1\}\) and consequently, by (13) and (14),

$$\begin{aligned} \frac{\partial {\hat{u}}}{\partial y_d}(y^0)=0. \end{aligned}$$
(15)

Combining (13) and (15) together with the fact that \({\hat{u}}\) attains its maximum at \(y^0\), we have

$$\begin{aligned} \nabla _y^2{\hat{u}}(y^0)\le 0. \end{aligned}$$
(16)

Combining (13), (15) and (16), we have at \(x^0\),

$$\begin{aligned} \nabla _x u&= (J{\mathbf {\Phi }})^T(\nabla _y{\hat{u}})=0, \\ \nabla _x^2 u&= (J{\mathbf {\Phi }})^T(\nabla _y^2{\hat{u}})(J{\mathbf {\Phi }}) +(\nabla ^2_x{\mathbf {\Phi }})(\nabla _y{\hat{u}})= (J{\mathbf {\Phi }})^T(\nabla _y^2{\hat{u}})(J{\mathbf {\Phi }}), \end{aligned}$$

where we treat \(\nabla ^2_x{\mathbf {\Phi }}\) as a third-order covariant tensor. Hence at \(x^0\),

$$\begin{aligned} \begin{aligned} Au&= \text {tr}[{\mathbf {a}}^T(\nabla ^2_x u)]+b^T\nabla _x u+cu \\&= \text {tr}[{\mathbf {a}}^T(J{\mathbf {\Phi }})^T(\nabla _y^2 {\hat{u}})(J{\mathbf {\Phi }})]+cu \\&= \text {tr}[(J{\mathbf {\Phi }}){\mathbf {a}}^T (J{\mathbf {\Phi }})^T(\nabla _y^2 {\hat{u}})]+cu \\&= \text {tr}\big [\big ((J{\mathbf {\Phi }}){\mathbf {a}}(J{\mathbf {\Phi }})^T\big )^T (\nabla _y^2{\hat{u}})\big ]+cu \\&=: \text {tr}[{\hat{{\mathbf {a}}}}^T(\nabla ^2_y{\hat{u}})]+cu, \end{aligned} \end{aligned}$$
(17)

where \({\hat{{\mathbf {a}}}}:=(J{\mathbf {\Phi }}){\mathbf {a}}(J{\mathbf {\Phi }})^T\). Since \({\mathbf {a}}(x)\) is symmetric and positive definite and the matrix \(J{\mathbf {\Phi }}\) is non-degenerate, we see the matrix \({\hat{{\mathbf {a}}}}(x^0)\) is also symmetric and positive definite. Hence, as explained in the proof of Theorem 2.2 and by (16), we have

$$\begin{aligned} \text {tr}\big [{\hat{{\mathbf {a}}}}(x^0)^T\big (\nabla ^2_y{\hat{u}}(y^0)\big )\big ] \le 0. \end{aligned}$$
(18)

Define

$$\begin{aligned} E_0:=\big \{x\in {\overline{E}}\mid u(x)=\max _{{\overline{E}}}u\big \}=\big \{x\in {\overline{E}}\setminus B\mid u(x)=\max _{{\overline{E}}}u\big \}. \end{aligned}$$
(19)

Recall that u attains its maximum over \(\overline{E}\) at \(x^0\). Now we have

$$\begin{aligned} \begin{aligned} Ku(x^0)&= \int _{{\mathbb {R}}^d\setminus \{0\}}\big [u(x^0+z)-u(x^0)- z^T\nabla u(x^0)\cdot {\mathbf {1}}_{\{|z|<1\}}\big ]\nu (x^0,dz) \\&= \int _{{\overline{E}}-x^0}[u(x^0+z)-u(x^0)]\nu (x^0,dz) \\&= \int _{x^0+z\in {\overline{E}}\setminus E_0}[u(x^0+z)-u(x^0)]\nu (x^0,dz) \\&\le 0. \end{aligned} \end{aligned}$$
(20)

From (17), (18) and (20), we obtain

$$\begin{aligned} Lu(x^0)=Ku(x^0)+Au(x^0)\le c(x^0)u(x^0)\le 0. \end{aligned}$$

By recalling the assumption on u, we have \(Lu(x)\ge 0\) for each \(x\in D\), and thus

$$\begin{aligned} Lu(x^0)=Au(x^0)=Ku(x^0)=0, \end{aligned}$$

especially,

$$\begin{aligned} Ku(x^0) = \int _{x^0+z\in {\overline{E}}\setminus E_0}[u(x^0+z)-u(x^0)]\nu (x^0,dz) = 0. \end{aligned}$$

Hence, we conclude

$$\begin{aligned} \nu (x^0,({\overline{E}}\setminus E_0)-x^0)=0. \end{aligned}$$
(21)

Step 2 We set \(B=B(x^1,R)\) with \(R=|x^0-x^1|\); see Fig. 1.

Fig. 1 Sketch for Theorem 2.8

Define

$$\begin{aligned} v(x):=e^{-\beta |x-x^1|^2}-e^{-\beta R^2},\quad x\in \overline{E}, \end{aligned}$$

for \(\beta >0\) to be selected below. Then

$$\begin{aligned} \begin{aligned} Av =&\ e^{-\beta |x-x^1|^2}\Big \{\text {tr}\big [{\mathbf {a}}^T \big (4\beta ^2(x-x^1)\otimes (x-x^1)-2\beta I\big )\big ] \\&\qquad \qquad \ \ -2\beta b^T(x-x^1)+c\big (1-e^{-\beta (R^2-|x-x^1|^2)}\big )\Big \} \\ =&\ e^{-\beta |x-x^1|^2}\big [4\beta ^2(x-x^1)^T{\mathbf {a}}^T (x-x^1)-2\beta \text {tr}({\mathbf {a}}) \\&\qquad \qquad \ \ -2\beta b^T(x-x^1)+c\big (1-e^{-\beta (R^2-|x-x^1|^2)}\big )\big ] \\ \ge&\ e^{-\beta |x-x^1|^2}\big [4\gamma \beta ^2|x-x^1|^2-2\beta \text {tr} ({\mathbf {a}})-2\beta |b||x-x^1| \\&\qquad \qquad \ \ +c\big (1-e^{-\beta (R^2-|x-x^1|^2)}\big )\big ]. \end{aligned} \end{aligned}$$
(22)
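The second-order term in (22) rests on the Hessian formula \(\nabla ^2v=e^{-\beta |x-x^1|^2}\big (4\beta ^2(x-x^1)\otimes (x-x^1)-2\beta I\big )\). As a quick numerical sanity check, the following sketch compares this closed form with central finite differences; the step size h, the test point and the function names are illustrative assumptions only.

```python
import math


def v(x, x1, beta, R):
    """Auxiliary function v(x) = exp(-beta |x - x1|^2) - exp(-beta R^2)."""
    r2 = sum((xi - ci) ** 2 for xi, ci in zip(x, x1))
    return math.exp(-beta * r2) - math.exp(-beta * R * R)


def hessian_closed_form(x, x1, beta):
    """exp(-beta |x - x1|^2) * (4 beta^2 (x - x1)(x - x1)^T - 2 beta I)."""
    d = [xi - ci for xi, ci in zip(x, x1)]
    e = math.exp(-beta * sum(di * di for di in d))
    n = len(x)
    return [[e * (4.0 * beta ** 2 * d[j] * d[k] - (2.0 * beta if j == k else 0.0))
             for k in range(n)] for j in range(n)]


def hessian_finite_difference(x, x1, beta, R, h=1e-4):
    """Central-difference approximation of the Hessian of v at x."""
    n = len(x)

    def shifted(j, sj, k, sk):
        y = list(x)
        y[j] += sj * h
        y[k] += sk * h
        return v(y, x1, beta, R)

    return [[(shifted(j, 1, k, 1) - shifted(j, 1, k, -1)
              - shifted(j, -1, k, 1) + shifted(j, -1, k, -1)) / (4.0 * h * h)
             for k in range(n)] for j in range(n)]
```

The constant \(e^{-\beta R^2}\) does not affect the derivatives; it only enters the zeroth-order term \(cv\) in (22).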

Consider next the open set \(D_0:=B(x^1,R)\cap B(x^0,r)\) (see Fig. 1) with some \(r\in (0,R)\) which will be chosen later. When \(\beta \) is large enough, we have

$$\begin{aligned} Av \ge e^{-\beta R^2}\big [4\gamma \beta ^2(R-r)^2-2\beta \text {tr} ({\mathbf {a}})-2\beta |b|R\big ] > C_1\beta ^2-C_2\beta , \end{aligned}$$
(23)

for \(x\in D_0\), where \(C_1,C_2\) are two positive constants.

Moreover, by recalling (21), we have

$$\begin{aligned} \begin{aligned} Kv(x^0) =&\int _{{\mathbb {R}}^d\setminus {\{0\}}}\big [e^{-\beta |x^0-x^1+z|^2} -e^{-\beta R^2} \\&\qquad \quad \,\ +2\beta z^T(x^0-x^1)e^{-\beta R^2}{\mathbf {1}}_{\{|z|<1\}}\big ]\nu (x^0,dz) \\ =&\int _{\begin{array}{c} x^0+z\in E_0 \\ |z|\ge 1 \end{array}}\big [e^{-\beta |x^0-x^1+z|^2} -e^{-\beta R^2}\big ]\nu (x^0,dz) \\&+ \int _{\begin{array}{c} x^0+z\in E_0 \\ 0<|z|<1 \end{array}}\big [e^{-\beta |x^0-x^1+z|^2} -e^{-\beta R^2} \\&\qquad \qquad \quad \ +2\beta z^T(x^0-x^1)e^{-\beta R^2}\big ]\nu (x^0,dz) \\ =:&\ I + II. \end{aligned} \end{aligned}$$
(24)

For the term I, it is clear that \(E_0\cap \overline{B}=\{x^0\}\) and consequently

$$\begin{aligned} |x^0+z-x^1|>|x^0-x^1|=R \end{aligned}$$

for every \(z\ne 0\) satisfying \(x^0+z\in E_0\). Thus, for sufficiently large \(\beta \), we have

$$\begin{aligned} -C_3<e^{-\beta |x^0-x^1+z|^2}-e^{-\beta R^2}<0 \end{aligned}$$

with a constant \(C_3>0\). Hence,

$$\begin{aligned} I > -C_3\int _{\begin{array}{c} x^0+z\in E_0 \\ |z|\ge 1 \end{array}}\nu (x^0,dz)\gtrsim -C_3. \end{aligned}$$
(25)

For the term II, Taylor's expansion gives, for \(x^0+z\in E_0\) and \(\beta \) large enough,

$$\begin{aligned} \begin{aligned}&\ e^{-\beta |x^0-x^1+z|^2} -e^{-\beta R^2}+2\beta z^T(x^0-x^1)e^{-\beta R^2} \\&\quad =\ \frac{1}{2} \big [4\beta ^2e^{-\beta |x^0-x^1+\theta z|^2}| z^T(x^0-x^1+\theta z)|^2-2\beta e^{-\beta |x^0-x^1+\theta z|^2}|z|^2\big ] \\&\quad \ge -\beta e^{-\beta |x^0-x^1+\theta z|^2}|z|^2 \\&\quad \ge -C_4\beta |z|^2, \end{aligned} \end{aligned}$$

with some \(\theta \in (0,1)\) and a constant \(C_4>0\). Hence,

$$\begin{aligned} II > -C_4\beta \int _{\begin{array}{c} x^0+z\in E_0 \\ 0<|z|<1 \end{array}}|z|^2\nu (x^0,dz)\gtrsim -C_4\beta . \end{aligned}$$
(26)
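The Taylor step behind (26) can be made explicit. As a sketch, introduce the auxiliary function \(f(z):=e^{-\beta |x^0-x^1+z|^2}\) (the name f is used here only for illustration and does not appear elsewhere). The second-order Taylor formula with Lagrange remainder reads

$$\begin{aligned} \begin{aligned} f(z)-f(0)-z^T\nabla f(0)&= \frac{1}{2}z^T\nabla ^2f(\theta z)z,\quad \theta \in (0,1), \\ \nabla f(z)&= -2\beta (x^0-x^1+z)f(z), \\ \nabla ^2f(z)&= f(z)\big [4\beta ^2(x^0-x^1+z)\otimes (x^0-x^1+z)-2\beta I\big ]. \end{aligned} \end{aligned}$$

Since \(f(0)=e^{-\beta R^2}\) and \(\nabla f(0)=-2\beta (x^0-x^1)e^{-\beta R^2}\), the left-hand side is exactly the integrand of II. Dropping the nonnegative term \(2\beta ^2f(\theta z)|z^T(x^0-x^1+\theta z)|^2\) leaves the lower bound \(-\beta f(\theta z)|z|^2\), and because \(0<f\le 1\) one admissible choice is \(C_4=1\).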

Thus, combining the results of (23), (24), (25) and (26), we find that

$$\begin{aligned} Lv(x^0)&= Av(x^0)+Kv(x^0)=Av(x^0)+I +II\\&\gtrsim C_1\beta ^2-(C_2+C_4)\beta -C_3 \\&> 0, \end{aligned}$$

provided \(\beta >0\) is fixed large enough. Since Lv(x) is continuous in \(x\in D\) in light of the continuity of \(\nu (x,\cdot )\), we have

$$\begin{aligned} Lv(x)\ge 0, \end{aligned}$$
(27)

for \(x\in D_0\), provided r is small enough.

Step 3 Define

$$\begin{aligned} u^\epsilon (x)=u(x)+\epsilon v(x)-u(x^0),\quad x\in \overline{E}, \end{aligned}$$

for a constant \(\epsilon >0\). We can choose \(\epsilon \) so small that

$$\begin{aligned} u^\epsilon (x)\le 0,\quad x\in \overline{E}\setminus D_0, \end{aligned}$$

since \(v(x)\le 0\) for \(x\in \overline{E}\setminus B\), and \(u(x)<u(x^0)\) for \(x\in B\setminus D_0\) by recalling \(u(x^0)>u(x)\) for all \(x\in D_<\).

For the first two cases, that is, \(c\equiv 0\) in D, or \(c\le 0\) in D and \(u(x^0)\ge 0\), from (27) and the fact that \(Lu\ge 0\) in D, we see that

$$\begin{aligned} Lu^\epsilon \ge -cu(x^0)\ge 0 \quad \text {in }D_0. \end{aligned}$$
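The inequality above can be unpacked as follows. The sketch uses only the linearity of L, the fact that K annihilates constants, and the fact that A acts on a constant through its zeroth-order term alone:

$$\begin{aligned} \begin{aligned} Lu^\epsilon&= Lu+\epsilon Lv-L\big [u(x^0)\big ] \\&= Lu+\epsilon Lv-c\,u(x^0) \\&\ge -c\,u(x^0) \quad \text {in }D_0. \end{aligned} \end{aligned}$$

Here \(Lu\ge 0\) in D by assumption and \(Lv\ge 0\) in \(D_0\) by (27). In the first case \(c\equiv 0\), so the right-hand side vanishes; in the second case \(c\le 0\) and \(u(x^0)\ge 0\) give \(-c\,u(x^0)\ge 0\).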

In view of the weak maximum principle for elliptic Waldenfels operators (Theorem 2.2), we know that \(u^\epsilon \le 0\) in \(\overline{E}\). Note that \(u^\epsilon (x^0)=0\). Thus we have

$$\begin{aligned} 0=\frac{\partial u^\epsilon }{\partial {\mathbf {n}}}(x^0)=\frac{\partial u}{\partial {\mathbf {n}}}(x^0)+\epsilon \frac{\partial v}{\partial {\mathbf {n}}}(x^0). \end{aligned}$$

Consequently,

$$\begin{aligned} \frac{\partial u}{\partial {\mathbf {n}}}(x^0)=-\epsilon \frac{\partial v}{\partial {\mathbf {n}}}(x^0)=-\epsilon \nabla v(x^0)\cdot \frac{(x^0-x^1)}{R}=2\epsilon \beta Re^{-\beta R^2}>0, \end{aligned}$$

as required.

For the third case that \(u(x^0)=0\), obviously \(u\le 0\) in D. We find

$$\begin{aligned} (L-c^+)u=Lu-c^+u\ge Lu\ge 0 \quad \text {in }D. \end{aligned}$$

Notice that the zeroth-order coefficient of operator \(L-c^+\) is \(c-c^+\), which is nonpositive in D. Hence we apply the result of the second case by replacing L and c respectively with \(L-c^+\) and \(c-c^+\) to get the same result for this case.

We have thus completed the proof. \(\square \)

Some comments will be helpful for understanding the long proof of Theorem 2.8.

Remark 2.12

In Theorem 2.8, we restrict the set D to be connected to ensure \(\partial D_<\cap D\ne \emptyset \). More generally, if D is not connected, one may merely replace D with the connected component of D which contains the maximizer point, and we thus conclude that u is constant in this connected component.

As Remark 2.13 indicates, the diffusion term gives rise to the propagation of the maximizer point within the corresponding connected component. This is why we need the set D to be connected.

Remark 2.13

We can see from (23), (25) and (26) that it is the second-order differential term \(\text { tr}[{\mathbf {a}}^T(\nabla ^2)]\), namely the diffusion term, that plays the leading role in Step 2 of the proof of Theorem 2.8.

Remark 2.14

Theorem 2.8 still holds if the matrix \({\mathbf {a}}(x)=(a_{jk}(x))_{j,k=1,\ldots ,d}\) is only positive semidefinite and the unit outer normal vector \({\mathbf {n}}\) is not in the nullspace of \({\mathbf {a}}(x^0)\).

In fact, recall that \({\hat{{\mathbf {a}}}}=(J{\mathbf {\Phi }}){\mathbf {a}}(J{\mathbf {\Phi }})^T\) is also positive semidefinite, as the Jacobian matrix \(J{\mathbf {\Phi }}\) is invertible. For the reason mentioned in Remark 2.4, (18) still holds. Moreover, since \({\mathbf {n}}\) is not in the nullspace of \({\mathbf {a}}(x^0)\), there exists a positive constant \(\gamma \) such that \({\mathbf {n}}^T{\mathbf {a}}(x^0){\mathbf {n}}\ge \gamma >0\), and consequently \((x^0-x^1)^T{\mathbf {a}}(x^0)(x^0-x^1)\ge \gamma |x^0-x^1|^2\). By continuity we can choose r so small that for all \(x\in D_0=B(x^1,R)\cap B(x^0,r)\),

$$\begin{aligned} (x-x^1)^T{\mathbf {a}}(x)(x-x^1)\ge \gamma _1|x-x^1|^2, \end{aligned}$$

with a positive constant \(\gamma _1\). Hence (22) holds with \(\gamma _1\) in place of \(\gamma \), and (23) also holds for some other constants \(C_1, C_2\).

By an argument similar to the proof of (12), we can easily obtain the following version of Hopf’s boundary point lemma, which is a generalization of [30, Lemma C.3].

Proposition 2.15

(Hopf’s boundary point lemma for elliptic Waldenfels operators) Let D be an open set (not necessarily bounded or connected) with boundary \(\partial D\) being \(C^2\). Assume that \(u\in C^2(\overline{D})\), \(Lu\ge 0\) in D, and \(\text { supp}\,\nu (x,\cdot )\subset \overline{D}-x\) for each \(x\in D\), and furthermore the mapping \(x\rightarrow \nu (x,\cdot )\) is continuous in D. Suppose that u achieves its (finite) maximum over \(\overline{D}\) at point \(x^0\in \partial D\) such that \(u(x^0)>u(x)\) for all \(x\in D\), and that one of the following conditions holds:

  1. \(c\equiv 0\) in D;

  2. \(c\le 0\) in D and \(u(x^0)\ge 0\);

  3. \(u(x^0)=0\).

Then the outer normal derivative is positive: \(\frac{\partial u}{\partial \mathbf{n}}(x^0)>0\).

In fact, if we let \(E=D\), replace B by D in Step 1 of the proof of Theorem 2.8, and replace \(D_<\) by D in Step 3, then the three-step argument also works in the context of Proposition 2.15 and the result follows.

3 Maximum principles for parabolic Waldenfels operators

We assume that D, E are two open sets in \(\mathbb {R}^d\) and \(D\subset E\), where E is not necessarily bounded. Set \(D_T:=D\times (0,T]\) and \(E_T:=E\times (0,T]\) for arbitrarily fixed \(T>0\).

As in [14,15,16, 28, 29], we define a time-dependent elliptic Waldenfels operator

$$\begin{aligned} L := A+K, \end{aligned}$$
(28)

where A and K are defined, respectively, by

$$\begin{aligned} \begin{aligned} Au(x,t)&:= \sum ^d_{j,k=1}a_{jk}(x,t)\frac{\partial ^2 u}{\partial x_j\partial x_k}(x,t)+\sum ^d_{j=1}b_{j}(x,t)\frac{\partial u}{\partial x_j}(x,t)+c(x,t)u(x,t), \\ Ku(x,t)&:= \int _{\mathbb {R}^d\setminus \{0\}}\Big [u(x+z,t)-u(x,t)-\sum ^d_{j=1}z_j \frac{\partial u}{\partial x_j}(x,t){\mathbf {1}}_{\{|z|<1\}}\Big ]\nu (t,x,dz). \end{aligned} \end{aligned}$$

We make the following assumptions:

  1. Continuity condition: \(a_{jk},b_j,c\in C(\overline{E_T})\) \((j,k=1,\ldots ,d).\)

  2. Symmetry condition: \(a_{jk}=a_{kj}\) \((j,k=1,\ldots ,d)\). Uniform ellipticity condition: There exists a constant \(\gamma >0\) such that

    $$\begin{aligned} \sum ^d_{j,k=1}a_{jk}(x,t)\xi _j\xi _k\ge \gamma |\xi |^2, \end{aligned}$$

    for all \((x,t)\in D_T\), \(\xi \in \mathbb {R}^d\).

  3. Lévy measures: The kernel \(\{\nu (t,x,\cdot )\mid (x,t)\in \mathbb {R}^d\times [0,T]\}\) is a family of Lévy measures, namely, each \(\nu (t,x,\cdot )\) is a Borel measure on \({\mathbb {R}}^d\setminus \{0\}\) such that for all \((x,t)\in \mathbb {R}^d\times [0,T]\),

    $$\begin{aligned} \int _{{\mathbb {R}}^d\setminus \{0\}}(1\wedge |z|^2)\nu (t,x,dz)<\infty , \end{aligned}$$
    (29)

    and moreover, for fixed \(U\in {\mathcal {B}}({\mathbb {R}}^d\setminus \{0\})\), the mapping \(\mathbb {R}^d\times [0,T]\ni (x,t)\rightarrow \nu (t,x,U)\in [0,\infty )\) is Borel measurable. Here we further assume that for each \((x,t)\in D_T\), the measure \(\nu (t,x,\cdot )\) is supported in \({\overline{E}}-x:=\{y-x\mid y\in {\overline{E}}\}=\{z\mid x+z\in {\overline{E}}\}\). That is,

    $$\begin{aligned} \text { supp}\,\nu (t,x,\cdot )\subset {\overline{E}}-x,\quad \forall (x,t)\in D_T. \end{aligned}$$
    (30)

The Markov process associated with such a generator L can be determined as a solution to the martingale problem induced by L (see, e.g., [29]). However, it is not clear if the Markov process determined by the martingale problem is linked to a stochastic differential equation with certain boundary conditions.

Now we consider the parabolic Waldenfels operator

$$\begin{aligned} -\frac{\partial }{\partial t}+L, \end{aligned}$$

with L being defined in (28), and we are concerned with the maximum principles for such a parabolic operator.

3.1 Weak maximum principle for parabolic case

We are now in a position to present both weak and strong maximum principles for the parabolic Waldenfels operator \(-\frac{\partial }{\partial t}+L\). First we prove the weak one.

Theorem 3.1

(Weak maximum principle for parabolic Waldenfels operators) Let D be an open and bounded set but not necessarily connected, and E be an open set satisfying \(D\subset E\). Assume that \(u\in C^{2,1}(D_T)\cap C(\overline{E_T})\), \(-\frac{\partial u}{\partial t}+Lu\ge 0\) in \(D_T\), and \(\text { supp}\,\nu (t,x,\cdot )\subset {\overline{E}}-x\) for each \((x,t)\in D_T\).

  1. If \(c\equiv 0\) in \(D_T\), then

    $$\begin{aligned} \sup _{\overline{E_T}}u=\sup _{\overline{E_T}\setminus D_T}u. \end{aligned}$$

  2. If \(c\le 0\) in \(D_T\), then

    $$\begin{aligned} \sup _{\overline{E_T}}u\le \sup _{\overline{E_T}\setminus D_T}u^+. \end{aligned}$$

Here the supremum may be infinity.

Proof

Assertion 1. We first treat the case of strict inequality and argue by contradiction. Suppose that the strict inequality holds, i.e.,

$$\begin{aligned} -\frac{\partial u}{\partial t}+Lu>0 \quad \text {in }D_T, \end{aligned}$$
(31)

but there exists a point \((x^0,t^0)\in D_T\) such that

$$\begin{aligned} u(x^0,t^0)=\max _{\overline{E_T}}u. \end{aligned}$$

On one hand, as explained in the proof of Theorem 2.2, we note that \(Lu\le 0\) at point \((x^0,t^0)\). On the other hand, if \(0<t^0<T\), then \((x^0,t^0)\in (D_T)^\circ \) and consequently

$$\begin{aligned} \frac{\partial u}{\partial t}=0 \quad \text {at }(x^0,t^0); \end{aligned}$$

if \(t^0=T\), then \((x^0,t^0)\in \partial (D_T)\) and consequently

$$\begin{aligned} \frac{\partial u}{\partial t}\ge 0 \quad \text {at }(x^0,t^0). \end{aligned}$$

Thus we always have \(-\frac{\partial u}{\partial t}+Lu\le 0\) at point \((x^0,t^0)\), a contradiction to (31).

In the general case that \(-\frac{\partial u}{\partial t}+Lu\ge 0\) holds in \(D_T\), define

$$\begin{aligned} u^\epsilon (x,t):=u(x,t)-\epsilon t\quad \text {in }\overline{E_T}, \end{aligned}$$
(32)

with a positive parameter \(\epsilon \). Then

$$\begin{aligned} -\frac{\partial u^\epsilon }{\partial t}+Lu^\epsilon =-\frac{\partial u}{\partial t}+Lu+\epsilon >0, \end{aligned}$$

and hence \(\sup _{\overline{E_T}}u^\epsilon =\sup _{\overline{E_T}\setminus D_T}u^\epsilon \). Now Assertion 1 follows by letting \(\epsilon \rightarrow 0\).
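The perturbation step can be spelled out as follows. The function \((x,t)\mapsto \epsilon t\) does not depend on x, so its spatial derivatives vanish and so does the integrand of K; only the time derivative and the zeroth-order term survive:

$$\begin{aligned} \begin{aligned} \frac{\partial }{\partial t}(\epsilon t)&= \epsilon ,\qquad A(\epsilon t)=c\,\epsilon t,\qquad K(\epsilon t)=0, \\ -\frac{\partial u^\epsilon }{\partial t}+Lu^\epsilon&= -\frac{\partial u}{\partial t}+Lu+\epsilon -c\,\epsilon t. \end{aligned} \end{aligned}$$

With \(c\equiv 0\) the last term drops out. For the limit, note that \(u-\epsilon T\le u^\epsilon \le u\) on \(\overline{E_T}\), so that

$$\begin{aligned} \sup _{\overline{E_T}}u-\epsilon T\le \sup _{\overline{E_T}}u^\epsilon =\sup _{\overline{E_T}\setminus D_T}u^\epsilon \le \sup _{\overline{E_T}\setminus D_T}u, \end{aligned}$$

and the reverse inequality \(\sup _{\overline{E_T}\setminus D_T}u\le \sup _{\overline{E_T}}u\) is trivial.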

Assertion 2. If u is nonpositive throughout \(D_T\), Assertion 2 is trivially true. Hence we may assume, to the contrary, that u achieves a positive maximum over \(\overline{E_T}\) at a point \((x^0,t^0)\in D_T\).

We first consider the case with strict inequality \(-\frac{\partial u}{\partial t}+Lu>0\) in \(D_T\). Since \(u(x^0,t^0)>0\) and \(c\le 0\), the argument used for Assertion 1 yields the contradiction

$$\begin{aligned} -\frac{\partial u}{\partial t}+Lu\le 0 \quad \text {at } (x^0,t^0). \end{aligned}$$

More generally, if \(-\frac{\partial u}{\partial t}+Lu\ge 0\) in \(D_T\), then set as before \(u^\epsilon (x,t):=u(x,t)-\epsilon t\) with \(\epsilon >0\), which leads to

$$\begin{aligned} -\frac{\partial u^\epsilon }{\partial t}+Lu^\epsilon =-\frac{\partial u}{\partial t}+Lu+\epsilon -c\cdot \epsilon t\ge \epsilon -c\cdot \epsilon t>0. \end{aligned}$$

Moreover, if u achieves a positive maximum over \(\overline{E_T}\) at a point \((x^0,t^0)\in D_T\), then by continuity, \(u^\epsilon \) also achieves a positive maximum over \(\overline{E_T}\) at a point of \(D_T\), provided that \(\epsilon \) is small enough. However, as in the previous proof, we obtain a contradiction.

This completes the proof. \(\square \)

Remark 3.2

As in the first statement of Remark 2.3, we also conclude that in Assertion 1 of Theorem 3.1, if the strict inequality \(-\frac{\partial u}{\partial t}+Lu>0\) holds in \(D_T\), then u can either achieve its (finite) maximum only on \(\overline{E_T}\setminus D_T\) or be unbounded on \(\overline{E_T}\).

Remark 3.3

We cannot prove Assertion 2 of Theorem 3.1 in the same way as the corresponding assertion in Theorem 2.2. In fact, if we similarly introduce the set \(D_T^+:=\{(x,t)\in D_T\mid u(x,t)>0\}\), it is in general not of the form \(U\times (0,T]\) for some \(U\subset D\). Hence we cannot take advantage of the first assertion of Theorem 3.1. Consequently, the analogue of Assertion 2 of Remark 2.3, which relies on the proof of Assertion 2 in Theorem 2.2, cannot be established here.

Remark 3.4

From Theorem 2.2 and Remark 2.4, we already know that, for \(u\in C^{2,2}(D_T)\cap C(\overline{E_T})\), the supremum (or respectively, positive supremum) is achieved on \(\overline{E_T}\setminus (D_T)^\circ \). The alert reader may notice that we appear to be cheating here, as we should also verify that the kernel \(\nu \) still satisfies the third assumption in the definition of the elliptic Waldenfels operator (1) when it is regarded as a kernel in \(\mathbb {R}^{d+1}\). In fact, the modified kernel \({\hat{\nu }}((x,t),dzds):=\nu (t,x,dz)\delta _0(ds)\) does satisfy the moment condition (3), which is enough for us even though \({\hat{\nu }}\) is not supported inside \(\mathbb {R}^{d+1}\setminus \{0\}\). See the proof of Lemma 3.11 for details.

Moreover, if \(-\frac{\partial u}{\partial t}+Lu>0\) in \(D_T\), we see that the maximum (or respectively, positive maximum) cannot be achieved on the upper boundary \(D\times \{T\}\) by the same argument as in the proof of Theorem 3.1. Hence, it is clear that Theorem 3.1 holds for \(u\in C^{2,2}(D_T)\cap C(\overline{E_T})\), which is a natural consequence of Theorem 2.2.

However, the result for \(u\in C^{2,1}(D_T)\cap C(\overline{E_T})\) cannot be obtained in this way. We only need the first-order differentiability in t, benefiting from the form of the operator \(-\frac{\partial }{\partial t}+L\). This is evident in the different forms of \(u^\epsilon \) in (11) and (32).

Remark 3.5

There are two special cases of the weak maximum principle (Theorem 3.1) for the parabolic operator \(-\frac{\partial }{\partial t}+L\), namely \(E=\mathbb {R}^d\) or \(E=D\). Take the latter as an example. Let \(u\in C^{2,1}(D_T)\cap C(\overline{D_T})\), \(-\frac{\partial u}{\partial t}+Lu\ge 0\) in \(D_T\), and \(\text { supp}\,\nu (t,x,\cdot )\subset \overline{D}-x\) for each \((x,t)\in D_T\), where D is open and bounded but not necessarily connected.

  1. If \(c\equiv 0\) in \(D_T\), then

    $$\begin{aligned} \max _{\overline{D_T}}u=\max _{\Gamma _T}u. \end{aligned}$$

  2. If \(c\le 0\) in \(D_T\), then

    $$\begin{aligned} \max _{\overline{D_T}}u\le \max _{\Gamma _T}u^+. \end{aligned}$$

Here \(\Gamma _T\) is the parabolic boundary of \(D_T\), i.e., \(\Gamma _T:=\overline{D_T}\setminus D_T=(\partial D\times [0,T])\cup (D\times \{0\})\).

There are some consequences of the weak maximum principle for a parabolic Waldenfels operator. We only highlight the following results.

Corollary 3.6

Let D be an open and bounded set but not necessarily connected, and E be an open set satisfying \(D\subset E\). Assume that \(u\in C^{2,1}(D_T)\cap C(\overline{E_T})\), and \(\text { supp}\,\nu (t,x,\cdot )\subset {\overline{E}}-x\) for each \((x,t)\in D_T\).

  1. If \(c\equiv 0\) and \(-\frac{\partial u}{\partial t}+Lu\le 0\) both hold in \(D_T\), then

    $$\begin{aligned} \inf _{\overline{E_T}}u=\inf _{\overline{E_T}\setminus D_T}u. \end{aligned}$$

  2. If \(c\le 0\) and \(-\frac{\partial u}{\partial t}+Lu\le 0\) both hold in \(D_T\), then

    $$\begin{aligned} \inf _{\overline{E_T}}u\ge -\sup _{\overline{E_T}\setminus D_T}u^-. \end{aligned}$$

  3. If \(c\le 0\) and \(-\frac{\partial u}{\partial t}+Lu=0\) both hold in \(D_T\), then

    $$\begin{aligned} \sup _{\overline{E_T}}|u|=\sup _{\overline{E_T}\setminus D_T}|u|. \end{aligned}$$

Here the supremum and infimum may be infinity.

Corollary 3.7

Let D be an open and bounded set but not necessarily connected, and E be an open set satisfying \(D\subset E\). Assume that \(u,v\in C^{2,1}(D_T)\cap C(\overline{E_T})\) and \(\text { supp}\,\nu (t,x,\cdot )\subset {\overline{E}}-x\) for each \((x,t)\in D_T\). There is no sign condition on c.

  1. (Comparison Principle) If \(-\frac{\partial u}{\partial t}+Lu\le -\frac{\partial v}{\partial t}+Lv\) in \(D_T\) and \(u\ge v\) on \(\overline{E_T}\setminus D_T\), then \(u\ge v\) in \(\overline{E_T}\).

  2. (Uniqueness) If \(-\frac{\partial u}{\partial t}+Lu=-\frac{\partial v}{\partial t}+Lv\) in \(D_T\) and \(u=v\) on \(\overline{E_T}\setminus D_T\), then \(u=v\) in \(\overline{E_T}\).

Proof

In the case that \(c\le 0\) in \(D_T\), the two conclusions follow immediately by applying Corollary 3.6 to \(u-v\).

For the general case without any assumption on the sign of c, we only need to prove that if \(-\frac{\partial u}{\partial t}+Lu\le 0\) in \(D_T\) and \(u\ge 0\) on \(\overline{E_T}\setminus D_T\), then \(u\ge 0\) in \(\overline{E_T}\). Define \(u^\beta :=ue^{-\beta t}\). Then \(u\ge 0\) is equivalent to \(u^\beta \ge 0\). We calculate

$$\begin{aligned} -\frac{\partial u}{\partial t}+Lu=e^{\beta t}\Big (-\frac{\partial u^\beta }{\partial t}+Lu^\beta -\beta u^\beta \Big ). \end{aligned}$$
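This identity can be checked directly by writing \(u=e^{\beta t}u^\beta \). The factor \(e^{\beta t}\) is constant in x, so it commutes with both A and K, while the time derivative produces one extra term by the product rule:

$$\begin{aligned} \begin{aligned} \frac{\partial u}{\partial t}&= e^{\beta t}\Big (\frac{\partial u^\beta }{\partial t}+\beta u^\beta \Big ), \\ Lu&= L\big (e^{\beta t}u^\beta \big )=e^{\beta t}Lu^\beta . \end{aligned} \end{aligned}$$

Subtracting the first line from the second gives the displayed formula.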

Hence \(-\frac{\partial u}{\partial t}+Lu\le 0\) is equivalent to \(-\frac{\partial u^\beta }{\partial t}+(L-\beta )u^\beta \le 0\). Choosing \(\beta \) sufficiently large, we can ensure that \(c-\beta \), the zeroth-order coefficient of the operator \(L-\beta \), is nonpositive in \(D_T\). By the preceding statements, we know \(u^\beta \ge 0\) in \(\overline{E_T}\), or equivalently, \(u\ge 0\) in \(\overline{E_T}\). \(\square \)

Corollary 3.8

Let D be an open and bounded set but not necessarily connected, and E be an open set satisfying \(D\subset E\). Assume that \(u\in C^{2,1}(D_T)\cap C(\overline{E_T})\), \(-\frac{\partial u}{\partial t}+Lu=0\) in \(D_T\), and \(\text { supp}\,\nu (t,x,\cdot )\subset {\overline{E}}-x\) for each \((x,t)\in D_T\). If \(\max _{\overline{D_T}} c\le \beta \) for some constant \(\beta \ge 0\), then

$$\begin{aligned} \max _{\overline{E_T}}|u|\le e^{\beta T}\max _{\overline{E_T}\setminus D_T}|u|. \end{aligned}$$

Proof

We consider the function \(u^\beta :=ue^{-\beta t}\) as in the proof of the previous corollary. By the same argument we know that \(-\frac{\partial u}{\partial t}+Lu=0\) is equivalent to \(-\frac{\partial u^\beta }{\partial t}+(L-\beta )u^\beta =0\). The zeroth-order coefficient of the operator \(L-\beta \), i.e., \(c-\beta \), is nonpositive in \(D_T\). Therefore, Assertion 3 of Corollary 3.6 implies that

$$\begin{aligned} e^{-\beta T}\max _{\overline{E_T}}|u|\le \max _{\overline{E_T}}|u^\beta |= \max _{\overline{E_T}\setminus D_T}|u^\beta |\le \max _{\overline{E_T}\setminus D_T}|u|. \end{aligned}$$

Our result follows. \(\square \)

3.2 Strong maximum principle for parabolic case

We now turn to the strong maximum principle for the parabolic Waldenfels operator \(-\frac{\partial }{\partial t}+L\).

Theorem 3.9

(Strong maximum principle for parabolic Waldenfels operators) Let D be an open and connected set but not necessarily bounded, and E be an open set satisfying \(D\subset E\). Assume that \(u\in C^{2,2}(D_T)\cap C(\overline{E_T})\), \(-\frac{\partial u}{\partial t}+Lu\ge 0\) in \(D_T\), and \(\text { supp}\,\nu (t,x,\cdot )\subset {\overline{E}}-x\) for each \((x,t)\in D_T\). Moreover, assume that the mapping \((x,t)\rightarrow \nu (t,x,\cdot )\) is continuous in \(D_T\). If one of the following conditions holds:

  1. \(c\equiv 0\) in \(D_T\) and u achieves a (finite) maximum over \(\overline{E_T}\) at a point \((x^0,t^0)\in D_T\);

  2. \(c\le 0\) in \(D_T\) and u achieves a (finite) nonnegative maximum over \(\overline{E_T}\) at a point \((x^0,t^0)\in D_T\);

  3. u achieves a zero maximum over \(\overline{E_T}\) at a point \((x^0,t^0)\in D_T\),

then u is constant on \(\overline{D_{t^0}}\), where \(D_{t^0}=D\times (0,t^0]\).

The strong maximum principle for viscosity solutions of certain nonlinear nonlocal parabolic operators proved in [5] requires a “nondegeneracy” condition, which is crucial in that context. In contrast, our strong maximum principle for linear nonlocal parabolic operators in Theorem 3.9 does not need this or any similar condition.

The converse case that \(-\frac{\partial u}{\partial t}+Lu\le 0\) in \(D_T\) is immediate.

Corollary 3.10

Let D be an open and connected set but not necessarily bounded, and E be an open set satisfying \(D\subset E\). Assume that \(u\in C^{2,2}(D_T)\cap C(\overline{E_T})\), \(-\frac{\partial u}{\partial t}+Lu\le 0\) in \(D_T\), and \(\text { supp}\,\nu (t,x,\cdot )\subset {\overline{E}}-x\) for each \((x,t)\in D_T\). Moreover, assume that the mapping \((x,t)\rightarrow \nu (t,x,\cdot )\) is continuous in \(D_T\). If one of the following conditions holds:

  1. \(c\equiv 0\) in \(D_T\) and u achieves a (finite) minimum over \(\overline{E_T}\) at a point \((x^0,t^0)\in D_T\);

  2. \(c\le 0\) in \(D_T\) and u achieves a (finite) nonpositive minimum over \(\overline{E_T}\) at a point \((x^0,t^0)\in D_T\);

  3. u achieves a zero minimum over \(\overline{E_T}\) at a point \((x^0,t^0)\in D_T\),

then u is constant on \(\overline{D_{t^0}}\), where \(D_{t^0}=D\times (0,t^0]\).

To prove the strong maximum principle, we will consider the horizontal propagation of the maximizer point in space by arguments similar to those in the elliptic case, and further obtain the vertical propagation of the maximizer point locally in time by the weak maximum principle in the elliptic case.

Denote \(M:=\max _{\overline{E_T}}u<\infty \) for convenience. We work under the assumptions of Theorem 3.9, that is, \(u\in C^{2,2}(D_T)\cap C(\overline{E_T})\), \(-\frac{\partial u}{\partial t}+Lu\ge 0\) in \(D_T\), \(u(x^0,t^0)=M\) at a point \((x^0,t^0)\in D_T\), \(\text { supp}\,\nu (t,x,\cdot )\subset {\overline{E}}-x\) for each \((x,t)\in D_T\), and the mapping \((x,t)\rightarrow \nu (t,x,\cdot )\) is continuous in \(D_T\). Furthermore, one of the following assumptions holds:

Assumption 1

\(c\equiv 0\) in \(D_T\).

Assumption 2

\(c\le 0\) in \(D_T\) and \(M\ge 0\).

Assumption 3

\(M=0\).

The following lemma follows from Theorem 2.8 and Remark 2.14.

Lemma 3.11

Let \(B\subset {\mathbb {R}}^{d+1}\) be an open ball with \({\overline{B}}\subset D_T\). Assume that there exists a point \((x^0,t^0)\in \partial B\) such that \(u(x^0,t^0)=M\) and \(u(x,t)<M\) for each point \((x,t)\in B\). Then \(t^0\) is either the smallest or the largest value over all the time coordinates of points in \(\overline{B}\).

Proof

If \(t^0=T\), the assertion is trivial. Hence we assume \(t^0<T\). Equivalently, \((x^0,t^0)\) is an interior point of \(D_T\).

We regard the parabolic Waldenfels operator \(-\frac{\partial }{\partial t}+L\) as a degenerate elliptic Waldenfels operator by writing

$$\begin{aligned} \begin{aligned}&\Big (-\frac{\partial }{\partial t}+L\Big )u(x,t) \\&\quad = -\frac{\partial u}{\partial t}(x,t)+\text {tr}[{\mathbf {a}}^T(\nabla ^2 u)](x,t)+b^T\nabla u(x,t)+cu(x,t) \\&\qquad +\int _{\mathbb {R}^d\setminus \{0\}}\big [u(x+z,t)-u(x,t)-z^T\nabla u(x,t)\cdot {\mathbf {1}}_{\{|z|<1\}}\big ]\nu (t,x,dz) \\&\quad = \ \text {tr}\bigg [\begin{pmatrix}{\mathbf {a}} &{} 0 \\ 0 &{} 0\end{pmatrix}^T(\nabla ^2_{x,t}u)\bigg ](x,t)+ (b^T,-1)\cdot \nabla _{x,t}u(x,t)+cu(x,t) \\&\qquad +\int _{\mathbb {R}^{d+1}}\big [u(x+z,t+s)-u(x,t) \\&\qquad \qquad -(z^T,s)\cdot \nabla _{x,t}u(x,t)\cdot {\mathbf {1}}_{\{|z|^2+|s|^2\le 1\}}\big ]\nu (t,x,dz)\delta _0(ds). \end{aligned} \end{aligned}$$

Thus, we can replace the matrix \({\mathbf {a}}\) in the elliptic Waldenfels operator (5) by \({\hat{{\mathbf {a}}}}=\begin{pmatrix}{\mathbf {a}} &{} 0 \\ 0 &{} 0\end{pmatrix}\), vector b by \({\hat{b}}=\begin{pmatrix}b\\ -1\end{pmatrix}\), and the kernel \(\nu (x,dz)\) by \({\hat{\nu }}((x,t),dzds):=\nu (t,x,dz)\delta _0(ds)\).
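The degeneracy of \({\hat{{\mathbf {a}}}}\) can be read off from its quadratic form. As a short check, for a vector \((\xi ,\tau )\in {\mathbb {R}}^d\times {\mathbb {R}}\) (the symbols \(\xi \) and \(\tau \) are used here only for this computation), the uniform ellipticity of \({\mathbf {a}}\) gives

$$\begin{aligned} \begin{aligned} (\xi ^T,\tau )\begin{pmatrix}{\mathbf {a}} &{} 0 \\ 0 &{} 0\end{pmatrix}\begin{pmatrix}\xi \\ \tau \end{pmatrix}&= \xi ^T{\mathbf {a}}\xi \ge \gamma |\xi |^2, \\ \begin{pmatrix}{\mathbf {a}} &{} 0 \\ 0 &{} 0\end{pmatrix}\begin{pmatrix}\xi \\ \tau \end{pmatrix}&= \begin{pmatrix}{\mathbf {a}}\xi \\ 0\end{pmatrix}. \end{aligned} \end{aligned}$$

Hence \({\hat{{\mathbf {a}}}}\) is positive semidefinite but not positive definite, and \({\hat{{\mathbf {a}}}}(\xi ,\tau )^T=0\) holds exactly when \(\xi =0\), that is, for vectors pointing purely in the time direction.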

Now we verify that the kernel \({\hat{\nu }}\), defined on \({\mathbb {R}}^{d+1}\), satisfies the moment condition (3), although its support is not contained in \({\mathbb {R}}^{d+1}\setminus \{\mathbf {0}\}\). By recalling (29), condition (3) for \({\hat{\nu }}\) immediately follows as we see that

$$\begin{aligned} \begin{aligned}&\int _{{\mathbb {R}}^{d+1}}\big [1\wedge (|z|^2+|s|^2)\big ]{\hat{\nu }} (x,t,dzds) \\&\quad = \int _{{\mathbb {R}}^{d+1}}\big [1\wedge (|z|^2+|s|^2)\big ] \nu (t,x,dz)\delta _0(ds) \\&\quad = \int _{{\mathbb {R}}^d\setminus \{0\}} (1\wedge |z|^2)\nu (t,x,dz) \\&\quad < \ \infty . \end{aligned} \end{aligned}$$
(33)

Since \((x^0,t^0)\) is the maximizer point over \(\overline{E_T}\) of u, we may replace the set \(E_0\) in (19) by

$$\begin{aligned} {\hat{E_0}}:=\{(x,t)\in \overline{E}\times \{t^0\}\mid u(x,t)=M\}. \end{aligned}$$

Due to the fact that the support of measure \({\hat{\nu }}((x^0,t^0),\cdot )\) is contained in \(\overline{E}\times \{0\}\), we can further derive (21) for \({\hat{\nu }}\) and \({\hat{E_0}}\). It turns out that (24) and (25), (26) also hold in this situation.

Combining Remark 2.14 and the preceding arguments, we conclude that, as in Theorem 2.8, \(\frac{\partial u}{\partial {\mathbf {n}}}>0\) holds at \((x^0,t^0)\) in the case that the unit outer normal vector \({\mathbf {n}}\) to \(\partial B\) at \((x^0,t^0)\) is not in the nullspace of \({\hat{{\mathbf {a}}}}(x^0,t^0)\), which is exactly the space \(N:=\{(x,t)\mid x=0\}\). But this leads to a contradiction: since u attains a maximum at the interior point \((x^0,t^0)\), we have \(\frac{\partial u}{\partial {\mathbf {n}}}=0\). Therefore, \((x^0,t^0)\) must be a pole of the ball B, whose unit outer normal vector lies in N. This completes the proof. \(\square \)

The next lemma shows that for every \(t\in (0,T)\), we have either \(u(x,t)<M\) for all \(x\in D\) or \(u(x,t)=M\) for all \(x\in D\). This means that the non-maximizer point (or the maximizer point) may propagate horizontally in space. The proof can be found in [24]; we do not present it here, but its main points are given in Sect. 5.

Lemma 3.12

Assume that \(u(x^0,t^0)<M\) with \(x^0\in D\) and \(t^0\in (0,T)\). Then \(u(x,t^0)<M\) for every \(x\in D\).

Remark 3.13

In Lemma 3.12, we restrict the set D to be connected to make sure that the point \((x^1,t^0)\) can be chosen. More generally, if D is not connected, we may replace D in the previous proof with the connected component of D which contains the maximizer point. We thus conclude that for every fixed \(t\in (0,T)\), either \(u(x,t)<M\) or \(u(x,t)=M\) holds in each connected component of D. Then as in Remark 2.12, we see that the diffusion term gives rise to the horizontal propagation of maximizer point in the corresponding connected component.

As in Remark 2.9, or from [5, 7], we also see the horizontal propagation of the maximizer point by translation of the measure support. Namely, if \(u(x^0,t^0)=M\) with \(x^0\in D\) and \(t^0\in (0,T)\), then \(u\equiv M\) on the set \(\overline{\bigcup _{n=0}^\infty {\tilde{\varLambda }}_n}\), where the \({\tilde{\varLambda }}_n\)’s are defined by induction,

$$\begin{aligned} {\tilde{\varLambda }}_0 = \{x^0\}, \quad {\tilde{\varLambda }}_{n+1}= \bigcup _{x\in D\cap {\tilde{\varLambda }}_n}[\text { supp}\,\nu (t^0,x,\cdot )+x]. \end{aligned}$$

Thus in this scheme, the jump diffusion term leads to the horizontal propagation of maximizer point between those connected components, since jumps from one connected component to another might occur when measure supports overlap two or more connected components.

Finally, we present the last lemma we need. It states that the maximizer point may propagate vertically in time in a local sense. The proof can also be found in Sect. 5.

Lemma 3.14

Assume that \(u<M\) in \(D\times (t^0,t^1)\), with \(0\le t^0<t^1\le T\). Then \(u<M\) in \(D\times \{t^1\}\).

Finally we can prove Theorem 3.9.

Proof of Theorem 3.9

Set \(D_T^<:=\{(x,t)\in D_T\mid u(x,t)<M\}\). Then \(D_T^<\) is a relatively open subset of \(D_T\). From Lemma 3.12, we know that for each fixed \(t\in (0,T)\), either \(u(x,t)<M\) or \(u(x,t)=M\) holds for all \(x\in D\). Therefore, \(D_T^<\) must be of the form \(D\times I\), for some \(I\subset (0,T]\) relatively open in (0, T].

For fixed \(s\in I\) with \(s\ne T\), define \(\tau (s):=\sup \{t\mid (s,t)\subset I\}\), where the set in the supremum is never empty as I is relatively open in (0, T]. From Lemma 3.14, we see \(\tau (s)\in I\). Then \((r,\tau (s)]\) is the connected component of I containing s, for some \(r<s\). Consequently \(\tau (s)=T\), since I is relatively open. Thus for each \(s\in I\) we have \([s,T]\subset I\), which is trivial when \(s=T\). Hence, the relatively open set I has only two options: it is either empty or of the form \((s,T]\) for some \(s\in [0,T)\). Since \(u(x^0,t^0)=M\), or equivalently \(t^0\notin I\), we conclude that I is either empty or of the form \((s,T]\) for some \(s\in [t^0,T)\), as required. This finishes the proof of Theorem 3.9. \(\square \)

4 Examples

We give some examples in this section. They all concern symmetric \(\alpha \)-stable Lévy noise, which is not covered by Taira’s framework in [30], since the jump measure has unbounded support.

Example 4.1

(Mean exit time) Consider a stochastic system in \(\mathbb {R}^d\):

$$\begin{aligned} dX_t = b(X_t) dt + dW_t + dL_t^\alpha , \end{aligned}$$

where \(W_t\) is a standard Wiener process, and \(L_t^\alpha \) is a Lévy process with jump measure \(\nu _\alpha (dz) = c_{\alpha ,d} \frac{dz}{|z|^{d+\alpha }}\), for \(\alpha \in (0, 2)\) and \(c_{\alpha ,d}\) a positive constant depending on \(\alpha \) and d, together with zero drift and zero diffusion. The generator for this system is the following elliptic Waldenfels operator

$$\begin{aligned} \begin{aligned} Lu(x) =&\frac{1}{2}\Delta u(x)+b(x)\cdot \nabla u(x) \\&+\int _{\mathbb {R}^d\setminus \{0\}}\Big [u(x+z)-u(x)-\sum ^d_{j=1}z_j \frac{\partial u}{\partial x_j}(x){\mathbf {1}}_{\{|z|<1\}}\Big ]\nu _\alpha (dz). \end{aligned} \end{aligned}$$

Let D be a domain in \(\mathbb {R}^d\). The mean time for \(X_t\), starting at x, to first exit from D is denoted by \(\tau (x)\). By Dynkin's formula for such jump diffusion processes [3, 13], as shown in [8, 21, 25], we know that \(\tau \) satisfies the following equation,

$$\begin{aligned} {\left\{ \begin{array}{ll} L \tau = -1 &{}\text {in }D, \\ \tau = 0 &{}\text {in }D^c. \end{array}\right. } \end{aligned}$$

By the strong maximum principle (Theorem 2.8), or more precisely Corollary 2.11 in the special case \(E=\mathbb {R}^d\), we conclude that the mean exit time \(\tau \) cannot take the value zero inside D, unless it is constant (inside the domain D).
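The argument can be sketched as follows, assuming that the strong maximum principle is used in its usual form (if \(Lu\ge 0\) in D and u attains a nonnegative maximum over \(\mathbb {R}^d\) at a point of D, then u is constant in D) and that \(\tau \) is finite and regular enough for \(L\tau \) to make sense pointwise. Set \(u:=-\tau \). Since a mean exit time is nonnegative,

$$\begin{aligned} Lu = 1 \ge 0 \ \text {in } D, \qquad u\le 0 \ \text {in } \mathbb {R}^d, \qquad u = 0 \ \text {in } D^c. \end{aligned}$$

If \(\tau (x_0)=0\) for some \(x_0\in D\), then u attains its nonnegative maximum 0 at \(x_0\), so u is constant in D, that is,

$$\begin{aligned} u\equiv 0 \ \text {in } \mathbb {R}^d \quad \Longrightarrow \quad Lu \equiv 0 \ \text {in } D, \end{aligned}$$

which is incompatible with \(Lu=1\) in D. Under these assumptions the constant case is therefore excluded as well, and \(\tau >0\) throughout D.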

Example 4.2

(Escape probability) Similarly, let U be a subset of \(D^c\). The probability that \(X_t\), starting at x, first exits D by landing in the target set U is called the escape probability from D to U, denoted by p(x). As shown in [20, 23], the escape probability p satisfies the following equation:

$$\begin{aligned} {\left\{ \begin{array}{ll} L p = 0 &{}\text {in }D, \\ p = 1 &{}\text {in }U, \\ p = 0 &{}\text {in }D^c\setminus U. \end{array}\right. } \end{aligned}$$

By Theorem 2.8 and Corollary 2.11 with \(E=\mathbb {R}^d\), we conclude that p cannot take the value zero or one at any point inside D.
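To see why no exceptional constant case appears here, the following sketch may help. It assumes (these hypotheses are ours, for illustration) that D is open, that both U and \(D^c\setminus U\) have positive Lebesgue measure, and that p is regular enough for Lp to be evaluated pointwise. We write \(A-x:=\{y-x\mid y\in A\}\). Since \(0\le p\le 1\), a value one at an interior point is a nonnegative maximum of p, and a value zero is a nonnegative maximum of \(-p\); in either case the strong maximum principle would force p to be constant in D. But for \(x\in D\) the local terms of L vanish on constants, and

$$\begin{aligned} \begin{aligned} p\equiv 1 \ \text {in } D&\quad \Longrightarrow \quad Lp(x) = \int _{\mathbb {R}^d\setminus \{0\}}\big [p(x+z)-1\big ]\nu _\alpha (dz) = -\nu _\alpha \big ((D^c\setminus U)-x\big )<0, \\ p\equiv 0 \ \text {in } D&\quad \Longrightarrow \quad Lp(x) = \int _{\mathbb {R}^d\setminus \{0\}}p(x+z)\,\nu _\alpha (dz) = \nu _\alpha (U-x)>0. \end{aligned} \end{aligned}$$

Both contradict \(Lp=0\) in D. The strict inequalities use that \(\nu _\alpha \) has a strictly positive density on \(\mathbb {R}^d\setminus \{0\}\), which is precisely the unbounded-support feature of the \(\alpha \)-stable jump measure.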

Example 4.3

(Fokker–Planck equation) Consider a stochastic system in \(\mathbb {R}^d\):

$$\begin{aligned} \left\{ \begin{aligned}&dX_t = b(X_t) dt + dW_t + dL_t^\alpha , \\&X_0 =x^0, \end{aligned} \right. \end{aligned}$$
(34)

where \(W_t\) is a standard Wiener process and \(L_t^\alpha \) is a Lévy process with pure jump measure \(\nu _\alpha (dz) = c_{\alpha ,d}\frac{dz}{|z|^{d+\alpha }}\), for \(\alpha \in (0, 2)\) and \(c_{\alpha ,d}\) a positive constant depending on \(\alpha \) and d. The Fokker–Planck equation for the probability density of the solution, as shown in [8, 10], is

$$\begin{aligned} \begin{aligned} \frac{\partial p}{\partial t} =&\ \frac{1}{2} \Delta p - b(x) \cdot \nabla p - (\nabla \cdot b) p \\&+\int _{\mathbb {R}^d\setminus \{0\}}\Big [p(x+z,t)-p(x,t)-\sum ^d_{j=1}z_j\frac{\partial p}{\partial x_j}(x,t){\mathbf {1}}_{\{|z|<1\}}\Big ]\nu _\alpha (dz). \end{aligned} \end{aligned}$$
(35)

Let D be a domain in \(\mathbb {R}^d\). In this case, the coefficient of the zeroth-order term is \(c=-\nabla \cdot b\). We apply the strong maximum principle in Theorem 3.9 and Corollary 3.10 with \(E=\mathbb {R}^d\). If \(\nabla \cdot b\equiv 0\), that is, the deterministic vector field of the stochastic system (34) is divergence-free, then the probability density p cannot attain its maximum (or minimum) over \(\mathbb {R}^d\times [0,\infty )\) in \(D\times [0,\infty )\), unless it is constant at all times before the maximizer (or minimizer). Moreover, if \(\nabla \cdot b\ge 0\), then p cannot attain its maximum or a zero minimum over \(\mathbb {R}^d\times [0,\infty )\) in \(D\times [0,\infty )\) (note that p takes only nonnegative values), unless it is likewise constant at all times before this point.
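For completeness, the zeroth-order coefficient can be read off by expanding the drift term of (35) in divergence form; this is a routine computation, assuming b is continuously differentiable:

$$\begin{aligned} -\nabla \cdot (b\,p) = -\sum ^d_{j=1}\frac{\partial }{\partial x_j}\big (b_j\,p\big ) = -\sum ^d_{j=1}b_j\frac{\partial p}{\partial x_j} - \Big (\sum ^d_{j=1}\frac{\partial b_j}{\partial x_j}\Big )p = -b\cdot \nabla p-(\nabla \cdot b)\,p. \end{aligned}$$

Thus the first-order coefficient is \(-b\) and the zeroth-order coefficient is \(c=-\nabla \cdot b\). In particular, \(c\equiv 0\) when b is divergence-free, and \(c\le 0\) when \(\nabla \cdot b\ge 0\), which are the two sign conditions used above.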