1 Introduction

Consider a Hunt process X(t) on a state space D, with a possibly finite lifetime \(\zeta \). The Fleming–Viot system of \(K \geqslant 2\) particles is a collection of K processes \({\bar{X}}_k(t)\), which evolve as follows. The particles are independent copies of X(t) up to the random time \(\tau _1\) when one of the particles reaches its lifetime. Instead of dying, however, this particle immediately jumps to the position of a randomly chosen other particle. Then the particles again evolve as independent copies of X(t) until one of them reaches its lifetime, at a random time \(\tau _2\), and so the process continues.

Alternatively, one can think that a randomly chosen particle branches into two whenever some particle dies. For this reason, the random times \(\tau _n\) described above are often called branching times. One expects that, under reasonable assumptions, the branching times \(\tau _n\) diverge to infinity, and so the Fleming–Viot system is well-defined for every \(t \in (0, \infty )\). It is known, however, that in some cases the limit \(\tau _\infty = \lim _{n \rightarrow \infty } \tau _n\) can be finite with positive probability, and then there is no obvious way to extend the evolution of the Fleming–Viot system past \(\tau _\infty \). Therefore, we say that extinction, or branching explosion, occurs at \(\tau _\infty \) whenever it is finite.

The main result of this paper deals with just \(K = 2\) particles. Note that in this case at each branching time there is only one surviving particle, and so both particles occupy the same location at each branching time. Our key assumption is that the Hunt process X(t) is self-dual (or symmetric) with respect to a finite reference measure m. Here by self-duality we simply mean that the transition operators of X(t) are self-adjoint on \(L^2(m)\).

Theorem 1.1

Consider the Fleming–Viot system of two particles evolving according to a self-dual Hunt process on a state space D with a finite reference measure m. Then extinction never happens for almost every initial configuration of the two particles, with respect to either (a) the product measure \(m \times m\) on \(D \times D\), or (b) the measure on the diagonal of \(D \times D\) with marginals m.

Remark 1.2

The Fleming–Viot system of particles following the Brownian motion was introduced in [6, 7]. A non-extinction result was stated already in [7], but the proof given there has an error. This led to an open problem which has been resolved only for sufficiently regular domains.

More precisely, non-extinction was established rigorously in [15] under the assumption that the domain satisfies the interior and exterior cone conditions, and the interior cone has a sufficiently large aperture. The result of [15] also covers particles following a general diffusion with smooth coefficients. A very similar result was simultaneously proved in [2] for Lipschitz domains with a sufficiently small Lipschitz constant, as well as for systems of two particles in polyhedral domains.

Our Theorem 1.1 shows that no regularity is needed for Fleming–Viot systems of two Brownian particles, and that the Brownian motion can be replaced by an arbitrary self-dual Hunt process with a finite reference measure. The question for Fleming–Viot systems of more than two particles, however, remains open.

We mention here a related stream of research [1, 4, 5, 8, 21] on the spine (the path of the surviving particle) of the Fleming–Viot system. In particular, the results of the present paper are used in [5] to show that for Fleming–Viot systems of two particles following the Brownian motion in an interval, the spine does not coincide with the Brownian motion conditioned to live forever.

Remark 1.3

Self-duality is an essential assumption in Theorem 1.1. For example, if \(D = (0, 1)\) and X(t) is the uniform motion to the left (that is, \(X(t) = X(0) - t\) for t less than the lifetime \(\zeta = X(0)\)), then \(\tau _\infty = \max \{{\bar{X}}_1(0), \bar{X}_2(0)\}\) is always finite.
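This example is simple enough to work out by hand, but it can also be reproduced by a short event-driven sketch (the function below is our own illustration, not part of the paper):

```python
def fv_uniform_left(x1, x2, n_branches=8):
    """Branching times of the two-particle Fleming-Viot system driven by the
    deterministic motion X(t) = X(0) - t on (0, 1), killed at 0.  At each
    death the dying particle jumps to the current position of the other one.
    (Illustrative sketch; the function name and setup are ours.)"""
    p = [x1, x2]           # current positions
    t = 0.0                # elapsed time
    times = []             # branching times tau_1, tau_2, ...
    for _ in range(n_branches):
        dt = min(p)        # the lower particle reaches 0 after time dt
        t += dt
        p = [q - dt for q in p]
        i = p.index(min(p))        # the particle that just died
        p[i] = p[1 - i]            # it is reborn at the partner's position
        times.append(t)
    return times

times = fv_uniform_left(0.3, 0.8)
```

Starting from positions 0.3 and 0.8, the computed branching times are \(\tau _1 = 0.3\) and \(\tau _n = 0.8\) for every \(n \geqslant 2\) (up to floating-point error): after the first branching both particles coincide and march to the boundary together, so \(\tau _\infty = \max \{{\bar{X}}_1(0), {\bar{X}}_2(0)\}\) is finite.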

We expect that for many self-dual Hunt processes on state spaces D with an infinite reference measure m the assertion of Theorem 1.1 holds true, but the assumption that m is finite cannot be completely removed. In other words, if m is an infinite measure, then the system may become extinct in finite time with positive probability. For example, Theorem 1.1(a) in [3] asserts that Bessel processes of negative dimension lead to Fleming–Viot systems of two particles with \(\tau _\infty < \infty \) almost surely; see also Example 5.7 in [2]. We note that a Bessel process of dimension \(\nu \in \mathbb {R}\) is a self-dual Hunt process on \((0, \infty )\), but the reference measure \(r^{\nu - 1} dr\) has infinite mass near the origin when \(\nu \leqslant 0\).

Under mild additional assumptions, Theorem 1.1 holds for every initial configuration. This is illustrated by the following result, which clearly covers the case of Brownian particles.

Corollary 1.4

Consider the Fleming–Viot system of two particles evolving according to a self-dual Hunt process X(t) on a state space D with a finite reference measure m. Suppose that the one-dimensional distributions of X(t) are absolutely continuous with respect to m for every \(t > 0\) and every starting point \(X(0) \in D\). Then, the Fleming–Viot system of two particles never becomes extinct, regardless of the initial configuration of the particles.

Let \(G(x, dy)\) be the potential kernel of X(t):

$$\begin{aligned} G(x, A)&= \mathbb {E}^x \int _0^\zeta \mathbbm {1}_A(X_t) dt , \end{aligned}$$
(1.1)

and denote

$$\begin{aligned} \Vert G\Vert&= \iint \limits _{D \times D} G(x, dy) m(dx) . \end{aligned}$$
(1.2)

We say that the Hunt process X(t) is irreducible if for every Borel set \(A \subseteq D\) such that \(m(A) > 0\) we have \(G(x, A) > 0\) for almost every \(x \in D\). By Theorems 6 and 29 in [19], irreducibility is equivalent to the following property: for every \(t > 0\) and every Borel set \(A \subseteq D\) such that \(m(A) > 0\) we have \(\mathbb {P}^x(X(t) \in A) > 0\) for almost every \(x \in D\); for further equivalent definitions, we refer to [19]. In addition to Theorem 1.1, we prove the following result on ergodicity of Fleming–Viot systems of two particles.

Theorem 1.5

Consider the Fleming–Viot system of two particles \(({\bar{X}}_1(t), {\bar{X}}_2(t))\) evolving according to a self-dual Hunt process X(t) on a state space D with a finite reference measure m. If \(\Vert G\Vert \) is finite (or, more generally, if the measure \(G(x, dy) m(dx)\) is \(\sigma \)-finite), then \(G(x, dy) m(dx)\) is an invariant measure for \(({\bar{X}}_1(t), {\bar{X}}_2(t))\).

If \(\Vert G\Vert \) is finite and additionally X(t) is irreducible, then, for every nonnegative Borel function \(\varphi \) on \(D \times D\) and for almost every initial configuration of the two particles (with respect to either the product measure \(m \times m\) or the measure m on the diagonal of \(D \times D\), as in Theorem 1.1), the ergodic averages of \(\varphi ({\bar{X}}_1(t), {\bar{X}}_2(t))\) converge with probability one:

$$\begin{aligned} \lim _{T \rightarrow \infty } \frac{1}{T} \int _0^T \varphi ({\bar{X}}_1(t), \bar{X}_2(t)) dt&= \frac{1}{\Vert G\Vert } \iint \limits _{D \times D} \varphi (x, y) G(x, dy) m(dx). \end{aligned}$$
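The invariance claim of Theorem 1.5 can be sanity-checked numerically in a finite-state model, with a reversible killed Markov chain standing in for the self-dual Hunt process. All names, rates, and measures below are our own illustrative choices, not taken from the paper; this is a numerical sketch, not part of any proof.

```python
import numpy as np

# Toy self-dual Hunt process: a reversible killed Markov chain on 3 states.
m = np.array([0.2, 0.3, 0.5])                     # finite reference measure
B = np.array([[0., 1., 2.],
              [1., 0., 3.],
              [2., 3., 0.]])                      # symmetric jump intensities
k = np.array([0.7, 1.1, 0.4])                     # killing rates
Q = B / m[:, None]                                # m_i q_ij = m_j q_ji: self-duality
np.fill_diagonal(Q, 0.0)
Q -= np.diag(Q.sum(axis=1) + k)                   # sub-Markovian generator of X(t)

n = len(m)
G = np.linalg.inv(-Q)                             # potential kernel G(x, dy)

# Generator of the Fleming-Viot pair: independent motion on D x D, and when a
# particle is killed (rate k) the pair jumps to the partner's position.
Q_fv = np.kron(Q, np.eye(n)) + np.kron(np.eye(n), Q)
for x in range(n):
    for y in range(n):
        Q_fv[x * n + y, y * n + y] += k[x]        # X dies, is reborn at y
        Q_fv[x * n + y, x * n + x] += k[y]        # Y dies, is reborn at x

pi = (m[:, None] * G).reshape(-1)                 # pi(x, y) = G(x, {y}) m({x})
```

Here the identity \(\pi Q_{\mathrm {FV}} = 0\) (which holds up to floating-point error) is the finite-state analogue of the statement that \(G(x, dy) m(dx)\) is invariant for the pair.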

The above theorem is closely related to the following result on the embedded Markov chain of branching positions.

Corollary 1.6

Consider the Fleming–Viot system of two particles evolving according to a self-dual Hunt process X(t) on a state space D with a finite reference measure m, and assume that the branching times are finite with probability one. Let \(Z_n\) denote the position of the surviving particle at the nth branching time \(\tau _n\). Then \(Z_n\) is a conservative Markov chain, and m is a stationary measure for \(Z_n\).
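The stationarity of m for the embedded chain \(Z_n\) can also be checked numerically in a toy model, with a reversible killed Markov chain standing in for the self-dual Hunt process (all rates and measures below are our own illustrative choices, not from the paper):

```python
import numpy as np

# Toy self-dual Hunt process: a reversible killed Markov chain on 3 states.
m = np.array([0.2, 0.3, 0.5])                     # finite reference measure
B = np.array([[0., 1., 2.],
              [1., 0., 3.],
              [2., 3., 0.]])                      # symmetric jump intensities
k = np.array([0.7, 1.1, 0.4])                     # killing rates (zeta finite a.s.)
Q = B / m[:, None]                                # reversibility w.r.t. m
np.fill_diagonal(Q, 0.0)
Q -= np.diag(Q.sum(axis=1) + k)                   # sub-Markovian generator of X(t)

n = len(m)
Q2 = np.kron(Q, np.eye(n)) + np.kron(np.eye(n), Q)
G2 = np.linalg.inv(-Q2)                           # potential kernel of the pair

# Transition matrix of Z_n: start both particles at x, run the independent
# pair until the first death; Z_{n+1} is the position of the survivor.
T = np.zeros((n, n))
for x in range(n):
    occ = G2[x * n + x].reshape(n, n)             # occupation times from (x, x)
    for z in range(n):
        T[x, z] = occ[z, :] @ k + occ[:, z] @ k   # Y dies first + X dies first
```

The rows of T sum to one (the chain \(Z_n\) is conservative) and \(m T = m\) holds up to floating-point error, matching Corollary 1.6 in this toy setting.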

Remark 1.7

Let us discuss the assumptions in Theorem 1.5 and Corollary 1.6. By definition, \(\Vert G\Vert = \int _D \mathbb {E}^x \zeta \, m(dx)\). Therefore, if \(\Vert G\Vert \) is finite, then \(\mathbb {P}^x(\zeta < \infty ) = 1\) for almost every \(x \in D\), and hence the branching times \(\tau _n\) are finite with probability one for almost every initial configuration (with respect to both the product measure \(m \times m\) and the measure on the diagonal with marginals m). The converse implication need not be true, but if \(\mathbb {P}^x(\zeta < \infty ) = 1\) for almost every \(x \in D\), then \(G(x, dy) m(dx)\) is a \(\sigma \)-finite measure; see, for example, items (iv) and (v) of Proposition (2.2) in [12].

Let us define \(D_0 = \{x \in D: \mathbb {P}^x(\zeta < \infty ) = 1\}\). As kindly pointed out by the referee, for a self-dual Hunt process X(t) on a state space D with a finite reference measure m, it can be proved that \(D_0\) is an invariant set, and \(\mathbb {P}^x(\zeta = \infty ) = 1\) for almost every \(x \in D {\setminus } D_0\) (see Remark 3.3). From the point of view of Fleming–Viot systems of particles, we can safely ignore \(D {\setminus } D_0\), or, in other words, we may restrict our attention to the case when \(\mathbb {P}^x(\zeta < \infty ) = 1\) for almost every \(x \in D\). Thus, the assumptions in the first part of Theorem 1.5 and in Corollary 1.6 are quite natural and not restrictive. We refer to [13] for further discussion.

The remaining part of the paper is divided into four sections. In Sect. 2 we illustrate the main idea of the proof when X(t) is the standard Brownian motion in a smooth domain in \(\mathbb {R}^d\). In Sect. 3 we consider two independent copies of the underlying Hunt process X(t). This part contains two auxiliary lemmas, which describe the location of the surviving particle when the other one dies. Section 4 contains the proof of Theorem 1.1, while in Sect. 5 we prove Theorem 1.5.

2 Idea of the proof

Let us consider a Fleming–Viot system of two particles, that from now on we denote by \(({\bar{X}}(t), {\bar{Y}}(t))\) rather than \((\bar{X}_1(t), {\bar{X}}_2(t))\). In this section we assume that \({\bar{X}}(t)\) and \({\bar{Y}}(t)\) evolve as independent Brownian motions in a smooth, bounded Euclidean domain D, and whenever one of the particles hits the boundary, it immediately jumps to the position of the other particle. The key idea of our proof lies in the following observation. Its refined variant, Lemma 3.2, is proved rigorously in Sect. 3.

Proposition 2.1

If the Fleming–Viot pair of particles is started at a random point distributed uniformly over the diagonal of \(D \times D\), then at the first branching time the particles are again distributed uniformly on the diagonal of \(D \times D\).

With the above result at hand, the remaining part of the proof of Theorem 1.1 is fairly simple. The sequence \((Z_n, \sigma _n)\) of branching locations \(Z_n = {\bar{X}}(\tau _n) = \bar{Y}(\tau _n)\) and gaps between branching times \(\sigma _n = \tau _n - \tau _{n - 1}\) forms a stationary Markov chain, and hence, by the ergodic theorem, \(\tau _n / n\) converges to a positive limit with probability one. In particular, \(\tau _\infty = \lim _{n \rightarrow \infty } \tau _n\) is necessarily infinite, that is, there is no extinction.

We remark that our proof of Theorem 1.1 in Sect. 4 uses the full statement of Lemma 3.2 rather than just Proposition 2.1, and it is in fact even simpler than the argument sketched above.

Our proof of Proposition 2.1 (or Lemma 3.2) in the next section uses relatively standard, but rather abstract tools from the theory of Markov processes. For this reason, we think it will be instructive to discuss first a more direct approach to Proposition 2.1 for Brownian particles in a smooth domain D. While the two arguments (the one that follows and the actual proof in Sect. 3) may appear quite different, in fact both of them follow the same line: we identify the occupation density of the bivariate process with the Green function of D (using an analytic approach in Step 1 below, and a probabilistic reasoning in Lemma 3.1), and then we determine the distribution at the first branching time (expressing it in terms of the Poisson kernel of D in Step 2 below, and using resolvent techniques in Lemma 3.2).

Proof

(Sketch of the proof of Proposition 2.1) We divide the argument into three steps.

Step 1. Let \(G_D(x, y)\) denote the usual Green function in D. That is, we have \(G_D(x, y) = G_D(y, x)\), for every \(y \in D\) the function \(x \mapsto G_D(x, y)\) vanishes continuously on the boundary of D, and we have

$$\begin{aligned} \Delta _x G_D(x, y)&= -\delta _y(dx) \end{aligned}$$

in the sense of distributions. Here \(\Delta _x\) stands for the Laplace operator in \(\mathbb {R}^d\) acting on the variable x, and \(\delta _y\) is the Dirac delta at y. By symmetry,

$$\begin{aligned} \Delta _y G_D(x, y)&= -\delta _x(dy) . \end{aligned}$$

It follows that, as a bivariate function, the Green function satisfies

$$\begin{aligned} \Delta _{x_1, x_2} G_D(x_1, x_2)&= -2 \delta (dx_1, dx_2) . \end{aligned}$$

Here and below we denote by \(\delta \) the uniform measure on the diagonal of \(D \times D\):

$$\begin{aligned} \delta (dx, dy)&= \delta _x(dy) dx = \delta _y(dx) dy . \end{aligned}$$

Since \(G_D(x_1, x_2)\) converges to zero on the boundary of \(D \times D\), Green’s third identity implies that

$$\begin{aligned} G_D(x_1, x_2)&= 2 \iint \limits _{D \times D} G_{D \times D}((x_1, x_2), (y_1, y_2)) \delta (dy_1, dy_2) \\&= 2 \int _D G_{D \times D}((x_1, x_2), (y, y)) dy . \end{aligned}$$

Here \(G_{D \times D}((x_1, x_2), (y_1, y_2))\) denotes the Green function in \(D \times D\).

Step 2. Consider the 2d-dimensional Brownian motion (X(t), Y(t)) in \(D \times D\), absorbed at the boundary, and denote by \(\tau \) the hitting time of the boundary. By Kakutani’s formula for the solution of the Dirichlet problem, if \((X(0), Y(0)) = (x_1, x_2)\), then the density function of the distribution of \((X(\tau ), Y(\tau ))\) on the boundary of \(D \times D\) is the Poisson kernel \(P_{D \times D}((x_1, x_2), (z_1, z_2))\). Recall that the Poisson kernel is the inward normal derivative of the Green function (with respect to the second variable) on the boundary \(\partial (D \times D) = (\partial D \times D) \cup (D \times \partial D)\):

$$\begin{aligned} P_{D \times D}((x_1, x_2), (z_1, z_2))&= {\left\{ \begin{array}{ll} \dfrac{\partial }{\partial \nu } \biggr |_{z_1} G_{D \times D}((x_1, x_2), (\cdot , z_2)) &{} \text {for } (z_1, z_2) \in \partial D \times D, \\ \dfrac{\partial }{\partial \nu } \biggr |_{z_2} G_{D \times D}((x_1, x_2), (z_1, \cdot )) &{} \text {for } (z_1, z_2) \in D \times \partial D, \end{array}\right. } \end{aligned}$$

Here \(\tfrac{\partial }{\partial \nu }|_z\) denotes the inward normal derivative on the boundary of D evaluated at a point \(z \in \partial D\).

It follows that if the initial position (X(0), Y(0)) is uniformly distributed over the diagonal of \(D \times D\), then the density function of the distribution of \((X(\tau ), Y(\tau ))\) is given by

$$\begin{aligned} P(z_1, z_2)&= \frac{1}{|D|} \int _D P_{D \times D}((x, x), (z_1, z_2)) dx , \end{aligned}$$

and hence it is the inward normal derivative on the boundary of \(D \times D\) of

$$\begin{aligned} \frac{1}{|D|} \int _D G_{D \times D}((x, x), (y_1, y_2)) dx&= \frac{1}{2 |D|} \, G_D(y_1, y_2) . \end{aligned}$$

In other words,

$$\begin{aligned} P(z_1, z_2)&= {\left\{ \begin{array}{ll} \dfrac{\partial }{\partial \nu } \biggr |_{z_1} \dfrac{G_D(\cdot , z_2)}{2 |D|} &{} \text {for } (z_1, z_2) \in \partial D \times D, \\ \dfrac{\partial }{\partial \nu } \biggr |_{z_2} \dfrac{G_D(z_1, \cdot )}{2 |D|} &{} \text {for } (z_1, z_2) \in D \times \partial D. \end{array}\right. } \end{aligned}$$

But the inward normal derivative (with respect to y) of the Green function in D, \(G_D(x, y)\), is the Poisson kernel in D, \(P_D(x, z)\). Thus,

$$\begin{aligned} P(z_1, z_2)&= {\left\{ \begin{array}{ll} \dfrac{P_D(z_1, z_2)}{2 |D|} &{} \text {for } (z_1, z_2) \in D \times \partial D, \\ \dfrac{P_D(z_2, z_1)}{2 |D|} &{} \text {for } (z_1, z_2) \in \partial D \times D. \end{array}\right. } \end{aligned}$$

Define \(Z = X(\tau )\) if \(Y(\tau ) \in \partial D\), and \(Z = Y(\tau )\) if \(X(\tau ) \in \partial D\). What we have found above implies that for every Borel set \(A \subseteq D\),

$$\begin{aligned} \mathbb {P}(Z \in A)&= \mathbb {P}(X(\tau ) \in A , \, Y(\tau ) \in \partial D) + \mathbb {P}(X(\tau ) \in \partial D , \, Y(\tau ) \in A) \\&= \iint \limits _{A \times \partial D} P(z_1, z_2) dz_1 dz_2 + \iint \limits _{\partial D \times A} P(z_1, z_2) dz_1 dz_2 \\&= \frac{1}{2 |D|} \iint \limits _{A \times \partial D} P_D(z_1, z_2) dz_1 dz_2 + \frac{1}{2 |D|} \iint \limits _{\partial D \times A} P_D(z_2, z_1) dz_1 dz_2 . \end{aligned}$$

However, the integral of \(P_D(x, y)\) over \(y \in \partial D\) is equal to one, and so

$$\begin{aligned} \mathbb {P}(Z \in A)&= \frac{1}{2 |D|} \int _A 1 dz_1 + \frac{1}{2 |D|} \int _A 1 dz_2 = \frac{|A|}{|D|} \, , \end{aligned}$$

that is, Z is uniformly distributed over D.

Step 3. Let us now consider the Fleming–Viot pair \(\bar{X}(t)\) and \({\bar{Y}}(t)\) evolving as independent Brownian motions in D, but whenever either particle hits the boundary of D, it is immediately resurrected at the location of the other particle. We suppose that \({\bar{X}}(0) = {\bar{Y}}(0)\) is uniformly distributed over D, and we claim that in this case also the first branching location \(Z_1 = {\bar{X}}(\tau _1) = {\bar{Y}}(\tau _1)\) is uniformly distributed over D.

Observe that up to time \(\tau _1\), \(({\bar{X}}(t), {\bar{Y}}(t))\) is the Brownian motion in \(D \times D\), and \(Z_1 = {\bar{X}}(\tau _1) = \bar{Y}(\tau _1)\) is equal to the left limit \({\bar{X}}(\tau _1-) = \bar{X}(\tau _1)\) if \({\bar{Y}}(\tau _1-) \in \partial D\), and to the left limit \({\bar{Y}}(\tau _1-) = {\bar{Y}}(\tau _1)\) if \({\bar{X}}(\tau _1-) \in \partial D\). Therefore, \(Z_1\) is the same random variable as the variable Z studied in the previous step. This completes the proof: we have already shown that Z is uniformly distributed over D. \(\square \)

3 Where were you

In this section we prove an auxiliary result, which describes the distribution of the position of the Hunt process X(t) at the lifetime of its independent copy. This turns out to be the key ingredient of the proof of Theorem 1.1 in the next section.

Suppose that X(t) is a self-dual Hunt process with state space \(D \cup \{\partial \}\), where \(\partial \) is a cemetery point, and a finite reference measure m. By \({\mathscr {F}}_t\) we denote the natural filtration of X(t). For notational convenience, with no loss of generality we assume that m is a probability measure. As is customary, we adopt the convention that \(X(\infty ) = \partial \), and that if f is a function on D, then we automatically extend f to \(D \cup \{\partial \}\) so that \(f(\partial ) = 0\). We denote by \(\mathbbm {1}\) the constant one on D; note, however, that \(\mathbbm {1}(\partial ) = 0\). We also write \(\langle f, g \rangle \) for the inner product of \(f, g \in L^2(m)\). Finally, we denote by \(\delta \) the measure concentrated on the diagonal of \(D \times D\) with marginals m,

$$\begin{aligned} \delta (dx, dy)&= \delta _x(dy) m(dx) = \delta _y(dx) m(dy) . \end{aligned}$$

We write \(\mathbb {P}^x\) for the probability corresponding to the process X(t) started at \(X(0) = x\). We denote by \(\zeta \) the lifetime of X(t), by

$$\begin{aligned} p_t(x, A)&= \mathbb {P}^x(X(t) \in A) \end{aligned}$$

the transition kernel of X(t), and by

$$\begin{aligned} P_t f(x)&= \mathbb {E}^x f(X(t)) = \int _D f(y) p_t(x, dy) \end{aligned}$$

the transition operators \(P_t\) of the process X(t), defined whenever the expectation or the integral is well-defined. Then the operators \(P_t\) form a strongly continuous semigroup of self-adjoint contractions on \(L^2(D, m)\). We also define the resolvent kernel

$$\begin{aligned} u_\lambda (x, A)&= \int _0^\infty e^{-\lambda t} p_t(x, A) dt = \mathbb {E}^x \int _0^\zeta e^{-\lambda t} \mathbbm {1}_A(X(t)) dt , \end{aligned}$$

and the resolvent operators

$$\begin{aligned} U_\lambda f(x)&= \int _0^\infty e^{-\lambda t} P_t f(x) dt = \mathbb {E}^x \int _0^\zeta e^{-\lambda t} f(X(t)) dt , \end{aligned}$$

defined whenever \(\lambda \geqslant 0\) and the double integral on the right-hand side is well-defined. Note that for \(\lambda = 0\) we recover the potential kernel \(G(x, dy) = u_0(x, dy)\) discussed in the introduction. For further information about Hunt processes and their transition and resolvent operators, we refer to [9].

Suppose that Y(t) is an independent copy of X(t). Let us write \(\mathbb {P}^{x, y}\) for the probability corresponding to processes X(t) and Y(t) started at \(X(0) = x\) and \(Y(0) = y\), and \(\mathbb {P}^\delta \) for the probability corresponding to processes X(t) and Y(t) started at a random point \(X(0) = Y(0)\) with distribution m:

$$\begin{aligned} \mathbb {E}^\delta \varphi (X(t), Y(t))&= \int _D \mathbb {E}^{x, x} \varphi (X(t), Y(t)) m(dx) . \end{aligned}$$

We stress that under \(\mathbb {P}^\delta \), the processes X(t) and Y(t) are not independent. Denote by \(\zeta ^X\), \(\zeta ^Y\) the lifetimes of X(t) and Y(t), respectively. We view the bivariate process (X(t), Y(t)) as a Hunt process on state space \((D \cup \{\partial \}) \times (D \cup \{\partial \})\), with lifetime \(\max \{\zeta ^X, \zeta ^Y\}\), and cemetery state \((\partial , \partial )\). Clearly, for all Borel sets \(A, B \subseteq D \cup \{\partial \}\), all Borel functions f, g on \(D \cup \{\partial \}\), and all \(x, y \in D\),

$$\begin{aligned} \mathbb {P}^{x, y} \bigl (X(t) \in A, \, Y(t) \in B\bigr ) = p_t(x, A) p_t(y, B) , \\ \mathbb {E}^{x, y} \bigl (f(X(t)) g(Y(t))\bigr ) = P_t f(x) P_t g(y) . \end{aligned}$$

Here we abuse the notation by setting \(p_t(x, \{\partial \}) = 1 - p_t(x, D)\).

Symmetry (or self-duality) of X(t) allows us to link the resolvent kernels of the bivariate process (X(t), Y(t)) and the original process X(t).

Lemma 3.1

For bounded Borel functions f and g and \(\lambda > 0\), or for nonnegative Borel functions f and g and \(\lambda \geqslant 0\), we have

$$\begin{aligned} \mathbb {E}^\delta \int _0^\infty e^{-2 \lambda t} f(X(t)) g(Y(t)) dt&= \frac{1}{2} \, \langle f, U_\lambda g \rangle . \end{aligned}$$
(3.1)

Note that since we agreed that \(f(\partial ) = g(\partial ) = 0\), we have \(f(X(t)) g(Y(t)) = 0\) when \(t \geqslant \min \{\zeta ^X, \zeta ^Y\}\). Thus, the integral in (3.1) is effectively over \(t \in [0, \min \{\zeta ^X, \zeta ^Y\})\).

Proof

Clearly,

$$\begin{aligned} \mathbb {E}^\delta \bigl (f(X(t)) g(Y(t))\bigr )&= \int _D \mathbb {E}^{x, x} \bigl (f(X(t)) g(Y(t))\bigr ) m(dx) = \int _D P_t f(x) P_t g(x) m(dx) \\&= \langle P_t f, P_t g \rangle . \end{aligned}$$

Using the fact that \(P_t\) is self-adjoint and the semigroup property, we find that

$$\begin{aligned} \mathbb {E}^\delta \bigl (f(X(t)) g(Y(t))\bigr )&= \langle f, P_t P_t g \rangle = \langle f, P_{2 t} g \rangle . \end{aligned}$$
(3.2)

Thus, by Fubini’s theorem,

$$\begin{aligned} \mathbb {E}^\delta \biggl (\int _0^\infty e^{-2 \lambda t} f(X(t)) g(Y(t)) dt\biggr )&\!=\! \int _0^\infty \! \langle f, e^{-2 \lambda t} P_{2 t} g \rangle dt \!=\! \biggl \langle f, \int _0^\infty e^{-2 \lambda t} P_{2 t} g dt \biggr \rangle \\&= \langle f, \tfrac{1}{2} U_\lambda g \rangle , \end{aligned}$$

as desired. \(\square \)
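Lemma 3.1 can be verified numerically in a finite-state model, where the resolvents are plain matrix inverses. The chain below is our own toy example (a reversible killed Markov chain, hence a self-dual Hunt process with a finite reference measure); it is an illustration of (3.1), not part of the proof.

```python
import numpy as np

# Toy self-dual Hunt process: a reversible killed Markov chain on 3 states
# (all numbers are our own illustrative choices, not from the paper).
m = np.array([0.2, 0.3, 0.5])                     # finite reference measure
B = np.array([[0., 1., 2.],
              [1., 0., 3.],
              [2., 3., 0.]])                      # symmetric jump intensities
k = np.array([0.7, 1.1, 0.4])                     # killing rates
Q = B / m[:, None]                                # reversibility w.r.t. m
np.fill_diagonal(Q, 0.0)
Q -= np.diag(Q.sum(axis=1) + k)                   # generator of the killed chain

n = len(m)
f = np.array([1.0, -2.0, 0.5])                    # arbitrary test functions
g = np.array([0.3, 4.0, -1.0])
lam = 0.9

# Left-hand side of (3.1): diagonal initial law with marginals m, and the
# resolvent of the independent bivariate chain, evaluated at 2 * lambda.
start = np.zeros(n * n)
for x in range(n):
    start[x * n + x] = m[x]
Q2 = np.kron(Q, np.eye(n)) + np.kron(np.eye(n), Q)
lhs = start @ np.linalg.inv(2 * lam * np.eye(n * n) - Q2) @ np.kron(f, g)

# Right-hand side of (3.1): (1/2) <f, U_lambda g> in L^2(m).
rhs = 0.5 * f @ np.diag(m) @ np.linalg.inv(lam * np.eye(n) - Q) @ g
```

The two sides agree up to floating-point error, reflecting the factor \(\tfrac{1}{2}\) produced by the time change \(2t \mapsto t\) in the proof.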

Our next result is a refined version of Proposition 2.1. Before we state it, we recall that if \(\zeta ^X \leqslant \zeta ^Y\), then we understand that \(X(\zeta ^Y) = \partial \) and so \(f(X(\zeta ^Y)) = 0\). We also agree that \(e^{-\infty } = 0\).

Lemma 3.2

If f is a bounded Borel function and \(\lambda > 0\), then

$$\begin{aligned} \mathbb {E}^\delta \bigl (e^{-2 \lambda \zeta ^Y} f(X(\zeta ^Y))\bigr )&= \frac{1}{2} \, \mathbb {E}^\delta \bigl (e^{-\lambda \zeta ^X} f(X(0))\bigr ) = \frac{1}{2} \int _D f(x) \mathbb {E}^x e^{-\lambda \zeta ^X} m(dx) . \end{aligned}$$
(3.3)

When \(\lambda = 0\) and \(\mathbb {P}^x(\zeta ^X < \infty ) = 1\) for almost all \(x \in D\), we have

$$\begin{aligned} \mathbb {E}^\delta f(X(\zeta ^Y))&= \frac{1}{2} \int _D f(x) m(dx) . \end{aligned}$$
(3.4)

Remark 3.3

Self-duality of X(t) and finiteness of m imply that \(\mathbb {P}^x(\zeta ^X < \infty ) \in \{0, 1\}\) for almost every \(x \in D\). Indeed: if \(f(x) = \mathbb {P}^x(\zeta ^X = \infty ) = \lim _{t \rightarrow \infty } P_t \mathbbm {1}(x)\), then, arguing as in (3.2), we have

$$\begin{aligned} \int _D (f(x))^2 m(dx)&= \lim _{t \rightarrow \infty } \int _D (P_t \mathbbm {1}(x))^2 m(dx) = \lim _{t \rightarrow \infty } \langle P_t \mathbbm {1}, P_t \mathbbm {1}\rangle = \lim _{t \rightarrow \infty } \langle \mathbbm {1}, P_{2 t} \mathbbm {1}\rangle \\&= \lim _{t \rightarrow \infty } \int _D P_{2 t} \mathbbm {1}(x) m(dx) = \int _D f(x) m(dx) . \end{aligned}$$

Therefore, \(f(x) - (f(x))^2\) is a nonnegative function with integral zero. This is only possible if \(f(x) \in \{0, 1\}\) for almost every \(x \in D\), as claimed. Additionally, \(f = P_t f\) for every \(t > 0\), and hence the set of \(x \in D\) such that \(\mathbb {P}^x(\zeta ^X < \infty ) = 1\) is an invariant set. Thus, if X(t) is irreducible, then either X(t) is conservative (in the sense that \(\mathbb {P}^x(\zeta ^X = \infty ) = 1\) for almost every \(x \in D\)) or \(\mathbb {P}^x(\zeta ^X < \infty ) = 1\) for almost every \(x \in D\). We refer to [19] for a closely related discussion.

Remark 3.4

One can prove that X(t) has an exit law \(\ell _t(x)\) such that if \(s, t > 0\), then \(P_s \ell _t(x) = \ell _{t + s}(x)\) for almost every \(x \in D\), and \(\mathbb {P}^x(\zeta ^X \in dt) = \ell _t(x) dt\) on \((0, \infty )\) for almost every \(x \in D\). Thus, using independence of X(t) and \(\zeta ^Y\), we have

$$\begin{aligned} \mathbb {E}^\delta \bigl (e^{-2 \lambda \zeta ^Y} f(X(\zeta ^Y))\bigr )&= \int _D \mathbb {E}^{x, x} \bigl (e^{-2 \lambda \zeta ^Y} f(X(\zeta ^Y))\bigr ) m(dx) \\&= \int _D \int _0^\infty e^{-2 \lambda t} \mathbb {E}^x \bigl (f(X(t))\bigr ) \ell _t(x) dt \, m(dx) \\&= \int _D \int _0^\infty e^{-2 \lambda t} P_t f(x) \ell _t(x) dt \, m(dx) \\&= \int _0^\infty e^{-2 \lambda t} \langle P_t f, \ell _t\rangle dt . \end{aligned}$$

Since \(P_t\) is a self-adjoint operator,

$$\begin{aligned} \mathbb {E}^\delta \bigl (e^{-2 \lambda \zeta ^Y} f(X(\zeta ^Y))\bigr )&=\int _0^\infty e^{-2 \lambda t} \langle f, P_t \ell _t\rangle dt = \int _0^\infty e^{-2 \lambda t} \langle f, \ell _{2 t}\rangle dt \\&= \frac{1}{2} \int _0^\infty e^{-\lambda s} \langle f, \ell _s\rangle ds . \end{aligned}$$

Undoing the initial steps, we conclude that

$$\begin{aligned} \mathbb {E}^\delta \bigl (e^{-2 \lambda \zeta ^Y} f(X(\zeta ^Y))\bigr )&= \frac{1}{2} \int _D \int _0^\infty e^{-\lambda s} f(x) \ell _s(x) ds m(dx) \\&= \frac{1}{2} \int _D f(x) \mathbb {E}^x e^{-\lambda \zeta ^X} m(dx) , \end{aligned}$$

and Lemma 3.2 follows. The above argument, kindly suggested by the referee, is very elegant and insightful, but it depends on the existence of the exit law \(\ell _t(x)\). This fact can be proved using Proposition (3.7) in [10] and an appropriate approximation argument, similar to the one given below; see also [13]. With some effort, one can also extend Lemma 3.2 to the case of a \(\sigma \)-finite reference measure m. However, we choose to restrict our attention to finite reference measures m, and we give a slightly longer, but more elementary proof.

Proof

The idea of the proof is to apply Lemma 3.1 to f and \(L \mathbbm {1}\), where L denotes the generator of the process X(t). However, \(\mathbbm {1}\) typically fails to be in the domain of L. In order to circumvent this difficulty, we use resolvent techniques. More precisely, we approximate the (non-existent) function \(L \mathbbm {1}\) by \(\mu \mathbbm {1}- \mu ^2 U_\mu \mathbbm {1}\), where \(\mu \rightarrow \infty \).

Let \(\mu> \lambda > 0\), and let \(g = \mu \mathbbm {1}- \mu ^2 U_\mu \mathbbm {1}\). Our goal is to apply Lemma 3.1 to f and g, simplify both sides of (3.1), and pass to the limit as \(\mu \rightarrow \infty \). We divide the proof into three steps.

Step 1. By the resolvent equation,

$$\begin{aligned} U_\lambda g&= \mu U_\lambda \mathbbm {1}- \mu ^2 U_\lambda U_\mu \mathbbm {1}= \mu U_\lambda \mathbbm {1}+ \frac{\mu ^2}{\mu - \lambda } \, (U_\mu \mathbbm {1}- U_\lambda \mathbbm {1}) \\&= \frac{\mu ^2}{\mu - \lambda } \, U_\mu \mathbbm {1}- \frac{\mu \lambda }{\mu - \lambda } \, U_\lambda \mathbbm {1}. \end{aligned}$$

As \(\mu \rightarrow \infty \), we have \(\mu U_\mu \mathbbm {1}\rightarrow \mathbbm {1}\) in \(L^2(D, m)\), and hence \(U_\lambda g \rightarrow \mathbbm {1}- \lambda U_\lambda \mathbbm {1}\) in \(L^2(D, m)\). It follows that

$$\begin{aligned} \lim _{\mu \rightarrow \infty } \langle f, U_\lambda g \rangle = \langle f, \mathbbm {1}- \lambda U_\lambda \mathbbm {1}\rangle = \int _D f(x) (1 - \lambda U_\lambda \mathbbm {1}(x)) m(dx). \end{aligned}$$

Finally, since \(\mathbbm {1}(X(t)) = 1\) when \(t < \zeta ^X\) and \(\mathbbm {1}(X(t)) = 0\) for \(t \geqslant \zeta ^X\), we have

$$\begin{aligned} 1 - \lambda U_\lambda \mathbbm {1}(x) = 1 - \mathbb {E}^x \int _0^\infty \lambda e^{-\lambda t} \mathbbm {1}(X(t)) dt = 1 - \mathbb {E}^x \int _0^{\zeta ^X} \lambda e^{-\lambda t} dt = \mathbb {E}^x e^{-\lambda \zeta ^X} . \end{aligned}$$

Therefore,

$$\begin{aligned} \lim _{\mu \rightarrow \infty } \langle f, U_\lambda g \rangle = \int _D f(x) \mathbb {E}^x e^{-\lambda \zeta ^X} m(dx) = \mathbb {E}^\delta \bigl (e^{-\lambda \zeta ^X} f(X(0))\bigr ) . \end{aligned}$$
(3.5)

The above identity links the right-hand side of (3.1) and the right-hand side of (3.3).

Step 2. For \(x, y \in D \cup \{\partial \}\) we have

$$\begin{aligned} g(y) = \mu \mathbbm {1}(y) - \mu ^2 U_\mu \mathbbm {1}(y) = \mathbb {E}^{x, y} \int _0^\infty \mu ^2 e^{-\mu s} \bigl (\mathbbm {1}(Y(0)) - \mathbbm {1}(Y(s))\bigr ) ds . \end{aligned}$$

Using the above formula and the Markov property (with the usual abuse of notation), alongside with Fubini’s theorem, we find that

$$\begin{aligned} \hspace{3em}&\hspace{-3em} \mathbb {E}^\delta \biggl (\int _0^\infty e^{-2 \lambda t} f(X(t)) g(Y(t)) dt\biggr ) \\&= \int _0^\infty e^{-2 \lambda t} \mathbb {E}^\delta \bigl (f(X(t)) g(Y(t))\bigr ) dt \\&= \int _0^\infty e^{-2 \lambda t} \mathbb {E}^\delta \biggl ( f(X(t)) \mathbb {E}^{X(t), Y(t)} \int _0^\infty \mu ^2 e^{-\mu s} \bigl (\mathbbm {1}(Y(0)) - \mathbbm {1}(Y(s))\bigr ) ds \biggr ) dt \\&= \int _0^\infty \! e^{-2 \lambda t} \mathbb {E}^\delta \biggl ( f(X(t)) \mathbb {E}^\delta \biggl ( \int _0^\infty \! \mu ^2 e^{-\mu s} \bigl (\mathbbm {1}(Y(t)) \!-\! \mathbbm {1}(Y(t \!+\! s))\bigr ) ds \bigg | {\mathscr {F}}_t \biggr )\biggr ) dt \\&= \int _0^\infty e^{-2 \lambda t} \mathbb {E}^\delta \biggl ( f(X(t)) \int _0^\infty \mu ^2 e^{-\mu s} \bigl (\mathbbm {1}(Y(t)) - \mathbbm {1}(Y(t + s))\bigr ) ds \biggr ) dt \\&= \mathbb {E}^\delta \biggl (\int _0^\infty \int _0^\infty \mu ^2 e^{-\mu s} e^{-2 \lambda t} f(X(t)) \bigl (\mathbbm {1}(Y(t)) - \mathbbm {1}(Y(t + s))\bigr ) ds dt\biggr ) . \end{aligned}$$

Recall that \(\mathbbm {1}(Y(t)) = 1\) if \(t < \zeta ^Y\) and \(\mathbbm {1}(Y(t)) = 0\) otherwise. It follows that

$$\begin{aligned} \hspace{3em}&\hspace{-3em} \mathbb {E}^\delta \biggl (\int _0^\infty e^{-2 \lambda t} f(X(t)) g(Y(t)) dt\biggr ) \\&= \mathbb {E}^\delta \biggl (\int _0^\infty \int _0^\infty \mu ^2 e^{-\mu s} e^{-2 \lambda t} f(X(t)) \mathbbm {1}_{\{t < \zeta ^Y \leqslant t + s\}} ds dt\biggr ) . \end{aligned}$$

By Fubini’s theorem,

$$\begin{aligned} \hspace{3em}&\hspace{-3em} \mathbb {E}^\delta \biggl (\int _0^\infty e^{-2 \lambda t} f(X(t)) g(Y(t)) dt\biggr ) \\&= \mathbb {E}^\delta \biggl (\int _0^\infty \int _0^\infty \mu ^2 e^{-\mu s} e^{-2 \lambda t} f(X(t)) \mathbbm {1}_{\{t < \zeta ^Y \leqslant t + s\}} dt ds\biggr ) \\&= \mathbb {E}^\delta \biggl (\int _0^\infty \int _0^{\zeta ^Y} \mu ^2 e^{-\mu s} e^{-2 \lambda t} f(X(t)) \mathbbm {1}_{\{\zeta ^Y \leqslant t + s\}} dt ds\biggr ) . \end{aligned}$$

Substituting \(s = u / \mu \) and \(t = \zeta ^Y - v / \mu \), we arrive at

$$\begin{aligned} \hspace{3em}&\hspace{-3em} \mathbb {E}^\delta \biggl (\int _0^\infty e^{-2 \lambda t} f(X(t)) g(Y(t)) dt\biggr ) \\&= \mathbb {E}^\delta \biggl (\int _0^\infty \int _0^{\mu \zeta ^Y} e^{-u} e^{-2 \lambda (\zeta ^Y - v / \mu )} f(X(\zeta ^Y - \tfrac{v}{\mu })) \mathbbm {1}_{\{v \leqslant u \}} \mathbbm {1}_{\{\zeta ^Y < \infty \}} dv du \biggr ) . \end{aligned}$$

Since f is bounded and \(e^{-2 \lambda (\zeta ^Y - v / \mu )} \leqslant 1\), the dominated convergence theorem applies, and we conclude that

$$\begin{aligned} \begin{aligned} \hspace{3em}&\hspace{-3em} \lim _{\mu \rightarrow \infty } \mathbb {E}^\delta \biggl (\int _0^\infty e^{-2 \lambda t} f(X(t)) g(Y(t)) dt\biggr ) \\&= \mathbb {E}^\delta \biggl (\int _0^\infty e^{-u} \int _0^\infty e^{-2 \lambda \zeta ^Y} f(X(\zeta ^Y)) \mathbbm {1}_{\{v \leqslant u \}} \mathbbm {1}_{\{\zeta ^Y < \infty \}} dv du\biggr ) \\&= \mathbb {E}^\delta \bigl (e^{-2 \lambda \zeta ^Y} f(X(\zeta ^Y))\bigr ) . \end{aligned} \end{aligned}$$
(3.6)
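For the last equality in (3.6), note the elementary computation

$$\begin{aligned} \int _0^\infty e^{-u} \int _0^\infty \mathbbm {1}_{\{v \leqslant u\}} dv du&= \int _0^\infty u e^{-u} du = 1 , \end{aligned}$$

so that the double integral over u and v contributes a factor of one.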

This identity provides a link between the left-hand sides of (3.1) and (3.3).

Step 3. The desired result for \(\lambda > 0\) now follows from (3.5) and (3.6), combined with Lemma 3.1. Finally, the result for \(\lambda = 0\) is obtained by passing to the limit as \(\lambda \rightarrow 0^+\) and using the dominated convergence theorem. \(\square \)

We will need the following simple property:

$$\begin{aligned} \mathbb {P}^\delta (\zeta ^X = \zeta ^Y < \infty )&= 0 . \end{aligned}$$
(3.7)

This follows easily from the fact that the semigroup \(P_t\) is analytic (see Theorem 1 in Sect. III.1, p. 67, in [20]). Indeed: the function \(\mathbb {P}^\delta (\zeta ^X> s, \zeta ^Y > t) = \langle P_s \mathbbm {1}, P_t \mathbbm {1}\rangle \) is real-analytic with respect to \(s, t > 0\), and so the joint distribution of \((\zeta ^X, \zeta ^Y)\) under \(\mathbb {P}^\delta \) is absolutely continuous on \((0, \infty ) \times (0, \infty )\) (with a real-analytic density function). Alternatively, one can derive (3.7) from formula (3.4) with \(f = \mathbbm {1}\), with a minor twist when \(\mathbb {P}^\delta (\zeta ^X = \infty ) \in (0, 1)\) (which is only possible when X(t) is not irreducible). For a closely related result, see Proposition (3.7) in [10] or Proposition (6.20)(i) in [14].

We define the random time

$$\begin{aligned} \sigma&= \min \{\zeta ^X, \zeta ^Y\} , \end{aligned}$$

and the random variable

$$\begin{aligned} Z&= {\left\{ \begin{array}{ll} X(\sigma ) = X(\zeta ^Y) &{} \text {if } \zeta ^Y< \zeta ^X , \\ Y(\sigma ) = Y(\zeta ^X) &{} \text {if } \zeta ^X < \zeta ^Y , \\ \partial &{} \text {if } \zeta ^X = \zeta ^Y . \end{array}\right. } \end{aligned}$$
(3.8)

Recall that \(f(X(t)) g(Y(t)) = 0\) when \(t \geqslant \sigma \). Thus, Lemma 3.1 reads

$$\begin{aligned} \mathbb {E}^\delta \int _0^\sigma e^{-2 \lambda t} f(X(t)) g(Y(t)) dt&= \frac{1}{2} \, \langle f, U_\lambda g \rangle . \end{aligned}$$
(3.9)

When \(\lambda > 0\) and \(f = g = \mathbbm {1}\), we obtain

$$\begin{aligned} \mathbb {E}^\delta (1 - e^{-2 \lambda \sigma })&= \lambda \langle \mathbbm {1}, U_\lambda \mathbbm {1}\rangle . \end{aligned}$$

On the other hand, setting \(\lambda = 0\) and \(f = g = \mathbbm {1}\) leads to

$$\begin{aligned} \mathbb {E}^\delta \sigma&= \tfrac{1}{2} \langle \mathbbm {1}, U_0 \mathbbm {1}\rangle = \tfrac{1}{2} \Vert G\Vert \end{aligned}$$

(see (1.2)). Thus, the assumption that \(\Vert G\Vert \) is finite in Theorem 1.5 is equivalent to finiteness of \(\mathbb {E}^\delta \sigma \).
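The identity \(\mathbb {E}^\delta \sigma = \tfrac{1}{2} \Vert G\Vert \) lends itself to a quick numerical sanity check. The Python sketch below is not part of the argument: it takes a toy self-dual example (a two-state chain with symmetric jump rate and state-dependent killing, for which m is the uniform probability measure), with all rates and sample sizes chosen arbitrarily. Under \(\mathbb {P}^\delta \) the two particles start at a common point with law m, so \(\mathbb {P}^\delta (\sigma > t) = \langle P_t \mathbbm {1}, P_t \mathbbm {1}\rangle \), and \(\mathbb {E}^\delta \sigma \) can be computed both by quadrature and by Monte Carlo.

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad

rng = np.random.default_rng(0)

a = 1.0                      # symmetric jump rate between the two states
k = np.array([0.5, 2.0])     # state-dependent killing rates
Q = np.array([[-a - k[0], a], [a, -a - k[1]]])  # sub-Markov generator, symmetric
m = np.array([0.5, 0.5])     # uniform reference probability measure

def survival(t):
    # P^delta(sigma > t) = <P_t 1, P_t 1> = sum_x m(x) ((P_t 1)(x))^2
    p = expm(t * Q) @ np.ones(2)
    return float(m @ p**2)

# E^delta sigma = (1/2) ||G||; the tail beyond t = 50 is negligible for these rates
exact = quad(survival, 0, 50)[0]

def lifetime(x):
    # Gillespie simulation of one killed chain started at x; returns its lifetime
    t = 0.0
    while True:
        total = a + k[x]
        t += rng.exponential(1 / total)
        if rng.random() < k[x] / total:
            return t
        x = 1 - x

n = 100_000
samples = np.empty(n)
for i in range(n):
    x = rng.choice(2, p=m)                       # common starting point with law m
    samples[i] = min(lifetime(x), lifetime(x))   # sigma = min(zeta^X, zeta^Y)
mc = samples.mean()
print(exact, mc)
```

The two printed values should agree up to Monte Carlo error.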

We now rephrase Lemma 3.2. Note that \(e^{-2 \lambda \zeta ^Y} f(X(\zeta ^Y)) = e^{-2 \lambda \sigma } f(X(\sigma ))\), as \(\zeta ^Y = \sigma \) when \(\zeta ^Y < \zeta ^X\), and both sides are equal to zero otherwise. Thus, formula (3.3) can be written as

$$\begin{aligned} \mathbb {E}^\delta \bigl (e^{-2 \lambda \sigma } f(X(\sigma ))\bigr )&= \tfrac{1}{2} \mathbb {E}^\delta \bigl (e^{-\lambda \zeta ^X} f(X(0))\bigr ) . \end{aligned}$$

By symmetry,

$$\begin{aligned} \mathbb {E}^\delta \bigl (e^{-2 \lambda \sigma } f(Y(\sigma ))\bigr )&= \tfrac{1}{2} \mathbb {E}^\delta \bigl (e^{-\lambda \zeta ^X} f(X(0))\bigr ) , \end{aligned}$$

and therefore

$$\begin{aligned} \mathbb {E}^\delta \bigl (e^{-2 \lambda \sigma } f(Z)\bigr )&= \mathbb {E}^\delta \bigl (e^{-\lambda \zeta ^X} f(X(0))\bigr ) = \int _D f(x) \mathbb {E}^x e^{-\lambda \zeta ^X} m(dx) . \end{aligned}$$
(3.10)

By considering \(\lambda = 0\), we find that if \(\mathbb {P}^x(\zeta ^X < \infty ) = 1\) for almost all \(x \in D\), then

$$\begin{aligned} \mathbb {E}^\delta f(Z)&= \mathbb {E}^\delta f(X(0)) = \int _D f(x) m(dx) , \end{aligned}$$
(3.11)

a property that was stated as Proposition 2.1 in the previous section, and proved with a more direct approach when X(t) is the Brownian motion.

4 We’ll meet again

We are now ready to prove Theorem 1.1. This is done below, after the Fleming–Viot system of two particles is introduced more carefully.

As in the previous section, we consider a self-dual Hunt process X(t) on a state space D with a finite reference measure m, and its independent copy Y(t). Again with no loss of generality we assume that m is a probability measure. In this section we consider the corresponding Fleming–Viot system of two particles: a bivariate process \(({\bar{X}}(t), {\bar{Y}}(t))\) which evolves just as (X(t), Y(t)), except that at an increasing sequence of branching times \(\tau _n\) the coordinate that is about to die re-enters the state space D at the position of the other coordinate.

More precisely, the Fleming–Viot system \(({\bar{X}}(t), {\bar{Y}}(t))\) is constructed recursively. We let \(\tau _0 = 0\), and we let \(Z_0\) be a random variable taking values in D, with distribution m. Once \(\tau _{n - 1} \in [0, \infty )\) and \(Z_{n - 1} \in D\) are given, and \({\bar{X}}(t)\) and \({\bar{Y}}(t)\) are defined on \([0, \tau _{n - 1})\), we proceed as follows. We sample an independent copy \((X_n(t), Y_n(t))\) of the bivariate process (X(t), Y(t)) started at the random position \((Z_{n - 1}, Z_{n - 1})\), and we denote the corresponding variables \(\sigma \) and Z by \(\sigma _n\) and \(Z_n\). That is,

$$\begin{aligned} \sigma _n&= \min \{\zeta ^{X_n}, \zeta ^{Y_n}\} ,&Z_n&= {\left\{ \begin{array}{ll} X_n(\zeta ^{Y_n}) &{} \text {if}\ \zeta ^{Y_n}< \zeta ^{X_n}, \\ Y_n(\zeta ^{X_n}) &{} \text {if}\ \zeta ^{X_n} < \zeta ^{Y_n}. \end{array}\right. } \end{aligned}$$

We define

$$\begin{aligned} {\bar{X}}(\tau _{n - 1} + t)&= X_n(t) ,&{\bar{Y}}(\tau _{n - 1} + t)&= Y_n(t) \end{aligned}$$

for \(t \in [0, \sigma _n)\). Furthermore, we let

$$\begin{aligned} \tau _n&= \tau _{n - 1} + \sigma _n . \end{aligned}$$

We denote the probability corresponding to the above construction by \({\bar{\mathbb {P}}}^\delta \) to emphasise that the initial configuration of the particles is distributed according to the measure \(\delta \) on the diagonal of \(D \times D\) with marginals m.

Exactly the same construction can be carried out for an arbitrary distribution of the initial configuration of the two particles. If \({\bar{X}}(0) = x\) and \({\bar{Y}}(0) = y\) with probability one, we write \({\bar{\mathbb {P}}}^{x, y}\) for the corresponding probability. If \({\bar{X}}(0)\) and \({\bar{Y}}(0)\) are drawn independently from distribution m, we denote the corresponding probability by \({\bar{\mathbb {P}}}^{m \times m}\). More generally, for an arbitrary probability measure \(\mu \) on \(D \times D\) we write \({\bar{\mathbb {P}}}^\mu = \int _{D \times D} {\bar{\mathbb {P}}}^{x, y} \mu (dx, dy)\) for the probability corresponding to the system of particles with initial configuration \(({\bar{X}}(0), {\bar{Y}}(0))\) chosen randomly from distribution \(\mu \). Clearly, for any event E we have

$$\begin{aligned} {\bar{\mathbb {P}}}^\mu (E)&= \iint \limits _{D \times D} {\bar{\mathbb {P}}}^{x, y}(E) \mu (dx, dy) . \end{aligned}$$
(4.1)

Note that the process \(({\bar{X}}(t), {\bar{Y}}(t))\) is defined on \([0, \tau _\infty )\), where

$$\begin{aligned} \tau _\infty&= \lim _{n \rightarrow \infty } \tau _n \in (0, \infty ] . \end{aligned}$$

If \(\tau _\infty < \infty \), we say that the Fleming–Viot particle system becomes extinct (or suffers from a branching explosion) at time \(\tau _\infty \), and in this case we set \(\bar{X}(t) = {\bar{Y}}(t) = \partial \) for \(t \geqslant \tau _\infty \) to make the definition of \(({\bar{X}}(t), {\bar{Y}}(t))\) complete.
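Although the construction above is purely measure-theoretic, it is straightforward to simulate. The following Python sketch is illustrative only: the underlying process (Brownian motion on (0, 1) killed at the boundary), the Euler time step, the horizon and the starting points are all chosen arbitrarily, and the discretisation only approximately detects the exit times. It runs the two-particle Fleming–Viot system and records the branching times \(\tau _n\).

```python
import numpy as np

rng = np.random.default_rng(1)

def run_fv(x0, y0, horizon=1.0, dt=1e-4):
    """Two-particle Fleming-Viot system for Brownian motion on (0, 1)
    killed at the boundary; returns the branching times seen in [0, horizon]."""
    x, y, t = x0, y0, 0.0
    branching_times = []
    while t < horizon:
        x += np.sqrt(dt) * rng.standard_normal()
        y += np.sqrt(dt) * rng.standard_normal()
        t += dt
        if not 0 < x < 1 and not 0 < y < 1:
            break          # simultaneous death: a discretisation artefact, cf. (3.7)
        if not 0 < x < 1:
            x = y          # X is about to die: it jumps onto Y
            branching_times.append(t)
        elif not 0 < y < 1:
            y = x          # Y is about to die: it jumps onto X
            branching_times.append(t)
    return branching_times

taus = run_fv(0.3, 0.7)
print(len(taus), taus[:3])
```

In the continuum the simultaneous-death branch has probability zero by (3.7); on the grid it can occur, and the sketch simply stops there.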

Below we restate, and then prove, Theorem 1.1. For convenience, we show items (a) and (b) of Theorem 1.1 separately, as Corollary 4.2 and Theorem 4.1, respectively.

Theorem 4.1

Consider the Fleming–Viot system \(({\bar{X}}(t), \bar{Y}(t))\) of two particles evolving according to a self-dual Hunt process X(t) on a state space D with a finite reference measure m. For almost every \(x \in D\) (with respect to m), if the initial configuration is \({\bar{X}}(0) = {\bar{Y}}(0) = x\), then the system never becomes extinct.

Proof

With no loss of generality we assume that m is a probability measure.

By construction, up to the first branching time, the process \((\bar{X}(t), {\bar{Y}}(t))\) is a copy of the process (X(t), Y(t)) studied in the previous section. More precisely, for \(t \in [0, \tau _1)\) the process \(({\bar{X}}(t), {\bar{Y}}(t)) = (X_1(t), Y_1(t))\) is a copy of the process (X(t), Y(t)) restricted to \(t \in [0, \sigma )\). Note, however, that at time \(\tau _1\) we have \({\bar{X}}(\tau _1) = \bar{Y}(\tau _1) = Z_1\), while one of the variables \(X_1(\tau _1)\) and \(Y_1(\tau _1)\) is equal to \(\partial \).

By definition, \(Z_1 = X_1(\tau _1)\) if \(\tau _1 = \zeta ^{Y_1}\), and \(Z_1 = Y_1(\tau _1)\) if \(\tau _1 = \zeta ^{X_1}\). This means that \(Z_1\) is defined in terms of \(X_1(t)\) and \(Y_1(t)\) in the same way as the random variable Z was constructed in (3.8) using X(t) and Y(t).

Again by construction, the process \(({\bar{X}}(t), {\bar{Y}}(t))\) has the strong Markov property at time \(\tau _1\): the evolution after time \(\tau _1\) is independent of the past, conditionally on the present value of \(Z_1 = {\bar{X}}(\tau _1) = {\bar{Y}}(\tau _1)\). We fix \(\lambda > 0\), and we write

$$\begin{aligned} f(x)&= {\bar{\mathbb {E}}}^{x, x} e^{-2 \lambda \tau _\infty } . \end{aligned}$$

By the strong Markov property and (3.10), we have

$$\begin{aligned} \int _D f(x) m(dx)&= {\bar{\mathbb {E}}}^\delta e^{-2 \lambda \tau _\infty } = {\bar{\mathbb {E}}}^\delta \bigl (\mathbbm {1}_{\{\tau _1< \infty \}} e^{-2 \lambda \tau _1} e^{-2 \lambda (\tau _\infty - \tau _1)}\bigr ) \\&= {\bar{\mathbb {E}}}^\delta \bigl (\mathbbm {1}_{\{\tau _1< \infty \}} e^{-2 \lambda \tau _1} {\bar{\mathbb {E}}}^\delta (e^{-2 \lambda (\tau _\infty - \tau _1)} \vert {\mathscr {F}}_{\tau _1})\bigr ) \\&= {\bar{\mathbb {E}}}^\delta \bigl (\mathbbm {1}_{\{\tau _1< \infty \}} e^{-2 \lambda \tau _1} f(Z_1) \bigr ) = \mathbb {E}^\delta \bigl (\mathbbm {1}_{\{\sigma < \infty \}} e^{-2 \lambda \sigma } f(Z) \bigr ) \\&= \int _D f(x) \mathbb {E}^x e^{-\lambda \zeta ^X} m(dx) . \end{aligned}$$

Thus,

$$\begin{aligned} \int _D f(x) (1 - \mathbb {E}^x e^{-\lambda \zeta ^X}) m(dx)&= 0 . \end{aligned}$$

However, \(\mathbb {E}^x e^{-\lambda \zeta ^X} < 1\) for every \(x \in D\). It follows that \(f(x) = 0\) for almost every \(x \in D\), that is, \(\tau _\infty = \infty \) with probability \({\bar{\mathbb {P}}}^{x, x}\) one for almost every \(x \in D\). \(\square \)

Claim (b) in Theorem 1.1 is thus proved. In order to extend this result to almost all initial configurations \(({\bar{X}}(0), {\bar{Y}}(0)) = (x, y)\) with respect to the product measure \(m \times m\), as in claim (a), we need one more step.

Corollary 4.2

The result of Theorem 4.1 remains true for almost every initial configuration \({\bar{X}}(0) = x\), \({\bar{Y}}(0) = y\) (with respect to the product measure \(m \times m\)), and also when the distribution of \(({\bar{X}}(0), {\bar{Y}}(0))\) is absolutely continuous with respect to \(m \times m\).

Proof

Again, with no loss of generality we assume that m is a probability measure. Suppose that the initial position of \((\bar{X}(t), {\bar{Y}}(t))\) is chosen randomly according to the product measure \(m \times m\), and recall that we denote the corresponding probability by \({\bar{\mathbb {P}}}^{m \times m}\). As in the proof of Theorem 4.1, we observe that up to the first branching time \(\tau _1\), we have \({\bar{X}}(t) = X_1(t)\) and \({\bar{Y}}(t) = Y_1(t)\), where \(X_1(t)\) and \(Y_1(t)\) are independent copies of the underlying Hunt process X(t), and both are started independently at a random position in D chosen according to the measure m. Furthermore, we have \(Z_1 = X_1(\tau _1)\) if \(\tau _1 = \zeta ^{Y_1}\) and \(Z_1 = Y_1(\tau _1)\) if \(\tau _1 = \zeta ^{X_1}\). It follows that for every Borel set \(A \subseteq D\),

$$\begin{aligned} {\bar{\mathbb {P}}}^{m \times m}(Z_1 \in A)&= {\bar{\mathbb {P}}}^{m \times m}(X_1(\zeta ^{Y_1}) \in A) + {\bar{\mathbb {P}}}^{m \times m}(Y_1(\zeta ^{X_1}) \in A) \\&= 2 {\bar{\mathbb {P}}}^{m \times m}(X_1(\zeta ^{Y_1}) \in A) \end{aligned}$$

(recall that \(X_1(\zeta ^{Y_1}) = \partial \) if \(\zeta ^{Y_1} \geqslant \zeta ^{X_1}\)). By independence,

$$\begin{aligned} {\bar{\mathbb {P}}}^{m \times m}(Z_1 \in A)&= 2 \int _{[0, \infty )} {\bar{\mathbb {P}}}^{m \times m}(X_1(t) \in A) {\bar{\mathbb {P}}}^{m \times m}(\zeta ^{Y_1} \in dt) . \end{aligned}$$

If \(\mathbb {P}^m\) corresponds to the process X(t) started at a random position X(0) with distribution m, then we obtain

$$\begin{aligned} {\bar{\mathbb {P}}}^{m \times m}(Z_1 \in A)&= 2 \int _{[0, \infty )} \mathbb {P}^m(X(t) \in A) \mathbb {P}^m(\zeta ^X \in dt) . \end{aligned}$$

However,

$$\begin{aligned} \mathbb {P}^m(X(t) \in A)&= \int _D \mathbb {P}^x(X(t) \in A) m(dx) = \langle \mathbbm {1}, P_t \mathbbm {1}_A \rangle , \end{aligned}$$

and since \(P_t\) is self-adjoint, we have

$$\begin{aligned} \mathbb {P}^m(X(t) \in A)&= \langle P_t \mathbbm {1}, \mathbbm {1}_A \rangle \leqslant \langle \mathbbm {1}, \mathbbm {1}_A \rangle = m(A) . \end{aligned}$$

Therefore,

$$\begin{aligned} {\bar{\mathbb {P}}}^{m \times m}(Z_1 \in A)&\leqslant 2 \int _{[0, \infty )} m(A) \mathbb {P}^m(\zeta ^X \in dt) = 2 m(A) . \end{aligned}$$

In particular, the distribution of \(Z_1\) is absolutely continuous with respect to m.

By the strong Markov property, the shifted process \(({\bar{X}}(\tau _1 + t), {\bar{Y}}(\tau _1 + t))\) is the same Fleming–Viot particle system, with initial configuration \((Z_1, Z_1)\). Above we proved that the distribution of \(Z_1\) under \({\bar{\mathbb {P}}}^{m \times m}\) is absolutely continuous with respect to m. Thus, using the strong Markov property and Theorem 4.1, we find that

$$\begin{aligned} {\bar{\mathbb {P}}}^{m \times m}(\tau _\infty< \infty )&= {\bar{\mathbb {E}}}^{m \times m}\bigl (\mathbbm {1}_{\{\tau _1< \infty \}} {\bar{\mathbb {P}}}^{m \times m}(\tau _\infty< \infty \big | {\mathscr {F}}_{\tau _1})\bigr ) \\&= {\bar{\mathbb {E}}}^{m \times m}\bigl (\mathbbm {1}_{\{\tau _1< \infty \}} {\bar{\mathbb {P}}}^{Z_1, Z_1}(\tau _\infty< \infty )\bigr ) \\&= \int _D {\bar{\mathbb {P}}}^{x, x}(\tau _\infty < \infty ) {\bar{\mathbb {P}}}^{m \times m}(Z_1 \in dx) = 0 , \end{aligned}$$

that is, the system never becomes extinct. By (4.1) (with \(\mu = m \times m\)), we conclude that \(\tau _\infty = \infty \) with probability \({\bar{\mathbb {P}}}^{x, y}\) one for almost every pair (x, y).

The latter assertion of the corollary follows immediately from the former by (4.1) and Fubini’s theorem. Alternatively, one can repeat the above argument with the initial configuration distributed according to a given absolutely continuous distribution rather than \(m \times m\). \(\square \)

Of course, this proves claim (a) of Theorem 1.1, and so the proof of our main result is complete.

Proof of Corollary 1.4

Suppose that \({\bar{X}}(t)\) and \({\bar{Y}}(t)\) are started at fixed points x and y, respectively. Then the processes \({\bar{X}}(t)\) and \(\bar{Y}(t)\) are independent up to the first branching time \(\tau _1\). By the same argument as in the proof of Corollary 4.2, the distribution of \(Z_1\) is a mixture of the one-dimensional distributions of the underlying Hunt process X(t), and hence, by assumption, it is absolutely continuous with respect to m. The remaining part of the proof is exactly the same as in the proof of Corollary 4.2. \(\square \)

5 Isn’t this where we came in?

In this section we prove Theorem 1.5 and Corollary 1.6. We use the notation introduced in the previous section, and we begin by showing that m is a stationary measure for the embedded Markov chain \(Z_n\).

Proof of Corollary 1.6

With no loss of generality we assume that m is a probability measure, and we suppose that the initial configuration \(({\bar{X}}(0), {\bar{Y}}(0))\) has distribution \(\delta \) on the diagonal of \(D \times D\), as in the proof of Theorem 4.1. By assumption, \(Z_0 = {\bar{X}}(0) = {\bar{Y}}(0)\) has distribution m, and \(\mathbb {P}^x(\zeta ^X < \infty ) = 1\) for almost every \(x \in D\).

As observed in the proof of Theorem 4.1, the process \(({\bar{X}}(t), {\bar{Y}}(t))\) up to the first branching time \(\tau _1\) is a copy of the bivariate process (X(t), Y(t)) up to time \(\sigma = \min \{\zeta ^X, \zeta ^Y\}\), and hence the distribution of \(Z_1\) under \({\bar{\mathbb {P}}}^\delta \) is equal to the distribution of Z under \(\mathbb {P}^\delta \). By (3.11), the latter is equal to m. Therefore, \(Z_1\) has distribution m.

Suppose that we already know that \(Z_{n - 1}\) has distribution m. By the strong Markov property for the process \(({\bar{X}}(t), \bar{Y}(t))\) at time \(\tau _{n - 1}\) (see [16, 18]), the shifted process \(({\bar{X}}(\tau _{n - 1} + t), {\bar{Y}}(\tau _{n - 1} + t))\) under \({\bar{\mathbb {P}}}^\delta \) is just a copy of the original process \((\bar{X}(t), {\bar{Y}}(t))\) under \({\bar{\mathbb {P}}}^\delta \). The result of the previous paragraph implies that \(Z_n\) has distribution m, and additionally it proves the Markov property for the sequence \(Z_0, Z_1, \ldots \) at time \(n - 1\). Thus, by induction, the sequence \(Z_0, Z_1, \ldots \) is indeed a Markov chain, and for every n the random variable \(Z_n\) has distribution m. \(\square \)
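Corollary 1.6 is easy to probe by simulation. The Python sketch below is a toy illustration, with all rates chosen arbitrarily: it runs the embedded chain \(Z_n\) for a two-state chain with symmetric jump rate and deliberately unequal killing rates. Self-duality holds with respect to the uniform measure m, so by Corollary 1.6 the empirical distribution of \(Z_n\) should be close to uniform even though the killing is biased.

```python
import numpy as np

rng = np.random.default_rng(2)

a = 1.0                     # symmetric jump rate
k = np.array([0.5, 2.0])    # deliberately unequal killing rates
m = np.array([0.5, 0.5])    # uniform reference probability measure

def next_Z(z):
    """One step of the embedded chain: run two independent copies of the
    killed chain from the common point z; return the survivor's position
    at the first death time."""
    x = y = z
    while True:
        # superposition of the two independent chains (Gillespie race):
        rates = np.array([a, a, k[x], k[y]])  # X jumps, Y jumps, X dies, Y dies
        event = rng.choice(4, p=rates / rates.sum())
        if event == 0:
            x = 1 - x
        elif event == 1:
            y = 1 - y
        elif event == 2:
            return y        # X died first: Z is Y's position
        else:
            return x        # Y died first: Z is X's position

n = 100_000
z = int(rng.choice(2, p=m))
counts = np.zeros(2)
for _ in range(n):
    z = next_Z(z)
    counts[z] += 1
print(counts / n)
```

Both coordinates of the printed vector should be close to 1/2.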

For the construction of a stationary measure for the Fleming–Viot system \(({\bar{X}}(t), {\bar{Y}}(t))\) in Theorem 1.5, we additionally denote

$$\begin{aligned} \mu (dx, dy)&= \Vert G\Vert ^{-1} G(x, dy) m(dx) \end{aligned}$$

whenever \(\Vert G\Vert \) is finite; see (1.1) and (1.2). Note that in this case \(\mu \) is a probability measure on \(D \times D\). Our goal is to show that under probability \({\bar{\mathbb {P}}}^\mu \), corresponding to the Fleming–Viot system of two particles with initial configuration \(({\bar{X}}(0), {\bar{Y}}(0))\) having distribution \(\mu \), the system \(({\bar{X}}(t), {\bar{Y}}(t))\) is stationary.

The only essential property that we use here is that the process \(({\bar{X}}(t), {\bar{Y}}(t))\) is obtained by concatenating ‘excursions’ \((X_k(t), Y_k(t))\), \(t \in [0, \sigma _k)\), and that the initial configurations \((X_k(0), Y_k(0)) = (Z_{k - 1}, Z_{k - 1})\) of these excursions form a stationary Markov chain (by Corollary 1.6). While such a concatenation procedure seems to be rather standard (see, for example, [16, 18]), the author failed to find a proper reference, and so we include full details.

We fix \(\lambda > 0\) and two bounded nonnegative Borel functions fg, and we denote

$$\begin{aligned} \varphi (x, y)&= {\bar{\mathbb {E}}}^{x, y} \int _0^\infty e^{-\lambda t} f(\bar{X}(t)) g({\bar{Y}}(t)) dt . \end{aligned}$$

Recall that \(\delta \) is the measure on the diagonal of \(D \times D\) with marginals m, and once again with no loss of generality we assume that m is a probability measure. The following lemma is the key technical result in the proof of Theorem 1.5.

Lemma 5.1

Suppose that \(\Vert G\Vert \) is finite. With the above definitions, we have

$$\begin{aligned} \frac{1}{2} \iint \limits _{D \times D} \varphi (x, y) G(x, dy) m(dx)&= \frac{1}{2 \lambda } \iint \limits _{D \times D} f(x) g(y) G(x, dy) m(dx) . \end{aligned}$$
(5.1)

Proof

Recall that \(G(x, dy) = u_0(x, dy)\), and that the process \((\bar{X}(t), {\bar{Y}}(t))\) up to the first branching time \(\tau _1\) is a copy of the process (X(t), Y(t)) up to time \(\sigma = \min \{\zeta ^X, \zeta ^Y\}\). Thus, formula (3.9) implies that

$$\begin{aligned} \frac{1}{2} \smash {\iint \limits _{D \times D}} \varphi (x, y) G(x, dy) m(dx)&= \mathbb {E}^\delta \biggl (\int _0^\sigma \varphi (X(s), Y(s)) ds\biggr ) \\&= {\bar{\mathbb {E}}}^\delta \biggl (\int _0^{\tau _1} \varphi ({\bar{X}}(s), {\bar{Y}}(s)) ds\biggr ) . \end{aligned}$$

We now show a variant of the resolvent equation, with one integral over \((0, \infty )\) and the other one over \((0, \tau _1)\). Using the Markov property, we find that

$$\begin{aligned} \hspace{3em}&\hspace{-3em} \frac{1}{2} \smash {\iint \limits _{D \times D}} \varphi (x, y) G(x, dy) m(dx) \\&= {\bar{\mathbb {E}}}^\delta \biggl (\int _0^{\tau _1} {\bar{\mathbb {E}}}^{{\bar{X}}(s), \bar{Y}(s)} \biggl (\int _0^\infty e^{-\lambda t} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr ) ds\biggr ) \\&= {\bar{\mathbb {E}}}^\delta \biggl (\int _0^{\tau _1} {\bar{\mathbb {E}}}^\delta \biggl (\int _0^\infty e^{-\lambda t} f({\bar{X}}(s + t)) g({\bar{Y}}(s + t)) dt \bigg | {\mathscr {F}}_s \biggr ) ds\biggr ) \\&= {\bar{\mathbb {E}}}^\delta \biggl (\int _0^{\tau _1} \int _0^\infty e^{-\lambda t} f({\bar{X}}(s + t)) g({\bar{Y}}(s + t)) dt ds\biggr ) \\&= {\bar{\mathbb {E}}}^\delta \biggl (\int _0^{\tau _1} \int _s^\infty e^{\lambda s - \lambda t} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt ds\biggr ) . \end{aligned}$$

By Fubini’s theorem,

$$\begin{aligned} \hspace{3em}&\hspace{-3em} \frac{1}{2} \smash {\iint \limits _{D \times D}} \varphi (x, y) G(x, dy) m(dx) \nonumber \\&= {\bar{\mathbb {E}}}^\delta \biggl (\int _0^\infty \int _0^{\min \{t, \tau _1\}} e^{\lambda s - \lambda t} f({\bar{X}}(t)) g({\bar{Y}}(t)) ds dt\biggr ) \nonumber \\&= {\bar{\mathbb {E}}}^\delta \biggl (\int _0^{\tau _1} \frac{1 - e^{-\lambda t}}{\lambda } \, f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr ) \nonumber \\&\quad + {\bar{\mathbb {E}}}^\delta \biggl (\int _{\tau _1}^\infty \frac{e^{\lambda \tau _1 - \lambda t} - e^{-\lambda t}}{\lambda } \, f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr ) \nonumber \\&= \frac{1}{\lambda } \, {\bar{\mathbb {E}}}^\delta \biggl (\int _0^{\tau _1} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr ) \nonumber \\&\quad - \frac{1}{\lambda } \, {\bar{\mathbb {E}}}^\delta \biggl (\int _0^{\tau _1} e^{-\lambda t} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr ) \nonumber \\&\quad + \frac{1}{\lambda } \, {\bar{\mathbb {E}}}^\delta \biggl (\int _{\tau _1}^\infty e^{\lambda \tau _1 - \lambda t} f(\bar{X}(t)) g({\bar{Y}}(t)) dt\biggr )\nonumber \\&\quad - \frac{1}{\lambda } \, {\bar{\mathbb {E}}}^\delta \biggl (\int _{\tau _1}^\infty e^{-\lambda t} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr ) . \end{aligned}$$
(5.2)

We study each of the terms on the right-hand side. For the first one, by equality of \(({\bar{X}}(t), {\bar{Y}}(t))\) and (X(t), Y(t)) up to the first branching time, and by (3.9) with \(\lambda = 0\), we obtain

$$\begin{aligned} \begin{aligned} {\bar{\mathbb {E}}}^\delta \biggl (\int _0^{\tau _1} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr )&= \mathbb {E}^\delta \biggl (\int _0^\sigma f(X(t)) g(Y(t)) dt\biggr ) \\&= \frac{1}{2} \, \langle f, U_0 g \rangle \\&= \frac{1}{2} \iint _{D \times D} f(x) g(y) G(x, dy) m(dx) . \end{aligned} \end{aligned}$$
(5.3)

In order to transform the third term, we apply the strong Markov property:

$$\begin{aligned} \hspace{3em}&\hspace{-3em} {\bar{\mathbb {E}}}^\delta \biggl (\int _{\tau _1}^\infty e^{\lambda \tau _1 - \lambda t} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr ) \\&= {\bar{\mathbb {E}}}^\delta \biggl (\mathbbm {1}_{\{\tau _1< \infty \}} \int _0^\infty e^{-\lambda t} f({\bar{X}}(\tau _1 + t)) g({\bar{Y}}(\tau _1 + t)) dt\biggr ) \\&= {\bar{\mathbb {E}}}^\delta \biggl (\mathbbm {1}_{\{\tau _1< \infty \}} {\bar{\mathbb {E}}}^\delta \biggl (\int _0^\infty e^{-\lambda t} f({\bar{X}}(\tau _1 + t)) g({\bar{Y}}(\tau _1 + t)) dt \bigg | {\mathscr {F}}_{\tau _1}\biggr ) \biggr ) \\&= {\bar{\mathbb {E}}}^\delta \biggl (\mathbbm {1}_{\{\tau _1< \infty \}} {\bar{\mathbb {E}}}^{Z_1, Z_1} \biggl (\int _0^\infty e^{-\lambda t} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr ) \biggr ) \\&= {\bar{\mathbb {E}}}^\delta \bigl (\mathbbm {1}_{\{\tau _1 < \infty \}} \varphi (Z_1, Z_1)\bigr ) . \end{aligned}$$

Since \(\Vert G\Vert \) is finite, we have \({\bar{\mathbb {P}}}^{x, x}(\tau _1< \infty ) = \mathbb {P}^{x, x}(\sigma < \infty ) = 1\) for almost every \(x \in D\), and hence \({\bar{\mathbb {P}}}^\delta (\tau _1 < \infty ) = 1\). By Corollary 1.6, the distribution of \(Z_1\) under \({\bar{\mathbb {P}}}^\delta \) is equal to m. Thus,

$$\begin{aligned} {\bar{\mathbb {E}}}^\delta \biggl (\int _{\tau _1}^\infty e^{\lambda \tau _1 - \lambda t} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr )&= \int _D \varphi (x, x) m(dx) . \end{aligned}$$

By the definition of \(\varphi \), we have

$$\begin{aligned} \hspace{3em}&\hspace{-3em} {\bar{\mathbb {E}}}^\delta \biggl (\int _{\tau _1}^\infty e^{\lambda \tau _1 - \lambda t} f(\bar{X}(t)) g({\bar{Y}}(t)) dt\biggr ) \nonumber \\&= \int _D {\bar{\mathbb {E}}}^{x, x} \biggl (\int _0^\infty e^{-\lambda t} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr ) m(dx) \nonumber \\&= {\bar{\mathbb {E}}}^\delta \biggl (\int _0^\infty e^{-\lambda t} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr ) . \end{aligned}$$
(5.4)

Combining (5.3) and (5.4) with (5.2), we find that

$$\begin{aligned} \frac{1}{2} \iint \limits _{D \times D} \varphi (x, y) G(x, dy) m(dx)&= \frac{1}{2 \lambda } \iint \limits _{D \times D} f(x) g(y) G(x, dy) m(dx) \\&\hspace{3em} - \frac{1}{\lambda } \, {\bar{\mathbb {E}}}^\delta \biggl (\int _0^{\tau _1} e^{-\lambda t} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr ) \\&\hspace{3em} + \frac{1}{\lambda } \, {\bar{\mathbb {E}}}^\delta \biggl (\int _0^\infty e^{-\lambda t} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr ) \\&\hspace{3em} - \frac{1}{\lambda } \, {\bar{\mathbb {E}}}^\delta \biggl (\int _{\tau _1}^\infty e^{-\lambda t} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt\biggr ) . \end{aligned}$$

The last three terms on the right-hand side cancel out, and the desired result follows. \(\square \)

Proof of Theorem 1.5

We divide the argument into three parts.

Step 1. Suppose that \(\Vert G\Vert \) is finite. Using the definition of \({\bar{\mathbb {P}}}^\mu \) and \(\varphi \), formula (5.1) can be rewritten as

$$\begin{aligned} {\bar{\mathbb {E}}}^\mu \int _0^\infty e^{-\lambda t} f({\bar{X}}(t)) g({\bar{Y}}(t)) dt&= \frac{1}{\lambda } \, {\bar{\mathbb {E}}}^\mu \bigl (f({\bar{X}}(0)) g({\bar{Y}}(0))\bigr ) . \end{aligned}$$

Fubini’s theorem implies that if \(\Phi (t) = {\bar{\mathbb {E}}}^\mu \bigl (f({\bar{X}}(t)) g({\bar{Y}}(t))\bigr )\), then

$$\begin{aligned} \int _0^\infty e^{-\lambda t} \Phi (t) dt&= \frac{\Phi (0)}{\lambda } \, , \end{aligned}$$

that is, the Laplace transform of \(\Phi \) is given by \(\Phi (0) / \lambda \). It follows that \(\Phi (t) = \Phi (0)\) for almost every \(t \in [0, \infty )\). Assume that f and g are additionally continuous. Then \(\Phi \) is right-continuous, and hence we simply have \(\Phi (t) = \Phi (0)\) for every \(t \in [0, \infty )\). In other words,

$$\begin{aligned} {\bar{\mathbb {E}}}^\mu \bigl (f({\bar{X}}(t)) g({\bar{Y}}(t))\bigr )&= {\bar{\mathbb {E}}}^\mu \bigl (f({\bar{X}}(0)) g({\bar{Y}}(0))\bigr ) \end{aligned}$$

for every \(t \geqslant 0\) and all bounded continuous nonnegative functions f and g. By a density argument, the distributions of \(({\bar{X}}(t), {\bar{Y}}(t))\) and \(({\bar{X}}(0), {\bar{Y}}(0))\) under \({\bar{\mathbb {P}}}^\mu \) are equal, that is, \(({\bar{X}}(t), {\bar{Y}}(t))\) is indeed a stationary process under \({\bar{\mathbb {P}}}^\mu \). The first assertion of Theorem 1.5 is proved when \(\Vert G\Vert \) is finite.
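The uniqueness argument in Step 1 rests on the elementary fact that a bounded function whose Laplace transform equals \(\Phi (0) / \lambda \) for every \(\lambda > 0\) agrees almost everywhere with the constant \(\Phi (0)\). A toy numerical illustration in Python (the functions and values below are arbitrary): the transform of a constant function matches \(\Phi (0) / \lambda \) exactly, while a non-constant comparison function fails the identity.

```python
import numpy as np
from scipy.integrate import quad

phi0 = 0.7

def laplace(phi, lam):
    # numerical Laplace transform of phi at lam > 0
    return quad(lambda t: np.exp(-lam * t) * phi(t), 0, np.inf)[0]

const = lambda t: phi0                  # the constant function Phi(t) = Phi(0)
decay = lambda t: phi0 * np.exp(-t)     # a non-constant comparison

for lam in (0.5, 1.0, 3.0):
    lt_const = laplace(const, lam)
    lt_decay = laplace(decay, lam)
    print(lam, lt_const, phi0 / lam, lt_decay)
```

For each \(\lambda \), the second and third printed numbers coincide, while the fourth does not.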

Step 2. Extension to the case when \(G(x, dy) m(dx)\) is a \(\sigma \)-finite measure is immediate, except that we need to work with the infinite measure \(\mu (dx, dy) = G(x, dy) m(dx)\) (without the normalisation constant \(\Vert G\Vert ^{-1}\)). We leave it to the interested reader to verify that the above proof carries over to this setting.

Step 3. The second assertion of Theorem 1.5 follows now from Birkhoff’s ergodic theorem for Markov processes. Indeed: suppose that \(\varphi \) is a nonnegative Borel function on \(D \times D\). By the ergodic theorem given in Corollary 25.9 in [17], the limit

$$\begin{aligned} M&= \lim _{T \rightarrow \infty } \frac{1}{T} \int _0^T \varphi ({\bar{X}}(t), {\bar{Y}}(t)) dt \end{aligned}$$

exists with probability \({\bar{\mathbb {P}}}^\mu \) one, and if \({\mathscr {I}}\) denotes the \(\sigma \)-algebra of all events which are invariant under time shifts, then

$$\begin{aligned} M&= {\bar{\mathbb {E}}}^\mu \bigl (\varphi ({\bar{X}}(0), {\bar{Y}}(0)) \big \vert {\mathscr {I}}\bigr ) . \end{aligned}$$

Below we prove that irreducibility of X(t) implies that \(\mathscr {I}\) is trivial: it only contains events of probability \({\bar{\mathbb {P}}}^\mu \) zero or one. This implies that

$$\begin{aligned} M&\!=\! {\bar{\mathbb {E}}}^\mu \varphi ({\bar{X}}(0), {\bar{Y}}(0)) \!=\! \iint \limits _{D \times D} \varphi (x, y) \mu (dx, dy) \!=\! \frac{1}{\Vert G\Vert } \iint \limits _{D \times D} \varphi (x, y) G(x, dy) m(dx) \end{aligned}$$

with probability \({\bar{\mathbb {P}}}^\mu \) one, completing the proof of the theorem.

It remains to show that \({\mathscr {I}}\) is trivial. This follows by a standard argument, which is however difficult to find in the literature, and so we provide full details. Recall that we assume irreducibility of X(t): if \(m(A) > 0\) and \(t > 0\), then \(\mathbb {P}^x(X(t) \in A) > 0\) for almost every \(x \in D\). Our goal is to prove that if I is an invariant event for \(({\bar{X}}(t), {\bar{Y}}(t))\), in the sense that the time-shifts leave I unchanged, then \({\bar{\mathbb {P}}}^\mu (I)\) is either zero or one.

By Lemma 1 in [11], for every invariant event I we have

$$\begin{aligned} {\bar{\mathbb {P}}}^\mu \bigl (\mathbbm {1}_I = \mathbbm {1}_B({\bar{X}}(0), {\bar{Y}}(0))\bigr )&= 1 \end{aligned}$$

for some invariant Borel set \(B \subseteq D \times D\). That is, for every \(t > 0\) we have

$$\begin{aligned} {\bar{\mathbb {P}}}^{x, y}(({\bar{X}}(t), {\bar{Y}}(t)) \in B)&= \mathbbm {1}_B(x, y) \end{aligned}$$

for almost every \((x, y) \in D \times D\) with respect to the measure \(\mu \). Since the product measure \(m \times m\) is absolutely continuous with respect to \(\mu (dx, dy) = \Vert G\Vert ^{-1} G(x, dy) m(dx)\), the above property also holds for almost every \((x, y) \in D \times D\) with respect to \(m \times m\). It follows that for every \(t > 0\),

$$\begin{aligned} \mathbb {P}^{x, y}((X(t), Y(t)) \in B)&\leqslant \mathbbm {1}_B(x, y) \end{aligned}$$

for almost every \((x, y) \in D \times D\) (with respect to \(m \times m\)). On the other hand, irreducibility of X(t) and Y(t) and independence of these processes imply that if \(m \times m(B) > 0\), then for almost every \((x, y) \in D \times D\) we have

$$\begin{aligned} \mathbb {P}^{x, y}((X(t), Y(t)) \in B)&> 0 . \end{aligned}$$

Thus, if \(m \times m(B) > 0\), then \(\mathbbm {1}_B(x, y) > 0\) for almost every \((x, y) \in D \times D\), that is, B is of full measure \(m \times m\). We conclude that either B or its complement has zero measure \(m \times m\). In the former case \({\bar{\mathbb {P}}}^\mu (I) = 0\), while in the latter \({\bar{\mathbb {P}}}^\mu (I) = 1\), and so \({\mathscr {I}}\) is indeed trivial. \(\square \)