Continuous-time perpetuities and time reversal of diffusions

We consider the problem of estimating the joint distribution of a continuous-time perpetuity and the underlying factors which govern the cash flow rate, in an ergodic Markov model. Two approaches are used to obtain the distribution. The first identifies a partial differential equation for the conditional cumulative distribution function of the perpetuity given the initial factor value, which under certain conditions ensures the existence of a density for the perpetuity. The second (and more general) approach identifies the joint law as the stationary distribution of an ergodic multi-dimensional diffusion using techniques of time reversal. This latter approach allows for efficient use of Monte-Carlo simulation when estimating the distribution, as the distribution is obtained by sampling a single path of the reversed process.


Introduction
Discussion. In this article, we consider a continuous-time perpetuity given by the random variable (0.1) $X_0 := \int_0^\infty D_t f(Z_t)\,dt$. Above, Z = (Z t ) t∈R + represents the value of an economic factor that determines a cash flow rate (f (Z t )) t∈R + . Cash flows are discounted according to D = (D t ) t∈R + ; therefore, X 0 represents the whole payment in units of account at time zero. Our main concern is the identification of the joint distribution of (Z 0 , X 0 ). As Z 0 is typically observable, the joint distribution of (Z 0 , X 0 ) allows one to obtain conditional distributions of X 0 given Z 0 .
In order to make the problem tractable, we work in a diffusive, Markovian environment where Z and D are solutions to the respective stochastic differential equations (written in integrated form) In the above equations, W and B are independent Brownian motions of dimension d and k respectively, while m, σ, a, θ and η are given functions. (Precise assumptions on all the model coefficients are given in Section 1.) We assume Z is stationary and ergodic with invariant density p. Equation where a represents a short-rate function. However, the more general form of (0.3) is considered to accommodate a broader range of situations; for example:
• payment streams are sometimes denominated in different units of account (for example, another currency, or financial assets), in which case discounting has to take into account the "exchange rate";
• for pricing purposes, the payment stream, though denominated in domestic currency, must incorporate both traditional discounting and the density of the pricing kernel.
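For orientation, the following is a sketch of one form of (0.2) and (0.3) that is consistent with the rest of the text: the generator associated to (m, c) with σ² = c, the substitution R := − log(D), the quantity θ′cθ + η′η appearing in the proofs, and the reduction of the discount factor to an exponential of an integral of a when θ = η ≡ 0. The signs and the placement of σ in the dW term are assumptions inferred from that context and should be read as such.

```latex
\begin{align*}
Z_t &= Z_0 + \int_0^t m(Z_s)\,ds + \int_0^t \sigma(Z_s)\,dW_s, \tag{0.2}\\
% assumed form: proportional dynamics driven by both W and B
D_t &= 1 - \int_0^t D_s\Bigl(a(Z_s)\,ds + \theta(Z_s)'\sigma(Z_s)\,dW_s
        + \eta(Z_s)'\,dB_s\Bigr). \tag{0.3}
\end{align*}
```

Under this reading, taking θ = η ≡ 0 gives $D_t = \exp(-\int_0^t a(Z_s)\,ds)$, which is the case in which a plays the role of a short rate.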
The two main results of the paper, Theorem 2.1 and Theorem 2.4, identify the distribution of (Z 0 , X 0 ) in different ways. First, in the case when η in (0.3) is non-degenerate and f in (0.1) is sufficiently regular, the conditional cumulative distribution function of X 0 given Z 0 is shown to coincide with the explosion probability of an associated locally elliptic diffusion and, hence, through the Feynman-Kac formula satisfies a partial differential equation (PDE): see Theorem 2.1. Second, for general η and f , using methods of diffusion time-reversal, we identify an "ergodic" process (ζ, χ) whose invariant distribution coincides with the joint distribution of (Z 0 , X 0 ). In particular, for any fixed starting point of χ, the empirical time-average laws of (ζ, χ) converge weakly to the joint distribution of (Z 0 , X 0 ) with probability one: see Theorem 2.4. The time-reversal result has the advantage of leading to an efficient method for obtaining the distribution via simulation, as the ergodic theorem enables estimation of the entire distribution based upon a single realization of (ζ, χ). However, it must be noted that the invariant distribution p for Z appears in the reversed dynamics and hence must be known to perform simulation. When Z is one-dimensional, or more generally reversing, p is given in explicit form with respect to the model parameters. In the general multi-dimensional setup, lack of knowledge of p could pose an issue; however, we provide a potential way to amend the situation in the discussion after Theorem 2.4. Note also that in the PDE result in Theorem 2.1, explicit knowledge of p is not necessary.
Existing literature and connections. Obtaining the distribution of the perpetuity X 0 is of great importance in the areas of finance and actuarial science; for this reason, perpetuities with a form similar to X 0 have been extensively studied. For example, [11] deals with the case where establishing that X 0 has an inverse gamma distribution. This fits into the set-up of (0.2), (0.3) by taking a = ν − σ 2 /2, f = 1, θ = 0 and η = σ. Note that here Z plays no role. In a similar manner, [31,9,10] consider the case and obtain the first moment, along with bounds for other moments, of X 0 . In [16], the perpetuity takes the form (0.4) $X_0 = \int_0^\infty e^{-Q_t}\,dP_t$, with P and Q being independent Lévy processes.
Under certain conditions on P and Q, the distribution of X 0 is implicitly calculated by identifying the characteristic function and/or Laplace transform for X 0 . In fact, the results of [16] are predated (for highly particular P and Q) by [24,21]. The Laplace transform method is also used in [26,25] to treat (0.4) when P t = t and Q is a diffusion. In addition to identifying a degenerate elliptic partial differential equation for the Laplace transform, they propose a candidate recurrent Markov chain whose invariant distribution has the law of X 0 . Lastly, the setup of [16] is significantly extended in [7] where, under minimal assumptions on P and Q, the distribution of X 0 is shown to coincide with the unique invariant measure for a certain generalized Ornstein-Uhlenbeck process, a relationship that is confirmed in our current setting in Proposition 6.2.
The use of time-reversal to identify the distribution of a discrete-time perpetuity is well known, dating at least back to [12], where X 0 takes the form $X_0 = \sum_{n=1}^\infty \bigl(\prod_{j=1}^n D_j\bigr) f_n$, where the discount factors (D n ) n∈N and cash flows (f n ) n∈N are two independent sequences of independent, identically distributed (iid) random variables. To provide insight, the time-reversal argument in [12] is briefly presented here. With $X_N := \sum_{n=1}^N \bigl(\prod_{j=1}^n D_j\bigr) f_n$, the iid assumption implies that $X_N$ has the same distribution as $\tilde X_N := D_N f_N + D_N D_{N-1} f_{N-1} + \dots + \bigl(\prod_{j=1}^N D_j\bigr) f_1$. Straightforward calculations show that the reversed process $(\tilde X_n)_{n\in N}$ satisfies the recursive equation $\tilde X_n = D_n(\tilde X_{n-1} + f_n)$. Thus, assuming that $(\tilde X_n)_{n\in N}$ converges to a random variable $\tilde X$ in distribution, $\tilde X$ must solve the distributional equation $\tilde X = D(\tilde X + f)$, where D, f and $\tilde X$ are independent, D has the same law as D 1 and f has the same law as f 1 . In [30] solutions to the aforementioned distributional equation are obtained based upon the expectation of log(|D|) and log + (|Df |). The tails of $\tilde X$, as well as convergence of iterative schemes, are studied in [14]; furthermore, [17] gives "almost" if and only if conditions for the convergence of iterative schemes.
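The discrete-time reversal argument can be checked numerically. The sketch below is illustrative only: the function names and the laws chosen for D and f (uniform and exponential) are assumptions made for the example and do not come from [12]. It shows that the forward partial sum over a sequence agrees path by path with the recursion x → D(x + f) run over the reversed sequence, which is what makes the two equal in law for iid inputs.

```python
import random


def forward_sum(discounts, cash_flows):
    """Forward partial sum: sum over n of (D_1 * ... * D_n) * f_n."""
    total, running_discount = 0.0, 1.0
    for d, f in zip(discounts, cash_flows):
        running_discount *= d
        total += running_discount * f
    return total


def reversed_recursion(discounts, cash_flows):
    """Iterate x -> D_n * (x + f_n), starting from x = 0."""
    x = 0.0
    for d, f in zip(discounts, cash_flows):
        x = d * (x + f)
    return x


def sample_perpetuity(n_steps, rng):
    """One approximate draw of the limit via the recursion.

    Hypothetical laws for illustration: D ~ Uniform(0.5, 0.95), f ~ Exp(1).
    """
    discounts = [rng.uniform(0.5, 0.95) for _ in range(n_steps)]
    cash_flows = [rng.expovariate(1.0) for _ in range(n_steps)]
    return reversed_recursion(discounts, cash_flows)


if __name__ == "__main__":
    D = [0.9, 0.5, 0.8]
    f = [1.0, 2.0, 3.0]
    # Same number from both constructions once the order is reversed.
    print(forward_sum(D, f), reversed_recursion(D[::-1], f[::-1]))
    print(sample_perpetuity(200, random.Random(0)))
```

The recursion only needs the current state, so it can be iterated as a Markov chain; the continuous-time construction in this paper plays the analogous role for diffusions.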
In a continuous time setting, we employ an argument similar in spirit, but rather different in execution, to [12]. Specifically, we extend X 0 to a whole "forward" process $X := (1/D)\int_\cdot^\infty D_t f(Z_t)\,dt$ and then, for each T > 0, define the reversed process (ζ T , χ T ) on [0, T ] by ζ T t := Z T −t , χ T t := X T −t : see (2.7), (2.8). Using results on time reversal of diffusions from [19] (alternatively, see [23,3,8,13]), as well as additional elementary calculations, we obtain the dynamics for (ζ T , χ T ). In fact, Proposition 5.5 shows the generator of (ζ T , χ T ) does not depend upon T and ergodicity can be studied for the process (ζ, χ) with the given generator. When |η| > 0 in E and f is sufficiently regular, this generator is locally elliptic and the associated process (ζ, χ) is ergodic with invariant distribution equalling that of (Z 0 , X 0 ): see Proposition 6.2. In the general case a slightly weaker (but still sufficient) form of ergodicity holds where, starting ζ from its invariant distribution p, the empirical time-average laws of (ζ, χ) converge almost surely in the weak topology for all starting points of χ.
Structure. This paper is organized as follows: in Section 1 we precisely state the given assumptions on the processes Z and D, as well as the function f , paying particular attention to deriving sharp conditions under which X 0 is almost surely finite or infinite. The main results are then presented in Section 2. First, when |η| > 0 in E and f is sufficiently regular, the conditional cumulative distribution function of X 0 given Z 0 = z is shown to satisfy a certain partial differential equation. Then, using the method of time reversal, we construct a probability space and diffusion (ζ, χ) such that with probability one its empirical time-average laws weakly converge to the joint distribution of (Z 0 , X 0 ) for all starting points of χ. Section 2 concludes with a brief discussion of how the distribution may be estimated via simulation, in particular proposing a method for obtaining the desired distribution when the invariant density p for Z is not explicitly known. The remaining sections contain the proofs: Section 3 proves the statements regarding the finiteness of X 0 ; Section 4 proves the partial differential equation result; Section 5 obtains the dynamics for the time-reversed process (ζ, χ); Section 6 proves the (weak) ergodicity with the correct invariant distribution. Finally, a number of technical supporting results are included in the appendix.
1. Problem Setup
1.1. Well-posedness and ergodicity. The first order of business is to specify precise coefficient assumptions so that Z in (0.2) and D in (0.3) are well-defined. As for Z, we work in the standard locally elliptic set-up for diffusions: for more information, see [27]. Let E ⊆ R d be an open, connected region. We assume the existence of γ ∈ (0, 1] such that: (A1) there exists a sequence of regions (E n ) n∈N such that $E = \bigcup_{n=1}^\infty E_n$, each E n being open, connected, bounded, with ∂E n being C 2,γ and satisfying $\bar E_n \subset E_{n+1}$ for all n ∈ N.
With the provisos in (A1) and (A2), define L Z as the generator associated to (m, c): $L^Z := \frac{1}{2}\sum_{i,j=1}^d c^{ij}\,\partial^2_{ij} + \sum_{i=1}^d m^i\,\partial_i$. In the sequel the summation signs will be omitted using Einstein's convention; therefore, L Z is written as $L^Z = \frac{1}{2} c^{ij}\,\partial^2_{ij} + m^i\,\partial_i$. Under (A1) and (A2), one can infer the existence of a solution to the martingale problem for L Z on E, with the possibility of explosion to the boundary of E: see [27]. We wish for something stronger; namely, to construct a filtered probability space (Ω, F, P) on which there is a strong, stationary, ergodic solution to the SDE in (0.2) with invariant density p. In (0.2), W is a d-dimensional Brownian motion and σ = √ c, the unique positive definite symmetric matrix such that σ 2 = c. In order to achieve this, we ask that (A3) The martingale problem for L Z on E is well posed and the corresponding solution is recurrent. Furthermore, there exists a strictly positive p ∈ C 2,γ (E, R) with $\int_E p(z)\,dz = 1$ satisfying $\tilde L^Z p = 0$, where $\tilde L^Z$ is the formal adjoint of L Z given by $\tilde L^Z := \frac{1}{2}\,\partial^2_{ij}\bigl(c^{ij}\,\cdot\,\bigr) - \partial_i\bigl(m^i\,\cdot\,\bigr)$. We summarize the situation in the following result: the extra Brownian motion B in its statement will be used to define the process D via (0.3) later on.
In this case, it holds that where K > 0 is a normalizing constant.
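For reference, the classical one-dimensional formula can be sketched as follows, assuming E is an interval, c = σ² > 0 on E, and writing z_0 ∈ E for an arbitrary reference point (a symbol introduced here for illustration). It follows from the zero-flux condition (1/2)(cp)′ = mp, which in turn gives (1/2)(cp)″ − (mp)′ = 0.

```latex
\[
  p(z) \;=\; \frac{K}{c(z)}\,
  \exp\!\left( 2\int_{z_0}^{z} \frac{m(y)}{c(y)}\,dy \right),
  \qquad z \in E,
\]
% K > 0 is fixed by the requirement \int_E p(z)\,dz = 1.
```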
In the multi-dimensional case, suppose that there exists a function H : E → R with the property  [6,27]. For example, if there exist a smooth function u : E → R, an integer N and constants ε > 0 and C > 0 such that L Z u ≤ −ε and u ≥ −C on E \ E N , then (A3) holds.
Given (A4) and all previous assumptions, it follows that (0.3) possesses a strong solution on (Ω, F, P) of Theorem 1.1; in fact, defining R := − log(D), it holds that Then, the following hold: i) There exists κ > 0 such that for all z ∈ E, P [lim t→∞ e κt D t = 0 | Z 0 = z] = 1. In particular, lim t→∞ e κt D t = 0 P-a.s.
As a partial converse to Lemma 1.3 we have We define L 1 (E, p) to be those Borel measurable functions g on E so that $\int_E |g(z)|\,p(z)\,dz < \infty$. Thus, Borel measurability is implicitly assumed throughout.
Remark 1.6. Let (A1), (A2), (A3) and (A4) hold, and assume that a is nonnegative. A combination of Lemma 1.3 and Lemma 1.5 yields sharp conditions for the finiteness of X 0 that do not require knowledge of p, at least for bounded f .
In view of Lemma 1.3, we ask that To recapitulate, for the remainder of the article the following is assumed: Assumption 1.7. We enforce throughout all above assumptions (A1), (A2), (A3), (A4) and (A5).

2.1. The distribution of X 0 via a partial differential equation. Define the cumulative distribution function g of X 0 given Z 0 by $g(z, x) := P[X_0 \le x \mid Z_0 = z]$ for $(z, x) \in F := E \times (0, \infty)$. Next, recall that Assumption 1.7 implies that Z 0 has a density p, and define the joint distribution π of (Z 0 , X 0 ) by $\pi(dz, dx) := p(z)\,g(z, dx)\,dz$. Under Assumption 1.7, as well as an additional smoothness requirement on f and non-degeneracy requirement on η, the first main result (Theorem 2.1 below) shows g solves a certain PDE on the state space F . This will imply that the joint distribution of (Z 0 , X 0 ) has a density (still labeled π) and the law of X 0 charges all of (0, ∞).
To motivate the result, as well as fix notation, for each x ∈ (0, ∞), consider the process $Y^x := (1/D)\bigl(x - \int_0^\cdot D_t f(Z_t)\,dt\bigr)$. Since Assumption 1.7 implies P [lim t→∞ D t = 0 | Z 0 = z] = 1 for all z ∈ E, it is clear that given Z 0 = z, on {X 0 < x} the process Y x tends to ∞. Alternatively, on {X 0 > x}, Y x will hit 0 at some finite time. What happens on {X 0 = x} is not immediately clear but it will be shown under the given assumptions there is no probability of this occurring. For fixed (z, x) ∈ F , it follows that 1 − g(z, x) equals the probability that Y x hits zero, given Z 0 = z. According to the Feynman-Kac formula, such probabilities "should" solve a PDE. To identify the PDE, note that the joint equations governing Z and Y x are for all (z, x) ∈ F . Note that if, in addition to Assumption 1.7, |η|(z) > 0, z ∈ E then A is locally elliptic. Let L be the second order differential operator associated to (A, b), i.e., Note that Lφ = L Z φ for functions φ of z ∈ E alone. With the previous notation, the first main result now follows.
Theorem 2.1. Let Assumption 1.7 hold, and suppose further that a) f ∈ C 1,γ (E; R + ) and b) |η(z)| > 0 for all z ∈ E. Then, g ∈ C 2,γ (F ) satisfies Lg = 0 with the following "locally uniform" boundary conditions: for each k ∈ N, $\lim_{n\to\infty} \sup_{z \in E_k,\, x \le 1/n} g(z, x) = 0$ and $\lim_{n\to\infty} \inf_{z \in E_k,\, x \ge n} g(z, x) = 1$. Furthermore, g is unique within the class of solutions to Lg = 0 taking values in [0, 1] with the above boundary conditions.
Remark 2.2. The non-degeneracy assumption on η is essential for the existence of a density; if η ≡ 0 it may be that the distribution of X 0 has an atom. Indeed, take f ≡ 1, a ≡ 1, η ≡ 0, θ ≡ 0. Then, $X_0 = \int_0^\infty e^{-t}\,dt = 1$ with probability one.
there are no natural auxiliary boundary conditions in the spatial domain of z ∈ E. In Subsection 2.2 that follows we provide an alternative, more useful method for estimating numerically the law of (Z 0 , X 0 ).

2.2. The distribution of (Z 0 , X 0 ) via diffusion time-reversal. The goal here is to show that the distribution of (Z 0 , X 0 ) coincides with the invariant distribution of a positive recurrent process (ζ, χ). In order to see the connection, extend X 0 to a whole process (X t ) t∈R + defined via $X_t := (1/D_t)\int_t^\infty D_s f(Z_s)\,ds$, $t \in R_+$, and note that (Z t , X t ) t∈R + is a stationary process under P. Fix T > 0, and define the process $(\zeta^T_t, \chi^T_t) := (Z_{T-t}, X_{T-t})$, $t \in [0, T]$. It still follows that (ζ T , χ T ) is stationary under P, with the same one-dimensional marginal distribution as (Z 0 , X 0 ). Furthermore, stationarity of (Z, X) clearly implies that the law of the process (ζ T , χ T ) does not depend on T (except for its time-domain of definition). Therefore, one may create a new process (ζ t , χ t ) t∈R + such that the law of (ζ T , χ T ) is the same as the law of (ζ t , χ t ) t∈[0,T ] for all T ∈ (0, ∞). If one can establish that (ζ, χ) is ergodic, then the distribution of (Z 0 , X 0 ) may be efficiently estimated via the ergodic theorem.
Towards this end, one needs to understand the behavior of (ζ, χ). Standard results (e.g. [19]) in the theory of time-reversal imply that ζ is a diffusion in its own filtration, and identify the corresponding coefficients. In order to deal with χ, we return to the definition of χ T and define yet Using all previous definitions, we obtain that As it turns out, one can describe the joint dynamics of (ζ T , ∆ T ) in appropriate filtrations (and these dynamics do not depend on T , as expected). To ease the presentation, recall from Section 1 that From the joint dynamics of (ζ T , ∆ T ) one obtains the joint dynamics of (ζ T , χ T ), which again do not depend on T . In particular, since ∆ T is a semimartingale, (2.10) yields that For a generic version (ζ, χ) with the same generator (which does not depend upon time) as (ζ T , χ T ) above, ergodicity of Z implies ergodicity of ζ (see Proposition 5.1 later on in the text).
Furthermore, χ is "mean reverting" as can easily be seen when θ ≡ 0, and a > 0, and continues to be true in the general case. Thus, one expects the empirical laws of (ζ, χ) to satisfy a certain strong law of large numbers, an intuition that is made precise in the following result.
where ζ 0 is an F 0 -measurable random variable with density p.
Define the process ∆ as the solution to the linear differential equation and then, for any x ∈ (0, ∞), define χ x as the solution to the linear differential equation Lastly, for any x ∈ (0, ∞) and T ∈ (0, ∞) define the empirical measure π x T via Then, there exists a set Ω 0 ∈ F ∞ with Q [Ω 0 ] = 1 such that where π is the joint distribution of (Z 0 , X 0 ) under P given in (2.2).
Remark 2.5. In the context of Theorem 2.4, note that the processes ∆ and χ x can be given in closed form in terms of ζ; indeed, In light of Theorem 2.4, one may estimate the joint distribution of (Z 0 , X 0 ) efficiently through Monte-Carlo simulation. However, the applicability of the result above depends heavily on whether or not the distribution p for Z 0 is known, as it (together with its gradient) appears in the dynamics of ζ. In the case where Z is one-dimensional, or more generally, reversing, p can be expressed in closed form from the model coefficients m and c in the dynamics for Z. Furthermore, there are certain cases of non-reversing, multi-dimensional diffusions, where p can be (semi-)explicitly computed, as the next example shows.
Example 2.6. Assume that Z is a multi-dimensional Ornstein-Uhlenbeck process with dynamics where γ ∈ R d×d , Θ ∈ R d , and σ ∈ R d×d . Here, E = R d and (A1) clearly holds. Furthermore (A2) is satisfied when c = σσ ′ is (strictly) positive definite; in fact, we take σ as the unique positive definite square root of c. The process Z need not be reversing, as can clearly be seen when σ is the identity matrix, Θ = 0 and γ is not symmetric. However, as will be argued below, the ergodic assumption (A3) holds when all eigenvalues of γ have strictly positive real part, and one may identify the invariant density "almost" explicitly. To see this, a direct calculation shows that if a symmetric matrix J satisfies the Riccati equation then the function with strictly negative real part and b) the eigenvalues of σ −1 γσ have strictly positive real part.
In the present case, each of these statements readily follows: for the first statement, one can take for the second statement, note that the eigenvalues of σ −1 γσ coincide with those of γ, which by assumption have strictly positive real part. Therefore, even in this non-reversing case one may still identify p.
Example 2.6 notwithstanding, for non-reversing, multi-dimensional diffusions, even after verifying the ergodicity of Z (and hence the existence of p) one does not typically know p explicitly. In such cases, the following simulation method is proposed: fix a large enough T and first simulate (Z t ) t∈[0,2T ] via (0.2), starting from any point Z 0 (since the invariant density is unknown). If the choice of T is large enough, the process (Z t ) t∈[T,2T ] will behave as the stationary version in (0.2), since Z T will have approximately density p. In that case, defining $\zeta_t := Z_{2T-t}$ for $t \in [0, T]$, ζ should behave as it should in the dynamics (5.7), even with ζ 0 having only (approximate) density p. Now, given ζ, χ x may be defined via the formulas of Remark 2.5; therefore, for large enough T , the empirical measure π x T should, with high probability, approximate in the weak sense the joint law π.
Note finally that when p is known and |η| > 0, and under certain mixing conditions (see [29,28]), one can also obtain uniform estimates for the speed at which the above convergence takes place.
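The burn-in and reversal procedure described above can be sketched in a few lines. The example below is a minimal illustration under strong assumptions that are not part of the paper: θ = η ≡ 0 (so that χ solves the ordinary differential equation dχ = (f(ζ) − a(ζ)χ) dt, obtained by differentiating X t = (1/D t ) ∫ t ∞ D s f(Z s ) ds and reversing time), a hypothetical one-dimensional Ornstein-Uhlenbeck factor, a constant rate a ≡ r, a cash-flow rate f(z) = 1 + z², and a plain Euler scheme. All function and parameter names are invented for the example.

```python
import math
import random


def simulate_reversed_pair(T=400.0, dt=0.01, x0=1.0, z_start=3.0, seed=0):
    """Burn-in / reversal sketch for theta = eta = 0 (hypothetical model).

    Z is an OU process dZ = -k Z dt + s dW, a(z) = r and f(z) = 1 + z^2.
    Returns a list of (zeta_t, chi_t) samples on a grid of [0, T).
    """
    rng = random.Random(seed)
    k, s, r = 1.0, 1.0, 0.5  # illustrative parameter values

    def f(z):
        return 1.0 + z * z

    def a(z):
        return r

    n = int(round(T / dt))
    sqrt_dt = math.sqrt(dt)

    # Step 1: Euler scheme for Z on [0, 2T], started away from stationarity.
    z = z_start
    z_path = [z]
    for _ in range(2 * n):
        z += -k * z * dt + s * sqrt_dt * rng.gauss(0.0, 1.0)
        z_path.append(z)

    # Step 2: reverse the second half: zeta_t = Z_{2T - t} for t in [0, T].
    zeta = [z_path[2 * n - i] for i in range(n + 1)]

    # Step 3: with theta = eta = 0, d(chi) = (f(zeta) - a(zeta) * chi) dt.
    chi = x0
    samples = []
    for i in range(n):
        samples.append((zeta[i], chi))
        chi += (f(zeta[i]) - a(zeta[i]) * chi) * dt
    return samples


def empirical_mean_chi(samples):
    """Time average of chi, i.e. the empirical measure applied to h(z, x) = x."""
    return sum(c for _, c in samples) / len(samples)


if __name__ == "__main__":
    print(empirical_mean_chi(simulate_reversed_pair()))
```

As a sanity check for this particular toy model, a constant rate r gives E[X 0 ] = E p [f]/r = 1.5/0.5 = 3, so the printed time average should be close to 3. Any other bounded test function of (ζ, χ) can be averaged over the same single path in the same way.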
Remark 2.7. In the case when θ = η ≡ 0 and f ∈ C 1,γ (E; R + ), it is possible to explicitly identify the support of π. Such an identification follows from more general ergodic results on "stochastic differential systems" obtained in [5,4]. To identify the support, note that when θ = η ≡ 0, it follows that $\Delta_t = \exp\bigl(-\int_0^t a(\zeta_u)\,du\bigr)$. A direct calculation using Remark 2.5 shows that χ x has dynamics $d\chi^x_t = \bigl(f(\zeta_t) - a(\zeta_t)\chi^x_t\bigr)\,dt$, $\chi^x_0 = x$. Hence, the paths of χ x are of bounded variation. Now, define Assumption 1.7 implies a(z 0 ) > 0 for some z 0 ∈ E and thus $0 \le \hat l \le \hat u \le \infty$ with $\hat l = \hat u$ if and only if for some constant c, f (z) = ca(z) for all z ∈ E. In this case, X = c P z almost surely for all z ∈ E. With this notation, [5] proves:

Proofs from Section 1.2
We present here the proofs of Lemma 1.3 and Lemma 1.5.
Proof of Lemma 1.3. Let ε > 0 be as in (1.4) and denote by P z the probability obtained by conditioning upon Z 0 = z. The positive recurrence of Z implies ([27, Theorem 4.9.5]) that there exists a P z -a.s. finite random variable T (z) such that and hence the first conclusion of Lemma 1.3 holds. Furthermore, since Z is stationary and ergodic under P, the ergodic theorem implies there is a P a.s.
where the last inequality follows by the regularity of a and the non-explosivity of Z. Thus By the stationarity of Z: Assume now that θ ′ cθ + η ′ η ≢ 0, which by continuity of all involved functions implies that By the Dambis, Dubins and Schwarz theorem and the strong law of large numbers for Brownian motion, it follows that there exists a P z -a.s. finite random variable T (z) such that With Hence the first part of Lemma 1.3 holds true again. Additionally, the ergodic theorem applied with P gives a P-a.s. finite random variable T such that t ≥ T implies −R t ≤ −2κt. Again, for n ∈ N such that n > 1/(2κ) we have from which P [X 0 < ∞] = 1 follows by the same line of reasoning as above.
finite random variable T such that −R t ≥ κt holds for t ≥ T . This gives that Ergodicity of Z implies that P almost surely Let (F n ) n∈N be an increasing sequence of smooth, bounded, open, connected domains of F such that F = ∪ n F n . Note that F n can be obtained by smoothing out the boundary of E n × (1/n, n).
By uniqueness of solutions to the generalized martingale problem, for each n, the law of ( Z, Y ) is the same as the law of (Z, Y x ) under P [· | Z 0 = z] (where the latter will always denote a version of the conditional probability) up until the first exit time of F n . Furthermore, since the process Z is recurrent, with (P z ) z∈E being the restriction of (P z,x ) (z,x)∈F to the first d coordinates, for z ∈ E, the law of Z under P z is the same as the law of Z under P [ · | Z 0 = z]. For these reasons, and in order to ease the reading, we abuse notation and still use (Z, Y ) instead of ( Z, Y ) for the coordinate process on Ω. The underlying space we are working on will be clear from the context.
Denote by τ n the first exit time of (Z, Y ) from F n . Assumption 1.7 implies Z does not explode under P z,x and Y cannot explode to infinity since D is strictly positive almost surely under P [ · | Z 0 = z] for all z ∈ E. Therefore, the explosion time τ := lim n→∞ τ n for (Z, Y ) is the first hitting time of Y to 0 and the law of τ under P z,x is the same as the law of the first hitting of Y x Therefore, Therefore, if h(z, x) is continuous it follows that h(z, x) = g(z, x). It is now shown that in fact h is in C 2,γ (F ) and satisfies Lh = 0. This gives the desired result for g since g = h.
Let ψ : (0, ∞) → (0, 1) be a smooth function such that lim x→0 ψ(x) = 0, lim x→∞ ψ(x) = 1. By the classical Feynman-Kac formula Therefore, (P z,x ) (z,x)∈F is transient [27,Chapter 2] and, since (P z ) z∈E is positive recurrent, this implies that for all (z, x), with P z,x -probability one, either lim t→τ Y t = 0 or lim t→τ Y t = ∞, where in the latter case, τ = ∞ since Y cannot explode to ∞. This in turn yields that Y τn → 0 or Y τn → ∞ with P z,x -probability one and hence by the dominated convergence theorem show for each ε > 0 there is some n(ε) such that (4.6) sup The condition near x = 0 is handled first. By way of contradiction, assume there exists some ε > 0 such that for all integers n there exists z n ∈ E k , x n ≤ 1/n such that g(z n , x n ) > ε. Since the z n are all contained within E k there is a sub-sequence (still labeled n) such that z n → z for z ∈Ē k .
The proof for x → ∞ is very similar. Assume by contradiction that there is some ε > 0 such that for all integers n there exist z n ∈ E k , x n ≥ n such that g(z n , x n ) < 1 − ε. Again, by taking sub-sequences, it is possible to assume z n → z ∈Ē k . Fix M > 0. For n ≥ M , since g is increasing in x, g(z n , M ) < 1 − ε. Since g is continuous, g(z, M ) ≤ 1 − ε. Since this holds for all M , lim x→∞ g(z, x) ≤ 1 − ε. But, this violates the condition that under P [· | Z 0 = z], X 0 < ∞ almost surely.

Dynamics for the Time-Reversed Process
The goal of the next two sections is to prove Theorem 2.4. We keep all notation from Subsection 2.2. We first identify the dynamics for ζ T . The operator L ζ does not depend upon T . Thus, if (Q z ) z∈E denotes the solution of the generalized martingale problem for L ζ on E, then in fact (Q z ) z∈E solves the martingale problem for L ζ on E and is positive recurrent.
Remark 5.2. If Z is reversing then p satisfies m = (1/2) (c∇p/p + div (c)). Thus, in this instance, µ = m and, as the name suggests, ζ T has the same dynamics as Z.
Proof. The first statement regarding the martingale problem is based on the argument in [19].
Since Z is positive recurrent with invariant measure p and Z 0 has initial distribution p under P, Z is stationary with distribution p. SinceL Z p = 0, equation (2.5) in [19] holds noting that p does not depend upon t.
For a given 0 ≤ s ≤ t ≤ T and g ∈ C ∞ c (E) define the function v(s, z) := E [g(Z t ) | Z s = z]. The Feynman-Kac formula implies v satisfies v s + L Z v = 0 on 0 < s < t, z ∈ E with v(t, z) = g(z) : see [20,18] for an extension of the classical Feynman-Kac formula to the current setup. Therefore, the condition in equation (2.7) of [19] holds as well. Thus, the formal argument on page 1191 of [19] is rigorous and the law of ζ T under P solves the martingale problem for L ζ .
Turning to the statement regarding (Q z ) z∈E , set $\tilde L^\zeta$ as the formal adjoint to L ζ . $\tilde L^\zeta$ is given by with µ replacing m: where the third equality follows from (5.1). Thus, Assumption 1.7 (specifically the fact that Z is ergodic and $\int_E p(z)\,dz = 1$) implies the diffusion for $\tilde L^{\zeta,p}$ not only does not explode but also is positive recurrent, finishing the proof.
In preparation for the proof of the main result of this section, which is Proposition 5.5, we first need to define a certain "backwards" filtration G T and to present two lemmas. Fix T ∈ (0, ∞) and t ∈ [0, T ] and let G T t be the σ-field generated by X T , Then, let G T := (G T t ) t∈[0,T ] be the usual augmentation of ( G T t ) t∈[0,T ] . It is easy to check that (χ T , ζ T ) is G T -adapted for all T ∈ R + , as well as that the process B T defined via B T t := B T −t − B T is a k-dimensional Brownian motion on (Ω, G T , P), independent of (χ T 0 , ζ T 0 ) = (X T , Z T ). However, the G T -adapted process (W T −t − W T ) t∈[0,T ] is not necessarily a Brownian motion on (Ω, G T , P).
With this notation, the following two Lemmas are essential for proving Proposition 5.5.
Proof. Fix 0 ≤ s ≤ t ≤ T . For each n ∈ N and i ∈ {0, . . . , n}, let First, assume that η is twice continuously differentiable. The standard convergence theorem for stochastic integrals implies that (the following limit is to be understood in measure P): Since B and Z are independent, by Ito's formula the last quadratic covariation is zero. Therefore, (5.3) holds for twice continuously differentiable η. The fact that (5.3) holds whenever η is locally bounded follows from a monotone class argument.
In a similar manner, assume that θ is twice continuously differentiable. The standard convergence theorem for stochastic integrals implies that The last quadratic covariation process (without the minus sign) is equal to whereF (c, θ) : E → R is given bỹ since c ′ = c. Thus, (5.4) is established in the case where θ is twice continuously differentiable.
The fact that (5.4) holds whenever θ is continuously differentiable follows from a density argument, noting that there exists a sequence (θ n ) n∈N of polynomials such that lim n→∞ θ n = θ and lim n→∞ ∇θ n = ∇θ both hold, where the convergence is uniform on compact subsets of E.
Proof of Proposition 5.5. Proposition 5.1 immediately implies that under P, ζ T has dynamics: is a Brownian motion on (Ω, G T , P). In order to specify the dynamics for χ T , recall the definition of ∆ T from (2.9). Observe that Then, using the definitions of χ T , ζ T and ∆ T , the above is rewritten as Lemma 5.4 implies ∆ T is a semimartingale, and hence (5.9) yields The result now follows by plugging in for d∆ T u /∆ T u from (5.6).
6. Proof of Theorem 2.4
6.1. Preliminaries. We first prove two technical results. The first asserts the existence of a probability space and stationary processes (ζ, χ) consistent with (ζ, χ x ) in Theorem 2.4 in that The second proposition shows that under the nondegeneracy assumption |η|(z) > 0, z ∈ E and regularity assumption f ∈ C 2 (E; R + ) it follows that (ζ, χ) is ergodic.
Lemma 6.1. Let Assumption 1.7 hold. Then, there is a filtered probability space (Ω, F, Q), supporting independent d and k dimensional Brownian motions W and B, F 0 measurable random variables ζ 0 , χ 0 with joint distribution π, as well as a stationary process ζ with dynamics Furthermore, with ∆, χ x defined as in (2.12), (2.13), if the process χ is defined by χ t := χ χ 0 t (see Remark 2.5) then (ζ, χ) are stationary with invariant measure π and joint dynamics
Proof. This result follows from Proposition 5.1. Indeed, one can start with a probability space (Ω, F, Q) supporting independent d and k dimensional Brownian motions W and B respectively, as well as a F 0 measurable random variable (ζ 0 , χ 0 ) ∼ π (hence independent of W and B). Under the given regularity assumptions, Proposition 5.1 yields a strong, stationary solution ζ satisfying (6.1). Then, defining ∆ as in (2.9) and, for x > 0, χ x as in (2.13), it follows that (ζ, χ x ) and hence (ζ, χ) satisfy the SDE in (6.2). Under the given regularity assumptions the law under P of (ζ T , χ T ) given ζ T 0 = z, χ T 0 = x coincides with the law under Q of (ζ, χ x ) given that ζ 0 = z. Since by construction, π is an invariant measure for (ζ T , χ T ), it follows from the Markov property that π is invariant for (ζ, χ) under Q and hence (ζ, χ) is stationary with invariant measure π.
Define the measures Q z,x for (z, x) ∈ F via We now consider the case when |η| > 0 on E and f ∈ C 2 (E; R + ). According to Theorem 2.1, g ∈ C 2,γ (F ) and hence π possesses a density satisfying Additionally, we have the following Proposition: Proposition 6.2. Let Assumption 1.7 hold, and additionally suppose that |η|(z) > 0 for z ∈ E and f ∈ C 2 (E; R + ). Then the process (ζ, χ) from Lemma 6.1 is ergodic. Thus, for all bounded measurable functions h on F and all (z, x) ∈ F (6.5) lim . (6.6) From (6.2) it is clear that the generator for (ζ, χ) is L R := (1/2)A ij ∂ 2 ij + (b R ) i ∂ i . As an abuse of notation, let (Q z,x ) (z,x)∈F also denote the solution to the generalized martingale problem for L R on F . Using Theorem 2.1, and the fact that under the given coefficient regularity assumptions, g ∈ C 3 (F ) (see [15,Ch. 6]), a lengthy calculation performed in Lemma A.1 below shows that the density π from (6.4) solves $\tilde L^R \pi = 0$, where $\tilde L^R$ is the formal adjoint to L R . Since by construction, $\int_F \pi(z, x)\,dz\,dx = 1$, positive recurrence will follow once it is shown that (Q z,x ) (z,x)∈F is recurrent. By Proposition 5.1, the restriction of Q z,x to the first d coordinates (i.e. the part for ζ) is positive recurrent. Since by (2.13) it is evident that χ does not hit 0 in finite time, it follows that χ does not explode under Q z,x . Thus, [27,Corollary 4.9.4] shows that (ζ, χ) is recurrent. Now, that so that it contains a one dimensional Brownian motionB which is independent of Z 0 , W and B.
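In a similar manner, by enlarging the probability space $(\Omega, \mathcal{F}, Q)$ of Lemma 6.1 to include a Brownian motion (still labeled $\tilde{B}$) which is independent of $\zeta_0$, $\chi_0$, $W$ and $B$, and defining the families of processes $(\Delta^\varepsilon)_{\varepsilon>0}$ and $(\chi^{\varepsilon,x})_{\varepsilon>0}$ for $x > 0$ according to (6.8), it follows that $(\zeta, \chi^{\varepsilon,x})$ solves the SDE (6.9). Since $|\eta_\varepsilon| \ge \sqrt{\varepsilon} > 0$, Proposition 6.2 shows that for $f \in C^2(E; \mathbb{R}_+)$ the generator $L^{\varepsilon,R}$ associated to (6.9) is positive recurrent with invariant density $\pi^\varepsilon$. Thus, for all $(z, x) \in F$ and all bounded measurable functions $h$ on $F$ (note that conditioned upon $\chi_0 = x$ we have $\chi^{\varepsilon,x}_0 = \chi^x_0 = x = \chi_0$):
$\lim_{T\to\infty} \frac{1}{T}\int_0^T h(\zeta_t, \chi^{\varepsilon,x}_t)\,dt = \int_F h\,d\pi^\varepsilon$, $Q^{z,x}$ almost surely.
With all the notation in place, Theorem 2.4 is the culmination of a number of lemmas, which are now presented. The first lemma implies that $\pi^\varepsilon$ converges weakly to $\pi$ as $\varepsilon \downarrow 0$.
Lemma 6.3. Let Assumption 1.7 hold. Define $X^\varepsilon$ as in (6.7). Then $X^\varepsilon$ converges to $X$ in $P$-probability as $\varepsilon \to 0$.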
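Proof of Lemma 6.3. Denote by $\mathcal{G}$ the sigma-field generated by $Z_0$, $W$ and $B$. Set $\delta^\varepsilon_t := D^\varepsilon_t/D_t = \mathcal{E}(\sqrt{\varepsilon}\tilde{B})_t$, the stochastic exponential of $\sqrt{\varepsilon}\tilde{B}$. By the independence of $\delta^\varepsilon$ and $\mathcal{G}$, the conditional expectation $E[|X^\varepsilon - X| \mid \mathcal{G}]$ is controlled by $\int_0^\infty D_t h^\varepsilon_t f(Z_t)\,dt$, where $h^\varepsilon$ is a deterministic function determined by the law of $\delta^\varepsilon$. By assumption, $P[X < \infty] = 1$. Since for any $\varepsilon > 0$, $\sup_{t\ge 0}\delta^\varepsilon_t < \infty$ $P$-almost surely, it follows that $P[X^\varepsilon < \infty] = 1$. The dominated convergence theorem applied path-wise (recall that there exists a $\kappa > 0$ so that $e^{\kappa t}D_t \to 0$ $P$-almost surely) then gives $\lim_{\varepsilon\to 0} E[|X^\varepsilon - X| \mid \mathcal{G}] = 0$, which shows that the pair $(Z_0, X^\varepsilon)$ converges in probability to $(Z_0, X)$, finishing the proof.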
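The convergence in Lemma 6.3 can be illustrated numerically. The following is a minimal sketch under assumptions that are not part of the paper: a toy Ornstein-Uhlenbeck factor, a deterministic discount factor $D_t = e^{-at}$, the cash flow function $f(z) = z^2$, an Euler discretization, and a truncation of the infinite horizon. The function name and all parameters are hypothetical. It multiplies $D$ by the stochastic exponential of $\sqrt{\varepsilon}$ times an independent Brownian motion and reports the mean of $|X^\varepsilon - X|$ for several values of $\varepsilon$.

```python
import numpy as np

def perturbed_perpetuities(eps_values, n_paths=200, T=40.0, dt=0.01, a=1.0, seed=0):
    """Mean of |X^eps - X| over simulated paths, for each eps (toy model)."""
    rng = np.random.default_rng(seed)
    n = round(T / dt)
    t = dt * np.arange(n)
    # Assumed toy factor: stationary OU process dZ = -Z dt + sqrt(2) dW, Z_0 ~ N(0, 1).
    z = np.empty((n_paths, n))
    z[:, 0] = rng.standard_normal(n_paths)
    dw = np.sqrt(dt) * rng.standard_normal((n_paths, n - 1))
    for i in range(n - 1):
        z[:, i + 1] = z[:, i] - z[:, i] * dt + np.sqrt(2.0) * dw[:, i]
    # Discounted cash flow D_t f(Z_t) with the assumed D_t = exp(-a t), f(z) = z^2.
    cash = np.exp(-a * t) * z**2
    x = cash.sum(axis=1) * dt  # Riemann sum for X, truncated at T
    # Independent Brownian motion driving the perturbation of D.
    increments = np.sqrt(dt) * rng.standard_normal((n_paths, n - 1))
    b = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(increments, axis=1)], axis=1)
    errors = {}
    for eps in eps_values:
        # Stochastic exponential of sqrt(eps) * B: exp(sqrt(eps) B_t - eps t / 2).
        delta = np.exp(np.sqrt(eps) * b - 0.5 * eps * t)
        x_eps = (cash * delta).sum(axis=1) * dt
        errors[eps] = float(np.mean(np.abs(x_eps - x)))
    return errors

errors = perturbed_perpetuities([0.5, 0.05, 0.0005])
for eps, err in errors.items():
    print(f"eps={eps}: mean |X^eps - X| = {err:.4f}")
```

In this toy setting the reported error shrinks as $\varepsilon$ decreases, in line with the lemma; the sketch says nothing about rates in the general model.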
Next, define $C$ as the class of (Borel measurable) functions $h$ on $F$ which are bounded and Lipschitz in $x$, uniformly in $z$; the precise condition is (6.11). The next lemma gives a weak form of the convergence in Theorem 2.4 for regular $f$. Note that the notation $Q\text{-}\lim_{T\to\infty}$ stands for the limit in $Q$-probability as $T \to \infty$.
Lemma 6.4. Let Assumption 1.7 hold. Assume additionally that $f \in C^2(E; \mathbb{R}_+)$. Then for all $x > 0$ and all $h \in C$:
(6.12) $Q\text{-}\lim_{T\to\infty} \frac{1}{T}\int_0^T h(\zeta_t, \chi^x_t)\,dt = \int_F h\,d\pi$.
Proof of Lemma 6.4. For ease of presentation we adopt the following notational conventions. First, for any measurable function $h$ and probability measure $\nu$ on $F$ set
(6.13) $\langle h, \nu\rangle := \int_F h\,d\nu$.
Thus, the above limit holds Q almost surely, and hence in probability.
To prove (6.12) we need to show that for any increasing R + -valued sequence (T n ) n∈N such that as this implies (6.12) by considering double sub-sequences. To this end, let (ε k ) k∈N be any strictly positive sequence that converges to zero, and assume that ε 1 < κ, where κ > 0 is from Assumption (A5). Next, pick T n k large enough so that k/T n k → 0 and such that As argued above, this is possible since h,π ε k ,x T converges to h, π ε k in Q probability. Since Lemma 6.3 implies lim ε→0 h, π ε k = h, π it follows that it suffices to show In fact, the claim is that From (6.11): Furthermore, recall that where ∆ ε k is from (6.8). With δ ε k := E √ ε kB it follows that under Q With G now denoting the σ field generated by ζ 0 , W and B, by the independence ofB and G it follows that where for any ε > 0, h ε is from Lemma 6.3. Since ζ is stationary under Q, it holds for all t > 0 that the distribution of ∆ t under Q coincides with the distribution of D t under P and the distribution of t 0 (∆ t /∆ u )h ε k t−u f (ζ u )du under Q is the same as the distribution of t 0 D u h ε k u f (Z u )du under P. We next claim there exists a sequence δ k → 0 such that (6.18) sup This is shown at the end of the proof. Admitting this fact, and using In the above, the first inequality holds because of (6.17) and the second by (6.18) and the fact that for any r.v.
The last equality follows by construction of δ k .
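Since $T_{n_k}$ was chosen so that $\lim_{k\to\infty}(k/T_{n_k}) = 0$, the preceding bound vanishes as $k \to \infty$, which in view of (6.16) implies (6.15), finishing the proof. Thus, it remains to show (6.18). Since for any $a, b > 0$, $1 \wedge (a + b) \le 1 \wedge a + 1 \wedge b$, the two terms on the right hand side of (6.18) are treated separately. Let $\delta_k > 0$. First, we have $h^{\varepsilon_k}_t \le e^{(\varepsilon_k/2)t}$, so on $t \ge k$, $e^{\kappa t}/h^{\varepsilon_k}_t \ge e^{(\kappa - \varepsilon_k/2)t} \ge e^{(\kappa - \varepsilon_k/2)k}$ since $\varepsilon_k/2 < \kappa$. So, for any $\delta_k > e^{-(\kappa - \varepsilon_k/2)(k/2)}$ it follows that
$P[xD_t h^{\varepsilon_k}_t > \delta_k] \le P[xD_t e^{\kappa t} \ge e^{(\kappa - \varepsilon_k/2)(k/2)}]$.
Set $\tilde{\delta}_k := \sup_{t\ge k} P[xD_t e^{\kappa t} \ge e^{(\kappa - \varepsilon_k/2)(k/2)}]$. Since $D_t e^{\kappa t}$ goes to 0 in $P$-probability, it follows that $\tilde{\delta}_k \to 0$. Thus, taking $\delta_k$ to be the maximum of $\tilde{\delta}_k$ and $e^{-(\kappa - \varepsilon_k/2)(k/2)}$, it follows that $\sup_{t\ge k} P[xD_t h^{\varepsilon_k}_t > \delta_k] \le \delta_k$ and $\delta_k \to 0$.
Turning to the second term in (6.18), it is clear that $\int_0^t D_u h^{\varepsilon_k}_u f(Z_u)\,du \le \int_0^\infty D_u h^{\varepsilon_k}_u f(Z_u)\,du$ for all $t \ge 0$, because the integrand is nonnegative. As shown in the proof of Lemma 6.3, $\int_0^\infty D_u h^{\varepsilon_k}_u f(Z_u)\,du$ goes to 0 as $k \to \infty$ almost surely. Thus, by the bounded convergence theorem, there is a sequence $\delta_k \to 0$ such that $P[\int_0^\infty D_u h^{\varepsilon_k}_u f(Z_u)\,du > \delta_k] \le \delta_k$. This concludes the proof, since to combine the two terms one can take $\delta_k$ to be twice the maximum of the $\delta_k$'s for the individual terms.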
The next lemma proves the convergence in Lemma 6.4 for f ∈ L 1 (E, p), not just f ∈ C 2 (E; R + ). Lemma 6.5. Let Assumption 1.7 hold. Then for all x > 0 and all h ∈ C: Proof of Lemma 6.5. By mollifying f , since p is tight in E there exists a sequence of functions Note that Thus, by the Borel-Cantelli lemma it follows that P almost surely For n > κ from Assumption 1.7, let A n = n −1 sup t∈R + (e t/n D t ). Note that lim n→∞ A n = 0 almost surely since for each δ > 0 we can find a P almost surely finite random variable T = T (δ) so that D t ≤ δe −κt for t ≥ T , and hence we see that Thus, with X n 0 := ∞ 0 D t f n (Z t )dt, we have lim n→∞ X n 0 = X 0 almost surely, and hence if π n is the joint distribution of (Z 0 , X n 0 ) then π n converges to π weakly, as n → ∞. Now, on the same probability space as in Lemma 6.1 define and by construction the law of the process on the right hand side above under Q is the same as By (6.21) we can find a non-negative sequence (δ n ) such that δ n → 0 and lim n→∞ φ n (δ n ) = 0. Now, for h ∈ C we have almost surely for t ≥ 0: Therefore, with π̂ x,n T denoting the empirical law of (ζ, χ n,x ) on [0, T ] we have Since for any 0 < δ < 1 and random variable Y we have E [1 ∧ |Y |] ≤ δ + P [|Y | > δ] it follows that for any n and hence for the given sequence (δ n ): Now, fix a sequence (T k ) such that lim k→∞ T k = ∞. Since Lemma 6.4 implies that Q − lim T →∞ | ⟨h, π̂ x,n T ⟩ − ⟨h, π n ⟩ | = 0 for each n, we can find a T kn so that Q [ | ⟨h, π̂ x,n T kn ⟩ − ⟨h, π n ⟩ | > 1/n ] < 1/n. It thus follows that Q − lim n→∞ ( ⟨h, π̂ x,n T kn ⟩ − ⟨h, π n ⟩ ) = 0.
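Since $\lim_{n\to\infty}(\langle h, \pi^n\rangle - \langle h, \pi\rangle) = 0$, it follows by (6.22) that for each $\gamma > 0$,
$\lim_{n\to\infty} Q[|\langle h, \hat{\pi}^x_{T_{k_n}}\rangle - \langle h, \pi\rangle| > \gamma] = 0$.
We have just shown that for any sequence $(\langle h, \hat{\pi}^x_{T_k}\rangle)$ there is a subsequence $(\langle h, \hat{\pi}^x_{T_{k_n}}\rangle)$ which converges in $Q$-probability to $\langle h, \pi\rangle$. This in fact proves that $(\langle h, \hat{\pi}^x_T\rangle)$ converges in $Q$-probability to $\langle h, \pi\rangle$, proving (6.19).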
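The elementary bound $E[1 \wedge |Y|] \le \delta + P[|Y| > \delta]$ used in the proof of Lemma 6.5 can be verified in one line. The following worked restatement, for a generic random variable $Y$ and $0 < \delta < 1$, adds nothing to the argument; the converse direction via Markov's inequality is included only to make explicit why controlling $E[1 \wedge |Y|]$ is equivalent to controlling convergence in probability.

\[
1 \wedge |Y| \;\le\; \delta\,\mathbf{1}_{\{|Y| \le \delta\}} + \mathbf{1}_{\{|Y| > \delta\}}
\quad\Longrightarrow\quad
E[1 \wedge |Y|] \;\le\; \delta + P[|Y| > \delta],
\]
\[
P[|Y| > \delta] \;=\; P[1 \wedge |Y| > \delta] \;\le\; \frac{1}{\delta}\,E[1 \wedge |Y|].
\]

Consequently, a sequence $(Y_n)$ converges to 0 in probability if and only if $E[1 \wedge |Y_n|] \to 0$.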
The next lemma strengthens the convergence in Lemma 6.5 to almost sure convergence under Q, but for π almost every x > 0, for h ∈ C from (6.11).
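Lemma 6.6. Let Assumption 1.7 hold. Then for all $h \in C$ and $\pi$-almost every $x > 0$:
(6.23) $\lim_{T\to\infty} \frac{1}{T}\int_0^T h(\zeta_t, \chi^x_t)\,dt = \int_F h\,d\pi$, $Q$ almost surely.
Proof of Lemma 6.6. We again use the notation in (6.13). Recall $\chi$ from Lemma 6.1 and define $\hat{\pi}_T$ as the empirical law of $(\zeta, \chi)$ on $[0, T]$. Given that $(\zeta, \chi)$ is stationary under $Q$, the ergodic theorem implies that for all bounded measurable functions $h$ on $F$ there is a random variable $Y$ such that
(6.24) $\lim_{T\to\infty}\langle h, \hat{\pi}_T\rangle = Y$, $Q$ almost surely.
By Lemma 6.5 it holds that for $h \in C$, $Y = \langle h, \pi\rangle$ with $Q$-probability one. Indeed, let $\delta > 0$ and bound $Q[|Y - \langle h, \pi\rangle| > \delta]$ by two terms, comparing $Y$ and $\langle h, \pi\rangle$ in turn with $\langle h, \hat{\pi}_T\rangle$. The first of these two terms goes to 0 by (6.24). As for the second, denote by $\pi|_x$ the marginal of $\pi$ with respect to $\chi$. Then the second term can be written as an integral against $\pi|_x$ over the starting point $x$. By Lemma 6.5 the integrand goes to 0 as $T \to \infty$ for all $x > 0$, and thus the result follows by the bounded convergence theorem. Next, conditioning the $Q$-almost sure limit in (6.24) on the starting point $\chi_0 = x$ in the same way shows that (6.23) holds for $\pi$-almost every $x > 0$, finishing the proof.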
The last preparatory lemma strengthens Lemma 6.6 to show almost sure convergence for all starting points x > 0, not just π almost every x > 0.
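Lemma 6.7. Let Assumption 1.7 hold. Then for all $h \in C$ and all $x > 0$:
(6.25) $\lim_{T\to\infty} \frac{1}{T}\int_0^T h(\zeta_t, \chi^x_t)\,dt = \int_F h\,d\pi$, $Q$ almost surely.
Proof of Lemma 6.7. Recall from Remark 2.5 that $\chi^x$ takes the form
(6.26) $\chi^x_t = x\Delta_t + \int_0^t (\Delta_t/\Delta_u) f(\zeta_u)\,du$.
Let $h \in C$. By Lemma 6.6, there is some $x_0 > 0$ such that (6.25) holds. Using the notation in (6.13) and (6.26), it easily follows for any $x > 0$ that $|\langle h, \hat{\pi}^x_T\rangle - \langle h, \hat{\pi}^{x_0}_T\rangle|$ is bounded by a constant multiple of $\frac{|x - x_0|}{T}\int_0^T \Delta_t\,dt$, because $\chi^x_t - \chi^{x_0}_t = (x - x_0)\Delta_t$ and $h$ is Lipschitz in $x$. We will show below that $Q[\int_0^\infty \Delta_t\,dt < \infty] = 1$. Admitting this, it holds $Q$-almost surely that $\lim_{T\to\infty}|\langle h, \hat{\pi}^x_T\rangle - \langle h, \hat{\pi}^{x_0}_T\rangle| = 0$, and hence the result follows since (6.25) holds for $x_0$. It remains to prove that $Q[\int_0^\infty \Delta_t\,dt < \infty] = 1$. By way of contradiction, assume there is some $0 < \delta \le 1$ so that $Q[\int_0^\infty \Delta_t\,dt = \infty] \ge \delta$.
With all the above lemmas, the proof of Theorem 2.4 is now given.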
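The role of Lemma 6.7 in the single-path sampling scheme can be sketched numerically. The example below is a minimal illustration under assumptions that are not taken from the paper: a reversible toy factor (an Ornstein-Uhlenbeck process, whose time reversal has the same dynamics), a deterministic discount rate $a$ with no stochastic discounting, $f(z) = z^2$, and an Euler scheme. In that special case the reversed dynamics reduce to $d\chi_t = (f(\zeta_t) - a\chi_t)\,dt$ and $\Delta_t = e^{-at}$. The function name and parameters are hypothetical. Two starting points are driven by the same path of $\zeta$, so the time averages should agree up to a term of order $|x - x_0|/(aT)$.

```python
import numpy as np

def reversed_path_averages(x_starts, T=2000.0, dt=0.02, a=1.0, seed=1):
    """Time averages of chi^x along ONE path of the (toy) reversed factor zeta."""
    rng = np.random.default_rng(seed)
    n = round(T / dt)
    noise = np.sqrt(2.0 * dt) * rng.standard_normal(n)
    zeta = rng.standard_normal()            # zeta_0 drawn from the invariant law N(0, 1)
    chi = np.array(x_starts, dtype=float)   # one chi^x per starting point, same driving path
    running = np.zeros_like(chi)
    for i in range(n):
        running += chi * dt                          # accumulate the time integral of chi
        chi = chi + (zeta**2 - a * chi) * dt         # Euler step for d chi = (f(zeta) - a chi) dt
        zeta = zeta - zeta * dt + noise[i]           # Euler step for the OU factor
    return running / T, chi

averages, final_chi = reversed_path_averages([0.1, 5.0])
print("time averages of chi for the two starting points:", averages)
print("difference of the two paths at time T:", abs(final_chi[0] - final_chi[1]))
```

In this toy model the stationary mean of the perpetuity is $E[f(Z_0)]/a = 1$, so both averages should be close to 1 up to discretization and sampling error, while the two paths themselves merge at the exponential rate $e^{-at}$.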
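Proof of Theorem 2.4. We again adopt the notation in (6.13). In view of Lemma 6.1, the remaining statement of Theorem 2.4 to be proved is that there is a set $\Omega_0 \in \mathcal{F}_\infty$ with $Q[\Omega_0] = 1$ such that (2.15) holds; i.e., for all $\omega \in \Omega_0$, all $x > 0$ and all $h \in C_b(F; \mathbb{R})$, $\lim_{T\to\infty}\langle h, \hat{\pi}^x_T\rangle(\omega) = \langle h, \pi\rangle$.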
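Recall the definition of $C$ from (6.11) and let $h \in C_b(F; \mathbb{R}) \cap C$. In view of Lemma 6.7 there is a set $\Omega_h \in \mathcal{F}_\infty$ such that $Q[\Omega_h] = 1$ and $\lim_{T\to\infty}\langle h, \hat{\pi}^x_T\rangle(\omega) = \langle h, \pi\rangle$ for all $\omega \in \Omega_h$ and all $x > 0$.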
Let the countable subset $\tilde{C} \subset C$ be as in the technical Lemma A.2 below and set $\Omega_0 = \bigcap_{h\in\tilde{C}}\Omega_h$.
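Since the family from Lemma A.2 is countable, $Q[\Omega_0] = 1$. Let $\omega \in \Omega_0$ and $h \in C_b(F; \mathbb{R})$ with $C = \sup_{y\in F}|h(y)|$. Let $\varepsilon > 0$ and for $n \ge 5$ take ${}^{\uparrow}\phi^n_{m,k}$, ${}^{\downarrow}\phi^n_{m,k}$ and $\theta_n$ as in Lemma A.2 such that (A.11) therein holds. In what follows the $\omega$ will be suppressed, but all evaluations are understood to hold for this $\omega$.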
Recall $F$ from (2.1) and the invariant density $p$ for $Z$. Let h ∈ C 2 (F ) be given and set Let the operator $L$ be as in (2.5) and the operator $L^R = (1/2)A^{ij}\partial^2_{ij} + (b^R)^i\partial_i$ be as in the proof of Proposition 6.2, where $A$ is from (2.4) and $b^R$ is from (6.6). Let $\tilde{L}^R$ be the formal adjoint of $L^R$.
Proof. For notational ease, the arguments will be suppressed when writing functions, except for the $x$ appearing in the drifts and volatilities of the operators. Now, recall the dynamics for the reversed process $(\zeta, \chi)$ in (6.2) and note, as mentioned in the proof of Proposition 6.2, that $L^R$ is the generator for $(\zeta, \chi)$. To simplify the presentation, the vector field $\xi$ is used. Note that by (5.2) it follows that $0 = \nabla\cdot(p\xi)$. With this notation we have that
$d\zeta_t = (m + 2\xi)(\zeta_t)\,dt + \sigma(\zeta_t)\,dW_t$,
$d\chi_t = \big(f(\zeta_t) - \chi_t\,(a - 2\theta'(m + \xi) - H(c, \theta))(\zeta_t)\big)\,dt + \chi_t\big(\theta' c(\zeta_t)\,dW_t + \eta(\zeta_t)'\,dB_t\big)$,
which in turn yields that the drift of $L^R$ is $b^R = \big(m + 2\xi,\ f - x\,(a - 2\theta'(m + \xi) - H(c, \theta))\big)$.
Lemma A.2. Let Assumption 1.7 hold. Let $C$ be as in (6.11). Recall that $F = E \times (0, \infty)$ and let $\{F_n\}_{n\in\mathbb{N}}$ be a family of open, bounded, increasing subsets of $F$ with smooth boundary such that $F = \bigcup_n F_n$. There exists a countable family of functions
(A.10) $\tilde{C} := \{{}^{\uparrow}\phi^n_{m,k},\ {}^{\downarrow}\phi^n_{m,k},\ \theta_n \mid n, m, k \in \mathbb{N},\ n \ge 3\} \subset C$
such that
1) For each $n \ge 3$, $0 \le \theta_n \le 1$ with $\theta_n = 1$ on $\bar{F}_n$ and $\theta_n = 0$ on $F^c_{n+1}$.
2) For each $n \ge 3$ and $m$, the functions ${}^{\uparrow}\phi^n_{m,k}$ are increasing in $k$ and the functions ${}^{\downarrow}\phi^n_{m,k}$ are decreasing in $k$. Furthermore, for any $n \ge 3$ and $m$, $\lim_{k\to\infty}|{}^{\uparrow}\phi^n_{m,k}(y) - {}^{\downarrow}\phi^n_{m,k}(y)| = 0$ for $y \in \bar{F}_{n-2}$.
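Additionally, for any $h \in C_b(F; \mathbb{R})$ set $C = C(h) := \sup_{y\in F}|h(y)|$. Then, for any $\varepsilon > 0$ and any integer $n \ge 5$ there exists an integer $m = m(\varepsilon, n)$ such that for all $k \in \mathbb{N}$, $\sup_{y\in F}|{}^{\uparrow}\phi^n_{m,k}(y)| \le C + \varepsilon$ and $\sup_{y\in F}|{}^{\downarrow}\phi^n_{m,k}(y)| \le C + \varepsilon$. Furthermore, for any Borel measure $\nu$ on $F$, the two-sided bound (A.11) holds.
Proof of Lemma A.2. Fix $n \in \mathbb{N}$ and let $(\phi^n_m)_{m\in\mathbb{N}}$ be a countable dense (with respect to the supremum norm) subset of $C_b(F_n; \mathbb{R})$. Now, let $k \in \mathbb{N}$ and define the functions ${}^{\uparrow}\tilde{\phi}^n_{m,k}$ and ${}^{\downarrow}\tilde{\phi}^n_{m,k}$ on $\bar{F}_n$. As shown in [2, Ch. 3.4], ${}^{\uparrow}\tilde{\phi}^n_{m,k}$ and ${}^{\downarrow}\tilde{\phi}^n_{m,k}$ are a) increasing and decreasing respectively in $k$, and b) Lipschitz continuous in $\bar{F}_n$ with Lipschitz constant $k$. Furthermore, as $k \uparrow \infty$, ${}^{\uparrow}\tilde{\phi}^n_{m,k} \nearrow \phi^n_m$ and ${}^{\downarrow}\tilde{\phi}^n_{m,k} \searrow \phi^n_m$ on $\bar{F}_n$.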
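The monotone Lipschitz approximations in this proof can be illustrated on a grid. The sketch below assumes the standard inf- and sup-convolution construction, $\inf_z\{\phi(z) + k|y - z|\}$ and $\sup_z\{\phi(z) - k|y - z|\}$, which has exactly the properties a) and b) listed above. This is an assumption, since the defining display is not reproduced here; the one-dimensional domain, the test function and the helper name are likewise hypothetical.

```python
import numpy as np

def lipschitz_envelopes(y, phi, k):
    """Lower and upper k-Lipschitz envelopes of phi sampled on the grid y."""
    dist = np.abs(y[:, None] - y[None, :])             # pairwise distances |y_i - y_j|
    lower = np.min(phi[None, :] + k * dist, axis=1)    # inf_z { phi(z) + k |y - z| }
    upper = np.max(phi[None, :] - k * dist, axis=1)    # sup_z { phi(z) - k |y - z| }
    return lower, upper

y = np.linspace(0.0, 1.0, 401)
phi = np.sqrt(np.abs(y - 0.5))   # continuous, but not Lipschitz near y = 0.5
low1, up1 = lipschitz_envelopes(y, phi, 1.0)
low8, up8 = lipschitz_envelopes(y, phi, 8.0)
low64, up64 = lipschitz_envelopes(y, phi, 64.0)

# Largest distance between upper and lower envelope for k = 1, 8, 64.
gap = [float(np.max(u - l)) for l, u in ((low1, up1), (low8, up8), (low64, up64))]
print("sup-norm gap between envelopes for k = 1, 8, 64:", gap)
```

The lower envelope increases and the upper envelope decreases as $k$ grows, each is $k$-Lipschitz on the grid, and the gap between them shrinks, which is the mechanism that lets a countable Lipschitz family sandwich a general bounded continuous function.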
Therefore, the upper bound in (A.11) is established.