Rigidity for Markovian Maximal Couplings of Elliptic Diffusions

Maximal couplings are (probabilistic) couplings of Markov processes such that the tail probabilities of the coupling time attain the total variation lower bound (Aldous bound) uniformly for all time. Markovian (or immersion) couplings are couplings defined by strategies where neither process is allowed to look into the future of the other before making the next transition. Markovian couplings are typically easier to construct and analyze than general couplings, and play an important role in many branches of probability and analysis. Hsu and Sturm (2013) proved that the reflection-coupling of Brownian motion is the unique Markovian maximal coupling (MMC) of Brownian motions starting from two different points. Later, Kuwada (2009) proved that the existence of an MMC for Brownian motions on a Riemannian manifold enforces existence of a reflection structure on the manifold. In this work, we investigate suitably regular elliptic diffusions on manifolds, and show how consideration of the diffusion geometry (including dimension of the isometry group and flows of isometries) is fundamental in classification of the space and the generator of the diffusion for which an MMC exists, especially when the MMC also holds under local perturbations of the starting points for the coupled diffusions. We also describe such diffusions in terms of Killing vectorfields (generators of isometry groups) and dilation vectorfields (generators of scaling symmetry groups).


Introduction
Let (Ω 1 , F 1 , µ 1 ) and (Ω 2 , F 2 , µ 2 ) be two probability spaces. A (probabilistic) coupling of µ 1 and µ 2 is a measure µ on the product measurable space (Ω 1 × Ω 2 , F 1 × F 2 ) with marginals µ 1 and µ 2 . This paper considers the question of coupling of (the laws of) two realizations X and Y of a Markov process started from different points. Writing τ for the coupling time, after which X and Y agree, the Aldous inequality states that, for all times t > 0, $\mu(\tau > t) \ge \|\mu_{1,t} - \mu_{2,t}\|_{TV}$ (1) where µ 1,t and µ 2,t are the distributions of X t and Y t respectively, while $\|\nu\|_{TV} = \sup\{|\nu(A)| : \text{measurable } A\}$ denotes the total variation norm on signed measures ν. Thus the most efficient possible coupling (a Maximal Coupling) would attain equality in the Aldous inequality (1) for all times t > 0, thereby solving a multi-objective optimization problem. The remarkable construction of Griffeath (1975), later simplified in a most elegant way by Pitman (1976), shows that maximal couplings always exist for discrete Markov chains. Goldstein (1979) generalized the construction to the case of non-Markovian processes; Sverchkov and Smirnov (1990) generalized it to continuous-time càdlàg processes.
Here is a summary of the Pitman approach, which is a model for the construction below (in Subsection 1.1) of maximal couplings of diffusions. A deterministic time-varying interface is constructed using the transition probabilities of the diffusions which are to be coupled. The distribution of the coupling time is elicited using the deficits of the transition probability masses integrated on each side of the interface (at any particular time, these integrated deficits are equal and correspond to the probability of one, equivalently both, of the coupled processes hitting the interface after this time). Now, the coupling time is sampled from this distribution, and the coupling location corresponds to a point on the interface at this time. Finally, the coupling is realized by constructing a single process forward in time and time-reversed time-inhomogeneous diffusions connecting starting locations to the location and moment of coupling, conditioning to avoid hitting the interface prematurely.
The major drawback of all these constructions is they are typically very implicit; in most cases, it is extremely hard, if not impossible, to make detailed calculations for such couplings. This is a strong motivation for considering Markovian couplings, which we now describe.
Let X and Y be Markov processes starting from x 0 and y 0 respectively. Let F s = σ{(X s ′ , Y s ′ ) : s ′ ≤ s} denote the joint filtration generated by X and Y together up to time s. A coupling of X and Y is called Markovian if the joint process {(X t+s , Y t+s ) : t ≥ 0} conditioned on F s is again a coupling of the laws of X and Y , but now starting from (X s , Y s ). (An alternative martingale-based characterization makes a succinct connection to the theory of immersions of filtrations. For this reason Markovian couplings are also called immersion couplings: Kendall, 2014.)
A natural and immediate question is, when can a maximal coupling of two diffusions be Markovian? The standard (and elegant) example in the literature is the reflection-coupling of Euclidean Brownian motions starting from two different points: the second Brownian path is obtained from the first by reflecting the first path on the hyperplane bisecting the line joining the starting points until the first path (equivalently, the second, reflected, path) hits this hyperplane. Both paths then evolve together ("synchronously") as a single Brownian path. Straightforward calculations, based on the reflection principle, show that this construction is in fact a Markovian maximal coupling (MMC). Furthermore, Hsu and Sturm (2013) proved that this is the unique such coupling for Euclidean Brownian motion. A few other examples are discussed in the literature: Ornstein-Uhlenbeck processes (Connor, 2007), and also Brownian motion on manifolds which possess certain reflection symmetries. The reflection coupling idea manifests itself throughout the area of probabilistic coupling: for example it has a natural generalization to Brownian motion on Riemannian manifolds (Kendall, 1986; Cranston, 1991), involving stochastic parallel transport and development, and not requiring any symmetries of the manifold. However it seems unlikely that such generalizations will normally provide maximal couplings.
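The maximality of reflection coupling can be checked numerically in the simplest case. The following is a minimal sketch for one-dimensional standard Brownian motion, assuming the total variation norm sup |ν(A)| used above; the function names and grid parameters are illustrative choices, not taken from the paper. Under reflection coupling the coupling time is the hitting time of the midpoint, whose tail is given by the reflection principle, and the sketch compares this tail with a numerically integrated total variation distance between the two Gaussian marginals.

```python
import math

def tv_distance_gaussians(x0, y0, t, n_grid=20001, half_width=12.0):
    """sup_A |mu1(A) - mu2(A)| for N(x0, t) and N(y0, t), via the trapezoidal rule.

    The supremum is attained on the set where the first density exceeds the
    second, so it equals the integral of the positive part of the difference.
    """
    mid = 0.5 * (x0 + y0)
    span = abs(x0 - y0) / 2 + half_width * math.sqrt(t)
    lo, hi = mid - span, mid + span
    h = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        z = lo + i * h
        p = math.exp(-(z - x0) ** 2 / (2 * t))
        q = math.exp(-(z - y0) ** 2 / (2 * t))
        w = 0.5 if i in (0, n_grid - 1) else 1.0
        total += w * max(p - q, 0.0)
    return total * h / math.sqrt(2 * math.pi * t)

def reflection_coupling_tail(x0, y0, t):
    """P(tau > t): Brownian motion from x0 has not hit the midpoint by time t."""
    a = abs(x0 - y0) / 2  # distance from each starting point to the mirror
    return math.erf(a / math.sqrt(2 * t))

if __name__ == "__main__":
    for t in (0.1, 1.0, 5.0):
        print(t, tv_distance_gaussians(0.0, 1.0, t), reflection_coupling_tail(0.0, 1.0, t))
```

The two columns printed should agree up to quadrature error, which is the statement that the Aldous inequality is an equality for this coupling.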
Kuwada (2009) investigated this question for Brownian motion on manifolds (and their generalisations to metric spaces). Under suitable mild regularity assumptions he showed that a reflection symmetry of the space is necessary for the existence of a Markovian maximal coupling of two Brownian motions started from a specified pair of points. Working under some further assumptions, he proved that the fixed point set of the symmetry (the "mirror", characterizing this isometry) does not change with time; the maximal coupling is given simply by reflecting one process onto the other using the reflection symmetry defined by this mirror.
The aim of this paper is to develop the results of Kuwada to the case of general regular elliptic diffusions with smooth coefficients. It will be shown that Markovian maximal couplings are rare, in the sense that a stable local existence result enforces extreme global symmetry on the manifold: a kind of rigidity result. Section 2 considers implications of existence of Markovian maximal couplings for d-dimensional Euclidean diffusions ("Euclidean" here meaning that the diffusion matrix is the identity matrix), under rather general regularity assumptions on the (possibly time-inhomogeneous) drift. Extending Kuwada's argument, the existence of an MMC implies there is a mirror symmetry between the coupled processes at any given time. However the influence of the non-zero drift now means that the mirror can vary deterministically with time, making the coupled dynamics considerably more complicated. We study the evolution of the mirror in time using stochastic calculus and we obtain a functional equation that the drift must satisfy for a Markovian maximal coupling to exist. This equation can be used to characterise all time-inhomogeneous diffusions which admit such couplings.
In the time-homogeneous case the characterization can be refined under the additional hypothesis that there is also a Markovian maximal coupling under local perturbation of the starting points, which is to say, Markovian maximal couplings exist locally in a stable sense:
Definition 1 (Local Perturbation Condition (LPC)). There is r > 0, and initial points x 0 and y 0 , such that there exists a Markovian maximal coupling of the diffusion processes X and Y starting from x and y for every x ∈ B(x 0 , r) and y ∈ B(y 0 , r), where B(x 0 , r) is the open metric ball centred at x 0 and of radius r.
In fact, we do not need the full power of LPC, as described in Remark 49. Using a suitable reference probability measure (as elaborated in Remark 49), our conclusions still hold if MMC only holds for d(d + 1)/2 + 1 randomly chosen pairs of starting points in B(x 0 , r) × B(y 0 , r).
We will show that, for any dimension d, LPC holds for a suitably regular Euclidean diffusion with time-homogeneous drift if and only if the drift takes the form b(x) = λx + T x + c, where λ is a scalar, T is a skew-symmetric matrix and c is a fixed vector. In one dimension, even without LPC, it turns out that a Markovian maximal coupling exists between two copies of a regular diffusion started from x 0 and y 0 if and only if the drift is either affine or an odd function around the midpoint of the starting points. This implies that Brownian motion with constant drift and the Ornstein-Uhlenbeck process are the only one-dimensional examples of time-homogeneous diffusions for which there are successful Markovian maximal couplings. In higher dimensions, for regular Euclidean diffusions under LPC, essentially the same is true except that the drift may also include a rotational component.
Section 3 considers Markovian maximal couplings of Brownian motion with time-homogeneous drift on a complete Riemannian manifold M under LPC. This is the natural generalization of the context of Section 2, since a regular elliptic diffusion on Euclidean space furnishes the space with a Riemannian metric by means of inverting the diffusion matrix, and then the diffusion is converted into a Brownian motion with drift on the resulting Riemannian manifold, so that the Riemannian geometry serves to classify a variety of diffusions (compare the rather similar rôle of Fisher information in theoretical statistics). Strikingly, LPC then produces a geometric rigidity phenomenon, namely a complete classification of the space M as one of the three model spaces R d (Euclidean space), S d (Sphere) and H d (Hyperbolic space) depending upon the sign of the (necessarily constant) curvature K (see Theorem 38 in Section 3). The Euclidean case is fully covered in Section 2, and delivers the necessary ideas and techniques which we generalise to the manifold setup in Section 3 to study Markovian maximal couplings on the other two spaces. It turns out that the only drifts which can yield Markovian maximal couplings are given by the Killing vectorfields, defined as infinitesimal generators for the rigid motion group (namely, generators of one-parameter subgroups of isometries).
In this paper we confine our considerations to the case of elliptic diffusions, where there is a strong connection to Riemannian geometry, and path-continuity permits the formation of interfaces of co-dimension 1 separating pairs of initial points. Possible extensions to hypoelliptic diffusions or to general Markov chains are potentially of great interest, but we leave these questions as topics for future work.

Markovian maximal couplings: general properties
We complete this introduction by defining some general notation and by describing some basic general properties of Markovian maximal couplings for general Markov processes on a metric space (M, dist). Kuwada (2009) derived results similar to Lemmas 2 and 3 below. For the sake of clearer exposition, and as we are primarily interested in diffusion processes, we will state the results for continuous-time Markov processes. Denote the Markov process under consideration by X.
We assume that the metric space supports a positive Borel measure m with 0 < m(B) < ∞ for any metric ball B of finite radius. Consequently, the closed support of m is the whole of M . We further assume that for any t > s ≥ 0, the conditional distribution law L (X t | X s = x) is absolutely continuous with respect to m and has a probability kernel density given by p(s, x; t, z) for x, z ∈ M and 0 ≤ s < t.
Let µ denote the law of a Markovian maximal coupling (X, Y ) of two copies of our Markov process started from (x 0 , y 0 ), which can be thought of as a measure on the coupled path-space C[0, ∞) 2 , and let τ = inf{s > 0 : X t = Y t for all t > s} denote the coupling time of X and Y . Motivated by Pitman's construction for finite Markov chains, we write α(s, x, y, t, z) = p(s, x; t, z) − p(s, y; t, z) , and set α + (s, x, y, t, z) = max(α(s, x, y, t, z), 0) and α − (s, x, y, t, z) = max(−α(s, x, y, t, z), 0). If s = 0 (and thus x = x 0 and y = y 0 ), then we abbreviate α(t, z) for α(s, x 0 , y 0 , t, z) and similarly for other quantities.
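As an illustration of the quantities α, α + and α − , the following sketch evaluates them for a one-dimensional Ornstein-Uhlenbeck process dX = −θX dt + dB, whose Gaussian transition density is standard. The choice of process, the parameter θ and the function names are assumptions made for this example only. Because both kernels integrate to one, the integrated masses of α + and α − must agree, and their common value is the total variation distance between the two marginal laws at time t.

```python
import math

def ou_density(x, t, z, theta=1.0):
    """Transition density at z, after time t, of dX = -theta*X dt + dB started at x."""
    mean = x * math.exp(-theta * t)
    var = (1.0 - math.exp(-2.0 * theta * t)) / (2.0 * theta)
    return math.exp(-(z - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def alpha(x0, y0, t, z):
    """alpha(t, z) = p(0, x0; t, z) - p(0, y0; t, z)."""
    return ou_density(x0, t, z) - ou_density(y0, t, z)

def alpha_masses(x0, y0, t, lo=-10.0, hi=10.0, n=40001):
    """Trapezoidal approximations to the integrals of alpha+ and alpha- over [lo, hi]."""
    h = (hi - lo) / (n - 1)
    plus = minus = 0.0
    for i in range(n):
        a = alpha(x0, y0, t, lo + i * h)
        w = 0.5 if i in (0, n - 1) else 1.0
        plus += w * max(a, 0.0) * h
        minus += w * max(-a, 0.0) * h
    return plus, minus

if __name__ == "__main__":
    print(alpha_masses(1.0, -0.5, 0.7))   # two (nearly) equal masses
    print(alpha(1.0, -1.0, 0.7, 0.0))     # symmetric starts: alpha vanishes at the origin
```

For symmetric starting points ±x the kernels agree at the origin for every t, so in this example the origin is a point at which the two kernels coincide at all times.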
We will be dealing with Markov processes which are possibly time-inhomogeneous, so we say a Markov process starts from (t, x) if we are looking at the distribution law L (θ t X | X t = x), where θ denotes the time-shift operator given by (θ t X) s = X t+s .
Define the interface between p(0, x 0 ; ·, ·) and p(0, y 0 ; ·, ·) at time t to be the region where the corresponding heat kernels agree: $I(x_0, y_0, t) = \{z \in M : \alpha(t, z) = 0\}$ (2) Also write Finally, define the perpendicularly bisecting set (or "hyperplane") and the associated "half-spaces" (note that these are indeed a hyperplane and half-spaces in the Euclidean case):
Lemma 2. Any joint maximal coupling law can be related to differences of the transition probability kernel densities as follows: for any Borel subset A of M , and s > 0, Interchanging the rôles of X and Y , a corresponding argument applies if p(0, x 0 ; s, ·) ≥ p(0, y 0 ; s, ·) on A. Hence additivity shows that for all A the coupling must satisfy Finally, Aldous' inequality (1) is by definition an equality for a maximal coupling, so It follows that the inequality (5) must in fact be an equality. This proves the lemma.
Only maximality was required for Lemma 2. If in addition µ is Markovian, then the conditional law L (θ s X, θ s Y | F s ) describes a Markovian coupling of two copies of our Markov process starting from ((s, X s ), (s, Y s )). Such a coupling therefore satisfies the following flow property:
Lemma 3. If µ is a Markovian maximal coupling and µ s = L (X s , Y s ) then, for µ s -almost every (x, y) with x ≠ y, the conditional law L (θ s X, θ s Y | X s = x, Y s = y) gives a Markovian maximal coupling of (X, Y ) starting from ((s, x), (s, y)).
Proof. This follows immediately from the maximality of µ and the fact that µ is Markovian.
We now introduce notation to describe the set of pairs of initial points in the closed support of µ s for which the forward processes (θ s X, θ s Y ) do indeed generate a maximal coupling: M(µ s ) = {(x, y) ∈ Support(µ s ) : x ≠ y and L (θ s X, θ s Y | X s = x, Y s = y) yields a maximal coupling of (X, Y ) starting from ((s, x), (s, y))}.
We conclude this introduction by noting an elementary observation about couplings of Markov processes.
Proof. The first assertion is a direct consequence of the general definition of conditional expectation. The second assertion follows from the definition of maximality.

Markovian Maximal Couplings on Euclidean spaces
We consider diffusions on Euclidean space R d with infinitesimal generator $L = \frac{1}{2} \sum_{i=1}^{d} \partial_i^2 + \sum_{i=1}^{d} b_i(t, x)\, \partial_i$ (7) where $\partial_i = \partial / \partial x_i$. In the following, X will be used to denote a diffusion with the above generator. We will refer below to such a diffusion as a Euclidean diffusion, because diffusions with general diffusion coefficients are covered in Section 3 as instances of 'Brownian motion plus drift on a manifold'. We make the following very general regularity assumptions (not necessary for all of our results, but imposed globally to streamline the exposition):
(A1) The drift vectorfield b : [0, ∞) × R d → R d is continuously differentiable in the second (space) variable; moreover b and all its first-order spatial partial derivatives ∂ i b are bounded on compact subsets of [0, ∞) × R d .
(A2) For every t > s ≥ 0, and x, z ∈ R d , the conditional distribution law L (X t | X s = x) is the law of a diffusion with transition probability density kernel p(s, x; t, z) (density with respect to Lebesgue measure), which is jointly continuous in all its arguments. Moreover, p(s, ·; ·, ·) is positive everywhere when s > 0. Finally, the density p(s, x; ·, ·) : R + × R d → R is continuously differentiable in the time variable (first unspecified variable) and twice continuously differentiable in the space variable (second unspecified variable).
Remark 5. Note that Assumption (A2) implies that the diffusion does not explode in finite time (otherwise p(s, x; t, ·) would determine a sub-probability density). A sufficient condition for nonexplosion is to require that b is locally Lipschitz in the space variable x (which follows from Assumption (A1)) and moreover that there exists a constant C such that |b(t, x)| ≤ C(1 + |x|) for all t and x (Hsu, 2002, Proposition 1.1.11). Furthermore, the fact that b is locally Lipschitz in x implies the existence of a unique strong solution to the SDE corresponding to (7) for any given driving Brownian motion B (Hsu, 2002, Theorem 1.1.8).
We will sometimes say b satisfies Assumptions (A1) and (A2) if b satisfies (A1) and the corresponding diffusion (whose law is unique by the above remark) has transition probability densities satisfying (A2).
Recall that we say a diffusion starts from (t, x) if we are looking at the law L (θ t X | X t = x), where θ denotes the time-shift operator given by (θ t X) s = X t+s . The resulting process is a diffusion with the identity diffusion matrix but using time-shifted drift b(t + ·, ·) and starting from x at time 0.
Let X and Y be two copies of this diffusion starting from x 0 and y 0 respectively. Recall M(µ s ) = {(x, y) ∈ Support(µ s ) : x ≠ y and L (θ s X, θ s Y | X s = x, Y s = y) yields a maximal coupling of (X, Y ) starting from ((s, x), (s, y))}.

Coupling and the interface
Here, we show that the existence of a Markovian maximal coupling for X and Y implies that for each time t, the interface I(x 0 , y 0 , t) will be a hyperplane bisecting the straight line joining X t and Y t . We begin with some preparatory lemmas. Note that Brownian motion has fluctuations which are of order O( √ t) while fluctuations resulting from the drift are of order O(t). Thus, on small time scales, the Brownian behaviour should dominate. The following lemma substantiates this intuition.
Lemma 7. Let X be a diffusion given by $X_t = B_t + \int_0^t b(s, X_s)\, ds$ with X 0 = x 0 (so B 0 = x 0 ), and suppose the drift b satisfies Assumption (A1). Denote by P the underlying measure. Then, for any z ∈ R d and any δ > 0,
Proof. Let I = sup{|y − x 0 | : y ∈ B(z, δ)} and choose N > d × I + 1. By continuity of b, there is a finite M for which |b(t, y)| ≤ M for all (t, y) ∈ [0, 1] × B(x 0 , N ). Let τ N = inf{t > 0 : X t ∉ B(x 0 , N )}. Then, we can write and (using t < δ/M ) Also (using t < 1/M to control the difference between B and X) Thus, there exists some constant C such that, By the Large Deviation principle for Brownian motion (Varadhan, 1984), it is easy to see that This, along with (10), (11) and (12), yields the lemma.
Remark 8. The above lemma can be regarded as a weak form of a large deviation principle (LDP) for the diffusion X, specialized to a particular set B(z, δ). The general form of the LDP can be shown to hold under the additional assumption of linear growth of the drift vectorfield, which is used to control the moments of the Radon-Nikodym derivative of the law of X with respect to that of B obtained by the Girsanov Theorem (Varadhan, 1984).
Note that for each fixed (s, x) the transition density (t, y) → p(s, x; t, y) satisfies the Kolmogorov forward equation $\partial_t p(s, x; t, y) = L^{*} p(s, x; t, y)$ where L * is the adjoint of the operator L. Under assumptions (A1) and (A2) the above equation can be rewritten as where A is a uniformly parabolic operator (Protter and Weinberger, 1984, p. 173) and h is bounded on compact subsets of [0, ∞) × R d . We now state the Strong Maximum Principle for uniformly parabolic equations in the following form (see Theorem 5, Theorem 7 and part (ii) of the remark following Theorem 7, pp. 173-175 of Protter and Weinberger, 1984).
It is now possible to state and prove the main result of this section, which can be seen as a stronger version of Kuwada (2009, Proposition 3.9), although our proof is quite different and slightly shorter.
Remark 11. The above theorem shows that for a Markovian maximal coupling, for any time s, the locus I(x 0 , y 0 , s) can be viewed as a (possibly time-varying) mirror which realizes the coupling in a very explicit way, using a (possibly time-varying) reflection isometry.
The following corollary to the above lemma shows that the coupling time τ is, in fact, the hitting time of the deterministic space-time set {(s, I(x 0 , y 0 , s)) : s > 0} by the process ((s, X s ) : s ≥ 0) (equivalently, ((s, Y s ) : s > 0)). In particular, X and Y will couple at the first time they meet. Furthermore, the interface representation described in Theorem 10 will hold almost surely for all time before coupling occurs.
Since the trajectories of Y are continuous, it follows that almost surely Y t is contained in the complement of I − (x 0 , y 0 , t) for all t < τ . This implies For any t > 0, we define the event Theorem 10 implies the assertion µ (E q is true for all rational q) = 1 , hence almost surely E = ∩ q∈Q E q holds. Take any t > 0 with X t ≠ Y t and let z ∈ H(X t , Y t ). Then it follows from the definition of H(x, y) and the continuity of sample paths of X and Y that there is a rational sequence t n ↓ t and z n ∈ H(X tn , Y tn ) such that z n → z. Thus, on the event E, the continuity of α implies that H(X t , Y t ) ⊆ I(x 0 , y 0 , t).
Now, take z ∈ H + (X t , Y t ) when X t ≠ Y t . The continuity of sample paths of X and Y implies that there exist η, δ > 0 with B(z, η) ⊆ H + (X s , Y s ) for all s ∈ [t − δ, t]. On the event E, the continuity of α implies α(s, z ′ ) ≥ 0 for all s ∈ [t − δ, t] when z ′ ∈ B(z, η). Thus, as α(q, z) > 0 for all rational q ∈ [t − δ, t], Lemma 9 implies α(t, z) > 0. Thus, Note that, in particular, (18) and (21) . Similarly, before the coupling time τ , it is not possible for X and Y to meet away from the interface The corresponding argument for Y implies that τ ′ also satisfies τ ′ = inf{s > 0 : Y s ∈ I(x 0 , y 0 , s)}. Therefore, τ ′ is a stopping time for both X and Y . Since X τ ′ = Y τ ′ , we can extend X and Y synchronously beyond time τ ′ . Combined with (18), this implies τ = τ ′ almost surely, since the maximal coupling time τ must be stochastically smaller than all other coupling times. Consequently This, together with (21), yields (17) and thus the corollary is proved.

Time evolution of the mirror
We now analyze the time-evolution of the mirror. From Theorem 10, it follows that the mirror I(x 0 , y 0 , t) is a hyperplane for each t > 0. We parametrize this hyperplane by its signed distance from the origin, say l(t), together with the unit normal vector to the hyperplane, say n(t). There is an ambiguity of sign in the choice of n; however the next lemma states that n(t) can be chosen to make this parametrization continuous up to the coupling time τ .
Proof. Corollary 12, together with the remark following Lemma 3, shows that the following subset of coupled path-space C[0, ∞) 2 is non-empty for any S > 0, and indeed of full µ-measure in the subset corresponding to τ > S: Consider any coupled pair of paths ω ∈ A S . Define (l(t), n(t)) on [0, S] by This gives a continuous parametrization (l (S) , n (S) ) on [0, S ∧ τ ). This recipe can be used to define (l (N ) , n (N ) ) on [0, N ∧ τ ) for each positive integer N . By continuity of n (N ) and . So we can consistently and continuously define the parametrization as ((l(t), n(t)) : t ∈ [0, τ )), thus proving the lemma.
In fact the parametrization is not simply continuous but is also continuously differentiable:
Proof. We use the map given by reflection in the hyperplane parametrized by (l(t), n(t)), namely $F(t, x) = (I - 2 n(t) n^{\top}(t))\, x + 2\, l(t)\, n(t)$. We write expectation with respect to µ using E.
By general properties of diffusions (Nelson, 1967, Chapter 11), Note that under the coupling µ we may use Corollary 12 to see that Y U s = F (s, X U s ) for all s ≥ t with probability one. Thus, we can write the last expression above as in the sense that if the limit of s−t exists and is defined by the above. By linearity of F in x, we see that the first summand becomes This shows that lim s↓t exists for each x and for all t ∈ [0, τ ) and indeed is continuous in t. This is enough to show that t → F (t, x) is continuously differentiable for each x (Bruckner, 1978, Theorem 1.3). This follows from the facts that t → (I − 2n(t)n ⊤ (t)) and t → l(t)n(t) are continuously differentiable, and actually requires these facts to be true: consider F (t, x) for x varying over an orthonormal basis and also for x = 0. Now, take any This proves the lemma.

Structure of the coupling
All the tools having been assembled, it is now possible to present a rather explicit description of drifts b which permit the existence of a Markovian maximal coupling of two copies X and Y of a Euclidean diffusion with the required regularity conditions.
We begin with a notational remark. For any x ∈ R d and any hyperplane h, we denote by hx the reflection of x in h. We write h k for the hyperplane {x k = 0}.
The first lemma of this subsection concerns an observation concerning rotations and shifts of these Euclidean diffusions.
Lemma 15. Let X be a Euclidean diffusion satisfying assumptions (A1), (A2). Let Q : [0, ∞) → O(d) be a continuously differentiable function taking values in the space of orthogonal (d×d) matrices, and let l : [0, ∞) → R be a continuously differentiable real-valued function. Then the new process given by satisfies the stochastic differential equation and Here, $\dot{Q}$ and $\dot{l}$ denote the respective time-derivatives and Q ⊤ denotes the matrix transpose.
Proof. The result follows by direct calculation using Itô calculus.
The following theorem describes Markovian maximal couplings for the class of time-nonhomogeneous Euclidean diffusions satisfying suitable regularity conditions. The intuitive content of the theorem is, given an MMC (X, Y ), applying deterministic time-varying rotations and translations to the ambient Euclidean space reduces this MMC to a reflection coupling in a fixed hyperplane. Thus, in a certain sense, reflection coupling is the only type of Markovian coupling that can possibly preserve maximality.
(i) Suppose the following holds for every x ∈ R d , for the fixed hyperplane gives a Markovian maximal coupling between two copies of the diffusion starting from x 0 and h 1 x 0 respectively.
using the transformation (24) are reflection-coupled according to the recipe (29). In particular, the transformed time-varying drift $\tilde{b}$ given by (26) must satisfy
Proof. (i) Equation (28) implies that the process (h 1 X t : t ≥ 0) has the same law as the diffusion starting from h 1 x 0 and thus, the reflection-coupling (29) gives a valid coupling. Maximality follows from the reflection principle, since the drifts cancel when computing the Itô differential for X − Y .
(ii) First, note that if X and Y are reflection-coupled according to (29), then analysis of generators of h 1 X t and Y t yields (30). But is a bijective, bimeasurable function, so application of Lemma 4 to (t, Conversely, let (X, Y ) be a Markovian maximal coupling of two copies of the diffusion starting from x 0 and y 0 . Then the results of subsections 2.1 and 2.2 show that there exist continuously differentiable functions l : [0, ∞) → R and n : [0, ∞) → S d−1 parametrising the mirror I(x 0 , y 0 , t). Furthermore, Theorem 10 and the corollary following it show that X and Y are coupled on t < τ according to the relationship The construction of Q follows by applying Gram-Schmidt orthogonalization to extend n(0) to an orthonormal basis (n(0), v 1 , . . . , v d−1 ) of R d . Note that the vectors v i lie in the tangent space of S d−1 based at n(0). The vector function (n(t) : t ≥ 0) traces out a C 1 curve γ on the sphere S d−1 . Parallel transport (Gallot et al., 2004, p. 75) can be applied along γ to each vector v i ; this produces C 1 vectorfields X i : [0, ∞) → R d along γ. Gallot et al. (2004, Proposition 2.74) shows that (n, X 1 , . . . , X d−1 ) produces a C 1 orthonormal frame along γ, so set .
We now produce a new pair of diffusions with time-varying drifts, $(\tilde{X}, \tilde{Y})$, by applying the transformation (24) to (X, Y ) with drift b and driving Brownian motion B as described in Lemma 15. This new pair is also a Markovian maximal coupling (use Lemma 4), and from equation (31) it follows that the coupled pair $(\tilde{X}, \tilde{Y})$ is described by the transformation (29). As discussed in part (i) of this proof, the relationship (30) follows as a direct consequence.
Inverting the relationship (26), and using the relationship (30), the above theorem yields the following characterisation of drifts which permit MMC: for some $\tilde{b}$ satisfying Assumptions (A1) and (A2) and fulfilling the relationship (30).

Rigidity theorems for time-homogeneous diffusions
The previous subsection established an implicit classification of all time-nonhomogeneous diffusions that can be coupled by a Markovian maximal coupling. But, as noted in the literature, not many examples of such couplings are known for time-homogeneous diffusions. It is a matter of general belief that the class of such time-homogeneous diffusions is very small, but little rigorous work appears to have been done to specify this class. In this subsection we obtain a constraint equation on the drift, leading to certain general conditions on the drift and the starting points which are necessary for the existence of Markovian maximal couplings. In the case of affine drifts the constraint equations are explicit enough to classify all affine drifts leading to Markovian maximal couplings. We then state and prove the main theorem of this subsection: if there are two balls B(x 0 , r) and B(y 0 , r) in R d , such that a Markovian maximal coupling exists from all pairs of points (x, y) ∈ B(x 0 , r) × B(y 0 , r), then the drift has to be of a very simple affine form, verifying the popular belief that Markovian maximal couplings are indeed very rare.
We conclude by showing a stronger result for one-dimensional diffusions, which states that for such couplings to exist, the drift is either an odd function centred at a point, or is affine.
The following lemma supplies the constraint equation on the drift. Recall that $F(t, x) = (I - 2 n(t) n^{\top}(t))\, x + 2\, l(t)\, n(t)$ (33) is the transformation sending x ∈ R d to its reflection in the mirror I(x 0 , y 0 , t). For the sake of concise exposition, in the following two lemmas and their proofs we suppress the argument t when writing l and n.
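The reflection in a hyperplane with unit normal n at signed distance l from the origin can be written x ↦ (I − 2nn ⊤ )x + 2ln. The following minimal sketch (the function name `reflect` and the sample values are illustrative assumptions) implements this map and can be used to confirm that it is an involution which fixes the mirror pointwise.

```python
import math

def reflect(x, n, l):
    """Reflect x in the hyperplane {z : <z, n> = l}, where n is a unit vector.

    Equivalent to (I - 2 n n^T) x + 2 l n.
    """
    dot = sum(xi * ni for xi, ni in zip(x, n))
    return [xi - 2.0 * (dot - l) * ni for xi, ni in zip(x, n)]

if __name__ == "__main__":
    n, l = [0.6, 0.8], 0.5            # unit normal and signed distance (sample values)
    x = [1.0, 2.0]
    y = reflect(x, n, l)
    print(y)                          # mirror image of x
    print(reflect(y, n, l))           # reflecting twice returns x
    print(reflect([l * n[0], l * n[1]], n, l))  # a point on the mirror is fixed
```

The midpoint of x and its image always lies on the mirror, which is the bisecting property used throughout the coupling construction.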
Lemma 19. Assume (A1), (A2) hold. A Markovian maximal coupling (X, Y ) exists from starting points x 0 and y 0 if and only if the drift vectorfield b satisfies the following equation:
Proof. First, assume that a Markovian maximal coupling (X, Y ) exists. Note from equation (31) that $Y_t = F(t, X_t)$ for t ∈ [0, τ ). Applying stochastic calculus to the function F for t ∈ [0, τ ), substituting in and simplifying, we obtain The diffusion term is clearly a Brownian motion, as can be verified by the Lévy criterion. On the other hand, the drift term in the semimartingale decomposition of Y is given by Equating the two drifts yields the necessity of the drift constraint condition (34).
gives a valid coupling µ of the two copies (X, Y ) with coupling time τ . To see that this is indeed the maximal coupling, note that if we define the kernel h(t, x) by then h solves the Kolmogorov forward equation with Dirichlet boundary conditions: also solves (36). By uniqueness of solution of (36), we get h = α. Thus, which is equal to the total variation distance between the laws of X t and Y t . This proves the lemma.
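The drift-matching step can be checked numerically for an affine drift in the plane. The sketch below is written under stated assumptions: it takes the constraint in the form obtained by applying Itô's formula to Y = F(t, X) and equating drifts, namely b(F(t, x)) = (I − 2nn ⊤ ) b(x) + ∂ t F(t, x), and it assumes mirror dynamics dn/dt = T n and dl/dt = λ l + ⟨n, c⟩ for the drift b(x) = λx + T x + c. Both the displayed form and the mirror dynamics are derived here for illustration rather than quoted from the text, and all names are hypothetical.

```python
import math

def drift(x, lam, w, c):
    """Affine drift b(x) = lam*x + T x + c with T = [[0, -w], [w, 0]] (skew-symmetric)."""
    return [lam * x[0] - w * x[1] + c[0], lam * x[1] + w * x[0] + c[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def constraint_residual(x, phi, l, lam, w, c):
    """Largest component of b(F(x)) - (I - 2 n n^T) b(x) - dF/dt(x) at one instant."""
    n = [math.cos(phi), math.sin(phi)]      # unit normal of the mirror
    n_dot = [-w * n[1], w * n[0]]           # assumed mirror dynamics: dn/dt = T n
    l_dot = lam * l + dot(n, c)             # assumed mirror dynamics: dl/dt = lam*l + <n, c>
    s = dot(n, x) - l                       # signed distance of x from the mirror
    Fx = [x[i] - 2 * s * n[i] for i in range(2)]
    bx = drift(x, lam, w, c)
    Rb = [bx[i] - 2 * dot(n, bx) * n[i] for i in range(2)]
    # time derivative of F(t, x) = x - 2 (<n, x> - l) n at fixed x
    dF = [-2 * (dot(n_dot, x) - l_dot) * n[i] - 2 * s * n_dot[i] for i in range(2)]
    bF = drift(Fx, lam, w, c)
    return max(abs(bF[i] - Rb[i] - dF[i]) for i in range(2))

if __name__ == "__main__":
    print(constraint_residual([0.3, -1.2], 0.9, 0.4, -0.7, 1.3, [0.2, -0.5]))
```

A residual at the level of rounding error, for arbitrary x and mirror position, is consistent with drifts of the form λx + T x + c admitting a time-varying mirror that rotates with T and translates according to λ and c.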
Equation (34) provides the constraint only in implicit form, and the main task is to extract as much information from it as possible. In what follows, we decompose the gradient matrix ∇b into symmetric and skew-symmetric parts via $\nabla b(x) = S(x) + T(x)$, where $S(x) = \frac{1}{2}\left(\nabla b(x) + \nabla b(x)^{\top}\right)$ and $T(x) = \frac{1}{2}\left(\nabla b(x) - \nabla b(x)^{\top}\right)$. The next lemma records relations for S(x) and T (x) which are direct consequences of (34).
Lemma 20. Under the hypotheses of Lemma 19, the following hold for all x ∈ R d and t > 0: and In particular, S(x) and S(F (t, x)) have the same set of eigenvalues.
(ii) There exists a continuous function λ(·, Proof. Differentiating both sides of (34), while recalling the reflection form of F (t, x) as given in (33), This immediately yields part (i). The equality of the sets of eigenvalues follows from the fact that the reflection matrix (I − 2nn ⊤ ) is symmetric and orthogonal. Parts (ii) and (iii) follow by post-multiplying the equations of part (i) by n, bearing in mind that, since n is a unit vector, n and ṅ must be orthogonal.
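The eigenvalue claim in the proof rests on a general fact: conjugating a symmetric matrix by the reflection matrix I − 2nn ⊤ leaves its spectrum unchanged. The following minimal sketch checks this numerically in dimension two. The angle, the matrix S and all helper names are illustrative choices, not taken from the paper, and the exact form of the relation in part (i) is not assumed beyond this conjugation structure.

```python
import math

def reflection_matrix(n):
    """Return I - 2 n n^T for a unit vector n given as a list of floats."""
    d = len(n)
    return [[(1.0 if i == j else 0.0) - 2.0 * n[i] * n[j] for j in range(d)]
            for i in range(d)]

def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def eigenvalues_sym_2x2(s):
    """Eigenvalues of a symmetric 2x2 matrix, in increasing order."""
    tr = s[0][0] + s[1][1]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    disc = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    return (tr / 2.0 - disc, tr / 2.0 + disc)

theta = 0.7                               # illustrative mirror direction
n = [math.cos(theta), math.sin(theta)]    # unit normal of the mirror
R = reflection_matrix(n)                  # symmetric and orthogonal
S = [[2.0, 0.5], [0.5, -1.0]]             # illustrative symmetric matrix
conjugated = matmul(matmul(R, S), R)      # R S R has the same spectrum as S
```

Comparing `eigenvalues_sym_2x2(S)` with `eigenvalues_sym_2x2(conjugated)` shows agreement up to rounding error.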
Even in the generality of the hypotheses of Lemma 19, one can obtain the following necessary condition on the drift of a Euclidean diffusion for existence of a Markovian maximal coupling: use (ii) of the above lemma and take t ↓ 0.
Corollary 21. Under the hypotheses of Lemma 19, n(0) must be an eigenvector of (S(x) + S(F (0, x)))/2 corresponding to some eigenvalue λ(x), for every x ∈ R d .
Briefly restrict attention to the case where b(x) is affine in x. The following Theorem completely classifies the set of such drifts which ensure Markovian maximal coupling.
Then a Markovian maximal coupling (X, Y ) exists from starting points x 0 and y 0 if and only if there exists an eigenvalue λ 0 of S such that the vectors T k (x 0 − y 0 ) (for 0 ≤ k ≤ d − 1) all lie in the eigenspace of S corresponding to λ 0 . In this case (using matrix exponentials exp), Proof. From (ii) and (iii) of Lemma 20 we get the following: (where we note that λ is a function of t only) and The finite symmetric matrix S has discrete spectrum; by this, and the continuity of n(·) and λ(·), it follows immediately from (45) that λ(·) ≡ λ 0 for some constant λ 0 . Furthermore, solving (46) gives Since T is skew-symmetric, the above formula implies |n(t)| = 1 for all t.
Thus n(t), as given by (47), must lie in the eigenspace of S corresponding to λ 0 , for all time t. Substituting this formula for n(t) in equation (45) and differentiating (47) k times with respect to t (for k = 0, 1, . . . , d − 1), then setting t = 0, we obtain that the vectors T k (x 0 − y 0 ) for 0 ≤ k ≤ d − 1 must all lie in the eigenspace of S corresponding to λ 0 . As T solves its characteristic equation, it is clear that all the higher powers T k (x 0 − y 0 ) for k ≥ d must also lie in this eigenspace. Using the series representation of exp (T t), this means that n(t) must also lie in this eigenspace for all t.
To solve for l, note that computation with (33), (34), (42) yields the following expression for n = n(t): On the other hand, (45) and (46) yield Substituting into (48) and simplifying, we obtain a differential equation for l. Solving this equation, using the solution for n = n(t) obtained from (47), yields which proves the theorem.
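The criterion of Theorem 22 is finite and can be checked mechanically: the vectors T k (x 0 − y 0 ), 0 ≤ k ≤ d − 1, must all be eigenvectors of S for one common eigenvalue. Here is a minimal sketch of such a check; the function name, the tolerance and the use of a Rayleigh quotient to identify λ 0 are illustrative choices rather than anything prescribed by the paper.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def krylov_in_eigenspace(S, T, v, tol=1e-9):
    """Check the eigenspace condition for v = x0 - y0.

    Returns (True, lam) when v, Tv, ..., T^(d-1) v all satisfy S w = lam w
    for a single scalar lam, and (False, None) otherwise.
    """
    d = len(v)
    norm_sq = sum(c * c for c in v)
    if norm_sq == 0.0:
        raise ValueError("the starting points must be distinct")
    Sv = matvec(S, v)
    # If v is an eigenvector of S, its eigenvalue equals the Rayleigh quotient.
    lam = sum(a * b for a, b in zip(Sv, v)) / norm_sq
    w = list(v)
    for _ in range(d):
        Sw = matvec(S, w)
        if any(abs(a - lam * b) > tol for a, b in zip(Sw, w)):
            return False, None
        w = matvec(T, w)  # next vector T^k v (a zero vector passes trivially)
    return True, lam

rotation = [[0.0, -1.0], [1.0, 0.0]]      # skew-symmetric generator in d = 2
zero = [[0.0, 0.0], [0.0, 0.0]]
scalar = [[3.0, 0.0], [0.0, 3.0]]         # S = 3 I
anisotropic = [[1.0, 0.0], [0.0, 2.0]]    # S with two distinct eigenvalues

case_scalar = krylov_in_eigenspace(scalar, rotation, [1.0, 0.0])
case_mixed = krylov_in_eigenspace(anisotropic, rotation, [1.0, 0.0])
case_symmetric = krylov_in_eigenspace(anisotropic, zero, [1.0, 0.0])
```

The three cases mirror the dichotomy of Corollary 23 in dimension two: a non-zero skew part forces S to be scalar, whereas a purely symmetric drift matrix only needs x 0 − y 0 to be an eigenvector.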
The following Corollary is immediate from the above Theorem.
Corollary 23. If d = 2, then under the hypotheses of Theorem 22, A is either a symmetric matrix or of the form λ 0 I + T for some real scalar λ 0 and a skew-symmetric matrix T .
Proof. If the skew-symmetric part T of A is non-zero, then x 0 − y 0 and T (x 0 − y 0 ) are non-zero, mutually orthogonal vectors which lie in the eigenspace of S corresponding to λ 0 . Thus, this eigenspace is the whole of R 2 and S = λ 0 I.
Now, we state and prove the main theorem of this section. Recall the Local Perturbation Condition LPC described in the introduction.
Theorem 24. Assume (A1) and (A2) hold for a time-homogeneous Euclidean diffusion. Then LPC holds if and only if there exist a real scalar λ 0 , a skew-symmetric matrix T and a vector c ∈ R d such that the diffusion drift is given by Proof. We need to show that the set of eigenvalues of S(x) for any x ∈ R d is the singleton {λ 0 } and the skew-symmetric part T (x) is a constant matrix T . Write Our approach is to choose an appropriate set of mirrors H ⊆ H 0 and then to consider the orbit of a point z ∈ R d under repeated reflections in this set of mirrors, defined as We then use the constraint relations between a point and its reflection obtained in Lemma 20. This idea is made more precise in the following internal lemmas.
Lemma 25. Under the hypotheses of Theorem 24, there exists λ 0 ∈ R such that S(x) = λ 0 I for all x ∈ R d . Proof. Suppose that X and Y start at x ∈ B(x 0 , r) and y ∈ B(y 0 , r) respectively. It follows from letting t ↓ 0 in part (i) of Lemma 20 that, for all z ∈ R d , S(z) and S(H(x, y)z) have the same set of eigenvalues. (Recall that H(x, y)z represents reflection of z in the hyperplane H(x, y).) Denote x * = (x 0 + y 0 )/2 and let v 1 = x 0 − x * . Extend v 1 to a basis {v 1 , . . . , v d }. If ε is sufficiently small then the linearly independent vectors n i = v 1 + εv i , i = 1, . . . , d are such that {x * + n i : i = 1, . . . , d} ⊂ B(x 0 , r) and {x * − n i : i = 1, . . . , d} ⊂ B(y 0 , r). Defining x i = x * + n i and y i = x * − n i , it follows that x * ∈ H(x i , y i ) for all i. For each i, consider maximally coupled diffusions begun at (x i , y i ): applying part (ii) of Lemma 20 and letting t ↓ 0, it follows that n i is an eigenvector of S(x * ). By construction, no n i is orthogonal to any other n j . Since S(x * ) is symmetric, it follows that {n i : i = 1, . . . , d} correspond to the same eigenvalue, say λ 0 , and thus S(x * ) = λ 0 I.
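The last step uses a standard fact about symmetric matrices, spelled out here for convenience; the symbols λ i and λ j for the eigenvalues attached to n i and n j are introduced only for this illustration.

```latex
% Eigenvectors of a symmetric matrix that are not orthogonal share an eigenvalue.
\[
\lambda_i \langle n_i, n_j \rangle
  = \langle S(x^*) n_i, n_j \rangle
  = \langle n_i, S(x^*) n_j \rangle
  = \lambda_j \langle n_i, n_j \rangle ,
\]
\[
\text{so } (\lambda_i - \lambda_j)\,\langle n_i, n_j \rangle = 0
\quad\text{and}\quad
\langle n_i, n_j \rangle \neq 0 \;\Longrightarrow\; \lambda_i = \lambda_j .
\]
```

Since the n i are linearly independent and share the eigenvalue λ 0 , they span R d and S(x * ) acts as λ 0 on all of it.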
Choosing the set of mirrors H = H 0 , consider the orbit O(x * ) of x * in H. If O(x * ) = R d , then the lemma follows from the previous observation that for any z ∈ O(x * ), the set of eigenvalues of S(z) agrees with that of S(x * ).
To see this, let L be the line that passes through x 0 and y 0 . Let v 0 = (x 0 − y 0 )/|x 0 − y 0 |. Write x δ = x 0 + δv 0 and y δ = y 0 + δv 0 for all δ ∈ (−r, r). Thus the mirrors h δ = H(x δ , y δ ) ∈ H for all such δ, and the orbit of x * under reflection in {h δ : δ ∈ (−r, r)} is the whole of L. Thus L ⊆ O(x * ). Now, for any z ∈ R d , let H be a hyperplane containing the line L and the point z, and let v ′ denote its normal vector. For sufficiently small ε > 0, for all δ ∈ (−ε, ε) the mirror h ′ δ containing x * and having normal vector v δ orthogonal to v ′ and making an angle δ with v 0 lies in H. Denote by C the circle centred at x * , lying in H and passing through z. Let ẑ ∈ L ∩ C. Then the orbit of ẑ under reflection in {h ′ δ : δ ∈ (−ε, ε)} is the whole of C. In particular, z ∈ O(x * ). This shows that O(x * ) = R d and the lemma follows.
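The claim that the orbit of x * fills the line L can be made concrete in one coordinate. In the sketch below, s denotes the signed distance from x * along v 0 , so that the mirror h δ sits at s = δ; composing the reflections in h 0 and h δ is then a translation by 2δ. The function names and the step size are illustrative assumptions, not part of the paper.

```python
def reflect(s, mirror):
    """Reflect the coordinate s in a mirror located at coordinate `mirror`."""
    return 2.0 * mirror - s

def reach(target, r):
    """Carry the point 0 (that is, x*) to `target` using mirrors at offsets in (-r, r).

    Returns the final coordinate and the list of mirror offsets used.
    """
    s, mirrors = 0.0, []
    step = 0.5 * r  # mirror offset; one double reflection moves the point by r
    while abs(target - s) > 2.0 * step:
        delta = step if target > s else -step
        s = reflect(reflect(s, 0.0), delta)  # h_delta after h_0: s -> s + 2*delta
        mirrors += [0.0, delta]
    delta = 0.5 * (target - s)               # final offset, of size at most r/2
    s = reflect(reflect(s, 0.0), delta)
    mirrors += [0.0, delta]
    return s, mirrors

final_position, used_mirrors = reach(3.7, 0.25)
```

Every mirror offset used stays strictly inside (−r, r), and the point lands on the requested target up to rounding.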
Before proceeding further with the proof of Theorem 24, we record a general fact about skew-symmetric matrices which follows by spectral decomposition (Gallier, 2011).
We now show that the skew-symmetric part T (x) is a constant matrix T . Proof. The proof breaks into three steps.
Extend v 1 , v 2 to an orthonormal basis v 1 , . . . , v d of R d . Using the method of the proof of Lemma 25, construct independent vectors n i = v 2 + εv i , i = 2, . . . d, choosing ε > 0 small enough so that H(z * + n i , z * − n i )x ∈ B(y 0 , r) for all i = 2, . . . , d. Writing x i = z * + n i and y i = z * − n i , the hyperplane H(x i , y i ) lies in H 0 and the line joining z and z ′ is contained in H(x i , y i ) for all i = 2, . . . , d. Thus, H(x i , y i )z = z and H(x i , y i )z ′ = z ′ for all i = 2, . . . , d. Taking t ↓ 0 in part (iii) of Lemma 20, it follows that Together with Lemma 26, this establishes Step 1.
Step 2. There is ε > 0 such that T (z) = T (z ′ ) for all z, z ′ ∈ {w ∈ R d : dist(w, H(x 0 , y 0 )) < ε}, where dist(w, A) denotes the distance of w from the set A.
Step 3. Now we work with the set of mirrors where ε is chosen as in Step 2. For notational convenience, we write h δ = H(x 0 , y δ ), and note that . The points y δ = y 0 + δ(x 0 − y 0 )/|x 0 − y 0 | all lie on the same line through x 0 , and therefore all these mirrors have a common normal vector, which we write n * . Let (l δ , n δ ) parametrize the interface I(x 0 , y δ , ·) corresponding to starting points x 0 and y δ of the diffusions X and Y respectively. For each δ, n δ (0) = n * . Furthermore, by letting t ↓ 0 in part (iii) of Lemma 20, ṅ δ (0) = 2T ((x 0 + y δ )/2) n * .
Choose any z, z ′ ∈ R d such that z ′ = z + δ(x 0 − y 0 )/|x 0 − y 0 | for some δ ∈ (−2ε, 2ε). Set z * = h 0 z so that z = h 0 z * . Noting that z, z * , z ′ lie on the same line perpendicular to H(x 0 , y 0 ), it follows from an argument about one-dimensional reflections that z ′ = h δ z * .
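The one-dimensional reflection argument can be written out as follows. The unit vector u and the scalar m are notation introduced only for this sketch: u = (x 0 − y 0 )/|x 0 − y 0 | and m is the common value of ⟨w, u⟩ on the mirror H(x 0 , y 0 ). Replacing y 0 by y δ = y 0 + δu moves the midpoint of the two starting points by (δ/2)u and leaves the normal direction unchanged.

```latex
% Reflection in the mirror h_delta, whose level along u is m + delta/2.
\[
h_\delta w = w - 2\bigl(\langle w, u\rangle - m - \tfrac{\delta}{2}\bigr)\,u ,
\qquad
h_0 w = w - 2\bigl(\langle w, u\rangle - m\bigr)\,u .
\]
\[
h_\delta z^* = h_0 z^* + \delta u = z + \delta u = z' .
\]
```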
Lemmas 25 and 27 together are sufficient to prove Theorem 24.
Theorem 24 can be strengthened if ṅ(t) = 0 for all t, i.e., the interface translates but does not rotate in time. We state this in the following theorem. Since there is no rotation, the driving Brownian motions in the stochastic differential equation for X and Y are constant reflections of each other. So we can assume without loss of generality that l(0) = 0 and n(t) ≡ e 1 .
Theorem 28. Assume (A1) and (A2) hold for a time-homogeneous Euclidean diffusion. Suppose there exists a Markovian maximal coupling of X and Y starting from x 0 and y 0 respectively, such that the interface I(x 0 , y 0 , t) is parametrized by ((l(t), e 1 ) : t ≥ 0) with l(0) = 0. Then there are only two possibilities: Proof. Part (i) follows from the fact that the generators of Y and h 1 X are the same. To prove part (ii), we may assume (without loss of generality) that (0, ε) ⊂ Range(l) for some ε > 0. Choose the set of mirrors where, as before, y δ = y 0 + δ(x 0 − y 0 )/|x 0 − y 0 |. Then the following can be proved using part (i) of Lemma 20 and reflections in H, and arguing as in the proof of Theorem 24: where b (1) = (b 2 , . . . , b d ) ⊤ . This forces b to be of the required form.
The case of one-dimensional diffusions is a trivial consequence of the above Theorem, as noted in the next corollary.
Remark 30. Corollary 29 completely characterises all one-dimensional time-homogeneous diffusions subject to the regularity conditions (A1) and (A2) and permitting Markovian maximal couplings, even with a varying twice-continuously-differentiable diffusion coefficient σ(·) : R → [c, ∞) for some c > 0. Let X be given by and similarly for Y . Define the function and set U t = F (X t ). Then, it follows from Itô calculus that Thus, the conditions on b derived in the case σ ≡ 1 readily carry over to conditions on the drift term of (54) for general σ.
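The change of variables in Remark 30 can be sketched as follows. The displayed formulas are not reproduced in the text above, so this sketch assumes that X solves dX t = b(X t ) dt + σ(X t ) dW t and that F is the usual Lamperti transform F (x) = ∫ 0 x du/σ(u); both forms are assumptions made for illustration. Under them, Itô's formula gives U a unit diffusion coefficient:

```latex
\begin{align*}
F'(x) &= \frac{1}{\sigma(x)}, \qquad F''(x) = -\frac{\sigma'(x)}{\sigma(x)^{2}},\\
dU_t &= F'(X_t)\,dX_t + \tfrac{1}{2}\,F''(X_t)\,\sigma(X_t)^{2}\,dt\\
     &= \left(\frac{b(X_t)}{\sigma(X_t)} - \frac{\sigma'(X_t)}{2}\right)dt + dW_t .
\end{align*}
```

Since σ ≥ c > 0, the map F is strictly increasing, so X t = F −1 (U t ) and the drift of U is a function of U t alone.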

Markovian Maximal Couplings for manifolds
In this section, we analyse rigidity phenomena for Markovian maximal couplings for smooth elliptic diffusions, and demonstrate that there are powerful geometric consequences arising from a natural connection to the theory of diffusion processes on manifolds (specifically, the notion of Riemannian Brownian motion with drift). The main task of this section is to understand how the Euclidean arguments of section 2 carry over to the manifold case. In particular, the existence of Markovian maximal couplings (together with LPC) has profound rigidity consequences for the geometry of the manifold. We commence by summarizing the Riemannian geometry required to establish these consequences. Let M be a complete, connected smooth manifold of dimension d (the results which follow are actually significant even in the case when M = R d ). Following Dynkin (1965), a strong Markov process X on M is said to be a diffusion process if each C 2 function f belongs to the domain of definition of the characteristic operator L given by where N denotes a system of neighbourhoods shrinking to x, τ N denotes the first exit time from N and E denotes expectation with respect to the measure induced by the Markov process. In any local system of coordinates (x 1 , . . . , x d ), the operator L takes the form where the diffusion matrix A = {a ij } is non-negative definite and {v i } denotes the drift vectorfield. We will assume a ij and v i are smooth functions. Note that the general form of the operator does not depend on the specific choice of coordinates. We call X an elliptic diffusion if L is an elliptic operator (in other words, if A is positive-definite). As in the previous section, we deal only with elliptic diffusions. 
Following Molchanov (1975), if we furnish M with the Riemannian metric g which is given in local coordinates by g ij = (A −1 ) ij then the operator L can be rewritten in the form where ∆ M is the Laplace-Beltrami operator for the Riemannian metric, and b is the (intrinsic) drift vectorfield. When b = 0, the corresponding Markov process is called Brownian motion on M . Thus, we see that any diffusion process on M can be written as 'Brownian motion plus drift' if M is given a suitable metric. Henceforth, we will assume that M is endowed with this metric g, so that we can view M as a smooth Riemannian manifold (M, g). Note: Throughout this section, we will assume that our diffusion process X is defined for all time. This is to ensure that we are dealing with probability densities which is essential for the arguments in subsection 1.1 to go through. For Brownian motion on M , this can be resolved by ensuring that M is stochastically complete. There are a number of intrinsic geometric properties of M that ensure stochastic completeness, such as the existence of a constant lower bound on the Ricci curvature. See Hsu (2002), for example, for more details.
Let G = Iso(M ) denote the group of (global) isometries of M . This can be shown to be a Lie group (Myers and Steenrod, 1939), and it plays an important rôle in the following arguments. As M is complete and connected, any pair of points in M are connected by a geodesic. Furthermore, there are no branching geodesics in Riemannian manifolds. (More details on these geometric notions can be found in Burago et al., 2001;Chavel, 1995.)

Brownian motion with drift on the manifold
Not only can any smooth elliptic diffusion on M be written as Brownian motion with drift on (M, g), but also this permits a rather explicit geometric construction of the diffusion which facilitates the discussion of probabilistic coupling techniques, namely the Eells-Elworthy-Malliavin construction (Elworthy, 1982).
Using terminology expounded (for example) in Hsu (2002) where ue i denotes the i-th unit vector of the orthonormal frame u. This framework provides an expressive way to define smooth elliptic diffusions (and other semimartingale processes) on M , as follows.
Let b be a smooth vectorfield on M . This yields a natural vectorfield B on O(M ) given by where b i (u) = ⟨b(πu), ue i ⟩ πu (here ⟨·, ·⟩ denotes the Riemannian inner product). We will call this the lifted drift. Consider the following Stratonovich differential equation on O(M ): where W is a d-dimensional Euclidean Brownian motion. The diffusion on M with drift b is obtained simply as the projection X t = πU t . The pivotal fact justifying this construction is that we can define a second order operator on O(M ) (Bochner's horizontal Laplacian) given by for any u ∈ O(M ) such that πu = x. The generator L of the diffusion X defined at the start of section 3 satisfies for any u ∈ O(M ) such that πu = x, and any C 2 test function f on M . Note that, when b = 0, the above construction reduces to the classical Eells-Elworthy-Malliavin construction of Brownian motion on M .

Couplings of diffusions on manifolds
Once we have the above construction, a natural question to ask is: when is there a Markovian maximal coupling (MMC) for two copies of the diffusion starting from x 0 and y 0 ? In the Euclidean case there is a complete characterization of the class of time-homogeneous diffusions under LPC, which is to say, when two copies of the diffusion can be maximally coupled whenever they start from x ∈ B(x 0 , r) and y ∈ B(y 0 , r) (for B(x 0 , r) and B(y 0 , r) chosen to be two arbitrary disjoint open balls in R d ). Theorem 24 shows that the class of such diffusions is actually very small.
The proof of Theorem 24 depends strongly on a wealth of isometries of Euclidean space arising via iterated reflections. Very few other d-dimensional Riemannian manifolds have many isometries, and so we may expect an even stronger rigidity phenomenon to hold for the geometry of (non-Euclidean) manifolds on which there is a good supply of MMC. The work of this section substantiates this expectation.
We begin by recalling briefly some notions from the Euclidean case (section 2). We have noted that the Local Perturbation Condition LPC (Definition 1) makes sense for any metric space, including the Riemannian manifold case. Let X and Y be two copies of the elliptic diffusion derived from the stochastic differential equation (59), and starting from x 0 and y 0 respectively. Note that the assumptions of ellipticity and smoothness of the coefficients of L together ensure that the law of X (equivalently Y ) has a smooth positive density with respect to the Riemannian volume measure m for every positive time t > 0, which we write as p(x 0 ; t, z), p(y 0 ; t, z) for t > 0, z ∈ M .
We note here that all the results in subsection 1.1 carry over to the manifold setting with (M, dist) being the Riemannian manifold (with the distance dist induced by the Riemannian metric) and m taken to be the volume measure.

The interface
Varadhan small-time asymptotics and Lemma 3 can be used to show the following: that the existence of an MMC implies that, for each time t, there is a deterministic involutive isometry F t which exchanges X t with Y t and fixes the set of points equidistant from both X t and Y t . This generalizes the time-varying reflection isometry of Euclidean space which is mentioned in Remark 11; the fixed-point set of F t corresponds to the 'evolving mirror' of the Euclidean case.
The rôle of Varadhan's small-time asymptotics in the following is analogous to the rôle of Lemma 7 in the Euclidean case. This powerful technique gives the logarithmic asymptotics of the density of X t when t ↓ 0, as stated in the following lemma.
Lemma 31. Let M 1 and M 2 be compact subsets of M . Then the density p of X t satisfies the following: uniformly for all x, y ∈ M 1 × M 2 , where dist(x, y) is the Riemannian distance between x and y.
This result was proven by Varadhan (1967) for diffusion processes on Euclidean space. Later Molchanov (1975) noticed that Varadhan's arguments carry over to diffusions on closed manifolds whose generators are of the form L = 1 2 ∆ M + b. Molchanov also showed that this result could be extended to general smooth complete manifolds by introducing a reflected diffusion in a suitably large domain U ⊂ M containing x and y, with the same generator L inside, and using this process to define a natural diffusion on the 'double' U . He then showed that smoothing techniques allowed the approximation of the 'double' U by a smooth closed manifold, such that the diffusion thus defined has a density that is sufficiently close to that of the original one (Molchanov, 1975, p. 18 and further references).
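For orientation, the logarithmic asymptotics can be checked by hand in the simplest case. The display of Lemma 31 is not reproduced above; the sketch below assumes it takes the standard form 2t log p(x; t, y) → −dist(x, y) 2 as t ↓ 0, and tests this for one-dimensional standard Brownian motion, whose transition density is explicit. The function names are illustrative.

```python
import math

def log_heat_kernel(t, x, y):
    """Log transition density of one-dimensional standard Brownian motion."""
    return -0.5 * math.log(2.0 * math.pi * t) - (x - y) ** 2 / (2.0 * t)

def varadhan_quantity(t, x, y):
    """The quantity 2 t log p(x; t, y), expected to approach -dist(x, y)^2."""
    return 2.0 * t * log_heat_kernel(t, x, y)

# Evaluate at decreasing times for the illustrative pair x = 0, y = 1.5.
samples = [(t, varadhan_quantity(t, 0.0, 1.5)) for t in (1e-2, 1e-4, 1e-6)]
```

The correction term is −t log(2πt), which vanishes as t ↓ 0, so the sampled values approach −2.25.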
We can now restate the pivotal Theorem 10 from subsection 2.1 in the new context of manifolds. The proof of the manifold case follows that of the Euclidean case, but uses Lemma 31 in place of Lemma 7, and uses the strong maximum principle (Lemma 9) in local coordinates; we omit details.
Theorem 32. For any (x, y) ∈ M(µ s ), and any s > 0, the following equalities hold: Let τ ′ = inf{s > 0 : X s ∈ I(x 0 , y 0 , s)} be the first time that X hits the interface. Then the following holds.
Corollary 33. Almost surely τ ′ = τ , so coupling occurs when X first hits the interface. Furthermore, for all t < τ , Proof. The proof follows the lines of the proof of Corollary 12. The only additional detail that we have to check here (which was immediate in the Euclidean case) is that, for any t > 0 with X t ≠ Y t , any z ∈ H(X t , Y t ) and any rational sequence t n ↓ t, there is z n ∈ H(X tn , Y tn ) such that z n → z. (This was used in Corollary 12 to show H(X t , Y t ) ⊆ I(x 0 , y 0 , t).)
The rest of the proof carries over verbatim from that of Corollary 12.
The striking fact that emerges from the above is that, almost surely under the coupling µ, for each t > 0, H(X t , Y t ) is a non-random set which depends only on t and not on the specific location of (X t , Y t ). We will call this set H t henceforth. Similarly, denote H + t = H + (X t , Y t ) and H − t = H − (X t , Y t ). The family {H t : t ≥ 0} corresponds to the family of moving mirrors from section 2.
We now follow Kuwada (2009)'s construction to define a deterministic global involutive isometry F s which fixes H s and maps X s to Y s under the coupling. The argument of Kuwada (2009, Lemma 4.6) applies directly to our case: we therefore omit proof.
Lemma 34. Take s ≥ 0. If x, y ∈ M , with x ≠ y, satisfy (so x and y lie in opposite "half-manifolds"). Furthermore, for any x ∈ M , a point y ∈ M \{x} satisfying (63) is unique if it exists.
Whenever such a y exists, we will call y the mirror image of x at time s. With the aid of the above lemma, the isometry F s is constructed using a procedure which is similar to Kuwada (2009, Theorem 4.5), but is subject to some modification as described in the following lemma and its proof. For x ∈ A s , define F s (x) to be the unique y for which (63) holds. Following the proof of Kuwada (2009, Theorem 4.5), the set Â s = A s ∪ H s is closed. Furthermore, by Theorem 32, on the event [0 < s < τ ] the support of X s (respectively Y s ) is the whole of H − s (respectively H + s ). This, by Lemma 3 and Theorem 32, implies Â s = M for all s > 0. A little more argument is required for s = 0. By Theorem 32, for any x ∈ H − 0 , there is a sequence t n ↓ 0 and x n → x such that x n ∈ A tn with y n ∈ M being its mirror image at time t n , for all n. Take any z 0 ∈ H 0 . Following the proof of Corollary 33, for sufficiently large n there is z n ∈ γ ∩ H tn such that z n → z 0 . As dist(x n , z n ) = dist(y n , z n ), it follows that the set of distances {dist(z 0 , y n )} n≥1 is bounded. Consequently the properness of M implies that there is a subsequence {n k } such that y n k → y for some y ∈ M . Now, for any z ∈ H 0 , take z ′ n ∈ H tn such that z ′ n → z. Thus, This implies Â 0 = M . Note that, by Lemma 34, the limit y is uniquely determined by x and H 0 , and thus does not depend on the subsequence chosen. This implies y n → y. Thus F s is defined on the whole of M for every s ≥ 0. Following Kuwada (2009, Lemma 5.3), we infer that F s so defined is a global involutive isometry that fixes H s . Now F s (H − s ) = H + s follows from Lemma 34. This proves the lemma.
Following Petersen (2006, Chapter 10, Proposition 24), since H s is the fixed-point set of an isometry, each connected component of H s is a totally geodesic submanifold (in particular, a smooth submanifold). Furthermore, as H s partitions M into two disjoint open subsets, it can be verified (for example by referring to normal coordinates based around a point in H s ) that H s must be of codimension 1. This discussion also implies that for any x, y ∈ M there is at most one isometry whose set of fixed points is the set H(x, y). We will refer to this isometry, if it exists, as f x,y . In fact Lemmas 34 and 35 together imply that for any s ≥ 0 there does indeed exist such an f x,y for each (x, y) ∈ M(µ s ), given by f x,y = F s .
To get an intuitive picture of how F s acts locally around a point x * ∈ H s (hence, fixed by F s ), recall that d F s : T x * M → T x * M is a linear isometry. We can form an orthonormal basis e 1 , . . . , e d of T x * M such that e 1 , . . . , e d−1 form a basis of the tangent space T x * H s viewed as a subspace of T x * M . Because H s is totally geodesic, these vectors correspond to geodesics through x * that stay in H s . As H s is the fixed-point set of F s , the basis vectors e 1 , . . . , e d−1 must be fixed by d F s , while e d is mapped by d F s to −e d . Thus, locally, one geodesic passing through x * is inverted by F s , while geodesics starting in directions orthogonal to the inverted geodesic are fixed by F s .

Structure of the manifold M
In this section, we will use the isometries f x,y constructed above for every pair of points x ∈ B(x 0 , r) and y ∈ B(y 0 , r) to show that the underlying manifold M is homogeneous (i.e. the isometry group acts transitively) and isotropic about a chosen point x * (i.e. there are d(d − 1)/2 independent rotations about x * ). This will imply that M is a maximally symmetric space, i.e. the isometry group G of M has the maximal dimension possible (namely, d(d + 1)/2) for any d-dimensional manifold. It is an almost immediate consequence that the space M can be classified (up to scaling) as one of the three model space forms of constant curvatures respectively −1, 0, and +1.
Proof. We want to show that G acts transitively on M . Together with LPC, the work of the previous subsection shows that for each x ∈ B(x 0 , r) and y ∈ B(y 0 , r), there exists an involutive isometry f x,y . This implies that, for any x ∈ B(x 0 , r), there is an isometry G x 0 ,x = f y 0 ,x • f x 0 ,y 0 which takes x 0 to x. Consider the set of isometries Let H be the closure of the subgroup generated by I, so H is a closed subgroup of G. Denote by O(x 0 ) the orbit, or set of equivalent points, of x 0 under H. By construction, B(x 0 , r) ⊆ O(x 0 ). In order to prove that M is homogeneous, we need to prove O(x 0 ) = M , which we will show by proving that O(x 0 ) is both open and closed in M . Let z be a limit point of O(x 0 ). Then, there is a sequence of isometries G n ∈ H such that G n (x 0 ) → z. By Myers and Steenrod (1939, p. 7), there exists an isometry G ∈ H and a subsequence G n k ∈ H such that G n k → G in the topology of isometries (i.e. G n k (x) → G(x) for all x ∈ M ), and consequently, G(x 0 ) = z. This shows that O(x 0 ) is closed. On the other hand, if y ∈ O(x 0 ), then there is an isometry G ∈ H such that y = G(x 0 ). Therefore, In the following lemma, we will write x * for the midpoint of a minimal geodesic γ x 0 ,y 0 connecting x 0 and y 0 . If two vectors u, v belong to the same tangent space then we denote the angle between them by ∠(u, v).
Proof. Let γ(v) denote the geodesic issuing from x * in direction v. Suppose γ(v 0 ) = γ x 0 ,y 0 , thus defining a unit vector v 0 . The proof proceeds in three steps as follows.
Step 1. First, we want to show that there is ε > 0 such that, for any v ∈ T x * M with ∠(v, v 0 ) < ε, there is an isometry g v leaving x * fixed and d g v (v 0 ) = v.
By continuity of geodesics in the starting direction, we can choose ε > 0 sufficiently small so that γ(v ′ ) intersects B(x 0 , r) and γ(−v ′ ) intersects B(y 0 , r) whenever ∠(v ′ , v 0 ) < ε. By Petersen (2006, Proposition 20, p. 141), we can take x v ′ ∈ γ(v ′ ) ∩ B(x 0 , r) and y v ′ ∈ γ(−v ′ ) ∩ B(y 0 , r) such that γ(v ′ ) realises the distance dist(x * , x v ′ ) and γ(−v ′ ) realises the distance dist(x * , y v ′ ). Furthermore, by continuity of the metric, we can take dist(x v ′ , x * ) = dist(y v ′ , x * ). Thus, from the developments of the previous subsection, there is an involutive isometry f x v ′ ,y v ′ which fixes x * , inverts the geodesic passing through x * in direction v ′ , and fixes all the geodesics which pass through x * in directions orthogonal to v ′ . Now, take any unit vector v ∈ T x * M with ∠(v, v 0 ) < 2ε. Let v ′ = (v + v 0 )/|v + v 0 |. By the properties of equal-sided parallelograms, ∠(v ′ , v 0 ) = (1/2)∠(v, v 0 ) < ε, and thus f x v ′ ,y v ′ exists as specified in the preceding paragraph. Now, consider the isometry g v = f x v ′ ,y v ′ • f x 0 ,y 0 . Note that g v fixes x * and a straightforward calculation reveals d g v (v 0 ) = v. This g v is our required isometry.
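The straightforward calculation can be checked numerically in the tangent space, where the differentials of the two involutions act as Euclidean reflections in the hyperplanes orthogonal to v 0 and to the bisector v ′ . The sketch below is an illustration in R 3 with an arbitrarily chosen unit vector v; the helper names are assumptions made for this example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

def reflect(w, n):
    """Reflect w in the hyperplane through the origin orthogonal to the unit vector n."""
    c = 2.0 * dot(w, n)
    return [wi - c * ni for wi, ni in zip(w, n)]

v0 = [1.0, 0.0, 0.0]
angle = 0.3  # illustrative angle between v and v0
v = [math.cos(angle), 0.6 * math.sin(angle), 0.8 * math.sin(angle)]  # unit vector
bisector = normalize([a + b for a, b in zip(v, v0)])  # v' = (v + v0)/|v + v0|

# First invert v0 (the differential of f_{x0,y0}), then reflect across the bisector.
image = reflect(reflect(v0, v0), bisector)
```

The resulting `image` coincides with v up to rounding error, and the angle between the bisector and v 0 is half of `angle`.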
Step 2. Take any unit vector w ∈ T x * M such that w and v 0 are linearly independent. Let Π be the two-dimensional subspace of T x * M generated by v 0 and w and denote by S(v 0 , w) the circle in T x * M centred at the origin of T x * M and running through v 0 and w. Let U be a normal neighbourhood around x * . Let S Π = exp x * (Π ∩ U ) denote the two-dimensional fragment of M corresponding to Π and lying in U .
Denote by H(v 0 , w) the closed subgroup of isometries generated by {g v : v ∈ S(v 0 , w), ∠(v, v 0 ) < ε}, where g v are the isometries constructed in Step 1. Note that H(v 0 , w) fixes x * and keeps vectors orthogonal to {v 0 , w} fixed. Let Note that, if v n = d g n (v 0 ) such that v n → v, then, by the fact that g n (x * ) = x * for all n, we can choose a subsequence g n k and a g ∈ H(v 0 , w) such that g n k → g in the topology of isometries (Myers and Steenrod, 1939, p. 7). Thus, by Myers and Steenrod (1939, Lemma 4 Thus, in particular, the subgroup of isometries G x * which fix x * (the isotropy group at x * ) generates all the rotations of T x * M based at x * in 2-planes containing v 0 . We describe the isometries in H(v 0 , w) as rotations in S(v 0 , w).
Step 3. We will now show that, given two ordered orthonormal frames based at T x * M , there is a sequence of isometries in G x * that take one to the other. In particular this implies that M is isotropic at x * . Let (e 1 , . . . , e d ) and (e ′ 1 , . . . , e ′ d ) be ordered orthonormal frames in T x * M . We can apply rotations in S(v 0 , e 1 ) (respectively S(v 0 , e ′ d )) to align e 1 with v 0 (respectively e ′ d with v 0 ). Thus, without loss of generality, we consider frames of the form (v 0 , e 2 , . . . , e d ) and (e ′ 1 , . . . , e ′ d−1 , v 0 ). Now, apply a rotation in S(v 0 , e ′ 1 ) to transform (v 0 , e 2 , . . . , e d ) to (e ′ 1 , e are linearly independent, then apply a rotation in 2 ), to bring (e ′ 1 , e 3 , . . . , e 2 = −v 0 , then achieve the same result using the reflection f x 0 ,y 0 . Note that these operations both keep e ′ 1 fixed as it is orthogonal to {v 0 , e (1) 2 }. The same procedure is applied inductively to (v 0 , e 3 , . . . , e (2) d ) to obtain (e ′ 1 , v 0 , e 3 , . . . , e 4 , . . . , e d ) (note that these operations leave e ′ 1 fixed), and so on. Finally we obtain (e ′ 1 , . . . , e ′ d−1 , v 0 ), which proves the lemma.
The above two lemmas imply the following rigidity theorem which completely classifies the space M .
Theorem 38. Suppose that the complete, connected Riemannian manifold M supports Brownian motion with drift for which there is a Markovian maximal coupling and moreover LPC holds. Then M has constant sectional curvature. Moreover M must be simply connected and therefore (up to scaling) M must be one of the three model spaces R d , S d and H d .
Proof. By Lemmas 36 and 37, we see that M is a maximally symmetric space, i.e., the dimension of Iso(M ) is d(d + 1)/2 (Sharan, 2009, p. 195). In particular, this implies that M has constant sectional curvature (Petersen, 2006, p. 190). For the second part of the theorem, the argument of Petersen (2006, p. 190) shows that a complete, connected maximally symmetric Riemannian manifold must be one of the three model spaces above, or RP d . But, as observed in Kuwada (2009, Example 6.4), there is no involutive isometry of RP d of the form described in Lemma 35. This proves the theorem.

Evolution of the mirror isometries
Having classified the space M , we must now classify the set of drift vectorfields b which permit MMC with LPC. This necessitates analysis of the evolution of the isometries F s as s varies. As noted above, Myers and Steenrod (1939) proved that the set of isometries G has the structure of a Lie group. The first objective is to prove that the curve of isometries (F s : s ≥ 0) is a C 1 curve in this Lie group.
Lemma 39. The curve s → F s is a C 1 curve in the Lie group G.
Proof. Recall that any point in M has a neighbourhood, called a σ-neighbourhood, such that any point in this neighbourhood is in a normal coordinate ball of any other point in the same neighbourhood. We study continuity and continuous differentiability of (F s : s ≥ 0) at s = t. As we are investigating a local property, we work in two separate sets of normal coordinates; one set describing a σ-neighbourhood U around x and the other set describing another σ-neighbourhood V around F t (x). The first step is to prove that s → F s is continuous in G at s = t < τ . To show this, it suffices to show that any set of d + 1 points x i ∈ M , all of which lie in a σ-neighbourhood and are linearly independent (i.e. do not belong to the same (d − 1)-dimensional geodesic hypersurface), produces continuous curves s → F s (x i ) in M (Myers and Steenrod, 1939). We will use the continuity of the diffusion paths and the fact that, by Corollary 33, Y s = F s (X s ) when s < τ . Take x ∈ H + t such that (x, F t (x)) ∈ M(µ t ). Since the set of such x is dense in H + t , and the F s are isometries, it suffices to prove F s (x) → F t (x) as s → t in order to prove continuity of the isometry-valued function s → F s . First address the question of right-continuity at t > 0. With σ-neighbourhoods U , V of x, F t (x) as described above, let τ U = inf{s ≥ t : X s ∉ U }. Because the coupling is Markovian, τ U is a stopping time with respect to the filtration generated by the coupling process (X, Y ). Consider the stopped processes X U s = X s∧τ U and Y U s = Y s∧τ U . In a slight abuse of notation, we use the same notation X U s for the coordinate representation for this stopped process in U , and similarly for Y U s . Also we continue to write F s for the coordinate representation of F s : U → V . Now, using Corollary 33 and conditioning on the event [ Let ‖·‖ represent the Euclidean norm for the normal coordinate representation in V . 
By continuity of normal coordinates, there exists a constant C > 0 with ‖F s (X U s ) − F s (x)‖ ≤ C dist(F s (X U s ), F s (x)) = C dist(X U s , x) (where the second step uses the isometry property of F s ). By path continuity of X and Y , the left hand side and the first term on the right hand side both tend to zero as s ↓ t. Thus lim s↓t ‖F s (x) − F t (x)‖ = 0, proving right-continuity. Left-continuity follows by a similar argument involving path continuity of the backward process (X s∨σ U :

It is necessary to address the question of right-continuity at t = 0. Take x ∈ H 0 and consider the case when t n → 0. Take a sequence x n → x such that (x n , F tn (x n )) ∈ M(µ tn ). An argument following the treatment of the case s = 0 in the proof of Lemma 35 shows that F tn (x n ) → F t (x). As F tn is an isometry for each n, we can deduce that F tn (x) → F 0 (x), thus proving right-continuity.

The next step is to prove differentiability at t > 0. By Lemma 8 of Myers and Steenrod (1939) it suffices to prove differentiability at t of the continuous curve s → F s (x) for x ∈ H + t such that (x, F t (x)) ∈ M(µ t ). Take U , V and normal coordinate systems for x and F t (x) as above. Using these coordinates, we may write the stochastic differential equation for X U as for some Brownian motion W in U . A similar expression holds for Y U . General properties of diffusions (Nelson, 1967, Chapter 11) yield the following expressions in coordinate form: By Corollary 33, Y s = F s (X s ) when s < τ . So the third expression above gives us But s → F s is a continuous curve in G, so by Myers and Steenrod (1939, Lemma 7) we may deduce that the (space) derivatives of F s in the above expression are continuous in s. Thus we may apply (64) to deduce that the curve s → F s (x) has a continuous right-derivative given by This, together with Bruckner (1978, Theorem 1.3), implies uniformly continuous differentiability of s → F s (x) at t > 0.
Note that the Mean Value Theorem and right-continuity of the right hand side of (65) now give us right-differentiability at t = 0. This proves the lemma.
Corollary 40. All the partial derivatives with respect to x of (s, x) → F s (x) are continuously differentiable in s. Furthermore, $\frac{d}{ds}\big|_{s=t} F_s(x)$ is smooth in x.

Proof. Using the argument of Myers and Steenrod (1939, Section 8), we can deduce the following representation in local coordinates (x i ): where Ψ is a smooth function and x 0 , . . . , x d are fixed points in M . The corollary follows from this representation and the previous lemma.
The derivative vectorfield κ defined on M by $\kappa(x) = \frac{d}{ds}\big|_{s=0} F_s(F_0(x))$ possesses a special significance. This is the Killing vectorfield corresponding to the C 1 curve s → G s in G given by G s (x) = F s (F 0 (x)) for x ∈ M . Vectorfields of this form correspond to the natural action of elements in the Lie algebra of G on the manifold M (recall that F 0 • F 0 is the identity map, and the Lie algebra of G corresponds to the tangent space of G at the identity). Killing vectorfields will play a crucial rôle in the following subsections.
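To make the relation between a curve of involutive isometries and its Killing vectorfield concrete, here is a minimal numerical sketch in the Euclidean plane. The family F s used here (reflection in the line through the origin at angle s) is a hypothetical example chosen for illustration, not the family constructed in the text, and the function names are likewise illustrative.

```python
import numpy as np

def reflection(theta):
    """Reflection of R^2 in the line through the origin at angle theta (an involutive isometry)."""
    c, s = np.cos(2.0 * theta), np.sin(2.0 * theta)
    return np.array([[c, s], [s, -c]])

def curve_through_identity(s):
    """G_s = F_s o F_0 for the hypothetical family F_s = reflection(s); G_0 is the identity."""
    return reflection(s) @ reflection(0.0)

def killing_generator(h=1e-5):
    """Central-difference estimate of d/ds G_s at s = 0, as a matrix A with kappa(x) = A x."""
    return (curve_through_identity(h) - curve_through_identity(-h)) / (2.0 * h)
```

In this example G s is rotation by the angle 2s, so the estimated generator is a skew-symmetric matrix and κ(x) = A x is an infinitesimal rotation, which is a Killing vectorfield of the plane.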

Structure of the coupling
The processes X and Y can be constructed as projections X t = πU t and Y t = πV t , where U and V are solutions to Stratonovich stochastic differential equations which are defined on the orthonormal frame bundle O(M ) by

Proof. Let γ be the unit speed geodesic in M starting from πu in direction ue i , defined on some interval [0, ε] for some ε > 0. For each 1 ≤ j ≤ d, let u j t denote the parallel transport of ue j along γ. Define the curve γ u in O(M ) given by γ u (t) = (γ t , u 1 t , . . . , u d t ) for t ∈ [0, ε]. Because F is an isometry it follows that .
proving the lemma.
The stochastic differential equation (66) for U delivers a diffusion V on O(M ) given by where F t is the time-varying deterministic involutive isometry constructed in previous subsections. Note that this automatically implies Y t = F t (X t ) = πV t on t < τ . Thus, V lifts Y up to the orthonormal frame bundle O(M ). We now derive the stochastic differential equation for V . From Kendall (1987, Equation (2.3)) it follows that where we have used Lemma 41 and the fact that F t 2 = Id in the last step.
Thus, the stochastic differential equation for V takes the form Considering differentiation along the curve γ u introduced in the proof of Lemma 41, it can be seen that dπ(H i (u)) = ue i .
Also, as F t is an involutive isometry, where F t * b is the pushforward of the vectorfield b on M by the isometry F t .
Finally, writing for x ∈ M , note that, for u ∈ O(M ) and a smooth function f : M → R, Thus, writing for x ∈ M , we obtain d π(χ t (F t (u))) = κ t (πu) .
Note that κ t is the Killing vectorfield corresponding to the C 1 curve of isometries (F s • F t : s ≥ t), as introduced at the end of subsection 3.5.
Using the above relations, we can project down the stochastic differential equation (69) for V onto M as follows.
From the above expression, we see that the generator of Y at (t, x) is Comparing this with (60), we deduce the following important relation:

Theorem 42. For a Markovian maximal coupling (X, Y ) to exist from starting points (x 0 , y 0 ), the following relation must hold: for all x ∈ M and t ≥ 0, where (F s : s ≥ 0) is the C 1 curve of isometries introduced in Lemma 35.

Classification of the drift
Finally it is possible to produce a complete characterization of the drift b under LPC. Recall that M can only be a scaled version of one of the model spaces S d , H d or R d corresponding to the curvature K being constant and equal to +1, −1, or 0. For this section, special attention is paid to the equation (71) at time 0. When the context makes it plain there is no ambiguity, we will write F for F 0 and κ for κ 0 .
Let ∇ represent the covariant derivative with respect to the Riemannian connection compatible with the metric g. We will need the following useful fact about Killing vectorfields (Petersen, 2006, Prop. 27).
Lemma 43. If κ is a Killing vectorfield, then for any x ∈ M and any u ∈ T x M , Isometries take geodesics to geodesics, so any Killing vectorfield is a Jacobi field, i.e. the variation field of a variation through geodesics. Thus, Killing vectorfields satisfy the Jacobi equation, as given by the following lemma (Lee, 1997, Theorem 10.2).
Because of Theorem 38, we can confine attention to the case when M is of constant curvature K, in which case there is a simple representation for the curvature tensor R (Lee, 1997, Lemma 8.20): We now define the symmetric 2-form associated with the drift vectorfield b: for u, v ∈ T x M , The following lemma describes this symmetric 2-form S x under LPC.
Lemma 45. Under LPC, there is a scalar λ such that, for all x ∈ M and all u, v ∈ T x M , Proof.
Recall that x * is the midpoint of a minimal geodesic connecting x 0 and y 0 . Let {e 1 , . . . , e d } denote the canonical orthonormal frame of T x * M . From previous discussions, F 'inverts' one geodesic through x * (the minimal geodesic joining x 0 and y 0 ) and keeps all geodesics orthogonal to this one fixed. Let n ∈ T x * M denote the direction of the inverted geodesic. Now, consider any isometry G that satisfies for some Killing vectorfield κ, for all x ∈ M . Then, it follows that for any u, v ∈ T x * M , which, along with Lemma 43, yields In particular, equation (71) at time t = 0 gives where (78) follows from (77) by noting that F fixes x * and F −1 = F . Let S(x * ) denote the matrix (S(x * )) ij = S x * (e i , e j ).
Using the description above of F as 'inverting' the geodesic with tangent vector n at x * , and leaving orthogonal geodesics at x * fixed, (78) yields By LPC, we can choose d pairs of starting points {(x i , y i ) : x i ∈ B(x 0 , r), y i ∈ B(y 0 , r), 1 ≤ i ≤ d} such that the directions of the inverted geodesics n i (for 1 ≤ i ≤ d) based at x * form d linearly independent vectors in T x * M . Now, noting from equation (79) that n i are eigenvectors of S(x * ), we find S(x * ) = λ(x * ) I for some scalar λ(x * ). In coordinate-free terms, this is the assertion of the lemma at point x * . Now, we want to show that the assertion of the lemma holds at any x ∈ M . Denote Z = {G ∈ G : G satisfies (76) for some Killing vectorfield κ and all x ∈ M }.
Recall that (77) holds for all G ∈ Z. Thus, by (80), we get By continuity of the map in the topology of isometries (Myers and Steenrod, 1939, Lemma 4), (77) holds for all G ∈ Z̄, where Z̄ denotes the closed subgroup generated by Z.
Now, from the developments in subsection 3.3, observe that, under LPC, for any x ∈ B(x 0 , r) and y ∈ B(y 0 , r), there exists a unique involutive isometry f x,y whose fixed point set is exactly the set H(x, y). These isometries satisfy (76) as this equation corresponds to (71) at time t = 0 when the starting points of X and Y are taken to be x and y respectively. Furthermore, exactly along the lines of the proof of Lemma 36, we see that the orbit of x * under the closed subgroup of isometries generated by {f x,y : x ∈ B(x 0 , r), y ∈ B(y 0 , r)} is the whole of M . In particular, the orbit of for all u, v ∈ T x M , proving the lemma.
Now we describe the drift vectorfield along geodesics issuing from x * , the midpoint of a minimal geodesic joining x 0 and y 0 . In the following, we will denote the canonical orthonormal basis of T x * M by {e 1 , . . . , e d }. Also, for any vector u ∈ T x * M and any d × d matrix T , T u will denote the vector obtained by matrix multiplication when we identify T x * M with R d .
Lemma 46. If the drift vectorfield b permits MMC with LPC, then it must satisfy the following. Let x * ∈ M be the midpoint of a minimal geodesic connecting x 0 and y 0 and u, v ∈ T x * M be unit vectors with u ⊥ v. Let γ represent the geodesic issuing from x * in direction u and let V t represent the parallel transport of v along γ. Then the following holds.
where λ is as in Lemma 45, and where the matrix T given by $T_{ij} = \langle \nabla_{e_i} b(x^*), e_j \rangle$ is skew-symmetric.
Proof. To see (81), note that Take any x ∈ B(x 0 , r) and y ∈ B(y 0 , r) such that x * ∈ H(x, y). Since H(x, y) is the fixed point set of an isometry, it is therefore a totally geodesic submanifold of M . Let (F, κ) denote the isometry and the Killing vectorfield for which (71) holds at time t = 0. Take any unit speed geodesic γ passing through x * and lying in H(x, y). (Note that, if a geodesic lies in H(x, y) for a short time, it should lie in H(x, y) for all time. See, for example, the proof of Proposition 24 of Petersen, 2006, p. 145.) Let (n t : t ≥ 0) be the parallel transport of the vector normal to the hypersurface H(x, y) at x * along the geodesic γ. Note that, as H(x, y) is totally geodesic, the second fundamental form vanishes identically on H(x, y) (Lee, 1997, Exercise 8.4). This fact implies that parallel transportation of a vector v ∈ T x * H(x, y) with respect to the induced metric on H(x, y) agrees with parallel transportation of v in the ambient manifold M (Lee, 1997, Lemma 8.5). Thus, n t is precisely the direction that is reversed at γ(t) by the involutive isometry corresponding to H(x, y).
Differentiating the above twice with respect to t along the geodesic γ, and using the fact that ∇ γ̇(t) n t = 0 because n t was defined using parallel transport along γ, we obtain , n t (using D t as shorthand for covariant differentiation ∇ γ̇ along the geodesic γ) which, along with (73) and (74), gives Consequently equation (83) shows that the function t → ⟨b(γ(t)), n t ⟩ satisfies the following differential equation

For any geodesic γ passing through x * , not necessarily lying in H(x, y), and for any parallel vectorfield V t along γ orthogonal to γ̇ t , a similar technique uses (71), (73) and (74) to give us

Now, following the lines of the proof of Lemma 37, we can iteratively compose the isometries in S = {f x,y ∈ G : x ∈ B(x 0 , r), y ∈ B(y 0 , r), dist(x, x * ) = dist(y, x * ) = ½ dist(x, y)} to deduce that the closed subgroup of isometries G * generated by S is the whole isotropy group of x * in G. Take any geodesic γ passing through x * and lying in H(x, y) for some x ∈ B(x 0 , r), y ∈ B(y 0 , r) and let n t denote the parallel vectorfield along γ that is inverted by f x,y . Let V t be any parallel vectorfield along γ that is orthogonal to γ̇ t . We can apply compositions of isometries in S which fix γ to obtain an isometry F ∈ G which fixes γ and satisfies F * n t = V t . Applying (86) at each such composition, we get By (85), the left hand side of the above is zero. Thus, the right hand side should vanish too. But for any other geodesic σ passing through x * we can again apply composition of isometries in S to get G ∈ G * such that σ = G • γ. So (86) and (87) now yield the crucial observation that, for any geodesic γ passing through x * and for any parallel vectorfield V t along γ which is orthogonal to γ̇ t , we have $\frac{d^2}{dt^2} \langle b(\gamma(t)), V_t \rangle + K \langle b(\gamma(t)), V_t \rangle = 0$.
Solving the above equation gives (82) for the given matrix T . The fact that T is skew-symmetric follows from the observation that S(e i , e j ) = 0 for orthogonal e i and e j (by Lemma 45) and therefore $\langle \nabla_{e_i} b(x^*), e_j \rangle = \tfrac{1}{2}\left( \langle \nabla_{e_i} b(x^*), e_j \rangle - \langle \nabla_{e_j} b(x^*), e_i \rangle \right)$.
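The second-order equation f'' + K f = 0 that appears at the end of the proof has a standard closed-form solution in each curvature regime. The following is a minimal sketch of that solution together with a finite-difference residual check; the function names and the initial data f0, df0 are illustrative assumptions, and the sketch does not attempt to reproduce the exact form of (82).

```python
import math

def constant_curvature_solution(K, f0, df0, t):
    """Closed-form solution of f'' + K f = 0 with f(0) = f0 and f'(0) = df0."""
    if K > 0:
        w = math.sqrt(K)
        return f0 * math.cos(w * t) + df0 * math.sin(w * t) / w
    if K < 0:
        w = math.sqrt(-K)
        return f0 * math.cosh(w * t) + df0 * math.sinh(w * t) / w
    # Flat case: the solution is affine in t.
    return f0 + df0 * t

def ode_residual(K, f0, df0, t, h=1e-4):
    """Finite-difference estimate of f''(t) + K f(t); close to zero for a true solution."""
    f = lambda s: constant_curvature_solution(K, f0, df0, s)
    second_derivative = (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)
    return second_derivative + K * f(t)
```

Here f(t) plays the role of the component of the drift along a parallel vectorfield orthogonal to the geodesic, so f0 and df0 correspond to the value and first covariant derivative of that component at the base point.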
Since M is a maximally symmetric space (by Theorem 38), the dimension of its space of Killing vectorfields is d(d + 1)/2. Thus, for any vector w ∈ T x * M and any skew-symmetric matrix T , there exists a unique Killing vectorfield K with K(x * ) = w and $\langle \nabla_{e_i} K(x^*), e_j \rangle = T_{ij}$. Moreover, as every Killing vectorfield is a Jacobi field (i.e. satisfies (73)), it follows that K satisfies the following equation analogous to (82), for unit vectors u, v ∈ T x * M with u ⊥ v.
Thus, if we set K x * as the Killing vectorfield uniquely determined by w = b(x * ) and $T_{ij} = \langle \nabla_{e_i} b(x^*), e_j \rangle$, we see from Lemmas 45 and 46 that the vectorfield b can be written as where D λ x * is the dilation vectorfield about x * with dilation coefficient λ defined as for any geodesic γ issuing from x * . Now, we claim that dilation vectorfields do not arise in the case of non-zero curvature.
Proof. Under LPC, the description of b given in Lemma 46 holds for x * replaced by x ∈ B(x * , ρ) for some ρ > 0. Take any two points x 1 , x 2 ∈ B(x * , ρ) with x 1 ≠ x 2 . Lemmas 45 and 46, applied at x 1 and x 2 , show that b satisfies b = D λ 1 + K 1 = D λ 2 + K 2 where K 1 and K 2 are Killing vectorfields and D λ 1 and D λ 2 are dilation vectorfields with the same coefficient λ about x 1 and x 2 respectively.
Denote by σ the geodesic issuing from x 2 and passing through x 1 , and set γ to be a geodesic issuing from x 2 in a direction orthogonal to σ. Locate z = γ(dist(x 1 , x 2 )). Finally, denote the geodesic issuing from x 1 and passing through z by η. Consider the geodesic triangle ∆ formed by x 1 , x 2 and z. Thus, the sides of ∆ are formed by the geodesics σ, γ and η. Now, recall that the curvature K can also be interpreted in terms of the rate at which geodesics diverge when they issue from a point in different directions. Thus (Maubon, 2004, Proposition 2.6) we see that if x 1 is taken sufficiently close to x 2 , then dist(x 1 , z) Applying the triangle version of the Toponogov comparison theorem (Petersen, 2006, Theorem 79, p. 339), we see that the interior angle θ formed at the vertex z of ∆ satisfies θ ≥ π/4 if K > 0 and θ ≤ π/4 if K < 0. But (91) implies $\langle D^{\lambda}_1(z), \dot\gamma(\mathrm{dist}(x_1, x_2)) \rangle = \langle D^{\lambda}_1(z), \dot\eta(\mathrm{dist}(x_1, z)) \rangle \cos\theta = \lambda\, \mathrm{dist}(x_1, z) \cos\theta$.
Thus, if λ > 0, we get and the inequalities are reversed if λ < 0.
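The angle comparison used in the proof can be checked numerically in the model spaces. The sketch below uses the standard right-triangle identities of spherical and hyperbolic trigonometry (Napier's rules and the corresponding Pythagorean relations) for a geodesic triangle with a right angle at x 2 and two legs of equal length a. These identities and the helper names are assumptions of the sketch and are not taken from the text, which argues via the Toponogov comparison theorem.

```python
import math

def base_angle(K, a):
    """Angle at z in a right isosceles geodesic triangle with legs of length a,
    in the model space of constant curvature K (needs sqrt(K) * a < pi / 2 when K > 0)."""
    if K > 0:
        # Spherical Napier rule: tan(theta) = tan(w a) / sin(w a) = 1 / cos(w a).
        return math.atan(1.0 / math.cos(math.sqrt(K) * a))
    if K < 0:
        # Hyperbolic analogue: tan(theta) = tanh(w a) / sinh(w a) = 1 / cosh(w a).
        return math.atan(1.0 / math.cosh(math.sqrt(-K) * a))
    return math.pi / 4

def hypotenuse(K, a):
    """Length of the side opposite the right angle, i.e. dist(x1, z) in the text's notation."""
    if K > 0:
        w = math.sqrt(K)
        return math.acos(math.cos(w * a) ** 2) / w
    if K < 0:
        w = math.sqrt(-K)
        return math.acosh(math.cosh(w * a) ** 2) / w
    return math.sqrt(2.0) * a
```

For small a both quantities approach their Euclidean values π/4 and √2 a, with the deviation taking opposite signs for positive and negative curvature.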
Note: When K > 0, observe that $\langle b(\gamma(0)), \dot\gamma_0 \rangle = \langle b(\gamma(2\pi/\sqrt{K})), \dot\gamma_{2\pi/\sqrt{K}} \rangle$ yields λ = 0. But the above proof works for both positive and negative curvatures, and is, in some sense, the real geometric reason why the dilation part of the vectorfield b vanishes for non-zero curvature.
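The periodicity argument in this note can be spelled out as follows. This is a sketch under two assumptions: that b decomposes as the sum of a dilation vectorfield about x * and a Killing vectorfield (written 𝒦 here to avoid a clash with the curvature K), and that the dilation vectorfield satisfies D^λ(γ(t)) = λ t γ̇(t) along unit speed geodesics γ issuing from x * , as the computation in the preceding proof suggests.

```latex
\begin{align*}
\frac{d}{dt}\,\langle \mathcal{K}(\gamma(t)), \dot\gamma(t) \rangle
  &= \langle \nabla_{\dot\gamma(t)} \mathcal{K}, \dot\gamma(t) \rangle = 0
  && \text{(skew-symmetry of } \nabla\mathcal{K} \text{, geodesic equation)} \\
\langle b(\gamma(t)), \dot\gamma(t) \rangle
  &= \lambda t + \langle \mathcal{K}(x^*), \dot\gamma(0) \rangle
  && \text{(using } D^{\lambda}(\gamma(t)) = \lambda t\, \dot\gamma(t) \text{)} \\
0 &= \langle b(\gamma(2\pi/\sqrt{K})), \dot\gamma(2\pi/\sqrt{K}) \rangle
     - \langle b(\gamma(0)), \dot\gamma(0) \rangle
   = \frac{2\pi\lambda}{\sqrt{K}}
  && \text{(closed geodesic of length } 2\pi/\sqrt{K} \text{)}
\end{align*}
```

The last line forces λ = 0, since a unit speed geodesic on the sphere of curvature K returns to its starting point with the same velocity after time 2π/√K.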
Finally we can state and prove the main theorem of this section.
Theorem 48. The drift vectorfield b permits MMC with LPC if and only if both of the following hold: (i) The underlying Riemannian manifold M is one of the three model spaces S d (K > 0), R d (K = 0) or H d (K < 0), in the sense that the diffusion must be expressible as Riemannian Brownian motion plus drift vectorfield b for such an M .
(ii) For K ≠ 0, the drift b must and can be any Killing vectorfield K on M . For K = 0, the drift b must and can be described in Euclidean coordinates by b(x) = λx + T x + c for any scalar λ and any skew-symmetric matrix T , where x → λx is a dilation vectorfield about the origin and x → T x + c is a Killing vectorfield.
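For the Euclidean case, the structure of the drift in (ii) can be illustrated numerically: the symmetrized Jacobian of b(x) = λx + T x + c should be λ times the identity, in line with Lemma 45, while the antisymmetric part recovers T . The following is a minimal sketch; the helper names, the finite-difference Jacobian and the random test data are illustrative assumptions rather than constructions from the text.

```python
import numpy as np

def drift(x, lam, T, c):
    """Euclidean drift b(x) = lam * x + T x + c with T skew-symmetric."""
    return lam * x + T @ x + c

def jacobian(f, x, h=1e-6):
    """Central-difference Jacobian J[k, j] = d f_k / d x_j of a vectorfield f at x."""
    d = x.size
    J = np.zeros((d, d))
    for j in range(d):
        e = np.zeros(d)
        e[j] = h
        J[:, j] = (f(x + e) - f(x - e)) / (2.0 * h)
    return J

def symmetric_and_skew_parts(J):
    """Split a square matrix into its symmetric and antisymmetric parts."""
    return 0.5 * (J + J.T), 0.5 * (J - J.T)
```

A typical use is to build T as A − Aᵀ for an arbitrary square matrix A, evaluate the Jacobian of the drift at any point, and compare the two parts with λI and T respectively. Setting λ = 0 leaves a vectorfield with skew-symmetric Jacobian, which is the Euclidean form of a Killing vectorfield.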
Remark 49. We do not, in fact, need the full strength of LPC. The closed subgroup of isometries J generated by {f x,y ∈ G : x ∈ B(x 0 , r) , y ∈ B(y 0 , r)} forms a compact subgroup of G. The natural measure associated to this subgroup J is Haar measure, which can be normalized since J is compact. The work of Winkelmann (2003) shows that a subgroup generated by 1 + d(d + 1)/2 randomly chosen isometries in J (which correspond to pairs of starting points in B(x 0 , r) × B(y 0 , r)) is almost surely dense in J. This is sufficient for our conclusions to hold. For the sake of clarity of exposition, however, we have proved our results under the stronger assumption of LPC.

Conclusion
In this paper we have shown that Markovian maximal couplings of regular elliptic diffusions with smooth coefficients have to be reflection couplings tied to involutive isometries of the corresponding Riemannian structure on state space; moreover as soon as the existence of a Markovian maximal coupling is stable (in the sense of LPC) then a rigidity result requires the Riemannian structure to be Euclidean, hyperspherical, or hyperbolic, and the space must be simply connected. In such cases the drift must also be of a very simple form, corresponding to a rotation with possibly (but only in the Euclidean case) a dilation component. Thus Markovian maximal couplings of elliptic diffusions are rare, and their existence enforces severe geometric constraints.
This places a natural premium on questions of efficiency of Markovian coupling, as discussed for example in Burdzy and Kendall (2000), for the case of reflecting Brownian motion in compact regions. One could ask, for example, when it is possible to construct Markovian couplings (X, Y ) which are optimal in the sense that the tail probability of the coupling time P [τ > t] is minimized for all t amongst Markovian couplings if not amongst all possible couplings. (Note that this notion of optimality differs from the optimality discussed in Chen (2004), which is defined relative to a specified Wasserstein metric.) Little is known as yet about such couplings, though Kendall (2014) exhibits a coupling of two copies of scalar Brownian motion and local time which is Markovian, non-maximal, but optimal amongst all Markovian couplings. The question of whether similar geometric rigidity results hold for the existence of such optimal Markovian couplings remains entirely open, and its answer would be of great interest.
We expect that in fact such optimal Markovian couplings are also rare. Further refinements are possible (for example, one could consider the existence of Markovian couplings which minimize the Laplace transform E [exp(−uτ )] for some or all values of u > 0); however the probable rarity of such couplings would focus attention on developing the notions of efficiency from Burdzy and Kendall (2000) to apply to non-compact regions. In particular there is a natural question concerning criteria for existence of efficient Markovian couplings, where "efficient" here means that the rate of decay of P [τ > t] with t for the Markovian coupling is comparable to that of the total variation distance ‖µ 1,t − µ 2,t ‖ TV between the one-point distributions µ 1,t and µ 2,t (the distributions of X t and Y t respectively).
Two other natural extensions of these results are: 1. extension of the notion of Markovian maximal coupling to the hypoelliptic case (in which case in fact the very existence of Markovian couplings is moot: but see the positive results of Kendall and Price, 2004; Kendall, 2007); 2. examination of the extent to which the ideas of this paper carry over to Markov processes which are not skip-free (and here a natural first step would be to consider the case of couplings of Lévy processes, though a potentially significant result in the random walk case is to be found in Rogers, 1999).
We hope to consider many of these questions in future work.