Uniform Ergodicity for Brownian Motion in a Bounded Convex Set

We consider an n-dimensional Brownian motion trapped inside a bounded convex set by normally-reflecting boundaries. It is well-known that this process is uniformly ergodic. However, the rates of this ergodicity are not well-understood, especially in the regime of very high-dimensional sets. Here we present new bounds on these rates for convex sets with a given diameter. Our bounds depend neither on the smoothness of the boundary nor on the ambient dimension, n.


Introduction
Let $\Omega$ be an open, convex set with diameter $d$. We will consider a Brownian motion, $\{X_t\}_{t\ge 0}$, trapped inside $\Omega$ by normally-reflecting boundaries, initialized at some point $X_0 = x$. We will give a more precise definition of this process momentarily; intuitively, $X$ is a homogeneous Markov process which behaves like a Brownian motion on $\Omega$, spends essentially no time at the boundary (which we will denote $\partial\Omega$), and is always contained in the closure (which we will denote $\bar\Omega$). Let $\{p(t,x,dy)\}_{t\ge 0,\,x\in\Omega}$ denote the transition measures of the process, i.e. $P(X_t \in A \mid X_0 = x) = \int_A p(t,x,dy)$. It is well-known that the process $X$ is ergodic, with stationary distribution $\sigma$, where $\sigma(dy) \triangleq dy/\mathrm{Vol}(\Omega)$. Thus for every $x$ we must have that $p(t,x,\cdot) \to \sigma$ as $t \to \infty$ in various modes of convergence.
The rate and exact nature of this convergence have been investigated in a number of ways. For example, consider the case that $\Omega$ is a convex polytope which can be contained inside a cube of diameter $d$, such as $[0, d/\sqrt{n}]^n$. In this special case, a total variation bound has long been known (Equation 1; cf. [15]), where $\|\mu\|_{TV} \triangleq \sup_A |\mu(A)|$. Another important result follows from a certain Poincaré constant on arbitrary bounded convex domains, first rigorously shown by Bebendorf in [3]. Using this result one may readily show a bound (Equation 2) on
$$\left| \int g(x) \int f(y)\, p(t,x,dy)\, dx - \int f(y)\, \sigma(dy) \right|$$
which decays at the rate $\exp(-\pi^2 t/2d^2)$, where we define $\|f\|_{L^2}^2 \triangleq \int f^2\, dx$ and take $g$ to be any density (i.e. $g \ge 0$ and $\int g = 1$). One way to see the significance of this formula is to take $Y$ to be some variable with $Y \sim \sigma$, and $\rho(dx) = g(x)\,dx$ to be some initial distribution. Equation 2 then controls how far $E[f(X_t)]$, with $X_0 \sim \rho$, can be from $E[f(Y)]$. However, Bebendorf's result also has some problems: it is not directly applicable when the initial distribution of $X_0$ isn't absolutely continuous with respect to Lebesgue measure. For example, let $\delta_x$ denote the degenerate initial distribution defined by $\delta_x(A) = I_{x\in A}$. If we try to apply Equation 2 to the limiting case as $g(x)\,dx \to \delta_x(dx)$, the right hand side of the bound must become infinite.
This can be somewhat remedied by using some other technique to understand the short-term behavior of the diffusion, and then applying Bebendorf's result for the asymptotics. For example, one can pick a small $t_0 > 0$ and estimate the distribution, $g_0$, of $X_0 \mid X_{-t_0} = x$. In most cases this distribution will have a continuous density, and so one can apply Bebendorf's inequality to the modified problem in which $g = g_0$ and time has been shifted by $t_0$. If the density of $X_0 \mid X_{-t_0} = x$ can be accurately computed for modestly large $t_0$, this will be effective; however, in that case these kinds of estimates are not necessary in the first place. For small values of $t_0$, this method yields a very weak bound (in particular, in the limit as $t_0 \to 0$, it fails completely).
More generally, a rich understanding of the rate of ergodicity remains elusive, even in this simple convex case. How does it depend upon the dimension? How does it depend upon the initial condition? Here we will show results of uniform ergodicity, similar to those of Bebendorf insofar as they do not depend upon dimension, but different in that they apply uniformly regardless of initial condition.
It turns out that a simple one-dimensional diffusion can shed some light on these questions.
Note that the distribution of this object is straightforward to calculate and analyze. For example, in [11] it is shown that the survival function of $\tilde\tau_d$ is given by
$$F_d(t,k) = \sum_{n=0}^{\infty} \exp\left(-\frac{\pi^2 (2n+1)^2 t}{8d^2}\right) \frac{4(-1)^n}{\pi(2n+1)} \cos\left(\frac{2n+1}{2} \times \frac{\pi k}{d}\right).$$
Numerical estimation of this sum is straightforward and effective in practice. Indeed, $F_d(t,k)$ is simply the solution to the heat equation on $[-d,d]$ with homogeneous Dirichlet boundary conditions and initial condition $F_d(0,k) = I_{k\in(-d,d)}$; this partial differential equation is well-understood (cf. [5]). It is also easy to bound $F_d$ using the moment generating function of $\tilde\tau_d$ and a Chernoff bound; the moment generating function can be deduced by an application of the Kac moment formula, yielding $E[e^{\gamma\tilde\tau_d} \mid W_0 = k] = \cos(\sqrt{2\gamma}\,k)/\cos(\sqrt{2\gamma}\,d)$ for any $\gamma < \pi^2/8d^2$ (cf. [8,12]).
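As a rough illustration of how the series and the Chernoff bound can be evaluated numerically, here is a minimal sketch. It assumes the reading that $F_d(t,k)$ is the probability that a one-dimensional Brownian motion started at $k$ has not left $(-d,d)$ by time $t$, which is what the Dirichlet heat-equation description suggests. The function names `survival` and `chernoff_bound` and the truncation parameter `terms` are hypothetical choices made for this example, not part of the paper.

```python
import math


def survival(t, k, d, terms=200):
    """Truncated series for F_d(t, k).

    Assumed meaning: probability that a 1-D Brownian motion started at k
    is still inside (-d, d) at time t. `terms` is an illustrative cutoff.
    """
    if t <= 0:
        # Initial condition of the heat equation: indicator of (-d, d).
        return 1.0 if abs(k) < d else 0.0
    total = 0.0
    for n in range(terms):
        m = 2 * n + 1
        decay = math.exp(-(math.pi * m) ** 2 * t / (8.0 * d ** 2))
        coeff = 4.0 * (-1) ** n / (math.pi * m)
        total += decay * coeff * math.cos(m * math.pi * k / (2.0 * d))
    return total


def chernoff_bound(t, k, d, gamma):
    """Chernoff bound exp(-gamma t) * E[exp(gamma tau)] on the survival function.

    Uses the moment generating function cos(sqrt(2 gamma) k) / cos(sqrt(2 gamma) d),
    which is finite only for 0 <= gamma < pi^2 / (8 d^2).
    """
    if not 0.0 <= gamma < math.pi ** 2 / (8.0 * d ** 2):
        raise ValueError("gamma must lie in [0, pi^2 / (8 d^2))")
    s = math.sqrt(2.0 * gamma)
    return math.exp(-gamma * t) * math.cos(s * k) / math.cos(s * d)


if __name__ == "__main__":
    print(survival(1.0, 0.0, 1.0))
    print(chernoff_bound(10.0, 0.0, 1.0, 1.2), survival(10.0, 0.0, 1.0))
```

For moderate or large $t$ the first term dominates and very few terms are needed; for very small $t$ more terms are required because the series converges slowly near the discontinuous initial condition.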
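We can use the distribution of $\tilde\tau_d$ to help us understand the rate of ergodicity for convex domains:
Theorem. Let $\Omega \subset \mathbb{R}^n$ be bounded, open, and convex, with diameter $d$. Let $\{p(t,x,dy)\}_{t\ge 0,\,x\in\Omega}$ denote the transition measures of Brownian motion trapped inside $\Omega$ by normally-reflecting barriers. Then
$$\|p(t,x,\cdot) - p(t,y,\cdot)\|_{TV} \le F_d(4t, d - |x-y|),$$
$$\|p(t,x,\cdot) - \sigma\|_{TV} \le \int F_d(4t, d - |x-y|)\, \sigma(dy) \le F_d(4t, 0).$$
This very last bound is tight within a factor of 2. In particular, taking the special case that $\Omega = [0,d] \subset \mathbb{R}$, we have that $F_d(4t,0) \le 2\|p(t,0,\cdot) - \sigma\|_{TV}$.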
Proof. We will defer the proof to Section 3.
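To make the decay rate concrete, the following worked equation substitutes $4t$ and $k=0$ into the standard series for the survival function $F_d$. It is a sketch which assumes that the uniform bound takes the form $\|p(t,x,\cdot)-\sigma\|_{TV} \le F_d(4t,0)$, as the tightness statement suggests; the tolerance $\varepsilon$ is a symbol introduced here for illustration only.

```latex
\begin{align*}
F_d(4t,0)
  &= \sum_{n=0}^{\infty} \frac{4(-1)^n}{\pi(2n+1)}
     \exp\!\left(-\frac{\pi^2(2n+1)^2\, t}{2d^2}\right) \\
  &= \frac{4}{\pi}\, e^{-\pi^2 t/2d^2}
     - \frac{4}{3\pi}\, e^{-9\pi^2 t/2d^2} + \cdots
  \;\le\; \frac{4}{\pi}\, e^{-\pi^2 t/2d^2}.
\end{align*}
% The inequality holds for t > 0 because the series is alternating with
% terms of decreasing magnitude. Consequently, for 0 < \varepsilon < 4/\pi,
\[
t \;\ge\; \frac{2d^2}{\pi^2}\,\log\frac{4}{\pi\varepsilon}
\quad\Longrightarrow\quad
F_d(4t,0) \le \varepsilon .
\]
```

Under the stated assumption, this gives a dimension-free time after which the total variation distance to $\sigma$ is at most $\varepsilon$ from any starting point.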
Notice that the leading $\exp(-\pi^2 t/2d^2)$ rate in $F_d(4t,\cdot)$ is the same as the rate given by Bebendorf in Equation 2. This is no accident. Both quantities reflect the spectral gap for the Neumann Laplacian on the interval $[0,d]$, namely $\pi/d$.
The author's particular interest in this problem arose from a question about hitting probabilities. Let $A, B$ denote two open disjoint subsets of $\Omega$. Let $T = \inf\{t : X_t \in A \cup B\}$, and consider the problem of estimating $u(x) = P(X_T \in \partial A \mid X_0 = x)$. In general it can be quite tricky to analyze $u$. However, there are some circumstances in which it simplifies considerably. Let $x \in \Omega$ denote some point such that Then it is easy to show that $u(x) \approx \int u(y)\,\sigma(dy)$. This can greatly simplify both the numerical and the analytic investigation of $u$. It was for this reason that we wanted to closely investigate the rate of ergodicity.
The remainder of this article is divided into three sections:
1. Known results. We give a rigorous definition for reflecting Brownian motion in a convex set and formalize some aspects of our introductory exposition. We summarize known results, look at the equations governing $p(t,x,dy)$, and see how the work by Bebendorf yields the rate of convergence found in Equation 2. Finally, we will examine a coupling idea whose first rigorous construction is due to Atar and Burdzy (cf. [1]).
2. Application of known results. Here we prove our main theorem, using the coupling construction of Atar and Burdzy.
3. Conclusions. We will consider possible directions for future research.

Known results
The theory of reflected Brownian motion in convex sets is fairly well-developed. To get a sense of this history, we will here recall Tanaka's early work on the subject:
• a positive locally finite random measure $\mu$ on $[0,\infty)$
• a random function $n : \mathbb{R}_+ \to \mathbb{R}^n$
such that for every $t$ and $n(t)$ is a normal vector of a supporting hyperplane of $\Omega$ at the point $X_t$ for $\mu$-almost-every value of $t$. Then we will call $X$ a reflecting Brownian motion in $\Omega$ driven by $W$.
If $\Omega \subset \mathbb{R}^n$ is convex then there is a pathwise-unique reflecting Brownian motion in $\Omega$ driven by $W$.
Here we summarize some well-known facts about the process $X$. $\int_{x\in A}\int_{y\in B} \sigma(dx)\,p(t,x,dy) = \int_{x\in B}\int_{y\in A} \sigma(dx)\,p(t,x,dy)$. In particular, $\sigma$ is a stationary distribution of $X$.
for weakly differentiable functions $f : \Omega \to \mathbb{R}$. Then for any density $g \in L^2$ (i.e. $g \ge 0$ and $\int g(x)\,dx = 1$), we have that
Proof. These results are well-known. We relate them at a high level here for the convenience of the reader.
The key is to grasp the connection between $X$ and a certain so-called "Dirichlet form," $\mathcal{E}$, which is understood as a bilinear form on the Sobolev space $H^1$ of weakly differentiable functions on $\Omega$. The arc of this connection is the content of the treatise [9]. We will only sketch it briefly in this paragraph. Let $L^2(\Omega)$ denote the space of square-integrable measurable functions on $\Omega$, equipped with the inner product $\langle f, g\rangle_{L^2} = \int f(x)g(x)\,dx$. One can find a unique non-negative definite operator $A : H^1 \to L^2$ such that $\mathcal{E}(f,g) = \int (Af)(Ag)\,dx$. It turns out that $A$ is self-adjoint. One can thus obtain a family of operators of the form $T_t : L^2 \to L^2$, uniquely defined as $T_t = e^{-A^2 t}$. Note that $e^{-A^2 t}$ is defined on all of $L^2$ even though $A$ is only defined on $H^1$; we refer the reader to [16] for a very clear introduction to these considerations. If $\Omega$ has Lipschitz boundary (i.e. the boundary can locally be represented as the epigraph of a Lipschitz function), one can then (not-necessarily-uniquely) define a strong Markov process almost surely with respect to Lebesgue measure, for every $f \in L^2$. In this case we may say that $Y$ is "weakly determined" by $\mathcal{E}$. It is shown in [2] that if $\Omega$ is bounded with Lipschitz boundary, then any process which is weakly determined by $\mathcal{E}$ will be a reflecting Brownian motion driven by some Brownian motion $W$. It is also shown that at least one such process exists. Since a bounded convex set automatically has a Lipschitz boundary (cf. Corollary 1.2.2.3 of [10]) and Tanaka showed that reflecting Brownian motion on a convex set is uniquely defined, it follows that the process $X$ must be weakly determined by $\mathcal{E}$.
This allows us to prove all three of our claims: Since [9] shows that every process that is weakly determined by $\mathcal{E}$ on a domain with Lipschitz boundary is a strong Markov process, it follows that $X$ is a strong Markov process. The reversibility of $X$ with respect to $\sigma$ then follows from the fact that $T_t$ is self-adjoint as an operator on $L^2(\Omega,\sigma)$ (this, in turn, follows from the fact that $A$ is self-adjoint).
In [4] it is shown that if a process $\{Y_t\}_{t\ge 0}$ is weakly determined by $\mathcal{E}$ and $\Omega$ is convex, then Thus the same follows for our process, $X$. Let $\lambda$ denote any constant so that $\int f\,dx = 0 \implies \lambda^2 \|f\|_{L^2}^2 \le \mathcal{E}(f,f)$. Recall that we have said there is a unique operator $A : H^1 \to L^2$ such that $\mathcal{E}(f,g) = \int (Af)(Ag)\,dx$. We may thus rephrase our understanding of $\lambda$ by saying that $\int f\,dx = 0 \implies \lambda \|f\|_{L^2} \le \|Af\|_{L^2}$. Using spectral methods it is thus straightforward to show that Using this and Cauchy-Schwarz, one can readily show that
Proof. The work in [3] shows that $\lambda = \pi/d$ fills the required role for statement 3 of Lemma 3.
This last corollary gives a satisfying grip on the L 2 ergodic convergence for Brownian motion in convex domains. Our endeavor here is to complement this with a comparable analysis of the total variation convergence.
Towards this end, we will employ a coupling construction. That is, we will construct a joint process $\{X_t, Y_t\}_{t\ge 0}$ so that $X$ and $Y$ both carry the law of reflecting Brownian motion, but each has a different initial condition. We will construct them in such a way that $\tau = \inf\{t : X_t = Y_t\}$ is almost surely finite:
Theorem 5. Fix any bounded convex set $\Omega$. Let $\{W_t\}_{t\ge 0}$ denote a Brownian motion. Then there exists a solution to the equations defined in a pathwise unique way up until the time $\tau = \inf\{t : X_t = Y_t\}$. Here $d|L|_s$ plays the role of the measure $\mu$ in Definition 1; we require that $n_L(s)$ is a normal vector of supporting hyperplanes of $\Omega$ at $X_s$, $d|L|_s$-almost surely. Likewise for $n_M$, $Y_s$, $d|M|_s$. In particular, let constitute a strongly Markovian process, and both $X$ and $Y$ are reflecting Brownian motions.
Proof. We refer the reader to the work of Atar and Burdzy in [1]. Note that although this article focuses on the case that $\partial\Omega$ is smooth, it also mentions that all of the reasoning goes through for any set which is "admissible" according to the lights of work by Lions and Sznitman [14]. Convex sets are indeed "admissible" according to Remark 3.1 of the work by Lions and Sznitman.
We emphasize that even though $X$ and $Y$ in this theorem are profoundly coupled, individually they both behave like Brownian motions trapped inside $\Omega$ by reflecting boundaries. It is also worth emphasizing that there are two completely different conceptual "reflections" at play here:
1. The normally reflecting boundaries keep $X, Y$ inside $\Omega$.
2. The mirror coupling causes $Y$ to generally behave like the mirror image of $X$, reflected over a plane halfway between $X$ and $Y$.
These two kinds of reflections may interact when X or Y hits a point in ∂Ω. In this case the direction of reflection (which is mathematically expressed as η t ) may rotate.
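The mirror step can be illustrated in isolation. The sketch below is an illustration only: it assumes both points are in the interior of the set, so that no boundary reflection acts, and it implements the increment $dW - 2\eta\langle\eta, dW\rangle$, where $\eta$ is the normalized difference $X - Y$. The function name `mirror_increment` is a hypothetical choice, and this is not the full construction of Atar and Burdzy, which also handles the boundary terms.

```python
import math


def mirror_increment(x, y, dw):
    """Reflect the increment dw across the hyperplane orthogonal to x - y.

    Interior-only illustration: the boundary reflection terms of the
    coupled equations are deliberately ignored here.
    """
    diff = [a - b for a, b in zip(x, y)]
    norm = math.sqrt(sum(c * c for c in diff))
    if norm == 0.0:
        # Once the two processes have met they move together.
        return list(dw)
    eta = [c / norm for c in diff]
    proj = sum(e * w for e, w in zip(eta, dw))
    # Householder reflection: flip the component of dw along eta.
    return [w - 2.0 * proj * e for w, e in zip(dw, eta)]


if __name__ == "__main__":
    x, y = [1.0, 0.0], [0.0, 0.0]
    print(mirror_increment(x, y, [0.3, 0.4]))
```

Because the reflection is an orthogonal map, the reflected increment has the same length as the original, which is why $Y$ on its own still receives Brownian increments even though it is driven by a transformed copy of the noise driving $X$.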

Application of known results
To show our main theorem, we must understand the distribution of the coupling time in the mirror construction from Theorem 5. Exact calculation of this distribution may be impossible, but some bounds are straightforward to obtain:
Lemma 6. Let $\Omega, X, Y, \tilde{Y}, \tau$ be as in Theorem 5, with $X_0 = x$ and $\tilde{Y}_0 = y$. Let $d$ denote the diameter of $\Omega$. Then $P(t < \tau) \le F_d(4t, d - |x - y|)$.
Proof. Let us consider We now apply the fact that Equation 4 holds for $t \le \tau$. This yields that Dambis-Dubins-Schwarz then yields that we can find some one-dimensional Brownian motion $B$ such that Let us now inspect Let us focus on three properties of this object: So certainly The same arguments apply to $-\langle \eta_s, n_L(s)\rangle$.
2. Since the diameter of $\Omega$ is $d$, we have that $R_t \le d$ for all $t$. Put another way, Putting these facts together, we obtain the overall bound of This is useful because the law of $\tilde{R}_t$ is well-understood. It is that of a one-dimensional Brownian motion with reflection at the point $d$, initialized at $\tilde{R}_0 = |x - y|$ (cf. the introduction of [7] for a useful exposition on this point), and stopped at the time when $\tilde{R}_t = 0$. In particular, letting $\tilde\tau = \inf\{t : \tilde{R}_t = 0\}$, we can apply the results of [11] to argue that $P(\tilde\tau > t) = F_d(t, d - |x - y|)$. Moreover, since $R_t \le \tilde{R}_{4t}$ for every $t$, it follows that $\tilde\tau \ge 4\tau$. Our result follows immediately.
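The claim about the comparison process can be sanity-checked by simulation. The sketch below assumes the reading that the process is a one-dimensional Brownian motion on $[0,d]$, reflected at $d$ and stopped on hitting $0$, and that its survival probability at time $t$ equals the series $F_d(t, d - r_0)$ when started at $r_0$. The names `series_survival` and `simulate_survival`, the step size, and the path count are illustrative assumptions. The Euler scheme only checks the absorbing barrier at grid times, so the simulated value tends to sit slightly above the series value.

```python
import math
import random


def series_survival(t, k, d, terms=100):
    """Truncated series for F_d(t, k), the survival function of the exit time from (-d, d)."""
    total = 0.0
    for n in range(terms):
        m = 2 * n + 1
        decay = math.exp(-(math.pi * m) ** 2 * t / (8.0 * d ** 2))
        total += decay * 4.0 * (-1) ** n / (math.pi * m) * math.cos(m * math.pi * k / (2.0 * d))
    return total


def simulate_survival(r0, d, t, dt=1e-3, paths=2000, seed=0):
    """Monte Carlo estimate of P(no hit of 0 before time t) for Brownian motion
    started at r0, reflected at d, absorbed at 0 (Euler steps of size dt)."""
    rng = random.Random(seed)
    steps = int(round(t / dt))
    sigma = math.sqrt(dt)
    alive = 0
    for _ in range(paths):
        r = r0
        hit = False
        for _ in range(steps):
            r += rng.gauss(0.0, sigma)
            if r <= 0.0:
                hit = True  # absorbed at 0
                break
            if r > d:
                r = 2.0 * d - r  # fold the path back below the reflecting point d
        if not hit:
            alive += 1
    return alive / paths


if __name__ == "__main__":
    d, r0, t = 1.0, 0.5, 0.5
    print(simulate_survival(r0, d, t), series_survival(t, d - r0, d))
```

The agreement is only approximate: the two numbers differ by Monte Carlo noise and by the discretization bias mentioned above, both of which shrink as `paths` grows and `dt` decreases.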
This leads to our main theorem, which we restate here for the convenience of the reader:
Theorem. Let $\Omega \subset \mathbb{R}^n$ be bounded, open, and convex, with diameter $d$. Let $\{p(t,x,dy)\}_{t\ge 0,\,x\in\Omega}$ denote the transition measures of Brownian motion trapped inside $\Omega$ by normally-reflecting barriers. Then
$$\|p(t,x,\cdot) - p(t,y,\cdot)\|_{TV} \le F_d(4t, d - |x-y|),$$
$$\|p(t,x,\cdot) - \sigma\|_{TV} \le \int F_d(4t, d - |x-y|)\, \sigma(dy) \le F_d(4t, 0).$$
This very last bound is tight within a factor of 2. In particular, taking the special case that $\Omega = [0,d] \subset \mathbb{R}$, we have that $F_d(4t,0) \le 2\|p(t,0,\cdot) - \sigma\|_{TV}$.
Proof. The total variation distance between $p(t,x,\cdot)$ and $p(t,y,\cdot)$ is easy to bound using Lemma 6. Recall that $X_t, \tilde{Y}_t$ were both reflecting Brownian motions, with $X_0 = x$, $\tilde{Y}_0 = y$, and $X_t = \tilde{Y}_t$ for $t \ge \tau$. The lemma then shows that $P(t < \tau) \le F_d(4t, d - |x - y|)$. We thus obtain our first total variation bound:
$$\|p(t,x,\cdot) - p(t,y,\cdot)\|_{TV} \le P(t < \tau) \le F_d(4t, d - |x-y|).$$
The second total variation bound then follows immediately from the fact that $\sigma$ is the stationary distribution of the process, i.e. $\int p(t,x,A)\,\sigma(dx) = \sigma(A)$. The second total variation bound is thus a kind of "average" of the first total variation bound:
$$\|p(t,x,\cdot) - \sigma\|_{TV} = \left\| \int \big(p(t,x,\cdot) - p(t,y,\cdot)\big)\, \sigma(dy) \right\|_{TV} \le \int F_d(4t, d - |x-y|)\, \sigma(dy).$$
Finally, it is well-known that $\sup_k F_d(t,k) = F_d(t,0)$, so we obtain the overall bound of $\|p(t,x,\cdot) - \sigma\|_{TV} \le F_d(4t,0)$.
We now turn our attention to the one-dimensional case, to see that this last bound is tight within a factor of 2. Let us assume $\Omega = [0,d]$ and $X_0 = 0$. Let It is well-known that in this simple one-dimensional case, $V$ is the unique solution to Note that since $V(0,x)$ is discontinuous, we need to construe these equations weakly; we refer the reader to [9] for a detailed account of how this can be done and why this partial differential equation governs the behavior of $V$. The key idea is that $V(t,\cdot) = e^{-A^2 t} V(0,\cdot)$ where $e^{-A^2 t}$ is the $L^2$ operator whose construction we outlined in Lemma 3. Once the problem is properly formulated, a Fourier expansion immediately yields the solution: In particular,

Conclusions
Among all convex sets with diameter $d$, this work begins to suggest that the one-dimensional interval (e.g. $\Omega = [0,d]$) may provide a kind of worst-case scenario for mixing rates. This is helpful because the one-dimensional interval is easy to analyze. Unfortunately, "typical" high-dimensional sets of diameter $d$ may mix much faster than our bounds would suggest. Thus, although Equation 1 from [15] seemed to imply that mixing might get slower in high dimensions, and the bound in our theorem does not depend upon the dimension, the reality may be that mixing typically gets faster in higher dimensions. For example, preliminary analysis suggests that $n$-dimensional Brownian motion in the unit $n$-dimensional ball mixes quite a bit faster than one-dimensional Brownian motion on $[0,d]$, especially as $n \to \infty$. On the other hand, we are simply not sure about the total-variation mixing rate for Brownian motion in a high-dimensional cube. Are there simple ways to improve our bounds when we know more about $\Omega$? This is a possible direction of future research.
In another direction, it should be possible to extend this basic method of proof to accommodate a wide variety of Ito diffusions, beyond Brownian motion. Mirror couplings are available for many such diffusions; we refer the reader to [13] and the many papers which have cited it. For every such mirror coupling process, $\{X_t, Y_t\}_{t\ge 0}$, one can analyze the one-dimensional process $R_t = |X_t - Y_t|$. By applying Ito's lemma, Dambis-Dubins-Schwarz, and taking bounds, one can often obtain a stochastic differential inequality of the form $dR_t \le d\tilde{R}_t$, where $\tilde{R}_t$ is some semimartingale that is better-understood. Applying stochastic differential inequality results such as those found in [6], one can then bound the high-dimensional process in terms of some simple one-dimensional process.
For example, consider the stochastic differential equation $dX_t = \mu(X_t)\,dt + dW_t$ where $\mu$ is some Lipschitz vector field. Similar to the technique we used, let $dY_t = \mu(Y_t)\,dt + dW_t - 2\eta_t \langle \eta_t, dW_t\rangle$, where $\eta$ is a normalized version of $X - Y$. As in our case, if we define $B_{4t} = \int_0^t 2\langle \eta_s, dW_s\rangle$ then we can show that $B$ is a one-dimensional Brownian motion. Finally, let $\Gamma$ denote some Lipschitz function satisfying $\Gamma(r) \ge \sup_{|x-y|=r} \langle x - y, \mu(x) - \mu(y)\rangle$, and consider the simple one-dimensional stochastic differential equation Using the results from [6], one can readily show that $R_t \le Z_t$. Thus In short, by analyzing the simple one-dimensional diffusion $Z$, one can estimate coupling times for extremely complex and high-dimensional processes. These coupling times can then be used to estimate rates of ergodicity. Of course, these kinds of estimates may be quite poor in some cases (e.g. consider the catastrophic case that $\sup_{|x-y|=r} \langle x - y, \mu(x) - \mu(y)\rangle = \infty$). This difficulty is related to the problem with which we began these conclusions: in some cases it may be very difficult to get high-quality bounds using only a one-dimensional diffusion.
We have seen that one-dimensional diffusions can be used to analyze very high-dimensional diffusions, although the bounds may not be ideal in certain cases. Might it be possible to improve this basic technique to allow two or three-dimensional diffusions to give insight into high-dimensional processes? This is an intriguing question for future research.