1 Introduction

Let \(\varOmega \) be an open, convex set with diameter \(d<\infty \). We will consider a Brownian motion, \(\left\{ X_{t}\right\} _{t\ge 0}\), trapped inside \(\varOmega \) by normally reflecting boundaries, initialized at some point \(X_{0}=x\). We will give a more precise definition of this process momentarily; intuitively, X is a homogeneous Markov process which behaves like a Brownian motion on \(\varOmega \), spends essentially no time at the boundary (which we will denote \(\partial \varOmega \)), and is always contained in the closure (which we will denote \(\bar{\varOmega }\)). Let \(\left\{ p\left( t,x,{\hbox {d}}y\right) \right\} _{t\ge 0,x\in \bar{\varOmega }}\) denote the transition measures of the process, i.e., \(\mathbb {P}(X_{t}\in A|X_{0}=x)=\int _{A}p\left( t,x,{\hbox {d}}y\right) \). It is well known that the process X is ergodic, with stationary distribution \(\sigma \), where \(\sigma ({\hbox {d}}y)\triangleq {\hbox {d}}y/\mathrm {Vol}(\varOmega )\) (cf. [4]). Thus, for every x we must have that

$$\begin{aligned} p\left( t,x,\cdot \right) \rightarrow \sigma \quad \text {as}\quad t\rightarrow \infty \end{aligned}$$

in various modes of convergence. The rate and exact nature of this convergence have been investigated in a number of ways. For example, consider the case that \(\varOmega \) is a convex polytope which can be contained inside a cube of diameter d, such as \([0,d/\sqrt{n}]^{n}\). In this special case, it has long been known that

$$\begin{aligned} \sup _{x}\left\| p\left( t,x,\cdot \right) -\sigma \right\| _{\mathrm {TV}}\le \sqrt{\frac{d^{2}}{2\pi t}} \end{aligned}$$
(1)

where \(\left\| \mu \right\| _{\mathrm {TV}}\triangleq \sup _{A}\left| \mu (A)\right| \) (cf. [16]). Another important result follows from a certain Poincaré inequality on arbitrary bounded convex domains, first rigorously established by Bebendorf in [3]. Using this result one may readily show that

$$\begin{aligned}&\left| \int g(x)\left( \int f(y)p\left( t,x,{\hbox {d}}y\right) \right) {\hbox {d}}x-\int f(y)\sigma ({\hbox {d}}y)\right| \nonumber \\&\quad \le \exp \left( -\frac{1}{2}\left( \frac{\pi }{d}\right) ^{2}t\right) \left\| g-\frac{1}{\mathrm {Vol}(\varOmega )}\right\| _{\mathscr {L}^{2}}\left\| f\right\| _{\mathscr {L}^{2}} \end{aligned}$$
(2)

where we define \(\left\| f\right\| _{\mathscr {L}^{2}}^{2}\triangleq \int f^{2}{\hbox {d}}x\) and take g to be any density (i.e., \(g\ge 0\) and \(\int g=1\)). One way to see the significance of this formula is to take Y to be some variable with \(Y\sim \sigma \), and \(\rho ({\hbox {d}}x)=g(x){\hbox {d}}x\) to be some initial distribution. Equation (2) can then be understood as a bound on the rate at which \(\mathbb {E}\left[ f(X_{t})|X_{0}\sim \rho \right] \rightarrow \mathbb {E}\left[ f(Y)\right] \). Comparing Bebendorf’s result with Eq. (1), we see that Bebendorf’s result has several advantages:

  1. Equation (1) becomes less powerful as \(n\rightarrow \infty \), but Eq. (2) does not suffer from this deficit. To see how Eq. (1) fails in high dimensions, consider what it says about a convex polytope of diameter d which is roughly spherical. Such a polytope generally cannot fit inside a cube of diameter d. Indeed, to enclose an n-dimensional ball of diameter d, one needs a cube with diameter \(d\sqrt{n}\). Thus, as \(n\rightarrow \infty \), Eq. (1) becomes quite weak for certain kinds of diameter-d sets.

  2. Equation (2) has exponential decay instead of polynomial decay.

  3. Equation (2) does not require \(\varOmega \) to be a convex polytope.

However, Bebendorf’s result also has a limitation: it is not directly applicable when the initial distribution of \(X_{0}\) is not absolutely continuous with respect to Lebesgue measure. For example, let \(\delta _{x}\) denote the degenerate initial distribution defined by \(\delta _{x}(A)=\mathbb {I}_{x\in A}\). If we try applying Eq. (2) to the limiting case as \(g(x){\hbox {d}}x\rightarrow \delta _{x}({\hbox {d}}x)\), the right-hand side of the bound becomes infinite (and thus quite useless).

More generally, a rich understanding of the rate of ergodicity remains elusive, even in this simple convex case. How does it depend upon the dimension? How does it depend upon the initial condition?

It turns out that a simple one-dimensional diffusion can shed some light on these questions. Let \(\left\{ W_{t}\right\} _{t\ge 0}\) denote a one-dimensional Brownian motion. Let \(\tilde{\tau }_{d}=\inf \left\{ t:\ W_{t}\notin (-d,d)\right\} \). Note that the distribution of this object is straightforward to calculate and analyze. For example, in [11] it is shown that the survival function of \(\tilde{\tau }_{d}\) is given by

$$\begin{aligned} \mathbb {P}(t\le \tilde{\tau }_{d}|W_{0}=k)=F_{d}\left( t,k\right) \triangleq \sum _{n=0}^{\infty }\mathrm{e}^{-\frac{\pi ^{2}}{8d^{2}}\left( 2n+1\right) ^{2}t}\frac{4\left( -1\right) ^{n}}{\pi \left( 2n+1\right) }\cos \left( \frac{2n+1}{2}\times \frac{\pi k}{d}\right) \end{aligned}$$

Numerical estimation of this sum is straightforward and effective in practice. Indeed, \(F_{d}\left( t,k\right) \) is simply the solution to the heat equation on \([-d,d]\) with homogeneous Dirichlet boundary conditions and initial condition \(F_{d}\left( 0,k\right) =\mathbb {I}_{k\in (-d,d)}\); this partial differential equation is well understood (cf. [5]). It is also easy to bound \(F_{d}\) using the moment-generating function of \(\tilde{\tau }_{d}\) and a Chernoff bound; the moment-generating function can be deduced by an application of the Kac moment formula, yielding \(\mathbb {E}\left[ \mathrm{e}^{\gamma \tilde{\tau }_{d}}|W_{0}=k\right] =\cos (\sqrt{2\gamma }k)/\cos (\sqrt{2\gamma }d)\) for any \(\gamma <\pi ^{2}/8d^{2}\) (cf. [8, 12]).
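To illustrate how the series for \(F_{d}\) can be evaluated numerically, the following is a minimal sketch that simply truncates the sum. The function name `survival_F` and the default truncation level `n_terms` are our own illustrative choices, not part of the cited references; for very small t more terms may be needed.

```python
import math


def survival_F(d, t, k, n_terms=200):
    """Truncated series for F_d(t, k).

    The name and the truncation level are illustrative assumptions;
    the summand is copied term by term from the series in the text.
    """
    total = 0.0
    for n in range(n_terms):
        m = 2 * n + 1
        decay = math.exp(-(math.pi ** 2) * m * m * t / (8.0 * d * d))
        coeff = 4.0 * (-1) ** n / (math.pi * m)
        total += decay * coeff * math.cos(0.5 * m * math.pi * k / d)
    return total


if __name__ == "__main__":
    d = 1.0
    for t in (0.05, 0.5, 2.0):
        # Survival probability starting from the centre and from k = d/2.
        print(t, survival_F(d, t, 0.0), survival_F(d, t, 0.5 * d))
```

For moderate and large t only the first few terms matter, since the n-th term decays like \(\exp (-\pi ^{2}(2n+1)^{2}t/8d^{2})\).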

We can use the distribution of \(\tilde{\tau }_{d}\) to help us understand the rate of ergodicity for convex domains:

Theorem 1

Let \(\varOmega \subset \mathbb {R}^{n}\) be bounded, open, and convex, with diameter d. Let \(\left\{ p\left( t,x,{\hbox {d}}y\right) \right\} _{t\ge 0,x\in \varOmega }\) denote the transition measures of Brownian motion trapped inside \(\varOmega \) by normally reflecting barriers. Then

$$\begin{aligned} \left\| p\left( t,x,\cdot \right) -p\left( t,y,\cdot \right) \right\| _{\mathrm {TV}}&\le F_{d}\left( 4t,d-\left| x-y\right| \right) \\ \left\| p\left( t,x,\cdot \right) -\sigma \right\| _{\mathrm {TV}}&\le \int F_{d}\left( 4t,d-\left| x-y\right| \right) \sigma (\mathrm{{d}}y)\le F_{d}\left( 4t,0\right) \end{aligned}$$

This very last bound is tight within a factor of 2. In particular, taking the special case that \(\varOmega =\left[ 0,d\right] \subset \mathbb {R}\), we have that \(F_{d}\left( 4t,0\right) \le 2\left\| p\left( t,0,\cdot \right) -\sigma \right\| _{\mathrm {TV}}\).

We will defer the proof to Sect. 3.

Notice that the leading \(\exp \left( -\pi ^{2}t/2d^{2}\right) \) rate in \(F_{d}\left( 4t,\cdot \right) \) is the same as the rate given by Bebendorf in Eq. (2). This is no accident. Both quantities reflect the spectral gap of the Neumann Laplacian on the interval [0, d], namely \((\pi /d)^{2}\).
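The agreement of the two rates can be checked numerically. The sketch below evaluates the bound of Theorem 1 and compares it with its leading term \((4/\pi )\exp (-\pi ^{2}t/2d^{2})\), which is the \(n=0\) summand of \(F_{d}(4t,0)\). The helper names `tv_bound` and `leading_term` are hypothetical and introduced only for this illustration.

```python
import math


def survival_F(d, t, k, n_terms=200):
    """Truncated series for F_d(t, k), as defined in the text."""
    total = 0.0
    for n in range(n_terms):
        m = 2 * n + 1
        decay = math.exp(-(math.pi ** 2) * m * m * t / (8.0 * d * d))
        total += decay * 4.0 * (-1) ** n / (math.pi * m) * math.cos(0.5 * m * math.pi * k / d)
    return total


def tv_bound(d, t, dist):
    """Theorem 1 bound F_d(4t, d - |x - y|), with dist = |x - y| (hypothetical helper)."""
    return survival_F(d, 4.0 * t, d - dist)


def leading_term(d, t):
    """The n = 0 summand of F_d(4t, 0); its exponent matches the rate in Eq. (2)."""
    return (4.0 / math.pi) * math.exp(-(math.pi ** 2) * t / (2.0 * d * d))


if __name__ == "__main__":
    d = 1.0
    for t in (0.1, 0.5, 1.0):
        # dist = d gives the uniform bound F_d(4t, 0).
        print(t, tv_bound(d, t, d), leading_term(d, t))
```

The two printed columns should agree to more and more digits as t grows, since the higher-order summands decay strictly faster.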

The author’s particular interest in this problem arose from a question about hitting probabilities. Let A and B denote two open disjoint subsets of \(\varOmega \). Let \(T=\inf \left\{ t:\ X_{t}\in A\cup B\right\} \), and consider the problem of estimating \(u(x)=\mathbb {P}(X_{T}\in \partial A|X_{0}=x)\). In general it can be quite tricky to analyze u. However, there are some circumstances in which it simplifies considerably. Let \(x\in \varOmega \) denote some point such that \(\mathbb {P}(T>t|X_{0}=x)\approx 1\) and \(\left\| p\left( t,x,\cdot \right) -\sigma \right\| _{\mathrm {TV}}\approx 0\). Then it is easy to show that \(u(x)\approx \int u(y)\sigma ({\hbox {d}}y)\). This quantity is of course independent of x and can be estimated efficiently. Unfortunately, it is not immediately obvious how to tell whether a given point x satisfies these criteria. In particular, exact computation of \(\left\| p\left( t,x,\cdot \right) -\sigma \right\| _{\mathrm {TV}}\) is nontrivial, and so we turned to finding accurate rates of uniform ergodicity as a way to bound this quantity.

The remainder of this article is divided into three sections:

  1. Known results. We give a rigorous definition for reflecting Brownian motion in a convex set and formalize some aspects of our introductory exposition. We summarize known results, look at the equations governing \(p\left( t,x,{\hbox {d}}y\right) \), and see how the work by Bebendorf yields the rate of convergence found in Eq. (2). We will examine a coupling idea whose first rigorous construction is due to Atar and Burdzy (cf. [1]). Finally, we will give some remarks on the function \(F_{d}\left( t,k\right) \) and begin to see why it will appear in our main theorem.

  2. Application of known results. Here we prove our main theorem, using the coupling construction of Atar and Burdzy.

  3. Conclusions. We will consider possible directions for future research.

2 Known Results

The theory of reflected Brownian motion in convex sets is fairly well developed. To get a sense of this history, we will here recall Tanaka’s early work on the subject:

Definition 1

Let \(\varOmega \subset \mathbb {R}^n\) be any set. A plane \(\{x:\ \langle x,y\rangle =c\}\) is said to be a supporting hyperplane of \(\varOmega \) if \(\varOmega \subset \{ x:\ \langle x,y\rangle \ge c\}\). The vector y is said to be the inward-facing normal vector.

Definition 2

Let \(\varOmega \subset \mathbb {R}^{n}\) denote an open set. Let \(\left\{ W_{t}\right\} _{t\ge 0}\) denote an n-dimensional Brownian motion adapted to \(\left\{ \mathcal {F}_{t}\right\} _{t\ge 0}\). Suppose there exist

  • a continuous process \(\left\{ X_{t}\right\} _{t\ge 0}\subset \bar{\varOmega }\) which is adapted to \(\left\{ \mathcal {F}_{t}\right\} _{t\ge 0}\)

  • a positive locally finite random measure \(\mu \) on \([0,\infty )\)

  • a random function \(\varvec{n}:\ \mathbb {R}^{+}\rightarrow \mathbb {R}^{n}\)

such that

$$\begin{aligned}&X_{t}=W_{t}+\int _{0}^{t}\varvec{n}(s)\mu ({\hbox {d}}s)\\&\quad \mu \left( \left\{ t:\ X_{t}\notin \partial \varOmega \right\} \right) =0 \end{aligned}$$

for every t and \(\varvec{n}(t)\) is the inward-facing normal vector of a supporting hyperplane of \(\varOmega \) at the point \(X_{t}\) for \(\mu \)-almost every value of t. Then we will call X a reflecting Brownian motion in \(\varOmega \) driven by W.

It is worth taking a moment to consider the case \(n=1\), in which case much of the complexity of the definition above evaporates. To make it as simple as possible, say \(\varOmega \) is simply the interval \([0,\infty )\). Then it is easy to see that

$$\begin{aligned} X_t \triangleq W_t - 0\wedge \inf _{s\le t} W_s \end{aligned}$$

is a reflecting Brownian motion on \(\varOmega \). We refer the reader to [7] for a useful exposition on this point. The corresponding measure \(\mu \) is simply the local time of X on the boundary, and the vector \(\varvec{n}\) is simply the number 1. That is, \(- 0 \wedge \inf _{s\le t} W_s = \int _0^t 1\mu ({\hbox {d}}s)\). This formulation will be useful to us later, as we will see that the general problem for \(n>1\) can be approximated by looking at a much simpler problem with \(n=1\).
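The one-dimensional formula above is easy to apply to a sampled path. The following is a minimal sketch, assuming a simple Gaussian random-walk discretization of W; the function names `brownian_path` and `reflect_at_zero`, the step size, and the seed are our own illustrative choices.

```python
import random


def brownian_path(n_steps, dt, seed=0):
    """Sample W on a grid of step dt (a plain Gaussian random walk, started at 0)."""
    rng = random.Random(seed)
    w, path = 0.0, [0.0]
    for _ in range(n_steps):
        w += rng.gauss(0.0, dt ** 0.5)
        path.append(w)
    return path


def reflect_at_zero(w_path):
    """Apply X_t = W_t - min(0, min_{s<=t} W_s) along a sampled path.

    Returns the reflected path and the pushing term -min(0, min W),
    which plays the role of the integral of n against mu when n = 1.
    """
    running_min = 0.0
    x_path, push = [], []
    for w in w_path:
        running_min = min(running_min, w)
        push.append(-running_min)
        x_path.append(w - running_min)
    return x_path, push


if __name__ == "__main__":
    x, push = reflect_at_zero(brownian_path(1000, 1e-3))
    print(min(x), push[-1])
```

On the sampled grid the pushing term is nondecreasing and grows only at indices where the reflected path sits at zero, mirroring the requirement that \(\mu \) charges only the times at which X is on the boundary.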

For the general case \(n>1\), Tanaka showed that as long as \(\varOmega \) is convex and bounded, we can always find a unique reflecting Brownian motion:

Theorem 2

(Tanaka [18]) Let W denote any n-dimensional Brownian motion with \(W_{0}\in \varOmega \). If \(\varOmega \subset \mathbb {R}^{n}\) is bounded and convex then there is a pathwise-unique reflecting Brownian motion in \(\varOmega \) driven by W.

Proof

Cf. [18]. \(\square \)

Here we summarize some well-known facts about the process X.

Lemma 1

(Properties of reflected Brownian motion in a convex set \(\varOmega \)) Let W denote an \(\left\{ \mathcal {F}_{t}\right\} _{t\ge 0}\)-adapted Brownian motion and let X denote a reflecting Brownian motion in \(\varOmega \) driven by W. If \(\varOmega \) is convex, then

  1. \(\left\{ X_{t}\right\} _{t\ge 0}\) is a strong, \(\left\{ \mathcal {F}_{t}\right\} _{t\ge 0}\)-adapted, homogeneous Markov process. Let \(\left\{ p\left( t,x,\mathrm{{d}}y\right) \right\} _{t\ge 0,x\in \bar{\varOmega }}\) denote the transition measures of this process. The process X is reversible with respect to \(\sigma \), i.e., \(\int _{x\in A}\int _{y\in B}\sigma (\mathrm{{d}}x)p\left( t,x,\mathrm{{d}}y\right) =\int _{x\in B}\int _{y\in A}\sigma (\mathrm{{d}}x)p\left( t,x,\mathrm{{d}}y\right) \). In particular, \(\sigma \) is a stationary distribution of X.

  2. X is uniformly ergodic with stationary distribution \(\sigma \), i.e.,

    $$\begin{aligned} \lim _{t\rightarrow \infty }\sup _{x}\left\| p\left( t,x,\cdot \right) -\sigma \right\| _{\mathrm {TV}}=0 \end{aligned}$$

  3. Let \(\lambda \ge 0\) denote any constant such that

    $$\begin{aligned} \int f\mathrm{{d}}x=0\implies \lambda \left\| f\right\| _{\mathscr {L}^{2}}\le \sqrt{\int \left| \nabla f(x)\right| ^{2}\mathrm{{d}}x} \end{aligned}$$

    for weakly differentiable functions \(f:\ \varOmega \rightarrow \mathbb {R}\). Then for any \(f\in \mathscr {L}^{2}\) and any density \(g\in \mathscr {L}^{2}\) (i.e., \(g\ge 0\) and \(\int g(x)\mathrm{{d}}x=1\)), we have that

    $$\begin{aligned} \left| \mathbb {E}\left[ f(X_{t})|X_{0}\sim \rho \right] -\mathbb {E}\left[ f(Y)\right] \right| \le \exp \left( -\frac{1}{2}\lambda ^{2}t\right) \left\| g-\frac{1}{\mathrm {Vol}(\varOmega )}\right\| _{\mathscr {L}^{2}}\left\| f\right\| _{\mathscr {L}^{2}} \end{aligned}$$

    where \(Y\sim \sigma \) and \(\rho (\mathrm{{d}}x)=g(x)\mathrm{{d}}x\).

Proof

These results are well known. We relate them at a high level here for the convenience of the reader.

The key is to grasp the connection between X and a certain so-called Dirichlet form,

$$\begin{aligned} \mathscr {E}(f,g)=\frac{1}{2}\int _{\varOmega }\left\langle \nabla f,\nabla g\right\rangle {\hbox {d}}x \end{aligned}$$

which is understood as a bilinear form on the Sobolev space \(H^{1}\) of weakly differentiable functions on \(\varOmega \). The arc of this connection is the content of the treatise [9]. We will only sketch it briefly in this paragraph. Let \(\mathscr {L}^{2}(\varOmega )\) denote the space of square-integrable measurable functions on \(\varOmega \), equipped with the inner product \(\left\langle f,g\right\rangle _{\mathscr {L}^{2}}=\int f(x)g(x){\hbox {d}}x\). One can find a unique nonnegative definite operator \(A:\ H^{1}\rightarrow \mathscr {L}^{2}\) such that \(\mathscr {E}(f,g)=\int \left( Af\right) \left( Ag\right) {\hbox {d}}x\). It turns out that A is self-adjoint. One can thus obtain a family of operators of the form \(T_{t}:\ \mathscr {L}^{2}\rightarrow \mathscr {L}^{2}\), uniquely defined as

$$\begin{aligned} T_{t}=\mathrm{e}^{-A^{2}t} \end{aligned}$$

Note that \(\mathrm{e}^{-A^{2}t}\) is defined on all of \(\mathscr {L}^{2}\) even though A is only defined on \(H^{1}\); we refer the reader to [17] for a very clear introduction to these considerations. If \(\varOmega \) has Lipschitz boundary (i.e., the boundary can locally be represented as the epigraph of a Lipschitz function), one can then (not necessarily uniquely) define a strong Markov process \(\left\{ Y_{t}\right\} _{t\ge 0}\) such that

$$\begin{aligned} \mathbb {E}\left[ f(Y_{t+s})|Y_{s}=y\right] =\left( T_{t}f\right) (y) \end{aligned}$$

almost surely with respect to Lebesgue measure, for every \(f\in \mathscr {L}^{2}\). In this case we may say that Y is “weakly determined” by \(\mathscr {E}\).

It is shown in [2] that if \(\varOmega \) is bounded with Lipschitz boundary, then any process which is weakly determined by \(\mathscr {E}\) will be a reflecting Brownian motion driven by some Brownian motion W. It is also shown that at least one such process exists. Since a bounded convex set automatically has a Lipschitz boundary (cf. Corollary 1.2.2.3 of [10]) and Tanaka showed that reflecting Brownian motion on a convex set is uniquely defined, it follows that the process X must be weakly determined by \(\mathscr {E}\).

This allows us to prove all three of our claims:

  1. Since [9] shows that every process weakly determined by \(\mathscr {E}\) on a domain with Lipschitz boundary is a strong Markov process, it follows that X is a strong Markov process. The reversibility of X with respect to \(\sigma \) then follows from the fact that \(T_{t}\) is self-adjoint as an operator on \(\mathscr {L}^{2}(\varOmega ,\sigma )\) (this, in turn, follows from the fact that A is self-adjoint).

  2. In [4] it is shown that if a process \(\left\{ Y_{t}\right\} _{t\ge 0}\) is weakly determined by \(\mathscr {E}\) and \(\varOmega \) is convex, then

    $$\begin{aligned} \lim _{t\rightarrow \infty }\sup _{y}\sup _{A}\left| \mathbb {P}(Y_{t}\in A|Y_{0}=y)-\sigma (A)\right| =0 \end{aligned}$$

    Thus, the same follows for our process, X.

  3. Let \(\lambda \) be as in the statement of the lemma. Since \(\mathscr {E}(f,f)=\frac{1}{2}\int \left| \nabla f\right| ^{2}{\hbox {d}}x\), the assumption on \(\lambda \) reads \(\int f{\hbox {d}}x=0\implies \frac{1}{2}\lambda ^{2}\left\| f\right\| _{\mathscr {L}^{2}}^{2}\le \mathscr {E}(f,f)\). Recall that we have said there is a unique operator \(A:\ H^{1}\rightarrow \mathscr {L}^{2}\) such that \(\mathscr {E}(f,g)=\int \left( Af\right) \left( Ag\right) {\hbox {d}}x\). In particular, \(\mathscr {E}(f,f)=\left\| Af\right\| _{\mathscr {L}^{2}}^{2}\). We may thus rephrase our understanding of \(\lambda \) by saying that \(\int f{\hbox {d}}x=0\implies \lambda \left\| f\right\| _{\mathscr {L}^{2}}\le \sqrt{2}\left\| Af\right\| _{\mathscr {L}^{2}}\), i.e., the spectrum of \(A^{2}\) on mean-zero functions is bounded below by \(\lambda ^{2}/2\). Using spectral methods it is thus straightforward to show that

    $$\begin{aligned} \int f{\hbox {d}}x=0\implies \left\| \mathrm{e}^{-A^{2}t}f\right\| _{\mathscr {L}^{2}}^{2}\le \mathrm{e}^{-\lambda ^{2}t}\left\| f\right\| _{\mathscr {L}^{2}}^{2} \end{aligned}$$
    (3)

    Using this and Cauchy–Schwarz, one can readily show that

    $$\begin{aligned} \left| \mathbb {E}\left[ f(X_{t})|X_{0}\sim \rho \right] -\mathbb {E}\left[ f(Y)\right] \right| \le \mathrm{e}^{-\frac{1}{2}\lambda ^{2}t}\left\| g-\frac{1}{\mathrm {Vol}(\varOmega )}\right\| _{\mathscr {L}^{2}}\left\| f\right\| _{\mathscr {L}^{2}} \end{aligned}$$

\(\square \)

Corollary 1

Let \(Y\sim \sigma \), \(\rho (\mathrm{{d}}x)=g(x)\mathrm{{d}}x\), \(g\ge 0\) and \(\rho (\varOmega )=1\). Then

$$\begin{aligned} \left| \mathbb {E}\left[ f(X_{t})|X_{0}\sim \rho \right] -\mathbb {E}\left[ f(Y)\right] \right| \le \exp \left( -\frac{1}{2}\left( \frac{\pi }{d}\right) ^{2}t\right) \left\| g-\frac{1}{\mathrm {Vol}(\varOmega )}\right\| _{\mathscr {L}^{2}}\left\| f\right\| _{\mathscr {L}^{2}} \end{aligned}$$

Proof

The work in [3] shows that \(\lambda =\pi /d\) fills the required role for statement 3 of Lemma 1. \(\square \)
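As a sanity check on the constant \(\lambda =\pi /d\), one can look at the one-dimensional case \(\varOmega =(0,d)\), where the mean-zero function \(\cos (\pi x/d)\) turns the Poincaré inequality into an equality. The sketch below verifies this by a midpoint-rule quadrature; it is only an illustration in dimension one (not a check of the general result of [3]), and the helper name `poincare_ratio` and the grid size are our own assumptions.

```python
import math


def poincare_ratio(f, df, d, n=20000):
    """Midpoint-rule estimate of sqrt(int |f'|^2 dx) / sqrt(int f^2 dx) on (0, d).

    The caller is responsible for passing a mean-zero f and its derivative df.
    """
    h = d / n
    num = sum(df((i + 0.5) * h) ** 2 for i in range(n)) * h
    den = sum(f((i + 0.5) * h) ** 2 for i in range(n)) * h
    return math.sqrt(num / den)


if __name__ == "__main__":
    d = 2.0
    # Extremal function: the ratio equals pi / d.
    f1 = lambda x: math.cos(math.pi * x / d)
    df1 = lambda x: -(math.pi / d) * math.sin(math.pi * x / d)
    # Another mean-zero function: the ratio is strictly larger.
    f2 = lambda x: math.cos(2.0 * math.pi * x / d)
    df2 = lambda x: -(2.0 * math.pi / d) * math.sin(2.0 * math.pi * x / d)
    print(poincare_ratio(f1, df1, d), poincare_ratio(f2, df2, d), math.pi / d)
```

Any other mean-zero test function should produce a ratio of at least \(\pi /d\), which is what statement 3 of Lemma 1 requires of \(\lambda \).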

This last corollary gives a satisfying grip on the \(\mathscr {L}^{2}\) ergodic convergence for Brownian motion in convex domains. Our endeavor here is to complement this with a comparable analysis of the total variation convergence.

Toward this end, we will employ a coupling construction. That is, we will construct a joint process \(\left\{ X_{t},Y_{t}\right\} _{t\ge 0}\) so that X and Y both carry the law of reflecting Brownian motion, but each has a different initial condition. We will construct them in such a way that \(\tau =\inf \left\{ t:\ X_{t}=Y_{t}\right\} \) is almost surely finite:

Theorem 3

(Atar and Burdzy [1]) Fix any bounded convex set \(\varOmega \). Let \(\left\{ W_{t}\right\} _{t\ge 0}\) denote a Brownian motion. Then there exists a pathwise-unique solution to the equations

$$\begin{aligned} \begin{array}{ll} X_{t} &{} =x+W_{t}+L_{t}\in \bar{\varOmega }\\ Y_{t} &{} =y+Z_{t}+M_{t}\in \bar{\varOmega }\\ Z_{t} &{} =W_{t}-\int _{0}^{t}2\eta _{s}\left\langle \eta _{s},\mathrm{{d}}W_{s}\right\rangle \\ \eta _{t} &{} =\frac{X_{t}-Y_{t}}{\left| X_{t}-Y_{t}\right| } \end{array}\qquad \qquad \qquad \begin{array}{ll} \left| L\right| _{t} &{} =\int _{0}^{t}\mathbb {I}_{X_{s}\in \partial \varOmega }\mathrm{{d}}\left| L\right| _{s}<\infty \\ L_{t} &{} =-\int _{0}^{t}\varvec{n}_{L}(s)\mathrm{{d}}\left| L\right| _{s}\\ \left| M\right| _{t} &{} =\int _{0}^{t}\mathbb {I}_{Y_{s}\in \partial \varOmega }\mathrm{{d}}\left| M\right| _{s}<\infty \\ M_{t} &{} =-\int _{0}^{t}\varvec{n}_{M}(s)\mathrm{{d}}\left| M\right| _{s} \end{array} \end{aligned}$$
(4)

Here \(\mathrm{{d}}\left| L\right| _{s}\) plays the role of the measure \(\mu \) in Definition 2; we require that \(\varvec{n}_{L}(s)\) is a normal vector of supporting hyperplanes of \(\varOmega \) at \(X_{s}\), \(\mathrm{{d}}\left| L\right| _{s}\)-almost surely. Likewise for \(\varvec{n}_{M},Y_{s},\mathrm{{d}}\left| M\right| _{s}\). Let \(\tau =\inf \left\{ t:\ X_{t}=Y_{t}\right\} \) and

$$\begin{aligned} \tilde{Y}_{t}={\left\{ \begin{array}{ll} Y_{t} &{} t\le \tau \\ X_{t} &{} t\ge \tau \end{array}\right. } \end{aligned}$$

Then \(\left\{ \left( X_{t},\tilde{Y}_{t}\right) \right\} _{t\ge 0}\) constitutes a strong Markov process, and both X and \(\tilde{Y}\) are reflecting Brownian motions.

Proof

We refer the reader to the work of Atar and Burdzy in [1]. Note that although this article focuses on the case that \(\partial \varOmega \) is smooth, it also mentions that all of the reasoning goes through for any set which is “admissible” according to the lights of work by Lions and Sznitman [15]. Convex sets are indeed “admissible” according to Remark 3.1 of the work by Lions and Sznitman. \(\square \)

We emphasize that even though X and Y in this theorem are profoundly coupled, individually they both behave like Brownian motions trapped inside \(\varOmega \) by reflecting boundaries. It is also worth emphasizing that there are two completely different conceptual “reflections” at play here:

  1. The normally reflecting boundaries keep X and Y inside \(\varOmega \).

  2. The mirror coupling causes Y to generally behave like the mirror image of X, reflected over a plane halfway between X and Y.

These two kinds of reflections may interact when X or Y hits a point in \(\partial \varOmega \). In this case the direction of reflection (which is mathematically expressed as \(\eta _{t}\)) may rotate.
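The mirror step in Eq. (4), \({\hbox {d}}Z=\mathrm{d}W-2\eta \left\langle \eta ,\mathrm{d}W\right\rangle \), is a Householder reflection of the driving increment across the hyperplane orthogonal to \(\eta \). The following minimal sketch applies it to a single increment. The function name `mirror_increment` is hypothetical, boundary reflection is deliberately left out, and returning the unreflected increment once the two points coincide is our own convention, chosen to match the definition of \(\tilde{Y}\).

```python
import math


def mirror_increment(dw, x, y):
    """Return dZ = dW - 2 * eta * <eta, dW>, with eta = (x - y) / |x - y|.

    dw, x, y are sequences of equal length (coordinates in R^n).
    """
    diff = [a - b for a, b in zip(x, y)]
    norm = math.sqrt(sum(c * c for c in diff))
    if norm == 0.0:
        # Assumption of this sketch: after coupling, both processes
        # are driven by the same increment.
        return list(dw)
    eta = [c / norm for c in diff]
    proj = sum(e * w for e, w in zip(eta, dw))
    return [w - 2.0 * proj * e for w, e in zip(dw, eta)]


if __name__ == "__main__":
    dz = mirror_increment([0.3, 0.4], [1.0, 0.0], [0.0, 0.0])
    print(dz)
```

The reflection preserves the length of the increment, so Z is again a Brownian motion, while the difference \(\mathrm{d}W-\mathrm{d}Z\) points along \(\eta \); this is why the distance between X and Y behaves like a one-dimensional process run at four times the usual speed.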

Finally, we turn to the function \(F_{d}\left( t,k\right) \). This function comes up several times for different reasons; here we hope to clarify the relationships between these different forms.

Lemma 2

(Connections between \(F_{d}\left( t,k\right) \) and one-dimensional Brownian motion) Let \(W_t\) denote a one-dimensional reflected Brownian motion trapped inside \([-d,d]\). Let \(\tau _d = \inf \{t:\ W_t \notin (-d,d)\}\) and \(\tau _0=\inf \{t:\ W_t=0\}\). Let

$$\begin{aligned} F_{d}\left( t,k\right) \triangleq \sum _{n=0}^{\infty }\mathrm{e}^{-\frac{\pi ^{2}}{8d^{2}}\left( 2n+1\right) ^{2}t}\frac{4\left( -1\right) ^{n}}{\pi \left( 2n+1\right) }\cos \left( \frac{2n+1}{2}\times \frac{\pi k}{d}\right) \end{aligned}$$

Then

  1. \(\mathbb {P}(t\le \tau _d|W_0=w)= F_{d}\left( t,w\right) \).

  2. \(\mathbb {P}(t\le \tau _0| W_0=w\ge 0) = F_{d}\left( t,d-w\right) \).

  3. \(\mathbb {P}(W_t\le 0 |W_0 = w\ge 0) = \frac{1}{2}-\frac{1}{2}F_{d}\left( t,d-w\right) \).

Proof

The foundations of these results may be found in [14]; the particular function \(F_{d}\left( t,k\right) \) can be found in [11], Equations (3.8) and (3.9). Here we give a sketch.

  1. Let \(X_t\) denote the process \(W_t\) killed at \(-d\) and d. That is, \(X_t=W_t\) for \(t<\tau _d\) but \(X_t=\perp \) for \(t\ge \tau _d\), where \(\perp \) designates a graveyard state. Then for any function f on \((-d,d)\), we may define \(f(\perp )=f(-d)=f(d)=0\) and consider

    $$\begin{aligned} \phi (w,t)=\mathbb {E}[f(X_t)|W_0=w] \end{aligned}$$

    We can make a slight extension to the Dirichlet form arguments used in Lemma 1 to account for this kind of killed Brownian motion. In brief, we replace the Sobolev space \(H^1\) with the Sobolev space \(H^1_0\) of functions which vanish at the boundary. The operator A of Lemma 1 turns out to be (up to a constant factor) the derivative, so the equation \(\phi (w,t)=(\mathrm{e}^{-A^2t}f)(w)\) from that lemma actually corresponds to the differential equation

    $$\begin{aligned} \phi _t = \frac{1}{2}\phi _{ww} \qquad \phi (w,0)=f(w) \qquad \phi (-d,t)=\phi (d,t)=0 \end{aligned}$$

    (Here we have adopted the notation that subscripts indicate derivatives.) We can easily solve this equation for special trigonometric initial conditions of the form \(f(x)=\cos (n\pi x/2d)\) with n odd, which vanish at \(\pm d\). Indeed, in this case we have \(\phi (w,t)=\cos (n\pi w/2d)\mathrm{e}^{-tn^2 \pi ^2 / 8d^2}\).

    Now our primary interest is in \(\mathbb {P}(t\le \tau _d|W_0=w)\), which can be articulated as \(\mathbb {E}[f(X_t)|W_0=w]\) in the case \(f(x)=1\) for all \(x\in (-d,d)\) and \(f(x)=0\) otherwise. Thus, we can obtain our object of interest if we can write this f in a spectral representation, i.e., in terms of the nice trigonometric functions. This is easily done:

    $$\begin{aligned} f(x)=\sum _{n=0}^\infty \frac{4(-1)^n}{\pi (2n+1)}\cos \left( \frac{2n+1}{2}\times \frac{\pi x}{d} \right) = {\left\{ \begin{array}{ll}1 &{} x\in (-d,d) \\ 0 &{} x\in \{-d,d\}\end{array}\right. } \end{aligned}$$

    Using linearity (and checking that nothing goes wrong in taking the limits; cf. [9, 14]), we conclude that \(\mathbb {E}[f(X_t)|W_0=w]\) is given by \(F_{d}\left( t,w\right) \), so the same must be true of \(\mathbb {P}(t\le \tau _d|W_0=w)\).

  2. This is nearly the same problem. Only the boundary conditions are slightly different. In particular, we have a graveyard state at \(W=0\) and a reflection at \(W=d\). These translate into the boundary conditions \(\phi (0,t)=0\) and \(\phi _w(d,t)=0\). However, applying the symmetry of the problem, we can readily show that this is equivalent to the boundary conditions \(\phi (0,t)=0\) and \(\phi (2d,t)=0\), which reduces to the same problem we just handled, only offset by d and reflected. Equivalently, one may consider that the time it takes W to hit \(-d\) or d starting from w must be the same as the time it takes a simple (unreflected) Brownian motion to hit 0 or 2d starting at \(d-w\), and by symmetry this must be the same as the time it takes the reflected Brownian motion to hit 0 starting from \(d-w\).

  3. This problem carries the reflecting boundary conditions \(\phi _w(-d,t)=\phi _w(d,t)=0\) and the initial condition \(\phi (w,0)=1\) for \(w<0\) and \(\phi (w,0)=0\) for \(w>0\). The function \(1-2\phi \) is odd in w, and hence vanishes at \(w=0\) for all t, so on \([0,d]\) it solves the problem of the previous item. The solution is therefore just our first problem, vertically scaled by \(-1/2\), vertically shifted by \(+1/2\), and horizontally shifted by d.

\(\square \)
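The claims in the proof sketch above can be spot-checked numerically. The sketch below uses central finite differences to test that the truncated series for \(F_{d}\) satisfies the heat equation \(\phi _t=\frac{1}{2}\phi _{ww}\) in the interior and vanishes at \(w=\pm d\). The helper names, the step size h, and the test point are illustrative assumptions, and the check is of course no substitute for the convergence arguments cited in the proof.

```python
import math


def survival_F(d, t, k, n_terms=200):
    """Truncated series for F_d(t, k), as defined in Lemma 2."""
    total = 0.0
    for n in range(n_terms):
        m = 2 * n + 1
        decay = math.exp(-(math.pi ** 2) * m * m * t / (8.0 * d * d))
        total += decay * 4.0 * (-1) ** n / (math.pi * m) * math.cos(0.5 * m * math.pi * k / d)
    return total


def heat_residual(d, t, w, h=1e-3):
    """Central-difference estimate of phi_t - (1/2) * phi_ww for phi = F_d."""
    phi_t = (survival_F(d, t + h, w) - survival_F(d, t - h, w)) / (2.0 * h)
    phi_ww = (survival_F(d, t, w + h) - 2.0 * survival_F(d, t, w)
              + survival_F(d, t, w - h)) / (h * h)
    return phi_t - 0.5 * phi_ww


if __name__ == "__main__":
    d, t, w = 1.0, 0.5, 0.3
    print("PDE residual:", heat_residual(d, t, w))
    print("boundary values:", survival_F(d, t, -d), survival_F(d, t, d))
```

Both printed quantities should be close to zero, up to discretization and rounding error.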

3 Application of Known Results

To show our main theorem, we must understand the distribution of the coupling time in the mirror construction from Theorem 3. Exact calculation of this distribution may be impossible, but some bounds are straightforward to obtain:

Lemma 3

Let \(\varOmega ,X,Y,\tilde{Y},\tau \) be as in Theorem 3. Let d denote the diameter of \(\varOmega \). Then

$$\begin{aligned} \mathbb {P}_{x}\left( t\le \tau \right)&\le F_{d}\left( 4t,d-\left| x-y\right| \right) \end{aligned}$$

Proof

Let us consider

$$\begin{aligned} R_{t}=\left| X_{t}-Y_{t}\right| \end{aligned}$$

The key question is how long it takes before \(R_t=0\). Itô’s lemma yields that

$$\begin{aligned} R_{t}=\left| x-y\right| +\int _{0}^{t}\left\langle \eta _{s},\varvec{n}_{M}(s){\hbox {d}}\left| M\right| _{s}-\varvec{n}_{L}(s){\hbox {d}}\left| L\right| _{s}\right\rangle +2\left\langle \eta _{s},{\hbox {d}}W_{s}\right\rangle \end{aligned}$$

Dambis–Dubins–Schwarz then yields that we can find some one-dimensional Brownian motion B such that

$$\begin{aligned} R_{t}=\left| x-y\right| +\int _{0}^{t}\left\langle \eta _{s},\varvec{n}_{M}(s){\hbox {d}}\left| M\right| _{s}-\varvec{n}_{L}(s){\hbox {d}}\left| L\right| _{s}\right\rangle +B_{4t} \end{aligned}$$

Let us now inspect

$$\begin{aligned} \varPhi _{t}=\int _{0}^{t}\left\langle \eta _{s},\varvec{n}_{M}(s){\hbox {d}}\left| M\right| _{s}-\varvec{n}_{L}(s){\hbox {d}}\left| L\right| _{s}\right\rangle \end{aligned}$$

Let us focus on three properties of this object:

  1. \(t\mapsto \varPhi _{t}\) is monotone decreasing. Indeed, recall that \(\varvec{n}_{M}(s)\) is a normal vector of a supporting hyperplane of \(\varOmega \) at \(Y_{s}\), \({\hbox {d}}\left| M\right| _{s}\)-almost surely, and that the sign convention \(M_{t}=-\int _{0}^{t}\varvec{n}_{M}(s){\hbox {d}}\left| M\right| _{s}\) in Eq. (4) makes it outward-facing. That is,

    $$\begin{aligned} \varOmega \subset \left\{ z:\ \left\langle \varvec{n}_{M}(s),z-Y_{s}\right\rangle \le 0\right\} \end{aligned}$$

    Taking \(z=X_{s}\), we certainly have

    $$\begin{aligned} \left\langle \eta _{s},\varvec{n}_{M}(s)\right\rangle =\left\langle \frac{X_{s}-Y_{s}}{\left| X_{s}-Y_{s}\right| },\varvec{n}_{M}(s)\right\rangle \le 0 \end{aligned}$$

    The same arguments apply to \(-\left\langle \eta _{s},\varvec{n}_{L}(s)\right\rangle \).

  2. Since the diameter of \(\varOmega \) is d, we have that \(R_{t}\le d\) for all t. Put another way, \(\varPhi _{t}\le d-B_{4t}-\left| x-y\right| \).

  3. \(\varPhi _{0}=0\).

Putting these facts together, we obtain the overall bound of

$$\begin{aligned} \varPhi _{t}\le \inf _{s\le t}\left( d-B_{4s}-\left| x-y\right| \right) \wedge 0 \end{aligned}$$

And thus,

$$\begin{aligned} R_{t}\le \left| x-y\right| +B_{4t}+\inf _{s\le t}\left( d-B_{4s}-\left| x-y\right| \right) \wedge 0\triangleq \tilde{R}_{4t} \end{aligned}$$

This is useful because the law of \(\tilde{R}_{t}\) is well understood. It is that of a one-dimensional Brownian motion with reflection at the point d, initialized at \(\tilde{R}_{0}=\left| x-y\right| \). (Indeed, it is easy to see that it satisfies Definition 2.) Intuitively, this signifies that the amount of time it takes R to go to zero is shorter than the amount of time it takes a (time-rescaled) reflected one-dimensional Brownian motion to hit zero.

In particular, let \(\tilde{\tau }=\inf \left\{ t:\ \tilde{R}_{t}=0\right\} \). Since \(R_{t}\le \tilde{R}_{4t}\) for every t, it follows that \(\tilde{\tau }\ge 4\tau \). On the other hand, Lemma 2 yields

$$\begin{aligned} \mathbb {P}_{x}\left( t\le \tilde{\tau }\right)&=F_{d}\left( t,d-\left| x-y\right| \right) \end{aligned}$$

Our result follows immediately. \(\square \)
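The dominating process \(\tilde{R}\) used in this proof is easy to construct on a sampled path. The sketch below applies the formula \(\tilde{R}_{u}=\left| x-y\right| +B_{u}+\inf _{s\le u}\left( d-B_{s}-\left| x-y\right| \right) \wedge 0\) along a list of values of B; the function name `dominating_process` and the argument name `r0` (standing for \(\left| x-y\right| \)) are our own illustrative choices.

```python
def dominating_process(b_path, r0, d):
    """Evaluate R~_u = r0 + B_u + min(0, min_{s<=u}(d - B_s - r0)) on a sampled path.

    b_path: sampled values of the Brownian motion B, with b_path[0] == 0.
    r0:     the initial distance |x - y| (assumed to satisfy 0 <= r0 <= d).
    d:      the reflection level (the diameter of the domain).
    """
    slack = 0.0
    out = []
    for b in b_path:
        # Running infimum, capped at zero, of the room left below level d.
        slack = min(slack, d - b - r0)
        out.append(r0 + b + slack)
    return out


if __name__ == "__main__":
    path = [0.0, 0.5, 1.0, 0.2, -0.4]
    print(dominating_process(path, r0=0.5, d=1.0))
```

On the grid, the output starts at `r0`, never exceeds `d`, and follows the increments of B whenever it is strictly below `d`, which is the discrete analogue of a Brownian motion reflected downward at the level d.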

This leads to our main theorem, which we restate here for the convenience of the reader:

Theorem 4

Let \(\varOmega \subset \mathbb {R}^{n}\) be bounded, open, and convex, with diameter d. Let \(\left\{ p\left( t,x,\mathrm{{d}}y\right) \right\} _{t\ge 0,x\in \varOmega }\) denote the transition measures of Brownian motion trapped inside \(\varOmega \) by normally reflecting barriers. Then

$$\begin{aligned} \left\| p\left( t,x,\cdot \right) -p\left( t,y,\cdot \right) \right\| _{\mathrm {TV}}&\le F_{d}\left( 4t,d-\left| x-y\right| \right) \\ \left\| p\left( t,x,\cdot \right) -\sigma \right\| _{\mathrm {TV}}&\le \int F_{d}\left( 4t,d-\left| x-y\right| \right) \sigma (\mathrm{{d}}y)\le F_{d}\left( 4t,0\right) \end{aligned}$$

This very last bound cannot be improved by more than a factor of 2. In particular, taking the special case that \(\varOmega =\left[ -d/2,d/2\right] \times [0,\epsilon ]^{n-1}\subset \mathbb {R}^n\) for any \(\epsilon >0\), we have that \(\left\| p\left( t,d/2,\cdot \right) -\sigma \right\| _{\mathrm {TV}} \ge F_{d}\left( 4t,0\right) /2\).

Proof

The total variation distance between \(p\left( t,x,\cdot \right) \) and \(p\left( t,y,\cdot \right) \) is easy to bound using Lemma 3. Recall that X and \(\tilde{Y}\) are both reflecting Brownian motions, with \(X_{0}=x\), \(\tilde{Y}_{0}=y\), and \(X_{t}=\tilde{Y}_{t}\) for \(t\ge \tau \). The lemma then shows that \(\mathbb {P}\left( t<\tau \right) \le F_{d}\left( 4t,d-\left| x-y\right| \right) \). We thus obtain our first total variation bound:

$$\begin{aligned} \left\| p\left( t,x,\cdot \right) -p\left( t,y,\cdot \right) \right\| _{\mathrm {TV}}&=\sup _{A}\left| \mathbb {P}\left( X_{t}\in A\right) -\mathbb {P}\left( \tilde{Y}_{t}\in A\right) \right| \\&\le \mathbb {P}\left( X_{t}\ne \tilde{Y}_{t}\right) \le \mathbb {P}\left( t<\tau \right) \le F_{d}\left( 4t,d-\left| x-y\right| \right) \end{aligned}$$

The second total variation bound then follows immediately from the fact that \(\sigma \) is the stationary distribution of the process, i.e., \(\int p\left( t,x,A\right) \sigma ({\hbox {d}}x)=\sigma (A)\). The second total variation bound is thus a kind of “average” of the first total variation bound:

$$\begin{aligned} \left\| p\left( t,x,\cdot \right) -\sigma \right\| _{\mathrm {TV}}&=\sup _{A}\left| p\left( t,x,A\right) -\int p\left( t,y,A\right) \sigma ({\hbox {d}}y)\right| \\&\le \int \left\| p\left( t,x,\cdot \right) -p\left( t,y,\cdot \right) \right\| _{\mathrm {TV}}\sigma ({\hbox {d}}y)\\&\le \int F_{d}\left( 4t,d-\left| x-y\right| \right) \sigma ({\hbox {d}}y) \end{aligned}$$

It is well known that \(\sup _{k}F_{d}\left( t,k\right) =F_{d}\left( t,0\right) \) for every t. Applying this at time 4t, we obtain the overall bound of \(\left\| p\left( t,x,\cdot \right) -\sigma \right\| _{\mathrm {TV}}\le F_{d}\left( 4t,0\right) \).
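The averaging step can be checked numerically in the simplest case. The following is a minimal sketch, assuming \(\varOmega =[0,d]\) and using the classical method-of-images series for the density of one-dimensional reflecting Brownian motion; the function names, the grid size and the truncation of the image sum are illustrative choices, not part of the argument above. It compares the discretized distance from \(p(t,x,\cdot )\) to the uniform law with the \(\sigma \)-average of the pairwise distances, using the fact that the total variation distance between two probability densities is half the \(\mathscr {L}^{1}\) norm of their difference.

```python
import math

def reflected_density(t, x, y, d, images=6):
    """Density at y of reflecting Brownian motion on [0, d] started at x.

    Method of images: sum Gaussian kernels over the reflected copies of x.
    The truncation level `images` is an illustrative choice.
    """
    norm = 1.0 / math.sqrt(2.0 * math.pi * t)
    total = 0.0
    for k in range(-images, images + 1):
        shift = 2.0 * k * d
        total += math.exp(-((y - x + shift) ** 2) / (2.0 * t))
        total += math.exp(-((y + x + shift) ** 2) / (2.0 * t))
    return norm * total

def tv_between(p, q, h):
    """Half the L1 distance between two densities sampled on a grid of step h."""
    return 0.5 * h * sum(abs(a - b) for a, b in zip(p, q))

def averaging_check(t, x, d, n=120):
    """Return (TV to uniform, sigma-average of pairwise TV) on a midpoint grid."""
    h = d / n
    grid = [(j + 0.5) * h for j in range(n)]
    p_x = [reflected_density(t, x, z, d) for z in grid]
    uniform = [1.0 / d] * n
    lhs = tv_between(p_x, uniform, h)
    # Average over starting points y drawn from the uniform law sigma.
    rhs = sum(
        tv_between(p_x, [reflected_density(t, y, z, d) for z in grid], h)
        for y in grid
    ) / n
    return lhs, rhs

if __name__ == "__main__":
    lhs, rhs = averaging_check(t=0.1, x=0.0, d=1.0)
    print(f"TV to uniform: {lhs:.6f}, averaged pairwise TV: {rhs:.6f}")
```

Up to discretization error, the first returned value should not exceed the second, which is the inequality used in the middle line of the display.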

Finally, we turn to how much this bound could be further improved: any upper bound on the rate of uniform ergodicity which relies only on diameter and dimension cannot be smaller than the bound given here by more than a factor of two. Indeed, consider any pipe \(\varOmega =\left[ -d/2,d/2\right] \times [0,\epsilon ]^{n-1}\subset \mathbb {R}^n\) for \(\epsilon >0\). Notice that the law of X can be decomposed into n independent one-dimensional reflecting Brownian motions, and so the total variation can likewise be decomposed. The total variation distance may therefore be lower bounded by the total variation distance along any single coordinate. Thus, since we seek a lower bound, without loss of generality, let us assume \(\varOmega = [-d/2,d/2]\). Then \(\left\| p\left( t,x,\cdot \right) -\sigma \right\| _{\mathrm {TV}}\) may further be lower bounded by \(|\mathbb {P}(X_t\le 0|X_0=d/2) - 1/2|\). The relevant probability is given by the third result of Lemma 2:

$$\begin{aligned} \mathbb {P}(X_t\le 0|X_0=d/2) = \frac{1}{2}-\frac{1}{2}F_{d/2}\left( t,d/2-d/2\right) =\frac{1}{2}-\frac{1}{2}F_{d}\left( 4t,0\right) \end{aligned}$$

which yields the desired result. \(\square \)
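The one-dimensional probability in this lower bound can also be evaluated directly. The following is a minimal sketch that does not use \(F_{d}\) or Lemma 2; instead it assumes two classical representations of reflecting Brownian motion on an interval, namely the Neumann cosine eigenfunction expansion and the method of images. For convenience the interval is translated to \([0,d]\) with the process started at d, so the event \(\{X_{t}\le 0\}\) becomes \(\{X_{t}\le d/2\}\). The function names and truncation levels are illustrative.

```python
import math

def prob_far_half_series(t, d, terms=400):
    """P(X_t <= d/2 | X_0 = d) for reflecting BM on [0, d], via the cosine series.

    Integrating the Neumann eigenfunction expansion of the transition density
    over [0, d/2] leaves only odd modes k = 2m + 1.
    """
    total = 0.0
    for m in range(terms):
        k = 2 * m + 1
        total += (-1) ** m / k * math.exp(-((k * math.pi / d) ** 2) * t / 2.0)
    return 0.5 - (2.0 / math.pi) * total

def _std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_far_half_images(t, d, images=50):
    """The same probability via the method of images (sum of Gaussian CDFs)."""
    s = math.sqrt(t)
    x = d  # starting point at the right endpoint
    total = 0.0
    for k in range(-images, images + 1):
        shift = 2.0 * k * d
        total += _std_normal_cdf((d / 2 - x + shift) / s) - _std_normal_cdf((-x + shift) / s)
        total += _std_normal_cdf((d / 2 + x + shift) / s) - _std_normal_cdf((x + shift) / s)
    return total

if __name__ == "__main__":
    for t in (0.05, 0.5, 2.0):
        p = prob_far_half_series(t, d=1.0)
        print(f"t={t}: P = {p:.8f}, |P - 1/2| = {abs(p - 0.5):.8f}")
```

The quantity \(|P-1/2|\) printed here is the lower bound on the total variation distance used in the proof. Its leading term decays like \(\exp (-\pi ^{2}t/(2d^{2}))\), the same exponential rate as in the Poincaré-type bound quoted in the introduction.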

4 Conclusions

Among all convex sets with diameter d, this work begins to suggest that the one-dimensional interval (e.g., \(\varOmega =[0,d]\)) may provide a kind of worst-case scenario for mixing rates. This is helpful because the one-dimensional interval is easy to analyze.

However, “typical” high-dimensional sets of diameter d may mix much faster than our bounds would suggest. Eq. (1) from [16] seemed to imply that mixing might get slower in high dimensions, and our theorem suggests that the worst-case mixing rate does not depend upon the dimension, but the reality may be that mixing typically gets faster in higher dimensions. For example, preliminary analysis suggests that n-dimensional Brownian motion in the unit n-dimensional ball mixes quite a bit faster than one-dimensional Brownian motion on [0, d], especially as \(n\rightarrow \infty \). On the other hand, we are simply not sure about the total variation mixing rate for Brownian motion in a high-dimensional cube. Are there simple ways to improve our bounds when we know more about \(\varOmega \)? This is a possible direction of future research.

In another direction, it should be possible to extend this basic method of proof to accommodate a wide variety of Ito diffusions, beyond Brownian motion. Mirror couplings are available for many such diffusions; we refer the reader to [13] and the many papers which have cited it. For every such mirror coupling process, \(\left\{ X_{t},Y_{t}\right\} _{t\ge 0}\), one can analyze the one-dimensional process \(R_{t}=\left| X_{t}-Y_{t}\right| \). By applying Ito’s lemma and Dambis–Dubins–Schwarz, and then taking bounds, one can often obtain a stochastic differential inequality of the form \({\hbox {d}}R_{t}\le {\hbox {d}}\tilde{R}_{t}\), where \(\tilde{R}_{t}\) is some semimartingale that is better understood. Applying stochastic differential inequality results such as those found in [6], one can then bound the high-dimensional process in terms of some simple one-dimensional process.

For example, consider the stochastic differential equation \({\hbox {d}}X_{t}=\mu (X_{t}){\hbox {d}}t+{\hbox {d}}W_{t}\) where \(\mu \) is some Lipschitz vector field. Similar to the technique we used, let \({\hbox {d}}Y_{t}=\mu (Y_{t}){\hbox {d}}t+{\hbox {d}}W_{t}-2\eta _{t}\left\langle \eta _{t},{\hbox {d}}W_{t}\right\rangle \), where \(\eta \) is a normalized version of \(X-Y\). As in our case, if we define \(B_{4t}=\int _{0}^{t}2\left\langle \eta _{s},{\hbox {d}}W_{s}\right\rangle \) then we can show that B is a one-dimensional Brownian motion. Finally, let \(\varGamma \) denote some Lipschitz function satisfying \(\varGamma (r)\ge \sup _{\left| x-y\right| =r}\left\langle x-y,\mu (x)-\mu (y)\right\rangle \), and consider the simple one-dimensional stochastic differential equation

$$\begin{aligned} Z_{t}=\left| x-y\right| +\int _{0}^{t}\frac{\varGamma (Z_{s})}{Z_{s}}{\hbox {d}}s+B_{4t} \end{aligned}$$

Using the results from [6], one can readily show that \(R_{t}\le Z_{t}\). Thus,

$$\begin{aligned} \inf \left\{ t:\ X_{t}=Y_{t}\right\} \le \inf \left\{ t:\ Z_{t}=0\right\} \end{aligned}$$

In short, by analyzing the simple one-dimensional diffusion Z, one can estimate coupling times for extremely complex and high-dimensional processes. These coupling times can then be used to estimate rates of ergodicity. Of course, these kinds of estimates may be quite poor in some cases (e.g., consider the catastrophic case that \(\sup _{\left| x-y\right| =r}\left\langle x-y,\mu (x)-\mu (y)\right\rangle =\infty \)). This difficulty is related to the problem with which we began these conclusions: in some cases it may be very difficult to get high-quality bounds using only a one-dimensional diffusion.
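To illustrate how such a one-dimensional comparison process might be used in practice, the following is a minimal Monte Carlo sketch. It assumes a hypothetical linear drift \(\mu (x)=-\kappa x\), for which one may take \(\varGamma (r)=-\kappa r^{2}\) and hence \(\varGamma (Z)/Z=-\kappa Z\). The process Z is simulated with an Euler–Maruyama scheme, with the increment of \(B_{4t}\) over a step of length dt drawn as a centered Gaussian of variance 4dt. The function name, step size, path count and seed handling are illustrative, and detecting the hitting time on a time grid introduces a discretization bias.

```python
import math
import random

def hit_fraction(drift, r0, horizon, dt=0.002, n_paths=1000, seed=0):
    """Fraction of simulated paths of dZ = drift(Z) dt + dB_{4t} reaching 0 by `horizon`.

    `drift` plays the role of Gamma(z) / z. Each path gets its own generator
    so that different drifts can be compared under identical noise.
    """
    n_steps = int(round(horizon / dt))
    noise_scale = 2.0 * math.sqrt(dt)  # B_{4t} has quadratic variation 4t
    hits = 0
    for i in range(n_paths):
        rng = random.Random(seed * 1_000_003 + i)
        z = r0
        for _ in range(n_steps):
            z += drift(z) * dt + noise_scale * rng.gauss(0.0, 1.0)
            if z <= 0.0:  # Z has reached zero (up to grid resolution)
                hits += 1
                break
    return hits / n_paths

if __name__ == "__main__":
    kappa = 1.0  # hypothetical contraction rate, chosen for illustration
    free = hit_fraction(lambda z: 0.0, r0=1.0, horizon=1.0)
    contracting = hit_fraction(lambda z: -kappa * z, r0=1.0, horizon=1.0)
    print(f"no drift: {free:.3f}, contracting drift: {contracting:.3f}")
```

One minus the returned fraction estimates \(\mathbb {P}(\inf \{t:Z_{t}=0\}>T)\), which by the comparison above bounds the probability that X and Y have not yet coupled by time T, and hence the total variation distance between their laws at time T.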

We have seen that one-dimensional diffusions can be used to analyze very high-dimensional diffusions, although the bounds may not be ideal in certain cases. Might it be possible to improve this basic technique to allow two- or three-dimensional diffusions to give rigorous bounds on high-dimensional processes? This is an intriguing question for future research.