1 Introduction

In this paper we study run-and-tumble motion, which is often used as a model of active particles. The particle motion has two ingredients: first the particle performs a symmetric random walk, and second, independently it moves in a direction dictated by an internal state process. This internal state process is assumed to be a continuous-time stationary Markov process. In the sequel we will first describe how our paper relates to various results on run-and-tumble particles in the literature. Next, we will briefly sketch how our model relates to the broader literature on active matter, stochastic slow-fast systems and directionally reinforced random walks.

1.1 Model and Contributions of This Paper

The model that we study in this paper is an instance of what is more generally called run-and-tumble motion. These are models of particles that follow a preferred direction which is reversed at random points in time. Recent articles include [1,2,3,4,5,6].

We study an active particle whose state process (which determines the preferred direction) is a stationary Markov process (under some technical assumptions), started from its unique ergodic measure. Our main contribution is then twofold. First, we derive a closed-form formula for the limiting diffusion coefficient of the active particle. This formula holds in great generality, including the case where the state process is a diffusion (we will provide examples where an Ornstein–Uhlenbeck process or Brownian motion on a circle serves as the state process). In this formula we can interpret the different terms and observe where the activity manifests itself. We also calculate the large deviations free energy function and rate function in the case where the state process has a finite state space.

Second, we study the role of reversibility of the state process in the diffusion coefficient and large deviations of the active particle (again for finite state spaces). In particular, we show that reversible processes in some sense optimize those quantities. To be more precise, we show that among all processes with the same symmetric part and the same stationary measure, the reversible process maximizes the diffusion coefficient and the free energy function (pointwise) and minimizes the large deviations rate function (also pointwise). The last two results are obtained by showing a pointwise inequality for the Donsker–Varadhan rate function of the empirical processes corresponding to the reversible and non-reversible state processes, respectively.

The calculations that we present are for an active particle in \(\mathbb {R}\), but we explain for all of our results how they generalize to \(\mathbb {R}^d\) and we also provide the explicit formulas in the \(\mathbb {R}^d\) setting.

1.2 Context and Related Literature

First of all, run-and-tumble motion is often used as a model of active matter. As we said before, our active particle performs a symmetric random walk and, on top of that, a motion in a preferred direction that switches. The part of the motion that follows the internal state is called the active part of the motion, because the switching between internal states requires some internal source of energy. The passive part of the motion is the symmetric random walk part and comes from collisions with surrounding molecules.

Note that active particles should not be confused with activated random walk. In those models particles perform random walks, but fall asleep after a random time and are awakened (activated) when other particles jump to their position.

Second, the active particle motion studied in this paper is an example of a stochastic slow-fast system. These are well-studied systems where coupled quantities evolve on different time scales. If one rescales the position of the active particle diffusively, the underlying state process behaves as a fast process and the (rescaled) particle position is a slow process. Asymptotically the fast state process averages out and has a deterministic influence on the slow process: the limiting diffusion coefficient will depend on the state process only through the stationary distribution and the covariance function. For an introduction to stochastic slow-fast systems see for instance [7]. The large deviation results that we obtain are related to more general results for large deviations of slow-fast systems that were studied in for instance [8] or, more recently, in [9].

Third, the active particle motion studied in this paper has strong similarities with a directionally reinforced random walk. This model was first studied in [10], and a multidimensional version in [11]. In [12] and, in a more general context, in [13] it was shown that a process of this type converges to a multidimensional Brownian motion when rescaled diffusively.

We will also compare the diffusion coefficients and large deviations rate functions for active particles with state processes that are either reversible or non-reversible with respect to the same invariant measure. In particular, we will show that the Donsker–Varadhan rate function of reversible processes is dominated by the rate functions of non-reversible processes with the same symmetric part and the same invariant measure. A similar result (in a different context) was obtained in [14].

1.3 Structure of This Paper

In Sect. 2, we introduce the active particle process as a stochastic integral. We split it into a random walk part, a martingale part and an active part.

In Sect. 3, we obtain the limiting diffusion coefficient of the active particle and show that it is the sum of the contributions of the random walk part, the martingale part and the active part. Then we generalize the formulas to the multidimensional case. The limiting diffusion coefficient (or matrix) is then calculated for several concrete examples, both with finite and with infinite state spaces. Finally, we sketch how one obtains a Central Limit Theorem for the active particle.

Next, in Sect. 4, we restrict ourselves to finite state spaces and study the active part of the diffusion coefficient, which is proportional to an inner product involving the inverse of the generator of the state process. We show that among all stationary processes with the same invariant measure and the same symmetric part, the active part of the diffusion coefficient is maximal for the reversible process. We use the 1-dimensional case to show that this also holds for the active part of the diffusion matrix in higher dimensions.

Then in Sect. 5, we move to large deviations (still for finite state spaces). We compute the large deviations free energy function. Using Varadhan’s lemma, we derive an expression for the free energy function of the active particle in terms of the Donsker–Varadhan rate function for the empirical process corresponding to the state process (which in turn gives us the large deviations rate function as the Legendre transform of the free energy). We show that the free energy function is maximal and the rate function is minimal in the reversible case (similar to the situation for the diffusion coefficient) by showing that the Donsker–Varadhan rate function is minimal for reversible processes.

We conclude the paper in Sect. 6 with an analysis of the situation where the state space is \(\{-1,1\}\). In this two-state case we can explicitly calculate the Fourier–Laplace transform of the distribution of the active particle process, the moment generating function and the large deviations free energy function.

2 Preliminaries

In this section we introduce the model and the goal of this paper. First, in Sect. 2.1 we describe in words the models we study and formulate in words the main results. Then, in Sect. 2.2, the definitions will be repeated with more mathematical detail and precise assumptions. In Sect. 2.1 we will also describe the basic example where the internal state space is \(\{-1,1\}\); more examples will follow in Sect. 3.2.

2.1 Informal Description of the Model and Main Results

In the models we consider, a particle moves on \(\mathbb {R}^d\) in continuous time. The particle has a position at time \(t\ge 0\) denoted by \(X_t\), and an “internal state” denoted by \(M_t\). The internal state is assumed to evolve according to a stationary Markov process, and can model e.g. a chemical state of a molecular motor. The active part of the motion is driven by this internal state. The simplest setting is when the internal state takes the values \(\pm 1\) and determines whether the particle drifts to the right or to the left.

Let us now first describe the joint motion of the position and the internal state \((x,m)\) in the simplest setting, where the particle moves on the discrete lattice \(\mathbb {Z}\) and has internal state \(m\in \{-1,1\}\). The motion consists of three parts.

  1. (a)

    At rate \(\kappa \) the particle makes a random walk jump, i.e., \((x,m)\) moves to \((x\pm 1, m)\). This models the “passive” part of the motion, caused by collisions with surrounding molecules.

  2. (b)

    At rate \(\lambda \) the particle jumps according to its internal state, i.e., \((x,m)\) moves to \((x+m,m)\). This corresponds to the active part of the motion, driven by an internal energy source (such as ATP–ADP conversion).

  3. (c)

    At rate \(\gamma \) the internal state flips, i.e., \((x,m)\) moves to \((x,-m)\).

Denoting by \(\mu (x,m;t)\) the probability to be at position \(x\in \mathbb {Z}\) with internal state \(m\in \{-1,1\}\) at time t, the above verbal description of the process is summarised via the master equation

$$\begin{aligned} \frac{d\mu (x,m;t)}{dt}= & {} \kappa (\mu (x+1,m;t)+\mu (x-1,m;t)-2\mu (x,m;t)) \nonumber \\&\quad + \lambda (\mu (x-m,m;t)-\mu (x,m;t)) + \gamma (\mu (x,-m;t)-\mu (x,m;t)) \end{aligned}$$

or alternatively via the generator working on functions from the state space \(\mathbb {Z}\times \{-1,1\}\)

$$\begin{aligned} L f(x,m)= & {} \kappa (f(x+1,m) + f(x-1,m)-2f(x,m))\nonumber \\&\quad + \lambda (f(x+m,m)-f(x,m)) +\gamma ( f(x,-m)- f(x,m)). \end{aligned}$$
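The two-state dynamics described by this generator can be simulated directly with a standard Gillespie scheme. The following minimal sketch (not part of the formal development; the function name and the default rate values are illustrative choices) samples one trajectory of \((x,m)\), starting the internal state from its stationary measure.

```python
import random

def simulate(t_max, kappa=1.0, lam=1.0, gamma=1.0, seed=0):
    """One Gillespie trajectory of the two-state run-and-tumble walk on Z.

    Events: random walk jump x -> x+1 or x-1 (rate kappa each),
    active jump x -> x+m (rate lam), internal state flip m -> -m (rate gamma).
    """
    rng = random.Random(seed)
    x, m, t = 0, rng.choice((-1, 1)), 0.0    # m drawn from mu = (delta_{-1}+delta_1)/2
    total = 2 * kappa + lam + gamma          # total jump rate
    while True:
        t += rng.expovariate(total)          # exponential waiting time to the next event
        if t > t_max:
            return x, m
        u = rng.uniform(0.0, total)          # pick which event occurred
        if u < 2 * kappa:                    # passive random walk jump
            x += 1 if u < kappa else -1
        elif u < 2 * kappa + lam:            # active jump in direction m
            x += m
        else:                                # tumble: flip the internal state
            m = -m

x, m = simulate(t_max=10.0)
```

Since the internal state starts from its stationary measure and the drift directions average to zero, the position has mean zero, matching the computation of \(\mathbb {E}X_t\) in Sect. 3.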

The idea is now to generalise this simple setting: the motion of the particle takes place on \(\mathbb {R}^d\) and we will allow much more general internal state processes (the precise assumptions on them are in the subsection below), including e.g. diffusion processes such as the Ornstein–Uhlenbeck process. In this more general setting, the active part of the motion consists of jumps by the vector v(m) determined by the internal state m, and the internal state is a general stationary ergodic Markov process, whereas the random walk part of the motion remains unchanged. This implies that the generator is of the form

$$\begin{aligned} L f(x,m)= & {} \kappa (f(x+1,m) + f(x-1,m)-2f(x,m))\nonumber \\&\quad + \lambda (f(x+ v(m),m)-f(x,m)) +\gamma Af(x, \cdot )(m), \end{aligned}$$

where A is the generator of the internal state process. Notice that this form of the generator implicitly assumes that the internal state dynamics does not depend on the particle’s position. Moreover, we assume that there is no “global” drift in the active part of the motion, i.e., the average of v(m) over the stationary distribution of the internal state process is assumed to be zero. Note that the active particle with internal state space \(\{-1,1\}\) in the simple setting above fits into this framework by letting v be the identity function on \(\{-1,1\}\).

Our main interest is then in the asymptotic behavior of the position \(X_t\), more precisely we will prove the following:

  1. (1)

    Diffusive scaling limit for \(X_t\), with explicit expressions for the diffusion matrix D, i.e., in the scaling limit

    $$\begin{aligned} \frac{1}{\sqrt{N}} X_{tN}\rightarrow \sqrt{D}W(t), \quad N\rightarrow \infty , \end{aligned}$$

    where W denotes Brownian motion, and where D denotes the diffusion matrix.

  2. (2)

    Large deviations for the position \(X_t, t\ge 0\), i.e., in the sense of large deviations

    $$\begin{aligned} \mathbb {P}\left( \frac{X_t}{t} \approx x\right) \approx e^{-t I(x)} \end{aligned}$$

    with I(x) the large deviation rate function.

We then focus on the question of how the diffusion matrix and the large deviation rate function depend on the internal state process, more precisely on its generator A. We show that both quantities are optimised (i.e., the diffusion matrix is maximal and the rate function is minimal) for reversible internal state processes.

More precisely, when the stationary distribution \(\mu \) of the internal state process is fixed, as well as the reversible part of the dynamics, then we show that the diffusion matrix is maximal and the rate function is minimal for the internal state process for which \(\mu \) is reversible, i.e., when the asymmetric part of the dynamics is zero. Though we do not have a simple intuitive “physics” argument for this result, it corresponds to the general intuition that non-reversible processes converge faster to their stationary state and therefore allow fewer fluctuations, resulting in a smaller rate function (and a larger diffusion constant) in the reversible setting.

2.2 Mathematical Definitions

We consider the position \((X_t,t\ge 0)\) of a particle that moves in continuous time and space (see also Remark 2.1). For now we assume \(X_t\in \mathbb {R}\), but we will generalize to \(\mathbb {R}^d\) later. The particle has the following dynamics.

  1. (a)

    With rate \(2\kappa \) the particle performs a simple symmetric random walk.

  2. (b)

    Independently, with rate \(\lambda \) the particle jumps in a preferred direction indicated by an internal state. If such a jump occurs at time t, the particle jumps from \(X_t\) to \(X_t+v_t^\gamma \).

  3. (c)

    This internal state evolves with ‘rate’ \(\gamma \) according to a stationary Markov process.

Because of the jumps in a preferred direction based on the internal state, we call the particle an active particle.

To make this more precise we make the following definitions. We will assume that the processes in the coming definitions are jointly defined on a probability space \((\Omega ,\mathscr {F},\mathbb {P})\).

  1. (i)

    Random walk part. Let \(Y=(Y_t,t\ge 0)\) be a simple symmetric random walk, i.e. a random walk that starts from the origin (\(Y_0=0\)), jumps with rate 1 and jumps 1 to the left or to the right with equal probability. Fix a constant \(\kappa >0\). Then the random walk part of the process is \(Y_{2\kappa t}\).

  2. (ii)

    Internal state process. Let \((M_t,t\ge 0)\) be a stationary Markov process (independent of the random walk) on a state space \(\mathscr {S}\) with ergodic measure \(\mu \). We will call this process the state process. Since we will always start M from \(\mu \), we can assume without loss of generality that \(\mu \) is the unique ergodic (and hence the unique invariant) measure of M. Denote by \((S_t,t\ge 0)\) and A the corresponding semigroup and Markov generator on \(L^2(\mu )\), respectively, and denote the inner product on \(L^2(\mu )\) by \((\cdot ,\cdot )\) and the corresponding norm by \(\Vert \cdot \Vert \).

  3. (iii)

    Speed function. Let v be an element of \(L^2(\mu )\). We will call v the speed function. For simplicity, we assume that \(\int v\mathrm {d}\mu =0\), meaning that the average of the speed with respect to the stationary measure on the internal state space is 0. This is not essential, though; we will make some remarks on what happens without this assumption. The idea is that \(v:\mathscr {S}\rightarrow \mathbb {R}\) is a mapping that indicates, for each internal state, the jump vector in case of an active jump when the particle has that internal state. In the example in Sect. 2.1, the speed function v was just the identity function on \(\{-1,1\}\). In Sect. 3.2 we will see more examples, for instance where v maps three internal states to three numbers that sum to 0 (in Example 2) or where v is the sine function (in Example 4).

  4. (iv)

    Speed process. Fix a constant \(\gamma >0\). We define \(v^\gamma _t = v(M_{\gamma t})\) and call \((v^\gamma _t,t\ge 0)\) the speed process. Note that this speed process need not be a Markov process. In the special case \(\gamma =1\), we will simply write \(v_t\). Note that \((v^\gamma _t,t\ge 0)\) is the process \((v_t,t \ge 0)\) speeded up by the factor \(\gamma \). We make the following two technical assumptions on the speed process.

    1. (a)

      First we assume that

      $$\begin{aligned} \lim _{t\rightarrow \infty } \int _0^t S_rv\mathrm {d}r \quad \text {exists in } L^2(\mu ). \end{aligned}$$
      (1)

      This implies that the limit \(u:=\int _0^\infty S_tv\mathrm {d}t\) satisfies \(u\in D(A)\) and \(-A u = v\), so we will write \(\int _0^\infty S_tv\mathrm {d}t = -A^{-1}v\). We need this assumption to ensure that the limiting variance is finite. If it does not hold, there may not be a diffusive scaling limit. Sufficient conditions for Assumption (1) are for instance that the spectral gap of A is positive or that there exist \(c,C>0\) such that

      $$\begin{aligned} \Vert S_tv\Vert \le C \mathrm {e}^{-ct}. \end{aligned}$$

      The latter is a condition on the speed of relaxation: it ensures that the internal state process reaches equilibrium fast enough, which avoids large temporal covariances. In any case, Assumption (1) requires that \(S_tv\) decays to 0 fast enough to be integrable.

    2. (b)

      The second assumption is that for all \(t>0\)

      $$\begin{aligned} \lim _{\delta \downarrow 0} \sup _{\begin{array}{c} 0\le s,s'\le t\\ |s-s'|<\delta \end{array}} \mathbb {E}\left[ (v_s-v_{s'})^2\right] = 0. \end{aligned}$$
      (2)

      In other words: the speed process must be uniformly continuous in \(L^2\). This assumption is purely technical; we will use it in Lemma A.1 to show that the integral in (3) is well-defined.

    Both of these assumptions are automatically satisfied in the case that the state space \(\mathscr {S}\) of M is finite. Other internal state processes that satisfy these assumptions (with a suitable choice of v) include diffusion processes such as Brownian motion and the Ornstein–Uhlenbeck processes that we encounter in the examples in Sect. 3.2.

  5. (v)

    Active jumps. Finally, fix a constant \(\lambda >0\) and let \((N_t,t\ge 0)\) be a Poisson process with rate \(\lambda \) (independent of the random walk and the state process). This process marks the times at which the particle jumps in a preferred direction.
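As a concrete check of the assumptions in (iv), consider the two-state chain of Sect. 2.1 with unit switching rate and v(m) = m (an illustrative computation):

```latex
% (Av)(m) = v(-m) - v(m) = -2v(m), so Av = -2v and therefore
S_t v = \mathrm{e}^{tA} v = \mathrm{e}^{-2t} v, \qquad
\Vert S_t v\Vert = \mathrm{e}^{-2t}\Vert v\Vert , \qquad
\int_0^\infty S_t v \,\mathrm{d}t = \tfrac{1}{2}\, v = -A^{-1}v .
```

Hence Assumption (1) holds with \(C=1\), \(c=2\), and Assumption (2) is automatic on a finite state space; in particular \((v,-A^{-1}v)=\tfrac{1}{2}\Vert v\Vert ^2=\tfrac{1}{2}\).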

With these components we can define

$$\begin{aligned} X_t = Y_{2\kappa t} + \int _0^t v_s^\gamma \mathrm {d}N_s, \end{aligned}$$
(3)

where the integral is defined as a limit in \(L^2(\mathbb {P})\) [see Lemma A.1 for how the well-definedness of the integral follows from Assumption (2)]. This expression matches our description above: \(Y_{2\kappa t}\) is the random walk part and, on top of that, whenever the Poisson process N has a jump at time t, say, the number \(v_t^\gamma \) is added to \(X_t\). Note that (3) implies that \(X_0=0\). Also, we can write (3) as

$$\begin{aligned} X_t = Y_{2\kappa t} + \int _0^t v_s^\gamma \mathrm {d}{\overline{N}}_s + \lambda \int _0^t v_s^\gamma \mathrm {d}s, \end{aligned}$$
(4)

where \({{\overline{N}}}_t = N_t-\lambda t\) is a compensated Poisson process. We call the first, second and third term of (4) the random walk part, the martingale part and the active part, respectively. This division will become clearly visible in the diffusion coefficient.

Remark 2.1

Note that if v is integer-valued, \(X_t\) stays in the lattice \(\mathbb {Z}\). In case v is not integer-valued, we can also directly consider a continuous process and define

$$\begin{aligned} X^c_t = B_{2\kappa t} + \lambda \int _0^t v_s^\gamma \mathrm {d}s, \end{aligned}$$
(5)

where \((B_t,t\ge 0)\) is Brownian motion (independent of the state process) and where the speed process is followed continuously in time. As will become clear later, the change to Brownian motion is mostly aesthetic. However, the change from \(\mathrm {d}N_t\) to \(\lambda \mathrm {d}t\) leaves out the martingale part of \(X_t\), which has consequences for both the limiting diffusion coefficient and the large deviations. We will make remarks on this later, after the results concerned.

3 Diffusion Coefficient

A first observation is that the expectation of \(X_t\) is 0. Indeed, using independence of the processes \(v^\gamma _s\) and \(N_s\) and the fact that \(\mathbb {E}v^\gamma _s=0\), we compute

$$\begin{aligned} \mathbb {E}X_t= & {} \mathbb {E}Y_{2\kappa t} + \mathbb {E}\int _0^t v_s^\gamma \mathrm {d}N_s = 0 + \lim _{n\rightarrow \infty } \sum _{i=0}^{n-1} \mathbb {E}\left[ v_{s_i}^\gamma (N_{s_{i+1}}-N_{s_i})\right] \\= & {} \lim _{n\rightarrow \infty } \sum _{i=0}^{n-1} \mathbb {E}v^\gamma _{s_i} \lambda (s_{i+1}-s_i) = 0. \end{aligned}$$

In this section we determine the limiting diffusion coefficient of the active particle and extend this result to active particles in higher dimensions. Then we provide some examples. Finally, we discuss the invariance principle.

3.1 Calculating the Diffusion Coefficient

As a first result, we compute the limiting variance of the position of the active particle.

3.1.1 The 1-Dimensional Case

We start in dimension 1. Recall that \((\cdot ,\cdot )\) denotes the inner product on \(L^2(\mu )\).

Theorem 3.1

The active particle has the following limiting diffusion coefficient

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{\mathrm {Var}(X_t)}{t} = 2\kappa + \lambda \int v^2\mathrm {d}\mu + \frac{2\lambda ^2}{\gamma }(v,-A^{-1}v). \end{aligned}$$
(6)

Proof

First of all, note that the random walk part of \(X_t\) is independent of the rest. Second, note that using Lemma A.1 and the independence of \(v^\gamma \) and \({\overline{N}}\),

$$\begin{aligned} \mathrm {Cov}\left( \int _0^t v_s^\gamma \mathrm {d}{\overline{N}}_s,\lambda \int _0^t v_s^\gamma \mathrm {d}s\right)= & {} \lim _{n\rightarrow \infty } \sum _{i,j=0}^{n-1} \mathrm {Cov}\left( v^\gamma _{s_i}({\overline{N}}_{s_{i+1}}-{\overline{N}}_{s_i}),\lambda v^\gamma _{s_j} (s_{j+1}-s_j)\right) \\= & {} \lim _{n\rightarrow \infty } \sum _{i,j=0}^{n-1} \lambda (s_{j+1}-s_j) \mathrm {Cov}\left( v^\gamma _{s_i},v^\gamma _{s_j}\right) \mathbb {E}\left[ {\overline{N}}_{s_{i+1}}-{\overline{N}}_{s_i}\right] = 0. \end{aligned}$$

This implies that

$$\begin{aligned} \mathrm {Var}(X_t) = \mathrm {Var}(Y_{2\kappa t})+ \mathrm {Var}\left( \int _0^t v_s^\gamma \mathrm {d}{\overline{N}}_s\right) + \mathrm {Var}\left( \lambda \int _0^t v_s^\gamma \mathrm {d}s\right) . \end{aligned}$$

In other words, each of the parts of \(X_t\) in (4) has its own contribution to the variance of \(X_t\) and hence to the limiting diffusion coefficient. Similar to before, we will refer to these as the random walk part, the martingale part and the active part of the diffusion coefficient. We will now calculate these contributions.

First, \(Y_{2\kappa t}\) is the difference of two independent Poisson random variables with rate \(\kappa t\). Therefore

$$\begin{aligned} \lim _{t\rightarrow \infty } \frac{\mathrm {Var}(Y_{2\kappa t})}{t} = \lim _{t\rightarrow \infty } \frac{\kappa t+ \kappa t}{t} = 2\kappa . \end{aligned}$$
(7)

Second, using Lemma A.1, the independence of \(v^\gamma \) and \({\overline{N}}\) and the fact that \(\mathbb {E}v^\gamma _s = \mathbb {E}\left[ {\overline{N}}_{s_{i+1}}-{\overline{N}}_{s_i}\right] = 0\), we see

$$\begin{aligned} \mathrm {Var}\left( \int _0^t v_s^\gamma \mathrm {d}{\overline{N}}_s\right)= & {} \lim _{n\rightarrow \infty } \sum _{i,j=0}^{n-1} \mathrm {Cov}\left( v^\gamma _{s_i} ({\overline{N}}_{s_{i+1}}-{\overline{N}}_{s_i}), v^\gamma _{s_j} ({\overline{N}}_{s_{j+1}}-{\overline{N}}_{s_j})\right) \\= & {} \lim _{n\rightarrow \infty } \sum _{i=0}^{n-1} \mathrm {Var}\left( v^\gamma _{s_i} ({\overline{N}}_{s_{i+1}}-{\overline{N}}_{s_i})\right) = \lim _{n\rightarrow \infty } \sum _{i=0}^{n-1} \mathrm {Var}\left( v^\gamma _{s_i}\right) \mathrm {Var}\left( {\overline{N}}_{s_{i+1}}-{\overline{N}}_{s_i}\right) \\= & {} \lim _{n\rightarrow \infty } \sum _{i=0}^{n-1} \int v^2 \mathrm {d}\mu \, \lambda (s_{i+1}-s_i) = \lambda t \int v^2 \mathrm {d}\mu . \end{aligned}$$

Therefore

$$\begin{aligned} \lim _{t\rightarrow \infty } \frac{\mathrm {Var}\left( \int _0^t v_s^\gamma \mathrm {d}{\overline{N}}_s\right) }{t} = \lim _{t\rightarrow \infty } \frac{\lambda t \int v^2 \mathrm {d}\mu }{t} = \lambda \int v^2 \mathrm {d}\mu . \end{aligned}$$
(8)

For the third part we calculate the limiting variance of an additive functional of a Markov process. This formula was already obtained, for instance, in [15, Corollary 1.9] and [16, Lemma 2.4] (for reversible Markov processes). In fact, it is an instance of the Green–Kubo relations, which go back to [17] and [18]. For completeness, we provide the calculations for our specific context here. Using the stationarity of \(v^\gamma \) and the symmetry of covariance, we compute

$$\begin{aligned} \mathrm {Var}\left( \int _0^t v_s^\gamma \mathrm {d}s\right)= & {} \int _0^t \int _0^t \mathrm {Cov}(v^\gamma _s,v^\gamma _r)\mathrm {d}r \mathrm {d}s = 2 \int _0^t \int _0^s \mathrm {Cov}(v^\gamma _s,v^\gamma _r)\mathrm {d}r \mathrm {d}s \nonumber \\= & {} 2 \int _0^t \int _0^s \mathrm {Cov}(v^\gamma _{s-r},v^\gamma _0)\mathrm {d}r \mathrm {d}s \nonumber \\= & {} 2 \int _0^t \int _0^s \mathrm {Cov}(v^\gamma _r,v^\gamma _0)\mathrm {d}r \mathrm {d}s = 2 \int _0^t \int _r^t \mathrm {Cov}(v^\gamma _r,v^\gamma _0)\mathrm {d}s \mathrm {d}r \nonumber \\= & {} 2 \int _0^t (t-r) \mathrm {Cov}(v^\gamma _r,v^\gamma _0) \mathrm {d}r\nonumber \\= & {} \frac{2}{\gamma } \int _0^{\gamma t}(t-r)\mathrm {Cov}(v(M_{r}),v(M_0))\mathrm {d}r = \frac{2}{\gamma } \int _0^{\gamma t}(t-r)(v,S_rv)\mathrm {d}r.\nonumber \\ \end{aligned}$$
(9)

To compute this, first note that with Assumption (1) we see that

$$\begin{aligned} \lim _{t\rightarrow \infty } \int _0^{t} (v,S_rv)\mathrm {d}r = \left( v,\lim _{t\rightarrow \infty } \int _0^t S_r v\mathrm {d}r\right) = \left( v,-A^{-1}v\right) . \end{aligned}$$
(10)

Note that the convergence of \(\int _0^t (v,S_rv)\mathrm {d}r\) also implies that

$$\begin{aligned} \lim _{t\rightarrow \infty } \int _0^{t}\frac{r}{t} (v,S_rv)\mathrm {d}r = 0. \end{aligned}$$
(11)

Combining (9), (10) and (11), we obtain

$$\begin{aligned} \lim _{t\rightarrow \infty } \frac{\mathrm {Var}\left( \lambda \int _0^t v_s^\gamma \mathrm {d}s\right) }{t}= & {} \lim _{t\rightarrow \infty } \frac{2\lambda ^2}{\gamma } \int _0^{\gamma t}(v,S_rv)\mathrm {d}r - \lim _{t\rightarrow \infty }2\lambda ^2 \int _0^{\gamma t} \frac{r}{\gamma t} (v,S_rv)\mathrm {d}r \nonumber \\= & {} \frac{2\lambda ^2}{\gamma }(v,-A^{-1}v). \end{aligned}$$
(12)

Now combining (7), (8) and (12), we obtain the result. \(\square \)
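As a numerical sanity check on Theorem 3.1 (a sketch with illustrative parameter values, not part of the formal development): for the two-state example of Sect. 2.1 with unit switching rate one has \((v,-A^{-1}v)=1/2\), so (6) predicts a limiting diffusion coefficient \(2\kappa +\lambda +\lambda ^2/\gamma \). A Monte Carlo estimate of \(\mathrm {Var}(X_t)/t\) should approach this value for large t.

```python
import random

def sample_X(t_max, kappa, lam, gamma, rng):
    """One trajectory of the two-state active particle, returning X_{t_max}.

    Random walk jumps at rate 2*kappa, active jumps of size m at rate lam,
    flips of the internal state m at rate gamma; m starts stationary.
    """
    x, m, t = 0, rng.choice((-1, 1)), 0.0
    total = 2 * kappa + lam + gamma
    while True:
        t += rng.expovariate(total)
        if t > t_max:
            return x
        u = rng.uniform(0.0, total)
        if u < 2 * kappa:
            x += 1 if u < kappa else -1
        elif u < 2 * kappa + lam:
            x += m
        else:
            m = -m

kappa, lam, gamma, t_max, n = 1.0, 1.0, 1.0, 20.0, 4000
rng = random.Random(1)
xs = [sample_X(t_max, kappa, lam, gamma, rng) for _ in range(n)]
mean = sum(xs) / n
var_per_t = sum((x - mean) ** 2 for x in xs) / n / t_max
predicted = 2 * kappa + lam + lam ** 2 / gamma   # formula (6) with (v,-A^{-1}v) = 1/2
```

At \(t=20\) the estimate carries a finite-time bias of order 1/t and Monte Carlo noise of a few percent, so it should land close to, but not exactly at, the predicted value.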

3.1.2 Higher Dimensions

So far we considered an active particle that only moves in one dimension. However, we can just as well treat the higher dimensional situation. To this end fix a dimension \(d\in \mathbb {N}\). Let Y be a d-dimensional simple random walk, i.e. each component of Y is an independent copy of the Y that we had in the 1-dimensional situation. Let the speed function v be an element of \(L^2((\mathscr {S},\mu ),\mathbb {R}^d)\) such that \(\int v\mathrm {d}\mu =0\) (in \(\mathbb {R}^d\)). We denote by \(\Sigma \) the covariance matrix of v under \(\mu \), i.e.

$$\begin{aligned} \Sigma _{ij} = \mathrm {Cov}(v(M_0)_i,v(M_0)_j). \end{aligned}$$

Let again \(X_t\) denote the position of the active particle, now in \(\mathbb {R}^d\), with random walk part Y and speed function v. The internal state process remains the same as in the 1-dimensional case. To find the limiting diffusion matrix of the active particle, we can show that, similar to the 1-dimensional case,

$$\begin{aligned} \mathrm {Cov}((X_t)_i,(X_t)_j)= & {} \mathrm {Cov}((Y_{2\kappa t})_i,(Y_{2\kappa t})_j) \\&+ \mathrm {Cov}\left( \int _0^t(v_s^\gamma )_i\mathrm {d}{{\overline{N}}}_s,\int _0^t(v_s^\gamma )_j\mathrm {d}{{\overline{N}}}_s\right) + \mathrm {Cov}\left( \int _0^t(v_s^\gamma )_i\mathrm {d}s,\int _0^t(v_s^\gamma )_j\mathrm {d}s\right) . \end{aligned}$$

Now if we go through calculations that are very similar to the 1-dimensional case, we obtain the following.

Theorem 3.2

Let \(X_t\) be the position in \(\mathbb {R}^d\) of the active particle that we just defined. Then

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{\mathrm {Cov}((X_{t})_i,(X_{t})_j)}{t}= & {} 2\kappa \delta _{i,j} + \lambda \Sigma _{ij}\nonumber \\&\quad + \frac{\lambda ^2}{\gamma } [((v)_i,-A^{-1}(v)_j)+((v)_j,-A^{-1}(v)_i)]. \end{aligned}$$
(13)

Remark 3.3

The sum of inner products in (13) equals \(2((v)_i,-\mathrm {sym}(A^{-1})(v)_j)\) [where for an operator B on \(L^2(\mu )\), \(\mathrm {sym}(B) = (B+B^*)/2\) is the symmetric part]. Note that the 1-dimensional case can be retrieved from this by realising that for any operator B and function w, \((w,Bw) = (w,\mathrm {sym}(B)w)\).
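To illustrate (13) and the symmetrisation in Remark 3.3, consider a hypothetical example (not one of the examples treated below): a non-reversible cycle chain on three states with uniform stationary measure and planar speed vectors at angles \(0, 2\pi /3, 4\pi /3\) (which sum to zero). A NumPy sketch assembling the diffusion matrix:

```python
import numpy as np

# Hypothetical 3-state cycle chain (non-reversible): jump i -> i+1 mod 3 at rate 1.
A = np.array([[-1.0, 1.0, 0.0],
              [0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0]])
mu = np.full(3, 1.0 / 3.0)                 # uniform stationary measure
ang = 2.0 * np.pi * np.arange(3) / 3.0
v = np.stack([np.cos(ang), np.sin(ang)])   # v[i] = i-th component of the speed vectors
kappa, lam, gamma = 1.0, 1.0, 1.0          # illustrative rates

def inv_minus_A(f):
    """Solve -A u = f; f has mean zero, so a solution exists (up to constants)."""
    return np.linalg.lstsq(-A, f, rcond=None)[0]

D = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        Sigma_ij = np.sum(mu * v[i] * v[j])        # covariance of v_i, v_j under mu
        cross = (np.sum(mu * v[i] * inv_minus_A(v[j]))
                 + np.sum(mu * v[j] * inv_minus_A(v[i])))
        D[i, j] = 2 * kappa * (i == j) + lam * Sigma_ij + lam ** 2 / gamma * cross
```

With these choices the off-diagonal inner products cancel under the symmetrisation, and the computation yields \(D = 3 I_2\).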

3.1.3 Interpretation

We now briefly discuss the various terms appearing in the RHS of (6). First of all, as is clear directly from the definition of the process, the random walk part is independent of the rest and therefore produces the term \(2\kappa \).

Now, to understand the other two terms, let us first consider what happens in the limit as \(\gamma \) tends to infinity. In that case the state process is speeded up so much that it reaches equilibrium between subsequent jumps of the N-process. Therefore the jump sizes are just independent copies of \(v(M_0)\) (i.e. v under the stationary measure \(\mu \)), so the process is simply a random walk with jump rate \(\lambda \) and jump size distribution \(v(M_0)\). In this case the diffusion coefficient should be \(\lambda \mathrm {Var}(v(M_0))=\lambda \int v^2\mathrm {d}\mu \), which is indeed what we find when we let \(\gamma \) go to infinity in (6).

Finally, the third term of (6) corresponds to the case where \(\gamma \) is finite. Therefore this term comes from the dependence between the active jumps due to the temporal dependence in the state process. Hence this term comes from the activity of the particle. These considerations justify the name ‘active part’ for the third part of (6). This is the only part that depends on the state process through more than just its stationary distribution. We will analyse this term more thoroughly in Sect. 4.

Remark 3.4

Note that for \(X^c\) (see Remark 2.1), the random walk part of \(X^c_t\) has variance \(2\kappa t\), the martingale part is left out and the active part is the same as in X, so we obtain

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{\mathrm {Var}(X^c_t)}{t} = 2\kappa + \frac{2\lambda ^2}{\gamma }(v,-A^{-1}v). \end{aligned}$$

Remark 3.5

Note that instead of writing \((v,-A^{-1}v)\), we could also have kept the covariance in the expression in (9) to obtain in a similar way that the active part of the limiting diffusion coefficient equals

$$\begin{aligned} \frac{2\lambda ^2}{\gamma } \int _0^\infty \mathrm {Cov}(v_0,v_r)\mathrm {d}r. \end{aligned}$$

This might be easier to calculate for processes whose covariance function is explicitly known.
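For instance, for the two-state chain with unit switching rate one has \(\mathrm {Cov}(v_0,v_r)=\mathrm {e}^{-2r}\), and the integral can be evaluated numerically; a minimal sketch (with illustrative rates):

```python
import numpy as np

lam, gamma = 1.0, 1.0                  # illustrative rates
r = np.linspace(0.0, 20.0, 200001)     # truncate the integral at r = 20
cov = np.exp(-2.0 * r)                 # Cov(v_0, v_r) for the two-state chain
dr = r[1] - r[0]
# Trapezoidal rule for int_0^infty Cov(v_0, v_r) dr, which equals (v, -A^{-1} v) = 1/2.
integral = float(dr * (cov.sum() - 0.5 * (cov[0] + cov[-1])))
active_part = 2.0 * lam ** 2 / gamma * integral
```

The result agrees with the value \((v,-A^{-1}v)=1/2\) obtained via the generator.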

Remark 3.6

The assumption that \(\int v\mathrm {d}\mu =0\) makes sure that \(\mathbb {E}X_t=0\). Considering a speed function that does not have average 0 is equivalent to setting the speed function to be \(v+c\), where c is a constant and v still satisfies \(\int v\mathrm {d}\mu =0\). In this case the expectation equals \(\mathbb {E}X_t = c\lambda t\). Of course the random walk part is not affected by this choice. Now it is easy to see, following our calculations above, that with the new speed function the expectation of the martingale part remains the same, but the variance changes. Conversely, the expectation of the active part changes, but the variance stays the same (since the change is deterministic). Overall, the limiting diffusion coefficient becomes:

$$\begin{aligned}&\lim _{t\rightarrow \infty }\frac{\mathrm {Var}(X_t)}{t} \\&\quad = 2\kappa + \lambda \left( \int v^2\mathrm {d}\mu +c^2\right) + \frac{2\lambda ^2}{\gamma }(v,-A^{-1}v). \end{aligned}$$

3.2 Examples

Now we give some examples. We start with two cases where the state process M is a Markov chain with 2 or 3 states. Then we take M to be an Ornstein–Uhlenbeck process and Brownian motion on a circle and finally we consider an Ornstein–Uhlenbeck process in \(\mathbb {R}^2\).

First, in these examples we need to calculate \((v,-A^{-1}v)\) [cf. (6)]. Now write \(u=-A^{-1}v\) and recall that this means \(u=\int _0^\infty S_t v \mathrm {d}t\), which implies \(-Au=v\). In order to compute \((v,-A^{-1}v)\), we can proceed as follows. First we find a function w such that \(-Aw=v\). Then \((v,w)=(v,-A^{-1}v)\). Indeed, since \(\mu \) is the unique ergodic measure, the only \(h\in D(A)\) with \(Ah=0\) are constant functions, so if \(-Au=-Aw\), u and w only differ by a constant. Therefore \((v,w) = (v,u+c\mathbb {1}) =(v,-A^{-1}v)+c \int v\mathrm {d}\mu = (v,-A^{-1}v)\).
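On a finite state space this recipe reduces to linear algebra and is easy to check numerically. The following sketch (our own illustration in Python/NumPy, not from the paper; the helper name `active_inner` is ours) solves \(-Aw=v\) with a least-squares solver, which tolerates the singularity of A, and evaluates the inner product in \(L^2(\mu )\); as a test case it recovers the value 1/2 for the two-state switching chain of Example 1 below.

```python
import numpy as np

def active_inner(A, mu, v):
    # Solve -A w = v by least squares; A is singular (A 1 = 0), but the
    # constant ambiguity in w drops out of (v, w) because v has mu-mean zero.
    w = np.linalg.lstsq(-A, v, rcond=None)[0]
    return float(np.sum(mu * v * w))

# two-state switching chain (cf. Example 1 below), v = (1, -1)
A = np.array([[-1.0, 1.0], [1.0, -1.0]])
mu = np.array([0.5, 0.5])
v = np.array([1.0, -1.0])
print(active_inner(A, mu, v))   # 0.5
```

Because v has \(\mu \)-mean zero, the undetermined additive constant in w does not affect the inner product, so any solution returned by the solver works.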

Second, for all of the examples, we need to verify Assumptions (1) and (2). In Examples 1 and 2, the state space is finite so both assumptions always hold. In Examples 3, 4 and 5, Assumption (2) can be verified by a direct computation, since the correlation functions for Brownian motion and the Ornstein–Uhlenbeck process are explicitly known. As we noted before, for Assumption (1), it suffices to find constants \(c,C>0\) such that \(\Vert S_tv\Vert \le C\exp (-ct)\). This is implied by the Poincaré inequality (see [19, Theorem 2.18]). The Poincaré inequality for the Ornstein–Uhlenbeck process is proved in [19, Lemma 2.22, Theorem 2.25] (and holds similarly in the higher dimensional case). By [19, Remark 2.19], the Poincaré inequality for Brownian motion with drift on the circle follows from the Poincaré inequality for driftless Brownian motion on the circle. The exponential ergodicity (and the corresponding Poincaré inequality) in this case is known and can be shown using Fourier analysis.

Example 1

(2 States) We start with the case where M is a Markov chain on \(\mathscr {S}=\{ 1,-1\}\) where the state switches with rate 1 and v is the identity function \([1,-1]^T\). Then \(\mu =(\delta _{-1}+\delta _1)/2\),

$$\begin{aligned} A = \begin{bmatrix} -1 &{} 1 \\ 1 &{} -1 \end{bmatrix} \end{aligned}$$

and indeed \(\int v\mathrm {d}\mu =0\). Now choose \(w=[1,0]^T\), then \(-Aw=v\). So \((v,-A^{-1}v)=(v,w)=\tfrac{1}{2}(1\cdot 1)+ \tfrac{1}{2}((-1)\cdot 0) = 1/2\). Also we compute \(\int v^2\mathrm {d}\mu = \int 1 \mathrm {d}\mu = 1\). Now applying Theorem 3.1 yields

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{\mathrm {Var}(X_{t})}{t} = 2\kappa + \lambda \int v^2\mathrm {d}\mu + \frac{2\lambda ^2}{\gamma } (v,w) = 2\kappa + \lambda + \frac{\lambda ^2}{\gamma }. \end{aligned}$$

Note that the same diffusion coefficient is found in the calculation in Sect. 6.
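The two-state result can also be checked by direct simulation. The sketch below (our own illustration, not from the paper) samples the three independent ingredients for \(\kappa =\lambda =\gamma =1\): the random walk part as a difference of two Poisson variables, the flip times of the state chain, and the active jumps at the points of N. The empirical \(\mathrm {Var}(X_t)/t\) should then be close to \(2\kappa +\lambda +\lambda ^2/\gamma =4\), up to a small finite-t correction and Monte Carlo noise.

```python
import numpy as np

rng = np.random.default_rng(0)
kappa, lam, gamma, T = 1.0, 1.0, 1.0, 20.0
n_samples = 20000

def sample_X(rng):
    # flip times of the two-state chain (speeded up by gamma) on [0, T]
    gaps = rng.exponential(1.0 / gamma, size=int(8 * gamma * T) + 20)
    flips = np.cumsum(gaps)
    flips = flips[flips < T]
    m0 = rng.choice([1, -1])
    # active jumps: Poisson(lam*T) many, at uniform times, with size v(M) = M
    jump_times = rng.uniform(0.0, T, rng.poisson(lam * T))
    signs = m0 * (-1) ** np.searchsorted(flips, jump_times)
    # symmetric random walk part: difference of two Poisson(kappa*T) variables
    rw = rng.poisson(kappa * T) - rng.poisson(kappa * T)
    return rw + signs.sum()

X = np.array([sample_X(rng) for _ in range(n_samples)])
print(X.var() / T)   # should be close to 4
```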

Example 2

(3 States) Now let M be a Markov chain on the triangle with nodes \(\mathscr {S}=\{n_1,n_2,n_3\}\), where the state switches with rate 1 and jumps to the right with probability \(1/2+a\) and to the left otherwise (where \(|a|\le 1/2\)). Here \(\mu =(\delta _{n_1}+\delta _{n_2}+\delta _{n_3})/3\), \(v=[v_1,v_2,v_3]^T\) such that \(v_1+v_2+v_3=0\) and

$$\begin{aligned} A=\begin{bmatrix} -1 &{} \frac{1}{2}+a &{} \frac{1}{2}-a \\ \frac{1}{2}-a &{} -1 &{} \frac{1}{2}+a \\ \frac{1}{2}+a &{} \frac{1}{2}-a &{} -1 \end{bmatrix}. \end{aligned}$$

To find w we solve the linear system \(-Aw=v\), which yields

$$\begin{aligned} w = \left[ \frac{v_1+(a+1/2)v_2}{3/4 + a^2},\frac{(1/2-a)v_1+v_2}{3/4 + a^2},0\right] ^T. \end{aligned}$$

This gives

$$\begin{aligned} (v,w)= & {} \frac{v_1^2+v_1v_2+v_2^2}{3(3/4 + a^2)} = \frac{v_1^2+v_1v_2+v_2^2+v_3(v_1+v_2+v_3)}{3(3/4 + a^2)}\nonumber \\= & {} \frac{(v_1+v_2+v_3)^2 - (v_1v_2+v_2v_3+v_1v_3)}{9/4 + 3a^2} =-\frac{v_1v_2+v_2v_3+v_1v_3}{9/4 + 3a^2}, \end{aligned}$$
(14)

where we used in the last step that \(v_1+v_2+v_3=0\). Also we compute \(\int v^2\mathrm {d}\mu = (v_1^2+v_2^2+v_3^2)/3\). Now applying Theorem 3.1 yields

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{\mathrm {Var}(X_{t})}{t}= & {} 2\kappa + \lambda \int v^2\mathrm {d}\mu + \frac{2\lambda ^2}{\gamma } (v,w) \\= & {} 2\kappa + \frac{\lambda }{3}(v_1^2+v_2^2+v_3^2) + \frac{2\lambda ^2}{\gamma }\frac{(-v_1v_2-v_2v_3-v_1v_3)}{9/4 + 3a^2}. \end{aligned}$$
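Formula (14) can be verified numerically by solving the linear system for random choices of a and v; a minimal NumPy sketch (ours, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.uniform(-0.5, 0.5)
A = np.array([[-1.0, 0.5 + a, 0.5 - a],
              [0.5 - a, -1.0, 0.5 + a],
              [0.5 + a, 0.5 - a, -1.0]])
v = rng.standard_normal(3)
v = v - v.mean()                              # enforce v1 + v2 + v3 = 0
w = np.linalg.lstsq(-A, v, rcond=None)[0]     # one solution of -A w = v
inner = np.mean(v * w)                        # (v, w) with mu uniform
formula = -(v[0]*v[1] + v[1]*v[2] + v[0]*v[2]) / (9/4 + 3 * a**2)
print(inner, formula)                         # the two values agree
```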

Example 3

(Ornstein–Uhlenbeck process) Now let us consider a different kind of example where M is a continuous process, namely an Ornstein–Uhlenbeck process satisfying

$$\begin{aligned} \mathrm {d}M_t = -\theta M_t \mathrm {d}t + \sigma \mathrm {d}B_t, \end{aligned}$$

where \(B_t\) is a Brownian motion independent of everything else (note that a similar process is studied in [20]). This process has stationary distribution \(\mu \sim N(0,\sigma ^2/(2\theta ))\). We take \(v(x)=x\) (indeed \(\int x \mathrm {d}\mu = 0\)). We know that the generator equals

$$\begin{aligned} A = -\theta x \frac{\mathrm {d}}{\mathrm {d}x} + \frac{\sigma ^2}{2} \frac{\mathrm {d}^2}{\mathrm {d}x^2} \end{aligned}$$

and has as domain D(A) all functions in \(L^2(\mu )\) of which the first and second (weak) derivative are also in \(L^2(\mu )\). A quick inspection shows that if we set \(w(x)=x/\theta \), then \(w\in D(A)\) and \(-Aw=v\). Now we compute \((v,w) = \int x^2/\theta \mathrm {d}\mu = \sigma ^2/(2\theta ^2)\). Also \(\int v^2 \mathrm {d}\mu = \int x^2\mathrm {d}\mu = \sigma ^2/(2\theta )\). Now Theorem 3.1 gives us

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{\mathrm {Var}(X_{t})}{t} = 2\kappa + \lambda \int v^2\mathrm {d}\mu + \frac{2\lambda ^2}{\gamma } (v,w) = 2\kappa + \frac{\lambda \sigma ^2}{2\theta } + \frac{\lambda ^2}{\gamma }\frac{\sigma ^2}{\theta ^2}. \end{aligned}$$

Note that the constant \((v,w)=\sigma ^2/(2\theta ^2)\) could also have been directly obtained by calculating

$$\begin{aligned} \mathrm {Var}\left( \lambda \int _0^t v^\gamma _s \mathrm {d}s\right) = \lambda ^2 \int _0^t \int _0^t \mathrm {Cov}(v^\gamma _s,v^\gamma _r)\mathrm {d}s\mathrm {d}r \end{aligned}$$
(15)

followed by rescaling and taking limits, since the covariance of the Ornstein–Uhlenbeck process is explicitly known. This yields the same result. Alternatively, one could have used the expression in Remark 3.5 to see

$$\begin{aligned} (v,-A^{-1}v) = \int _0^\infty \mathrm {Cov}(v_0,v_t)\mathrm {d}t = \int _0^\infty \frac{\sigma ^2}{2\theta }\exp (-\theta t) \mathrm {d}t = \frac{\sigma ^2}{2 \theta ^2}. \end{aligned}$$
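The covariance route of (15) also lends itself to a quick Monte Carlo check (our sketch, with \(\theta =\sigma =1\)): simulating stationary Ornstein–Uhlenbeck paths via the exact one-step transition and accumulating \(\int _0^T M_s\mathrm {d}s\), one should find \(\mathrm {Var}\big (\int _0^T M_s\mathrm {d}s\big )/T \approx 2(v,-A^{-1}v) = \sigma ^2/\theta ^2 = 1\) for large T.

```python
import numpy as np

rng = np.random.default_rng(2)
theta, sigma = 1.0, 1.0
T, dt, n_paths = 40.0, 0.01, 5000
n_steps = int(T / dt)

# exact OU transition over one time step of length dt
decay = np.exp(-theta * dt)
noise_sd = np.sqrt(sigma**2 / (2 * theta) * (1 - decay**2))

M = rng.normal(0.0, np.sqrt(sigma**2 / (2 * theta)), n_paths)  # stationary start
integral = np.zeros(n_paths)
for _ in range(n_steps):
    integral += M * dt
    M = decay * M + noise_sd * rng.normal(size=n_paths)

print(integral.var() / T)   # close to sigma^2/theta^2 = 1 for large T
```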

Example 4

(Sine of Brownian motion with drift) In this example we want the speed process \(v_t\) to be \(\sin (M_t)\) where \(M_t=B_{2at} + bt\), \((B_t,t\ge 0)\) is Brownian motion and \(a,b>0\) are constants. However, such an M does not have a stationary (probability) distribution. Therefore we take M to be \(B_{2at} + bt\) on a circle \(\mathscr {S}\) with radius 1 and we set \(v(\theta )=\sin (\theta )\). Now \(\mu = \frac{1}{2\pi }\mathrm {d}\theta \), so indeed \(\int v \mathrm {d}\mu =0\). The generator is given by

$$\begin{aligned} A = a \frac{\mathrm {d}^2}{\mathrm {d}\theta ^2} + b \frac{\mathrm {d}}{\mathrm {d}\theta } \end{aligned}$$

with domain D(A) containing all smooth functions on \(\mathscr {S}\). Substituting \(w(\theta ) = c \sin (\theta ) + d \cos (\theta )\) and solving for c and d shows that

$$\begin{aligned} w(\theta ) = \frac{a}{a^2+b^2}\sin (\theta ) + \frac{b}{a^2+b^2}\cos (\theta ) \end{aligned}$$

satisfies \(-Aw=v\) with \(w\in D(A)\). Now we calculate and see that \(\int v^2\mathrm {d}\mu = \frac{1}{2\pi } \int _0^{2\pi } \sin ^2(\theta )\mathrm {d}\theta =1/2\) and

$$\begin{aligned} (v,w)=\frac{1}{2\pi } \int _0^{2\pi } \sin (\theta )\left( \frac{a}{a^2+b^2}\sin (\theta ) + \frac{b}{a^2+b^2}\cos (\theta )\right) \mathrm {d}\theta = \frac{a}{2(a^2+b^2)}, \end{aligned}$$

so applying Theorem 3.1, we see:

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{\mathrm {Var}(X_{t})}{t} = 2\kappa + \lambda \int v^2\mathrm {d}\mu + \frac{2\lambda ^2}{\gamma } (v,w) = 2\kappa + \frac{\lambda }{2} + \frac{2\lambda ^2}{\gamma }\frac{a}{a^2+b^2}. \end{aligned}$$
(16)

First of all, note that the last term vanishes when either a or b goes to infinity, similar to what happens when \(\gamma \) goes to infinity (see the considerations at the end of Sect. 3). However, this part also vanishes when a goes to 0, even when \(b>0\). Indeed, when \(a=0\), the speed process is \(\sin (M_0+b t)\), where \(M_0\) is sampled from \(\mu \). Now it is easy to see that \(\int _0^t v_s^\gamma \mathrm {d}s\) is bounded in t, so \(\mathrm {Var}\left( \int _0^t v_s^\gamma \mathrm {d}s\right) /t\) goes to 0. In that sense the particle is not active in the limit.
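The value \((v,w)=a/(2(a^2+b^2))\) can be cross-checked by discretizing the generator on a periodic grid and solving \(-Aw=v\) numerically; the sketch below (ours; the grid size and parameter values are arbitrary) uses central finite differences.

```python
import numpy as np

n = 200
h = 2 * np.pi / n
theta = np.arange(n) * h
a, b = 1.3, 0.7

# periodic central-difference discretization of A = a d^2/dtheta^2 + b d/dtheta
I = np.eye(n)
up = np.roll(I, 1, axis=1)    # (up @ f)[j] = f[(j+1) mod n]
dn = np.roll(I, -1, axis=1)   # (dn @ f)[j] = f[(j-1) mod n]
A = a * (up - 2 * I + dn) / h**2 + b * (up - dn) / (2 * h)

v = np.sin(theta)
w = np.linalg.lstsq(-A, v, rcond=None)[0]
inner = np.mean(v * w)        # (v, w) under the uniform measure
print(inner, a / (2 * (a**2 + b**2)))   # agree up to O(h^2)
```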

Example 5

As example for the higher dimensional case, we take M to be the two-dimensional stationary Ornstein–Uhlenbeck process given by

$$\begin{aligned} \mathrm {d}M_t = -\Theta M_t\mathrm {d}t + \sigma \mathrm {d}W_t, \end{aligned}$$

where \(W_t\) is a two-dimensional Brownian motion,

$$\begin{aligned} \Theta =\begin{bmatrix} 1 &{} a \\ -a &{} 1 \end{bmatrix} \end{aligned}$$

and \(\sigma ,a>0\) are constants. The invariant distribution is \(N(0,(\sigma ^2/2) I)\). We set v to be the identity function. The corresponding generator is

$$\begin{aligned} Af = - (\nabla f)^T \Theta x + \frac{\sigma ^2}{2} \Delta f. \end{aligned}$$

First we see that \(\Sigma = \frac{\sigma ^2}{2}I\). Now set

$$\begin{aligned} u_1(x) = \frac{1}{1+a^2}(x_1-ax_2), \quad u_2(x) = \frac{1}{1+a^2}(ax_1+x_2), \end{aligned}$$

then \(-Au_1(x) = x_1=(v)_1(x)\) and \(-Au_2(x)=x_2=(v)_2(x)\). Using these we obtain

$$\begin{aligned} ((v)_1,-A^{-1}(v)_1)+((v)_1,-A^{-1}(v)_1) = 2 \frac{1}{1+a^2} (x_1,x_1-ax_2) = \frac{2}{1+a^2}(x_1,x_1) = \frac{\sigma ^2}{1+a^2}. \end{aligned}$$

Also

$$\begin{aligned} ((v)_1,-A^{-1}(v)_2)+((v)_2,-A^{-1}(v)_1)= & {} \frac{1}{1+a^2}\left( (x_1,ax_1+x_2) + (x_2,x_1-ax_2)\right) \\= & {} \frac{a(x_1,x_1)-a(x_2,x_2)}{1+a^2} = 0. \end{aligned}$$

Here we used that under \(\mu \), \((x_1,x_1)=(x_2,x_2)=\sigma ^2/2\) and \((x_1,x_2)=(x_2,x_1)=0\).

Applying Theorem 3.2, we see that the limiting diffusion matrix equals

$$\begin{aligned} \left( 2\kappa + \lambda \frac{\sigma ^2}{2}+ \frac{\lambda ^2}{\gamma }\frac{\sigma ^2}{1+a^2}\right) I. \end{aligned}$$
(17)
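As a numerical sanity check on the active part of (17) (our sketch, using Euler–Maruyama with \(a=\sigma =1\)), the per-component quantity \(\mathrm {Var}\big (\int _0^T (M_s)_i\mathrm {d}s\big )/T\) should approach \(2((v)_i,-A^{-1}(v)_i) = \sigma ^2/(1+a^2)\):

```python
import numpy as np

rng = np.random.default_rng(3)
a, sigma = 1.0, 1.0
Theta = np.array([[1.0, a], [-a, 1.0]])
T, dt, n_paths = 20.0, 0.005, 4000
n_steps = int(T / dt)

# stationary start: N(0, (sigma^2/2) I)
M = rng.normal(0.0, sigma / np.sqrt(2.0), size=(n_paths, 2))
integral = np.zeros((n_paths, 2))
for _ in range(n_steps):
    integral += M * dt
    M = M - (M @ Theta.T) * dt + sigma * np.sqrt(dt) * rng.normal(size=(n_paths, 2))

print(integral.var(axis=0) / T)   # each component close to sigma^2/(1+a^2) = 0.5
```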

3.3 Invariance Principle

So far we have calculated the limiting diffusion coefficient of the active particle. In many cases one can in fact show a Central Limit Theorem (CLT) for (the trajectory of) the active particle. This type of problem has been dealt with in great generality under several sets of assumptions before, so we will not provide all the details.

As we noted before, the active particle process decomposes naturally into three parts. First of all, there is the random walk part, which is independent of the rest. The CLT for this case is well known.

Then there is the martingale part

$$\begin{aligned} \int _0^t v_s^\gamma \mathrm {d}{{\overline{N}}}_s. \end{aligned}$$

As the name suggests, this term is actually a martingale with respect to the filtration \(\mathscr {F}_t=\sigma \{(M_{\gamma s},N_s),0\le s\le t\}\) (see Remark 3.7). Moreover, the active part

$$\begin{aligned} \lambda \int _0^t v_s^\gamma \mathrm {d}s, \end{aligned}$$

is an additive functional of a stationary Markov process and can (under some technical assumptions) be approximated by a martingale with respect to the filtration \(\mathscr {F}_t'=\sigma \{M_{\gamma s},0\le s\le t\}\) and hence (by independence of N and the active part) also with respect to \(\mathscr {F}_t\). This type of result was obtained in [15, 21,22,23].

Therefore the sum of the martingale part and the active part

$$\begin{aligned} \int _0^t v_s^\gamma \mathrm {d}N_s \end{aligned}$$

can be approximated by a martingale with respect to \(\mathscr {F}_t\). Since the martingale part has a source of randomness (the Poisson process N) that is independent of the active part, the martingales cannot cancel each other out. Finally, as is done in the papers that were just cited, one can apply functional martingale central limit theorems such as in [24, 25] to obtain the CLT for the active particle.

Remark 3.7

The fact that the martingale part is actually a martingale with respect to \(\mathscr {F}_t\) can be shown from a direct computation. However, this martingale also naturally shows up as a Dynkin martingale. Because of the underlying state process, the position \(X_t\) itself is not a Markov process. However, the pair \((X_t,M^\gamma _t)\) is (where \(M^\gamma _t\) is M speeded up by a factor \(\gamma \)). The corresponding generator L is given by

$$\begin{aligned} Lf(x,m) = \lambda (f(x+v(m),m)-f(x,m)) + \gamma (Af(x,\cdot ))(m). \end{aligned}$$

Setting \(g(x,m)=x\), we see that the following is (formally) a martingale with respect to the natural filtration of \((X_t,M^\gamma _{t})\):

$$\begin{aligned} \mathscr {M}_t := g(X_t,M^\gamma _t)-g(X_0,M^\gamma _0)-\int _0^t Lg(X_s,M^\gamma _s)\mathrm {d}s = X_t-X_0-\int _0^t \lambda v^\gamma _s\mathrm {d}s = \int _0^t v^\gamma _s\mathrm {d}{{\overline{N}}}_s. \end{aligned}$$

The quadratic variation of this martingale equals

$$\begin{aligned} \int _0^t (Lg^2 - 2gLg)(X_s,M^\gamma _s) \mathrm {d}s = \lambda \int _0^t (v^\gamma _s)^2\mathrm {d}s. \end{aligned}$$

Note that by ergodicity of M we have that almost surely

$$\begin{aligned} \lim _{t\rightarrow \infty } \frac{\lambda }{t}\int _0^t (v^\gamma _s)^2\mathrm {d}s = \lambda \int v^2\mathrm {d}\mu , \end{aligned}$$

which confirms that the martingale part converges to a Brownian motion with diffusion coefficient \(\lambda \int v^2\mathrm {d}\mu \).

4 Diffusion Coefficient: The Role of Reversibility

Now that we have found an expression for the limiting diffusion coefficient of the active particle, we want to understand how it depends on the internal state process. In particular we want to understand the role of reversibility of the internal state process with respect to the stationary measure \(\mu \). Recall that we say that the state process \(M_t\) is reversible with respect to \(\mu \) if the generator A is a self-adjoint operator on its domain in \(L^2(\mu )\). We will fix the stationary measure \(\mu \) and study processes with this stationary measure. We will also assume in the rest of this section that the internal state space \(\mathscr {S}\) is finite; this is mainly to avoid technical complications.

When we inspect the different terms of the diffusion coefficient (6), we see the following.

  (a) The random walk part, \(2\kappa \), does not depend on the internal state process.

  (b) The martingale part, \(\lambda \int v^2\mathrm {d}\mu \), only depends on the internal state process through its stationary measure \(\mu \).

  (c) The active part, \(\tfrac{2\lambda ^2}{\gamma }(v,-A^{-1}v)\), depends on the whole internal state process, i.e. its stationary measure as well as its generator.

We conclude that given a stationary measure \(\mu \), only the active part might depend on the reversibility of the state process with respect to \(\mu \). Since also the factor \(\tfrac{2\lambda ^2}{\gamma }\) is fixed, we will dedicate the rest of this section to studying the behaviour of the term

$$\begin{aligned} \left( v,-A^{-1}v\right) . \end{aligned}$$

To further specify our results, note that the generator A can be decomposed into a symmetric part \(\mathrm {sym}(A)=(A+A^*)/2\) and an antisymmetric part \(\mathrm {asym}(A)=(A-A^*)/2\), where \(A^*\) denotes the adjoint of A as operators on \(L^2(\mu )\). In particular the internal state process is reversible with respect to \(\mu \) if \(\mathrm {sym}(A)=A\) and accordingly \(\mathrm {asym}(A)=0\). We will show the following.

  (i) In Sect. 4.2 we will consider state processes with the same symmetric part. We will show that the active part of the diffusion coefficient is maximal for the process generated by the symmetric part itself, for any choice of the speed function v. In other words: the diffusion coefficient is maximal for the reversible process. Mathematically, we will prove that for all v that satisfy \(\int v\mathrm {d}\mu =0\),

    $$\begin{aligned} \left( v,-A^{-1}v\right) \le \left( v,-\mathrm {sym}(A)^{-1}v\right) . \end{aligned}$$

    This is Proposition 4.4. We also generalise this to active particles in higher dimensions.

  (ii) In Sect. 4.3 we will consider reversible processes with the requirement that the total jumping rate from each state is the same. We will show that in this case there is no reversible process that maximizes the diffusion coefficient for every choice of the speed function. In other words: within the class of reversible processes (with the same total jumping rates) there is no optimal reversible process.

Before this, we will start with some motivating examples in Sect. 4.1.

Remark 4.1

Note that the active part of the diffusion coefficient only depends on the “zero-average”-part of the speed function (see Remark 3.6). Therefore it remains the same when we replace the speed function v by \(v+c\), where c is a constant. Similarly, the active part of the diffusion coefficient is the same for \(X^c\) (from Remark 2.1). Because of this, if we replace v by \(v+c\) or if we consider the process \(X^c\) instead of X, the results of this section are still valid.

4.1 Motivation

As a motivating example, let us look back at Example 2. Note that for each \(a\in [-1/2,1/2]\), the state process has the same stationary distribution, namely the uniform distribution. However, only for \(a=0\) is the process reversible, whereas for \(a=1/2\) or \(a=-1/2\) the process is completely asymmetric (it only jumps to the right or only to the left, respectively). Hence we can think of a as the parameter that tunes the non-reversibility of the state process. The expression that we found earlier [see (14)] is

$$\begin{aligned} (v,-A^{-1}v) = \frac{-(v_1v_2+v_2v_3+v_1v_3)}{9/4 + 3a^2}. \end{aligned}$$

Since \(-(v_1v_2+v_2v_3+v_1v_3)\ge 0\) for v with \(\int v\mathrm {d}\mu = \tfrac{1}{3}(v_1+v_2+v_3) = 0\), this expression is maximal for \(a=0\), the reversible case, and decreases like \(\tfrac{1}{3/4+a^2}\) as a moves away from 0. We conclude that out of this family of state processes, the reversible process maximizes the diffusion coefficient.
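This maximization is easy to see numerically as well. The following sketch (ours, not from the paper) sweeps a over \([-1/2,1/2]\) for the speed function \(v=(1,0,-1)\) and confirms that \((v,-A^{-1}v)\) peaks at \(a=0\), where it equals \(4/9\) by (14):

```python
import numpy as np

def gen(a):
    return np.array([[-1.0, 0.5 + a, 0.5 - a],
                     [0.5 - a, -1.0, 0.5 + a],
                     [0.5 + a, 0.5 - a, -1.0]])

v = np.array([1.0, 0.0, -1.0])               # mu-mean zero (mu is uniform)

def quad(a):
    w = np.linalg.lstsq(-gen(a), v, rcond=None)[0]
    return np.mean(v * w)                    # (v, w) in L^2(mu)

vals = [quad(a) for a in np.linspace(-0.5, 0.5, 11)]
print(max(vals), quad(0.0))                  # maximum is attained at a = 0
```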

Now, for a more general result, we go back to the three-state example and note that the symmetric part of the generator [as an operator in \(L^2(\mu )\)] is the same for each a while the antisymmetric part varies with a; indeed:

$$\begin{aligned} \frac{1}{3}\begin{bmatrix} -1 &{} \frac{1}{2}+a &{} \frac{1}{2}-a \\ \frac{1}{2}-a &{} -1 &{} \frac{1}{2}+a \\ \frac{1}{2}+a &{} \frac{1}{2}-a &{} -1 \end{bmatrix} = \frac{1}{3} \begin{bmatrix} -1 &{} \frac{1}{2} &{} \frac{1}{2} \\ \frac{1}{2} &{} -1 &{} \frac{1}{2} \\ \frac{1}{2} &{} \frac{1}{2} &{} -1 \end{bmatrix} + \frac{a}{3} \begin{bmatrix} 0 &{} 1 &{} -1 \\ -1 &{} 0 &{} 1 \\ 1 &{} -1 &{} 0 \end{bmatrix}. \end{aligned}$$

We want to show that this is true in general: out of all processes (with the same stationary measure \(\mu \)) of which the symmetric part of the generator is the same, the purely reversible process (so the purely symmetric one) maximizes \((v,-A^{-1}v)\).

Remark 4.2

Even though we restrict ourselves in this section to finite state spaces (mainly for technical reasons), notice that the same behaviour (the fact that the diffusion coefficient is maximal for reversible state processes) occurs in Examples 4 and 5.

Indeed, in Example 4 the state process consists of a reversible part scaled with a constant a and a non-reversible part with constant b (so in particular the process is reversible if and only if \(b=0\)). The active part of the diffusion coefficient in (16) equals

$$\begin{aligned} \frac{2\lambda ^2}{\gamma }\frac{a}{a^2+b^2}. \end{aligned}$$

So when we keep a fixed, the active part is maximized in the reversible case.

In Example 5 the active part of the diffusion matrix in (17) equals

$$\begin{aligned} \frac{\lambda ^2}{\gamma } \frac{\sigma ^2}{1+a^2} I. \end{aligned}$$

This matrix is maximal for \(a=0\), which is the reversible case.

4.2 Comparing Reversible and Non-reversible Processes

In order to prove the main result, Proposition 4.4 below, we first need the following lemma.

Lemma 4.3

Let C be a skew-symmetric matrix. Then both \(I+C\) and \(I-C^2\) are invertible and for all w

$$\begin{aligned} (w,(I+C)^{-1}w) = (w,(I-C^2)^{-1}w) \le (w,w). \end{aligned}$$

Proof

The invertibility of \(I+C\) and \(I-C^2\) is known, but we repeat it for completeness. Suppose that \(I+C\) is not invertible. Then there exists \(v\ne 0\) such that \((I+C)v=0\), so \(v=-Cv\). Then \((v,v)=-(v,Cv)=0\), which is a contradiction. Similarly if \((I-C^2)v=0\), then \(v=C^2v\), so \((v,v)=(v,C^2v)=-(Cv,Cv)\le 0\), which is a contradiction.

Now let w be arbitrary and set \(g=(I-C^2)^{-1}w\) and \(h=(I+C)^{-1}w\), which implies that \((I-C)g=h\). Then we see

$$\begin{aligned} (w,(I+C)^{-1}w) = ((I+C)h,h) = (h,h) + (Ch,h)=(h,h) \end{aligned}$$

and

$$\begin{aligned} (w,(I-C^2)^{-1}w)= & {} ((I-C^2)g,g) = ((I+C)(I-C)g,g) = ((I+C)h,g) \\= & {} (h,g)+(Ch,g)= (h,g)-(h,Cg) = (h,(I-C)g) = (h,h), \end{aligned}$$

which proves the equality.

To prove the inequality, first note that \(-C^2\) is positive semidefinite. Therefore the eigenvalues of \(I-C^2\) are at least 1, so the eigenvalues of \((I-C^2)^{-1}\) lie in (0, 1], so \(\Vert (I-C^2)^{-1}\Vert \le 1\), which implies that \((w,(I-C^2)^{-1}w) \le (w,w)\). \(\square \)
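A quick numerical illustration of Lemma 4.3 (our sketch, with a random \(6\times 6\) skew-symmetric matrix):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
G = rng.standard_normal((n, n))
C = G - G.T                                  # skew-symmetric: C^T = -C
w = rng.standard_normal(n)
I = np.eye(n)

lhs = w @ np.linalg.solve(I + C, w)          # (w, (I+C)^{-1} w)
mid = w @ np.linalg.solve(I - C @ C, w)      # (w, (I-C^2)^{-1} w)
print(lhs, mid, w @ w)                       # lhs equals mid, both at most w.w
```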

Since we want to compare a Markov generator with its symmetric part [in \(L^2(\mu )\)], we recall some properties of this symmetric part. First of all, the symmetric part is again a Markov generator. Moreover, if the original generator has a unique ergodic measure, then the symmetric part generates a reversible process with the same unique ergodic measure. These properties are known, but for the reader’s convenience we collect them with a proof in Lemma A.2 in the appendix.

Now we can prove the following proposition.

Proposition 4.4

Let A be the generator of a Markov process on a finite state space with unique ergodic measure \(\mu \). Then for all v such that \(\int v\mathrm {d}\mu =0\)

$$\begin{aligned} (v,-A^{-1}v)\le (v,-\mathrm {sym}(A)^{-1}v), \end{aligned}$$

where \(\mathrm {sym}(A) = (A+A^*)/2\) is the symmetric part of A in \(L^2(\mu )\). As a consequence, the diffusion coefficient (6) is maximized for reversible state processes.

Proof

Let \(B=(-A+(-A)^*)/2\) be the symmetric part of \(-A\) and \(D=(-A-(-A)^*)/2\) the skew-symmetric part [in \(L^2(\mu )\)]. Let v be such that \(\int v\mathrm {d}\mu = 0\). Note that B is (strictly) positive definite on the subspace of w such that \(\int w\mathrm {d}\mu =0\), so \(B^{-1}\) and \(B^{-1/2}\) exist and are symmetric [in \(L^2(\mu )\)]. Now we see

$$\begin{aligned}&(v,-A^{-1}v)= (v,(B+D)^{-1}v) = (v,(B^{1/2}(I+B^{-1/2}DB^{-1/2})B^{1/2})^{-1}v) \\= & {} (v,B^{-1/2}(I+B^{-1/2}DB^{-1/2})^{-1}B^{-1/2}v) = (B^{-1/2} v, (I+B^{-1/2}DB^{-1/2})^{-1}B^{-1/2}v). \end{aligned}$$

Now write \(w=B^{-1/2} v\) and \(C=B^{-1/2}DB^{-1/2}\), so

$$\begin{aligned} (v,-A^{-1}v) = (w,(I+C)^{-1}w). \end{aligned}$$

Note that for all \(u,u'\)

$$\begin{aligned} (u,Cu')= & {} (u,B^{-1/2}DB^{-1/2}u') = (B^{-1/2}u,D B^{-1/2}u') \\= & {} -(DB^{-1/2}u,B^{-1/2}u')=-(B^{-1/2}DB^{-1/2}u,u')=-(Cu,u'), \end{aligned}$$

so C is skew-symmetric. Therefore applying Lemma 4.3 gives us that

$$\begin{aligned} (v,-A^{-1}v)= & {} (w,(I+C)^{-1}w) \le (w,w) \\= & {} (B^{-1/2}v,B^{-1/2}v) = (v,B^{-1}v) = (v,-\mathrm {sym}(A)^{-1}v). \end{aligned}$$

\(\square \)
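Proposition 4.4 can be illustrated numerically by drawing a random generator, computing its adjoint and symmetric part in \(L^2(\mu )\), and comparing the two quadratic forms (our sketch, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5
A = rng.uniform(0.5, 2.0, (n, n))
np.fill_diagonal(A, 0.0)
np.fill_diagonal(A, -A.sum(axis=1))          # generator: rows sum to zero

# stationary measure mu: normalized left null vector of A
evals, evecs = np.linalg.eig(A.T)
mu = np.real(evecs[:, np.argmin(np.abs(evals))])
mu = mu / mu.sum()

D = np.diag(mu)
A_star = np.linalg.solve(D, A.T @ D)         # adjoint of A in L^2(mu)
S = (A + A_star) / 2                         # symmetric part sym(A)

v = rng.standard_normal(n)
v = v - mu @ v                               # enforce mu-mean zero

def quad(G):
    sol = np.linalg.lstsq(-G, v, rcond=None)[0]
    return mu @ (v * sol)                    # (v, -G^{-1} v) in L^2(mu)

print(quad(A), quad(S))                      # quad(A) <= quad(S)
```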

Remark 4.5

If we assume that \(\Vert B^{-1/2}DB^{-1/2}\Vert < 1\), we can use the Neumann series expansion and obtain the more explicit formula:

$$\begin{aligned} (v,-A^{-1}v) = (v,-\mathrm {sym}(A)^{-1}v) + (w,C^2(I-C^2)^{-1}w), \end{aligned}$$

where w and C are as in the proof of Proposition 4.4. Indeed in that case

$$\begin{aligned} (w,(I+C)^{-1}w )= & {} \left( w,\sum _{n=0}^\infty (-1)^n C^nw\right) = \sum _{n=0}^\infty (-1)^n (w,C^nw) = \sum _{n=0}^\infty (-1)^{2n} (w,C^{2n}w) \\= & {} \left( w,\sum _{n=0}^\infty (C^2)^n w\right) = (w,w) + \left( w,\sum _{n=1}^\infty (C^2)^n w\right) \\= & {} (w,w) + \left( w,C^2\sum _{n=0}^\infty (C^2)^n w\right) = (v,-\mathrm {sym}(A)^{-1}v) + (w,C^2(I-C^2)^{-1}w). \end{aligned}$$

Note that in the third equality we used that \(C^n\) is skew-symmetric, so \((w,C^nw)=0\) for n odd.

Now that we have Proposition 4.4 for active particles in \(\mathbb {R}\), we can use it to generalize to d dimensions. Recall from Theorem 3.2 that the active part of the limiting diffusion matrix of an \(\mathbb {R}^d\)-valued random walk is \((\lambda ^2/\gamma ) D^A\), where

$$\begin{aligned} D^A_{ij} := ((v)_i,-A^{-1}(v)_j)+((v)_j,-A^{-1}(v)_i). \end{aligned}$$

The next corollary tells us that in the same context as Proposition 4.4, this quantity is optimal for the reversible process.

Corollary 4.6

Let A and \(\mu \) be as in Proposition 4.4. Then for all \(\mathbb {R}^d\)-valued v such that \(\int v\mathrm {d}\mu =0\) (in \(\mathbb {R}^d),\) \(D^A\) is dominated by \(D^{\mathrm {sym}(A)}\) in the sense that \(D^{\mathrm {sym}(A)}-D^A\) is positive semidefinite.

Proof

It suffices to show that for all \(\alpha \in \mathbb {R}^d\), \(\alpha ^TD^A\alpha \le \alpha ^TD^{\mathrm {sym}(A)}\alpha \). Let \(\alpha \in \mathbb {R}^d\). Then \(\alpha \cdot v\) is an \(\mathbb {R}\)-valued function such that \(\int (\alpha \cdot v)\mathrm {d}\mu = \alpha \cdot \left( \int v\mathrm {d}\mu \right) =0\). Therefore, using Proposition 4.4, we see

$$\begin{aligned} \alpha ^TD^A\alpha= & {} \sum _{i,j=1}^d \alpha _i\alpha _j(((v)_i,-A^{-1}(v)_j)+((v)_j,-A^{-1}(v)_i)) = 2 ((\alpha \cdot v),-A^{-1}(\alpha \cdot v))\\\le & {} 2 ((\alpha \cdot v),-\mathrm {sym}(A)^{-1}(\alpha \cdot v)) = \alpha ^TD^{\mathrm {sym}(A)}\alpha . \end{aligned}$$

\(\square \)

4.3 Comparing Reversible Processes

Proposition 4.4 tells us that among all generators with the same symmetric part, the symmetric part itself maximizes the diffusion coefficient of the active particle. Now one might wonder whether there are classes of reversible internal state processes that yield the same diffusion coefficient for each speed function v. The following lemma shows us that this is not the case.

Lemma 4.7

Let A and B be Markov generators with reversible measure \(\mu \). Suppose that for every v with \(\int v\mathrm {d}\mu =0,\) \((v,-A^{-1}v)=(v,-B^{-1}v)\). Then \(A=B\).

Proof

Define the following linear subspaces of \(L^2(\mu )\): \(V_\mu :=\left\{ v| \int v\mathrm {d}\mu = 0\right\} \) and \(V_1 = \{c\mathbb {1}|c\in \mathbb {R}\}\). Note that \(V_\mu \) and \(V_1\) are orthogonal in \(L^2(\mu )\) and in fact \(V_\mu \) is the orthogonal complement of \(V_1\) in \(L^2(\mu )\), so the actions on \(V_\mu \) and \(V_1\) together fully determine A and B. Also note that A and B are 0 on \(V_1\) and are invertible when restricted to maps \(V_\mu \rightarrow V_\mu \). It suffices to show that A and B are equal on \(V_\mu \), so in turn it suffices to show that \(A^{-1}\) and \(B^{-1}\) are equal on \(V_\mu \). For this let \(v,w\in V_\mu \). Then, by polarization (using that \(A^{-1}\) and \(B^{-1}\) are self-adjoint),

$$\begin{aligned} (v,-A^{-1}w)= & {} \frac{1}{2}\left( ((v+w),-A^{-1}(v+w))-(v,-A^{-1}v)-(w,-A^{-1}w)\right) \\= & {} \frac{1}{2}\left( ((v+w),-B^{-1}(v+w))-(v,-B^{-1}v)-(w,-B^{-1}w)\right) = (v,-B^{-1}w). \end{aligned}$$

This shows that \(A^{-1}=B^{-1}\) on \(V_\mu \), so we conclude that \(A=B\). \(\square \)

Now that we know that different reversible processes cannot yield the same diffusion coefficients, it could still be that certain reversible processes yield larger diffusion coefficients than others. To answer this question, we need to normalise in some way: if we replace the generator A by cA for some constant \(c>1\), the diffusion coefficient is divided by c, so A trivially yields larger diffusion coefficients than cA. We normalise here by comparing reversible processes that have the same total jumping rate from each state. The next lemma tells us that in that case no process strictly dominates all the others; which process yields the larger coefficient depends on the speed function v.

Lemma 4.8

Let A and B be Markov generators on a finite state space that are reversible with respect to \(\mu \). Additionally assume that the total jump rate from each state is the same for A and B. Then either \(A=B\) or there exist \(v,w\in V_\mu \) such that

$$\begin{aligned} (v,-A^{-1}v)>(v,-B^{-1}v)\quad \text {and}\quad (w,-A^{-1}w)<(w,-B^{-1}w). \end{aligned}$$

Proof

Let A and B be as stated. Now assume that there are no \(v,w\in V_\mu \) such that \((v,-A^{-1}v)>(v,-B^{-1}v)\) and \((w,-A^{-1}w)<(w,-B^{-1}w)\). Without loss of generality assume that for all \(v\in V_\mu \), \((v,-A^{-1}v)\ge (v,-B^{-1}v)\). This implies that \(-A^{-1}\ge -B^{-1}\) [in the sense that \(-A^{-1}-(-B^{-1})\) is symmetric and positive semidefinite on \(V_\mu \)]. With the fact that \(-A,-B\) are positive definite, this in turn implies that \(-B\ge -A\), so \(A - B\ge 0\) on \(V_\mu \). Since also \(Av=Bv=0\) for \(v\in V_1\), this implies that \(A-B\ge 0\) on \(L^2(\mu )\). Now if we define D to be the diagonal matrix with \(D_{ii}=\mu _i\), then \(D(A-B)\ge 0\) and \(D(A-B)\) is symmetric with respect to the usual inner product in \(\mathbb {R}^d\). Also, \(A-B\) and (hence) \(D(A-B)\) have zeroes on the diagonal (because of the equal jump rates), so the trace of \(D(A-B)\) is 0. Therefore the eigenvalues of \(D(A-B)\) are non-negative and sum to 0, so they are all 0. This implies that \(D(A-B)=0\), so \(A=B\). \(\square \)
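The phenomenon in Lemma 4.8 can be made concrete on four states with uniform \(\mu \) (our example, not from the paper): the nearest-neighbour ring and the complete graph, both reversible with total jump rate 1 from every state, are each better for some speed functions and worse for others.

```python
import numpy as np

# two generators on 4 states, both reversible w.r.t. uniform mu and with
# total jump rate 1 from every state
A = 0.5 * np.array([[-2.0, 1.0, 0.0, 1.0],
                    [1.0, -2.0, 1.0, 0.0],
                    [0.0, 1.0, -2.0, 1.0],
                    [1.0, 0.0, 1.0, -2.0]])      # nearest-neighbour ring
B = (np.ones((4, 4)) - 4 * np.eye(4)) / 3        # complete graph

def quad(G, f):
    sol = np.linalg.lstsq(-G, f, rcond=None)[0]
    return np.mean(f * sol)                      # (f, -G^{-1} f), mu uniform

v = np.array([1.0, 0.0, -1.0, 0.0])
w = np.array([1.0, -1.0, 1.0, -1.0])
print(quad(A, v), quad(B, v))   # 0.5 > 0.375: A is better for v
print(quad(A, w), quad(B, w))   # 0.5 < 0.75:  B is better for w
```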

5 Large Deviations

In this section we derive a large deviation principle (LDP) for \(\tfrac{1}{t}X_t\). The active particle that we are studying is what is called a slow-fast system in the literature and much research has already been done on its large deviations. Because of this it is not our goal here to present this result in the highest possible generality. We would rather see which formulas are obtained and study their behaviour, in particular the relation between the rate function and the reversibility of M. Therefore we reduce (as in Sect. 4) to the case where the state space \(\mathscr {S}\) of M is finite [and hence where \((v^\gamma _s,s\ge 0)\) is bounded].

Remark 5.1

Note that we don’t need anywhere in this section that \(\int v\mathrm {d}\mu =0\).

Since we will express the rate function for \(\tfrac{1}{t}X_t\) in terms of the rate function of the empirical process corresponding to the underlying state process, we quickly recall some results that we will use. We write

$$\begin{aligned} \chi _t = \frac{1}{t}\int _0^t \delta _{M_s}\mathrm {d}s \end{aligned}$$

and denote by \(P_t\) the distribution of \(\chi _t\) in the space of probability measures on \(\mathscr {S}\). Then we know from [28] that \((P_t,t\ge 0)\) satisfies an LDP with good rate function \(I_e\) given by

$$\begin{aligned} I_e(\xi ) = \sup _{u>0} \left( -\sum _{i=1}^n\xi _i \frac{(Au)_i}{u_i}\right) . \end{aligned}$$
(18)

In case A is symmetric, this reduces to

$$\begin{aligned} I_e(\xi ) = (u,-Au), \end{aligned}$$
(19)

where \(u_i = \sqrt{\xi _i/\mu _i}\) (note that we assumed that \(\mu \) has full support, so \(\mu _i>0\) for all i) and the inner product is (as usual) with respect to \(\mu \).
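For a symmetric generator, the relation between (18) and (19) is easy to probe numerically: evaluating the variational expression at \(u_i=\sqrt{\xi _i/\mu _i}\) reproduces \((u,-Au)\) exactly, and random positive trial functions never exceed it. A small sketch (ours, on the symmetric three-state chain of Example 2 with \(a=0\)):

```python
import numpy as np

rng = np.random.default_rng(6)
A = np.array([[-1.0, 0.5, 0.5],
              [0.5, -1.0, 0.5],
              [0.5, 0.5, -1.0]])      # symmetric generator, mu uniform
mu = np.full(3, 1 / 3)

xi = rng.dirichlet(np.ones(3))        # a candidate empirical measure

def objective(u):
    return -np.sum(xi * (A @ u) / u)

u_star = np.sqrt(xi / mu)
claimed = np.sum(mu * u_star * (-A @ u_star))   # (u, -A u) in L^2(mu)
print(objective(u_star), claimed)               # identical up to rounding

# random positive trial functions do not beat the claimed maximizer
best = max(objective(np.exp(rng.standard_normal(3))) for _ in range(2000))
print(best <= claimed + 1e-9)
```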

5.1 Large Deviations Rate Function

To obtain the large deviations rate function of \(X_t/t\), we start by calculating the logarithmic moment generating function (log-mgf) of \(X_t\): \(F_t(\alpha ) = \log \mathbb {E}\left[ \mathrm {e}^{\alpha X_t}\right] \) for \(\alpha \in \mathbb {R}\). To calculate it we first observe that by independence of Y and the rest,

$$\begin{aligned} F_t(\alpha )= & {} \log \mathbb {E}\left[ \mathrm {e}^{\alpha \left( Y_{2\kappa t} + \int _0^tv_s^\gamma \mathrm {d}N_s\right) }\right] = \log \mathbb {E}\left[ \exp (\alpha Y_{2\kappa t})\right] \nonumber \\&\quad + \log \mathbb {E}\left[ \exp \left( \alpha \int _0^tv_s^\gamma \mathrm {d}N_s\right) \right] . \end{aligned}$$
(20)

The first term is just the log-mgf of a simple random walk speeded up by a factor \(2\kappa \). At time t this walk equals the difference of two independent Poisson random variables with parameter \(\kappa t\), so we obtain that

$$\begin{aligned} \log \mathbb {E}\left[ \exp (\alpha Y_{2\kappa t})\right] = \log (\exp (\kappa t (\mathrm {e}^{\alpha }-1))\exp (\kappa t (\mathrm {e}^{-\alpha }-1))) = 2\kappa t (\cosh (\alpha )-1). \end{aligned}$$
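This identity is straightforward to verify numerically. The sketch below (with illustrative values of \(\kappa \), t and \(\alpha \), not taken from the paper) computes the mgf of the difference of two independent Poisson(\(\kappa t\)) variables by truncated summation and compares it with \(\exp (2\kappa t(\cosh (\alpha )-1))\).

```python
import math

# Illustrative parameters.
kappa, t, alpha = 0.8, 1.0, 0.5
rate = kappa * t
N = 60  # truncation level; Poisson(0.8) mass beyond 60 is negligible

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

# mgf of D = P1 - P2 with P1, P2 independent Poisson(rate), by direct summation.
mgf_truncated = sum(
    math.exp(alpha * (j - k)) * poisson_pmf(j, rate) * poisson_pmf(k, rate)
    for j in range(N) for k in range(N)
)

# Closed form from the displayed identity.
mgf_closed = math.exp(2 * kappa * t * (math.cosh(alpha) - 1))
```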

To calculate the second term, we first condition on \(v^\gamma =(v^\gamma _s,0\le s\le t)\). Approximating the integral along partitions \(0=s_0<s_1<\cdots <s_n=t\) with mesh tending to zero, and using the independence of the Poisson increments, we obtain

$$\begin{aligned} \mathbb {E}\left[ \left. \exp \left( \alpha \int _0^tv_s^\gamma \mathrm {d}N_s\right) \right| v^\gamma \right]&= \lim _{n\rightarrow \infty } \mathbb {E}\left[ \left. \exp \left( \alpha \sum _{i=0}^{n-1}v^\gamma _{s_i}(N_{s_{i+1}}-N_{s_i})\right) \right| v^\gamma \right] \\&= \lim _{n\rightarrow \infty } \prod _{i=0}^{n-1} \mathbb {E}\left[ \exp \left( \alpha v^\gamma _{s_i}(N_{s_{i+1}}-N_{s_i})\right) \big | v^\gamma \right] \\&= \lim _{n\rightarrow \infty } \prod _{i=0}^{n-1} \exp \left( \lambda \left( \mathrm {e}^{\alpha v_{s_i}^\gamma }-1\right) (s_{i+1}-s_i)\right) \\&= \lim _{n\rightarrow \infty } \exp \left( \sum _{i=0}^{n-1} \lambda \left( \mathrm {e}^{\alpha v_{s_i}^\gamma }-1\right) (s_{i+1}-s_i)\right) = \exp \left( \lambda \int _0^t \left( \mathrm {e}^{\alpha v_{s}^\gamma }-1\right) \mathrm {d}s\right) . \end{aligned}$$

Therefore we see that the second term of (20) equals

$$\begin{aligned} \log \mathbb {E}\exp \left( \lambda \int _0^t \left( \mathrm {e}^{\alpha v_{s}^\gamma }-1\right) \mathrm {d}s\right) . \end{aligned}$$

We conclude that

$$\begin{aligned} F_t(\alpha ) = 2\kappa t (\cosh (\alpha )-1) + \log \mathbb {E}\exp \left( \lambda \int _0^t \left( \mathrm {e}^{\alpha v_{s}^\gamma }-1\right) \mathrm {d}s\right) . \end{aligned}$$
(21)

Now we can compute the large deviation free energy function \(F(\alpha )\) as the limit of \(F_t(\alpha )/t\). We see for the first term that

$$\begin{aligned} \lim _{t\rightarrow \infty } \frac{2\kappa t (\cosh (\alpha )-1)}{t} = 2\kappa (\cosh (\alpha )-1). \end{aligned}$$
(22)

Now for the second term, define \(h^\gamma _\alpha \) as a function on probability measures on \(\mathscr {S}\) given by

$$\begin{aligned} h^\gamma _\alpha (\xi ) = \frac{\lambda }{\gamma } \int _\mathscr {S}\left( \mathrm {e}^{\alpha v(x)}-1\right) \xi (\mathrm {d}x). \end{aligned}$$

This enables us to rewrite the second part of \(F_t(\alpha )\) and use Varadhan’s lemma to obtain

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{1}{t} \log \mathbb {E}\exp \left( \frac{\lambda }{\gamma } \int _0^{\gamma t} \left( \mathrm {e}^{\alpha v(M_s)}-1\right) \mathrm {d}s \right)&= \lim _{t\rightarrow \infty }\frac{1}{t} \log \mathbb {E}\exp \left( t\gamma \frac{\lambda }{\gamma } \int _\mathscr {S}\left( \mathrm {e}^{\alpha v(x)}-1\right) \left( \frac{1}{\gamma t}\int _0^{\gamma t} \delta _{M_s} \mathrm {d}s\right) (\mathrm {d}x) \right) \\&= \gamma \lim _{t\rightarrow \infty } \frac{1}{\gamma t} \log \mathbb {E}\exp \left( \gamma t h^\gamma _\alpha (\chi _{\gamma t})\right) = \gamma \sup _{\xi } (h^\gamma _\alpha (\xi ) - I_e(\xi )). \end{aligned}$$

Note that the latter equals

$$\begin{aligned} \sup _{\xi }(\lambda (\varphi _\xi (\alpha )-1) - \gamma I_e(\xi )), \end{aligned}$$
(23)

where \(\varphi _\xi (\alpha )=\int _\mathscr {S}\exp (\alpha v(x))\xi (\mathrm {d}x)\) denotes the mgf of v under \(\xi \) evaluated at \(\alpha \). Combining (22) and (23), we conclude that

$$\begin{aligned} F(\alpha ) = \lim _{t\rightarrow \infty }\frac{F_t(\alpha )}{t} = 2\kappa (\cosh (\alpha )-1) + \sup _{\xi }(\lambda (\varphi _\xi (\alpha )-1) - \gamma I_e(\xi )). \end{aligned}$$
(24)

Using the Gärtner–Ellis theorem, we now obtain the large deviation principle for \(X_t/t\) with rate function given by the Legendre transform of \(F(\alpha )\):

$$\begin{aligned} I(x) = \sup _{\alpha }(\alpha x - F(\alpha )) = \sup _{\alpha }(\alpha x - 2\kappa (\cosh (\alpha )-1) - \sup _{\xi }(\lambda (\varphi _\xi (\alpha )-1) - \gamma I_e(\xi ))). \end{aligned}$$
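The Legendre-transform step can be illustrated numerically. For the pure random-walk part \(F(\alpha )=2\kappa (\cosh (\alpha )-1)\) the transform is available in closed form, so a grid supremum over \(\alpha \) should reproduce it; the parameters below are illustrative, not taken from the paper.

```python
import math

# Illustrative parameters: random-walk rate kappa and velocity x.
kappa, x = 1.0, 0.7

def F(alpha):
    """Free energy of the symmetric random walk part, 2*kappa*(cosh(alpha)-1)."""
    return 2 * kappa * (math.cosh(alpha) - 1)

# Grid approximation of the Legendre transform I(x) = sup_alpha (alpha*x - F(alpha)).
alphas = [-5 + 10 * i / 20000 for i in range(20001)]
I_grid = max(a * x - F(a) for a in alphas)

# Closed form: the optimum sits at alpha = asinh(x / (2*kappa)).
s = x / (2 * kappa)
I_closed = x * math.asinh(s) - 2 * kappa * (math.sqrt(1 + s * s) - 1)
```

Any grid value is a lower bound for the supremum, so the approximation approaches the closed form from below.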

Remark 5.2

An analogous computation shows that a similar expression holds in the multidimensional case. Indeed, if we set \(F_t(\alpha ) = \log \mathbb {E}\exp (\alpha \cdot X_t)\) for \(\alpha \in \mathbb {R}^d\), we obtain

$$\begin{aligned} F(\alpha )=\lim _{t\rightarrow \infty }\frac{F_t(\alpha )}{t} = 2\kappa \sum _{i=1}^d (\cosh (\alpha _i)-1) + \sup _{\xi }(\lambda (\varphi _\xi (\alpha )-1) - \gamma I_e(\xi )), \end{aligned}$$

where

$$\begin{aligned} \varphi _\xi (\alpha )=\int _\mathscr {S}\mathrm {e}^{\alpha \cdot v(x)}\xi (\mathrm {d}x). \end{aligned}$$

Then again we can take the Legendre transform to find the rate function I.

Example 6

We return to Example 1 to obtain an explicit expression for the large deviations free energy function. Note that the state process is reversible with respect to the stationary measure \(\mu =(1/2,1/2)\). Using (19), fixing a probability measure \(\xi \) on \(\{1,-1\}\) and setting \(u_i=\sqrt{\xi _i/(1/2)}=\sqrt{2\xi _i}\), we see

$$\begin{aligned} I_e(\xi ) = (u,-Au) = \frac{1}{2}(\sqrt{2\xi _1}-\sqrt{2\xi _{-1}})^2 = (\sqrt{\xi _1}-\sqrt{\xi _{-1}})^2 = 1 - 2\sqrt{\xi _1\xi _{-1}}. \end{aligned}$$

Parametrizing \(\xi =(r,1-r)\), we see

$$\begin{aligned} \sup _{\xi }(\lambda (\varphi _\xi (\alpha )-1) - \gamma I_e(\xi ))&= \sup _{0\le r\le 1} (\lambda (r \mathrm {e}^\alpha + (1-r)\mathrm {e}^{-\alpha } - 1) -\gamma (1-2\sqrt{r(1-r)})) \\&= \lambda (\mathrm {e}^{-\alpha }-1) - \gamma + \sup _{0\le r\le 1} (2\lambda \sinh (\alpha )r +2\gamma \sqrt{r(1-r)}). \end{aligned}$$

A simple calculation shows that the latter equals

$$\begin{aligned} \lambda (\mathrm {e}^{-\alpha }-1) - \gamma + \sqrt{\gamma ^2+\lambda ^2\sinh ^2(\alpha )} + \lambda \sinh (\alpha ) = \lambda (\cosh (\alpha )-1) + \sqrt{\gamma ^2+\lambda ^2\sinh ^2(\alpha )} - \gamma , \end{aligned}$$

so with (24), we see

$$\begin{aligned} F(\alpha ) = (2\kappa + \lambda ) (\cosh (\alpha )-1) + \sqrt{\gamma ^2+\lambda ^2\sinh ^2(\alpha )} - \gamma . \end{aligned}$$
(25)
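The "simple calculation" behind the last step, namely the supremum over r, can be checked numerically; the values of \(\lambda \), \(\gamma \) and \(\alpha \) below are illustrative choices.

```python
import math

# Illustrative parameters.
lam, gamma, alpha = 1.3, 0.9, 0.6
s = math.sinh(alpha)

# Grid approximation of sup_{0<=r<=1} (2*lam*sinh(alpha)*r + 2*gamma*sqrt(r*(1-r))).
grid = [i / 100000 for i in range(100001)]
sup_grid = max(2 * lam * s * r + 2 * gamma * math.sqrt(r * (1 - r)) for r in grid)

# Claimed closed form: lam*sinh(alpha) + sqrt(gamma^2 + lam^2*sinh(alpha)^2).
closed = lam * s + math.sqrt(gamma ** 2 + lam ** 2 * s ** 2)
```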

Remark 5.3

In the case of \(X^c\) (from Remark 2.1), the calculations become somewhat easier. Instead of the symmetric random walk \(Y_{2\kappa t}\) we work directly with the continuous limit \(B_{2\kappa t}\). More importantly, there is no additional randomness from the Poisson process N. Carrying out the analogous computations, we find the same results with \(\varphi _\xi (\alpha )\) replaced by \(\alpha \int v(x) \xi (\mathrm {d}x).\)

In this section we worked with a finite state space, so all the computations and quantities here are well-defined. However, for a more general state process, for the original process X one would need

$$\begin{aligned} \mathbb {E}\exp \left( \lambda \int _0^t \left( \mathrm {e}^{\alpha v_{s}^\gamma }-1\right) \mathrm {d}s\right) <\infty \end{aligned}$$

to get a finite free energy. Setting \(t\ll 1\), this implies that we need something like

$$\begin{aligned} \mathbb {E}\mathrm {e}^{\mathrm {e}^{v_0}}<\infty , \end{aligned}$$

which is a very strong assumption that fails, for instance, for the Ornstein–Uhlenbeck process.

Changing to \(X^c\) means getting rid of the Poisson jumps, which removes one of the exponentials. We therefore expect that an LDP holds for a much wider class of state processes in the \(X^c\) case than for the original process X.

5.2 The Role of Reversibility

Our goal now is to show a result that is similar to Proposition 4.4. Indeed, we show that if an active particle has a state process generated by some generator A, then the rate function of this active particle is pointwise at least the rate function of the active particle whose state process is generated by the symmetric part of A. In other words: a reversible state process yields a lower rate function. Before we show this, we prove the following lemma, which gives an analogous comparison for the rate functions of the empirical measures corresponding to the state processes.

Lemma 5.4

Let A be a Markov generator with unique ergodic measure \(\mu \) and let \(\mathrm {sym}(A)\) be its symmetric part (in \(L^2(\mu )).\) Denote the rate functions of the corresponding empirical processes by \(I_e^A\) and \(I_e^{\mathrm {sym}(A)},\) respectively. Then for all probability measures \(\xi ,\) \(I_e^{\mathrm {sym}(A)}(\xi )\le I_e^A(\xi )\).

Proof

Let \(\xi \) be a probability measure on \(\mathscr {S}\). We set \(u_i=\sqrt{\xi _i/\mu _i}\ge 0\). Also define for \(m\in \mathbb {N}\), \(u^m_i = u_i\) if \(u_i>0\) and \(u^m_i=1/m\) otherwise. Note that \(u^m_i>0\) for all i and that \(u^m\rightarrow u\) in \(L^2(\mu )\) (since it converges pointwise and \(\mathscr {S}\) is finite). Finally note that \(\xi _i/u^m_i = \mu _i u_i\) for all i. Now, using (18), we see that for all m

$$\begin{aligned} I^A_e(\xi ) = \sup _{u'>0}\left( -\sum _{i=1}^n\xi _i \frac{(Au')_i}{u'_i}\right) \ge -\sum _{i=1}^n\xi _i \frac{(Au^m)_i}{u^m_i} = -\sum _{i=1}^n \mu _i u_i (Au^m)_i = (u,-Au^m). \end{aligned}$$

Therefore, letting \(m\rightarrow \infty \) and using (19), together with the fact that the antisymmetric part of A does not contribute to the quadratic form, we conclude

$$\begin{aligned} I^A_e(\xi )\ge \lim _{m\rightarrow \infty } (u,-Au^m) = (u,-Au) = (u,-\mathrm {sym}(A)u) = I_e^{\mathrm {sym}(A)}(\xi ). \end{aligned}$$

\(\square \)
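The key step of the proof, namely that plugging \(u=\sqrt{\xi /\mu }\) into the supremum in (18) already produces \(I_e^{\mathrm {sym}(A)}(\xi )\), can be illustrated numerically. The sketch below uses a non-reversible cyclic three-state chain with uniform stationary measure; the rates and the measure \(\xi \) are illustrative choices.

```python
import math

# Illustrative 3-state chain: symmetric rate q between neighbours plus a
# cyclic drift p, which makes the chain non-reversible. The rates are doubly
# stochastic, so the stationary measure is uniform.
p, q = 0.7, 0.4
n = 3
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][(i + 1) % n] = q + p   # clockwise jump
    A[i][(i - 1) % n] = q       # counter-clockwise jump
    A[i][i] = -(2 * q + p)
mu = [1.0 / n] * n

# Symmetric part in L^2(mu); with mu uniform this is just (A + A^T)/2.
S = [[(A[i][j] + A[j][i]) / 2 for j in range(n)] for i in range(n)]

xi = [0.5, 0.2, 0.3]
u = [math.sqrt(xi[i] / mu[i]) for i in range(n)]

def objective(G, v):
    """-sum_i xi_i (Gv)_i / v_i, the functional inside the sup in (18)."""
    Gv = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
    return -sum(xi[i] * Gv[i] / v[i] for i in range(n))

# Closed form (19) for the symmetrized chain: (u, -Su)_mu.
I_sym = sum(mu[i] * u[i] * -sum(S[i][j] * u[j] for j in range(n)) for i in range(n))

# Plugging the same u into the supremum defining I_e^A gives a lower bound,
# and this lower bound coincides with I_sym, so I_e^A(xi) >= I_e^{sym(A)}(xi).
lower_bound_for_I_A = objective(A, u)
```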

Now we use this to prove the following result.

Corollary 5.5

Let A be a Markov generator with unique ergodic measure \(\mu \) and let \(\mathrm {sym}(A)\) be its symmetric part (in \(L^2(\mu )).\) Denote the rate functions of the corresponding active particle processes by \(I^A\) and \(I^{\mathrm {sym}(A)}\) and the free energy functions of those processes by \(F^A\) and \(F^{\mathrm {sym}(A)},\) respectively. Then for all \(\alpha \in \mathbb {R}: F^A(\alpha )\le F^{\mathrm {sym}(A)}(\alpha )\) and for all \(x\in \mathbb {R}:I^{\mathrm {sym}(A)}(x)\le I^A(x)\).

Proof

Since for all \(\xi \), \(I_e^{\mathrm {sym}(A)}(\xi )\le I_e^A(\xi )\), it follows that for all \(\alpha \),

$$\begin{aligned} \sup _{\xi }(\lambda (\varphi _\xi (\alpha )-1) - \gamma I^{\mathrm {sym}(A)}_e(\xi )) \ge \sup _{\xi }(\lambda (\varphi _\xi (\alpha )-1) - \gamma I^A_e(\xi )), \end{aligned}$$

so \(F^{\mathrm {sym}(A)}(\alpha )\ge F^A(\alpha )\). Since this holds for all \(\alpha \), taking Legendre transforms reverses the inequality and it follows that for all x, \(I^{\mathrm {sym}(A)}(x)\le I^A(x)\). \(\square \)

Remark 5.6

Note that in the case that \(F(\alpha )\) is sufficiently smooth, the limiting diffusion coefficient (or, in the higher-dimensional case, matrix) is given by the second derivative, or more generally the Hessian, of \(F(\alpha )\) at 0. By Corollary 5.5, the free energy function is dominated pointwise by the free energy function of the active particle whose state process is generated by the symmetric part, and the two are equal at \(\alpha =0\). Therefore we see in that case that the Hessian at 0 (and therefore the limiting diffusion matrix) is dominated by the Hessian of the symmetric version. This is consistent with the results of Proposition 4.4 and Corollary 4.6.

6 The 2-State Case: Explicit Formulas

In the case where there are just two states, many quantities can be computed explicitly by direct methods. This section is therefore dedicated to the active particle with two states. In this case the active particle has a position \(x\in \mathbb {Z}\) and a velocity \(v\in \{-1,1\}\). The process \(\{(X_t,v_t): t\ge 0\}\) is described via the generator

$$\begin{aligned} Lf(x,v)&= \lambda (f(x+v,v)- f(x,v)) \\&\quad + \kappa (f(x+1, v)+ f(x-1,v) -2f(x,v)) \\&\quad + \gamma (f(x,-v)-f(x,v)). \end{aligned}$$
(26)

This is interpreted as follows: with rate \(\lambda \) the process makes a jump in the direction of the velocity, with rate \(\kappa \) it makes a random walk jump and with rate \(\gamma \) it flips velocity \(v\rightarrow -v\). If we denote by \(\mu (x,t,v)\) the probability to be at location \(x\in \mathbb {Z}\) with velocity \(v\in \{-1,1\}\) at time \(t>0\), the generator (26) corresponds to the master equation (or Kolmogorov forward equation)

$$\begin{aligned} \frac{d\mu (x,t,v)}{dt}&= \lambda \mu (x-v,t,v) + \kappa (\mu (x-1,t, v)+ \mu (x+1,t,v)) + \gamma \mu (x,t,-v) \\&\quad - (2\kappa +\lambda + \gamma ) \mu (x,t,v). \end{aligned}$$
(27)

6.1 The Fourier–Laplace Transform of the Distribution

The master equation (27) can be solved using a Fourier–Laplace transform. We define

$$\begin{aligned} {\hat{\mu }} (q,t,v)= \sum _{x} e^{iqx} \mu (x,t,v) \end{aligned}$$

and view this quantity as a two-component column vector, denoted \(\overline{\mu }(q,t, \cdot )\), indexed by the row index \(v=1,-1\). The master equation (27) then becomes, after a Fourier transform:

$$\begin{aligned} \frac{d}{dt} \overline{\mu }(q,t) = M(q) \overline{\mu }(q,t) \end{aligned}$$
(28)

with M(q) a symmetric two by two matrix of the form

$$\begin{aligned} M(q)= \left( \begin{array}{cc} a &{} b\\ b &{} a^* \end{array} \right) \end{aligned}$$
(29)

where \(*\) denotes complex conjugate and where

$$\begin{aligned} a&= (2\kappa +\lambda )(\cos (q)-1)-\gamma + i\lambda \sin (q), \\ b&= \gamma . \end{aligned}$$
(30)

For the analysis of the scaling behavior of the position of the particle, it is convenient to further Laplace transform \(\overline{\mu }(q,t)\), i.e. we define for \(z>0\) the column vector

$$\begin{aligned} \widehat{\mu }(q,z)= \int _0^\infty \overline{\mu }(q,t) e^{-zt}\ dt. \end{aligned}$$
(31)

Then, from (28) we find

$$\begin{aligned} \widehat{\mu }(q,z) = ( zI- M(q) )^{-1} {\bar{\mu }}_0 (q). \end{aligned}$$

For the initial position and velocity we choose \(X_0=0\) and \(v=\pm 1\) with probability 1/2 each. Then we have \({\bar{\mu }}_0 (q)= \frac{1}{2} (1,1)^T\), where T denotes transposition. We further define the Fourier–Laplace transform of the distribution of the particle position:

$$\begin{aligned} S(q,z)=\int _0^\infty \mathbb {E}e^{iqX_t} e^{-zt} \ dt = \sum _v \widehat{\mu }(q,z,v)=(1,1) \widehat{\mu }(q,z). \end{aligned}$$

Then we have, using (31)

$$\begin{aligned} S(q,z)= \widehat{\mu }(q,z,1) + \widehat{\mu }(q,z, -1) = \frac{1}{2} (1,1) (zI-M(q))^{-1} (1,1)^T. \end{aligned}$$

Using the explicit formulas (29), (30), we obtain

$$\begin{aligned} S(q,z)= \frac{2\gamma + z - (\lambda +2\kappa ) (\cos (q)-1)}{(\gamma +z -(\lambda +2\kappa ) (\cos (q)-1))^2-\gamma ^2 + \lambda ^2 \sin ^2(q)}. \end{aligned}$$
(32)
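Formula (32) can be checked against a direct numerical inversion of \(zI-M(q)\); the parameter values below are illustrative.

```python
import numpy as np

# Illustrative parameters and evaluation point.
kappa, lam, gamma = 0.5, 1.2, 0.8
q, z = 0.3, 0.7

# Build M(q) from (29)-(30) and compute (1/2)(1,1)(zI - M(q))^{-1}(1,1)^T.
a = (2 * kappa + lam) * (np.cos(q) - 1) - gamma + 1j * lam * np.sin(q)
M = np.array([[a, gamma], [gamma, a.conjugate()]])
ones = np.ones(2)
S_resolvent = 0.5 * ones @ np.linalg.inv(z * np.eye(2) - M) @ ones

# Closed form (32).
num = 2 * gamma + z - (lam + 2 * kappa) * (np.cos(q) - 1)
den = (gamma + z - (lam + 2 * kappa) * (np.cos(q) - 1)) ** 2 - gamma ** 2 \
    + lam ** 2 * np.sin(q) ** 2
S_closed = num / den
```

By symmetry the resolvent expression is real, so its imaginary part should vanish up to rounding.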

For a more general velocity distribution at time zero, i.e., \(X_0=0\), and \(v=1\), resp. \(v=-1\), with probability \(\alpha \), resp. \(1-\alpha \), we find

$$\begin{aligned} S(q,z)= \frac{i\lambda (2\alpha -1) \sin (q)+ 2\gamma + z - (\lambda +2\kappa ) (\cos (q)-1)}{(\gamma +z -(\lambda +2\kappa ) (\cos (q)-1))^2-\gamma ^2 + \lambda ^2 \sin ^2(q)}. \end{aligned}$$

6.2 The Limiting Diffusion Coefficient

We can now use the explicit formula (32) to obtain the limit distribution of \(\epsilon X_{\epsilon ^{-2} t}\) as \(\epsilon \rightarrow 0\). This amounts to understanding the scaling behavior of \(\epsilon ^2 S(\epsilon q, \epsilon ^2 z)\). In particular, the convergence \(\epsilon X_{\epsilon ^{-2} t}\rightarrow {{\mathcal {N}}}(0,\sigma ^2 t)\) in distribution as \(\epsilon \rightarrow 0\), where \({{\mathcal {N}}}(0,\sigma ^2 t)\) denotes a normal distribution with mean zero and variance \(\sigma ^2 t\), corresponds to the limiting scaling behavior

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} \epsilon ^2 S(\epsilon q, \epsilon ^2 z)= \frac{1}{z + \frac{q^2}{2}\sigma ^2}. \end{aligned}$$

If we obtain this scaling behavior, we call \(\sigma ^2\) the (limiting) diffusion constant. Indeed, we compute from the exact formula (32)

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} \epsilon ^2 S(\epsilon q, \epsilon ^2 z) = \frac{1}{z + \frac{q^2}{2}\sigma ^2} \end{aligned}$$

with the limiting diffusion constant

$$\begin{aligned} \sigma ^2= 2\kappa +\lambda + \frac{\lambda ^2}{\gamma }. \end{aligned}$$
(33)

This is consistent with the limiting diffusion coefficient that we obtained in Example 1.
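The diffusive scaling can also be checked numerically from the closed form (32); the parameters below are illustrative.

```python
import numpy as np

# Illustrative parameters.
kappa, lam, gamma = 0.5, 1.2, 0.8
q, z = 1.0, 1.0

def S(q, z):
    """Closed form (32) for the Fourier-Laplace transform."""
    num = 2 * gamma + z - (lam + 2 * kappa) * (np.cos(q) - 1)
    den = (gamma + z - (lam + 2 * kappa) * (np.cos(q) - 1)) ** 2 - gamma ** 2 \
        + lam ** 2 * np.sin(q) ** 2
    return num / den

# Diffusion constant (33) and the claimed scaling limit.
sigma2 = 2 * kappa + lam + lam ** 2 / gamma
eps = 1e-2
scaled = eps ** 2 * S(eps * q, eps ** 2 * z)
limit = 1 / (z + q ** 2 * sigma2 / 2)
```

The relative error is of order \(\epsilon ^2\), so already \(\epsilon =10^{-2}\) gives close agreement.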

6.3 Moment Generating Function and Large Deviations

We choose the starting point \(X_0=0\) and a random initial velocity, i.e., \(v=\pm 1\) with probability 1/2 each. This allows us to compute the moment generating function via

$$\begin{aligned} \mathbb {E}\left( e^{\alpha X_t}\right) = \frac{1}{2}(1,1)e^{t M(-i \alpha )} (1,1)^T. \end{aligned}$$
(34)

This amounts to computing the exponential of the matrix M(q) from (29), which can be done using diagonalization and results in

$$\begin{aligned} e^{t M(q)}= \frac{e^{tA}}{2\gamma B} G(t,q), \end{aligned}$$

where G(t,q) is given by the symmetric two by two matrix

$$\begin{aligned} G(t,q)= \left( \begin{array}{cc} A_{11} &{} A_{12}\\ A_{12} &{} A_{11}^* \end{array} \right) \end{aligned}$$

where

$$\begin{aligned} A_{11}&= 2\gamma \lambda i \sin (q) \sinh (tB) + 2\gamma B \cosh (tB), \\ A_{12}&= 2\gamma ^2 \sinh (tB) \end{aligned}$$

and where

$$\begin{aligned} A&= (\cos (q)-1)(2\kappa + \lambda ) -\gamma , \\ B&= \sqrt{\gamma ^2-\lambda ^2 \sin ^2(q)}. \end{aligned}$$

Moreover, we see from (34) that the free energy function

$$\begin{aligned} F(\alpha ) = \lim _{t\rightarrow \infty } \frac{1}{t}\log \mathbb {E}\left( e^{\alpha X_t}\right) \end{aligned}$$

is equal to the largest eigenvalue of the symmetric matrix \(M(-i\alpha )\), which is explicitly given by

$$\begin{aligned} M(-i\alpha )= \left( \begin{array}{cc} (2\kappa +\lambda ) (\cosh (\alpha )-1) + \lambda \sinh (\alpha ) -\gamma &{} \gamma \\ \gamma &{} (2\kappa +\lambda ) (\cosh (\alpha )-1) - \lambda \sinh (\alpha )-\gamma \end{array} \right) . \end{aligned}$$

This gives

$$\begin{aligned} F(\alpha )= (2\kappa +\lambda )(\cosh (\alpha ) -1) + \sqrt{\gamma ^2 + \lambda ^2\sinh ^2(\alpha )}-\gamma , \end{aligned}$$
(35)

which agrees with (25).
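The eigenvalue characterization can be verified numerically: the sketch below computes the largest eigenvalue of the real symmetric matrix \(M(-i\alpha )\) with illustrative parameters and compares it with (35).

```python
import numpy as np

# Illustrative parameters.
kappa, lam, gamma = 0.5, 1.2, 0.8
alpha = 0.6

# Build the real symmetric matrix M(-i*alpha) displayed above.
c = (2 * kappa + lam) * (np.cosh(alpha) - 1) - gamma
M = np.array([[c + lam * np.sinh(alpha), gamma],
              [gamma, c - lam * np.sinh(alpha)]])

# Largest eigenvalue versus the closed form (35).
F_eig = np.linalg.eigvalsh(M).max()
F_closed = (2 * kappa + lam) * (np.cosh(alpha) - 1) \
    + np.sqrt(gamma ** 2 + lam ** 2 * np.sinh(alpha) ** 2) - gamma
```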

Let us look at three relevant limiting cases for the “free energy function” F from (35).

  (a)

    Expanding the free energy function F around \(\alpha =0\) gives

    $$\begin{aligned} F(\alpha )= \frac{1}{2}D\alpha ^2 + O(\alpha ^4) \end{aligned}$$

    with \(D= 2\kappa + \lambda + \tfrac{\lambda ^2}{\gamma }\). This is consistent with the diffusion constant found in Example 1 and in (33). The function \(F(\alpha )\) in (35) can be analytically extended in a neighborhood of the origin in the complex plane, and as a consequence, we can reobtain the central limit theorem (which we found via the scaling behavior of the characteristic function) from the large deviation free energy, see [29].

  (b)

    In the limit \(\gamma \rightarrow \infty \) the free energy function becomes

    $$\begin{aligned} F(\alpha )= (\cosh (\alpha )-1)(2\kappa + \lambda ) \end{aligned}$$

    which corresponds to the large deviations of a symmetric random walk jumping with rates \(\kappa +\lambda /2\) to the right or left. This is indeed the (slow-fast) scaling limit of the process as we saw before. For large values of \(\gamma \) we have

    $$\begin{aligned} F(\alpha )= (\cosh (\alpha )-1)(2\kappa + \lambda ) + \frac{\lambda ^2}{2\gamma } \sinh ^2(\alpha ) + o(1/\gamma ). \end{aligned}$$

    Remark also that F in (35) is non-increasing as a function of \(\gamma \).

  (c)

    In the continuum limit we rescale \(\lambda \rightarrow \epsilon \lambda \), \(\gamma \rightarrow \epsilon ^2\gamma \) and \(X_t\rightarrow \epsilon X_{\epsilon ^{-2} t}\), and find

    $$\begin{aligned} \lim _{\epsilon \rightarrow 0}\lim _{t\rightarrow \infty }\frac{1}{t}\log \mathbb {E}^{\epsilon \lambda , \epsilon ^2\gamma }\left( e^{\alpha \epsilon X_{\epsilon ^{-2} t}}\right) = \kappa \alpha ^2 + \sqrt{\gamma ^2+ \lambda ^2 \alpha ^2}-\gamma \end{aligned}$$
    (36)

    which corresponds to the large deviation free energy of the continuum model (see also [30]), i.e., the limits \(\epsilon \rightarrow 0\) and \(t\rightarrow \infty \) in (36) commute.
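The inner limit in (36) can be made plausible numerically: evaluating the finite-\(\epsilon \) free energy (35) at the rescaled parameters and dividing by \(\epsilon ^2\) should approach \(\kappa \alpha ^2 + \sqrt{\gamma ^2+\lambda ^2\alpha ^2}-\gamma \). The parameters below are illustrative.

```python
import numpy as np

# Illustrative parameters.
kappa, lam, gamma, alpha = 0.5, 1.2, 0.8, 0.7

def F(alpha, lam, gamma):
    """Free energy (35) for given lambda and gamma."""
    return (2 * kappa + lam) * (np.cosh(alpha) - 1) \
        + np.sqrt(gamma ** 2 + lam ** 2 * np.sinh(alpha) ** 2) - gamma

# Rescale lam -> eps*lam, gamma -> eps^2*gamma, alpha -> eps*alpha and
# divide by eps^2, as in the continuum limit.
eps = 1e-3
rescaled = F(eps * alpha, eps * lam, eps ** 2 * gamma) / eps ** 2

# Claimed continuum free energy.
limit = kappa * alpha ** 2 + np.sqrt(gamma ** 2 + lam ** 2 * alpha ** 2) - gamma
```

The leading correction is of order \(\epsilon \), so \(\epsilon =10^{-3}\) already gives agreement to roughly three decimals.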