Abstract
Consider a Markov chain with finite state space and suppose you wish to change time, replacing the integer step index n with a random counting process N(t). What happens to the mixing time of the Markov chain? We present a partial answer for a particular case of interest, in which N(t) is a counting renewal process with power-law distributed inter-arrival times of index \(\beta \). We then focus on \(\beta \in (0,1)\), leading to infinite expectation for the inter-arrival times, and further study the situation in which the inter-arrival times follow the Mittag-Leffler distribution of order \(\beta \).
1 Introduction
1.1 Motivation
The original motivation for this paper stems from a 2011 paper [8] where a probabilistic theory for dynamic networks was presented. In particular, given a fixed set of vertices, an embedded Markov chain was considered on the space of all possible graphs connecting the vertices. This discrete-time chain was then transformed into a continuous-time chain by means of a simple time change with a counting process. In a subsequent paper [3], we explicitly solved one of the models presented in [8]. This model is equivalent to an \(\alpha \)-delayed version of the Ehrenfest urn chain and the time change is the fractional Poisson process [4] of renewal type [6]. At that time, we initiated a discussion on how this time change for a discrete-time discrete-space Markov chain affects mixing times and the convergence rate to equilibrium. Below we collect results on this point in the interesting case in which the inter-arrival times between two consecutive transitions of the embedded chain have a power-law distribution with index \(\beta \), also covering the case in which \(\beta \in (0,1)\) meaning that the expected value of the waiting times is infinite. In the latter case, under an appropriate choice of the distribution of inter-arrival times, it is possible to show that the forward Kolmogorov equations can be replaced by a fractional version with Caputo derivative of index \(\beta \) (see e.g. [3] for details in the case mentioned above and [7] for a general theory) when the initial time of the process is a renewal point.
The starting point of our discussion is that the continuous-time probabilities \(p_{i,j} (t)\) of being in state j at time t, given that the process was in state i at time 0, converge to the same equilibrium distribution as in the case of the embedded chain. Then, in Theorem 1, we prove an upper bound for the mixing time of the continuous-time chain based on the mixing time of the embedded chain and, in Theorem 2, we specialize the result to the case in which inter-arrival times follow the Mittag-Leffler distribution, where a sharper upper bound is available. We believe these bounds can be useful for applied scientists simulating these processes, for instance to estimate how far from equilibrium their simulations are.
1.2 Preliminaries
Let \(T_1, T_2, \ldots \) be a sequence of independent positive random variables, with the meaning of inter-event times or waiting times (with common law \(\nu \)), and define the partial sums \(S_n = T_1 + T_2 + \cdots + T_n\), \(n \ge 1\).
The sequence \(S_1, S_2, \ldots \) denotes the event times at which the state of the Markov chain X(t) attempts to change.
The embedded Markov chain is a discrete-time chain \(X_{n}, n\ge 0\), with state space \(\mathcal {S}\). We assume an initial distribution \(\mu ^{(0)}\), i.e. \(\mathbb {P}\{X_0 =i\} =\mu ^{(0)}_i\), and the chain evolves according to a discrete transition kernel \(q: \mathcal {S}\times \mathcal {S}\rightarrow [0,1]\). As usual, since \(\mathcal {S}\) is finite, the transition kernel may be encoded as a transition matrix \(Q = (q_{i,j})_{1 \le i,j \le |\mathcal {S}|}\). We will assume the chain \(X_n\) is irreducible for convenience of exposition; otherwise, all theorems below can be applied to each irreducible component separately. Moreover, we shall also assume that the chain \(X_n\) is aperiodic. Again, this is a technical point when discussing the convergence to equilibrium, as in the irreducible aperiodic case the n-step distributions converge to the unique invariant measure of the discrete chain.
We couple the embedded chain \(X_n\) with the process X(t) via the counting process \(N_\nu (t) = \max \{ n \ge 0: S_n \le t \}\) (with the convention \(S_0 = 0\)), which gives the number of events from time 0 up to a finite time horizon t. Then we have \(X(t) = X_{N_\nu (t)}\), i.e. the state of the process at time t is the same as that of the embedded chain after the last event before time t occurred.
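The time change above is straightforward to simulate. The sketch below (our own illustrative Python, with a hypothetical two-state transition matrix and exponential waiting times chosen only for concreteness) alternates waiting-time draws and embedded-chain steps:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_semi_markov(Q, mu0, waiting_time_sampler, t_max):
    """Generate a path of X(t) = X_{N(t)}: the embedded chain with
    transition matrix Q observed at renewal epochs S_1 < S_2 < ...
    built from i.i.d. waiting times."""
    state = rng.choice(len(mu0), p=mu0)             # X_0 ~ mu0
    path = [(0.0, state)]
    t = 0.0
    while True:
        t += waiting_time_sampler()                 # next renewal epoch S_{n+1}
        if t > t_max:
            return path                             # X(s) is constant on [S_n, S_{n+1})
        state = rng.choice(Q.shape[1], p=Q[state])  # one embedded step
        path.append((t, state))

# illustrative two-state chain; exponential waiting times recover a Markov chain
Q = np.array([[0.5, 0.5],
              [0.2, 0.8]])
path = simulate_semi_markov(Q, [1.0, 0.0], lambda: rng.exponential(1.0), t_max=10.0)
```

Replacing the exponential sampler with a heavy-tailed one changes nothing in the code, only the law of the epochs \(S_n\).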
All information about X(t) is encoded in the pairs \(\{(X_n, T_n)\}_{n \ge 1}\), which form a discrete-time Markov renewal process: since the waiting times are independent of the chain, \(\mathbb {P}\{X_{n+1} = j, \, T_{n+1} \le t \mid X_n = i\} = q_{i,j} F_\nu (t)\).
\(X(\cdot )\) is then a semi-Markov process subordinated to \(N_\nu (t)\), where we use “subordination” with the meaning of “time change” with an abuse of language. Under the assumption that \(\mu ^{(0)} = \delta _{i}\) (deterministic starting point), the temporal evolution of its transition probabilities satisfies the forward equations
Above we introduced \(p_{i,j}(t) = \mathbb {P}\{X(t) = j \mid X(0) = i\}\), the tail (complementary cumulative distribution function) \(\overline{F}_\nu (t) = 1- F_\nu (t)\), and \(f_\nu (t)\), the Radon-Nikodym derivative of \(\nu \) with respect to the Lebesgue measure (the probability density function, if appropriate smoothness conditions are satisfied). These equations are proved by conditioning on the time of the last event before time t, and it is implicitly assumed that \(t=0\) is a renewal point.
A conditioning argument on the values of \(N_{\nu }(t)\) gives
\(p_{i,j}(t) = \sum _{n \ge 0} \mathbb {P}\{N_\nu (t) = n\}\, q^{(n)}_{i,j}, \qquad (1.6)\)
where \(q^{(n)}_{i,j}\) are the n-step transition probabilities of the embedded discrete Markov chain, namely the entries of the n-th power of the transition matrix \(Q = (q_{i,j})_{1 \le i, j \le |\mathcal {S}|}\).
From the ergodic theorem we have that, for any i, j, \(\displaystyle \lim _{n \rightarrow \infty } q^{(n)}_{i,j} = \pi _{j}\), where \(\pi \) is the unique invariant distribution of the embedded chain.
This is sufficient to argue the following lemma.
Lemma 1
Consider the transition probabilities given by (1.6) and assume that \(\displaystyle \lim _{n \rightarrow \infty } q^{(n)}_{i,j} = \pi _{j}\). Then \(\displaystyle \lim _{t \rightarrow \infty } p_{i,j}(t) = \pi _{j}\).
Proof
Let N be large enough so that, for a given \(\varepsilon >0\), we have \(|q^{(n)}_{i,j} - \pi _j| < \varepsilon \) for all \(n > N\).
Then, substituting in (1.6) we have
As \(t \rightarrow \infty \), the first probability tends to 0, while \(\mathbb {P}\{N_\nu (t) > N\} \rightarrow 1\). Finally, let \(\varepsilon \rightarrow 0\) to obtain
The lower bound follows in a similar manner so we omit details. \(\square \)
This straightforward convergence result is the starting point of our discussion. For discrete Markov chains, there is a substantial body of literature (see [5] and references therein) with quantitative estimates on the convergence; this information is encapsulated in the mixing times of the chain, defined through the total variation distance between the two measures.
1.3 Total variation distance and mixing times for discrete chains
Let \({\mathcal {F}}\) denote the \(\sigma \)-algebra of events of a space \(\Omega \) and \(\mu , \nu \) two probability measures on this space. Then the total variation distance between the two measures is defined as
\(\Vert \mu - \nu \Vert _{TV} = \sup _{A \in {\mathcal {F}}} |\mu (A) - \nu (A)|, \qquad (1.7)\)
and one can show that for countable spaces
\(\Vert \mu - \nu \Vert _{TV} = \frac{1}{2} \sum _{x \in \Omega } |\mu (x) - \nu (x)|. \qquad (1.8)\)
Moreover, the total variation distance between two measures can be given in terms of a different variational formula (coupling):
\(\Vert \mu - \nu \Vert _{TV} = \inf \big \{ \mathbb {P}\{X \ne Y\}: (X, Y) \text { a coupling of } \mu \text { and } \nu \big \}. \qquad (1.9)\)
Both formulas have merit, as (1.7) can be used for a lower bound, while (1.9) for upper bounds on mixing times.
For any \(\varepsilon >0\), we define the mixing time \(T_{\varepsilon }\) of a finite state, aperiodic, irreducible Markov chain to be
\(T_{\varepsilon } = \min \big \{ n \ge 0: \max _{i} \Vert q^{(n)}_{i, \cdot } - \pi _{\cdot } \Vert _{TV} \le \varepsilon \big \}. \qquad (1.10)\)
The fact that \(\Vert q^{(n)}_{i,\cdot } - \pi _\cdot \Vert \) is non-increasing in n means that for all \(N > T_\varepsilon \) we have \(\Vert q^{(N)}_{i, \cdot } - \pi _{\cdot } \Vert \le \varepsilon \), and that \(T_{\varepsilon }\) is non-decreasing as \(\varepsilon \rightarrow 0\). Loosely speaking, for a given tolerance \(\varepsilon \), the mixing time tells us how long it takes the chain to start behaving as if it is near equilibrium.
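For small chains the mixing time of the embedded chain can be computed by brute force: iterate powers of \(Q\) and stop when every row is within \(\varepsilon \) of \(\pi \) in total variation. The matrix below is an illustrative toy example of our own, not one from the text:

```python
import numpy as np

def tv_distance(mu, nu):
    """Total variation distance: half the l1-norm on countable spaces."""
    return 0.5 * np.abs(np.asarray(mu, float) - np.asarray(nu, float)).sum()

def embedded_mixing_time(Q, pi, eps, n_max=10_000):
    """Smallest n with max_i || q^{(n)}_{i,.} - pi ||_TV <= eps."""
    Qn = np.eye(Q.shape[0])
    for n in range(1, n_max + 1):
        Qn = Qn @ Q
        if max(tv_distance(row, pi) for row in Qn) <= eps:
            return n
    raise RuntimeError("chain did not mix within n_max steps")

Q = np.array([[0.5, 0.5],
              [0.2, 0.8]])
pi = np.array([2 / 7, 5 / 7])             # solves pi Q = pi
print(embedded_mixing_time(Q, pi, 1e-3))  # 6 for this chain
```

Here the subdominant eigenvalue of \(Q\) is 0.3, so the worst-row distance is \((5/7)\,0.3^n\) and the threshold \(10^{-3}\) is first met at \(n = 6\).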
Equation (1.9) can be used to obtain an upper bound for the mixing times in the following way. First we construct a coupling between two Markov chains, where \(X_0 \sim \delta _i\) and \(Y_0 \sim \pi \). Both chains evolve independently according to the transition matrix Q, until they meet at some state x, after which the chains jump to the same location together, again according to Q. The marginals of the pair chain \((X_n, Y_n)\) are still those of two Markov chains, so this description is indeed the description of a coupling between the two.
At the instant when the two independent chains meet, the pair Markov chain \((X_n, Y_n)\) hits the diagonal set \(D = \{ (x,x): x \in \mathcal {S}\}\). Let the hitting time of this set be \(\tau _D = \min \{ n \ge 0: (X_n, Y_n) \in D \}\).
Then, using this coupling between the chains and (1.9), one can obtain \(\Vert q^{(n)}_{i, \cdot } - \pi _{\cdot } \Vert _{TV} \le \mathbb {P}\{ \tau _D > n \}\).
At this point the general theory of Markov chains can assist with uniform estimates on the hitting time, irrespective of the initial measure. These can be obtained by using the fact that the two chains evolve independently from one another until they meet at time \(\tau _{D}\), and we have
where \(c_1, c_2\) are uniform constants. In particular this gives the bound
Using only Q one can derive upper bounds for \(\ell ^*_D\), so we treat that as a computable constant. Now, if, overall, the upper bound is less than \(\varepsilon |\mathcal {S}|^{-1}\) for some \(n_\varepsilon \) then (1.10) implies that \(T_{\varepsilon } \le n_\varepsilon \). Forcing the upper bound in the display above to be less than \(\varepsilon |\mathcal {S}|^{-1}\) we have that
which in turn gives that there exists a function \(f(\mathcal {S}, Q)\) such that \(T_{\varepsilon } \le f(\mathcal {S}, Q) \log (\varepsilon ^{-1})\), which shows how the mixing time depends on the order of \(\varepsilon \).
For a lower bound, the most basic method involves counting; it relies on the idea that if the possible locations of a chain after n jumps do not cover a substantial proportion of the state space, we cannot be close to mixing. Then one can get \(T_{\varepsilon } \ge c(Q) \log \big ( (1-\varepsilon ) |\mathcal {S}| \big )\).
The constant c(Q) only depends on the transition matrix. Note that the lower bound above is not necessarily close to the upper bound, and as \(\varepsilon \rightarrow 0\) it gets weaker. The \(\varepsilon \)-order of this bound agrees with that of the upper bound when \(|{\mathcal {S}}| \sim \varepsilon ^{-1}\). Many further methods exist for lower bounds, but they are usually model-dependent. We briefly mention that a suitable \(L^2\) theory exists for reversible, aperiodic, irreducible Markov chains, so that bounds on \(T_{\varepsilon }\) from below are of the same order as the upper bounds,
\(\Big ( \frac{1}{\gamma ^*} - 1 \Big ) \log \frac{1}{2 \varepsilon } \le T_{\varepsilon } \le \frac{1}{\gamma ^*} \log \frac{1}{\varepsilon \, \pi _{\min }},\)
where \(\gamma ^*\) is the spectral gap of Q (the difference between 1 and the second largest eigenvalue \(\lambda _2\)) and \(\pi _{\min } = \min _x \pi (x)\).
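The spectral gap is directly computable for small chains. A minimal sketch, with an illustrative two-state matrix of our own (any two-state chain is reversible, so the \(L^2\) bounds above apply):

```python
import numpy as np

def spectral_gap(Q):
    """gamma* = 1 - |lambda_2|, with lambda_2 the second largest
    eigenvalue of Q in modulus."""
    moduli = np.sort(np.abs(np.linalg.eigvals(Q)))[::-1]
    return 1.0 - moduli[1]

Q = np.array([[0.5, 0.5],
              [0.2, 0.8]])
gap = spectral_gap(Q)       # eigenvalues are 1 and 0.3, so the gap is 0.7
t_rel = 1.0 / gap           # relaxation time, which controls both bounds
```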
2 Results
In this short paper, we will bound mixing times for continuous semi-Markov processes with heavy tails for the distribution of inter-event times. Using Lemma 1 we have that convergence occurs (albeit more slowly than for Markov chains). The global time change performed on the chain will be reflected in the bounds for the mixing times, which we obtain in terms of the mixing times of the embedded discrete chain.
At this point, we want to impose some conditions on the distribution of the inter-event times we are looking at. In particular:
Assumption
We assume there are two uniform constants \(c_1\) and \(c_2\), a \(t_0 > 0\) and a \(\beta >0\) such that
\(c_1 t^{-\beta } \le \overline{F}_\nu (t) \le c_2 t^{-\beta }, \quad \text {for all } t \ge t_0. \qquad (2.1)\)
Note that we are not assuming that any moments exist for the inter-event distribution, as \(\beta \) can lie in (0, 1). In the case where moments exist, the results become sharper.
For any \(\varepsilon >0\) we define the mixing time for the continuous semi-Markov process to be
\(T^{\mathrm{cont}}_{\varepsilon } = \inf \big \{ t \ge 0: \max _i \Vert p_{i, \cdot }(t) - \pi _{\cdot } \Vert _{TV} \le \varepsilon \big \}.\)
By Lemma 1 we know the \(p_{i, \cdot }\) converge to \(\pi \) so the above object is finite and well defined.
2.1 Motivating examples
Example 1
(Diagonalisable transition matrix) In this example, we assume that Q is the transition matrix of an irreducible, aperiodic Markov chain and, in particular, that it is diagonalisable. Let \(\pi \) denote the unique invariant distribution of the Markov chain and recall that \(\pi \) is a left 1-eigenvector for the matrix Q and the vector \(\mathbf{1} = (1, \ldots , 1)\) is a right 1-eigenvector. Since Q is diagonalisable, there exists a matrix L so that \(LQL^{-1} = D\), and without loss of generality we may assume that \(d_{11} = 1\) and that \(\ell _{1j} = \pi _j\). Furthermore, by the Perron-Frobenius theorem, the 1-eigenspace has dimension 1; therefore the first column of \(L^{-1} = ({\tilde{\ell }}_{ij})\) is a right 1-eigenvector of Q and satisfies \({\tilde{\ell }}_{i1} = 1\).
Then \(Q^n = L^{-1}D^nL\) and a coordinate-by-coordinate computation gives \(q^{(n)}_{i,j} = \pi _j + \sum _{k=2}^{|\mathcal {S}|} \lambda _k^n \, {\tilde{\ell }}_{ik} \, \ell _{kj}.\)
The eigenvalues \(\lambda _k\) remaining in the sum all have \(|\lambda _k| < 1\), with the sum vanishing as n grows and the n-step transitions converging to the invariant distribution.
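The spectral decomposition above can be verified numerically; below, \(Q\) is an illustrative matrix of our own and the eigenvector matrix returned by NumPy plays the role of \(L^{-1}\):

```python
import numpy as np

Q = np.array([[0.5, 0.5],
              [0.2, 0.8]])
lam, R = np.linalg.eig(Q)           # columns of R are right eigenvectors
Dn = np.diag(lam ** 10)
Qn = R @ Dn @ np.linalg.inv(R)      # Q^10 through the spectral form
assert np.allclose(Qn, np.linalg.matrix_power(Q, 10))
# the rows of Q^n approach pi as the subdominant eigenvalue term decays
```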
Substituting the last relationship back in (1.6), we have \(p_{i,j}(t) = \pi _j + \sum _{k=2}^{|\mathcal {S}|} {\tilde{\ell }}_{ik} \, \ell _{kj} \, P_{N_\nu (t)}(\lambda _k),\) where \(P_{N_\nu (t)}(\lambda ) = \mathbb {E}\big [ \lambda ^{N_\nu (t)} \big ]\) is the probability generating function of \(N_\nu (t)\).
In particular, the convergence to equilibrium for a finite state space process only depends on the tails of the probability generating function of \(N_\nu (t)\). Then, since \(N_\nu \) is an increasing process and \(|\lambda _k| < 1\), we may bound
Therefore the total variation distance as a function of time only depends on the tails of the probability generating function.
In fact, the following rough estimate can be performed, keeping in mind that \(|\lambda _2| <1\). Let K be such that \(|\lambda _2|^K < \varepsilon /2\); then
\(P_{N_\nu (t)}(|\lambda _2|) \le |\lambda _2|^{K} + \mathbb {P}\{ N_\nu (t) \le K \}.\)
From Lemma 2 below, the second term above decays like \(K^{1+\beta }t^{-\beta }\), so t can be chosen large enough to make this quantity arbitrarily small.
Example 2
(Mittag-Leffler waiting times) This example is taken from [2]. When the waiting times \(T_i\) are Mittag-Leffler with parameter \(\beta \), we have that \(P_{N_\nu (t)}(\lambda ) = E_{\beta }((\lambda -1)t^\beta )\), where \(E_{\beta }\) is the Mittag-Leffler function with parameter \(\beta \in (0,1]\). For large t we have that
and therefore
\(\Vert p_{i, \cdot }(t) - \pi _{\cdot } \Vert _{TV} \le C_{\lambda , \beta , N}\, t^{-\beta }.\)
The total variation distance becomes less than \(\varepsilon > 0\) when \(t \ge (C_{\lambda , \beta , N}/\varepsilon )^{1/\beta }\).
We compute an explicit value for \(C_{\lambda , \beta , N}\) later, in the proof of Theorem 2.
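For simulations, Mittag-Leffler waiting times can be sampled exactly by a known inversion formula (used, e.g., in Monte Carlo schemes for the fractional Poisson process). The sketch below is our own implementation of that transformation, not code from the references:

```python
import numpy as np

rng = np.random.default_rng(1)

def mittag_leffler_sample(beta, size):
    """Sample Mittag-Leffler(beta) waiting times, beta in (0, 1], via
    T = -ln(u) * (sin(beta*pi)/tan(beta*pi*v) - cos(beta*pi))**(1/beta)
    with u, v independent uniforms; beta = 1 recovers exponentials."""
    u = rng.random(size)
    v = rng.random(size)
    bracket = np.sin(beta * np.pi) / np.tan(beta * np.pi * v) - np.cos(beta * np.pi)
    return (-np.log(u)) * bracket ** (1.0 / beta)

samples = mittag_leffler_sample(0.7, size=10_000)
# heavy tail: for beta < 1 the empirical mean keeps growing with the sample size
```

The bracket equals \(\sin (\beta \pi (1-v)) / \sin (\beta \pi v)\), which is strictly positive for \(v \in (0,1)\), so the samples are positive as required.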
We are now ready to state the main theorem.
Theorem 1
Assume (2.1). Let \(\varepsilon > 0\) and let \(T^\mathrm{{emb}}_{\varepsilon /2}\), the \(\varepsilon /2\)-mixing time of the embedded chain, be given by (1.10). Then for any \(\beta > 0 \) we can find an explicit constant \(C_1\) so that
\(T^{\mathrm{cont}}_{\varepsilon } \le C_1 \, \varepsilon ^{-1/\beta } \big ( T^{\mathrm{emb}}_{\varepsilon /2} \big )^{(1+\beta )/\beta }.\)
This theorem is quite general as it makes no further assumptions on the background chain. Moreover, as is often the case for discrete Markov chains, a lot of the sophisticated estimates on mixing times are model dependent, so a theorem like Theorem 1 can utilise those bounds directly.
In the case where the inter-event times are Mittag-Leffler distributed we can make the upper bound sharper.
Theorem 2
Let X(t) be a finite space semi-Markov process for which the inter-event times are Mittag-Leffler\((\beta )\) distributed. Then,
In Figure 1 we see a simulation of the fractional Ehrenfest chain for times before and at the upper bound of the mixing time in Theorem 2.
A natural question arises about lower bounds for \(T^\mathrm{{cont}}_{\varepsilon }\). These are more challenging to obtain for the total variation distance directly. However, by defining a different distance between the measures we can also obtain lower bounds. Let
Note that
and therefore if the expected value (2.5) is less than \(\varepsilon \) then the total variation distance is small. In particular this already gives \(T^{\mathrm{cont}}_{\varepsilon } \le \widetilde{T}^{\mathrm{cont}}_{\varepsilon }\), where \(\widetilde{T}^{\mathrm{cont}}_{\varepsilon }\) is the mixing time defined through the distance (2.5).
Using definition (2.5), we can however find bounds for \(\widetilde{T}^\mathrm{{cont}}_\varepsilon \).
Theorem 3
Assume (2.1). Let \(\delta > 0\) and let \(T^\mathrm{{emb}}_{\delta }\), the \(\delta \)-mixing time of the embedded chain, be given by (1.10). Then for any \(\beta > 0\) and any \(\alpha \in (0,1)\) we can find explicit constants \(C_1 < C_2\) so that
We are now ready to present the proofs in the next section.
3 Mixing times and equilibrium
Lemma 2
Under assumption (2.1), let \(K \in \mathbb {N}\) and let \(t > (t_0 \vee (2c_2)^{1/\beta })K\). Then, there exists a uniform positive constant \(C_0\) so that
Proof
The assumptions of the lemma guarantee that all functions below are well defined, all constants arising from Taylor’s theorem do not depend on t and the error of Taylor’s theorem is small. When \(t > (t_0 \vee (2c_2)^{1/\beta })K\) we have
For a lower bound we can write
The lemma follows from (3.2) and (3.3). The last inequality on the right side of (3.1) comes directly from the assumption. \(\square \)
Proof of Theorem 1
It suffices to prove that for arbitrary \(M < L\) the total variation distance between the transition probabilities and the equilibrium distribution is bounded above as follows:
\(\Vert p_{i, \cdot }(t) - \pi _{\cdot } \Vert _{TV} \le \mathbb {P}\{ N_\nu (t) < M \} + \max _{M \le n \le L} \Vert q^{(n)}_{i, \cdot } - \pi _{\cdot } \Vert _{TV} + \mathbb {P}\{ N_\nu (t) > L \}. \qquad (3.4)\)
Assume for the moment that (3.4) holds and set \(M = T^{\text {emb}}_{\varepsilon /2}\). By the definition of \(T^{\text {emb}}_{\varepsilon /2}\), the middle term on the right-hand side of (3.4) is bounded above by \(\varepsilon /2\). Then let \(L \rightarrow \infty \) so that the third term vanishes.
The left-hand side is then bounded by \(\varepsilon \), and therefore the continuous process is \(\varepsilon \)-mixed, if \(\mathbb {P}\{ N_\nu (t) < T^{\text {emb}}_{\varepsilon /2} \} \le \varepsilon /2\). By Lemma 2 this happens whenever
and therefore
The theorem is proven when we establish (3.4). To this end,
\(\square \)
Proof of Theorem 2
When the counting process \(N_{\beta }(t)\) has Mittag-Leffler(\(\beta \)) inter-event times, we have
where \(B(\cdot ,\cdot )\) is the beta function.
As in Example 2,
Here, \(M_{N_\beta (t)}(s)\) is the moment generating function of the counting process \(N_{\beta }(t)\), while \(E_{\beta }\) is the Mittag-Leffler function with parameter \(\beta \). \(\ell ^*_D\) is defined in equation (1.11).
We will extract mixing time asymptotics by forcing
The equality between these two quantities is a beautiful fact about the Mittag-Leffler function. The derivation of the moment generating function can be found in the book [1] and in [4].
One way to bound the moment generating function from above is
The constant \(\theta \) is to be determined so that each term above is bounded by \(\varepsilon /2\). For the first term we will use the Paley-Zygmund inequality. For any \(\theta \in [0,1]\) we have
The function \(\frac{\beta B(\beta , 1/2)}{2^{2\beta -1}} -1\) is monotonically decreasing in \(\beta \) and takes values in (0, 1). Therefore there is a unique \(\theta ^*(\beta ) \in (0,1)\) so that \((1-\theta ^*(\beta ))^{-2}\big (\frac{\beta B(\beta , 1/2)}{2^{2\beta -1}} -1\big )= 1\). For this particular value of \(\theta ^*\) we bound
In particular, we obtain
This is a much better bound for the probability than the one established in Lemma 2. Impose that the upper bound in (3.9) is less than \(\varepsilon /2\) to obtain that
Similarly, set
Combine (3.10) and (3.11) in (3.8), which in turn bounds (3.7), to conclude the relation
as required. \(\square \)
Proof of Theorem 3
Using definition (2.5), we first find a lower bound for \(\widetilde{T}^\mathrm{{cont}}_\varepsilon \). We have that for any positive M,
If we set \(M = \frac{1}{2}T^{\text {emb}}_{\varepsilon ^\alpha }\), we have
and therefore it suffices to have \(\mathbb {P}\{ N_\nu (t) < T^{\text {emb}}_{\varepsilon ^\alpha }/2 \} > \varepsilon ^{1-\alpha }\) in order for the two measures not to be close in the distance (2.5). This is enough to guarantee
At this point we need to separate two cases, depending on the assumption of Lemma 2. If \(\beta < 1\), then the assumption of the lemma requires
in order to use (3.1), while we must also have
so that the lower bound in Lemma 2 is non-negative. Then
In order for both inequalities (3.14) and (3.15) to be satisfied, we need (modulo the constants)
which is true as \(\alpha <1\). Therefore in the case \(\beta <1\)
Now suppose that \(\beta \ge 1\). Then for the estimate in Lemma 2 to be meaningful (i.e. the lower bound is strictly greater than 0), we need that \(t^{\beta }> T^{\text {emb}}_{\varepsilon ^{\alpha }}\). This is guaranteed by the assumption of Lemma 2 and we obtain
Now for the upper bound in the theorem, we can repeat the arguments of Theorem 1. We have
and therefore bound (3.4) and all subsequent arguments work for this distance as well. \(\square \)
References
Baleanu, D., Diethelm, K., Scalas, E., Trujillo, J.J.: Fractional Calculus: Models and Numerical Methods. World Scientific, Singapore (2016)
de Nigris, S., Hastir, A., Lambiotte, R.: Burstiness and fractional diffusion on complex networks. Eur. Phys. J. B 89, Art. 114 (2016)
Georgiou, N., Kiss, I.Z., Scalas, E.: Solvable non-Markovian dynamic network. Phys. Rev. E 92, Art. 042801 (2015)
Laskin, N.: Fractional Poisson process. Commun. Nonlinear Sci. Numer. Simul. 8, 201–213 (2003)
Levin, D.A., Peres, Y., Wilmer, E.L.: Markov Chains and Mixing Times. American Mathematical Society, Providence (2009)
Mainardi, F., Gorenflo, R., Scalas, E.: A fractional generalization of the Poisson process. Vietnam J. Math. 32(SI), 53–64 (2004)
Meerschaert, M.M., Toaldo, B.: Relaxation patterns and semi-Markov dynamics. Stoch. Process. Their Appl. 129(8), 2850–2879 (2019)
Raberto, M., Rapallo, F., Scalas, E.: Semi-Markov graph dynamics. PLoS ONE 6(8), Art. e23370 (2011)
Acknowledgements
Both authors were partially supported by the Dr Perry James (Jim) Browne Research Center at the Department of Mathematics, University of Sussex.
Open Access
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Georgiou, N., Scalas, E. Bounds for mixing times for finite semi-Markov processes with heavy-tail jump distribution. Fract Calc Appl Anal 25, 229–243 (2022). https://doi.org/10.1007/s13540-021-00010-2