Strong stationary duality for Möbius monotone Markov chains
Abstract
For Markov chains with a finite, partially ordered state space, we show strong stationary duality under the condition of Möbius monotonicity of the chain. We give examples of dual chains in this context which have no downwards transitions. We illustrate the general theory by an analysis of nonsymmetric random walks on the cube with an interpretation for unreliable networks of queues.
Keywords
Strong stationary times, Strong stationary duals, Speed of convergence, Random walk on cube, Möbius function, Möbius monotonicity
Mathematics Subject Classification (2000)
60G40, 60J10, 60K25
1 Introduction
The motivation of this paper stems from a study on the speed of convergence to stationarity for unreliable queueing networks, as in Lorek and Szekli [13]. The problem of bounding the speed of convergence for networks is a rather complex one, and is related to transient analysis of Markov processes, spectral analysis, coupling or duality constructions, drift properties, and monotonicity properties, among others (for more details see Dieker and Warren [9], Aldous [1], Lorek and Szekli [13]). In order to give bounds on the speed of convergence for some unreliable queueing networks, it is necessary to study the availability vector of unreliable network processes. This vector is a Markov chain whose state space is the power set of the set of nodes, each state recording which stations are down and which are up (a typical state is the set of broken nodes). Such a chain is at the same time a random walk on the vertices of the finite dimensional cube. We are concerned in this paper with walks on the vertices of the finite dimensional cube which are up–down in the natural (inclusion) ordering on the power set. This study is a special case of a general duality construction for monotone Markov chains.
To be more precise, we shall study strong stationary duality (SSD) which is a probabilistic approach to the problem of speed of convergence to stationarity for Markov chains. SSD was introduced by Diaconis and Fill [7]. This approach involves strong stationary times (SST) introduced earlier by Aldous and Diaconis [2, 3] who gave a number of examples showing useful bounds on the total variation distance for convergence to stationarity in cases where other techniques utilizing eigenvalues or coupling were not easily applicable. A strong stationary time for a Markov chain (X _{ n }) is a stopping time T for this chain for which X _{ T } has the stationary distribution π and is independent of T. Diaconis and Fill [7] constructed an absorbing dual Markov chain with its absorption time equal to the strong stationary time T for (X _{ n }). In general, there is no recipe for constructing particular dual chains. However, a few cases are known and tractable. One of the most basic and interesting ones is given by Diaconis and Fill [7] (Theorem 4.6) when the state space is linearly ordered. In this case, under the assumption of stochastic monotonicity for the time reversed chain, and under the condition that for the initial distribution ν, ν≤_{ mlr } π (that is, for any k _{1}>k _{2}, \({\nu(k_{1})\over\pi(k_{1})}\leq{\nu(k_{2})\over\pi(k_{2})}\)) it is possible to construct a dual chain on the same state space. A special case is a stochastically monotone birth-and-death process for which the strong stationary time has the same distribution as the time to absorption in the dual chain, which turns out to be again a birth-and-death process on the same state space. Times to absorption are usually more tractable objects in a direct analysis than times to stationarity.
In particular, a well-known theorem, usually attributed to Keilson, states that, for an irreducible continuous-time birth-and-death chain on \(\mathbb {E}=\{0,\ldots ,M\}\), the passage time from state 0 to state M is distributed as a sum of M independent exponential random variables. Fill [11] uses the theory of strong stationary duality to give a stochastic proof of an analogous result for discrete-time birth-and-death chains and geometric random variables. He links the parameters of these distributions to eigenvalue information about the chain. The obtained dual is a pure birth chain. A similar structure holds for more general chains. An (upward) skip-free Markov chain with the set of nonnegative integers as a state space is a chain for which upward jumps may be only of unit size; there is no restriction on downward jumps. Brown and Shao [5] determined, for an irreducible continuous-time skip-free chain and any M, the passage time distribution from state 0 to state M. When the eigenvalues of the generator are all real, their result states that the passage time is distributed as the sum of M independent exponential random variables with rates equal to the eigenvalues. Fill [12] gives another proof of this theorem. In the case of birth-and-death chains, this proof leads to an explicit representation of the passage time as a sum of independent exponential random variables. Diaconis and Miclo [8] recently obtained such a representation, using an involved duality construction; for some recent references related to duality and stationarity, see this paper.
Our main result is an SSD construction which generalizes the above-mentioned construction of Diaconis and Fill [7]. We consider a partially ordered state space instead of a linearly ordered one and utilize Möbius monotonicity instead of the usual stochastic monotonicity. This construction opens new ways to study particular Markov chains by a dual approach and is of independent interest. It has the special feature that the dual state space is again the same state space as for the original chain, as in SSD for birth-and-death processes. Moreover, we show that the dual chain can have an upwards drift in the sense that it has no downwards transitions. We formulate the main result in Sect. 3, explaining the needed notation and definitions in detail in Sect. 2. We elaborate on the topic of Möbius monotonicity because it is almost absent from the literature. The only papers we are aware of are the following two: Massey [14] recalls Möbius monotonicity as considered earlier by Adrianus Kester in his PhD thesis, and proves that Möbius monotonicity implies a weak stochastic monotonicity. The second paper is by Falin [10] where a similar result to the one by Massey can be found. We introduce two versions of Möbius monotonicity, and we define a new notion of Möbius monotone functions which appear in a natural way in our main result on SSD. We characterize Möbius monotonicity by an invariance property on the set of Möbius monotone functions. Utilization of Möbius monotonicity involves a general problem of inverting a sum ranging over a partially ordered set, which appears in many combinatorial contexts; see, for example, Rota [15]. The inversion can be carried out by defining an analog of the difference operator relative to a given partial ordering. Such an operator is the Möbius function, and the analog of the fundamental theorem of calculus obtained in this context is the Möbius inversion formula on a partially ordered set, which we recall in Sect. 2.
In Sect. 3 we present our main result on SSD with a proof and give some corollaries which show other possible duals, including an alternative dual for linearly ordered state spaces (Corollary 3.2). In Sect. 4 we show an SSD result for nonsymmetric nearest neighbor walks on the finite dimensional cube. It gives an additional insight into the structure of eigenvalues of this chain. It is interesting that the dual (absorbing) chain here is a chain which jumps only upwards to neighboring states or stays at the same state. This structure of the dual chain allows us to read all eigenvalues for the transition matrix P and its dual P ^{∗} from the diagonal of P ^{∗} since P ^{∗} is upper-triangular. The symmetric walk was considered by Diaconis and Fill [7]. They used the symmetry to reduce the problem of the speed of convergence to a birth-and-death chain setting. The problem of the speed of convergence to stationarity for the nonsymmetric case was studied by Brown [4], where the eigenvalues were identified by a different method. Finally, it is worth mentioning that Möbius monotonicity of nonsymmetric nearest neighbor walks is a stronger property than the usual stochastic monotonicity for this chain.
2 SSD, Möbius monotonicity
2.1 Time to stationarity and strong stationary duality
Let P be an irreducible aperiodic transition matrix on a finite, partially ordered state space \((\mathbb {E}, \preceq)\). We enumerate the states using natural numbers ℕ in such a way that for the partial order ⪯, for all i,j∈ℕ, e _{ i }⪯e _{ j } implies i<j. Each distribution ν on \(\mathbb {E}\) we regard as a row vector, and ν P denotes the usual vector times matrix multiplication.
Consider a Markov chain X=(X _{ n })_{ n≥0} with transition matrix P, initial distribution ν, and (unique) stationary distribution π. One possibility of measuring distance to stationarity is to use the separation distance (see Aldous and Diaconis [3]), given by \(s(\nu \mathbf {P}^{n},\pi)=\max_{\mathbf {e}\in \mathbb {E}} (1-\nu \mathbf {P}^{n}(\mathbf {e})/\pi(\mathbf {e}) )\). Separation distance s provides an upper bound on the total variation distance: \(s(\nu \mathbf {P}^{n},\pi)\ge d(\nu \mathbf {P}^{n},\pi ):=\max_{B\subset \mathbb {E}}|\nu \mathbf {P}^{n}(B)-\pi(B)|\).
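The two distances can be computed directly for a small chain. The following is a minimal sketch in exact rational arithmetic; the three-state transition matrix, the helper names `step`, `separation` and `total_variation`, and the brute-force maximum over all subsets B are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction
from itertools import combinations

def step(dist, P):
    """One step of the chain: row vector times transition matrix."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def separation(dist, pi):
    """s(dist, pi) = max over states e of 1 - dist(e)/pi(e)."""
    return max(1 - d / p for d, p in zip(dist, pi))

def total_variation(dist, pi):
    """d(dist, pi) = max over subsets B of |dist(B) - pi(B)| (brute force)."""
    n = len(pi)
    best = Fraction(0)
    for r in range(n + 1):
        for B in combinations(range(n), r):
            best = max(best, abs(sum(dist[i] - pi[i] for i in B)))
    return best

# Hypothetical 3-state birth-and-death chain (illustrative numbers only).
P = [[Fraction(1, 2), Fraction(1, 2), Fraction(0)],
     [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)],
     [Fraction(0), Fraction(1, 2), Fraction(1, 2)]]
pi = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]  # satisfies pi P = pi
nu = [Fraction(1), Fraction(0), Fraction(0)]           # start in the first state

dist = nu
for n in range(4):
    # Separation distance dominates total variation at every step.
    print(n, separation(dist, pi), total_variation(dist, pi))
    dist = step(dist, P)
```

The subset enumeration is exponential in the number of states and is only meant to mirror the definition; for larger state spaces one would use the equivalent half-sum of absolute differences.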
A random variable T is a Strong Stationary Time (SST) if it is a randomized stopping time for X=(X _{ n })_{ n≥0} such that T and X _{ T } are independent, and X _{ T } has distribution π. SST was introduced by Aldous and Diaconis in [2, 3]. In [3], they prove that s(ν P ^{ n },π)≤P(T>n) (T implicitly depends on ν). Diaconis [6] gives some examples of bounds on the rates of convergence to stationarity via an SST. However, the method to find an SST is specific to each example.
Diaconis and Fill [7] introduced the so-called Strong Stationary Dual (SSD) chains. Such chains have the special feature that the SST for the original process has the same distribution as the time to absorption in the SSD chain.

X is Markov with the initial distribution ν and the transition matrix P,

X ^{∗} is Markov with the initial distribution ν ^{∗} and the transition matrix P ^{∗},

the absorption time T ^{∗} of X ^{∗} is an SST for X.
The following theorem (Diaconis and Fill [7], Theorem 4.6) gives an SSD chain for linearly ordered state spaces under some stochastic monotonicity assumption. In the formulation below, we set g(M+1)=0, \(\overleftarrow{P}(M+1,\{1,\ldots,i\})=0\), for all \(i\in \mathbb {E}\).
Theorem 1
 (i)
\(g(i)={\nu(i)\over\pi(i)}\) is nonincreasing,
 (ii)
\(\overleftarrow{\mathbf {X}}\) is stochastically monotone.
Theorem 2 is our main result on SSD chains. It is an extension of Theorem 1 to Markov chains on partially ordered state spaces by replacing monotonicity in condition (i) and stochastic monotonicity in condition (ii) with Möbius monotonicity. We state this theorem in Sect. 3, after introducing required definitions and background material. Theorem 2 reveals the role of Möbius functions in finding SSD chains. Consequently, it is possible to reformulate Theorem 1 in terms of the corresponding Möbius function (in a similar way as in Corollary 3.2).
2.2 Möbius monotonicities
Consider a finite, partially ordered set \(\mathbb {E}=\{\mathbf {e}_{1},\ldots,\mathbf {e}_{M}\}\), and denote a partial order on \(\mathbb {E}\) by ⪯. We select the above enumeration of \(\mathbb {E}\) to be consistent with the partial order, i.e., e _{ i }⪯e _{ j } implies i<j.
Let X=(X _{ n })_{ n≥0}∼(ν,P) be a time homogeneous Markov chain with an initial distribution ν and transition function P on the state space \(\mathbb {E}\). We identify the transition function with the corresponding matrix written for the fixed enumeration of the state space. Suppose that X is ergodic with the stationary distribution π.
We shall use ∧ for the meet (greatest lower bound) and ∨ for the join (least upper bound) in \(\mathbb {E}\). If \(\mathbb {E}\) is a lattice, it has unique minimal and maximal elements, denoted by \(\mathbf {e}_{1}:=\hat{\mathbf{0}}\) and \(\mathbf {e}_{M}:=\hat{\mathbf{1}}\), respectively.
Recall that the zeta function ζ of the partially ordered set \(\mathbb {E}\) is defined by: ζ(e _{ i },e _{ j })=1 if e _{ i }⪯e _{ j } and ζ(e _{ i },e _{ j })=0 otherwise. If the states are enumerated in such a way that e _{ i }⪯e _{ j } implies i<j (assumed in this paper), then ζ can be represented by an upper-triangular, 0–1 valued matrix C, which is invertible. It is well known that ζ is an element of the incidence algebra (see Rota [15], p. 344), which is invertible in this algebra, and the inverse to ζ, denoted by μ, is called the Möbius function. Using the enumeration which defines C, the corresponding matrix describing μ is given by the usual matrix inverse C ^{−1}.
Throughout the paper, μ will denote the Möbius function of the corresponding ordering.
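To make the matrices C and C ^{−1} concrete, here is a small sketch that builds the zeta matrix of a poset and its Möbius matrix. The example poset (the square {0,1}^2 with the coordinatewise order), the function names, and the use of the standard recursion μ(x,y)=−Σ_{x⪯z≺y} μ(x,z) are choices made for this illustration.

```python
from itertools import product

def zeta_matrix(states, leq):
    """C[i][j] = 1 if states[i] precedes states[j] in the partial order, else 0."""
    return [[1 if leq(a, b) else 0 for b in states] for a in states]

def mobius_matrix(C):
    """Moebius function as a matrix, via mu(x, y) = -sum_{x <= z < y} mu(x, z).

    Assumes the enumeration is consistent with the order, so that C is
    upper-triangular with ones on the diagonal.
    """
    n = len(C)
    mu = [[0] * n for _ in range(n)]
    for i in range(n):
        mu[i][i] = 1
        for j in range(i + 1, n):
            if C[i][j]:
                mu[i][j] = -sum(mu[i][k] for k in range(i, j) if C[k][j])
    return mu

def matmul(A, B):
    """Plain matrix product for lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def leq(a, b):
    """Coordinatewise order on 0-1 vectors."""
    return all(x <= y for x, y in zip(a, b))

# Example poset: {0,1}^2; lexicographic enumeration is consistent with the order.
states = list(product([0, 1], repeat=2))
C = zeta_matrix(states, leq)
mu = mobius_matrix(C)

for state, row in zip(states, mu):
    print(state, row)
```

Multiplying `C` by `mu` with `matmul` returns the identity matrix, which is the matrix form of the statement that μ is the inverse of ζ.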
Definition 2.1
The transition matrix P is called
 ^{↓}Möbius monotone if $$\mathbf {C}^{-1}\mathbf {P}\mathbf {C}\ge0,$$
 ^{↑}Möbius monotone if $$\bigl(\mathbf {C}^T\bigr)^{-1}\mathbf {P}\mathbf {C}^T \ge0,$$
where the inequalities are understood entrywise.
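Both conditions of Definition 2.1 are finite matrix inequalities and can be checked mechanically. The sketch below is one possible way to do so; the function names and the two-point example chains are hypothetical, and the inverse is computed by a generic back-substitution for unit upper-triangular matrices.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Plain matrix product for lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def unit_upper_inverse(C):
    """Inverse of an upper-triangular matrix with ones on the diagonal."""
    n = len(C)
    inv = [[0] * n for _ in range(n)]
    for i in range(n):
        inv[i][i] = 1
        for j in range(i + 1, n):
            inv[i][j] = -sum(inv[i][k] * C[k][j] for k in range(i, j))
    return inv

def nonneg(A):
    return all(x >= 0 for row in A for x in row)

def is_down_mobius_monotone(P, C):
    """Checks C^{-1} P C >= 0 entrywise."""
    return nonneg(matmul(matmul(unit_upper_inverse(C), P), C))

def is_up_mobius_monotone(P, C):
    """Checks (C^T)^{-1} P C^T >= 0 entrywise."""
    Ct_inv = transpose(unit_upper_inverse(C))
    return nonneg(matmul(matmul(Ct_inv, P), transpose(C)))

# Two-point chain with e_1 below e_2 (hypothetical numbers).
C = [[1, 1], [0, 1]]
P_slow = [[F(7, 10), F(3, 10)], [F(2, 10), F(8, 10)]]  # off-diagonal mass 1/2
P_fast = [[F(1, 10), F(9, 10)], [F(8, 10), F(2, 10)]]  # off-diagonal mass 17/10

print(is_down_mobius_monotone(P_slow, C), is_up_mobius_monotone(P_slow, C))
print(is_down_mobius_monotone(P_fast, C), is_up_mobius_monotone(P_fast, C))
```

For the two-point chain the conjugated matrix has the entry 1 − P(1,2) − P(2,1), so the check fails exactly when the two off-diagonal probabilities sum to more than 1.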
Definition 2.2

A function f on \(\mathbb {E}\), regarded as a row vector, is called

^{↓}Möbius monotone if f(C ^{ T })^{−1}≥0,

^{↑}Möbius monotone if fC ^{−1}≥0.
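As an illustration of Definition 2.2, consider only the linearly ordered case \(\mathbb {E}=\{1,\ldots,M\}\) with the usual order ≤ (an assumption made for this example), for which μ(k,k)=1, μ(k−1,k)=−1 and μ vanishes otherwise. For a row vector f=(f(1),…,f(M)) the two conditions can then be written out coordinatewise:
$$\bigl(f\mathbf {C}^{-1}\bigr)(k)=f(k)-f(k-1)\quad(2\le k\le M),\qquad \bigl(f\mathbf {C}^{-1}\bigr)(1)=f(1),$$
$$\bigl(f(\mathbf {C}^{T})^{-1}\bigr)(k)=f(k)-f(k+1)\quad(1\le k\le M-1),\qquad \bigl(f(\mathbf {C}^{T})^{-1}\bigr)(M)=f(M).$$
Under this assumption, f is ^{↑}Möbius monotone exactly when it is nonnegative and nondecreasing, and ^{↓}Möbius monotone exactly when it is nonnegative and nonincreasing.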
Proposition 2.1

f is ^{↑}Möbius monotone implies that Pf ^{ T } is ^{↑}Möbius monotone.
Proof
Suppose that P is ^{↑}Möbius monotone, that is, (C ^{ T })^{−1} PC ^{ T }≥0. Take arbitrary f which is ^{↑}Möbius monotone, i.e., take f=mC for some arbitrary m≥0. Then (C ^{ T })^{−1} PC ^{ T } m ^{ T }≥0, which is (using transposition) equivalent to fP ^{ T } C ^{−1}≥0, which, in turn, gives (by definition) that Pf ^{ T } is ^{↑}Möbius monotone. Conversely, for all f=mC, where m≥0, we have fP ^{ T } C ^{−1}≥0 since Pf ^{ T } is ^{↑}Möbius monotone. This implies that (C ^{ T })^{−1} PC ^{ T } m ^{ T }≥0 and (C ^{ T })^{−1} PC ^{ T }≥0. □
Many examples can be produced using the fact that the set of Möbius monotone matrices is a convex subset of the set of transition matrices. We shall give some basic examples in Sect. 4. These examples can be used to build up a large class of Möbius monotone matrices.
Proposition 2.2
 (i)
If P _{1} and P _{2} are ^{↑}Möbius monotone (^{↓}Möbius monotone) then P _{1} P _{2} is ^{↑}Möbius monotone (^{↓}Möbius monotone).
 (ii)
If P is ^{↑}Möbius monotone (^{↓}Möbius monotone) then (P)^{ k } is ^{↑}Möbius monotone (^{↓}Möbius monotone) for each k∈ℕ.
 (iii) If P _{1} is ^{↑}Möbius monotone (^{↓}Möbius monotone) and P _{2} is ^{↑}Möbius monotone (^{↓}Möbius monotone) then $$p\mathbf {P}_1+(1-p)\mathbf {P}_2$$ is ^{↑}Möbius monotone (^{↓}Möbius monotone) for all p∈(0,1).
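One way to see why these closure properties are plausible is the following sketch for the ^{↓} case, which uses only Definition 2.1 (the ^{↑} case is the same with C replaced by C ^{ T }); it is offered as an illustration and not as the authors' own argument. Inserting \(\mathbf {C}\mathbf {C}^{-1}\) between the two factors gives
$$\mathbf {C}^{-1}\mathbf {P}_1\mathbf {P}_2\mathbf {C}=\bigl(\mathbf {C}^{-1}\mathbf {P}_1\mathbf {C}\bigr)\bigl(\mathbf {C}^{-1}\mathbf {P}_2\mathbf {C}\bigr)\ge0,$$
as a product of entrywise nonnegative matrices, and by linearity
$$\mathbf {C}^{-1}\bigl(p\mathbf {P}_1+(1-p)\mathbf {P}_2\bigr)\mathbf {C}=p\,\mathbf {C}^{-1}\mathbf {P}_1\mathbf {C}+(1-p)\,\mathbf {C}^{-1}\mathbf {P}_2\mathbf {C}\ge0.$$
Part (ii) then follows from part (i) by induction on k.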
Proof
3 Main result: SSD for Möbius monotone chains
Now we are prepared to state our main result on SSD.
Theorem 2
 (i)
\(g(\mathbf {e})={\nu(\mathbf {e})\over\pi(\mathbf {e})}\) is ^{↓}Möbius monotone,
 (ii)
\(\overleftarrow{\mathbf {X}}\) is ^{↓}Möbius monotone.
Proof of Theorem 2
Note that in the context of Theorem 2, if the original chain starts with probability 1 in the minimal state, i.e., \(\nu=\delta_{\mathbf {e}_{1}}\), then \(\nu^{*}=\delta_{\mathbf {e}_{1}}\).
For \(\mathbb {E}=\{1,\ldots,M\}\), with linear ordering ≤, the Möbius function is given by μ(k,k)=1, μ(k−1,k)=−1, and μ equals 0 otherwise. In this case, the link is given by \(\varLambda (j,i)=\mathbb{I}(i\le j) {\pi(i)\over H(j)}\), where \(H(j)=\sum_{k\le j}\pi(k)\) normalizes each row of Λ to sum to one, and we obtain from Theorem 2 (as a special case) Theorem 1, which is a reformulation of Theorem 4.6 from Diaconis and Fill [7].
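The linearly ordered case can be checked numerically. The sketch below rests on two assumptions that are not displayed in this text: the dual transition matrix is taken in the Diaconis and Fill form P ^{∗}(i,j)=(H(j)/H(i))[←P(j,{1,…,i})−←P(j+1,{1,…,i})], with ←P the time-reversed matrix and ←P(M+1,·)=0, and duality is expressed through the intertwining relation ΛP=P ^{∗}Λ. The three-state chain and all variable names are hypothetical.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Plain matrix product for lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Hypothetical birth-and-death chain on {1, 2, 3} (0-based indices in code).
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0), F(1, 2), F(1, 2)]]
pi = [F(1, 4), F(1, 2), F(1, 4)]  # stationary distribution of P
M = len(pi)

# Time reversal: rev(i, j) = pi(j) P(j, i) / pi(i).
rev = [[pi[j] * P[j][i] / pi[i] for j in range(M)] for i in range(M)]

# H(j) = pi(1) + ... + pi(j) and the link Lambda(j, i) = 1(i <= j) pi(i) / H(j).
H = [sum(pi[:j + 1]) for j in range(M)]
Lam = [[pi[i] / H[j] if i <= j else F(0) for i in range(M)] for j in range(M)]

def rev_down(j, i):
    """Reversed-chain mass from state j into {1, ..., i}; zero for j = M + 1."""
    return sum(rev[j][:i + 1]) if j < M else F(0)

# Dual transition matrix in the assumed Diaconis-Fill form.
P_star = [[H[j] / H[i] * (rev_down(j, i) - rev_down(j + 1, i))
           for j in range(M)] for i in range(M)]

for row in P_star:
    print([str(x) for x in row])
print(matmul(Lam, P) == matmul(P_star, Lam))  # intertwining check
```

In this example the last row of `P_star` puts all its mass on the maximal state, so the dual is absorbed there, and the first row of `Lam` is the point mass at the minimal state, in line with the remark that ν=δ _{e_1} corresponds to ν ^{∗}=δ _{e_1}.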
In a similar way, we construct an analogous SSD chain for ^{↑}Möbius monotone P. We skip the corresponding matrix formulation and a proof. This analogous SSD chain will be used in Corollary 3.2 to give an alternative SSD chain to the one given in Theorem 1.
Corollary 3.1
 (i)
\(g(\mathbf {e})={\nu(\mathbf {e})\over\pi(\mathbf {e})}\) is ^{↑}Möbius monotone,
 (ii)
\(\overleftarrow{\mathbf {X}}\) is ^{↑}Möbius monotone.
Note that in the setting of Corollary 3.1, if the original chain starts with probability 1 in the maximal state, i.e., \(\nu=\delta_{\mathbf {e}_{M}}\), then \(\nu^{\bullet}=\delta_{\mathbf {e}_{M}}\).
From Corollary 3.1 we obtain an alternative dual result for linearly ordered spaces assuming that \({\nu(i)\over\pi(i)} \) is nondecreasing. Roughly speaking, Theorem 1 and Corollary 3.2 describe two complementary situations: when the initial distribution of the original chain is, in the sense of the mlr ordering, smaller or bigger than the stationary distribution, one can create (and use) different (alternative) dual chains as described in these statements.
Corollary 3.2
 (i)
\(g(i)={\nu(i)\over\pi(i)}\) is nondecreasing,
 (ii)
\(\overleftarrow{\mathbf {X}}\) is stochastically monotone.
4 Nearest neighbor Möbius monotone walks on a cube
To make our presentation simpler, we assume that ν=δ _{(0,…,0)} (this assumption can be waived).
For example, such a Markov chain is a model for a set of working unreliable servers where the repairs and breakdowns of servers are independent for different servers and only one server can be broken or repaired at a transition time.
Assume that all α _{ i } and β _{ i } are positive and that there exists at least one state e such that P(e,e)>0. Then the chain is ergodic.
Theorem 3
Proof
It is worth mentioning that the condition for ^{↓}Möbius monotonicity (i.e., \(\sum_{i=1}^{d}(\alpha_{i}+\beta_{i})\le1\)) is equivalent to the condition that all eigenvalues of P are nonnegative.
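The condition \(\sum_{i=1}^{d}(\alpha_{i}+\beta_{i})\le1\) can be explored numerically. Since the defining equation (4.1) of the walk is not reproduced here, the sketch below assumes a common form of the nearest neighbor walk: from state e, coordinate i flips from 0 to 1 with probability α _{ i } and from 1 to 0 with probability β _{ i }, and the chain stays put with the remaining probability. It checks C ^{−1}PC≥0 for P itself; all function names and parameter values are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def matmul(A, B):
    """Plain matrix product for lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def cube_walk(alpha, beta):
    """Transition matrix of the assumed nearest neighbor walk on {0,1}^d."""
    d = len(alpha)
    states = list(product([0, 1], repeat=d))  # lexicographic order extends the partial order
    index = {e: k for k, e in enumerate(states)}
    n = len(states)
    P = [[F(0)] * n for _ in range(n)]
    for e in states:
        stay = F(1)
        for i in range(d):
            rate = beta[i] if e[i] else alpha[i]
            flipped = e[:i] + (1 - e[i],) + e[i + 1:]
            P[index[e]][index[flipped]] = rate
            stay -= rate
        P[index[e]][index[e]] = stay
    return states, P

def leq(a, b):
    """Coordinatewise order on 0-1 vectors."""
    return all(x <= y for x, y in zip(a, b))

def zeta_and_mobius(states):
    """Zeta matrix C and its inverse; on the cube mu(a, b) = (-1)^(|b| - |a|) for a <= b."""
    C = [[1 if leq(a, b) else 0 for b in states] for a in states]
    Cinv = [[(-1) ** (sum(b) - sum(a)) if leq(a, b) else 0 for b in states]
            for a in states]
    return C, Cinv

def conjugate(P, C, Cinv):
    """The matrix C^{-1} P C from the definition of down-Moebius monotonicity."""
    return matmul(matmul(Cinv, P), C)

def is_down_mobius_monotone(P, C, Cinv):
    return all(x >= 0 for row in conjugate(P, C, Cinv) for x in row)

# Total flip mass 3/5 <= 1 versus 6/5 > 1 (hypothetical parameters, d = 2).
states, P_ok = cube_walk([F(1, 10), F(1, 5)], [F(1, 5), F(1, 10)])
_, P_bad = cube_walk([F(2, 5), F(2, 5)], [F(1, 5), F(1, 5)])
C, Cinv = zeta_and_mobius(states)

print(is_down_mobius_monotone(P_ok, C, Cinv), is_down_mobius_monotone(P_bad, C, Cinv))
print([str(row[k]) for k, row in enumerate(conjugate(P_ok, C, Cinv))])
```

With this enumeration the conjugated matrix is triangular, so its diagonal lists the eigenvalues of P; under the assumed form of the walk they are of the type 1−Σ_{i∈A}(α _{ i }+β _{ i }) for subsets A of coordinates, and the smallest one is nonnegative exactly when the displayed condition holds.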
4.1 Further research
The problem of finding SSD chains for walks on the cube, which are not nearest neighbor walks, is open and seems to be a difficult one. Moreover, a more difficult task is to find SSD chains which have an upper-triangular form (potentially useful for finding bounds on times to absorption). We have some observations for three-dimensional cubes which might be of some interest. Consider the random walk on the three-dimensional cube, \(\mathbb {E}=\{0,1\}^{3}\), which is a special case of the random walk given in (4.1). We define on \(\mathbb {E}\) the partial ordering: for all \(\mathbf {e}=(e_{1},e_{2},e_{3})\in \mathbb {E}\), \(\mathbf {e}'=(e'_{1},e'_{2},e'_{3})\in \mathbb {E}\), e⪯e′ iff \(e_{1}\leq e_{1}', e_{2}\leq e_{2}', e_{3}\leq e'_{3}\).
One possibility to extend the model to allow up–down jumps not only to neighboring states is to take powers of the nearest neighbor transition matrix P, that is, to look at the two-step chain. It turns out that the matrix P ^{2} is again Möbius monotone, and has a dual with an upper-triangular form if α=β.
A simple sufficient criterion for ≺_{ sm } order when \(\mathbb {E}\) is a discrete (countable) lattice is given as follows.
Lemma 4.1
If in Lemma 4.1 the state space \(\mathbb {E}\) is the set of all subsets of a finite set (i.e., the cube), then the transformation described in (4.3) is called, as in Xu and Li [16], a pairwise g ^{+} transform, and Lemma 4.1 then specializes to their Proposition 5.5.
If we modify rows numbered 1, 3, 6, 8 by such a transformation (notice that e _{1},e _{3},e _{6},e _{8} lie on a symmetry axis), that is, we consider an up–down walk which allows jumps not only to the nearest neighbors, then it can be checked that it is Möbius monotone, and the dual matrix again has an upper-triangular form, for an appropriate selection of α and κ.
There are several other examples of chains which are Möbius monotone on some other state spaces. We shall study this topic in a subsequent paper.
Notes
Acknowledgements
Work supported by NCN Research Grant UMO-2011/01/B/ST1/01305 (first author) and by MNiSW Research Grant N N201 394137 (second author).
Open Access
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
References
1. Aldous, D.J.: Finite-time implications of relaxation times for stochastically monotone processes. Probab. Theory Relat. Fields 77, 137–145 (1988)
2. Aldous, D.J., Diaconis, P.: Shuffling cards and stopping times. Am. Math. Mon. 93, 333–348 (1986)
3. Aldous, D.J., Diaconis, P.: Strong uniform times and finite random walks. Adv. Appl. Math. 8, 69–97 (1987)
4. Brown, M.: Consequences of monotonicity for Markov transition functions. Technical report, City College, CUNY (1990)
5. Brown, M., Shao, Y.S.: Identifying coefficients in the spectral representation for first passage time distributions. Probab. Eng. Inf. Sci. 1, 69–74 (1987)
6. Diaconis, P.: Group Representations in Probability and Statistics. IMS, Hayward (1988)
7. Diaconis, P., Fill, J.A.: Strong stationary times via a new form of duality. Ann. Probab. 18, 1483–1522 (1990)
8. Diaconis, P., Miclo, L.: On times to quasi-stationarity for birth and death processes. J. Theor. Probab. 22, 558–586 (2009)
9. Dieker, A.B., Warren, J.: Series Jackson networks and noncrossing probabilities. Math. Oper. Res. 35, 257–266 (2010)
10. Falin, G.I.: Monotonicity of random walks in partially ordered sets. Russ. Math. Surv. 43, 167–168 (1988)
11. Fill, J.A.: The passage time distribution for a birth-and-death chain: Strong stationary duality gives a first stochastic proof. J. Theor. Probab. 22, 543–557 (2009)
12. Fill, J.A.: On hitting times and fastest strong stationary times for skip-free and more general chains. J. Theor. Probab. 22, 587–600 (2009)
13. Lorek, P., Szekli, R.: On the speed of convergence to stationarity via spectral gap: queueing networks with breakdowns and repairs (submitted to JAP). arXiv:1101.0332 [math.PR]
14. Massey, W.A.: Stochastic ordering for Markov processes on partially ordered spaces. Math. Oper. Res. 12, 350–367 (1987)
15. Rota, G.C.: On the foundations of combinatorial theory I. Theory of Möbius functions. Z. Wahrscheinlichkeitstheor. 2, 340–368 (1964)
16. Xu, S.H., Li, H.: Majorization of weighted trees: A new tool to study correlated stochastic systems. Math. Oper. Res. 25, 298–323 (2000)
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.