Faster Statistical Model Checking for Unbounded Temporal Properties
Abstract
We present a new algorithm for the statistical model checking of Markov chains with respect to unbounded temporal properties, including full linear temporal logic. The main idea is that we monitor each simulation run on the fly, in order to detect quickly if a bottom strongly connected component is entered with high probability, in which case the simulation run can be terminated early. As a result, our simulation runs are often much shorter than required by termination bounds that are computed a priori for a desired level of confidence on a large state space. In comparison to previous algorithms for statistical model checking our method is not only faster in many cases but also requires less information about the system, namely, only the minimum transition probability that occurs in the Markov chain. In addition, our method can be generalised to unbounded quantitative properties such as mean-payoff bounds.
Keywords
Markov Chain Goal State Linear Temporal Logic Confidence Bound Reachability Analysis
1 Introduction
Traditional numerical algorithms for the verification of Markov chains may be computationally intense or inapplicable, when facing a large state space or limited knowledge about the chain. To this end, statistical algorithms are used as a powerful alternative. Statistical model checking (SMC) typically refers to approaches where (i) finite paths of the Markov chain are sampled a finite number of times, (ii) the property of interest is verified for each sampled path (e.g. state r is reached), and (iii) hypothesis testing or statistical estimation is used to infer conclusions (e.g. state r is reached with probability at most 0.5) and give statistical guarantees (e.g. the conclusion is valid with \(99\,\%\) confidence). SMC approaches differ in (a) the class of properties they can verify (e.g. bounded or unbounded properties), (b) the strength of statistical guarantees they provide (e.g. confidence bounds, only asymptotic convergence of the method towards the correct value, or none), and (c) the amount of information they require about the Markov chain (e.g. the topology of the graph). In this paper, we provide an algorithm for SMC of unbounded properties, with confidence bounds, in the setting where only the minimum transition probability of the chain is known. Such an algorithm is particularly desirable in scenarios when the system is not known (“black box”), but also when it is too large to construct or fit into memory.
Table 1. SMC approaches to Markov chain verification, organised by (i) the class of verifiable properties, and (ii) the required information about the Markov chain, where \(p_{\mathsf {min}}\) is the minimum transition probability, S is the number of states, and \(\lambda \) is the second largest eigenvalue of the chain.
Our contribution is a new SMC algorithm for full linear temporal logic (LTL), as well as for unbounded quantitative properties (mean payoff), which provides strong guarantees in the form of confidence bounds. Our algorithm uses less information about the Markov chain than previous algorithms that provide confidence bounds for unbounded properties — we need to know only the minimum transition probability \(p_{\mathsf {min}}\) of the chain, and not the number of states nor the topology. Yet, experimentally, our algorithm performs in many cases better than these previous approaches (see Sect. 5). Our main idea is to monitor each execution run on the fly in order to build statistical hypotheses about the structure of the Markov chain. In particular, if from observing the current prefix of an execution run we can stipulate that with high probability a bottom strongly connected component (BSCC) of the chain has been entered, then we can terminate the current execution run. The information obtained from execution prefixes allows us to terminate executions as soon as the property is decided with the required confidence, which is usually much earlier than any bounds that can be computed a priori. As far as we know, this is the first SMC algorithm that uses information obtained from execution prefixes.
Finding \(p_{\mathsf {min}}\) is a light assumption in many realistic scenarios and often does not depend on the size of the chain – e.g. bounds on the rates for reaction kinetics in chemical reaction systems are typically known; from a Prism language model they can easily be inferred without constructing the respective state space.
Example 1
Consider the property of reaching state r in the Markov chain depicted in Fig. 1. While the execution runs reaching r satisfy the property and can be stopped without ever entering any \(v_i\), the finite execution paths without r, such as stuttutuut, are inconclusive. In other words, observing this path does not rule out the existence of a transition from, e.g., u to r, which, if existing, would eventually be taken with probability 1. This transition could have arbitrarily low probability, rendering its detection arbitrarily unlikely, yet its presence would change the probability of satisfying the property from 0.5 to 1. However, knowing that if there exists such a transition leaving the set, its transition probability is at least \(p_{\mathsf {min}}=0.01\), we can estimate the probability that the system is stuck in the set \(\{t,u\}\) of states. Indeed, if it exists, the exit transition was missed at least four times, no matter whether it exits t or u. Consequently, the probability that there is no such transition and \(\{t,u\}\) is a BSCC is at least \(1-(1-p_{\mathsf {min}})^4\).
This means that, in order to get \(99\,\%\) confidence that \(\{t,u\}\) is a BSCC, we only need to see both t and u around 500 times^{1} on a run. This is in stark contrast to a priori bounds that provide the same level of confidence, such as the \((1/p_{\mathsf {min}})^{|S|}=100^{\mathcal {O}(m)}\) runs required by [4], which is infeasible for large m, whereas our method’s performance is independent of m. \(\triangle \)
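The arithmetic behind this estimate is easy to check. The following minimal sketch (the function name is ours) computes the smallest candidate strength \(k\) for which \((1-p_{\mathsf {min}})^k\) drops below a given error level:

```python
import math

def candidate_strength(p_min: float, error: float) -> int:
    """Smallest k with (1 - p_min)**k <= error: if every state of a
    candidate was visited k times without an exit being taken, a missed
    exit transition is at most this unlikely."""
    return math.ceil(math.log(error) / math.log(1.0 - p_min))
```

For \(p_{\mathsf {min}}=0.01\) and a \(1\,\%\) error level this yields \(k=459\), matching the "around 500 times" estimate above.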

Input: we can sample finite runs of arbitrary length from an unknown finite-state discrete-time Markov chain \(\mathcal {M}\) starting in the initial state^{2}, and we are given a lower bound \(p_{\mathsf {min}}>0\) on the transition probabilities in \(\mathcal {M}\), an LTL formula \(\varphi \), a threshold probability p, an indifference region \(\varepsilon >0\), and two error bounds \(\alpha ,\beta >0\).^{3}

Output: if \(\mathbb P[\mathcal {M}\models \varphi ]\ge p+\varepsilon \), return YES with probability at least \(1-\alpha \), and if \(\mathbb P[\mathcal {M}\models \varphi ]\le p-\varepsilon \), return NO with probability at least \(1-\beta \).
In addition, we present the first SMC algorithm for computing the mean payoff of Markov chains whose states are labelled with rewards.
Related Work. To the best of our knowledge, we present the first SMC algorithm that provides confidence bounds for unbounded qualitative properties with access to only the minimum probability of the chain \(p_{\mathsf {min}}\), and the first SMC algorithm for quantitative properties. For completeness, we survey briefly other related SMC approaches. SMC of unbounded properties, usually “unbounded until” properties, was first considered in [10] and the first approach was proposed in [22], but observed to be incorrect in [9]. Notably, in [26] two approaches are described. The first approach proposes to terminate sampled paths at every step with some probability \(p_{\mathsf {term}}\). In order to guarantee the asymptotic convergence of this method, the second largest eigenvalue \(\lambda \) of the chain must be computed, which is as hard as the verification problem itself. It should be noted that their method provides only asymptotic guarantees as the width of the confidence interval converges to zero. The correctness of [16] relies on the knowledge of the second eigenvalue \(\lambda \), too. The second approach of [26] requires the knowledge of the chain’s topology, which is used to transform the chain so that all potentially infinite paths are eliminated. In [9], a similar transformation is performed, again requiring knowledge of the topology. The (pre)processing of the state space required by the topology-aware methods, as well as by traditional numerical methods for Markov chain analysis, is a major practical hurdle for large (or unknown) state spaces. In [4], a priori bounds for the length of execution runs are calculated from the minimum transition probability and the number of states. However, without taking execution information into account, these bounds are exponential in the number of states and highly impractical, as illustrated in the example above. Another approach, limited to ergodic Markov chains, is taken in [20], based on coupling methods.
There are also extensions of SMC to timed systems [7]. Our approach is also related to [8, 18], where the product of a non-deterministic system and a Büchi automaton is explored for accepting lassos. We are not aware of any method for detecting BSCCs by observing a single run, employing no directed search of the state space.
Experimental Evaluation. Our idea of inferring the structure of the Markov chain on the fly, while generating execution runs, allows for their early termination. In Sect. 5 we will see that for many chains arising in practice, such as the concurrent probabilistic protocols from the Prism benchmark suite [15], the BSCCs are reached quickly and, even more importantly, can be small even for very large systems. Consequently, many execution runs can be stopped quickly. Moreover, since the number of execution runs necessary for a required precision and confidence is independent of the size of the state space, it need not be very large even for highly confident results (a good analogy is that of the opinion polls: the precision and confidence of opinion polls are regulated by the sample size and are independent of the size of the population). It is therefore not surprising that, experimentally, in most cases from the benchmark suite, our method outperforms previous methods (often even the numerical methods) despite requiring much less knowledge of the Markov chain, and despite providing strong guarantees in the form of confidence bounds. In Sect. 6, we also provide theoretical bounds on the running time of our algorithm for classes of Markov chains on which it performs particularly well.
Due to space constraints, the proofs and further details can be found in [2].
2 Preliminaries
Definition 1
(Markov chain). A Markov chain (MC) is a tuple \(\mathcal {M}= (S,\mathbf {P},\mu )\), where

\(S\) is a finite set of states,

\(\mathbf {P} \;:\; S\times S\rightarrow [0,1]\) is the transition probability matrix, such that for every \(s\in S\) it holds \(\sum _{s'\in S} \mathbf {P}(s,s') = 1\),

\(\mu \) is the initial probability distribution over \(S\).
We let \(p_{\mathsf {min}}:=\min (\{\mathbf {P}(s,s') > 0\mid s,s'\in S\})\) denote the smallest positive transition probability in \(\mathcal {M}\). A run of \(\mathcal {M}\) is an infinite sequence \(\rho = s_0 s_1 \ldots \) of states, such that for all \(i\ge 0\), \(\mathbf {P}(s_i, s_{i+1}) > 0\); we let \(\rho [i]\) denote the state \(s_i\). A path in \(\mathcal {M}\) is a finite prefix of a run of \(\mathcal {M}\). We denote the empty path by \(\lambda \) and concatenation of paths \(\pi _1\) and \(\pi _2\) by \(\pi _1\,.\,\pi _2\). Each path \(\pi \) in \(\mathcal {M}\) determines the set of runs \(\mathsf {Cone}(\pi )\) consisting of all runs that start with \(\pi \). To \(\mathcal {M}\) we assign the probability space \( (\mathsf {Runs},\mathcal {F},\mathbb P)\), where \(\mathsf {Runs}\) is the set of all runs in \(\mathcal {M}\), \(\mathcal F\) is the \(\sigma \)-algebra generated by all \(\mathsf {Cone}(\pi )\), and \(\mathbb P\) is the unique probability measure such that \(\mathbb P[\mathsf {Cone}(s_0s_1\ldots s_k)] = \mu (s_0)\cdot \prod _{i=1}^{k} \mathbf {P}(s_{i-1},s_i)\), where the empty product equals 1. The respective expected value of a random variable \(f:\mathsf {Runs}\rightarrow \mathbb R\) is \(\mathbb {E}[f]=\int _\mathsf {Runs}f\ d\,\mathbb P\).
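As a quick illustration of the cone measure, the following sketch computes \(\mathbb P[\mathsf {Cone}(\pi )]\) for a path given as a list of states; the dictionary encodings of \(\mu \) and \(\mathbf {P}\) and the example chain are ours:

```python
def cone_probability(mu, P, path):
    """P[Cone(s0 s1 ... sk)] = mu(s0) * prod_{i=1..k} P(s_{i-1}, s_i),
    with the empty product equal to 1; mu and P are dictionaries."""
    if not path:
        return 1.0  # the cone of the empty path is the set of all runs
    prob = mu.get(path[0], 0.0)
    for a, b in zip(path, path[1:]):
        prob *= P.get((a, b), 0.0)
    return prob

# A made-up two-state chain: from s stay with probability 0.5 or move
# to t, which is absorbing.
mu = {"s": 1.0}
P = {("s", "s"): 0.5, ("s", "t"): 0.5, ("t", "t"): 1.0}
```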
A nonempty set \(C\subseteq S\) of states is strongly connected (SC) if for every \(s,s'\in C\) there is a path from s to \(s'\). A set of states \(C\subseteq S\) is a bottom strongly connected component (BSCC) of \(\mathcal {M}\), if it is a maximal SC, and for each \(s\in C\) and \(s'\in S\setminus C\) we have \(\mathbf {P}(s,s')=0\). The sets of all SCs and BSCCs in \(\mathcal {M}\) are denoted by \(\mathsf {SC}\) and \(\mathsf {BSCC}\), respectively. Note that with probability 1, the set of states that appear infinitely many times on a run forms a BSCC. From now on, we use the standard notions of SC and BSCC for directed graphs as well.
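For a chain whose topology is known (unlike the black-box setting of this paper), the BSCCs can be computed directly from the definitions above. The sketch below is deliberately naive (quadratic reachability checks, fine for small illustrative graphs): a reachable set is a BSCC exactly when it is closed and every member reaches back.

```python
def bsccs(succ):
    """BSCCs of a directed graph {node: iterable of successors}: the set
    reachable from s is closed by construction, and it is a BSCC iff
    every state in it can reach s back (strong connectivity)."""
    def reach(v):
        seen, frontier = {v}, [v]
        while frontier:
            u = frontier.pop()
            for w in succ.get(u, ()):
                if w not in seen:
                    seen.add(w); frontier.append(w)
        return seen

    result = []
    for s in succ:
        r = reach(s)
        if all(s in reach(t) for t in r) and r not in result:
            result.append(r)
    return result
```

For example, on a graph shaped like Fig. 1's fragment (our encoding), with s branching to an absorbing r and a cycle between t and u, the BSCCs are \(\{t,u\}\) and \(\{r\}\).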
3 Solution for Reachability
A fundamental problem in Markov chain verification is computing the probability that a certain set of goal states is reached. For the rest of the paper, let \(\mathcal {M}= (S,\mathbf {P},\mu )\) be a Markov chain and \(G \subseteq S\) be the set of the goal states in \(\mathcal {M}\). We let \( \Diamond G:= \{ \rho \in \mathsf {Runs}\mid \exists i\ge 0: \rho [i]\in G\} \) denote the event that “eventually a state in G is reached.” The event \(\Diamond G\) is measurable and its probability \(\mathbb P[\Diamond G]\) can be computed numerically or estimated using statistical algorithms. Since no bound on the number of steps for reaching G is given, the major difficulty for any statistical approach is to decide how long each sampled path should be. We can stop extending the path either when we reach G, or when no more new states can be reached anyway. The latter happens if and only if we are in a BSCC and we have seen all of its states.
In this section, we first show how to monitor each simulation run on the fly, in order to detect quickly if a BSCC has been entered with high probability. Then, we show how to use hypothesis testing in order to estimate \(\mathbb P[\Diamond G]\).
3.1 BSCC Detection
We start with an example illustrating how to measure the probability of reaching a BSCC from one path observation.
Example 2
Recall Example 1 and Fig. 1. Now, consider an execution path stuttutu. Intuitively, does \(\{t,u\}\) look like a good “candidate” for being a BSCC of \(\mathcal {M}\)? We visited both t and u three times; we have taken a transition from each of t and u at least twice without leaving \(\{t,u\}\). By the same reasoning as in Example 1, we could have missed some outgoing transition with probability at most \((1-p_{\mathsf {min}})^2\). The structure of the system that can be deduced from this path is shown in Fig. 2 and is correct with probability at least \(1-(1-p_{\mathsf {min}})^2\). \(\triangle \)
Definition 2
(Candidate). If a path \(\pi \) has a suffix \(\kappa \) such that \(\overline{\kappa }\) is a BSCC of the graph of \(\pi \), we call \(\overline{\kappa }\) the candidate of \(\pi \). Moreover, for \(k\in \mathbb N\), we call it a \(k\)-candidate (of \(\pi \)) if each \(s\in \overline{\kappa }\) has at least k occurrences in \(\kappa \) and the last element of \(\kappa \) has at least \(k+1\) occurrences. A \(k\)-candidate of a run \(\rho \) is a \(k\)-candidate of some prefix of \(\rho \).
Note that for each path there is at most one candidate. Therefore, we write \(K(\pi )\) to denote the candidate of \(\pi \) if there is one, and \(K(\pi )=\bot \), otherwise. Observe that each \(K(\pi )\ne \bot \) is an SC in \(\mathcal {M}\).
Example 3
Consider a path \(\pi =stuttutu\), then \(K(\pi )=\{t,u\}\). Observe that \(\{t\}\) is not a candidate as it is not maximal. Further, \(K(\pi )\) is a 2-candidate (and as such also a 1-candidate), but not a 3-candidate. Intuitively, the reason is that we only took a transition from u (to the candidate) twice, cf. Example 2. \(\triangle \)
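The candidate of a path can be computed directly from Definition 2. A minimal sketch (function names are ours): since every edge of the path eventually leads back to its final state, the states reachable via the path's edges from the last state form the candidate precisely when the last state lies on a cycle of the path.

```python
from collections import Counter

def candidate(path):
    """K(pi): the bottom SCC of the graph of pi containing the last
    state, or None if the last state does not lie on a cycle of pi."""
    if not path:
        return None
    succ = {}
    for a, b in zip(path, path[1:]):
        succ.setdefault(a, set()).add(b)
    reach, frontier = set(), [path[-1]]
    while frontier:  # states reachable from the last state via >= 1 edge
        v = frontier.pop()
        for w in succ.get(v, ()):
            if w not in reach:
                reach.add(w); frontier.append(w)
    return reach if path[-1] in reach else None

def is_k_candidate(path, k):
    """Each state of the candidate suffix occurs at least k times, and
    the last state occurs at least k + 1 times (Definition 2)."""
    cand = candidate(path)
    if cand is None:
        return False
    start = min(i for i, s in enumerate(path) if s in cand)
    counts = Counter(path[start:])
    return all(counts[s] >= k for s in cand) and counts[path[-1]] >= k + 1
```

On the path of Example 3 this reproduces the discussion above: the candidate is \(\{t,u\}\), a 2-candidate but not a 3-candidate.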
Intuitively, the higher the k, the more it looks as if the \(k\)-candidate is indeed a BSCC. Denoting by \( Cand _{k}(K)\) the random predicate of K being a \(k\)-candidate on a run, the probability of “unluckily” detecting any specific non-BSCC set of states K as a \(k\)-candidate can be bounded as follows.
Lemma 1
For every \(k\in \mathbb N\) and every set \(K\subseteq S\) that is not a BSCC of \(\mathcal {M}\), we have \(\mathbb P[ Cand _{k}(K)]\le (1-p_{\mathsf {min}})^{k}\).
Example 4
We illustrate how candidates “evolve over time” along a run. Consider a run \(\rho =s_0s_0s_1s_0\ldots \) of the Markov chain in Fig. 3. The empty and the one-letter prefix have no candidate, \(s_0s_0\) has a candidate \(\{s_0\}\), then again \(K(s_0s_0s_1)=\bot \), and \(K(s_0s_0s_1s_0)=\{s_0,s_1\}\). One can observe that subsequent candidates are either disjoint or contain some of the previous candidates. Consequently, there are at most \(2|S|-1\) candidates on every run, a bound which is unknown in our setting. \(\triangle \)
While we have bounded the probability of detecting any specific nonBSCC set K as a kcandidate, we need to bound the overall error for detecting a candidate that is not a BSCC. Since there can be many false candidates on a run before the real BSCC (e.g. Fig. 3), we need to bound the error of reporting any of them.
In the following, we first formalise the process of discovering candidates along the run. Second, we bound the error that any of the nonBSCC candidates becomes a kcandidate. Third, we bound the overall error of not detecting the real BSCC by increasing k every time a different candidate is found.
We start with discovering the sequence of candidates on a run. For a run \(\rho =s_0s_1\ldots \), consider the sequence of random variables defined by \(K(s_0\ldots s_j)\) for \(j\ge 0\), and let \((K_i)_{i\ge 1}\) be the subsequence without undefined elements and with no repetition of consecutive elements. For example, for a run \(\varrho =s_0s_1s_1s_1s_0s_1s_2s_2\ldots \), we have \(K_1=\{s_1\}\), \(K_2=\{s_0,s_1\}\), \(K_3=\{s_2\}\), etc. Let \(K_j\) be the last element of this sequence, called the final candidate. Additionally, we define \(K_\ell :=K_j\) for all \(\ell >j\). We describe the lifetime of a candidate. Given a non-final \(K_i\), we write \(\rho =\alpha _i\beta _ib_i\gamma _id_i\delta _i\) so that \(\overline{\alpha _i}\cap K_i=\emptyset \), \(\overline{\beta _ib_i\gamma _i}=K_i\), \(d_i\notin K_i\), and \(K(\alpha _i\beta _i)\ne K_i\), \(K(\alpha _i\beta _ib_i)= K_i\). Intuitively, we start exploring \(K_i\) in \(\beta _i\); \(K_i\) becomes a candidate in \(b_i\), the birthday of the \(i\)th candidate; it remains a candidate until \(d_i\), the death of the \(i\)th candidate. For example, for the run \(\varrho =s_0s_1s_1s_1s_0s_1s_2s_2\ldots \) and \(i=1\), \(\alpha _1=s_0\), \(\beta _1=s_1\), \(b_1=s_1\), \(\gamma _1=s_1\), \(d_1=s_0\), \(\delta _1=s_1s_2s_2\varrho [8]\varrho [9]\ldots \). Note that the final candidate is almost surely a BSCC of \(\mathcal {M}\) and would thus have \(\gamma _j\) infinite.
Now, we proceed to bounding errors for each candidate. Since there is an unknown number of candidates on a run, we will need a slightly stronger definition. First, observe that \( Cand _{k}(K_i)\) holds iff \(K_i\) is a \(k\)-candidate of \(\beta _ib_i\gamma _i\). We say \(K_i\) is a strong \(k\)-candidate, written \( SCand _{k}(K_i)\), if it is a \(k\)-candidate of \(b_i\gamma _i\). Intuitively, it becomes a \(k\)-candidate even without counting the discovery phase. As a result, even if we already assume there exists an \(i\)th candidate, its strong \(k\)-candidacy gives the same guarantee on being a BSCC as in Lemma 1.
Lemma 2
Since the number of candidates can only be bounded with some knowledge of the state space, e.g. its size, we assume no bounds and provide a method to bound the error even for an unbounded number of candidates on a run.
Lemma 3
Theorem 1
For every \(\delta >0\), Algorithm 1 is correct with error probability at most \(\delta \).
3.2 Hypothesis Testing on a Bernoulli Variable Observed with Bounded Error
In the following, we show how to estimate the probability of reaching a set of goal states, by combining the BSCC detection and hypothesis testing. More specifically, we sample many paths of a Markov chain, decide for each whether it reaches the goal states (Algorithm 2), and then use hypothesis testing to estimate the event probability. The hypothesis testing is adapted to the fact that testing reachability on a single path may report false negatives.
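Algorithm 2 itself is not reproduced here; the following self-contained sketch shows the shape of the single-path check under stated assumptions: `step` samples a successor of a state (the black-box access to \(\mathcal {M}\)), and the schedule for the growing candidate strength, here \((1-p_{\mathsf {min}})^{k_i}\le \delta /2^i\) for the \(i\)-th candidate seen on the run, is one possible choice, not necessarily the paper's exact one.

```python
import math, random

def single_path_reach(step, init, goal, p_min, delta, rng):
    """Extend a sampled run until a goal state is hit (True) or the
    current candidate is a strong k-candidate disjoint from the goal
    (False); k grows with each new candidate so that the per-candidate
    errors (1 - p_min)^{k_i} <= delta / 2^i sum to below delta."""
    def candidate(path):
        succ = {}
        for a, b in zip(path, path[1:]):
            succ.setdefault(a, set()).add(b)
        reach, frontier = set(), [path[-1]]
        while frontier:
            v = frontier.pop()
            for w in succ.get(v, ()):
                if w not in reach:
                    reach.add(w); frontier.append(w)
        return reach if path[-1] in reach else None

    def strength(path, cand):
        start = min(i for i, s in enumerate(path) if s in cand)
        counts = {}
        for s in path[start:]:
            counts[s] = counts.get(s, 0) + 1
        return min(min(counts.values()), counts[path[-1]] - 1)

    path, prev, i = [init], None, 0
    while True:
        if path[-1] in goal:
            return True
        cand = candidate(path)
        if cand is not None and cand.isdisjoint(goal):
            if cand != prev:
                i, prev = i + 1, cand
            k = math.ceil(math.log(delta / 2 ** i) / math.log(1 - p_min))
            if strength(path, cand) >= k:
                return False
        path.append(step(path[-1], rng))
```

If the candidate contains a goal state, the run simply continues: it will almost surely hit the goal.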
Let \(X_\Diamond ^\delta \) be a Bernoulli random variable, such that \(X_\Diamond ^\delta =1\) if and only if SinglePathReach \((G,p_{\mathsf {min}},\delta )=\mathbf {Yes}\), describing the outcome of Algorithm 2. The following theorem establishes that \(X_\Diamond ^\delta \) estimates \(\mathbb P[\Diamond G]\) with a bias bounded by \(\delta \).
Theorem 2
For every \(\delta >0\), we have \(\mathbb P[\Diamond G]-\delta \le \mathbb {E}[X_\Diamond ^\delta ]\le \mathbb P[\Diamond G]\).
In order to conclude on the value \(\mathbb P[\Diamond G]\), the standard statistical model checking approach via hypothesis testing [28] decides between the hypothesis \( H_0: \mathbb P[\Diamond G]\ge p + \varepsilon \) and \(H_1: \mathbb P[\Diamond G]\le p - \varepsilon \), where \(\varepsilon \) is a desired indifference region. As we do not have precise observations on each path, we reduce this problem to hypothesis testing on the variable \(X_\Diamond ^\delta \) with a narrower indifference region: \( H_0': \mathbb {E}[X_\Diamond ^\delta ] \ge p+(\varepsilon - \delta )\) and \(H_1': \mathbb {E}[X_\Diamond ^\delta ]\le p - \varepsilon \), for some \(\delta <\varepsilon \).
We define the reduction simply as follows. Given a statistical test \(\mathcal {T'}\) for \(H'_0, H'_1\) we define a test \(\mathcal {T}\) that accepts \(H_0\) if \(\mathcal {T}'\) accepts \(H'_0\), and \(H_1\) otherwise. The following lemma shows that \(\mathcal {T}\) has the same strength as \(\mathcal {T}'\).
Lemma 4
Suppose the test \(\mathcal {T'}\) decides between \(H_0'\) and \(H_1'\) with strength \((\alpha , \beta )\). Then the test \(\mathcal {T}\) decides between \(H_0\) and \(H_1\) with strength \((\alpha , \beta )\).
Lemma 4 gives us the following algorithm to decide between \(H_0\) and \(H_1\). We generate samples \(x_0, x_1, \ldots , x_n \sim X_\Diamond ^\delta \) from SinglePathReach \((G,p_{\mathsf {min}},\delta )\), and apply a statistical test to decide between \(H'_0\) and \(H'_1\). Finally, we accept \(H_0\) if \(H'_0\) was accepted by the test, and \(H_1\) otherwise. In our implementation, we used the sequential probability ratio test (SPRT) [24, 25] for hypothesis testing.
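A minimal sketch of the SPRT step with Wald's classical thresholds (the function name is ours); here `p_low` \(= p-\varepsilon \) and `p_high` \(= p+(\varepsilon -\delta )\) are the two hypothesised means:

```python
import math

def sprt(samples, p_low, p_high, alpha, beta):
    """Wald's sequential probability ratio test for a Bernoulli
    variable: decide H0': E[X] >= p_high against H1': E[X] <= p_low
    with error bounds (alpha, beta); returns "H0", "H1", or
    "undecided" if the sample stream ends first."""
    accept_h1 = math.log((1 - beta) / alpha)
    accept_h0 = math.log(beta / (1 - alpha))
    llr = 0.0  # log-likelihood ratio of the low vs. the high hypothesis
    for x in samples:
        if x:
            llr += math.log(p_low / p_high)
        else:
            llr += math.log((1 - p_low) / (1 - p_high))
        if llr >= accept_h1:
            return "H1"
        if llr <= accept_h0:
            return "H0"
    return "undecided"
```

Following Lemma 4, accepting \(H_0'\) here translates directly into accepting \(H_0\).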
4 Extensions
In this section, we present how the on-the-fly BSCC detection can be used for verifying LTL and quantitative properties (mean payoff).
4.1 Linear Temporal Logic
We show how our method extends to properties expressible by linear temporal logic (LTL) [19] and, in the same manner, to all \(\omega \)-regular properties. Given a finite set Ap of atomic propositions, a labelled Markov chain (LMC) is a tuple \(\mathcal {M}= (S, \mathbf {P}, \mu , Ap, L)\), where \((S,\mathbf {P}, \mu )\) is a MC and \(L: S\rightarrow 2^{Ap}\) is a labelling function. The definition of LTL formulae is standard and, for the reader’s convenience, recalled in the full version [2], along with other standard details omitted in this section. Given a labelled Markov chain \(\mathcal {M}\) and an LTL formula \(\varphi \), we are interested in the measure \(\mathbb P[\mathcal {M}\models \varphi ] := \mathbb P[\{\rho \in \mathsf {Runs}\mid L(\rho ) \models \varphi \}],\) where \(L\) is naturally extended to runs by \(L(\rho )[i]=L(\rho [i])\) for all i.
For every LTL formula \(\varphi \), one can construct a deterministic Rabin automaton (DRA) \(\mathcal {A}= (Q, {2^{Ap}}, \gamma , q_o, Acc)\) that accepts all runs that satisfy \(\varphi \) [3]. Here \(Q\) is a finite set of states, \(\gamma : Q\times {2^{Ap}}\rightarrow Q\) is the transition function, \(q_o\in Q\) is the initial state, and \(Acc\subseteq 2^Q\times 2^Q\) is the acceptance condition. Further, the product of a MC \(\mathcal {M}\) and DRA \(\mathcal A\) is the Markov chain \(\mathcal {M}\otimes \mathcal {A}=(S\times Q, \mathbf {P}', \mu ')\), where \(\mathbf {P}'((s,q),(s',q')) = \mathbf {P}(s,s')\) if \(q'=\gamma (q,L(s'))\) and \(\mathbf {P}'((s,q),(s',q'))=0\) otherwise, and \(\mu '(s,q) = \mu (s)\) if \(\gamma (q_o, L(s))=q\) and \(\mu '(s,q)=0\) otherwise. Note that \(\mathcal {M}\otimes \mathcal {A}\) has the same smallest transition probability \(p_{\mathsf {min}}\) as \(\mathcal {M}\).
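As an illustration of the product construction (dictionaries standing in for \(\mathbf {P}\), \(L\) and \(\gamma \), and the two-state example automaton being made up for this sketch), one step of \(\mathcal {M}\otimes \mathcal {A}\) can be written as:

```python
def product_step(P, L, gamma, state):
    """Successor distribution of (s, q) in the product chain: probability
    P(s, s') of moving to (s', gamma(q, L(s'))); all other pairs have
    probability 0 and are omitted."""
    s, q = state
    return {(s2, gamma(q, L(s2))): pr
            for (s1, s2), pr in P.items() if s1 == s and pr > 0}

# A two-state chain and a two-state automaton that latches onto the
# label "goal" once it has been seen.
P = {("s", "s"): 0.5, ("s", "t"): 0.5, ("t", "t"): 1.0}
L = lambda s: {"goal"} if s == "t" else set()
gamma = lambda q, lab: "q1" if "goal" in lab else q
```

Note that the nonzero probabilities of the product are exactly those of \(\mathcal {M}\), so \(p_{\mathsf {min}}\) is preserved, as stated above.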
The crux of LTL probabilistic model checking relies on the fact that the probability of satisfying an LTL property \(\varphi \) in a Markov chain \(\mathcal {M}\) equals the probability of reaching an accepting BSCC in the Markov chain \(\mathcal {M}\otimes \mathcal {A}_\varphi \). Formally, a BSCC C of \(\mathcal {M}\otimes \mathcal {A}_\varphi \) is accepting if for some \((E,F)\in Acc\) we have \(C\cap (S\times E)=\emptyset \) and \(C\cap (S\times F)\ne \emptyset \). Let \(\mathsf {AccBSCC}\) denote the union of all accepting BSCCs in \(\mathcal {M}\otimes \mathcal {A}_\varphi \). Then we obtain the following well-known fact [3]:
Lemma 5
For every labelled Markov chain \(\mathcal {M}\) and LTL formula \(\varphi \), we have \(\mathbb P[\mathcal {M}\models \varphi ] = \mathbb P[\Diamond \mathsf {AccBSCC}]\).
Since the input used is a Rabin automaton, the method applies to all \(\omega \)-regular properties. Let \(X^\delta _\varphi \) be a Bernoulli random variable, such that \(X_\varphi ^\delta =1\) if and only if SinglePathLTL \((\mathcal {A}_\varphi ,p_{\mathsf {min}},\delta )=\mathbf {Yes}\). Since the BSCC must be reached and fully explored to classify it correctly, the error of the algorithm can now be two-sided.
Theorem 3
For every \(\delta >0\), \(\mathbb P[\mathcal {M}\models \varphi ] - \delta \le \mathbb {E}[X_\varphi ^\delta ] \le \mathbb P[\mathcal {M}\models \varphi ] + \delta \).
4.2 Mean Payoff
Algorithm 4 extends Algorithm 2 as follows. It divides the confidence parameter \(\delta \) into \(\delta _{BSCC}\) (used as in Algorithm 2 to detect the BSCC) and \(\delta _C\) (the total confidence for the estimates on transition probabilities). For simplicity, we set \(\delta _{BSCC}=\delta _{C}=\delta /2\). First, we compute the bound \(\xi \) required for \(\zeta \)-precision (by Eq. 2). Subsequently, we compute the required strength k of the candidate guaranteeing \(\delta _{C}\)-confidence on \(\hat{\mathbf {P}}\) (from Eq. 1). The path is prolonged until the candidate is strong enough; in such a case \(\hat{\mathsf {MP}}_{C}\) is \(\zeta \)-approximated with \(1-\delta _{C}\) confidence. If the candidate of the path changes, all values are computed from scratch for the new candidate.
Theorem 4
For every \(\delta >0\), Algorithm 4 terminates correctly with probability at least \(1-\delta \).
Let the random variable \(X_{\mathsf {MP}}^{\zeta ,\delta }\) denote the value SinglePathMP(\(r,p_{\mathsf {min}},\zeta ,\delta \)). The following theorem establishes the relation between the mean payoff \(\mathsf {MP}\) and the expected value of \(X_\mathsf {MP}^{\zeta ,\delta }\).
Theorem 5
For every \(\delta , \zeta > 0\), \(\mathsf {MP} - \zeta - \delta \le \mathbb {E}[X_\mathsf {MP}^{\zeta ,\delta }] \le \mathsf {MP} + \zeta + \delta \).
As a consequence of Theorem 5, if we establish that with \((1-\alpha )\) confidence \(X_\mathsf {MP}^{\zeta ,\delta }\) belongs to the interval [a, b], then we can conclude with \((1-\alpha )\) confidence that \(\mathsf {MP}\) belongs to the interval \([a-\zeta - \delta , b+\zeta +\delta ]\). Standard statistical methods can be applied to find the confidence bound for \(X_\mathsf {MP}^{\zeta ,\delta }\).
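The widening step is mechanical; a one-line sketch (the function name is ours):

```python
def mp_confidence_interval(a, b, zeta, delta):
    """Given a (1 - alpha)-confidence interval [a, b] for X_MP, return
    the corresponding interval for MP, widened by the bias bounds of
    Theorem 5: [a - zeta - delta, b + zeta + delta]."""
    return (a - zeta - delta, b + zeta + delta)
```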
5 Experimental Evaluation
We implemented our algorithms in the probabilistic model checker Prism [14], and evaluated them on the DTMC examples from the Prism benchmark suite [15]. The benchmarks model communication and security protocols, distributed algorithms, and fault-tolerant systems. To demonstrate how our method performs depending on the topology of Markov chains, we also performed experiments on the generic DTMCs shown in Figs. 3 and 4, as well as on two CTMCs from the literature that have large BSCCs: “tandem” [11] and “gridworld” [27].
All benchmarks are parametrised by one or more values, which influence their size and complexity, e.g. the number of modelled components. We have made minor modifications to the benchmarks that could not be handled directly by the SMC component of Prism, by adding self-loops to deadlock states and fixing a single initial state instead of multiple ones.
Reachability. The experimental results for unbounded reachability are shown in Table 2. The Prism benchmarks were checked against their standard properties, when available. We directly compare our method to another topology-agnostic method of [26] (\(\mathsf {SimTermination}\)), where at every step the sampled path is terminated with probability \(p_{\mathsf {term}}\). The approach of [4] with a priori bounds is not included, since it times out even on the smallest benchmarks. In addition, we performed experiments on two methods that are topology-aware: sampling with reachability analysis of [26] (\(\mathsf {SimAnalysis}\)) and the numerical model-checking algorithm of Prism (\(\textsf {MC}\)). The full version [2] contains detailed experimental evaluation of these methods.
Table 2. Experimental results for unbounded reachability. Simulation parameters: \(\alpha =\beta =\varepsilon =0.01\), \(\delta =0.001\), \(p_{\mathsf {term}}=0.0001\). \(\mathsf {TO}\) means timeout, and \(\mathsf {MO}\) means memory-out. Our approach is denoted by \(\mathsf {SimAdaptive}\) here. Highlights show the best result among the topology-agnostic methods.
Finding the optimal termination probability \(p_{\mathsf {term}}\) for the \(\mathsf {SimTermination}\) method is a non-trivial task. If the probability is too high, the method might never reach the target states and thus give an incorrect result; if the value is too low, it might sample unnecessarily long traces that never reach the target. For instance, to ensure a correct answer on the Markov chain in Fig. 3, \(p_{\mathsf {term}}\) has to decrease exponentially with the number of states. By experimenting we found that the probability \(p_{\mathsf {term}}=0.0001\) is low enough to ensure correct results. See the full version [2] for experiments with other values of \(p_{\mathsf {term}}\).
On most examples, our method scales better than the \(\mathsf {SimTermination}\) method. Our method performs well even on examples with large BSCCs, such as “tandem” and “gridworld,” due to early termination when a goal state is reached. For instance, on the “gridworld” example, most BSCCs do not contain a goal state and thus have to be fully explored; however, the probability of reaching such a BSCC is low, and as a consequence full BSCC exploration rarely occurs. The \(\mathsf {SimTermination}\) method performs well when the target states are unreachable or can be reached by short paths. When long paths are necessary to reach the target, the probability that an individual path reaches the target is small, hence many samples are necessary to estimate the real probability with high confidence.
Moreover, it turns out that our method compares well even with methods that have access to the topology of the system. In many cases, the running time of the numerical algorithm \(\textsf {MC}\) increases dramatically with the size of the system, while remaining almost constant in our method. The bottleneck of the \(\mathsf {SimAnalysis}\) algorithm is the reachability analysis of states that cannot reach the target, which in practice can be as difficult as numerical model checking.
Table 3. Experimental results for LTL and mean-payoff properties. Simulation parameters for LTL: \(\alpha =\beta =\varepsilon =0.01\), \(\delta =0.001\); for mean payoff we computed a \(95\,\%\)-confidence bound of size 0.22 with \(\delta =0.011, \zeta =0.08\).
6 Discussion and Conclusion
As demonstrated by the experimental results, our method is fast on systems that are (1) shallow, and (2) with small BSCCs. In such systems, the BSCC is reached quickly and the candidate is built up quickly. Further, recall that the BSCC is reported when a \(k\)-candidate is found, and that k is increased with each candidate along the path. Hence, when there are many strongly connected sets, and thus many candidates, the BSCC is detected by a \(k\)-candidate for a large k. However, since k grows linearly in the number of candidates, the most important and limiting factor is the size of the BSCCs.
Systems with large and deep BSCCs require longer runs to achieve the required level of confidence. However, such systems are often difficult to handle also for other methods agnostic of the topology. For instance, correctness of [26] on the example in Fig. 3 relies on the termination probability \(p_{\mathsf {term}}\) being at most \(1-\lambda \), which is less than \(2^{-n}\) here. Larger values lead to incorrect results and smaller values to paths of exponential length. Nevertheless, our procedure usually runs faster than the bounds suggest; for a detailed discussion see [2].
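For contrast, the termination-probability scheme underlying [26] can be sketched roughly as follows (a simplified, hypothetical rendering for illustration only; the actual method involves further statistical machinery). It makes the trade-off above concrete: the expected path length is \(1/p_{\mathsf {term}}\), so a sound \(p_{\mathsf {term}} < 2^{-n}\) forces exponentially long paths.

```python
import random

def sample_with_termination(P, s0, goal, p_term, rng):
    """One sample under a per-step termination probability p_term:
    at every step the run ends unsuccessfully with probability p_term."""
    state = s0
    while state not in goal:
        if rng.random() < p_term:
            return False          # terminated before reaching the goal
        succs, probs = zip(*P[state])
        state = rng.choices(succs, weights=probs)[0]
    return True
```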
Conclusion. To the best of our knowledge, we propose the first statistical model-checking method that exploits the information provided by each simulation run on the fly, in order to detect quickly a potential BSCC, and verify LTL properties with the desired confidence. This is also the first application of SMC to quantitative properties such as mean payoff. We note that for our method to work correctly, the precise value of \(p_{\mathsf {min}}\) is not necessary; a lower bound is sufficient. This lower bound can come from domain knowledge, or can be inferred directly from the description of white-box systems, such as the PRISM benchmarks.
The approach we present is not meant to replace the other methods, but rather to be an addition to the repertoire of available approaches. Our method is particularly valuable for models with small BSCCs and a huge state space, such as many of the PRISM benchmarks.
In future work, we plan to investigate the applicability of our method to Markov decision processes, and to deciding language equivalence between two Markov chains.
References
1. Tool for the paper. http://pub.ist.ac.at/~przemek/pa_tool.html
2. Daca, P., Henzinger, T.A., Kretínský, J., Petrov, T.: Faster statistical model checking for unbounded temporal properties. CoRR, abs/1504.05739 (2015)
3. Baier, C., Katoen, J.P.: Principles of Model Checking. MIT Press, Cambridge (2008)
4. Brázdil, T., Chatterjee, K., Chmelík, M., Forejt, V., Křetínský, J., Kwiatkowska, M., Parker, D., Ujma, M.: Verification of Markov decision processes using learning algorithms. In: Cassez, F., Raskin, J.F. (eds.) ATVA 2014. LNCS, vol. 8837, pp. 98–114. Springer, Heidelberg (2014)
5. Bulychev, P.E., David, A., Larsen, K.G., Mikucionis, M., Poulsen, D.B., Legay, A., Wang, Z.: UPPAAL-SMC: statistical model checking for priced timed automata. In: QAPL, pp. 1–16 (2012)
6. Chatterjee, K.: Robustness of structurally equivalent concurrent parity games. In: Birkedal, L. (ed.) FOSSACS 2012. LNCS, vol. 7213, pp. 270–285. Springer, Heidelberg (2012)
7. David, A., Larsen, K.G., Legay, A., Mikucionis, M., Poulsen, D.B.: Uppaal SMC tutorial. STTT 17(4), 397–415 (2015)
8. Grosu, R., Smolka, S.A.: Monte Carlo model checking. In: Halbwachs, N., Zuck, L.D. (eds.) TACAS 2005. LNCS, vol. 3440, pp. 271–286. Springer, Heidelberg (2005)
9. He, R., Jennings, P., Basu, S., Ghosh, A.P., Wu, H.: A bounded statistical approach for model checking of unbounded until properties. In: ASE, pp. 225–234 (2010)
10. Hérault, T., Lassaigne, R., Magniette, F., Peyronnet, S.: Approximate probabilistic model checking. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp. 73–84. Springer, Heidelberg (2004)
11. Hermanns, H., Meyer-Kayser, J., Siegle, M.: Multi terminal binary decision diagrams to represent and analyse continuous time Markov chains. In: 3rd International Workshop on the Numerical Solution of Markov Chains, pp. 188–207. Citeseer (1999)
12. Jegourel, C., Legay, A., Sedwards, S.: A platform for high performance statistical model checking – PLASMA. In: Flanagan, C., König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 498–503. Springer, Heidelberg (2012)
13. Jha, S.K., Clarke, E.M., Langmead, C.J., Legay, A., Platzer, A., Zuliani, P.: A Bayesian approach to model checking biological systems. In: Degano, P., Gorrieri, R. (eds.) CMSB 2009. LNCS, vol. 5688, pp. 218–234. Springer, Heidelberg (2009)
14. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer, Heidelberg (2011)
15. Kwiatkowska, M.Z., Norman, G., Parker, D.: The PRISM benchmark suite. In: QEST, pp. 203–204 (2012)
16. Lassaigne, R., Peyronnet, S.: Probabilistic verification and approximation. Ann. Pure Appl. Logic 152(1–3), 122–131 (2008)
17. Norris, J.R.: Markov Chains. Cambridge University Press, Cambridge (1998)
18. Oudinet, J., Denise, A., Gaudel, M.C., Lassaigne, R., Peyronnet, S.: Uniform Monte-Carlo model checking. In: Giannakopoulou, D., Orejas, F. (eds.) FASE 2011. LNCS, vol. 6603, pp. 127–140. Springer, Heidelberg (2011)
19. Pnueli, A.: The temporal logic of programs. In: FOCS, pp. 46–57 (1977)
20. El Rabih, D., Pekergin, N.: Statistical model checking using perfect simulation. In: Liu, Z., Ravn, A.P. (eds.) ATVA 2009. LNCS, vol. 5799, pp. 120–134. Springer, Heidelberg (2009)
21. Sen, K., Viswanathan, M., Agha, G.: Statistical model checking of black-box probabilistic systems. In: Alur, R., Peled, D.A. (eds.) CAV 2004. LNCS, vol. 3114, pp. 202–215. Springer, Heidelberg (2004)
22. Sen, K., Viswanathan, M., Agha, G.: On statistical model checking of stochastic systems. In: Etessami, K., Rajamani, S.K. (eds.) CAV 2005. LNCS, vol. 3576, pp. 266–280. Springer, Heidelberg (2005)
23. Solan, E.: Continuity of the value of competitive Markov decision processes. J. Theor. Probab. 16(4), 831–845 (2003)
24. Wald, A.: Sequential tests of statistical hypotheses. Ann. Math. Stat. 16(2), 117–186 (1945)
25. Younes, H.L.S.: Planning and verification for stochastic processes with asynchronous events. In: AAAI, pp. 1001–1002 (2004)
26. Younes, H.L.S., Clarke, E.M., Zuliani, P.: Statistical verification of probabilistic properties with unbounded until. In: Davies, J. (ed.) SBMF 2010. LNCS, vol. 6527, pp. 144–160. Springer, Heidelberg (2011)
27. Younes, H.L.S., Kwiatkowska, M.Z., Norman, G., Parker, D.: Numerical vs. statistical probabilistic model checking. STTT 8(3), 216–228 (2006)
28. Younes, H.L.S., Simmons, R.G.: Probabilistic verification of discrete event systems using acceptance sampling. In: Brinksma, E., Larsen, K.G. (eds.) CAV 2002. LNCS, vol. 2404, pp. 223–235. Springer, Heidelberg (2002)