Adaptive Aggregation of Markov Chains: Quantitative Analysis of Chemical Reaction Networks
Abstract
Quantitative analysis of Markov models typically proceeds through numerical methods or simulation-based evaluation. Since the state space of the models can often be large, exact or approximate state aggregation methods (such as lumping or bisimulation reduction) have been proposed to improve the scalability of the numerical schemes. However, none of the existing numerical techniques provides general, explicit bounds on the approximation error, a problem particularly relevant when the level of accuracy affects the soundness of verification results. We propose a novel numerical approach that combines the strengths of aggregation techniques (state-space reduction) with those of simulation-based approaches (automatic updates that adapt to the process dynamics). The key advantage of our scheme is that it provides rigorous precision guarantees under different measures. The new approach, which can be used in conjunction with time uniformisation techniques, is evaluated on two models of chemical reaction networks, a signalling pathway and a prokaryotic gene expression network: it demonstrates marked improvement in accuracy without performance degradation, particularly when compared to known state-space truncation techniques.
Keywords
Error Bound, Chemical Master Equation, Empirical Error, Probabilistic Model Checking, Uniformisation Step

1 Introduction
Markov models are widely used in many areas of science and engineering in order to evaluate the probability of certain events of interest. Quantitative analysis of time-bounded properties of Markov models typically proceeds through numerical analysis, via solution of equations yielding the probability of the system residing in a given state at a given time, or via simulation-based exploration of its execution paths. For continuous-time Markov chains (CTMCs), a commonly employed method is uniformisation (also known as Jensen’s method), which is based on the discretisation of the original CTMC and on the numerical computation of transient probabilities (that is, probability distributions over time). This can be combined with graph-theoretic techniques for probabilistic model checking against temporal logic properties [4].
There are many situations where highly accurate probability estimates are necessary, for example for reliability analysis in safety-critical systems or for predictive modelling in scientific experiments, but this is difficult to achieve in practice because of the state-space explosion problem. Imprecise values are known to lead to lack of robustness, in the sense that the satisfaction of temporal logic formulae can be affected by small changes to the formula bound or the probability distribution of the model. Simulation-based analysis does not suffer from this problem and additionally allows dynamic adaptation of the sampling procedure, as e.g. in importance sampling, to the current values of the transient probability distribution. However, this analysis provides only weak precision guarantees in the form of confidence intervals. In order to enable the handling of larger state spaces, two types of techniques have been introduced: state aggregation and state-space truncation. State aggregation techniques build a reduced state space using lumping [6] or bisimulation quotients [21], and have been proposed both in exact [21] and approximate form [10], with the latter deemed more robust than the exact ones [11]. State-space truncation methods, e.g. fast adaptive uniformisation (FAU) [9, 23], on the other hand, only consider the states whose probability mass is not negligible, while clustering states where the probability is less than a given threshold and computing the total probability lost.
Unfortunately, though these methods allow the user to specify a desired precision, none provide explicit and general error bounds that can be used to quantify the accuracy of the numerical computation: more precisely, these truncation methods provide a lower bound on the probability distributions in time, and the total probability lost can be used to derive a (rather conservative) upper bound on the (pointwise) approximation error as the sum of the lower bound and of the total probability lost.
Key Contributions. We propose a novel adaptive aggregation method for Markov chains that allows us to control its approximation error based on explicitly derived error bounds. The method can be combined with numerical techniques such as uniformisation [9, 23], typically employed in quantitative verification of Markov chains. The method works over a finite time interval by clustering the state space of a Markov chain sequentially in time, where the quality of the current aggregation is quantified by a number of metrics. These metrics, in conjunction with user-specified precision requirements, drive the process by automatically and adaptively reclustering its state space depending on the explicit error bounds. In contrast to related simulation-based approaches in the literature [13, 31] that employ the current probability distribution of the aggregated model to selectively cluster the regions of the state space containing negligible probability mass, our novel use of the derived error bounds allows far greater accuracy and flexibility as it accounts also for the past history of the probability mass within specific clusters.
To the best of our knowledge, despite recent attempts [10, 11], the development and use of explicit bounds on the error associated with a clustering procedure are new for the simulation and analysis of Markov chains. The versatility of the method is further enhanced by employing a variety of different metrics to assess the approximation quality. More specifically, we use the following to control the error: (1) the probability distributions in time (namely, the pointwise difference between concrete and abstract distributions), (2) the timewise likelihood of events (\(L_1\) norm and total variation distance), as well as (3) the probability of satisfying a temporal logic specification.
We implement our method in conjunction with uniformisation for the computation of probability distributions of the process in time, as well as time-bounded probabilities (a key procedure for probabilistic model checking against temporal logic specifications), and evaluate it on two case studies of chemical reaction networks. Compared to fast adaptive uniformisation as implemented in PRISM [9], currently the best performing technique in this setting, we demonstrate that our method yields a marked improvement in numerical precision without degrading performance.
Related Work. (Bio)chemical reaction networks can be naturally analysed using discrete stochastic models. Since the discrete state space of these models can be large or even infinite, a number of numerical approaches have been proposed to alleviate the associated state-space explosion problem. For biochemical models with large populations of chemical components, fluid (mean-field) approximation techniques can be applied [5] and extended to approximate higher-order moments [12]: these deterministic approximations lead to a set of ordinary differential equations. In [16], a hybrid method is proposed that captures the dynamics using a discrete stochastic description in the case of small populations and a moment-based deterministic description for large populations. An alternative approach assumes that the transient probabilities can be compactly approximated based on quasi product forms [3]. None of the mentioned methods provides explicit bounds on the approximation accuracy.
A widely studied model reduction method for Markov models is state aggregation based on lumping [6] or (bi)simulation equivalence [4], with the latter notion in its exact [21] or approximate [10] form. In particular, approximate notions of equivalence have led to new abstraction/refinement techniques for the numerical verification of Markov models over finite [11] as well as uncountably infinite state spaces [1, 2, 26]. Related to these techniques, [27] presents an algorithm to approximate probability distributions of a Markov process forward in time, which serves as an inspiration for our adaptive scheme. From the perspective of simulations, adaptive aggregations are discussed in [13], but no error quantification is given: our work differs by developing an adaptive aggregation scheme where a formal error analysis steers the adaptation.
An alternative method to deal with large/infinite state spaces is truncation, where a lower bound on the transient probability distribution of the concrete model is computed, and the total probability mass that is lost due to this truncation is quantified. Such methods include finite state projections [24], sliding window abstractions [18], or fast adaptive uniformisation (FAU) [9, 23]. Apart from truncating the state space by neglecting states with insignificant probability mass, FAU dynamically adapts the uniformisation rate, thus significantly reducing the number of uniformisation steps [30]. The efficiency of the truncation techniques depends on the distribution of the significant part of the probability mass over the states, and may result in poor accuracy if this mass is spread out over a large number of states, or whenever the selected window of states does not align with a property of interest.
Summarising, whilst a number of methods have been devised to study or to simulate complex biochemical models, in most cases a rigorous error analysis is missing [13, 22, 31], or the error analysis cannot be effectively used to obtain accurate bounds on the probability distribution or on the likelihood of events of interest [17].
Structure of this Article. Section 2 introduces the sequential aggregation approach to approximate the transient probability distribution (that is, the distribution over time) of discrete-time Markov chains, and quantifies bounds on the introduced error according to three different metrics. Section 3 applies the aggregation method to temporal logic verification of Markov chains. In Sect. 4, we implement adaptive aggregation for continuous-time Markov chain models of chemical reaction networks, in conjunction with known techniques such as uniformisation and threshold truncation. Finally, Sect. 5 discusses experimental results.
2 Computation of the Transient Probability Distribution
Consider a discrete-time Markov chain (DTMC) given by a tuple \((S, P, L)\), where:
\(S = \{s_1, \ldots , s_n\}\) is the finite state space of size n;

\(P: S \times S \rightarrow [0,1]\) is the transition probability matrix, which is such that \( \forall j \in S: \sum _{i=1}^n P_{ji} = \sum _{i=1}^n P(j,i) = 1\);

\(L: S \rightarrow 2^\varSigma \) is a labelling function, where \(\varSigma \) is a finite alphabet built from a set of atomic propositions.
Sequential Aggregations of the Markov Chain. Consider the finite time interval of interest \([0,1,\ldots , N]\). Divide this interval into a given number (q) of subintervals, namely select \(N_1, N_2, \ldots , N_q: \sum _{i=1}^{q} N_i = N\), and consider the evolution of the model within the corresponding lth interval \([\sum _{i=0}^{l-1} N_i,\sum _{i=0}^{l} N_i]\), for \(l=1, \ldots , q\), where we have set \(N_0 = 0\).
Finally, define, for all \(s \in S\), \(\tilde{\pi }_{k}^l (s) = \pi _k^l (\alpha ^l(s))/\mid A^l(\alpha ^l(s))\mid \) as a (normalised) piecewise constant approximation of the abstract functions \(\pi _k^l\). Functions \(\tilde{\pi }_{k}^l\), being defined over the concrete state space S, will be employed for comparison with the original distribution functions \(\pi _k\). Specifically, for the initial interval \([N_0,N_1]\) (with \(l=1\)), approximate the initial distribution \(\pi _0\) by \(\pi _0^1\) as: \(\forall s \in S^1, \pi _0^1(s) = \sum _{s'\in A^1(s)} \pi _0(s')\). Similarly, we have that \(\forall s \in S, \tilde{\pi }_0^1 (s) = \pi _0^1(\alpha ^1(s))/\mid A^1(\alpha ^1(s))\mid \).
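The aggregation and lifting maps just defined can be sketched in a few lines. The following is an illustrative Python fragment under our own naming (alpha as the state-to-cluster map, aggregate and lift as helper names), not the paper's implementation:

```python
import numpy as np

def aggregate(pi, alpha, n_clusters):
    """pi_agg(j): sum of pi over the concrete states mapped to cluster j."""
    pi_agg = np.zeros(n_clusters)
    np.add.at(pi_agg, alpha, pi)     # unbuffered scatter-add per cluster
    return pi_agg

def lift(pi_agg, alpha):
    """Piecewise-constant lift: spread each cluster's mass uniformly over
    its member states, as in the definition of tilde-pi."""
    sizes = np.bincount(alpha, minlength=len(pi_agg))
    return pi_agg[alpha] / sizes[alpha]

pi = np.array([0.1, 0.2, 0.3, 0.4])
alpha = np.array([0, 0, 1, 1])       # two clusters: {s1, s2} and {s3, s4}
pi_agg = aggregate(pi, alpha, 2)     # cluster masses 0.3 and 0.7
pi_lift = lift(pi_agg, alpha)        # uniform within each cluster
```

Note that lifting after aggregating preserves total probability mass, while the pointwise difference from the original pi is exactly the quantity the error bounds below control.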
Remark 1
Exact and approximate probabilistic bisimulations [10, 21] build a quotient or a cover of the state space of the original model based on matching or approximating the “outgoing probability” from concrete states – for example, exact probabilistic bisimulation compares, for state pairs \((s_1, s_2)\) within a partition, the “outgoing” probabilities \(P(s_1,B)\) and \(P(s_2,B)\) over partitions B. On the other hand, in (2) we approximate the “incoming probability”, as motivated by the approximation of the recursions in (1). \(\Box \)
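To illustrate the "outgoing probability" comparison mentioned in the remark, the following sketch (function and variable names are ours) computes, for each state, the probabilities of jumping into each block, and checks that two states in the same block agree on them, which is the exact-lumpability condition:

```python
import numpy as np

def outgoing_to_blocks(P, alpha, n_blocks):
    """Row s gives the probabilities P(s, B) of jumping into each block B."""
    out = np.zeros((P.shape[0], n_blocks))
    for b in range(n_blocks):
        out[:, b] = P[:, alpha == b].sum(axis=1)
    return out

P = np.array([[0.0, 0.5, 0.5],
              [0.0, 0.5, 0.5],
              [1.0, 0.0, 0.0]])
alpha = np.array([0, 0, 1])                 # blocks {s1, s2} and {s3}
out = outgoing_to_blocks(P, alpha, 2)
# s1 and s2 have identical block-outgoing rows: an exact lumping here
assert np.allclose(out[0], out[1])
```

The incoming-probability approximation used in this paper instead works with column sums over clusters, which is what makes the forward recursion on the aggregated distribution tractable.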
Proposition 1
Remark 2
A few comments on the structure of the error bounds are in order. The overall error is composed of two main contributions, one depending on the error accrued within single aggregation steps, and the other (\(\gamma _{l-1}^{l}(s)\)) depending on the q re-aggregations (that is, an update from the current partition to the next).
The first term of the first contribution further depends on the pointwise error in the distributions initialised at each aggregation, namely, \(\left| \pi _{\sum _{i=0}^{l} N_i} (s) - \tilde{\pi }_{\sum _{i=0}^{l} N_i}^l (s) \right|\): this quantity, discounted by the \(N_l\)th power of the factor c(s) (accounting for contractive or expansive dynamics), builds up recursively to yield the global (over the q aggregation steps) quantity \(c(s)^{N} \left| \pi _{0} (s) - \tilde{\pi }_{0}^1 (s) \right|\). The second term of the first contribution, on the other hand, accounts for the error due to the approximation of the transition probability matrix (terms \(\epsilon ^l\)), averaged over the accrued running distribution functions (factors \(\pi ^l\)).
The intuition on factor c(s) is the following: if the model is “contractive” (in a certain probabilistic sense) towards a point s, the factor c(s) is likely to be greater than one; on the other hand, if the distribution in time is “dispersed,” then it is likely that \(c(s)<1\) over a large subset of the state space. The quantity \(c(s) = P(S,s)\) might be decreased if we work on a subset of S: this might happen with a discretetime chain obtained from a corresponding continuoustime model via FAU [9, 23], or through the interaction of the factor \(c(s), s\in S\), with atomic propositions defined specifically over subsets of the state space S. \(\Box \)
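Since \(c(s) = P(S,s)\) is simply the sth column sum of the transition matrix, it can be inspected directly; the toy matrix below (our own example, not taken from the paper) exhibits both regimes:

```python
import numpy as np

# c(s) = P(S, s): the column sum of the transition matrix, i.e. the total
# probability flowing into state s from anywhere in S.
P = np.array([[0.1, 0.9, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.9, 0.1]])
c = P.sum(axis=0)
# mass concentrates on s2 (c > 1, "contractive" towards it), while it
# disperses away from s1 and s3 (c < 1)
```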
Corollary 1
Consider the same setup as in Proposition 1. A bound for the quantity \(\left\| \pi _{N} - \tilde{\pi }_{N}^q \right\| _{\infty }\) can be obtained from that in Proposition 1 by straightforward adaptation, setting \(c = \max _{s \in S} c(s)\) and \(\gamma _{l-1}^{l} = \max _{s \in S} \gamma _{l-1}^{l}(s), l=1,\ldots ,q\).
Proposition 2
3 Aggregations for Model Checking of TimeBounded Specifications
In Sect. 2, we have introduced a sequential aggregation procedure to approximate the computation of the transient probability distribution of a Markov chain. The derived bounds allow for a comparison of aggregated and concrete models either pointwise, or according to a global measure of the differences in the probability of events over the state space, at a specific point in time. We now show how to employ the aggregation method for quantitative verification against probabilistic temporal logics such as PCTL. We focus on a bounded variant of the probabilistic safety (invariance) property, which corresponds to time-bounded invariance for continuous-time Markov chains.
Proposition 3
Remark 3
We give some intuition regarding the structure of the bounds. The quantity depends on a summation over q aggregation steps. It expresses the accrual of the error incurred over the outgoing probability from the ith partition (term \(\epsilon ^l (i) \)), averaged over the history of the cost function over that partition. Note the symmetry between the shape of the bound and the recursive definition of the quantities of interest. \(\Box \)
4 Quantitative Analysis of Chemical Reaction Networks
A chemical reaction network describes a biochemical system containing M chemical species participating in a number of chemical reactions. The state of a model of the system at time \(t \in \mathbb R^+\) is the vector \(\mathbf {X}(t) = ( X_1(t), X_2(t), \ldots , X_{M}(t) )\), where \(X_i\) denotes the number of molecules of the ith species [15]. Whenever a single reaction occurs the state changes based on the stoichiometry of the corresponding reaction. We use S to denote the set of (discrete) states. Further, for \(s \in S\), \(\pi _t(s)\) denotes the probability \(\mathbb P (\mathbf {X}(t) = s)\). Assuming finite volume and temperature, the model can be interpreted as a continuous-time Markov chain (CTMC) \(C = (S, R)\), where the rate matrix \(R(s,s')\) gives the rate of a transition from state s to \(s'\), and \(\pi _0\) specifies the initial distribution over S. The time evolution of the model is governed by the Chemical Master Equation (CME) [15], namely \(\frac{d}{dt}\pi _t = \pi _t \cdot Q\), where Q is the infinitesimal generator matrix, defined as \(Q(s,s') = R(s,s')\) if \(s \ne s'\), and as \(-\sum _{s'' \ne s}{R(s,s'')}\) otherwise. The exact solution of the CME is in general intractable, which has led to a number of possible numerical approximations [25]. We employ uniformisation [30], which in many cases outperforms other methods and also provides an arbitrary, user-defined approximation precision.
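As a small sanity check of the generator construction, the following fragment (with an illustrative rate matrix of our choosing) builds Q from R and verifies that its rows sum to zero, as required of an infinitesimal generator:

```python
import numpy as np

# Toy rate matrix R for a 3-state CTMC (values are illustrative only).
R = np.array([[0.0, 2.0, 0.0],
              [1.0, 0.0, 3.0],
              [0.0, 4.0, 0.0]])

# Generator: Q(s,s') = R(s,s') off the diagonal, and the negated exit
# rate -sum_{s'' != s} R(s, s'') on the diagonal.
Q = R - np.diag(R.sum(axis=1))
assert np.allclose(Q.sum(axis=1), 0.0)   # rows of a generator sum to zero
```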
Uniformisation is based on a time-discretisation of the CTMC. The distribution \(\pi _t\) is obtained as a sum (over index i) of terms giving the probability that i discrete reaction steps occur up to time t: this probability follows the Poisson distribution \(\gamma _{i,\lambda \cdot t} = e^{-\lambda \cdot t} \cdot \frac{\left( \lambda \cdot t\right) ^i}{i!}\), where the time delay between steps is exponentially distributed with rate \(\lambda \). More formally, \(\pi _t = \sum ^{\infty }_{i=0}{\gamma _{i,\lambda \cdot t} \cdot \pi _0 \cdot \tilde{Q}^i} \approx \sum ^{N}_{i=0}{\gamma _{i,\lambda \cdot t} \cdot \pi _0 \cdot \tilde{Q}^i}\), where \(\tilde{Q}\) is the uniformised infinitesimal generator matrix defined using terms \(\frac{R(s,s')}{\lambda }\), and where the uniformisation constant \(\lambda \) is equal to the maximal exit rate \(\max _{s \in S} \sum _{s'' \ne s}{R(s,s'')}\). Although the sum is in general infinite, for a given precision an upper bound N can be estimated using techniques in [14], which also allow for efficient computation of the Poisson probabilities \(\gamma _{i,\lambda \cdot t}\).
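A minimal sketch of the truncated uniformisation sum, with the Poisson weights computed iteratively, might look as follows (illustrative code; a real implementation would pick N for a target precision via the techniques of [14]):

```python
import math
import numpy as np

def uniformise(Q, pi0, t, N):
    """Truncated uniformisation: pi_t ~= sum_{i<=N} gamma_i * pi0 * Qtilde^i."""
    lam = np.max(-np.diag(Q))              # uniformisation constant: maximal exit rate
    Qtilde = np.eye(len(Q)) + Q / lam      # uniformised (stochastic) matrix
    w = math.exp(-lam * t)                 # Poisson weight gamma_{0, lam*t}
    pi, term = w * pi0, pi0.copy()
    for i in range(1, N + 1):
        term = term @ Qtilde               # pi0 * Qtilde^i
        w *= lam * t / i                   # gamma_i = gamma_{i-1} * (lam*t)/i
        pi += w * term
    return pi

Q = np.array([[-2.0, 2.0], [1.0, -1.0]])
pi0 = np.array([1.0, 0.0])
pi_t = uniformise(Q, pi0, t=1.0, N=100)
```

For this two-state chain the result can be checked against the closed-form transient solution, and the truncated sum conserves probability mass up to the neglected Poisson tail.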
For complex models with very large or possibly infinite state spaces, the above numerical approximations are computationally infeasible, and are typically combined with (dynamical) state-space truncation methods, such as finite state projection [24], sliding window abstraction [18], or fast adaptive uniformisation [9, 23] (FAU). The key idea of these truncation methods is to restrict the analysis of the model to a subset of states containing significant probability mass. One can easily compute the probability lost at each uniformisation step and thus obtain the total probability lost by truncation. As such, these truncation methods provide a lower bound on the quantities \(\pi _t\), and the quantified probability lost can be used to derive a (rather conservative) upper bound on the approximation error: the sum of the lower bound and the probability lost gives an upper bound for the pointwise error. Moreover, a (pessimistic) bound on the \(L_1\)-norm over a general subset of the state space is obtained by multiplying the probability lost by the number of states in the concrete subset.
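A single truncation step of this kind can be sketched as follows (a simplified illustration of the idea, not the actual FAU implementation):

```python
import numpy as np

def truncate(pi, threshold):
    """Drop states whose probability is below the threshold and record
    the total probability mass lost in doing so."""
    keep = pi >= threshold
    lost = pi[~keep].sum()
    pi_trunc = np.where(keep, pi, 0.0)
    return pi_trunc, lost

pi = np.array([0.5, 0.3, 0.15, 0.04, 0.01])
pi_trunc, lost = truncate(pi, 0.05)
# pi_trunc is a pointwise lower bound on pi, and pi_trunc + lost is a
# (conservative) pointwise upper bound, as described in the text
assert np.all(pi_trunc <= pi)
assert np.all(pi <= pi_trunc + lost + 1e-12)
```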
Adaptive Aggregation for CTMC Models of Chemical Reaction Networks. The aggregation methods in the previous sections can be directly applied to uniformised CTMCs, such as those arising from chemical reaction networks. We now discuss how the aggregation unfolds sequentially in time and how the derived error bounds can be used for the aggregation method in this setting.
Recall from Eq. (2) that the derivation of the error bounds for the aggregation procedure requires a finite state space: for infinite-state CTMCs, the aggregation method can be combined with state-space truncation (alongside time uniformisation), in order to accelerate computations in cases where the set of significant states is still too large. On the other hand, for finite-state CTMC models, adaptive aggregations can be regarded as a strategy orthogonal to truncation, and can be directly applied in conjunction with time uniformisation. In order to compare the precision and reduction capability of our method to that of FAU, we thus assume that the population of each species is bounded, which ensures fairness of the experimental evaluation.
The history-dependent strategy is based on the available history contributing to the shape of the derived errors: for the lth aggregation step and the given ith cluster of the current partition, it tracks the sum of the errors accumulated in the interval \([\sum _{i=0}^{l-1} N_i, \sum _{i=0}^{l-1} N_i + k]\) for \( k=1,\ldots , N_l\), according to the explicit bounds derived in Sect. 2 (line 4 of Algorithm 1). At each step k, the obtained value (averaged over the k steps) reflects the (averaged) error accrual for each cluster (array AccumErrors) and is used to drive the partitioning procedure.
The function \(\mathsf {checkAggregation}\) determines (using AccumErrors) if the current clustering meets the desired threshold, or if a refinement is desirable: during reclustering, a locally coarser abstraction may also be suggested by merging clusters. The function \(\mathsf {Recluster}\) provides the new clustering based on the error bounds, which are functions of AccumErrors, of the local contributions \(\epsilon ^l\), and of the (history of the) distribution \(\pi _k^l\) (or of the cost \(V_k^l\) in the case of safety verification). In contrast to the adaptive method presented in [13], which is based exclusively on local heuristics, our strategy closely reflects the shape of the derived, history-dependent error bounds. Note that the aggregation strategy applied to chemical reaction networks aligns well with the known structure of the underlying CTMCs. In particular, the state-space clustering exploits the spatial locality of the distribution of transitions in the M-dimensional space [13, 31] (M is the number of chemical species), which usually leads to relatively uniform probability mass over adjacent states and thus to strategies that cluster neighbouring states.
A simpler reclustering strategy (denoted in the experiments as local) employs at each uniformisation step k only the product of the local error \(\epsilon ^l\) with the probability distribution \(\pi _k^l\) (or with the cost function \(V_k^l\)). In other words, a local reclustering is performed if the local error depending on \(\epsilon ^l \pi _k^l\) (respectively, on \(\epsilon ^l V_k^l\)) is above a given threshold. This intuitive scheme is similar to the local heuristic employed in [13].
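The two triggers can be contrasted schematically as follows (names and thresholds are ours; the paper's Algorithm 1 is not reproduced here). The local trigger reacts to the instantaneous quantity \(\epsilon^l \pi_k^l\) only, while the history-based trigger accumulates it over the steps since the last reclustering, mirroring AccumErrors:

```python
import numpy as np

def local_trigger(eps, pi_k, threshold):
    """Recluster if the instantaneous per-cluster error eps * pi_k is large."""
    return np.any(eps * pi_k > threshold)

class HistoryTrigger:
    """Recluster if the error accumulated since the last reclustering,
    averaged over the k steps taken, exceeds the threshold."""
    def __init__(self, threshold):
        self.threshold, self.acc, self.k = threshold, None, 0

    def step(self, eps, pi_k):
        self.k += 1
        contrib = eps * pi_k
        self.acc = contrib if self.acc is None else self.acc + contrib
        return np.any(self.acc / self.k > self.threshold)
```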
We will show that the history-based strategy is more flexible with respect to the required precision and aggregation size. Our experiments confirm that, while based on error bounds that overestimate the actual empirical error incurred in the aggregation, the history-based strategy tends to outperform the more intuitive and simpler local strategy with respect to key performance metrics affecting the practical use of the adaptive aggregation. This shows that the computed errors not only serve as a means to certify the accuracy of approximation, but can also be used to effectively drive the aggregation procedure. In particular, the metrics we are interested in are: (1) the value of avg, representing the state-space reduction; (2) the accuracy of the empirical results of the abstract model; (3) the total number of reclusterings; and (4) the actual value of the error bounds (compared to the empirical errors).
The number of reclusterings (denoted by q) is crucial for the performance of the overall scheme, since each reclustering requires \(\mathcal {O}(|S|+|P|)\) steps, comparable to performing a few uniformisation steps on the concrete model. As such, the number of reclusterings should be significantly smaller than the total number of uniformisation steps. Therefore, in our experiments we use thresholds that favour fewer reclusterings over coarser abstractions. Finally, note that the adaptive aggregation scheme can be combined with adaptive uniformisation as well as with dynamic state-space truncation [9, 23, 30], which updates the uniformisation constant \(\lambda \) for different time intervals, thus decreasing the overall number of uniformisation steps N.
Illustrative Example. We resort to a two-dimensional discrete Lotka-Volterra “predator-prey” model [15] to illustrate the history-dependent aggregation strategy. The maximal population of each species is bounded by 2000, thus the concrete model has 4M states. The initial population is set to 200 predators and 400 prey.
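Under standard Lotka-Volterra reaction kinetics (prey reproduction, predation, predator death), the enabled transitions from a state can be enumerated as below; the rate constants are placeholders of our choosing, since the paper does not list its parameter values:

```python
# Placeholder rate constants (illustrative, not the paper's parameters).
A, B, C = 1.0, 0.005, 0.6

def transitions(x, y, cap=2000):
    """Enabled reactions from state (x, y) = (#prey, #predators),
    with each species' population capped as in the bounded model."""
    out = []
    if x < cap:
        out.append(((x + 1, y), A * x))            # prey reproduction
    if x > 0 and y < cap:
        out.append(((x - 1, y + 1), B * x * y))    # predation
    if y > 0:
        out.append(((x, y - 1), C * y))            # predator death
    return out

succ = transitions(400, 200)   # successors from the initial state
```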
5 Experimental Evaluation on Two Case Studies
We have developed a prototype implementation of the adaptive aggregation for the quantitative analysis of chemical reaction networks modelled in PRISM [20]. We have evaluated the scheme on two case studies in comparison with FAU [9] as implemented in the explicit engine of PRISM. In order to ensure comparability between the two schemes, which employ different data structures, rather than measuring execution time we have assessed performance based on measures that are independent of the implementation, specifically the metrics (1)–(4) introduced in Sect. 4 (model reduction, empirical accuracy, number of reclusterings, and formal error bounds). For the same reason, we have not incorporated heuristics such as varying the maximal cluster size, optimally selecting error thresholds, or using advanced clustering methods, which could be employed to further optimise the adaptive scheme.
We run all experiments on a MacBook Air with a 1.8 GHz Intel Core i5 and 4 GB of 1600 MHz RAM. As expected, for comparable state-space reductions (value avg), FAU can be faster, but within the same order of magnitude as our prototype, due to the overhead of clustering and to adaptive uniformisation not being fully integrated in our implementation.
Recall that FAU eliminates states with incoming probability lower than a defined threshold, and as such leads to an under-approximation of the concrete probability distribution with no tailored error bounds: all we can say is that, pointwise, the concrete transient probability distribution resides between this under-approximation and the value obtained by adding the total probability lost, and similarly for the invariance likelihood.
We first evaluate the adaptive aggregation scheme over the verification of an invariant property with associated small likelihood: in this scenario dynamic truncation techniques such as FAU provide insufficient approximation precision. We compute the probability that the population of Rp stays below the level 15 for \(t = 5\) s (a relevant time window due to the fast-scale phosphorylation). The results for the new, less restrictive population bounds [5, 55] are reported in Fig. 2. We present empirical satisfaction probabilities (“Empirical”) and their formal bounds (“Bound”) computed using Proposition 3 for the adaptive aggregation scheme, and lower bounds and probability lost for the FAU algorithm. For both schemes we report the obtained state-space size avg. We can observe a clear relationship between the state-space reduction and the precision of the analysis. For adaptive aggregations, the parametrisation of each strategy is denoted by an index (1, 2, 3) representing the thresholds affecting the precision. Note that the parametrisation of the history-based aggregation, in contrast to the local strategy, allows us to obtain the user-defined precision (e.g. in this experiment, for the history-dependent strategy, index 1 denotes a restriction of the bounds to 5E-11, index 2 to 5E-12, and index 3 to 5E-14), since the aggregation employs exactly these errors. The results also demonstrate that the history-based strategy significantly outperforms the local strategy in all four key performance metrics.
Next, we employ this example to compare the computation of the \(L_1\) norm of the probability distribution at time \(t=5\) s. The table in Fig. 3 depicts the results for the \(L_1\) norm over the whole state space, whereas the table in Fig. 4 depicts the results for the \(L_1\) norm over a certain subset of interest. The formal bounds for the adaptive scheme (column “Bound” in Fig. 3) have been computed using Proposition 2, whilst the corresponding bounds for Fig. 4 (middle part) have been obtained as the sum of the pointwise errors, defined in Proposition 1, over the subset of interest. The upper part of the tables corresponds to the population bounds [25, 35] (as in [7]), whereas the lower part to the less restrictive bounds [5, 55]. Compared to the local strategy, the history-based aggregation again provides better performance, namely it requires significantly (up to ten times) smaller numbers of reclusterings (“Reclust.”): we thus present the results only for the history-dependent strategy. We ensure the comparability of the two outcomes by empirically selecting the threshold for FAU to obtain a truncated model of size (avg) similar to that resulting from our technique. Note that, in the case of the \(L_1\) norm over the whole state space, the probability lost reported by FAU provides the safe bound on the \(L_1\) norm and is equal to the empirical error between the concrete and truncated probability distributions. However, in the case of the \(L_1\) norm over a general subset of the state space, the probability lost has to be multiplied by the cardinality of the subset to obtain the correct formal bounds. Such bounds are reported in Fig. 4 (right part) as “Bound,” whereas the empirical error between the distributions over the subset is depicted as “Empirical”.
Summarising Figs. 3 and 4, when requiring a tight bound for the smaller state space (population [25, 35]), neither approach leads to more than a twofold reduction in the size of the space. This suggests a limit on the possible state-space reduction resulting from the model dynamics. However, for the larger model (population [5, 55]), up to a sevenfold reduction can be obtained using adaptive aggregation. We can see that FAU outperforms the adaptive aggregation scheme in the case of the \(L_1\) norm error over the whole state space (where it leads to a nineteenfold reduction) but, in contrast to our approach, is not able to provide useful bounds for a general \(L_1\) norm (especially for the larger model).
In contrast with the previous case study that focused on events with very small likelihood, we now discuss results for events with nonnegligible likelihood. Figure 5 reports basic statistics on the computation of the \(L_1\) norm over a certain subset of the state space at time \(t=1000\) s. Providing useful error bounds on the \(L_1\) norm (computed from the pointwise errors in Proposition 1), the adaptive aggregation leads to almost a tenfold state space reduction for the smaller model (1.2M vs 127K) and a fifteenfold reduction for the larger model (4.4M vs 287K). Due to the large cardinality of the subset of interest, FAU fails to provide any informative formal bounds. Note that in this case study the adaptive aggregation scheme also provides better empirical bounds than FAU.
Finally, we have evaluated both approaches on an invariant property (the population of a species stays below the level 10, for 1000 s) with a significant satisfaction probability (more than \(15\,\%\) and \(20\,\%\) on the small and large model, respectively). We observe that this choice is favourable to FAU, since for invariant properties with high likelihood the state space truncated via FAU is aligned with the property of interest, and thus the lost probability mass is slightly smaller than the error introduced by the state-space aggregations. In this scenario FAU yields better reductions than the adaptive aggregation scheme (especially for the larger model), while providing similar error bounds, since it is able to successfully identify the relevant part of the state space. This scenario, advantageous to FAU, is in contrast to that discussed in Fig. 2, as well as to the general case where, for an arbitrary model, it is not known how the probability mass is distributed in relation to the states satisfying the property of interest.
6 Conclusions
We have proposed a novel adaptive aggregation algorithm for approximating the probability of an event in a Markov chain with rigorous precision guarantees. Our approach provides error bounds that are in general orders of magnitude tighter than those obtained from fast adaptive uniformisation, and significantly reduces the size of models without performance degradation. This has allowed us to efficiently analyse larger and more complex models. Future work will include effective combinations of the adaptive aggregation with robustness analysis and parameter synthesis. We also plan to apply our approach to the verification and performance analysis of complex safety-critical computer systems, where precision guarantees play a key role.
Footnotes
1. For the sake of simplicity, we shall often loosely identify the set Sat\(({\varPhi })\) with the label \({\varPhi }\).
References
1. Abate, A., Katoen, J.P., Lygeros, J., Prandini, M.: Approximate model checking of stochastic hybrid systems. Eur. J. Control 16, 624–641 (2010)
2. Abate, A., Kwiatkowska, M., Norman, G., Parker, D.: Probabilistic model checking of labelled Markov processes via finite approximate bisimulations. In: van Breugel, F., Kashefi, E., Palamidessi, C., Rutten, J. (eds.) Horizons of the Mind. LNCS, vol. 8464, pp. 40–58. Springer, Heidelberg (2014)
3. Angius, A., Horváth, A., Wolf, V.: Quasi product-form approximation for Markov models of reaction networks. In: Priami, C., Petre, I., de Vink, E. (eds.) Transactions on Computational Systems Biology XIV. LNCS, vol. 7625, pp. 26–52. Springer, Heidelberg (2012)
4. Baier, C., Katoen, J.P.: Principles of Model Checking. The MIT Press, Cambridge (2008)
5. Bortolussi, L., Hillston, J.: Fluid model checking. In: Koutny, M., Ulidowski, I. (eds.) CONCUR 2012. LNCS, vol. 7454, pp. 333–347. Springer, Heidelberg (2012)
6. Buchholz, P.: Exact performance equivalence: an equivalence relation for stochastic automata. Theor. Comput. Sci. 215(1–2), 263–287 (1999)
7. Česka, M., Šafránek, D., Dražan, S., Brim, L.: Robustness analysis of stochastic biochemical systems. PLoS One 9(4), e94553 (2014)
8. Chen, T., Kiefer, S.: On the total variation distance of labelled Markov chains. In: Computer Science Logic (CSL) and Logic in Computer Science (LICS) (2014)
9. Dannenberg, F., Hahn, E.M., Kwiatkowska, M.: Computing cumulative rewards using fast adaptive uniformisation. ACM Trans. Model. Comput. Simul., Spec. Issue on Computational Methods in Systems Biology (CMSB) 25, 9 (2015)
10. Desharnais, J., Laviolette, F., Tracol, M.: Approximate analysis of probabilistic processes: logic, simulation and games. In: Quantitative Evaluation of Systems (QEST), pp. 264–273 (2008)
11. D'Innocenzo, A., Abate, A., Katoen, J.P.: Robust PCTL model checking. In: Hybrid Systems: Computation and Control (HSCC), pp. 275–285. ACM (2012)
12. Engblom, S.: Computing the moments of high dimensional solutions of the master equation. Appl. Math. Comput. 180(2), 498–515 (2006)
13. Ferm, L., Lötstedt, P.: Adaptive solution of the master equation in low dimensions. Appl. Numer. Math. 59(1), 187–204 (2009)
14. Fox, B.L., Glynn, P.W.: Computing Poisson probabilities. Commun. ACM 31(4), 440–445 (1988)
15. Gillespie, D.T.: Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81(25), 2340–2361 (1977)
16. Hasenauer, J., Wolf, V., Kazeroonian, A., Theis, F.: Method of conditional moments (MCM) for the chemical master equation. J. Math. Biol. 69(3), 687–735 (2014)
17. Hegland, M., Burden, C., Santoso, L., MacNamara, S., Booth, H.: A solver for the stochastic master equation applied to gene regulatory networks. J. Comput. Appl. Math. 205(2), 708–724 (2007)
18. Henzinger, T.A., Mateescu, M., Wolf, V.: Sliding window abstraction for infinite Markov chains. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 337–352. Springer, Heidelberg (2009)
19. Kierzek, A.M., Zaim, J., Zielenkiewicz, P.: The effect of transcription and translation initiation frequencies on the stochastic fluctuations in prokaryotic gene expression. J. Biol. Chem. 276(11), 8165–8172 (2001)
20. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer, Heidelberg (2011)
21. Larsen, K.G., Skou, A.: Bisimulation through probabilistic testing. Inf. Comput. 94(1), 1–28 (1991)
22. Madsen, C., Myers, C., Roehner, N., Winstead, C., Zhang, Z.: Utilizing stochastic model checking to analyze genetic circuits. In: Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), pp. 379–386. IEEE Computer Society (2012)
23. Mateescu, M., Wolf, V., Didier, F., Henzinger, T.A.: Fast adaptive uniformization of the chemical master equation. IET Syst. Biol. 4(6), 441–452 (2010)
24. Munsky, B., Khammash, M.: The finite state projection algorithm for the solution of the chemical master equation. J. Chem. Phys. 124, 044104 (2006)
25. Sidje, R., Stewart, W.: A numerical study of large sparse matrix exponentials arising in Markov chains. Comput. Stat. Data Anal. 29(3), 345–368 (1999)
26. Soudjani, S.E.Z., Abate, A.: Adaptive and sequential gridding procedures for the abstraction and verification of stochastic processes. SIAM J. Appl. Dyn. Syst. 12(2), 921–956 (2013)
27. Soudjani, S.E.Z., Abate, A.: Precise approximations of the probability distribution of a Markov process in time: an application to probabilistic invariance. In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014 (ETAPS). LNCS, vol. 8413, pp. 547–561. Springer, Heidelberg (2014)
28. Steuer, R., Waldherr, S., Sourjik, V., Kollmann, M.: Robust signal processing in living cells. PLoS Comput. Biol. 7(11), e1002218 (2011)
29. Tkachev, I., Abate, A.: On approximation metrics for linear temporal model-checking of stochastic systems. In: Hybrid Systems: Computation and Control (HSCC), pp. 193–202. ACM (2014)
30. van Moorsel, A.P., Sanders, W.H.: Adaptive uniformization. Stoch. Models 10(3), 619–647 (1994)
31. Zhang, J., Watson, L.T., Cao, Y.: Adaptive aggregation method for the chemical master equation. Int. J. Comput. Biol. Drug Des. 2(2), 134–148 (2009)