FACT: A Probabilistic Model Checker for Formal Verification with Confidence Intervals
Abstract
We introduce FACT, a probabilistic model checker that computes confidence intervals for the evaluated properties of Markov chains with unknown transition probabilities when observations of these transitions are available. FACT is unaffected by the unquantified estimation errors generated by the use of point probability estimates, a common practice that limits the applicability of quantitative verification. As such, FACT can prevent invalid decisions in the construction and analysis of systems, and extends the applicability of quantitative verification to domains in which unknown estimation errors are unacceptable.
1 Introduction
The development of quantitative verification [8, 11] over the past fifteen years represents one of the most prominent recent advances in system modelling and analysis. Given a Markov model that captures relevant states of a system and the probabilities or rates of transition between these states, the technique can evaluate key reliability and performance properties of the system. This capability and the emergence of efficient probabilistic model checkers such as PRISM [10] and MRMC [9] have led to adoption in a wide range of applications [14].
Despite the success of quantitative verification, the usefulness of its results depends on the accuracy of the analysed models. Obtaining accurate Markov models is difficult. Although model states and transitions are typically easy to identify (e.g., through static code analysis for software systems), transition probabilities and rates need to be estimated. The common practice is to obtain these estimates through model fitting to log data or runtime observations [4, 15], or from domain experts. In either case, the values used in the analysed models contain estimation errors. These errors are then propagated and may be amplified by quantitative verification (since Markov models are nonlinear), producing imprecise results that can lead to invalid design or verification conclusions.
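To see how a small estimation error can be amplified by the nonlinear structure of a Markov model, consider a minimal sketch in Python; the model and the numbers are invented for illustration and are not taken from FACT. A chain retries an operation until it either succeeds (probability p per attempt) or fails (probability q per attempt), so the probability of eventually reaching the success state is p/(p+q).

```python
def p_reach_success(p_succ, p_fail):
    # Probability of eventually reaching the success state in a chain that
    # retries until success (prob. p_succ per attempt) or failure (p_fail):
    # summing the geometric series over retries gives p_succ / (p_succ + p_fail).
    return p_succ / (p_succ + p_fail)

# True transition probabilities vs. a point estimate of p_fail that is
# off by only 0.001 in absolute terms (both values are illustrative).
true_val = p_reach_success(0.01, 0.001)  # = 10/11, about 0.909
est_val = p_reach_success(0.01, 0.002)   # = 5/6,  about 0.833

# An absolute input error of 0.001 shifts the verified property by about
# 0.076 -- an amplification that point-estimate verification leaves unquantified.
```

A verification decision based on the point estimate (e.g., "reliability exceeds 0.9") would here be invalidated by an input error far smaller than typical sampling noise.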
2 Formal Verification with Confidence Intervals
FACT PMCs can have multiple sets of parameters (2). For example, the outgoing transitions from state ‘\(\mathsf {s}=2\)’ in Fig. 1a could be associated with unknown probabilities \(\mathsf {pRetry}_1\) and \(\mathsf {pRetry}_2\). The only constraint is that the different sets of parameters (2) are statistically independent. This constraint is satisfied by a broad class of PMCs that includes, for instance, all the models used in the case studies of the PROPhESY tool [5] for analysing parametric Markov chains.
3 Using FACT
 1.
If the number of state-transition observations associated with parameter set j is larger than that for parameter set i, this choice of confidence levels may produce much narrower confidence intervals for parameter set i, with only an insignificant widening of the confidence intervals for parameter set j;
 2.
If the analysed property is particularly sensitive to variations in the parameter set i, reducing \(\alpha _i\) narrows the confidence intervals for parameter set i and may also narrow the \(\alpha \) confidence interval for the analysed property.
Therefore, step 4 uses a confidence interval optimisation heuristic to seek alternative confidence levels \(\alpha _1\), \(\alpha _2\), ..., \(\alpha _m\) such that \(\prod _{i=1}^m \alpha _i = \alpha \) and using \(\alpha _i\) confidence intervals for the ith parameter set, \(1\le i\le m\), produces a narrower \(\alpha \) confidence interval for the analysed property. This optimisation can reduce the width of property confidence intervals (e.g., by up to 14% in the case studies from [2]), but is time-consuming, since FACT steps 2 and 3 are repeated for each combination of \(\alpha _1\), \(\alpha _2\), ..., \(\alpha _m\) suggested by the heuristic. Step 4 is therefore switched off by default, and the user must enable it explicitly when needed. One typical scenario in which this need arises is when FACT is used to verify whether the analysed property is above or below a threshold specified in the system requirements (with some confidence level \(\alpha \)), and the threshold falls inside the \(\alpha \) confidence interval computed without the heuristic search. In this scenario, the user should enable the heuristic search by specifying a nonzero number of search iterations, which may yield a narrower \(\alpha \) confidence interval that no longer contains the threshold, enabling a conclusion to be drawn.
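The heuristic search over confidence-level allocations can be sketched as a simple hill climber that works in log space, so the constraint \(\prod _{i=1}^m \alpha _i = \alpha \) becomes a sum constraint preserved exactly by every move. The function names and the move scheme below are illustrative, not FACT's implementation; `ci_width` stands for a black-box call to FACT steps 2–3.

```python
import math
import random

def hill_climb_alphas(alpha, m, ci_width, iters=100, step=0.02, seed=0):
    """Hill-climb over per-parameter confidence levels a_1..a_m with
    prod(a_i) == alpha, minimising the property CI width returned by the
    black-box ci_width(levels) (a stand-in for FACT steps 2-3)."""
    rng = random.Random(seed)
    # Start from the uniform split a_i = alpha**(1/m); work with log a_i so
    # the product constraint becomes a sum constraint.
    best = [math.log(alpha) / m] * m
    best_w = ci_width([math.exp(x) for x in best])
    for _ in range(iters):
        i, j = rng.sample(range(m), 2)
        cand = list(best)
        d = rng.uniform(-step, step)
        cand[i] += d          # shift confidence "mass" from one parameter
        cand[j] -= d          # set to another; the sum of logs is unchanged
        levels = [math.exp(x) for x in cand]
        if not all(0.0 < a < 1.0 for a in levels):
            continue          # keep every level a valid confidence level
        w = ci_width(levels)
        if w < best_w:        # greedy acceptance: keep only improvements
            best, best_w = cand, w
    return [math.exp(x) for x in best], best_w
```

Because the climber starts from the uniform split and accepts only improving moves, it can never return a wider property interval than the default allocation.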
4 Architecture and Implementation
 1.
The parametric quantitative verification engine is implemented on top of PRISM [10], which it invokes in the background. An alternative implementation based on PARAM [6] is worth exploring.
 2.
The simultaneous confidence interval calculator implements the (conservative) solution proposed by Kwong and Iglewicz [12], which achieves a good trade-off between computational complexity and precision. Several alternative solutions that deserve investigation are mentioned in [2].
 3.
The convex optimisation engine uses the MATLAB convex optimisation toolbox YALMIP [13], which it invokes in the background. An implementation based on the non-commercial GNU Octave package (https://www.gnu.org/software/octave/) is worth exploring.
 4.
The confidence interval optimisation heuristic currently used is hill climbing. Numerous alternative heuristics can be substituted in this module.
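As a rough illustration of what the simultaneous confidence interval calculator has to produce, the sketch below combines per-probability Wilson score intervals with a Bonferroni correction. This is a simpler and more conservative construction than the Kwong–Iglewicz solution FACT implements, and is shown only to make the interface concrete; all names are illustrative.

```python
import math
from statistics import NormalDist

def wilson_ci(k, n, conf):
    """Wilson score interval for a binomial proportion with k successes
    out of n observations, at confidence level conf."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    centre = (k + z * z / 2) / (n + z * z)
    half = z * math.sqrt(k * (n - k) / n + z * z / 4) / (n + z * z)
    return centre - half, centre + half

def simultaneous_cis(counts, conf):
    """Bonferroni-style simultaneous intervals for the outgoing transition
    probabilities of one state, given observed transition counts: each of
    the d proportions gets individual level 1 - (1 - conf) / d, so the
    joint coverage is at least conf (conservative)."""
    n, d = sum(counts), len(counts)
    each = 1 - (1 - conf) / d
    return [wilson_ci(k, n, each) for k in counts]
```

For example, `simultaneous_cis([90, 8, 2], 0.95)` yields three intervals that jointly cover the true outgoing probabilities of a state observed 100 times, with confidence at least 95%.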
Experimental results for the case studies from Sect. 5

| PMC | \( psets ^\mathsf {a}\) | \( params ^\mathsf {b}\) | PCTL property | \(t_{ exp }^{\mathsf {c}}\) | \(t_{ CI }^{\mathsf {d}}\) |
| --- | --- | --- | --- | --- | --- |
| Web | 5 | 13 | \(\mathcal {P}_{=?} [ \mathrm {F}\ \textsf {HttpResponse}]\) | 0.75s | 3.96s |
|  |  |  | \(\mathcal {P}_{=?} [ \lnot (\textsf {Database} \vee \textsf {FileServer}) \;\mathrm {U}\ \textsf {HttpResponse}]\) | 0.84s | 3.43s |
|  |  |  | \(\mathcal {R}_{=?}^{\mathsf {cost}} [ \mathrm {F}\; \textsf {Done}]\) | 0.86s | 3.31s |
|  |  |  | \(\mathcal {R}_{=?}^{\mathsf {time}} [ \mathrm {F}\; \textsf {Done}]\) | 0.89s | 3.29s |
| TAS | 3 | 6 | \(\mathcal {P}_{=?} [ \mathrm {F}\ \textsf {FailedAlarm}]\) | 0.24s | 4.32s |
|  |  |  | \(\mathcal {P}_{=?} [\lnot \textsf {Done} \;\mathrm {U}\ \textsf {FailedService}]\) | 0.12s | 2.82s |
|  |  |  | \(\mathcal {P}_{=?} [\lnot \textsf {Done} \;\mathrm {U}\ \textsf {FailedAlarm} \{\textsf {MedicalAnalysis}\}]\) | 0.11s | 2.78s |
| LWB | 1 | 2 | \(\mathcal {R}_{=?}^{\mathsf {power}} [ \mathrm {S}]\) | 0.24s | 3.03s |
|  |  |  | \(\mathcal {R}_{=?}^{\mathsf {energy}} [ \mathrm {F}\; \textsf {StartedUp}]\) | 0.27s | 2.98s |
| BRP | 2 | 4 | \(\mathcal {P}_{=?} [ \mathrm {F}\ \textsf {SenderNoSuccessReport}]\) | 0.44s | 31.6s |
| Z | 2 | 4 | \(\mathcal {R}_{=?}^\mathsf {numTests} [ \mathrm {F}\; \textsf {DecisionMade}]\) | 0.15s | 5.41s |
5 Case Studies and Experimental Results
FACT was evaluated using five case studies:

- a web application taken from [2] (Web);
- a tele-assistance service-based system adapted from [3, 4] (TAS);
- the low-power wireless bus communication protocol taken from [2] (LWB);
- the bounded retransmission protocol from the PROPhESY [5] site (BRP);
- the Zeroconf IP address selection protocol from the PARAM [6] website (Z).
The timing results were obtained on a standard OS X 10.8.5 MacBook computer with a 1.3 GHz Intel Core i5 processor and 8 GB of 1600 MHz DDR3 RAM. The models, PCTL property files, results and descriptions for all case studies are available on the FACT website at http://www-users.cs.york.ac.uk/~cap/FACT.
These case studies demonstrated several key benefits of our probabilistic model checker. First, FACT supports the analysis of systems for which state transition probabilities are unknown, but observations of these transitions are available from logs or runtime monitoring. Second, it enables the analysis of reliability, performance and other non-functional properties of systems at the required confidence level. This approach is better aligned with current industrial practice than traditional quantitative verification. Third, it can prevent invalid design and verification decisions. In many scenarios, the quantitative analysis of Markov models built using point estimates of the unknown transition probabilities misleadingly suggested that requirements were met. In contrast, FACT showed that this was only the case at low confidence levels that are typically deemed unacceptable in practice. Last but not least, our case studies showed that FACT can be used to analyse systems from multiple domains.
References
 1. Andova, S., Hermanns, H., Katoen, J.P.: Discrete-time rewards model-checked. In: FORMATS 2003. LNCS, vol. 2791, pp. 88–104. Springer, Heidelberg (2003)
 2. Calinescu, R., Ghezzi, C., Johnson, K., Pezze, M., Rafiq, Y., Tamburrelli, G.: Formal verification with confidence intervals to establish quality of service properties of software systems. IEEE Trans. Reliab. PP(99), 1–16 (2015)
 3. Calinescu, R., Johnson, K., Rafiq, Y.: Developing self-verifying service-based systems. In: ASE 2013, pp. 734–737 (2013)
 4. Calinescu, R., Rafiq, Y., Johnson, K., Bakir, M.E.: Adaptive model learning for continual verification of non-functional properties. In: ICPE 2014, pp. 87–98 (2014)
 5. Dehnert, C., Junges, S., Jansen, N., Corzilius, F., Volk, M., Bruintjes, H., Katoen, J.P., Ábrahám, E.: PROPhESY: a PRObabilistic ParamEter SYnthesis tool. In: Kroening, D., Păsăreanu, C.S. (eds.) CAV 2015. LNCS, vol. 9206, pp. 214–231. Springer, Heidelberg (2015)
 6. Hahn, E.M., Hermanns, H., Wachter, B., Zhang, L.: PARAM: a model checker for parametric Markov models. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 660–664. Springer, Heidelberg (2010)
 7. Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal Aspects Comput. 6(5), 512–535 (1994)
 8. Haverkort, B.R., Katoen, J.P., Larsen, K.G.: Quantitative verification in practice. In: Margaria, T., Steffen, B. (eds.) ISoLA 2010, Part II. LNCS, vol. 6416, p. 127. Springer, Heidelberg (2010)
 9. Katoen, J.P., Zapreev, I.S., Hahn, E.M., Hermanns, H., Jansen, D.N.: The ins and outs of the probabilistic model checker MRMC. Perform. Eval. 68(2), 90–104 (2011)
 10. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer, Heidelberg (2011)
 11. Kwiatkowska, M.Z.: Quantitative verification: models, techniques and tools. In: ESEC/FSE 2007, pp. 449–458 (2007)
 12. Kwong, K.S., Iglewicz, B.: On singular multivariate normal distribution and its applications. Comput. Stat. Data Anal. 22(3), 271–285 (1996)
 13. Löfberg, J.: Automatic robust convex programming. Optim. Methods Softw. 27(1), 115–129 (2012)
 14. Norman, G., Parker, D.: Quantitative verification: formal guarantees for timeliness, reliability and performance. Technical report, London Mathematical Society and the Smith Institute for Industrial Mathematics and System Engineering (2014)
 15. Su, G., Rosenblum, D.S.: Asymptotic bounds for quantitative verification of perturbed probabilistic systems. In: Groves, L., Sun, J. (eds.) ICFEM 2013. LNCS, vol. 8144, pp. 297–312. Springer, Heidelberg (2013)