FIG: The Finite Improbability Generator

This paper introduces the statistical model checker FIGV, that estimates transient and steady-state reachability properties in stochastic automata. This software tool specialises in Rare Event Simulation via importance splitting, and implements the algorithms RESTART and Fixed Effort. FIG is push-button automatic since the user need not define an importance function: this function is derived from the model specification plus the property query. The tool operates with Input/Output Stochastic Automata with Urgency, aka IOSA models, described either in the native syntax or in the JANI exchange format. The theory backing FIG has demonstrated good efficiency, comparable to optimal importance splitting implemented ad hoc for specific models. Written in C++, FIG can outperform other state-of-the-art tools for Rare Event Simulation.


Introduction
In formal analysis of stochastic systems, statistical model checking (smc [33]) emerges as an alternative to numerical techniques such as (exhaustive) probabilistic model checking. Its partial, on-demand state exploration offers a memorylightweight option to exhaustive explorations. At its core, smc integrates Monte Carlo simulation with formal models, where traces of states are generated dynamically e.g. via discrete event simulation. Such traces are samples of the states where a (possibly non-Markovian) stochastic model usually ferrets. Given a temporal logic property ϕ that characterises certain states, an smc analysis yields an estimateγ of the actual probability γ with which the model satisfies ϕ. The estimateγ typically comes together with a quantification of the statistical error: two numbers δ ∈ (0, 1) and ε > 0 such thatγ ∈ [γ − ε, γ + ε] with probability δ.
Thus, if n traces are sampled, the full smc outcome is the tuple (n,γ, δ, ε).
With this statistical quantification-usually presented as a confidence interval (ci) aroundγ-an idea of the quality of an estimation is conveyed. To increase the quality one must increase the precision (smaller ε) or the confidence (bigger δ). For fixed confidence, this means a narrower ci aroundγ. The number of traces n is inversely proportional to ε and to the ci width, so smc trades memory for runtime or precision when compared to exhaustive methods [5].
This trade-off of smc comes with one up and one down. The up is the capability to analyse systems whose stochastic transitions can have non-Markovian This work was partially funded by NWO, NS, and ProRail project 15474 (SE-QUOIA) and EU project 102112 (SUCCESS).
distributions. In spite of gallant efforts, this is still out of reach for most other model checking approaches, making smc unique. The down are rare events.
If there is a very low probability to visit the states characterised by the property ϕ, then most traces will not visit them. Thus the estimateγ is either (an incorrect) 0 or, if a few traces do visit these states, statistical error quantification make ε skyrocket. To counter such phenomenon, n must increase as γ decreases. Unfortunately, for typical estimates such as the sample mean, it takes n 384 /γ to build a (rather lax!) ci where δ = 0.95 and ε = γ 10 . If e.g. γ ≈ 10 −8 then n 38400000000 traces are needed, causing trace-sampling times to grow unacceptably long. Rare Event Simulation (res [24]) methods tackle this issue.
The two main res methods are importance sampling (is) and importance splitting (isplit). is compromises the aforementioned up, since it must tamper the stochastic transitions of the model. Given that the study of non-Markovian systems is a chief reason to use smc, Related work. Other statistical model checkers offer res methods to some degree of automation. Plasma Lab implements automatic is [18] and semiautomatic isplit [21] for Markov chains. Its isplit engine offers a wizard that guides the user to choose an importance function. The wizard exploits a layered decomposition of the property query-not the system model. Via apis, the isplit engine of Plasma Lab could be extended beyond dtmc models. SBIP 2.0 [22] implements the same (semiautomatic, property-based) engine for dtmcs. SBIP offers a richer set of temporal logics to define the property query in. Cosmos [1] and ftres [26] implement importance sampling on Markov chains, the latter specialising in systems described as repairable Dynamic Fault Trees (dfts). All these tools can operate directly on Markovian models, and none offers fully automated isplit. Instead, the smc tool modes can also cope with nondeterminism (e.g. in Markov automata) using the LSS algorithm [10,5]. On the other hand, using the batch means method, FIG can estimate steady-state properties, which modes cannot currently do. Moreover,

Rare Event Simulation
res methods make more traces visit the rare states that satisfy a property ϕ (the set S ϕ ), to reduce the variance of smc estimators. For a fixed budget of traces n, this yields more precise cis than classical Monte Carlo simulation (cmc).
FIG implements importance splitting: a main res method that can work on non-Markovian systems without special considerations. isplit splits the states of the model into layers that wrap S ϕ like an onion. Reaching a state in S ϕ from the surface is then broken down into many steps. The i-th step estimates the conditional probability to reach (the inner) layer i + 1 from (the outer) layer i. This stepwise estimation of conditional probabilities can be much more efficient than trying to go in one leap from the surface of the onion to its core [20].
Formally, let S be the states of a model with initial states S 0 and rare states This approach is correct, i.e. it yields an unbiased estimatorγ which depends on how the S i layers where chosen. For this, an importance function f : S → R 0 and thresholds i ∈ R 0 are defined: and the importance function f .
These choices are the key challenge in isplit [20]. Theoretical developments assume f is given [12,8], and applications define it ad hoc via (res and domain) expert knowledge [30,27]. Yet there is one general rule: importance must be proportional to the probability of reaching S ϕ . Thus for s, s ∈ S, if a trace that visits s is more likely to observe a rare state, one wants f (s) f (s ). This means that f depends both on the model M and the property ϕ that define S ϕ . In iosa, continuous variables called clocks sample random values from arbitrary distributions (pdfs). As time evolves, all clocks count down at the same rate. The first to reach zero can trigger events and synchronise with other modules, broadcasting an output action that synchronises with homonymous input actions (iosa are input-enabled). Actions can be urgent, where urgent outputs have module M1 fc,rc : clock; inf,brk : [0..2] init 0;  , FIG 1.2 reads models written in the jani exchange format [7]. Model types supported are ctmc and a subset of sta that matches iosa, e.g. with a single pdf per clock and broadcast synchronisation. FIG also translates iosa to jani as sta, to share models with tools such as the Modest Toolset [16] and Storm [13]. This is used in Sec. 4 for comparisons. Properties. FIG  builders. Engines are nosplit, restart, and sfe, which resp. run cmc, restart (rst [31]), and Fixed Effort (fe [14]) simulations. The latter two are isplit algorithms: fe was described in Sec. 2, and works for transient properties; rst also works for steady-state analysis (steady-state via fe requires regeneration theory [15], seldom applicable to non-Markovian models and unsupported by All these experiments can be reproduced via the artifact freely available in [3].

JANI. Besides its native input syntax
We test different configurations of engines, efforts, and thresholds. For each configuration we run simulations until some timeout. This yields a ci with precision 2ε for confidence coefficient δ = 0.95. The smaller the ε, the narrower the ci, and the better the performance of the configuration (and tool) that produced it.
First, we analyse repairable dfts with warm spares and exponential (fail), normal (repair), and lognormal (dormancy) pdfs. Using cmc, fe 8,16,32 and rst 3,4,6 we estimate the probability of a top level event after the first failure, before all components are repaired, in trees with 6, 7, and 8 spares (the smallest iosa has 116 variables and > 2.5 e 37 states). For isplit we used seq thresholds with --dft 0 --acomp and no arguments, i.e. as automatic as cmc.
With a 20 min timeout, each configuration was repeated 13 times in a Xeon E5-2683v4 CPU running Linux x64 4.4.0. The height of the bars in the top plot of Fig. 1 is the average ci precision (lower is better), using Z-score m=2 to remove outliers [17]. Whiskers are standard deviation, and white numbers indicate how many runs yielded not-null estimates. Clearly, res algorithms outperform cmc in the hardest cases: less than half of cmc runs in DFT-8 could build (wide) cis.
Third, we compare FIG and modes in the original benchmark of the latter [5].
We do so for fe-seq, rst-seq, rst-es, using each tool's default options.
We ran each benchmark instance 15 min, thrice per tool, in an Intel i7-6700 CPU with Linux x64 5.3.1. The scatter plots of Fig. 2 show the median of the ci precisions. Sub-plots on the bottom-right are a zoom-ins in the range [10 −10 ,10 −5 ].
An (x,y) point is an instance whose median ci width was x for Albeit modes is multi-threaded, these experiments ran on a single thread to compare both tools on equal conditions. On the other hand, FIG also estimates the probability of steady-state properties, for which there is no support in modes.