Sampling-Based Verification of CTMCs with Uncertain Rates

We employ uncertain parametric CTMCs with parametric transition rates and a prior on the parameter values. The prior encodes uncertainty about the actual transition rates, while the parameters allow dependencies between transition rates. Sampling the parameter values from the prior distribution then yields a standard CTMC, for which we may compute relevant reachability probabilities. We provide a principled solution, based on a technique called scenario-optimization, to the following problem: From a finite set of parameter samples and a user-specified confidence level, compute prediction regions on the reachability probabilities. The prediction regions should (with high probability) contain the reachability probabilities of a CTMC induced by any additional sample. To boost the scalability of the approach, we employ standard abstraction techniques and adapt our methodology to support approximate reachability probabilities. Experiments with various well-known benchmarks show the applicability of the approach.


Introduction
Continuous-time Markov chains (CTMCs) are widely used to model complex probabilistic systems in reliability engineering [51], network processes [36,38], systems biology [10,22] and epidemic modeling [2]. A key verification task is to compute aspects of system behavior from these models, expressed as, e.g., continuous stochastic logic (CSL) formulae [4,6]. Typically, we compute reachability probabilities for a set of horizons, such as: what is the probability that a target state is reached before time t 1 , . . . , t n ? Standard algorithms [6] implemented in mature model checking tools such as Storm [37] or Prism [42] provide efficient means to compute these reachability probabilities. However, these methods typically require that transition rates and probabilities are precisely known. This assumption is often unrealistic [34] and led to some related work, which we discuss in Sect. 7.    in Fig. 1a for a population of two. Such a CTMC assumes a fixed set of transition rates, in this case an infection rate λ i , and a recovery rate λ r . The outcome of analyzing this CTMC for fixed values of λ i and λ r may yield a probability curve like in Fig. 2a 3 , where we plot the probability (y-axis) of reaching a target state that corresponds to the epidemic becoming extinct against varying time horizons (x-axis). In fact, the plot is obtained via a smooth interpolation of the results at finitely many horizons, cf. 2b. To acknowledge that λ i , λ r are in fact unknown, we may analyze the model for different values of λ i , λ r , resulting in a set of curves as in Fig. 2c. These individual curves, however, provide no guarantees about the shape of the curve obtained from another infection and recovery rate. Instead, we assume a probability distribution over the transition rates and aim to compute prediction regions as those in shown Fig. 2d, in such a way that with a certain (high) probability, any rates λ i and λ r yield a curve within this region.
Overall goal. From the illustrative example, we state the following goal. Each fixed set of transition rates induces a probability curve, i.e., a mapping from horizons to the corresponding reachability probabilities. We aim to construct prediction regions around a set of probability curves, such that with high probability and high confidence, sampling a set of transition rates induces a probability curve within this region. Our key contribution is an efficient probably approximately correct, or PAC-style method that computes these prediction regions. The remainder of the introduction explores the technical steps toward this goal.
Uncertain CTMCs. The setting above is formally captured by parametric CTMCs (pCTMCs). Transition rates of pCTMCs are not given precisely but as (polynomials over) parameters [14,34], such as those shown in Fig. 1a. We assume a prior on each parameter valuation, i.e., assignment of values to parameters, similar to settings in [10,44] and in contrast to, e.g., [22,34]. These priors may result from asking different experts which value they would assume for, e.g., the infection rate. The prior may also be the result of Bayesian reasoning [56]. Formally, we capture the uncertainty in the rates by an arbitrary and potentially unknown probability distribution over the parameter space, see Fig. 1b. We call this model an uncertain pCTMC (upCTMC). The distribution allows drawing independent and identically distributed (i.i.d.) samples that yield (parameter-free) CTMCs.
Problem statement. We consider prediction regions on probability curves in the form of a pair of two curves that 'sandwich' the probability curves, as depicted in Fig. 2d. Intuitively, we then aim to find a prediction region R that is sufficiently large, such that sampling parameter valuations yields a probability curve in R with high probability p. We aim to compute a lower bound on this containment probability p. Naturally, we also aim to compute a meaningful, i.e. small (tight), prediction region R. As such, we aim to solve the following problem: Problem Statement. Given a upCTMC with a target state, compute 1. a (tight) prediction region R on the probability curves, and 2. a (tight) lower bound on the containment probability that a sampled parameter valuation induces a probability curve that will lie in R. We solve this problem with a user-specified confidence level β.
The problem solved. In this paper, we present a method that samples probability curves as in Fig. 2c, but now for, say 100 curves. From these curves, we compute prediction regions (e.g., both tubes in Fig. 2d) and compute a lower bound (one for both tubes) on the containment probability that the curve associated with any sampled parameter value will lie in the specific prediction region (tube). Specifically, for a confidence level of 99% and considering 100 curves, we conclude that this lower bound is 79.4% for the red region and 7.5% for the blue region. For a higher confidence level of 99.9%, the lower bounds are slightly more conservative.
A change in perspective. Toward the algorithm, we make a change in perspective. For two horizons t 1 and t 2 , reachability probabilities for fixed CTMCs are twodimensional points in [0, 1] 2 that we call solution vectors, as shown in Fig. 3a.
Here, these solution vectors represent pairs of the probabilities that the disease becomes extinct before time t 1 and before t 2 . The prediction regions as in Fig. 2d are shown as the shaded boxes in Fig. 3a.
Solving the problem algorithmically. We solve the problem using a sampling-based approach. Starting with a set of solution vectors, we use techniques from scenario optimization, a data-driven methodology for solving stochastic optimization problems [17,20]. As such, we construct the prediction region from the solution to an optimization problem. Our method can balance the size of the prediction region with the containment probability, as illustrated by the two boxes in Fig. 3a.   Extensions. Our approach offers more than prediction regions on probability curves from precise samples. The change in perspective mentioned above allows for solution vectors that represent multiple objectives, such as the reachability with respect to different goal states, expected rewards or even the probability mass of paths satisfying more complex temporal properties. In our experiments, we show that this multi-objective approach -also on probability curves-yields much tighter bounds on the containment probability than an approach that analyzes each objective independently. We can also produce prediction regions as other shapes than boxes, as, for example, shown in Fig. 3b. To accelerate our approach, we significantly extend the methodology for dealing with imprecise verification results, given as an interval on each entry of the solution vector.

Contributions.
Our key contribution is the approach that provides prediction regions and lower bounds on probability curves for upCTMCs. The approach requires only about 100 samples and scales to upCTMCs with tens of parameters. Furthermore: (1) We extend our approach such that we can also handle the case where only imprecise intervals on the verification results are available. (2) We develop a tailored batch verification method in the model checker Storm [37] to accelerate the required batches of verification tasks. We accompany our contributions by a thorough empirical evaluation and remark that our batch verification method can be used beyond scenario optimization. Our scenario optimization results are independent of the model checking and are, thus, applicable to any model where solution vectors are obtained in the same way as for upCTMCs.

Problem Statement
In this section, we introduce pCTMCs and upCTMCs, and we define the formal problem statement. We use probability distributions over finite and infinite sets; see [8] for details. The set of all distributions over a set X is denoted by Dist(X ). The set of polynomials over parameters V , with rational coefficients, is denoted by Q[V ]. An instantiation u : V → Q maps parameters to concrete values. We often fix a parameter ordering and denote instantiations as vectors, u ∈ Q |V | .

Definition 1 (pCTMC).
A pCTMC is a tuple M = (S, s I , V, R), where S is a finite set of states, s I ∈ Dist(S ) is the initial distribution, V are the (ordered) parameters, and R : S × S → Q[V ] is a parametric transition rate function. If R(s, s) ∈ Q ≥0 for all s, s ∈ S, then M is a (parameter-free) CTMC.
For any pair of states s, s ∈ S with a non-zero rate R(s, s ) > 0, the probability of triggering a transition from s to s within t time units is 1 − e −R(s,s )·t [41].
Applying an instantiation u to a pCTMC M yields an instantiated CTMC for all s, s ∈ S. In the remainder, we only consider instantiations u for a pCTMC M which are welldefined. The set of such instantiations is the parameter space V M .
A central measure on CTMCs is the (time-bounded) reachability Pr(♦ ≤τ E), which describes the probability that one of the error states E 4 is reached within the horizon τ ∈ Q. Other measures include the expected time to reach a particular state, or the average time spent in particular states. We refer to [41]  ∈ R m generalizes this concept to an (ordered) set of m measures Φ = ϕ 1 , . . . , ϕ m . We abuse notation and introduce the solution function to express solution vectors on a pCTMC: We often omit the scripts in sol Φ M (u) and write sol(u) instead. We also refer to sol(u) as the solution vector of u. For n parameter samples U n = {u 1 , . . . , u n } with u i ∈ V M , we denote the solution vectors by sol(U n ) ∈ R m×n .
Using solution vectors, we can define the probability curves shown in Fig. 2c.
Definition 3 (Probability curve). The probability curve for reachability probability φ τ = Pr(♦ ≤τ E) and CTMC M[u] is given by probC : τ → sol ϕτ M [u] . We can approximate the function probC for a concrete CTMC by computing probC(t 1 ), . . . , probC(t m ) for a finite set of time horizons. As such, we compute the solution vector w.r.t. m different reachability measures Φ = {ϕ t1 , . . . , ϕ tm }. By exploiting the monotonicity 5 of the reachability over time, we obtain an upper and lower bound on probC(τ ) as two step functions, see Fig. 2d. We can smoothen the approximation, by taking an upper and lower bound on these step functions.
We study pCTMCs where the parameters follow a probability distribution. This probability distribution can be highly complex or even unknown; we merely assume that we can sample from this distribution.

Definition 4 (upCTMC).
A upCTMC is a tuple (M, P) with M a pCTMC and P a probability distribution over the parameter space V M of M.
A upCTMC defines a probability space (V M , P) over the parameter values, whose domain is defined by the parameter space V M . In the remainder, we denote a sample from V M drawn according to P by u ∈ V M .
To quantify the performance of a upCTMC, we may construct a prediction region on the solution vector space, such as those shown in Fig. 3a. In this paper, we consider only prediction regions which are compact subsets R ⊆ R |Φ| . We define the so-called containment probability of a prediction region, which is the probability that the solution vector sol(u) for a randomly sampled parameter u ∈ V M is contained in R, as follows: Definition 5 (Containment probability). For a prediction region R, the containment probability contain V (R) is the probability that the solution vector sol(u) for any parameter sample u ∈ V M is contained in R: (1) Recall that we solve the problem in Sect. 1 with a user-specified confidence level, denoted by β ∈ (0, 1). Formally, we solve the following problem: Formal Problem. Given a upCTMC (M, P), a set Φ of measures, and a confidence level β ∈ (0, 1), compute a (tight) prediction region R and a (tight) lower bound µ ∈ (0, 1) on the containment probability, such that contain(R) ≥ µ holds with a confidence level of at least β.
The problem in Sect. 1 is a special case of the formal problem, with Φ the reachability probability over a set of horizons. In that case, we can overapproximate a prediction region as a rectangle, yielding an interval [c,c] for every horizon t that defines where the two step functions (see below Def. 3) change. We smoothen these step functions (similar to probability curves) to obtain the following definition: Definition 6 (Prediction region for a probability curve). A prediction region R over a probability curve probC is given by two curvesc,c : Q ≥0 → R as the area in-between: R = {(t, y) ∈ Q × R |c(t) ≤ y ≤c(t)}.
We solve the problem by sampling a finite set U n of parameter values of the upCTMC and computing the corresponding solution vectors sol(U n ). In Sect. 3, we solve the problem assuming that we can compute solution vectors exactly. In Sect. 4, we consider a less restricted setting in which every solution is imprecise, i.e. only known to lie in a certain interval.

Precise Sampling-Based Prediction Regions
In this section, we use scenario optimization [15,17] to compute a high-confidence lower bound on the containment probability. First, in Sect. 3.1, we describe how to compute a prediction region using the solution vectors sol(U n ) for the parameter samples U n . In Sect. 3.2, we clarify how to compute a lower bound on the containment probability with respect to this prediction region. In Sect. 3.3, we construct an algorithm based on those results that solves the formal problem.

Constructing prediction regions
We assume that we are given a set of solution vectors sol(U n ) obtained from n parameter samples. We construct a prediction region R based on these vectors such that we can annotate these regions with a lower bound on the containment probability, as in the problem statement. For conciseness, we restrict ourselves to the setting where R is a hyperrectangle in R m , with m = |Φ| the number of measures, cf. Remark 1 below. In the following, we represent R using two vectors (points)x,x ∈ R m such that, using pointwise inequalities, R = {x |x ≤ x ≤x}.
For an example of such a rectangular prediction region, see Fig. 3a.
As also shown in Fig. 3a, we do not require R to contain all solutions in sol(U n ). Instead, we have two orthogonal goals: we aim to minimize the size of R, while also minimizing the (Manhattan) distance of samples to R, measured in their 1-norm. Solutions contained in R are assumed to have a distance of zero, while solutions not contained in R are called relaxed. These goals define a multi-objective problem, which we solve by weighting the two objectives using a fixed parameter ρ > 0, called the cost of relaxation, that is used to scale the distance to R. Then, ρ → ∞ enforces sol(U n ) ⊆ R, as in the outer box in Fig. 3a, while for ρ → 0, R is reduced to a point. Thus, the cost of relaxation ρ is a tuning parameter that determines the size of the prediction region R and hence the fraction of the solution vectors that is contained in R (see [18,20] for details).
We capture the problem described above in the following convex optimization problem L ρ U . We define the decision variablesx,x ∈ R m to represent the prediction region. In addition, we define a decision variable ξ i ∈ R m ≥0 for every sample i = 1, . . . , n that acts as a slack variable representing the distance to R.
The objective function in Eq. (2a) minimizes the size of R -by minimizing the sum of the width of the prediction region in all dimensions-plus ρ times the distances of the samples to R. We denote the optimal solution to problem L ρ U for a given ρ by for the rectangular case.
Assumption 1. The optimal solution R * ρ , ξ * ρ to L ρ U exists and is unique.
Note that Def. 2 ensures finite-valued solution vectors, thus guaranteeing the existence of a solution to Eq. (2). If the solution is not unique, we apply a suitable tie-break rule that selects one solution of the optimal set (e.g., the solution with a minimum Euclidean norm, see [15]). The following example shows that values of ρ exist for which such a tie-break rule is necessary to obtain a unique solution.
Example 1. Fig. 4 shows a set of solution vectors in one dimension, labeled Fig. 4: The prediction region changes with the cost of relaxation ρ.  Thus, for ρ > 1, solving L ρ U yields R 1 whereas for ρ < 1, relaxing solutions A and F is cheaper than not doing so, so R 2 is optimal. When ρ = 1, however, relaxing solutions A and F yields the same cost as not relaxing these samples, so a tie-break rule is needed (see above). For ρ < 1 2 , relaxing samples A, B, E, and F is cost-optimal, resulting in the prediction region containing exactly {C, D}.
Similarly, we can consider cases with more samples and multiple measures, as shown in Fig. 5, which we discuss in more detail in App. A. The three prediction regions in Fig. 5 are obtained for different costs of relaxation ρ. For ρ = 2, the region contains all vectors, while for a lower ρ, more vectors are left outside.
(2) yields a rectangular prediction region, we can also produce other shapes. We may, e.g., construct a Pareto front as in Fig. 3b, by adding additional affine constraints [11]. In fact, our only requirement is that the objective function is convex, and the constraints are convex in the decision variables (the dependence of the constraints on u may be arbitrary) [20].

Bounding the containment probability
The previous section shows how we compute a prediction region based on convex optimization. We now characterize a valid high-confidence lower bound on the containment probability w.r.t. the prediction region given by the optimal solution to this optimization problem. Toward that result, we introduce the so-called complexity of a solution to problem L ρ U in Eq. (2), a concept used in [20] that is related to the compressibility of the solution vectors sol(U n ): ρ is the cardinality of the smallest critical set. We also call c * ρ the complexity of L ρ U .  If a sample u i has a value ξ * ρ,i > 0, its solution vector has a positive distance to the prediction region, R * ρ . (i.e., [x * ρ ,x * ρ ] for the rectangular case). Thus, the complexity c * ρ is the number of samples for which sol(u i ) / ∈ R * ρ , plus the minimum number of samples needed on the boundary of the region to keep the solution unchanged. We describe in Sect. 3.3 how we algorithmically determine the complexity.
Example 2. In Fig. 5, the prediction region for ρ = 2 contains all solution vectors, so ξ * 2,i = 0 ∀i. Moreover, if we remove all but four solutions (the ones on the boundary of the region), the optimal solution to problem L ρ U remains unchanged, so the complexity is c * 1.12 = 0 + 4. Similarly, the complexity for ρ = 0.4 is c * 0.4 = 8 + 2 = 10 (8 solutions outside the region, and 2 on the boundary).
Recall that Def. 5 defines the containment probability of a generic prediction region R, so contain(R * ρ ) is the containment probability w.r.t. the optimal solution to L ρ U . We adapt the following theorem from [20], which gives a lower bound on the containment probability contain(R * ρ ) of an optimal solution to L ρ U for a predefined value of ρ. This lower bound is correct with a user-defined confidence level of β ∈ (0, 1), which we typically choose close to one (e.g., β = 0.99). Theorem 1. Let U n be a set of n samples, and let c * be the complexity of problem L ρ U . For any confidence level β ∈ (0, 1) and any upper bound d * ≥ c * , it holds that where R * ρ is the prediction region for L ρ U . Moreover, η is a function defined as η(n) = 0, and otherwise, η(c) is the smallest positive real-valued solution to the following polynomial equality in the t variable for a complexity of c: We provide the proof of Theorem 1 in App. B.1. With a probability of at least β, Theorem 1 yields a correct lower bound. That is, if we solve L ρ U for many more sets of n parameter samples (note that, as the samples are i.i.d., these sets Prediction regions R * ρ ∀ρ Fig. 7: Overview of our approach for solving the problem statement. are drawn according to the product probability P n ), the inequality in Eq. (3) is incorrect for at most a 1 − β fraction of the cases. We plot the lower bound η(c) as a function of the complexity c = 0, . . . , n in Fig. 6, for different samples sizes n and confidence levels β. These figures show that an increased complexity leads to a lower η, while increasing the sample size leads to a tighter bound.
Example 3. We continue Example 2. Recall that the complexity for the outer region in Fig. 5 is c * 1.12 = 4. With Theorem 1, we compute that, for a confidence level of β = 0.9, the containment probability for this prediction region is at least η = 0.615 (cf. Fig. 6a). For a stronger confidence level of β = 0.999, we obtain a more conservative lower bound of η = 0.455.

An algorithm for computing prediction regions
We combine the previous results in our algorithm, which is outlined in Fig. 7. The goal is to obtain a set of prediction regions as in Fig. 5 and their associated lower bounds. To strictly solve the problem statement, assume k = 1 in the exposition below. We first outline the complete procedure before detailing Steps 4 and 5.
As preprocessing steps, given a upCTMC (M, P), we first (1) sample a set U n of n parameter values. Using M and Φ, a (2) model checking algorithm then computes the solution vector sol Φ M (u) for each u ∈ U n , yielding the set of solutions sol(U n ). We then use sol(U n ) as basis for (3) the scenario problem L ρ U in Eq. (2), which we solve for k predefined values ρ 1 , . . . , ρ k , yielding k prediction regions R * ρ1 , . . . R * ρ k . We (4) compute an upper bound d * ρ on the complexity c * ρ ∀ρ. Finally, we (5) use the result in Theorem 1, for a given confidence β, to compute the lower bound on the containment probability η(d * ρ ) of R * ρ . Using Def. 6, we can postprocess this region to a prediction region over the probability curves.
Step (3): Choosing values for ρ. Example 1 shows that relaxation of additional solution vectors (and thus a change in the prediction region) only occurs at critical values of ρ = 1 n , for n ∈ N. In practice, we will use ρ = 1 n+0. 5 for ±10 values of n ∈ N to obtain gradients of prediction regions as in Sect. 6.
Step (4): Computing complexity. Computing the complexity c * ρ is a combinatorial problem in general [30], because we must consider the removal of all combinations of the solutions on the boundary of the prediction region R * ρ . In practice, we compute an upper bound d * ρ ≥ c * ρ on the complexity via a greedy algorithm. Specifically, we iteratively solve L ρ U in Eq.
(2) with one more sample on the boundary removed. If the optimal solution is unchanged, we conclude that this sample does not contribute to the complexity. If the optimal solution is changed, we put the sample back and proceed by removing a different sample. This greedy algorithm terminates when we have tried removing all solutions on the boundary.
Step (5): Computing lower bounds. Theorem 1 characterizes a computable function B(d * , n, β) that returns zero for d * = n (i.e., all samples are critical), and otherwise uses the polynomial Eq. (4) to obtain η, which we solve with an approximate root finding method in practice (see [31] for details on how to ensure that we find the smallest root). For every upper bound on the complexity d * and any requested confidence, we obtain the lower bound η = B(d * , n, β) for the containment probability w.r.t. the prediction region R * ρ .

Imprecise Sampling-Based Prediction Regions
Thus far, we have solved our problem statement under the assumption that we compute the solution vectors precisely (up to numerics). For some models, however, computing precise solutions is expensive. In such a case, we may choose to compute an approximation, given as an interval on each entry of the solution function. In this section, we deal with such imprecise solutions.
Setting. Formally, imprecise solutions are described by the bounds sol − (u), sol + (u) ∈ R m such that sol − (u) ≤ sol(u) ≤ sol + (u) holds with pointwise inequalities. Our goal is to compute a prediction region R and a (high-confidence) lower bound µ such that contain(R) ≥ µ, i.e., a lower bound on the probability that any precise solution sol(u) is contained in R. However, we must now compute R and contain(R) from the imprecise solutions sol − , sol + . Thus, we aim to provide a guarantee with respect to the precise solution sol(u), based on imprecise solutions.
Challenge. Intuitively, if we increase the (unknown) prediction region R * from problem L ρ U (for the unknown precise solutions) while also overapproximating the complexity of L ρ U , we obtain sound bounds. We formalize this idea as follows.
Lemma 1. Let R * ρ be the prediction region and c * ρ the complexity that result from solving L ρ U for the precise (unknown) solutions sol(U n ). Given a set R ∈ R n and d ∈ N, for any confidence level β ∈ (0, 1), the following implication holds: where η(n) = 0, and otherwise, η(d) is the smallest positive real-valued solution to the polynomial equality in Eq. (4).
The proof is in App. B.2. In what follows, we clarify how we compute the appropriate R and d in Lemma 1. As we will see, in contrast to Sect. 3, these results do not carry over to other definitions L ρ U (for non-rectangular regions R).

Prediction regions on imprecise solutions
In this section, we show how to compute R ⊇ R * ρ , satisfying the first term in the premise of Lemma 1. We construct a conservative box around the imprecise solutions as in Fig. 9, containing both sol − (u) and sol + (u). We compute this box by solving the following problem G ρ U as a modified version of L ρ U in Eq. (2): We denote the optimal solution of G ρ U by [x ρ ,x ρ ], ξ ρ (recall that the optimum to L ρ U is written as [x * ρ ,x * ρ ], ξ * ρ ). 6 If a sample u i ∈ V M in problem G ρ U is relaxed (i.e., has a non-zero ξ i ), part of the interval [sol − (u i ), sol + (u i )] is not contained in the prediction region. The following result (for which the proof is in App. B.3) relates L ρ U and G ρ U , showing that we can use [x ρ ,x ρ ] as R in Lemma 1.
Theorem 2. Given ρ, sample set U n , and prediction region [x ρ ,x ρ ] to problem We note that this result is not trivial. In particular, the entries ξ i from both LPs are incomparable, as are their objective functions. Instead, Theorem 2 relies on two observations. First, due to the use of the 1-norm, the LP G ρ U can be decomposed into n individual LPs, whose results combine into a solution to the original LP. This allows us to consider individual dimensions. Second, the solution vectors that are relaxed depend on the value of ρ and on their relative order, but not on the precise position within that order, which is also illustrated by Example 1. In combination with the observation from Example 1 that the outermost samples are relaxed at the (relatively) highest ρ, we can provide conservative guarantees on which samples are (or are surely not) relaxed. We formalize these observations and provide a proof of Theorem 2 in App. B.3.

Computing the complexity
To satisfy the second term of the premise in Lemma 1, we compute an upper bound on the complexity. We first present a negative result. Let the complexity c ρ of problem G ρ U be defined analogous to Def. 7, but with [x ρ ,x ρ ] as the region.
Thus, we cannot upper bound the complexity directly from the result to G ρ U . We can, however, determine the samples that are certainly not in any critical set (recall Def. 7). Intuitively, a sample is surely noncritical if its (imprecise) solution is strictly within the prediction region and does not overlap with any solution on the region's boundary. In Fig. 8, sample u 6 is surely noncritical, but sample u 5 is not (whether u 5 is critical depends on its precise solution). Formally, let δR be the boundary 7 of region [x ρ ,x ρ ], and let B be the set of samples whose solutions overlap with δR, which is B = {u ∈ U n : [sol − (u), sol + (u)] ∩ δR = ∅}.
As a worst case, any sample not surely noncritical can be in the smallest critical set, leading to the following bound on the complexity as required by Lemma 1.
Theorem 3. Let X be the set of surely noncritical samples. Then c * ρ ≤ |U n \ X |.
The proof is in App. B.4. For imprecise solutions, the bound in Theorem 3 is conservative but can potentially be improved, as discussed in the following.

Solution refinement scheme
Often, we can refine imprecise solutions arbitrarily (at the cost of an increased computation time). Doing so, we can improve the prediction regions and upper bound on the complexity, which in turn improves the computed bound on the containment probability. Specifically, we propose the following rule for refining solutions. After solving G ρ U for a given set of imprecise solutions, we refine the solutions on the boundary of the obtained prediction region. We then resolve problem G ρ U , thus adding a loop back from (4) to (2) in our algorithm shown in Fig. 7. In our experiments, we demonstrate that with this refinement scheme, we iteratively improve our upper bound d ≥ c * ρ and the smallest superset R ⊇ R * ρ .

Batch Verification for CTMCs
One bottleneck in our method is to obtain the necessary number of solution vectors sol(U n ) by model checking. The following improvements, while mild, are essential in our implementation and therefore deserve a brief discussion. In general, computing sol(u) via model checking consists of two parts. First, the high-level representation of the upCTMC -given in Prism [42], JANI [12], or a dynamic fault tree 8 -is translated into a concrete CTMC M[u]. Then, from M[u] we construct sol(u) using off-the-shelf algorithms [6]. We adapt the pipeline by tailoring the translation and the approximate analysis as outlined below.
Our implementation supports two methods for building the concrete CTMC for a parameter sample: (1) by first instantiating the valuation in the specification and then building the resulting concrete CTMC, or (2) by first building the pCTMC M (only once) and then instantiating it for each parameter sample to obtain the concrete CTMC M[u]. Which method is faster depends on the specific model (we only report results for the fastest method in Sect. 6 for brevity).
Partial models. To accelerate the time-consuming computation of solution vectors by model-checking on large models, it is natural to abstract the models into smaller models amenable to faster computations. Similar to ideas used for dynamic fault trees [55] and infinite CTMCs [48], we employ an abstraction which only keeps the most relevant parts of a model, i.e., states with a sufficiently large probability to be reached from the initial state(s). Analysis on this partial model then yields best-and worst-case results for each measure by assuming that all removed states are either target states (best case) or are not (worst case), respectively. This method returns imprecise solution vectors as used in Sect. 4, which can be refined up to an arbitrary precision by retaining more states of the original model. Similar to building the complete models, two approaches are possible to create the partial models: (1) fixing the valuation and directly abstracting the concrete CTMC, or (2) first building the complete pCTMC and then abstracting the concrete CTMC. We reuse partial models for similar valuations to avoid costly computations. We cluster parameter valuations which are close to each other (in Euclidean distance). For parameter valuations within one cluster, we reuse the same partial model (in terms of the states), albeit instantiating it according to the precise valuation.

Experiments
We answer three questions about (a prototype implementation of) our approach: Q1. Can we verify CTMCs taking into account the uncertainty about the rates? Q2. How well does our approach scale w.r.t. the number of measures and samples? Q3. How does our approach compare to naïve baselines (to be defined below)?
Setup. We implement our approach using the explicit engine of Storm [37] and the improvements of Sec. 5 to sample from upCTMCs in Python. Our current a Computed using approximate model checking up to a relative gap between upper bound sol + (u) and lower bound sol − (u) below 1% for every sample u ∈ VM. b Model size is unknown, as the approximation does not build the full state-space.  implementation is limited to pCTMC instantiations that are graph-preserving, i.e. for any pair s, s ∈ S either R(s, s )[u] = 0 or R(s, s )[u] > 0 for all u. We solve optimization problems using the ECOS solver [29]. All experiments ran single-threaded on a computer with 32 3.7 GHz cores and 64 GB RAM. We show the effectiveness of our method on a large number of publicly available pCTMC [35] and fault tree benchmarks [50] across domains (details in App. C).

Q1. Applicability
An excerpt of the benchmark statistics is shown in Tab. 1 (see Tab. 4 in App. C for the full table). For all but the smallest benchmarks, sampling and computing the solution vectors by model checking is more expensive than solving the scenario problems. In the following, we illustrate that 100 samples are sufficient to provide qualitatively good prediction regions and associated lower bounds.  Plotting prediction regions. Fig. 10 presents prediction regions on the extinction probability of the disease in the SIR model and is analogous to the tubes in Fig. 2d (see Fig. 14 and 15 in App. C.1 for plots for various other benchmarks). These regions are obtained by applying our algorithm with varying values for the cost of relaxation ρ. For a confidence level of β = 99%, the widest (smallest) tube in Fig. 10 corresponds to a lower bound probability of µ = 91.1% (µ = 23.9%). Thus, we conclude that, with a confidence of at least 99%, the curve created by the CTMC for any sampled parameter value will lie within the outermost region in Fig. 10 with a probability of at least 91.1%. We highlight that our approach supports more general prediction regions. We show n = 200 solution vectors for the buffer benchmark with two measures in Fig. 11 and produce regions that approach the Pareto front. For a confidence level of β = 99%, the outer prediction region is associated with a lower bound probability of µ = 91.1%, while the inner region has a lower value of µ = 66.2%. We present more plots in App. C.1.
Tightness of the solution. In Tab. 2 we investigate the tightness of our results. For the experiment, we set ρ = 1.1 and solve L ρ U for different values of n, repeating every experiment 10 times, resulting in the average boundsμ. Then, we sample 1 000 solutions and count the observed number of solutions contained in every prediction regions, resulting in an empirical approximation of the containment probability. Recall that for ρ > 1, we obtain a prediction region that contains all solutions, so this observed count grows toward n. The lower bounds grow toward the empirical count for an increased n, with the smallest difference (RC, n = 800, β = 0.9) being as small as 0.9%. Similar observations hold for other values of ρ.
Handling imprecise solutions. The approximate model checker is significantly faster (see Tab. 1 for SIR (140) and RC), at the cost of obtaining imprecise solution vectors. 9 For SIR (140), the sampling time is reduced from 49 to 9 min, while the scenario optimization time is slightly higher at 129 s. This difference only grows larger with the size of the CTMC. For the larger instances of RC and HECS, computing exact solutions is infeasible at all (one HECS (2,2) sample alone takes 15 min). While the bounds on the containment probability under imprecise solutions may initially be poor (see Fig. 12a, which results in µ = 2.1%), we can improve the results significantly using the refinement scheme proposed in Sect. 4.3. For example, Fig. 12c shows the prediction region after refining 31 of the 100 solutions, which yields µ = 74.7%. Thus, by iteratively refining only the imprecise solutions on the boundary of the resulting prediction regions, we significantly tighten the obtained bounds on the containment probability.

Q2. Scalability
In Tab. 3, we report the run times for steps (3)-(5) of our algorithm shown in Fig. 7 (i.e., for solving the scenario problems, but not for computing the solution vectors in Storm). Here, we solve problem L ρ U for ρ = 0.1, with different numbers of samples and measures. Our approach scales well to realistic numbers of samples (up to 800) and measures (up to 400). The computational complexity of the scenario problems is largely independent of the size of the CTMC, and hence, similar run times are observed across the benchmarks (cf. Tab. 1).

Q3. Comparison to baselines
We compare against two baselines: (1) Scenario optimization to analyze each measure independently, yielding a separate probabilistic guarantee on each measure. (2) A frequentist (Monte Carlo) baseline, which samples a large number of parameter values and counts the number of associated solutions within a region.  Analyzing measures independently. To show that analyzing a full set of measures at once, e.g., the complete probability curve, is essential, we compare our method to the baseline that analyzes each measure independently and combines the obtained bounds on each measure afterward. We consider the PCS benchmark with precise samples and solve L ρ U for ρ = 2 (see Tab. 5 in App. C for details). For n = 100 samples and β = 99%, our approach returns a lower bound probability of µ = 84.8%. By contrast, the naïve baseline yields a lower bound of only 4.5%, and similar results are observed for different values of n (cf. Tab. 5). There are two reasons for this large difference. First, the baseline applies Theorem 3 once for each of the 25 measures, so it must use a more conservative confidence level ofβ = 1 − 1−β 25 = 0.9996. Second, the baseline takes the conjunction over the 25 independent lower bounds, which drastically reduces the obtained bound.
Frequentist baseline. The comparison to the frequentist baseline on the Kanban and RC benchmarks yields the previously discussed results in Tab. 2. The results in Tab. 1 and 3 show that the time spent for sampling is (for most benchmarks) significantly higher than for scenario optimization. Thus, our scenario-based approach has a relatively low cost, while resulting in valuable guarantees which the baseline does not give. To still obtain a high confidence in the result, a much larger sample size is needed for the frequentist baseline than for our approach.

Related Work
Several verification approaches exist to handle uncertain Markov models.
For (discrete-time) interval Markov chains (DTMCs) or Markov decision processes (MDPs), a number of approaches verify against all probabilities within the intervals [32,39,46,53,54]. Lumpability of interval CTMCs is considered in [21]. In contrast to upCTMCs, interval Markov chains have no dependencies between transition uncertainties and no distributions are attached to the intervals.
Parametric Markov models generally define probabilities or rates via functions over the parameters. The standard parameter synthesis problem for discretetime models is to find all valuations of parameters that satisfies a specification. Techniques range from computing a solution function over the parameters, to directly solving the underlying optimization problems [24,28,33,40]. Parametric CTMCs are investigated in [22,34], but are generally restricted to a few parameters. The work [14] aims to find a robust parameter valuation in pCTMCs.
For all approaches listed so far, the results may be rather conservative, as no prior information on the uncertainties (the intervals) is used. That is, the uncertainty is not quantified and all probabilities or rates are treated equally as likely. In our approach, we do not compute solution functions, as the underlying methods are computationally expensive and usually restricted to a few parameters.
Quantified uncertainty is studied in [44]. Similarly to our work, the approach draws parameter values from a probability distribution over the model parameters and analyzes the instantiated model via model checking. However, [44] studies DTMCs and performs a frequentist (Monte Carlo) approach, cf. Sect. 6, to compute estimates for a single measure, without prediction regions. Moreover, our approach requires significantly fewer samples, cf. the comparison in Sect. 6.
The work in [9,10] takes a sampling-driven Bayesian approach for pCTMCs. In particular, they take a prior on the solution function over a single measure and update it based on samples (potentially obtained via statistical model checking). We assume no prior on the solution function, and, as mentioned before, do not compute the solution function due to the expensive underlying computations.
Statistical model checking (SMC) [1,43] samples path in stochastic models to perform model checking. This technique has been applied to numerous models [25][26][27]47], including CTMCs [52,57]. SMC analyzes a concrete CTMC by sampling from the known transition rates, whereas for upCTMC these rates are parametric.
Finally, scenario optimization [15,20] is widely used in control theory [13] and recently in machine learning [19] and reliability engineering [49]. Within a verification context, closest to our work is [5], which considers the verification of single measures for uncertain MDPs. [5] relies on the so-called sampling-anddiscarding approach [16], while we use the risk-and-complexity perspective [31], yielding better results for problems with many decision variables like we have.

Conclusion
This paper presents a novel approach to the analysis of parametric Markov models with respect to a set of performance characteristics. In particular, we provide a method that yields statistical guarantees on the typical performance characteristics from a finite set of samples of those parameters. Our experiments show that high-confidence results can be given based on a few hundred of samples. Future work includes supporting models with nondeterminism, exploiting aspects of parametric models such as monotonicity, and integrating methods to infer the distributions on the parameter space from observations.

A Additional Examples
In the following example, we discuss the results shown in Fig. 5 in more detail.
Example 4. We consider the Kanban manufacturing system benchmark from [23] with a Gaussian distribution over the parameters. In Fig. 5, we present n = 25 solution vectors for two expected cost measures. We use these solutions in problem L ρ U in Eq. (2), and solve for ρ = 2, 0.4, and 0.15. We show the three resulting prediction regions in Fig. 5. For ρ = 2, the prediction region contains all vectors, while for a lower cost of relaxation ρ, more vectors are left outside.

B.2 Proof of Lemma 1
Recall from the proof of Theorem 1 that for η(c * ρ ) the solution to Eq. (4), where c * ρ is the true complexity of problem L ρ U , it holds that Observe that for any two sets R * ρ ⊆ R, we have contain R ≥ contain R * ρ . Moreover, recall that η(c) is monotonically decreasing in c (as also observed visually from Fig. 6), and thus, the condition c * ρ ≤ d implies that η(d) ≤ η(c * ρ ). Hence, under the proposed conditions, we rewrite Eq. (10) as the right-hand side of Eq. (5), which concludes the proof.

B.3 Proof of Theorem 2
The one-dimensional case. Let us first consider the case for one dimension, i.e., one measure. Recall from Example 1 that for precise solutions in 1D, the the r-th entries of the respective imprecise solution vectors, i.e., the (upper and lower bound) solutions for measure ϕ r ∈ Φ and the CTMC created by sample u ∈ V M . In formalizing the relationship between the value of ρ and whether a sample is relaxed, we state the following definition: . . , n} as the number of samples whose upper bound is at least sol + (u i ) (or at most sol − (u i )), when projected to dimension r: For precise solutions, there is no distinction between sol − (u i ) and sol + (u i ), and Def. 9 yields the sets num ≥ (u i ) r and num ≤ (u i ) r . We can check whether a precise solution sol(u) is contained in the prediction region [x * ρ ,x * ρ ] to L ρ U by replacing num + → num and num − → num in Eq. (11).
Proof of Theorem 2 (part I). We proof the theorem by contradiction for the 1dimensional case, and we generalize afterward. In a single dimension, [x * ρ ,x * ρ ] ⊆ [x ρ ,x ρ ] requires that eitherx * ρ <x ρ orx * ρ >x ρ . First consider the upper bounds x ρ andx * ρ , which we can make explicit using Lemma 3: Forx * ρ >x ρ to hold, the maximum num ≥ (u) > ρ −1 (i.e., the highest precise solution for which there are more than ρ −1 solutions at least as high) must exceed num + ≥ (u) > ρ −1 (the highest imprecise upper bound solution for which there are more than ρ −1 imprecise upper bound solutions at least as high). This can only be true if the number of samples for which sol(u) >x is higher than the number for which sol + (u) >x . However, by construction, sol(u) ≤ sol + , so this is impossible, and thus, it holds thatx * ρ ≤x ρ . While omitted for brevity, the proof that the lower boundx * ρ ≥x ρ follows analogous to the upper bound.
The multi-dimensional case.
Lemma 4. Problem G ρ U can be decomposed into an independent problem for every dimension 1, . . . , m, with m = |Φ| the number of measures in Φ.
Lemma 4 holds because the objective Eq. (2a) is additive and all constraints for all measures Eq. (2b) are independent. Thus, we can equivalently solve problem L ρ U for all m measures separately. Proof of Theorem 2 (part II). We now generalize the result to multiple dimensions. Lemma 4 states that for rectangular prediction regions, problems L ρ U and G ρ U can be solved for each dimension separately. As such, we obtain an elementwise inequalityx ρ ≤x * ρ ≤x * ρ ≤x * ρ , which also implies that [x * ρ ,x * ρ ] ⊆ [x ρ ,x ρ ], so the claim in follows.

B.4 Proof of Theorem 3
Before providing the proof, we state the following useful lemma about surely noncritical samples: Lemma 5. Any surely noncritical sample cannot be in the (smallest) critical set, defined in Def. 7.
Proof. Recall from Def. 7 that a sample may (potentially) be critical if it is either outside or on the boundary of the prediction region [x * ρ ,x * ρ ]. While the boundary of the prediction region [x * ρ ,x * ] is unknown, it cannot be smaller than the inner rectangle I defined in Def. 8. By construction, any surely noncritical sample is a subset of this set I. Hence, any surely noncritical sample cannot be in the (smallest) critical set, and the claim follows.
The proof of Theorem 3 now follows almost directly from Lemma 5. The complexity is the cardinality of the smallest critical set, which cannot contain any surely noncritical sample, as stated by Lemma 5. Hence, it follows that n−|X| = |U n \X |, where X is the set of surely noncritical samples, which concludes the proof.

C Detailed Benchmark Overview and Results
In this appendix, we illustrate on two specific benchmarks how we convert a pCTMC into a upCTMC by equipping its parameters with a probability distribution. We omit details on the other benchmarks for brevity. Thereafter, we present the full benchmark statistics in Tab. 4, as well as more graphical outputs of our implementation in Fig. 14 and 15.
Epidemic modeling. We consider the classical SIR infection model [3] of an infectious disease spreading through a population. The factored state s = (S,Ī,R) of this CTMC counts the susceptible, infected, and recovered populations. Infections and recoveries (we assume that immunization is permanent) occur based on the following parametric rules that depend on parameters λ i and λ r : In the classical SIR model, λ i and λ r are assumed to be known precisely, while we consider the parameters λ i = N (0.05, 0.002) and λ r = N (0.04, 0.002) to be normally distributed (recall that we only use samples from these distributions).
We define a set Φ = {ϕ 100+100i : i = 1 m , 2 m , . . . , m−1 m , 1} of m measures, where ϕ t is the probability that the disease becomes extinct between time 100 and t: Buffer system. We augment the producer-consumer buffering system from [14]. We equip the six parameters of this pCTMC by uniform probability distributions with their domains specified below. This pCTMC models the transfer of requests from a producer (at a rate of λ g ∈ [32,38]) to consumers who consume them (at a rate of λ c ∈ [27,33]). The requests are sent at a rate of λ t ∈ [27,33], via either a slow or a fast buffer, with probabilities of 0.6 and 0.4, respectively. While being faster, the fast buffer is less reliable than the slow buffer (it loses requests with a probability λ loss ∈ [0.025, 0.075]), and has a smaller capacity. Requests from the slow buffer are transferred to the fast buffer with a probability proportional to the occupancy. The transmission rate of the slow buffer is λ slow ∈ [5,15]; the rate of the fast buffer is λ δ ∈ [5,15] higher. We consider two measures: (1) the expected transferred requests until time 25, and (2) the probability that the utilization of both buffers is above 75% within the time [20,25].

C.1 Detailed results
The complete overview of the statistics of all benchmarks is shown in Tab. 4. Running polling (15) for n = 200 samples led to a timeout, due to the very high sampling times (sampling 100 solutions already takes over 3.5 hours). The obtained prediction regions for eight benchmarks (under precise solutions) are presented in Fig. 14 and 15. These plots demonstrate the wide applicability and effectiveness of our method on a large variety of benchmarks.
Analyzing measures independently. Tab. 5 presents the full comparison on a Computed using approximate model checking up to a relative gap between upper bound sol + (u) and lower bound sol − (u) below 1% for every sample u ∈ VM. b Model size is unknown, as the approximation does not build the full state-space. Table 5: Obtained bounds (for the PCS fault tree) on the containment probability for our approach and the baseline that analyzes each measure independently. the PCS benchmark between our approach and the baseline scenario approach that analyzes each measure independently. In this table, we report the average lower bounds (over 10 iterations) on the containment probability, for different sample sizes n = 100, . . . , 800 and confidence levels β. For the two main reasons for the significant difference in the tightness of the containment probability, we refer to Sect. 6 in the main paper.