In the previous section, we introduced the notion of a family of MCs, the two synthesis problems, and the one-by-one approach. For families with many realisations, however, such a straightforward analysis is not feasible. We therefore propose a novel approach that allows us to analyse families of MCs more efficiently.
4.1 All-in-one MDP
We first consider a single MDP that subsumes all individual MCs of a family \(\mathfrak {D}\), and is equipped with an appropriate action and state labelling to identify the underlying realisations from \(\mathcal {R}^{\mathfrak {D}}\).
Definition 7
(All-in-one MDP [18, 28, 43]). The all-in-one MDP of a family \(\mathfrak {D} = (S,s_0,K,\mathfrak {P})\) of MCs is given as \(M^\mathfrak {D}=(S^\mathfrak {D},s_0^\mathfrak {D}, Act ^\mathfrak {D},\mathcal {P}^\mathfrak {D})\) where \(S^\mathfrak {D}=(S \times \mathcal {R}^{\mathfrak {D}}) \cup \{s_0^\mathfrak {D}\}\), \( Act ^\mathfrak {D}=\{a_r\mid r\in \mathcal {R}^{\mathfrak {D}}\}\), and \(\mathcal {P}^\mathfrak {D}\) is defined as follows:
$$\mathcal {P}^\mathfrak {D}(s_0^\mathfrak {D},a_r)((s_0,r)) = 1 \quad \text {and} \quad \mathcal {P}^\mathfrak {D}((s,r),a_r)((s',r)) = \mathfrak {P}(r)(s)(s').$$
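For concreteness, the following Python sketch builds the all-in-one MDP from a family. The encoding is an assumption made purely for illustration: realisations are hashable objects, and the realised transition function \(\mathfrak {P}(r)(s)(s')\) is given as nested dictionaries P[r][s].

```python
# Minimal sketch of Definition 7, under an assumed encoding: `realisations` is an
# iterable of hashable realisations r (e.g. tuples of parameter assignments), and
# P[r][s] is a dict mapping successor states to probabilities, i.e. P(r)(s)(.).
def all_in_one_mdp(states, s0, realisations, P):
    """Return (fresh initial state, transition function) of M^D."""
    init = "s0^D"                                  # fresh initial state s_0^D
    trans = {}                                     # (state, action) -> distribution
    for r in realisations:
        a_r = ("a", r)                             # one action a_r per realisation
        trans[(init, a_r)] = {(s0, r): 1.0}        # a_r enters the concrete MC D_r
        for s in states:
            # inside the copy tagged with r, follow the transitions of D_r
            trans[((s, r), a_r)] = {(t, r): p for t, p in P[r][s].items()}
    return init, trans
```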
Example 3
(All-in-one MDP). Figure 2 shows the all-in-one MDP \(M^\mathfrak {D}\) for the family \(\mathfrak {D}\) of MCs from Example 1. Again, states that are not reachable from the initial state \(s_0^\mathfrak {D}\) are marked grey. For the sake of readability, we only include the transitions and states that correspond to realisations \(r_1\) and \(r_2\).
From the (fresh) initial state \(s_0^\mathfrak {D}\) of the MDP, the choice of an action \(a_r\) corresponds to choosing the realisation r and entering the concrete MC \(D_r\). This property of the all-in-one MDP is formalised as follows.
Corollary 1
For the all-in-one MDP \(M^\mathfrak {D}\) of family \(\mathfrak {D}\) of MCs:
$$\begin{aligned} \{M^\mathfrak {D}_{\sigma ^r} \mid \sigma ^r \text { memoryless deterministic scheduler}\} = \{ D_r \mid r \in \mathcal {R}^\mathfrak {D} \}. \end{aligned}$$
Consequently, the feasibility synthesis problem for \(\varphi \) has the solution \(r\in \mathcal {R}^\mathfrak {D}\) iff there exists a memoryless deterministic scheduler \(\sigma ^r\) such that \(M^\mathfrak {D}_{\sigma ^r} \vDash \varphi \).
Approach 2
(All-in-one [18]). Model checking the all-in-one MDP determines the max or min probability (or expected reward) for all states, and thereby for all realisations, and thus provides a solution to both synthesis problems.
As the all-in-one MDP may also be too large for realistic problems, we merely use it as the formal starting point of our abstraction-refinement loop.
4.2 Abstraction
First, we define a predicate abstraction that, at each state of the MDP, forgets which realisation we are in, i.e., it abstracts away the second component of a state (s, r).
Definition 8
(Forgetting). Let \(M^\mathfrak {D}=(S^\mathfrak {D},s_0^\mathfrak {D}, Act ^\mathfrak {D},\mathcal {P}^\mathfrak {D})\) be an all-in-one MDP. Forgetting is an equivalence relation \(\sim _f\ \subseteq S^\mathfrak {D} \times S^\mathfrak {D}\) satisfying
$$\begin{aligned} (s,r)\sim _f (s',r') \iff s=s', \quad \text {and} \quad s_0^{\mathfrak {D}} \sim _f (s_0,r) \ \text { for all } r\in \mathcal {R}^\mathfrak {D}. \end{aligned}$$
Let \([s]_{\sim }\) denote the equivalence class wrt. \(\sim _f\) containing state \(s\in S^\mathfrak {D}\).
Forgetting induces the quotient MDP \(M^\mathfrak {D}_\sim = ( S^\mathfrak {D}_\sim ,[s_0^\mathfrak {D}]_\sim , Act ^\mathfrak {D},\mathcal {P}^\mathfrak {D}_\sim )\), where \(\mathcal {P}^\mathfrak {D}_\sim ([s]_\sim ,a_r)([s']_\sim ) = \mathfrak {P}(r)(s)(s')\).
At each state of the quotient MDP, the actions corresponding to all realisations are available. Moreover, the quotient MDP contains states that are unreachable in every realisation.
Remark 1
(Action space). According to Definition 8, for every state \([s]_\sim \) there are \(|\mathfrak {D}|\) actions. Many of these actions lead to the same distributions over successor states. In particular, two different realisations r and \(r'\) lead to the same distribution in s if \(r(k) = r'(k)\) for all \(k\in K\) with \(\mathfrak {P}(s)(k) \ne 0\). To avoid this spurious blow-up of actions, we a priori merge all actions yielding the same distribution, as sketched below.
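To illustrate Definition 8 together with the action merging of Remark 1, the sketch below constructs the quotient directly and keeps one merged action per distinct distribution; the dictionary-based encoding is again only an assumption made for illustration.

```python
# Sketch of the quotient MDP under forgetting (Definition 8) with the action merging
# of Remark 1: realisations inducing the same distribution in a state share one action.
# Assumed encoding as before: P[r][s] is the distribution P(r)(s)(.) over successor
# states, and realisations are hashable. The quotient initial state is [s_0].
def quotient_mdp_merged(states, s0, realisations, P):
    """Return (initial state, trans), where trans[s] maps each distinct distribution
    (a frozenset of (successor, probability) pairs) to the set of realisations
    selecting it -- the merged actions of Remark 1."""
    trans = {s: {} for s in states}
    for r in realisations:
        for s in states:
            dist = frozenset(P[r][s].items())        # distribution as a hashable key
            trans[s].setdefault(dist, set()).add(r)  # merge realisations with equal dist.
    return s0, trans
```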
In the quotient MDP obtained by forgetting, the available actions allow switching between realisations and thereby induce MCs that differ from every MC in \(\mathfrak {D}\). We therefore formalise the notion of a consistent realisation with respect to the parameters.
Definition 9
(Consistent realisation). For a family \(\mathfrak {D}\) of MCs and \(k\in K\), k-realisation-consistency is an equivalence relation \(\approx _k\ \subseteq \mathcal {R}^\mathfrak {D}{\times }\mathcal {R}^\mathfrak {D}\) satisfying:
$$ r\approx _k r' \Longleftrightarrow r(k)=r'(k). $$
Let \([r]_{\approx _k}\) denote the equivalence class w.r.t. \(\approx _k\) containing \(r\in \mathcal {R}^\mathfrak {D}\).
Definition 10
(Consistent scheduler). For quotient MDP \(M^\mathfrak {D}_\sim \) after forgetting and \(k\in K\), a scheduler \(\sigma \in \varSigma ^{M^\mathfrak {D}_\sim }\) is k-consistent if for all \(\pi ,\pi '\in \mathsf {Paths}_{ fin }^{M^\mathfrak {D}_\sim }\):
$$ \sigma (\pi )=a_r \wedge \sigma (\pi ')=a_{r'} \Longrightarrow r \approx _k r'\ . $$
A scheduler is K-consistent (short: consistent) if it is k-consistent for all \(k\in K\).
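For memoryless deterministic schedulers, K-consistency amounts to checking that all chosen actions agree on the value of every parameter. A minimal sketch, assuming (for illustration only) that the scheduler is encoded as a map from quotient states to realisations, and that realisations are dicts from parameters to values:

```python
# Sketch of Definition 10 for memoryless deterministic schedulers on the quotient
# (before action merging): sigma(state) = a_r is encoded as sigma[state] = r.
def is_consistent(sigma, K):
    """Return True iff, for every parameter k, all chosen actions agree on r(k)."""
    chosen = {}                           # parameter -> value fixed by earlier choices
    for r in sigma.values():
        for k in K:
            if k in chosen and chosen[k] != r[k]:
                return False              # two choices disagree on parameter k
            chosen[k] = r[k]
    return True
```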
Lemma 1
For the quotient MDP \(M^\mathfrak {D}_{\sim }\) of family \(\mathfrak {D}\) of MCs:
$$\begin{aligned} \{ \left( M^\mathfrak {D}_{\sim }\right) _{\sigma ^{r^*}} \mid \sigma ^{r^*} \text { consistent scheduler}\} = \{ D_r \mid r \in \mathcal {R}^\mathfrak {D} \}. \end{aligned}$$
Proof
(Idea). For \(\sigma ^r \in \varSigma ^{M^\mathfrak {D}}\), we construct \(\sigma ^{r^*} \in \varSigma ^{M^\mathfrak {D}_\sim }\) such that \(\sigma ^{r^*}([s]_\sim ) = a_r\) for all s. Clearly, \(\sigma ^{r^*}\) is consistent, and \(M^\mathfrak {D}_{\sigma ^r} = \left( M^\mathfrak {D}_{\sim }\right) _{\sigma ^{r^*}}\) is obtained via the map between (s, r) and \([s]_\sim \). Conversely, for \(\sigma ^{r^*} \in \varSigma ^{M^\mathfrak {D}_\sim }\), we construct \(\sigma ^r \in \varSigma ^{M^\mathfrak {D}}\) such that if \(\sigma ^{r^*}([s]_\sim ) = a_r\), then \(\sigma ^{r}(s_0^{\mathfrak {D}}) = a_r\). For all other states, we define \(\sigma ^{r}((s,r')) = a_{r'}\), independently of \(\sigma ^{r^*}\). Then \(M^\mathfrak {D}_{\sigma ^r} = \left( M^\mathfrak {D}_{\sim }\right) _{\sigma ^{r^*}}\) is obtained as above.
The following theorem is a direct corollary of Lemma 1: it suffices to consider exactly the consistent schedulers.
Theorem 2
For all-in-one MDP \(M^\mathfrak {D}\) and specification \(\varphi \), there exists a memoryless deterministic scheduler \(\sigma ^r \in \varSigma ^{M^\mathfrak {D}}\) such that \(M^\mathfrak {D}_{\sigma ^r} \vDash \varphi \) iff there exists a consistent deterministic scheduler \(\sigma ^{r^*}\in \varSigma ^{M^\mathfrak {D}_\sim }\) such that \(\left( M^\mathfrak {D}_{\sim }\right) _{\sigma ^{r^*}} \vDash \varphi \).
Example 4
Recall the all-in-one MDP \(M^\mathfrak {D}\) from Example 3. The quotient MDP \(M^\mathfrak {D}_\sim \) is depicted in Fig. 3. Only the transitions according to realisations \(r_1\) and \(r_2\) are included. Transitions from previously unreachable states, marked grey in Example 3, are now available due to the abstraction. The scheduler \(\sigma \in \varSigma ^{M^\mathfrak {D}_\sim }\) with \(\sigma ([s_0^\mathfrak {D}]_\sim )=a_{r_2}\) and \(\sigma ([1]_\sim )=a_{r_1}\) is not \(k_1\)-consistent, as different values are chosen for \(k_1\) by \(r_1\) and \(r_2\). In the MC \(\left( M^\mathfrak {D}_{\sim }\right) _{\sigma }\) induced by \(\sigma \) on \(M^\mathfrak {D}_\sim \), the probability of reaching state \([2]_\sim \) is one, whereas under realisation \(r_1\), state 2 is not reachable.
Approach 3
(Scheduler iteration). Enumerating all consistent schedulers for \(M^\mathfrak {D}_\sim \) and analysing the induced MC provides a solution to both synthesis problems.
However, such an iterative approach requires optimising over exponentially many consistent schedulers, and the feasibility synthesis problem is NP-complete, so it is unlikely to be efficient. Another natural approach is to employ solving techniques for NP-complete problems, such as satisfiability modulo linear real arithmetic (SMT).
Approach 4
(SMT). A dedicated SMT encoding (given in [11]) of the MCs induced by consistent schedulers of \(M^\mathfrak {D}_\sim \) solves the feasibility synthesis problem.
4.3 Refinement Loop
Although iterating over consistent schedulers (Approach 3) is not feasible, model checking of \(M^\mathfrak {D}_\sim \) still provides useful information for the analysis of the family \(\mathfrak {D}\). Recall the feasibility synthesis problem for \(\varphi =\mathbb {P}_{\le \lambda } (\phi )\). If \(\texttt {Prob}^{\max }(M^\mathfrak {D}_\sim ,\phi ) \le \lambda \), then all realisations of \(\mathfrak {D}\) satisfy \(\varphi \). On the other hand, \(\texttt {Prob}^{\min }(M^\mathfrak {D}_\sim ,\phi ) > \lambda \) implies that there is no realisation satisfying \(\varphi \). If \(\lambda \) lies between the \(\min \) and \(\max \) probability and the scheduler inducing the \(\min \) probability is not consistent, we cannot conclude anything yet, i.e., the abstraction is too coarse. A natural countermeasure is to refine the abstraction represented by \(M^\mathfrak {D}_\sim \), in particular, to split the set of realisations, leading to two synthesis sub-problems.
Definition 11
(Splitting). Let \(\mathfrak {D}\) be a family of MCs, and \(\mathcal {R} \subseteq \mathcal {R}^{\mathfrak {D}}\) a set of realisations. For \(k\in K\) and predicate \(A_k\) over S, splitting partitions \(\mathcal {R}\) into
$$\begin{aligned} \mathcal {R}_\top =\{ r \in \mathcal {R} \mid A_k(r(k))\} \quad \text {and} \quad \mathcal {R}_\bot =\{ r \in \mathcal {R} \mid \lnot A_k(r(k))\}. \end{aligned}$$
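A direct reading of Definition 11 as code; passing the predicate \(A_k\) as a Boolean function over the values of k is an interface assumed only for illustration:

```python
# Sketch of Definition 11: split a set of realisations on parameter k using a
# predicate A_k over the values of k. Realisations are assumed to be dicts
# k -> chosen value (hypothetical encoding).
def split(realisations, k, A_k):
    """Return (R_top, R_bot) as in Definition 11."""
    R_top = [r for r in realisations if A_k(r[k])]
    R_bot = [r for r in realisations if not A_k(r[k])]
    return R_top, R_bot
```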
Splitting the set of realisations, and considering the subfamilies separately, rather than splitting states in the quotient MDP, is crucial for the performance of the synthesis process as we avoid rebuilding the quotient MDP in each iteration. Instead, we only restrict the actions of the MDP to the particular subfamily.
Definition 12
(Restricting). Let \(M^\mathfrak {D}_\sim = ( S^\mathfrak {D}_\sim ,[s_0^\mathfrak {D}]_\sim , Act ^\mathfrak {D},\mathcal {P}^\mathfrak {D}_\sim )\) be a quotient MDP and \(\mathcal {R} \subseteq \mathcal {R}^{\mathfrak {D}}\) a set of realisations. The restriction of \(M^\mathfrak {D}_\sim \) wrt. \(\mathcal {R}\) is the MDP \(M^\mathfrak {D}_\sim [\mathcal {R}] = ( S^\mathfrak {D}_\sim ,[s_0^\mathfrak {D}]_\sim , Act ^\mathfrak {D}[\mathcal {R}],\mathcal {P}^\mathfrak {D}_\sim )\) where \( Act ^\mathfrak {D}[\mathcal {R}] = \{a_r \mid r \in \mathcal {R}\}.\)
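On the merged-action representation sketched after Remark 1, restriction amounts to intersecting the realisation sets attached to the distributions with \(\mathcal {R}\); a minimal sketch under that assumed representation:

```python
# Sketch of Definition 12 on the merged-action quotient representation used above:
# a choice survives the restriction iff at least one realisation from R still selects it.
def restrict(trans, R):
    """trans[s] maps distributions to realisation sets; return the transitions of M^D_~[R]."""
    R = set(R)
    return {s: {d: rs & R for d, rs in dists.items() if rs & R}
            for s, dists in trans.items()}
```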
The splitting operation is at the core of the proposed abstraction-refinement loop. Due to space constraints, we do not consider the feasibility synthesis problem separately.
Algorithm 1 illustrates the threshold synthesis process. Recall that the goal is to decompose the set \(\mathcal {R}^{\mathfrak {D}}\) into realisations satisfying and violating a given specification, respectively. The algorithm uses a set U to store subfamilies of \(\mathcal {R}^{\mathfrak {D}}\) that have not yet been classified as satisfying or violating. It starts by building the quotient MDP with merged actions. That is, we never construct the all-in-one MDP, and we merge actions as discussed in Remark 1. For every \(\mathcal {R} \in U\), the algorithm restricts the set of realisations to obtain the corresponding subfamily. For the restricted quotient MDP, the algorithm runs standard MDP model checking to compute the \(\max \) and \(\min \) probabilities and the corresponding schedulers. Then, the algorithm either classifies \(\mathcal {R}\) as satisfying or violating, or splits it based on a suitable predicate and updates U accordingly. We describe the splitting strategy in the next subsection. The algorithm terminates when U is empty, i.e., when all subfamilies have been classified. As only a finite number of subfamilies of realisations has to be evaluated, termination is guaranteed.
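The loop of Algorithm 1 can be paraphrased as follows; model_check and choose_split are placeholders for the MDP model-checking call and the splitting strategy of Section 4.4, not actual APIs, and the sketch targets a specification \(\mathbb {P}_{\le \lambda }(\phi )\):

```python
from collections import deque

# Paraphrase of Algorithm 1 (threshold synthesis). `model_check(quotient, R)` is a
# placeholder returning (Prob^min, Prob^max) on the restricted quotient MDP M^D_~[R];
# `choose_split(quotient, R)` is a placeholder returning the two subfamilies.
def threshold_synthesis(quotient, all_realisations, lam, model_check, choose_split):
    sat, unsat = [], []                      # classified subfamilies of R^D
    undecided = deque([all_realisations])    # the set U of unclassified subfamilies
    while undecided:
        R = undecided.popleft()
        lo, hi = model_check(quotient, R)    # min/max probability on the restriction
        if hi <= lam:
            sat.append(R)                    # all realisations in R satisfy phi
        elif lo > lam:
            unsat.append(R)                  # no realisation in R satisfies phi
        else:
            undecided.extend(choose_split(quotient, R))   # refine into two subfamilies
    return sat, unsat
```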
The refinement loop for max synthesis is very similar, cf. Algorithm 2. Recall that the goal is now to find the realisation \(r^*\) that maximises the satisfaction probability \(\max ^*\) of a path formula. The difference between the algorithms lies in the interpretation of the results of the underlying MDP model checking. If the \(\max \) probability for \(\mathcal {R}\) is below \(\max ^*\), \(\mathcal {R}\) can be discarded. Otherwise, we check whether the corresponding scheduler \(\sigma _{\max }\) is consistent. If it is consistent, the algorithm updates \(r^*\) and \(\max ^*\), and discards \(\mathcal {R}\). If the scheduler is not consistent but \(\min > \max ^{*}\) holds, we can still update \(\max ^*\) and improve the pruning process, as this means that some realisation (we do not know which) in \(\mathcal {R}\) induces a higher probability than \(\max ^*\). Regardless of whether \(\max ^*\) has been updated, the algorithm has to split \(\mathcal {R}\) based on some predicate and analyse its subfamilies, as they may include the maximising realisation.
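Analogously, a paraphrase of Algorithm 2; the placeholder model_check is assumed here to additionally report whether the maximising scheduler is consistent and, if so, which realisation it corresponds to:

```python
from collections import deque

# Paraphrase of Algorithm 2 (max synthesis). The placeholder model_check(quotient, R)
# is assumed to return (lo, hi, consistent, r): the min/max probability on M^D_~[R],
# whether the maximising scheduler sigma_max is consistent, and (if it is) the
# realisation r it corresponds to (otherwise r may be None).
def max_synthesis(quotient, all_realisations, model_check, choose_split):
    best_r, best = None, float("-inf")       # r* and max*
    undecided = deque([all_realisations])
    while undecided:
        R = undecided.popleft()
        lo, hi, consistent, r = model_check(quotient, R)
        if hi < best:
            continue                         # R cannot contain a better realisation
        if consistent:
            best_r, best = r, hi             # sigma_max is induced by realisation r
            continue
        if lo > best:
            best = lo                        # some realisation in R already exceeds max*
        undecided.extend(choose_split(quotient, R))   # subfamilies may contain r*
    return best_r, best
```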
4.4 Splitting Strategies
If verifying the quotient MDP \(M^\mathfrak {D}_\sim [\mathcal {R}]\) cannot classify the subfamily \(\mathcal {R}\) as satisfying or violating, we split \(\mathcal {R}\), guided by the obtained verification results. The splitting operation chooses a suitable parameter \(k \in K\) and predicate \(A_k\) that partition the realisations \(\mathcal {R}\) into \(\mathcal {R}_{\top }\) and \(\mathcal {R}_{\bot }\) (see Definition 11). A good splitting strategy globally reduces the number of model-checking calls required to classify all \(r\in \mathcal {R}\).
The two key aspects for locally determining a good k are: (1) the variance, that is, how the splitting may narrow the difference between \(\max = \texttt {Prob}^{\max }(M^\mathfrak {D}_\sim [\mathcal {X}],\phi )\) and \(\min = \texttt {Prob}^{\min }(M^\mathfrak {D}_\sim [\mathcal {X}],\phi )\) for both \(\mathcal {X} = \mathcal {R}_{\top }\) and \(\mathcal {X} = \mathcal {R}_{\bot }\), and (2) the consistency, that is, how the splitting may reduce the inconsistency of the schedulers \(\sigma _{\max }\) and \(\sigma _{\min }\). These aspects cannot be evaluated precisely without applying all the split operations and solving the new MDPs \(M^\mathfrak {D}_\sim [\mathcal {R}_{\bot }]\) and \(M^\mathfrak {D}_\sim [\mathcal {R}_{\top }]\). Therefore, we propose an efficient strategy that selects k and \(A_k\) based on a lightweight analysis of the model-checking results for \(M^\mathfrak {D}_\sim [\mathcal {R}]\). The strategy applies two scores \(\texttt {variance}(k)\) and \(\texttt {consistency}(k)\) that estimate the influence of k on the two key aspects. For any k, the scores are accumulated over all important states s (reachable via \(\sigma _{\max }\) or \(\sigma _{\min }\), respectively) where \(\mathfrak {P}(s)(k) \ne 0\). A state s is important for \(\mathcal {R}\) and some \(\delta \in \mathbb {R}_{\ge 0}\) if
$$\begin{aligned} \frac{\texttt {Prob}^{\max }(M^\mathfrak {D}_\sim [\mathcal {R}],\phi )(s) - \texttt {Prob}^{\min }(M^\mathfrak {D}_\sim [\mathcal {R}],\phi )(s)}{\texttt {Prob}^{\max }(M^\mathfrak {D}_\sim [\mathcal {R}],\phi ) - \texttt {Prob}^{\min }(M^\mathfrak {D}_\sim [\mathcal {R}],\phi )} \ge \delta \end{aligned}$$
where \(\texttt {Prob}^{\min }(.)(s)\) and \(\texttt {Prob}^{\max }(.)(s)\) denote the \(\min \) and \(\max \) probability in the MDP with initial state s. To reduce the overhead of computing the scores, we simplify the scheduler representation. In particular, for \(\sigma _{\max }\) and every \(k\in K\), we extract a map \(C_{\max }^k:T_k \rightarrow \mathbb {N}\), where \(C_{\max }^k(t)\) is the number of important states s for which \(\sigma _{\max }(s) = a_r\) with \(r(k) = t\). The mapping \(C_{\min }^k\) is defined analogously for \(\sigma _{\min }\).
We define \(\texttt {variance}(k) = \sum _{t\in T_k} |C_{\max }^k(t) - C_{\min }^k(t)|\), leading to high scores if the two schedulers vary a lot. Further, we define \( \texttt {consistency}(k) = \texttt {size}\left( C_{\max }^k\right) \cdot \texttt {max}\left( C_{\max }^k\right) + \texttt {size}\left( C_{\min }^k\right) \cdot \texttt {max}\left( C_{\min }^k\right) \), where \(\texttt {size}\left( C\right) = |\{t\in T_k \mid C(t) > 0\}|-1 \) and \(\texttt {max}\left( C\right) = \max _{t\in T_k}\{ C(t)\}\), leading to high scores if the parameter has clear favourites for \(\sigma _{\max }\) and \(\sigma _{\min }\), but values from its full range are chosen.
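The two scores are cheap to compute once the maps \(C_{\max }^k\) and \(C_{\min }^k\) are available; a sketch, assuming they are given as dictionaries from values \(t\in T_k\) to counts:

```python
# Sketch of the two scores from this subsection. C_max and C_min are assumed to be
# dicts t -> number of important states whose chosen action has r(k) = t, and T_k
# is the list of admissible values of parameter k.
def variance_score(C_max, C_min, T_k):
    """High if the choices of sigma_max and sigma_min for k differ a lot."""
    return sum(abs(C_max.get(t, 0) - C_min.get(t, 0)) for t in T_k)

def consistency_score(C_max, C_min, T_k):
    """High if k has clear favourites but values from its full range are chosen."""
    def size(C):      # number of distinct values of k that are chosen, minus one
        return len([t for t in T_k if C.get(t, 0) > 0]) - 1
    def peak(C):      # count of the most frequently chosen value
        return max((C.get(t, 0) for t in T_k), default=0)
    return size(C_max) * peak(C_max) + size(C_min) * peak(C_min)
```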
As indicated, we consider different strategies for the two synthesis problems. For threshold synthesis, we favour the impact on the variance as we principally do not need consistent schedulers. For the max synthesis, we favour the impact on the consistency, as we need a consistent scheduler inducing the \(\max \) probability.
The predicate \(A_k\) is based on reducing the variance: the strategy selects \(T' \subset T_k\) with \(|T'| = \left\lceil \tfrac{|T_k|}{2}\right\rceil \), containing those t for which \(C_{\max }^k(t) - C_{\min }^k(t)\) is largest. The goal is to obtain one set of realisations that induce a large probability (those whose value for parameter k lies in \(T'\)) and a complementary set that induces a small probability.
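The resulting predicate can be represented by the selected value set \(T'\) itself; a sketch under the same dictionary assumptions as above:

```python
from math import ceil

# Sketch of the choice of A_k: select the half of T_k on which sigma_max and sigma_min
# disagree the most, i.e. where C_max(t) - C_min(t) is largest; A_k(v) holds iff v lies
# in the selected set T'.
def choose_predicate(C_max, C_min, T_k):
    ranked = sorted(T_k, key=lambda t: C_max.get(t, 0) - C_min.get(t, 0), reverse=True)
    T_prime = set(ranked[:ceil(len(T_k) / 2)])
    return lambda v: v in T_prime
```

The returned function can then be handed directly to the splitting operation sketched after Definition 11.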
Approach 5
(MDP-based abstraction refinement). The methods underlying Algorithms 1 and 2, together with the splitting strategies, provide solutions to the synthesis problems and are referred to as MDP abstraction methods.