An unfolding-based approach as discussed in Sect. 3.2 does not scale well in terms of memory consumption: If the original MDP has n states, then the unfolding has on the order of \(n \cdot \prod _{i=1}^m (b_i + 2)\) states. This blow-up makes an a priori unfolding infeasible for larger cost limits \(b_i\) over multiple bounds. The bottleneck lies in computing the points \({\mathbf {p}}_{{\mathbf {w}}}\) as in equations 1 and 2. In this section, we show how to compute these probability vectors efficiently, i.e. given a weight vector \({\mathbf {w}} = \langle w_1, \dots , w_{\ell } \rangle \in [0,1]^{\ell }\setminus \{ \mathbf {0} \}\), compute
$$\begin{aligned} {\mathbf {p}}_{{\mathbf {w}}} = \langle \mathbb {P}^{\mathfrak {S}}_{M} ({\varphi }_1), \dots , \mathbb {P}^{\mathfrak {S}}_{M} ({\varphi }_{\ell }) \rangle ~ \text {with} ~ \mathfrak {S} \in \arg \max _{\mathfrak {S} '} \left( \sum _{k=1}^{{\ell }} w_k \cdot \mathbb {P}^{\mathfrak {S} '}_{M} ({\varphi }_k) \right) \end{aligned}$$
(3)
without creating the unfolding. The two characterisations of \({\mathbf {p}}_{{\mathbf {w}}}\) given in equations 1 and 3 are equivalent due to Lemma 1.
The efficient analysis of single-objective queries \(\Phi _1 = \mathbb {P}^{\mathrm {max}}_{M} (\langle \mathrm {C}_{} \rangle _{\le b}\, G)\) with a single bound has recently been addressed [28, 37]. The key idea is based on dynamic programming. The unfolding \({M_ unf }\) is decomposed into \(b + 2\) epoch model MDPs \(M^{b}, \dots , M^{0}, M^{\bot }\) such that the epoch model MDPs correspond to the cost epochs. Each epoch model MDP is a copy of M with only slight adaptations (detailed later). The crucial observation is that, since costs are non-negative, reachability probabilities in copies corresponding to epoch i only depend on the copies \(\{\, M^{j}~|~j \le i \vee j = \bot \,\}\). It is thus possible to analyse \(M^{\bot }, \dots , M^{b}\) sequentially instead of considering all copies at once. In particular, it is not necessary to construct the complete unfolding.
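To make the single-bound idea concrete, the following minimal Python sketch computes \(\mathbb {P}^{\mathrm {max}}_{M} (\langle \mathrm {C}_{} \rangle _{\le b}\, G)\) by solving one epoch after the other instead of building the unfolding. The MDP encoding (a dict mapping states to actions, each action a list of (probability, cost, successor) branches) is our own illustration, not the paper's data structures, and all costs are assumed to be at least 1 so that each epoch only depends on previously solved epochs (no in-epoch cycles).

```python
def pmax_cost_bounded(actions, goal, s_init, b):
    """Max probability to reach `goal` with accumulated cost at most b.

    actions: dict state -> list of actions; an action is a list of
    (probability, cost, successor) branches. All costs are assumed >= 1, so
    the values of epoch j only depend on epochs 0..j-1."""
    states = list(actions)
    # epoch j holds the values for "remaining budget j"; exceeding the
    # budget (the bottom epoch) contributes value 0
    table = [{s: (1.0 if s in goal else 0.0) for s in states}]  # epoch 0
    for j in range(1, b + 1):
        vj = {}
        for s in states:
            if s in goal:
                vj[s] = 1.0  # goal reached within the remaining budget
                continue
            vj[s] = max((sum(p * (table[j - c][s2] if c <= j else 0.0)
                             for p, c, s2 in act)
                         for act in actions[s]), default=0.0)
        table.append(vj)
    return table[b][s_init]
```

For a toy MDP whose initial state moves to the goal with probability 0.5 and loops otherwise, at cost 1 per step, the value for budget b is \(1 - 0.5^{b}\).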
We lift this idea to multi-objective tradeoffs with multiple cost bounds: we aim to build an MDP for each epoch \({{\mathbf {e}}}\in {\mathbf {E}}_{{m}}\) that can be analysed via standard model checking techniques using the weighted expected cost encoding of objective probabilities. Notably, in the single cost bound case with a single objective, it is easy to determine whether the one property is satisfied: reaching a goal state for the first time or exceeding the cost bound immediately settles it. Thus, while \(M^{\bot }\) is just one sink state in the single cost bound case, its structure is more involved in the presence of multiple objectives and multiple cost bounds.
An Epoch Model Approach without Unfolding
We first formalise epoch models for multiple bounds. As noted, the overall epoch structure is the same as in the unfolding approach.
Example 9
We illustrate the structure of the epoch models in Fig. 5. For our running example MDP \(M_ ex \) of Fig. 2a with bounds 4 and 3, we obtain \((4+2) \cdot (3+2) = 30\) epoch models. The epoch models can be partitioned into 4 partitions (indicated by the dashed lines), with all epoch models inside a partition having the same MDP structure. The overall graph of the epoch models is acyclic (up to self-loops). From the maximum costs in \(M_ ex \), we a priori know that e.g. epoch model \(M^{\langle 2,1 \rangle }\) can only be reached from epochs \(M^{\langle i,j \rangle }\) with \(i \le 2, j \le 1\). In our illustration, we only show the transitions between the epoch models that are forward-reachable from \(M^{\langle 4,3 \rangle }\); observe that in this example, these are significantly fewer than what the backward-reachability argument based on the maximum costs gives, which are again only a fraction of all possible epochs.
Before we give a formal definition of the epoch model in Definition 10, we give an intuitive description. The state space of an individual epoch model for epoch \({{\mathbf {e}}}\) consists of up to one copy of each original state for each of the \(2^{m}\) goal satisfaction vectors \({{\mathbf {g}}}\in {{\mathbf {G}}_{{m}}}\). Additional sink states \(\langle s_\bot , {{\mathbf {g}}} \rangle \) encode the target for a jump to any other cost epoch \({{\mathbf {e}}}' \ne {{\mathbf {e}}}\). Similar to the unfolding \({M_ unf ^{+}}\), we use the function \({ satObj _\Phi }:{{\mathbf {G}}_{{m}}} \times {{\mathbf {G}}_{{m}}} \rightarrow \{0,1\}^{\ell }\) to assign cost 1 for objectives that change from not (yet) satisfied to satisfied, based on the information in the two goal satisfaction vectors. More precisely, we put cost 1 in entry \(1 \le k \le {\ell }\) if and only if the reachability property \({\varphi }_k\) is satisfied according to the target goal satisfaction vector but not according to the previous one. For the transitions’ branches, we distinguish two cases:
1.
If the successor epoch \( {{\mathbf {e}}}' = { succ ({{\mathbf {e}}},{\mathbf {c}})}\) with respect to the original cost \({\mathbf {c}} \in \mathbb {N} ^{m}\) of M is the same as the current epoch \({{\mathbf {e}}}\), we jump to the successor state as before and update the goal satisfaction. We collect cost 1 for every objective that is newly satisfied according to \({ satObj _\Phi }\), i.e. for every objective that is satisfied by the new goal satisfaction vector but was not satisfied by the old one.
2.
If the successor epoch \({{\mathbf {e}}}' = { succ ({{\mathbf {e}}},{\mathbf {c}})}\) is different from the current epoch \({{\mathbf {e}}}\), the transition’s branch is redirected to the sink state \(\langle s_\bot ,{{\mathbf {g}}}'\rangle \) with the corresponding goal satisfaction vector. Notice that this may require merging some branches; hence we have to sum over all branches.
The collected costs contain the part of the goal satisfaction as in item 1, but also the results obtained by analysing the successor epoch \({{\mathbf {e}}}'\). The latter is incorporated by a function \(f :{{\mathbf {G}}_{{m}}} \times \mathrm {Dist}({\mathbb {N} ^m \times S}) \rightarrow [0,1]^{\ell }\) such that the k-th entry of the vector \(f({{\mathbf {g}}}, \mu )\) reflects the probability to newly satisfy the k-th objective after leaving the current epoch via distribution \(\mu \).
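The epoch and goal-satisfaction bookkeeping described above can be sketched in a few lines; `BOT`, the function names, and the representation of an objective as the set of bound indices it conjoins are our own illustrative choices, not the paper's notation.

```python
BOT = None  # component value for "bound already exceeded"

def succ_epoch(e, c):
    """Successor epoch: subtract cost c_i from each remaining budget e_i,
    dropping to BOT once the budget is exceeded; BOT stays BOT."""
    return tuple(BOT if ei is BOT or ci > ei else ei - ci
                 for ei, ci in zip(e, c))

def sat_obj(g_old, g_new, objectives):
    """0/1 cost vector with a 1 in entry k iff objective k (a set of bound
    indices that must all hold) changes from unsatisfied to satisfied."""
    def sat(g, obj):
        return all(g[i] for i in obj)
    return tuple(int(sat(g_new, obj) and not sat(g_old, obj))
                 for obj in objectives)
```

For instance, with objectives \(\{0\}\) and \(\{0,1\}\), moving from goal satisfaction \((0,0)\) to \((1,0)\) yields cost vector \((1,0)\): the first objective is newly satisfied, the second is not yet.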
Definition 10
The epoch model of an MDP M as in Definition 1 for \({{\mathbf {e}}}\in {\mathbf {E}}_{{m}}\) and a function \(f :{{\mathbf {G}}_{{m}}} \times \mathrm {Dist}({\mathbb {N} ^m \times S}) \rightarrow [0,1]^{\ell }\) is the MDP \(M^{{\mathbf {e}}}_f =\langle S ^{{\mathbf {e}}}, T ^{{\mathbf {e}}}_f, \langle s_{ init }, {\mathbf {0}} \rangle \rangle \) with \({\ell }\) cost structures defined by
$$\begin{aligned} S ^{{\mathbf {e}}}{\mathop {=}\limits ^{\mathrm{def}}}(S \uplus s_\bot ) \times {{\mathbf {G}}_{{m}}} ,\quad T ^{{\mathbf {e}}}_f(\langle s_\bot , {{\mathbf {g}}} \rangle ) = \{\,\mathscr {D}(\langle {\mathbf {0}}, \langle s_\bot , {{\mathbf {g}}} \rangle \rangle ) \,\}, \end{aligned}$$
and for every \({\tilde{s}}= \langle s, {{\mathbf {g}}} \rangle \in S ^{{\mathbf {e}}}\) and \(\mu \in T (s)\), there is a \(\nu \in T ^{{\mathbf {e}}}_f({\tilde{s}})\) such that
1.
\(\nu (\langle { satObj _\Phi }({{\mathbf {g}}}, {{\mathbf {g}}}'), \langle s', {{\mathbf {g}}}' \rangle \rangle ) = \mu ({\mathbf {c}}, s') \cdot [{ succ ({{\mathbf {e}}}, {\mathbf {c}})} = {{\mathbf {e}}}] \cdot [{ succ ({{\mathbf {g}}}, s', {{\mathbf {e}}})} = {{\mathbf {g}}}']\)
2.
\(\nu (\langle { satObj _\Phi }({{\mathbf {g}}}, {{\mathbf {g}}}') + f({{\mathbf {g}}}, \mu ), \langle s_\bot , {{\mathbf {g}}}' \rangle \rangle ) = \sum _{\langle {\mathbf {c}}, s' \rangle } \mu ({\mathbf {c}}, s') \cdot [{ succ ({{\mathbf {e}}}, {\mathbf {c}})} = {{\mathbf {e}}}' \ne {{\mathbf {e}}}] \cdot [{ succ ({{\mathbf {g}}}, s', {{\mathbf {e}}}')} = {{\mathbf {g}}}']\)
In contrast to Definition 1, the MDP \(M^{{\mathbf {e}}}_f\) may consider cost vectors that consist of non-natural numbers—as reflected by the image of f. The two items in the definition reflect the two cases described before. For item 2, the sum \({ satObj _\Phi }({{\mathbf {g}}}, {{\mathbf {g}}}') + f({{\mathbf {g}}}, \mu )\) reflects the two cases where an objective is satisfied in the current step (upon taking a branch that leaves the epoch) or only afterwards. In particular, our algorithm constructs f in a way that \({{ satObj _\Phi }({{\mathbf {g}}}, {{\mathbf {g}}}')}[{k}] = 1\) implies \({f({{\mathbf {g}}}, \mu )}[{k}] = 0\).
Example 10
Figure 6 shows an epoch model \(M^{{\mathbf {e}}}_f\) of the MDP \(M_ ex \) in Fig. 2a with respect to tradeoff \(\Phi \) as in Example 4 and any epoch \({{\mathbf {e}}}\in {\mathbf {E}}_{m}\) in the partition where \({{{\mathbf {e}}}}[{1}] \ne \bot \) and \({{{\mathbf {e}}}}[{2}] \ne \bot \).
As already mentioned, the structure of \(M^{{\mathbf {e}}}_f\) differs only slightly between epochs. In particular, consider epochs \({{\mathbf {e}}}\) and \({{\mathbf {e}}}'\) with \({{{\mathbf {e}}}}[{i}] = \bot \) if and only if \({{{\mathbf {e}}}'}[{i}] = \bot \). To construct epoch model \(M^{{{\mathbf {e}}}'}_{f'}\) from \( M^{{{\mathbf {e}}}}_f\), only the transitions to the bottom states \(\langle s_\bot , {{\mathbf {g}}} \rangle \) need to be adapted, which amounts to adapting f accordingly.
Consider the unfolding \({M_ unf ^{+}}\) with \({\ell }\) cost structures as in Sect. 3.3. Intuitively, the states of \(M^{{\mathbf {e}}}_f\) reflect the states of \({M_ unf ^{+}}\) with cost epoch \({{\mathbf {e}}}\). We use the function f to propagate values for the remaining states of \({M_ unf ^{+}}\). This is formalised by the following lemma. We use the notation \({\mathbb {E}^{\mathfrak {S}}_{{M_ unf ^{+}}} ({\mathbf {w}})}[{\langle s, {{\mathbf {e}}}, {{\mathbf {g}}} \rangle }] \) for the weighted expected costs for \({M_ unf ^{+}}\) when changing the initial state to \(\langle s, {{\mathbf {e}}}, {{\mathbf {g}}} \rangle \).
Lemma 2
Let \(M=\langle S, T, s_{ init } \rangle \) be an MDP with unfolding \({M_ unf ^{+}}= \langle S ', T ', s_{ init } ' \rangle \) as above. Further, let \(M^{{\mathbf {e}}}_f = \langle S ^{{\mathbf {e}}}, T ^{{\mathbf {e}}}_f, \langle s_{ init }, {\mathbf {0}} \rangle \rangle \) be an epoch model of M for epoch \({{\mathbf {e}}}\in {\mathbf {E}}_{{m}}\), and f given by
$$\begin{aligned} {f({{\mathbf {g}}}, \mu )}[{k}] = \frac{1}{\mu _{\mathrm {exit}}} \sum _{\langle {\mathbf {c}}, s' \rangle } \mu ({\mathbf {c}}, s') \cdot [{ succ ({{\mathbf {e}}}, {\mathbf {c}})} = {{\mathbf {e}}}' \ne {{\mathbf {e}}}] \cdot {\mathbb {E}^{\mathrm {max}}_{{M_ unf ^{+}}} ({\mathbf {1}}_k)}[{\langle s', {{\mathbf {e}}}', { succ ({{\mathbf {g}}}, s', {{\mathbf {e}}}')} \rangle }] \end{aligned}$$
if \(\mu _{\mathrm {exit}} = \sum _{\langle {\mathbf {c}},s \rangle } \mu ({\mathbf {c}},s) \cdot [{ succ ({{\mathbf {e}}},{\mathbf {c}})} \ne {{\mathbf {e}}}] > 0\) and \({f({{\mathbf {g}}}, \mu )}[{k}] = 0\) otherwise. For every weight vector \({\mathbf {w}} \in [0,1]^{\ell }\) and state \(\langle s,{{\mathbf {g}}} \rangle \) of \(M^{{\mathbf {e}}}_f\) with \(s \ne s_\bot \) we have
$$\begin{aligned} {\mathbb {E}^{\mathrm {max}}_{{M_ unf ^{+}}} ({\mathbf {w}})}[{\langle s, {{\mathbf {e}}}, {{\mathbf {g}}} \rangle }] = {\mathbb {E}^{\mathrm {max}}_{M^{{\mathbf {e}}}_f} ({\mathbf {w}})}[{\langle s, {{\mathbf {g}}} \rangle }]. \end{aligned}$$
Proof
We apply the characterisation of (weighted) expected rewards as the smallest solution of a Bellman equation system [25, 46]. For \({M_ unf ^{+}}\), assume variables \({{\mathbf {x}}}[{\langle s, \hat{{{\mathbf {e}}}}, {{\mathbf {g}}} \rangle }] \in \mathbb {R}_{\ge 0}\) for every \(\langle s, \hat{{{\mathbf {e}}}}, {{\mathbf {g}}} \rangle \in S '\). The smallest solution of the equation system
$$\begin{aligned} \forall \langle s,\hat{{{\mathbf {e}}}}, {{\mathbf {g}}} \rangle \in S ' :\quad {{\mathbf {x}}}[{\langle s, \hat{{{\mathbf {e}}}}, {{\mathbf {g}}} \rangle }] = \max _{\mu \in T '(\langle s, \hat{{{\mathbf {e}}}}, {{\mathbf {g}}} \rangle )} \Bigg ( \sum _{\langle {\mathbf {c}}, {\hat{s}} \rangle } \mu (\langle {\mathbf {c}}, {\hat{s}} \rangle ) \cdot \big ({\mathbf {w}} \cdot {\mathbf {c}} + {{\mathbf {x}}}[{{\hat{s}}}] \big )\Bigg ) \end{aligned}$$
(4)
satisfies \({{\mathbf {x}}}[{\langle s, \hat{{{\mathbf {e}}}}, {{\mathbf {g}}} \rangle }] = {\mathbb {E}^{\mathrm {max}}_{{M_ unf ^{+}}} ({\mathbf {w}})}[{\langle s, \hat{{{\mathbf {e}}}}, {{\mathbf {g}}} \rangle }] \). Similarly, for \(M^{{\mathbf {e}}}_f\), the smallest solution of
$$\begin{aligned} \forall \langle s, {{\mathbf {g}}} \rangle \in S ^{{\mathbf {e}}}:\quad {{\mathbf {y}}^{{\mathbf {e}}}}[{\langle s,{{\mathbf {g}}} \rangle }] = \max _{\nu \in T ^{{\mathbf {e}}}_f(\langle s, {{\mathbf {g}}} \rangle )} \Bigg ( \sum _{\langle {\mathbf {c}}, {\tilde{s}} \rangle } \nu (\langle {\mathbf {c}}, {\tilde{s}} \rangle ) \cdot \big ({\mathbf {w}} \cdot {\mathbf {c}} + {{\mathbf {y}}^{{\mathbf {e}}}}[{{\tilde{s}}}] \big )\Bigg ) \end{aligned}$$
(5)
satisfies \({{\mathbf {y}}^{{\mathbf {e}}}}[{\langle s,{{\mathbf {g}}} \rangle }] = {\mathbb {E}^{\mathrm {max}}_{M^{{\mathbf {e}}}_f} ({\mathbf {w}})}[{\langle s, {{\mathbf {g}}} \rangle }] \). We prove the lemma by showing the following claim: If \({{\mathbf {x}}}[{\langle s, \hat{{{\mathbf {e}}}}, {{\mathbf {g}}} \rangle }] \) for \(\langle s, \hat{{{\mathbf {e}}}}, {{\mathbf {g}}} \rangle \in S '\) is the smallest solution for Eq. 4, the smallest solution for Eq. 5 is given by \({{\mathbf {y}}^{{\mathbf {e}}}}[{\langle s,{{\mathbf {g}}} \rangle }] = [s \ne s_\bot ] \cdot {{\mathbf {x}}}[{\langle s, {{\mathbf {e}}}, {{\mathbf {g}}} \rangle }] \) for \(\langle s,{{\mathbf {g}}} \rangle \in S ^{{\mathbf {e}}}\).
Let \({{\mathbf {x}}}[{\langle s, \hat{{{\mathbf {e}}}}, {{\mathbf {g}}} \rangle }] \) be the smallest solution for Eq. 4. Since no costs are collected from \(s_\bot \) onwards in \(M^{{\mathbf {e}}}_f\), we can show that \({{\mathbf {y}}^{{\mathbf {e}}}}[{\langle s_\bot ,{{\mathbf {g}}} \rangle }] = 0\) has to hold. Now let \(\langle s, {{\mathbf {g}}} \rangle \in S ^{{\mathbf {e}}}\) with \(s \ne s_\bot \). To improve readability, we write \({{\mathbf {e}}}'\) for \({ succ ({{\mathbf {e}}},{\mathbf {c}})}\) and \({{\mathbf {g}}}'\) for \({ succ ({{\mathbf {g}}}, s', {{\mathbf {e}}}')}\).
$$\begin{aligned}&{{\mathbf {y}}^{{\mathbf {e}}}}[{\langle s,{{\mathbf {g}}} \rangle }] = [s \ne s_\bot ] \cdot {{\mathbf {x}}}[{\langle s, {{\mathbf {e}}}, {{\mathbf {g}}} \rangle }] = {{\mathbf {x}}}[{\langle s, {{\mathbf {e}}}, {{\mathbf {g}}} \rangle }] \\&\quad = \max _{\mu \in T '(\langle s, {{\mathbf {e}}}, {{\mathbf {g}}} \rangle )} \Bigg ( \sum _{\langle {\mathbf {c}}, {\hat{s}} \rangle } \mu (\langle {\mathbf {c}}, {\hat{s}} \rangle ) \cdot \big ({\mathbf {w}} \cdot {\mathbf {c}} + {{\mathbf {x}}}[{{\hat{s}}}] \big )\Bigg ) \\&\quad = \max _{\mu \in T (s)} \Bigg ( \sum _{\langle {\mathbf {c}}, s' \rangle } \mu (\langle {\mathbf {c}}, s' \rangle ) \cdot \big ({\mathbf {w}} \cdot { satObj _\Phi }({{\mathbf {g}}}, {{\mathbf {g}}}') + {{\mathbf {x}}}[{\langle s',{{\mathbf {e}}}',{{\mathbf {g}}}' \rangle }] \big )\Bigg ) \\&\quad = \max _{\mu \in T (s)} \Bigg ( \sum _{\langle {\mathbf {c}}, s' \rangle } [{{\mathbf {e}}}= {{\mathbf {e}}}'] \cdot \mu (\langle {\mathbf {c}}, s' \rangle ) \cdot \big ({\mathbf {w}} \cdot { satObj _\Phi }({{\mathbf {g}}}, {{\mathbf {g}}}') + {{\mathbf {x}}}[{\langle s',{{\mathbf {e}}}',{{\mathbf {g}}}' \rangle }] \big ) \\&\qquad + \sum _{\langle {\mathbf {c}}, s' \rangle } [{{\mathbf {e}}}\ne {{\mathbf {e}}}'] \cdot \mu (\langle {\mathbf {c}}, s' \rangle ) \cdot \big ({\mathbf {w}} \cdot { satObj _\Phi }({{\mathbf {g}}}, {{\mathbf {g}}}') + {\mathbb {E}^{\mathrm {max}}_{{M_ unf ^{+}}} ({\mathbf {w}})}[{\langle s', {{\mathbf {e}}}', {{\mathbf {g}}}' \rangle }] \big )\Bigg ) \\&\quad = \max _{\mu \in T (s)} \Bigg ( \sum _{\langle {\mathbf {c}}, s' \rangle } [{{\mathbf {e}}}= {{\mathbf {e}}}'] \cdot \mu (\langle {\mathbf {c}}, s' \rangle ) \cdot \big ({\mathbf {w}} \cdot { satObj _\Phi }({{\mathbf {g}}}, {{\mathbf {g}}}') + {{\mathbf {x}}}[{\langle s',{{\mathbf {e}}}',{{\mathbf {g}}}' \rangle }] \big ) \\&\qquad + \sum _{\langle {\mathbf {c}}, s' \rangle } [{{\mathbf {e}}}\ne {{\mathbf {e}}}'] \cdot \mu (\langle {\mathbf {c}}, s' \rangle ) \cdot \sum _{\bar{{{\mathbf {g}}}} \in {{\mathbf {G}}_{{m}}}} [{{\mathbf {g}}}' = \bar{{{\mathbf {g}}}}] \cdot \big ({\mathbf {w}} \cdot { satObj _\Phi }({{\mathbf {g}}}, \bar{{{\mathbf {g}}}}) \big ) \\&\qquad + \sum _{\langle {\mathbf {c}}, s' \rangle } [{{\mathbf {e}}}\ne {{\mathbf {e}}}'] \cdot \mu (\langle {\mathbf {c}}, s' \rangle ) \cdot \big ({\mathbf {w}} \cdot f({{\mathbf {g}}}, \mu )\big ) \cdot \sum _{\bar{{{\mathbf {g}}}} \in {{\mathbf {G}}_{{m}}}} [{{\mathbf {g}}}' = \bar{{{\mathbf {g}}}}] \Bigg )\\&\quad = \max _{\mu \in T (s)} \Bigg ( \sum _{\langle {\mathbf {c}}, s' \rangle } [{{\mathbf {e}}}= {{\mathbf {e}}}'] \cdot \mu (\langle {\mathbf {c}}, s' \rangle ) \cdot \big ({\mathbf {w}} \cdot { satObj _\Phi }({{\mathbf {g}}}, {{\mathbf {g}}}') + {{\mathbf {x}}}[{\langle s',{{\mathbf {e}}}',{{\mathbf {g}}}' \rangle }] \big ) \\&\qquad + \sum _{\bar{{{\mathbf {g}}}} \in {{\mathbf {G}}_{{m}}}} {\mathbf {w}} \cdot \big ( { satObj _\Phi }({{\mathbf {g}}}, \bar{{{\mathbf {g}}}}) + f({{\mathbf {g}}}, \mu ) \big ) \cdot \sum _{\langle {\mathbf {c}}, s' \rangle } [{{\mathbf {e}}}\ne {{\mathbf {e}}}'] \cdot \mu (\langle {\mathbf {c}}, s' \rangle ) \cdot [{{\mathbf {g}}}' = \bar{{{\mathbf {g}}}}] \Bigg ) \\&\quad = \max _{\nu \in T _f^{{\mathbf {e}}}(s)} \Bigg ( \sum _{\langle {\mathbf {c}}, \langle s', {{\mathbf {g}}}' \rangle \rangle } [s' \ne s_\bot ] \cdot \nu (\langle {\mathbf {c}}, \langle s', {{\mathbf {g}}}' \rangle \rangle ) \cdot \big ({\mathbf {w}} \cdot {\mathbf {c}} + {{\mathbf {x}}}[{\langle s',{{\mathbf {e}}},{{\mathbf {g}}}' \rangle }] \big ) \\&\qquad + \sum _{\langle {\mathbf {c}}, \langle s_\bot , {{\mathbf {g}}}' \rangle \rangle } \nu ({\mathbf {c}}, \langle s_\bot , {{\mathbf {g}}}' \rangle ) \cdot {\mathbf {w}} \cdot {\mathbf {c}} \Bigg ) \\&\quad = \max _{\nu \in T _f^{{\mathbf {e}}}(s)} \Bigg ( \sum _{\langle {\mathbf {c}}, \langle s', {{\mathbf {g}}}' \rangle \rangle } \nu (\langle {\mathbf {c}}, \langle s', {{\mathbf {g}}}' \rangle \rangle ) \cdot \big ({\mathbf {w}} \cdot {\mathbf {c}} + [s' \ne s_\bot ] \cdot {{\mathbf {x}}}[{\langle s',{{\mathbf {e}}},{{\mathbf {g}}}' \rangle }] \big )\Bigg ) \\&\quad = \max _{\nu \in T _f^{{\mathbf {e}}}(s)} \Bigg ( \sum _{\langle {\mathbf {c}}, \langle s', {{\mathbf {g}}}' \rangle \rangle } \nu (\langle {\mathbf {c}}, \langle s', {{\mathbf {g}}}' \rangle \rangle ) \cdot \big ({\mathbf {w}} \cdot {\mathbf {c}} + {{\mathbf {y}}^{{\mathbf {e}}}}[{\langle s',{{\mathbf {g}}}' \rangle }] \big )\Bigg ). \end{aligned}$$
We conclude that \({{\mathbf {y}}^{{\mathbf {e}}}}[{\langle s,{{\mathbf {g}}} \rangle }] = [s \ne s_\bot ] \cdot {{\mathbf {x}}}[{\langle s, {{\mathbf {e}}}, {{\mathbf {g}}} \rangle }] \) is indeed a solution for Eq. 5. If there is a smaller solution \({\hat{{\mathbf {y}}}^{{\mathbf {e}}}}[{\langle s,{{\mathbf {g}}} \rangle }] < {{\mathbf {y}}^{{\mathbf {e}}}}[{\langle s,{{\mathbf {g}}} \rangle }] \), the equalities above can be used to construct a smaller solution for Eq. 4, violating our assumption for \({{\mathbf {x}}}[{\langle s, {{\mathbf {e}}}, {{\mathbf {g}}} \rangle }] \). \(\square \)
To analyse an epoch model \(M^{{\mathbf {e}}}_f\), any successor epoch \({{\mathbf {e}}}'\) of \({{\mathbf {e}}}\) needs to have been analysed before. Since costs are non-negative, we can ensure this by analysing the epochs in a specific order: In the case of a single cost bound, this order is uniquely given by \(\bot ,0,1, \dots , b\).
Definition 11
Let \({\preceq } \subseteq {\mathbf {E}}_{{m}} \times {\mathbf {E}}_{{m}}\) be the partial order with
$$\begin{aligned} {{\mathbf {e}}}' \preceq {{\mathbf {e}}}\quad \text {iff}\quad \forall \, i :{{{\mathbf {e}}}'}[{i}] \le {{{\mathbf {e}}}}[{i}] \vee {{{\mathbf {e}}}'}[{i}] = \bot . \end{aligned}$$
A proper epoch sequence is a sequence of epochs \({\mathfrak {E}}= {{\mathbf {e}}}_1, \dots , {{\mathbf {e}}}_n\) such that (i) \({{\mathbf {e}}}_1 \leqslant {{\mathbf {e}}}_2 \leqslant \dots \leqslant {{\mathbf {e}}}_{n}\) for some linearisation \(\leqslant \) of \(\preceq \) and (ii) if \({{\mathbf {e}}}\) occurs in \({\mathfrak {E}}\) and \({{\mathbf {e}}}' \preceq {{\mathbf {e}}}\), then \({{\mathbf {e}}}'\) also occurs in \({\mathfrak {E}}\).
For multiple cost bounds, any proper epoch sequence can be used. Such a sequence corresponds to a topological sort of the graph of epoch models in Fig. 5. To improve performance, we group the epoch models with a common MDP structure.
Example 11
For the epoch models depicted in Fig. 5, a possible proper epoch sequence is
$$\begin{aligned} {\mathfrak {E}}= \langle \bot , \bot \rangle , \langle 0, \bot \rangle , \langle 2, \bot \rangle , \langle \bot , 1 \rangle , \langle \bot , 3 \rangle , \langle 1, 1 \rangle , \langle 0, 3 \rangle , \langle 3, 1 \rangle , \langle 2, 3 \rangle , \langle 4, 3 \rangle . \end{aligned}$$
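A valid ordering can be generated, for instance, by sorting all epochs by their component values with \(\bot \) treated as \(-1\); sorting by this key linearises \(\preceq \). The sketch below (names are ours) enumerates all \(\prod _i (b_i+2)\) epochs rather than only the reachable ones shown in the example above.

```python
from itertools import product

BOT = None  # component value for "bound already exceeded"

def epoch_key(e):
    """Map an epoch to a comparable tuple; BOT sorts below every number."""
    return tuple(-1 if v is BOT else v for v in e)

def preceq(e1, e2):
    """The partial order of Definition 11: componentwise <=, with BOT
    below everything."""
    return all(v1 is BOT or (v2 is not BOT and v1 <= v2)
               for v1, v2 in zip(e1, e2))

def proper_epoch_sequence(bounds):
    """All epochs for the given cost limits, ordered by a linearisation of
    the partial order (componentwise <= implies lexicographic <= on keys)."""
    epochs = product(*[[BOT] + list(range(b + 1)) for b in bounds])
    return sorted(epochs, key=epoch_key)
```

For the running example's bounds 4 and 3 this yields all \(6 \cdot 5 = 30\) epochs, starting at \(\langle \bot , \bot \rangle \) and ending at \(\langle 4, 3 \rangle \).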
We compute the points \({\mathbf {p}}_{{\mathbf {w}}}\) by analysing the different epoch models (i.e. the coordinates of Fig. 3b) sequentially, using a dynamic programming-based approach. The main procedure is outlined in Algorithm 1. The costs of the model for the current epoch \({{\mathbf {e}}}\) are computed in lines 2-12. These costs comprise the results from previously analysed epochs \({{\mathbf {e}}}'\) (line 7). In lines 13-16, the current epoch model \(M^{{\mathbf {e}}}_f\) is built and analysed: We compute weighted expected costs on \(M^{{\mathbf {e}}}_f\) where \({\mathbb {E}^{\mathfrak {S}}_{M^{{\mathbf {e}}}_f} ({\mathbf {w}})}[{s}] \) denotes the expected costs for \(M^{{\mathbf {e}}}_f\) when changing the initial state to s. In line 14, a (deterministic and memoryless) scheduler \(\mathfrak {S} \) that induces the maximal weighted expected costs (i.e. \({\mathbb {E}^{\mathfrak {S}}_{M^{{\mathbf {e}}}_f} ({\mathbf {w}})}[{s}] = \max _{\mathfrak {S} '} {\mathbb {E}^{\mathfrak {S} '}_{M^{{\mathbf {e}}}_f} ({\mathbf {w}})}[{s}] \) for all states s) is computed. In line 16, we then compute the expected costs induced by \(\mathfrak {S} \) for the individual objectives. Forejt et al. [25] describe how this computation can be implemented with a value iteration-based procedure. Alternatively, we can apply policy iteration or linear programming [46] for this purpose.
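The core of this dynamic programming scheme, specialised to a single multi-bounded reachability objective, can be sketched as follows. This is a simplified illustration of the epoch-by-epoch sweep, not a reproduction of Algorithm 1; the encoding is our own, and every branch cost is assumed positive in every component so that each epoch model is acyclic and depends only on epochs solved earlier in the sequence.

```python
from itertools import product

BOT = None  # bound already exceeded

def pmax_multi_bounded(actions, goal, s_init, bounds):
    """Max probability to reach `goal` while simultaneously keeping the
    accumulated i-th cost within bounds[i] for every i.

    actions: dict state -> list of actions; an action is a list of
    (probability, cost vector, successor) branches, all cost components
    assumed positive."""
    def succ(e, c):
        return tuple(BOT if ei is BOT or ci > ei else ei - ci
                     for ei, ci in zip(e, c))

    def key(e):
        return tuple(-1 if v is BOT else v for v in e)

    epochs = sorted(product(*[[BOT] + list(range(b + 1)) for b in bounds]),
                    key=key)
    x = {}  # x[(epoch, state)]: value computed by the backward sweep
    for e in epochs:
        dead = any(v is BOT for v in e)  # some bound already exceeded
        for s in actions:
            if dead:
                x[(e, s)] = 0.0          # the simultaneous bound is violated
            elif s in goal:
                x[(e, s)] = 1.0          # goal reached within all bounds
            else:
                x[(e, s)] = max((sum(p * x[(succ(e, c), s2)]
                                     for p, c, s2 in act)
                                 for act in actions[s]), default=0.0)
    return x[(tuple(bounds), s_init)]
```

For an MDP that moves to the goal with probability 0.5 at cost \(\langle 1,1 \rangle \) per step, bounds \(\langle 2,3 \rangle \) give probability \(0.5 + 0.25 = 0.75\) (success within two steps).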
Theorem 1
The output of Algorithm 1 satisfies Eq. 3.
Proof
We have to show:
$$\begin{aligned} {{{x^{\mathrm {last}({\mathfrak {E}})}}}[{s_{ init } ^{\mathrm {last}({\mathfrak {E}})}}]} = \langle \mathbb {P}^{\mathfrak {S}}_{M} ({\varphi }_1), \dots , \mathbb {P}^{\mathfrak {S}}_{M} ({\varphi }_{\ell }) \rangle ~ \text {with} ~ \mathfrak {S} \in \arg \max _{\mathfrak {S} '} \left( \sum _{k=1}^{{\ell }} w_k \cdot \mathbb {P}^{\mathfrak {S} '}_{M} ({\varphi }_k) \right) \end{aligned}$$
We prove the following statement for each epoch \({{\mathbf {e}}}\):
$$\begin{aligned} {{{x^{{{\mathbf {e}}}}}}[{\langle s, {{\mathbf {g}}} \rangle }]} = \langle \mathbb {P}^{\mathfrak {S}}_{M} ({\varphi }'_1), \dots , \mathbb {P}^{\mathfrak {S}}_{M} ({\varphi }'_{\ell }) \rangle ~ \text {with} ~ \mathfrak {S} \in \arg \max _{\mathfrak {S} '} \left( \sum _{k=1}^{{\ell }} w_k \cdot \mathbb {P}^{\mathfrak {S} '}_{M} ({\varphi }'_k) \right) \end{aligned}$$
where
$$\begin{aligned} {\varphi }'_k = \bigwedge _{i=n_{k-1}}^{n_k-1}(\langle \mathrm {C}_{{i}} \rangle _{\le {{\mathbf {e}}}[i]}\, G_{i})\text { using } {\varphi }_k = \bigwedge _{i=n_{k-1}}^{n_k-1}(\langle \mathrm {C}_{{i}} \rangle _{\le b_{i}}\, G_{i}) \end{aligned}$$
i.e. \({\varphi }'_k\) is obtained from \({\varphi }_k\) by adapting the cost limits based on the current epoch. For \({{{\mathbf {e}}}}[{i}] = \bot \) we assume that the cost bound \(\langle \mathrm {C}_{{i}} \rangle _{\le \bot }\, G_i\) is not satisfied by any path.
This statement implies that the algorithm correctly computes the bounded reachability probabilities for all states and all epochs; instantiating it for the last epoch of \({\mathfrak {E}}\) yields Eq. 3. We prove the statement by induction over any proper epoch sequence. For the induction base, the algorithm correctly computes the epoch \(\langle \bot , \ldots , \bot \rangle \). In particular, notice that there exists an optimal memoryless scheduler on the unfolding, and thus a memoryless scheduler on the epoch model. For the induction step, let \({{\mathbf {e}}}\) be the currently analysed epoch. Since \({\mathfrak {E}}\) is assumed to be a proper epoch sequence, we have already computed every reachable successor epoch \({{\mathbf {e}}}'\) of \({{\mathbf {e}}}\), i.e. line 7 is only executed for epochs \({{\mathbf {e}}}'\) for which \(x^{{{\mathbf {e}}}'}\) has already been computed. By the induction hypothesis, these \({{{{{x^{{{\mathbf {e}}}'}}}[{\langle s, {{\mathbf {g}}} \rangle }]}}[{k}]}\) computed by the algorithm coincide with the probability to satisfy \({\varphi }_k'\) from state \(\langle s, {{\mathbf {e}}}', {{\mathbf {g}}} \rangle \) in the unfolding \({M_ unf }\) under a scheduler \(\mathfrak {S} \) that maximises the weighted sum. Hence, the algorithm computes the function f as given in Lemma 2. The algorithm then computes weighted expected costs for the epoch model and writes them into \({{{{{x^{{{\mathbf {e}}}}}}[{\langle s, {{\mathbf {g}}} \rangle }]}}[{k}]}\). By Lemma 2, these values coincide with those of the unfolding. \(\square \)
Runtime and Memory Requirements
In the following, we discuss the complexity of our approach relative to the size of a binary encoding of the cost limits \(b_1, \dots , b_{m}\) occurring in a tradeoff \(\Phi \). Algorithm 1 computes expected weighted costs for \(|{\mathfrak {E}}|\) many epoch models \(M^{{\mathbf {e}}}_f\). Each of these computations can be done in polynomial time (in the size of \(M^{{\mathbf {e}}}_f\)) via a linear programming encoding [46]. With \(|{\mathfrak {E}}| \le \prod _{i=1}^{{m}} (b_i + 2)\), we conclude that the runtime of Algorithm 1 is exponential in a binary encoding of the cost limits. For the unfolding approach, weighted expected costs have to be computed for a single MDP whose size is, again, exponential in a binary encoding of the cost limits. Although we observe similar theoretical runtime complexities for both approaches, experiments with topological value iteration [5, 19] and single cost bounds [2, 28] have shown the practical benefits of analysing several small sub-models instead of one large MDP. We make similar observations with our approach in Sect. 7.
Algorithm 1 stores a solution vector \({{{x^{{{\mathbf {e}}}}}}[{\langle s, {{\mathbf {g}}} \rangle }]} \in \mathbb {R} ^{\ell }\) for each \({{\mathbf {e}}}\in {\mathfrak {E}}\), \(s \in S \), and \({{\mathbf {g}}}\in {{\mathbf {G}}_{{m}}}\), i.e. a solution vector is stored for every state of the unfolding. However, memory consumption can be optimised by erasing solutions \({{{x^{{{\mathbf {e}}}}}}[{\langle s, {{\mathbf {g}}} \rangle }]}\) as soon as this value is not accessed by any of the remaining epoch models (for example if all predecessor epochs of \({{\mathbf {e}}}\) have been considered already). If \(m=1\) (i.e. there is only a single cost bound), such an optimisation yields an algorithm that runs in polynomial space. In the general case (\(m>1\)), the memory requirements remain exponential in the size of a binary encoding of the cost limits. However, our experiments in Sect. 7 indicate substantial memory savings in practice.
Error Propagation
As presented above, the algorithm assumes that (weighted) expected costs \(\mathbb {E}^{\mathfrak {S}}_{M} ({\mathbf {w}})\) are computed exactly. Practical implementations, however, are often based on numerical methods that only approximate the correct solution. The de facto standard in MDP model checking for this purpose is value iteration. Methods based on value iteration do not provide any guarantee on the accuracy of the obtained result [27] for the properties considered here. Recently, interval iteration [5, 27] and similar techniques [9, 34, 48] have been suggested to provide error bounds. These methods guarantee that the obtained result \(x[s]\) is \(\varepsilon \)-precise for any predefined precision \(\varepsilon > 0\), i.e. upon termination we obtain \(|x[s] - {\mathbb {E}^{\mathfrak {S}}_{M} ({\mathbf {w}})}[{s}] | \le \varepsilon \) for all states s. We now describe how to adapt our approach for multi-objective multi-cost bounded reachability to work with an \(\varepsilon \)-precise method for computing the expected costs.
General Models
Results on topological interval iteration [5] indicate that it suffices to analyse the individual epochs with precision \(\varepsilon \) to guarantee this same precision for the overall result. The downside is that such an adaptation requires storing the obtained bounds for all previously analysed epochs. We therefore extend the following result from [28] instead.
Lemma 3
For the single-cost bounded variant of Algorithm 1, to compute \(\mathbb {P}^{\mathrm {max}}_{M} (\langle \mathrm {C}_{} \rangle _{\le b}\, G)\) with precision \(\varepsilon \), each epoch model needs to be analysed with precision \(\frac{\varepsilon }{b+1}\).
The bound is easily deduced: assume the results of previously analysed epochs (given by f) are \(\eta \)-precise and that \(M^{{\mathbf {e}}}_f\) is analysed with precision \(\delta \). The total error for \(M^{{\mathbf {e}}}_f\) can accumulate to at most \(\delta + \eta \). As we analyse \(b+1\) (non-trivial) epoch models, the error thus accumulates to \((b+1) \cdot \delta \). Setting \(\delta \) to \(\frac{\varepsilon }{b+1}\) guarantees the desired bound \(\varepsilon \). We generalise this result to multi-cost bounded tradeoffs.
Theorem 2
If the values \({{{{{x^{{{\mathbf {e}}}}}}[{{\tilde{s}}}]}}[{k}]}\) at line 16 of Algorithm 1 are computed with precision \(\varepsilon / \sum _{i=1}^{{m}} (b_i +1)\) for some \(\varepsilon > 0\), the output \({\mathbf {p}}_{{\mathbf {w}}}'\) of the algorithm satisfies \(| {\mathbf {p}}_{{\mathbf {w}}} - {\mathbf {p}}_{{\mathbf {w}}}'| \cdot {\mathbf {w}} \le \varepsilon \) where \({\mathbf {p}}_{{\mathbf {w}}}\) is as in Eq. 3.
Proof
As in the single-cost bounded case, the total error for \(M^{{\mathbf {e}}}_f\) can accumulate to \(\delta + \eta \), where \(\eta \) is the (maximal) error bound on f. The error bound on f is in turn recursively bounded by \(\delta \) times the maximal number of epochs visited along paths starting from the successor epochs. Since a path through the MDP M visits at most \(\sum _{i=1}^{m}(b_i + 1)\) non-trivial cost epochs, each incurring an error of at most \(\delta \), the overall error can be upper-bounded by \(\delta \cdot \sum _{i=1}^{m}(b_i + 1)\). \(\square \)
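As a small worked instance of Theorem 2 (the numbers are ours): for the running example's bounds 4 and 3, a path visits at most \((4+1)+(3+1) = 9\) non-trivial epochs, so a target precision of \(\varepsilon = 10^{-4}\) requires solving each epoch model with precision \(10^{-4}/9\).

```python
def per_epoch_precision(bounds, eps):
    """Precision with which each epoch model must be analysed (Theorem 2):
    a path visits at most sum(b_i + 1) non-trivial cost epochs, and each
    per-epoch analysis may contribute an error of at most this delta."""
    return eps / sum(b + 1 for b in bounds)

# bounds 4 and 3: at most 5 + 4 = 9 non-trivial epochs along any path
delta = per_epoch_precision([4, 3], 1e-4)
# the accumulated worst-case error then stays within the global target
assert delta * 9 <= 1e-4 + 1e-19
```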
While an approach based on Theorem 2 thus requires the analysis of epoch models with tighter error bounds than the bounds induced by [5], and therefore potentially increases the per-epoch analysis time, it still allows us to be significantly more memory-efficient.
Acyclic Epoch Models
The error bound in Theorem 2 is pessimistic, as it does not assume any structure in the epoch models. However, very often, the individual epoch models are in fact acyclic, in particular for cost epochs \({{\mathbf {e}}}\in \mathbb {N}^{m}\), i.e. \({{{\mathbf {e}}}}[{i}] \ne \bot \) for all i. Intuitively, costs usually represent quantities like time or energy usage for which the possibility to perform infinitely many interesting steps without accumulating cost would be considered a modelling error. In the timed case, for example, such a model would allow Zeno behaviour, which is generally considered unrealistic and undesirable. When epoch models are acyclic, interval iteration [5, 27] will converge to the exact result in a finite number of iterations. In this case, the tightening of the precision according to Theorem 2 usually has no effect on runtime. The epoch models for cost epochs \({{\mathbf {e}}}\in \mathbb {N}^{m}\) are acyclic for almost all models that we experiment with in Sect. 7.
Different Bound Types
Minimising Objectives Objectives \(\mathbb {P}^{\mathrm {min}}_{M} ({\varphi }_k)\) can be handled by adapting the function \({ satObj _\Phi }\) in Definition 10 such that it assigns cost \(-1\) to branches that lead to the satisfaction of \({\varphi }_k\). To obtain the desired probabilities, we then maximise negative costs and multiply the result by \(-1\) afterwards. As interval iteration supports mixtures of positive and negative costs [5], arbitrary combinations of minimising and maximising objectives can be considered.
Beyond Upper Bounds Our approach also supports bounds of the form \(\langle \mathrm {C}_{j} \rangle _{\sim b}\, G\) for \({\sim } \in \{<, \le , >, \ge \}\), and we allow combinations of lower and upper cost bounds. Strict upper bounds \(< b\) can be reformulated to non-strict upper bounds \(\le b-1\). Likewise, we reformulate non-strict lower bounds \(\ge b\) to \(> b-1\), and only consider strict lower bounds in the following.
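The reformulation of bound types can be captured in a few lines; the function name and the string encoding of the relations below are our own illustration.

```python
def normalise_bound(rel, b):
    """Rewrite strict upper bounds and non-strict lower bounds so that only
    '<=' (upper) and '>' (lower) remain, as described above."""
    if rel == "<":      # cost < b   is equivalent to   cost <= b-1
        return ("<=", b - 1)
    if rel == ">=":     # cost >= b  is equivalent to   cost > b-1
        return (">", b - 1)
    return (rel, b)     # '<=' and '>' are kept unchanged
```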
For bound \(\langle \mathrm {C}_{i} \rangle _{> b_i}\, G_i\) we adapt the update of goal satisfactions (Definition 7) such that
$$\begin{aligned} {{ succ ({{\mathbf {g}}}, s, {{\mathbf {e}}})}}[{i}] = {\left\{ \begin{array}{ll} 1 &{} \text {if }s\in G_i \wedge {{{\mathbf {e}}}}[{i}] = \bot , \\ {{\mathbf {g}}}[i] &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$
Moreover, we support multi-bounded single-goal queries like \(\langle \mathrm {C}_{(j_1, \dots , j_n)} \rangle _{(\sim _{1} b_{1}, \dots , \sim _{n} b_{n})}\, G\), which characterises the paths \(\pi \) with a prefix \(\pi _\mathrm {fin}\) satisfying \(\mathrm {last}(\pi _\mathrm {fin}) \in G\) and all cost bounds simultaneously, i.e. \( \mathrm {cost}_{j_i}(\pi _\mathrm {fin}) \sim _i b_i\). Let us clarify the meaning of simultaneously with an example.
Example 12
The formula \({\varphi }= \langle \mathrm {C}_{(1,1)} \rangle _{(\le 1, \ge 1)}\, G\) expresses the paths that reach G while collecting exactly one cost with respect to the first cost structure. This formula is not equivalent to \({\varphi }' = \langle \mathrm {C}_{1} \rangle _{\le 1}\, G \wedge \langle \mathrm {C}_{1} \rangle _{\ge 1}\, G\). Consider the trivial MDP in Fig. 7 with \(G= \{\,s_0\,\}\). The MDP (and the trivial strategy) satisfies \({\varphi }'\) but not \({\varphi }\): Initially, the left-hand side of \({\varphi }'\) is (already) satisfied, and after one more step along the unique path, also the right-hand side is satisfied, thereby satisfying the conjunction. However, there is no point where exactly cost 1 is collected, hence \({\varphi }\) is never satisfied.
Expected Cost Objectives The algorithm supports cost-bounded expected cost objectives \(\mathbb {E}^{ opt }_{M} (\mathrm {R}_{j_1}, \langle \mathrm {C}_{j_2} \rangle _{\le b}\,)\) with \( opt \in \{\, \mathrm {max}, \mathrm {min}\,\}\), which refer to the expected cost accumulated for cost structure \(j_1\) within a given cost bound \(\langle \mathrm {C}_{j_2} \rangle _{\le b}\,\). The computation is analogous to cost-bounded reachability queries: we treat them by computing (weighted) expected costs within epoch models. Therefore, they can be used in multi-objective queries, potentially in combination with cost-bounded reachability objectives.