Introduction

Probabilistic model checking is a verification technique for analysing the behaviour of stochastic agents and checking whether they satisfy given desirable properties. It relies on: (i) a formal model of the agent; (ii) a formal specification of the desirable properties; and (iii) algorithms that exhaustively explore the model and verify whether the specified properties are met.

Markov models are the standard formalism for modelling the stochastic behaviour of agents. These include both discrete- and continuous-time Markov chains and their several extensions [1]. Different approaches exist to specify properties of these models: graph-oriented approaches based on non-deterministic state automata [2, 3], algebraic approaches based on formalisms like the performance evaluation process algebra [4, 5], and approaches based on logical languages, such as the Probabilistic Computational Tree Logic (PCTL) [6] and its various extensions [2]. Here, we limit our focus to logical approaches and the related semantics and algorithms.

Probabilistic model checking is applied to a variety of fields, ranging from software verification [7] and communication protocols [8,9,10], to service-oriented architectures [11,12,13] and computational biology [14, 15]. Moreover, different frameworks have been proposed so far to cope with stochastic multi-agent systems. These frameworks are generally based on Kripke-like semantic frames and related epistemic languages [16]. Among them, the computationally-grounded weighted doxastic logic proposed in [17] extends the well-known Computation Tree Logic (CTL) [2] with a weighted doxastic operator to specify single- and multi-agent beliefs. Similarly, the Probabilistic Computation Tree Logic of Knowledge (PCTLK) [18] is a PCTL extension including single- and multi-agent epistemic modalities to specify different epistemic-probabilistic properties of stochastic multi-agent systems. Both formalisms base their semantics on probabilistic interpreted systems, a class of structures obtained by merging Kripke structures [16] with Markov models.

Despite its success, probabilistic model checking suffers from a well-known limitation: it requires all the transition probabilities to be defined by “sharp” (or “precise”) numerical values. This constraint might be critical in several applications, as it prevents the modelling of both non-stationary agents and agents characterised by partial uncertainty about the values of the transition probabilities. An appealing way to overcome this limitation is offered by so-called parametric Markovian models [19], where precise state transition probabilities are replaced with unknown parameters. This is the solution adopted, for instance, in [20]. However, complexity issues related to the corresponding model checking procedure, based on fraction-free Gaussian elimination, limit its applicability to models of small size.

A less explored alternative is provided by the formalism of imprecise probabilities [21] and related imprecise Markov models, namely imprecise Markov chains (IMC) and their extensions [22,23,24,25]. The latter can be seen as the imprecise counterparts of standard Markov chains and are obtained by replacing single-valued probability distributions with so-called credal sets, i.e. sets of probability distributions compatible with given constraints [26].

A first attempt to extend probabilistic model checking to the framework of imprecise probabilities was proposed in [22]. The paper introduces an imprecise version of PCTL based on IMCs and proves that shifting from precise to imprecise models does not increase the time complexity of the relevant model checking tasks, which remains polynomial in the number of states of the models. A first extension of the results in [22] is provided in [25], which introduces a specific language and the corresponding model checking procedure for imprecise Markov reward models (IMRM), partially based on the recursive procedures outlined in [27].

In both [22, 25], only single-agent systems are considered. A multi-agent extension of this formalism is instead presented in [28]. In that paper, the authors introduce a class of structures to model epistemic-stochastic multi-agent systems, called imprecise probabilistic interpreted systems (IPIS), together with a language to specify their properties (called EIPCTL) and a related model checking procedure.

The present work introduces a new framework that combines and extends the results in [25, 28]. The framework is based on Imprecise Probabilistic Interpreted Reward Systems (IPIRS), a class of imprecise-probabilistic models obtained by combining the IPISs presented in [28] with the IMRMs introduced in [25]. A language to specify properties of these models, called Epistemic Imprecise Probabilistic Reward Computation Tree Logic (EIPRCTL), is consequently introduced. The latter is obtained by extending the EIPCTL introduced in [28] with operators specific to reward properties. Corresponding model checking algorithms are then developed based on an iterative scheme exploiting the same transition operator used in [22] for IMCs. Furthermore, the preliminary computational complexity results included in [22] and [28] are generalised, hence proving that shifting from precise to imprecise models does not increase the time complexity of the relevant model checking tasks. The developed formalism and algorithms are finally tested on a case study borrowed from the medical domain.

The paper is structured as follows. In “Markov Models”, we provide a general introduction to different kinds of Markov models and related probabilistic inferences. In “Imprecise Markov Models”, we introduce the relevant kinds of imprecise Markov models and the corresponding methods to compute probabilistic inferences over them. In “Epistemic Imprecise PRCTL”, we define the syntax and semantics of EIPRCTL. In “Model Checking”, we develop a model checker for EIPRCTL exploiting the procedures for computing probabilistic and epistemic inferences in imprecise Markov models introduced in “Imprecise Markov Models”. In “A Case Study on Healthcare Budgeting”, we offer an example of application based on a case study borrowed from the medical domain. Finally, in “Conclusions”, we conclude with some remarks about further potential applications and developments of our framework. The proofs of the theorems are all reported in Appendix A.

Markov Models

Markov Chains

Let \({\mathcal {S}}\) be a finite non-empty set of possible states. We are interested in modelling stochastic agents that, at each discrete time step \(t\in \mathbb {N}\), shift from a state \(s\in {\mathcal {S}}\) to another, not necessarily different, state \(s'\in {\mathcal {S}}\). We assume the stochastic behaviour of an agent to be time-homogeneous, that is, the probability of a transition from s to \(s'\) is independent of the time t at which it occurs, and memory-less, that is, the probability of each transition is independent of the previously occurred transitions. Under these conditions, the behaviour of the agent can be described in terms of a discrete-time Markov chain (DTMC).

Definition 1

(Discrete-time Markov chain) A discrete-time Markov chain \(M_\textsf{DTMC}\) is a tuple:

$$\begin{aligned} M_\textsf{DTMC}:= \left\langle {\mathcal {S}}, T, \iota \right\rangle , \end{aligned}$$

where:

  • \({\mathcal {S}}\) is a finite non-empty set of states;

  • \(T: {\mathcal {S}}\times {\mathcal {S}}\rightarrow [0,1]\) is a transition matrix that assigns a probability value to each transition \((s,s')\in {\mathcal {S}}\times {\mathcal {S}}\), with \(\sum _{s'\in {\mathcal {S}}}T(s,s')=1\) for every \(s\in {\mathcal {S}}\);

  • \(\iota : {\mathcal {S}}\rightarrow [0,1]\) is a probability distribution that assigns an initial probability value to each \(s\in {\mathcal {S}}\).

Given a DTMC \(M_\textsf{DTMC}\), we call path a function \(\pi : \mathbb {N}\rightarrow {\mathcal {S}}\) whose values are the states reached by \(M_\textsf{DTMC}\) at the various time-steps t. Accordingly, each path \(\pi\) describes a possible temporal evolution of the Markov chain and corresponds to a countably infinite sequence of states. In what follows, the notation \(\pi (t)\) is used to refer to the state of the path \(\pi\) at time t, while \(\textrm{Paths}(s)\) denotes the set of all paths \(\pi\) originating in a given state \(s\in {\mathcal {S}}\) (i.e. such that \(\pi (0) = s\)). The set of all possible paths \(\pi\) of a given DTMC \(M_\textsf{DTMC}\) is denoted by \(\Pi ^{M_\textsf{DTMC}}\) and represents the set of all possible outcomes of the temporal evolution of the Markov chain.

To relate paths with probabilities, we endow \(\Pi ^{M_\textsf{DTMC}}\) with a \(\sigma\)-algebra and augment it to a probability space as follows. Given a path \(\pi\) of \(M_\textsf{DTMC}\), a finite prefix \(\hat{\pi }\) of \(\pi\) is any sequence \((\pi (0),\dots ,\pi (t))\) originating in \(\pi (0)\) and including a finite number of subsequent states of \(\pi\). The set of all finite prefixes of a given path \(\pi\) is denoted by \(pref(\pi )\), while the set of all finite prefixes \(\hat{\pi }\) originating in a given state \(s\in {\mathcal {S}}\) is denoted by \(\textrm{Paths}_{fin}(s)\) (see, [2,  Sec. 10.1]).

Definition 2

(Cylinder set) The cylinder set \(Cyl(\hat{\pi })\) induced by a finite prefix \(\hat{\pi }\) is defined as

$$\begin{aligned} Cyl(\hat{\pi }):= \left\{ \pi \in \Pi ^{M_\textsf{DTMC}} \mid \hat{\pi }\in pref(\pi )\right\} . \end{aligned}$$

That is, \(Cyl(\hat{\pi })\) is the set of all paths whose common prefix is \(\hat{\pi }\) [2, Def. 10.9].

Definition 3

(\(\sigma\)-algebra of a Markov Chain) The \(\sigma\)-algebra associated with a DTMC \(M_\textsf{DTMC}\), denoted \(\sigma ^{M_\textsf{DTMC}}\), is the smallest \(\sigma\)-algebra that contains all cylinder sets \(Cyl(\hat{\pi })\), where \(\hat{\pi }\) ranges over all finite prefixes of \(M_\textsf{DTMC}\).

From basic concepts of probability and Definition 3, it follows that there exists a unique probability measure \(P^{M_\textsf{DTMC}}\) on the \(\sigma\)-algebra \(\sigma ^{M_\textsf{DTMC}}\) such that (see, [2, Sec. 10.1]):

$$\begin{aligned} P^{M_\textsf{DTMC}}\left( Cyl\left( \hat{\pi }(0),\dots ,\hat{\pi }(t)\right) \right) = \iota \left( \hat{\pi }(0)\right) \cdot P^{M_\textsf{DTMC}}\left( \hat{\pi }(0),\dots ,\hat{\pi }(t)\right) , \end{aligned}$$
(1)

where:

$$\begin{aligned} P^{M_\textsf{DTMC}}\left( \hat{\pi }(0),\dots ,\hat{\pi }(t)\right) := \prod _{\tau =0}^{t-1}T\left( \hat{\pi }(\tau ),\hat{\pi }(\tau +1)\right) , \end{aligned}$$
(2)

while for finite prefixes consisting of a single state (i.e. \(\hat{\pi } = (s)\)), the empty product in Eq. (2) yields \(P^{M_\textsf{DTMC}}(\hat{\pi }) = 1\) [2, Sec. 10.1].
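To make Eqs. (1) and (2) concrete, the following Python sketch computes the probability of a cylinder set from an initial distribution and a transition matrix; the three-state chain is a hypothetical example introduced here purely for illustration.

```python
import numpy as np

# Hypothetical three-state DTMC: states are indexed 0, 1, 2.
iota = np.array([1.0, 0.0, 0.0])        # initial distribution
T = np.array([[0.2, 0.5, 0.3],          # transition matrix (rows sum to one)
              [0.0, 0.6, 0.4],
              [0.1, 0.0, 0.9]])

def prefix_prob(prefix):
    """Probability of a finite prefix given its first state, Eq. (2)."""
    p = 1.0
    for s, s_next in zip(prefix, prefix[1:]):
        p *= T[s, s_next]
    return p                            # empty product = 1 for one-state prefixes

def cylinder_prob(prefix):
    """Probability of the cylinder set spanned by the prefix, Eq. (1)."""
    return iota[prefix[0]] * prefix_prob(prefix)

print(cylinder_prob([0, 1, 2]))         # iota(0) * T(0,1) * T(1,2) = 0.2
```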

Also of interest are the probabilities of a path \(\pi\), respectively a finite prefix \(\hat{\pi }\), conditional on a given initial state \(s\in {\mathcal {S}}\), which are henceforth denoted by \(P^{M_\textsf{DTMC}}_{s}(\pi )\), respectively \(P^{M_\textsf{DTMC}}_{s}(\hat{\pi })\).

Over the probability space \(\langle \Pi ^{M_\textsf{DTMC}}, \sigma ^{M_\textsf{DTMC}}, P^{M_\textsf{DTMC}} \rangle\), we define a family \(\{S_t\}_{t\in \mathbb {N}}\) of categorical stochastic variables \(S_t\) that range over \({\mathcal {S}}\) and describe the temporal behaviour of the Markov chain. In this framework, the memory-less condition mentioned above corresponds to the Markov property, which establishes that \(P^{M_\textsf{DTMC}}(S_{t+1}\mid S_{t},\ldots ,S_{0}) = P^{M_\textsf{DTMC}}(S_{t+1}\mid S_{t})\). Time-homogeneity, on the other hand, corresponds to assuming \(P^{M_\textsf{DTMC}}(S_{t+1}\mid S_t)\) to be the same for all t.

Let us now consider the usual definition of (discrete-time) stochastic process.

Definition 4

(Stochastic process) Given a finite non-empty set of states \({\mathcal {S}}\), a discrete-time stochastic process over \({\mathcal {S}}\), here denoted M, is a family of categorical stochastic variables \(\{S_t\}_{t\in \mathbb {N}}\) ranging over \({\mathcal {S}}\) and defined over a probability space \(\langle \Pi , \sigma (\Pi ), P^{M} \rangle\) such that

  • \(\Pi\) is the set of all paths generated by states in \({\mathcal {S}}\),

  • \(\sigma (\Pi )\) is the cylinder \(\sigma\)-algebra of \(\Pi\),

  • \(P^{M}\) is a probability measure over \(\sigma (\Pi )\).

From Definition 4, it follows that each stochastic process M is uniquely identified by (i.e. it is in one-to-one correspondence with) a probability measure \(P^{M}\). Accordingly, a DTMC with state-space \({\mathcal {S}}\), initial distribution \(\iota\) and transition matrix T can alternatively be regarded as the (discrete-time) stochastic process over \({\mathcal {S}}\) uniquely identified by the probability measure \(P^{M}\) generated by \(\iota\) and T as per Eqs. (1) and (2).

Definition 5

(Labelled DTMC) A labelled DTMC is a DTMC extended with a set of atomic propositions AP and a labelling function \(l: {\mathcal {S}}\rightarrow 2^{AP}\) that assigns to each \(s\in {\mathcal {S}}\) a set of atomic propositions \(l(s)\subseteq AP\) representing elementary facts (or properties) holding in that state.

In the rest of this article, we consider only labelled DTMCs. Unless otherwise specified, we use the term DTMC to refer directly to their labelled extensions.

Inferences in Markov Chains

We next recall two probabilistic inferences that are of central interest in this work, i.e. marginal and hitting probability.

Definition 6

(Marginal probability) Given an event \(B\subseteq {\mathcal {S}}\) and an initial state \(s\in {\mathcal {S}}\), the marginal probability of B with respect to time t conditional on \(S_{0} = s\) is defined as follows:

$$\begin{aligned} P^{M_\textsf{DTMC}}_{s}\left( S_t\in B\right) := \sum _{\hat{\pi }:= \{\hat{\pi }(0),\dots ,\hat{\pi }(t)\}\,\, \mathrm {s.t.}\,\, \hat{\pi }\in \textrm{Paths}_{fin}(s)\,\wedge \, \hat{\pi }(t)\in B} P^{M_\textsf{DTMC}}(\hat{\pi })\,. \end{aligned}$$
(3)

Given an event \(B\subseteq {\mathcal {S}}\) and state \(s\in {\mathcal {S}}\), we are also interested in the (conditional) hitting probability \(h_{B}(s)\), alternatively called reachability probability (see, [2, Sec.10.1.1]), which is the probability of eventually visiting a state \(s'\in B\) starting from s.

Consider the event “the process eventually reaches B” (in the model checking literature, this is usually denoted by \(\Diamond B\)). In order to define \(h_{B}(s)\), we first need to characterise the above event as a measurable set of paths. The latter corresponds to the union of all cylinders \(Cyl(\hat{\pi })\) spanned by finite prefixes \(\hat{\pi }\) originating in s and such that \(\exists \, t\in \mathbb {N}:\, \hat{\pi }(t)\in B\, \wedge \, \forall \tau <t,\, \hat{\pi }(\tau )\not \in B\). Since these sets are pairwise disjoint, the hitting probability of B can be defined as follows [2, Sec.10.1.1.].

Definition 7

(Hitting probability) Given a DTMC \(M_\textsf{DTMC}\), a set of states \(B\subseteq {\mathcal {S}}\), and an initial state \(s\in {\mathcal {S}}\),

$$\begin{aligned} h_{B}(s):= \sum _{ \begin{array}{cc} &{} \hat{\pi }\in \textrm{Paths}_{fin}(s):\, \exists \, t\in \mathbb {N}\, \mathrm {s.t.}\,\\ &{} \hat{\pi }(t)\in B\,\wedge \, \forall \, \tau <t,\, \hat{\pi }(\tau )\not \in B \end{array} } P^{M_\textsf{DTMC}}_{s}\left( Cyl(\hat{\pi })\right) . \end{aligned}$$

Following [22, 24], we compute marginal and hitting probabilities via a transition operator \(\hat{T}\) defined as follows.

Definition 8

(Transition operator) For any real function f of \({\mathcal {S}}\), \(\hat{T}f\) is defined as a function \({\mathcal {S}}\rightarrow \mathbb {R}\) such that

$$\begin{aligned} \forall s\in {\mathcal {S}}\,,\,\, \left( \hat{T} f\right) (s) := \sum _{s'\in {\mathcal {S}}} T\left( s,s'\right) \cdot f\left( s'\right) . \end{aligned}$$
(4)

In practice, the transition operator returns the conditional expectation of f, i.e. \(\hat{T} f(s)=E[f(S_{t+1})\mid S_t = s]\) [24]. The t-step transition operator is obtained by iteration as follows:

$$\begin{aligned} \forall s\in {\mathcal {S}}\,,\,\, \left( \hat{T}^{t}f\right) (s):= \left\{ \begin{array}{ll} \left( \hat{T}f\right) (s) &{} \textrm{if}\,\, t=1\,,\\ \left( \hat{T}\left( \hat{T}^{t-1}f\right) \right) (s) &{} \textrm{if}\,\, t>1, \end{array} \right. \end{aligned}$$
(5)

and the corresponding conditional expectation is \(E[f(S_t) \mid S_0 = s] = (\hat{T}^t f)(s)\). Since the marginal probability equals the expectation of the indicator function \(\mathbb {I}_B(s)\), which returns one if \(s \in B\) and zero otherwise, we can compute it as

$$\begin{aligned} P_{s}\left( S_{t}\in B\right) = \hat{T}^{t}\mathbb {I}_{B}(s). \end{aligned}$$
(6)
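As a computational sketch, each application of \(\hat{T}\) in Eq. (4) is a matrix-vector product, so the marginal probability in Eq. (6) amounts to t such products applied to the indicator of B. The chain below is the same kind of hypothetical example used above.

```python
import numpy as np

T = np.array([[0.2, 0.5, 0.3],          # hypothetical transition matrix
              [0.0, 0.6, 0.4],
              [0.1, 0.0, 0.9]])

def transition_op(f):
    """One application of the transition operator, Eq. (4)."""
    return T @ f

def marginal(B, t):
    """P_s(S_t in B) for every initial state s, Eq. (6)."""
    f = np.isin(np.arange(len(T)), B).astype(float)   # indicator of B
    for _ in range(t):
        f = transition_op(f)
    return f

print(marginal(B=[2], t=3))             # vector of P_s(S_3 in B) for s = 0, 1, 2
```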

The hitting probability \(h_{B}\) is obtained instead by computing the minimal non-negative solution of the following system of linear equations [29, Th. 1.3.2]:

$$\begin{aligned} h_{B} = \mathbb {I}_{B} + \mathbb {I}_{B^{c}}\cdot \hat{T} h_{B}\,, \end{aligned}$$
(7)

where \(B^c\) is the complement of B, and sums and products are intended as element-wise operations on arrays. Standard methods solve Eq. (7) in polynomial time with respect to \(\vert {\mathcal {S}}\vert\) [2, p.749]. Here we consider an alternative procedure that is easier to extend to the imprecise-probabilistic framework (see, Section “Imprecise Markov Models”). Let \(h^{t}_{B}(s)\) be the probability of hitting B from \(s\in {\mathcal {S}}\) within a finite number of time-steps t. For \(t=0\), we trivially have \(h_{B}^{t}(s)=\mathbb {I}_{B}(s)\). For \(t>0\), if \(s\not \in B\), the hitting probability at t is obtained by applying the transition operator to \(h_{B}^{t-1}\), while if \(s\in B\) it is simply set to one. Thus,

$$\begin{aligned} h_{B}^{t} = \mathbb {I}_{B} + \mathbb {I}_{B^{c}} \cdot \hat{T} h_{B}^{t-1}\,. \end{aligned}$$
(8)

The hitting probability \(h_B\) can thus be computed by iterating the schema in Eq. (8) until a fixed point is reached at some time-step \(t^{*}\).

The time complexity of the above iterative computation is polynomial in \(\vert {\mathcal {S}}\vert t^{*}\): each iterative step amounts to a one-step application of the transition operator, i.e. \(\vert {\mathcal {S}} \vert\) weighted sums, and is therefore polynomial in \(\vert {\mathcal {S}} \vert\), while \(t^{*}\) iterations are necessary to reach the fixed point.
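The iterative schema in Eq. (8) can be sketched as follows; the stopping rule uses a numerical tolerance, since in practice the fixed point is approached in the limit. The chain is again a hypothetical example.

```python
import numpy as np

T = np.array([[0.2, 0.5, 0.3],          # hypothetical transition matrix
              [0.0, 0.6, 0.4],
              [0.1, 0.0, 0.9]])

def hitting_prob(B, tol=1e-12):
    """Iterate h^t = I_B + I_{B^c} * (T h^{t-1}), Eq. (8), until convergence."""
    ind_B = np.isin(np.arange(len(T)), B).astype(float)
    h = ind_B.copy()                    # h^0 = I_B
    while True:
        h_new = ind_B + (1.0 - ind_B) * (T @ h)
        if np.max(np.abs(h_new - h)) < tol:
            return h_new
        h = h_new

print(hitting_prob(B=[2]))              # here B is reached almost surely from all states
```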

Markov Reward Models

Among the various extensions of Markov chains, let us consider Markov reward models (MRMs) [2]. An MRM is a pair \(\langle M,rew \rangle\) composed of a Markov chain M with state space \({\mathcal {S}}\) and a reward function \(rew: {\mathcal {S}}\rightarrow \mathbb {N}\) such that rew(s) represents the reward earned when visiting s, for each state \(s\in {\mathcal {S}}\). Given an event \(B\subseteq {\mathcal {S}}\) and a path \(\pi \in \Pi ^{M_\textsf{DTMC}}\), we are interested in the cumulative reward earned along \(\pi\) until visiting an \(s\in B\) for the first time. The latter is defined as follows.

Definition 9

(Cumulative reward) Given an event \(B\subseteq {\mathcal {S}}\) and a path \(\pi \in \Pi ^{M_\textsf{DTMC}}\),

$$\begin{aligned} Rew_B(\pi ):= \left\{ \begin{array}{ll} \sum _{\tau =0}^{t}rew(\pi (\tau ))&{} \textrm{if}\,\, \exists t:\, \pi (t)\in B\wedge \, \forall \tau <t,\, \pi (\tau )\not \in B\\ \sum _{\tau =0}^{\infty } rew(\pi (\tau ))\,\, &{}\textrm{otherwise}\,. \end{array} \right. \end{aligned}$$
(9)

Given the above definition, the expected cumulative reward earned until reaching B starting from \(s\in {\mathcal {S}}\), denoted \(ExpRew_{B}(s)\), is now defined as the expectation of the function \(Rew_{B}\) conditional on the initial state \(s\in {\mathcal {S}}\), i.e. \(ExpRew_{B}(s):= E[Rew_{B} \mid S_0 = s]\) [2, Def. 10.71]. Let us now discuss in more detail how to compute this value. First of all, we need to recall the following result from [2, Sec. 10.5.1]. Given an event \(B \subseteq {\mathcal {S}}\), let \({\mathcal {S}}^{B}_{=1}\) denote the set of all states \(s\in {\mathcal {S}}\) from which it is possible to reach an \(s'\in B\) almost surely, i.e.:

$$\begin{aligned} {\mathcal {S}}^{B}_{=1}:= \left\{ s\in {\mathcal {S}} \mid h_{B}(s) = 1\right\} . \end{aligned}$$

If \(s\not \in {\mathcal {S}}^{B}_{=1}\), then \(ExpRew_{B}(s)\) may not converge to a finite value. Following [2, Sec. 10.5.1], we thus assume for convenience that, by default, \(ExpRew_{B}(s) = \infty\) for all \(s\not \in {\mathcal {S}}^{B}_{=1}\). For all \(s\in {\mathcal {S}}^{B}_{=1}\), the following result holds [2, Sec. 10.5.1].

Proposition 1

The values \(x_s= E[Rew_{B} \mid S_0 = s]\) for each \(s \in {\mathcal {S}}^B_{=1}\) provide the unique solution of the following system of equations:

$$\begin{aligned} x_s = \left\{ \begin{array}{ll} rew(s)&{}\textrm{if} \; s\in B,\\ rew(s) + \sum _{s'\in {\mathcal {S}}^B_{=1}}T(s,s') x_{s'}&{}\textrm{otherwise}. \end{array} \right. \end{aligned}$$
(10)

There exist several methods to solve the linear system in Eq. (10) (see, [2, Sec.10.5.1]). Here, we adopt a recursive schema similar to the one used above for the hitting probability, which is obtained as follows.

For every \(s \in {\mathcal {S}}^B_{=1}\), let \(ExpRew^0_B(s) :=rew(s)\), and, for every \(t \in \mathbb {N}, \; t \ne 0\), let \(ExpRew^{t}_{B}(s)\) be defined as

$$\begin{aligned} ExpRew^t_B(s) :=\left\{ \begin{array}{ll} rew(s) &{} \textrm{if}\,\, s\in B\,,\\ rew(s) + \sum _{s' \in S}T(s,s')ExpRew^{t-1}_B(s') &{}\,\, \textrm{otherwise}. \end{array} \right. \end{aligned}$$
(11)

Notice that the functions \(ExpRew^t_B\) are well-defined since if \(s \in {\mathcal {S}}^B_{=1}\), then \(s' \in {\mathcal {S}}^B_{=1}\) for every \(s'\) such that \(T(s,s')>0\). Each function \(ExpRew^t_B\) can be given a clear interpretation as the expected cumulative reward earned until reaching B from s within a maximum number of time-steps t, as the following result shows.
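A sketch of the recursion in Eq. (11), under the assumption that every state belongs to \({\mathcal {S}}^B_{=1}\) (as is the case for the hypothetical chain and rewards used here):

```python
import numpy as np

T = np.array([[0.2, 0.5, 0.3],          # hypothetical transition matrix
              [0.0, 0.6, 0.4],
              [0.1, 0.0, 0.9]])
rew = np.array([2.0, 1.0, 0.0])         # hypothetical state rewards
B = [2]                                 # target event, reached a.s. from all states

def expected_reward(tol=1e-12):
    """Iterate Eq. (11): ExpRew^t = rew + I_{B^c} * (T ExpRew^{t-1})."""
    ind_Bc = 1.0 - np.isin(np.arange(len(T)), B).astype(float)
    x = rew.copy()                      # ExpRew^0_B = rew
    while True:
        x_new = rew + ind_Bc * (T @ x)
        if np.max(np.abs(x_new - x)) < tol:
            return x_new
        x = x_new

print(expected_reward())                # expected cumulative reward until B, per state
```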

Theorem 2

For every \(t \in \mathbb {N}\), it holds that

$$\begin{aligned} \left( \forall s \in {\mathcal {S}}^B_{=1}\right) \,\, ExpRew^t_B(s)= E\left[ Rew^t_B \mid S_0 =s\right] , \end{aligned}$$
(12)

where for every \(\pi \in \Pi ^{M_\textsf{DTMC}}\), \(Rew^0_B(\pi ) :=rew(\pi (0))\), and for every \(t \in \mathbb {N},\; t\ne 0\),

$$\begin{aligned} Rew^t_B(\pi ) :=\left\{ \begin{array}{ll} Rew_B(\pi ) &{} \textrm{if}\,\, \exists t^* \le t: \left( \forall \tau < t^*\right) \; \pi (\tau ) \notin B, \pi (t^*) \in B, \\ \sum _{\tau =0}^{t} rew(\pi (\tau )) &{}\,\, \textrm{otherwise}. \end{array} \right. \end{aligned}$$
(13)

Thanks to Theorem 2, we can now prove the following result, showing that the recursive scheme introduced above converges to the intended value.

Theorem 3

\(E[Rew_B \mid S_0]\) restricted to \(S^B_{=1}\) is a fixed point of the iterative scheme (11).

As for the hitting probability, we can thus compute \(ExpRew_{B}\) by iterating the schema in Eq. (11) over increasing values of t until convergence.

The last MRM inference we consider is the reward-bounded hitting probability \(h_{B}^{r}(s)\), i.e. the probability of reaching B from s before earning a cumulative reward equal to r.

As for the standard hitting probability, we proceed by defining the event “the process reaches B before earning a cumulative reward equal to r” (usually denoted by \(\Diamond _{\le r}B\)) as a measurable set of paths. Notably, this event corresponds to the union of all cylinder sets \(Cyl(\hat{\pi })\) spanned by finite prefixes \(\hat{\pi }\) originating in s and such that \(\exists t\in \mathbb {N}:\, \hat{\pi }(t)\in B\, \wedge \, \forall \tau <t,\, \hat{\pi }(\tau )\not \in B\, \wedge \, rew(\hat{\pi }(0),\dots ,\hat{\pi }(t))\le r\). Since these are pairwise disjoint sets, the reward-bounded hitting probability of B can be defined as follows [2, Sec. 10.5.1]:

Definition 10

(Reward-bounded hitting probability) For all \(s\in {\mathcal {S}}\):

$$\begin{aligned} h^{r}_{B}(s):= \sum _{ \begin{array}{cc} &{} \hat{\pi }\in \textrm{Paths}_{fin}(s): \exists t\in \mathbb {N}\, \mathrm {s.t.}\,\\ &{} \hat{\pi }(t)\in B\, \wedge \, \forall \tau <t\, \hat{\pi }(\tau )\not \in B\, \wedge \,\\ {} &{} rew\left( \hat{\pi }(0),\dots ,\hat{\pi }(t)\right) \le r \end{array} } P^{M_\textsf{DTMC}}_{s}\left( Cyl(\hat{\pi })\right) \end{aligned}$$

To introduce a method to compute \(h_{B}^{r}(s)\), let us first recall the following result from [2, Sec. 10.5.1]. Let \(h_{B}^{\rho }\) denote the vector of reward-bounded hitting probabilities for a reward-threshold \(\rho = 0,\dots ,r\). Let \({\mathcal {S}}^{B}_{>0}\) be the set of all states \(s\in {\mathcal {S}}\) such that \(h_{B}(s)>0\), i.e. the set of all states \(s\in {\mathcal {S}}\) from which there exists at least one path \(\pi\) originating in s and reaching B at some time t.

Proposition 4

For each \(s\in {\mathcal {S}}\), the value of \(h_{B}^{\rho }(s)\) is given by the following system of equations:

$$\begin{aligned} h_{B}^{\rho }(s) = \left\{ \begin{array}{ll} 1 &{} \textrm{if}\,\, s\in B\,\, \textrm{and}\,\, rew(s)\le \rho \\ 0 &{}\textrm{if}\,\, rew(s)> \rho \,\, \textrm{or}\,\, s\not \in {\mathcal {S}}^{B}_{>0}\\ \sum _{s'\in {\mathcal {S}}}T(s,s') h_{B}^{\rho -rew(s)}(s') &{} \textrm{otherwise}. \end{array} \right. \end{aligned}$$
(14)

Trivially, \(h^{\rho }_{B}(s) = 1\) whenever s is already in B and its state-reward does not exceed the desired threshold \(\rho\). Conversely, if \(s\in B\) but \(rew(s)> \rho\), then the cumulative reward earned by the agent already exceeds the specified threshold and, thus, \(h^{\rho }_{B}(s) = 0\). The same also holds when \(s\not \in {\mathcal {S}}^{B}_{>0}\) as, in this case, we already know that \(h_{B}(s)\) is equal to zero and consequently also \(h^{\rho }_{B}(s) = 0\). In all the remaining cases, \(h^{\rho }_{B}(s)\) is computed by iteration over the possible successors of s. That is, we take the sum over all \(s'\in {\mathcal {S}}\) of the probability of reaching \(s'\) from s multiplied by the probability of reaching B from \(s'\) before earning a cumulative reward equal to \(\rho -rew(s)\). Remember that every time the agent moves from a state s to one of its successors \(s'\in {\mathcal {S}}\), it earns the reward of s. Consequently, the threshold \(\rho\) has to be reduced by rew(s) at each further iteration from a state s to its successors \(s'\).

Notice that the system in the above proposition is in fact a linear system with variables indexed by pairs \((s,\rho )\) ranging in \({\mathcal {S}}\times \{0,1,\dots ,r\}\). As in the case of standard hitting (see, Eq. (7)), we can solve the system by standard methods [2, Sec. 10.5.1]. Here, we follow a different strategy based on a recursive schema that iterates over both time and reward. Let \(h^{t,\rho }_{B}(s)\) denote the probability of hitting B from s before earning cumulative reward \(\rho\) and within time-step t. For each \(t\in \mathbb {N}\), the values of \(h^{t,\rho }_{B}\) computed for \(\rho = 0,\dots ,r\) are collected in a \(\vert {\mathcal {S}}\vert \times (r+1)\) matrix that we denote by \(\textbf{h}^{t,\rho _{0:r}}_{B}\).

For \(t=0\), we generate \(\textbf{h}^{t=0,\rho _{0:r}}_{B}\) by computing the vectors \(h^{t=0,\rho }_{B}\) for \(\rho = 0,\dots ,r\) and for each \(s\in {\mathcal {S}}\) as follows:

$$\begin{aligned} h^{t=0,\rho }_{B}(s) = \left\{ \begin{array}{ c l } &{} 1\,\, \textrm{if}\,\, s\in B\,\, \textrm{and}\,\, rew(s) \le \rho \\ &{} 0\,\, \textrm{otherwise}. \end{array} \right. \end{aligned}$$
(15)

Consider that when \(t=0\) no transition occurs. Hence, for each \(s\in {\mathcal {S}}\), \(h^{t=0,\rho }_{B}(s)\) equals one if s belongs to the hitting event B and its reward rew(s) does not exceed the specified threshold \(\rho\); otherwise \(h^{t=0,\rho }_{B}(s)\) is zero.

For \(t>0\), \(\textbf{h}^{t,\rho _{0:r}}_{B}\) is generated by computing the various vectors \(h^{t,\rho }_{B}\) for \(\rho = 0,1,\dots ,r\) via the following recursive schema applied to each \(s\in {\mathcal {S}}\):

$$\begin{aligned} h^{t,\rho }_{B}(s) = \left\{ \begin{array}{ l l } 1 &{} \textrm{if}\,\, s\in B\,\, \textrm{and}\,\, rew(s) \le \rho \\ 0 &{} \textrm{if}\,\, h_B(s)=0\,\, \textrm{or}\,\, rew(s) > \rho \\ \sum _{s'\in {\mathcal {S}}}T\left( s,s'\right) h^{t-1,\rho -rew(s)}_{B}\left( s'\right) &{} \textrm{otherwise}. \end{array} \right. \end{aligned}$$
(16)

The first two cases follow from considerations analogous to those leading to Eq. (15). In the third case, we obtain the value of \(h^{t,\rho }_{B}(s)\) from \(h^{t-1,\rho -rew(s)}_{B}\) via a one-step application of the transition operator. As the reward is cumulative, the threshold \(\rho\) is decreased by the reward of the current state. The values of \(h^{t-1,\rho -rew(s)}_{B}\) are provided by the matrix \(\textbf{h}^{t-1,\rho _{0:r}}_{B}\) that we generate at time-step \(t-1\). Specifically, \(h^{t-1,\rho -rew(s)}_{B}(s')\) corresponds to the cell identified by the \(s'\)-row and the \((\rho -rew(s))\)-column of the matrix \(\textbf{h}^{t-1,\rho _{0:r}}_{B}\). The procedure converges in a finite number of time-steps, as proved by the following theorem.
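The double recursion over time and reward in Eqs. (15) and (16) can be sketched as follows; the chain, rewards, and threshold are hypothetical, and the unbounded hitting probabilities \(h_B\) are assumed to be available (e.g. from the iteration of Eq. (8)).

```python
import numpy as np

T = np.array([[0.2, 0.5, 0.3],          # hypothetical transition matrix
              [0.0, 0.6, 0.4],
              [0.1, 0.0, 0.9]])
rew = np.array([1, 1, 0])               # hypothetical state rewards
B, r = [2], 3                           # target event and reward threshold

def reward_bounded_hitting(h_B, tol=1e-12):
    """Iterate Eq. (16) on the |S| x (r+1) matrix h[s, rho], Eq. (15) as base."""
    n = len(T)
    in_B = np.isin(np.arange(n), B)
    h = np.zeros((n, r + 1))
    for rho in range(r + 1):            # base case, Eq. (15)
        h[:, rho] = (in_B & (rew <= rho)).astype(float)
    while True:
        h_new = np.zeros_like(h)
        for s in range(n):
            for rho in range(r + 1):
                if in_B[s] and rew[s] <= rho:
                    h_new[s, rho] = 1.0
                elif h_B[s] == 0.0 or rew[s] > rho:
                    h_new[s, rho] = 0.0
                else:                   # one-step application of the operator
                    h_new[s, rho] = T[s] @ h[:, rho - rew[s]]
        if np.max(np.abs(h_new - h)) < tol:
            return h_new[:, r]          # column r holds h^r_B
        h = h_new

print(reward_bounded_hitting(h_B=np.ones(3)))   # here h_B = 1 for all states
```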

Theorem 5

Let \(\langle M,rew \rangle\) be an MRM, \(B \subseteq {\mathcal {S}}\), and \(r \in \mathbb {N}\). There is a \(t^{*}\in \mathbb {N}\) such that for all \(\tau \ge 0\):

$$\begin{aligned} \textbf{h}^{t^{*}+\tau ,\rho _{0:r}}_{B} =\textbf{h}^{t^{*},\rho _{0:r}}_{B}. \end{aligned}$$
(17)

As for standard hitting, we can, therefore, compute \(h^{r}_{B}\) simply by iterating \(\textbf{h}^{t,\rho _{0:r}}_{B}\) over increasing values of t until convergence.

Probabilistic Interpreted Systems

Probabilistic Interpreted Systems (PISs) [18] are another extension of Markov chains we consider. PISs are a class of semantic frames used in computational logic for modelling epistemic and probabilistic properties of stochastic multi-agent systems. Consider a finite non-empty set of agents \(\mathcal {A}\). The possible configurations of each agent \(a\in \mathcal {A}\) are described by a finite non-empty set of local states \({\mathcal {S}}^{a}\). The set of global states \({\mathcal {S}}\) describing the possible configurations of the whole multi-agent system is obtained as the Cartesian product \({\mathcal {S}}:= \times _{a\in \mathcal {A}} {\mathcal {S}}^{a}\). Accordingly, each \(s\in {\mathcal {S}}\) is a tuple \(\langle s^{a_1},s^{a_2},\dots ,s^{a_n} \rangle\) of \(\vert \mathcal {A} \vert\) local states. Hence, formally, we have the following definition.

Definition 11

A PIS is defined as a tuple:

$$\begin{aligned} M_\textsf{PIS}:= \left\langle \mathcal {A},{\mathcal {S}},\{T^{a}\}_{a\in \mathcal {A}}, \{P^{a}\}_{a\in \mathcal {A}}, AP, l \right\rangle , \end{aligned}$$
(18)

where:

  • \(\mathcal {A}\) is a finite non-empty set of agents;

  • \({\mathcal {S}}\) is a finite non-empty set of global states;

  • \(\{T^{a}\}_{a\in \mathcal {A}}\) is a family of transition matrices \(T^{a}: {\mathcal {S}}\times {\mathcal {S}}\rightarrow [0,1]\);

  • \(\{P^{a}\}_{a\in \mathcal {A}}\) is a family of initial probability distributions \(P^{a}: {\mathcal {S}}\mapsto [0,1]\);

  • AP is a set of atomic propositions;

  • \(l: {\mathcal {S}}\rightarrow 2^{AP}\) is the labelling function.

For each agent \(a\in \mathcal {A}\), we also introduce an epistemic equivalence relation \(\sim ^{a}\subseteq {\mathcal {S}}\times {\mathcal {S}}\) such that

$$\begin{aligned} s\sim ^{a}s'\,\, \textrm{iff}\,\, s^{a} = s'^{a}. \end{aligned}$$

The latter denotes that two global states \(s,s'\) are epistemically indistinguishable by an agent a if and only if the local state of a is the same in both. The equivalence relation \(\sim ^{a}\) induces a partition \(Eq^{\sim ^{a}}\) over \({\mathcal {S}}\). The elements of this partition are denoted by \(eq^{\sim ^{a}}\), and are called epistemic equivalence classes (EEC). They consist of sets of global states that are mutually epistemically indistinguishable by a. Specific equivalence relations can also be defined to model different kinds of multi-agent knowledge in a group of agents \(\Gamma \subseteq \mathcal {A}\), including:

  • Everybody Knows \(\sim ^{\Gamma }_{E}:= \bigcup _{a\in \Gamma }\sim ^{a}\);

  • Common Knowledge \(\sim ^{\Gamma }_{C}:= it(\bigcup _{a\in \Gamma }\sim ^{a})\), where it denotes the iterative closure;

  • Distributed Knowledge \(\sim ^{\Gamma }_{D}:= \bigcap _{a\in \Gamma }\sim ^{a}\).

Each relation induces a partition whose elements are the respective EECs \(eq^{\sim ^{\Gamma }_{E}}\), \(eq^{\sim ^{\Gamma }_{C}}\), and \(eq^{\sim ^{\Gamma }_{D}}\) for groups of agents \(\Gamma \subseteq \mathcal {A}\).

From the transition matrices associated with each agent, a global transition matrix \(T_\textsf{PIS}\) describing the stochastic behaviour of the whole multi-agent system can be obtained by logarithmic pooling as follows:

$$\begin{aligned} T_\textsf{PIS}\left( s,s'\right) := \eta \prod _{a\in \mathcal {A}}T^{a}\left( s,s'\right) , \end{aligned}$$
(19)

where \(\eta\) is a normalising constant.

A global initial probability distribution \(P_\textsf{PIS}\) is similarly defined:

$$\begin{aligned} P_\textsf{PIS}(s):= \eta '\prod _{a\in \mathcal {A}}P^{a}(s)\,. \end{aligned}$$
(20)

Furthermore, the global transition matrix \(T_\textsf{PIS}\) and the global initial probability distribution \(P_\textsf{PIS}\) identify a particular MC \(\langle {\mathcal {S}},P_\textsf{PIS},T_\textsf{PIS},AP,l \rangle\) called the embedded MC of the PIS. The latter is used to compute probabilistic inferences concerning the overall stochastic behaviour of the multi-agent system.
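For illustration, the logarithmic pooling of Eq. (19) reduces to an element-wise product of the agents' transition matrices followed by a row normalisation. A minimal sketch with two hypothetical agents over two global states:

```python
import numpy as np

# Hypothetical transition matrices of two agents over the same global states.
T_a = np.array([[0.5, 0.5], [0.2, 0.8]])
T_b = np.array([[0.6, 0.4], [0.3, 0.7]])

def log_pool(matrices):
    """Row-wise logarithmic pooling, Eq. (19): element-wise product of the
    agents' matrices, renormalised so that each row sums to one."""
    prod = np.ones_like(matrices[0])
    for T in matrices:
        prod = prod * T
    return prod / prod.sum(axis=1, keepdims=True)   # eta is the row normaliser

T_pis = log_pool([T_a, T_b])
print(T_pis)                            # global transition matrix of the embedded MC
```

The global initial distribution of Eq. (20) is obtained in the same way, by normalising the element-wise product of the agents' initial distributions.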

A pair \(\langle M_\textsf{PIS},rew \rangle\) of a PIS \(M_\textsf{PIS}\) and a reward function \(rew: {\mathcal {S}}\mapsto \mathbb {N}\), is called a Probabilistic Interpreted Reward System (PIRS).

Imprecise Markov Models

In this section, we provide imprecise-probabilistic counterparts for the Markov models presented in the previous section. This basically corresponds to replacing the (sharp) specifications of the probabilistic parameters with set-valued ones. We also show how the efficient inference algorithms described in the previous section can be extended to this generalised setup, thus allowing for the computation of bounds with respect to the set-valued specification without increased computational costs. These results partially rely on recent works about imprecise Markov models [24, 27]. Note that the imprecise Markov models we consider follow the so-called measure-theoretic interpretation [24] and rely on the formalism of credal sets [30]. The alternative game-theoretic formalisation [31] is briefly mentioned without going into details, insofar as the two formalisms are equivalent for the inference tasks relevant to this work, as proved in [32].

Imprecise Transition Matrices

Given a variable S, a Credal Set (CS) K(S) is a set of probability mass functions over S. The upper expectation of a real-valued function f of S with respect to a CS K(S) is defined as \(\overline{E}[f(S)]:= \sup _{P(S)\in K(S)} \sum _{s\in {\mathcal {S}}}f(s)\cdot P(s)\) (the lower expectation \(\underline{E}[f(S)]\) is analogously defined). Here we only consider closed and convex CSs induced by a finite number of linear constraints. These are polytopes in the probability simplex with a finite number of extreme points, collected in a set Ext[K(S)]. For these CSs, upper (lower) expectations can be equivalently obtained by taking the maximum (minimum) of the precise expectations computed on the extreme points. Conditional CSs can be defined analogously [33].

In this framework, an imprecise transition matrix \(\mathcal {T}\) is defined as a collection of conditional CSs \(\{K(S'\vert {s})\}_{s\in {\mathcal {S}}}\), each one representing a separately specified row of the matrix. This allows for defining precise transition matrices whose rows are obtained by taking a \(P(S'\vert {s}) \in K(S'\vert {s})\) for each \(s\in {\mathcal {S}}\). Each one of these matrices represents a stochastic behaviour compatible with the “imprecise” specification given by \(\mathcal {T}\).

Imprecise Markov Chains

As a first example of imprecise Markov model, we consider (discrete-time) imprecise Markov chains (IMCs), thus providing an imprecise-probabilistic version of the models introduced in Section “Markov Chains”. Note that there exist two main ways of formalising IMCs in the literature. On the one hand, the measure-theoretic characterisation defines an IMC as a family of (discrete-time) Markov models compatible with beliefs about initial and transition probabilities. On the other hand, the game-theoretic characterisation is grounded on the game-theoretic view of probability popularised in [31] that, applied to the theory of stochastic processes, directly leads to imprecise models. The two characterisations are different but have recently been proved to coincide for all expectations on the following domains: (i) monotone pointwise limits of finitary real-valued functions, and (ii) bounded below Borel-measurable variables [32]. In this work we focus on measure-theoretic IMCs only. However, all the inferences we consider fall under (i) and thus, for the purposes of our work, the two characterisations can be considered equivalent.

Given a CS \(K(S_0)\) and an imprecise transition matrix \(\mathcal {T}\), both defined over \({\mathcal {S}}\), the (discrete-time) IMC \(\mathcal {M}\) induced by \(K(S_0)\) and \(\mathcal {T}\) can be defined as the largest set of (discrete-time) stochastic processes that are compatible with \(K(S_0)\) and \(\mathcal {T}\).

The term “compatible” here deserves an exact characterisation. In the literature, there exist indeed at least two different criteria for establishing compatibility, which depend on the imprecise interpretation of the notion of stochastic irrelevance and, consequently, of the Markov property one considers [36]. The two notions of irrelevance typically involved for IMCs are strong independence and epistemic irrelevance. The former is defined via product-independence of the CS extreme points: we say that K(S) and \(K(S')\) are strongly independent if and only if, for all \(P(S)\in Ext[K(S)]\) and all \(P(S')\in Ext[K(S')]\), it holds that \(P(S,S') = P(S)\cdot P(S')\). The latter is defined via conditioning: we say that K(S) is epistemically irrelevant for \(K(S')\) if and only if \(K(S' \mid s) = K(S')\) for each \(s\in {\mathcal {S}}\). Notice that, unlike strong independence, epistemic irrelevance is asymmetric, i.e. the irrelevance of K(S) for \(K(S')\) does not entail the irrelevance of \(K(S')\) for K(S).

Following [27, 38], and also the early work of [22], in this paper we focus on epistemic irrelevance. Notably, by exploiting the results in [30], the imprecise-probabilistic inferences considered in the rest of this paper can easily be proved to be independent of the specific characterisation we adopt. Epistemic irrelevance leads to an imprecise characterisation of the Markov property, practically corresponding to assuming that “whenever the agent knows the current state, then her beliefs about future states are not altered upon learning what states were visited in the past” [36, p. 265]. A formal definition of IMC can thus be given as follows:

Definition 12

(Imprecise Markov chain under epistemic irrelevance) Given \(K(S_0)\) and \(\mathcal {T}\), an IMC \(\mathcal {M}\) (under epistemic irrelevance) is defined as the largest set of all, potentially non-Markov, non-homogeneous, stochastic processes for which, for all \(t\in \mathbb {N}\) and all \(s_0,\dots ,s_t\in {\mathcal {S}}\), there is some \(T\in \mathcal {T}\) such that \(P(S_{t+1} = s'\mid S_{0:t} = s_{0:t})\, =\, T(s_t,s')\) for all \(s'\in {\mathcal {S}}\).

Furthermore, each IMC is uniquely identified by a set of probability measures \(P^{M}: \sigma (\Pi )\rightarrow [0,1]\) that we denote by \(K^{\mathcal {M}}\). Each \(P^{M}\in K^{\mathcal {M}}\) uniquely identifies a (potentially non-Markov and non-homogeneous) stochastic process compatible with the IMC. The IMC is also uniquely identified by \(K^{\mathcal {M}}_{s}\), the credal set including all the conditional probability distributions \(P^{M}_{s}\) obtained by conditioning the various \(P^{M}\in K^{\mathcal {M}}\) on a given initial state \(s\in {\mathcal {S}}\). As detailed in the next section, inferences in IMCs are consequently intended as the computation of lower and upper expectations with respect to such credal sets.

Before moving on, notice that, as in the precise case, we are interested here in labelled IMCs, i.e. IMCs augmented with a finite set of atomic propositions AP and a labelling function \(l: {\mathcal {S}}\rightarrow 2^{AP}\). In what follows, when using the term IMC, we always refer to labelled IMCs.

Inference in Imprecise Markov Chains

To compute inferences in IMCs, let us first introduce the analogue of the transition operator in Eq. (4). This is obtained by taking bounds with respect to all the possible (precise) specifications of transition probabilities consistent with the imprecise transition matrix of the IMC. For upper bounds, this corresponds to the following non-linear upper operator:

$$\begin{aligned} \left( \overline{\mathcal {T}}f\right) (s):= \max _{T\left( s,S'\right) \in \mathcal {T}\left( s,S'\right) } \sum _{s'\in {\mathcal {S}}} T\left( s,s'\right) \cdot f\left( s'\right) , \end{aligned}$$
(21)

while an analogous definition, with the minimum replacing the maximum, holds for the lower operator \(\underline{\mathcal {T}}\) [27, Eq. 1]. Equation (21) can be computed by solving \(\vert {\mathcal {S}}\vert\) linear programming tasks whose feasible regions are the conditional CSs in the definition of \(\mathcal {T}\). This is possible, in particular, because we assume (Section  “Imprecise Transition Matrices”) that each row of \(\mathcal {T}\) is separately specified and consists of a conditional CS \(K(S'\mid s)\) described by a finite number of linear constraints.
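A sketch of Eq. (21) for the special case in which each row CS is specified by probability intervals; the bounds below are hypothetical, and each evaluation of the operator solves one linear program per state via scipy.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical interval-valued imprecise transition matrix over three states:
# row CS K(S'|s) = {p : lower[s] <= p <= upper[s], sum(p) = 1}.
lower = np.array([[0.1, 0.3, 0.2],
                  [0.0, 0.5, 0.3],
                  [0.0, 0.0, 0.8]])
upper = np.array([[0.4, 0.6, 0.5],
                  [0.2, 0.7, 0.5],
                  [0.2, 0.1, 1.0]])

def upper_transition_op(f):
    """Upper transition operator of Eq. (21): maximise E[f] over each row CS."""
    n = len(f)
    result = np.empty(n)
    for s in range(n):
        # linprog minimises, so minimise -f to obtain the maximum.
        res = linprog(c=-f, A_eq=np.ones((1, n)), b_eq=[1.0],
                      bounds=list(zip(lower[s], upper[s])))
        result[s] = -res.fun
    return result

print(upper_transition_op(np.array([0.0, 0.0, 1.0])))  # upper P(S' = 2 | s) per s
```

The lower operator is obtained by minimising instead of maximising, i.e. by passing `c=f` and returning `res.fun` directly.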

An iterated application of the above operators can be used to compute the bounds of the probability of reaching a given set of states after a number of time-steps t, as shown by the following result.

Theorem 6

Given an event \(B\subseteq {\mathcal {S}}\) and a time \(t\in \mathbb {N}\), let \(\overline{P}_s(S_t\in B)\) denote the upper bound of the probability of reaching B after t time-steps when starting from s. It holds that:

$$\begin{aligned} \overline{P}_{s}\left( S_t\in B\right) = \overline{\mathcal {T}}^{t}\mathbb {I}_{B}(s). \end{aligned}$$
(22)

A similar result allows us to obtain the lower probability by means of the lower operator. For the sake of conciseness, in the rest of the paper, we only report the results for upper probabilities, expectations and transition operators. The lower bounds can always be obtained by replacing the upper transition operator with its lower analogue.

The upper hitting probability, in turn, can be regarded as the upper bound of the \(h_{B}\) defined in Eq. (7) with respect to the set \(K^{\mathcal {M}}\), as detailed in the following definition.

Definition 13

(Upper hitting probability) Given an IMC \(\mathcal {M}\), a set of states \(B\subseteq {\mathcal {S}}\), and an initial state \(s\in {\mathcal {S}}\):

$$\begin{aligned} \overline{h}_{B}(s):= \max _{P_{s}^{M}\in K^{\mathcal {M}}} \sum _{ \begin{array}{cc} &{} \hat{\pi }\in \textrm{Paths}_{fin}(s):\, \exists \, t\in \mathbb {N}\, \mathrm {s.t.}\,\\ &{} \hat{\pi }(t)\in B\,\wedge \, \forall \, \tau <t,\, \hat{\pi }(\tau )\not \in B \end{array} } P^{M}_{s}\left( Cyl(\hat{\pi })\right) \end{aligned}$$

The latter can be computed as the minimal non-negative solution of the following system of equations [24, Corollary 19] (see also [27]).

$$\begin{aligned} \overline{h}_{B}= \mathbb {I}_{B} + \mathbb {I}_{B^{c}} \overline{\mathcal {T}} \,\overline{h}_{B}. \end{aligned}$$
(23)

Unlike in the precise case, the system in Eq. (23) is non-linear and cannot be solved by standard linear methods. Nevertheless, as we show below, it is possible to apply a schema analogous to that in Eq. (8) and compute \(\overline{h}_{B}\) by recursion over increasing values of t (see, [24]). Let \(\overline{h}_{B}^{t}(s)\) denote the upper probability of hitting B within a finite number of time-steps \(t\in \mathbb {N}\) conditional on \(S_0 = s\). For \(t=0\), we trivially have that \(\overline{h}^{t=0}_{B}=\mathbb {I}_{B}\). For \(t>0\), we have instead the following recursion:

$$\begin{aligned} \overline{h}_{B}^{t}= \mathbb {I}_{B} + \mathbb {I}_{B^{c}} \overline{\mathcal {T}}\, \overline{h}_{B}^{t-1}. \end{aligned}$$
(24)

In practice, the procedure consists of t iterated applications of the transition operator \(\overline{\mathcal {T}}\) and, consequently, requires the solution of \(\vert {\mathcal {S}} \vert \cdot t\) linear programming tasks. The time complexity of the procedure is, therefore, polynomial in \(\vert {\mathcal {S}} \vert \cdot t\), exactly as in the precise case.

As for standard DTMCs, it can be proved that the least fixed point of Eq. (24) is the minimal non-negative solution of the system in Eq. (23) (see, [24, Prop. 16]). We can thus compute \(\overline{h}_{B}\) simply by iterating the schema in Eq. (24) over increasing values of t until convergence. Each iteration step consists of a one-step application of the upper transition operator \(\overline{\mathcal {T}}\) and requires the solution of \(\vert {\mathcal {S}} \vert\) linear optimisation tasks, so its time complexity is polynomial in \(\vert {\mathcal {S}} \vert\). As \(t^{*}\) iterations are necessary to reach convergence, the overall time complexity is polynomial in \(\vert {\mathcal {S}} \vert t^{*}\), as in the precise case.
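Combining the upper operator with the recursion in Eq. (24) gives the following sketch, reusing the hypothetical interval bounds of the previous snippet; the stopping rule again uses a numerical tolerance.

```python
import numpy as np
from scipy.optimize import linprog

lower = np.array([[0.1, 0.3, 0.2],      # hypothetical probability intervals,
                  [0.0, 0.5, 0.3],      # as in the previous sketch
                  [0.0, 0.0, 0.8]])
upper = np.array([[0.4, 0.6, 0.5],
                  [0.2, 0.7, 0.5],
                  [0.2, 0.1, 1.0]])

def upper_T(f):
    """Upper transition operator of Eq. (21): one linear program per state."""
    return np.array([-linprog(c=-f, A_eq=np.ones((1, len(f))), b_eq=[1.0],
                              bounds=list(zip(lower[s], upper[s]))).fun
                     for s in range(len(f))])

def upper_hitting(B, tol=1e-9):
    """Iterate Eq. (24) until convergence to the upper hitting probability."""
    ind_B = np.isin(np.arange(len(lower)), B).astype(float)
    h = ind_B.copy()                    # h^0 = I_B
    while True:
        h_new = ind_B + (1.0 - ind_B) * upper_T(h)
        if np.max(np.abs(h_new - h)) < tol:
            return h_new
        h = h_new

print(upper_hitting(B=[2]))
```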

Imprecise Markov Reward Models

Imprecise Markov reward models (IMRMs) are the first extension of IMCs we consider. An IMRM can be defined as the imprecise counterpart of an MRM and consists of a pair \(\langle \mathcal {M},rew\rangle\) of an IMC \(\mathcal {M}\) and a reward function \(rew: {\mathcal {S}}\rightarrow \mathbb {R}\). For IMRMs, we characterise the expected cumulative reward \(ExpRew_{B}\) by its upper and lower bounds, respectively denoted \(\overline{Exp}Rew_{B}\) and \(\underline{Exp}Rew_{B}\). As in the precise case, we restrict the latter to only \(s\in {\mathcal {S}}^{B}_{=1}\), where \({\mathcal {S}}^{B}_{=1}\) is now defined as the set of all \(s\in {\mathcal {S}}\) such that \(\underline{h}_{B}(s)=1\).

Given an event \(B\subseteq {\mathcal {S}}\) and a path \(\pi \in \Pi ^{M_\textsf{DTIMC}}\), let us consider the cumulative reward \(Rew_B(\pi )\) earned along \(\pi\) until visiting an \(s\in B\) for the first time, as in Definition 9. The upper expected cumulative reward \(\overline{Exp}Rew_{B}(s)\) earned until reaching B starting from \(s\in {\mathcal {S}}\) can be defined as the upper expectation of \(Rew_B\) conditional on the initial state \(s\in {\mathcal {S}}\), i.e. \(\overline{Exp}Rew_{B}(s) :=\overline{E}[Rew_B \mid S_0 = s]\).

Inspired by Theorem 3, valid for the precise case, we introduce an imprecise version of the recursive scheme presented in Eq. (11). To this end, let \(\overline{Exp}Rew^0_B(s) :=rew(s)\) for every \(s \in {\mathcal {S}}^B_{=1}\). For each \(t \in \mathbb {N}, \; t \ne 0\), let \(\overline{Exp}Rew^t_B(s)\) be defined as follows:

$$\begin{aligned} \overline{Exp}Rew^t_B(s) :=\left\{ \begin{array}{ll} rew(s) &{} \textrm{if}\,\, s\in B\,,\\ rew(s) + \left( \overline{\mathcal {T}}\,\overline{Exp}Rew^{t-1}_B\right) (s) &{}\,\, \textrm{otherwise}. \end{array} \right. \end{aligned}$$
(25)

Similarly to the precise case, \(\overline{Exp}Rew^t_B\) can be given a clear interpretation via the following theorem.

Theorem 7

For every \(t \in \mathbb {N}\), it holds that

$$\begin{aligned} (\forall s \in S^B_{=1})\,\, \overline{Exp}Rew^t_B(s)= \overline{E}\left[ Rew^t_B \mid S_0=s\right] , \end{aligned}$$
(26)

where for each \(\pi \in \Pi ^{M_\textsf{DTIMC}}\), \(Rew^0_B(\pi ) :=rew(\pi (0))\), and for each \(t \in \mathbb {N},\; t\ne 0\),

$$\begin{aligned} Rew^t_B(\pi ) :=\left\{ \begin{array}{ll} Rew_B(\pi ) &{} \textrm{if}\,\, \exists t^* \le t: \left( \forall \tau < t^*\right) \; \pi (\tau ) \notin B, \pi \left( t^*\right) \in B, \\ \sum _{\tau =0}^{t} rew(\pi (\tau )) &{}\,\, \textrm{otherwise}. \end{array} \right. \end{aligned}$$
(27)

By exploiting Theorem 7, we can now prove the following result, showing that the recursive schema introduced above converges to the intended value.

Theorem 8

\(\overline{E}[Rew_B \mid S_0]\) restricted to \(S^B_{=1}\) is a fixed point of the iterative scheme (25).

To conclude, let us focus on the imprecise counterpart of the reward-bounded hitting probability \(h^{r}_{B}\) and its upper bound \(\overline{h}^{r}_{B}\), which we define as follows.

Definition 14

(Upper reward-bounded hitting probability) Given an IMRM \(\langle \mathcal {M},rew \rangle\), a set of states \(B\subseteq {\mathcal {S}}\), a reward-threshold r, and an initial state \(s\in {\mathcal {S}}\):

$$\begin{aligned} \overline{h}^{r}_{B}(s):= \max _{P_{s}^{M}\in K^{\mathcal {M}}} \sum _{ \begin{array}{cc} &{} \hat{\pi }\in \textrm{Paths}_{fin}(s): \exists t\in \mathbb {N}\, \mathrm {s.t.}\,\\ &{} \hat{\pi }(t)\in B\, \wedge \, \forall \tau <t\, \hat{\pi }(\tau )\not \in B\, \wedge \,\\ {} &{} rew\left( \hat{\pi }(0),\dots ,\hat{\pi }(t)\right) \le r \end{array} } P^{M}_{s}\left( Cyl(\hat{\pi })\right) . \end{aligned}$$

Similarly to the precise case (see Proposition 4), the values of \(\overline{h}^{\rho }_{B}(s)\) for all \(s\in {\mathcal {S}}\) and \(\rho = 0,1,\dots ,r\) provide a solution to the following system of equations:

$$\begin{aligned} \overline{h}_{B}^{\rho }(s) = \left\{ \begin{array}{ll} 1 &{} \textrm{if}\,\, s\in B\,\, \textrm{and}\,\, rew(s)\le \rho \\ 0 &{} \textrm{if}\,\, rew(s)> \rho \,\, \textrm{or}\,\, s\not \in {\mathcal {S}}^{B}_{>0}\\ \max _{T\left( s,S'\right) \in \mathcal {T}\left( s,S'\right) } \sum _{s'\in {\mathcal {S}}}T\left( s,s'\right) \overline{h}_{B}^{\rho -rew(s)}\left( s'\right) &{} \textrm{otherwise}\,. \end{array} \right. \end{aligned}$$
(28)

To compute \(\overline{h}^{r}_{B}\), we can, therefore, use a recursive schema analogous to that presented in “Markov Reward Models”. First, we define a matrix \(\overline{\textbf{h}}^{t,\rho _{0:r}}_{B}\) whose cells are the values of \(\overline{h}^{t,\rho }_{B}(s)\) computed for each \(s\in {\mathcal {S}}\) and for \(\rho = 0,\dots ,r\).

For \(t=0\), we generate \(\overline{\textbf{h}}^{t,\rho _{0:r}}_{B}\) by computing the vectors \(\overline{h}^{t=0,\rho }_{B}\) for \(\rho = 0,\dots ,r\) as in Eq. (15).

For increasing values of t, \(\overline{\textbf{h}}^{t,\rho _{0:r}}_{B}\) is generated by computing the vectors \(\overline{h}^{t,\rho }_{B}\) for \(\rho = 0,\dots ,r\) as follows:

$$\begin{aligned} \overline{h}^{t,\rho }_{B}(s) = \left\{ \begin{array}{ll} 1 &{} \textrm{if}\,\, s\in B\,\, \textrm{and}\,\, rew(s) \le \rho \\ 0 &{} \textrm{if}\,\, rew(s)> \rho \,\, \textrm{or}\,\, s\not \in {\mathcal {S}}^{B}_{>0}\\ \max _{T\left( s,S'\right) \in \mathcal {T}\left( s,S'\right) } \sum _{s'\in {\mathcal {S}}} T\left( s,s'\right) \overline{h}_{B}^{t-1,\rho -rew(s)}\left( s'\right) &{} \textrm{otherwise}. \end{array} \right. \end{aligned}$$
(29)

As in the precise case, the values of \(\overline{h}^{t-1,\rho -rew(s)}_{B}(s')\) for all \(s'\in {\mathcal {S}}\) are provided by the matrix \(\overline{\textbf{h}}^{t-1,\rho _{0:r}}_{B}\) that we generate at time-step \(t-1\). The recursion is based on iterated applications of the upper transition operator \(\overline{\mathcal {T}}\), each one requiring the solution of \(\vert {\mathcal {S}} \vert\) linear programming tasks.

To compute \(\overline{h}^{r}_{B}(s)\) for all \(s\in {\mathcal {S}}\), we proceed as in the precise case, that is, we iterate \(\overline{\textbf{h}}^{t,\rho _{0:r}}_{B}\) over increasing values of t until convergence. Hence, for each \(s\in {\mathcal {S}}\), \(\overline{h}_{B}^{r}(s)\) is given by the entry in the s-row and the r-column of the matrix \(\overline{\textbf{h}}^{t^{*},\rho _{0:r}}_{B}\), where \(t^{*}\) is the convergence time-step. The convergence of the procedure is guaranteed by the following theorem:

Theorem 9

Let \(\langle \mathcal {M},rew \rangle\) be an IMRM, \(B \subseteq {\mathcal {S}}\), and \(r \in \mathbb {N}\). There is a \(t^{*}\in \mathbb {N}\) such that for all \(\tau \ge 0\):

$$\begin{aligned} \overline{\textbf{h}}^{t^{*}+\tau ,\rho _{0:r}}_{B} =\overline{\textbf{h}}^{t^{*},\rho _{0:r}}_{B}. \end{aligned}$$
(30)

The overall time complexity of the procedure is polynomial in \(\vert {\mathcal {S}} \vert t^{*}\), by a reasoning analogous to that given for Eq. (24).

Imprecise Probabilistic Interpreted Systems

The second IMC extension we consider is that of imprecise probabilistic interpreted systems (IPISs) [28]. The latter are defined as multi-agent systems composed of agents whose stochastic behaviour is described in terms of IMCs. An IPIS under this interpretation is constructed as follows. For each agent \(a\in \mathcal {A}\), let \(\{K^{a}(S'\mid s)\}_{s\in {\mathcal {S}}}\) denote a family of CSs including, for each \(s\in {\mathcal {S}}\), all the transition probability mass functions \(P^{a}(S'\mid s)\) that are compatible with the agent's probabilistic beliefs. To obtain an IPIS, we replace all the transition matrices \(T^{a}, a\in \mathcal {A}\), in a standard PIS with corresponding (row-stochastic) imprecise transition matrices \(\mathcal {T}^{a}:= \{K^{a}(S'\mid s)\}_{s\in {\mathcal {S}}}\), whose rows correspond to the transition CSs \(K^{a}(S'\mid s)\) for all \(s\in {\mathcal {S}}\). The overall stochastic behaviour of the entire multi-agent system is then described by a global imprecise transition matrix \(\mathcal {T}_\textsf{IPIS}\), which can be obtained following different approaches. The most conservative approach consists of defining \(\mathcal {T}_\textsf{IPIS}\) as a collection of \(\vert {\mathcal {S}} \vert\) conditional CSs \(K_\textsf{IPIS}(S' \mid s)\), each one being defined as \(\bigcup _{a\in \mathcal {A}}K^{a}(S' \mid s)\). While natural, this approach always implies an increase in the degree of imprecision, defined in terms of the size of the credal sets.

Another approach to obtain \(\mathcal {T}_\textsf{IPIS}\) consists of computing, for each transition \((s,s')\in {\mathcal {S}}\times {\mathcal {S}}\), the credal version of the logarithmic pooling of the family of conditional CSs \(\{ K^{a}(S'\mid s): a\in \mathcal {A}\}\). In general, this is defined as the element-wise application of the standard logarithmic pooling to the elements of the credal sets. This element-wise approach, however, might entail exponential complexity with respect to the number of agents in the model. A similar problem also occurs when considering alternative strategies, such as the one proposed in [39] within the framework of general credal networks. An efficient way to overcome the problem, which we adopt here, consists in considering an outer approximation of the lower and upper bounds of the credal logarithmic pooling, achieved as follows:

$$\begin{aligned} \overline{\mathcal {T}}_\textsf{IPIS}\left( s,s'\right) := \frac{\prod _{a\in \mathcal {A}}\overline{\mathcal {T}}^{a}\left( s,s'\right) }{\prod _{a\in \mathcal {A}}\overline{\mathcal {T}}^{a}\left( s,s'\right) + \sum _{s''\ne s'} \prod _{a\in \mathcal {A}}\underline{\mathcal {T}}^{a}\left( s,s''\right) }. \end{aligned}$$
(31)

The lower bound is analogously computed. The resulting global matrix \(\mathcal {T}_\textsf{IPIS}\) is an imprecise transition matrix whose entries are intervals \([m,n]\subseteq [0,1]\), with extremes given by the lower \(\underline{\mathcal {T}}_\textsf{IPIS}(s,s')\) and the upper \(\overline{\mathcal {T}}_\textsf{IPIS}(s,s')\) bounds of the transition probabilities. As in the precise case, the global matrix describes the embedded IMC of the IPIS, which is used to compute probabilistic inferences concerning the overall stochastic behaviour of the multi-agent system.
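The outer approximation in Eq. (31) only requires the per-agent lower and upper transition bounds, so it avoids the element-wise enumeration entirely. A minimal sketch with two hypothetical agents over two global states:

```python
import numpy as np

# Hypothetical per-agent bounds: T_low[a] and T_up[a] are the lower and upper
# transition matrices of agent a over the global state space.
T_low = [np.array([[0.3, 0.5], [0.1, 0.7]]),
         np.array([[0.2, 0.6], [0.2, 0.6]])]
T_up  = [np.array([[0.5, 0.7], [0.3, 0.9]]),
         np.array([[0.4, 0.8], [0.4, 0.8]])]
n = T_low[0].shape[1]                   # number of global states

def pooled_upper(s, s_next):
    """Outer upper bound of the credal logarithmic pooling, Eq. (31)."""
    num = np.prod([T[s, s_next] for T in T_up])
    den = num + sum(np.prod([T[s, t] for T in T_low])
                    for t in range(n) if t != s_next)
    return num / den

def pooled_lower(s, s_next):
    """Analogous lower bound: lower numerator, upper competing terms."""
    num = np.prod([T[s, s_next] for T in T_low])
    den = num + sum(np.prod([T[s, t] for T in T_up])
                    for t in range(n) if t != s_next)
    return num / den

print(pooled_lower(0, 1), pooled_upper(0, 1))   # one interval entry of T_IPIS
```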

Imprecise Probabilistic Interpreted Reward Systems

The last class of IMC-based structures we consider are imprecise probabilistic interpreted reward systems (IPIRSs). The latter are defined as pairs \(\langle {\mathcal{M}_\textsf{IPIS}},rew\rangle\) of an IPIS \(\mathcal {M}_\textsf{IPIS}\) and a reward function \(rew: {\mathcal {S}}\rightarrow \mathbb {N}\). As for IPISs, the global transition matrix \(\mathcal {T}_\textsf{IPIRS}\) of an IPIRS is obtained by combining the credal transition matrices \(\mathcal {T}^{a}\) of the various agents \(a\in \mathcal {A}\).

Notice that all the imprecise Markov models introduced above can be seen as components of an IPIRS. Specifically, an IPIRS \((\mathcal {M}_\textsf{IPIS}, rew)\) includes an IPIS \(\mathcal {M}_\textsf{IPIS}\), which is composed of several IMCs \(\mathcal {M}\), one for each agent of the system. It also includes several IMRMs, one for each agent, each composed of an IMC \(\mathcal {M}\) and the reward function rew. This work focuses only on IPIRSs and their properties; nevertheless, the various results obtained for IPIRSs can be easily transferred to IMCs, IMRMs, and IPISs.

Epistemic Imprecise PRCTL

This section presents Epistemic Imprecise Probabilistic Reward Computation Tree Logic (EIPRCTL), an epistemic and imprecise-probabilistic extension of the PCTL introduced in [6], suitable for specifying epistemic, probabilistic, and reward properties of non-stationary (or not fully specifiable) stochastic multi-agent systems. The language targets IPIRSs but can also be applied to the other kinds of imprecise Markov models previously introduced. Notably, properties of IMCs, IMRMs, and IPISs can be specified via dedicated languages, such as IPCTL [22], IPRCTL [25], and EIPCTL [28], all of which can be obtained as EIPRCTL fragments.

EIPRCTL Syntax

The EIPRCTL syntax is recursively defined as follows:

$$\begin{aligned} \begin{array}{ll} \phi := &{} \top \mid p \mid \lnot \phi \mid \phi _1\wedge \phi _2 \mid \underline{P}_{\nabla {b}}\psi \mid \overline{P}_{\nabla {b}}\psi \mid P_{J}\psi \mid \underline{E}_{\nabla {r}}\phi \mid \overline{E}_{\nabla {r}}\phi \mid E_{J}\phi \mid K^{a}\phi \mid E^{\Gamma }\phi \mid C^{\Gamma }\phi \mid D^{\Gamma }\phi ,\\ &{}\\ \psi := &{} \bigcirc \phi \mid \phi _1\bigcup \phi _2 \mid \phi _1\bigcup ^{\le t}\phi _2 \mid \phi _1\bigcup _{\le r}\phi _2,\\ &{}\\ \epsilon := &{} B^{a}_{\nabla {\underline{b}}}\phi \mid B^{a}_{\nabla {\overline{b}}}\phi , \end{array} \end{aligned}$$

where \(p \in AP\), \(b\in [0,1]\), \(J\subseteq [0,1]\), \(a\in \mathcal {A}\), \(\Gamma \subseteq \mathcal {A}\), and \(\nabla\) denotes any comparison relation in \(\{<,\le ,=,\ge ,>\}\).

The language is composed of \(\phi\)-, \(\psi\)-, and \(\epsilon\)-formulae. The \(\phi\)-formulae extend classical propositional logic with the usual operators for single-agent knowledge \(K^{a}\), group knowledge \(E^{\Gamma }\), common knowledge \(C^{\Gamma }\), and distributed knowledge \(D^{\Gamma }\), and with the following probabilistic modalities:

  • \(\underline{P}_{\nabla {b}}\psi\): the lower probability of following a path that satisfies \(\psi\) is \(\nabla {b}\);

  • \(\overline{P}_{\nabla {b}}\psi\): the upper probability of following a path that satisfies \(\psi\) is \(\nabla {b}\);

  • \(P_{J}\psi\): the probability of following a path that satisfies \(\psi\) belongs to the closed interval \(J\subseteq [0,1]\);

  • \(\underline{E}_{\nabla {r}}\phi\): the lower bound of the expected cumulative reward earned by the system until reaching a state that satisfies \(\phi\) is \(\nabla {r}\);

  • \(\overline{E}_{\nabla {r}}\phi\): the upper bound of the expected cumulative reward earned by the system until reaching a state that satisfies \(\phi\) is \(\nabla {r}\);

  • \(E_{J}\phi\): the expected cumulative reward earned by the system until reaching a state that satisfies \(\phi\) belongs to the closed interval \(J\).

The \(\psi\)-formulae are standard CTL path-formulae [2, p. 313] used to represent properties of paths:

  • \(\bigcirc \phi\): in the next state of the path \(\phi\) holds;

  • \(\phi _1\bigcup \phi _2\): \(\phi _1\) holds along the path until \(\phi _2\) holds;

  • \(\phi _1\bigcup ^{\le t}\phi _2\): there exists a time-step \(\tau \le t\) such that \(\phi _2\) holds at the \(\tau\)-th step of the path and \(\phi _1\) holds at all the previous time-steps;

  • \(\phi _1\bigcup _{\le r}\phi _2\): \(\phi _1\) holds in all states of the path until \(\phi _2\) holds, with a cumulative reward lower than or equal to r earned up to that point.

Finally, \(\epsilon\)-formulae include the two following weighted-belief modalities:

  • \(B^{a}_{\nabla {\underline{b}}}\phi\): agent a believes that the lower probability of eventually hitting a state that satisfies \(\phi\) is \(\nabla {b}\);

  • \(B^{a}_{\nabla {\overline{b}}}\phi\): agent a believes that the upper probability of eventually hitting a state that satisfies \(\phi\) is \(\nabla {b}\).

EIPRCTL Semantics

Let us now introduce a semantics for EIPRCTL formulae based on IPIRSs. It can be seen as a generalisation of both the semantics proposed in [22] for IMCs and that proposed in [18] for standard (precise) PISs.

Semantics of State-Formulae

We start by presenting the satisfiability conditions for Boolean, probabilistic, expected reward, and epistemic \(\phi\)-formulae separately.

Definition 15

(Semantics of Boolean formulae) Given an IPIRS \(\langle \mathcal {M}_\textsf{IPIS}, rew\rangle\) and a state \(s\in {\mathcal {S}}\), the following conditions hold:

$$\begin{aligned} \begin{array}{ll} \langle \mathcal {M}_\textsf{IPIS}, rew \rangle ,s\models p &{} \textrm{iff}\;\; p\in l(s),\\ \langle \mathcal {M}_\textsf{IPIS}, rew \rangle ,s\models \lnot \phi &{} \textrm{iff}\;\; \langle \mathcal {M}_\textsf{IPIS}, rew \rangle ,s\not \models \phi ,\\ \langle \mathcal {M}_\textsf{IPIS}, rew \rangle ,s\models \phi _1\wedge \phi _2 &{} \textrm{iff}\;\; \langle \mathcal {M}_\textsf{IPIS}, rew \rangle ,s\models \phi _1\;\; \textrm{and}\;\; \langle \mathcal {M}_\textsf{IPIS}, rew \rangle ,s\models \phi _2. \end{array} \end{aligned}$$

Definition 16

(Semantics of probabilistic formulae) Given an IPIRS \(\langle \mathcal {M}_\textsf{IPIS}, rew \rangle\) and a state \(s\in {\mathcal {S}}\), the following conditions hold:

$$\begin{aligned} \begin{array}{ll} \langle \mathcal {M}_\textsf{IPIS}, rew \rangle ,s\models \underline{P}_{\nabla {b}}\psi &{} \textrm{iff}\,\,\,\, \underline{P}_\textsf{IPIRS}(s\models \psi ) \nabla {b},\\ \langle \mathcal {M}_\textsf{IPIS}, rew \rangle ,s\models \overline{P}_{\nabla {b}}\psi &{} \textrm{iff}\,\,\,\, \overline{P}_\textsf{IPIRS}(s\models \psi ) \nabla {b},\\ \langle \mathcal {M}_\textsf{IPIS}, rew \rangle ,s\models P_{J}\psi &{} \textrm{iff}\,\, \left\{ \begin{array}{ll} \langle \mathcal {M}_\textsf{IPIS}, rew \rangle ,s\models &{} \underline{P}_{\ge \min {J}}\psi \,\, \textrm{and}\\ \langle \mathcal {M}_\textsf{IPIS}, rew \rangle ,s\models &{} \overline{P}_{\le \max {J}}\psi . \end{array} \right. \end{array} \end{aligned}$$

For the probabilistic formulae, the satisfiability conditions refer to the lower and upper bounds of \(P_\textsf{IPIRS}(s\models \psi )\), i.e. the probability that a path originating in s satisfies \(\psi\), conditional on \(S_0 = s\). Similarly to standard PCTL [2], the computation of the lower and upper bounds of \(P_\textsf{IPIRS}(s\models \psi )\) varies depending on \(\psi\). We further analyse this point in Section “Model Checking”, dedicated to the model checking procedures.

Definition 17

(Semantics of expected reward formulae) Let \(Sat(\phi )\) be the set of all states that satisfy \(\phi\). Given an IPIRS \((\mathcal {M}_\textsf{IPIS}, rew)\) and a state \(s\in {\mathcal {S}}\), the following conditions hold:

$$\begin{aligned} \begin{array}{ll} \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,s\models \overline{E}_{\nabla {r}}\phi &{} \textrm{iff}\,\, \overline{Exp}Rew_{Sat(\phi )}(s) \nabla {r},\\ \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,s\models \underline{E}_{\nabla {r}}\phi &{} \textrm{iff}\,\, \underline{Exp}Rew_{Sat(\phi )}(s) \nabla {r},\\ \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,s\models E_{J}\phi &{} \textrm{iff} \left\{ \begin{array}{l} \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,s\models \underline{E}_{\ge \min {J}}\phi \,\, \textrm{and}\\ \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,s\models \overline{E}_{\le \max {J}}\phi . \end{array} \right. \end{array} \end{aligned}$$


Definition 18

(Semantics of epistemic formulae) Given an IPIRS \((\mathcal {M}_\textsf{IPIS}, rew)\), an agent \(a\in \mathcal {A}\) or a group of agents \(\Gamma \subseteq \mathcal {A}\), and a state \(s\in {\mathcal {S}}\), the following conditions hold:

$$\begin{aligned} \begin{array}{ll} \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,s\models K^{a}\phi &{} \textrm{iff}\,\,\,\, \forall s', s\sim ^{a}s': s'\models \phi ,\\ \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,s\models E^{\Gamma }\phi &{} \textrm{iff}\,\,\,\, \forall s', s\sim ^{\Gamma }_{E}s': s'\models \phi ,\\ \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,s\models C^{\Gamma }\phi &{} \textrm{iff}\,\,\,\, \forall s', s\sim ^{\Gamma }_{C}s': s'\models \phi ,\\ \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,s\models D^{\Gamma }\phi &{} \textrm{iff}\,\,\,\, \forall s', s\sim ^{\Gamma }_{D}s': s'\models \phi . \end{array} \end{aligned}$$

Semantics of Path-Formulae

Definition 19

(Semantics of \(\psi\)-formulae) Given an IPIRS \((\mathcal {M}_\textsf{IPIS}, rew)\) and a path \(\pi\), the following conditions hold:

$$\begin{aligned} \begin{array}{ll} \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,\pi \models \bigcirc \phi &{} \textrm{iff}\,\,\,\, \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,\pi (1)\models \phi ,\\ &{}\\ \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,\pi \models \phi _1\bigcup ^{\le t}\phi _2 &{} \textrm{iff}\,\,\,\, \exists \tau \le t: \left\{ \begin{array}{l} \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,\pi (\tau )\models \phi _2\,\, \textrm{and}\\ \forall \tau '<\tau :\,\, \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,\pi (\tau ')\models \phi _1, \end{array} \right. \\ &{}\\ \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,\pi \models \phi _1\bigcup \phi _2 &{} \textrm{iff}\,\,\,\, \exists t\ge 0: \left\{ \begin{array}{l} \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,\pi (t)\models \phi _2\,\, \textrm{and}\\ \forall \tau : 0\le \tau<t:\,\, \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,\pi (\tau )\models \phi _1, \end{array} \right. \\ &{}\\ \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,\pi \models \phi _1\bigcup _{\le r}\phi _2 &{} \textrm{iff}\,\,\,\, \exists t\in \mathbb {N}: \left\{ \begin{array}{ll} \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,\pi (t)\models \phi _2\,\, \textrm{and}\\ \forall \tau <t:\,\, \langle \mathcal {M}_\textsf{IPIS}, rew\rangle ,\pi (\tau )\models \phi _1\\ \textrm{and}\,\, Rew(\pi , t)\le r. \end{array} \right. \end{array} \end{aligned}$$

Semantics of Weighted-Belief Formulae

The \(\epsilon\)-formulae are doxastic formulae that specify the probabilistic beliefs of a specific agent concerning the overall behaviour of the multi-agent system. They are expressed in terms of the lower and upper bounds of \(P^{a}(s\models \top \bigcup \phi )\), that is, the probability according to agent a that the multi-agent system eventually visits a state \(s'\models \phi\) starting from \(s\in {\mathcal {S}}\). This probability is computed analogously to \(P_\textsf{IPIRS}(s\models \top \bigcup \phi )\) in Definition 16, but replacing the global transition matrix \(\mathcal {T}_\textsf{IPIS}\) with the local transition matrix \(\mathcal {T}^{a}\) that describes the specific stochastic behaviour of the single agent \(a\in \mathcal {A}\). Obviously, since we are considering imprecise models, we are interested in computing the lower and upper bounds of \(P^{a}(s\models \top \bigcup \phi )\). These are denoted, respectively, by \(\underline{P}^{a}(s\models \top \bigcup \phi )\) and \(\overline{P}^{a}(s\models \top \bigcup \phi )\). The specific procedure to compute those bounds is further detailed in Section “Model Checking”. Here we focus only on the satisfiability conditions for \(\epsilon\)-formulae, defined as follows:

Definition 20

(Satisfiability of \(\epsilon\)-formulae) Given an IPIRS \((\mathcal {M}_\textsf{IPIS}, rew)\) and a state \(s \in {\mathcal {S}}\), the following conditions hold:

$$\begin{aligned} \begin{array}{cc} \left\langle \mathcal {M}_\textsf{IPIS}, rew\right\rangle ,s\models B^{a}_{\nabla {\underline{b}}}\phi &{} \textrm{iff}\,\,\,\, \forall \, s':\, s\sim ^{a} s',\,\, \underline{P}^{a}\left( s'\models \top \bigcup \phi \right) \nabla {b},\\ \left\langle \mathcal {M}_\textsf{IPIS}, rew\right\rangle ,s\models B^{a}_{\nabla {\overline{b}}}\phi &{} \textrm{iff}\,\,\,\, \forall \, s':\, s\sim ^{a} s',\,\, \overline{P}^{a}\left( s'\models \top \bigcup \phi \right) \nabla {b}. \end{array} \end{aligned}$$

Model Checking

In this section, we explain how to check systems modelled by IPIRSs against properties specified in EIPRCTL. The procedure we present extends the well-known parse-tree algorithm originally introduced for model checking CTL formulae [2]. We start by briefly recalling the structure and functioning of the parse tree; then we extend the algorithm with a series of new sub-routines that solve the model checking tasks specific to the different kinds of EIPRCTL formulae.

Parsing Tree

Let \(\Lambda\) be a short notation for either a \(\phi\)-formula or an \(\epsilon\)-formula. Given an IPIRS \(\langle \mathcal {M}_\textsf{IPIS}, rew\rangle\), a state \(s\in {\mathcal {S}}\), and a formula \(\Lambda\), a model checking algorithm is an automatic procedure to verify whether \(\langle \mathcal {M}_\textsf{IPIS}, rew\rangle , s\models \Lambda\) holds. The standard algorithm for CTL and its extensions exploits the parse tree of \(\Lambda\), generated by decomposing \(\Lambda\) into its sub-formulae as in Fig. 1 [2, p. 336]. The algorithm works as follows:

  1. Generate the parse tree of \(\Lambda\) by recursively decomposing \(\Lambda\) into its sub-formulae \(\lambda\) until only atoms remain.

  2. Traverse the parse tree, visiting all the sub-formulae \(\lambda\), starting from the leaves and working backwards to the root.

  3. At each sub-formula \(\lambda\), compute the set of states that satisfy \(\lambda\), denoted \(Sat(\lambda )\), by checking whether \(s\models \lambda\) for all \(s\in {\mathcal {S}}\).

  4. Compute \(Sat(\Lambda )\) by composing the various \(Sat(\lambda )\).

  5. Check whether \(s\in Sat(\Lambda )\).

Fig. 1 Parse-tree of a formula

The algorithm includes a specific sub-routine to compute \(Sat(\Lambda )\) for each specific kind of sub-formula \(\lambda\).

Boolean Formulae

For Boolean formulae, \(Sat(\lambda )\) is computed as follows:

$$\begin{aligned} \begin{array}{ll} &{} Sat(\top ):=\,\, {\mathcal {S}},\\ &{} Sat(p):=\,\, \left\{ s\in {\mathcal {S}}: p\in l(s)\right\} ,\\ &{} Sat\left( \phi _1\wedge \phi _2\right) :=\,\, Sat\left( \phi _1\right) \cap Sat\left( \phi _2\right) ,\\ &{} Sat\left( \lnot \phi \right) :=\,\, {\mathcal {S}}\setminus Sat(\phi ).\\ \end{array} \end{aligned}$$
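
As an illustration, the following minimal sketch (with an illustrative formula representation, not taken from the paper) implements the bottom-up Sat computation for the Boolean fragment, mirroring the leaves-to-root traversal of the parse tree.

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class Top:
    pass

@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Not:
    sub: Any

@dataclass(frozen=True)
class And:
    left: Any
    right: Any

def sat(formula, states, label):
    """Return Sat(formula) as a set of states.

    states: the state space S; label(s) returns l(s), the atoms holding at s.
    Each connective recurses on its sub-formulae, as in the parse-tree algorithm.
    """
    if isinstance(formula, Top):
        return set(states)
    if isinstance(formula, Atom):
        return {s for s in states if formula.name in label(s)}
    if isinstance(formula, Not):
        return set(states) - sat(formula.sub, states, label)
    if isinstance(formula, And):
        return sat(formula.left, states, label) & sat(formula.right, states, label)
    raise ValueError("connective not covered in this sketch")

# Example: Sat(not(p and q)) over a three-state toy model.
labels = {0: {"p"}, 1: {"p", "q"}, 2: set()}
print(sat(Not(And(Atom("p"), Atom("q"))), {0, 1, 2}, labels.get))  # {0, 2}
```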

Probabilistic Formulae

For formulae of the kind \(\underline{P}_{\nabla {b}}\psi\), \(\overline{P}_{\nabla {b}}\psi\), and \(P_{J}\psi\), the set \(Sat(\lambda )\) is obtained by computing the lower and upper bounds of \(P_\textsf{IPIRS}(s\models \psi )\) for each \(s\in {\mathcal {S}}\) and then checking whether they satisfy the specified threshold \(\nabla {b}\). The specific procedure to compute these bounds varies depending on the specification of \(\psi\).

Next If \(\psi := \bigcirc \phi\), then \(P_\textsf{IPIRS}(s\models \psi )\) corresponds to \(P_{s}(s_{1}\in Sat(\phi ))\) and its upper (lower) bound can be computed as in Eq. (22).

Time-Bounded Until If \(\psi := \phi _1\bigcup ^{\le t}\phi _2\), then \(P_\textsf{IPIRS}(s\models \psi )\) corresponds to the probability of hitting \(Sat(\phi _2)\) within a finite number of time-steps t, conditional on \(S_0 = s\), with the additional condition that all the states visited before reaching \(Sat(\phi _2)\) are in \(Sat(\phi _1)\). For each \(s\in {\mathcal {S}}\), we denote this probability by \(h_{Sat(\phi _2)\mid Sat(\phi _1)}^{t}(s)\). A recursive schema analogous to that in Eq. (24) can be formulated to compute \(\overline{h}_{Sat(\phi _2)\mid Sat(\phi _1)}^{t}\). Let \(\mathbb {I}_{Sat(\phi _1)\setminus Sat(\phi _2)}\) denote the indicator vector whose entries are 1 for all \(s\in Sat(\phi _1){\setminus } Sat(\phi _2)\) and 0 otherwise. A slightly modified version of the algorithm in Eq. (24), computing \(\overline{h}_{Sat(\phi _2)\mid Sat(\phi _1)}^{t}\) by recursion over increasing values of t, is obtained as follows:

$$\begin{aligned} \overline{h}_{Sat(\phi _2)\mid Sat(\phi _1)}^{t}:= \mathbb {I}_{Sat(\phi _2)} + \mathbb {I}_{Sat(\phi _1)\setminus Sat(\phi _2)} \left( \overline{\mathcal {T}}_\textsf{IPIS} \,\overline{h}_{Sat(\phi _2)\mid Sat(\phi _1)}^{t-1}\right) . \end{aligned}$$
(32)

As in Eq. (24), the initialisation is given by the indicator function of \(Sat(\phi _2)\), while the recursive steps consist of iterated applications of the upper transition operator to the hitting vector computed at the preceding time-step \(t-1\). The only relevant difference with the analogous scheme presented in Section “Imprecise Markov Chains” is the indicator vector \(\mathbb {I}_{Sat(\phi _1)\setminus Sat(\phi _2)}\), which replaces the indicator vector \(\mathbb {I}_{B^{c}}\) of the complement of the hitting event B. In the general scheme, \(\mathbb {I}_{B^{c}}\) restricts the iteration to paths that have not already visited an \(s\in B\). Here, by using \(\mathbb {I}_{Sat(\phi _1)\setminus Sat(\phi _2)}\), we restrict the iteration to those paths whose current and previous states are all in \(Sat(\phi _1)\) and that have not already reached a state \(s\in Sat(\phi _2)\). Notice that Eq. (32) is the imprecise analogue of the system of linear equations used to compute \(P(s\models \phi _1\bigcup \phi _2)\) in the precise case, as reported in [2, Sec. 10.2.1]. Finally, as for Eq. (24), the computation of Eq. (32) requires solving \(\vert {\mathcal {S}}\vert t\) linear programming tasks, and its time complexity is thus polynomial in \(\vert {\mathcal {S}}\vert \, t\).

Until If \(\psi := \phi _1\bigcup \phi _2\), then \(P_\textsf{IPIRS}(s\models \psi )\) corresponds to the probability of hitting \(Sat(\phi _2)\) conditional on \(S_0 = s\), with the additional requirement that all the states visited before reaching \(Sat(\phi _2)\) are in \(Sat(\phi _1)\). To compute the lower and upper bounds of this probability, we iterate the schema in Eq. (32) over increasing values of t until convergence.
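
The following sketch illustrates both recursions, assuming each credal row is described by probability intervals (as with the contamination model used in the case study below). Under the usual coherence condition \(\sum _{s'}\underline{\mathcal {T}}(s,s')\le 1\le \sum _{s'}\overline{\mathcal {T}}(s,s')\), the row-wise linear programs defining the upper transition operator admit a simple greedy solution, used here in place of a generic LP solver.

```python
import numpy as np

def upper_expectation(lo, up, h):
    """Upper transition operator (T-bar h)(s) for probability-interval rows.

    For each row, the LP  max <p, h>  s.t.  lo[s] <= p <= up[s], sum(p) = 1
    has a greedy solution: start from the lower bounds and spend the
    remaining probability mass on the states with the largest h first.
    Assumes coherent intervals: sum(lo[s]) <= 1 <= sum(up[s]).
    """
    n = len(h)
    order = np.argsort(-h)                  # states sorted by decreasing h
    out = np.empty(n)
    for s in range(n):
        p = lo[s].copy()
        budget = 1.0 - p.sum()
        for s1 in order:
            step = min(up[s, s1] - lo[s, s1], budget)
            p[s1] += step
            budget -= step
            if budget <= 0.0:
                break
        out[s] = p @ h
    return out

def bounded_until_upper(lo, up, sat1, sat2, t):
    """Upper bounds of P(s |= phi1 U^{<=t} phi2), via the recursion (32)."""
    i2 = sat2.astype(float)                 # indicator of Sat(phi2)
    mask = (sat1 & ~sat2).astype(float)     # indicator of Sat(phi1)\Sat(phi2)
    h = i2.copy()                           # initialisation at t = 0
    for _ in range(t):
        h = i2 + mask * upper_expectation(lo, up, h)
    return h

def until_upper(lo, up, sat1, sat2, tol=1e-9):
    """Unbounded until: iterate the schema of Eq. (32) until convergence."""
    i2 = sat2.astype(float)
    mask = (sat1 & ~sat2).astype(float)
    h = i2.copy()
    while True:
        h_new = i2 + mask * upper_expectation(lo, up, h)
        if np.max(np.abs(h_new - h)) < tol:
            return h_new
        h = h_new
```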

Reward-Bounded Until

If \(\psi := \phi _1\bigcup _{\le r}\phi _2\), then \(P_\textsf{IPIRS}(s\models \phi _1\bigcup _{\le r}\phi _2)\) corresponds to the reward-bounded hitting probability of \(Sat(\phi _2)\), with the additional condition that all the states visited before reaching \(Sat(\phi _2)\) are in \(Sat(\phi _1)\). We denote this probability by \(h^{r}_{Sat(\phi _2)\mid Sat(\phi _1)}\). To compute its upper (lower) bound, we employ a slightly modified version of the procedure introduced in Eqs. (28) and (29).

Let \(\overline{\textbf{h}}^{t,\rho _{0:r}}_{Sat(\phi _2)\mid Sat(\phi _1)}\) be a matrix whose cells are the values of \(\overline{h}^{t,\rho }_{Sat(\phi _2)\mid Sat(\phi _1)}(s)\), computed for each \(s\in {\mathcal {S}}\) and each \(\rho = 0,1,\dots ,r\).

For \(t=0\), we generate \(\overline{\textbf{h}}^{t,\rho _{0:r}}_{Sat(\phi _2)\mid Sat(\phi _1)}\) by computing the vectors \(\overline{h}^{t=0,\rho }_{Sat(\phi _2)\mid Sat(\phi _1)}\) for \(\rho = 0,1,\dots ,r\), as in Eq. (15).

For \(t>0\), we generate \(\overline{\textbf{h}}^{t,\rho _{0:r}}_{Sat(\phi _2)\mid Sat(\phi _1)}\) by computing the various vectors \(\overline{h}^{t,\rho }_{Sat(\phi _2)\mid Sat(\phi _1)}\) for \(\rho = 0,1,\dots ,r\) as follows:

$$\begin{aligned} \overline{h}^{t,\rho }_{Sat(\phi _2)\mid Sat(\phi _1)}(s) = {\left\{ \begin{array}{ll} &{} 1\,\, \textrm{if}\,\, s\in Sat(\phi _2)\,\, \textrm{and}\,\, rew(s)\le \rho \,,\\ &{} 0\,\, \textrm{if}\,\, \underline{h}(s)=0\,\, \textrm{or}\,\, s\not \in Sat(\phi _1)\setminus Sat(\phi _2)\,\, \textrm{or}\,\, rew(s) > \rho \,,\\ &{} \overline{\mathcal {T}}\,\overline{h}^{t-1,\rho -rew(s)}_{Sat(\phi _2)\mid Sat(\phi _1)}(s)\,\, \textrm{otherwise}\,. \end{array}\right. } \end{aligned}$$
(33)

The schema is analogous to that in Eq. (29). The only relevant difference is the additional clause prescribing that \(\overline{h}^{t,\rho }_{Sat(\phi _2)\mid Sat(\phi _1)}(s) = 0\) also for all \(s\not \in Sat(\phi _1){\setminus } Sat(\phi _2)\), whereas in Eq. (29) \(\overline{h}^{t,\rho }_{B}(s) = 0\) only for those \(s\in {\mathcal {S}}\) such that either \(\underline{h}_{B}(s) = 0\) or \(rew(s) > r\). The additional clause blocks the recursion at the successors of the initial state that do not belong to \(Sat(\phi _1)\). Indeed, if a successor \(s'\not \in Sat(\phi _1)\setminus Sat(\phi _2)\) is reached at some time-step \(\tau\) of the iteration, then \(h^{r}_{Sat(\phi _2)\mid Sat(\phi _1)}(s')\) takes value zero and the recursion from that state stops. In this way, we account for the additional requirement that all states visited before reaching the hitting event \(Sat(\phi _2)\) are in \(Sat(\phi _1)\). Notice that this slight modification does not alter the general results about \(\overline{h}^{r}_{B}\) reported above. In particular, the time complexity of the procedure remains polynomial in \(\vert {\mathcal {S}}\vert t^{*}\) (with \(t^{*}\) denoting the convergence time-step), as it essentially depends on the iterative step \(\overline{\mathcal {T}}\,\overline{h}^{t-1,\rho -rew(s)}(s)\), which is the same in both Eqs. (29) and (33).
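
A compact sketch of the recursion in Eq. (33) follows, using the greedy row-wise operator from the previous sketch (repeated here in row form so the snippet stays self-contained). The \(\underline{h}(s)=0\) short-circuit of Eq. (33) is omitted for brevity, as it only prunes states that can never reach \(Sat(\phi _2)\); this is an unoptimised illustration, not the paper's implementation.

```python
import numpy as np

def upper_row_expectation(lo_row, up_row, h):
    # Greedy LP solution for one probability-interval row (see previous sketch).
    p = lo_row.copy()
    budget = 1.0 - p.sum()
    for s1 in np.argsort(-h):
        step = min(up_row[s1] - lo_row[s1], budget)
        p[s1] += step
        budget -= step
        if budget <= 0.0:
            break
    return p @ h

def reward_bounded_until_upper(lo, up, sat1, sat2, rew, r, t_max):
    """Upper bound of the reward-bounded hitting probability, per Eq. (33).

    rew: integer reward per state; r: reward budget; t_max: iteration horizon.
    h[rho] approximates the bound when the residual budget is rho.
    """
    n = len(rew)
    h = np.array([(sat2 & (rew <= rho)).astype(float) for rho in range(r + 1)])
    active = sat1 & ~sat2                 # states where the recursion runs
    for _ in range(t_max):
        new = np.zeros_like(h)
        for rho in range(r + 1):
            new[rho][sat2 & (rew <= rho)] = 1.0        # first case of Eq. (33)
            for s in np.flatnonzero(active & (rew <= rho)):
                target = h[rho - rew[s]]               # residual budget after s
                new[rho][s] = upper_row_expectation(lo[s], up[s], target)
        h = new
    return h[r]
```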

Expected Reward Formulae

For formulae of the kind \(\underline{E}_{\nabla {r}}\phi\), \(\overline{E}_{\nabla {r}}\phi\), and \(E_{J}\phi\), the procedure to determine \(Sat(\lambda )\) is based on computing the lower or upper bounds of \(ExpRew_{Sat(\phi )}(s)\), with \(Sat(\phi )\subseteq {\mathcal {S}}\), following the procedure in Eq. (25).

Epistemic Formulae

For formulae of the kind \(K^{a}\phi\), \(E^{\Gamma }\phi\), \(C^{\Gamma }\phi\), and \(D^{\Gamma }\phi\), the sub-routine to compute \(Sat(\lambda )\) is based on the algorithm reported in Fig. 2.

Fig. 2 Model-checking algorithm for epistemic formulae

Let \(\kappa\) be a short notation for any of \(K^{a}\), \(E^{\Gamma }\), \(C^{\Gamma }\), and \(D^{\Gamma }\), and let \(\sim ^{\kappa }\) stand for the corresponding relation among \(\sim ^{a}\), \(\sim ^{\Gamma }_{E}\), \(\sim ^{\Gamma }_{C}\), and \(\sim ^{\Gamma }_{D}\). Given an epistemic equivalence relation \(\sim ^{\kappa }\), we denote by \(Eq^{\sim ^{\kappa }}\) the partition induced on \({\mathcal {S}}\) by \(\sim ^{\kappa }\). Each element of \(Eq^{\sim ^{\kappa }}\) is an EEC, which we denote by \(eq^{\sim ^{\kappa }}\). The algorithm in Fig. 2 works as follows:

  1. It takes as input an IPIRS \(\langle \mathcal {M}_\textsf{IPIS}, rew\rangle\) and a formula \(\kappa \phi\);

  2. It computes \(Sat(\phi )\) by recursively calling the respective sub-routine;

  3. For each \(eq^{\sim ^{\kappa }}\in Eq^{\sim ^{\kappa }}\), it checks whether \(eq^{\sim ^{\kappa }}\subseteq Sat(\phi )\); if so, it adds the whole equivalence class \(eq^{\sim ^{\kappa }}\) to \(Sat(\kappa \phi )\).

The main advantage of this procedure is that it does not consider single states \(s\in {\mathcal {S}}\) but works directly on EECs. This strategy drastically reduces the execution time of the procedure, which is ultimately polynomial in \(\vert {\mathcal {S}}\vert n\), with n denoting the nesting depth of \(\kappa \phi\), i.e. the number of nested epistemic operators occurring in the formula. For more details on this procedure, we refer to [40, Sec. 2.1].
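
A minimal sketch of the class-based check (illustrative names, assuming the partition is available as a collection of state sets):

```python
def sat_epistemic(partition, sat_phi):
    """Compute Sat(kappa phi) from Sat(phi) and the EEC partition of S.

    partition: iterable of sets of states (the EECs induced by ~^kappa);
    sat_phi: the set Sat(phi). A whole EEC enters Sat(kappa phi) exactly
    when all of its states satisfy phi.
    """
    result = set()
    for eec in partition:
        if eec <= sat_phi:      # every state of the class satisfies phi
            result |= eec
    return result

# Example: with EECs {A, L} and {D}, the class {A, L} is added only if
# both A and L satisfy phi.
print(sat_epistemic([{"A", "L"}, {"D"}], {"A", "L", "D"}))
```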

Imprecise Weighted-Belief Formulae


For \(\epsilon\)-formulae, the procedure to compute \(Sat(\lambda )\) requires computing the upper (lower) bounds of \(P^{a}(s'\models \top \bigcup \phi )\) for each \(s'\in eq^{a}(s)\). In practice, these correspond to the lower and upper bounds of \(h^{a}_{Sat(\phi )}\), that is, the hitting probability of the event \(Sat(\phi )\) computed through the local transition matrix \(\mathcal {T}^{a}\) of agent a. The procedure to compute \(P^{a}(s\models \top \bigcup \phi )\) is entirely analogous to that for computing \(P_\textsf{IPIRS}(s\models \top \bigcup \phi )\) reported above, with the relevant difference that we use the local transition matrix \(\mathcal {T}^{a}\) instead of the global transition matrix \(\mathcal {T}_\textsf{IPIS}\).

A Case Study on Healthcare Budgeting

We present a first validation of EIPRCTL based on a slightly modified version of the MRM originally proposed in [41], where MRMs are used to estimate the recovery costs for patients in geriatric departments. Two kinds of recovery are considered: short-term recoveries for acute care, with a daily cost estimated at £100, and long-term recoveries, costing £50 per day. From a cumulative perspective, long-term recoveries are more expensive, since those patients typically remain in hospital for longer periods.

Fig. 3 Transitions in a three-state MRM

The evolution over time of a patient's health conditions can be described by an MC with three states: acute care A, long-term care L, and discharge or death D. Transitions from L to A are considered impossible, while D acts as an absorbing state. A parametrised version of the transition matrix for this model is shown in Fig. 3. The parameters have the following interpretation: the conversion rate \(\nu\) corresponds to the probability of passing from a short-term to a long-term recovery, while the dismissing rates \(\gamma\) and \(\delta\) correspond to the probability of being discharged or dying in a short- and long-term recovery, respectively. The rates \(\gamma\), \(\nu\) and \(\delta\) vary depending on the patient and the disease. An assessment of these parameters for three different departments is given in Table 1.

Table 1 Conversion and dismissing rates

The reward rew associated with each state represents the daily cost per patient. On a scale where one unit corresponds to one pound, the daily costs per patient are described by a function such that \(rew(A)=100\), \(rew(L)=50\), and \(rew(D)=0\): once a patient is discharged or dies, she no longer incurs a cost for the hospital. Following these specifications, it is possible to construct a (precise) MRM \(\langle M, rew \rangle\) able to predict the expected cumulative cost incurred by the hospital for each patient up to the time of the patient's discharge or death. Suppose that the total amount of financial resources per patient available to the hospital is \(\rho :=\pounds 40,000\). We are interested in verifying whether the expected cumulative cost per patient is sustainable, i.e. whether it does not exceed the amount of resources available. This corresponds to checking whether:

$$\begin{aligned} ExpRew_{D}(s) \le \rho , \end{aligned}$$

for both \(s=A\), i.e. for a patient initially admitted in acute care, and \(s=L\), i.e. for a patient initially admitted in long-term care.
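
In the precise case, this check reduces to solving a small linear system on the transient states, since \(ExpRew_{D}\) satisfies \(x = rew + Qx\), with Q the transient part of the transition matrix of Fig. 3. The sketch below uses purely hypothetical rates (placeholders, not the actual values of Table 1, which are not reproduced here):

```python
import numpy as np

# Hypothetical conversion/dismissing rates (placeholders, not Table 1's values).
nu, gamma, delta = 0.01, 0.05, 0.02

# Transient part Q of the matrix in Fig. 3, over the states (A, L);
# D is absorbing with rew(D) = 0, so it drops out of the system.
Q = np.array([[1.0 - nu - gamma, nu],
              [0.0,              1.0 - delta]])
r = np.array([100.0, 50.0])        # daily costs rew(A), rew(L), in pounds

# ExpRew_D(s) solves x = r + Q x on the transient states.
x = np.linalg.solve(np.eye(2) - Q, r)
rho = 40_000.0
for state, cost in zip(("A", "L"), x):
    print(f"ExpRew_D({state}) = {cost:,.0f}; sustainable: {cost <= rho}")
```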

The MRM in [41] presents an important limitation: it considers precise values for the transition rates \(\nu\), \(\gamma\) and \(\delta\), which is tantamount to assuming that the probability of a patient changing her health condition is always the same, independently of time. This assumption is clearly problematic: for example, the probability of dying for patients in long-term care obviously increases with time. Imprecise probabilistic models allow us to overcome this limitation by considering, instead of precise values, probability intervals within which the transition rates can fluctuate over time.

In practice, we define an imprecise transition matrix for each department, obtained by perturbing the values of each column of Table 1. As a perturbation method for converting a probability mass function into a credal set, we adopt a simple linear-vacuous contamination [21]. The method works as follows: if P(S) is a PMF over S, the CS K(S) (see Section “Imprecise Transition Matrices”) is the set of all the PMFs obtained as mixtures \((1-\alpha ) P(S) + \alpha P'(S)\), where \(P'(S)\) is an arbitrary probability mass function and the parameter \(\alpha \in [0,1]\) defines the level of imprecision in the CS (e.g. \(K(S)=\{P(S)\}\) for \(\alpha =0\), while \(\alpha =1\) yields the vacuous CS). In our example, we assume a perturbation level \(\alpha = 0.03\). Hence, by applying the above contamination model to each row of the three precise transition matrices described in Table 1, we obtain three imprecise transition matrices \(\mathcal {T}^{a}\), \(a\in \{1,2,3\}\), where a indexes the department to which the matrix refers.
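
In interval form, the contamination assigns to each state the interval \([(1-\alpha )P(s'),\, (1-\alpha )P(s')+\alpha ]\); a one-function sketch (the example row is hypothetical):

```python
import numpy as np

def linear_vacuous(P, alpha=0.03):
    """Interval bounds of the alpha-contaminated credal set of a PMF row."""
    P = np.asarray(P, dtype=float)
    lower = (1 - alpha) * P                  # worst case: P' puts no mass here
    upper = np.minimum(lower + alpha, 1.0)   # best case: P' concentrated here
    return lower, upper

# Hypothetical row (A -> A, L, D) of one department's matrix:
lo, up = linear_vacuous([0.94, 0.01, 0.05])
```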

Given that the costs associated with each state remain the same as above, we have now obtained three different IMRMs, each describing the scenario of one of the three departments. We are still interested in checking whether the expected cumulative cost incurred until a given patient is discharged or dies does not exceed the available resources. However, we know neither from which of the three departments the patient comes nor whether she is admitted in acute (A) or long-term (L) care. We can model this scenario as follows.

First, instead of considering a specific transition matrix \(\mathcal {T}^{a}, a\in \{1,2,3\}\), we consider an aggregated model where the global imprecise transition matrix is a pooling of the imprecise transition matrices of each department.

Such a global imprecise transition matrix \(\mathcal {T}_\textsf{IPIS}\) can be obtained by the logarithmic pooling as in Eq. (31). As an alternative, more cautious estimate, we also consider a conservative pooling, which takes as aggregated model the union of the probability intervals of the imprecise transition matrices of the different departments.
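
Entry-wise, the conservative pooling amounts to taking the interval hull of the departments' intervals (which coincides with their union whenever the intervals overlap); a one-line sketch:

```python
import numpy as np

def conservative_pool(lowers, uppers):
    """Entry-wise interval hull of the departments' imprecise matrices."""
    return np.minimum.reduce(lowers), np.maximum.reduce(uppers)
```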

Finally, the fact that, independently of the department the patient comes from, we cannot know whether the patient is admitted in acute or long-term care corresponds to assuming that \(A\sim ^{a} L\) for all \(a\in \{1,2,3\}\), i.e. the states labelled A and L are indistinguishable. We have thus obtained a description of the considered scenario in terms of an IPIRS \(\langle \mathcal {M}_\textsf{IPIS}, rew\rangle\).

Checking whether the upper bound of the expected cumulative cost incurred by the hospital until a patient is discharged or dies is sustainable corresponds, therefore, to verifying whether or not the formula \(\overline{E}_{\le \rho }D\) holds in the model \(\langle \mathcal {M}_\textsf{IPIS}, rew \rangle\) for each state s in the equivalence class \(\{A,L\}\). To check this formula, we apply the procedure discussed in Section “Model Checking”. The algorithms described in Section “Imprecise Markov Models” are finally used to compute the upper bounds of \(ExpRew_{D}(s)\) for both \(s = A\) and \(s = L\). The most cautious bounds, returned by the conservative pooling, are:

$$\begin{aligned} \overline{Exp}Rew_D(A)= \, & {} \pounds 29{,}561, \end{aligned}$$
(34)
$$\begin{aligned} \overline{Exp}Rew_D(L)= \, & {} \pounds 42{,}343. \end{aligned}$$
(35)

As expected, the cumulative costs for patients initially admitted in acute care are lower and do not exceed the resources available to the hospital. The same does not hold for patients initially admitted in long-term care.

The management of the hospital might consequently need to check how likely it is that the cumulative costs for a hospitalised patient exceed the available resources. We do so by checking whether the probability of a patient being discharged or dying before the cumulative cost exceeds the available resources is sufficiently high, e.g. greater than or equal to a threshold \(\pi :=0.95\). This corresponds to checking whether the formula \(\underline{P}_{\ge \pi }\top \bigcup _{\le \rho } D\) holds in the model \(\langle \mathcal {M}_\textsf{IPIS}, rew \rangle\) for each state s in the equivalence class \(\{A,L\}\). Remarkably, by means of the algorithms described in Section “Imprecise Markov Models”, we obtain that the formula is satisfied for both initial states. A resources overrun is, therefore, a relatively unlikely event for the hospital.

Conclusions

An intrinsic limitation of probabilistic model checking lies in its fundamental reliance on standard Markov models, which can notoriously model only stationary agents whose transition probabilities are all specified by known numerical values. To overcome this limitation, we have presented a novel framework based on the theory of imprecise probabilities and the related imprecise Markov models. More specifically, we have shown how imprecise Markov models allow us to apply probabilistic model checking methods to both non-stationary agents and agents whose transition probabilities are not fully known, without incurring computational complexity issues. The key point is that both probabilistic and reward inferences in imprecise Markov models can be computed by iteratively solving linear programming tasks. This allows us to solve the relevant model checking tasks without increasing the computational complexity of the corresponding procedures, which remains polynomial in the number of states of the models. The paper focuses specifically on stochastic multi-agent systems, but the framework it introduces is also useful for single stochastic agents.

The main limitation is that, so far, we have considered only discrete-time models. Recent developments [38] in the study of imprecise continuous-time Markov chains (CTMCs) strongly suggest that an analogous framework can be introduced for continuous-time models, which are of fundamental relevance for applications in fields like computational and systems biology [14, 15]. In the model checking community, some works concerning non-stationarity issues in continuous-time Markov models have recently been proposed. In [42], for example, non-stationary agents are modelled via uncertain CTMCs, i.e. CTMCs whose transition probabilities vary non-deterministically in time within bounded continuous intervals. Although uncertain and imprecise CTMCs are similar, they are not equivalent formalisms. Notably, while an uncertain CTMC can be regarded as the largest family of precise CTMCs compatible with the bounds of the intervals, an imprecise CTMC is the largest family of all, potentially non-Markovian and non-homogeneous, processes compatible with the given constraints. In other words, imprecise CTMCs are more expressive and potentially useful for a wider range of applications. So far, however, there are no works specifically concerning model checking with imprecise CTMCs.

To conclude, another important direction for future work is the development of an imprecise-probabilistic framework for Markov decision processes, notably given the relevance of the latter and their natural connection with the field of Reinforcement Learning [43].