1 Introduction

The proliferation of complex systems with timing requirements that operate in unpredictable environments has led to interest in formal modelling and verification techniques for timed and probabilistic systems. One such formal verification technique is model checking [3, 12], in which a system model is verified automatically against formally-specified properties. A well-established modelling formalism for timed systems is timed automata [2]. A timed automaton consists of a finite graph equipped with a set of real-valued variables called clocks, which increase at the same rate as real time and which can be used to constrain the relative time of events. To model probabilistic systems formally, frameworks such as Markov chains or Markov decision processes are typically used. Model-checking algorithms for these formalisms have been presented in the literature: for overviews of these techniques see, for example, [7] for timed automata, and [3, 14] for Markov chains and Markov decision processes. Furthermore, timed automata and Markov decision processes have been combined to obtain the formalism of probabilistic timed automata [16, 25, 28], which can be viewed as timed automata with probabilities associated with their edges (or, equivalently, as Markov decision processes equipped with clocks and their associated constraints).

Fig. 1.

An example of a one-clock clock-dependent probabilistic timed automaton \(\mathcal {P}\).

For the modelling of certain systems, it may be advantageous to model the fact that the probabilities of some events, in particular those concerning the environment in which the system is operating, vary as time passes. For example, in automotive and aeronautic contexts, the probability of certain reactions of human operators may depend on factors such as fatigue, which can increase over time (see, for example, [13]); an increase in the amount of time that an unmanned aerial vehicle spends performing a search and rescue operation in a hazardous zone may increase the probability that the vehicle incurs damage from the environment; an increase in the time elapsed before a metro train arrives at a station can result in an increase in the number of passengers on the station’s platform, which can in turn increase the probability of the doors failing to shut at the station, due to overcrowding of the train (see [4]). A natural way of representing such a dependency of the probability of events on time is by a continuous function: for example, for the case in which a task can be completed between 1 and 3 time units in the future, we could represent the successful completion of the task by probability \(\frac{x\,+\,1}{4}\), where the clock variable \(x\) (measuring the amount of time elapsed) ranges over the interval [1, 3]. The standard probabilistic timed automaton formalism cannot express such a continuous relationship between probabilities and time, being limited to step functions (where the intervals along which the function is constant have rational-numbered endpoints). This limitation led to the development of an extension of probabilistic timed automata called clock-dependent probabilistic timed automata [32], in which the probabilities of crossing edges can depend on the values of clocks.
Figure 1 gives an example of such a clock-dependent probabilistic timed automaton, using the standard conventions for the graphical representation of (probabilistic) timed automata (the model has one clock denoted by \(x\), and black boxes denote probabilistic choices over outgoing edges). In location \(\mathrm {W}\), the system is working on a task, which is completed after between 1 and 3 units of time. When the task is completed, it either succeeds (edge to location \(\mathrm {S}\)), fails (edge to location \(\mathrm {F}\)) or leads to system termination (edge to location \(\mathrm {T}\)). In the case in which the task fails, between 4 and 5 time units after work on the task started the system may either try again to work on the task (edge to location \(\mathrm {W}\), resetting \(x\) to 0) or terminate (edge to location \(\mathrm {T}\)). The edges corresponding to probabilistic choices are labelled with expressions over the clock \(x\), which describe how the probability of those edges changes in accordance with changes in the value of \(x\). For example, the longer the time spent in location \(\mathrm {W}\), the higher the value of \(x\) when location \(\mathrm {W}\) is left, and the higher the probability of making a transition to location \(\mathrm {S}\), which corresponds to the successful completion of the task.

Previous work on clock-dependent probabilistic timed automata showed that a basic quantitative reachability problem (regarding whether there is a scheduler of nondeterministic choice such that the probability of reaching a set of target locations exceeds some probability threshold) is undecidable, but that an approach based on the region graph can be employed to approximate optimal reachability probabilities [32]. The undecidability result relied on the presence of at least three clocks: in this paper, following similar precedents in the context of (non-probabilistic and probabilistic) variants of timed automata (for example, [1, 5, 6, 8, 23, 26]), we restrict our attention to clock-dependent probabilistic timed automata with a single clock variable. As in [32], we consider the case in which the dependencies of transition probabilities on the value of the clock are described by affine functions. Furthermore, we assume that, between any two edges with a non-constant dependence on the clock, the clock must have a natural-numbered value, either through being reset to 0 or by increasing as time passes. We call this condition initialisation, following the precedents of [1] and [21], in which similar conditions are used to obtain decidability results for stochastic timed systems with one clock, and hybrid automata, respectively; intuitively, the value of the clock is “reinitialised” (either explicitly, through a reset to 0, or implicitly, through the passage of time) to a known, natural value between non-constant dependencies of probability on the value of the clock. Note that the clock-dependent probabilistic timed automaton of Fig. 1 satisfies this assumption (although clock \(x\) is not reset on the edge to location \(\mathrm {F}\), it must take values 3 and 4 before location \(\mathrm {F}\) can be left). We show that, for such clock-dependent probabilistic timed automata, quantitative reachability problems can be solved in polynomial time. 
Similarly, we can also solve in polynomial time qualitative reachability problems, which ask whether there exists a scheduler of nondeterminism such that a set of target locations can be reached with probability 1 (or 0), or whether all schedulers of nondeterminism result in the target locations being reached with probability 1 (or 0).

These results rely on the construction of an interval Markov decision process from the one-clock clock-dependent probabilistic timed automaton. Interval Markov decision processes have been well-studied in the verification context (for example, in [17, 20, 30]), and also in other contexts, such as planning [15] and control [27, 36]. They comprise a finite state space where transitions between states are achieved in the following manner: for each state, there is a nondeterministic choice between a set of actions, where each action is associated with a decoration of the set of edges from the state with intervals in [0, 1]; then the exact probabilities associated with each outgoing edge are chosen nondeterministically from the intervals associated with the action chosen in the first step; finally, a probabilistic choice is made over the edges according to the probabilities chosen in the second step, thus determining the next state. In contrast to the standard formulation of interval Markov decision processes, we allow edges corresponding to probabilistic choices to be labelled not only with closed intervals, but also with open and half-open intervals. While (half-)open intervals have been considered previously in the context of interval Markov chains in [9, 33], we are unaware of any work considering them in the context of interval Markov decision processes. The presence of open intervals is vital to obtain a precise representation of the one-clock clock-dependent probabilistic timed automaton.

We proceed by giving some preliminary concepts in Sect. 2: this includes a reduction from interval Markov decision processes to interval Markov chains [22, 24, 31] with the standard Markov decision process-based semantics, which may be of independent interest. The reduction takes open and half-open intervals into account; while [9] has shown that open interval Markov chains can be reduced to closed interval Markov chains for the purposes of quantitative properties, [33] shows that the open/closed distinction is critical for the evaluation of qualitative properties. In Sect. 3, we present the definition of one-clock clock-dependent probabilistic timed automata, and present the transformation to interval Markov decision processes in Sect. 4. Proofs of the results can be found in [34].

2 Interval Markov Decision Processes

Preliminaries. We use \(\mathbb {R}_{\ge 0}\) to denote the set of non-negative real numbers, \(\mathbb {Q}\) to denote the set of rational numbers, and \(\mathbb {N}\) to denote the set of natural numbers. A (discrete) probability distribution over a countable set Q is a function \({\mu : Q \rightarrow [0,1]}\) such that \(\sum _{q \in Q} \mu (q) = 1\). Let \(\mathsf {Dist}(Q)\) be the set of distributions over Q. For a (possibly uncountable) set Q and a function \(\mu : Q \rightarrow [0,1]\), we define \(\mathsf {support}(\mu ) = \{{q \in Q \mid \mu (q) > 0}\}\). Then, for an uncountable set Q, we define \(\mathsf {Dist}(Q)\) to be the set of functions \({\mu : Q \rightarrow [0,1]}\) such that \(\mathsf {support}(\mu )\) is a countable set and \(\mu \) restricted to \(\mathsf {support}(\mu )\) is a distribution. Given a binary function \(f: Q \times Q \rightarrow [0,1]\) and element \(q \in Q\), we denote by \(f(q,\cdot ): Q \rightarrow [0,1]\) the unary function such that \(f(q,\cdot )(q')=f(q,q')\) for each \(q' \in Q\).

A Markov chain (MC) \(\mathcal {C}\) is a pair \((S,\mathbf {P})\) where \(S\) is a set of states and \(\mathbf {P}: S\times S\rightarrow [0,1]\) is a transition probability function, such that \(\mathbf {P}(s,\cdot ) \in \mathsf {Dist}(S)\) for each state \(s\in S\). A path of MC \(\mathcal {C}\) is a sequence \(s_0 s_1 \cdots \) of states such that \(\mathbf {P}(s_i,s_{i+1})>0\) for all \(i \ge 0\). Given a path \(\mathbf {r}= s_0 s_1 \cdots \) and \(i \ge 0\), we let \(\mathbf {r}(i) = s_i\) be the \((i+1)\)-th state along \(\mathbf {r}\). The set of paths of \(\mathcal {C}\) starting in state \(s\in S\) is denoted by \( Paths ^{\mathcal {C}}(s)\). In the standard manner (see, for example, [3, 14]), given a state \(s\in S\), we can define a probability measure \(\mathrm {Pr}^{\mathcal {C}}_{s}\) over \( Paths ^{\mathcal {C}}(s)\).
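As a concrete illustration, the probability measure \(\mathrm {Pr}^{\mathcal {C}}_{s}\) assigns to the cylinder set of a finite path the product of the transition probabilities along it. The following Python sketch (the chain and its numbers are assumptions for the example, not taken from the paper) represents an MC as a transition-probability table:

```python
# A minimal sketch of a finite Markov chain as a transition-probability
# table; the states and numbers are illustrative assumptions.

def check_mc(P):
    """Check that P(s, .) is a distribution for every state s."""
    for s, row in P.items():
        assert abs(sum(row.values()) - 1.0) < 1e-9, f"row of {s} does not sum to 1"

def cylinder_probability(P, path):
    """Probability mass of following the finite path s_0 s_1 ... s_n,
    i.e., the measure of the cylinder set of that prefix."""
    prob = 1.0
    for s, t in zip(path, path[1:]):
        prob *= P[s].get(t, 0.0)
    return prob

P = {
    "s0": {"s0": 0.5, "s1": 0.5},
    "s1": {"s1": 1.0},
}
check_mc(P)
print(cylinder_probability(P, ["s0", "s0", "s1"]))  # 0.5 * 0.5 = 0.25
```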

A Markov decision process (MDP) \(\mathcal {M}= ( S, A, \varDelta )\) comprises a set \(S\) of states, a set \(A\) of actions, and a probabilistic transition function \(\varDelta : S\times A\rightarrow \mathsf {Dist}(S) \cup \{{ \bot }\}\). The symbol \(\bot \) is used to represent the unavailability of an action in a state, i.e., \(\varDelta (s,a) = \bot \) signifies that action \(a\in A\) is not available in state \(s\in S\). For each state \(s\in S\), let \(A({s}) = \{{ a\in A\mid \varDelta (s,a) \ne \bot }\}\), and assume that \(A({s}) \ne \emptyset \), i.e., there is at least one available action in each state. Transitions from state to state of an MDP are performed in two steps: if the current state is \(s\), the first step concerns a nondeterministic selection of an action \(a\in A({s})\); the second step comprises a probabilistic choice, made according to the distribution \(\varDelta (s,a)\), as to which state to make the transition to (that is, a transition to a state \(s' \in S\) is made with probability \(\varDelta (s,a)(s')\)). In general, the sets of states and actions can be uncountable. We say that an MDP is finite if \(S\) and \(A\) are finite sets.

A(n infinite) path of an MDP \(\mathcal {M}\) is a sequence \(s_0 a_0 s_1 a_1 \cdots \) such that \(a_i \in A({s_i})\) and \(\varDelta (s_i,a_i)(s_{i+1})>0\) for all \(i \ge 0\). Given an infinite path \(\mathbf {r}= s_0 a_0 s_1 a_1 \cdots \) and \(i \ge 0\), we let \(\mathbf {r}(i) = s_i\) be the \((i+1)\)-th state along \(\mathbf {r}\). Let \( Paths ^{\mathcal {M}}\) be the set of infinite paths of \(\mathcal {M}\). A finite path is a sequence \(r= s_0 a_0 s_1 a_1 \cdots a_{n-1} s_n\) such that \(a_i \in A({s_i})\) and \(\varDelta (s_i,a_i)(s_{i+1})>0\) for all \(0 \le i < n\). Let \( last ({r}) = s_n\) denote the final state of \(r\). Given a finite path \(r\), an action \(a\in A({ last ({r})})\) and a state \(s\in S\), we use \(r\,a\,s\) to denote the finite path comprising \(r\) followed by \(a\,s\). Let \( Paths _{ fin }^{\mathcal {M}}\) be the set of finite paths of the MDP \(\mathcal {M}\). Let \( Paths ^{\mathcal {M}}(s)\) and \( Paths _{ fin }^{\mathcal {M}}(s)\) be the sets of infinite paths and finite paths, respectively, of \(\mathcal {M}\) starting in state \(s\).

A scheduler of the MDP \(\mathcal {M}\) is a function \(\sigma : Paths _{ fin }^{\mathcal {M}} \rightarrow A\) such that \(\sigma (r) \in A({ last ({r})})\) for all \(r\in Paths _{ fin }^{\mathcal {M}}\). Let \(\varSigma ^{\mathcal {M}}\) be the set of schedulers of the MDP \(\mathcal {M}\). We say that an infinite path \(s_0 a_0 s_1 a_1 \cdots \) is generated by \(\sigma \) if \(a_i = \sigma (s_0 a_0 \cdots a_{i-1} s_i)\) for all \(i \in \mathbb {N}\). Let \( Paths ^{\sigma }\) be the set of infinite paths generated by \(\sigma \). The set \( Paths _{ fin }^{\sigma }\) of finite paths generated by \(\sigma \) is defined similarly. Let \( Paths ^{\sigma }(s) = Paths ^{\sigma } \cap Paths ^{\mathcal {M}}(s)\) and \( Paths _{ fin }^{\sigma }(s) = Paths _{ fin }^{\sigma } \cap Paths _{ fin }^{\mathcal {M}}(s)\). Given a scheduler \(\sigma \in \varSigma ^{\mathcal {M}}\), we can define a countably infinite-state MC \(\mathcal {C}^{\sigma }\) that corresponds to the behaviour of the scheduler \(\sigma \): we let \(\mathcal {C}^{\sigma } = ( Paths _{ fin }^{\sigma },\mathbf {P})\), where, for \(r,r' \in Paths _{ fin }^{\sigma }\), we have \(\mathbf {P}(r,r') = \varDelta ( last ({r}),\sigma (r))(s)\) if \(r' = r\,\sigma (r)\,s\) for some \(s\in S\), and \(\mathbf {P}(r,r')=0\) otherwise. For a finite path \(r= s_0 a_0 \cdots a_{n-1} s_n\), we denote the \((i+1)\)-th prefix of \(r\) by \(r^{(i)}\), i.e., \(r^{(i)} = s_0 a_0 \cdots a_{i-1} s_i\), for \(i \le n\). Then, given \(r= s_0 a_0 \cdots a_{n-1} s_n \in Paths _{ fin }^{\sigma }(s)\), we let \(p^{\sigma }(r) = \prod _{0 \le i < n} \mathbf {P}(r^{(i)},r^{(i+1)})\). Let \( Cyl ({r})\) be the set of infinite paths starting in \(s\) that have the finite path \(r\) as a prefix. Then we let \(\mathrm {Pr}^{\sigma }_{s}\) be the unique probability measure over \( Paths ^{\sigma }(s)\) such that \(\mathrm {Pr}^{\sigma }_{s}( Cyl ({r})) = p^{\sigma }(r)\) (for more details, see [3, 14]).

Given a set \(T\subseteq S\), we define \(\Diamond T= \{ \mathbf {r}\in Paths ^{\mathcal {M}} \mid \exists i \in \mathbb {N}\, . \,\mathbf {r}(i) \in T\}\) as the set of infinite runs of \(\mathcal {M}\) such that some state of \(T\) is visited along the run. Let \(s\in S\). We define the maximum probability of reaching \(T\) from \(s\) as \(\mathbb {P}^\mathrm {max}_{{\mathcal {M}},{s}}(\Diamond {T}) = \sup _{\sigma \in \varSigma ^{\mathcal {M}}}~\mathrm {Pr}^{\sigma }_{s}(\Diamond T)\). Similarly, the minimum probability of reaching \(T\) from \(s\) is defined as \(\mathbb {P}^\mathrm {min}_{{\mathcal {M}},{s}}(\Diamond {T}) = \inf _{\sigma \in \varSigma ^{\mathcal {M}}}~\mathrm {Pr}^{\sigma }_{s}(\Diamond T)\). The maximal reachability problem for \(\mathcal {M}\), \(T\subseteq S\), \(s\,\in \,S\), \(\unrhd \,\in \{ \ge , > \}\) and \(\lambda \in [0,1]\) is to decide whether \(\mathbb {P}^\mathrm {max}_{{\mathcal {M}},{s}}(\Diamond {T}) \unrhd \,\lambda \). Similarly, the minimal reachability problem for \(\mathcal {M}\), \(T\subseteq S\), \(s\in S\), \(\unlhd \,\in \{ \le , < \}\) and \(\lambda \in [0,1]\) is to decide whether \(\mathbb {P}^\mathrm {min}_{{\mathcal {M}},{s}}(\Diamond {T}) \unlhd \,\lambda \). The maximal and minimal reachability problems are called quantitative problems. We also consider the following qualitative problems: (\(\forall 0\)) decide whether \(\mathrm {Pr}^{\sigma }_{s}(\Diamond T)=0\) for all \(\sigma \in \varSigma ^{\mathcal {M}}\); (\(\exists 0\)) decide whether there exists \(\sigma \in \varSigma ^{\mathcal {M}}\) such that \(\mathrm {Pr}^{\sigma }_{s}(\Diamond T)=0\); (\(\exists 1\)) decide whether there exists \(\sigma \in \varSigma ^{\mathcal {M}}\) such that \(\mathrm {Pr}^{\sigma }_{s}(\Diamond T)=1\); (\(\forall 1\)) decide whether \(\mathrm {Pr}^{\sigma }_{s}(\Diamond T)=1\) for all \(\sigma \in \varSigma ^{\mathcal {M}}\).
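For a finite MDP, \(\mathbb {P}^\mathrm {max}_{{\mathcal {M}},{s}}(\Diamond {T})\) can be approximated by standard value iteration (see, for example, [3, 14]). The sketch below is a minimal illustration under assumed state and action names; replacing `max` by `min` yields the minimum reachability probability:

```python
def max_reach(delta, T, iters=100):
    """Value iteration approximating P^max(Diamond T) on a finite MDP.
    delta[s][a] maps successor states to probabilities; actions absent
    from delta[s] are unavailable (the symbol bot of the definition)."""
    v = {s: 1.0 if s in T else 0.0 for s in delta}
    for _ in range(iters):
        v = {s: 1.0 if s in T else
             max(sum(p * v[t] for t, p in delta[s][a].items())
                 for a in delta[s])
             for s in delta}
    return v

# Illustrative MDP (assumed for the example): from s0, action "try"
# reaches the goal with probability 0.5, action "quit" gives up.
delta = {
    "s0":   {"try": {"goal": 0.5, "sink": 0.5}, "quit": {"sink": 1.0}},
    "goal": {"loop": {"goal": 1.0}},
    "sink": {"loop": {"sink": 1.0}},
}
print(max_reach(delta, {"goal"})["s0"])  # 0.5
```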

Interval Markov Chains. We let \(\mathcal {I}\) denote the set of (open, half-open or closed) intervals that are subsets of [0, 1] and that have rational-numbered endpoints. Given an interval \(I\in \mathcal {I}\), we let \(\mathsf {left}({I})\) (respectively, \(\mathsf {right}({I})\)) be the left (respectively, right) endpoint of \(I\). Note that \(\mathsf {left}({I}),\mathsf {right}({I}) \in [0,1] \cap \mathbb {Q}\).

An interval distribution over a finite set Q is a function \(\mathfrak {d}: Q \rightarrow \mathcal {I}\) such that (1) \(\sum _{q \in Q} \mathsf {left}({\mathfrak {d}(q)}) \le 1 \le \sum _{q \in Q} \mathsf {right}({\mathfrak {d}(q)})\), (2a) \(\sum _{q \in Q} \mathsf {left}({\mathfrak {d}(q)}) = 1\) implies that \(\mathfrak {d}(q)\) is left-closed for all \(q \in Q\), and (2b) \(\sum _{q \in Q} \mathsf {right}({\mathfrak {d}(q)}) = 1\) implies that \(\mathfrak {d}(q)\) is right-closed for all \(q \in Q\). We define \(\mathfrak {Dist}(Q)\) as the set of interval distributions over Q. An assignment for interval distribution \(\mathfrak {d}\) is a distribution \(\alpha \in \mathsf {Dist}(Q)\) such that \(\alpha (q) \in \mathfrak {d}(q)\) for each \(q \in Q\). Note that conditions (1), (2a) and (2b) in the definition of interval distributions guarantee that there exists at least one assignment for each interval distribution. Let \(\mathfrak {G}({\mathfrak {d}})\) be the set of assignments for \(\mathfrak {d}\).
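Conditions (1), (2a) and (2b) can be checked directly. The following sketch (the interval encoding as 4-tuples is an assumption made for illustration) decides whether an interval distribution admits an assignment:

```python
from fractions import Fraction

def has_assignment(d):
    """Check conditions (1), (2a), (2b): they hold iff the interval
    distribution d admits at least one assignment. Each interval is a
    tuple (left, right, left_closed, right_closed)."""
    lo = sum(left for left, _, _, _ in d.values())
    hi = sum(right for _, right, _, _ in d.values())
    if not (lo <= 1 <= hi):
        return False          # condition (1) fails
    if lo == 1 and not all(lc for _, _, lc, _ in d.values()):
        return False          # condition (2a) fails
    if hi == 1 and not all(rc for _, _, _, rc in d.values()):
        return False          # condition (2b) fails
    return True

half = Fraction(1, 2)
print(has_assignment({"q0": (half, 1, False, True), "q1": (0, half, True, True)}))
# (1/2, 1] and [0, 1/2]: left endpoints sum to 1/2, right to 3/2 -> True
print(has_assignment({"q0": (half, 1, False, True), "q1": (half, 1, False, True)}))
# left endpoints sum to exactly 1 but both intervals are left-open -> False
```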

An (open) interval Markov chain (IMC) \(\mathfrak {C}\) is a pair \((S,\mathfrak {P})\), where \(S\) is a finite set of states, and \(\mathfrak {P}: S\times S\rightarrow \mathcal {I}\) is an interval-based transition function such that \(\mathfrak {P}(s,\cdot )\) is an interval distribution for each \(s\in S\) (formally, \(\mathfrak {P}(s,\cdot ) \in \mathfrak {Dist}(S)\)). An IMC makes a transition from a state \(s\in S\) in two steps: first an assignment \(\alpha \) is chosen from the set \(\mathfrak {G}({\mathfrak {P}(s,\cdot )})\) of assignments for \(\mathfrak {P}(s,\cdot )\), then a probabilistic choice over target states is made according to \(\alpha \). The semantics of an IMC corresponds to an MDP that has the same state space as the IMC, and for which each state is associated with a set of distributions, the precise transition probabilities of which are chosen from the interval distribution of the state. Formally, the semantics of an IMC \(\mathfrak {C}= (S,\mathfrak {P})\) is the MDP \((S,\mathfrak {G}({\mathfrak {P}}),\varDelta )\), where \(\mathfrak {G}({\mathfrak {P}}) = \bigcup _{s\in S} \mathfrak {G}({\mathfrak {P}(s,\cdot )})\), and for which \(\varDelta (s,\alpha ) = \alpha \) for all states \(s\in S\) and assignments \(\alpha \in \mathfrak {G}({\mathfrak {P}(s,\cdot )})\) for \(\mathfrak {P}(s,\cdot )\). In previous literature (for example, [10, 11, 31]), this semantics is called the “IMDP semantics”.

Computing maximum and minimum reachability probabilities can be done for an IMC \(\mathfrak {C}\) simply by transforming the IMC by closing all of its (half-)open intervals, then employing a standard maximum/minimum reachability probability computation on the new, “closed” IMC (for example, the algorithms of [11, 31]): the correctness of this approach is shown in [9]. Algorithms for qualitative problems of IMCs (with open, half-open and closed intervals) are given in [33]. All of the aforementioned algorithms run in polynomial time in the size of the IMC, which is obtained as the sum over all states \(s,s' \in S\) of the sizes of the binary representations of the endpoints of \(\mathfrak {P}(s,s')\), where rational numbers are encoded as quotients of integers written in binary.
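The interval-closing transformation used for the quantitative case is straightforward; a sketch (with the same assumed 4-tuple interval encoding as above) is:

```python
def close_intervals(P):
    """Close every (half-)open interval of an IMC's transition function;
    per [9] this preserves maximum and minimum reachability probabilities
    (the open/closed distinction matters again for qualitative properties).
    Intervals are tuples (left, right, left_closed, right_closed)."""
    return {s: {t: (lo, hi, True, True) for t, (lo, hi, _, _) in row.items()}
            for s, row in P.items()}

imc = {"s": {"t": (0.0, 0.5, False, True), "u": (0.5, 1.0, True, False)}}
print(close_intervals(imc))
# {'s': {'t': (0.0, 0.5, True, True), 'u': (0.5, 1.0, True, True)}}
```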

Interval Markov Decision Processes. An (open) interval Markov decision process (IMDP) \(\mathfrak {M}= (S,\mathfrak {A},\mathfrak {D})\) comprises a finite set \(S\) of states, a finite set \(\mathfrak {A}\) of actions, and an interval-based transition function \(\mathfrak {D}: S\times \mathfrak {A}\rightarrow \mathfrak {Dist}(S) \cup \{{ \bot }\}\). Let \(\mathfrak {A}({s}) = \{{ \mathfrak {a}\in \mathfrak {A}\mid \mathfrak {D}(s,\mathfrak {a}) \ne \bot }\}\), and assume that \(\mathfrak {A}({s}) \ne \emptyset \) for each state \(s\in S\). In contrast to IMCs, an IMDP makes a transition from a state \(s\in S\) in three steps: (1) an action \(\mathfrak {a}\in \mathfrak {A}({s})\) is chosen, then (2) an assignment \(\alpha \) for \(\mathfrak {D}(s,\mathfrak {a})\) is chosen, and finally (3) a probabilistic choice over target states is performed according to \(\alpha \). Formally, the semantics of an IMDP \(\mathfrak {M}= (S,\mathfrak {A},\mathfrak {D})\) is the MDP \((S,A,\varDelta )\), where \(A= \bigcup _{s\in S} A({s})\) with \(A({s}) = \{{ (\mathfrak {a},\alpha ) \in \mathfrak {A}\times \mathsf {Dist}(S) \mid \mathfrak {a}\in \mathfrak {A}({s}) \hbox { and } \alpha \in \mathfrak {G}({\mathfrak {D}(s,\mathfrak {a})}) }\}\) for each state \(s\in S\), and \(\varDelta (s,(\mathfrak {a},\alpha )) = \alpha \) for each state \(s\in S\) and action/assignment pair \((\mathfrak {a},\alpha ) \in A({s})\). Note that (as in, for example, [17, 20, 30]) we adopt a cooperative resolution of nondeterminism for IMDPs, in which the choice of action and assignment (steps (1) and (2) above) is combined into a single nondeterministic choice in the semantic MDP.

Given the cooperative nondeterminism for IMDPs, we can show that, given an IMDP, an IMC can be constructed such that the maximal and minimal reachability probabilities for the IMDP and IMC coincide, and furthermore qualitative properties agree on the IMDP and the IMC. Formally, given the IMDP \(\mathfrak {M}= (S,\mathfrak {A},\mathfrak {D})\), we construct an IMC \(\mathfrak {C}[{\mathfrak {M}}] = (\tilde{S},\tilde{\mathfrak {P}})\) in the following way:

  • the set of states is defined as \(\tilde{S}=S\cup (S\otimes \mathfrak {A})\), where \(S\otimes \mathfrak {A}= \bigcup _{s\in S} \{{ (s,\mathfrak {a}) \in S\times \mathfrak {A}\mid \mathfrak {a}\in \mathfrak {A}({s}) }\}\);

  • for \(s\in S\) and \(\mathfrak {a}\in \mathfrak {A}({s})\), let \(\tilde{\mathfrak {P}}(s,(s,\mathfrak {a})) = [0,1]\), and let \(\tilde{\mathfrak {P}}((s,\mathfrak {a}),\cdot ) = \mathfrak {D}(s,\mathfrak {a})\).
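The two items above can be sketched directly in code. The following is a minimal illustration (the IMDP and its intervals are assumptions for the example), building the transition function of \(\mathfrak {C}[{\mathfrak {M}}]\) with one intermediate state per state/action pair:

```python
def imdp_to_imc(D):
    """Sketch of the construction of the IMC C[M] from an IMDP (S, A, D):
    one intermediate state (s, a) per available action, reached from s
    with interval [0, 1]; from (s, a) the original interval distribution
    D[s][a] applies. Intervals are tuples (left, right, lclosed, rclosed)."""
    P = {}
    for s in D:
        P[s] = {(s, a): (0.0, 1.0, True, True) for a in D[s]}
        for a, dist in D[s].items():
            P[(s, a)] = dict(dist)
    return P

D = {"s": {"a": {"s": (0.0, 0.5, True, True), "t": (0.5, 1.0, True, True)}},
     "t": {"b": {"t": (1.0, 1.0, True, True)}}}
P = imdp_to_imc(D)
print(len(P))  # 4 states: s, t, plus intermediate states (s, 'a'), (t, 'b')
```

Note that the interval distribution at each original state \(s\) is well-formed: the left endpoints sum to 0 and the right endpoints to \(|\mathfrak {A}({s})| \ge 1\), so condition (1) holds.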

The following proposition states the correctness of this construction with respect to quantitative and qualitative problems.

Proposition 1

Let \(\mathfrak {M}= (S,\mathfrak {A},\mathfrak {D})\) be an IMDP, and let \(s\in S\), \(T\subseteq S\) and \(\lambda \in \{{ 0,1 }\}\). Then:

  • \(\mathbb {P}^\mathrm {max}_{{\mathfrak {M}},{s}}(\Diamond {T}) = \mathbb {P}^\mathrm {max}_{{\mathfrak {C}[{\mathfrak {M}}]},{s}}(\Diamond {T})\) and \(\mathbb {P}^\mathrm {min}_{{\mathfrak {M}},{s}}(\Diamond {T}) = \mathbb {P}^\mathrm {min}_{{\mathfrak {C}[{\mathfrak {M}}]},{s}}(\Diamond {T})\);

  • there exists \(\sigma \in \varSigma ^{\mathfrak {M}}\) such that \(\mathrm {Pr}^{\sigma }_{s}(\Diamond T)=\lambda \) if and only if there exists \(\sigma ' \in \varSigma ^{\mathfrak {C}[{\mathfrak {M}}]}\) such that \(\mathrm {Pr}^{\sigma '}_{s}(\Diamond T)=\lambda \);

  • \(\mathrm {Pr}^{\sigma }_{s}(\Diamond T)=\lambda \) for all \(\sigma \in \varSigma ^{\mathfrak {M}}\) if and only if \(\mathrm {Pr}^{\sigma '}_{s}(\Diamond T)=\lambda \) for all \(\sigma ' \in \varSigma ^{\mathfrak {C}[{\mathfrak {M}}]}\).

3 Clock-Dependent Probabilistic Timed Automata with One Clock

In this section, we introduce the formalism of clock-dependent probabilistic timed automata. The definition of clock-dependent probabilistic timed automata of [32] features an arbitrary number of clock variables. In contrast, we consider models with only one clock variable, which will be denoted \(x\) for the remainder of the paper.

A clock valuation is a value \(v\in \mathbb {R}_{\ge 0}\), interpreted as the current value of clock \(x\). Following the usual notational conventions for modelling formalisms based on timed automata, for clock valuation \(v\in \mathbb {R}_{\ge 0}\) and \(X\in \{{\{{x}\},\emptyset }\}\), we write \(v[X\,{:=}\,0]\) to denote the clock valuation in which clocks in \(X\) are reset to 0; in the one-clock setting, we have \(v[\{{x}\}\,{:=}\,0] = 0\) and \(v[\emptyset \,{:=}\,0] = v\). In the following, we write \(2^{\{{x}\}}\) rather than \(\{{\{{x}\},\emptyset }\}\).

The set \( \varPsi \) of clock constraints over \(x\) is defined as the set of conjunctions over atomic formulae of the form \(x\sim c\), where \(\sim \in \{ <,\le ,\ge ,> \}\) and \(c \in \mathbb {N}\). A clock valuation \(v\) satisfies a clock constraint \(\psi \), denoted by \(v\,\models \,\psi \), if \(\psi \) resolves to \(\mathtt{true}\) when substituting each occurrence of clock \(x\) with \(v\).
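Satisfaction of a clock constraint by a valuation amounts to evaluating each atomic conjunct. A minimal sketch (the encoding of constraints as lists of pairs is an assumption for illustration):

```python
import operator

# Atomic formulae x ~ c with ~ in {<, <=, >=, >}.
OPS = {"<": operator.lt, "<=": operator.le, ">=": operator.ge, ">": operator.gt}

def satisfies(v, psi):
    """v |= psi, where psi is a conjunction of atomic constraints x ~ c
    given as (op, c) pairs; every conjunct must evaluate to true."""
    return all(OPS[op](v, c) for op, c in psi)

guard = [(">=", 1), ("<=", 3)]   # the guard 1 <= x <= 3 of Fig. 1
print(satisfies(2.5, guard), satisfies(3.5, guard))  # True False
```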

For a set Q, a distribution template \(\wp : \mathbb {R}_{\ge 0}\rightarrow \mathsf {Dist}(Q)\) gives a distribution over Q for each clock valuation. In the following, we use notation \({\wp }[{v}]\), rather than \(\wp (v)\), to denote the distribution corresponding to distribution template \(\wp \) and clock valuation \(v\). Let \(\mathsf {Temp}(Q)\) be the set of distribution templates over Q.

A one-clock clock-dependent probabilistic timed automaton (1c-cdPTA) \(\mathcal {P}= (L, inv , prob )\) comprises the following components:

  • a finite set \(L\) of locations;

  • a function \( inv : L\rightarrow \varPsi \) associating an invariant condition with each location;

  • a set \( prob \subseteq L\times \varPsi \times \mathsf {Temp}(2^{\{{x}\}} \times L)\) of probabilistic edges.

A probabilistic edge \((l,g,\wp ) \in prob \) comprises: (1) a source location \(l\); (2) a clock constraint \(g\), called a guard; and (3) a distribution template \(\wp \) with respect to pairs of the form \((X,l') \in 2^{\{{x}\}} \times L\) (i.e., pairs consisting of a first element indicating whether \(x\) should be reset to 0 or not, and a second element corresponding to a target location \(l'\)). We refer to pairs \((X,l') \in 2^{\{{x}\}} \times L\) as outcomes.

The behaviour of a 1c-cdPTA takes a similar form to that of a standard (one-clock) probabilistic timed automaton [16, 23, 25]: in any location time can advance as long as the invariant condition holds, and the choice as to how much time elapses is made nondeterministically; a probabilistic edge can be taken if its guard is satisfied by the current value of the clock, and the choice as to which probabilistic edge to take is made nondeterministically; for a taken probabilistic edge, the choice of whether to reset the clock and which target location to make the transition to is probabilistic. In comparison with one-clock probabilistic timed automata, the key novelty of 1c-cdPTAs is that the distribution used to make this probabilistic choice depends on the probabilistic edge taken and on the current clock valuation.

A state of a 1c-cdPTA is a pair comprising a location and a clock valuation satisfying the location’s invariant condition, i.e., \((l,v) \in L\times \mathbb {R}_{\ge 0}\) such that \(v\,\models \, inv (l)\). In any state \((l,v)\), a certain amount of time \(t\in \mathbb {R}_{\ge 0}\) elapses, then a probabilistic edge is traversed. The choice of \(t\) requires that the invariant \( inv (l)\) remains continuously satisfied while time passes. The resulting state after the elapse of time is \((l,v+t)\). A probabilistic edge \((l',g,\wp ) \in prob \) can then be chosen from \((l,v+t)\) if \(l= l'\) and it is enabled, i.e., the clock constraint \(g\) is satisfied by \(v+t\). Once a probabilistic edge \((l,g,\wp )\) is chosen, a successor location, and whether to reset the clock to 0, is chosen at random, according to the distribution \({\wp }[{v+t}]\).

We make the following assumptions on 1c-cdPTAs, in order to simplify the definition of their semantics. Firstly, we consider 1c-cdPTAs featuring invariant conditions that prevent the clock from exceeding some bound, and impose no lower bound: formally, for each location \(l\in L\), we have that \( inv (l)\) is a constraint \(x\le c\) for some \(c \in \mathbb {N}\), or a constraint \(x< c\) for some \(c \in \mathbb {N}\setminus \{{0}\}\). Secondly, we restrict our attention to 1c-cdPTAs for which it is always possible to take a probabilistic edge, either immediately or after letting time elapse. Formally, for each location \(l\in L\), if \( inv (l) = (x\le c)\) then (viewing c as a clock valuation) \(c\,\models \,g\) for some \((l,g,\wp ) \in prob \); instead, if \( inv (l) = (x< c)\) then \(c-\varepsilon \,\models \,g\) for all \(\varepsilon \in (0,1)\) and \((l,g,\wp ) \in prob \). Thirdly, we assume that all possible target states of probabilistic edges satisfy their invariants. Observe that, given the first assumption, a target state can fail to satisfy its invariant only when the clock is not reset. Formally, for all probabilistic edges \((l,g,\wp ) \in prob \), for all clock valuations \(v\in \mathbb {R}_{\ge 0}\) such that \(v\,\models \,g\), and for all \(l' \in L\), we have that \({\wp }[{v}](\emptyset ,l')>0\) implies \(v[\emptyset :=0]\,\models \, inv (l')\), i.e., \(v\,\models \, inv (l')\). Note that we relax some of these assumptions when depicting 1c-cdPTAs graphically (for example, the 1c-cdPTA of Fig. 1 can be made to satisfy these assumptions by adding invariant conditions and self-looping probabilistic edges to locations \(\mathrm {S}\) and \(\mathrm {T}\)).

The semantics of the 1c-cdPTA \(\mathcal {P}= (L, inv , prob )\) is the MDP \((S,A,\varDelta )\) where:

  • \(S= \{ (l,v) \in L\times \mathbb {R}_{\ge 0}\mid v\,\models \, inv (l) \}\);

  • \(A= \mathbb {R}_{\ge 0}\times prob \);

  • for \((l,v) \in S\), \(\tilde{v} \in \mathbb {R}_{\ge 0}\) and \((l,g,\wp ) \in prob \) such that (1) \(\tilde{v} \ge v\), (2) \(\tilde{v}\,\models \,g\) and (3) \(w\,\models \, inv (l)\) for all \(v\le w\le \tilde{v}\), we let \(\varDelta ((l,v),(\tilde{v},(l,g,\wp )))\) be the distribution such that, for \((l',v') \in S\):

    $$\begin{aligned} \varDelta ((l,v),(\tilde{v},(l,g,\wp )))(l',v') = \left\{ \begin{array}{ll} {\wp }[{\tilde{v}}](\{{x}\},l') + {\wp }[{\tilde{v}}](\emptyset ,l') &{} \hbox {if } v' = \tilde{v} = 0 \\ {\wp }[{\tilde{v}}](\emptyset ,l') &{} \hbox {if } v' = \tilde{v}> 0 \\ {\wp }[{\tilde{v}}](\{{x}\},l') &{} \hbox {if } v' = 0 \hbox { and } \tilde{v} > 0 \\ 0 &{} \hbox {otherwise.} \end{array} \right. \end{aligned}$$
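The four cases above simply merge the probability mass of reset and non-reset outcomes that lead to the same state. A Python sketch of this folding (the outcome probabilities used in the example are assumptions, not taken from Fig. 1):

```python
def successor_distribution(wp_at_v, v_tilde):
    """Fold a distribution over outcomes (X, l') into a distribution over
    states (l', v'), merging reset and non-reset mass exactly as in the
    four-case definition of Delta (when v_tilde = 0 the first case arises
    automatically). wp_at_v maps (reset?, location) to a probability."""
    dist = {}
    for (reset, loc), p in wp_at_v.items():
        if p == 0:
            continue
        v_prime = 0.0 if reset else v_tilde
        dist[(loc, v_prime)] = dist.get((loc, v_prime), 0.0) + p
    return dist

# Illustrative outcome distribution at clock value 4.5 (assumed numbers):
wp = {(True, "W"): 0.75, (False, "T"): 0.25}
print(successor_distribution(wp, 4.5))  # {('W', 0.0): 0.75, ('T', 4.5): 0.25}
```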

Let \(F\subseteq L\) be a set of locations, and let \(T_{F} = \{{ (l,v) \in S\mid l\in F}\}\) be the set of states of the semantic MDP that have their location component in \(F\). Then the maximum probability of reaching \(F\) from state \((l,v) \in S\) corresponds to \(\mathbb {P}^\mathrm {max}_{{\mathcal {P}},{(l,v)}}(\Diamond {T_{F}})\). Similarly, the minimum probability of reaching \(F\) from state \((l,v) \in S\) corresponds to \(\mathbb {P}^\mathrm {min}_{{\mathcal {P}},{(l,v)}}(\Diamond {T_{F}})\). As in Sect. 2, we can define a number of quantitative and qualitative reachability problems on 1c-cdPTAs, where the initial state is set as \((l,0)\) for a particular \(l\in L\). The maximal reachability problem for \(\mathcal {P}\), \(F\subseteq L\), \(l\in L\), \(\unrhd \,\in \{ \ge , > \}\) and \(\lambda \in [0,1]\) is to decide whether \(\mathbb {P}^\mathrm {max}_{{\mathcal {P}},{(l,0)}}(\Diamond {T_{F}}) \unrhd \,\lambda \); similarly, the minimal reachability problem for \(\mathcal {P}\), \(F\subseteq L\), \(l\in L\), \(\unlhd \,\in \{ \le , < \}\) and \(\lambda \in [0,1]\) is to decide whether \(\mathbb {P}^\mathrm {min}_{{\mathcal {P}},{(l,0)}}(\Diamond {T_{F}}) \unlhd \,\lambda \). Furthermore, we can define analogues of the qualitative problems featured in Sect. 2: (\(\forall 0\)) decide whether \(\mathrm {Pr}^{\sigma }_{(l,0)}(\Diamond T_{F})=0\) for all \(\sigma \in \varSigma ^{\mathcal {P}}\); (\(\exists 0\)) decide whether there exists \(\sigma \in \varSigma ^{\mathcal {P}}\) such that \(\mathrm {Pr}^{\sigma }_{(l,0)}(\Diamond T_{F})=0\); (\(\exists 1\)) decide whether there exists \(\sigma \in \varSigma ^{\mathcal {P}}\) such that \(\mathrm {Pr}^{\sigma }_{(l,0)}(\Diamond T_{F})=1\); (\(\forall 1\)) decide whether \(\mathrm {Pr}^{\sigma }_{(l,0)}(\Diamond T_{F})=1\) for all \(\sigma \in \varSigma ^{\mathcal {P}}\).

Affine Clock Dependencies. In this paper, we consider distribution templates that are defined in terms of sets of affine functions in the following way.

Given probabilistic edge \(p= (l,g,\wp ) \in prob \), let \(I^{p}\) be the set of clock valuations in which \(p\) is enabled, i.e., \(I^{p} = \{{v\in \mathbb {R}_{\ge 0}\mid v\models g\wedge inv (l)}\}\). Note that \(I^{p} \subseteq \mathbb {R}_{\ge 0}\) corresponds to an interval with natural-numbered endpoints. Let \(\overline{I^{p}}\) be the closure of \(I^{p}\). We say that \(p\) is affine if, for each \(e\in 2^{\{{x}\}} \times L\), there exists a pair \((c^{p}_{e}, d^{p}_{e}) \in \mathbb {Q}^2\) of rational constants, such that \({\wp }[{v}](e) = c^{p}_{e} + d^{p}_{e} \cdot v\) for all \(v\in \overline{I^{p}}\). Note that, by the definition of distribution templates, for all \(v\in \overline{I^{p}}\), we have \(c^{p}_{e} + d^{p}_{e} \cdot v\ge 0\) for each \(e\in 2^{\{{x}\}} \times L\), and \(\sum _{e\in 2^{\{{x}\}} \times L} (c^{p}_{e} + d^{p}_{e} \cdot v) = 1\). A 1c-cdPTA is affine if all of its probabilistic edges are affine. Henceforth we assume that the 1c-cdPTAs we consider are affine. An affine probabilistic edge \(p\) is constant if, for each \(e\in 2^{\{{x}\}} \times L\), we have \(d^{p}_{e} = 0\), i.e., \({\wp }[{v}](e) = c^{p}_{e}\) for some \(c^{p}_{e} \in \mathbb {Q}\), for all \(v\in \overline{I^{p}}\). Note that, for a probabilistic edge \(p\in prob \), outcome \(e\in 2^{\{{x}\}} \times L\) and open interval \(I\subseteq I^{p}\), if \(d^{p}_{e} \ne 0\), then \({\wp }[{v}](e) > 0\) for all \(v\in I\) (because the existence of \(v_{=0} \in I\) such that \({\wp }[{v_{=0}}](e) = 0\), together with \(d^{p}_{e} \ne 0\), would mean that there exists \(v' \in I\) such that \({\wp }[{v'}](e) < 0\), which contradicts the definition of distribution templates).
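An affine probabilistic edge can be represented by its coefficient pairs, and, by linearity, the conditions \({\wp }[{v}](e) \ge 0\) and \(\sum _{e} {\wp }[{v}](e) = 1\) hold on all of \(\overline{I^{p}}\) if and only if they hold at its two endpoints. The sketch below uses the affine dependency \(\frac{x+1}{4}\) from the introduction; the complementary coefficients for the second outcome are an assumption made so that the example sums to 1:

```python
from fractions import Fraction

def affine_template(coeffs):
    """Distribution template wp[v] with wp[v](e) = c_e + d_e * v;
    coeffs maps each outcome e to its pair (c_e, d_e)."""
    def wp(v):
        return {e: c + d * v for e, (c, d) in coeffs.items()}
    return wp

def is_valid_on(coeffs, left, right):
    """By linearity, the affine functions form a distribution on the whole
    closed interval [left, right] iff they are non-negative and sum to 1
    at the two endpoints."""
    for v in (left, right):
        dist = {e: c + d * v for e, (c, d) in coeffs.items()}
        if any(p < 0 for p in dist.values()) or sum(dist.values()) != 1:
            return False
    return True

coeffs = {
    ("no_reset", "S"): (Fraction(1, 4), Fraction(1, 4)),   # (x + 1)/4
    ("no_reset", "F"): (Fraction(3, 4), Fraction(-1, 4)),  # (3 - x)/4, assumed
}
wp = affine_template(coeffs)
print(wp(2)[("no_reset", "S")])  # 3/4
print(is_valid_on(coeffs, 1, 3))  # True
```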

Initialisation. In this paper, we also introduce a specific requirement for 1c-cdPTAs that allows us to analyse 1c-cdPTAs faithfully using IMDPs in Sect. 4. A symbolic path fragment is a sequence \((l_0,g_0,\wp _0) (X_0,l_1) (l_1,g_1,\wp _1)\)\( (X_1,l_2) \cdots (l_n,g_n,\wp _n) \in ( prob \times (2^{\{{x}\}} \times L))^+ \times prob \) of probabilistic edges and outcomes such that \({\wp _i}[{v}](X_i,l_{i+1}) > 0\) for all \(v\in I^{(l_i,g_i,\wp _i)}{}\) and all \(i<n\). In this paper, we consider 1c-cdPTAs for which each symbolic path fragment that begins and ends with a non-constant probabilistic edge requires that the clock takes a natural-numbered value, either from being reset or from passing through guards that have at most one (natural-numbered) value in common. Formally, a 1c-cdPTA is initialised if, for any symbolic path fragment \((l_0,g_0,\wp _0) (X_0,l_1) (l_1,g_1,\wp _1) (X_1,l_2) \cdots (l_n,g_n,\wp _n)\) such that \((l_0,g_0,\wp _0)\) and \((l_n,g_n,\wp _n)\) are non-constant, either (1) \(X_i = \{{x}\}\) or (2) \(I^{(l_i,g_i,\wp _i)}{} \cap I^{(l_{i+1},g_{i+1},\wp _{i+1})}{}\) is empty or contains a single valuation, for some \(0\le i < n\). We henceforth assume that all 1c-cdPTAs considered in this paper are initialised.

4 Translation from 1c-cdPTAs to IMDPs

In this section, we show that we can solve quantitative and qualitative problems of (affine and initialised) 1c-cdPTAs. In contrast to the approach for quantitative problems of multiple-clock cdPTAs presented in [32], which involves the construction of an approximate MDP, we represent the 1c-cdPTA precisely using an IMDP, by adapting the standard region-graph construction for one-clock (probabilistic) timed automata of [23, 26].

Let \(\mathcal {P}= (L, inv , prob )\) be a 1c-cdPTA. Let \( Cst ({\mathcal {P}})\) be the set of constants that are used in the guards of probabilistic edges and invariants of \(\mathcal {P}\), and let \(\mathbb {B}= Cst ({\mathcal {P}}) \cup \{{0}\}\). We write \(\mathbb {B}= \{{b_0, b_1, \ldots , b_k}\}\), where \(0 = b_0< b_1< \ldots < b_k\). The set \(\mathbb {B}\) defines the set \(\mathcal {I}_{\mathbb {B}} = \{{[b_0,b_0], (b_0,b_1), [b_1,b_1], \cdots , [b_k,b_k]}\}\) of point intervals and open intervals. We define a total order on \(\mathcal {I}_{\mathbb {B}}\) in the following way: \([b_0,b_0]< (b_0,b_1)< [b_1,b_1]< \cdots < [b_k,b_k]\). Given an open interval \(B=(b,b') \in \mathcal {I}_{\mathbb {B}}\), its closure is written as \(\overline{B}\), i.e., \(\overline{B}=[b,b']\). Furthermore, let \(\mathsf {lb}(B)=b\) and \(\mathsf {rb}(B)=b'\) refer to the left- and right-endpoints of \(B\). For a point interval \(B=[b,b] \in \mathcal {I}_{\mathbb {B}}\), we let \({\mathsf {lb}(B)=\mathsf {rb}(B)=b}\).
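The construction of \(\mathcal {I}_{\mathbb {B}}\) from the constant set can be sketched as follows (a hypothetical helper, not from the paper; intervals are encoded as `(lo, hi, closed)` triples, listed in the total order defined above):

```python
def interval_partition(constants):
    """Given the set of constants of a 1c-cdPTA, return the ordered list
    I_B = [b0,b0], (b0,b1), [b1,b1], ..., [bk,bk], alternating point
    intervals and open intervals; the constant 0 is always included."""
    bs = sorted(set(constants) | {0})
    out = []
    for i, b in enumerate(bs):
        out.append((b, b, True))               # point interval [b, b]
        if i + 1 < len(bs):
            out.append((b, bs[i + 1], False))  # open interval (b, b')
    return out
```

For the constants of Fig. 1, `interval_partition({1, 3, 4, 5})` yields the nine intervals listed in Example 1.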

Let \({\psi }\) be a guard of a probabilistic edge or an invariant of \(\mathcal {P}\). By definition, we have that, for each \({B\in \mathcal {I}_{\mathbb {B}}}\), either \({B\subseteq \{{v\in \mathbb {R}_{\ge 0}\mid v\,\models \,\psi }\}}\) or \({B\cap \{{v\in \mathbb {R}_{\ge 0}\mid v\,\models \,\psi }\} = \emptyset }\). We write \({B\,\models \,\psi }\) in the case of \({B\subseteq \{{v\in \mathbb {R}_{\ge 0}\mid v\,\models \,\psi }\}}\) (to represent the fact that all valuations of \(B\) satisfy \(\psi \)).

Example 1

Consider the 1c-cdPTA of Fig. 1. We have \(\mathbb {B}= \{{0,1,3,4,5}\}\) and \(\mathcal {I}_{\mathbb {B}} = \{{[0,0], (0,1), [1,1], (1,3), [3,3], (3,4), [4,4], (4,5), [5,5]}\}\). Consider the clock constraint \(x< 3\): we have \(B\,\models \,(x< 3)\) for all \(B\in \{{[0,0], (0,1), [1,1], (1,3)}\}\). Similarly, for the clock constraint \(4< x< 5\), we have \((4,5)\,\models \,(4< x< 5)\).

\(\mathbb {B}\)-Minimal Schedulers. The following technical lemma specifies that any scheduler of the 1c-cdPTA can be made “more deterministic” in the following way: for each interval \(\tilde{B} \in \mathcal {I}_{\mathbb {B}}\) and probabilistic edge \(p\in prob \), if, after executing a certain finite path, a scheduler chooses (assigns positive probability to) multiple actions \((\tilde{v}_1,p), \cdots , (\tilde{v}_n,p)\) that share the same probabilistic edge \(p\) and for which \(\tilde{v}_i \in \tilde{B}\) for all \(1 \le i \le n\), then we can obtain another scheduler for which the aforementioned actions are replaced by a single action \((\tilde{v},p)\) such that \(\tilde{v} \in \tilde{B}\). Formally, we say that a scheduler \(\sigma \) of \(\mathcal {P}\) is \(\mathbb {B}\)-minimal if, for all finite paths \(r\) and all pairs of distinct actions \((\tilde{v}_1,p_1), (\tilde{v}_2,p_2) \in \mathsf {support}(\sigma (r))\), either \(p_1 \ne p_2\) or \(\tilde{v}_1\) and \(\tilde{v}_2\) belong to distinct intervals in \(\mathcal {I}_{\mathbb {B}}\), i.e., the intervals \(\tilde{B}_1, \tilde{B}_2 \in \mathcal {I}_{\mathbb {B}}\) for which \(\tilde{v}_1 \in \tilde{B}_1\) and \(\tilde{v}_2 \in \tilde{B}_2\) are such that \(\tilde{B}_1 \ne \tilde{B}_2\). In the following, we make use of the set of schedulers of \(\mathcal {P}\) that are \(\mathbb {B}\)-minimal.
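The \(\mathbb {B}\)-minimality condition on a single scheduler choice can be sketched as a simple predicate (hypothetical helper names; intervals are `(lo, hi, closed)` triples as before):

```python
def is_B_minimal_choice(support, partition):
    """Check B-minimality of one scheduler choice: no two actions in the
    support may share the same probabilistic edge p while their clock
    valuations lie in the same interval of I_B.
    support: iterable of (valuation, edge) actions;
    partition: list of (lo, hi, closed) intervals covering the valuations."""
    def interval_of(v):
        # Return the unique interval of the partition containing v.
        for (lo, hi, closed) in partition:
            if (closed and v == lo) or (not closed and lo < v < hi):
                return (lo, hi, closed)
        return None
    seen = set()
    for v, p in support:
        key = (p, interval_of(v))
        if key in seen:
            return False  # two actions with the same edge and interval
        seen.add(key)
    return True
```

For instance, the support \(\{(\frac{5}{4},p_\mathrm {W}), (\frac{7}{4},p_\mathrm {W})\}\) used in Example 2 below is not \(\mathbb {B}\)-minimal, since both valuations lie in \((1,3)\).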

Lemma 1

Let \((l,v) \in S_{\mathcal {P}}\) and \(F\subseteq L\). Then, for each scheduler \(\sigma \) of \(\mathcal {P}\), there exists a \(\mathbb {B}\)-minimal scheduler \(\pi \) such that \(\mathrm {Pr}^{\sigma }_{(l,v)}(\Diamond T_{F}) = \mathrm {Pr}^{\pi }_{(l,v)}(\Diamond T_{F})\).

The underlying idea of the proof of the lemma (which can be found in [34]) is that every finite path of \(\pi \) corresponds to a set of finite paths of \(\sigma \), where all of these paths have the same length, visit the same locations in order, choose the same probabilistic edges in order, and visit the same intervals of clock valuations in order. Consider the choice of \(\pi \) after a finite path: to replicate the choices made at the end of the corresponding set of finite paths of \(\sigma \), the choice of \(\pi \) is derived from a weighted average of the choices of \(\sigma \), where the weights correspond to the probabilities of the finite paths of \(\sigma \) under consideration. Another key point for the construction of \(\pi \) is that, when a non-constant probabilistic edge is chosen, the clock valuation used by \(\pi \) when taking the probabilistic edge reflects the clock valuations used by \(\sigma \) when taking that probabilistic edge at the end of the finite paths of \(\sigma \) under consideration: the clock valuation chosen by \(\pi \) is obtained as a weighted average of the clock valuations chosen by \(\sigma \). Lemma 1 allows us to consider only \(\mathbb {B}\)-minimal schedulers in the sequel, permitting us to obtain a close correspondence between the schedulers of \(\mathcal {P}\) and the schedulers of the constructed IMDP, the definition of which we consider in the subsequent subsection.

Example 2

Consider the 1c-cdPTA of Fig. 1. In the following, we denote the outgoing probabilistic edges from \(\mathrm {W}\) and \(\mathrm {F}\) as \(p_\mathrm {W}\) and \(p_\mathrm {F}\), respectively. Consider a scheduler \(\sigma \), where \(\sigma (\mathrm {W},0)\) (i.e., the choice of \(\sigma \) after the finite path comprising the single state \((\mathrm {W},0)\)) assigns probability \(\frac{1}{2}\) to the action \((\frac{5}{4},p_\mathrm {W})\) and probability \(\frac{1}{2}\) to the action \((\frac{7}{4},p_\mathrm {W})\) (where the two actions refer to either \(\frac{5}{4}\) or \(\frac{7}{4}\) time units elapsing, after which the probabilistic edge \(p_\mathrm {W}\) is taken). Then we can construct a \(\mathbb {B}\)-minimal scheduler \(\pi \) such that \(\pi (\mathrm {W},0)\) assigns probability 1 to the action \((\frac{3}{2},p_\mathrm {W})\) (i.e., where \(\frac{3}{2} = \frac{1}{2} \cdot \frac{5}{4} + \frac{1}{2} \cdot \frac{7}{4}\)). Now consider finite paths \(r= (\mathrm {W},0) (\frac{5}{4},p_\mathrm {W}) (\mathrm {F},\frac{5}{4})\) and \(r' = (\mathrm {W},0) (\frac{7}{4},p_\mathrm {W}) (\mathrm {F},\frac{7}{4})\). Note that \(\mathrm {Pr}^{\sigma }_{*,(\mathrm {W},0)}(r) = \frac{1}{2} \cdot \frac{11 - \frac{3 \cdot 5}{4}}{16}\) and \(\mathrm {Pr}^{\sigma }_{*,(\mathrm {W},0)}(r') = \frac{1}{2} \cdot \frac{11 - \frac{3 \cdot 7}{4}}{16}\). Say that \(\sigma (r)\) assigns probability 1 to the action \((\frac{17}{4},p_\mathrm {F})\) and that \(\sigma (r')\) assigns probability 1 to the action \((\frac{19}{4},p_\mathrm {F})\). Then \(\pi ((\mathrm {W},0)(\frac{3}{2},p_\mathrm {W})(\mathrm {F},\frac{3}{2}))\) assigns probability 1 to action \((\tilde{v},p_\mathrm {F})\), where \(\tilde{v} = \frac{\mathrm {Pr}^{\sigma }_{*,(\mathrm {W},0)}(r) \cdot \frac{17}{4} + \mathrm {Pr}^{\sigma }_{*,(\mathrm {W},0)}(r') \cdot \frac{19}{4}}{\mathrm {Pr}^{\sigma }_{*,(\mathrm {W},0)}(r) + \mathrm {Pr}^{\sigma }_{*,(\mathrm {W},0)}(r')}\), i.e., a weighted average of the time delays chosen by \(\sigma \) after \(r\) and \(r'\), where the weights correspond to the relative probabilities of \(r\) and \(r'\) under \(\sigma \).
Repeating this reasoning for all finite paths yields a \(\mathbb {B}\)-minimal scheduler \(\pi \) such that the probability of reaching a set of target states from \((\mathrm {W},0)\) is the same for both \(\sigma \) and \(\pi \).
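The merging step used above can be sketched as a probability-weighted average (a hypothetical helper; the weights are renormalised, since the finite paths being merged need not carry all of the probability mass):

```python
from fractions import Fraction

def merge_valuations(weighted_valuations):
    """Given pairs (w_i, v_i) of path probabilities and clock valuations
    that lie in the same interval of I_B, return the single valuation
    used by the B-minimal scheduler: the w-weighted average of the v_i."""
    total = sum(w for w, _ in weighted_valuations)
    return sum(w * v for w, v in weighted_valuations) / total
```

With the data of Example 2, merging \(\frac{5}{4}\) and \(\frac{7}{4}\), each with weight \(\frac{1}{2}\), gives \(\frac{3}{2}\), as in the construction of \(\pi (\mathrm {W},0)\).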

IMDP Construction. We now present the idea of the IMDP construction. The states of the IMDP fall into two categories: (1) pairs comprising a location and an interval from \(\mathcal {I}_{\mathbb {B}}\), with the intuition that the state \((l,B) \in L\times \mathcal {I}_{\mathbb {B}}\) of the IMDP represents all states \((l,v)\) of \(\mathcal {P}\) such that \(v\in B\); (2) triples comprising an interval from \(\mathcal {I}_{\mathbb {B}}\), a probabilistic edge and a bit that specifies whether the state refers to the left- or right-endpoint of the interval. A single transition of the semantics of the 1c-cdPTA, which we recall represents the elapse of time (therefore increasing the value of the clock) followed by the traversal of a probabilistic edge, is represented by a sequence of two transitions in the IMDP: the first IMDP transition represents the choice of (i) the probabilistic edge, (ii) the interval in \(\mathcal {I}_{\mathbb {B}}\) which contains the valuation of the clock after letting time elapse and immediately before the probabilistic edge is traversed, and (iii) in the case in which the aforementioned interval is open, the position of the clock valuation within the interval; the second IMDP transition represents the probabilistic choice made from the extreme (left and right) endpoints of the aforementioned interval with the chosen probabilistic edge.

Fig. 2.
figure 2

Interval Markov decision process \(\mathfrak {M}[{\mathcal {P}}]\) obtained from \(\mathcal {P}\).

Example 3

The IMDP construction, applied to the example of Fig. 1, is shown in Fig. 2 (note that transitions corresponding to probability 0 are shown with a dashed line). The location \(\mathrm {W}\) with the value of the clock being 0 is represented by the state \((\mathrm {W},[0,0])\). Recall that the outgoing probabilistic edge from \(\mathrm {W}\) is enabled when the clock is between 1 and 3: hence the single action \(((1,3),p_\mathrm {W})\) is available from \((\mathrm {W},[0,0])\) (representing the set of actions \((\tilde{v},p_\mathrm {W})\) of \(\mathcal {P}\) with \(\tilde{v} \in (1,3)\)). The action \(((1,3),p_\mathrm {W})\) is associated with two target states, \(((1,3),p_\mathrm {W},\mathsf {lb})\) and \(((1,3),p_\mathrm {W},\mathsf {rb})\), each corresponding to the probability interval (0, 1). The choice of probability within the interval can be made in the IMDP to represent a choice of clock valuation in (1, 3): for example, the valuation \(\frac{3}{2}\) would be represented by the assignment that associates probability \(\frac{3}{4}\) with \(((1,3),p_\mathrm {W},\mathsf {lb})\) and \(\frac{1}{4}\) with \(((1,3),p_\mathrm {W},\mathsf {rb})\) (i.e., assigns a weight of \(\frac{3}{4}\) to the lower bound of (1, 3), and a weight of \(\frac{1}{4}\) to the upper bound of (1, 3), obtaining the weighted combination \(\frac{3}{4} \cdot 1 + \frac{1}{4} \cdot 3 = \frac{3}{2}\)).
Then, from both \(((1,3),p_\mathrm {W},\mathsf {lb})\) and \(((1,3),p_\mathrm {W},\mathsf {rb})\), there is a probabilistic choice made regarding the target IMDP state to make the subsequent transition to, i.e., the transitions from \(((1,3),p_\mathrm {W},\mathsf {lb})\) and \(((1,3),p_\mathrm {W},\mathsf {rb})\) do not involve nondeterminism, because there is only one action available, and because the resulting interval distribution assigns singleton intervals to all possible target states. The probabilities of the transitions from \(((1,3),p_\mathrm {W},\mathsf {lb})\) and \(((1,3),p_\mathrm {W},\mathsf {rb})\) are derived from the clock dependencies associated with 1 (i.e., the left endpoint of (1, 3)) and 3 (i.e., the right endpoint of (1, 3)), respectively. Hence the multiplication of the probabilities of the two aforementioned transitions (from \((\mathrm {W},[0,0])\) to either \(((1,3),p_\mathrm {W},\mathsf {lb})\) or \(((1,3),p_\mathrm {W},\mathsf {rb})\), and then to \((\mathrm {S},(1,3))\), \((\mathrm {T},(1,3))\) or \((\mathrm {F},(1,3))\)) represents exactly the probability of a single transition in the 1c-cdPTA: for example, in the 1c-cdPTA, considering again the example of the clock valuation associating \(\frac{3}{2}\) with \(x\), the probability of making a transition to location \(\mathrm {S}\) is \(\frac{3x- 3}{8} = \frac{3}{16}\); in the IMDP, assigning \(\frac{3}{4}\) to the transition to \(((1,3),p_\mathrm {W},\mathsf {lb})\) and \(\frac{1}{4}\) to the transition to \(((1,3),p_\mathrm {W},\mathsf {rb})\), we then obtain that the probability of making a transition to \((\mathrm {S},(1,3))\) from \((\mathrm {W},[0,0])\) is \(\frac{3}{4} \cdot 0 + \frac{1}{4} \cdot \frac{3}{4} = \frac{3}{16}\). Similar reasoning applies to the transitions available from \((\mathrm {F},(1,3))\).
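The endpoint representation of Example 3 can be checked mechanically: a valuation \(\tilde{v}\) in an open interval \((lb, rb)\) is written as a convex combination of the endpoints and, because the outcome probabilities are affine in the clock, mixing their endpoint values with the same weights recovers the probability at \(\tilde{v}\) exactly. A small sketch (hypothetical helper names, using the edge-to-\(\mathrm {S}\) probability \(\frac{3x-3}{8}\) from the example):

```python
from fractions import Fraction

def endpoint_weights(v, lb, rb):
    """Weights (a_lb, a_rb) with a_lb + a_rb = 1 and
    a_lb * lb + a_rb * rb = v, for v in the open interval (lb, rb)."""
    a_lb = Fraction(rb - v, rb - lb)
    return a_lb, 1 - a_lb

def prob_to_S(v):
    # Affine clock dependency of the outcome leading to S in Fig. 1.
    return (3 * v - 3) / Fraction(8)

# Representing the valuation 3/2 inside (1, 3) ...
a_lb, a_rb = endpoint_weights(Fraction(3, 2), 1, 3)
# ... and mixing the endpoint probabilities with the same weights.
mixed = a_lb * prob_to_S(Fraction(1)) + a_rb * prob_to_S(Fraction(3))
```

The final assertion below is exactly the \(\frac{3}{4} \cdot 0 + \frac{1}{4} \cdot \frac{3}{4} = \frac{3}{16}\) computation of the example; affineness is what makes this commute.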

We now describe formally the construction of the IMDP \(\mathfrak {M}[{\mathcal {P}}] = (S_{\mathfrak {M}[{\mathcal {P}}]},\)\(\mathfrak {A}_{\mathfrak {M}[{\mathcal {P}}]},\mathfrak {D}_{\mathfrak {M}[{\mathcal {P}}]})\). The set of states of \(\mathfrak {M}[{\mathcal {P}}]\) is defined as \(S_{\mathfrak {M}[{\mathcal {P}}]} = S_{\mathfrak {M}[{\mathcal {P}}]}^\mathsf {reg}\cup S_{\mathfrak {M}[{\mathcal {P}}]}^\mathsf {end}\), where \(S_{\mathfrak {M}[{\mathcal {P}}]}^\mathsf {reg}= \{{(l,B) \in L\times \mathcal {I}_{\mathbb {B}} \mid B\,\models \, inv (l)}\}\) and \(S_{\mathfrak {M}[{\mathcal {P}}]}^\mathsf {end}= \big \{(\tilde{B},(l,g,\wp ),\)\(\mathsf {dir}) \in \mathcal {I}_{\mathbb {B}} \times prob \times \{{\mathsf {lb},\mathsf {rb}}\} \mid \tilde{B}\,\models \,g\wedge inv (l)\big \}\). In order to distinguish states of \(\mathcal {P}\) and states of \(\mathfrak {M}[{\mathcal {P}}]\), we refer to elements of \(S_{\mathfrak {M}[{\mathcal {P}}]}^\mathsf {reg}\) as regions, and elements of \(S_{\mathfrak {M}[{\mathcal {P}}]}^\mathsf {end}\) as endpoint indicators. The set of actions of \(\mathfrak {M}[{\mathcal {P}}]\) is defined as \(\mathfrak {A}_{\mathfrak {M}[{\mathcal {P}}]} = \{{(\tilde{B},(l,g,\wp )) \in \mathcal {I}_{\mathbb {B}} \times prob \mid \tilde{B}\,\models \,g\wedge inv (l)}\} \cup \{{ \tau }\}\) (i.e., there is an action for each combination of interval from \(\mathcal {I}_{\mathbb {B}}\) and probabilistic edge such that all valuations from the interval satisfy both the guard of the probabilistic edge and the invariant condition of its source location).
For each region \((l,B) \in S_{\mathfrak {M}[{\mathcal {P}}]}^\mathsf {reg}\), let \(\mathfrak {A}_{\mathfrak {M}[{\mathcal {P}}]}({l,B}) = \{{(\tilde{B},(l',g,\wp )) \in \mathfrak {A}_{\mathfrak {M}[{\mathcal {P}}]} \mid l= l' \hbox { and } \tilde{B} \ge B}\}\). For each \((\tilde{B},p,\mathsf {dir}) \in S_{\mathfrak {M}[{\mathcal {P}}]}^\mathsf {end}\), let \(\mathfrak {A}_{\mathfrak {M}[{\mathcal {P}}]}({\tilde{B},p,\mathsf {dir}}) = \{{ \tau }\}\). The transition function \(\mathfrak {D}_{\mathfrak {M}[{\mathcal {P}}]}: S_{\mathfrak {M}[{\mathcal {P}}]} \times \mathfrak {A}_{\mathfrak {M}[{\mathcal {P}}]} \rightarrow \mathfrak {Dist}(S_{\mathfrak {M}[{\mathcal {P}}]}) \cup \{{ \bot }\}\) is defined as follows:

  • For each \((l,B) \in S_{\mathfrak {M}[{\mathcal {P}}]}^\mathsf {reg}\) and \((\tilde{B},p) \in \mathfrak {A}_{\mathfrak {M}[{\mathcal {P}}]}({l,B})\), we let \(\mathfrak {D}_{\mathfrak {M}[{\mathcal {P}}]}((l,B),(\tilde{B},p))\) be the interval distribution such that \(\mathfrak {D}_{\mathfrak {M}[{\mathcal {P}}]}((l,B),(\tilde{B},p)) (\tilde{B},p,\mathsf {lb}) =(0,1)\), \(\mathfrak {D}_{\mathfrak {M}[{\mathcal {P}}]}((l,B),(\tilde{B},p)) (\tilde{B},p,\mathsf {rb}) =(0,1)\), and \(\mathfrak {D}_{\mathfrak {M}[{\mathcal {P}}]}((l,B),(\tilde{B},p))(s) = [0,0]\) for all \(s\in S_{\mathfrak {M}[{\mathcal {P}}]} \setminus \{{(\tilde{B},p,\mathsf {lb}),(\tilde{B},p,\mathsf {rb})}\}\).

  • For each \((\tilde{B},(l,g,\wp ),\mathsf {dir}) \in S_{\mathfrak {M}[{\mathcal {P}}]}^\mathsf {end}\) and \((l',B') \in S_{\mathfrak {M}[{\mathcal {P}}]}^\mathsf {reg}\), let:

    $$ \lambda ^{(\tilde{B},(l,g,\wp ),\mathsf {dir})}_{(l',B')} = \left\{ \begin{array}{ll} {\wp }[{\mathsf {dir}(\tilde{B})}](\{{x}\},l') + {\wp }[{\mathsf {dir}(\tilde{B})}](\emptyset ,l') &{} \hbox {if } B' = \tilde{B} = [0,0] \\ {\wp }[{\mathsf {dir}(\tilde{B})}](\emptyset ,l') &{} \hbox {if } B' = \tilde{B}> [0,0] \\ {\wp }[{\mathsf {dir}(\tilde{B})}](\{{x}\},l') &{} \hbox {if } B' = [0,0] \hbox { and } \tilde{B} > [0,0] \\ 0 &{} \hbox {otherwise.} \end{array} \right. $$

    Then \(\mathfrak {D}_{\mathfrak {M}[{\mathcal {P}}]}((\tilde{B},(l,g,\wp ),\mathsf {dir}),\tau )\) is the interval distribution such that, for all \(s\in S_{\mathfrak {M}[{\mathcal {P}}]}\):

    $$ \mathfrak {D}_{\mathfrak {M}[{\mathcal {P}}]}((\tilde{B},(l,g,\wp ),\mathsf {dir}),\tau )(s) = \left\{ \begin{array}{ll} [\lambda ^{(\tilde{B},(l,g,\wp ),\mathsf {dir})}_{s},\lambda ^{(\tilde{B},(l,g,\wp ),\mathsf {dir})}_{s}] &{} \hbox {if } s\in S_{\mathfrak {M}[{\mathcal {P}}]}^\mathsf {reg}\\ \, \! [0,0] &{} \hbox {otherwise.} \end{array} \right. $$
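The case split defining \(\lambda \) can be sketched as follows (a hypothetical encoding, not from the paper: `wp` maps pairs `(resets, l')` to the outcome probability already evaluated at the chosen endpoint \(\mathsf {dir}(\tilde{B})\), and intervals are `(lo, hi, closed)` triples with \([0,0]\) encoded as `(0, 0, True)`):

```python
from fractions import Fraction

ZERO = (0, 0, True)  # the point interval [0, 0]

def endpoint_transition_prob(wp, l_target, B_target, B_tilde):
    """Probability lambda assigned by the tau-transition of the endpoint
    indicator (B_tilde, (l, g, wp), dir) to the region (l_target, B_target)."""
    reset = wp.get((True, l_target), Fraction(0))   # outcome resetting x
    keep = wp.get((False, l_target), Fraction(0))   # outcome leaving x as is
    if B_target == B_tilde == ZERO:
        return reset + keep   # both kinds of outcome stay in [0,0]
    if B_target == B_tilde:
        return keep           # clock unchanged, remains in B_tilde > [0,0]
    if B_target == ZERO:
        return reset          # clock reset to 0 from B_tilde > [0,0]
    return Fraction(0)

# Example data consistent with Fig. 1's p_W evaluated at the right
# endpoint x = 3 of (1,3): probability 3/4 to S and 1/8 to F follow from
# the examples in the text; the 1/8 to T is inferred from summing to 1.
wp_rb = {(False, "S"): Fraction(3, 4),
         (False, "T"): Fraction(1, 8),
         (False, "F"): Fraction(1, 8)}
```

Note that the first three branches mirror the three cases of the definition of \(\lambda \) above, in the same order.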

Next, we consider the correctness of the construction of \(\mathfrak {M}[{\mathcal {P}}]\), i.e., that \(\mathfrak {M}[{\mathcal {P}}]\) can be used for solving quantitative and qualitative properties of the 1c-cdPTA \(\mathcal {P}\). The proof relies on showing that transitions of the semantic MDP of \(\mathcal {P}\) can be mimicked by a sequence of two transitions of the semantic MDP of \(\mathfrak {M}[{\mathcal {P}}]\), and vice versa. Given a state \((l,v) \in S_{\mathcal {P}}\) of the semantic MDP of \(\mathcal {P}\), we let \(\mathsf {reg}({l,v}) = (l,B) \in S_{\mathfrak {M}[{\mathcal {P}}]}^\mathsf {reg}\) be the unique region such that \(v\in B\).

We now show that, for any scheduler of (the semantics of) the 1c-cdPTA \(\mathcal {P}\), there exists a scheduler of (the semantics of) the IMDP \(\mathfrak {M}[{\mathcal {P}}]\) such that the schedulers assign the same probability to reaching a certain set of locations from a given location with the value of the clock equal to 0. Let \(\mathfrak {T}_{F} = \{{ (l,B) \in S_{\mathfrak {M}[{\mathcal {P}}]}^\mathsf {reg}\mid l\in F}\}\) be the set of regions with location component in \(F\).

Lemma 2

Let \(l\in L\) be a location and let \(F\subseteq L\) be a set of locations. Given a \(\mathbb {B}\)-minimal scheduler \(\pi \) of \(\mathcal {P}\), there exists a scheduler \(\hat{\pi }\) of \(\mathfrak {M}[{\mathcal {P}}]\) such that \(\mathrm {Pr}^{\pi }_{(l,0)}(\Diamond T_{F}) = \mathrm {Pr}^{\hat{\pi }}_{\mathsf {reg}({l,0})}(\Diamond \mathfrak {T}_{F})\).

The proof of Lemma 2 (see [34]) is simplified by the fact that, by Lemma 1, it suffices to consider \(\mathbb {B}\)-minimal schedulers: for each finite path \(r\) of \(\pi \), we can identify a set of finite paths of \(\hat{\pi }\) of length twice that of \(r\), that visit the same locations in order, choose the same probabilistic edges in order, and visit the same intervals in order, both regarding the clock valuations/intervals in states and in actions. In fact, finite paths of \(\hat{\pi }\) that are associated with \(r\) differ only in terms of the \(\mathsf {lb}\) and \(\mathsf {rb}\) components used in endpoint indicators. Furthermore, \(\hat{\pi }\) replicates exactly the choice of \(\pi \) made after \(r\) in terms of interval of \(\mathcal {I}_{\mathbb {B}}\) chosen and probabilistic edge in all of its finite paths associated with \(r\). Finally, \(\hat{\pi }\) chooses assignments (over edges labelled with (0, 1)) in order to represent exactly the choices of clock valuations made by \(\pi \), in the manner described in Example 3 above: more precisely, the choice of action \((\tilde{v},p)\) by \(\pi \), where \(\tilde{B}\) is the unique interval such that \(\tilde{v} \in \tilde{B}\), is mimicked by \(\hat{\pi }\) choosing the action \(((\tilde{B},p),\alpha )\) for which \(\alpha (\tilde{B},p,\mathsf {lb}) = \frac{\mathsf {rb}(\tilde{B}) - \tilde{v}}{\mathsf {rb}(\tilde{B}) - \mathsf {lb}(\tilde{B})}\), and \(\alpha (\tilde{B},p,\mathsf {rb}) = 1 - \alpha (\tilde{B},p,\mathsf {lb}) = \frac{\tilde{v} - \mathsf {lb}(\tilde{B})}{\mathsf {rb}(\tilde{B}) - \mathsf {lb}(\tilde{B})}\).

Example 4

Consider the 1c-cdPTA of Fig. 1. Let \(\pi \) be a \(\mathbb {B}\)-minimal scheduler such that \(\pi (\mathrm {W},0)\) assigns probability 1 to the action \((\frac{3}{2},p_\mathrm {W})\). Then \(\hat{\pi }\) is constructed such that \(\hat{\pi }(\mathrm {W},[0,0])\) assigns probability 1 to \((((1,3),p_\mathrm {W}),\alpha )\), where \(\alpha ((1,3),p_\mathrm {W},\mathsf {lb}) = \frac{3}{4}\) and \(\alpha ((1,3),p_\mathrm {W},\mathsf {rb}) = \frac{1}{4}\) (observe that \(\alpha ((1,3),p_\mathrm {W},\mathsf {lb}) = \frac{3-\frac{3}{2}}{2}\)). Furthermore, after any finite path ending in an endpoint indicator, \(\hat{\pi }\) assigns probability 1 to \(\tau \), the only available action. Now consider the finite path \(r= (\mathrm {W},0)(\frac{3}{2},p_\mathrm {W})(\mathrm {F},\frac{3}{2})\) of \(\pi \): then the corresponding set of finite paths of \(\hat{\pi }\) comprises \(r' = (\mathrm {W},[0,0]) (((1,3),p_\mathrm {W}),\alpha ) ((1,3),p_\mathrm {W},\mathsf {lb}) (\mathrm {F},(1,3))\) and \(r'' = (\mathrm {W},[0,0]) (((1,3),p_\mathrm {W}),\alpha ) ((1,3),p_\mathrm {W},\mathsf {rb}) (\mathrm {F},(1,3))\). Now say that \(\pi (r)\) assigns probability 1 to the action \((\frac{9}{2},p_\mathrm {F})\): then both \(\hat{\pi }(r')\) and \(\hat{\pi }(r'')\) assign probability 1 to the action \((((4,5),p_\mathrm {F}),\alpha ')\), where \(\alpha '((4,5),p_\mathrm {F},\mathsf {lb}) = \frac{1}{2}\) and \(\alpha '((4,5),p_\mathrm {F},\mathsf {rb}) = \frac{1}{2}\) (note that \(\alpha '((4,5),p_\mathrm {F},\mathsf {lb}) = 5-\frac{9}{2}\)). Hence, regardless of whether \(((1,3),p_\mathrm {W},\mathsf {lb})\) or \(((1,3),p_\mathrm {W},\mathsf {rb})\) was visited, scheduler \(\hat{\pi }\) makes the same choice to mimic \(\pi (r)\).

The following lemma considers the converse direction, namely that (starting from a given location with the clock equal to 0) any scheduler of \(\mathfrak {M}[{\mathcal {P}}]\) can be mimicked by a \(\mathbb {B}\)-minimal scheduler of \(\mathcal {P}\) such that the schedulers assign the same probability of reaching a certain set of locations.

Lemma 3

Let \(l\in L\) be a location and let \(F\subseteq L\) be a set of locations. Given a scheduler \(\hat{\pi }\) of \(\mathfrak {M}[{\mathcal {P}}]\), there exists a \(\mathbb {B}\)-minimal scheduler \(\pi \) of \(\mathcal {P}\) such that \(\mathrm {Pr}^{\pi }_{(l,0)}(\Diamond T_{F}) = \mathrm {Pr}^{\hat{\pi }}_{\mathsf {reg}({l,0})}(\Diamond \mathfrak {T}_{F})\).

We characterise the size of a 1c-cdPTA as the sum of the number of its locations, the size of the binary encoding of the clock constraints used in invariant conditions and guards, and the size of the binary encoding of the constants used in the distribution templates of the probabilistic edges (i.e., \(c^{p}_{e}\) and \(d^{p}_{e}\) for each \(p\in prob \) and \(e\in 2^{\{{x}\}} \times L\)).

Theorem 1

Given a 1c-cdPTA \(\mathcal {P}= (L, inv , prob )\), \(l\in L\) and \(F\subseteq L\), the quantitative and qualitative problems can be solved in polynomial time.

The theorem follows from Lemma 1, Lemma 2, Lemma 3, Proposition 1, the fact that quantitative and qualitative problems for IMDPs can be solved in polynomial time (given that there exist polynomial-time algorithms for the analogous problems on IMCs with the semantics adopted in this paper [9, 11, 30, 33]), and the fact that the IMDP construction presented in this section yields an IMDP of size polynomial in the size of the 1c-cdPTA. We add that the quantitative problems for 1c-cdPTAs are PTIME-hard, following from the PTIME-hardness of reachability for MDPs [29], thus establishing PTIME-completeness for quantitative problems for 1c-cdPTAs.

5 Conclusion

We have presented a method for the transformation of a class of 1c-cdPTAs to IMDPs such that there is a precise relationship between the schedulers of the 1c-cdPTA and the IMDP, allowing us to use established polynomial-time algorithms for IMDPs to decide quantitative and qualitative reachability problems on the 1c-cdPTA. Overall, the results establish that such problems are in PTIME. The techniques rely on the initialisation requirement, which ensures that optimal choices for non-constant probabilistic edges correspond to the left or right endpoints of intervals that are derived from the syntactic description of the 1c-cdPTA. The initialisation requirement restricts dependencies between non-constant probabilistic edges: while this necessarily restricts the expressiveness of the formalism, the resulting model nevertheless retains the expressive power to represent basic situations in which the probability of certain events depends on the exact amount of time elapsed, such as those described in the introduction.

The IMDP construction can be simplified in a number of cases: for example, in the case in which at most two outcomes \(e_1, e_2\) of every probabilistic edge \(p\) are non-constant, i.e., for which \(d^{p}_{e_1} \ne 0\) and \(d^{p}_{e_2} \ne 0\), endpoint indicators are unnecessary; instead, when a probabilistic edge is taken from an open interval \(\tilde{B}\), each of \(e_1\) and \(e_2\) is associated with a (non-singleton) interval (other outcomes are associated with singleton intervals), and the choice of probability to assign between the two intervals represents the choice of clock valuation in \(\tilde{B}\). This construction is also polynomial in the size of the 1c-cdPTA. Future work could consider lifting the initialisation requirement: we conjecture that this is particularly challenging for quantitative properties, in particular recalling that Fig. 2 of [32] provides an example of a non-initialised 1c-cdPTA for which the maximum probability of reaching a certain location is attained by choosing a time delay corresponding to an irrational number. Solutions to the qualitative problem for non-initialised 1c-cdPTAs could potentially utilise connections with parametric MDPs [19, 35]. Furthermore, time-bounded reachability properties could also be considered in the context of 1c-cdPTAs.