Introduction

As cyber-physical systems grow ever more prevalent in our everyday lives, they also become increasingly complex. The need and desire to explain the behavior of such systems has sparked a branch of research, which tries to explain and explicate the substantial complexity of modern systems. One of the key principles which can lead to the understanding of formal systems are cause–effect relationships. The study of such relations in formal systems has received great interest over the past 25 years. Notions of causality have been proposed within a variety of models, such as structural equation models [30, 31, 43], Kripke structures [6, 14] and Markov chains [39, 40]. All of these investigations share a similar goal as they build a framework which tries to enable an explanation of why an observable phenomenon (the effect) has happened by logically linking previous events (the causes) to its occurrence. An understanding of such cause–effect relationships has been identified to be helpful in a variety of application areas ranging from finance [37] to medicine [38] and aeronautics [34], among others. Furthermore, in a societal context, causality plays a fundamental role in determining moral responsibility [8, 13] or legal accountability [24], and ultimately fosters user acceptance through an increased level of transparency [42].

Even though causality has been considered in various disciplines and within different application contexts, almost all approaches use at least one of the following two central paradigms to describe the logical link between causes and effect:

  1. 1.

    The probability-raising property: The probability to observe the effect is higher whenever the cause has been observed.

  2. 2.

    The counterfactual reasoning principle: The effect could not have happened if the cause had not occurred.

The notion of causality introduced in this paper uses both of these paradigms: Given a probability threshold \(p\in (0,1]\), a p-cause for an \(\omega \)-regular effect \(\mathcal {L}\) in a discrete-time Markov chain is defined to be a set of finite executions \(\Pi \) such that

  1. (i)

    the probability that \(\mathcal {L}\) occurs after an execution in \(\Pi \) is at least p and

  2. (ii)

    almost all executions exhibiting the effect \(\mathcal {L}\) have a prefix in \(\Pi \).

Table 1 Summary of complexity results for different kinds of cost in the fully observable case when the effect \(\mathcal {L}\) constitutes a reachability property

Condition (i) captures the probability-raising property if p is chosen to be greater than the overall probability that \(\mathcal {L}\) is observed, while (ii) is in line with counterfactual reasoning. Note that the effects are sets of infinite executions while causes are sets of finite executions preceding the effects. Nevertheless, it is of course possible that after some finite execution in a cause, it is already clear that the effect will (almost) surely be observed.

The precise meaning of execution depends on the setting. In the fully observable case, an observer has access to the system states themselves and thus an execution is a path in the Markov chain. In these settings we thus consider path-based p-causes (henceforth called p-causes). However, in settings without full introspection of the state space, only partial information about the current state can be observed. This can be modeled by labels on the states, where states with the same label are not distinguishable. In such a case an execution is represented by a trace (a sequence of labels) in the Markov chain and the corresponding p-causes are trace-based (henceforth called trace-p-cause).

By combining conditions (i) and (ii) the resulting notion is useful when causes are used for monitoring purposes. Imagine a critical event which the system should avoid (e.g., a fully automated drone crashing onto the ground). After a p-cause has been computed for this event, it can be used as a monitor that can trigger an alarm as soon as the execution so far is an element in the cause. Depending on the system and the threshold p the alarm is raised before the critical event occurs and thus suitable countermeasures can be taken (e.g., manual control instead of automated behavior). The probability threshold p can be thought of as the sensitivity of the monitor and can be adjusted depending on the criticality of the effect.

As multiple p-causes might exist for the same effect, the application described above motivates the question, which p-cause should be computed or used. Typically, one wants to minimize the resource consumption of the modeled system. Resources, such as time or energy, might be consumed by the system or the induced monitor, but also by the countermeasures taken upon triggering an alarm. The cost of countermeasures may depend on the state of the system or the execution until the alarm. Such costs can be modeled using state weights in the Markov chain, which induce weight functions on the finite executions either in an accumulative (total resource consumption) or instantaneous (current resource consumption intensity) fashion. To be able to model the different types of costs one might be interested in, we consider three cost mechanisms for causes:

  1. (1)

    The expected cost measures the expected resource consumption until the execution can be identified as part (or almost surely not part) of the cause.

  2. (2)

    The partial expected cost measures the expected resource consumption where executions which are not part of the cause do not consume any resources.

  3. (3)

    The maximal cost measures the maximal resource consumption that can occur within the cause.

Within a possible monitoring application the expected cost describes the expected resource consumption until the monitor triggers an alarm or reaches a safe state, whereas the partial expected cost can be used to model the costs of countermeasures that are only incurred if an execution in the cause is observed. In this context, the maximal cost measures the maximal possible resource consumption when an alarm is triggered.

In the setting of full observability of the state space where p-causes consist of finite paths, we obtain the complexity results presented in Table 1 for the computation of cost-minimal p-causes for different combinations of weight types and cost mechanisms if the effect \(\mathcal {L}\) is a reachability property. The case of arbitrary \(\omega \)-regular effects \(\mathcal {L}\) can be reduced to this case in polynomial time if \(\mathcal {L}\) is given by a deterministic Rabin automaton. In order to obtain the complexity results, we exploit connections to several computational problems for discrete-time Markovian models. For the expected cost we rely on reductions to the stochastic shortest path problem (SSP) studied in [7]. The exponential-time algorithm for partial expected costs on non-negative, accumulated weights uses results on partial expectations in Markov decision processes [44]. We obtain \(\mathtt {PP}\)-hardness for partial expected cost by a reduction from the cost problem for acyclic Markov chains stated in [28]. The pseudo-polynomial algorithm for the maximal cost on arbitrary, accumulated weights applies insights from total-payoff games [9, 12].

In the setting of partial observability, additional complications arise. For labeled Markov chains, we consider trace-p-causes, which consist of traces which incorporate the probability-raising principle for every underlying path. This approach is in line with notions of causality studied by Kleinberg et al. (see, e.g., [39]). In contrast to the full observable setting, trace-p-causes need not exist for arbitrary \(\omega \)-regular effects. We show that deciding the existence of a trace-p-cause is PSPACE-hard and can be done in exponential time. Furthermore, a few results about cost-minimal p-causes can be transferred with an exponential overhead in runtime: Namely, the results for the expected and maximal cost in case of non-negative weights and the algorithm for maximal cost for an instantaneous weight function. For the remaining combinations of weight types and cost mechanisms, analogous reductions to the results for p-causes lead to problem instances (mainly on partially observable MDPs) which are known to be undecidable in general or for which there are no known results in the literature. Whether the special form of these instances we obtain can be exploited to obtain further decidability results remains open.

Related Work Halpern and Pearl introduced a structural model approach to actual causality [31] which has sparked notions of causality in formal verification [6, 14]. The complexity of computing actual causes has been studied in [20, 21]. A probabilistic extension of this framework has been proposed in [25]. Recent work on checking and inferring actual causes is given in [33], and an application-oriented framework for it is presented in [34]. In another account, Kleinberg et al. [39] build a framework for actual causality in Markov chains and use it to infer causal relationships in data sets. This approach was later extended to continuous time data [37] and to token causality [40] and has been refined using new significance measures for actual and token causes [32, 52]. The survey article [1] introduces causes in probabilistic operational models by incorporating temporal priority and probability-raising in a state-based fashion. The extension of this approach to local and global probability-raising conditions as path properties is then compared to the PCTL approach of [39]. In [3] the probability-raising principle is incorporated in a causality notion for Markov decision processes and the coverage properties of causes are optimized by using quality measures.

In combination with logical programming, a logic for probabilistic causal reasoning is given in [51]. The subsequent work [50] compares this approach to Pearl’s theory of causality involving Bayesian networks [43]. The CP-logic of [51] is close to the representation of causal mechanisms of [18]. On the philosophical side, the probability-raising principle goes back to Reichenbach [45] and has been identified as a key ingredient to causality in various philosophical accounts, see e.g., [19, 48].

In stochastic systems modeled as Hidden Markov chains (HMCs) \(\omega \)-regular properties were studied in [27, 46] in a monitoring context. This approach to monitoring has recently been revived in [22]. In a similar context the trade-off between accuracy and overhead in runtime verification has been studied in [5, 35, 47]. In particular, [5] uses HMCs to estimate how likely the violation of a property is in the near future while monitoring a system. Monitoring the evolution of finite executions has also been investigated in the context of statistical model checking of LTL properties [17]. How randomization can improve monitors for non-probabilistic systems has been examined in [10]. The safety level of [23] measures which portion of a language admits bad prefixes, in the sense classically used for safety languages.

Contribution This paper contributes novel notions of causes for \(\omega \)-regular effects in stochastic operational models: In the setting of full introspection of the state space, p-causes are defined as sets of finite paths (Sect. 3). While the existence of such p-causes is always guaranteed, the paper considers various ways of assigning costs to a cause (Sect. 4). The cost mechanisms correspond to different kinds of resource consumption which may occur when applying p-causes, inspired by use-cases from the monitoring domain. The computational complexity of computing cost-minimal causes under these cost mechanisms is thoroughly developed. Finally, we extend the causality notion to partial observable models by introducing trace-p-causes (Sect. 5). While some of our methods for p-causes can be extended to trace-p-causes, we show that partial observability leads to additional computational difficulties in general.

This paper is an extended version of work published in the ATVA 2021 conference [2]. It extends the previous work by addressing partial observability in Sect. 5 and by providing the full proofs omitted in the conference version.

Preliminaries

Markov chains A discrete-time Markov chain (DTMC) M is a tuple \((S, s_0, {\textbf {P}})\), where S is a finite set of states, \(s_0\in S\) is the initial state, and \({\textbf {P}} :S \times S \rightarrow [0,1]\) is the transition probability function where we require \(\sum _{s' \in S} {\textbf {P}}(s, s') = 1\) for all \(s \in S\). For algorithmic problems all transition probabilities are assumed to be rational. A finite path \({\hat{\pi }}\) in M is a sequence \(s_0s_1\ldots s_n\) of states such that \({\textbf {P}}(s_i, s_{i+1}) > 0\) for all \(0 \le i \le n-1\). Let \({\text {last}}(s_0\ldots s_n) = s_n\). Similarly one defines the notion of an infinite path \(\pi \). Let \({\text {Paths}}_{{\text {fin}}}(M)\) and \({\text {Paths}}(M)\) be the set of finite and infinite paths. The set of prefixes of a path \(\pi \) is denoted by \({\text {Pref}}(\pi )\). The cylinder set of a finite path \({\hat{\pi }}\) is \({\text {Cyl}}({\hat{\pi }}) =\{\pi \in \ {\text {Paths}}(M)\mid {\hat{\pi }}\in {\text {Pref}}(\pi )\}.\) We consider \({\text {Paths}}(M)\) as a probability space whose \(\sigma \)-algebra is generated by such cylinder sets and whose probability measure is induced by \(\text {Pr}({\text {Cyl}}(s_0\ldots s_n)) = {\textbf {P}}(s_0, s_{1})\cdot \ldots \cdot {\textbf {P}}(s_{n-1}, s_n) \) (see [4, Chapter 10] for more details).

For an \(\omega \)-regular language \(\mathcal {L} \subseteq S^{\omega }\) let \({\text {Paths}}_M(\mathcal {L}) = {\text {Paths}}(M) \cap \mathcal {L}\). The probability of \(\mathcal {L}\) in M is defined as \(\text {Pr}_M(\mathcal {L}) = \text {Pr}({\text {Paths}}_M(\mathcal {L}))\). Given a state \(s\in S\), let \(\text {Pr}_{M,s}(\mathcal {L}) = \text {Pr}_{M_s}(\mathcal {L})\), where \(M_s\) is the DTMC obtained from M by replacing the initial state \(s_0\) with s. If M is clear from the context, we omit the subscript. For a finite path \({\hat{\pi }} \in {\text {Paths}}_{{\text {fin}}}(M)\), define the conditional probability

$$\begin{aligned} \text {Pr}_M(\mathcal {L} \mid {\hat{\pi }}) = \frac{\text {Pr}_M \left( {\text {Paths}}_M(\mathcal {L}) \cap {\text {Cyl}}({\hat{\pi }}) \right) }{\text {Pr}_M({\text {Cyl}}({\hat{\pi }}))}. \end{aligned}$$

Given \(E \subseteq S\), let \(\lozenge E = \{ s_0s_1\ldots \in {\text {Paths}}(M)\mid \exists i \ge 0. \; s_i\in E\}\). For such reachability properties we have \(\text {Pr}_{M}(\lozenge E \mid s_0\ldots s_n) = \text {Pr}_{M, s_n}(\lozenge E)\) for any \(s_0\ldots s_n\in {\text {Paths}}_{{\text {fin}}}(M)\). We assume \(\text {Pr}_{M,s_0}(\lozenge s) > 0\) for all states \(s \in S\). Furthermore, define a weight function on M as a map \(c: S \rightarrow \mathbb {Q}\). We typically use it to induce a weight function \(c: {\text {Paths}}_{{\text {fin}}}(M) \rightarrow \mathbb {Q}\) (denoted by the same letter) by accumulation, i.e., \(c(s_0 \cdots s_n) = \sum _{i=0}^{n} c(s_i)\). Finally, a set \(\Pi \subseteq {\text {Paths}}_{{\text {fin}}}(M)\) is called prefix-free if for every \({\hat{\pi }}\in \Pi \) we have \(\Pi \cap {\text {Pref}}({\hat{\pi }}) = \{{\hat{\pi }}\}\).

A labeled DTMC \(\mathcal {M} = (S,s_0,{\textbf {P}},\Sigma ,L)\) is a DTMC equipped with a labeling function \(L: S \rightarrow \Sigma \) that assigns each state \(s \in S\) a label \(L(s) \in \Sigma \). A finite trace \({\hat{\tau }}\) in \(\mathcal {M}\) is a sequence \(L(s_0)L(s_1) \ldots L(s_n)\) of labels such that there exists a corresponding finite path \({\hat{\pi }} = s_0 s_1 \ldots s_n \in {\text {Paths}}_{{\text {fin}}}(\mathcal {M})\) The set of all corresponding paths to \({\hat{\tau }}\) in \(\mathcal {M}\) is denoted by \({\text {Paths}}_{\mathcal {M}}({\hat{\tau }})\). The notion of an infinite trace \(\tau \) is defined in a similar manner. \({\text {Traces}}(\mathcal {M})\) can be considered as a probability space whose \(\sigma \)-algebra is, analogously to the non-labeled case, induced by the cylinder sets of finite traces. The probability of an \(\omega \)-regular language \(\mathcal {L} \subseteq \Sigma ^\omega \) is given by \(\Pr _{\mathcal {M}}(\mathcal {L})=\Pr _{\mathcal {M}}(\mathcal {L} \cap {\text {Traces}}(\mathcal {M}))\). A weight function can also be defined over labels, i.e., \(c:\Sigma \rightarrow \mathbb {Q}\), which in turn induces a weight function on finite traces \(c: {\text {Traces}}_{{\text {fin}}}(\mathcal {M}) \rightarrow \mathbb {Q}\).

Markov decision processes A Markov decision process (MDP) \(\mathcal {M}\) is a tuple \((S, {\text {Act}}, s_0, {\textbf {P}})\), where S is a finite set of states, \({\text {Act}}\) is a finite set of actions, \(s_0\) is the initial state, and \({\textbf {P}} :S \times {\text {Act}} \times S \rightarrow [0,1]\) is the transition probability function such that for all states \(s \in S\) and actions \(\alpha \in {\text {Act}}\) we have \(\sum _{s' \in S} {\textbf {P}}(s, \alpha ,s') \in \{0,1 \}\). An action \(\alpha \) is enabled in state \(s \in S\) if \(\sum _{s' \in S} {\textbf {P}}(s, \alpha ,s')=1\) and we define \({\text {Act}}(s) = \{\alpha \mid \alpha \text { is enabled in } s\}\). We require \({\text {Act}}(s) \ne \emptyset \) for all states \(s \in S\).

An infinite path in \(\mathcal {M}\) is an infinite sequence \(\pi =s_0 \alpha _1s_1 \alpha _2s_2 \dots \in (S \times {\text {Act}})^\omega \) such that for all \(i \ge 0\) we have \({\textbf {P}}(s_i, \alpha _{i+1},s_{i+1})>0\). Any finite prefix of \(\pi \) that ends in a state is a finite path. A scheduler \(\mathfrak {S}\) is a function that maps a finite path \(s_0 \alpha _1s_1 \dots s_n\) to an enabled action \(\alpha \in {\text {Act}}(s_n)\). Therefore, it resolves the nondeterminism of the MDP and induces a (potentially infinite) Markov chain \(\mathcal {M}_{\mathfrak {S}}\). If the chosen action only depends on the last state of the path, i.e., \(\mathfrak {S}(s_0 \alpha _1s_1 \dots s_n) = \mathfrak {S}(s_n)\), then the scheduler is called memoryless and naturally induces a finite DTMC. For more details on DTMCs and MDPs we refer to [4].

Nondeterministic finite automata We will denote nondeterministic finite automata as tuples \(\mathcal {A}=(Q, \Sigma ,Q_0,T,F)\), where Q is the set of states, \(\Sigma \) is the alphabet, \(Q_0\subseteq Q\) is the set of initial states, \(T:Q\times \Sigma \rightarrow 2^Q\) is the transition function, and \(F\subseteq Q\) are the final states. A transition \(q'\in T(q,\alpha )\) is also denoted by \(q\overset{\alpha }{\mapsto } q'\).

Deterministic Rabin automata A deterministic Rabin automaton (DRA) is a tuple \(\mathcal {A} = (Q, \Sigma , q_0, \delta , \text {Acc})\) where Q is a finite set of states, \(\Sigma \) an alphabet, \(q_0\) the initial state, \(\delta : Q \times \Sigma \rightarrow Q\) the transition function and \(\text {Acc} \subseteq 2^Q \times 2^Q\) the acceptance set. The run of \(\mathcal {A}\) on a word \(w=w_0 w_1 \dots \in \Sigma ^\omega \) is the sequence \(\tau = q_0 q_1 \ldots \) of states such that \(\delta (q_i,w_i)=q_{i+1}\) for all i. It is accepting if there exists a pair \((L,K) \in \text {Acc}\) such that L is only visited finitely often and K is visited infinitely often by \(\tau \). The language \(\mathcal {L}(\mathcal {A})\) is the set of all words \(w\in \Sigma ^\omega \) on which the run of \(\mathcal {A}\) is accepting.

Probabilistic causes

This section introduces a notion of cause for \(\omega \)-regular properties in Markov chains. For the rest of this section we fix a DTMC \(M = (S, s_0, {\textbf {P}})\), an \(\omega \)-regular language \(\mathcal {L}\) over the alphabet S and a threshold \(p \in (0,1]\).

Definition 1

(p-critical prefix). A finite path \({\hat{\pi }}\) is a p-critical prefix for \(\mathcal {L}\) if \(\text {Pr}(\mathcal {L} \mid {\hat{\pi }}) \ge p\).

Definition 2

(p-cause). A p-cause for \(\mathcal {L}\) in M is a prefix-free set of finite paths \(\Pi \subseteq {\text {Paths}}_{{\text {fin}}}(M)\) such that

  1. (1)

    almost every \(\pi \in {\text {Paths}}_M(\mathcal {L})\) has a prefix \({\hat{\pi }} \in \Pi \), and

  2. (2)

    every \({\hat{\pi }} \in \Pi \) is a p-critical prefix for \(\mathcal {L}\).

Note that condition (1) and (2) are in the spirit of completeness and soundness as used, e.g., in [15]. The first condition is our invocation of the counterfactual reasoning principle: Almost every occurrence of the effect is preceded by an element in the cause. If the threshold is chosen such that \( p > \text {Pr}_{s_0}(\mathcal {L})\), then the second condition reflects the probability-raising principle in that seeing an element of \(\Pi \) implies that the probability of the effect \(\mathcal {L}\) has increased after the execution so far.

In particular, for monitoring purposes as described in the introduction it would be misleading to choose p below \(\text {Pr}_{s_0}(\mathcal {L})\) as this could instantly trigger an alarm before the system is put to use. On the other hand p should not be too close to 1 as this may result in an alarm being triggered too late. This can also lead to a non-classical cause–effect relation as \(\text {Pr}(\mathcal {L} \mid {\hat{\pi }}) = 1\) for a finite path \({\hat{\pi }}\) might also imply, that the effect has already occurred. Thus, a careful choice of the probability threshold p depending on the modeled system and application is crucial. Also note, that the probability might only be raised after the execution but not along the execution. For example, there might be paths \({\hat{\pi }} = s_0 s_1 \cdots s_n \in \Pi \) such that \(\text {Pr}_{s_i}(\mathcal {L}) \le \text {Pr}_{s_0}(\mathcal {L})\) for all \(i < n\) but \(\text {Pr}_{s_n}(\mathcal {L}) > p\).

If \(\mathcal {L}\) coincides with a reachability property one could equivalently remove the almost from (1) of Definition 2. In general, however, ignoring paths with probability zero is necessary to guarantee the existence of p-causes for all p.

Example 3

Consider the DTMC M depicted in Fig. 1. For \(p = 3/4\), a possible p-cause for the language \(\mathcal {L} = \lozenge fail \) in M is \(\Pi _1 = \{s\, t, s\, u\}\), since both t and u reach \( fail \) with probability \(\ge p\). The sets \(\Theta _1 = \{s\, t, s\, u, s\, t\, u\}\) and \(\Theta _2 = \{s\, t\, fail , s\, u\}\) are not p-causes: \(\Theta _1\) is not prefix-free and for \(\Theta _2\) the path \(s\, t\, u\, fail \) has no prefix in \(\Theta _2\). Another p-cause is \(\Pi _2 =\{s\, t\, fail , s\, u, s\, t\, u\}\).

Fig. 1
figure 1

DTMC M from Example 3

Example 4

It might happen that there is no finite p-cause. Consider the DTMC in Fig. 2 and the probability threshold \(p = 1/2\). Since \(\text {Pr}_{s}(\lozenge fail ) < p\), the singleton \(\{s \}\) is not a p-cause. Thus, for every \(n \ge 0\) either \(s^n\, t\) or \(s^n\, t\, fail \) is contained in any p-cause, which must therefore be infinite. There may also exist non-regular p-causes (as languages over S). For example, for \(A= \{n \in \mathbb {N} \mid n \text { prime} \}\) we get a non-regular p-cause \(\Pi _A = \{ s_0^n\, t\, \mid n \in A \} \cup \{ s_0^m\, t\, fail \mid m \notin A \}\).

Fig. 2
figure 2

Infinite and non-regular 1/2-causes

Remark 5

(Reduction to reachability properties). Let \(\mathcal {A}\) be a deterministic Rabin automaton for \(\mathcal {L}\) and consider the product Markov chain \(M \otimes \mathcal {A}\) as in [4, Sect. 10.3]. For any finite path \({\hat{\pi }}=s_0\ldots s_n \in {\text {Paths}}_{{\text {fin}}}(M)\) there is a unique augmented path \(a({\hat{\pi }}) = (s_0,q_1) (s_1, q_2)\ldots (s_n,q_{n+1}) \in {\text {Paths}}_{{\text {fin}}}(M \otimes \mathcal {A})\) whose projection onto the first factor is \({\hat{\pi }}\). Under this correspondence, a bottom strongly connected component (BSCC) of \(M \otimes \mathcal {A}\) is either accepting or rejecting, meaning that for every finite path reaching this BSCC the corresponding path \({\hat{\pi }}\) in M satisfies \(\text {Pr}_M(\mathcal {L}\mid {\hat{\pi }}) =1\), or, respectively, \(\text {Pr}_M(\mathcal {L}\mid {\hat{\pi }}) =0\) [4, Sect. 10.3]. This readily implies that almost every \(\pi \in {\text {Paths}}_M(\mathcal {L})\) has a 1-critical prefix and that, therefore, p-causes exist for any p.

Moreover, if U is the union of all accepting BSCCs in \(M \otimes \mathcal {A}\), then

$$\begin{aligned} \text {Pr}_{M}(\mathcal {L} \mid {\hat{\pi }}) = \text {Pr}_{M \otimes \mathcal {A}}\big (\lozenge U \mid a({\hat{\pi }}) \big ) \end{aligned}$$

holds for all finite paths \({\hat{\pi }}\) of M [4, Theorem 10.56]. Hence every p-cause \(\Pi _1\) for \(\mathcal {L}\) in M induces a p-cause \(\Pi _2\) for \(\lozenge U\) in \(M \otimes \mathcal {A}\) by taking \(\Pi _2 = \{a({\hat{\pi }}) \mid {\hat{\pi }} \in \Pi _1\}\). Vice versa, given a p-cause \(\Pi _2\) for \(\lozenge U\) in \(M \otimes \mathcal {A}\), then the set of projections of paths in \(\Pi _2\) onto their first component is a p-cause for \(\mathcal {L}\) in M. In summary, the reduction of \(\omega \)-regular properties on M to reachability properties on the product \(M \otimes \mathcal {A}\) also induces a reduction on the level of causes.

Remark 5 motivates us to focus on reachability properties henceforth. To apply the algorithms presented in Sect. 4 to specifications given in richer formalisms such as LTL, one would first have to apply the reduction to reachability given above, which increases the worst-case complexity.

In order to align the exposition with a possible monitoring application, we will consider the target set as representing an erroneous behavior that should be avoided. After collapsing the target set, we may assume that there is a unique state \( fail \in S\) such that \(\mathcal {L} = \lozenge fail \) is the language we are interested in. Further, we collapse all states from which \( fail \) is not reachable to a unique state \( safe \in S\) with the property \(\text {Pr}_{ safe }(\lozenge fail )=0\). After pre-processing, we then have \(\text {Pr}_{s_0}(\lozenge \{ fail , safe \}) = 1\). We define the set

$$\begin{aligned} S_p := \{s \in S \mid \text {Pr}_s(\lozenge fail ) \ge p\}, \end{aligned}$$

of acceptable final states for p-critical prefixes. This set is never empty as \( fail \in S_p\) for all \(p \in (0,1]\).

There is a partial order on the set of p-causes defined as follows: \(\Pi \preceq \Phi \) if and only if for all \(\phi \in \Phi \) there exists \(\pi \in \Pi \) such that \(\pi \in {\text {Pref}}(\phi )\). The reflexivity and transitivity are straightforward, and the antisymmetry follows from the fact that p-causes are prefix-free. However, this order itself has no influence on the probability threshold p. In fact, for two p-causes \(\Pi , \Phi \) with \(\Pi \preceq \Phi \) and \(\pi \in \Pi , \phi \in \Phi \) it might happen that \(\text {Pr}(\lozenge fail \mid \pi ) \ge \text {Pr}(\lozenge fail \mid \phi )\). This partial order admits a minimal element which is a regular language over S and which plays a crucial role for finding optimal causes in Sect. 4.

Proposition 6

(Canonical p-cause). Let

$$\begin{aligned} \Theta = \{&s_0 \cdots s_n \in {\text {Paths}}_{{\text {fin}}}(M) \text { with}\\&s_n\in S_p\text { and for all } i<n: \,s_i\notin S_p \}. \end{aligned}$$

Then \(\Theta \) is a regular p-cause (henceforth called the canonical p-cause) and for all p-causes \(\Pi \) we have \(\Theta \preceq \Pi \).

Proof

The first condition \(s_n \in S_p\) ensures that every path in \(\Theta \) is a p-critical prefix for \(\lozenge fail \) in M. The second condition ensures that no proper prefix is p-critical, and thus \(\Theta \) is prefix-free. Clearly, every path reaching \( fail \) passes through \(S_p\) since \( fail \) is itself an element in \(S_p\). These properties together show that \(\Theta \) is a p-cause.

Let \(\Pi \) be another p-cause for \(\lozenge fail \) in M and consider an arbitrary path \(s_0 \ldots s_n \in \Pi \). This means \(s_n \in S_p\). If we have for all \(i<n\) that \(\text {Pr}_{s_i}(\lozenge fail )<p\), then \(s_0 \ldots s_n \in \Theta \) and there is nothing to prove. Therefore, assume that there exists \(i<n\) such that \(s_i \in S_p\). For minimal such i we have \(s_0 \ldots s_i \in \Theta \) and thus \(\Theta \preceq \Pi \).

To prove that \(\Theta \) is regular, consider the deterministic finite automaton \(\mathcal {A}=(Q, \Sigma ,q_0,T,F)\) for \(Q=S\), \(\Sigma =S\cup \{\iota \}\), \(q_0=\iota \), and final states \(F=S_p\). The transition relation is given by

i.e., all states in \(S_p\) are terminal. We argue that this automaton describes the language \(\Theta \):

If \(s_0 \in S_p\) then there are no transitions possible and \(\mathcal {L}_\mathcal {A}=\{s_0\}\) and \(\Theta = \{s_0\}\) by definition.

Let \(\mathcal {L}_\mathcal {A} \subseteq \Theta \) and let \(s_0 \ldots s_n\) be a word accepted by \(\mathcal {A}\). By definition we have \(s_n \in S_p\) For any state \(s \in S_p\) visited by \(s_0 \cdots s_n\) we know that \(s=s_n\) since \(S_p\) is terminal. This means \(\forall i<n. \; \text {Pr}_{s_i}(\lozenge fail ) < p\) and thus \(s_0 \ldots s_n \in \Theta \).

Now let \(\mathcal {L}_\mathcal {A} \supseteq \Theta \) and \(s_0 \ldots s_n \in \Theta \). A run for \(s_0 \ldots s_n\) in \(\mathcal {A}\) is possible, since we have an edge for every transition in M except for outgoing edges of \(S_p\). These exceptions are avoided, since for \(i<n\) we have \(\text {Pr}_{s_i}(\lozenge fail ) < p\) and thus no state but the last one is in \(S_p\). On the other hand \(s_n\) is in \(S_p\) by assumption and thus the word \(s_0 \ldots s_n\) is accepted by \(\mathcal {A}\). \(\square \)

We now introduce an MDP associated with M whose schedulers correspond to the p-causes of \(\lozenge fail \) in M. This MDP is useful both as a representation for p-causes and for algorithmic questions we consider later.

Definition 7

(p-causal MDP). For the DTMC \(M{=}(S,s_0,{\textbf {P}})\) define the p-causal MDP \(\mathcal {C}_p(M) = (S,\{continue,pick\}, s_0,{\textbf {P}}')\) associated with M, where \({\textbf {P}}'\) is defined as follows:

$$\begin{aligned} {\textbf {P'}}(s,continue,s')&= {\textbf {P}}(s,s') \text { for all } s,s' \in S \\ {\textbf {P'}}(s,pick, fail )&= {\left\{ \begin{array}{ll} 1 &{} \text { if } s \in S_p \\ 0 &{} \text { otherwise} \end{array}\right. } \end{aligned}$$

Given a weight function c on M, we consider c also as weight function on \(\mathcal {C}_p(M)\).

Example 8

Figure 3 demonstrates the p-causal MDP construction of \(\mathcal {C}_p(M)\). The black edges are transitions of M, where the transition probabilities are omitted. Let us assume \(S_p \backslash \{ fail \}= \{s_1,s_3,s_4 \}\). To construct \(\mathcal {C}_p(M)\) one adds transitions for the action pick, as shown by red edges.

Fig. 3
figure 3

Illustration of the p-causal MDP construction

Technically, schedulers are defined on all finite paths of an MDP \(\mathcal {M}\). However, for given a scheduler \(\mathfrak {S}\), there might be paths which cannot be obtained under \(\mathfrak {S}\). Thus we define an equivalence relation \(\equiv \) on the set of schedulers of \(\mathcal {M}\) by setting two schedulers \(\mathfrak {S}\) and \(\mathfrak {S}'\) equivalent, i.e., \(\mathfrak {S} \equiv \mathfrak {S}'\), if \({\text {Paths}}(\mathcal {M}_{\mathfrak {S}})={\text {Paths}}(\mathcal {M}_{\mathfrak {S}'})\). Note that two schedulers equivalent under \(\equiv \) behave identically.

Lemma 9

There is a one-to-one correspondence between equivalence classes of schedulers in \(\mathcal {C}_p(M)\) w.r.t. \(\equiv \) and p-causes in M for \(\lozenge fail \).

Proof

Given a p-cause \(\Pi \) for \(\lozenge fail \) in M, we construct the equivalence class of scheduler \([\mathfrak {S}_{\Pi }]\) by defining \(\mathfrak {S}_{\Pi }({\hat{\pi }})=pick\) if \({\hat{\pi }} \in \Pi \), and otherwise defining \(\mathfrak {S}_{\Pi }({\hat{\pi }})=continue\). Vice versa, given an equivalence class \([\mathfrak {S}]\) of schedulers, we define the p-cause

$$\begin{aligned} \Pi _{\mathfrak {S}}= \{&{\hat{\pi }} \in {\text {Paths}}_{{\text {fin}}}(M) \text { such that}\\&\mathfrak {S}({\hat{\pi }})=pick \text { or }{\hat{\pi }}\text { ends in } fail \text { and}\\&\mathfrak {S}\text { does not choose }pick\text { on any prefix of }{\hat{\pi }}\}. \end{aligned}$$

Since pick can only be chosen once on every path in \({\text {Paths}}(\mathcal {M}_{\mathfrak {S}})\), it is easy to see that \(\mathfrak {S} \equiv \mathfrak {S}'\) implies \(\Pi _{\mathfrak {S}} = \Pi _{\mathfrak {S}'}\). Note that every \({\hat{\pi }} \in \Pi _\mathfrak {S}\) is a p-critical prefix since it ends in \(S_p\) and every path in \(\lozenge fail \) is covered since either pick is chosen or \({\hat{\pi }}\) ends in \( fail \). Furthermore, the second condition makes \(\Pi \) prefix-free. \(\square \)

Types of p-causes and induced monitors

In this section we will introduce two specific classes of p-causes which have a comparatively simple representation. We explain what classes of schedulers they correspond to in the p-causal MDP and how monitors can be derived for them.

Definition 10

(State-based p-cause). A p-cause \(\Pi \) is state-based if there exists a set of states \(Q \subseteq S_p\) such that

$$\begin{aligned} \Pi =\{&s_0 \ldots s_n\in {\text {Paths}}_{{\text {fin}}}(M) \text { with}\\&s_n \in Q \text { and } \forall i<n: \; s_i \notin Q \}. \end{aligned}$$

State-based p-causes correspond to memoryless schedulers of \(\mathcal {C}_p(M)\) which choose pick exactly for paths ending in a state of Q. The prime example for a state-based p-cause is the canonical p-cause \(\Theta \). Note, that the part of the proof of Proposition 6 which shows regularity can be adapted to any state-based p-cause. For DTMCs equipped with a weight function we also introduce threshold-based p-causes:

Definition 11

(Threshold-based p-cause). A p-cause \(\Pi \) is threshold-based if there exists a map \(T: S_p \rightarrow \mathbb {Q}\cup \{\infty \}\) such that

$$\begin{aligned} \Pi = \{&s_0 \cdots s_n\in {\text {Paths}}_{{\text {fin}}}(M) \text { with}\\&s_0 \cdots s_n \in {\text {pick}}(T) \text { and}\\&s_0 \cdots s_i \notin {\text {pick}}(T) \text { for } i < n\}, \end{aligned}$$

where

$$\begin{aligned} {\text {pick}}(T) = \{&s_0 \ldots s_n \in {\text {Paths}}_{{\text {fin}}}(M) \text { with}\\&s_n\in S_p \text { and } c(s_0\ldots s_n) < T(s_n)\}. \end{aligned}$$

Threshold-based p-causes correspond to a simple class of weight-based schedulers of the p-causal MDP, which base their decision in a state only on whether the current weight exceeds the threshold or not. For monitoring purposes threshold-based p-causes are intuitively useful if triggering an alarm causes incurs costs, while reaching a safe state does not (see Sect. 4.2): The idea is that cheap paths (satisfying \(c(s_0 \ldots s_n) < T(s_n)\)) are picked for the p-cause, while expensive paths are continued in order to realize the chance (with probability \(\le 1{-}p\)) that a safe state is reached and therefore the high cost that has already been accumulated is avoided.

As argued before, the concept of p-causes can be used as a basis for monitors that raise an alarm as soon as a state sequence in the p-cause has been observed. State-based p-causes have the advantage that they are realizable by “memoryless” monitors that only need the information on the current state of the Markov chain. Threshold-based monitors additionally need to track the weight that has been accumulated so far until the threshold value of the current state is exceeded. Thus, the memory requirements of monitors realizing a threshold-based p-cause are given by the logarithmic length of the largest threshold value for \(S_p\)-states. All algorithms proposed in Sect. 4 for computing cost-minimal p-causes will return p-causes that are either state-based or threshold-based with polynomially bounded memory requirements.

Comparison to prima facie causes

The work [39] presents the notion of prima facie causes in DTMCs where both causes and events are formalized as PCTL state formulae. In our setting we can equivalently consider a state \( fail \in S\) as the effect and a state subset \(C \subseteq S\) constituting the cause. We then reformulate [39, Definition 4.1] to our setting.

Definition 12

(cf. [39]). A set \(C \subseteq S\) is a p-prima facie cause of \(\lozenge fail \) if the following three conditions hold:

  1. 1.

    The set C is reachable from the initial state and \( fail \notin C\).

  2. 2.

    \(\forall s \in C: \;\text {Pr}_{s}(\lozenge fail ) \ge p\)

  3. 3.

    \(\text {Pr}_{s_0}(\lozenge fail )<p\)

The condition \(p>\text {Pr}_{s_0}(\lozenge fail )\) we discussed for p-causes is hard-coded here as (3). In [39] the value p is implicitly existentially quantified and thus conditions (2) and (3) can be combined to \(\text {Pr}_s(\lozenge fail ) > \text {Pr}_{s_0}(\lozenge fail )\) for all \(s \in C\). This encapsulates the probability-raising property. However, \( fail \) may be reached while avoiding the cause C, so p-prima facie causes do not entail the counterfactual reasoning principle. Definition 2 can be seen as an extension of p-prima facie causes by:

Lemma 13

For \(p>\text {Pr}_{s_0}(\lozenge fail )\) every p-prima facie cause induces a state-based p-cause.

Proof

Let \(C \subseteq S\) be a p-prima facie cause. By condition (1) and (2) of Definition 12 we have \(C \subseteq S_p \backslash \{ fail \}\). Since every path reaching \( fail \) trivially visits a state in \( Q : = C \cup \{ fail \}\subseteq S_p\), the set

$$\begin{aligned} \Pi = \{&s_0 \ldots s_n\in {\text {Paths}}_{{\text {fin}}}(M) \text { with} \\&s_n \in Q \text { and } \forall i<n: \; s_i \notin Q\} \end{aligned}$$

is a state-based p-cause. \(\square \)

Costs of p-causes

In this section we fix a DTMC M with state space S, unique initial state \(s_0\), unique target and safe state \( fail , safe \in S\) and a threshold \(p \in (0,1]\). As motivated in the introduction, we equip the DTMC of our model with a weight function \(c:S \rightarrow \mathbb {Q}\) on states and consider the induced accumulated weight function on finite paths \(c:{\text {Paths}}_{{\text {fin}}}(M)\rightarrow \mathbb {Q}\). These weights typically represent resources spent, e.g., energy, time, material, etc. Naturally one would like to optimize resource consumption either by minimizing the expected resource consumption or by staying below a certain threshold. In this section we will introduce various measures for resource consumption applicable in different scenarios and analyze the complexity of computing optimal p-causes.

Expected cost of a p-cause

Definition 14

(Expected cost). Given a p-cause \(\Pi \) for \(\lozenge fail \) in M consider the random variable \(\mathcal {X}: {\text {Paths}}(M) \rightarrow \mathbb {Q}\) with \(\mathcal {X}(\pi )=c({\hat{\pi }})\) for

$$\begin{aligned}&{\hat{\pi }} \in \Pi \cap {\text {Pref}}(\pi ) \text { if such } {\hat{\pi }} \text { exists} \\&{\hat{\pi }} \in {\text {Pref}}(\pi ) \text { minimal with }{\text {last}}({\hat{\pi }})= safe . \end{aligned}$$

Since \(\text {Pr}_{s_0}(\lozenge \{ fail , safe \})=1\), paths not falling under the two cases above have measure 0. Then the expected cost \({\text {expcost}}(\Pi )\) of \(\Pi \) is the expected value of \(\mathcal {X}\).

The expected cost is means by which the efficiency of causes for monitoring purposes can be estimated. Assume a p-cause \(\Pi \) is used to monitor critical scenarios of a probabilistic system. This means that at some point either a critical scenario is predicted by the monitor (i.e., the execution seen so far lies in \(\Pi \)), or the monitor reports that no critical scenario will arise (i.e., \( safe \) has been reached) and can therefore be turned off. If the weight function on the state space is chosen such that it models the cost of monitoring the respective states, then \({\text {expcost}}(\Pi )\) estimates the average total resource consumption of the monitor.

We say that a p-cause \(\Pi \) is \({\text {expcost}}\)-minimal if for all p-causes \(\Phi \) we have

$$\begin{aligned} {\text {expcost}}(\Pi ) \le {\text {expcost}}(\Phi ). \end{aligned}$$

By \({\text {expcost}}^{\min }\), we denote the value \({\text {expcost}}(\Pi )\) of any \({\text {expcost}}\)-minimal p-cause \(\Pi \).

Theorem 15

  1. (1)

    Given a non-negative weight function \(c:S \rightarrow \mathbb {Q}_{\ge 0}\), the canonical p-cause \(\Theta \) from Proposition 6 is \({\text {expcost}}\)-minimal. The value \({\text {expcost}}^{\min }\) can then be computed in polynomial time.

  2. (2)

    For an arbitrary weight function \(c:S \rightarrow \mathbb {Q}\), an \({\text {expcost}}\)-minimal and state-based p-cause \(\Pi \) and \({\text {expcost}}^{\min }\) can be computed in polynomial time.

Proof

For the following, let \({\text {Cyl}}(\Pi ) = \bigcup _{\pi \in \Pi } {\text {Cyl}}(\pi )\) for \(\Pi \subseteq {\text {Paths}}_{{\text {fin}}}(M)\).

To show (1), we claim that for two p-causes \(\Pi \) and \(\Phi \) with \(\Pi \preceq \Phi \) we have \({\text {expcost}}(\Pi ) \le {\text {expcost}}(\Phi )\). Let \(\mathcal {X}_\Pi \) and \(\mathcal {X}_\Phi \) be the corresponding random variables of \({\text {expcost}}(\Pi )\) and \({\text {expcost}}(\Phi )\), respectively. We prove \({\text {expcost}}(\Pi ) \le {\text {expcost}}(\Phi )\) by showing for almost all \(\pi \in {\text {Paths}}(M)\) that \(\mathcal {X}_\Pi (\pi ) \le \mathcal {X}_\Phi (\pi )\). As p-causes only need prefixes for almost every path \(\pi \in {\text {Paths}}_M(\mathcal {L}\)), we ignore paths of probability 0 in the argument below.

From \(\Pi \preceq \Phi \) it follows that \({\text {Cyl}}(\Pi ) \supseteq {\text {Cyl}}(\Phi )\). We now deal separately with the three cases obtained by the partition

$$\begin{aligned}&{\text {Paths}}(M) = \\&{\text {Cyl}}(\Phi ) \ {\dot{\cup }} \ {\text {Cyl}}(\Pi ) \backslash {\text {Cyl}}(\Phi ) \ {\dot{\cup }} \ {\text {Paths}}(M) \backslash {\text {Cyl}}(\Pi ). \end{aligned}$$

For \(s_0s_1 \ldots \in {\text {Cyl}}(\Phi )\) both random variables consider the unique prefix of \(s_0s_1 \ldots \) in the respective p-cause. For any \(s_0 \ldots s_n \in \Phi \) we know by definition of \(\preceq \) that there is \(s_0 \ldots s_m \in \Pi \) with \(m \le n\). Therefore, we have

$$\begin{aligned}&\mathcal {X}_\Pi (s_0s_1 \ldots )=c(s_0 \ldots s_m) \\ \le&c(s_0 \ldots s_n)= \mathcal {X}_\Phi (s_0s_1 \ldots ). \end{aligned}$$
(*)

For \(s_0s_1 \ldots \in {\text {Cyl}}(\Pi ) \backslash {\text {Cyl}}(\Phi )\) the random variable \(\mathcal {X}_\Pi \) takes the unique prefix \(s_0\ldots s_m\) in \(\Pi \), whereas \(\mathcal {X}_\Phi \) takes the prefix \(s_0 \ldots s_n\) such that \(s_n = safe \). Since \(\text {Pr}_{ safe }(\lozenge fail ) = 0\), we have almost surely that \(m<n\), and therefore (*) holds. For \(s_0s_1 \ldots \in {\text {Paths}}(M) \backslash {\text {Cyl}}(\Pi )\) both random variables evaluate the same path and so \(\mathcal {X}_\Pi (s_0s_1 \ldots )= \mathcal {X}_\Phi (s_0s_1 \ldots )\). Thus, we have for any \(\pi \in {\text {Paths}}(M)\) that \(\mathcal {X}_\Pi (\pi ) \le \mathcal {X}_\Phi (\pi )\).

To compute \({\text {expcost}}^{\min }={\text {expcost}}(\Theta )\) consider the DTMC M but change the transitions such that every state in \(S_p\) is terminal. The value \({\text {expcost}}(\Theta )\) is then the expected reward \({\text {ExpRew}}(s_0 \models \lozenge S_p \cup \{ safe \})\) defined in [4, Definition 10.71]. Expected rewards in DTMCs can be computed in polynomial time via a classical linear equation system [4, Sect. 10.5.1].

We show (2) by reducing the problem of finding an \({\text {expcost}}\)-minimal p-cause to the stochastic shortest path problem (SSP) [7] in \(\mathcal {C}_p(M)\). For a scheduler \(\mathfrak {S}\) of \(\mathcal {C}_p(M)\) denote the corresponding p-cause for \(\lozenge fail \) in M by \(\Pi _{\mathfrak {S}}\) (cf. Lemma 9). In [7] the authors define the value expected cost of a scheduler \(\mathfrak {S}\) in an MDP. This value with respect to the target set \(\{ fail , safe \}\) coincides with \({\text {expcost}}(\Pi _{\mathfrak {S}})\).

The SSP asks to find a scheduler \(\mathfrak {S}^*\) in \(\mathcal {C}_p(M)\) minimizing the expected cost. It is shown in [7] that a memoryless such scheduler \(\mathfrak {S}^*\) and the value \({\text {expcost}}^{\min }\) can be computed in polynomial time. Since the scheduler \(\mathfrak {S}^*\) is memoryless, it corresponds to an \({\text {expcost}}\)-minimal state-based p-cause \(\Pi \)\(\square \)

Partial expected cost of a p-cause

In this section we study a variant of the expected cost where paths that have no prefix in the p-cause are attributed zero costs. A use case for this cost mechanism arises in a possible monitoring application when the costs are incurred by the countermeasures taken upon triggering the alarm. For example, an alarm might be followed by a downtime of the system and certain procedures must be initiated. Then the corresponding cost may depend on the current state and history of the execution. Naturally in such cases there are no costs incurred if no alarm is triggered.

Definition 16

(Partial expected cost). For a p-cause \(\Pi \) for \(\lozenge fail \) in M consider the random variable \(\mathcal {X}: {\text {Paths}}(M) \rightarrow \mathbb {Q}\) with

$$\begin{aligned} \mathcal {X}(\pi )= {\left\{ \begin{array}{ll} c({\hat{\pi }})&{} \text {for } {\hat{\pi }} \in \Pi \cap {\text {Pref}}(\pi ) \text { if it exists} \\ 0 &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

The partial expected cost \({\text {pexpcost}}(\Pi )\) of \(\Pi \) is the expected value of \(\mathcal {X}\).

Whereas the canonical p-cause is the minimal p-cause with respect to the expected costs of a nonnegative cost function (see statement (1) of Theorem 15), this does not hold in general for partial expected costs, as the following example shows.

Fig. 4
figure 4

An example showing that the partial expected cost is not monotonous on p-causes

Example 17

Consider the Markov chain depicted in Fig. 4. For the probability threshold \(p=1/2\) we have \(S_p = \{t, u, fail \}\). The canonical p-cause is \(\Theta = \{s_0^k\, t\mid k \ge 1 \}\) with \({\text {pexpcost}}(\Theta )= \sum _{k \ge 1} (1/4)^k \cdot k = 4/9\). Consider now an arbitrary p-cause \(\Pi \) for \(\lozenge fail \). If the path \(s_0^\ell \, t\) belongs to \(\Pi \), then it contributes \((1/4)^\ell \cdot \ell \) to \({\text {pexpcost}}(\Pi )\). If instead the paths \(s_0^\ell \, t \, fail \) and \(s_0^\ell \, t \, u\, fail \) belong to \(\Pi \), they contribute \((1/4)^\ell \cdot 1/2 \cdot \ell +(1/4)^\ell \cdot 1/2 \cdot 3/4 \cdot (\ell +w)\). So, the latter case provides a smaller \({\text {pexpcost}}\) if \(l > 3 w\), and the \({\text {pexpcost}}\)-minimal p-cause is therefore

$$\begin{aligned} \Pi = \{s_0^k t \mid 1 \le k \le 3 w \} \cup \{s_0^k t fail , s_0^k t u \mid 3 w < k \} . \end{aligned}$$

For \(w=1\), the expected cost of this p-cause is \(511/1152=4/9-1/1152\). So, it is indeed smaller than \({\text {pexpcost}}(\Theta )\).

Theorem 18

Given a non-negative weight function \(c :S \rightarrow \mathbb {Q}_{\ge 0}\), a \({\text {pexpcost}}\)-minimal and threshold-based p-cause \(\Pi \), and the value \({\text {pexpcost}}^{\min }\), can be computed in exponential time. The p-cause \(\Pi \) has a polynomially bounded representation.

Proof

Consider a scheduler \(\mathfrak {S}\) of \(\mathcal {C}_p(M)\) with weight function c and recall that \(\mathcal {C}_p(M)_\mathfrak {S}\) denotes the Markov chain induced by \(\mathfrak {S}\). Define the random variable \(\oplus _\mathfrak {S} fail : {\text {Paths}}(\mathcal {C}_p(M)_\mathfrak {S}) \rightarrow \mathbb {Q}\)

$$\begin{aligned} \oplus _\mathfrak {S} fail (\pi )= {\left\{ \begin{array}{ll}c(\pi )&{} \text {if } \pi \in \lozenge fail \\ 0 &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$

The partial expectation \({\mathbb{PE}\mathbb{}}^{\mathfrak {S}}\) of a scheduler \(\mathfrak {S}\) in \(\mathcal {C}_p(M)\) is defined as the expected value of \(\oplus _\mathfrak {S} fail \). The minimal partial expectation is \({\mathbb{PE}\mathbb{}}^{\min }= \inf _{\mathfrak {S}} {\mathbb{PE}\mathbb{}}^{\mathfrak {S}}\). It is known that there is a scheduler obtaining the minimal partial expectation [44].

Then a p-cause \(\Pi \) and the corresponding scheduler \(\mathfrak {S}_{\Pi }\) satisfy \({\text {pexpcost}}(\Pi )= \mathbb{PE}\mathbb{}^{\mathfrak {S}_{\Pi }}\). A cost-minimal scheduler for the partial expectation in an MDP with non-negative weights can be computed in exponential time by [44]. In this process we also compute \({\text {pexpcost}}^{\min }={\mathbb{PE}\mathbb{}}^{\min }\).

Furthermore, we can show that once the action continue is optimal in a state \(s \in S_p\) with accumulated weight w, it is also optimal for all weights \(w'>w\): Suppose choosing continue is optimal in some state s if the accumulated weight is w. Let E be the partial expected accumulated weight that the optimal scheduler collects from then on and let \(q = \text {Pr}_{s}(\lozenge fail )\). The optimality of continue implies that \(E+w\cdot q \le w\). For all \(w^\prime >w\), this implies \(E+w^\prime \cdot q \le w^\prime \) as well. We conclude the existence of \(T: S \rightarrow \mathbb {Q}\) such that continue is optimal if and only if the accumulated weight is at least T(s). If pick is not enabled in a state s, we have \(T(s) = 0\). Therefore \(\Pi \) is a threshold-based p-cause defined by T. As shown in [44], there is a saturation point K such that schedulers minimizing the partial expectation can be chosen to behave memoryless as soon as the accumulated weight exceeds K. This means that T(s) can be chosen to be either at most K or \(\infty \) for each state s. The saturation point K and hence all thresholds T(s) have a polynomial representation. \(\square \)

Since the causal MDP \(\mathcal {C}_p(M)\) has a comparatively simple form, one could expect that one can do better than the pseudo-polynomial algorithm obtained by reduction to [44]. Nevertheless, in the remainder of this section we argue that computing a \({\text {pexpcost}}\)-minimal p-cause is computationally hard, in contrast to \({\text {expcost}}\) (cf. Theorem 15). For this we recall that the complexity class \(\mathtt {PP}\) [26] is characterized as the class of languages \(\mathcal {L}\) that have a probabilistic polynomial-time bounded Turing machine \(M_{\mathcal {L}}\) such that for all words \(\tau \) one has \(\tau \in \mathcal {L}\) if and only if \(M_{\mathcal {L}}\) accepts \(\tau \) with probability at least 1/2 (cf. [29]). We will use polynomial Turing reductions, which, in contrast to many-one reductions, allow querying an oracle that solves the problem we reduce to a polynomial number of times. A polynomial time algorithm for a problem that is \(\mathtt {PP}\)-hard under polynomial Turing reductions would imply that the polynomial hierarchy collapses [49]. We reduce the \(\mathtt {PP}\)-complete cost-problem stated in [28, Theorem 3] to the problem of computing \({\text {pexpcost}}^{\min }\).

Fig. 5
figure 5

The DTMCs \(N_i\) for \(i=0,1\)

Theorem 19

Given an acyclic DTMC M, a weight function \(c :S \rightarrow \mathbb {N}\) and a rational \(\vartheta \in \mathbb {Q}\), deciding whether \({\text {pexpcost}}^{\min }\le \vartheta \) is \(\mathtt {PP}\)-hard under Turing reductions.

Proof

We provide a Turing reduction from the following problem that is shown to be \(\mathtt {PP}\)-hard in [28]: Given an acyclic DTMC M over state space S with initial state s, absorbing state t such that \(\text {Pr}_{s}(\lozenge t)=1\), weight function \(c:S \rightarrow \mathbb {N}\) and natural number \(R \in \mathbb {N}\), decide whether

$$\begin{aligned} \text {Pr}_M(\{\varphi \in {\text {Paths}}(M) \mid c(\pi ) \le R\}) \ge \frac{1}{2}. \end{aligned}$$

Given such an acyclic DTMC M we construct the two DTMCs \(N_0\) and \(N_1\) depicted in Fig. 5. We consider the \({\text {pexpcost}}\)-minimal p-causes for \(p=1/2\) in \(N_i\) for \(\lozenge fail \) and \(i \in \{0,1\}\). Suppose a path \(\pi \) arrives at state \(c_i\) with probability \(\text {Pr}_s(\pi )\) and accumulated weight w. We have to decide whether the path \(\pi \) or the extension \(\pi ^\prime =\pi b_i fail \) should be included in the cost-minimal p-cause. The path \(\pi ^\prime \) has weight \(w+R+i\) and probability \(\text {Pr}_s(\pi )/2\). We observe that \(\pi \) is the optimal choice if

$$\begin{aligned} \text {Pr}_s(\pi )\cdot w\le \text {Pr}_s(\pi )\cdot \frac{w+R+i}{2}. \end{aligned}$$

This is the case if and only if \(w\le R+i\). If \(i=1\), and \(w=R+i\), both choices are equally good and we decide to include the path \(\pi \) in this case. Hence, the \({\text {pexpcost}}\)-minimal p-causes for \(p=1/2\) in \(N_0\) is

$$\begin{aligned}&\Pi _0=\{{\hat{\pi }} \in {\text {Paths}}_{{\text {fin}}}(N_0) \mid {\text {last}}({\hat{\pi }}) = c_0 \text { and } c({\hat{\pi }}) \le R\} \cup \\&\{{\hat{\pi }} \in {\text {Paths}}_{{\text {fin}}}(N_0) \mid {\text {last}}({\hat{\pi }}) = fail \text { and } c({\hat{\pi }}) > 2R\}. \end{aligned}$$

Similarly in \(N_1\) the following p-cause is \({\text {pexpcost}}\)-minimal:

$$\begin{aligned}&\Pi _1=\{{\hat{\pi }} \in {\text {Paths}}_{{\text {fin}}}(N_1) \mid {\text {last}}({\hat{\pi }}) = c_1\text { and } c({\hat{\pi }}) \le R\} \cup \\&\{{\hat{\pi }} \in {\text {Paths}}_{{\text {fin}}}(N_1) \mid {\text {last}}({\hat{\pi }}) = fail \text { and } c({\hat{\pi }}) > 2R+1\}. \end{aligned}$$

Therefore, we have

We conclude that

$$\begin{aligned}&{\text {pexpcost}}_{N_0}^{\min } - {\text {pexpcost}}_{N_1}^{\min } \\&\quad ={\text {pexpcost}}_{N_0}(\Pi _0)-{\text {pexpcost}}_{N_1}(\Pi _1) \\&\quad = \frac{1}{6}\cdot \text {Pr}_M(\{\varphi \in {\text {Paths}}(M) \mid c(\varphi ) > R\})\\&\quad = \frac{1}{6}\cdot (1-\text {Pr}_M(\{\varphi \in {\text {Paths}}(M) \mid c(\varphi ) \le R\})) \end{aligned}$$

In the sequel we prove that we can use an oracle for the threshold problem for \({\text {pexpcost}}^{\min }\) to compute the values

$$\begin{aligned} {\text {pexpcost}}_{N_0}^{\min } \;\; \text {and} \;\; {\text {pexpcost}}_{N_1}^{\min }. \end{aligned}$$

This in turn allows us to compute \(\frac{1}{6}\text {Pr}_M(\{{\hat{\pi }} \in {\text {Paths}}(M) \mid c({\hat{\pi }}) \le R\})\) and thus to decide the problem from [28].

In any acyclic Markov chain K, the following holds: We assume that the transition probabilities are encoded as fractions of coprime integers. Therefore, we can compute the product of all denominators to get a number L in polynomial time and with polynomially bounded encoding. The maximal weight W of a path in K can be computed in linear time and has a polynomially bounded encoding. Therefore, the minimal \({\text {pexpcost}}\) is an integer multiple of 1/L and there are at most \(W \cdot L\) many different values for \({\text {pexpcost}}\). This in particular applies to the value of the \({\text {pexpcost}}\)-optimal p-cause.

Note that there are still exponentially many possible values for the minimal partial expected costs in both Markov chains \(N_i\). A binary search over all possible values with polynomially many applications of the threshold problem:

$$\begin{aligned} \text {Is } {\text {pexpcost}}^{\min } \le \vartheta \text { for a rational } \vartheta \in \mathbb {Q}? \end{aligned}$$

is possible, nevertheless. We therefore can apply this binary search to find the exact value \({\text {pexpcost}}_{N_i}(\Pi )\) for both DTMCs \(N_i (i=0,1)\) by solving polynomially many instances of the threshold problem. This gives us a polynomial Turing reduction to the problem stated in [28]. \(\square \)

Maximal cost of a p-cause

In practice, the weight function on the Markov chain potentially models resources for which the available amount might have a tight upper bound. For example, the amount of energy a drone can consume from its battery is naturally limited. Instead of just knowing that on average the resource consumption will lie below a given bound, it is therefore often desirable to guarantee to be below this limit for (almost) any evolution of the system.

Definition 20

(Maximal cost). Let \(\Pi \) be a p-cause for \(\lozenge fail \) in M. We define the maximal cost of \(\Pi \) to be

$$\begin{aligned} {\text {maxcost}}(\Pi )&= \sup \{c({\hat{\pi }}) \mid {\hat{\pi }} \in \Pi \}. \end{aligned}$$

We now turn to the algorithmic problem of computing \({\text {maxcost}}\)-minimal p-causes. In case that the cost function is non-negative, we find that again the canonical p-cause is optimal. We start with an observation that if certain, positively weighted, cycles exist, then the \({\text {maxcost}}\) of all p-causes is infinite.

Lemma 21

We have \({\text {maxcost}}(\Pi ) = \infty \) for all p-causes \(\Pi \) of M if and only if a path \(s_0 \ldots s_n \in (S \setminus S_p)^*\) exists in M which contains a cycle with positive weight. This condition can be checked in polynomial time.

Proof

Consider a such a positive cycle reachable in \(S \backslash S_p\). Then there are paths in \(\lozenge fail \) which contain this cycle arbitrarily often. For any p-cause \(\Pi \) almost all of these paths need a prefix in \(\Pi \). Since no state in the positive cycle nor in the path from \(s_0\) to the cycle is in \(S_p\), such prefixes also contain the cycle arbitrarily often. This means these prefixes accumulate the positive weight of the cycle arbitrarily often. Therefore, all p-causes \(\Pi \) contain paths with arbitrary high weight, and thereby \({\text {maxcost}}(\Pi ) = \infty \).

For the other direction, assume that \({\text {maxcost}}(\Pi ) = \infty \) for all p-causes \(\Pi \) of M. So, in particular for the canonical p-cause \(\Theta \), \({\text {maxcost}} (\Theta )=\infty \). So, arbitrarily high amounts of weights can be accumulated along paths not visiting \(S_p\) implying the existence of a positive cycle reachable in \(S\setminus S_p\).

The condition can be checked in polynomial time with the Bellman-Ford algorithm [16]. \(\square \)

Proposition 22

For a nonnegative cost function c, the canonical p-cause \(\Theta \) is \({\text {maxcost}}\)-minimal and \({\text {maxcost}}(\Theta )\) can be computed in polynomial time.

Proof

We show that for two p-causes \(\Pi , \Phi \) with \(\Pi \preceq \Phi \) we have \({\text {maxcost}}(\Pi ) \le {\text {maxcost}}(\Phi )\). Let \({\hat{\pi }}\in \Pi \) be arbitrary. Since \({\hat{\pi }}\) is a p-critical prefix (and \(p>0\)), there is a path \(\pi \in \lozenge fail \) with \( {\hat{\pi }} \in {\text {Pref}}(\pi )\). Since \(\Phi \) is a p-cause, there exists \({\hat{\varphi }} \in \Phi \cap {\text {Pref}}(\pi )\). The assumption \(\Pi \preceq \Phi \) and the fact that both \(\Pi \) and \(\Phi \) are prefix-free then force \({\hat{\pi }}\) to be a prefix of \({\hat{\varphi }}\). Hence \(c({\hat{\pi }}) \le c({\hat{\varphi }})\), and since \({\hat{\pi }}\) was arbitrary, it follows that \({\text {maxcost}}(\Pi ) \le {\text {maxcost}}(\Phi )\). This implies that \(\Theta \) is \({\text {maxcost}}\)-minimal.

To compute \({\text {maxcost}}^{\min } = {\text {maxcost}}(\Theta )\), we may assume that the condition of Lemma 21 does not apply. The problem is reduced to the computation of the longest path in a modified version of M. There might be cycles with weight 0 included in \(S \backslash S_p\), but in such cycles every state in the cycle has weight 0. Therefore, we can collapse such cycles completely without changing the value \({\text {maxcost}}(\Theta )\). We further collapse the set \(S_p\) into one absorbing state f. Computing \({\text {maxcost}}(\Theta )\) now amounts to searching for a longest path from \(s_0\) to f in this modified weighted directed acyclic graph. This can be reduced in linear time to the shortest path problem (by multiplying all weights with \(-1\)), which can in turn be solved in polynomial time [16]. \(\square \)

If the considered cost function is not non-negative, then the canonical p-cause may no longer be optimal. In this case we propose a reduction to a kind of total payoff two player game with an additional reachability requirement for one player, such as described in [9]. The game is played by the two players Min and Max over the arena defined by:

$$\begin{aligned} \mathcal {A} = (V, v_0, V_{Max}, V_{Min}, E), \end{aligned}$$

where \(V= S \,{\dot{\cup }}\,\dot{S}_p, v_0 = s_0, V_{Max}=S, \) and \(V_{Min}=\dot{S}_p\). Here \(\dot{S}_p\) is a copy of \(S_p\), and the edge relation E contains exactly the pairs \((s,t) \in V \times V\) such that one of the following holds:

  • \(s \in S ,t \in S \backslash S_p\), and \({\textbf {P}}(s,t)>0\),

  • \(s \in S, t \in \dot{S}_p\), and for \(u\in S_p\) with \(t=\dot{u}\) we have \({\textbf {P}}(s,u)>0\),

  • \(s \in \dot{S}_p, t \in S_p\), and \(s = \dot{t}\), or

  • \(s \in \dot{S}_p\) and \(t= fail \).

Player Max controls vertices in \(V_{Max}\), and Min controls vertices in \(V_{Min}\). This way Min can choose to go to \( fail \) whenever a state \(s \in S_p\) would be reached. Additionally, we equip \(\mathcal {A}\) with a weight function \(w:V \rightarrow \mathbb {Q}\) which mirrors the weight function c of M in the following way:

  • \(w(s) = c(s)\) for \(s \in S \backslash S_p\), and

  • \(w(s) = 0\) for \(s \in S_p\), and

  • \(w(\dot{s}) = c(s)\) for \(s \in S_p\).

The game proceeds like a standard two player graph game. An infinite play \(\pi = v_0 v_1 \ldots \in V^\omega \) has value \(- \infty \) if \(\pi \) avoids \( fail \) forever, and otherwise \(\sum _{i = 0}^{n} w(v_i)\), where n is the least position such that \(v_n = fail \) holds. As Max wants to maximize the outcome of the game, they will try to make sure that \( fail \) is reached eventually.

By multiplying all values with \(-1\), interchanging the roles of Max and Min as well as giving paths not reaching \( fail \) the value \(+\infty \) the game can be reduced to a min-cost reachability game as in [9]. Therefore, the game is determined [9] and the value Val(v) of a vertex is defined to be

$$\begin{aligned} Val(v) = \inf _{\mathfrak {S}_{Min}} \sup _{\mathfrak {S}_{Max}} Val(v,\mathfrak {S}_{Max},\mathfrak {S}_{Min}), \end{aligned}$$

where \(\mathfrak {S}_{Max}\) ranges over all strategies of Max, \(\mathfrak {S}_{Min}\) ranges over all strategies of Min and \(Val(v,\mathfrak {S}_{Max},\mathfrak {S}_{Min})\) is the value of the play emanating from v under these strategies. By [9, Theorem 1], player Min always has an optimal memoryless strategy. The point of the reduction is that strategies of Min are in one-to-one correspondence with p-causes in M, as the following lemma shows.

Lemma 23

There is a one-to-one correspondence between strategies \(\mathfrak {S}\) of Min and p-causes in M.

Proof

Let \(\mathfrak {S}\) be a strategy for Min and consider the set of consistent plays starting in \(s_0\) that we denote by \(Plays_{s_0}(\mathcal {A},\mathfrak {S})\). Every play \(\pi \in Plays_{s_0}(\mathcal {A}, \mathfrak {S})\) that reaches \( fail \) corresponds to a p-critical prefix. To see this, omit all states contained in \( \dot{S}_p\) from the play. If \(\pi \) reaches \( fail \) and the last state before \( fail \) is in \(\dot{S}_p\) then omit \( fail \) as well. The resulting path in M is a p-critical prefix. Let \(\Pi \subseteq {\text {Paths}}_{{\text {fin}}}(M)\) be the set of p-critical prefixes obtained in this way from plays in \(Plays_{s_0}(\mathcal {A},\mathfrak {S})\) reaching \( fail \).

To see that any path \(\pi \) to \( fail \) in M has a prefix in \(\Pi \), let \(\tau \) be the strategy of player Max that moves along the steps of \(\pi \). In the resulting play, either player Min moves to \( fail \) from some state \(s\in \dot{S}_p\) according to \(\mathfrak {S}\) and the corresponding prefix of \(\pi \) is in \(\Pi \), or the play reaches \( fail \) from a state not in \(\dot{S}_p\) and hence the path \(\pi \) itself belongs to \(\Pi \). Since the strategy has to make decisions for every \(s \in \dot{S}_p\), every path \(\pi \in {\text {Paths}}_M(\lozenge fail )\) has a prefix in \(\Pi \). \(\Pi \) is prefix-free since a violation of this property would correspond to a non-consistent play, since Min can only choose the edge to \( fail \) once. Therefore, \(\Pi \) is a p-cause in M for \(\lozenge fail \).

Since the reverse of this construction follows along completely analogous lines, we omit it here. \(\square \)

We have \(Val(s_0) = \infty \) if and only if there is a positive cycle reachable in \(S \backslash S_p\), as in that case Max can move along that cycle arbitrarily long, and then take a path to \( fail \), and otherwise Min can ensure a finite value by moving to \( fail \) as soon as possible. If \(Val(s_0)<\infty \), then both players have an optimal strategy by [9].

Lemma 24

We have \({\text {maxcost}}^{\min }=Val(s_0)\).

Proof

We have that \(Val(s_0)\) takes the value \(\infty \) if and only if there is a positive cycle reachable in \(S \backslash S_p\) if and only if \({\text {maxcost}}^{\min }=\infty \) by Lemma 21 and the discussion above.

Now assume that \(Val(s_0)<\infty \) and \({\text {maxcost}}^{\min }<\infty \). First, we show that \({\text {maxcost}}^{\min }\le Val(s_0)\). Let \(\mathfrak {S}\) be the optimal strategy for Min and \(\Pi _{\mathfrak {S}}\) the corresponding p-cause.

Suppose there is a path \(\pi \) in \(\Pi _{\mathfrak {S}}\) with \({\text {wgt}}(\pi )> Val(s_0)\). Using this path as a strategy for Max, we observe that \(Val(s_0,\mathfrak {S},\pi )={\text {wgt}}(\pi )\) because \(\mathfrak {S}\) moves to fail precisely when a path in \(\Pi _{\mathfrak {S}}\) has been played. This contradicts the optimality of \(\mathfrak {S}\). Hence, \({\text {maxcost}}^{\min }\le {\text {maxcost}}(\Pi _{\mathfrak {S}}) \le Val(s_0)\).

To show that \({\text {maxcost}}^{\min }\ge Val(s_0)\), let \(\Pi \) be a p-cause with \({\text {maxcost}}(\Pi )={\text {maxcost}}^{\min }\) which exists as \({\text {maxcost}}^{\min }<\infty \). Let \(\mathfrak {S}_{\Pi }\) be the corresponding strategy for Min. Suppose that there is a strategy \(\mathfrak {T}\) for Max with \(Val(s_0,\mathfrak {S}_{\Pi },\mathfrak {T})>{\text {maxcost}}^{\min }\). Strategy \(\mathfrak {T}\) induces a path \(\pi \) in \(\Pi \) and again, we observe that \({\text {wgt}}(\pi )=Val(s_0,\mathfrak {S}_{\Pi },\mathfrak {T})\) contradicting the optimality of \(\Pi \).

Thus \({\text {maxcost}}^{\min }\ge \sup _{\mathfrak {T}} Val(s_0,\mathfrak {S}_{\Pi },\mathfrak {T}) \ge Val(s_0)\) where \(\mathfrak {T}\) ranges over all strategies for Max. \(\square \)

With the above lemmas, and using results from [9], we are now in a position to prove the following upper bounds for computing \({\text {maxcost}}^{\min }\).

Theorem 25

  1. (1)

    For an arbitrary weight function \(c:S \rightarrow \mathbb {Q}\) a \({\text {maxcost}}\)-minimal and state-based p-cause \(\Pi \) and \({\text {maxcost}}^{\min }\) can be computed in pseudo-polynomial time.

  2. (2)

    Given a rational \(\vartheta \in \mathbb {Q}\), deciding whether \({\text {maxcost}}^{\min } \le \vartheta \) is in \(\mathtt {NP} \cap \mathtt {coNP}\).

Proof

  1. (1.)

    Using the correspondence between max-cost and min-cost games, the values \(Val(s_0)\) can be computed in pseudopolynomial time by applying [9, Algorithm 1]. In particular, this lets us compute \(Val(s_0) = {\text {maxcost}}^{\min }\). From the values of other states, an optimal memoryless strategy \(\mathfrak {S}\) for Min can be derived by fixing a successor \(s^\prime \) for each state \(s\in V_{Min}\) with \((s,s^\prime )\in E\) and

    $$\begin{aligned} Val(s)=c(s^\prime )+ Val(s') \end{aligned}$$

    as shown in [9]. Since the strategy is memoryless, we get a set \(Q \subseteq \dot{S}_p\) for which Min chooses the edge to \( fail \). By the construction from before the \({\text {maxcost}}\)-minimal p-cause \(\Pi \) obtained in this way is state-based.

  2. (2.)

    The decision problem “Is \(Val(s_0) \le 0\)?” is in \(\mathtt {NP}\cap \mathtt {coNP}\) by a reduction to mean-payoff games, as shown in [12]. The reduction introduces an edge from \( fail \) back to \(s_0\) and removes the state \( safe \) from the game. We have that \(Val(s_0) \le 0\) in the original max-cost reachability game if and only if the value of the constructed mean-payoff game is at most 0. The reason is that the value of the mean-payoff game is at most 0 if there is a strategy for Min such that Max can neither force a positive cycle in the original max-cost reachability game nor reach \( fail \) with positive weight in the original game. We adapt the construction to show that the decision problem “Is \({\text {maxcost}}^{\min } \le \vartheta \)?”, for \(\vartheta \in \mathbb {Q}\), is also in \(\mathtt {NP}\cap \mathtt {coNP}\). This can be achieved by adding an additional vertex s with weight \(-\vartheta \) to \(V_{Max}\), removing the edge between \( fail \) and \(s_0\) and adding two new edges, one from \( fail \) to s and one from s to \(s_0\). The value of the resulting mean-payoff game is then at most 0 if there is a strategy for Min such that Max can neither force a positive cycle in the original max-cost reachability game nor reach \( fail \) with weight above \(\vartheta \) in the original game. \(\square \)

Instantaneous cost

We get another perspective onto the introduced cost mechanisms by letting the given weight function c on states induce an instantaneous weight function instead of accumulating the weights. Concretely the induced cost function on M is then given by \(c_{{\text {inst}}}:{\text {Paths}}_{{\text {fin}}}(M) \rightarrow \mathbb {Q}\) which just takes the weight of the state visited last, i.e., \(c_{{\text {inst}}}(s_0 \cdots s_n)=c(s_n)\). This yields an alternative cost mechanism intended to model situations where the cost of repairing or rebooting only depends on the current state, e.g., the altitude an automated drone has reached.

We add the subscript ‘\({{\text {inst}}}\)’ to the three cost variants, where the accumulative weight function c has been replaced with the instantaneous weight function \(c_{{\text {inst}}}\), the error state is replaced by an error set E and the safe state is replaced by a set of terminal safe states F. Thus we optimize p-causes for \(\lozenge E\) in M.

Theorem 26

For \({\text {expcost}}_{{\text {inst}}}\), \({\text {pexpcost}}_{{\text {inst}}}\), and \({\text {maxcost}}_{{\text {inst}}}\) a cost-minimal p-cause \(\Pi \) and the value of the minimal cost can be computed in time polynomial in M. In all cases \(\Pi \) can be chosen to be a state-based p-cause.

Proof

Recall that we now work with a set of error states E instead of a single state \( fail \) and a set of terminal safe states F such that \(\text {Pr}_{f}(\Diamond E)=0\) if and only if \(f\in F\). We first note that for an instantaneous weight function we reduce partial expected cost to expected cost and therefore only need to consider one case: Given a weight function \(c:S\rightarrow \mathbb {Q}\), consider the weight function \(c'\) obtained from c by forcing \(c'(f)=0\) for all \(f\in F\). Then the partial expected cost with respect to \(c_{{\text {inst}}}\) equals the expected cost with respect to \(c'_{{\text {inst}}}\).

For \({\text {expcost}}_{{\text {inst}}}\) we construct an MDP \(\mathcal {N}=(S \,{\dot{\cup }}\, \dot{S}_p, \{pick, continue\}, s_0, {\textbf {P}}')\), where \(\dot{S}_p,\) is a disjoint copy of \(S_p\) in which all states are terminal. The copy of state \(s\in S_p\) in \(\dot{S}_p\) will be written as \(\dot{s}\). We define \({\textbf {P}}'(s,continue,s') = {\textbf {P}}(s,s')\) for \(s,s' \in S\), and \({\textbf {P}}'(s,pick,\dot{s}) = 1\) for \(s \in S_p\). The action pick is not enabled states outside of \(S_p\), and the action continue is not enabled in \(\dot{S}_p\). We define the weight function \(c_\mathcal {N}:S\, {\dot{\cup }} \,\dot{S}_p \rightarrow \mathbb {Q}\) by

$$\begin{aligned} c_\mathcal {N}(s)= {\left\{ \begin{array}{ll} c(s)&{} \text {if }s \in F \cup E \\ c(t)&{} \text {if } s \in \dot{S}_p \text { and } \dot{t} = s \\ 0&{} \text {else.} \end{array}\right. } \end{aligned}$$

The construction is illustrated in Figs. 6 and 7: Consider the DTMC \(M^\prime \) depicted in Fig. 6. The transition probabilities are omitted. It is enough to know \(S_p^\prime = \{s_0,s_1 \}\) and \(F^\prime = \{ safe \}\). The constructed MDP \(\mathcal {N}^\prime \) can be seen in Fig. 7, where the black edges are inherited from \(M^\prime \), and the red edges are added transitions belonging to the action pick.

Fig. 6
figure 6

The DTMC \(M^\prime \) with instantaneous weight

Fig. 7
figure 7

The MDP \(\mathcal {N}^\prime \) emulating instantaneous weight

For the constructed MDP \(\mathcal {N}\) we consider the accumulated weight function. This emulates an instantaneous weight function for the random variable \(\mathcal {X}_c\) of \({\text {expcost}}_{{\text {inst}}}\). A scheduler of this MDP corresponds to a p-cause for M in the same way as established in Lemma 9. Therefore, the problem of finding an \({\text {expcost}}\)-minimal p-cause for instantaneous weight \(c:S \rightarrow \mathbb {Q}\) in M for \(\lozenge E\) is equivalent to finding a cost-minimal scheduler in \(\mathcal {N}\) for \(\lozenge (E \cup F \cup \dot{S}_p)\). This is again the stochastic shortest path problem for \(\mathcal {N}\), which can be solved in polynomial time by [7]. Since the SSP is solved by a memoryless scheduler, the \({\text {expcost}}_{{\text {inst}}}\)-minimal p-cause is state-based.

For the computation of the minimal value of \({\text {maxcost}}_{{\text {inst}}}\), we enumerate the set \(S_p\) as \(s^0,\dots ,s^k\) where \(k=|S_p |-1\) such that \(c(s^i) \le c(s^j)\) for all \(0\le i < j \le k\).

Now we iteratively remove states in increasing order starting with \(s^0\). After removing a state \(s^i\), we check whether E is reachable in the resulting Markov chain. If this is the case, we continue by removing the next state. If E is not reachable anymore, the set \(S^i:=\{s^0,\dots ,s^i\}\) induces a state-based p-cause \(\Pi _{S^i}\). This follows from the fact that each path from the initial state to E contains a state in \(S^i\) and that \(S^i\subseteq S_p\). Furthermore, \({\text {maxcost}}_{{\text {inst}}} (\Pi _{S^i})\le c(s^i)\). Let j be the largest number less than i such that \(c(s^j)<c(s^i)\). There is no p-cause in which all paths end in \(\{s^0,\dots , s^j\}\) as E was still reachable after removing these states from M. So, there is no p-cause \(\Pi \) with \({\text {maxcost}}_{{\text {inst}}} (\Pi )<c(s^i)\). Therefore, \(\Pi _{S^i}\) is indeed a \({\text {maxcost}}_{{\text {inst}}}\)-minimal p-cause. Since \(E\subseteq S_p\), the procedure terminates at the latest when the states in E are removed. Hence the algorithm finds a state-based \({\text {maxcost}}_{{\text {inst}}}\)-minimal p-cause in polynomial time. \(\square \)

Trace-p-causes

The definition of p-causes discussed so far works on the level of paths of a DTMC. Implicitly, it hence makes the assumption that the states of the DTMC themselves are observable. In this section, we want to discuss a generalization in which the system is not fully observable. This can be modeled by adding a labeling function which assigns to each state in the Markov chain a label from a finite alphabet \(\Sigma \). Instead of paths through the state space, only the traces of these paths, i.e., the sequences of labels encountered can be observed.

Such a consideration makes sense in applications where there is no full introspection into the system but there is only partial information about the system state that can be gathered. For example, in the discussed monitoring application for p-causes, a critical event should be avoided (e.g., a drone crashing), but full information on the system state might not be available, e.g., only sensor readings of the drone might be observable instead of a fully specified state. Then, a cause for such an event should also be expressed in terms of these possible observations, e.g., in terms of sensor readings foreshadowing the crash. For this section, we work with a labeled DTMC \(\mathcal {M}=(S,s_0,{\textbf {P}},\Sigma ,L)\), an \(\omega \)-regular language \(\mathcal {L} \subseteq \Sigma ^\omega \), and a probability threshold \(p \in (0,1]\).

Definition 27

We define a finite trace \({\hat{\tau }}\) in \(\mathcal {M}\) as p-critical for \(\mathcal {L} \subseteq \Sigma ^{\omega }\) in \(\mathcal {M}\) if for every path \({\hat{\pi }} \in {\text {Paths}}_{\mathcal {M}}({\hat{\tau }})\) we have \(\text {Pr}(\mathcal {L} \mid {\hat{\pi }}) \ge p\).

With this definition of p-critical trace, we incorporated the probability-raising property in a strict sense, as we require that every underlying path raises the probability of \(\mathcal {L}\) above the specified value p. Such a strict interpretation of causes is in line with investigations of causality in operational models by Kleinberg et al. [32, 37, 39, 40, 52] where the probability-raising property is required for each element of a cause. For example, causes as defined in [39] are given by a set of states represented using a PCTL-formula, such that each included state raises the probability of the effect (see Sect. 3.2 for a more detailed comparison). A more detailed discussion of the correspondence between the notion of causality by Kleinberg et al. and trace- and path-based notions of causality can be found in [1].

In this paper we make a first step toward an understanding of trace-based causality by considering this strict view on the probability-raising principle. We show that some of the techniques developed for path-based p-causes can be applied in this setting of strict trace-p-causes. However, we also show that several new phenomena arise here. For example, strict trace-based p-causes may not exist at all (see Example 29). Another natural definition would be to take a global view and require that, for each trace in the cause, the conditional probability of the effect, given the finite prefix trace, is above p. For a treatment of this global definition, however, the techniques developed in this paper appear not to be applicable directly and additional difficulties have to be expected.

Definition 28

A trace-p-cause for \(\mathcal {L}\) in \(\mathcal {M}\) is a prefix-free set of finite traces in \(\mathcal {M}\) such that

  1. 1.

    almost every infinite trace \(\tau \) in \(\mathcal {M}\) satisfying \(\mathcal {L}\) has a prefix \({\hat{\tau }}\) in the cause

  2. 2.

    every finite trace in the cause is p-critical.

Note that these definitions generalize the notion of (path-)p-cause: If each state in the DTMC \(\mathcal {M}\) has a unique label, paths and traces can be identified with each other and the definitions agree with the definitions from Sect. 3.

First, let us briefly consider the case where \(\mathcal {L}\) constitutes a reachability property. So, assume there is a set of labels \(G \subseteq \Sigma \) such that \(\mathcal {L} = \mathcal {L}(\lozenge G)\). In this case, any finite trace in \((\Sigma \setminus G)^{*} G\) is clearly p-critical for any \(p \in (0,1]\). This readily implies that trace-p-causes always exist for reachability properties. In the sequel, we will see that an NFA-representation of the canonical trace-p-cause consisting of the minimal p-critical traces can be computed in exponential time in this case (see Proposition 35).

In contrast to the fully observable case, the investigation of causes of arbitrary \(\omega \)-regular effects, however, cannot be reduced to the simpler case of reachability properties. In fact, trace-p-causes might not exist in general as the following example shows.

Example 29

Consider the labeled DTMC \(\mathcal {M}\) of Fig. 8 where the labels of the states are displayed above the states. Now let \(\mathcal {L} = \mathcal {L}(\square a) = \{ a^{\omega } \}\) and \(p \in (1/2, 1)\) then any finite prefix \(a^n\) of \(a^\omega \) has \(s_0^n \in {\text {Paths}}(a^n)\). However, \(s_0^n\) is only p-critical for \(\mathcal {L}\) if \(p \le 1/2\). Thus there does not exist a trace-p-cause for \(\mathcal {L}\) in \(\mathcal {M}\).

Fig. 8
figure 8

Example for the non-existence of trace-p-causes

This observation leads to the decision problem: Given a labeled DTMC \(\mathcal {M}\), an \(\omega \)-regular language \(\mathcal {L}\) and a probability threshold \(p \in (0,1]\) decide the existence of a trace-p-cause for \(\mathcal {L}\) in \(\mathcal {M}\). Addressing this problem we use the following construction:

Definition 30

(Belief construction). Given a labeled DTMC \(\mathcal {M}=(S,s_0,P,\Sigma ,L)\), we construct the labeled DTMC \(\text {Bel}(\mathcal {M})=(S^{\text {Bel}},s_0^{\text {Bel}},P^{\text {Bel}},\Sigma ,L^{\text {Bel}})\) as follows: The state space is \(S^{\text {Bel}}=S\times 2^S\). The initial state is \(s_0^{\text {Bel}}=(s_0,\{s_0\})\). For (sT) and \((t,T^\prime )\) in \(S^{\text {Bel}}\), we define \(P^{\text {Bel}}((s,T),(t,T^\prime )) =P(s,t)\) if and only if \(T^\prime \) contains exactly all successors of states in T (in \(\mathcal {M}\)) with label L(t). All remaining transition probabilities are 0. The new labeling function assigns L(s) to a state (sT) in \(S^{\text {Bel}}\).

Intuitively, the second component of states in the belief construction keeps track of the states in which the DTMC \(\mathcal {M}\) could have ended up if another path with the same trace had been taken.

We assume we are given a DRA \(\mathcal {A}\) over \(\Sigma \) specifying the \(\omega \)-regular effect \(\mathcal {L}\). We work with the product DTMC \(\text {Bel}(\mathcal {M}) \otimes \mathcal {A}\) of the belief construction with the automaton. The acceptance condition of \(\mathcal {A}\) allows us to distinguish accepting and rejecting BSCCs in this DTMC. In analogy to before, let \(S_p\) be the set of states in \(\text {Bel}(\mathcal {M}) \otimes \mathcal {A}\) from which an accepting BSCC is reached with probability \(\ge p\).

For a state (sT) in \(\text {Bel}(\mathcal {M})\) and a state q of the automaton \(\mathcal {A}\), we call the state (sTq) in \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\) p-critical if \((t,T,q) \in S_p\) for all \(t\in T\). In words, a state is p-critical if it is in \(S_p\) and would still be in \(S_p\) if so far another run with the same trace as a run reaching (sTq) had been taken. Note that any other such run ends up in the same state q as \(\mathcal {A}\) is deterministic. Let us denote the set of p-critical states in \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\) by \(\text {Crit}_p\). With this notation, we get following result.

Lemma 31

Let \(\mathcal {M}\) be a labeled DTMC and \(\mathcal {A}\) be a DRA specifying an \(\omega \)-regular language \(\mathcal {L} \subseteq \Sigma ^{\omega }\). Then there is a trace-p-cause for \(\mathcal {L}\) in \(\mathcal {M}\) if and only if the following condition holds: For each accepting BSCC with states B in \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\), we have \(B\cap \text {Crit}_p \not = \emptyset \) or \(\Pr _{\text {Bel}(\mathcal {M})\otimes \mathcal {A}} (\lnot \text {Crit}_p \mathcal {U} B)=0\).

Fig. 9
figure 9

Illustration of the product of the belief construction and an automaton for \(\Box \Diamond a\)

Proof

We start with the following observation: Let \(\pi \) be a finite path in \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\). Then, the trace of this path is p-critical if and only if the path \(\pi \) ends in a state in \(\text {Crit}_p\). To prove this, first assume that \(\pi \) ends in the state (sTq) in \(\text {Crit}_p\). By the definition of the belief construction, all other paths with the same trace then end in a state (tTq) for some \(t\in T\). By the definition of \(\text {Crit}_p\), all of these states lie in \(S_p\). So, all paths \(\psi \) in \(\mathcal {M}\) that have this trace, satisfy \(\Pr _{\mathcal {M}}(\mathcal {L}\mid \psi ) \ge p\). Analogously, if \(\pi \) ends in a state (sTq) not in \(\text {Crit}_p\), then there is a path \(\pi ^\prime \) with the same trace that ends in a state (tTq) with \(t\in T\) not in \(S_p\). Projecting this path \(\pi ^\prime \) to the first component, we obtain the corresponding path \(\psi \) in \(\mathcal {M}\) with the same trace as \(\pi \) that does not satisfy \(\Pr _{\mathcal {M}}(\mathcal {L}\mid \psi ) \ge p\).

Furthermore, it is clear that a trace-p-cause exists if and only if almost all traces in \(\mathcal {L}\) generated by \(\mathcal {M}\) have a p-critical prefix.

Now assume that there is an accepting BSCC with states B in \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\) such that \(B\cap \text {Crit}_p = \emptyset \) and \(\Pr _{\text {Bel}(\mathcal {M})\otimes \mathcal {A}} (\lnot \text {Crit}_p \mathcal {U} B)>0\). So, there is a finite path \(\pi \) in \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\) reaching B without visiting \(\text {Crit}_p\). Let \(\Pi \) be the set of all infinite extensions of \(\pi \) in \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\) and let \(L(\Pi )\) be the set of traces of paths in \(\Pi \). Almost all of these traces belong to \(\mathcal {L}\) as \(\pi \) ends in an accepting BSCC. Furthermore, the probability that a trace in \(L(\Pi )\) is generated by \(\mathcal {M}\) is at least \(\Pr _{\text {Bel}(\mathcal {M})\otimes \mathcal {A}}(\pi )>0\). But no trace in \(L(\Pi )\) has a p-critical prefix: Any finite prefix of a trace in \(L(\Pi )\) is the trace of a finite prefix of a path in \(\Pi \). But no path in \(\Pi \) visits \(\text {Crit}_p\), so the claim follows by the observation above.

For the other direction, assume that there is a set of traces \(\mathcal {K}\subseteq \mathcal {L}\) that is generated by \(\mathcal {M}\) with positive probability such that no trace in \(\mathcal {K}\) has a p-critical prefix. Let \(\Pi \) be the set of paths in \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\) with a trace in \(\mathcal {K}\). As \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\) only refines \(\mathcal {M}\) the set of paths \(\Pi \) also has positive probability. By our assumptions almost no path in \(\Pi \) visits \(\text {Crit}_p\). As almost all paths in \(\Pi \), however, reach an accepting BSCC and visit all of its states, there must be an accepting BSCC with states B that is reachable without visiting \(\text {Crit}_p\) and that does not intersect \(\text {Crit}_p\). I.e., B satisfies \(B\cap \text {Crit}_p = \emptyset \) and \(\Pr _{\text {Bel}(\mathcal {M})\otimes \mathcal {A}} (\lnot \text {Crit}_p \mathcal {U} B)>0\). \(\square \)

A non-labeled DTMC can be seen as a special case of a labeled DTMC where every state has a unique label. Lemma 31 hence subsumes the result that p-causes exist for any p in non-labeled DTMCs: Applying the belief construction to a DTMC \(\mathcal {M}\) with unique labels for each state, all reachable states of \(\text {Bel}(\mathcal {M})\) take the form \((s,\{s\})\) where s is a state of \(\mathcal {M}\). So, the sets \(S_p\) and \(\text {Crit}_p\) coincide in this case. The reachable part of \(\text {Bel}(\mathcal {M})\) is furthermore isomorphic to \(\mathcal {M}\). After taking the product with a DRA \(\mathcal {A}\), we observe that all accepting BSCCs with states B satisfy \(B\subseteq \text {Crit}_1\subseteq \text {Crit}_p\) in this case as argued in Remark 5. Thus, the first condition of Lemma 31 always holds. In labeled DTMCs, the situation is more complicated and in particular, it is possible that only some or no states of an accepting BSCC in the product of the belief construction and an automaton belong to \(\text {Crit}_p\) as illustrated by the following example.

Example 32

Consider the labeled DTMC in Fig. 9a with labels in \(\Sigma =\{a,b\}\). Let \(p=3/4\) and consider the language \(\mathcal {L}(\Box \Diamond a)\) accepted by the following DRA \(\mathcal {A}\): The automaton has two states \(x_0\) and \(x_1\) that are both initial. Whenever an a is read, the automaton moves to \(x_1\) and otherwise it moves to \(x_0\). It accepts if \(x_1\) is seen infinitely often.

In Fig. 9b, the DTMC \(\text {Bel}(\mathcal {M}) \otimes \mathcal {A}\) is depicted. The second component of \(\text {Bel}(\mathcal {M})\) is denoted in the second line in the states and the state of the automaton is indicated by the double circles: In the double-circled states, the automaton component is in the state \(x_1\) that has to be seen infinitely often.

The accepting BSCCs consist of states \(B_1=\{(q,\{q\},x_1)\}\) and \(B_2=\{(u,\{t,u\},x_0),(v,\{r,v\},x_1)\}\). As the probability to visit an accepting state infinitely often from state \((t,\{t,u\},x_0)\) is less than 1/10, the state \((u,\{t,u\},x_0)\) does not belong to \(\text {Crit}_p\). So, we have \(B_2\not \subseteq \text {Crit}_p\). In particular, any trace that leads to \((u,\{t,u\},x_0)\) cannot be part of a trace-p-cause. Such traces take the form \((ab)^k\) for some \(k\in \mathbb {N}\) and always lead to state \((t,\{t,u\},x_0)\) as well.

Fig. 10
figure 10

Reduction from the universality problem for NFAs in which all states are accepting to the existence problem for trace-p-causes

The state \((v,\{r,v\},x_1)\), however, belongs to \(\text {Crit}_p\) as the probability to reach an accepting BSCC from state \((r,\{r,v\},x_1)\) is at least 9/10 and hence greater than p. So, traces leading to \((v,\{r,v\},x_1)\), i.e., traces of the form \(a(ba)^k\) for some \(k\in \mathbb {N}\), are \(p-critical\). All in all, we have \(B_1\cap \text {Crit}_p\not = \emptyset \) and \(B_2\cap \text {Crit}_p\not = \emptyset \). Hence, a trace-p-cause exists. An example is the singleton trace-p-cause \(\{aba\}\).

We can now employ Lemma 31, to check the existence of a trace-p-cause. An accompanying hardness result is presented afterwards in Proposition 34.

Corollary 33

Given a labeled DTMC \(\mathcal {M}\) and a DRA \(\mathcal {A}\) specifying an effect \(\mathcal {L}\), the existence of a trace-p-cause for \(\mathcal {L}\) in \(\mathcal {M}\) can be decided in time exponential in the size of \(\mathcal {M}\) and polynomial in the size of \(\mathcal {A}\).

Proof

By applying Lemma 31 the existence check for a trace-p-cause needs exponential time in \(\mathcal {M}\) for the construction of \(\text {Bel}(\mathcal {M})\) and polynomial time in \(\mathcal {A}\) for the construction of the product \(\text {Bel}(\mathcal {M}) \otimes \mathcal {A}\). The computation of \(\text {Crit}_p\) and the check of the condition in Lemma 31 then need time polynomial in the size of this product by using linear equation systems to compute the probabilities and graph algorithms to check the reachability properties of \(\text {Crit}_p\). \(\square \)

Proposition 34

Given a labeled DTMC \(\mathcal {M}\) and a DRA \(\mathcal {A}\) specifying an effect \(\mathcal {L}\), deciding the existence of a trace-p-cause for \(\mathcal {L}\) in \(\mathcal {M}\) is \(\mathtt {PSPACE}\)-hard.

Proof

We provide a reduction from the universality problem for NFAs in which all states are accepting, which is shown to be \(\mathtt {PSPACE}\)-hard in [36]. A word is rejected by such an NFA if and only if there is no run on the word in the automaton. Given such an NFA \(\mathcal {A}\) over an alphabet \(\Sigma \), we construct a DTMC \(\mathcal {M}\) that is also depicted in Fig. 10. In contrast to the definition used elsewhere in this paper, we attach labels to the transitions of \(\mathcal {M}\) and not to states to be closer to the structure of the NFA. It is, however, well-known how to transform a DTMC with labeled transitions to a DTMC with labeled states.

The construction goes as follows: First, we fix two fresh letters f and \(\#\) not in \(\Sigma \). From the initial state \(s_0\), there is a transition with probability 1/4 and label \(\#\) to a sub-DTMC that in each step produces any of the letters from \(\Sigma \cup \{\#\}\) with equal probability. In Fig. 10, this is depicted by the ellipse with label \((\Sigma \cup \{\#\})^\omega \).

Furthermore, we include a copy Q of the states of the NFA \(\mathcal {A}\) and a state \( fail \) in the DTMC \(\mathcal {M}\). For any transition with a label \(a\in \Sigma \) from a state q to a state \(q^\prime \) in \(\mathcal {A}\), there is such a transition with positive probability in \(\mathcal {M}\) as well. These transitions are illustrated with labels a and b in Fig. 10. Furthermore, from each state in Q, there is a transition with label f to the state \( fail \) and a transition with label \(\#\) to all initial states of \(\mathcal {A}\). In Fig. 10, the initial state of \(\mathcal {A}\) is \(q_0\). The precise probabilities are not important. All mentioned transitions simply have some positive probability. Finally, there is are transition with label \(\#\) from \(s_0\) to all initial states in Q such that the probabilities of these transitions add up to 3/4.

We now consider the \(\omega \)-regular language \(\mathcal {L}(\lnot \Diamond f)\) over \(\Sigma \cup \{f,\#\}\) consisting of all traces that do not contain the letter f. Furthermore, let \(p=1/2\). We claim that there is a trace-p-cause for \(\mathcal {L}(\lnot \Diamond f)\) in \(\mathcal {M}\) if and only if the NFA \(\mathcal {A}\) is not universal.

To show this, first assume that \(\mathcal {A}\) is universal. We claim that in this case, there are no p-critical finite traces in \(\mathcal {M}\). Note that for any finite path \(\pi \), we have that \(\Pr _{\mathcal {M}}(\mathcal {L}(\lnot \Diamond f)\mid \pi ) \ge 1/2\) holds if and only if \(\pi \) moved to the sub-DTMC labeled \((\Sigma \cup \{\#\})^\omega \) in Fig. 10. If a finite trace \(\sigma \) contains the letter f, it cannot be p-critical. The empty trace is also not p-critical. So, let \(\sigma \) be a non-empty finite trace without f that can be produced by \(\mathcal {M}\), i.e., \(\sigma \) is any finite trace over \(\Sigma \cup \{\#\}\) starting with \(\#\). Now, there is a finite path \(\pi \) with trace \(\sigma \) that ends in the copy of Q in \(\mathcal {M}\): the trace \(\sigma \) takes the form \(\# w_0 \# w_1 \dots \# w_n\) for some n where \(w_i\) with \(0\le i \le n\) are possibly empty words over \(\Sigma \). For each \(w_i\), there is a run on \(\mathcal {A}\) by assumption. But this means, the finite path \(\pi \) can be chosen to move to an appropriate initial state in Q, whenever \(\#\) occurs in \(\sigma \) and follow a run on the next word \(w_i\) in \(\sigma \) before \(\#\) is seen again. For this path, we have \(\Pr _{\mathcal {M}}(\mathcal {L}(\lnot \Diamond f)\mid \pi ) = 0\) and hence \(\sigma \) cannot be p-critical.

For the other direction, assume that \(\mathcal {A}\) is not universal and let w be a word over \(\Sigma \) that is not accepted by \(\mathcal {A}\). As all states in \(\mathcal {A}\) are accepting, this means that there is no run on w in \(\mathcal {A}\). We claim that the following set of finite traces is a trace-p-cause for \(\mathcal {L}(\lnot \Diamond f)\) in \(\mathcal {M}\):

$$\begin{aligned} C = \{\sigma&\in \#(\Sigma \cup \{\#\})^*\mid \sigma \text { ends in }\#w \\&\text { and no proper prefix of } \sigma \text { ends in} \#w\}. \end{aligned}$$

Clearly, C is prefix-free. Let \(\sigma \in C\). Then, no path with trace \(\sigma \) can end in Q: As \(\sigma \) ends in \(\#w\), a path ending in Q would have to move to an initial state of \(\mathcal {A}\) when this last \(\#\) is seen. Afterwards, the trace of the suffix has to be w. The suffix of the path would then produce a run of \(\mathcal {A}\) on w, but such a run does not exist. So, all paths with trace \(\sigma \) move to the sub-DTMC \((\Sigma \cup \{\#\})^\omega \) in Fig. 10. Hence, \(\sigma \) is indeed p-critical. Finally, we observe that almost all traces of runs that move to the sub-DTMC \((\Sigma \cup \{\#\})^\omega \) contain the finite infix \(\# w\), while almost all runs that move to the copy of \(\mathcal {A}\) contain the letter f. So, almost all infinite traces produced by \(\mathcal {M}\) have a prefix in C and hence C is indeed a trace-p-cause. \(\square \)

Finally, if a trace-p-cause exists, the belief construction can be used to compute a NFA-representation of the canonical trace-p-cause consisting of the minimal p-critical traces. Recall that we defined the prefix-order \(\preceq \) on p-causes which can be applied analogously to trace-p-causes in Sect. 3.

Proposition 35

If a trace-p-cause exists, the canonical (i.e., \(\preceq \)-minimal) trace-p-cause can be represented by an NFA computable in time exponential in \(\mathcal {M}\) and polynomial in \(\mathcal {A}\).

Proof

If a trace-p-cause for \(\mathcal {L}\) given by the DRA \(\mathcal {A}\) exists, the canonical trace-p-cause C can be specified by the following NFA \(\mathcal {CAN}\): View \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\) as an automaton and make all states in \(\text {Crit}_p\) in \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\) accepting and terminal. As we have seen, p-critical traces are precisely the traces of paths in \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\) ending in \(\text {Crit}_p\) in the proof of Lemma 31. The language of the NFA \(\mathcal {CAN}\) consists precisely of the minimal such traces; in particular, it is prefix-free. Note here that if a run on a trace in \(\mathcal {CAN}\) leads to an accepting state, all other runs on the trace also end in an accepting state by the definition of \(\text {Crit}_p\) and all accepting states are terminal. If a trace p-cause exists, almost all traces in \(\mathcal {L}\) generated by \(\mathcal {M}\) have a p-critical prefix and hence also a minimal p-critical prefix. So C is a trace-p-cause in this case and clearly, it is minimal. The computation of C takes exponential time as in the proof of Corollary 33. \(\square \)

A p-causal partially observable MDP The MDP construction described in Definition 7 does roughly work the same way for a labeled DTMC \(\mathcal {M}\), but leads to a partially observable MDP (POMDP). In a partially observable MDP, in each step an observation from a set \(\mathcal {O}\) is made. A scheduler does not have access to the current state of the process, but only to the history of observations and has to choose actions based only on the sequence of observations observed.

The construction of the p-causal POMDP \(\mathcal {C}_p(\mathcal {M})\), the analogue of the p-causal MDP from Definition 7, works as follows: Given a labeled DTMC \(\mathcal {M}\) and a DRA \(\mathcal {A}\), we start with the DTMC \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\). In each state, the action continue is enabled and leads to successors according to the transition probabilities of \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\). Furthermore, we add a fresh absorbing state \( cause \). In each state in \(\text {Crit}_p\), a further action pick is enabled that leads to \( cause \) with probability 1. Whenever entering a state (sTq) of \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\), the scheduler observes only the label L(s), but not the exact state.

If a run leads to \(\text {Crit}_p\), also all other runs with the same trace end in \(\text {Crit}_p\). So, after any sequence of observations either pick is enabled after all runs corresponding to the observations or after none of these runs. Note that based on the observed labels, the scheduler knows the second two components of the current state (sTq), i.e., the subset T of states of \(\mathcal {M}\) the original DTMC could currently be in and the current state q of the automaton. The scheduler can, however, not distinguish states (sTq) and (tTq) with \(s,t\in T\).

Let now R be the set of all states in rejecting BSCCs of \(\text {Bel}(\mathcal {M})\otimes \mathcal {A}\). There is now a 1-to-1-correspondence between schedulers for \(\mathcal {C}_p(\mathcal {M})\) that reach \(R\cup \{ cause \}\) with probability 1 and trace-p-causes for \(\mathcal {L}(\mathcal {A})\) in \(\mathcal {M}\): Given such a scheduler \(\mathfrak {S}\), the set of traces, i.e., sequences of observations, that are possible under \(\mathfrak {S}\) and after which \(\mathfrak {S}\) chooses pick constitute a trace-p-cause. We have seen before that traces of paths ending in \(\text {Crit}_p\) are p-critical and almost all infinite traces on which \(\mathfrak {S}\) never chooses pick are traces of paths ending in a rejecting BSCC and hence do not belong to \(\mathcal {L}(\mathcal {A})\). In the other direction, given a trace-p-cause C, the scheduler that chooses pick whenever the trace seen so far lies in C clearly reaches \( cause \) or R almost surely.

However, many verification problems concerning general POMDPs are undecidable [11, 41]. In particular, reductions analogous to the ones used in the fully observable setting to solve cost-problems lead to problems on POMDPs which are undecidable in general as discussed in the following section.

Cost for trace-p-causes

A labeled DTMC \(\mathcal {M}\) can be equipped with a cost function \(c: \Sigma \rightarrow \mathbb {Q}\). This way we can also consider the previously defined cost mechanisms from Sect. 4 in this setting. In particular the notions of \({\text {expcost}}\) (Definition 14), \({\text {pexpcost}}\) (Definition 16) and \({\text {maxcost}}\) (Definition 20) can be extended to work over \({\text {Traces}}(\mathcal {M})\) instead of \({\text {Paths}}(\mathcal {M})\). However, most complexity results do not carry over as the p-causal MDP construction, which is central in most algorithms, does not yield the same reductions. In this section we discuss, which results still hold when considered in the setting of a labeled DTMC.

For the remainder of this section we fix a labeled DTMC \(\mathcal {M} = (S,s_0, {\textbf {P}}, \Sigma , L)\), an \(\omega \)-regular language \(\mathcal {L} \subseteq \Sigma ^{\omega }\) and a probability threshold \(p \in (0,1]\). Furthermore, we assume a trace-p-cause for \(\mathcal {L}\) in \(\mathcal {M}\) exists. Let \({\text {expcost}}_{L}\), \({\text {pexpcost}}_L\) and \({\text {maxcost}}_L\) denote the resulting definitions of expected, partial expected and maximal cost of a trace-p-cause, i.e., when the random variable (resp. the supremum) is defined on \({\text {Traces}}(\mathcal {M})\) (resp. \({\hat{\tau }}\) in the cause).

Proposition 36

If a trace-p-cause exists and the weight function \(c: \Sigma \rightarrow \mathbb {Q}\) takes only non-negative values, the canonical trace-p-cause C is \({\text {expcost}}_L\)-minimal and \({\text {maxcost}}_L\)-minimal.

Proof

When proving \({\text {expcost}}\)- and \({\text {maxcost}}\)-minimality of the canonical p-cause \(\Theta \) we used that \(\Theta \) is the minimal element of the partial order \(\preceq \) on p-causes and both \({\text {expcost}}\) and \({\text {maxcost}}\) respect the order for non-negative weight function. On trace-p-causes such a partial order can also be defined: For two trace-p-causes \(C_1\) and \(C_2\) we define \(C_1 \preceq C_2\) if for every trace \({\hat{\sigma }} \in C_2\) there is a trace \({\hat{\tau }} \in C_1\) such that \({\hat{\tau }} \in {\text {Pref}}({\hat{\sigma }})\).

One can prove that the canonical trace-p-cause is the minimal element of this partial order: Let \(C'\) be another trace-p-cause for \(\mathcal {L}\) and consider an arbitrary trace \(\hat{\sigma } \in C'\). By definition \({\hat{\sigma }}\) is p-critical. If there is no p-critical trace in \({\text {Pref}}({\hat{\sigma }}) \backslash \{{\hat{\sigma }}\}\), then \({\hat{\sigma }} \in C\) by definition and there is nothing to prove. Therefore, assume that there exists \({\hat{\tau }} \in {\text {Pref}}({\hat{\sigma }}) \backslash \{{\hat{\sigma }}\}\) such that \({\hat{\tau }}\) is a p-critical trace for \(\mathcal {L}\). Then we there is a minimal such \({\hat{\tau }}\) which is contained in C by definition and thus \(C \preceq C'\).

The proof of \({\text {expcost}}_L\)- and \({\text {maxcost}}_L\)-minimality of C now follows analogous lines as the proofs of \({\text {expcost}}\)- and \({\text {maxcost}}\)-minimality of \(\Theta \). \(\square \)

Using both Propositions 36 and 35 we arrive at the following result.

Corollary 37

For a non-negative weight function \(c: \Sigma \rightarrow \mathbb {Q}_{\ge 0}\) the values \({\text {expcost}}_L^{\min }\) and \({\text {maxcost}}_L^{\min }\) as well as a cost-minimal trace-p-cause for \(\mathcal {L}\) (given as a DRA) in \(\mathcal {M}\) can be computed in exponential time.

Furthermore, the algorithm for \({\text {maxcost}}_{{\text {inst}}}\) can be adapted for trace-p-causes leading to the following result.

Proposition 38

Assume a trace-p-cause exists. For an arbitrary weight function \(c: \Sigma \rightarrow \mathbb {Q}\) the value \({\text {maxcost}}_{{\text {inst}}, L}^{\min }\) and a cost-minimal trace-p-cause for \(\mathcal {L}\) (given as DRA) in \(\mathcal {M}\) can be computed in exponential time.

Proof

Consider the Markov chain \(\text {Bel}(\mathcal {M}) \otimes \mathcal {A}\) with weights given by \(w(s,T,q) = c(L(s))\). We enumerate the set \(\text {Crit}_p\) as \(\mathfrak {s}_0, \ldots , \mathfrak {s}_k\) where \(k = |\text {Crit}_p |-1\) such that \(c(\mathfrak {s}_i) \le c(\mathfrak {s}_j)\) for all \(0 \le i < j \le k\). The idea of the original computation of \({\text {maxcost}}_{{\text {inst}}}^{\min }\) was to now iteratively remove the states starting with \(\mathfrak {s}_0\) and checking the reachability of the effect in each step. We can mimic this by the following idea: Let R be the set of all states in non-accepting BSCCs in \(\text {Bel}(\mathcal {M}) \otimes \mathcal {A}\) and define \(R_i = R\cup \{\mathfrak {s}_j \mid j\le i\}\). Consider the least number n such that \(\Pr _{\text {Bel}(\mathcal {M}) \otimes \mathcal {A}} (\Diamond R_n)=1\). Then there is no trace reaching an accepting BSCC without hitting any state in \(\{\mathfrak {s}_0,\ldots ,\mathfrak {s}_n\}\). Therefore, \({\text {maxcost}}_{{\text {inst}}, L}^{\min }= w(\mathfrak {s}_n)\) and an optimal p-cause consists of all traces leading to a state in \(\{\mathfrak {s}_0,\ldots ,\mathfrak {s}_n\}\).

As the computation of the Markov chain \(\text {Bel}(\mathcal {M}) \otimes \mathcal {A}\) takes time exponential in \(\mathcal {M}\), the resulting algorithm takes exponential time as well. \(\square \)

In the sequel, we discuss problems arising with the remaining combinations of weight type and cost mechanism to conclude this section.

As the algorithm of the expected cost for arbitrary weight functions uses the p-causal MDP in order to find a cost-minimal p-cause, it would be natural to use the p-causal POMDP \(\mathcal {C}_p(\mathcal {M})\) to address the computation of cost-minimal trace-p-causes. Unfortunately, formalizing the threshold-problem for the expected cost in this POMDP is an instance of a so-called positive-bounded problem as in [41, 4.1], which is undecidable in general.

Regarding the partial expected cost, obviously the PP-hardness result for non-negative cost functions of Theorem 19 does carry over. The exponential-time algorithm from Theorem 18, however, uses the p-causal MDP construction. By tackling the pexpcost-problem in labeled DTMCs via the analogous p-causal POMDP, even with non-negative weights, we obtain a partial expectation problem on POMDPs. Partial expectations in particular subsume quantitative reachability problems in POMDPs which are undecidable in general [41].

Similarly, the algorithm for the computation of \({\text {expcost}}_{{\text {inst}}}\)- and \({\text {pexpcost}}_{{\text {inst}}}\)-minimal p-causes used a reduction to a weighted reachability problem in the p-causal MDP. Again, the analogous reduction for labeled DTMC leads to a problem on POMDPs subsuming generally undecidable quantitative reachability problems [41].

In order to address the problem of finding a cost-minimal p-cause for \({\text {maxcost}}\) for an arbitrary weight function, we reduced the computation to a two-player total-payoff game with additional reachability constraints for one of the two players which can be solved in pseudo-polynomial time [9]. For trace-p-causes we can adapt the reduction for \({\text {maxcost}}_L\) resulting in a total-payoff game with imperfect information. However, to the best of our knowledge there are no known decidability results for such games.

In all the cases discussed above, it is unclear whether the special structure of the p-causal POMDP could lead to a decidability result. Thus these cost-problems for trace-p-causes remain open.

Conclusion

The proposed definition of p-causes in DTMCs combines the counterfactual definition of causes with the probability-raising property. Motivated by monitoring applications, we considered various cost mechanisms that are suitable for different interpretations of the costs incurred by monitoring or by taking countermeasures in case an alarm is triggered. We provided algorithms for and investigated the computational complexity of the computation of corresponding cost-minimal causes. By considering a notion of p-causes in labeled DTMCs we took a first step in addressing causality in partial observable probabilistic systems.

Cyber-physical systems are often not fully probabilistic, but involve a certain amount of control in form of decisions depending on the system state. Such systems can be modeled by MDPs, to which we intend to generalize the causality framework presented here. For this, incorporating the probability-raising principle is relatively straight-forward but poses a lot of algorithmic questions regarding the check of the probability-raising condition and the optimization of causes. These problems are addressed in [3]. On the other hand, fitting the non-probabilistic counterfactual reasoning idea into a suitable definition for MDPs is more challenging and should be addressed in future work.

In our approach to partial observability we choose a notion of causality which ensures the probability-raising principle for every possible execution underlying the observed trace. Even though this approach makes sense in the context of this paper, there are also other possible notions of causality within the setting of labeled DTMCs which should be examined in future work. However, even for our notion of trace-p-causes the complexity of computing cost-minimal causes is not fully explored. Therefore, our work here shall only be seen as a first step in addressing partial observability for this causality framework.