1 Introduction

Property directed reachability analysis (PDR) refers to a class of verification algorithms for solving safety problems of transition systems [5, 12]. Its essence consists of 1) interleaving the construction of an inductive invariant (a positive chain) with that of a counterexample (a negative sequence), and 2) making the two sequences interact, with one narrowing down the search space for the other.

PDR algorithms have shown impressive performance in both hardware and software verification, leading to active research [15, 18, 28, 29] going far beyond their original scope. For instance, an abstract domain [8] capturing the over-approximation exploited by PDR has recently been introduced in [13], while PrIC3 [3] extended PDR to the quantitative verification of probabilistic systems.

To uncover the abstract principles behind PDR and its extensions, Kori et al. proposed LT-PDR [19], a generalisation of PDR in terms of lattice/category theory. LT-PDR can be instantiated using domain-specific heuristics to create effective algorithms for different kinds of systems such as Kripke structures, Markov Decision Processes (MDPs), and Markov reward models. However, the theory in [19] does not offer guidance on devising concrete heuristics.

Adjoints in PDR. Our approach shares the vision of LT-PDR, but we identify different principles: adjunctions are the core of our toolset.

[diagram: an adjunction \(f \dashv g\) between lattices C and A]

An adjunction \(f \dashv g\) is one of the central concepts in category theory [23]. It is prevalent in various fields of computer science, too, such as abstract interpretation [8] and functional programming [22]. Our use of adjoints in this work comes in the following two flavours.

  • (forward-backward adjoint) f describes the forward semantics of a transition system, while g is the backward one, where we typically have \(A=C\).

  • (abstraction-concretization adjoint) C is a concrete semantic domain, and A is an abstract one, much like in abstract interpretation. An adjoint enables us to convert a fixed-point problem in C to that in A.

Our Algorithms. The problem we address is the standard lattice-theoretic formulation of safety problems, namely whether the least fixed point of a continuous map b over a complete lattice \((L,\sqsubseteq )\) is below a given element \(p\in L\); in symbols, \(\mu b\sqsubseteq _{?} p\). We present two algorithms.

[diagram: the element \(i\in L\) and the adjoint pair \(f \dashv g :L\rightarrow L\)]

The first one, named AdjointPDR, assumes an element \(i\in L\) and an adjoint pair \(f \dashv g:L \rightarrow L\), representing respectively initial states, forward semantics and backward semantics (see the diagram above), such that \(b(x)=f(x)\sqcup i\) for all \(x\in L\). Under this assumption, we have the following equivalences (they follow from the Knaster-Tarski theorem, see §2):

$$\mu b\sqsubseteq p \quad \Leftrightarrow \quad \mu (f\sqcup i)\sqsubseteq p \quad \Leftrightarrow \quad i \sqsubseteq \nu (g \sqcap p),$$

where \(\mu (f\sqcup i)\) and \(\nu (g \sqcap p)\) are, by the Kleene theorem, the limits of the initial and final chains illustrated below.

$$\begin{aligned} \bot \sqsubseteq i \sqsubseteq f(i)\sqcup i \sqsubseteq \cdots \qquad \qquad \qquad \cdots \sqsubseteq g(p)\sqcap p \sqsubseteq p \sqsubseteq \top \end{aligned}$$

As its positive chain, PDR exploits an over-approximation of the initial chain: the chain is made greater to accelerate convergence, but it still has to remain below p.

The distinguishing feature of AdjointPDR is to take as a negative sequence (that is, a sequential construction of potential counterexamples) an over-approximation of the final chain. This crucially differs from the negative sequence of LT-PDR, namely an under-approximation of the computed positive chain.

We prove that AdjointPDR is sound (Theorem 5) and does not loop (Proposition 7), but, since the problem \(\mu b \sqsubseteq _? p\) is not always decidable, we cannot prove termination. Nevertheless, AdjointPDR allows for a formal theory of heuristics, which are essential when instantiating the algorithm to concrete problems. The theory prescribes the choices that yield the boundary executions, using initial and final chains (Proposition 10); it thus identifies a class of heuristics guaranteeing termination when the answer is negative (Theorem 12).

AdjointPDR ’s assumption of a forward-backward adjoint \(f \dashv g\), however, does not hold very often, especially in probabilistic settings. Our second algorithm AdjointPDR \(^\downarrow \) circumvents this problem by extending the lattice for the negative sequence, from L to the lattice \(L^{\downarrow }\) of lower sets in L.

[diagram: the problem transferred from L to the lattice \(L^{\downarrow }\) of lower sets]

Specifically, by using the second form of adjoints, namely an abstraction-concretization pair, the problem \(\mu b \sqsubseteq _{?} p\) in L can be translated to an equivalent problem on \(b^{\downarrow }\) in \(L^\downarrow \), for which an adjoint \(b^\downarrow \dashv b^\downarrow _r\) is guaranteed. This allows one to run AdjointPDR in the lattice \(L^\downarrow \). We then notice that the search for a positive chain can be conveniently restricted to principals in \(L^\downarrow \), which have representatives in L. The resulting algorithm, using L for positive chains and \(L^\downarrow \) for negative sequences, is AdjointPDR \(^\downarrow \).

The use of lower sets for the negative sequence is a key advantage. It not only avoids the restrictive assumption of a forward-backward adjoint \(f\dashv g\), but also enables a more thorough search for counterexamples. AdjointPDR \(^\downarrow \) can simulate LT-PDR step by step (Theorem 17), while the converse is not possible, since a single negative sequence of AdjointPDR \(^\downarrow \) can represent multiple (Proposition 18) or even all (Proposition 19) Kleene sequences of LT-PDR.

Concrete Instances. Our lattice-theoretic algorithms yield many concrete instances: the original IC3/PDR [5, 12] as well as Reverse PDR [27] are instances of AdjointPDR with L being the powerset of the state space; since LT-PDR can be simulated by AdjointPDR \(^\downarrow \) , the latter generalizes all instances in [19].

As a notable instance, we apply AdjointPDR \(^\downarrow \) to MDPs, specifically to decide whether the maximum reachability probability [1] is below a given threshold. Here the lattice \(L=[0,1]^S\) is that of fuzzy predicates over the state space S. Our theory provides guidance to devise two heuristics, for which we prove negative termination (Corollary 20). We present an implementation in Haskell and its experimental evaluation, with comparisons against existing probabilistic PDR algorithms (PrIC3 [3], LT-PDR [19]) and a non-PDR one (Storm [11]). The performance of AdjointPDR \(^\downarrow \) is encouraging: it supports the potential of PDR algorithms in probabilistic model checking. The experiments also indicate the importance of having a variety of heuristics, and thus the value of our adjoint framework, which helps in devising them.

Additionally, we found that the abstraction features of Haskell allow us to code lattice-theoretic algorithms almost literally (\(\sim \)100 lines); implementing a few heuristics takes another \(\sim \)240 lines. Mathematical abstraction can thus directly ease the implementation effort.

Related Work. Reverse PDR [27] applies PDR from unsafe states using a backward transition relation \(\textbf{T}\) and tries to prove that initial states are unreachable. Our right adjoint g is also backward, but it differs from \(\textbf{T}\) in the presence of nondeterminism: roughly, \(\textbf{T}(X)\) is the set of states which can reach X in one step, while g(X) is the set of states which can only reach X in one step. fbPDR [28, 29] runs PDR and Reverse PDR in parallel with shared information. Our work uses both forward and backward directions (the pair \(f\dashv g\)), too, but approximates differently: Reverse PDR over-approximates the set of states that can reach an unsafe state, while we over-approximate the set of states that only reach safe states.

The comparison with LT-PDR [19] is extensively discussed in Sect. 4.2. PrIC3 [3] extended PDR to MDPs, which are our main experimental ground: Sect. 6 compares the performances of PrIC3, LT-PDR and AdjointPDR \(^\downarrow \).

We remark that PDR has been applied to other settings, such as software model checking using theories and SMT solvers [6, 21] or automated planning [30]. Most of them (e.g., software model checking) already fall within the generality of LT-PDR and can thus be embedded in our framework.

It is also worth mentioning that, in the context of abstract interpretation, the use of adjoints to construct initial and final chains and to exploit the interaction between their approximations has been investigated in several works, e.g., [7].

Structure of the Paper. After recalling some preliminaries in Sect. 2, we present AdjointPDR in Sect. 3 and AdjointPDR \(^\downarrow \) in Sect. 4. In Sect. 5 we introduce the heuristics for the max reachability problem of MDPs, which are experimentally tested in Sect. 6.

2 Preliminaries and Notation

We assume that the reader is familiar with lattice theory, see, e.g., [10]. We use \((L,\sqsubseteq )\), \((L_1,\sqsubseteq _1)\), \((L_2,\sqsubseteq _2)\) to range over complete lattices and x, y, z to range over their elements. We omit subscripts and order relations whenever clear from the context. As usual, \(\bigsqcup \) and \(\bigsqcap \) denote least upper bound and greatest lower bound, \(\sqcup \) and \(\sqcap \) denote join and meet, \(\top \) and \(\bot \) top and bottom. Hereafter we will tacitly assume that all maps are monotone. Obviously, the identity map \(id:L\rightarrow L\) and the composition \(f \circ g :L_1\rightarrow L_3\) of two monotone maps \(g:L_1\rightarrow L_2\) and \(f:L_2\rightarrow L_3\) are monotone. For a map \(f:L \rightarrow L\), we inductively define \(f^0=id\) and \(f^{n+1}=f\circ f^n\). Given \(l :L_1 \rightarrow L_2\) and \(r:L_2\rightarrow L_1\), we say that l is the left adjoint of r, or equivalently that r is the right adjoint of l, written \(l\dashv r\), when it holds that \(l(x)\sqsubseteq _2 y\) iff \(x \sqsubseteq _1 r(y)\) for all \(x\in L_1\) and \(y\in L_2\). Given a map \(f:L\rightarrow L\), the element \(x\in L\) is a post-fixed point iff \(x\sqsubseteq f(x)\), a pre-fixed point iff \(f(x)\sqsubseteq x\) and a fixed point iff \(x=f(x)\). Pre-fixed, post-fixed and fixed points form complete lattices: we write \(\mu f\) and \(\nu f\) for the least and greatest fixed point.
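Since we later implement our algorithms in Haskell (Sect. 6), we occasionally accompany these notions with small Haskell sketches. For instance, on finite lattices the adjunction condition can be checked exhaustively; the function below is our own illustration, not part of the paper's code base.

```haskell
-- Checks l ⊣ r on finite lattices: l(x) ⊑2 y iff x ⊑1 r(y) for all x, y.
-- Carriers are given as lists, orders as binary predicates.
isAdjoint :: [a] -> [b]
          -> (a -> a -> Bool) -> (b -> b -> Bool)
          -> (a -> b) -> (b -> a) -> Bool
isAdjoint xs ys leq1 leq2 l r =
  and [ (l x `leq2` y) == (x `leq1` r y) | x <- xs, y <- ys ]
```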

Several problems relevant to computer science can be reduced to checking whether \(\mu b \sqsubseteq p\) for a monotone map \(b:L \rightarrow L\) on a complete lattice L. The Knaster-Tarski fixed-point theorem characterises \(\mu b\) as the greatest lower bound of all pre-fixed points of b and \(\nu b\) as the least upper bound of all its post-fixed points:

$$\begin{aligned} \mu b= \bigsqcap \{ x \mid b(x) \sqsubseteq x \} \qquad \qquad \nu b= \bigsqcup \{ x \mid x \sqsubseteq b(x) \}. \end{aligned}$$

This immediately leads to two proof principles, illustrated below:

$$\begin{aligned} \begin{array}{c} \exists x, \; b(x) \sqsubseteq x \sqsubseteq p \\ \hline \mu b\sqsubseteq p \end{array} \qquad \qquad \begin{array}{c} \exists x, \; i \sqsubseteq x\sqsubseteq b(x)\\ \hline i \sqsubseteq \nu b \end{array} \end{aligned}$$
(KT)

By means of (KT), one can prove \(\mu b \sqsubseteq p\) by finding some pre-fixed point x, often called an invariant, such that \(x \sqsubseteq p\). However, automatically finding invariants might be rather complicated, so most algorithms rely on another fixed-point theorem, usually attributed to Kleene. It characterises \(\mu b\) as the least upper bound of the initial chain and \(\nu b\) as the greatest lower bound of the final chain:

$$\begin{aligned}&\bot \sqsubseteq b(\bot ) \sqsubseteq b^2(\bot ) \sqsubseteq \cdots \quad \text {and}\quad \cdots \sqsubseteq b^2(\top ) \sqsubseteq b(\top ) \sqsubseteq \top . \quad \text {That is,} \\&\mu b = \bigsqcup _{n\in \mathbb {N}} b^n(\bot ), \qquad \qquad \nu b = \bigsqcap _{n\in \mathbb {N}} b^n(\top ). \end{aligned}$$
(Kl)

The assumptions are stronger than for Knaster-Tarski: the leftmost statement requires the map b to be \(\omega \)-continuous (i.e., to preserve \(\bigsqcup \) of \(\omega \)-chains) and the rightmost requires b to be \(\omega \)-co-continuous (similarly, but for \(\bigsqcap \) of \(\omega \)-cochains). Observe that every left adjoint is continuous and every right adjoint is co-continuous (see e.g. [23]).

As explained in [19], property directed reachability (PDR) algorithms [5] exploit (KT) to try to prove the inequation and (Kl) to refute it. In the algorithm we introduce in the next section, we further assume that b is of the form \(f \sqcup i\) for some element \(i \in L\) and map \(f:L \rightarrow L\), namely \(b(x)= f(x) \sqcup i\) for all \(x \in L\). Moreover, we require f to have a right adjoint \(g:L \rightarrow L\). In this case

$$\begin{aligned} \mu (f \sqcup i) \sqsubseteq p \qquad \text { if{}f } \qquad i \sqsubseteq \nu (g \sqcap p) \end{aligned}$$
(1)

(which is easily shown using the Knaster-Tarski theorem) and \((f \sqcup i)\) and \((g \sqcap p)\) are guaranteed to be (co)continuous. Since \(f \dashv g\) and left and right adjoints preserve, resp., arbitrary joins and meets, then for all \(n \in \mathbb {N}\)

$$\begin{aligned} \textstyle (f\sqcup i)^{n} (\bot ) = \bigsqcup _{j< n} f^j(i) \qquad \qquad (g\sqcap p)^{n} (\top ) = \bigsqcap _{j< n} g^j(p) \end{aligned}$$
(2)

which by (Kl) provide useful characterisations of least and greatest fixed points.
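For instance, the left equation in (2) can be verified by a quick induction (our own spelling-out), using that the left adjoint f preserves arbitrary joins:

$$ (f\sqcup i)^{n+1}(\bot ) \;=\; f\big (\textstyle \bigsqcup _{j<n} f^j(i)\big ) \sqcup i \;=\; \textstyle \bigsqcup _{j<n} f^{j+1}(i) \sqcup i \;=\; \textstyle \bigsqcup _{j<n+1} f^j(i), $$

with the base case \((f\sqcup i)^{0}(\bot ) = \bot \) being the empty join.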


We conclude this section with an example that we will often revisit. It also provides a justification for the intuitive terminology that we sporadically use.

Fig. 1. The transition system of Example 1, with \(S = \{ s_0, \dots , s_6 \}\) and \(I=\{s_0\}\).

Example 1

(Safety problem for transition systems). A transition system consists of a triple \((S, I, \delta )\) where S is a set of states, \(I \subseteq S\) is a set of initial states, and \(\delta :S \rightarrow \mathcal {P}S\) is a transition relation. Here \(\mathcal {P}S\) denotes the powerset of S, which forms a complete lattice ordered by inclusion \(\subseteq \). By defining \(F:\mathcal {P}S \rightarrow \mathcal {P}S\) as \(F(X) {\mathop {=}\limits ^{\tiny \text {def}}}\bigcup _{s \in X} \delta (s)\) for all \(X\in \mathcal {P}S\), one has that \(\mu (F \cup I)\) is the set of all states reachable from I. Therefore, for any \(P \in \mathcal {P}S\), representing some safety property, \(\mu (F \cup I) \subseteq P\) holds iff all reachable states are safe. It is worth remarking that F has a right adjoint \(G:\mathcal {P}S \rightarrow \mathcal {P}S\) defined for all \(X\in \mathcal {P}S\) as \(G(X) {\mathop {=}\limits ^{\tiny \text {def}}}\{s \mid \delta (s) \subseteq X\}\). Thus by (1), \(\mu (F \cup I) \subseteq P\) iff \(I \subseteq \nu (G \cap P)\).

Consider the transition system in Fig. 1. Hereafter we write \(S_{j}\) for the set of states \(\{s_0, s_1, \dots , s_j\}\) and we fix the set of safe states to be \(P = S_5\). It is immediate to see that \(\mu (F \cup I)=S_4 \subseteq P\). Automatically, this can be checked with the initial chain of \((F \cup I)\) or with the final chain of \((G \cap P)\), displayed below on the left and on the right, respectively.

$$\emptyset \subseteq I \subseteq S_2 \subseteq S_3 \subseteq S_4 \subseteq S_4 \subseteq \cdots \qquad \quad \qquad \cdots \subseteq S_4 \subseteq S_4 \subseteq P \subseteq S $$

The \((j+1)\)-th element of the initial chain contains all the states that can be reached from I in at most j transitions, while the \((j+1)\)-th element of the final chain contains all the states that, in at most j transitions, reach safe states only.
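F, G and the two chains can be programmed directly; below is a minimal Haskell sketch. Since Fig. 1 is not reproduced here, the transition relation `delta` is a hypothetical one, chosen to be consistent with the chains displayed above, and all names are ours.

```haskell
import qualified Data.Set as S

type State = Int

-- A hypothetical transition relation, consistent with the chains above
-- (the actual system is the one of Fig. 1).
delta :: State -> S.Set State
delta 0 = S.fromList [1, 2]
delta 1 = S.fromList [3]
delta 2 = S.fromList [3]
delta 3 = S.fromList [4]
delta 4 = S.fromList [4]
delta _ = S.fromList [6]   -- s5 and s6 reach the unsafe s6

fwd, bwd :: S.Set State -> S.Set State
fwd x = S.unions [ delta s | s <- S.toList x ]                     -- F
bwd x = S.fromList [ s | s <- [0 .. 6], delta s `S.isSubsetOf` x ] -- G, right adjoint of F

initialChain :: [S.Set State]   -- ∅ ⊆ I ⊆ S2 ⊆ S3 ⊆ S4 ⊆ S4 ⊆ ...
initialChain = iterate (\x -> fwd x `S.union` S.singleton 0) S.empty

finalChain :: [S.Set State]     -- S ⊇ P ⊇ S4 ⊇ S4 ⊇ ...
finalChain = iterate (\x -> bwd x `S.intersection` S.fromList [0 .. 5])
                     (S.fromList [0 .. 6])
```

For instance, `take 6 initialChain` reproduces the chain on the left above.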

Fig. 2. Invariants of AdjointPDR.

3 Adjoint PDR

In this section we present AdjointPDR, an algorithm that takes as input a tuple \((i,f,g,p)\) with \(i,p\in L\) and \(f\dashv g :L\rightarrow L\) and, if it terminates, returns true whenever \(\mu (f \sqcup i) \sqsubseteq p\) and false otherwise.

The algorithm manipulates two sequences of elements of L: \(\boldsymbol{x} {\mathop {=}\limits ^{\tiny \text {def}}}x_0, \dots , x_{n-1}\) of length n and \(\boldsymbol{y}{\mathop {=}\limits ^{\tiny \text {def}}}y_k, \dots , y_{n-1} \) of length \(n-k\). These satisfy, throughout the execution of AdjointPDR, the invariants in Fig. 2. Observe that, by (A1), \(x_j\) over-approximates the j-th element of the initial chain, namely \((f\sqcup i)^j(\bot ) \sqsubseteq x_j\), while, by (A3), the j-indexed element \(y_j\) of \(\boldsymbol{y}\) over-approximates \(g^{n-j-1}(p)\), which, borrowing the terminology of Example 1, is the set of states which are safe in \(n-j-1\) transitions. Moreover, by (PN), the element \(y_j\) witnesses that \(x_j\) is unsafe, i.e., that \(x_j \not \sqsubseteq g^{n-1-j}(p)\) or equivalently \(f^{n-j-1}(x_j) \not \sqsubseteq p\). Notably, \(\boldsymbol{x}\) is a positive chain and \(\boldsymbol{y}\) a negative sequence, according to the definitions below.

Definition 2

(positive chain). A positive chain for \(\mu (f \sqcup i) \sqsubseteq p\) is a finite chain \(x_0 \sqsubseteq \dots \sqsubseteq x_{n-1}\) in L of length \(n \ge 2\) which satisfies (P1), (P2), (P3) in Fig. 2. It is conclusive if \(x_{j+1} \sqsubseteq x_j\) for some \(j \le n-2\).

In a conclusive positive chain, \(x_{j+1}\) provides an invariant for \(f\sqcup i\) and thus, by (KT), \(\mu (f \sqcup i) \sqsubseteq p\) holds. So, when \(\boldsymbol{x}\) is conclusive, AdjointPDR returns true.

Definition 3

(negative sequence). A negative sequence for \(\mu (f \sqcup i) \sqsubseteq p\) is a finite sequence \( y_k, \dots , y_{n-1}\) in L with \(1 \le k \le n\) which satisfies (N1) and (N2) in Fig. 2. It is conclusive if \(k=1\) and \(i \not \sqsubseteq y_1\).

When \(\boldsymbol{y}\) is conclusive, AdjointPDR returns false as \(y_1\) provides a counterexample: (N1) and (N2) entail (A3), thus \(y_1\sqsupseteq g^{n-2}(p)\) while \(i \not \sqsubseteq y_1\). By (Kl\({\dashv }\)), \(g^{n-2}(p) \sqsupseteq \nu (g \sqcap p)\) and thus \(i \not \sqsubseteq \nu (g \sqcap p)\). By (1), \(\mu (f \sqcup i) \not \sqsubseteq p\).

Fig. 3. AdjointPDR algorithm checking \(\mu (f \sqcup i) \sqsubseteq p\).

The pseudocode of the algorithm is displayed in Fig. 3, where we write \(( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k}\) to compactly represent the state of the algorithm: the pair (n, k) is called the index of the state, with \(\boldsymbol{x}\) of length n and \(\boldsymbol{y}\) of length \(n-k\). When \(k=n\), \(\boldsymbol{y}\) is the empty sequence \(\varepsilon \). For any \(z\in L\), we write \( \boldsymbol{x},z\) for the chain \( x_0, \dots , x_{n-1}, z\) of length \(n+1\) and \( z,\boldsymbol{y}\) for the sequence \( z,y_k, \dots , y_{n-1}\) of length \(n-(k-1)\). Moreover, we write \(\boldsymbol{x}\sqcap _j z\) for the chain \( x_0 \sqcap z, \dots , x_j \sqcap z, x_{j+1},\dots , x_{n-1}\). Finally, \(\textsf{tail}(\boldsymbol{y})\) stands for the tail of \(\boldsymbol{y}\), namely \(y_{k+1}, \dots , y_{n-1}\) of length \(n-(k+1)\).

The algorithm starts in the initial state \(s_0{\mathop {=}\limits ^{\tiny \text {def}}}( \bot , \top \Vert \varepsilon )_{2,2}\) and, unless one of \(\boldsymbol{x}\) and \(\boldsymbol{y}\) is conclusive, iteratively applies one of the four mutually exclusive rules: (Unfold), (Candidate), (Decide) and (Conflict). The rule (Unfold) extends the positive chain by one element when the negative sequence is empty and the positive chain is under p; since the element introduced by (Unfold) is \(\top \), its application typically triggers rule (Candidate), which starts the negative sequence with an over-approximation of p. Recall that the role of \(y_j\) is to witness that \(x_j\) is unsafe. After (Candidate), either (Decide) or (Conflict) is possible: if \(y_k\) witnesses that, besides \(x_k\), also \(f(x_{k-1})\) is unsafe, then (Decide) is used to further extend the negative sequence so as to witness that \(x_{k-1}\) is unsafe; otherwise, the rule (Conflict) improves the precision of the positive chain in such a way that \(y_k\) no longer witnesses \(x_k\sqcap z\) unsafe and, thus, the negative sequence is shortened.

Note that, in (Candidate), (Decide) and (Conflict), the element \(z\in L\) is chosen among a set of possibilities, thus AdjointPDR is nondeterministic.
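To fix ideas, the following is a compact Haskell sketch of the loop (our own rendering; the omitted Fig. 3 remains authoritative for the exact guards). The nondeterministic choices are resolved with the canonical ones of Proposition 6 below, i.e., \(z = p\) in (Candidate), \(z = g(y_k)\) in (Decide) and \(z = (f\sqcup i)(x_{k-1})\) in (Conflict); all names are ours.

```haskell
-- AdjointPDR over a lattice given by its order, meet, bottom and top;
-- b is f ⊔ i, g is the right adjoint of f. Positive chain xs = [x_0..x_{n-1}],
-- negative sequence ys = [y_k..y_{n-1}], so k = length xs - length ys.
adjointPDR :: (a -> a -> Bool) -> (a -> a -> a) -> a -> a
           -> (a -> a) -> (a -> a) -> a -> a -> Bool
adjointPDR leq meet bot top b g i p = loop [bot, top] []
  where
    loop xs ys
      | or [ x2 `leq` x1 | (x1, x2) <- zip xs (tail xs) ]    = True  -- x conclusive
      | length xs - length ys == 1 && not (i `leq` head ys)  = False -- y conclusive
      | null ys && last xs `leq` p = loop (xs ++ [top]) []           -- (Unfold)
      | null ys                    = loop xs [p]                     -- (Candidate)
      | b xkm1 `leq` yk = loop [ if j <= k then x `meet` b xkm1 else x
                               | (j, x) <- zip [0 ..] xs ]
                               (tail ys)                             -- (Conflict)
      | otherwise       = loop xs (g yk : ys)                        -- (Decide)
      where
        k    = length xs - length ys
        xkm1 = xs !! (k - 1)
        yk   = head ys
```

On finite lattices this loop terminates by Propositions 6 and 7; with the choices above, it implements the simple initial heuristic of Sect. 3.2.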

To illustrate the executions of the algorithm, we adopt a labeled transition system notation. Let \(\mathcal {S}{\mathop {=}\limits ^{\tiny \text {def}}}\{( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k} \mid n \ge 2 \text {, }k\le n\text {, } \boldsymbol{x}\in L^n \text { and } \boldsymbol{y}\in L^{n-k}\}\) be the set of all possible states of AdjointPDR. We call \(( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k} \in \mathcal {S}\) conclusive if \(\boldsymbol{x}\) or \(\boldsymbol{y}\) is such. When \(s \in \mathcal {S}\) is not conclusive, we write \(s {{\mathop {\rightarrow }\limits ^{D}}}_{}\) to mean that s satisfies the guards in the rule (Decide), and \(s {{\mathop {\rightarrow }\limits ^{D}}}_{z} s'\) to mean that, (Decide) being applicable, AdjointPDR moves from state s to \(s'\) by choosing z. Similarly for the other rules: the labels \( Ca \), \( Co \) and U stand for (Candidate), (Conflict) and (Unfold), respectively. When irrelevant, we omit labels and choices and just write \(s {\mathop {\rightarrow }\limits ^{}} s'\). As usual, \({\mathop {\rightarrow ^+}\limits ^{}}\) stands for the transitive closure of \({\mathop {\rightarrow }\limits ^{}}\) while \({\mathop {\rightarrow ^*}\limits ^{}}\) stands for the reflexive and transitive closure of \({\mathop {\rightarrow }\limits ^{}}\).

Example 4

Consider the safety problem in Example 1. Below we illustrate two possible computations of AdjointPDR that differ in the choice of z in (Conflict). The first run is conveniently represented as the following series of transitions.

[figure: the series of transitions of the first run]

The last state returns true since \(x_4 = x_5=S_4\). Observe that the elements of \(\boldsymbol{x}\), with the exception of the last element \(x_{n-1}\), are those of the initial chain of \((F \cup I)\), namely, \(x_j\) is the set of states reachable in at most \(j-1\) steps. In the second computation, the elements of \(\boldsymbol{x}\) are roughly those of the final chain of \((G \cap P)\). More precisely, after (Unfold) or (Candidate), \(x_{n-j}\) for \(j<n-1\) is the set of states which only reach safe states within j steps.

[figure: the series of transitions of the second run]

Observe that, by invariant (A1), the values of \(\boldsymbol{x}\) in the two runs are, respectively, the least and the greatest values for all possible computations of AdjointPDR.

Theorem 5.1 follows from invariants (I2), (P1), (P3) and (KT); Theorem 5.2 from (N1), (N2) and (Kl\({\dashv }\)). Note that both results hold for any choice of z.

Theorem 5

(Soundness). AdjointPDR is sound. Namely,

  1. If AdjointPDR returns true then \(\mu (f \sqcup i) \sqsubseteq p\).

  2. If AdjointPDR returns false then \(\mu (f \sqcup i) \not \sqsubseteq p\).

3.1 Progression

It is necessary to prove that in any step of the execution, if the algorithm does not return true or false, then it can progress to a new state, not yet visited. To this aim we must deal with the subtleties of the non-deterministic choice of the element z in (Candidate), (Decide) and (Conflict). The following proposition ensures that, for any of these three rules, there is always a possible choice.

Proposition 6

(Canonical choices). The following are always possible:

[list of canonical choices for (Candidate), (Decide) and (Conflict) omitted; cf. (3) and the heuristics of Sect. 3.2]

Thus, for all non-conclusive \(s\in \mathcal {S}\), if \(s_0 {\mathop {\rightarrow ^*}\limits ^{}} s \) then \(s {\mathop {\rightarrow }\limits ^{}}\).

Then, Proposition 7 ensures that AdjointPDR always traverses new states.

Proposition 7

(Impossibility of loops). If \(s_0 {\mathop {\rightarrow ^*}\limits ^{}} s {\mathop {\rightarrow ^+}\limits ^{ }} s'\), then \(s\ne s'\).

Observe that the above propositions entail that AdjointPDR terminates whenever the lattice L is finite, since the set of reachable states is finite in this case.

Example 8

For \((I,F,G,P)\) as in Example 1, AdjointPDR behaves essentially as IC3/PDR [5], solving reachability problems for transition systems with finite state space S. Since the lattice \(\mathcal {P}S\) is also finite, AdjointPDR always terminates.

3.2 Heuristics

The nondeterministic choices of the algorithm can be resolved by using heuristics. Intuitively, a heuristic chooses, for any state \(s\in \mathcal {S}\), an element \(z\in L\) to be possibly used in (Candidate), (Decide) or (Conflict), so it is just a function \(h:\mathcal {S}\rightarrow L\). When defining a heuristic, we will avoid specifying its values on conclusive states or on those performing (Unfold), as they are clearly irrelevant.

With a heuristic, one can instantiate AdjointPDR by making the choice of z as prescribed by h. Syntactically, this means erasing from the code of Fig. 3 the three lines of choose and replacing them by \(z \texttt {:= } h(\,( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k}\,)\). We call AdjointPDR\(_h\) the resulting deterministic algorithm and write \(s {{\mathop {\rightarrow }\limits ^{}}}_{h}^{} s'\) to mean that AdjointPDR\(_h\) moves from state s to \(s'\). We let \(\mathcal {S}^h{\mathop {=}\limits ^{\tiny \text {def}}}\{s\in \mathcal {S}\mid s_0{{\mathop {\rightarrow }\limits ^{}}}_{h}^{*} s\}\) be the set of all states reachable by AdjointPDR\(_h\).

Definition 9

(legit heuristic). A heuristic \(h:\mathcal {S}\rightarrow L\) is called legit whenever for all \(s,s'\in \mathcal {S}^h\), if \(s {{\mathop {\rightarrow }\limits ^{}}}_{h}^{}s'\) then \(s{\mathop {\rightarrow }\limits ^{}}s'\).

When h is legit, the only execution of the deterministic algorithm AdjointPDR\(_h\) is one of the possible executions of the non-deterministic algorithm AdjointPDR.

The canonical choices provide two legit heuristics: first, we call simple any legit heuristic h that chooses z in (Candidate) and (Decide) as in Proposition 6:

$$\begin{aligned} ( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k} \mapsto {\left\{ \begin{array}{ll} p &{} \text {if } ( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k} {\mathop {\rightarrow }\limits ^{ Ca }} \\ g(y_k) &{} \text {if } ( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k} {\mathop {\rightarrow }\limits ^{ D }} \end{array}\right. } \end{aligned}$$
(3)

Then, if the choice in (Conflict) is like in Proposition 6.4, we call h initial; if it is like in Proposition 6.3, we call h final. Shortly, the two legit heuristics are:

$$ \begin{array}{r|ll} \textit{simple initial} &{} (3) \text { and } ( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k} \mapsto (f\sqcup i)(x_{k-1}) &{} \text{ if } ( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k} {\mathop {\rightarrow }\limits ^{ Co }} \\ \hline \textit{simple final} &{} (3) \text { and } ( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k} \mapsto y_k &{} \text{ if } ( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k} {\mathop {\rightarrow }\limits ^{ Co }} \\ \end{array} $$
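In the list-based rendering used in the sketch of Sect. 3 above, the two simple heuristics can be written as a single function (again our own illustration, with names and representation that are not the paper's):

```haskell
data Rule = Candidate | Decide | Conflict

-- The simple initial (True) and simple final (False) heuristics: canonical
-- choices (3) for (Candidate) and (Decide), and the two extremal choices
-- for (Conflict). xs = [x_0..x_{n-1}], ys = [y_k..y_{n-1}].
simpleHeuristic :: Bool -> (a -> a) -> (a -> a) -> a
                -> Rule -> [a] -> [a] -> a
simpleHeuristic initial b g p rule xs ys = case rule of
  Candidate -> p
  Decide    -> g (head ys)                                -- g(y_k)
  Conflict
    | initial   -> b (xs !! (length xs - length ys - 1))  -- (f ⊔ i)(x_{k-1})
    | otherwise -> head ys                                -- y_k
```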

Interestingly, with any simple heuristic, the sequence \(\boldsymbol{y}\) takes a familiar shape:

Proposition 10

Let \(h:\mathcal {S}\rightarrow L\) be any simple heuristic. For all \(( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k} \in \mathcal {S}^h\), invariant (A3) holds as an equality, namely for all \(j\in [k,n-1]\), \(y_j=g^{n-1-j}(p)\).

By the above proposition and (A3), the negative sequence \(\boldsymbol{y}\) occurring in the execution of AdjointPDR\(_h\), for a simple heuristic h, is the least amongst all the negative sequences occurring in any execution of AdjointPDR.

Instead, invariant (A1) informs us that the positive chain \(\boldsymbol{x}\) always lies between the initial chain of \(f\sqcup i\) and the final chain of \(g \sqcap p\). These extremal values of \(\boldsymbol{x}\) are obtained by the simple initial and the simple final heuristic, respectively.

Example 11

Consider the two runs of AdjointPDR in Example 4. The first one exploits the simple initial heuristic and, indeed, the positive chain \(\boldsymbol{x}\) coincides with the initial chain. Analogously, the second run uses the simple final heuristic.

3.3 Negative Termination

When the lattice L is not finite, AdjointPDR may not terminate, since checking \(\mu (f\sqcup i) \sqsubseteq p\) is not always decidable. In this section, we show that the use of certain heuristics can guarantee termination whenever \(\mu (f \sqcup i) \not \sqsubseteq p\).

The key insight is the following: if \(\mu (f \sqcup i) \not \sqsubseteq p\) then, by (Kl), there must exist some \(\tilde{n}\in \mathbb {N}\) such that \((f \sqcup i)^{\tilde{n}} (\bot ) \not \sqsubseteq p\). By (A1), the rule (Unfold) can be applied only when \((f \sqcup i)^{n-1} (\bot ) \sqsubseteq x_{n-1} \sqsubseteq p\). Since (Unfold) increases n and no other rule decreases it, (Unfold) can be applied at most \(\tilde{n}\) times.

The elements of negative sequences are introduced by rules (Candidate) and (Decide). If we guarantee that for any index \((n,k)\) the heuristic in such cases returns a finite number of values for z, then one can prove termination. To make this formal, we fix \( CaD ^h_{n,k} {\mathop {=}\limits ^{\tiny \text {def}}}\{ ( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k}\in \mathcal {S}^h \mid ( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k}{\mathop {\rightarrow }\limits ^{ Ca }} \text { or } ( \boldsymbol{x} \Vert \boldsymbol{y} )_{n,k}{\mathop {\rightarrow }\limits ^{D}}\}\), i.e., the set of all \((n,k)\)-indexed states reachable by AdjointPDR\(_h\) that trigger (Candidate) or (Decide), and \(h( CaD ^h_{n,k}){\mathop {=}\limits ^{\tiny \text {def}}}\{h(s) \mid s\in CaD ^h_{n,k}\}\), i.e., the set of all possible values returned by h in such states.

Theorem 12

(Negative termination). Let h be a legit heuristic. If \(h( CaD ^h_{n,k})\) is finite for all n, k and \(\mu (f\sqcup i) \not \sqsubseteq p\), then AdjointPDR \(_h\) terminates.

Corollary 13

Let h be a simple heuristic. If \(\mu (f\sqcup i) \not \sqsubseteq p\), then AdjointPDR \(_h\) terminates.

Note that this corollary ensures negative termination whenever we use the canonical choices in (Candidate) and (Decide), irrespective of the choice for (Conflict); therefore it holds for both the simple initial and the simple final heuristic.

4 Recovering Adjoints with Lower Sets

In the previous section, we have introduced an algorithm for checking \(\mu b \sqsubseteq p\) whenever b is of the form \(f\sqcup i\) for an element \(i\in L\) and a left adjoint \(f:L \rightarrow L\). This, unfortunately, is not the case for several interesting problems, like the max reachability problem [1] that we will illustrate in Sect. 5.

The next result informs us that, under standard assumptions, one can transfer the problem of checking \(\mu b \sqsubseteq p\) to lower sets, where adjoints can always be defined. Recall that, for a lattice \((L,\sqsubseteq )\), a lower set is a subset \(X\subseteq L\) such that if \(x\in X\) and \(x'\sqsubseteq x\) then \(x'\in X\); the set of lower sets of L forms a complete lattice \((L^\downarrow , \subseteq )\) with joins and meets given by union and intersection; as expected \(\bot \) is \(\emptyset \) and \(\top \) is L. Given \(b:L\rightarrow L\), one can define two functions \(b^\downarrow , b^\downarrow _r :L^\downarrow \rightarrow L^\downarrow \) as \(b^\downarrow (X) {\mathop {=}\limits ^{\tiny \text {def}}}b(X)^\downarrow \) and \(b^\downarrow _r(X) {\mathop {=}\limits ^{\tiny \text {def}}}\{x \mid b(x) \in X\}\). It holds that \(b^\downarrow \, \dashv \, b^\downarrow _r\).

[diagram (4): L and \(L^\downarrow \) related by \(\bigsqcup \dashv (-)^\downarrow \), with \(b^\downarrow \dashv b^\downarrow _r\) on \(L^\downarrow \) and b on L]

In the diagram above, \((-)^\downarrow :x \mapsto \{x' \mid x' \sqsubseteq x\}\) and \(\bigsqcup :L^\downarrow \rightarrow L\) maps a lower set X into \(\bigsqcup \{x\mid x\in X\}\). The maps \(\bigsqcup \) and \((-)^\downarrow \) form a Galois insertion, namely \(\bigsqcup \dashv (-)^\downarrow \) and \(\bigsqcup (-)^\downarrow = id\), and thus one can think of (4) in terms of abstract interpretation [8, 9]: \(L^\downarrow \) represents the concrete domain, L the abstract domain and b is a sound abstraction of \(b^\downarrow \). Most importantly, it turns out that b is forward-complete [4, 14] w.r.t. \(b^\downarrow \), namely the following equation holds.

$$\begin{aligned} (-)^\downarrow \circ b = b^\downarrow \circ (-)^\downarrow \end{aligned}$$
(5)
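On a finite lattice, lower sets and the two maps above are easy to prototype; the sketch below represents a lower set extensionally as a list, and all names are ours.

```haskell
import Data.List (nub)

-- (-)↓ : the principal lower set of x
down :: (a -> a -> Bool) -> [a] -> a -> [a]
down leq carrier x = [ x' | x' <- carrier, x' `leq` x ]

-- b↓(X) = b(X)↓ : the down-closure of the image of X under b
bDown :: Eq a => (a -> a -> Bool) -> [a] -> (a -> a) -> [a] -> [a]
bDown leq carrier b xs = nub [ y | x <- xs, y <- down leq carrier (b x) ]

-- b↓_r(X) = { x | b(x) ∈ X } : the right adjoint of b↓
bDownR :: Eq a => [a] -> (a -> a) -> [a] -> [a]
bDownR carrier b ys = [ x | x <- carrier, b x `elem` ys ]
```

One can check the adjunction \(b^\downarrow \dashv b^\downarrow _r\) with `isAdjoint` from Sect. 2, instantiated with the subset order on these list-represented lower sets.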

Proposition 14

Let \((L,\sqsubseteq )\) be a complete lattice, \(p\in L\) and \(b :L \rightarrow L\) an \(\omega \)-continuous map. Then \(\mu b \sqsubseteq p\) iff \(\mu (b^\downarrow \cup \bot ^\downarrow ) \subseteq p^\downarrow \).

By means of Proposition 14, we can thus solve \(\mu b \sqsubseteq p\) in L by running AdjointPDR on \((\bot ^\downarrow , b^\downarrow ,b_r^{\downarrow }, p^{\downarrow })\). Hereafter, we tacitly assume that b is \(\omega \)-continuous.

Fig. 4. The algorithm AdjointPDR \(^\downarrow \) for checking \(\mu b \sqsubseteq p\): the elements of the negative sequence are in \(L^\downarrow \), while those of the positive chain are in L, with the only exception of \(x_0\), which is constantly the bottom lower set \(\emptyset \). For \(x_0\), we fix \(b(x_0) = \bot \).

4.1 AdjointPDR \(^\downarrow \): Positive Chain in L, Negative Sequence in \(L^\downarrow \)

While AdjointPDR on \((\bot ^\downarrow , b^\downarrow ,b_r^{\downarrow }, p^{\downarrow })\) might be computationally expensive, it is the first step toward the definition of an efficient algorithm that exploits a convenient form of the positive chain.

A lower set \(X\in L^{\downarrow }\) is said to be a principal if \(X=x^\downarrow \) for some \(x\in L\). Observe that the top of the lattice \((L^\downarrow , \subseteq )\) is a principal, namely \(\top ^\downarrow \), and that the meet (intersection) of two principals \(x^\downarrow \) and \(y^\downarrow \) is the principal \((x\sqcap y)^\downarrow \).

Suppose now that, in (Conflict), AdjointPDR\((\bot ^\downarrow , b^\downarrow ,b_r^{\downarrow }, p^{\downarrow })\) always chooses principals rather than arbitrary lower sets. This suffices to guarantee that all the elements of \(\boldsymbol{x}\) are principals (with the only exception of \(x_0\), which is constantly the bottom element of \(L^\downarrow \), namely \(\emptyset \) and not \(\bot ^\downarrow \)). In fact, the elements of \(\boldsymbol{x}\) are all obtained by (Unfold), which adds the principal \(\top ^\downarrow \), and by (Conflict), which takes their meets with the chosen principal.

Since principals are in bijective correspondence with the elements of L, by imposing to AdjointPDR\((\bot ^\downarrow , b^\downarrow ,b_r^{\downarrow }, p^{\downarrow })\) to choose a principal in (Conflict), we obtain an algorithm, named AdjointPDR \(^\downarrow \), where the elements of the positive chain are drawn from L, while the negative sequence is taken in \(L^{\downarrow }\). The algorithm is reported in Fig. 4 where we use the notation \(( \boldsymbol{x} \Vert \boldsymbol{Y} )_{n,k}\) to emphasize that the elements of the negative sequence are lower sets of elements in L.

All definitions and results illustrated in Sect. 3 for AdjointPDR are inherited by AdjointPDR \(^\downarrow \), with the only exception of Proposition 6.3. The latter does not hold, as it prescribes a choice for (Conflict) that may not be a principal. In contrast, the choice in Proposition 6.4 is, thanks to (5), a principal. This means in particular that the simple initial heuristic is always applicable.

Theorem 15

All results in Sect. 3, but Proposition 6.3, hold for AdjointPDR \(^\downarrow \).

4.2 AdjointPDR \(^\downarrow \) Simulates LT-PDR

The closest approach to AdjointPDR and AdjointPDR \(^\downarrow \) is the lattice-theoretic extension of the original PDR, called LT-PDR [19]. While these algorithms exploit essentially the same positive chain to find an invariant, the main difference lies in the sequence used to witness the existence of counterexamples.

Definition 16

(Kleene sequence, from [19]). A sequence \(\boldsymbol{c}= c_k,\dots , c_{n-1}\) of elements of L is a Kleene sequence if the conditions (C1) and (C2) below hold. It is conclusive if also condition (C0) holds.

$$ \text {(C0)}\ c_1 \sqsubseteq b(\bot ), \qquad \text {(C1)}\ c_{n-1} \not \sqsubseteq p, \qquad \text {(C2)}\ \forall j\in [k,n-2].~c_{j+1} \sqsubseteq b(c_j)\text {.} $$

LT-PDR tries to construct an under-approximation \(c_{n-1}\) of \(b^{n-2}(\bot )\) that violates the property p. The Kleene sequence is constructed by trial and error, starting from some arbitrary choice of \(c_{n-1}\).

AdjointPDR crucially differs from LT-PDR in the search for counterexamples: LT-PDR under-approximates the final chain while AdjointPDR over-approximates it. The two algorithms are thus incomparable. However, we can draw a formal correspondence between AdjointPDR \(^\downarrow \) and LT-PDR by showing that AdjointPDR \(^\downarrow \) simulates LT-PDR, but cannot be simulated by LT-PDR. In fact, AdjointPDR \(^\downarrow \) exploits the existence of the adjoint to start from an over-approximation \(Y_{n-1}\) of \(p^\downarrow \) and to compute, backwards, an over-approximation of the set of safe states. Thus, the key difference comes from the strategy used to look for a counterexample: to prove \(\mu b \not \sqsubseteq p\), AdjointPDR \(^\downarrow \) tries to find \(Y_{n-1}\) satisfying \(p \in Y_{n-1}\) and \(\mu b \not \in Y_{n-1}\), while LT-PDR tries to find \(c_{n-1}\) s.t. \(c_{n-1} \not \sqsubseteq p\) and \(c_{n-1} \sqsubseteq \mu b\).

Theorem 17 below states that any execution of LT-PDR can be mimicked by AdjointPDR \(^\downarrow \). The proof exploits a map from LT-PDR’s Kleene sequences \(\boldsymbol{c}\) to AdjointPDR \(^\downarrow \)’s negative sequences \(\boldsymbol{neg(c)}\) of a particular form. Let \((L^{\uparrow }, \supseteq )\) be the complete lattice of upper sets, namely subsets \(X \subseteq L\) such that \(X=X^\uparrow {\mathop {=}\limits ^{\tiny \text {def}}}\{x'\in L \mid \exists x\in X \,. \, x\sqsubseteq x'\}\). There is an isomorphism \(\lnot :{(L^\uparrow , \supseteq )} {\mathop {\longleftrightarrow }\limits ^{\cong }} (L^\downarrow , \subseteq )\) mapping each \(X\subseteq L\) into its complement. For a Kleene sequence \(\boldsymbol{c} = c_k,\dots , c_{n-1}\) of LT-PDR, the sequence \(\boldsymbol{neg(c)} {\mathop {=}\limits ^{\tiny \text {def}}}\lnot (\{ c_k \}^{\uparrow }), \dots , \lnot (\{ c_{n-1} \}^{\uparrow })\) is a negative sequence, in the sense of Definition 3, for AdjointPDR \(^\downarrow \). Most importantly, the assignment \(\boldsymbol{c} \mapsto \boldsymbol{neg(c)}\) extends to a function, from the states of LT-PDR to those of AdjointPDR \(^\downarrow \), which is proved to be a strong simulation [24].

Theorem 17

AdjointPDR \(^\downarrow \) simulates LT-PDR.

Remarkably, AdjointPDR \(^\downarrow \)’s negative sequences are not limited to the images of LT-PDR’s Kleene sequences: they can be more general than complements of upper closures of singletons. In fact, a single negative sequence of AdjointPDR \(^\downarrow \) can represent multiple Kleene sequences of LT-PDR at once. Intuitively, this means that a single execution of AdjointPDR \(^\downarrow \) can correspond to multiple runs of LT-PDR. We can make this formal by means of the following result.

Proposition 18

Let \(\{\boldsymbol{c^m}\}_{m\in M}\) be a family of Kleene sequences. Then its pointwise intersection \(\bigcap _{m\in M} \boldsymbol{neg(c^m)}\) is a negative sequence.

The above intersection is pointwise in the sense that, for all \(j\in {[k,n-1]}\), it holds that \((\bigcap _{m\in M} \boldsymbol{neg(c^m)})_j {\mathop {=}\limits ^{\tiny \text {def}}}\bigcap _{m\in M} (\boldsymbol{neg(c^m)})_j = \lnot (\{ c_j^m \mid m \in M \}^{\uparrow })\): intuitively, this is (up to \(\boldsymbol{neg(\cdot )}\)) a set containing all the M counterexamples. Note that, if the negative sequence of AdjointPDR \(^\downarrow \) makes (A3) hold as an equality, as is possible with any simple heuristic (see Proposition 10), then its complement contains all Kleene sequences possibly computed by LT-PDR.
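On a finite lattice, the map \(\boldsymbol{neg(\cdot )}\) and the pointwise intersection above are immediate to prototype, reusing the list-based representation of lower sets from Sect. 4 (names ours):

```haskell
-- ¬({c}↑): the complement of the upper closure of the singleton {c},
-- i.e., all elements not above c; this is a lower set.
negOf :: (a -> a -> Bool) -> [a] -> a -> [a]
negOf leq carrier c = [ x | x <- carrier, not (c `leq` x) ]

-- The pointwise intersection of Proposition 18, at a fixed position j:
-- ∩_m ¬({c_m}↑) = ¬({c_m | m ∈ M}↑).
negOfAll :: (a -> a -> Bool) -> [a] -> [a] -> [a]
negOfAll leq carrier cs = [ x | x <- carrier, all (\c -> not (c `leq` x)) cs ]
```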

Proposition 19

Let \(\boldsymbol{c}\) be a Kleene sequence and \(\boldsymbol{Y}\) be the negative sequence s.t. \(Y_j= (b_r^\downarrow )^{n-1-j}(p^\downarrow )\) for all \(j \in [k,n-1]\). Then \(c_j \in \lnot (Y_j)\) for all \(j \in [k,n-1]\).

While the previous result suggests that simple heuristics are always the best in theory, as they can carry all counterexamples, this is often not the case in practice, since they might be computationally hard and outperformed by some smart over-approximations. An example is given by (6) in the next section.

5 Instantiating AdjointPDR \(^\downarrow \) for MDPs

In this section we illustrate how to use AdjointPDR \(^\downarrow \) to address the max reachability problem [1] for Markov Decision Processes.

A Markov Decision Process (MDP) is a tuple \((A, S, s_\iota , \delta )\) where A is a set of labels, S is a set of states, \(s_\iota \in S\) is an initial state, and \(\delta :S\times A \rightarrow \mathcal {D}S + 1\) is a transition function. Here \(\mathcal {D}S\) is the set of probability distributions over S, namely functions \(d:S\rightarrow [0, 1]\) such that \(\sum _{s\in S} d(s)=1\), and \(\mathcal {D}S + 1\) is the disjoint union of \(\mathcal {D}S\) and \(1=\{*\}\). The transition function \(\delta \) assigns to every label \(a\in A\) and every state \(s\in S\) either a distribution over states or \(* \in 1\). We assume that both S and A are finite sets and that the set \( Act (s){\mathop {=}\limits ^{\tiny \text {def}}}\{ a\in A \mid \delta (s,a)\ne *\}\) of actions enabled at s is non-empty for all states.

Intuitively, the max reachability problem requires checking whether the probability of reaching some bad states \(\beta \subseteq S\) is less than or equal to a given threshold \(\lambda \in [0, 1]\). Formally, it can be expressed in lattice-theoretic terms by considering the lattice \(([0, 1]^S,\le )\) of all functions \(d:S\rightarrow [0,1]\), often called frames, ordered pointwise. The max reachability problem consists in checking \(\mu b \le p\) for \(p\in [0,1]^S\) and \(b :[0, 1]^S \rightarrow [0, 1]^S\), defined for all \(d\in [0, 1]^S \) and \(s\in S\), as

$$\begin{aligned} p(s){\mathop {=}\limits ^{\tiny \text {def}}}{\left\{ \begin{array}{ll} \lambda &{}\text { if } s=s_\iota , \\ 1 &{}\text { if } s \ne s_\iota , \end{array}\right. } \qquad b(d)(s) {\mathop {=}\limits ^{\tiny \text {def}}}{\left\{ \begin{array}{ll} 1 &{}\text { if } s \in \beta , \\ \displaystyle \max _{a \in Act (s)} \sum _{s'\in S} d(s') \cdot \delta (s, a)(s') &{}\text { if } s \notin \beta . \end{array}\right. } \end{aligned}$$

The reader is referred to [1] for all details.
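A direct Haskell transcription of b is short. The sketch below uses Rational for probabilities, as our implementation does, but the types and names (`Frame`, `bMap`, `act`, `deltaP`, `bad`) are ours and only illustrative.

```haskell
import qualified Data.Map as M

type Frame s = M.Map s Rational   -- d ∈ [0,1]^S over a finite state space

bMap :: Ord s
     => (s -> [a])                    -- Act(s), assumed non-empty
     -> (s -> a -> [(s, Rational)])   -- δ(s,a), as a finitely supported distribution
     -> (s -> Bool)                   -- membership in the bad states β
     -> Frame s -> Frame s
bMap act deltaP bad d = M.mapWithKey step d
  where
    step s _
      | bad s     = 1
      | otherwise = maximum [ sum [ q * M.findWithDefault 0 s' d
                                  | (s', q) <- deltaP s a ]
                            | a <- act s ]
```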

Since b is not of the form \(f\sqcup i\) for a left adjoint f (see e.g. [19]), rather than using AdjointPDR, one can exploit AdjointPDR \(^\downarrow \). Beyond the simple initial heuristic, which is always applicable and enjoys negative termination, we now illustrate two additional heuristics that are experimentally tested in Sect. 6.

The two novel heuristics make the same choices in (Candidate) and (Decide). They exploit functions \(\alpha :S \rightarrow A\), also known as memoryless schedulers, and the function \(b_{\alpha } :[0, 1]^S \rightarrow [0, 1]^S\) defined for all \(d\in [0, 1]^S \) and \(s\in S\) as follows:

$$\begin{aligned} b_{\alpha }(d)(s) {\mathop {=}\limits ^{\tiny \text {def}}}{\left\{ \begin{array}{ll} 1 &{}\text { if } s \in \beta , \\ \sum _{s'\in S} d(s') \cdot \delta (s, \alpha (s))(s') &{}\text { otherwise}. \end{array}\right. } \end{aligned}$$
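In the same vein, \(b_{\alpha }\) drops the maximum by fixing the action \(\alpha (s)\) in each state; a sketch reusing the assumed types above:

```haskell
bAlpha :: Ord s
       => (s -> a)                      -- a memoryless scheduler α
       -> (s -> a -> [(s, Rational)])   -- δ(s,a)
       -> (s -> Bool)                   -- membership in β
       -> Frame s -> Frame s
bAlpha alpha deltaP bad d = M.mapWithKey step d
  where
    step s _
      | bad s     = 1
      | otherwise = sum [ q * M.findWithDefault 0 s' d
                        | (s', q) <- deltaP s (alpha s) ]
```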

Since, for all \(D\in ([0,1]^S)^\downarrow \), \(b^\downarrow _r (D) = \{d \mid b(d) \in D\} = \bigcap _{\alpha }\{d \mid b_{\alpha } (d)\in D\}\), and since AdjointPDR \(^\downarrow \) executes (Decide) only when \(b(x_{k-1}) \notin Y_k\), there must exist some \(\alpha \) such that \(b_{\alpha } (x_{k-1})\notin Y_k\). One can thus fix

$$\begin{aligned} ( \boldsymbol{x} \Vert \boldsymbol{Y} )_{n,k} \mapsto {\left\{ \begin{array}{ll} p^\downarrow &{} \text {if }( \boldsymbol{x} \Vert \boldsymbol{Y} )_{n,k} {\mathop {\rightarrow }\limits ^{ Ca }} \\ \{d \mid b_{\alpha }(d) \in Y_k\} &{} \text {if }( \boldsymbol{x} \Vert \boldsymbol{Y} )_{n,k} {\mathop {\rightarrow }\limits ^{D}} \end{array}\right. } \end{aligned}$$
(6)

Intuitively, such choices are smart refinements of those in (3): for (Candidate) they are exactly the same; for (Decide), rather than taking \(b^\downarrow _r (Y_k)\), we consider a larger lower set determined by the labels chosen by \(\alpha \). This allows us to represent each \(Y_j\) as a set of \(d\in [0, 1]^S \) satisfying a single linear inequality, while using \(b^\downarrow _r (Y_k)\) would yield a system of possibly exponentially many inequalities (see Example 21 below). Moreover, from Theorem 12, it follows that such choices ensure negative termination.

Corollary 20

Let h be a legit heuristic defined for (Candidate) and (Decide) as in (6). If \(\mu b \not \le p\), then AdjointPDR \(^\downarrow \)\(_h\) terminates.

Example 21

Consider the maximum reachability problem with threshold \(\lambda = \frac{1}{4}\) and \(\beta = \{s_3\}\) for the following MDP on alphabet \(A=\{a,b\}\) and \(s_\iota =s_0\).

[figure: the MDP of Example 21]

Hereafter we write \(d\in [0,1]^S\) as column vectors with four entries \(v_0, \dots , v_3\) and we use \(\cdot \) for the usual matrix multiplication. With this notation, the lower set \(p^\downarrow \in ([0,1]^S)^\downarrow \) and \(b:[0,1]^S \rightarrow [0,1]^S\) can be written as

$$ p^\downarrow = \Big \{ \begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix} \;\Big |\; \begin{bmatrix} 1&0&0&0 \end{bmatrix} \cdot \begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix} \le \begin{bmatrix} \frac{1}{4} \end{bmatrix} \Big \} \quad \text { and } \quad b\Big (\begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix}\Big ) = \begin{bmatrix} \max (\frac{v_1+v_2}{2}, \frac{v_0+2v_2}{3}) \\ \frac{v_0+v_3}{2} \\ v_0 \\ 1 \end{bmatrix}. $$

Amongst the several memoryless schedulers, only two are relevant for us: \(\zeta {\mathop {=}\limits ^{\tiny \text {def}}}( s_0 \mapsto a ,\; s_1 \mapsto a ,\; s_2 \mapsto b ,\; s_3 \mapsto a )\) and \(\xi {\mathop {=}\limits ^{\tiny \text {def}}}(s_0 \mapsto b ,\; s_1 \mapsto a ,\; s_2 \mapsto b ,\; s_3 \mapsto a)\). By using the definition of \(b_\alpha :[0,1]^S \rightarrow [0,1]^S\), we have that

$$\begin{aligned} b_\zeta \Big (\begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix}\Big ) = \begin{bmatrix} \frac{v_1+v_2}{2} \\ \frac{v_0+v_3}{2} \\ v_0 \\ 1 \end{bmatrix} \qquad \text { and } \qquad b_\xi \Big (\begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix}\Big ) = \begin{bmatrix} \frac{v_0+2v_2}{3} \\ \frac{v_0+v_3}{2} \\ v_0 \\ 1 \end{bmatrix}. \end{aligned}$$

It is immediate to see that the problem has a negative answer: under \(\zeta \), in 4 steps or less, \(s_0\) reaches \(s_3\) already with probability \(\frac{1}{4}+\frac{1}{8} > \lambda \) (via the paths \(s_0 s_1 s_3\) and \(s_0 s_2 s_0 s_1 s_3\)).

Fig. 5. The elements of the negative sequences computed by AdjointPDR \(^\downarrow \) for the MDP in Example 21. In the central column, these elements are computed by means of the simple initial heuristic, that is, \(\mathcal {F}^i=(b_r^\downarrow )^i(p^\downarrow )\). In the rightmost column, they are computed using the heuristic in (6): \(\mathcal {F}^i = \{d\mid b_\zeta (d) \in \mathcal {F}^{i-1} \}\) for \(i\le 3\), while for \(i\ge 4\) they are computed as \(\mathcal {F}^i = \{d\mid b_\xi (d) \in \mathcal {F}^{i-1} \}\).

To illustrate the advantages of (6), we run AdjointPDR \(^\downarrow \) with the simple initial heuristic and with the heuristic that differs only in the choice in (Decide), taken as in (6). For both heuristics, the first iterations are the same: several repetitions of (Candidate), (Conflict) and (Unfold), exploiting elements of the positive chain that form the initial chain (except for the last element \(x_{n-1}\)).

[figure: the common prefix of the two runs]

In the latter state the algorithm has to perform (Decide), since \(b(x_5) \notin p^\downarrow \). Now the choice of z in (Decide) differs for the two heuristics: the former uses \(b_r^\downarrow (p^\downarrow ) = \{d \mid b(d) \in p^\downarrow \}\), the latter uses \(\{d \mid b_\zeta (d) \in p^\downarrow \}\). Despite the different choices, both heuristics proceed with 6 steps of (Decide):

[figure: six steps of (Decide)]

The elements of the negative sequences \(\mathcal {F}^i\) are illustrated in Fig. 5 for both heuristics. In both cases, \(\mathcal {F}^5=\emptyset \) and thus AdjointPDR \(^\downarrow \) returns false.

To appreciate the advantages provided by (6), it is enough to compare the two columns for the \(\mathcal {F}^i\) in Fig. 5: in the central column, the number of inequalities defining \(\mathcal {F}^i\) grows significantly, while in the rightmost column it is always 1.

Whenever \(Y_k\) is generated by a single linear inequality, we observe that \(Y_k=\{d\in [0,1]^S \mid \sum _{s\in S}(r_s \cdot d(s)) \le r \}\) for suitable non-negative real numbers r and \(r_s\) for all \(s\in S\). The convex set \(Y_k\) is generated by finitely many \(d\in [0,1]^S\) enjoying a convenient property: d(s) is different from 0 and 1 for at most one \(s\in S\). The set of its generators, denoted by \(\mathcal {G}_k\), can thus be easily computed. We exploit this property to resolve the choice for (Conflict). We consider the subset \(\mathcal {Z}_k{\mathop {=}\limits ^{\tiny \text {def}}}\{d \in \mathcal {G}_k \mid b(x_{k-1}) \le d\}\) and define \(z_{B}, z_{01}\in [0,1]^S\) for all \(s\in S\) as

$$\begin{aligned} z_{B}(s) {\mathop {=}\limits ^{\tiny \text {def}}} {\left\{ \begin{array}{ll} (\bigwedge \mathcal {Z}_k)(s) &{} \text {if } r_s \ne 0,\ \mathcal {Z}_k\ne \emptyset \\ b(x_{k-1})(s) &{} \text {otherwise} \end{array}\right. } \qquad z_{01}(s) {\mathop {=}\limits ^{\tiny \text {def}}} {\left\{ \begin{array}{ll} \lceil z_{B}(s)\rceil &{} \text {if } r_s = 0,\ \mathcal {Z}_k\ne \emptyset \\ z_{B}(s) &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(7)

where, for \(u\in [0,1]\), \(\lceil u \rceil \) denotes 0 if \(u=0\) and 1 otherwise. We call hCoB and hCo01 the heuristics defined as in (6) for (Candidate) and (Decide) and as \(z_{B}\), respectively \(z_{01}\), for (Conflict). The heuristic hCo01 can be seen as a Boolean modification of hCoB, rounding positive values up to 1 to accelerate convergence.

Proposition 22

The heuristics hCoB and hCo01 are legit.

By Corollary 20, AdjointPDR \(^\downarrow \) terminates for negative answers with both hCoB and hCo01. We conclude this section with a last example.

Example 23

Consider the following MDP with alphabet \(A=\{a,b\}\) and \(s_\iota =s_0\)

[figure: the MDP of Example 23]

and the max reachability problem with threshold \(\lambda = \frac{2}{5}\) and \(\beta =\{s_3\}\). The lower set \(p^\downarrow \in ([0,1]^S)^\downarrow \) and \(b:[0,1]^S \rightarrow [0,1]^S\) can be written as

$$ p^\downarrow = \Big \{ \begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix} \;\Big |\; \begin{bmatrix} 1&0&0&0 \end{bmatrix} \cdot \begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix} \le \begin{bmatrix} \frac{2}{5} \end{bmatrix} \Big \} \quad \text { and } \quad b\Big (\begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix}\Big ) = \begin{bmatrix} \max (v_0, \frac{v_1+v_2}{2}) \\ \frac{v_0+2\cdot v_3}{3} \\ v_2 \\ 1 \end{bmatrix} $$

With the simple initial heuristic, AdjointPDR \(^\downarrow \) does not terminate. With the heuristic hCo01, it returns true in 14 steps, while with hCoB in 8. The first 4 steps, common to both hCoB and hCo01, are illustrated below.

[figure: the first 4 steps of the run]

Observe that in the first (Conflict) \(z_B = z_{01}\), while in the second \(z_{01}(s_1)=1\) and \(z_{B}(s_1)=\frac{4}{5}\), leading to the two different states prefixed by vertical lines.

6 Implementation and Experiments

We first developed, using Haskell and exploiting its abstraction features, a common template that accommodates both AdjointPDR and AdjointPDR \(^\downarrow \). It is a program parametrized by two lattices (used for positive chains and negative sequences, respectively) and by a heuristic.

For our experiments, we instantiated the template to AdjointPDR \(^\downarrow \) for MDPs (letting \(L=[0,1]^{S}\)), with three different heuristics: hCoB and hCo01 from Proposition 22, and hCoS introduced below. Besides the template (\(\sim \)100 lines), we needed \(\sim \)140 lines to account for hCoB and hCo01, and an additional \(\sim \)100 lines to further obtain hCoS. All this indicates a clear benefit of our abstract theory: a general template can itself be coded succinctly; instantiation to concrete problems is easy, too, thanks to an explicitly specified interface of heuristics.

Our implementation accepts MDPs expressed in a symbolic format inspired by Prism models [20], in which states are variable valuations and transitions are described by symbolic functions (they can be segmented with symbolic guards \(\{\text {guard}_i\}_{i}\)). We use rational arithmetic (Rational in Haskell) for probabilities to limit the impact of rounding errors.

Heuristics. The three heuristics (hCoB, hCo01, hCoS) use the same choices in (Candidate) and (Decide), as defined in (6), but different ones in (Conflict).

The third heuristic hCoS is a symbolic variant of hCoB; it relies on our symbolic model format. It uses \(z_{S}\) for z in (Conflict), where \(z_{S}(s)=z_{B}(s)\) if \(r_{s}\ne 0\) or \(\mathcal {Z}_k=\emptyset \). The definition of \(z_{S}(s)\) in the remaining case is notable: we use a piecewise affine function \((t_{i}\cdot s + u_{i})_{i}\) for \(z_{S}(s)\), where the affine pieces \((t_{i}\cdot s + u_{i})_{i}\) are guarded by the same guards \(\{\text {guard}_i\}_{i}\) of the MDP’s transition function. We let the SMT solver Z3 [25] search for the values of the coefficients \(t_{i}, u_{i}\), so that \(z_{S}\) satisfies the requirements of (Conflict) (namely \(b(x_{k-1})(s) \le z_{S}(s) \le 1\) for each \(s\in S\) with \(r_s=0\)), together with the condition \(b (z_{S}) \le z_{S}\) for faster convergence. If the search is unsuccessful, we give up hCoS and fall back on the heuristic hCoB.

As a task common to the three heuristics, we need to calculate \(\mathcal {Z}_k = \{d \in \mathcal {G}_k \mid b(x_{k-1}) \le d\}\) in (Conflict) (see (7)). Rather than computing the whole set \(\mathcal {G}_k\) of generating points of the linear inequality that defines \(Y_{k}\), we implemented an ad-hoc algorithm that crucially exploits the condition \(b(x_{k-1}) \le d\) for pruning.

Experiment Settings. We conducted the experiments on Ubuntu 18.04 and AWS t2.xlarge (4 CPUs, 16 GB memory, up to 3.0 GHz Intel Scalable Processor). We used several Markov chain (MC) benchmarks and a couple of MDP ones.

Table 1. Experimental results on MC benchmarks. |S| is the number of states, P is the reachability probability (calculated by manual inspection), \(\lambda \) is the threshold in the problem \(P\le _{?} \lambda \) (shaded if the answer is no). The other columns show the average execution time in seconds; TO is timeout (900 s); MO is out-of-memory. For AdjointPDR \(^\downarrow \) and LT-PDR we used the tasty-bench Haskell package and repeated executions until std. dev. is < 5% (at least three execs). For PrIC3 and Storm, we made five executions. Storm’s execution does not depend on \(\lambda \): it seems to answer queries of the form \(P\le _{?} \lambda \) by calculating P. We observed a wrong answer for the entry with \((\dagger )\) (Storm, sp.-num., Haddad-Monmege); see the discussion of RQ2.

Research Questions. We wish to address the following questions.

RQ1: Does AdjointPDR \(^\downarrow \) advance the state-of-the-art performance of PDR algorithms for probabilistic model checking?

RQ2: How does AdjointPDR \(^\downarrow \)’s performance compare against non-PDR algorithms for probabilistic model checking?

RQ3: Does the theoretical framework of AdjointPDR \(^\downarrow \) successfully guide the discovery of various heuristics with practical performance?

RQ4: Does AdjointPDR \(^\downarrow \) successfully manage nondeterminism in MDPs (which is absent in MCs)?

Experiments on MCs (Table 1). We used six benchmarks: Haddad-Monmege is from [17]; the others are from [3, 19]. We compared AdjointPDR \(^\downarrow \) (with three heuristics) against LT-PDR [19], PrIC3 (with four heuristics none, lin., pol., hyb., see [3]), and Storm 1.5 [11]. Storm is a recent comprehensive toolsuite that implements different algorithms and solvers. Among them, our comparison is against sparse-numeric, sparse-rational, and sparse-sound. The sparse engine uses explicit state space representation by sparse matrices; this is unlike another representative dd engine that uses symbolic BDDs. (We did not use dd since it often reported errors, and was overall slower than sparse.) Sparse-numeric is a value-iteration (VI) algorithm; sparse-rational solves linear (in)equations using rational arithmetic; sparse-sound is a sound VI algorithm [26].

Table 2. Experimental results on MDP benchmarks. The legend is the same as Table 1, except that P is now the maximum reachability probability.

Experiments on MDPs (Table 2). We used two benchmarks from [17]. We compared AdjointPDR \(^\downarrow \) only against Storm, since RQ1 is already addressed using MCs (besides, PrIC3 did not run for MDPs).

Discussion. The experimental results suggest the following answers to the RQs.

RQ1. The performance advantage of AdjointPDR \(^\downarrow \), over both LT-PDR and PrIC3, was clearly observed throughout the benchmarks. AdjointPDR \(^\downarrow \) outperformed LT-PDR, thus confirming empirically the theoretical observation in Sect. 4.2. The gain is particularly evident in those instances whose answer is positive. AdjointPDR \(^\downarrow \) generally outperformed PrIC3, too. Exceptions are in ZeroConf, Chain and DoubleChain, where PrIC3 with polynomial (pol.) and hybrid (hyb.) heuristics performs well. This seems to be thanks to the expressivity of the polynomial template in PrIC3, a possible enhancement that we are yet to implement (currently our symbolic heuristic hCoS uses only the affine template).

RQ2. The comparison with Storm is interesting. Note first that Storm’s sparse-numeric algorithm is a VI algorithm that gives a guaranteed lower bound without guaranteed convergence. Therefore its positive answer to \(P\le _{?}\lambda \) may not be correct. Indeed, for Haddad-Monmege with \(|S|\sim 10^{3}\), it answered \(P=0.5\), which is wrong (\((\dagger )\) in Table 1). This is in contrast with PDR algorithms, which discover an explicit witness for \(P\le \lambda \) via their positive chain.

Storm’s sparse-rational algorithm is precise. It was faster than PDR algorithms in many benchmarks, although AdjointPDR \(^\downarrow \) was better or comparable in ZeroConf (\(10^4\)) and Haddad-Monmege (41), for \(\lambda \) such that \(P\le \lambda \) is true. We believe this suggests a general advantage of PDR algorithms, namely that they accelerate the search for an invariant-like witness of safety.

Storm’s sparse-sound algorithm is a sound VI algorithm that returns correct answers aside numerical errors. Its performance was similar to that of sparse-numeric, except for the two instances of Haddad-Monmege: sparse-sound returned correct answers but was much slower than sparse-numeric. For these two instances, AdjointPDR \(^\downarrow \) outperformed sparse-sound.

It seems that a big part of Storm’s good performance can be attributed to its sparse state representation. This is notable in the comparison of the two instances of Haddad-Monmege (41 vs. \(10^3\)): while Storm handles both of them easily, AdjointPDR \(^\downarrow \) struggles a bit in the bigger instance. Our implementation can be extended to use a sparse representation, too; this is future work.

RQ3. We derived the three heuristics (hCoB, hCo01, hCoS) exploiting the theory of AdjointPDR \(^\downarrow \) . The experiments show that each heuristic has its own strength. For example, hCo01 is slower than hCoB for MCs, but it is much better for MDPs. In general, there is no silver bullet heuristic, so coming up with a variety of them is important. The experiments suggest that our theory of AdjointPDR \(^\downarrow \) provides great help in doing so.

RQ4. Table 2 shows that AdjointPDR \(^\downarrow \) can handle nondeterminism well: once a suitable heuristic is chosen, its performance on MDPs and on MCs of similar size is comparable. It is also interesting that the better-performing heuristics vary, as we discussed above.

Summary. AdjointPDR \(^\downarrow \) clearly outperforms existing probabilistic PDR algorithms in many benchmarks. It also compares well with Storm, a highly sophisticated toolsuite, in a couple of benchmarks. This is especially notable given that AdjointPDR \(^\downarrow \) currently lacks enhancing features such as richer symbolic templates and sparse representation (adding which is future work). Overall, we believe that AdjointPDR \(^\downarrow \) confirms the potential of PDR algorithms in probabilistic model checking. Through the three heuristics, we also observed the value of an abstract general theory in devising heuristics for PDR, which is probably true of verification algorithms in general besides PDR.