
1 Introduction

The decomposition of a graph into its strongly connected components (SCCs) is one of the most standard tasks in automated system verification. For example, model checking against LTL and \(\omega \)-regular properties reduces to computing cycles [30], while fairness conditions are typically checked given an SCC decomposition of the graph [21, 34]. Of special interest are bottom/terminal SCCs (or BSCCs), i.e., SCCs that, once entered, cannot be escaped. BSCCs are used to speed up LTL model checking [28], and they capture the long-run properties of Markov Chains [4, 11] and Markov Decision Processes [13, 23], while they also correspond to the attractors of dynamical systems, as in signal transduction networks [29, 33].

Large-scale model-checking settings comprise huge systems that suffer from the state-space explosion problem. These systems are usually represented compactly by a model, e.g., by means of a programming language, a logic or a reaction network, and have size that is exponentially large in this description. Nevertheless, the system typically exhibits numerous symmetries that can be preserved when the state space is represented symbolically rather than explicitly. One predominant symbolic representation is via (reduced/ordered) Binary Decision Diagrams (BDDs) [10], which are found at the core of many classic and modern model checkers [5, 14, 20, 24, 26]. To benefit from the symbolic representation, analysis algorithms typically only have coarse-grained access to the graph, querying for the successors (\({\text {Post}}(X)\)) and predecessors (\({\text {Pre}}(X)\)) of a set of nodes X represented by a single BDD. Each such operation counts as a symbolic step. As symbolic steps are significantly slower than primitive operations, they serve as the complexity measure of symbolic algorithms [9, 12, 18, 25].

Due to the prevalence of SCC decomposition, the problem has been studied extensively in the symbolic setting, starting with the \(\textsc {Xie}\text {-}\textsc {Beerel}\) algorithm [32] of symbolic complexity \(O(n^2)\); \(\textsc {LockStep}\) [8] improves this bound to \(O(n\log n)\), while \(\textsc {Skeleton}\) [17] achieves O(n) time at the expense of \(\varTheta (n)\) symbolic space (i.e., number of BDDs). The most recent step in this progression is \(\textsc {Chain}\) [25] which achieves both O(n) symbolic time and \(O(\log n)\) symbolic space. In practice, heuristics aim to further improve the running time [16, 31, 34].

Naturally, the computation of BSCCs can be achieved by using one of the aforementioned algorithms to obtain an SCC decomposition, and checking whether each SCC is indeed a BSCC. In practice, however, computing an SCC can be expensive, as it typically requires traversing it multiple times. For this reason, algorithms dedicated to BSCCs have received special attention. Although these do not offer theoretical improvements, they attempt to minimize the number of non-bottom SCCs computed and thus perform better in practice.

The predominant, general-purpose BSCC-decomposition algorithm is \(\textsc {BwdFwd}\), which is a modification of \(\textsc {Xie}\text {-}\textsc {Beerel}\) [32], and has O(n) complexity. Effectively, this algorithm aborts the computation of an SCC S as soon as it determines that S cannot be a BSCC, and removes it from the graph, as well as any node that can reach S. A recently-introduced preprocessing technique, called interleaved transition-guided reduction (ITGR) [6], aims to further detect and discard non-bottom SCCs before the main algorithm is run. ITGR is general-purpose, and was shown to be effective in handling asynchronous Boolean Network models [1,2,3]. However, as these algorithms are typically executed on huge inputs, issues of scalability often remain. We address this challenge here.

1.1 Our contributions

The \(\textsc {Pendant}\) algorithm. We develop a new, linear-time algorithm for symbolic BSCC computation, called \(\textsc {Pendant}\), drawing inspiration from the recent \(\textsc {Chain}\) algorithm [25]. In contrast to the existing BSCC paradigm based on stopping the computation of SCCs that are deemed non-bottom, \(\textsc {Pendant}\) aims to start such computations from SCCs that are likely to be bottom. To achieve this, while \(\textsc {Pendant}\) computes an SCC, it also implicitly (at no extra cost) traverses the quotient graph Q downwards, making future SCC computations start from nodes that are close to the bottom of Q, and thus discover a BSCC quickly.

Deadlock detection. We employ a simple yet powerful preprocessing technique, called deadlock detection. This is based on the insights that (i) each deadlock (singleton SCC) is a BSCC, and (ii) all deadlocks can be computed effectively in a single symbolic step.

Experimental evaluation. We implement \(\textsc {Pendant}\) and the deadlock-detection preprocessing, and evaluate their performance on computing the BSCCs of a large pool of models from three diverse datasets, namely, (i) Petri Nets from the Model Checking Contest [22], (ii) DiVinE models from the Benchmark of Explicit Models [27], and (iii) Asynchronous Boolean Network models [1,2,3]. Our experiments show that (i) \(\textsc {Pendant}\) is decisively more efficient than \(\textsc {BwdFwd}\), (ii) deadlock detection improves the performance of both algorithms, and (iii) after deadlock detection, ITGR is scarcely effective.

2 Preliminaries

In this section we present standard definitions and the \(\textsc {BwdFwd}\) algorithm.

2.1 Graphs, Bottom SCCs and Symbolic Representations

Graphs. We consider directed graphs \(G = (V,E)\), where V is a set of nodes and \(E \subseteq V \times V\) is a set of edges. We often write \(u\rightarrow v\) to denote an edge \((u,v)\in E\). For a node v, the image of v is \({\text {Post}}(v) = \{ u \mid v\rightarrow u \}\), while the pre-image of v is \({\text {Pre}}(v) = \{ u \mid u\rightarrow v \}\). These notions are extended to sets of nodes X in the natural way, i.e., \({\text {Post}}(X) = \bigcup _{v \in X} {\text {Post}}(v)\) and \({\text {Pre}}(X) = \bigcup _{v \in X} {\text {Pre}}(v)\).

A path is a sequence \(P=v_1\rightarrow v_2\rightarrow \dots \rightarrow v_k\), in which case we also write \(v_1\leadsto v_k\), and say that \(v_k\) is reachable from \(v_1\). The length of P is \(|P|=k-1\). For a set of nodes X we write \({\text {Fwd}}(X) = \{ u \mid \exists v \in X, v \leadsto u \}\) for the forward set of X and \({\text {Bwd}}(X) = \{ u \mid \exists v \in X, u \leadsto v \}\) for the backward set of X. We call a set \(X\subseteq V\) forward-closed if \({\text {Fwd}}(X)\subseteq X\). The restriction of G on a set \(X \subseteq V\) is the graph \(G[X] = (X, (X \times X) \cap E)\). A node \(v\in V\) is called a deadlock if it has no outgoing edges, i.e., \({\text {Post}}(v)=\emptyset \).
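To make these operations concrete, the following sketch (ours, for illustration only) realizes \({\text {Post}}\), \({\text {Pre}}\), \({\text {Fwd}}\) and \({\text {Bwd}}\) with explicit Python sets standing in for BDDs. In an actual symbolic implementation, each call to `post`/`pre` below would count as one symbolic step, and the fixpoint loops would operate on BDDs rather than enumerated edge sets.

```python
# Illustration only: explicit Python sets stand in for symbolic BDDs.
# Edges are pairs (v, u); node sets are plain sets.

def post(E, X):
    """Image Post(X): all successors of nodes in X."""
    return {u for (v, u) in E if v in X}

def pre(E, X):
    """Pre-image Pre(X): all predecessors of nodes in X."""
    return {v for (v, u) in E if u in X}

def fwd(E, X):
    """Forward set Fwd(X): least fixpoint of F = X ∪ Post(F)."""
    F = set(X)
    while True:
        new = post(E, F) - F
        if not new:
            return F
        F |= new

def bwd(E, X):
    """Backward set Bwd(X): least fixpoint of B = X ∪ Pre(B)."""
    B = set(X)
    while True:
        new = pre(E, B) - B
        if not new:
            return B
        B |= new
```

Each iteration of the `fwd`/`bwd` loops corresponds to discovering one more BFS layer, which is exactly the quantity charged as symbolic steps later in the paper.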

Bottom Strongly Connected Components (BSCCs). A strongly connected component (SCC) of G is a maximal set of nodes S such that for all \(u,v\in S\) we have \(u\leadsto v\). Each node v belongs to one SCC, written \({\text {SCC}}(v)\). A set \(X\subseteq V\) is called SCC-closed if for each \(v\in X\), we have \({\text {SCC}}(v)\subseteq X\). The diameter of an SCC S is the maximum distance between two nodes in S, i.e.,

$$ \delta (S)=\max _{u,v\in S}\min _{P:u\leadsto v} |P| $$

The quotient graph of G represents each SCC of G by a single node, and has a directed edge \(S\rightarrow S'\) iff \({\text {Post}}(S)\cap S'\ne \emptyset \), i.e., there exist nodes \(u\in S\) and \(v\in S'\) with \(u\rightarrow v\). The quotient graph is a directed acyclic graph. The leaf nodes of the quotient graph represent the SCCs that have no outgoing edges to any other SCCs, called bottom SCCs (or BSCCs). We denote by \({\text {SCCs}}(G)\) and \({\text {BSCCs}}(G)\) the set of SCCs and BSCCs of G, respectively.

The problem targeted in this paper is the computation of BSCCs. The following two simple properties of BSCCs are used throughout the paper.

Proposition 1

An SCC S is a BSCC if and only if \({\text {Fwd}}(S) = S\).

Proposition 2

If S is a BSCC then there is no BSCC in \({\text {Bwd}}(S)\setminus S\).

Symbolic operations and complexity. In large-scale model-checking settings, graphs are typically represented symbolically. One popular symbolic representation is Binary Decision Diagrams (BDDs) [19]. In particular, the node set V and edge relation E are represented compactly as BDDs, while algorithms use BDDs as data structures for representing subsets of V and E. The basic BDD operations give only coarse-grained access to the graph: given a BDD representing a set of nodes X, an algorithm can access \({\text {Pre}}(X)\) and \({\text {Post}}(X)\), each of which counts as one symbolic step. The complexity of symbolic algorithms is measured in the number of symbolic steps they execute [12, 25], since these are much slower than elementary operations (e.g., incrementing a counter). Basic set operations on BDDs (union, intersection, etc.) also do not count towards the time complexity. Finally, given a set X represented as a BDD, we use a \(\textsc {Pick}(X)\) operation which returns an arbitrary node \(v\in X\). This operation is natural and efficient for BDDs, and has been common in symbolic SCC algorithms [8, 17, 25].

2.2 The \(\textsc {BwdFwd}\) Algorithm for BSCCs

The symbolic computation of \({\text {BSCCs}}(G)\) can be performed by computing each \(S\in {\text {SCCs}}(G)\) using some existing symbolic algorithm [8, 17, 25, 32], and then reporting that S is a BSCC iff \({\text {Post}}(S)\subseteq S\) (following Proposition 1). Although this approach runs in O(n) symbolic steps when using \(\textsc {Chain}\) [25] or \(\textsc {Skeleton}\) [17], it can be unnecessarily slow in practice, as it typically spends considerable time computing SCCs that are not BSCCs. For this reason, the computation of BSCCs is targeted by algorithms dedicated to this task. The standard symbolic BSCC algorithm is \(\textsc {BwdFwd}\), which we briefly present here.
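This baseline can be sketched as follows (our own explicit-set illustration: Python sets stand in for BDDs, and the textbook identity \({\text {SCC}}(v) = {\text {Fwd}}(v) \cap {\text {Bwd}}(v)\) replaces the cited O(n) decomposition algorithms, so the sketch conveys only the structure of the approach, not its complexity):

```python
# Baseline sketch: decompose the graph into SCCs, then keep an SCC S
# iff Post(S) ⊆ S (Proposition 1). Explicit sets stand in for BDDs.

def _post(E, X):
    return {u for (v, u) in E if v in X}

def _pre(E, X):
    return {v for (v, u) in E if u in X}

def _reach(E, X, step):
    # Generic fixpoint: repeatedly extend X with step(X) until stable.
    R = set(X)
    while True:
        new = step(E, R) - R
        if not new:
            return R
        R |= new

def bsccs_baseline(V, E):
    remaining, bsccs = set(V), []
    while remaining:
        v = next(iter(remaining))  # Pick: an arbitrary pivot
        Er = {(a, b) for (a, b) in E
              if a in remaining and b in remaining}
        S = _reach(Er, {v}, _post) & _reach(Er, {v}, _pre)  # SCC(v)
        if _post(E, S) <= S:       # Proposition 1: S is a BSCC
            bsccs.append(S)
        remaining -= S             # removing whole SCCs keeps the rest intact
    return bsccs
```

Note that the bottomness test uses the full edge set E: since only whole SCCs are removed, \({\text {SCC}}(v)\) in the remaining graph coincides with \({\text {SCC}}(v)\) in G.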

Algorithm 1: \(\textsc {Bwd}\)

Algorithm 2: \(\textsc {BwdFwd}\)

The Backward-Forward BSCC algorithm. \(\textsc {BwdFwd}\) is an adaptation of the standard Xie-Beerel algorithm [32]. Algorithm 2 follows its recent presentation in [6], adapted to our setting. The algorithm uses the standard mechanism for computing SCCs symbolically: given a pivot node v, we have \({\text {SCC}}(v) = {\text {Fwd}}(v) \cap {\text {Bwd}}(v)\). Given such a node v, \(\textsc {BwdFwd}\) first calls Algorithm 1 (Line 3) to retrieve the backward set \({\text {Bwd}}(v)\) (called the basin of v) using a standard fixpoint computation. Then, it uses a similar fixpoint computation to retrieve \({\text {Fwd}}(v)\) (Line 5) in F. This computation is terminated early if the algorithm discovers that \({\text {Fwd}}(v)\not \subseteq {\text {Bwd}}(v)\), as then \({\text {Fwd}}(v)\not \subseteq {\text {SCC}}(v)\), and due to Proposition 1, we have that \({\text {SCC}}(v)\) is not a BSCC. On the other hand, if the computation is carried to a fixpoint, we have that \({\text {Fwd}}(v)\subseteq {\text {Bwd}}(v)\) and thus \({\text {Fwd}}(v)={\text {SCC}}(v)\); then, Proposition 1 guarantees that \({\text {SCC}}(v)\) is a BSCC. Since the check in Line 9 succeeds, \(\textsc {BwdFwd}\) correctly outputs \({\text {SCC}}(v)\) as a BSCC. Finally, Proposition 2 guarantees that the basin \({\text {Bwd}}(v)\) contains no BSCC, except possibly \({\text {SCC}}(v)\) which was just outputted. The algorithm hence safely removes \({\text {Bwd}}(v)\) from G, and proceeds recursively (Line 10).
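The description above can be condensed into the following explicit-set sketch (ours; sets stand in for BDDs, and the code mirrors the prose rather than the exact pseudocode of Algorithm 2):

```python
# Sketch of BwdFwd: backward set (basin) first, then a forward fixpoint
# that is aborted early as soon as Fwd(v) escapes Bwd(v).

def _post(E, X):
    return {u for (v, u) in E if v in X}

def _pre(E, X):
    return {v for (v, u) in E if u in X}

def bwdfwd(V, E):
    if not V:
        return []
    Er = {(a, b) for (a, b) in E if a in V and b in V}
    v = next(iter(V))              # Pick: an arbitrary pivot
    B = {v}                        # basin Bwd(v): a backward fixpoint
    while True:
        new = _pre(Er, B) - B
        if not new:
            break
        B |= new
    F, is_bscc = {v}, True         # forward fixpoint, aborted early
    while True:
        new = _post(Er, F) - F
        if not new:
            break
        if new - B:                # Fwd(v) escapes Bwd(v): not a BSCC
            is_bscc = False
            break
        F |= new
    out = [F] if is_bscc else []   # if completed, F = SCC(v) is a BSCC
    return out + bwdfwd(V - B, E)  # basin holds no other BSCC: remove it
```

The early abort is the practical improvement discussed below: the forward fixpoint stops at the first layer that leaves the basin, rather than computing all of \({\text {Fwd}}(v)\).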

It is not hard to see that \(\textsc {BwdFwd}\) runs in O(n) symbolic steps, but offers two practical improvements over general SCC-decomposition algorithms. In each recursive call, the algorithm avoids computing SCCs in \({\text {Bwd}}(v)\setminus {\text {SCC}}(v)\) as they are guaranteed to be non-bottom; nodes in this set are only accessed during the basin computation in Algorithm 1, which is cheaper. Moreover, it stops computing \({\text {SCC}}(v)\) as soon as it discovers that \({\text {Fwd}}(v)\not \subseteq {\text {Bwd}}(v)\) (as \({\text {SCC}}(v)\) is not a BSCC). However, the algorithm can spend significant time in computing \({\text {Fwd}}(v)\) before it discovers that \({\text {Fwd}}(v)\not \subseteq {\text {Bwd}}(v)\), which results in wasteful symbolic operations. The following example illustrates this issue on a small graph.

Example. Fig. 1 shows a graph \(G = (V,E)\) (a) and two recursion trees. The left-most tree (b) illustrates the execution of \(\textsc {BwdFwd}\) on G. Each node in the tree has its variables subscripted by the pivot node v chosen in the corresponding recursive call, with the variables showing their values in that recursive call. E.g., \(F_v\) is the value of F after the loop of Line 5 has completed, given that v was chosen as pivot in that recursive call. A node's number is underlined in \(F_v\) if the node is outside the backward set \(B_v\) and cuts the computation of \(F_v\) short (Line 5). Observe that the algorithm makes four recursive calls, where the second (\(v=2\)) and third (\(v=3\)) call spend considerable time in the forward computation (of the sets \(F_2\) and \(F_3\), respectively), and essentially compute \({\text {SCC}}(2)\) and \({\text {SCC}}(3)\) before determining that these are not BSCCs.

Fig. 1. An example input graph (a) and the recursion trees of the \(\textsc {BwdFwd}\) (b) and \(\textsc {Pendant}\) (c) algorithms on it.

3 The \(\textsc {Pendant}\) Algorithm for BSCCs

In this section we present our new algorithm, \(\textsc {Pendant}\), for computing BSCCs symbolically. Like \(\textsc {BwdFwd}\), \(\textsc {Pendant}\) runs in time linear in the number of nodes of the input graph. In particular, we have the following theorem.

Theorem 1

Given a graph \(G = (V, E)\) of n nodes, \(\textsc {Pendant}\) computes \({\text {BSCCs}}(G)\) in \(O(\sum _{S\in {\text {SCCs}}(G)} \delta (S)) = O(n)\) symbolic time.

However, as we will see in Section 5, in practice \(\textsc {Pendant}\) typically requires fewer symbolic steps than \(\textsc {BwdFwd}\). Intuitively, this is achieved by making, over time, smarter choices of pivot nodes v to start the SCC computation, meaning nodes v that are more likely to have \({\text {SCC}}(v)\) close to the leaves of the quotient graph. In turn, this reduces the number of non-bottom SCCs computed throughout the execution of the algorithm, which reduces the number of symbolic steps.

3.1 \(\textsc {Pendant}\)

\(\textsc {Pendant}\) is shown in Algorithm 4, and uses \(\textsc {FwdLastLayer}\), shown in Algorithm 3, as a sub-procedure.

Algorithm 3: \(\textsc {FwdLastLayer}\)

FwdLastLayer. \(\textsc {FwdLastLayer}\) computes the forward set \({\text {Fwd}}(v)\) of a node v using a standard fixpoint computation. The algorithm also keeps track of the last layer L of nodes discovered during the fixpoint computation, and returns both \({\text {Fwd}}(v)\) (represented in F) and L. Intuitively, \({\text {Fwd}}(v)\) is used by \(\textsc {Pendant}\) for computing \({\text {SCC}}(v)\) and testing whether it is a BSCC, while L is used to guide the selection of future pivots downwards in the quotient graph.
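As an explicit-set sketch (ours, with Python sets standing in for BDDs), the procedure is a plain forward fixpoint that additionally remembers the most recent non-empty layer:

```python
# Sketch of FwdLastLayer: forward fixpoint that also tracks the last
# non-empty BFS layer discovered before the fixpoint is reached.

def _post(E, X):
    return {u for (v, u) in E if v in X}

def fwd_last_layer(E, v):
    F, L = {v}, {v}
    while True:
        new = _post(E, F) - F   # the next layer; one symbolic step
        if not new:
            return F, L         # F = Fwd(v), L = its last layer
        F |= new
        L = new
```

Tracking L costs nothing beyond the fixpoint itself, which is why the traversal of the quotient graph comes "at no extra cost".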

Algorithm 4: \(\textsc {Pendant}\)

Pendant. On input \(G = (V,E)\), \(\textsc {Pendant}\) begins by \(\textsc {Pick}\)’ing an arbitrary pivot node v (Line 2), with the aim of computing \({\text {SCC}}(v)\) and testing whether it is a BSCC. For this purpose, it calls \(\textsc {FwdLastLayer}\) to retrieve \(F={\text {Fwd}}(v)\) together with the last layer L of \({\text {Fwd}}(v)\) (Line 5). It then computes \(S={\text {SCC}}(v)\) by calling \(\textsc {Bwd}\) (Algorithm 1, Line 6) to compute the backward set of v restricted to \({\text {Fwd}}(v)\). At this point, there are two cases.

  • If \(F\setminus S\ne \emptyset \), then S is not a BSCC. At this point, the set \(W=F\setminus S\) is guaranteed to contain a BSCC, and the algorithm resumes its search for a BSCC in this set, running a new iteration of the main loop. Moreover, the algorithm attempts to pick a new pivot in the last layer of \({\text {Fwd}}(v)\) (Line 10), as opposed to an arbitrary node in \(W\). Intuitively, this effectively allows \(\textsc {Pendant}\) to traverse the quotient graph downwards towards its leaves, and thus quickly pick a pivot v such that \({\text {SCC}}(v)\) is a BSCC.

  • If \(F\setminus S=\emptyset \), then \(D={\text {SCC}}(v)\) is guaranteed to be a BSCC; this is reported (Line 15), and the loop breaks (Line 4). Then the backward set \(B={\text {Bwd}}(D)\) is computed and removed from the graph, as it is guaranteed to contain no other BSCC, and the algorithm proceeds recursively on the remaining graph (Line 17). Note that the number of recursive calls of \(\textsc {Pendant}\) thus equals the number of BSCCs in the input graph.
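The two cases can be summarized in the following explicit-set sketch (ours; sets stand in for BDDs, and the pivot policy of Line 10 is reconstructed from the description: prefer a node of the last layer L still in \(W\), falling back to an arbitrary node of \(W\)):

```python
# Sketch of Pendant: forward-first SCC computation, with the next pivot
# drawn from the last layer of the previous forward set.

def _post(E, X):
    return {u for (v, u) in E if v in X}

def _pre(E, X):
    return {v for (v, u) in E if u in X}

def _restrict(E, X):
    return {(a, b) for (a, b) in E if a in X and b in X}

def _fwd_last_layer(E, v):
    F, L = {v}, {v}
    while True:
        new = _post(E, F) - F
        if not new:
            return F, L
        F |= new
        L = new

def _bwd(E, X):
    B = set(X)
    while True:
        new = _pre(E, B) - B
        if not new:
            return B
        B |= new

def pendant(V, E):
    if not V:
        return []
    W = set(V)
    v = next(iter(W))                       # arbitrary initial Pick
    while True:
        F, L = _fwd_last_layer(_restrict(E, W), v)
        S = _bwd(_restrict(E, F), {v})      # SCC(v) = Bwd(v) within Fwd(v)
        if F - S:                           # S is not a BSCC: descend
            W = F - S
            cand = L & W                    # prefer the last layer (Line 10)
            v = next(iter(cand)) if cand else next(iter(W))
        else:                               # D = SCC(v) is a BSCC
            D = S
            break
    B = _bwd(_restrict(E, V), D)            # basin of the BSCC (Line 16)
    return [D] + pendant(V - B, E)
```

As in the paper, each recursive call of `pendant` reports exactly one BSCC before removing its basin.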

Observe the qualitative differences between \(\textsc {Pendant}\) and \(\textsc {BwdFwd}\). First, \(\textsc {BwdFwd}\) begins with a backward search from the pivot v, while \(\textsc {Pendant}\) begins with a forward search from v. Second, \(\textsc {BwdFwd}\) removes the basin \({\text {Bwd}}(v)\) from G as soon as \({\text {SCC}}(v)\) is deemed to be non-bottom, while \(\textsc {Pendant}\) delays this step, and only computes (and removes) the basin of BSCCs. Third, \(\textsc {BwdFwd}\) picks pivots completely arbitrarily, whereas any time \(\textsc {Pendant}\) computes an SCC S that is not bottom, it picks the next pivot from a distant successor of S in the quotient graph, which allows it to discover BSCCs quickly.

Example. Let us revisit our example in Fig. 1. The right-most recursion tree (c) illustrates the computation of \(\textsc {Pendant}\). Since there is only one BSCC, there is only one recursive call, but the node is subdivided to show each iteration of the loop in Line 4. As before, variables are subscripted with the pivot node v of that iteration. Initially, \(\textsc {Pendant}\) arbitrarily chooses \(v=1\), like \(\textsc {BwdFwd}\), and computes \({\text {Fwd}}(1)\). Then, it deems \({\text {SCC}}(1)\) a non-bottom SCC, and the next pivot is chosen from the last layer of \({\text {Fwd}}(1)\), i.e., \(v=10\). Effectively, \(\textsc {Pendant}\) has reached a leaf of the quotient graph (the only leaf, in this case), and thus identifies a BSCC quickly. Importantly, it skips the expensive computation of two SCCs with large diameters \(({\text {SCC}}(2)\) and \({\text {SCC}}(3)\)), in contrast to \(\textsc {BwdFwd}\).

3.2 Correctness

We now turn our attention to the correctness of \(\textsc {Pendant}\). We start with two simple lemmas regarding forward-closed sets.

Lemma 1

Assume that \(X\subseteq V\) is forward-closed, and \(D\subseteq X\) is a BSCC. Then \(X\setminus {\text {Bwd}}(D)\) is forward-closed.

Proof

For any node \(v\in X\), if \({\text {Fwd}}(v)\cap {\text {Bwd}}(D)\ne \emptyset \) then clearly \(v\in {\text {Bwd}}(D)\) and hence \(v\not \in X\setminus {\text {Bwd}}(D)\). Thus, for every node \(v\in X\setminus {\text {Bwd}}(D)\), we have \({\text {Fwd}}(v)\cap {\text {Bwd}}(D)=\emptyset \), and since X is forward-closed, we have \({\text {Fwd}}(v)\subseteq X\).    \(\square \)

Lemma 2

For any node v, the set \({\text {Fwd}}(v)\setminus {\text {SCC}}(v)\) is forward-closed.

Proof

For any node \(u\in {\text {Fwd}}(v)\), if \({\text {Fwd}}(u)\cap {\text {SCC}}(v)\ne \emptyset \), then \(u\in {\text {SCC}}(v)\). Hence for every node \(u\in {\text {Fwd}}(v)\setminus {\text {SCC}}(v)\), we have \({\text {Fwd}}(u)\cap {\text {SCC}}(v)=\emptyset \), and thus \({\text {Fwd}}(u)\subseteq {\text {Fwd}}(v)\setminus {\text {SCC}}(v)\). The desired result follows.    \(\square \)

We now prove the soundness of \(\textsc {Pendant}\), i.e., every SCC outputted in Line 15 is a BSCC. For this, we prove the following stronger lemma, which states three invariants maintained by the algorithm.

Lemma 3

At each iteration of the main loop of \(\textsc {Pendant}\), the following invariants hold: (a) V and \(W\) are forward-closed, (b) S is an SCC, and (c) D is a BSCC.

Proof

Before entering the first iteration of the loop, we have that each of \(W\) and V is the whole node set of the input graph, hence both are trivially forward-closed. Now, assuming that \(W\) is forward-closed, we have that \(F={\text {Fwd}}(v)\) in Line 5. In turn, this implies that \(S={\text {SCC}}(v)\) in Line 6. Moreover, due to Proposition 1, if \(F\subseteq S\) in Line 7, then S is a BSCC, thus D outputted in Line 15 is indeed a BSCC.

To complete the invariant proof, it remains to argue that \(V'\) and \(W'\) remain forward-closed after they have been updated. There are two cases.

  1.

    If the algorithm proceeds with another iteration of the main loop, we have \(V'=V\) and \(W'=F\setminus S\). Since \(F={\text {Fwd}}(v)\) and \(S={\text {SCC}}(v)\), Lemma 2 implies that \(W'\) is forward-closed.

  2.

    Otherwise, the algorithm proceeds with a new recursive call in Line 17. We have that \(W'=V'=V\setminus B\), where \(B={\text {Bwd}}(D)\), and D is a BSCC. By Lemma 1, we have that \(V\setminus B\) is forward-closed, as desired.    \(\square \)

Observe that invariant (c) of Lemma 3 establishes the soundness of \(\textsc {Pendant}\). Next we establish its completeness, thereby concluding the correctness of \(\textsc {Pendant}\).

Lemma 4

\(\textsc {Pendant}\) outputs every BSCC of the input graph.

Proof

First, observe that every time \(\textsc {Pendant}\) calls itself recursively in Line 17, it has outputted a BSCC D, and the recursion proceeds on the subgraph \(V\setminus {\text {Bwd}}(D)\). Due to Proposition 2, the removed set \({\text {Bwd}}(D)\) contains no BSCC other than D, which has just been outputted. Hence, in each recursive call on a graph \(G=(V,E)\), the node set V contains all the BSCCs not already outputted by the algorithm. It thus suffices to argue that, in each recursive call, the main loop eventually terminates, as in doing so it outputs a BSCC.

In each iteration of the main loop, the set \(W\) is updated to \(W'=F\setminus S\) (Line 8), where \(F={\text {Fwd}}(v)\) and \(S={\text {SCC}}(v)\), with v being the current pivot. Since \(F\subseteq W\) and \(\emptyset \ne S\subseteq F\), it follows that \(W'\subsetneq W\), and thus the loop must eventually terminate.    \(\square \)

3.3 Complexity

Although the linear upper bound of \(\textsc {BwdFwd}\) is trivial, the case of \(\textsc {Pendant}\) is more involved. This is because a call to \(\textsc {FwdLastLayer}\) may compute forward sets that consist of many layers (and thus cost many symbolic steps), while these sets are not immediately removed from the graph (as opposed to the backward set computed by \(\textsc {BwdFwd}\)), and are again accessed in future iterations of the algorithm. Nevertheless, a careful analysis shows that the complexity is indeed linear. We start with a simple lemma.

Lemma 5

Assume that \(X\subseteq V\) is forward-closed and \(D\subseteq X\) is a BSCC. Then \({\text {Bwd}}(D)\cap X\) is SCC-closed.

Proof

Consider any node \(v\in {\text {Bwd}}(D)\cap X\). Since X is forward-closed, we have \({\text {Fwd}}(v)\subseteq X\) and thus \({\text {SCC}}(v)\subseteq X\). Moreover, \({\text {SCC}}(v)\subseteq {\text {Bwd}}(v)\subseteq {\text {Bwd}}(D)\). Hence \({\text {SCC}}(v)\subseteq {\text {Bwd}}(D)\cap X\).    \(\square \)

We now prove the complexity of \(\textsc {Pendant}\).

Lemma 6

\(\textsc {Pendant}\) runs in \(O(\sum _{S\in {\text {SCCs}}(G)} \delta (S)) = O(n)\) symbolic steps.

Proof

In each recursive call, \(\textsc {Pendant}\) makes symbolic steps to (i) compute the SCCs of the picked pivots (Lines 5 and 6), and (ii) compute the backward set of the outputted BSCC (Line 16). We will argue that, in total, case (i) takes at most \(\sum _{S \in {\text {SCCs}}(G)} 3\delta (S)\) time, while case (ii) takes at most \(\sum _{S \in {\text {SCCs}}(G)} \delta (S)\) time.

We start with case (i). For a given pivot v, computing \({\text {SCC}}(v)\) is done in two steps: (a) Line 5 computes the forward set F of v restricted to the node set \(W\), while (b) Line 6 computes \({\text {SCC}}(v)\) as the backward set of v restricted to F. Clearly, (b) takes \(\delta ({\text {SCC}}(v))\) symbolic steps, thus summing over all pivots v, we have that Line 6 takes at most \(\sum _{S \in {\text {SCCs}}(G)} \delta (S)\) time. To bound the time spent in (a), denote by \({\text {Levels}}(v)\) the number of iterations executed in \(\textsc {FwdLastLayer}\), i.e., \(\textsc {Pendant}\) spends \({\text {Levels}}(v)\) symbolic steps in Line 5. If \(F\setminus {\text {SCC}}(v)=\emptyset \) or \(L\setminus {\text {SCC}}(v)=\emptyset \), we have \({\text {Levels}}(v) = \delta ({\text {SCC}}(v))\). Otherwise, the next pivot \(v'\) is \(\textsc {Pick}\)’ed from L (Line 10). Consider a shortest path \(P:v\leadsto v'\), and let \(\{S_1,\dots , S_k\}\) be the SCCs of the nodes along P (except v), and note that \({\text {Levels}}(v)\le \sum _{i = 1}^{k} \delta (S_i)\). Moreover, we have \(S_i\subseteq {\text {Bwd}}(v')\) for each \(i\in \{1,\dots , k\}\), and thus each \(S_i\) is not touched again by \(\textsc {FwdLastLayer}\), except if \(S_i={\text {SCC}}(v')\), but this case is accounted for already. Summing over all such \(S_i\) across all pivots v, we have that \(\sum _v{{\text {Levels}}(v)}\le \sum _{S\in {\text {SCCs}}(G)} \delta (S)\). Hence the total symbolic time spent for case (i) is bounded by \(\sum _{S \in {\text {SCCs}}(G)} 3\delta (S)\).

We now turn our attention to case (ii). Due to Lemma 3, \(W\) is forward-closed and D is a BSCC. By Lemma 5, the set B computed in Line 16 is SCC-closed. The number of symbolic steps is hence bounded by \(\sum _{S\in {\text {SCCs}}(B)} \delta (S)\). Finally, B is removed from the graph in the recursive call, hence it will not be processed again. Thus the total time for case (ii) is \(\sum _{S \in {\text {SCCs}}(G)} \delta (S)\).    \(\square \)

4 Deadlock Detection

We now outline a simple but effective preprocessing technique for BSCCs.

Recall that a deadlock is a node v without outgoing edges, i.e., \({\text {Post}}(v)=\emptyset \). Observe that all deadlocks are BSCCs: formally we have \({\text {Fwd}}(\{v\})=\{v\}={\text {SCC}}(v)\), and thus the statement follows from Proposition 1 (the converse is, of course, not true in general). Thus, deadlock detection can be seen as a natural preprocessing step to any BSCC algorithm.

The key observation in this approach is that the set of all deadlocks can be computed efficiently, in only one symbolic step; this is achieved by Algorithm 5. In particular, the deadlock set is computed as \(D=V \setminus H\) where H is the set of nodes u that have a successor. In turn, H can be computed by a single \({\text {Pre}}\) operation on the entire node set. Finally, due to Proposition 2, the set \({\text {Bwd}}(D)\) is guaranteed to contain no BSCCs other than those in D, and thus it can be removed. The resulting graph is then passed to the main BSCC algorithm.
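The whole preprocessing step can be sketched with explicit sets as follows (ours, for illustration; sets stand in for BDDs, and the single \({\text {Pre}}\) operation on V is the one symbolic step mentioned above):

```python
# Sketch of deadlock detection: H = Pre(V) is the set of nodes with at
# least one successor (one symbolic step), so D = V \ H is exactly the
# set of deadlocks; their basin Bwd(D) is then removed (Proposition 2).

def _pre(E, X):
    return {v for (v, u) in E if u in X}

def deadlock_preprocess(V, E):
    H = _pre(E, V)            # nodes that have a successor
    D = V - H                 # the deadlocks: singleton BSCCs
    B = set(D)                # Bwd(D), via a backward fixpoint
    while True:
        new = _pre(E, B) - B
        if not new:
            break
        B |= new
    return D, V - B           # report D; continue on the reduced graph
```

The reduced node set is then handed to the main BSCC algorithm (\(\textsc {Pendant}\) or \(\textsc {BwdFwd}\)).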

Algorithm 5: Deadlock detection (preprocessing)

5 Experiments

Here we report on an implementation of \(\textsc {Pendant}\), including the deadlock-detection technique, and an experimental evaluation of its performance on a large dataset of standard model-checking benchmarks across various domains.

Baselines. To assess the performance of \(\textsc {Pendant}\) and deadlock detection, we compare them with \(\textsc {BwdFwd}\) (Algorithm 2), as well as the recently introduced interleaved transition guided reduction (ITGR) [6], which we have implemented in our setting. ITGR is applicable when the transition relation is partitioned into a number of smaller relations \(E = (R_1, \dots , R_k)\) (as is the case in our setup), and works as a preprocessing step, much like our deadlock detection. At a high level, ITGR employs some local reasoning for each relation \(R_i\) to identify sets of nodes that do not contain BSCCs. Such sets can be removed, reducing the size of the graph that is further processed by a BSCC-computation algorithm.

Research Questions. Our setup is centered around the following questions.

  • RQ1 How does the performance of \(\textsc {Pendant}\) compare to that of \(\textsc {BwdFwd}\)?

  • RQ2 How does deadlock detection impact the performance of \(\textsc {Pendant}\) and \(\textsc {BwdFwd}\)?

  • RQ3 How does ITGR impact the performance of \(\textsc {Pendant}\) and \(\textsc {BwdFwd}\)?

  • RQ4 How does the performance of \(\textsc {Pendant}\) compare to the performance of \(\textsc {BwdFwd}\) when both use deadlock detection?

  • RQ5 How does ITGR impact the performance of \(\textsc {Pendant}\) after deadlock detection?

Datasets. We use benchmarks from the following categories.

  • Petri Net models from MCC, the Model Checking Contest [22].

  • DiVinE models from BEEM, the Benchmark of Explicit Models [27].

  • Asynchronous Boolean Network models [1,2,3].

We do not apply any selection criteria, except discarding models that none of the algorithms could handle within the time limit. This results in 553 models in total. In each model, the edge relation is naturally partitioned into subrelations \(R_1,\dots , R_k\), following the structure of the high-level specifications (transitions in Petri Nets and DiVinE state machines, and reactions in the Boolean Networks). We use the language-independent model checker LTSmin [20] to generate symbolic graphs for the DVE and PNML models. Since LTSmin does not handle Boolean Networks, these graphs are generated by a custom parser. The time taken for the graph generation is not measured in the running time of each algorithm. We use the BDD package Sylvan [15] for our symbolic representation.

Experimental setup. Our experiments are run on a Linux machine with 2.4GHz CPU speed and 60GB of memory. We measure both symbolic steps and running time, but only present the results on symbolic steps here, as they reflect the true symbolic time-complexity of the algorithms, and are independent of the choice of the underlying BDD package. The results on running time are qualitatively the same. Each run is timed out after 400 seconds, indicated as \(10^9\) symbolic steps in the figures. Since our input relation is partitioned into several sub-relations \(E=(R_1,\dots , R_k)\), each \({\text {Pre}}\)/\({\text {Post}}\) operation incurs k symbolic steps (for all algorithms). Our setup is completely deterministic; however, certain operations, like \(\textsc {Pick}\)’ing a node, are resolved arbitrarily.
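The step-counting convention for partitioned relations can be sketched as follows (our own illustration with explicit sets; the function name and counter are ours, not part of any tool):

```python
# Sketch of step counting with a partitioned edge relation: a single
# Post operation applies each sub-relation R_i once, and therefore
# incurs k symbolic steps (the convention used for all algorithms).

def post_partitioned(rels, X, steps):
    out = set()
    for R in rels:            # one symbolic step per sub-relation
        out |= {u for (v, u) in R if v in X}
        steps[0] += 1
    return out
```
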

Experimental results. We now present our experimental results for addressing the above research questions. Note that all figures are plotted in log-scale.

Fig. 2. The number of symbolic steps executed by \(\textsc {Pendant}\) and \(\textsc {BwdFwd}\).

RQ1: \(\textsc {Pendant}\) vs \(\textsc {BwdFwd}\). The performance of \(\textsc {Pendant}\) and \(\textsc {BwdFwd}\) is shown in Fig. 2, across all three datasets. Both algorithms manage to handle many models within the time limit, though there are a few timeouts. We see that \(\textsc {Pendant}\) is generally no slower than \(\textsc {BwdFwd}\), with the clear exception of three timeout outliers. For the rest, the models that are slower for \(\textsc {Pendant}\) sit only slightly above the \(x=y\) line, meaning that the slowdown is comparatively small. On the other hand, there are several models on which \(\textsc {Pendant}\) is faster than \(\textsc {BwdFwd}\), and the speedup increases as we move towards more demanding benchmarks (exceeding two orders of magnitude). Finally, \(\textsc {Pendant}\) times out on far fewer models than \(\textsc {BwdFwd}\). Overall, \(\textsc {Pendant}\) is measurably faster than \(\textsc {BwdFwd}\), and this trend persists across all three datasets (DVE, PNML and Boolean Networks).

Fig. 3. The impact of deadlock detection on the number of symbolic steps executed by \(\textsc {Pendant}\) (left) and \(\textsc {BwdFwd}\) (right). Data points are classified as those having at least one deadlock, and those having no deadlock.

Fig. 4. The impact of ITGR on the number of symbolic steps executed by \(\textsc {Pendant}\) (left) and \(\textsc {BwdFwd}\) (right).

RQ2: The impact of deadlock detection. The impact of deadlock detection on both \(\textsc {Pendant}\) and \(\textsc {BwdFwd}\) is shown in Fig. 3. We see that deadlock detection improves the performance of both algorithms significantly. Indeed, detecting deadlocks requires only one symbolic step (per relation \(R_i\)), hence it is natural to expect that it neither slows down either algorithm nor affects models that have no deadlocks. On the other hand, it leads to a measurable speedup on the models that do have deadlocks, with the impact varying with the fraction of the graph that is removed during deadlock removal. Interestingly, deadlock detection also significantly reduces the number of timeouts for both \(\textsc {Pendant}\) and \(\textsc {BwdFwd}\). In conclusion, deadlock detection helps both algorithms.
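A sketch of the underlying idea (hypothetical names, with Python sets in place of BDDs): a deadlock is a node with no successors, so the deadlock set is \(D = V \setminus {\text {Pre}}(V)\), computable with one symbolic step per sub-relation; and since any node that can reach a deadlock cannot lie in any other BSCC, the backward-reachable basin of D can then be removed before running the general algorithm:

```python
def deadlocks(V, relations):
    """Singleton BSCCs: nodes of V with no outgoing edge, i.e. V - Pre(V).
    Costs one symbolic step per sub-relation R_i."""
    pre_V = set()
    for R in relations:                  # one step per R_i
        pre_V |= {u for (u, v) in R if v in V}
    return V - pre_V

def basin(D, relations):
    """Backward-reachable set of D (including D itself). A node that can
    reach a deadlock belongs to no other BSCC, so this whole set can be
    removed after reporting the deadlocks."""
    reach, frontier = set(D), set(D)
    while frontier:
        preds = set()
        for R in relations:              # one step per R_i per iteration
            preds |= {u for (u, v) in R if v in frontier}
        frontier = preds - reach
        reach |= frontier
    return reach
```

This is only a set-based illustration of the preprocessing step; the actual implementation operates on BDDs throughout.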

RQ3: The impact of ITGR. The impact of ITGR on both \(\textsc {Pendant}\) and \(\textsc {BwdFwd}\) is shown in Fig. 4. Perhaps surprisingly, we find that ITGR does not have a consistent effect: it can both speed up and slow down each of the algorithms. On closer inspection, we observe that ITGR has a positive effect on most Boolean Network models, which is indeed the context in which it was introduced [6]. On the other hand, it has both positive and negative effects on DVE and PNML models, and even makes both algorithms time out on instances that they handle easily without ITGR.

Fig. 5. The number of symbolic steps executed by \(\textsc {Pendant}\) and \(\textsc {BwdFwd}\), when also using deadlock detection.

RQ4: \(\textsc {Pendant}\) vs \(\textsc {BwdFwd}\), with deadlock detection. Since deadlock detection has a clear positive effect on both algorithms, it is natural to revisit RQ1 and ask how the two algorithms compare when both use deadlock detection. The result is shown in Fig. 5. Deadlock detection makes the performance of the two algorithms more similar on many benchmarks (i.e., more data points lie close to the \(x=y\) line). However, \(\textsc {Pendant}\) remains decisively faster on many models, and thus its benefit is not overshadowed by the positive impact of deadlock detection. On closer inspection, we see that \(\textsc {Pendant}\) is faster on DVE and PNML models, but not on Boolean Networks. This is because most Boolean Networks have many deadlocks, and thus the common deadlock-detection component simplifies such models considerably, making the remaining performance of the two algorithms similar.

Fig. 6. The impact of ITGR after using deadlock detection.

RQ5: The impact of ITGR after deadlock detection. Finally, in Fig. 6 we examine whether ITGR improves the performance of \(\textsc {Pendant}\) after deadlock detection has run. Although ITGR improves the performance on a few models, it generally leads to a slowdown, as well as to more timeouts. Interestingly, ITGR has the fewest positive effects (on top of deadlock detection) on Boolean Network models, for which it was originally introduced. Since these models have several deadlocks, the fast deadlock-detection preprocessing simplifies them considerably, at which point the cost of ITGR outweighs its small (or nonexistent) benefit.

6 Conclusion

We have introduced \(\textsc {Pendant}\), a new symbolic algorithm for computing BSCCs, as well as a deadlock-detection technique for this task. Though both \(\textsc {Pendant}\) and the standard \(\textsc {BwdFwd}\) have O(n) symbolic-time complexity, our experimental results show that \(\textsc {Pendant}\) is typically faster, and is thus to be preferred for this task. Moreover, deadlock detection is an efficient and effective preprocessing technique for reporting singleton BSCCs (and removing their basin) before handing the computation to the general algorithm. Finally, the recently introduced ITGR, although effective on Boolean Network models, has mixed effects on DVE and PNML models, and its effect is often (though not always) negative after deadlock detection. Opportunities for future research include introducing saturation techniques [34] to \(\textsc {Pendant}\), extending the algorithm to symbolically handle colored graphs [7, 25], and better understanding the settings in which ITGR is effective.