This section formally describes the approach presented in this paper.
3.1 Programs, Actions, and Runs
Let \(P \mathrel {:=}\langle T, \mathcal {L}, \mathcal {C}\rangle \) represent a (possibly non-terminating) multi-threaded POSIX C program, where \(T\) is the set of statements, \(\mathcal {L}\) is the set of POSIX mutexes used in the program, and \(\mathcal {C}\) is the set of condition variables. This is a deliberately simplified presentation of our program syntax; see [42] for full details. We represent the behavior of each statement in P by an action, i.e., a pair \(\langle i, b\rangle \) in \(A \subseteq \mathbb {N}\times B\), where \(i \ge 1\) identifies the thread executing the statement and b is the effect of the statement. We consider the following effects:
$$\begin{aligned} B \mathrel {:=}&~(\{\textsf {loc}\} \times T) \cup (\{\textsf {acq},\textsf {rel}\} \times \mathcal {L}) \cup (\{\textsf {sig}\} \times \mathcal {C}\times \mathbb {N}) \\ \cup &~(\{\textsf {bro}\} \times \mathcal {C}\times 2^\mathbb {N}) \cup (\{\textsf {w}_1,\textsf {w}_2\} \times \mathcal {C}\times \mathcal {L}) \end{aligned}$$
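For concreteness, the following sketch shows one way this action alphabet could be encoded. It is our own illustration in C++17, not the formalization of [42], and all type names are hypothetical.

```cpp
// Hypothetical C++17 encoding of the action set A ⊆ N × B defined above.
#include <cstdint>
#include <set>
#include <variant>

using Tid   = std::uint32_t;  // thread identifier i >= 1
using Stmt  = std::uint32_t;  // index into the statement set T
using Mutex = std::uint32_t;  // index into the mutex set L
using Cond  = std::uint32_t;  // index into the condition-variable set C

struct Loc   { Stmt t; };                  // <loc, t>: a local statement
struct Acq   { Mutex l; };                 // <acq, l>: lock mutex l
struct Rel   { Mutex l; };                 // <rel, l>: unlock mutex l
struct Sig   { Cond c; Tid j; };           // <sig, c, j>; j == 0 encodes a lost signal
struct Bro   { Cond c; std::set<Tid> w; }; // <bro, c, W>; W == {} encodes a lost broadcast
struct Wait1 { Cond c; Mutex l; };         // <w1, c, l>: released l, now waiting on c
struct Wait2 { Cond c; Mutex l; };         // <w2, c, l>: woken up, re-acquired l

using Effect = std::variant<Loc, Acq, Rel, Sig, Bro, Wait1, Wait2>;

struct Action { Tid i; Effect b; };        // an element <i, b> of A
```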
Below we informally explain the intent of each effect and how actions of different effects interleave with each other; in [42] we use actions and effects to define a labeled transition system semantics for P. We also (informally) define an independence relation (see Sect. 2.2) between actions.
Local Actions. An action \(\langle i, \langle \textsf {loc}, t\rangle \rangle \) represents the execution of a local statement \(t\) from thread \(i\), i.e., a statement which manipulates local variables. For instance, the actions labeling events 1 and 3 in Fig. 2b are local actions. Note that local actions do not interfere with actions of other threads. Consequently, they are only dependent on actions of the same thread.
Mutex Lock/Unlock. Actions \(\langle i, \langle \textsf {acq}, l\rangle \rangle \) and \(\langle i, \langle \textsf {rel}, l\rangle \rangle \) respectively represent that thread \(i\) locks or unlocks mutex \( l \in \mathcal {L}\). The semantics of these actions correspond to the so-called NORMAL mutexes in the POSIX standard [4]. Actions of \(\langle \textsf {acq}, l\rangle \) or \(\langle \textsf {rel}, l\rangle \) effect are only dependent on actions whose effect is an operation on the same mutex \(l\) (\(\textsf {acq}\), \(\textsf {rel}\), \(\textsf {w}_1\) or \(\textsf {w}_2\), see below). For instance, the action of event 4 (\( \textsf {rel}\)) in Fig. 2b depends on the action of event 6 (\( \textsf {acq}\)).
Wait on Condition Variables. The occurrence of a pthread_cond_wait(c, l) statement is represented by two separate actions of effect \(\langle \textsf {w}_1, c, l\rangle \) and \(\langle \textsf {w}_2, c, l\rangle \). An action \(\langle i, \langle \textsf {w}_1, c, l\rangle \rangle \) represents that thread i has atomically released the lock l and started waiting on condition variable c. An action \(\langle i, \langle \textsf {w}_2, c, l\rangle \rangle \) indicates that thread i has been woken up by a signal or broadcast operation on c and that it successfully re-acquired mutex l. For instance, the action \(\langle 1, \langle \textsf {w}_1, c, m\rangle \rangle \) of event 10 in Fig. 2c represents that thread 1 has released mutex m and is waiting for c to be signaled. After the signal happens (event 12) the action \(\langle 1, \langle \textsf {w}_2, c, m\rangle \rangle \) of event 14 represents that thread 1 wakes up and re-acquires mutex m. An action \(\langle i, \langle \textsf {w}_1, c, l\rangle \rangle \) is dependent on any action whose effect operates on mutex l (\(\textsf {acq}\), \(\textsf {rel}\), \(\textsf {w}_1\) or \(\textsf {w}_2\)) as well as signals directed to thread i (\(\langle \textsf {sig}, c, i\rangle \), see below), lost signals (\(\langle \textsf {sig}, c, 0\rangle \), see below), and any broadcast (\(\langle \textsf {bro}, c, W\rangle \) for any \(W \subseteq \mathbb {N}\), see below). Similarly, an action \(\langle i, \langle \textsf {w}_2, c,l\rangle \rangle \) is dependent on any action whose effect operates on lock l as well as signals and broadcasts directed to thread i (that is, either \(\langle \textsf {sig}, c, i\rangle \) or \(\langle \textsf {bro}, c, W\rangle \) when \(i \in W\)).
Signal/Broadcast on Condition Variables. An action \(\langle i, \langle \textsf {sig}, c, j\rangle \rangle \), with \(j \ge 0\), indicates that thread i executed a pthread_cond_signal(c) statement. If \(j = 0\) then no thread was waiting on condition variable c, and the signal had no effect, as per the POSIX semantics. We refer to these as lost signals. For example, events 7 and 17 in Fig. 2b and 2d are labeled by lost signals: in both cases thread 1 was not waiting on the condition variable when the signal happened. When \(j \ge 1\), however, the action represents that thread j is woken up by this signal. Whenever a signal wakes up a thread \(j \ge 1\), we can always find a unique \(\textsf {w}_1\) action of thread j that happened before the signal and a unique \(\textsf {w}_2\) action in thread j that happens after the signal. For instance, event 12 in Fig. 2c signals thread 1, which went to sleep at the \(\textsf {w}_1\) event 10 and wakes up at the \(\textsf {w}_2\) event 14. Similarly, an action \(\langle i, \langle \textsf {bro}, c, W\rangle \rangle \), with \(W \subseteq \mathbb {N}\), indicates that thread i executed a pthread_cond_broadcast(c) statement and every thread \(j \in W\) was woken up. If \(W = \emptyset \), then no thread was waiting on condition variable c (lost broadcast). Lost signals and broadcasts on c depend on any action of \(\langle \textsf {w}_1, c, \cdot \rangle \) effect as well as any non-lost signal/broadcast on c. Non-lost signals and broadcasts on c that wake up thread j depend on \(\textsf {w}_1\) and \(\textsf {w}_2\) actions of thread j as well as any signal/broadcast (lost or not) on the same condition variable.
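The dependence rules above can be summarized operationally. The sketch below is a hypothetical rendering of our reading of these rules, reusing the Action/Effect types from the previous sketch; the authoritative definition is in [42].

```cpp
// Hypothetical check of the dependence rules of Sect. 3.1 (two actions are
// independent iff dependent() returns false). Builds on the previous sketch.
#include <optional>

// The mutex an effect operates on, if any (acq, rel, w1 and w2 all touch one).
static std::optional<Mutex> mutexOf(const Effect& b) {
  if (auto* x = std::get_if<Acq>(&b))   return x->l;
  if (auto* x = std::get_if<Rel>(&b))   return x->l;
  if (auto* x = std::get_if<Wait1>(&b)) return x->l;
  if (auto* x = std::get_if<Wait2>(&b)) return x->l;
  return std::nullopt;
}

// A signal/broadcast viewed uniformly: its condition variable, whether it is
// lost, whether it is a broadcast, and the set of threads it wakes up.
struct Notify { Cond c; bool lost; bool isBro; std::set<Tid> woken; };
static std::optional<Notify> notifyOf(const Effect& b) {
  if (auto* s = std::get_if<Sig>(&b))
    return Notify{s->c, s->j == 0, false,
                  s->j ? std::set<Tid>{s->j} : std::set<Tid>{}};
  if (auto* x = std::get_if<Bro>(&b))
    return Notify{x->c, x->w.empty(), true, x->w};
  return std::nullopt;
}

// w1/w2 of thread a.i against a signal/broadcast b on the same condition var.
static bool waitVsNotify(const Action& a, const Action& b) {
  auto n = notifyOf(b.b);
  if (!n) return false;
  if (auto* w = std::get_if<Wait1>(&a.b))  // w1: sig to a.i, lost sig, any bro
    return n->c == w->c && (n->lost || n->isBro || n->woken.count(a.i) > 0);
  if (auto* w = std::get_if<Wait2>(&a.b))  // w2: only wake-ups aimed at a.i
    return n->c == w->c && n->woken.count(a.i) > 0;
  return false;
}

bool dependent(const Action& a, const Action& b) {
  if (a.i == b.i) return true;                 // same thread: program order
  auto la = mutexOf(a.b), lb = mutexOf(b.b);
  if (la && lb && *la == *lb) return true;     // operations on the same mutex
  auto na = notifyOf(a.b), nb = notifyOf(b.b);
  if (na && nb && na->c == nb->c && !(na->lost && nb->lost))
    return true;                               // sig/bro pairs; two lost ones commute
  return waitVsNotify(a, b) || waitVsNotify(b, a);
}
```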
A run of P is a sequence of actions in \(A^*\) which respects the constraints stated above for actions. For instance, any sequence of actions that labels a topological order of the events shown in one of the partial orders in Fig. 2b to 2e is a run of the program shown in Fig. 2a; we write \(\sigma '\) for one such run. Naturally, if \(\sigma \in A^*\) is a run, any prefix of \(\sigma \) is also a run. Runs explicitly represent concurrency, using thread identifiers, and symbolically represent data non-determinism, using constraints, as illustrated by input-reading local statements such as a=in(). We let \(\mathop { runs } (P)\) denote the set of all runs of P.
A concrete state of P is a tuple that represents, intuitively, the program counters of each thread, the values of all memory locations, the mutexes locked by each thread, and, for each condition variable, the set of threads waiting for it (see [42] for a formal definition). Since runs represent operations on symbolic data, they reach a symbolic state, which conceptually corresponds to a set of concrete states of P.
The state of a run \(\sigma \), written \(\mathop { state } (\sigma )\), is the set of all concrete states of P that are reachable when the program executes the run \(\sigma \). For instance, the run \(\sigma '\) given above reaches a state consisting of all concrete states where y is 1, x is a non-negative number, thread 2 owns mutex m and its instruction pointer is at line 3, and thread 1 has finished. We let \(\mathop { reach } (P) \mathrel {:=}\bigcup _{\sigma \in \mathop { runs } (P)} \mathop { state } (\sigma )\) denote the set of all reachable states of P.
3.2 Independence
In the previous section, given an action \(a \in A\) we informally defined the set of actions which are dependent on a, therefore indirectly defining an independence relation. We now show that this relation is a valid independence [19, 41]. Intuitively, an independence relation is valid when every pair of actions it declares as independent can be executed in any order while still producing the same state.
Our independence relation is valid only for data-race-free programs. We say that P is data-race-free iff any two local actions \(a \mathrel {:=}\langle i, \langle \textsf {loc}, t\rangle \rangle \) and \(a' \mathrel {:=}\langle i', \langle \textsf {loc}, t'\rangle \rangle \) from different threads (\(i \ne i'\)) commute at every reachable state of P; see [42] for additional details. This ensures that local statements of different threads of P modify the memory without interfering with each other.
Theorem 1
If P is data-race-free, then the independence relation defined in Sect. 3.1 is valid.
Proof
See [42].
Our technique does not use data races as a source of thread interference for partial-order reduction: it will not explore both execution orders of two statements that exhibit a data race. However, it can be used to detect and report data races found during the POR exploration, as we will see in Sect. 4.4.
3.3 Partial-Order Runs
A labeled partial-order (LPO) is a tuple \(\langle X, {<}, h\rangle \) where \(X\) is a set of events, \({<} \subseteq X \times X\) is a causality (a.k.a., happens-before) relation, and \(h :X \rightarrow A\) labels each event by an action in \(A\).
A partial-order run of P is an LPO that represents a run of P without enforcing an order of execution on actions that are independent. All partial-order runs of Fig. 2a are shown in Fig. 2b to 2e.
Given a run \(\sigma \) of P, we obtain the corresponding partial-order run \(\mathcal {E}_\sigma \mathrel {:=}\langle E, {<}, h\rangle \) by the following procedure: (1) initialize \(\mathcal {E}_\sigma \) to be the only totally-ordered LPO that consists of \(|\sigma |\) events where the i-th event is labeled by the i-th action of \(\sigma \); (2) for every two events \(e, e'\) such that \(e < e'\), remove the pair \(\langle e,e'\rangle \) from < if h(e) is independent from \(h(e')\); (3) restore transitivity in < (i.e., if \(e < e'\) and \(e' < e''\), then add \(\langle e, e''\rangle \) to <). The resulting LPO is a partial-order run of P.
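A direct implementation of this three-step procedure could look as follows. This is a hedged sketch (the adjacency-matrix representation and the function names are our own), not the data structure used by the actual tool.

```cpp
// Sketch: build the partial-order run E_sigma of a run sigma, following
// steps (1)-(3) above. dependent() is the complement of the independence
// relation of Sect. 3.2 (e.g., the sketch given there).
#include <cstddef>
#include <functional>
#include <vector>

struct LPO {
  std::size_t n;                          // events 0..n-1; event i fires the i-th action
  std::vector<std::vector<bool>> before;  // before[i][j] iff event i < event j
};

template <class Act>
LPO lpoOfRun(const std::vector<Act>& sigma,
             const std::function<bool(const Act&, const Act&)>& dependent) {
  std::size_t n = sigma.size();
  LPO po{n, std::vector<std::vector<bool>>(n, std::vector<bool>(n, false))};
  // (1) start from the total order of sigma and (2) drop independent pairs:
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = i + 1; j < n; ++j)
      po.before[i][j] = dependent(sigma[i], sigma[j]);
  // (3) restore transitivity (a Floyd-Warshall-style closure):
  for (std::size_t k = 0; k < n; ++k)
    for (std::size_t i = 0; i < n; ++i)
      if (po.before[i][k])
        for (std::size_t j = 0; j < n; ++j)
          if (po.before[k][j]) po.before[i][j] = true;
  return po;
}
```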
Furthermore, the originating run \(\sigma \) is an interleaving of \(\mathcal {E}_\sigma \). Given some LPO \(\mathcal {E}\mathrel {:=}\langle E, {<}, h\rangle \), an interleaving of \(\mathcal {E}\) is the sequence that labels any topological ordering of \(\mathcal {E}\). Formally, it is any sequence \(h(e_1), \ldots , h(e_n)\) such that \(E = {\{ e_1, \ldots , e_n \mathclose \}}\) and \(e_i< e_j \implies i < j\). We let \(\mathop { inter } (\mathcal {E})\) denote the set of all interleavings of \(\mathcal {E}\). Given a partial-order run \(\mathcal {E}\) of P, the interleavings \(\mathop { inter } (\mathcal {E})\) have two important properties: every interleaving in \(\mathop { inter } (\mathcal {E})\) is a run of P, and any two interleavings \(\sigma , \sigma ' \in \mathop { inter } (\mathcal {E})\) reach the same state \(\mathop { state } (\sigma )= \mathop { state } (\sigma ')\).
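Checking that a given ordering is an interleaving then amounts to checking that it is a topological order. The sketch below reuses the LPO struct from the previous sketch and is, again, only an illustration.

```cpp
// Sketch: perm is a permutation of {0..n-1}; h(perm[0]), ..., h(perm[n-1])
// is an interleaving iff perm respects the causality relation <.
#include <cstddef>
#include <vector>

bool isInterleaving(const LPO& po, const std::vector<std::size_t>& perm) {
  std::vector<std::size_t> pos(po.n);  // pos[e] = index of event e in perm
  for (std::size_t k = 0; k < po.n; ++k) pos[perm[k]] = k;
  for (std::size_t i = 0; i < po.n; ++i)
    for (std::size_t j = 0; j < po.n; ++j)
      if (po.before[i][j] && pos[i] >= pos[j])
        return false;                  // e_i < e_j must imply i comes first
  return true;
}
```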
3.4 Prime Event Structures
We use unfoldings to give semantics to multi-threaded programs. Unfoldings are Prime Event Structures [37], tree-like representations of system behavior that use partial orders to represent concurrent interaction.
Figure 3a depicts an unfolding of the program in Fig. 2a. The nodes are events and solid arrows represent causal dependencies: events 1 and 4 must fire before 8 can fire. The dotted line represents conflicts: 2 and 5 are not in conflict and may occur in any order, but 2 and 16 are in conflict and cannot occur in the same (partial-order) run.
Formally, a Prime Event Structure (PES) [37] is a tuple \(\mathcal {E}\mathrel {:=}\langle E, {<}, {\mathrel {\#}}, h\rangle \) with a set of events E, a causality relation \({<} \subseteq E \times E\), which is a strict partial order, a conflict relation \({\mathrel {\#}} \subseteq E \times E\) that is symmetric and irreflexive, and a labeling function \(h :E \rightarrow A\).
The causes of an event \(\left\lceil e \right\rceil \mathrel {:=}{\{ e' \in E :e' < e \mathclose \}}\) are the least set of events that must fire before e can fire. A configuration of \(\mathcal {E}\) is a finite set \(C \subseteq E\) that is causally closed (\(\left\lceil e \right\rceil \subseteq C\) for all \(e \in C\)), and conflict-free (\(\lnot (e \mathrel {\#}e')\) for all \(e, e' \in C\)). We let \(\mathop { conf } (\mathcal {E})\) denote the set of all configurations of \(\mathcal {E}\). For any \(e \in E\), the local configuration of e is defined as \([e] \mathrel {:=}\left\lceil e \right\rceil \cup {\{ e \mathclose \}} \). In Fig. 3a, the set \({\{ 1,2 \mathclose \}}\) is a configuration, and in fact it is a local configuration, i.e., \([2] = {\{ 1,2 \mathclose \}}\). The local configuration of event 6 is \({\{ 1,2,3,4,5,6 \mathclose \}}\). Set \({\{ 2,5,16 \mathclose \}}\) is not a configuration, because it is neither causally closed (1 is missing) nor conflict-free (\(2 \mathrel {\#}16\)).
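The two defining conditions of a configuration translate directly into a membership test. The sketch below is our illustration, with the causality and conflict relations passed in as callbacks (assumed names).

```cpp
// Sketch: is the finite event set C a configuration, i.e., causally closed
// and conflict-free? causes(e) must return ⌈e⌉ and conflict must decide #.
#include <functional>
#include <set>

template <class Event>
bool isConfiguration(const std::set<Event>& C,
                     const std::function<std::set<Event>(const Event&)>& causes,
                     const std::function<bool(const Event&, const Event&)>& conflict) {
  for (const Event& e : C) {
    for (const Event& c : causes(e))
      if (!C.count(c)) return false;     // not causally closed: ⌈e⌉ ⊄ C
    for (const Event& f : C)
      if (conflict(e, f)) return false;  // not conflict-free
  }
  return true;
}
```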
3.5 Unfolding Semantics for Programs
Given a program P, in this section we define a PES \(\mathcal {U}_{P}\) such that every configuration of \(\mathcal {U}_{P}\) is a partial-order run of P.
Let \(\mathcal {E}_1 \mathrel {:=}\langle E_1, {<}_1, h_1\rangle , \ldots , \mathcal {E}_n \mathrel {:=}\langle E_n, {<}_n, h_n\rangle \) be the collection of all the partial-order runs of P. The events of \(\mathcal {U}_{P}\) are the equivalence classes of the structural equality relation that we intuitively described in Sect. 2.3.
Two events are structurally equal iff their canonical name is the same. Given some event \(e \in E_i\) in some partial-order run \(\mathcal {E}_i\), the canonical name \(\mathop { cn } (e)\) of e is the pair \(\langle a, H\rangle \) where \(a \mathrel {:=}h_i(e)\) is the executed action and \(H \mathrel {:=}{\{ \mathop { cn } (e') :e' <_i e \mathclose \}}\) is the set of canonical names of those events that causally precede e in \(\mathcal {E}_i\). Intuitively, canonical names indicate that action h(e) runs after the (transitively canonicalized) partially-ordered history preceding e. For instance, in Fig. 3a for events 1 and 6 we have \(\mathop { cn } (1) = \langle \langle 1, \langle \textsf {loc}, {\texttt {a=in()}}\rangle \rangle , \emptyset \rangle \), and \(\mathop { cn } (6) = \langle \langle 2, \langle \textsf {acq}, m\rangle \rangle , {\{ \mathop { cn } (1), \mathop { cn } (2), \mathop { cn } (3), \mathop { cn } (4), \mathop { cn } (5) \mathclose \}}\rangle \). Actually, the number within every event in Fig. 2b to 2e identifies (is in bijective correspondence with) its canonical name. Event 19 in Fig. 2d is the same event as event 19 in Fig. 2e because it fires the same action (\(\langle 1, \langle \textsf {acq}, m\rangle \rangle \)) after the same causal history (\({\{ 1,5,16,17,18 \mathclose \}}\)). Event 2 in Fig. 2c and 19 in Fig. 2d are not the same event because while \(h(2) = h(19) = \langle 1, \langle \textsf {acq}, m\rangle \rangle \) they have a different causal history (\({\{ 1 \mathclose \}}\) vs. \({\{ 1,5,16,17,18 \mathclose \}}\)). Obviously events 4 and 6 in Fig. 2b are different because \(h(4) \ne h(6)\). We can now define the unfolding of P as the only PES \(\mathcal {U}_{P} \mathrel {:=}\langle E, {<}, \mathrel {\#}, h\rangle \) such that
- \(E \mathrel {:=}{\{ \mathop { cn } (e) :e \in E_1 \cup \ldots \cup E_n \mathclose \}}\) is the set of canonical names of all events;
- Relation \({<} \subseteq E \times E\) is the union \({<_1} \cup \ldots \cup {<_n}\) of all happens-before relations;
- Any two events \(e, e' \in E\) of \(\mathcal {U}_{P}\) are in conflict, \(e \mathrel {\#}e'\), when \(e \ne e'\), and \(\lnot (e < e')\), and \(\lnot (e' < e)\), and h(e) is dependent on \(h(e')\).
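Canonical names as defined above can be computed by a memoized recursion over the causal predecessors of each event. The sketch below is our illustration; it renders names as strings for readability, whereas a real implementation would hash them.

```cpp
// Sketch: cn(e) = <h(e), {cn(e') : e' < e}> as a printable string. label[e]
// renders the action h(e); preds[e] holds all causal predecessors e' < e.
#include <set>
#include <string>
#include <vector>

struct PORun {
  std::vector<std::string> label;    // h(e) for each event e
  std::vector<std::set<int>> preds;  // {e' : e' < e} for each event e
};

std::string canonicalName(const PORun& r, int e, std::vector<std::string>& memo) {
  if (!memo[e].empty()) return memo[e];
  std::set<std::string> hist;  // std::set keeps H sorted and duplicate-free
  for (int p : r.preds[e]) hist.insert(canonicalName(r, p, memo));
  std::string cn = "<" + r.label[e] + ", {";
  bool first = true;
  for (const std::string& h : hist) { cn += (first ? "" : ", ") + h; first = false; }
  memo[e] = cn + "}>";
  return memo[e];
}
```

Two events from different partial-order runs are then merged exactly when their names coincide, as with event 19 in Fig. 2d and 2e.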
Figure 3a shows the unfolding produced by merging all 4 partial-order runs in Fig. 2b to 2e. Note that the configurations of \(\mathcal {U}_{P}\) are partial-order runs of P. Furthermore, the \(\subseteq \)-maximal configurations are exactly the 4 originating partial orders. It is possible to prove that \(\mathcal {U}_{P}\) is a semantics of P: in [42] we show that (1) \(\mathcal {U}_{P}\) is uniquely defined, (2) any interleaving of any local configuration of \(\mathcal {U}_{P}\) is a run of P, and (3) for any run \(\sigma \) of P there is a configuration C of \(\mathcal {U}_{P}\) such that \(\sigma \in \mathop { inter } (C)\).
3.6 Conflicting Extensions
Our technique analyzes P by iteratively constructing (all) partial-order runs of P. In every iteration we need to find the next partial order to explore. We use the so-called conflicting extensions of a configuration to detect how to start a new partial-order run that has not been explored before.
Given a configuration C of \(\mathcal {U}_{P}\), an extension of C is any event \(e \in E \setminus C\) such that all the causal predecessors of e are in C. We denote the set of extensions of C as \(\mathop { ex } (C) \mathrel {:=}{\{ e \in E :e \notin C \wedge \left\lceil e \right\rceil \subseteq C \mathclose \}}\). The enabled events of C are extensions that can form a larger configuration: \(\mathop { en } (C) \mathrel {:=}{\{ e \in \mathop { ex } (C) :C \cup {\{ e \mathclose \}} \in \mathop { conf } (\mathcal {E}) \mathclose \}}\). For instance, in Fig. 3a, the (local) configuration [6] has 3 extensions, \(\mathop { ex } ([6]) = {\{ 7,9,16 \mathclose \}}\), of which, however, only event 7 is enabled: \(\mathop { en } ([6]) = {\{ 7 \mathclose \}}\). Event 19 is not an extension of [6] because 18 is a causal predecessor of 19, but \(18 \not \in [6]\). A conflicting extension of C is an extension e for which there is at least one \(e' \in C\) such that \(e \mathrel {\#}e'\). The (local) configuration [6] from our previous example has two conflicting extensions, events 9 and 16. A conflicting extension is, intuitively, an incompatible addition to the configuration C: an event e that cannot be executed together with C (without removing \(e'\) and its causal successors from C). We denote by \(\mathop { cex } (C)\) the set of all conflicting extensions of C, which coincides with the set of all extensions that are not enabled: \(\mathop { cex } (C) \mathrel {:=}\mathop { ex } (C) \setminus \mathop { en } (C)\).
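Given an explicit set of known events, the partition of \(\mathop { ex } (C)\) into \(\mathop { en } (C)\) and \(\mathop { cex } (C)\) follows directly from the definitions. The sketch below is our illustration, with the same assumed callbacks as in the earlier sketches.

```cpp
// Sketch: split the extensions of configuration C into enabled events en(C)
// and conflicting extensions cex(C). E is the set of currently known events.
#include <functional>
#include <set>

template <class Event>
void extensions(const std::set<Event>& E, const std::set<Event>& C,
                const std::function<std::set<Event>(const Event&)>& causes,
                const std::function<bool(const Event&, const Event&)>& conflict,
                std::set<Event>& en, std::set<Event>& cex) {
  for (const Event& e : E) {
    if (C.count(e)) continue;
    bool closed = true;                  // extension: ⌈e⌉ ⊆ C
    for (const Event& c : causes(e))
      if (!C.count(c)) { closed = false; break; }
    if (!closed) continue;
    bool clash = false;                  // enabled iff conflict-free with C
    for (const Event& f : C)
      if (conflict(e, f)) { clash = true; break; }
    (clash ? cex : en).insert(e);        // ex(C) = en(C) ⊎ cex(C)
  }
}
```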
Our technique discovers new conflicting extension events by trying to revert the causal order of certain events in C. Owing to space limitations we only explain how the algorithm handles events of \(\textsf {acq}\) and \(\textsf {w}_2\) effect ([42] presents the remaining 4 procedures of the algorithm). Algorithm 1 shows the procedure that handles this case. It receives an event e of \(\textsf {acq}\) or \(\textsf {w}_2\) effect (line 2). We build and return a set of conflicting extensions, stored in variable R. Events are added to R in lines 14 and 17. Note that we define events using their canonical name. For instance, in line 14 we add a new event whose action is h(e) and whose causal history is P. Note that we only create events that execute action h(e). Conceptually speaking, the algorithm simply finds different causal histories (variables P and \(e'\)) within the set \(K = \left\lceil e \right\rceil \) to execute action h(e).
Procedure last-of(C, i) returns the only <-maximal event of thread i in C; last-notify(e, c, i) returns the only immediate <-predecessor \(e'\) of e such that the effect of \(h(e')\) is either \(\langle \textsf {sig},c,i\rangle \) or \(\langle \textsf {bro},c,S\rangle \) with \(i \in S\); finally, procedure last-lock(C, l) returns the only <-maximal event that manipulates lock l in C (an event of effect \(\textsf {acq}\), \(\textsf {rel}\), \(\textsf {w}_1\) or \(\textsf {w}_2\)), or \(\bot \) if no such event exists. See [42] for additional details.
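To convey the flavor of the construction, the sketch below shows how alternative histories for an \(\textsf {acq}\) event might be enumerated. It is emphatically a hypothetical reconstruction of the idea, not the authors' Algorithm 1: it pairs the causal past of e's thread with each event inside \(K = \left\lceil e \right\rceil \) after which lock l is free, and emits the canonical name \(\langle h(e), P\rangle \) for every such history P.

```cpp
// Hypothetical sketch of cex generation for an <acq,l> event e (the w2 case
// and the other effects need their own procedures, see [42]). All helper
// parameters are assumptions: threadPast is the causal past of e's thread
// predecessor, unlocksInK lists the events in K = ⌈e⌉ that release l (rel or
// w1), localConfig(e') = [e'], and lockFreeAfter(P) holds iff last-lock(P, l)
// is ⊥ or releases l.
#include <functional>
#include <set>
#include <utility>
#include <vector>

template <class Event, class Act>
std::vector<std::pair<Act, std::set<Event>>> cexAcq(
    const Act& actionOfE,      // h(e): the only action we re-fire
    const std::set<Event>& K,  // ⌈e⌉: where alternative histories are searched
    const std::set<Event>& threadPast,
    const std::vector<Event>& unlocksInK,
    const std::function<std::set<Event>(const Event&)>& localConfig,
    const std::function<bool(const std::set<Event>&)>& lockFreeAfter) {
  std::vector<std::pair<Act, std::set<Event>>> R;
  if (lockFreeAfter(threadPast))       // fire h(e) right after its thread past
    R.push_back({actionOfE, threadPast});
  for (const Event& u : unlocksInK) {  // ...or after an earlier unlock in K
    std::set<Event> P = threadPast;
    for (const Event& x : localConfig(u)) P.insert(x);
    if (P != K && lockFreeAfter(P))    // skip the history e already has
      R.push_back({actionOfE, P});
  }
  return R;                            // each <h(e), P> is a canonical name
}
```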
3.7 Exploring the Unfolding
This section presents an algorithm that explores the state space of P by constructing all maximal configurations of \(\mathcal {U}_{P}\). In essence, our procedure is an improved Quasi-Optimal POR algorithm [35], where the unfolding is not explored using a DFS traversal but a user-defined search order. This enables us to build upon the preexisting exploration heuristics (“searchers”) in KLEE rather than having to follow a strict DFS exploration of the unfolding.
Our algorithm explores one configuration of \(\mathcal {U}_{P}\) at a time and organizes the exploration into a binary tree. Figure 3b shows the tree explored for the unfolding shown in Fig. 3a. A tree node is a tuple \(n \mathrel {:=}\langle C,D,A,e\rangle \) that represents both the exploration of a configuration C of \(\mathcal {U}_{P}\) and a choice to execute, or not, event \(e \in \mathop { en } (C)\). Both D (for disabled) and A (for add) are sets of events.
The key insight of this tree is as follows. The subtree rooted at a given node n explores all configurations of \(\mathcal {U}_{P}\) that include C and exclude D, with the following constraint: n’s left subtree explores all configurations including event e and n’s right subtree explores all configurations excluding e. Set A is used to guide the algorithm when exploring the right subtree. For instance, in Fig. 3b the subtree rooted at node \(n \mathrel {:=}\langle {\{ 1,2 \mathclose \}},\emptyset ,\emptyset ,3\rangle \) explores all maximal configurations that contain events 1 and 2 (namely, those shown in Fig. 2b and 2c). The left subtree of n explores all configurations including \({\{ 1,2,3 \mathclose \}}\) (Fig. 2b) and the right subtree all of those including \({\{ 1,2 \mathclose \}}\) but excluding 3 (Fig. 2c).
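The node tuple maps directly onto a small data structure. The sketch below is our illustration (hypothetical field names), with the invariants above restated as comments.

```cpp
// Sketch of an exploration-tree node <C, D, A, e> from Sect. 3.7.
#include <memory>
#include <set>

template <class Event>
struct Node {
  std::set<Event> C;  // configuration explored here; the whole subtree keeps C
  std::set<Event> D;  // disabled events; the whole subtree excludes D
  std::set<Event> A;  // events of an alternative, replayed in the right subtree
  Event e;            // the enabled event branched on: e is in en(C)
  std::unique_ptr<Node> left;   // configurations that additionally include e
  std::unique_ptr<Node> right;  // configurations that exclude e (exists only
                                // if an alternative witnesses any such config)
};
```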
Algorithm 2 shows a simplified version of our algorithm. The complete version, in [42], specifies additional details including how nodes are selected for exploration and how they are removed from the tree. The algorithm constructs and stores the exploration tree in the variable \(N\), and the set of currently known events of \(\mathcal {U}_{P}\) in variable \(U\). At the end of the exploration, \(U\) will store all events of \(\mathcal {U}_{P}\) and the leaves of the exploration tree in N will correspond to the maximal configurations of \(\mathcal {U}_{P}\).
The tree is constructed using a fixed-point loop (line 4) that repeats the following steps as long as they modify the tree: select a node \(\langle C,D,A,e\rangle \) in the tree (line 5), extend U with the conflicting extensions of C (line 6), and check if the configuration is \(\subseteq \)-maximal (line 7), in which case there is nothing left to do; otherwise, try to add a left (line 9) or right (line 12) child node.
The subtree rooted at the left child node will explore all configurations that include \(C \cup {\{ e \mathclose \}}\) and exclude D (line 10); the right subtree will explore those including C and excluding \(D \cup {\{ e \mathclose \}}\) (line 15), if any of them exists, which we detect by checking (line 14) if we found a so-called alternative [41].
An alternative is a set of events which witnesses the existence of some maximal configuration in \(\mathcal {U}_{P}\) that extends C without including \(D \cup {\{ e \mathclose \}}\). Computing such a witness is an NP-complete problem, so we use an approximation called k-partial alternatives [35], which can be computed in polynomial time and works well in practice. Our procedure alt specifically computes 1-partial alternatives: it selects \(k=1\) event e from \(D \cap \mathop { en } (C)\), searches for an event \(e'\) in conflict with e (we have added all known candidates in line 6, using the algorithms of Sect. 3.6) that can extend C (i.e., such that \(C \cup [e']\) is a configuration), and returns it. When such an event \(e'\) is found (line 33), some events in its local configuration \([e']\) become the A-component of the right child node (line 15), and the leftmost branch rooted at that node will re-execute those events (as they will be selected in line 20), guiding the search towards the witnessed maximal configuration.
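A 1-partial alternative search can be phrased as two nested loops, as sketched below. This is our hedged rendering of the description above (the helper callbacks are assumptions), not the tool's implementation.

```cpp
// Sketch: compute a 1-partial alternative for <C, D> or report that none was
// found. U is the set of currently known events, enC = en(C).
#include <functional>
#include <optional>
#include <set>

template <class Event>
std::optional<Event> alt1(
    const std::set<Event>& U,
    const std::set<Event>& C, const std::set<Event>& D,
    const std::set<Event>& enC,
    const std::function<std::set<Event>(const Event&)>& localConfig,  // [e']
    const std::function<bool(const Event&, const Event&)>& conflict,  // #
    const std::function<bool(const std::set<Event>&)>& isConfig) {
  for (const Event& e : D) {
    if (!enC.count(e)) continue;       // pick one event e from D ∩ en(C)
    for (const Event& ep : U) {
      if (!conflict(e, ep)) continue;  // candidate must satisfy e # e'
      std::set<Event> X = C;
      for (const Event& x : localConfig(ep)) X.insert(x);
      if (isConfig(X)) return ep;      // witness: C ∪ [e'] is a configuration
    }
    break;                             // k = 1: only one event of D is tried
  }
  return std::nullopt;
}
```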
For instance, in Fig. 3b, assume that the algorithm has selected node \(n = \langle {\{ 1 \mathclose \}}, \emptyset , \emptyset , 2\rangle \) at line 5 when event 16 is already in U. Then a call to alt(\({\{ 1 \mathclose \}}, {\{ 2 \mathclose \}}\)) is issued at line 13, event \(e = 2\) is selected at line 29 and event \(e' = 16\) gets selected at line 33, because \(2 \mathrel {\#}16\) and \([16] \cup {\{ 1 \mathclose \}}\) is a configuration. As a result, node \(n' = \langle {\{ 1 \mathclose \}}, {\{ 2 \mathclose \}}, {\{ 5,16 \mathclose \}}, 5\rangle \) becomes the right child of n in line 15, and the leftmost branch rooted at \(n'\) adds \({\{ 5,16 \mathclose \}}\) to C, leading to the maximal configuration shown in Fig. 2d.
3.8 Cutoffs and Completeness
All interleavings of a given configuration always reach the same state, but interleavings of different configurations can also reach the same state. It is possible to exclude certain such redundant configurations from the exploration without making the algorithm incomplete, by using cutoff events [32].
Intuitively, an event is a cutoff if we have already visited another event that reaches the same state with a shorter execution. Formally, in line 27 of Algorithm 2 we let cutoff(e) return true iff there is some \(e' \in U\) such that \(\mathop { state } ([e]) = \mathop { state } ([e'])\) and \(|[e']| < |[e]|\). This makes Algorithm 2 ignore cutoff events and any event that causally succeeds them. Sect. 4.2 explains how to implement the check \(\mathop { state } ([e]) = \mathop { state } ([e'])\) effectively.
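Under the assumption that \(\mathop { state } ([e])\) can be summarized by a hash (as Sect. 4.2 discusses), the cutoff test reduces to a map lookup. The sketch below is our illustration.

```cpp
// Sketch: e is a cutoff iff an already-seen event reaches the same state with
// a strictly smaller local configuration. stateId abstracts state([e]).
#include <cstddef>
#include <unordered_map>

bool isCutoff(std::size_t stateId,       // assumed hash of state([e])
              std::size_t sizeLocalCfg,  // |[e]|
              std::unordered_map<std::size_t, std::size_t>& best) {
  auto it = best.find(stateId);
  if (it != best.end() && it->second < sizeLocalCfg)
    return true;  // some e' with state([e']) = state([e]) and |[e']| < |[e]|
  if (it == best.end() || sizeLocalCfg < it->second)
    best[stateId] = sizeLocalCfg;        // remember the smallest |[e']| per state
  return false;
}
```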
While cutoffs prevent the exploration of redundant configurations, the analysis is still complete: it is possible to prove that every state reachable via a configuration with cutoffs is also reachable via a configuration without cutoffs. Furthermore, cutoff events not only reduce the exploration of redundant configurations, but also force the algorithm to terminate for non-terminating programs that run on bounded memory.
Theorem 2 (Correctness)
For any reachable state \(s \in \mathop { reach } (P)\), Algorithm 2 explores a configuration C such that for some \(C' \subseteq C\) it holds that \(\mathop { state } (C') = s\). Furthermore, it terminates for any program P such that \(\mathop { reach } (P)\) is finite.
A proof sketch is available in [42]. Naturally, since Algorithm 2 explores \(\mathcal {U}_{P}\), and \(\mathcal {U}_{P}\) is an exact representation of all runs of P, Algorithm 2 is also sound: any event constructed by the algorithm (added to set U) is associated with a real run of P.