1 Introduction

Termination is one of the most important properties of programs, and thus termination analysis is a very active field of research. Here, we are concerned with disproving termination of transition systems (TSs), a popular intermediate representation for verification of programs written in more expressive languages.

Example 1

Consider the following TS \(\mathcal {T}\) with entry-point \(\textsf{init}\) and two further locations \(\ell _1,\ell _2\) over the variables xyz, where \(x',y',z'\) represent the values of xyz after applying a transition, and \(\overset{=}{x}, x ++\), and abbreviate \(x' = x\), \(x' = x + 1\), and \(x' = x - 1\). The first two transitions are a variantFootnote 1 of chc-LIA-Lin_052 from the CHC Competition ’22 [7] and the last two are a variantFootnote 2 of flip2_rec.jar-obl-8 from the Termination and Complexity Competition (TermComp) [21].

figure b

At \(\ell _1\), \(\mathcal {T}\) operates in two “phases”: First, just x is incremented until x reaches z (\(1^{st}\) disjunct of \(\tau _{\ell _1}\)). Then, x and y are incremented until y reaches \(2 \cdot z + 1\) (\(2^{nd}\) disjunct of \(\tau _{\ell _1}\)). If \(x = y = c\) holds for some \(c > 1\) at that point (which is the case if \(x \le y = z\) holds initially), then the execution can continue at \(\ell _2\) as follows:

$$\begin{aligned}\begin{gathered} \ell _2(c,c,c_z) \longrightarrow _{\tau _{\ell _2}^=} \ell _2(c,c-1,c_z) \longrightarrow _{\tau _{\ell _2}^{\ne }} \ell _2(c-1,c,c_z) \longrightarrow _{\tau _{\ell _2}^{\ne }} \ell _2(c,c,c_z) \longrightarrow _{\tau _{\ell _2}^=} \ldots \end{gathered}\end{aligned}$$

Here, \(\ell _2(c,c,c_z)\) means that the current location is \(\ell _2\) and the values of xyz are \(c,c,c_z\). The \(1^{st}\) and \(2^{nd}\) step with \(\tau _{\ell _2}^{\ne }\) satisfy the \(1^{st}\) (\(x > y \wedge \ldots \)) and \(2^{nd}\) (\(x < y \wedge \ldots \)) disjunct of \(\tau _{\ell _2}^{\ne }\)’s condition, respectively. Thus, \(\mathcal {T}\) does not terminate.

Example 1 is challenging for state-of-the-art tools for several reasons. First, more than 5000 steps are required to reach \(\ell _2\), so reachability of \(\ell _2\) is difficult to prove for approaches that unroll the transition relation or use other variants of iterative deepening. Thus, chc-LIA-Lin_052 is beyond the capabilities of most other state-of-the-art tools for proving reachability.

Second, the pattern “ disjunct of , \(2^{nd}\) disjunct of ” must be found to prove non-termination. Therefore, flip2_rec.jar-obl-8 (which does not use disjunctions) cannot be solved by other state-of-the-art termination tools.

Third, Example 1 contains disjunctions, which are not supported by many termination tools. Presumably, the reason is that most techniques for (dis)proving termination of loops are restricted to conjunctions (e.g., due to the use of templates and Farkas’ Lemma). While disjunctions can be avoided by splitting disjunctive transitions according to the DNF of their conditions, this leads to an exponential blow-up in the number of transitions.

We present an approach that can prove non-termination of systems like Example 1 automatically. To this end, we tightly integrate non-termination techniques into our recent Acceleration Driven Clause Learning (ADCL) calculus [16], which has originally been designed for CHCs, but it can also be used to analyze TSs.

Due to the use of acceleration techniques that compute the transitive closure of recursive transitions, ADCL finds long witnesses of reachability automatically. If acceleration techniques cannot be applied, it unrolls the transition relation, so it can easily detect complex patterns of transitions that admit non-terminating runs. Finally, ADCL reduces reasoning about disjunctions to reasoning about conjunctions by considering conjunctive variants of disjunctive transitions. Thus, combining ADCL with non-termination techniques for conjunctive transitions allows for disproving termination of TSs with complex Boolean structure.

After introducing preliminaries in Sect. 2, Sect. 3 presents a straightforward adaption of ADCL to TSs. Section 4 introduces our main contribution: ADCL-NT, a variant of ADCL for proving non-termination. Finally, in Sect. 5, we discuss related work and demonstrate the power of our approach by comparing it with other state-of-the-art tools. All proofs can be found in [19].

2 Preliminaries

We assume familiarity with basics from many-sorted first-order logic. \(\mathcal {V}\) is a countably infinite set of variables and \(\mathcal {A}\) is a first-order theory over a k-sorted signature \(\varSigma _\mathcal {A}\) with carrier \(\mathcal {C}_\mathcal {A}= (\mathcal {C}_{\mathcal {A},1},\ldots ,\mathcal {C}_{\mathcal {A},k})\). \(\textsf{QF}(\varSigma _\mathcal {A})\) is the set of all quantifier-free first-order formulas over \(\varSigma _\mathcal {A}\), which are w.l.o.g. assumed to be in negation normal form, and \(\textsf{QF}_\wedge (\varSigma _\mathcal {A})\) only contains conjunctions of \(\varSigma _\mathcal {A}\)-literals. Given a first-order formula \(\eta \) over \(\varSigma _\mathcal {A}\), \(\sigma \) is a model of \(\eta \) (written \(\sigma \models _\mathcal {A}\eta \)) if it is a model of \(\mathcal {A}\) with carrier \(\mathcal {C}_\mathcal {A}\), extended with interpretations for \(\mathcal {V}\) such that \(\eta \) is satisfied. As usual, \(\models _\mathcal {A}\eta \) means that \(\eta \) is valid, and \(\eta \equiv _\mathcal {A}\eta '\) means \(\models _\mathcal {A}\eta \iff \eta '\).

We write \(\vec {x}\) for sequences and \(x_i\) is the \(i^{th}\) element of \(\vec {x}\). We use “ ” for concatenation of sequences, where we identify sequences of length 1 with their elements, so we may write, e.g., instead of .

Transition Systems. Let \(d \in \mathbb {N}\) be fixed, and let \(\vec {x},\vec {x}' \in \mathcal {V}^d\) be disjoint vectors of pairwise different variables. Each \(\psi \in \textsf{QF}(\varSigma _\mathcal {A})\) induces a relation \(\longrightarrow _\psi \) on \(\mathcal {C}_\mathcal {A}^d\) where \(\vec {s} \longrightarrow _\psi \vec {t}\) iff \(\psi [\vec {x}/\vec {s},\vec {x}'/\vec {t}]\) is satisfiable. So for the condition of \(\tau _{\ell _2}^=\), we have \((4,4,4) \longrightarrow _{\psi } (4,3,7)\). \(\mathcal {L}\supseteq \{\textsf{init},\textsf{err}\}\) is a finite set of locations. A configuration is a pair \((\ell ,\vec {s}) \in \mathcal {L}\times \mathcal {C}_\mathcal {A}^d\), written \(\ell (\vec {s})\). A transition is a triple \(\tau = (\ell ,\psi ,\ell ') \in \mathcal {L}\times \textsf{QF}(\varSigma _\mathcal {A}) \times \mathcal {L}\), written \(\ell \rightarrow \ell ' \,\llbracket \psi \rrbracket \), and its condition is \(\textsf{cond}(\tau ) \mathrel {\mathop :}=\psi \). W.l.o.g., we assume \(\ell \ne \textsf{err}\) and \(\ell ' \ne \textsf{init}\). Then \(\tau \) induces a relation \(\longrightarrow _{\tau }\) on configurations where iff , and \(\vec {s} \longrightarrow _\psi \vec {t}\). So, e.g., \(\ell _2(4,4,4) \longrightarrow _{\tau _{\ell _2}^=} \ell _2(4,3,7)\). We call \(\tau \) recursive if \(\ell = \ell '\), conjunctive if \(\psi \in \textsf{QF}_\wedge (\varSigma _\mathcal {A})\), initial if \(\ell = \textsf{init}\), and safe if \(\ell ' \ne \textsf{err}\). Moreover, we define \((\ell \rightarrow \ell ' \,\llbracket \psi \rrbracket )|_{\psi '} \mathrel {\mathop :}=\ell \rightarrow \ell ' \,\llbracket \psi ' \rrbracket \). A transition system (TS) \(\mathcal {T}\) is a finite set of transitions, and it induces the relation \(\longrightarrow _{\mathcal {T}} \mathrel {\mathop :}=\bigcup _{\tau \in \mathcal {T}} {\longrightarrow _{\tau }}\).

Chaining \(\tau = \ell _s \rightarrow \ell _t \,\llbracket \psi \rrbracket \) and \(\tau ' = \ell _s' \rightarrow \ell _t' \,\llbracket \psi ' \rrbracket \) yields \(\textsf{chain}(\tau ,\tau ') \mathrel {\mathop :}=(\ell _s \rightarrow \ell '_t \,\llbracket \psi _c \rrbracket )\) where \(\psi _c \mathrel {\mathop :}=\psi [\vec {x}' / \vec {x}''] \wedge \psi '[\vec {x} / \vec {x}'']\) for fresh \(\vec {x}'' \in \mathcal {V}^d\) if \(\ell _t = \ell _s'\), and \(\psi _c \mathrel {\mathop :}=\bot \) (meaning \( false \)) if \(\ell _t \ne \ell _s'\). So \({\longrightarrow _{\textsf{chain}(\tau ,\tau ')}} = {\longrightarrow _\tau } \circ {\longrightarrow _{\tau '}}\), and \(\textsf{chain}(\tau _{\mathsf {\ell _1 \rightarrow \ell _2}},\tau _{\ell _2}^=) = \ell _1 \rightarrow \ell _2 \,\llbracket \psi \rrbracket \) where . For non-empty, finite sequences of transitions we define \(\textsf{chain}([\tau ]) \mathrel {\mathop :}=\tau \) and . We lift notations for transitions to finite sequences via chaining. So \(\textsf{cond}(\vec {\tau }) \mathrel {\mathop :}=\textsf{cond}(\textsf{chain}(\vec {\tau }))\), \(\vec {\tau }\) is recursive if \(\textsf{chain}(\vec {\tau })\) is recursive, \({\longrightarrow _{\vec {\tau }}} = {\longrightarrow _{\textsf{chain}(\vec {\tau })}}\), etc. If \(\tau \) is initial and is a finite run. \(\mathcal {T}\) is safe if every finite run is safe. If there is a \(\sigma \) such that \(\sigma \models _{\mathcal {A}} \textsf{cond}(\vec {\tau }')\) for every finite prefix \(\vec {\tau }'\) of \(\vec {\tau } \in \mathcal {T}^{\omega }\), then \(\vec {\tau }\) is an infinite run. If no infinite run exists, then \(\mathcal {T}\) is terminating.

Acceleration. Acceleration techniques compute the transitive closure of relations. In the following definition, we only consider relations defined by conjunctive formulas, since many existing acceleration techniques do not support disjunctions [4], or have to resort to approximations in the presence of disjunctions [13].

Definition 2

(Acceleration). An acceleration technique is a function \(\textsf{accel}: \textsf{QF}_\wedge (\varSigma _\mathcal {A}) \mapsto \textsf{QF}_\wedge (\varSigma _{\mathcal {A}'})\) such that \({\longrightarrow _{\psi }^+} = {\longrightarrow _{\textsf{accel}(\psi )}}\), where \(\mathcal {A}'\) is a first-order theory. For recursive conjunctive transitions \(\tau \), we define \(\textsf{accel}(\tau ) \mathrel {\mathop :}=\tau |_{\textsf{accel}(\textsf{cond}(\tau ))}\).

So we clearly have \({\longrightarrow ^+_\tau } = {\longrightarrow _{\textsf{accel}(\tau )}}\). Note that most theories are not “closed under acceleration”. E.g., accelerating the Presburger formula \(x'_1 = x_1 + x_2 \wedge \overset{=}{x_2}\) yields the non-linear formula \(n > 0 \wedge x'_1 = x_1 + n \cdot x_2 \wedge \overset{=}{x_2}\). If neither \(\mathbb {N}\) nor \(\mathbb {Z}\) are contained in \(\mathcal {C}_\mathcal {A}\), then an additional sort for the range of n is required in the formula that results from applying \(\textsf{accel}\). Hence, Definition 2 allows \(\mathcal {A}' \ne \mathcal {A}\).

3 ADCL for Transition Systems

We originally proposed the ADCL calculus to analyze satisfiability of linear Constrained Horn Clauses (CHCs) [16]. Here, we rephrase it for TSs, and in Sect. 4, we modify it for proving non-termination. The adaption to TSs is straightforward as TSs can be transformed into equivalent linear CHCs and vice versa (see, e.g., [10]).

To bridge the gap between transitions \(\tau \) where \(\textsf{cond}(\tau ) \in \textsf{QF}(\varSigma _\mathcal {A})\) and acceleration techniques for formulas from \(\textsf{QF}_\wedge (\varSigma _\mathcal {A})\), ADCL uses syntactic implicants.

Definition 3

(Syntactic Implicants [16, Def. 6]). If \(\psi \in \textsf{QF}(\varSigma _\mathcal {A})\), then:

$$\begin{aligned} \textsf{sip}(\psi ,\sigma )&{} \mathrel {\mathop :}=\bigwedge \{\pi \textit{ is a literal of } \psi \mid \sigma \models _\mathcal {A}\pi \}{} & {} \textit{if}\; \sigma \models _\mathcal {A}\psi \\ \textsf{sip}(\psi )&{} \mathrel {\mathop :}=\{ \textsf{sip}(\psi ,\sigma ) \mid \sigma \models _\mathcal {A}\psi \}\\ \textsf{sip}(\tau )&{} \mathrel {\mathop :}=\{\tau |_{\psi } \mid \psi \in \textsf{sip}(\textsf{cond}(\tau ))\}{} & {} \textit{for transitions}\; \tau \\ \textsf{sip}(\mathcal {T})&{} \mathrel {\mathop :}=\bigcup _{\tau \in \mathcal {T}} \textsf{sip}(\tau ){} & {} \textit{for TSs}\; \mathcal {T}\end{aligned}$$

Here, \(\textsf{sip}\) abbreviates syntactic implicant projection.

As \(\textsf{sip}(\psi ,\sigma )\) is restricted to literals from \(\psi \), \(\textsf{sip}(\psi )\) is finite. Syntactic implicants ignore the semantics of literals. So we have, e.g., \((X> 1) \notin \textsf{sip}(X> 0 \wedge X> 1) = \{X> 0 \wedge X > 1\}\). It is easy to show \(\psi \equiv _\mathcal {A}\bigvee \textsf{sip}(\psi )\), and thus \({\longrightarrow _\mathcal {T}} = {\longrightarrow _{\textsf{sip}(\mathcal {T})}}\).

Since \(\textsf{sip}(\tau )\) is worst-case exponential in the size of \(\textsf{cond}(\tau )\), we do not compute it explicitly. Instead, ADCL constructs a run \(\vec {\tau }\) step by step, and to perform a step with \(\tau \), it searches for a model \(\sigma \) of . If such a model exists, it appends \(\tau |_{\textsf{sip}(\textsf{cond}(\tau ),\sigma )}\) to \(\vec {\tau }\). This corresponds to a step with a conjunctive variant of \(\tau \) whose condition is satisfied by \(\sigma \). In other words, our calculus constructs \(\textsf{sip}(\textsf{cond}(\tau ), \sigma )\) “on the fly” when performing a step with \(\tau \), where

The core idea of ADCL is to learn new, non-redundant transitions via acceleration. Essentially, a transition is redundant if its transition relation is a subset of another transition’s relation. Thus, redundant transitions are not useful for (dis-)proving safety.

Definition 4

(Redundancy, [16, Def. 8]). A transition \(\tau \) is (strictly) redundant w.r.t. \(\tau '\), denoted \(\tau \sqsubseteq \tau '\) (\(\tau \sqsubset \tau '\)) if \({\longrightarrow _\tau } \subseteq {\longrightarrow _{\tau '}}\) (\({\longrightarrow _\tau } \subset {\longrightarrow _{\tau '}}\)). For a TS \(\mathcal {T}\), we have \(\tau \sqsubseteq \mathcal {T}\) (\(\tau \sqsubset \mathcal {T}\)) if \(\tau \sqsubseteq \tau '\) (\(\tau \sqsubset \tau '\)) for some \(\tau ' \in \mathcal {T}\).

In the sequel, we assume oracles for redundancy, satisfiability of \(\textsf{QF}(\varSigma _\mathcal {A})\)-formulas, and acceleration. In practice, we use incomplete techniques instead (see Sect. 5).

From now on, let \(\mathcal {T}\) be the TS that is being analyzed with ADCL. A state of ADCL consists of a TS \(\mathcal {S}\) that augments \(\mathcal {T}\) with learned transitions, a run \(\vec {\tau }\) of \(\mathcal {S}\) called the trace, and a sequence of sets of blocking transitions \([B_i]_{i=0}^k\), where transitions that are redundant w.r.t. \(B_k\) must not be appended to the trace.

The following definition introduces the ADCL calculus. It extends the trace step by step (using the rule Step, which performs an evaluation step with a transition) and learns new transitions via acceleration (Accelerate) whenever a suffix of the trace is recursive. To avoid non-terminating ADCL-derivations, our notion of redundancy from Definition 4 is used to backtrack whenever a suffix of the trace corresponds to a special case of another (learned) transition (Covered). Moreover, Backtrack is used whenever a run cannot be continued. A more detailed explanation of ADCL is provided after Definition 5.

Definition 5

(ADCL [16, Def. 9, 10]). A state is a triple \((\mathcal {S},[\tau _i]_{i=1}^k,[B_i]_{i=0}^k)\) where \(\mathcal {S}\supseteq \mathcal {T}\) is a TS, \(\bigcup _{i=0}^k B_i \subseteq \textsf{sip}(\mathcal {S})\), and \([\tau _i]_{i=1}^k \in \textsf{sip}(\mathcal {S})^*\). The transitions in \(\textsf{sip}(\mathcal {T})\) are called original and the transitions in \(\textsf{sip}(\mathcal {S})\setminus \textsf{sip}(\mathcal {T})\) are learned. A transition \(\tau _{k+1} \sqsubseteq B_k\) is blocked, and \(\tau _{k+1} \not \sqsubseteq B_k\) is active if \(\textsf{chain}([\tau _i]_{i=1}^{k+1})\) is an initial transition with satisfiable condition (i.e., \([\tau _i]_{i=1}^{k+1}\) is a run). Let

$$\begin{aligned}\begin{gathered} \textsf{bt}(\mathcal {S},[\tau _i]_{i=1}^k,[B_0,\ldots ,B_k]) \mathrel {\mathop :}=(\mathcal {S},[\tau _i]_{i=1}^{k-1},[B_0,\ldots ,B_{k-1} \cup \{\tau _k\}]) \end{gathered}\end{aligned}$$

where \(\textsf{bt}\) abbreviates “backtrack”. Our calculus is defined by the following rules.

figure q

We write \(\overset{\textsc {I}}{\leadsto }\), \(\overset{\textsc {S}}{\leadsto }\), \(\ldots \) to indicate that the rule Init, Step, \(\ldots \) was used. Step adds a transition to the trace. When the trace has a recursive suffix, Accelerate allows for learning a new transition which then replaces the recursive suffix on the trace, or we may backtrack via Covered if the recursive suffix is redundant. Note that Covered does not apply if \(\vec {\tau }' \sqsubseteq \textsf{sip}(\mathcal {S})\) and \(|\vec {\tau }'| = 1\), as it could immediately undo every Step, otherwise. If no further Step is possible, Backtrack applies. Note that Backtrack and Covered block the last transition from the trace so that we do not perform the same Step again. If \(\vec {\tau }\) is an unsafe run, Refute yields \(\textsf{unsafe}\), and if the entire search space has been exhausted without finding an unsafe run (i.e., if all initial transitions are blocked), Prove yields \(\textsf{safe}\).

The definition of ADCL in [16] is more liberal than ours: In our setting, Accelerate may only be applied if the learned transition is non-redundant, and our definition of “active transitions” enforces that the first transition on the trace is always an initial transition. In [16], these requirements are not enforced by the definition of ADCL, but by the definition of reasonable strategies [16, Def. 14]. For simplicity, we integrated these requirements into Definition 5. Additionally, Covered should be preferred over Accelerate, and Accelerate should be preferred over Step.

Example 6

We apply ADCL to a version of Example 1 with the additional transition

figure r
figure s

Here, 5k abbreviates 5000 and:

figure t

On the right, we show formulas describing the configurations that are reachable with the current trace. Every \(\leadsto \)-derivation starts with Init. The first two Steps add the initial transition \(\tau _{\textsf{i}}\) and an element of \(\textsf{sip}(\tau _{\ell _1})\) to the trace. Since \(x < z\) holds after applying \(\tau _{\textsf{i}}\), the only possible choice for the latter is .

As is recursive, it is accelerated and replaced with \(\textsf{accel}(\tau _{\ell _1}|_{\psi _{x< z}}) = \tau ^+_{x<z}\), which simulates n steps with . Moreover, \(\tau _{x<z}^+\) is also added to the current set of blocking transitions, as we always have \({\longrightarrow ^2_{\tau }} \subseteq {\longrightarrow _{\tau }}\) for learned transitions \(\tau \) and thus adding them to the trace twice in a row is pointless.

Next, \(\tau _{\ell _1}\) is applicable again. As neither \(x < z\) nor \(x \ge z\) holds for all reachable configurations, we could continue with any element of \(\textsf{sip}(\tau _{\ell _1}) = \{\tau _{\ell _1}|_{\psi _{x < z}},\tau _{\ell _1}|_{\psi _{x \ge z}}\}\). We choose , so that the recursive transition can be accelerated to \(\tau ^+_{x \ge z}\). Then \(\tau _{\textsf{err}}\) applies, and the proof is finished via Refute.

For our purposes, the most important property of ADCL is the following.

Theorem 7

If \(\mathcal {T}\leadsto ^* (\mathcal {S},\vec {\tau },\vec {B})\) and \(\vec {\tau }\) is non-empty, then \(\textsf{cond}(\vec {\tau }) \not \equiv _\mathcal {A}\bot \) and \({\longrightarrow _{\vec {\tau }}} \subseteq {\longrightarrow ^+_\mathcal {T}}\). So if \(\mathcal {T}\leadsto ^* \textsf{unsafe}\), then \(\mathcal {T}\) is unsafe.

The other properties of ADCL that were shown in [16] immediately carry over to our setting, too: if \(\mathcal {T}\leadsto ^* \textsf{safe}\), then \(\mathcal {T}\) is safe; if \(\mathcal {T}\) is unsafe, then \(\mathcal {T}\leadsto ^* \textsf{unsafe}\); in general, \(\leadsto \) does not terminate. The proofs are analogous to [16].

4 Proving Non-Termination with ADCL-NT

From now on, we assume that the analyzed TS \(\mathcal {T}\) does not contain unsafe transitions. To prove non-termination, we look for a corresponding certificate.

Definition 8

(Certificate of Non-Termination). Let \(\tau = \ell \rightarrow \ell \,\llbracket \ldots \rrbracket \). A satisfiable formula \(\psi \) certifies non-termination of \(\tau \), written \(\psi \models _\mathcal {A}^\infty \tau \), if for any model \(\sigma \) of \(\psi \), there is an infinite sequence \( \ell (\sigma (\vec {x})) = \mathfrak {s}_1 \longrightarrow _\tau \mathfrak {s}_2 \longrightarrow _\tau \ldots \)

There exist many techniques for finding certificates of non-termination automatically, see Sect. 5. However, Definition 8 has several shortcomings. First, the problem of finding such certificates becomes very challenging if \(\textsf{cond}(\tau )\) contains disjunctions. Second, it is insufficient to consider a single transition when only non-singleton sequences \(\vec {\tau }\) such that \(\textsf{chain}(\vec {\tau })\) is recursive admit non-terminating runs. Third, just finding a certificate \(\psi \) of non-termination for some \(\vec {\tau } \in \mathcal {T}^*\) does not suffice for proving non-termination of \(\mathcal {T}\). Additionally, a proof that the pre-image of \(\longrightarrow _{\vec {\tau }|_\psi }\) is reachable from an initial configuration is required. All of these problems can be solved by integrating the search for certificates of non-termination into the ADCL calculus.

Definition 9

(ADCL-NT). To prove non-termination, we extend ADCL with the rule Nonterm and modify Covered as shown below. We write \(\leadsto _\textsf{nt}\) for the relation defined by the (modified) rules from Definition 5 and Nonterm.

figure z

So the idea of Nonterm is to apply a technique which searches for a certificate of non-termination to a recursive suffix of the trace. Apart from introducing Nonterm, we restricted Covered to recursive suffixes. The reason is that backtracking when the trace has a redundant, non-recursive suffix may prevent us from analyzing loops, resulting in a precision issue.

Example 10

Let \(\mathcal {T}\mathrel {\mathop :}=\{\tau _{\textsf{i}}, \tau '_{\textsf{i}}, \tau _\ell ,\tau _{\ell '}\}\) where

$$\begin{aligned}\begin{gathered} \tau _{\textsf{i}} \mathrel {\mathop :}=\textsf{init}\rightarrow \ell \,\llbracket \top \rrbracket \quad \tau _{\textsf{i}}' \mathrel {\mathop :}=\textsf{init}\rightarrow \ell ' \,\llbracket \top \rrbracket \quad \tau _\ell \mathrel {\mathop :}=\ell \rightarrow \ell ' \,\llbracket \top \rrbracket \quad \tau _{\ell '} \mathrel {\mathop :}=\ell ' \rightarrow \ell \,\llbracket \top \rrbracket \end{gathered}\end{aligned}$$

and \(\top \) means true. Due to the loop \(\ell \longrightarrow _{\tau _\ell } \ell ' \longrightarrow _{\tau _{\ell '}} \ell \), \(\mathcal {T}\) is clearly non-terminating. Without requiring that \(\vec {\tau }^\circlearrowleft \) is recursive in Covered, \(\mathcal {T}\) can be analyzed as follows:

figure aa

The \(1^{st}\) application of Covered is possible as \([\tau _{\textsf{i}},\tau _\ell ] \sqsubseteq \tau _{\textsf{i}}'\) and the \(2^{nd}\) application of Covered is possible as \([\tau '_{\textsf{i}},\tau _{\ell '}] \sqsubseteq \tau _{\textsf{i}}\). Note that the trace never contains both \(\tau _\ell \) and \(\tau _{\ell '}\), but both transitions are needed to prove non-termination.

Recall the shortcomings of Definition 8 mentioned above. First, due to the use of syntactic implicants, ADCL-NT reduces reasoning about arbitrary transitions to reasoning about conjunctive transitions. Second, as Nonterm considers a suffix \(\vec {\tau }^\circlearrowleft \) of the trace, it can prove non-termination of sequences of transitions. Third, ADCL’s capability to prove reachability directly carries over to our goal of proving non-termination. So in contrast to most other approaches (see Sect. 5), ADCL-NT does not have to resort to other tools or techniques for proving reachability.

We only search for a certificate of non-termination for \(\vec {\tau }^\circlearrowleft \) if ADCL-NT established reachability of the pre-image of \(\longrightarrow _{\vec {\tau }^\circlearrowleft }\) beforehand. Note, however, that this does not imply reachability of the pre-image of \(\longrightarrow _{\ell \rightarrow \textsf{err}\,\llbracket \psi \rrbracket }\), as \(\psi \) entails \(\textsf{cond}(\vec {\tau }^\circlearrowleft )\), but not the other way around. Hence, we cannot directly derive non-termination of \(\mathcal {T}\) when Nonterm applies. Regarding the strategy for \(\leadsto _\textsf{nt}\), one should try to use Nonterm once for each recursive suffix of the trace.

Example 11

Reconsider Example 1. Up to (excluding) the second-last step, the derivation from Example 6 remains unchanged. Then we get

figure ab
$$\begin{aligned} \text {where} \quad \psi _{x>y}&{} \mathrel {\mathop :}=x> 0 \wedge y> 0 \wedge x' = y \wedge x> y \wedge y' = x&\tau _\textsf{err}&{} \mathrel {\mathop :}=\ell _2 \rightarrow \textsf{err}\,\llbracket x = y> 1 \rrbracket \\ \psi _{x<y}&{} \mathrel {\mathop :}=x> 0 \wedge y > 0 \wedge x' = y \wedge x < y \wedge \overset{=}{y}&\mathcal {S}_3&{} \mathrel {\mathop :}=\mathcal {S}_2 \cup \{\tau _\textsf{err}\} \end{aligned}$$

The formulas on the right describe the values of x and y that are reachable with the current trace, where \(1 \equiv _2 y\) means that y is odd. After the first Step with \(\tau _{\mathsf {\ell _1 \rightarrow \ell _2}}\), just \(\tau _{\ell _2}^=\) can be used, as \(\textsf{cond}(\tau _{\mathsf {\ell _1 \rightarrow \ell _2}})\) implies \(x' = y'\). While \(\tau _{\ell _2}^=\) is recursive, Accelerate cannot be applied next, as \({{\longrightarrow _{\tau _{\ell _2}^=}} = {\longrightarrow _{\tau _{\ell _2}^=}^+}}\), so the learned transition would be redundant. Thus, we continue with \(\tau _{\ell _2}^{\ne }\), projected to \(x>y\) (as \(\textsf{cond}(\tau _{\ell _2}^=)\) implies \(x' = y'+1\)). Again, all transitions that could be learned are redundant, so Accelerate does not apply. We next use \(\tau _{\ell _2}^{\ne }\) projected to \(x<y\), as the previous Step swapped x and y. As the suffix \([\tau _{\ell _2}^=,\tau _{\ell _2}^{\ne }|_{\psi _{x>y}},\tau _{\ell _2}^{\ne }|_{\psi _{x<y}}]\) of the trace does not terminate (see Example 1), Nonterm applies. So we learn the transition \(\tau _\textsf{err}\), which is added to the trace to finish the proof, afterwards.

Theorem 12

If \(\mathcal {T}\leadsto _\textsf{nt}^* \textsf{unsafe}\), then \(\mathcal {T}\) does not terminate.

While Theorem 12 establishes the soundness of our approach, we now investigate completeness. In contrast to ADCL for safety (Sect. 3), ADCL-NT is not refutationally complete, but the proof is non-trivial. So in the following, we show that there are non-terminating TSs \(\mathcal {T}\) where \(\mathcal {T}\not \leadsto ^*_\textsf{nt}\textsf{unsafe}\). To prove incompleteness, we adapt the construction from the proof that ADCL does not terminate [16, Thm. 18]. There, states \((\mathcal {S}, \vec {\tau }, \vec {B})\) were extended by a component \(\mathcal {L}\) that maps every element of \(\textsf{sip}(\mathcal {S})\) to a regular language over \(\textsf{sip}(\mathcal {T})\). However, the proof of [16, Thm. 18] just required reasoning about finite (prefixes of infinite) runs, but we have to reason about infinite runs. So in our setting \(\mathcal {L}\) maps each element \(\tau \) of \(\textsf{sip}(\mathcal {S})\) to a regular or an \(\omega \)-regular language over \(\textsf{sip}(\mathcal {T})\), i.e., \(\mathcal {L}(\tau ) \subseteq \textsf{sip}(\mathcal {T})^*\) or \(\mathcal {L}(\tau ) \subseteq \textsf{sip}(\mathcal {T})^\omega \). We lift \(\mathcal {L}\) from \(\textsf{sip}(\mathcal {S})\) to sequences of transitions as follows.

figure ac

Here, “ ” denotes language concatenation (i.e., ) and we only consider sequences where \(\mathcal {L}(\tau )\) is regular (not \(\omega \)-regular) to ensure that \(\mathcal {L}\) is well defined. So while we lift other notations to sequences of transitions via chaining, \(\mathcal {L}(\vec {\tau })\) does not stand for \(\mathcal {L}(\textsf{chain}(\vec {\tau }))\).

Definition 13

(ADCL-NT with Regular Languages). We extend states by a fourth component \(\mathcal {L}\), and adapt Init, Accelerate, and Nonterm as follows:

figure af

All other rules from Definition 5 leave the last component of the state unchanged.

Here, \(\mathcal {L}(\pi )^+ \mathrel {\mathop :}=\bigcup _{n \in \mathbb {N}_{\ge 1}} \mathcal {L}(\pi )^n\), and \(\mathcal {L}(\pi )^\omega \) is the \(\omega \)-regular language consisting of all words that result from concatenating infinitely many elements of \(\mathcal {L}(\pi ) \setminus \{\varepsilon \}\).

In Accelerate and Nonterm, \(\textsf{chain}(\vec {\tau }^\circlearrowleft )\) is recursive. Thus, \(\vec {\tau }^\circlearrowleft \) does not contain unsafe transitions. Hence, \(\mathcal {L}(\vec {\tau }^\circlearrowleft )\) and thus also \(\mathcal {L}(\vec {\tau }^\circlearrowleft )^+\) are well defined and regular, and \(\mathcal {L}(\vec {\tau }^\circlearrowleft )^\omega \) is \(\omega \)-regular. Moreover, the use of “\(\uplus \)” is justified by the condition \(\tau \not \sqsubseteq \textsf{sip}(\mathcal {S})\). The next lemma states two crucial properties about \(\mathcal {L}\).

Lemma 14

Assume \(\mathcal {T}\leadsto _\textsf{nt}^* (\mathcal {S},\vec {\tau },\vec {B},\mathcal {L})\) and let \(\tau = (\ell \rightarrow \ell ' \,\llbracket \psi \rrbracket ) \in \textsf{sip}(\mathcal {S})\).

  • If \(\mathcal {L}(\tau ) \subseteq \textsf{sip}(\mathcal {T})^*\), then \({\longrightarrow _{\tau }} = \bigcup _{\vec {\tau } \in \mathcal {L}(\tau )}{\longrightarrow _{\vec {\tau }}}\).

  • If \(\mathcal {L}(\tau ) \subseteq \textsf{sip}(\mathcal {T})^\omega \), then for every model \(\sigma \) of \(\psi \), there is an infinite sequence \(\ell (\sigma (\vec {x})) = \mathfrak {s}_1 \longrightarrow _{\tau _1} \mathfrak {s}_2 \longrightarrow _{\tau _2} \ldots \) where \([\tau _1, \tau _2, \ldots ] \in \mathcal {L}(\tau )\).

Based on this lemma, we can prove that our extension of \(\leadsto _\textsf{nt}\) from Definition 13 is not refutationally complete. Then refutational incompleteness of ADCL-NT as introduced in Definition 9 follows immediately. The reason is that \(\mathcal {L}\) is only used in the premise of Init in Definition 13, but there the requirement “\(\mathcal {L}(\tau ) = \{\tau \}\) for all \(\tau \in \textsf{sip}(\mathcal {T})\)” is trivially satisfiable by choosing \(\mathcal {L}\) accordingly.

Theorem 15

There is a non-terminating TS \(\mathcal {T}\) such that .

Proof (Sketch)

As in the proof of [16, Thm. 18], for any (original or learned) transition \(\tau \) such that \(\mathcal {L}(\tau )\) is regular, \(\mathcal {L}(\tau )\) contains at most one square-free word (i.e., a word without a non-empty infix ). Thus, if \(\mathcal {L}(\tau )\) is \(\omega \)-regular, then \(\mathcal {L}(\tau )\) does not contain an infinite square-free word. Moreover, as in the proof of [16, Thm. 18], one can construct a TS \(\mathcal {T}\) that admits a single infinite run \(\vec {\tau }\), and this infinite run is square-free. Thus, there is no transition \(\tau \) such that \(\mathcal {L}(\tau )\) contains a suffix of \(\vec {\tau }\), i.e., no \(\leadsto _\textsf{nt}\)-derivation starting with \(\mathcal {T}\) corresponds to \(\vec {\tau }\). Hence, by Lemma 14, assuming \(\mathcal {T}\leadsto ^*_\textsf{nt}\textsf{unsafe}\) results in a contradiction.    \(\square \)

Since ADCL can prove unsafety as well as safety, it is natural to ask if there is a dual to ADCL-NT that can prove termination. The most obvious approach would be the following: Whenever the trace has a recursive suffix \(\vec {\tau }^\circlearrowleft \), then termination of \(\vec {\tau }^\circlearrowleft \) needs to be proven before the next \(\leadsto \)-step. The following example shows that this is not enough to ensure that \(\mathcal {T}\leadsto _\textsf{nt}^+ \textsf{safe}\) implies termination of \(\mathcal {T}\).

Example 16

Let \(\mathcal {T}\mathrel {\mathop :}=\{\tau _{\textsf{i}} = \textsf{init}\rightarrow \ell \,\llbracket \psi _{\textsf{i}} \rrbracket \} \cup \{\tau _m = \ell \rightarrow \ell \,\llbracket \psi _m \rrbracket \mid 0 \le m \le 2\}\) and

$$\begin{aligned} \psi _{\textsf{i}}&{} \mathrel {\mathop :}={x' = 0}&\psi _0&{} \mathrel {\mathop :}={x=0 \wedge x'=1}&\psi _1&{} \mathrel {\mathop :}={x=1 \wedge x'=2}&\psi _2 \mathrel {\mathop :}={x=2 \wedge x'=1}. \end{aligned}$$

As we have \(\ell (1) \longrightarrow _{\tau _1} \ell (2) \longrightarrow _{\tau _2} \ell (1)\), \(\mathcal {T}\) is clearly non-terminating. We get:

figure ai

After three Steps, we accelerate the recursive suffix \([\tau _0,\tau _1]\) of the trace, resulting in \(\tau _{01} = \ell \rightarrow \ell \,\llbracket x = 0 \wedge x' = 2 \rrbracket \) and \(\mathcal {S}_1 = \mathcal {T}\cup \{\tau _{01}\}\). After one more step, \([\tau _{01},\tau _2]\) is accelerated to \(\tau _{012} = \ell \rightarrow \ell \,\llbracket x = 0 \wedge x' = 1 \rrbracket \) and we get \(\mathcal {S}_2 = \mathcal {S}_1 \cup \{ \tau _{012} \}\). After the next step, \([\tau _{012},\tau _1]\) is redundant w.r.t. \(\tau _{01}\), so Covered applies. Then we Backtrack, as no other transitions are active. The next Steps also yield states that allow for backtracking (as their traces have the redundant suffixes \([\tau _0,\tau _1]\) and \([\tau _{01},\tau _2]\)), so we can finally apply Backtrack again and finish with Prove.

Note that whenever the trace has a recursive suffix, then it leads from \(\ell (i)\) to \(\ell (j)\) where \(i \ne j\), i.e., each such suffix is trivially terminating. In particular, the cycle \(\ell (1) \longrightarrow _{\tau _1} \ell (2) \longrightarrow _{\tau _2} \ell (1)\) is not apparent in any of the states.

This example reveals a fundamental problem when adapting ADCL for proving termination: ADCL ensures that all reachable configurations are covered, which is crucial for proving safety, but there are no such guarantees for all runs. Therefore, we think that adapting ADCL for proving termination requires major changes.

5 Related Work and Experiments

We presented ADCL-NT, a variant of ADCL for proving non-termination. The key insight is that tightly integrating techniques to detect non-terminating transitions into ADCL allows for handling classes of TSs that are challenging for other techniques. In particular, ADCL-NT can find non-terminating executions involving disjunctive transitions or complex patterns of transitions. Moreover, it tightly couples the search for non-terminating configurations and the proof of their reachability, whereas other approaches usually separate these two steps.

Related Work. There are many techniques to find certificates of non-termination [2, 14, 15, 22, 23, 25]. We could use any of them (they are black boxes for ADCL-NT).

Most non-termination techniques for TSs first search for non-terminating configurations, and then prove their reachability [5, 6, 9, 22], or they extract and analyze lassos [23]. In contrast, ADCL-NT tightly integrates the search for non-terminating configurations and reachability analysis.

Earlier versions of our tool LoAT [12, 15] also interleaved both steps using a technique akin to the state elimination method to transform finite automata to regular expressions. This technique cannot handle disjunctions, and it is incomplete for reachability. Hence, LoAT is now solely based on ADCL-NT.

Implementation. So far, our implementation in our tool LoAT is restricted to integer arithmetic. It uses the technique from [15] for acceleration and finding certificates of non-termination, the SMT solvers Z3 [26] and Yices [11], the recurrence solver PURRS [1], and libFAUDES [24] to implement the automata-based redundancy check from [16].

Experiments. To evaluate our implementation in LoAT, we used the 1222 Integer Transition Systems (ITSs) and the 335 C Integer Programs from the Termination Problems Database [28] used in TermComp [21]. The C programs are small, hand-crafted examples that often require complex proofs. The ITSs are significantly larger, as they were obtained from automatic transformations of C or Java programs. Moreover, they contain a lot of “noise”, e.g., branches where termination is trivial or variables that are irrelevant for (non-)termination. Thus, they are well suited to test the scalability and robustness of the tools.

We compared our implementation (LoAT ADCL) with other leading termination analyzers: iRankFinder [2, 9], T2 [6], Ultimate [8], VeryMax [3, 22], and the previous version of LoAT [15] (LoAT ’22). For T2, VeryMax, and Ultimate, we took the versions of their last TermComp participations (2015, 2019, and 2022). For iRankFinder, we used the configuration from the evaluation of [15], which is tailored towards proving non-termination. We excluded AProVE [20], as it cannot prove non-termination of ITSs, and it uses LoAT and T2 as backends when analyzing C programs. Moreover, we excluded Ultimate from the evaluation on ITSs, as it cannot parse them. All experiments were run on StarExec [27] with 300 s wallclock timeout, 1200 s CPU timeout, and 128 GB memory limit per example.

figure aj

The table above shows the results for ITSs, where the column “unique” contains the number of examples that could be solved by the respective tool, but no others. It shows that LoAT ADCL is the most powerful tool for proving non-termination of ITSs. The main reasons for the improvement are that LoAT ADCL builds upon a complete technique for proving reachability (in contrast to, e.g., LoAT ’22), and the close integration of non-termination techniques into a technique for proving reachability, whereas most competing tools separate these steps from each other.

If we only consider the examples where non-termination is proven, LoAT ADCL is also the fastest tool. If we consider all examples, then the average runtime of LoAT ADCL is significantly slower. This is not surprising, as ADCL-NT does not terminate in general. So while it is very fast in most cases (as witnessed by the very fast median runtime), it times out more often than the other tools.

For C integer programs, the best tools are very close (VeryMax: \(103{\times }\)No, LoAT ADCL: \(102{\times }\)No, Ultimate: \(100{\times }\)No). Regarding runtimes, the situation is analogous to ITSs. See [18] for detailed results, more information about our evaluation, and a pre-compiled binary. LoAT is open-source and available on GitHub [17].