
1 Conflict-Driven Learning for Termination

Conflict-driven learning procedures are integral to the performance of sat and smt solvers. Such procedures combine search and refutation to determine if a formula is satisfiable. Conflicts discovered by search drive refutation, and search learns from refutation to avoid regions of the search space without solutions.

Our work is driven by the observation that discovering a small number of disjunctive termination arguments is crucial to the performance of certain termination analyzers [27]. Figure 1 summarizes our lifting of conflict-driven learning to termination analysis. We use reachability analysis to find a set of states that constitute potentially non-terminating execution. We apply a conditional termination analysis to this set to eliminate states from which all executions terminate. Unlike termination analysis, which solves a decision problem and returns a yes or no answer, conditional termination analysis is concerned with discovering sufficient conditions for termination. Sufficient conditions for termination play the role of learned clauses in our analysis. They prevent subsequent runs of reachability analysis from revisiting states from which termination is guaranteed.

Fig. 1. Conflict-driven learning as applied to termination.

Our conflict-driven conditional termination procedure (cdct) can be viewed as a sound but incomplete solver for a family of monadic, second-order formulae. Büchi’s theorem shows that the language of a Büchi automaton is non-empty exactly if a formula in the monadic second-order theory of one successor (s1s) is satisfiable [5]. This theorem can be viewed as encoding non-termination of a finite-state program as satisfiability in s1s. We introduce s1s(t), an extension of s1s to sequences of first-order structures, and encode non-termination in a control-flow graph (cfg) as satisfiability in s1s(t). A model of a formula is an infinite execution that respects the transition constraints in the cfg.

Formulating non-termination as satisfiability provides a clear route for lifting cdcl to non-termination. We combine decisions with reachability in an abstract domain to construct and refine assignments to second-order variables in the same way that sat solvers construct and refine partial assignments. A notable difference to standard abstract interpretation is that our assignments are neither over- nor under-approximations of the set of reachable states. Our conflict analysis uses backwards abstract interpretation to enlarge the set of states from which termination is guaranteed. We present a generalized unit rule for combining ranking functions with reachability analysis. These components are combined in our new analysis, which we have implemented and evaluated against state-of-the-art termination provers.

2 Non-Termination as Second-Order Satisfiability

The two contributions of this section are the logic s1s(t), which extends the monadic second-order logic of one successor (s1s) with a theory, and an encoding of program non-termination as satisfiability in this logic.

2.1 Monadic Second-Order Theories of One Successor

We use for definition. Let \(\mathop {\mathcal {P}(S)}\) be the powerset of S. For \(f:A \rightarrow B\), the function \(f[a \mapsto b]\) maps a to b and c distinct from a to f(c). The symbols x, y, z range over first-order variables in \( Vars \), f, g, h over functions in \( Fun \), and P, Q, R over predicates in \( Pred \). We use a set \( Pos \) of first-order position variables whose elements are i, j, k, a set \( SVar \) of monadic second-order variables denoted X, Y, Z, a unary successor function \( suc \) and a binary successor predicate \( Suc \).

Our logic consists of three families of formulae called state, transition and trace formulae, which are interpreted over first-order structures, pairs of first-order structures and infinite sequences of first-order structures respectively. The formulae are named after how they are interpreted over programs.

A first-order interpretation \(( Val , I)\) defines functions \(I(f)\) and relations I(P) over values in \( Val \). The value \([\![t ]\!]_{s}\) of a term t in a state \(s : Vars \rightarrow Val \), is s(x) if t is x, and \(I(f)([\![t_0 ]\!]_{s}, \ldots , [\![t_n ]\!]_{s})\) if t is \(f(t_0, \ldots , t_{n})\). The interpretation of a state formula is the standard first-order semantics. A transition formula is interpreted at a transition, that is, a pair of states \((r,s)\). A formula \(\varphi \) in which the symbol \( suc \) does not occur is interpreted at the state r, while \( suc (x) = t\) compares the value of the term t in r with the value of x in the successor state s.

A trace \(\tau : \mathbb {N}\rightarrow ( Vars \rightarrow Val )\) is an infinite sequence of states and \(\tau (m)\) is the state at position m. A position assignment maps position variables to \(\mathbb {N}\) and second-order variables to subsets of \(\mathbb {N}\) such that \(\left\{ \sigma (X) ~|~X \in SVar \right\} \) partitions \(\mathbb {N}\). We explain this partition condition shortly. A trace formula is interpreted with respect to an s1s(t) structure \((\tau , \sigma )\).

Note that there are first-order variables of two sorts in a trace formula. A trace formula \(\varPhi \) asserting that the transition formula is true at the trace position denoted by i has the form \(\psi (x,y)(i)\). The predicate \( Suc (i, j)\) asserts that the position j occurs immediately after i.

An s1s(t) structure \((\tau , \sigma )\) is a model of \(\varPhi \) if \((\tau , \sigma ) \models \varPhi \), and is a countermodel otherwise. A trace formula is satisfiable if it has a model. An s1s(t) structure is defined using an infinite trace, so finite traces cannot be models of a formula.

2.2 Encoding Non-Termination in S1S(T)

We now recall control flow graphs (cfgs) and encode non-termination as satisfiability. A command in \( Cmd \) is an assignment \(x :=t\) of a term t to a first-order variable x, or is a condition \([\varphi ]\), where \(\varphi \) is a state formula. A cfg \(G = ( Loc , E, \mathtt {in}, \mathtt {ex}, stmt )\) consists of a finite set of locations \( Loc \) including an initial location \(\mathtt {in}\), an exit location \(\mathtt {ex}\), edges \(E \subseteq Loc \times Loc \), and a labelling \( stmt : E \rightarrow Cmd \) of edges with commands. To assist the presentation, we assume that the exit location \(\mathtt {ex}\) has no successors.

The formula \( Trans _{c}\) below defines the semantics of commands using a condition stating that the variables in a set V are not modified. The set of models of \( Trans _{c}\) is the transition relation \( Rel _{c}\). We write \( Trans _{e}\) and \( Rel _{e}\) for the transition formula and relation of the command \( stmt (e)\). The formula \( Inf _{\!G}\) extends the translation of Büchi automata to s1s to encode cfgs in s1s(t). We write \( First (i)\) for the first position on a trace and \( Last (i)\) for a position that cannot be on an infinite trace.

The formula \( Inf _{\!G}\) encodes program behaviour as follows. Consider an s1s(t) structure \((\tau , \sigma )\). The interpretation \(\sigma (X_\ell )\) of a second-order variable \(X_\ell \) represents positions on the trace when execution is at location \(\ell \). Such an interpretation partitions \(\mathbb {N}\) because each position on a trace corresponds to a unique location. The entry constraint on \( First (i)\) ensures execution begins at \(\mathtt {in}\). The exit constraint implying \( Last (j)\) enforces that an infinite execution does not visit \(\mathtt {ex}\). The conditions involving \( Suc (i, j)\) are called transition constraints and express that consecutive states on a trace must respect the transition relation of G. Theorem 1 expresses non-termination as satisfiability.

Theorem 1

A cfg G has a non-terminating execution iff \( Inf _{\!G}\) is satisfiable.

We believe this is a simple yet novel encoding of non-termination that allows the duality between search and refutation to be exploited for termination analysis. In contrast, the second-order encoding of termination in [13] uses a predicate for disjunctive well-foundedness and is solved in a different manner.

Fig. 2. A formula encoding non-termination of the program shown in the monadic second-order theory of one successor over integer arithmetic.

Example 1

A cfg G and the formula \( Inf _{\!G}\) for a program with a variable x of type \(\mathbb {Z}\) are shown in Fig. 2. We write a trace as a sequence of values of x. Let \(\tau \) be the trace \( -1, -1, -2, -2, \ldots \) and \(\sigma \) the assignment mapping \(X_\mathtt {ex}\) to the empty set, and \(X_\mathtt {in}\) and \(X_\mathtt {a}\) to even and odd positions, respectively. The structure \((\tau , \sigma )\) is a model of \( Inf _{\!G}\). Every structure \((\tau , \delta )\), with \(\tau \) as before, in which \(\delta (X_\mathtt {ex})\) is not empty is a countermodel of \( Inf _{\!G}\) because \(\mathtt {ex}\) is not reachable if x is initially \(-1\), so some transition in \(\tau \) must violate a transition constraint in \( Inf _{\!G}\). Every structure \((\tau ', \delta ')\) with x non-negative in \(\tau '(0)\) is also a countermodel of \( Inf _{\!G}\) because executions with x initially non-negative terminate. Since \(\tau '\) is infinite by definition, some transition in \(\tau '\) must be infeasible. Terminating executions cannot be models of \( Inf _{\!G}\) because traces in s1s(t) structures are infinite.    \({\lhd }\)

The formula \( Inf _{\!G}\) is a conjunction of formulae in which second-order variables and first-order program variables are free but first-order position variables are bound. We exploit this structure in our analysis.

3 Conflict-Driven Conditional Termination

The conflict-driven conditional termination procedure (cdct) in Algorithm 1 generalizes cdcl from sat to termination analysis. The input is the formula \( Inf _{\!G}\). The output \((\mathsf {result}, \varDelta , \varTheta )\) is a result concerning a set of structures \(\varDelta \) and a set \(\varTheta \) of piecewise-defined ranking functions (pdrfs).

The value of \(\mathsf {result}\) is one of \(\mathsf {divergent}\), \(\mathsf {terminates}\), or \(\mathsf {unknown}\). cdct returns \(\mathsf {divergent}\) if the traces represented by \(\varDelta \) do not reach the exit location, which could be due to non-termination or undefined behaviour. It returns \(\mathsf {terminates}\) if \(\varDelta \) is empty and \(\varTheta \) guarantees termination for all states. It returns \(\mathsf {unknown}\) if cdct cannot prove termination and cannot progress. This happens if the abstract domain cannot accurately represent non-terminating executions, if the ranking functions used cannot express a termination argument, or if a bound on the number of decisions has been exceeded.

cdct maintains four global data structures. The trail \( tr \) is a sequence of assignments to second-order variables. The explanation array \( exp \) contains, in each element \( exp [i]\), the decision or constraint used by propagation to add \( tr [i]\) to the trail. The pdrfs in the set \(\varTheta \), generated by conditional termination analysis, are our analogue of learned clauses. The blocking constraints \(\varPsi \) represent two types of states that need not be revisited. One is states from which all executions terminate. The other is states for which cdct could neither prove termination nor demonstrate non-termination.

Each execution of the cdct loop begins with a call to \(\mathsf {search}\), which attempts to find a non-terminating execution. If \(\mathsf {search}\) returns \(\mathsf {divergent}\), cdct returns. If \(\mathsf {search}\) returns \(\mathsf {unknown}\), the trail represents a potential conflict because it has discovered a set of states from which some execution terminates. The conflict is potential because the trail may also contain models of \( Inf _{\!G}\). This differs from sat and smt solvers, where a conflict contradicts a formula.

The conflict analysis procedure \(\mathsf {analyze}\) extracts from a potential conflict a definite conflict \(\theta \), expressed as a ranking function. The domain of \(\theta \) represents states from which all executions terminate. The learning step \(\mathsf {learn}\) generates a blocking constraint to drive subsequent search away from these states. Learning also generates a blocking constraint if cdct cannot make progress analyzing \([ tr ]\). This happens if no more decisions can be made and no ranking function can be extracted. cdct then backtracks if possible.

Algorithm 1. The cdct procedure, shown alongside an example program in C-like syntax.

An Example Run. A program is shown in C-like syntax alongside Algorithm 1. The location \(\mathtt {a}\) is reached after the variables are initialized, \(\mathtt {b}\) is the loop head, and \(\mathtt {ex}\) is the exit location. The program terminates but the abstract interpretation-based tool FuncTion [32] cannot prove termination. cdct enables FuncTion to prove termination while also avoiding case explosion. Even though other tools may be able to prove termination, we believe cdct is interesting because similar ideas could be used to expand the programs handled by those tools.

In this example, we use an interval abstract domain and affine ranking functions. \(\mathsf {search}\) uses reachability analysis to derive the intervals \(y{:}[-3,3],i{:}[-3,3]\) at \(\mathtt {ex}\) but termination analysis fails. Decisions restrict the range of a variable at a location: for example, \(\mathsf {search}\) heuristically uses conditions from the code to make the decisions \(y{:}[1,\infty ]\) and \(y{:}[-\infty , 10]\) at location \(\mathtt {a}\). Reachability derives the range \(y{:}[1,3],i{:}[-1,-1]\) at \(\mathtt {ex}\), which is a conflict, because no trace with these states at \(\mathtt {ex}\) satisfies \( Inf _{\!G}\). \(\mathsf {analyze}\) represents this conflict as \(X_\mathtt {ex}\mapsto \left\{ y{:}[1,3],i{:}[-1,-1] \rightarrow 0\right\} \), which assigns a pdrf to the second-order variable \(X_\mathtt {ex}\) and expresses that the program terminates in 0 steps for the states shown. The pdrf is propagated backwards through the program by an abstract interpreter [31] to derive the second-order assignments below. We omit the interval on i, which is unchanged.

$$\begin{aligned} X_\mathtt {ex}\mapsto y{:}[1,3] \rightarrow 0, X_{\mathtt {b}} \mapsto y{:}[1,3] \rightarrow 1, X_{\mathtt {b}} \mapsto y{:}[4,4] \rightarrow 3, X_{\mathtt {b}} \mapsto y{:}[5,5] \rightarrow 5 \end{aligned}$$

If these assignments are propagated to location \(\mathtt {b}\), we could only prove that the program terminates for y : [1, 5] at \(\mathtt {a}\). Instead, we apply widening to the pdrfs to derive \(X_{\mathtt {b}} \mapsto \left\{ y{:}[1,3] \rightarrow 1, y{:}[4,10] \rightarrow 2y - 5\right\} \), which bounds the number of steps to termination at the loop head for y in the ranges shown and agrees with the values 3 and 5 derived above for y = 4 and y = 5. We heuristically expand the piece y : [4, 10] of the pdrf to \(y{:}[1,\infty ]\) and check whether \(2y - 5\) is still a ranking function. Since it is, we have proved termination for executions with \(y{:}[1, \infty ],i{:}[-1,-1]\) at \(\mathtt {b}\), despite having explicitly analyzed only the range y : [1, 5].

The learning step complements the decision \(y{:}[1,\infty ]\) and uses \(X_\mathtt {a}\mapsto y{:}[-\infty ,0]\) to restrict future search. Learnt constraints typically have more structure. A similar run of cdct can show termination when y is initially non-positive.

Consider the program with the loop condition changed to \((y > -3)\). Now, the program does not always terminate. Decisions and learning can infer a ranking function for positive y as before. Decisions can also discover that for \(X_\mathtt {a} \mapsto y{:}[-1,-1]\), \(\mathtt {ex}\) is unreachable, indicating non-termination (as all locations lead to \(\mathtt {ex}\)). In this way, cdct proves conditional termination using disjunctions of ranking functions and also identifies non-terminating executions.

4 Search for a Conflict

We now show how a trail, a data structure used by sat solvers, can be used to make explicit the incremental progress made by an abstract interpreter.

Abstract Domains. A bounded lattice \((L, \sqsubseteq , \sqcap , \sqcup )\) is a partially ordered set with a meet \(\sqcap \), a join \(\sqcup \), a greatest element \(\top \) (top), and a least element \(\bot \) (bottom). A concrete domain for forward analysis \((\mathop {\mathcal {P}( State )},\subseteq , F)\) is a lattice of states with a set \(F = \left\{ post _{c} ~|~c \in Cmd \right\} \) of monotone functions called transformers, where \( post _{c}(S)\) is the image of S under the transition relation for c. An abstract domain is a bounded lattice A with a set of abstract transformers \(G = \left\{ {\text { post }}^{A}_{c} ~|~c \in Cmd \right\} \) and a widening operator. There is a monotone concretization function \(\gamma :A \rightarrow \mathop {\mathcal {P}( State )}\) satisfying \(\gamma (\top ) = State \) and \(\gamma (\bot ) = \emptyset \). The transformers satisfy the soundness condition \( post _{c}(\gamma (a)) \subseteq \gamma ({\text { post }}^{A}_{c}(a))\), which states that abstract transformers overapproximate concrete transformers.
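To make the soundness condition concrete, the following Python sketch checks it on finite samples for an interval domain over a single integer variable. It is an illustration only: the function names (`in_gamma`, `post_add`, `post_add_abs`, `post_guard_le`, `post_guard_le_abs`, `sound_on`) and the two commands chosen (an increment and a guard) are assumptions made for this example and are not taken from FuncTion.

```python
import math

INF = math.inf

def in_gamma(x, a):
    """Membership of x in the concretization of interval a; None is bottom."""
    return a is not None and a[0] <= x <= a[1]

def post_add(states, c):
    """Concrete transformer for the command x := x + c."""
    return {x + c for x in states}

def post_add_abs(a, c):
    """Abstract interval transformer for x := x + c."""
    return None if a is None else (a[0] + c, a[1] + c)

def post_guard_le(states, k):
    """Concrete transformer for the condition [x <= k]."""
    return {x for x in states if x <= k}

def post_guard_le_abs(a, k):
    """Abstract interval transformer for the condition [x <= k]."""
    if a is None or a[0] > k:
        return None
    return (a[0], min(a[1], k))

def sound_on(sample, a, post, post_abs, arg):
    """Check post(S) is inside gamma(post_abs(a)) for a finite sample S of gamma(a)."""
    assert all(in_gamma(x, a) for x in sample)
    return all(in_gamma(y, post_abs(a, arg)) for y in post(sample, arg))

sample = set(range(0, 6))
print(sound_on(sample, (0, 5), post_add, post_add_abs, 2))
print(sound_on(sample, (0, 5), post_guard_le, post_guard_le_abs, 3))
```

A sample-based check of this kind can only refute soundness, not establish it; it is meant to show the shape of the condition rather than to verify a transformer.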

Literals are essential for propagation and conflict analysis in sat. The analogue of literals in abstract domains are complementable meet-irreducibles [11]. A lattice element c is a meet-irreducible if \(a \sqcap b = c\) implies that \(a = c\) or \(b = c\). Let \({\mathcal {M}}_{A}\) be the meet-irreducibles of A. An abstract element a has a concrete complement if there exists an \(\overline{a}\) in A such that \(\gamma (a) = \lnot \gamma (\overline{a})\). A meet decomposition of an element a is a finite set \({ mdc }(a) \subseteq {\mathcal {M}}_{A}\) satisfying that \(\sqcap { mdc }(a) = a\) and that there is no strict subset \(S \subset { mdc }(a)\) with \(\sqcap S = a\). A has complementable meet irreducibles if every \(m \in {\mathcal {M}}_{A}\) has a concrete complement \(\overline{m} \in {\mathcal {M}}_{A}\).

Example 2

The interval lattice has elements [a, b], where \(a \le b \in \mathbb {Z}\cup \left\{ -\infty , \infty \right\} \). The intervals \([-\infty , k], [k, \infty ]\) are meet-irreducibles, unlike [0, 2]. The set \(S = \left\{ [-\infty , 2], [0, \infty ], [-5, \infty ]\right\} \) satisfies \(\sqcap S = [0,2]\) but is not a meet decomposition because \(\left\{ [-\infty , 2], [0, \infty ]\right\} \subset S\). The concrete complements of \([-\infty , k]\) and \([k, \infty ]\) are \([k+1, \infty ]\) and \([-\infty , k-1]\), while [0, 2] has no concrete complement.    \({\lhd }\)
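The interval calculations in Example 2 can be reproduced with a short Python sketch. Intervals are encoded as pairs `(lo, hi)` with `math.inf` for the infinite bounds and `None` for \(\bot \); this encoding and the helper names `meet`, `mdc` and `complement` are assumptions made for illustration.

```python
import math
from functools import reduce

INF = math.inf

def meet(a, b):
    """Meet of two intervals (lo, hi); None stands for bottom."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def mdc(a):
    """Meet decomposition of a non-bottom interval into half-lines."""
    lo, hi = a
    parts = set()
    if hi != INF:
        parts.add((-INF, hi))
    if lo != -INF:
        parts.add((lo, INF))
    return parts

def complement(m):
    """Concrete complement of a half-line over the integers."""
    lo, hi = m
    if lo == -INF and hi != INF:
        return (hi + 1, INF)
    if hi == INF and lo != -INF:
        return (-INF, lo - 1)
    raise ValueError("only half-lines have a concrete complement in this sketch")

S = [(-INF, 2), (0, INF), (-5, INF)]
print(reduce(meet, S))          # (0, 2)
print(mdc((0, 2)) < set(S))     # True: S strictly contains the meet decomposition
print(complement((-INF, 2)))    # (3, inf)
```

Calling `complement((0, 2))` raises an error, mirroring the observation that [0, 2] has no concrete complement in the interval lattice.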

Abstract Assignments. sat solvers use partial assignments to incrementally construct a model. We introduce abstract assignments, which use abstract domains to represent s1s(t) structures. Let \( Struct \) be the set of s1s(t) structures. The lattice of abstract assignments \(( Asg _{\!A}, \sqsubseteq )\) contains the set \( Asg _{\!A}\) of functions from \( SVar \) to A with the pointwise order: \( asg \sqsubseteq asg '\) if \( asg (X) \sqsubseteq asg '(X)\) for all X in \( SVar \). The meet and join are also defined pointwise. An abstract assignment \( asg \) represents a set of s1s(t) structures as defined by the concretization \( conc : Asg _{\!A} \rightarrow \mathop {\mathcal {P}( Struct )}\).

An abstract assignment \( asg \) is a definite conflict for \(\varPhi \) if no model of \(\varPhi \) is in \( conc ( asg )\) and is a potential conflict if \( conc ( asg )\) contains a countermodel of \(\varPhi \).

Trail. We introduce a trail, which contains meet-irreducibles as in [4, 10] and in which a second-order variable can appear multiple times. A trail over A is the empty sequence \(\epsilon \) or the concatenation \( tr {\cdot } (X{:}m)\), where X is a second-order variable and m is a complementable meet-irreducible. A trail \( tr \) defines the assignment \([ tr ]\), where \([\epsilon ]\) maps every second-order variable to \(\top \) and \([ tr {\cdot } (X{:}m)]\) maps X to \([ tr ](X) \sqcap m\) and all other Y to \([ tr ](Y)\). A trail \( tr \) is in potential/definite conflict with \(\varPhi \) if \([ tr ]\) is. We write \( tr (X)\) for \([ tr ](X)\). An explanation \( exp \) for a trail of length n is a function from \([0, n-1]\) to constraints in \( Inf _{\!G}\) or learnt clauses.
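The assignment \([ tr ]\) defined by a trail can be computed by folding the meet over the trail. The Python sketch below does this for intervals, under the assumption that the empty trail maps every variable to \(\top \); the list-of-pairs encoding of trails and the name `assignment` are hypothetical choices for this example.

```python
import math

INF = math.inf
TOP = (-INF, INF)

def meet(a, b):
    """Meet of two intervals (lo, hi); None stands for bottom."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def assignment(trail, variables):
    """Compute [tr]: start from top everywhere and meet in each trail element."""
    asg = {x: TOP for x in variables}
    for x, m in trail:
        asg[x] = meet(asg[x], m)
    return asg

variables = ["X_in", "X_a", "X_ex"]
trail = [("X_ex", (-INF, 0)), ("X_ex", (0, INF))]
print(assignment(trail, variables)["X_ex"])  # (0, 0)
```

A variable may occur several times on the trail, as `X_ex` does here; each occurrence only narrows the interval recorded for it.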

Search(). Algorithm 2 extends a trail \( tr \) by propagating constraints from the cfg, making decisions, or applying a generalized unit rule. It returns \(\mathsf {divergent}\) if \( tr (X_\mathtt {ex})\) is \(\bot \), meaning that \(\mathtt {ex}\) is unreachable. It returns \(\mathsf {unknown}\) if \( tr (X_\mathtt {ex})\) is not \(\bot \) and no decisions can be made. This trail is a potential conflict because every structure in \( conc ([ tr ])\) with a non-empty assignment to \(X_\mathtt {ex}\) violates the constraint \(X_\mathtt {ex}(i) \implies Last (i)\), hence is a countermodel of \( Inf _{\!G}\).

Algorithm 2. The search procedure, shown alongside a table of trail and explanation entries.

Example 3

The table alongside Algorithm 2 illustrates the construction of \( tr \) and \( exp \) during interval analysis of the program in Fig. 2. The \( exp \) column shows the locations of the propagated constraints. The rows 1, 2, 3a, 4a, 5a, 6a represent a run of \(\mathsf {search}\). The trail is initially empty and the result of standard interval analysis is the trail \(X_{\mathtt {ex}}{:}[-\infty , 0], X_{\mathtt {ex}}{:}[0, \infty ]\) in step 2, representing the assignment \(\left\{ X_\mathtt {in}\mapsto \top , X_\mathtt {a} \mapsto \top , X_\mathtt {ex}\mapsto [0,0]\right\} \). An arbitrary decision \(X_\mathtt {in}{:}[9, \infty ]\) in step 3a is not sound (see Example 4) and the smallest sound decision containing it is \([0,\infty ]\). Propagation yields \(X_\mathtt {a}{:}[1,\infty ]\) in step 4a. The decision \(X_\mathtt {in}{:}[-\infty , 0]\) in step 5a is sound, and when propagated, yields a conflict in step 6a, so search returns \(\mathsf {unknown}\). An alternative run is 1, 2, 3b, 4b. A decision \(X_\mathtt {in}{:}[-\infty , -7]\) is sound, and propagation yields \(X_\mathtt {a}{:}[-\infty , -7]\) and \(X_\mathtt {ex}{:}\bot \), so search returns \(\mathsf {divergent}\).    \({\lhd }\)

Propagate(). Algorithm 3 calls an abstract interpreter and stores the results in the trail in a form amenable to conflict analysis and learning. The notion of meet-difference makes explicit the incremental change between two calls to the abstract interpreter. Formally, the meet-difference of \(a,b \in A\) is \({ mdiff }(a,b) = { mdc }(a) \setminus { mdc }(b)\). The meet-difference of two abstract assignments is the pointwise lift \({ mdiff }( asg , asg ') = \left\{ X_\mathtt {v}{:}m ~|~m \in { mdiff }( asg (X_\mathtt {v}), asg '(X_\mathtt {v})), X_\mathtt {v} \in SVar \right\} \).
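As an illustration of the meet-difference, the following Python sketch computes \({ mdiff }\) for non-bottom intervals and lifts it pointwise to assignments. The interval encoding and the names `mdc`, `mdiff` and `mdiff_asg` are assumptions of this example; the result is the set of new trail elements that a refined assignment contributes.

```python
import math

INF = math.inf
TOP = (-INF, INF)

def mdc(a):
    """Meet decomposition of a non-bottom interval into half-lines."""
    lo, hi = a
    parts = set()
    if hi != INF:
        parts.add((-INF, hi))
    if lo != -INF:
        parts.add((lo, INF))
    return parts

def mdiff(a, b):
    """Meet-difference: the half-lines of a that are not already in b."""
    return mdc(a) - mdc(b)

def mdiff_asg(asg, old):
    """Pointwise lift of mdiff to assignments over the same variables."""
    return {(x, m) for x in asg for m in mdiff(asg[x], old[x])}

old = {"X_in": (0, INF), "X_a": TOP}
new = {"X_in": (0, INF), "X_a": (1, INF)}
print(mdiff_asg(new, old))  # {('X_a', (1, inf))}
```

Only the element for `X_a` is reported, because the bound on `X_in` was already present before the second call.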

For a transition constraint \(\psi \) whose target location is \(\mathtt {v}\), we write \( sink (\psi )\) for \(X_\mathtt {v}\). A strongly connected component (scc) of \( Inf _{\!G}\) is a set of transition constraints T such that the set of locations \(\left\{ v ~|~\psi \in T, X_\mathtt {v} = sink (\psi )\right\} \) is an scc of G. The set of sccs of \( Inf _{\!G}\) is \( scc ( Inf _{\!G})\). \(\mathsf {propagate}\) calls a standard abstract interpreter on each scc and uses a meet-difference calculation to extend the trail with new information. \(\mathsf {propagate}\) also applies a generalized unit rule \( gunit \), explained in Sect. 5. Propagation is sound in the sense that it does not eliminate models of the constraints involved.

Lemma 1

If \((\tau , \sigma )\) satisfies \( Inf _{\!G}\) and \(\varPsi \) and is in \( conc ([ tr ])\), it is also in \( conc ([ tr ])\) after invoking \(\mathsf {propagate}\).

Algorithm 3. The propagate procedure.

Decisions. The abstract assignment computed by (the abstract interpreter used by) \(\mathsf {propagate}\) can be refined using decisions. Boolean decisions make variables true or false and first-order decisions use values [7, 24] but our decisions, like those in [11], use abstract domain elements.

A decision is an element X : m that can be on a trail. A decision is sound if \( conc (X{:}m) \cup conc (X{:}\overline{m}) = Struct \). That is, considering the structures in m and \(\overline{m}\) amounts to considering all possible structures.

Example 4

Recall the unsound decision \(X_\mathtt {in}{:}[9, \infty ]\) from Example 3. The structure \((\tau , \sigma )\) with \(\tau = 9,9,8,8,\ldots \) and \(\sigma \) mapping \(X_\mathtt {in}\) and \(X_\mathtt {a}\) to even and odd positions is not in \( conc (X_\mathtt {in}{:}[9, \infty ])\) as x cannot be 8 at \(\mathtt {in}\). Similarly, it is not in \( conc (X_\mathtt {in}{:}[-\infty , 8])\) so \( conc (X_\mathtt {in}{:}[9, \infty ]) \cup conc (X_\mathtt {in}{:}[-\infty ,8]) \ne Struct \).    \({\lhd }\)

The unsoundness arises because pointwise lifting does not preserve concrete complements. Though \(\overline{m}\) is the concrete complement of m in A, \([X_v{:}\overline{m}]\) need not be the concrete complement of \([X_v{:}{m}]\) in \( Asg _{\!A}\). Unsound decisions can be extended by propagation to a post-fixed point to cover all structures. All decisions on variables \(X_v\) in singleton sccs with no self-loops are sound.

A decision rule \(\mathsf {dec}( Inf _{\!G}, \varPsi , tr )\) returns an abstract domain element d such that \([ tr {\cdot }(X_v{:}d)] \sqsubseteq [ tr ]\). The decision rule makes progress if this order is strict. Unlike in sat, the decision rule can cause divergence of cdct because an infinite series of decisions like \([0,\infty ], [1,\infty ], \ldots \) may not change the result of propagation.

5 Conflict Analysis

Unlike sat and smt solvers, which generate definite conflicts, \(\mathsf {search}\) generates potential conflicts. We apply backwards abstract interpretation with ranking functions to extract definite conflicts, and use widening to generalize them.

Ranking Function Domains. Due to space limitations, we only briefly recall the concrete domain of ranking functions, which provides the intuition for conflict analysis, and discuss the abstract domain informally. See [8, 31] for details.

We write \(f:A \nrightarrow B\) for a partial function whose domain is \( dom (f)\). A ranking function \(f : State \nrightarrow \mathbb {O}\) for a relation R is a map from states to ordinals satisfying that for all s in \( dom (f)\) and \((s,t)\) in R, t is in \( dom (f)\) and \(f(t) < f(s)\). A concrete domain for termination analysis \(( Rank , \preccurlyeq , B)\) is a lattice of ranking functions with backwards transformers \(B = \left\{ bkw _{c} ~|~c \in Cmd \right\} \) defined below. Informally, \(f \preccurlyeq g\) if f is defined on a state when g is and yields a lower rank. The transformer \( bkw _{c}\) maps a ranking function f to one defined on states with all their successors in \( dom (f)\). Recall that \( Rel _c\) is the transition relation for a command c.

figure d
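The defining condition of a ranking function can be checked directly on a finite relation. The Python sketch below is a minimal illustration that uses natural numbers in place of ordinals and a dictionary for the partial function; the name `is_ranking_function` and the countdown relation are assumptions of this example.

```python
def is_ranking_function(f, relation):
    """Check that for every s in dom(f) and (s, t) in the relation,
    t is in dom(f) and f(t) < f(s). f is a dict from states to ranks."""
    for s, t in relation:
        if s in f and (t not in f or not f[t] < f[s]):
            return False
    return True

# A countdown loop on states 0..5: each step decrements the state.
countdown = [(n, n - 1) for n in range(1, 6)]
rank = {n: n for n in range(0, 6)}
print(is_ranking_function(rank, countdown))             # True

# Adding a self-loop at 0 breaks the strict decrease.
print(is_ranking_function(rank, countdown + [(0, 0)]))  # False
```

Because f is partial, a function that is undefined on the states of a non-terminating cycle can still satisfy the condition, which is what allows ranking functions to express conditional termination.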

A subset \(P \subseteq A\) of a domain A is an abstract partition if \(\left\{ \gamma (a) ~|~a \in P\right\} \) partitions \( State \). Let \( Fun \subseteq Rank \) be a lattice of functions, for example, affine functions.

A piecewise defined ranking function (pdrf) over \( Fun \) and A is a set \(\rho = \left\{ a_1 \mapsto f_1, \ldots , a_k \mapsto f_k\right\} \) such that \(\left\{ a_1, \ldots , a_k\right\} \) is an abstract partition, and each \(f_i\) is in \( Fun \). The abstract domain of pdrfs \(( aRank , \preccurlyeq , Abd )\) is a lattice \( aRank \) with abduction transformers \( Abd \). The concretization \(\gamma ^r: aRank \rightarrow Rank \) of a \(\rho \) as above maps states to ranking functions. The order and lattice operations are defined in terms of partition refinement and unification [31]. To compare \(\rho _1\) and \(\rho _2\), we consider the coarsest abstract partition that refines the abstract partitions of both and compare the ranking functions in each block pointwise.
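A pdrf over intervals and affine functions can be pictured as a list of pieces, each pairing an interval with an affine function or with no function at all. The Python sketch below is an illustration under several assumptions: a single integer variable y, affine functions encoded as `(slope, offset)` pairs, `None` for pieces on which no rank is known, and the hypothetical helper names `is_abstract_partition` and `evaluate`. The pieces reuse the values derived for location \(\mathtt {b}\) in the example run of Sect. 3.

```python
import math

INF = math.inf

def is_abstract_partition(pieces):
    """Check that the interval pieces tile the integers without gaps or overlap."""
    ivs = sorted(iv for iv, _ in pieces)
    if not ivs or ivs[0][0] != -INF or ivs[-1][1] != INF:
        return False
    return all(a[1] + 1 == b[0] for a, b in zip(ivs, ivs[1:]))

def evaluate(pdrf, y):
    """Rank of the state y, or None if the pdrf is undefined there."""
    for (lo, hi), f in pdrf:
        if lo <= y <= hi:
            if f is None:
                return None
            slope, offset = f
            return slope * y + offset
    return None

pdrf_b = [
    ((-INF, 0), None),    # no termination argument known here
    ((1, 3), (0, 1)),     # y:[1,3] -> 1
    ((4, 4), (0, 3)),     # y:[4,4] -> 3
    ((5, 5), (0, 5)),     # y:[5,5] -> 5
    ((6, INF), None),
]
print(is_abstract_partition(pdrf_b))  # True
print(evaluate(pdrf_b, 4))            # 3
```

The pieces mapped to `None` make explicit that a pdrf describes a sufficient condition for termination: states in those pieces are simply outside the domain of the ranking function.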

Conflict analysis starts with a precondition for termination and finds a weaker precondition for termination, hence performs abduction. The abduction transformers satisfy the soundness condition \(\gamma ^r( abd _{c}(\rho )) \preccurlyeq bkw _{c}(\gamma ^r(\rho ))\), which states that the termination bounds obtained with pdrfs are weaker than those that could be obtained in the concrete domain. A sound abduction transformer is underapproximating. A ranking assignment \( rk : SVar \rightarrow aRank \) associates a pdrf with each second-order variable. Ranking assignments form a lattice with point-wise meet and join and have a special order \(\leqslant \) for fixed point checks [31]. To exchange information between \(\mathsf {search}\) and \(\mathsf {analyze}\), we extract a meet-irreducible representation of the domains of pdrfs. The meet-projection \({ mpr }(\rho )\) of a pdrf \(\rho \) is a set of sets of meet-irreducibles that provides a dnf-like representation of the abstract partition in \(\rho \).

Analyze(). Algorithm 4 uses an array \( dc \) to construct and generalize a definite conflict. Each \( dc [i]\) represents termination conditions for states in the trail. Executions from states at \(\mathtt {ex}\) terminate immediately so the last element of \( dc \) is \(\left\{ X_{\mathtt {ex}} \mapsto \left\{ [ tr ](X_\mathtt {ex}) \mapsto 0\right\} \right\} \) and all other elements are \(\top \). The conflict analysis loop walks backwards through the trail and extends \( dc [i]\). Forward propagation through the scc \( exp [i]\) added \( tr [i]\) to the trail, so \( dc [i]\) is propagated backwards through \( exp [i]\) to generalize the conflict to a ranking assignment \( rk \). New pdrfs are added to \( dc \) by a helper procedure. Specifically, for each \(X_v\) modified by the backwards propagation, and \(m \in { mpr }( rk (X_v))\), the helper finds trail indices with \( tr [j] \sqsubseteq X_v{:}m\) and sets \( dc [j]\) to the appropriate pdrf. \(\mathsf {analyze}\) continues until a unique implication point is reached, which is typically a dominator in the cfg at which a decision was made. \(\mathsf {analyze}\) returns \([ dc ]\), a representation of the pdrfs in \( dc \).

Learn() and the Generalized Unit Rule. Information computed by \(\mathsf {search}\) is communicated to \(\mathsf {analyze}\) using the trail, while information from \(\mathsf {analyze}\) is represented within \(\mathsf {search}\) by a blocking constraint and is incorporated in search using the generalized unit rule. We describe these very briefly.

A set \(C = \left\{ X_1{:}m_1, \ldots , X_k{:}m_k\right\} \) of elements can be complemented element-wise to obtain \(\overline{C} = \left\{ X_1{:}\overline{m}_1, \ldots , X_k{:}\overline{m}_k\right\} \). If C is viewed as a conjunction of literals representing a conflict, \(\overline{C}\) is a clause the procedure can learn. \(\mathsf {learn}\) applies meet-projection to a pdrf and complements this projection to obtain a blocking constraint. In practice, we simplify the partitions of the pdrf to avoid an explosion of blocking constraints, analogous to subsumption in sat.

The generalized unit rule [10] extends a trail using a blocking constraint. Assume that \(\varPsi \) has the form \(\left\{ X_0{:}m_0, \ldots , X_k{:}m_k\right\} \). The trail \( gunit ( tr , \varPsi )\) is \( tr \cdot (X_k{:}m_k)\) if \( [ tr ](X_i) \sqcap m_i = \bot \) for \(0 \le i < k\) and is \( tr \) otherwise. The generalized unit rule refines a trail in the sense that \([ gunit ( tr , \varPsi )] \sqsubseteq [ tr ]\). If \( tr \) is inconsistent with \(\varPsi \), \([ tr ]\) will represent \(\bot \). Having presented all components of the procedure, we now investigate how it works in practice.
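The generalized unit rule stated above can be sketched in Python for interval-valued trails. The sketch follows the definition literally: the last element of the blocking constraint is appended when every earlier element is contradicted by the current assignment. The encoding of trails and blocking constraints as lists of `(variable, interval)` pairs and the names `assignment` and `gunit` are assumptions of this example.

```python
import math

INF = math.inf
TOP = (-INF, INF)

def meet(a, b):
    """Meet of two intervals (lo, hi); None stands for bottom."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def assignment(trail, variables):
    """Compute [tr], taking the empty trail to map every variable to top."""
    asg = {x: TOP for x in variables}
    for x, m in trail:
        asg[x] = meet(asg[x], m)
    return asg

def gunit(trail, variables, psi):
    """Extend the trail with the last element of psi if all earlier
    elements have an empty meet with the current assignment."""
    asg = assignment(trail, variables)
    *rest, (x_k, m_k) = psi
    if all(meet(asg[x], m) is None for x, m in rest):
        return trail + [(x_k, m_k)]
    return trail

variables = ["X_a", "X_b"]
psi = [("X_a", (-INF, 0)), ("X_b", (5, INF))]
print(gunit([("X_a", (1, INF))], variables, psi))
# [('X_a', (1, inf)), ('X_b', (5, inf))]
print(gunit([], variables, psi))  # []
```

In the first call the trail restricts `X_a` to positive values, which contradicts the first element of the constraint, so the remaining element is forced; in the second call nothing is contradicted and the trail is returned unchanged.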

6 Implementation

We have incorporated cdct in our prototype static analyzer FuncTion (http://www.di.ens.fr/~urban/FuncTion.html), which is based on piecewise-defined ranking functions [31]. A version without cdct [32] participated in the 4th International Competition on Software Verification (SV-COMP 2015).

FuncTion+cdct accepts (non-deterministic) programs in a C-like syntax. It is implemented in OCaml and uses the APRON library [20]. The pieces of a pdrf can be represented with intervals, octagons or convex polyhedra, and ranking functions within the pieces are represented by affine functions. The precision of the analysis can also be controlled by adjusting the widening delay.

Experimental Evaluation. We evaluated our tool against 288 terminating C programs from the termination category of SV-COMP 2015. In particular, we compared FuncTion+cdct with other tools from the termination category of SV-COMP 2015: AProVE [29], FuncTion without cdct [32], HIPTnT+ [22], and Ultimate Automizer [18]. The experiments were performed on a system with a 1.30 GHz 64-bit Dual-Core CPU (Intel i5-4250U) and 4 GB of RAM. For the other tools, since we did not have access to their competition version, we used the SV-COMP 2015 results obtained on more powerful systems with a 3.40 GHz 64-bit Quad-Core CPU (Intel i7-4770) and 33 GB of RAM.

Fig. 3. Overview of the experimental evaluation.

Fig. 4. Detailed comparison of FuncTion+cdct against its previous version [32] (a), AProVE [29] (b), HIPTnT+ [22] (c), and Ultimate Automizer [18] (d).

Figure 3 summarizes our evaluation. In Fig. 3a, the first column is the number of programs each tool could prove terminating, the second column reports the average running time in seconds, and the last column reports the number of timeouts, with the time limit set to 180 seconds. In Fig. 3b, the first column lists the number of programs that FuncTion+cdct proved terminating and the tool could not, the second column reports the number of programs that the tool proved terminating and FuncTion+cdct could not, and the last two columns report the number of programs that the tool and FuncTion+cdct were both able or both unable to prove terminating. Each column of Fig. 3b is labeled with a symbol, and the same symbols are used in Fig. 4.
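The four columns of Fig. 3b amount to set differences and intersections over the benchmark set. The following Python sketch shows the computation on a toy input; the program names, the result sets and the function `compare` are hypothetical and do not reproduce the reported data.

```python
from typing import Dict, Set


def compare(ours: Set[str], other: Set[str], benchmarks: Set[str]) -> Dict[str, int]:
    """Count programs proved by only one tool, by both tools, or by neither."""
    return {
        "only_ours": len(ours - other),
        "only_other": len(other - ours),
        "both": len(ours & other),
        "neither": len(benchmarks - ours - other),
    }


# Hypothetical toy data, not the SV-COMP 2015 results.
benchmarks = {"p1", "p2", "p3", "p4", "p5"}
ours = {"p1", "p2", "p3"}    # programs proved by the first tool
other = {"p2", "p3", "p4"}   # programs proved by the second tool

row = compare(ours, other, benchmarks)
```

The four counts partition the benchmark set, so they always sum to the total number of programs.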

Figure 3a shows that cdct causes a \(9\,\%\) improvement in the number of programs proved terminating by FuncTion+cdct compared to FuncTion without cdct. The accompanying increase in runtime is not evenly distributed, and about \(2\,\%\) of the test cases require more than 20 seconds to be analyzed by FuncTion+cdct (cf. Fig. 4a). In these cases the decision heuristics do not quickly isolate sets of states on which the abstract interpreter makes progress. Figure 4a shows that, as expected, FuncTion without cdct terminates with an unknown result earlier. Figures 4b and 4d show that, though AProVE and Ultimate Automizer were run on more powerful machines, FuncTion+cdct is generally faster but proves termination of \(19\,\%\) and \(9\,\%\) fewer programs, respectively (cf. Fig. 3a). HIPTnT+ proves termination of \(16\,\%\) more programs than FuncTion+cdct (cf. Fig. 3a), but FuncTion+cdct proves termination of \(52\,\%\) of the programs that HIPTnT+ is not able to prove terminating (\(8\,\%\) of the total test cases, cf. Fig. 3b). When comparing with FuncTion without cdct [32], we observed a 2x speedup on the SV-COMP 2015 machines, so the runtime comparison of FuncTion+cdct and HIPTnT+ is inconclusive. Finally, thanks to the support for piecewise-defined ranking functions, \(1\,\%\) of the programs could be proved terminating only by FuncTion+cdct (\(2.7\,\%\) only by AProVE, \(1\,\%\) only by HIPTnT+, and \(1.7\,\%\) only by Ultimate Automizer). No tool could prove termination for \(0.7\,\%\) of the programs.

7 Related Work and Conclusion

Büchi’s work relating automata and logic [5] is the basis for automata-based verification and synthesis. We depart from most work in this tradition in two ways. One is the use of sequences of first-order structures, as in first-order temporal logics [19], and the other is to go from a graph-based representation to a formula, which is the opposite of the translation used in automata-theoretic approaches. The use of s1s for pointer analysis [26] and termination [25] is restricted to decidable cases, as is [9]. Program analysis questions have been formulated with set-constraints [1] and second-order Horn clauses [13], but solutions to these formulae are typically invariants and ranking functions, not errors, and the methods used to solve them differ from cdct.

A key intuition behind our work is to lift algorithmic ideas from sat solvers to program analysis. The same intuition underlies smpp [17], which lifts dpll(t) to programs, acdcl [10, 11], which lifts cdcl to lattices, the lifting of Stålmarck’s method [30], and lazy annotation, which uses interpolants for learning [23]. The idea of guiding an abstract interpreter away from certain regions appears in dagger [14] and Vinta [2], from which cdct differs in the use of a trail in search and a unit rule in learning. Our generalized unit rule is from acdcl, but the use of s1s(t), potential conflicts and the combination with pdrfs are all new. The widening used in cdct preserves a termination guarantee, and we believe that algorithms for generating small interpolants [3] can help design better widening operators.

Finally, termination analysis is a thriving area with more approaches than we can discuss. A fundamental problem is the efficient discovery of disjunctions of ranking functions [27]. We use backward analysis, as in [8, 12], and our combination of conditional termination [6] with non-termination [15, 21] is crucial. The approach of [22] is similar to ours, with a different refutation step and information exchange mechanism. At a high level, cdct is the dual of [16], which underapproximates non-terminating executions and overapproximates terminating ones, while we overapproximate non-termination and underapproximate termination. We believe cdct can be extended to transition-based approaches [28], but the challenge is to develop search and learning.