1 Introduction

The long line of research on software model checking, i.e., on tools that statically analyze a given program in order to automatically verify a given temporal property, was initially restricted to safety properties [2, 3, 11, 20, 37, 45, 51]. It was later extended to termination [9, 21, 26, 27, 36, 40, 49, 50, 52]. The relative maturity of this research is reflected by the fact that software model checking tools successfully participate in the software verification competition SV-Comp [10], for safety [29, 33, 41, 47] as well as for termination [33, 55, 56].

In a more recent trend, approaches to software model checking are emerging for the general class of LTL properties, and in particular general liveness properties [5, 22–24]. In this paper, we introduce an approach to LTL software model checking which is based on fairness modulo theory, an extension of reachability modulo theory as introduced by Lal and Qadeer [42].

In the setting of [42], the existence of a program execution that violates a given safety property is proven via the reachability of an error location of the program along a feasible path. A path is feasible if the sequence of statements along the path is executable. This condition is checked by checking whether the corresponding logical formula is satisfiable modulo theory (i.e., satisfiable in the logical theory of integers, arrays, etc.). Today, quite efficient SMT solvers exist which can not only prove unsatisfiability but also compute interpolants [13, 14, 17, 19, 46]. Interpolants can be used to generalize the proof of unsatisfiability in order to show the infeasibility of more and more paths and eventually prove the unreachability of an error location (which is the underlying idea in the approach to program verification of [35, 36]).

We extend the setting of [42] to LTL by defining the construction of a new kind of program (a Büchi program) from the input program and the LTL property. The control flow graph of a Büchi program comes with a distinguished set of nodes which is used to define (infinite) fair paths (a path is fair if it visits the distinguished set of nodes infinitely often). Now, in our extension of the setting in [42], the existence of a program execution that violates a given LTL property is proven via the existence of a feasible fair path.

In general, to show that the infinite sequence of statements along a path is not executable, one needs to construct a ranking function. For example, for each of the two infinite sequences of statements below, one may construct the ranking function r defined by \(r(x,y)=x-y\).

figure a

Every finite prefix of \(\tau _1\) is executable. In contrast, \(\tau _2\) has the prefix

figure b

which already is not executable.

In the case where an infinite sequence of statements has a finite prefix such that already the prefix is not executable, it is not necessary to construct a ranking function. Instead, it is sufficient to consider the prefix and prove the unsatisfiability of the logical formula corresponding to the finite sequence of statements in the prefix.

Tools exist that, given an infinite sequence of statements like

figure c

or

figure d

, can construct a ranking function like r above automatically [7, 12, 48]. Recent efforts go into improving the scope and the scalability of such tools [8, 25, 34, 43]. In comparison with proving unsatisfiability, the task of constructing a ranking function will always be more costly. Hence, substituting the construction of a ranking function by the construction of a proof of unsatisfiability carries an interesting potential for optimization. The goal of the work in this paper is to investigate whether this potential can be exploited practically. We develop a practical method and tool for LTL software model checking that shows that this is indeed the case.

In the remainder of the paper, after discussing an example, we introduce Büchi programs (as described above, we reduce the validity of an LTL property for a given program to the absence of a feasible fair path in a Büchi program). We present an algorithm that constructs such a Büchi program and checks whether it has a feasible fair path. The algorithm selects certain finite prefixes of a path for the check of feasibility before the full infinite path is considered. We then present the evaluation of a tool which implements the algorithm. Our evaluation shows the practical potential of our approach. In particular, the tool can verify several benchmark programs—for a liveness property—just with finite prefixes (and thus without the construction of a single ranking function).

2 Example

In this section we demonstrate how we apply our approach to the program \(\mathcal {P}\) depicted in Fig. 1a and the LTL property \(\varphi = \square (x>0 \rightarrow \lozenge (y=0))\).

We represent the program \(\mathcal {P}\) by the graph depicted in Fig. 1b. The edges of this graph are labeled with program statements. We use the Büchi automaton \(\mathcal {A}_{\lnot \varphi }\) depicted in Fig. 1c as representation of the negation of the LTL property \(\varphi \).

Fig. 1.
figure 1

Program \(\mathcal {P} \) is shown in (a) as pseudocode and in (b) as control flow graph. The Büchi automaton \(\mathcal {A}_{\lnot \varphi }\) that represents the negation of the LTL property \(\varphi = \square (x>0 \rightarrow \lozenge (y=0))\) is shown in (c).

As a first step we construct the Büchi program \(\mathcal {B}\) depicted in Fig. 2. Afterwards we will show that this Büchi program \(\mathcal {B}\) has no path that is fair and feasible, thus proving that \(\mathcal {P}\) satisfies the LTL property \(\varphi \).

Fig. 2.
figure 2

The Büchi program \(\mathcal {B}\) constructed from the program \(\mathcal {P}\) (Fig. 1b) and the Büchi automaton representing \(\lnot \varphi \) (Fig. 1c). Each edge is labeled with the statements

figure e
figure f
where
figure g
comes from \(\mathcal {P}\) and
figure h
comes from \(\lnot \varphi \). The fair locations are \(l_0q_1\), \(l_1q_1\), \(l_2q_1\) and \(l_3q_1\), i.e., all locations that contain the Büchi automaton’s accepting state \(q_1\).

A Büchi program is a program together with a fairness constraint: an execution is fair if a fair location is visited infinitely often. The fair locations of \(\mathcal {B}\) are highlighted by double circles. The locations of the Büchi program \(\mathcal {B}\) are pairs whose first element is a location of the program \(\mathcal {P}\) and whose second element is a state of the Büchi automaton \(\mathcal {A}_{\lnot \varphi }\). The edges of the Büchi program \(\mathcal {B}\) are labeled with sequential compositions of two statements where the first element is a statement of the program. The second element of the sequential composition is an assume statement that represents a letter of the Büchi automaton \(\mathcal {A}_{\lnot \varphi }\).

A key concept in our analysis is the notion of a trace. A trace is an infinite sequence of statements. We call a trace fair if it is the labeling of a path that visits some fair location infinitely often. A trace is feasible if it corresponds to some program execution. An example for a fair trace is \(\tau _1 \tau _2^\omega \) where \(\tau _1\) and \(\tau _2\) are as follows.

figure i

This trace is not feasible because the second statement

figure j

and the third statement

figure k

contradict each other.

Our algorithm constructs Büchi programs such that each fair and feasible trace of the Büchi program corresponds to a feasible trace of the original program that violates the LTL property.

In order to show that \(\mathcal {P}\) satisfies \(\varphi \) we show that no fair trace of the Büchi program \(\mathcal {B}\) is feasible. Thus, our algorithm tries to find arguments for infeasibility of fair traces in \(\mathcal {B}\):

Local Infeasibility. In the Büchi program \(\mathcal {B}\) every trace that is the labeling of a path that contains the edge

figure l

is infeasible, because the statements

figure m

and

figure n

contradict each other. Another example for local infeasibility is the edge from \(l_1q_0\) to \(l_0q_1\) which is labeled with the two statements

figure o

and

figure p

that contradict each other, too.

Infeasibility of a Finite Prefix. Every trace that is the labeling of a path that has the following finite prefix

figure q

is infeasible because

figure r

contradicts

figure s

. Another example for infeasibility of a finite prefix is the trace \(\tau _1\tau _2^\omega \) that was discussed before.

\(\omega \) -Infeasibility. Every trace that is the labeling of an infinite path that eventually loops along the following edges

figure t

is infeasible because

figure u

infinitely often decreases x. Thus, the value of x will eventually contradict

figure v

. The formal termination argument is the ranking function \(f(x)=x\).
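As a worked illustration of such a termination argument, consider a hypothetical loop body \(\mathtt {assume\, x>0;\ x:=x-1}\) over the integers. This body is only an assumption chosen to match the description above; the actual statements are those shown in the figure. A function f is a ranking function for a loop with successor relation \(\rho _{loop}\) if every iteration starts in a state where f is bounded from below and strictly decreases f. For the assumed loop body and \(f(x)=x\), both conditions follow directly from the assume statement and the assignment:

$$\begin{aligned} (\sigma ,\sigma ') \in \rho _{loop} \ &\Longrightarrow \ f(\sigma ) \ge 0 \ \wedge \ f(\sigma ') \le f(\sigma ) - 1 \\ \sigma (x) > 0 \ \wedge \ \sigma '(x) = \sigma (x) - 1 \ &\Longrightarrow \ \sigma (x) \ge 0 \ \wedge \ \sigma '(x) \le \sigma (x) - 1 \end{aligned}$$

Since f takes integer values, it cannot decrease infinitely often while staying non-negative, so such a loop cannot be executed forever.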

Each fair trace of \(\mathcal {B}\) is infeasible for one of the reasons mentioned above. Hence, we can conclude that program \(\mathcal {P}\) indeed satisfies the LTL property \(\varphi \).

All reasons for infeasibility that fall into the classes Local infeasibility or infeasibility of a finite prefix are comparatively cheap to detect. In this example we only needed to synthesize one ranking function, which is in general more expensive.

3 Preliminaries

Programs and Traces. In our formal exposition we consider a simple programming language whose statements are assignment, assume, and sequential composition. We use the syntax that is defined by the following grammar

$$\begin{aligned} \mathtt s \ := \ \mathtt {assume\, bexpr} \ \mid \ \mathtt {x:=expr} \ \mid \ \mathtt {s;s} \end{aligned}$$

where \( Var \) is a finite set of program variables, \(\mathtt x \in Var \), expr is an expression over \( Var \) and bexpr is a Boolean expression over \( Var \). For brevity we use bexpr to denote the assume statement assume bexpr.

We represent a program over a given set of statements \( Stmt \) as a labeled graph \(\mathcal {P} = (Loc, \delta , l_{0})\) with a finite set of nodes Loc called locations, a set of edges labeled with statements, i.e., \(\delta \subseteq Loc \times Stmt \times Loc\), and a distinguished node \(l_{0}\) which we call the initial location.

In the following we consider only programs where each location has at least one outgoing edge, i.e., \(\forall l\in Loc,\ \exists s\in Stmt ,\ \exists l'\in Loc \bullet (l,s,l') \in \delta \). We note that each program can be transformed into this form by adding to each location without outgoing edges a self-loop that is labeled with assume true.

We call an infinite sequence of statements \(\tau =s_0s_1s_2\ldots \) a trace of the program \(\mathcal {P} \) if \(\tau \) is the edge labeling of an infinite path that starts at the initial location \(l_{0}\). We define the set of all program traces formally as follows.

$$\begin{aligned} T(\mathcal {P}) = \{s_0s_1\ldots \in Stmt ^\omega \mid \exists l_1,l_2,\ldots \bullet (l_i,s_{i},l_{i+1}) \in \delta \text {, for } i=0,1,\ldots \} \end{aligned}$$
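The graph representation and the notion of a trace can be illustrated with a small sketch. The code below stores the edge relation as a set of triples and enumerates the statement sequences of all paths of a given finite length that start at the initial location. The location names and statements are hypothetical and are not taken from the figures of this paper.

```python
from typing import Iterator, Set, Tuple

# An edge (l, s, l') of the relation delta: source location, statement, target location.
Edge = Tuple[str, str, str]


def trace_prefixes(delta: Set[Edge], l0: str, k: int) -> Iterator[Tuple[str, ...]]:
    """Yield the edge labelings of all paths of length k that start at l0."""

    def walk(loc: str, labels: Tuple[str, ...]) -> Iterator[Tuple[str, ...]]:
        if len(labels) == k:
            yield labels
            return
        for (src, stmt, dst) in sorted(delta):
            if src == loc:
                yield from walk(dst, labels + (stmt,))

    yield from walk(l0, ())


# Hypothetical example program with two locations.
delta = {
    ("l0", "x := 1", "l1"),
    ("l1", "assume x > 0", "l1"),
    ("l1", "assume x <= 0", "l0"),
}
prefixes = list(trace_prefixes(delta, "l0", 2))
```

Because every location is required to have at least one outgoing edge, each finite sequence produced this way is the prefix of at least one (infinite) trace of the program.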

Let \(\mathcal {D}\) be the set of values of the program’s variables. We denote a program state \(\sigma \) as a function \(\sigma : Var \rightarrow \mathcal {D}\) that maps program variables to values. We use S to denote the set of all program states. Each statement s \(\in Stmt \) defines a binary relation \(\rho _\mathtt{s }\) over program states which we call the successor relation. Let Expr be the set of all expressions over the program variables \( Var \). We assume a given interpretation function \(\mathcal {I}: Expr \times ( Var \rightarrow \mathcal {D}) \rightarrow \mathcal {D}\) and define the relation \(\rho _\mathtt{s } \subseteq S \times S\) inductively as follows.

figure w

Given a trace \(\tau = s_0s_1s_2\ldots \), a sequence of program states \(\pi = \sigma _0\sigma _1\sigma _2\ldots \) is called a program execution of the trace \(\tau \) if each successive pair of program states is contained in the successor relation of the corresponding statement of the trace, i.e., \((\sigma _{i},\sigma _{i+1}) \in \rho _{s_i}\) for \(i\in \{0,1,\ldots \}\). We call a trace \(\tau \) infeasible if it does not have any program execution, otherwise we call \(\tau \) feasible. We use \(\Pi (\tau )\) to denote the set of all program executions of \(\tau \). The set of all feasible traces of program \(\mathcal {P} \) is denoted by \(T_ feas (\mathcal {P})\), and the set of all program executions of \(\mathcal {P} \) is defined as follows.

$$\begin{aligned} \Pi (\mathcal {P}) = \bigcup _{\tau \in T_ feas (\mathcal {P})} \Pi (\tau ) \end{aligned}$$
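The following sketch makes the successor relation and the notion of a program execution concrete for finite statement sequences. It assumes the usual reading of the three statement kinds: assume keeps the state if the condition holds, assignment updates one variable, and sequential composition chains the two relations. The tuple encoding of statements and the bounded search over initial states are assumptions of this sketch; the bounded search is only a stand-in for the satisfiability check modulo theory and cannot establish infeasibility in general.

```python
from itertools import product


def step(stmt, state):
    """Return the successor of `state` under `stmt`, or None if there is none."""
    kind = stmt[0]
    if kind == "assume":   # ("assume", predicate): state is kept if the predicate holds
        return state if stmt[1](state) else None
    if kind == "assign":   # ("assign", var, expr): update a single variable
        _, var, expr = stmt
        return {**state, var: expr(state)}
    if kind == "seq":      # ("seq", s1, s2): execute s1, then s2
        mid = step(stmt[1], state)
        return None if mid is None else step(stmt[2], mid)
    raise ValueError(f"unknown statement kind: {kind}")


def execute(prefix, state):
    """Return the states visited along a finite statement sequence, or None if blocked."""
    states = [state]
    for stmt in prefix:
        state = step(stmt, state)
        if state is None:
            return None
        states.append(state)
    return states


def feasible_on(prefix, variables, domain):
    """Bounded check: can some initial state over `domain` execute `prefix`?"""
    return any(
        execute(prefix, dict(zip(variables, values))) is not None
        for values in product(domain, repeat=len(variables))
    )


# Hypothetical statements over a single integer variable x.
dec_x = ("assign", "x", lambda s: s["x"] - 1)
x_pos = ("assume", lambda s: s["x"] > 0)
x_neg = ("assume", lambda s: s["x"] < 0)
```

For instance, the sequence consisting of `x_pos` followed by `x_neg` has no execution from any initial state, whereas `x_pos`, `dec_x`, `x_pos` can be executed from any state with x at least 2.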

Büchi Automata and LTL Properties. We will not formally introduce linear temporal logic (LTL). Every LTL property can be expressed as a Büchi automaton [1]. In our formal presentation we use Büchi automata to represent LTL properties.

A Büchi automaton \(\mathcal {A}_{} = (\Sigma , Q, q_0, \longrightarrow , F)\) is a five-tuple consisting of a finite alphabet \(\Sigma \), a finite set of states Q, an initial state \(q_0 \in Q\), a transition relation \(\longrightarrow \subseteq Q \times \Sigma \times Q\), and a set of accepting states \(F \subseteq Q\). A word over the alphabet \(\Sigma \) is an infinite sequence \(w = a_0a_1a_2\ldots \) such that \(a_i \in \Sigma \) for all \(i \ge 0\). A run r of a Büchi automaton \(\mathcal {A}_{}\) on w is an infinite sequence of states \(q_0q_1 \ldots \), starting in the initial state such that for all \(i \ge 0\) there is a transition \((q_{i}, a_i, q_{i+1}) \in \longrightarrow \). A run r is called accepting if r contains infinitely many accepting states. A word w is accepted by \(\mathcal {A}_{}\) if there is an accepting run of \(\mathcal {A}_{}\) on w. The language \(\mathcal {L}(\mathcal {A}_{}) \) of a Büchi automaton \(\mathcal {A}_{}\) is the set of all words that are accepted by \(\mathcal {A}_{}\).
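For ultimately periodic words of the form \(u v^\omega \), acceptance by a Büchi automaton can be decided on a finite graph. The sketch below pairs automaton states with positions in the loop v and looks for a reachable accepting state that lies on a cycle. The function names and the example automaton (which accepts the words containing infinitely many occurrences of the letter a) are hypothetical, and letters are plain symbols here rather than sets of atomic propositions.

```python
def reachable(succ, sources):
    """Return all nodes reachable from `sources` via the successor function `succ`."""
    seen, stack = set(), list(sources)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(succ(node))
    return seen


def accepts_lasso(delta, q0, accepting, stem, loop):
    """Decide whether the automaton accepts the word stem . loop^omega."""
    assert loop, "the loop must be non-empty"
    # States in which the automaton can be after reading the stem.
    current = {q0}
    for a in stem:
        current = {q2 for (q1, b, q2) in delta if q1 in current and b == a}

    # Nodes are pairs (state, position in the loop).
    def succ(node):
        q, i = node
        return [(q2, (i + 1) % len(loop))
                for (q1, b, q2) in delta if q1 == q and b == loop[i]]

    for node in reachable(succ, [(q, 0) for q in current]):
        # An accepting node on a cycle can be visited infinitely often.
        if node[0] in accepting and node in reachable(succ, succ(node)):
            return True
    return False


# Hypothetical automaton: q1 is entered exactly when the letter "a" is read.
delta = {("q0", "a", "q1"), ("q0", "b", "q0"), ("q1", "a", "q1"), ("q1", "b", "q0")}
```

With this automaton, the word \(b(ab)^\omega \) is accepted, while \(ab^\omega \) is rejected because the accepting state is visited only once.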

An atomic proposition is a set of program states. An LTL property over a set of atomic propositions AP defines a set of words over the alphabet \(\Sigma = 2^{AP}\). LTL properties are usually denoted by formulas, but several translations from formulas to equivalent Büchi automata are available [31, 32, 54]. We assume that we have given a Büchi automaton \(\mathcal {A}_{\varphi }\) for each LTL property \(\varphi \).

A program state \(\sigma \) satisfies a symbol a of the alphabet \(2^{AP}\) if \(\sigma \) is an element of all atomic propositions in a. A sequence of program states \(\sigma _0\sigma _1\ldots \) satisfies a word \(a_0a_1a_2\ldots \in (2^{AP})^\omega \), if \(\sigma _{i+1}\) satisfies \(a_i\) for each \(i\ge 0\). A sequence of program states \(\pi \) satisfies the LTL property \(\varphi \) if \(\pi \) satisfies some word \(w\in \mathcal {L}(\mathcal {A}_{\varphi })\). A trace \(\tau =s_0s_1\ldots \) satisfies \(\varphi \) if it has at least one program execution and all program executions of the trace satisfy \(\varphi \). A program \(\mathcal {P} \) satisfies \(\varphi \) if all program executions of \(\mathcal {P} \) satisfy \(\varphi \). We will use the \(\models \) symbol to denote each of these “satisfies relations”, e.g., we will write \(\mathcal {P} \models \varphi \) if the program \(\mathcal {P} \) satisfies the LTL property \(\varphi \).

We note that these definitions do not put any restrictions on the initial state \(\sigma _0\) of a sequence of program states. This accounts for the fact that our programs do not have to start in a given initial program state and allows programs that satisfy the LTL property \(\square (x=0)\). An example is the program whose first statement sets the variable x to 0 and whose other statements do not modify x.

4 Büchi Program and Büchi Program Product

In this section we introduce the notion of a Büchi program, which is a program extended by a fairness constraint. We show that the problem of whether a program satisfies an LTL property can be reduced to the problem of whether a Büchi program has a fair program execution.

Definition 1

(Büchi Program). A Büchi program \({\mathcal {B}} =( Stmt , Loc, \delta , l_{0}, Loc_ fair )\) is a program \(\mathcal {P} =(Loc, \delta , l_{0})\) whose set of statements is \( Stmt \), with a distinguished subset of locations \(Loc_ fair \subseteq Loc\). We call the locations \(Loc_ fair \) the fair locations of \(\mathcal {B}\).

An example for a Büchi program is the program depicted in Fig. 2 which was discussed in Sect. 2.

Definition 2

(Fair Trace). A trace \(s_0s_1s_2\ldots \) of a Büchi program \(\mathcal {B}\) is a fair trace if

  • there exists a sequence of locations \(l_0,l_1,\ldots \) such that \(l_0\xrightarrow {s_0} l_1 \xrightarrow {s_1} l_2 \xrightarrow {s_2} \ldots \) is a path in \(\mathcal {B}\), i.e., \((l_i,s_i,l_{i+1})\in \delta \) for \(i=0,1,\ldots \), and

  • the sequence \(l_0,l_1,\ldots \) contains infinitely many fair locations.

We use \(T_ fair ({\mathcal {B}})\) to denote the set of fair traces of \(\mathcal {B}\).

If we consider the Büchi program \({\mathcal {B}} =( Stmt , Loc, \delta , l_{0}, Loc_ fair )\) as a Büchi automaton where the alphabet is the set of program statements \( Stmt \), the set of states is the set of program locations Loc, the transition relation is the labeled edge relation \(\delta \), the initial state is the initial location \(l_{0}\) and the set of accepting states is the set of fair locations \(Loc_ fair \), then the language of this Büchi automaton is exactly the set of fair traces of the Büchi program.

Definition 3

(Fair Program Execution). A program execution \(\pi \) of a Büchi program \(\mathcal {B}\) is a fair program execution of \(\mathcal {B}\) if \(\pi \) is the program execution of some fair trace of \(\mathcal {B}\). We use \(\Pi _{fair}({\mathcal {B}})\) to denote the set of all fair program executions of \(\mathcal {B}\).

We note that traces that are fair and feasible have at least one fair program execution.

Boolean expressions over the set of program variables Var, and atomic propositions both define sets of program states. For a letter \(a\in 2^{AP}\), we will use assume a to denote the assume statement whose expression evaluates to true for each state \(\sigma \) that satisfies a. Hence assume a has the following successor relation.

$$\begin{aligned} \{(\sigma , \sigma )\mid \sigma \models p \text { for each } p\in a\} \end{aligned}$$

Definition 4

(Büchi Program Product). Let \(\mathcal {P} = (Loc, \delta _\mathcal {P}, l_0)\) be a program over the set of statements \( Stmt \), AP a set of atomic propositions over the program’s variables Var, and let \(\mathcal {A}_{} = (\Sigma , Q,q_{0},\rightarrow ,F)\) be a Büchi automaton whose alphabet is \(\Sigma = 2^{AP}\). The Büchi program product \(\mathcal {P} \otimes \mathcal {A}_{}\) is a Büchi program \({\mathcal {B}} = ( Stmt _{{\mathcal {B}}}, Loc_{{\mathcal {B}}}, \delta _{{\mathcal {B}}}, l_{0_{{\mathcal {B}}}}, Loc_{F_{{\mathcal {B}}}})\) such that the set of statements consists of all sequential compositions of two statements where the first element is a statement of \(\mathcal {P} \) and the second element is a statement that assumes that a subset of atomic propositions is satisfied, i.e.,

$$\begin{aligned} Stmt _{{\mathcal {B}}} = \{{\mathtt{\textit{s; assume a} }} \mid s\in Stmt , a\in 2^{AP}\}, \end{aligned}$$

the set of locations is the Cartesian product of program locations and Büchi automaton states, i.e.,

$$\begin{aligned} Loc_{{\mathcal {B}}} = \{ (l,q) \mid l \in Loc\textit{ and }q \in Q\}, \end{aligned}$$

the initial location is the pair consisting of the program’s initial location and the Büchi automaton’s initial state, i.e.,

$$\begin{aligned} l_{0_{{\mathcal {B}}}} = (l_{0},q_{0}), \end{aligned}$$

the labeled edge relation is a product of the program’s edge relation and the transition relation of the Büchi automaton such that an edge is labeled by the statement that is a sequential composition of the program’s edge label and an assume statement obtained from the transition’s letter, formally defined as follows

$$\begin{aligned} \delta _{{\mathcal {B}}} = \{((l,q), {\mathtt{\textit{s; assume a} }}, (l',q')) \mid (l,s,l') \in \delta _{\mathcal {P}} \textit{ and } (q,a,q') \in \rightarrow \}, \end{aligned}$$

the set of fair locations contains all pairs where the second component is an accepting state of the Büchi automaton, i.e.,

$$\begin{aligned} Loc_{F_{{\mathcal {B}}}} = \{ (l,q) \mid l \in Loc\textit{ and }q \in F\}. \end{aligned}$$
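The product construction can be sketched in a few lines. In the code below, a program is a triple of locations, edges and initial location, an automaton is a five-tuple, a letter is a frozenset of proposition names, and a composed statement "s; assume a" is encoded as the pair (s, a). These encodings, the function name and the example are assumptions of this sketch. It also differs from the definition in one respect: the returned statement set contains only the statements that label some edge, not all compositions.

```python
def buchi_program_product(program, automaton):
    """Build the Büchi program product as (Stmt, Loc, delta, l0, Loc_fair)."""
    loc, delta_p, l0 = program
    _sigma, states, q0, trans, accepting = automaton

    # One product edge per pair of a program edge and an automaton transition.
    delta_b = {
        ((l, q), (s, a), (l2, q2))
        for (l, s, l2) in delta_p
        for (q, a, q2) in trans
    }
    loc_b = {(l, q) for l in loc for q in states}
    stmt_b = {label for (_src, label, _dst) in delta_b}
    fair = {(l, q) for l in loc for q in accepting}
    return stmt_b, loc_b, delta_b, (l0, q0), fair


# Hypothetical program and automaton.
p = frozenset({"x>0"})
empty = frozenset()
program = (
    {"l0", "l1"},
    {("l0", "x := 1", "l1"), ("l1", "x := x - 1", "l1")},
    "l0",
)
automaton = (
    {p, empty},
    {"q0", "q1"},
    "q0",
    {("q0", empty, "q0"), ("q0", p, "q1"), ("q1", p, "q1")},
    {"q1"},
)
stmt_b, loc_b, delta_b, init_b, fair_b = buchi_program_product(program, automaton)
```

The number of product edges is the number of program edges times the number of automaton transitions, which is why removing infeasible edges early can pay off.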

The following theorem shows how we can use the Büchi program product to check if a program satisfies an LTL property.

Theorem 1

The program \(\mathcal {P} \) satisfies the LTL property \(\varphi \) if and only if the Büchi program product \({\mathcal {B}} =\mathcal {P} \otimes \mathcal {A}_{\lnot \varphi }\) does not have a trace that is fair and feasible, i.e.,

$$\begin{aligned} \mathcal {P} \models \varphi \qquad \text {iff} \qquad T_ fair ({\mathcal {B}})\cap T_ feas ({\mathcal {B}}) = \emptyset \end{aligned}$$

Proof

For brevity, we give only a sketch of the proof. A more detailed proof is available in an extended version of this paper [30]. First, we use the definition of the Büchi program product to show the following connection between traces of \(\mathcal {B}\), traces of \(\mathcal {P}\) and words over \(2^{AP}\): \((s_0; \mathtt {assume}\ a_0)\,(s_1; \mathtt {assume}\ a_1) \ldots \in T_ fair ({\mathcal {B}})\) if and only if \(s_0s_1s_2\ldots \in T(\mathcal {P})\) and \(a_0a_1a_2\ldots \in \mathcal {L}(\mathcal {A}_{\lnot \varphi })\). Next, we use this equivalence to show that for a sequence of program states \(\pi \) the following holds: \(\pi \in \Pi _ fair ({\mathcal {B}})\) if and only if \(\pi \in \Pi (\mathcal {P})\) and \(\pi \models \mathcal {A}_{\lnot \varphi }\). A Büchi program has a fair program execution if and only if it has a fair and feasible trace. We conclude that the intersection \(T_ fair ({\mathcal {B}})\cap T_ feas ({\mathcal {B}})\) is empty if and only if each program execution of \(\mathcal {P}\) satisfies the LTL property \(\varphi \).    \(\square \)

5 LTL Software Model Checking

In this section we describe our LTL software model checking algorithm. The algorithm is based on counterexample-guided abstraction refinement (CEGAR) in the fashion of [35], extended by a check for termination of fair traces and a corresponding abstraction refinement.

Fig. 3.
figure 3

The model checking algorithm. We use an automata-based approach that collects generalizations of infeasible traces in a Büchi automaton \(\mathcal {A}_{D}\). The three inner boxes represent the three checks, which lead either to a refinement of \(\mathcal {A}_{D}\), a result, or to a timeout (not shown).

Figure 3 shows an overview of the algorithm. The general idea is to create and continuously enlarge a Büchi automaton \(\mathcal {A}_{D}\) whose language contains all fair traces of \(\mathcal {B}\) that are already known to be infeasible. The algorithm starts by constructing a Büchi program \(\mathcal {B}\) with the product construction from Sect. 4. Initially, \(\mathcal {A}_{D}\) is a Büchi automaton that recognizes the empty language.

We use the similarities between Büchi programs and Büchi automata, i.e., that \(\mathcal {L}({\mathcal {B}}) = T_ fair ({\mathcal {B}})\), throughout the whole algorithm. For example, in the first step of our CEGAR loop we check whether the set of fair traces represented by \(\mathcal {A}_{D}\) is a superset of the fair traces of \(\mathcal {B}\) (the first box in Fig. 3). This check for trace inclusion can be done with only Büchi automata operations.

If the set of fair traces of \(\mathcal {A}_{D}\) is indeed a superset of the set of fair traces of \(\mathcal {B}\), we know that there is no fair and feasible trace in \(\mathcal {B}\) and our algorithm returns safe.

As the trace inclusion check is performed by computing \(\mathcal {L}({\mathcal {B}}) \setminus \mathcal {L}(\mathcal {A}_{D}) \), we will receive a fair trace \(\tau \) of \(\mathcal {B}\) that witnesses that the set of fair traces of \(\mathcal {A}_{D}\) is not a superset of the set of fair traces of \(\mathcal {B}\). In this case, \(\tau \) is always of the form \(\tau _1\tau ^\omega _2\).

Next, our algorithm tries to decide whether \(\tau \) is feasible or not. This is done by first checking various finite prefixes for feasibility. More precisely, the stem \(\tau _1\), the loop \(\tau _2\) and then the concatenation \(\tau _1\tau _2\) are checked for feasibility in that order. If none of those finite prefixes is infeasible, our algorithm tries to prove that the full infinite trace terminates. The termination analysis (inner lower box) tries to find a ranking function to prove that the loop will terminate eventually. When non-termination can be proven, we conclude that \(\tau \) is feasible. Therefore, \(\tau \) is a fair and feasible trace in \(\mathcal {B}\) and thus a counterexample for the property \(\varphi \). If instead termination can be shown, we know that \(\tau \) is infeasible and the algorithm continues to the next step.
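The order of the checks on a trace \(\tau _1\tau _2^\omega \) can be sketched as follows. The sketch is a toy model under strong assumptions: programs have a single integer variable, statements are Python functions that return the next value or None if they block, feasibility is approximated by trying initial values from a small bounded domain, and the termination check only tests a supplied candidate ranking function on that domain. All names are hypothetical; none of this reflects how the SMT-based checks are actually implemented.

```python
DOMAIN = range(-8, 9)  # bounded stand-in for "all integer initial values"


def run(stmts, x):
    """Execute a statement list on the value x; return None if a statement blocks."""
    for stmt in stmts:
        x = stmt(x)
        if x is None:
            return None
    return x


def is_feasible(stmts):
    """Bounded approximation of the feasibility check for a finite sequence."""
    return any(run(stmts, x) is not None for x in DOMAIN)


def ranking_function_holds(loop, f):
    """Test that f is bounded below and strictly decreasing on every loop iteration."""
    for x in DOMAIN:
        x_next = run(loop, x)
        if x_next is not None and not (f(x) >= 0 and f(x_next) < f(x)):
            return False
    return True


def analyze_lasso(stem, loop, f, proves_nontermination=lambda stem, loop: False):
    """Check stem, loop and stem+loop first; only then try a termination argument."""
    for name, prefix in (("stem", stem), ("loop", loop), ("stem+loop", stem + loop)):
        if not is_feasible(prefix):
            return ("refine_F", name)
    if ranking_function_holds(loop, f):
        return ("refine_omega", None)
    if proves_nontermination(stem, loop):
        return ("counterexample", None)
    return ("unknown", None)


# Hypothetical statements.
set5 = lambda x: 5
set0 = lambda x: 0
assume_pos = lambda x: x if x > 0 else None
assume_neg = lambda x: x if x < 0 else None
dec = lambda x: x - 1
```

In this toy model, a stem that sets x to 0 followed by a loop that assumes x > 0 is already refuted by the finite prefix consisting of stem and loop, so no ranking function is requested for it.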

Note that the checks for feasibility of \(\tau _1\) and \(\tau _2\) as well as the termination analysis address problems that are undecidable in general, so the underlying methods may not terminate. In such cases, our algorithm runs into a timeout and returns unknown as answer.

In the last step of the CEGAR loop we want to refine \(\mathcal {A}_{D}\) by adding more fair and infeasible traces. We do this by replacing \(\mathcal {A}_{D}\) with a Büchi automaton that is the union of the old Büchi automaton \(\mathcal {A}_{D}\) and a new Büchi automaton which we create from trace \(\tau \). This new Büchi automaton recognizes all fair traces of \(\mathcal {B}\) that are infeasible for the same reason for which trace \(\tau \) is infeasible. Depending on the reason for infeasibility of trace \(\tau \), we use different methods for the construction of this new Büchi automaton: if \(\tau \) was infeasible because we found an infeasible finite prefix, we use the method \(\mathsf refine _{F}\), if \(\tau \) was infeasible because we found a ranking function, we use \(\mathsf refine _{\omega }\).

The methods \(\mathsf refine _{F}\) and \(\mathsf refine _{\omega }\) generalize a single trace to a set of traces. The input of these methods is the trace \(\tau \) together with an infeasibility proof (resp. termination proof). The output is a Büchi automaton that accepts a set of traces whose infeasibility (resp. termination) can be shown by this infeasibility proof (resp. termination proof). \(\mathsf refine _{F}\) and \(\mathsf refine _{\omega }\) guarantee that at least the single trace is contained in the language, but usually recognize a much larger set of traces. As the generalization performed by these methods is quite involved, it is not in the scope of this paper. We refer the interested reader to [35, 36] for a detailed description.

6 Implementation and Evaluation

We implemented the algorithm from Sect. 5 as Ultimate LTLAutomizer in the program analysis framework Ultimate [16]. This allowed us to use different, already available components for our implementation:

  • a parser for ANSI C extended with specifications written in ACSL [6],

  • various source-to-source transformations that optimize and simplify the input program,

  • an implementation of the Trace Abstraction algorithm [35] to determine feasibility of finite trace prefixes,

  • an implementation of a ranking function synthesis algorithm based on [34] to prove termination of fair traces in the Büchi program, and

  • various automata operations like union, complementation and intersection of Büchi automata.

For the LTL property we use a custom annotation compatible with the ACSL format. After parsing, we transform the LTL property with LTL2BA [32] to a Büchi automaton, which, together with an initial program, is then the input for the product algorithm.

Our implementation of the product construction already contains some optimizations. For one, as already described, we remove locally infeasible traces by removing infeasible edges during the construction. We also convert the expression e of assume e statements to disjunctive normal form. If this results in edges labeled with more than one disjunct, i.e., with \(\mathtt {assume\, e}_\mathtt {1} \mathtt {||} \mathtt {e}_\mathtt {2} \mathtt {||} \mathtt {\ldots } \mathtt {||} \mathtt {e}_\mathtt {n}\), we convert them to n edges labeled with \(\mathtt {assume\, e}_\mathtt {i}\). This improves the performance of the ranking function synthesis algorithm considerably.
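The edge-splitting step can be illustrated with a small sketch. It assumes that assume expressions are given as trees over atoms with binary "and" and "or" nodes, with negations already pushed into the atoms, and that an edge carries only the assume part of its label. The encoding and the function names are hypothetical and do not describe the data structures used in Ultimate.

```python
def dnf(expr):
    """Convert an and/or tree over atoms into a list of conjunctions (tuples of atoms)."""
    if isinstance(expr, str):  # an atom
        return [(expr,)]
    op, left, right = expr
    if op == "or":
        return dnf(left) + dnf(right)
    if op == "and":
        # Distribute the conjunction over the disjuncts of both operands.
        return [l + r for l in dnf(left) for r in dnf(right)]
    raise ValueError(f"unknown operator: {op}")


def split_assume_edges(edges):
    """Replace each edge by one edge per disjunct of its assume expression."""
    return [
        (src, conjunction, dst)
        for (src, expr, dst) in edges
        for conjunction in dnf(expr)
    ]


# Hypothetical edge labeled with: assume x>0 && (y=0 || y=1)
edges = [("l0", ("and", "x>0", ("or", "y=0", "y=1")), "l1")]
split = split_assume_edges(edges)
```

Each resulting edge is labeled with a pure conjunction, so a trace through the split program never contains a disjunctive assume statement.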

Table 1. The results of the comparison with the benchmarks from [23]. “Program”, “Lines”, and “\(\varphi \)” contain the name of the benchmark, the lines of code of the program, and the checked property (atomic propositions have been abbreviated). “Result” states whether the tool proved the property ( ✔ ), produced a valid counterexample ( ✗ ), ran out of time (T.O.) or out of memory (OOM). N.R. shows the instance where we could not use the benchmark because the property was not specified explicitly and could not be guessed from the comments in the file. “Time” contains the runtime of the respective tool in seconds. For Ultimate LTLAutomizer, there are additional statistics columns: “\(|\mathtt r _{F}|\) ” states how many traces were refined using \(\mathsf refine _{F}\), and analogously “\(|\mathtt r _{\omega }|\) ” for \(\mathsf refine _{\omega }\). “Inc.” shows how much the product increased in size compared to the original CFG of the program. The timeout for “Term.” and “DP” was four hours, our timeout was 20 min. Our memory limit was 8 GB.

Table 1 shows a comparison of our implementation against the benchmarks and the data provided by [23], in which the authors compare their novel LTL-checking approach based on decision predicates (DP) against a Terminator-like procedure with an extension for fairness [21] (Term.). The set of benchmarks contains examples from “[...] the I/O subsystem of the windows kernel, the back-end infrastructure of the PostgreSQL database server, and the Apache web server”, as well as “some toy examples”. As the tools that were used in [23] are not publicly available, we could not re-run their implementations on our machine. Therefore, the results in the columns “Term.” and “DP” are verbatim from the original publication.

We could solve most of the benchmarks in under five seconds. Notable exceptions are “Windows OS 5”, where the other tools run into a timeout, and “Windows OS 8” where we performed much slower than DP. The cause of the OOM result in “Apache accept()” is still unclear to us, but we suspect a bug in our tool.

In many instances with liveness properties we did not need to provide a ranking function, because the generalization from traces that are infeasible because of infeasible finite prefixes already excluded all fair traces of the Büchi program. For the remainder, the termination arguments were no challenge, except for “Windows OS 8”: we had difficulty generalizing from many terminating traces, which also resulted in the slowdown compared to DP.

The expected increase in size of the Büchi program compared to the initial program’s CFG (Inc.) was also manageable. Interestingly, in both instances of “Toy linear arith.” the product was even smaller than the original CFG, because we could remove many infeasible edges.

On four benchmarks the results of Ultimate LTLAutomizer differ from the data in [23]. We contacted the authors and confirmed that our result for “Toy linear arith. 1” is indeed correct. We also could not run the benchmark “Windows OS 4”, because the LTL property contained variables that were not defined in the source file. We have not yet received a response regarding this issue or regarding the correctness of our results in the other three instances.

Table 2. Results of Ultimate LTLAutomizer on other benchmark sets. “RERS” are the on-site problems from “The RERS Grey-Box Challenge 2012” [39] and “coolant” consists of toy examples modeled after real-world embedded systems with specifications based on the LTL patterns described in [53]. Each program set contains pairs of a file and a property. “Avg. Lines” states the average lines of code in the sample set, and \(|\text {Set}|\) the number of file-property pairs. In the next five columns we use the same symbols as in Table 1 except for \(\bigstar \), which represents abnormal termination of Ultimate LTLAutomizer. The last four columns show the average runtime, the average number of refinements with \(\mathsf refine _{F}\) and \(\mathsf refine _{\omega }\), and how much the size of the optimized product increased on average compared to the original CFG. We used the same timeout and memory limits as in Table 1.

We also considered two other benchmark sets (see Table 2). First, we ran the on-site problems from the RERS Grey-Box Challenge 2012 [39] (RERS). RERS compares program verification techniques on a domain of problems similar to those seen in embedded systems engineering. To this end, the organizers generate control-flow-intensive programs that contain a so-called ECA engine (event-condition-action): one non-terminating while loop which first reads an input symbol, then calls a function that calculates an output symbol based on the current state and the input, and finally writes this output symbol. We took all 6 problem classes from the on-site part of the challenge and tried to solve them with our tool. The classification (P14 to P19) encodes the size and assumed difficulty of the problem class: P14 and P15 are small, P16 and P17 are medium, and P18 and P19 are large problems. Within a size bracket, the larger number indicates a higher difficulty.
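The shape of such an ECA engine can be illustrated with a minimal Python sketch. It is not taken from the RERS benchmarks: the transition function `step`, its states, and its input and output symbols are hypothetical and serve only to show the read, compute, write structure of the loop.

```python
from itertools import cycle, islice
from typing import Iterable, Iterator, Tuple


def step(state: int, symbol: str) -> Tuple[int, str]:
    """Hypothetical transition function: (state, input) -> (new state, output)."""
    if state == 0 and symbol == "A":
        return 1, "X"
    if state == 1 and symbol == "B":
        return 0, "Y"
    return state, "Z"  # default action: keep the state, emit a filler output


def eca_engine(inputs: Iterable[str], state: int = 0) -> Iterator[str]:
    """One loop that reads an input, computes an output, and writes it.

    The loop runs for as long as inputs arrive, so it does not terminate
    on an infinite input stream.
    """
    for symbol in inputs:                    # read an input symbol
        state, output = step(state, symbol)  # compute from state and input
        yield output                         # write the output symbol


# An infinite input stream; we only observe the first four outputs.
first_outputs = list(islice(eca_engine(cycle("AB")), 4))
```

A liveness property over such a program typically speaks about outputs that must eventually occur on every infinite input stream, which is why finite testing of the loop is not sufficient.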

We were able to verify roughly 43 % of the RERS benchmarks without any modifications. The RERS set also helped us find a bug that one of our optimizations on the Büchi program product introduced and which is responsible for all but four of the \(\bigstar\) results. For the remaining four examples, \(\bigstar\) occurred because Ultimate LTLAutomizer was unable to synthesize a ranking function. Interestingly, the RERS benchmarks seldom required generalizations with \(\mathsf{refine}_{\omega}\). In most cases, \(\mathsf{refine}_{F}\) already excluded all fair traces from the Büchi program. This trend can also be observed in the number of \(\mathsf{refine}_{\omega}\) applications on the benchmarks that timed out (not shown in Table 2).

Second, we used a small toy example modeled after an embedded system, a coolant facility controller that consists of two potentially non-terminating loops in succession. The first loop polls the user for the input of a sane temperature limit (in all but one version of the coolant controller, this step can loop infinitely if the input is not suitable). The second loop polls the temperature, does some calculations, increments a counter, and sets the “spoiled goods” flag if the temperature limit is exceeded. The LTL properties specify that the spoiled variable cannot be reset by the program (safety), that setup stages occur in the correct order (safety and liveness), and that the temperature controlling loop always progresses (safety and liveness). We then introduced various bugs into the original version of the program and checked each version against the property and its negation. Although the coolant examples are quite small, they contain complex inter-dependencies between traces, which led to timeouts in two cases.
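To make the structure of this example concrete, the following Python sketch mimics the two loops on finite input sequences. It is an illustration only and not the benchmark source: the bounds `min_limit` and `max_limit`, the trace format, and the function names are assumptions, and the unspecified calculations of the second loop are omitted.

```python
from typing import Iterable, List, Tuple

# One trace entry: (stage, counter, spoiled flag).
Entry = Tuple[str, int, bool]


def coolant_controller(limit_inputs: Iterable[int],
                       temperatures: Iterable[int],
                       min_limit: int = -20,
                       max_limit: int = 10) -> List[Entry]:
    """Simulate the two loops and record the observable state after each step."""
    trace: List[Entry] = []
    limit = None
    # Loop 1: poll until a sane limit is entered. On an endless stream of
    # unsuitable inputs this loop would never terminate.
    for candidate in limit_inputs:
        trace.append(("setup", 0, False))
        if min_limit <= candidate <= max_limit:
            limit = candidate
            break
    if limit is None:
        return trace  # inputs exhausted; only possible in a finite simulation
    # Loop 2: poll the temperature, count iterations, latch the spoiled flag.
    counter, spoiled = 0, False
    for temperature in temperatures:
        counter += 1
        if temperature > limit:
            spoiled = True
        trace.append(("control", counter, spoiled))
    return trace


def spoiled_never_reset(trace: List[Entry]) -> bool:
    """Finite-trace check of the safety property: once set, spoiled stays set."""
    seen = False
    for _, _, spoiled in trace:
        if seen and not spoiled:
            return False
        seen = seen or spoiled
    return True


example_trace = coolant_controller([99, 5], [3, 7, 4])
```

Such a simulation can only check the safety property on finite prefixes. The liveness properties, for example that the controlling loop always progresses, concern infinite executions and therefore require the fair-path analysis described in this paper.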

An unexpected result of the evaluation was that the initial size of the program does not seem to determine the performance of the verification, in terms of either time or success rate: the larger programs from P17 and P18 yielded more results and were verified faster than their counterparts from P15 and P16. Also, the effective blow-up due to the product construction is no more than a factor of four, which is still quite manageable.

The benchmark sets together with Ultimate LTLAutomizer are available from [30].

7 Related Work

An earlier approach to LTL software model checking was presented in [21]. There, the authors reduced the problem to fair termination checking. Our work can be seen as an improvement upon this approach, as we also use fair termination checking, but only when it is necessary. We avoid a large number of (more costly) termination checks thanks to our preceding check for infeasible finite prefixes and the resulting generalizing refinement.

In [23], the authors reduce the LTL model checking problem to the problem of checking \(\forall \)CTL by first approximating the LTL formula with a suitable CTL formula, and then refining counterexamples that represent multiple paths by introducing non-deterministic prophecy variables in their program representation. This non-determinism is then removed through a determinization procedure. By using this technique, they try to reduce their dependence on termination proofs, which they identified as the main reason for poor performance of automata-theoretic approaches. Our approach can be seen as another strategy to reduce the reliance on many termination proofs. By iteratively refining the Büchi program with different proof techniques, we often remove complex control structures from loops and thus reduce the strain on the termination proof engine.

There exist various publicly available model checking tools that support both LTL properties and programs, but that are, in contrast to Ultimate LTLAutomizer, limited to finite-state systems. SPIN [38] and Divine [4] are both based on the Vardi-Wolper product [57] for LTL model checking. Divine supports C/C++ via LLVM bytecode; SPIN can be used with different front-ends that translate programs to finite-state models, e.g., with Bandera [28] for Java. NuSMV [18] and Cadence SMV [44] reduce LTL model checking to CTL model checking. NuSMV can use different techniques such as symbolic model checking using symbolic fixed point computation with BDDs, or bounded model checking using MiniSat. Cadence SMV uses the Mu-calculus with additional fairness constraints [15].
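For finite-state systems, the search for a fair path in an already constructed product reduces to a graph problem: a fair path exists if and only if some distinguished (accepting) node is reachable from the initial node and lies on a cycle. The following Python sketch illustrates this on an explicit graph. It is not how SPIN, Divine, or the SMV tools implement the check; the dictionary encoding of the graph and the function names are assumptions made for the example.

```python
from typing import Dict, Hashable, Iterable, Set

Graph = Dict[Hashable, Iterable[Hashable]]


def reachable(graph: Graph, sources: Iterable[Hashable]) -> Set[Hashable]:
    """All nodes reachable from the given sources (sources included)."""
    seen = set(sources)
    stack = list(seen)
    while stack:
        node = stack.pop()
        for successor in graph.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return seen


def has_fair_path(graph: Graph, initial: Hashable,
                  accepting: Iterable[Hashable]) -> bool:
    """True iff some accepting node is reachable and lies on a cycle."""
    for node in reachable(graph, [initial]) & set(accepting):
        # The node is on a cycle iff it is reachable from its own successors.
        if node in reachable(graph, graph.get(node, ())):
            return True
    return False


# Toy product graph: a -> b -> c -> b, plus an unreachable self-loop on d.
toy_graph = {"a": ["b"], "b": ["c"], "c": ["b"], "d": ["d"]}
```

In the infinite-state setting of this paper, the same graph condition only yields candidate fair paths; each candidate must additionally be checked for feasibility, which is where the interpolant-based and ranking-function-based refinements come in.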

8 Conclusion and Future Work

The encoding of the LTL program verification problem through the infeasibility of fair paths in a Büchi program has allowed us to define a sequence of semi-tests which can be scheduled before the full test of infeasibility of an infinite path. The occurrence of a successful semi-test (the proof of infeasibility for a finite prefix, by the construction of a proof of unsatisfiability) makes the full test redundant and avoids the relatively costly construction of a ranking function. Our experiments indicate that the corresponding approach leads to a practical tool for LTL software model checking.
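The scheduling of semi-tests before the full test can be summarized by the following Python sketch. It is a simplification under stated assumptions, not the implementation in Ultimate LTLAutomizer: a fair path is given as a lasso (a stem followed by a loop), only two finite prefixes are tried, and the oracles `prefix_infeasible` and `has_ranking_function` are hypothetical stand-ins for an SMT query and a ranking function synthesizer. The two toy oracles exist only to make the example executable.

```python
from typing import Callable, List, Sequence

Statement = str


def classify_lasso(stem: Sequence[Statement],
                   loop: Sequence[Statement],
                   prefix_infeasible: Callable[[List[Statement]], bool],
                   has_ranking_function: Callable[[Sequence[Statement],
                                                   Sequence[Statement]], bool]) -> str:
    """Decide which kind of refinement excludes the lasso stem.loop^omega."""
    # Semi-tests: try finite prefixes first, shortest first.
    for prefix in (list(stem), list(stem) + list(loop)):
        if prefix_infeasible(prefix):
            return "refine_F"       # the full test is redundant
    # Full test: only reached if no finite prefix is infeasible.
    if has_ranking_function(stem, loop):
        return "refine_omega"
    return "unresolved"             # no proof of infeasibility was found


def toy_prefix_infeasible(prefix: List[Statement]) -> bool:
    # Toy oracle: a prefix is infeasible iff it contains a contradictory guard.
    return "assume false" in prefix


def toy_has_ranking_function(stem: Sequence[Statement],
                             loop: Sequence[Statement]) -> bool:
    # Toy oracle: a loop that decrements x while x > 0 is taken to terminate.
    return "assume x > 0" in loop and "x := x - 1" in loop
```

The order of the checks reflects the design choice discussed above: the comparatively costly ranking function oracle is consulted only for lassos that survive all finite-prefix checks.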

We see several ways to improve performance. We may try to use alternatives to LTL2BA such as SPOT [31]; see [54]. The technique of large block encoding [11], adapted to Büchi programs, may help to reduce memory consumption.