1 Introduction

Model checking is one of the most active research fields within Formal Verification. Recent advances both in Satisfiability Modulo Theories (SMT) [2] and Constrained Horn Clauses (CHC) [26] have significantly increased model checking capabilities [4, 5]. Nonetheless, a wide range of problems still requires attention, such as model checking for nonlinear arithmetic, the search for deep counterexamples, or the analysis of multiple-loop systems.

A significant amount of research in model checking is centered around loop analysis. A large number of different approaches exist, most of which target single-loop programs specifically [10, 19, 30, 37]. Multi-loop approaches are less common, primarily because such systems are harder to analyze than single-loop software due to their complex inner structure of interconnected loops and branching. Nonetheless, developing such approaches is crucial, as multi-loop software is widespread.

One of the critical problems for multi-loop analysis is the presence of deep loops with a large number of iterations. Such loops can significantly slow down the model checking of the whole program. Recently published papers on Transition Power Abstraction (TPA) [9, 10] aim to improve the analysis of deep loops. TPA is driven by SMT, similar to other algorithms like Interpolation-based Model Checking (IMC) [30], Spacer [26], or Lazy Abstraction With Interpolants (LAWI) [31]; however, TPA abstracts over transitions rather than states, over-approximating them and summarizing a sequence of transitions into a single abstract transition. This idea is beneficial for the detection of deep counterexamples because, unlike classic symbolic approaches, the algorithm unfolds loop iterations exponentially faster. TPA also leverages interpolants to abstract the system properties, which allows it to prove the safety of possibly unbounded loops by producing a loop invariant.

TPA was developed for reasoning about single-loop systems. It can still be applied to multi-loop programs using a straightforward transformation that merges multiple loops into a single loop [13]. However, this method loses structural information about the initial program, leading to a potential slowdown of the verification. In this paper, we introduce a novel algorithm that enables effective reasoning over multi-loop programs by applying TPA modularly and incrementally to each loop. It explores every possible execution path of the program, discovering safe transition invariants for each loop along a path and utilizing them during the exploration of other paths. Learned information about the safe states is propagated back and forth through the path being explored, thus contributing to substantial runtime savings. Additionally, our approach efficiently conducts reachability analysis for loops with large numbers of iterations thanks to the use of TPA. Our algorithm handles programs with branching and multiple loops efficiently, as confirmed by our experiments.

Our approach was implemented inside the Golem CHC solver [8]. We experimentally compared the new approach to classical TPA (with multi-loop programs transformed into single-loop programs in advance) and to state-of-the-art tools such as Z3 (Spacer) [32] and Eldarica [23]. The results demonstrate that our modular analysis is able to solve a significant number of multi-loop benchmarks previously unmanageable both by Golem's competitors and by TPA.

The rest of the paper is organized as follows: Sect. 2 provides a brief overview of the terminology and concepts used in this paper. The main contribution of the paper, the TPA-based reachability algorithm for multi-loop programs, is presented in detail in Sect. 3. In Sect. 4, the effectiveness of the approach is evaluated through a series of experiments. Section 5 discusses related work, and Sect. 6 concludes the paper.

2 Preliminaries

Our approach relies on a symbolic program representation that maps the program's control flow to formulas in first-order logic. The set of logic formulas Fla is restricted to Linear Integer Arithmetic (LIA).

While Language. We restrict our attention to programs in the conventional While language [33]. This language has the following meta-variables and categories: n (over integers), x (over variables), a (over arithmetic expressions), b (over boolean expressions), and S (over statements):

$$\begin{aligned} a \,\,{:}{:}\!\!= &\ n\mid \texttt {nondet}() \mid x\mid a_1 + a_2\mid a_1*a_2\mid a_1-a_2 \\b \,\,{:}{:}\!\!= &\ true\mid false\mid a_1 = a_2\mid a_1 \le a_2\mid \lnot b\mid b_1 \wedge b_2 \\S \,\,{:}{:}\!\!= &\ \texttt {assert} (b) \mid \texttt {assume}(b) \mid x:=a\mid \texttt {skip} \mid \\&\ S;S\mid \texttt {if}\ b\ \texttt {then}\ S\ \texttt {else}\ S\mid \texttt {while}\ b\ \texttt {do}\ S \end{aligned}$$

Figure 1 gives an example of a program with two consecutive loops. We present it in the more familiar C syntax, but it could easily be translated to While. We also use this example to illustrate the solving process of our algorithm in Sect. 3.3. The first loop increments both \(x_1\) and \(x_2\) until \(x_1 \ge 300\), also decrementing \(x_3\). The second loop increments \(x_3\) and decrements \(x_1\) until \(x_2 \le x_3\). The safety property of the program is given in the assertion \(x_1 \le 0\). This program is safe: for any initial values of \(x_1\) and \(x_2\), the assertion is satisfied. By changing the assumption \(x_1 \le 100\) to \(x_1 \le 300\), this program can be made unsafe.

Fig. 1. Multi-loop example.
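The program of Fig. 1, as described above, can be rendered in Python as follows. This is a reconstruction from the textual description: the loop guards and the initial bounds (0 ≤ x1 ≤ 100, 0 ≤ x2 ≤ 50, x3 = 0) are our reading of the original C listing and of the reached states reported in Sect. 3.3, not the listing itself.

```python
def program(x1, x2):
    """Two-loop program of Fig. 1 (reconstructed, not the original listing)."""
    assert 0 <= x1 <= 100 and 0 <= x2 <= 50  # assume(...)
    x3 = 0
    while x1 < 300:   # first loop: up to 300 iterations
        x1 += 1
        x2 += 1
        x3 -= 1
    while x2 > x3:    # second loop: up to 650 iterations
        x3 += 1
        x1 -= 1
    assert x1 <= 0    # the safety property
    return x1
```

With the assumption relaxed to \(x_1 \le 300\), the input \(x_1 = 300, x_2 = 0\) skips both loops and reaches the assertion with \(x_1 = 300 > 0\), matching the unsafe variant mentioned above.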

Program Encoding and Cutpoint Graphs. Our representation of the program assumes a global set of variables denoted V. Conventional primed notation is used to represent “next-state” variables. We model multi-loop programs using Cutpoint Graphs (CPG) [6], which offer more compact program representations than classic Control Flow Graphs (CFG). Every node in a CPG (except \( entry \) and \( error \)) represents a loophead in the corresponding program. For every loop-free segment between loopheads, there exists a single corresponding edge in the CPG, even when there are multiple possible paths through the segment.

Definition 1

Given a program \(\mathcal {P}\), its cutpoint graph representation \(G_{\mathcal {P}} =\) \(\langle N, E, L, entry , error \rangle \) is a tuple where N is a finite set of cutpoints (graph nodes) representing loopheads in the program, E is a set of actions between the cutpoints (edges between the nodes) of the form (u, v), where \(u, v \in N \cup \{ entry , error \}\), L is a mapping \(L : E \rightarrow Fla \) from edges to logic formulas over V and \(V'\), representing symbolic encodings of loop-free statements, and \( entry \) and \( error \) are such that \(\forall u \in N: ( error , u) \not \in E \wedge (u, entry ) \not \in E\).

Based on Definition 1, it is possible to represent any program specified in While as a CPG. For example, the CPG for the program in Fig. 1 is given in Fig. 2.

Fig. 2. Cutpoint graph for the program in Fig. 1. Transitions are labeled with constraints from Fla.

We focus on programs without nested loops. Such programs can be represented with cutpoint graphs that do not have any cycles except for the self-loops. This is because our algorithm uses TPA to analyze reachability in individual loops and TPA is designed to handle loops that can be represented as a transition system. The example from Fig. 2 satisfies this condition as it does not have nested loops.

Transition System Reachability Analysis. A transition system can be defined as \(\langle Init, Tr, V \rangle \), where Init is a first-order logic formula representing the initial states of the system, Tr is a transition formula representing the transitions of the system, and V is the set of the system’s variables. A safety problem can then be defined as \(\langle Init, Tr, Bad, V \rangle \), where Bad is a formula representing the states that violate the safety property. Reachability analysis in this context is the search for a path through the transition system that reaches a Bad state.
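As an illustration of such a reachability query, the following sketch decides reachability for a finite-state instance by explicit-state search. This is a toy stand-in of our own: symbolic engines such as TPA manipulate formulas over Init, Tr, and Bad instead of enumerated states.

```python
from collections import deque

def reachable(init, step, bad):
    """Decide reachability in a finite transition system <Init, Tr, Bad, V>.

    init: iterable of initial states; step: maps a state to its successors
    (an explicit encoding of Tr); bad: predicate identifying Bad states."""
    seen = set(init)
    frontier = deque(seen)
    while frontier:
        s = frontier.popleft()
        if bad(s):
            return True          # a Bad state is reachable
        for t in step(s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return False                 # fixpoint reached without meeting Bad
```

For example, with init = [0], step mapping s to (s + 2) mod 10, and Bad being s = 7, only even states are reachable and the query returns False.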

Craig Interpolation. Given two logical formulas (A, B) such that \(A \wedge B\) is unsatisfiable, a Craig interpolant [14] I is a formula that satisfies the following conditions: \(A \rightarrow I\), \(I \wedge B\) is unsatisfiable, and I contains only common variables of A and B. Interpolation can be used to prove the safety of a transition system by over-approximating the set of reachable states [30] or to extract information from an infeasible error path through the program [22].
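For example, over LIA, take \(A = (x \le 0 \wedge y = x)\) and \(B = (y \ge 1)\); their conjunction is unsatisfiable. The formula

$$\begin{aligned} I = (y \le 0) \end{aligned}$$

is a Craig interpolant: \(A \rightarrow I\) holds, \(I \wedge B\) is unsatisfiable, and I mentions only y, the sole variable shared by A and B.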

Transition Power Abstraction. Our algorithm builds on one of the interpolation-based model checking approaches, Transition Power Abstraction [9, 10]. TPA is a model-checking algorithm based on abstracting the transition relation. It takes a safety problem \(\langle Init , Tr , Bad , V \rangle \) as input and decides if any bad state is reachable from some initial state. Moreover, it can return a safe inductive transition invariant if the system is safe or produce a provably reachable subset of \( Bad \) if the system is unsafe. A transition formula \(R(x,x')\) is a transition invariant if \(\forall x,x': Tr ^{*}(x,x') \implies R(x,x')\), where \( Tr ^{*}\) is the reflexive transitive closure of \( Tr \). A transition invariant R is inductive if \(R(x,x') \wedge Tr (x',x'') \implies R(x,x'')\) or if \( Tr (x,x') \wedge R(x',x'') \implies R(x,x'')\). It is safe if \( Init (x) \wedge R(x,x') \wedge Bad (x')\) is unsatisfiable.
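To illustrate these notions on a small example: for the transition formula \( Tr (x,x') = (x' = x + 1)\), the formula

$$\begin{aligned} R(x,x') = (x \le x') \end{aligned}$$

is a transition invariant, since any number of increments can only increase x. It is inductive, as \(x \le x' \wedge x'' = x' + 1 \implies x \le x''\). For \( Init = (x = 0)\) and \( Bad = (x < 0)\), it is also safe, because \(x = 0 \wedge x \le x' \wedge x' < 0\) is unsatisfiable.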

One of the most important properties of TPA is its ability to efficiently execute deep reachability checks during the search for counterexamples. TPA runs iteratively, using transition abstractions \(ATr^{\le n}\) instead of exact transitions. \(ATr^{\le n}\) over-approximates sequences of up to \(2^n\) transitions in the n-th iteration of TPA, allowing the algorithm to double the number of considered transitions in every iteration. For more details on TPA, we refer the reader to [9, 10].
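The doubling can be illustrated on finite relations, where composing a relation with itself squares the number of covered steps. The sketch below (our illustration, not TPA itself) computes exact k-step relations for k = 1, 2, 4, ..., 2^n; TPA instead composes over-approximating formulas.

```python
def power_relations(rel, states, n):
    """Compute exact k-step successor relations for k = 2^0, ..., 2^n.

    rel: dict mapping each state in `states` to its set of 1-step
    successors. Each composition step squares the covered depth,
    mirroring how TPA doubles the number of considered transitions
    per iteration."""
    def compose(a, b):
        # (a ; b): successors reachable by a-many steps, then b-many steps
        return {s: {u for t in a[s] for u in b[t]} for s in states}

    powers = [rel]
    for _ in range(n):
        powers.append(compose(powers[-1], powers[-1]))
    return powers
```

For a simple counter relation mapping s to s + 1 over {0, ..., 19}, the fourth computed relation directly relates 0 to 8, i.e., it summarizes \(2^3\) transitions in a single step.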

Example Continued. The example program in Fig. 1 has two interesting properties. First, the loops are deep: overall, they can take up to 950 iterations. If this were an unsafe example, TPA would be more efficient than its competitors due to its ability to manage deep loops. Second, in the presence of multiple loops, TPA can verify this program only if the loops are merged into a single loop. The merged loop is bigger and loses structural information about the program, which can slow down verification.

On the other hand, TPA could be applied to the loops separately. However, that would require some intermediate assertion defining the safety property for the first loop and the initial conditions for the second loop. For example, the condition \(x_1 \le x_2 - x_3\) could serve as such an intermediate assertion introduced between lines 3 and 4. Interestingly, the algorithm we present in the next section automatically infers similar helper information and applies TPA modularly.

3 Multi-loop Analysis with TPA

Our algorithm performs forward reachability analysis over the program’s cutpoint graph. It searches for a feasible path from \( entry \) to \( error \), building the path gradually and backtracking when the current path cannot be extended further. Before backtracking from a blocked state, it generalizes the reason for the conflict and learns blocking lemmas (similar to IC3/PDR-style algorithms). These define states that are guaranteed to be safe (i.e., there is no feasible path to \( error \) from these states), so the algorithm will know to avoid them the next time it reaches the same CPG node.

3.1 Overview

To utilize the strengths of TPA, the algorithm alternates between two phases: i) reasoning about traversing from one loop to another loop, and ii) reasoning about traversing a single loop.

The first phase checks the feasibility of a single (large-block [6]) step in the traditional sense, and it can be reduced to a single SMT check. The second phase, however, attempts to extend the current path by reaching the exit of the current loop in an arbitrary number of its iterations. This effectively means solving a reachability problem for a transition system where the initial states are the currently reached states, the transition relation encodes one iteration of the loop, and the error states are the states at the loop exit not yet blocked by the algorithm. While any algorithm for answering reachability queries over transition systems could be applied here, TPA [9] has two advantages over traditional, state-focused model-checking algorithms. First, its deep exploration makes it less likely to get stuck in a single loop that requires many iterations to traverse. Second, TPA is able to re-use bounded and unbounded transition invariants learned in previous queries to speed up the current query to the same node. Note that when the algorithm reaches the same node but with a new state, the initial states (and possibly also the error states) of the reachability problem change, but the transition relation always stays the same. Thus, the transition invariants from previous queries are still valid, while state invariants would very likely be invalidated.

3.2 Core Algorithm

Algorithm 1 (pseudocode listing)

Algorithm 1 takes as input a CPG of a program with a safety property and decides if the error node is reachable (UNSAFE) or not (SAFE). For each node v in the graph, the algorithm keeps track of two versions of the node, \({v}^{ pre }\) and \({v}^{ post }\), called the pre-state and the post-state, resp. The pre-state captures when the reachability analysis has reached v from another node. In programs, this represents execution reaching the loop header for the first time. The post-state captures when the reachability analysis is about to exit node v and continue to another node. In programs, this represents the execution exiting the loop. Each node version keeps track of a set of states already shown to be safe, denoted as \({v}^{ pre }. safe \) and \({v}^{ post }. safe \). These sets of states are represented as symbolic formulas initialized as \(\bot \) (no states are proved safe at the start).

The algorithm also maintains the current feasible path prefix in the variable \( path \) as a stack of entries of the form \([v, \varphi ]\) representing that the set of states \(\varphi \) has been reached at node v. At the beginning, \( path \) is initialized as leaving the entry node with no restriction on the states (Line 1). When \( error \) is added to \( path \), the algorithm has discovered a feasible path from \( entry \) to \( error \) and the program is unsafe (Line 15). If the algorithm ever backtracks beyond the initial entry (\( path \) becomes empty), there is no feasible path from entry to error, and the program is safe (Line 25). Assuming \( path \) is not empty, the algorithm attempts to extend the current feasible path prefix. There are two distinct cases. If the last entry on the path is a post-state of some node v (Line 4), the algorithm attempts to use v’s outgoing edges (ignoring the self-loop edge) to traverse to the pre-state of a different node w. Otherwise, the last entry is a pre-state of some node v (Line 18) and the algorithm attempts to get to the post-state of v by traversing v’s self-looping edge some arbitrary number of times. Next, we describe these two cases in detail.

Post: When the algorithm is leaving some node v with reached states \( curr \) (Line 4), it searches for an unblocked outgoing edge as a candidate for extending the current path prefix (Line 5). An edge e is marked as blocked if the current path prefix cannot be extended with this edge, and the algorithm remembers the set of blocked source states (states from which it is not feasible to traverse the edge) in \(e. blocked \). Algorithm 1 ensures that the blocked states are a superset of \( curr \).

If all outgoing edges are blocked, then all of them have been considered as possible extensions, but all eventually failed. The current path thus cannot be extended, and the algorithm backtracks to the pre-state of v (Line 7) to try a different continuation from that point. Before backtracking, the algorithm learns a new set of safe states as the intersection of the states that are safe for the individual outgoing edges (these are guaranteed to include all currently reached states) and unblocks all edges (Lines 6–8).

If, on the other hand, there is an unblocked edge (Line 10), the algorithm attempts to reach some potentially unsafe state of the edge’s target node. The feasibility of this traversal, given the constraint of the edge, is checked in Lines 12–17, which, for simplicity, we call TraverseBridge throughout the rest of the paper. It decides if some potentially unsafe states are reachable and computes a set of definitely reached target states (in case of reachability) or a set of definitely blocked source states (in case of unreachability). If the traversal is feasible, the path is extended (Line 14), and the analysis continues from the newly reached point unless \( error \) has been reached, in which case the algorithm immediately terminates (Line 15). If the traversal is infeasible, the picked edge is blocked (Line 17), marked with a superset of \( curr \) for which the traversal is infeasible (see the details on TraverseBridge below). In the next iteration, the algorithm tries to pick a different, unblocked edge.

Pre: When the algorithm is entering some node v with reached states \( curr \) (Line 18), it attempts to find a feasible traversal of the loop, i.e., to reach some potentially unsafe post-state of v (taking an arbitrary number of loop iterations). The feasibility of this traversal is checked in Lines 19–24, which for simplicity we call TraverseLoop. Similarly to TraverseBridge, TraverseLoop not only decides the feasibility of the traversal but also computes a set of definitely reached target states to extend the current path (Line 21) or a set of definitely blocked source states, which forces backtracking (Line 24). We provide further details on TraverseBridge and TraverseLoop in the next two paragraphs.
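The interplay of the two cases can be sketched as follows. This is a hypothetical, explicit-state rendering of Algorithm 1 of our own making (the line numbers in the text refer to the paper's symbolic listing, not to this sketch); TraverseBridge and TraverseLoop are inlined as exact set computations instead of SMT and TPA queries.

```python
def analyze(nodes, edges, succ):
    """Explicit-state skeleton of the multi-loop analysis.

    nodes: cutpoints; edges: pairs over nodes + ['entry', 'error'],
    where a self-loop (v, v) encodes one loop iteration;
    succ[e]: maps a state to its successor set under the edge label L(e)."""
    pre_safe = {v: set() for v in nodes}
    post_safe = {v: set() for v in nodes + ['entry']}
    blocked = {}                             # edge -> blocked source states
    path = [('entry', 'post', {None})]       # leave entry; None is a dummy state
    while path:
        node, kind, curr = path[-1]
        if kind == 'post':                   # leaving node: pick a bridge edge
            out = [e for e in edges if e[0] == node and e[1] != node]
            free = [e for e in out if e not in blocked]
            if not free:                     # everything blocked: learn, backtrack
                post_safe[node] |= (
                    set.intersection(*(blocked[e] for e in out)) if out else set(curr))
                for e in out:
                    del blocked[e]
                path.pop()
                continue
            e = free[0]                      # "TraverseBridge": one step of L(e)
            tgt_safe = pre_safe.get(e[1], set())
            reached = {t for s in curr for t in succ[e](s) if t not in tgt_safe}
            if reached:
                if e[1] == 'error':
                    return 'UNSAFE'
                path.append((e[1], 'pre', reached))
            else:
                blocked[e] = set(curr)
        else:                                # entering node: "TraverseLoop"
            loop = succ.get((node, node), lambda s: set())
            closure, frontier = set(curr), list(curr)
            while frontier:                  # states reachable via loop iterations
                s = frontier.pop()
                for t in loop(s):
                    if t not in closure:
                        closure.add(t)
                        frontier.append(t)
            reached = closure - post_safe[node]
            if reached:                      # some potentially unsafe exit state
                path.append((node, 'post', reached))
            else:                            # loop proved safe from curr
                pre_safe[node] |= set(curr)
                path.pop()
    return 'SAFE'                            # backtracked past entry
```

On a one-loop CPG whose loop counts from 0 up to 3, the skeleton answers SAFE if the error edge requires counter value 10 and UNSAFE if it requires value 3.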

TraverseBridge. Given reached states \( curr \), target states \(\lnot ({w}^{ pre }. safe )\), and a transition constraint L((v, w)), the goal is to check if any target states are reachable from the source states with one step of the transition constraint. The reachability check then amounts to a satisfiability check for the conjunction of the three formulas (denoted \(\varphi \) to simplify writing). The provably reached states can be defined exactly as \(\exists x : \varphi \). To avoid quantifiers, we under-approximate the set of reached states with model-based projection (MBP) [26]. The provably blocked states can be characterized as \(\lnot \exists x' : L((v, w))(x,x') \wedge \lnot ({w}^{ pre }. safe )(x')\). It is again possible to avoid quantifiers but still obtain a generalization of the source states, using Craig interpolation [14].

TraverseLoop. Given reached states \( curr \), target states \(\lnot ({v}^{ post }. safe )\), and a transition constraint L((vv)), the goal is to check if any target states are reachable from source states with any number of steps of L((vv)). This is equivalent to deciding a safety problem for a transition system \(\mathcal {S} = \langle Init , Tr , Bad \rangle \) with \( Init = curr \), \( Tr = L((v,v))\) and \( Bad = \lnot ({v}^{ post }. safe )\). TPA can easily satisfy the additional requirements on TraverseLoop. It already internally computes provably reached states as part of the witness for reachability. Provably blocked states can be computed using a safe transition invariant that TPA computes as a witness for unreachability. Similarly to TraverseBridge, we leverage Craig interpolation to eliminate quantifiers. Note that computing a logically weak (more general) interpolant for \(A = curr \) and \(B = TInv \wedge \lnot ({v}^{ post }. safe )\) yields a potentially much larger set of blocked states than the source states themselves.

Using TPA for implementing TraverseLoop has the additional advantage that TPA learns bounded and unbounded transition invariants during a single reachability check, which can be leveraged to bootstrap the transition abstractions in future reachability queries for the same loop. Not starting from scratch has the potential to significantly speed up subsequent queries.

3.3 Running Example

To demonstrate the execution of Algorithm 1, we use the motivating example from Fig. 2 as input. The execution is depicted in Fig. 3.

Initially, the algorithm attempts to leave \( entry \) and picks the single (unblocked) outgoing edge leading to \(s_0\). Potentially unsafe states at \({s_0}^{ pre }\) are \(\top \) at this point, so TraverseBridge computes reached states at \({s_0}^{ pre }\) to be \(0 \le x_1 \le 100 \wedge 0 \le x_2 \le 50 \wedge x_3 = 0\).

As the next step, the algorithm attempts to traverse loop \(s_0\) with \(\top \) as the potentially unsafe states at \({s_0}^{ post }\). TraverseLoop determines that after one loop iteration the state \(1 \le x_1 \le 101 \wedge 1 \le x_2 \le 51 \wedge x_3 = -1\) is reached at \({s_0}^{ post }\).

Attempting to continue from this state will now fail, because the only outgoing edge to \(s_1\) is not feasible, as determined by TraverseBridge with \(x_1 < 300\) being the blocked states.

The algorithm now backtracks to \({s_0}^{ pre }\) and attempts to traverse loop \(s_0\) again, but this time with \(x_1 \ge 300\) as the potentially unsafe states. Here TPA quickly determines that unsafe states are reachable, e.g. after 255 iterations of the loop the state \(x_1 = 300 \wedge 255 \le x_2 \le 305 \wedge x_3 = -255\) is reached.

From this state at \({s_0}^{ post }\) it is possible to traverse to \({s_1}^{ pre }\), reaching states defined by the same formula. Next, the algorithm attempts to traverse loop \(s_1\).

Similarly to how it behaved for the first loop, TPA suggests exiting the second loop after one iteration, in state \(x_1 = 299 \wedge 255 \le x_2 \le 305 \wedge x_3 = -254\). However, when checking the single outgoing edge to \( error \), TraverseBridge determines the infeasibility of this attempt and computes \(x_2 > x_3\) as safe states at \({s_1}^{ post }\).

Thus, the algorithm backtracks again and attempts to traverse loop \(s_1\) in a different way so that it ends up in a potentially unsafe state \(x_2 \le x_3\). Here TPA quickly determines that such a state can be reached after 511 iterations, with variable values \(x_1 = -211 \wedge x_2 = 256 \wedge x_3 = 256\). However, this path cannot reach \( error \), as determined by TraverseBridge with \(x_1 \le 0\) determined to be safe states at \({s_1}^{ post }\).

In the final attempt to traverse \(s_1\), TPA determines that no unsafe state is reachable anymore and computes \(x_1 \le 384 \wedge 384 \le x_2 - x_3\) as new safe states at \({s_1}^{ pre }\). Thus, the algorithm backtracks again and tries to find a different way to reach unsafe states of \({s_1}^{ pre }\) from \({s_0}^{ post }\). TraverseBridge determines this to be impossible, with \(x_1 \le 384 \wedge 384 \le x_2 - x_3\) being safe at \({s_0}^{ post }\) as well. Note that this condition can be viewed as an intermediate assertion between the two loops (as we briefly mentioned in Sect. 2). It is sufficient to prove that \( error \) cannot be reached by traversing the second loop, and, as we will see in a moment, it cannot be violated by traversing the first loop.

After backtracking to \({s_0}^{ pre }\), an attempt to traverse loop \(s_0\) while avoiding the safe states at \({s_0}^{ post }\) fails, as TPA in TraverseLoop determines that \(x_1 \le 108 \wedge x_3 - x_2 \le 0\) are safe states at \({s_0}^{ pre }\). Finally, the algorithm backtracks to \({ entry }^{ post }\) and, with no new feasible way to extend the path, it concludes safety.

Fig. 3. Algorithm execution flow for Fig. 2.

3.4 Correctness

We first prove correctness when Algorithm 1 answers UNSAFE.

Theorem 1

When Algorithm 1 returns UNSAFE, there exists a feasible path from \( entry \) to \( error \).

Proof

We show by induction that for every entry \([v, \varphi ]\) that is added to \( path \), the states \(\varphi \) at node v are reachable from \( entry \). This claim trivially holds for the initial entry \([{ entry }^{ post }, \top ]\) added on Line 1. New entries are added to \( path \) at Lines 14 and 21. It follows from the properties of TraverseBridge and TraverseLoop that the new reached states added to \( path \) are indeed reachable from the previous entry in \( path \).

Next, we prove the correctness of the SAFE answer using some auxiliary lemmas.

Lemma 1

The following is an invariant of the algorithm: For each node \(v \in N\) and each state \(s \in v. safe \) there is no feasible path from \( entry \) to \( error \) going through [vs].

Proof

Initially, all sets of safe states are empty (\(\bot \)), so the invariant holds trivially. Sets of safe states are extended at two points: Line 6 and Line 23.

On Line 23, the set of safe states for node \({v}^{ pre }\) is extended with the blocked states from TraverseLoop. TraverseLoop ensures that the blocked states are a superset of the currently reached states in \({v}^{ pre }\) that is guaranteed to only reach safe states of \({v}^{ post }\). Thus, this extension of safe states preserves the invariant.

On Line 6, the set of safe states for node \({v}^{ post }\) is extended with the intersection of the blocked states computed for each of v’s outgoing edges. TraverseBridge ensures that the blocked states computed for an edge are a superset of the currently reached states in \({v}^{ post }\) that is guaranteed to only reach safe states of w, the target of the edge. Thus, this extension of safe states preserves the invariant, too.

Lemma 2

When an entry \([v, \varphi ]\) is about to be popped from \( path \) (Lines 7 and 24), the current path cannot be extended to a feasible path from \( entry \) to \( error \).

Proof

The proof is analogous to the proof of Lemma 1. The entry \([v, \varphi ]\) is popped on Line 7 (Line 24) when a superset of \(\varphi \) is added to the safe states of \({v}^{ post }\) (\({v}^{ pre }\)) on Line 6 (Line 23). This means exactly that the current path prefix cannot be extended to a feasible path.

Theorem 2

When Algorithm 1 returns SAFE, there is no feasible path from \( entry \) to \( error \).

Proof

Follows directly from Lemma 2 because Algorithm 1 returns SAFE when the initial entry \([ entry ,\top ]\) is popped from \( path \).

3.5 Witness Production

Here we show that Algorithm 1 can be extended to produce witnesses for both safe and unsafe programs (if it terminates).

Violation Witnesses. We show how a witness can be computed from \( path \) constructed by Algorithm 1. We use the standard notion of a violation witness as a counterexample path defined by a sequence of program states.

Definition 2

(Violation Witness). Given a CPG \(G_{\mathcal {S}} =\langle N, E, L, entry , error \rangle \), a violation witness is an execution trace \([s_1, ..., s_n]\) such that

  • for each \(i\in [1,n]\), the tuple \(s_i = \langle v_i, st _i \rangle \) where \(v_i \in N\) and \( st _i\) is a program state, i.e., an assignment to all program variables,

  • \(s_1 = \langle entry , \top \rangle \) and \(s_n = \langle error , q \rangle \) for some \(q \ne \bot \),

  • for each consecutive pair \(\langle v_i, st _i \rangle \) and \(\langle v_{i+1}, st _{i+1} \rangle \), \((v_i, v_{i+1}) \in E\) and \(L((v_i, v_{i+1}))( st _i, st _{i+1})\) is satisfiable.

When Algorithm 1 decides the input CPG to be unsafe (Line 15), the entries in \( path \) form a blueprint for the violation witness. It defines exactly which loops the counterexample traverses and in what order. However, the missing information is how many iterations are taken in each loop and what the intermediate states of the program are for those iterations. Fortunately, when TPA determines that target states are reachable, it also computes how many steps are required. This number of loop iterations can be stored and used at the end to reconstruct the full execution trace. The blueprint from \( path \), combined with the precise number of unrollings of each loop, defines the full step-by-step execution trace. To obtain concrete states at each execution step, an SMT query can be formed from the transitions defined by the trace, and the concrete program states can be obtained directly from a model for such a query.
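A deterministic, explicit-state sketch of this reconstruction follows (our illustration; the paper instead obtains the intermediate states from a model of the SMT query).

```python
def reconstruct_trace(init_state, blueprint, self_step, bridge_step):
    """Expand a counterexample blueprint into a full execution trace.

    blueprint: [(node, unrollings), ...] in path order, where `unrollings`
    is the loop-iteration count recorded by TPA for that node;
    self_step(node, st): executes one iteration of node's loop;
    bridge_step(u, v, st): executes the bridge edge (u, v)."""
    st = init_state
    trace = [(blueprint[0][0], st)]
    for idx, (node, unrollings) in enumerate(blueprint):
        for _ in range(unrollings):          # unroll the loop at `node`
            st = self_step(node, st)
            trace.append((node, st))
        if idx + 1 < len(blueprint):         # take the bridge to the next node
            nxt = blueprint[idx + 1][0]
            st = bridge_step(node, nxt, st)
            trace.append((nxt, st))
    return trace
```

For instance, with loop bodies mimicking Fig. 1 and identity bridge edges, the blueprint [('a', 1), ('b', 2)] expands into the full sequence of intermediate program states.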

Safety Witnesses. We use inductive invariants as safety witnesses.

Definition 3

(Safety Witness). Given a CPG \(G_{\mathcal {S}} =\langle N, E, L, entry , error \rangle \), a safety witness is a mapping \( Inv : N \mapsto Fla \) from loops to state formulas such that \( Inv ( entry ) = \top \), \( Inv ( error ) = \bot \), and \(\forall (v, u) = e \in E : Inv (v) \wedge L(e) \implies Inv (u)\).

Note that this definition includes the requirement that \( Inv (v)\) is an inductive invariant because the condition must hold also for self-loop edges (vv).

We show how to compute inductive invariants from the information computed by Algorithm 1. Recall that the algorithm computes for each loop v the sets of safe states \({v}^{ pre }.safe\) and \({v}^{ post }.safe\). We can compute a safety witness by computing, separately for each loop v, a safe inductive invariant for the reachability problem \(\langle {v}^{ pre }.safe, L((v,v)), \lnot {v}^{ post }.safe \rangle \) (which we know is safe).

Lemma 3

Suppose \( Inv (v)\) is a safe inductive invariant for a reachability problem \(\langle {v}^{ pre }.safe, L((v,v)), \lnot {v}^{ post }.safe \rangle \) for all \(v \in N\). Then \( Inv \) is a safety witness according to Definition 3, i.e., \(\forall (u, v) = e \in E : Inv (u) \wedge L(e) \implies Inv (v)\).

Proof

Each \( Inv (v)\) is, by construction, an inductive invariant for its corresponding loop v. We show that these invariants are inductive also with respect to transitions between loops.

Consider an edge \(e = (u,v)\) with \(u \ne v\). Since \( Inv (u)\) is a safe inductive invariant for the reachability problem \(\langle {u}^{ pre }.safe, L((u,u)), \lnot {u}^{ post }.safe \rangle \), it follows that \( Inv (u) \implies {u}^{ post }.safe\). Moreover, we know that \({u}^{ post }.safe \wedge L(e) \implies {v}^{ pre }.safe\) is valid based on how the set of safe states is constructed in Algorithm 1: only those states at u that cannot reach states outside of \({v}^{ pre }.safe\) are ever added to \({u}^{ post }.safe\). Finally, \({v}^{ pre }.safe \implies Inv (v)\) is valid by construction of \( Inv (v)\) as the inductive invariant for \(\langle {v}^{ pre }.safe, L((v,v)), \lnot {v}^{ post }.safe \rangle \). All three implications together yield the desired property \( Inv (u) \wedge L(e) \implies Inv (v)\).

4 Evaluation

We have implemented Algorithm 1 in our Golem CHC solver [8]; we refer to this implementation as Golem-Multiloop. In the experiments, Golem-Multiloop is compared with the state-of-the-art tools Z3-Spacer (v4.13.0) [26, 32] and Eldarica (v2.1.0) [23], as well as with the existing TPA and Spacer engines of Golem (denoted Golem-TPA and Golem-Spacer). The benchmarks are centered specifically around multi-loop instances. All experiments were conducted on a machine with an AMD EPYC 7452 32-core processor and 8\(\,\times \,\)32 GiB of memory.

Fig. 4. Comparison of the performance of Golem-Multiloop with other tools: Z3-Spacer, Eldarica, Golem-Spacer, and Golem-TPA. The plot on the left shows the number of solved SAFE instances over time; the plot on the right shows UNSAFE instances.

The evaluation aims to answer the following two research questions:

  • RQ1: How does the new modular algorithm compare to Golem-TPA running on a transformed single-loop program?

  • RQ2: How does the performance of Golem-Multiloop fare against state-of-the-art tools?

The set of benchmarks (Footnote 2) used in our experiments is partially composed of SV-COMP-23 instances (Footnote 3) (specifically from the 'loops-crafted-1' set) and partially of crafted multi-loop examples. The benchmarks share a common structure: multiple loops connected to each other, without nesting. Our motivating example from Fig. 1 illustrates this structure. The benchmark set consists of 263 safe and 179 unsafe problems.

Quantile plots, shown in Fig. 4, compare the performance of the individual tools on our benchmark set. A data point (x, y) in a plot indicates that the corresponding algorithm solved y problems within time x (in seconds). The results show that Golem-Multiloop outperforms Golem-TPA on both safe and unsafe instances. We attribute the large performance improvement on safe instances to Golem-Multiloop's modularity: while Golem-TPA has to find a single safe transition invariant for the whole (transformed) program, Golem-Multiloop builds separate transition invariants for the individual loops incrementally.

For unsafe problems, the difference between the two approaches is smaller but still significant. We conjecture that the modular nature of Golem-Multiloop also helps it build better, more focused transition abstractions, which in turn allow it to discover the real counterexample faster than Golem-TPA, which has to spend more time refining the abstraction of the monolithic transition relation. To answer RQ1: the incremental and modular nature of Golem-Multiloop delivers significant improvements over applying TPA monolithically to a transformed single-loop program.

Golem-Multiloop also significantly outperforms the state-of-the-art tools. Among the safe problems, Golem-Multiloop solves 198 benchmarks, while the second best, Eldarica, solves 125. However, Eldarica solved 10 safe instances uniquely, demonstrating some orthogonality to our approach. Similar results can be observed on the unsafe benchmarks: Golem-Multiloop solves 23 instances more than Z3-Spacer, the second-best tool, even though Z3-Spacer solved 12 instances uniquely. To answer RQ2: our evaluation shows that Golem-Multiloop significantly improves upon the state of the art, solving more instances than the next-best competitor. Moreover, it is on average 4.1 times and 2.8 times faster than the next-best competitor on unsafe and safe instances, respectively.

Overall, the evaluation demonstrates that our new algorithm successfully handles challenging multi-loop programs, both safe and unsafe. It improves significantly not only over TPA applied to transformed single-loop programs but also over existing state-of-the-art tools.

5 Related Work

Loop analysis is a well-established research area embracing a multitude of approaches; we overview the most relevant ones below.

Loop Summarization. Several techniques aim to produce an abstraction that captures the relationship between the input and output of a loop as a set of symbolic constraints. A loop summary produced this way then replaces the loop in the subsequent analysis of the program. The approaches differ mainly in their use of symbolic abstraction [11, 27, 35] or symbolic execution [21, 36, 37]. All of these approaches are property-agnostic and could thus be more expensive, or less effective, than needed if employed by our approach. By contrast, our technique abstracts loops guided by the safety property.
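As a toy illustration of the summarization idea (not taken from any of the cited techniques; all names here are ours), a loop computing a running sum can be replaced by a closed-form input-output constraint:

```python
def run_loop(n: int) -> int:
    """Concrete loop: sums 1..n one iteration at a time."""
    s, i = 0, 0
    while i < n:
        i += 1
        s += i
    return s

def loop_summary(n: int) -> int:
    """Closed-form summary of the loop's input-output relation:
    s_out = n*(n+1)/2 for n >= 0. A subsequent analysis can use
    this constraint instead of unrolling the loop."""
    return n * (n + 1) // 2

# The summary agrees with the concrete loop on all tested inputs.
assert all(run_loop(n) == loop_summary(n) for n in range(50))
```

Note that the summary is computed once, independently of any property to be checked, which is exactly the property-agnostic behaviour discussed above.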

Loop Acceleration. A group of related techniques produces quantifier-free first-order formulas that under-approximate loop behaviours [3, 12, 19, 20]. They are motivated by, and applied within, verification approaches to improve scalability. We are, however, not aware of any such technique that is applicable to complicated loops with control-flow divergence, or to loops over datatypes more complicated than integers.
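A minimal sketch of acceleration (our own illustration, not any cited algorithm): for the loop body x := x + 3, the transitive closure is \(\exists k \ge 0.\; x' = x + 3k\), which reduces to a quantifier-free arithmetic check once k is eliminated:

```python
def reaches_by_iteration(x0: int, target: int, limit: int = 10**6) -> bool:
    """Naive check: execute the loop body x := x + 3 step by step."""
    x = x0
    for _ in range(limit):
        if x == target:
            return True
        x += 3
    return x == target

def reaches_accelerated(x0: int, target: int) -> bool:
    """Accelerated check: solve 'exists k >= 0. target = x0 + 3*k'
    arithmetically, with no loop unrolling at all."""
    diff = target - x0
    return diff >= 0 and diff % 3 == 0

# Both checks agree, but the accelerated one is iteration-count independent.
assert reaches_by_iteration(1, 301) == reaches_accelerated(1, 301) == True
assert reaches_accelerated(1, 302) is False
```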

Invariant Generation. An older but more popular and widely used technique in program analysis is the automated generation of inductive invariants. Intuitively, it aims to generate an over-approximation of all states reachable after one loop iteration started from that same over-approximation, i.e., to reach a fixpoint. There are multiple approaches to generating invariants, e.g., based on CEGAR and predicate abstraction [23, 29], IC3/PDR [26], program transformation [24], syntax-guided synthesis [18], or machine learning and neural networks [25, 34]. One of the most popular approaches to invariant generation is interpolation, which is used in a wide variety of algorithms [9, 23, 26, 28, 29, 30, 31].
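The three standard conditions on an inductive invariant (initiation, consecution, safety) can be illustrated by a brute-force check over a toy finite-state loop; this sketch is ours and all names in it are illustrative:

```python
def is_inductive_invariant(inv, init, trans, bad, states):
    """Checks the three standard conditions on a finite state space:
    initiation  (Init => Inv),
    consecution (Inv /\ Tr => Inv'),
    safety      (Inv => not Bad)."""
    initiation = all(inv(s) for s in states if init(s))
    consecution = all(inv(t) for s in states if inv(s)
                      for t in states if trans(s, t))
    safety = not any(bad(s) for s in states if inv(s))
    return initiation and consecution and safety

# Toy loop over x in 0..20: Init: x = 0, body: x := x + 2, Bad: x odd.
states = range(21)
init = lambda x: x == 0
trans = lambda x, y: y == x + 2
bad = lambda x: x % 2 == 1

# "x is even" is a safe inductive invariant; "x == 0" is not
# inductive, since it is not closed under the transition.
assert is_inductive_invariant(lambda x: x % 2 == 0, init, trans, bad, states)
assert not is_inductive_invariant(lambda x: x == 0, init, trans, bad, states)
```

The techniques cited above differ in how they search for such an invariant symbolically, rather than enumerating states.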

Other Techniques. Some algorithms analyze loops differently, for example, by transforming a loop into a simpler version of itself [15, 16]. These approaches are not directly comparable with our technique, as they simplify loops rather than abstract them.

Multi-loop to Single-Loop Transformation. An important line of work for the analysis of multi-loop systems transforms such systems into a single loop [1, 13, 17]. This makes it possible to apply algorithms such as IMC, TPA, or other single-loop-specific engines [7] to the multi-loop program as a whole.
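A schematic sketch of such a transformation (the details vary across [1, 13, 17]; this version and its names are purely illustrative): sequential loops are merged into one loop whose body is selected by a program-counter variable.

```python
def two_loops(n: int) -> int:
    """Original program: two sequential loops."""
    x = 0
    for _ in range(n):      # loop 1: count up
        x += 1
    y = 0
    while x > 0:            # loop 2: drain x into y
        x -= 1
        y += 2
    return y

def merged_single_loop(n: int) -> int:
    """Transformed program: one loop; pc selects which original
    loop body executes on each iteration."""
    pc, i, x, y = 1, 0, 0, 0
    while pc != 0:
        if pc == 1:         # body of loop 1
            if i < n:
                i += 1; x += 1
            else:
                pc = 2      # fall through to loop 2
        elif pc == 2:       # body of loop 2
            if x > 0:
                x -= 1; y += 2
            else:
                pc = 0      # exit
    return y

# The transformation preserves the input-output behaviour.
assert all(two_loops(n) == merged_single_loop(n) for n in range(10))
```

As discussed in Section 1, the price of this encoding is that the structural information about the individual loops is hidden inside the pc variable, which is what our modular approach avoids.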

6 Conclusion

Our paper introduces a novel approach to the model checking of programs with multiple loops. Its main idea is a modular analysis of the program's loops that propagates information about reachable and blocked states between consecutive loops. Using Transition Power Abstraction for the analysis of individual loops enables the incremental computation and use of transition invariants during verification, which significantly improves the overall performance of the approach. We also proved the correctness of the algorithm and demonstrated how witnesses, both for SAFE and UNSAFE instances, can be generated. The experimental evaluation demonstrates that our algorithm significantly outperforms a straightforward application of TPA, as well as other competitors, in the analysis of multi-loop systems.

As future work, we plan to extend this algorithm to handle multi-loop systems with nested loops. This would significantly broaden the applicability of our approach to the analysis of real-world programs. (Footnote 4)