figure a

1 Introduction

Software used in a wide range of applications use arrays to store and update data, often using loops to read and write arrays. Verifying correctness properties of such array programs is important, yet challenging. A variety of techniques have been proposed in the literature to address this problem, including inference of quantified loop invariants [20]. However, it is often difficult to automatically infer such invariants, especially when programs have loops that are sequentially composed and/or nested within each other, and have complex control flows. This has spurred recent interest in mathematical induction-based techniques for verifying parametric properties of array manipulating programs [11, 12, 42, 44]. While induction-based techniques are efficient and quite powerful, their Achilles heel is the automation of the inductive argument. Indeed, this often becomes the limiting step in applications of induction-based techniques. Automating the induction step and expanding the class of array manipulating programs to which induction-based techniques can be applied forms the primary motivation for our work. Rather than being a stand-alone technique, we envisage our work being used as part of a portfolio of techniques in a modern program verification tool.

We propose a novel and practically efficient induction-based technique that advances the state-of-the-art in automating the inductive step when reasoning about array manipulating programs. This allows us to automatically verify interesting properties of a large class of array manipulating programs that are beyond the reach of state-of-the-art induction-based techniques, viz. [12, 42]. The work that comes closest to us is Vajra [12], which is part of the portfolio of techniques in VeriAbs [1] – the winner of SV-COMP 2021 in the Arrays Reach sub-category. Our work addresses several key limitations of the technique implemented in Vajra, thereby making it possible to analyze a much larger class of array manipulating programs than can be done by VeriAbs. Significantly, this includes programs with nested loops that have hitherto been beyond the reach of automated techniques that use mathematical induction [12, 42, 44].

A key innovation in our approach is the construction of two slightly different versions of a given program that have identical control flow structures but slightly different data operations. We automatically identify simple relations, called difference invariants, between corresponding variables in the two versions of a program at key control flow points. Interestingly, these relations often turn out to be significantly simpler than inductive invariants required to prove the property directly. This is not entirely surprising, since the difference invariants depend less on what individual statements in the programs are doing, and more on the difference between what they are doing in the two versions of the program. We show how the two versions of a given program can be automatically constructed, and how differences in individual statements can be analyzed to infer simple difference invariants. Finally, we show how these difference invariants can be used to simplify the reasoning in the inductive step of our technique.

We consider programs with (possibly nested) loops manipulating arrays, where the size of each array is a symbolic integer parameter \(N~(> 0)\)Footnote 1. We verify (a sub-class of) quantified and quantifier-free properties that may depend on the symbolic parameter N. Like in [12], we view the verification problem as one of proving the validity of a parameterized Hoare triple \(\{\varphi (N)\} \;\mathsf {P}_N\; \{\psi (N)\}\) for all values of \(N~(> 0)\), where arrays are of size N in the program \(\mathsf {P}_N\), and N is a free variable in \(\varphi (\cdot )\) and \(\psi (\cdot )\).

To illustrate the kind of programs that are amenable to our technique, consider the program shown in Fig. 1(a), adapted from an SV-COMP benchmark. This program has a couple of sequentially composed loops that update arrays and scalars. The scalars S and F are initialized to 0 and 1 respectively before the first loop starts iterating. Subsequently, the first loop computes a recurrence in variable S and initializes elements of the array B to 1 if the corresponding elements of array A have non-negative values, and to 0 otherwise. The outermost branch condition in the body of the second loop evaluates to true only if the program parameter N and the variable S have same values. The value of F is reset based on some conditions depending on corresponding entries of arrays A and B. The pre-condition of this program is true; the post-condition asserts that F is never reset in the second loop.

State-of-the-art techniques find it difficult to prove the assertion in this program. Specifically, Vajra [12] is unable to prove the property, since it cannot reason about the branch condition (in the second loop) whose value depends on the program parameter N. VeriAbs [1], which employs a sequence of techniques such as loop shrinking, loop pruning, and inductive reasoning using [12] is also unable to verify the assertion shown in this program. Indeed, the loops in this program cannot be merged as the final value of S computed by the first loop is required in the second loop; hence loop shrinking does not help. Also, loop pruning does not work due to the complex dependencies in the program and the fact that the exact value of the recurrence variable S is required to verify the program. Subsequent abstractions and techniques applied by VeriAbs from its portfolio are also unable to verify the given post-condition. VIAP [42] translates the program to a quantified first-order logic formula in the theory of equality and uninterpreted functions [32]. It applies a sequence of tactics to simplify and prove the generated formula. These tactics include computing closed forms of recurrences, induction over array indices and the like to prove the property. However, its sequence of tactics is unable to verify this example within our time limit of 1 min.

Fig. 1.
figure 1

Motivating examples

Benchmarks with nested loops are a long standing challenge for most verifiers. Consider the program shown in Fig. 1(b) with a nested loop in addition to sequentially composed loops. The first loop initializes entries in array A to 0. The second loop aggregates a constant value in the scalar S. The third loop is a nested loop that updates array A based on the value of S. The entries of A are updated in the inner as well as outer loop. The property asserts that on termination, each array element equals twice the value of the parameter N.

While the inductive reasoning of Vajra and the tactics in VIAP do not support nested loops, the sequence of techniques used by VeriAbs is also unable to prove the given post-condition in this program. In sharp contrast, our prototype tool Diffy is able to verify the assertions in both these programs automatically within a few seconds. This illustrates the power of the inductive technique proposed in this paper.

The technical contributions of the paper can be summarized as follows:

  • We present a novel technique based on mathematical induction to prove interesting properties of a class of programs that manipulate arrays. The crucial inductive step in our technique uses difference invariants from two slightly different versions of the same program, and differs significantly from other induction-based techniques proposed in the literature [11, 12, 42, 44].

  • We describe algorithms to transform the input program for use in our inductive verification technique. We also present techniques to infer simple difference invariants from the two slightly different program versions, and to complete the inductive step using these difference invariants.

  • We describe a prototype tool Diffy that implements our algorithms.

  • We compare Diffy vis-a-vis state-of-the-art tools for verification of C programs that manipulate arrays on a large set of benchmarks. We demonstrate that Diffy significantly outperforms the winners of SV-COMP 2019, 2020 and 2021 in the Array Reach sub-category.

2 Overview and Relation to Earlier Work

In this section, we provide an overview of the main ideas underlying our technique. We also highlight how our technique differs from [12], which comes closest to our work. To keep the exposition simple, we consider the program \(\mathsf {P}_N\), shown in the first column of Fig. 2, where N is a symbolic parameter denoting the sizes of arrays a and b. We assume that we are given a parameterized pre-condition \(\varphi (N)\), and our goal is to establish the parameterized post-condition \(\psi (N)\), for all \(N>0\). In [12, 44], techniques based on mathematical induction (on N) were proposed to solve this class of problems. As with any induction-based technique, these approaches consist of three steps. First, they check if the base case holds, i.e. if the Hoare triple \(\{\varphi (N)\}\;\mathsf {P}_N\;\{\psi (N)\}\) holds for small values of N, say \(1 \le N \le M\), for some \(M > 0\). Next, they assume that the inductive hypothesis \(\{\varphi (N-1)\}\;\mathsf {P}_{N-1}\;\{\psi (N-1)\}\) holds for some \(N \ge M+1\). Finally, in the inductive step, they show that if the inductive hypothesis holds, so does \(\{\varphi (N)\}\;\mathsf {P}_N\;\{\psi (N)\}\). It is not hard to see that the inductive step is the most crucial step in this style of reasoning. It is also often the limiting step, since not all programs and properties allow for efficient inferencing of \(\{\varphi (N)\}\;\mathsf {P}_N\;\{\psi (N)\}\) from \(\{\varphi (N-1)\}\;\mathsf {P}_{N-1}\;\{\psi (N-1)\}\).

Fig. 2.
figure 2

Pictorial depiction of our program transformations

Like in [12, 44], our technique uses induction on N to prove the Hoare triple \(\{\varphi (N)\}\;\mathsf {P}_N\;\{\psi (N)\}\) for all \(N > 0\). Hence, our base case and inductive hypothesis are the same as those in [12, 44]. However, our reasoning in the crucial inductive step is significantly different from that in [12, 44], and this is where our primary contribution lies. As we show later, not only does this allow a much larger class of programs to be efficiently verified compared to [12, 44], it also permits reasoning about classes of programs with nested loops, that are beyond the reach of [12, 44]. Since the work of [12] significantly generalizes that of [44], henceforth, we only refer to [12] when talking of earlier work that uses induction on N.

In order to better understand our contribution and its difference vis-a-vis the work of [12], a quick recap of the inductive step used in [12] is essential. The inductive step in [12] crucially relies on finding a “difference program” \(\partial \mathsf {P}_N\) and a “difference pre-condition” \(\partial \varphi (N)\) such that: (i) \(\mathsf {P}_N\) is semantically equivalent to \(\mathsf {P}_{N-1}; \partial \mathsf {P}_N\), where ‘;’ denotes sequential composition of programsFootnote 2, (ii) \(\varphi (N) \Rightarrow \varphi (N-1)\wedge \partial \varphi (N)\), and (iii) no variable/array element in \(\partial \varphi (N)\) is modified by \(\mathsf {P}_{N-1}\). As shown in [12], once \(\partial \mathsf {P}_N\) and \(\partial \varphi (N)\) satisfying these conditions are obtained, the problem of proving \(\{\varphi (N)\} \; \mathsf {P}_N \; \{\psi (N)\}\) can be reduced to that of proving \(\{\psi (N-1) \wedge \partial \varphi (N)\}\;\partial \mathsf {P}_N\; \{\psi (N)\}\). This approach can be very effective if (i) \(\partial \mathsf {P}_N\) is “simpler” (e.g. has fewer loops or strictly less deeply nested loops) than \(\mathsf {P}_N\) and can be computed efficiently, and (ii) a formula \(\partial \varphi (N)\) satisfying the conditions mentioned above exists and can be computed efficiently.

The requirement of \(\mathsf {P}_N\) being semantically equivalent to \(\mathsf {P}_{N-1};\partial \mathsf {P}_N\) is a very stringent one, and finding such a program \(\partial \mathsf {P}_N\) is non-trivial in general. In fact, the authors of [12] simply provide a set of syntax-guided conditionally sound heuristics for computing \(\partial \mathsf {P}_N\). Unfortunately, when these conditions are violated (we have found many simple programs where they are violated), there are no known algorithmic techniques to generate \(\partial \mathsf {P}_N\) in a sound manner. Even if a program \(\partial \mathsf {P}_N\) were to be found in an ad-hoc manner, it may be as “complex” as \(\mathsf {P}_N\) itself. This makes the approach of [12] ineffective for analyzing such programs. As an example, the fourth column of Fig. 2 shows \(\mathsf {P}_{N-1}\) followed by one possible \(\partial \mathsf {P}_N\) that ensures \(\mathsf {P}_N\) (shown in the first column of the same figure) is semantically equivalent to \(\mathsf {P}_{N-1}; \partial \mathsf {P}_N\). Notice that \(\partial \mathsf {P}_N\) in this example has two sequentially composed loops, just like \(\mathsf {P}_N\) had. In addition, the assignment statement in the body of the second loop uses a more complex expression than that present in the corresponding loop of \(\mathsf {P}_N\). Proving \(\{\psi (N-1) \wedge \partial \varphi (N)\}\;\partial \mathsf {P}_N\; \{\psi (N)\}\) may therefore not be any simpler (perhaps even more difficult) than proving \(\{\varphi (N)\}\; \mathsf {P}_N\;\{\psi (N)\}\).

In addition to the difficulty of computing \(\partial \mathsf {P}_N\), it may be impossible to find a formula \(\partial \varphi (N)\) such that \(\varphi (N) \Rightarrow \varphi (N-1)\,\wedge \,\partial \varphi (N)\), as required by [12]. This can happen even for fairly routine pre-conditions, such as \(\varphi (N) \equiv \big (\bigwedge _{i=0}^{N-1} A[i] = N\big )\). Notice that there is no \(\partial \varphi (N)\) that satisfies \(\varphi (N) \Rightarrow \varphi (N-1)\,\wedge \,\partial \varphi (N)\) in this case. In such cases, the technique of [12] cannot be used at all, even if \(\mathsf {P}_N\), \(\varphi (N)\) and \(\psi (N)\) are such that there exists a trivial proof of \(\{\varphi (N)\}\;\mathsf {P}_N\;\{\psi (N)\}\).

The inductive step proposed in this paper largely mitigates the above problems, thereby making it possible to efficiently reason about a much larger class of programs than that possible using the technique of [12]. Our inductive step proceeds as follows. Given \(\mathsf {P}_N\), we first algorithmically construct two programs \(\mathsf {Q}_{N-1}\) and \(\mathsf {peel}(\mathsf {P}_N)\), such that \(\mathsf {P}_N\) is semantically equivalent to \(\mathsf {Q}_{N-1}; \mathsf {peel}(\mathsf {P}_N)\). Intuitively, \(\mathsf {Q}_{N-1}\) is the same as \(\mathsf {P}_N\), but with all loop bounds that depend on N now modified to depend on \(N-1\) instead. Note that this is different from \(\mathsf {P}_{N-1}\), which is obtained by replacing all uses (not just in loop bounds) of N in \(\mathsf {P}_N\) by \(N-1\). As we will see, this simple difference makes the generation of \(\mathsf {peel}(\mathsf {P}_N)\) significantly simpler than generation of \(\partial \mathsf {P}_N\), as in [12]. While generating \(\mathsf {Q}_{N-1}\) and \(\mathsf {peel}(P_N)\) may sound similar to generating \(\mathsf {P}_{N-1}\) and \(\partial \mathsf {P}_N\) [12], there are fundamental differences between the two approaches. First, as noted above, \(\mathsf {P}_{N-1}\) is semantically different from \(\mathsf {Q}_{N-1}\). Similarly, \(\mathsf {peel}(\mathsf {P}_N)\) is also semantically different from \(\partial \mathsf {P}_N\). Second, we provide an algorithm for generating \(\mathsf {Q}_{N-1}\) and \(\mathsf {peel}(\mathsf {P}_N)\) that works for a significantly larger class of programs than that for which the technique of [12] works. Specifically, our algorithm works for all programs amenable to the technique of [12], and also for programs that violate the restrictions imposed by the grammar and conditional heuristics in [12]. For example, we can algorithmically generate \(\mathsf {Q}_{N-1}\) and \(\mathsf {peel}(\mathsf {P}_N)\) even for a class of programs with arbitrarily nested loops – a program feature explicitly disallowed by the grammar in [12]. Third, we guarantee that \(\mathsf {peel}(\mathsf {P}_N)\) is “simpler” than \(\mathsf {P}_N\) in the sense that the maximum nesting depth of loops in \(\mathsf {peel}(\mathsf {P}_{N})\) is strictly less than that in \(\mathsf {P}_N\). Thus, if \(\mathsf {P}_N\) has no nested loops (all programs amenable to analysis by [12] belong to this class), \(\mathsf {peel}(\mathsf {P}_N)\) is guaranteed to be loop-free. As demonstrated by the fourth column of Fig. 2, no such guarantees can be given for \(\partial \mathsf {P}_N\) generated by the technique of [12]. This is a significant difference, since it greatly simplifies the analysis of \(\mathsf {peel}(\mathsf {P}_N)\) vis-a-vis that of \(\partial \mathsf {P}_N\).

We had mentioned earlier that some pre-conditions \(\varphi (N)\) do not admit any \(\partial \varphi (N)\) such that \(\varphi (N) \Rightarrow \varphi (N-1)\,\wedge \,\partial \varphi (N)\). It is, however, often easy to compute formulas \(\varphi '(N-1)\) and \(\varDelta \varphi '(N)\) in such cases such that \(\varphi (N) \Rightarrow \varphi '(N-1)\,\wedge \,\varDelta \varphi '(N)\), and the variables/array elements in \(\varDelta \varphi '(N)\) are not modified by either \(\mathsf {P}_{N-1}\) or \(\mathsf {Q}_{N-1}\). For example, if we were to consider a (new) pre-condition \(\varphi (N) \equiv \big (\bigwedge _{i=0}^{N-1} A[i] = N\big )\) for the program \(\mathsf {P}_N\) shown in the first column of Fig. 2, then we have \(\varphi '(N-1) \equiv \big (\bigwedge _{i=0}^{N-2} A[i] = N\big )\) and \(\varDelta \varphi '(N) \equiv \big (A[N-1] = N\big )\). We assume the availability of such a \(\varphi '(N-1)\) and \(\varDelta \varphi '(N)\) for the given \(\varphi (N)\). This significantly relaxes the requirement on pre-conditions and allows a much larger class of Hoare triples to be proved using our technique vis-a-vis that of [12].

The third column of Fig. 2 shows \(\mathsf {Q}_{N-1}\) and \(\mathsf {peel}(\mathsf {P}_N)\) generated by our algorithm for the program \(\mathsf {P}_N\) in the first column of the figure. It is illustrative to compare these with \(\mathsf {P}_{N-1}\) and \(\partial \mathsf {P}_N\) shown in the fourth column of Fig. 2. Notice that \(\mathsf {Q}_{N-1}\) has the same control flow structure as \(\mathsf {P}_{N-1}\), but is not semantically equivalent to \(\mathsf {P}_{N-1}\). In fact, \(\mathsf {Q}_{N-1}\) and \(\mathsf {P}_{N-1}\) may be viewed as closely related versions of the same program. Let \(V_\mathsf {Q}\) and \(V_\mathsf {P}\) denote the set of variables of \(\mathsf {Q}_{N-1}\) and \(\mathsf {P}_{N-1}\) respectively. We assume \(V_\mathsf {Q}\) is disjoint from \(V_\mathsf {P}\), and analyze the joint execution of \(\mathsf {Q}_{N-1}\) starting from a state satisfying the pre-condition \(\varphi '(N-1)\), and \(\mathsf {P}_{N-1}\) starting from a state satisfying \(\varphi (N-1)\). The purpose of this analysis is to compute a difference predicate \(D(V_\mathsf {Q}, V_\mathsf {P}, N-1)\) that relates corresponding variables in \(\mathsf {Q}_{N-1}\) and \(\mathsf {P}_{N-1}\) at the end of their joint execution. The above problem is reminiscent of (yet, different from) translation validation [4, 17, 24, 40, 46, 48, 49], and indeed, our calculation of \(D(V_\mathsf {Q}, V_\mathsf {P}, N-1)\) is motivated by techniques from the translation validation literature. An important finding of our study is that corresponding variables in \(\mathsf {Q}_{N-1}\) and \(\mathsf {P}_{N-1}\) are often related by simple expressions on N, regardless of the complexity of \(\mathsf {P}_N\), \(\varphi (N)\) or \(\psi (N)\). Indeed, in all our experiments, we didn’t need to go beyond quadratic expressions on N to compute \(D(V_\mathsf {Q}, V_\mathsf {P}, N-1)\).

Once the steps described above are completed, we have \(\varDelta \varphi '(N)\), \(\mathsf {peel}(\mathsf {P}_N)\) and \(D(V_\mathsf {Q}, V_\mathsf {P}, N-1)\). It can now be shown that if the inductive hypothesis, i.e. \(\{\varphi (N-1)\} \; \mathsf {P}_{N-1}\; \{\psi (N-1)\}\) holds, then proving \(\{\varphi (N)\} \;\mathsf {P}_N\; \{\psi (N)\}\) reduces to proving \(\{\varDelta \varphi '(N) ~\wedge ~ \psi '(N-1) \} \; \mathsf {peel}(\mathsf {P}_N) \; \{\psi (N)\}\), where \(\psi '(N-1) \equiv \exists V_\mathsf {P}\big (\psi (N-1)\,\wedge \,D(V_\mathsf {Q}, V_\mathsf {P}, N-1)\big )\). A few points are worth emphasizing here. First, if \(D(V_\mathsf {Q}, V_\mathsf {P}, N-1)\) is obtained as a set of equalities, the existential quantifier in the formula \(\psi '(N-1)\) can often be eliminated simply by substitution. We can also use quantifier elimination capabilities of modern SMT solvers, viz. Z3 [39], to eliminate the quantifier, if needed. Second, recall that unlike \(\partial \mathsf {P}_N\) generated by the technique of [12], \(\mathsf {peel}(\mathsf {P}_N)\) is guaranteed to be “simpler” than \(\mathsf {P}_N\), and is indeed loop-free if \(\mathsf {P}_N\) has no nested loops. Therefore, proving \(\{\varDelta \varphi '(N) ~\wedge ~ \psi '(N-1)\} \; \mathsf {peel}(\mathsf {P}_N) \; \{\psi (N)\}\) is typically significantly simpler than proving \(\{\psi (N-1)\,\wedge \,\partial \varphi (N)\} \; \partial \mathsf {P}_N \; \{\psi (N)\}\). Finally, it may happen that the pre-condition in \(\{\varDelta \varphi '(N) ~\wedge ~ \psi '(N-1)\} \; \mathsf {peel}(\mathsf {P}_N) \; \{\psi (N)\}\) is not strong enough to yield a proof of the Hoare triple. In such cases, we need to strengthen the existing pre-condition by a formula, say \(\xi '(N-1)\), such that the strengthened pre-condition implies the weakest pre-condition of \(\psi (N)\) under \(\mathsf {peel}(\mathsf {P}_N)\). Having a simple structure for \(\mathsf {peel}(\mathsf {P}_N)\) (e.g., loop-free for the entire class of programs for which [12] works) makes it significantly easier to compute the weakest pre-condition. Note that \(\xi '(N-1)\) is defined over the variables in \(V_\mathsf {Q}\). In order to ensure that the inductive proof goes through, we need to strengthen the post-condition of the original program by \(\xi (N)\) such that \(\xi (N-1)\,\wedge \,D(V_\mathsf {Q}, V_\mathsf {P}, N-1) \Rightarrow \xi '(N-1)\). Computing \(\xi (N-1)\) requires a special form of logical abduction that ensures that \(\xi (N-1)\) refers only to variables in \(V_P\). However, if \(D(V_\mathsf {Q}, V_\mathsf {P}, N-1)\) is given as a set of equalities (as is often the case), \(\xi (N-1)\) can be computed from \(\xi '(N-1)\) simply by substitution. This process of strengthening the pre-condition and post-condition may need to iterate a few times until a fixed point is reached, similar to what happens in the inductive step of [12]. Note that the fixed point iterations may not always converge (verification is undecidable in general). However, in our experiments, convergence always happened within a few iterations. If \(\xi '(N-1)\) denotes the formula obtained on reaching the fixed point, the final Hoare triple to be proved is \(\{\xi '(N-1)\,\wedge \,\varDelta \varphi '(N) ~\wedge ~ \psi '(N-1)\} \; \mathsf {peel}(\mathsf {P}_N) \; \{\xi (N)\,\wedge \,\psi (N)\}\), where \(\psi '(N-1) \equiv \exists V_\mathsf {P}\big (\psi (N-1)\,\wedge \,D(V_\mathsf {Q}, V_\mathsf {P}, N-1)\big )\). Having a simple (often loop-free) \(\mathsf {peel}(\mathsf {P}_N)\) significantly simplifies the above process.

We conclude this section by giving an overview of how \(\mathsf {Q}_{N-1}\) and \(\mathsf {peel}(\mathsf {P}_N)\) are computed for the program \(\mathsf {P}_N\) shown in the first column of Fig. 2. The second column of this figure shows the program obtained from \(\mathsf {P}_N\) by peeling the last iteration of each loop of the program. Clearly, the programs in the first and second columns are semantically equivalent. Since there are no nested loops in \(\mathsf {P}_N\), the peels (shown in solid boxes) in the second column are loop-free program fragments. For each such peel, we identify variables/array elements modified in the peel and used in subsequent non-peeled parts of the program. For example, the variable x is modified in the peel of the first loop and used in the body of the second loop, as shown by the arrow in the second column of Fig. 2. We replace all such uses (if needed, transitively) by expressions on the right-hand side of assignments in the peel until no variable/array element modified in the peel is used in any subsequent non-peeled part of the program. Thus, the use of x in the body of the second loop is replaced by the expression x + N * N in the third column of Fig. 2. The peeled iteration of the first loop can now be moved to the end of the program, since the variables modified in this peel are no longer used in any subsequent non-peeled part of the program. Repeating the above steps for the peeled iteration of the second loop, we get the program shown in the third column of Fig. 2. This effectively gives a transformed program that can be divided into two parts: (i) a program \(\mathsf {Q}_{N-1}\) that differs from \(\mathsf {P}_{N}\) only in that all loops are truncated to iterate \(N-1\) (instead of N) times, and (ii) a program \(\mathsf {peel}(\mathsf {P}_N)\) that is obtained by concatenating the peels of loops in \(\mathsf {P}_N\) in the same order in which the loops appeared in \(\mathsf {P}_N\). It is not hard to see that \(\mathsf {P}_N\), shown in the first column of Fig. 2, is semantically equivalent to \(\mathsf {Q}_{N-1}; \mathsf {peel}(\mathsf {P}_N)\). Notice that the construction of \(\mathsf {Q}_{N-1}\) and \(\mathsf {peel}(\mathsf {P}_N)\) was fairly straightforward, and did not require any complex reasoning. In sharp contrast, construction of \(\partial \mathsf {P}_N\), as shown in the bottom half of fourth column of Fig. 2, requires non-trivial reasoning, and produces a program with two sequentially composed loops.

3 Preliminaries and Notation

We consider programs generated by the grammar shown below:

figure b

Formally, we consider a program \(\mathsf {P}_N\) to be a tuple \((\mathcal {V}, \mathcal {L}, \mathcal {A}, {\mathsf {PB}}, N)\), where \(\mathcal {V}\) is a set of scalar variables, \(\mathcal {L} \subseteq \mathcal {V}\) is a set of scalar loop counter variables, \(\mathcal {A}\) is a set of array variables, \({\mathsf {PB}}\) is the program body, and N is a special symbol denoting a positive integer parameter of the program. In the grammar shown above, we assume that \({A} \in \mathcal {A}\), \({v} \in \mathcal {V} \setminus \mathcal {L}\), \({\ell } \in \mathcal {L}\) and \({\mathsf {c}}\in \mathbb {Z}\). We also assume that each loop \({\mathsf {L}}\) has a unique loop counter variable \(\ell \) that is initialized at the beginning of \({\mathsf {L}}\) and is incremented by 1 at the end of each iteration. We assume that the assignments in the body of \({\mathsf {L}}\) do not update \(\ell \). For each loop \(\mathsf {L}\) with termination condition \(\ell < \mathsf {UB}\), we require that \(\mathsf {UB}\) is an expression in terms of N, variables in \(\mathcal {L}\) representing loop counters of loops that nest \(\mathsf {L}\), and constants as shown in the grammar. Our grammar allows a large class of programs (with nested loops) to be analyzed using our technique, and that are beyond the reach of state-of-the-art tools like [1, 12, 42].

We verify Hoare triples of the form \(\{\varphi (N)\} \;\mathsf {P}_N\; \{\psi (N)\}\), where the formulas \(\varphi (N)\) and \(\psi (N)\) are either universally quantified formulas of the form \(\forall I\, \left( \alpha (I, N) \Rightarrow \beta (\mathcal {A}, \mathcal {V}, I, N)\right) \) or quantifier-free formulas of the form \(\eta (\mathcal {A}, \mathcal {V}, N)\). In these formulas, I is a sequence of array index variables, \(\alpha \) is a quantifier-free formula in the theory of arithmetic over integers, and \(\beta \) and \(\eta \) are quantifier-free formulas in the combined theory of arrays and arithmetic over integers.

For technical reasons, we rename all scalar and array variables in the program in a pre-processing step as follows. We rename each scalar variable using the well-known Static Single Assignment (SSA) [43] technique, such that the variable is written at (at most) one location in the program. We also rename arrays in the program such that each loop updates its own version of an array and multiple writes to an array element within the same loop are performed on different versions of that array. We use techniques for array SSA [30] renaming studied earlier in the context of compilers, for this purpose. In the subsequent exposition, we assume that scalar and array variables in the program are already SSA renamed, and that all array and scalar variables referred to in the pre- and post-conditions are also expressed in terms of SSA renamed arrays and scalars.

4 Verification Using Difference Invariants

The key steps in the application of our technique, as discussed in Sect. 2, are

  • A1: Generation of \(\mathsf {Q}_{N-1}\) and \(\mathsf {peel}(\mathsf {P}_N)\) from a given \(\mathsf {P}_N\).

  • A2: Generation of \(\varphi '(N-1)\) and \(\varDelta \varphi '(N)\) from a given \(\varphi (N)\).

  • A3: Generation of the difference invariant \(D(V_\mathsf {Q}, V_\mathsf {P}, N-1)\), given \(\varphi (N-1)\), \(\varphi '(N-1)\), \(\mathsf {Q}_{N-1}\) and \(\mathsf {P}_{N-1}\).

  • A4: Proving \(\{\varDelta \varphi '(N) ~\wedge ~ \exists V_\mathsf {P}\big (\psi (N-1)\,\wedge \,D(V_\mathsf {Q}, V_\mathsf {P}, N-1)\big )\} \; \mathsf {peel}(\mathsf {P}_N) \; \{\psi (N)\}\), possibly by generation of \(\xi '(N-1)\) and \(\xi (N)\) to strengthen the pre- and post-conditions, respectively.

We now discuss techniques for solving each of these sub-problems.

4.1 Generating \(\mathsf {Q}_{N-1}\) and \(\mathsf {peel}(\mathsf {P}_N)\)

The procedure illustrated in Fig. 2 (going from the first column to the third column) is fairly straightforward if none of the loops have any nested loops within them. It is easy to extend this to arbitrary sequential compositions of non-nested loops. Having all variables and arrays in SSA-renamed forms makes it particularly easy to carry out the substitution exemplified by the arrow shown in the second column of Fig. 2. Hence, we don’t discuss any further the generation of \(\mathsf {Q}_{N-1}\) and \(\mathsf {peel}(\mathsf {P}_N)\) when all loops are non-nested.

Fig. 3.
figure 3

A generic nested loop

The case of nested loops is, however, challenging and requires additional discussion. Before we present an algorithm for handling this case, we discuss the intuition using an abstract example. Consider a pair of nested loops, \(\mathsf {L}_1\) and \(\mathsf {L}_2\), as shown in Fig. 3. Suppose that B1 and B3 are loop-free code fragments in the body of \(\mathsf {L}_1\) that precede and succeed the nested loop \(\mathsf {L}_2\). Suppose further that the loop body, B2, of \(\mathsf {L}_2\) is loop-free. To focus on the key aspects of computing peels of nested loops, we make two simplifying assumptions: (i) no scalar variable or array element modified in B2 is used subsequently (including transitively) in either B3 or B1, and (ii) every scalar variable or array element that is modified in B1 and used subsequently in B2, is not modified again in either B1, B2 or B3. Note that these assumptions are made primarily to simplify the exposition. For a detailed discussion on how our technique can be used even with some relaxations of these assumptions, the reader is referred to [13]. The peel of the abstract loops \(\mathsf {L}_1\) and \(\mathsf {L}_2\) is as shown in Fig. 4. The first loop in the peel includes the last iteration of \(\mathsf {L}_2\) in each of the \(N-1\) iterations of \(\mathsf {L}_1\), that was missed in \(\mathsf {Q}_{N-1}\). The subsequent code includes the last iteration of \(\mathsf {L}_1\) that was missed in \(\mathsf {Q}_{N-1}\).

Fig. 4.
figure 4

Peel of the nested loop

Formally, we use the notation \(\mathsf {L}_1\) (N) to denote a loop \(\mathsf {L}_1\) that has no nested loops within it, and its loop counter, say \(\ell _1\), increases from 0 to an upper bound that is given by an expression in N. Similarly, we use \(\mathsf {L}_1\)(N, \(\mathsf {L}_2\)(N)) to denote a loop \(\mathsf {L}_1\) that has another loop \(\mathsf {L}_2\) nested within it. The loop counter \(\ell _1\) of \(\mathsf {L}_1\) increases from 0 to an upper bound expression in N, while the loop counter \(\ell _2\) of \(\mathsf {L}_2\) increases from 0 to an upper bound expression in \(\ell _1\) and N. Using this notation, \(\mathsf {L}_1\)(N,  \(\mathsf {L}_2\)(N,  \(\mathsf {L}_3\)(N))) represents three nested loops, and so on. Notice that the upper bound expression for a nested loop can depend not only on N but also on the loop counters of other loops nesting it. For notational clarity, we also use LPeel(\(\mathsf {L}_i\), a, b) to denote the peel of loop \(\mathsf {L}_i\) consisting of all iterations of \(\mathsf {L}_i\) where the value of \(\ell _i\) ranges from a to b-1, both inclusive. Note that if b-a is a constant, this corresponds to the concatenation of (b-a) peels of \(\mathsf {L}_i\).

Fig. 5.
figure 5

Peel of \(\mathsf {L}_1\)(N, \(\mathsf {L}_2\)(N))

We will now try to see how we can implement the transformation from the first column to the second column of Fig. 2 for a nested loop \(\mathsf {L}_1\)(N, \(\mathsf {L}_2\)(N)). The first step is to truncate all loops to use \(N-1\) instead of N in the upper bound expressions. Using the notation introduced above, this gives the loop \( \mathsf {L}_1\)(N-1,  \(\mathsf {L}_2\)(N-1)). Note that all uses of N other than in loop upper bound expressions stay unchanged as we go from \( \mathsf {L}_1\)(N, \(\mathsf {L}_2\)(N)) to \(\mathsf {L}_1\)(N-1, \(\mathsf {L}_2\)(N-1)). We now ask: Which are the loop iterations of \( \mathsf {L}_1\)(N, \(\mathsf {L}_2\)(N)) that have been missed (or skipped) in going to \( \mathsf {L}_1\)(N-1, \(\mathsf {L}_2\)(N-1)) ? Let the upper bound expression of \(\mathsf {L}_1\) in \( \mathsf {L}_1\)(N,  \(\mathsf {L}_2\)(N)) be \(U_{\mathsf {L}_1}(N)\), and that of \(\mathsf {L}_2\) be \(U_{\mathsf {L}_2}(\ell _1, N)\). It is not hard to see that in every iteration \(\ell _1\) of \(\mathsf {L}_1\), where \(0 \le \ell _1 < U_{\mathsf {L}_1}(N-1)\), the iterations corresponding to \(\ell _2 \in \{U_{\mathsf {L}_2}(\ell _1, N-1), \ldots , U_{\mathsf {L}_2}(\ell _1, N)-1\}\) have been missed. In addition, all iterations of \(\mathsf {L}_1\) corresponding to \(\ell _1 \in \{U_{\mathsf {L}_1}(N-1), \ldots , U_{\mathsf {L}_1}(N)-1\}\) have also been missed. This implies that the “peel” of \(\mathsf {L}_1\)(N,   \(\mathsf {L}_2\)(N)) must include all the above missed iterations. This peel therefore is the program fragment shown in Fig. 5.

Fig. 6.
figure 6

Peel of \(\mathsf {L}_1\)(N, \(\mathsf {L}_2\)(N, \(\mathsf {L}_3\)(N)))

Notice that if \(U_{\mathsf {L}_2}\) (\(\ell _1\) ,N) - \(U_{\mathsf {L}_2}\) (\(\ell _1\) ,N-1) is a constant (as is the case if \(U_{\mathsf {L}_2}\)(\(\ell _1\),N) is any linear function of \(\ell _1\) and N), then the peel does not have any loop with nesting depth 2. Hence, the maximum nesting depth of loops in the peel is strictly less than that in \(\mathsf {L}_1\)(N, \(\mathsf {L}_2\)(N)), yielding a peel that is “simpler” than the original program. This argument can be easily generalized to loops with arbitrarily large nesting depths. The peel of \(\mathsf {L}_1\)(N, \(\mathsf {L}_2\)(N, \(\mathsf {L}_3\)(N))) is as shown in Fig. 6.

Fig. 7.
figure 7

(a) Nested loop & (b) Peel

As an illustrative example, let us consider the program in Fig. 7(a), and suppose we wish to compute the peel of this program containing nested loops. In this case, the upper bounds of the loops are \(U_{\mathsf {L}_1}(N) = U_{\mathsf {L}_2}(N) = N\). The peel is shown in Fig. 7(b) and consists of two sequentially composed non-nested loops. The first loop takes into account the missed iterations of the inner loop (a single iteration in this example) that are executed in \(\mathsf {P}_N\) but are missed in \(\mathsf {Q}_{N-1}\). The second loop takes into account the missed iterations of the outer loop in \(\mathsf {Q}_{N-1}\) compared to \(\mathsf {P}_N\).

figure c

Generalizing the above intuition, Algorithm 1 presents function GenQandPeel for computing \(\mathsf {Q}_{N-1}\) and \(\mathsf {peel}(\mathsf {P}_N)\) for a given \(\mathsf {P}_N\) that has sequentially composed loops with potentially nested loops. Due to the grammar of our programs, our loops are well nested. The method works by traversing over the structure of loops in the program. In this algorithm \(\mathsf {Q}_{\mathsf {L}_i}\) and \(\mathsf {R}_{\mathsf {L}_i}\) represent the counterparts of \(\mathsf {Q}_{N-1}\) and \(\mathsf {peel}(\mathsf {P}_N)\) for loop \(\mathsf {L}_i\). We create the program \(\mathsf {Q}_{N-1}\) by peeling each loop in the program and then propagating these peels across subsequent loops. We identify the missed iterations of each loop in the program \(\mathsf {P}_N\) from the upper bound expression \(\mathsf {UB}\). Recall that the upper bound of each loop \({\mathsf {L}_k}\) at nesting depth k, denoted by \(U_{\mathsf {L}_k}\) is in terms of the loop counters \(\ell \) of outer loops and the program parameter N. We need to peel \(U_{\mathsf {L}_k}(\ell _1, \ell _2, \ldots , \ell _{k-1}, N) - U_{\mathsf {L}_k}(\ell _1, \ell _2, \ldots , \ell _{k-1}, N-1)\) number of iterations from each loop, where \(\ell _1 \le \ell _2 \le \ldots \le \ell _{k-1}\) are counters of the outer nesting loops. As discussed above, whenever this difference is a constant value, we are guaranteed that the loop nesting depth reduces by one. It may so happen that there are multiple sequentially composed loops \(SL_{j}\) at nesting depth k and not just a single loop \(\mathsf {L}_k\). At line 2, we iterate over top level loops and call function GenQandPeelForLoop(\(\mathsf {L}_i\)) for each sequentially composed loop \(\mathsf {L}_i\) in \(\mathsf {P}_N\). At line 11 we construct \(\mathsf {Q}_{\mathsf {L}}\) for loop \(\mathsf {L}\). If the loop \(\mathsf {L}\) has no nested loops, then the peel is the last iterations computed using the upper bound in line 21 For nested loops, the loop at line 15 builds the peel for all loops inside \(\mathsf {L}\) following the above intuition. The peels of all sub-loops are collected and inserted in the peel of \(\mathsf {L}\) at line 19. Since all the peeled iterations are moved after \(Q_{\mathsf {L}}\) of each loop, we need to repair expressions appearing in \(Q_{\mathsf {L}}\). The repairs are applied by the loop at line 4. In the repair step, we identify the right hand side expressions for all the variables and array elements assigned in the peeled iterations. Subsequently, the uses of the variables and arrays in \(\mathsf {Q}_{\mathsf {L}_i}\) that are assigned in \(\mathsf {R}_{\mathsf {L}_j}\) are replaced with the assigned expressions whenever \(j<i\). If \(\mathsf {R}_{\mathsf {L}_j}\) is a loop, this step is more involved and hence currently not considered. Finally at line 8, the peels and Qs of all top level loops are stitched and returned.

Note that lines 4 and 5 of Algorithm 1 implement the substitution represented by the arrow in the second column of Fig. 2. This is necessary in order to move the peel of a loop to the end of the program. If either of the loops \(\mathsf {L}_i\) or \(\mathsf {L}_j\) use array elements as index to other arrays then it can be difficult to identify what expression to use in \(\mathsf {Q}_{\mathsf {L}_i}\) for the substitution. However, such scenarios are observed less often, and hence, they hardly impact the effectiveness of the technique on programs seen in practice. The peel \(\mathsf {R}_{\mathsf {L}_j}\), from which the expression to be substituted in \(\mathsf {Q}_{\mathsf {L}_i}\) has to be taken, itself may have a loop. In such cases, it can be significantly more challenging to identify what expression to use in \(\mathsf {Q}_{\mathsf {L}_i}\). We use several optimizations to transform the peeled loop before trying to identify such an expression. If the modified values in the peel can be summarized as closed form expressions, then we can replace the loop in the peel with its summary. For example, consider the peeled loop, for ( \(\ell _1\) =0; \(\ell _1<\) N; \(\ell _1\) ++) { S = S + 1; }. This loop is summarized as S = S + N; before it can be moved across subsequent code. If the variables modified in the peel of a nested loop are not used later, then the peel can be trivially moved. In many cases, the loop in the peel can also be substituted with its conservative over-approximation. We have implemented some of these optimizations in our tool and are able to verify several benchmarks with sequentially composed nested loops. It may not always be possible to move the peel of a nested loop across subsequent loops but we have observed that these optimizations suffice for many programs seen in practice.

Theorem 1

Let \(\mathsf {Q}_{N-1}\) and \(\mathsf {peel}(\mathsf {P}_N)\) be generated by application of function GenQandPeel from Algorithm 1 on program \(\mathsf {P}_N\). Then \(\mathsf {P}_N\) is semantically equivalent to \(\mathsf {Q}_{N-1}; \mathsf {peel}(\mathsf {P}_N)\).

Lemma 1

Suppose the following conditions hold;

  • Program \(\mathsf {P}_N\) satisfies our syntactic restrictions (see Sect. 3).

  • The upper bound expressions of all loops are linear expressions in N and in the loop counters of outer nesting loops.

Then, the max nesting depth of loops in \(\mathsf {peel}(\mathsf {P}_N)\) is strictly less than that in \(\mathsf {P}_N\).

Proof

Let \(U_{\mathsf {L}_k}(\ell _1, \ldots ,\ell _{k-1}, N)\) be the upper bound expression of a loop \(\mathsf {L}_k\) at nesting depth k. Suppose \(U_{\mathsf {L}_k} = c_1.\ell _1 + \cdots c_{k-1}.\ell _{k-1} + C. N + D\), where \(c_1, \ldots c_{k-1}, C\) and D are constants. Then \(U_{\mathsf {L}_k}(\ell _1, \ldots , \ell _{k-1}, N)\)\(U_{\mathsf {L}_k}(\ell _1, \ldots \ell _{k-1}, N-1) = C\), i.e. a constant. Now, recalling the discussion in Sect. 4.1, we see that LPeel(\(\mathsf {L}_k\), \(U_k\)(\(\ell _1\mathtt{,} \ldots \mathtt{,} \ell _{k-1}{} \mathtt{, N-1}\)), \(U_k\)(\(\ell _1\mathtt{,} \ldots \mathtt{,} \ell _{k-1}{} \mathtt{, N}\))) simply results in concatenating a constant number of peels of the loop \(\mathsf {L}_k\). Hence, the maximum nesting depth of loops in LPeel( \( \mathsf {L}_k\), \(U_k\)(\(\ell _1\mathtt{,} \ldots , \ell _{k-1}{} \mathtt{, N-1}\)), \(U_k\)(\(\ell _1\mathtt{,} \ldots , \ell _{k-1}{} \mathtt{, N}\))) is strictly less than the maximum nesting depth of loops in \(\mathsf {L}_k\).

Suppose loop \(\mathsf {L}\) with nested loops (having maximum nesting depth t) is passed as the argument of function GenQandPeelForLoop (see Algorithm 1). In line 15 of function GenQandPeelForLoop, we iterate over all loops at nesting depth 2 and above within \(\mathsf {L}\). Let \(\mathsf {L}_k\) be a loop at nesting depth k, where \(2 \le k \le t\). Clearly, \(\mathsf {L}_k\) can have at most \(t-k\) nested levels of loops within it. Therefore, when LPeel is invoked on such a loop, the maximum nesting depth of loops in the peel generated for \(\mathsf {L}_k\) can be at most \(t-k-1\). From lines 18 and 19 of function GenQandPeelForLoop, we also know that this LPeel can itself appear at nesting depth k of the overall peel \(\mathsf {R}_{\mathsf {L}}\). Hence, the maximum nesting depth of loops in \(\mathsf {R}_{\mathsf {L}}\) can be \(t-k-1+k\), i.e. \(t-1\). This is strictly less than the maximum nesting depth of loops in \(\mathsf {L}\).   \(\square \)

Corollary 1

If \(\mathsf {P}_N\) has no nested loops, then \(\mathsf {peel}(\mathsf {P}_N)\) is loop-free.

4.2 Generating \(\varphi '(N-1)\) and \(\varDelta \varphi '(N)\)

Given \(\varphi (N)\), we check if it is of the form \(\bigwedge _{i=0}^{N-1} \rho _i\), where \(\rho _i\) is a formula on the \(i^{th}\) elements of one or more arrays, and scalars used in \(\mathsf {P}_N\). If so, we infer \(\varphi '(N-1)\) to be \(\bigwedge _{i=0}^{N-2} \rho _i\) and \(\varDelta \varphi '(N)\) to be \(\rho _{N-1}\) (assuming variables/array elements in \(\rho _{N-1}\) are not modified by \(\mathsf {Q}_{N-1}\)). Note that all uses of N in \(\rho _i\) are retained as is (i.e. not changed to \(N-1\)) in \(\varphi '(N-1)\). In general, when deriving \(\varphi '(N-1)\), we do not replace any use of N in \(\varphi (N)\) by \(N-1\) unless it is the limit of an iterated conjunct as discussed above. Specifically, if \(\varphi (N)\) doesn’t contain an iterated conjunct as above, then we consider \(\varphi '(N-1)\) to be the same as \(\varphi (N)\) and \(\varDelta \varphi '(N)\) to be True. Thus, our generation of \(\varphi '(N-1)\) and \(\varDelta \varphi '(N)\) differs from that of [12]. As discussed earlier, this makes it possible to reason about a much larger class of pre-conditions than that admissible by the technique of [12].

4.3 Inferring Inductive Difference Invariants

Once we have \(\mathsf {P}_{N-1}\), \(\mathsf {Q}_{N-1}\), \(\varphi (N-1)\) and \(\varphi '(N-1)\), we infer difference invariants. We construct the standard cross-product of programs \(\mathsf {Q}_{N-1}\) and \(\mathsf {P}_{N-1}\), denoted as \(\mathsf {Q}_{N-1} \times \mathsf {P}_{N-1}\), and infer difference invariants at key control points. Note that \(\mathsf {P}_{N-1}\) and \(\mathsf {Q}_{N-1}\) are guaranteed to have synchronized iterations of corresponding loops (both are obtained by restricting the upper bounds of all loops to use \(N-1\) instead of N). However, the conditional statements within the loop body may not be synchronized. Thus, whenever we can infer that the corresponding conditions are equivalent, we synchronize the branches of the conditional statement. Otherwise, we consider all four possibilities of the branch conditions. It can be seen that the net effect of the cross-product is executing the programs \(\mathsf {P}_{N-1}\) and \(\mathsf {Q}_{N-1}\) one after the other.

We run a dataflow analysis pass over the constructed product graph to infer difference invariants at loop head, loop exit and at each branch condition. The only dataflow values of interest are differences between corresponding variables in \(\mathsf {Q}_{N-1}\) and \(\mathsf {P}_{N-1}\). Indeed, since structure and variables of \(\mathsf {Q}_{N-1}\) and \(\mathsf {P}_{N-1}\) are similar, we can create the correspondence map between the variables. We start the difference invariant generation by considering relations between corresponding variables/array elements appearing in pre-conditions of the two programs. We apply static analysis that can track equality expressions (including disjunctions over equality expressions) over variables as we traverse the program. These equality expressions are our difference invariants.

We observed in our experiments the most of the inferred equality expressions are simple expressions of N (atmost quadratic in N). This not totally surprising and similar observations have also been independently made in [4, 15, 24]. Note that the difference invariants may not always be equalities. We can easily extend our analysis to learn inequalities using interval domains in static analysis. We can also use a library of expressions to infer difference invariants using a guess-and-check framework. Moreover, guessing difference invariants can be easy as in many cases the difference expressions may be independent of the program constructs, for example, the equality expression \(\mathtt {v = v'}\) where \(\mathtt {v} \in \mathsf {P}_{N-1}\) and \(\mathtt {v'} \in \mathsf {Q}_{N-1}\) does not depend on any other variable from the two programs.

For the example in Fig. 2, the difference invariant at the head of the first loop of \(Q_{N-1} \times P_{N-1}\) is \(D(V_{\mathsf {Q}}, V_{\mathsf {P}}, N-1) \equiv \mathtt {(x'-x = i \times (2 \times N-1)}\) \(\wedge \) \(\mathtt {\forall i \in [0,N-1),}\) \(\mathtt {a'[i]-a[i] = 1)}\), where \(\mathtt {x,a} \in V_{\mathsf {P}}\) and \(\mathtt {x',a'} \in V_{\mathsf {Q}}\). Given this, we easily get \(\mathtt {x'-x= (N-1) \times (2 \times N-1)}\) when the first loop terminates. For the second loop, \(D(V_{\mathsf {Q}}, V_{\mathsf {P}}, N-1) \equiv (\mathtt {\forall j \in [0,N-1), \; b'[j]-b[j] =} \mathtt {(x'-x) + N^2 = (N-1) \times }\) \(\mathtt {(2 \times N-1)+N^2})\).

Note that the difference invariants and its computation are agnostic of the given post-condition. Hence, our technique does not need to re-run this analysis for proving a different post-condition for the same program.

4.4 Verification Using Inductive Difference Invariants

We present our method Diffy for verification of programs using inductive difference invariants in Algorithm 2. It takes a Hoare triple \(\{\varphi (N)\} \;\mathsf {P}_N\; \{\psi (N)\}\) as input, where \(\varphi (N)\) and \(\psi (N)\) are pre- and post-condition formulas. We check the base in line 1 to verify the Hoare triple for \(N=1\). If this check fails, we report a counterexample. Subsequently, we compute \(\mathsf {Q}_{N-1}\) and \( \mathsf {peel}(\mathsf {P}_N)\) as described in Sect. 4.1 using the function GenQandPeel from Algorithm 1. At line 4, we compute the formulas \(\varphi '(N-1)\) and \(\varDelta \varphi '(N)\) as described in Sect. 4.2. For automation, we analyze the quantifiers appearing in \(\varphi (N)\) and modify the quantifier ranges such that the conditions in Sect. 4.2 hold. We infer difference invariants \(D(V_{\mathsf {Q}}, V_{\mathsf {P}}, N-1)\) on line 5 using the method described in Sect. 4.3, wherein \(V_{\mathsf {Q}}\) and \(V_{\mathsf {P}}\) are sets of variables from \(\mathsf {Q}_{N-1}\) and \(\mathsf {P}_{N-1}\) respectively. At line 6, we compute \(\psi '(N-1)\) by eliminating variables \(V_{\mathsf {P}}\) from \(\mathsf {P}_{N-1}\) from \(\psi (N-1) \wedge D(V_{\mathsf {Q}}, V_{\mathsf {P}}, N-1)\). At line 7, we check the inductive step of our analysis. If the inductive step succeeds, then we conclude that the assertion holds. If that is not the case then, we try to iteratively strengthen both the pre- and post-condition of \(\mathsf {peel}(\mathsf {P}_N)\) simultaneously by invoking Strengthen.

figure d

The function Strengthen first initializes the formula \(\chi (N)\) with \(\psi (N)\) and the formulas \(\xi (N)\) and \(\xi '(N-1)\) to \(\mathsf {True}\). To strengthen the pre-condition of \(\mathsf {peel}(\mathsf {P}_N)\), we infer a formula \(\chi '(N-1)\) using Dijkstra’s weakest pre-condition computation of \(\chi (N)\) over the \(\mathsf {peel}(\mathsf {P}_N)\) in line 16. It may happen that we are unable to infer such a formula. In such a case, if the program \(\mathsf {peel}(\mathsf {P}_N)\) has loops then we recursively invoke Diffy at line 19 to further simplify the program. Otherwise, we abandon the verification effort (line 21). We use quantifier elimination to infer \(\chi (N-1)\) from \(\chi '(N-1)\) and \(D(V_{\mathsf {Q}},V_{\mathsf {P}}, N-1))\) at line 6.

The inferred pre-conditions \(\chi (N)\) and \(\chi '(N-1)\) are accumulated in \(\xi (N)\) and \(\xi '(N-1)\), which strengthen the post-conditions of \(\mathsf {P}_{N}\) and \(\mathsf {Q}_{N-1}\) respectively in lines 23–24. We again check the base case for the inferred formulas in \(\xi (N)\) at line 25. If the check fails we abandon the verification attempt at line 26. If the base case succeeds, we then proceed to the inductive step. When the inductive step succeeds, we conclude that the assertion is verified. Otherwise, we continue in the loop and try to infer more pre-conditions untill we run out of time.

The pre-condition in Fig. 2 is \(\phi (N) \equiv \mathsf {True}\) and the post-condition is \(\psi (N) \equiv \mathtt {\forall j \in [0,N), \; b[j] = j+N^3)}\). At line 4, \(\phi '(N-1)\) and \(\varDelta \phi '(N-1)\) are computed to be \(\mathsf {True}\). \(D(V_{\mathsf {Q}}, V_{\mathsf {P}}, N-1)\) is the formula computed in Sect. 4.3. At line 6, \(\psi '(N-1) \equiv (\mathtt {\forall j \in [0,N-1), \; b'[j] = j + (N-1)^3 + (N-1) \times (2 \times N-1) + N^2 =}\) \(\mathtt {j + N^3})\). The algortihm then invokes Strengthen at line 10 which infers the formulas \(\chi '(N-1) \equiv \mathtt {(x'=(N-1)^3)}\) at line 16 and \(\chi (N) \equiv \mathtt {(x=N^3)}\) at line 22. These are accumulated in \(\xi '(N-1)\) and \(\xi (N)\), simultaneosuly strengthening the pre- and post-condition. Verification succeeds after this strengthening iteration.

The following theorem guarantees the soundness of our technique.

Theorem 2

Suppose there exist formulas \(\xi '(N)\) and \(\xi (N)\) and an integer \(M > 0\) such that the following hold

  • \(\{\varphi (N)\}\; \mathsf {P}_N\;\{\psi (N)\,\wedge \,\xi (N)\}\) holds for \(1 \le N \le M\), for some \(M > 0\).

  • \(\xi (N)\wedge \,D(V_\mathsf {Q}, V_\mathsf {P}, N) \Rightarrow \xi '(N)\) for all \(N > 0\).

  • \(\{\xi '(N-1)\,\wedge \,\varDelta \varphi '(N) ~\wedge ~ \psi '(N-1) \} \; \mathsf {peel}(\mathsf {P}_N) \; \{\xi (N)\,\wedge \,\psi (N)\}\) holds for all \(N \ge M\), where \(\psi '(N-1) \equiv \exists V_\mathsf {P}\big (\psi (N-1)\,\wedge \,D(V_\mathsf {Q}, V_\mathsf {P},N-1)\big )\).

Then \(\{\varphi (N)\} \; \mathsf {P}_N \; \{\psi (N)\}\) holds for all \(N > 0\).

5 Experimental Evaluation

We have instantiated our technique in a prototype tool called Diffy. It is written in C++ and is built using the LLVM(v6.0.0) [31] compiler. We use the SMT solver Z3(v4.8.7) [39] for proving Hoare triples of loop-free programs. Diffy and the supporting data to replicate the experiments are openly available at [14].

Table 1. Summary of the experimental results. S is successful result. U is inconclusive result. TO is timeout.

Setup. All experiments were performed on a machine with Intel i7-6500U CPU, 16 GB RAM, running at 2.5 GHz, and Ubuntu 18.04.5 LTS operating system. We have compared the results obtained from Diffy with Vajra(v1.0) [12], VIAP(v1.1) [42] and VeriAbs(v1.4.1-12) [1]. We choose Vajra which also employs inductive reasoning for proving array programs and verify the benchmarks in its test-suite. We compared with VeriAbs as it is the winner of the arrays sub-category in SV-COMP 2020 [6] and 2021 [7]. VeriAbs applies a sequence of techniques from its portfolio to verify array programs. We compared with VIAP which was the winner in arrays sub-category in SV-COMP 2019 [5]. VIAP also employs a sequence of tactics, implemented for proving a variety of array programs. Diffy does not use multiple techniques, however we choose to compare it with these portfolio verifiers to show that it performs well on a class of programs and can be a part of their portfolio. All tools take C programs in the SV-COMP format as input. Timeout of 60 s was set for each tool. A summary of the results is presented in Table 1.

Benchmarks. We have evaluated Diffy on a set of 303 array benchmarks, comprising of the entire test-suite of [12], enhanced with challenging benchmarks to test the efficacy of our approach. These benchmarks take a symbolic parameter N which specifies the size of each array. Assertions are (in-)equalities over array elements, scalars and (non-)linear polynomial terms over N. We have divided both the safe and unsafe benchmarks in three categories. Benchmarks in C1 category have standard array operations such as min, max, init, copy, compare as well as benchmarks that compute polynomials. In these benchmarks, branch conditions are not affected by the value of N, operations such as modulo and nested loops are not present. There are 110 safe and 99 unsafe programs in the C1 category in our test-suite. In C2 category, the branch conditions are affected by change in the program parameter N and operations such as modulo are used in these benchmarks. These benchmarks do not have nested loops in them. There are 24 safe and unsafe benchmarks in the C2 category. Benchmarks in category C3 are programs with atleast one nested loop in them. There are 23 safe and unsafe programs in category C3 in our test-suite. The test-suite has a total of 157 safe and 146 unsafe programs.

Fig. 8.
figure 8

Cactus Plots (a) All Safe Benchmarks (b) All Unsafe Benchmarks

Analysis. Diffy verified 151 safe benchmarks, compared to 110 verified by Vajra as well as VeriAbs and 20 verified by VIAP. Diffy was unable to verify 6 safe benchmarks. In 3 cases, the smt solver timed out while trying to prove the induction step since the formulated query had a modulus operation and in 3 cases it was unable to compute the predicates needed to prove the assertions. Vajra was unable to verify 47 programs from categories C2 and C3. These are programs with nested loops, branch conditions affected by N, and cases where it could not compute the difference program. The sequence of techniques employed by VeriAbs, ran out of time on 47 programs while trying to prove the given assertion. VeriAbs proved 2 benchmarks in category C2 and 3 benchmarks in category C3 where Diffy was inconclusive or timed out. VeriAbs spends considerable amount of time on different techniques in its portfolio before it resorts to Vajra and hence it could not verify 14 programs that Vajra was able to prove efficiently. VIAP was inconclusive on 24 programs which had nested loops or constructs that could not be handled by the tool. It ran out of time on 113 benchmarks as the initial tactics in its sequence took up the allotted time but could not verify the benchmarks. Diffy was able to verify all programs that VIAP and Vajra were able to verify within the specified time limit.

Fig. 9.
figure 9

Cactus plots (a) Safe C1 benchmarks (b) Unsafe C1 benchmarks

The cactus plot in Fig. 8(a) shows the performance of each tool on all safe benchmarks. Diffy was able to prove most of the programs within three seconds. The cactus plot in Fig. 9(a) shows the performance of each tool on safe benchmarks in C1 category. Vajra and Diffy perform equally well in the C1 category. This is due to the fact that both tools perform efficient inductive reasoning. Diffy outperforms VeriAbs and VIAP in this category. The cactus plot in Fig. 10(a) shows the performance of each tool on safe benchmarks in the combined categories C2 and C3, that are difficult for Vajra as most of these programs are not within its scope. Diffy out performs all other tools in categories C2 and C3. VeriAbs was an order of magnitude slower on programs it was able to verify, as compared to Diffy. VeriAbs spends significant amount of time in trying techniques from its portfolio, including Vajra, before one of them succeeds in verifying the assertion or takes up the entire time allotted to it. VIAP took 70 seconds more on an average as compared to Diffy to verify the given benchmark. VIAP also spends a large portion of time in trying different tactics implemented in the tool and solving the recurrence relations in programs.

Fig. 10.
figure 10

Cactus plots (a) Safe C2 & C3 benchmarks (b) Unsafe C2 & C3 benchmarks

Fig. 11.
figure 11

Cactus plots. TO = 100 s. (a) Safe benchmarks (b) Unsafe benchmarks

Our technique reports property violations when the base case of the analysis fails for small fixed values of N. While the focus of our work is on proving assertions, we report results on unsafe versions of the safe benchmarks from our test-suite. Diffy was able to detect a property violation in 142 unsafe programs and was inconclusive on 4 benchmarks. Vajra detected violations in 115 programs and was inconclusive on 31 programs. VeriAbs reported 125 programs as unsafe and ran out of time on 21 programs. VIAP reported property violation in 120 programs, was inconclusive on 23 programs and timed out on 3 programs.

The cactus plot in Fig. 8(b) shows the performance of each tool on all unsafe benchmarks. Diffy was able to detect a violation faster than all other tools and on more benchmarks from the test-suite. Figure 9(b) and Fig. 10(b) give a finer glimpse of the performance of these tools on the categories that we have defined. In the C1 category, Diffy and Vajra have comparable performance and Diffy disproves the same number of benchmarks as Vajra and VIAP. In C2 and C3 categories, we are able to detect property violations in more benchmarks than other tools in less time.

To observe any changes in the performance of these, we also ran them with an increased time out of 100 seconds (Fig. 11). Performance remains unchanged for Diffy, Vajra and VeriAbs on both safe and unsafe benchmarks, and of VIAP on unsafe benchmarks. VIAP was able to additionally verify 89 safe programs in categories C1 and C2 with the increased time limit.

6 Related Work

Techniques Based on Induction. Our work is related to several efforts that apply inductive reasoning to verify properties of array programs. Our work subsumes the full-program induction technique in [12] that works by inducting on the entire program via a program parameter N. We propose a principled method for computation and use of difference invariants, instead of computing difference programs which is more challenging. An approach to construct safety proofs by automatically synthesizing squeezing functions that shrink program traces is proposed in [27]. Such functions are not easy to synthesize, whereas difference invariants are relatively easy to infer. In [11], the post-condition is inductively established by identifying a tiling relation between the loop counter and array indices used in the program. Our technique can verify programs from [11], when supplied with the tiling relation. [44] identifies recurrent program fragments for induction using the loop counter. They require restrictive data dependencies, called commutativity of statements, to move peeled iterations across subsequent loops. Unfortunately, these restrictions are not satisfied by a large class of programs in practice, where our technique succeeds.

Difference Computation. Computing differences of program expressions has been studied for incremental computation of expensive expressions [35, 41], optimizing programs with arrays [34], and checking data-structure invariants [45]. These differences are not always well suited for verifying properties, in contrast with the difference invariants which enable inductive reasoning in our case.

Logic Based Reasoning. In [21], trace logic that implicitly captures inductive loop invariants is described. They use theorem provers to introduce and prove lemmas at arbitrary control locations in the program. Unlike their technique, we focus primarily on universally quantified and quantifier-free properties, although a restricted class of existentially quantified properties can be handled by our technique (see [13] for more details). VIAP [42] translates the program to an quantified first-order logic formula using the scheme proposed in [32]. It uses a portfolio of tactics to simplify and prove the generated formulas. Dedicated solvers for recurrences are used whereas our technique adapts induction for handling recurrences.

Invariant Generation. Several techniques generate invariants for array programs. QUIC3 [25], FreqHorn [9, 19] infer universally quantified invariants over arrays for Constrained Horn Clauses (CHCs). Template-based techniques [8, 23, 47] search for inductive quantified invariants by instantiating parameters of a fixed set of templates. We generate relational invariants, which are often easier to infer compared to inductive quantified invariants for each loop.

Abstraction-Based Techniques. Counterexample-guided abstraction refinement using prophecy variables for programs with arrays is proposed in [36]. VeriAbs [1] uses a portfolio of techniques, specifically to identify loops that can be soundly abstracted by a bounded number of iterations. Vaphor [38] transforms array programs to array-free Horn formulas to track bounded number of array cells. Booster [3] combines lazy abstraction based interpolation [2] and acceleration [10, 28] for array programs. Abstractions in [16, 18, 22, 26, 29, 33, 37] implicitly or explicitly partition the range array indices to infer and prove facts on array segments. In contrast, our method does not rely on abstractions.

7 Conclusion

We presented a novel verification technique that combines generation of difference invariants and inductive reasoning. Difference invariants relate corresponding variables and arrays from the two versions of a program and are often easy to infer and prove. We have instantiated these techniques in our prototype Diffy. Experiments shows that Diffy out-performs the tools that won the Arrays sub-category in SV-COMP 2019, 2020 and 2021. Although we have focused on universal and quantifier-free properties in this paper, the technique applies to some classes of existential properties as well. The interested reader is referred to [13] for more details. Investigations in using synthesis techniques for automatic generation of difference invariants to verify properties of array manipulating programs is a part of future work.