
1 Introduction

Precondition inference is concerned with finding a set of initial states from which all terminating executions of a given program reach states satisfying a given postcondition. The weakest precondition refers to the largest such set of initial states. The weakest precondition can be used as a contract on a library function’s input, for run-time argument value checks, as a summary in compositional verification, and in many more applications [2, 11, 12, 24, 46, 47, 52, 53].

Finding the weakest precondition, especially in the presence of unbounded loops and data structures like arrays, is challenging and uncomputable in general. To show that a precondition is valid requires reasoning about all possible executions of loops. Almost always this necessitates the inference of adequate inductive invariants. However, automatic invariant inference is an equally difficult problem, and in the case of array programs, the required invariants are often quantified formulas, adding to the difficulty of reasoning about them. Moreover, existing invariant inference techniques [22, 28, 30, 32, 41] rely on a precondition being provided by the user. This makes it difficult to use such techniques directly in our problem setting, where preconditions are not available to begin with.

Even if we are able to find a precondition for a given program and postcondition, proving that the precondition is the weakest presents significant technical challenges. Specifically, we need to prove that adding any new state to the set of initial states represented by the precondition results in an execution that terminates in a state violating the postcondition. To find such a proof, existing quantified precondition inference techniques assume the program to be deterministic, i.e., from every initial state, there is a unique program execution [49, 53]. However, it often becomes necessary to use non-deterministic features when modeling programs, thereby admitting multiple possible executions starting from the same initial state. Such non-deterministic features may be needed to model user input, non-deterministic functions, external functions, or when programs are abstracted. Hence, assuming that all programs are deterministic significantly restricts the applicability of existing techniques for finding weakest preconditions.

We propose a novel technique for inferring weakest preconditions for a class of terminating non-deterministic programs that manipulate arrays, with respect to postconditions expressed in a rich language of formulas. Specifically, we target the class of linear array programs, defined formally in Section 3. This includes programs used in many practical applications, and the literature describes several verification techniques for this class of programs [7, 8, 40]. However, existing techniques for weakest precondition inference either apply to deterministic linear array programs, or deal with non-determinism in simpler classes of programs. Our work fills this gap, making it possible to infer weakest preconditions for linear array programs with non-determinism.

The proposed technique works in the infer-check-weaken framework [1, 27, 49, 50, 54]. It first infers a precondition along with adequate inductive invariants. A maximality check follows to see whether the precondition is weakest. If the check yields a negative answer, the precondition is weakened. This loop continues until the weakest precondition is found. In this framework, our core contributions are Structural Array Abduction (SAA) for inferring preconditions and associated invariants, and Specialized Maximality Checking (SMC) for proving that the inferred precondition is maximal (or weakest).

At a high level, SAA "guesses" candidate preconditions and inductive invariants as (quantified) formulas, and checks their correctness using an SMT solver. Since quantified formulas over arrays are challenging to reason about even with state-of-the-art SMT solvers, the guessing has to be done carefully. SAA uses abductive inference for this purpose. First, it constructs an abduction query to find what property of array elements at the start of a loop iteration will result in a desired property after the iteration. The array property thus inferred is then combined with a range formula [22], which is a predicate representing the boundary between indices of the array that are processed and those that are yet to be processed. A set of rules guide the construction of appropriate abduction queries and range formulas.

Though SAA is effective in finding weak preconditions, it is not guaranteed to find the weakest precondition. SMC is used to check whether a precondition is indeed the weakest. This amounts to determining whether for every initial state that violates the precondition, there is a terminating execution that results in a state violating the postcondition. To accomplish this, SMC uses the insight that every execution of a non-deterministic program is also an execution of an under-approximation of the original program obtained by suitably restricting the non-determinism in control flows (i.e., \(\texttt {if}\) statements). Specifically, the existence of inductive invariants for the complement verification problem, i.e., the under-approximated program with complemented pre- and postconditions, proves that the inferred precondition is indeed the weakest for the given (terminating) program and postcondition. SMC uses SAA to find an under-approximated program and its inductive invariants. When SAA fails, SMC weakens the precondition from a set of candidates obtained in a syntax-guided way, like in [22].

Our technique is implemented in a tool called MaxPrANQ. It takes constrained Horn clauses (CHCs) as input, which is a convenient way to model and reason about programs symbolically (details in Sect. 3.2). On a challenging set of 66 precondition inference tasks, our tool inferred the weakest precondition for all 66 and automatically proved 59 of them to be the weakest. In comparison, the state-of-the-art tool PreQSyn  [49] could only solve 2/66 benchmarks, and P-Gen  [53] did not find a precondition for any of them. To further gauge the difficulty level of reasoning about our benchmarks, we tried using two state-of-the-art inductive invariant inference tools, FreqHorn  [22] and Spacer  [30], to simply prove the correctness of the preconditions inferred by MaxPrANQ. However, neither FreqHorn nor Spacer could complete the task for the entire set of 66 benchmarks in the given time. This shows that even proving the correctness of the weakest preconditions was difficult for our benchmarks, let alone inferring the preconditions automatically.

The primary contributions of our paper are:

  1. SAA: a method for finding preconditions, inductive invariants, and stronger guard conditions for non-deterministic linear array programs.

  2. SMC: a method for checking if a precondition is the weakest and, when inconclusive, weakening it.

  3. MaxPrANQ: a tool for finding the weakest preconditions, with witnesses of validity (inductive invariants) and maximality.

The rest of the paper is organized as follows: Sect. 2 presents a running example, Sect. 3 provides the necessary background, Sect. 4 gives an overview of our algorithm, Sects. 5 and 6 describe SAA and SMC, respectively, Sect. 7 gives evaluation details, Sect. 8 discusses related work, and Sect. 9 concludes with limitations and future work.

2 A Running Example

Fig. 1a shows a non-deterministic program with a postcondition that requires a universally quantified weakest precondition. The program has three arrays: A, B, and C, each of parametric size N. For each array index i, the program chooses non-deterministically whether to write i to the i-th element of C or copy the i-th element of C into the corresponding index of A. The postcondition, as stated in the assert, requires that the arrays A and B have the same content. Our goal is to infer the weakest precondition (denoted by pre) over A, B, C, and N under which the program satisfies the postcondition.
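To make the semantics concrete, the following Python snippet is our own sketch (not the paper's artifact) of the program just described. It enumerates every resolution of the non-determinism for a small N and checks that, from an initial state where A, B, and C agree elementwise, all executions satisfy the postcondition:

```python
import itertools

# Sketch of the program from Fig. 1a: for each index i, either write i
# into C[i] or copy C[i] into A[i], chosen non-deterministically.
def run(A, B, C, choices):
    A, C = list(A), list(C)
    for i, c in enumerate(choices):
        if c:
            C[i] = i          # branch 1: C[i] := i
        else:
            A[i] = C[i]       # branch 2: A[i] := C[i]
    return A, B, C

def pre(A, B, C, N):          # candidate precondition: A, B, C agree
    return all(A[j] == B[j] and B[j] == C[j] for j in range(N))

def post(A, B, N):            # postcondition: A and B agree
    return all(A[j] == B[j] for j in range(N))

N = 3
A, B, C = [5, 6, 7], [5, 6, 7], [5, 6, 7]   # satisfies the candidate precondition
assert pre(A, B, C, N)
for choices in itertools.product([True, False], repeat=N):
    A2, B2, _ = run(A, B, C, choices)
    assert post(A2, B2, N)    # every resolution satisfies the postcondition
print("all", 2 ** N, "executions satisfy the postcondition")
```

This is a finite check on one instance, of course, not a proof for parametric N.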

Fig. 1. A non-deterministic array program and its maximality proof.

Existing weakest precondition inference techniques [49, 53] diverge for non-deterministic programs like the one in Fig. 1a. For instance, P-Gen  [53] fails to find a precondition, and PreQSyn  [49] fails to prove that the precondition it finds is the weakest in 200 seconds. This is because they either fail to generalise a set of initial states to a quantified precondition or, when they do, they cannot prove it to be the weakest for non-deterministic programs. In contrast, SAA finds the precondition: \(\forall j .\,0\!\le \! j\! <\! N\!\!\implies \!\! (A[j]\! =\! B[j] \wedge B[j]\! =\! C[j])\) (details in Sect 5.3), and SMC proves this to be the weakest precondition, all within a few seconds.

To prove maximality, SMC finds an under-approximated program, as shown in Fig. 1b. In this program, the non-determinism in the \(\texttt {if}\) statement is restricted by a new guard: \(A[i] \ne B[i]\). Furthermore, the assume condition is the complement of the precondition inferred by SAA earlier, and the condition in the \(\texttt {assert} \) is also complemented. The existence of an adequate inductive invariant for this program (which in turn can be found by SAA) proves that all its executions from every initial state violating the inferred precondition result in states violating the given postcondition as the program is terminating. In other words, the inferred precondition is indeed the weakest for the program and postcondition in Fig. 1a.
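The maximality argument can be checked concretely on small instances. The snippet below is an illustration we added: run_restricted mirrors the restricted program of Fig. 1b, and an exhaustive search over a small finite domain confirms that every initial state violating the precondition leads to a state violating the postcondition:

```python
import itertools

# Sketch of the under-approximated program from Fig. 1b: the
# non-deterministic choice is restricted by the guard A[i] != B[i].
def run_restricted(A, B, C, N):
    A, C = list(A), list(C)
    for i in range(N):
        if A[i] != B[i]:
            C[i] = i          # guarded branch: C[i] := i
        else:
            A[i] = C[i]       # guarded branch: A[i] := C[i]
    return A, B, C

def pre(A, B, C, N):
    return all(A[j] == B[j] and B[j] == C[j] for j in range(N))

def post(A, B, N):
    return all(A[j] == B[j] for j in range(N))

# exhaustive check: every initial state violating the precondition has a
# (restricted) execution that violates the postcondition
N, vals = 2, range(3)
checked = 0
for A in itertools.product(vals, repeat=N):
    for B in itertools.product(vals, repeat=N):
        for C in itertools.product(vals, repeat=N):
            if not pre(A, B, C, N):
                A2, B2, _ = run_restricted(A, B, C, N)
                assert not post(A2, B2, N)
                checked += 1
print("checked", checked, "states outside the precondition")
```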

3 Background

3.1 Linear Array Programs

Fig. 2 shows a grammar for linear array programs over sets of integer and array variables. In the figure, i is a fixed loop counter, t is a linear arithmetic expression, and each \(g_i\) (or guard) is a boolean combination of linear expressions over the program variables, with \(\bigvee _{i=1}^{n} g_i = \top \). The if statement is a set of guarded assignments. When such an if statement is executed, exactly one guard that evaluates to true in the current program state is non-deterministically chosen, and the corresponding assignment statement is executed. A program is non-deterministic if there are program states in which more than one guard of an \(\texttt {if}\) could evaluate to true.

Fig. 2. Linear Array Programs.

Let P be a linear array program. A pre/postcondition for P is a formula of the form \(\forall x .\,\textit{R}\implies \textit{Q}\) or \(\exists x .\,\textit{R}\wedge \textit{Q}\), where x is an integer variable, \(\textit{R}\) is a linear predicate over x and the integer variables of P that represents a range of indices of array(s), and \(\textit{Q}\) is a linear predicate over the integer variables and elements of array(s) of P, the latter being accessed only through linear index expressions in x. As an example, \(\forall x .\,(0 \le x \le N) \implies (C[x] \le B[x])\) qualifies as a pre/postcondition, where \(\textit{R}\) is \(0 \le x \le N\) and \(\textit{Q}\) is \(C[x] \le B[x]\). Following standard Floyd-Hoare logic, we say a pair of conditions \((\psi , \rho )\) is a valid pre- and postcondition pair for P if every execution of P that begins in a state satisfying \(\psi \) ends in a state satisfying \(\rho \).

The weakest precondition inference problem we consider is: given a linear array program P and a postcondition \(\rho \), find the weakest precondition \(\psi \) such that \((\psi ,\rho )\) forms a valid pre- and postcondition pair for P.

Weakest precondition inference for linear array programs is undecidable in general [49]. Therefore, we cannot hope for an algorithm that infers weakest preconditions in all cases. Nevertheless, many practical and useful programs can be modeled as linear array programs (see for example [7, 8, 40]). This motivates us to design techniques for finding weakest preconditions that work well for a large subclass of linear array programs.

3.2 Modeling Linear Array Programs as CHCs

In recent years, it has become popular to represent a program and its pre- and postconditions as a system of first-order logic (FOL) formulas with uninterpreted relations, called constrained Horn clauses (CHCs) [10, 19, 29, 33, 35, 36, 37, 43, 45]. In CHCs, the uninterpreted relations represent invariants, and the goal is to find interpretations for them. We consider the task of precondition inference as a CHC-solving task, with the missing precondition represented by a relation.

Definition 1

A CHC is a formula in a FOL \(\mathcal {L}\) (linear integer arithmetic with arrays in this paper) over a set of uninterpreted relations, with one of the following forms:

$$\begin{aligned} \varphi (\vec {x}_0) & \!\!\implies \!\! \boldsymbol{r}_0(\vec {x}_0) \end{aligned}$$
(1)
$$\begin{aligned} \bigwedge \limits _{0\le i \le k} \boldsymbol{r}_i(\vec {x}_i) \!\wedge \! \varphi (\vec {x}_0, \ldots ,\vec {x}_{k+1}) & \!\!\implies \!\! \boldsymbol{r}_{k+1}(\vec {x}_{k+1}) \end{aligned}$$
(2)
$$\begin{aligned} \bigwedge \limits _{0\le i \le k} \boldsymbol{r}_i(\vec {x}_i) \!\wedge \! \varphi (\vec {x}_0, \ldots ,\vec {x}_{k}) \!\!\implies \!\! & \bot \end{aligned}$$
(3)

where, for every i, \(\boldsymbol{r}_i\) is an uninterpreted relation, and \(\vec {x}_i\) represents the vector of variables \((x_1, \ldots , x_{a_{\boldsymbol{r}_i}})\), where \(a_{\boldsymbol{r}_i}\) is the arity of \(\boldsymbol{r}_i\). \(\varphi \), called a constraint, is an \(\mathcal {L}\)-formula in conjunctive normal form without uninterpreted relations. CHCs of type (1) are called facts, of type (2) inductive, and of type (3) queries. Note that each CHC has a leading universal quantification over its variables (e.g., \(\forall \vec {x}_0 \ldots \vec {x}_{k+1}\) for type (2)) that is kept implicit in the paper.

For a CHC C, we use the following notations: \( body (C)\) (resp. \( head (C)\)) denotes the left (resp. right) side of the implication in C, \( rels (\cdot )\) denotes the relations that appear in \( body (C)\) or \( head (C)\), and \( args (\cdot )\) denotes the variables that appear in \( body (C)\) or \( head (C)\). We assume the constraint \(\varphi \) of a CHC C can be partitioned into two formulas: \( assign (C)\) and \( guard (C)\), denoting the assignment statement and control-flow guard conditions (if any). A system of CHCs S is a finite set of CHCs. A system S is non-linear if it contains a CHC C with \(| rels ( body (C))| > 1\), and linear otherwise.

We assume the input CHC system is induced by a linear array program with \(n \ge 0\) sequential loops. In particular, it is a linear CHC system over \(\{\boldsymbol{pre}, \boldsymbol{inv}_1, \ldots , \boldsymbol{inv}_n\}\), where \(\boldsymbol{pre}\) denotes the precondition, and each \(\boldsymbol{inv}_i\) denotes an inductive invariant for the i-th sequential loop.

Fig. 3. CHC system for the program from Fig. 1a.

Example 1

A linear system of CHCs induced by the program from Fig. 1a is shown in Fig. 3. In the system, the precondition is represented by the relation \(\boldsymbol{pre}\) and the inductive invariant by \(\boldsymbol{inv}_1\). C1 is the initialization CHC with \(\boldsymbol{pre}\). The two CHCs C2 and C3 correspond to non-deterministic writes in the loop, while C4 is the query CHC, which has the assert condition. It is worth noting that interpretations for \(\boldsymbol{pre}\) and \(\boldsymbol{inv}_1\) that make each CHC valid give a precondition and an adequate inductive invariant. For example,

$$\begin{aligned} \boldsymbol{pre}\mapsto \lambda N,A,B,C.\,&\forall j .\,0\!\le \! j\! <\! N\!\!\implies \!\! (A[j]\! =\! B[j] \wedge B[j]\! =\! C[j]) \\ \boldsymbol{inv}_1 \mapsto \lambda N,A,B,C,i.\,&\forall j .\,0 \le j < i \implies A[j] = B[j] \; \wedge \\ &\forall j .\,i \le j < N \implies (A[j] = B[j] \wedge B[j] = C[j]) \end{aligned}$$
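These interpretations can be sanity-checked by brute force on a small finite domain. The following Python sketch is our own illustration (not part of any tool mentioned here); it checks the four CHCs C1–C4 of Fig. 3 against the interpretations above for N = 2 and element values in {0, 1}:

```python
import itertools

# Finite-domain check (not a proof) that the interpretations above satisfy
# the CHCs C1-C4 from Fig. 3, for small N and a small value domain.
N, vals = 2, range(2)

def pre(A, B, C):
    return all(A[j] == B[j] == C[j] for j in range(N))

def inv(A, B, C, i):
    return (all(A[j] == B[j] for j in range(i)) and
            all(A[j] == B[j] == C[j] for j in range(i, N)))

for A, B, C in itertools.product(itertools.product(vals, repeat=N), repeat=3):
    A, B, C = list(A), list(B), list(C)
    if pre(A, B, C):
        assert inv(A, B, C, 0)                       # C1: initialization
    for i in range(N):
        if inv(A, B, C, i):
            C2 = C[:i] + [i] + C[i + 1:]
            assert inv(A, B, C2, i + 1)              # C2: C[i] := i
            A2 = A[:i] + [C[i]] + A[i + 1:]
            assert inv(A2, B, C, i + 1)              # C3: A[i] := C[i]
    if inv(A, B, C, N):
        assert all(A[j] == B[j] for j in range(N))   # C4: postcondition
print("invariant checks passed on the finite domain")
```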

A map of interpretations assigns to each relation symbol \(\boldsymbol{r}\) an interpretation of the form \(\lambda x_1 \cdots \lambda x_{a_{\boldsymbol{r}}} .\,\psi (x_1,\ldots ,x_{a_{\boldsymbol{r}}})\), where \(\psi \) is an \(\mathcal {L}\)-formula. Applying such a map to a formula \(\alpha \) replaces each atomic formula of the form \(\boldsymbol{r}(t_1, \ldots , t_{a_{\boldsymbol{r}}})\) in \(\alpha \) by the interpretation of \(\boldsymbol{r}\) instantiated with \(t_1, \ldots , t_{a_{\boldsymbol{r}}}\).

Solution to CHCs. A solution to a CHC C is a map of interpretations under which C becomes a valid formula; in this case, we say C is satisfiable. A map is a solution to a system S if it satisfies all the CHCs in S; in this case, we say S is satisfiable.

Let S be a system of CHCs induced by a program P and a postcondition \(\rho \). If a map is a solution to S, then its interpretation of \(\boldsymbol{pre}\), together with \(\rho \), forms a valid pre- and postcondition pair for P.

3.3 Abductive Inference

The core method used in SAA for inference is abduction. Given a formula \((\boldsymbol{r}(\vec {x}) \wedge \alpha (\vec {y})) \implies \beta (\vec {y})\), where \(\boldsymbol{r}\) represents a relation, \(\alpha \) (hypothesis) and \(\beta \) (conclusion) are formulas without relations, and the variables in \(\vec {x}\) are also present in \(\vec {y}\), the problem of abduction is to find an interpretation \(\lambda x_1 \cdots \lambda x_{a_{\boldsymbol{r}}} .\,\psi \) to \(\boldsymbol{r}\) such that:

$$\begin{aligned} \models (\psi (\vec {x}) \wedge \alpha (\vec {y})) \implies \beta (\vec {y}) \qquad \text {and}\qquad \psi (\vec {x}) \wedge \alpha (\vec {y})\ \text {is satisfiable.} \end{aligned}$$

Example 2

Consider the abduction problem \((\boldsymbol{r}(x) \wedge y = 42) \implies (x - y > 0)\). The maximal solution for the problem is \(\boldsymbol{r}\mapsto \lambda x .\,x > 42\).
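Such a check can be approximated by finite sampling. The snippet below is an added illustration (a real abduction solver would reason symbolically); it confirms on sampled integers that \(x > 42\) is a consistent solution while the weaker candidate \(x > 41\) is not a solution at all:

```python
# Checking an abduction solution by finite sampling (not a proof).
def implies(hyp, con, xs, ys):
    return all(con(x, y) for x in xs for y in ys if hyp(x, y))

xs = ys = range(-100, 101)
psi = lambda x: x > 42                    # candidate solution for r
hyp = lambda x, y: psi(x) and y == 42     # r(x) /\ y = 42
con = lambda x, y: x - y > 0              # x - y > 0

assert implies(hyp, con, xs, ys)          # psi makes the implication valid
assert any(psi(x) and y == 42 for x in xs for y in ys)   # consistency
# the weaker candidate x > 41 fails: x = 42, y = 42 violates x - y > 0
weaker = lambda x, y: x > 41 and y == 42
assert not implies(weaker, con, xs, ys)
print("x > 42 is a solution; x > 41 is not")
```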

A given abduction problem can have multiple solutions. SAA seeks the maximal solution. There are techniques, like quantifier elimination, to find maximal solutions [16], but they are limited to non-array theories. To overcome this, range abduction [49] proposes a suitable array-to-integer abstraction, which SAA also uses.

Non-linear CHCs have more than one relation in \( body \), requiring an extension of the abduction problem called multi-abduction. In multi-abduction, interpretations to multiple relations need to be inferred. SAA encounters non-linear CHCs while searching for maximality proofs, which involve the guard and inductive invariant relations. To solve the multi-abduction problem, SAA uses the technique from  [1] after performing the array-to-integer abstraction from  [49].

Example 3

The following is a multi-abduction problem: \((\boldsymbol{r}_1(A,i) \wedge \boldsymbol{r}_2(B,i) \wedge C[i] = 42) \implies (A[i] + B[i] > C[i])\). A maximal solution is \(\boldsymbol{r}_1 \mapsto \lambda A,i .\,A[i] > 42\) and \(\boldsymbol{r}_2 \mapsto \lambda B,i .\,B[i] \ge 0\).
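The same sampling-style sanity check applies to the multi-abduction solution above (again an added illustration, not the solver of [1]):

```python
# Finite-domain sanity check of the multi-abduction solution from Example 3:
# r1 -> A[i] > 42 and r2 -> B[i] >= 0 jointly imply A[i] + B[i] > C[i]
# under the hypothesis C[i] = 42.
ok = True
for a in range(-50, 101):
    for b in range(-50, 101):
        c = 42                      # hypothesis fixes C[i] = 42
        if a > 42 and b >= 0:       # r1(A, i) and r2(B, i)
            ok = ok and (a + b > c)
assert ok
print("solution satisfies the implication on all sampled values")
```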

4 Inferring Weakest Preconditions

An overview of our weakest precondition inference algorithm is given in Algorithm 1. The algorithm takes as input a CHC system S over \(\{\boldsymbol{pre}, \boldsymbol{inv}_1, \ldots , \boldsymbol{inv}_n \}\). It first computes a solution to S using \(\textsc {SAA} \) (line 1). Though this solution gives a precondition (as the solution will have interpretations for \(\boldsymbol{pre}, \boldsymbol{inv}_1, \ldots , \boldsymbol{inv}_n\)), it is not guaranteed to be the weakest. Hence, in a loop, the algorithm performs maximality checking and weakening (lines 2 to 6). When the maximality check succeeds, the solution is guaranteed to be the weakest (Theorem 1); hence the algorithm returns the current precondition (line 5). Otherwise, the algorithm treats the maximality check as inconclusive and tries to find a weakening (line 6). The algorithm progresses and continues the same loop if a weakening is found. When the weakening is inconclusive, the loop terminates, and the current precondition is returned without a maximality guarantee (line 7).

Algorithm 1. Weakest precondition inference.
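The control flow of Algorithm 1 can be sketched as follows; saa, get_spl_chcs, and weaken are toy stand-ins for the components described in the paper, so only the loop structure is faithful:

```python
# Hedged Python sketch of Algorithm 1's infer-check-weaken loop.
def infer_weakest_pre(S, saa, get_spl_chcs, weaken):
    sol = saa(S)                          # line 1: precondition + invariants
    while True:
        G, Gamma = get_spl_chcs(S, sol)   # line 3: specialized CHCs
        if saa(G, Gamma) is not None:     # maximality proof found
            return sol, "weakest"
        weakened = weaken(S, sol)
        if weakened is None:              # weakening inconclusive
            return sol, "unknown"
        sol = weakened                    # continue with weaker precondition

# toy instantiation over symbolic strings; "x >= 0" plays the weakest candidate
chain = ["x > 2", "x > 1", "x >= 0"]
saa = lambda S, Gamma=None: (S if Gamma is None else
                             ("proof" if S == ("spl", "x >= 0") else None))
get_spl_chcs = lambda S, sol: (("spl", sol), [])
def weaken(S, sol):
    i = chain.index(sol)
    return chain[i + 1] if i + 1 < len(chain) else None

sol, status = infer_weakest_pre(chain[0], saa, get_spl_chcs, weaken)
print(sol, status)   # x >= 0 weakest
```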

SAA takes a CHC system, which is either a precondition inference task (S), or the encoding of maximality check (G) with additional non-CHC constraints (\(\varGamma \)). It finds a solution to the CHC system that also satisfies the additional constraints.

Algorithm 1 proves the maximality of a precondition by encoding a CHC system G and non-CHC constraints \(\varGamma \), together called specialized CHCs (line 3). G has the same set of CHCs as the input CHC system S except the following: 1) the relation \(\boldsymbol{pre}\) is replaced by the complement of the inferred precondition, 2) the postcondition is the negation of the postcondition in S, and 3) new guard relations \(\boldsymbol{g}_{Ci},\ldots ,\boldsymbol{g}_{Cj}\) are added to the \( body \) of the CHCs \(Ci, \ldots , Cj\) corresponding to non-deterministic \(\texttt {if}\) conditions. Thus, G is a CHC system over the invariant relations of S and the new guard relations. A solution to G gives stronger \(\texttt {if}\) conditions and inductive invariants for the complemented pre- and postcondition. The non-CHC constraints in \(\varGamma \) make sure the disjunction of \(\boldsymbol{g}_{Ci},\ldots ,\boldsymbol{g}_{Cj}\) is \(\top \), thus ensuring that the interpretations for them are not too strong.

When SAA fails to find a solution to G, Algorithm 1 calls Weaken. At a high level, Weaken enumerates candidate preconditions obtained in a syntax-guided way like  [22] and then tries to find inductive invariants using SAA again.

Fig. 4. Specialized CHCs for the CHCs from Example 1.

Example 4

Fig. 4 shows the specialized CHCs for the CHC system from Fig. 3 and the precondition from Example 1 (viz. \(\forall j .\,0 \le j < N \implies (A[j] = B[j] \wedge B[j] = C[j])\)). This system has the following changes: 1) In the first CHC, the relation \(\boldsymbol{pre}\) is replaced by the complement of the precondition, 2) The next two CHCs have \(\boldsymbol{g}_{C2}, \boldsymbol{g}_{C3}\) in \( body \), 3) In the fourth CHC, the postcondition is complemented, and 4) The last constraint is a non-CHC constraint that makes sure \(\boldsymbol{g}_{C2} \vee \boldsymbol{g}_{C3}\) is \(\top \).

Theorem 1 (Soundness of Algorithm 1)

For a system S, if Algorithm 1 terminates with “weakest”, then the returned precondition is the weakest precondition for S.

5 Structural Array Abduction

Structural Array Abduction (SAA) solves CHCs. In Algorithm 1, SAA solves program-induced CHCs to identify preconditions, specialized CHCs for maximality proofs, and CHCs with candidate weakened preconditions to find invariants.

5.1 Algorithm Description

SAA aims to find interpretations to \(\boldsymbol{pre}\) and \(\boldsymbol{inv}\) of the following form:

$$\begin{aligned} \forall x .\,\textit{R}\implies \textit{Q}\qquad \text {or}\qquad \exists x .\,\textit{R}\wedge \textit{Q}\end{aligned}$$
(4)

Similar to the postcondition, here \(\textit{R}\) is a linear predicate over x and the integer variables that represents a range of indices of array(s), and \(\textit{Q}\) is a linear predicate over the integer variables and elements of array(s), the latter being accessed only through linear index expressions in x. Such a form is sufficient to represent inductive invariants for a large class of array programs, as observed in existing works [22, 28, 30, 31, 38].

Algorithm 2. Structural Array Abduction (SAA).

A relatively complete guessing algorithm involves enumerating all candidate solutions in the form of (4) and then checking them using an SMT solver. However, given the large number of candidate solutions and the inherent challenge that quantified formulas with arrays pose for SMT solvers, SAA brings a novel improvement: it narrows down the search by guessing likely candidate solutions using a logical method, as presented in Algorithm 2.

Algorithm 2 begins with an initial candidate solution (\(\top \) in our implementation) and checks whether it is a solution to all CHCs. If not, the algorithm attempts to make the candidate a solution to the failed CHC, mainly through abduction-based strengthening (line 3), or through heuristics-based weakening if the CHC is a \( fact \) (line 5). If neither strengthening nor weakening results in a change to the candidate, the algorithm proceeds to the next candidate in the fixed form. When a candidate is found to be a solution, it is checked against the additional constraints in \(\varGamma \).

The abduction-based strengthening method is presented in Algorithm 3. It seeks new interpretations for the relations in the \( body \) of a CHC, which can be \(\boldsymbol{pre}\), \(\boldsymbol{inv}\), or \(\boldsymbol{g}_C\), that imply the interpretation for the relation in the \( head \) of the CHC. This constitutes the abduction problem, as defined in Sect. 3.3. However, existing abduction solvers cannot be used directly as they do not support quantified formulas with arrays. Hence, in Algorithm 3, \(\textit{Q}\) and \(\textit{R}\) from the fixed form (4) are determined separately and then combined into a quantified formula.

To find \(\textit{Q}\), the algorithm constructs an abduction query based on the rules provided in Table 1. In the abduction query, the hypothesis (\(\alpha \)) is the assignment formula present in the constraint of the CHC (line 1), and the conclusion (\(\beta \)) is derived from the table based on the type of the CHC (line 2). Since the query contains array terms, which are not supported by existing abduction solvers, they are replaced by integer terms in a manner similar to the approach presented in  [49] (e.g., A[i] is replaced by a new integer variable \(a_i\)). Subsequently, the query is solved using an integer abduction solver to obtain a maximal solution (line 4). When the CHC has a guard relation \(\boldsymbol{g}_C\) in its \( body \), an additional abduction query is constructed to find interpretations for the other guard relations \(\boldsymbol{g}_{C'}\) (line 6). Finally, integer terms in the solutions of the abduction queries are mapped back to corresponding array terms (line 7).

Algorithm 3. Abduction-based strengthening.
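The array-to-integer renaming used before calling the integer abduction solver can be pictured as a purely syntactic rewrite. The snippet below is a simplified illustration with made-up string syntax; the actual abstraction of [49] operates on logical terms rather than strings:

```python
import re

# Illustrative sketch of the array-to-integer abstraction: each array read
# X[v] is renamed to a fresh integer variable x_v, and solutions of the
# integer abduction query are mapped back to array terms afterwards.
def abstract(formula):
    return re.sub(r"([A-Z])\[(\w+)\]",
                  lambda m: m.group(1).lower() + "_" + m.group(2), formula)

def concretize(formula):
    return re.sub(r"\b([a-z])_(\w+)\b",
                  lambda m: m.group(1).upper() + "[" + m.group(2) + "]", formula)

query = "(inv(A,B,C,i,N) && C[j] == j) => A[j] == B[j]"
print(abstract(query))            # integer abduction query
print(concretize("a_j == b_j"))   # solution mapped back: A[j] == B[j]
```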

SAA uses the concept of range formulas as described in  [22] to determine \(\textit{R}\). In the context of linear array programs, these range formulas can take the form of \(0 \le j < u\), \(0 \le j < i\), and \(i \le j < u\), where j is a free variable and u is the upper bound of the loop (cf. Fig. 2). From these formulas, a suitable \(\textit{R}\) is selected based on the type of the CHC in Table 1.

The resulting \(\textit{R}\) and \(\textit{Q}\) are appropriately combined into a quantified formula and conjoined (or disjoined in the case of existential quantification) with the existing interpretation (lines 9 and 10). The guard relations are updated by the non-quantified formulas \(\textit{Q}_{\boldsymbol{g}_C}, \ldots , \textit{Q}_{\boldsymbol{g}_{C'}}\).

Table 1 provides a set of rules for determining \(\textit{R}\) and \(\beta \) for all types of CHCs that Algorithm 3 may encounter. These CHCs include: 1) \( precond \): the initialization CHC with \(\boldsymbol{pre}\) in \( body \), 2) \( query \): the CHC with the postcondition, 3) \( intra \): CHCs representing potentially non-deterministic updates within a loop, and 4) \( inter \): CHCs occurring between two loops. For example, when a CHC C falls into the \( precond \) category, \(\beta \) is the \(\textit{Q}\) in the current candidate interpretation corresponding to the range \(i \le j < u\), and formula \(\textit{R}\) is \(0 \le j < u\). We give an intuition for these rules while illustrating our technique in the following section.

When a \( fact \) CHC is unsatisfiable, SAA uses heuristic-based weakening for the \( head \) relation (Algorithm 2, line 5). This method generates a candidate set of \(\textit{Q}\) formulas using the syntax of \( body \) of the CHC and combines it with \( i \le j < u \) to get a quantified formula.

Table 1. Rules for all CHC types to construct formulas \(\textit{R}\) and \(\textit{Q}\) in Algorithm 3.

Theorem 2

If the input CHC system S has a solution in the form of (4), then Algorithm 2 will find it, provided all the SMT checks return a result.

5.2 Distinguishing SAA from Closely Related Techniques

SAA uses the concept of range formulas from  [22] and the array-to-integer abduction technique from range abduction [49]. Nevertheless, there are notable differences: 1) While  [22] relies on preconditions to infer invariants, SAA is capable of inferring invariants even in the absence of preconditions. 2) Neither  [22] nor range abduction can handle non-linear CHCs resulting from guard relations, which SAA supports by using multi-abduction. 3) In our experiments, we observed that range abduction tends to generate stronger preconditions compared to SAA. 4) Range abduction performs two abduction queries for each CHC, whereas SAA requires only one. 5) Range abduction uses the Houdini algorithm [23] for weakening, which is not necessary for SAA.

5.3 Illustration

Consider the CHCs from Fig. 3. For these CHCs, the range formulas are: \(0\!\le \!j\!<\!N\), \(0\!\le \!j\!<\!i\), and \(i\!\le \!j\!<\! N\), as the upper bound u of the loop is N.

The algorithm begins with \(\top \) as the candidate interpretation for each relation. But the \( query \) CHC (C4) is unsatisfiable, as this candidate for \(\boldsymbol{inv}_1\) is too weak. So, SAA tries to find a strengthening for \(\boldsymbol{inv}_1\) using abduction. Recall that the postcondition (\(\rho \)) is \(\forall j.\,0\!\le \!j\!<\!N\!\implies \!A[j]=B[j]\). While \(\rho \) itself can make this CHC satisfiable, it might be too strong for other CHCs with \(\boldsymbol{inv}_1\). Therefore, the rule for \( query \) in the table suggests considering \(\textit{R}\) as \(0 \le j < i\), and \(\beta \) as \(A[j] = B[j]\) from \(\rho \), corresponding to the range \(0 \le j < N\). The abduction query \((\boldsymbol{inv}_1(A,B,C,i,N) \wedge \top )\implies \!A[j] = B[j]\) yields \(\textit{Q}\) as \(A[j] = B[j]\). Combining \(\textit{R}\) and \(\textit{Q}\) results in:

$$\begin{aligned} \boldsymbol{inv}_1 \mapsto \lambda N,A,B,C,i.\,\forall j .\,0 \le j < i \implies A[j] = B[j] \end{aligned}$$

Next, an \( intra \) CHC, C2, is unsatisfiable. This is due to the absence of restrictions on the values of A and B in the range \(i \le j < N\) within \(\boldsymbol{inv}_1\). One way to fix this is to find a \(\textit{Q}\) in the range \(i \le j < N\) that implies \(A[j] = B[j]\). This approach aligns with the rule for \( intra \) CHCs, where \(\beta \) is \(A[j] = B[j]\) corresponding to the range \(0 \le j < i\) of the current candidate, and \(\textit{R}\) is \(i \le j < N\). Further, \( assign (C2)\) is \(C'[j] = j\) (primed variables denote updated variables), resulting in the following abduction query: \((\boldsymbol{inv}_1(A,B,C,i,N) \wedge C'[j]=j)\implies \!A[j]=B[j]\). This query yields \(A[j] = B[j]\) as \(\textit{Q}\). Combining \(\textit{R}\) and \(\textit{Q}\) into a quantified formula, and conjoining it with the current candidate, gives:

$$\begin{aligned} \boldsymbol{inv}_1 \mapsto \lambda N,A,B,C,i.\,&\forall j .\,0 \le j < i \implies A[j] = B[j] \; \wedge \\ &\forall j .\,i \le j < N \implies A[j] = B[j] \end{aligned}$$

In the third iteration, another \( intra \) CHC, C3, fails the check. Here, \( assign (C3)\) is \(A'[j] = C[j]\) and \(\beta \) is \(A'[j] = B[j]\), resulting in the following abduction query: \((\boldsymbol{inv}_1(A,B,C,i,N) \wedge A'[j]=C[j])\implies \!A'[j]=B[j]\). This query yields \(B[j] = C[j]\) as \(\textit{Q}\). Combining this with \(\textit{R}\), which is \(i \le j < N\), results in:

$$\begin{aligned} \boldsymbol{inv}_1 \mapsto \lambda N,A,B,C,i.\,&\forall j .\,0 \le j < i \implies A[j] = B[j] \; \wedge \\ &\forall j .\,i \le j < N \implies (A[j] = B[j] \wedge B[j] = C[j]) \end{aligned}$$

Subsequently, a \( precond \) CHC, C1, fails the check. These CHCs have an initialization of the counter variable (i.e., \(i=0\)), rendering the formula within the range \(0 \le j < i\) trivially \(\top \). Therefore, the rule for \( precond \) CHCs selects \(\beta \) from the other range, i.e., \(\beta \) is \(A[j] = B[j] \wedge B[j] = C[j]\) from the range \(i \le j < N\) of the current candidate for \(\boldsymbol{inv}_1\). This leads to the following abduction query: \((\boldsymbol{pre}(A,B,C,i,N) \wedge \top ) \implies (A[j] = B[j] \wedge B[j] = C[j])\), which yields \(\textit{Q}\) as \(A[j] = B[j] \wedge B[j] = C[j]\). Further, \(\textit{R}\) is \(0 \le j < N\), resulting in:

$$\begin{aligned} \boldsymbol{pre}\mapsto \lambda N,A,B,C.\,\forall j .\,0 \le j < N \implies (A[j] = B[j] \wedge B[j] = C[j]) \end{aligned}$$

The algorithm terminates as the candidate is a solution.

6 Specialized Maximality Checking

While SAA effectively infers a precondition, it may not always be the weakest. To check for maximality, a specialized CHC system (G and \(\varGamma \)) is generated in Algorithm 1 using the method \(\textsc {GetSplCHCs} \), which is described in this section. This section also covers the method to weaken a precondition.

6.1 \(\textsc {GetSplCHCs} \) method

The \(\textsc {GetSplCHCs} \) method constructs a new CHC system G by iterating over all the CHCs in the input system S while performing the following:

  1. 1.

Replacing \(\boldsymbol{pre}\) with the complement of the inferred precondition and the postcondition \(\rho \) with \(\lnot \rho \), and

  2. 2.

For each relation \(\boldsymbol{inv}_i\), if there exist two \( intra \) CHCs C and \(C'\) with \( guard (C) = guard (C')\), then for each \( intra \) CHC C of \(\boldsymbol{inv}_i\), a new relation \(\boldsymbol{g}_C( args ( body (C)))\) is added to \( body (C)\).

Example 5

The CHC system from Fig. 3 has \( intra \) CHCs C2 and C3 with \( guard (C2) = guard (C3) = i < N\), so the above condition is met. As a result, two new relations \(\boldsymbol{g}_{C2}\) and \(\boldsymbol{g}_{C3}\) are introduced into \( body (C2)\) and \( body (C3)\), leading to the CHC system shown in Fig. 4. For this system, SAA finds: \(\boldsymbol{g}_{C2} \mapsto A[i] \ne B[i]\) and \(\boldsymbol{g}_{C3} \mapsto A[i] = B[i]\), along with an invariant for \(\boldsymbol{inv}_1\).
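The requirement that the guard interpretations block no execution can be checked directly for this example; the following snippet (added for illustration) confirms that the two inferred guards cover, and in fact partition, all sampled states:

```python
# Finite check that the inferred guard interpretations block no execution:
# g_C2 \/ g_C3 is a tautology (and here the two guards are also disjoint).
for a in range(-5, 6):
    for b in range(-5, 6):
        g_c2 = a != b                  # g_C2 -> A[i] != B[i]
        g_c3 = a == b                  # g_C3 -> A[i] == B[i]
        assert g_c2 or g_c3            # covering: no state is blocked
        assert not (g_c2 and g_c3)     # disjoint: deterministic choice
print("g_C2 and g_C3 partition all sampled states")
```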

A solution to G can result in interpretations for the guard relations that block all executions (e.g., \(\bot \)). To prevent this, the following non-CHC constraint (\(\varGamma \)) is introduced:

$$\begin{aligned} \top \implies \bigvee \limits _{1 \le j \le m} \big (\boldsymbol{g}_{C_j}( args ( body (C_j))) \wedge guard (C_j)\big ) \end{aligned}$$
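For the interpretations found in Example 5, \(\varGamma \) can be validated exhaustively on a small finite domain (a toy check in pure Python; the tool discharges \(\varGamma \) symbolically via SMT): since \(A[i] \ne B[i]\) and \(A[i] = B[i]\) partition the state space, some guarded branch is enabled in every state where the loop guard \(i < N\) holds, so no execution is blocked.

```python
from itertools import product

def gamma_holds(A, B, i, N):
    """Gamma for Example 5: (g_C2 /\\ i < N) \\/ (g_C3 /\\ i < N), with
    g_C2 |-> A[i] != B[i] and g_C3 |-> A[i] == B[i]."""
    return (A[i] != B[i] and i < N) or (A[i] == B[i] and i < N)

# Exhaustive toy check over a small finite domain: whenever the loop
# guard i < N holds, at least one guarded branch is enabled.
N = 3
assert all(
    gamma_holds(A, B, i, N)
    for A in product([0, 1], repeat=N)
    for B in product([0, 1], repeat=N)
    for i in range(N)
)
```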

Theorem 3

If a CHC system S induced by a program P has a solution and its specialized CHCs (G and \(\varGamma \)) are satisfiable, then the interpretation of \(\boldsymbol{pre}\) in that solution is the weakest precondition of P.

6.2 Weakening Procedure

When \(\textsc {SAA} \) is inconclusive on the specialized CHCs, the precondition is weakened, as shown in Algorithm 4. The algorithm begins by computing a set of potential candidate preconditions \(\Delta \). We assume that this set is computed in a syntax-guided fashion, as in  [22] (it can also be provided by the user). Only candidate preconditions that are strictly weaker than the current precondition are taken into consideration. For each such candidate \(\delta \in \Delta \), SAA is invoked to infer inductive invariants by passing it the CHC system S with the relation \(\boldsymbol{pre}\) replaced by \(\delta \). This process continues until it succeeds or all the candidates have been exhausted. Whenever the method succeeds, the resulting precondition is weaker by construction.

figure bq
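The loop of the weakening procedure can be sketched as follows (hypothetical function names; for illustration, preconditions are modeled as sets of states, so "strictly weaker" means a strict superset, and the SAA call is stubbed):

```python
def weaken(pre, candidates, saa_solves):
    """Sketch of the weakening procedure: try candidate preconditions
    that are strictly weaker than the current one and return the first
    for which SAA finds inductive invariants (None if exhausted)."""
    for delta in candidates:
        if not pre < delta:        # keep only strictly weaker candidates
            continue
        if saa_solves(delta):      # SAA on S with pre replaced by delta
            return delta
    return None

# Toy usage: states are integers; the stubbed SAA "succeeds" iff the
# candidate admits state 3.
pre = frozenset({1, 2})
candidates = [frozenset({1}), frozenset({1, 2, 3}), frozenset({1, 2, 3, 4})]
assert weaken(pre, candidates, lambda d: 3 in d) == frozenset({1, 2, 3})
```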

7 Evaluation

Implementation Our algorithm is implemented in a tool called MaxPrANQ on top of the HornSpec framework [50]. The tool takes as input a set of CHCs with preconditions and invariants represented as uninterpreted relations. It returns the weakest precondition, along with a proof of validity (viz., inductive invariants) and of maximality (viz., the specialized CHCs and their solution). It uses Z3 [14] to solve SMT queries. Quantifier elimination is done by model-based projection [3, 20].

Research Questions We evaluate MaxPrANQ on the following questions:

  • RQ1 Can MaxPrANQ find weakest preconditions for a range of benchmarks?

  • RQ2 How well does MaxPrANQ perform in comparison with state-of-the-art techniques?

  • RQ3 How challenging is it for existing techniques to infer invariants for our benchmarks even with the preconditions being given?

Benchmarks and Configuration We use 66 precondition inference tasks with universally quantified postconditions. While we initially intended to use the precondition inference benchmarks from [53], none of them had quantified postconditions. Hence, we derived our benchmarks from existing verification tasks of [22] that had quantified postconditions. Specifically, we considered multiple-loop benchmarks from [22], where the first loop is an initialization loop and the other loops perform various array update operations. We then removed the first loop so that a non-trivial quantified precondition would need to be synthesized. Overall, we consider 26 multiple-loop benchmarks from [22]. Since a majority of the 26 benchmarks were deterministic, we added non-deterministic guards to the update operations and introduced a similar update operation in the other branch. To further test our tool, we adapted these benchmarks into 40 more benchmarks by using common array update operations and postconditions. We performed the experiments on an Ubuntu machine with a 2.5 GHz processor and 16 GB of memory. A timeout of 200 seconds was given to all the tools.

Tools for comparison We compare our tool against \(\textsc {PreQSyn} \) [49], an abduction-based precondition inference tool, and \(\textsc {P-Gen} \) [53], a predicate-abstraction-based tool. Additionally, we compare against CHC solvers that can generate quantified inductive invariants: FreqHorn  [22], a SyGuS-based tool, and Spacer  [30] (Z3 v4.8.10), an extension of PDR for quantified formulas.

RQ1 \(\textsc {MaxPrANQ} \) found and automatically proved 59/66 weakest preconditions. For the remaining 7, it found the weakest precondition but could not prove it automatically due to failure in finding a solution to the specialized CHCs. Overall, it solved 125 CHC systems – 66 with universal and 59 with existential quantification. The time taken was less than 30 seconds on all except one benchmark. Details are in Fig. 5 and  [48].

RQ2 \(\textsc {PreQSyn} \) found and automatically proved 2/66 weakest preconditions. On the remaining 64, it found preconditions for 56 but could not prove them maximal; for the other 8, it did not find a precondition. To compare with our maximality checking, we provided the 56 preconditions generated by \(\textsc {PreQSyn} \) to our SMC module. Out of the 56, SMC proved 52 to be the weakest, of which 7 were weakened before proving. We observe that \(\textsc {PreQSyn} \)’s maximality checking is unsuitable for non-deterministic programs, and its preconditions are not always the weakest.

\(\textsc {P-Gen} \) did not find a precondition for any of the 66 benchmarks. Its output was not a precondition on 41, and on the rest it was stuck in the refinement loop. Our experiments indicate that \(\textsc {P-Gen} \)’s inference engine is unable to generalize and find quantified preconditions when postconditions are quantified. Hence, our technique complements \(\textsc {P-Gen} \)’s capability of finding preconditions for quantifier-free postconditions.

RQ3 The reader may wonder whether the benchmarks themselves are easier to solve, given the limited availability of weakest precondition tools for non-deterministic programs. We experimentally demonstrate that this is not the case by passing the CHCs with preconditions generated by \(\textsc {MaxPrANQ} \) to state-of-the-art CHC solvers for arrays: FreqHorn and Spacer. Out of 66 benchmarks, FreqHorn found inductive invariants for 56 and Spacer found 34. In comparison, \(\textsc {MaxPrANQ} \) found preconditions and invariants for all the 66 benchmarks.

Fig. 5.
figure 5

The bar graphs show the number of weakest, proven, and valid preconditions inferred by the tools. The scatter plots show the time taken by the tools for invariant inference.

8 Related Work

The problem of precondition inference has received considerable attention [2, 11, 12, 24, 46, 47, 52, 53]. In particular, for programs with arrays, closely related works include [12, 49, 53]. The work in [12] infers preconditions by abstract interpretation,  [53] by CEGAR-based predicate abstraction, and  [49] by range abduction. Compared to [12], we do not need a predefined abstract domain. We work in a framework similar to [53] and [49], but they target deterministic programs: their maximality checks assume that from a precondition only one execution reaches the postcondition, which is not the case for non-deterministic programs. The novelty of our SAA algorithm, compared to range abduction [49], is in how it constructs abduction queries using a set of rules based on the structure of the CHCs, and in its support for non-linear CHCs. Range abduction, on the other hand, creates two abduction queries and employs the Houdini technique [23], which can generate stronger preconditions, as observed in our experiments.

Precondition inference is closely related to the problem of invariant inference. For inferring universally quantified invariants, several techniques have been proposed. The main methods include predicate abstraction [32, 41], abstract interpretation [28], PDR [30], and syntax-guided synthesis [22]. These techniques crucially depend on given preconditions, which are missing in our setting. Without preconditions, they generate trivial solutions like \(\bot \).

The validity of a precondition can be established by techniques that do not explicitly generate inductive invariants. Such techniques include array smashing [5], converting to array-free non-linear CHCs [44], over-approximating the unknown bound of a loop by a smaller known bound [40], accelerating entire transition relations [6], using CHC transformations [4, 34], induction-based techniques [7,8,9], and trace-logic-based techniques [25]. However, these techniques are useful for assertion checking, not directly for precondition inference.

CHCs are widely used to symbolically encode different synthesis tasks  [18, 21, 50, 51, 55]. However, none of these works handle CHCs with arrays. SAA uses abduction, which has been used for programs without arrays to infer invariants [16, 17], preconditions [15, 26], and specifications [1, 50, 54].

The concept of SMC resembles angelic verification [13, 42] but differs in how it is solved. Angelic verification neither guarantees maximality nor computes inductive invariants, and it uses user-supplied specifications. A recent work [27] proposes a reduction of maximality checking to finding termination proofs for CHC systems with integers. In comparison, SMC reduces maximality checking to finding inductive invariants and guards, exploiting the fact that the programs are terminating.

9 Limitations and Future Work

Usage of Theorem Prover: The preconditions and invariants guessed by SAA in our evaluation are in a fragment of the theory of one-dimensional arrays and linear integer arithmetic that state-of-the-art SMT solvers support reasonably well. In the general case, however, SAA might generate a precondition or invariant that is challenging for our SMT solver. In such instances, MaxPrANQ logs a failure and switches to another precondition or invariant. To handle such cases, we plan to complement our SMT solver with an automated theorem prover like Vampire [39]. This can also help us handle preconditions with alternating quantification.

Non-linear CHC Support: The multi-abduction done in SAA can help in handling non-linear CHCs, which can encode programs with recursive functions. For this, the range analysis in SAA has to be tweaked to determine a loop-counter-like variable for recursive functions, which we target as future work.

Termination and Compositional Verification: The assumption of terminating programs helps in proving maximality by inferring invariants and stronger guards. Relaxing this assumption would require more complex maximality checking. Finally, an immediate direction for future work is to integrate this technique into an existing verifier to scale it compositionally.