Keywords

figure a
figure b

1 Introduction

Context and Motivation. Deductive verification tools were successfully used in many case studies [4] to prove a large range of safety, security and functional properties. Such tools often have issues to conduct automatic proof on code with recursive data structures (e.g. linked lists, trees, etc.), in particular, due to complex memory models they need. The user has to guide the proof by interactively proved lemmas, assertions, etc. Abstract interpretation tools based on separation logic and shape analysis [3] can efficiently reason about such structures but typically cannot deal with so large classes of properties. This short paper presents new ideas and emerging results on combining both techniques trying to take the best of both worlds.

Approach and Results. We present a verification approach combining a popular deductive verifier for C programs, Frama-C/Wp  [6], with a shape analysis tool, MemCAD  [10]. The main idea is to prove structural and separation properties in MemCAD and then to assume them in Frama-C/Wp in order to increase the level of automation of the latter and overcome some of its limitations. We apply it on a real-life case study using linked lists: a few (slightly simplified) functions of tpm2-tssFootnote 1, a popular library for communication with a Trusted Platform Module (TPM). Recent work [11] demonstrated that deductive verification of the library functions manipulating linked lists was relatively hard, and required many additional lemmas and assertions.

The contributions of this paper include the presentation of a combined verification technique using deductive verification and shape analysis, its illustration with Frama-C/Wp and MemCAD on a function manipulating linked lists, as well as a successful case study on a set of functions of the tpm2-tss library.

2 Background

2.1 Deductive Verification with Frama-C/Wp

Frama-C  [6] is an integrated toolbox built around a kernel offering core services and plugins dedicated to specific analysis or verification tasks for C code, e.g. value analysis, runtime assertion checking and deductive verification. Acsl (ANSI C Specification Language) [6] is the common specification language of the plugins. The Wp plugin performs modular deductive verification: each function is verified independently. It generates verification conditions (VCs) from the C code with Acsl annotations and requests their proof by the QED simplifier or by external provers.

We illustrate the main Acsl features on the running exampleFootnote 2 of Fig. 1, 3, 4, 5, presented as we go, where Acsl notation (e.g. ) is pretty-printed (resp., as \(\forall , \mathbb {Z},\Rightarrow ,\le , \wedge \)). Lines 69–85 of Fig. 4 show a contract for function (detailed below) that adds a new value into a linked list (cf. Lines 1–2 of Fig. 1), allocating a new cell. The contract includes pre-conditions ( clauses) and post-conditions ( clauses). The clause is a special kind of post-condition that indicates the memory locations the function is allowed to modify. Acsl formulas are mostly multi-sorted first-order logic where types are either C types or logic types (such as , the type of mathematical integers). Acsl provides built-in constructs such as (the value returned by the function) and predicates such as (stating that pointer refers to an allocated memory location, so that can be safely read and written) and (stating that the memory locations referred to by given pointers do not intersect). Notice that the considered memory locations are here indicated by pointers. Users can define predicates such as those in Fig. 1, adapted here from a previous work [1] on verifying linked lists in Wp.

Fig. 1.
figure 1

Types and Acsl predicates for linked lists.

The main predicate is the inductively defined predicate (Lines 10–19) stating that a linked list (segment) of values (defined on Lines 1–2) from pointer to pointer (excluded) is a well-formed list represented by an Acsl logical list . In other words, contains the pointers to the cells of that list segment (or the whole list if is ). Acsl lists are similar to lists in functional programming. In the inductive case ( ) overlapping list cells (or cyclic lists) are avoided by requiring that the first cell is separated from all the other cells in the list including , so the list is well-formed. The predicates on Lines 4–9 use predefined functions: and that returns the \(n^\text {th}\) element of a logic list. Predicates can take one or several program points (C labels plus some Acsl labels: and ). The built-in specifies the value of an expression at a label . Using these features, states that a logic list does not change between two program points (Lines 20–24). Finally, Lines 25–36 define an axiomatic function that constructs a logic list from a C linked list. While it would be possible to write instead of Line 72 of Figure 4, the scope of the existential quantifier is just this line. Therefore, cannot be used in the post-conditions, hence the need for .

Let us now detail the contract of (its code is detailed below). The pre-conditions state that is a valid pointer to a list (Line 70), separated from every element in the list (Line 71), and refers to a linked list verifying the inductive predicate (Line 72). Line 73 specifies that the only locations the function is allowed to modify are , the head pointer of the list, and , the first element of the list at the exit point, i.e. the freshly allocated cell. We cannot reference the new list cell at the entry point because it is not allocated yet. In post-conditions, the returned value indicates whether or not the allocation is successful (Line 76). Regardless of the success, we expect the list invariants to hold (Lines 74–75). In case the allocation fails, we expect the pointer and the list contents to be unchanged (Lines 77–79). If it succeeds, we expect the list to be composed of the new cell followed by the old list (Lines 80–81), the old list being unchanged (Lines 82–83), and the fields of the new cell, and , resp., to point to the old list (Line 84) and to contain the expected value (Line 85).

Fig. 2.
figure 2

Inductive predicates for MemCAD.

2.2 Shape Analysis with MemCAD

The purpose of MemCAD  [10] is to automatically infer precise invariants about programs manipulating complex data structures. It is based on shape analysis [3], a static code analysis technique that discovers and verifies properties of recursive, dynamically allocated data structures. It relies on separation logic and abstract interpretation. Unlike in Wp, the analysis is global.

Fig. 3.
figure 3

Auxiliary MemCAD checks for linked lists.

To use MemCAD on linked lists defined on Lines 1–2 of Fig. 1, the user first defines an inductive predicate expressing a structural invariant of a well-formed linked list, such as predicate on Lines a–h of Fig. 2. A list, i.e. a pointer to a list cell, satisfies the predicate in two cases. Each case defines a memory separation formula and additional constraints. In the first case, the pointer is null (Line d) and no specific memory separation is required (Line c). This case has no additional arguments (cf. on Line b). The second case has two (existentially quantified) arguments: an address and an integer (Line e), denoted, resp., by and in the rest of the case. The pointer is non null and refers to a valid memory block of 8 bytes (Line h), assuming a 32-bit system. Lines f–g define the values of the fields and (at offsets 0 and 4) as and , and require separation between those fields and the rest of the list. The separation is expressed by the separating conjunction “ ” [10]. Notice that “ ” on Line f specifies separation recursively, for all list cells reached by the predicate via the inductive case. The user can insert the instruction to assume that list respects predicate , or to check the same property in MemCAD.

Predicate on Lines n–t is very close to predicate except that it only defines one list cell without recursion. Predicate on Lines j–m expresses that a double pointer to a list cell (i.e. of type ) is valid, refers to a well-formed list and is separated from its cells. Predicate is explained below.

Fig. 4.
figure 4

Functions and with contracts.

3 Combined Approach

3.1 Shape Analysis Assisted Verification

To prove complex memory-related annotations with Wp on real-life code [11], the user typically has to manually annotate the code with many additional carefully chosen assertions establishing structural invariants and separation properties at several intermediate program points, and to add numerous lemmas to facilitate reasoning about them (whose proof must usually be done manually in Coq, an interactive proof assistant). Our approach proposes to let MemCAD deal with the structural invariants of recursive data structures and separation properties, and to admit them in Wp at some key points.

In order to use both tools simultaneously in this way, we first need to show the equivalence between MemCAD and Wp inductive predicates. For MemCAD, predicate (Lines a–h of Fig. 2) specifies that each element of the list is a valid cell, is separated from every other cell of the list and the list is null-terminated. This is equivalent to the predicate for Wp (Lines 10–19 of Fig. 1) when we consider the whole list. Indeed, when is , this predicate also means that every list cell is valid and separated from any other list cell, and the list is null-terminated. Explicit separation conditions in the Acsl predicate for Wp are expressed by the separating conjunction in the MemCAD counterpart. (Notice that separation of with on Line 15 is trivial.) The sequence of list elements, expressed by a logic list in Acsl and used to prove functional properties about the contents of the list (cf. Lines 80–81) in Wp, does not need to be specified for MemCAD, which we only use to reason about structural properties.

To check if invariants hold in MemCAD, we define check functions shown in Fig. 3. These functions are specified to be side-effect-free (cf. Lines 40, 47) to prevent interference with the proof in Wp.

The first function, (Lines 41–43), checks that respects the predicate, i.e. is a valid pointer to a well-formed list from which it is separated (Line 42, see also Lines j–m of Fig. 2).

The goal of the second function, , is to check that refers to a list cell, respects the predicate, and the corresponding pointer and the list cells are separated from the cell referred to by . To do that in MemCAD, we introduce an ad-hoc structure with both pointers (Line 45). The function initializes a local structure (Lines 49–50) and takes its address (Line 51) in order to express the required check (Line 52). This check relies on the predicate (Lines v–z of Fig. 2) stating that the given pointer is non-null and refers to a structure with two pointers at offsets 0 and 4, denoted and , referring, resp., to a cell and to a double pointer to a well-formed list, which are separated (between them and from the list cells). Notice that “ ” on Line y specifies separation recursively, that is, from all locations considered in separation constraints reached via (and hence via ).

An important benefit of using MemCAD is its capacity to automatically handle dynamic memory allocation, which is not yet supported in Wp. Thus, we define a custom allocator that simulates the behavior of for list cells on Lines 59–67 of Fig. 4. Wp uses its contract, which is simple but currently unprovable by Wp since dynamic allocation is not supported (it should become provable when this support is added into Wp).

3.2 Proof of Function list_push

We illustrate our approach on function of Fig. 4. It tries to allocate a new cell (Lines 87–88), and, in case of success, puts it on top of the list with the given data (Lines 93, 95, 97, 103). Lines 92, 96 define ghost labels (that is, labels used only in annotations).

Lines 89–91 show how we use MemCAD to verify that the new cell (referred to by ) is separated both from the list cells and the pointer referred to by (Line 89), and introduce these properties as assumptions for Wp ( clauses on Lines 90–91). They help Wp to prove in an clause on Line 94 that the list remains unchanged since label (i.e. Line 92) despite writing into the new cell on Line 93, and a similar assertion for the old list on Lines 98–99 despite the assignment on Line 97.

Instead of reasoning about the modified list directly in Wp—which often presents another difficulty for deductive verification—we let MemCAD check the list invariants on Line 100 and admit them on Lines 101–102 for Wp to prove the post-conditions. Thanks to those assumptions, Wp successfully proves this function. Notice that the check instruction for MemCAD and the admit instructions for Wp are placed (for the moment, manually) at the same program location to ensure the soundness of the global verification.

In order to have a full proof, we also need to run MemCAD to verify all the checks in . For this purpose, we define a wrapper in Fig. 5 to analyze the call to on Line 108 with an arbitrary list respecting the given pre-conditions (which correspond in MemCAD, as we explained above, to assuming predicate for , cf. Lines 70–72, 107). MemCAD also succeeds in its analysis, hence, we can conclude that our function respects its Acsl contract.

While the annotation step is done manually in the current work, it can be better automated in the future. A coordinated generation of checks and assumptions for a given recursive data structure for both tools will facilitate the verification and the justification of soundness of the combined approach. An early idea consists in defining a domain-specific language for the description of the target recursive data structure that is then used for the generation of necessary predicates for MemCAD and for Wp as well as necessary assumptions and checks. The investigation of this research direction is left for future work.

Fig. 5.
figure 5

Wrapper to verify in MemCAD.

4 Case study on the tpm2-tss library

We tested our approach on a few (slightly simplified) functions of the tpm2-tss library, a widely used open-source implementation of the TPM Software Stack (TSS)Footnote 3 designed to access the Trusted Platform Module (TPM). The library uses a linked list to store and use TPM resources, such as objects sent to and received from the TPM. List cells are dynamically allocated. Simplifications were applied to data structures used for list cells (and their treatment).

We consider two functions, to add an object and to look for an object in a list, with one called function, and apply MemCAD to verify separation properties for a newly allocated cell that Wp is currently not able to deduce. A recent study [11] demonstrated that deductive verification with Wp of these functions required many additional lemmas and assertions, as well as the replacement of the dynamic memory allocation by a static allocator. Interestingly, the difficulty to verify real-life code was not caused by complex operations on lists—these operations are in reality quite simple in the target code—but by the difficulty to reason about the recursive data structure itself.

The proposed approach combining deductive verification with shape analysis allows us to perform a complete proof with less effort and without replacing dynamic allocator by a static allocator. On the considered functions, the proof with Wp alone [11] required 14 lemmas, leading to the generation of 241 proof obligations, one of which required a manually created Wp script, and took 4m50s. Thanks to combining Wp and MemCAD in our work, we could remove \(\sim \)45 auxiliary Acsl annotations and 5 lemmas, so the proof required only 9 lemmas, leading to 194 proof obligations using no scripts, and took 1min47s in total for Wp and MemCAD (the latter taking less than 1 sec.).

5 Related Work and Conclusion

Related Work. Various tools based on separation logic were proposed, such as VeriFast [8], Viper [7], VerCors [2]. He et al. [5] extract functional specification from imperative programs using a memory-safe type system and insert dynamic checks into the specification. GRASShopper [9] combines separation logic with an SMT-based verifier. Unlike in our work, GRASShopper does not integrate abstract interpretation based shape analysis (which allows us to infer structural invariants with MemCAD without having to provide loop invariants for this tool). Issues reported in a recent study [11] motivate such combinations for complex real-life code with recursive data structures. Our work continues previous efforts by proposing a combination of weakest-precondition based deductive verification with abstract interpretation based shape analysis on the source-code level, which, to the best of our knowledge, was not studied and evaluated before.

Conclusion and Future Work. This short paper has presented an approach combining deductive verification with Frama-C/Wp and shape analysis with MemCAD. Separation properties and structural invariants for linked data structures can be more easily proved by the latter, and then used as assumptions in the former, thus allowing it to focus on other properties. This work is still ongoing and opens interesting research questions and perspectives: automation of the proposed verification technique including a coordinated generation of checks and assumptions, proof of its soundness, design of a common (higher-level) specification mechanism for recursive data structures with automatic translation into suitable definitions for MemCAD and Frama-C, as well as evaluation on other relevant case studies.