A Verified Implementation of the Bounded List Container
This paper contributes to the trend of providing fully verified container libraries. We consider an implementation of the bounded doubly linked list container which manages the list in a fixed size, heap allocated array. The container provides constant time methods to update the list by adding, deleting, and changing elements, as well as cursors for list traversal and access to elements. The library is implemented in C, but we wrote the code and its specification by imitating the ones provided by GNAT for the standard library of Ada 2012. The proof of functional correctness is done using VeriFast, which provides an auto-active verification environment for Separation Logic extended with algebraic data types. The specifications proved entail the contracts of the Ada library and include new features. The verification method we used employs a precise algebraic model of the data structure and we show that it facilitates the verification and captures entirely the library contracts. This case study may be of interest for other verification platforms, thus we highlight the intricate points of its proof.
Standard libraries of programming languages provide efficient implementations for common data containers. The details of these implementations are abstracted away by generic interfaces which are specified in terms of well understood mathematical structures such as sets, multisets, sequences, and partial functions. The intensive use of container libraries makes their formal verification important.
However, the functional correctness of these libraries is challenging to verify for several reasons. Firstly, their implementation is highly optimized: it employs complex data structures and manages the memory directly through pointers/references or specific memory allocators. Secondly, the specification of containers is rarely formal. Notable exceptions are, e.g., Eiffel  and SPARK ; recently,  provided a specification of the Ada 2012 container library. The formal specifications are very important when the library employs constructs that are out of the scope of the underlying mathematical structure. A typical example of such constructs are iterators. For example, Java iterators are generic and can exist independently of the container; Ada 2012 iterators, called cursors, are part of the container. Thirdly, the specification of the link between the low level implementation and the mathematical specification requires hybrid logics that are able to capture both low level and high level specifications of the container. For verification purposes, these logics shall be supported by efficient solvers.
This work focuses on the functional verification of the bounded doubly linked lists container, which is a GNAT implementation  of the doubly linked lists container in the standard library of Ada 2012 . This container is currently used by client programs  written in SPARK , a subset of Ada targeted at safety- and security-critical applications. The lists have bounded capacity, fixed at the list creation, and thus avoid dynamic memory allocation during the container use. This feature is required in critical code, where it is necessary to supply formal guarantees on the maximal amount of memory used by the running code.
The container implementation is original compared with other implementations of linked lists inside arrays. It employs an array of fixed size in which it manages (i) the occupied array cells inside a doubly linked list representing the content of the container and (ii) a singly linked list of free array cells. The operations provided are classic for lists. The amortized constant time complexity is preserved by the implementation of insert and delete operations. The list elements are designated using (bi-directional) cursors, also used to traverse the list. In conclusion, the code of this container was designed to ensure efficiency of operations and not its verification, and therefore it provides a realistic test for the automated verification.
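The layout described above can be illustrated with a minimal sketch in C. It is not the verified library code: the names `node_t`, `bdll_t`, `bdll_init`, `bdll_insert`, `bdll_delete`, and the constant `CAPACITY` are hypothetical, and the free-list is chained eagerly at creation for simplicity. The sketch only shows how occupied cells form a doubly linked list and free cells a singly linked list inside one fixed-size array, so that insertion and deletion need no traversal.

```c
#include <assert.h>

#define CAPACITY 8              /* hypothetical fixed capacity */

/* Hypothetical node layout; all links are array indexes, 0 means "no cell". */
typedef struct {
    int prev;                   /* -1 marks a free cell in this sketch */
    int next;
    int data;
} node_t;

typedef struct {
    node_t nodes[CAPACITY + 1]; /* cell 0 is reserved and never stores data */
    int first, last, length;    /* occupied cells: doubly linked list */
    int free;                   /* free cells: singly linked list, 0 if empty */
} bdll_t;

/* Mark every usable cell as free and chain the free cells 1 -> 2 -> ... -> 0. */
static void bdll_init(bdll_t *l) {
    for (int i = 1; i <= CAPACITY; i++) {
        l->nodes[i].prev = -1;
        l->nodes[i].next = (i < CAPACITY) ? i + 1 : 0;
    }
    l->first = l->last = l->length = 0;
    l->free = 1;
}

/* Insert value before cell `before` (0 appends). Returns the index of the
   new cell, or 0 when the container is full. No loop: constant time. */
static int bdll_insert(bdll_t *l, int before, int value) {
    int n = l->free;
    if (n == 0) return 0;
    l->free = l->nodes[n].next;                 /* pop a cell from the free-list */
    int prev = (before == 0) ? l->last : l->nodes[before].prev;
    l->nodes[n].data = value;
    l->nodes[n].prev = prev;
    l->nodes[n].next = before;
    if (prev == 0) l->first = n; else l->nodes[prev].next = n;
    if (before == 0) l->last = n; else l->nodes[before].prev = n;
    l->length++;
    return n;
}

/* Unlink occupied cell n and push it back on the free-list. Constant time. */
static void bdll_delete(bdll_t *l, int n) {
    assert(n >= 1 && n <= CAPACITY && l->nodes[n].prev >= 0);
    int prev = l->nodes[n].prev, next = l->nodes[n].next;
    if (prev == 0) l->first = next; else l->nodes[prev].next = next;
    if (next == 0) l->last = prev; else l->nodes[next].prev = prev;
    l->nodes[n].prev = -1;
    l->nodes[n].next = l->free;
    l->free = n;
    l->length--;
}
```

In this sketch a cell index plays the role of a cursor: `bdll_insert(&l, c, v)` inserts `v` before the element designated by `c`, and a deleted cell is the first one reused by the next insertion.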
Thanks to the introduction of formal contracts in Ada 2012, the container has been fully specified recently based on a previous specification in Why3 by Dross et al. The specification given is “meant to facilitate the formal verification of code using this container”, and it is presently used to prove the clients written in SPARK. The container is specified in terms of a model representing a functional implementation of bounded vectors, also written in Ada. This kind of specification is a substitute for algebraic data types, which are not supported by Ada. It has the advantage of being executable, which enables the run-time verification of the implementation. An important feature of these contracts is their completeness with respect to the models considered for the container and the cursors. This aspect is a challenge for state-of-the-art verification tools. The formal verification of these contracts cannot be done by GNATprove, the deductive verification environment for SPARK, because the code employs language constructs out of its scope.
The goal of our study is to apply off-the-shelf verification tools to prove the full functional correctness and the memory safety of this implementation, without simplifying the code or the specification. To open the case study to more verification platforms, we chose to write this library in C, because C may capture all the features of the container implementation, except the strong typing and the generic types of Ada. The C implementation mimics the Ada code. The functional specification of the C code translates the container contracts from Ada based on (i) representation predicates that relate heap regions with algebraic models using inductively defined predicates, (ii) algebraic lists and maps, and (iii) inductively defined predicates and functions on the algebraic models. The logic including these features is undecidable in general. Therefore, we have to help the prover in order to obtain push-button verification. Auto-active verification environments are helpful in such tasks.
The invariant properties of the implementation and the features exhibited by the specification guided us towards deductive verification platforms that support Separation Logic (SL) and algebraic data types. Consequently, we chose the VeriFast auto-active verification tool, which provides means for (a) the specification of representation predicates in the style introduced for SL by O’Hearn et al., (b) the definition of polymorphic algebraic data types, predicates and functions, and (c) the definition of user lemmas to help verification. Using these features, we employ a verification methodology based on the refinement of the original specification. The refined specification not only captures the contracts accurately, but also eases the deductive verification process, i.e., the writing of lemmas. For example, we employ a style of writing representation predicates in SL that leads to simpler lemmas for list segment composition.
To summarize, we verified the C implementation of a bounded doubly linked list container against its functional specification. In addition, we verified the safety of memory accesses using Separation Logic. For this, we annotated the C code and we extended the VeriFast library for algebraic polymorphic lists with new predicates and lemmas. These logic developments may be used in other verification tools based on induction.
The paper begins by presenting the case study in Sect. 2. Then, we highlight in Sect. 3 the main ingredients of the verification approach used and the challenges we faced. Section 4 presents the experimental results. We compare this work with other approaches for verification of containers and complex data structures in Sect. 5.
2 Dynamic Bounded Doubly-Linked Lists
This section presents the container code and its functional specification.
Implementation: The code is written in a very simple fragment of C, which may be easily translated to most imperative programming languages. It uses records and pointers to records, dynamic memory allocation, classic accesses to record fields and array elements, basic integer type and its operations. Like in the original code, the container does not support concurrency and has been written to obtain efficient operations and not to ease the verification. The container elements are designated through cursors, which represent valid positions in the list; they may be moved forward and backward in the list. The container interface includes 30 operations including classic operations (creation, copy, size access, clearing and deallocation, equality test, searching) and a rich set of utilities (inserting or deleting bunches of elements at some position, searching from the end, merging lists, swapping elements or links, reversing in place, sorting).
Specification: The functional specification is model based . Two mathematical models are used: algebraic lists (i.e., finite sequences) to represent the content of the list and finite partial maps to model the set of valid cursors (see Sect. 2.3 for details). The contracts employ operations on these mathematical models that are beyond their classic usage. For example, the test of inclusion between the set of elements of two sequences, or the test that the domain of a partial mapping has been truncated from a given value. For this reason, we enriched the library of mathematical models provided by our prover with such operations and the corresponding axiomatizations (see Sect. 3.2).
An important feature of our functional specification is the usage of a refined abstraction for the list to ease the proof that the operations satisfy their contracts. We introduce a precise model for the list, which is an algebraic list of abstract cells, storing container values together with the links between the cells. This precise model is mapped to the abstract model (sequence of values) using a catamorphic mapping. Moreover, the precise model is used to compute the (abstract) model of cursors, based on a second catamorphic mapping. The use of the precise model facilitates the verification effort for proving that implementations of operations satisfy their contracts (see Sect. 3.1).
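To make the two mappings concrete, here is a small executable sketch in C of a precise model and of two folds over it. It is an illustration under stated assumptions, not the VeriFast ghost code of the paper: the type `cell_t` and the functions `model_of` and `position_of` are hypothetical names, and positions are assumed to be counted from 1, with 0 meaning that the cursor is not valid.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical abstract cell: the array index of the cell, its two links,
   and the stored value. `tail` is the rest of the algebraic list (NULL = nil). */
typedef struct cell {
    int index;
    int prev;
    int next;
    int value;
    const struct cell *tail;
} cell_t;

/* First mapping: precise model -> sequence of values.
   Writes the values into `out` and returns the length of the sequence.
   The result for a cons cell depends only on the head and on the result
   computed for the tail, which is what makes it a catamorphism (a fold). */
static size_t model_of(const cell_t *cells, int *out) {
    if (cells == NULL) return 0;                 /* nil case */
    out[0] = cells->value;                       /* head contribution */
    return 1 + model_of(cells->tail, out + 1);   /* combined with the tail result */
}

/* Second mapping: precise model -> cursor model, presented as a lookup.
   Returns the position (counted from 1, an assumption of this sketch) of the
   cell whose index equals `cursor`, or 0 if no such cell exists. */
static size_t position_of(const cell_t *cells, int cursor) {
    if (cells == NULL) return 0;
    if (cells->index == cursor) return 1;
    size_t p = position_of(cells->tail, cursor);
    return (p == 0) ? 0 : p + 1;
}
```

For a list stored in array cells 1, 4, 2 (in list order) with values 10, 20, 30, `model_of` yields the sequence 10, 20, 30 and `position_of` maps cursor 4 to position 2, while a cursor on an unused cell maps to 0.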
The functional specification is complete in the sense given by : the post-condition of each operation uniquely defines its result and the side effect on the model of the container and of its cursors. However, it does not check for memory overflow at the container creation.
2.2 List Container
List Elements: The data stored in the list container is typed by an abstract element type, defined as an alias to the integer type in our code. This coding is sound for the proof of the functional correctness of the container implementation because the container assumes only that values of the element type may be compared for equality.
The values of the C node type are abstracted by a ghost type, defined at line 1 in Fig. 1, which records the values of node fields. Three logic functions access the first, second, and third component, respectively, of a value of this ghost type.
The node representation predicate (line 6 in Fig. 1) relates a node allocated in the heap with its model. The allocation property is expressed by a predefined predicate. The values of the fields are bound to existentially quantified variables and used to build the model of the node. An auxiliary predicate constrains two of the node fields to be indexes in an array starting at index 0 and ending at a given maximal index.
There are two kinds of nodes in the array managed by the container: nodes occupied by list elements and nodes not yet used in the list, i.e., free. Free nodes have one of their link fields set to \(-1\), and the field holding the element is irrelevant. They are specified by a dedicated predicate (line 14 in Fig. 1), which also constrains one of its parameters to be equal to the value of one of the node fields. Occupied nodes have the same link field set to a non-negative integer, and the field holding the element is relevant. A second predicate (line 19 in Fig. 1) relates an occupied node with its abstract model.
The length of the list is stored in a dedicated field of the container. The indexes of the first and the last cells of the list are stored in two further fields. Another field, the free-list head, denotes the start of the list registering the free nodes. The operation creating the container allocates the node array and marks all nodes in the array as free. The fields denoting the size and the extreme cells of the doubly linked list are set to 0. The initialization of the free-list head is detailed in the next paragraph.
Acyclic List of Free Nodes: The free nodes are organized in a singly linked list, called the free-list. The start of this list is given by a dedicated field of the container type, the free-list head. If the free-list head is negative, the list is built from all nodes stored in a contiguous range of array indexes, bounds included; this permits a fast initialization of the free-list at the container creation. If the free-list head is positive, the free-list starts at the index it stores, uses one of the node link fields as successor relation, and ends at index 0. Figure 2 illustrates the two kinds of free-list. A first representation predicate (line 31 in Fig. 1) is used when the free-list head is negative. It collects in one of its parameters the sequence of the indexes of free nodes. For the second case, we define another predicate (line 38 in Fig. 1). The two kinds of free-list are combined in a third predicate (line 45 of Fig. 1) that calls the correct predicate depending on the sign of the free-list head. Notice that the constraints required by these predicates (the relation between container capacity, BDLL and free-list sizes) are all present in the Ada 2012 specification.
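The two kinds of free-list can be sketched in C as follows. This is a hedged illustration, not the verified code: the names `pool_t`, `pool_init`, `pool_allocate`, `pool_release`, the field names, and the constant `CAPACITY` are hypothetical. In particular, the sketch assumes that a negative head `-k` means "cells `k` to `CAPACITY` are free", and that releasing a cell while the head is negative first chains the remaining range; both are assumptions about one possible realization of the scheme.

```c
#include <assert.h>

#define CAPACITY 4   /* hypothetical fixed capacity */

/* Hypothetical node layout; cell 0 is reserved, cells 1..CAPACITY are usable. */
typedef struct {
    int prev;        /* -1 marks a free cell */
    int next;        /* successor index; 0 ends a list */
    int data;
} node_t;

typedef struct {
    node_t nodes[CAPACITY + 1];
    int free;        /* < 0: cells -free..CAPACITY are free (assumed encoding);
                        > 0: head of a linked free-list; 0: no free cell */
} pool_t;

/* Creation marks every cell as free but does not chain the cells:
   the negative head alone describes the whole free range. */
static void pool_init(pool_t *p) {
    for (int i = 1; i <= CAPACITY; i++) p->nodes[i].prev = -1;
    p->free = -1;
}

/* Returns the index of a fresh cell, or 0 when no cell is free. */
static int pool_allocate(pool_t *p) {
    int idx;
    if (p->free == 0) return 0;
    if (p->free > 0) {                    /* linked mode: pop the head */
        idx = p->free;
        p->free = p->nodes[idx].next;
    } else {                              /* range mode: take the lowest cell */
        idx = -p->free;
        p->free = (idx == CAPACITY) ? 0 : p->free - 1;
    }
    p->nodes[idx].prev = 0;               /* no longer marked free */
    p->nodes[idx].next = 0;
    return idx;
}

/* Gives cell idx back. In range mode the remaining range is chained once,
   after which the free-list stays in linked mode. */
static void pool_release(pool_t *p, int idx) {
    assert(idx >= 1 && idx <= CAPACITY && p->nodes[idx].prev >= 0);
    if (p->free < 0) {
        int start = -p->free;
        for (int i = start; i < CAPACITY; i++) p->nodes[i].next = i + 1;
        p->nodes[CAPACITY].next = 0;
        p->free = start;
    }
    p->nodes[idx].prev = -1;
    p->nodes[idx].next = p->free;         /* push on the linked free-list */
    p->free = idx;
}
```

The design choice illustrated here is that allocation stays constant time in both modes, and that creation never needs to write the successor links of the free cells.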
Representation Predicate: The invariants of the container are collected in a single representation predicate (line 52 in Fig. 1), which mainly specifies that the container is allocated in the heap (using a predefined predicate) and that its array field is also allocated as a block of records of the node type, covering the array indexes from 0 to the maximal index. The first node of this array has two of its fields set to \(-1\) and 0, respectively. The set of remaining nodes is split between the occupied list and the free-list, which are specified by two predicates combined with the separating conjunction. The size of the BDLL is exactly the size of its model and is stored in a dedicated field of the container.
A logic type abstracts the cursor implementation (line 1 in Fig. 3). The representation predicate for cursors (line 3 in Fig. 3) checks that the cursor content corresponds to an occupied node in the list using the precise model of the BDLL (see line 13). Moreover, from the precise model and the BDLL starting index, the predicate computes the two segments into which the cursor splits the precise model.
Notice that this manner of specifying the cursor model is coherent with the sequence model of the container, because the access to the elements of a sequence is based on positions. However, this specification choice does not combine well with inductive reasoning and induces additional work for the proof (see Sect. 3). We have to enrich the inductive list model with operations using positions. For example, we define an operation which returns the element at a given position of a list. We also defined two operations on association lists: one tests whether an abstract cursor is in the domain of the map, and the other obtains the value to which it is bound.
3 Verification Approach
We employ an auto-active verification approach, supported by the tool VeriFast. The auto-active approach provides more automation of the verification process based on the ability given to the user to help the prover by adding annotations and lemmas and the efficient use of back-end solvers. This section highlights the methodology applied to conduct auto-active verification for this case study. This methodology is independent of the specific tool used. We also comment on the advantages and difficulties encountered with the tool used. Notice that we did not have prior experience with VeriFast.
3.1 Model-Based Specification for Verification
The contracts provided for our container are in a first order logic over sequences and maps, which employs recursive logic functions. This theory is undecidable, so we have to provide lemmas to help the prover tackle verification conditions.
Usage of a precise model is the solution we found to ease the writing of lemmas. It consists in refining the abstract model used for the container specification into a model that captures more details on the container organization. The abstract model is obtained from the refined one using a catamorphic mapping. This methodology is required by the gap between the abstract model and the lower level implementation of the container.
The catamorphism mappings used to obtain the abstract model of the container and the model of valid cursors have good inductive definitions and enable efficient decision procedures . However, these decision procedures are not available in our verification tool; this work may be a motivation to add them.
Specification of user types by representation predicates mapping them to inductive types is classical in Separation Logic. We encode the invariant of the BDLL data structure in a representation predicate. The adoption of C for the implementation spares us, for example, the problems of verifying object-related properties. However, this choice leads to an annotation overhead because we have to specify that parameters of the container type satisfy the invariant.
Additional annotations have been supplied to axiomatize global constants (such as a constant record shown in Fig. 1) and arrays of user-defined structures (such as the node array of the container).
Contract cases are intensively used in the considered GNAT library. We got around the absence of contract cases in VeriFast using conditional expressions and logic predicates and functions that relate two models (old and new). We did not observe any expressivity or performance problems with this method of encoding contracts.
3.2 Support for Specification Types
Specification of model types is done based on the mathematical models sequence (or inductive polymorphic list) and map (or inductive polymorphic association list). The VeriFast libraries including these models (mainly list.*) contain 9 predicates and 20 lemmas, and are not enough for the operations on models required in our specifications. We added tens of lemmas and predicates. They are useful not only for the container proof but also for the verification of client programs with inductive back-end solvers. (Nowadays, these proofs are done by GNATprove by calling SMT solvers with quantifier support.)
This definition is not as easy to reason about as we might expect. In particular, some properties of this definition of inclusion such as reflexivity are only provable under the additional assumption that the keys are distinct.
Notice that these proofs are not necessary for provers with support for finite maps and sets. Although VeriFast supports Z3 as a back-end solver, it does not use it for such theories. The inductive theories are supported by other back-end solvers, e.g., CVC4, that are not connected to VeriFast.
3.3 Annotations Load
The annotations required by the proof of our library belong to two categories: (1) mandatory annotations including function contracts and predicates employed by these contracts and (2) auxiliary annotations including loop invariants, open/close of predicates, definitions and calls of lemmas.
VeriFast has to process all these annotations, since we cannot direct the prover in their usage. VeriFast provides two mechanisms to limit the burden of the auxiliary annotations: (i) lemmas can be marked as automated, which means they are given to the back-end solver on all problems, and (ii) inductive predicate definitions can be automatically folded and unfolded when used with computed parameters.
We introduce few automated lemmas and call them in order to lighten the prover load. We don’t observe performance problems by including all these annotations and despite the absence of modular proofs. The frame reasoning rule of Separation Logic seems to play an important role in this good behavior.
We found useful the two ways of specifying inductive predicates in VeriFast: by case on the model or by case over the aliasing of heap locations. We started with the first style, but finally chose the second to bring advantages of automatic folding and unfolding of computed predicates.
3.4 Challenges Addressed
We considered a functional specification which is already in use in client code. Therefore, we cannot adapt this specification to ease the verification. Instead, we propose a method that refines the specification using a precise model of the container; this eases the verification and allows us to obtain the initial specification with minimal cost.
The specification we received is complete with respect to the model of containers and cursors. This requires specifying logic functions and predicates that are more complex than the usual ones.
The code has been designed to obtain efficient container implementation and does not focus on verification. Therefore, the verification task has been more difficult compared with previous work verifying functional specification of container libraries [28, 39] designed with verification in mind.
Only specifications of contracts for public operations on the container were provided. We had to annotate the code and the internal operations. This implied an additional cost in annotations because some internal operations break the data structure invariants.
Having in mind the extension of this verification effort to other bounded container libraries (for sets or maps), we propose reusable logic libraries and suggest some improvements for the auto-active verification tool in use.
4 Verification Results
Bugs Found: We did not find spectacular bugs in the code, which is normal for a library that has been used for years. We only detected a potential arithmetic overflow in the computation of the memory to be allocated and a potential memory shortage. The latter problem is in fact handled for the SPARK clients using tools that measure the memory allocated by the program.
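The kind of overflow mentioned above can be guarded against as in the following sketch. It is only an illustration of the general technique, not the fix applied to the library: the type `node_t`, the functions `nodes_alloc_size` and `nodes_alloc`, and the assumption that the array holds `capacity + 1` records are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node record. */
typedef struct {
    int prev;
    int next;
    int data;
} node_t;

/* Computes the byte size of an array of (capacity + 1) nodes.
   Returns 1 and stores the size in *bytes if the product fits in size_t,
   returns 0 (leaving *bytes untouched) if it would overflow. */
static int nodes_alloc_size(size_t capacity, size_t *bytes) {
    /* capacity + 1 <= SIZE_MAX / sizeof(node_t), written without overflow */
    if (capacity > SIZE_MAX / sizeof(node_t) - 1) return 0;
    *bytes = (capacity + 1) * sizeof(node_t);
    return 1;
}

/* Allocates the node array, or returns NULL on overflow.
   malloc itself may still return NULL, which is the memory shortage case. */
static node_t *nodes_alloc(size_t capacity) {
    size_t bytes;
    if (!nodes_alloc_size(capacity, &bytes)) return NULL;
    return malloc(bytes);
}
```

A caller would test the result of `nodes_alloc` against `NULL` before using the array, which covers both the overflow and the shortage cases with a single check.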
Statistics on the proof
Specification Load: We have coded, specified, and verified 27 functions out of the 39 provided by the container library including equality and emptiness tests, clear, assign and copy, getting and setting one element, manipulating the cursors, inserting and deleting at some cursor, finding an element before and after a cursor. Most of the 12 remaining functions deal with sorting. The size of our development is given in Table 1. To obtain a specification close to the Ada 2012 one, we wrote two files of logic definitions for models (vfseq.gh and vfmap.gh) extending the VeriFast libraries. Additional fixpoint functions and lemmas required on VeriFast lists are written in file vflist.gh. The ratio between source code and annotations is about 1 to 9. The required annotations (i.e., data structure invariant, pre/post conditions, and logic predicates and functions used directly in them) represent a quarter of all annotations (which also include loop invariants and lemmas). In the Ada 2012 container, the ratio between source code and contracts is already about 1 to 3.
Verification Performance: We ran VeriFast on a Linux machine with 16 GB of RAM and a 2.70 GHz Intel Core i5 processor. The back-end solver of VeriFast was Redux. The verification takes 1.3 s for the full container.
5 Related Work
The verification of individual data structures has received special attention. General safety properties (i.e., absence of out of array bounds accesses, null dereferences, division by zero, arithmetic overflow) may be verified automatically with low load of annotations using static analysis methods, e.g. [13, 17, 19, 21]. More complex properties like reachability of locations in the heap and shape of the data structures could also be proved with static analysis methods based on shape analysis, e.g., [4, 6, 10, 32]. These automatic techniques have been applied to linked lists coded in arrays . These methods concern limited properties and may be used in the early stages of the library development to infer internal invariant properties. Extension of fully automatic techniques to cover functional specification abstractions like sets or bags are based either on shape analysis, e.g., [7, 14] or on logic fragments supported by SMT decision procedures [16, 18, 37, 38]. These functional specifications capture essential mathematical properties of the data structure but do not deal with properties of iterators over them.
At the opposite end of the spectrum of verification techniques, interactive provers have been used to obtain detailed specifications about data structures based on powerful theories, e.g., [8, 23, 29], but they require expertise and great amount of proof scripting.
At the intermediate level of automation, functional verification tools have been used to tackle the verification of specific data structures (e.g., Dafny, GRASShopper, VeriFast, or Why3), but we are not aware of any experiment on bounded lists.
The full functional correctness of container libraries has been considered in [28, 39]. They consider complex data structures in imperative and object oriented languages that require verifying special properties and may benefit from modular verification thanks to inheritance. In both cited works, a special effort has been deployed to improve the prover to call solvers for different theories or to generate verification conditions that may be dealt with efficiently. These efforts lead to a low annotation overhead, especially in . We use an off-the-shelf auto-active verification tool but improve its performance by employing a refinement method which leads to more automation but a larger annotation overhead. None of these works consider the bounded list container.
We apply auto-active verification provided by the VeriFast tool to prove the functional specifications of the bounded doubly linked list container. The implementation we consider is in C, but it mimics the GNAT library , which is used in SPARK client programs. The functional specification is model-based and uses sequence and map mathematical models in a specific way to model the content of the list and its valid cursors. Our main contributions are (i) the improvement of the logic libraries of VeriFast to deal with such specific models and (ii) the use of a refinement based methodology to ease the proof automation.
This case study provides a motivation for the development of inductive solvers and their connection with auto-active provers like VeriFast. This experiment is another demonstration of the known fact (see ) that proving functional specifications of real world containers is more difficult than proving functional specifications of data structures. The support for automation of these proofs is of utmost importance to scale the verification to a full library of containers.
Data Availability Statement and Acknowledgements
The annotated code of the library and the sources used by its proof are available in the figshare repository . The file-set also includes the distribution of the VeriFast tool running in the virtual machine provided at https://doi.org/10.6084/m9.figshare.5896615.
This work has been supported by the French national research organization ANR (grant ANR-14-CE28-0018). We thank Claire Dross and Yannick Moy from AdaCore for guiding us through the Ada standard library and for supplying the latest version of its specification. We thank Samantha Dihn for the first C version of the Ada containers.
- 1.Ada Europe: Ada Reference Manual - Language and Standard Libraries, Chapter A.18.3 The Generic Package Containers. Doubly_Linked_Lists Norm ISO/IEC 8652:2012(E) (2012). http://www.adaic.org/resources/add_content/standards/12rm/html/RM-TTL.html
- 2.AdaCore: SPARK verification gallery. https://github.com/AdaCore/spark2014/tree/master/testsuite/gnatprove/tests
- 3.Baudin, P., Filliâtre, J.C., Hubert, T., Marché, C., Monate, B., Moy, Y., Prevosto, V.: ACSL: ANSI C Specification Language (preliminary design V1.2), preliminary edition, May 2008
- 7.Chin, W.-N., David, C., Nguyen, H.H., Qin, S.: Automated verification of shape, size and bag properties via user-defined predicates in separation logic. Sci. Comput. Program. 77(9), 1006–1036 (2012). The Programming Languages track at the 24th ACM Symposium on Applied Computing (SAC 2009)
- 8.Chlipala, A., Malecha, J.G., Morrisett, G., Shinnar, A., Wisnesky, R.: Effective interactive proofs for higher-order imperative programs. In: Proceedings of ICFP, pp. 79–90. ACM (2009)
- 12.GNU Fundation: GNAT library components in gcc 7.1. https://sourceware.org/svn/gcc/tags/gcc_7_1_0_release/gcc/ada/ files a-cfdlli.ad*
- 20.Leino, K.R.M., Moskal, M.: Usable auto-active verification (2010). http://www.fm.csl.sri.com
- 22.McCormick, J.W., Chapin, P.C.: Building High Integrity Applications with SPARK. Cambridge University Press, Cambridge (2015)
- 29.Pottier, F.: Verifying a hash table and its iterators in higher-order separation logic. In: Proceedings of CPP, pp. 3–16. ACM (2017)
- 31.Reynolds, J.C.: Separation logic: a logic for shared mutable data structures. In: Proceedings of LICS, pp. 55–74. IEEE (2002)
- 33.Rustan, K., Leino, M.: Main Microsoft research Dafny web page. http://research.microsoft.com/en-us/projects/dafny
- 36.Why3-Team: Why3 verification gallery. http://toccata.lri.fr/gallery/
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.