String Abstraction for Model Checking of C Programs

  • Conference paper
Model Checking Software (SPIN 2019)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11636)


Automatic abstraction is a powerful software verification technique. In this paper, we elaborate an abstract domain for C strings, that is, null-terminated arrays of characters. We describe the abstract semantics of basic string operations and prove their soundness with respect to the previously established concrete semantics of those operations. In addition to a selection of string functions from the standard C library, we provide semantics for character access and update, which enables automatic lifting of arbitrary string-manipulating code into the domain.

The domain we present (called M-String) has two other abstract domains as its parameters: an index (bound) domain and a character domain. Picking different constituent domains allows M-String to be tailored for specific verification tasks, balancing precision against complexity.

In addition to describing the domain theoretically, we also provide an executable implementation of the abstract operations. Using a tool that automatically lifts existing programs into the M-String domain, together with an explicit-state model checker, we have evaluated the proposed domain experimentally on a few simple but realistic test programs.

This work has been partially supported by the Czech Science Foundation grant No. 18-02177S.

    The string of interest of a character array is the sequence of characters up to and including the first null character. If the null character occurs at the first index of the array, the string of interest is “null”. If no null character occurs in the array, the string of interest is “undefined”. In all other cases, the string of interest is “well-defined”.

    For scalars in C programs, we use the bitvector theory.

    The benchmarks were run on an Intel Xeon E5-2630 processor clocked at 2.60 GHz. To make the benchmarks easier to reproduce, we provide instructions and scripts in the online supplementary material.

    The implementations were taken from pdclib, a public-domain libc implementation.
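
The three cases in the note on the string of interest can be illustrated on a concrete buffer. The following is a minimal sketch, not code from the paper: the names `soi_kind` and `classify_soi` are hypothetical, and the sketch assumes the array is passed together with its size in bytes. It only classifies a concrete array; it does not implement the M-String abstraction.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, introduced only for this illustration. */
typedef enum { SOI_NULL, SOI_WELL_DEFINED, SOI_UNDEFINED } soi_kind;

/*
 * Classify the string of interest of the array buf[0 .. size-1].
 * If a null character is found and len_out is non-NULL, *len_out
 * receives the length of the string of interest, counting the
 * terminating null character.
 */
static soi_kind classify_soi(const char *buf, size_t size, size_t *len_out)
{
    /* memchr returns a pointer to the first '\0' within size bytes,
       or NULL if there is none. */
    const char *nul = memchr(buf, '\0', size);
    if (nul == NULL)
        return SOI_UNDEFINED;       /* no terminator in the array */

    size_t idx = (size_t)(nul - buf);
    if (len_out != NULL)
        *len_out = idx + 1;         /* terminator is included */

    return idx == 0 ? SOI_NULL : SOI_WELL_DEFINED;
}
```

For example, a four-byte array holding `"abc"` would be classified as well-defined with a string of interest of length 4, a one-byte array holding `""` as null, and a three-byte array holding `'a'`, `'b'`, `'c'` with no terminator as undefined.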


