Incremental bounded model checking for embedded software

Program analysis is on the brink of mainstream usage in embedded systems development. Formal verification of behavioural requirements, finding runtime errors and test case generation are some of the most common applications of automated verification tools based on bounded model checking (BMC). Existing industrial tools for embedded software use an off-the-shelf bounded model checker and apply it iteratively to verify the program with an increasing number of unwindings. This approach unnecessarily wastes time repeating work that has already been done and fails to exploit the power of incremental SAT solving. This article reports on the extension of the software model checker CBMC to support incremental BMC and its successful integration with the industrial embedded software verification tool BTC EMBEDDED TESTER. We present an extensive evaluation over large industrial embedded programs, mainly from the automotive industry. We show that incremental BMC cuts runtimes by one order of magnitude in comparison to the standard non-incremental approach, enabling the application of formal verification to large and complex embedded software. We furthermore report promising results on analysing programs with arbitrary loop structure using incremental BMC, demonstrating its applicability and potential to verify general software beyond the embedded domain.


Introduction
Recent trend estimation [25] in automotive embedded systems revealed the ever-growing complexity of computer systems, which provide increased safety, efficiency and entertainment satisfaction. Hence, automated design tools are vital for managing this complexity and for supporting the verification processes in order to satisfy the high safety requirements stipulated by safety standards and regulations. Similar to the developments in hardware verification in the 1990s, verification tools for embedded software are becoming indispensable in industrial practice for hunting runtime bugs, checking functional properties and generating test suites [24]. For example, the automotive safety standard ISO 26262 [36] requires the test suite to satisfy modified condition/decision coverage [30], a goal that is laborious to achieve without support by a model checker that identifies unreachable test goals and suggests test vectors for difficult-to-reach test goals.
In this paper, we focus on the application of Bounded Model Checking (BMC) to this problem. The technique is highly accurate (no false alarms) and is furthermore able to generate counterexamples that aid debugging and serve as test vectors. The spiralling power of SAT solvers has made this technique scale to reasonably large programs and has enabled industrial application.
In BMC, the property of interest is checked for traces that execute loops up to a given number of times k. Since the value of k that is required to find a bug is not known a priori, one has to try increasingly larger values of k until a bug is found. The analysis is aborted when memory and runtime limits are exceeded. Most existing industrial verification tools use an off-the-shelf Bounded Model Checker and, without additional information about the program to be checked, apply it in an iterative fashion:
  k = 0
  while true do
    if BMC(program, k) fails then
      return counterexample
    fi
    k++
  od
This basic procedure offers scope for improvement. In particular, note that the Bounded Model Checker has to redo the work of generating and solving the SAT formula for time frames 0 to k when called to check time frame k + 1. It is desirable to perform the verification incrementally for iteration k + 1 by building upon the work done for iteration k.
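The cost of restarting can be illustrated on a toy system. The following sketch is not Cbmc's algorithm: it replaces the SAT-based check by explicit-state exploration of a hypothetical saturating counter, and all names (step, post, bmc, find_bug_nonincremental, find_bug_incremental) are our own assumptions. It counts how many time frames each scheme has to build before the bug is found.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NUM_STATES 64 /* hypothetical toy state space: a counter in 0..63 */

/* Toy transition relation: input 0 adds 1, input 1 adds 3 (saturating). */
static int step(int x, int input) {
    int next = x + (input ? 3 : 1);
    return next < NUM_STATES ? next : NUM_STATES - 1;
}

/* Build one time frame: the image of a state set under the transition relation. */
static void post(const bool from[NUM_STATES], bool to[NUM_STATES]) {
    memset(to, 0, NUM_STATES * sizeof(bool));
    for (int x = 0; x < NUM_STATES; x++) {
        if (from[x]) {
            to[step(x, 0)] = true;
            to[step(x, 1)] = true;
        }
    }
}

/* Stand-in for BMC(program, k): is `error` reachable within k steps from 0?
   Every call starts from scratch and rebuilds time frames 1..k. */
static bool bmc(int error, int k, long *frames_built) {
    bool cur[NUM_STATES] = {false}, next[NUM_STATES];
    cur[0] = true;
    for (int j = 0; j <= k; j++) {
        if (cur[error]) return true;
        if (j < k) {
            post(cur, next);
            memcpy(cur, next, sizeof cur);
            (*frames_built)++;
        }
    }
    return false;
}

/* Non-incremental scheme: restart the checker for every k. */
int find_bug_nonincremental(int error, int max_k, long *frames_built) {
    assert(error >= 0 && error < NUM_STATES);
    for (int k = 0; k <= max_k; k++)
        if (bmc(error, k, frames_built)) return k;
    return -1; /* no bug found up to max_k */
}

/* Incremental scheme: keep the last frame and extend it by one step. */
int find_bug_incremental(int error, int max_k, long *frames_built) {
    bool cur[NUM_STATES] = {false}, next[NUM_STATES];
    assert(error >= 0 && error < NUM_STATES);
    cur[0] = true;
    for (int k = 0; k <= max_k; k++) {
        if (cur[error]) return k;
        if (k < max_k) {
            post(cur, next);
            memcpy(cur, next, sizeof cur);
            (*frames_built)++;
        }
    }
    return -1; /* no bug found up to max_k */
}
```

For the error state 10, which is first reachable after 4 steps, the non-incremental scheme builds 0 + 1 + 2 + 3 + 4 = 10 time frames, whereas the incremental scheme builds 4. The quadratic versus linear growth is the effect that incremental BMC aims to avoid; the reuse of learnt clauses inside the SAT solver is not modelled here.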
Incremental BMC has been applied successfully to the verification of hardware designs, and has been reported to yield substantial speedups [51,22]. Fortunately, the typical control-loop structure of embedded software resembles the monolithic transition relation of hardware designs, and thus strongly suggests incremental verification of successive loop unwindings. However, to our knowledge, none of the software model checkers for C programs that competed in the TACAS 2014 Software Verification Competition implements such a technique that ultimately exploits the full power of incremental SAT solving [53,21].
Contributions. The primary contribution of this paper is experimental. We quantify the benefit of incremental BMC in the context of the verification of embedded software. To this end, (1) we survey the techniques that are state of the art in embedded software verification, briefly summarise the underlying theory, and highlight the challenges faced when applying them to industrial code; (2) we present the first industrial-strength implementation of incremental BMC in a software model checker for ANSI-C programs, combining symbolic execution, slicing and incremental SAT solving. Besides loop unwinding, we also elucidate other applications of incremental SAT solving in a state-of-the-art Bounded Model Checker; (3) we report on the successful integration of our incremental bounded model checker in the industrial embedded software verification tools BTC EmbeddedTester and EmbeddedValidator, where it has been used by several hundred industrial users since versions 3.4 and 4.3, respectively; and (4) we give a comprehensive experimental evaluation over a large set of industrial embedded benchmarks that quantifies the performance gain due to incremental BMC: our new method outperforms the winner of the TACAS 2014 Software Verification Competition [41] by one order of magnitude.

Verification of Model-based Embedded Software
Model-based development is well-established in embedded software development, and particularly popular in the automotive industry. Tools such as Simulink are in widespread use for modelling, code generation and testing. In this paper, we focus on the verification of C code generated from these models. To this end, we illustrate the characteristics of this verification problem with the help of a well-known case study (Sec. 2.1) and explain the workflow and principal techniques that a state-of-the-art embedded software verification tool uses.

Case Study: Fault-Tolerant Fuel Control System
The Fault-Tolerant Fuel Control System (FuelSys) for a gasoline engine, originally introduced as a demonstration example for MATLAB Simulink/Stateflow and then adapted for dSPACE TargetLink, is representative of a variety of automotive applications as it combines discrete control logic via Stateflow with continuous signal flow expressed by Simulink or TargetLink and thus establishes a hybrid discrete-continuous system. More precisely, the control logic of FuelSys is implemented by six automata with two to five states each, while the signal flow is further subdivided into three subsystems with a rich variety of Simulink/TargetLink blocks involving arithmetic, lookup tables, integrators, filters and interpolation (Fig. 1). The system is designed to keep the air-fuel ratio nearly constant depending on the inputs given by a throttle sensor, a speed sensor, an oxygen sensor (EGO) and a pressure sensor (MAP). Moreover, it is tolerant to individual sensor faults and is designed to be highly robust, i.e. after detection of a sensor fault the system is dynamically reconfigured.
Properties of interest. The key functional property for FuelSys is how the air-fuel ratio evolves for each of the four sensor-failure scenarios. Simulation-based approaches show that FuelSys is indeed fault-tolerant in each case of a single failure: the air-fuel ratio can be regulated after a few seconds to about 80 % of the target ratio. In addition to functional testing of industrial embedded software, safety standards call for structural testing of the production code before release deployment. In Sec. 2.3, we give a brief overview of such standards and the state of affairs of their implementation in practice.
Fig. 1: The Simulink Diagram for the Fault-Tolerant Fuel Control System

Structure of Generated Code
Many modelling languages follow the synchronous programming paradigm [28], which is well-suited for modelling time-triggered systems, in which tasks (subsystems of the model) execute at given rates. Code generation for such languages produces a typical code structure, which corresponds essentially to a non-preemptive operating system task scheduler. Most code generators provide the scheduler for time-triggered execution or code to interface with popular real-time operating systems. In either case, the functionality corresponds to a program built around a single main loop that repeatedly executes the tasks that are due. The distinguishing characteristic of such a reactive program is its unbounded main loop, which we will analyse incrementally. All other loops contained within that loop, e.g. to iterate over arrays or interpolate values using look-up tables, have a statically bounded number of iterations and can be fully unwound.
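A minimal sketch of this code structure in C, under our own assumptions: the type and function names (inputs_t, state_t, outputs_t, init, compute_step, run), the filter and the look-up table are hypothetical and are not taken from any code generator. The sketch shows one main loop that wraps a step function, with an inner loop whose bound is statically known.

```c
#include <assert.h>

#define TABLE_SIZE 4 /* statically known bound of the inner loop */

typedef struct { int sensor; } inputs_t;    /* hypothetical input signals   */
typedef struct { int filtered; } state_t;   /* hypothetical state variables */
typedef struct { int actuator; } outputs_t; /* hypothetical output signals  */

static const int breakpoints[TABLE_SIZE] = {0, 10, 20, 30};
static const int table[TABLE_SIZE] = {5, 7, 12, 20};

/* Inner loop: linear search in a look-up table; it can be fully unwound. */
static int lookup(int x) {
    int result = table[0];
    for (int i = 0; i < TABLE_SIZE; i++)
        if (x >= breakpoints[i]) result = table[i];
    return result;
}

void init(state_t *st) { st->filtered = 0; }

/* One execution of the periodic task: update the state, compute the outputs. */
void compute_step(const inputs_t *in, state_t *st, outputs_t *out) {
    st->filtered = (st->filtered + in->sensor) / 2; /* simple low-pass filter */
    out->actuator = lookup(st->filtered);
}

/* Scheduler loop. Generated code would loop forever and read its inputs from
   hardware; here `ticks` bounds the loop and the inputs come from an array,
   only so that the sketch terminates and can be run. */
int run(const int *sensor_trace, int ticks) {
    state_t st;
    outputs_t out = {0};
    assert(ticks >= 0);
    init(&st);
    for (int t = 0; t < ticks; t++) {
        inputs_t in = { sensor_trace[t] };
        compute_step(&in, &st, &out);
    }
    return out.actuator;
}
```

In this shape, one unwinding of the main loop corresponds to one call of compute_step, i.e. to one time frame of the transition relation, while lookup contributes a fixed amount of formula per frame.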

Analysis with BMC and k-induction
Recent safety standards, e.g. ISO 26262 [36] [...]
Property Instrumentation. Formal verification requires formalisations of high-level requirements, often using observer Büchi automata [12] with a dedicated 'error state' generated from temporal logic descriptions. Test vector generation is done for code-coverage criteria such as branches, statements, conditions and MC/DC [30] of the production C code. For FuelSys, for example, MC/DC instrumentation yields 251 test goals. The properties to be verified or tested have in common that they can be reduced to a reachability problem. In formal verification of safety properties, we prove that the error state is unreachable, whereas the aim of test vector generation is to obtain a trace that demonstrates reachability of the goal state.
To validate whether the air-fuel ratio in the FuelSys controller is regulated after a few seconds to be within some margin of the target ratio, one has to instrument the reactive program, as sketched above, with an observer implementing the asserted property. For instance, consider the requirement "If some sensor fails for the first time then within 10 seconds the air-fuel ratio will keep in between the range of 80 % to 120 % of the target ratio forever." An observer for this requirement is a small code fragment containing an assertion that encodes the property. In order to verify that the above property actually holds, one has to show that the assertion in the observer code is always satisfied. We use BMC for refutation of the assertion, and k-induction for proving it.
Bounded Model Checking. BMC [3,16] can be used to check the existence of a path $\pi = s_0, s_1, \ldots, s_k$ of length k between two sets of states described by propositional formulae φ and ψ. This check is performed by deciding satisfiability of the following formula, where T denotes the transition relation, using a SAT or SMT solver:
$$\phi(s_0) \wedge \bigwedge_{0 \le j < k} T(s_j, i_j, s_{j+1}) \wedge \psi(s_k) \qquad (1)$$
If the solver returns the answer "satisfiable", it also provides a satisfying assignment to the variables $(s_0, i_0, s_1, i_1, \ldots, s_{k-1}, i_{k-1}, s_k)$. The satisfying assignment represents one possible path $\pi = s_0, s_1, \ldots, s_k$ from φ to ψ and identifies the corresponding input sequence $i_0, \ldots, i_{k-1}$. Hence, BMC is useful for refuting safety properties (where φ gives the set of initial states and ψ defines the error states) and generating test vectors (where ψ defines the test goal to be covered).
Unbounded Model Checking by k-Induction. BMC can prove reachability, whereas unreachability can be shown using induction. The predicate ¬ψ is an (inductive) invariant, i.e., it holds in all reachable states, if each of the following two formulae, base case (BC) and induction step (SC), is unsatisfiable:
$$(\mathrm{BC})\quad \phi(s) \wedge \psi(s) \qquad\qquad (\mathrm{SC})\quad \neg\psi(s) \wedge T(s, i, s') \wedge \psi(s') \qquad (2)$$
Both formulae can be decided with the help of a SAT or SMT solver.
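Returning to the observer for the 10-second requirement stated above, the following is a minimal sketch of how such an observer might be written in C. It is not the instrumentation produced by any tool: the names (observer_t, observer_init, observer_step, observer_assert) are hypothetical, and we assume that the deadline is given as a number of main-loop iterations and that the observer is called once per iteration.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool fault_seen;       /* has any sensor failed so far?           */
    int steps_since_fault; /* loop iterations since the first failure */
    int deadline_steps;    /* 10 seconds expressed in loop iterations */
} observer_t;

void observer_init(observer_t *o, int deadline_steps) {
    o->fault_seen = false;
    o->steps_since_fault = 0;
    o->deadline_steps = deadline_steps;
}

/* Called once per iteration of the main loop.
   Returns false iff the requirement is violated in this iteration. */
bool observer_step(observer_t *o, bool sensor_fault, double ratio, double target) {
    if (!o->fault_seen) {
        if (sensor_fault) o->fault_seen = true; /* first failure starts the clock */
    } else if (o->steps_since_fault < o->deadline_steps) {
        o->steps_since_fault++;
    }
    bool deadline_passed = o->fault_seen && o->steps_since_fault >= o->deadline_steps;
    bool in_range = ratio >= 0.8 * target && ratio <= 1.2 * target;
    return !deadline_passed || in_range;
}

/* In instrumented code the check is an assertion, which the model checker
   tries to violate (BMC) or to prove invariant (k-induction). */
void observer_assert(observer_t *o, bool sensor_fault, double ratio, double target) {
    assert(observer_step(o, sensor_fault, ratio, target));
}
```

The error states ψ of the reachability problem are then exactly the states in which the assertion fails, i.e. the deadline has passed and the ratio is outside the 80 % to 120 % band.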
The property of interest is often not inductive, however, and the check above fails. An option is to strengthen the property, e.g., using auxiliary invariants obtained using an abstract interpreter. Furthermore, the criterion above can be generalised to k-induction [49,22,27,18]:
$$(\mathrm{BC})\quad \phi(s_0) \wedge \bigwedge_{0 \le j < k} T(s_j, i_j, s_{j+1}) \wedge \bigvee_{0 \le j \le k} \psi(s_j)$$
$$(\mathrm{SC})\quad \bigwedge_{0 \le j \le k} \big(\neg\psi(s_j) \wedge T(s_j, i_j, s_{j+1})\big) \wedge \psi(s_{k+1}) \qquad (3)$$
For k = 0, this coincides with the criterion above. The base case checks whether ¬ψ holds in the first k steps, whereas the induction step checks if we can conclude from the invariant holding over any k consecutive steps that it holds for the (k + 1)st step. If the base case fails, i.e. the formula (BC) is satisfiable, we have refuted the property. If it holds and the induction step fails, we do not know whether ¬ψ is invariant. Only if both checks succeed, i.e. both formulae are unsatisfiable, have we proved that ¬ψ is invariant. Both base case and induction step are essentially instances of BMC: starting from the initial state φ for the base case, and starting from any state for the induction step. Thus, similar to BMC, k-induction can be applied by using a sequence of increasing values for k.
Challenges. Embedded C code has to meet many conflicting requirements like real-time constraints, low memory footprint and low energy consumption. Code generators offer options to perform certain optimisations towards these goals, often to the detriment of code size (and also readability for humans). The observer instrumentation to encode properties and identify the test goals corresponding to code-coverage criteria such as MC/DC produces a non-negligible overhead in the size of the code but introduces little semantic complexity. The size of the formula built further increases whenever internal loops need to be unwound, for example to perform a linear search in lookup tables in the FuelSys example. File sizes of 10 MB and more are common, which poses difficulties to many tools already when parsing the source code and encoding the program into a SAT formula, mostly due to inefficient data structures. Incremental BMC helps to reduce formula sizes and peak memory consumption (see Sec. 4.2) by incremental formula generation and solving.
The next generation of automotive electronic control units will be equipped with floating point (FP) units [35] to perform complex arithmetic used to implement control laws. FuelSys is an example of this trend. The use of FP arithmetic comes with many hard-to-avoid pitfalls. Many verification tools interpret FP numbers as reals, an approach which gives rise to false positives and is unable to detect certain bugs. Hence, even though bit-precise reasoning about FP arithmetic is expensive, it is indispensable in order to spot intricate bugs. State-of-the-art methods are based on bit-blasting and bitvector refinement [6] (see Sec. 3.3), but new abstraction-based solvers are emerging (e.g. [29,5]).
In practice, many loop unwindings may be needed to detect errors and reach certain test goals (more than 100 for some of our industrial benchmarks, see Sec. 4.2). Non-incremental bounded model checking repeats work such as file parsing, loop unwinding and SAT formula encoding, and discards the information learnt in the SAT solver every time it is called, and so gives away an enormous amount of performance. This effect exacerbates the cost of the large unwinding limits that may be needed.
A great challenge is to exploit all the benefits of incrementality in BMC and to significantly enhance performance of its integration with an industrial-strength embedded verification and test-vector generation tool.

Incremental BMC
In this section, we explain the technical background of incremental SAT solving and how it is employed in our implementation of incremental BMC.

Incremental SAT solving
The first ideas for incremental SAT solving date back to the 1990s [34,50,38]. The question is how to solve a sequence of similar SAT problems while reusing effort spent on solving previous instances, i.e., reusing the internal state and learnt information of the solver.
Obviously, incremental SAT solving is easy when the modification to the CNF representation of the problem makes it grow monotonically. This means that if we want to solve a sequence of (increasingly constrained) SAT problems with CNF formulae Φ(k) for k ≥ 0 then Φ(k) must be growing monotonically in k, i.e. Φ(k + 1) = Φ(k) ∧ ϕ(k) for CNF formulae ϕ(k).
Removal of clauses from Φ(k) is trickier, as some of the clauses learnt during the solving process are no longer implied by the new instance, and need to be removed as well. Earlier work has identified conditions for the reuse of learnt clauses [51,53], but this requires expensive book-keeping, which partially saps the benefit of incrementality.
The most popular approach to incremental solving is to solve SAT problems under assumptions [22]: assumptions are modelled as the first decision literals made by the SAT solver. If a learnt clause is derived from an assumption, then it will contain that assumption literal. This light book-keeping enables the SAT solver to maintain its performance when using assumptions. SAT solving under assumptions allows us to emulate the removal of clauses as explained in Sec. 3.2.
SMT solvers offer an interface for pushing and popping clauses in a stack-like manner. Pushing adds clauses, popping removes them from the formula. This makes the modification of the formula intuitive to the user, but the efficiency depends on the underlying implementation of the push and pop operations. For example, in [26] it was observed that some SMT solvers (like Z3) are not optimised for incremental usage and hence perform worse incrementally than non-incrementally. Since Cbmc itself implements powerful bitvector decision procedures, we use the SAT solver MiniSAT2 [21] as a backend solver.
We will see that incremental BMC requires a non-monotonic series of formulae. For SAT solvers, solving under assumptions is the prevalent method, hence we will focus on this technique in the sequel.
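How solving under assumptions can emulate clause removal may be easier to see in code. The following sketch is an assumption-laden illustration, not MiniSAT2's interface: solver_t, add_clause and solve are hypothetical names, and the solver is a brute-force enumeration that does not learn clauses. What it does show is the interface discipline: clauses are only ever added, and a clause guarded by an activation literal is switched on or off per call through the assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_VARS 16
#define MAX_CLAUSES 32
#define MAX_LITS 4

/* A literal is +v or -v for a variable v in 1..MAX_VARS. */
typedef struct { int lits[MAX_LITS]; int size; } clause_t;
typedef struct { clause_t clauses[MAX_CLAUSES]; int num_clauses; int num_vars; } solver_t;

void solver_init(solver_t *s, int num_vars) {
    assert(num_vars >= 1 && num_vars <= MAX_VARS);
    s->num_clauses = 0;
    s->num_vars = num_vars;
}

/* Clauses are only ever added, never removed. */
void add_clause(solver_t *s, const int *lits, int size) {
    assert(size >= 1 && size <= MAX_LITS && s->num_clauses < MAX_CLAUSES);
    clause_t *c = &s->clauses[s->num_clauses++];
    c->size = size;
    for (int i = 0; i < size; i++) c->lits[i] = lits[i];
}

static bool lit_true(int lit, unsigned assignment) {
    bool value = (assignment >> (abs(lit) - 1)) & 1u;
    return lit > 0 ? value : !value;
}

/* Brute-force stand-in for a SAT call under assumptions: the assumption
   literals are fixed for this call only and leave the clause set untouched. */
bool solve(const solver_t *s, const int *assumptions, int num_assumptions) {
    for (unsigned a = 0; a < (1u << s->num_vars); a++) {
        bool ok = true;
        for (int i = 0; ok && i < num_assumptions; i++)
            ok = lit_true(assumptions[i], a);
        for (int c = 0; ok && c < s->num_clauses; c++) {
            bool clause_ok = false;
            for (int i = 0; i < s->clauses[c].size; i++)
                clause_ok = clause_ok || lit_true(s->clauses[c].lits[i], a);
            ok = clause_ok;
        }
        if (ok) return true;
    }
    return false;
}

/* Usage with variables b0 = 1, b1 = 2 and activation literals a0 = 3, a1 = 4.
   The clause (b0 or a0) is active while "not a0" is assumed and is switched
   off in the second call by assuming a0. */
bool example(void) {
    solver_t s;
    solver_init(&s, 4);
    add_clause(&s, (const int[]){-1}, 1);              /* not b0           */
    add_clause(&s, (const int[]){1, 3}, 2);            /* b0 or a0         */
    bool first = solve(&s, (const int[]){-3}, 1);      /* assume not a0    */
    add_clause(&s, (const int[]){1, 2}, 2);            /* b0 or b1         */
    add_clause(&s, (const int[]){-1, -2}, 2);          /* not b0 or not b1 */
    add_clause(&s, (const int[]){1, 2, 4}, 3);         /* b0 or b1 or a1   */
    bool second = solve(&s, (const int[]){3, -4}, 2);  /* assume a0, not a1 */
    return !first && second;
}
```

In example, the first call is unsatisfiable and the second is satisfiable, although no clause was removed in between. A real incremental solver additionally keeps its learnt clauses across the two calls, which is where the performance benefit comes from.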

Incremental BMC
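Following the construction in [22] for finite state machines, incremental BMC can be formulated as a sequence of SAT problems Φ(k) that we need to solve:
$$\begin{array}{lcll} \Phi(0) & := & \phi(s_0) \wedge (\Psi(0) \vee \alpha_0) & \text{with assumption } \neg\alpha_0 \\ \Phi(k+1) & := & \Phi(k) \wedge T(s_k, i_k, s_{k+1}) \wedge \alpha_k \wedge (\Psi(k+1) \vee \alpha_{k+1}) & \text{with assumption } \neg\alpha_{k+1} \end{array} \qquad (4)$$
where $\Psi(k)$ is the disjunction $\bigvee_{0 \le j \le k} \psi(s_j)$ of error states ψ to be proved unreachable up to iteration k. This disjunction means that the verification fails if at least one of the error states is reachable. Since the set of disjuncts $\psi(s_j)$ grows in each iteration, our problem is not monotonic: one has to remove Ψ(k) when adding Ψ(k + 1) because Ψ(k) subsumes Ψ(k + 1), and thus simply conjoining the Ψ(k)s would not yield the desired formula.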
Here, solving under assumptions comes to the rescue. In iteration k, $\alpha_k$ is assumed to be false, whereas it is assumed true for iterations k′ > k. This has the effect that in iteration k′ the formula $(\Psi(k) \vee \alpha_k)$ becomes trivially satisfied. Hence, it does not contribute to the (un)satisfiability of Φ(k′), which emulates its deletion.
Symbolic execution. For software, however, (4) results in large formulae and would be highly inefficient for the purpose of BMC. In practice, software model checkers use symbolic execution in order to exploit, for example, constant propagation and the pruning of branches when conditionals are infeasible, while generating the SAT formula, thus reducing its size. This means that the formulae describing T and Ψ in (4) are actually dependent on k. Fortunately, this does not affect the correctness of the above formula construction and we can replace T by $T_k$ in (4) and ψ by $\psi_k$ in the definition of Ψ(k).
Slicing. Another feature used by state-of-the-art software model checkers is slicing. The purpose of slicing is, again, to reduce the size of the SAT formula by removing (or better: not generating) those parts of the formula that have no influence on its satisfiability. There are many techniques for implementing slicing with the desired trade-off between runtime efficiency and formula-pruning effectiveness [52].
Slicing is performed relative to Ψ(k). We said that the number of disjuncts $\psi_j$ in Ψ grows monotonically with k. We now show that, assuming that our slicing operator is monotonic, we obtain a monotonic formula construction. The transition relation for each time frame $T_k$ is a conjunction $\bigwedge_{\tau \in M} \tau$ of subrelations τ (e.g., formulae corresponding to program instructions). The slicing operator slice selects a subset of M. The operator slice is monotonic iff $M \subseteq M' \Longrightarrow \mathit{slice}(M) \subseteq \mathit{slice}(M')$. We can then view the conjunction of transition relations for k time frames $T(k) = \bigwedge_{0 \le j \le k} T_j$ as $\bigwedge_{\tau \in M_k} \tau$. Then a slice is
$$T^{\mathit{sliced}}(k) = \bigwedge_{\tau \in M'_k} \tau \quad \text{with } M'_k = \mathit{slice}(M_k).$$
An incremental slice is then defined as the difference between $T^{\mathit{sliced}}(k+1)$ and $T^{\mathit{sliced}}(k)$:
$$T^{\mathit{sliced}}_{k+1} = \bigwedge_{\tau \in M'_{k+1} \setminus M'_k} \tau$$
Monotonicity of formula construction follows from $M'_{k+1} \subseteq M_{k+1}$ and the assumed monotonicity $M'_k \subseteq M'_{k+1}$ of the slicing operator. We can thus replace T by $T^{\mathit{sliced}}_k$ in (4). Mind that $T^{\mathit{sliced}}_k$ also contains subrelations τ for time steps k′ < k.
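To make the notion of an incremental slice concrete, the following sketch uses a simple cone-of-influence slicer over SSA-style assignments. This is our own choice for illustration; the text does not fix a particular slicing technique, and the names (subrel_t, slice, incremental_slice, example_m) and the two-frame example are hypothetical. Variables are bits of a mask, and a slice is returned as a bit set over the indices of the subrelations.

```c
#include <assert.h>
#include <stdint.h>

/* One subrelation: an SSA-style assignment "def := f(uses)".
   Variables are encoded as bits of a 32-bit mask. */
typedef struct { uint32_t def; uint32_t uses; } subrel_t;

enum { X0 = 1, Y0 = 2, E0 = 4, X1 = 8, Y1 = 16, E1 = 32 };

/* Hypothetical two-frame example; E0 and E1 are the error flags of the frames. */
const subrel_t example_m[5] = {
    { E0, X0 }, /* tau0 (frame 0): e0 := f(x0) */
    { Y0, X0 }, /* tau1 (frame 0): y0 := g(x0) */
    { X1, Y0 }, /* tau2 (frame 1): x1 := h(y0) */
    { E1, X1 }, /* tau3 (frame 1): e1 := f(x1) */
    { Y1, X1 }, /* tau4 (frame 1): y1 := g(x1) */
};

/* slice(M): bit t of the result is set iff m[t] lies in the cone of
   influence of `targets`, computed as a backward fixed point. */
uint32_t slice(const subrel_t *m, int n, uint32_t targets) {
    uint32_t needed = targets, selected = 0;
    int changed = 1;
    assert(n >= 0 && n <= 32);
    while (changed) {
        changed = 0;
        for (int t = 0; t < n; t++) {
            uint32_t bit = UINT32_C(1) << t;
            if (!(selected & bit) && (m[t].def & needed)) {
                selected |= bit;
                needed |= m[t].uses;
                changed = 1;
            }
        }
    }
    return selected;
}

/* Incremental slice: subrelations selected for iteration k+1 that were
   not already selected for iteration k. */
uint32_t incremental_slice(const subrel_t *m,
                           int n_k, uint32_t targets_k,
                           int n_k1, uint32_t targets_k1) {
    return slice(m, n_k1, targets_k1) & ~slice(m, n_k, targets_k);
}
```

For the example, the slice for frame 0 with target e0 selects only tau0, the slice for both frames with targets e0 and e1 selects tau0 to tau3, and the incremental slice is {tau1, tau2, tau3}. Note that tau1 belongs to frame 0 but only becomes relevant once e1 is a target, which is the situation described in the last sentence above.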

Incremental Refinements
Incremental SAT solving is also used for incremental refinements of the transition relation T, for example for bitvectors and arrays.
Bitvector Refinement. The purpose of bitvector refinement [10,2,31,11,9,19] is to reduce the size of formulae encoding bitvector operations. This is especially important for arithmetic operations that generate huge SAT formulae, e.g. multiplication, division and remainder operations, both for integer and floating-point variables [6]. Bitvector refinement is based on successive under- and overapproximations. For instance, underapproximations can be obtained by fixing a certain number of bits, whereas overapproximations leave a certain number of bits unconstrained. If an underapproximation is satisfiable (SAT) or an overapproximation is unsatisfiable (UNSAT), we know that the non-approximated formula is SAT or UNSAT, respectively. Otherwise, the number of fixed or unconstrained bits, respectively, is reduced until the non-approximated formula itself is checked.
Arrays. To handle programs with arrays, Ackermann expansion is necessary to ensure the functional consistency property of arrays: $\forall i, j:\; i = j \Longrightarrow A[i] = A[j]$. However, adding a quadratic number of constraints (in the size of the array A) is extremely costly. Experience has shown that only a small number of these constraints is actually used [47].
Hence, it is more efficient to first try to solve the SAT formula without these constraints, which is an over-approximation. If we get an UNSAT result (a), we know that the formula with the Ackermann constraints would be UNSAT too. In case of a SAT result (b), we check the consistency of the obtained model: if it turns out not to violate consistency, then we know that we have found a real bug. Otherwise (c), we add the violated Ackermann constraint to the formula.
The formula construction is trivially monotonic and we can use incremental SAT solving. We repeat the procedure until we hit case (a) or (b), which is guaranteed to happen. Some SMT solvers, like Boolector, implement a similar procedure to decide the SMT-LIB array theory [7,8].
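The three cases (a), (b) and (c) of the array refinement loop can be sketched as follows. This is an illustration under strong simplifying assumptions, not Cbmc's implementation: there are only two array reads A[i] and A[j], hence a single consistency constraint; the SAT solver is replaced by brute-force enumeration over a small domain; and all names (model_t, solve, check_lazily and the two example formulae) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define DOMAIN 4 /* indices and values range over 0..DOMAIN-1 */

/* A model of the abstracted formula: the reads A[i] and A[j] have been
   replaced by the fresh variables ri and rj. */
typedef struct { int i, j, ri, rj; } model_t;
typedef bool (*formula_fn)(const model_t *);
typedef enum { RESULT_UNSAT, RESULT_SAT } result_t;

/* Functional consistency for the two reads: i = j implies A[i] = A[j]. */
static bool consistent(const model_t *m) {
    return m->i != m->j || m->ri == m->rj;
}

/* Brute-force stand-in for one SAT call on the current formula. */
static bool solve(formula_fn f, bool with_constraint, model_t *out) {
    for (int i = 0; i < DOMAIN; i++)
        for (int j = 0; j < DOMAIN; j++)
            for (int ri = 0; ri < DOMAIN; ri++)
                for (int rj = 0; rj < DOMAIN; rj++) {
                    model_t m = { i, j, ri, rj };
                    if (f(&m) && (!with_constraint || consistent(&m))) {
                        *out = m;
                        return true;
                    }
                }
    return false;
}

/* Refinement loop: start without the consistency constraint and add it
   only if a model violates it. */
result_t check_lazily(formula_fn f, model_t *out, int *solver_calls) {
    bool with_constraint = false;
    *solver_calls = 0;
    for (;;) {
        (*solver_calls)++;
        if (!solve(f, with_constraint, out)) return RESULT_UNSAT; /* case (a) */
        if (consistent(out)) return RESULT_SAT;                   /* case (b) */
        assert(!with_constraint); /* a constrained model is always consistent */
        with_constraint = true;                                   /* case (c) */
    }
}

/* Two example formulae over the abstracted reads. */
bool same_index_different_value(const model_t *m) {
    return m->i == m->j && m->ri != m->rj;
}
bool different_value(const model_t *m) {
    return m->ri != m->rj;
}
```

For same_index_different_value, the first call returns a spurious model, the constraint is added, and the second call is UNSAT. For different_value, the first model is also spurious, and the second call returns a genuine model with i ≠ j. With many reads, only the violated instances would be added one by one, so the formula grows monotonically.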
Formula construction. Applying the above refinements inside an incremental Bounded Model Checker requires using several incremental formula encodings for (in general, non-monotonic) refinements simultaneously. These refinements are global over all unwindings, so that in iteration k we have to further refine transition relations $T_{k'}$ from earlier iterations k′ < k. We can formalise the incremental formula construction as follows, for iteration k ≥ 0 of incremental BMC and the ℓ-th refinement, with assumption $\neg\alpha_0$: [...] Here, ℓ is incremented in each iteration of the refinement loop until convergence, whereas k is incremented when considering the next time frame.

Experimental Evaluation
We present the results of our experimental evaluation of incremental BMC and incremental k-induction on industrial programs, mainly of automotive origin. The experiments for this study were performed on a 3.5 GHz Intel Xeon machine with 8 cores and 32 GB of physical memory running Windows 7 with a time limit of 3,600 seconds.
Our study of incremental BMC is targeted at embedded software since it takes advantage of its specific properties (one big unbounded loop, whereas other loops are bounded).However, incremental BMC can also be applied to programs where loops and control structures are more irregular; to this end, we report on a preliminary study of incremental BMC on programs with multiple loops.These latter experiments were performed on a 3 GHz Intel Xeon with 8 cores and 50 GB of physical memory running Fedora 20 with a time limit of 3,600 seconds.

Implementation
Cbmc. We have implemented our extension for incremental BMC in the Bounded Model Checker for ANSI-C programs Cbmc [17] using the SAT solver MiniSAT2 [21] as a backend solver.
Cbmc is called in incremental mode using the command line cbmc file.c --no-unwinding-assertions --incremental. The following options can be added to enable specific features of Cbmc:
- --no-sat-preprocessor: turns off SAT formula preprocessing, i.e. the MiniSAT2 simplifier is not used.
- --slice-formula: slices the SAT formula.
- --unwind-max k: limits the loop unwindings of the loop to be checked incrementally to k unwindings. Without this option, Cbmc will not terminate for unsatisfiable instances, i.e. bug-free programs with unbounded loops.
More information regarding the usage of incremental Cbmc can be found on the CPROVER wiki page.
Integration with an industrial-strength embedded verification tool. In the integration of Cbmc with BTC EmbeddedTester and EmbeddedValidator, a master routine selects the next verification/test goal to be analysed starting from instrumented C code. After some preprocessing, like source-level slicing and internal-loop unwinding, the resulting reachability task is given to Cbmc. If Cbmc is able to solve the problem within the user-defined time limit, the result, i.e. information of bounded or unbounded unreachability, or a test vector or counterexample in case of reachability, is reported back to the master process. Otherwise, i.e. in case of a timeout, Cbmc is killed but information about the solved unwindings of the reactive main loop is given back, which frequently is a useful result for the user.
To prove unreachability of verification/test goals (properties), split-case k-induction is performed (see Section 2.3). For this purpose, BTC EmbeddedTester generates two source files: one contains the base case, which is a normal BMC problem with the property given as an assertion (cf. Eq. (3) (BC)); the file for the step case havocs the variables modified in the loop, and the invariant property is assumed at the beginning of the loop and asserted at the end of the loop (cf. Eq. (3) (SC)). To check the step case, we require a reversed termination behaviour of Cbmc (option --stop-when-unsat), i.e. it continues unwinding as long as the problem is SAT and stops as soon as it is UNSAT.
Implementation of Incremental BMC for General Sequential Programs. Incremental Cbmc can also be used for programs with multiple loops. For these programs, Cbmc incrementally unwinds the loops one at a time. For a given loop, Cbmc will unwind the loop until it is fully unwound or until a maximum depth k is reached (given by option --unwind-max k). After a loop has been unwound, Cbmc continues to the next loop. This procedure is repeated until all loops have been unwound or a bug has been found. Recursive function calls are treated similarly.

Incremental BMC for Embedded Software
We report results on industrial programs for the integration of Cbmc with BTC EmbeddedTester and EmbeddedValidator. For these experiments, we used 60 benchmarks that originated mainly from automotive applications. Half of the benchmarks are bug-free (UNSAT instances), half contain a bug (SAT instances). This benchmark suite is an indicator for the performance of model checking tools in practice as it covers a representative spectrum of embedded software. A summary of the benchmark characteristics is listed in Table 1; for a full list we refer to Table 2 in the Appendix.
Besides the number of lines of code, we give the number of conditional operators, multiplications and divisions or remainder operations, which are a good indicator for the difficulty of the benchmark, because they generate large formulae; recall that for each "/" occurring in the program, Cbmc has to generate a divider circuit. The surprisingly high number of conditional operators in most of the benchmarks is due to the preprocessing of conditional assignments by BTC EmbeddedTester and hints at the amount of branching in these benchmarks. Moreover, we list the number of input and state variables, and the variables introduced by the observer instrumentation.
These benchmarks have the property of having only one unbounded loop. For these benchmarks, Cbmc is called in incremental mode by using the option --incremental-check c::main.0, where c::main.0 is the loop identifier of the unbounded loop to be unwound and checked incrementally. The loop identifiers can be obtained using the option --show-loops.
Runtimes. We compared the incremental (i) with the non-incremental (ni) approach and evaluated the impact of slicing (s), SAT preprocessing (p) and bitvector refinement (r). The incremental and non-incremental approaches were compared by activating none of the three techniques, with slicing only (+s), with slicing and preprocessing (+s+p), and with all three options activated (+s+p+r).
The maximum number of loop unwindings was fixed to 10 for the UNSAT instances in order to balance a significant exploration depth with reasonable analysis runtimes. For SAT instances, a maximum number of loop unwindings was not fixed since the incremental and non-incremental approaches are bound to terminate when the unwinding depth reaches the depth of the bug. The number of unwindings is listed in the last column in Table 1 (resp. Table 2 in the Appendix).
Fig. 4 shows the comparison between the incremental and non-incremental approaches and the impact of each tool option on their performance. Fig. 4a shows the geometric mean [23] speedup over the instances that were solved by all approaches. We consider (ni+s+p) as the baseline since it was the best non-incremental approach. Each bar shows the geometric mean speedup of an approach when compared to (ni+s+p). For example, (ni) has a speedup of 0.77, i.e. (ni) is slower than (ni+s+p) and takes about 1.3× as long on average. On the other hand, all incremental versions are much faster than the non-incremental versions. For example, (i) is on average over 3.5× faster than (ni+s+p) and (i+s+p) is on average over 5× faster than (ni+s+p). We observe the following effects of the tool options: (i) slicing shows significant benefits overall (also on peak memory consumption); (ii) not using formula preprocessing is a bad idea in general; and (iii) bitvector refinement shows benefits for UNSAT instances, but produces overhead for SAT instances, which deteriorates the overall performance of the tool (see Fig. 7 in the Appendix for more details). Even though the tool options have some positive effects, these are rather minor in comparison to the performance gains from using an incremental approach.
Since the best incremental and non-incremental approaches were obtained with the configuration (+s+p), we will use this configuration for both approaches for the results described in the remainder of the paper.
Fig. 4b shows a scatter plot with the runtimes of the best non-incremental (ni+s+p) and incremental (i+s+p) approaches. Each point in the plot corresponds to an instance, where the x-axis corresponds to the runtime required by the incremental approach and the y-axis corresponds to the runtime required by the non-incremental approach. If an instance is above the diagonal, the incremental approach is faster than the non-incremental approach; otherwise, the non-incremental approach is faster. SAT instances are plotted as crosses, whereas UNSAT instances are plotted as squares. Incremental BMC significantly outperforms non-incremental BMC. For SAT instances, the advantage of incremental BMC is negligible for the easy instances, whereas speedups are around a factor of 10 for the medium and hard instances. For UNSAT instances, speedups are also significant and most instances have a speedup of more than a factor of 5.
Solving vs. overall runtime. Since the non-incremental approach has to reparse source files and preprocess them at each iteration, one might argue that removing this overhead is the main reason for the speedup observed. However, the overhead for parsing files, symbolic execution and slicing, when compared to generating and solving the SAT formula, is similar for the incremental and non-incremental approaches. The incremental approach spends 27% of its time solving the SAT formula (582 out of 2,151 seconds), whereas the non-incremental approach spends 28% of its time (3,317 out of 11,811 seconds). Unsurprisingly, solving the instance for the largest k in the non-incremental approach takes a considerable amount of time (around 24%) when compared to the total time for solving the SAT formulae for iterations 1, ..., k (784 out of 3,317 seconds).
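The percentages quoted above follow directly from the reported totals. The last line is our own back-of-the-envelope ratio, assuming that the totals of both approaches were measured on the same set of instances:

```latex
\begin{align*}
\text{solving share (incremental)}       &= \frac{582}{2151}   \approx 0.27 \\
\text{solving share (non-incremental)}   &= \frac{3317}{11811} \approx 0.28 \\
\text{share of the largest } k           &= \frac{784}{3317}   \approx 0.24 \\
\text{ratio of total and solving times}  &: \quad \frac{11811}{2151} \approx 5.5,
                                            \qquad \frac{3317}{582} \approx 5.7
\end{align*}
```

Under that assumption, total runtime and pure solving time shrink by a similar factor, which is consistent with the observation that front-end overhead is not the main source of the speedup.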
An explanation for these speedups might be the size of the queries issued in both approaches. The average number of clauses per solver call is roughly halved, from 1,367k clauses for the non-incremental approach to 709k clauses for the incremental approach. Similarly, the average number of variables in the incremental approach is less than a third of that in the non-incremental approach (217k versus 746k).
Peak memory consumption. Smaller query sizes also have an effect on peak memory consumption, which is reduced by 30% for UNSAT benchmarks; for SAT benchmarks, however, we observed a 10% increase.
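The shares and ratios quoted above can be recomputed directly from the reported totals. The following sketch uses only the figures stated in the text; the variable names are ours. Note that the ratio of total runtimes computed here is a different aggregate from the geometric mean speedup of Fig. 4a, so the two values need not coincide.

```python
# Totals quoted in the text: seconds, and thousands of clauses/variables.
INC_SOLVE_S, INC_TOTAL_S = 582, 2151          # incremental (i+s+p)
NON_SOLVE_S, NON_TOTAL_S = 3317, 11811        # non-incremental (ni+s+p)
NON_LAST_K_SOLVE_S = 784                      # solving time for the largest k
INC_CLAUSES_K, NON_CLAUSES_K = 709, 1367      # average clauses per solver call
INC_VARS_K, NON_VARS_K = 217, 746             # average variables per solver call

# Share of the overall runtime that is spent in the SAT solver.
solve_share_inc = INC_SOLVE_S / INC_TOTAL_S            # about 0.27
solve_share_non = NON_SOLVE_S / NON_TOTAL_S            # about 0.28

# Share of the non-incremental solving time spent on the last iteration.
last_k_share = NON_LAST_K_SOLVE_S / NON_SOLVE_S        # about 0.24

# Ratios of totals: overall runtime and pure solving time.
overall_ratio = NON_TOTAL_S / INC_TOTAL_S              # about 5.5
solving_ratio = NON_SOLVE_S / INC_SOLVE_S              # about 5.7

# Query sizes of the incremental approach relative to the non-incremental one.
clause_ratio = INC_CLAUSES_K / NON_CLAUSES_K           # about 0.52
variable_ratio = INC_VARS_K / NON_VARS_K               # about 0.29

for name, value in [
    ("solve share (incremental)", solve_share_inc),
    ("solve share (non-incremental)", solve_share_non),
    ("last-k share of solving time", last_k_share),
    ("overall runtime ratio", overall_ratio),
    ("solving time ratio", solving_ratio),
    ("clause ratio", clause_ratio),
    ("variable ratio", variable_ratio),
]:
    print(f"{name}: {value:.2f}")
```

Since the solver share is nearly identical in both approaches, overall runtime and solving time shrink by a similar factor, which is consistent with the observation that the gain does not come from the front end alone.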

Incremental BMC for Code coverage on FuelSys
As reported in the previous section, enabling Cbmc to work incrementally led to substantial performance gains on the benchmark suite consisting of selected single input files. In order to assess whether these improvements have practical impact in the integration of Cbmc with an industrial-strength test-vector generation tool, we compared the performance of BTC EmbeddedTester with the incremental feature of Cbmc disabled and enabled. The time limit per subtask was 10 minutes and the unwinding depth for all internal loops was 50. For unwinding depth 10 of the main loop, the incremental feature reduces the overall runtime from 152.3 to 70.4 minutes, i.e. more than 2× faster, and for unwinding depth 50 from 377.4 to 108.5 minutes, i.e. more than 3× faster.
In the latter case, the rate of MC/DC subproblems solved (i.e. not running into the timeout) increased from 98.4% to 99.2%.

Incremental k-Induction for Embedded Software
To compare the performance of incremental and non-incremental approaches for k-induction, we considered the subset of UNSAT benchmarks for which k-induction required more than 1 iteration. Note that when k-induction requires only 1 iteration, the performance of incremental and non-incremental approaches is similar. This subset consists of 10 UNSAT benchmarks (see Table 2 in the Appendix for more details). Fig. 5 shows a scatter plot with the runtimes of incremental and non-incremental k-induction using the tool options (+s+p). Instances that correspond to the base case are plotted as crosses, whereas instances that correspond to the step case are plotted as squares. The runtimes for both incremental and non-incremental checking are relatively small. This is due to the small number of iterations required by k-induction to prove the unreachability of the properties in these benchmarks (between 2 and 4 iterations with an average of 2.4 per instance).
Incremental checking is always faster than non-incremental checking. In terms of the geometric mean speedup, incremental checking is around 2× faster than non-incremental checking, on both base and step cases.
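The interplay of base case and step case can be illustrated on a tiny explicit-state system. The sketch below is not how Cbmc implements k-induction (which encodes both cases as SAT formulae); it enumerates paths by brute force, recomputes them from scratch for each k, and all names and the example system are hypothetical. It assumes the common formulation in which the base case checks the first k states reachable from the initial states, and the step case checks that any k consecutive property-satisfying states are followed by a property-satisfying state.

```python
def paths(starts, succ, transitions):
    """Return all paths with the given number of transitions from `starts`."""
    frontier = [[s] for s in starts]
    for _ in range(transitions):
        frontier = [p + [t] for p in frontier for t in succ[p[-1]]]
    return frontier


def base_case(init, succ, prop, k):
    """True if the property holds on the first k states of every run."""
    return all(prop(s) for p in paths(init, succ, k - 1) for s in p)


def step_case(states, succ, prop, k):
    """True if k consecutive good states are always followed by a good state."""
    return all(
        prop(p[-1])
        for p in paths(states, succ, k)
        if all(prop(s) for s in p[:-1])
    )


def k_induction(states, init, succ, prop, max_k):
    """Try k = 1, 2, ... until the property is proved or refuted."""
    for k in range(1, max_k + 1):
        if not base_case(init, succ, prop, k):
            return ("violated", k)
        if step_case(states, succ, prop, k):
            return ("proved", k)
    return ("unknown", max_k)


# Hypothetical system: 0 and 1 form the reachable cycle; state 2 is
# unreachable, satisfies the property, and steps into the bad state 3.
STATES = [0, 1, 2, 3]
INIT = [0]
SUCC = {0: [1], 1: [0], 2: [3], 3: [3]}

result = k_induction(STATES, INIT, SUCC, lambda s: s != 3, max_k=5)
print(result)  # ('proved', 2): the step case fails for k=1 because of 2 -> 3
```

In this example the proof needs two iterations: for k=1 the step case is refuted by the unreachable transition from 2 to 3, whereas for k=2 no property-satisfying state precedes 2, so the step case holds. An incremental implementation would keep the solver state between such iterations instead of rebuilding it.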

Incremental BMC for Programs with Multiple Loops
Incremental BMC is not restricted to programs with a single unbounded loop and may also be applied to programs with multiple unbounded loops. To evaluate the performance of incremental BMC on this kind of program, we compared the performance of incremental and non-incremental approaches on the 62 benchmarks from the SystemC category of the Software Verification Competition benchmark set, because these benchmarks, which were derived from SystemC models [15], contain many loops. Of these, 25 benchmarks are bug-free (UNSAT instances) and 37 contain a bug (SAT instances). These benchmarks have between 2 and 19 loops with an average of 10.3 loops per instance. For SAT instances, the depth of the bug ranges from 1 to 5 with an average depth of 2.5. For more details regarding these benchmarks see Table 3 in the Appendix.
We fixed the maximum number of loop unwindings to 10 for both SAT and UNSAT instances. Note that this unwind depth is larger than the depth of the bugs for the SAT instances. Formula slicing is not yet fully supported in incremental Cbmc for programs with multiple loops. Therefore, the incremental approach was run with the tool options (+p), whereas the non-incremental approach was run with the tool options (+s+p).
Fig. 6 shows a scatter plot with the runtimes of the incremental and non-incremental approaches. For the majority of the instances, the incremental approach outperforms the non-incremental approach, and for many SAT and UNSAT instances the speedup is larger than a factor of 10. However, there are a few instances for which the non-incremental approach performs better. The non-incremental approach unwinds all loops up to a fixed unwind depth, whereas the incremental approach fully unwinds one loop before continuing to the next loop. For some instances, fully unwinding each loop may result in the generation of larger formulae, particularly for SAT instances. Not using slicing for the incremental approach may also result in larger formulae. The increase in formula size may explain the observed slowdown for some instances.
Overall, when considering instances solved by both approaches, the incremental approach is faster than the non-incremental approach and the geometric mean speedup is larger than a factor of 3.

Related Work
Most closely related is recent work on the prototype tool nbis [26], which implements incremental BMC using SMT solvers. The authors show the advantages of incremental software BMC. However, they do not consider industrial embedded software and have evaluated their tool only on small benchmarks that are very easy for both incremental and non-incremental approaches (runtimes <1s).
Bit-precise formal verification techniques are indispensable for embedded system models and implementations that have low-level (i.e. C language) semantics, such as discrete-time Simulink models. The importance of this topic has recently attracted attention, as shown by publications on verification using SMT solving [32,43], test case generation [45], symbolic analysis for improving simulation coverage [1], and directed random testing [48]. Yet, none of these works has exploited incremental BMC.
The test vector generation tool FShell [33] uses incremental SAT solving to check the reachability of a set of test goals.However, it assumes a fixed unwinding of the loops.There is no reason why incremental BMC should not boost its performance when increasing loop unwindings need to be considered.Test vector generation tools like Klee [13] use incremental SAT solving to extend the paths to be explored.However, they consider only single paths at a time, whereas BMC explores all paths simultaneously.
Incremental SAT solving has important applications in other verification techniques like the IC3 algorithm [4,20], and incremental BMC is standard for hardware verification [37,54]. We show that the speedups of incremental SAT solving reported in [22] regarding k-induction on small hardware circuits carry over to industrial embedded software.

Conclusions
We claim that incremental BMC is an indispensable technique for embedded software verification that should be considered state-of-the-art in such tools.To underpin this claim, we report on the successful integration of our incremental extension of Cbmc into an industrial embedded software verification tool.Our experiments demonstrate one-order-of-magnitude speedups from incremental approaches on industrial embedded software benchmarks for BMC and k-induction.These performance gains result in faster property verification and higher test coverage, and thus, a productivity increase in embedded software verification.
Moreover, we reported significant speedups for programs with multiple loops, which show the applicability of incremental BMC beyond embedded software. We expect that incremental BMC can be further improved for programs with multiple loops by simultaneously unwinding all loops incrementally instead of fully unwinding one loop at a time. Incremental k-induction for programs with multiple loops will also benefit from such an improvement.

Table 1: Benchmark characteristics from industrial programs

Table 2: Embedded software benchmark characteristics (name of the benchmark and application domain, lines of code, number of operators (cond(a?b:c), mul(*), div/rem(/,%)), number of boolean/integer/floating point input and state variables, number of boolean variables introduced by the observer instrumentation, number of loop unwindings considered; k-induction was performed on the instances marked with *)

Table 3: SystemC benchmark characteristics (name of the benchmark, lines of code, number of bounded loops, number of unbounded loops, and number of loop unwindings considered)