CoVeriTest with Adaptive Time Scheduling (Competition Contribution)

CoVeriTest, which is integrated in the analysis framework CPAchecker, adopts verification technology for test-case generation. It encodes individual test goals as reachability queries, which are then processed by verifiers. To increase the effectiveness on a broad class of testing tasks, CoVeriTest leverages the strengths of two different analyses: an explicit value analysis and predicate abstraction. As in Test-Comp'20, the two analyses are interleaved and the duration of each interleaving segment is calculated dynamically. However, the calculation now relies on the predicted future performance instead of the past performance, thus rewarding analyses that are likely to cover open test goals.


Test-Generation Approach
Generating test cases for a diverse set of tasks such as those in Test-Comp is challenging and often cannot be performed effectively by a single approach. Therefore, cooperative approaches that combine the strengths of multiple test-case generators frequently show superior performance, as long as they do not spend too much time in unproductive test-case generators. To avoid unproductive test-case generation, we equip our CoVeriTest submission with a novel learning-based scheduler that considers the expected productiveness of a test-case generator.
CoVeriTest is a hybrid approach based on the concept of cooperative, verification-based testing [5], which combines complementary verifiers. In our current instantiation, we iteratively run two verification algorithms, namely value analysis [4] and predicate analysis [3]. In each iteration, the analyses continue their exploration until they hit their time limit. The time limit of an analysis is computed dynamically at the beginning of each iteration round using our novel learning-based time scheduler. We encode the open test goals, which are shared between the analyses, as unreachability queries and let the analyses try to prove the unreachability of those goals. A reported counterexample proves the reachability of a test goal. Therefore, the counterexample is converted into a test [1] and the test goal is removed from the set of open test goals.
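The iteration scheme described above can be sketched as follows. This is a minimal illustration, not the CPAchecker implementation: the `Analysis` and `Scheduler` callables, the string representation of goals and tests, and the `max_rounds` cut-off are assumptions made for the example. In particular, the sketch restarts each analysis per round, whereas the text says the analyses continue their exploration.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Hypothetical interface: an analysis receives the open goals and its time
# limit in seconds and returns (goal, test) pairs for reported counterexamples.
Analysis = Callable[[Set[str], float], List[Tuple[str, str]]]
# Hypothetical interface: a scheduler maps the open goals to per-analysis limits.
Scheduler = Callable[[Set[str]], Dict[str, float]]


def cooperative_test_generation(
    analyses: Dict[str, Analysis],
    scheduler: Scheduler,
    goals: Iterable[str],
    max_rounds: int = 10,
) -> Tuple[List[str], Set[str]]:
    """Interleave the analyses until all goals are covered or rounds run out."""
    open_goals = set(goals)
    tests: List[str] = []
    for _ in range(max_rounds):
        if not open_goals:
            break
        # Time limits are recomputed at the beginning of each iteration round.
        limits = scheduler(open_goals)
        for name, analysis in analyses.items():
            for goal, test in analysis(set(open_goals), limits[name]):
                if goal in open_goals:
                    # A counterexample proves reachability: keep the test and
                    # remove the goal from the shared set of open goals.
                    tests.append(test)
                    open_goals.discard(goal)
    return tests, open_goals
```

Because the set of open goals is shared, a goal covered by one analysis is no longer handed to the other analysis in the same round.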
Time Scheduling. Our time scheduler limits the time per iteration round to 100 s (footnote 3) and distributes the 100 s based on the expected contribution of the individual analyses. The idea is that an analysis gets more time if there exist more paths to open test goals that the analysis is expected to handle well. Figure 1 shows the integration of our time scheduler into the CoVeriTest workflow. First, the scheduler samples a set of syntactical counterexample paths ρ, each of which starts at the beginning of the program and ends in an open test goal. Then, it estimates for each path ρ the probability P(V_i | ρ) that analysis i detects ρ as a real counterexample (footnote 4). We estimate the probability P(V_i | ρ) using a unigram language model [9] in combination with the approach of Richter et al. [10] for the abstraction of the syntactical paths ρ. Finally, the scheduler assigns a time budget to analysis i in proportion to the average probability E_{ρ∈T}[P(V_i | ρ)] of detecting a counterexample path on a testing task T (program plus open test goals).
Learning Probability Distribution. The probability distribution P(V_i | ρ) is unknown. Thus, we aim to learn the distribution. To this end, we executed the value and predicate analysis separately on the Test-Comp'20 category coverage-branches and used the reported counterexamples, which are obviously counterexamples that can be decided by the reporting analysis, to pre-train our unigram language model [9]. At the beginning of each CoVeriTest execution, we load the pre-trained model and use the reported counterexamples to improve it during execution. When the sampled paths are indecisive, E_{ρ∈T}[P(V_i | ρ)] becomes the normalized progress used in the Test-Comp'20 strategy [8]. The normalized progress describes the relative contribution of an analysis to the goals covered in the last iteration.
Footnote 3: We choose the same iteration time limit as in Test-Comp'20 [8], which has been established by extensive evaluation of CoVeriTest [5].
Footnote 4: Note that it is not important that ρ is a real counterexample. Rather than deciding whether ρ is a counterexample, we model the probability that analysis i can decide whether ρ is a counterexample.
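The proportional distribution of the 100 s can be illustrated with a small sketch. Several parts are assumptions made for the example and are not taken from the paper: paths are already abstracted into token sequences, the per-path score is the geometric mean of add-one-smoothed unigram token probabilities, "indecisive" is taken to mean that no paths were sampled, and the names `UnigramModel` and `time_budgets` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence

ROUND_LIMIT_S = 100.0  # time per iteration round, as stated in the text


class UnigramModel:
    """Token counts learned from counterexamples reported by one analysis."""

    def __init__(self, vocabulary_size: int) -> None:
        self.counts: Counter = Counter()
        self.total = 0
        self.vocabulary_size = vocabulary_size

    def update(self, path: Sequence[str]) -> None:
        """Add a reported counterexample path (pre-training or during a run)."""
        self.counts.update(path)
        self.total += len(path)

    def score(self, path: Sequence[str]) -> float:
        """Assumed stand-in for P(V_i | path): geometric mean of the
        add-one-smoothed token probabilities, so path length does not dominate."""
        if not path:
            return 0.0
        denominator = self.total + self.vocabulary_size
        log_sum = sum(math.log((self.counts[tok] + 1) / denominator) for tok in path)
        return math.exp(log_sum / len(path))


def time_budgets(
    models: Dict[str, UnigramModel],
    paths: List[Sequence[str]],
    normalized_progress: Dict[str, float],
) -> Dict[str, float]:
    """Split the round limit in proportion to the average score per analysis."""
    if paths:
        expected = {
            name: sum(model.score(p) for p in paths) / len(paths)
            for name, model in models.items()
        }
    else:
        # Assumed notion of "indecisive": fall back to the normalized progress.
        expected = dict(normalized_progress)
    total = sum(expected.values())
    if total == 0:
        # Nothing to distinguish the analyses: split the round evenly.
        return {name: ROUND_LIMIT_S / len(expected) for name in expected}
    return {name: ROUND_LIMIT_S * value / total for name, value in expected.items()}
```

With two models trained on disjoint tokens, a sampled path that consists only of tokens seen by the value analysis shifts most of the round to the value analysis, which mirrors the intent described above.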

Tool Architecture
CoVeriTest is an extension of the software analysis framework CPAchecker [2] (version 2.0) and is written in Java. For parsing, we use the Eclipse CDT parser. For test-case generation, we rely on two instances of CPAchecker's test-case generation algorithm, which extracts test cases from counterexamples [1]. One instance generates test cases based on CPAchecker's value analysis [4] and the other instance uses CPAchecker's predicate analysis [3]. Both analyses apply counterexample-guided abstraction refinement [7] and use the SMT solver MathSAT5 [6]. We interleave the two instances and determine their time slices based on their expected success on the set of open test goals. To determine the time slices, we added the adaptive scheduler described in the previous section.

Strengths and Weaknesses
The main difference between the CoVeriTest versions in Test-Comp'20 and Test-Comp'21 is the distribution of the 100 s per round. Our own experiments with the Test-Comp 2020 benchmark set revealed a small advantage for our new distribution with respect to the coverage-branches category. Comparing the competition results against a CoVeriTest configuration using the time distribution from Test-Comp'20 shows that the new distribution performs slightly worse in the coverage-error category. In total, 13 errors are missed, 8 of them in the subcategory Floats. Overall, an advantage of the new distribution is scarcely noticeable on the Test-Comp 2021 benchmark set. The unigram language model does not generalize well.
Since the underlying analyses remain the same, CoVeriTest still generates a small number of test cases. Also, the problems with tasks using large arrays and the subcategories BusyBox-Memsafety and SQLite-Memsafety remain. Additionally, CoVeriTest performs poorly on the new ntdrivers tasks and the new subcategory Combinations. While finding the error in the new nla-digbench tasks is difficult, covering branches works well for these tasks. Moreover, CoVeriTest deals well with the new category XCSP and the remaining new tasks.
To run CoVeriTest on program program.i, one requires a Java 11 runtime environment and must execute the following command line:
scripts/cpa.sh -testcomp21 -setprop log.consoleLevel=SEVERE -stats -benchmark -heap 10000m -spec property.prp program.i
Note that property.prp is a placeholder for the test specification (coverage-error-call.prp or coverage-branches.prp). Tests are generated for programs assuming a 32-bit environment. To support 64-bit environments, one needs to add the configuration option -64. The generated tests are written to the folder output/test-suite and adhere to the XML format demanded by the Test-Comp rules. Additionally, the folder contains the mandatory metadata file.
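For scripted runs, the command line above can be assembled programmatically. The following is a minimal sketch: the helper name `coveritest_command` is hypothetical, and the position of the -64 option within the argument list is an assumption, since the text only states that the option must be added.

```python
from typing import List


def coveritest_command(program: str, spec: str, use_64_bit: bool = False) -> List[str]:
    """Build the argument list for the command line given in the text."""
    command = [
        "scripts/cpa.sh",
        "-testcomp21",
        "-setprop", "log.consoleLevel=SEVERE",
        "-stats",
        "-benchmark",
        "-heap", "10000m",
        "-spec", spec,
    ]
    if use_64_bit:
        # Assumption: the option is placed before the program argument.
        command.append("-64")
    command.append(program)
    return command


# Usage (not executed here): run from the CPAchecker directory, e.g.
#   import subprocess
#   subprocess.run(coveritest_command("program.i", "coverage-branches.prp"), check=True)
```

Passing the arguments as a list avoids shell quoting issues when program paths contain spaces.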

Project and Contributors
CoVeriTest is an extension of the CPAchecker project and is developed as a joint, open-source project between research groups of Paderborn University and TU Darmstadt. Contributors are Marie-Christine Jakobs and Cedric Richter.
We would also like to thank all developers of CPAchecker.