
1 Validation Approach

There are multiple violation witness validators in the ReachSafety category of SV-COMP that are based on test harness generation [3]. However, none take part in the category for concurrent programs, presumably due to the increased complexity in orchestrating the different thread interleavings prescribed by the witness files. ConcurrentWitness2Test aims to fill this gap, by providing an enhanced test harness that takes not only the data-nondeterminism into account, but also the nondeterminism caused by concurrency. In this paper we concentrate on solving the latter, as the former is already well documented by the implementing tools [3].

The current witness format for concurrent software defines two edge data fields that we can extract information from [3]:

  • createThread: The unique ID of the new thread that results from the execution of the containing edge

  • threadId: Which thread is currently active when the containing edge is executed. Valid values have at least one createThread entry in the witness automaton that must be executed prior to the current edge
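To make the two fields concrete, the following minimal sketch reads them from a GraphML document using only the Python standard library. It is not the tool's own parser (which relies on networkx); the sample witness, its node names, and the helper `thread_edges` are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by GraphML documents.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical, heavily simplified witness fragment (illustration only).
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph edgedefault="directed">
    <node id="N0"/><node id="N1"/><node id="N2"/><node id="N3"/>
    <edge source="N0" target="N1">
      <data key="createThread">0</data>
    </edge>
    <edge source="N1" target="N2">
      <data key="threadId">0</data>
      <data key="createThread">1</data>
    </edge>
    <edge source="N2" target="N3">
      <data key="threadId">1</data>
    </edge>
  </graph>
</graphml>"""


def thread_edges(graphml_text):
    """Return (source, target, threadId, createThread) for every edge.

    Fields that are absent on an edge are reported as None.
    """
    root = ET.fromstring(graphml_text)
    edges = []
    for edge in root.iterfind(".//g:edge", GRAPHML_NS):
        # Collect the <data key="..."> children of this edge into a dict.
        data = {d.get("key"): d.text for d in edge.iterfind("g:data", GRAPHML_NS)}
        edges.append((edge.get("source"), edge.get("target"),
                      data.get("threadId"), data.get("createThread")))
    return edges


for entry in thread_edges(SAMPLE):
    print(entry)
```

In this sample, every threadId value is introduced by a createThread entry on an earlier edge, matching the validity condition stated above.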

Using these pieces of information, we insert a yield and a release call around each action (as seen in the example in Figure 2, based on the metadata from Figure 1), with the parameter target increasing at every encountered edge. These functions are shown in Figure 3 and Figure 4, respectively. They rely on a shared variable current denoting the next value where the functions need to take effect (to handle revisited locations in the source, e.g., in a loop), alongside a mutex and a condition variable. Locking and unlocking in the figures refer to operations on the mutex variable, while broadcasting and waiting refer to operations on the condition variable.

Fig. 1. Witness

Fig. 2. Source

Fig. 3. yield(target)

Fig. 4. release(target)
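The interplay of yield and release can be approximated with the following Python analogue. The functions inserted by the tool are C functions, so this is only a sketch of the idea: the class `Gate`, the method names, and in particular the exact conditions (wait while current is below target, advance current only when it equals target) are assumptions derived from the description above, not a transcription of Figures 3 and 4.

```python
import threading


class Gate:
    """Hypothetical analogue of the yield/release pair (assumed semantics)."""

    def __init__(self):
        # threading.Condition bundles a mutex with a condition variable.
        self.cond = threading.Condition()
        self.current = 0  # next target value that is allowed to proceed

    def yield_(self, target):
        with self.cond:                      # lock
            while self.current < target:     # not our turn yet
                self.cond.wait()             # wait (releases the lock meanwhile)
        # unlock happens on leaving the with-block

    def release(self, target):
        with self.cond:                      # lock
            if self.current == target:       # ignore revisited locations
                self.current += 1
                self.cond.notify_all()       # broadcast


def run_demo():
    """Force the order a0, b1, a2 across two threads."""
    gate = Gate()
    log = []

    def action(name, target):
        gate.yield_(target)
        log.append(name)       # the instrumented program action
        gate.release(target)

    def thread_a():
        action("a0", 0)
        action("a2", 2)

    def thread_b():
        action("b1", 1)

    # Start b first to show that start order does not decide the interleaving.
    threads = [threading.Thread(target=thread_b), threading.Thread(target=thread_a)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


print(run_demo())
```

Under these assumed semantics, a location that is revisited after its target has already passed (for example inside a loop) neither blocks in yield nor changes current in release.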

One of the main obstacles to overcome is the resolution of the threadId metadata. In our experience, none of the tools produce witnesses that fully specify the interleaving, i.e., the witness does not impose a total order on the actions of the program. While this is acceptable according to the witness format [3], a certain level of nondeterminism might remain in the program after applying the witness. To overcome this problem we rely on statistics: we execute the resulting harness multiple times, and classify the result as always observable, sometimes observable, or never observable. Observability refers to whether the error state is reached, which we test by inspecting the exit code of the program. At SV-COMP’24 we opted to refuse only witnesses with a never observable verdict.
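The classification step can be sketched as below. The function names `classify` and `is_accepted` are hypothetical, and each run is assumed to have already been reduced to a boolean stating whether the error state was observed.

```python
def classify(observations):
    """Map per-run observations (True = error state observed) to a verdict."""
    if not observations:
        raise ValueError("at least one run is required")
    hits = sum(1 for seen in observations if seen)
    if hits == len(observations):
        return "ALWAYS"
    if hits > 0:
        return "SOMETIMES"
    return "NEVER"


def is_accepted(verdict):
    """Acceptance rule described above: only NEVER leads to refusal."""
    return verdict != "NEVER"


print(classify([True, False, True]), is_accepted(classify([False, False])))
```

Tightening the acceptance criterion to always observable would only require changing `is_accepted` to compare against "ALWAYS".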

2 Software Architecture

ConcurrentWitness2Test is a Python project, relying on pycparser (footnote 1) for parsing C files, and networkx (footnote 2) for parsing GraphML-based witnesses. As opposed to the harness-only solutions of other witness-to-test validators [3], ConcurrentWitness2Test also needs to modify the AST of the C file to insert the function calls to yield and release. Therefore, the intermediate output of ConcurrentWitness2Test consists of a patched C file and a separate test harness. We use gcc (footnote 3) to compile these resulting files to an executable. We run this executable at most 100 times, with an option for early termination if c. See Figure 5 for an overview of this workflow.
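A repeated-execution loop of this kind might look like the sketch below. The function `run_repeatedly`, the `error_exit_code` and `timeout` parameters, and the caller-supplied `stop_early` predicate are all assumptions made for illustration; which exit code signals the error state and when the tool stops early are not taken from the tool itself.

```python
import subprocess


def run_repeatedly(cmd, max_runs=100, error_exit_code=1,
                   stop_early=None, timeout=10):
    """Run cmd up to max_runs times; record whether each run hit the error.

    error_exit_code, timeout and stop_early are hypothetical knobs.
    """
    observations = []
    for _ in range(max_runs):
        try:
            proc = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                                  stderr=subprocess.DEVNULL, timeout=timeout)
            observations.append(proc.returncode == error_exit_code)
        except subprocess.TimeoutExpired:
            # A run that hangs is counted as "error not observed".
            observations.append(False)
        # Optional early termination, decided entirely by the caller.
        if stop_early is not None and stop_early(observations):
            break
    return observations
```

For example, `run_repeatedly(["./harness"], stop_early=any)` would stop at the first run in which the error is observed, where `./harness` stands for the compiled executable.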

Fig. 5. Architecture of ConcurrentWitness2Test

3 Discussion of Strengths and Weaknesses of the Approach

As seen in Table 1 (footnote 4), ConcurrentWitness2Test lacks support for some tools’ witnesses. This limitation has since been mostly rectified, but not in time for SV-COMP. The main shortcoming of the competition version of ConcurrentWitness2Test was the handling of cases where edge attributes were given for complex syntactic elements, such as loops: the tool tried to insert the function calls into the heads of loops instead of their bodies. This was an easy fix, and we hope to extend the support for various tools further for next year’s SV-COMP.

Despite these temporary shortcomings, ConcurrentWitness2Test still correctly confirmed 1197 results [2]. In contrast, the validator was wrong only 239 times: 2 witnesses were confirmed and 237 witnesses were refused erroneously (footnote 5). These numbers highlight the strength of our approach.

We also note that ConcurrentWitness2Test confirmed 932 results with only a sometimes observable verdict. This means that multiple tools produce nondeterministic witnesses, where some interleavings lead the execution to an error state, but not all. We suggest that tool developers concentrate on providing better, deterministic witnesses so that their results are always validated. We will aim to constrain our acceptance criteria to always observable in future competitions.

Table 1. Results per supported tool, results for wrong verdicts in parentheses

4 Tool Setup and Configuration

The binary archive available at Zenodo [1] contains all required dependencies in the form of a virtual environment, except for the Python 3 interpreter, which needs to be installed separately (e.g., via the python3 package on Ubuntu 22.04).

The tool can be started either directly via the main.py file, or via the convenience script start.sh. Either way, the tool expects two inputs: an argument providing the (preprocessed) C file, and the witness file with the --witness <file> flag. Upon success, the tool always outputs a single line starting with the string Verdict:, with the verdict SOMETIMES/ALWAYS/NEVER directly afterward. Some handled exceptions also appear as verdicts.
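A wrapper script that consumes this output could extract the verdict as in the following sketch. The helper `parse_verdict` is hypothetical, and the sample output lines are made up; only the Verdict: prefix and the three verdict names come from the description above.

```python
def parse_verdict(output):
    """Return the text following 'Verdict:' on the first matching line.

    Returns None when no such line exists. Whitespace around the verdict
    is stripped, since the exact spacing is not assumed here.
    """
    for line in output.splitlines():
        if line.startswith("Verdict:"):
            return line[len("Verdict:"):].strip()
    return None


# Made-up tool output for illustration.
sample_output = "compiling harness\nVerdict: SOMETIMES\n"
print(parse_verdict(sample_output))
```

Because handled exceptions may also appear as verdicts, a caller should treat any value other than SOMETIMES, ALWAYS, or NEVER as a separate case rather than as a refusal.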

Up-to-date badges on verification tool support can be seen on the main GitHub page (footnote 6). Tool support has been significantly enhanced since the version nominated for the competition, both in preparation for next year’s SV-COMP and for the benefit of tool developers who may want to improve their witnesses in the meantime.

5 Software Project and Data Availability

ConcurrentWitness2Test is a validation tool maintained by the Critical Systems Research Group (footnote 7) of the Budapest University of Technology and Economics. The project is available open-source on GitHub (footnote 8) under an Apache 2.0 license. The version (1.0.0) used in the competition is available at [1].