Skip to main content

Tests from Witnesses

Execution-Based Validation of Verification Results

  • Conference paper
  • First Online:
Tests and Proofs (TAP 2018)

Abstract

The research community made enormous progress in the past years in developing algorithms for verifying software, as shown by international competitions. Unfortunately, the transfer into industrial practice is slow. A reason for this might be that the verification tools do not connect well to the developer work-flow. This paper presents a solution to this problem: We use verification witnesses as interface between verification tools and the testing process that every developer is familiar with. Many modern verification tools report, in case a bug is found, an error path as exchangeable verification witness. Our approach is to synthesize a test from each witness, such that the developer can inspect the verification result using familiar technology, such as debuggers, profilers, and visualization tools. Moreover, this approach identifies the witnesses as an interface between formal verification and testing: Developers can use arbitrary (witness-producing) verification tools, and arbitrary converters from witnesses to tests; we implemented two such converters. We performed a large experimental study to confirm that our proposed solution works well in practice: Out of 18 966 verification results obtained from 21 verifiers, 14 727 results were confirmed by witness-based result validation, and 10 080 of these results were confirmed alone by extracting and executing tests, meaning that the desired specification violation was effectively observed. We thus show that our approach is directly and immediately applicable to verification results produced by software verifiers that adhere to the international standard for verification witnesses.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    https://sv-comp.sosy-lab.org/2017/systems.php

  2. 2.

    It has been shown that model checkers can be effective in constructing useful tests [12].

  3. 3.

    At least 21 verifiers are available that produce witnesses in the exchangeable format (cf. Table 1, which lists the verifiers that we use in our experiments).

  4. 4.

    The example also works for larger data types, but for ease of presentation, we aim to keep the range of values small, so that all calculations can be followed by hand.

  5. 5.

    We choose BenchExec [13] as container solution, because it is also used by SV-COMP.

  6. 6.

    https://github.com/eliben/pycparser

  7. 7.

    https://github.com/sosy-lab/sv-benchmarks/tree/423cf8c

  8. 8.

    We have to restrict the experiments to property ReachSafety because there were no witness validators available for the other properties.

  9. 9.

    There are also two commercial verifiers that produce witnesses, but we cannot use them due to their proprietary license.

  10. 10.

    https://sv-comp.sosy-lab.org/2017/systems.php

  11. 11.

    https://github.com/sosy-lab/sv-comp/tree/svcomp17/benchmark-defs

  12. 12.

    https://www.sosy-lab.org/research/executionbasedwitnessvalidation/

References

  1. Alglave, J., Donaldson, A.F., Kroening, D., Tautschnig, M.: Making software verification tools really work. In: Bultan, T., Hsiung, P.-A. (eds.) Proceedings of ATVA 2011. LNCS, vol. 6996, pp. 28–42. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  2. Andrianov, P., Friedberger, K., Mandrykin, M., Mutilin, V., Volkov, A.: CPA-BAM-BnB: Block-abstraction memoization and region-based memory models for predicate abstractions. In: Legay, A., Margaria, T. (eds.) Proceedings of TACAS 2017. LNCS, vol. 10206, pp. 355–359. Springer, Heidelberg (2017)

    Chapter  Google Scholar 

  3. Artho, C., Havelund, K., Honiden, S.: Visualization of concurrent program executions. In: Belli, F., Chen, A., Lin, H., McMillin, B., Mei, H. (eds.) Proceedings of COMPSAC 2007, pp. 541–546. IEEE (2007)

  4. Beyer, D.: Reliable and reproducible competition results with BenchExec and witnesses (report on SV-COMP 2016). In: Chechik, M., Raskin, J.-F. (eds.) Proceedings of TACAS 2016. LNCS, vol. 9636, pp. 887–904. Springer, Heidelberg (2016)

    Chapter  Google Scholar 

  5. Beyer, D.: Software verification with validation of results. In: Legay, A., Margaria, T. (eds.) Proceedings of TACAS 2017. LNCS, vol. 10206, pp. 331–349. Springer, Heidelberg (2017)

    Chapter  Google Scholar 

  6. Beyer, D., Chlipala, A.J., Henzinger, T.A., Jhala, R., Majumdar, R.: Generating tests from counterexamples. In: Finkelstein, A., Estublier, J., Rosenblum, D.S. (eds.) Proceedings of ICSE 2004, pp. 326–335. IEEE (2004)

  7. Beyer, D., Dangl, M.: Verification-aided debugging: An interactive web-service for exploring error witnesses. In: Chaudhuri, S., Farzan, A. (eds.) Proceedings of CAV 2016. LNCS, vol. 9780, pp. 502–509. Springer, Cham (2016)

    Google Scholar 

  8. Beyer, D., Dangl, M., Dietsch, D., Heizmann, M.: Correctness witnesses: Exchanging verification results between verifiers. In: Zimmermann, T., Cleland-Huang, J., Su, Z., (eds.) Proceedings of FSE 2016, pp. 326–337. ACM (2016)

  9. Beyer, D., Dangl, M., Dietsch, D., Heizmann, M., Stahlbauer, A.: Witness validation and stepwise testification across software verifiers. In: Di Nitto, E., Harman, M., Heymans, P. (eds.) Proceedings of FSE 2015, pp. 721–733. ACM (2015)

  10. Beyer, D., Dangl, M., Wendler, P.: Boosting k-induction with continuously-refined invariants. In: Kroening, D., Păsăreanu, C.S. (eds.) Proceedings of CAV 2015. LNCS, vol. 9206, pp. 622–640. Springer, Cham (2015)

    Chapter  Google Scholar 

  11. Beyer, D., Keremoglu, M.E.: CPAchecker: A tool for configurable software verification. In: Gopalakrishnan, G., Qadeer, S. (eds.) Proceedings of CAV 2011. LNCS, vol. 6806, pp. 184–190. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  12. Beyer, D., Lemberger, T.: Software verification: Testing vs. model checking. Proceedings of HVC 2017. LNCS, vol. 10629, pp. 99–114. Springer, Cham (2017)

    Chapter  Google Scholar 

  13. Beyer, D., Löwe, S., Wendler, P.: Reliable benchmarking: Requirements and solutions. Int. J. Softw. Tools Technol. Transf. (2017)

  14. Beyer, D., Wendler, P.: Reuse of verification results. In: Bartocci, E., Ramakrishnan, C.R. (eds.) Proceedings of SPIN 2013. LNCS, vol. 7976, pp. 1–17. Springer, Heidelberg (2013)

    Chapter  Google Scholar 

  15. Brandes, U., Eiglsperger, M., Herman, I., Himsolt, M., Marshall, M.S.: GraphML progress report structural layer proposal. In: Mutzel, P., Jünger, M., Leipert, S. (eds.) Proceedings of GD 2001. LNCS, vol. 2265, pp. 501–512. Springer, Heidelberg (2002)

    Chapter  Google Scholar 

  16. Cadar, C., Ganesh, V., Pawlowski, P.M., Dill, D.L., Engler, D.R.: EXE: Automatically generating inputs of death. In: Juels, A., Wright, R.N., De Capitani di Vimercati, S. (eds.) Proceedings of CCS 2006, pp. 322–335. ACM (2006)

  17. Cassez, F., Sloane, A.M., Roberts, M., Pigram, M., Suvanpong, P., de Aledo, P.G.: Skink: Static analysis of programs in LLVM intermediate representation. In: Legay, A., Margaria, T. (eds.) Proceedings of TACAS 2017. LNCS, vol. 10206, pp. 380–384. Springer, Heidelberg (2017)

    Chapter  Google Scholar 

  18. Castaño, R., Braberman, V.A., Garbervetsky, D., Uchitel, S.: Model checker execution reports. In: Rosu, G., Di Penta, M., Nguyen, T.N. (eds.) Proceedings of ASE 2017, pp. 200–205. IEEE (2017)

  19. Chalupa, M., Vitovská, M., Jonáš, M., Slaby, J., Strejček, J.: Symbiotic 4: Beyond reachability. In: Legay, A., Margaria, T. (eds.) Proceedings of TACAS 2017. LNCS, vol. 10206, pp. 385–389. Springer, Heidelberg (2017)

    Chapter  Google Scholar 

  20. Christakis, M., Bird, C.: What developers want and need from program analysis: An empirical study. In: Lo, D., Apel, S., Khurshid, S. (eds.) Proceedings of ASE 2016, pp. 332–343. ACM (2016)

  21. Clarke, E.M., Grumberg, O., Jha, S., Lu, Y., Veith, H.: Counterexample-guided abstraction refinement for symbolic model checking. J. ACM 50(5), 752–794 (2003)

    Article  MathSciNet  Google Scholar 

  22. Csallner, C., Smaragdakis, Y.: Check ’n’ crash: Combining static checking and testing. In: Roman, G.-C., Griswold, W.G., Nuseibeh, B. (eds.) Proceedings of ICSE 2005, pp. 422–431. ACM (2005)

  23. Dangl, M., Löwe, S., Wendler, P.: CPAchecker with support for recursive programs and floating-point arithmetic. In: Baier, C., Tinelli, C. (eds.) Proceedings of TACAS 2015. LNCS, vol. 9035, pp. 423–425. Springer, Heidelberg (2015)

    Google Scholar 

  24. Gadelha, M.Y.R., Ismail, H.I., Cordeiro, L.C.: Handling loops in bounded model checking of C programs via k-induction. STTT 19(1), 97–114 (2017)

    Article  Google Scholar 

  25. Godefroid, P., Klarlund, N., Sen, K.: Dart: Directed automated random testing. In: Sarkar, V., Hall, M.W. (eds.) Proceedings of PLDI 2005, pp. 213–223. ACM (2005)

  26. Greitschus, M., Dietsch, D., Heizmann, M., Nutz, A., Schätzle, C., Schilling, C., Schüssele, F., Podelski, A.: Ultimate Taipan: Trace abstraction and abstract interpretation. In: Legay, A., Margaria, T. (eds.) Proceedings of TACAS 2017. LNCS, vol. 10206, pp. 399–403. Springer, Heidelberg (2017)

    Chapter  Google Scholar 

  27. Gulavani, B.S., Henzinger, T.A., Kannan, Y., Nori, A.V., Rajamani, S.K.: Synergy: A new algorithm for property checking. In: Young, M., Devanbu, P.T., (eds.) Proceedings of FSE 2006, pp. 117–127. ACM (2006)

  28. Gunter, E.L., Peled, D.: Path exploration tool. In: Cleaveland, W.R. (ed.) Proceedings of TACAS 1999. LNCS, vol. 1579, pp. 405–419. Springer, Heidelberg (1999)

    Chapter  Google Scholar 

  29. Heizmann, M., Chen, Y.-W., Dietsch, D., Greitschus, M., Nutz, A., Musa, B., Schätzle, C., Schilling, C., Schüssele, F., Podelski, A.: Ultimate automizer with an on-demand construction of Floyd-Hoare automata. In: Legay, A., Margaria, T. (eds.) Proceedings of TACAS 2017. LNCS, vol. 10206, pp. 394–398. Springer, Heidelberg (2017)

    Chapter  Google Scholar 

  30. Holík, L., Hruška, M., Lengál, O., Rogalewicz, A., Šimáček, J., Vojnar, T.: Forester: From heap shapes to automata predicates. In: Legay, A., Margaria, T. (eds.) Proceedings of TACAS 2017. LNCS, vol. 10206, pp. 365–369. Springer, Heidelberg (2017)

    Chapter  Google Scholar 

  31. Holzer, A., Schallhart, C., Tautschnig, M., Veith, H.: How did you specify your test suite. In: Pecheur, C., Andrews, J., Di Nitto, E. (eds.) Proceedings of ASE 2010, pp. 407–416. ACM (2010)

  32. Jakobs, M.-C., Wehrheim, H.: Compact proof witnesses. In: Barrett, C., Davies, M., Kahsai, T. (eds.) Proceedings of NFM 2017. LNCS, vol. 10227, pp. 389–403. Springer, Cham (2017)

    Chapter  Google Scholar 

  33. Kotoun, M., Peringer, P., Šoková, V., Vojnar, T.: Optimized PredatorHP and the SV-COMP heap and memory safety benchmark. In: Chechik, M., Raskin, J.-F. (eds.) Proceedings of TACAS 2016. LNCS, vol. 9636, pp. 942–945. Springer, Heidelberg (2016)

    Chapter  Google Scholar 

  34. Kroening, D., Tautschnig, M.: CBMC: C bounded model checker. In: Ábrahám, E., Havelund, K. (eds.) Proceedings of TACAS 2014. LNCS, vol. 8413, pp. 389–391. Springer, Heidelberg (2014)

    Chapter  Google Scholar 

  35. Li, K., Reichenbach, C., Csallner, C., Smaragdakis, Y.: Residual investigation: Predictive and precise bug detection. In: Heimdahl, M.P.E., Su, Z., (eds.) Proceedings of ISSTA 2012, pp. 298–308. ACM (2012)

  36. Majumdar, R., Sen, K.: Hybrid concolic testing. In: Emmerich, W., Knight, J., Rothermel, G. (eds.) Proceedings of ICSE 2007, pp. 416–426. IEEE (2007)

  37. Morse, J., Ramalho, M., Cordeiro, L., Nicole, D., Fischer, B.: ESBMC 1.22. In: Ábrahám, E., Havelund, K. (eds.) Proceedings of TACAS 2014. LNCS, vol. 8413, pp. 405–407. Springer, Heidelberg (2014)

    Chapter  Google Scholar 

  38. Mrázek, J., Jonáš, M., Štill, V., Lauko, H., Barnat, J.: Optimizing and caching SMT queries in SymDIVINE. In: Legay, A., Margaria, T. (eds.) Proceedings of TACAS 2017. LNCS, vol. 10206, pp. 390–393. Springer, Heidelberg (2017)

    Chapter  Google Scholar 

  39. Müller, P., Ruskiewicz, J.N.: Using debuggers to understand failed verification attempts. In: Butler, M., Schulte, W. (eds.) Proceedings of FM 2011. LNCS, vol. 6664, pp. 73–87. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  40. Nutz, A., Dietsch, D., Mohamed, M.M., Podelski, A.: Ultimate Kojak with memory safety checks. In: Baier, C., Tinelli, C. (eds.) Proceedings of TACAS 2015. LNCS, vol. 9035, pp. 458–460. Springer, Heidelberg (2015)

    Google Scholar 

  41. Rakamarić, Z., Emmi, M.: SMACK: Decoupling source language details from verifier implementations. In: Biere, A., Bloem, R. (eds.) Proceedings of CAV 2014. LNCS, vol. 8559, pp. 106–113. Springer, Cham (2014)

    Google Scholar 

  42. Rocha, H., Barreto, R., Cordeiro, L., Neto, A.D.: Understanding programming bugs in ANSI-C software using bounded model checking counter-examples. In: Derrick, J., Gnesi, S., Latella, D., Treharne, H. (eds.) Proceedings of IFM 2012. LNCS, vol. 7321, pp. 128–142. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  43. Rocha, W., Rocha, H., Ismail, H., Cordeiro, L., Fischer, B.: DepthK: A k-induction verifier based on invariant inference for C programs. In: Legay, A., Margaria, T. (eds.) Proceedings of TACAS 2017. LNCS, vol. 10206, pp. 360–364. Springer, Heidelberg (2017)

    Chapter  Google Scholar 

  44. Schneider, F.B.: Enforceable security policies. ACM Trans. Inf. Syst. Secur. 3(1), 30–50 (2000)

    Article  Google Scholar 

  45. Schrammel, P., Kroening, D.: 2LS for program analysis. In: Chechik, M., Raskin, J.-F. (eds.) Proceedings of TACAS 2016. LNCS, vol. 9636, pp. 905–907. Springer, Heidelberg (2016)

    Chapter  Google Scholar 

  46. Sen, K., Marinov, D., Agha, G.: Cute: A concolic unit testing engine for C. In: Wermelinger, M., Gall, H.C. (eds.) Proceedings of FSE 2005, pp. 263–272. ACM (2005)

  47. Shved, P., Mandrykin, M., Mutilin, V.: Predicate analysis with BLAST 2.7. In: Flanagan, C., König, B. (eds.) Proceedings of TACAS 2012. LNCS, vol. 7214, pp. 525–527. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  48. Visser, W., Păsăreanu, C.S., Khurshid, S.: Test input generation with Java PathFinder. In: Avrunin, G.S., Rothermel, G. (eds.) Proceedings of ISSTA 2004, pp. 97–107. ACM (2004)

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Beyer, D., Dangl, M., Lemberger, T., Tautschnig, M. (2018). Tests from Witnesses. In: Dubois, C., Wolff, B. (eds) Tests and Proofs. TAP 2018. Lecture Notes in Computer Science(), vol 10889. Springer, Cham. https://doi.org/10.1007/978-3-319-92994-1_1

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-92994-1_1

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-92993-4

  • Online ISBN: 978-3-319-92994-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics