A Toolkit for Automated Testing of Dafny

Fedchin, Aleksandr; Dean, Tyler; Foster, Jeffrey S.; Mercer, Eric; Rakamarić, Zvonimir; Reger, Giles; Rungta, Neha; Salkeld, Robin; Wagner, Lucas; Waldrip, Cassidy

doi:10.1007/978-3-031-33170-1_24

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13903)

Included in the following conference series:

NASA Formal Methods Symposium

Abstract

Dafny is a verification-ready programming language that is executed via compilation to C# and other mainstream languages. We introduce a toolkit for automated testing of Dafny programs, consisting of DUnit (unit testing framework), DMock (mocking framework), and DTest (automated test generation). The main component of the toolkit, DTest, repurposes the Dafny verifier to automatically generate DUnit test cases that achieve desired coverage. It supports verification-specific language features, such as pre- and postconditions, and leverages them for mocking with DMock. We evaluate the new toolkit in two ways. First, we use two open-source Dafny projects to demonstrate that DTest can generate unit tests with branch coverage that is comparable to the expectations developers set for manually written tests. Second, we show that a greedy approach to test generation often produces a number of tests close to the theoretical minimum for the given coverage criterion.
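The abstract's second claim, that a greedy approach often yields a test count close to the theoretical minimum, can be illustrated with a generic set-cover heuristic. The sketch below is not DTest's algorithm. It assumes each candidate test is already associated with the set of coverage targets (for example, branches) it exercises, and it repeatedly keeps the test that covers the most targets still uncovered. The function name `greedy_select`, the test names, and the target labels are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_select(coverage: Dict[str, Set[str]]) -> List[str]:
    """Greedily pick tests until every coverable target is covered.

    `coverage` maps a (hypothetical) test name to the set of coverage
    targets, such as branch identifiers, that the test exercises.
    """
    # Every target that at least one candidate test can reach.
    uncovered: Set[str] = set().union(*coverage.values()) if coverage else set()
    selected: List[str] = []
    while uncovered:
        # Keep the test covering the most still-uncovered targets.
        # Iterating over sorted names makes ties resolve alphabetically.
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & uncovered))
        selected.append(best)
        uncovered -= coverage[best]
    return selected


# Hypothetical data: four candidate tests over five branches.
candidate_tests = {
    "t1": {"b1", "b2", "b3"},
    "t2": {"b3", "b4"},
    "t3": {"b4", "b5"},
    "t4": {"b1"},
}

print(greedy_select(candidate_tests))  # ['t1', 't3']
```

In this example two tests suffice, which is also the minimum, since no single test covers all five branches. In general a greedy choice of this kind is only an approximation of the minimum cover, which is why the comparison against the theoretical minimum is an empirical question.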

Notes

1. Dafny’s int is compiled to C#’s BigInteger because in Dafny integers are unbounded.
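The reason an unbounded integer cannot be mapped to a fixed-width type can be seen in a small sketch. Python's built-in int is also arbitrary-precision, so it stands in for the unbounded behaviour here; the helper `wrap_int32` is a hypothetical function that simulates what a signed 32-bit integer would hold after two's-complement wrap-around.

```python
def wrap_int32(value: int) -> int:
    """Reduce an unbounded integer to the value a signed 32-bit int would hold."""
    return (value + 2**31) % 2**32 - 2**31


INT32_MAX = 2**31 - 1

# Unbounded arithmetic: the mathematically expected result.
unbounded = INT32_MAX + 1

# Simulated fixed-width arithmetic: the result wraps to the minimum value.
wrapped = wrap_int32(INT32_MAX + 1)

print(unbounded)  # 2147483648
print(wrapped)    # -2147483648
```

A compiled program that used the wrapped value would disagree with what the verifier proved about the unbounded one, which is consistent with the note's choice of an arbitrary-precision target type.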

Acknowledgments

The authors would like to thank Aleks Chakarov, Cody Roux, William Schultz, and Serdar Tasiran for their invaluable feedback on the usability of the toolkit, Ryan Emery and Tony Knapp for facilitating the use of the ESDK as a benchmark dataset, Rustan Leino, Mikael Mayer, Aaron Tomb, and Remy Williams for their feedback on the source code, and the anonymous reviewers for helping improve this text. This work is partly supported by an Amazon post-internship graduate research fellowship.

Author information

Authors and Affiliations

Tufts University, Medford, USA
Aleksandr Fedchin & Jeffrey S. Foster
Amazon Web Services, Seattle, USA
Zvonimir Rakamarić, Giles Reger, Neha Rungta, Robin Salkeld & Lucas Wagner
Brigham Young University, Provo, USA
Tyler Dean, Eric Mercer & Cassidy Waldrip

Corresponding author

Correspondence to Aleksandr Fedchin.

Editor information

Editors and Affiliations

Iowa State University, Ames, IA, USA
Kristin Yvonne Rozier
University of Texas at Austin, Austin, TX, USA
Swarat Chaudhuri

About this paper

Cite this paper

Fedchin, A. et al. (2023). A Toolkit for Automated Testing of Dafny. In: Rozier, K.Y., Chaudhuri, S. (eds) NASA Formal Methods. NFM 2023. Lecture Notes in Computer Science, vol 13903. Springer, Cham. https://doi.org/10.1007/978-3-031-33170-1_24

DOI: https://doi.org/10.1007/978-3-031-33170-1_24
Published: 03 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-33169-5
Online ISBN: 978-3-031-33170-1
eBook Packages: Computer Science, Computer Science (R0)
