A framework for testing first-order logic axioms in program verification
Program verification systems based on automated theorem provers rely on user-provided axioms to verify domain-specific properties of code. However, formulating axioms correctly (that is, formalizing properties of an intended mathematical interpretation) is non-trivial in practice, and unsoundness can be difficult to avoid or even to detect. Moreover, judging the soundness of axioms from the output of the provers themselves is not easy, since provers do not typically give counterexamples. We adopt the idea of model-based testing to aid axiom authors in discovering errors in axiomatizations. To test the validity of axioms, users define a computational model of the axiomatized logic by giving interpretations to the function symbols and constants in a simple declarative programming language. We have developed an axiom testing framework that helps automate model definition and test generation using off-the-shelf tools for meta-programming, property-based random testing, and constraint solving. We have used our tool to test the axioms of AutoCert, a program verification system that has been applied to verify aerospace flight code using a first-order axiomatization of navigational concepts, and were able to find counterexamples for a number of axioms.
Keywords: Model-based testing, Program verification, Automated theorem proving, Property-based testing, Constraint solving
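As an illustration of the idea described in the abstract, the following minimal sketch interprets two function symbols over the integers and searches for counterexamples to two axioms by random testing. Everything in it is an assumption made for illustration: the symbols `abs` and `leq`, both axioms, and the helper `find_counterexample` are hypothetical, and the hand-rolled random search stands in for the off-the-shelf property-based testing and constraint-solving tools that the framework actually uses.

```python
import random

# Hypothetical computational model: each function/predicate symbol of the
# axiomatization is given an executable interpretation over the integers.
MODEL = {
    "abs": lambda x: x if x >= 0 else -x,
    "leq": lambda x, y: x <= y,
}


def implies(p, q):
    """Material implication for Boolean values."""
    return (not p) or q


# Hypothetical axiom 1: forall x. leq(0, abs(x))   (valid in MODEL)
def axiom_abs_nonneg(m, x):
    return m["leq"](0, m["abs"](x))


# Hypothetical axiom 2: forall x, y. leq(x, y) -> leq(abs(x), abs(y))
# (not valid in MODEL, e.g. x = -2, y = 1)
def axiom_abs_monotone(m, x, y):
    return implies(m["leq"](x, y), m["leq"](m["abs"](x), m["abs"](y)))


def find_counterexample(axiom, arity, model=MODEL, trials=1000, seed=0, bound=100):
    """Evaluate the axiom body on random integer assignments to its
    universally quantified variables. Return the first falsifying
    assignment as a tuple, or None if every trial satisfies the axiom."""
    rng = random.Random(seed)  # fixed seed keeps the search reproducible
    for _ in range(trials):
        args = tuple(rng.randint(-bound, bound) for _ in range(arity))
        if not axiom(model, *args):
            return args
    return None
```

Calling `find_counterexample(axiom_abs_nonneg, 1)` returns `None`, whereas `find_counterexample(axiom_abs_monotone, 2)` returns a pair `(x, y)` with `x <= y` but `abs(x) > abs(y)`. A `None` result only means that no counterexample was found within the given number of trials; it does not establish that the axiom is sound.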