Journal of Automated Reasoning, Volume 55, Issue 2, pp 117–183

The Reflective Milawa Theorem Prover is Sound (Down to the Machine Code that Runs it)

  • Jared Davis
  • Magnus O. Myreen


This paper presents, we believe, the most comprehensive evidence of a theorem prover’s soundness to date. Our subject is the Milawa theorem prover. We present evidence of its soundness down to the machine code. Milawa is a theorem prover styled after NQTHM and ACL2. It is based on an idealised version of ACL2’s computational logic and provides the user with high-level tactics similar to ACL2’s. In contrast to NQTHM and ACL2, Milawa has a small kernel that is somewhat like an LCF-style system. We explain how the Milawa theorem prover is constructed as a sequence of reflective extensions from its kernel. The kernel establishes the soundness of these extensions during Milawa’s bootstrapping process. Going deeper, we explain how we have shown that the Milawa kernel is sound using the HOL4 theorem prover. In HOL4, we have formalized its logic, proved the logic sound, and proved that the source code for the Milawa kernel (1,700 lines of Lisp) faithfully implements this logic. Going even further, we have combined these results with the x86 machine-code level verification of the Lisp runtime Jitawa. Our top-level theorem states that Milawa can never claim to prove anything that is false when it is run on this Lisp runtime.


Keywords: Soundness, Theorem proving, Proof assistant, Machine code



Copyright information

© Springer Science+Business Media Dordrecht 2015

Authors and Affiliations

  1. Centaur Technology, Inc., Austin, USA
  2. CSE Department, Chalmers University of Technology, Göteborg, Sweden
  3. Computer Laboratory, University of Cambridge, Cambridge, UK
