Journal of Automated Reasoning

, Volume 56, Issue 2, pp 155–200 | Cite as

Semi-intelligible Isar Proofs from Machine-Generated Proofs

  • Jasmin Christian Blanchette
  • Sascha Böhme
  • Mathias Fleury
  • Steffen Juilf Smolka
  • Albert Steckermeier


Sledgehammer is a component of the Isabelle/HOL proof assistant that integrates external automatic theorem provers (ATPs) to discharge interactive proof obligations. As a safeguard against bugs, the proofs found by the external provers are reconstructed in Isabelle. Reconstructing complex arguments involves translating them to Isabelle’s Isar format, supplying suitable justifications for each step. Sledgehammer transforms the proofs by contradiction into direct proofs; it iteratively tests and compresses the output, resulting in simpler and faster proofs; and it supports a wide range of ATPs, including E, LEO-II, Satallax, SPASS, Vampire, veriT, Waldmeister, and Z3.


Automatic theorem provers Proof assistants Natural deduction 


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Alama, J.: Escape to Mizar from ATPs. In: Fontaine, P., Schmidt, R., Schulz, S. (eds.) PAAR-2012, EPiC, vol. 21, pp 3–11. EasyChair (2013)Google Scholar
  2. 2.
    Armand, M., Faure, G., Grégoire, B., Keller, C., Théry, L., Werner, B.: A modular integration of SAT/SMT solvers to Coq through proof witnesses. In: Jouannaud, J.P., Shao, Z. (eds.) CPP 2011, LNCS, vol. 7086, pp 135–150. Springer (2011)Google Scholar
  3. 3.
    Autexier, S., Benzmüller, C., Fiedler, A., Horacek, H., Vo, Q.B.: Assertion-level proof representation with under-specification. Electr. Notes Theor. Comput. Sci. 93, 5–23 (2004)Google Scholar
  4. 4.
    Azmy, N., Weidenbach, C.: Computing tiny clause normal forms. In: Bonacina, M.P. (ed.) CADE-24, LNCS, vol. 7898, pp 109–125. Springer (2013)Google Scholar
  5. 5.
    Bachmair, L., Ganzinger, H.: Resolution theorem proving. In: Robinson, A., Voronkov, A. (eds.) Handbook of Automated Reasoning, vol. I, pp 19–99. Elsevier (2001)Google Scholar
  6. 6.
    Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanovic, D., King, T., Reynolds, A., Tinelli, C.: CVC4. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011, LNCS, vol. 6806, pp 171–177. Springer (2011)Google Scholar
  7. 7.
    Barrett, C., Stump, A., Tinelli, C.: The SMT-LIB standard—Version 2.0. In: Gupta, A., Kroening, D. (eds.) SMT 2010 (2010)Google Scholar
  8. 8.
    Benzmüller, C., Paulson, L.C., Theiss, F., Fietzke, A.: LEO-II—A cooperative automatic theorem prover for higher-order logic. In: Armando, A., Baumgartner, P., Dowek, G. (eds.) IJCAR 2008, LNCS, vol. 5195, pp 162–170. Springer (2008)Google Scholar
  9. 9.
    Besson, F., Fontaine, P., Théry, L.: A flexible proof format for SMT: A proposal. In: Fontaine, P., Stump, A. (eds.) PxTP 2011, pp 15–26 (2011)Google Scholar
  10. 10.
    Blanchette, J.C.: Automatic proofs and refutations for higher-order logic. Ph.D. thesis, Dept. of Informatics, Technische Universität München (2012)Google Scholar
  11. 11.
    Blanchette, J.C.: Redirecting proofs by contradiction. In: Blanchette, J.C., Urban, J. (eds.) PxTP 2013, EPiC, vol. 14, pp 11–26. EasyChair (2013)Google Scholar
  12. 12.
    Blanchette, J.C., Böhme, S., Paulson, L.C.: Extending Sledgehammer with SMT solvers. J. Autom. Reason. 51(1), 109–128 (2013)CrossRefGoogle Scholar
  13. 13.
    Blanchette, J.C., Böhme, S., Popescu, A., Smallbone, N.: Encoding monomorphic and polymorphic types. In: Piterman, N., Smolka, S. (eds.) TACAS 2013, LNCS, vol. 7795, pp 493–507. Springer (2013)Google Scholar
  14. 14.
    Blanchette, J.C., Popescu, A., Wand, D., Weidenbach, C.: More SPASS with Isabelle—Superposition with hard sorts and configurable simplification. In: Beringer, L., Felty, A. (eds.) ITP 2012, LNCS, vol. 7406, pp 345–360. Springer (2012)Google Scholar
  15. 15.
    Böhme, S.: Proving theorems of higher-order logic with SMT solvers. Ph.D. thesis, Dept. of Informatics, Technische Universität München (2012)Google Scholar
  16. 16.
    Böhme, S., Nipkow, T.: Sledgehammer: judgement day. In: Giesl, J., Hähnle, R. (eds.) IJCAR 2010, LNCS, vol. 6173, pp 107–121. Springer (2010)Google Scholar
  17. 17.
    Böhme, S., Weber, T.: Fast LCF-style proof reconstruction for Z3. In: Kaufmann, M., Paulson, L. (eds.) ITP 2010, LNCS, vol. 6172, pp 179–194. Springer (2010)Google Scholar
  18. 18.
    Bouton, T, de Oliveira, D.C.B., Déharbe, D., Fontaine, P.: veriT: an open, trustable and efficient SMT-solver. In: Schmidt, R.A. (ed.) CADE-22, LNCS, vol. 5663, pp 151–156. Springer (2009)Google Scholar
  19. 19.
    Bradley, A.R., Manna, Z.: Property-directed incremental invariant generation. Form. Asp. Comput. 20, 379–405 (2008)CrossRefzbMATHGoogle Scholar
  20. 20.
    Brown, C.E.: Satallax: an automatic higher-order prover. In: Gramlich, B., Miller, D., Sattler, U. (eds.) IJCAR 2012, LNCS, vol. 7364, pp 111–117. Springer (2012)Google Scholar
  21. 21.
    Brown, C.E.: Using Satallax to generate proof terms for conjectures in Coq. Presentation at AIPA 2012 (2012)Google Scholar
  22. 22.
    Caminati, M.: Re: [isabelle] sledgehammer, smt and metis. (2014)
  23. 23.
    Chaieb, A., Nipkow, T.: Verifying and reflecting quantifier elimination for Presburger arithmetic. In: Sutcliffe, G., Voronkov, A. (eds.) LPAR 2005, LNCS, vol. 3835, pp 367–380. Springer (2005)Google Scholar
  24. 24.
    Church, A.: A formulation of the simple theory of types. J. Symb. Log. 5(2), 56–68 (1940)CrossRefMathSciNetGoogle Scholar
  25. 25.
    Dahn, B.I.: Robbins algebras are Boolean: a revision of McCune’s computer-generated solution of Robbins problem. J. Algebra 208(2), 526–532 (1998)CrossRefMathSciNetzbMATHGoogle Scholar
  26. 26.
    Davis, M.: Obvious logical inferences. In: Hayes, P.J. (ed.) IJCAI ’81, pp 530–531. William Kaufmann (1981)Google Scholar
  27. 27.
    Diekmann, C.: Network security policy verification. In: Klein, G., Nipkow, T., Paulson, L. (eds.) Archive of Formal Proofs. (2014)
  28. 28.
    Fleury, M.: Translation of proofs provided by external provers. Internship report, Technische Universität München. (2014)
  29. 29.
    Foster, S., Struth, G.: Regular algebras. In: Klein, G., Nipkow, T., Paulson, L. (eds.) Archive of Formal Proofs. (2014)
  30. 30.
    Gentzen, G.: Untersuchungen über das logische Schließen. Math. Z. 39, 176–210 (1935)CrossRefMathSciNetGoogle Scholar
  31. 31.
    Gordon, M.J.C., Melham, T.F. (eds.): Introduction to HOL: A Theorem Proving Environment for Higher Order Logic. Cambridge University Press (1993)Google Scholar
  32. 32.
    Herbrand, J.: Thèses présentées à la faculté des sciences de paris pour obtenir le grade de docteur ès sciences mathématiques. Ph.D. thesis, Science Faculty, Université de Paris (1930)Google Scholar
  33. 33.
    Hillenbrand, T., Buch, A., Vogt, R., Löchner, B.: WALDMEISTER—High-performance equational deduction. J. Autom. Reason. 18(2), 265–270 (1997)CrossRefGoogle Scholar
  34. 34.
    Hoder, K., Kovacs, L., Voronkov, A.: Vampire usage and demo. Presentation at the Vampire tutorial at CADE-23. (2011)
  35. 35.
    Huang, X.: Translating machine-generated resolution proofs into ND-proofs at the assertion level. In: Foo, N.Y., Goebel, R. (eds.) PRICAI ’96, LNCS, vol. 1114, pp 399–410. Springer (1996)Google Scholar
  36. 36.
    Hurd, J.: First-order proof tactics in higher-order logic theorem provers. In: Archer, M., Di Vito, B., Muṅoz, C. (eds.) Design and Application of Strategies/Tactics in Higher Order Logics, no. CP-2003-212448 in NASA Technical Reports, pp 56–68 (2003)Google Scholar
  37. 37.
    Jaśkowski, S.: On the rules of suppositions in formal logic. Stud. Logica. 1, 5–32 (1934)zbMATHGoogle Scholar
  38. 38.
    Kaliszyk, C., Urban, J.: PRocH: Proof reconstruction for HOL Light. In: Bonacina, M.P. (ed.) CADE-24, LNCS, vol. 7898, pp 267–273. Springer (2013)Google Scholar
  39. 39.
    Kaliszyk, C., Urban, J.: Learning-assisted automated reasoning with Flyspeck. J. Autom. Reason. 53(2), 173–213 (2014)CrossRefMathSciNetGoogle Scholar
  40. 40.
    Karp, R.M.: Reducibility among combinatorial problems. In: Miller, R.E., Thatcher, J.W. (eds.) Complexity of Computer Computations, IBM Research Symposia Series, pp 85–103. Plenum Press (1972)Google Scholar
  41. 41.
    Klein, G., Nipkow, T., Paulson, L. (eds.): Archive of Formal Proofs.
  42. 42.
    Knuth, D.E., Larrabee, T.L., Roberts, P.M.: Mathematical Writing, Mathematical Association of America (1989)Google Scholar
  43. 43.
    Korovin, K.: Private communication (2013)Google Scholar
  44. 44.
    Kühlwein, D., Blanchette, J.C., Kaliszyk, C., Urban, J.: MaSh: machine learning for Sledgehammer. In: Blazy, S., Paulin-Mohring, C., Pichardie, D. (eds.) ITP 2013, LNCS, vol. 7998, pp 35–50. Springer (2013)Google Scholar
  45. 45.
    Loveland, D.W.: Automated Theorem Proving: A Logical Basis. North-Holland (1978)Google Scholar
  46. 46.
    Matuszewski, R., Rudnicki, P.: Mizar: the first 30 years. Mechanized Mathematics and Its Applications 4(1), 3–24 (2005)Google Scholar
  47. 47.
    Meier, A.: TRAMP: transformation of machine-found proofs into natural deduction proofs at the assertion level (system description). In: McAllester, D. (ed.) CADE-17, LNCS, vol. 1831, pp 460–464. Springer (2000)Google Scholar
  48. 48.
    Meng, J., Paulson, L.C.: Translating higher-order clauses to first-order clauses. J. Autom. Reason. 40(1), 35–60 (2008)CrossRefMathSciNetzbMATHGoogle Scholar
  49. 49.
    Meng, J., Paulson, L.C.: Lightweight relevance filtering for machine-generated resolution problems. J. Appl. Log. 7(1), 41–57 (2009)CrossRefMathSciNetzbMATHGoogle Scholar
  50. 50.
    Miller, D., Felty, A.: An integration of resolution and natural deduction theorem proving. In: AAAI-86, vol. I: Science, pp 198–202. Morgan Kaufmann (1986)Google Scholar
  51. 51.
    de Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008, LNCS, vol. 4963, pp 337–340. Springer (2008)Google Scholar
  52. 52.
    Nipkow, T.: Equational reasoning in Isabelle. Sci. Comput. Program. 12(2), 123–149 (1989)CrossRefMathSciNetzbMATHGoogle Scholar
  53. 53.
    Nipkow, T., Klein, G.: Concrete Semantics—With Isabelle/HOL. Springer (2014)Google Scholar
  54. 54.
    Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL: A Proof Assistant for Higher-Order Logic. LNCS, vol. 2283. Springer (2002)Google Scholar
  55. 55.
    Pąk, K.: The methods of improving and reorganizing natural deduction proofs. In: MathUI10 (2010)Google Scholar
  56. 56.
    Pąk, K.: Methods of lemma extraction in natural deduction proofs. J. Autom. Reason. 50(2), 217–228 (2013)CrossRefzbMATHGoogle Scholar
  57. 57.
    Pąk, K.: Improving legibility of natural deduction proofs is not trivial. Log. Meth. Comput. Sci. 10(3) (2014)Google Scholar
  58. 58.
    Paulson, L.C.: Isabelle: A Generic Theorem Prover. LNCS, vol. 828. Springer (1994)Google Scholar
  59. 59.
    Paulson, L.C.: Strategic principles in the design of Isabelle. In: CADE-15 Workshop on Strategies in Automated Deduction, pp 11–17 (1998)Google Scholar
  60. 60.
    Paulson, L.C.: A generic tableau prover and its integration with Isabelle. J. Univ. Comp. Sci. 5, 73–87 (1999)MathSciNetzbMATHGoogle Scholar
  61. 61.
    Paulson, L.C.: Three years of experience with Sledgehammer, a practical link between automated and interactive theorem provers. In: Konev, B., Schmidt, R., Schulz, S. (eds.) PAAR-2010, pp 1–10 (2010)Google Scholar
  62. 62.
    Paulson, L.C., Blanchette, J.C.: Three years of experience with Sledgehammer, a practical link between automatic and interactive theorem provers. In: Sutcliffe, G., Schulz, S., Ternovska, E. (eds.) IWIL-2010, EPiC, vol. 2, pp 1–11. EasyChair (2012)Google Scholar
  63. 63.
    Paulson, L.C., Susanto, K.W.: Source-level proof reconstruction for interactive theorem proving. In: Schneider, K., Brandt, J. (eds.) TPHOLs 2007, LNCS, vol. 4732, pp 232–245. Springer (2007)Google Scholar
  64. 64.
    Pfenning, F.: Analytic and non-analytic proofs. In: Shostak, R.E. (ed.) CADE-7, LNCS, vol. 170, pp 393–413. Springer (1984)Google Scholar
  65. 65.
    Reynolds, A., Tinelli, C, de Moura, L.: Finding conflicting instances of quantified formulas in SMT. In: Claessen, K., Kuncak, V. (eds.) FMCAD 2014, pp 195–202. FMCAD Inc. (2014)Google Scholar
  66. 66.
    Riazanov, A., Voronkov, A.: The design and implementation of Vampire. AI Commun. 15(2–3), 91–110 (2002)zbMATHGoogle Scholar
  67. 67.
    Robinson, A.J.: A machine-oriented logic based on the resolution principle. J. ACM 12(1), 23–41 (1965)CrossRefzbMATHGoogle Scholar
  68. 68.
    Rudnicki, P.: Obvious inferences. J. Autom. Reason. 3(4), 383–393 (1987)CrossRefMathSciNetzbMATHGoogle Scholar
  69. 69.
    Rudnicki, P., Urban, J.: Escape to ATP for Mizar. In: Fontaine, P., Stump, A. (eds.) PxTP 2011, pp 46–59 (2011)Google Scholar
  70. 70.
    Schulz, S.: System description: E 1.8. In: McMillan, K., Middeldorp, A., Voronkov, A. (eds.) LPAR-19, LNCS, vol. 8312, pp 735–743. Springer (2013)Google Scholar
  71. 71.
    Smolka, S.J.: Robust, semi-intelligible Isabelle proofs from ATP proofs. B.Sc. thesis, Dept. of Informatics, Technische Universität München (2013)Google Scholar
  72. 72.
    Smolka, S.J., Blanchette, J.C.: Robust, semi-intelligible Isabelle proofs from ATP proofs. In: Blanchette, J.C., Urban, J. (eds.) PxTP 2013, EPiC, vol. 14, pp 117–132. EasyChair (2013)Google Scholar
  73. 73.
    Steckermeier, A.: Extending Isabelle/HOL with the equality prover Waldmeister. B.Sc. thesis, Dept. of Informatics, Technische Universität München (2014)Google Scholar
  74. 74.
    Stump, A., Oe, D., Reynolds, A., Hadarean, L., Tinelli, C.: SMT proof checking using a logical framework. J. Formal Meth. Sys. Design 42(1), 91–118 (2013)CrossRefzbMATHGoogle Scholar
  75. 75.
    Sultana, N., Benzmu̇ller, C.: Understanding LEO-II’s proofs. In: Korovin, K., Schulz, S., Ternovska, E. (eds.) IWIL 2012, EPiC, vol. 22, pp 33–52. EasyChair (2013)Google Scholar
  76. 76.
    Sutcliffe, G.: System description: SystemOnTPTP. In: McAllester, D. (ed.) CADE-17, LNCS, vol. 1831, pp 406–410. Springer (2000)Google Scholar
  77. 77.
    Sutcliffe, G.: The TPTP problem library and associated infrastructure—The FOF and CNF parts, v3.5.0. J. Autom. Reason. 43(4), 337–362 (2009)CrossRefzbMATHGoogle Scholar
  78. 78.
    Sutcliffe, G.: The CADE-24 automated theorem proving system competition—CASC-24. AI Commun. 27(4), 405–416 (2014)MathSciNetGoogle Scholar
  79. 79.
    Sutcliffe, G., Zimmer, J., Schulz, S.: TSTP data-exchange formats for automated theorem proving tools. In: Zhang, W., Sorge, V. (eds.) Distributed Constraint Problem Solving and Reasoning in Multi-Agent Systems, Frontiers in Artificial Intelligence and Applications, vol. 112, pp 201–215. IOS Press (2004)Google Scholar
  80. 80.
    Thiemann, R.: Computing N-th roots using the Babylonian method. In: Klein, G., Nipkow, T., Paulson, L. (eds.) Archive of Formal Proofs. (2013)
  81. 81.
    Urban, J., Sutcliffe, G., Trac, S., Puzis, Y.: Combining Mizar and TPTP semantic presentation and verification tools. In: A. Grabowski, A. Naumowicz (eds.) Computer Reconstruction of the Body of Mathematics, Studies in Logic, Grammar and Rhetoric, vol. 18(31), pp. 121–136. University of Białystok (2009)Google Scholar
  82. 82.
    Wampler-Doty, M.: A complete proof of the Robbins conjecture. In: Klein, G., Nipkow, T., Paulson, L. (eds.) Archive of Formal Proofs. (2010)
  83. 83.
    Weber, T.: Sat-based finite model generation for higher-order logic. Ph.D. thesis, Dept. of Informatics, Technische Universität München (2008)Google Scholar
  84. 84.
    Weidenbach, C., Dimova, D., Fietzke, A., Kumar, R., Suda, M., Wischnewski, P.: SPASS version 3.5. In: Schmidt, R.A. (ed.) CADE-22, LNCS, vol. 5663, pp 140–145. Springer (2009)Google Scholar
  85. 85.
    Wenzel, M.: Type classes and overloading in higher-order logic. In: Gunter, E.L., Felty, A. (eds.) TPHOLs 1997, LNCS, vol. 1275, pp 307–322. Springer (1997)Google Scholar
  86. 86.
    Wenzel, M.: Isabelle/Isar—A generic framework for human-readable proof documents. In: In: R. Matuszewski, A. Zalewska (eds.) From Insight to Proof—Festschrift in Honour of Andrzej Trybulec, Studies in Logic, Grammar and Rhetoric, vol. 10(23). University of Białystok (2007)Google Scholar
  87. 87.
    Wickerson, J., Dodds, M., Parkinson, M.J.: Ribbon proofs for separation logic. In: Felleisen, M., Gardner, P. (eds.) ESOP 2013, LNCS, vol. 7792, pp 189–208. Springer (2013)Google Scholar
  88. 88.
    Zimmer, J., Meier, A., Sutcliffe, G., Zhan, Y.: Integrated proof transformation services. In: Benzmüller, C., Windsteiger, W. (eds.) IJCAR WS 7 (2004)Google Scholar

Copyright information

© Springer Science+Business Media Dordrecht 2015

Authors and Affiliations

  • Jasmin Christian Blanchette
    • 1
    • 2
  • Sascha Böhme
    • 3
  • Mathias Fleury
    • 2
    • 4
  • Steffen Juilf Smolka
    • 5
  • Albert Steckermeier
    • 3
  1. 1.Inria Nancy & LORIA, Équipe VeriDisVillers-lès-NancyFrance
  2. 2.Max-Planck-Institut für InformatikSaarbrückenGermany
  3. 3.Institut für InformatikTechnische Universität MünchenGarchingGermany
  4. 4.École normale supérieure de Rennes, Campus de Ker LannBruzFrance
  5. 5.Department of Computer ScienceCornell UniversityIthacaUSA

Personalised recommendations