Journal of Automated Reasoning

, Volume 57, Issue 3, pp 219–244 | Cite as

A Learning-Based Fact Selector for Isabelle/HOL

  • Jasmin Christian BlanchetteEmail author
  • David Greenaway
  • Cezary KaliszykEmail author
  • Daniel Kühlwein
  • Josef Urban


Sledgehammer integrates automatic theorem provers in the proof assistant Isabelle/HOL. A key component, the fact selector, heuristically ranks the thousands of facts (lemmas, definitions, or axioms) available and selects a subset, based on syntactic similarity to the current proof goal. We introduce MaSh, an alternative that learns from successful proofs. New challenges arose from our “zero click” vision: MaSh integrates seamlessly with the users’ workflow, so that they benefit from machine learning without having to install software, set up servers, or guide the learning. MaSh outperforms the old fact selector on large formalizations.


Relevance filtering Machine learning Proof assistants  Automatic theorem provers 



Tobias Nipkow and Lawrence Paulson have, for years, encouraged us to investigate the integration of machine learning in Sledgehammer; their foresight has made this work possible. Gerwin Klein and Makarius Wenzel provided advice on technical and licensing issues. Andrew Reynolds helped us tune the command-line options passed to CVC4. Tobias Nipkow, Lars Noschinski, Mark Summerfield, Dmitriy Traytel, and the anonymous reviewers suggested many improvements to earlier versions of this paper. Blanchette was partially supported by the Deutsche Forschungsgemeinschaft (DFG) project Hardening the Hammer (Grant NI 491/14-1). Kaliszyk was supported by the Austrian Science Fund (FWF): P26201. Kühlwein was supported by the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (NWO) project Learning to Reason (Grant 612.001.010). Urban was supported by the NWO project Knowledge-Based Automated Reasoning (Grant 612.001.208) and the European Research Council (ERC) grant AI4REASON.


  1. 1.
    Paulson, L.C., Blanchette, J.C.: Three years of experience with Sledgehammer, a practical link between automatic and interactive theorem provers. In: Sutcliffe, G., Schulz, S., Ternovska, E. (eds.) IWIL-2010, Volume 2 of EPiC, pp. 1–11. EasyChair (2012)Google Scholar
  2. 2.
    Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL: A Proof Assistant for Higher-Order Logic, Volume 2283 of LNCS. Springer, Berlin (2002)zbMATHCrossRefGoogle Scholar
  3. 3.
    Blanchette, J.C., Böhme, S., Popescu, A., Smallbone, N.: Encoding monomorphic and polymorphic types. In: Piterman, N., Smolka, S. (eds.) TACAS 2013, Volume 7795 of LNCS, pp. 493–507. Springer, Berlin (2013)Google Scholar
  4. 4.
    Blanchette, J.C., Böhme, S., Paulson, L.C.: Extending Sledgehammer with SMT solvers. J. Autom. Reason. 51(1), 109–128 (2013)MathSciNetzbMATHCrossRefGoogle Scholar
  5. 5.
    Blanchette, J.C., Popescu, A., Wand, D., Weidenbach, C.: More SPASS with Isabelle-Superposition with hard sorts and configurable simplification. In: Beringer, L., Felty, A. (eds.) ITP 2012, Volume 7406 of LNCS, pp. 345–360. Springer, Berlin (2012)Google Scholar
  6. 6.
    Reynolds, A., Tinelli, C., de Moura, L.: Finding conflicting instances of quantified formulas in SMT. In: Claessen, K., Kuncak, V. (eds.) FMCAD 2014, pp. 195–202. IEEE (2014)Google Scholar
  7. 7.
    Voronkov, A.: AVATAR: the architecture for first-order theorem provers. In: Biere, A., Bloem, R. (eds.) CAV 2014, Volume 8559 of LNCS, pp. 696–710. Springer, Berlin (2014)Google Scholar
  8. 8.
    Meng, J., Paulson, L.C.: Lightweight relevance filtering for machine-generated resolution problems. J. Appl. Logic 7(1), 41–57 (2009)MathSciNetzbMATHCrossRefGoogle Scholar
  9. 9.
    The Mizar Mathematical Library.
  10. 10.
    Grabowski, A., Korniłowicz, A., Naumowicz, A.: Mizar in a nutshell. J. Formaliz. Reason. 3(2), 153–245 (2010)MathSciNetzbMATHGoogle Scholar
  11. 11.
    Urban, J.: MPTP 0.2: design, implementation, and initial experiments. J. Autom. Reason. 37(1–2), 21–43 (2006)zbMATHGoogle Scholar
  12. 12.
    Urban, J.: MaLARea: a metasystem for automated reasoning in large theories. In: Sutcliffe, G., Urban, J., Schulz, S. (eds.) ESARLT 2007, Volume 257 of CEUR Workshop Proceedings. (2007)Google Scholar
  13. 13.
    Urban, J., Sutcliffe, G., Pudlák, P., Vyskočil, J.: MaLARea SG1-Machine learner for automated reasoning with semantic guidance. In: Armando, A., Baumgartner, P., Dowek, G. (eds.) IJCAR 2008, Volume 5195 of LNCS, pp. 441–456. Springer, Berlin (2008)Google Scholar
  14. 14.
    Sutcliffe, G.: The 4th IJCAR automated theorem proving system competition-CASC-J4. AI Commun. 22(1), 59–72 (2009)MathSciNetGoogle Scholar
  15. 15.
    Sutcliffe, G.: The 6th IJCAR automated theorem proving system competition-CASC-J6. AI Commun. 26(2), 211–223 (2013)MathSciNetGoogle Scholar
  16. 16.
    Alama, J., Heskes, T., Kühlwein, D., Tsivtsivadze, E., Urban, J.: Premise selection for mathematics by corpus analysis and kernel methods. J. Autom. Reason. 52(2), 191–213 (2014)MathSciNetzbMATHCrossRefGoogle Scholar
  17. 17.
    Kaliszyk, C., Urban, J.: MizAR 40 for Mizar 40. J. Autom. Reason. 55(3), 245–256 (2015)MathSciNetzbMATHCrossRefGoogle Scholar
  18. 18.
    Kühlwein, D., van Laarhoven, T., Tsivtsivadze, E., Urban, J., Heskes, T.: Overview and evaluation of premise selection techniques for large theory mathematics. In: Gramlich, B., Miller, D., Sattler, U. (eds.) IJCAR 2012, Volume 7364 of LNCS, pp. 378–392. Springer, Berlin (2012)Google Scholar
  19. 19.
    Hales, T.C.: Introduction to the Flyspeck project. In: Coquand, T., Lombardi, H., Roy, M.-F. (eds.) Mathematics, Algorithms, Proofs, number 05021 in Dagstuhl Seminar Proceedings, pp. 1–11. Internationales Begegnungs- und Forschungszentrum für Informatik (IBFI), Schloss Dagstuhl, Germany (2006)Google Scholar
  20. 20.
    Hales, T., Adams, M., Bauer, G., Dang, D.T., Harrison, J., Hoang, L.T., Kaliszyk, C., Magron, V., McLaughlin, S., Nguyen, T.T., Nguyen, Q.T., Nipkow, T., Obua, S., Pleso, J., Rute, J., Solovyev, A., Ta, T.H.A., Tran, N.T., Trieu, T.D., Urban, J., Vu, K.K., Zumkeller, R.: A formal proof of the Kepler conjecture. CoRR, abs/1501.02155 (2015)Google Scholar
  21. 21.
    Harrison, J.: HOL light: a tutorial introduction. In: Srivas, M.K., Camilleri, A.J. (eds.) FMCAD ’96, Volume 1166 of LNCS, pp. 265–269. Springer, Berlin (1996)Google Scholar
  22. 22.
    Kaliszyk, C., Urban, J.: Learning-assisted automated reasoning with Flyspeck. J. Autom. Reason. 53(2), 173–213 (2014)MathSciNetzbMATHCrossRefGoogle Scholar
  23. 23.
    Kühlwein, D., Blanchette, J.C., Kaliszyk, C., Urban, J.: MaSh: machine learning for sledgehammer. In: Blazy, S., Paulin-Mohring, C., Pichardie, D. (eds.) ITP 2013, Volume 7998 of LNCS, pp. 35–50. Springer, Berlin (2013)Google Scholar
  24. 24.
    Wenzel, M.: Isabelle/Isar—a generic framework for human-readable proof documents. In: Matuszewski, R., Zalewska, A. (eds.) From Insight to Proof-Festschrift in Honour of Andrzej Trybulec, Volume 10(23) of Studies in Logic, Grammar, and Rhetoric. Uniwersytet w Białymstoku (2007)Google Scholar
  25. 25.
    Schulz, S.: System description: E 1.8. In: McMillan, K.L., Middeldorp, A., Voronkov, A. (eds.) LPAR-19, Volume 8312 of LNCS, pp. 735–743. Springer, Berlin (2013)Google Scholar
  26. 26.
    Kovács, L., Voronkov, A.: First-order theorem proving and Vampire. In: Sharygina, N., Veith, H. (eds.) CAV 2013, Volume 8044 of LNCS, pp. 1–35. Springer, Berlin (2013)Google Scholar
  27. 27.
    Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanovic, D., King, T., Reynolds, A., Tinelli, C.: CVC4. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011, Volume 6806 of LNCS, pp. 171–177. Springer, Berlin (2011)Google Scholar
  28. 28.
    Bouton, T., de Oliveira, D.C.B., Déharbe, D., Fontaine, P.: veriT: an open, trustable and efficient SMT-solver. In: Schmidt, R.A. (ed.) CADE-22, Volume 5663 of LNCS, pp. 151–156. Springer, Berlin (2009)Google Scholar
  29. 29.
    de Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008, Volume 4963 of LNCS, pp. 337–340. Springer, Berlin (2008)Google Scholar
  30. 30.
    Hurd, J.: First-order proof tactics in higher-order logic theorem provers. In: Archer, M., Di Vito, B., Muñoz, C. (eds.) Design and Application of Strategies/Tactics in Higher Order Logics, NASA Technical Reports, pp. 56–68 (2003)Google Scholar
  31. 31.
    Blanchette, J.C., Böhme, S., Fleury, M., Smolka, S.J., Steckermeier, A.: Semi-intelligible Isar proofs from machine-generated proofs. J. Autom. Reason. (2015). doi: 10.1007/s10817-015-9335-3
  32. 32.
    Paulson, L.C., Susanto, K.W.: Source-level proof reconstruction for interactive theorem proving. In: Schneider, K., Brandt, J. (eds.) TPHOLs 2007, Volume 4732 of LNCS, pp. 232–245. Springer, Berlin (2007)Google Scholar
  33. 33.
    Paulson, L.C.: The inductive approach to verifying cryptographic protocols. J. Comput. Secur. 6(1–2), 85–128 (1998)CrossRefGoogle Scholar
  34. 34.
    Kaliszyk, C., Urban, J.: Stronger automation for Flyspeck by feature weighting and strategy evolution. In: Blanchette, J.C., Urban, J. (eds.) PxTP 2013, volume 14 of EPiC, pp. 87–95. EasyChair (2013)Google Scholar
  35. 35.
    Spärck Jones, K.: A statistical interpretation of term specificity and its application in retrieval. J. Doc. 28, 11–21 (1972)CrossRefGoogle Scholar
  36. 36.
    Alama, J., Kühlwein, D., Urban, J.: Automated and human proofs in general mathematics: an initial comparison. In: Bjørner, N., Voronkov, A. (eds.) LPAR-18, Volume 7180 of LNCS, pp. 37–45. Springer, Berlin (2012)Google Scholar
  37. 37.
    Berghofer, S., Nipkow, T.: Proof terms for simply typed higher order logic. In: Aagaard, M., Harrison, J. (eds.) TPHOLs 2000, Volume 1869 of LNCS, pp. 38–52. Springer, Berlin (2000)Google Scholar
  38. 38.
    Kühlwein, D., Urban, J.: Learning from multiple proofs: first experiments. In: Fontaine, P., Schmidt, R.A., Schulz, S. (eds.) PAAR-2012, Volume 21 of EPiC, pp. 82–94. EasyChair (2013)Google Scholar
  39. 39.
    Gauthier, T., Kaliszyk, C.: Premise selection and external provers for HOL4. In: Leroy, X., Tiu, A. (eds.) CPP 2015, pp. 49–57. ACM (2015)Google Scholar
  40. 40.
    Hoder, K., Voronkov, A.: Sine qua non for large theory reasoning. In: Bjørner, N., Sofronie-Stokkermans, V. (eds.) CADE-23, Volume 6803 of LNCS, pp. 299–314. Springer, Berlin (2011)Google Scholar
  41. 41.
    Klein, G., Nipkow, T., Paulson, L. (eds.) Archive of Formal Proofs.
  42. 42.
    Thiemann, R., Sternagel, C.: Certification of termination proofs using CeTA. In: Berghofer, S., Nipkow, T., Urban, C., Wenzel, M. (eds.) TPHOLs 2009, Volume 5674 of LNCS, pp. 452–468. Springer, Berlin (2009)Google Scholar
  43. 43.
    Klein, G., Nipkow, T.: Jinja is not Java. In: Klein, G., Nipkow, T., Paulson, L. (eds.) Archive of Formal Proofs. (2005)
  44. 44.
    Urban, C., Kaliszyk, C.: General bindings and alpha-equivalence in Nominal Isabelle. Log. Methods Comput. Sci. 8(2:14), 1–35 (2012)Google Scholar
  45. 45.
    Hölzl, J., Heller, A.: Three chapters of measure theory in Isabelle/HOL. In: van Eekelen, M., Geuvers, H., Schmaltz, J., Wiedijk, F. (eds.) ITP 2011, Volume 6898 of LNCS, pp. 135–151. Springer, Berlin (2011)Google Scholar
  46. 46.
    Böhme, S., Nipkow, T.: Sledgehammer: judgement day. In: Giesl, J., Hähnle, R. (eds.) IJCAR 2010, Volume 6173 of LNCS, pp. 107–121. Springer, Berlin (2010)Google Scholar
  47. 47.
    Urban, J.: BliStr: The Blind Strategymaker. Presented at PAAR-2014. CoRR, abs/1301.2683, (2014)Google Scholar
  48. 48.
    Klein, G., Andronick, J., Elphinstone, K., Heiser, G., Cock, D., Derrin, P., Elkaduwe, D., Engelhardt, K., Kolanski, R., Norrish, M., Sewell, T., Tuch, H., Winwood, S.: seL4: formal verification of an operating-system kernel. Commun. ACM 53(6), 107–115 (2010)CrossRefGoogle Scholar
  49. 49.
    Greenaway, D., Andronick, J., Klein, G.: Bridging the gap: automatic verified abstraction of C. In: Beringer, L., Felty, A. (eds.) ITP 2012, Volume 7406 of LNCS, pp. 99–115. Springer, Berlin (2012)Google Scholar
  50. 50.
    Kaliszyk, C., Urban, J.: HOL(y)Hammer: online ATP service for HOL light. Math. Comput. Sci. 9(1), 5–22 (2015)zbMATHCrossRefGoogle Scholar
  51. 51.
    Urban, J., Rudnicki, P., Sutcliffe, G.: ATP and presentation service for Mizar formalizations. J. Autom. Reason. 50(2), 229–241 (2013)MathSciNetzbMATHCrossRefGoogle Scholar
  52. 52.
    Denzinger, J., Fuchs, M., Goller, C., Schulz, S.: Learning from previous proof experience. Technical Report AR99-4, Institut für Informatik, Technische Universität München (1999)Google Scholar
  53. 53.
    Urban, J.: An overview of methods for large-theory automated theorem proving. In: Höfner, P., McIver, A., Struth, G. (eds.) ATE-2011, Volume 760 of CEUR Workshop Proceedings, pp. 3–8. (2011)Google Scholar
  54. 54.
    Heras, J., Komendantskaya, E., Johansson, M., Maclean, E.: Proof-pattern recognition and lemma discovery in ACL2. In: McMillan, K.L., Middeldorp, A., Voronkov, A. (eds.) LPAR-19, Volume 8312 of LNCS, pp. 389–406. Springer, Berlin (2013)Google Scholar
  55. 55.
    Urban, J., Vyskočil, J.: Theorem proving in large formal mathematics as an emerging AI field. In: Bonacina, M.P., Stickel, M.E. (eds.) Automated Reasoning and Mathematics-Essays in Memory of William McCune, Volume 7788 of LNCS, pp. 240–257. Springer, Berlin (2013)Google Scholar
  56. 56.
    Urban, J.: MoMM—fast interreduction and retrieval in large libraries of formalized mathematics. Int. J. AI Tools 15(1), 109–130 (2006)CrossRefGoogle Scholar

Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  • Jasmin Christian Blanchette
    • 1
    • 2
    Email author
  • David Greenaway
    • 3
  • Cezary Kaliszyk
    • 4
    Email author
  • Daniel Kühlwein
    • 5
  • Josef Urban
    • 6
  1. 1.Inria Nancy – Grand-Est & LORIAVillers-lès-NancyFrance
  2. 2.Max-Planck-Institut für InformatikSaarbrückenGermany
  3. 3.NICTAUniversity of New South WalesSydneyAustralia
  4. 4.University of InnsbruckInnsbruckAustria
  5. 5.Radboud UniversityNijmegenThe Netherlands
  6. 6.Czech Technical University in PraguePragueCzech Republic

Personalised recommendations