A Learning-Based Fact Selector for Isabelle/HOL

Blanchette, Jasmin Christian; Greenaway, David; Kaliszyk, Cezary; Kühlwein, Daniel; Urban, Josef

doi:10.1007/s10817-016-9362-8

Journal of Automated Reasoning, Volume 57, pages 219–244 (2016)

Published: 03 February 2016

Abstract

Sledgehammer integrates automatic theorem provers in the proof assistant Isabelle/HOL. A key component, the fact selector, heuristically ranks the thousands of facts (lemmas, definitions, or axioms) available and selects a subset, based on syntactic similarity to the current proof goal. We introduce MaSh, an alternative that learns from successful proofs. New challenges arose from our “zero click” vision: MaSh integrates seamlessly with the users’ workflow, so that they benefit from machine learning without having to install software, set up servers, or guide the learning. MaSh outperforms the old fact selector on large formalizations.

Notes

The source code is distributed as part of Isabelle in src/HOL/Tools/Sledgehammer/sledgehammer_mash.ML.
In general, we could extend the features recursively by following the dependency graph, but here we perform only one iteration.
http://www21.in.tum.de/~blanchet/mash2_data.tgz.
Earlier evaluations of Sledgehammer, starting with Böhme and Nipkow’s Judgment Day experiments [46], always operated on individual (sub)goals, guided by the notion that lemmas can be too difficult to be proved outright by automatic provers. However, lemmas appear to provide the right level of challenge for modern automation, and they tend to exhibit less redundancy than a sequence of similar subgoals.
CVC4’s developers have previously reported high success rates on Sledgehammer-generated benchmarks [6], but this is the first time that we have independently corroborated those results.
It is hard to envisage all possible combinations, but with the recent progress in natural language processing, suitable combination methods could soon be applied to another major aspect of formalization: the translation from informal prose to formal specification.

References

1. Paulson, L.C., Blanchette, J.C.: Three years of experience with Sledgehammer, a practical link between automatic and interactive theorem provers. In: Sutcliffe, G., Schulz, S., Ternovska, E. (eds.) IWIL-2010, Volume 2 of EPiC, pp. 1–11. EasyChair (2012)
2. Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL: A Proof Assistant for Higher-Order Logic, Volume 2283 of LNCS. Springer, Berlin (2002)
3. Blanchette, J.C., Böhme, S., Popescu, A., Smallbone, N.: Encoding monomorphic and polymorphic types. In: Piterman, N., Smolka, S. (eds.) TACAS 2013, Volume 7795 of LNCS, pp. 493–507. Springer, Berlin (2013)
4. Blanchette, J.C., Böhme, S., Paulson, L.C.: Extending Sledgehammer with SMT solvers. J. Autom. Reason. 51(1), 109–128 (2013)
5. Blanchette, J.C., Popescu, A., Wand, D., Weidenbach, C.: More SPASS with Isabelle - Superposition with hard sorts and configurable simplification. In: Beringer, L., Felty, A. (eds.) ITP 2012, Volume 7406 of LNCS, pp. 345–360. Springer, Berlin (2012)
6. Reynolds, A., Tinelli, C., de Moura, L.: Finding conflicting instances of quantified formulas in SMT. In: Claessen, K., Kuncak, V. (eds.) FMCAD 2014, pp. 195–202. IEEE (2014)
7. Voronkov, A.: AVATAR: the architecture for first-order theorem provers. In: Biere, A., Bloem, R. (eds.) CAV 2014, Volume 8559 of LNCS, pp. 696–710. Springer, Berlin (2014)
8. Meng, J., Paulson, L.C.: Lightweight relevance filtering for machine-generated resolution problems. J. Appl. Logic 7(1), 41–57 (2009)
9. The Mizar Mathematical Library. http://mizar.org/
10. Grabowski, A., Korniłowicz, A., Naumowicz, A.: Mizar in a nutshell. J. Formaliz. Reason. 3(2), 153–245 (2010)
11. Urban, J.: MPTP 0.2: design, implementation, and initial experiments. J. Autom. Reason. 37(1–2), 21–43 (2006)
12. Urban, J.: MaLARea: a metasystem for automated reasoning in large theories. In: Sutcliffe, G., Urban, J., Schulz, S. (eds.) ESARLT 2007, Volume 257 of CEUR Workshop Proceedings. CEUR-WS.org (2007)
13. Urban, J., Sutcliffe, G., Pudlák, P., Vyskočil, J.: MaLARea SG1 - Machine learner for automated reasoning with semantic guidance. In: Armando, A., Baumgartner, P., Dowek, G. (eds.) IJCAR 2008, Volume 5195 of LNCS, pp. 441–456. Springer, Berlin (2008)
14. Sutcliffe, G.: The 4th IJCAR automated theorem proving system competition - CASC-J4. AI Commun. 22(1), 59–72 (2009)
15. Sutcliffe, G.: The 6th IJCAR automated theorem proving system competition - CASC-J6. AI Commun. 26(2), 211–223 (2013)
16. Alama, J., Heskes, T., Kühlwein, D., Tsivtsivadze, E., Urban, J.: Premise selection for mathematics by corpus analysis and kernel methods. J. Autom. Reason. 52(2), 191–213 (2014)
17. Kaliszyk, C., Urban, J.: MizAR 40 for Mizar 40. J. Autom. Reason. 55(3), 245–256 (2015)
18. Kühlwein, D., van Laarhoven, T., Tsivtsivadze, E., Urban, J., Heskes, T.: Overview and evaluation of premise selection techniques for large theory mathematics. In: Gramlich, B., Miller, D., Sattler, U. (eds.) IJCAR 2012, Volume 7364 of LNCS, pp. 378–392. Springer, Berlin (2012)
19. Hales, T.C.: Introduction to the Flyspeck project. In: Coquand, T., Lombardi, H., Roy, M.-F. (eds.) Mathematics, Algorithms, Proofs, number 05021 in Dagstuhl Seminar Proceedings, pp. 1–11. Internationales Begegnungs- und Forschungszentrum für Informatik (IBFI), Schloss Dagstuhl, Germany (2006)
20. Hales, T., Adams, M., Bauer, G., Dang, D.T., Harrison, J., Hoang, L.T., Kaliszyk, C., Magron, V., McLaughlin, S., Nguyen, T.T., Nguyen, Q.T., Nipkow, T., Obua, S., Pleso, J., Rute, J., Solovyev, A., Ta, T.H.A., Tran, N.T., Trieu, T.D., Urban, J., Vu, K.K., Zumkeller, R.: A formal proof of the Kepler conjecture. CoRR, abs/1501.02155 (2015)
21. Harrison, J.: HOL light: a tutorial introduction. In: Srivas, M.K., Camilleri, A.J. (eds.) FMCAD ’96, Volume 1166 of LNCS, pp. 265–269. Springer, Berlin (1996)
22. Kaliszyk, C., Urban, J.: Learning-assisted automated reasoning with Flyspeck. J. Autom. Reason. 53(2), 173–213 (2014)
23. Kühlwein, D., Blanchette, J.C., Kaliszyk, C., Urban, J.: MaSh: machine learning for sledgehammer. In: Blazy, S., Paulin-Mohring, C., Pichardie, D. (eds.) ITP 2013, Volume 7998 of LNCS, pp. 35–50. Springer, Berlin (2013)
24. Wenzel, M.: Isabelle/Isar - a generic framework for human-readable proof documents. In: Matuszewski, R., Zalewska, A. (eds.) From Insight to Proof - Festschrift in Honour of Andrzej Trybulec, Volume 10(23) of Studies in Logic, Grammar, and Rhetoric. Uniwersytet w Białymstoku (2007)
25. Schulz, S.: System description: E 1.8. In: McMillan, K.L., Middeldorp, A., Voronkov, A. (eds.) LPAR-19, Volume 8312 of LNCS, pp. 735–743. Springer, Berlin (2013)
26. Kovács, L., Voronkov, A.: First-order theorem proving and Vampire. In: Sharygina, N., Veith, H. (eds.) CAV 2013, Volume 8044 of LNCS, pp. 1–35. Springer, Berlin (2013)
27. Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanovic, D., King, T., Reynolds, A., Tinelli, C.: CVC4. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011, Volume 6806 of LNCS, pp. 171–177. Springer, Berlin (2011)
28. Bouton, T., de Oliveira, D.C.B., Déharbe, D., Fontaine, P.: veriT: an open, trustable and efficient SMT-solver. In: Schmidt, R.A. (ed.) CADE-22, Volume 5663 of LNCS, pp. 151–156. Springer, Berlin (2009)
29. de Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008, Volume 4963 of LNCS, pp. 337–340. Springer, Berlin (2008)
30. Hurd, J.: First-order proof tactics in higher-order logic theorem provers. In: Archer, M., Di Vito, B., Muñoz, C. (eds.) Design and Application of Strategies/Tactics in Higher Order Logics, NASA Technical Reports, pp. 56–68 (2003)
31. Blanchette, J.C., Böhme, S., Fleury, M., Smolka, S.J., Steckermeier, A.: Semi-intelligible Isar proofs from machine-generated proofs. J. Autom. Reason. (2015). doi:10.1007/s10817-015-9335-3
32. Paulson, L.C., Susanto, K.W.: Source-level proof reconstruction for interactive theorem proving. In: Schneider, K., Brandt, J. (eds.) TPHOLs 2007, Volume 4732 of LNCS, pp. 232–245. Springer, Berlin (2007)
33. Paulson, L.C.: The inductive approach to verifying cryptographic protocols. J. Comput. Secur. 6(1–2), 85–128 (1998)
34. Kaliszyk, C., Urban, J.: Stronger automation for Flyspeck by feature weighting and strategy evolution. In: Blanchette, J.C., Urban, J. (eds.) PxTP 2013, Volume 14 of EPiC, pp. 87–95. EasyChair (2013)
35. Spärck Jones, K.: A statistical interpretation of term specificity and its application in retrieval. J. Doc. 28, 11–21 (1972)
36. Alama, J., Kühlwein, D., Urban, J.: Automated and human proofs in general mathematics: an initial comparison. In: Bjørner, N., Voronkov, A. (eds.) LPAR-18, Volume 7180 of LNCS, pp. 37–45. Springer, Berlin (2012)
37. Berghofer, S., Nipkow, T.: Proof terms for simply typed higher order logic. In: Aagaard, M., Harrison, J. (eds.) TPHOLs 2000, Volume 1869 of LNCS, pp. 38–52. Springer, Berlin (2000)
38. Kühlwein, D., Urban, J.: Learning from multiple proofs: first experiments. In: Fontaine, P., Schmidt, R.A., Schulz, S. (eds.) PAAR-2012, Volume 21 of EPiC, pp. 82–94. EasyChair (2013)
39. Gauthier, T., Kaliszyk, C.: Premise selection and external provers for HOL4. In: Leroy, X., Tiu, A. (eds.) CPP 2015, pp. 49–57. ACM (2015)
40. Hoder, K., Voronkov, A.: Sine qua non for large theory reasoning. In: Bjørner, N., Sofronie-Stokkermans, V. (eds.) CADE-23, Volume 6803 of LNCS, pp. 299–314. Springer, Berlin (2011)
41. Klein, G., Nipkow, T., Paulson, L. (eds.) Archive of Formal Proofs. http://afp.sf.net/
42. Thiemann, R., Sternagel, C.: Certification of termination proofs using CeTA. In: Berghofer, S., Nipkow, T., Urban, C., Wenzel, M. (eds.) TPHOLs 2009, Volume 5674 of LNCS, pp. 452–468. Springer, Berlin (2009)
43. Klein, G., Nipkow, T.: Jinja is not Java. In: Klein, G., Nipkow, T., Paulson, L. (eds.) Archive of Formal Proofs. http://afp.sf.net/entries/Jinja.shtml (2005)
44. Urban, C., Kaliszyk, C.: General bindings and alpha-equivalence in Nominal Isabelle. Log. Methods Comput. Sci. 8(2:14), 1–35 (2012)
45. Hölzl, J., Heller, A.: Three chapters of measure theory in Isabelle/HOL. In: van Eekelen, M., Geuvers, H., Schmaltz, J., Wiedijk, F. (eds.) ITP 2011, Volume 6898 of LNCS, pp. 135–151. Springer, Berlin (2011)
46. Böhme, S., Nipkow, T.: Sledgehammer: judgement day. In: Giesl, J., Hähnle, R. (eds.) IJCAR 2010, Volume 6173 of LNCS, pp. 107–121. Springer, Berlin (2010)
47. Urban, J.: BliStr: The Blind Strategymaker. Presented at PAAR-2014. CoRR, abs/1301.2683 (2014)
48. Klein, G., Andronick, J., Elphinstone, K., Heiser, G., Cock, D., Derrin, P., Elkaduwe, D., Engelhardt, K., Kolanski, R., Norrish, M., Sewell, T., Tuch, H., Winwood, S.: seL4: formal verification of an operating-system kernel. Commun. ACM 53(6), 107–115 (2010)
49. Greenaway, D., Andronick, J., Klein, G.: Bridging the gap: automatic verified abstraction of C. In: Beringer, L., Felty, A. (eds.) ITP 2012, Volume 7406 of LNCS, pp. 99–115. Springer, Berlin (2012)
50. Kaliszyk, C., Urban, J.: HOL(y)Hammer: online ATP service for HOL light. Math. Comput. Sci. 9(1), 5–22 (2015)
51. Urban, J., Rudnicki, P., Sutcliffe, G.: ATP and presentation service for Mizar formalizations. J. Autom. Reason. 50(2), 229–241 (2013)
52. Denzinger, J., Fuchs, M., Goller, C., Schulz, S.: Learning from previous proof experience. Technical Report AR99-4, Institut für Informatik, Technische Universität München (1999)
53. Urban, J.: An overview of methods for large-theory automated theorem proving. In: Höfner, P., McIver, A., Struth, G. (eds.) ATE-2011, Volume 760 of CEUR Workshop Proceedings, pp. 3–8. CEUR-WS.org (2011)
54. Heras, J., Komendantskaya, E., Johansson, M., Maclean, E.: Proof-pattern recognition and lemma discovery in ACL2. In: McMillan, K.L., Middeldorp, A., Voronkov, A. (eds.) LPAR-19, Volume 8312 of LNCS, pp. 389–406. Springer, Berlin (2013)
55. Urban, J., Vyskočil, J.: Theorem proving in large formal mathematics as an emerging AI field. In: Bonacina, M.P., Stickel, M.E. (eds.) Automated Reasoning and Mathematics - Essays in Memory of William McCune, Volume 7788 of LNCS, pp. 240–257. Springer, Berlin (2013)
56. Urban, J.: MoMM - fast interreduction and retrieval in large libraries of formalized mathematics. Int. J. AI Tools 15(1), 109–130 (2006)

Acknowledgments

Tobias Nipkow and Lawrence Paulson have, for years, encouraged us to investigate the integration of machine learning in Sledgehammer; their foresight has made this work possible. Gerwin Klein and Makarius Wenzel provided advice on technical and licensing issues. Andrew Reynolds helped us tune the command-line options passed to CVC4. Tobias Nipkow, Lars Noschinski, Mark Summerfield, Dmitriy Traytel, and the anonymous reviewers suggested many improvements to earlier versions of this paper. Blanchette was partially supported by the Deutsche Forschungsgemeinschaft (DFG) project Hardening the Hammer (Grant NI 491/14-1). Kaliszyk was supported by the Austrian Science Fund (FWF): P26201. Kühlwein was supported by the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (NWO) project Learning to Reason (Grant 612.001.010). Urban was supported by the NWO project Knowledge-Based Automated Reasoning (Grant 612.001.208) and the European Research Council (ERC) grant AI4REASON.

Author information

Authors and Affiliations

Inria Nancy – Grand-Est & LORIA, Villers-lès-Nancy, France
Jasmin Christian Blanchette
Max-Planck-Institut für Informatik, Saarbrücken, Germany
Jasmin Christian Blanchette
NICTA, University of New South Wales, Sydney, Australia
David Greenaway
University of Innsbruck, Innsbruck, Austria
Cezary Kaliszyk
Radboud University, Nijmegen, The Netherlands
Daniel Kühlwein
Czech Technical University in Prague, Prague, Czech Republic
Josef Urban

Corresponding authors

Correspondence to Jasmin Christian Blanchette or Cezary Kaliszyk.

Additional information

In memoriam Piotr Rudnicki 1951–2012.

About this article

Cite this article

Blanchette, J.C., Greenaway, D., Kaliszyk, C. et al. A Learning-Based Fact Selector for Isabelle/HOL. J Autom Reasoning 57, 219–244 (2016). https://doi.org/10.1007/s10817-016-9362-8

Received: 31 March 2015
Accepted: 15 January 2016
Published: 03 February 2016
Issue Date: October 2016
DOI: https://doi.org/10.1007/s10817-016-9362-8
