GRUNGE: A Grand Unified ATP Challenge

Brown, Chad E.; Gauthier, Thibault; Kaliszyk, Cezary; Sutcliffe, Geoff; Urban, Josef

doi:10.1007/978-3-030-29436-6_8


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11716)

Included in the following conference series:

International Conference on Automated Deduction


Abstract

This paper describes a large set of related theorem proving problems obtained by translating theorems from the HOL4 standard library into multiple logical formalisms. The formalisms are in higher-order logic (with and without type variables) and first-order logic (possibly with types, and possibly with type variables). The resultant problem sets allow us to run automated theorem provers that support different logical formalisms on corresponding problems, and compare their performance. This also results in a new “grand unified” large theory benchmark that emulates the ITP/ATP hammer setting, where systems and metasystems can use multiple formalisms in complementary ways, and jointly learn from the accumulated knowledge.
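To make the idea of "corresponding problems" in several formalisms concrete, the following is a minimal sketch that renders one toy statement (commutativity of a binary function) in three TPTP dialects: untyped first-order (FOF), monomorphic typed first-order (TF0), and monomorphic higher-order (TH0). The statement, the symbol names `a` and `f`, and the helper functions are all hypothetical illustrations; this is not one of the benchmark problems and not the translation procedure used in the paper, and the variants with type variables are left out.

```python
# Illustrative sketch only: a hypothetical toy statement, NOT a GRUNGE problem
# and NOT the paper's translation. Symbols `a` and `f` are invented here.


def render_fof(name: str) -> str:
    """Untyped first-order form: no type declarations at all."""
    return f"fof({name}, conjecture, ![X, Y]: (f(X, Y) = f(Y, X)))."


def render_tf0(name: str) -> str:
    """Monomorphic typed first-order form: f takes a pair of arguments."""
    lines = [
        "tff(a_type, type, a: $tType).",
        "tff(f_type, type, f: (a * a) > a).",
        f"tff({name}, conjecture, ![X: a, Y: a]: (f(X, Y) = f(Y, X))).",
    ]
    return "\n".join(lines)


def render_th0(name: str) -> str:
    """Monomorphic higher-order form: f is curried and applied with @."""
    lines = [
        "thf(a_type, type, a: $tType).",
        "thf(f_type, type, f: a > a > a).",
        f"thf({name}, conjecture, ![X: a, Y: a]: ((f @ X @ Y) = (f @ Y @ X))).",
    ]
    return "\n".join(lines)


RENDERERS = {"FOF": render_fof, "TF0": render_tf0, "TH0": render_th0}


def render_all(name: str = "f_comm") -> dict:
    """Return the same statement rendered once per formalism."""
    return {fmt: render(name) for fmt, render in RENDERERS.items()}


if __name__ == "__main__":
    for fmt, text in render_all().items():
        print(f"% {fmt}")
        print(text)
        print()
```

Each rendering could be saved to its own file and passed to a prover that accepts that dialect, which is the kind of side-by-side comparison the abstract describes.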

Supported by the ERC grant no. 649043 AI4REASON and no. 714034 SMART, by the Czech project AI&Reasoning CZ.02.1.01/0.0/0.0/15_003/0000466, the European Regional Development Fund, and the National Science Foundation Grant 1730419 - “CI-SUSTAIN: StarExec: Cross-Community Infrastructure for Logic Solving”.

Notes

1. 1291 theorems were not included due to dependencies being erased during the build of the HOL4 library.
2. http://www.tptp.org/CASC/27/TrainingData.HL4.tgz

Author information

Authors and Affiliations

Czech Technical University in Prague, Prague, Czech Republic
Chad E. Brown, Thibault Gauthier & Josef Urban
University of Innsbruck, Innsbruck, Austria
Cezary Kaliszyk
University of Warsaw, Warsaw, Poland
Cezary Kaliszyk
University of Miami, Coral Gables, USA
Geoff Sutcliffe

Corresponding author

Correspondence to Josef Urban.

Editor information

Editors and Affiliations

University of Lorraine, Villers-lès-Nancy, France
Pascal Fontaine

About this paper

Cite this paper

Brown, C.E., Gauthier, T., Kaliszyk, C., Sutcliffe, G., Urban, J. (2019). GRUNGE: A Grand Unified ATP Challenge. In: Fontaine, P. (eds) Automated Deduction – CADE 27. CADE 2019. Lecture Notes in Computer Science (LNAI), vol 11716. Springer, Cham. https://doi.org/10.1007/978-3-030-29436-6_8

DOI: https://doi.org/10.1007/978-3-030-29436-6_8
Published: 20 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29435-9
Online ISBN: 978-3-030-29436-6
eBook Packages: Computer Science, Computer Science (R0)
