GRUNGE: A Grand Unified ATP Challenge
Conference paper
Abstract
This paper describes a large set of related theorem proving problems obtained by translating theorems from the HOL4 standard library into multiple logical formalisms. The formalisms cover higher-order logic (with and without type variables) and first-order logic (possibly with types, and possibly with type variables). The resulting problem sets allow us to run automated theorem provers that support different logical formalisms on corresponding problems and compare their performance. This also results in a new “grand unified” large theory benchmark that emulates the ITP/ATP hammer setting, where systems and metasystems can use multiple formalisms in complementary ways and jointly learn from the accumulated knowledge.
Keywords
Theorem proving, Higher-order logic, First-order logic, Many-sorted logic
Copyright information
© Springer Nature Switzerland AG 2019