Premise Selection for Mathematics by Corpus Analysis and Kernel Methods

Alama, Jesse; Heskes, Tom; Kühlwein, Daniel; Tsivtsivadze, Evgeni; Urban, Josef

doi:10.1007/s10817-013-9286-5

Published: 24 April 2013

Volume 52, pages 191–213, (2014)

Journal of Automated Reasoning

Abstract

Smart premise selection is essential when using automated reasoning as a tool for large-theory formal proof development. This work develops learning-based premise selection in two ways. First, a fine-grained dependency analysis of existing high-level formal mathematical proofs is used to build a large knowledge base of proof dependencies, providing precise data for ATP-based re-verification and for training premise selection algorithms. Second, a new machine learning algorithm for premise selection based on kernel methods is proposed and implemented. To evaluate the impact of both techniques, a benchmark consisting of 2078 large-theory mathematical problems is constructed, extending the older MPTP Challenge benchmark. The combined effect of the techniques results in a 50 % improvement on the benchmark over the state-of-the-art Vampire/SInE system for automated reasoning in large theories.

Author information

Authors and Affiliations

Center for Artificial Intelligence, New University of Lisbon, 1099-085, Lisbon, Portugal
Jesse Alama
Intelligent Systems, Institute for Computing and Information Sciences, Radboud University, Nijmegen, Netherlands
Tom Heskes, Daniel Kühlwein, Evgeni Tsivtsivadze & Josef Urban

Corresponding author

Correspondence to Josef Urban.

Additional information

Jesse Alama's research was funded by the FCT project “Dialogical Foundations of Semantics” (DiFoS) in the ESF EuroCoRes programme LogICCC (FCT LogICCC/0001/2007). Part of the research for this paper was done while he was a visiting fellow at the Isaac Newton Institute for the Mathematical Sciences in the programme ‘Semantics & Syntax’.

The research of T. Heskes, D. Kühlwein, E. Tsivtsivadze and J. Urban was funded by the NWO projects “Learning2Reason” and “MathWiki”.

About this article

Cite this article

Alama, J., Heskes, T., Kühlwein, D. et al. Premise Selection for Mathematics by Corpus Analysis and Kernel Methods. J Autom Reasoning 52, 191–213 (2014). https://doi.org/10.1007/s10817-013-9286-5

Received: 10 August 2011
Accepted: 02 April 2013
Published: 24 April 2013
Issue Date: February 2014
DOI: https://doi.org/10.1007/s10817-013-9286-5
