Vampire with a Brain Is a Good ITP Hammer

Conference paper

Frontiers of Combining Systems (FroCoS 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12941)

Abstract

Vampire has been for a long time the strongest first-order automatic theorem prover, widely used for hammer-style proof automation in ITPs such as Mizar, Isabelle, HOL, and Coq. In this work, we considerably improve the performance of Vampire in hammering over the full Mizar library by enhancing its saturation procedure with efficient neural guidance. In particular, we employ a recently proposed recursive neural network classifying the generated clauses based only on their derivation history. Compared to previous neural methods based on considering the logical content of the clauses, our architecture makes evaluating a single clause much less time consuming. The resulting system shows good learning capability and improves on the state-of-the-art performance on the Mizar library, while proving many theorems that the related ENIGMA system could not prove in a similar hammering evaluation.
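
The paper itself contains no code, but the core idea admits a compact sketch. The following PyTorch fragment (all identifiers hypothetical; an illustration rather than the paper's implementation) scores a clause purely from its derivation DAG: initial clauses receive learnable embeddings, each binary inference rule gets a small combining network, and a final linear layer produces a usefulness logit. See note 14 below for how other rule arities can be handled.

    import torch
    import torch.nn as nn
    from dataclasses import dataclass
    from typing import Tuple

    DIM = 64  # embedding width (hypothetical; the paper's value may differ)

    @dataclass(frozen=True)
    class Node:
        axiom_id: int = 0                 # originating axiom (leaves only)
        rule_id: int = 0                  # inference rule (inner nodes only)
        parents: Tuple["Node", ...] = ()  # empty for initial clauses

    class HistoryScorer(nn.Module):
        def __init__(self, num_axioms: int, num_rules: int):
            super().__init__()
            self.leaf = nn.Embedding(num_axioms, DIM)
            self.rule = nn.ModuleList(
                nn.Sequential(nn.Linear(2 * DIM, DIM), nn.ReLU())
                for _ in range(num_rules))
            self.out = nn.Linear(DIM, 1)

        def embed(self, node: Node, cache: dict) -> torch.Tensor:
            # Derivations are DAGs; caching evaluates shared ancestors once.
            if node not in cache:
                if node.parents:  # assumes binary rules here
                    l, r = (self.embed(p, cache) for p in node.parents)
                    cache[node] = self.rule[node.rule_id](torch.cat([l, r]))
                else:
                    cache[node] = self.leaf(torch.tensor(node.axiom_id))
            return cache[node]

        def forward(self, node: Node) -> torch.Tensor:
            return self.out(self.embed(node, {}))  # logit: useful or not

    # For example: score a clause derived by rule 0 from axioms 0 and 1.
    scorer = HistoryScorer(num_axioms=2, num_rules=1)
    print(scorer(Node(rule_id=0, parents=(Node(axiom_id=0), Node(axiom_id=1)))))

Because the network never looks at the literals of a clause, a single evaluation costs only a few small matrix operations per derivation step, which is what makes the approach cheap enough for saturation-time guidance.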


Notes

  1.

    We compared and empirically evaluated various modes of integrating the model advice into the clause selection process in our previous work [39].

  2.

    By special we mean “in principle distinct”. Since all the embeddings are learnable, the network itself “decides” during training how exactly to distinguish \(I_{goal}\) from all the other axiom embeddings \(I_i\) (and from the “generic” \(I_{unknown}\)).
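
    As a purely illustrative sketch (hypothetical names, not the paper's code), such a table can simply reserve two extra rows, one for the goal and one for unseen axioms, trained like any other parameter:

        import torch.nn as nn

        NUM_AXIOMS = 1000             # hypothetical number of axiom labels
        GOAL_IDX = NUM_AXIOMS         # reserved row for the goal/conjecture
        UNKNOWN_IDX = NUM_AXIOMS + 1  # reserved row for axioms unseen in training

        # Every row is an ordinary learnable parameter; nothing but training
        # makes the goal row behave differently from the axiom rows.
        init_embeddings = nn.Embedding(NUM_AXIOMS + 2, 64)

        def label_to_row(label: str, known: dict) -> int:
            """Map an axiom label to its embedding row, defaulting to 'unknown'."""
            if label == "goal":
                return GOAL_IDX
            return known.get(label, UNKNOWN_IDX)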

  3.

    The implementation is available as a public repo at https://git.io/JOh6S.

  4.

    There exist non-trivial preprocessing techniques for achieving graph batching [25].

  5.

    The latter is already relevant within a single derivation (cf. [39], Sect. 4.4).

  6.

    E.g., a node can be designated a positive example (label 1.0, weight \(w_1\)) in one derivation and a negative one (label 0.0, weight \(w_2\)) in another. The corresponding collapsed node receives the label \(w_1/(w_1+w_2)\) and weight \((w_1+w_2)\).
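
    The following hypothetical helper (names ours, not the paper's) spells out the arithmetic:

        def merge_examples(examples):
            """Collapse (label, weight) pairs for copies of one node.

            A positive occurrence (1.0, w1) and a negative one (0.0, w2)
            yield label w1 / (w1 + w2) with weight w1 + w2.
            """
            total = sum(w for _, w in examples)
            label = sum(lbl * w for lbl, w in examples) / total
            return label, total

        # Positive with weight 2.0 in one derivation, negative with weight
        # 1.0 in another: the merged node gets label 2/3 and weight 3.0.
        assert merge_examples([(1.0, 2.0), (0.0, 1.0)]) == (2 / 3, 3.0)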

  7.

    This effect has already been observed by researchers in a related context [30].

  8.

    Supplementary materials for the experiments can be found at https://git.io/JOY71.

  9.

    This means layered clause selection with second-level ratio 2:1 (as explained in Sect. 2) and lazy model evaluation and abstraction caching (see [39]).
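
    A rough sketch of the mechanism (hypothetical code, not Vampire's implementation): model-approved clauses populate a second queue that is consulted twice for every pick from the unrestricted one.

        import itertools
        from heapq import heappush, heappop

        class LayeredSelector:
            """Two-layer selection with a 2:1 second-level ratio: two picks
            preferring model-approved clauses for every unrestricted pick."""

            def __init__(self):
                self._tick = itertools.count()  # heap tie-breaker
                self.approved = []              # clauses the model likes
                self.everything = []            # all passive clauses
                self.turns = itertools.cycle(["approved", "approved", "all"])

            def push(self, clause, base_priority, model_approves):
                entry = (base_priority, next(self._tick), clause)
                heappush(self.everything, entry)
                if model_approves:
                    heappush(self.approved, entry)

            def pop(self):
                source = next(self.turns)
                use_approved = source == "approved" and self.approved
                queue = self.approved if use_approved else self.everything
                # A real prover would also lazily skip clauses already
                # selected via the other queue; ignored in this sketch.
                return heappop(queue)[2] if queue else None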

  10.

    A server with Intel(R) Xeon(R) Gold 6140 CPUs @ 2.3 GHz with 500 GB RAM.

  11.

    The learning rate was set to grow linearly from 0 to a maximum value \(\alpha_m = 2.0 \times 10^{-4}\) at epoch 40: \(\alpha(t) = t \cdot \alpha_m / 40\) for \(t \in (0,40]\); and then to decrease from that value as the reciprocal of time: \(\alpha(t) = 40 \cdot \alpha_m / t\) for \(t \in (40,100)\).
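
    In PyTorch this schedule could be expressed as follows (the optimizer choice and all identifiers are our assumptions, not taken from the paper):

        import torch
        from torch.optim.lr_scheduler import LambdaLR

        ALPHA_MAX = 2.0e-4  # the peak learning rate from the schedule above

        def lr_factor(last_epoch: int) -> float:
            """Linear warm-up to the peak at epoch 40, then 1/t decay."""
            t = last_epoch + 1  # PyTorch counts from 0; the formula uses t >= 1
            return t / 40 if t <= 40 else 40 / t

        model = torch.nn.Linear(8, 1)  # stand-in model for the sketch
        optimizer = torch.optim.Adam(model.parameters(), lr=ALPHA_MAX)
        scheduler = LambdaLR(optimizer, lr_lambda=lr_factor)

        for epoch in range(100):
            ...  # one training epoch would run here
            scheduler.step()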

  12.

    Please note that the batches of training and validation examples for different numbers of revealed axioms were constructed and split independently, so meaningful comparisons are mainly possible between the values of the middle column (for \(m = 1000\)).

  13.

    The first option is like being born blind and learning, over a lifetime, how to live without the missing sense; the second is like losing a sense “just before the final exam”.

  14.

    Our architecture separately models arity-one rules, binary rules, and rules of arity 3 and more, for which a binary building block is iteratively composed with itself.
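
    A minimal sketch of that design (hypothetical identifiers, simplified layers):

        import torch
        import torch.nn as nn
        from typing import List

        DIM = 64  # hypothetical embedding width

        class RuleCombiner(nn.Module):
            """One net for unary rules, one for binary rules; arity >= 3
            reuses the binary block by folding it over the parents."""

            def __init__(self):
                super().__init__()
                self.unary = nn.Sequential(nn.Linear(DIM, DIM), nn.ReLU())
                self.binary = nn.Sequential(nn.Linear(2 * DIM, DIM), nn.ReLU())

            def forward(self, parents: List[torch.Tensor]) -> torch.Tensor:
                if len(parents) == 1:
                    return self.unary(parents[0])
                acc = parents[0]
                for p in parents[1:]:  # iterated composition for arity >= 3
                    acc = self.binary(torch.cat([acc, p]))
                return acc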

  15.

    These could also be used whenever a trained model is combined with a strategy not used to produce the training data, possibly invoking rules not present in training.

  16.

    In honor of dropout [37], a well-known regularization technique that inspired this.

  17.

    Again using the layered clause selection prevalent here, with a second-level ratio of 2:1.

References

  1. Alama, J., Heskes, T., Kühlwein, D., Tsivtsivadze, E., Urban, J.: Premise selection for mathematics by corpus analysis and kernel methods. J. Autom. Reason. 52(2), 191–213 (2014). https://doi.org/10.1007/s10817-013-9286-5

  2. Alemi, A.A., Chollet, F., Irving, G., Szegedy, C., Urban, J.: DeepMath - deep sequence models for premise selection. CoRR abs/1606.04442 (2016)

  3. Aygün, E., et al.: Learning to prove from synthetic theorems. CoRR abs/2006.11259 (2020)

  4. Bachmair, L., Ganzinger, H.: Resolution theorem proving. In: Robinson and Voronkov [33], pp. 19–99. https://doi.org/10.1016/b978-044450813-3/50004-7

  5. Barrett, C., Fontaine, P., Tinelli, C.: The Satisfiability Modulo Theories Library (SMT-LIB) (2016). www.SMT-LIB.org

  6. Blanchette, J.C., Kaliszyk, C., Paulson, L.C., Urban, J.: Hammering towards QED. J. Formaliz. Reason. 9(1), 101–148 (2016). https://doi.org/10.6092/issn.1972-5787/4593

  7. Chvalovský, K., Jakubův, J., Suda, M., Urban, J.: ENIGMA-NG: efficient neural and gradient-boosted inference guidance for E. In: Fontaine [12], pp. 197–215. https://doi.org/10.1007/978-3-030-29436-6_12

  8. Crouse, M., et al.: A deep reinforcement learning based approach to learning transferable proof guidance strategies. CoRR abs/1911.02065 (2019)

  9. Denzinger, J., Schulz, S.: Learning domain knowledge to improve theorem proving. In: McRobbie, M.A., Slaney, J.K. (eds.) CADE 1996. LNCS, vol. 1104, pp. 62–76. Springer, Heidelberg (1996). https://doi.org/10.1007/3-540-61511-3_69

  10. Färber, M., Kaliszyk, C.: Random forests for premise selection. In: Lutz, C., Ranise, S. (eds.) FroCoS 2015. LNCS (LNAI), vol. 9322, pp. 325–340. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24246-0_20

  11. Färber, M., Kaliszyk, C., Urban, J.: Monte Carlo tableau proof search. In: de Moura, L. (ed.) CADE 2017. LNCS (LNAI), vol. 10395, pp. 563–579. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63046-5_34

  12. Fontaine, P. (ed.): CADE 2019. LNCS (LNAI), vol. 11716. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-29436-6

  13. Gleiss, B., Suda, M.: Layered clause selection for saturation-based theorem proving. In: Fontaine, P., Korovin, K., Kotsireas, I.S., Rümmer, P., Tourret, S. (eds.) Joint Proceedings of the 7th Workshop on Practical Aspects of Automated Reasoning (PAAR) and the 5th Satisfiability Checking and Symbolic Computation Workshop (SC-Square), co-located with the 10th International Joint Conference on Automated Reasoning (IJCAR 2020), Paris, France, June–July, 2020 (Virtual). CEUR Workshop Proceedings, vol. 2752, pp. 34–52. CEUR-WS.org (2020)

  14. Gleiss, B., Suda, M.: Layered clause selection for theory reasoning. In: Peltier, N., Sofronie-Stokkermans, V. (eds.) IJCAR 2020, Part I. LNCS (LNAI), vol. 12166, pp. 402–409. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-51074-9_23

  15. Goller, C., Küchler, A.: Learning task-dependent distributed representations by backpropagation through structure. In: Proceedings of International Conference on Neural Networks (ICNN 1996), Washington, DC, USA, 3–6 June 1996, pp. 347–352. IEEE (1996). https://doi.org/10.1109/ICNN.1996.548916

  16. Goodfellow, I.J., Bengio, Y., Courville, A.C.: Deep Learning. Adaptive Computation and Machine Learning. MIT Press, Cambridge (2016)

  17. Grabowski, A., Kornilowicz, A., Naumowicz, A.: Mizar in a nutshell. J. Formaliz. Reason. 3(2), 153–245 (2010). https://doi.org/10.6092/issn.1972-5787/1980

  18. Hoder, K., Voronkov, A.: Sine Qua non for large theory reasoning. In: Bjørner, N., Sofronie-Stokkermans, V. (eds.) CADE 2011. LNCS (LNAI), vol. 6803, pp. 299–314. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22438-6_23

  19. Jakubův, J., Chvalovský, K., Olšák, M., Piotrowski, B., Suda, M., Urban, J.: ENIGMA anonymous: symbol-independent inference guiding machine (system description). In: Peltier, N., Sofronie-Stokkermans, V. (eds.) IJCAR 2020, Part II. LNCS (LNAI), vol. 12167, pp. 448–463. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-51054-1_29

  20. Jakubův, J., Urban, J.: ENIGMA: efficient learning-based inference guiding machine. In: Geuvers, H., England, M., Hasan, O., Rabe, F., Teschke, O. (eds.) CICM 2017. LNCS (LNAI), vol. 10383, pp. 292–302. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-62075-6_20

  21. Jakubův, J., Urban, J.: Enhancing ENIGMA given clause guidance. In: Rabe, F., Farmer, W.M., Passmore, G.O., Youssef, A. (eds.) CICM 2018. LNCS (LNAI), vol. 11006, pp. 118–124. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-96812-4_11

  22. Jakubův, J., Urban, J.: Hammering Mizar by learning clause guidance (short paper). In: Harrison, J., O’Leary, J., Tolmach, A. (eds.) 10th International Conference on Interactive Theorem Proving, ITP 2019, Portland, OR, USA, 9–12 September 2019. LIPIcs, vol. 141, pp. 34:1–34:8. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019). https://doi.org/10.4230/LIPIcs.ITP.2019.34

  23. Kaliszyk, C., Urban, J.: MizAR 40 for Mizar 40. J. Autom. Reason. 55(3), 245–256 (2015). https://doi.org/10.1007/s10817-015-9330-8

  24. Kovács, L., Voronkov, A.: First-order theorem proving and Vampire. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 1–35. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39799-8_1

  25. Looks, M., Herreshoff, M., Hutchins, D., Norvig, P.: Deep learning with dynamic computation graphs. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, 24–26 April 2017, Conference Track Proceedings. OpenReview.net (2017)

  26. Loos, S.M., Irving, G., Szegedy, C., Kaliszyk, C.: Deep network guided proof search. In: Eiter, T., Sands, D. (eds.) LPAR-21, 21st International Conference on Logic for Programming, Artificial Intelligence and Reasoning, Maun, Botswana, 7–12 May 2017. EPiC Series in Computing, vol. 46, pp. 85–105. EasyChair (2017)

  27. Nieuwenhuis, R., Rubio, A.: Paramodulation-based theorem proving. In: Robinson and Voronkov [33], pp. 371–443. https://doi.org/10.1016/b978-044450813-3/50009-6

  28. Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32, pp. 8024–8035. Curran Associates, Inc. (2019). http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf

  29. Piotrowski, B., Urban, J.: Stateful premise selection by recurrent neural networks. In: Albert, E., Kovács, L. (eds.) LPAR 2020: 23rd International Conference on Logic for Programming, Artificial Intelligence and Reasoning, Alicante, Spain, 22–27 May 2020. EPiC Series in Computing, vol. 73, pp. 409–422. EasyChair (2020). https://easychair.org/publications/paper/g38n

  30. Recht, B., Re, C., Wright, S., Niu, F.: HOGWILD!: a lock-free approach to parallelizing stochastic gradient descent. In: Shawe-Taylor, J., Zemel, R., Bartlett, P., Pereira, F., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 24. Curran Associates, Inc. (2011). https://proceedings.neurips.cc/paper/2011/file/218a0aefd1d1a4be65601cc6ddc1520e-Paper.pdf

  31. Reger, G., Suda, M., Voronkov, A.: Playing with AVATAR. In: Felty, A.P., Middeldorp, A. (eds.) CADE 2015. LNCS (LNAI), vol. 9195, pp. 399–415. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21401-6_28

  32. Riazanov, A., Voronkov, A.: Limited resource strategy in resolution theorem proving. J. Symb. Comput. 36(1–2), 101–115 (2003). https://doi.org/10.1016/S0747-7171(03)00040-3

  33. Robinson, J.A., Voronkov, A. (eds.): Handbook of Automated Reasoning (in 2 volumes). Elsevier and MIT Press (2001)

  34. Schulz, S.: Learning Search Control Knowledge for Equational Deduction. No. 230 in DISKI, Akademische Verlagsgesellschaft Aka GmbH Berlin (2000)

  35. Schulz, S., Cruanes, S., Vukmirovic, P.: Faster, higher, stronger: E 2.3. In: Fontaine [12], pp. 495–507. https://doi.org/10.1007/978-3-030-29436-6_29

  36. Schulz, S., Möhrmann, M.: Performance of clause selection heuristics for saturation-based theorem proving. In: Olivetti, N., Tiwari, A. (eds.) IJCAR 2016. LNCS (LNAI), vol. 9706, pp. 330–345. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-40229-1_23

  37. Srivastava, N., Hinton, G.E., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014). http://dl.acm.org/citation.cfm?id=2670313

  38. Suda, M.: Aiming for the goal with SInE. In: Kovács, L., Voronkov, A. (eds.) Vampire 2018 and Vampire 2019. The 5th and 6th Vampire Workshops. EPiC Series in Computing, vol. 71, pp. 38–44. EasyChair (2020). https://doi.org/10.29007/q4pt

  39. Suda, M.: Improving ENIGMA-style clause selection while learning from history. In: Platzer, A., Sutcliffe, G. (eds.) Proceedings of the 28th CADE (2021, to appear). https://arxiv.org/abs/2102.13564

  40. Tammet, T.: GKC: a reasoning system for large knowledge bases. In: Fontaine [12], pp. 538–549. https://doi.org/10.1007/978-3-030-29436-6_32

  41. Urban, J.: MPTP 0.2: Design, implementation, and initial experiments. J. Autom. Reason. 37(1–2), 21–43 (2006). https://doi.org/10.1007/s10817-006-9032-3

  42. Urban, J., Vyskočil, J., Štěpánek, P.: MaLeCoP machine learning connection prover. In: Brünnler, K., Metcalfe, G. (eds.) TABLEAUX 2011. LNCS (LNAI), vol. 6793, pp. 263–277. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22119-4_21

  43. Voronkov, A.: AVATAR: the architecture for first-order theorem provers. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 696–710. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-08867-9_46

  44. Wang, M., Tang, Y., Wang, J., Deng, J.: Premise selection for theorem proving by deep graph embedding. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA, 4–9 December 2017, pp. 2786–2796 (2017). https://proceedings.neurips.cc/paper/2017/hash/18d10dc6e666eab6de9215ae5b3d54df-Abstract.html

  45. Weidenbach, C., Dimova, D., Fietzke, A., Kumar, R., Suda, M., Wischnewski, P.: SPASS version 3.5. In: Schmidt, R.A. (ed.) CADE 2009. LNCS (LNAI), vol. 5663, pp. 140–145. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-02959-2_10

Acknowledgement

This work was supported by the Czech Science Foundation project 20-06390Y and the project RICAIP no. 857306 under the EU-H2020 programme.

Author information

Correspondence to Martin Suda.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Suda, M. (2021). Vampire with a Brain Is a Good ITP Hammer. In: Konev, B., Reger, G. (eds) Frontiers of Combining Systems. FroCoS 2021. Lecture Notes in Computer Science, vol 12941. Springer, Cham. https://doi.org/10.1007/978-3-030-86205-3_11

  • DOI: https://doi.org/10.1007/978-3-030-86205-3_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86204-6

  • Online ISBN: 978-3-030-86205-3

  • eBook Packages: Computer Science, Computer Science (R0)
