First Neural Conjecturing Datasets and Experiments

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12236)


We describe several datasets and first experiments with generating conjectures by neural methods. The datasets are based on the Mizar Mathematical Library, processed in several forms, and on the problems extracted from it by the MPTP system and proved by the E prover using ENIGMA guidance. The conjecturing experiments use the Transformer architecture, in particular its GPT-2 implementation.



Funded by the AI4REASON ERC Consolidator grant nr. 649043 and by the Czech project AI&Reasoning CZ.02.1.01/0.0/0.0/15_003/0000466 and the European Regional Development Fund. We thank K. Chvalovský and T. Gauthier for discussions.




Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Czech Institute of Informatics, Robotics and Cybernetics, Prague, Czech Republic
