CoProver: A Recommender System for Proof Construction

  • Conference paper
Intelligent Computer Mathematics (CICM 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14101)


Interactive Theorem Provers (ITPs) are an indispensable tool for formal methods experts, serving as a platform for constructing and formally verifying proofs. The complexity of the proofs, together with the level of expertise typically required for the process to succeed, can often hinder the adoption of ITPs. A recent line of work has investigated incorporating machine learning models trained on ITP user activity traces as a viable path towards full automation. While this is a valuable line of investigation, many problems still require human supervision to be completed fully, so applying learning methods to assist the user with useful recommendations can prove more fruitful. In this vein of user assistance, we introduce CoProver, a transformer-based proof recommender system capable of learning from past actions during proof construction while exploring knowledge stored in the ITP concerning previous proofs. CoProver employs a neurally learnt, sequence-based encoding of sequents that captures long-distance relationships between terms and hidden cues therein. We couple CoProver with the Prototype Verification System (PVS) and evaluate its performance on two key tasks: (1) Next Proof Action Recommendation, and (2) Relevant Lemma Retrieval given a library of theories. We evaluate CoProver on a series of well-established metrics originating from the recommender system and information retrieval communities, respectively. We show that CoProver outperforms the prior state of the art applied to recommendation in this domain. We conclude by discussing future directions viable for CoProver (and similar approaches), such as argument prediction and proof summarization.
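The abstract does not name the specific metrics used, so the following is only a minimal sketch of two measures commonly found in the recommender system and information retrieval literature: accuracy at k for next-action recommendation and mean reciprocal rank (MRR) for lemma retrieval. The function names, the choice of metrics, and the sample rankings are illustrative assumptions, not the paper's evaluation code or data.

```python
from typing import Iterable, Sequence, Set


def accuracy_at_k(ranked: Sequence[str], gold: str, k: int) -> float:
    """Return 1.0 if the gold item is among the top-k ranked items, else 0.0."""
    return 1.0 if gold in ranked[:k] else 0.0


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is retrieved."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean(values: Iterable[float]) -> float:
    """Arithmetic mean that returns 0.0 for an empty input."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Hypothetical model outputs: (ranked proof commands, command the user issued).
command_queries = [
    (["grind", "skosimp", "expand"], "skosimp"),
    (["assert", "inst", "lemma"], "lemma"),
]

# Hypothetical retrieval outputs: (ranked lemma names, lemmas actually used).
lemma_queries = [
    (["lemma_a", "lemma_b", "lemma_c"], {"lemma_b"}),
    (["lemma_d", "lemma_e"], {"lemma_d"}),
]

acc_at_2 = mean(accuracy_at_k(r, g, 2) for r, g in command_queries)
mrr = mean(reciprocal_rank(r, rel) for r, rel in lemma_queries)
print(f"accuracy@2 = {acc_at_2:.2f}, MRR = {mrr:.2f}")
```

Accuracy at k suits the next-action task because each proof state has a single recorded command, while reciprocal rank rewards placing any useful lemma near the top of a longer candidate list.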


  2. A step is a single forward-predict pass over a training instance; multiple steps can be performed over the same data during the training phase.

  3. Initial experiments with larger samples showed no difference in performance compared with a system trained on the smaller set.


This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR00112290064 and by the National Institute of Aeronautics. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Government or DARPA.

Corresponding author

Correspondence to Eric Yeh.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Yeh, E., Hitaj, B., Owre, S., Quemener, M., Shankar, N. (2023). CoProver: A Recommender System for Proof Construction. In: Dubois, C., Kerber, M. (eds) Intelligent Computer Mathematics. CICM 2023. Lecture Notes in Computer Science, vol 14101. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-42752-7

  • Online ISBN: 978-3-031-42753-4

  • eBook Packages: Computer Science, Computer Science (R0)
