A Promising Path Towards Autoformalization and General Artificial Intelligence

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12236)

Abstract

An autoformalization system is an AI that learns to read natural language content and to turn it into an abstract, machine-verifiable formalization, ideally by bootstrapping from unlabeled training data with minimal human interaction. This is a difficult task in general, one that would require strong automated reasoning and automated natural language processing capabilities. In this paper, it is argued that autoformalization is a promising path for systems to learn sophisticated, general-purpose reasoning in all domains of mathematics and computer science. This could have far-reaching implications not just for mathematical research, but also for software synthesis. Here I outline a realistic path towards those goals and survey recent results that support the feasibility of this direction.
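
The input-output contract described in the abstract can be made concrete with a small illustration: an informal mathematical sentence on one side, and a machine-verifiable formal statement with a checkable proof on the other. The sketch below is not taken from the paper; the theorem name, the phrasing of the statement, and the choice of Lean 4 as the target proof assistant (with its built-in omega tactic, available in recent toolchains) are assumptions made here purely for the sake of a runnable example.

  -- Informal input: "The sum of two even integers is even."
  -- One machine-verifiable formalization an autoformalization system might emit
  -- (illustrative only; names and formalization style are hypothetical):
  theorem sum_of_evens_is_even (m n : Int)
      (hm : ∃ a, m = 2 * a) (hn : ∃ b, n = 2 * b) :
      ∃ c, m + n = 2 * c := by
    cases hm with
    | intro a ha =>
      cases hn with
      | intro b hb =>
        -- Choose the witness c := a + b; the remaining goal
        -- m + n = 2 * (a + b) is linear integer arithmetic.
        exact ⟨a + b, by omega⟩

What makes such an output "machine verifiable" in the sense of the abstract is that the proof assistant checks the formal statement and its proof mechanically, with no human in the loop.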


Acknowledgements

My warmest thanks go to my close collaborators and colleagues Sarah M. Loos, Markus N. Rabe, Kshitij Bansal, Francois Chollet, Alex Alemi, Stewart Wilcox, Niklas Een, Geoffrey Irving, Victor Toman and Aditya Paliwal for their contributions towards the goals sketched here. I am also indebted to Josef Urban and Cezary Kaliszyk for their pioneering work, for selflessly sharing their vision and expertise, and for their collaboration in this area. I am also thankful to Ilya Sutskever, Henryk Michalewski, Daniel Huang, Quoc Le, Dániel Varga, Zsolt Zombori and Adrián Csiszárik for their feedback and valuable discussions on this topic. I would like to thank Jay Yagnik, Rahul Sukthankar, Ashok Popat, Rif Saurous, Jeff Dean and Geoffrey Hinton for their support of deep learning based reasoning work at Google. I am grateful to Péter Szoldán, Christoph Benzmüller and Bruce Miller for proofreading the manuscript.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

Christian Szegedy, Google Research, Mountain View, USA
