Abstract
One of the first big hurdles that mathematicians encounter when they consider writing formal proofs is the need to become familiar with the formal terminology and the parsing mechanisms used in the large interactive theorem proving (ITP) libraries. This includes the large number of formal symbols, the grammar of the formal languages, and the advanced mechanisms that let proof assistants correctly understand formal expressions in the presence of ubiquitous overloading.
In this work we start to address this problem by developing approximate probabilistic parsing techniques that autonomously train disambiguation on large corpora. Unlike in standard natural language processing, we can filter the resulting parse trees by strong ITP and AR semantic methods such as typechecking and automated theorem proving, and even let the probabilistic methods self-improve based on such semantic feedback. We describe the general motivation and our first experiments, and build an online system for parsing ambiguous formulas over the Flyspeck library.
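The idea of ranking candidate readings by corpus statistics and then filtering them semantically can be illustrated with a minimal sketch. This is not the system described in the paper: the corpus, the constant names (`real_add`, `vector_add`, and so on), the type signatures, and all function names below are hypothetical, and a real system would parse flat token sequences and call an actual proof assistant's typechecker rather than this toy one.

```python
from collections import Counter
from itertools import product

# Hypothetical aligned corpus: (ambiguous surface symbol, formal constant) pairs.
CORPUS = [
    ("+", "real_add"), ("+", "real_add"), ("+", "real_add"),
    ("+", "vector_add"), ("+", "num_add"),
    ("*", "real_mul"), ("*", "real_mul"), ("*", "num_mul"),
]

# Toy signatures (assumed): constant -> (left arg type, right arg type, result type).
SIGNATURES = {
    "real_add": ("real", "real", "real"),
    "vector_add": ("vector", "vector", "vector"),
    "num_add": ("num", "num", "num"),
    "real_mul": ("real", "real", "real"),
    "num_mul": ("num", "num", "num"),
}


def train(corpus):
    """Estimate P(formal constant | surface symbol) by relative frequency."""
    pair_counts = Counter(corpus)
    symbol_counts = Counter(symbol for symbol, _ in corpus)
    return {(s, c): n / symbol_counts[s] for (s, c), n in pair_counts.items()}


def readings(tree, model):
    """Yield (probability, formal tree) for every reading of a surface tree."""
    if isinstance(tree, str):  # a variable has exactly one reading
        yield 1.0, tree
        return
    symbol, left, right = tree
    lefts = list(readings(left, model))
    rights = list(readings(right, model))
    for (sym, const), p in model.items():
        if sym != symbol:
            continue
        for (pl, l), (pr, r) in product(lefts, rights):
            yield p * pl * pr, (const, l, r)


def typecheck(tree, env):
    """Return the type of a formal tree, or None if it is ill-typed."""
    if isinstance(tree, str):
        return env.get(tree)
    const, left, right = tree
    arg1, arg2, result = SIGNATURES[const]
    if typecheck(left, env) == arg1 and typecheck(right, env) == arg2:
        return result
    return None


def disambiguate(tree, env, model):
    """Rank readings by probability, then keep only the well-typed ones."""
    ranked = sorted(readings(tree, model), key=lambda item: -item[0])
    return [(p, t) for p, t in ranked if typecheck(t, env) is not None]


model = train(CORPUS)
env = {"x": "vector", "y": "vector"}  # assumed variable types
print(disambiguate(("+", "x", "y"), env, model))
```

In this toy setting the statistically most likely reading of `+` is `real_add`, but with both variables typed as vectors only the `vector_add` reading survives the type filter, which is the kind of semantic pruning that plain natural language parsing lacks.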
C. Kaliszyk—Supported by the Austrian Science Fund (FWF): P26201.
J. Urban—Supported by NWO grant nr. 612.001.208.
Notes
- 1. Approximate results of an opinion poll run by the second author since 2000.
- 2.
- 3. Exactly, the theorems containing substrings sin, cos and tan.
- 4.
- 5.
- 6.
- 7. The exact list is at http://mizar.cs.ualberta.ca/~mptp/i2f/00proved2.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Kaliszyk, C., Urban, J., Vyskočil, J. (2015). Learning to Parse on Aligned Corpora (Rough Diamond). In: Urban, C., Zhang, X. (eds) Interactive Theorem Proving. ITP 2015. Lecture Notes in Computer Science, vol 9236. Springer, Cham. https://doi.org/10.1007/978-3-319-22102-1_15
DOI: https://doi.org/10.1007/978-3-319-22102-1_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22101-4
Online ISBN: 978-3-319-22102-1
eBook Packages: Computer Science, Computer Science (R0)