DynGenPar – A Dynamic Generalized Parser for Common Mathematical Language

  • Kevin Kofler
  • Arnold Neumaier
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7362)

Abstract

This paper introduces a dynamic generalized parser aimed primarily at common natural mathematical language. Our algorithm combines the efficiency of GLR parsing, the dynamic extensibility of tableless approaches, and the expressiveness of extended context-free grammars such as parallel multiple context-free grammars (PMCFGs). In particular, it supports efficient dynamic addition of rules to the grammar at any moment. The algorithm is fully incremental: parsing can resume with additional tokens without restarting the parse process, and the parser can predict possible next tokens. Additionally, we handle constraints on the token following a rule. This allows for grammatically correct English indefinite articles when working with word tokens, and it can represent typical operations of scannerless parsing, such as maximal matches, when working with character tokens. Our long-term goal is to computerize a large library of existing mathematical knowledge using the new parser, starting from natural language input as found in textbooks or in the papers collected by the digital mathematical library (DML) projects around the world. In this paper, we present the algorithmic ideas behind our approach, give a short overview of the implementation, and present some efficiency results. The new parser is available at http://www.tigen.org/kevin.kofler/fmathl/dyngenpar/.
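
To illustrate what a constraint on the token following a rule might look like, here is a minimal Python sketch. It is not DynGenPar's API: the names `RULES`, `article_allowed`, and `identifier_may_end` are hypothetical, and the vowel test is a deliberately crude letter-based approximation that gets words such as "hour" or "unit" wrong.

```python
# Illustrative sketch only; none of these names come from DynGenPar.

VOWELS = "aeiou"


def starts_with_vowel(token):
    """Crude letter-based test (assumption: spelling approximates sound)."""
    return token[:1].lower() in VOWELS


# Hypothetical rule format: (category, right-hand side, next-token constraint).
# The constraint receives the token that follows the rule, or None at end of input.
RULES = [
    ("Article", ["a"], lambda nxt: nxt is not None and not starts_with_vowel(nxt)),
    ("Article", ["an"], lambda nxt: nxt is not None and starts_with_vowel(nxt)),
]


def article_allowed(tokens, i):
    """Word tokens: accept the article at position i only if some rule
    matches it and that rule's next-token constraint holds."""
    nxt = tokens[i + 1] if i + 1 < len(tokens) else None
    return any(rhs == [tokens[i]] and ok(nxt) for _, rhs, ok in RULES)


def identifier_may_end(chars, end):
    """Character tokens: an identifier match ending just before index `end`
    is accepted only if the next character cannot extend it, which
    enforces a maximal match."""
    nxt = chars[end] if end < len(chars) else None
    return nxt is None or not nxt.isalnum()
```

With word tokens, `article_allowed(["an", "integer"], 0)` holds while `article_allowed(["a", "integer"], 0)` does not. With character tokens, `identifier_may_end("foo+1", 3)` holds, but `identifier_may_end("foo+1", 2)` does not, because the following `o` would extend the match.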

Keywords

dynamic generalized parser, dynamic parser, tableless parser, scannerless parser, parser, parallel multiple context-free grammars, common mathematical language, natural mathematical language, controlled natural language, mathematical knowledge management, formalized mathematics, digital mathematical library

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Kevin Kofler (1)
  • Arnold Neumaier (1)
  1. Faculty of Mathematics, University of Vienna, Wien, Austria