Advertisement

Simple, Functional, Sound and Complete Parsing for All Context-Free Grammars

  • Tom Ridge
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7086)

Abstract

Parsers for context-free grammars can be implemented directly and naturally in a functional style known as “combinator parsing”, using recursion following the structure of the grammar rules. However, naive implementations fail to terminate on left-recursive grammars, and despite extensive research the only complete parsers for general context-free grammars are constructed using other techniques such as Earley parsing. Our main contribution is to show how to construct simple, sound and complete parser implementations directly from grammar specifications, for all context-free grammars, based on combinator parsing. We then construct a generic parser generator and show that generated parsers are sound and complete. The formal proofs are mechanized using the HOL4 theorem prover. Memoized parsers based on our approach are polynomial-time in the size of the input. Preliminary real-world performance testing on highly ambiguous grammars indicates our parsers are faster than those generated by the popular Happy parser generator.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Happy, a parser generator for Haskell, http://www.haskell.org/happy/
  2. 2.
    Barthwal, A., Norrish, M.: Verified, Executable Parsing. In: Castagna, G. (ed.) ESOP 2009. LNCS, vol. 5502, pp. 160–174. Springer, Heidelberg (2009)CrossRefGoogle Scholar
  3. 3.
    Danielsson, N.A.: Total parser combinators. In: Hudak, P., Weirich, S. (eds.) ICFP, pp. 285–296. ACM (2010)Google Scholar
  4. 4.
    Earley, J.: An efficient context-free parsing algorithm. Commun. ACM 13(2), 94–102 (1970)CrossRefzbMATHGoogle Scholar
  5. 5.
    Ford, B.: Packrat parsing: simple, powerful, lazy, linear time, functional pearl. In: ICFP 2002: Proceedings of the Seventh ACM SIGPLAN International Conference on Functional Programming, New York, NY, USA, vol. 37/9, pp. 36–47. ACM (2002)Google Scholar
  6. 6.
    Frost, R.A., Hafiz, R., Callaghan, P.: Parser Combinators for Ambiguous Left-Recursive Grammars. In: Hudak, P., Warren, D.S. (eds.) PADL 2008. LNCS, vol. 4902, pp. 167–181. Springer, Heidelberg (2008)CrossRefGoogle Scholar
  7. 7.
    Frost, R.A., Hafiz, R., Callaghan, P.C.: Modular and efficient top-down parsing for ambiguous left-recursive grammars. In: IWPT 2007: Proceedings of the 10th International Conference on Parsing Technologies, Morristown, NJ, USA, pp. 109–120. Association for Computational Linguistics (2007)Google Scholar
  8. 8.
    Hafiz, R., Frost, R.A.: Lazy Combinators for Executable Specifications of General Attribute Grammars. In: Carro, M., Peña, R. (eds.) PADL 2010. LNCS, vol. 5937, pp. 167–182. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  9. 9.
    Hutton, G.: Higher-order functions for parsing. J. Funct. Program. 2(3), 323–343 (1992)MathSciNetCrossRefzbMATHGoogle Scholar
  10. 10.
    Kasami, T.: An efficient recognition and syntax analysis algorithm for context-free languages. Technical Report AFCRL-65-758, Air Force Cambridge Research Laboratory, Bedford, Massachusetts (1965)Google Scholar
  11. 11.
    Koprowski, A., Binsztok, H.: TRX: A Formally Verified Parser Interpreter. In: Gordon, A.D. (ed.) ESOP 2010. LNCS, vol. 6012, pp. 345–365. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  12. 12.
    Kuno, S.: The predictive analyzer and a path elimination technique. Commun. ACM 8(7), 453–462 (1965)CrossRefzbMATHGoogle Scholar
  13. 13.
    Leroy, X.: Formal verification of a realistic compiler. Communications of the ACM (April 2009)Google Scholar
  14. 14.
    Pratt, V.R.: Top down operator precedence. In: Proceedings ACM Symposium on Principles Prog. Languages (1973)Google Scholar
  15. 15.
    Ridge, T.: Simple, functional, sound and complete parsing for all context-free grammars (2010) unpublished draft, http://www.cs.le.ac.uk/~tr61
  16. 16.
    Tomita, M.: Efficient Parsing for Natural Language: A Fast Algorithm for Practical Systems. Kluwer, Boston (1986)CrossRefGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Tom Ridge
    • 1
  1. 1.University of LeicesterUK

Personalised recommendations