Bounded Seas: Island Parsing Without Shipwrecks
  • Jan Kurš
  • Mircea Lungu
  • Oscar Nierstrasz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8706)

Abstract

Imprecise manipulation of source code (semi-parsing) is useful for tasks such as robust parsing, error recovery, lexical analysis, and rapid development of parsers for data extraction. An island grammar precisely defines only a subset of a language syntax (islands), while the rest of the syntax (water) is defined imprecisely.

Usually, water is defined as the negation of islands. Although simple, this definition is naive and impedes the composition of islands. When developing an island grammar, a programmer sooner or later has to write water tailored to each individual island. This approach is fragile, because the water may have to change whenever the grammar changes. It is time-consuming, because the water is written by hand rather than derived automatically. Finally, an island surrounded by hand-written water cannot be reused, because the water has to be redefined for every grammar.
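
The composition problem described above can be illustrated with a minimal sketch. It is not taken from the paper: the helper names (`island`, `naive_sea`, `optional`, `seq`) and the toy "class ... method ..." input are assumptions made only for illustration. Water is implemented as "skip anything that is not the island", and an optional method island inside one class ends up capturing a method that belongs to the next class.

```python
import re

def island(pattern):
    """Parser that matches `pattern` exactly at position i, else None."""
    rx = re.compile(pattern)
    def parse(text, i):
        m = rx.match(text, i)
        return (m.group(0), m.end()) if m else None
    return parse

def naive_sea(isl):
    """Water as 'not the island': skip characters until the island matches."""
    def parse(text, i):
        for j in range(i, len(text) + 1):
            found = isl(text, j)
            if found is not None:
                return found
        return None
    return parse

def optional(p):
    """Succeed with None (consuming nothing) when p fails."""
    def parse(text, i):
        r = p(text, i)
        return r if r is not None else (None, i)
    return parse

def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parse(text, i):
        values = []
        for p in parsers:
            r = p(text, i)
            if r is None:
                return None
            value, i = r
            values.append(value)
        return values, i
    return parse

# Hypothetical toy grammar: a class header followed by an optional method.
class_decl = seq(naive_sea(island(r"class \w+")),
                 optional(naive_sea(island(r"method \w+"))))

source = "class A end class B method foo end"
result = class_decl(source, 0)
print(result)  # (['class A', 'method foo'], 30)
```

Class A has no method, yet the naive water runs past the header of class B and attributes `method foo` to A. Avoiding this requires water written specifically for this grammar, which is the fragility the paragraph above describes.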

In this paper we propose a new technique for island parsing: bounded seas. Bounded seas are composable, robust, reusable and easy to use because island-specific water is created automatically. We integrated bounded seas into a parser combinator framework to demonstrate their composability and reusability.
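
As a rough sketch of the idea, and not the authors' implementation, a sea can be given a boundary: water stops either at the island or at whatever may legitimately follow the sea. The abstract says this island-specific water is created automatically. In the simplified Python below the boundary is passed by hand, and all names (`bounded_sea`, `eof`, the toy input) are assumptions for illustration.

```python
import re

def island(pattern):
    """Parser that matches `pattern` exactly at position i, else None."""
    rx = re.compile(pattern)
    def parse(text, i):
        m = rx.match(text, i)
        return (m.group(0), m.end()) if m else None
    return parse

def bounded_sea(isl, boundary):
    """Skip water until the island matches; give up once the boundary matches."""
    def parse(text, i):
        for j in range(i, len(text) + 1):
            found = isl(text, j)
            if found is not None:
                return found          # island reached
            if boundary(text, j) is not None:
                return None           # boundary reached first: no island here
        return None
    return parse

def optional(p):
    def parse(text, i):
        r = p(text, i)
        return r if r is not None else (None, i)
    return parse

def seq(*parsers):
    def parse(text, i):
        values = []
        for p in parsers:
            r = p(text, i)
            if r is None:
                return None
            value, i = r
            values.append(value)
        return values, i
    return parse

def many(p):
    """Apply p repeatedly until it fails or stops consuming input."""
    def parse(text, i):
        values = []
        while True:
            r = p(text, i)
            if r is None or r[1] == i:
                return values, i
            values.append(r[0])
            i = r[1]
    return parse

def eof(text, i):
    """Boundary that matches only at the end of the input."""
    return ("", i) if i == len(text) else None

class_header = island(r"class \w+")
method = island(r"method \w+")

# The method sea is bounded by the next class header (hand-chosen here).
class_decl = seq(bounded_sea(class_header, eof),
                 optional(bounded_sea(method, class_header)))
program = many(class_decl)

source = "class A end class B method foo end"
result = program(source, 0)
print(result)  # ([['class A', None], ['class B', 'method foo']], 30)
```

With the boundary in place, the search for a method inside class A stops at the header of class B, so `method foo` is attributed to B only. In a parser combinator framework the boundary could be derived from the parsers that follow the sea instead of being supplied explicitly, which is what would make the island reusable across grammars.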

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Jan Kurš (1)
  • Mircea Lungu (1)
  • Oscar Nierstrasz (1)
  1. Software Composition Group, University of Bern, Switzerland
