PAPAGENO: A Parallel Parser Generator for Operator Precedence Grammars

  • Conference paper
Software Language Engineering (SLE 2012)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 7745)

Abstract

In almost all language processing applications, languages are parsed employing classical algorithms (such as the LR(1) parsers generated by Bison), which are sequential due to their left-to-right state-dependent nature. Although early theoretical studies on parallel parsing algorithms delineated potential speedups on abstract parallel machines using a data-parallel approach, practical developments have not materialized, except in recent experiments on ad hoc parsers for large XML files. We describe a general-purpose practical generator (PAPAGENO) able to produce efficient deterministic parallel parsers, which exhibit significant speedups when parsing large texts on modern multi-core machines, while not penalizing sequential operation. The generated parser relies on the properties of Floyd’s operator precedence grammars to provide a naturally parallel implementation of the parsing process. Parsing of each text portion proceeds in parallel and independently, without communication and synchronization, until all partial parse stacks are recombined into the final result. Since Floyd’s grammars can express most syntaxes with little adaptation, we have performed extensive experiments on both synthetically generated texts and real JSON documents. The effective parallel code portion in the generated parsers exceeds 80% for most of the tested scenarios.
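
The chunk-and-recombine scheme described in the abstract can be illustrated with a small sketch. This is not PAPAGENO's generated code. The toy arithmetic grammar, the hand-written precedence table, the splitting at token level rather than text level, and every name used here (`REL`, `op_parse`, `handle_start`, `parallel_eval`) are assumptions made for illustration only. Each chunk is parsed with an unknown left context (`?`); a reduction is performed only when its handle is fully delimited inside the chunk, and is otherwise postponed. The partial stacks are then concatenated and parsed once more with the real end markers.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Precedence relations for the toy grammar
#   E -> E + T | T,  T -> T * F | F,  F -> n | ( E )
# '<' yields, '=' equal in precedence, '>' takes; '#' delimits the whole input.
REL = {}

def _rel(lefts, rights, relation):
    for a in lefts:
        for b in rights:
            REL[a, b] = relation

_rel("n)", "+*)#", ">")
_rel("+", "+)#", ">")
_rel("+", "*n(", "<")
_rel("*", "+*)#", ">")
_rel("*", "n(", "<")
_rel("(#", "+*n(", "<")
_rel("(", ")", "=")
_rel("#", "#", "=")

def tokenize(text):
    """Turn text into (kind, value) symbols; other characters are skipped."""
    return [("n", int(t)) if t.isdigit() else (t, None)
            for t in re.findall(r"\d+|[+*()]", text)]

def reduce_handle(handle):
    kinds = [kind for kind, _ in handle]
    if kinds == ["n"]:
        return ("N", handle[0][1])
    if kinds == ["N", "+", "N"]:
        return ("N", handle[0][1] + handle[2][1])
    if kinds == ["N", "*", "N"]:
        return ("N", handle[0][1] * handle[2][1])
    if kinds == ["(", "N", ")"]:
        return ("N", handle[1][1])
    raise SyntaxError(f"no rule matches {kinds}")

def handle_start(stack):
    """Index where the topmost handle begins, or None if its left end is not
    delimited by a '<' relation inside this stack (unknown left context)."""
    i = len(stack) - 1
    while stack[i][0] == "N":
        i -= 1
    while True:
        j = i - 1
        while stack[j][0] == "N":
            j -= 1
        left, right = stack[j][0], stack[i][0]
        relation = None if left == "?" else REL.get((left, right))
        if relation == "<":
            return j + 1
        if relation != "=":
            return None
        i = j

def op_parse(symbols, bottom):
    """Shift-reduce loop. bottom='?' marks an unknown left context (a chunk),
    bottom='#' the true start of the input."""
    stack = [(bottom, None)]
    for sym in symbols:
        if sym[0] == "N":            # already-reduced nonterminal: just push it
            stack.append(sym)
            continue
        while True:
            top = next(kind for kind, _ in reversed(stack) if kind != "N")
            relation = "<" if top == "?" else REL.get((top, sym[0]))
            if relation is None:
                raise SyntaxError(f"{top!r} cannot precede {sym[0]!r}")
            start = handle_start(stack) if relation == ">" else None
            if start is None:        # shift, or postpone an undelimited reduction
                stack.append(sym)
                break
            stack[start:] = [reduce_handle(stack[start:])]
    return stack[1:]

def parallel_eval(text, workers=4):
    symbols = tokenize(text)
    size = max(1, -(-len(symbols) // workers))            # ceiling division
    chunks = [symbols[i:i + size] for i in range(0, len(symbols), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(lambda chunk: op_parse(chunk, "?"), chunks))
    merged = [sym for stack in partial for sym in stack]  # recombine partial stacks
    result = op_parse(merged + [("#", None)], "#")
    if len(result) != 2 or result[0][0] != "N":
        raise SyntaxError("input is not a complete expression")
    return result[0][1]
```

For example, `parallel_eval("1 + 2 * 3 + 4 * (5 + 6) * 2", workers=3)` evaluates to 95 regardless of where the chunk boundaries fall. The thread pool here only demonstrates that chunks share no state; in CPython the global interpreter lock prevents a real speedup, so a measurement of the kind reported in the paper would need processes or a compiled implementation.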

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Barenghi, A., Viviani, E., Crespi Reghizzi, S., Mandrioli, D., Pradella, M. (2013). PAPAGENO: A Parallel Parser Generator for Operator Precedence Grammars. In: Czarnecki, K., Hedin, G. (eds) Software Language Engineering. SLE 2012. Lecture Notes in Computer Science, vol 7745. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36089-3_15

  • DOI: https://doi.org/10.1007/978-3-642-36089-3_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36088-6

  • Online ISBN: 978-3-642-36089-3

  • eBook Packages: Computer Science, Computer Science (R0)
