From Ambiguous Regular Expressions to Deterministic Parsing Automata

  • Angelo Borsotti
  • Luca Breveglieri
  • Stefano Crespi Reghizzi
  • Angelo Morzenti
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9223)

Abstract

We present a new parser generator for ambiguous regular expressions (REs). It formally extends the Berry-Sethi (BS) algorithm into a finite-state device that specifies the syntax tree(s) of the input. To this end, we extend the local testability property of marked REs from terminal strings to linearized syntax trees. The generator supports disambiguation, i.e., selecting a preferred tree when the RE is ambiguous; the selection is parametric with respect to the Greedy or POSIX criterion. The parser is proved correct and runs in linear time. The generator is available as an interactive software tool on GitHub (see http://github.com/breveglieri/ebs/README).
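For readers unfamiliar with the starting point, the following is a minimal sketch of the classical Berry-Sethi construction only, not of the extended generator described in the abstract. It marks each symbol occurrence with a position, computes the first and follow sets, and builds a deterministic recognizer whose states are sets of positions. The tuple-based RE encoding, the function names (`mark`, `analyse`, `berry_sethi`, `accepts`), and the end marker `#` are assumptions made for this illustration.

```python
from itertools import count

END = "#"  # assumed end marker; must not occur in the RE alphabet

def mark(e, counter, symbols):
    """Replace every symbol occurrence by a numbered position (the marked RE)."""
    if e[0] == "sym":
        pos = next(counter)
        symbols[pos] = e[1]
        return ("pos", pos)
    if e[0] == "eps":
        return e
    return (e[0],) + tuple(mark(sub, counter, symbols) for sub in e[1:])

def analyse(e, follow):
    """Return (nullable, first, last) of a marked RE and fill in `follow`."""
    kind = e[0]
    if kind == "eps":
        return True, set(), set()
    if kind == "pos":
        follow.setdefault(e[1], set())
        return False, {e[1]}, {e[1]}
    if kind == "star":
        _, first, last = analyse(e[1], follow)
        for p in last:                      # iteration: last positions loop back
            follow[p] |= first
        return True, first, last
    n1, f1, l1 = analyse(e[1], follow)
    n2, f2, l2 = analyse(e[2], follow)
    if kind == "alt":
        return n1 or n2, f1 | f2, l1 | l2
    if kind == "cat":
        for p in l1:                        # concatenation: left ends feed right starts
            follow[p] |= f2
        return n1 and n2, ((f1 | f2) if n1 else f1), ((l1 | l2) if n2 else l2)
    raise ValueError(f"unknown node kind: {kind}")

def berry_sethi(e):
    """Build a DFA (start, delta, finals) whose states are sets of positions."""
    symbols, follow = {}, {}
    marked = mark(("cat", e, ("sym", END)), count(1), symbols)
    _, first, _ = analyse(marked, follow)
    end = max(symbols)                      # the end marker gets the last position
    start = frozenset(first)
    delta, seen, todo = {}, {start}, [start]
    while todo:
        state = todo.pop()
        for a in {symbols[p] for p in state} - {END}:
            target = frozenset().union(
                *(follow[p] for p in state if symbols[p] == a))
            delta[state, a] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    finals = {s for s in seen if end in s}
    return start, delta, finals

def accepts(dfa, text):
    """Run the DFA on `text`; one table lookup per input character."""
    state, delta, finals = dfa
    for ch in text:
        state = delta.get((state, ch))
        if state is None:
            return False
    return state in finals

# Ambiguous RE (a | ab)(b | eps): the string "ab" has two syntax trees.
a, b = ("sym", "a"), ("sym", "b")
ambiguous = ("cat", ("alt", a, ("cat", a, b)), ("alt", b, ("eps",)))
dfa = berry_sethi(ambiguous)
print([w for w in ["", "a", "ab", "abb", "abbb", "b"] if accepts(dfa, w)])
# prints ['a', 'ab', 'abb']
```

In this sketch the recognizer accepts "ab" but does not say whether it was parsed as a followed by b, or as ab followed by the empty string. That missing information, the syntax tree and the choice among several trees, is what the abstract says the extended construction adds.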

Keywords

Regular expression, RE, Syntax tree, Berry-Sethi, Ambiguity, Parsing

References

  1. Allauzen, C., Mohri, M.: A unified construction of the Glushkov, follow, and Antimirov automata. In: Královič, R., Urzyczyn, P. (eds.) MFCS 2006. LNCS, vol. 4162, pp. 110–121. Springer, Heidelberg (2006)
  2. Berry, G., Sethi, R.: From regular expressions to deterministic automata. Theor. Comput. Sci. 48(1), 117–126 (1986)
  3. Berstel, J., Pin, J.E.: Local languages and the Berry-Sethi algorithm. Theor. Comput. Sci. 155(2), 439–446 (1996)
  4. Book, R., Even, S., Greibach, S., Ott, G.: Ambiguity in graphs and expressions. IEEE Trans. on Comp. C-20(2), 149–153 (1971)
  5. Breveglieri, L., Crespi Reghizzi, S., Morzenti, A.: Shift-reduce parsers for transition networks. In: Dediu, A.-H., Martín-Vide, C., Sierra-Rodríguez, J.-L., Truthe, B. (eds.) LATA 2014. LNCS, vol. 8370, pp. 222–235. Springer, Heidelberg (2014)
  6. Dubé, D., Feeley, M.: Efficiently building a parse tree from a regular expression. Acta Inf. 37(2), 121–144 (2000)
  7. Frisch, A., Cardelli, L.: Greedy regular expression matching. In: Díaz, J., Karhumäki, J., Lepistö, A., Sannella, D. (eds.) ICALP 2004. LNCS, vol. 3142, pp. 618–629. Springer, Heidelberg (2004)
  8. IEEE: std. 1003.2, POSIX, regular expression notation, section 2.8 (1992)
  9. Okui, S., Suzuki, T.: Disambiguation in regular expression matching via position automata with augmented transitions. In: Domaratzki, M., Salomaa, K. (eds.) CIAA 2010. LNCS, vol. 6482, pp. 231–240. Springer, Heidelberg (2011)
  10. Sakarovitch, J.: Elements of Automata Theory. Cambridge University Press, New York (2009)
  11. Sulzmann, M., Lu, K.Z.M.: POSIX regular expression parsing with derivatives. In: Codish, M., Sumii, E. (eds.) FLOPS 2014. LNCS, vol. 8475, pp. 203–220. Springer, Heidelberg (2014)

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Angelo Borsotti (1)
  • Luca Breveglieri (1)
  • Stefano Crespi Reghizzi (2)
  • Angelo Morzenti (1)

  1. Dip. di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, Milano, Italy
  2. Dip. di Elettronica, Informazione e Bioingegneria (DEIB), CNR-IEIIT, Politecnico di Milano, Milano, Italy
