From Ambiguous Regular Expressions to Deterministic Parsing Automata

  • Angelo Borsotti
  • Luca Breveglieri
  • Stefano Crespi Reghizzi
  • Angelo Morzenti
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9223)


This new parser generator for ambiguous regular expressions (REs) formally extends the Berry-Sethi (BS) algorithm into a finite-state device that specifies the syntax tree(s). We extend the local testability property of marked REs from terminal strings to linearized syntax trees. The generator supports disambiguation, i.e., selecting a preferred tree in case of ambiguity. The selection is parametric with respect to the Greedy or POSIX criterion. The parser is proved correct and runs in linear time. The generator is available as an interactive software tool (on GitHub - see
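
To illustrate what ambiguity and disambiguation mean here, the following minimal Python sketch enumerates all parses of one input under a small ambiguous RE and then picks one parse under two different preferences. It is not the generator described above and does not use the Berry-Sethi construction. The RE `(a|ab)(c|bcd)(d*)`, the input `abcd`, and all function names are assumptions chosen for illustration. The "longest leftmost" rule is only a simplified stand-in for the POSIX criterion, and the brute-force enumeration is quadratic, not linear-time.

```python
import re

# The RE (a|ab)(c|bcd)(d*) seen as a concatenation of three factors.
# Both the RE and the factor list are hypothetical illustrations.
FACTORS = ["a|ab", "c|bcd", "d*"]


def all_parses(text):
    """Return every split of `text` into three pieces, one per factor."""
    parses = []
    for i in range(len(text) + 1):
        for j in range(i, len(text) + 1):
            pieces = (text[:i], text[i:j], text[j:])
            # A split is a parse only if each piece matches its factor entirely.
            if all(re.fullmatch(f, s) for f, s in zip(FACTORS, pieces)):
                parses.append(pieces)
    return parses


def first_alternative_parse(text):
    """Parse found by Python's backtracking engine, which tries alternatives left to right."""
    match = re.fullmatch("(a|ab)(c|bcd)(d*)", text)
    return match.groups() if match else None


def longest_leftmost_parse(text):
    """Simplified POSIX-like choice: earlier factors take the longest possible piece."""
    parses = all_parses(text)
    # Lists of piece lengths compare lexicographically, so factor 1 is maximized first.
    return max(parses, key=lambda p: [len(s) for s in p]) if parses else None
```

For the input `abcd` there are two parses, `('a', 'bcd', '')` and `('ab', 'c', 'd')`. `first_alternative_parse` returns the former, which is in the spirit of a Greedy preference for the first alternative, while `longest_leftmost_parse` returns the latter.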


Regular expression, RE, Syntax tree, Berry-Sethi, Ambiguity, Parsing



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Angelo Borsotti (1)
  • Luca Breveglieri (1)
  • Stefano Crespi Reghizzi (2)
  • Angelo Morzenti (1)
  1. Dip. di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, Milano, Italy
  2. Dip. di Elettronica, Informazione e Bioingegneria (DEIB), CNR-IEIIT, Politecnico di Milano, Milano, Italy
