A Unified Construction of the Glushkov, Follow, and Antimirov Automata

  • Cyril Allauzen
  • Mehryar Mohri
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4162)


Over the last few decades, a number of different techniques have been introduced for creating ε-free automata that represent regular expressions, such as Glushkov automata, follow automata, and Antimirov automata. This paper presents a simple and unified view of all these construction methods, for both unweighted and weighted regular expressions. It describes simpler algorithms with time complexities at least as favorable as those of the best previously known techniques, and provides a concise proof of their correctness. Our algorithms are all based on two standard automata operations: ε-removal and minimization. This contrasts with the multitude of complicated and special-purpose techniques previously described in the literature, and makes it straightforward to generalize these algorithms to the weighted case. In particular, we extend the definition and construction of follow automata to weighted regular expressions over a closed semiring and present the first algorithm to compute weighted Antimirov automata.
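To make the role of ε-removal concrete, the following is a minimal sketch in Python of the unweighted operation applied to a small automaton with ε-transitions. It is an illustration under stated assumptions, not the algorithm of the paper: the names `remove_epsilons`, `epsilon_closure`, and `accepts` are hypothetical, the ε-closure method used here is the simple textbook one, and the input automaton for (a|b)*a was built by hand in the Thompson style for this example.

```python
from collections import defaultdict

EPS = None  # label used for ε-transitions (a convention of this sketch)

def epsilon_closure(state, trans):
    """Return the set of states reachable from `state` by ε-transitions only."""
    seen, stack = {state}, [state]
    while stack:
        p = stack.pop()
        for label, q in trans.get(p, ()):
            if label is EPS and q not in seen:
                seen.add(q)
                stack.append(q)
    return seen

def remove_epsilons(start, finals, trans):
    """Return (start, finals, trans) of an equivalent ε-free automaton.

    Only states reachable from `start` in the result are kept.
    """
    new_trans = defaultdict(set)
    new_finals = set()
    todo, reached = [start], {start}
    while todo:
        p = todo.pop()
        closure = epsilon_closure(p, trans)
        if closure & finals:
            new_finals.add(p)  # p reaches a final state through ε-paths
        for r in closure:
            for label, q in trans.get(r, ()):
                if label is not EPS:
                    new_trans[p].add((label, q))
                    if q not in reached:
                        reached.add(q)
                        todo.append(q)
    return start, new_finals, dict(new_trans)

def accepts(start, finals, trans, word):
    """Simulate an ε-free automaton on `word`."""
    current = {start}
    for symbol in word:
        current = {q for p in current
                   for label, q in trans.get(p, ()) if label == symbol}
    return bool(current & finals)

# Hand-built Thompson-style automaton for (a|b)*a: state 0 is initial, 8 is final.
thompson = {
    0: [(EPS, 1), (EPS, 7)],   # enter the star or skip it
    1: [(EPS, 2), (EPS, 4)],   # choose the a-branch or the b-branch
    2: [("a", 3)],
    4: [("b", 5)],
    3: [(EPS, 6)],
    5: [(EPS, 6)],
    6: [(EPS, 1), (EPS, 7)],   # loop back or leave the star
    7: [("a", 8)],
}

start, finals, eps_free = remove_epsilons(0, {8}, thompson)
```

In this particular example the ε-free result keeps only the initial state and the three states entered by a symbol transition (3, 5, and 8), that is, one state per symbol occurrence in the expression plus the initial state.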


Keywords: Regular Expression; Weighted Case; Empty String; Unweighted Case; Alphabet Symbol
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Cyril Allauzen (1)
  • Mehryar Mohri (1)
  1. Courant Institute of Mathematical Sciences, New York, USA
