Finite Automata as Regular Language Recognizers

Formal Languages and Compilation

Abstract

We introduce finite automata, discuss their properties, and present their role as recognizers of regular languages, in particular at the lexical level of compilation. After an overview of the abstract Turing machine model, we focus on finite-state devices. We define the notions of state accessibility, determinism, spontaneous move, and ambiguity, and we analyze the relations between finite automata and grammars. We present the basic constructions for cleaning, determinizing, and minimizing automata. We then describe the BMC method for deriving regular expressions from automata. Conversely, we present the methods of Thompson and of Berry–Sethi for obtaining finite recognizers from regular expressions. The latter method is based on the concept of locally testable language and is also used for eliminating nondeterminism. For ambiguous regular expressions, we present a new algorithm that extends Berry–Sethi to perform parsing and string matching. Finally, exploiting their relation to finite automata, we introduce regular expressions extended with the operations of complement and intersection. A synopsis of the relations between regular expressions, grammars, and automata ends the chapter.

Notes

  1. Many books cover the subject, e.g., Bovet and Crescenzi [1], Floyd and Beigel [2], Hopcroft and Ullman [3], Kozen [4], McNaughton [5] and Rich [6].

  2. Other books have a broader and deeper coverage of automata theory, such as Salomaa [7], Hopcroft and Ullman [3, 8], Harrison [9], Shallit [10] and the handbook [11]; for finite automata, a specific reference is Sakarovitch [12].

  3. Subtler and more efficient algorithms have also been invented. We refer the reader to the survey in [13].

  4. Other, more efficient equivalence tests do not require minimizing the automaton first.

  5. If the machine has multiple final states, multiple initial states result, which causes another form of nondeterminism, to be dealt with later.

  6. Although the copy-rule elimination algorithm on p. 67 does not handle empty rules, for right-linear grammars it still works when the grammar contains empty rules.

  7. If an NFA is represented by the equivalent left-linear grammar, the powerset algorithm above behaves exactly like the construction that transforms a context-free grammar into invertible form by eliminating the repeated right parts of rules, which was presented in Chap. 2 on p. 70.

  8. If the given automaton \(N\) has multiple initial states, the initial state of machine \(M'\) is the set of all of them.

  9. Originally presented in [14]. It forms the basis of the popular tool lex (or GNU flex) used for building scanners.

  10. We follow the conceptual path of Berstel and Pin [16].

  11. In Chap. 5, local languages and automata are also used to model the control-flow graph of a program.

  12. We refer to [16] for a formal proof.

  13. In Sect. 2.3.2.1 we defined a numbered r.e. in the same way, for the purpose of formulating a rough criterion for checking whether an r.e. is ambiguous.

  14. This is an example of transliteration (homomorphism) as defined on p. 97.

  15. For a thorough justification of the method we refer to [15].

  16. A rigorous discussion of problematic r.e. and ambiguity is in Frisch and Cardelli [17].

  17. The transducer model will be discussed at length in Chap. 5.

  18. For the star-free family, we allow in an r.e. the empty set \(\emptyset\) as a new metasymbol. This is necessary, e.g., to define the universal language as \(\lnot\,\emptyset\) (complement of \(\emptyset\)), since we may not use the Kleene star expression \(\varSigma^*\).

  19. For the theory of star-free languages we refer to McNaughton and Papert [18].
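
The powerset construction mentioned in the notes on determinization, including the remark that multiple initial states are merged into a single initial state of the deterministic machine, can be illustrated with a minimal Python sketch. It is not the chapter's own formulation. It assumes an NFA without spontaneous moves, encoded as a dictionary from (state, symbol) pairs to sets of next states, and the names `determinize` and `accepts` are hypothetical.

```python
from collections import deque


def determinize(alphabet, delta, initial_states, final_states):
    """Powerset construction for an NFA without spontaneous moves.

    delta maps (state, symbol) to a set of next states (assumed encoding).
    Returns the initial subset, the deterministic transition map,
    and the set of final subsets.
    """
    finals = set(final_states)
    # All initial states of the NFA together form the single initial state.
    start = frozenset(initial_states)
    dfa_delta = {}
    seen = {start}
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        for symbol in alphabet:
            target = frozenset(
                q for p in subset for q in delta.get((p, symbol), ())
            )
            if not target:
                continue  # leave the move undefined rather than add a sink state
            dfa_delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset is final if it contains at least one final state of the NFA.
    dfa_finals = {s for s in seen if s & finals}
    return start, dfa_delta, dfa_finals


def accepts(start, dfa_delta, dfa_finals, word):
    """Run the deterministic machine on word; undefined moves reject."""
    state = start
    for symbol in word:
        state = dfa_delta.get((state, symbol))
        if state is None:
            return False
    return state in dfa_finals
```

For instance, an NFA with initial states 0 and 2 can be passed as `determinize("ab", delta, {0, 2}, finals)`; only the subsets reachable from `{0, 2}` are built, so the result is usually much smaller than the full powerset.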

References

  1. Bovet D, Crescenzi P (1994) Introduction to the theory of complexity. Prentice-Hall, Englewood Cliffs

  2. Floyd RW, Beigel R (1994) The language of machines: an introduction to computability and formal languages. Computer Science Press, New York

  3. Hopcroft J, Ullman J (1979) Introduction to automata theory, languages, and computation. Addison-Wesley, Massachusetts

  4. Kozen D (2007) Theory of computation. Springer, London

  5. McNaughton R (1982) Elementary computability, formal languages and automata. Prentice-Hall, Englewood Cliffs

  6. Rich E (2008) Automata, computability, and complexity: theory and applications. Pearson Education, New York

  7. Salomaa A (1973) Formal languages. Academic Press, New York

  8. Hopcroft J, Ullman J (1969) Formal languages and their relation to automata. Addison-Wesley, Massachusetts

  9. Harrison M (1978) Introduction to formal language theory. Addison Wesley, Massachusetts

  10. Shallit J (2008) A second course in formal languages and automata theory, 1st edn. Cambridge University Press, New York

  11. Rozenberg G, Salomaa A (eds) (1997) Handbook of formal languages, vol. 1: word, language, grammar. Springer, New York

  12. Sakarovitch J (2009) Elements of automata theory. Cambridge University Press, Cambridge

  13. Watson B (1994) A taxonomy of finite automata minimization algorithms. Report, Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands

  14. Thompson K (1968) Regular expression search algorithm. Commun ACM 11(6):419–422

  15. Berry G, Sethi R (1986) From regular expressions to deterministic automata. Theor Comput Sci 48(1):117–126

  16. Berstel J, Pin JE (1996) Local languages and the Berry-Sethi algorithm. Theor Comput Sci 155(2):439–446

  17. Frisch A, Cardelli L (2004) Greedy regular expression matching. In: Díaz J, Karhumäki J, Lepistö A, Sannella D (eds) ICALP. Springer, Berlin, pp 618–629

  18. McNaughton R, Papert S (1971) Counter-free automata. The MIT Press, Cambridge

Author information

Corresponding author

Correspondence to Stefano Crespi Reghizzi.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Crespi Reghizzi, S., Breveglieri, L., Morzenti, A. (2019). Finite Automata as Regular Language Recognizers. In: Formal Languages and Compilation. Texts in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-030-04879-2_3

  • DOI: https://doi.org/10.1007/978-3-030-04879-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04878-5

  • Online ISBN: 978-3-030-04879-2

  • eBook Packages: Computer Science (R0)
