Abstract
We introduce finite automata, discuss their properties, and present their role as recognizers of regular languages, in particular at the lexical level of compilation. After an overview of the abstract Turing machine model, we focus on finite-state devices. We define the notions of state accessibility, determinism, spontaneous move and ambiguity. We analyze the relations between finite automata and grammars. We present the basic constructions for cleaning, determinizing and minimizing automata. Then we describe the BMC method for deriving regular expressions from automata. Conversely, we present the methods of Thompson and of Berry–Sethi for obtaining finite recognizers from regular expressions. The latter method is based on the concept of locally testable language and is also used for eliminating nondeterminism. For ambiguous regular expressions, we present a new algorithm that extends Berry–Sethi to perform parsing and string matching. Finally, by exploiting their relation to finite automata, we introduce regular expressions extended with the operations of complement and intersection. A synopsis of the relations between regular expressions, grammars and automata ends the chapter.
Notes
- 1.
- 2.
- 3. Other subtler and more efficient algorithms have been invented. We refer the reader to the survey in [13].
- 4. Other more efficient equivalence tests do without first minimizing the automaton.
- 5. If the machine has multiple final states, multiple initial states result, thus causing another form of nondeterminism to be dealt with later.
- 6. Although the copy-rule elimination algorithm on p. 67 does not handle empty rules, for right-linear grammars it still works when the grammar contains empty rules.
- 7. If an NFA is represented by the equivalent left-linear grammar, the powerset algorithm above performs exactly as the construction for transforming a context-free grammar into invertible form by eliminating the repeated right parts of rules, which was presented in Chap. 2 on p. 70.
- 8. If the given automaton \(N\) has multiple initial states, the initial state of machine \(M'\) is the set of all of them.
- 9. Originally presented in [14]. It forms the basis of the popular tool lex (or GNU flex) used for building scanners.
- 10. We follow the conceptual path of Berstel and Pin [16].
- 11. In Chap. 5, local languages and automata are also used to model the control-flow graph of a program.
- 12. We refer to [16] for a formal proof.
- 13. In Sect. 2.3.2.1 we defined a numbered r.e. in the same way, in order to formulate a rough criterion for checking whether an r.e. is ambiguous.
- 14. This is an example of transliteration (homomorphism) as defined on p. 97.
- 15. For a thorough justification of the method we refer to [15].
- 16. A rigorous discussion on problematic r.e. and ambiguity is in Frisch and Cardelli [17].
- 17. The transducer model will be discussed at length in Chap. 5.
- 18. For the star-free family, we allow in an r.e. the empty set \(\emptyset\) as a new metasymbol. This is necessary, e.g., to define the universal language as \(\lnot\,\emptyset\) (complement of \(\emptyset\)), since we may not use the Kleene star expression \(\varSigma^*\).
- 19. For the theory of star-free languages we refer to McNaughton and Papert [18].
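As an illustration of notes 7 and 8, the following is a minimal sketch of the powerset (subset) construction for an NFA without spontaneous moves, in which the initial state of the deterministic machine is the set of all initial states of the given automaton. It is not the chapter's own formulation: the dictionary encoding of the transition function, the function names `determinize` and `accepts`, and the small example automaton are all assumptions made here for illustration.

```python
from collections import deque


def determinize(alphabet, delta, initials, finals):
    """Powerset construction for an NFA without spontaneous moves.

    Hypothetical encoding: delta maps (state, symbol) to a set of states.
    Returns (start, dfa_delta, dfa_finals); DFA states are frozensets and
    only the subsets reachable from the start are built.
    """
    start = frozenset(initials)  # all initial states of N, taken together
    dfa_delta = {}
    dfa_finals = set()
    seen = {start}
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        if subset & finals:  # a subset is final if it contains a final state
            dfa_finals.add(subset)
        for a in alphabet:
            target = frozenset(
                q2 for q in subset for q2 in delta.get((q, a), ())
            )
            if not target:
                continue  # leave the move undefined instead of adding a sink
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return start, dfa_delta, dfa_finals


def accepts(start, dfa_delta, dfa_finals, word):
    """Run the deterministic machine on word; undefined moves reject."""
    state = start
    for a in word:
        state = dfa_delta.get((state, a))
        if state is None:
            return False
    return state in dfa_finals


# Illustrative NFA with two initial states (0 and 2), recognizing a | b+.
example_delta = {(0, "a"): {1}, (2, "b"): {2, 3}}
start, dfa_delta, dfa_finals = determinize(
    alphabet={"a", "b"}, delta=example_delta, initials={0, 2}, finals={1, 3}
)
```

The sketch builds only the reachable subsets and omits the empty subset, so the resulting machine is deterministic but partial; a sink state could be added if a total transition function is required.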
References
1. Bovet D, Crescenzi P (1994) Introduction to the theory of complexity. Prentice-Hall, Englewood Cliffs
2. Floyd RW, Beigel R (1994) The language of machines: an introduction to computability and formal languages. Computer Science Press, New York
3. Hopcroft J, Ullman J (1979) Introduction to automata theory, languages, and computation. Addison-Wesley, Massachusetts
4. Kozen D (2007) Theory of computation. Springer, London
5. McNaughton R (1982) Elementary computability, formal languages and automata. Prentice-Hall, Englewood Cliffs
6. Rich E (2008) Automata, computability, and complexity: theory and applications. Pearson Education, New York
7. Salomaa A (1973) Formal languages. Academic Press, New York
8. Hopcroft J, Ullman J (1969) Formal languages and their relation to automata. Addison-Wesley, Massachusetts
9. Harrison M (1978) Introduction to formal language theory. Addison-Wesley, Massachusetts
10. Shallit J (2008) A second course in formal languages and automata theory, 1st edn. Cambridge University Press, New York
11. Rozenberg G, Salomaa A (eds) (1997) Handbook of formal languages, vol. 1: word, language, grammar. Springer, New York
12. Sakarovitch J (2009) Elements of automata theory. Cambridge University Press, Cambridge
13. Watson B (1994) A taxonomy of finite automata minimization algorithms. Report, Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands
14. Thompson K (1968) Regular expression search algorithm. Commun ACM 11(6):419–422
15. Berry G, Sethi R (1986) From regular expressions to deterministic automata. Theor Comput Sci 48(1):117–126
16. Berstel J, Pin JE (1996) Local languages and the Berry-Sethi algorithm. Theor Comput Sci 155(2):439–446
17. Frisch A, Cardelli L (2004) Greedy regular expression matching. In: Díaz J, Karhumäki J, Lepistö A, Sannella D (eds) ICALP. Springer, Berlin, pp 618–629
18. McNaughton R, Papert S (1971) Counter-free automata. The MIT Press, Cambridge
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Crespi Reghizzi, S., Breveglieri, L., Morzenti, A. (2019). Finite Automata as Regular Language Recognizers. In: Formal Languages and Compilation. Texts in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-030-04879-2_3
DOI: https://doi.org/10.1007/978-3-030-04879-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04878-5
Online ISBN: 978-3-030-04879-2
eBook Packages: Computer Science, Computer Science (R0)