Finite Automata as Regular Language Recognizers

Crespi Reghizzi, Stefano; Breveglieri, Luca; Morzenti, Angelo

doi:10.1007/978-3-030-04879-2_3

Stefano Crespi Reghizzi, Luca Breveglieri (ORCID: orcid.org/0000-0001-5294-6840) & Angelo Morzenti

Part of the book series: Texts in Computer Science (TCS)

Abstract

We introduce finite automata, discuss their properties, and present their role as recognizers of regular languages, in particular at the lexical level of compilation. After an overview of the abstract Turing machine model, we focus on finite-state devices. We define the notions of state accessibility, determinism, spontaneous move and ambiguity. We analyze the relations between finite automata and grammars. We present the basic constructions for cleaning, determinizing and minimizing automata. Then we describe the BMC method for deriving regular expressions from automata. Conversely, we present the methods of Thompson and of Berry–Sethi for obtaining finite recognizers from regular expressions. The latter method is based on the concept of locally testable language and is also used for eliminating nondeterminism. For ambiguous regular expressions, we present a new algorithm that extends Berry–Sethi to perform parsing and string matching. Finally, by exploiting their relation to finite automata, we introduce regular expressions extended with the operations of complement and intersection. A synopsis of the relations between regular expressions, grammars and automata ends the chapter.

Notes

1. Many books cover the subject, e.g., Bovet and Crescenzi [1], Floyd and Beigel [2], Hopcroft and Ullman [3], Kozen [4], McNaughton [5] and Rich [6].
2. Other books have a broader and deeper coverage of automata theory, such as Salomaa [7], Hopcroft and Ullman [3, 8], Harrison [9], Shallit [10] and the handbook [11]; for finite automata, a specific reference is Sakarovitch [12].
3. Other subtler and more efficient algorithms have been invented. We refer the reader to the survey in [13].
4. Other more efficient equivalence tests do without first minimizing the automaton.
5. If the machine has multiple final states, multiple initial states result, thus causing another form of nondeterminism to be dealt with later.
6. Although the copy-rule elimination algorithm on p. 67 does not handle empty rules, for right-linear grammars it continues to work also when empty rules are in the grammar.
7. If an NFA is represented by the equivalent left-linear grammar, the powerset algorithm above performs exactly as the construction for transforming a context-free grammar into invertible form by eliminating the repeated right parts of rules, which was presented in Chap. 2 on p. 70.
8. If the given automaton N has multiple initial states, the initial state of machine \(M'\) is the set of all of them.
9. Originally presented in [14]. It forms the base of the popular tool lex (or GNU flex) used for building scanners.
10. We follow the conceptual path of Berstel and Pin [16].
11. In Chap. 5, local languages and automata are also used to model the control-flow graph of a program.
12. We refer to [16] for a formal proof.
13. In Sect. 2.3.2.1 we defined in the same way a numbered r.e., for the purpose of formulating a rough criterion for checking if an r.e. is ambiguous.
14. This is an example of transliteration (homomorphism) as defined on p. 97.
15. For a thorough justification of the method we refer to [15].
16. A rigorous discussion on problematic r.e. and ambiguity is in Frisch and Cardelli [17].
17. The transducer model will be discussed at length in Chap. 5.
18. For the star-free family, we allow in an r.e. the empty set \(\emptyset \) as a new metasymbol. This is necessary, e.g., to define the universal language as \(\lnot \, \emptyset \) (complement of \(\emptyset \)), since we may not use the Kleene star expression \(\varSigma ^*\).
19. For the theory of star-free languages we refer to McNaughton and Papert [18].
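
The powerset algorithm mentioned in notes 7 and 8 can be illustrated with a minimal Python sketch. It assumes an NFA without spontaneous moves, encoded as a dictionary from (state, symbol) pairs to sets of next states. The encoding and the function names `determinize` and `accepts` are assumptions made for this example, not the chapter's notation. Following note 8, the initial state of the deterministic machine is the set of all initial states of the given automaton.

```python
from collections import deque


def determinize(delta, initials, finals):
    """Powerset construction for an NFA without spontaneous moves.

    delta:    dict mapping (state, symbol) -> set of next states
              (a hypothetical encoding chosen for this sketch)
    initials: NFA initial states (there may be more than one)
    finals:   NFA final states
    Returns (dfa_delta, dfa_initial, dfa_finals); DFA states are frozensets.
    """
    finals = set(finals)
    alphabet = sorted({symbol for (_, symbol) in delta})
    start = frozenset(initials)  # note 8: the set of all initial states
    dfa_delta = {}
    dfa_finals = set()
    seen = {start}
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        # A subset is final if it contains at least one NFA final state.
        if subset & finals:
            dfa_finals.add(subset)
        for symbol in alphabet:
            target = frozenset(
                q for p in subset for q in delta.get((p, symbol), ())
            )
            if not target:
                continue  # skip the empty subset; the DFA stays partial
            dfa_delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_delta, start, dfa_finals


def accepts(dfa_delta, start, dfa_finals, word):
    """Run the (partial) DFA on word; a missing move means rejection."""
    state = start
    for symbol in word:
        state = dfa_delta.get((state, symbol))
        if state is None:
            return False
    return state in dfa_finals


# Example NFA with two initial states, 0 and 2.
# From 0 it accepts the strings over {a, b} that end with a;
# from 2 it accepts b* (including the empty string).
nfa_delta = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (2, "b"): {2},
}
dfa_delta, dfa_start, dfa_finals = determinize(nfa_delta, {0, 2}, {1, 2})
```

Only the subsets reachable from the initial subset are built, so the result contains no inaccessible states. It is not necessarily minimal.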

References

[1] Bovet D, Crescenzi P (1994) Introduction to the theory of complexity. Prentice-Hall, Englewood Cliffs
[2] Floyd RW, Beigel R (1994) The language of machines: an introduction to computability and formal languages. Computer Science Press, New York
[3] Hopcroft J, Ullman J (1979) Introduction to automata theory, languages, and computation. Addison-Wesley, Massachusetts
[4] Kozen D (2007) Theory of computation. Springer, London
[5] McNaughton R (1982) Elementary computability, formal languages and automata. Prentice-Hall, Englewood Cliffs
[6] Rich E (2008) Automata, computability, and complexity: theory and applications. Pearson Education, New York
[7] Salomaa A (1973) Formal languages. Academic Press, New York
[8] Hopcroft J, Ullman J (1969) Formal languages and their relation to automata. Addison-Wesley, Massachusetts
[9] Harrison M (1978) Introduction to formal language theory. Addison-Wesley, Massachusetts
[10] Shallit J (2008) A second course in formal languages and automata theory, 1st edn. Cambridge University Press, New York
[11] Rozenberg G, Salomaa A (eds) (1997) Handbook of formal languages, vol. 1: word, language, grammar. Springer, New York
[12] Sakarovitch J (2009) Elements of automata theory. Cambridge University Press, Cambridge
[13] Watson B (1994) A taxonomy of finite automata minimization algorithms, Report. Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands
[14] Thompson K (1968) Regular expression search algorithm. Commun ACM 11(6):419–422
[15] Berry G, Sethi R (1986) From regular expressions to deterministic automata. Theor Comput Sci 48(1):117–126
[16] Berstel J, Pin JE (1996) Local languages and the Berry-Sethi algorithm. Theor Comput Sci 155(2):439–446
[17] Frisch A, Cardelli L (2004) Greedy regular expression matching. In: Díaz J, Karhumäki J, Lepistö A, Sannella D (eds) ICALP. Springer, Berlin, pp 618–629
[18] McNaughton R, Papert S (1971) Counter-free automata. The MIT Press, Cambridge

Author information

Authors and Affiliations

Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
Stefano Crespi Reghizzi, Luca Breveglieri & Angelo Morzenti

Corresponding author

Correspondence to Stefano Crespi Reghizzi.

Cite this chapter

Crespi Reghizzi, S., Breveglieri, L., Morzenti, A. (2019). Finite Automata as Regular Language Recognizers. In: Formal Languages and Compilation. Texts in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-030-04879-2_3

DOI: https://doi.org/10.1007/978-3-030-04879-2_3
Published: 18 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04878-5
Online ISBN: 978-3-030-04879-2
eBook Packages: Computer Science, Computer Science (R0)
