Characteristic sets for polynomial grammatical inference

De La Higuera, Colin

doi:10.1007/BFb0033342

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1147)

Included in the following conference series:

International Colloquium on Grammatical Inference

Abstract

Efficient grammatical inference raises two issues: how to assess the quality of the result, and how to keep time and space polynomial. A standard answer to the first is to say that an algorithm performs well if it identifies the correct language in the limit. The second has led to debate about how polynomial-time inference should be defined; the main definitions have been proposed by Pitt and by Angluin. In this paper we return to another definition, proposed by Gold, which requires that a characteristic set of strings exist for each grammar and that the size of this set be polynomial in the size of the grammar or automaton to be learnt, where the size of a sample is the sum of the lengths of all its words. The learning algorithm must also infer correctly as soon as the characteristic set is included in the data. We first show that this definition corresponds to a notion of teachability as defined by Goldman and Mathias. By adapting their teacher/learner model to grammatical inference, we prove that languages given by context-free grammars, simple deterministic grammars, linear grammars and nondeterministic finite automata are not polynomially identifiable from given data.
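The two quantitative ingredients of the definition recalled in the abstract (the size of a sample, and the inclusion of a characteristic set in the data) can be illustrated with a minimal Python sketch. It assumes, as is usual for identification from given data, that a sample consists of positive and negative words; the names `Sample`, `sample_size`, `includes` and `within_polynomial_bound`, the toy word sets, and the quadratic bound are all hypothetical and do not come from the paper.

```python
from typing import Callable, FrozenSet, NamedTuple


class Sample(NamedTuple):
    """A labelled sample: words in the language and words outside it (assumed split)."""
    positive: FrozenSet[str]
    negative: FrozenSet[str]


def sample_size(sample: Sample) -> int:
    """Size as stated in the abstract: the sum of the lengths of all words."""
    return sum(len(w) for w in sample.positive) + sum(len(w) for w in sample.negative)


def includes(data: Sample, characteristic: Sample) -> bool:
    """True when the characteristic set is included in the data, label by label."""
    return (characteristic.positive <= data.positive
            and characteristic.negative <= data.negative)


def within_polynomial_bound(characteristic: Sample, grammar_size: int,
                            bound: Callable[[int], int]) -> bool:
    """Check the size requirement against a hypothetical polynomial bound."""
    return sample_size(characteristic) <= bound(grammar_size)


# Toy word sets for illustration only; not a verified characteristic set
# for any particular learner or grammar.
characteristic = Sample(frozenset({"", "ab", "abab"}), frozenset({"a", "b"}))
data = Sample(frozenset({"", "ab", "abab", "ababab"}), frozenset({"a", "b", "ba"}))

size = sample_size(characteristic)            # 0 + 2 + 4 + 1 + 1
covered = includes(data, characteristic)      # every characteristic word appears in data
small = within_polynomial_bound(characteristic, 3, lambda n: n ** 2)
```

Under the definition, a learner is only required to return a correct hypothesis on data for which `includes` holds; the negative results announced in the abstract say that, for the listed classes, no such polynomially bounded characteristic sets can exist for every grammar.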

This work was performed while the author was visiting the Universidad Politécnica de Valencia, Spain.

Bibliography

Angluin, D. (1987). Queries and concept learning. Machine Learning 2, 319–342.
Anthony, M., Brightwell, G., Cohen, D. & Shawe-Taylor, J. (1992). On exact specification by examples. Proceedings of COLT '92 (pp. 311–318). A.C.M.
Castellanos, A., Galiano, I. & Vidal, E. (1994). Application of OSTIA to machine translation tasks. Proceedings of the International Colloquium on Grammatical Inference ICGI-94 (pp. 93–105). Lecture Notes in Artificial Intelligence 862, Springer-Verlag.
Freivalds, R., Kinber, E.B. & Wiehagen, R. (1989). Inductive inference from good examples. Proceedings of the International Workshop on Analogical and Inductive Inference (pp. 1–17). Lecture Notes in Artificial Intelligence 397, Springer-Verlag.
García, P., Segarra, E., Vidal, E. & Galiano, I. (1994). On the use of the morphic generator grammatical inference (MGGI) methodology in automatic speech recognition. International Journal of Pattern Recognition and Artificial Intelligence 4, 667–685.
García, P. & Vidal, E. (1990). Inference of K-testable languages in the strict sense and applications to syntactic pattern recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 12(9), 920–925.
Garey, M.R. & Johnson, D.S. (1979). Computers and intractability: a guide to the theory of NP-completeness. San Francisco: W.H. Freeman.
Gold, E.M. (1967). Language identification in the limit. Information and Control 10, 447–474.
Gold, E.M. (1978). Complexity of automaton identification from given data. Information and Control 37, 302–320.
Goldman, S.A. & Kearns, M.J. (1991). On the complexity of teaching. Proceedings of COLT '91 (pp. 303–314).
Goldman, S.A. & Mathias, H.D. (1993). Teaching a smarter learner. Proceedings of COLT '93 (pp. 67–76).
Harrison, M.A. (1978). Introduction to formal language theory. Reading: Addison-Wesley.
Ishizaka, H. (1989). Learning simple deterministic languages. Proceedings of COLT '89 (pp. 162–174). A.C.M.
Jackson, J. & Tomkins, A. (1992). A computational model of teaching. Proceedings of COLT '92 (pp. 319–326). A.C.M.
Koshiba, T., Mäkinen, E. & Takada, Y. (1995). Learning deterministic even linear languages from positive examples. Proceedings of ALT '95, Lecture Notes in Artificial Intelligence 997, Springer-Verlag.
Oncina, J. & García, P. (1992). Inferring regular languages in polynomial time. In Pattern Recognition and Image Analysis, World Scientific.
Oncina, J., García, P. & Vidal, E. (1993). Learning subsequential transducers for pattern recognition tasks. IEEE Transactions on Pattern Analysis and Machine Intelligence 15, 448–458.
Pitt, L. (1989). Inductive inference, dfas and computational complexity. Proceedings of the International Workshop on Analogical and Inductive Inference (pp. 18–44). Lecture Notes in Artificial Intelligence 397, Springer-Verlag.
Sempere, J.M. & García, P. (1994). A characterisation of even linear languages and its application to the learning problem. Proceedings of the International Colloquium on Grammatical Inference ICGI-94 (pp. 38–44). Lecture Notes in Artificial Intelligence 862, Springer-Verlag.
Takada, Y. (1988). Grammatical inference for even linear languages based on control sets. Information Processing Letters 28, 193–199.
Takada, Y. (1994). A hierarchy of language families learnable by regular language learners. Proceedings of the International Colloquium on Grammatical Inference ICGI-94 (pp. 16–24). Lecture Notes in Artificial Intelligence 862, Springer-Verlag.
Wiehagen, R. (1992). From inductive inference to algorithmic learning theory. Proceedings of ALT '92 (pp. 13–24). Lecture Notes in Artificial Intelligence 743, Springer-Verlag.
Yokomori, T. (1993). Learning non-deterministic finite automata from queries and counterexamples. Machine Intelligence 13. Furukawa, Michie & Muggleton eds., Oxford Univ. Press.

Author information

Authors and Affiliations

Département d'Informatique Fondamentale (DIF), LIRMM, 161 rue Ada, 34392 Montpellier Cedex 5, France
Colin De La Higuera

Editor information

Laurent Miclet, Colin de la Higuera

About this paper

Cite this paper

De La Higuera, C. (1996). Characteristic sets for polynomial grammatical inference. In: Miclet, L., de la Higuera, C. (eds) Grammatical Inference: Learning Syntax from Sentences. ICGI 1996. Lecture Notes in Computer Science, vol 1147. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0033342

DOI: https://doi.org/10.1007/BFb0033342
Published: 17 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61778-5
Online ISBN: 978-3-540-70678-6
eBook Packages: Springer Book Archive