Abstract
Learning from positive data is an important topic in Grammatical Inference, since it is believed that children acquire grammar from syntactically correct (i.e., positive) instances only. However, classical learning models provide no way to avoid the problem of overgeneralization. To overcome this problem, we use a model of learning from simple examples, where simplicity is defined in terms of Kolmogorov complexity. We show that a general and natural heuristic for learning from simple positive examples can be developed in this model. Our main result is that the class of regular languages is probably exactly learnable from simple positive examples.
Cite this article
Denis, F. Learning Regular Languages from Simple Positive Examples. Machine Learning 44, 37–66 (2001). https://doi.org/10.1023/A:1010826628977
DOI: https://doi.org/10.1023/A:1010826628977