Machine Learning, Volume 44, Issue 1–2, pp 9–35

Learning DFA from Simple Examples

  • Rajesh Parekh
  • Vasant Honavar


Efficient learning of DFA is a challenging research problem in grammatical inference. Both exact and approximate (in the PAC sense) identification of DFA are known to be hard. Pitt has posed the following open research problem: “Are DFA PAC-identifiable if examples are drawn from the uniform distribution, or some other known simple distribution?” (Pitt, in Lecture Notes in Artificial Intelligence, 397, pp. 18–44, Springer-Verlag, 1989). We demonstrate that the class of DFA whose canonical representations have logarithmic Kolmogorov complexity is efficiently PAC learnable under the Solomonoff-Levin universal distribution (m). We prove that the class of DFA is efficiently learnable under the PACS (PAC learning with simple examples) model (Denis, D'Halluin & Gilleron, STACS'96: Proceedings of the 13th Annual Symposium on the Theoretical Aspects of Computer Science, pp. 231–242, 1996), wherein positive and negative examples are sampled according to the universal distribution conditional on a description of the target concept. Further, we show that any concept that is learnable under Gold's model of learning from characteristic samples, Goldman and Mathias' polynomial teachability model, and the model of learning from example-based queries is also learnable under the PACS model.
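To make the setting concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It shows a DFA as a data structure, how labeled positive and negative examples arise from a target DFA, and how the disagreement between a hypothesis and the target can be measured on a sample. The names `DFA`, `disagreement_rate`, and `all_strings` are hypothetical. The sample here is simply every string up to a fixed length, weighted equally; this is an assumption made for runnability, since the universal distribution m is not computable and cannot be sampled from directly.

```python
from itertools import product


class DFA:
    """A deterministic finite automaton over a finite alphabet (illustrative)."""

    def __init__(self, alphabet, transitions, start, accepting):
        self.alphabet = alphabet          # symbols, e.g. "ab"
        self.transitions = transitions    # dict: (state, symbol) -> state
        self.start = start                # initial state
        self.accepting = set(accepting)   # set of accepting states

    def accepts(self, word):
        """Run the automaton on `word` and report whether it ends in an accepting state."""
        state = self.start
        for symbol in word:
            state = self.transitions[(state, symbol)]
        return state in self.accepting


def all_strings(alphabet, max_len):
    """All strings over `alphabet` of length 0..max_len, shortest first."""
    return ["".join(p)
            for n in range(max_len + 1)
            for p in product(alphabet, repeat=n)]


def disagreement_rate(target, hypothesis, sample):
    """Fraction of strings in `sample` that the two DFA classify differently."""
    wrong = sum(target.accepts(w) != hypothesis.accepts(w) for w in sample)
    return wrong / len(sample)


# Target concept: strings over {a, b} containing an even number of 'a'.
even_a = DFA("ab",
             {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1},
             start=0, accepting={0})

# A deliberately poor hypothesis: a one-state DFA that accepts every string.
accept_all = DFA("ab", {(0, "a"): 0, (0, "b"): 0}, start=0, accepting={0})

# Stand-in sample (assumption): all strings of length at most 3, equally weighted.
sample = all_strings("ab", 3)

# Labeled examples: positive if the target accepts the string, negative otherwise.
labeled = [(w, even_a.accepts(w)) for w in sample]

error = disagreement_rate(even_a, accept_all, sample)
print(f"{len(sample)} strings, disagreement rate {error:.3f}")
```

In the PAC setting the quantity of interest is the probability, under the example distribution, that hypothesis and target disagree; the equal weighting above is only a placeholder for that distribution.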

Keywords: DFA inference, exact identification, characteristic sets, PAC learning, collusion


  1. Angluin, D. (1981). A note on the number of queries needed to identify regular languages. Information and Control, 51, 76–87.
  2. Angluin, D. (1987). Learning regular sets from queries and counterexamples. Information and Computation, 75, 87–106.
  3. Angluin, D. (1988). Queries and concept learning. Machine Learning, 2:4, 319–342.
  4. Castro, J., & Guijarro, D. (1998). Query, pacs and simple-pac learning. Technical Report LSI-98-2-R, Universitat Politècnica de Catalunya, Spain.
  5. Chomsky, N. (1956). Three models for the description of language. PGIT, 2:3, 113–124.
  6. Denis, F., D'Halluin, C., & Gilleron, R. (1996). Pac learning with simple examples. STACS'96: Proceedings of the 13th Annual Symposium on the Theoretical Aspects of Computer Science (pp. 231–242).
  7. Denis, F., & Gilleron, R. (1997). Pac learning under helpful distributions. In Proceedings of the Eighth International Workshop on Algorithmic Learning Theory (ALT'97), Lecture Notes in Artificial Intelligence 1316 (pp. 132–145), Sendai, Japan.
  8. Dupont, P. (1996). Incremental regular inference. In L. Miclet & C. de la Higuera (Eds.), Proceedings of the Third ICGI-96, Lecture Notes in Artificial Intelligence 1147 (pp. 222–237), Montpellier, France, Springer.
  9. Dupont, P. (1996). Utilisation et apprentissage de modèles de langage pour la reconnaissance de la parole continue. PhD thesis, Ecole Normale Supérieure des Télécommunications, Paris, France.
  10. Dupont, P., Miclet, L., & Vidal, E. (1994). What is the search space of the regular inference? In Proceedings of the Second International Colloquium on Grammatical Inference (ICGI'94) (pp. 25–37). Alicante, Spain.
  11. Gold, E. (1978). Complexity of automaton identification from given data. Information and Control, 37:3, 302–320.
  12. Goldman, S., & Mathias, H. (1993). Teaching a smarter learner. In Proceedings of the Workshop on Computational Learning Theory (COLT'93) (pp. 67–76). ACM Press.
  13. Goldman, S., & Mathias, H. (1996). Teaching a smarter learner. Journal of Computer and System Sciences, 52, 255–267.
  14. de la Higuera, C. (1996). Characteristic sets for polynomial grammatical inference. In L. Miclet & C. de la Higuera (Eds.), Proceedings of the Third ICGI-96, Lecture Notes in Artificial Intelligence 1147 (pp. 59–71). Montpellier, France, Springer.
  15. Hopcroft, J., & Ullman, J. (1979). Introduction to automata theory, languages, and computation. Reading, MA: Addison Wesley.
  16. Jackson, J., & Tomkins, A. (1992). A computational model of teaching. In Proceedings of the Workshop on Computational Learning Theory (COLT'92) (pp. 319–326). ACM Press.
  17. Kearns, M., & Valiant, L. G. (1989). Cryptographic limitations on learning boolean formulae and finite automata. In Proceedings of the 21st Annual ACM Symposium on Theory of Computing (pp. 433–444). New York: ACM.
  18. Lang, K. (1992). Random DFAs can be approximately learned from sparse uniform sample. In Proceedings of the 5th ACM workshop on Computational Learning Theory (pp. 45–52).
  19. Li, M., & Vitányi, P. (1991). Learning simple concepts under simple distributions. SIAM Journal on Computing, 20:5, 911–935.
  20. Li, M., & Vitányi, P. (1997). An introduction to Kolmogorov complexity and its applications (2nd ed.). New York: Springer-Verlag.
  21. Oncina, J., & Garcia, P. (1992). Inferring regular languages in polynomial update time. In N. Pérez et al. (Eds.), Pattern recognition and image analysis (pp. 49–61). Singapore: World Scientific.
  22. Pao, T., & Carr, J. (1978). A solution of the syntactic induction-inference problem for regular languages. Computer Languages, 3, 53–64.
  23. Parekh, R., & Honavar, V. (1993). Efficient learning of regular languages using teacher supplied positive examples and learner generated queries. In Proceedings of the Fifth UNB Conference on AI (pp. 195–203). Fredericton, Canada.
  24. Parekh, R., & Honavar, V. (1997). Learning DFA from simple examples. In Proceedings of the Eighth International Workshop on Algorithmic Learning Theory (ALT'97), Lecture Notes in Artificial Intelligence 1316 (pp. 116–131). Sendai, Japan, Springer. Also presented at the Workshop on Grammar Inference, Automata Induction, and Language Acquisition (ICML'97), Nashville, TN, July 12, 1997.
  25. Parekh, R., & Honavar, V. (1999). Simple DFA are polynomially probably exactly learnable from simple examples. In Proceedings of the Sixteenth International Conference on Machine Learning (ICML'99) (pp. 298–306). Bled, Slovenia.
  26. Pitt, L. (1989). Inductive inference, DFAs and computational complexity. In Analogical and Inductive Inference, Lecture Notes in Artificial Intelligence, 397 (pp. 18–44). Springer-Verlag.
  27. Pitt, L., & Warmuth, M. K. (1988). Reductions among prediction problems: on the difficulty of predicting automata. In Proceedings of the 3rd IEEE Conference on Structure in Complexity Theory (pp. 60–69).
  28. Pitt, L., & Warmuth, M. K. (1989). The minimum consistent DFA problem cannot be approximated within any polynomial. In Proceedings of the 21st ACM Symposium on the Theory of Computing (pp. 421–432). ACM.
  29. Rivest, R. L., & Schapire, R. E. (1993). Inference of finite automata using homing sequences. Information and Computation, 103:2, 299–347.
  30. Trakhtenbrot, B., & Barzdin, Ya. (1973). Finite Automata: Behavior and Synthesis. Amsterdam: North-Holland.
  31. Valiant, L. (1984). A theory of the learnable. Communications of the ACM, 27, 1134–1142.

Copyright information

© Kluwer Academic Publishers 2001

Authors and Affiliations

  • Rajesh Parekh (1)
  • Vasant Honavar (2)
  1. Blue Martini Software, San Mateo, USA
  2. Department of Computer Science, Iowa State University, Ames, USA
