
Machine Learning

Volume 44, Issue 1–2, pp 67–91

Stochastic Finite Learning of the Pattern Languages

  • Peter Rossmanith
  • Thomas Zeugmann

Abstract

The present paper proposes a new learning model—called stochastic finite learning—and shows the whole class of pattern languages to be learnable within this model.

This main result is achieved by providing a new, improved average-case analysis of the Lange–Wiehagen algorithm (New Generation Computing, 8, 361–370), which learns the class of all pattern languages in the limit from positive data. The complexity measure chosen is the total learning time, i.e., the overall time taken by the algorithm until convergence. The expectation of the total learning time is carefully analyzed, and exponentially shrinking tail bounds for it are established for a large class of probability distributions. For every pattern π containing k different variables, it is shown that Lange and Wiehagen's algorithm has an expected total learning time of \(O(\hat \alpha ^k E[\Lambda ]\log _{1/\beta } (k))\), where \({\hat \alpha }\) and β are two easily computable parameters arising naturally from the underlying probability distributions, and E[Λ] is the expected example string length.
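
The Lange–Wiehagen learner itself is simple enough to sketch. The following Python fragment is an illustrative reconstruction, not the paper's own code (the function names and the pattern representation are ours): the hypothesis is rebuilt from any strictly shorter example, merged position-wise with equal-length examples, and left unchanged by longer ones.

```python
# Hedged sketch of a Lange-Wiehagen-style pattern learner from positive data.
# A pattern is a list whose entries are constant symbols (one-char strings)
# or variable indices (ints). Representation and names are illustrative.

def merge(pattern, s):
    """Merge the hypothesis with an equal-length example string:
    positions that agree keep their entry; disagreeing positions get
    variables, with identical (entry, symbol) columns sharing a variable."""
    result, var_of = [], {}
    for p, c in zip(pattern, s):
        if p == c:                      # agreement: keep the entry
            result.append(p)
        else:                           # disagreement: introduce/reuse a variable
            key = (p, c)                # identical columns -> same variable
            if key not in var_of:
                var_of[key] = len(var_of)
            result.append(var_of[key])
    return result

def lange_wiehagen(examples):
    """Process a positive presentation; return the final hypothesis pattern."""
    hypothesis = None
    for s in examples:
        if hypothesis is None or len(s) < len(hypothesis):
            hypothesis = list(s)        # shorter string: restart from constants
        elif len(s) == len(hypothesis):
            hypothesis = merge(hypothesis, s)
        # longer strings are ignored: shortest examples determine the pattern
    return hypothesis
```

For instance, the examples "aaa" and "bab" yield the hypothesis [0, 'a', 0], i.e., the pattern x0 a x0.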

Finally, it is shown how, assuming a modest amount of domain knowledge about the underlying class of probability distributions, learning in the limit can be converted into stochastic finite learning.
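
The shape of that conversion can be sketched as follows; this is a schematic illustration under our own simplifying assumptions (the function and parameter names are ours, and the paper's actual sample bound is stated in terms of the distribution parameters rather than a bare expected round count). The key point is that exponentially shrinking tail bounds let a fixed, precomputed number of examples replace an open-ended limit process.

```python
import math

def stochastic_finite_learn(examples, update, expected_rounds, delta):
    """Run a limit learner for a precomputed number of rounds.

    Assumption (illustrative): the probability that convergence needs more
    than t * expected_rounds examples decays like 2**(-t), so processing
    expected_rounds * log2(1/delta) examples gives confidence 1 - delta.
    """
    n = math.ceil(expected_rounds * math.log2(1 / delta))
    hypothesis = None
    for _ in range(n):
        # `update` is any limit learner's single-example update step
        hypothesis = update(hypothesis, next(examples))
    return hypothesis  # correct with probability at least 1 - delta
```

Unlike a limit learner, this procedure halts after a known number of examples and commits to its output, trading certainty of convergence for a quantified confidence.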

Keywords

inductive learning, pattern languages, average-case analysis, learning in the limit, stochastic finite learning

References

  1. Angluin, D. (1980a). Finding patterns common to a set of strings. Journal of Computer and System Sciences, 21, 46–62.
  2. Angluin, D. (1980b). Inductive inference of formal languages from positive data. Information and Control, 45, 117–135.
  3. Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. K. (1989). Learnability and the Vapnik–Chervonenkis dimension. Journal of the ACM, 36, 926–965.
  4. Daley, R., & Smith, C. H. (1986). On the complexity of inductive inference. Information and Control, 69, 12–40.
  5. Erlebach, T., Rossmanith, P., Stadtherr, H., Steger, A., & Zeugmann, T. (1997). Learning one-variable pattern languages very efficiently on average, in parallel, and by asking queries. In M. Li & A. Maruoka (Eds.), Proceedings of the Eighth International Workshop on Algorithmic Learning Theory (pp. 260–276). Berlin: Springer-Verlag. Lecture Notes in Artificial Intelligence 1316.
  6. Fulk, M. A. (1990). Prudence and other conditions on formal language learning. Information and Computation, 85, 1–11.
  7. Gold, E. M. (1967). Language identification in the limit. Information and Control, 10, 447–474.
  8. Goldman, S. A., Kearns, M. J., & Schapire, R. E. (1993). Exact identification of circuits using fixed points of amplification functions. SIAM Journal on Computing, 22, 705–726.
  9. Hagerup, T., & Rüb, C. (1990). A guided tour of Chernoff bounds. Information Processing Letters, 33, 305–308.
  10. Haussler, D., Kearns, M., Littlestone, N., & Warmuth, M. K. (1991). Equivalence of models for polynomial learnability. Information and Computation, 95, 129–161.
  11. Hopcroft, J. E., & Ullman, J. D. (1969). Formal Languages and their Relation to Automata. Reading, MA: Addison-Wesley.
  12. Kearns, M., & Pitt, L. (1989). A polynomial-time algorithm for learning k-variable pattern languages from examples. In R. Rivest, D. Haussler, & M. K. Warmuth (Eds.), Proceedings of the Second Annual ACM Workshop on Computational Learning Theory (pp. 57–71). San Mateo, CA: Morgan Kaufmann.
  13. Lange, S., & Wiehagen, R. (1991). Polynomial-time inference of arbitrary pattern languages. New Generation Computing, 8, 361–370.
  14. Lange, S., & Zeugmann, T. (1993). Monotonic versus non-monotonic language learning. In G. Brewka, K. P. Jantke, & P. H. Schmitt (Eds.), Proceedings of the Second International Workshop on Nonmonotonic and Inductive Logic (pp. 254–269). Berlin: Springer-Verlag. Lecture Notes in Artificial Intelligence 659.
  15. Lange, S., & Zeugmann, T. (1996). Set-driven and rearrangement-independent learning of recursive languages. Mathematical Systems Theory, 29, 599–634.
  16. Mitchell, A., Scheffer, T., Sharma, A., & Stephan, F. (1999). The VC-dimension of subclasses of pattern languages. In O. Watanabe & T. Yokomori (Eds.), Proceedings of the Tenth International Conference on Algorithmic Learning Theory (pp. 93–105). Berlin: Springer-Verlag. Lecture Notes in Artificial Intelligence 1720.
  17. Muggleton, S. (1994). Bayesian inductive logic programming. In W. Cohen & H. Hirsh (Eds.), Proceedings of the Eleventh International Conference on Machine Learning (pp. 371–379). San Mateo, CA: Morgan Kaufmann.
  18. Pitt, L. (1989). Inductive inference, DFAs and computational complexity. In K. P. Jantke (Ed.), Proceedings of the International Workshop on Analogical and Inductive Inference (pp. 18–44). Berlin: Springer-Verlag. Lecture Notes in Artificial Intelligence 397.
  19. Reischuk, R., & Zeugmann, T. (1998). Learning one-variable pattern languages in linear average time. In P. Bartlett & Y. Mansour (Eds.), Proceedings of the Eleventh Annual Conference on Computational Learning Theory (pp. 198–208). New York, NY: ACM Press.
  20. Reischuk, R., & Zeugmann, T. (1999). A complete and tight average-case analysis of learning monomials. In C. Meinel & S. Tison (Eds.), Proceedings of the Sixteenth International Symposium on Theoretical Aspects of Computer Science (pp. 414–423). Berlin: Springer-Verlag. Lecture Notes in Computer Science 1563.
  21. Rossmanith, P., & Zeugmann, T. (1998). Learning k-variable pattern languages efficiently stochastically finite on average from positive data. DOI Technical Report DOI-TR-145, Department of Informatics, Kyushu University.
  22. Salomaa, A. (1994a). Patterns. (The Formal Language Theory Column). EATCS Bulletin, 54, 46–62.
  23. Salomaa, A. (1994b). Return to patterns. (The Formal Language Theory Column). EATCS Bulletin, 55, 144–157.
  24. Schapire, R. E. (1990). Pattern languages are not learnable. In M. A. Fulk & J. Case (Eds.), Proceedings of the Third Annual ACM Workshop on Computational Learning Theory (pp. 122–129). San Mateo, CA: Morgan Kaufmann.
  25. Shinohara, T., & Arikawa, S. (1995). Pattern inference. In K. P. Jantke & S. Lange (Eds.), Algorithmic Learning for Knowledge-Based Systems (pp. 259–291). Berlin: Springer-Verlag. Lecture Notes in Artificial Intelligence 961.
  26. Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27, 1134–1142.
  27. Zeugmann, T. (1998). Lange and Wiehagen's pattern learning algorithm: An average-case analysis with respect to its total learning time. Annals of Mathematics and Artificial Intelligence, 23, 117–145.
  28. Zeugmann, T., & Lange, S. (1995). A guided tour across the boundaries of learning recursive languages. In K. P. Jantke & S. Lange (Eds.), Algorithmic Learning for Knowledge-Based Systems (pp. 190–258). Berlin: Springer-Verlag. Lecture Notes in Artificial Intelligence 961.

Copyright information

© Kluwer Academic Publishers 2001

Authors and Affiliations

  • Peter Rossmanith (1)
  • Thomas Zeugmann (2)

  1. Institut für Informatik, Technische Universität München, München, Germany
  2. Institut für Theoretische Informatik, Medizinische Universität Lübeck, Lübeck, Germany
