Abstract
The present paper proposes a new learning model—called stochastic finite learning—and shows the whole class of pattern languages to be learnable within this model.
This main result is achieved by providing a new and improved average-case analysis of the Lange–Wiehagen (New Generation Computing, 8, 361–370) algorithm learning the class of all pattern languages in the limit from positive data. The complexity measure chosen is the total learning time, i.e., the overall time taken by the algorithm until convergence. The expectation of the total learning time is carefully analyzed and exponentially shrinking tail bounds for it are established for a large class of probability distributions. For every pattern π containing k different variables it is shown that Lange and Wiehagen's algorithm possesses an expected total learning time of \(O(\hat \alpha ^k E[\Lambda ]\log _{1/\beta } (k))\), where \({\hat \alpha }\) and β are two easily computable parameters arising naturally from the underlying probability distributions, and E[Λ] is the expected example string length.
Finally, assuming a bit of domain knowledge concerning the underlying class of probability distributions, it is shown how to convert learning in the limit into stochastic finite learning.
Article PDF
Similar content being viewed by others
References
Angluin, D. (1980a). Finding patterns common to a set of strings. Journal of Computer and System Sciences, 21, 46–62.
Angluin, D. (1980b). Inductive inference of formal languages from positive data. Information and Control, 45, 117–135.
Blumer, A., Ehrenfeucht, A., Haussler, D.,& Warmuth, M. K. (1989). Learnability and the Vapnik-Chervonenkis dimension. Journal of the ACM, 36, 926–965.
Daley, R.,& Smith, C. H. (1986). On the complexity of inductive inference. Information and Control, 69, 12–40.
Erlebach, T., Rossmanith, P., Stadtherr, H., Steger, A.,& Zeugmann, T. (1997). Learning one-variable pattern languages very efficiently on average, in parallel, and by asking queries. In M. Li& A. Maruoka, (Eds.), Proceedings of the Eighth International Workshop on Algorithmic Learning Theory (pp. 260–276). Berlin: Springer-Verlag. Lecture Notes in Artificial Intelligence 1316.
Fulk, M. A. (1990). Prudence and other conditions on formal language learning, Information and Computation, 85, 1–11.
Gold, E. M. (1967). Language identification in the limit. Information and Control, 10, 447–474.
Goldman, S. A., Kearns, M. J.& Schapire, R. E. (1993). Exact identification of circuits using fixed points of amplification functions. SIAM Journal of Computing, 22, 705–726.
Hagerup, T.,& Rüb, C. (1990). A guided tour of Chernoff bounds. Information Processing Letters, 33, 305–308.
Haussler, D., Kearns, M., Littlestone, N.,& Warmuth, M. K. (1991). Equivalence of models for polynomial learnability. Information and Computation, 95, 129–161.
Hopcroft, J. E.,& Ullman, J. D. (1969). Formal Languages and their Relation to Automata. Reading, MA: Addison-Wesley.
Kearns, M.,& Pitt, L. (1989). A polynomial-time algorithm for learning k- variable pattern languages from examples. In R. Rivest, D. Haussler,& M. K. Warmuth, (Eds.), Proceedings of the Second Annual ACM Workshop on Computational Learning Theory (pp. 57–71). San Mateo, CA: Morgan Kaufmann.
Lange, S.,& Wiehagen, R. (1991). Polynomial-time inference of arbitrary pattern languages. New Generation Computing, 8, 361–370.
Lange, S.,& Zeugmann, T. (1993). Monotonic versus non-monotonic language learning. In G. Brewka, K. P. Jantke,& P. H. Schmitt, (Eds.), Proceedings of the Second International Workshop on Nonmonotonic and Inductive Logic, (pp. 254–269). Berlin: Springer-Verlag. Lecture Notes in Artificial Intelligence 659.
Lange, S.,& Zeugmann, T. (1996). Set-driven and rearrangement-independent learning of recursive languages. Mathematical Systems Theory, 29, 599–634.
Mitchell, A., Scheffer, T., Sharma, A.,& Stephan, F. (1999). The VC-dimension of subclasses of pattern languages. In O. Watanabe& T. Yokomori, (Eds.), Proceedings of the Tenth International Conference on Algorithmic Learning Theory (pp. 93–105). Berlin: Springer-Verlag. Lecture Notes in Artificial Intelligence 1720.
Muggleton, S. (1994). Bayesian inductive logic programming. In W. Cohen& H. Hirsh, (Eds.), Proceedings of the Eleventh International Conference on Machine Learning (pp. 371–379). San Mateo, CA: Morgan Kaufmann.
Pitt, L. (1989). Inductive inference, DFAs and computational complexity. In K. P. Jantke, (Ed.), Proceedings of the International Workshop on Analogical and Inductive Inference (pp. 18–44). Berlin: Springer-Verlag. Lecture Notes in Artificial Intelligence 397.
Reischuk, R.,& Zeugmann, T. (1998). Learning one-variable pattern languages in linear average time. In P. Bartlett& Y. Mansour, (Eds.), Proceedings of the Eleventh Annual Conference on Computational Learning Theory (pp. 198–208). New York, NY: ACM Press.
Reischuk, R.,& Zeugmann, T. (1999). A complete and tight average-case analysis of learning monomials. In C. Meinel& S. Tison, (Eds.), Proceedings of the Sixteenth International Symposium on Theoretical Aspects of Computer Science (pp. 414–423). Berlin: Springer-Verlag. Lecture Notes in Computer Science 1563.
Rossmanith, P.,& Zeugmann, T. (1998). Learning k-variable pattern languages efficiently stochastically finite on average from positive data. DOI Technical Report DOI-TR-145, Department of Informatics, Kyushu University.
Salomaa, A. (1994a). Patterns. (The Formal Language Theory Column). EATCS Bulletin, 54, 46–62.
Salomaa, A. (1994b). Return to patterns. (The Formal Language Theory Column). EATCS Bulletin, 55, 144–157.
Schapire, R. E. (1990). Pattern languages are not learnable. In M. A. Fulk& J. Case, (Eds.), Proceedings of the Third Annual ACM Workshop on Computational Learning Theory (pp. 122–129). San Mateo, CA: Morgan Kaufmann.
Shinohara, T.,& Arikawa, S. (1995). Pattern inference. In K.P. Jantke& S. Lange, (Eds.), Algorithmic Learning for Knowledge-Based Systems (pp. 259–291). Berlin: Springer-Verlag. Lecture Notes in Artificial Intelligence 961.
Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM 27, 1134–1142.
Zeugmann, T. (1998). Lange and Wiehagen's pattern learning algorithm: An average-case analysis with respect to its total learning time. Annals of Mathematics and Artificial Intelligence, 23, 177–145.
Zeugmann, T.,& Lange, S. (1995). A guided tour across the boundaries of learning recursive languages. In K. P. Jantke& S. Lange, (Eds.), Algorithmic Learning for Knowledge-Based Systems (pp. 190–258). Berlin: Springer-Verlag. Lecture Notes in Artificial Intelligence 961.
Author information
Authors and Affiliations
Rights and permissions
About this article
Cite this article
Rossmanith, P., Zeugmann, T. Stochastic Finite Learning of the Pattern Languages. Machine Learning 44, 67–91 (2001). https://doi.org/10.1023/A:1010875913047
Issue Date:
DOI: https://doi.org/10.1023/A:1010875913047