Stochastic Inference of Regular Tree Languages

Carrasco, Rafael C.; Oncina, Jose; Calera-Rubio, Jorge

doi:10.1023/A:1010836331703

Published: July 2001

Machine Learning, Volume 44, pages 185–197 (2001)

Abstract

We generalize an earlier algorithm for identifying regular languages from stochastic samples to the case of tree languages. The algorithm can also identify context-free languages when structural information about the strings is available. The procedure identifies equivalent subtrees in the sample and outputs its hypothesis in time linear in the number of examples. The results are evaluated with a method that efficiently computes the relative entropy between the target grammar and the inferred one.
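To make the evaluation measure concrete, the following minimal sketch computes the relative entropy (Kullback-Leibler divergence) between two distributions. It is only an illustration of the quantity, not the method described in the abstract, which works on grammars that generate infinite tree languages. The function name `relative_entropy`, the dictionary representation, and the toy distributions `target` and `inferred` are assumptions made for this example.

```python
import math


def relative_entropy(p, q):
    """Return D(p || q) in bits for two distributions over a finite support.

    p and q are dicts mapping outcomes (here, trees written as strings)
    to probabilities. This finite-support form is an illustrative
    simplification, not the grammar-level computation from the article.
    """
    total = 0.0
    for outcome, p_x in p.items():
        if p_x == 0.0:
            continue  # terms with p(x) = 0 contribute nothing
        q_x = q.get(outcome, 0.0)
        if q_x == 0.0:
            return math.inf  # q gives zero mass to an outcome p can produce
        total += p_x * math.log2(p_x / q_x)
    return total


# Hypothetical toy distributions over three small trees.
target = {"a": 0.5, "f(a)": 0.25, "f(f(a))": 0.25}
inferred = {"a": 0.25, "f(a)": 0.25, "f(f(a))": 0.5}

divergence = relative_entropy(target, inferred)
```

The divergence is zero when the two distributions coincide and grows as the inferred model moves probability mass away from trees that the target favors, which is why it serves as a measure of how close a hypothesis is to the target.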

References

Angluin, D. (1988). Identifying languages from stochastic examples. Technical Report YALEU/DCS/RR-614, Yale University Dept. of Computer Science, New Haven, CT.
Calera-Rubio, J., & Carrasco, R. C. (1998). Computing the relative entropy between regular tree languages. Information Processing Letters, 68:6, 283–289.
Carrasco, R. C., & Oncina, J. (1999). Learning deterministic regular grammars from stochastic samples in polynomial time. RAIRO (Theoretical Informatics and Applications), 33:1, 1–20.
Cover, T. M., & Thomas, J. A. (1991). Elements of Information Theory. Wiley Series in Telecommunications. New York, NY, USA: John Wiley & Sons.
Feller, W. (1950). An Introduction to Probability Theory and Its Applications I (2nd edn.). New York: John Wiley.
Gécseg, F., & Steinby, M. (1984). Tree Automata. Budapest: Akadémiai Kiadó.
Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58:301, 13–30.
Hopcroft, J., & Ullman, J. (1980). Introduction to Automata Theory, Languages, and Computation. N. Reading, MA: Addison-Wesley.
Oncina, J., & García, P. (1994). Inference of rational tree sets. Technical Report DSIC-ii-1994-23, DSIC, Universidad Politécnica de Valencia.
Sakakibara, Y. (1992). Efficient learning of context-free grammars from positive structural examples. Information and Computation, 97:1, 23–60.
Sakakibara, Y., Brown, M., Underwood, R. C., Mian, I. S., & Haussler, D. (1994). Stochastic context-free grammars for modeling RNA. In L. Hunter (Ed.), Proceedings of the 27th Annual Hawaii International Conference on System Sciences. Vol. 5: Biotechnology Computing. Los Alamitos, CA, USA (284–294).
Stolcke, A., & Omohundro, S. (1993). Hidden Markov model induction by Bayesian model merging. In S. J. Hanson, J. D. Cowan, & C. L. Giles (Eds.), Advances in Neural Information Processing Systems (Vol. 5, pp. 11–18).
Wetherell, C. S. (1980). Probabilistic languages: A review and some open questions. ACM Computing Surveys, 12:4, 361–379.

Author information

Authors and Affiliations

Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, E-03071, Alicante
Rafael C. Carrasco, Jose Oncina & Jorge Calera-Rubio

About this article

Cite this article

Carrasco, R.C., Oncina, J. & Calera-Rubio, J. Stochastic Inference of Regular Tree Languages. Machine Learning 44, 185–197 (2001). https://doi.org/10.1023/A:1010836331703

Issue Date: July 2001
DOI: https://doi.org/10.1023/A:1010836331703
