Some Statistical-Estimation Methods for Stochastic Finite-State Transducers

Picó, David; Casacuberta, Francisco

doi:10.1023/A:1010880113956

Published: July 2001

Machine Learning, Volume 44, pages 121–141 (2001)

Abstract

Formal translations constitute a suitable framework for dealing with many problems in pattern recognition and computational linguistics. The application of formal transducers to these areas requires a stochastic extension for dealing with noisy, distorted patterns with high variability. In this paper, some estimation criteria are proposed and developed for the parameter estimation of regular syntax-directed translation schemata. These criteria are: maximum likelihood estimation, minimum conditional entropy estimation and conditional maximum likelihood estimation. The last two criteria were proposed in order to deal with situations when training data is sparse. These criteria take into account the possibility of ambiguity in the translations: i.e., there can be different output strings for a single input string. In this case, the final goal of the stochastic framework is to find the highest probability translation of a given input string. These criteria were tested on a translation task which has a high degree of ambiguity.
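
To make the ambiguity point concrete, the following is a minimal sketch of a toy stochastic finite-state transducer in which one input string has two candidate output strings. It is not the estimation procedure or the data from the paper: the transducer (EDGES, FINAL), its probabilities, and the function names joint_probabilities and best_translation are all hypothetical and chosen only for illustration. The sketch sums the probability of every path that yields the same output, then picks the output with the highest total.

```python
from collections import defaultdict

# Hypothetical toy transducer (illustration only, not from the paper):
# (state, input_symbol) -> list of (output_symbol, next_state, probability)
EDGES = {
    (0, "a"): [("x", 1, 0.4), ("y", 2, 0.3), ("y", 3, 0.3)],
    (1, "b"): [("z", 4, 1.0)],
    (2, "b"): [("z", 4, 1.0)],
    (3, "b"): [("z", 4, 1.0)],
}
# Probability of stopping in each final state.
FINAL = {4: 1.0}


def joint_probabilities(inp, start=0):
    """Return {output_tuple: P(input, output)}, summed over all paths."""
    # Each hypothesis is (current state, output emitted so far).
    hyps = {(start, ()): 1.0}
    for sym in inp:
        nxt = defaultdict(float)
        for (state, out), p in hyps.items():
            for o, s2, q in EDGES.get((state, sym), []):
                # Paths that reach the same state with the same output merge.
                nxt[(s2, out + (o,))] += p * q
        hyps = nxt
    totals = defaultdict(float)
    for (state, out), p in hyps.items():
        final_p = FINAL.get(state, 0.0)
        if final_p > 0.0:
            totals[out] += p * final_p
    return dict(totals)


def best_translation(inp):
    """Return (output, P(output | input)) for the most probable output."""
    totals = joint_probabilities(inp)
    if not totals:
        return None  # the toy transducer does not accept this input
    norm = sum(totals.values())
    best = max(totals, key=totals.get)
    return best, totals[best] / norm


if __name__ == "__main__":
    print(joint_probabilities("ab"))
    print(best_translation("ab"))
```

In this toy model the single most probable path emits ("x", "z") with probability 0.4, but ("y", "z") is reached by two paths of probability 0.3 each, so it is the more probable translation once paths are summed. This is one way to see why a criterion that accounts for several derivations per translation can differ from one based on the best single path.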

Author information

Authors and Affiliations

Institut Tecnològic d'Informàtica, Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València, Valencia, Spain
David Picó & Francisco Casacuberta

About this article

Cite this article

Picó, D., Casacuberta, F. Some Statistical-Estimation Methods for Stochastic Finite-State Transducers. Machine Learning 44, 121–141 (2001). https://doi.org/10.1023/A:1010880113956