Stochastic Finite-State Models for Spoken Language Machine Translation

Bangalore, Srinivas; Riccardi, Giuseppe

doi:10.1023/B:COAT.0000010804.12581.96

Published: January 2002

Volume 17, pages 165–184 (2002)
Machine Translation

Abstract

The problem of machine translation can be viewed as consisting of two subproblems: (a) lexical selection and (b) lexical reordering. In this paper, we propose stochastic finite-state models for these two subproblems. Stochastic finite-state models are efficiently learnable from data, effective for decoding, and are associated with a calculus for composing models which allows for tight integration of constraints from various levels of language processing. We present a method for learning stochastic finite-state models for lexical selection and lexical reordering that are trained automatically from pairs of source and target utterances. We use this method to develop models for English–Japanese and English–Spanish translation and present the performance of these models for translation on speech and text. We also evaluate the efficacy of such a translation model in the context of a call routing task of unconstrained speech utterances.
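
As a rough illustration of the two-subproblem view described in the abstract, the following Python sketch separates lexical selection from lexical reordering. It is not the authors' finite-state implementation: plain dictionaries and a brute-force permutation search stand in for the stochastic finite-state models, and the lexicon, the bigram table, the floor value, and all function names are hypothetical toy choices made only for this example.

```python
import math
from itertools import permutations

# Hypothetical toy lexicon: source word -> {target word: probability}.
LEXICON = {
    "i": {"yo": 0.9, "me": 0.1},
    "want": {"quiero": 0.8, "deseo": 0.2},
    "it": {"lo": 0.7, "eso": 0.3},
}

# Hypothetical toy bigram probabilities over target words.
BIGRAM = {
    ("<s>", "yo"): 0.6,
    ("yo", "lo"): 0.5,
    ("lo", "quiero"): 0.7,
    ("quiero", "</s>"): 0.6,
}
FLOOR = 1e-3  # assumed probability for bigrams missing from the table


def select_lexical(source_words):
    """Subproblem (a): pick the most probable target word per source word."""
    return [max(LEXICON[w], key=LEXICON[w].get) for w in source_words]


def score(sequence):
    """Log-probability of a target word sequence under the toy bigram model."""
    padded = ["<s>", *sequence, "</s>"]
    return sum(math.log(BIGRAM.get(pair, FLOOR))
               for pair in zip(padded, padded[1:]))


def reorder(target_words):
    """Subproblem (b): pick the highest-scoring ordering of the chosen words."""
    return list(max(permutations(target_words), key=score))


def translate(source_words):
    """Run lexical selection, then lexical reordering."""
    return reorder(select_lexical(source_words))


if __name__ == "__main__":
    print(" ".join(translate(["i", "want", "it"])))
```

With these toy tables, `translate(["i", "want", "it"])` first selects the words yo, quiero, lo in source order and then reorders them to yo lo quiero. The exhaustive permutation search is practical only for very short inputs; it is used here purely to keep the two steps visible.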

Author information

Authors and Affiliations

AT&T Labs Research, 180 Park Avenue, Florham Park, NJ, 07932, USA
Srinivas Bangalore & Giuseppe Riccardi

About this article

Cite this article

Bangalore, S., Riccardi, G. Stochastic Finite-State Models for Spoken Language Machine Translation. Machine Translation 17, 165–184 (2002). https://doi.org/10.1023/B:COAT.0000010804.12581.96

Issue Date: January 2002
DOI: https://doi.org/10.1023/B:COAT.0000010804.12581.96
