Stochastic Finite-State Models for Spoken Language Machine Translation

Machine Translation

Abstract

The problem of machine translation can be viewed as consisting of two subproblems: (a) lexical selection and (b) lexical reordering. In this paper, we propose stochastic finite-state models for these two subproblems. Stochastic finite-state models are efficiently learnable from data, effective for decoding, and are associated with a calculus for composing models which allows for tight integration of constraints from various levels of language processing. We present a method for learning stochastic finite-state models for lexical selection and lexical reordering that are trained automatically from pairs of source and target utterances. We use this method to develop models for English–Japanese and English–Spanish translation and present the performance of these models for translation on speech and text. We also evaluate the efficacy of such a translation model in the context of a call routing task of unconstrained speech utterances.

Cite this article

Bangalore, S., Riccardi, G. Stochastic Finite-State Models for Spoken Language Machine Translation. Machine Translation 17, 165–184 (2002). https://doi.org/10.1023/B:COAT.0000010804.12581.96

  • DOI: https://doi.org/10.1023/B:COAT.0000010804.12581.96
