Head-Transducer Models for Speech Translation and Their Automatic Acquisition from Bilingual Data

Alshawi, Hiyan; Bangalore, Srinivas; Douglas, Shona

doi:10.1023/A:1011187330969

Published: June 2000

Volume 15, pages 105–124, (2000)

Machine Translation

Abstract

This article presents statistical language translation models, called “dependency transduction models”, based on collections of “head transducers”. Head transducers are middle-out finite-state transducers which translate a head word in a source string into its corresponding head in the target language, and further translate sequences of dependents of the source head into sequences of dependents of the target head. The models are intended to capture the lexical sensitivity of direct statistical translation models, while at the same time taking account of the hierarchical phrasal structure of language. Head transducers are suitable for direct recursive lexical translation, and are simple enough to be trained fully automatically. We present a method for fully automatic training of dependency transduction models for which the only input is transcribed and translated speech utterances. The method has been applied to create English–Spanish and English–Japanese translation models for speech translation applications. The dependency transduction model gives around 75% accuracy for an English–Spanish translation task (using a simple string edit-distance measure) and 70% for an English–Japanese translation task. Enhanced with target n-grams and a case-based component, English–Spanish accuracy is over 76%; for English–Japanese it is 73% for transcribed speech, and 60% for translation from recognition word lattices.
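
The abstract reports accuracy under "a simple string edit-distance measure" without giving the formula. The Python sketch below shows one common way such a score can be computed; it is an illustration under stated assumptions, not the authors' scoring procedure. It assumes whitespace tokenization and defines accuracy as one minus the word-level edit distance divided by the reference length. The function names and the example sentences are hypothetical.

```python
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the hypothesis prefix seen so far
    # and the first j tokens of the reference.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # drop a hypothesis token
                            curr[j - 1] + 1,      # add a missing reference token
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def translation_accuracy(hypothesis, reference):
    """Assumed definition: 1 - (edit distance / reference length)."""
    hyp = hypothesis.split()
    ref = reference.split()
    if not ref:
        raise ValueError("reference translation must not be empty")
    return 1.0 - edit_distance(hyp, ref) / len(ref)


if __name__ == "__main__":
    # Hypothetical system output scored against a hypothetical reference.
    print(translation_accuracy("quiero un vuelo a boston",
                               "quiero reservar un vuelo a boston"))
```

Under this assumed definition the score is 1.0 for an exact match, and it can fall below zero when the hypothesis is much longer than the reference.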

Author information

Authors and Affiliations

AT&T Labs Research, 180 Park Avenue, PO Box 971, Florham Park, NJ, 07932, USA
Hiyan Alshawi, Srinivas Bangalore & Shona Douglas

About this article

Cite this article

Alshawi, H., Bangalore, S. & Douglas, S. Head-Transducer Models for Speech Translation and Their Automatic Acquisition from Bilingual Data. Machine Translation 15, 105–124 (2000). https://doi.org/10.1023/A:1011187330969

Issue Date: June 2000
DOI: https://doi.org/10.1023/A:1011187330969
