Interactive Speech Translation in the Diplomat Project

Machine Translation

Abstract

The Diplomat rapid-deployment speech-translation system is intended to allow naïve users to communicate across a language barrier, without strong domain restrictions, despite the error-prone nature of current speech and translation technologies. In addition, it should be deployable for new languages an order of magnitude more quickly than traditional technologies. Achieving this ambitious set of goals depends in large part on allowing the users to correct recognition and translation errors interactively. We present the Multi-Engine Machine Translation (MEMT) architecture, describing how it is well suited for such an application. We then discuss our approaches to rapid-deployment speech recognition and synthesis. Finally, we describe our incorporation of interactive error correction throughout the system design. We have already developed working bidirectional Croatian ⇆ English and Spanish ⇆ English systems, and have Haitian Creole ⇆ English and Korean ⇆ English versions under development.

Cite this article

Frederking, R., Rudnicky, A., Hogan, C. et al. Interactive Speech Translation in the Diplomat Project. Machine Translation 15, 27–42 (2000). https://doi.org/10.1023/A:1011172330853

  • DOI: https://doi.org/10.1023/A:1011172330853
