Robustness and Portability Issues in Multilingual Speech Processing

Machine Translation

Abstract

In this article, we discuss robustness and portability issues for parsing components in interactive speech systems. The robustness is obtained by choosing an appropriate grammar formalism. It should be well adapted to spontaneous speech effects, which are frequent in these application domains. Portability, on the other hand, can be achieved by choosing a flexible grammar implementation. We illustrate both issues by describing a stochastic parsing component implemented and evaluated for spoken language translation and information retrieval applications.

About this article

Cite this article

Minker, W. Robustness and Portability Issues in Multilingual Speech Processing. Machine Translation 16, 109–126 (2001). https://doi.org/10.1023/A:1014574522188

  • DOI: https://doi.org/10.1023/A:1014574522188
