The Prosody Module

  • Anton Batliner
  • Jan Buckow
  • Heinrich Niemann
  • Elmar Nöth
  • Volker Warnke
Part of the Artificial Intelligence book series (AI)


We describe the acoustic-prosodic and syntactic-prosodic annotation and classification of boundaries, accents, and sentence mood integrated in the Verbmobil system for three languages: German, English, and Japanese. For the acoustic-prosodic classification, a large feature vector with normalized prosodic features is used. A single multilingual prosody module was developed for the three languages; it requires considerably less memory than three monolingual modules. For classification, neural networks and statistical language models are used.


Keywords: Spontaneous Speech, Prosodic Feature, Phrase Boundary, Prosodic Boundary, Speech Unit
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Anton Batliner (1)
  • Jan Buckow (1)
  • Heinrich Niemann (1)
  • Elmar Nöth (1)
  • Volker Warnke (1)

  1. Lehrstuhl für Mustererkennung, Universität Erlangen-Nürnberg, Germany
