Introduction to Part IV

  • Sadaoki Furui


This section consists of five papers on how to use prosodic information (prosodic features of speech such as pitch, energy, and duration cues) in automatic speech recognition. As earlier chapters have shown, prosodic information plays an important role in human speech communication. In the last few years, speech recognition systems have improved dramatically, and automatic speech understanding is now a realistic goal. With these developments, the potential role of recognizing prosodic features has grown, since a transcription of the spoken word sequence alone may not provide enough information for accurate speech understanding: the same word sequence can carry different meanings when spoken with different prosody. Meaning is affected by phrase boundaries, pitch accents, and tone (intonation). For example, phrase boundary placement (detection) is useful in syntactic disambiguation, and tone is useful in determining whether or not an utterance is a yes-no question. In English, there are many noun-verb or noun-adjective pairs in which a change in the word accent indicates a change in the word meaning. Phrase boundary placement is also useful for reducing the search space, that is, for reducing the number of calculations in continuous speech recognition.


Keywords: Speech Recognition; Automatic Speech Recognition; Speech Recognition System; Pitch Contour; Prosodic Feature



Copyright information

© Springer-Verlag New York, Inc. 1997

