Application of Expressive Speech in TTS System with Cepstral Description

  • Conference paper
  • In: Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction
  • Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5042)

Abstract

Expressive speech synthesis that conveys different human emotions has interested researchers for a long time. Recently, some experiments with the storytelling speaking style have been performed. This speaking style is suitable for applications aimed at children as well as special applications aimed at blind people. By analyzing the speech of human storytellers, we designed a set of prosodic parameter prototypes for converting speech produced by a text-to-speech (TTS) system into storytelling speech. In addition to the suprasegmental characteristics (pitch, intensity, and duration) included in these speech prototypes, information about the significant frequencies of the spectral envelope and about spectral flatness, which determines the degree of voicing, was also used.
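
To illustrate the spectral flatness mentioned above, the following is a minimal sketch that assumes the common definition: the ratio of the geometric mean to the arithmetic mean of a frame's power spectrum. It is not the authors' implementation; the function name `spectral_flatness`, the Hann window, the frame length, and the test signals are all assumptions chosen for illustration. Values near 0 indicate a peaky (voiced-like) spectrum, and larger values indicate a noise-like (unvoiced-like) spectrum.

```python
import numpy as np


def spectral_flatness(frame, eps=1e-12):
    """Geometric mean / arithmetic mean of the frame's power spectrum.

    Hypothetical helper for illustration only; `eps` avoids log(0).
    """
    windowed = frame * np.hanning(len(frame))        # assumed Hann window
    power = np.abs(np.fft.rfft(windowed)) ** 2 + eps  # one-sided power spectrum
    geometric_mean = np.exp(np.mean(np.log(power)))
    arithmetic_mean = np.mean(power)
    return geometric_mean / arithmetic_mean


# Assumed test signals: a 200 Hz sinusoid (voiced-like) and white noise
# (unvoiced-like), both 512 samples at an assumed 16 kHz sampling rate.
fs = 16000
t = np.arange(512) / fs
rng = np.random.default_rng(0)
voiced_like = np.sin(2 * np.pi * 200 * t)
noise_like = rng.standard_normal(512)

sf_voiced = spectral_flatness(voiced_like)
sf_noise = spectral_flatness(noise_like)
print(f"voiced-like flatness: {sf_voiced:.6f}")
print(f"noise-like flatness:  {sf_noise:.6f}")
```

In this sketch the sinusoid yields a flatness far below that of the noise frame, which is the property that makes the measure usable as an indicator of the degree of voicing.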

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Přibil, J., Přibilová, A. (2008). Application of Expressive Speech in TTS System with Cepstral Description. In: Esposito, A., Bourbakis, N.G., Avouris, N., Hatzilygeroudis, I. (eds) Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction. Lecture Notes in Computer Science, vol 5042. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70872-8_15

  • DOI: https://doi.org/10.1007/978-3-540-70872-8_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70871-1

  • Online ISBN: 978-3-540-70872-8

  • eBook Packages: Computer Science, Computer Science (R0)
