Improvements in Czech Expressive Speech Synthesis in Limited Domain

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNAI, volume 8113)

Abstract

In our recent work, a method for enumerating differences between various expressive categories (communicative functions) was proposed. To improve the overall impact of this approach on both the quality of synthetic expressive speech and the perception of expressivity by listeners, a few modifications are suggested in this paper. The main ones concern a different way of processing the expressive data and of calculating the penalty matrix. A comprehensive evaluation using listening tests and some auxiliary measures was performed.
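
The abstract does not give the penalty values or the features used, so the following is only an illustrative sketch of how a penalty matrix over expressive categories could enter the target cost of a unit selection system. The category names, the matrix values, the weight, and all function names are hypothetical and are not taken from the paper. The concatenation cost and the search over unit sequences are left out.

```python
# Illustrative sketch only: category names and penalty values are hypothetical,
# not taken from the paper.
CATEGORIES = ["neutral", "happy", "sad"]
INDEX = {name: i for i, name in enumerate(CATEGORIES)}

# PENALTY[i][j]: assumed cost of using a unit recorded with category j
# when category i is requested. The diagonal is zero (exact match).
PENALTY = [
    [0.0, 0.6, 0.4],
    [0.6, 0.0, 1.0],
    [0.4, 1.0, 0.0],
]


def expressive_penalty(target, candidate):
    """Look up the penalty for a requested/available category pair."""
    return PENALTY[INDEX[target]][INDEX[candidate]]


def target_cost(target, candidate, other_cost, weight=1.0):
    """Add the weighted expressive penalty to the remaining target-cost
    features, which are assumed to be precomputed as `other_cost`."""
    return other_cost + weight * expressive_penalty(target, candidate)


def best_candidate(target, candidates):
    """Pick the unit with the lowest target cost.

    `candidates` is a list of (unit_id, category, other_cost) tuples.
    """
    return min(candidates, key=lambda c: target_cost(target, c[1], c[2]))[0]


if __name__ == "__main__":
    units = [("u1", "sad", 0.1), ("u2", "happy", 0.3), ("u3", "neutral", 0.2)]
    print(best_candidate("happy", units))
```

In this sketch a unit from a mismatched category can still be selected when its other costs are low enough, which is the general motivation for graded penalties rather than a hard match/mismatch rule.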

Keywords

  • expressive speech synthesis
  • unit selection
  • target cost
  • communicative functions

This work was supported by the European Regional Development Fund (ERDF), project “New Technologies for Information Society” (NTIS), European Centre of Excellence, ED1.1.00/02.0090. Access to the computing and storage facilities owned by parties and projects contributing to the National Grid Infrastructure MetaCentrum, provided under the programme “Projects of Large Infrastructure for Research, Development, and Innovations” (LM2010005), is highly appreciated.

  • Chapter length: 11 pages

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Grůber, M., Matoušek, J. (2013). Improvements in Czech Expressive Speech Synthesis in Limited Domain. In: Železný, M., Habernal, I., Ronzhin, A. (eds) Speech and Computer. SPECOM 2013. Lecture Notes in Computer Science, vol 8113. Springer, Cham. https://doi.org/10.1007/978-3-319-01931-4_23

  • DOI: https://doi.org/10.1007/978-3-319-01931-4_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-01930-7

  • Online ISBN: 978-3-319-01931-4

  • eBook Packages: Computer Science, Computer Science (R0)