Improvements in Czech Expressive Speech Synthesis in Limited Domain

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8113)


In our recent work, a method for enumerating differences between various expressive categories (communicative functions) was proposed. To improve the overall impact of this approach on both the quality of synthetic expressive speech and the perception of expressivity by listeners, a few modifications are suggested in this paper. The main ones consist of a different way of processing expressive data and calculating the penalty matrix. A comprehensive evaluation using listening tests and some auxiliary measures was performed.


Keywords: expressive speech synthesis, unit selection, target cost, communicative functions





Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  1. Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia, Czech Republic
