Speech Synthesizing Simultaneous Emotion-Related States

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11096)


We describe an approach to simulating primary and secondary emotional expression simultaneously in synthesized speech by targeting different parameter categories. The approach is based on the open-source system “Emofilt”, which uses the diphone synthesizer “Mbrola”. A perception experiment showed that the pure emotions were all recognized above chance. Although the results are promising, the ultimate aim of validly synthesizing two emotions simultaneously was not fully reached. Apparently, some emotions, such as fear, dominate perception, and the salience or quality of synthesis does not seem to be equally distributed over the two feature bundles.
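
The idea of targeting different parameter categories can be illustrated with a minimal Python sketch. It is not the Emofilt implementation: the `Phone` class, the two rule tables and their scale factors, and the assignment of pitch to the primary emotion and duration to the secondary emotion are all assumptions made for illustration. Only the output layout (phoneme symbol, duration in ms, then pairs of position in percent and F0 in Hz) follows the phoneme-file format that Mbrola reads.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Phone:
    """One phoneme entry: symbol, duration, and (position %, F0 Hz) pitch targets."""
    symbol: str
    duration_ms: float
    pitch_points: Tuple[Tuple[float, float], ...]


# Hypothetical rule tables; the values are illustrative, not taken from the paper.
PITCH_SCALE = {"neutral": 1.0, "joy": 1.2, "sadness": 0.9, "fear": 1.5}
DURATION_SCALE = {"neutral": 1.0, "joy": 0.9, "sadness": 1.25, "fear": 0.8}


def blend(phones: List[Phone], primary: str, secondary: str) -> List[Phone]:
    """Apply the primary emotion to the pitch bundle and the secondary
    emotion to the duration bundle, so the two rule sets never collide."""
    pitch_factor = PITCH_SCALE[primary]
    duration_factor = DURATION_SCALE[secondary]
    return [
        replace(
            ph,
            duration_ms=ph.duration_ms * duration_factor,
            pitch_points=tuple((pos, hz * pitch_factor) for pos, hz in ph.pitch_points),
        )
        for ph in phones
    ]


def to_pho(phones: List[Phone]) -> str:
    """Serialize as Mbrola-style lines: symbol, duration, then position/pitch pairs."""
    lines = []
    for ph in phones:
        points = " ".join(f"{pos:g} {hz:.0f}" for pos, hz in ph.pitch_points)
        lines.append(f"{ph.symbol} {ph.duration_ms:.0f} {points}".rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    utterance = [Phone("a", 100.0, ((50.0, 100.0),)), Phone("_", 50.0, ())]
    print(to_pho(blend(utterance, primary="joy", secondary="sadness")))
```

Because each emotion touches only its own feature bundle, swapping the primary and secondary labels yields a different signal, which is the kind of asymmetry between bundles that a perception experiment can probe.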


Speech synthesis, Emotion simulation, Mixed emotions



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. audEERING GmbH, Berlin, Germany
  2. Technische Universität Berlin, Berlin, Germany
