
Speech Synthesis of Emotions Using Vowel Features

  • Kanu Boku
  • Taro Asada
  • Yasunari Yoshitomi
  • Masayoshi Tabuse
Part of the Studies in Computational Intelligence book series (SCI, volume 443)

Abstract

Recently, methods for adding emotion to synthetic speech have received considerable attention in the field of speech synthesis research. Generating emotional synthetic speech requires controlling the prosodic features of the utterances. We propose a case-based method for generating emotional synthetic speech that exploits three characteristics of emotional speech: the maximum amplitude of vowels, the utterance time of vowels, and the fundamental frequency. As an initial investigation, we adopted utterances of Japanese names, which are semantically neutral. Using the proposed method, emotional synthetic speech made from the emotional speech of one male subject was discriminable with a mean accuracy of 70% when ten subjects listened to emotional synthetic utterances of the Japanese name “Taro” spoken as “angry,” “happy,” “neutral,” “sad,” or “surprised.”
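The three prosodic features the abstract names, the maximum amplitude of a vowel, its utterance time (duration), and the fundamental frequency, can each be measured directly from a vowel waveform. The following is a minimal sketch of such a measurement, not the authors' implementation: the function name, interface, and the autocorrelation-based F0 estimator are illustrative assumptions, and it presumes a clean, mono vowel segment supplied as a NumPy array.

```python
import numpy as np

def vowel_features(samples, sample_rate):
    """Illustrative feature extraction for one vowel segment
    (not the paper's implementation): returns the maximum
    amplitude, the utterance time in seconds, and an
    autocorrelation-based estimate of the fundamental frequency.
    """
    max_amplitude = float(np.max(np.abs(samples)))
    duration = len(samples) / sample_rate  # utterance time in seconds

    # One-sided autocorrelation: the strongest peak after lag 0
    # corresponds to one pitch period of the vowel.
    ac = np.correlate(samples, samples, mode="full")[len(samples) - 1:]

    # Restrict the search to plausible pitch lags (50-500 Hz here,
    # an assumed range covering typical speaking voices).
    lo, hi = int(sample_rate / 500), int(sample_rate / 50)
    lag = lo + int(np.argmax(ac[lo:hi]))
    f0 = sample_rate / lag
    return max_amplitude, duration, f0
```

For example, a 0.5-second, 200 Hz sine wave with peak amplitude 0.8 sampled at 16 kHz yields approximately (0.8, 0.5, 200.0). In the case-based method described, features like these measured from recorded emotional speech would drive the prosody of the synthetic utterance.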

Keywords

Emotional speech · feature parameter · synthetic speech · emotional synthetic speech · vowel



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Kanu Boku¹
  • Taro Asada¹
  • Yasunari Yoshitomi¹
  • Masayoshi Tabuse¹

  1. Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Sakyo-ku, Japan
