Study on Emotional Speech Features in Korean with Its Application to Voice Conversion

Kim, Sang-Jin; Kim, Kwang-Ki; Han, Hyun Bae; Hahn, Minsoo

doi:10.1007/11573548_44

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3784)

Included in the following conference series:

International Conference on Affective Computing and Intelligent Interaction

Abstract

Recent research in speech synthesis has focused mainly on naturalness, and emotional speech synthesis has become one of the prominent research topics. Although quite a few studies have addressed emotional speech in English or Japanese, studies on Korean are seldom found. This paper presents an analysis of emotional speech in Korean. Emotional speech features related to human speech prosody, such as F0, duration, and amplitude, together with their variations, are examined. The paper attempts to quantify and model their contribution to three different types of typical human speech. Using the analysis results, emotional voice conversion from neutral speech to emotional speech is also performed and tested.

Author information

Authors and Affiliations

Speech and Audio Info. Lab., Information and Communications Univ., Korea
Sang-Jin Kim, Kwang-Ki Kim & Minsoo Hahn
International Network Planning & Management Team, Network Group, KT, Korea
Hyun Bae Han

Editor information

Editors and Affiliations

National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences,
Jianhua Tao
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Tieniu Tan
MIT Media Laboratory, 20 Ames Street, 02139, Cambridge, MA, USA
Rosalind W. Picard

About this paper

Cite this paper

Kim, SJ., Kim, KK., Han, H.B., Hahn, M. (2005). Study on Emotional Speech Features in Korean with Its Application to Voice Conversion. In: Tao, J., Tan, T., Picard, R.W. (eds) Affective Computing and Intelligent Interaction. ACII 2005. Lecture Notes in Computer Science, vol 3784. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11573548_44

DOI: https://doi.org/10.1007/11573548_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29621-8
Online ISBN: 978-3-540-32273-3
eBook Packages: Computer Science, Computer Science (R0)
