Abstract
In this paper, the dynamics of prosodic features are exploited for speech emotion conversion. In particular, neutral speech is converted to anger speech. The database used for prosody analysis is the Indian Institute of Technology Kharagpur Simulated Emotion Speech Corpus (IITKGP-SESC). The prosodic features considered in the study are the pitch contour, the intensity contour, and the duration contour. An objective test is performed in terms of the averages of the pitch contour and the intensity contour. Subjective listening tests show that the emotion is perceived more effectively when the pitch contour is modified at the beginning and end of the utterance than when it is modified over the whole utterance. The results also show that the synthesized anger speech is perceived as very close to natural anger speech.
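As a rough illustration of the comparison described in the abstract, the following sketch raises a frame-level pitch (F0) contour either only at the beginning and end of an utterance or over the whole utterance, then computes the average pitch used in an objective check. It is not the authors' method: the function names, the scale factor, the edge fraction, and the sample contour are all hypothetical, and frames with F0 equal to 0 are assumed to be unvoiced.

```python
def scale_contour_edges(f0, factor=1.5, edge_fraction=0.2):
    """Scale voiced F0 values in the first and last edge_fraction of frames.

    factor and edge_fraction are illustrative assumptions, not values
    taken from the paper.
    """
    n = len(f0)
    edge = int(n * edge_fraction)  # number of frames in each edge region
    out = list(f0)
    for i, value in enumerate(f0):
        in_edge = i < edge or i >= n - edge
        if in_edge and value > 0:  # leave unvoiced frames (F0 == 0) untouched
            out[i] = value * factor
    return out


def scale_contour_whole(f0, factor=1.5):
    """Scale every voiced F0 value in the utterance by the same factor."""
    return [value * factor if value > 0 else value for value in f0]


def mean_voiced(f0):
    """Average F0 over voiced frames only; returns 0.0 if none are voiced."""
    voiced = [value for value in f0 if value > 0]
    return sum(voiced) / len(voiced) if voiced else 0.0


# Hypothetical neutral contour in Hz, one value per frame.
neutral = [0, 120, 125, 130, 128, 126, 124, 122, 120, 0]

edges_only = scale_contour_edges(neutral)
whole = scale_contour_whole(neutral)

print(mean_voiced(neutral), mean_voiced(edges_only), mean_voiced(whole))
```

The same averaging could be applied to an intensity contour. Actual resynthesis of speech from a modified contour needs a prosody modification method, which this sketch does not attempt.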
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Vuppala, A.K., Limmayya, J., Raghavendra, G. (2013). Neutral Speech to Anger Speech Conversion Using Prosody Modification. In: Prasath, R., Kathirvalavakumar, T. (eds) Mining Intelligence and Knowledge Exploration. Lecture Notes in Computer Science, vol 8284. Springer, Cham. https://doi.org/10.1007/978-3-319-03844-5_39
DOI: https://doi.org/10.1007/978-3-319-03844-5_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03843-8
Online ISBN: 978-3-319-03844-5
eBook Packages: Computer Science, Computer Science (R0)