Neutral Speech to Anger Speech Conversion Using Prosody Modification
In this paper, the dynamics of prosodic features are exploited for speech emotion conversion. In particular, neutral speech is converted to anger speech. Prosody is analyzed using the Indian Institute of Technology Kharagpur Simulated Emotion Speech Corpus (IITKGP-SESC). The prosodic features considered are the pitch contour, intensity contour, and duration contour. The objective test is based on the averages of the pitch contour and the intensity contour. Subjective listening tests show that the emotion is perceived more effectively when the pitch contour is modified at the beginning and end of the utterance than when it is modified over the whole utterance. The results show that the synthesized anger speech is perceived as very close to natural anger speech.
Keywords: Emotion conversion, neutral speech, anger speech, phase vocoder, pitch shift, intensity contour, duration contour
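As an illustration of the idea described in the abstract, the following minimal sketch scales a pitch contour only at the beginning and end of an utterance and then computes the average pitch used for an objective comparison. It is not the authors' implementation: the function names, the scaling factor, the edge fraction, and the convention that unvoiced frames carry a pitch of 0 are all assumptions made for this example. Resynthesis of the waveform (for instance with a phase vocoder) is not shown.

```python
import numpy as np


def modify_pitch_edges(f0, factor=1.3, edge_fraction=0.2):
    """Scale F0 in the first and last parts of an utterance.

    f0            : per-frame pitch in Hz; 0 marks unvoiced frames (assumed convention).
    factor        : hypothetical pitch scaling factor, not taken from the paper.
    edge_fraction : hypothetical share of frames treated as "beginning" and as "end".
    """
    if not 0.0 <= edge_fraction <= 0.5:
        raise ValueError("edge_fraction must lie in [0, 0.5]")
    out = np.asarray(f0, dtype=float).copy()
    # Flooring guarantees the two edge regions never overlap.
    n_edge = int(len(out) * edge_fraction)
    if n_edge > 0:
        out[:n_edge] *= factor   # beginning of the utterance
        out[-n_edge:] *= factor  # end of the utterance
    return out  # unvoiced frames stay at 0 because 0 * factor == 0


def mean_voiced(f0):
    """Average pitch over voiced frames only (frames with F0 > 0)."""
    f0 = np.asarray(f0, dtype=float)
    voiced = f0[f0 > 0]
    return float(voiced.mean()) if voiced.size else 0.0


if __name__ == "__main__":
    neutral_f0 = [100, 0, 100, 100, 100, 100, 100, 100]  # toy contour in Hz
    converted_f0 = modify_pitch_edges(neutral_f0, factor=1.5, edge_fraction=0.25)
    print(converted_f0)
    print(mean_voiced(neutral_f0), mean_voiced(converted_f0))
```

Setting `edge_fraction=0.5` scales every frame, which gives the whole-utterance modification that the listening test compares against.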