Abstract
This study examines an emotion labeling method for system utterances in a non-task-oriented spoken dialogue system. A previous study proposed cooperative emotion labeling, which synthesizes emotional speech using an emotion label estimated from the user and system utterances. However, that method cannot decide the emotion label when no emotion can be estimated from the linguistic information. We therefore propose a method that uses both acoustic and linguistic information for emotion recognition. In this paper, we first report the emotion recognition performance obtained with the acoustic features. We then conduct a scenario-based dialogue experiment to verify the effectiveness of the proposed emotion labeling method.
Acknowledgments
Part of this work was supported by JSPS KAKENHI Grant Numbers JP17H00823 and JP18K18136.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Yamanaka, M., Chiba, Y., Nose, T., Ito, A. (2019). A Study on a Spoken Dialogue System with Cooperative Emotional Speech Synthesis Using Acoustic and Linguistic Information. In: Pan, JS., Ito, A., Tsai, PW., Jain, L. (eds) Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing. IIH-MSP 2018. Smart Innovation, Systems and Technologies, vol 110. Springer, Cham. https://doi.org/10.1007/978-3-030-03748-2_12
DOI: https://doi.org/10.1007/978-3-030-03748-2_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03747-5
Online ISBN: 978-3-030-03748-2
eBook Packages: Intelligent Technologies and Robotics (R0)