Speech Expression Multimodal Emotion Recognition Based on Deep Belief Network

Liu, Dong; Chen, Longxi; Wang, Zhiyong; Diao, Guangqiang

doi:10.1007/s10723-021-09564-0

Published: 18 May 2021

Journal of Grid Computing, Volume 19, article number 22 (2021)

Dong Liu¹, Longxi Chen¹ (ORCID: orcid.org/0000-0001-7952-9553), Zhiyong Wang¹ & Guangqiang Diao¹

Abstract

To address the insufficient information and low recognition rates of single-modality emotion recognition, a multimodal emotion recognition method based on a deep belief network is proposed. First, speech and expression signals are preprocessed and features are extracted to obtain high-level features for each modality. Then, the high-level speech and expression features are fused using a bimodal deep belief network (BDBN), which yields multimodal fusion features for classification and removes redundant information between modalities. Finally, the multimodal fusion features are classified with LIBSVM to produce the final emotion recognition result. The proposed model is evaluated experimentally on the Friends data set. The results show that the multimodal fusion features achieve the best recognition accuracy, 90.89%, and that the unweighted recognition accuracy of the proposed model is 86.17%, which is better than that of the comparison methods, indicating that the approach has research value and practical applicability.
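The abstract reports both a recognition accuracy and an unweighted recognition accuracy. The following minimal Python sketch illustrates how the two figures can differ. It assumes the term follows the convention common in emotion recognition work, where unweighted accuracy is the mean of per-class recall; the abstract does not define it. The function names and the toy labels are hypothetical and are not taken from the paper or the Friends data set.

```python
from collections import defaultdict


def weighted_accuracy(y_true, y_pred):
    """Overall fraction of correct predictions; every sample counts equally."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recall; every class counts equally (assumed definition)."""
    total = defaultdict(int)  # samples per true class
    hits = defaultdict(int)   # correctly recognized samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (illustrative only).
y_true = ["neutral"] * 8 + ["anger"] * 2
y_pred = ["neutral"] * 8 + ["neutral", "anger"]

wa = weighted_accuracy(y_true, y_pred)
ua = unweighted_accuracy(y_true, y_pred)
print(f"weighted: {wa:.2f}, unweighted: {ua:.2f}")
```

On this toy input the overall accuracy is 0.90 while the per-class average is 0.75, because one of the two samples of the rare class is missed. Under the assumed definition, a gap of this kind is what distinguishes the two figures when classes are imbalanced.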

Acknowledgments

This work was supported in part by the Natural Science Foundation of Shandong Province of China under Grant ZR2016AM30, in part by the Social Science Planning Research Project of Shandong Province under Grant 18CLYJ50, in part by the Shandong Soft Science Research Program under Grant 2018RKB01144, and in part by the Project of Shandong Province Higher Educational Science and Technology Program under Grant J15LN15.

Author information

Authors and Affiliations

School of Information Engineering, Shandong Youth University of Political Science, Jinan, 250103, Shandong, China
Dong Liu, Longxi Chen, Zhiyong Wang & Guangqiang Diao

Corresponding author

Correspondence to Longxi Chen.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, D., Chen, L., Wang, Z. et al. Speech Expression Multimodal Emotion Recognition Based on Deep Belief Network. J Grid Computing 19, 22 (2021). https://doi.org/10.1007/s10723-021-09564-0

Received: 04 February 2021
Accepted: 14 May 2021
Published: 18 May 2021
DOI: https://doi.org/10.1007/s10723-021-09564-0
