A soft voting ensemble learning-based approach for multimodal sentiment analysis

  • Original Article
  • Published in Neural Computing and Applications

Abstract

Sentiment analysis makes it possible to determine people's feelings and opinions about a subject or product from their social media posts. With the pervasive use of the Internet and smart devices, the data users produce daily can come in different modalities such as text, image, audio, and video. Multimodal sentiment analysis aims to reveal the sentiment of a user's post by analyzing the data in its different modalities as a whole. One of the major challenges of multimodal sentiment analysis is how to combine the sentiment obtained from the individual modalities while preserving the sentiment and meaning of the post. Moreover, many studies apply the same classification method to every modality, even though each classifier may be effective on a different feature set. In this study, a soft voting-based ensemble model is proposed that takes advantage of the complementary performance of different classifiers on different modalities. In the proposed model, deep features are extracted from the multimodal datasets with deep learning methods (BiLSTM, CNN). After feature selection is applied to the fused text and image features, the final feature sets are classified with the soft voting-based ensemble learning model. The performance of the proposed model was evaluated on two benchmark datasets consisting of text–image pairs. The experimental results show that the proposed model outperforms multiple competing models on the same datasets.
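To make the approach concrete, here is a minimal sketch of the pipeline the abstract describes, written with scikit-learn. It is an illustration under stated assumptions, not the paper's implementation: the feature dimensions, the random stand-in data, the SelectKBest selection step, and the choice of base classifiers are all hypothetical.

```python
# Sketch of fusion + feature selection + soft voting (illustrative only).
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-ins for deep features: a BiLSTM text vector and a CNN image vector
# per post (dimensions are hypothetical).
text_feats = rng.normal(size=(1000, 300))
image_feats = rng.normal(size=(1000, 128))
y = rng.integers(0, 2, size=1000)  # binary sentiment labels

# Fuse modalities by concatenation, then keep the most informative features.
X_fused = np.concatenate([text_feats, image_feats], axis=1)
X = SelectKBest(f_classif, k=200).fit_transform(X_fused, y)

# Soft voting: average the per-class probabilities of heterogeneous
# classifiers and take the argmax, instead of counting hard votes.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=100)),
        ("svc", SVC(probability=True)),  # needed so SVC exposes predict_proba
    ],
    voting="soft",
)
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))
```

Because soft voting averages probabilities rather than hard labels, a classifier that is highly confident on a given fused feature vector can outweigh base learners that are uncertain, which is what lets classifiers with different strengths on the text and image features complement one another.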

Author information

Corresponding author

Correspondence to Mehmet Umut Salur.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Salur, M.U., Aydın, İ. A soft voting ensemble learning-based approach for multimodal sentiment analysis. Neural Comput & Applic 34, 18391–18406 (2022). https://doi.org/10.1007/s00521-022-07451-7
