Sentiment Analysis of Telephone Conversations Using Multimodal Data

Logumanov, Alexander Gafuanovich; Klenin, Julius Dmitrievich; Botov, Dmitry Sergeevich

doi:10.1007/978-3-030-11027-7_9

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11179)

Included in the following conference series:

International Conference on Analysis of Images, Social Networks and Texts

Abstract

Sentiment analysis of conversations is a widely studied topic, but most proposed solutions rely on text analysis alone. For real telephone conversations, the text is far from ideal: it contains many errors and inaccuracies introduced at the speech recognition stage. Today, there are almost no papers on sentiment analysis of conversations using multimodal datasets for the Russian language. In this paper, we propose multimodal sentiment analysis of conversations, using both the recognized text and the audio signal as training data. To do this, we assemble our own dataset of recorded telephone conversations labelled by sentiment intensity. The texts are obtained with ready-made automatic speech recognition tools. We carry out a number of experiments to find the best way to extract features from audio and text, and we build models that determine sentiment intensity for each modality and for their combination. We compare several families of classification algorithms: linear models, neural networks and ensembles of decision trees. XGBoost works best for audio, Logistic Regression for text and LightGBM for multimodal data. The results show that combining several modalities yields the best classification quality.
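The general idea of combining a text modality with an audio modality can be sketched as below. This is only an illustration under stated assumptions, not the pipeline used in the paper: the abstract does not say which features or which fusion strategy were chosen, so the TF-IDF weighting, the two acoustic descriptors (RMS energy and zero-crossing rate), the concatenation-based fusion, the hand-written logistic regression and all helper names (`tfidf_vectors`, `audio_features`, `fuse`, `train_logreg`, `tone`) are hypothetical. The transcripts and waveforms are synthetic toy data, not the authors' dataset.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for whitespace-tokenized documents."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(docs)
    # Document frequency: in how many documents each word occurs.
    df = Counter(w for toks in tokenized for w in set(toks))
    # Smoothed inverse document frequency, so no term gets zero weight.
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1.0 for w in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        vectors.append([counts[w] / total * idf[w] for w in vocab])
    return vocab, vectors


def audio_features(samples):
    """Two toy acoustic descriptors (assumed): RMS energy, zero-crossing rate."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return [rms, crossings / (len(samples) - 1)]


def fuse(text_vec, audio_vec):
    """Feature-level fusion (assumed): concatenate the per-modality vectors."""
    return list(text_vec) + list(audio_vec)


def train_logreg(X, y, lr=0.5, epochs=500):
    """Binary logistic regression fitted by batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(w, b, x):
    """Return 1 (positive) if the linear score is non-negative, else 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0


def tone(freq, amplitude, n=200):
    """Synthetic stand-in for a speech segment: a plain sine wave."""
    return [amplitude * math.sin(2 * math.pi * freq * t / n) for t in range(n)]


# Toy data: 1 = positive, 0 = negative. Louder, higher-pitched tones stand in
# for agitated speech; this pairing is invented for the example.
transcripts = ["thank you great service", "this is terrible",
               "great thank you", "terrible awful service"]
waves = [tone(3, 0.2), tone(12, 0.9), tone(4, 0.3), tone(15, 0.8)]
labels = [1, 0, 1, 0]

vocab, text_vecs = tfidf_vectors(transcripts)
X = [fuse(t, audio_features(w)) for t, w in zip(text_vecs, waves)]
w, b = train_logreg(X, labels)
predictions = [predict(w, b, x) for x in X]
```

Concatenating features before classification is only one way to combine modalities; training one model per modality and merging their outputs is a common alternative, and the abstract does not indicate which was used.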

Author information

Authors and Affiliations

Chelyabinsk, Russia
Alexander Gafuanovich Logumanov, Julius Dmitrievich Klenin & Dmitry Sergeevich Botov

Corresponding author

Correspondence to Alexander Gafuanovich Logumanov.

Editor information

Editors and Affiliations

RWTH Aachen University, Aachen, Germany
Wil M. P. van der Aalst
University of Ljubljana, Ljubljana, Slovenia
Vladimir Batagelj
University of Mannheim, Mannheim, Germany
Goran Glavaš
National Research University Higher School of Economics, Moscow, Russia
Dmitry I. Ignatov
Institute of Mathematics and Mechanics, Yekaterinburg, Russia
Michael Khachay
National Research University Higher School of Economics, Moscow, Russia
Sergei O. Kuznetsov
National Research University Higher School of Economics, Saint Petersburg, Russia
Olessia Koltsova
National Research University Higher School of Economics, Moscow, Russia
Irina A. Lomazova
Moscow State University, Moscow, Russia
Natalia Loukachevitch
Loria, Vandoeuvre lès Nancy, France
Amedeo Napoli
University of Hamburg, Hamburg, Germany
Alexander Panchenko
University of Florida, Gainesville, FL, USA
Panos M. Pardalos
Ca Foscari University of Venice, Venice, Italy
Marcello Pelillo
National Research University Higher School of Economics, Nizhny Novgorod, Russia
Andrey V. Savchenko

About this paper

Cite this paper

Logumanov, A.G., Klenin, J.D., Botov, D.S. (2018). Sentiment Analysis of Telephone Conversations Using Multimodal Data. In: van der Aalst, W., et al. Analysis of Images, Social Networks and Texts. AIST 2018. Lecture Notes in Computer Science, vol 11179. Springer, Cham. https://doi.org/10.1007/978-3-030-11027-7_9

DOI: https://doi.org/10.1007/978-3-030-11027-7_9
Published: 31 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11026-0
Online ISBN: 978-3-030-11027-7
eBook Packages: Computer Science, Computer Science (R0)