Emotion recognition in Hindi text using multilingual BERT transformer

Multimedia Tools and Applications

Abstract

Emotions are a vital and fundamental part of our existence. Whatever we do, say, or leave unsaid reflects our feelings in some way, though not always immediately. To understand the most fundamental human behaviour, we must examine these feelings using emotional data. The literature shows that categorising speech text into multiple classes is under extensive investigation, but this research has seen very limited application to local and regional languages such as Hindi. This study focuses on text emotion analysis specifically for the Hindi language. We use the BHAAV dataset, which consists of 20,304 sentences, each manually annotated with one of five emotion categories (Anger, Suspense, Joy, Sad, Neutral). We compare multiple machine learning and deep learning techniques with word embeddings in terms of accuracy, and then use the trained model to predict the emotions of Hindi text. The best performance was observed with the mBERT model: a loss of 0.1689, balanced accuracy of 93.88%, recall of 93.44%, AUC of 99.55%, and precision of 94.39% on the training data, and a loss of 0.3073, balanced accuracy of 91.84%, recall of 91.74%, AUC of 98.46%, and precision of 92.01% on the testing data.
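To make the reported metrics concrete, the following is a minimal sketch of how balanced accuracy and macro-averaged precision can be computed for a five-class emotion classifier. It assumes that balanced accuracy means the unweighted mean of per-class recall, which is a common definition; the abstract does not state how the authors averaged their metrics. The function names and the toy labels are hypothetical and are not taken from the paper or from BHAAV.

```python
from collections import Counter

# The five emotion categories named in the abstract.
LABELS = ["Anger", "Suspense", "Joy", "Sad", "Neutral"]


def per_class_counts(y_true, y_pred):
    """Count true positives, false positives and false negatives per label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1   # the wrongly predicted label gains a false positive
            fn[truth] += 1   # the missed true label gains a false negative
    return tp, fp, fn


def macro_scores(y_true, y_pred, labels=LABELS):
    """Return (balanced_accuracy, macro_precision).

    Assumption: balanced accuracy is the unweighted mean of per-class recall.
    A class with no true examples is left out of the recall average, and a
    class that is never predicted is left out of the precision average.
    """
    tp, fp, fn = per_class_counts(y_true, y_pred)
    recalls, precisions = [], []
    for label in labels:
        support = tp[label] + fn[label]      # examples whose true label is `label`
        predicted = tp[label] + fp[label]    # examples predicted as `label`
        if support:
            recalls.append(tp[label] / support)
        if predicted:
            precisions.append(tp[label] / predicted)
    balanced_accuracy = sum(recalls) / len(recalls)
    macro_precision = sum(precisions) / len(precisions)
    return balanced_accuracy, macro_precision


# Hypothetical toy labels for illustration only.
y_true = ["Joy", "Joy", "Sad", "Anger", "Neutral", "Neutral", "Suspense", "Sad"]
y_pred = ["Joy", "Sad", "Sad", "Anger", "Neutral", "Joy", "Suspense", "Sad"]

balanced_accuracy, macro_precision = macro_scores(y_true, y_pred)
print(f"balanced accuracy: {balanced_accuracy:.4f}")
print(f"macro precision:   {macro_precision:.4f}")
```

Averaging recall per class rather than per sentence keeps a frequent category such as Neutral from dominating the score, which is one reason balanced accuracy is often preferred on corpora with uneven class sizes.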

Data Availability

The dataset analysed during the current study is now openly available in the Zenodo repository: https://zenodo.org/record/3457467#.YuyoTXZBxPa

Acknowledgements

We express our sincere gratitude to the MIDAS Research Laboratory, IIIT-Delhi, India, for providing us with the BHAAV dataset. The authors would also like to acknowledge the technical support of Writing Lab, Institute for the Future of Education, Tecnologico de Monterrey, Mexico, in the production of this work.

Funding

No funding was received for this research.

Author information

Corresponding author

Correspondence to Mehul Mahrishi.

Ethics declarations

Conflict of Interests

The authors declare that there are no conflicts of interest or competing interests in this research.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kumar, T., Mahrishi, M. & Sharma, G. Emotion recognition in Hindi text using multilingual BERT transformer. Multimed Tools Appl 82, 42373–42394 (2023). https://doi.org/10.1007/s11042-023-15150-1

  • DOI: https://doi.org/10.1007/s11042-023-15150-1
