An effective approach for emotion detection in multimedia text data using sequence based convolutional neural network

Multimedia Tools and Applications

Abstract

The world has entered a multimedia era in which multimedia data supports business, recommendation systems, information retrieval, and similar applications. Multimedia data is rich in content that expresses different human emotions. Several issues in emotion detection from multimedia images and videos have been addressed in this domain, but far less effort has been devoted to text data. Deep learning has outperformed traditional techniques in sentiment analysis tasks. Inspired by the work done in the field of sentiment analysis, a deep learning based framework has been implemented on multimedia text data for the task of fine-grained emotion detection. The presented work introduces a new corpus, collected from a TV show’s transcript, that expresses different forms of emotion. The corpus was annotated manually with the help of expert English annotators. As an emotion detection framework, this paper proposes a sequence-based convolutional neural network (CNN) with word embeddings. An attention mechanism is applied in the proposed model, which allows the CNN to focus on the words that have more effect on the classification, that is, on the parts of the features that should be attended to more. The main aim of the work is to develop a framework that generalizes to newly collected data and that helps businesses understand their customers and monitor social media, since it gives an overview of the wider public opinion behind certain topics. Experiments conducted on the dataset show that the proposed framework detects emotions from text with good precision and accuracy.
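The pipeline named in the abstract (word embedding, convolution over the word sequence, attention over the resulting features, classification) can be illustrated with a small forward pass. This is only a sketch under stated assumptions, not the authors' model. The class name `AttentiveTextCNN`, the layer sizes, the random untrained weights, the example vocabulary, and the emotion labels are all hypothetical, and plain Python is used here in place of the Keras and TensorFlow tools listed in the notes.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class AttentiveTextCNN:
    """Toy forward pass: embedding -> 1D convolution -> attention pooling -> softmax.

    All sizes and weights are illustrative assumptions; nothing is trained.
    """

    def __init__(self, vocab, labels, embed_dim=8, n_filters=6, width=3, seed=0):
        rng = random.Random(seed)
        self.vocab = {w: i + 1 for i, w in enumerate(vocab)}  # id 0 = padding / unknown
        self.labels = labels
        self.width = width
        self.embed = rand_matrix(len(vocab) + 1, embed_dim, rng)
        self.embed[0] = [0.0] * embed_dim
        # Each filter spans `width` consecutive word vectors.
        self.filters = rand_matrix(n_filters, width * embed_dim, rng)
        self.attn = [rng.uniform(-0.5, 0.5) for _ in range(n_filters)]
        self.out = rand_matrix(len(labels), n_filters, rng)

    def forward(self, tokens):
        ids = [self.vocab.get(t.lower(), 0) for t in tokens]
        pad = self.width // 2
        ids = [0] * pad + ids + [0] * pad  # pad so every word gets one feature vector
        vecs = [self.embed[i] for i in ids]
        feats = []
        for pos in range(len(tokens)):
            window = [v for vec in vecs[pos:pos + self.width] for v in vec]
            feats.append([max(0.0, dot(f, window)) for f in self.filters])  # ReLU
        # Attention: one score per word position, normalised to sum to 1.
        weights = softmax([dot(self.attn, h) for h in feats])
        # Attention-weighted sum of the per-position convolution features.
        context = [sum(w * h[k] for w, h in zip(weights, feats))
                   for k in range(len(self.attn))]
        probs = softmax([dot(row, context) for row in self.out])
        return probs, weights


# Hypothetical label set and vocabulary, chosen only for the demonstration.
labels = ["joy", "sadness", "anger", "fear", "surprise", "neutral"]
model = AttentiveTextCNN(vocab=["i", "am", "so", "happy", "today"], labels=labels)
tokens = "I am so happy today".split()
probs, weights = model.forward(tokens)
for word, weight in zip(tokens, weights):
    print(f"{word:>6}  attention={weight:.3f}")
print("predicted:", labels[probs.index(max(probs))])
```

Because the weights are random, the predicted label carries no meaning. The point is the data flow: `weights` holds one attention value per input word, and `probs` holds one probability per emotion label. In a trained model the attention values would indicate which words contributed most to the classification.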

Notes

  1. https://keras.io/

  2. https://www.tensorflow.org/

  3. https://atom.io/

Author information

Corresponding author

Correspondence to Shishir Kumar.

About this article

Cite this article

Shrivastava, K., Kumar, S. & Jain, D.K. An effective approach for emotion detection in multimedia text data using sequence based convolutional neural network. Multimed Tools Appl 78, 29607–29639 (2019). https://doi.org/10.1007/s11042-019-07813-9

  • DOI: https://doi.org/10.1007/s11042-019-07813-9
