Memetic Computing

Volume 10, Issue 1, pp 3–13

Ensemble application of ELM and GPU for real-time multimodal sentiment analysis

  • Ha-Nguyen Tran
  • Erik Cambria
Regular Research Paper


Abstract

The enormous number of videos posted every day on multimedia websites such as Facebook and YouTube makes the Internet an infinite source of information. Collecting and processing such information, however, is a very challenging task, as it involves dealing with a huge amount of data that changes at very high speed. To this end, we leverage the processing speed of extreme learning machines (ELM) and the graphics processing unit (GPU) to overcome the limitations of standard learning algorithms and the central processing unit (CPU) and, hence, perform real-time multimodal sentiment analysis, i.e., harvesting sentiments from web videos by taking into account audio, visual and textual modalities as sources of information. For sentiment classification, we leverage sentic memes, i.e., basic units of sentiment whose combination can potentially describe the full range of emotional experiences that are rooted in any of us, including different degrees of polarity. We use both feature-level and decision-level fusion methods to fuse the information extracted from the different modalities. Using the sentiment-annotated dataset generated from YouTube video reviews, our proposed multimodal system is shown to achieve an accuracy of 78%. In terms of processing speed, our method shows improvements of several orders of magnitude for feature extraction compared to CPU-based counterparts.
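The two core ingredients of the abstract, closed-form ELM training and decision-level fusion of per-modality scores, can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the random data, hidden-layer size, and equal fusion weights are assumptions made purely for the demo.

```python
import numpy as np

def elm_train(X, T, n_hidden=100, seed=0):
    """Single-hidden-layer ELM: input weights and biases are drawn at
    random and never updated; only the output weights are learned, in
    closed form via the Moore-Penrose pseudoinverse."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random biases
    H = np.tanh(X @ W + b)                           # hidden-layer activations
    beta = np.linalg.pinv(H) @ T                     # least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    # Linear readout on the fixed random hidden layer
    return np.tanh(X @ W + b) @ beta

# Toy demo: two modalities, decision-level fusion by averaging the
# per-modality scores (equal weights are an illustrative assumption).
rng = np.random.default_rng(1)
X_audio = rng.standard_normal((200, 20))   # stand-in audio features
X_text = rng.standard_normal((200, 50))    # stand-in textual features
y = (X_audio[:, 0] + X_text[:, 0] > 0).astype(float).reshape(-1, 1)

model_audio = elm_train(X_audio, y)
model_text = elm_train(X_text, y)
score = 0.5 * elm_predict(X_audio, *model_audio) \
      + 0.5 * elm_predict(X_text, *model_text)
pred = (score > 0.5).astype(float)
acc = (pred == y).mean()
```

Because no iterative optimization is involved, training reduces to one matrix product and one pseudoinverse, which is what makes ELM (and its GPU implementation) attractive for the real-time setting described above.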


Keywords

Multimodal sentiment analysis · Opinion mining · Multimodal fusion · GPGPU · Sentic computing

Copyright information

© Springer-Verlag Berlin Heidelberg 2017

Authors and Affiliations

  1. School of Computer Science and Engineering, Nanyang Technological University, Singapore