Mobile Networks and Applications, Volume 21, Issue 5, pp 753–763

Audio-Visual Emotion Recognition Using Big Data Towards 5G

  • M. Shamim Hossain
  • Ghulam Muhammad
  • Mohammed F. Alhamid
  • Biao Song
  • Khaled Al-Mutib


With the advent of next-generation mobile communication technologies (5G), mobile users could gain access to big data processing across different clouds and networks. The growing number of mobile users brings additional expectations for personalized services (e.g., social networking, smart home, health monitoring) at any time, from anywhere, and through any means of connectivity. Because such services and networks are expected to generate massive amounts of complex data from multiple heterogeneous sources, an infrastructure is required that recognizes a user’s sentiments (e.g., emotion) and behavioral patterns in order to provide a high-quality mobile user experience. To this end, this paper proposes an infrastructure that combines the potential of emotion-aware big data and cloud technology towards 5G. Within this infrastructure, a bimodal big data emotion recognition system is proposed, where the modalities are speech and face video. Experimental results show that the proposed approach achieves 83.10% emotion recognition accuracy using bimodal inputs. To show the suitability and validity of the proposed approach, Hadoop-based distributed processing is used to speed up processing for heterogeneous mobile clients.


Keywords: Emotion recognition, Weber local descriptor, Big data, 5G



The authors extend their appreciation to the Deanship of Scientific Research at King Saud University, Riyadh, Saudi Arabia for funding this work through the research group project no. RGP-1436-023.



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • M. Shamim Hossain (1), corresponding author
  • Ghulam Muhammad (2)
  • Mohammed F. Alhamid (1)
  • Biao Song (3)
  • Khaled Al-Mutib (1)

  1. Software Engineering Department, College of Computer and Information Sciences (CCIS), King Saud University, Riyadh, Kingdom of Saudi Arabia
  2. Computer Engineering Department, College of Computer and Information Sciences (CCIS), King Saud University, Riyadh, Kingdom of Saudi Arabia
  3. Information System Department, College of Computer and Information Sciences (CCIS), King Saud University, Riyadh, Kingdom of Saudi Arabia
