A Model for Automated Affect Recognition on Smartphone-Cloud Architecture

  • Ying Su
  • Rajib Rana
  • Frank Whittaker
  • Jeffrey Soar
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9677)


This paper proposes a model for automated affect recognition on a smartphone-cloud architecture. While facial mood recognition is becoming more advanced, our contribution lies in the analysis and classification of voice to supplement mood recognition. The model builds on previous work by others and supplements it with new algorithms.


Keywords: Smart phone, Affect recognition



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Ying Su (1)
  • Rajib Rana (2)
  • Frank Whittaker (2)
  • Jeffrey Soar (2, corresponding author)

  1. Institute of Science and Technology for China, Beijing, China
  2. University of Southern Queensland, Springfield, Australia
