Towards IMACA: Intelligent Multimodal Affective Conversational Agent

  • Amir Hussain
  • Erik Cambria
  • Thomas Mazzocco
  • Marco Grassi
  • Qiu-Feng Wang
  • Tariq Durrani
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7663)


Multimodality is a key aspect of achieving natural interaction with machines. Besides verbal communication, humans also interact through many other channels, e.g., facial expressions, gestures, eye contact, posture, and tone of voice. Such channels convey not only semantics but also emotional cues that are essential for interpreting the transmitted message. The ability to manage affective information properly is increasingly recognised as fundamental to the development of a new generation of emotion-aware applications in scenarios such as e-learning, e-health, and human-computer interaction. To this end, this work investigates the adoption of different paradigms in the fields of text, vocal, and video analysis, in order to lay the basis for the development of an intelligent multimodal affective conversational agent.


AI, HCI, Multimodal Sentiment Analysis




Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Amir Hussain (1)
  • Erik Cambria (2)
  • Thomas Mazzocco (1)
  • Marco Grassi (3)
  • Qiu-Feng Wang (4)
  • Tariq Durrani (5)
  1. Dept. of Computing Science and Mathematics, University of Stirling, UK
  2. Temasek Laboratories, National University of Singapore, Singapore
  3. Dept. of Information Engineering, Università Politecnica delle Marche, Italy
  4. National Laboratory of Pattern Recognition, Chinese Academy of Sciences, P.R. China
  5. Dept. of Electronic and Electrical Engineering, University of Strathclyde, UK
