Context-aware personality estimation and emotion recognition in social interaction

  • Original article
The Visual Computer

Abstract

Personality and emotion are intrinsic factors that often strongly influence how people's behavior is understood. In computer vision, emotion recognition has been studied extensively, for example by classifying a person's emotions from facial expressions, whereas personality estimation has received comparatively little attention. Personality, as a long-term characteristic pattern of behavior, influences how a person's emotions are generated. In this paper, we present a new method to analyze and estimate personality and emotions in dyadic and multiparty social interactions. We first propose a context-aware deep learning framework that automatically estimates the personality of a target person from the body behavior and facial information of both the target person and the interlocutor, as recorded during the interaction. We then extend this architecture into a method that jointly estimates personality and recognizes emotions. Experiments on two datasets show that the proposed method performs well in both personality estimation and emotion recognition.
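
As a rough illustration of the idea summarized in the abstract, the sketch below fuses body and face features from both the target person and the interlocutor, then feeds the fused vector to two output heads: one for personality traits and one for emotion classes. It is not the authors' architecture. The layer sizes, the random untrained weights, the Big Five trait names, the emotion label set, and every identifier (`JointEstimator`, `init_layer`, and so on) are assumptions made purely for illustration.

```python
import math
import random

# Assumed label sets, not taken from the paper.
TRAITS = ["openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism"]
EMOTIONS = ["happy", "sad", "angry", "neutral"]


def init_layer(n_out, n_in, rng):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


class JointEstimator:
    """Hypothetical two-head model over target and interlocutor features."""

    def __init__(self, feat_dim, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        # Four input streams: target body, target face,
        # interlocutor body, interlocutor face.
        self.shared = init_layer(hidden_dim, 4 * feat_dim, rng)
        self.trait_head = init_layer(len(TRAITS), hidden_dim, rng)
        self.emotion_head = init_layer(len(EMOTIONS), hidden_dim, rng)

    def forward(self, target_body, target_face, other_body, other_face):
        # Context-aware fusion by simple concatenation (an assumption).
        fused = target_body + target_face + other_body + other_face
        hidden = [max(0.0, h) for h in linear(fused, *self.shared)]  # ReLU
        traits = [sigmoid(v) for v in linear(hidden, *self.trait_head)]
        emotions = softmax(linear(hidden, *self.emotion_head))
        return dict(zip(TRAITS, traits)), dict(zip(EMOTIONS, emotions))
```

Calling `JointEstimator(feat_dim=3).forward(f, f, f, f)` with a three-element feature list `f` returns one score in (0, 1) per trait and a probability distribution over the emotion labels. Sharing the hidden layer between the two heads is one simple way to let the tasks inform each other; the paper may couple them differently.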

Data Availability

The two datasets used in this study, "MHHRI" and "MUMBAI", are available from Celiktutan et al. [6] and Doyran et al. [8], respectively. Restrictions apply to the availability of these data, which were used under licence for the current study, so they are not publicly available. Data may, however, be available from the authors upon reasonable request.

References

  1. Arriaga, O., Valdenegro-Toro, M., Plöger, P.: Real-time Convolutional Neural Networks for Emotion and Gender Classification. arXiv:1710.07557 (2017)

  2. Ashton, M.C., Lee, K.: The HEXACO-60: a short measure of the major dimensions of personality. J. Personal. Assess. 91(4), 340–345 (2009)

  3. Baltrusaitis, T., Zadeh, A., Lim, Y.C., Morency, L.P.: OpenFace 2.0: Facial Behavior Analysis Toolkit. In: Proceedings of the IEEE International Conference on Automatic Face & Gesture Recognition (2018)

  4. Burgoon, J.K., Guerrero, L.K., Floyd, K.: Nonverbal Communication. Routledge, Oxford (2010)

  5. Carreira, J., Zisserman, A.: Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)

  6. Celiktutan, O., Skordos, E., Gunes, H.: Multimodal Human-Human-Robot Interactions (MHHRI) Dataset for Studying Personality and Engagement. In: IEEE Transactions on Affective Computing (2019)

  7. Curto, D., Clapés, A., Selva, J., Smeureanu, S., Junior, J.C.S.J., Gallardo-Pujol, D., Guilera, G., Leiva, D., Moeslund, T.B., Escalera, S., Palmero, C.: Dyadformer: A Multi-Modal Transformer for Long-Range Modeling of Dyadic Interactions. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops (2021)

  8. Doyran, M., Schimmel, A., Baki, P., Ergin, K., Türkmen, B., Salah, A.A., Bakkes, S.C.J., Kaya, H., Poppe, R., Salah, A.A.: MUMBAI: multi-Person multimodal board game affect and interaction analysis dataset. J. Multimod. User Interfaces (2021). https://doi.org/10.1007/s12193-021-00364-0

  9. Ekman, P.: Universal Emotions. https://www.paulekman.com/universal-emotions/ (2022)

  10. Ekman, P., Friesen, W.V., Ellsworth, P.: Emotion in the Human Face: Guidelines for Research and an Integration of Findings. Elsevier, Heidelberg (2013)

  11. Eleftheriadis, S., Rudovic, O., Pantic, M.: Joint facial action unit detection and feature fusion: a multi-conditional learning approach. IEEE Trans. Image Process. 25(12), 5727–5742 (2016). https://doi.org/10.1109/TIP.2016.2615288

  12. Fabian Benitez-Quiroz, C., Srinivasan, R., Martinez, A.M.: EmotioNet: an accurate, real-time algorithm for the automatic annotation of a million facial expressions in the wild. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)

  13. Ge, Z., Liu, S., Wang, F., Li, Z., Sun, J.: YOLOX: Exceeding YOLO Series in 2021. arXiv:2107.08430 (2021)

  14. Gross, J.J., Feldman Barrett, L.: Emotion generation and emotion regulation: one or two depends on your point of view. Emot. Rev. 3(1), 8–16 (2011). https://doi.org/10.1177/1754073910380974

  15. Güçlütürk, Y., Güçlü, U., van Gerven, M.A.J., van Lier, R.: Deep Impression: Audiovisual Deep Residual Networks for Multimodal Apparent Personality Trait Recognition. In: Computer Vision – ECCV (2016)

  16. Gürpinar, F., Kaya, H., Salah, A.A.: Multimodal Fusion of Audio, Scene, and Face Features for First Impression Estimation. In: Proceedings of the International Conference on Pattern Recognition (2016)

  17. Hall, J.A., Knapp, M.L.: Nonverbal Communication. Walter de Gruyter, Berlin (2013)

  18. Hara, K., Kataoka, H., Satoh, Y.: Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)

  19. He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)

  20. Huang, G.B., Zhou, H., Ding, X., Zhang, R.: Extreme learning machine for regression and multiclass classification. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 42(2), 513–529 (2012). https://doi.org/10.1109/TSMCB.2011.2168604

  21. Jacques Junior, J.C.S., Güçlütürk, Y., Pérez, M., Güçlü, U., Andujar, C., Baró, X., Escalante, H.J., Guyon, I., van Gerven, M.A.J., van Lier, R., Escalera, S.: First Impressions: A Survey on Vision-Based Apparent Personality Trait Analysis. In: IEEE Transactions on Affective Computing (2022)

  22. Jin, H., Song, Q., Hu, X.: Auto-Keras: An Efficient Neural Architecture Search System. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1946–1956 (2019)

  23. Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., et al.: The Kinetics Human Action Video Dataset. arXiv:1705.06950 (2017)

  24. Kosti, R., Alvarez, J.M., Recasens, A., Lapedriza, A.: Emotion Recognition in Context. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1667–1675 (2017)

  25. Lee, J., Kim, S., Kim, S., Park, J., Sohn, K.: Context-Aware Emotion Recognition Networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10143–10152 (2019)

  26. Li, Y., Zeng, J., Shan, S., Chen, X.: Occlusion aware facial expression recognition using CNN with attention mechanism. IEEE Trans. Image Process. 28(5), 2439–2450 (2019). https://doi.org/10.1109/TIP.2018.2886767

  27. McCrae, R.R., John, O.P.: An introduction to the five-factor model and its applications. J. Personal. 60(2), 175–215 (1992)

  28. Mehrabian, A.: Basic Dimensions for a General Psychological Theory: Implications for Personality, Social, Environmental, and Developmental Studies. Oelgeschlager, Cambridge (1980)

  29. Mittal, T., Guhan, P., Bhattacharya, U., Chandra, R., Bera, A., Manocha, D.: EmotiCon: Context-Aware Multimodal Emotion Recognition Using Frege’s Principle. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 14234–14243 (2020)

  30. Mollahosseini, A., Hasani, B., Mahoor, M.H.: AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild. In: IEEE Transactions on Affective Computing (2017)

  31. Nicolaou, M.A., Gunes, H., Pantic, M.: Continuous prediction of spontaneous affect from multiple cues and modalities in valence-arousal space. IEEE Trans. Affect. Comput. 2(2), 92–105 (2011)

  32. Palmero, C., Selva, J., Smeureanu, S., Junior, J.C.S.J., Clapes, A., Mosegui, A., Zhang, Z., Gallardo, D., Guilera, G., Leiva, D., Escalera, S.: Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops (2021)

  33. Passini, F.T., Norman, W.T.: A universal conception of personality structure? J. Pers. Soc. Psychol. 4(1), 44 (1966)

  34. Patterson, M.L.: A systems model of dyadic nonverbal interaction. J. Nonverbal Behav. 43(2), 111–132 (2019)

  35. Plutchik, R.: A Psychoevolutionary theory of emotions. Soc. Sci. Inf. 21(4–5), 529–553 (1982). https://doi.org/10.1177/053901882021004003

  36. Ponce-López, V., Chen, B., Oliu, M., Corneanu, C., Clapés, A., Guyon, I., Baró, X., Escalante, H.J., Escalera, S.: ChaLearn LAP 2016: First Round Challenge on First Impressions - Dataset and Results. In: Computer Vision – ECCV Workshops (2016)

  37. Rammstedt, B., John, O.P.: Measuring personality in one minute or less: A 10-item short version of the Big Five Inventory in English and German. J. Res. Personal. 41(1), 203–212 (2007)

  38. Romeo, M., Hernandez Garcia, D., Han, T., Cangelosi, A., Jokinen, K.: Predicting apparent personality from body language: benchmarking deep learning architectures for adaptive social human-robot interaction. Adv. Robotics 35(19), 1167–1179 (2021)

  39. Russell, J.A.: A circumplex model of affect. J. Pers. Soc. Psychol. 39(6), 1161 (1980)

  40. Salam, H., Celiktutan, O., Hupont, I., Gunes, H., Chetouani, M.: Fully automatic analysis of engagement and its relationship to personality in human-robot interactions. IEEE Access 5, 705–721 (2016)

  41. Schindler, K., Van Gool, L., De Gelder, B.: Recognizing emotions expressed by body pose: a biologically inspired neural model. Neural Netw. 21(9), 1238–1246 (2008)

  42. Shao, Z., Song, S., Jaiswal, S., Shen, L., Valstar, M., Gunes, H.: Personality Recognition by Modelling Person-specific Cognitive Processes Using Graph Representation. In: Proceedings of the ACM International Conference on Multimedia (2021)

  43. She, J., Hu, Y., Shi, H., Wang, J., Shen, Q., Mei, T.: Dive into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty Estimation for Facial Expression Recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021)

  44. Subramaniam, A., Patel, V., Mishra, A., Balasubramanian, P., Mittal, A.: Bi-modal First Impressions Recognition Using Temporally Ordered Deep Audio and Stochastic Visual Features. In: Computer Vision – ECCV (2016)

  45. Tellamekala, M.K., Giesbrecht, T., Valstar, M.: Apparent Personality Recognition from Uncertainty-Aware Facial Emotion Predictions Using Conditional Latent Variable Models. In: Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition (2021)

  46. Tran, D., Bourdev, L., Fergus, R., Torresani, L., Paluri, M.: Learning Spatiotemporal Features With 3D Convolutional Networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015)

  47. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention Is All You Need. Adv. Neural Inform. Process. Syst. (2017)

  48. Vinciarelli, A., Mohammadi, G.: A survey of personality computing. IEEE Trans. Affect. Comput. 5(3), 273–291 (2014)

  49. Wang, X., Girshick, R., Gupta, A., He, K.: Non-Local Neural Networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)

  50. Watson, D.: Strangers’ ratings of the five robust personality factors: evidence of a surprising convergence with self-report. J. Pers. Soc. Psychol. 57(1), 120 (1989)

  51. Zhang, C.L., Zhang, H., Wei, X.S., Wu, J.: Deep Bimodal Regression for Apparent Personality Analysis. In: Computer Vision – ECCV (2016)

  52. Zhang, L., Peng, S., Winkler, S.: PersEmoN: a deep network for joint analysis of apparent personality, emotion and their relationship. IEEE Trans. Affect. Comput. 13(1), 298–305 (2019)

  53. Zhang, Z., Zheng, J., Thalmann, N.M.: Real and apparent personality prediction in human-human interaction. In: 2022 International Conference on Cyberworlds (CW), pp. 187–194. IEEE (2022)

Acknowledgements

We thank the authors of MHHRI [6] and the authors of MUMBAI [8] for allowing us to use their datasets in our research under the user agreement and licence. The work is partially supported by MOE AcRF Tier 1 Grant of Singapore (RG12/22).

Author information

Corresponding author

Correspondence to Jianmin Zheng.

Ethics declarations

Conflict of interest

Author Professor Jianmin Zheng is an Editorial Board Member of The Visual Computer, and author Professor Nadia Magnenat Thalmann is the Editor-in-Chief of The Visual Computer.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, Z., Zheng, J. & Thalmann, N.M. Context-aware personality estimation and emotion recognition in social interaction. Vis Comput 40, 5123–5137 (2024). https://doi.org/10.1007/s00371-023-02862-6

  • DOI: https://doi.org/10.1007/s00371-023-02862-6
