
Synthesizing facial expressions in dyadic human–robot interaction

  • Original Paper
  • Published in Signal, Image and Video Processing

Abstract

Generative artificial intelligence (GenAI) can create facial expressions for artificial human characters in real time, based on a training dataset. However, the bottleneck that prevents natural dyadic interaction between an artificial character and a human lies in GenAI's limited ability to recognize dynamically changing contexts. To tackle this issue, we investigated how deep learning (DL) techniques could synthesize facial reaction emotions from a sequence of preceding emotions. We applied action units from the Facial Action Coding System to manipulate the facial points of an artificial character inside Unreal Engine 4 using the OpenFace API. First, the artificial character's facial behavior was programmed to mimic human facial expressions shown on screen. Then, to generate adequate reaction emotions, we trained a DL model consisting of an autoencoder with a long short-term memory (LSTM) network. To validate the trained model, we compared its predicted reaction expressions with our test dataset using the average root-mean-square error. Furthermore, sixteen test participants rated the apparent naturalness of the character's reactions to dynamic human expressions. Our findings are promising steps toward developing facial reaction emotion synthesis into a dynamic system that can adapt to the user's specific needs and context.
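
As a concrete illustration of the pipeline described above, the sketch below shows one way an LSTM autoencoder could map a sequence of action-unit (AU) intensities, such as those reported by OpenFace, to a predicted sequence of reaction AUs, and evaluate the result with the average root-mean-square error. This is a minimal sketch under stated assumptions, not the authors' exact architecture: the Keras framework, the sequence length, the latent size, and the 17-AU input dimension are hypothetical choices made only for illustration.

    # Minimal sketch (hypothetical shapes and architecture): an LSTM autoencoder
    # that maps a sequence of one person's action-unit (AU) intensities to a
    # predicted sequence of the partner's reaction AUs.
    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    SEQ_LEN, N_AUS, LATENT = 30, 17, 64  # assumed sequence length, AU count, latent size

    model = keras.Sequential([
        keras.Input(shape=(SEQ_LEN, N_AUS)),
        layers.LSTM(LATENT),                          # encoder: compress the input AU sequence
        layers.RepeatVector(SEQ_LEN),                 # repeat the latent code for each output step
        layers.LSTM(LATENT, return_sequences=True),   # decoder: unroll the reaction sequence
        layers.TimeDistributed(layers.Dense(N_AUS)),  # per-frame reaction AU intensities
    ])
    model.compile(optimizer="adam", loss="mse")

    # Random arrays stand in for paired (expression, reaction) AU sequences.
    x = np.random.rand(8, SEQ_LEN, N_AUS).astype("float32")
    y = np.random.rand(8, SEQ_LEN, N_AUS).astype("float32")
    model.fit(x, y, epochs=1, verbose=0)

    # Average root-mean-square error between predicted and ground-truth reaction AUs.
    pred = model.predict(x, verbose=0)
    print("average RMSE:", float(np.sqrt(np.mean((pred - y) ** 2))))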


Availability of data and materials

No datasets were generated or analyzed during the current study.


Acknowledgements

This work was supported by the EU Mobilitas Pluss Grant (MOBTT90) and by the grant "Context-embedded social characters for virtual simulation - an enactive media approach to human-robot co-learning" (TF/1522) awarded to Dr. Pia Tikka, Tallinn University.

Funding

Not applicable

Author information


Contributions

Sham and Anbarjafari were involved in conceptualization, formal analysis, methodology, software, validation, visualization, writing—original draft, and writing—reviewing and editing. Tikka and Lamas were responsible for conceptualization, formal analysis, validation, writing—original draft, and writing—reviewing and editing. All authors reviewed the manuscript.

Corresponding author

Correspondence to Abdallah Hussein Sham.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethical approval

Not applicable

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Sham, A.H., Tikka, P., Lamas, D. et al. Synthesizing facial expressions in dyadic human–robot interaction. SIViP (2024). https://doi.org/10.1007/s11760-024-03202-4

