Abstract
We describe a preliminary experiment to track the emotions of actors and audience in a theater play through machine learning and AI. During a 40-minute play in Zurich, eight actors were equipped with body-sensing smartwatches. At the same time, the emotions of the audience were tracked anonymously using automatic facial emotion recognition. In parallel, the emotions in the actors' voices were assessed through automatic voice emotion recognition. This paper demonstrates a first fully automated and privacy-respecting system to measure both audience and actor satisfaction during a public performance.
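The paper's own pipeline is not reproduced on this page, but the voice channel it describes can be illustrated with a minimal sketch: summarize each utterance by MFCC statistics extracted with librosa and train an off-the-shelf classifier with scikit-learn. This is an illustrative assumption, not the authors' implementation; all file paths and labels below are hypothetical placeholders.

```python
# Minimal sketch of utterance-level voice emotion classification.
# Assumption: labeled WAV clips are available, e.g. from an acted-emotion
# corpus such as EMO-DB or RAVDESS; paths and labels are hypothetical.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def clip_features(path, sr=16000):
    """Summarize one audio clip as the mean and std of its 13 MFCCs."""
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical training pairs (file path, emotion label).
train = [("clips/happy_01.wav", "happy"),
         ("clips/angry_01.wav", "angry")]

X = np.stack([clip_features(p) for p, _ in train])
y = [label for _, label in train]

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Each utterance captured from the actors' microphones would then be
# segmented and scored the same way.
print(clf.predict([clip_features("clips/new_utterance.wav")]))
```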
Acknowledgements
The authors thank Samuel Schwarz and Garrick Lauterbach from Digitalbuehne.ch for providing the venue and supporting them during the performance. They are also grateful to Jannik Roessler for his invaluable support during data collection.
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The organizers and participants in the Zurich experiment gave their consent to be recorded.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Gloor, P.A., Araño, K.A., Guerrazzi, E. (2020). Measuring Audience and Actor Emotions at a Theater Play Through Automatic Emotion Recognition from Face, Speech, and Body Sensors. In: Przegalinska, A., Grippa, F., Gloor, P. (eds) Digital Transformation of Collaboration. COINs 2019. Springer Proceedings in Complexity. Springer, Cham. https://doi.org/10.1007/978-3-030-48993-9_3
DOI: https://doi.org/10.1007/978-3-030-48993-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-48992-2
Online ISBN: 978-3-030-48993-9
eBook Packages: Mathematics and Statistics (R0)