Abstract
We describe a preliminary experiment to track the emotions of actors and audience in a theater play through machine learning and AI. During a 40-minute play in Zurich, eight actors were equipped with body-sensing smartwatches. At the same time, the emotions of the audience were tracked anonymously using automatic facial emotion recognition. In parallel, the emotions in the actors' voices were assessed through automatic voice emotion recognition. This paper demonstrates a first fully automated and privacy-respecting system to measure both audience and actor satisfaction during a public performance.
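The paper's own pipeline is not reproduced on this page, but the voice channel it describes can be illustrated with a minimal sketch: summarize each utterance by MFCC statistics extracted with librosa and train an off-the-shelf classifier with scikit-learn. This is an illustrative assumption, not the authors' implementation; all file paths and labels below are hypothetical placeholders.

```python
# Minimal sketch of utterance-level voice emotion classification.
# Assumption: labeled WAV clips are available, e.g. from an acted-emotion
# corpus such as EMO-DB or RAVDESS; paths and labels are hypothetical.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def clip_features(path, sr=16000):
    """Summarize one audio clip as the mean and std of its 13 MFCCs."""
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical training pairs (file path, emotion label).
train = [("clips/happy_01.wav", "happy"),
         ("clips/angry_01.wav", "angry")]

X = np.stack([clip_features(p) for p, _ in train])
y = [label for _, label in train]

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Each utterance captured from the actors' microphones would then be
# segmented and scored the same way.
print(clf.predict([clip_features("clips/new_utterance.wav")]))
```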
Acknowledgements
The authors thank Samuel Schwarz and Garrick Lauterbach from Digitalbuehne.ch for providing the venue and supporting them during the performance. They are also grateful to Jannik Roessler for his invaluable support during data collection.
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The organizers and participants in the Zurich experiment gave their consent to be recorded.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Gloor, P.A., Araño, K.A., Guerrazzi, E. (2020). Measuring Audience and Actor Emotions at a Theater Play Through Automatic Emotion Recognition from Face, Speech, and Body Sensors. In: Przegalinska, A., Grippa, F., Gloor, P. (eds) Digital Transformation of Collaboration. COINs 2019. Springer Proceedings in Complexity. Springer, Cham. https://doi.org/10.1007/978-3-030-48993-9_3
DOI: https://doi.org/10.1007/978-3-030-48993-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-48992-2
Online ISBN: 978-3-030-48993-9
eBook Packages: Mathematics and Statistics (R0)