Multimodal Machine Learning for Social Interaction with Ageing Individuals

Morency, Louis-Philippe; Sakti, Sakriani; Schuller, Björn W.; Ultes, Stefan

doi:10.1007/978-981-16-3476-5_3

Abstract

Multimodal machine learning (MMML) is a vibrant multi-disciplinary research field that addresses some of the original goals of artificial intelligence by integrating and modelling multiple communicative modalities, including linguistic, acoustic and visual messages. Applied to understanding and modelling the behaviours of ageing individuals, it poses unique challenges for multimodal researchers, given the heterogeneity of the data and the contingency often found between modalities. In this chapter, we identify four key challenges that must be addressed to enable multimodal machine learning for ageing individuals: (1) multimodality: the modelling task involves multiple relevant modalities, which need to be represented, aligned and fused; (2) high variability: behaviour varies widely given the many social contexts, the large space of actions and possible physical or cognitive impairments; (3) sparse and noisy resources: sensory data are unreliable, and resources specific to the user group of ageing individuals are limited and sparse; and (4) concept drift, of which we identify two types, at the group level and at the individual level. Group-level drift refers to the fact that the target user group is not fully known when the corresponding interfaces are developed, since it has yet to age; individual-level drift refers to the fact that ageing may cause behaviour and interaction preferences to drift as the ageing process continues. These four challenges come together in an evaluation plan that also provides a strategy for including the broader machine learning community in this effort. This research agenda will enable more effective and robust modelling technologies, as well as the development of socially competent and culture-aware embodied conversational agents for elderly care.

Notes

1. A requirement for separate models is that the observed variability can be divided into distinct categories.

Author information

Authors and Affiliations

Carnegie Mellon University, Pittsburgh, USA
Louis-Philippe Morency
Nara Institute of Science and Technology, Ikoma, Japan
Sakriani Sakti
Chair EIHW, University of Augsburg, Augsburg, Germany & GLAM, Imperial College London, London, UK
Björn W. Schuller
Mercedes-Benz AG, Sindelfingen, Germany
Stefan Ultes

Corresponding author

Correspondence to Stefan Ultes.

Editor information

Editors and Affiliations

Institute of Communications Engineering, University of Ulm, Ulm, Germany
Juliana Miehle
Institute of Communications Engineering, University of Ulm, Ulm, Germany
Wolfgang Minker
Human-Centered Multimedia, University of Augsburg, Augsburg, Germany
Elisabeth André
Nara Institute of Science and Technology, Ikoma, Japan
Koichiro Yoshino

About this chapter

Cite this chapter

Morency, LP., Sakti, S., Schuller, B.W., Ultes, S. (2021). Multimodal Machine Learning for Social Interaction with Ageing Individuals. In: Miehle, J., Minker, W., André, E., Yoshino, K. (eds) Multimodal Agents for Ageing and Multicultural Societies. Springer, Singapore. https://doi.org/10.1007/978-981-16-3476-5_3

DOI: https://doi.org/10.1007/978-981-16-3476-5_3
Published: 10 October 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-3475-8
Online ISBN: 978-981-16-3476-5
eBook Packages: Computer Science, Computer Science (R0)
