Abstract
Multimodal machine learning (MMML) is a vibrant multi-disciplinary research field that addresses some of the original goals of artificial intelligence by integrating and modelling multiple communicative modalities, including linguistic, acoustic and visual messages. When the goal is to better understand and model the behaviours of ageing individuals, the field poses unique challenges for multimodal researchers, given the heterogeneity of the data and the contingency often found between modalities. In this chapter, we identify four key challenges that must be addressed to enable multimodal machine learning for ageing individuals: (1) multimodality: the modelling task involves multiple relevant modalities that need to be represented, aligned and fused; (2) high variability: behaviour varies widely given the many social contexts, the large space of actions and possible physical or cognitive impairments; (3) sparse and noisy resources: sensory data are unreliable, and resources specific to the user group of ageing individuals are limited and sparse; and (4) concept drift, of which we identify two types, at the group level and at the individual level. Group-level drift arises because the target user group is not fully known when the corresponding interfaces are developed, since that group has yet to age; individual-level drift arises because ageing may shift behaviour and interaction preferences over time. These four challenges come together in an evaluation plan that also provides a strategy for including the broader machine learning community in this effort. This research agenda will enable more effective and robust modelling technologies as well as the development of socially competent and culture-aware embodied conversational agents for elderly care.
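To make the notion of individual-level concept drift concrete, the following is a minimal sketch, not a method taken from the chapter. It assumes a stream of binary prediction errors for one user (1 = the model was wrong, 0 = it was right) and flags drift when the error rate in a recent window exceeds the error rate in an initial reference window by a fixed margin. The class name `WindowDriftDetector`, the function `first_drift_index`, and the parameters `window_size` and `margin` are all hypothetical and chosen for illustration only.

```python
from collections import deque


class WindowDriftDetector:
    """Illustrative window-based drift check (hypothetical helper).

    The first `window_size` errors form a fixed reference window.
    Later errors fill a sliding recent window of the same size.
    """

    def __init__(self, window_size=50, margin=0.2):
        self.window_size = window_size
        self.margin = margin
        self.reference = []                      # fixed reference errors
        self.recent = deque(maxlen=window_size)  # sliding recent errors

    def update(self, error):
        """Record one outcome (1 = wrong, 0 = correct); return True on drift."""
        if len(self.reference) < self.window_size:
            self.reference.append(error)
            return False
        self.recent.append(error)
        if len(self.recent) < self.window_size:
            return False  # not enough recent evidence yet
        ref_rate = sum(self.reference) / self.window_size
        recent_rate = sum(self.recent) / self.window_size
        return recent_rate - ref_rate > self.margin


def first_drift_index(errors, window_size=50, margin=0.2):
    """Return the index of the first observation at which drift is flagged."""
    detector = WindowDriftDetector(window_size, margin)
    for i, error in enumerate(errors):
        if detector.update(error):
            return i
    return None


if __name__ == "__main__":
    # Synthetic stream: the model is always right, then always wrong.
    stream = [0] * 100 + [1] * 100
    print(first_drift_index(stream))
```

A fixed margin is the simplest possible choice. In practice the threshold would more plausibly come from a statistical test, and the reference window would be refreshed after the model is adapted to the user.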
Notes
1. A requirement for separate models is that the observed variability can be divided into distinct categories.
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Morency, LP., Sakti, S., Schuller, B.W., Ultes, S. (2021). Multimodal Machine Learning for Social Interaction with Ageing Individuals. In: Miehle, J., Minker, W., André, E., Yoshino, K. (eds) Multimodal Agents for Ageing and Multicultural Societies. Springer, Singapore. https://doi.org/10.1007/978-981-16-3476-5_3
DOI: https://doi.org/10.1007/978-981-16-3476-5_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-3475-8
Online ISBN: 978-981-16-3476-5
eBook Packages: Computer Science, Computer Science (R0)