Multimodal Machine Learning for Social Interaction with Ageing Individuals

Multimodal Agents for Ageing and Multicultural Societies

Abstract

Multimodal machine learning (MMML) is a vibrant multidisciplinary research field that addresses some of the original goals of artificial intelligence by integrating and modelling multiple communicative modalities, including linguistic, acoustic and visual messages. When the goal is to better understand and model the behaviour of ageing individuals, the field poses unique challenges for multimodal researchers, given the heterogeneity of the data and the contingency often found between modalities. In this chapter, we identify four key challenges that must be addressed to enable multimodal machine learning for ageing individuals: (1) multimodality: the modelling task involves multiple relevant modalities, which need to be represented, aligned and fused; (2) high variability: behaviour varies widely across the many social contexts, the large space of actions and possible physical or cognitive impairments; (3) sparse and noisy resources: sensory data are unreliable, and resources specific to the user group of ageing individuals are limited and sparse; and (4) concept drift, of which we identify two types: drift at the group level, where the target user group is not fully known when the corresponding interfaces are developed because it has yet to age, and drift at the individual level, where ageing may cause behaviour and interaction preferences to change over time. These four challenges come together in an evaluation plan that also provides a strategy for including the broader machine learning community in this effort. This research agenda will enable more effective and robust modelling technologies, as well as the development of socially competent and culture-aware embodied conversational agents for elderly care.

Notes

  1. A requirement for separate models is that the observed variability can be divided into distinct categories.

Author information

Corresponding author

Correspondence to Stefan Ultes.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Morency, LP., Sakti, S., Schuller, B.W., Ultes, S. (2021). Multimodal Machine Learning for Social Interaction with Ageing Individuals. In: Miehle, J., Minker, W., André, E., Yoshino, K. (eds) Multimodal Agents for Ageing and Multicultural Societies. Springer, Singapore. https://doi.org/10.1007/978-981-16-3476-5_3

  • DOI: https://doi.org/10.1007/978-981-16-3476-5_3

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-3475-8

  • Online ISBN: 978-981-16-3476-5

  • eBook Packages: Computer Science, Computer Science (R0)
