Speech Input from Older Users in Smart Environments: Challenges and Perspectives

  • Ravichander Vipperla
  • Maria Wolters
  • Kallirroi Georgila
  • Steve Renals
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5615)


Although older people are an important user group for smart environments, there has been relatively little work on adapting natural language interfaces to their requirements. In this paper, we focus on a particularly thorny problem: processing speech input from older users. Our experiments on the MATCH corpus show clearly that age-specific adaptation is needed to recognize older users’ speech reliably. Language models need to cover typical interaction patterns of older people, and acoustic models need to accommodate older voices. Further research is needed into intelligent adaptation techniques that allow existing large, robust systems to be adapted using relatively small amounts of in-domain, age-appropriate data. In addition, older users need to be supported with adequate strategies for handling speech recognition errors.
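
The abstract does not say which adaptation technique is meant. As an illustration only, one widely used way to adapt an acoustic model with a small amount of data is maximum a posteriori (MAP) adaptation of Gaussian means, where the adapted mean is a count-weighted blend of the prior mean and the new data. The sketch below is a simplification under stated assumptions: a single Gaussian, every adaptation frame assigned to it with weight 1, and a hypothetical function name `map_adapt_mean` with a hypothetical prior weight `tau`. It is not the authors' system.

```python
import numpy as np


def map_adapt_mean(prior_mean, frames, tau=10.0):
    """MAP-adapt one Gaussian mean (illustrative sketch, hard assignment).

    prior_mean: mean from the existing large model, shape (d,)
    frames:     adaptation feature vectors assumed to belong to this
                Gaussian, shape (n, d); n may be 0
    tau:        prior weight (assumed hyperparameter, must be > 0);
                larger values keep the result closer to prior_mean
    """
    prior_mean = np.asarray(prior_mean, dtype=float)
    frames = np.asarray(frames, dtype=float).reshape(-1, prior_mean.shape[0])
    n = frames.shape[0]
    # Blend: with few frames the prior dominates, with many frames the
    # result approaches the sample mean of the adaptation data.
    return (tau * prior_mean + frames.sum(axis=0)) / (tau + n)


if __name__ == "__main__":
    prior = [0.0, 0.0]
    older_speaker_frames = [[2.0, 2.0], [4.0, 4.0]]
    print(map_adapt_mean(prior, older_speaker_frames, tau=2.0))
```

With only two adaptation frames and `tau=2.0`, the adapted mean lands halfway between the prior mean and the sample mean of the new data, which is the behaviour that makes this kind of update attractive when in-domain data is scarce.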


Keywords (machine-generated, not supplied by the authors): Speech Recognition; Language Model; Automatic Speech Recognition; Smart Home; Acoustic Model





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Ravichander Vipperla (1)
  • Maria Wolters (1)
  • Kallirroi Georgila (2)
  • Steve Renals (1)
  1. Centre for Speech Technology Research, School of Informatics, University of Edinburgh, Scotland
  2. Institute for Creative Technologies, University of Southern California, USA
