Recognition of Paralinguistic Information in Spoken Dialogue Systems for Elderly People

  • Humberto Pérez-Espinosa
  • Juan Martínez-Miranda
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9413)


Various strategies are currently being studied and applied to improve the acceptability and effective use of Ambient Assisted Living (AAL) applications. One of these strategies is the development of speech-based interfaces that facilitate communication between the system and the user. Beyond improving communication, the elder's voice can also be used to automatically classify paralinguistic phenomena associated with specific mental states and to assess the quality of the interaction between the system and the target user. This paper presents our initial work on constructing these classifiers using an existing spoken dialogue corpus. We report the performance obtained by our models on spoken dialogues from young and older users. We also discuss the further work needed to integrate these models effectively into AAL applications.


Keywords: Interactive systems, Speech analysis, Paralinguistic phenomena, Acoustic voice patterns



This research work has been carried out in the context of the “Cátedras CONACyT” programme funded by the Mexican National Research Council (CONACyT).



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Humberto Pérez-Espinosa (1)
  • Juan Martínez-Miranda (1)
  1. CONACYT Research Fellow, CICESE-UT3, Tepic, Mexico
