Automatic detection of cognitive impairment in elderly people using an entertainment chatbot with Natural Language Processing capabilities

de Arriba-Pérez, Francisco; García-Méndez, Silvia; González-Castaño, Francisco J.; Costa-Montenegro, Enrique

doi:10.1007/s12652-022-03849-2

Original Research
Open access
Published: 29 April 2022

Volume 14, pages 16283–16298, (2023)

Journal of Ambient Intelligence and Humanized Computing

Francisco de Arriba-Pérez,
Silvia García-Méndez (ORCID: orcid.org/0000-0003-0533-1303),
Francisco J. González-Castaño &
Enrique Costa-Montenegro

Abstract

Previous researchers have proposed intelligent systems for therapeutic monitoring of cognitive impairments. However, most existing practical approaches for this purpose are based on manual tests. This raises issues such as excessive caretaking effort and the white-coat effect. To avoid these issues, we present an intelligent conversational system for entertaining elderly people with news of their interest that monitors cognitive impairment transparently. Automatic chatbot dialogue stages allow assessing content description skills and detecting cognitive impairment with Machine Learning algorithms. We create these dialogue flows automatically from updated news items using Natural Language Generation techniques. The system also infers the gold standard of the answers to the questions, so it can assess cognitive capabilities automatically by comparing these answers with the user responses. It employs a similarity metric with values in [0, 1], in increasing level of similarity. To evaluate the performance and usability of our approach, we have conducted field tests with a test group of 30 elderly people in the earliest stages of dementia, under the supervision of gerontologists. In the experiments, we have analysed the effect of stress and concentration in these users. Those without cognitive impairment performed up to five times better. In particular, the similarity metric varied between 0.03, for stressed and unfocused participants, and 0.36, for relaxed and focused users. Finally, we developed a Machine Learning algorithm based on textual analysis features for automatic cognitive impairment detection, which attained accuracy, F-measure and recall levels above 80%. We have thus validated the automatic approach to detect cognitive impairment in elderly people based on entertainment content. The results suggest that the solution has strong potential for long-term user-friendly therapeutic monitoring of elderly people.

1 Introduction

The United Nations has reported (World Population Prospects report^{Footnote 1}) that 9% of the world population is over 65 years old, and this percentage will reach 16% in 30 years. The population older than 80 is growing even faster, and it is expected to reach 450 million by 2050. This urges society to find innovative solutions to improve the living conditions of our elders, especially of those who live alone (Callahan et al. 2014; Hancock et al. 2006).

A major threat to the quality of life of elderly people is the high prevalence of cognitive impairment disorders. Most affected people are over 65 years old; they numbered 50 million in 2015, and this figure is expected to triple by 2050 (Livingston et al. 2017). Regular screening for detecting early symptoms and monitoring the progression of these disorders has been considered beneficial for treatment planning and patient autonomy (Borson et al. 2013). However, it has also been observed that cognitive impairment assessment in primary care systems is inefficient (Löppönen et al. 2003; Boise et al. 2004).

As discussed in Sect. 2, even though existing telecare set-top-boxes and gateways have increasingly intelligent capabilities, they still do not communicate autonomously with the elderly using natural language. This is also the case of the solutions for cognitive evaluation, which mostly rely on sets of predefined manual tests.

Given the industrial gap, and as demonstrated by the analysis of the state of the art in Sect. 2, we propose a novel conversational system for entertainment and therapeutic monitoring of elderly people that relies on Natural Language Processing (NLP) techniques and Machine Learning for empathetic chatbot behaviour generation and user-transparent automatic assessment.

From the perspective of the target users, the elderly, the main priority of any information system should be alleviating loneliness, whether the system has embedded cognitive monitoring capabilities or not. Accordingly, we want our solution to be perceived as a friendly intelligent assistant to access Internet media, that is, a conversational system that reads news. These will be interspersed with brief dialogues to subtly guide the users through a series of questions to gather their interests and evaluate their understanding of the information they have just consumed, which includes word category understanding and short-term memory, to evaluate cognitive impairment (Loewenstein et al. 2004; Crocco et al. 2014).

Conversational systems seem an adequate approach for this purpose. These software programs allow for human interaction with a machine using written or spoken natural language (Shawar and Atwell 2007). Ideally, the dialogue should be empathetic (Fung et al. 2018; Rashkin et al. 2019), a still distant goal even after the recent advances in Artificial Intelligence (AI) and in Natural Language Generation (NLG) techniques. Before virtual companions (Shum et al. 2018) become a reality, entertainment through interesting information will be more feasible. In this vein, it is well known that elders feel accompanied by simply listening to news in the background (Östlund 2010).

In more detail, our system reads recent news items and generates questions about them. At the same time, in order to automatically evaluate cognitive capabilities, the system measures the similarity between user answers and a gold standard (Yang and Powers 2005; Corley and Mihalcea 2005; Li et al. 2006; Feng et al. 2008) that is automatically generated from the news. To validate our approach, we have defined an answer similarity metric and we have performed tests on a sample of 30 patients of Asociación de Familiares de enfermos de Alzheimer y otras demencias de Galicia (AFAGA,^{Footnote 2} the Galician Association of Relatives of Patients with Alzheimer’s and other Dementias). This public association seeks to improve the quality of life of Alzheimer’s patients, provides guidance and information to relatives and to the public, and makes society aware of this reality to achieve a broader and more effective response. It collaborates actively in research on cognitive impairments.

The rest of this paper is organised as follows. Section 2 reviews related work and lists our contributions. Section 3 describes our conversational system for entertainment and user-transparent cognitive assessment. Section 4 presents our case study and the results of our word similarity approach to automatically evaluate cognitive impairment. Finally, Sect. 5 concludes the paper.

2 Related work

The design of intelligent systems (Wilamowski and Irwin 2015) is a relevant research field, including for example industrial (Irani and Kamal 2014; Ngai et al. 2014), military (Ma’sum et al. 2013; Yoo et al. 2014) and social (Magnisalis et al. 2011; Bernardini et al. 2014; Adamson et al. 2019) applications. In the telecare domain, personal assistants (Matsuyama et al. 2016; López et al. 2018) and artificial companionship (Chumkamon et al. 2016; Abdollahi et al. 2017) are relevant.

Regarding the automatic detection of health conditions, there exists a wealth of work based on Machine Learning, such as Ghoneim et al. (2018), on a smart healthcare framework to detect medical data tampering; Sedik et al. (2021), on the analysis of the outbreak of COVID-19 disease; Ahmed et al. (2021), on an unsupervised Machine Learning approach to predict data-types attributes for optimal processing of telemedicine data, including text and images; and Sarrab and Alshohoumi (2021), on the real-time detection of abnormalities in streamed data from IoT sensors (furthermore, in Masud et al. (2021), a mutual authentication and secret key establishment protocol was proposed to protect medical IoT networks). Our research contributes to automatic smart telecare solutions based on Machine Learning.

However, despite the huge advances in AI for artificial reasoning and problem-solving, interactions are still far from human-like. Most recent AI systems have partial understanding of natural language and lack cognitive capabilities to enrich the communication with context-dependent information (Skjuve et al. 2019). Specifically, existing solutions for intelligent conversational systems are based on retrieval-based methods (Yasuda et al. 2014; Wu et al. 2018), which select the best candidate among a predefined set of alternative responses, and generation-based procedures (Oh et al. 2017; Su et al. 2017; Baby et al. 2017), which rely on NLP techniques to create human-like written or oral dialogue flows. Note that natural language is especially beneficial for user interfaces for its spontaneity and friendliness (Liu et al. 2018).

Among the existing intelligent conversational systems, we can mention Google Duplex (Lindgren and Andersson 2011) and the Neural Responding Machine in Shang et al. (2015) based on Recurrent Neural Networks. Moreover, in Wen et al. (2017) the authors presented a dialogue system based on the pipelined Wizard-of-Oz framework, which, unlike other approaches in the literature, can make assumptions. Regarding linguistic knowledge, in Wang et al. (2015) and Wu et al. (2018) syntactic features were used to generate coherent and human-like texts. Newscasters (Matsumoto et al. 2007), which, as previously said, generate a feeling of companionship (Östlund 2010), do not sustain dialogues with end users. Conversational systems have already been considered for entertainment and healthcare (Noh et al. 2017; Su et al. 2017), although still at an early stage.

Regarding existing conversational systems for entertainment (Johnson et al. 2016; Correia et al. 2016; Aaltonen et al. 2017), we highlight EduRobot (Budiharto et al. 2017) which can sing and tell stories, although it is not oriented to the specific needs of the elderly. It has been suggested to make conversational systems more appealing by modelling their interfaces as pets or avatars (Sharkey and Sharkey 2012). Unlike EduRobot, RobAlz (Salichs et al. 2016) has been specifically devised for this audience, but it has no therapeutic diagnostic capabilities. Few existing intelligent systems for senior healthcare (Foroughi et al. 2008; Hsu and Chien 2009; Suryadevara et al. 2012; Tseng et al. 2013; Samanta et al. 2014; Wang et al. 2016) have human-computer communication capabilities. The system by Yasuda et al. (2014) for people with dementia is an exception, but its communication capabilities are very limited: it just selects questions and answers among 120 pre-set options. Therefore, even though there is incipient work on digital tools for therapeutic monitoring of people with dementia and other impairments, manual tests (written and task-based neurological and neuropsychological assessments with caretaker supervision) are typically used. For example, the Mini-Mental State Examination (Ridha and Rossor 2005) is a cognitive test on orientation, immediate memory, calculation, attention and comprehension, to cite some tasks, which produces scores about dementia levels. The Mini Cognitive Assessment Instrument (Milne et al. 2008) includes a verbal memory task and a clock drawing test. Finally, the CAMDEX test (Ball et al. 2004) is another standardised manual tool for the diagnosis of mental disorders, which is especially suitable for the early detection of dementia. It asks questions related to memory, personality, general mental and intellectual functioning, and judgement. It also considers specific symptoms and the medical histories of the users and their families. Note that the white-coat effect is a major concern in all these manual approaches, apart from the fact that they are time-consuming and require professional expertise.

Previous research has proved the benefits of combining dichotomous questions (also named closed questions) with essay questions^{Footnote 3} to mitigate the white-coat effect (Ridha and Rossor 2005; Echeburúa et al. 2017). We remark the interest of inserting distracting questions before attention-demanding questions in cognitive tests (Ridha and Rossor 2005). Altogether, this combination allows evaluating cognitive impairment less intrusively (Ball et al. 2004; Ridha and Rossor 2005; Milne et al. 2008). To the best of our knowledge, our proposal is the first intelligent system that embeds user-transparent, automatic cognitive assessment into a newscaster system that sustains dialogues with elderly people, based on NLP techniques for chatbot behaviour generation and human-machine communication.

We close this section with a review of related industrial initiatives, which further backs the social relevance of the field under study.

For instance, the Carelife system by Televés^{Footnote 4} analyses personal routines from sensor data for custom home care. Its home gateway can be extended via peripherals, such as biomedical devices, as well as via software. Doro launched SmartCare^{Footnote 5} in 2018. It includes a home gateway and home sensors to detect behavioural changes. However, neither these systems nor the SAM Robotic Concierge by Luvozo^{Footnote 6} have built-in intelligence to communicate with the elderly using natural language.

Regarding general purpose domestic robots, there are examples such as ZenBo by Asus,^{Footnote 7} with video surveillance, Internet shopping and agenda features. Its interactions are rather rigid. It can understand vocal orders, but it has no empathetic capabilities, nor is it tailored to the needs of the elderly.

We believe that a feasible path towards a next generation of intelligent conversational telecare systems is the augmentation through software of current platforms such as Carelife and SmartCare by relying on their simple voice interfaces (which, nowadays, caregivers employ to call the users), without any additional hardware add-ons. Some platforms are already open in this respect. For example, Buddy, by European Blue Frog Robotics,^{Footnote 8} allows third-party developers to create new applications and distribute them via its store. We are not aware of any application of this robot to entertain and monitor elderly people, nor of any intelligent functionalities.

Regarding solutions focusing on cognitive evaluation, we must mention three approaches based on manual tests (written and task-oriented neurological and neuropsychological assessments), none of which employ automatic NLP techniques. Neurotrack^{Footnote 9} is a set of cognitive tests to evaluate, monitor and strengthen brain health to reduce the risk of Alzheimer’s and other dementias. Mezurio^{Footnote 10} provides support to interactive data acquisition as a baseline for detecting individuals at risk of developing Alzheimer’s disease. Altoida^{Footnote 11} tests the functional and cognitive skills of patients with a Machine Learning algorithm but, like the previous two solutions, it is entirely based on a set of predefined tests and it does not have any bidirectional communication capabilities in natural language.

3 System architecture

We present a novel intelligent system specially designed for therapeutic monitoring of elderly people with different levels of cognitive impairments or on the verge of suffering them. The users perceive the system as a news broadcasting service, with which they interact by voice from time to time. Therapeutic monitoring is embedded as a user-transparent functionality. Figure 1 illustrates the main modules of the system, on which we will elaborate in the next sections. They include online (news broadcast service and intelligent dialogue generation system) and local services (Android^{Footnote 12} application with cognitive attention assessment service). Online services modules were implemented using the Eclipse^{Footnote 13} Integrated Development Environment (IDE) tool and the Java 1.8 programming language.^{Footnote 14} They were deployed on a Tomcat^{Footnote 15} server to be made available through a REST API, which was programmed using the Jersey library.^{Footnote 16} The Android application (compatible with devices running the Lollipop operating system or higher) was developed with Android Studio.^{Footnote 17}

The system transforms Spanish speech into text and vice versa as input and output data (STT/speech-to-text and TTS/text-to-speech boxes in Fig. 1). For this purpose, it employs the Google Voice Android Software Development Kit (SDK) library.^{Footnote 18}

Regarding system activation, we use voice commands and facial recognition. For implementing the latter we employed the OpenCV library^{Footnote 19} and an eye-sensing training data set^{Footnote 20} (note that previous works have also exploited sophisticated schemes for this purpose, apart from voice commands (Alsmirat et al. 2019)).

As in Wang et al. (2020), we decided to combine text and images, in our case to help the users focus during the dialogue stages. Specifically, the dialogue with the users is guided with basic graphic indicators (see Fig. 2): the screen displays a “muted” or “open” microphone to indicate the system’s or user’s turn to speak. Moreover, the “facial” expressions of the avatar of the conversational system provide empathetic feedback to the users based on text sentiment analysis. Finally, note that the user interface is an animated dog, simulating a pet as suggested in literature (Sharkey and Sharkey 2012).

To ensure short response times, instead of querying external systems, the system relies on a MongoDB database^{Footnote 21} in our own server containing all the necessary linguistic knowledge.

3.1 News broadcast service

The news broadcast service requires a varied and updated set of news to engage the target users. This news is periodically extracted from the Application Programming Interface (API) of the Spanish National Radio and Television (RTVE^{Footnote 22}) channel with a GET query, using the tematicas and noticias API services, to gather news items on specific topics. This task is performed in the background owing to the requirements of the API, and it hides news processing from the target users by pre-saving news items on a daily basis. The content retrieved from the API is saved into a MongoDB structure using a JSON file. Date and topic features are exploited for indexing and searching. As a result, the News Broadcast Service delivers content immediately from the user’s point of view.
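The date-and-topic indexing can be pictured with a toy in-memory store. This is only a sketch under stated assumptions: the class `NewsIndex`, its methods and the item fields are hypothetical names introduced for illustration, and a plain dictionary stands in for the MongoDB collection used by the actual system.

```python
from collections import defaultdict
from datetime import date


class NewsIndex:
    """Toy stand-in for the pre-saved news store (hypothetical, not MongoDB)."""

    def __init__(self):
        # Items are grouped under a (date, topic) key, mirroring the two
        # features the text says are used for indexing and searching.
        self._by_key = defaultdict(list)

    def save(self, item):
        """Store an item; 'date' and 'topic' are assumed field names."""
        self._by_key[(item["date"], item["topic"])].append(item)

    def find(self, day, topic):
        """Return the items saved for a given day and topic (possibly none)."""
        return list(self._by_key.get((day, topic), []))


index = NewsIndex()
index.save({"date": date(2020, 1, 1), "topic": "sports", "title": "Example"})
todays_sports = index.find(date(2020, 1, 1), "sports")
```

Because the lookup is a single key access on data saved beforehand, serving an item does not depend on the remote API at dialogue time, which is the property the paragraph above describes.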

News items are arranged into five categories: economy, politics, science, society and sports, all of them at national level except for politics and society, which focus on Galicia (Spain) in our case, since proximity is appealing to elderly people.

Table 1 presents an example of a social news item. We provide the user with a summary of each piece of news by extracting the lead paragraph (for that purpose, we first split the news items into paragraphs and take the first paragraph after the title).
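The lead extraction just described can be sketched in a few lines. The sketch assumes that the stored text has the title as its first block and that paragraphs are separated by blank lines; the function name `lead_paragraph` is hypothetical.

```python
def lead_paragraph(news_text):
    """Return the first paragraph after the title, or '' if there is none.

    Assumption: blocks are separated by blank lines and the first
    non-empty block is the title.
    """
    blocks = [block.strip() for block in news_text.split("\n\n")]
    blocks = [block for block in blocks if block]
    return blocks[1] if len(blocks) > 1 else ""


example = "Title of the item\n\nFirst paragraph (the lead).\n\nSecond paragraph."
summary = lead_paragraph(example)
```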

Table 1 Example of news content

3.2 Automatic question generation

Our question generation system combines linguistic knowledge from our aLexiS lexicon (García-Méndez et al. 2018, 2019) with the Named Entity Classification (NEC) functionality from Freeling (Atserias et al. 2006; Padró and Stanilovsky 2012). The former is stored in a MongoDB database to reduce the response time of the chatbot. Besides, the NEC process is executed in the background because of the complexity of the linguistic analysis of the news.

These two resources allow extracting and identifying personal names, organisations and locations, which together constitute valuable data for question generation. More specifically, thanks to the linguistic information in aLexiS, our conversational system can adjust features such as the gender, number, person, and tense of the questions it generates.

As previously said, we follow the strategy of combining dichotomous and essay questions to reduce the white-coat effect and create a more relaxed atmosphere.

To generate dichotomous questions, we rely on the NEC functionality of Freeling to extract personal names and locations from the news. This produces questions such as those in Table 2. The system always generates four similar options for each question, and one of them is picked at random and presented to the user. Then, depending on the user’s answer, the system poses the next question as indicated in Table 3.

Table 2 Example of dichotomous questions by different NEC results

Table 3 Example of dichotomous questions by different NEC results depending on the user’s response

Regarding the extraction of the gold standard answers for the aforementioned types of questions, we obtain these data with Freeling syntactic parsing. Take the sentence ‘National Police has dismantled a dangerous drug trafficking ring that operated in Galicia, Madrid, and Alicante’ as an example. The corresponding essay question using our system is ‘Who has dismantled a dangerous drug trafficking ring?’, and the correct answer it produces as a reference is the noun phrase that precedes the verb, ‘National Police’. Note that the best answer is obtained by extracting the noun phrases that precede the verb for ‘who’ questions, as in our example. On the other hand, for ‘what’ questions, we use the noun phrase that precedes the verb plus the verb itself. Take the sentence ‘The Government will automatically extend the social electric bonds until September 15th’ and its associated question ‘What does the news say on September 15th will happen?’ as an example of ‘what’ question handling. The correct produced answer is ‘It will automatically extend social electric bonds’. Finally, for the question ‘Which places does the news item mention?’, the correct answer is produced by extracting all location entities from the news content using the NEC functionality by Freeling. Note that in the first example about the drug trafficking ring, the correct answer is Galicia, Madrid and Alicante.
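The three extraction rules can be illustrated on an already parsed sentence. This sketch does not call Freeling: the chunk list, the entity list, the labels `NP`, `VP` and `LOC`, and the function names are all hypothetical stand-ins for the parser output. It also implements the rules literally, so the ‘what’ rule returns only the noun phrase plus the verb group.

```python
def subject_np(chunks):
    """Return the last noun phrase seen before the first verb group."""
    previous_np = None
    for label, text in chunks:
        if label == "VP":
            return previous_np
        if label == "NP":
            previous_np = text
    return None


def gold_answer(chunks, entities, question_type):
    """Build a reference answer for 'who', 'what' or 'places' questions."""
    if question_type == "who":
        return subject_np(chunks)
    if question_type == "what":
        subject = subject_np(chunks)
        verb = next((text for label, text in chunks if label == "VP"), None)
        if subject is None or verb is None:
            return None
        return f"{subject} {verb}"
    if question_type == "places":
        return [text for text, kind in entities if kind == "LOC"]
    raise ValueError(f"unsupported question type: {question_type}")


# Hand-written analysis of the drug trafficking example (assumed labels).
chunks = [
    ("NP", "National Police"),
    ("VP", "has dismantled"),
    ("NP", "a dangerous drug trafficking ring"),
]
entities = [
    ("National Police", "ORG"),
    ("Galicia", "LOC"),
    ("Madrid", "LOC"),
    ("Alicante", "LOC"),
]
who = gold_answer(chunks, entities, "who")        # 'National Police'
places = gold_answer(chunks, entities, "places")  # the three locations
```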

Both the generation of questions and the extraction of gold standard answers are performed in the background, after the daily news-gathering process (see Sect. 3.1), using the same indexing scheme.

By combining dichotomous and essay questions, the conversational system establishes a dialogue with the end users. Each dialogue is composed of the following three stages:

News: prior to the dialogue, the conversational system presents the news item.
User-centred questions: a dichotomous question followed by two essay questions to distract the user.
Attention-demanding question: a last essay question on key aspects of the news item. It allows assessing if the user understood the news piece, if he/she was focused during the conversation, and his/her short-term memory.

Figure 3 shows an example of a real dialogue according to this structure. To keep the user engaged, most questions are related to the news item (avatar marked in yellow).

The resulting user’s utterances are the input to the cognitive attention assessment service, which calculates the accuracy of the user answers by comparing them with the gold standard responses. We describe it in the next section.

3.3 Cognitive attention assessment service

We employ the lexical Multilingual Central Repository (MCR^{Footnote 23}) database (González-Agirre et al. 2012), which integrates the Spanish WordNet into the EuroWordNet framework, to obtain the semantic classification of the adjectives, adverbs, nouns and verbs in the news piece. For that purpose, we extract from MCR three semantic categories corresponding to Adimen-SUMO, WordNet Domains and Top Ontology hierarchies for nouns and verbs, and Top Ontology hierarchies for adjectives and adverbs, since there is less information available for these two lexical categories (Pedersen et al. 2004). From MCR we also gather holonyms, hypernyms, hyponyms, meronyms, synonyms and related data for nouns and verbs, and only synonyms for adjectives and adverbs. Table 4 shows an example for noun montaña ‘mountain’.

This linguistic knowledge was added to the aLexiS lexicon within the same JSON indexing scheme, which did not affect response time performance.

Table 4 Semantic data from MCR for noun montaña ‘mountain’

We obtain the final score for each response as a weighted average, as follows:

$$\begin{aligned} sim=0.8 \frac{\sum _{i=1}^{N_{noun}} {noun^*_i} + \sum _{i=1}^{N_{verb}} {verb^*_i}}{N_{noun} + N_{verb}} \nonumber \\ + 0.2 \frac{\sum _{i=1}^{N_{adj}} {adj^*_i} + \sum _{i=1}^{N_{adv}} {adv^*_i}}{N_{adj} + N_{adv}} \end{aligned}$$

(1)

Where:

$N_x$ represents the number of words of lexical category x in the ideal response.
For the i-th word of category x in the ideal answer, we calculate its similarity with all words within the same lexical category x in the user’s response, and we take the highest value as $x^*_{i}$.

Note that nouns and verbs are considered more relevant than adjectives and adverbs in expression (1). This choice is supported by the fact that the former generally carry most semantic information in a sentence, whereas the latter provide nuances (Feng et al. 2008; Corley and Mihalcea 2005).
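Expression (1) can be sketched as follows. The sketch assumes that both responses have already been split by lexical category into dictionaries with the keys `noun`, `verb`, `adj` and `adv` (hypothetical names), and that `word_sim` is any word-level similarity function with values in [0, 1]. How an empty category group is handled is not stated in the text; scoring it as 0 is an assumption of this sketch.

```python
def best_matches(ideal_words, user_words, word_sim):
    """For each ideal word, keep its highest similarity with a user word
    of the same lexical category (0.0 if the user gave none)."""
    return [
        max((word_sim(ideal, user) for user in user_words), default=0.0)
        for ideal in ideal_words
    ]


def sim(ideal, user, word_sim):
    """Weighted average of expression (1): 0.8 for nouns and verbs,
    0.2 for adjectives and adverbs."""

    def group_average(categories):
        scores = []
        for category in categories:
            scores += best_matches(
                ideal.get(category, []), user.get(category, []), word_sim
            )
        # Assumption: a group with no words in the ideal answer scores 0.
        return sum(scores) / len(scores) if scores else 0.0

    return 0.8 * group_average(("noun", "verb")) + 0.2 * group_average(("adj", "adv"))


def exact_match(a, b):
    """Placeholder word similarity used only for this illustration."""
    return 1.0 if a == b else 0.0


ideal = {"noun": ["teacher", "paper"], "verb": ["took"], "adj": ["blank"]}
user = {"noun": ["teacher"], "verb": ["took"]}
score = sim(ideal, user, exact_match)  # 0.8 * (2/3) + 0.2 * 0
```

With the placeholder `exact_match`, two of the three ideal nouns and verbs are recovered and the adjective is missed, so only the first term contributes.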

To obtain $x^*_i$ we adapted the method by Yang and Powers (2005) to calculate the similarity $s(word_1, word_2)$ between words $word_1$ and $word_2$:

$$\begin{aligned} s{(word_1,word_2)} = (1-\gamma ) \alpha _s \beta ^{d(word_1,word_2)} + \gamma \end{aligned}$$

(2)

Where:

$\alpha _s = 0.9$ if the words are synonyms and 0.85 otherwise.
$d(word_1,word_2) = 0$ if the words are holonyms, hypernyms, hyponyms, meronyms or synonyms, related to the same hierarchy category or belonging to it. Otherwise, $d(word_1,word_2)$ is WordNet’s shortest path between $word_1$ and $word_2$.
$\beta$ acts as a depth factor that decreases similarity exponentially, to the power of the number of hierarchical steps separating the two concepts. In the tests in Sect. 4 we set it to 0.7.
$\gamma$ is explained below.
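Expression (2) translates directly into code. In this sketch the distance and the synonym flag are supplied by the caller, since the WordNet lookup is not shown; the function and parameter names are hypothetical, while the constants 0.9, 0.85 and 0.7 are the values given above.

```python
def word_similarity(distance, synonyms=False, gamma=0.0, beta=0.7):
    """Expression (2): s = (1 - gamma) * alpha_s * beta**d + gamma.

    distance: d(word1, word2), 0 for directly related words, otherwise
              the shortest WordNet path (computed elsewhere).
    synonyms: selects alpha_s = 0.9 (synonyms) or 0.85 (otherwise).
    gamma:    correction factor, 0 by default.
    """
    alpha = 0.9 if synonyms else 0.85
    return (1.0 - gamma) * alpha * beta ** distance + gamma


synonym_score = word_similarity(0, synonyms=True)  # alpha_s alone
two_steps = word_similarity(2)                     # 0.85 * 0.7**2
```

Each additional hierarchical step multiplies the uncorrected similarity by 0.7, which is the exponential decrease attributed to $\beta$.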

Note that certain fairly similar words in the same WordNet domain category will have a very low similarity value with our method (less than 0.4). Consider for example the pair (madera ‘wood’, cartón ‘cardboard’), whose similarity is 0.27, a low value taking into account that both terms define materials and share WordNet domain category ‘substance’; and pair (panadero ‘baker’, maestro ‘teacher’), with a similarity of 0.10, although these words represent professions and belong to the same WordNet domain category ‘person’.

To avoid this issue, we define correction factor $\gamma$ in expression (2). By default $\gamma =0$, except for the following two cases:

For word pairs that belong to the same WordNet domain category, $\gamma =0.25$. After applying this to the two previous examples, their similarities grow to 0.45 and 0.33 (from 0.27 and 0.10), respectively.
For all terms with the same stem that have not been already classified as synonyms, with a similarity of 0.85 or less, $\gamma =0.5$. This is because they belong to the same word family, and their similarity should be even higher for our purposes than in the previous case. For instance, the pair (flor ‘flower’, florista ‘florist’) would have a similarity value of 0.15 for $\gamma =0$, but it becomes 0.58 after applying $\gamma =0.5$.
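The corrected values quoted in these two cases follow from expression (2) by arithmetic alone. Writing $s_0$ for the uncorrected similarity (a shorthand introduced here, not notation from the original method), the correction reads:

$$\begin{aligned} s = (1-\gamma )\, s_0 + \gamma , \qquad s_0 = \alpha _s \beta ^{d(word_1,word_2)} \end{aligned}$$

so that, for the three example pairs,

$$\begin{aligned} (1-0.25)\cdot 0.27 + 0.25 &= 0.4525 \approx 0.45 \\ (1-0.25)\cdot 0.10 + 0.25 &= 0.325 \approx 0.33 \\ (1-0.5)\cdot 0.15 + 0.5 &= 0.575 \approx 0.58 \end{aligned}$$

The correction therefore lifts every score to at least $\gamma$ while preserving the ordering of the uncorrected similarities.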

The goal of these corrections is improving coherence, by defining a ground truth for word similarity (Li et al. 2006). This was tuned by considering the value range (0.3, 0.6) for similarity scores in cases such as autógrafo ‘autograph’ versus firma ‘signature’ and cojín ‘cushion’ versus almohada ‘pillow’.

Moreover, we pay special attention to the treatment of numbers. Given the fact that our goal is to assess the understanding of the news, we must take into account that even people with healthy minds seldom retain the exact quantities they have just heard. For example, after listening to the sentence ‘there were 2569 casualties’, a person will likely remember ‘there were over 2500 casualties’. For this reason, we generate all the possible numbers a given quantity can be rounded to by dividing it by powers of ten, and we assign a 0.7 similarity score if any of them produces a match. For instance, if the correct amount is 2569, the possible right answers that would be assigned a 0.7 similarity score are 2000, 2500, 2560, 2570, 2600 and 3000. On top of that, if the words ‘over’ or ‘under’ are chosen correctly, a similarity score of 0.9 is assigned. Going back to the previous example, if the user’s reply is ‘there were over 2500 casualties’, it receives a 0.9 similarity score, whereas if it is ‘there were 2500 casualties’ it gets a 0.7 similarity score, since it reflects less understanding of the original information.
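The rounding rule can be sketched as follows for positive integers. The function names are hypothetical, and English qualifiers stand in for the Spanish ones handled by the system. Three behaviours are assumptions of the sketch, since the text does not specify them: an exact answer scores 1.0, an answer that matches no rounded value scores 0.0, and a wrong qualifier falls back to 0.7.

```python
def rounded_candidates(value):
    """Values obtained by rounding a positive integer down or up to each
    power of ten below it, excluding the value itself."""
    candidates = set()
    power = 10
    while power < value:
        candidates.add(value // power * power)      # round down
        candidates.add(-(-value // power) * power)  # round up
        power *= 10
    candidates.discard(value)
    return candidates


def number_score(truth, answer, qualifier=None):
    """Score a recalled quantity; qualifier is 'over', 'under' or None."""
    if answer == truth:
        return 1.0  # assumption: exact recall gets the full score
    if answer not in rounded_candidates(truth):
        return 0.0  # assumption: unrelated quantities get no credit
    expected = "over" if answer < truth else "under"
    return 0.9 if qualifier == expected else 0.7


accepted = sorted(rounded_candidates(2569))
with_qualifier = number_score(2569, 2500, "over")
without_qualifier = number_score(2569, 2500)
```

For 2569 the loop visits the powers 10, 100 and 1000, which yields the six values listed in the paragraph above.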

As a final example, take the sentence un profesor llevó papel en blanco a su hogar en la montaña ‘a teacher took blank paper to his home in the mountain’ as the ideal response to a question. The following three user answers produce different results:

Un hombre llevó folios blancos a su casa del monte ‘a man took blank paper sheets to his house on the hill’: this reply uses different words than the original, but keeps the same meaning. It would obtain a similarity score of 0.8.
Un hombre sacó madera blanca de su apartamento ‘a man took white wood from his apartment’: this sentence has lost most original meaning, although some common concepts remain. It would obtain a similarity score of 0.51.
Un hombre rompió una silla en una tienda ‘a man broke a chair in a shop’: this sentence has none of the original meaning left. It would obtain a similarity score of 0.25 (note that the subject was correctly inferred).

The sim metric allows the automatic evaluation of comprehension skills from the questions and their corresponding gold standard answers, and the procedure reflects the level of concentration during the conversation and the reliability of short-term memory.

4 Experimental results and discussion

In this section, we present the validation tests to assess the effectiveness of our approach to determine cognitive impairment levels.

4.1 Hardware

We ran the system on a server with the following characteristics:

Operating System: Ubuntu 18.04 LTS 64 bits
Processor: Intel Xeon CPU E5-2620 v2 2.1 GHz
RAM: 64 GB DDR3
Disk: 3 TB

4.2 Case study

The case study comprises two experimental scenarios. The first experiment (Sect. 4.2.1) studies the sim metric presented in Sect. 3.3 as a tool to assess abstraction capabilities, for different user profiles and levels of cognitive impairment. The second experiment (Sect. 4.2.2) evaluates Machine Learning algorithms to automatically detect cognitive impairment in the users under study.

These experiments were divided into “sessions”, a session being a particular newscast and its associated dialogue with an elderly person. The profiles of the participants in these sessions were characterised as follows:

At least 60 years old.
Technological skills and hearing problems: existing or not.
Study levels: basic (high school education or less) or superior (bachelor’s, master’s and doctoral degrees).
Cognitive impairment level: absent, mild or severe, as established for the case study by gerontology experts from afaga.
Stress: yes or no.
Focus: yes or no.

The experiments lasted for three months and involved 30 users 75.73 ± 6.60 years old (average ± standard deviation). All the users involved in the experiments are patients in the occupational therapy workshops of afaga. The tests were conducted under the supervision of their caregivers. For annotating cognitive impairment, afaga applied the Spanish version (Díaz Mardomingo and Peraita Adrados 2008) of the Global Deterioration Scale standard methodology (Reisberg et al. 1982). Specifically, 57% of the participants had cognitive impairment to some extent (40% mild, 17% severe) and the rest were mentally healthy. We used this methodology as manual baseline for the comparison with our automatic Machine Learning detection approach in Sect. 4.2.2.

In each individual experiment, we registered the characteristics of the user. Tables 5 and 6 show, respectively, the session registration sheet, which was filled in by the caregivers, and a real example of a session, with its newscast content and its associated interview.

Table 5 Session registration sheet

Table 6 News item for session 1

In detail, 17 participants had basic technological skills (e.g., they regularly used electronic devices such as computers and smartphones), 9 suffered from hearing problems, 16 had a basic education and 14 had a superior level of education. Eighteen participants had been diagnosed with some cognitive impairment. Most of the users were in a positive frame of mind (14 were happy to participate and 16 simply accepted it), and 22 users were clearly focused.

Regarding the white-coat effect, it is worth mentioning that 91.67% of the participants without any cognitive impairment and 61.11% of those with mild or severe cognitive impairments were relaxed during the experiments. As expected, the level of cognitive impairment increased with age (Ammal and Jayashree 2020).

4.2.1 Similarity metric assessment

Each experiment was composed of five different sessions whose corresponding news items were related to economy, politics, science, society, and sports. Each session consisted of a newscast and four questions, as explained in Sect. 3.

In the experiments, our system was able to separate users in most cases by sim scores that were significantly related to their level of cognitive impairment. Table 7 averages the sim metric for the three groups in the study (absent, mild and severe impairment).
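
The per-group aggregation behind this kind of table can be sketched in a few lines of Python. This is only an illustration: the function name `sim_by_group` is hypothetical, the sample records are invented and are not study data, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def sim_by_group(records):
    """Return {impairment level: (average sim, standard deviation)}.

    `records` is an iterable of (impairment_level, sim_score) pairs.
    """
    grouped = defaultdict(list)
    for level, sim in records:
        grouped[level].append(sim)
    summary = {}
    for level, scores in grouped.items():
        # stdev needs at least two values; report 0.0 for a single score.
        spread = stdev(scores) if len(scores) > 1 else 0.0
        summary[level] = (mean(scores), spread)
    return summary


# Invented scores, for illustration only (not taken from the experiments).
example = [("absent", 0.4), ("absent", 0.6), ("mild", 0.3), ("severe", 0.1)]
summary = sim_by_group(example)
```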

Table 7 Average ± sd sim metric by level of impairment across all sessions

By session, Table 8 shows that users with severe cognitive impairments scored significantly lower, while healthy users and those with mild cognitive impairment performed significantly better. The effect of mild cognitive impairment (compared to its absence) on the comprehension skills of the participants was noticeable in all sessions except for the third. This session was particularly challenging due to its vocabulary, which included technical terms such as pymes ‘SMEs’ (acronym for small and medium enterprises). Furthermore, in sessions 1, 2 and 4 the difference between mild and no cognitive impairment was noteworthy.

Table 8 Average ± sd sim metric at each session by level of impairment

Session 5 was particularly interesting. In it, users received several clues in user-centred questions before being presented with the attention-demanding question. Thus, users with mild cognitive impairment performed better in this session than in session 3, for example. This is consistent with the idea that reinforcing key ideas from the news helped users to answer attention-demanding questions accurately.

Table 9 shows the average lengths of user responses. They tended to be shorter for higher impairment levels. In fact, for severe impairment, users tended to remain silent or answer concisely (e.g., ‘I don’t know’).

Table 9 Average answer length by level of impairment, in characters

Table 10 shows that, for users with cognitive impairments, stress had a strongly negative effect on performance, whereas high focus had a very positive one: it raised sim results from 0.20 to 0.36 in relaxed users and, remarkably, from 0.03 to 0.26 in stressed users.

Table 10 Average ± sd sim metric for users with cognitive impairments by stress and focus

Finally, Table 11 presents sim measurements versus technological skills and levels of education, for users with cognitive impairments. In view of these results, more educated users performed better in the experiments than those with a basic education. Apparently, technological skills only led to higher sim scores for users with a basic education (note, however, the large standard deviation of the sim score of skilled users with superior educational level).

Table 11 Average ± sd sim metric for users with cognitive impairments by technological skills and level of education

4.2.2 Automatic cognitive impairment detection

Finally, in order to evaluate the effectiveness of the proposed system, we trained a set of selected Machine Learning algorithms (Salzberg 1994) for detecting cognitive impairment: Bayesian Network (bn), Decision Tree (dt), Random Forest (rf) and linear Support Vector Machine (svm), which have been widely used in medical applications (Lu et al. 2017; Bratić et al. 2018; Ghoneim et al. 2018; Rukmawan et al. 2021; Ahmed et al. 2021).

Table 12 shows the training and classification complexity of the selected Machine Learning algorithms, for c classes, d features, k instances of the algorithm (where applicable) and n samples. The training complexity of bn is linear (Lu et al. 2006), while that of dt and rf is logarithmic (Witten et al. 2016; Hassine et al. 2019). svm has the highest training complexity, but it classifies very quickly when trained with a linear kernel, as in our case (Vapnik 2000). We employed the algorithm implementations from Weka (see footnote 24) (Witten et al. 2016).

Table 12 Training and testing complexity of the Machine Learning algorithms

Firstly, we divided the sample into user classes with and without impairments. Table 13 shows all the features we considered. Then, we applied the GainRatioAttributeEval feature selection algorithm, also from Weka, which evaluates the relevance of the attributes by measuring their gain ratio with regard to the target class. The most relevant features it selected for the classification model were, in decreasing importance, length of the response in characters, focus, sim for question 4 in session 2, age, technology skills, and sim for question 4 in session 4.
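
As an illustration of the criterion, the gain ratio of a discrete attribute is its information gain with respect to the class divided by the entropy of the attribute itself. The following Python sketch computes it for nominal attributes. It is not Weka's code: the function names are hypothetical and the toy data are invented for the example. Numeric features such as response length would need to be discretised first, which the sketch leaves out.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((count / total) * log2(count / total)
                for count in Counter(values).values())


def gain_ratio(attribute, labels):
    """Information gain of `attribute` about `labels`, over the attribute entropy."""
    by_value = defaultdict(list)
    for value, label in zip(attribute, labels):
        by_value[value].append(label)
    total = len(labels)
    # Class entropy that remains once the attribute value is known.
    conditional = sum(len(subset) / total * entropy(subset)
                      for subset in by_value.values())
    split_info = entropy(attribute)
    if split_info == 0:
        return 0.0  # a constant attribute carries no information
    return (entropy(labels) - conditional) / split_info


# Invented toy data: 1 = impaired, 0 = not impaired.
impaired = [1, 1, 0, 0]
focus = ["no", "no", "yes", "yes"]       # perfectly aligned with the class
hearing = ["yes", "no", "yes", "no"]     # unrelated to the class
ranking = sorted({"focus": gain_ratio(focus, impaired),
                  "hearing": gain_ratio(hearing, impaired)}.items(),
                 key=lambda item: item[1], reverse=True)
```

Ranking attributes by this value in decreasing order gives the kind of ordered feature list reported above.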

Table 13 Features for the Machine Learning models training

Note that in spite of the results in Table 11, the selection algorithm preferred technological skills over level of education.

Finally, Table 14 shows the classification results for the selected algorithms (with the best values in bold), obtained with 10-fold cross validation (Berrar 2019) to avoid overfitting. This methodology minimises underestimation and overestimation in the results. The dataset is divided into 10 segments, 9 of which are used for training and the remaining one for testing. This process is repeated ten times, with a different, non-overlapping testing segment in each evaluation, and the final performance metric is computed as the average of the intermediate tests. In addition, unlike the svm (Jatav and Sharma 2018) and bn (Wood et al. 2019) algorithms, which are less prone to overfitting, for dt and rf we limited the folds to 3 and the maximum depth to 5. To avoid bias, the entries from the same user were grouped so that they were never used simultaneously for training and evaluation.

Note that dt was finally selected due to its better performance, since it attained a detection accuracy of 86.67% using the most relevant features.

Table 14 F-measure, recall and response times for the selected algorithms

5 Conclusions

Even though there exists previous research on intelligent systems for therapeutic monitoring of cognitive impairment in elderly people, most current approaches are based on manual tests that rely on human supervision for early detection.

In this work, to reduce caregivers’ effort and the white-coat effect, we have proposed a novel conversational system for entertainment and therapeutic monitoring of elderly people. It relies on nlp techniques for chatbot behaviour generation and user-transparent automatic assessment, by combining distracting (user-centred) with attention-demanding questions (embedded cognitive tests). Thus, our main contribution is a Machine Learning approach for user-transparent cognitive monitoring that is embedded into a user-centred entertainment solution. This approach is based on metrics that estimate the abstraction skills of the users from their answers during the automatic dialogue stages.

Experimental results with elderly people under the supervision of afaga gerontologists indicate that our solution is satisfactory and has strong potential for user-friendly therapeutic monitoring. Preliminary analyses have obtained a cognitive impairment detection accuracy of 86.67% (Sect. 4.2.2).

Given these promising results, we plan to enhance the system with empathetic capabilities through user feedback. At the same time, users will benefit from real-time encouragement about their performance.

Data Availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Notes

Available at https://population.un.org/wpp, September 2021.
Available at https://afaga.com, September 2021.
Dichotomous or closed questions have binary yes/no answers. Essay questions allow users to express themselves freely.
Available at http://televescorporation.com/areas-de-negocio/sociosanitario, September 2021.
Available at https://www.doro.com/es-es/care, September 2021.
Available at https://luvozo.com, September 2021.
Available at https://zenbo.asus.com, September 2021.
Available at https://buddytherobot.com/en/buddy-the-emotional-robot, September 2021.
Available at https://neurotrack.com, September 2021.
Available at https://mezur.io, September 2021.
Available at http://altoida.com, September 2021.
Available at https://www.android.com, November 2021.
Available at https://www.eclipse.org, November 2021.
Available at https://www.java.com, November 2021.
Available at http://tomcat.apache.org, November 2021.
Available at https://eclipse-ee4j.github.io/jersey, November 2021.
Available at https://developer.android.com/studio, November 2021.
Available at https://developer.android.com/reference/android/speech/SpeechRecognizer, September 2021.
Available at https://opencv.org, September 2021.
Available at https://github.com/opencv/opencv/blob/master/data/lbpcascades/lbpcascade_frontalface.xml, September 2021.
Available at https://www.mongodb.com, September 2021.
Available at https://www.rtve.es/api, September 2021.
Available at http://adimen.si.ehu.es/web/MCR, September 2021.
Available at https://www.cs.waikato.ac.nz/ml/weka, September 2021.

References

Aaltonen I, Arvola A, Heikkilä P, Lammi H (2017) Hello pepper, may I tickle you?: Children’s and adults’ responses to an entertainment robot at a shopping mall. In: Proceedings of the ACM/IEEE international conference on human-robot interaction. IEEE Computer Society, pp 53–54. https://doi.org/10.1145/3029798.3038362
Abdollahi H, Mollahosseini A, Lane JT, Mahoor MH (2017) A pilot study on using an intelligent life-like robot as a companion for elderly individuals with dementia and depression. In: Proceedings of the international conference on humanoid robotics. IEEE, pp 541–546. https://doi.org/10.1109/HUMANOIDS.2017.8246925
Adamson G, Havens JC, Chatila R (2019) Designing a value-driven future for ethical autonomous and intelligent systems. Proc IEEE 107(3):518–525. https://doi.org/10.1109/JPROC.2018.2884923
Ghoneim A, Muhammad G, Amin SU, Gupta B (2018) Medical image forgery detection for smart healthcare. IEEE Commun Mag 56(4):33–37. https://doi.org/10.1109/MCOM.2018.1700817
Ahmed ST, Sankar S, Sandhya M (2021) Multi-objective optimal medical data informatics standardization and processing technique for telemedicine via machine learning approach. J Ambient Intell Hum Comput 12(5):5349–5358. https://doi.org/10.1007/s12652-020-02016-9
Wood A, Shpilrain V, Najarian K, Kahrobaei D (2019) Private naive bayes classification of personal biomedical data: Application in cancer data analysis. Comput Biol Med 105:144–150. https://doi.org/10.1016/j.compbiomed.2018.11.018
Alsmirat MA, Al-Alem F, Al-Ayyoub M, Jararweh Y, Gupta B (2019) Impact of digital fingerprint image quality on the fingerprint recognition accuracy. Multimedia Tools Appl 78(3):3649–3688. https://doi.org/10.1007/s11042-017-5537-5
Amanda S, Noel S (2012) Granny and the robots: ethical issues in robot care for the elderly. Ethics Inf Technol 14(1):27–40. https://doi.org/10.1007/s10676-010-9234-6
Ammal SM, Jayashree LS (2020) Early detection of cognitive impairment of elders using wearable sensors. In: Systems simulation and modeling for cloud computing and big data applications. Elsevier, Oxford, pp 147–159. https://doi.org/10.1016/B978-0-12-819779-0.00010-1
Atserias J, Casas B, Comelles E, González M, Padró L, Padró M (2006) Freeling 1.3: Syntactic and semantic services in an open-source NLP library. In: Proceedings of the international conference on language resources and evaluation. European Language Resources Association
Baby CJ, Khan FA, Swathi JN (2017) Home automation using IoT and a chatbot using natural language processing. In: Proceedings of the innovations in power and advanced computing technologies. IEEE, pp 1–6. https://doi.org/10.1109/IPACT.2017.8245185
Ball SL, Holland AJ, Huppert FA, Treppner P, Watson P, Hon J (2004) The modified CAMDEX informant interview is a valid and reliable tool for use in the diagnosis of dementia in adults with Down’s syndrome. J Intellect Disabil Res 48(6):611–620. https://doi.org/10.1111/j.1365-2788.2004.00630.x
Bernardini S, Porayska-Pomsta K, Smith TJ (2014) ECHOES: an intelligent serious game for fostering social communication in children with autism. Inf Sci 264:41–60. https://doi.org/10.1016/j.ins.2013.10.027
Berrar D (2019) Cross-validation. In: Encyclopedia of bioinformatics and computational biology. Elsevier, Oxford, pp 542–545. https://doi.org/10.1016/B978-0-12-809633-8.20349-X
Boise L, Neal MB, Kaye J (2004) Dementia assessment in primary care: results from a study in three managed care systems. J Gerontol Ser A Biol Sci Med Sci 59(6):621–626. https://doi.org/10.1093/gerona/59.6.M621
Borson S, Frank L, Bayley PJ, Boustani M, Dean M, Lin P-J, McCarten JR, Morris JC, Salmon DP, Schmitt FA et al (2013) Improving dementia care: the role of screening and detection of cognitive impairment. Alzheimer’s Dementia 9(2):151–159. https://doi.org/10.1016/j.jalz.2012.08.008
Bratić B, Kurbalija V, Ivanović M, Oder I, Bosnić Z (2018) Machine learning for predicting cognitive diseases: methods, data sources and risk factors. J Med Syst 42(12):1–15. https://doi.org/10.1007/s10916-018-1071-x
Britt Ö (2010) Watching television in later life: a deeper understanding of TV viewing in the homes of old people and in geriatric care contexts. Scand J Caring Sci 24(2):233–243. https://doi.org/10.1111/j.1471-6712.2009.00711.x
Callahan CM, Foroud T, Saykin AJ, Shekhar A, Hendrie HC (2014) Translational research on aging: clinical epidemiology as a bridge between the sciences. Transl Res 163(5):439–445. https://doi.org/10.1016/j.trsl.2013.09.002
Díaz Mardomingo C, Peraita Adrados H (2008) Neuropsychological evaluation and cognitive evolution of a bilingual Alzheimer patient. Revista de Psicopatología y Psicología Clínica 13(3):219–228. https://doi.org/10.5944/rppc.vol.13.num.3.2008.4061
Corley CD, Mihalcea R (2005) Measuring the semantic similarity of texts. In: Proceedings of workshop on empirical modeling of semantic equivalence and entailment, pp 13–18
Correia F, Ribeiro T, Alves-Oliveira P, Maia N, Melo FS, Paiva A (2016) Building a social robot as a game companion in a card game. In: Proceedings of the ACM/IEEE international conference on human-robot interaction, pp 563–563. https://doi.org/10.1109/HRI.2016.7451857
Elizabeth C, Curiel Rosie E, Amarilis A, Czaja Sara J, Loewenstein David A (2014) An evaluation of deficits in semantic cueing and proactive and retroactive interference as early features of Alzheimer’s disease. American Journal of Geriatric Psychiatry 22(9):889–897. https://doi.org/10.1016/j.jagp.2013.01.066
Enrique E, Amor Pedro J, Manuel MJ, Belén S, Irene Z (2017) Escala de Gravedad de Síntomas del Trastorno de Estrés Postraumático según el DSM-5: versión forense (EGS-F). Anuario de Psicología Jurídica 27(1):67–77. https://doi.org/10.1016/j.apj.2017.02.005
Feng Jin, Zhou Yi-Ming, Martin Trevor (2008) Sentence similarity based on relevance. In: Proceedings of International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, volume 8, p 833
Foroughi H, Aski BS, Pourreza H (2008) Intelligent video surveillance for monitoring fall detection of elderly in home environments. In: Proceedings of the international conference on computer and information technology. IEEE, pp 219–224. https://doi.org/10.1109/ICCITECHN.2008.4803020
Fung P, Bertero D, Xu P, Park JH, Wu CS, Madotto A (2018) Empathetic dialog systems. In: Proceedings of the international conference on language resources and evaluation. European Language Resources Association, pp 1–7
García-Méndez S, Fernández-Gavilanes M, Costa-Montenegro E, Juncal-Martínez J, González-Castaño FJ (2018) Automatic natural language generation applied to alternative and augmentative communication for online video content services using simpleNLG for Spanish. In: Proceedings of the web for all conference: internet of accessible things. ACM Press, pp 1–4. https://doi.org/10.1145/3192714.3192837
García-Méndez S, Fernández-Gavilanes M, Costa-Montenegro E, Juncal-Martínez J, González-Castaño FJ (2019) A library for automatic natural language generation of Spanish texts. Expert Syst Appl 120:372–386. https://doi.org/10.1016/J.ESWA.2018.11.036
González-Agirre A, Laparra E, Rigau G (2012) Multilingual central repository version 3.0: upgrading a very large lexical knowledge base. In: Proceedings of the Global WordNet conference, pp 118–125
Hancock Geraldine A, Bob W, David C, Martin O (2006) The needs of older people with dementia in residential care. Int J Geriatr Psychiatry 21(1):43–49. https://doi.org/10.1002/gps.1421
Haoxiang W, Zhihui L, Yang L, Gupta BB, Chang C (2020) Visual saliency guided complex image retrieval. Pattern Recogn Lett 130:64–72. https://doi.org/10.1016/j.patrec.2018.08.010
Hassine K, Erbad A, Hamila R (2019) Important complexity reduction of random forest in multi-classification problem. In: International wireless communications & mobile computing conference. IEEE, pp 226–231. https://doi.org/10.1109/IWCMC.2019.8766544
Hsu CC, Chien YY (2009) An Intelligent fuzzy affective computing system for elderly living alone. In: Proceedings of the international conference on hybrid intelligent systems. IEEE, pp 293–297. https://doi.org/10.1109/HIS.2009.318
Ioannis M, Stavros D, Anastasios K (2011) Adaptive and intelligent systems for collaborative learning support: a review of the field. IEEE Trans Learn Technol 4(1):5–20. https://doi.org/10.1109/TLT.2011.2
Johnson David O, Cuijpers Raymond H, Kathrin P, Van de Ven Antoine AJ (2016) Exploring the Entertainment Value of Playing Games with a Humanoid Robot. International Journal of Social Robotics 8(2):247–269. https://doi.org/10.1007/s12369-015-0331-x
Li Y, McLean D, Bandar ZA, O’Shea JD, Crockett K (2006) Sentence similarity based on semantic nets and corpus statistics. IEEE Trans Knowl Data Eng 18(8):1138–1150. https://doi.org/10.1109/TKDE.2006.130
Liang-Hung W, Yi-Mao H, Xue-Qin X, Shuenn-Yuh L (2016) An outdoor intelligent healthcare monitoring device for the elderly. IEEE Trans Consum Electron 62(2):128–135. https://doi.org/10.1109/TCE.2016.7514671
Lindgren M, Andersson IS (2011) The Karen instruments for measuring quality of nursing care: construct validity and internal consistency. Int J Qual Health Care 23(3):292–301. https://doi.org/10.1093/intqhc/mzq092
Liu WD, Chuang KY, Chen KY (2018) The design and implementation of a chatbot’s character for elderly care. In: Proceedings of the international conference on system science and engineering, pp 1–5. https://doi.org/10.1109/ICSSE.2018.8520008
Livingston G, Sommerlad A, Orgeta V, Costafreda SG, Huntley J, Ames D, Ballard C, Banerjee S, Burns A, Cohen-Mansfield J, Cooper C, Fox N, Gitlin LN, Howard R, Kales HC, Larson EB, Ritchie K, Rockwood K, Sampson EL, Samus Q, Schneider LS, Selbaek G, Teri L, Mukadam N, Cohen-Mansfield J, Gitlin N (2017) The Lancet Commissions Dementia prevention, intervention, and care. Lancet 390:2673–734. https://doi.org/10.1016/S0140-6736(17)31363-6
Loewenstein DA, Acevedo A, Luis C, Crum T, Barker WW, Duara R (2004) Semantic interference deficits and the detection of mild Alzheimer’s disease and mild cognitive impairment without dementia. J Int Neuropsychol Soc 10(1):91–100. https://doi.org/10.1017/S1355617704101112
López Gustavo, Quesada Luis, Guerrero Luis A (2018) Alexa vs. Siri vs. Cortana vs. Google Assistant: A Comparison of Speech-Based Natural User Interfaces. In: Advances in Intelligent Systems and Computing, pp 241–250. Springer, Cham. ISBN 9783319603650. https://doi.org/10.1007/978-3-319-60366-7_23
Lu J, Yang Y, Webb GI (2006) Incremental discretization for Naïve-Bayes classifier. In: Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics). Springer, pp 223–238. https://doi.org/10.1007/11811305_25
Marita S, Maria HI, Asbjørn F, Bae BP (2019) Help! Is my chatbot falling into the uncanny valley? An empirical study of user experience in human-chatbot interaction. Hum Technol 15(1):30–54. https://doi.org/10.17011/ht/urn.201902201607
Masud M, Gaba GS, Alqahtani S, Muhammad G, Gupta BB, Kumar P, Ghoneim A (2021) A lightweight and robust secure key establishment protocol for internet of medical things in COVID-19 patients care. IEEE Internet of Things J. https://doi.org/10.1109/JIOT.2020.3047662
Ma’sum MA, Arrofi MK, Jati G, Arifin F, Kurniawan MN, Mursanto P, Jatmiko W (2013) Simulation of intelligent Unmanned Aerial Vehicle (UAV) for military surveillance. In: Proceedings of the international conference on advanced computer science and information systems. IEEE, pp 161–166. https://doi.org/10.1109/ICACSIS.2013.6761569
Matsumoto R, Nakayama H, Harada T, Kuniyoshi Y (2007) Journalist robot: robot system making news articles from real world. In: Proceedings of the international conference on intelligent robots and systems. IEEE, pp 1234–1241. https://doi.org/10.1109/IROS.2007.4399598
Matsuyama Y, Bhardwaj A, Zhao R, Romeo O, Akoju S, Cassell J (2016) Socially-aware animated intelligent personal assistant agent. In: Proceedings of the annual meeting of the special interest group on discourse and dialogue. Association for Computational Linguistics, pp 224–227. https://doi.org/10.18653/v1/W16-3628
Milne A, Culverwell A, Guss R, Tuppen J, Whelton R (2008) Screening for dementia in primary care: a review of the use, efficacy and quality of measures. Int Psychogeriatr 20(5):911–926. https://doi.org/10.1017/S1041610208007394
Minna L, Ismo R, Raimo I, Tero V, Sirkka-Liisa K (2003) Diagnosing cognitive impairment and dementia in primary health care-a more active approach is needed. Age Ageing 32(6):606–612. https://doi.org/10.1093/ageing/afg097
Mohamed S, Fatma A (2021) Assisted-fog-based framework for iot-based healthcare data preservation. Int J Cloud Appl Comput 11(2):1–16. https://doi.org/10.4018/IJCAC.2021040101
Ngai EWT, Peng S, Paul A, Moon Karen KL (2014) Decision support and intelligent systems in the textile and apparel supply chain: an academic review of research articles. Expert Syst Appl 41(1):81–91. https://doi.org/10.1016/j.eswa.2013.07.013
Noh S, Han J, Jo J, Choi A (2017) Virtual companion based mobile user interface: an intelligent and simplified mobile user interface for the elderly users. In: Proceedings of the international symposium on ubiquitous virtual reality, pp 8–9. https://doi.org/10.1109/ISUVR.2017.10
Oh KJ, Lee D, Ko B, Choi HJ (2017) A chatbot for psychiatric counseling in mental healthcare service based on emotional dialogue analysis and sentence generation. In: Proceedings of the IEEE international conference on mobile data management, pp 371–376. https://doi.org/10.1109/MDM.2017.64
Padró L, Stanilovsky E (2012) Freeling 3.0: towards wider multilinguality. In: Proceedings of the language resources and evaluation conference. European Language Resources Association, pp 2473–2479
Pedersen T, Patwardhan S, Michelizzi J (2004) Wordnet: similarity-measuring the relatedness of concepts. In: Proceedings of conference on artificial intelligence, vol 4, pp 25–29
Rashkin H, Smith EM, Li M, Boureau Y-L (2019) Towards empathetic open-domain conversation models: a new benchmark and dataset. In: Proceedings of the annual meeting of the association for computational linguistics. Association for Computational Linguistics, pp 5370–5381. https://doi.org/10.18653/v1/P19-1534
Reisberg B, Ferris SH, De Leon MJ, Crook T (1982) The Global Deterioration Scale for assessment of primary degenerative dementia. Am J Psychiatry 139:1136–1139. https://doi.org/10.1176/ajp.139.9.1136
Ridha B, Rossor M (2005) The mini mental state examination. Pract Neurol 5(5):298–303. https://doi.org/10.1111/j.1474-7766.2005.00333.x
Rukmawan SH, Aszhari FR, Rustam Z, Pandelaki J (2021) Cerebral infarction classification using the k-nearest neighbor and naive bayes classifier. J Phys Conf Ser 1752:012045. https://doi.org/10.1088/1742-6596/1752/1/012045
Sakmongkon C, Eiji H, Masato K (2016) Intelligent emotion and behavior based on topological consciousness and adaptive resonance theory in a companion robot. Biol Inspir Cogn Arch 18:51–67. https://doi.org/10.1016/j.bica.2016.09.004
Salichs Miguel A, Encinar Irene P, Esther S, Alvaro C-G, María M (2016) Study of scenarios and technical requirements of a social assistive robot for alzheimer’s disease patients and their caregivers. Int J Soc Robot 8(1):85–102. https://doi.org/10.1007/s12369-015-0319-6
Salzberg Steven L (1994) C4.5: programs for machine learning. Mach Learn 16:235–240. https://doi.org/10.1007/BF00993309
Samanta N, Chanda AK, Roy CC (2014) An energy efficient, minimally intrusive multi-sensor intelligent system for health monitoring of elderly people. Int J Smart Sens Intell Syst 7(2):762–780. https://doi.org/10.21307/ijssis-2017-680
Sedik A, Hammad M, Abd El-Samie FE, Gupta BB, Abd El-Latif AA (2021) Efficient deep learning approach for augmented detection of Coronavirus disease. Neural Comput Appl. https://doi.org/10.1007/s00521-020-05410-8
Jatav S, Sharma V (2018) An algorithm for predictive data mining approach in medical diagnosis. Int J Comput Sci Inf Technol 10(1):11–20. https://doi.org/10.5121/ijcsit.2018.10102
Shang L, Lu Z, Li H (2015) Neural Responding machine for short-text conversation. In: Proceedings of the annual meeting of the association for computational linguistics and the international joint conference on natural language processing. Association for Computational Linguistics, pp 1577–1586. https://doi.org/10.3115/v1/P15-1152
Shawar BA, Atwell E (2007) Chatbots: are they really useful? Ldv forum 22(1):29–49
Lu S, Xia Y, Cai W, Fulham M, Feng DD (2017) Early identification of mild cognitive impairment using incomplete random forest-robust support vector machine and FDG-PET imaging. Comput Med Imaging Graph 60:35–41. https://doi.org/10.1016/j.compmedimag.2017.01.001
Shum H-Y, He X-D, Li D (2018) From Eliza to XiaoIce: challenges and opportunities with social chatbots. Front Inf Technol Electron Eng 19(1):10–26. https://doi.org/10.1631/FITEE.1700826
Su MH, Wu CH, Huang KY, Hong QB, Wang HM (2017) A chatbot using LSTM-based multi-layer embedding for elderly care. In: International conference on orange technologies. IEEE, pp 70–74. https://doi.org/10.1109/ICOT.2017.8336091
Suryadevara NK, Quazi MT, Mukhopadhyay SC (2012) Intelligent sensing systems for measuring wellness indices of the daily activities for the elderly. In: Proceedings of the international conference on intelligent environments. IEEE, pp 347–350. https://doi.org/10.1109/IE.2012.49
Tseng Kevin C, Chien-Lung H, Yu-Hao C (2013) Designing an intelligent health monitoring system and exploring user acceptance for the elderly. J Med Syst 37(6):9967. https://doi.org/10.1007/s10916-013-9967-y
Vapnik VN (2000) The nature of statistical learning theory. Springer, New York. https://doi.org/10.1007/978-1-4757-3264-1
Wang M, Lu Z, Li H, Liu Q (2015) Syntax-based deep matching of short texts. In: Proceedings of the international joint conference on artificial intelligence. Association for Computational Linguistics, pp 1354–1361
Wen TH, Vandyke D, Mrkšíc N, Gašíc M, Rojas-Barahona LM, Su PH, Ultes S, Young S (2017) A network-based end-to-end trainable task-oriented dialogue system. In: Proceedings of the conference of the European chapter of the association for computational linguistics. Association for Computational Linguistics, pp 437–448
Widodo B, Dian CA, Rumondor Pingkan CB, Derwin S (2017) EduRobot: intelligent humanoid robot with natural interaction for education and entertainment. Procedia Comput Sci 116:564–570. https://doi.org/10.1016/j.procs.2017.10.064
Wilamowski BM, Irwin JD (2015) Intelligent systems. In: Adaptive stochastic optimization techniques with applications. CRC Press, pp 109–130. https://doi.org/10.1201/b19256-7
Witten IH, Frank E, Hall MA, Pal CJ (2016) Data mining: practical machine learning tools and techniques. Morgan Kaufmann Publishers. https://doi.org/10.1016/c2009-0-19715-5
Wu Yu, Li Zhoujun W, Wei ZM (2018) Response selection with topic clues for retrieval-based chatbots. Neurocomputing 316:251–261. https://doi.org/10.1016/j.neucom.2018.07.073
Yang D, Powers DMW (2005) Measuring semantic similarity in the taxonomy of wordnet. In: Proceedings of the Australasian conference on computer science. Australian Computer Society, pp 315–322. https://doi.org/10.5555/1082161.1082196
Yasuda K, Fuketa M, Aoe J (2014) An anime agent system for reminiscence therapy. Gerontechnology 13:118–119. https://doi.org/10.4017/gt.2014.13.02.239.00
Yoo D, No S, Ra M (2014) A practical military ontology construction for the intelligent army tactical command information system. Int J Comput Commun Control 9(1):93–100. https://doi.org/10.15837/ijccc.2014.1.49
Irani Z, Kamal MM (2014) Intelligent systems research in the construction industry. Expert Syst Appl 41(4):934–950. https://doi.org/10.1016/j.eswa.2013.06.061

Acknowledgements

This work was supported by Xunta de Galicia grant GRC 2018/053, Spain, and by University of Vigo/CISUG for the open access charge. The authors are indebted to AFAGA for providing access to real patients and expert knowledge about monitoring of cognitive impairments, and for validating our results.

Funding

Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work was partially supported by Xunta de Galicia grant ED481B-2021-118.

Author information

Francisco de Arriba-Pérez, Silvia García-Méndez, Francisco J. González-Castaño and Enrique Costa-Montenegro contributed equally to this work.

Authors and Affiliations

Information Technologies Group, atlanTTic, School of Telecommunications Engineering, University of Vigo, Campus Lagoas-Marcosende, 36310, Vigo, Spain
Francisco de Arriba-Pérez, Silvia García-Méndez, Francisco J. González-Castaño & Enrique Costa-Montenegro

Corresponding author

Correspondence to Silvia García-Méndez.

Ethics declarations

Conflict of Interests

The authors have no competing interests to declare that are relevant to the content of this article.

Ethics approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Consent to participate

Informed consent was obtained from all individual participants in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

de Arriba-Pérez, F., García-Méndez, S., González-Castaño, F.J. et al. Automatic detection of cognitive impairment in elderly people using an entertainment chatbot with Natural Language Processing capabilities. J Ambient Intell Human Comput 14, 16283–16298 (2023). https://doi.org/10.1007/s12652-022-03849-2

Received: 23 January 2021
Accepted: 04 April 2022
Published: 29 April 2022
Issue Date: December 2023
DOI: https://doi.org/10.1007/s12652-022-03849-2
