1 Introduction

The prevalence of mental health problems has been increasing worldwide over the past decades [10]. Over their lifespan, 29% of the global population is affected by some form of mental disorder [36], with affective disorders, such as depression and anxiety disorders, being the most common forms [29]. As Denecke et al. [12] emphasized, there is also a global shortage of professionals delivering mental health services. As a result, affected individuals often face long waiting periods before they can receive adequate treatment [9], and less than half receive proper treatment at all [17]. This can be especially harmful for people with age-related depression, as under-reporting is presumably high due to the strong stigmatization of the illness among elderly people and as the risk of suicide rises strongly with age [15]. Therefore, new solutions have to be found to provide low-threshold, scalable forms of mental health support. Those solutions should be applicable to different forms of mental health problems and should be accessible to and usable by a broad and diverse user base.

To cope with the shortage of face-to-face therapy options and the rising prevalence of mental disorders, internet-based therapy approaches have been researched for several decades. Approaches exist in the form of online psychotherapy and of email-based, app-based, or chat-based therapy solutions. While some solutions provide automated therapeutic methods for self-guided interventions, others function as a communication medium between therapist and patient and are intended to be used as an adjunct to therapy [7, 26]. As they are easy to access and discreet in usage, those solutions provide one possible strategy to counter the issue of increasing mental health problems [3]. As a next step, therapy based on conversational agents (CAs) has become one focus of research in this area. The primary type of psychotherapy implemented digitally is cognitive behavioral therapy (CBT) [6]. In recent years, several internet-based CBT solutions have been shown to be effective in improving depressive symptoms and life satisfaction [2, 25, 28, 39]. CBT is based on the assumption that cognitive distortions contribute to the development and maintenance of mental illnesses [5, 14]. Therefore, the central goal of CBT is to identify the patient’s problematic patterns in cognition and behavior and to modify them through cognitive restructuring and behavior changes. As a trans-diagnostic intervention, CBT can be successfully applied to many psychiatric conditions, as its underlying principles apply to anxiety disorders, depression, eating disorders and paranoid delusions [38]. Nevertheless, from its beginnings CBT was created with a focus on depressive disorders [41]. In mental health, CAs such as chatbots (CBs) and voice assistants (VAs) can be a more interactive way to inform patients about CBT techniques and provide information on mental well-being [10]. Furthermore, CAs can help to reduce cost, improve efficiency and reduce time spent assessing the well-being of patients [22].
Moreover, CAs can be used as a more user-friendly way of screening for mental health issues than diagnostic scales [10]. While existing CA-based solutions showed promising results, the usability of VA-based CBT with a standalone voice user interface (VUI) has not yet been analyzed, especially not with elderly users. This is, however, crucial, as good usability leads to better acceptance and therapy adherence [32]. Therefore, we propose a novel concept for VA-based CBT for elderly people with an onset depressive disorder and investigate the perceived usability with the target group. Specifically, we want to investigate how the usability of a VA-based system for CBT delivery compares to the usability of CB-based CBT delivery.

Sect. 2 lays out the current state of research on CA-based CBT. Subsequently, Sect. 3 introduces the conceptualization and interaction design of a CA capable of delivering selected CBT methods. Section 4 describes the conducted user study, followed by a discussion of results, implications and shortcomings. Section 5 concludes the paper and indicates future work.

2 State of the Art

CBs for mental health counseling have already been proposed, and first studies have shown promising results regarding acceptance and efficacy [10, 12, 18]. However, CB-based solutions can have disadvantages for people with visual or other impairments, and accessibility guidelines for CBs have yet to be established [35]. Furthermore, certain user groups might prefer speech as an interaction modality. Concerning the elderly, for example, studies indicate that VUIs are favored over conventional user interfaces due to the simplicity of speech interaction and the avoidance of physical accessibility issues [30, 42]. Additionally, this user group often lacks media competence, so graphical user interfaces can be challenging in the context of therapy because the skills necessary for interacting with computerized therapy systems are missing [24].

Wysa is a CB developed by Becky et al. [23]. The system’s goal is to increase mental resilience through text-based conversations. In 2017, a study was conducted to investigate the efficacy of the CB in conveying CBT techniques to increase the mental well-being of users. Participants who used the app between measurements of their mental state showed a significant improvement in depressive symptoms compared with the control group.

Fitzpatrick et al. [18] designed and implemented a CB called Woebot intended for daily usage. Each interaction with Woebot consists of a short inquiry about the user’s current activity and mood followed by a short CBT-based learning session. The efficacy of the application for treatment of depression and anxiety was shown in a randomized controlled trial.

An application designed to support mental health through VA technology was proposed by Bhat et al. [8]. PlantBot, a plant-shaped device combined with an Alexa VA, was specifically designed for young adults receiving behavioral activation therapy for depression to remind them of their therapeutic tasks at home. On PlantBot, an LED panel indicates what the user has to do (e.g. start a conversation with Alexa) and provides feedback to the user in combination with a suitable sound. Through the VA, the user receives instructions for tasks to be performed. The usability of PlantBot was evaluated in two studies, which revealed good acceptance and usability among participants with depressive symptoms. However, to what extent these results are transferable to elderly users is still unclear.

To summarize, existing approaches indicate good acceptance and efficacy of CA-based CBT for depression treatment, but standalone VA-based systems for CBT have not yet been proposed and their usability in comparison to CB-based solutions is still unknown. The following section therefore describes the concept and prototype of a VA-based system, which is used to investigate the perceived usability in comparison to a CB-based CBT approach.

3 Design and Prototype of a VA for CBT Delivery

A thorough analysis of related research (cf. Sect. 2) and of the context of use resulted in several insights. First, most computerized CBT systems follow the structure of a classical face-to-face therapy session to convey their content. Moreover, several of the aforementioned solutions included a mental health assessment of users in order to prevent suicidal people from using the system and to connect them to a suicide prevention hotline instead. This is crucial to ensure the safety of users, as suicide risk increases 30-fold for people with depression compared to the general population [20] and the suicide risk for elderly people is particularly high [15]. Additionally, the system should clearly indicate that the usage of a CA-based system for CBT is not intended as a replacement for face-to-face therapy with mental health specialists [18]. Furthermore, as the system is intended for elderly users, guidelines for the design of VUIs for elderly people have to be taken into account [19, 37]. Based on related research, three CBT tools were identified as suitable for a VA-based CBT session: a mood journal [11], a radio play with multiple choice questions as a psychoeducational component [27, 40], and a short meditation practice to end the session [21, 31].

Fig. 1. Conceptualized structure of a VA-based CBT session. The workflow comprises an introduction, instructions, suicide risk assessment, meditation, radio play, mood journal, and farewell message.

The system introduces itself at the beginning of each session. As a female voice and a personalized name seem to be favorable for the acceptance of VA systems by elderly users [33, 42], we chose a female persona for the VA. Upon introduction, the system stresses that it is not intended as a therapy replacement and is not a real human, but an automated CA. Subsequently, on first usage, an introduction is given to the user (see Fig. 1), explaining possible interactions with the system and the structure of the following session to ensure self-descriptiveness, expectation conformity and user learnability (as suggested by Ferland et al. [16]). Afterwards, users are asked whether suicidal thoughts have occurred recently. If so, users are urged to call the suicide prevention hotline, the telephone number of this service is given, and the session is terminated. Otherwise, the system leads the user through the aforementioned CBT tools. An excerpt of the designed dialog management can be seen in Fig. 2. The session is concluded with a farewell message, and the user is asked to set a date and reminder for the next planned session.

Fig. 2. Activity diagram showing an excerpt of the implemented dialog management strategy, including the therapy session start and suicide assessment components. Intents are indicated through dotted boxes. The flow runs from the welcome phrase and the instructions for VA usage to the suicide assessment question; a negative answer initiates the mood journal, whereas a positive answer leads to information on the suicide hotline.
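
The session flow described above can be illustrated as a small state machine. The following Python sketch is not the implemented dialog management; the names `Step`, `next_step` and `run_session` are hypothetical, and the order of the three CBT tools (mood journal, radio play, meditation) is an assumption based on the order in which they are listed in the text.

```python
from enum import Enum, auto


class Step(Enum):
    """Hypothetical names for the components of one session."""
    INTRODUCTION = auto()
    INSTRUCTIONS = auto()
    SUICIDE_ASSESSMENT = auto()
    HOTLINE_INFO = auto()
    MOOD_JOURNAL = auto()
    RADIO_PLAY = auto()
    MEDITATION = auto()
    FAREWELL = auto()
    END = auto()


# Assumed linear order of the CBT tools once the assessment is passed.
LINEAR_ORDER = {
    Step.INSTRUCTIONS: Step.SUICIDE_ASSESSMENT,
    Step.MOOD_JOURNAL: Step.RADIO_PLAY,
    Step.RADIO_PLAY: Step.MEDITATION,
    Step.MEDITATION: Step.FAREWELL,
    Step.FAREWELL: Step.END,
    # The session is terminated after the hotline number has been given.
    Step.HOTLINE_INFO: Step.END,
}


def next_step(step, first_session, reports_suicidal_thoughts):
    """Return the step that follows `step` in the sketched session flow."""
    if step is Step.INTRODUCTION:
        # Usage instructions are only given on first usage.
        return Step.INSTRUCTIONS if first_session else Step.SUICIDE_ASSESSMENT
    if step is Step.SUICIDE_ASSESSMENT:
        if reports_suicidal_thoughts:
            return Step.HOTLINE_INFO
        return Step.MOOD_JOURNAL
    return LINEAR_ORDER[step]


def run_session(first_session, reports_suicidal_thoughts):
    """Collect the sequence of steps visited in one session."""
    path = [Step.INTRODUCTION]
    while path[-1] is not Step.END:
        path.append(next_step(path[-1], first_session, reports_suicidal_thoughts))
    return path


if __name__ == "__main__":
    for step in run_session(first_session=True, reports_suicidal_thoughts=False):
        print(step.name)
```

Keeping the safety branch as an explicit transition makes it easy to check that no path reaches a CBT tool after a positive answer to the suicide assessment question.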

4 Evaluation: Comparing the Usability of CB-Based Versus VA-Based CBT

To assess the usability (cf. ISO 9241-11 [13]) of the developed system, we conducted a randomized controlled A/B testing experiment focusing on two central research questions: (1) How does the usability of a VA-based CBT system compare to that of a CB-based CBT system? (2) Are there differences in perceived usability between older and younger users? The requirements of elderly participants were taken into account in the design of the user study, following guidelines for user testing with elderly subjects [34]. Senior participants were recruited by contacting nursing facilities and through social media; younger participants were recruited through social media. All participants had to sign a privacy policy and consent form to comply with data protection provisions.

4.1 Participants

Overall, 14 participants took part in the study (8 female and 6 male). The average age of participants was 57.57 years (range: 26 to 83), with 9 participants being over the age of 60. Of the participants, 57% had either experienced depression and been in therapy in the past or had close relatives with depression. However, to comply with ethical requirements, none of the participants had a diagnosis of depression during the study period. The implications of this limitation are discussed in Sect. 4.4. Half of the participants had no prior experience with CAs.

4.2 Methodology and Material

Participants were pseudorandomly assigned to either the CB group or the VA group in order to keep the age distribution in both groups approximately equal. Thus, the interaction modality (CB or VA) and age were the independent variables, and the questionnaire result (system usability scale score) was the dependent variable. After an introduction to the concept (see Fig. 1), participants had to fill out 5-point Likert scale questionnaires (ranging from 1 := completely disagree, to 5 := completely agree) on demographic information, technological affinity and their opinion of the presented concept. Afterwards, participants had an independent hands-on session with the prototype in the absence of the facilitator (to reduce social desirability effects). Both testing groups were provided with the same CA and content, which differed only in the interaction modality (VA in group A and CB in group B). The user test was followed by a questionnaire on usability, using the system usability scale (SUS), and by a semi-structured interview as a formative evaluation (e.g. to gather suggestions for additional functionalities). Due to the global COVID-19 pandemic, rigorous hygiene precautions had to be taken into account. Therefore, the questionnaire and the semi-structured interview were carried out online, and the VA was provided through a telephone gateway and the CB version as a web application to enable test subjects to participate in the study remotely.
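
The exact assignment procedure is not specified here. One common way to obtain groups with a similar age distribution is pairwise (matched) randomization, sketched below in Python under that assumption; the function name `assign_groups`, the seed, and the ages in the usage example are hypothetical and are not study data.

```python
import random


def assign_groups(ages, seed=0):
    """Assign each participant to 'VA' or 'CB' with balanced ages.

    Participants are sorted by age and processed in adjacent pairs.
    Within each pair, a pseudorandom shuffle decides who receives
    which condition, so both groups cover a similar age range.
    """
    rng = random.Random(seed)
    # Indices of participants, ordered from youngest to oldest.
    order = sorted(range(len(ages)), key=lambda i: ages[i])
    groups = [None] * len(ages)
    for start in range(0, len(order), 2):
        pair = order[start:start + 2]
        labels = ["VA", "CB"]
        rng.shuffle(labels)
        # With an odd number of participants, the last one simply
        # receives the first label of the shuffled list.
        for index, label in zip(pair, labels):
            groups[index] = label
    return groups


if __name__ == "__main__":
    made_up_ages = [25, 30, 45, 50, 62, 66, 71, 80]  # illustrative only
    for age, group in zip(made_up_ages, assign_groups(made_up_ages, seed=1)):
        print(age, group)
```

With an even number of participants, this procedure always yields two groups of equal size, while the allocation within each age-matched pair remains pseudorandom.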

4.3 Results

Concept Evaluation: Overall, the concept of a CA-based system for CBT was well received by participants. Participants indicated that they were interested in talking to the system (median (ME): 4, mode (MO): 5) and that they would appreciate it if a VA was able to support them during depressive episodes (ME: 4, MO: 4). A relative majority of participants (43%) assessed the system to be capable of reducing the supply bottleneck in psychological care.
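
Because Likert responses are ordinal, they are summarized here by median and mode rather than by the mean. The following Python sketch shows how such summaries can be computed with the standard library; the helper name `summarize_likert` and the response lists are hypothetical and do not reproduce the study data.

```python
from statistics import median, multimode


def summarize_likert(responses):
    """Return the median and all modes of a list of 1-5 Likert responses."""
    if not responses:
        raise ValueError("at least one response is required")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    return {
        "median": median(responses),
        # multimode returns every value that shares the highest count,
        # so ties are reported instead of being silently resolved.
        "modes": sorted(multimode(responses)),
    }


if __name__ == "__main__":
    print(summarize_likert([4, 5, 5, 3, 4, 5, 2]))
    print(summarize_likert([1, 2, 3, 5]))
```

Reporting all modes is useful for small groups, where several response values can occur equally often and a single mode would hide how scattered the opinions are.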

Table 1. Mean SUS scores in VA and CB group for participants in different age groups

Comparison Between Group A and B: The SUS is a widely used standardized questionnaire for assessing perceived usability on a scale from 0 to 100 (with \(\ge 60\) indicating above-average usability and \(\ge 70\) indicating good usability) [4]. SUS measurements showed differences between the VA and CB groups, especially in relation to age (see Table 1). Participants over the age of 60 rated the usability as good in the VA group (SUS: 75) and as mediocre in the CB group (SUS: 51). Across all age groups, the VA condition had an SUS score of 75 and the CB condition had a score of 61. Furthermore, participants had to answer a questionnaire specific to the therapeutic context. While participants of the VA group could imagine using the system in case of a depressive episode (median: 4, mode: 4), opinions on the CB-based prototype were scattered, especially among seniors (median: 2.5, multiple modes: 1, 2, 3, 5), and the overall opinion in the CB group was rather neutral (median: 3.5, mode: 5). Additionally, the progress subjects achieved with the prototype during the CA-based therapy session was tracked. In the VA group, subjects over the age of 60 finished on average 90% of the therapy session, while seniors in the CB group finished only 58%. For younger participants (age < 60), the difference between groups was smaller (92% for the VA condition, 72% for the CB condition).
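
For reference, the standard SUS scoring procedure can be sketched as follows. The Python function below assumes the usual ten-item questionnaire with responses from 1 to 5, in which odd-numbered items are positively worded and even-numbered items are negatively worded; the function name `sus_score` and the example responses are hypothetical.

```python
def sus_score(responses):
    """Compute a SUS score (0 to 100) from ten Likert responses (1 to 5)."""
    if len(responses) != 10:
        raise ValueError("the SUS consists of exactly ten items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Positively worded item: higher agreement is better.
            total += response - 1
        else:
            # Negatively worded item: lower agreement is better.
            total += 5 - response
    # Each item contributes 0 to 4 points; scale the sum of 0 to 40 to 0 to 100.
    return total * 2.5


if __name__ == "__main__":
    # A made-up participant who answers 4 on odd and 2 on even items.
    print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # 75.0
```

Group scores such as those in Table 1 are then obtained by averaging the individual scores of all participants in a condition.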

4.4 Discussion

The concept and developed prototype were positively received by the majority of test subjects. Of the 14 participants, 10 found the system useful and assessed it to be able to support people with depression. Direct comparison of age groups revealed that younger people rated the concept and prototype slightly better than seniors. It was also found that the evaluation of the prototype depended on the evaluation of the concept. Subjects who found the overall concept of VA-based CBT unsuitable, or who did not support the idea of computer-based therapy in general, also rated the usability of the prototype lower than those who rated the concept positively. In particular, subjects who had suffered from depression themselves or had close acquaintances or family members with depression showed high interest in the concept and prototype. The fact that the evaluation of the concept affected the evaluation of usability should be viewed critically. Subjects who are affected by the therapy supply bottleneck themselves, or whose relatives are, may have rated the usability more favorably and thus distorted the result. Conversely, people who rated usability negatively because of the concept could distort the result in the opposite direction. A larger sample size could reduce such potential biases in a follow-up study. Overall, the usability of the VA-based prototype (SUS: 75) was rated better than the usability of the CB prototype (SUS: 61). For the group of elderly participants, scores were even further apart (75 for the VA group and 51 for the CB group). In both variants, most of the participants reached the end of the therapy session. This is a good indicator that the prototype was intuitive for the target group and that errors could be avoided or corrected through the implemented dialog management. The majority of subjects also indicated in the semi-structured interview that they preferred or would have preferred to talk to the system rather than write to it.
In addition, two seniors in the CB group dropped out of the therapy session after only a few minutes, stating that they were frustrated by the time needed for typing.

It is important to point out that the user study was conducted with only 14 participants. While 57% of participants had either experienced depression in the past or had close relatives with depression, none of them were diagnosed with depression at the time of the study. A follow-up study should therefore be conducted with people diagnosed with depression and with a higher number of participants in both groups. While results were highly promising, the focus of this study was to investigate the usability of VA-based CBT delivery in comparison to CB-based CBT systems. As with related CA-based therapy approaches [1], the effect of this novel form of CBT delivery on psychological well-being should be investigated further, as the evidence available so far is not sufficient to show clinical importance.

5 Conclusion

This study investigated a novel approach for VA-based CBT delivery for elderly people with depression. To this end, a first concept of a VA-based CBT session was designed, implemented and used to conduct a randomized controlled A/B testing experiment. The concept was well received by the majority of participants, and results indicate good usability of the designed system. In particular, results indicate a preference for the delivery of CBT methods via VA rather than CB, especially among elderly participants. Future studies should investigate the usability and acceptance of VA-based CBT delivery over a longer time period. Moreover, the efficacy of the system in reducing depressive symptoms in the target group should be examined. For this purpose, the system should be extended by further CBT-based courses and exercises in order to provide subsequent CBT sessions via VA.