Background

Cognitive behavioral therapy (CBT) is an evidence-based therapeutic approach used to treat psychological distress and a variety of mental disorders [1]. This therapy aims to modify maladaptive cognitions that lead to distress and problematic behaviours, thereby reducing negative symptoms and improving functioning [2]. CBT has been shown to produce large effect size improvements for treatment of mental health disorders, such as anxiety and depression [3]. CBT can be paired with pharmaceutical treatments to improve outcomes and has proven to be more effective than antidepressants when used for the treatment of depression in adults [3].

One method of CBT delivery that has been proven effective is internet-based CBT (iCBT), which has led to symptom reduction in both small and large effect sizes [4]. In this treatment method, a licensed therapist supports patients through online messaging platforms, e-mail, or web pages and provides them with exercises and behavioural intervention programs [5, 6]. iCBT has been identified as a plausible alternative to traditional CBT for patients with depression; helping to improve patient outcomes [7]. Randomized controlled trials have also shown that therapist-assisted iCBT is comparable to face-to-face CBT [8, 9], even when considering regarding development of strong patient-provider relationship [10].

With this shift towards alternative delivery methods for mental health therapies, an increasing number of mobile health (mHealth) apps in the mobile marketplace have emerged claiming to provide CBT. In contrast with iCBT, mHealth CBT apps tend to be self-guided and it is unknown if these apps effectively implement the evidence-based principles of CBT [11,12,13,14]. Additionally, there is little evidence demonstrating that these CBT apps can be recommended for unsupervised self-management [15]. The small existing evidence base is further exacerbated by the fast pace of technology relative to the pace of research and evaluation of mHealth apps [16]. Further research is required to better understand the mHealth CBT apps marketplace, particularly related to the effect on patient-provider relationships [17]. In addition, while research demonstrates patient interest in using mHealth apps for self-management, clinician interaction and health system integration of app has been identified as an important factor for patient confidence and ultimate behaviour change [18].

The purpose of this paper is to apply an mHealth app evaluation framework to CBT mHealth apps, to better understand the current marketplace for CBT mHealth apps, focusing primarily on the presence of functionalities to support patient-provider relationships. Specifically, this paper will focus on apps targeted towards adults with depression and/or anxiety.

Framework development

An evaluation framework was developed to evaluate the quality of the patient-provider relationship in CBT mHealth apps based on a reference architecture for health app design [19], (see Table 1). The evaluation framework is composed of 20 measures aimed at measuring the evidence-based support of CBT mHealth apps and their ability to enhance the patient-provider relationship. These 20 measures were based on properties from Chindalo et al. reference architecture which distinguishes features such as explicitly identifying patient’s diagnosis, enabling interoperability with EMRs, identifying and tracking process and proxy metrics for diseases as well as identifying and tracking important outcome measures [19]. These concepts fit with Albrecht et al. framework which provides details on evidence-based criteria that should be considered when evaluating mobile applications [20]. The framework also identifies features that are based on the patient engagement framework created by Balouchi et al. which focuses on functionalities of mobile apps that enhance the patient-provider relationship [21]. The rationale for the methodology is to provide a perspective on the experience of general users and clinicians when identifying mHealth apps for the purpose of CBT.

Table 1 Ranking of functionalities

The final list of measures was developed with an experienced clinician (KK) and took into account the information needed to provide high quality clinical care to a patient requiring CBT. The measures developed were customized for the treatment of mental health disorders, such as depression and anxiety; diseases that respond to CBT. Although some of the measures can be used for evaluating other disease types, the set of measures developed for CBT are only appropriate for mental health and related disorders.

Methods

50 CBT mHealth apps were identified from the Apple iTunes and Google Play app stores using the search terms “Cognitive Behavioural Therapy” or “CBT.” The rationale for the use of the health app design reference architecture over other popular frameworks used for mHealth App reviews is described previously [19].

Each app was downloaded and screened independently against 20 functional measures by two reviewers. Each measure was scored on a binary scale (0,1). Apps received a score of 1 if they had at least one attribute of that measure. To generate an evaluation score for each app, the sum of the binary measures was taken. The agreement between scores was determined after a blind independent review. The agreement between scores was completed by examining the number of scores the reviewers agreed on divided by the total number of functions in the framework. The mean evaluation score was calculated and used for analysis.

Before screening initiation, a calibration exercise was carried out with five randomly selected mHealth apps, which were evaluated by six reviewers. The calibration allowed the areas of discrepancies in interpretation of measures to be surfaced and addressed and improved the standardization of the approach. All reviewers were trained in the standardized method, and each of the 50 apps was evaluated by two independent reviewers.

Reviewers gave their ratings and included descriptions justifying their decision for each measure. After evaluations were complete, all data were collated into a single spreadsheet. Before data analysis, 15 apps identified by reviewers were excluded as they did not claim to provide CBT and offered other functions unrelated to the patient-provider relationship. Reviewers downloaded the app and scored them using the standardized method. Each app was independently and blindly screened against the evaluation criteria. For each of the measures, the higher score between the two reviewers was accepted, and final scores were generated for each app. The complete list of apps downloaded can be found in Appendix 1.

Results

The mean evaluation score across the 35 apps was 4.9 out of 20 functional criteria. The median score was 5. The two highest apps met 11 out of 20 functional criteria. The lowest app met 2 out of 20 functional criteria (see Fig. 1).

Fig. 1
figure 1

Distribution of app evaluation scores

Overall, the apps scored well on functions including education and recommendations, user interface, and behaviour tracking functional criteria (See Table 1). Primarily, these criteria were met through the provision of education about CBT techniques and how they may reduce patient symptoms. Apps generally scored poorly on criteria including physiological measurement, patient health information collection, lab results, medication or comorbidities as well as health system integration and utilization; all of which may be important for management of patient with mental health disorders.

Discussion

While recent literature suggests the potential of mHealth apps to improve care accessibility and decrease depression levels in users, findings from this research suggest that the current marketplace for mHealth apps is limited in its ability to provide benefits for the patient-provider relationship [12, 13]. Overall, our research found that the mHealth apps in the marketplace primarily act only as symptom trackers or educational resources with little integration into the larger healthcare system (see Fig. 2).

Fig. 2
figure 2

App evaluation scores after download

While apps overall did not score high on the evaluation framework, particularly in regards to healthcare integration, it should be noted that apps that perform only one core function may still provide some benefit to users. For instance, one empirical study reported the use of CBT-based depression apps are especially useful when they provide mood prediction; demonstrating the potential benefits of apps containing this feature alone [22]. Since our criteria were used to evaluate overall prevalence of functions as well as market gaps and opportunities, effectiveness of the individual functions was not taken into consideration.

Overall, by not providing healthcare integration, the apps under review did not provide opportunities for ensuring patient accountability and presented very little opportunity for use by healthcare providers. In addition, this lack of integration with providers and the healthcare system as a whole may limit the effectiveness of these apps in supporting sustained behaviour change [18]. It has been argued that mHealth apps should not be designed for healthcare provider use, and instead their main purpose is for patient empowerment outside the provider-patient relationship, suggesting their utility despite lack of integration. For instance, recent studies have found that mHealth apps may be useful and effective when used for self-monitoring and providing support for patients who are interested in self-treatment [23]. Therefore, apps that scored low on our evaluation criteria may present utility for highly motivated patients who are self-starters. Additional areas of improvement identified for the apps include more meaningful use of data collected, a stronger evidence base, and the ability to send notifications.

Identified limitations of the study are as follows: (1) the research team was not able to establish how often apps were used, or by which populations; (2) no patient representatives were included in the creation of the evaluation framework nor the reviewing of the individual apps. In future iterations, the inclusion of patients would improve the quality of data collected. These limitations can inform future research to collect data on the users of these apps to draw more insights on how often the apps were used and the types of users and their likelihood to have better patient outcomes.

Conclusions

Overall, there is a lack of evidence-based information and integration that enhance the patient-provider relationship in the CBT mobile app marketplace. Many apps only perform one function, mainly for patient engagement, and lack the functionality necessary to help patients adhere to their treatment within the larger health system. App developers should take note of the importance of evidence-based functionalities to improve patient outcomes which would encourage insurers and payers to begin to reimburse for the use of these technologies. Integration and connectivity with clinicians may facilitate improved app desirability and performance.