1 Introduction

Conditions such as chronic pain, depression and anxiety disorders continue to resist treatment, and scientists have yet to find purely biological models for these prevalent conditions [1, 2]. As such, the search has begun for technology-driven strategies that facilitate citizen science and patient-centered, personalized care [3,4,5].

The ubiquity of information and communication technologies (ICT) has triggered an exponential surge of digital health applications, especially for mental healthcare. In a 2018 opinion paper, Tom Insel, the former director of the National Institute of Mental Health (NIMH) who left to join Google's Life Sciences, asked: “What does this technology revolution have in store for psychiatry? Will brick-and-mortar clinics be replaced by teleclinicians? Will smartphones become the new clinical interface for diagnosis and treatment? Will psychiatrists be replaced by artificial intelligence (AI)-engineered conversational bots trained on millions of clinical interviews and treatment sessions?” [6].

Among the possibilities for digital interventions in mental health are: digital phenotyping [7] and citizen science [3, 8, 9], remote care to rural communities by offering more timely response, guided treatment and adherence monitoring [10], as well as providing information and social support, which are important factors in coping with illness [11].

Today, the use of self-monitoring and behavioral training applications has become commonplace. At the time of writing this article (Feb 2022), Google Play and the Apple App Store offer more than 106,000 mobile apps listed under health and wellness (source: Statista report, Dec 21, 2021). Several of these applications are heavily commercialized and advertised, with some data to support their efficacy for cognitive behavioral training for weight loss [12], mindfulness [13], and game-based cognitive training [14].

Several systematic reviews in the clinical literature have examined the clinical effectiveness of ICT interventions and have found promising possibilities for providing digital mental healthcare [15], especially for anxiety and depression [16] or chronic pain [17, 18]. However, the majority of the existing systematic reviews are designed to address specific research questions about a particular outcome or a particular population under study. As such, their search strategies do not provide a long-shot perspective on the trends, commonalities and differences among applications that target comorbid conditions, nor on the existing knowledge gaps.

In this current review, we have aimed to conduct a systematic, but broad scoping review of any studies that are listed under the category of Randomized Controlled Trials (RCTs) involving any ICT intervention. Without any specific priors other than those categories that guided the search, we have used an inductive coding methodology to identify themes that emerge from the existing literature, and might guide the development of a pragmatic framework for collaborative and inclusive digital mental healthcare solutions.

2 Methods

2.1 Systematic Review Procedure

Our systematic review search and data extraction were guided by [19] (Box 1) and search results are reported according to PRISMA guidelines [20] (Fig. 1).

figure a

2.2 Article Selection and Data Retrieval

Because we were interested in health-related trials, we limited our search to the PubMed database (articles published up to and including May 18, 2021) using the search query: “(anxiety OR depression OR pain) AND (app OR mobile OR game OR web OR smartphone OR VR OR “virtual reality”)”. The search was limited to Randomized Controlled Trials, and it resulted in a total of 1,932 peer-reviewed articles. No [tiab] annotations were used, to ensure that our search was not limited to the words in the title or abstract only.
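As an illustration of how the query above is composed, the following minimal sketch assembles the two OR-groups and builds an NCBI E-utilities ESearch URL without sending any request. The use of E-utilities, the `[pt]` publication-type tag for the RCT limit, the `mindate` value and the helper names are assumptions made for this example; the review itself used the PubMed export tool.

```python
from urllib.parse import urlencode

# Free-text terms as given in the search query (no [tiab] field tags).
CONDITIONS = ["anxiety", "depression", "pain"]
TECHNOLOGIES = ["app", "mobile", "game", "web", "smartphone", "VR", '"virtual reality"']

def build_term(conditions, technologies):
    """Join each group with OR, then combine the two groups with AND."""
    left = " OR ".join(conditions)
    right = " OR ".join(technologies)
    return f"({left}) AND ({right})"

def build_esearch_url(term, max_date="2021/05/18", retmax=2000):
    """Return an ESearch URL for PubMed; nothing is fetched here."""
    # Assumption: the RCT limit is expressed as a publication-type tag.
    filtered = f"({term}) AND randomized controlled trial[pt]"
    params = {
        "db": "pubmed",
        "term": filtered,
        "datetype": "pdat",        # filter on publication date
        "mindate": "1900/01/01",   # hypothetical lower bound; ESearch needs both dates
        "maxdate": max_date,
        "retmax": retmax,
    }
    return "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?" + urlencode(params)

term = build_term(CONDITIONS, TECHNOLOGIES)
url = build_esearch_url(term)
```

Keeping the two term groups as lists makes it easy to see that every condition term is paired with every technology term, which is what makes the search broad.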

We used a quasi-automated information extraction method: the PubMed export query tool automatically generated a base list of results and sorted abstracts. We then programmed a Java file extraction routine to remove headers and extract the results and methods sections of each entry using the Google Sheets API. Articles that did not have structured abstracts were coded manually. If sufficient data were not available in the abstracts, and the articles were open-access, we searched the methods and results sections for more information about sample size and demographics.
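The extraction step can be sketched as follows. This is not the Java routine used in the review; it is a minimal Python illustration that assumes section labels in structured abstracts are upper-case words followed by a colon. The function names and the example abstract are hypothetical.

```python
import re

# Assumption: section labels are upper-case words followed by a colon,
# e.g. "METHODS:" or "MATERIALS AND METHODS:".
LABEL = re.compile(r"\b([A-Z][A-Z /&]{2,}):\s*")

def split_sections(abstract):
    """Return {label: text} for a structured abstract, or {} if no labels are found."""
    parts = LABEL.split(abstract)
    # With one capture group, parts = [preamble, label1, text1, label2, text2, ...]
    labels, texts = parts[1::2], parts[2::2]
    return {label.strip(): text.strip() for label, text in zip(labels, texts)}

def pick(sections, keyword):
    """Return the text of the first section whose label contains keyword, else None."""
    for label, text in sections.items():
        if keyword in label.lower():
            return text
    return None

def needs_manual_coding(abstract):
    """Flag abstracts that lack a recognizable methods or results section."""
    sections = split_sections(abstract)
    return pick(sections, "method") is None or pick(sections, "result") is None

# Hypothetical abstract, for illustration only.
example = ("BACKGROUND: Chronic pain is common. "
           "METHODS: We randomized 120 adults to an app or a wait-list. "
           "RESULTS: Pain scores decreased in the app group. "
           "CONCLUSIONS: The app appears feasible.")
```

Abstracts for which `needs_manual_coding` returns `True` correspond to the unstructured entries that were coded by hand.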

Fig. 1. Systematic review flow chart

2.3 Inductive Coding of Themes in Abstracts

Each abstract was reviewed by two coders. In the first pass, ST coded the abstracts for the outcome variable studied (Pain, Depression, Anxiety or Stress), sample size, sample characteristics (age range, patient or healthy control) and the type of ICT used (app, web, virtual reality, game, artificial intelligence, augmented reality, computer, video, handheld devices, phone mobile text, console, Fitbit, etc.). ST also coded the reason for which the ICT intervention was designed (general distraction, immersion, visualization, manipulation, hypnosis, talking to a chatbot, socialization, etc.) and the benefits it was intended to provide (education, training, therapy, tracking, physical training, cognitive behavioral therapy, self-management, clinical follow-up and adherence monitoring, or data collection and diagnostics).

The second coder, NKM, reviewed the inductive codes and scanned the abstracts for reports of significant effectiveness. A study was classified as Effective if it reported a “significant improvement” as a result of the ICT-based intervention; Non-effective if “no significant effects” were reported; and Mixed if the primary outcomes (e.g., pain reduction, anxiety and depression scores, engagement) varied with specific experimental contexts, such as characteristics of the population under study, baseline variations in clinical status, or experimental design. In addition, abstracts were inductively coded if the objective of the study was something other than a clinical evaluation of the pre-post intervention.

Because age groups were not consistently reported, we categorized the population into child (<10, or identified as child in the study), adolescent (10–18, or identified as adolescent in the study), adult (18–60) and senior (>60, or identified as such). Because sex and gender were not consistently reported in the abstracts, we did not examine these variables.

2.4 Statistical Analysis and Synthesis of Results

We used NVivo (for Mac) and Gephi 0.9 to examine the interrelations between themes that emerged from the review. Results were explored in terms of counts and occasionally as percentages (ratio of part to whole).

Within this review, several ICTs were designed or tested for more than one outcome. This provided an opportunity to examine cross-functionalities between different implementations (e.g., game, web, app, etc.) for different applications (e.g., pain, anxiety, depression) and different age groups. We created an adjacency matrix with each cell representing the number of times that any two themes appeared in the same abstract. We used Gephi to compute the network modularity (the degree to which the network divides into clusters of nodes that are more strongly connected to each other than to the rest of the network) and eigenvector centrality (the importance of each node in terms of its within-cluster connections, as well as its connections to the hubs of other clusters). These variables were used to depict the prevalence of themes that emerged in the review, and the relations between them.
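The construction of the co-occurrence matrix and the idea behind eigenvector centrality can be sketched in a few lines. This is an illustrative sketch only: the coded abstracts below are hypothetical toy data, the function names are ours, and the centrality is computed by a plain power iteration that may differ from Gephi's own routine in normalization and stopping rule.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(coded_abstracts):
    """Count how often each unordered pair of themes is coded in the same abstract."""
    counts = Counter()
    for themes in coded_abstracts:
        for a, b in combinations(sorted(set(themes)), 2):
            counts[(a, b)] += 1
    return counts

def adjacency_matrix(counts):
    """Return (sorted theme list, symmetric matrix of pair counts)."""
    themes = sorted({t for pair in counts for t in pair})
    index = {t: i for i, t in enumerate(themes)}
    matrix = [[0] * len(themes) for _ in themes]
    for (a, b), n in counts.items():
        matrix[index[a]][index[b]] = n
        matrix[index[b]][index[a]] = n
    return themes, matrix

def eigenvector_centrality(matrix, iterations=500, tol=1e-10):
    """Power iteration on (A + I), scaled so that the largest score equals 1.

    Adding the identity keeps the eigenvectors of A but avoids oscillation
    on bipartite-like graphs.
    """
    n = len(matrix)
    x = [1.0] * n
    for _ in range(iterations):
        y = [x[i] + sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
        top = max(y)
        y = [v / top for v in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x

# Hypothetical coded abstracts (toy data, not taken from the review).
coded = [
    ["Depression", "Web", "CBT"],
    ["Depression", "Anxiety", "Web"],
    ["Depression", "Anxiety", "App"],
    ["Pain", "VR", "Distraction"],
    ["Pain", "VR", "Game"],
]
counts = cooccurrence(coded)
themes, matrix = adjacency_matrix(counts)
centrality = dict(zip(themes, eigenvector_centrality(matrix)))
```

In this toy network the two groups of themes never co-occur, so the scores concentrate on the larger, more densely connected group; in a connected network such as the one analyzed here, every theme receives a non-zero score.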

3 Results

3.1 Summary of Sample Characteristics

Within the abstracts reviewed, only 1,221 studies had specified the age group. The majority of individuals tested were adults (18–64 years old), 88.53% (n = 1081); followed by children (1–11 years old), 7.29% (n = 89), adolescents (12–17 years old), 3.77% (n = 46), and seniors (65 years and older), 0.41% (n = 5).
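The percentages above follow directly from the counts. As a quick check, a minimal sketch (the variable names are ours):

```python
# Counts of studies per age group, as reported above.
age_counts = {"adult": 1081, "child": 89, "adolescent": 46, "senior": 5}

# Denominator: studies that specified an age group.
total = sum(age_counts.values())

# Share of each group as a percentage, rounded to two decimals.
shares = {group: round(100 * n / total, 2) for group, n in age_counts.items()}
```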

Figure 2 shows the distribution of sample size in each study, which indicates a moderate growth over time (exponential fit). The average sample size among the 1142 studies that reported the sample size was 279 ± 1178, with a median of 93. The large standard deviation is explained by the wide variation from case studies (n = 1) to large-scale open-access smoking cessation studies (n = 23,213, albeit with a high drop-out rate) [21], or more structured large-scale longitudinal trials (n = 6451) reporting effectiveness of intervention [22].

Fig. 2. Progression of sample size over time. Number of participants in the study (Y-axis) is scaled logarithmically.

Overall, 212 studies compared patients with specific clinical conditions to healthy controls, 198 studies included only healthy controls, and 823 studies included patients only.

The prevalence of studies targeting Depression was the highest (645 studies) followed by Anxiety (493 studies), Pain (337 studies) and Stress (131 studies); with significant overlap between them, the largest being Anxiety and Depression (269 studies).

3.2 Types of ICTs Studied

Within this review, 44% of studies deployed Web-based interventions (n = 561), followed by Apps (27%; n = 346), VR (21%; n = 271), Gamified interventions (10%; n = 131), text messaging (n = 32) and other types of interventions (such as Augmented Reality, n = 5; telephones, n = 4; or multimedia, n = 10), with a few using multimodal interventions.

The prevalence of each modality being used for a specific type of study is reported in Table 1. As can be seen, the total number of counted outcomes and interventions is greater than the number of articles, which indicates co-occurrence of conditions or modes of intervention.

Table 1. Type of ICT used for RCT interventions (not mutually exclusive).

3.3 Application of ICT in Clinical Mental Health Care

While coding the studies, we noted a significant degree of overlap between different conditions, as well as multimodal interventions that used hybrid forms of interaction with patients while offering digital interventions. As Fig. 3 illustrates, our network-based illustrations showed that Depression was the most central condition (n = 645) within the themes of this review. Web- and App-based interventions, Therapy and Self-Management clustered together with themes related to Depression, Anxiety and Stress, as well as Expert-guided interventions and Cognitive Behavioral Therapy (CBT) (magenta). On the other hand, VR and Games clustered together with the condition of Pain and with the themes of Distraction, Physical training and Data Generation (black). The use of ICTs for Tracking, Clinical follow-up and diagnosis formed a separate cluster (green). This modular representation indicates that there is a certain degree of bias in what type of ICT was preferred for which type of clinical condition. In addition, the fact that the network seems to have low modularity suggests that possibilities for multi-modal interventions for these correlated conditions were being explored.

3.4 Evolution of Study Targets and Outcomes Over Time

As Fig. 4 shows, there has been a steady growth in both the number of controlled trials (70% of articles reviewed here, n = 890) and the number of positive reports of interventions having been effective in reducing symptoms or improving patients' quality of life or care (n = 638). Proportionately, the number of studies that have not satisfied the efficacy criterion is small (n = 97). A relatively small portion of studies have also revealed complexities in administration or uptake of the interventions related to participant characteristics (n = 155). Notably, starting in 2007, a new category of publication, Protocol (n = 177), has started to emerge, with the highest peak in 2015–2016.
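To make the proportions among the 890 controlled trials explicit, the reported counts can be converted to shares. A minimal sketch, with labels and variable names that are ours:

```python
# Outcome counts among the controlled trials, as reported above.
outcomes = {"effective": 638, "non_effective": 97, "mixed": 155}

# The three categories together account for all controlled trials.
trials = sum(outcomes.values())

# Share of each outcome category, in percent, rounded to one decimal.
share = {label: round(100 * n / trials, 1) for label, n in outcomes.items()}

# Combined share of trials that reported non-effective or mixed results.
not_clearly_effective = round(
    100 * (outcomes["non_effective"] + outcomes["mixed"]) / trials, 1
)
```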

Fig. 3. Clustering Themes. The size of the letters indicates centrality (i.e., importance and frequency of connections), and lines depict the frequency of connections. Colors indicate the likelihood of belonging to the same theme cluster. (Color figure online)

Fig. 4. Study outcomes measured and reported over time

4 Discussion

4.1 The Key Findings

Our results are consistent with a similar systematic review of the literature in 2015, which identified Depression and CBT as the most prevalent themes in ICT-related mental health research [23]. However, they also illustrate a change from 2016, when another review failed to demonstrate consistent efficacy of ICT-based interventions [24]. Our review confirms that ICTs have been extensively tested for therapeutic purposes in pain and mood disorders, and extends knowledge by illustrating a larger research landscape where inter-relations between comorbid conditions and multimodal interventions are explored.

A few key concepts are as follows:

  • Depression and Anxiety are the most studied conditions, with web- and app-based interaction being implemented for providing CBT, or guided therapies.

  • Applications such as games and VR have been most extensively studied in pain to create analgesia or in surgical planning to reduce acute anxiety. These interventions have also been used to serve in physical training, as well as in creating simulated fear or anxiety responses in studying vulnerability to anxiety or mood disorders.

  • The experience of users is more likely to have been considered in therapeutic studies of depression and anxiety, but specifics of user experiences are seldom a part of the experimental designs.

  • All ICTs listed here have the potential to be data-collection tools. However, this modality of application does not have a significant weight in the body of literature reviewed here (RCTs on PubMed).

4.2 The Bigger Picture

The bigger picture emerging from this review is the remaining gaps that need to be addressed:

ICTs Are Not Integrated in Clinical Care for Pain, Anxiety, and Depression.

This review shows that, despite growing evidence in favor of increasing healthcare capacity by adding ICTs, they were not yet integrated in the healthcare system at the time of this review.

The World Health Organization's first agenda on eHealth (involving 58 nations) was announced in May 2005 (WHA58.28), urging member states to draw up long-term strategies and multisectoral collaborations for developing the legal, logistical and technological plans needed to implement a range of eHealth services [25]. As Fig. 4 illustrates, the number of PubMed-indexed RCT publications began rising around the same time.

However, the relatively small sample sizes (Fig. 2), study protocols without a unifying data-collection/harmonization strategy, and ambiguities in data interpretation due to variations in experimental design, expert guidance and adherence to ecological data gathering suggest that clinical eHealth implementations, at least in mood, anxiety and pain conditions, are still lagging.

‘Controlled’ Trials in Digital Mental Healthcare Remain Challenging.

The point of controlled trials is to minimize the sources of variation among the study participants, so that observed changes can be attributed to the modification of a single element in the study. In mental health research, controlling for variations often translates to careful stratification of the sample in terms of age, gender, and clinical diagnosis, and controlling for moderating factors such as personality, cognitive and affective states, etc. However, unlike pharmacological therapies, ICT-based therapies involve a complex range of active decision making and executive agency, which play a role in the phenomenology of a given experience with technology. To date, these factors remain difficult to characterize and quantify. Nevertheless, this review reveals an emerging trend of studies that use adaptive algorithms for diagnostics or for experimental manipulation of user experience (e.g., by changing the intensity of VR imagery, or using it as a stimulus to induce anxiety). Within this review, examples of these applications (Chatbots, VR-induced hypnosis, or data-driven decision support systems) in RCTs are still too few.

Human Factors Pose a Challenge But Are Not Always Considered in Trial Design and Interpretation.

Given that more than 80% of the studies here were in adults (18–60), and that gender, culture, and geopolitical or socioeconomic variations were not consistently measured or reported as control variables of interest, the bulk of existing evidence is not generalizable. More specifically, within this review, 61 studies were solely about evaluation of user experience in a clinical context. A smaller portion of these were related to specific outcome measures (38/890; i.e., about 4%) and considered user experience, acceptance, and adherence in their analyses; 27 of those produced mixed results. It should be noted that nearly 28% of the clinical trials reported negative or mixed results, perhaps due to individual or contextual differences mediating the results.

A discussion of possible human factors that can impact clinical efficacy (e.g., baseline severity; dyadic procedures; personality factors; presence of co-morbidities or medications; or even blinding and placebo effect) is beyond the scope of this article; however, these are critical factors that must be considered in provisioning adaptive and person-centered design of such applications. On the other hand, clinicians must also account for variations that arise from interindividual differences in human-computer interactions (HCI) [26].

Confounding Effects of ICT-Alone or Expert-Guided Longitudinal Care Are Not Always Explored.

By nature, information and communication technologies involve social interactions with professionals. Most studies reviewed here have a longitudinal design, or an intervention versus wait-list design. In either case, patients benefit from extended access to healthcare providers. Most interventions reviewed here include either an expert or a peer involved in real-time or asynchronous communications (e.g., via video conferencing, SMS, or through social and peer support). We noted (not reported here) that in many of the effective interventions (e.g., those in which the effects were sustained after 12 months), the effects improved if the ICTs were added to the regular therapeutic procedures. It is plausible to postulate that when used as a complementary tool in patient care, ICT increases the capacity of patients to learn from interactive communication with experts, and then personalize their own care (by self-monitoring, and improved decision making about treatment options, or compliance with rehabilitation or pharmacological interventions). Whether stand-alone technologies may produce clinically significant outcomes is an important question that can be studied only if human factors are carefully considered in both the experimental design and the phenomenological assessment of ICT usage in different individuals.

4.3 Limitations and Future Work

This review is limited by its own scale, which served to offer a long-shot perspective on the progression of eHealth in clinical care for comorbid conditions: pain, anxiety, depression, and stress. As such, the experimental factors that may inform the development of a unifying framework for collaborative and international research and development into digital therapeutics remain to be examined more closely.

It should also be noted that this review is based on an inductive qualitative methodology in which the themes for interventions and study objectives were coded by two coders based on keywords in the abstracts. In this paper, we have reported only the first level of our code hierarchy. In future work, we will delve more closely into the clinical elements of these studies.

5 Conclusion

This review illustrates a rapidly growing interest in applying ICTs in care for hard-to-treat conditions such as pain, anxiety, and depression; and suggests that the evidence is mostly promising. However, there is a lack of an experimental or conceptual framework to account for variations in human factors. These variations may arise from the quality and quantity of the interactions among patients and their caregivers, as well as from personal or clinical factors that may influence the experience of individuals while adopting these technologies into their care. Methods to account for such variations are needed.