Introduction

Major depression is one of the most common mental disorders, affecting 10% of the US adult population annually [1]. Depression also contributes to significant comorbidity, functional impairment, and disability. While there are several effective interventions for addressing depression [2, 3], it is estimated that fewer than half of people with depression receive care for their symptoms [4]. Moreover, about one-third of people receiving antidepressants do not experience clinically significant improvements [5], with a significant proportion experiencing symptom recurrence and treatment resistance [6]. Given these limitations, together with issues related to accessibility and stigma [7], there remains an urgent need for depression interventions that address these obstacles.

Over the past decade, therapist-supported digital mental health interventions (DMHIs) have gained significant attention in addressing these needs. Emergent evidence indicates that DMHIs incorporating cognitive-behavioral therapy (CBT) are as effective as in-person CBT at reducing depression symptoms and superior to treatment as usual, waitlist controls, and attention controls [8]. Other DMHIs have successfully incorporated modalities like mindfulness, behavioral activation, stress reduction, cognitive reappraisal, and psychoeducation to address depression, finding improved cost-effectiveness and high scalability over traditional approaches [9].

Despite mounting evidence of the effectiveness of DMHIs in reducing depression symptoms [8, 9], few studies have systematically examined differential symptom trajectories among people receiving DMHIs. Because existing studies consistently indicate that changes in depression symptoms during traditional, in-person psychological interventions are characterized by combinations of gradual changes and abrupt shifts [10], evaluating real-world data from DMHIs represents an emergent area of inquiry to better understand differential symptom trajectories across participant populations and inform the personalization of DMHIs based upon the trajectories’ characteristics [11, 12]. Similarly, linking program engagement to differential trajectories may help proactively identify treatment resistance and non-adherence before dropout occurs [12, 13]. Such information could be used to develop predictive algorithms to aid clinicians in monitoring and intervening with participants at increased risk of dropout. The impact of key sociodemographics (e.g., age, gender) and clinical characteristics (e.g., symptom chronicity, trauma) on differential trajectories of depression symptoms also bears critical importance for developing precision care strategies for DMHIs [11,12,13,14]. To date, however, few studies have leveraged real-world data from a DMHI to investigate differential trajectories of depression symptoms and examine multivariate associations with program engagement, sociodemographics, and clinical characteristics.

To address this gap, this study used a model-based clustering technique called repeated measures latent profile analysis (RMLPA) to investigate differential trajectories of depression symptoms among 2192 people who participated in a 12-week DMHI between January 2020 and July 2021. The primary hypothesis was that RMLPA would empirically identify two or more distinct trajectories and at least one trajectory would be indicative of treatment resistance. The secondary hypothesis was that trajectories demonstrating the largest improvement in symptom severity would be associated with higher treatment engagement independent of sociodemographic and clinical characteristics.

Methods

Study design and participants

The sample included 2192 people aged 18 to 82 (mean = 39.1) who participated in a DMHI called the Meru Health Program (MHP) from January 1, 2020 to July 6, 2021. Referral to the MHP was through healthcare providers and employee assistance programs. Inclusion criteria determined at program intake included (1) having at least mild levels of depression, anxiety, or burnout; (2) owning a smartphone; (3) no active substance use disorder; (4) no severe active suicidal ideation with a specific plan or severe active self-harm; (5) no history of psychosis or mania; and (6) being 18 years of age or older.

The MHP incorporates self-guided modules with interactions with a dedicated, licensed clinical therapist through a smartphone app. The MHP lasts 12 weeks and contains components of cognitive behavioral therapy, behavioral activation therapy, mindfulness, sleep therapy, nutritional psychiatry, and heart rate variability biofeedback. After people are initially trained on how to use the app, they proceed with weekly modules that begin with introductory psychoeducation videos about the main topic. People can participate in an anonymous group chat and interact with licensed clinicians over the 12-week period. Protocols are also in place to handle mental deterioration and emergencies. Additional details about the intervention may be found elsewhere [15, 16].

All enrolled individuals consented to participate and have their deidentified data used for research purposes. Data are stored in Health Insurance Portability and Accountability Act-compliant electronic medical records that include protected health information. All data are encrypted in transit and at rest. Institutional review board exemption for this analysis was obtained from the Pearl Institutional Review Board (21-MERU-114) for analyses of previously collected and de-identified data. This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline for cohort studies.

Procedures

We calculated a composite score by summing the nine items on the PHQ-9, a validated measure of depression in the past two weeks in clinical and population-based samples [17]. The PHQ-9 was administered at baseline and biweekly over the course of the program. The items refer to the frequency of experiencing depression symptoms, including anhedonia, low mood, sleep, fatigue, appetite, self-image, difficulty with concentration, reduced activity, and suicidal ideation in the past two weeks. Each PHQ-9 item was scored on a four-point Likert scale (0, not at all; 1, several days; 2, more than half the days; 3, nearly every day), with composite scores ranging from 0 to 27.
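
The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function name `phq9_total` is hypothetical, and rejecting incomplete questionnaires is an assumption, since the handling of item-level missingness is described only in the methodological supplement.

```python
from typing import Sequence

PHQ9_ITEM_COUNT = 9
# 0 = not at all, 1 = several days, 2 = more than half the days, 3 = nearly every day
VALID_RESPONSES = (0, 1, 2, 3)


def phq9_total(item_scores: Sequence[int]) -> int:
    """Sum nine PHQ-9 item responses into a composite score between 0 and 27."""
    if len(item_scores) != PHQ9_ITEM_COUNT:
        # Assumption: incomplete questionnaires are rejected rather than prorated.
        raise ValueError(f"expected {PHQ9_ITEM_COUNT} items, got {len(item_scores)}")
    for score in item_scores:
        if score not in VALID_RESPONSES:
            raise ValueError(f"item response out of range: {score!r}")
    return sum(item_scores)
```

Calling `phq9_total` on nine responses of 3 gives the scale maximum of 27, and nine responses of 0 give the minimum of 0.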

Sociodemographic variables included gender identity (male, female, gender-expansive) and age. Clinical characteristics included chronicity of major depressive episode (MDE; none, single, recurrent), taking a psychotropic medication (yes/no), lifetime psychiatric hospitalization (yes/no), lifetime suicide attempt (yes/no), and lifetime traumatic event (yes/no). The average number of days a patient was actively using the app served as a continuous indicator of program engagement. Program completion was defined as engaging in at least 50% (six or more out of 12) of the treatment weeks [18, 19].
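
As a rough illustration of the completion and engagement definitions above, the following Python sketch encodes the 50% completion threshold and one possible engagement measure. The function names are hypothetical, and expressing engagement as the proportion of enrolled days with app activity is an assumption about how the indicator was operationalized.

```python
TOTAL_WEEKS = 12
COMPLETION_THRESHOLD = 0.5  # at least 50% of the treatment weeks


def completed_program(weeks_engaged: int, total_weeks: int = TOTAL_WEEKS) -> bool:
    """Return True when a participant engaged in at least half of the treatment weeks."""
    if not 0 <= weeks_engaged <= total_weeks:
        raise ValueError("weeks_engaged must be between 0 and total_weeks")
    return weeks_engaged / total_weeks >= COMPLETION_THRESHOLD


def proportion_active_days(active_days: int, enrolled_days: int) -> float:
    """Share of enrolled days with app activity (assumed engagement measure)."""
    if enrolled_days <= 0 or not 0 <= active_days <= enrolled_days:
        raise ValueError("need 0 <= active_days <= enrolled_days and enrolled_days > 0")
    return active_days / enrolled_days
```

Under this sketch, six engaged weeks count as completion and five do not.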

Statistical analysis

PHQ-9 composite scores were examined graphically to illustrate average levels of depression symptoms at each biweekly interval for the overall population. We conducted RMLPA to identify differential trajectories of depression symptoms over the 12-week intervention. RMLPA accounts for the interdependence between repeated measures without making the assumptions about the functional form and distribution of the observed variables (i.e., depression symptoms over 12 weeks) that other growth modeling approaches require [20, 21]. Rather than modeling scaled change, RMLPA (along with its categorical analog, repeated measures latent class analysis) models patterns of states across time [21], making it well-suited to empirically identify groups of participants experiencing non-linear and discontinuous trajectories during behavioral and mental health interventions [22,23,24,25,26]. Models were fit with one to six latent profiles. Model fit was determined with the Akaike information criterion (AIC), Bayesian information criterion (BIC), sample-size-adjusted BIC (aBIC), bootstrapped likelihood ratio test (BLRT), and entropy [20, 27]. In addition to fit statistics, models were compared graphically to examine whether a higher number of latent profiles improved the theoretical interpretation of the data. The RMLPA was reported using an adapted version of the Guidelines for Reporting on Latent Trajectory Studies checklist [28].
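
For readers unfamiliar with the fit indices named above, the sketch below shows their commonly used textbook definitions in Python. It is offered under stated assumptions: the function names are hypothetical, the formulas are the standard ones (the aBIC replaces n with (n + 2) / 24, and relative entropy is normalized to the 0 to 1 range), and the exact variants computed by the study's software are not stated in the text.

```python
import math
from typing import Sequence


def aic(log_lik: float, n_params: int) -> float:
    """Akaike information criterion: lower values indicate better fit."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion, penalizing parameters by log(n)."""
    return n_params * math.log(n_obs) - 2 * log_lik


def adjusted_bic(log_lik: float, n_params: int, n_obs: int) -> float:
    """Sample-size-adjusted BIC, using (n + 2) / 24 in place of n."""
    return n_params * math.log((n_obs + 2) / 24) - 2 * log_lik


def relative_entropy(posteriors: Sequence[Sequence[float]]) -> float:
    """Classification certainty in [0, 1]; 1 means every case is assigned unambiguously.

    `posteriors` holds one row per participant with that participant's
    posterior probability of belonging to each latent profile.
    """
    n_cases = len(posteriors)
    n_profiles = len(posteriors[0])
    if n_profiles < 2:
        raise ValueError("relative entropy is undefined for a one-profile model")
    uncertainty = 0.0
    for row in posteriors:
        for p in row:
            if p > 0:  # treat 0 * log(0) as 0
                uncertainty -= p * math.log(p)
    return 1 - uncertainty / (n_cases * math.log(n_profiles))
```

In a comparison like the one described above, these functions would be evaluated once per candidate model (one to six profiles) and inspected together with the plotted trajectories rather than used as a single automatic selection rule.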

After determination of the most parsimonious model, patients were classified according to their most likely 12-week treatment profile. Associations between the profiles and key covariates were evaluated using the standard three-step method [29]. Changes in pre- and post-program PHQ-9 scores were evaluated with Hedges’ g. A minimal clinically important difference (MCID) was defined as a five-point change in PHQ-9 scores from baseline to week 12 [30]. To determine multivariate associations with sociodemographic and clinical characteristics, a multinomial logistic regression model was fit with the treatment response profiles entered as the dependent variable. The profile with the lowest levels of depression symptoms was used as the referent outcome. As a sensitivity analysis, calendar year (2020 vs. 2021) was included as a covariate to control for secular trends in depression symptoms potentially related to the COVID-19 pandemic [31].
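
The effect size mentioned above can be illustrated with a short Python sketch. It assumes the common pooled-standard-deviation form of Hedges' g with the usual small-sample correction; the text does not specify whether a paired (repeated measures) variant was used, so this function (a hypothetical name) should not be expected to reproduce the values reported in Table 2.

```python
import math


def hedges_g(mean_pre: float, sd_pre: float, n_pre: int,
             mean_post: float, sd_post: float, n_post: int) -> float:
    """Standardized pre-to-post difference with small-sample bias correction.

    Assumption: pooled-SD (independent groups) form; a paired-design
    variant would also account for the pre-post correlation.
    """
    df = n_pre + n_post - 2
    pooled_var = ((n_pre - 1) * sd_pre ** 2 + (n_post - 1) * sd_post ** 2) / df
    cohens_d = (mean_pre - mean_post) / math.sqrt(pooled_var)
    correction = 1 - 3 / (4 * df - 1)  # approximate bias-correction factor
    return cohens_d * correction
```

Positive values indicate lower scores after the program than before it. With samples in the thousands, the correction factor is very close to 1, so g and Cohen's d are nearly identical.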

All statistical analyses were conducted with RStudio, Version 1.3.959. Statistical significance was defined as a two-sided p value ≤ 0.05. The study was preregistered on the Open Science Framework with open code (https://osf.io/5adme/). Our analytical approach utilized intention-to-treat (ITT) analyses whereby all participants who enrolled in the program were included regardless of intervention engagement or attrition [32]. Multiple imputation with chained equations was used to account for missing data [33]. Additional details regarding missing data analyses, multiple imputation, and RMLPA modeling procedures are included in the methodological supplement.

Results

Figure 1 shows the changes in PHQ-9 scores over the 12-week period for the 2192 participants. PHQ-9 scores decreased from 11.7 (SD 6.0) at baseline to 5.0 (SD 4.0) at week 12, with the largest reductions occurring during the first four weeks of the program. Overall, MHP participants were approximately 39 years old and predominantly female (79.9%). Recurrent MDE was indicated by 39.1% of participants, while 39.5% reported a major trauma and 29% reported use of a psychotropic medication. Lifetime suicide attempts were reported by 4.4% of participants and psychiatric hospitalizations by 4%. Participants were active an average of 39% of the days enrolled in the program and 73.9% completed the program.

Fig. 1 Overall Change in PHQ-9 Scores During a 12-Week Digital Mental Health Intervention (N = 2192). PHQ-9, Patient Health Questionnaire-9. Error bands indicate 95% confidence intervals.

After fitting models with one to six latent profiles, the four-profile model was considered the best based upon established recommendations [29]. While models with a greater number of latent profiles were associated with lower AIC, BIC, and aBIC, the fit statistics plateaued after four profiles. Additionally, models with more latent profiles were highly overlapping and did not improve entropy or clinical interpretability. Thus, we selected the more parsimonious four-profile model.

From this model, four latent profiles emerged with distinct trajectories of depression symptoms during the 12-week program (Fig. 2). The largest profile (n = 845, 38.6% of participants) was characterized by moderate levels of depression symptoms and quick MCIDs in depression symptoms (PHQ0 = 12.0, PHQ12 = 4.0, Δ = 8.0; Hedges’ g = 1.45). A similar profile with moderate levels of depression symptoms was also derived (n = 432, 19.7%) and demonstrated very high rates of dropout (98.2%). A moderately severe profile with slow MCIDs (n = 498, 22.7%) showed the highest levels of depression symptoms over the course of the 12-week program (PHQ0 = 15.2, PHQ12 = 9.7, Δ = 5.5; Hedges’ g = 1.00). The mild profile had the lowest levels of depression symptoms, which decreased over the 12-week period (PHQ0 = 5.9, PHQ12 = 1.9, Δ = 4.0; Hedges’ g = 1.38). The distributions of sociodemographic variables and clinical characteristics among the four treatment profiles are shown in Table 1. Table 2 shows the changes in PHQ-9 scores and effect sizes for the profiles. The Hedges’ g statistics ranged from 1.00 to 1.77, indicating large effect sizes in the changes in pre- and post-program PHQ-9 scores overall and for each of the four profiles.

Fig. 2 Final Solution of Depression Symptom Trajectories During a 12-Week Digital Mental Health Intervention (N = 2192). PHQ-9, Patient Health Questionnaire-9; MCID, minimal clinically important difference. Error bands indicate 95% confidence intervals. The Dropout trajectory is illustrated as part of an intent-to-treat framework using multiple imputation with chained equations (see methodological supplement). Within the Dropout trajectory, the average number of weeks completed was 3.43 (standard deviation = 2.68), with 1.9% completing the program (at least 50% of the 12 treatment weeks).

Table 1 Bivariate associations between depression symptom trajectories, sociodemographics, and clinical characteristics (N = 2192)
Table 2 Changes in PHQ-9 Scores by depression symptom trajectory

Table 3 shows the results from the multinomial logistic regression model with the mild profile specified as the reference outcome. Younger age was associated with the moderately severe, slow MCID profile (aOR 0.98, 95% CI 0.97–0.99), while use of a psychotropic medication (aOR 1.69, 95% CI 1.20–2.36) and program engagement (aOR 2.57, 95% CI 1.36–4.86) were associated with increased odds of membership in that profile. Program non-completion (aOR 99.34, 95% CI 40.12–245.94) and lack of program engagement (aOR 0.0015, 95% CI 0.001–0.002) were strongly associated with the dropout profile. Recurrent MDEs and a history of major trauma were associated with increased odds of membership across the profiles. In a sensitivity analysis, calendar year was not a significant predictor of treatment response, nor did any of the parameter estimates change by over 10%.
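
To aid interpretation of the adjusted odds ratios above, the Python sketch below converts between the odds-ratio scale and the log-odds scale on which a logistic model is estimated. The helper names are hypothetical, and the sketch assumes Wald-type 95% confidence intervals that are symmetric on the log scale, which the text does not confirm.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_odds_and_se(odds_ratio: float, ci_lower: float, ci_upper: float,
                    z: float = Z_95) -> tuple[float, float]:
    """Recover the coefficient and its standard error from a reported OR and CI."""
    beta = math.log(odds_ratio)
    std_err = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return beta, std_err


def odds_ratio_with_ci(beta: float, std_err: float,
                       z: float = Z_95) -> tuple[float, float, float]:
    """Exponentiate a coefficient and its Wald interval back to the OR scale."""
    return (math.exp(beta),
            math.exp(beta - z * std_err),
            math.exp(beta + z * std_err))


def scale_odds_ratio(odds_ratio: float, units: float) -> float:
    """Odds ratio implied for a change of `units` in a continuous predictor."""
    return odds_ratio ** units
```

For example, if age was entered in years (an assumption), `scale_odds_ratio(0.98, 10)` gives roughly 0.82, meaning each additional decade of age would correspond to about 18% lower odds of belonging to the moderately severe profile relative to the mild profile.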

Table 3 Multivariate associations between symptom trajectories, sociodemographics, and clinical characteristics (N = 2192)a

Discussion

This study investigated differential treatment response among a sample of 2192 people across a broad age range who engaged with a 12-week DMHI. Overall, depression symptoms measured by the PHQ-9 significantly improved during the intervention. However, this group trend masks differential trajectories of depression symptoms that may inform precision care for target subpopulations who have different core needs and capabilities in DMHIs. In support of our primary hypothesis, RMLPA empirically identified four distinct treatment profiles that varied by pre-treatment symptom severity and trajectory of depression symptoms over 12 weeks (e.g., linear or “steady improvers”, log-linear or “rapid initial improvers”, cubic or “downward staircase improvers”). Additional variability in the propensity for dropout was also found among the profiles. The hypothesized identification of a treatment-resistant group was partially supported, as the moderately severe profile demonstrated MCIDs in depression symptoms but did not reach more stringent definitions of treatment response or achieve remission [34, 35]. Lastly, the null for the secondary hypothesis was not rejected, as the highest amount of program engagement was not found among the profile with the largest reduction in depression symptoms.

While the overall pattern of treatment response was consistent with the sudden gains phenomenon [10], RMLPA identified four distinct trajectories of change in depression symptoms during a 12-week DMHI. The largest treatment profile (39%) was characterized by people who began treatment with moderate levels of depression symptoms and subsequently demonstrated rapid improvements during the first four weeks followed by slower change thereafter (a log-linear trajectory). A similar profile (19%) was identified for people starting with mild symptoms, who also exhibited a log-linear pattern of change, albeit with lower levels of symptoms throughout the intervention. The second largest profile (22.7%) had moderately severe levels of symptoms that decreased steadily over 12 weeks (linear trajectory). The remaining profile consisted of people with moderate levels of depression symptoms (19.7%) who accounted for the majority of dropout. These findings are highly consistent with previous studies finding non-linear trajectories of depression symptoms during mental health interventions [10, 36, 37]. The present study is among the first to use RMLPA to investigate differential treatment response patterns in a novel DMHI, leveraging a large sample size in combination with a range of sociodemographic and clinical characteristics shown to influence treatment response.

Several notable findings emerged from the multivariate analyses. Younger individuals were more likely to be in the moderately severe treatment response profile. This finding is consistent with clinical and population-based studies demonstrating elevated levels of depression symptoms and mental health problems that peak in early adulthood [38]. Higher levels of psychotropic medication use and program engagement were also found among the moderately severe profile, likely indicative of more complex clinical histories and presenting issues [39]. Although the moderately severe profile identified in this study experienced MCIDs in depression symptoms (PHQ0 = 15.2, PHQ12 = 9.7, Δ = 5.5) and demonstrated high levels of adherence (92.8%), the 36% reduction in symptom severity and post-treatment score of 9.7 did not meet more stringent definitions of treatment response (e.g., 50% decline in depressive symptoms) and remission (PHQ12 < 5), respectively [30]. As these findings are suggestive of treatment resistance [6], higher levels of residual symptoms at the end of the intervention among the moderately severe profile may portend poorer long-term prognoses, including symptom relapse, failure to achieve recovery, and psychosocial impairments [39]. Additional efforts focused on continuation treatment, relapse prevention, and psychosocial monitoring may therefore be warranted for certain subpopulations after program completion [40].
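
The thresholds discussed above (a five-point MCID, a 50% decline for treatment response, and a week-12 score below 5 for remission) can be applied together as in the following Python sketch. The function name and the returned field names are hypothetical, and treating the MCID as a reduction of at least five points is an assumption about how the cutoff is applied.

```python
MCID_POINTS = 5.0         # minimal clinically important difference on the PHQ-9
RESPONSE_FRACTION = 0.5   # 50% decline from baseline
REMISSION_CUTOFF = 5.0    # week-12 PHQ-9 score below 5


def summarize_change(baseline: float, week12: float) -> dict:
    """Classify a pre/post PHQ-9 change against the three outcome definitions."""
    if baseline <= 0:
        raise ValueError("baseline must be positive to compute percent reduction")
    change = baseline - week12
    percent_reduction = change / baseline
    return {
        "change": change,
        "percent_reduction": percent_reduction,
        "mcid": change >= MCID_POINTS,  # assumption: at least five points
        "response": percent_reduction >= RESPONSE_FRACTION,
        "remission": week12 < REMISSION_CUTOFF,
    }
```

Applied to the moderately severe profile means (baseline 15.2, week 12 9.7), the sketch flags an MCID, a reduction of about 36%, and neither response nor remission, matching the description in the text.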

The pattern of engagement among the treatment profiles also suggests that clinical outcomes obtained from interactions with the MHP were not always proportional to time spent on the program. For example, the moderately severe profile had the highest level of engagement coupled with the lowest effect size in pre- and post-treatment PHQ-9 scores (Hedges’ g = 1.00). Nevertheless, all levels of engagement were associated with improvement in depression symptoms across the treatment profiles, which is suggestive of the dodo bird verdict [41]. Follow-up studies are necessary to more rigorously evaluate the impacts of multiple engagement measures (e.g., therapist communications, mindfulness exercises, group participation) and treatment processes like the therapeutic alliance on depression symptoms in DMHIs [13, 41]. Similarly, lifetime trauma exposures and recurrent MDEs were significantly associated with the treatment profiles in the multivariate analyses. Both of these clinical characteristics have been shown to influence the development and course of psychiatric disorders, including depression, through transdiagnostic mechanisms such as emotional dysregulation and reactivity, associative learning, and rumination [42,43,44]. Given the ubiquity of trauma exposures and recurrent MDEs in the current study, addressing a wide range of trauma types and transdiagnostic characteristics during intake may be useful in triaging patients to certain therapy modules and personalizing care with more precision [11, 45].

Several limitations and offsetting strengths are acknowledged. First, because the sample excluded participants who had more severe forms of mental illness (e.g., active suicidality, schizophrenia, bipolar I), these results may generalize best to people with mild, moderate, or moderately severe levels of depression symptoms. Second, this study may only generalize to therapist-supported DMHIs, which have been shown to be associated with higher levels of engagement and more positive outcomes compared to self-guided, text-based, and automated DMHIs [46,47,48,49]. It is possible that therapist support increases engagement and symptom improvements through a remote continuous care approach whereby therapists first establish rapport and trust with patients, then foster a digital therapeutic alliance over the duration of the intervention [50, 51]. This study also used a single measure of program engagement, which may not have fully captured the potential effects of other engagement types on treatment outcomes [13]. In addition, this study investigated differential trajectories of depression symptoms using a brief, self-administered instrument, limiting direct comparability with patients formally diagnosed with major depressive disorder who participate in DMHIs [30].

While some evidence suggests that behavioral change theories guide the parameterization of trajectories a priori [52], the use of RMLPA with a large sample of patients allowed for the empirical identification of treatment profiles with discernible symptom trajectories. This strategy facilitated a comprehensive characterization of treatment response without assumptions about the data’s potential functional forms and distributions, including evaluation of multivariate associations with several clinically relevant covariates [20, 21]. RMLPA also robustly identified a treatment profile accounting for the majority of program dropout (424 out of 572 participants), although the imputation procedures assume that data for dropout participants with missing responses on the PHQ-9 may be predicted with data from participants who provided data. As participants with poorer engagement and clinical outcomes are less likely to respond, PHQ-9 scores imputed based upon the observed characteristics of participants may introduce methodological artifacts based upon inaccurate assumptions. Nevertheless, research indicates that multiple imputation leads to less bias than listwise deletion and other common methods [53, 54]. Additional missing data procedures outside the scope of this study will be necessary to confirm the cubic trajectory demonstrated among the dropout profile [55, 56]. Similarly, other ameliorative efforts may be required to determine if adherence can be boosted through strategies like motivational interviewing, engagement checks, encouragement, and prompts (both automated and human) unrelated to the therapeutic content [57]. Lastly, while the use of ITT analyses provided more conservative estimates of intervention effects compared to studies only focusing on treatment completers, the lack of control groups still raises the possibility that symptom improvements may have been a function of natural remission. Data from randomized controlled trials of psychotherapy and pharmacotherapy for depression indicate that natural remission is a common phenomenon shown to contribute to placebo effects, suggesting that external factors may have influenced symptom improvements in this DMHI [58, 59].

The findings from this study have important implications for precision care. Both lifetime trauma and MDE chronicity were common, suggesting the need for trauma-informed treatment modules as well as protocols for addressing more complex symptomatology, improving the therapeutic alliance, and increasing adherence. Post-treatment cohort studies are also necessary to better understand the long-term effectiveness of DMHI, factors predictive of continued symptom remission, and feasibility of case management and relapse prevention interventions. Similarly, mixed methods studies may help elucidate processes related to acceptability, satisfaction, and implementation fidelity from both the patient and clinician perspectives [60]. Taken together, the findings highlight the importance of evaluating differential treatment response and identifying key processes that promote improvements in depression symptoms among people participating in DMHIs.