Effectiveness of Blended Cognitive Behavioral Therapy Versus Treatment as Usual for Depression in Routine Specialized Mental Healthcare: E-COMPARED Trial in the Netherlands

The general aim of this study was to investigate the effectiveness of blended Cognitive Behavioral Therapy (bCBT) as compared to Treatment as Usual (TAU) for depression in specialized routine mental healthcare in the Netherlands. We further explored a range of secondary outcome variables, including quality of life, clinical response, remission and reliable improvement, as well as clinical deterioration and potential negative effects of treatment. A total of 103 patients with Major Depressive Disorder were recruited as part of the E-COMPARED project and randomly allocated to bCBT (n = 53) or TAU (n = 50). Measurements took place at baseline and at 3-, 6- and 12-months follow-up. Treatment effects were analyzed using linear mixed-effects models for repeated measures. Depressive symptoms significantly declined and quality of life significantly improved over time in both bCBT and TAU during the 12-month follow-up. No significant interaction effects between treatment group and assessment point were found. Likewise, there were no significant differences between the two treatment groups on secondary outcomes. Patients following bCBT went from severe to mild symptom severity, with large within-group effects. Applying bCBT in routine specialized mental healthcare seems promising, but it is a relatively new form of treatment that is still under development, and more research is needed. Netherlands Trials Register NTR4962. Registered on 5 January 2015.


Background
Major depression is highly prevalent and burdensome worldwide (Vigo et al., 2016), including in the Netherlands (de Graaf et al., 2012). It not only has a strong impact on the quality of life of patients (Vos et al., 2019), but also poses a substantial societal and economic burden (Greenberg et al., 2021; Lépine & Briley, 2011). Despite the establishment of evidence-based psychological treatments for depression, many depressed individuals do not receive these clinical services in an adequate and timely manner due to constrained resources of mental health organizations (Mooitra et al., 2022; Patel et al., 2010). Access barriers, such as the restricted availability of qualified therapists resulting in long waiting times before treatment can start, impede utilization of mental healthcare services (Araya et al., 2018). This underlines the need for treatment approaches that improve access to evidence-based treatments.
Digital technologies have the potential to scale up psychological treatment resources by enhancing flexibility in timing and location of services, and potentially reducing costs. This contributed to the development of internet-based interventions such as Internet-based Cognitive Behavioral Therapy (iCBT). iCBT has been found to be effective in reducing depressive symptoms in numerous randomized controlled trials (Karyotaki, 2021). This meta-analysis also showed that guided interventions lead to higher effect sizes than unguided ones, but only for the severely depressed. A systematic review by Etzelmueller et al. (2020) showed that iCBT may also be effective in routine care, although uptake is limited. A next step in implementing iCBT is to use it in specialized mental healthcare for complex and more severely depressed patients, to increase patient empowerment and (cost-)effectiveness of the treatment, and at the least to provide an evidence-based treatment alternative to face-to-face CBT for depression. These patients experience more restrictions in daily functioning and often need more intensive motivation (Hadjistavropoulos et al., 2017). In the Netherlands, online programs are available for these patients, but uptake is low due to therapist and patient factors (Mol et al., 2020). Therapists as well as patients mostly prefer having (also) face-to-face contact (van Schaik et al., 2022). Therefore, blended treatment formats have been developed, wherein digital treatment components are integrated with face-to-face (FTF) psychotherapy into one treatment protocol, to combine the advantages of online and offline treatment modalities and stimulate the uptake of online treatment into routine mental healthcare. Within blended CBT (bCBT), patients work through the treatment protocol online, do homework exercises and monitor their symptoms (Kooistra et al., 2016).
Therapists can provide feedback on the homework assignments within the online environment and use the FTF sessions to increase patients' understanding of the CBT theory and facilitate the transformation of theory into everyday practice, as well as to address more patient-specific needs. In this way, more practical aspects of therapy can be handled through the digital program, while therapists can focus more on the support of process-related therapy aspects in the FTF sessions, which could aid in improving efficiency (van der Vaart et al., 2014). By giving a time structure and support, therapists may also increase compliance. Furthermore, they may increase treatment intensity in case of crisis situations or suicidality. Hence, in more severely depressed patients, blended treatments may hold many benefits by combining the advantages of an FTF and an online approach. For this research project we worked with a protocol of alternating sessions (50% FTF and 50% online sessions), but outside research settings the distribution of FTF and online sessions appears to be more tailored to individual needs.
While the empirical basis for iCBT is relatively sound, current evidence of blended treatment of clinically diagnosed major depression is in its infancy. Studies initially focused on feasibility and acceptability, indicating promise of this new treatment (Kooistra et al., 2016; van der Vaart et al., 2014; Wentzel, van der Vaart, Bohlmeijer, & van Gemert-Pijnen, 2016). Reported benefits of integrating digital technologies with traditional FTF therapy include less traveling time for patients, increased access to content and thus longer exposure to therapy, the opportunity of regular monitoring of symptoms, improvement of patients' self-management skills, possible lower time-investment for therapists (Wright et al., 2005) and possible reductions in therapist drift (Månsson et al., 2013; van der Vaart et al., 2014).
Regarding treatment outcomes of efficacy trials on integrated blended treatments conducted so far, initial results indicate a decrease of depressive symptoms following treatment based on blending FTF therapy with digital technology-based support (Berger et al., 2018; Høifødt et al., 2013; Kooistra et al., 2019; Ly et al., 2015; Nakao et al., 2018; Thase et al., 2018; Wright et al., 2005; Zwerenz et al., 2017). Three studies focused on patients in specialized mental healthcare settings, comparing bCBT to either standard CBT (Kooistra et al., 2019), regular psychotherapy (Berger et al., 2018) or TAU combined with online information (Zwerenz et al., 2017). Most had the common aim to reduce therapist time by either shortening the FTF session duration, or providing fewer FTF sessions (Kooistra et al., 2019; Ly et al., 2015; Nakao et al., 2018; Thase et al., 2018; Wright et al., 2005). The study that compared bCBT with standard CBT (Kooistra et al., 2019) investigated an intensive blended protocol for the treatment of depression, and results indicated that from a healthcare provider perspective, bCBT could be cost-effective with treatment effects comparable to those of standard CBT.
Up to now, little attention has been paid to possible undesirable outcomes of blended treatments, although this aspect is crucial for clinical decision making. In psychotherapy research, negative effects are defined as "changes that are experienced as negative by the patient and that have direct or indirect harmful effects, or that are experienced by the affected person as detrimental" (Ladwig et al., 2014). Deterioration rates at symptom level are examined most often. Within the domain of online interventions, it has for example been found that self-help as well as guided iCBT groups had a lower risk of reliable deterioration than waiting list controls (Ebert et al., 2016; Karyotaki et al., 2018; Rozental, Magnusson, Boettcher, Andersson, & Carlbring, 2017). Negative side effects can, however, also occur independent of the success or failure of the therapy regarding the main symptoms, and include other domains such as stigma, changes and strains in life areas (work, relationship, family and friends) or dependency within the therapeutic relationship. Further analysis of the different domains of negative effects seems necessary (Rozental, 2016).
The primary aim of this study was to investigate the effectiveness of bCBT as compared to treatment as usual (TAU) in specialized routine mental healthcare in the Netherlands, as part of a large European research project "European Comparative Effectiveness Research on Internet-Based Depression Treatment" (E-COMPARED; www.e-compared.eu). We further aimed to explore and compare a range of secondary outcome variables, including health-related Quality of Life (QoL), clinical response, remission and reliable improvement, as well as clinical deterioration and potential negative effects of treatment.

Study Design and Participants
The E-COMPARED study was a multi-site non-inferiority randomized controlled trial (RCT) comparing bCBT to treatment as usual (TAU) for adult depression in routine care settings across nine European countries. In this paper, we present the results of the trial conducted in the Netherlands, with data collected between June 2015 and July 2018.
In the Netherlands, participants were recruited at seven different study sites, within four participating outpatient specialized mental healthcare centers (MHC). Clinicians at the participating sites were asked to recommend the study to patients in their caseload with depressive symptoms and provide them with a letter containing information about the study. Those willing to be approached were contacted by the research team. In order to be included, participants had to be ≥ 18 years old, have a primary diagnosis of Major Depressive Disorder according to Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) criteria as confirmed with the MINI International Neuropsychiatric Interview (M.I.N.I.), and have a score of 5 or above on the Patient Health Questionnaire (PHQ-9). Having sufficient command of the Dutch language and having access to the internet were additional inclusion criteria. Patients were asked to provide written consent by post before undergoing the online baseline assessment.
The exclusion criteria were having serious psychiatric comorbidity (e.g., bipolar disorder, psychotic illness, or substance dependence) that required alternative treatment taking priority over the treatment of MDD, or serious and imminent suicidal ideation. For applicants at high risk for suicide, therapists were immediately notified so that standardized suicide protocols within the participating mental healthcare centers could be followed. Once the suicide risk had stabilized, the patient could still potentially be eligible to participate. Participants might have received psychological treatment in the past but had to agree not to participate in other psychological treatment for depression parallel to the intervention treatment of the study. In both groups, patients could use or receive additional antidepressant medication, following the usual care protocols for depression treatment (adding or combining antidepressants when the patient does not respond sufficiently to CBT).
The diagnostic interviews were conducted by telephone, by trained assessors from the research team at the coordinating center. Measures were based on self-report and collected using an online survey platform. Assessments took place at baseline (T0), as well as three (T1), six (T2) and twelve months (T3) after randomization.
The study protocol has been published, and it was approved by the Medical Ethics Committee of the VU University Medical Centre (registration number 2015.078).

Randomization
All eligible participants were randomly allocated in a ratio of 1:1 to either the bCBT or TAU group. The allocation was concealed and conducted centrally by an independent team of researchers not otherwise involved in the trial, using a random number generator to produce the allocation scheme. Randomization took place at an individual level and was stratified by study site.

Blended Cognitive Behavioral Therapy (bCBT)
The bCBT integrated individual FTF therapy with both internet- and mobile-based treatment elements, using the Moodbuster platform (Warmerdam et al., 2012). Moodbuster consisted of a web-portal for patients, a web-portal for therapists and a mobile application. Patients had access to treatment modules, homework exercises, a calendar, a mood graph and a messaging system for contact with their therapist through their personal account on the Moodbuster web-portal. With the Moodbuster mobile application, patients could monitor their symptoms (ecological momentary assessment: EMA) and receive automated reminders and motivational messages (ecological momentary intervention: EMI). Therapists were able to review patients' treatment course through the therapist web-portal, allowing them to provide personal feedback and guidance as an integral part of the intervention. This support was asynchronous, whereby the therapist set a date in between the FTF sessions to provide a written feedback message containing supportive and clinical techniques. The FTF sessions alternated weekly with online sessions (ratio 50/50) and followed up on the Moodbuster treatment modules (see below). The Dutch bCBT protocol indicated the application of ten FTF and nine online sessions, over a period of 19-20 weeks.
During the first FTF session, patients were introduced to the Moodbuster program by their therapist. Moodbuster comprises an introduction and six therapeutic modules and offers flexibility in sequence and time spent on each module. After completing the mandatory Introduction and Psycho-education module, patients would choose the next module to start with in consultation with their therapist. The core CBT modules focused on identification and restructuring of dysfunctional thinking (Cognitive restructuring) and activation of behavioral strategies to increase engagement in positive activities (Behavioral activation). Additional modules were also included, aiming to increase physical activity levels (Physical exercise) and to learn skills to proactively solve problems (Problem-solving). At the end of the treatment, therapists would give access to the Relapse prevention module, focusing on treatment evaluation and reviewing skills learned to formulate relapse/recurrence prevention strategies. All bCBT was given by licensed CBT therapists who received additional training on the bCBT protocol. They were instructed to include the core elements of CBT, but were free to decide how many sessions to spend on each module, whether to repeat them, or to add optional modules such as problem-solving or physical exercise. These additional components could, however, not take up more than a quarter of the total treatment (no more than about 25% of the FTF and online sessions combined), preventing too much heterogeneity in the treatment programs. Patients were instructed to work on one module at a time, with a recommended time-frame of one to two weeks. More time was given if needed, however, to provide the opportunity to integrate the exercises in their daily life (Kemmeren et al., 2019).

Treatment As Usual (TAU)
TAU was defined as the routine care given to patients with MDD in the participating outpatient settings, after referral from a general practitioner or primary mental healthcare professional for specialist mental healthcare. Consistent with national treatment guidelines (Spijker et al., 2013), the type of treatment could consist of evidence-based face-to-face psychotherapy (mainly CBT, interpersonal psychotherapy (IPT) or problem-solving therapy (PST)), antidepressant medication, or a combination of these. Usual care was followed without interference and monitored through fidelity checklists from therapists (also filled out for the bCBT condition). The fidelity checklist included the date, duration and the main applied therapeutic strategy of each FTF session.

Therapists
Licensed CBT therapists, as well as CBT therapists in training working under supervision of an experienced CBT therapist, were asked to participate in the study and apply the blended treatment program. All received training on how to deliver the bCBT with Moodbuster, including instructions for use of the online platform, skills for written feedback, content of the CBT modules and application of the blended treatment protocol (Kemmeren et al., 2019). To get familiar with the online environment, 'training accounts' were made available for therapists to walk through the intervention from a patient perspective. Additionally, therapists were provided with a detailed treatment manual to guide them through the bCBT, and regular supervision was provided between therapists and the research coordinator and/or CBT supervisor. In total, 36 therapists from 7 MHC participated in the training. Of those, 24 provided the bCBT within the trial, treating on average 2.2 patients (ranging from 1 to 8). Participating therapists could deliver treatment in both the intervention and the control condition.

Primary Outcome
The primary outcome was depressive symptoms, as assessed with the Patient Health Questionnaire (PHQ-9) at T0-T3. The PHQ-9 consists of nine items representing the DSM-IV criteria, each scored on a range from 0 (not at all) to 3 (nearly every day). Depression severity can be categorically classified, distinguishing between mild (5-9), moderate (10-14), moderately severe (15-19) and severe (≥ 20) major depression (Kroenke, 2012). Higher total scores are indicative of a higher severity of depressive symptoms. The internal consistency reached a value of α = 0.88 (Kroenke et al., 2010).
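The scoring and severity bands described above can be illustrated with a minimal Python sketch. The function names and the label used for totals below 5 are our own illustrative assumptions and are not part of the PHQ-9 or of the study's analysis code.

```python
def phq9_total(item_scores):
    """Sum the nine PHQ-9 items, each scored 0 (not at all) to 3 (nearly every day)."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 requires exactly nine item scores")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be scored 0, 1, 2 or 3")
    return sum(item_scores)


def phq9_severity(total):
    """Map a PHQ-9 total (0-27) to the severity bands listed in the text."""
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 total must lie between 0 and 27")
    if total >= 20:
        return "severe"
    if total >= 15:
        return "moderately severe"
    if total >= 10:
        return "moderate"
    if total >= 5:
        return "mild"
    # The text gives no label for totals of 0-4; this label is an assumption.
    return "below mild threshold"
```

For example, a total score of 16 falls in the moderately severe band (15-19).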

Secondary Outcomes
The Quick Inventory of Depressive Symptomatology Self-Report (QIDS-SR-16) measuring depression severity was additionally assessed, for possible comparison with more generic or other specialized care studies. The QIDS-SR-16 consists of 16 items, with each item scoring from 0 to 3 and a total range of 0-48. The cut-off points of 6, 11, 16 and 21 represent the thresholds for mild, moderate, severe and very severe depression, respectively. Highly acceptable psychometric properties have been shown, with internal consistency of α = 0.86 (Rush et al., 2003).
Further secondary outcomes were used to explore the clinical significance of treatment effects, as well as health-related Quality of Life (QoL) and possible negative effects in the two study groups. Clinical response was defined as 50% reduction from baseline symptom severity (Rush et al., 2006) and remission as a follow-up PHQ-9 score < 5 (Kroenke et al., 2010). Reliable clinical improvement and deterioration of depressive symptoms at an individual level over the course of treatment were based on the Reliable Change Index (RCI) as formulated by Jacobson & Truax (1991). Baseline PHQ-9 scores were subtracted from follow-up scores and then divided by the standard error of the difference scores (Jacobson & Truax, 1991), using the test-retest reliability coefficient of the PHQ-9 (r = .84). Participants were coded as having 'improved' if they experienced a reliable positive change (RCI < −1.96; equal to a decline of 6 points or more on the PHQ-9) and 'deteriorated' if they experienced a reliable negative change (RCI > 1.96; equal to an increase of 6 points or more on the PHQ-9).
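As an illustration of these definitions, the following Python sketch computes the RCI and classifies a single participant, treating deterioration as the mirror image of improvement (an RCI above +1.96). The function names are hypothetical, and the baseline standard deviation of 5.4 used in the example is an assumed value, chosen only because it reproduces the 6-point threshold mentioned above with r = .84; the standard deviation actually used in the study is not reported in this section. Treating "50% reduction" as "at least 50%" is likewise an assumption.

```python
import math


def reliable_change_index(baseline, follow_up, sd_baseline, reliability=0.84):
    """Jacobson-Truax RCI: (follow-up - baseline) / standard error of the difference."""
    se_measurement = sd_baseline * math.sqrt(1.0 - reliability)
    se_difference = math.sqrt(2.0) * se_measurement
    return (follow_up - baseline) / se_difference


def classify_outcome(baseline, follow_up, sd_baseline, reliability=0.84):
    """Return response, remission and reliable-change status for one participant."""
    rci = reliable_change_index(baseline, follow_up, sd_baseline, reliability)
    if rci < -1.96:
        change = "improved"  # reliable decline in PHQ-9 score
    elif rci > 1.96:
        change = "deteriorated"  # reliable increase in PHQ-9 score
    else:
        change = "no reliable change"
    return {
        # Assumption: "50% reduction" is read as a reduction of at least 50%.
        "response": follow_up <= 0.5 * baseline,
        "remission": follow_up < 5,
        "reliable_change": change,
    }


# Hypothetical participant: PHQ-9 of 16 at baseline and 4 at follow-up.
example = classify_outcome(baseline=16, follow_up=4, sd_baseline=5.4)
```

Under these assumptions, a change of 5 points stays within the band of no reliable change, whereas a change of 6 points exceeds it.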
Health-related QoL was assessed with the EuroQol 5D-5L (EQ-5D-5L; The EuroQol Group, 1990). This questionnaire consists of five items (mobility, self-care, daily activities, discomfort and mood state) that each can be scored on a range from 'no problems' to 'a lot of problems'. The EQ-5D-5L health states can be represented by a single summary number (index value), reflecting how good or bad a health state is. Furthermore, the digits for the five dimensions can be combined into a 5-digit number that describes the patient's health state. An algorithm for the Dutch population was used to convert EQ-5D scores into utility values, ranging from 0 to 1. The maximum score of 1 indicates the best health state.
Possible negative side-effects of the treatment were specifically assessed with the Inventory of Negative Effects of Psychotherapy (Ladwig et al., 2014), six months after baseline (T2). The version used in this study consists of 15 items, assessing a range of common changes participants experienced during or after their therapy, concerning their social and work environment. The following domains are covered: negative interpersonal changes, negative effects in different life areas (intimate relationships, family and friends, workplace), perceived dependence on the therapist and stigmatization. The first six items assess negative and positive effects and are rated on a 7-point Likert scale, ranging from − 3 (very positive experience) to 3 (very negative experience). The other nine items are rated on a 4-point Likert scale, ranging from 0 (no agreement at all) to 3 (total agreement). Hence, the intensity of negative effects per item ranges from zero to three. For each item, respondents also state whether they attribute the effects to the therapy or to other circumstances. Only item scores of those negative effects that were attributed to the therapy are added to the total score. For this study, a simple score of negative effects was calculated by adding all negative effects related to therapy (ranging from 0 to 15), as well as the intensity score by summing up the intensity ratings for reported negative effects (ranging from 0 to 45). The higher the scores, the more negative effects. Additionally, negative effects on domain/item basis were analyzed. The INEP has shown an internal consistency of α = 0.86 (for psychometrics of the INEP see Ladwig et al., 2014).
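The two INEP summary scores described above can be sketched as follows. This is a hedged illustration rather than the official scoring routine: the function name and input format are hypothetical, and we assume that for the first six bipolar items only ratings above zero (negative experiences) count towards the intensity of negative effects.

```python
def inep_scores(responses):
    """Compute the number and summed intensity of therapy-attributed negative effects.

    `responses` is a list of 15 (rating, attributed_to_therapy) tuples.
    Items 1-6 are rated from -3 (very positive) to 3 (very negative);
    items 7-15 are rated from 0 (no agreement) to 3 (total agreement).
    """
    if len(responses) != 15:
        raise ValueError("this INEP version has exactly 15 items")
    count = 0
    intensity_total = 0
    for index, (rating, attributed_to_therapy) in enumerate(responses):
        lowest = -3 if index < 6 else 0
        if not lowest <= rating <= 3:
            raise ValueError(f"item {index + 1}: rating {rating} out of range")
        # Assumption: positive experiences (ratings below zero) have intensity 0.
        intensity = max(rating, 0)
        if intensity > 0 and attributed_to_therapy:
            count += 1  # simple score, ranges from 0 to 15
            intensity_total += intensity  # intensity score, ranges from 0 to 45
    return count, intensity_total
```

A respondent who reports two negative effects attributed to therapy, with intensities 2 and 3, would obtain a simple score of 2 and an intensity score of 5; effects attributed to other circumstances are ignored.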

Statistical Analyses
Analysis of variance (ANOVA) for continuous and χ²-tests for categorical variables were used to estimate between-group differences in demographics and pre-treatment measures at baseline, as well as to estimate between-group differences in possible negative side-effects of treatment (INEP) at 6 months follow-up.
Treatment effects were analyzed using linear mixed-effects models (LMM) for repeated measures, given their ability to handle missing data and dependency of the repeated observations of the same participant over time. The LMM included participants as random effects, and group (bCBT vs. TAU), assessment point (T1-T3) and the interaction between group and assessment point as fixed effects. We used an unstructured covariance structure, selected on the basis of the Akaike Information Criterion (AIC; Akaike, 1974). The continuous outcome measures (depression severity: PHQ-9 and QIDS-16; health state utility: EQ-5D-5L) were evaluated in separate mixed models. Differences in clinical response, remission, and reliable improvement and deterioration of depression severity were analyzed using Pearson χ²-analysis. All analyses were performed on an intention-to-treat basis, including all randomized participants, regardless of non-adherence to the intervention or non-response to the follow-up assessments.
To assess the magnitude of treatment effect, Cohen's d effect sizes (Cohen, 1988) were calculated both within and between groups from estimated means and observed pooled standard deviations (SD). For between-group effect sizes, the mean difference between bCBT and TAU scores was divided by the pooled standard deviation of baseline scores ($Mean_{diff} / SD_{pooled}$). Within-group effect sizes used the magnitude of change from baseline to follow-up assessments ($Mean_{diff}$), divided by $SD_{diff} = \sqrt{SD_{pre}^2 + SD_{post}^2 - 2 r \, SD_{pre} SD_{post}}$, with r being the correlation between baseline and follow-up scores. Effect sizes were classified as small (Cohen d = 0.2-0.49), medium (Cohen d = 0.5-0.79) or large (Cohen d ≥ 0.8). Statistical analyses were conducted using the Statistical Package for the Social Sciences 24.0 (SPSS 24.0).
Figure 1 shows the flow of participants from screening to 12-months follow-up (FU) during the study. Between June 2015 and July 2017, 138 patients were screened for eligibility. Of these, 35 (25.4%) were not randomized because they did not meet the inclusion criteria (n = 11) or refrained from participation (n = 24). In total, 103 eligible patients who signed informed consent and completed baseline assessments were randomized to either bCBT (n = 53) or TAU (n = 50). At 3-months FU, n = 83 (80.6%) of patients completed the assessments, 79.2% in the bCBT group and 82% in the TAU group. At 6-months FU, n = 77 (74.8%) completed the assessment (79.2% in bCBT and 70% in TAU) and n = 72 (69.9%) the 12-months FU assessment (67.9% in bCBT and 72% in TAU). The initial goal of including n = 156 participants for the overall European non-inferiority trial was not reached at the end of the E-COMPARED fieldwork. Inclusion in the Netherlands was prolonged for a couple of months in order to increase sample size to explore the effectiveness in the Netherlands.
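The effect size definitions given in the statistical analyses above can be illustrated with a short Python sketch. The function names are hypothetical, the pooled SD formula is the common sample-size weighted version (the text does not state which pooling formula was used), and treating values below 0.2 as "negligible" is our own labeling assumption.

```python
import math


def pooled_sd(sd_a, n_a, sd_b, n_b):
    """Sample-size weighted pooled SD of two groups (assumed pooling formula)."""
    numerator = (n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2
    return math.sqrt(numerator / (n_a + n_b - 2))


def between_group_d(mean_a, mean_b, sd_pooled_baseline):
    """Between-group d: mean difference divided by the pooled baseline SD."""
    return (mean_a - mean_b) / sd_pooled_baseline


def within_group_d(mean_pre, mean_post, sd_pre, sd_post, r):
    """Within-group d: change from baseline divided by the SD of the difference scores."""
    sd_diff = math.sqrt(sd_pre ** 2 + sd_post ** 2 - 2 * r * sd_pre * sd_post)
    return (mean_pre - mean_post) / sd_diff


def classify_d(d):
    """Label an effect size using the cut-offs in the text."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # assumption: the text does not label values below 0.2
```

With hypothetical values (a mean decrease from 16 to 10, SDs of 5 at both assessments and r = 0.5), $SD_{diff}$ equals 5 and the within-group effect size is 1.2, which would be classified as large.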

Study Flow and Participant Characteristics
Socio-demographic and clinical characteristics of participants at baseline are summarized in Table 1. Participants were predominantly female (N = 70, 68%) with an average age of 39.3 years (SD = 11.1). A little more than half of patients (N = 53, 51.5%) received treatment for depression prior to the study, of whom 6% psychotherapy, 41.5% pharmacotherapy and 52.8% a combination of both. Furthermore, 53 patients (51.5%) were receiving antidepressant medication at the start of the study. The mean depression scores at baseline (mean PHQ = 16.4; mean QIDS = 16) indicated a sample with (moderately) severe depression and 77.5% had at least one comorbid diagnosis. The proportion of individuals with a (moderately) severe depression (PHQ > 14) was 63.1% (65/103), while 24.3% (25/103) had a score indicating moderate depression (PHQ between 10 and 14) and 12.6% (13/103) had a mild depression (PHQ between 5 and 9).
There were no significant differences in baseline characteristics between the groups (reflecting successful randomization), nor between those patients who did or did not provide the FU assessments. Table 2 reports observed means and standard deviations at baseline and FU assessments (T1-T3) for the PHQ-9, QIDS-16 and EQ-5D-5L measures separately, as well as within- and between-group effects based on estimated means from the linear mixed models.

Clinical Outcome Analysis
The PHQ-9 score (primary outcome) significantly declined over time in both treatment groups (overall F(3, 82) = 36.312, p < .001) during 12 months FU (all time effects p < .001). The average PHQ-9 scores for the bCBT group decreased from 16.1 to 12.6 at 3-months FU with an additional decrease of 3.1 points at 6-months (going from
The LMM showed significant improvement on the EQ-5D over time (F = 12.289, p < .001). In both bCBT and TAU, health state significantly improved from baseline to 3 months (F = 6.549, p = .012), 6 months (F = 6.23, p = .014) and 12 months FU (F = 19.957, p < .001). No significant between-group difference was found in improvement of health-related QoL over time (overall F = 0.564, p = .641), neither at T1 (F(1, 87) = 0.112, p = .74), T2 (F(1, 85) = 0.213, p = .65) nor T3 (F(1, 82) = 0.046, p = .83). Between-group effect sizes at all time points were small, with d = 0.004 at T1, d = 0.01 at T2 and d = 0.02 at T3. At 3-months FU, the within-group effect sizes of the EQ-5D were small for both bCBT (d = 0.28) and TAU (d = 0.39). At 6- and 12-months FU, however, medium within-group effects were found for both treatment groups. Table 3 presents the proportion of patients in bCBT and TAU who showed clinical response, remission, reliable improvement, reliable deterioration or no reliable change on depression symptoms separately. There were no significant differences between the two treatment groups in terms of the percentage of patients experiencing a clinical response, remission or reliable change from baseline to 3-, 6- or 12-months FU. The number of patients showing clinical response at T1 was still low for both treatment groups (26.2% for bCBT and 26.8% for TAU). At T2, this increased to 50% in the bCBT group versus 28.6% for TAU, although the difference was not significant. At T3, the percentages of patients with clinical response in bCBT and TAU were closer to each other again (53% versus 47.2%).

Clinical Response, Remission and Reliable Change
In both treatment groups, the number of patients showing reliable clinical deterioration was low. In bCBT two patients (4.8%) deteriorated at T1 and one (2.8%) at T3. For TAU these numbers were three (7.3%) at T1 and one (2.8%) at T2. There were no patients who consecutively reported deterioration at all follow-up assessments.

Negative Effects
At 6 months FU, 75 participants filled out the INEP (response rate of 73%). Of those, 46 (61.3%) reported at least one negative side effect that they attributed to the treatment; 24/41 (58.5%) in the bCBT group and 22/34 (64.7%) in the TAU group. Patients in the bCBT group reported on average 1.29 (SD = 1.33) negative effects with a mean intensity score of 2.15 (SD = 2.40), and in the TAU group on average 1.41 (SD = 1.67) negative effects were reported with a mean intensity score of 2.12 (SD = 2.51). The difference in number of reported negative effects between treatment groups was not significant (F(1, 74) = 0.118, p = .73), nor was the difference in intensity score of negative effects related to therapy (F(1, 74) = 0.003, p = .96).
When analyzing the negative effects on an item basis, one item stood out in particular, with n = 15 (36.6%) in bCBT and n = 10 (29.4%) in TAU reporting that they suffered more from events of their past since the end of treatment or compared to before the start of the treatment. At domain level, effects in the area of negative interpersonal changes/emotions (negative effects on emotional experience and social functioning) were reported most frequently (bCBT: n = 11, 26.8%; TAU: n = 8, 23.5%). Relatively more patients in the TAU group (n = 9, 26.5%) reported perceived therapeutic dependence, compared to bCBT (n = 6, 14.6%).
Table 2. Observed means and standard deviations (SD) for clinical outcome variables at baseline and follow-up assessments within bCBT and TAU, within-group effects and between-group effects based on estimated means. a Between-group comparisons were based on estimated means from the linear mixed model. b Between-group effect sizes were calculated based on estimated means from the linear mixed model. c Within-group effect sizes were calculated based on estimated means from the linear mixed model using raw differences.

Discussion
The present study investigated the effectiveness of bCBT as compared to TAU in outpatients diagnosed with MDD receiving specialized mental healthcare in the Netherlands. The main finding was that no statistically significant between-group differences emerged over time, in terms of decreased depression severity. Patients receiving bCBT and TAU showed similar rates of decline in depression severity, with medium within-group effect sizes from baseline to 3 months FU and large within-group effects at 6-and 12-months FU in both groups. Likewise, our results indicate no differences in terms of the percentage of patients experiencing a clinical response, remission, reliable change or negative effect of the treatment. Additionally, no significant between-group differences emerged over time for health-related QoL, increasing comparably in both treatment groups.
Our findings are in line with previous studies on blended treatment for depression and anxiety, suggesting comparable clinical effects between the blended and control intervention (Kooistra et al., 2019;Ly et al., 2015;Mathiasen et al., 2022;Nakao et al., 2018;Romijn et al., 2021b;Thase et al., 2018). Results from Berger et al. (2018) and Zwerenz et al. (2017) indicated moderate between-group effect sizes in favor of blended treatment, but these studies offered the online component as an adjunctive tool to usual care rather than as an integrated blended treatment protocol replacing usual psychotherapeutic care. Furthermore, trials are not fully comparable regarding the treatments received in the comparison groups, which consisted of either standard CBT (Kooistra et al., 2019;Ly et al., 2015;Mathiasen et al., 2022;Thase et al., 2018), usual treatment with a psychotherapist (Berger et al., 2018) or psychiatrists (Nakao et al., 2018), or TAU with access to an online platform (Zwerenz et al., 2017). This makes direct comparisons of between-group effect sizes difficult.
In our study, patients following bCBT showed significant improvements on depressive symptoms, going from severe to mild symptom severity, along with large within-group effects at 6-months FU (d = 1.18) that were maintained at 12-months FU (d = 1.32). These results are comparable to other blended treatments, reporting within-group effect sizes ranging from d = 1.0 to d = 1.47 (Berger et al., 2018;Ly et al., 2015;Nakao et al., 2018;Zwerenz et al., 2017). Moreover, the bCBT group also showed a high clinical response of 50% at 6-months and 58% at 12-months. These results are similar to those of a meta-analysis of 92 studies on FTF treatments for MDD, where the overall response rate (defined as 50% reduction on any depression measure) was 48%. The remission rate (30.6%) was similar to that of Kooistra et al. (31%), but slightly lower than that reported by Thase et al. (42.9%) and Ly et al. (40%). Those studies, however, concerned medication-free patients (Thase et al., 2018) and patients recruited outside clinical settings through advertisements (Ly et al., 2015). Patients within specialized MHC show higher levels of depression at the start of treatment; given parallel trajectories of decrease in depressive symptoms, this can explain the lower number of patients in remission. Adequate treatment of depression is usually associated with significant improvement in QoL. In our study, health-related QoL significantly improved over time, with an average EQ-5D utility score of 0.72 in the bCBT group and 0.69 in the TAU group at 12-months FU. This corresponds with results from an individual participant data analysis of ten RCTs evaluating depression treatment, where less severe health states were associated with higher utility scores, with an average utility score of 0.70 for individuals in remission from depression (Kolovos et al., 2017).
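The response criterion quoted above (a 50% reduction on a depression measure) can be illustrated with a short sketch. The function names and the example scores below are hypothetical and chosen only for illustration; they are not trial data, and this is not the analysis code used in the study.

```python
def is_responder(baseline: float, follow_up: float, threshold: float = 0.5) -> bool:
    """Return True if the score dropped by at least `threshold` (a fraction) from baseline."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (baseline - follow_up) / baseline >= threshold


def response_rate(pairs) -> float:
    """Share of (baseline, follow_up) score pairs that meet the response criterion."""
    flags = [is_responder(b, f) for b, f in pairs]
    return sum(flags) / len(flags)


# Made-up (baseline, follow-up) scores for four hypothetical patients.
example_scores = [(20, 9), (18, 10), (16, 8), (22, 15)]
print(response_rate(example_scores))
```

A response rate computed this way depends on the measure and on how missing follow-up scores are handled, which the sketch does not address.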
With regard to possible undesirable outcomes, our findings indicate limited deterioration rates on the PHQ-9 in both bCBT (4.8% at T1 and 2.8% at T3) and TAU (7.3% at T1 and 2.8% at T2). These numbers closely resemble the deterioration rate of 5-10% that is often referred to in psychological treatment delivered FTF (Rozental et al., 2014). On the INEP, negative effects on emotional experience and social functioning were reported most frequently. Notably, around one third of patients indicated that they suffered more from events of their past compared to before the start of the treatment. This could be the consequence of being confronted with repressed emotions and thoughts during therapy, leading to (temporary) negative effects in the area of interpersonal changes/emotions.
This study is one of the few studies that evaluated bCBT in clinical settings under real-life conditions, which could introduce challenges (Zuidgeest et al., 2017). External patient-, provider- and system-level factors may moderate the intervention's effect (Singal et al., 2014). However, participants are representative of the population in routine specialized MHC, maximizing generalizability and making results more pertinent to real-world clinical practice decisions. As a multicenter trial, this study furthermore provides a better basis for the subsequent generalization of findings, as patients were recruited from a wider population and treatment was administered in a broader range of clinical settings. Considering that almost 50% of patients in our study had already received previous treatment for depression (psychotherapy, medication or a combination of both), that more than half of patients (51.5%) used antidepressant medication, that 77.5% of patients had one or more comorbid disorders, and that initial severity levels of depression were high, our results are remarkable: despite complex clinical presentations, patients seemed to benefit well from treatment.
A strength of this study is that patients in the bCBT group did not participate in other psychological treatment for depression parallel to the intervention treatment (no add-on design). Observed clinical results can thus fully be attributed to bCBT. It could be argued that within this study we found an indication of a difference in effect at 6-months FU, when considering levels of probability of the P-value instead of cut-off levels (Pitak-Arnnop et al., 2010). The bCBT group displayed a greater decrease in depression severity (3.1 vs. 1.1 points on the PHQ-9), with a between-group effect size of d = 0.42 (p = .08) and 50% showed clinical response in bCBT vs. 28.6% in the TAU group (p = .06). However, these differences were not significant. Effects at 12-months FU diminished again, with both treatment groups showing mild symptom severity.
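The between-group effect size mentioned above is a standardized mean difference. As a minimal sketch, one common form is Cohen's d with a pooled standard deviation. The trial derived its effect sizes from estimated means of the linear mixed model, so whether this exact formula was used is an assumption; the function names and the numbers below are invented for illustration and are not trial data.

```python
import math


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(mean1: float, mean2: float, sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Standardized mean difference (group 1 minus group 2) using the pooled SD."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


# Invented summary statistics for two hypothetical groups of 50 patients each.
d = cohens_d(mean1=10.0, mean2=8.0, sd1=5.0, n1=50, sd2=5.0, n2=50)
print(round(d, 2))
```

With equal group SDs the pooled SD equals that common SD, so the result is simply the mean difference divided by it.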
In the E-COMPARED study, which was powered as a non-inferiority trial, the total numbers needed were calculated and divided over the participating countries. Consequently, 156 patients were needed for the Dutch trial. Our study, however, resulted in a smaller than planned sample size, including 103 participants rather than the intended 156. This was mainly due to lower recruitment rates than anticipated and time constraints of conducting the trial. Although sites were supported as much as possible in all trial-related activities, the enrollment of patients proved challenging. Patients were recruited at non-academic centers, and as the study was a pragmatic trial, the research team depended on clinicians to invite patients to participate in our study.
Another limitation of our study is the lack of specific data on other treatments patients may have received during the follow-up period, which might distort estimates of the intervention effect. Furthermore, therapists could not be blinded to treatment allocation and delivered both the intervention and the control treatment. This type of design is associated with potential drawbacks (Falkenström et al., 2013), but was necessary in order to be able to run the study in routine care. Future research should focus on therapist allegiance as a moderating factor, to evaluate this possible bias.
In conclusion, our main findings indicate that bCBT seems to be as effective in significantly reducing depressive symptoms and improving QoL as TAU. Our study was conducted in routine MHC, increasing the relevance of research findings to clinical practice. Our results indicate that applying blended treatment in routine specialized mental health care seems promising. Blended treatment in these settings is however a relatively new form of treatment that is still under development. New research should focus on larger samples, like the overall E-COMPARED study (publication in progress), to replicate results and enable assessment of possible moderators and mediators.