Introduction

Chronic fatigue syndrome (CFS) is a debilitating disorder characterized by medically evaluated, unexplained, persistent or recurrent fatigue that is not the result of current stress, is not relieved by rest, results in significant activity limitations, and has no clear organic explanation [1]. However, a broad array of diagnostic criteria can be used for CFS. Hence, CFS according to the presented definition needs to be differentiated from newer classification approaches for myalgic encephalomyelitis/CFS [2]. While the etiology of CFS remains unclear, evidence suggests that not only biological but also psychosocial factors play an important role in the development and maintenance of the condition [3]. Cognitive behavioral therapy (CBT) derives from corresponding disorder models that assume interactions among biological/physical, psychological, and social factors [4]. CBT has been shown to be one of the most effective psychological treatments for CFS. In CFS, CBT is based on assumptions about the interaction of cognitive processes and behaviors that contribute to the perpetuation of the symptoms. It usually involves identifying the patient’s negative thoughts, beliefs [5], and behaviors believed to contribute to the physical symptoms [6]; most importantly, it aims to decrease patients’ focus on perceived symptoms of fatigue [7]. The therapist helps the patient develop altered and more realistic views of their illness, as well as coping skills to manage their symptoms [8]. Thus, patients experience reversibility of symptoms, which results in enhanced self-efficacy [9]. The goal is to help patients gradually increase their activity levels and thereby decrease impairments (e.g., [8]). Previous studies indicate short-term efficacy of CBT in reducing symptoms, enhancing quality of life, and improving physical functioning in patients with CFS. However, the evidence base for long-term efficacy remains unclear [10].

Several previous meta-analyses have examined the efficacy of CBT in the treatment of CFS. Malouff et al.’s [11] meta-analysis (k = 13) found significant reductions in fatigue for CBT compared with inactive and non-specific control conditions (d = 0.54, 95%CI 0.26 to 0.81). Functional impairment was significantly reduced on subjective (d = 0.45, 95%CI 0.12 to 0.78) as well as objective measures (d = 0.52, 95%CI 0.28 to 0.76). However, not all evaluated interventions entailed cognitive components. The Cochrane review (k = 15) by Price et al. [8] found small to moderate effects, e.g., for fatigue severity (SMD = -0.39, 95%CI -0.60 to -0.19), physical functioning (SMD = 0.11, 95%CI -0.32 to 0.54), depression (SMD = -0.24, 95%CI -0.53 to 0.05), and anxiety symptoms (SMD = -0.30, 95%CI -0.59 to -0.01) compared with inactive control conditions. Although meta-analytic results indicate significant efficacy for fatigue, functional impairment/quality of life, depression, and anxiety at short-term follow-up, long-term efficacy remains largely unclear [8, 12].

Despite these positive findings, the use of CBT as a treatment for CFS is controversial. The preceding version of the National Institute for Health and Care Excellence (NICE) guideline [13] recommended CBT for myalgic encephalomyelitis/CFS. However, the updated guideline published in 2021 [14] no longer recommends CBT as a curative treatment, but only as an adjunctive one. This change has generated debate and criticism, in part due to the methods NICE chose to reach its decision [15]. The debate highlights the need for up-to-date research on the efficacy of CBT in the treatment of CFS. Previous meta-analyses point to the short-term efficacy of CBT for CFS [8, 11, 12]. However, they were published 10 to 15 years ago, so their results may now be outdated and not fully representative of the current state of research. A recent systematic review showed that several studies have been published since then that could provide valuable information [16].

Furthermore, patient acceptance is an important factor in the evaluation of CFS treatments, and the acceptance of psychological treatments is of particular interest here. If a treatment is not acceptable to patients despite demonstrated positive outcomes, it may not be effective in practice [17]. In research, drop-out might decrease the validity of the conclusions drawn from clinical trials. It can introduce a form of selection bias within randomized controlled trials [18] when people refuse to start or finish a therapy. Furthermore, psychotherapies might not reach their full effect when not adequately completed. Therefore, non-adherence can have negative consequences for patients and can increase health care costs [17]. For CFS, the review by Price et al. [8] reported an odds ratio of 1.77 (95%CI 1.13 to 2.75) for drop-out from the intervention arms compared to usual care. Malouff et al. [11] reported a mean CBT drop-out rate of 12% with a range of 0–42%. Castell et al. [12] found a median drop-out rate of 17% with a range of 0–46%. However, none of these meta-analyses differentiated between different stages of drop-out; they relied on the primary authors’ definitions of drop-out, which are not necessarily pre-defined or uniform. Hence, the implications that can be drawn from these results are limited.

One reason for a lack of willingness to engage in psychological therapy could be the discrepancy between the subjective explanatory concepts of some people with CFS and CBT models. That is, patients who suffer from CFS often have their own explanations of the causes and nature of their illness. When these subjective explanatory models are primarily physical and not related to psychological processes (such as appraisals and dysfunctional coping behavior), patients may be hesitant to engage in a therapy that emphasizes these aspects [19]. Furthermore, patient characteristics like fatigue severity or comorbid psychopathology could determine whether a patient is able and willing to participate in CBT (e.g., [20]). Other aspects of the treatment itself, such as an individual or group format, might influence the acceptability of treatments for patients as well [8].

Therefore, an updated meta-analysis should examine not only the efficacy of CBT, but also its acceptability to patients. To address these questions, we first conducted a meta-analysis of randomized controlled trials (RCTs) to evaluate the efficacy of CBT in adults with CFS. Our objective was to evaluate the outcomes fatigue, depression, anxiety, and functional impairment at post-treatment and to determine whether these effects persist in the long term. Second, we analyzed acceptance of CBT in a differentiated way, i.e., by examining drop-out rates at different stages of the trials. Here, the primary outcomes were non-completion of all mandatory sessions and drop-out according to the primary study definition. Additionally, treatment refusal (non-starters) and the average number of sessions completed were included. In addition, for both parts of the project, moderator analyses served to identify the impact of study design, treatment format, and participant characteristics on treatment efficacy and acceptance, e.g., regarding the different diagnostic criteria and control groups used in the studies, intervention-related variables (treatment intensity and therapy setting), and, for the acceptance analyses, clinical variables (fatigue severity and duration of fatigue symptoms).

Methods

The reporting of this meta-analysis followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement [21]. A protocol for this project is available at https://osf.io/2je7u. Furthermore, the data and R code are available at https://osf.io/wq4gj/.

Eligibility Criteria

Studies were considered eligible for inclusion if they met the following criteria: (a) participants were adults (≥ 18 years old) diagnosed with CFS according to any of the recognized diagnostic criteria (e.g., Oxford criteria [22] or Fukuda definition [1]; see Supplement for a complete list of eligible diagnostic criteria) or had a score above a certain threshold on a validated symptom scale; (b) the study was a randomized controlled trial that compared CBT to inactive or non-specific control groups; (c) each arm relevant to this project comprised n ≥ 10 participants [23]; (d) the study was published in English or German.

Literature Search

A systematic literature search was conducted in six databases: PubMed, PsycINFO, PSYNDEX, CENTRAL, CDSR, and CCA; the final search was run on 1 June 2022. Results were filtered for randomized controlled trial, meta-analysis, and systematic review in PubMed, and for clinical trial, meta-analysis, and systematic review in PsycINFO. The search in PSYNDEX was limited to articles published in academic journals; consequently, grey literature, i.e., unpublished reports, was not included. The search algorithm for PubMed in the Supplement provides an example of the search strategies used.

Study Selection

For study selection, two independent researchers screened the abstracts for eligibility after the removal of duplicates. Afterwards, the full texts of the reports that had not been excluded during screening were assessed for eligibility. During an additional backwards search, the reference lists of meta-analyses and systematic reviews found during the first phase were searched for further eligible reports. The final selection was discussed among all authors. None of the researchers was blinded to any aspect of the studies at any time during the process.

Data Extraction

Data were extracted from eligible studies by two independent reviewers using a standardized sheet. The extracted data were then checked again by ACKB. A complete list of extracted variables is provided in the Supplement. When reported, results of intent-to-treat analyses were preferred, as were validated measures if multiple measures were used for the assessment of the same outcome. Furthermore, two independent researchers assessed the risk of bias (RoB) for the efficacy outcomes using the revised Cochrane risk of bias tool for randomized trials (RoB 2.0, [24]). As the tool was not developed for acceptance outcomes, we did not use it for these.

Outcome Variables and Data Analysis

We used a random effects model to estimate the effect sizes, and heterogeneity was assessed using the I² statistic. The primary analysis was a meta-analysis of the efficacy of CBT in the treatment of CFS. The primary outcome was fatigue (physical and mental). Secondary outcomes were depression, anxiety, and perceived health status (including functional disability and quality of life). Outcomes were examined at post-treatment, at short-term (up to 3 months) and long-term follow-up (3–12 months) [25], and at any follow-up interval longer than a year. The standardized mean difference was calculated as Hedges’ g. For fatigue, depression, and anxiety, negative effects indicate a larger symptom reduction in the treatment groups than in the control groups; for perceived health status, positive effects indicate a larger improvement in the treatment groups.
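The effect-size and pooling steps described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' R code: the function names are hypothetical, the DerSimonian-Laird estimator for the between-study variance is an assumption (the text only specifies a random effects model), and any numbers passed in are invented.

```python
import math


def hedges_g(m_t, sd_t, n_t, m_c, sd_c, n_c):
    """Return (g, variance of g) for treatment vs. control.

    A negative g means a lower mean score (e.g., less fatigue) under treatment.
    """
    df = n_t + n_c - 2
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (m_t - m_c) / sd_pooled
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return j * d, j ** 2 * var_d


def random_effects(effects, variances, z=1.96):
    """DerSimonian-Laird pooling (an assumed estimator).

    Returns (pooled, ci_low, ci_high, tau2, i2_percent).
    """
    w = [1 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I2: share of total variability attributed to between-study heterogeneity
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, pooled - z * se, pooled + z * se, tau2, i2
```

In use, each study would contribute one `(g, variance)` pair from `hedges_g`, and the lists of effects and variances would then be passed to `random_effects`.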

In the meta-analyses of the acceptance of CBT in CFS, we distinguished different forms of drop-out. The outcomes of interest were non-completion of all mandatory sessions, drop-out according to the primary study definition, treatment refusal (non-starters), and average number of sessions completed. On the one hand, non-completion was defined as all-cause discontinuation [26] after the treatment had been started, i.e., a unilateral termination of treatment despite therapeutic need [27] and against therapeutic advice. Thus, non-completion can be considered as the proportion of participants who started the treatment and completed at least one treatment module, but did not complete all sessions: (n participants not completing all sessions) / (n participants starting intervention). On the other hand, treatment refusal was defined as the proportion of participants in each group who did not start the intervention after being allocated and thus did not complete any module: (n participants not starting treatment) / (n participants allocated to intervention group) [28].

In accordance with former meta-analyses examining acceptance, we also included the primary authors’ definition of drop-out: (n participants dropping out according to authors’ definition) / (n participants starting intervention). Lastly, the average proportion of sessions completed was examined to estimate the amount of therapy received by participants who started therapy: (average number of sessions completed) / (total number of sessions).
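The four acceptance proportions defined above can be written out for a single study arm as follows. This is a small sketch: the `ArmCounts` container, its field names, and any counts passed in are hypothetical and serve only to make the numerators and denominators explicit.

```python
from dataclasses import dataclass


@dataclass
class ArmCounts:
    """Hypothetical per-arm counts extracted from one primary study."""
    allocated: int             # randomized to the intervention group
    started: int               # completed at least one treatment module
    completed_all: int         # completed all mandatory sessions
    dropouts_by_authors: int   # drop-outs according to the primary authors
    mean_sessions: float       # average number of sessions completed
    total_sessions: int        # number of sessions offered


def acceptance_rates(arm):
    """Return the four acceptance proportions for one arm."""
    return {
        # non-starters relative to everyone allocated
        "treatment_refusal": (arm.allocated - arm.started) / arm.allocated,
        # starters who did not complete all sessions, relative to starters
        "non_completion": (arm.started - arm.completed_all) / arm.started,
        # drop-out by the authors' own definition, relative to starters
        "drop_out": arm.dropouts_by_authors / arm.started,
        # share of the offered sessions that starters completed on average
        "sessions_completed": arm.mean_sessions / arm.total_sessions,
    }
```

Note that treatment refusal uses the number allocated as its denominator, whereas non-completion and drop-out use the number of starters.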

The meta-analyses for non-completion, treatment refusal, and the average proportion of sessions completed in the intervention groups were calculated using weighted rates. For studies that reported a rate of non-completers or drop-outs but not the number of participants who started treatment, we “imputed” the number of starters, i.e., we used the number of participants allocated as an estimate of the number of starters. Non-completion and drop-out were also assessed for the control groups and compared to the intervention groups using the relative risk (RR). However, distinguishing these forms of drop-out was not possible in most inactive control groups. Lastly, the reasons for discontinuation were extracted for all acceptance outcomes, if available.
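One common way to pool rates across studies is to combine logit-transformed proportions in a random effects model and back-transform the result. The sketch below assumes that approach; the text only states that weighted rates were used, so the logit transformation, the DerSimonian-Laird estimator, the 0.5 continuity correction, and all function names are assumptions for illustration.

```python
import math


def impute_starters(started, allocated):
    """Use the number allocated wherever the number of starters is missing (None)."""
    return [a if s is None else s for s, a in zip(started, allocated)]


def pooled_rate(events, totals, z=1.96):
    """Random-effects pooled proportion on the logit scale.

    Returns (rate, ci_low, ci_high) after back-transformation.
    """
    ys, vs = [], []
    for e, n in zip(events, totals):
        if e == 0 or e == n:
            # continuity correction so the logit is defined (assumption)
            e, n = e + 0.5, n + 1
        ys.append(math.log(e / (n - e)))
        vs.append(1 / e + 1 / (n - e))
    w = [1 / v for v in vs]
    fixed = sum(wi * y for wi, y in zip(w, ys)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, ys))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(ys) - 1)) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in vs]
    est = sum(wi * y for wi, y in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(est), inv_logit(est - z * se), inv_logit(est + z * se)
```

Because the interval is computed on the logit scale, the back-transformed confidence limits are asymmetric around the pooled rate and always stay between 0 and 1.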

Moderator Analyses

We conducted moderator analyses to explore the influence of study-related, intervention-related, and clinical variables on the efficacy and acceptance of CBT in CFS. The choice of the control group was examined by comparing the effects of non-specific (treatment as usual, TAU; psychological and attention placebo groups) and inactive control groups (wait-list, WL). Furthermore, the therapy setting, therapy dosage (i.e., total therapy time in minutes), number of sessions, and duration of treatment (in weeks) were examined as potential moderators to judge the effect of intervention characteristics. As in former meta-analyses, we planned to examine differences between study outcomes considering the diagnostic criteria used for inclusion. We also examined the influence of fatigue severity and the duration of fatigue symptoms at baseline on non-completion, drop-out, and average proportion of sessions completed. Regarding the other outcomes, this was not feasible. For both meta-regressions and subgroup analyses, we used a mixed-effects model with a true overall effect for each subgroup and random effects within subgroups. Subgroups had to comprise at least k = 3 studies to be included in the analyses. Variables were dummy coded to dichotomize categories if necessary.
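A subgroup comparison of this kind can be sketched as follows: each subgroup is pooled with its own random effects model, and the two pooled estimates are then compared with a between-subgroup Q test. The sketch is restricted to two subgroups (so the test has one degree of freedom and the p-value can be computed with the standard library), estimates the between-study variance separately within each subgroup using DerSimonian-Laird, and uses hypothetical function names; all of these are assumptions rather than details taken from the text.

```python
import math


def pool_dl(effects, variances):
    """DerSimonian-Laird random-effects estimate and its variance."""
    w = [1 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    est = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    return est, 1 / sum(w_re)


def compare_two_subgroups(effects_a, variances_a, effects_b, variances_b, min_k=3):
    """Return (estimate_a, estimate_b, q_between, p_value) for two subgroups."""
    if len(effects_a) < min_k or len(effects_b) < min_k:
        raise ValueError("each subgroup needs at least min_k studies")
    est_a, var_a = pool_dl(effects_a, variances_a)
    est_b, var_b = pool_dl(effects_b, variances_b)
    # With two subgroups, Q_between reduces to a squared z statistic
    q_between = (est_a - est_b) ** 2 / (var_a + var_b)
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(q_between / 2))
    return est_a, est_b, q_between, p_value
```

The `min_k` check mirrors the requirement that a subgroup must contain at least three studies before it enters the comparison.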

Sensitivity Analyses

First, outlier studies were identified and every study’s individual influence on effect sizes and heterogeneity was analyzed. Further, to evaluate the use of the random effects model, results in the meta-analyses using fixed effect and random effects models were compared. Lastly, we recalculated the analyses excluding those studies for which we “imputed” the number of starters in the calculation of non-completion and drop-out rates.
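The influence analysis and the fixed versus random effects comparison can be illustrated with a leave-one-out loop. This is a sketch under assumptions: it uses DerSimonian-Laird pooling, the function names are hypothetical, and it does not reproduce whichever outlier criterion the authors applied.

```python
def pool(effects, variances):
    """Return the fixed-effect estimate, the DL random-effects estimate, and I2 (%)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    random_est = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"fixed": fixed, "random": random_est, "i2": i2}


def leave_one_out(effects, variances):
    """Re-run the pooling once per study, omitting that study each time."""
    effects, variances = list(effects), list(variances)
    results = []
    for i in range(len(effects)):
        rest_e = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        results.append({"omitted": i, **pool(rest_e, rest_v)})
    return results
```

A study whose omission markedly lowers I² or shifts the pooled estimate would be a candidate for closer inspection.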

Publication Bias

Publication bias was assessed for every efficacy outcome using contour-enhanced funnel plots and Egger’s regression test. Furthermore, the distribution of statistically significant p-values was examined for anomalies using the p-curve method.
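Egger's regression test regresses each study's standardized effect (effect divided by its standard error) on its precision (the inverse of the standard error); an intercept that deviates from zero suggests funnel plot asymmetry. The sketch below uses ordinary least squares from the standard library only. The function name is hypothetical, and the p-value is omitted because it requires a t distribution that the standard library does not provide.

```python
import math


def egger_test(effects, standard_errors):
    """Return intercept, slope, t statistic of the intercept, and degrees of freedom."""
    n = len(effects)
    if n < 3:
        raise ValueError("need at least three studies")
    x = [1 / se for se in standard_errors]                  # precision
    y = [e / se for e, se in zip(effects, standard_errors)]  # standardized effect
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(sse / (n - 2) * (1 / n + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "slope": slope, "t": t_stat, "df": n - 2}
```

The t statistic would be compared against a t distribution with `df` degrees of freedom; with few studies, as noted for some outcomes in this project, the test has little power.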

Results

Study Selection and Characteristics

The initial search yielded a total of 415 articles, of which 91 were duplicates. The remaining 324 articles’ titles and abstracts were screened for eligibility, and 253 articles were excluded as they did not meet inclusion criteria (Fig. 1). The full texts of the remaining 57 articles were assessed for eligibility, and 41 studies were excluded. Through other methods, like backwards search, two relevant reports were identified. Finally, k = 15 studies with n = 2015 participants were included in the meta-analysis (with data reported in 18 manuscripts). There was no indication that eligible RCTs published in other languages were excluded during the course of the systematic literature search.

Fig. 1 PRISMA 2020 flow diagram (adapted from Page et al. [21])

The included studies were conducted in Europe (k = 6 in Great Britain, k = 5 in the Netherlands, k = 1 in Norway) and the US (k = 3); they were published between 1996 and 2021. Most studies (k = 10) used the international definition criteria (Centers for Disease Control and Prevention, [1]) for inclusion of CFS patients, while the other studies used either the Oxford criteria ([22], k = 3) or cut-off values (k = 2). The sample sizes of the studies ranged from 37 to 321 participants allocated to treatment arms. The duration of treatments varied from 8 to 36 weeks (M = 18.85, SD = 8.52), with a range of two to 17 sessions (M = 10.71, SD = 4.83), and a total time of direct contact from 2 to 28 h (M = 12.75, SD = 7.82). One study used an internet-based self-help form of CBT [29], another used a video-telephone-based program [30], and three studies implemented CBT in a group setting [31,32,33]. All studies compared CBT to inactive or non-specific control groups, namely k = 5 WL, k = 6 TAU, k = 3 psychological placebo (in the form of relaxation [34, 35] or a health promotion group [30]), and one study with an attention placebo control group (symptom monitoring [36]). That is, k = 5 studies used an inactive control group (WL), while the other k = 10 studies used non-specific controls (see Tables 1 and 2 for further study characteristics).

Table 1 Study Characteristics
Table 2 Additional information on acceptance outcomes

There were two stepped-care studies that built on two included studies: [20] used the same sample as [37], while [38] recruited additional participants who were added to the sample in [29]. To avoid the problem of dependent samples, we did not include the second parts of these studies. Furthermore, [39] incorporated CBT in a multidisciplinary intervention in which graded exercise predominated. Therefore, we did not include this study in the analyses.

For studies with multiple arms, we combined similar arms where possible. In [29], there were two similar internet-based CBT arms: in one, therapist feedback was protocol-driven; in the other, therapist feedback was provided on demand. In [33], the CBT conditions differed only in group size (either four participants and one therapist or eight participants and two therapists per group). For other studies, some additional arms could not be included in the analyses: the shorter CBT arm from [40] as well as the cognitive therapy arm from [35] relied on a different rationale than the elaborated CBT arms in both studies and were therefore excluded. Since the control groups in [36], i.e., symptom monitoring support and TAU, were not comparable, TAU was excluded. Additionally, ineligible control groups were excluded: anaerobic activity therapy [35], education and support [31], (active) guided support groups [41], graded exercise, and adaptive pacing [42].

The primary outcome measure for the efficacy of CBT was fatigue, which was assessed in 14 of the trials. Most used either the Checklist Individual Strength – Fatigue subscale (k = 5), or the Chalder Fatigue Questionnaire (k = 5, one using an adapted version of the scale; see Supplement for more information on outcome measures). For [35] and [43] no means were reported at post-treatment, and for [44] no effect could be calculated as SDs were not reported. Depression was mostly assessed using the Hospital Anxiety and Depression Scale – depression subscale (k = 6), or the Beck Depression Inventory (k = 2). Anxiety was rated on the Hospital Anxiety and Depression Scale – anxiety subscale (k = 6) and the Beck Anxiety Inventory (k = 2). Perceived health status was mainly assessed using the Short Form Health Survey – physical functioning subscale (k = 9).

Efficacy of CBT

The meta-analytic results (k = 11) showed that CBT was significantly more effective in reducing fatigue than the control conditions at post-treatment, g = -0.52 (95%CI -0.69 to -0.35; Fig. 2A). As only one study’s follow-up met our criterion for short-term follow-up [32], no short-term results could be aggregated for any outcome. At long-term follow-up (k = 7), with a mean follow-up duration of 31.14 weeks (SD = 12.75), the effect on fatigue was also significant: g = -0.41 (95%CI -0.65 to -0.18). Heterogeneity across studies was moderate to high at post-treatment (I² = 64.4%, 95%CI 32.1 to 81.3), whereas at long-term follow-up the confidence interval ranged from no to high heterogeneity (I² = 57.9%, 95%CI 2.5 to 81.8). For follow-ups longer than 12 months post-treatment, two studies provided information on fatigue severity: [45] reported a ~ 3.5-year follow-up to [46] (g = 0.12, 95%CI -0.23 to 0.47), and [47] reported a two-year follow-up to [42] (g = -0.21, 95%CI -0.47 to 0.05).

Fig. 2 Forest plots for the primary outcomes fatigue, non-completion, and drop-out. A forest plot for fatigue (post-treatment); B forest plot for non-completion in CBT groups; C forest plot for drop-out in CBT groups

For the secondary outcomes, CBT was significantly more effective in reducing depression and anxiety than the control conditions at post-treatment, with small to moderate effects (Table 3). Furthermore, perceived health status was significantly improved in the CBT groups, with a small effect. At long-term follow-up, neither the effect on perceived health status nor the effect on depression was significant (see Table 3; see Supplement for the forest plots). However, anxiety showed a significant small effect. Note that the 95%CIs of the individual effects included 0, whereas that of the aggregated effect did not, which is why the aggregated effect was significant. Of the secondary outcomes, only perceived health status was reported at follow-ups longer than a year post-treatment: g = -0.35 (95%CI -0.7 to 0.0) for [46], and g = 0.17 (95%CI -0.08 to 0.43) for [42].

Table 3 Additional information on treatment effects on outcomes at post-treatment, at follow-up, and for indicators of acceptance

Acceptance of CBT

The meta-analyses on acceptance showed that the non-completion rate in the CBT groups was 22% (0.22, 95%CI 0.03 to 0.71; Fig. 2B). Using each study’s drop-out definition (k = 10), the overall weighted rate was 15% (0.15, 95%CI 0.09 to 0.25; Fig. 2C). Four studies reported the average number of sessions completed; for those, the average proportion of sessions completed was 84% (0.84, 95%CI 0.56 to 0.96). Ten studies reported the number of participants who were allocated but did not start CBT. The weighted rate of treatment refusal was 7% (0.07, 95%CI 0.03 to 0.15). Reasons for drop-out were rarely reported (see Supplement for a list of all reasons).

Relative Risks

For the acceptance outcomes, the RR for non-completion (k = 5) was 3.87 (95%CI 0.30 to 49.63), with a heterogeneity of I² = 89.2% (95%CI 77.5 to 94.8). For drop-out (k = 9), the aggregated RR was 2.26 (95%CI 1.05 to 4.86; I² = 67.6%, 95%CI 34.7 to 83.9). As one study [44] reported neither drop-out nor non-completion in any group, it could not be incorporated in the analysis.
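The reason a study without any events cannot enter an RR analysis becomes clear when the computation is written out for a single study. The following is a sketch with a hypothetical function name; the 0.5 continuity correction for studies with zero events in only one arm is an assumption and not necessarily what the authors' R code does.

```python
import math


def relative_risk(events_t, n_t, events_c, n_c, z=1.96):
    """Return (rr, ci_low, ci_high), or None when both arms have zero events."""
    if events_t == 0 and events_c == 0:
        # 0 / 0: the RR is undefined, so the study carries no information here
        return None
    if events_t == 0 or events_c == 0:
        # continuity correction for a single empty cell (assumption)
        events_t, events_c = events_t + 0.5, events_c + 0.5
        n_t, n_c = n_t + 1, n_c + 1
    rr = (events_t / n_t) / (events_c / n_c)
    # standard error of log(RR)
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, low, high
```

Per-study log RRs and their squared standard errors would then be pooled in a random effects model in the same way as the other effect sizes.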

Moderator Analyses

Moderator analyses were conducted to examine the effects of different variables on the efficacy and acceptance of CBT. The moderator analyses for efficacy outcomes were limited to post-treatment due to the small number of studies in the follow-up intervals. First, the subgroup analysis on control groups showed that the effects on fatigue between CBT and inactive control groups (g = -0.76, 95%CI -0.96; -0.56) were larger than the effects between CBT and non-specific control groups (g = -0.34, 95%CI -0.47; -0.21) at post-treatment (Table 4). There was no significant difference between subgroups for perceived health status or drop-out. Subgroups were too small for depression, anxiety, non-completion, and average proportion of sessions completed. The differences in therapy setting, i.e., individual and group setting, were not significant for perceived health status. The subgroups were too small to be analyzed for all other outcomes.

Table 4 Moderator analyses – subgroup analyses

Therapy dosage was a significant moderator for fatigue (R² = 91.17%, p = 0.0003), i.e., a higher dosage was associated with a greater reduction in fatigue. However, the meta-regressions on therapy dosage for perceived health status, depression, anxiety, non-completion, drop-out, and average proportion of sessions completed were not significant. The meta-regression on number of sessions was significant for perceived health status (R² = 49.21%, p = 0.03), indicating that more sessions were associated with a greater improvement in perceived health status, but it was not significant for any other outcome. Duration of therapy in weeks was not a significant moderator for any outcome either. Subgroup analyses for diagnostic criteria were not possible, as there were not at least two groups with k ≥ 3.
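In a mixed-effects meta-regression, R² is commonly read as the share of between-study variance that the moderator accounts for, i.e., one minus the ratio of the residual between-study variance to the between-study variance without the moderator. The sketch below illustrates this for a single moderator using method-of-moments estimators; the estimator choice and the function names are assumptions, and the authors' R code may rely on a different estimator, so values will not match the reported ones.

```python
def tau2_null(effects, variances):
    """DerSimonian-Laird between-study variance without a moderator."""
    w = [1 / v for v in variances]
    mean = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - mean) ** 2 for wi, y in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (q - (len(effects) - 1)) / c)


def wls(effects, weights, x):
    """Weighted least squares line; returns (intercept, slope, residual Q)."""
    sw = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / sw
    y_bar = sum(w * y for w, y in zip(weights, effects)) / sw
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (y - y_bar) for w, xi, y in zip(weights, x, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    q_res = sum(w * (y - intercept - slope * xi) ** 2
                for w, xi, y in zip(weights, x, effects))
    return intercept, slope, q_res


def meta_regression_r2(effects, variances, x):
    """Return (slope, tau2 without moderator, residual tau2, R2 as a proportion)."""
    k = len(effects)
    w = [1 / v for v in variances]
    _, _, q_res = wls(effects, w, x)
    # trace term of the moment estimator for one moderator plus intercept
    a11, a12 = sum(w), sum(wi * xi for wi, xi in zip(w, x))
    a22 = sum(wi * xi ** 2 for wi, xi in zip(w, x))
    b11, b12 = sum(wi ** 2 for wi in w), sum(wi ** 2 * xi for wi, xi in zip(w, x))
    b22 = sum(wi ** 2 * xi ** 2 for wi, xi in zip(w, x))
    trace = (a22 * b11 - 2 * a12 * b12 + a11 * b22) / (a11 * a22 - a12 ** 2)
    tau2_res = max(0.0, (q_res - (k - 2)) / (sum(w) - trace))
    t0 = tau2_null(effects, variances)
    r2 = max(0.0, 1 - tau2_res / t0) if t0 > 0 else 0.0
    # final slope uses random-effects weights that include the residual tau2
    _, slope, _ = wls(effects, [1 / (v + tau2_res) for v in variances], x)
    return slope, t0, tau2_res, r2
```

With effects coded so that negative values mean less fatigue, a negative slope for dosage would correspond to the pattern described above: more therapy time, larger fatigue reduction.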

Fatigue severity was a significant moderator for non-completion and drop-out. That is, higher fatigue prior to treatment was associated with greater non-completion and higher drop-out. However, it was not significant for the average proportion of sessions completed. Duration of fatigue symptoms was neither a significant moderator for non-completion nor for drop-out. As only two studies reported information on the average proportion of sessions completed and the duration of symptoms, no analysis was conducted for this outcome. The detailed results of the meta-regression analyses are presented in the Supplement.

Sensitivity Analyses

We did not detect any outliers for depression, anxiety, drop-out, or the average proportion of sessions completed. One outlier each was detected for fatigue [33], perceived health status [46], non-completion [29], and treatment refusal [43]. While the mean estimated effects did not change drastically after exclusion, heterogeneity decreased for all four outcomes.

When comparing the results of the random effects and fixed effect models, the only notable difference concerned the estimate of the non-completion rate (random effects: 0.22, 95%CI 0.03 to 0.71; fixed effect: 0.49, 95%CI 0.44 to 0.54). Other than that, the confidence intervals tended to be wider in the analyses using a random effects model. Excluding the studies for which the number of participants allocated to the intervention group was used because the number of treatment starters was not reported did not result in meaningful differences for non-completion or drop-out (see Supplement for detailed sensitivity analyses).

Risk of Bias

Overall RoB was rated as either “high” or “some concerns” in all studies (Fig. 3). However, the RoB 2.0 tool might lead to higher risk ratings for conventional psychotherapeutic trial designs (cf. [48]).

Fig. 3 Risk of bias ratings for fatigue, perceived health status, depression, and anxiety at post-treatment. A summary plot for fatigue; B summary plot for perceived health status; C summary plot for depression; D summary plot for anxiety. Additional information is provided in the Supplement

Publication Bias

Egger’s regression test did not indicate publication bias in any of the efficacy outcomes at post-treatment. While there was a significant result for anxiety at long-term follow-up, the test was based on only five studies. The p-curve analyses indicated that the evidence on fatigue and perceived health status at post-treatment and on fatigue at long-term follow-up had evidential value. The funnel plots and p-curves are provided in the Supplement.

Deviations from the Protocol

There were some deviations from the protocol. The outcomes mental fatigue and physical fatigue could not be aggregated separately because the studies did not report fatigue divided into these domains. Additionally, the outcomes participation refusal and total drop-out rate were excluded from the analysis. Participation refusal was not reported consistently across studies, which could be attributed to the different recruitment methods used, which led to differently defined participant pools between studies. Thus, an aggregated proportion of participation refusal would have been difficult to interpret. Furthermore, treatment refusal (non-starters) and drop-out were not always clearly distinguishable in the study reports, possibly leading to overlaps and an overestimation of the total drop-out rate. In contrast to the protocol, interrater reliability was not calculated, as the chosen procedure in this project relied on consensus within the author group and some decisions were therefore revised after collective discussion. No sensitivity analysis for risk of bias ratings was carried out either, as most ratings were high and such an analysis was therefore not feasible. Lastly, moderator analyses were streamlined to enhance clarity in reporting, i.e., we unified the moderator analyses for efficacy and acceptance outcomes.

Discussion

The results of this meta-analysis provide important insights into the efficacy and acceptance of CBT in the treatment of CFS in adults. Our findings confirm that CBT is an effective intervention for CFS. We found significant post-treatment effects of CBT on fatigue, depression, anxiety, and perceived health status. The effect on the primary efficacy outcome, fatigue, was moderate with a confidence interval ranging from small to moderate effects. The mean effects on the other efficacy outcomes were smaller, yet had wider confidence intervals. That is, there were small effects on depression, anxiety, and perceived health status in CBT compared to inactive and non-specific control groups. The follow-up effects suggest a partial maintenance, while the data base is slightly smaller than for post-treatment. The effect on fatigue was small with the upper end of the 95%CI ranging into moderate effects. For the secondary outcomes, neither the effect on depression nor on perceived health status was significant at long-term follow-up. In contrast, there was a small significant overall effect on anxiety. All in all, these results are in line with previous research [8, 11, 12, 16, 49], indicating that CBT can alleviate CFS symptoms and improve patients’ overall well-being. In fact, this analysis is based on a broader data base compared to previous meta-analyses. Yet, for some outcomes sample sizes are still relatively small. This is especially the case for depression and anxiety. Besides, this is the first project that provides meaningful aggregated long-term effects; in previous works, there were only short-term follow-up effects [8] or aggregation was not possible [12]. While there is a recently published individual patient data meta-analysis in CBT for CFS [49], the data base for that project was limited to a specific treatment-protocol (e.g., [50]) and therefore less generalizable.

Against the background of the latest NICE guideline [14], this meta-analysis sheds light on the acceptance of CBT in the treatment of CFS, which is an important aspect of treatment evaluation. The results showed that non-adherence rates in the CBT groups were generally low, indicating that CBT was acceptable to patients. Across all studies, only 7% of participants refused participation after randomization. Of those who started the interventions, a mean of 15% dropped out of the treatment groups, and 22% did not complete all mandatory modules. That is, most participants completed all mandatory modules, and even more received an adequate amount of treatment according to the individual study authors’ definitions. In line with this, the aggregated rates from the studies that reported the average proportion of sessions completed indicate that patients on average complete most sessions. Although some clinicians argue that some patients discontinue treatment due to major improvements already made during the course of treatment, this is rather unlikely [51]. While cases of clinically relevant worsening of symptoms due to CBT are rather rare in CFS [16], the affected patients might still have an increased risk of dropping out of treatment. In general, about one fifth of participants drop out of psychotherapy trials [28, 52]. That is, the findings on drop-out in this project are comparable to general reviews on drop-out in psychotherapy studies. Furthermore, the results on treatment refusal are comparable to treatment refusal rates in individual therapies for various psychological disorders [28].

For non-completion, the RR did not differ significantly between intervention and control groups, but the confidence interval was wide, and heterogeneity was considerable. In contrast, the RR for drop-out was significantly larger in intervention groups, i.e., participants were more than twice as likely to drop out of the intervention groups compared to the controls. Heterogeneity was moderate to considerable. However, in the non-specific, and especially in the inactive control groups, dropping out of the control group and non-completion in the control group were not always clearly defined. Moreover, in most control groups, participation was low-threshold due to the inactive nature. Hence, the RRs cannot be adequately interpreted as they are most likely biased by this methodological artifact.
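
To make the interpretation of a risk ratio (RR) concrete, the following minimal Python sketch computes an RR with a log-scale Wald 95% confidence interval from the event counts of a single two-arm trial. It is an illustration only and not the pooling procedure used in this project; the function name `risk_ratio` and all counts are hypothetical and are not taken from any included study.

```python
import math


def risk_ratio(events_tx, n_tx, events_ctrl, n_ctrl, z=1.96):
    """Return (RR, lower, upper) for one two-arm trial.

    Uses the standard large-sample standard error of log(RR).
    Assumes at least one event in each arm (no zero-cell correction).
    """
    risk_tx = events_tx / n_tx
    risk_ctrl = events_ctrl / n_ctrl
    rr = risk_tx / risk_ctrl
    # Standard error of log(RR) for two independent binomial samples
    se_log_rr = math.sqrt(1 / events_tx - 1 / n_tx + 1 / events_ctrl - 1 / n_ctrl)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts for illustration only (not data from this meta-analysis):
# 15 of 100 participants drop out of CBT, 6 of 100 are recorded as drop-outs
# in the control arm.
rr, ci_low, ci_high = risk_ratio(events_tx=15, n_tx=100, events_ctrl=6, n_ctrl=100)
print(f"RR = {rr:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

Because the control-arm event count enters the denominator of the RR, any under-recording of drop-out in an inactive control group directly inflates the ratio, which is the artifact described above.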

The reported analyses on efficacy and acceptance provide average effects, thereby complementing another recent review [16] that reported the proportion of clinically improved or worsened cases. Furthermore, the results of most analyses in this project showed a relevant amount of heterogeneity. This suggests that not every patient benefits from CBT, which is why an array of moderators was examined. In the moderator and sensitivity analyses, the effects on fatigue were larger when CBT was compared only with inactive control groups (moderate to large effect). However, the effects were still small to moderate if CBT was compared to an intervention that controls for some factors such as therapist support, attention, or expectancy variables. Within the subgroup analyses, the 95%CIs of both groups did not overlap and heterogeneity was drastically reduced in both groups. There was a similar pattern for perceived health status, as the effect was only significant for inactive control groups, but not for non-specific control groups, with a wider 95%CI for the latter (from no effect to a moderate effect). Yet, the subgroups did not differ significantly. For perceived health status, the therapy setting (group vs. individual) did not affect the outcome, but its potential impact on other outcomes could not be evaluated. Although it was not possible to calculate subgroup analyses based on the different diagnostic criteria used for inclusion of CFS patients, most included studies used well-established criteria [2]. For the two studies that used cut-off values as inclusion criteria, sensitivity analyses did not indicate a significant impact on results.

Furthermore, the influence of three variables of treatment intensity was examined: treatment dosage (total time in minutes), the number of sessions, and the duration of treatments in weeks. There was an effect of treatment dosage, as expected, for fatigue (higher dosage predicts a higher symptom reduction), and the moderator accounted for almost all heterogeneity (R² = 91.17%). Although the included studies evaluated relatively short therapies with a range from 2 to 17 sessions, the number of sessions was a significant moderator for the effect on perceived health status. Duration of treatment (in weeks) affected neither efficacy nor acceptance. Consequently, two indicators of treatment intensity each showed a relevant association with one of the two most frequently studied efficacy outcome measures. That is, treatment intensity may be a relevant moderator in CFS. In contrast, these indicators were not associated with the measures of treatment acceptance. Fatigue severity was associated with a higher non-completion rate – that is, if the mean on the fatigue scale rises by one point, the proportion of non-completion rises by 0.1. This was also true for drop-out, but to a lesser extent. That is, some individuals might not deem themselves able to participate in regular sessions due to their severe fatigue symptoms. Duration of symptoms at baseline moderated neither non-completion nor drop-out. The differentiated sensitivity analyses, which considered the meta-analytic model choice and the influence of specific studies, indicated that the project’s results are robust. Additionally, there was no substantial indication of publication bias for the efficacy outcomes.
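
As a reading aid for the R² reported for the dosage moderator, the sketch below shows one common way such a value is defined in random-effects meta-regression: the proportional reduction in the between-study variance (tau²) once the moderator is added to the model. It is assumed here, not confirmed by the text, that the reported R² follows this definition; the function name `heterogeneity_explained` and the tau² values are hypothetical.

```python
def heterogeneity_explained(tau2_without_moderator, tau2_with_moderator):
    """Proportion of between-study variance accounted for by a moderator.

    Computed as (tau2_without - tau2_with) / tau2_without and truncated
    at zero, because sampling error can make the residual tau2 exceed
    the tau2 of the model without the moderator.
    """
    if tau2_without_moderator <= 0:
        raise ValueError("tau2 without the moderator must be positive")
    r2 = (tau2_without_moderator - tau2_with_moderator) / tau2_without_moderator
    return max(0.0, r2)


# Hypothetical variance estimates for illustration only
# (not the estimates from this meta-analysis).
r2 = heterogeneity_explained(tau2_without_moderator=0.20, tau2_with_moderator=0.02)
print(f"R^2 = {100 * r2:.1f}%")
```

Under this definition, a value above 90% means that very little between-study variance remains after accounting for the moderator. With few studies, such estimates tend to be imprecise.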

Limitations and Future Directions

As this project’s focus was on investigating the absolute efficacy of CBT, which allows for a more accurate assessment of the efficacy and acceptance of the intervention [53], the interpretation of results is limited to this pool of studies. Consequently, the statements made apply primarily to CFS according to the Oxford criteria [22] and Centers for Disease Control and Prevention (CDC) definition [1], but not to myalgic encephalomyelitis/CFS according to the definition by the NICE guidelines [14]. However, a recently published individual patient data meta-analysis did not find evidence suggesting that patients meeting different case definitions or reporting additional symptoms benefited less from CBT [49]. Nonetheless, the selection of studies might have led to an overestimation of effects as effects are usually larger when compared to inactive and non-specific control groups [54]. Additionally, in line with Kim et al.’s [55] findings on measures used in CFS trials, the measures of fatigue and perceived health status are self-report ratings. This has partly been criticized [56]. However, this line of argument neglects the fact that those affected primarily report subjective suffering. The assessment of self-ratings is in line with the recommendations by the EURONET-SOMA group [57]. The same applies to comorbid psychopathology, which can be a predictor [58], a concomitant factor or a consequence of CFS [12]. Therefore, subjectively experienced fatigue, comorbid psychopathology, and resulting disabilities have been chosen as key outcome measures. However, post-exertional malaise – which has been proposed as a cardinal symptom for myalgic encephalomyelitis/CFS in the latest NICE guidelines [14] – could not be systematically aggregated as it was seldom assessed and not uniformly reported [29, 35, 42]. Future studies should take this into account.
Furthermore, this project – like previous meta-analyses in the field – can neither answer which components of CBT are effective nor does it allow for therapy comparisons. Since studies were considered here if they were based on a cognitive-behavioral rationale rather than a purely behavioral rationale, we cannot draw conclusions on differences between behavioral and cognitive components or the like.

Since the overall RoB was rated as at least “some concerns”, but mostly as “high”, for the included studies, this should be taken into account when interpreting the results. Nonetheless, the high rating can partly be explained by the nature of conventional psychotherapeutic trial designs used for these studies. This does not imply that these designs are without flaw; however, these ratings do not render the results irrelevant. For example, one major criticism of psychotherapy trials in CFS is a lack of blinding [56]. However, this overemphasizes the assumed effect of blinding in trials, which is not reflected in clinical data [59].

Furthermore, there might be a form of selection bias in the examined sample since the included studies were conducted mainly in Europe and the US, which may limit the generalizability of the findings to other regions. More importantly, the included participants in the studies already agreed to participate in an RCT on CBT for CFS. That is, participants who are not willing to partake in CBT might simply refuse participation. Moreover, as most studies included at least some face-to-face contact, it might disadvantage those CFS patients who, due to their pronounced symptom severity, are housebound [16]. Therefore, conclusions on this subset of CFS patients cannot be drawn. Thus, it remains an open question as to which treatments or forms of treatment are appropriate for patients who are this severely impaired. Future studies should continue to systematically assess the reported efficacy outcomes, and should additionally be supplemented with objective measurement instruments. Thus, other observable indicators would also be of interest, such as behavior changes and sick leave days. In particular, the catamnestic effects in CBT should be further investigated in large-scale trials. Currently, in study reports, specific forms of drop-out rates are seldom reported [17], and a consistent definition of drop-out is lacking [28]. While it is understandable that authors define an adequate amount of treatment received as the main indicator of acceptance, this leads to a certain amount of variance between studies’ definitions of adequate treatment. However, it is noticeable that only a small proportion of studies reported uptake and discontinuation, which could indicate attrition bias [16]. In particular, the (non-)completion of all mandatory modules and information on the mandatory modules completed were only reported in some studies.
In line with former meta-analyses on adherence, treatment refusal (e.g., [28]) and drop-outs (e.g., [17]) have been systematically aggregated here. Additionally, in this project, the analysis of non-completion and the average proportion of mandatory modules provided a new perspective on, and operationalized measures of, adherence and hence acceptance. Differentiated information on adherence is essential. Preferably, future studies should report the described variants of acceptance outcomes. Although there is now a broader database compared with the meta-analyses from 10 to 15 years ago, the subgroup analyses and, especially, the regression analyses to identify possible moderators of efficacy and acceptance require further primary studies.

Conclusion

In conclusion, this meta-analysis provides further evidence for the efficacy of CBT in the treatment of CFS in adults and maintenance of the effects, while also highlighting the importance of considering the acceptance of treatments. Acceptance of CBT in CFS does not seem to be lower when compared with other patient groups with various mental disorders – this also applies if the stricter criterion of non-completion is taken into account. The results may help inform clinical practice and future research in this area. Hence, this project contributes to a better understanding of the potential benefits and limitations of CBT in the treatment of CFS and supports clinical decision-making. Some people with CFS are reluctant to undergo psychological therapy. This may stem from a mismatch between their personal beliefs about their condition and the foundational principles of therapies like CBT. Recognizing and addressing this disconnect is crucial for providing effective support and treatment for individuals with CFS, potentially by tailoring therapy to their unique needs and perspectives. Considering that initial fatigue severity was associated with a slightly increased risk of discontinuation of treatment, one could consider acceptance-facilitating interventions – as shown in pain [60]. Similarly, stepped-care approaches might be promising in the field of CFS: first, the participants receive a low-threshold intervention (e.g., an internet-based self-help program), and then, if needed, a more intense face-to-face therapy [20].