Self-compassion is the tendency to soothe oneself with kindness and non-judgemental understanding in times of difficulty and suffering (Neff 2003b; Gilbert 2009). Greater levels of self-compassion have been linked to reduced mental health symptoms, with meta-analyses reporting large correlations between higher levels of self-compassion and lower levels of depression, anxiety and stress in adults (r = − 0.54; MacBeth and Gumley 2012) and adolescents (r = − 0.55; Marsh et al. 2018), as well as greater overall psychological well-being (r = 0.47; Zessin et al. 2015). Motivated by the link between self-compassion and mental health, a range of compassion-based therapies have been developed (for a review, see Leaviss and Uttley 2015), and a meta-analysis has provided preliminary evidence that such therapies produce moderate positive changes in self-compassion and other mental health outcomes (Kirby et al. 2017). However, it is not possible to say from this meta-analysis whether self-compassion-related therapies are effective in treating individuals with clinical or subclinical levels of mental health problems because many of the samples included were drawn from the general non-clinical population. This therefore calls for an updated meta-analysis examining the effectiveness of self-compassion-related therapies in clinical and subclinical populations.

Compassion-focussed therapy (CFT) is the intervention that most explicitly aims to modify self-compassion. It was developed for use with people with chronic mental health problems who experience high self-criticism and shame and who do not respond well to conventional therapies (Gilbert and Proctor 2006). CFT is grounded in a theoretical assumption that we have three affective systems (threat, drive and soothing) and that enhancing the soothing system helps us manage negative thoughts and emotions through promoting social bonding and positive self-repair behaviours (Gilbert 2009). Typical techniques used in CFT include self-compassionate meditation, imagery, letter writing and dialogic role-play (Gilbert 2009). Similar techniques are used in parallel therapies, such as mindful self-compassion therapy (MSC; Neff and Germer 2013). A meta-analysis (Kirby et al. 2017) has indicated that CFT and related therapies, such as MSC, improve levels of self-compassion (d = 0.70), as well as reduce anxiety (d = 0.49), depression (d = 0.64) and psychological distress (d = 0.47), in various groups both with and without mental health conditions.

While Kirby et al. (2017) reviewed only compassion-based interventions, a focus on self-compassion is not restricted to one modality of therapy. It is relevant across ‘third-wave’ therapies, such as mindfulness-based cognitive therapy (MBCT), dialectical behavioural therapy (DBT) and acceptance and commitment therapy (ACT). For example, the second edition of the MBCT manual (Segal et al. 2013) explicitly makes the promotion of self-compassion an aim of therapy, and improvement in self-compassion has been proposed as a mechanism of change in mindfulness therapies (for a review, see Gu et al. 2015). This is hardly surprising given that self-compassion and mindfulness are overlapping constructs. Indeed, mindfulness figures in Neff’s (2003a) three-part conceptualisation of self-compassion, alongside self-kindness and common humanity. Self-compassion is directly relevant to DBT, given that the DBT manual for borderline personality disorder includes several exercises designed to encourage self-compassion (Linehan 1993). Finally, self-compassion has also been linked theoretically to the core processes of ACT, in particular acceptance, cognitive defusion, present moment awareness and self as context, which are all aimed at reducing self-criticism (Neff and Tirch 2013). Based on the similarity between self-compassion and the underlying constructs in MBCT, DBT and ACT, it is reasonable to view these different interventions as part of a family of self-compassion-related therapies that could be evaluated as a group.

Just as the clinical significance of self-compassion is not limited to one therapeutic modality, it is also not limited to one psychological diagnosis. A tendency to be self-critical, which is viewed as the opposite of self-compassion, is seen as a universal feature of psychopathology (Clark et al. 1994; Gilbert and Proctor 2006). Also, in addition to depression, anxiety and stress (MacBeth and Gumley 2012), low self-compassion has been linked to symptomology in people with persecutory delusions (Collett et al. 2016), auditory hallucinations (Dudley et al. 2018), eating disorders (Ferreira et al. 2013), and Cluster C personality disorders (Schanche et al. 2011). Psychotherapies that target self-compassion are therefore likely to be relevant across disorders. This is consistent with a transdiagnostic approach to therapy, which recognises that psychological disorders are often comorbid, share causal factors and have blurred diagnostic boundaries (Newby et al. 2015).

A final issue for consideration is that self-compassion is not necessarily a single, unitary construct. The most commonly used psychometric measure of self-compassion, the Self-Compassion Scale (Neff 2003b), comprises six subscales, three positive and three negative: the positive subscales are self-kindness, common humanity and mindfulness, and the negative subscales are self-judgement, isolation and over-identification. These subscales have differing relationships with other psychological variables. Muris and Petrocchi (2017) found that the negative items were more strongly related to psychopathology than the positive items, and Neff (2016) found general trends for improvements in the negative subscales to predict reduced psychopathology and for the positive subscales to predict increased well-being in a randomised controlled trial (RCT) of MSC therapy. Given these differential relationships between self-compassion and mental health outcomes, the effect of therapy on self-compassion should be investigated as a multifaceted phenomenon.

This meta-analysis aimed to evaluate the effectiveness of self-compassion-related therapies, compared to a control condition, in clinical and subclinical populations. Our review extends previous reviews (Kirby et al. 2017; Leaviss and Uttley 2015) in three important ways. First, it is more inclusive of the type of therapies; thus, we use the general term ‘self-compassion-related therapies’ rather than CFT to refer to the therapies included in our review, as we take any intervention with the stated goal of directly or indirectly improving an individual’s level of self-compassion as relevant. Second, we focussed purely on groups with classifiable mental health symptoms presenting at either a subclinical or clinical level. Third, we assessed whether particular aspects of self-compassion are more modifiable in therapy than others. The previous reviews (Kirby et al. 2017; Leaviss and Uttley 2015) indicated that we should expect therapeutic outcome to show considerable variety across studies, and so we hypothesised that improvements in self-compassion and psychopathology would be moderated by the clinical status of participants and the type of control group and intervention used in the studies.


The review was conducted following the guidance by the Centre for Reviews and Dissemination (CRD 2009). This was originally designed as a systematic review and a protocol was submitted to the PROSPERO International prospective register of systematic reviews (CRD42016033532; Mackintosh 2016). Due to the large number of studies identified during the literature search following submission of the protocol, we decided to limit our analysis to RCTs, as these offer the highest standard of evidence. The number of studies also allowed us to offer a quantitative, rather than purely qualitative, review, thereby providing more information for researchers and clinicians.

Identification and Selection of Studies

A comprehensive literature search was conducted in July 2017 using five databases: PsycINFO, Medline, Embase, CINAHL and Cochrane Library. The following keywords were used: ‘compassion focused therapy*’ or ‘compassionate mind training’ or ‘mindful self-compassion’ or (‘mindfulness based’ or ‘MBCT’ or ‘MBSR’ or ‘acceptance and commitment therapy*’ or ‘ACT’ or ‘dialectical behaviour* therapy*’ or ‘DBT’ or ‘intervention’ or ‘treatment’ and ‘self-compassion’ or ‘self-kindness’). After removal of duplicates, studies were screened based on title. Next, abstracts and full-text articles were independently screened by two researchers according to the inclusion criteria (see below). Any ambiguities were resolved in discussion. Reference lists of the final set of studies included in the review were screened for further relevant studies, as were the reference lists of three previous reviews (Kirby et al. 2017; Leaviss and Uttley 2015; MacBeth and Gumley 2012). Additional searches were conducted on the publications of two key authors in the field of self-compassion (Neff and Gilbert) and publication lists on relevant websites ( and

Eligibility Criteria

For inclusion, studies had to be RCTs evaluating an intervention with a self-compassion component against either an active intervention or a waitlist/treatment as usual control. We required the intervention to include at least one face-to-face session with a trained therapist. The study population had to consist of adults of 18 years and over who had a clinical or subclinical mental health problem, as assessed by formal clinical diagnosis or by a validated self-report measure. Self-compassion is relevant to a range of mental health problems, so this review was not restricted to any specific diagnosis. Studies needed to include a standardised measure of self-compassion. Where possible, we also extracted depression and anxiety scores. We focussed on symptoms of depression and anxiety as key outcome variables since these have been identified as linked to self-compassion in previous meta-analyses (MacBeth and Gumley 2012; Marsh et al. 2018). They are also common outcomes in RCTs, so it was likely that we would identify a sufficient number of studies to calculate summary estimates of the effect of therapy on these two variables. Finally, all included studies needed to be published in a peer-reviewed journal in English.

Data Extraction

Characteristics of the identified studies were independently extracted by two researchers; see Table 1 below and Tables 4 and 5 in the Appendix.

Table 1 Characteristics of studies included in the review

For the meta-analysis, we extracted outcome data for self-compassion from each paper and for depression and/or anxiety where these were reported. We extracted means and SDs pre- and post-treatment and sample sizes in the intervention and control groups. Where an intention-to-treat sample was used, we extracted the full sample size at randomisation, and where per-protocol results were given, we took the sample size of study completers, so that the weighting of studies in the meta-analysis would be proportional to the amount of data contributed. Generally, raw means were extracted, although in two cases (Kelly and Carter 2015; Kelly et al. 2017) only estimated means from multilevel modelling were reported.

We planned to accept any standardised measure of self-compassion, though in practice this meant either the Self-Compassion Scale (SCS; Neff 2003b) or the Self-Compassion Short-Form (SCS-SF; Raes et al. 2011) was required, as these are the only validated measures of the construct. The SCS is a 26-item self-report questionnaire, including six subscales, self-kindness, self-judgement, common humanity, isolation, mindfulness and over-identification. The first two subscales include 5 items and the others include 4 items; the total score is computed as the average of the six subscales. The SCS-SF includes 12 items in total (2 from each scale); SCS-SF and SCS full scores were reported to be almost perfectly correlated (r = 0.97; Raes et al. 2011). As part of our review, we were interested in addressing whether different facets of self-compassion were more modifiable in therapy than others. Some studies reported breakdowns on the subscales, so we extracted all these scores; where results were not fully reported, we contacted the authors.

We accepted any psychometrically validated measure of depression and anxiety. If studies reported more than one measure of depression or anxiety, we selected the primary outcome or pooled the results if there was no a priori reason to favour one measure. In practice, this situation occurred only twice during data extraction. Kingston et al. (2015) reported anxiety and depression using both the Hospital Anxiety and Depression Scale (HADS) and Profile of Mood States (POMS). As the HADS was the clinical screening tool, we used this in our analysis. Hou et al. (2013) reported separate results for the State Anxiety Inventory (SAI) and Trait Anxiety Inventory (TAI), and to avoid an arbitrary choice of one over the other, we averaged the means.

Quality Assessment

We assessed the quality of the studies in the review using two systems. First, we used the Cochrane Collaboration’s tool for assessing risk of bias in RCTs (Higgins et al. 2011). This is the standard framework used for assessing whether there is low, high or uncertain risk of bias within studies. We checked for bias arising from: the allocation of individuals into groups (selection bias), the blinding of participants and personnel to condition during the intervention (performance bias), blinding during assessment (detection bias), missing data (attrition bias) and selective reporting of results (reporting bias). The above was supplemented by a checklist adapted from Downs and Black (1998) for evaluating randomised and non-randomised studies of healthcare interventions, and informed by the changes made by Cahill et al. (2010) for assessing practice-based research on psychological therapies. The final checklist consisted of 27 items covering four areas: reporting (11 items), external validity (4 items), internal reliability of measurement and treatment (5 items) and internal reliability of confounding variables/selection bias (7 items). Quality was assessed independently by two researchers, and inter-rater reliability was assessed using Cohen’s Kappa statistic.
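
As an illustration of the inter-rater statistic named above, the following is a minimal sketch of Cohen's Kappa for two raters. The function name `cohens_kappa` and the example ratings are hypothetical and are not taken from the review; the code is written in Python rather than the R environment used for the analyses.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each rated the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same rating.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over rating categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical risk-of-bias ratings, for illustration only.
ratings_a = ["low", "low", "high", "unclear", "low", "high"]
ratings_b = ["low", "low", "high", "low", "low", "high"]
kappa = cohens_kappa(ratings_a, ratings_b)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is preferred to a simple percentage when some rating categories are much more common than others.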

Data Analyses

Meta-analysis was carried out in the open-source software environment R (version 3.4.0) using the (Del Re 2013) and metafor packages (Viechtbauer 2010). Using the mes() function in, we calculated the standardised mean difference effect size for each comparison of a compassion-related intervention with a control condition and the associated sampling variance. For the effect size, the difference in change scores between the intervention group (group 1) and the control group (group 2) was divided by the pooled pre-study standard deviation, as shown in the formula below:

$$ \frac{\left(M_{\text{Group 1, post-study}} - M_{\text{Group 1, pre-study}}\right) - \left(M_{\text{Group 2, post-study}} - M_{\text{Group 2, pre-study}}\right)}{\sqrt{\dfrac{\mathrm{SD}_{\text{Group 1, pre-study}}^{2}\left(n_{\text{Group 1}} - 1\right) + \mathrm{SD}_{\text{Group 2, pre-study}}^{2}\left(n_{\text{Group 2}} - 1\right)}{N - 2}}} $$

This followed the meta-analysis of compassion-based interventions by Kirby et al. (2017), and the metric was adjusted as suggested by Hedges and Olkin (1985) to correct for biased estimation in small samples (giving Hedges’ g).
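
The effect size described above can be sketched as follows. This is an illustrative Python version, not the authors' R code. It assumes the pooled pre-study standard deviation takes the usual variance-weighted form and that the small-sample adjustment is the common approximation J = 1 − 3 / (4·df − 1). The function name and the input values are hypothetical.

```python
import math


def hedges_g_change(m1_pre, m1_post, sd1_pre, n1,
                    m2_pre, m2_post, sd2_pre, n2):
    """Standardised difference in change scores with small-sample correction.

    Group 1 is the intervention group and group 2 the control group.
    """
    # Difference between the two groups' mean pre-to-post changes.
    change_diff = (m1_post - m1_pre) - (m2_post - m2_pre)
    # Pooled pre-study SD (assumed variance-weighted form).
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(
        ((n1 - 1) * sd1_pre ** 2 + (n2 - 1) * sd2_pre ** 2) / df
    )
    d = change_diff / sd_pooled
    # Approximate small-sample correction factor (assumption, see lead-in).
    j = 1 - 3 / (4 * df - 1)
    return j * d


# Hypothetical self-compassion scores, for illustration only.
g = hedges_g_change(2.5, 3.1, 0.6, 20, 2.6, 2.7, 0.6, 20)
```

Standardising by the pre-study standard deviation keeps the denominator unaffected by the treatment itself, so effects remain comparable across studies even when an intervention changes the spread of scores.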

Some studies compared an intervention group to two different control groups. In these cases, we calculated effect sizes for each comparison of an intervention to a control. To correct for these correlated comparisons (Higgins and Green 2011), we used a multilevel meta-analytic model (Konstantopoulos 2011; Weisz et al. 2013). Using the function in the metafor package, we ran separate multilevel models with restricted maximum likelihood estimation for self-compassion, depression and anxiety, including ~group|study as a random term in each model. The summary effects produced by the models were interpreted according to Cohen’s (1988) guidelines: Hedges’ g of 0.20 as a small effect, 0.50 as medium and 0.80 as large. We tested whether the effect sizes for self-compassion, depression and anxiety differed in size using a Wald-type test.

The presence of heterogeneity was assessed using the Q-statistic, which tests whether the sum of weighted squared deviations about the summary effect size is greater than expected by sampling error. This has a χ² distribution under the null hypothesis. Heterogeneity was quantified using the I² statistic, calculated as (Q − df) / Q and expressed as a percentage; 0% indicates no observed heterogeneity, while 25, 50 and 75% indicate low, moderate and high heterogeneity, respectively (Higgins et al. 2003).
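
The two heterogeneity statistics can be sketched in a few lines. This Python illustration computes Q about a simple inverse-variance weighted mean, which is an assumption made for brevity; the Q values in this review came from multilevel models, so this is not a reproduction of the authors' procedure. The function names are hypothetical.

```python
def q_statistic(effects, variances):
    """Sum of weighted squared deviations about the weighted mean effect."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two effects with matching variances")
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(effects) - 1  # Q and its degrees of freedom


def i_squared(q, df):
    """(Q - df) / Q as a percentage, truncated at zero when Q < df."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0
```

As a check on the formula, `i_squared(63.63, 25)` gives approximately 60.7%, which matches the value reported for self-compassion in the results.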

As detailed in the aims above, we examined if three characteristics of the studies contributed to heterogeneity: the study population (clinical or subclinical), the modality of therapy (explicitly compassion-based, i.e. CFT, or another type of intervention) and the kind of control group (active or waitlist/treatment as usual). To assess the importance of these study-level variables, we carried out meta-regressions by adding the three moderators to the multilevel models for self-compassion, depression and anxiety. Variables were dummy-coded such that positive coefficients for population, therapy type and control type meant a greater effect for clinical populations, CFT and studies with an active control, respectively. Where a moderator was significant, we split the data set by that moderator to get effect size estimates within the subgroups.

Publication bias and sensitivity of the meta-analysis to influential cases were tested. Individual effects were identified as potentially influential if they had high leverage, defined by a hat value above 2/n, a conservative cut-off (Hoaglin and Kempthorne 1986), or were discrepant, with a standardised residual beyond ± 3. The models were compared with and without any effects that screened positive for leverage or discrepancy. Publication bias is usually investigated by funnel plots. However, traditional funnel plots, in which effect sizes are plotted against their standard errors, do not provide a reliable assessment of publication bias when effects are nested and when there is significant heterogeneity, and so the residuals of the moderated models, rather than the raw effects, were plotted here instead (Nakagawa and Santos 2012). This has the effect of checking for publication bias once heterogeneity has been accounted for. We ran Egger’s test for the asymmetry of the funnel plot showing the residuals (Egger et al. 1997). In the case of possible publication bias, we ran the trim-and-fill procedure on the residuals of the moderated model to approximate the number of hypothetical unpublished studies ‘missing’ from the data set, and as advised by the originators of the procedure, it was used as a sensitivity analysis rather than an adjustment (Duval and Tweedie 2000).
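
The screening rule for influential effects can be sketched as follows. This Python illustration assumes the hat values and standardised residuals have already been obtained from a fitted model; the function name and the example numbers are hypothetical.

```python
def flag_influential(hat_values, std_residuals, residual_cutoff=3.0):
    """Return (index, high_leverage, discrepant) for each flagged effect."""
    n = len(hat_values)
    if n == 0 or n != len(std_residuals):
        raise ValueError("inputs must be non-empty and of equal length")
    leverage_cutoff = 2.0 / n  # cut-off stated in the text
    flags = []
    for i, (hat, resid) in enumerate(zip(hat_values, std_residuals)):
        high_leverage = hat > leverage_cutoff
        discrepant = abs(resid) >= residual_cutoff
        if high_leverage or discrepant:
            flags.append((i, high_leverage, discrepant))
    return flags


# Hypothetical diagnostics for four effects, for illustration only.
flagged = flag_influential([0.1, 0.6, 0.1, 0.1], [0.5, 1.0, -3.2, 2.0])
```

Any effect returned by such a screen would then be dropped in a sensitivity run, and the model compared with and without it.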


Study Selection

See Fig. 1 for details on the selection of papers for the meta-analysis. A total of 22 studies met our inclusion criteria. Just three of these studies were included in the only other meta-analysis of compassion-based interventions (Kirby et al. 2017). One study (Huijbers et al. 2015), despite not reporting self-compassion outcomes, was included because a second study by the same authors (Huijbers et al. 2017) indicated that the SCS was administered in the RCT. We were able to obtain a breakdown of the SCS results from the authors to be included in this meta-analysis.

Fig. 1 Flowchart showing number of records at each stage of the literature screening

Study Characteristics

See Table 1 for a description of all 22 studies. Table 4 in the Appendix presents further details on the studies, including their main outcomes and information on the therapists, treatment adherence and attrition. Of the 22 RCTs included in the review, 13 evaluated mindfulness-based therapies, 1 a day-long ACT workshop and 8 compassion-based interventions (CFT or related compassionate mind/loving-kindness approaches). The literature search did not identify any studies examining the effect of DBT on self-compassion.

Of the 13 mindfulness-based interventions, the majority (n = 9) were closely matched in format, following the treatment protocols of either Kabat-Zinn (1990) or Segal et al. (2002), with manualised weekly group sessions typically lasting between 2 and 2.5 h over 8 weeks. The other four mindfulness interventions were more heterogeneous: two were also an 8-week course but with shorter sessions, one was a longer course and one was a self-help intervention with an initial face-to-face orientation session.

The compassion-based interventions (n = 8) were more variable in format. Two had relatively minimal therapist contact time: each comprised one face-to-face orientation session followed by 3 or 4 weeks of guided self-help. Six compassion-focussed interventions followed a more intensive course format, but the weekly sessions were shorter (1 to 1.5 h in duration) and the length of the course was more variable (between 7 and 12 weeks). Of these six, some had a group format (n = 3), whereas others involved one-to-one sessions (n = 3).

In 11 of the 22 studies, there was a comparison group engaging in an active control condition. In three of these (Armstrong and Rimes 2016; Hou et al. 2014; Kuyken et al. 2010), the control condition was not closely matched to the intervention in duration of social contact. In 15 studies, the control condition was waitlist or treatment as usual (TAU). In four of these studies (Huijbers et al. 2017; Kelly et al. 2017; Key et al. 2017; Kingston et al. 2015), the control groups provided a high level of comparison with the intervention group, since the participants were under the care of an outpatient clinic with consistent psychotherapy and/or pharmacotherapy. In the other 11 studies, the waitlist/TAU group contained participants with little treatment or no consistent level of treatment. Where treatment adherence was reported, it appeared high, although few studies gave a comprehensive report of adherence. Therapist competence was reported in most studies, and where it was, experienced, qualified mental health professionals delivered the interventions.

Studies included in this review report summary data for a total of 1262 participants at baseline, with individual sample sizes between 16 and 173 (median = 40). Just over half the studies took place in the USA (n = 8) or the UK (n = 4), with the remaining studies spread across Canada (n = 2), the Netherlands (n = 2) and one each in Japan, China, Ireland, Portugal, Norway and Israel. In total, 73.9% of the individuals were female, and mean age was 40.0 years (SD = 10.7). Data from 10 studies were based on an intention-to-treat (ITT) sample and data from 12 on per-protocol (PP) participants. Of the 12 papers with PP results, 4 repeated the analyses with ITT samples and reported finding equivalent results. Across all studies, post-intervention data were available on 78.8% of participants randomised to a condition; the median percentage of participants providing post-intervention data was 71.8% (range = 56.3–92.7%) in PP samples and 85.3% (range = 75.6–100%) in ITT samples, indicating higher attrition in PP samples. The clinical characteristics of the included studies were as follows: peri-clinical anxiety/depression (n = 4), recurrent depression in full/partial remission (n = 3), treatment-resistant depression (n = 1), social anxiety disorder (n = 2), trauma symptoms/PTSD (n = 2), eating disorder (n = 3), obsessive-compulsive disorder (n = 1), high stress (n = 2) and high self-criticism/low self-compassion (n = 4). The studies of high stress and high self-criticism/low self-compassion all selected their participants using thresholds on screening measures that might be taken as indicating a risk for developing psychopathology. See Table 1 for the characteristics of the samples.

Quality Assessment

See Table 2 for the assessment of risk of bias. There was variable risk of bias across studies, with the main problem being performance bias. There was little evidence that the participants in the experimental and control conditions were likely to have similar expectations for treatment gains, as the control groups often failed to provide a comparable level of treatment to the experimental groups. For instance, a patient is unlikely to have the same expectations of improvement if they are offered a minimal self-help course compared to weekly group therapy. See Tables 6 and 7 in the Appendix for further quality ratings. Inter-rater reliability of the two assessors was high (Kappa = 0.83).

Table 2 Assessment of risk of bias across the studies included in the review

Effects of Self-Compassion-Related Interventions on Self-Compassion, Depression and Anxiety

Figure 2 shows forest plots for all three outcomes, with each individual effect size representing a comparison between a self-compassion-related intervention and a control condition.

Fig. 2 Forest plots showing effect sizes for the three main outcomes: self-compassion, anxiety and depression. Where authors and year are followed by (1) or (2), this indicates the comparison between the intervention group and either control group 1 or control group 2. See Table 1 for details regarding the conditions

There were 26 comparisons that measured the self-compassion outcome, covering a total of 1172 individuals. There was a medium-sized overall effect, with greater improvement in self-compassion in the self-compassion intervention than in the control, g = 0.52, 95% CI [0.32, 0.71], p < 0.001. As can be seen in the forest plot, 19 of the 26 comparisons were at least small-sized, and 15 were medium-sized. Across the studies, heterogeneity was moderate, Q(25) = 63.63, p < 0.001, I² = 60.7%.

There were 17 comparisons that measured anxiety, covering a total of 665 individuals. The overall effect was borderline medium for anxiety, g = 0.46, 95% CI [0.25, 0.66], p < 0.001. Heterogeneity was low, Q(16) = 28.67, p = 0.041, I² = 44.2%. There were 22 comparisons that measured depressive symptoms, covering a total of 1063 individuals. A small to medium effect was found for depressive symptoms, g = 0.40, 95% CI [0.23, 0.57], p < 0.001. There was evidence of moderate heterogeneity, Q(21) = 51.09, p < 0.001, I² = 58.9%.

We tested whether the magnitude of the summary effect for self-compassion differed significantly from those for anxiety and depression using Wald-type tests. Both tests were non-significant: z = 0.42, p = 0.676; z = 0.80, p = 0.380. However, on a study by study level, there was evidence that interventions often varied in their impact on self-compassion and the psychopathology measures. The average absolute difference between a study’s effect size for self-compassion and for depression was 0.45 (SD = 0.46), and between self-compassion and anxiety, it was 0.41 (SD = 0.41).
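
A simple form of the Wald-type comparison can be sketched as below. This Python illustration recovers standard errors from the reported 95% confidence intervals and treats the two summary estimates as independent. Both steps are assumptions made for illustration; the authors' test may have used model output directly or accounted for dependence between outcomes. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Approximate a standard error from a symmetric normal-theory CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return (upper - lower) / (2 * z_crit)


def wald_z(est_a, se_a, est_b, se_b):
    """z and two-sided p for the difference of two independent estimates."""
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Self-compassion versus anxiety, using the rounded summary values above.
z, p = wald_z(0.52, se_from_ci(0.32, 0.71), 0.46, se_from_ci(0.25, 0.66))
```

With these rounded inputs, the result lands close to the first reported z value. Exact agreement with the reported tests should not be expected from rounded estimates and an independence assumption.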

For the three meta-analytic models, all standardised residuals were between − 2.68 and 2.10, and no hat values were flagged, suggesting that there were no influential outliers.

Types of Control, Intervention and Population as Possible Moderators of Outcome

Meta-regressions indicated that the type of control (active vs. waitlist/TAU) was the only study-level moderator of outcome. Study population (clinical or subclinical) and type of therapy (explicitly compassion-based, i.e. CFT, or another type of intervention) showed no effects on outcome. In Table 3, the studies are categorised according to the three moderators.

Table 3 Studies classified by hypothesised moderators

For self-compassion, the omnibus test of the moderators was significant, QM(3) = 12.92, p = 0.005. Residual heterogeneity was also significant, QE(22) = 42.05, p = 0.006, indicating that 47.7% of variability remained unexplained by the model. In the meta-regression for self-compassion, type of control was significant (β = 0.54, SE = 0.16, p < 0.001), but type of intervention (β = 0.20, p = 0.350) and population (β = − 0.05, p = 0.787) were not. For anxiety, the omnibus test of the moderators was also significant, QM(3) = 18.62, p < 0.001, and a non-significant test for residual heterogeneity indicated that all variability was explained, QE(13) = 10.05, p = 0.690. For anxiety, type of control was also a significant predictor (β = 0.53, SE = 0.16, p < 0.001). Neither type of intervention (β = − 0.31, p = 0.169) nor population (β = 0.27, p = 0.082) was significant. Finally, the full model for depression was not significant, QM(3) = 3.84, p = 0.279, and type of control closely missed significance, though it did retain a sizeable coefficient, as in the other models, β = 0.38, SE = 0.20, p = 0.066. Type of intervention (β = − 0.10, p = 0.610) and population (β = 0.16, p = 0.378) were non-significant as moderators. Residual heterogeneity was significant, QE(18) = 40.34, p = 0.002, with 55.4% of variability remaining unexplained.

Given that the meta-regressions indicated substantial between-study differences based on the type of control used, we ran subgroup analyses to extract summary estimates at the subgroup level. For studies with a passive control condition, summary effects were moderate: self-compassion, g = 0.72 [0.53, 0.90], p < 0.001; depression, g = 0.56 [0.38, 0.73], p < 0.001; anxiety, g = 0.69 [0.44, 0.93], p < 0.001. For studies with an active control condition, effect sizes were not significant, though self-compassion only marginally missed significance, g = 0.27 [− 0.04, 0.58], p = 0.092. Estimates for the other outcomes were as follows: anxiety, g = 0.15 [− 0.05, 0.35], p = 0.138; depression, g = 0.17 [− 0.17, 0.52], p = 0.324.

It is worth noting that there was substantial variability in the nature of the passive TAU control groups: they varied from having no treatment at all to having ongoing outpatient care. We therefore conducted an exploratory analysis beyond our planned moderator analysis. In this final subgroup analysis, we grouped studies with a high-level TAU control together with those with an active control group. In practice, this meant re-categorising four studies where all or most control participants received psychological treatment and/or psychotropic medication in their usual care (Huijbers et al. 2017; Kelly et al. 2017; Key et al. 2017; Kingston et al. 2015) as having active rather than passive controls. Under this subgrouping, estimates for the three outcomes were as follows: self-compassion, g = 0.35 [0.09, 0.62], p = 0.010; depression, g = 0.16 [− 0.15, 0.47], p = 0.302; and anxiety, g = 0.23 [0.00, 0.45], p = 0.049.

Effects of Self-Compassion-Related Interventions on Subscales of the SCS

Sixteen studies used the full form of the SCS; the rest used the SCS-SF, which does not allow reliable calculation of subscale scores (Raes et al. 2011). We had access to the breakdown of subscores in 8 studies, through either published data or unpublished data obtained directly from authors, with a total sample of 326 people. We ran a random effects meta-analysis on each subscale. Effects were similar across subscales and typically medium in size, though there was a tendency for negative subscales to be associated with slightly higher effects. Note that all these studies employed passive TAU control groups. Bearing in mind that our moderator analysis above found that studies with active controls were associated with lower effects, it is possible that the following results overestimate the true effect of treatment; nonetheless, these results give a sense of the relative effect on different subscales. Summary effects were as follows: self-kindness, g = 0.58 [0.37, 0.80]; self-judgement, g = 0.54 [0.31, 0.77]; common humanity, g = 0.46 [0.24, 0.68]; isolation, g = 0.63 [0.41, 0.85]; mindfulness, g = 0.41 [0.19, 0.63]; and over-identification, g = 0.72 [0.48, 0.96]; all ps < 0.001. There was no evidence of heterogeneity in any analysis, all p values ≥ 0.301. See Fig. 3 for forest plots of each SCS subscale.

Fig. 3

Forest plots showing effect sizes for the six subscales of the SCS

Publication Bias

Publication bias was assessed by inspecting funnel plots and performing Egger’s regression to test for asymmetry in the plots. Contour-enhanced funnel plots of the observed effects and funnel plots of residuals are shown in Fig. 4 (Appendix). As explained in the “Method,” Egger’s test was run on the residuals of the models including our study-level moderators. Results for Egger’s test were as follows: self-compassion, p = 0.136; anxiety, p = 0.737; depression, p = 0.851. Given that p = 0.1 is taken as the threshold for significance in Egger’s test, we took the borderline result for self-compassion as reason for investigating the sensitivity of the self-compassion effects to publication bias by running the trim-and-fill procedure on the residuals for the moderated model for self-compassion. This indicated that 3 studies were ‘missing’ from the left of the plot and that adding these would adjust the summary effect slightly, β = 0.08.
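To illustrate the idea behind Egger’s test, the sketch below implements its classic form: each study's standardised effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero suggests funnel plot asymmetry. This is only the basic unmoderated version, not the residual-based variant described above, and the inputs are made-up illustrative numbers. The function name is hypothetical, and a p value would additionally require a t distribution with the returned degrees of freedom.

```python
import math


def egger_regression(effects, std_errors):
    """Classic Egger regression of standardised effect on precision.

    Returns (intercept, slope, SE of intercept, degrees of freedom).
    """
    n = len(effects)
    if n < 3:
        raise ValueError("need at least three studies")
    x = [1.0 / se for se in std_errors]                  # precision
    y = [g / se for g, se in zip(effects, std_errors)]   # standardised effect

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual variance and the usual OLS standard error of the intercept.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = sse / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + x_bar ** 2 / sxx))
    return intercept, slope, se_intercept, n - 2


# Hypothetical effect sizes and standard errors for six studies.
example_effects = [0.35, 0.42, 0.38, 0.55, 0.60, 0.58]
example_ses = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35]

b0, b1, se_b0, df = egger_regression(example_effects, example_ses)
t_stat = b0 / se_b0 if se_b0 > 0 else math.nan
print(f"intercept = {b0:.2f}, SE = {se_b0:.2f}, t({df}) = {t_stat:.2f}")
```

In this formulation the slope estimates the underlying effect and the intercept captures any tendency for less precise studies to report systematically different effects.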

Fig. 4

Funnel plots of observed effects and residuals for all measures. (A) Funnel plots of the observed effects (Hedges’ g) for individual studies against study precision, represented here by SE of Hedges’ g. Pseudo-confidence regions are shown by the light (0.01 < p < 0.05) and dark grey bands (0.001 < p < 0.01). Possible publication bias is indicated if there are more studies in these regions than in the white region inside the bands, at the bottom of the plots compared to the top (i.e. as a function of study precision). However, differences may also relate to heterogeneity across studies. In the plots above, effect size and precision are confounded by the type of control, indicating that heterogeneity is a factor. The dotted blue line marks the overall summary effect. (B) Funnel plots showing residuals, with heterogeneity relating to the moderators removed, against the same scale for study precision


This review evaluated the effectiveness of interventions aiming to increase self-compassion among individuals with a mental disorder or a subclinical psychological difficulty. Our results are somewhat equivocal. On the one hand, we found that self-compassion-related therapies, compared to a control condition, successfully increase self-compassion and reduce levels of depression and anxiety with medium effect sizes. These results indicate that self-compassion is a psychological characteristic that can be modified in therapy, and this is of clinical interest given the relationship between self-compassion and psychopathology (MacBeth and Gumley 2012; Marsh et al. 2018). However, this meta-analysis also found that self-compassion-related therapies did not produce better outcomes than active control conditions. This indicates that such therapies are unlikely to have any specific effect over and above the general benefits of any active treatment. We should therefore be cautious about claiming that it is possible to ‘target’ self-compassion in therapy. Instead, it would seem that self-compassion is one of the many psychological characteristics that are modifiable during the course of a range of therapies.

The studies in this review included participants with a range of clinical and subclinical presentations, and there was no evidence that self-compassion-related interventions were better suited to some presentations than to others. Our meta-regressions did not find that clinical or subclinical level of presentation moderated the effect size. It must be borne in mind, though, that our analysis likely had insufficient power to detect a small effect. In addition, our review covered a variety of interventions which are all hypothesised as acting, at least partially, by increasing self-compassion. There was no evidence in the meta-regression that the type of intervention moderated outcome, which is consistent with the idea that a range of therapies can modify an individual’s level of self-compassion. Nonetheless, the proviso above, regarding power, applies here too.

One question that could not be tackled quantitatively in the review is that of mediation: do increases in self-compassion mediate improvements in psychopathology? This is an important question, given that increased self-compassion is assumed to be the mechanism of change in self-compassion-related therapies (e.g. Gilbert 2009). While the meta-analysis cannot answer the question, five of the included studies reported some basic analysis of mediation, and we give a narrative synthesis of these findings below. Hoffart et al. (2015) and Kuyken et al. (2010) both found that increased self-compassion predicted improved psychopathology (PTSD symptoms in the former case, depression in the latter) across their samples, with no differences between the treatment and control groups. In two further studies, change in self-compassion also showed large correlations with the key outcome measures: post-intervention social anxiety-related psychopathology (Koszycki et al. 2016) and change in neuroticism (Armstrong and Rimes 2016). These correlations did not vary between the treatment and control groups. Of the five studies assessing self-compassion as a mediator, only Eisendrath et al. (2016) did not find an effect of change in self-compassion on their primary outcome. The overall consensus across these studies is that increases in self-compassion are related to improvements in psychopathology. However, this relationship is not specific to self-compassion-related therapies; in fact, whenever this association was found in intervention groups, it was also found in control groups. We would therefore need to be sceptical of any suggestion that promoting self-compassion can improve psychopathological symptoms. This calls into question the proposed mechanism of change in self-compassion-related therapies: namely, that self-compassion is the primary target of therapy, with other psychological characteristics changing as a consequence of improvements in self-compassion. Although a sophisticated analysis of mediation would be needed to assess this, the emerging picture is that self-compassion-related therapies do not have a special role to play in promoting self-compassion, either as an end in itself or as a means of influencing other psychological characteristics.

All studies used the SCS (Neff 2003b) or SCS-SF (Raes et al. 2011) to assess self-compassion. The full SCS covers six factors associated with self-compassion. When evaluating the effect of self-compassion-related interventions compared to a control condition on pre-post scores in these subscales, there was significantly greater improvement in all. Interestingly, the negative subscales (self-judgement, isolation and over-identification) showed a trend for greater improvement than the positive subscales (self-kindness, common humanity and mindfulness); in particular, over-identification (g = 0.72) and isolation (g = 0.63) had the greatest effect sizes. This echoes the results of two other papers included in this review (Kelly and Carter 2015; Kelly et al. 2017) that reported larger effects for the negative compared to the positive items. Collectively, these findings speak to the debate around the psychometric properties of the SCS. On the one hand, the fact that all six subscales showed significant improvements supports the validity of the scale, since a self-compassion intervention would be expected to improve scores across the subscales of a self-compassion measure. On the other hand, there seemed to be variability in how modifiable different subscales were, meaning that studies only analysing differences in SCS total score may lose clinically relevant information. Williams et al. (2014) suggested researchers avoid using total scores because the scale did not fit a one-factor structure, and our analysis indicates that a total score may not fully reflect the differential psychotherapeutic benefits of the six facets.

Methodologically, the studies included in the review were of reasonable quality. While the earliest review of self-compassion-related therapies concluded that treatment effectiveness was difficult to evaluate given methodological weaknesses in the field (Leaviss and Uttley 2015), we can be more confident in the quality of the RCTs reviewed here, although there was a particular risk of performance bias across the studies. In many of the studies, it is unlikely that participants would expect as much improvement, or find the condition as credible, if they were in the control group rather than the treatment group. This comes down to an absence of an active control condition in much research, and even where there was an active control condition, it was not always clear if it was well matched to the intervention condition in terms of social contact. It is important that conditions are matched on social contact in order to control for the significant impact of common factors in psychotherapy (Wampold 2015). The most rigorous test would involve comparing self-compassion-related therapies to gold-standard treatments, like CBT. This would involve evaluating the relative impact on primary mental health outcomes, as well as characterising the role of self-compassion in the therapeutic process. While this research is needed, it is worth noting that many studies included in this review did offer a reasonably high level of comparison, with treatment as usual sometimes including a high level of ongoing psychotherapy and/or pharmacotherapy. A further limitation was that some studies suffered from quite substantial attrition, often without any rigorous analysis of differences between dropouts and completers. Intention-to-treat analyses were not carried out consistently, although favouring per-protocol analyses is understandable given the sometimes low sample sizes and the fledgling status of compassion-related therapies. Longer follow-up periods would also be advisable, given that boosting self-compassion is likely to be a useful approach for buffering against relapsing mental disorders; this can only be assessed if medium-term follow-up is conducted.

In conclusion, this meta-analysis found that self-compassion-related interventions had moderate effects on self-compassion, depression and anxiety outcomes across 22 RCTs. However, when limiting analysis to comparisons between self-compassion-related interventions and active control conditions, there were no significant differences in outcome. This suggests that self-compassion-related interventions lacked a specific effect when compared to other active treatments. There was no evidence that effects differed between clinical and subclinical populations, nor between therapies with an explicit or implicit aim to boost self-compassion. In the analysis of the subscales of the SCS, there was some variability in how modifiable subscales were, with negative subscales appearing to be more amenable to therapeutic change, supporting the view that collapsing the subscales into one total may risk losing clinically relevant information. Synthesis of research findings indicated that changes in self-compassion were related to changes in psychopathology, although there was no evidence that this relationship was specific to self-compassion-related interventions. Overall, this review provides good evidence that levels of self-compassion can be modified in third-wave self-compassion-related therapies, but does not indicate that these therapies are any better in promoting self-compassion than other active psychological treatments.