Introduction

Adolescence and emerging adulthood are critical developmental stages, characterized by fundamental biological, psychological, and social changes and role transitions. It is estimated that nearly half of all mental disorders will be manifested by the age of 14 years, and 75% by the age of 24 (Kessler et al., 2005, 2007). A variety of psychotherapeutic treatment modalities have been shown to be effective in treating mental disorders among adolescents and young people (Kazdin et al., 2000; Weisz et al., 2006). However, to better understand what works for whom in psychotherapy, research in the field is moving beyond testing the efficacy and effectiveness of interventions for specific mental disorders, focusing increasingly on specific variables that impact the treatment outcome to make more personalized treatment feasible (Insel, 2009; Kraemer et al., 2002; La Greca et al., 2009). Participants’ characteristics (e.g., age, gender, diagnosis) that are measured before the treatment begins may represent important predictors or moderators related to the treatment outcome (Lambert, 2013). Predictors of treatment outcome refer to baseline variables that impact the treatment outcome regardless of the intervention type (Hinshaw et al., 2007; La Greca et al., 2009). Treatment moderators, on the other hand, are variables that have a differential impact on outcome depending on treatment allocation (Kraemer et al., 2002). While the research on predictors of treatment outcome for adults is more extensive, the evidence for treatments targeting adolescents is still scarce, inconsistent, and limited by methodological variance, and only a few studies have investigated predictors across several diagnostic groups (transdiagnostic predictors) or different treatment modalities (e.g., Nilsen et al., 2013; Vousoura et al., 2021) or have specifically targeted this age group.
This systematic review addresses the gap in previous research by focusing on sociodemographic predictors and moderators of treatment outcome for adolescents and young people across various mental disorders and treatment modalities.
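The distinction between predictors and moderators drawn above can be illustrated with a simple linear model. This is only a sketch of one common formulation, not the analysis used in any of the reviewed studies; the symbols Y (outcome), T (treatment allocation), X (baseline variable) and the coefficients are assumptions introduced here for illustration.

```latex
% Illustrative model (assumed notation, not taken from the reviewed studies):
%   Y_i : treatment outcome for participant i
%   T_i : treatment allocation (e.g., 0 = comparison, 1 = intervention)
%   X_i : baseline sociodemographic variable (e.g., age)
\[
  Y_i = \beta_0 + \beta_1 T_i + \beta_2 X_i + \beta_3 \,(T_i \times X_i) + \varepsilon_i
\]
% Predictor: \beta_2 \neq 0, i.e., X is related to outcome regardless of treatment.
% Moderator: \beta_3 \neq 0, i.e., the effect of treatment differs by level of X.
```

Under this reading, a variable can act as a predictor, a moderator, both, or neither, which is why the two roles are reported separately.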

Thus far, there are few systematic reviews on predictors and moderators of treatment outcome for the combined child and adolescent population for different disorder groups. Ginsburg et al. (2008) synthesized the findings of 21 studies that examined predictors of intervention response in pediatric obsessive-compulsive disorder (6- to 19-year-olds); Nilsen et al. (2013) conducted a review of 45 studies on predictors and moderators of outcome in child and adolescent anxiety and depression (4- to 18-year-olds); and Kunas et al. (2021) systematically reviewed 73 studies and meta-analyzed 23 studies with psychological predictors of cognitive-behavioral therapy outcomes for children and adolescents (mean age < 18) with anxiety and depressive disorders. In addition, a recent scoping review by Courtney et al. (2022) included 33 RCTs on biological, psychosocial, or combined treatment for depressed adolescents (13- to 17-year-olds), including 53 variables tested as baseline predictors and 41 as moderators.

The most studied predictors or moderators of treatment outcome are patient sociodemographic characteristics (e.g., age, gender, ethnicity, education; Nilsen et al., 2013). Systematic reviews indicate that sociodemographic variables are usually not significantly associated with treatment outcome (Ginsburg et al., 2008; Kunas et al., 2021; Nilsen et al., 2013), but the results are not consistent. Kunas et al. (2021) reported mixed evidence on age and gender and concluded that these were not significant predictors of CBT outcome for depressed and anxious youth, in line with previous studies on other disorder groups and other types of psychological treatments (e.g., Emslie et al., 2011; Ginsburg et al., 2008; Hinshaw, 2007; Nilsen et al., 2013). However, Courtney et al. (2022) reported that older age was a significant predictor of better treatment outcome for depression. Kunas et al. (2021) also reported that the evidence for patients’ ethnicity, education, socioeconomic status (SES) or family-related variables (e.g., living with only one parent) is not consistent, so they were not considered significant predictors of treatment outcome among depressed or anxious youth. Courtney et al. (2022) found similar results for depression, as did the meta-analysis by Weisz et al. (2017) across disorder categories. However, many of the studies included in Weisz et al. (2017) did not report patients’ ethnicity, so minorities were underrepresented in the meta-analysis. Nilsen et al. (2013) found CBT to be equally effective for youth with anxiety disorders across ethnic groups, but two out of three depression studies showed that it would be important to adapt depression treatments to better match the needs of ethnic minorities.
Also, a large depression study on adolescents (Treatment for SSRI Resistant Depression in Adolescents, TORDIA) has reported that white adolescents and adolescents with higher SES are more likely to benefit from CBT (Brent et al., 2008). As the possible sociodemographic predictors to be tested in an individual study are countless, there are many sociodemographic variables that have not been tested and reported repeatedly. Thus, a variety of sociodemographic variables have yielded few results in terms of understanding what will work for whom across various circumstances (e.g., one treatment modality, one type of mental disorder).

Current Study

So far there are no systematic reviews on predictor and moderator studies conducted specifically on adolescents and young people across various disorder groups and treatment modalities. Previous reviews have mostly focused on specific disorders or treatment modalities, covered different age ranges, or included only randomized controlled trials, and they disagree as to whether sociodemographic variables are significant predictors of treatment outcome. It is important to investigate whether some predictors may cut across disorders and treatment modalities whereas others are treatment and/or disorder specific. The aim of this systematic review is to present an overview of the existing evidence for the predictive and moderating role of sociodemographic variables on the outcome of psychotherapeutic interventions for adolescents and young people with mental disorders across treatment modalities. The aim is to provide an extensive review of the knowledge that has been gained from studies so far. The present study focuses specifically on youth, i.e., the transitional stage from childhood to adulthood, which is a critical time for the onset of mental disorders.

Methods

Search Strategy and Study Selection Criteria

Building on the European “Roadmap for Mental Health Research in Europe” (ROAMER; Wykes et al., 2015), the COST Action on European Network on Individualized Psychotherapy Treatment of Young People with Mental Disorders (TREATme; www.cost.eu/actions/CA16102/) proposed to investigate age-specific predictors and moderators of outcome for adolescents and young people. The age group 12–30 years was chosen because studies concerning psychotherapeutic treatment only for children and adolescents or only for adults mask the common feature of the transitional phase from childhood to more autonomous adult life (Merikangas et al., 2022). The age range for adolescents and young people includes both adolescence (13–17 years) and emerging adulthood (18–29 years; Arnett, 2014). In the present study, the age range was extended to 12–30 years to avoid excluding too many relevant studies. This part of the study focuses on sociodemographic predictors of treatment outcome across different disorder groups. Other types of predictors (e.g., clinical, family-related and psychological predictors) are investigated in separate parts of the study. The review was conducted by researchers involved in TREATme, funded by the European Cooperation in Science and Technology (COST). The study protocol was registered in PROSPERO (CRD42020166756) and is published (Vousoura et al., 2021).

The research questions and search strings for the systematic database search were formed following the PICOS (Population, Intervention, Comparison, Outcome, Study design) strategy (Higgins et al., 2021), and the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines (Moher et al., 2009) were followed in reporting the systematic literature search process. The searches were conducted in the databases of PubMed and PsycINFO for all articles published until April 22, 2021, combining search terms for (1) psychotherapeutic interventions; (2) mental disorders; (3) age range; and (4) study type. The included studies had to be clinical outcome studies with at least one treatment condition being a psychotherapeutic intervention of any treatment modality, targeted at adolescents and young people (aged 12–30 years) with specified mental disorders, and published in peer-reviewed journals. To avoid preconceptions about possible predictors to be found, the search was not narrowed by search terms concerning predictors described in the previous literature.

The systematic database searches were conducted by one researcher (VG) separately in PubMed and PsycINFO for each of the included diagnostic groups: (a) anxiety, obsessive-compulsive and trauma-related disorders; (b) depressive and bipolar disorders; (c) psychotic disorders; (d) eating disorders; (e) personality disorders; (f) substance-related disorders; (g) autism; (h) attention deficit/hyperactivity; and (i) conduct disorders. Two researchers (EV, SP) independently replicated the searches to cross-check the results. The study design, search strings and inclusion criteria are described in more detail in Vousoura et al. (2021).

Screening Procedure

The results from the two databases were combined for each diagnostic group when the results were imported to reference manager software (Mendeley), and duplicates were removed. Two independent researchers started a four-step screening process for each of the disorder groups based on the inclusion/exclusion criteria specified by two researchers (BT, VG). For the groups with the most results, several independent rating pairs were formed. In the first step, the titles were screened to identify whether the study included patients with the specified mental disorder and a psychotherapeutic intervention as treatment. In the second step, the abstracts were screened to assess also whether the participants were aged 12 to 30, the study was an outcome study, and it was published in a peer-reviewed journal where at least the title and the abstract were available in English. In the third step, full texts were screened to see whether the patients were also diagnosed with the specified disorder or had a high level of symptoms on at least one relevant self-report measure (above the defined cut-off point) and whether the study had at least two assessment points (pre- and post-treatment). If the full text was not available, the corresponding author was contacted. In case of no response within two weeks, the study was counted as missing. If all the criteria mentioned above were fulfilled, the study was included in the final, fourth step, where the study was screened for predictors and moderators of treatment outcome. In each step, two independent reviewers rated the study to decide whether it should be included for the next step. If there was not enough information to decide whether the study should be included or excluded, the article was included for the next step. If there were disagreements on an article after each independent reviewer had made a decision, the pair discussed the article until consensus was reached on whether the study would be included for the next stage of screening.

Data Extraction

Each rater pair summarized and extracted the following data from the studies that were agreed to be included: (1) identification of the study (authors; article title; publication year; country; original trial), (2) methodological characteristics (sample size; participants’ age; randomization; type of treatment and controls; number of sessions; outcome measures) and (3) information on predictor/moderator (type of predictor/moderator; significance of prediction). Each predictor or moderator’s relation to different outcome measures was reported separately. Disagreements were resolved by a third researcher.

A narrative synthesis was made of the different predictors and moderators found, and the studies were classified by the type of predictors or moderators they assessed (VG, TP, AS, BM, NC). The classification was based on Knopp et al. (2013), and the categories were defined as (1) sociodemographic; (2) clinical; (3) psychological; (4) treatment-related or (5) other predictors or moderators of treatment outcome. Each study could be classified under several categories based on the different variables assessed for prediction. For this systematic review, the studies that reported sociodemographic predictors or moderators related to treatment outcome were selected.

Quality Assessment

The reports from all rater pairs from the first, second, and third steps of the screening process were examined for a quality check (EV, CJ, SP, ST). Inconsistencies or missing data were corrected in a dialogue with the rater pairs, in certain cases resulting in a partial or full re-coding of the data within a given diagnostic group. Based on this evaluation, and during the data extraction process, some additional studies were agreed to be included and some excluded based on cross-checking (SP, TP). The final number from each step is presented in Fig. 1.

Fig. 1 PRISMA Flow Chart of Study Selection Process

The methodological quality of the included studies was evaluated by the Mixed Methods Appraisal Tool (MMAT) for qualitative, quantitative, and mixed methods (Hong et al., 2018) (BM, AK). Included trials had a quantitative (randomized and non-randomized) design and were rated according to the relevant criteria. For RCTs, evaluation criteria involved: randomization process, comparisons of groups at baseline, completion of the outcome data, blinding of the assessors to the provided intervention and adherence of the participants to the assigned intervention. For the non-randomized trials, the evaluation criteria assessed if: participants were representative of the target population, measurements were appropriate for the outcome and intervention, outcome data was complete, confounders were considered, and intervention was administered as intended. In line with the recommendation of MMAT (Hong et al., 2018), outcome data for both randomized and non-randomized trials were considered complete if the dropout rate was a maximum of 20% at post-treatment. For every included study, each criterion was rated as “yes”, “unclear” or “no”.
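The completeness criterion described above (dropout of at most 20% at post-treatment) can be sketched as a small helper. This is an illustrative sketch only: the function name, its parameters, and the choice to return "unclear" when the post-treatment sample size is not reported are assumptions, not part of the MMAT or of the rating procedure used in this review.

```python
from typing import Optional


def rate_outcome_completeness(
    n_baseline: int,
    n_post: Optional[int],
    max_dropout_pct: int = 20,  # threshold stated in the text
) -> str:
    """Return a "yes" / "unclear" / "no" rating for complete outcome data.

    Hypothetical helper: mapping an unreported post-treatment sample
    size to "unclear" is an assumption made for this sketch.
    """
    if n_baseline <= 0:
        raise ValueError("n_baseline must be positive")
    if n_post is None:
        return "unclear"
    if not 0 <= n_post <= n_baseline:
        raise ValueError("n_post must lie between 0 and n_baseline")
    dropped = n_baseline - n_post
    # Integer comparison avoids floating-point rounding at the threshold:
    # dropped / n_baseline <= max_dropout_pct / 100
    if dropped * 100 <= max_dropout_pct * n_baseline:
        return "yes"
    return "no"
```

For example, a trial with 100 participants at baseline and 80 at post-treatment sits exactly at the 20% threshold and would be rated "yes" under this sketch.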

Results

Results of the Screening Process

The systematic database search identified a total of 17,359 articles after duplicates were removed. Altogether 5,545 articles were related to mood disorders (depressive and bipolar disorders); 3,326 to anxiety disorders (anxiety, obsessive-compulsive and trauma-related disorders); 2,612 to substance use disorders (SUD); 1,565 to conduct disorders; 1,450 to psychotic disorders; 1,125 to eating disorders; 697 to personality disorders; 605 to attention deficit/hyperactivity disorder (ADHD); and 464 to autism spectrum disorders (ASD). As some studies included patients from more than one diagnostic group, there is some overlap between diagnostic groups. Of all the articles, 13,247 were excluded based on screening the title and abstract, 4,112 full texts were assessed for eligibility, and 292 articles were identified as predictor or moderator studies.
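The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative, and reading the surplus in the per-group sum as articles counted in more than one diagnostic group is an interpretation based on the overlap mentioned above.

```python
# Counts as reported in the text (variable names are illustrative).
unique_articles = 17_359
excluded_title_abstract = 13_247
full_texts_assessed = 4_112

per_group = {
    "mood": 5_545,
    "anxiety": 3_326,
    "substance use": 2_612,
    "conduct": 1_565,
    "psychotic": 1_450,
    "eating": 1_125,
    "personality": 697,
    "ADHD": 605,
    "ASD": 464,
}

# Articles left after title/abstract screening should equal the
# number of full texts assessed for eligibility.
remaining_after_screening = unique_articles - excluded_title_abstract

# The per-group counts sum to slightly more than the unique total,
# which is consistent with some articles belonging to several groups.
group_sum = sum(per_group.values())
surplus_from_overlap = group_sum - unique_articles

print(remaining_after_screening == full_texts_assessed)
print(group_sum, surplus_from_overlap)
```

The per-group counts exceed the deduplicated total by 30, a small surplus in keeping with the overlap between diagnostic groups noted in the text.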

Of these 292 articles, 114 included sociodemographic predictors or moderators according to the classification criteria and were included in this study. The mean age of patients in the included studies ranged from 13.6 to 23.0 years, with only 14.8% including participants over the age of 20. The heterogeneity of the included studies, with diverse study quality and methods, precluded a quantitative estimation of the effects of the predictors/moderators on treatment outcomes. The PRISMA flow diagram showing the number of studies included in each screening step is presented in Fig. 1.

Results Regarding Predictors and Moderators of Treatment Outcome

An overview of the obtained results is synthesized in Table 1. The number of studies displaying significant, mixed, or non-significant results was identified for every sociodemographic variable analyzed across disorders and intervention modalities. A result was defined as significant if the article reported the predictor/moderator to be statistically significant (p < .05). A mixed result was defined as a predictor/moderator study where the result was statistically significant for some of the outcome measures used in the study, but non-significant for other outcome measures. A result was defined as non-significant if the article reported it as such, or if the results section of the article reported only the significant predictors/moderators, in which case the remaining variables studied were assumed to be statistically non-significant.
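The classification rule described above can be expressed as a short function. This is a minimal sketch under stated assumptions: the function name, the representation of a study's results as a list of p-values (one per outcome measure), and the use of `None` for an unreported result are all hypothetical and serve only to illustrate the rule.

```python
from typing import Optional, Sequence

ALPHA = 0.05  # significance threshold stated in the text


def classify_result(p_values: Sequence[Optional[float]]) -> str:
    """Classify one predictor/moderator in one study.

    p_values holds one entry per outcome measure. None stands for a
    result the article did not report, which, following the rule in
    the text, is assumed to be non-significant.
    """
    significant = [p is not None and p < ALPHA for p in p_values]
    if significant and all(significant):
        return "significant"
    if any(significant):
        return "mixed"
    return "non-significant"


print(classify_result([0.01, 0.03]))   # significant on every outcome
print(classify_result([0.01, 0.20]))   # significant on some outcomes only
print(classify_result([None, 0.40]))   # unreported treated as non-significant
```

Note that under this rule an unreported result alongside a significant one yields "mixed", which is one reason the assumption about unreported variables matters for the counts in Table 1.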

Table 1 Number of Studies With Sociodemographic Variables (Significant / Overall) Predicting or Moderating Treatment Outcomes Across Disorders and Intervention Modalities for Adolescents and Young Adults

This systematic review focuses on reporting the details and results from each individual study containing significant or mixed results – yet also presenting the broader picture in terms of providing the full number of studies identified for each predictor/moderator variable, and later discussing the relevance of the findings. A full reference list including all 114 identified articles meeting the inclusion criteria is presented in Online Resource 1, and detailed information on the included articles is found in Online Resource 2.

Age

Overall, 79 studies investigated the effects of age as a potential predictor/moderator of treatment (see Online Resource 2). Of these, 17 studies identified age as a significant predictor or moderator of outcome in at least one of the analyses conducted, while 62 studies reported no significant effects of age. There were significant results in four out of 18 studies of mood disorders. For eating disorders, significant results were found in four out of 19 studies, and for SUD, in seven out of 25 studies. Furthermore, in one out of two studies of personality disorders, and in one out of two studies for autism spectrum disorders significant results were found. Age was not found to be a significant predictor or moderator of treatment outcome in four studies on anxiety disorders, in four studies on ADHD, in one study of psychosis and in four transdiagnostic studies.

Mood disorders

Among the studies with significant results (four out of 18 studies), an RCT (Mufson et al., 2004) comparing interpersonal therapy for depressed adolescents (IPT-A) with treatment as usual (TAU) found differential treatment effects by age (12–14 vs. 15–18 years). In the older group, IPT-A was more effective than TAU, whereas the difference between treatments was non-significant in the younger group. One RCT (Curry et al., 2006) compared the effects of several interventions for adolescents with major depressive disorder (MDD): fluoxetine in combination with cognitive-behavioral therapy (CBT), fluoxetine alone, CBT alone, or clinical management with pill placebo. The study found that adolescents who were younger than 16 at baseline improved more on clinician-rated symptoms than older adolescents, regardless of treatment condition. However, age was not found to be a moderator of treatment outcome.

Another RCT (Davey et al., 2019) compared CBT in combination with either fluoxetine or placebo for patients with MDD aged 15–25 years. The study found a marginal relationship between younger age and greater improvement in interviewer-rated severity of depression and an association between younger age and higher rates of remission from depression. In addition, the study found a marginal interaction effect between treatment and age on severity of depression, indicating that younger patients may do better in CBT plus placebo treatment whereas older patients do better when treated with CBT in combination with fluoxetine. Furthermore, they found an interaction between treatment and age with regard to remission from depression, where younger patients had higher remission rates when treated with CBT plus placebo, while older patients did better in CBT combined with fluoxetine. Stasiak et al. (2014) compared computerized CBT (cCBT) with a computerized placebo condition in treating adolescents (13 to 18 years) with symptoms of depression, finding that the adolescents improved significantly more in problem-solving capacities when treated with cCBT compared to placebo, but in this study the effect of cCBT was larger for the older adolescents. It must be noted that the sample size of this study was small, and that six other moderator and seven predictor analyses yielded non-significant findings.

Eating disorders

Among the studies with significant results for eating disorders (four out of 19 studies), two RCTs on Family-Based Treatment (FBT) for adolescents with anorexia nervosa (Agras et al., 2014; Ciao et al., 2015) analyzed predictors and moderators. The first study found age to be a predictor of change, namely younger age (range 12–18) predicted weight gain at 12-month follow-up, but most participants did not achieve remission at end of treatment (Agras et al., 2014). In contrast, Ciao et al. (2015) found older age (range 12–19) to predict faster change in weight gain and higher overall self-esteem at end of treatment. The study also found age to be a moderator, as an interaction effect between age and type of treatment was found. Younger adolescents in Supportive Psychotherapy (SPT) for anorexia had the slowest rate of improvement in Eating Concerns compared to older adolescents, whereas the rates of change were equivalent across age span in FBT.

In addition, one non-controlled open trial on acceptance-based separated family treatment (ASFT) for adolescents with anorexia between the ages of 12 and 18 examined predictors of change (Timko et al., 2015). The authors found older age to predict greater change in scores of maternally observed anorexic behavior, but not in paternally observed anorexic behaviors or in global eating disorder symptomatology (EDE-Global). Considering the relatively small sample size (n = 47) and inconsistency in findings across parent- and child-reported symptoms, these findings should be interpreted with caution. Finally, a non-controlled trial of a day hospital program for adolescents (13–18 years) with anorexia or bulimia nervosa, which included group therapy to improve self-esteem and social skills, studied age as a possible predictor of change (Lazaro et al., 2011). The study found that age was unrelated to change in patients with bulimia but did correlate positively with improvement in certain self-esteem factors (behavior adjustment, happiness and satisfaction and self-concept related to weight and shape).

Substance use disorders

Among the seven studies with significant results (out of 25 studies), three studies examined age as a predictor and moderator in psychotherapy for substance abuse disorders. In an RCT, Slesnick and Prestopnik (2009) compared Ecologically Based Family Therapy (EBFT) and Functional Family Therapy (FFT) with TAU for runaway adolescents with substance abuse problems. The study found that age moderated the effect of treatment on days of alcohol use, internalizing problems, and score on the Beck Depression Inventory (BDI). While days of alcohol use decreased for both older (16–17 years) and younger (12–15 years) adolescents in EBFT, in the FFT condition alcohol use was only reduced in older youth, while neither older nor younger adolescents in TAU had reduced alcohol use. With regard to internalizing problems and depression, age also had a moderating effect on the relationship between treatment modality and outcome. For the younger adolescents, internalizing problems and depression improved in both EBFT and FFT, but not in TAU. Hendriks et al. (2012) compared Multi-Dimensional Family Therapy (MDFT) with CBT in an RCT of adolescents with cannabis use disorder (13–18 years). The authors found a moderator effect: older age was related to better outcome in CBT, while younger age was associated with better outcome in MDFT, with regard to both number of days with cannabis use and number of smoked joints. Lastly, Kaminer et al. (2002) examined age in an RCT on group CBT versus psychoeducational group therapy (PET) for substance-abusing adolescents (13–18 years). The authors conducted predictor and moderator analyses on eight outcome measures and found three significant findings out of 16. In the predictor analyses, older youth across treatment modalities were more likely to test positive on a urinalysis during the treatment period.
In the moderator analyses, the authors found that youth younger than 16 who received PET were more likely to exhibit a positive urinalysis compared to youth in CBT. Considering the school status of adolescents as outcome, the authors found differing moderator effects for older and younger youth. CBT patients below 16 years of age showed no improvement in school status, while younger patients in PET significantly improved. For youth aged 16 or older, however, CBT patients improved marginally while the school status of youth in PET worsened substantially.

Four studies examined age as a predictor only. Davis et al. (2018) compared a therapist-administered brief intervention (BI) with a computerized BI in an RCT for youth (14–20 years) screening positive on the Alcohol Use Disorders Identification Test (AUDIT). Treatment responders were significantly younger than non-responders across treatments. In an RCT on individual versus family psychotherapy, Guo et al. (2016) compared Motivational Enhancement Therapy (MET) and a Community Reinforcement Approach (CRA) to EBFT. Older adolescents (range 12–17) reported a greater increase in family cohesion than younger adolescents during the 18-month post-treatment time period, but there was no significant effect of age on family conflict. Similarly, Zhang and Slesnick (2018) examined CRA, MET and Case Management (CM) in an RCT on substance abuse amongst homeless youth (14–20 years). Younger youth were more likely to belong in the low-and-increasing substance use and high-and-stable social stability growth class. In other words, homeless youth of younger age were more prone to increased substance use during treatment. Finally, Burrow-Sanchez et al. (2015) compared culturally accommodated group CBT (A-CBT) to standard group CBT in an RCT with adolescent Latinos (13–18 years) abusing drugs and alcohol. Age was found to predict number of days of drug use, with higher age in years predicting an increase in substance use days. Age was, however, only included as a covariate in the study, and the authors did not address the interpretation of the predictor.

Personality disorders

The only study (out of two) finding significant results for personality disorders was an observational study of short-term group CBT for young adults (18–29 years) with personality disorders and personality disorder features. Renner et al. (2013) investigated the relationship between age and symptomatic distress, schema modes, coping response, and Early Maladaptive Schemas (EMS). Decrease in EMS over time was stronger in younger patients. The sample size of the study was small (n = 26).

Autism spectrum disorder (ASD)

One out of two RCTs testing the efficacy of a communicative skills training program (Tackling Teenage Training or TTT) for adolescents (12–18 years) with ASD found age to be a significant predictor (Visser et al., 2017). Younger age was found to moderate three out of eight outcomes, as younger adolescents had higher increase in psychosexual knowledge (parent- and child-rated) as well as improvements in their social functioning following the TTT program compared to older adolescents.

Gender

Sixty-nine articles investigated the effects of gender as a potential predictor and/or moderator of treatment outcome (see Online Resource 2). Of those, 56 articles reported no significant effects of gender while 13 articles reported significant gender effects on at least one outcome measure. More specifically, four identified gender as a significant predictor or moderator of outcome and nine studies yielded mixed results. For mood disorders significant or mixed results were found in four out of 19 studies and for eating disorders in two out of seven studies. Further, for SUD significant results were found in six out of 27 studies. Additionally, significant results were found in one out of five transdiagnostic studies. Three studies in anxiety disorders, two studies in ASD, four studies in ADHD and two studies in psychosis found no significant effects of gender on treatment outcome. There were no studies identified that explored gender as a predictor or moderator of outcome in youth diagnosed with personality disorders.

Mood disorders

Gender emerged as a significant moderator of outcome in the treatment of mood disorders in three studies (out of 19 studies). Two RCTs reported interactions between gender, treatment modality and another baseline variable. Betancourt et al. (2012) compared the effects of interpersonal group therapy (IPT-G), creative play/recreation (CP) and waitlist control on depression symptoms in adolescents. A significant interaction between gender, abduction history and treatment modality was found. More specifically, non-abducted females benefited more from IPT-G than from CP or waitlist, whereas no gender differences in treatment response were observed for participants with abduction history. Similarly, Amaya et al. (2011) conducted an RCT where they compared fluoxetine, CBT and combined fluoxetine + CBT (COMB) in the treatment of depression. The relationships between gender, marital discord and treatment modality in predicting treatment response (defined by global clinical improvement) were examined. Predictor analysis showed no main effect of gender on treatment response. However, interaction analyses revealed that females from families characterized by higher marital discord demonstrated higher response to active treatment in general, including higher response to CBT. In contrast, in females from families with low marital discord, CBT alone yielded a poorer response. Conversely, for higher marital discord males, but not females, COMB resulted in better response to treatment than CBT alone.

Another RCT in adolescents with depression (Charkhandeh et al., 2016) compared the effects of CBT, an alternative medicine intervention (“Reiki”) and waitlist control on depression scores, as measured by the Child Depression Inventory (CDI). Male participants showed a smaller treatment effect for the Reiki intervention compared to their female counterparts. Importantly, the moderator effect was found for the total score of CDI, but not for its individual subscales, thereby demonstrating mixed results regarding the moderating effect of gender. One additional RCT documented mixed findings regarding gender as a predictor of outcome, depending on the outcome measure used. Stasiak et al. (2014) compared the effects of cCBT and Psychoeducation on depression symptoms, quality of life and coping strategies in a small sample (n = 34) of adolescents with MDD. Males improved more than females in terms of depression symptoms, regardless of treatment group. Females, on the other hand, showed more improvement on a subscale assessing positive coping strategies.

Eating disorders

Gender emerged as a significant predictor of outcome in two RCTs (out of seven studies). Le Grange et al. (2015) compared the effects of FBT, CBT-A and Supportive psychotherapy (SPT) in reducing binge eating and purging episodes in adolescents with bulimia nervosa. Gender was identified as a predictor of outcome: being male was associated with more favorable outcome regardless of treatment type. Further examination of gender as a potential moderator of outcome yielded no significant results. The same research group reported an RCT comparing FBT and Adolescent focused individual therapy (AFT) for youth with anorexia nervosa (Le Grange et al., 2014). The authors analyzed only the proportion of the sample that showed significant weight improvement. Among those, between-group analyses compared the participants who showed a timely weight restoration to those who showed a slow pattern of weight improvement. Results showed that adolescents with timely weight restoration were more likely to be male. However, in both studies, the samples were predominantly female (> 90%), which limits the interpretation of these findings.

Substance use disorders

Six studies out of 27 found significant or mixed results for SUD. Two studies reported mixed results regarding gender as a predictor of treatment outcome in SUD. Slesnick et al. (2006) explored the efficacy of a family therapy program for SUD and compared it with a control group. In this RCT, drug use was significantly reduced regardless of gender, treatment, or being a primary alcohol or drug user. However, primary drug-using males showed an increase in alcohol use after treatment. Brown et al. (2015) reported an RCT of the effects of Motivational Interviewing (MI) versus TAU in hospitalized adolescents with comorbid psychiatric disorders and SUD. Compared to females, males reported greater reductions in thought problems after the intervention, as assessed by the Youth Self Report (YSR) scale. However, gender was not predictive of any of the primary outcomes focusing on substance use or other psychiatric symptoms. Of note, the MI intervention consisted of only two 45-min sessions. Another RCT showed that gender can also act as a moderator of treatment outcome in adolescents with alcohol use disorder. Slesnick and Prestopnik (2009) investigated the interaction between gender and the effects of two family therapies (EBFT and FFT) and TAU. While EBFT reduced alcohol and drug use in both males and females, FFT had those effects only for males. Moreover, worse outcome was reported for males in TAU. Based on these results, the authors suggested that male adolescents could particularly benefit from family therapy.

Finally, three reports of RCTs by Kaminer and colleagues tested the effect of gender as a predictor and/or moderator of outcome and also provided mixed findings. In a pilot RCT, Kaminer and Burleson (1998) compared the effects of CBT and Interactional Group (IG) treatment in a small group (n = 32) of adolescents with psychoactive SUD. Girls showed a greater increase in psychiatric problems from baseline to post-treatment. There were no gender effects on substance use indices or other self-reported difficulties, nor interactions between gender and treatment group. In a larger trial, Kaminer et al. (2002) compared the effects of CBT and PET. Results showed that gender did not predict or moderate treatment effects on urinalysis. However, gender moderated treatment outcome on self-reported symptoms (Teen Addiction Severity Index; TASI). On the substance use subscale, male CBT subjects showed the most improvement, whereas the male PET subjects showed no significant improvement. On the other hand, female subjects showed improvement regardless of treatment group. Similarly, male CBT subjects improved on the school subscale, whereas the male PET subjects did not change. The female PET subjects, conversely, improved, while the female CBT subjects became worse. Finally, on the family subscale, male CBT subjects improved significantly, and the male PET subjects did not change. In contrast, the female PET subjects improved, while the female CBT subjects did not change. There were no further interactions with gender for the other TASI subscales, including peer, legal or psychological problems. Taken together, these results suggest that CBT was somewhat more beneficial for male subjects, compared to females. The third study (Kaminer et al., 2008) specifically explored the effects of an aftercare program (comprising functional analysis and relapse prevention sessions) on alcohol and marijuana use. CBT treatment completers were randomly assigned to three aftercare conditions: in-person, brief telephone, or a non-active condition. Gender was shown to be a predictor and moderator in terms of alcohol use, but not marijuana use outcomes. First, compared to males, female participants, while still showing decreases in abstinence, showed smaller decreases. Moreover, the active aftercare treatments were differentially more effective for girls. Among boys, for example, both conditions showed similar decreases in abstinence. In contrast, the girls significantly decreased in abstinence in the non-active condition, but not in the active conditions.

Transdiagnostic studies

Cornelius et al. (2010) examined gender effects in a double-blind fluoxetine trial of adolescents with comorbid SUD and depression receiving CBT and MET. Females showed a greater improvement on BDI and on DSM criteria of cannabis abuse after treatment. Nevertheless, gender was not predictive of outcome on several other depression or substance use outcomes.

Ethnicity

Studies investigating race or ethnic background as a predictor/moderator of treatment outcome were combined as one predictor/moderator group, ethnicity, in the systematic review, as most studies did not differentiate definitions of race relative to ethnicity. In total, 45 studies assessed the effect of ethnicity on outcome (see Online Resource 2), while one article examined country of birth as a predictor of treatment response. Eight articles identified ethnicity as a significant predictor or moderator of outcome. Four articles reported a mixed result, and 34 found a non-significant result in predictor/moderator analyses of ethnicity and country of birth. Five out of 15 studies on mood disorders and three out of 21 studies on SUD reported that ethnicity was a predictor/moderator of treatment outcome. One study on mood disorders, two on SUD and one transdiagnostic study reported a mixed result of ethnicity being a predictor or moderator of outcome. Six studies on eating disorders, and one study each on anxiety, ASD and ADHD, found no significant moderation by ethnicity of treatment outcome. There were no studies on ethnicity as a moderator of treatment outcome for psychosis or personality disorders. However, adolescents with borderline personality traits were included in a transdiagnostic study.

Mood disorders

Of the six studies (out of 15) that found significant results on at least one outcome measure, one RCT on CBT for depressed adolescents (Rohde et al., 2006) reported that White participants had a greater reduction in symptoms than non-White participants. Similarly, Brent et al. (2009) conducted a comparative study assessing treatment outcome differences between medication, CBT or the combination of medication and CBT in a group of adolescents with depression. Analyses revealed that White race, amongst others, was a predictor of earlier time to onset of a suicidal event. Another RCT on family-focused therapy in adolescents with bipolar disorder found that race was a predictor of treatment outcome (Weintraub et al., 2020). In particular, the authors reported that non-White adolescents were more likely to have a slower symptom recovery compared to Hispanic adolescents who improved within the first six months or over the next year of family-focused therapy. Weersing et al. (2006) investigated the effects of a CBT-based intervention in a non-randomized clinical trial using a small sample size. The authors reported that ethnic minority youth had a slower improvement of depression symptoms after CBT compared to non-minority youth. Ethnicity remained a predictor even when entered in a hierarchical regression model with other variables. In contrast, Pan et al. (2019) conducted an RCT with depressed individuals comparing directive treatment and non-directive treatments. Ethnicity was found to be a moderator of treatment outcome. Specifically, the study reported that African Americans had greater symptom reductions when enrolled in directive treatment; however, European Americans in the Cultural Values Interview showed a similar pattern.

One additional RCT found mixed results regarding ethnicity as a moderator of treatment outcome. Specifically, Ngo et al. (2009) compared the effects of a culturally adapted quality improvement intervention and treatment as usual on depression symptoms and quality of life in a sample of adolescents with major depression or dysthymia. Black youth assigned to the quality improvement intervention experienced a larger reduction in depression symptoms compared to Latinx and White youth. However, no significant interaction was found between intervention and ethnicity in the assessment of quality of life.

Substance use disorders

Of the five studies (out of 21) reporting significant results for SUD, several were RCTs. Slesnick et al. (2013) compared three treatment modalities: MI, Community Reinforcement learning (CR) and EBFT, among substance-abusing adolescents. Minority youth displayed reductions in substance use that were comparable with those displayed by their White counterparts. However, it was noted that minority youth relapsed faster than White youth. In contrast, Zhang and Slesnick (2018) examined end-of-treatment trajectories of social stability and substance use between CR, MET and case management in a sample of substance-using youth. African American individuals had worse substance use and social stability outcomes compared to Whites and individuals from other ethnic groups. A third RCT showed that Hispanic adolescents participating in Structural Ecosystem Therapy (SET) had a greater reduction in drug use compared to African American adolescents (Robbins, 2008).

Horigian et al. (2013) found mixed results of ethnicity as a predictor of treatment outcome. In particular, the authors compared Brief Strategic Family Therapy (BSFT) and TAU on depression, anxiety and externalizing symptoms in adolescents with substance abuse. Ethnicity was found to be a predictor only for child-reported depression, child-reported anxiety and parent-reported depression. The authors reported that African American adolescents tended to have lower depressive symptoms and showed a smaller decrease during treatment compared to Hispanic and White adolescents. Mixed results of ethnicity as a predictor of treatment outcome were also found in the RCT by Hops et al. (2011), where Anglo adolescents had a smaller decrease in specific HIV risk behaviors, including oral sex and close friends that had sex, compared to Hispanic adolescents. Moreover, the authors reported that Hispanic adolescents with both high and low risk for HIV showed an increase in specific HIV-risk related behaviors during post-treatment, compared to Anglo adolescents who remained relatively stable.

Transdiagnostic studies

One RCT explored the efficacy of Dialectical Behavioral Therapy (DBT) and individual or group support therapy in a group of adolescents at high risk of suicide with a variety of diagnoses combined with borderline personality traits (Adrian et al., 2019). Treatment decreased the rate of suicide attempts significantly more for Latino/Hispanic compared to non-Latino individuals. However, none of these effects were found for suicidal ideation, self-harm, and non-suicidal self-injury. Moreover, White adolescents tended to show a smaller reduction in suicidal ideation at the end of treatment compared to non-White adolescents. Similarly, White adolescents had a higher score of suicidal ideation compared to their non-White counterparts at post-treatment.

Education

Eleven studies investigated the effects of education as a potential predictor of treatment outcome (see Online Resource 2). Of these, four found full or mixed support for their hypothesis, and all of them were on SUD. The other seven studies (two on SUD, two on psychosis, and one each on mood disorders (depression), eating disorders and ADHD) found non-significant results.

Substance use disorders

In one of the four significant studies (out of six), an RCT by Ögel and Coskun (2011) compared CBT and an educational program for substance use among 62 Turkish teenagers. The authors found that the level of education was a predictor of abstinence, as the chance of abstinence increased proportionally with levels of education (Ögel & Coskun, 2011). In another RCT, Davis et al. (2018) compared a therapist-administered and a computerized brief intervention (U-Connect) with enhanced usual care on a sample of 475 risky drinking adolescents. The authors found significantly more treatment responders among participants enrolled in a high school than among those enrolled in a technical school/college. Wang et al. (2016) compared MET and combined parenting skill training (PST) + MET with standard supervision by the court in a sample of Taiwanese teenagers with SUD. Attending school was found to be a predictor of outcome, but not of the relapse rate.

Finally, school attendance and school performance were tested as predictors by Battjes et al. (2004). After 20 sessions of Group-Based Treatment for adolescents with SUD, students who attended school and did not have poor grades at baseline managed to significantly reduce the number of marijuana days at both six and 12 months. On the other hand, students who did not attend school or had poor grades managed to reduce their marijuana days, but the reduction was less notable and more fluctuant (i.e., marijuana use decreased from baseline to six months but increased from six to 12 months, such that use at 12 months had reverted to pre-treatment levels). However, the same study found that school status was a non-significant predictor for two other outcomes: days of alcohol use to intoxication and days of criminal activity.

Socioeconomic status (SES)

In total 18 studies (see Online Resource 2) investigated the effects of SES as a potential predictor/moderator of treatment outcome. Three studies found SES to be a predictor or moderator of treatment outcome and one study showed mixed results.

Mood disorders

One of the two studies (out of nine) finding significant results for mood disorders, a large (n = 443) four-arm RCT by Curry et al. (2006), compared the effects of CBT, fluoxetine, CBT combined with fluoxetine, and a clinical management placebo condition for teenagers with MDD. The authors found that family income moderated the effects of treatment: for adolescents residing in families with low and middle levels of income (<$75,000/year), fluoxetine and CBT combined with fluoxetine were equally effective, and both were more effective than CBT alone or placebo/control, which did not significantly differ from each other. However, at high levels of family income (≥$75,000), the three active treatments were not significantly different from one another and were more effective than placebo. In an open treatment trial of adolescent suicide attempters, Brent et al. (2009) found household income to be a predictor of earlier time to onset of suicide (attempt) as an outcome of specialized psychotherapy for suicide-attempting adolescents and/or medication management. The adolescents belonging to households with higher income attempted, ideated about, or committed suicide earlier than adolescents from households with lower income.

Substance use disorders

In the first of the two studies with significant results for SUD, an RCT comparing therapist-administered and computerized U-Connect Brief intervention with enhanced usual care on a sample of 475 adolescents with risky drinking (Davis et al., 2018), receiving public assistance was a predictor of treatment response: more participants who were receiving public assistance were found among treatment responders. The second study, by Wang et al. (2016), compared MET and combined PST + MET with standard supervision by the court and found that employment of Taiwanese teenagers with SUD at baseline was significantly linked with employment at the end of the study, but not with the relapse rate.

Family constellation

Overall, 21 studies (see Online Resource 2) investigated the effects of family constellation as a potential predictor/moderator of treatment outcome. Of these studies, ten explored the effect of family-based treatments for adolescents with eating disorders (nine on anorexia and one on bulimia). Four studies focused on substance abuse, five on major depression, one on bipolar I and II, and one on schizophrenia. Of these, three studies identified family constellation as a significant predictor or moderator: two were on adolescents with anorexia and one on adolescents with bipolar I or II disorders.

Mood disorders

In the only study (out of six) with significant results, Miklowitz et al. (2014) explored the effects of psychosocial intervention on adolescents with bipolar disorder I or II. In this sample, 34.7% of adolescents lived with both of their biological parents. The authors found that adolescents who lived with both of their biological parents showed a longer time to manic recurrence than those living with one biological parent.

Eating disorders

In the first of the two studies (out of 10) finding significant results, Agras et al. (2014) compared two types of family therapies for adolescents with anorexia. Intact families had a higher rate of remission regardless of the treatment used. In another study, Lock et al. (2005) reported that, as in most other studies on family therapy, most families were intact (78%). In a post hoc exploratory analysis of possible moderators of treatment outcome, those with non-intact families did better on Global EDE in longer treatment; however, there was no difference on the other primary outcome measure, BMI.

Other variables studied as predictors or moderators

History of traumatic events. Overall 10 studies (see Online Resource 2) investigated the effects of traumatic events as a potential predictor or moderator of treatment outcome. Seven of them identified traumatic events as a predictor or moderator of outcome at least for one of the outcome measures assessed. Of these, five studies were on depressive disorder, one on SUD and one on personality disorders. Two studies reported no significant effects on treatment for depressive disorder and one for SUD.

In five RCTs (out of seven studies), adolescents’ history of traumatic events was found to be a predictor of treatment outcome for depression. In one RCT, Betancourt et al. (2012) suggested that abduction history interacted with gender to moderate the effectiveness of IPT-G. More precisely, IPT-G was effective in reducing depressive symptoms for both genders with a history of abduction. However, for participants with no history of abduction, females displayed the greatest treatment effects whereas males showed no significant improvement when compared to control conditions. In another RCT, Shamseddeen et al. (2011) found that physical abuse moderated response to treatment. Specifically, participants with a history of physical abuse had a much lower response rate to combination therapy (SSRI + CBT) than to medication only, suggesting that patients with a history of physical abuse may require specialized treatment approaches. Another RCT (Ammerman et al., 2016) showed evidence of a moderating effect for physical and emotional abuse, so that patients who had experienced physical abuse in childhood had greater improvements in parenting and patients with experiences of emotional abuse improved more in social network size in the In-Home CBT relative to control conditions. In addition to treatment outcomes, emotional and physical neglect were predictors for lower social support, smaller social network size and home environments, which were less stimulating, nurturing and safe. Emotional and physical abuse were also predictors of poor home environments. An additional RCT (Brent et al., 2009) showed that sexual abuse was a predictor of earlier onset of a suicidal event. Furthermore, Barbe et al. (2004) also showed that sexual abuse was a negative predictor of long-term outcome in adolescent depression. Moreover, this study also showed that CBT was more efficacious than Non-directive supportive therapy (NST) in the absence of sexual abuse but was not better than NST in those with a history of sexual abuse.

In one study (out of two) on SUD, Battjes et al. (2004) suggested that after taking part in a Group-Based Treatment adolescents with a history of emotional abuse may have a greater reduction in days of marijuana use compared to their peers without such a history. These results were found at the six-month follow-up, but they seemed to be lost at 12-month follow-up. Schuppert et al. (2012) tested the efficacy of an Emotion Regulation Training (ERT) program for borderline personality disorders. The authors found that history of abuse was associated with less improvement in the severity of borderline personality symptoms and general psychopathology immediately after the treatment. Moreover, those deficits (less improvement in the general psychopathology and in the quality of life) were maintained at follow-up.

Parents’ education. Two out of nine studies (see Online Resource 2) found parental level of education to be a moderator of outcome, including one article on adolescents with anorexia nervosa and one on adolescents with SUD. An RCT by Le Grange et al. (2014) examined FBT and AFT for anorexia nervosa. Outcome was defined as whether or not ≥ 95% of expected body weight was reached early. They found that most parents had attended some higher education. Compared to non-early FBT responders, early FBT responders differed at baseline only in their parents’ education level, such that early FBT responders had parents with fewer years of education. French et al. (2008) compared the cost-effectiveness of four interventions that were examined in a study by Waldron et al. (2001), which included family-based, individual, and group cognitive behavioral approaches, for adolescents with SUD. Outcome was the use of marijuana as reported by the adolescents and delinquency scores as measured with the Delinquent Behavior subscale of the Child Behavior Checklist and YSR at the four- and seven-month follow-up assessments after the initiation of treatment. Their findings indicate a significant effect for years of parent education, suggesting lower levels of delinquent behavior for adolescents of parents with relatively more years of education at four-month follow-up, which was set to coincide with the end of the treatment.

Forensic history. Overall, five articles (see Online Resource 2) investigated the effects of forensic history as a potential predictor/moderator of treatment outcome, all of them in adolescents with SUD. Of these, two articles from one study identified forensic history as a predictor or moderator of outcome at least for one of the measured outcomes, and three reported no significant effects. Using an RCT design, Hendriks et al. (2011, 2012) suggested that self-reported violence or property crimes were associated with a more favorable outcome in terms of number of cannabis use days when using MDFT and a less favorable outcome when using CBT.

Referral to treatment. Eight articles investigated the effects of referral to treatment, defined as the source from where or by whom the youth were referred to treatment, as a potential predictor of treatment outcome. Of those, five reported no significant effect of referral to treatment (SUD, mood, and anxiety disorders). One study reported referral to treatment being a predictor of the outcome for mood disorder, and two articles reported mixed results on at least one outcome measure for SUD.

Brent et al. (1998) reported that major depression at the end of the treatment period was associated with having come into the study from a clinical referral rather than from an advertisement (90.0% versus 56.5%); hence, major depression at the end of treatment was predicted by clinical referral source. Schaub et al. (2014) reported that external coercion had an impact on treatment outcome for SUD. Adolescents who were externally coerced to participate in a cannabis cessation program showed greater improvement on externalizing symptoms. Thus, referral to treatment seems to have a positive impact on some of the secondary outcomes (i.e., externalizing symptoms) but not on the primary outcome (cannabis use). Finally, Tamm et al. (2013) found a significant effect of court-ordered treatment on SUD responders. Participants who were court-mandated to treatment had greater predicted odds of achieving a 50% reduction in substance abuse but had lower predicted odds of completing treatment than those who were not court-mandated to treatment. To sum up, findings show that adolescents who were court-mandated to treatment targeting substance use had greater reductions in days of substance use but lower rates of treatment completion.

Accommodation status. Three studies investigated the effects of the young person’s accommodation status, defined as where or with whom the youth lived, as a potential predictor of treatment outcome. They were all RCTs and focused on brief interventions for substance abuse, alcohol misusers in the USA and volatile substance misusers in Turkey. Two of these studies identified accommodation as a potential predictor. Compared to non-responders, those deemed responders (based on the AUDIT-C) to a Brief 1-session Intervention more often lived with their parents (Davis et al., 2018). Ögel and Coskun (2011) found that homelessness was a negative predictor of abstinence from volatile substances after a 3-session CBT intervention, whereas duration of homelessness did not affect remission.

Parental age. Two studies on treatments for adolescents with anorexia nervosa examined parental age as a predictor or moderator (Le Grange et al., 2015; Martin-Wagar et al., 2019). Neither found parental age to be significant.

Work status. One study (Allott, 2011) investigated the effects of work status as a potential predictor of treatment outcome (positive psychotic symptoms) in youth with psychosis and found it to be non-significant.

School type. One study (Walter et al., 2013) investigated the effects of school type as a potential predictor of a number of treatment outcomes in a transdiagnostic sample of adolescents. School type was not significantly correlated with outcomes either at discharge or at follow-up.

Distance to treatment. One study (Lenhard et al., 2018) investigated the effects of the distance to treatment as a potential predictor of treatment outcome and reported no significant effect of distance to treatment for anxiety disorders.

Social support/network. One study investigated the effects of the social support/network as a potential predictor of treatment outcome. Arterberry et al. (2018) tested the potential impact on treatment response for substance abusers. Two different aspects of the social support/network were taken into account: community involvement and having a mentor. Community involvement was not found to be significant. However, having a mentor was found to have a positive impact on treatment outcome, in that there was an indirect association between having a mentor and being a responder to the SUD treatment program.

Sexual orientation. One study examined sexual orientation as a potential predictor of outcome for three active treatments among homeless youth with substance or alcohol use disorder (Zhang & Slesnick, 2018). Being straight or LGBTQ was not significantly correlated with any of the outcomes defined as trajectories or co-occurring patterns of substance/alcohol abuse.

Psychosocial adversity. One study (Walter et al., 2013) investigated the effects of more adverse psychosocial conditions as a potential predictor of treatment outcome in a transdiagnostic sample of adolescents. This study reported no significant effect.

Summary of the Results

Overall, 114 individual articles reporting a total of 287 sociodemographic predictors/moderators were included in this systematic review. The most studied predictors/moderators were age, gender and ethnicity. Many predictors were investigated in fewer than ten studies, several of them being analyzed in only one study. The most studied disorder groups were clearly SUD, eating disorders and mood disorders (> 50 predictors tested for each). For anxiety disorders, there were only 11 predictors/moderators examined in three individual studies (Ingul et al., 2014; Lenhard et al., 2018; Schneider et al., 2018) and none of them were found significant. For ADHD, five studies (13 predictors/moderators) were identified (Antshel et al., 2012; Barkley et al., 1992; Boyer et al., 2016; Fleming et al., 2015; Vidal et al., 2015), again with no significant results. For transdiagnostic trials, six studies (15 predictors/moderators) with sporadic significant results were found (Adrian et al., 2019; Cornelius et al., 2009, 2010; Gergov et al., 2021; Layne et al., 2003; Walter et al., 2013). For psychosis, autism and personality disorders, only very few (< 10) predictors/moderators were investigated without any of them being significant, except for one study where age was a predictor of treatment outcome for autism (Visser et al., 2017).

Only about 25% of all the studies reported a significant effect for any of the tested predictors. Age, the most frequently researched sociodemographic variable, was reported to be significant in 22% (17/79) of the studies. Gender was a significant predictor in 19% of the studies (13/69), ethnicity in 26% (12/46), education in 36% (4/11), socioeconomic status in 22% (4/18), family constellation in 14% (3/21), and history of traumatic events in 55% (6/11) of the studies (Table 1).
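The percentages above follow directly from the reported counts. As a minimal illustrative sketch (the dictionary and function names are assumptions introduced here, not part of the review's methods; the counts are those stated in the paragraph above), they can be recomputed as follows:

```python
# Counts of (studies with a significant result, studies testing the variable),
# copied from the paragraph above. Names are illustrative assumptions.
counts = {
    "age": (17, 79),
    "gender": (13, 69),
    "ethnicity": (12, 46),
    "education": (4, 11),
    "socioeconomic status": (4, 18),
    "family constellation": (3, 21),
    "history of traumatic events": (6, 11),
}


def percent_significant(significant: int, total: int) -> int:
    """Share of studies with a significant result, rounded to a whole percent."""
    return round(100 * significant / total)


percentages = {
    name: percent_significant(significant, total)
    for name, (significant, total) in counts.items()
}

for name, pct in percentages.items():
    significant, total = counts[name]
    print(f"{name}: {pct}% ({significant}/{total})")
```

Rounding to whole percentages reproduces the figures given in the text.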

Quality Assessment of the Included Studies

Out of 114 included articles, 90 were RCTs and 24 were non-RCTs. Methodological quality criteria were rated according to the experimental design, and ratings on the outcome data consider the time point at the end of treatment unless otherwise specified. For detailed information on all ratings, see Online Resource 3.

Most included RCTs (72 articles) reported on the randomization process, while in 18 studies the process of randomization was not clear. However, in 11 trials with unclear randomization, groups were comparable at baseline indicating that randomization was correctly performed. In the remaining six (out of seven) trials with the unclear performance of randomization, it was not possible to determine if the groups were comparable at baseline (e.g., characteristics were provided for the entire sample, or no statistical comparisons among the groups were provided). Finally, one study with unclear randomization performance involved groups that were not comparable at baseline. In contrast, in 12 studies that specified the way by which randomization was performed, groups were either not comparable at baseline (in four articles) or this criterion was not clear. In sum, out of 90 included RCTs, 71 studies had groups that were comparable at baseline, in 14 studies this criterion was not clear, and five articles involved groups with significant imbalances at baseline.
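The counts in this paragraph form a two-by-three cross-tabulation of randomization reporting against baseline comparability. The sketch below is only an illustrative consistency check: the variable names are assumptions, and two cells (60 and 8) are not stated explicitly in the text but derived from the reported totals (72 minus the 12 studies that were not comparable or unclear, and 12 minus the four non-comparable articles).

```python
# Cross-tabulation of the 90 RCTs: randomization reporting vs. baseline comparability.
# 60 and 8 are derived from the reported totals; all other cells are stated in the text.
table = {
    "randomization described": {"comparable": 60, "unclear": 8, "not comparable": 4},
    "randomization unclear": {"comparable": 11, "unclear": 6, "not comparable": 1},
}

# Row totals: number of RCTs per randomization-reporting category.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}

# Column totals: number of RCTs per baseline-comparability category.
column_totals = {}
for cells in table.values():
    for status, n in cells.items():
        column_totals[status] = column_totals.get(status, 0) + n

print(row_totals)
print(column_totals)
print(sum(row_totals.values()))
```

The row totals recover the 72 and 18 RCTs with described and unclear randomization, and the column totals recover the 71, 14 and five RCTs summarized at the end of the paragraph.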

The majority of included RCTs (62) had complete outcome data at the end of the treatment while 13 studies did not meet this criterion. However, in two articles with incomplete outcome data, one condition (out of three) did fulfill this criterion. Similarly, in another trial, two (out of three) conditions had complete outcome data for the first phase of treatment (10/12 sessions) but neither condition met the criterion for the end of treatment. For 15 included RCTs, it was not possible to determine if there were complete outcome data at the end of treatment.

In most (63) included RCTs, participants adhered to the assigned intervention, 16 included RCTs did not meet this criterion, and in 11 RCTs it was not possible to determine whether this criterion was met. In one trial, participants adhered to the assigned intervention for the first phase of treatment (10/12 sessions) but not for the second phase (two sessions); the criterion was therefore rated negatively. By contrast, although participants in the treatment condition did not adhere to the assigned intervention in one trial, this criterion was met for the entire sample and was rated positively. Outcome assessors were blinded to the provided intervention(s) in 57 included RCTs, four RCTs acknowledged the lack of blinding, and this criterion was unclear in 29 included RCTs.

In all 24 included non-RCTs, participants were representative of the target population and measurements were appropriate regarding outcome and interventions (both criteria being inclusion criteria in the present study). Thirteen included non-RCTs had complete outcome data at the end of treatment, eight studies in this group did not meet this criterion, and in three non-RCTs it was not possible to determine whether this criterion was met. In 19 included non-RCTs, the interventions were administered as intended, and in the five remaining studies in this group it was not possible to determine whether this criterion was met. Confounders possibly interacting with the studied predictor/moderator were controlled for in analyses in 16 non-RCTs, five studies did not meet this criterion, and it was not clear whether this criterion was met in three studies.

Discussion

Despite the worrying prevalence of mental disorders among adolescents and young people, studies on predictors and moderators of treatment outcome remain scarce, and clinically relevant conclusions are difficult to draw. This systematic review focused on the predictive and moderating role of sociodemographic variables in studies conducted specifically on psychotherapeutic interventions for adolescents and young people across diverse mental disorders and different treatment modalities. Previous reviews have focused mostly on specific disorders, treatment modalities, or only RCT studies. Therefore, the present study aimed to give an extensive review of the knowledge gained so far from studies focusing on the transitional stage from childhood to adulthood. Exploring these effects could help us understand the factors that contribute to treatment efficacy and indicate which interventions are most beneficial for which groups. Such knowledge could be used for effective treatment planning before the patient is assigned to an intervention.

Outcomes

None of the sociodemographic predictors was found to be clearly significant, as only about 25% of all the studies reported any variable that significantly predicted or moderated the treatment outcome. In addition, for many of the sociodemographic variables and diagnostic groups, findings are mixed and point in different directions (e.g., some studies finding older and some younger youth benefitting more from treatment), and non-significant results outnumber significant ones. The heterogeneity of the included studies may have decreased the probability of finding uniformly significant predictors; perhaps the sociodemographic predictors are differently related to treatment outcome for different disorders or treatment modalities. However, as the findings are in line with previous results from systematic reviews on child and adolescent psychotherapy, which mostly do not support the relevance of sociodemographic predictors of treatment outcome (Courtney et al., 2022; Kunas et al., 2021; Nilsen et al., 2013), it is not likely that the wide inclusion criteria explain the results of the review. As the systematic review focuses on how the sociodemographic predictors are related to treatment outcome when patients have already accessed the treatment, it does not consider all predictive aspects of the sociodemographic variables. Thus, it might be that the sociodemographic variables are more important predictors of access to treatment than of the outcome of a received treatment. Furthermore, the mixed results may partially be a consequence of the small sample sizes in several of the included studies. Only very few were sufficiently powered for predictor and moderator analyses. Moreover, significance rates may have been inflated by multiple testing and by relying on simple, bivariate analyses. Since only a few studies have been conducted within each disorder group and for each sociodemographic variable, even significant results should be regarded with caution when it comes to outlining the big picture.

The variable that had the strongest empirical support as a predictor of (poorer) outcome across diagnoses and treatment types was having a history of traumatic events (e.g., abduction, physical/sexual/emotional abuse). There are many potential explanations for this finding. One likely issue is that trauma is often linked to emotion regulation difficulties and disorganized attachment patterns (Crow et al., 2021), which may strain the psychotherapeutic alliance with the therapist and require longer and more intensive treatments. A history of trauma, in particular developmental trauma in childhood relationships, is also associated with a risk for developing co-morbid disorders, among these complex PTSD and borderline personality disorder (Bozzatello et al., 2021; Cruz et al., 2022; Ford & Courtois, 2021). Accordingly, an important objective for further studies would be to study history of trauma as a potential predictor of outcome as well as possible associations between a history of trauma and other potential predictors of poorer outcome such as higher severity of symptoms and/or psychiatric co-morbidities. In clinical practice, the finding supports the need for developing and providing trauma-informed treatment (see, e.g., Butler et al., 2011; Black et al., 2012). Because of the potentially long-lasting negative impact of trauma on physical and mental health, ways to address patients’ history of trauma have also drawn increasing attention from health care policymakers and providers (Center for Health Care Strategies, 2016; Jones et al., 2020).

In the studies investigating age, significant predictor or moderator effects were reported in approximately a fifth of the studies, but even within studies that reported significant findings, there was generally a higher rate of non-significant analyses, and some of the results contradicted each other (i.e., supporting a different direction). Most of the included studies were in accordance with the systematic review of Kunas et al. (2021) stating that age was not a significant predictor of treatment outcome for CBT among youth with depression or anxiety. Yet, some of the studies supported the recent scoping review on predictors, moderators, and mediators associated with treatment outcome in RCTs among adolescents with depression (Courtney et al., 2022) stating that older age would be a predictor of better treatment outcomes. However, both reviews were limited to internalizing disorders, and Kunas et al. (2021) included only one treatment modality and also included children. Regarding autism spectrum disorder, only one moderator study found that younger adolescents had a higher increase in psychosexual knowledge and social functioning following an active intervention compared to older adolescents. Also, a possible explanation for why age did not predict treatment outcome could be that the age ranges differed considerably between included studies, as some included only adolescents, while others also included adults up to the age of 30.

Gender was also shown to be a predictor or a moderator of outcome in approximately a fifth of the reviewed studies. The results were quite mixed, so overall it is not possible to state which gender would benefit the most from treatment, but in treatment of substance use disorders it may be that males benefit somewhat more than females. One possible explanation could be that substance use is more socially accepted among boys than girls, indicating that substance abuse may affect more boys from well-functioning homes, whereas those girls who end up abusing substances are more likely to have, e.g., comorbidities, lower SES, and/or a trauma history, which was not possible to detect in this review. In addition, the studies with significant results for ethnicity suggest that belonging to an ethnic minority group (non-White/Hispanic) may predict weaker treatment outcomes. The results might be explained by the experience of minority stress and/or by interactions with lower social capital or SES. Previous systematic reviews have shown mixed results for the significance of ethnicity as a predictor of treatment outcome. However, the findings from the systematic review of Nilsen et al. (2013), which is limited to internalizing disorders and also includes children, suggest that it would be important to adapt depression treatments for youth to better match the needs of ethnic minority groups.

Higher education predicted better outcomes for SUD, but it was not significant in the few studies concerning other disorders. The studies with significant results for SUD on SES showed mixed results, so there is no indication that poorer SES was related to poorer treatment outcome, which is in accordance with the previous systematic review of Kunas et al. (2021) focusing on depressed and anxious children and adolescents treated with CBT. It should be noted that variables related to SES are probably important factors with regard to barriers to care, so they might impact the results already before the enrolment of patients in the treatment studies, as they might influence access to suitable care and willingness to be diagnosed or treated.

Some predictors/moderators that may potentially be of importance for young people, such as source of referral to treatment, social support, and forensic history, appear only in one or a few diagnostic groups, so more studies across diagnostic groups are needed to assess their relevance. Also, accommodation status and social support (having a mentor) could predict better outcomes for SUD, but studies on other disorder groups were lacking. The few studies on family constellation were mainly found to be non-significant, and the studies with significant results showed mixed evidence, as with parents’ education. Parents’ age, work status, school type, sexual orientation, distance to treatment, and sociodemographic adversity were reported to be non-significant predictors; however, there were only 1–2 studies researching each of the variables.

It should be noted that the findings of this systematic review are highly tentative and may be spurious. The results from the different studies in this review for the sociodemographic variables leave many questions open and may even be difficult to interpret without speculating. The studies are difficult to compare because of several methodological issues. The studies investigate a large variety of therapy modes where the theoretical base, goal for the therapy, and the setting of the therapy differ. Also, the outcome measures used and the analyses applied vary (e.g., simple correlations vs. multilevel models), and the analyzed predictors/moderators are not standardized. In many studies the predictor and moderator variables may be named differently, have not been grouped, or have been grouped according to different principles, making it difficult to compare results from different studies (Gergov et al., 2021). In addition, the way of reporting the results varies (e.g., many studies report only significant findings), which makes interpretation difficult.

The heterogeneity of the study designs of the studies included in this review highlights the importance of forming guidelines on performing predictor and moderator studies. The study designs should include the use of identical outcome measures and assessment of potential predictors/moderators at pre-intervention, conducting statistical analyses on prediction, testing the same predictors/moderators across studies, pre-defined hypotheses, sample sizes based on realistic power estimations, and the use of appropriate statistical methods. These elements are needed to explore which sociodemographic predictors and moderators across all diagnostic groups should be taken into account regularly when making treatment recommendations. Also, novel machine learning approaches that show potential to advance the field of precision psychotherapy (e.g., Aafjes-van Doorn et al., 2021; Schwartz et al., 2021) depend heavily on sound knowledge of reliable predictors. This review of potential predictors of treatment outcome might contribute to a proactive investigation of predictors, e.g., by helping to develop and routinely administer instruments assessing the most important sociodemographic predictors of treatment outcome.

Strengths and Limitations of the Study

A strength of the review is that the principles for the systematic search were created by a large interdisciplinary group of professionals in the field of youth psychotherapy, in order to ensure a set of sound inclusion criteria for predictor/moderator studies related to treatment outcome of psychotherapeutic interventions for youth with mental disorders. The study selection was continuously discussed within the group. The risk of noncompliance with the inclusion criteria was addressed by a data integrity group that performed an overall quality control of the screening procedure. Another important strength of the study is the meticulous and time-consuming search strategy, in which all predictor and moderator variables were identified manually from all clinical outcome studies instead of using a search term that included keywords like predictor/moderator, an approach that would have reduced the number of identified articles markedly. The searches can be reproduced, so the systematic review can be updated in the future when more studies are published. Until now, not many systematic reviews on predictors and moderators of treatment outcome focusing specifically on adolescents and young people have been published, and transdiagnostic systematic reviews on predictors and moderators of treatment outcome for this age group were not available.

The study aimed to review the existing tested moderators and predictors. We did not look into the effect sizes because this would be more appropriate in an individual patient data meta-analysis, where one would model moderators to explain intervention heterogeneity. This limitation was known already when forming the research question and deciding to do a systematic review, as this study is mainly exploratory. Future studies should develop a priori, confirmatory hypotheses to test these findings. One major limitation of existing studies is that the predictor and moderating variables were not hypothesized a priori but were instead examined in secondary analyses. This is a significant problem which increases the risk of publication and retrospective biases (Baldwin et al., 2022; Courtney et al., 2022; Sun et al., 2014), as secondary analyses are often underpowered, test for many moderators/predictors to “fish” for significant findings without adjusting p-values accordingly, and only report significant findings (publication bias). In addition, basing the definition of significant results on statistical significance (p-value) fails to account for, e.g., sample size, directionality, and effect size (Hedges & Olkin, 1980). In particular, the studies reporting sociodemographic moderator analyses may be underpowered, as the number of participants in the individual studies was not necessarily very high (n ranged between 32 and 484) considering the number of moderators tested. It is also possible that there is publication bias, so that studies would not report the results if predictors/moderators were not significant. To address this limitation, this review compiled an exhaustive list of all sociodemographic predictors or moderators that were reported to be tested in the selected studies; it is possible, however, that several non-significant findings were not identified because they were not published. Furthermore, the majority of studies included in the review did not examine interactions between different baseline predictors; therefore, conclusions on the overlap of the variables and direction of the effects cannot be made.
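To illustrate why testing many predictors without adjusting p-values inflates the rate of significant findings, the following minimal sketch computes the family-wise error rate for independent tests and applies a Holm-Bonferroni adjustment. The p-values are hypothetical and do not come from any included study, and the Holm procedure is shown only as one common adjustment, not as the method used in the reviewed trials.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Multiply by the number of hypotheses not yet processed, cap at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# With ten unadjusted tests at alpha = 0.05, the chance of at least one
# spurious "significant" predictor is far above 5%.
fwer = familywise_error_rate(0.05, 10)
print(f"family-wise error rate for 10 tests: {fwer:.2f}")

# Hypothetical p-values for four sociodemographic predictors tested in one study.
raw_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = holm_adjust(raw_p)
for raw, adj in zip(raw_p, adjusted_p):
    print(f"raw p = {raw:.2f}, Holm-adjusted p = {adj:.2f}")
```

In this hypothetical example, three of the four predictors fall below 0.05 before adjustment, but only one remains below that threshold after adjustment.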

Another possible limitation of the study is the decision to conduct the searches by diagnostic groups, as this may have led to possible transdiagnostic predictor or moderator studies being missed or excluded. Five studies that included samples with diverse diagnostic groups were identified at the stage of data extraction, and one study with such a sample was identified outside the search; however, it cannot be ruled out that the search strategy prevented the identification of further relevant individual studies including patients from several diagnostic groups.

In addition, the systematic review focuses on how the sociodemographic predictors are related to treatment outcome when adolescents and young people have already accessed the treatment but does not consider all predictive aspects of the sociodemographic variables. Thus, no conclusions on, e.g., pre-treatment selection effects can be drawn based on the review. Sociodemographic predictors might also act as indicators of clinical severity; e.g., younger age might indicate earlier onset of mental disorder and therefore possibly a more severe disorder. Sociodemographic predictors should be examined while controlling for symptom severity, which was not the case in most of the included studies.

Conclusion

Research on predictors and moderators of treatment outcome for psychotherapeutic interventions has increased in recent years; however, systematic reviews have mainly focused on specific disorders or treatment modalities and have not specifically targeted patients in the transitional stage from childhood to adulthood. This review describes the evidence on sociodemographic variables that predict treatment outcomes for adolescents and young people across different mental disorders and treatment modalities. The review found that most of the studies on predictors and moderators of treatment outcome for this age group were conducted with patients diagnosed with mood or eating disorders or SUD. Only a few studies have investigated samples across diagnostic groups, and some disorder groups have hardly been investigated for predictors of treatment outcome. The review provides tentative support that ethnic minority status and a history of trauma may predict poorer outcome of psychotherapy across several diagnoses and treatment modalities. Otherwise, the results mostly do not support the relevance of sociodemographic variables for predicting treatment outcome. Age, gender, and ethnicity are the most frequently researched predictors or moderators, and a portion of the studies report significant results. However, for age and gender the findings are mixed and point in different directions, so it is not possible to state whether older/younger or female/male patients benefit more from treatment across different disorder groups. In treatment of SUD, males may show larger benefits, as may patients with higher education and stable accommodation status. The heterogeneity of the study designs of the studies included in this review makes it evident that guidelines for conducting predictor and moderator studies are needed in order for the studies to be methodologically more similar and the results to be comparable. More research with sound study designs is needed before predictor and moderator studies can guide therapists in their clinical work.