Females in the juvenile justice system remain an understudied minority among justice-involved youth (JIY). Although males and females share many risk factors for delinquent behavior (e.g., deviant peers, low parental monitoring), females also possess gender-specific developmental pathways to and risks for delinquency (Baglivio et al., 2014; Scott & Brown, 2018). For example, females are more likely to enter the juvenile justice system for status offenses (e.g., running away from an abusive home), and their offenses are more often linked to victimization experiences (e.g., physical retaliation against an abuser) compared to males (Belknap & Holsinger, 2006; Ehrmann et al., 2019; Kerig, 2018). Similarly, sexual abuse and polyvictimization are more strongly correlated with subsequent offending among females relative to males (Conrad et al., 2014; Kerig, 2014). Among JIY, it is theorized that these victimization-related factors compound risk for engaging in substance use (a status offense) to cope with psychiatric symptoms that emerge from prior victimization experiences such as posttraumatic stress (Kerig, 2018). Supporting this hypothesis, female JIY are more likely to have co-morbid substance use and posttraumatic stress disorders than male JIY (Teplin et al., 2002). Taken together, evidence suggests that females in the juvenile justice system possess gender-specific risk factors and developmental pathways that should be considered when intervening with this population.

Because males represent the majority of JIY (~79%), existing interventions delivered in juvenile justice contexts often focus on male-centered risks (e.g., violent behavior) or universal risks (e.g., deviant peers) and pathways to delinquent behavior (Ehrmann et al., 2019). Feminist scholars hypothesize that, because of this lack of attention to gender-specific needs, existing interventions delivered in juvenile justice contexts may be less effective for females (e.g., Covington & Bloom, 2007), and thus advocate for the implementation of gender-responsive programming for female JIY. Feminist theorists argue that gender identity affects every aspect of life and that, as such, understanding the position of females in society is critical to addressing female delinquent behavior; they argue that females’ societal status (i.e., oppression relative to males) produces different causal mechanisms for offending behavior.

Two theories emerge as most prominent from feminist criminology literature. The first, relational theory, asserts that females are more likely to develop their self-worth and identity based on their connections with others, and thus their behavior is more likely to be motivated by relational concerns. Because female offending often occurs in service of a relationship (e.g., securing drugs for a drug-using partner, stealing goods for a family member), relational theory suggests that the development of healthy relationships is particularly important to prevent or reduce delinquent behavior among females (Bloom & Covington, 2008; Farmer, 2019). In line with relational theory, interventions that focus on helping female offenders eliminate unhealthy relationships that facilitate delinquent behavior, or strengthen existing relationships that promote prosocial behavior, can be considered gender-responsive and thus more likely to reduce recidivism among females. For example, interventions that address contextual factors contributing to antisocial behavior, including delinquent peers and poor familial relationships, may be particularly effective for justice-involved girls because they focus on relational problems that may exacerbate delinquent behavior.

The second theory, the pathways perspective, postulates that females are more likely than males to become justice-involved as a direct result of prior abuse or mental health challenges. For example, at-risk females often utilize survival mechanisms that are criminalized (e.g., retaliating against perpetrators of abuse; Chesney-Lind, 1999; Flores et al., 2020), or struggle with substance use (sometimes to cope with prior trauma), which leads to subsequent justice system involvement (Gehring, 2018). In sum, the pathways perspective suggests that interventions aimed at reducing recidivism among females require attention to gender-specific risk factors that lead to system involvement, such as comorbid substance misuse and PTSD or lack of post-abuse shelter (Gehring, 2018). As such, interventions that target these factors could be more effective at reducing female delinquency.

Other scholars argue that despite some gender differences in patterns of delinquent behavior, female delinquency can still be explained by existing “universal” criminology theories like general strain and social control theory (Broidy & Nyce, 2022; Costello & Laub, 2020). For example, general strain theory posits that criminal behavior emerges due to 1) failure to achieve positively valued goals, 2) loss of positive stimuli, and 3) increase in negative stimuli, which mainstream theorists argue explain both male and female criminal behavior (e.g., Broidy & Agnew, 1997). Moreover, the presence of any of these three “strains” creates negative emotions, like anger, which sometimes drive individuals to engage in delinquent behavior to alleviate those emotions (Agnew, 1992). Thus, per general strain theory, interventions that attempt to eliminate the “strains” in an offender’s life (e.g., providing vocational training for an offender who failed to obtain a job) or offer solutions to reduce the burden of existing strains (e.g., coping skills to manage stress from strains) can help reduce criminal behavior (Agnew, 2017). Social control theory, on the other hand, states that criminal activity emerges due to inadequate socialization (i.e., lack of investment in relationships, commitments, values, and norms that may encourage prosocial behavior; Heimer, 1996; Hirschi, 2017). Thus, according to social control theory, individuals must feel invested in relationships, communities, or personal values that encourage prosocial behavior to avoid engaging in delinquency. As with general strain theory, some scholars argue that social control theory explains delinquent behavior regardless of gender (e.g., Alarid et al., 2000; Heimer, 1996). Per social control theory, interventions that encourage offenders to invest in their communities and strengthen personal relationships are likely to reduce recidivism among males and females (Roman et al., 2017).

Reviews of interventions aimed at reducing delinquency among JIY overall show mixed findings (e.g., Liddle, 2016; Van der Stouwe et al., 2014; Markham, 2018; Weisman & Montgomery, 2019). Indeed, a recent systematic review and meta-analysis of diverse noninstitutional interventions for justice-involved youth found that interventions are largely ineffective at reducing delinquency (Olsson et al., 2021). Within this literature, information as to whether effectiveness differs by gender among justice-involved youth is extremely limited. Many intervention trials include exclusively male JIY, and studies with gender diversity typically include too few females to reliably detect gender differences in intervention effects (Leve et al., 2015). Aggregating effects across multiple studies through meta-analysis increases power to detect significant gender differences in intervention outcomes. In this study, we use meta-analysis to assess whether gender moderates intervention effects on delinquency among JIY with a preliminary sample of ten studies with effect size information by gender. We have two competing hypotheses as to whether intervention effects vary by gender among JIY. The “gender-responsive treatment hypothesis” argues that conventional interventions often neglect salient, gender-specific criminogenic needs of female offenders (e.g., higher rates of comorbid substance use and PTSD). Thus, one prediction is that males will benefit more than females from mainstream interventions, as they are often designed to meet the criminogenic needs of “typical” JIY (i.e., adolescent males). In contrast, the “universal treatment hypothesis” argues that existing interventions target general criminogenic needs and are broadly effective across different risk profiles. Thus, an alternative prediction is that males and females will benefit equally from conventional interventions.

Method

We conducted an electronic literature search in 2022 using the following databases: Applied Social Sciences Index & Abstracts (ASSIA), ProQuest Criminal Justice Database, MEDLINE®, ProQuest Dissertations & Theses Global, PsycINFO, Public Health Database, Social Science Database, and Social Services Abstracts. Our search used terms representing justice involvement (e.g., juvenile justice, delinquent, court-involved), intervention (e.g., intervene, modification, program), controlled evaluation (e.g., randomized controlled trial, controlled clinical trial), and adolescents/young adults (e.g., youth, teen, juvenile). The search yielded 5,918 studies, and all titles and abstracts were screened for eligibility. In addition to this literature search, we looked for publicly available data from randomized trials of psychosocial interventions with JIY from the Inter-university Consortium for Political and Social Research (ICPSR), which yielded 1,098 datasets. A total of 32 additional studies were identified through other sources, all of which were screened at the full-text level.

Studies were eligible if they met the following criteria: 1) randomized controlled trial, 2) youth participants (up to age 24 years), 3) sample with at least 50% JIY or with outcomes reported separately for JIY vs. non-JIY, 4) included at least 5 male and 5 female participants per condition, 5) reported at least one delinquency or recidivism outcome, and 6) had data available to calculate gender-specific effect sizes.

Two research assistants and the first author conducted the first round of screening on 6,614 abstracts from ICPSR and the databases, which yielded 549 studies. One research assistant and the first author screened the full text of these 549 studies, in addition to the 32 studies identified through other sources (n = 581 studies screened at the full-text level). Together, our search yielded 10 total eligible studies. Reasons for study exclusion at the full-text level are provided in Fig. 1.

Fig. 1 Study Selection Flowchart

Most studies that met eligibility criteria 1–5 listed above (i.e., 73 studies) did not offer sufficient data to calculate gender-specific effect sizes. Gender-specific effect size information was derived from: data available in the published RCTs (N = 4), raw data made available to study authors (N = 4), and raw data from publicly available datasets (N = 2). We analyzed delinquency data from the first available post-treatment timepoint reported. When multiple delinquency outcomes were assessed within a study, effects were calculated for each measure and then combined to form a single effect size coefficient.

Study Characteristics

The 10 studies included in this meta-analysis yielded a total of N = 4,344 participants (n = 60–1,817 per study), with N = 962 females (n = 15–309) and N = 3,382 males (n = 10–1,508). Study-level demographic data for age, race, and ethnicity are as follows. The mean age was 15.33 years (range: 10–18 years). All studies were based in the United States. Forty percent of studies included predominantly Black or African American youth, 30% predominantly Caucasian youth, 20% predominantly Latino youth, and 10% youth whose predominant race/ethnicity was unspecified.

In eight studies, youth were juvenile court-, police-, or corrections-referred, in one of the studies participants were referred from an inpatient facility, and in the last study the referral source was unspecified. More information about the interventions tested and primary effects on delinquency are provided in Table 1.

Table 1 Overview of Interventions Tested Across Studies

Effect Size Estimation

For this meta-analysis, the effect size statistic represents the standardized difference in outcomes between treatment and comparison at the first follow-up timepoint. Effect sizes were calculated separately for males and females for each study. For continuous variables, effect sizes (Cohen’s d) were calculated using the standardized mean difference with the pooled standard deviation as the denominator. For dichotomous variables (e.g., arrested or not), log odds ratios were calculated and converted to create a common effect size index (Lipsey & Wilson, 2001). Positive effect sizes indicate the active intervention is more beneficial than the control and negative effect sizes indicate the control group was more beneficial than the active intervention. Effect sizes of 0.2 or lower are considered small effects, around 0.5 medium effects, and 0.8 or higher large effects (Cohen, 1988). For each gender, we used only one effect size for analyses to avoid violating assumptions of statistical independence (Lipsey & Wilson, 2001), thus yielding two effect sizes (one male, one female) per study. When multiple indices were used to assess delinquency outcomes (e.g., official arrest data and self-reported delinquency), we calculated the effects for each measure and then averaged them to form a single effect size.
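The calculations described above can be illustrated with a minimal sketch. It is not the authors' code, and it rests on several assumptions that the text does not confirm: the sign convention (lower delinquency is treated as better, so differences are taken as control minus treatment), the use of the logit conversion d = ln(OR) × √3/π for dichotomous outcomes, and simple unweighted averaging across measures. The function names and the numbers in the usage lines are hypothetical.

```python
import math


def cohens_d(mean_tx, sd_tx, n_tx, mean_ctrl, sd_ctrl, n_ctrl):
    """Standardized mean difference with the pooled SD as the denominator.

    Assumed sign convention: lower delinquency scores are better, so the
    difference is control minus treatment and a positive d favors treatment.
    """
    pooled_var = ((n_tx - 1) * sd_tx ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / (
        n_tx + n_ctrl - 2
    )
    return (mean_ctrl - mean_tx) / math.sqrt(pooled_var)


def d_from_counts(events_tx, n_tx, events_ctrl, n_ctrl):
    """Convert a 2x2 outcome (e.g., arrested or not) to d via the log odds ratio.

    Uses the logit conversion d = ln(OR) * sqrt(3) / pi (an assumption; the
    source does not name the conversion). Zero cells are not handled here.
    """
    odds_tx = events_tx / (n_tx - events_tx)
    odds_ctrl = events_ctrl / (n_ctrl - events_ctrl)
    log_or = math.log(odds_ctrl / odds_tx)  # > 0 when treatment has lower odds
    return log_or * math.sqrt(3) / math.pi


def combine_measures(effect_sizes):
    """Average several delinquency effect sizes into one value per subgroup."""
    return sum(effect_sizes) / len(effect_sizes)


# Hypothetical female subgroup from a single trial
d_self_report = cohens_d(10.0, 4.0, 50, 12.0, 4.0, 50)
d_arrest = d_from_counts(20, 50, 30, 50)
d_female = combine_measures([d_self_report, d_arrest])
```

Running the same three steps on the male subgroup of each trial would give the two effect sizes per study (one male, one female) used in the analyses.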

Statistical Analysis

Comprehensive Meta-Analysis (CMA) Version 3 was used to analyze data. Studies varied substantially in terms of treatment characteristics and demographics. As such, we expected heterogeneity of effects and used a random effects model, which assumes that true effects vary across studies (Borenstein et al., 2010). We calculated the Q statistic (Hedges & Olkin, 1985) to test for homogeneity of effects across studies. A significant Q statistic indicates that effects vary across studies more than would be expected from sampling error alone. We supplemented the Q statistic with the I² index because the Q test often has low power to detect heterogeneity when the number of studies is small (Higgins et al., 2003). I² is the percentage of variation across studies due to heterogeneity; it is an index of inconsistency across study results. I² values are typically categorized as follows: 25% indicates low heterogeneity, 50% moderate heterogeneity, and 75% high heterogeneity (Higgins et al., 2003). We tested gender as a moderator using the Q statistic (Hedges & Olkin, 1985) to determine whether effects varied by participant gender.
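The quantities named above can be sketched in a few lines. This is an illustration under stated assumptions rather than a reproduction of the CMA analysis: it assumes the DerSimonian-Laird estimator for the between-study variance and a subgroup moderator test that applies Q to the pooled estimate of each gender. The source does not report which options were selected in CMA. The function names are hypothetical, and the inputs are each effect size's value and sampling variance.

```python
import math


def q_statistic(effects, variances):
    """Cochran's Q: weighted squared deviations from the fixed-effect mean."""
    weights = [1.0 / v for v in variances]
    mean_fixed = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - mean_fixed) ** 2 for w, d in zip(weights, effects))
    return q, len(effects) - 1  # Q and its degrees of freedom


def i_squared(q, df):
    """Percentage of total variation attributed to heterogeneity (floored at 0)."""
    return max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0


def random_effects(effects, variances):
    """Pooled estimate, standard error, and tau^2 (DerSimonian-Laird, assumed)."""
    weights = [1.0 / v for v in variances]
    q, df = q_statistic(effects, variances)
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * d for w, d in zip(re_weights, effects)) / sum(re_weights)
    return pooled, math.sqrt(1.0 / sum(re_weights)), tau2


def q_between(subgroups):
    """Moderator test: Q computed across the subgroup pooled estimates.

    subgroups maps a label (e.g., "male") to a pair (effects, variances).
    """
    pooled = [random_effects(e, v) for e, v in subgroups.values()]
    means = [p[0] for p in pooled]
    variances = [p[1] ** 2 for p in pooled]
    return q_statistic(means, variances)
```

Each Q value is referred to a chi-square distribution with the returned degrees of freedom to obtain a p-value. For the gender moderator there are two subgroups, so the test has one degree of freedom.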

Results

Results from the random effects model showed that interventions were generally ineffective for JIY overall, d = -0.006, p = 0.921. Contrary to expectations, the Q statistic for heterogeneity across studies was not significant, Q(9) = 7.905, p = 0.783. Moreover, the I² index was < 25% (I² = 17%), indicating a low degree of heterogeneity (Higgins et al., 2003). Although there was low heterogeneity across studies, moderator testing may be warranted so long as a model is specified a priori (Hall & Rosenthal, 1995). As such, we followed through with moderator testing given our a priori hypotheses. Gender did not moderate intervention effects for JIY, Q(1) = 0.071, p = 0.790; indeed, intervention effects were nonsignificant for both males, d = 0.006, p = 0.933, and females, d = -0.027, p = 0.785. See Fig. 2 for more detailed results.

Fig. 2 Forest Plot with Overall and Study-level Effect Sizes

Supplementary Analysis

Given our unexpected null findings, we conducted a supplemental analysis to test whether results remained consistent when only official arrest and incarceration data were included (i.e., self-reported delinquency excluded). This supplementary analysis excluded the trial by Wilson et al. (2009), which only included self-report delinquency data. Results from the random effects model were consistent with the primary analysis. When focused only on arrest/incarceration outcomes, interventions were generally ineffective for JIY, d = 0.048, p = 0.337. Although the Q statistic was not significant, Q(8) = 10.888, p = 0.144, the I² index was 35%, indicating a low to moderate degree of heterogeneity. Again, however, gender did not moderate intervention effects, Q(1) = 0.231, p = 0.631. Intervention effects on arrest/incarceration were nonsignificant for both males, d = -0.050, p = 0.491, and females, d = -0.022, p = 0.832.

Discussion

To our knowledge, this is the first meta-analysis focused on gender differences in intervention effects for JIY. This analysis with over 4,000 youth across ten studies indicates no significant gender differences in intervention effects for youth in juvenile justice contexts. In fact, our analyses show that interventions were broadly ineffective for JIY. Moreover, findings were similar when we limited our analysis to archival delinquency data. Overall, results support neither the gender-responsive treatment hypothesis nor the universal treatment hypothesis, which suggests a need for increased research focused on understanding why and for whom interventions work in juvenile justice system contexts.

Notably, our findings are largely congruent with Olsson et al.’s (2021) recent meta-analysis and systematic review investigating the effects of noninstitutional psychosocial interventions for justice-involved adolescents. Across 35 studies published between 2000 and 2019, the authors found that active treatments were no more likely to reduce recidivism than control (Olsson et al., 2021). Moreover, using a crude index of gender representation (i.e., studies with > 70% male vs. < 70% male), the authors found that gender of study participants did not moderate intervention effectiveness.

Although we included only 10 studies, three were investigations of multisystemic therapy (MST), which is touted as an intervention with a robust evidence base for reducing problem behavior among delinquent youth (e.g., Henggeler et al., 2016). However, as noted previously, evidence for the effectiveness of MST at reducing delinquency is mixed (e.g., Fonagy et al., 2018; Markham, 2018), which is also apparent when reviewing the MST trials included in this meta-analysis (see Table 1). Reviews of the remaining interventions included in this meta-analysis (e.g., employment-focused intervention, intensive supervision, police-based restorative justice program, teen court, cognitive restructuring, drug court, and counseling) also have shown null or iatrogenic effects in reducing delinquency. For example, some intensive supervision programs and/or interventions with law enforcement as treatment providers (e.g., police-based programs) are linked to iatrogenic effects for offending populations (e.g., Rubenson et al., 2021), and reviews of teen court programs indicate they are largely ineffective (e.g., Gase et al., 2016). Thus, our lack of significant results overall is not entirely surprising given these mixed findings. In general, our results indicate that more work is needed to adapt existing interventions, or develop new ones, that yield reductions in delinquency for males and females. With respect to gender more specifically, none of the interventions included in this study are advertised or explicitly labeled as “gender-responsive,” nor do they include specific attention to some of the prominent theoretical explanations for female delinquency, such as those described in the pathways perspective or relational theory. Indeed, relatively few gender-responsive interventions for justice-involved females have been implemented and rigorously tested (Kerig & Schindler, 2013). Overall, our null findings support the need to evaluate whether gender-responsive interventions, relative to mainstream programs, lead to beneficial outcomes for female JIY.

Limitations

One limitation of this meta-analysis is that we included only ten studies. We used a convenience sample based on studies for which (1) delinquency data were published separately by gender, or (2) we had access to raw data. Another limitation is that the studies were mostly older trials published before 2012, and it is possible that more recent studies may have yielded different findings. However, Olsson et al.’s (2021) JIY meta-analysis showed that publication year did not significantly moderate study effects, although their manuscript only included studies published between 2000 and 2019. Thus, despite these limitations, our null findings with a sample size of over 4,000 youth and inclusion of diverse psychosocial interventions provide tentative evidence that existing interventions are largely ineffective for JIY overall.

Another important limitation is that none of the included studies differentiated between sex and gender. Thus, we classified participants as cisgender male or female. Transgender and nonbinary youth are overrepresented among JIY (Himmelstein & Brückner, 2011), and it is likely that at least some of the over 4,000 youth included in this meta-analysis identify as transgender or nonbinary. As a result, we were unable to test whether intervention effects differ for JIY with gender-expansive identities. Finally, we were unable to investigate whether results differed depending on whether youth possessed multiple, intersectional identities. Racial/ethnic and sexual minority youth are overrepresented in the juvenile justice system, and female or other gender minority youth with multiple minoritized identities may experience poorer treatment outcomes compared to males (Irvine-Baker et al., 2019).

Future Directions

Further research, including an expansion of this meta-analysis as newer randomized trials are published and more datasets with gender-specific effect size information are made available, is needed to test whether intervention effects on delinquency vary by gender among JIY. Results from this study could motivate future efforts to improve interventions for female JIY. Future studies should also attempt to tease apart variables that may explain null effects for some of these psychosocial interventions, and whether these explanatory variables differ by gender and other intersectional identities (e.g., race, religious beliefs, disability status, and sexual orientation). In sum, our findings, although preliminary, are an important contribution to the limited literature on whether interventions work for male and female JIY.