Introduction

Since the 1980s and 1990s, numerous trials have been conducted to test the efficacy of behavioral interventions that aim to prevent sexually transmitted infections (STIs) and human immunodeficiency virus (HIV) by encouraging people to use condoms or reduce their number of sexual partners. In turn, in the last 10–15 years a large number of meta-analyses have been published. Some of these have focused on different target groups such as African Americans (Darbes et al., 2008), adolescents (Johnson et al., 2003), or men who have sex with men (MSM) (Johnson et al., 2005), or different types of interventions such as the use of computer-technology (Noar et al., 2009) or social media (Swanton et al., 2015). Despite their different foci, these meta-analyses often show positive pooled effect sizes for changes in condom use and other sexual risk behaviors. However, the effect sizes have been found to be significantly heterogeneous, which has led some researchers to explore which factors moderate intervention efficacy through stratified analysis and meta-regression techniques.

Given the growing number of meta-analyses that have conducted moderator analyses, researchers are now turning to systematically reviewing the meta-analytic studies themselves. Five such meta-reviews, or meta-syntheses, have been published in recent years. Each provides different insights into the moderators of intervention efficacy (Johnson et al., 2014; Lorimer et al., 2013; Noar, 2008; Protogerou & Johnson, 2014; Vergidis & Falagas, 2009).

Four out of the five meta-reviews have focused their attention on meta-analyses of interventions targeted at specific groups such as MSM, adolescents, or specific ethnicities (Lorimer et al., 2013; Noar, 2008; Protogerou & Johnson, 2014; Vergidis & Falagas, 2009). A range of factors have been shown to be associated with larger intervention effects. Sessions delivered to single-ethnicity or single-gender groups were more efficacious than mixed ethnicity/gender sessions (Noar, 2008). For African Americans, greater efficacy was found for interventions that involved peer education, whereas for Latinos the effect was larger in interventions targeted at same-sex groups (Vergidis & Falagas, 2009). Group and community-level interventions increased condom use and reduced unprotected anal intercourse in interventions delivered to MSM (Lorimer et al., 2013). The use of motivation enhancement skills training and the use of theory were linked to efficacy in interventions targeted at adolescents (Protogerou & Johnson, 2014).

Unlike these four meta-reviews, Johnson et al. (2014) did not restrict their synthesis to prior meta-analyses focused on particular target groups. They focused instead on the 56 behavioral HIV prevention meta-analyses that had been included in a meta-synthesis of behavior change interventions conducted by Johnson et al. (2010). Two intervention content dimensions, skills training and motivational enhancement, were identified as being significantly associated with greater risk reduction behaviors in multiple meta-analyses. However, the synthesis lacked detail about the results found for all intervention content dimensions. In particular, their focus was on identifying only the significant moderators; the non-significant dimensions were not identified. This limits our ability not only to explore the reasons for lack of consensus in results between meta-analyses (i.e., why is a dimension a significant moderator in one meta-analysis but not another?), but also to identify dimensions that never, or rarely, produce significant effects (i.e., which dimensions do not make a difference to intervention effectiveness?).

This limitation is addressed in the meta-review reported in this paper, in which we present a comprehensive and detailed synthesis of previous meta-analyses that have tested the significance of intervention dimensions. The intervention dimensions selected for analysis are listed and defined in Table 1 and include mode of delivery dimensions (e.g., number of sessions, group delivery) and communicator dimensions (e.g., matched ethnicity, expert delivery), as well as the content dimensions (e.g., individual tailoring, condom skills training) analyzed by Johnson et al. (2014). Also, unlike the meta-reviews conducted by Lorimer et al. (2013), Noar (2008), Protogerou and Johnson (2014), and Vergidis and Falagas (2009), we did not restrict our analysis to meta-analyses that had focused on particular target groups such as MSM, adolescents, or specific ethnicities.

Table 1 Intervention characteristic dimensions

Objectives

The aim of this meta-review was to synthesize the existing meta-analytic evidence on the outcomes of behavioral interventions that aim to reduce the risk of STIs or HIV by increasing condom use or reducing unprotected sex. Our primary objective was to identify which types of interventions previous meta-analyses have found to be associated with larger intervention effects. We considered a broad range of intervention characteristics, shown and defined in Table 1, which included mode of delivery dimensions (e.g., number of sessions, group delivery), communicator dimensions (e.g., matched ethnicity, expert delivery), and content dimensions (e.g., individual tailoring, condom skills training).

Methods

Eligibility criteria

To qualify for inclusion, the meta-analysis must have: (1) been published in a peer-reviewed journal since 2000; and (2) reported moderator analysis with significance testing for at least one of the intervention features (shown in Table 1) on sexual risk behavior (i.e., measures of condom use or unprotected sex) or STI/HIV incidence rates. Meta-analyses were excluded if they: (1) focused only on interventions that aimed to prevent pregnancy without also addressing the prevention of STIs or HIV; (2) focused only on interventions that aimed to prevent HIV/STI transmission from people living with HIV (including mother–child transmission of HIV), or were concerned only with evaluating the outcomes of STI screening, HIV counselling/testing or HPV vaccination; (3) focused only on abstinence education interventions aimed at reducing sexual activity rather than encouraging condom use/protection; or (4) only reported moderator analysis on effect sizes based on sexual activity measures such as number of sexual partners or frequency of sexual activity.

Information sources, search strategy and study selection

The Web of Science (formerly Web of Knowledge) database was searched on May 7, 2015. In addition to the Web of Science Core Collection [Social Sciences Citation Index (SSCI), Science Citation Index Expanded (SCI-EXPANDED)], this database includes access to the Cochrane Database of Systematic Reviews, Current Contents Connect, and MEDLINE. The search terms used are shown in Fig. 1, which also shows the PRISMA flowchart of study inclusion and reasons for exclusion (Moher et al., 2009).

Fig. 1 PRISMA flowchart of study inclusion and exclusion

JC and HR-S independently screened the titles and abstracts of the papers identified from the search. Potentially eligible papers were short-listed for full-text review if the title or abstract indicated that the paper was reporting either a meta-analysis or systematic review of STI/HIV prevention interventions. The full-text articles were then reviewed by both JC and HR-S and only papers that met the eligibility criteria were included in the synthesis.

Data extraction and analysis

We extracted the following information from each meta-analysis: (1) authors and report date; (2) type of STI/HIV interventions included in the meta-analysis; (3) target group(s) included or excluded from the meta-analysis (including country of residence restrictions); (4) latest year included in the search period; and (5) details of the moderator analysis reported for the intervention characteristics shown in Table 1. We recorded: (1) the number of studies (k) on which the moderator analysis was based; (2) whether the moderator analysis was conducted on a univariate or multivariate basis; (3) whether the researchers had used conservative Bonferroni corrected significance levels for multiple comparisons; and (4) whether the moderator effect was significantly positive (+), negative (−), or not significant (ns). Data extraction was conducted by JC and checked by either SH or HR-S. Fewer than 6 differences in coding were identified across all meta-analyses and these were resolved by discussion.
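
As a rough illustration of the extraction scheme described above, each moderator test could be stored as one record and then counted by dimension and direction. This is a minimal Python sketch and not the coding tool used for this meta-review; the class name `ModeratorTest`, its field names, the `tally` helper, and the example rows are all hypothetical placeholders rather than data from the included meta-analyses.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeratorTest:
    """One moderator test extracted from one meta-analysis (hypothetical schema)."""
    meta_analysis: str   # authors and report date
    dimension: str       # intervention characteristic from Table 1
    k: int               # number of studies the test was based on
    multivariate: bool   # False means the test was univariate
    bonferroni: bool     # True if Bonferroni-corrected significance levels were used
    direction: str       # "+", "-", or "ns"

    def __post_init__(self):
        # Reject codes outside the three outcomes recorded in the extraction.
        if self.direction not in {"+", "-", "ns"}:
            raise ValueError(f"unknown direction: {self.direction!r}")
        if self.k < 1:
            raise ValueError("k must be a positive number of studies")


def tally(tests):
    """Count positive, negative and non-significant effects per dimension."""
    counts = {}
    for t in tests:
        counts.setdefault(t.dimension, Counter())[t.direction] += 1
    return counts


# Made-up placeholder rows, for illustration only.
example = [
    ModeratorTest("Review A", "condom skills training", 40, False, False, "+"),
    ModeratorTest("Review B", "condom skills training", 18, True, False, "ns"),
    ModeratorTest("Review C", "duration", 25, False, True, "ns"),
]
summary = tally(example)
```

Keeping k, the type of analysis, and the correction status on every record means that a simple count of significant effects can later be read alongside the size and rigor of the tests that produced them.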

Results

As shown in Fig. 1, 37 meta-analyses were included in this meta-review. Table 2 shows the data extracted from each study. The meta-analyses varied in terms of how inclusive they were with some focusing on specific types of populations such as adolescents (Chin et al., 2012; Johnson et al., 2003; Johnson et al., 2011; Mullen et al., 2002), STI clinic patients (Crepaz et al., 2007; Scott-Sheldon et al., 2010), African Americans (Crepaz et al., 2007, 2009; Darbes et al., 2008; Henny et al., 2012; Johnson et al., 2009; Reid et al., 2014), Hispanics (Crepaz et al., 2007; Herbst et al., 2007), MSM (Herbst et al., 2005; Higa et al., 2013; Johnson et al., 2005), heterosexuals (Henny et al., 2012; LaCroix et al., 2013; Neumann et al., 2002; Tyson et al., 2014), women only (Crepaz et al., 2009; Lennon et al., 2012), men only (Henny et al., 2012), or drug users (Meader et al., 2013; Prendergast et al., 2001). Beyond the interventions tested on North American populations, which were included in most of the meta-analyses, others were restricted to particular countries like South Africa (Scott-Sheldon et al., 2013) and China (Liu et al., 2014; Xiao et al., 2012; Zheng & Zheng, 2012), or Asian countries (Tan et al., 2012).

Table 2 Tests of moderator effects on condom use/unprotected sex and STI/HIV incidence effect sizes in 37 meta-analyses of HIV prevention interventions

Some reviews also placed restrictions on the types of interventions that were included. Restrictions included excluding interventions where recipients engaged in behaviors like role playing or condom-use skills (Albarracin et al., 2003), pamphlet studies (Johnson et al., 2003; Johnson et al., 2011), or mass-media interventions (Scott-Sheldon et al., 2011). Others restricted themselves to interventions that were group-based (Chin et al., 2012), multi-session (Meader et al., 2013), single session (Eaton et al., 2012), face-to-face (Huedo-Medina et al., 2010; Lennon et al., 2012), used computer-technology (Noar et al., 2009), used new media (Swanton et al., 2015), or were informed by the Theory of Planned Behavior (Tyson et al., 2014).

The final types of restrictions were concerned with the study design or information provided in the intervention reports. Some meta-analyses only included studies that comprised both a pre-test and post-test (Albarracin et al., 2005; Albarracin et al., 2003; Albarracin et al., 2008; Durantini et al., 2006; Earl & Albarracin, 2007), or where information was provided about the interventionist (Durantini et al., 2006) or percentage of Latinos in the sample (Albarracin et al., 2008), or where depression measures were obtained and separate results were provided for women (Lennon et al., 2012).

Although these restrictions reduce the overlap between the meta-analyses included in this meta-review, several of the meta-analyses share the same intervention studies. For example, the analyses reported by Durantini et al. (2006) and Earl and Albarracin (2007) were both based on a sub-set of papers reviewed by Albarracin et al. (2005). All of the studies included in Johnson et al. (2003) were included in the later meta-analysis reported in Johnson et al. (2011), and Reid et al. (2014) report a secondary analysis of studies included in Johnson et al. (2009). The overlap is particularly important to consider when synthesizing and interpreting the results of the moderator analyses.

Moderator analysis

Table 2 shows the results of the moderator tests conducted on the effect sizes for each meta-analysis and the overall numbers of significant and non-significant effects are summarized in Table 3. Some dimensions were tested as moderators more often than others. Frequently tested dimensions include duration, group targeting/tailoring, and skills training (condom, intrapersonal or interpersonal).

Table 3 Number of significant and non-significant moderator effects for the mode of delivery, communicator and content dimensions

Although the numbers shown in Table 3 provide a snapshot of which dimensions were most and least likely to produce significant effects, the numbers need to be treated with caution for two reasons. Firstly, significant effects were more likely to be produced in meta-analyses with larger numbers of studies: the 123 significant effects found for the condom use/unprotected sex effect sizes came from tests conducted on an average of 100 studies [M (95 % CI) = 100 (82–118), Mdn = 40, SD = 105, n = 123], whereas the 145 non-significant effects came from tests conducted on an average of 45 studies [M (95 % CI) = 45 (37–53), Mdn = 34, SD = 50, n = 145]. Secondly, the effects are not independent of each other. As well as meta-analyses sharing the same intervention studies, some meta-analyses tested moderator effects for multiple related outcomes, for example condom use in the short, intermediate and long-term (Johnson et al., 2009), or condom use at most recent sexual intercourse and within the last 6 months (Zheng & Zheng, 2012). It is therefore important to consider not only the numbers of significant and non-significant effects, but also the sources of the effects. Accordingly, we examined whether there are features of the meta-analyses that differentiate the significant effects from the non-significant effects. Although this information can be extracted from Table 2, listing the findings for each dimension facilitates this analysis [see Online Resource 1 (condom use/unprotected sex) and Online Resource 2 (STI/HIV incidence)]. These Online Resources also report the effect sizes for the significant moderators when they were reported by the original meta-analyses. This provides a sense of the magnitude of the effects observed.
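
The confidence intervals quoted above can be approximately reproduced from the reported means, standard deviations and numbers of effects. The following Python sketch assumes a normal-approximation interval (mean ± 1.96 × SD/√n); the text does not state which method was actually used, and the function name `mean_ci95` is introduced here purely for illustration.

```python
import math


def mean_ci95(mean, sd, n, z=1.96):
    """Normal-approximation 95% confidence interval for a mean (assumed method)."""
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# Summary statistics quoted in the text: mean k, SD, number of effects.
significant = mean_ci95(100, 105, 123)
non_significant = mean_ci95(45, 50, 145)

print("significant:     %.1f to %.1f" % significant)
print("non-significant: %.1f to %.1f" % non_significant)
```

Under this assumption both intervals fall within one study of the reported limits (82–118 and 37–53). The medians (40 and 34) lie much closer together than the means, which suggests that a few very large meta-analyses account for much of the difference in mean k.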

Mode of delivery dimensions

With regard to mode of delivery dimensions, there is limited evidence that interventions of longer duration or consisting of more sessions are more efficacious. The majority of effects for duration were not significant and the 6 positive effects found for condom use/unprotected sex were obtained from 3 meta-analyses, 2 of which tested the effects of the moderator at 3 condom use follow-ups (Johnson et al., 2009; LaCroix et al., 2014; Scott-Sheldon et al., 2010). There is also no obvious distinction between the target groups or types of interventions included in these meta-analyses compared to those that produced non-significant effects.

Three out of 7 meta-analyses found that interventions delivered in a school, classroom or educational setting were less effective at reducing sexual risk behaviors, with small effect sizes (r = −.32, β = −.23, β = −.33) (Albarracin et al., 2003; Durantini et al., 2006; Huedo-Medina et al., 2010). However, since none of these three meta-analyses were restricted to interventions conducted on school- or college-aged populations, the effect of this moderator might reflect lower efficacy of interventions in recipients of this age range rather than the location of the intervention itself. This interpretation is supported by the fact that 3 of the 4 meta-analyses that produced non-significant effects of school setting had restricted their populations to adolescents (Chin et al., 2012; Johnson et al., 2003; Mullen et al., 2002). There is therefore little evidence that the setting (whether school, clinic or community) in which an intervention is delivered makes any difference to its effectiveness.

The effects of delivering an intervention in groups were also inconclusive. The 4 meta-analyses that demonstrated positive effects on condom use/unprotected sex for this moderator (Albarracin et al., 2008; Durantini et al., 2006; Neumann et al., 2002; Tan et al., 2012) do not appear to share any distinguishing features from the 10 that demonstrated negative or non-significant effects.

Communicator dimensions

Turning to the communicator dimensions, the effects of peer and expert delivery are somewhat mixed. It is worth noting that the 2 meta-analyses that produced the 3 significant negative effects on condom use/unprotected sex for peer delivery were based on populations that included a high percentage of MSM (Herbst et al., 2007; Zheng & Zheng, 2012). However, the idea that peer delivery is less effective in MSM populations is weakened by the finding that 1 of the 4 meta-analyses that produced significant positive effects was also based on an analysis of interventions designed for MSM (Higa et al., 2013). There were no observable distinctions between the meta-analyses that showed positive or negative effects of expert delivery.

Matching the person delivering the intervention according to the ethnicity, gender or age of the recipient had positive effects on intervention effectiveness in the majority of tests on condom use/unprotected sex. Matching gender produced most of the significant positive effects, although the effects were quite small. As shown in Online Resource 1, Cohen’s d effect sizes were between .14 and .38 larger when the facilitator’s gender was matched to the recipient. Although the positive significant effects for matching ethnicity and age were of a similar magnitude, they were outweighed by non-significant or negative effects. However, the non-significant effects were obtained from meta-analyses with much smaller numbers of studies—6 out of the 7 non-significant effects came from meta-analyses with fewer than 50 studies, whereas 5 out of the 6 significant positive effects came from two meta-analyses with over 200 studies (Albarracin et al., 2008; Durantini et al., 2006).

Content dimensions

The effects of group targeting/tailoring, where interventions were targeted at a specific group or tailored to enhance their applicability or acceptability to a particular group, were more likely to be positive than the effects of individual tailoring, where materials used for the intervention were tailored to each individual recipient. There were no easily observable differentiating features between the meta-analyses that showed positive effects of group targeting/tailoring and those that showed non-significant effects. However, 2 of the 3 meta-analyses that found individual tailoring to have negative effects on condom use/unprotected sex were based on interventions conducted in Asia and China (Tan et al., 2012; Zheng & Zheng, 2012).

Conducting formative research had mixed effects. Although effects on condom use/unprotected sex were positive in 2 meta-analyses, they were negative in 2. However, these negative effects were small (β = −.12, β = −.08) and not significant when all methodological and population predictors were simultaneously entered into the analysis (Albarracin et al., 2005; Durantini et al., 2006). These same meta-analyses found that using theory to design an intervention had small positive effects (β = .10, β = .12)—a finding that was shared by 2 more moderately sized meta-analyses (Herbst et al., 2005; Johnson et al., 2003).

The information content of interventions had small positive effects in 4 of the 8 tests on condom use/unprotected sex. As shown in Online Resource 1, Cohen’s d effect sizes were between .09 and .40 larger when information was provided about the mechanisms of HIV, STI/HIV transmission or disease prevention methods. However, 3 of the 4 positive effects were based on meta-analyses that shared many of the same intervention studies (Albarracin et al., 2005; Albarracin et al., 2008; Durantini et al., 2006). There was also no conclusive evidence that including a motivational enhancement component within an intervention enhanced efficacy, although the inclusion of attitudinal arguments was found to have positive effects in around half of the meta-analyses where this moderator was tested. In contrast, the inclusion of threat/fear-inducing or normative arguments may be just as likely to produce negative as positive effects. There is, however, some evidence that the use of fear might be effective with Latino groups (Albarracin et al., 2008) or within interventions conducted in groups, rather than at an individual or community level (Johnson et al., 2005). Although further research is needed to support these observations, these findings highlight how the effectiveness of some techniques might be dependent on specific population or intervention characteristics.

The most consistent moderator effects emerged for the skills components of the interventions. Although there was no evidence that interventions with a variety or mixture of skills training produced significantly larger effect sizes, coding interventions according to more specific types of training, such as training in condom skills, intrapersonal skills, and interpersonal skills, did show the potential value of these techniques. The effects were most consistent for condom skills and intrapersonal skills, with 7 small to medium sized positive effects for each moderator across a range of different meta-analyses, including 3 of the 4 that focused on adolescent/youth populations (Johnson et al., 2003; Johnson et al., 2011; Scott-Sheldon et al., 2013).

Discussion

A growing number of meta-analyses of STI/HIV prevention interventions have explored the sources of heterogeneity of effect sizes by testing the extent to which various study characteristics moderate effect sizes. This meta-review synthesizes the results from 37 meta-analyses identified through a systematic search of the published literature. A range of mode of delivery, communicator and content dimensions were examined, and consistent positive effects were found for a small number of characteristics including matching the gender or ethnicity of the communicator to the intervention recipients, group targeting or tailoring of the intervention, use of a theory to underpin intervention design, providing factual information, presenting arguments designed to change attitudes, and providing condom skills and intrapersonal skills training.

Although the use of theory moderator was not specific to a particular theory, our findings do lend support to the Information-Motivation-Behavioral Skills (IMB) model of HIV preventive behavior (Fisher & Fisher, 1992). This model proposes that information and behavioral skills are necessary, but not sufficient, for HIV prevention. People’s attitudes towards HIV prevention are also an important determinant of their motivation to initiate and maintain preventive behavior. The role of motivational enhancement and skills training was also highlighted in the meta-review conducted by Johnson et al. (2014), but the broader scope of our analysis has identified the potentially important roles of features such as matching the person delivering the intervention and targeting the content to the characteristics of the recipient. This highlights the value of designing and delivering interventions which are aimed at modifying IMB components in a group-appropriate fashion.

Also, by reporting the non-significant and negative effects alongside the positive effects, our meta-review highlights dimensions that either make no difference or could potentially compromise intervention efficacy. This includes dimensions that we might have expected to make a positive difference, such as the overall duration, number of sessions, peer delivery, tailoring to the individual, use of threat/fear induction methods, and normative arguments.

However, non-significant effects were quite prevalent and we need to be cautious about ruling out the potential value of some dimensions when in some meta-analyses the lack of significance might be attributable to lack of statistical power. We highlighted k < 20 as a small sample where lack of power might be an issue, although it should be noted that even with 20 studies the moderator effect size would need to be quite large to produce a significant effect. Meta-analyses probably need at least 50 or 60 studies to have sufficient power to detect even medium moderator effect sizes. Notably, only 10 of the 37 meta-analyses included in this meta-review were based on 50 or more studies and only three of those included literature published within the last 5 years: LaCroix et al. (2014) k = 58; Scott-Sheldon et al. (2011) k = 67; and Tan et al. (2012) k = 52. Moreover, the largest reviews, those that include over 100 studies, do not include any literature published within the last 10 years: Albarracin et al. (2005) k = 200; Albarracin et al. (2008) k = 350; Durantini et al. (2006) k = 166; Earl and Albarracin (2007) k = 180. This is probably because the most recent meta-analyses have tended to adopt increasingly restrictive inclusion criteria (i.e., focusing on particular types of interventions or population groups), which limit the potential to statistically examine moderators of intervention efficacy.
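
The dependence of power on the number of studies can be illustrated with a simple approximation. The Python sketch below treats a moderator effect as a correlation r between a study characteristic and the effect sizes of k equally weighted studies and applies the Fisher z approximation. These are simplifying assumptions made here for illustration and are not the methods used by the included meta-analyses; the value r = .30 follows Cohen's conventional benchmark for a medium effect, which is also an assumption, and the function names are hypothetical. Actual power in a meta-regression additionally depends on study weights and residual heterogeneity.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def approx_power(r, k, z_crit=1.96):
    """Approximate two-sided power to detect a moderator correlation r across k studies.

    Uses the Fisher z approximation, in which atanh(r) has standard
    error 1 / sqrt(k - 3). Studies are treated as equally weighted,
    which is a simplifying assumption.
    """
    if k <= 3:
        raise ValueError("k must exceed 3 for the Fisher z approximation")
    noncentrality = math.atanh(r) * math.sqrt(k - 3)
    return normal_cdf(noncentrality - z_crit) + normal_cdf(-noncentrality - z_crit)


# Power for an assumed medium moderator effect at several numbers of studies.
for k in (20, 50, 60, 100):
    print(k, round(approx_power(0.30, k), 2))
```

Under these assumptions, power to detect a medium moderator effect is low at k = 20 and rises steadily as more studies are included, which is consistent with the caution expressed above about interpreting non-significant effects from small meta-analyses.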

There are some limitations to this meta-review that need to be considered when interpreting the findings. Firstly, although we conducted a systematic and thorough search of the literature, we cannot rule out the possibility that relevant meta-analyses were not included. Secondly, we are reliant on the original authors’ literature search, data extraction, and analysis. Our synthesis relies not only on the thoroughness of the literature search and reliability of the coding of dimensions, but also on the adequacy and accuracy of the statistical methods used to compute effect sizes and test moderator effects. Bearing in mind that all of the meta-analyses are published in peer-reviewed journals, we have placed some faith in the assumption that the meta-analyses were conducted appropriately. However, there were some differences in the methods used to compute effect sizes (e.g., whether they were adjusted for baseline differences), and to test moderator effects (e.g., whether analyses were based on fixed, random, or mixed effects assumptions and use of Bonferroni corrected significance values), that may contribute towards the different patterns of results found between meta-analyses. There is also the possibility that we may have miscategorized the dimensions. Although the coding was checked between two researchers, the definitions used by some meta-analysts for their tested moderators were not always provided in detail. Also, some dimensions had quite broad definitions that may have picked up on subtly different issues. Group targeting/tailoring, for example, included both whether an intervention was targeted at a particular group and also whether the information was designed to be specific to the target audience. We grouped these two features together, but this could have masked different effects on intervention efficacy.

Finally, the insights gained from this meta-review are largely restricted to identifying the moderators of intervention effect sizes for behavioral outcomes like condom use, rather than biomarker-confirmed outcomes such as STI/HIV infection rates. Our insights were restricted because only 7 of the 37 meta-analyses tested moderator effects on STI/HIV incidence. If we want to demonstrate the clinical relevance of behavioral interventions, there clearly needs to be more research which evaluates the effects on STI/HIV infection rates and considers their role relative to innovations in pharmacological prevention such as pre-exposure prophylaxis (Centers for Disease Control and Prevention, 2014).

Despite its limitations, this meta-review has advanced our understanding of factors linked to improved efficacy of behavioral interventions. It has also highlighted deficiencies in the existing meta-analytic literature, including the tendency to narrow the focus and inclusion criteria. The narrow focus of many of the meta-analyses conducted in recent years has undermined the reliability of the moderator analyses that have been conducted. To further our understanding, an up-to-date and less restricted meta-analysis of the HIV prevention literature is needed. A less restricted meta-analysis might also enable not only more rigorous multivariate tests of moderating factors but also an exploration of how the intervention delivery, communicator, and content factors interact with each other and other characteristics, like the study date, type of recipients, or country the study was conducted in. This could include testing some of the interactions tentatively highlighted in this meta-review, for example whether skills-based techniques work better with adolescents or threat/fear induction messages backfire when delivered to certain cultural groups. Exploring the role of factors like the study date would also provide an indication of whether the efficacy of behavioral interventions has changed over time. This type of analysis could provide insights into whether intervention efficacy has been influenced by innovations in the design of interventions or by changing external circumstances such as improved treatment or the broader social context.

The findings of this meta-review suggest that HIV/STI prevention interventions should involve a number of features. Researchers should consider who delivers the intervention, as interventions that match the gender or ethnicity of the communicator to the recipients tend to be more successful. In terms of content, there seems to be value in designing interventions that are group targeted or tailored, use theory to underpin intervention design, provide factual information, present arguments designed to change attitudes, and provide condom skills/intrapersonal skills training. In designing interventions, it is worth noting that the duration and number of sessions did not affect intervention success. Also, expert delivery was not more successful than peer delivery. These findings have important implications for the field and highlight how less labor-intensive (and thus cheaper) interventions may be as successful as those that are more labor-intensive. The specific method of delivery might however be important and a priority for future research is to compare traditional face-to-face approaches against novel methods which use social media and mHealth applications.