Background

Health-related stigma – the co-occurrence of labeling, stereotyping, separating, status loss, and discrimination associated with a specific disease in the context of power imbalance [1] – deepens health disparities and drives population mortality and morbidity [2]. Interventions to alleviate stigma and its consequences have demonstrated effectiveness across a range of conditions, including HIV/AIDS, mental and substance use disorders, leprosy, epilepsy, and tuberculosis [3,4,5,6,7,8,9,10]. For example, social contact interventions, which facilitate interactions between individuals with a stigmatizing condition and those without it, have been shown to be effective at reducing community stigmatizing beliefs about mental health [6]; individual- and group-based psychotherapeutic interventions have been shown to reduce internalized stigma associated with HIV and mental health conditions [3, 10]; and socioeconomic rehabilitation programs have been shown to reduce stigmatizing attitudes towards people with leprosy [5]. Observed effects have tended to be small-to-moderate and limited to changes in attitudes and knowledge, with less evidence concerning long-term impacts on behavior change and health [11, 12]. Stigma can be intersectional, wherein multiple stigmatizing identities converge within individuals or groups, and effective interventions often grow complex to reflect this reality [13]. Interventions may be multi-component and multi-level [3], meaning that they may be especially difficult to implement, replicate, and disseminate to new contexts [14].

Few stigma reduction interventions move beyond the pilot phase of implementation, and those that do have tended to be in high-income countries. For example, mass media campaigns to reduce the stigma associated with mental health have been implemented at scale and sustained over time in England, Scotland, Canada, New Zealand, and Australia [11]; however, most interventions do not reach those who need them. This is especially true in low- and middle-income countries (LMICs), where reduced access to resources and lack of political support for stigma reduction interventions compound the burden and consequences of stigma [15, 16]. For example, most LMICs spend far less than needed on the provision of mental health services [17], making large-scale investment in mental health stigma reduction programs unlikely without strong evidence of affordability and sustainability. Furthermore, stigma in low-resource settings tends to be a greater impediment to accessing services than elsewhere [18]. Anti-homosexuality laws and other legislation criminalizing stigmatized identities both increase the burden of stigma and prevent the implementation of effective services and interventions [19]. The same cultural and structural factors that drive and facilitate stigmatizing attitudes threaten the credibility and uptake of the interventions themselves [20].

Implementation science seeks to improve population health by leveraging interdisciplinary methods to promote the uptake and dissemination of effective, under-used interventions in the real world [21]. The emphasis is on implementation strategies, namely on approaches to facilitate, strengthen, or sustain the delivery of evidence-based technologies, practices, and services [22, 23]. Implementation science studies use qualitative and quantitative methods to measure implementation outcomes, including acceptability, adoption, appropriateness, cost, feasibility, fidelity, penetration, and sustainability (Table 1) [24]; these are indicators of implementation success and process, proximal to service delivery and patient health outcomes. Increasingly, studies use psychometrically validated measures of implementation outcomes [25, 26]. A range of theoretical frameworks support implementation science, including those that can be used to guide the translation of research into practice (e.g., the Canadian Institutes of Health Research Model of Knowledge Translation [27]), study the determinants of implementation success (e.g., the Consolidated Framework for Implementation Research [28]), and evaluate the impact of implementation (e.g., the RE-AIM framework [29]) [30]. Depending on the level of evidence required and the research questions involved, studies fall along a continuum from effectiveness, to hybrid effectiveness-implementation [31], to implementation (Fig. 1). Whereas effectiveness studies focus a priori on generalizability and test the effect of interventions on clinical outcomes [32], hybrid study designs can be used to test intervention effects while examining the process of implementation (type 1), simultaneously test clinical interventions and assess the feasibility or utility of implementation interventions or strategies (type 2), or test implementation interventions or strategies while observing clinical outcomes (type 3) [31]. 
Non-hybrid implementation studies focus a priori on the adoption or uptake of clinical interventions in the real world [33].

Table 1 Implementation outcome definitions
Fig. 1

Continuum of study designs from effectiveness to implementation. As defined by Curran et al. [31]

Implementation science has particular relevance to the goal of delivering effective stigma reduction interventions in LMICs, offering tools to identify, explain, and circumvent barriers to implementation given severe resource constraints [34]. It can be used to study and improve complex interventions whose multiple, interacting components blur the boundaries between intervention, context, and implementation [14] and has the potential to generate evidence of affordability, scalability, and sustainability, which could be used to convince policy-makers and donors to invest in future implementation [35]. Moreover, it could bring policy-makers, providers, patients, and other stakeholders into the research process, promoting engagement around the study and delivery of interventions that may themselves be stigmatized [36]. However, the utility of implementation research depends on its rigor and replicability. To encourage growth and strength in the field of stigma implementation research, it is important to summarize previous work in the area, evaluate that rigor and replicability, and articulate priorities for future research. Our objectives were to systematically review implementation studies of health-related stigma reduction interventions in LMICs and critically assess the reporting of implementation outcomes and intervention descriptions.

Methods

We registered our systematic review protocol in the International Prospective Register of Systematic Reviews (PROSPERO #CRD42018085786) and followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [37].

Search strategy

One author (CK) searched four electronic bibliographic databases (PubMed, CINAHL, PsycINFO, and EMBASE) through November 15, 2017, for studies fulfilling four search concepts – stigma, intervention, implementation outcomes, and LMICs. We developed a list of terms for each concept in collaboration with an information scientist. The full search strategy for all databases is presented in Additional file 1. The PsycINFO search excluded dissertations, while the CINAHL search was restricted to academic journals. Finally, the reference lists of included studies were reviewed for additional publications.

Study selection

Studies in any language were included if they (1) collected empirical data, (2) evaluated the implementation of an intervention whose primary objective was to reduce stigma related to a health condition, (3) were based in a LMIC as classified by the World Bank [38], and (4) reported at least one implementation outcome as defined by Proctor et al. [24]. Studies evaluating interventions targeting stigma related to marginalized identities, behaviors, beliefs, or experiences (e.g., stigma related to race, economic status, employment, or sexual preference) were excluded if the interventions did not also target stigma related to a health condition. Unpublished and non-peer-reviewed research was excluded. Qualitative and quantitative studies were subject to the same inclusion and exclusion criteria. The Covidence tool was used to remove duplicate studies and to conduct study screening [39]. Two authors from a team of four (CK, BJ, CSK, and LS) independently screened each title, abstract, and full-text article, noting reasons for exclusion during full-text review. Studies passed the title/abstract screening stage if the title or abstract mentioned stigma reduction and if it was possible that the study had been conducted in a LMIC. Studies passed the full-text screening stage if all criteria above were met. Disagreements were resolved through discussion until consensus was reached.

Data abstraction

Two authors (CK and BJ) independently piloted a structured abstraction form with two studies; all co-authors reviewed, critiqued, and approved the form. For each study, one of three authors (CK, BJ, and CSK) abstracted study and intervention characteristics (Table 2) onto a shared spreadsheet. One of the two remaining authors verified each abstraction, and the group of three resolved any disagreement through discussion.

Table 2 Study and intervention characteristics

At the study level, we collected research questions, methods and study types, implementation research frameworks used, years of data collection, study populations, implementation outcomes reported [24], stigma, service delivery, patient health, and/or other outcomes reported, study limitations, and conclusions or lessons learned. Studies were categorized as effectiveness, type 1, 2, or 3 hybrid effectiveness-implementation [31], or implementation, according to Curran et al. [31]. We noted the stage of intervention implementation at the time of each study as either pilot/once-off, scaling up, implemented and sustained at scale, or undergoing de-implementation. Studies were considered to have used an implementation research framework if authors specified one within the introduction or methods. Implementation outcomes were defined according to Proctor et al. [24]. Patient-level service penetration – the percent of eligible patients receiving an intervention – was considered a form of penetration, though this distinction is not clear in Proctor et al. [24]. We developed a five-item rubric to assess the quality of reporting of implementation outcomes, noting whether the authors included the implementation outcomes in their study objectives; whether they specified any hypotheses or conceptual models for the implementation outcomes; whether they described measurement methods for the implementation outcomes; whether they used validated measures for the implementation outcomes [25]; and whether they reported the sample sizes for the implementation outcomes.

At the intervention level, we collected intervention names, intervention descriptions, countries, associated stigmatizing health conditions, and target populations. Interventions were categorized based on type, including information/education, skills, counselling/support, contact, structural, and/or biomedical [3]; socio-ecological level, including individual, interpersonal, organizational, community, and/or public policy; stigma domain targeted, including driver, facilitator, and/or manifestation [3]; and finally the type of stigma targeted, including experienced, community, anticipated, and/or internalized [40]. The 12-item Template for Intervention Description and Replication (TIDieR) checklist was used to evaluate the comprehensiveness of intervention description and specification by the studies in the sample [41]. TIDieR is an extension of item five of the Consolidated Standards of Reporting Trials (CONSORT), providing granular instructions for the description of interventions to ensure sufficient detail for replicability [41]. Implementation science journals encourage the use of TIDieR or other standards when describing interventions [42]. Each item in the TIDieR checklist (e.g., who provided the intervention? what materials were used?) was counted as present if any aspect of the item was mentioned, regardless of quality or level of detail. When multiple studies in the sample evaluated the same intervention, TIDieR intervention specification was assessed across the studies. Risk of bias was not assessed, as the goal was not to synthesize results across the studies in the sample.

Analysis

We calculated percentages for categorical variables and means and standard deviations (SD) for continuous variables. An implementation outcome reporting score was calculated for each study by summing the number of rubric items present and dividing by the number of applicable items. A TIDieR specification score was calculated for each intervention by summing the number of the 12 checklist items reported across studies of the same intervention and dividing by the number of applicable items. These variables were used to summarize the aims, methods, and results of the studies and interventions in the sample. Qualitative synthesis and quantitative meta-analysis of study findings were not possible, given the heterogeneity in research questions and outcomes.
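The scoring logic described above can be illustrated with a minimal sketch (not the authors' actual analysis code). Items are recorded as present, absent, or not applicable, and the score is the proportion of applicable items present; the item labels below are hypothetical names for the five rubric criteria.

```python
def reporting_score(rubric: dict) -> float:
    """Proportion of applicable rubric items that were reported.

    Items are True (present), False (absent), or None (not applicable);
    None items are excluded from both numerator and denominator.
    """
    applicable = [v for v in rubric.values() if v is not None]
    if not applicable:
        raise ValueError("No applicable rubric items")
    return sum(applicable) / len(applicable)

# Hypothetical study: three of five applicable rubric items reported.
example_study = {
    "outcomes_in_objectives": True,
    "hypothesis_or_model": False,
    "measurement_methods": True,
    "validated_measures": False,
    "sample_size_reported": True,
}

print(reporting_score(example_study))  # 3 of 5 applicable items -> 0.6
```

The same proportion-of-applicable-items calculation applies to the 12-item TIDieR specification score, with checklist items pooled across studies of the same intervention.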

Results

Study selection

We screened 5951 studies and assessed 257 full-text articles for eligibility. A total of 35 studies met all eligibility criteria (Fig. 2) [43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77] and evaluated 29 different stigma reduction interventions (Table 3).

Fig. 2

PRISMA flow diagram

Table 3 Included studies (n = 35) and associated interventions (n = 29) by year of publication

Study characteristics

The 35 studies in the sample were published between 2003 and 2017; the median year of publication was 2013 (Table 4). Study designs varied and included both qualitative and quantitative methods; 20 (57%) adopted at least one qualitative method, including interviewing, focus groups, or observation, while 8 (23%) reported results from cross-sectional surveys. One was an effectiveness study, with no a priori intent to assess implementation outcomes. The majority (32, 91%) were type 1 hybrid effectiveness-implementation studies; for example, Shah et al. [66] paired an effectiveness study with a process evaluation in order to assess provider-level acceptability and feasibility. None were type 2 or type 3 hybrid studies. Two were implementation studies; for example, Gurnani et al. [53] used routinely collected monitoring and evaluation data to assess the penetration of a structural intervention to reduce stigma around HIV/AIDS and sex work. Most (29, 83%) were evaluations of once-off or pilot implementations, while 6 (17%) evaluated implementation at scale. None evaluated an intervention undergoing scale-up, and none evaluated the process of de-implementation. No studies adopted a formal theoretical framework for implementation research.

Table 4 Study-level descriptive statistics (n = 35)

Patient, provider, or community-level acceptability (20, 57%) and feasibility (14, 40%) were the most frequently reported implementation outcomes. Though authors usually reported whether participants found activities useful, enjoyable, or difficult, they rarely described why. Penetration was also relatively common (6, 17%). In comparison, appropriateness and fidelity were reported in 5 (14%) and 4 (11%) studies, respectively, while cost and sustainability were reported twice each, and adoption was reported once. In addition to these implementation outcomes, stigma (25, 71%) and service delivery outcomes (12, 34%) were most frequently reported, while patient health outcomes were rarely assessed (7, 20%).

Implementation outcome reporting scores were low, with a mean of 40% (SD 30%); 14 (40%) studies mentioned implementation outcomes in their study objectives, while 3 (9%) prespecified a hypothesis or conceptual model to explain implementation outcomes. For example, Rice et al. [56] used diffusion of innovation theory to inform their hypothesis about the penetration of messaging in intervention settings. Though 28 (80%) studies described methods for collecting implementation outcomes and 24 (69%) documented a sample size for those outcomes, none used validated measures of implementation outcomes in their quantitative data collection.

Intervention characteristics

Of the 29 interventions in the sample, 18 (62%) were implemented in sub-Saharan Africa (Table 5), 20 (69%) focused on stigma related to HIV/AIDS, and fewer addressed mental health (3, 10%), leprosy (2, 7%), or other conditions (6, 21%); the majority (28, 97%) used information or education to reduce stigma. For example, the Tchova Tchova program in Mozambique broadcast HIV education over the radio, including a debate segment where listeners could pose questions to an HIV specialist [72]. Skill- and capacity-building were the next most common types of stigma reduction interventions (13, 45%), followed by counseling (6, 21%) and contact events (6, 21%). The Stigma Assessment and Reduction of Impact program in Indonesia, for instance, taught participatory video production skills to people affected by leprosy [67, 68], while the Trauma-Focused Cognitive Behavioral Therapy program in Zambia counseled orphans and vulnerable children to reduce shame-related feelings around sexual abuse [61,62,63]. Few interventions used structural (1, 3%) or biomedical (1, 3%) approaches to reduce stigma. The drivers of stigma were targeted by 28 (97%) interventions, while few targeted its facilitators (4, 14%) or manifestations (10, 34%). In Senegal, the HIV Prevention 2.0 study targeted all three through its Integrated Stigma Mitigation Intervention approach, wherein drivers related to knowledge and competency of service providers, facilitators related to peer support and peer-to-peer referral, and manifestations related to individual self-stigma and self-esteem [76]. Most interventions (24, 83%) focused on reducing community stigma, while fewer targeted experienced (11, 38%), anticipated (7, 24%), or internalized stigma (9, 31%). For example, the Indian film Prarambha was produced to raise awareness about HIV and designed to be viewed by individuals in HIV-vulnerable communities, thus targeting a driver of community stigma related to HIV [58].
While many interventions operated at the individual (23, 79%) and interpersonal levels (14, 48%), fewer were implemented at the community (11, 38%), organizational (6, 21%), or public policy (1, 3%) levels. Several interventions at the community, organizational, or public policy level specifically targeted the structural drivers of health-related stigma among key or vulnerable populations. In another example from India, the Karnataka Health Promotion Trust organization educated female sex workers on their legal rights and implemented sensitization and awareness training with government officials, police, and journalists [53].

Table 5 Intervention-level descriptive statistics (n = 29)

Adherence to the TIDieR checklist for reporting interventions was uneven. On average, interventions met 60% (SD 10%) of the TIDieR criteria. All interventions specified how they were delivered (e.g., face-to-face, remotely, individually, or in a group), and the majority offered a rationale to justify the intervention (28, 97%) and described the procedures involved in delivering intervention components (28, 97%). Few interventions (5, 17%) documented how they were tailored to different target groups or contexts, and only 2 (7%) described modifications that took place over the course of implementation.

Discussion

We systematically reviewed implementation research conducted in support of stigma reduction interventions in LMICs. A broad, inclusive definition of implementation research was used, considering any studies that reported implementation outcomes while evaluating stigma reduction interventions. Few studies were found; most evaluated interventions to reduce HIV-related stigma, took place in sub-Saharan Africa, and assessed pilot or once-off implementations. The interventions in the sample were diverse, adopting a variety of tactics to reduce stigma, though those that had been implemented at scale tended to incorporate mass media or target structural changes rather than individual-level support or service delivery. Further, none took a trans-diagnostic approach seeking to reduce stigma associated with multiple health conditions.

A critical assessment of these studies suggested three key gaps in the literature. First, no study in the sample explicitly incorporated a conceptual framework for implementation research, evaluated implementation strategies using a type 2 or 3 hybrid study design, or used validated measures of implementation outcomes. Second, most studies focused on intervention acceptability and feasibility, and few assessed adoption, appropriateness, cost, fidelity, penetration, or sustainability. Third, intervention descriptions were sparse and often lacked the key details necessary for the eventual replication and adoption of those interventions. These gaps were consistent across the different stigmatizing health conditions – coverage of robust methods for implementation research was not greater among studies of interventions targeting any particular condition.

Theoretical frameworks, validated measures, and rigorous methods support the generalizability and ultimately promote the utility of implementation research [78]. Implementation science is a rapidly growing field, though essentially all available frameworks and measures for implementation determinants and outcomes have been developed in high-income countries [25, 30, 79]. Frameworks like the Consolidated Framework for Implementation Research are increasingly popular and have produced actionable results to enhance implementation in high-resource settings [80,81,82,83], though they may need to be translated and adapted to support implementation of stigma reduction and other complex interventions in LMICs. Improvements to measurement could also promote the comparability of findings across future stigma implementation studies, accelerating knowledge production in the field and easing the translation of findings into practice [84]. Robust measures are increasingly available [25], including measures of acceptability, appropriateness, feasibility [85], and sustainability [86, 87], though there is a major need for continued development and validation to ensure these are relevant to stigma interventions and valid in LMIC settings. With such measures and frameworks in hand, LMIC-based stigma researchers could start to assess how patient-, provider-, facility-, and community-level characteristics predict implementation outcomes. Such studies would help determine, for example, the projected health sector cost of providing in-service stigma reduction training to clinicians, or the patient-level factors associated with preference for peer counselors over lay counselors. Subsequent type 2 and 3 effectiveness-implementation hybrid study designs could compare implementation strategies and observe changes in relevant outcomes [31], for example, experimenting with the counselor cadre and assessing relative levels of adoption. 
Of course, for all this to be feasible, capacity-building and funding for implementation science among stigma researchers in LMICs is critical. Few opportunities for training and support of LMIC-based implementation researchers are currently available [88].

Future research (Box 1) will need to assess the complete range of implementation outcomes to further strengthen the evidence base for the delivery and scale-up of effective stigma reduction interventions. Studies in this sample concentrated on assessing acceptability and feasibility and rarely measured other implementation outcomes. For example, only five studies measured provider- or facility-level adoption or penetration. As such, little is known about the factors associated with the uptake of stigma reduction interventions by health facilities, staff, patients, or communities in LMICs. Appropriateness, fidelity, cost, and sustainability were also seldom evaluated. Appropriateness is important because uptake of an intervention is unlikely unless community members, patients, and providers perceive its utility and compatibility with their other activities. One study used an innovative approach to improve the appropriateness of a stigma reduction intervention by involving community members with leprosy as staff members to inform study design and implementation [67]. Another asked community members to help select and tailor intervention components to address local concerns [61]. Fidelity has been shown to be critical to ensuring that effectiveness is maximized and successful outcomes are replicable across settings [89]. Evidence of cost and cost-effectiveness is necessary to justify scale-up and funding by health systems and donors. Finally, sustainability ensures investments into stigma reduction efforts are not wasted [90, 91].

Detailed, transparent descriptions of interventions in manuscripts and supplemental materials are also important to ensure others can replicate the work and achieve comparable results to those seen in effectiveness studies [92]. The majority of stigma interventions in the sample performed well against the TIDieR criteria, offering some description of the who, what, when, where, and why of intervention delivery [41], though descriptions were generally sparse, and few manuscripts offered links to formal manuals or protocols detailing intervention content and procedures. This is consistent with other reviews highlighting deficiencies in the comprehensive reporting of processes for complex interventions [93]. Moreover, few studies in the sample reported on intervention tailoring, modifications that were made over the course of the study, or fidelity assessment. Stigma is multi-dimensional; as a result, successful stigma interventions are complex, operating across multiple components and socio-ecological levels [15]. Complex interventions like these work best when peripheral components are tailored to local contexts [94]; it is therefore important to define the core, standardized parts of an intervention, and those that can be or have been adapted to suit local needs. As noted above, fidelity assessment is important to ensuring effectiveness; more frequent reporting of fidelity would serve both to increase the range of implementation outcomes assessed and to improve performance against the TIDieR criteria. Future stigma implementation research could ease the translation of findings into practice and deepen intervention specification by providing intervention materials as manuscript appendices, comprehensively documenting and reporting adaptations or modifications to interventions, and incorporating fidelity assessment into implementation and evaluation [95].

This review had several limitations. First, studies of interventions with stigma reduction as a secondary objective or incidental effect were excluded, though many interventions have immense potential to reduce health-related stigma even if stigma reduction is not their primary goal. For example, integration of services to address stigmatizing conditions into primary care and other platforms (e.g., primary mental health care [96] or prevention of vertical transmission of HIV as part of routine antenatal care [97]) may improve service delivery and patient health outcomes and de-stigmatize the associated condition. Evaluations of the implementation of these approaches exist (e.g., using interviews to assess acceptability and feasibility of vertical transmission prevention and antenatal service integration in Kenya [98]) but were not captured by this review. Second, studies conducted in high-income countries were excluded, though they may represent a significant proportion of stigma implementation research. This review focused on the unique challenge of studying the implementation of stigma-specific interventions in LMICs, where there is a large burden of unaddressed stigma as well as significant financial and logistical constraints on delivering such interventions. Third, this review was focused on implementation science, seeking to develop generalizable knowledge beyond the individual context under study. Therefore, unpublished and non-peer-reviewed studies were excluded. We recognize that barriers to publication in academic journals are greater for investigators in LMIC settings. To limit bias against non-English speaking investigators, we did not restrict our search on the basis of language. Finally, the assessment of implementation outcomes by studies in the sample was too sparse to draw strong conclusions about factors that promote or inhibit successful and sustained implementation at scale.

Conclusion

Implementation science has the potential to support the development, delivery, and dissemination of stigma reduction interventions in LMICs, though usage to date has been limited. Rigorous stigma implementation research is urgently needed. There are clear barriers to successful implementation of stigma reduction interventions, especially in LMICs. Given these barriers, implementation science can help maximize the population health impact of stigma reduction interventions by allowing researchers to test and refine implementation strategies, develop new approaches to improve their interventions in various settings, explore and understand the causal mechanisms between intervention and impact, and generate evidence to convince policy-makers of the value of scale-up [99]. Such research will help us deliver on the promise of interventions to alleviate the burden of stigma worldwide.