Stigma is defined as a socially discrediting attribute, making an individual ‘different’ and ‘less desirable’ to others (Goffman, 1963). Public stigma of alcohol use disorder (AUD) is prevalent and resistant to change in the UK and other countries (Crisp et al., 2005; Kilian et al., 2021; Schomerus et al., 2014). Among people with AUD, this negatively impacts psychological wellbeing (Glass et al., 2013; Hunter et al., 2017; Smith et al., 2010), treatment seeking (Finn et al., 2023; Keyes et al., 2010; Probst et al., 2015; Wallhed Finn et al., 2014) and recovery (Crapanzano et al., 2018; Luoma et al., 2014; Mak et al., 2017).

There is a need for more trials on public anti-stigma interventions for AUD. Public stigma underpins other forms of stigma (Corrigan et al., 2011; Evans-Lacko et al., 2012; Pryor & Reeder, 2011; Schomerus et al., 2011) and can potentially be countered with briefer, more scalable interventions. Yet, existing anti-stigma intervention research for substance use disorders focuses on self- and structural stigma, with only three studies hitherto targeting public stigma (Livingston et al., 2012; Tostes et al., 2020). These provided limited intervention descriptions and investigated short-term outcomes without control groups (Luty et al., 2007, 2008, 2009), preventing attribution of observed changes to the interventions implemented. The present study seeks to address this gap by evaluating anti-stigma interventions for AUD. Given the limited research available on addiction-specific anti-stigma interventions, the interventions were developed based on the more extensively researched area of mental illness stigma (Corrigan et al., 2017), in line with the Medical Research Council’s framework for developing and evaluating interventions (Skivington et al., 2021). These typically rely on three strategies: protest, education and contact.

Protest highlights injustices of stigma, contact facilitates interactions between ingroups (e.g., the public) and outgroups (e.g., people with neuropsychiatric disorders), and education provides information that challenges stigmatising beliefs (Corrigan, 2015). While protest can have unintended rebound effects, contact and education demonstrate individual and combined effectiveness (Corrigan et al., 2012; Griffiths et al., 2014; Morgan et al., 2018; Thornicroft et al., 2016), though their relative effectiveness is unclear. Contact was previously recognised as superior to education (Corrigan et al., 2012), but recent evidence suggests no clear advantage of either and similar effects between combined and sole interventions (Morgan et al., 2018).

Accordingly, this pilot study gathers preliminary evidence for the immediate and sustained relative efficacy of different interventions, namely education, contact and combined education and contact, in reducing the UK public stigma of AUD. To lay the foundations for a larger future trial, this study aimed to assess the feasibility of both the procedures implemented and the nature of each intervention (Skivington et al., 2021).

Brief videos were used as these allow high intervention fidelity and easy dissemination at reduced cost while offering comparable effectiveness to face-to-face mediums (Maunder & White, 2019; Morgan et al., 2018). Aiming to tackle stigma of severe AUD, the study used the term alcohol dependence with participants (Saunders et al., 2019) to delineate severe alcohol problems with accessible language.

The interventions’ expected effects were underpinned by Link and Phelan’s (2001) theory of stigma, which holds that stigmatisation takes place when an individual’s human differences are labelled (e.g., ‘alcoholic’) and, through dominant cultural beliefs, linked to undesirable stereotypes (e.g., ‘lazy’). These in turn mark the individual as different and lead them to experience status loss and discrimination. By linking people with AUD to counter-stereotypic attributes, the videos were expected to disrupt this stigmatisation process and reduce stigma.

The education video used a widely adopted myth-fact format (Corrigan & Shapiro, 2010), which drew on attribution theory to influence participants’ feelings and behaviours by challenging inaccurate stereotypes about AUD with facts (Weiner, 1986). Drawing on contact theory, the contact video sought to disconfirm stereotypes of AUD by using indirect contact with people with past experience of AUD sharing their stories (Allport, 1954). ‘Moderate disconfirmation’ was favoured over ‘strong disconfirmation’ to align with the model of stereotype change (Rothbart & John, 1985), which holds that strongly disconfirming stories (e.g., a sole focus on recovery) lead ingroup members to subtype the counterstereotypic group member as ‘deviant’, blocking generalisation to the outgroup as a whole. Meanwhile, moderately disconfirming stories (e.g., that also describe symptoms and challenges) have been shown to facilitate greater stereotype change (Reinke et al., 2004).

Based on existing literature and theoretical frameworks, the study’s hypotheses were as follows:

  • H1: There will be a significant difference in mean stigma scores between groups, which will depend on the interaction between time-point and intervention group

  • H2: There will be a significant difference in the intervention groups’ mean stigma scores at the different time-points (pre-test, post-test and follow-up) vs. the control group

Methods

Design

This study was a quasi-randomised controlled pilot trial of parallel groups (three anti-stigma interventions and a control) with repeated stigma measurement at pre-test, post-test and follow-up. Participants were recruited and paid through the online crowd-sourcing platform Prolific. Inclusion criteria were being aged 18+, English-speaking, living in the UK and consenting to take part. Individuals with experience of AUD (both personal and indirect) were able to participate. Quasi-random allocation into groups was achieved by running four sequential recruitment phases (one for each group), each closing before the next opened and showing identical recruitment information to ensure participants were blinded. Each group was capped at 150 completing participants, with repeat enrolment barred via Prolific.

Procedures

City, University of London’s Ethics Committee approved the study, which comprised two stages. In stage one, eligible participants who provided informed consent completed demographic information and stigma measures before individually watching either an education (EV), contact (CV), combined education and contact (CombV) or control (CtrlV) video online. They then answered an attention check question, repeated the measures and completed feasibility questions before viewing debrief information with support resources. At a 1-month follow-up (stage two), participants were invited via Prolific IDs to repeat the measures. Participants from all rounds then received further debrief information. Materials and interventions were hosted via Prolific on the online survey platform Qualtrics and on the video-hosting platform MediaSpace.

Measures

Demographic

Participants completed demographic information, including their age, gender, ethnicity and experience (direct and indirect) of AUD.

Outcome Measures

Two self-report measures of public mental illness stigma were used: the Attribution-Questionnaire-27 (AQ-27) and the social distance scale (SDS). Both include items about a vignette subject with mental illness. Our study used a modified vignette with both, like other studies examining addiction-related stigma (Abdullah & Brown, 2020; Abraham et al., 2013; Janulis et al., 2013; Pescosolido et al., 2010). This depicted a man (‘John’) with symptoms meeting DSM-IV criteria for alcohol dependence (American Psychiatric Association, 2000), and is presented in Supplementary Online Materials (SOM) Figure 1.

AQ-27. The AQ-27 includes 27 items representing nine subscales: blame, anger, pity, help, dangerousness, fear, avoidance, segregation and coercion. Participants rate items on a 9-point Likert scale ranging from 1 (low agreement) to 9 (high agreement) (Corrigan, 2008). Scores are summed to create an AQ score (27 to 243). Higher scores indicate higher stigma. The AQ-27 demonstrates reliability and validity (Pinto et al., 2012), demonstrating good test–retest reliability across subscales (r > 0.75) and convergent validity with social distance measures (Brown, 2008).

SDS. The SDS measures intentions to distance socially from the vignette subject (Link, 1987; Penn et al., 1994). Its seven items are rated on a 4-point Likert scale ranging from 0 (definitely willing) to 3 (definitely unwilling). Responses are summed and divided by seven to generate an SDS score (0 to 3). Higher scores indicate greater stigma. The SDS demonstrates validity and good reliability, with internal consistency ranging from α = 0.75 to α = 0.92 (Link, 1987; Wei et al., 2015).
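The scoring rules described for the two measures can be sketched as follows. This is a minimal illustration rather than the scoring procedure actually used in the study: the function names `aq27_total` and `sds_mean` and the example responses are hypothetical, and the sketch assumes that no items are reverse-scored before summing, which the text above does not specify.

```python
def aq27_total(items):
    """Sum 27 AQ-27 ratings (each 1-9) into a total score (27-243)."""
    if len(items) != 27:
        raise ValueError("AQ-27 requires exactly 27 item ratings")
    if any(not 1 <= r <= 9 for r in items):
        raise ValueError("AQ-27 ratings must lie between 1 and 9")
    return sum(items)


def sds_mean(items):
    """Sum seven SDS ratings (each 0-3) and divide by seven (score 0-3)."""
    if len(items) != 7:
        raise ValueError("SDS requires exactly 7 item ratings")
    if any(not 0 <= r <= 3 for r in items):
        raise ValueError("SDS ratings must lie between 0 and 3")
    return sum(items) / 7


# Hypothetical responses, for illustration only.
aq_example = [5] * 27
sds_example = [0, 1, 1, 2, 2, 3, 3]

print(aq27_total(aq_example))          # total on the 27-243 scale
print(round(sds_mean(sds_example), 2)) # mean on the 0-3 scale
```

On both measures a higher value indicates greater stigma, so the two functions return scores that point in the same direction.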

Feasibility Questionnaire. A feasibility questionnaire was developed (SOM Table I). This contained eleven quantitative and two qualitative questions assessing Orsmond and Cohn’s (2015) feasibility research objectives, such as evaluating the suitability of study procedures. Participants rated most items using a 5-point Likert scale ranging from strongly disagree (1) to strongly agree (5), with items about procedure duration rated on a 3-point Likert scale ranging from too short (1) to too long (3).

Interventions

Education

Following a typical myth-fact structure (Corrigan et al., 2001), the 4-min education video countered five common myths about AUD: that AUD only affects certain groups, and that people with AUD cannot recover, are to blame for their problems, can control their drinking and don’t care about others.

Myth-fact pairs were developed from a Delphi study where UK-based experts in AUD (five academics, five clinicians and eight experts-by-experience) listed common myths and facts about AUD and rated their importance for inclusion in an anti-stigma intervention. The video was an animated explainer created with an animation software (Vyond), since this medium benefits learning and engagement (Berney & Bétrancourt, 2016; Höffler & Leutner, 2007; Shahbaznezhad et al., 2021).

Using theory to guide intervention development optimises effectiveness (Craig et al., 2008; Michie et al., 2011). The education video was guided by attribution theory, which holds that attributions about a person with a stigmatised condition (e.g., ‘he is to blame for his AUD’) shape affective (e.g., anger) and behavioural responses (e.g., diminished helping behaviour) (Jones, 1984; Weiner, 1986). It was expected that by challenging stigmatising attributions, the video would improve participants’ feelings and intended behaviour towards people with AUD.

A copy of the video is available on request. SOM Table II presents its content.

Contact

The 4-min contact video showed clips from online recorded interviews with three public speakers sharing personal experiences of AUD. Speakers were sent recruitment information via social media before consenting to interviews. They comprised two males and one female, aged 40–60 years old, with white (× 2) and mixed white and black Caribbean (× 1) ethnicities.

The interview guide was based on a systematic review of common themes in effective contact interventions for public stigma of neuropsychiatric disorders. It comprised: Section 1: introductions, background, symptoms, challenges; Section 2: acceptance/treatment, recovery, achievements; and Section 3: ongoing challenges and hope.

The contact hypothesis guided the video. This proposes that contact between ingroups (e.g., the public) and outgroups (e.g., people with AUD) reduces ingroup prejudice through enhanced knowledge (Allport et al., 1979). Balancing the discussion of symptoms and challenges with successes indicative of recovery was intended to moderately disconfirm AUD-related stereotypes (Rothbart & John, 1985). While certain other ‘optimal’ conditions, like group cooperation, may enhance contact effects, these are not necessary for effectiveness (Pettigrew & Tropp, 2006). It was therefore expected that mere exposure to the outgroup would reduce stigma.

SOM Table III presents the interview guide and video content.

Combined

The 8-min combined video presented the education and contact interventions in sequence, with education first to enable deeper processing of combined content (Chan et al., 2009).

Control

The 4-min control video presented educational content about tornadoes.

Analyses

Data were analysed using SPSS. Statistical significance for tests was defined at a 95% confidence level and α = 0.05.

An a priori power analysis was conducted using G*Power. No studies with a comparable research design were retrieved. A significance criterion of α = 0.05 and power = 0.80 meant the minimum sample needed was 100 with a medium effect size and 460 with a small effect size (Cohen, 1988). Accordingly, 600 participants were targeted for stage one, providing contingency against a smaller effect size and dropout before follow-up, typically 20–40% in comparable studies (Keith et al., 2017).

Participants failing attention checks were excluded from analysis.

Boxplots were inspected for outliers greater than 1.5 × the interquartile range (or greater than 3 × the interquartile range for extreme outliers) above the third quartile or below the first quartile. Shapiro–Wilk, Levene’s and Mauchly’s tests were run to check assumptions of normality, homogeneity of variance and sphericity required for parametric analyses.
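The boxplot rule described above can be sketched as follows. This is an illustrative sketch only: the study used SPSS, whose quartile definition may differ slightly from the `inclusive` method of Python's `statistics.quantiles` assumed here, and the function names and example scores are hypothetical.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences at k x IQR beyond the first/third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values):
    """Label values beyond 1.5 x IQR as 'outlier' and beyond 3 x IQR as 'extreme'."""
    low, high = tukey_fences(values, k=1.5)
    xlow, xhigh = tukey_fences(values, k=3.0)
    flagged = []
    for v in values:
        if v < xlow or v > xhigh:
            flagged.append((v, "extreme"))
        elif v < low or v > high:
            flagged.append((v, "outlier"))
    return flagged


# Hypothetical scores, for illustration only.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 40]
print(tukey_fences(scores))   # fences at 1.5 x IQR
print(flag_outliers(scores))  # values beyond each threshold
```

With these example scores, the quartiles are 3.5 and 8.5, so the 1.5 × IQR fences fall at −4 and 16 and the 3 × IQR fences at −11.5 and 23.5.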

Thirty-three outliers were identified (12 SDS below the threshold; 21 AQ above the threshold) (SOM Figure 2). One extremely low outlier in the SDS was included in the analysis since statistical tests did not yield significantly different results when it was included versus excluded.

Tests of normality are in SOM Figure 3. A Shapiro–Wilk test found the SDS distribution departed significantly from normality at pre-test, W (539) = 0.94, p ≤ 0.001; post-test, W (539) = 0.96, p ≤ 0.001; and follow-up, W (539) = 0.97, p ≤ 0.001, as did the distribution of the AQ at pre-test, W (539) = 0.99, p ≤ 0.005; post-test, W (539) = 0.98, p ≤ 0.001; and follow-up, W (539) = 0.99, p ≤ 0.001. This was corroborated by histograms and skewness.

Since the normality assumption was violated, square root transformation was applied to the dependent variable for all groups.

Data transformation did not rectify the normality assumption violation for the SDS at pre-test, W (539) = 0.96, p ≤ 0.001; post-test, W (539) = 0.97, p ≤ 0.001; or follow-up, W (539) = 0.98, p ≤ 0.001. Results of the two-way mixed analysis of variance (ANOVA) on the original SDS data are reported, as ANOVAs are robust to violations of normality assumptions (Lakens, 2022).

Data transformation rectified the violation for the AQ at pre-test, W (539) = 0.10, p = 0.975; post-test, W (539) = 0.10, p = 0.149; and follow-up, W (539) = 0.10, p = 0.334. Results of the two-way mixed ANOVA on the transformed AQ data are thus reported.

Levene’s tests (SOM Figure 4) found the homogeneity of variances assumption was violated for both measures. For the SDS, equal variances were indicated at follow-up, F (3, 535) = 0.904, p = 0.439, but unequal variances at pre-test, F (3, 535) = 4.04, p = 0.007 and post-test, F (3, 535) = 6.16, p < 0.001. For the AQ, equal variances were indicated at pre-test, F (3, 535) = 1.18, p = 0.137, and at follow-up, F (3, 535) = 2.63, p = 0.049; but unequal variances at post-test, F (3, 535) = 3.56, p = 0.014. The results of the mixed ANOVAs are nonetheless reported as groups were similar sizes, which reduces the type I error rate (Lakens, 2022).

Mauchly’s test (SOM Figure 5) indicated the sphericity assumption was violated for the AQ, χ2(2) = 47.20, p ≤ 0.001 and SDS, χ2(2) = 22.01, p ≤ 0.001.

For each measure, a two-way mixed ANOVA was conducted with video condition (EV, CV, CombV, CtrlV) and time (pre-test, post-test, follow-up) as between- and within-subject factors respectively.

Where a significant interaction effect was present, simple main effects through pairwise comparisons were calculated to explore how each group was differentially effective at each level of time. A Greenhouse–Geisser correction was applied to the ANOVAs given the violation of the sphericity assumption.

Five-point Likert feasibility questions were analysed based on the percentage of participants scoring ‘agree’ (4) or ‘strongly agree’ (5). Three-point Likert questions were analysed through percentage of participants selecting each score. Qualitative responses were too short for comprehensive analysis but reviewed to identify frequently occurring themes.
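The two summaries described for the Likert feasibility items can be sketched as below. This is a minimal illustration, not the analysis script used in the study; the function names and the example ratings are hypothetical.

```python
from collections import Counter


def percent_agree(ratings):
    """Percentage of 5-point ratings that are 'agree' (4) or 'strongly agree' (5)."""
    agree = sum(1 for r in ratings if r >= 4)
    return 100 * agree / len(ratings)


def percent_by_option(ratings, options=(1, 2, 3)):
    """Percentage of responses selecting each option of a 3-point item."""
    counts = Counter(ratings)
    return {opt: 100 * counts[opt] / len(ratings) for opt in options}


# Hypothetical ratings, for illustration only.
five_point = [5, 4, 4, 3, 2, 5, 4, 1, 5, 4]
three_point = [2, 2, 2, 3, 1, 2, 2, 2, 2, 2]  # 1 = too short, 3 = too long

print(percent_agree(five_point))       # share agreeing or strongly agreeing
print(percent_by_option(three_point))  # share selecting each duration option
```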

Results

Participants

Six hundred and thirty-nine participants were recruited from May to June 2022. Six hundred met eligibility criteria, completed stage one and passed attention checks—150 in each group (EV, CV, CombV, CtrlV). Demonstrating a 90% retention rate, 539 participants completed stage two in July 2022 (Fig. 1), which was a sufficient sample (achieved power, 1 − β = 1.00). Table 1 provides demographic information, which was broadly comparable across groups.
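As a quick arithmetic check of the figures reported above, the retention rate and its relation to the a priori sample targets can be computed as follows. The function name is hypothetical; the counts (600, 539) and the minimum sample for a small effect (460) are taken from the text.

```python
def retention_rate(completed_followup, completed_stage_one):
    """Percentage of stage-one completers who also completed follow-up."""
    return 100 * completed_followup / completed_stage_one


stage_one = 600          # completed stage one and passed attention checks
follow_up = 539          # completed stage two
min_small_effect = 460   # a priori minimum sample for a small effect size

rate = retention_rate(follow_up, stage_one)
print(round(rate, 1))                 # retention, in percent
print(round(100 - rate, 1))           # attrition, in percent
print(follow_up >= min_small_effect)  # whether the follow-up sample meets the target
```

The resulting attrition of roughly 10% sits below the 20–40% dropout anticipated when the recruitment target was set.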

Fig. 1 CONSORT flow diagram

Table 1 Participant demographic information

Intervention Effects

Stigma Scores Significantly Reduced Across Intervention Groups vs. Pre-test at Both Post-test and Follow-up Despite Significantly Increasing Between Post-test and Follow-up

AQ stigma (Table 2) significantly reduced for intervention groups between pre-test and both post-test (EV, MD = 1.18, SE = 0.07, p ≤ 0.001; CV, MD = 1.01, SE = 0.07, p ≤ 0.001; CombV, MD = 1.31, SE = 0.07, p ≤ 0.001) and follow-up (EV, MD = 0.66, SE = 0.09, p ≤ 0.001; CV, MD = 0.37, SE = 0.08, p ≤ 0.001; CombV, MD = 0.69, SE = 0.08, p ≤ 0.001) despite significantly increasing between post-test and follow-up (EV, MD = −0.52, SE = 0.09, p ≤ 0.001; CV, MD = −0.64, SE = 0.08, p ≤ 0.001; CombV, MD = −0.62, SE = 0.09, p ≤ 0.001).

Table 2 Pairwise comparisons for time (AQTr)

SDS stigma (Table 3) significantly reduced for intervention groups between pre-test and both post-test (EV, MD = 0.36, SE = 0.03, p ≤ 0.001; CV, MD = 0.34, SE = 0.03, p ≤ 0.001; CombV, MD = 0.47, SE = 0.03, p ≤ 0.001) and follow-up (EV, MD = 0.24, SE = 0.03, p ≤ 0.001; CV, MD = 0.24, SE = 0.03, p ≤ 0.001; CombV, MD = 0.32, SE = 0.03, p ≤ 0.001) despite significantly increasing between post-test and follow-up (EV, MD = −0.12, SE = 0.04, p = 0.002; CV, MD = −0.10, SE = 0.04, p = 0.007; CombV, MD = −0.15, SE = 0.04, p ≤ 0.001).

Table 3 Pairwise comparisons for time (SDS)

AQ and SDS stigma for the CtrlV significantly reduced between pre-test and follow-up (AQ: MD = 0.27, SE = 0.09, p = 0.002; SDS: MD = 0.14, SE = 0.03, p ≤ 0.001). AQ stigma significantly reduced between pre- and post-test (MD = 0.22, SE = 0.07, p ≤ 0.001) but not SDS stigma (p > 0.05), whereas SDS stigma significantly reduced between post-test and follow-up (MD = 0.13, SE = 0.04, p ≤ 0.001) but not AQ stigma (MD = 0.05, SE = 0.09, p = 0.530).

Significant Time-Group Interaction Effect Observed: Intervention Groups Displayed Significantly Reduced Stigma Relative to the Control Group at Post-assessment and Follow-up

A significant time-group interaction effect was observed for the AQ, F (5.53, 986.54) = 20.59, p < 0.001, and the SDS, F (5.77, 1028.47) = 16.67, p < 0.001, both with a medium effect size (partial η2 = 0.10, partial η2 = 0.085 respectively), meaning the study’s hypothesis (1) that there would be an interaction effect between time and group was supported.
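The reported effect sizes can be cross-checked from the F statistics and degrees of freedom using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below is an illustrative check, not part of the original analysis. The uncorrected interaction degrees of freedom, (4 − 1) × (3 − 1) = 6, are an assumption derived from the four groups and three time points, and the implied epsilon is an inference from the fractional degrees of freedom rather than a value reported in the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def implied_epsilon(df_corrected, df_uncorrected):
    """Sphericity correction factor implied by corrected vs. uncorrected df."""
    return df_corrected / df_uncorrected


# Values reported for the time-group interaction.
aq_eta = partial_eta_squared(20.59, 5.53, 986.54)
sds_eta = partial_eta_squared(16.67, 5.77, 1028.47)

# Assumed uncorrected interaction df: (4 groups - 1) * (3 time points - 1).
df_uncorrected = (4 - 1) * (3 - 1)

print(round(aq_eta, 3), round(sds_eta, 3))
print(round(implied_epsilon(5.53, df_uncorrected), 2),
      round(implied_epsilon(5.77, df_uncorrected), 2))
```

Both recomputed values agree with the reported partial η² figures to within rounding of the inputs, and the fractional degrees of freedom are consistent with a sphericity correction having been applied to the interaction tests.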

Pairwise comparisons by group are in Table 4 (AQ) and Table 5 (SDS).

Table 4 Pairwise comparisons for group (AQTr)
Table 5 Pairwise comparisons for group (SDS)

While mean AQ stigma in the CtrlV significantly reduced at post-test, mean AQ and SDS stigma were significantly lower in the intervention groups than in the CtrlV at post-test: EV (AQ: MD = −0.97, SE = 0.18, p ≤ 0.001; SDS: MD = −0.45, SE = 0.07, p ≤ 0.001), CV (AQ: MD = −1.22, SE = 0.18, p ≤ 0.001; SDS: MD = −0.41, SE = 0.07, p ≤ 0.001) and CombV (AQ: MD = −1.24, SE = 0.18, p ≤ 0.001; SDS: MD = −0.47, SE = 0.07, p ≤ 0.001). Stigma reductions at that time were similar across intervention groups despite marginal differences (all ps > 0.13).

While mean stigma in the CtrlV continued decreasing at follow-up, mean AQ and SDS stigma were still significantly lower in the intervention groups than in the CtrlV: EV (AQ: MD = −0.40, SE = 0.18, p = 0.028; SDS: MD = −0.20, SE = 0.06, p ≤ 0.001), CV (AQ: MD = −0.53, SE = 0.18, p = 0.003; SDS: MD = −0.18, SE = 0.06, p = 0.004) and CombV (AQ: MD = −0.57, SE = 0.18, p = 0.001; SDS: MD = −0.19, SE = 0.06, p = 0.003).

Consequently, the study’s hypothesis (2) that there would be a significant difference in the intervention groups’ mean stigma scores across time versus the CtrlV was supported. Again, there were no significant differences in mean AQ and SDS stigma between intervention groups (all ps > 0.31). For example, stigma in the CombV was lower, but not significantly so, than in the CV (AQ: MD = −0.05, SE = 0.18, p = 0.796; SDS: MD = −0.01, SE = 0.06, p = 0.86).

As differences between time points and intervention groups were not consistent, as shown by Fig. 2a (AQ) and Fig. 2b (SDS), neither a main effect of time nor group on mean stigma appears present.

Fig. 2 All interventions were effective in reducing participant stigma relative to a control group. A, B Participants watching an intervention video showed significantly reduced stigma relative to the control group immediately after watching the video at post-test (time point 2, all ps < 0.01). The impact reduced at follow-up (time point 3), but remained significantly lower than pre-test (time point 1) and relative to the control group for all interventions (all ps < 0.05). All intervention types were found to have broadly comparable impact, with minimal difference in change in stigma scores at either post-test or follow-up (all ps > 0.31).

Feasibility

Feasibility of the study’s procedures and interventions (Tables 6 and 7) is summarised below against Orsmond and Cohn’s (2015) evaluation areas for feasibility research.

Table 6 Feasibility quantitative responses
Table 7 Feasibility selected qualitative responses

Recruitment Capability and Resulting Sample Characteristics

Recruitment capability was strong and eligibility criteria feasible, with 632 eligible members of the target population recruited within a week (~ 2 days per group). Scores on pre-test stigma measures showed sufficiently high stigma scores to warrant intervention. It was, however, challenging to recruit a representative sample, with a skew towards younger (63% under 45 vs. 42% for UK population), female (64% vs. 52%) participants and those with personal experience of AUD (10% vs. 3%) (McManus et al., 2016).

Data Collection Procedures and Outcome Measures

Supporting the measures’ suitability for the intended population, participants felt instructions outlined what was expected (99% agreement), items were clear and easy to understand (98%) and they could accurately record feelings towards the vignette subject (95%). Indicators of internal consistency were good (e.g., alpha reliability coefficients above 0.80), though the measures’ appropriateness merits further research as they lack wide validation for measuring public AUD stigma. Usable and complete data were collected from all participants, with 92% also providing qualitative feasibility comments.

Acceptability and Suitability of Study Procedures and Intervention

Adherence and retention were strong, with 96% of participants completing stage one and a 90% retention rate at follow-up. Participants appeared to understand and engage with the interventions: most passed attention check questions (99%) and agreed the videos were clear and easy to follow (98–100%) and engaging and interesting (82–95%). Average study completion time was below 20 min and most participants felt survey duration was ‘about right’ (92–99%).

Qualitative feedback reflected participants’ satisfaction with the interventions. Each approach had different benefits, with those watching myth-fact content feeling they learned more new information (EV and CombV: 76% vs. CV: 50%) and contact content promoting empathy. Other strengths were the EV’s clarity and simplicity, the CV’s speakers, diversity and narrative, and CombV’s combined approaches.

The interventions’ acceptability was limited by their perceived relevance to participants (50–60%) and participants’ willingness to watch them in their own time (41–48%). Additionally, while most felt interventions were the right length (66–94%), qualitative feedback (especially for the CombV) recommended shortening them. Development areas included enhancing production quality, providing additional content and offering more balanced information.

Resources and Ability to Manage and Implement the Study and Intervention

The research team had the resources, time and skills to conduct the study. However, feasibility was limited by budgetary constraints for intervention production, where additional funding could have supported specialist input to increase intervention quality.

Participant Responses to Intervention

Participants’ stigma scores suggested all interventions show promise of being successful with the public (Fig. 2). Qualitative feedback also indicated their potential for attitude change (‘it changed my opinion about the condition’—CombV).

Discussion

Summary of Findings and Implications

The present study is, to our knowledge, the first to examine the relative efficacy of education and contact interventions in reducing the public stigma of AUD. Supporting the study’s hypothesis (H1), there was a significant interaction between time and group. Stigma in the intervention groups significantly decreased at post-test and follow-up, despite a significant rebound between post-test and follow-up. Supporting hypothesis (H2), while stigma scores decreased in the control group (CtrlV) between pre-test and follow-up, they decreased more in the intervention groups (EV, CV, CombV). Together, these findings suggest that brief education, contact and combined interventions used to tackle the public stigma of various neuropsychiatric disorders can successfully reduce public AUD stigma. Diminished intervention impact at follow-up is consistent with previous reports (Morgan et al., 2018), raising a potential challenge with using short interventions to sustainably shift entrenched social biases. Conceptual models of stigma highlight its perceived functions such as to keep outgroups ‘down’ (e.g., exploitation or domination) or ‘away’ (e.g., avoidance of illness). In the case of substance use disorders, it is often used to ‘keep people in’ by enforcing social norms (Phelan et al., 2008). The rigidity of public AUD stigma relative to other conditions (Crisp et al., 2005; Kilian et al., 2021; Schomerus et al., 2014) may, therefore, be explained by it being seen as an effective strategy to limit excessive alcohol use (Vanyukov, 2024) by defining the boundaries of acceptable behaviour (Phelan et al., 2008). Additionally, the interventions’ reduced impact at follow-up could reflect people’s tendency to avoid ‘cognitive dissonance’ (i.e., holding contradictory beliefs) through retaining their existing beliefs despite new information (Festinger, 1957). 
Further research is needed to clarify reasons for the interventions’ decreased impact over time and to create more sustained changes in stigma.

The contact content’s efficacy corroborates previous demonstrations that positive portrayals of people with substance use disorder promote stigma reduction (Luty et al., 2009; McGinty et al., 2015). Participants’ qualitative endorsement of its diverse speakers, and of its balancing of symptoms, recovery and ongoing difficulties, also substantiates anti-stigma recommendations for contact intervention development (Corrigan et al., 2013). However, contrary to evidence that education is either ineffective for the reduction of AUD stigma (Luty et al., 2007) or inferior to contact (Corrigan et al., 2007), this study found no significant differences in efficacy between the two types of intervention. Combining both approaches in anti-stigma efforts may, therefore, be useful, especially given participants highlighted their different benefits, with education imparting knowledge and contact humanising people with AUD.

More research is needed to understand these discrepancies. However, future trials may consider engaging mediums for education content delivery since our myths and facts were effective in an animated explainer format, in contrast to previously ineffective black-and-white factsheets (Luty et al., 2007). Additionally, systematic content selection may enhance intervention effects: our myth-fact pairs were developed based on expert perspectives and a systematic review of population studies of public attitudes towards AUD, ensuring messaging reflected public attitudes and key AUD stereotypes.

Our findings support progression to a larger trial, with some amendments to the study’s procedures and interventions. While a large sample was enrolled in a short period, a more representative sample is needed to confirm the interventions’ public impact. Data collection procedures enabled sufficient data analyses, and the outcome measures were reliable and perceived as appropriate. However, further validation of measures examining the public stigma of addictive disorders (Brown, 2011; Johnson-Kwochka et al., 2021) would enable more accurate future evaluations. Study procedures were considered suitable, and most participants were satisfied with the study duration and retained at follow-up. Further, the interventions show promise for stigma reduction in future trials, given both their effects and qualitative feedback about their potential for attitude change. However, while their acceptability was endorsed, more work is needed to improve their relevance and appeal to participants since fewer than half agreed they would watch the videos in their own time. Feedback suggested future studies would benefit from shortening the videos, enhancing their quality (particularly making education content more engaging) and providing additional content (e.g., support resources). Offering a more balanced perspective (e.g., realistic recovery rates) was also advised, suggesting holistic accounts of AUD are important while countering stigma. Greater resources to improve the interventions prior to further testing may result in an approach that is more likely to succeed.

Limitations

This study’s findings should be considered in the context of its limitations. First, the sample, recruited through the online crowd-sourcing platform Prolific, was not entirely representative of the UK public (Chandler et al., 2015), limiting the generalisability of the study’s findings. It was skewed towards younger, female participants and those with direct and indirect experience of AUD. While gender did not materially impact stigma, younger participants and those with experience of AUD demonstrated slightly lower stigma across time.

Second, the comparability of effects across groups was limited because the study was not truly randomised since participants entered it at different times. Additionally, the difference in length of the CtrlV and CombV makes the latter’s efficacy challenging to assess quantitatively.

Third, the findings lack ecological validity. Measures assessed explicit rather than implicit attitudes (Bos et al., 2013), which have been shown to reduce more following anti-stigma interventions (Maunder & White, 2019). Whether self-reported behavioural intentions would translate to actual behaviour is therefore unclear. Further, reduced intervention impact at follow-up and changes in control group stigma suggest social desirability bias may have influenced results.

Finally, while 1 month is a longer follow-up than other contact and education studies (Corrigan et al., 2002, 2004, 2007), establishing the duration of intervention effects on stigma over longer periods is important given increased stigma at follow-up in our study.

Directions for Future Research

To verify these preliminary results, a future larger trial could capitalise on the insights offered by the present study and improve their generalisability (e.g., through true randomisation or a more representative sample) and feasibility (e.g., through better-produced interventions). To improve ecological validity, the interventions could be tested in real-life contexts, with changes in actual behaviour evaluated (e.g., donations to charity). They could be trialled initially in sub-groups of the population (e.g., students), with the aim of broadening their use in society-wide campaigns (e.g., via social media or short public health adverts). Alternatively, developing targeted interventions for specific groups able to effect change, such as policymakers or employers, may be impactful (Corrigan et al., 2014).

Since public stigma of AUD is entrenched (Crisp et al., 2005; Kilian et al., 2021; Nathan et al., 2016), it is possible that single-session interventions promote insufficient lasting change (Earnshaw et al., 2018). Future research could, therefore, evaluate the impact of the interventions when delivered continuously. Similarly, research into intervention acceptability could elucidate how to prolong effects at follow-up.

Additionally, studies into mechanisms of change could clarify how the interventions are efficacious when applied to addiction-related stigma. For example, while the contact hypothesis proposes enhanced knowledge as the key mediator of contact interventions (Allport, 1954), contemporary research regards other mediators, such as empathy, as superior (Pettigrew & Tropp, 2008).

To facilitate engagement, additional studies could test the effects of interventions delivered at different lengths (e.g., 4 versus 2 min), through different mediums (e.g., text versus video) and by different people (e.g., actors, celebrities, health professionals). Further, other education and contact intervention types could be tested, such as facts-only (Luty et al., 2007) and vignette contact interventions (Luty et al., 2008).

Finally, interventions found to be successful after more extensive evaluation should be tailored to other substance and behavioural addictions in order not only to determine the differential sensitivity of stigmas towards these conditions to these interventions but also to develop a wide range of strategies effective at reducing public stigma towards all addictions.