Recanting, defined as the denial at a later interview of a previous positive report of lifetime (ever used) substance use, has been found to be a significant source of measurement error in longitudinal alcohol and drug-use surveys (Fendrich 2005; Fendrich and Rosenbaum 2003; Johnston and O'Malley 1997; Percy et al. 2005). Recanting can be identified by observing logical inconsistencies in the pattern of reporting of lifetime consumption across survey sweeps: once a respondent has transitioned into substance use, all subsequent responses to an ‘ever use’ question should be positive (Percy et al. 2005).

Recanting always indicates the presence of a false or mistaken report. It may arise either from over-reporting of lifetime use at an earlier sweep (claiming to have tried alcohol when the respondent had not, possibly to impress peers) that is corrected at a later follow-up, or from under-reporting of lifetime use at the later sweep (denying ever having used following an earlier positive report, typically due to social desirability or concerns about being identified as a drinker by school or parents). Recanting may also be the result of a simple error, such as ticking the wrong box. Numerous longitudinal studies of adolescent alcohol and drug use have identified extensive recanting (Ensminger et al. 2007; Fendrich and Mackesy-Amiti 1995; Fendrich and Rosenbaum 2003; Johnston and O'Malley 1997; Percy et al. 2005; Shillington and Clapp 2000; Shillington et al. 2011a; Shillington et al. 1995; Shillington et al. 2011b; Taylor et al. 2017). Recanting has also been observed in relation to smoking (Sargent et al. 2017; Soulakova and Crockett 2014; Stanton et al. 2007), inhalant use (Martino et al. 2009), delinquency (Sibley et al. 2010), sexual behaviour (Dariotis et al. 2009; Palen et al. 2008), self-harm (Mars et al. 2016), weight control (Rosenbaum 2009) and age of substance use initiation (Bailey et al. 1992; Engels et al. 1997). Inconsistencies in self-reported substance use have been found not only across time but also across survey settings (school vs household) (Griesler et al. 2008).

In alcohol-specific studies, the proportion of respondents found to have recanted has ranged from under 10% (Percy et al. 2005; Shillington and Clapp 2000) to over 30% (Fendrich and Rosenbaum 2003; Shillington et al. 2010). Alcohol-related recanting has tended to be higher amongst males (Percy et al. 2005; Shillington, Clapp, et al. 2011; Siddiqui et al. 1999), although not all studies have confirmed this (Shillington and Clapp 2000); amongst younger respondents (Shillington et al. 2010); amongst ethnic minority respondents (Fendrich and Rosenbaum 2003; Siddiqui et al. 1999), although in some studies differences by race/ethnicity have been minimal (Shillington and Clapp 2000); amongst respondents with low educational expectations (Percy et al. 2005); and amongst those reporting lower levels of consumption (Fendrich and Mackesy-Amiti 2000).

In one study, young people who reported receiving drug education in the previous 12 months were significantly more likely to recant previous positive reports of cannabis use (Percy et al. 2005). Here, the lower reported drug-use behaviours amongst those who received drug education may have been an artefact of increased measurement error (recanting), rather than a reflection of a decrease in actual drug-use behaviour. Similarly, Harris and colleagues observed differing levels of drug-use reporting inconsistencies across the arms of a treatment outcome study (Harris et al. 2008). This opens up the possibility that the positive impact of a prevention intervention may, in part, be due to increased recanting within the intervention arm, relative to the control arm, rather than an actual reduction in consumption (Fendrich 2005; Macleod et al. 2005). This possibility may also help explain the finding that effect sizes are inflated in non-blinded studies with self-report outcomes (Hrobjartsson et al. 2014; Kaner et al. 2017). Given that few universal prevention trials include any non-survey-based confirmation of self-reported substance use, assessing the impact of an identifiable source of measurement error (recanting) on primary outcomes would seem a valuable analytical procedure.

Using data from a large-scale clustered randomised controlled trial (McKay et al. 2018), we aimed to address this gap by examining recanting levels within a school-based alcohol education effectiveness study conducted in the UK. We compared the rates of recanting behaviour across the intervention and control (education as normal) arms and controlled for respondent recanting within the estimation of intervention effects on the trial primary outcomes.

Method

Design, Procedures and Sample

The study uses data from a large-scale cluster randomised controlled trial (cRCT) examining the effectiveness of an alcohol education intervention (McKay et al. 2018). A total of 105 post-primary schools, across two geographical locations, participated in the trial: 70 in Northern Ireland (NI) and 35 in Scotland. At the beginning of the study, participants were in their first year of post-primary school (mean age = 12.5 years). Schools were randomly assigned (1:1) to receive the intervention or alcohol education as normal (comprising standard personal, social and health education) before baseline data were collected. Data were collected at baseline (T0) in June 2012 and at three follow-ups: + 12 (T1), + 24 (T2) and + 33 (T3) months. By T3, the mean age of participants was just over 15 years. The questionnaires were administered verbally to pupils under exam-like conditions. Opt-in consent was obtained from school head-teachers/principals prior to randomisation of the school to either intervention or control. Opt-out consent from participants and their parents/guardians was obtained after randomisation.

The research was approved by Liverpool John Moores University Research Ethics Committee. The trial protocol is available from http://www.nets.nihr.ac.uk/projects/phr/10300209.

Measures

Primary Outcomes

The study had two primary outcomes, both assessed at 33 months. The first was the frequency of self-reported heavy episodic drinking (HED) in the previous 30 days. For male respondents, HED corresponded to the consumption of six or more UK units of alcohol in a single session; the corresponding threshold for female respondents was 4.5 units. This frequency count was dichotomised as never versus one or more occasions. To aid the accuracy of recall, respondents were presented with pictorial prompts showing how much alcohol ≥ 6/≥ 4.5 UK units represented. This was a revised primary outcome; the initially proposed HED outcome was the self-reported frequency of consumption of > 5 ‘drinks’ in a single drinking episode. Concerns arose because it became clear that > 5 ‘drinks’ could refer to drinks of different alcohol strength and volume, and that the intervention had a specific learning outcome around the counting of units consumed. As a result, the HED primary outcome was changed to the ≥ 6/≥ 4.5 UK unit threshold described above. This change was implemented before T3 data collection and before any data were unblinded for analysis. Consequently, the indicator of HED employed at T0 (baseline) was the frequency of consuming 5 or more drinks in the last month, which was also dichotomised.
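As an illustration, the derivation of the dichotomous HED indicator from the self-reported frequency count might look as follows. This is a minimal sketch only: the variable names (sex, hed_freq_30d, hed_any) are assumed rather than the trial's actual ones, and the sex-specific ≥ 6/≥ 4.5 unit threshold is built into the survey item itself, so only the dichotomisation is coded.

```python
import pandas as pd

# Illustrative data; variable names are assumed, not the trial's actual ones.
df = pd.DataFrame({
    "sex": ["M", "F", "M"],
    "hed_freq_30d": [0, 2, 1],  # self-reported HED occasions in the last 30 days
})

# Primary outcome: any HED in the last 30 days (0 = never, 1 = one or more).
# The >= 6 (male) / >= 4.5 (female) UK-unit threshold is embedded in the
# survey question, so only the never/one-or-more dichotomisation appears here.
df["hed_any"] = (df["hed_freq_30d"] >= 1).astype(int)
```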

The second primary outcome was the number of self-reported alcohol-related harms (caused by their own drinking; ARH) in the previous 6 months. Participants were asked to indicate how many times in the past 6 months they had experienced each of 16 separate harms. The frequency count for each harm was dichotomised, and the resulting indicators were summed across the 16 harms to form an overall count of the number of individual harms experienced, ranging from zero to 16.
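A corresponding sketch of the ARH count, again using hypothetical column names (harm_1_freq_6m … harm_16_freq_6m, arh_count), is shown below.

```python
import pandas as pd

# Illustrative data: 16 harm-frequency items with assumed column names.
harm_cols = [f"harm_{i}_freq_6m" for i in range(1, 17)]
df = pd.DataFrame(0, index=range(3), columns=harm_cols)
df.loc[0, "harm_1_freq_6m"] = 3  # e.g. experienced harm 1 three times
df.loc[0, "harm_5_freq_6m"] = 1

# Dichotomise each harm (experienced at least once in the past 6 months),
# then sum across the 16 harms to give a count ranging from 0 to 16.
df["arh_count"] = (df[harm_cols] >= 1).sum(axis=1)
```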

Recanting

At each sweep, respondents were asked whether they had ever consumed a ‘full drink’ of alcohol, not just a sip or a taste (yes/no). Recanting was considered to have occurred when a positive lifetime self-report (yes, have consumed a full drink) at one data sweep was followed by a negative lifetime self-report (no, have never consumed a full drink) at a later follow-up. For example, a positive lifetime report at T1 followed by a negative lifetime report at T2 would be classified as an incident of recanting at T2. Recanting was coded as a dichotomous indicator (0/1) for each follow-up sweep. An overall recanting indicator was also constructed, capturing recanting at any follow-up sweep (T1–T3; again coded 0/1).
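The sweep-specific and overall recanting indicators described above could be derived along the following lines; this is a minimal sketch with assumed column names (ever_t0 … ever_t3, recant_t1 … recant_t3, recant_any), not the trial's actual variables.

```python
import pandas as pd

# Illustrative wide-format data: lifetime 'full drink' item at each sweep
# (1 = yes, 0 = no); column names are assumed.
df = pd.DataFrame({
    "ever_t0": [1, 0, 1],
    "ever_t1": [0, 1, 1],
    "ever_t2": [0, 0, 1],
    "ever_t3": [0, 1, 1],
})

sweeps = ["ever_t0", "ever_t1", "ever_t2", "ever_t3"]

# Recanting at a follow-up sweep: a 'no' at that sweep preceded by a 'yes'
# at any earlier sweep.
for i, col in enumerate(sweeps[1:], start=1):
    prior_yes = df[sweeps[:i]].max(axis=1) == 1
    df[f"recant_t{i}"] = ((df[col] == 0) & prior_yes).astype(int)

# Overall indicator: recanted at any follow-up sweep (T1-T3).
df["recant_any"] = df[[f"recant_t{i}" for i in range(1, 4)]].max(axis=1)
```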

Statistical Analysis

Data cleaning, data management and preliminary analysis were undertaken using IBM SPSS version 22. The primary outcome models were estimated in Mplus 7.11. The outcome analysis was an intention-to-treat (ITT) analysis using the complete case population. Prior sensitivity analysis had suggested that the outcome models were robust to all but extreme missing data assumptions (McKay et al. 2018). A multi-level logistic regression model estimated the association between the intervention and the odds of HED, and a multi-level negative binomial regression model estimated the association between the intervention and the number of ARH. All models included school-level random intercepts to account for correlation due to the clustering of students within schools. The models also adjusted for the factors used to stratify randomisation (location: NI/Scotland; proportion of free school meals: tertile split, low/medium/high), gender of school (in NI only: co-educational/boys only/girls only), respondent recanting (entered either as recanting at any sweep or as recanting at T3, in separate models) and the relevant outcome assessed at baseline. Given that the analysis had two primary outcomes, a statistically significant result was concluded if the p value for the trial treatment arm explanatory variable was < 0.025.
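To make the model specification concrete, the sketch below shows roughly analogous models in Python/statsmodels. It is purely illustrative: the trial analysis was run in Mplus with school-level random intercepts, whereas here clustering by school is approximated with cluster-robust standard errors, the school-gender stratifier is omitted for brevity, the data are synthetic and all variable names are assumed.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data purely for illustration; variable names are assumed.
rng = np.random.default_rng(1)
n = 2000
df = pd.DataFrame({
    "school_id": rng.integers(0, 100, n),
    "arm": rng.integers(0, 2, n),          # 0 = control, 1 = intervention
    "recant_any": rng.integers(0, 2, n),   # recanted at any follow-up sweep
    "location": rng.integers(0, 2, n),     # NI / Scotland
    "fsm_tertile": rng.integers(0, 3, n),  # free school meals tertile
    "hed_baseline": rng.integers(0, 2, n),
    "arh_baseline": rng.integers(0, 4, n),
    "hed_any": rng.integers(0, 2, n),
    "arh_count": rng.integers(0, 5, n),
})
clusters = df["school_id"]

# HED: logistic regression with intervention arm, recanting, stratification
# factors and the baseline outcome as covariates (cf. model 2).
hed_model = smf.logit(
    "hed_any ~ arm + recant_any + C(location) + C(fsm_tertile) + hed_baseline",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": clusters}, disp=0)

# ARH: negative binomial regression for the 0-16 harm count.
arh_model = smf.negativebinomial(
    "arh_count ~ arm + recant_any + C(location) + C(fsm_tertile) + arh_baseline",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": clusters}, disp=0)

print(hed_model.summary())
print(arh_model.summary())
```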

Results

A total of 11,316 pupils participated in the T0 (baseline) data sweep. An additional 1422 pupils who were either absent at T0 but present at T1 data collection (i.e. missing on the day of the T0 data collection) or who joined participating schools between T0 and T1 (before the delivery of phase 1 of the intervention) were also included within the study population. Of the full sample (those who completed a questionnaire at either T0 or T1, N = 12,738), 10,405 also completed the questionnaire at T3 (81.7%) and form the complete case population. Table 1 provides the baseline (T0) characteristics of the sample. No differences were detected between the intervention and control arms at T0.

Table 1 Demographic characteristics and alcohol outcomes (HED and alcohol-related harms) at baseline (T0) by study arm

At T3, around one in five participants reported at least one episode of HED in the last 30 days. The prevalence of HED was nine percentage points higher in the control group (26%) than in the intervention group (17%; odds ratio = 0.60; 95% CI 0.49–0.73) (McKay et al. 2018). Amongst those who had consumed alcohol at T0, around half of pupils in control schools reported having engaged in HED at T3, compared to just over a third in intervention schools. Around two thirds of pupils (63%) reported no ARH at T3. There was no difference in the number of self-reported ARHs (incidence rate ratio = 0.92, 95% CI 0.78–1.05) between control and intervention pupils (McKay et al. 2018).

Around 5% of pupils recanted a previous report of alcohol consumption at each data sweep (Table 2). The overall recanting rate across the follow-up period (33 months) was 10%. Recanting was higher amongst males (6% compared to 3%) and amongst respondents in the intervention arm, although here the difference was very small (one percentage point). Recanting was lower amongst respondents who reported HED. It should be noted, however, that the observed effect sizes, as indexed by Cramér's V, were small to very small. Survey responses were not edited to ensure internal consistency within each sweep; as a result, seven respondents who reported HED at T3 had also answered ‘never’ to the lifetime use item earlier in the same T3 questionnaire.
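For reference, Cramér's V for an r × c contingency table is V = √(χ² / (n · min(r − 1, c − 1))); for the 2 × 2 cross-tabulations reported here it reduces to the phi coefficient, with values of around 0.1 conventionally regarded as small.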

Table 2 Proportion of respondents who recanted (T1 to T3) by respondent characteristic

Initial multi-level ITT outcome analysis indicated that the intervention had a significant impact on HED (Table 3) but no impact on ARH (Table 4) at T3. Although recanting was slightly higher within intervention schools (5.3%) than in control schools (4.3%) and was significantly associated with both primary outcomes (HED and ARH), when entered into the primary outcome models, recanting did not alter the intervention effects observed in the initial outcome analysis. The intervention remained significant in terms of its association with HED (Table 3) and non-significant in its association with ARH (Table 4).

Table 3 HED primary outcome analysis unadjusted for recanting (model 1), adjusted for recanting at any sweep (model 2) and adjusted for recanting at T3 (intention-to-treat complete case analysis)
Table 4 Alcohol-related harms (ARH) primary outcome analysis unadjusted for recanting (model 1), adjusted for recanting at any sweep (model 2) and adjusted for recanting at T3 (intention-to-treat complete case analysis)

Discussion

This study examined the extent of recanting within a large school-based alcohol prevention trial. By replicating the original trial primary outcome analysis, this time controlling for instances of recanting (both recanting at any sweep and recanting at the final follow-up), the study also assessed the impact of recanting on the estimation of the intervention effect.

The results of this study revealed that recanting is a noticeable source of measurement error within school-based alcohol trials, involving around one in 20 respondents at each data sweep, and one that is associated with traditional consumption-based outcome measures. The level of recanting estimated within this study was at the lower end of the range of previous estimates (Percy et al. 2005; Shillington and Clapp 2000) and confirmed prior reports of inflated recanting amongst male students (Percy et al. 2005; Siddiqui et al. 1999).

In terms of between-group differences, there was a differential rate of recanting across the study arms (intervention versus control) and across gender, although the size of these differences was very small, as evidenced by the Cramér's V values. While recanting was higher amongst pupils who received the harm reduction alcohol education intervention than amongst those who received education as normal, the observed difference was only a single percentage point, compared to the nine percentage point difference observed in the primary HED outcome. Based on this finding, recanting may not be the predominant cause of inflated effect sizes in non-blinded studies with self-report alcohol outcomes (Hrobjartsson et al. 2014; Kaner et al. 2017), although further replication is required before a robust conclusion can be drawn.

Recanting arises from a lack of consistency in reporting across survey sweeps. This inconsistency may be due to deliberate attempts to misreport consumption (under-reporting following a positive report) or a failure to maintain a deception over multiple survey sweeps (an over-report that is not repeated in subsequent sweeps) (McCambridge and Strang 2006). Either way, an incident of recanting indicates a false report at one or more data sweeps. It is also possible that recanting arises from simple errors in answering the pencil-and-paper questionnaire (ticking no when they meant to tick yes), rather than from deliberate attempts to misreport consumption. However, as only seven students who recanted lifetime alcohol use went on to report HED, this may not be a significant issue in a study such as this. Those who did recant appeared to be consistent in their reports of alcohol consumption within a survey sweep, even when inconsistent between sweeps.

While the analysis has a number of methodological strengths, including a large sample size, robust randomisation and an analysis that accounted for the multi-level structure of the data collected, it does have some limitations. Firstly, while the study identified incidents of recanting, it was unable to determine whether these were the result of initial over-reporting (i.e. a false initial claim of lifetime use that was corrected at follow-up) or later under-reporting (a true initial claim of lifetime use that was denied at follow-up). It is likely that these distinct forms of self-report inconsistency will have different impacts on the assessment of trial outcomes. Secondly, the study did not incorporate any non-questionnaire-based self-report measures (Koning et al. 2010) or non-survey-based measures of consumption, although these are likely to be non-viable in large-scale surveys (Taylor et al. 2017). As a result, there was no independent corroboration of reporting inconsistencies. Finally, the study was only able to assess the impact of recanting on alcohol outcomes. The impact of recanting may be greater for more socially undesirable adolescent behaviours (i.e. cannabis and other drug use) (Fendrich and Mackesy-Amiti 1995; Fendrich and Rosenbaum 2003; Harris et al. 2008; Johnston and O'Malley 1997; Percy et al. 2005).

In conclusion, this study demonstrates, for what we believe may be the first time, that recanting is an important form of measurement error within school-based prevention RCTs, but one that is readily identifiable and easily controlled for within RCT outcome analysis, using the method demonstrated above. Recanting varies with known predictors of alcohol consumption (e.g. gender) (Chassin et al. 2002) and with standard alcohol outcome measures routinely used in prevention RCTs. We recommend including recanted self-reports as a covariate within planned outcome analyses as a simple method of adjusting intervention effects for a known source of self-report measurement error associated with consumption-based outcomes. Another option would be to test different recanting assumptions within the outcome analysis (i.e. a worst case scenario, setting all who recant to HED; a best case scenario, setting all who recant to non-HED; or a conservative case scenario, setting all who recant in the treatment arm to HED and all who recant in the control arm to non-HED) to assess the sensitivity of the outcome results to variations in those assumptions, as is routinely done with missing data (see, for example, McKay et al. 2018). A sketch of this recoding is given below.
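As a minimal illustration, assuming hypothetical column names (hed_any, recant_any, arm), the three scenarios could be coded as follows before re-running the primary outcome model on each recoded outcome.

```python
import pandas as pd

# Illustrative data; column names are assumed, not the trial's actual ones.
df = pd.DataFrame({
    "arm": [1, 1, 0, 0],         # 1 = intervention, 0 = control
    "recant_any": [1, 0, 1, 0],  # recanted at any follow-up sweep
    "hed_any": [0, 1, 0, 1],     # observed HED outcome at T3
})

recanted = df["recant_any"] == 1

# Worst case: all recanters treated as having engaged in HED.
df["hed_worst"] = df["hed_any"].mask(recanted, 1)

# Best case: all recanters treated as non-HED.
df["hed_best"] = df["hed_any"].mask(recanted, 0)

# Conservative case: intervention-arm recanters set to HED,
# control-arm recanters set to non-HED.
df["hed_conservative"] = df["hed_any"].mask(recanted & (df["arm"] == 1), 1)
df["hed_conservative"] = df["hed_conservative"].mask(recanted & (df["arm"] == 0), 0)
```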

While the level of recanting was low within this study, this may not always be the case, particularly for non-alcohol substance use outcomes where social desirability pressures are considerably greater (Shillington et al. 2010). We therefore recommend that recanting analysis be incorporated within the broader suite of sensitivity tests undertaken to assess other forms of error within RCTs, such as those for missing data and alternative specifications of the trial primary outcomes (Thabane et al. 2013).