Introduction

Our review team conducted an extensive quality assessment of the clinical studies included in our living systematic review of the cardiovascular effects of e-cigarette substitution for cigarettes [1]. When we scrutinized the studies for many types of bias (drawn from the Oxford Centre for Evidence-Based Medicine Catalogue of Bias [2]), we were particularly concerned with spin bias: the distortion of data that misleads readers [3], or “a misrepresentation of study results, regardless of motive” [4]. One type of spin bias occurs when statistically nonsignificant results are reported as “showing an effect” [5] or when unjustified claims are made for results with p-values > 0.05 [6].

We were vigilant for spin bias because e-cigarette use is a controversial and divisive field [7,8,9]. We, along with other researchers, are deeply concerned that “polarized stances on e-cigarettes will threaten the integrity of research” [9]. Calling out spin is critical because biased conclusions from studies garner media reporting [10] that becomes a source of misinformation, influencing the opinions and actions of the public and of clinicians. Senior members of the research team had read numerous instances of spin bias in e-cigarette studies. Would we find spin bias in the set of clinical studies included in our systematic review?

Yes, we did. Seven of 26 studies exhibited spin bias of nonsignificant findings (see Table 1). Spin bias may spring from “a strong position that relies more on their opinion than on the study results” [10], or it may be a defensive response to publication bias against null results [11]. Whatever the motive, we developed a technique for the objective identification of spin bias in nonsignificant and misreported findings.

Technique for the identification of spin bias

Our technique for identifying spin bias within an article has two steps.

First, the data and findings from the Results, including those in tables, figures, appendices, and supplementary materials, are tracked through every mention in the study text. Data discrepancies and “pairs of statements that cannot both be true” [12] point to potential instances of spin bias. Our technique examines the misreporting or misrepresentation of nonsignificant data in the Discussion, but the process can additionally detect data discrepancies between the abstract and the conclusion of a study. Tracking can also reveal the omission of specific findings from the Discussion, another type of spin bias.

Second, the discrepancies identified in the data are documented with exact quotes (see Table 1) so that the spin bias can be identified objectively. In our systematic review, many of the data discrepancies were between the data presented in a table or figure and what was stated in the text.
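To show how such a tracking log might be structured, a minimal sketch (in Python) follows. It is purely illustrative: the record, its field names, and the example values are hypothetical and are not part of the published technique, and any flag the sketch produces is only a prompt for reviewers to verify the claim against the exact quotes.

from dataclasses import dataclass

@dataclass
class TrackedFinding:
    # One finding tracked from the Results through every later mention in the study
    outcome: str                            # e.g., "systolic blood pressure"
    results_quote: str                      # exact quote (or table/figure value) from the Results
    p_value: float | None                   # reported p-value, if any
    discussion_quote: str = ""              # exact quote of how the Discussion describes the finding
    discussion_claims_effect: bool = False  # reviewer's judgment: does the quote claim an effect or difference?

    def flags(self) -> list[str]:
        # Candidate instances of spin bias; each must be verified by a second reviewer
        issues = []
        nonsignificant = self.p_value is not None and self.p_value > 0.05
        if nonsignificant and self.discussion_claims_effect:
            issues.append("nonsignificant result presented as an effect in the Discussion")
        if not self.discussion_quote:
            issues.append("finding omitted from the Discussion")
        return issues

# Hypothetical example entry (values invented for illustration only)
finding = TrackedFinding(
    outcome="systolic blood pressure",
    results_quote="no significant difference between groups (p = 0.31)",
    p_value=0.31,
    discussion_quote="e-cigarette use had a lower impact on blood pressure than smoking",
    discussion_claims_effect=True,
)
print(finding.flags())  # -> ['nonsignificant result presented as an effect in the Discussion']

Recording the reviewer’s judgment as an explicit field keeps the human assessment, supported by the exact quotes, as the basis for each flagged instance rather than any automated keyword match.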

In some instances, nonsignificant findings are spun with causal language, with a claim, or by stating that an effect occurred when the finding was not significant [5, 10]. For example, in Table 1, there was no significant difference in blood pressure between e-cigarette and cigarette use, but the Discussion claimed that e-cigarettes had a lower impact than cigarettes. In other cases, the authors flatly stated in their Discussion that a finding was significant when it was reported as nonsignificant in the Results section; we observed this misreporting in two studies in our review.

For accuracy and completeness, two reviewers should independently check for data discrepancies and misreporting of nonsignificant findings. Differences in their assessments and observations should be resolved by discussion; this was our procedure. A third team member (in our review, the Project Leader) should verify all evidence of spin bias.

Our systematic review: nonsignificant findings and spin bias in the studies

Table 1 displays the occurrences of spin bias we identified in our systematic review with this technique and how we documented the evidence.

Table 1 Data discrepancies and nonsignificant data spin

Critical discussion

It could be argued that authors could legitimately frame their nonsignificant findings by stating that “the findings could be…but not sufficient evidence.” This linguistic turn obscures the results of the actual data collected. The authors’ “could be” is likely their preferred hypothesis, yet nonsignificant data could be construed in any number of ways. The accurate statement would be that the finding was not significant. In our living systematic review, over 66% of the cardiovascular test results were nonsignificant. These were important data indicating that e-cigarettes did not differ from cigarettes in their cardiovascular effects. Nonsignificant findings provide evidence that should not be drowned out by the noise of speculation.

In the broader biomedical literature, misreporting and misinterpretation of study findings are evidently common practices that produce spin bias [10, 20, 21]. A comparison of 896 abstracts with their full-text conclusions found that 15–35% were “inconsistent” [22]. Another study documented that 22% of trials with nonsignificant results (75 of 346 studies) had high levels of spin bias in their Conclusions [5]. In our systematic review, 27% of the studies exhibited spin bias with nonsignificant results. Nonsignificant results appear to be the most prone to spin bias: “the only factor that seems consistently associated with spin is non-statistically significant results” [23, see also 5].

Limitations

Certainly our technique for identifying spin bias requires further testing, evaluation, and validation. As far as we know, this is the only report on the identification of nonsignificant results and spin bias in the Discussion sections of published articles. Spin bias has previously been examined between the abstract and the text of randomized controlled trials [20, 22, 24, 25] and of systematic reviews [21, 26, 27]. Spin bias in abstracts is especially serious because many readers look only at the abstract.

Two recent checklists could incorporate our two-step process for identifying spin bias with nonsignificant findings. The Quality Output Checklist and Content Assessment tool (QuOCCA) includes one item on the spin of nonsignificant results, but it purposefully excludes checking the Discussion because spin in that section of a study was “difficult to identify” [6]. This was not our experience. The Discussion is a key section for reporting and interpreting results and should be checked with the QuOCCA, not excluded. PRIOR (Preferred Reporting Items for Overviews of Reviews) [28] is another recently published tool with checks for reporting bias and data discrepancies. Both tools would be enriched by our technique for identifying the spin of nonsignificant data, the most common reporting bias.

Our technique does not identify all instances of spin bias. It cannot document where particular datapoints, such as secondary outcomes or subgroup findings, have been emphasized over primary outcomes, although it does document findings that are omitted from the Discussion. Identifying spin bias from overgeneralizations and overstatements entails analyzing rhetoric and checking for ascertainment bias. Our technique will require adaptation to be useful for uncovering spin bias arising from undocumented deviations between a clinical trial registry or protocol and the published study [29]. A rating for the intensity of spin bias could be based on the number of instances identified, with multiple occurrences flagging potential researcher bias.

Finally, we can only wonder out loud about how much of an appetite there is among editors and peer reviewers to routinely take on an additional check. Spin bias distorts scientific research and misleads readers. Editors and peer reviewers should be vigilant for spin bias [5], and “in theory, peer-reviewers and editors should determine whether the conclusions match the results” [10]. But it does not have to be all or nothing. Knowing that nonsignificant findings are the most common source of reporting bias, peer reviewers should spot-check how nonsignificant findings are presented in the Discussion and abstract; this need not be an onerous or time-consuming task. The editorial team should routinely check for reporting bias of all kinds to prevent errors that result in retractions. Hopefully our technique can assist.