Introduction

The capacity of the research literature to self-correct is of vital importance, particularly given reports of high rates of published errors (Allison et al. 2016; Bik et al. 2016; Bik et al. 2018; Brown et al. 2018; Georgescu and Wren 2018; Guttinger and Love 2019; Nuijten et al. 2016). Errors that escape detection during both manuscript preparation and peer review can be subsequently flagged and/or corrected by approaches including retractions, expressions of concern and author corrections (Fanelli et al. 2018; Vaught et al. 2017). Despite the range of approaches that are available to correct errors within the literature, researchers have described difficulties in navigating these processes (Allison et al. 2016; Grey et al. 2020a; Malički et al. 2019; Saiz et al. 2018; Vorland et al. 2020). Post-publication review processes have been described as slow, complex and time consuming (Grey et al. 2020a; Malički et al. 2019; Saiz et al. 2018), and as even imposing financial burdens on the notifying party (Allison et al. 2016). As these factors are likely to discourage researchers from communicating published errors, it is perhaps not surprising that few teams have described their efforts to correct published errors (Allison et al. 2016; Grey et al. 2020a; Malički et al. 2019; Saiz et al. 2018). More such descriptions would allow the detailed examination of post-publication review processes, which could lead to process improvements, and more timely responses to incorrect published information (Grey et al. 2020a).

We have an opportunity to compare journal responses to post-publication notifications of very similar errors, as members of our team have been describing concerns about biomedical research papers with a restricted range of error types to their corresponding journals since 2015. While only a minority of these notifications have been resolved, comparing even a limited number of journal responses can provide much needed information about how biomedical journals respond to very similar published errors.

Published nucleotide sequence reagent errors

The publications of concern that we have described to journals commonly analyse human gene function in cancer cell lines (Byrne and Labbé 2017; Labbé et al. 2019). Experiments that analyse gene function often employ short pieces of DNA or RNA that correspond to genes of interest, and these DNA or RNA sequences can be written into publications to signal exactly how specific genes were investigated. Because written DNA or RNA sequences lack obvious meaning, even short nucleotide sequences are susceptible to being transcribed incorrectly (Byrne et al. 2019; Labbé et al. 2019). Nucleotide sequences can incorporate the equivalent of spelling mistakes (Chiarella et al. 2015; Habbal et al. 2005; Labbé et al. 2019; Shannon et al. 2008) and intended nucleotide sequences can also be substituted by nucleotide sequences with entirely different identities and meanings (Byrne and Labbé 2017; Chiarella et al. 2015; Katavetin et al. 2005; Kocemba et al. 2016; Labbé et al. 2019; Tamm 2016). Whereas typographic errors within nucleotide sequence reagents can have variable significance, wrongly identified nucleotide sequence reagents will almost certainly invalidate any experiment that uses such reagents (Byrne et al. 2019; Labbé et al. 2019). Wrongly identified nucleotide sequence reagents, particularly those that recur across multiple papers, also risk being incorporated into future studies, potentially leading to failed experiments and/or publication of further incorrect data (Byrne et al. 2019; Labbé et al. 2019). Our previous and ongoing analysis of published nucleotide sequence errors has therefore focused on the detection and description of wrongly identified nucleotide sequence reagents (Byrne and Labbé 2017; Labbé et al. 2019).

We have identified and described three types of wrongly identified nucleotide sequence reagents (Byrne and Labbé 2017; Labbé et al. 2019) (Fig. 1a). The first error type represents the DNA or RNA equivalent of substituting one recognizable word for an unrelated word (Fig. 1a). Such incorrect DNA or RNA reagents will target a different gene or sequence from that intended (Byrne and Labbé 2017; Labbé et al. 2019). The second error type substitutes the DNA or RNA equivalent of a recognized word with a meaningless string of letters (Fig. 1a). Instead of targeting the stated gene or sequence, such reagents appear to not target any sequence in the species under study (Labbé et al. 2019). The third error type represents the converse problem, where a deliberately meaningless string of letters is replaced with the DNA or RNA equivalent of a recognized word (Byrne and Labbé 2017; Labbé et al. 2019) (Fig. 1a). Whereas non-targeting reagents should not target any gene or sequence in the species under study, incorrect “non-targeting” reagents show unexpected homology to a specific gene (Fig. 1b, c) (Byrne and Labbé 2017; Labbé et al. 2019).

Fig. 1

a Diagrammatic representation of three different nucleotide sequence identity error types (shown at left) with claimed versus verified reagent status (centre panels) and the experimental consequences (shown at right). The error type where a claimed “non-targeting” reagent is instead predicted to target a gene is highlighted (middle row). b, c The incorrect non-targeting reagents Sequence A (b) and Sequence D (c). Nucleotide sequences are shown 5′–3′, with 21-nucleotide sequences that are 100% identical to the human genes TPD52L2 (b) or NOB1 (c) shown in bold, with arrows indicating the direction of nucleotide sequence complementarity. The presence of two inverted or self-complementary sequences is typical of shRNA reagents (Lambeth and Smith 2013). Single nucleotides that are deleted in variant sequences are highlighted in grey.

Incorrect non-targeting nucleotide sequence reagents

To compare how journals respond to wrongly identified nucleotide sequence reagents, we will describe journal responses to descriptions of incorrect non-targeting reagents (Byrne and Labbé 2017; Labbé et al. 2019). We have focused upon incorrect non-targeting reagents for several reasons. Firstly, the description of a single incorrect non-targeting reagent typically represents a fatal flaw in any gene knockdown publication, where every individual gene knockdown is typically paired with the same non-targeting control (Han 2018). The description of an incorrect non-targeting control is therefore a very serious error that is sufficient to invalidate all affected gene knockdown experiments. Secondly, incorrect non-targeting control reagents represent the most frequent nucleotide sequence identity error type that we initially reported within a cohort of 48 strikingly similar human gene knockdown papers (Byrne and Labbé 2017). Our subsequent notifications of these and other papers with incorrect non-targeting controls mean that journal responses are more likely to concern papers with this error type. Thirdly, subsequent analyses suggest that incorrect non-targeting reagents represent a robustly detected error type (Labbé et al. 2020), as such reagents have “gained” a gene targeting function that is unambiguous and therefore easy to detect (Fig. 1b, c). Finally, as the same non-targeting reagent can be legitimately applied for the analysis of many different genes, incorrect non-targeting controls can be found across many papers (Byrne and Labbé 2017; Labbé et al. 2019). This means that by searching for a small number of incorrect control reagents, we can identify a larger number of fatally flawed gene knockdown papers (Byrne and Labbé 2017). By then describing such errors to journals, we can eventually compare journal responses to a set of very similar papers that describe the application of the same gene knockdown technique, and even the same incorrect negative control.

Our first report of incorrect nucleotide sequence reagents in gene knockdown papers identified two incorrect non-targeting reagents in multiple papers (Byrne and Labbé 2017) (Fig. 1b, c). These reagents, named Sequence A and Sequence D, are indicated to represent TPD52L2 and NOB1 targeting short hairpin RNAs (or shRNAs), respectively, as they show inverted or self-complementary sequences that are identical to TPD52L2 (Sequence A) or NOB1 (Sequence D) transcripts (Byrne and Labbé 2017) (Fig. 1b, c). While both Sequence A and Sequence D have been correctly described as TPD52L2 and NOB1 targeting reagents, they have also been incorrectly described as “non-targeting” controls in numerous publications (Byrne and Labbé 2017; Labbé et al. 2019). We therefore compared 32 available journal responses to 31 human gene knockdown papers that commonly examined the functions of single human genes in human cell lines corresponding to different cancer types, with each paper also specifying an incorrect non-targeting control.
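
The inverted, self-complementary arrangement described above can be checked computationally. The following is a minimal Python sketch, not the method used in this study: the function names are hypothetical, the 21-nucleotide stem length is taken from Fig. 1, and the sequences are those listed in the Materials and methods. It only locates a stem and its downstream reverse complement; it does not establish which gene the stem matches, which requires a database search such as Blastn.

```python
# Sequences as listed in the Materials and methods (Byrne and Labbé 2017).
SEQUENCE_A = "GCGGAGGGTTTGAAAGAATATCTCGAGATATTCTTTCAAACCCTCCGCTTTTTT"
SEQUENCE_D = ("CTAGCCCGGCCAAGGAAGTGCAATTGCATACTCGAG"
              "TATGCAATTGCACTTCCTTGGTTTTTTGTTAAT")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpin(seq: str, stem_len: int = 21):
    """Find the first stem whose reverse complement occurs downstream.

    Returns (stem, loop, antisense_start) or None if no such stem exists.
    Hypothetical helper; stem_len=21 is an assumption based on Fig. 1.
    """
    seq = seq.upper()
    for i in range(len(seq) - 2 * stem_len + 1):
        stem = seq[i:i + stem_len]
        j = seq.find(reverse_complement(stem), i + stem_len)
        if j != -1:
            return stem, seq[i + stem_len:j], j
    return None


for name, seq in (("Sequence A", SEQUENCE_A), ("Sequence D", SEQUENCE_D)):
    print(name, find_hairpin(seq))
```

For Sequence A, the sketch reports the stem GCGGAGGGTTTGAAAGAATAT, separated from its reverse complement by the loop CTCGAG. For Sequence D, it reports the stem CCAAGGAAGTGCAATTGCATA with the same loop.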

Materials and methods

Identification of journal responses to publications with incorrect non-targeting control reagents

Most journal responses followed our notifications of papers that described Sequence A or Sequence D, or a very highly related variant sequence (Fig. 1b, c) (Byrne and Labbé 2017). Concerns were communicated to journals by emails written and sent by JAB and copied to CL, with follow-up communications also occurring through emails written and sent by JAB. Emails were sent to the correspondence address listed on journal websites and were copied to one or more editors or managing editors, using journal-supplied email addresses and/or email addresses that were retrieved by JAB using names and institutional addresses as Google search queries. Emails described concerns about structural and textual similarities with other human gene knockdown papers and the presence of an incorrect non-targeting control reagent (Sequence A, Sequence D, or a very highly related variant sequence) (Byrne and Labbé 2017). Where concerns involved multiple papers published by the same journal, concerns were described in single emails that described up to 6 individual papers. Errors in papers that described the analysis of the TPD52L2 gene were initially communicated to their publishing journals in June 2015, with other papers communicated in January 2017, following the description of 48 strikingly similar human gene knockdown papers (Byrne and Labbé 2017). The communication of one paper was delayed until January 2018, after the paper was initially communicated to the wrong journal by JAB in January 2017.

Journal responses to 29 human gene knockdown publications with incorrect non-targeting controls were identified through PubMed and/or Google Scholar searches using publication titles as search queries. We also identified journal responses to two additional human gene knockdown papers that refer to the use of Sequence A as a “non-targeting” control (Table 1). One journal response resulted from the journal BioMed Research International identifying and investigating a human gene knockdown paper (BioMed Research International 2017a) in response to our notification of a very similar paper (BioMed Research International 2017b) (Table 1). The second additional response followed the recognition of Sequence A as an incorrect non-targeting control by the authors of an OncoTargets and Therapy publication (Retraction 2017) (Table 1). This published response was identified using Sequence A as a Google Scholar search query, as previously described by Byrne and Labbé (2017) and described in further detail below.

Table 1 Summary of 31 human gene knockdown publications employing human cancer cell lines that describe an incorrect non-targeting gene knockdown reagent

Analysis of journal responses

Times for journals to respond to notifications (measured in months) were calculated according to the month and year when concerns were first communicated by JAB to the publishing journal, and the month and year in which the journal response was published online, as identified through PubMed and/or Google Scholar searches of publication titles. In the case of the single journal that decided to take no action in response to our concerns, the response time was calculated from the month and year of our first email to the month and year of the journal’s email stating their final decision to take no action. In the case of the human gene knockdown paper that was identified and investigated by BioMed Research International (BioMed Research International 2017a), the journal’s response to this paper was considered to be temporally linked to our description of another BioMed Research International human gene knockdown paper (BioMed Research International 2017b) (Table 1). This was supported by the journal’s published response referring to the BioMed Research International paper that we communicated (Table 2), and by the two journal responses being published in the same month and year (Fig. 2). Journal time to respond was not calculated for the single response that was initiated by the authors (Retraction 2017).
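
The month-level calculation described above can be sketched as follows. This is an illustrative Python example under stated assumptions, not the procedure used to produce Fig. 2: the function name is hypothetical, dates are represented as (year, month) pairs, and the response dates in the usage lines are invented for illustration (only the June 2015 and January 2017 notification months come from the text).

```python
def months_between(notified, responded):
    """Whole months from notification to response.

    Both arguments are (year, month) tuples, so day-of-month is ignored,
    matching a calculation based on month and year only.
    """
    (y0, m0), (y1, m1) = notified, responded
    for month in (m0, m1):
        if not 1 <= month <= 12:
            raise ValueError(f"month out of range: {month}")
    elapsed = (y1 - y0) * 12 + (m1 - m0)
    if elapsed < 0:
        raise ValueError("response precedes notification")
    return elapsed


# Hypothetical response dates, for illustration only.
print(months_between((2015, 6), (2015, 8)))   # notified June 2015
print(months_between((2017, 1), (2018, 6)))   # notified January 2017
```

Because only the month and year are used, a response published late in one month and a notification sent early in another can differ from the true elapsed time by up to about one month.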

Table 2 Comparison of 26 published journal responses to 25 human gene knockdown papers that describe an incorrect non-targeting nucleotide sequence reagent
Fig. 2

Diagrammatic representation of journal response times to 30 gene knockdown papers, from communication to journal decision, according to the timeline on the X axis (years, numbers refer to the first month in each quarter). The retraction which was requested by the authors (Retraction 2017) is not shown. Each horizontal line represents a single publication, with different colours representing different journals according to the key shown at left. Published retractions, expressions of concern and author corrections are represented by solid lines, broken lines, and dotted lines, respectively. Decisions to take no action are indicated by shaded dotted lines. The incorrect non-targeting control is shown at the right of each publication (A = Sequence A, A’ = variant Sequence A, D = Sequence D, D’ = variant Sequence D). PubMed IDs of retractions, expressions of concern, corrections or papers for which no action was taken are shown at left, apart from the single author correction that resolved an earlier expression of concern, where the PubMed ID of the correction is shown at right. One author correction (PubMed ID 31329732) jointly corrected two papers.

All published journal responses were visually inspected to determine whether individual responses referred to the incorrect non-targeting control, and whether authors were consulted prior to publication of the response. As two author corrections referred to a common reagent description error made by the same reagent supply company (Correction 2018), original publications (Table 1) were examined to identify which reagent supply companies were cited as having provided either expression plasmids which are likely to encode shRNA reagents and/or shRNA sequence reagents. The terms “Shanghai” and “HollyBio” were used in combination and/or individually to query pdf files for reference to Shanghai HollyBio as the supplier of gene knockdown reagents. Text references to either “Shanghai HollyBio” or “HollyBio”, where this referred to a Shanghai-based company, were accepted as corresponding to the same company. Where other suppliers were stated to have provided either expression plasmids and/or shRNA sequence reagents for gene knockdown experiments, the names and geographic locations of these suppliers were also recorded.
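
The supplier text queries described above could be approximated programmatically once the text of each pdf file has been extracted. The following Python sketch is an assumption-laden illustration rather than the procedure used here, where pdf files were queried and inspected manually: the function name and the 100-character context window are hypothetical, and "Shanghai-based" is approximated by requiring the word "Shanghai" near a HollyBio mention.

```python
import re

# Matches "HollyBio", "Hollybio", "Holly Bio..." regardless of case.
HOLLYBIO = re.compile(r"holly\s*bio", re.IGNORECASE)


def mentions_shanghai_hollybio(text: str, window: int = 100) -> bool:
    """True if a HollyBio mention has "Shanghai" within `window` characters.

    Hypothetical helper; the window size is an arbitrary assumption and
    any hit would still need to be confirmed by reading the passage.
    """
    for match in HOLLYBIO.finditer(text):
        start = max(0, match.start() - window)
        context = text[start:match.end() + window]
        if "shanghai" in context.lower():
            return True
    return False


# Invented example sentences, for illustration only.
examples = [
    "shRNA lentivirus was purchased from Hollybio (Shanghai, China).",
    "Plasmids were supplied by GenePharma (Shanghai, China).",
    "Reagents were supplied by Holly Biotechnologies.",
]
for sentence in examples:
    print(mentions_shanghai_hollybio(sentence), sentence)
```

A case-insensitive pattern is used because the supplier name appears with varying capitalization and spacing across publications.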

Identification of other human gene knockdown papers that describe Sequence A or Sequence D as a non-targeting control reagent

Google Scholar searches were performed in July 2020 to identify other human gene knockdown publications that employed either Sequence A or Sequence D as a “non-targeting” control (Byrne and Labbé 2017). Text strings corresponding to either Sequence A (GCGGAGGGTTTGAAAGAATATCTCGAGATATTCTTTCAAACCCTCCGCTTTTTT) or Sequence D (CTAGCCCGGCCAAGGAAGTGCAATTGCATACTCGAGTATGCAATTGCACTTCCTTGGTTTTTTGTTAAT) were employed as individual text queries in Google Scholar searches (Byrne and Labbé 2017). The pdf file of each paper was then inspected to confirm (i) the presence of either Sequence A or Sequence D and (ii) either “Shanghai HollyBio” or “HollyBio” as a Shanghai-based company, or any other gene knockdown reagent supplier. Papers were also screened using Seek and Blastn, with nucleotide sequence identities manually verified as previously described (Labbé et al. 2019). The PubMed ID of each identified paper was used to query email correspondence archives, to verify whether any papers had been previously communicated to their publishing journals.
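
Confirming the presence of a sequence in the text of a paper can be complicated by line breaks, hyphens and 5′/3′ labels that interrupt the sequence in extracted text. The following Python sketch shows one way to tolerate such interruptions; it is an illustration with a hypothetical function name, not the manual inspection used in this study. It performs exact matching only, so variant sequences with single-nucleotide deletions (Fig. 1b, c) would not be detected by it.

```python
import re

SEQUENCE_A = "GCGGAGGGTTTGAAAGAATATCTCGAGATATTCTTTCAAACCCTCCGCTTTTTT"


def contains_sequence(text: str, query: str) -> bool:
    """True if `query` occurs in `text` once non-letters are removed.

    Removing whitespace, hyphens, digits and punctuation lets a sequence
    that is split across lines still match. Hypothetical helper.
    """
    letters_only = re.sub(r"[^A-Za-z]", "", text).upper()
    return query.upper() in letters_only


# Invented example: Sequence A broken across three lines of extracted text.
extracted = (
    "control shRNA 5'-" + SEQUENCE_A[:20] + "\n"
    + SEQUENCE_A[20:40] + "-\n"
    + SEQUENCE_A[40:] + "-3' was used"
)
print(contains_sequence(extracted, SEQUENCE_A))

# A variant lacking one nucleotide is not found by exact matching.
variant = SEQUENCE_A[:10] + SEQUENCE_A[11:]
print(contains_sequence("control shRNA " + variant, SEQUENCE_A))
```

Exact matching of this kind is deliberately strict, which is one reason why similarity-based screening remains a useful complement.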

Citation counts for human gene knockdown papers

Citation counts for human gene knockdown papers are those identified by Google Scholar in September 2020. Citations within retraction notices, expressions of concern and/or author corrections to the same paper were excluded.

Results

Journal responses to human gene knockdown publications with an incorrect non-targeting control

We analysed journal responses to 31 gene knockdown papers that commonly examined single human genes in cancer cell lines corresponding to different cancer types (Table 1). The 31 papers targeted 21 different human genes, with 5 genes being represented across multiple papers and human cancer types (Table 1). All 31 papers shared the fatal error of describing a “non-targeting” control reagent that is predicted to target a human gene (Byrne and Labbé 2017) (Tables 1, 2). Most (26/31) papers specified Sequence A as their “non-targeting” control, which is predicted to target TPD52L2 (Fig. 1b), whereas the remaining 5 papers described Sequence D, which is predicted to target NOB1 (Fig. 1c, Table 1) (Byrne and Labbé 2017). The 31 papers were published by 13 different journals between 2014 and 2017 (Table 1). Citation numbers for individual publications ranged from 2 to 19 citations per paper, and as of September 2020, the 31 papers had been collectively cited 279 times (Table 1).

The 31 papers generated 32 post-publication responses, which represented 14 retractions, 5 expressions of concern, 7 author corrections (including one author correction that resolved an earlier expression of concern), and 6 stated decisions to take no action (Tables 1, 2). As described in the Methods, we could measure response times for 31/32 journal responses, excluding the single author-initiated retraction (Retraction 2017) (Table 2, Fig. 2). Overall, journal response times varied from 2 to 30 months (Fig. 2). Times to retraction varied from 2 to 30 months for 13 retractions that were published from 2016 to 2019, whereas times to publication of 5 expressions of concern and 7 author corrections ranged from 17–21 and 19–28 months, respectively. Most or all author corrections and expressions of concern were published during 2018, with the single correction that resolved an expression of concern being published in 2019 (Fig. 2). One journal required 13 months to communicate their decision to take no action over 6 papers (Fig. 2). Only 6/31 responses were published within 12 months, all of which were retractions (Fig. 2).

We considered whether the 26 published responses (retractions, expressions of concern, or author corrections) referred to the “non-targeting” reagent and/or involved input from study authors (Table 2). Most (21/26) published responses (9/14 retractions, 5/5 expressions of concern, 7/7 author corrections) referred to the incorrect control reagent (Table 2). Of the remaining 5 retractions, one referred indirectly to “experimental data/ plasmid vectors”, two referred to “experimental defects”, one referred to “experiments… sourced out to a biotechnology company”, and the final retraction referred to the use of a contaminated human cell line (Table 2). Similarly, most (17/26) published responses (9/14 retractions, 1/5 expressions of concern, 7/7 author corrections) mentioned author involvement or agreement with the published response, whereas the remaining 9 responses (5/14 retractions, 4/5 expressions of concern) either did not mention author involvement or specified that authors could not be contacted (Table 2). Overall, 15 published responses (7/14 retractions, 1/5 expressions of concern and 7/7 author corrections) referred to the wrongly identified control reagent and specified author involvement in the published response.

When responses were considered according to publishing journal, 4/13 journals responded to a single paper, publishing a retraction in each case, whereas the remaining 9 journals responded to at least two papers. Five of these 9 journals issued the same response to multiple papers (either retractions, expressions of concern, or the decision to take no action), whereas 4 journals published different responses to different papers (Fig. 2, Table 2). Acta Biochimica Biophysica Sinica published an author correction to one paper but later retracted a second paper (Fig. 2, Table 2). Cancer Biotherapy and Radiopharmaceuticals retracted two papers that were communicated in June 2015 and subsequently published author corrections of two papers that were communicated in January 2017 (Fig. 2, Table 2). Chemical Biology and Drug Design retracted two papers and published author corrections of 3 papers (Fig. 2, Table 2). One of these author corrections is factually incorrect, because the authors denied the significance of the incorrect non-targeting control (Table 2). Finally, OncoTargets and Therapy retracted one paper in response to an author description of an incorrect non-targeting reagent (Sequence A) (Table 2) and subsequently published an expression of concern for another paper that employed the same “non-targeting” control (Fig. 2, Table 2). Whereas the author-initiated retraction stated that the authors had been advised of an error affecting the control sequence by the supplier, the subsequently published expression of concern for a different paper indicated that OncoTargets and Therapy “was unable to make a definitive conclusion” about the same wrongly identified control reagent (Table 2).

External suppliers of “non-targeting” control reagents

We noted that most (6/7) author corrections replaced Sequence A with non-targeting sequences that are identical over 44 nucleotides (Table 2). A joint correction of two Cancer Biotherapy and Radiopharmaceuticals papers stated that the non-targeting control sequence was wrongly specified by the supplier, leading to the wrong control shRNA sequence (Sequence A) being supplied in the product specification (Correction 2018) (Table 2). Although the published correction referred to the reagent supplier as Holly Biotechnologies (Correction 2018), the corresponding Cancer Biotherapy and Radiopharmaceuticals papers referred to the gene knockdown reagent supplier as Shanghai HollyBio or Hollybio Shanghai (Zhang et al. 2014a, b).

Text analyses revealed that almost all (30/31) gene knockdown papers referred to Shanghai HollyBio or Hollybio based in Shanghai as the supplier of either expression plasmids and/or gene knockdown nucleotide sequence reagents. These 30 papers included all 25 papers with published responses and 5/6 papers where the journal elected to take no action, with the remaining paper naming another Shanghai-based company GenePharma as the supplier of expression plasmids and nucleotide sequence reagents (Table 1). We therefore examined all published responses for explanations for the incorrect control reagent. While no other journal responses beyond the corrections published by Cancer Biotherapy and Radiopharmaceuticals named any reagent supply company, the author correction published by Acta Biochimica Biophysica Sinica referred to “technical problems of the manufacturer” (Table 2). Five retractions confirmed the use of incorrect reagents, either by stating that the authors agreed that incorrect sequences had been used (Chemical Biology and Drug Design, Korean Journal of Physiology and Pharmacology), or that the negative control sequence was written wrongly, without further explanation (Journal of Breast Cancer), or that the control plasmid was in fact an empty vector (BioMed Research International) or that authors could not confirm that the corrected control sequence provided by the supplier had been used in experiments (OncoTargets and Therapy) (Table 2).

Additional human gene knockdown papers that describe Sequence A or Sequence D as “non-targeting” control reagents

We have previously used nucleotide sequences as text strings in Google Scholar searches to identify papers that describe incorrect non-targeting controls (Byrne and Labbé 2017). In addition to papers listed in Table 1, Google Scholar searches performed in July 2020 identified 19 human gene knockdown papers that described either Sequence A (n = 15 papers) or Sequence D (n = 4 papers) as the “non-targeting” control. Ten of these 19 papers have been previously described (Byrne and Labbé 2017; Labbé et al. 2019) and communicated to their publishing journals (Table 3). The 19 papers were published between 2015 and 2019 by 13 journals and examined 13 different human genes in cell lines corresponding to 13 human cancer types (Table 3). Over half (7/13) of these human genes were represented across multiple papers and cancer types (Tables 1, 3). Most (11/19) papers specified that gene knockdown reagents were supplied by Shanghai HollyBio, with 5 papers naming 4 other suppliers (GenePharma, GeneChem, Hanbio, ShanghaiBio) that were commonly based in Shanghai, China (Table 3). In addition to their wrongly identified non-targeting reagent, Seek and Blastn analyses supported by manual verification found that 6/19 papers also described 2–5 wrongly identified targeting reagents, most of which were predicted to target genes other than those claimed in the text (Supplementary Table 1). Citation numbers for individual publications ranged from 1 to 54 citations per paper, and as of September 2020, the 19 papers had been collectively cited 229 times (Table 3).

Table 3 Additional human gene knockdown publications employing human cancer cell lines that describe an incorrect non-targeting nucleotide sequence reagent, identified using either Sequence A or Sequence D as a Google Scholar search query

Discussion

Comparisons of 32 journal responses to 31 human gene knockdown papers have highlighted the wide range of responses by 13 journals to notifications of incorrect control reagents that represent fatal errors in gene knockdown studies. Journal responses ranged from issuing retractions to deciding to take no action, with variations in responses being noted both between different journals and by 4 individual journals. Viewed individually, many published journal responses are reasonable. Retractions may have reflected numerous journals’ interpretation of the fatal error represented by the incorrect control reagent, which was also described in the context of other concerns regarding textual and structural similarity with other human gene knockdown papers (Byrne and Labbé 2017). Expressions of concern may also have been considered reasonable responses while journal investigations continue (Vaught et al. 2017), and we note that most expressions of concern did not specify author involvement or stated that authors could not be contacted. Where authors responded with plausible explanations for the incorrect control sequence, it is also reasonable to provide an opportunity to correct the published record. Nonetheless, given the collective range of journal responses, we set out to explain why variable and at times inconsistent responses were issued to highly similar human gene knockdown papers with the same reagent error type.

We recognize that the present analysis was not designed as a controlled study, and although the 31 papers shared striking similarities and an incorrect non-targeting control, these papers were not identical. Some papers presented with targeting nucleotide sequence reagent errors, in addition to the incorrect non-targeting control (Byrne and Labbé 2017; Labbé et al. 2019). While the presence of additional errors may have affected journal responses, the single “non-targeting” control reagent remained sufficient to invalidate all gene knockdown results described in each paper. The range of responses suggests that while most journals recognized the described error, some journals variably interpreted its significance. Particular responses indicated uncertainty about the significance of the “non-targeting” control, such as responses that did not mention the reagent error, the publication of an author correction that dismissed the importance of an incorrect non-targeting control (Table 2), and most concerningly, the single journal that elected to take no action over 6 papers (Table 1).

We recognize that the interpretation of errors that are notified by third parties may be more complex than errors described by authors, and that nucleotide sequence reagent errors can be difficult to understand, particularly those affecting non-targeting controls. As we notified concerns about 29/31 papers, we must share responsibility for journals having an incomplete understanding of the significance of the incorrect control reagent. We described our concerns by email, and varied journal responses could suggest that emails may inadequately explain concepts such as incorrect gene knockdown controls. Our emails may have provided insufficient information and/or may have used a format that did not allow journal staff to quickly or easily understand the described errors and concerns, such as the status of an incorrect non-targeting control as a fatal error. For this reason, we suggest the derivation and adoption of structured, standardized and ideally universal templates for post-publication error communication (Table 4). Such templates would be co-designed by researchers, editorial staff and publishers to provide the information that different stakeholders need to assess published errors.

Structured and shared error communication templates would present many advantages. Such templates could be used by both authors and third parties and completed templates could then be shared with authors and peer reviewers as required. Templates could also be formatted to allow authors and/or peer reviewers to respond directly to individual concerns. Shared templates could assist with describing errors at post-publication review sites such as PubPeer and could facilitate the communication of published errors between PubPeer and journals. Structured templates could also encourage more standardized and transparent reporting in retraction notices and other post-publication reports (Vorland et al. 2020; Vuong 2020a, b), whereas universally accepted templates would allow information about published errors to be aggregated, analysed and reported. This would improve the awareness and understanding of particular error types, which could encourage more journals to proactively investigate publications and submitted manuscripts for repeated errors such as wrongly identified nucleotide reagents. We propose a draft template (Table 4) that could align with and extend the “REAPPRAISED” checklist for publication integrity (Grey et al. 2020b). We welcome input from different stakeholders to ensure that this template is broadly useful and widely adopted.

Table 4 Draft communication template for published errors

While recognizing the challenges of communicating information about published errors, we note that a subset of retractions, expressions of concern and author corrections offered contrasting and at times inconsistent explanations for the incorrect control reagent. These different explanations for a common reagent error type may have also contributed to variations in journal responses. Indeed, contrasting explanations were unexpected, given that almost all (30/31) papers, and all 25 papers with published responses, stated that gene knockdown reagents were obtained from a common reagent supplier, Shanghai Hollybio. In 6/7 author corrections, the “non-targeting” control sequence was corrected to the same non-targeting control sequence, which is consistent with the support of a common supplier, and in the joint author correction of two papers, Holly Biotechnologies (or Shanghai Hollybio) accepted responsibility for the incorrect reagent description (Correction 2018). However, in contrast to assertions that the supplier error only affected the product description, the authors of other gene knockdown papers accepted that an incorrect non-targeting control had been used, or may have been used, and either requested or agreed to article retractions (Table 2). Taken together, these conflicting accounts indicate considerable uncertainty over the control reagents that were supplied and therefore the experiments that were conducted.

It was also surprising to note that the same “non-targeting” control reagents were supplied by Shanghai Hollybio as well as other Shanghai-based companies. In total, 5 different companies were listed as supplying Sequence A and/or Sequence D as “non-targeting” controls to human gene knockdown studies that analyse the functions of single human genes in human cancer cell lines. It is difficult to understand how multiple companies could supply identical incorrect reagents, or identical wrongly described reagents, when errors in products and/or their specifications might be reasonably expected to be manufacturer-specific (Knight 2001).

Regardless of whether errors affect material reagents or their descriptions, companies that supply incorrect reagents should make reasonable efforts to inform their clients of their mistakes (Knight 2001). Such responses seem particularly critical in the case of incorrect non-targeting control reagents that could be wrongly paired with targeting reagents for the analysis of literally thousands of human genes and hence provided to many individual clients. Published descriptions of specific incorrect non-targeting controls (Byrne and Labbé 2017; Byrne et al. 2019; Labbé et al. 2019) should also enable authors to contact reagent suppliers and/or take steps to correct their own publications, and we recognize that such efforts may be in progress. However, despite two years passing since the publication of author corrections where Holly Biotechnologies/Shanghai Hollybio accepted responsibility for the incorrect description of a non-targeting control (Correction 2018), and despite numerous other papers that incorrectly describe Sequence A or Sequence D being clearly visible within the literature (Table 3), we could find no other author- or supplier-initiated published responses, beyond those analysed in the present study. It is increasingly difficult to understand why numerous papers that describe known incorrect control reagents should remain uncorrected within the literature.

One previously highlighted journal retraction concerned a paper that described Sequence A as both a gene targeting and a non-targeting reagent and yet reported different results for these identical experiments (Byrne and Labbé 2017; Retraction 2016). The retraction notice stated that the “experiments were sourced out to a biotechnology company” (Retraction 2016). This retraction notice, combined with striking similarities between gene knockdown studies and their frequent shared nucleotide sequence reagent errors, led to the proposal that these cancer research papers may have been produced with undeclared assistance from external parties such as paper mills (Byrne and Labbé 2017; Byrne et al. 2019). Inconsistent explanations for the description of “non-targeting” control reagents, the implausible provision of identical “non-targeting” reagents by different external suppliers, and the apparently very limited efforts to correct affected papers by both reagent suppliers and study authors, could all be consistent with papers having been produced with undeclared external assistance, and with experiments not having been performed as described (Byrne and Labbé 2017).

Inconsistent explanations for the description of “non-targeting” control reagents also raise the concerning possibility that some information that is provided to journals in response to error notifications could itself reflect undeclared external assistance. This could mean that in addition to contributing papers to the literature, paper mills might also contribute information in response to notified errors or concerns, possibly to reduce the likelihood of publications being retracted. This possibility would considerably complicate journal efforts to appropriately respond to flagged errors and could result in the publication of what might be factually incorrect author “corrections”. Such incorrect corrections would allow study conclusions to remain unchallenged such that they could continue to mislead future cancer research efforts.

In summary, the present case study of journal responses to a set of human gene knockdown papers that share specific incorrect control reagents highlights the thin line that can exist between journal retractions, expressions of concern, author corrections, and taking no action at all. We propose that standardized error notifications might improve post-publication responses to published errors, enhancing the capacity of science to self-correct. Greater awareness and understanding of nucleotide sequence reagent errors should allow more journals to proactively identify and investigate papers that describe known incorrect reagents and to respond appropriately to papers that describe wrongly identified nucleotide sequence reagents. Indeed, the many cancer patients worldwide who look to biomedical research for improved treatments and quality of life deserve nothing less.