Introduction

The scientific method dictates that new hypotheses must be formed on the basis of acquired results and that these new hypotheses must then be experimentally tested (Woodcock, 2014). As such, there is a delicate balance between innovation and continuity with previous research, which is well exemplified by common expressions such as ‘line of research’ (Foster et al., 2015).

It does not come as a surprise, therefore, that each scientific manuscript includes a long list of references, i.e. previous publications that served as sources, as the evidence that corroborates the foundations that sustain the whole theoretical edifice built by the authors. Yet, previous literature can have a broader, and even more important role. Investigators often rely on previously published papers as compasses to guide their choices and are well aware of the strident dichotomy between the need for innovative results, to position themselves within the scientific community, and the necessity to remain within the boundaries of established research directions (Merton, 1957; J. Wang et al., 2017) to get their research published, funded or even considered by the broader community. Given the high pressure on scholars to get their results published and to accrue a vast number of publications to support their career ambitions (van Dalen, 2021), it is reasonable to wonder whether the driving forces behind their research choices (when it comes to topics, methods, assays, endpoints and even reporting) are imputable only to their scientific appropriateness, or whether other, more mundane factors may contribute, e.g. prestige (Foster et al., 2015), or the desire to streamline the publication process by adopting successful models. This, however, may be a source of bias in a publication, and in some extreme cases the existence of ‘paper mills’, where scientific articles are mass produced for personal or political reasons, has been denounced (Christopher, 2021).

We have previously published a brief report on a group of scientific publications in the dental field that shared some uncanny formal resemblance in the way data were presented (Galli et al., 2019). The purpose of the present commentary is to highlight the existence of a niche of in vitro studies that went well beyond the dental field and seemed to closely adopt an exceedingly consistent format for data reporting, in order to bring it to the attention of the scientific community.

While we were unable to identify new papers following this format after 2019, the rise and fall of such a peculiarly identifiable reporting format must be scrutinized. We contend that scientific soundness was not likely to be the driving force in the establishment of such a particular praxis (it could be called a trend), a phenomenon that, as will be shown later on, is far from unique. We also believe that it is important to assess the limits and potential of such trends, to exploit them, where possible, to optimize future research and to correct their course, where deemed necessary.

Materials and methods

Online search

We searched the MEDLINE public database of biomedical literature through the PubMed portal, from its earliest available dates to May 2021, without applying date or language filters. We searched terms (see Supplementary information), including MeSH subheadings and keywords, which could match the papers we had initially come across (Galli et al., 2019). The retrieved studies were screened by 2 investigators (CG and MTC) to identify relevant studies that presented similarities to the previously published template with gingival fibroblasts (Galli et al., 2019), e.g. a grouped bar chart for cell viability as Figure I. Reference lists of included studies and relevant reviews were also searched. The journals Inflammation and International Immunopharmacology, which published several of the articles we reported in our previous study, were furthermore hand-searched to improve the detection of relevant studies. The included studies are listed in Table 1.

Table 1 List of included studies in the present review. We reported the cell model, the tested compound and the chosen stimulus used

Document similarity

To investigate the format similarity across different studies, we created a d × F feature matrix, where d represents the individual studies, in chronological order, starting from the oldest one, and F was a list of common figures that could be encountered in the studies (Table 2).

Table 2 Sequence of common features in the studies included in the review

Common elements that followed the template were identified: a diagram of the molecule structure (Diagram), a histogram for cell viability (Viability), a quantification of relevant inflammatory cytokines by ELISA (ELISA), quantification of prostaglandins and nitrites (PGE2), Cox-2 assay (Cox), p65 and IkB quantification by Western blot analysis (p65), TLR4 quantification by Western blot (TLR4), ICAM expression (ICAM), cell adherence assay (Adherence), metalloproteinase assay (MMP), Nrf2 quantitation (Nrf2), pJNK quantitation (pJNK), and PPAR-γ (PPARG). For each study we recorded the figure representing that given endpoint. We then calculated the Levenshtein (Levenshtein, 1966) and Jaccard (Halkidi et al., 2002) distances to estimate the dissimilarity between studies, using the Textdistance 4.1.5 library (Orsinium, 2019) for Python 3.7, thus obtaining a d × d distance matrix, which we then plotted as a heat map. Black indicates a distance value of 0, meaning identity, and lighter colors indicate increasing dissimilarity.
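The distance computation described above can be sketched as follows. This is a minimal, dependency-free illustration and not the code used for the study: the functions `levenshtein`, `jaccard_distance` and `distance_matrix` are hand-written stand-ins for the Textdistance library, the set-based form of the Jaccard distance is an assumption, and the three figure sequences are hypothetical rather than taken from Table 2.

```python
from typing import Callable, List, Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Edit distance between two figure sequences (sensitive to figure order)."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def jaccard_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """1 - |A ∩ B| / |A ∪ B| on the sets of endpoints (insensitive to order)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def distance_matrix(studies: List[Sequence[str]], metric: Callable) -> list:
    """Pairwise d x d matrix of distances between all studies."""
    return [[metric(x, y) for y in studies] for x in studies]


# Hypothetical studies, each encoded as its ordered list of figure types.
studies = [
    ["Viability", "ELISA", "p65"],
    ["Viability", "ELISA", "p65"],
    ["Diagram", "Viability", "PGE2", "Cox"],
]

lev = distance_matrix(studies, levenshtein)
jac = distance_matrix(studies, jaccard_distance)

for row in lev:
    print(row)
```

Each resulting matrix is symmetric with zeros on the diagonal, so identical studies would appear as black cells in a heat map drawn with the color convention described above.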

Results

The findings

We have already reported that the unexpected discovery of a limited but clearly identifiable tradition of studies in the periodontal field (Hao et al., 2017; Jian et al., 2015; F. Liu et al., 2019; Qi et al., 2018; Q. Wang et al., 2016; Q.-B. Wang et al., 2016; Yimin Wang et al., 2015; C. Wei et al., 2015; N. Zhang et al., 2017) sharing a high degree of formal similarity in their reporting prompted us to broaden our search (Galli et al., 2019). A survey of the literature has retrieved a considerable number of studies, published between 2008 and 2019, that appear to follow closely related reporting templates, in several biomedical fields besides periodontal inflammation, including microglia inflammation (Han et al., 2017; Li-hua et al., 2017; N. Liu et al., 2017; H. Wang et al., 2015; Min Wang et al., 2018; Xiaokun Wang et al., 2017; Yanan Wang et al., 2015; Wang-sheng et al., 2017; L. Zhang et al., 2018), umbilical endothelial cells (Yong Li et al., 2016; Lin et al., 2017; Song et al., 2016; Xiaodong et al., 2015; Zheng et al., 2018), macrophages (Bi et al., 2012; Ci et al., 2008, 2010; Fu et al., 2014; Fu, Liu, Liu, et al., 2012; Fu, Liu, Zhang, et al., 2012; Hu et al., 2013; Huo, Cui, et al., 2013; Huo, Gao, et al., 2013; Huo et al., 2012; Soromou et al., 2012; X. Zhang et al., 2009), epithelial cells (Liang et al., 2014; Luo et al., 2013; Yanan Wang et al., 2017; Yang et al., 2014), adipocytes (Ming Wang et al., 2015) and chondrocytes (Feng et al., 2017; Jingbo et al., 2015; Liao et al., 2016; Lou et al., 2017; Ma et al., 2015; Pan et al., 2017; Piao et al., 2015; Y. Qu et al., 2016; Qu, Zhou, et al., 2017; C. Wang et al., 2016; D. Wang et al., 2015; H. Zhang et al., 2015) (Table 1).

Most of the earlier works are focused on macrophages, while different cell models were introduced in later manuscripts (Fig. 1).

Fig. 1

The diagram represents the main cell models used in the studies that appeared to closely follow the template. The oldest model was human and, mostly, murine macrophages, while later works also focused on different cell models

These manuscripts are not the product of a single research group alone, as we first hypothesized, but also of unrelated investigators (Table 1) from different institutions in China. In the great majority of the cases (52 articles out of 65) the corresponding author did not provide an institutional email address, but rather a personal address. Due to specific a priori constraints of our literature search, all the works that were included in this report shared the common purpose of investigating whether a given compound with potential beneficial effects hampered the in vitro inflammatory response evoked in a target cell population by a pro-inflammatory cue. In the vast majority of the models, inflammation was induced by LPS, with the noticeable exception of osteoarthritis studies, where Interleukin-1β was typically used with chondrocytes. Studies focusing on gingival fibroblasts consistently used 1 μg/ml LPS from Porphyromonas gingivalis, but studies on BV-2 microglia cells resorted to 0.5 μg/ml LPS from Escherichia coli, similarly to reports with umbilical endothelial cells or macrophages, where, however, 1 μg/ml LPS was again preferred. As for the tested compound, henceforth simply referred to as X compound, its nature vastly varied across studies, although most manuscripts actually focused on herbal extracts or molecules of natural origin. This was the case for instance with Alpinetin, a natural flavonoid of herbal origin utilized in traditional eastern medicine (Huo et al., 2012), Magnolol, an extract of Magnolia officinalis (Chen et al., 2011) or Geniposide, which is obtained from Gardenia plants (Fu, Liu, Liu, et al., 2012). This may be explained by the great relevance of natural products in far eastern traditional medicine, which may have sensitized researchers to more actively investigate the properties of herbal compounds, but it could also be due to the greater accessibility of natural compounds for independent academic testing, e.g. when compared to novel synthetic molecules, which are presumably more often screened by private pharma companies prior to the clinical stages.

A comparative view

As we tried to assess the similarity across these studies, we observed that most of these studies used similar endpoints, which were then plotted using strikingly similar graphs (Fig. 2, Table 2).

Fig. 2

This figure represents the occurrences of specific figure types at given positions in the manuscripts. The most consistent feature of the template was the iconic viability histogram usually found as Fig. 1, followed by a cytokine assay by ELISA as Fig. 2, as described in this review

We recorded the position of each figure in the manuscript and its content, and were thus able to compute scores for dissimilarity among studies, which we plotted as a matrix (Fig. 3).

Fig. 3

Heat map diagrams representing the distance matrices of all the papers included in the review (A, D), only the chondrocyte studies (B, E), or the remaining, non-chondrocyte studies (C, F). The distance between the studies was calculated using the Levenshtein (A–C) or Jaccard (D–F) distance, based on the data plots and their position in the paper

This distance is mostly based on the presence of the same endpoints, as in the Jaccard distance, and also on the position of the figures in the text, as in the Levenshtein distance. We plotted the distance matrix of the studies as a heat map, where the numbers on the axes represent the index of a study as found in Table 2, and colors represent the degree of distance between studies. Taken together, the heat maps suggest that early studies had a higher degree of similarity in their reported endpoints, represented by darker colors (Fig. 3A, D), whereas the degree of similarity was lower among later works. Since the studies on chondrocytes were conducted in a slightly different way from all the other studies, namely they used IL-1β rather than LPS to elicit the inflammatory response in cells, we believed that they could constitute an isolated sub-group.
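The complementary behavior of the two metrics can be illustrated with a small sketch. It assumes a set-based Jaccard distance and a standard Levenshtein distance computed on whole figure labels; the helper functions and the three figure sequences are hypothetical, loosely modeled on the kind of reordering and omission of figures discussed in this review.

```python
from functools import lru_cache


def levenshtein(a, b):
    """Order-sensitive edit distance between two sequences of figure labels."""
    a, b = tuple(a), tuple(b)

    @lru_cache(maxsize=None)
    def dist(i, j):
        # Distance between the first i labels of a and the first j labels of b.
        if i == 0 or j == 0:
            return i + j
        return min(
            dist(i - 1, j) + 1,                           # drop a label from a
            dist(i, j - 1) + 1,                           # add a label from b
            dist(i - 1, j - 1) + (a[i - 1] != b[j - 1]),  # match or substitute
        )

    return dist(len(a), len(b))


def jaccard_distance(a, b):
    """Order-insensitive distance based only on which endpoints are present."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return 1.0 - len(set_a & set_b) / len(union) if union else 0.0


# Hypothetical study and two variants of it.
reference = ["Viability", "ELISA", "ICAM", "Adherence", "p65"]
reordered = ["Viability", "ELISA", "Adherence", "ICAM", "p65"]  # two figures swapped
shortened = ["Viability", "ELISA", "p65"]                       # two figures omitted

swap_lev = levenshtein(reference, reordered)
swap_jac = jaccard_distance(reference, reordered)
drop_lev = levenshtein(reference, shortened)
drop_jac = jaccard_distance(reference, shortened)

print(swap_lev, swap_jac)
print(drop_lev, drop_jac)
```

In this sketch, swapping two figures leaves the Jaccard distance at zero while the Levenshtein distance increases, whereas omitting figures raises both, which is why the two heat maps of the same studies need not look alike.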

When we examined the studies on chondrocytes alone (Fig. 3B, E), we observed that the distance between studies progressively decreased, as the heat map was darker for comparisons among later studies than among early studies, which is easier to see with the Jaccard distance (Fig. 3E). This would suggest that authors publishing in this field and referring to the template progressively adhered to it more closely, or converged on a new standard of reporting.

When we then analyzed the remaining studies, their distance again appeared lower among the older studies, at least after an initial period, after which similarity decreased. There could be a resurgence in similarity in the most recent studies (Fig. 3C, F).

It must be remembered, however, that although the sequence of figures in the manuscript may differ, individual figures across studies often had an uncanny esthetic resemblance, as explained in greater detail below. This aspect of similarity could be captured only through a qualitative evaluation of the manuscripts.

A jack of all trades

We identified 65 studies that closely followed this format template, so that it was possible to outline a prototypical structure. This prototype was then adapted in different ways, according to the specific needs of a given cell type and therefore the characteristics of a specific inflammatory microenvironment. Within a field of study, e.g. periodontal inflammation or microglia inflammation, however, the format was followed with great consistency, down to the very details, including the order of appearance of a certain chart, how data were plotted, what color combinations were chosen, what measures of dispersion were used and sometimes even the wording of a figure legend. These reports typically presented a very standard core set of figures, usually Figures I through IV (henceforth indicated by Roman numerals to distinguish them from the illustrations of the present study), where the highest degree of similarity was found, and included additional figures that had a higher degree of originality. The most important characteristics of this template can be found in greater detail below.

Figure I. The most distinctive feature in the overwhelming majority of studies that conformed to this format is represented by Figure I, which usually depicted cell viability after the addition of LPS and in the presence or in the absence of 3 increasing doses of the tested compound (Fig. 4A). The purpose of this figure was usually to demonstrate that the X compound did not exert toxic or undesired effects on cell viability, even under pro-inflammatory conditions, which is a sound pre-requisite for its potential use in a clinical setting. Very few papers avoided showing cell viability data as Figure I, and even in those cases, the authors stated that cell viability was measured but not shown (Lei Li et al., 2017; Piao et al., 2015). Some papers, however, preferred to represent the molecular structure of the tested compound as Figure I, and the remaining experimental results then followed the first diagram. Only Ming et al. showed cell viability data as Figure IV (Xiaodong et al., 2015) and only a very recent paper used a modified bar chart (Zheng et al., 2018), which could be a sign that this template eventually underwent some further adaptation or fine tuning.

Fig. 4

A Standard representation of cell viability in the absence or in the presence of LPS and in the presence or absence of an X compound, as used in the template. B Cell viability histogram as found in chondrocyte studies. Conc1 = Concentration 1 of X compound; Conc2 = Concentration 2; Conc3 = Concentration 3

Cell viability was quite consistently determined by MTT assay, although a smaller number of papers, interestingly mostly in the periodontal field, used the CCK-8 assay (Qi et al., 2018; Yimin Wang et al., 2015). This figure consistently consisted of a bar chart that expressed cell viability as a percentage normalized to the cell viability levels of the control group, where no LPS or X compound was added. Interestingly, the Y axis of the graph invariably started at 50% viability in all the publications that followed the format. More importantly, this bar chart relied on a two-color scheme, which included a lighter color, usually white, which was commonly used for the Control group, LPS group and LPS + X compound, and a darker color, e.g. grey or black, which denoted the groups stimulated with only the X compound (the color scheme was reversed in some reports). However, according to the graph labels, the light color indicated “Drugs + LPS”, which was only partially correct: the same color was in fact also used for the Control and LPS only groups. Nevertheless, this labelling and data plotting enjoyed tremendous success, as it was encountered in virtually all the manuscripts that adopted this format, and in only 1 instance, to the best of our knowledge, a third color was used in the graph to distinguish the control and LPS only groups (W. Wei et al., 2015). It should be noted that the vast majority of studies that followed this format used the wording “Drugs + LPS/Drugs only” for the graph labels (only occasionally was the actual name of the drug mentioned in the label (Q. Wang et al., 2016)), and the figure legend was also extremely consistent in its choice of words. The most common formula was: “Effects of X compound on the cell viability of Y cells. Cells were cultured with different concentrations of X compound (Concentrations may be reported here) in the absence or in the presence of (LPS concentration) LPS for 24 h. The cell viability was determined by MTT assay. The values presented are the means ± SEM of three independent experiments.” The use of Standard Error as a measure of dispersion appeared as the norm in these studies, with Standard Deviation found only in a few cases (Q. Wang et al., 2016; N. Zhang et al., 2017).

A peculiar sub-set of studies, namely those that focused on osteoarthritis chondrocytes, while broadly following a similar plotting format, presented some significant differences. Two studies (Liao et al., 2016; H. Zhang et al., 2015) used the same graph format for cell viability as described above, i.e. a grouped bar chart, though noticeably with 4 increasing concentrations of X compound instead of the more common 3 concentrations used in the other studies. However, the most common format for Figure I in chondrocyte studies was a simple bar chart (Fig. 4B) reporting data for cell viability in the absence or in the presence of 3 increasing doses (Feng et al., 2017; Lou et al., 2017; Ma et al., 2015; Pan et al., 2017; Y. Qu et al., 2016; D. Wang et al., 2015) or, more rarely, 4 or more concentrations of X compound (Lee et al., 2018; C. Wang et al., 2016; Zhong et al., 2015).

Figure II. Figure II was mostly focused on investigating how the X compound affected the production of inflammation mediators by cells after inflammatory challenge. Though Figure II could differ significantly across studies, it was very consistent within a given cell model. As for BV-2 microglia cells, Figure II typically consisted of a multi-panel figure that comprised 4 bar charts depicting the levels of TNF-α, IL-1β, PGE2 and IL-6 in culture supernatants, measured by ELISA (Han et al., 2017; N. Liu et al., 2017; H. Wang et al., 2015) in the Control, LPS, and LPS + 3 increasing concentrations of the X compound groups. Other studies had subtle differences, such as the publication by Li-Hua et al., which omitted IL-6 (Li-hua et al., 2017), or Zhang et al., which reported both the protein and the mRNA levels for TNF-α and IL-1β as Figure II, while deferring PGE2 to Figure III, together with Nitrites (L. Zhang et al., 2018). As explained elsewhere (Galli et al., 2019), studies on human gingival fibroblasts consistently reported protein levels for IL-8 and IL-6 as Figure II (Jian et al., 2015; Yimin Wang et al., 2015; C. Wei et al., 2015; N. Zhang et al., 2017), though PGE2 and Nitrite levels could be present (Hao et al., 2017; F. Liu et al., 2019).

The choice of mediators was understandably dependent on the cell model used, and when it came to umbilical vein endothelial cells Figure II often contained a multi-panel figure consisting of bar charts for TNF-α, IL-6 and IL-8 (Yong Li et al., 2016), or only TNF-α and IL-8, by ELISA (Song et al., 2016; Xiaodong et al., 2015). Studies on macrophages were less numerous and therefore it is difficult to outline a prototypical template, but they all had TNF-α and IL-6 amounts by ELISA as the second figure, in combination with cell viability (Fu, Liu, Zhang, et al., 2012) or in more complex multi-panel figures (Fu et al., 2014; W. Li et al., 2013).

PGE2 and Nitrite levels in the supernatant were most commonly shown as Figure II in studies on chondrocytes. They could be either alone (Lou et al., 2017; Ma et al., 2015; Pan et al., 2017; C. Wang et al., 2016; D. Wang et al., 2015; H. Zhang et al., 2015) or within a multi-panel figure that also contained the photograph of a WB membrane with inducible Nitric Oxide Synthase (iNOS) and Cyclooxygenase-2 in the control, LPS group and in the groups stimulated with both LPS and 3 increasing concentrations of X compound, together with 2 bar charts showing the quantitation of the WB intensities (Piao et al., 2015; Y. Qu et al., 2016). Liao et al. further added a bar chart with TNF-α levels by ELISA to this already crowded figure (Liao et al., 2016). Feng et al. had a Figure II composed of PGE2, Nitrites, TNF-α and IL-6 in a study that followed the format template less closely also with respect to other figures (Feng et al., 2017). Less closely related studies could still report PGE2 and Nitrite levels in more complex pictures (Lee et al., 2018) or in later positions in the text (Zhong et al., 2015).

Figures III and IV. Figure III was mostly dedicated to investigating the activation of intracellular signals, either signaling pathways or cell markers, with the possible exception of the chondrocyte template. In the case of both the microglia (Han et al., 2017; Li-hua et al., 2017; N. Liu et al., 2017; H. Wang et al., 2015; Min Wang et al., 2018; Wang-sheng et al., 2017; L. Zhang et al., 2018) and gingival fibroblast studies (Jian et al., 2015; Qi et al., 2018; Q. Wang et al., 2016; Q.-B. Wang et al., 2016; Yimin Wang et al., 2015; C. Wei et al., 2015), Figure III consisted of data from Western Blot analysis of the activation of the NF-kB signaling pathway. More specifically, Figure III was a multi-panel figure with the photograph of a WB membrane with lanes for the Control, LPS group and LPS + 3 increasing concentrations of the X compound. The samples were labeled for phospho-p65, phospho-IkB and β-Actin as reference. The WB membrane was accompanied by 2 bar charts on its side, representing the quantitation of its signal intensities. In one case with BV-2 microglia cells (H. Wang et al., 2015) and one with gingival fibroblasts (Jian et al., 2015) TLR-4 receptors were also quantitated in Figure III. In one case this WB figure, though present, was moved down to Figure IV (L. Zhang et al., 2018). Studies with epithelial cells did not appear to differ significantly, as in Luo et al. (2013), though in other studies the third figure was a WB analysis of Cox-2 and iNOS levels (including their quantitation as a bar chart) and the WB for phospho-p65 and phospho-IkB was moved down to a later position (Liang et al., 2014; Yang et al., 2014). When the early studies on macrophages are considered, this same structure could already be seen in Huo et al. (2012) and Fu, Liu, Liu, et al. (2012), though the other studies preferred to show results for mRNA levels for cytokines, as a support for ELISA data, and only then move to Western Blots for phosphorylated proteins (Bi et al., 2012; Fu et al., 2014; Fu, Liu, Zhang, et al., 2012; Hu et al., 2013).

Studies with umbilical endothelial cells typically adopted a modified template. Figure III could still show a WB but with ICAM/VCAM adhesion proteins, together with their bar charts, followed by a bar chart representing the effects of the X compound on cell adhesion (Yong Li et al., 2016; Song et al., 2016), but these two figures could also appear in the inverted order (Lin et al., 2017; Xiaodong et al., 2015). WB for p-p65 and p-IkB followed these graphs in all these cases.

A modified template was also usually encountered with studies on chondrocytes. Figure III usually showed Cox-2 and iNOS levels by Western Blot, with addition of the quantitation of the band intensity expressed through a bar chart (Feng et al., 2017; Lou et al., 2017; Ma et al., 2015; Pan et al., 2017; D. Wang et al., 2015; H. Zhang et al., 2015), though this multi-panel figure could also be found in an earlier position, as a part of Figure II (Liao et al., 2016) or Figure I (Piao et al., 2015), while this graph was only rarely absent (C. Wang et al., 2016) or modified (Y. Qu et al., 2016). This figure could then be directly followed by WB for phospho-p65 (Feng et al., 2017; Ma et al., 2015; H. Zhang et al., 2015) or by a multi-panel figure of 3 bar charts reporting levels of metalloproteinases (MMPs) MMP-1, MMP-3 and MMP-13 by ELISA in the control, LPS and LPS + X compound groups (Liao et al., 2016; Lou et al., 2017; Piao et al., 2015; C. Wang et al., 2016; D. Wang et al., 2015), which was in turn consistently followed by a separate figure with WB for phospho-p65 and p-IkB as described above. There was only one case where metalloproteinases were quantitated by Western Blot (Pan et al., 2017).

Discussion

We have previously reported that a conspicuous number of manuscripts appeared in the periodontal field adopting an exceedingly consistent format of reporting, to the point that it was possible to outline a detailed reporting template (Galli et al., 2019). A broader survey of the literature revealed that studies that appeared to follow the same or similar templates could be found in several other medical fields. The common element of all these studies, at least in part because of the way we conducted the literature search, was that they were all designed to answer a specific experimental question:

Does X inhibit the inflammatory response triggered by Y in Z cells?

where X represents the tested compound, Y is a pro-inflammatory stimulus, most commonly LPS, and Z is a cell phenotype. We were able to identify 65 studies (Table 1) that appeared to conform to very consistent templates, which shared such close similarities that they by far exceeded the normal degree of resemblance that can be found across studies within the same field of investigation. We propose therefore that it is possible to postulate the philological dependence of their format. This means that the authors of these studies were most likely to have adopted the same template originally stemming from a previously published paper, an archetype, to use a term that is common in Lachmann’s methodology, which they considered to be commendable and worth reproducing for their own publications. Unfortunately, at the present stage of our research, we were not able to identify such an archetype with certainty. One of the main limitations of our research is that the list of studies we included in this report is not necessarily exhaustive, because it is impossible to search a literature database like Medline on the basis of the methods used by the single studies and even less so on the basis of their iconographic similarity. The first paper that presented some of the characteristics that we have identified was published by Ci et al. (2008), and although its similarity to later papers is quite low, it is the oldest example of a manuscript reporting a cell viability plot that followed the template described in the present study. Although all the studies that we were able to identify shared a common model, i.e. the testing of an anti-inflammatory compound in a cell model, after stimulation with a known inflammatory stimulus, it is possible that this same template or a similar version of it may be encountered in different experimental settings, which we therefore missed.
We have actually just started to identify in vivo pre-clinical studies using rodent lung injury or mastitis models that appear to follow specific but extremely standardized reporting protocols (Huo, Cui, et al., 2013; Huo, Gao, et al., 2013; Yanwei Li et al., 2018; S. Qu et al., 2017; W. Wei et al., 2015). For this reason, the main purpose of the present study is to raise awareness in the readers of the palpable existence of this tradition or format of reporting, without any claim of completeness. Interestingly, all the early studies, up to 2013, were conducted in macrophages, using RAW264.7 cells or primary macrophages, and they appeared to share strong similarities (Fig. 2). More specifically, Fu, Liu, Liu, et al. (2012) and Fu, Liu, Zhang, et al. (2012) are the only instances where the prototypical bar charts of the viability assay that can usually be found as Figure I were included in a multi-panel figure together with interleukin measurements by ELISA, which appeared as separate figures in all the later reports. Moreover, these early papers were also the only cases where this bar chart did not report data as “percentage of cell viability” or where LPS concentration was expressed as mg/L instead of the more familiar μg/ml. Although it is not possible to prove that one of these papers is indeed the archetype, their features would strongly indicate that they were composed at a stage when this template was still fluid. Later manuscripts, however, although incorporating several differences in the order of the figures, followed the formatting of the individual figures so closely that even questionable aspects could be traced in virtually all the manuscripts that we included in the present review. This is, for instance, the case with Figure I, which was arguably effective in graphically showing the absence of toxic effects of the X compound.
However, it is generally accepted that the Y axis should start at 0, especially in bar charts, so having the Y axis start at 50% viability may be questionable. The same applies to the curious choice of labelling the control and LPS bars with the same color used for LPS + X compound (see above), which may be puzzling to the reader. These questionable aspects, which were nevertheless closely reproduced in the later papers, can actually be treated similarly to what philologists consider “common errors”, i.e. a strong sign of textual dependency. We can hypothesize that this template influenced later researchers in different areas and was modified to better adjust to their needs in the specific experimental settings, e.g. osteoarthritis inflammation. It could then even be theoretically possible to reconstruct a phylogenetic tree of this template, branching off as new specialized templates are derived from the previous one.
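The concern about a Y axis starting at 50% viability can be made concrete with a short calculation. The sketch below is only illustrative: the two viability values are hypothetical and not taken from any of the reviewed papers, and `drawn_height` is a helper introduced here to express how tall a bar appears once the axis is truncated.

```python
def drawn_height(value: float, axis_start: float) -> float:
    """Visible height of a bar when the Y axis begins at `axis_start`."""
    return value - axis_start


# Hypothetical viability values, as a percentage of the control group.
control = 100.0
treated = 90.0

# Ratio of the underlying values (what a bar chart starting at 0 would show).
true_ratio = treated / control

# Ratio of the visible bar heights when the axis starts at 50%.
drawn_ratio = drawn_height(treated, 50.0) / drawn_height(control, 50.0)

print(f"ratio with axis at 0:  {true_ratio:.2f}")
print(f"ratio with axis at 50: {drawn_ratio:.2f}")
```

With these assumed numbers, a 10% difference in viability is drawn as a 20% difference in bar height, so a truncated axis visually doubles the apparent effect.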

There may be several explanations behind the choice to use the same data plotting and order, which go beyond the simple acknowledgement of the validity of a specific methodology for the testing of anti-inflammatory compounds. It cannot be ruled out that cultural factors may have had a role in it, as all the authors from this group of articles are from Chinese institutions and authors of neighboring countries have generally not adopted the template, although its echoes can be felt in some studies from other Asian countries (Aroonrerk et al., 2016; Lee et al., 2018; Liang Li et al., 2015; Shin et al., 2019). Cultural and scientific prestige of published papers may have compelled authors to adopt the same format, and maybe the status of published papers, i.e. ‘successful’ papers, officially accepted by peer-reviewers and, by extension, the scientific community, may have provided some safe ground to non-native English speakers.

It must be noted that uniformity is not a reproachable feature of scientific manuscripts per se. On the contrary, standardization can increase the efficiency of scientific communication, as it decreases the cognitive burden required to interpret scientific data. It is easier for readers to quickly go through and extract data and information from manuscripts that look familiar. This is the whole idea behind the standardization of scientific manuscript formats: as readers, we normally expect an article to comprise an abstract, an introduction, a materials and methods section, and results and discussion sections. Considerable alarm, however, has been raised about the existence of paper mills, i.e. entities mass producing scientific articles that display recurrent templates and systematically fabricated data, with the only purpose of increasing the number of publications for career or political reasons, regardless of the scientific value of the published articles (Christopher, 2021; Hackett & Kelly, 2020). In our case, a closer look at the results did not unequivocally confirm a systematic plagiarism of actual data. Although the bar charts often appeared very similar, we were unable to find an exact match in Western Blot figures. Textual similarities were often subtle, but present, though usually not in the exact choice of words, but in the structure of the text. The viability assay legend was arguably the most similar piece of text in these articles and it generally started with a descriptive nominal sentence (‘Effects of X on the cell viability of Y’). A similar kind of legend was also usually found accompanying the Western Blot figure(s), while the bar charts reporting cytokine expression were consistently accompanied by a declarative legend (‘X inhibits…’). In the papers that most closely followed the template, the Introduction and the Discussion sections followed the same flow of arguments.
The articles using gingival fibroblasts to investigate LPS effects are a case in point: the Introduction consistently contained 2 paragraphs, the first about periodontitis and the second introducing the tested compound. The Discussion section usually contained 3 paragraphs, with the second paragraph consistently introducing the role of Porphyromonas gingivalis in periodontitis. These data are probably not enough to define the present articles as the product of a paper mill, but they clearly indicate a deliberate intent to closely imitate a given template.

We believe that we witnessed here the codification of a specific testing protocol and reporting format, for reasons that go beyond its scientific soundness. We mentioned that the vast majority of articles in this cohort report personal email addresses for their corresponding authors. The use of personal email addresses in lieu of institutional ones is a debated issue in the literature (Kozak et al., 2015) and has been associated with increased retraction and lower citation counts (X. Liu & Chen, 2021; Shen et al., 2018). Although proponents of personal email addresses argue that they are preferable because they allow authors to be identified even if they change affiliation, the presence of private email addresses, often displaying only initials, makes unambiguous author identification more difficult.

We further witnessed the demise, or rather the abandonment, of this template, which is more puzzling. Echoes of the template can be found in some recent papers where modules of the format were re-elaborated but still preserved a faint resemblance to the model we described (Tao et al., 2020; Zhao et al., 2020), so it is safe to assume that the authors of these papers were aware of this previous literature. To the best of our knowledge, however, no example of adoption of the template can be found in publications from late 2019 to mid-2020 or later.

What happened can only be guessed. It is possible that research simply moved on, and that the groups that had adopted the format progressed to further stages of investigation (e.g. in vivo), where the template was not applicable. It is also possible that later researchers deemed it insufficient and pivoted to different models. It is even possible that our previous paper on the topic, which was published in 2019, prompted the groups involved to abandon the template to avoid repercussions. The issue deserves further investigation, which is beyond the scope of the present report.

The most important question that our findings raise is probably how widespread such phenomena are in science.

Conclusions

A considerable number of articles were published in recent years, up to 2019, according to a very consistent and strict template that dictated which endpoints were measured and the way and order in which data were presented in the paper. These manuscripts, which focused on the anti-inflammatory properties of mostly natural compounds, were not the product of a single research group, yet they followed a precise reporting format. This format may have started in 2008, with reports in macrophage lines, and was later adopted with different cell models and adapted to the specific needs of individual research fields, differentiating into subsets whose common origin is nevertheless still clearly recognizable. Acknowledging the existence of such reporting traditions is necessary in order to discuss them, accept them if they are deemed worthy, and discard them when they are unnecessary.