Current Implementation of Editorial Procedures
Table 2 lists all editorial procedures and attributes studied in our research. Appendix B (electronic supplementary material) also presents an overview of the current implementation of all 12 editorial attributes according to our data.
Table 2 The different forms of peer review, categorized by dimension and attributes

The table in Appendix B demonstrates some clear differences in the uptake of editorial procedures, especially indicating that several ‘traditional’ review procedures are still ubiquitous, such as selection of reviewers by editors (97%), keeping reviewer identities anonymous (94%), or pre-publication review (97%). In contrast, some more recent or innovative procedures are virtually absent, including review in which reviewer identities are made public (2%), review by commercial platforms (1%) and post-publication review (2%).
Even though some editorial procedures are ubiquitous, there is no ‘standard’ model for peer review, nor even a limited set of standard models. The core set of review procedures, used in combination by 75% of all journals, consists of five principles: (i) pre-publication review, (ii) using methodological rigour and correctness as selection criteria, (iii) performed by external reviewers suggested and selected by editor(s), (iv) keeping reviewers anonymous (both to authors and other reviewers, as well as to readers of the published manuscript) and (v) making review reports accessible to authors and editors. However, as soon as we add more characteristics to this set, the commonality between journals quickly drops. Outside of this set, editorial procedures are quite diverse, with journals engaging in review procedures that differ on at least one of the twelve attributes studied. Hence, even though some basic review procedures seem universal, only relatively few journals use all of them and only very few journals perform the editorial process in the exact same way. Given that editorial procedures in a large share of journals are more or less centrally organised through large publishers, this considerable heterogeneity in journals’ review procedures might be deemed surprising. In the following sections we will look more specifically at the distribution of editorial procedures across scientific disciplines and academic publishers.
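To make concrete how the share of journals using such a combination of procedures can be determined, the sketch below counts the fraction of surveyed journals whose responses match a given set of attribute values. It is a minimal illustration only: the data frame, column names and values are hypothetical and do not reflect the actual survey coding.

```python
import pandas as pd

# Hypothetical survey extract; column names and values are illustrative only.
journals = pd.DataFrame({
    "timing":             ["pre", "pre", "post", "pre"],
    "reviewer_selection": ["editor", "editor", "editor", "author_suggested"],
    "reviewer_identity":  ["anonymous", "anonymous", "public", "anonymous"],
    "report_access":      ["authors_editors", "authors_editors", "public", "authors_editors"],
})

# A 'core set' is a specific combination of attribute values; the share of
# journals using all of them in combination is the fraction of rows matching
# every condition at once.
core_set = {
    "timing": "pre",
    "reviewer_selection": "editor",
    "reviewer_identity": "anonymous",
    "report_access": "authors_editors",
}

mask = pd.Series(True, index=journals.index)
for column, value in core_set.items():
    mask &= journals[column] == value

print(f"Share of journals using the full core set: {mask.mean():.0%}")
```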
The Distribution of Editorial Procedures
Research Disciplines
Peer review is commonly presented as field-specific, with particular procedures that are common for particular research areas. However, our data suggest rather the opposite. For most of the editorial attributes studied, research fields appear strikingly similar. While journals tend to differ in their editorial procedures in subtle ways, when aggregating over disciplines, these variations dissolve. In fact, only two of the twelve attributes display substantial differences between fields: the level of author anonymity and the form of statistics review.
The former, demonstrated in Figure 1, represents a well-known difference between the social sciences and humanities, on the one hand, and the natural and health sciences, on the other. While in SSH journals it is common to blind author identities to reviewers (but not to editors), journals in all other domains more commonly disclose author identities both to editors and reviewers. The biomedical and health sciences demonstrate the most diversity, with 63% of the journals disclosing author identities, 36% blinding author identities to reviewers only, and 2% blinding author identities both to reviewers and editors. These findings are consistent with a Taylor & Francis survey, in which SSH editors reported having used, at some point in time, double-blind review (86%) and single-blind review (35%), while STM editors reported single-blind (75%) and double-blind (42%) review (Taylor & Francis 2015). Our overall occurrence rate for double-blind procedures resembles that of the Directory of Open Access Journals (48%), but reliable disciplinary breakdowns are not provided there (Directory of Open Access Journals 2018).
The second major difference between scientific domains concerns how they perform statistics review (Figure 2). This is perhaps not surprising given the differing importance of statistical analyses across domains. Most notably, statistical review was deemed ‘not applicable’ by many journals in mathematics and computer sciences, physical sciences and engineering, and the social sciences and humanities. In contrast, it is considered relevant for the biomedical and health sciences, as well as for the life and earth sciences. In the latter, statistics review is predominantly incorporated in the general review assignment, whereas in the former more than half of the journals report having specialist statistics reviewers to evaluate these aspects of the manuscript.
Publishers
Similar to the distribution of editorial procedures over scientific disciplines, the distribution of procedures over large and small academic publishers is fairly homogeneous. Distinguishing between the five largest publishers (Elsevier, Springer, Wiley, Taylor & Francis, and Sage) and the other, smaller, publishers, we only notice three significant differences in the way their affiliated journals organise their editorial process. Journals affiliated with the large publishers more often communicate author responses to review reports with reviewers (68% vs. 56%), and more often use plagiarism detection software (70% vs. 55%). In contrast, journals affiliated with smaller publishers more often facilitate reader commentary on the journal’s webpage (25% vs. 14%).
Most interestingly, however, the distribution of editorial procedures for all other (47) attributes does not differ substantially between the largest and the smaller publishers. While some differences are to be expected if only due to chance, this suggests that the heterogeneity in editorial procedures occurs mainly within publishers, rather than across publishers. Hence, editors of journals at larger publishers appear to be relatively autonomous in their choice of editorial procedures; at least, no difference can be observed between the set of most prominent publishers and smaller publishers, including journals run by scientific communities or university presses.
Changes Over Time
In spite of a constant stream of innovations, editorial procedures in most journals are surprisingly stable. For example, whereas 54.3% of the journals disclosed author identities to reviewers and editors in 2000, 54.6% did so in 2008 and 54.0% in 2018. Similarly, reviewer identities were hidden from authors and other reviewers in 94.7% of the journals in 2000 and in 94.2% of journals in 2018. The vast majority of other procedures studied display very similar patterns.
For all 12 aspects of the editorial process used in our survey, we asked respondents whether any changes had taken place since 2000. Only 169 out of the 361 responding journals (47%) reported at least one such change and only 11 (3%) reported at least three changes. Hence, the majority of the journals do not report any change, suggesting that their editorial procedures have remained fixed since the beginning of this century. In total, 286 changes were reported, an average of 0.8 changes per journal. The majority of alterations in editorial procedures concerned the introduction of digital tools (most notably text similarity scanners) or changes in review criteria (usually becoming stricter), comprising 39% and 16% of all changes respectively.
Because the number of changes in editorial procedures is so low, hardly any trends are visible when plotting review procedures over time for most of the attributes studied. Only for the attribute concerning the use of digital tools is a clear trend visible since the year 2000 (Figure 3). The figure demonstrates that, especially over the last decade, journals have increasingly adopted text similarity scanners, while the share of journals not using any form of digital support is clearly declining.
Drawing on the literature about the spread of innovations, certain factors might be expected to drive the implementation of novel editorial procedures. First, we could expect more innovations in journals with high retraction rates, since innovations tend to appear as ways to tackle specific issues, and peer review is increasingly expected to detect fraudulent or erroneous manuscripts. Second, prominent or highly established journals might be expected to be drivers of change, as they have more resources available, are more centrally positioned in communication networks and have their reputation at stake. Although heavily criticised, the journal impact factor (JIF) remains one of the few widely recognised indicators of journal prominence and prestige. One could therefore expect journals with a higher JIF to be more likely to implement novel editorial procedures. However, neither factor, retraction rate nor JIF, was significantly correlated with the number of changes in editorial procedures (with r-squared values of 0.003 and 0.00004 respectively). This suggests that neither the number of retractions nor the JIF has a stimulating or restricting effect on the implementation of innovative review procedures.
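As an illustration of the kind of association examined here, the sketch below computes Pearson correlations, and the corresponding r-squared values, between two journal-level predictors and the number of reported changes. The data frame and column names are hypothetical; this is not the analysis script used in the study.

```python
import pandas as pd

# Hypothetical journal-level data; values are illustrative only.
journals = pd.DataFrame({
    "n_changes":       [0, 1, 0, 2, 0, 3, 1, 0],          # reported changes since 2000
    "retraction_rate": [0.0, 0.1, 0.0, 0.3, 0.2, 0.0, 0.1, 0.0],
    "jif":             [1.2, 3.4, 0.8, 5.6, 2.1, 7.9, 1.5, 2.3],
})

for predictor in ["retraction_rate", "jif"]:
    r = journals[predictor].corr(journals["n_changes"])   # Pearson correlation
    print(f"{predictor}: r = {r:.3f}, r^2 = {r ** 2:.5f}")
```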
Plotting the number of changes for which we have information on the exact date of implementation, we conclude that, even though only a few changes occur, the rate of implementation is generally increasing (Figure 4). The figure also shows the number of changes due to the implementation of plagiarism detection software, again confirming that this type of change accounts for the majority of innovations in peer review. It remains to be studied whether the increasing pace of innovation is a general trend, or whether the apparent trend is merely an effect of specific innovations becoming more familiar and more ingrained in communication networks, thereby temporarily lowering the threshold for implementing this specific innovation (Wejnert 2002).
The literature on the spread of innovations distinguishes between smaller and larger collectives of actors implementing an innovation, suggesting that they may be more or less likely to do so in various circumstances. Therefore, we studied the number of changes in editorial procedures in journals of either one of the five largest publishers (Elsevier, Springer, Wiley, Taylor & Francis, and Sage) or any of the other publishers. The results are plotted in Figure 5. The figure shows that large publishers contribute slightly more to the number of implemented changes. However, when accounting for the fact that our sample comprises 198 journals from the large publishers and 162 journals from the smaller publishers, this difference in the number of implemented changes becomes negligible. In addition, the trends in implementing changes at both larger and smaller publishers are highly similar, suggesting similar underlying mechanisms.
Some Reflection: Reasons for Change
Even though it was not a prime aim of our study, our data allow us to get some impression of the reasons why journals alter their editorial procedures. Though not directly invited to do so, a substantial share of the respondents reporting on changes in their editorial procedures included information about the reason for the change. Out of the 286 reported changes, 61 (21.3%) came with information on the reason for change. Even though these data have to be treated with caution, they show interesting patterns. Most notably, ‘the availability of new tools made this possible’ was frequently mentioned as a reason to adopt new editorial procedures. It was mentioned in 41% of all cases, unsurprisingly most often when reporting on changes in the use of (text similarity) scanners or support in statistical review. Other reasons frequently presented were the arrival of a new editor-in-chief (15%) or a (new) requirement by the publisher (8%).
Besides these three major reasons, other, less frequently occurring motivations for change include ‘pressure to increase impact factors’, ‘increased submission rates’ and the loss of access to a service (‘stopped to have access to this service’, e.g. to specialist statistics reviewers). In addition, some journals specifically mentioned ‘issues with fraud/misconduct’ or the intention to ‘filter “bad” science’ as reasons to implement different editorial procedures. Notably absent from the list of reported reasons for change was a history of retracted journal articles that ‘slipped through’ peer review and were later found to be problematic.
This suggests that, by and large, the opportunity to implement editorial innovations (i.e. the availability of and access to new tools, or the new expertise of a novel editor-in-chief) is the main motivator for change. In contrast, intrinsic arguments to improve peer review’s capabilities or performance are seldom given as motivations for change. Even though our data are to be considered rather exploratory, they do suggest a clear pattern and raise several questions for future research.
Innovation Niches
Our analyses of editorial procedures show a very slow implementation rate. When looking at the editorial process ‘from a distance’, little seems to be changing. However, despite this apparent stability, some innovations are actually gaining a foothold, but only in very specific niches and particular contexts of the publication system, a phenomenon extensively described in innovation studies (e.g. Smith and Raven 2012). In the following, we will provide short descriptions of four niches in which particular innovations are getting established. This will allow for reflection on the circumstances in which innovations might be more widely implemented.
Text Similarity Scanners
The only innovation for which we observe substantial implementation is the text similarity scanner, usage of which has increased significantly over the past decade. Combining different pieces of data from our study, we see a nuanced picture of the reasons for this unique success.
First, text similarity scanners promise a simple fix for the rather uncontested issue of plagiarism and problematic text recycling. Unlike many of the other review procedures, these scanners promise a guaranteed solution to a specific problem, much more so than blinding author or reviewer identities, for instance. Hence, the expectations are clear, allowing for a relatively smooth translation of expectations into requirements for the tools (Van Lente 1993).
Second, journals and publishers have a major (commercial) stake in providing or promising duplication-free manuscripts: it allows them to sell a ‘unique’ product. The larger, commercial publishers in particular may be interested in this, in line with our finding that the use of text similarity scanners is one of the few examples distinguishing the larger from the smaller publishers.
Third, similarity scanners are not only used in the publishing industry, but also in higher education, scanning student papers for plagiarism. In fact, many of the developers of such scanners consider this their primary market. For editors and publishers, the usage of these scanners in higher education provides a testbed allowing them to see whether the scanners live up to expectations. Since many editors also have a role as lecturer, this allows them to get familiar with these tools via multiple communication networks.
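The basic principle behind such scanners can be conveyed with a minimal sketch: compare the overlapping word n-grams of a submission with those of an earlier text and flag pairs with high overlap. This is only a toy illustration of the idea, not the algorithm of any particular commercial tool, and the flagging threshold is an arbitrary assumption.

```python
def ngrams(text: str, n: int = 5) -> set:
    """Return the set of overlapping word n-grams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(text_a: str, text_b: str, n: int = 5) -> float:
    """Jaccard overlap of word n-grams; higher values suggest recycled text."""
    a, b = ngrams(text_a, n), ngrams(text_b, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

# Toy example: a submission sharing a long passage with an earlier article.
submission = ("peer review procedures have remained remarkably stable since the year 2000 "
              "despite a constant stream of proposed innovations in scholarly publishing")
earlier = ("editorial procedures have remained remarkably stable since the year 2000 "
           "despite a constant stream of proposed innovations")

score = similarity(submission, earlier)
if score > 0.2:  # the threshold is an arbitrary editorial choice
    print(f"Overlap score {score:.2f}: flag for editorial attention")
```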
Registered Reports in Health and Psychology Journals
A second example of an innovation that finds substantial implementation, though only in a particular niche, is the registered report, in which research is evaluated solely on its rationale and methodology, usually before data gathering has started. Currently, this review model has been implemented in a substantial number of psychology journals, as well as in some journals in the health sciences (Center for Open Science 2018). Similar to text similarity scanners, registered reports were established with a fairly specific aim. They aim to address the alleged replication crisis, and promise to provide a more or less simple fix by facilitating the publication of negative results (combating publication bias) and making replication studies more attractive (Nosek and Lakens 2014; Horbach and Halffman 2018b). In addition, the registered report model is highly similar to the review model used for grant applications, which is also based solely on a study’s a priori rationale and methodology. Hence, akin to text similarity scanners, actors may become familiar with registered reports through various communication channels, making the innovation more familiar.
Even though concerns about the ‘replication crisis’ in science currently seem to be spreading, they originated in, and still mainly seem to affect, the medical sciences and (social) psychology (Wicherts 2017; Begley and Ioannidis 2015). Hence, the implementation of registered reports seems to be constrained to the areas for which they provide a solution to an acknowledged and well-defined problem. In addition, the registered report format seems most applicable to certain areas of research (including empirical, highly standardised fields with low levels of researchers’ degrees of freedom), while it is less applicable in fields with other methodological and epistemic traditions (such as the humanities).
Image Manipulation Scanners in Biomedical Journals
A third editorial innovation that we would like to single out is the use of image manipulation scanners. At present, they seem to be most commonly used in biomedical fields and, to a lesser extent, in some journals in psychology (Scheman and Bennett 2017). Within these fields, they again provide a solution to an uncontested issue, namely the manipulation of figures and images, such as western blots. While detecting image tweaking is still technically challenging, highly standardised representations such as western blots allow for some automated detection, or at least flagging of potential problems. Even though some prominent cases of fraud were detected through careful scanning of images and figures, including the Schön case (Consoli 2006), such detection as yet relies on human skill. While techniques based on artificial intelligence promise to take this approach to a more automated level, such expectations remain to be fulfilled (BioMed Central 2017). Currently, the use of image manipulation scanners therefore seems to be constrained to (1) fields in which images commonly occur in manuscripts; and (2) fields that have highly standardised representations in images and figures, thereby allowing relatively simple technical tools to be of genuine assistance.
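To illustrate why standardised representations make automated flagging tractable, the toy sketch below compares equally sized grayscale image patches by pixel correlation and flags near-identical regions, as might occur when a band is duplicated between two blot images. This is a deliberately simplified illustration using synthetic data, not a description of how existing scanning tools actually work, and the flagging threshold is an arbitrary assumption.

```python
import numpy as np

def patch_correlation(patch_a: np.ndarray, patch_b: np.ndarray) -> float:
    """Pearson correlation between two equally sized grayscale patches."""
    a = patch_a.astype(float).ravel()
    b = patch_b.astype(float).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom else 0.0

# Synthetic example: lane_2 is a copy of lane_1 with mild noise added;
# lane_3 is a genuinely different patch.
rng = np.random.default_rng(0)
lane_1 = rng.random((40, 40))
lane_2 = lane_1 + rng.normal(scale=0.01, size=(40, 40))
lane_3 = rng.random((40, 40))

for name, lane in [("lane_2", lane_2), ("lane_3", lane_3)]:
    r = patch_correlation(lane_1, lane)
    if r > 0.95:  # arbitrary flagging threshold
        print(f"{name}: correlation {r:.3f}, possible duplicated region")
```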
Open Review at Several Publishers
The last peer review innovation implemented in specific niches is the open review model. Several publishers have now adopted this model, with some, such as BioMed Central and the British Medical Journal, launching a range of new journals that use open review (Godlee 2002). This review procedure aligns with the more general call for opening up science and adhering to open science practices, including publishing open access, sharing data, and other forms of transparency in research (Nosek et al. 2015). Despite wide calls to follow these standards, our data show that implementation of the open review model is still rather modest and mainly confined to a number of individual publishers. Part of this may be due to the large variety of forms of ‘open review’, a term that may encompass the disclosure of reviewers’ identities to the authors of a submitted manuscript, the disclosure of such identities to the wider public, or even the publication of entire review reports (Ross-Hellauer 2017). In fact, Ross-Hellauer (2017) found at least 22 different definitions of ‘open peer review’, showing that the phrase is currently highly ambiguous and has not yet settled into a single set of features or schema for implementation. This lack of uniformity may pose a serious obstacle for editors or publishers willing to implement some form of open review in their journals.