Introduction

Research disciplines differ not only in the topic and focus of their research but also in the structure of their manuscripts, peer review evaluation, editorial processes, and research methodology. Because research work is more closely linked to reading and writing in some areas, such as the humanities, differences may also occur in language and writing styles and in the preferred ways of communicating research findings, whether in the form of an article or a monograph. Below, we present a literature review of the differences in manuscripts and peer review processes across research areas.

Manuscript differences

A study of linguistic differences between research areas, based on 500 abstracts of research articles published in 50 high-impact journals, showed that each of the five areas—earth, formal (i.e. related to formal systems such as logic, mathematics and statistics), life, physical and social sciences—has its own set of “macro-structural, metadiscoursal and formulation features” (Ngai et al., 2018). For example, medical and health sciences have been using the IMRaD article format (Introduction, Methods, Results, Discussion) since the 1950s (Sollaci et al., 2004), whereas research disciplines such as the social sciences tend to have a more flexible article structure. The predominant publication outputs in the social sciences (and also the humanities) are academic monographs and books (Williams et al., 2009; Wolfe, 1990). Articles in social sciences journals also tend to be longer than those in medical journals (Silverberg & Ray, 2018). Another difference is that natural sciences research strategies are more adapted to “large concentrated knowledge clusters”, whereas the social sciences usually adapt their research strategies to “many small isolated knowledge clusters” (Jaffe, 2014).

Peer review differences

The importance of the evaluation process for researchers and research in general has been the topic of numerous studies, from examining the impact of peer review on submitted manuscripts to the specific characteristics of peer review reports. Recently, initiatives to share peer review data (Squazzoni et al., 2020) have brought about a better understanding of the peer review process across different disciplines (Buljan et al., 2020; Squazzoni et al., 2021a, 2021b), particularly regarding the type of peer review. In open peer review, authors and reviewers know each other’s identity, and reviewer reports are sometimes published next to the articles (Ross-Hellauer, 2017), whereas in post-publication peer review, articles are reviewed after publication in an open review process (Ford, 2015). In Medical and Health Sciences, open and post-publication peer review are becoming more common, and peer review reports are often published together with the articles (Hamilton et al., 2020). In the social sciences, the peer review process has remained closed because double-blind peer review is still preferred (Karhulahti & Backe, 2021). In recent years, however, some platforms publishing articles from the social sciences and humanities, such as Palgrave Macmillan, have adopted the practice of open peer review (Palgrave Macmillan, 2014). This allows researchers to study the peer review process in the social sciences and humanities and to compare it with other research areas.

A study exploring the role of peer review in increasing the quality and value of manuscripts (Garcia-Costa et al., 2022) showed that the impact of peer review is shared across research areas but not without certain differences, as reports from social sciences and economic journals displayed the highest “developmental standards”.

Regarding linguistic differences, a study of almost half a million peer review reports from 61 journals (Buljan et al., 2020) showed that peer review reports were longer in social sciences than in medical journals, but there were no differences in length between double- and single-blind reviews. Language characteristics also differed across disciplines (Buljan et al., 2020): peer review reports in medical journals had low Authenticity (impersonal and cautious language) and a high Analytical tone (use of more formal and logical language), whereas the language of peer review reports in social sciences journals had high Authenticity (personal and open language) and high Clout (honest and humble reporting and a high level of confidence). Using natural language processing techniques, Rashidi et al. (2020) studied published articles and their open peer review reports from the F1000Research journal, which uses post-publication peer review. They found consistency and similarity in the use of salient words, such as those from the Medical Subject Headings (MeSH) of MEDLINE. The F1000Research platform was also used to develop a sentiment analysis program to detect praise and criticism in peer evaluations (Thelwall et al., 2020), which showed that negative evaluations in reviewers’ comments predict review outcomes better than positive comments.

Peer review research faces major challenges due to the lack of access to the whole process of scientific publication: the peer review process remains hidden from the submission of the article to its rejection or publication, hindering our understanding of the publishing process. For this reason, we decided to use the Open Research Central (ORC) portal, which hosts several journal platforms for post-publication peer review (Tracz, 2017), to study the characteristics of articles and peer review reports in Medical and Health Sciences and Social Sciences. Journals on the ORC platform are multidisciplinary and use post-publication review: the articles are publicly available upon submission, and the whole peer review and editorial decision-making process is accessible (ORC, 2022). To our knowledge, no study has analysed the peer review reports from this platform. The aim of our study was to examine possible differences in the submitted articles and the peer review process in Medical and Health Sciences vs. Social Sciences. We examined: (i) the structural and linguistic differences between research articles; (ii) the characteristics of the peer review process; (iii) the language of peer review reports; and (iv) the outcomes of the peer review process.

Methods

Data source: ORC portal

ORC currently includes the following journals: F1000Research, Wellcome Open Research, Gates Open Research, MNI Open Research, HRB Open Research, AAS Open Research, AMRC Open Research and Emerald Open Research.

Identify the articles: get articles’ DOIs

Using the ORC search engine (https://openresearchcentral.org/browse/articles) and Python 3.8.5 (https://www.python.org/downloads/release/python-385/), we performed two automatic queries applying the following filters: “Article type(s): Research Article” and “Subject area: Medical and Health Sciences” or “Subject area: Social Science”. We then extracted the articles’ DOIs using the requests HTTP library (https://docs.python-requests.org/en/latest/) and the Beautiful Soup HTML-parsing library (https://beautiful-soup-4.readthedocs.io/en/latest/) for Python. We retrieved 1912 Medical and Health and 477 Social Sciences articles. To create samples of articles with clearly Medical and Health vs. Social Sciences content, we excluded articles tagged with both disciplinary fields, those tagged with Medicine and Health Sciences and any other disciplinary field except Biology and Life Science, and those tagged with both Social Sciences and Biology and Life Sciences. This was also done to ensure that the manuscripts and peer review reports were not influenced by the language and writing style of multiple research areas. This left us with 408 Medical and Health and 54 Social Science articles (Fig. 1).
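As a simplified, self-contained sketch of this step, DOIs can be pulled from a downloaded results page with a regular expression. The HTML fragment below is invented for illustration; the actual extraction used requests and Beautiful Soup against the live ORC pages.

```python
import re

# Invented fragment of a search-results page; the real ORC markup differs.
html = '''
<a href="https://doi.org/10.12688/f1000research.12345.1">Article one</a>
<a href="https://doi.org/10.12688/wellcomeopenres.67890.2">Article two</a>
'''

# A DOI starts with "10.", a registrant prefix, a slash, and a suffix.
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>]+')

dois = DOI_PATTERN.findall(html)
```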

Fig. 1

A flowchart representing methods workflow (created using Zen Flowchart: https://www.zenflowchart.com/)

Retrieve the articles in XML format

Using the DOIs of the filtered articles, we downloaded the articles manually in XML format. We used the XML format to achieve better-quality data mining, owing to its semantic and machine-readable tagging. All versions of the articles were downloaded to obtain complete article information.

XML Parser: extract and save relevant data

We used the ElementTree library in Python (xml.etree.ElementTree) for parsing data from the XML files. First, the articles that had not been reviewed were excluded, yielding a total of 51 articles with a Social Sciences tag and 361 articles with a Medical and Health Sciences tag. A simplified sequence diagram for the XML document parser is shown in Fig. 2.
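A minimal sketch of such a parser, assuming a JATS-like layout (the XML fragment below is invented; the real ORC files are richer):

```python
import xml.etree.ElementTree as ET

# Invented JATS-like fragment standing in for a real ORC article file.
xml_text = """
<article>
  <body>
    <sec><title>Introduction</title><p>Why we did the study.</p></sec>
    <sec><title>Methods</title><p>How we did the study in detail.</p></sec>
  </body>
</article>
"""

root = ET.fromstring(xml_text)
# One of the extracted variables: word count per article section.
section_lengths = {
    sec.findtext("title"): len(" ".join(p.text for p in sec.findall("p")).split())
    for sec in root.iter("sec")
}
```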

Fig. 2

A simplified sequence diagram for ORC XML parser (the diagram was created using PlantUML open-source tool https://plantuml.com/sequence-diagram)

Using the scripts, we extracted the following variables:

  1. Length of an article and the lengths of the individual article sections (Introduction, Methods, Results and Discussion; IMRaD);
  2. Number of figures, tables and supplementary material in the articles;
  3. Percent of articles following the IMRaD structure;
  4. Linguistic characteristics of the articles such as Tone, Sentiment, etc.;
  5. Male to female ratio among article reviewers;
  6. Time for an article to be first posted;
  7. Number of rounds of review until the article is accepted;
  8. Time to review each version of the article;
  9. Time for an article to have a “positive” status;
  10. Length of review comments;
  11. Linguistic characteristics of research articles and corresponding peer reviews; and
  12. Reviewers’ recommendations.

Reviewers’ gender was determined by using Python class Genderize from Genderize.io web service (https://pypi.org/project/Genderize/), which predicts the gender of a person given their name.
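Since Genderize.io is a web service requiring network access, the idea can be sketched offline with a tiny illustrative lookup table standing in for the API response (the names and table below are hypothetical; real code would call `Genderize().get([...])` over the network):

```python
# Tiny stand-in for the Genderize.io response; the real service covers
# hundreds of thousands of first names with probability estimates.
NAME_GENDER = {"maria": "female", "john": "male", "ana": "female"}

def predict_gender(full_name):
    """Predict gender from the first name, as Genderize.io does;
    returns None for names missing from the lookup."""
    first = full_name.split()[0].lower()
    return NAME_GENDER.get(first)

genders = [predict_gender(n) for n in ["Maria Silva", "John Doe", "Pat Lee"]]
```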

Details on how each of the variables was extracted from the XML files can be found in the following Python scripts: https://github.com/Tonija/ORC_scripts.

The results of the scripts were saved in a CSV file and used for linguistic and statistical analysis.

Linguistic analysis

Linguistic Inquiry and Word Count (LIWC)

The texts of the articles and corresponding peer reviews were analysed using the Linguistic Inquiry and Word Count (LIWC) text analysis software program (Pennebaker et al., 2015a). We calculated LIWC’s five default variables (Word count, Analytic, Clout, Authentic, Tone), where Word count (WC) is the raw number of words in a given text, while Analytic, Clout, Authentic, and Tone are linguistic variables expressed as percentages of total words within a text (Pennebaker et al., 2015b). Higher scores on the Analytic dimension indicate the use of formal, logical, and hierarchical language; a higher Clout score refers to a higher level of leadership and confidence; a higher Authentic score points to a more personal way of writing; and a higher Tone score represents a more positive emotional tone (Pennebaker et al., 2015b).

We also analysed seven other LIWC categories related to research evaluation, previously used for a linguistic study of letters of recommendation for academic job applicants (Schmader et al., 2007) and for text analysis of research grant reviewers’ critiques (Kaatz et al., 2015). These word categories, with examples, are: Ability (brillian*, capab*, expert*, proficien*); Achievement (accomplish*, award*, power*, succeed*); Agentic (ambiti*, assert*, confident*, decisive*); Research (data, experiment*, manuscript*, research*); Standout adjectives (extraordinar*, remarkable, superb*, unique); Positive evaluation (appropriat*, clear*, innovat*, quality); and Negative evaluation (bias*, concern*, fail*, inaccura*). A higher Ability score refers to greater usage of adjectives that describe talent, skill, or proficiency in a particular area; a higher Achievement score refers to greater usage of terms relating to success and achievement; a higher Agentic score points to greater usage of words describing the achievement of goals; a higher Research score indicates greater usage of research terminology; a higher Standout adjectives score reflects the use of adjectives describing exceptional, noticeable skill or performance; a higher Positive evaluation score indicates a greater display of affirmation and acceptance; and a higher Negative evaluation score indicates the opposite.
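The scoring idea behind these categories can be sketched as a wildcard match over a word list. The abbreviated dictionaries below contain only the example words quoted above; the real LIWC dictionaries are much longer.

```python
import fnmatch

# Abbreviated category dictionaries (only the examples quoted in the text).
CATEGORIES = {
    "Ability":  ["brillian*", "capab*", "expert*", "proficien*"],
    "Negative": ["bias*", "concern*", "fail*", "inaccura*"],
}

def category_score(text, patterns):
    """Percent of words matching any wildcard pattern, mirroring how
    LIWC reports category scores as percentages of total words."""
    words = [w.strip(".,;:!?\"'").lower() for w in text.split()]
    hits = sum(1 for w in words if any(fnmatch.fnmatch(w, p) for p in patterns))
    return 100.0 * hits / len(words) if words else 0.0

sentence = "The expert analysis failed to address a concern."
```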

Word embeddings and t-distributed stochastic neighbour embedding

We also explored whether the texts of the articles and peer review reports from the two research disciplines formed different word clusters.

For peer review reports, we applied Word Embeddings, a method in which words are given mathematical vector representations so that they are mapped to points in Euclidean space, with words that are similar in meaning lying closer to each other (Hren et al., 2022; Jurafsky et al., 2000). We used the Gensim library and the Word2Vec approach in Python to create the Word Embeddings, along with a pre-trained model (https://github.com/lintseju/word_embedding) trained on a Wikimedia database dump of the English Wikipedia from February 20, 2019. Finally, the clusters were visualised using the TensorBoard Embedding Projector (https://projector.tensorflow.org/), which projects the high-dimensional data into three dimensions. The projector uses principal component analysis (PCA) to visualise clusters and cosine distances between clusters as a reference for cluster distances. After the data points were created, we attached the labels by uploading a metadata file previously generated with a Python script. We applied this technique only to peer review reports because the texts of the articles were too large for the TensorBoard Embedding Projector.
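The core idea, that words with similar meanings map to nearby vectors, can be sketched with toy vectors and the cosine distance the Embedding Projector reports (the three-dimensional vectors below are invented; real Word2Vec embeddings have hundreds of dimensions):

```python
import numpy as np

# Invented 3-D "embeddings"; real Word2Vec vectors have 100+ dimensions.
vectors = {
    "tumor":  np.array([0.9, 0.1, 0.0]),
    "lesion": np.array([0.8, 0.2, 0.1]),
    "survey": np.array([0.1, 0.9, 0.2]),
}

def cosine_distance(u, v):
    """1 minus cosine similarity, as reported by the Embedding Projector."""
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Related medical terms sit close together; a general term sits far away.
d_within = cosine_distance(vectors["tumor"], vectors["lesion"])
d_across = cosine_distance(vectors["tumor"], vectors["survey"])
```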

For article texts we applied t-Distributed Stochastic Neighbor Embedding (t-SNE) technique using the TSNE class from the Scikit-learn (Sklearn) Python library (https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html), which visualises high-dimensional data by placing each datapoint in a two-dimensional map (van den Maaten & Hinton, 2008). We used the Global Vectors for Word Representation (GloVe) pre-trained model that was trained using Wikipedia 2014 + Gigaword 5 (Pennington et al., 2014).
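A minimal sketch of the t-SNE step, using random vectors as a stand-in for the GloVe embeddings used in the study (the data and parameter values below are illustrative):

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for GloVe word vectors: 20 random 50-dimensional points.
rng = np.random.default_rng(0)
word_vectors = rng.normal(size=(20, 50))

# Project to 2-D; perplexity must be smaller than the number of points.
tsne = TSNE(n_components=2, perplexity=5, init="random", random_state=0)
coords = tsne.fit_transform(word_vectors)
# coords holds one (x, y) position per word, ready for a scatter plot.
```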

Word frequency in peer review reports

We also examined whether peer review reports in Social versus Medical and Health Sciences differ in their usage of the most common words. We calculated the percentage of words found among the 10,000 most common English words in the Project Gutenberg list (Wiktionary, 2006), as well as in the Academic Word List (AWL), which contains 570 words that are specific to written academic texts but are not included in the 2000 English words of the General Service List (Coxhead, 2000; Coxhead & Nation, 2001). Finally, we identified the most frequent words that were unique to peer review reports in Social Sciences (i.e. not found in peer review reports in Medical Sciences in our sample) and vice versa.
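The coverage calculation can be sketched as follows; the miniature "common words" set is invented for illustration, whereas the study used the full 10,000-word and 570-word lists.

```python
def wordlist_coverage(text, word_list):
    """Percentage of tokens in `text` that appear in `word_list`."""
    tokens = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return 100.0 * sum(t in word_list for t in tokens) / len(tokens)

# Invented miniature "common words" list for illustration.
common = {"the", "analysis", "is"}
coverage = wordlist_coverage("The analysis is clear.", common)
```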

Changes in manuscript versions: Levenshtein distance

Because of the small number of strictly Social Sciences articles, for this analysis we also included Social Sciences articles that overlapped with Biology and Life Science, yielding 48 Social Sciences and 166 Medical and Health Sciences articles.

We measured the changes in the text from the first draft to the second version of a manuscript by means of the Levenshtein distance (Levenshtein, 1966), a character-based metric counting the minimum number of edit operations (insertions, deletions or substitutions) required to transform one text into the other. We computed this distance on the overall textual content of the manuscript, including figure captions and tables, but skipping references.
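For intuition, a compact implementation of the metric (the study computed it character-wise over the full manuscript text):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, or substitutions
    needed to turn string a into string b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]
```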

Changes in the references were calculated as the ratio of references being edited (added or deleted) to the total number of distinct references in both the first and the last version of the manuscript. To give two examples, we measured a change as 0.5 when the references of a manuscript changed from [A, B, C] to [A, B, D] (i.e. 2 changes of 4 references) and we calculated a value of 0.25 for a change from [A, B, C, D] to [A, B, C] (i.e. 1 change of 4 references). Two references were considered as equal by a matching algorithm (Vincent-Lamarre & Larivière, 2021) that checked whether they had the same publication year, the same number of authors, and a Levenshtein distance lower than 0.1 between the list of authors and the paper titles.
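The reference-change ratio can be sketched as follows, simplifying the fuzzy matching described above to exact identity of reference strings:

```python
def reference_change_ratio(first_version, last_version):
    """Ratio of added or deleted references to the number of distinct
    references appearing in either manuscript version. Simplified:
    references are compared by exact identity rather than the fuzzy
    year/author/title matching used in the study."""
    first, last = set(first_version), set(last_version)
    changed = len(first ^ last)   # symmetric difference: added + deleted
    distinct = len(first | last)
    return changed / distinct if distinct else 0.0
```

Applied to the two examples above, [A, B, C] to [A, B, D] gives 0.5 and [A, B, C, D] to [A, B, C] gives 0.25.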

Statistical analysis

To assess possible differences between the articles and their reviews across Medical and Health Sciences and Social Sciences, one-way ANOVA and post hoc Tukey tests were employed. For multivariate frequency distributions of the variables, a contingency table test was used. All analyses were carried out in JASP, Version 0.14.1 (JASP Team, 2020).
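The analyses were run in JASP; an equivalent one-way ANOVA can be sketched in Python with SciPy (the review word counts below are invented for illustration):

```python
from scipy import stats

# Invented review word counts for the two disciplines.
social = [812, 945, 1103, 760, 998]
medical = [430, 515, 388, 602, 477]

f_stat, p_value = stats.f_oneway(social, medical)
# With two groups, one-way ANOVA matches an independent-samples t-test
# (F equals the squared t statistic).
```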

Results

Structural and linguistic differences of articles in Social Sciences vs. Medical and Health Sciences

Articles from Medical and Health Sciences and Social Sciences differed in their structure (Table 1). Both the Introduction and Conclusion sections were longer for Social Sciences than those from Medical and Health Sciences. The number of additional sections and the number of special sections was also higher for articles in Social Sciences. Discussion and Conclusion sections were merged more often in Social Sciences articles, whereas Medical and Health Sciences articles often had Conclusions as a separate section. Articles in Medical and Health Sciences followed the IMRaD structure more frequently, and contained more figures.

Table 1 Structure and linguistic characteristics (median, % or number and 95% confidence interval, CI) of articles in Social Sciences vs. Medical and Health Sciences

Linguistic analysis was performed on the text of the latest version of an article. Articles in Social Sciences had higher Word count and higher Clout, whereas the articles in Medical and Health Sciences had a higher Tone score.

Characteristics of the peer review process and peer review reports from all stages of review in Social Sciences vs. Medical and Health Sciences

No statistically significant differences were found between articles from Social Sciences and Medical and Health Sciences in the characteristics of the peer review process across all stages of review: reviewer gender, time between the first and second posted manuscript version, time between acceptance for publication and the first article version, time between the first and the finally approved article version, and the number of article versions (Table 2).

Table 2 Characteristics of the peer review process and peer review reports from all stages of review in Social Sciences and Medical and Health Sciences

Comparison of linguistic characteristics of peer review reports in Social Sciences vs. Medical and Health Sciences

Peer review reports were significantly longer for articles in Social Sciences (Table 3). Social Sciences peer review reports also had higher scores on several linguistic characteristics: Clout, Authentic, Tone, Agentic, Achievement, Research and Standout (Table 3).

Table 3 Comparison of linguistic characteristics of peer review reports in Social and Medical and Health Sciences

Linguistic differences between research articles and corresponding peer reviews in Social Sciences vs. Medical and Health Sciences

We also compared the linguistic characteristics of the articles and their corresponding peer review reports (Table 4). In general, the articles had higher scores for the Analytic, Clout and Authentic variables than corresponding peer review reports, whereas peer review reports had a significantly higher positive Tone compared to the language of the corresponding articles.

Table 4 Linguistic differences between research articles and corresponding peer reviews in Social and Medical and Health Sciences

The difference between the linguistic characteristics of articles and peer review reports was significantly higher for the Clout score for Social Sciences than for Medical and Health Sciences articles (Table 4). The opposite was true for the Tone score, where the difference between the articles and peer review reports was greater for Medical and Health Sciences articles (Table 4).

Comparison of reviewers’ recommendations for articles in Social Sciences vs. Medical and Health Sciences

There were no statistically significant differences in the outcome of the peer review process between the two disciplines, measured as the number of reviewers’ recommendations for different versions of the articles (Table 5). In both disciplines, about half of the articles were approved already at the stage of the first version. The proportion of reject recommendations was low and decreased in subsequent versions, with only a single article (in Medical and Health Sciences) receiving a rejection recommendation at the level of the third article version (Table 5).

Table 5 Comparison of reviewers’ recommendations for articles Social Sciences vs. Medical and Health Sciences

Changes in the text of the manuscript were mainly concentrated in the Methods and Results & Discussion sections (Table 6), as measured by the higher values of the Levenshtein distance metric. References also changed, predominantly by adding more references. The differences between the disciplines were statistically significant only for the Introduction section, which was the least modified section in the Medical and Health Sciences.

Table 6 Text changes (median, 95% CI) in the latest version of articles in Social Sciences and Medical and Health Sciences

Changes to the latest version of articles in Social Sciences vs. Medical and Health Sciences

Word cluster visualisation

Using the Word Embedding visualisation, we observed that the words in peer review reports from Social Sciences were more spherically distributed, meaning that they contained more general terms that could also be found in other research areas (Fig. 3A). In contrast, clusters consisting of specific terms were found in Medical and Health Sciences peer reviews (Fig. 3B).

Fig. 3

Word Embeddings 3D visualisation for reviews in Social Sciences (left) vs. Medical and Health Sciences (right). Clustering indicates grouping together the closest or most similar words; the closer two words are, the more similar they are, and vice versa. Gensim library and Word2Vec approach in Python were used to create Word Embeddings, as well as a pre-trained model (https://github.com/lintseju/word_embedding), trained on Wikimedia database dump of the English Wikipedia on February 20, 2019 and the clusters were visualised using TensorBoard Embedding Projector (https://projector.tensorflow.org/)

Using the t-Distributed Stochastic Neighbor Embedding (t-SNE) visualisation, we observed similar distributions for the words in the texts of the articles: words from Social Sciences were also more spherically distributed, with more general terms (Fig. 4A), while clusters consisting of specific terms were found in the texts of Medical articles (Fig. 4B). A GIF format of the images can be found in the Appendix.

Fig. 4

t-Distributed Stochastic Neighbor Embedding (t-SNE) visualisation (van den Maaten & Hinton, 2008) for texts of the articles in Social Sciences (A) vs. Medical and Health Sciences (B) in a two-dimensional map. Created using the TSNE class from the Scikit-learn (Sklearn) Python library (Scikit-learn, 2022) and the Global Vectors for Word Representation (GloVe) pre-trained model trained on Wikipedia 2014 + Gigaword 5 (Pennington et al., 2014)

Peer review reports from Social Sciences contained a higher percentage of words from the Academic Word List (8.5%) compared to peer review reports from Medical and Health Sciences (7.2%) (MD = − 1.3, 95% CI − 1.7 to − 0.8). They also contained a higher percentage of the 10,000 most common English words (76.7%) compared to peer review reports from Medical and Health Sciences (72.7%) (MD = − 4.0, 95% CI − 5.0 to − 3.0).

The most common words, and the most common words unique to each discipline, in peer reviews are shown in Table 7.

Table 7 Most common and most common unique words in peer reviews in Social Sciences vs. Medical and Health Sciences

Discussion

Understanding the differences between Medical and Health Sciences and Social Sciences in their structural and linguistic characteristics is crucial for successful interdisciplinary collaborations and for avoiding misunderstandings between different research groups. Our study found certain differences both in articles and in peer review reports regarding their structure and linguistic characteristics.

Structural differences in articles and peer review reports between Social and Medical and Health Sciences

Longer articles and peer review reports in Social Sciences compared to Medical and Health Sciences could reflect the tradition of the writing styles and formats typically used in these disciplines. Despite an increase in journal articles as a publication output for the social sciences and humanities (Savage & Olejniczak, 2022), academic monographs and books are still used as forms of scholarly dissemination in the humanities and some social sciences, sometimes even remaining crucial for professional advancement (Williams et al., 2009). Some university departments emphasise publishing in the form of books and monographs (Wolfe, 1990). Typically, monographs range between 70,000 and 110,000 words, which makes them significantly longer than a standard or even the longest journal article. The length of journal articles also differs across disciplines, with medical journals usually imposing strict limits on article word count. For example, in five medical journals (New England Journal of Medicine [NEJM], Lancet, JAMA, BMJ and Annals of Internal Medicine), the word limits for the main text range from 2700 (NEJM) to 4400 (BMJ) (Silverberg & Ray, 2018). Social sciences journals, on the other hand, allow longer papers and typically limit manuscript size by the number of pages. For example, in four social sciences journals (Review of Economics and Statistics [Rev Econ Stat], Journal of Business and Economic Statistics [JBES], Human Relations, Journal of Marriage and Family [JMF]), page limits for the main text ranged from 35 (JBES and JMF) to 45 (Rev Econ Stat), the limit often being a recommendation rather than an obligation. Some journals, such as Sociological Science, do not have any limits on manuscript length.
In a 2011 market research study, Palgrave Macmillan, a publisher of books and journals in the humanities and social sciences, surveyed 1,268 authors and academics from those fields; the majority of respondents said that the ideal length would fall between a journal article and a monograph (McCall, 2015). This resulted in the development of Palgrave Pivot, a format ranging between 25,000 and 50,000 words (McCall, 2015).

Medical and Health Science articles were shorter but contained more images and graphs than Social Sciences articles. However, studies suggest that even in medical sciences journals graphs are underused (Chen et al., 2017), are often not self-explanatory, and fail to display full data (Cooper et al., 2001). Peer review seems to improve graph quality, but there is further need for improvement (Schriger et al., 2016). Because the social sciences are entering a golden age (Salganik, 2019), with more data available (Buyalskaya et al., 2021), social sciences authors should also recognise the importance of visual data presentation and increase the number of graphs and figures.

As expected, Medical and Health Science articles more often followed the IMRaD format than Social Sciences articles, since medical journals were among the first to adopt the IMRaD structure. This is not surprising, since research in the health and life sciences most often uses a hypothetico-deductive approach (Jürgen, 1968; Lewis, 1988), which starts from a hypothesis, moves to observation and comes to a conclusion. IMRaD is a fitting format for presenting such research, as it follows the structure of a logical argument (Puzzo & Conti, 2021). Social sciences, on the other hand, widely use a mixed methods approach (Plano Clark & Ivankova, 2016; Timans et al., 2019), which incorporates both deductive and inductive methods (Creswell, 2012). An inductive approach moves from observation to hypothesis, and the IMRaD format may not be suitable for it. As mixed methods approaches increase in the health and clinical sciences (Plano Clark, 2010; Coyle et al., 2018), it is questionable whether IMRaD can and should be a one-size-fits-all journal article format. If research is done inductively, should it be presented in the IMRaD format? There are even arguments that the current publishing process discourages inductive research (Woiceshyn & Daellenbach, 2018). Nevertheless, as the format of the research paper has evolved from a descriptive to a standardised style (Kronick, 1976), the IMRaD format will continue to evolve as well (Wu, 2011) to adapt to the diversification of methodological approaches in different scientific disciplines, particularly in multi- and interdisciplinary work.

Linguistic differences in articles and peer review reports between Social Sciences and Medical and Health Sciences

Articles in Social Sciences had higher Word count, Clout, and Authenticity scores, whereas articles in Medical and Health Sciences had higher Analytic and Tone scores. A higher Authenticity score for social sciences articles, which indicates a more personal way of writing, is not an unexpected finding, as analyses in the social sciences and humanities often rely on interpretations based on the researcher’s personal opinions and values, leading to subjectivity (Khatwani & Panhwar, 2019). A higher Analytic score for articles in Medical and Health Sciences, which reflects the use of formal, logical, and hierarchical language, is also unsurprising, given the hypothetico-deductive methodological approach and “dispassionate scientific language” frequently used in these disciplines (Steffens, 2021). Understanding the language differences between disciplines is important because the linguistic characteristics of manuscripts may affect the evaluation process. Peer review has a crucial role in determining the fate of manuscripts. If peer review reports contain more positive words and/or expressions, research manuscripts are more likely to be accepted for publishing (Fadi Al-Khasawneh, 2022; Ghosal et al., 2019; Wang & Wan, 2018). Also, the absence of negative comments can indicate a positive outcome for the submitted article (Thelwall et al., 2020). There is a positive correlation between longer texts and longer sentences and a positive outcome of selection procedures (van den Besselaar & Mom, 2022). Furthermore, project descriptions with a more pronounced narrative structure and expressed self-confidence are more likely to be granted (van den Besselaar & Mom, 2022).

We also found linguistic differences in the peer review reports between the two research areas. Peer review reports for Social Sciences articles had higher scores on several linguistic characteristics: Clout, Authentic, Tone, Agentic, Achievement, Research and Standout. On the other hand, peer review reports for Medical and Health Sciences had a higher score on positive evaluation words, i.e. more positive descriptors and superlatives, than the reports in Social Sciences. This is partially in accordance with previous studies that compared peer review reports in the social and the medical and health sciences. Buljan et al. (2020) found that the language of peer review reports in social sciences journals had high Authenticity and Clout scores, whereas peer review reports in medicine had a higher Analytical tone than peer review reports in social sciences. In addition, reviewer recommendations were closely associated with the linguistic characteristics of the review reports, and not with the area of research, type of peer review, or reviewer gender (Buljan et al., 2020). Our study, on the other hand, showed that there were differences in the linguistic characteristics of articles and peer review reports between Social Sciences and Medical and Health Sciences. As ORC contains only open peer review reports, the question remains whether there would be differences between open and closed peer review reports. For example, one study showed that closed peer review reports had a more positive LIWC Tone than open peer review reports (Bornmann et al., 2012).

One of the novelties that our study brings is the comparison of words used in peer review reports in Social Sciences and Medical and Health Sciences. While Social Sciences peer review reports contained more general terms that could be found in other research areas, the terminology in Medical and Health Sciences peer review reports was more profession-specific. We visualised these results as clusters that group the closest, most similar words: the closer two words appear, the more similar they are, and vice versa. More clusters consisting of specific terms were found in peer review reports in Medical and Health Sciences than in the Social Sciences. This finding is consistent with the status of medical terminology as one of the “oldest specialized terminologies in the world”, shaped by Greek and Latin medical writings for over 2000 years (Džuganová, 2019).

Characteristics of the peer review process

We found no statistically significant differences in the duration of the peer review process between Social Sciences and Medical and Health Sciences for articles published on post-publication peer-review platforms. About half of the articles in both disciplines were approved at the first-version stage. The reason for this is probably the uniform policy and procedures of the Open Research Central platform, i.e. the same evaluation process for articles in both disciplines. The duration of the peer review process may, however, differ across research areas. A study of 3500 peer review experiences published on the SciRev.sc website revealed significant differences in the duration of the first round and of the total review process across research areas. The first round was shortest in medicine and public health journals, lasting 8–9 weeks, while it was twice as long in social sciences and humanities, approximately 16–18 weeks (Huisman & Smits, 2017). The study also showed that the total peer review duration in medicine and public health journals was 12–14 weeks, whereas in social sciences and humanities journals it was about 22–23 weeks (Huisman & Smits, 2017). We also found no statistically significant differences between the two research areas in other characteristics of the peer review process, such as reviewer gender or the number of article versions.

Changes in the manuscript versions

We found that differences between research areas were statistically significant only for the Introduction section, which was the least modified section in the Medical and Health Sciences manuscripts. Some studies have examined whether manuscript versions changed based on the peer review reports. Nicholson et al. (2022) compared linguistic features within bioRxiv preprints to published biomedical texts with the aim of examining their changes after peer review. The predominant changes involved typesetting and mentions of supporting information sections or additional files. Another study (Akbaritabar et al., 2022) matched 6024 preprint–publication pairs across research areas and examined changes in their reference lists between the manuscript versions. They found that 90% of references were not changed between versions and 8% were added. The study also found that manuscripts in the natural and medical sciences reframe their literature more extensively, whereas changes in engineering were mostly related to methodology.

Limitations

A limitation of our study is that the Social Sciences articles and peer review reports we used were mostly from Psychology and Sociology, which have structural similarities to those from Medical and Health Sciences. For example, articles from these two disciplines tend to follow the IMRaD structure, similar to articles from Medical and Health Sciences. Another limitation is the difference in sample size, as the platform journals still predominantly publish Medical and Health Sciences research.

Recommendations

Are the similarities we found between articles in Social Sciences and Medical and Health Sciences real, or do they arise because the authors had to use the same format on the ORC platforms? The same question applies to the peer review reports. We believe the latter is the case. For this reason, we recommend that the editors of all ORC platforms take potential structural and linguistic differences between disciplines into consideration. We believe the editors should also consider whether the IMRaD structure is the most appropriate format for each of the disciplines and whether additional formats should be offered.

Conclusion

Given the different approaches, writing traditions, and formats typically used in the two compared disciplines, it is not surprising that there are structural and linguistic differences between research articles in Medical and Health Sciences and Social Sciences. However, the review process for articles in Social Sciences and Medical and Health Sciences may not differ as much as is usually considered. This may be due in part to the shared platform, which may have uniform policies and processes. With the development of open science practices in social sciences (Christensen et al., 2019), publishing platforms from social sciences and humanities that offer open peer review (Palgrave MacMillan, 2014), those that host multidisciplinary journals (Tracz, 2017; ORC, 2022), and with the evolving role of preprints (Mirowski, 2018) and editorial and review innovations, we can perhaps expect even greater convergence of article formats and evaluation processes across research areas.