Introduction

Over the past decades, the number of scientific publications has increased exponentially (Bornmann et al., 2021; Powell et al., 2017; Zhang et al., 2015), fuelled by societal and commercial interests. These are evidenced by growing R&D funding, an increase in the number of researchers, expanded international collaborations, commercialization potential, and the participation of emerging countries (Bhattacharya et al., 2015; Javed & Liu, 2018; Moiwo & Tao, 2013; Shashnov & Kotsemir, 2018). Alongside the escalating research landscape, ethics and integrity continue to serve as a basis for good and wholesome research practices. The shared responsibility of all stakeholders to uphold standards and principles is expected to grow as more countries commit to funding research and more researchers contribute to the knowledge base (Titus et al., 2008). Where research ethics defines the standard framework of acceptable conduct in research (Resnik, 2020), research integrity represents the adherence of conduct to ethical principles, which lends trust and confidence to the given methods and results. The two are complementary in supporting responsible scientific conduct (Bird, 2006; Braun et al., 2020). The absence of either facet can have dire consequences and could erode public trust in research. Clinical research without ethical protocols can marginalize vulnerable study participants (Schüklenk, 2000). Flippant or perfunctory conduct, skewed or exaggerated data, or outright fraudulent results have the potential to misdirect future paths of research (Bouter et al., 2016; Coxon et al., 2019). Poor research may also mislead or misinform policymakers, thereby undermining government support for research in terms of its reliability and future funding (Michalek et al., 2010). Researchers and institutions also face increasing competition, pressure and demands (Haven et al., 2019), potentially leading to corner cutting or the compromising of standards.

The ongoing discourse has led to flourishing Research Integrity and Research Ethics (RIRE)-related fields (Aubert Bonn & Pinxten, 2019). In the last 10–15 years, the number of retractions has increased (Steen et al., 2013), signalling a greater awareness and identification of appropriate publication conduct (Fanelli, 2013). Several journals are dedicated to the subject, with some serving areas of specialization, with titles including Journal of Empirical Research on Human Research Ethics, Journal of Medical Ethics, The Journal of Ethics, Science and Engineering Ethics, Ethics and Information Technology, and Teaching Ethics. Moreover, institutions are adopting policies and regulations to promote good practices (Geller et al., 2010; Lins & Carvalho, 2014). In research, there is increasing multinational collaboration, and more publications have a diverse representation of country authorship (Jiang et al., 2018; Leydesdorff & Wagner, 2008). These partnerships are valued for their potential to raise impact (Guerrero Bote et al., 2013), visibility, and creativity, and to present opportunities for resource-sharing, knowledge transfer, and training (Freshwater et al., 2006; Khor & Yu, 2016; Rodrigues et al., 2016; RoyalSociety, 2011). Recent events such as the coronavirus pandemic that started in 2019 underlined the need for global collaborative efforts in science (Li et al., 2020). Multinational cooperation could additionally serve as a medium for policy transfer (Stone, 2012). In light of these factors, we examine the collaborative publishing trends on the international scale for RIRE-related research in the past three decades. Considering the global nature of research (Rodrigues et al., 2016), and by proxy, research integrity and ethics, we aim to characterize the current patterns in RIRE. In particular, is there an indication of knowledge flows and cooperation between scientifically advantaged countries and emerging countries?

Data set and methods

Dimensions (Digital Science) was used as the database for this study. Publications of type “article” published between 1990 and 2020 (a span of 31 years) were included. An earlier version of the study used Web of Science (WoS) as the database; the date range has since been extended to include 2019 and 2020. The database change aimed to improve representation of regional and local publications (e.g. regional journals which may not be indexed). A title and abstract search was conducted for publications containing keywords related to RIRE using Dimensions BigQuery with the following terms: “research integrity” OR “research ethics” OR “scientific ethics” OR “scientific integrity” OR “research dishonesty” OR “scientific dishonesty” OR “scientific misconduct” OR “research misconduct” OR “publication misconduct” OR “misconduct in research” OR “ethics in research” OR “integrity in research” OR “research plagiarism” OR “research falsification” OR “plagiarism in research” OR “falsification in research” (Supplementary Information). Articles without a publisher were excluded. In total, 11,895 unique publications were found. Publications were further narrowed to those which contained organization identifiers, in order to identify organization countries. DOIs of publications without organization identifiers were matched against entries in a Scopus (Elsevier) search. Organization countries were identified through the SciVal module (Elsevier) and merged with Dimensions data. The total number of unique publications after pre-processing was 9742.

World Bank economy group data was obtained for the most recent financial year (FY2022) and harmonized with country name data. The World Bank economy classification is based on Gross National Income (GNI) values. Research participation was measured by full counting (each country on a paper receives one full credit), aggregated over time. The Subject to National Output metric is computed as the ratio of RIRE-related papers to the national output.
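As a minimal sketch of the two measures described above, the snippet below applies full counting to a handful of invented records and computes a subject share as a percentage. The record layout, the function names `full_counts` and `subject_share`, and the national output figure are assumptions made for illustration; they are not taken from the study's pipeline.

```python
from collections import Counter

# Hypothetical records: each publication lists its year and the set of
# countries of its affiliated organizations (invented data).
publications = [
    {"year": 2019, "countries": {"US", "CA"}},
    {"year": 2020, "countries": {"US"}},
    {"year": 2020, "countries": {"BR", "US", "ZA"}},
]


def full_counts(pubs):
    """Full counting: every country on a paper receives one full credit."""
    counts = Counter()
    for pub in pubs:
        for country in pub["countries"]:
            counts[(country, pub["year"])] += 1
    return counts


def subject_share(rire_count, national_output):
    """RIRE papers as a percentage of a country's total output."""
    if national_output <= 0:
        raise ValueError("national output must be positive")
    return 100.0 * rire_count / national_output


counts = full_counts(publications)
# The national output of 10,000 papers is a made-up value for illustration.
share = subject_share(counts[("US", 2020)], national_output=10_000)
```

Under full counting, a paper with three countries adds one to each country's tally, so the country totals can sum to more than the number of papers.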

For publication classification, a supervised machine learning classifier based on support vector machines (SVM) was implemented to examine topics of publication within RIRE. SVM was chosen due to its performance compared to other supervised classification algorithms (Goh et al., 2020). Compared to models such as the random forest classifier, SVM yielded the highest accuracy (82%). Labels were identified based on a multi-institutional study on research integrity (Mejlgaard et al., 2020) consisting of nine key areas to promote organizational research integrity. The training set consisted of publications based on themes of the nine outlined topics (Mejlgaard et al., 2020). These were identified using the Dimensions “Similar Documents” module on the web app, providing up to 2000 of the most conceptually similar papers for each topic.

Results and discussion

Between 1990 and 2020, yearly publications on RIRE rose (Fig. 1, primary axis), reaching a cumulative total of 9742 publications. The annual publication count increased from 37 in 1990 to 1265 in 2020, a compound annual growth rate (CAGR) of 12.5%. Of the total, 2048 publications involved more than one country, representing 21% of all publications in the dataset. In the early 1990s, publishing was concentrated at the national level and in upper middle to high income economies. A trend reversal is seen in the 2000s with the rise of multinational publications, consistent with observed trends of research globalization (Barjak & Robinson, 2008; Wagner et al., 2015). Multinational publications grew at a CAGR of 12.3% from 1998 onwards and made up ~30% of all publications by 2020.
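The reported growth rate can be checked with the standard CAGR formula. The sketch below assumes the number of compounding periods is the difference between the end and start years (30), which the text does not state explicitly; the function name `cagr` is illustrative.

```python
def cagr(start_value, end_value, periods):
    """Compound annual growth rate: (end / start) ** (1 / periods) - 1."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1.0 / periods) - 1.0


# Annual counts quoted in the text: 37 papers in 1990, 1265 in 2020.
growth = cagr(37, 1265, 2020 - 1990)
print(f"CAGR 1990-2020: {growth:.1%}")  # prints "CAGR 1990-2020: 12.5%"

# Share of publications involving more than one country.
multinational_share = 2048 / 9742
print(f"Multinational share: {multinational_share:.0%}")  # prints "Multinational share: 21%"
```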

Fig. 1

Yearly publication trends of scholarly output in RIRE-related topics. The primary y-axis represents the number of articles per year, while the secondary y-axis shows percentage of articles which are associated with a single authoring country, or multiple countries between 1990 and 2020

In the country-level breakdown of publications and international collaboration proportions (Fig. 2), high income (H) countries publish the most on RIRE-related topics. The global averages for output and international collaboration proportion are 131 publications and ~60% respectively; low income countries publish orders of magnitude less than the average and exhibit above-average international collaboration proportions. India, China, South Africa, and Brazil are notable middle income economies with output and collaboration comparable to those of leading high income countries. The United States leads in publishing volume with 2444 cumulative publications, followed by the United Kingdom, Canada, Australia, Brazil, Germany, South Africa, Netherlands, and China (Fig. 2). Brazil has among the lowest international collaboration proportions, and we note a preference there for publishing in local medical journals in a non-English language. RIRE papers relative to national output comprised a comparatively lower proportion for the United States, China, and India, ranging between 0.01 and 0.029% of all publication output for 2020. Although RIRE contributions by proportion may appear stagnant, the volume that these countries contribute is growing. By contrast, RIRE accounts for growing proportions of the national output in countries such as South Africa, Brazil, and the United Kingdom (0.07–0.14%). RIRE-related research within the emerging countries trails that of high income countries. This may be attributed to national research priorities, delay in participation, or ongoing development and enforcement of research ethics and integrity policies within the institutions (Ana et al., 2013; Fanelli et al., 2015; Okonta & Rossouw, 2014). As an example, activity from emerging economies such as India and China, classified as lower-middle and upper-middle income economies respectively, was recorded in the mid-2000s (Fig. 3).

Fig. 2

Country-level summary of total publications (1990–2020), x-axis (logarithmically scaled) with respect to % collaborative publications (international), y-axis, grouped according to World Bank economy classification. Country labels are shown in ISO 3166 alpha-3 format with publications ≥ 50

Fig. 3

Time series of RIRE-related publications as a proportion of national research output for selected countries. Bubble sizes indicate RIRE publication volume with a scaling factor. Text labels are a visual aid for smaller bubbles, indicating RIRE output values

International collaboration is examined from an economic perspective, based on the frequency of income group collaboration (Fig. 4). Annualized data show that the highest activity occurs exclusively among high income (H) countries. Collaborations amongst low income (L) countries were not recorded. This result is not unexpected; the absence may be due to the significant barriers faced by countries of this cohort, including lower research readiness and capacity arising from resource, infrastructural and logistical limitations. Interestingly, no exclusively H–L collaborations were recorded; H–L collaborations appear only in combination with a middle income economy (LM/UM). Meanwhile, pairings between middle and low income economies (L–UM; L–LM) are sporadic, while exclusively middle income (LM–UM) collaboration pairs recorded a tenfold increase in the last 10 years. Between 2006 and 2020, the number of H collaborations increased 14-fold. For the same period, H pairings with upper-middle (UM) and lower-middle (LM) income economies increased more than 21-fold. These findings suggest that there is knowledge transfer, although it is coupled with economic stratification. Middle income economies are in a unique position in the knowledge flow, with ties to more scientifically advanced countries, and can potentially serve as an intermediary for low-income economies with increasing adoption of ethics-related regulations. For example, the Chinese government developed ethics-related policies and established oversight committees (Cyranoski, 2018; Qiu, 2015; Yi et al., 2017; Zeng & Resnik, 2010) to address its RIRE deficiencies. The repercussions for offending researchers may reach far beyond academic careers (Cyranoski, 2018), including the possibility of facing restrictions on jobs, loans, and business opportunities.
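One way to derive such income-group combinations is to map each paper's countries to their World Bank group and tally the resulting set per year. The sketch below is only an illustration of that idea: the country sample, the label ordering in `GROUP_ORDER`, and the helper names are assumptions, and the study's actual procedure may differ.

```python
from collections import Counter

# Fixed ordering used to build a canonical label (an assumed convention).
GROUP_ORDER = ["H", "UM", "LM", "L"]

# Small illustrative lookup of ISO alpha-3 codes to income groups.
INCOME_GROUP = {"USA": "H", "GBR": "H", "CHN": "UM", "IND": "LM", "UGA": "L"}


def combination_label(countries):
    """Return the income-group combination of a paper, e.g. 'H-LM-L'."""
    groups = {INCOME_GROUP[c] for c in countries}
    return "-".join(g for g in GROUP_ORDER if g in groups)


def count_combinations(papers):
    """Tally combinations per year for papers with more than one country."""
    tally = Counter()
    for year, countries in papers:
        if len(countries) > 1:
            tally[(year, combination_label(countries))] += 1
    return tally


# Invented example papers: (year, set of countries).
papers = [
    (2020, {"USA", "GBR"}),         # H-only collaboration
    (2020, {"USA", "IND", "UGA"}),  # H with LM and L
    (2020, {"CHN", "IND"}),         # exclusively middle income
    (2019, {"CHN"}),                # single country, not a collaboration
]
tally = count_combinations(papers)
```

Because the label is built from a set of groups, a paper with two high income countries and one lower-middle income country is counted once under `H-LM`, regardless of how many countries fall in each group.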

Fig. 4

Annual output figures for collaboration based on any combination of economic group pairings (y-axis) over time (x-axis). H high income, UM upper middle income, LM lower middle income, L low income

To identify the topics which are of interest to the research community, we classify the publications into nine groups, namely: (1) Research environment; (2) Supervision and mentoring; (3) Research integrity and training; (4) Research ethics structures; (5) Dealing with breaches of research integrity; (6) Data practices and management; (7) Research collaboration; (8) Declaration of interests; and (9) Publication and communication (Mejlgaard et al., 2020). In recent years, publications predominantly fall under ‘Research ethics structures’ (Fig. 5), at 5804 publications. Research ethics addresses the protocols and policies for adherence in conducting scientific investigation. Biomedical ethics in particular pertains to clinical research methodology, with keywords such as ‘institutional review board’, ‘informed consent’, and ‘ethics committees’ prevalent across titles and abstracts. Bioethics represents a specialized case in biomedical and medical research, addressing social values in medical ethics (e.g. decision-making, dignity, euthanasia, death, quality of life), or concerns intersecting with biotechnology (e.g. cloning, stem cells), all of which benefit from the consensus of international alliances, consortiums, and working groups (Isasi, 2012). The average number of countries per paper was found to be 1.37, indicating fewer than two collaborating countries per paper and suggesting that few papers involve international collaboration. This value is on par with the average for all publications, at 1.35 countries per publication, where single-country authorship occurs more often than international collaboration. For example, in ‘Research ethics structures’, the authorship collaboration matrix (Fig. 6) and ranking show that the leading configuration is a single high-income country, with 4199 papers (Fig. 7). This configuration is followed, with a significant gap, by upper middle income countries.
The prevalence of single country authorship as the preferred authorship structure may arise from factors such as alignment with national-level research policies, and the simpler logistics compared to international collaboration (Dusdal & Powell, 2021). By comparison, the publication gap between a national-level LM country and an instance of collaboration is considerably smaller (LM = 1; H–LM = 2).
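The two summary statistics used in this paragraph, the mean number of countries per paper and a frequency ranking of authorship structures, can be sketched as follows. The sample data and the label format are invented for illustration and do not reproduce the study's figures.

```python
from collections import Counter

# Invented sample: the income group of each participating country, per paper.
papers = [
    ["H"],
    ["H"],
    ["H", "LM"],
    ["UM"],
    ["H", "H", "UM"],
]

# Mean number of countries per paper (here 8 countries over 5 papers).
mean_countries = sum(len(groups) for groups in papers) / len(papers)

# Authorship structure label: sorted income groups joined by a hyphen, so
# that the order in which countries are listed does not matter.
structures = Counter("-".join(sorted(groups)) for groups in papers)

# Structures ranked from most to least frequent.
ranking = structures.most_common()

# Share of papers associated with a single country.
single_country_share = sum(1 for groups in papers if len(groups) == 1) / len(papers)
```

A mean close to 1 indicates that single-country papers dominate, which is the reading the text gives to the reported value of 1.37.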

Fig. 5

Heatmap of output volume by topic in the most recent 10 years

Fig. 6

Country participation by income group for the topic “Research ethics structures”

Fig. 7

Top 10 authorship structures ranked by frequency. The numbers on the x-axis indicate the number of countries associated with the publication

Conclusions

In this study, bibliometric analysis captures and characterizes the macroscopic trends within the broader RIRE research community. The results indicate that while RIRE productivity continued to grow between 1990 and 2020, there are variations between countries in factors such as topics of publication and rate of growth. Several key areas are found within RIRE research, including research integrity, bioethics, scientific misconduct, and plagiarism, which exhibit some geographic dependence, with countries of greater centrality exhibiting a broader scope of topic coverage. Top contributors of RIRE research and collaboration partners are generally dominated by wealthier nations such as the United States, England, Canada, Australia, Germany, and France, with typical collaboration pairings amongst fellow top publishers such as US–Canada and US–United Kingdom. These high-income countries are scientifically advantaged and may have the first mover lead in terms of research. However, although emerging countries have a growing presence in global research participation, in some cases outpacing their more developed counterparts, RIRE-related publications do not follow the subject share proportion growth rates in the same manner. There are strong signals for the emerging trend of H–UM and UM–LM collaborations. H–LM–UM joint papers also increased eightfold during 2007–2020. This signals a growing global trend towards collaborative efforts to transfer and deepen knowledge on RIRE.

Based on current trends, we expect there will be sustained interest and discourse in RIRE topics, as research integrity and research ethics are universal concerns (Bouter, 2020), for several reasons. First, the credibility and research reputation of an institution is intimately tied to its ability to produce quality research with good accountability (Hudson, 2008). There is added visibility from publicly available databases which monitor research activities, such as Retraction Watch, conferences such as the World Conference on Research Integrity, and the formal adoption of research frameworks. Currently, countries and institutions (Aubert Bonn et al., 2017) have substantially different norms and standards governing scientific work (Leshner & Turekian, 2009) and international collaboration. However, there is evidence that the scientific community sees value in transnational partnerships. The proportion of international collaborations is at an all-time high at 29% between 1990 and 2020. The relative decrease in single country papers, and the increase in papers involving two or more countries, indicates that joint papers in RIRE are involving more countries, institutions, and authors. As more institutions adopt initiatives such as formal structured training (DuBois et al., 2008; Ferguson et al., 2007; Satalkar & Shaw, 2019; Sponholz, 2000), there may be more opportunities for further collaboration and communication towards harmonizing standards to promote more uniform global research practices (Frankel et al., 2016). Second, the scope of topics is expected to change depending on factors such as relevance to the publishing country or funding agency, and forthcoming applications or technologies. As an example, bioethics is largely affected by developments within the sector, such as stem cells, gene editing using CRISPR technology, and cloning.
With artificial intelligence, machine learning, and big data permeating sectors including medical practice (e.g. patient data records and collection, genomic data and physiologic data) as well as cybersecurity, we expect that the future discourse will evolve accordingly, dictated by prevailing innovations and commercialization.