Introduction

Scientometric evaluation (SE), which relies on indicators from international bibliographic databases such as the Impact Factor (IF), is perceived inconsistently, either as a game or fashion or as a significant part of academia, and has therefore been the focus of numerous investigations across several disciplines. During the second half of the twentieth century, science and its evaluation procedures shifted radically because of multiple factors, the most noteworthy of which were the massive growth of research and the need to develop standards for science governance (Hammarfelt & de Rijcke, 2015). Consequently, evaluating scholarly merit has become a growing challenge, and modern evaluation systems have focused heavily on international publication (IP) to maximize visibility and citation magnitude (Karlsson, 2017; Shao & Shen, 2012). The ultimate aim is to enhance the rankings of higher education institutions. This aligns with the objective of current scientific policies, which is to strengthen science and education by introducing an international character. Therefore, since the 1980s, these policies have included numerous incentives to boost IP; examples include financial rewards (funding or a higher salary) and professional incentives such as promotion and tenure (Korytkowski & Kulczycki, 2019a; Kulczycki et al., 2019; Rowlands & Wright, 2022).

In this context, since the 1990s, SE has been widely utilized by multiple stakeholders, notably funders, promotion and tenure committees, policy-makers, information specialists, worldwide university rankings, and scholars themselves. This is because metrics provide large volumes of quantitative, reliable, and processable information and an easy, more objective measure of research performance (Åström & Hansson, 2013; Ball & Tunger, 2006; Deutz et al., 2021; Grinev et al., 2021; Kelly & Jennions, 2006; Ma & Ladisch, 2019). The extensive use of these data reflects the emerging audit society, government accountability mechanisms, and vigorous political steering of the national research system (de Rijcke et al., 2016; Hammarfelt & Hallonsten, 2023). Nonetheless, these indicators have attracted many criticisms: potential abuse and probable manipulation; the limited extent to which they reflect the actual quality of research; the inability to give a comprehensive and objective view of individual scientific contributions; and disregard for differences in publication and citation patterns across disciplines (Aksnes & Rip, 2009; Kulczycki et al., 2021; West & Rich, 2012). To confront these flaws, the most significant recommendations of the Leiden Manifesto and the San Francisco Declaration were to identify a set of potential metrics from which the various disciplines can choose the most suitable and to disclose the calculations that metric providers use (DORA, 2012; Hicks et al., 2015).

As one of the essentials for governing research, academic promotion regulations (APRs) are usually developed to keep pace with the growing shifts in academia. The use of SE for judging the scholarship of candidates for academic promotion (AP) has been a prominent aspect of this evolution (Demetrescu et al., 2020). In Egypt, SE has been employed since late 2016 as a part of the APRs in all bodies affiliated with the Supreme Council of Universities (SCU), a governmental umbrella of Egyptian universities and higher education institutions. These regulations allot higher points to publications in international journals (IJs) and other outlets indexed in the Web of Science (WoS) and/or Scopus than to publications in local outlets.

As detailed later, this evaluation uses different metrics, such as the IF, the SCImago Journal Rank (SJR), and the CiteScore. These metrics help the scientific committees for promotion determine the scores of the journals and proceedings in which candidates' papers were published. The Digital Libraries Unit (DLU) at the SCU is responsible for performing the SE centrally and exclusively, and the applicants and their committees receive an official report containing the findings of this evaluation. The application of this evaluation in the Egyptian environment has some features that may differ from other national systems, such as the habilitation procedure in Poland (Kulczycki, 2019) and the Italian National Scientific Qualification (Marzolla, 2016). First, the evaluation is implemented by a neutral national authority employing library and information science specialists. They are qualified to utilize bibliographic databases in a manner that ensures the evaluation's accuracy, integrity, and impartiality. This contrasts with countries where the candidates themselves or the committee of experts provide the values of the bibliometric indicators as one of the promotion requirements. Second, the SE is applied to all disciplines served by the scientific committees, whereas the Italian system uses bibliometric indicators for only nine subject areas, such as mathematics and computer sciences, medical sciences, and physics. Third, the indicator values are used without any normalization, unlike in the Italian system. Finally, while heavy reliance is placed on the IF for evaluating journals in some countries, such as China (Shao & Shen, 2012; Shu et al., 2020), the Egyptian system depends on the CiteScore or the SJR alongside the IF; such variety would suit all specializations.

In this context, it is interesting to observe that Egypt's ranking on InCites improved from 40th in 2016 to 28th in 2021 (Clarivate. InCites, 2022). This strongly reinforces the study hypothesis that SE stimulates Egyptian scholars to publish more in international channels, particularly WoS journals, and changes their publication patterns. Accordingly, the objective is to explore how the SE-based APRs affect the evolution of international publications (IPs) produced by the SCU’s bodies (SCUs) compared to Egyptian research centres (ERCs), which are not subject to these regulations but may have benefited from the SE spillover effect. Hence, the main hypothesis is twofold, as follows:

RH1

There are differences in the IPs of the SCUs compared to the ERCs (net effect) according to the total volume, subject areas, document types, and journal quartiles after employing SE.

RH2

There are differences in the IPs of the ERCs (spillover effect) according to the total volume, subject areas, document types, and journal quartiles after employing SE.

Literature review

To accomplish the study's objective and design its theoretical and applied structure, it was necessary to examine the literature addressing the use of metrics in AP, which represents the focus of the study. Moreover, the author reviewed investigations exploring the impact of using these indicators on publication patterns, whether through monitoring stakeholders’ perceptions or analysing bibliometric data. The two approaches comprehensively revealed the changes generated by utilizing metrics. To determine the methods and statistical techniques adopted, the study leaned upon works assessing the effects of various incentives on universities' scientific output before and after launching these incentives. Below are the previous studies according to their categories.

Usage of SE in AP

A study by Beck (1991) is one of the classic studies to use the term “scientometric evaluation” in its title. In the context of the widespread use of this evaluation in Hungary since the late 1970s, the study evaluated the scientific performance of the Faculty of Natural Sciences using the IF. Its most significant recommendation was the necessity of publishing in higher-impact journals. Later, McHugh (1994) provided a snapshot of the weight of scientific production as one of the main standards for the promotion process using several indicators, such as the publication rate and the number of citations. Furthermore, Woodside (2009) suggested adopting multiple metrics for evaluating journals and the scientific contributions of candidates for promotion instead of relying solely on the IF. Recently, Abramo et al. (2023) investigated the impact of scientometric requirements for a professorship after the Ukrainian policy reform; they found progress in some aspects of IP, although the reform did not consistently yield greater performance. Other studies have highlighted the negative effects of applying this evaluation in AP. Feenstra and Lopez-Cozar (2023) concluded that using metrics harmed publication practices and research integrity among Spanish philosophy researchers, despite a boost in production and more transparency in the promotion process. Demir (2018) reported that some Turkish academics publish their work in predatory journals to get promoted rapidly. This result was later confirmed in Uzbekistan by Eshchanov et al. (2021). Plagiarism, reliance on payment-based publication, and exploitation of beginner researchers were also downsides mentioned by faculty in Kazakhstan (Kuzhabekova & Ruby, 2018).

Stakeholders’ perceptions of SE

Surveying academic stakeholders has been one of the most common methods for identifying the positive and negative implications of SE for research evaluation and publication patterns. Some surveys have been combined with scientometric analyses. Aksnes and Rip (2009) discovered that Norwegian scholars were mistrustful of the fairness of citation-based criteria. Although Swedish humanities academics highlighted a conflict between disciplinary traditions and the requirements of this international trend, they were conscious of the positive influence of these measures on IP (Hammarfelt & de Rijcke, 2015). Similarly, McCulloch (2017) confirmed the inconsistency between these traditions and the types and outlets of publication required by the Research Excellence Framework in England. A wider examination in Australia and Sweden revealed attitudes in the humanities towards indices; many respondents criticized them and felt pressured to use such measures (Hammarfelt & Haddow, 2018). Although this usage has become a significant part of daily life for Irish scholars and decision-makers, standardization and institutionalization of the metrics require examination of their validity, reliability, and implications for all aspects of academia (Ma & Ladisch, 2019). In Poland, Kulczycki et al. (2021) concluded that most deans of higher education institutions believed that these measures were an effective technique for evaluating scholars.

Role of incentives in enhancing IPs

Some scientometric studies have verified the impact of research evaluation systems on IP before and after their inception. University funding was the frequent target of these systems, which mostly employed metrics. In South Africa, a study by Inglesi-Lotz and Pouris (2011) showed that after the National Research Foundation’s rating system was applied, IPs in social science rose by an average of 24.5% over the volume projected without such a system. The Norwegian Publication Indicator, a research funding system that began in 2004, demonstrated its efficacy when both IPs and collaboration grew (Bloch & Schneider, 2016). This indicator has also been adopted in Danish institutions since 2008 and has been found to positively affect the evolution of article output (Ingwersen & Larsen, 2014). Matveeva and Ferligoj (2020) concluded that, after joining the Russian excellence initiative Project 5–100 for funding, the participating universities experienced a rise in both collaborations and research output, whereas there was no considerable shift in nonparticipating universities' collaboration. The initiative also generated more publications in the highest- and lowest-quality journals (Matveeva et al., 2021). Additionally, Lovakov et al. (2021) revealed the positive spillover effect of this initiative on the research performance of universities that were not funded by the initiative. In Italy, Abramo and D'Angelo (2021) measured the relative productivity before and after the implementation of the performance-based research funding system; the results demonstrated that universities' achievement converged, owing to the significant advancement achieved by those with the lowest outputs after they joined this system. Akbaritabar et al. (2021) examined the impact of the second national research assessment, introduced in 2011, on the production of Italian sociologists. The volume of research grew in journals that were effective for evaluation; however, some journals were of questionable quality.

At the Arab level, to the best of the author’s knowledge, no studies have addressed the purpose of the present investigation. Yet, some bibliometric studies have monitored the impact of radical changes on the development of IPs in Arab countries. Ibrahim (2018) found that, following the Arab Spring movement, the number of WoS publications produced by 22 Arab nations doubled. After assessing the effect of recent gender policies in the Middle East and North Africa on the development of WoS literature written by women versus men, El-Ouahi and Lariviere (2023) found that women in Tunisia, Lebanon, Turkey, Algeria, and Egypt achieved more progress relative to other nations in that region. In Saudi Arabia, many bibliometric studies have tracked the development of IPs. The most recent is the study by Al-Jamimi et al. (2023), which showed tremendous growth in these publications because of investment in the Saudi infrastructure for scientific research. Additionally, Al-Jamimi et al. (2022) reviewed bibliometric studies that monitored research performance in economically emerging countries, including Saudi Arabia, where 132 indicators were identified and classified according to productivity, influence, and cooperation.

Despite prior fruitful work, several gaps were identified in this research stream. The impact of metrics has been insufficiently investigated in the framework of promotion; it needs a deeper comprehension in the light of APRs. Another gap is that most researchers have shown the effects on the quantitative progress of the IPs, but the effects on publication type or journal classification have received inadequate consideration. Furthermore, the national level was rarely addressed, as coverage typically focused on a university or group of universities, and numerous studies were also limited to specific specializations rather than covering all subject areas. There was also a severe lack of tracking of the differences between the institutions that receive a stimulus and those that do not. Last, while universities attracted most studies, research centres were absent, although they contribute significantly to national scientific research. The current research aims to bridge these gaps and encourage scholars to enrich this stream, especially in the Middle East, where this type of research is scarce.

Background

In the last decade, Egypt has witnessed two revolutions targeted at radical change and corruption eradication: one on January 25, 2011, and the other on June 30, 2013. Reforms have consequently been implemented across all sectors, including higher education and scientific research (HE&SR). Concurrently, this sector was considering the recommendations issued by the World Bank in its 2010 review of Egyptian policy on HE&SR. Optimizing research capacity to compete globally and implementing an incremental reform strategy were the most essential suggestions. For the staff, the most challenging aspects were the changes impacting their employment security and conditions, status, workload, and performance evaluation (The World Bank, 2010). All these circumstances have prompted the Ministry of HE&SR to improve the quality of research performance and increase IP rates, especially those of universities. The eventual intention was to raise Egypt's position in the global rankings.

The refinement of APRs has been one of the most substantial transformations executed by the SCU. According to these regulations, the promotion portfolio submitted for an associate professorship or full professorship includes at least five and up to eight papers (70 points), a report on teaching, research, and community activities (20 points), and a discussion with the candidate (10 points). Each requirement must be fulfilled by at least 60%. Promotion requires a total score of at least 70 points for full professorship and at least 60 points for associate professorship. All submitted papers are graded using the following equation:

$$\text{Score of a paper} = \frac{\text{journal or conference score } (\le 10\ \text{points}) \times \text{participation score, weighted by the number of co-authors } (\le 1\ \text{point}) \times \text{expert-graded score } (\le 100\ \text{points})}{50}$$
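As a rough illustration of the equation above, the following Python sketch computes the score of a single paper. The function name `paper_score` and the example values are hypothetical and are not taken from the regulations; only the three factors, their upper bounds, and the divisor of 50 come from the formula.

```python
def paper_score(journal_score, participation_score, expert_score):
    """Score of one paper: journal x participation x expert, divided by 50."""
    # Upper bounds come from the equation; the lower bound of 0 is an assumption.
    limits = (
        ("journal_score", journal_score, 10),
        ("participation_score", participation_score, 1),
        ("expert_score", expert_score, 100),
    )
    for name, value, upper in limits:
        if not 0 <= value <= upper:
            raise ValueError(f"{name} must lie between 0 and {upper}")
    return journal_score * participation_score * expert_score / 50


# Hypothetical paper: journal score 8, participation weight 0.5, expert grade 90.
example = paper_score(8, 0.5, 90)
```

With the maximum value of every factor, the formula yields 10 × 1 × 100 / 50 = 20 points for a single paper.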

The APRs are revised by the SCU every three years when a new round of scientific committees for promotion is to be launched (approximately 131 committees). Since the inception of the 12th round (2016–2019), the APR has utilized SE for the first time to evaluate the channels of publications (Ministry of Higher Education & Scientific Research. Supreme Council of Universities, 2016). Accordingly, a scientometric report (SR) issued by the DLU has become a prerequisite for applying for promotion. Based on data from the WoS and Scopus, this report contains rankings of journals and proceedings where the nominee's research was published. The report includes the value and quartile of both the SJR and the IF during the 12th round, while the CiteScore and the IF are applied in the 13th round (2019–2022) (Ministry of Higher Education & Scientific Research. Supreme Council of Universities, 2019) and the 14th one (2022–2025) (Ministry of Higher Education & Scientific Research. Supreme Council of Universities, 2022). Formerly, the responsibility of allocating points to the publication outlets was delegated to committee members. In contrast, the regulations of the last two rounds have involved assigning points to each rank of IJs indexed in the WoS and Scopus, as shown in Table 1. This ensures uniformity, objectivity, and transparency and prevents the bias and favouritism that may occur if the committee members allocate journal points, which represents one of the most important aspects of the desired reform.

Table 1 Points assigned to IJs indexed in the WoS and Scopus according to the APR issued by the SCU*

According to the SE, articles or reviews published in the first and second quartiles of the WoS are exempt from expert review. This procedure is called “fast track” as the highest scores are awarded to these papers (Q1: 90/100 and Q2: 80/100) instead of scores graded by experts. If the journal is indexed in both WoS and Scopus, it will obtain the highest mark, as listed in the table. Furthermore, the IJs indexed in WoS or Scopus receive higher marks than those received by the nonindexed ones called “local journals”, whether published in Egypt or other countries, which obtain at most 7/10. The APRs detail seven criteria for evaluating this type of journal. In addition, the scores of international proceedings indexed in the two databases are higher than those of the local ones. IJs are generally superior to international and local proceedings, which obtain fewer points; hence, these regulations strongly encourage articles. The APRs have no items for evaluating books, book chapters, etc., which are subject to the committee’s determination.
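The fast-track rule described above can be sketched as a small Python function. This is only an illustration under stated assumptions: the names `expert_component`, `FAST_TRACK_SCORES`, and `FAST_TRACK_TYPES` are hypothetical, and the sketch covers only the expert-graded component (out of 100), not the journal points listed in Table 1.

```python
# Fixed scores (out of 100) that replace the expert grade, as stated in the text.
FAST_TRACK_SCORES = {"Q1": 90, "Q2": 80}
# Document types eligible for the fast track, as stated in the text.
FAST_TRACK_TYPES = {"article", "review"}


def expert_component(doc_type, wos_quartile, expert_grade=None):
    """Return the score out of 100 that enters the paper-score equation."""
    if doc_type in FAST_TRACK_TYPES and wos_quartile in FAST_TRACK_SCORES:
        # Fast track: the paper is exempt from expert review.
        return FAST_TRACK_SCORES[wos_quartile]
    if expert_grade is None:
        raise ValueError("an expert grade is required outside the fast track")
    return expert_grade
```

For example, a Q1 article receives 90 regardless of any expert grade, whereas a Q3 article keeps the grade assigned by the experts.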

The regulations of ERCs affiliated with the Ministry of HE&SR and other governmental authorities were reviewed, such as Council of Research Centres and Institutions (2021) and Egyptian Atomic Energy Authority (2021). It was found that they highlight IP and its conditions without referring to the incentives included in the SCU’s regulations in terms of determining points for IJs or grades assigned to research published in the highest-ranking journals. Accordingly, although ERCs do not undergo the SCU’s APRs, the impact of these new regulations is expected to extend to academics working for these bodies, which often support IP.

These shifts are in line with some worldwide trends in research evaluation and academic promotion systems. These trends include using citation-based metrics to identify the prestige of journals; encouraging publishing in top-tier journals and avoiding questionable ones; allotting higher points to WoS journals; evaluating local journals to help them enhance their quality; and employing library and information science to serve and improve promotion procedures (Brown, 2014; Cruz-Castro & Sanz-Menendez, 2021; Demetrescu et al., 2020; Frandsen et al., 2022; West & Rich, 2012). In contrast, the national system does not follow other trends, such as the use of scholar- or paper-level indices (e.g., the H-index and citation counts) or altmetrics (Bray et al., 2023; Kelly & Jennions, 2006; Nuzzolese et al., 2019; Yu et al., 2023). All these alterations in the HE&SR sector, specifically the incorporation of the SE into the APRs, necessitate an investigation of the potential variations in IP rates and patterns in Egypt; this is the focus of the current research.

Data and methods

The current investigation is classified as an impact evaluation study, whose primary objective is to monitor the shifts that are attributed to a certain intervention, such as a policy or programme, to determine its effectiveness. A quasi-experimental design is the most commonly used for this purpose (Inglesi-Lotz & Pouris, 2011). It relies mainly on the randomized before-after-control-impact (BACI) method, which employs a treatment group (TG) that receives the intervention and a control (comparison) group (CG) that does not. Both groups are represented by their data before and after launching the treatment. This approach makes it feasible to account for any natural or previous differences between the TG and CG to specify the true impact of the intervention. Accordingly, CG is essential for controlling the environment and reducing confounding factors other than the intervention that could have led to the discovered change (Christie et al., 2020; The International Initiative for Impact Evaluation, 2022). In addition, the spillover effects received by CG could be explored. This design was used in this case to examine the effect of SE-based APRs by comparing the IPs of both the TG (SCUs) and CG (ERCs) before and after initiating this evaluation.

Data

To conduct the required comparisons, it was necessary to extract data reflecting the features of the two groups’ research indexed in the WoS. The data were extracted from Clarivate's InCites to recognize the differences that might have resulted from the Egyptian researchers' preference for WoS journals, which earn higher points relative to Scopus journals in APRs. Given that the period following the implementation of the SE available for the analysis was only five years (2017–2021), it was suggested that the whole period of the study be divided into two equal subperiods of 5 years before and after launching the SE. After gathering the IP data for the proposed period and examining the statistical models, it was found that the models were invalid because they required a longer time series to be accurately executed. Accordingly, the time series should have been extended as much as possible. This is advantageous in this analysis because the longer the time series is, the more reliable the statistical results are and the fewer errors there are, which enhances the ability of the statistical models to estimate and predict. Therefore, it was appropriate to stretch the series from 1996 to 2021 and divide it into two periods: 1996 to 2016 (before SE) and 2017 to 2021 (after SE).

Moreover, two groups were meticulously chosen from the Egyptian organizations listed on InCites during the whole period: the SCUs (45 bodies) and the ERCs (43 bodies). Foreign and independent organizations that are not affiliated with the SCU or any Egyptian body, such as the American University in Egypt and Naval Medical Research Unit 3, were excluded. Al-Azhar University was also excluded because it is a large-scale university of a special nature that follows a religious body, Al-Azhar Al-Sharif. The aim was to form two groups as similar as possible in terms of the environment in which research is produced to obtain academic promotion. There is another interesting similarity: each group produced approximately 50% of its total output for the whole period during the second period (5 years) (Table 2), even though the SCUs outpaced the ERCs in total IPs throughout the entire period. Furthermore, an essential assumption had to hold before the method adopted in this work could be applied and yield correct findings: the trends of the IPs in the first period were assumed to be the same for the two groups. More technically, the graphs of both groups from 1996 to 2016 were assumed to be represented by two approximately parallel lines; examination of the collected data confirmed that the identified groups satisfied this assumption. It was further found that both groups have fewer humanities and social sciences publications and are more productive in other disciplines.

Table 2 The total IPs produced by the SCUs and ERCs from 1996 to 2021 according to InCites

The data, collected during July 2022, reflect the IPs according to four variables: (1) documents (overall volumes); (2) subject areas (using InCites’ Global Institutions Profiles Project (GIPP), a scheme with six broad subjects that is more compact than the WoS scheme with 21 subjects); (3) document types (articles, book chapters, proceedings papers, and reviews; no books were found for either group); and (4) journal rankings (Q1, Q2, Q3, and Q4). To obtain the number of papers indexed in Emerging Sources Citation Index (ESCI) journals, the following calculation was implemented:

$$\text{Documents published in ESCI journals} = \text{total documents} - \text{total documents published in journals under all quartiles} \quad (\text{for each organization in each year})$$
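The subtraction above can be sketched in Python for one organization in one year. The function name `esci_documents` and the counts are hypothetical and serve only to illustrate the arithmetic; like the formula, the sketch attributes every document that falls outside Q1 to Q4 to ESCI journals.

```python
def esci_documents(total_documents, quartile_documents):
    """Documents not counted under any quartile, attributed to ESCI journals."""
    return total_documents - sum(quartile_documents.values())


# Hypothetical counts for a single organization in a single year.
quartile_counts = {"Q1": 40, "Q2": 30, "Q3": 20, "Q4": 10}
esci = esci_documents(120, quartile_counts)  # 120 - (40 + 30 + 20 + 10)
```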

The data points for each institution (while collecting data) and for each group (while processing and analysing data) are as follows:

  • Number of total publications each year.

  • Number of publications under each subject each year.

  • Number of publications under each type each year.

  • Number of publications under each journal quartile each year.

Accordingly, 26 Excel sheets were generated, where a sheet was designated to record data for each year, and then the data were collected in a separate sheet. (See the Appendix: the IPs of SCUs and ERCs and the Online Resource: the IPs under each organization of SCUs and ERCs). Table 2 reports the IP totals of both groups in the two intervals.

To identify the general trends of IP growth, the following equation was used to calculate the annual growth rate (AGR):

$$\text{AGR} = \frac{\text{End Value} - \text{First Value}}{\text{First Value}} \times 100$$

The AGR for 1996 was not calculated because the 1995 values (the first value) were outside the scope of the research. Moreover, there were no AGRs for some variables in some years due to the absence of production, as in the case of book chapters. The WoS quartiles were introduced in 1997; hence, the AGR for that year was not calculated for the quartiles, unlike ESCI journals, for which values were present. The AGR values have been rounded to the nearest whole number.
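The AGR calculation and the exceptions just described can be sketched as follows. This is a minimal illustration, assuming that the "first value" is the previous year's count and the "end value" is the current year's count; the function name `annual_growth_rates` and the sample counts are hypothetical.

```python
def annual_growth_rates(counts):
    """Map each year to its AGR in percent, or None where it is undefined."""
    years = sorted(counts)
    rates = {}
    # The earliest year has no previous value, so it receives no AGR.
    for prev, curr in zip(years, years[1:]):
        first_value = counts[prev]
        if curr - prev != 1 or first_value == 0:
            # No production in the previous year (or a missing year): undefined.
            rates[curr] = None
        else:
            growth = (counts[curr] - first_value) / first_value * 100
            rates[curr] = round(growth)  # rounded to a whole number
    return rates


# Hypothetical yearly publication counts.
sample = annual_growth_rates({1996: 200, 1997: 250, 1998: 300})
```

Note that Python's `round` resolves exact halves to the nearest even integer, which may differ from the rounding convention of a spreadsheet.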

Statistical model

The difference in difference (DID) or double difference model was employed to perform this examination. In quasi-experimental contexts, such a technique represents a quantitative design for estimating causal relationships, as it explores the effects of an intervention that does not affect all individuals or institutions at the same time. To this end, it compares the variations in the TG mean before and after the intervention with the change in the CG mean over time. This design incorporates cross-sectional and time-series data to measure the changes in outcome variables of both the TG and CG before and following the intervention (Makkonen & Mitze, 2016). In other words, the DID estimator examines the difference in both the TG and CG before and after the intervention; then, the net effect, or net change, is calculated as follows: (TG after–TG before)–(CG after–CG before). This approach should be accompanied by either the matching method or testing the parallel trends assumption, i.e., examining the similarity between the IPs of the TG and those of the CG before applying the SE. Then, two dummy variables are generated: one for the groups taking values 1 for TG and 0 for CG, and the other for the period, where 1 and 0 are the values indicating the periods after and before the intervention, respectively. The regression model used here can be expressed as follows:

$${Y}_{t}= {\beta }_{0}+ {\beta }_{1}{G}_{t}+ {\beta }_{2}{P}_{t}+ {\beta }_{3}{G}_{t}*{P}_{t}+ {\varepsilon }_{t} ,$$

where:

\({Y}_{t}\) is the IP value for t = 1996, 1997, …, 2021, \({G}_{t}=\begin{cases}0, & \text{if } Y\in CG\\ 1, & \text{if } Y\in TG\end{cases}\), \({P}_{t}=\begin{cases}0, & \text{if } t\le 2016\\ 1, & \text{if } t\ge 2017\end{cases}\),

\({G}_{t}*{P}_{t}\) is the interaction term, and \({\varepsilon }_{t}\) indicates the error term.

The four coefficients are illustrated in Table 3, as the columns refer to the periods and rows refer to the groups. The mean IPs of the TG before and after SE are \({\beta }_{0}+ {\beta }_{1}\) and \({\beta }_{0}+ {\beta }_{1}+ {\beta }_{2}+ {\beta }_{3}\), respectively. Additionally, the mean IPs of the CG before and after this evaluation are \({\beta }_{0}\) and \({\beta }_{0}+{\beta }_{2}\), respectively. Therefore, the difference between the two periods in the TG is \({\beta }_{2}+ {\beta }_{3}\), and in the CG is \({\beta }_{2}\); the difference in difference is \({\beta }_{3}\). Additionally, the difference between the two groups in the first period is \({\beta }_{1}\), and that in the second period is \({\beta }_{1}+ {\beta }_{3}\).
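The relationships among the four coefficients can be illustrated with a short Python sketch. Because the model contains only the two dummies and their interaction, its least-squares estimates coincide with differences of the four group-by-period means, so the sketch computes those means directly rather than fitting a regression. The function names and the yearly values are hypothetical, the cutoff follows the 1996–2016 and 2017–2021 split described in the Data section, and no standard errors or p values are produced.

```python
def cell_mean(series, after, cutoff=2016):
    """Mean of the yearly values before (after=False) or after the cutoff."""
    values = [v for year, v in series.items() if (year > cutoff) == after]
    return sum(values) / len(values)


def did_coefficients(tg, cg, cutoff=2016):
    """Coefficients of Y = b0 + b1*G + b2*P + b3*G*P from the four cell means."""
    tg_before, tg_after = cell_mean(tg, False, cutoff), cell_mean(tg, True, cutoff)
    cg_before, cg_after = cell_mean(cg, False, cutoff), cell_mean(cg, True, cutoff)
    return {
        "beta0": cg_before,                                       # CG mean before
        "beta1": tg_before - cg_before,                           # group gap before
        "beta2": cg_after - cg_before,                            # change in the CG
        "beta3": (tg_after - tg_before) - (cg_after - cg_before)  # net effect
    }


# Hypothetical yearly publication counts for the two groups.
tg = {2015: 100, 2016: 120, 2017: 200, 2018: 220}
cg = {2015: 50, 2016: 60, 2017: 80, 2018: 90}
coefficients = did_coefficients(tg, cg)
```

In this made-up example, the TG mean rises by 100 and the CG mean by 30, so the difference in differences is 70.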

Table 3 Exemplification of the DID model

Given the parallel trends assumption, the initial intention was to apply the DID method to the AGR values in order to explore the differences between the two groups in terms of growth rates rather than publication volumes, which may not accurately reflect the potential progress. However, several tests conducted in this study revealed that the assumption was violated when the AGRs were used. In contrast, the assumption was effectively met when the numbers of publications were used. The numbers also led to higher values of adjusted \({R}^{2}\), which indicates how well the DID models fit the data. The average value of \({R}^{2}\) is 75%, meaning that the models account for most of the variation in the IPs. Additionally, the p values of the DID models applied are reported as 0.00; all the models are therefore significant at the 5% level (95% confidence). As evidenced by the following results, these models indicated a consistently higher impact of the SE on the SCUs compared to the ERCs.

The statistical packages used in this study are SPSS®20, STATA®14, and MS Excel 2013. All the figures and tables representing the results of the DID models were generated from statistical applications.

Results

This section presents the general trends of the IP evolution for the SCUs compared to ERCs, followed by the DID approach outcomes performed to test the study's hypotheses.

General trends

Overall, the IPs of both groups increased rapidly following the introduction of the SE, as illustrated in Fig. 1. Until 2005, the number of publications slowly increased and then escalated from 2006 to 2016. Afterwards, publications jumped more sharply than in the previous period. Nevertheless, the SCUs recorded noticeably higher values throughout the whole period. Table 4 shows that following the AGR decline from 2012 (20%) to 2017 (5%), the SCUs had a sharp rise from 2018 (16%) to 2020 (22%) and a minor fall in 2021 (20%). The AGR of 22% (2020) was the highest achieved by the ERCs after the SE. Throughout the time series, the arts and humanities and social sciences in both groups showed vastly lower publications relative to other subjects, which all had similar growth patterns across the years and an exponential rise after the SE (Fig. 2a, b). Nevertheless, the social sciences outperformed other disciplines during the second period, with an AGR of 47% in 2021 for SCUs and its highest level (80%) in 2020 for ERCs (Table 5). Moreover, the number of articles and reviews increased in both groups, but there was an accelerated shift after 2016 (Fig. 2c, d). As listed in Table 6, the highest AGRs of articles were approximately 30% for SCUs and 25% for ERCs in 2020, and the highest AGR of reviews was approximately 57% for both SCUs (2021) and ERCs (2019). The proceedings papers declined gradually, while the book chapters were absent for some years. For quartiles, the publications of Q1 and Q2 rose tremendously compared to other quartiles after the new evaluation (Fig. 2e, f). Whereas the AGR of SCUs reached 33% in 2021 for Q1 and 52% in 2020 for Q2, the highest values recorded by the ERCs were approximately 45% (2020) for Q1 and 46% (2019) for Q2. Other quartiles witnessed relatively low AGRs, as depicted in Table 7.

Fig. 1 Total IPs from 1996 to 2021 for the SCUs and ERCs. The SCUs have greater IPs than the ERCs. Although both lines are increasing, the SCU line has grown exponentially, particularly since 2016

Table 4 The AGR of total IPs for the SCUs and ERCs from 1996 to 2021
Fig. 2 IPs according to subject areas, document types, and journal quartiles from 1996 to 2021 for the SCUs and ERCs

Table 5 The AGR of IPs according to subject areas for the SCUs and ERCs from 1996 to 2021
Table 6 The AGR of IPs according to document types for the SCUs and ERCs from 1996 to 2021
Table 7 The AGR of IPs according to journal quartiles for the SCUs and ERCs from 1996 to 2021

Results of the DID model

The DID method was applied to the total IPs and then separately according to subjects, document types, and journal quartiles. Figure 3 illustrates the application of the DID model in detail. The y-axis is the mean of the total IPs, and the x-axis shows the two periods (before and after the SE). The black line represents the SCUs, and the dotted line refers to the ERCs. The difference between the means of IPs recorded for the SCUs in the first and second periods is \({\beta }_{2}+ {\beta }_{3}\), while the difference between the means of IPs for the ERCs in the two periods is \({\beta }_{2}\). Consequently, the difference in differences is \({\beta }_{3}\), the net effect gained by the SCUs compared to the ERCs after the SE. The mean of the total IPs of the SCUs is higher than that of the ERCs in both periods. Remarkably, the line of the SCUs is steeper than that of the ERCs, reflecting a larger increase in the SCUs’ publications after the SE than in the other group. As a result, the gap between the two lines is wider after the SE than before it. As visualized in Figs. 4, 5 and 6, the same observation holds across all subjects, types, and quartiles.
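The reading of the coefficients described above can be sketched in a few lines. The sketch assumes the standard two-group, two-period specification with a group dummy, a period dummy, and their interaction; in that saturated case, the least-squares coefficients coincide with differences of the four cell means. The function name `did_coefficients` and the records are hypothetical and serve only as an illustration.

```python
from statistics import mean


def did_coefficients(records):
    """Compute the 2x2 DID coefficients from cell means.

    Each record is (treated, post, value): treated is 1 for the evaluated
    group (here, the SCUs) and 0 for the comparison group (the ERCs);
    post is 1 for the period after the policy and 0 for the period before.
    """
    cell = {
        (g, t): mean(v for grp, per, v in records if grp == g and per == t)
        for g in (0, 1)
        for t in (0, 1)
    }
    beta0 = cell[(0, 0)]                            # comparison group, before
    beta1 = cell[(1, 0)] - cell[(0, 0)]             # group gap before the policy
    beta2 = cell[(0, 1)] - cell[(0, 0)]             # change in the comparison group
    beta3 = (cell[(1, 1)] - cell[(1, 0)]) - beta2   # difference in differences
    return beta0, beta1, beta2, beta3


# Hypothetical observations, for illustration only (not the study's data).
records = [
    (0, 0, 100), (0, 0, 120), (0, 1, 150), (0, 1, 170),
    (1, 0, 300), (1, 0, 340), (1, 1, 600), (1, 1, 640),
]
beta0, beta1, beta2, beta3 = did_coefficients(records)
print("change in treated group:", beta2 + beta3)
print("change in comparison group:", beta2)
print("net effect (DID):", beta3)
print("group gap after the policy:", beta1 + beta3)
```

With these illustrative numbers, the treated group changes by 300, the comparison group by 50, and the net effect is 250, which mirrors the relationship between \({\beta }_{2}+{\beta }_{3}\), \({\beta }_{2}\), and \({\beta }_{3}\) in Fig. 3.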

Fig. 3 Mean of the total IPs in the two periods for the SCUs and ERCs. The mean of the SCUs is higher than that of the ERCs in both periods. After the SE, the line of the SCUs is steeper than that of the ERCs, reflecting a larger increase in the total IPs compared to the ERCs

Fig. 4 Mean of the IPs according to subject areas in the two periods for the SCUs and ERCs

Fig. 5 Mean of the IPs according to document types in the two periods for the SCUs and ERCs

Fig. 6 Mean of the IPs according to journal quartiles in the two periods for the SCUs and ERCs

The results of the DID model for the total IPs are presented in Table 8. The p value of the model is 0.00; hence, the model is significant at the 95% confidence level. According to the adjusted R² of 0.789, the model explains 78.9% of the variation in the total IPs. Additionally, the coefficients \({\widehat{\beta }}_{1}\), \({\widehat{\beta }}_{2}\), and \({\widehat{\beta }}_{3}\) are significant, as their p values are approximately 0.00, 0.02, and 0.00, respectively. There is a significant difference between the two periods in the total IPs of the SCUs (\({\widehat{\beta }}_{2}+{\widehat{\beta }}_{3}\)), which increased by almost 21445 publications after the SE was launched. Likewise, a significant difference is observed in the total IPs of the ERCs (\({\widehat{\beta }}_{2}\)), which rose by 4514 in the second period. The net effect between the SCUs and ERCs is therefore significant: the SE positively impacted the total IPs, which grew in the SCUs by almost 16931 (\({\widehat{\beta }}_{3}\)) more than in the ERCs. Moreover, the model demonstrates a significant difference between the total IPs of the SCUs and ERCs in the first period (\({\widehat{\beta }}_{1}\)) and the second period (\({\widehat{\beta }}_{1}+{\widehat{\beta }}_{3}\)), with the total IPs of the SCUs higher than those of the ERCs in both.

Table 8 The DID model for the total IPs

The DID models for the six subjects were estimated individually. The p values of all six models are 0.00, so all are significant at the 95% confidence level. With adjusted R² values of 0.803, 0.801, 0.805, 0.775, 0.770, and 0.748, the models explain the following shares of the variation in the IPs: arts and humanities, 80.3%; clinical, preclinical, and health sciences, 80.1%; engineering and technology, 80.5%; life sciences, 77.5%; physical sciences, 77.0%; and social sciences, 74.8%. Whereas the p values of the coefficients \({\widehat{\beta }}_{1}\) and \({\widehat{\beta }}_{3}\) in the six models are less than 0.05, i.e., they are significant, the coefficient \({\widehat{\beta }}_{2}\) is significant only in the models representing engineering and technology, life sciences, and physical sciences. The SE, consequently, resulted in rising IPs in these three subjects in both the SCUs (\({\widehat{\beta }}_{2}+{\widehat{\beta }}_{3}\)) and the ERCs (\({\widehat{\beta }}_{2}\)). In contrast, while the IPs produced by the SCUs in the arts and humanities, clinical, preclinical and health sciences, and social sciences increased by \({\widehat{\beta }}_{3}\) after the SE was implemented, these subjects were not affected in the other group. This reveals the significance of the net difference; more precisely, the SE raised the IPs of the SCUs in all subjects by \({\widehat{\beta }}_{3}\) compared to the ERCs. In addition, the findings indicate that the two groups differ considerably from each other before (\({\widehat{\beta }}_{1}\)) and after (\({\widehat{\beta }}_{1}+{\widehat{\beta }}_{3}\)) the SE, with the SCUs producing more publications than the ERCs in all disciplines. Table 9 presents the details.
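The decision rule applied in this paragraph can be written out as a short sketch. It follows the reading used in the text, in which a coefficient whose p value is not below 0.05 is interpreted as no detectable change; this is a simplification, since an insignificant estimate is not evidence of a zero effect. The function name `interpret_did` is hypothetical, and the \({\widehat{\beta }}_{2}\) estimate and p value in the second example are invented for illustration (only the total-IP figures and the value of 90 for arts and humanities come from the text).

```python
ALPHA = 0.05  # threshold corresponding to the 95% confidence level


def interpret_did(beta2, p_beta2, beta3, p_beta3, alpha=ALPHA):
    """Return (comparison_change, treated_change, net_effect).

    A coefficient is kept only when its p value is below alpha; otherwise
    it is read as "no detectable change" and replaced by zero.
    """
    b2 = beta2 if p_beta2 < alpha else 0.0
    b3 = beta3 if p_beta3 < alpha else 0.0
    return b2, b2 + b3, b3


# Total IPs: coefficients and approximate p values as reported in the text.
total = interpret_did(4514, 0.02, 16931, 0.00)

# Arts and humanities: beta3 = 90 is reported; the beta2 estimate (30) and
# its p value (0.40) are hypothetical placeholders for an insignificant term.
arts = interpret_did(30, 0.40, 90, 0.00)

print("total IPs:", total)
print("arts and humanities:", arts)
```

Under this reading, a subject with an insignificant \({\widehat{\beta }}_{2}\) shows no change in the ERCs, and the change in the SCUs reduces to \({\widehat{\beta }}_{3}\).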

Table 9 The DID model for the IPs according to subject areas

For types, all four models are significant at the 95% confidence level because their p values are also 0.00. The individual DID models explain 76.8%, 69.5%, 57.5%, and 67.9% of the variations in articles, book chapters, proceedings papers, and reviews, respectively, as the adjusted R² values were 0.768, 0.695, 0.575, and 0.679. The p values of the coefficient \({\widehat{\beta }}_{3}\) in the four models are less than 0.05, indicating that it is significant, while the coefficient \({\widehat{\beta }}_{2}\) is significant only in the model of the articles. Accordingly, the SCUs and the ERCs experienced a boost in articles following the inception of the SE by \({\widehat{\beta }}_{2}+{\widehat{\beta }}_{3}\) and \({\widehat{\beta }}_{2}\), respectively. On the other hand, the reviews, book chapters, and proceedings papers rose by the values of \({\widehat{\beta }}_{3}\) in the SCUs and were not impacted in the ERCs. It is therefore evident that the net effect is significant, as the SE improved the four types by \({\widehat{\beta }}_{3}\) in the SCUs in comparison to the ERCs. Furthermore, the model revealed a significant difference in articles and proceedings papers between the SCUs and ERCs before the intervention (\({\widehat{\beta }}_{1}\)) and after it (\({\widehat{\beta }}_{1}+{\widehat{\beta }}_{3}\)), with the SCUs more productive than the ERCs. For book chapters and reviews, there was no significant difference between the two groups in the first period, whereas both types were significantly higher in the SCUs in the second period (Table 10).

Table 10 The DID model for the IPs according to document types

Finally, the p values of the five models of the journal rankings are 0.00; the models are significant at the 95% confidence level. As shown in Table 11, since the adjusted R² values are 0.753, 0.678, 0.823, 0.807, and 0.773, the models explain 75.3%, 67.8%, 82.3%, 80.7%, and 77.3% of the variations in the IPs published in Q1, Q2, Q3, Q4, and ESCI journals, respectively. Except for \({\widehat{\beta }}_{2}\) in Q2, the coefficients \({\widehat{\beta }}_{1}\), \({\widehat{\beta }}_{2}\), and \({\widehat{\beta }}_{3}\) are significant in all models. Consequently, while the IPs in all the quartiles increased significantly in the SCUs by \({\widehat{\beta }}_{2} +{\widehat{\beta }}_{3}\) after the SE, they increased in the ERCs by \({\widehat{\beta }}_{2}\), except for Q2, meaning that papers in this quartile were not influenced in the ERCs. Therefore, the net effect is significant for the IPs in all the quartiles, resulting in an increase in their values by \({\widehat{\beta }}_{3}\) in the SCUs compared to the ERCs. As with most of the previous variables, there is also a significant difference between the two groups in each period, with the IPs of the SCUs higher than those of the ERCs under all journal rankings.

Table 11 The DID model for the IPs according to journal quartiles

Given the preceding analysis and the summary of DID outcomes presented in Table 12, the hypothesis examination results can be concisely described as follows:

Table 12 Summary of the DID Model results

RH1

After the SE was employed, the IPs of the SCUs witnessed a significant net difference (\({\widehat{\beta }}_{3}\)) for all variables studied. The total IPs increased by \({\widehat{\beta }}_{3}\) (16931). Likewise, the subjects increased by approximately 90, 6298, 4922, 6459, 4221, and 471 for arts and humanities, clinical, preclinical and health sciences, engineering and technology, life sciences, physical sciences, and social sciences, respectively. For types of publications, the values achieved for articles, book chapters, proceedings papers, and reviews were approximately 13981, 17, 879, and 935, respectively. Finally, Q1, Q2, Q3, Q4, and ESCI journals were boosted by approximately 5134, 4089, 2572, 1426, and 3710, respectively.

RH2

\({\widehat{\beta }}_{2}\) reflects the differences in the IPs of the ERCs. The total IPs significantly changed; the value of \({\widehat{\beta }}_{2}\) is approximately 4514. There were significant differences only in engineering and technology, 1281; life sciences, 2099; and physical sciences, 1825. All types of publications had no differences except for articles, which increased by approximately 4121. In contrast, other than Q2, all quartiles witnessed significant differences, as follows: Q1, 1238; Q3, 717; Q4, 495; and ESCI, 998.
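As a worked illustration of how RH1 and RH2 combine, the sketch below adds the reported \({\widehat{\beta }}_{2}\) and \({\widehat{\beta }}_{3}\) values to obtain the implied change in the SCUs between the two periods (\({\widehat{\beta }}_{2}+{\widehat{\beta }}_{3}\)). Only variables for which a significant \({\widehat{\beta }}_{2}\) is reported above are included, and the inputs are the approximate, rounded figures quoted in the text, so the sums are approximate as well. The names `reported` and `scu_change` are illustrative.

```python
# (beta2, beta3) pairs as quoted in RH2 and RH1:
# beta2 is the change in the ERCs, beta3 the net effect for the SCUs.
reported = {
    "Total IPs": (4514, 16931),
    "Engineering and technology": (1281, 4922),
    "Life sciences": (2099, 6459),
    "Physical sciences": (1825, 4221),
    "Articles": (4121, 13981),
    "Q1": (1238, 5134),
    "Q3": (717, 2572),
    "Q4": (495, 1426),
    "ESCI": (998, 3710),
}


def scu_change(beta2, beta3):
    """Implied before/after change in the SCUs: beta2 + beta3."""
    return beta2 + beta3


implied = {name: scu_change(b2, b3) for name, (b2, b3) in reported.items()}

for name, change in implied.items():
    erc_change = reported[name][0]
    print(f"{name}: ERC change = {erc_change}, SCU change = {change}")
```

For the total IPs, the sum is 21445, which matches the increase reported for the SCUs in the discussion of Table 8.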

Discussion

The SCU has recently introduced SE-based APRs for promoting Egyptian faculty members as one of the substantial developments in the higher education sector. The current study attempted to examine the potential differences in the SCUs’ IPs as a consequence of this evaluation as well as the spillovers of the SE on the ERCs, which are governed by different regulations. The DID technique was the most efficient for testing the study hypotheses. Previous scientometric analyses employed it to determine the net effect of incentives on research output, such as funding initiatives (Lovakov et al., 2021), promotion systems (Abramo et al., 2023), and even political changes (Makkonen & Mitze, 2016). It is an advanced statistical approach that is more comprehensive than both the t test and the F test for conducting this type of research.

The findings indicated that the SE has led to significant positive differences in the IPs of the SCUs and extended effects on the ERCs. Remarkably, the volume of IPs produced by each group over the 21 years before the SE (1996–2016) is nearly equivalent to that published over the subsequent five years (2017–2021), implying that IPs have risen dramatically since the initiation of the new evaluation. Despite the fluctuations in the AGR experienced by both groups before the SE, the trend was generally upward after 2017. From 2017 to 2021, the rates were 5%, 16%, 18%, 22%, and 20%, respectively, for the SCUs and 4%, 13%, 20%, 22%, and 16%, respectively, for the ERCs. The AGR decreased severely in 2017. This may be due to the economic consequences experienced by Egypt in late 2016, when the Egyptian pound was floated against the US dollar, making researchers reluctant to pay IP fees, especially for open access journals. The COVID-19 pandemic may also have caused the AGR to decline in 2021 after the increases attained since 2018. Although the DID results showed substantial positive differences between the two periods for both the higher performers (SCUs) and the lower performers (ERCs), the gap between the two groups in the total IPs widened considerably after the SE.

Similarly, this gap appeared in all subjects, which all witnessed significant net differences represented by the values of \({\widehat{\beta }}_{3}\) for SCUs compared to ERCs. Unlike other disciplines, the humanities, health sciences, and social sciences did not experience any significant change in ERCs. For SCUs, the highest value of \({\widehat{\beta }}_{3}\) was achieved by the life sciences (6459), while the lowest value was, as expected, for the humanities (90). Social sciences have recorded higher AGRs than other sciences over the last three years in both SCUs (33%, 24%, and 47%) and ERCs (35%, 80%, and 40%). This leap seems to represent a positive response to the SE from these sciences. According to earlier research, the humanities and social sciences have a lower tendency towards adopting this type of evaluation (Jimenez-Contreras et al., 2003; Korytkowski & Kulczycki, 2019b; Ma & Ladisch, 2019). As interpreted by Hammarfelt and de Rijcke (2015) and Wilsdon (2015), these disciplines have recently shifted towards IP in response to metrics-based evaluation and face tough competition with other fields. This is despite the inconsistency between this approach and the rationale and traditions of research in these disciplines, including publications in other outlets that may not be subject to metrics (e.g., books, exhibitions, and films), works in national languages, and works that are intended for a non-academic audience.

Among types, the highest values of \({\widehat{\beta }}_{3}\) were achieved by articles (13981) and reviews (935); the APRs have altered the document type patterns, as they stimulated researchers to publish more of these two types. This is due to the determination of points for the IJs in which both are published, allowing academics to know the score of the journals in which they have published or will publish. This contrasts with proceedings papers, which had less strict determinations than articles and reviews, likely prompting discussion among the members of the promotion committee about the allocation of scores for this type. The highest weight obtained by articles published in IJs, especially in the case of promotion, was emphasized by Chang and Su (2022), Deutz et al. (2021), Inglesi-Lotz and Pouris (2011), and Shu et al. (2020). For both SCUs and ERCs, the articles and reviews recorded high AGRs relative to other types. The tests revealed a positive difference achieved for SCUs by the value of \({\widehat{\beta }}_{3}\) for all types, and the articles outperformed all other types. In contrast, only articles witnessed a significant difference between the two periods in the ERCs.

The results related to the journal quartile suggested the influential role of APRs in motivating publication in the WoS journals, especially those under Q1 and Q2. Since introducing the 13th regulation, greater weights have been assigned to Q1 journals (10/10) and Q2 journals (9.5/10) when calculating the candidate’s score for promotion. The other appealing feature was exempting the papers in these journals from expert review and allotting them the highest possible scores in this assessment. After SE, the two quartiles recorded the highest AGRs for SCUs (Q1: 33% and Q2: 52%) and for ERCs (Q1: 45% and Q2: 46%) compared to the other categories. For SCUs, the \({\widehat{\beta }}_{3}\) values were significant across all quartiles, and the highest values were achieved for Q1 (5134) and Q2 (4089), while \({\widehat{\beta }}_{2}\) for just Q2 was insignificant for ERCs, generally confirming a positive response to incentives included in the regulations. This is consistent with relevant findings demonstrating the significance of WoS journals in evaluation systems, especially Q1 journals, which are widely cited and have a higher academic reputation (Korytkowski & Kulczycki, 2019a; Lovakov et al., 2021; Shu et al., 2020).

It is undeniable that there have been auxiliary factors supporting Egyptian researchers and generating a conducive environment to gain the advantages of the APRs. The Egyptian Knowledge Bank (EKB), the national digital library, plays an essential and exclusive role in supplying international digital sources, including WoS and Scopus; therefore, researchers can gather valuable references pertinent to their works and verify the classification of journals before publishing. Moreover, this organization offers free language editing services through accredited bodies, courses on the IP, and initiatives with prestigious publishers, such as Springer, for publishing research without open access fees, especially in Q1 and Q2 WoS journals. Another factor is that there has been a proliferation of IP centres at Egyptian universities. They strive to enhance all IP services to raise the universities' rankings. Finally, scientometric culture has become prevalent on a large scale in the Egyptian scientific research market, as the SR issued by the DLU has represented a mandatory requirement not only for promotion but also for applying for several state scientific awards and evaluating private technological and engineering institutes. This report may be required for future leadership positions in the HE&SR sector.

This scientometric work can be incorporated into the research stream examining the implications of evaluation systems designed to raise IP rates in several countries, such as South Africa (Inglesi-Lotz & Pouris, 2011), Russia (Lovakov et al., 2021; Matveeva & Ferligoj, 2020; Matveeva et al., 2021), Italy (Abramo & D'Angelo, 2021; Akbaritabar et al., 2021), and Ukraine (Abramo et al., 2023). Furthermore, this study supplements a shortage of studies that explore changes experienced by other institutions that are not subject to the evaluation systems studied. In addition to universities, there has been an effort to recognize differences in ERCs that are not governed by the SCU’s APRs. Utilizing the current method, more subsequent research could be performed to monitor the impact of national assessments on the IP, particularly in Arabic nations or the Middle East, generating fruitful comparisons. Nationally, the indications presented can support leaders in HE&SR in developing policies that encourage institutions to promote their IP records, ultimately enhancing Egypt's position in worldwide rankings. The study also draws attention to the significance of scientometrics in academic promotion procedures. Likewise, international organizations such as the World Bank can consult these findings as a case study to assess the influence of international review of national HE&SR strategies on shifting research practices in Egypt.

Limitations and future work

In this analysis, citations were not studied as a measure of publications’ impact. Rather, journal quartiles were examined because the APRs depend solely on them, regardless of the total citations harvested by the promotion nominees. However, after a longer period, citations can be utilized to monitor the actual effect of IPs after the SE. Additionally, there was a time restriction, as only a period of 5 years following the inception of the SE was available to study. A similar study could track the long-term effects of this evaluation after five subsequent years. Despite the necessity of identifying the influence of this evaluation on the growth of IPs compared to local publications, the absence of a national database that covers local publications and those in Arabic (the native language) prevented the accomplishment of this objective. A cross-sectional study by the author will manually analyse the papers submitted by Egyptian faculty members for promotion using more variables than those adopted in this study. Finally, this study supports the warning of Abramo et al. (2023) that new knowledge is not solely presented in publications and that bibliographic databases (such as WoS here) do not necessarily register all publications.

Conclusion

Going back to the starting point, the SE is not a fashion or a game. It has become an integral part of academia and its internationalization. The results indicate that this evaluation has affected not only the SCUs that are subject to it but also other research producers in Egypt. Although recent APRs impose no obligation to publish internationally and merely include a benchmark for assigning higher points to international channels, they have inspired researchers to publish more IPs, especially in the highest-ranked journals. This change would lead to scientific excellence and the enhancement of the international exposure of Egyptian universities. The effect will grow if SE is sustained in all scientific activities. In addition, there is a need to raise awareness among researchers about the metric culture, the advantages of international publishing and its prestigious channels, as well as its exploiters, which are represented by predatory and hijacked journals. Moreover, it is necessary to incentivize scholars of social sciences and humanities to raise their global visibility. However, they require special APRs that are suited to their traditions and include a less complex SE than the current unified evaluation for all sciences. Finally, Egyptian academics should be granted greater financial support to participate more in international publishing marathons.