Introduction

Science is essential to the socioeconomic development of a nation. Research activities enable the expansion of the frontiers of knowledge with great potential for innovation (Brooks, 1994), as innovation transforms knowledge into value and wealth. The importance of scientific knowledge and innovation has been recognised in neoclassical (Solow, 1956) and endogenous (Romer, 1986) growth models, and several studies suggest that scientific knowledge may have a positive effect on economic growth (e.g., Inglesi-Lotz & Pouris, 2013; Ntuli et al., 2015).

However, developing research activities requires an appropriate environment. By that, we mean that each country must have a stimulating environment within universities, with the institutionalisation of scientific professional structures and societies, with funding devoted to research activities (whether basic or applied), and with institutions that can manage and link these activities with the right spillovers. In addition, it is necessary to create and implement scientific policies that support and foster the development of research activities.

In Africa, scientific capabilities are crucial for socioeconomic development (AUC, 2014). Some studies have found a positive association between scientific knowledge and economic growth, underscoring the need for a knowledge base, which can only be achieved with well-trained human capital (e.g., studies by Inglesi-Lotz & Pouris, 2013; Onyancha, 2020).

However, the scarcity of resources and an inappropriate environment are severe obstacles to the development of African national science and technology (S&T) systems (AUC, 2014). The gross domestic expenditure on research and development (GERD) in Africa is low; in 2015, it averaged 0.3% of gross domestic product (GDP) (Mouton, 2018; Mouton et al., 2019). In addition, there is wide variation across countries. For example, South Africa's GERD was 0.83% of GDP in 2017, Egypt's was 0.72% in 2018, Mozambique's was 0.34% in 2018, and in other countries it was even lower (UNESCO, 2021). The data also show that some countries rely heavily on international funding for their R&D activities. Mozambique and Burkina Faso are examples, as 60% or more of their R&D funding comes from international sources (NPCA, 2014). In South Africa, by contrast, 12% of funding for R&D comes from international sources (NPCA, 2014). In this context, it is important to highlight that although South Africa and Egypt have the highest investment in R&D, they still lag far behind countries that are considered scientific powers, such as Germany, France, and the United Kingdom, whose GERD in 2018 was 3.09%, 2.20%, and 1.72% of GDP respectively (UNESCO, 2021).

In addition to financial resources invested in R&D, there are also science funding councils/agencies and science, technology, and innovation (STI) policies. The S&T systems of countries with an Anglophone tradition (e.g. South Africa, Kenya) include science funding councils, while countries with a Francophone tradition do not have such councils, although the importance of such an institution has been recognised (Mouton et al., 2015). Moreover, the establishment of science funding councils is still very recent in most countries in sub-Saharan Africa (Mouton et al., 2015).

Many African countries have also recently developed national STI policy plans (Mouton et al., 2015; NPCA, 2014), and some countries have consulted international organisations such as UNESCO (e.g. Botswana, Burundi, the Democratic Republic of Congo, Lesotho, Malawi, Namibia, Nigeria) to help formulate and revise STI policy plans.

The level of investment in R&D and the existence of science funding councils and national STI policy plans are all important in the development of national S&T systems in Africa and result in countries being at different stages of development in terms of their S&T systems. This has implications for science (regarding available research infrastructures, human resource development, etc.) as some countries will be able to produce more scientific knowledge than others. This is supported by several studies that have shown that only a few countries are responsible for a high percentage of the scientific knowledge produced in Africa.

Between 2000 and 2004, African countries produced 1.8% of the world's publications indexed on the Web of Science, with South Africa and Egypt accounting for the largest share of total African publications (30% and 20% respectively), followed by Morocco (8%) (Pouris & Pouris, 2009). Looking at a longer period (2000–2015), South Africa and Egypt contributed the most to the total number of African articles and reviews indexed in the Science Citation Index Expanded (97 and 79 thousand respectively; Sooryamoorthy, 2018). The same pattern was observed for the periods 2005–2010 and 2011–2015 (Mouton & Blanckenberg, 2018). The breakdown of Africa's scientific production by scientific field also shows that South Africa and Egypt are the African countries with the highest contribution in most scientific fields (Pouris & Pouris, 2009).

As resources in African countries for R&D are limited, it is imperative to look for other ways to enable African scientists to contribute to the advancement of knowledge. One such strategy could be the development of research activities in an international framework. Several studies looking at the impact of international research collaboration (IRC) have shown that its benefits lie in access to funding for research activities, equipment and the development of research, management and learning infrastructures (e.g., Efstathiou et al., 2014; Matenga et al., 2019; Zdravkovic et al., 2016). The involvement of African scientists in research collaboration networks offers the opportunity to engage in learning processes that enable them to acquire/improve skills and gain new knowledge (Bozeman & Corley, 2004). As scientists take their place in these networks and build trusting relationships, they are more likely to expand their professional ties (Newman, 2001), increasing opportunities for information exchange and exploring new scientific ideas. Finally, their involvement in networks can provide opportunities to attract international assets (financial and material), access to which is recognised as a key benefit of IRC (Maluleka et al., 2016; Muriithi et al., 2018; Owusu-Nimo & Boshoff, 2017). Thus, the continuous integration of African countries in these networks greatly benefits their S&T systems, despite the fragile position that African nations still have in these international networks (e.g., Vieira & Cerdeira, 2022).

Hence, scientific policies aimed at fostering and supporting IRC are deemed important. To maximise the benefits of such policies, it is important to be cognisant of the main barriers to IRC. In general, geographic, economic, political, cultural, intellectual and excellence distances between countries are barriers to IRC (e.g., Frame & Carpenter, 1979; Hoekman et al., 2010; Scherngell & Hu, 2011; Vieira et al., 2022). However, we argue that in the context of African countries, some factors operate differently than in other regions of the world. Indeed, some may even promote rather than hinder IRC. This may happen with economic distance, considering the low level of development of African S&T systems and the need to bring them on par with others in terms of socioeconomic development. As for excellence distance, it may not be an obstacle if we consider that an appropriate alignment of science and its funding according to the socioeconomic needs (Ciarli & Rafols, 2019; Sarewitz & Pielke, 2007) is more fruitful than scientific excellence per se, in countries with early–stage S&T systems.

This study contributes to the literature on IRC in several ways. First, we analyse the influence of geographic, ICTs, economic, governance, cultural, intellectual, excellence, and social distances on IRC for 54 African countries. Past research on the topic has focused on a single African country, discipline, or research programme, or on individual scientists (e.g., Asubiaro & Badmus, 2020; Holmarsdottir, 2013; Reddy et al., 2002; Sooryamoorthy, 2010). Second, we use a panel count data model with time fixed effects and clusters at the country pair level, which is more appropriate than cross–section techniques because it controls for omitted variables that may affect IRC. Previous studies focusing on the factors affecting the process of collaboration that involve African scientists have mainly obtained results from surveys, interviews, bibliometric studies, or focus–group discussions (e.g., Adams et al., 2014; Asubiaro & Badmus, 2020; Bleck et al., 2018; Holmarsdottir, 2013; Loukanova et al., 2014; Maluleka et al., 2016; Muriithi et al., 2018; Tierney et al., 2013), methodologies that are not the most appropriate to analyse large datasets. Finally, we discuss policy implications, as the results differ from those of previous studies, highlighting the need for policies tailored to the context of African countries.

The remainder of this paper is organised as follows. In the “Literature review and hypotheses” section, we review the studies that address the factors that influence research collaboration in Africa and present the framework that supports the hypotheses formulated. The “Methodology” section presents the methodology, including a description of the dataset, bibliometric analysis, variables, and model. The following section, “Results”, presents a brief bibliometric analysis, the descriptive statistics, and the impact of each distance on IRC. The “Discussion and Conclusion” section concludes the paper, presents the main findings of the study, and discusses the policy implications.

Literature review and hypotheses

The studies dealing with the factors affecting research collaboration that involves African scientists have pointed out several barriers. Examples include physical, ICTs and cultural distances, lack of time and research culture, the absence of a common research language, insufficient funding, poor management of funding, misunderstanding of arrangements and responsibilities, and heavy bureaucracy (Bleck et al., 2018; Holmarsdottir, 2013; Loukanova et al., 2014; Maluleka et al., 2016; Muriithi et al., 2018; Tierney et al., 2013). The findings addressing this subject come from studies in which scientists who participated in several research projects used their experience to describe the challenges faced throughout the process of research collaboration. We briefly present some of these studies.

The analysis of a survey administered to 248 academics at public universities in Kenya found that problems in collaboration were related to sociocultural factors, management and control, and availability of resources. For the sociocultural aspects, academics cited mistrust as a problem; for the management and control dimension, a lack of time for research; and for the availability of resources, insufficient funds and their management (Muriithi et al., 2018).

Social scientists who have participated in research collaborations in Africa pointed to several factors that can negatively impact the collaboration process and that should be considered and addressed early by scientists when deciding to participate in cross-regional collaborations (Bleck et al., 2018). The lack of time of Southern scientists due to multiple demands (teaching, policy work and grading) is considered a barrier. In addition, combining research collaboration with paid consultancy jobs hinders the process of collaboration. Southern scientists are often sought as experts for lucrative non-academic consultancy activities, which leaves them less time to carry out their research agenda. The lack of a common 'research language' and misunderstanding of research objectives when collaboration between scientists from different disciplines is required were also cited as challenges if not properly managed.

Responses from 51 academic faculty employed by Library and Information Science schools in South Africa to a questionnaire examining factors affecting research collaboration indicate that bureaucracy, lack of funding and time, and physical distance are factors that negatively affect research collaboration (Maluleka et al., 2016).

Analysis of a questionnaire answered by nine doctoral students, six doctoral supervisors, and country principal investigators, who participated in a European Union funded collaborative project involving a partnership between Europe and sub-Saharan Africa, revealed that the main problems in collaboration are the cultural clash, misunderstanding of arrangements and responsibilities, and insufficient funding (Loukanova et al., 2014).

In describing the Academic Model Providing Access to Healthcare, particularly the partnership between Kenyan and North American scientists, important challenges were raised: lack of time for research, physical distance, lack of research culture, and authorship issues (Tierney et al., 2013). The lack of time for research on the Kenyan side is attributed to social and family obligations and the large number of tasks Kenyan academics must manage. Culturally, they have large families for which they have major responsibilities, especially financial. Academics in these countries have low salaries, so they have to look for other sources of income to fulfill the responsibilities for their families. On the other hand, they have many commitments at universities (teaching, university examinations, governance, and clinical practices) that result in the absence of time for research activities. Physical distance was highlighted as a challenge because working groups, programs, and projects were organised as partnership activities involving Kenyans and North Americans. As a result, there are few opportunities for teleconferencing, and they have to take place at very different times of the day (early morning in North America and late afternoon in Kenya; not always the optimal times for participants to make decisions). ICTs were used to mitigate the effects of physical distance, but limited bandwidth and power outages were cited as barriers to appropriate use of these technologies. The lack of research culture on the part of Kenyan academics stems from the limited focus on research as part of the academic culture. As for the authorship and given that the collaborative research requires different amounts and types of efforts among members of the research team, the leaders felt the need to allocate responsibilities in advance, otherwise conflicts would have arisen.

Critical reflections on the challenges of research collaboration by scientists from the Adolescent Reproductive Health Network, which involved several European and African universities, have shown that research collaborations that involve Southern institutions in all aspects of the research process are more likely to lead to sustainable research partnerships (Holmarsdottir, 2013). They also point out that socio-cultural differences can have a negative impact on the collaboration process and therefore suggest that scientists from the North should spend time in the field to familiarise themselves with the context in the South. Finally, they emphasise that differences in race, gender and language have a negative impact on the collaboration process.

These findings are highly relevant in the context of research collaboration, but these studies have not addressed the challenges posed by other factors (economic, governance, intellectual and excellence distance) that have been shown to be barriers in studies addressing research collaboration in other regions (e.g., Vieira et al., 2022).

In explaining research collaboration, several models and frameworks have been developed (e.g., Amabile et al., 2001; Kraut et al., 1987; Sonnenwald, 2007). Therefore, the barriers identified for exploration in this study were drawn from a synthesis of studies, models, and frameworks in the extant literature.

Physical distance

Research collaboration as a social process (Kraut et al., 1988) implies physical proximity, which permits more frequent, effective and unplanned face–to–face communication (Sommer, 1959). Face–to–face communication is desired for several reasons.

First, if face–to–face communication occurs frequently, it will lower the time needed to complete the research activities. Collaborators working over a long distance need to meet at several stages of the research project, and long travel distances will increase the duration of the project, as well as its financial costs.

Moreover, there are disciplines that, given the nature of their research, require face–to–face communication; mathematics is one example. As stated by a mathematician interviewed in Walsh and Bayma’s work, “We write very proper, formal, very abstract. We think informally, intuitively. None of that is in the publication” (Walsh & Bayma, 1996).

Additionally, improving individual expertise is among the many scientific motivations behind research collaboration (Katz & Hicks, 1997). In the process of sharing and learning new knowledge, tacit knowledge deserves special attention due to the impossibility of codifying it (Gertler, 2003). Thus, face–to–face communication acts as a tool that facilitates the sharing of this knowledge (Storper & Venables, 2004).

Also, this type of communication encompasses visual and corporal cues that are as important as the words in allowing a comprehensive understanding of the information being shared (Storper & Venables, 2004).

Finally, unplanned face–to–face communication is relevant in starting new collaborations (Katz & Martin, 1997; Laudel, 2001). Thus, two collaborating scientists could frequently engage in informal communication, which might end up with new collaborative projects.

In Africa, physical distance is particularly critical. Academics have little time for research activities due to their heavy teaching loads (Tijssen & Kraemer-Mbula, 2018; Zdravkovic et al., 2016) and the need to fall back on other professional commitments to supplement their income (Sawyerr, 2004; Tierney et al., 2013). Therefore, long distance travel is not desirable. Even in the case of short physical distance, the low development of physical communications in Africa might result in a long travel time.

The literature on physical distance revealed, in general, its negative impact on research collaboration (e.g., Balland, 2012; Hoekman et al., 2010; Katz, 1994; Pan et al., 2012; Pond et al., 2007; Torre, 2008).

Given the role of physical distance, we hypothesise that:

H1

The greater the distance between countries, the lower the IRC.

H2

The non–contiguity of countries negatively affects IRC.

ICTs distance

The emergence of ICTs and their continuous development has been relevant in advancing knowledge frontiers (Heimeriks & Vasileiadou, 2008) and in overcoming some of the barriers imposed by physical distance to research collaboration (Walsh, 1996).

ICTs provide communication channels (e-mail, audio conferences), community data systems (e.g., Protein Databank), and access to remote scientific instruments (e.g., telescopes) that are independent of the collaborators’ geographic location (Bos et al., 2007). Given these advantages, the use of ICTs has led to an increase in the size of research teams and in remote collaborations (Ding et al., 2010; Walsh, 1996).

At the country level, there are marked differences regarding the ICT infrastructures, which determine their adoption and use by the scientific community (Ayanso et al., 2014). In Africa, the ICT infrastructures are limited, which is why their use in collaborative research has increased but is still low (Muriithi et al., 2016).

Thus, our formulated hypothesis is as follows:

H3

The ICTs distance negatively affects IRC.

Economic distance

When scientists cannot solve a scientific problem with the resources (intellectual, financial and scientific infrastructures) available in their countries, they show openness to research collaboration with foreign scientists (Luukkonen et al., 1992; Zdravkovic et al., 2016). Usually, the availability of these resources is related to the economic development of each country (Sokolov-Mladenovic et al., 2016; Wagner et al., 2001). Therefore, we expect joint research between countries with different levels of development (e.g., Maleka et al., 2019; Shehatta & Mahmood, 2017). However, the greater the economic disparity, the greater the challenges associated with research collaboration among countries, which could limit collaborative activities.

In Africa, most of the national S&T systems are at an early stage of development (UNESCO, 2015) and the low availability of resources does not allow for rapid scaling up of these systems. All these constraints, as well as the potential of scientific knowledge that can be generated in this region for its socioeconomic development, have already been recognised by international intergovernmental organisations (e.g., United Nations), political and economic organisations (e.g., European Union), funding agencies of several countries and donors. Over time, these actors have been responsible for funding and designing research programmes aimed at developing collaborative research activities between African and non-African scientists (Skupien & Ruffin, 2020). Among many other goals, these programmes aim to make African countries more scientifically capable and bring them on par with others in terms of socioeconomic development.

Therefore, we assume that:

H4

Economic distance fosters IRC.

Governance distance

Governance is the process by which governments are selected, monitored, and replaced; the ability of a government to effectively formulate and implement sound policies; and the respect of citizens and the state for the institutions that govern economic and social interactions (Kaufmann et al., 2010).

Governance is important from several perspectives. Firstly, political instability, corruption and violence prevent an attractive research environment (Allard et al., 2012). Also, the degree of freedom influences the ability to engage and conduct activities (Schiermeier, 2021; Skupien & Ruffin, 2020). Further, the quality and complexity of policy formulation and implementation regarding intellectual property and legal infrastructures may pose additional challenges to the collaborative process (Forero-Pineda, 2006). Therefore, rules and laws as formal institutions influence the behaviour of actors and organisations (Boschma, 2005). To the extent that actors and organisations (including universities and other research entities) share similar formal institutions, i.e. build trust on the basis of common institutions, this proximity leads to increased IRC (Boschma, 2005).

Many African countries have undertaken reforms to improve their overall governance. Nevertheless, there are countries where these reforms are still in their infancy and have a long way to go to achieve good performance, and others are experiencing a decline in their governance performance after periods of improvement (Mbaku, 2020; MIF, 2020).

We conjecture that:

H5

Governance distance negatively affects IRC.

Cultural distance

Cultural distance is the dissimilarity in values, beliefs, attitudes and language among individuals (UNESCO, 2001).

The greater propensity to interact with others of similar values, beliefs and attitudes is well discussed in the literature (Huston & Levinger, 1978; McPherson et al., 2001). This major tendency is discussed under the term homophily principle (Lazarsfeld & Merton, 1954; McPherson et al., 2001). The principle states that interaction between similar people is higher than between dissimilar people. By similar people, we mean those whose similarity is based on informal, formal or ascribed status (status homophily) or on values, attitudes and beliefs (value homophily) (McPherson et al., 2001). This propensity to interact (value homophily) has been shown to be important for knowledge sharing and learning and for fostering an environment of shared habits and respected norms of behaviour (Lucas, 2006; Makela et al., 2007).

As for cultural similarity (value homophily), imperial history has been identified as an important driver (Bonikowski, 2010). When individuals interact, cultural traits are transferred from one individual to another, resulting in an environment of similar values, beliefs, and behaviours (Axelrod, 1997). Several African countries were colonised, and this historical past likely contributed to the absorption of other cultures.

Language is essential to the process of knowledge sharing (Welch & Welch, 2008). It is linked to culture as metaphors, accents and dialects are embedded in it revealing a person's cultural background (Goddard & Wierzbicka, 2001). Therefore, a common language is expected to facilitate knowledge sharing (Ambos & Ambos, 2009; Makela et al., 2007).

The literature on the effects of cultural distance on research collaboration emphasises its negative impact (Gui et al., 2019; Hoekman et al., 2010; Luukkonen et al., 1992; Plotnikova & Rake, 2014). In the case of African countries, shared culture and language were suggested as possible reasons for the collaboration patterns observed in the networks representing African international collaborations (Adams et al., 2014).

From the previous points, we anticipate that:

H6

The absence of a colonial tie negatively affects IRC.

H7

Not sharing a common language negatively affects IRC.

H8

Not sharing a common coloniser negatively affects IRC.

Intellectual distance

Intellectual distance is the gap among knowledge bases of different countries. When seeking knowledge through research collaboration, partners must have similar knowledge bases (Cohen & Levinthal, 1990). Cohen and Levinthal defined a firm's absorptive capacity as “…the ability of a firm to recognize the value of new, external information, assimilate it, and apply it to commercial ends…” (Cohen & Levinthal, 1990). For these authors, the absorptive capacity of the firm depends on the absorptive capacity of the individual members of the firm. In turn, an individual's ability to evaluate and use external knowledge depends on the level of prior knowledge associated with it. Using this concept of absorptive capacity, we can conclude that the exchange and learning process in a research collaboration will only be successful if each scientist has the appropriate knowledge that enables him/her to learn and interpret the new knowledge. In other words, the scientists' ability to absorb, interpret and exploit new knowledge is closely related to their intellectual background. However, some intellectual distance is essential as it enables the combination of complementary knowledge that expands knowledge frontiers (Boschma, 2005; Gilsing et al., 2008).

In Africa, the production of scientific knowledge is low, considering its contribution to global scientific knowledge (Adams et al., 2014). Moreover, a high percentage of this knowledge comes from a small number of African countries, and the distribution of knowledge across scientific fields is disproportionate (e.g., Pouris & Ho, 2014; Vieira & Cerdeira, 2022). These patterns suggest that the knowledge base may be fragile in several scientific fields and countries.

Empirically, the studies highlighted the negative effect of intellectual distance on collaboration propensity (e.g., Acosta et al., 2011; Capello & Caragliu, 2018; Fernandez et al., 2016).

We foresee that:

H9

Intellectual distance negatively affects IRC.

Excellence distance

A country's competitiveness is tied to a rich scientific knowledge and innovation system. At the heart of the competitiveness issue is the decision by policy actors to create an environment that supports excellent research and to provide the means to foster linkages between users and producers of knowledge. Efforts have been made to direct funding towards research excellence (e.g., the Research Excellence Framework). Other efforts relate to the use of instruments (e.g., formal bilateral agreements, internationally focused training programmes) to encourage IRC with scientific powers (Boekholt et al., 2009).

In Africa, the pursuit of excellence in research has been recognised by several scientific actors and excellence initiatives have been developed (Tijssen & Kraemer-Mbula, 2018). One example is the Science and Technology Consolidated Plan of Action 2005–2014, which assumes that science and technology must be produced and used to solve specific African problems. This plan of action emphasises excellence in research and several centres of excellence have been established in different African regions (UNESCO, 2015). However, African scientists mention that the unavailability of time for research, limited access to equipment and funding are serious obstacles to the development of research excellence (Tijssen & Kraemer-Mbula, 2018). Therefore, in the African context, it is necessary to recognise and promote research that is valuable across local, regional, national, and global scales. While research must be well done and adhere to standards, excellence may not be the primary goal. We assume that these particularities are considered when scientists define IRC. We, therefore, assume that when IRCs are established between African and non-African scientists, the excellence of the African science system is not a determining factor in whether or not the collaboration will continue.

Thus, we anticipate that:

H10

Excellence distance is not a barrier in IRC.

Social distance

The literature on embeddedness assumes that economic relationships are to some extent embedded in a social context (Boschma, 2005; Granovetter, 1985). In Granovetter's view “…the behavior and institutions to be analyzed are so constrained by ongoing social relations that to construe them as independent is a grievous misunderstanding.” (Granovetter, 1985). Taking into account the literature on embeddedness, we consider that in IRC, socially embedded relationships between individuals are extremely important for the success of IRC. We refer to the absence of socially embedded relationships between individuals as social distance. Socially embedded relations involve trust derived from friendship or shared experience; thus, social distance prevents trust–based interactions, which are important in fostering knowledge exchange and sharing ideas (Dhanaraj et al., 2004; Sherwood & Covin, 2008).

Many research collaborations are established because a friendly relationship exists or because collaborators have developed joint research activities in the past (Owusu-Nimo & Boshoff, 2017). As the interaction progresses, individuals recognise the capabilities and interests of their collaborators, and a trust-based relationship emerges. This can lead to new research collaborations between the same collaborators. Furthermore, trust can spread through a collaborator's network, increasing the opportunities for cooperation with a collaborator's collaborators (Newman, 2001).

The literature on social distance highlights the importance of personal relationships and previous collaboration in IRC (e.g., Eduan & Jiang, 2019; Fernandez et al., 2016; Owusu-Nimo & Boshoff, 2017; Plotnikova & Rake, 2014).

We envision that:

H11

The absence of previous collaborations negatively affects IRC.

H12

The inexistence of common collaborators negatively affects IRC.
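To make the two notions behind H11 and H12 concrete, the following is a minimal sketch of how a previous collaboration and a set of common collaborators could be read off a list of past country-level collaboration ties. The function names, the country codes, and the toy ties are hypothetical and serve only as illustration; they are not the variable definitions used in this study.

```python
def common_collaborators(ties, a, b):
    """Return the countries that have collaborated with both a and b.

    `ties` is a set of (country, country) tuples describing past
    collaborations; direction is ignored.
    """
    neighbours = {}
    for u, v in ties:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    shared = neighbours.get(a, set()) & neighbours.get(b, set())
    return shared - {a, b}


def collaborated_before(ties, a, b):
    """True if a and b appear together in any past tie."""
    return (a, b) in ties or (b, a) in ties


# Hypothetical toy ties, not data from the study.
past_ties = {("KE", "ZA"), ("GB", "KE"), ("GB", "ZA"), ("EG", "GB")}

print(collaborated_before(past_ties, "EG", "KE"))   # no direct tie yet
print(common_collaborators(past_ties, "EG", "KE"))  # shared partner(s)
```

In this toy example, Egypt and Kenya have no previous collaboration but share one collaborator, which is the kind of indirect link through which trust may spread.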

Methodology

Data

In studying the influence of the distances on IRC, we considered the IRC between different African countries and between African and non–African countries. The nations involved are members of the United Nations (193 countries).

The number of co-publications (pub_col) between two countries is our measure of IRC. The data were retrieved from InCites, which includes content indexed in the Web of Science (WoS), namely the Science Citation Index Expanded, Social Sciences Citation Index, Arts & Humanities Citation Index, Conference Proceedings Citation Index, Book Citation Index and Emerging Sources Citation Index, for the period between 2000 and 2017. We have 8937 unique country pairs and a total of 149,426 observations. We used panel data as this allows us to cluster at the country-pair level and to include time fixed effects, which control for some unobserved variables, such as the implementation of national policies regarding IRC and external funding received through research partnerships.
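As an illustration of the structure of such a dataset, the sketch below counts co-publications per unordered country pair and year from a list of publication records. It assumes full counting (a publication with three countries adds one to each of its three pairs) and uses hypothetical toy records and a hypothetical function name; it is not the InCites extraction procedure, and it does not restrict pairs to those involving an African country.

```python
from collections import Counter
from itertools import combinations


def pair_year_counts(records):
    """Count co-publications per unordered country pair and year.

    `records` is an iterable of (year, countries) tuples, one per
    publication, where `countries` lists the authors' countries.
    """
    counts = Counter()
    for year, countries in records:
        # Deduplicate and sort so ("ZA", "KE") and ("KE", "ZA")
        # map to the same pair key.
        for a, b in combinations(sorted(set(countries)), 2):
            counts[(a, b, year)] += 1
    return counts


# Hypothetical toy records, not data from the study.
records = [
    (2000, ["ZA", "KE"]),
    (2000, ["KE", "ZA", "GB"]),
    (2001, ["EG", "ZA"]),
    (2001, ["ZA"]),  # single-country paper: contributes no pair
]

pub_col = pair_year_counts(records)
for (a, b, year), n in sorted(pub_col.items()):
    print(a, b, year, n)
```

Each key of `pub_col` corresponds to one pair-year observation, which is the unit of analysis in a panel of this kind.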

As for the data, we recognise two limitations: the difficulty of arriving at a universal concept of research collaboration (Katz & Martin, 1997) and the selective indexing procedures of the WoS (Clarivate, 2020).

Given the complexity of the collaboration process, it is challenging to define research collaboration and to devise an index that can adequately measure it. We have adopted the co-authorship approach, but this provides only a partial perspective on collaboration, as not all outputs of the collaboration process are tangible (Katz & Martin, 1997). Moreover, studies have reported that scientists from developing countries contributed to the advancement of knowledge in collaborative projects, but their contributions went unacknowledged in the resulting publications because they were not listed as authors (Dahdouh-Guebas et al., 2003; Elobu et al., 2014). However, the advantages of this approach (namely, invariant and verifiable characteristics, data availability, and the ability to work with large datasets; see Katz & Martin, 1997 for more advantages) have contributed to its use as a proxy for IRC (Newman, 2001, 2004).

Regarding the selection policies, WoS only considers sources that meet a set of criteria and therefore relies more on selectivity than on comprehensiveness. This constitutes a shortcoming as several African journals are not indexed (Owusu-Nimo & Boshoff, 2017) and African scientists often rely on grey literature to publish their research findings (Marfo et al., 2011).

Bibliometric analysis

Using a bibliometric approach, we briefly present an overview of the number of documents published by African scientists between 2000 and 2017, as well as the number of documents with at least one foreign scientist (this can be a scientist from another African country or a non-African country). Since scientists from a few countries make the largest contribution to Africa's scientific output, we present and discuss statistics for the 10 African countries with the largest contribution to the total number of African documents. We have also identified the five most important foreign partners of each of these countries. With this analysis, we aim to provide readers with a broader understanding of IRC on the continent.

Variables and models

Dependent variable

The dependent variable is the number of co-publications between two countries in each year. For each co-publication, we identified the participating countries through the authors' affiliations. We used full counting (i.e. if three countries are mentioned, we considered one co-publication for each country) and disregarded the number of addresses in which a country appears.
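The counting rule can be illustrated with a minimal Python sketch. It assumes that each paper is available as a list of affiliation countries and that full counting is applied at the level of unordered country pairs; the function name `count_copublications` and the sample records are hypothetical and are not taken from the InCites data.

```python
from collections import Counter
from itertools import combinations


def count_copublications(papers):
    """Count co-publications per unordered country pair using full counting.

    `papers` is an iterable of affiliation-country lists, one list per paper.
    A country listed at several addresses of the same paper is counted once.
    """
    pair_counts = Counter()
    for countries in papers:
        unique = sorted(set(countries))  # disregard repeated addresses
        for pair in combinations(unique, 2):
            pair_counts[pair] += 1  # every pair receives one full co-publication
    return pair_counts


# Hypothetical records (made-up affiliations, ISO-like country codes).
papers = [
    ["MZ", "ZA", "PT", "ZA"],  # ZA appears at two addresses
    ["MZ", "ZA"],
    ["KE"],                    # single-country paper: contributes no pair
]
counts = count_copublications(papers)
```

Under these assumptions the pair (MZ, ZA) receives two co-publications, while (MZ, PT) and (PT, ZA) receive one each.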

Independent variables

To collect data on the independent variables, we used information available from the Centre d'Études Prospectives et d'Informations Internationales (CEPII), the International Telecommunication Union, the United Nations, the Worldwide Governance Indicators project and InCites.

For the ICTs, economic, governance, intellectual and excellence distances, we computed the Euclidean distance between the two countries over the variables chosen as a proxy for each distance.
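As a minimal sketch of this computation, the following Python function returns the Euclidean distance between two countries described by the same list of indicators. The function name and the sample values are hypothetical; with a single indicator the distance reduces to the absolute difference.

```python
import math


def euclidean_distance(values_i, values_j):
    """Euclidean distance between two equally long indicator vectors."""
    if len(values_i) != len(values_j):
        raise ValueError("both countries need the same number of indicators")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(values_i, values_j)))


# Hypothetical percentile ranks on six governance dimensions for two countries.
governance_i = [55.0, 40.0, 62.0, 58.0, 50.0, 47.0]
governance_j = [90.0, 75.0, 93.0, 95.0, 92.0, 94.0]
governance_distance = euclidean_distance(governance_i, governance_j)

# Single-indicator case (e.g. share of individuals with internet access).
icts_distance = euclidean_distance([70.0], [25.0])
```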

Regarding internal scientific determinants, we should consider the resources devoted to R&D activities (financial, infrastructures and human capital). However, this information is not available for most of the countries. Thus, we used the number of publications of each country (\({Pub}_{\mathrm{it}}\) and \({Pub}_{\mathrm{jt}}\)) as a proxy, an approach widely applied in studies addressing research collaboration (e.g., Hoekman et al., 2010; Plotnikova & Rake, 2014).

As for geographical distance, we considered the distcap and contig variables from CEPII. Variable distcap—determined following the great circle formula, which uses geographic coordinates of the capital cities—was renamed as capitals. The dummy variable contig, renamed as contiguous, indicates whether the two countries share a common border (1) or not (0).
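For illustration only, a great-circle distance between two capitals can be sketched with the haversine form of the formula. This is an assumption about one common way to compute it, not a reproduction of CEPII's procedure; the Earth radius constant and the approximate coordinates below are likewise assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates of Maputo and Pretoria (illustrative values only).
capitals = great_circle_km(-25.97, 32.57, -25.75, 28.19)
```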

In determining the ICTs distance (ICTs), we considered the percentage of individuals with access to the internet in each country. This is only a partial measure of the ICTs available in a country, but information on ICTs infrastructures and other related variables is not available on a global scale.

In measuring economic distance (economic), we used an index that takes into consideration the gross national income per capita, in addition to life expectancy and education: the Human Development Index.

Regarding the governance distance (governance), we used the six dimensions of governance from the Worldwide Governance Indicators (Voice and Accountability, Political Stability and Absence of Violence/Terrorism, Government Effectiveness, Regulatory Quality, Rule of Law, and Control of Corruption) (Kaufmann et al., 2010). For each dimension and country, we collected the data on the percentile rank.

Concerning cultural distance, we used variables from CEPII: common language (language), colonial link (colony), and common coloniser after 1945 (common). Language, colony and common assume a value of 1 if the two countries share a common language, had a colonial tie, and share a common coloniser, respectively, and zero otherwise.

As regards intellectual distance (intellectual), we determined a specialisation index looking at the distribution of publications by scientific domain for each country, which was computed similarly to the comparative advantage index (Balassa, 1965). The reference is the world’s publications, and values higher than 1 (lower than 1) reveal that the country is specialised (under–specialised) in the given scientific domain. The indicator was determined as follows:

$$Specialisation_{\mathrm{cft}} = \frac{\frac{Pub_{\mathrm{cft}}}{\sum\limits_{\mathrm{f}' \in \mathrm{F}} Pub_{\mathrm{cf}'\mathrm{t}}}}{\frac{\sum\limits_{\mathrm{c}' \in \mathrm{C}} Pub_{\mathrm{c}'\mathrm{ft}}}{\sum\limits_{\mathrm{c}' \in \mathrm{C},\, \mathrm{f}' \in \mathrm{F}} Pub_{\mathrm{c}'\mathrm{f}'\mathrm{t}}}}$$
(1)

where \({Pub}_{\mathrm{cft}}\) is the number of publications of country c in scientific domain f in year t, and C and F are the sets of countries and scientific domains, respectively. As scientific domains, we considered the first hierarchical level of the Fields of Science (FoS) schema.
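The index in Eq. (1) can be sketched in Python for a single year as follows. The dictionary layout, the function name and the publication counts are hypothetical and serve only to make the ratio of shares concrete.

```python
def specialisation_index(pubs, country, field):
    """Balassa-type specialisation index of `country` in `field` for one year.

    `pubs` maps (country, field) -> number of publications in that year.
    """
    country_field = pubs.get((country, field), 0)
    country_total = sum(n for (c, _), n in pubs.items() if c == country)
    field_total = sum(n for (_, f), n in pubs.items() if f == field)
    world_total = sum(pubs.values())
    if country_total == 0 or field_total == 0:
        raise ValueError("index undefined without publications")
    country_share = country_field / country_total  # numerator of Eq. (1)
    world_share = field_total / world_total        # denominator of Eq. (1)
    return country_share / world_share


# Hypothetical two-country, two-domain world.
pubs = {
    ("A", "Natural"): 30, ("A", "Medical"): 10,
    ("B", "Natural"): 20, ("B", "Medical"): 40,
}
a_natural = specialisation_index(pubs, "A", "Natural")  # above 1: specialised
a_medical = specialisation_index(pubs, "A", "Medical")  # below 1: under-specialised
```

In this toy example country A devotes 75% of its output to the natural sciences against a world share of 50%, so its index exceeds 1 in that domain.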

In addressing excellence distance (excellence), we used a common indicator: the top 10% most cited documents in the world (Hollanders et al., 2019; OECD, 2015). The top-percentile approach has become a widely accepted method for identifying characteristics of research excellence in international science. We used the % Documents in Top 10% available in InCites. A similar indicator, the 1% of the world's most cited documents, was used in a previous study to examine the research excellence of African universities (Tijssen & Kraemer-Mbula, 2018). Here, we determined for each country the percentage of its total scientific production that is in the 10% of the world's most cited documents.

In measuring social distance, we looked at past collaborations and a local similarity index, the Jaccard index (Jaccard, 1901; Lu & Zhou, 2011).

Past collaboration between countries i and j, which we label Past, is equal to 1 at time t if the two countries collaborated in t − 1, and zero otherwise.
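A minimal sketch of this indicator, assuming the co-publication counts are stored in a dictionary keyed by unordered country pair and year (a hypothetical layout), could read:

```python
def past_collaboration(pub_col, i, j, t):
    """Return 1 if countries i and j co-published in year t - 1, else 0.

    `pub_col` maps (frozenset({i, j}), year) -> number of co-publications.
    """
    previous = pub_col.get((frozenset((i, j)), t - 1), 0)
    return 1 if previous > 0 else 0


# Hypothetical counts for one country pair.
pub_col = {
    (frozenset(("MZ", "ZA")), 2004): 3,
    (frozenset(("MZ", "ZA")), 2005): 0,
}
past_2005 = past_collaboration(pub_col, "MZ", "ZA", 2005)  # looks at 2004
past_2006 = past_collaboration(pub_col, "ZA", "MZ", 2006)  # looks at 2005
```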

As for shared collaborators, consider an undirected and weighted network G (V, E), where V is the set of nodes (countries) and E is the set of links (\({Pub\_col}_{\mathrm{i},\mathrm{ j},\mathrm{ t}}\)), where self–connections are not allowed. For countries i and j, let Γ(i) and Γ(j) denote the set of their collaborators (neighbours) at time t. Then, the dissimilarity regarding the shared collaborators is calculated using the following expression:

$$Collaborators_{\mathrm{ijt}} = 1 - \frac{\left| \Gamma \left( i \right) \cap \Gamma \left( j \right) \right|}{\left| \Gamma \left( i \right) \cup \Gamma \left( j \right) \right|}$$
(2)
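Equation (2) can be sketched in Python with sets of collaborators. The function name and the neighbour sets are hypothetical, and the treatment of two countries without any collaborators (returning the maximal distance of 1) is an assumption, since the expression is undefined in that case.

```python
def collaborator_dissimilarity(neighbours_i, neighbours_j):
    """One minus the Jaccard index of two sets of collaborating countries."""
    union = neighbours_i | neighbours_j
    if not union:
        return 1.0  # assumption: no collaborators at all means maximal distance
    return 1.0 - len(neighbours_i & neighbours_j) / len(union)


# Hypothetical collaborator sets of countries i and j in a given year.
gamma_i = {"ZA", "PT", "US"}
gamma_j = {"ZA", "US", "FR", "GB"}
dissimilarity = collaborator_dissimilarity(gamma_i, gamma_j)
```

Here the two countries share two of five distinct collaborators, so the dissimilarity is 1 − 2/5.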

Model

We use the gravity model to study the influence of several distances on IRC. The gravity model has been applied in several settings involving interactions among countries, including research collaboration (Hoekman et al., 2009, 2010; Plotnikova & Rake, 2014). The rationale for this model is related to Newton’s law of universal gravitation, which affirms that the gravitational force between two objects depends on the masses of the two objects and the distance between them. In our context, the model is defined as follows:

$$Y_{\mathrm{ijt}} = \beta_{0} \times P_{\mathrm{it}}^{\beta_{1}} \times P_{\mathrm{jt}}^{\beta_{2}} \times D_{\mathrm{ijt}}^{\beta_{3}} \times \varepsilon_{\mathrm{ijt}}, \quad \mathrm{i}, \mathrm{j} = 1, \ldots, n; \quad \mathrm{t} = 1, \ldots, T$$
(3)

Equation 3 states that the interaction between two countries (\({Y}_{\mathrm{ijt}}\)) is directly proportional to internal scientific determinants (the number of publications in each country, \({P}_{\mathrm{it}}\) and \({P}_{\mathrm{jt}}\)) and inversely proportional to the distances between the two countries (\({D}_{\mathrm{ijt}}\)); \({\varepsilon }_{\mathrm{ijt}}\) represents the error term.

Applying logarithms on both sides of Eq. (3), the parameters (\({\beta }_{0}, {\beta }_{1},{\beta }_{2}, {\beta }_{3}\)) may be estimated by ordinary least squares (OLS). However, OLS entails several limitations (Silva & Tenreyro, 2006). First, because the dependent variable enters in logarithmic form, observations with zero co-publications cannot be accommodated. Second, while our dependent variable takes non-negative integer values, OLS treats it as a continuous, unbounded variable, making this technique inappropriate for count data. Third, OLS estimates of the log-linearised model may be inconsistent in the presence of heteroscedasticity (Silva & Tenreyro, 2006).
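For reference, taking logarithms of Eq. (3) term by term gives the log-linear form below. It is a restatement of Eq. (3) rather than an additional result, and it makes the problem with zeros visible: the left-hand side is undefined whenever \({Y}_{\mathrm{ijt}} = 0\).

$$\ln Y_{\mathrm{ijt}} = \ln \beta_{0} + \beta_{1} \ln P_{\mathrm{it}} + \beta_{2} \ln P_{\mathrm{jt}} + \beta_{3} \ln D_{\mathrm{ijt}} + \ln \varepsilon_{\mathrm{ijt}}$$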

Thus, to estimate our model, we resorted to count data estimation techniques and employed a Poisson regression model with fixed effects (Silva & Tenreyro, 2006). We assume that the pub_col between two countries follows a Poisson distribution:

$$\Pr\left[ Y_{\mathrm{ijt}} \right] = \frac{e^{-\mu_{\mathrm{ijt}}} \times \mu_{\mathrm{ijt}}^{Y_{\mathrm{ijt}}}}{Y_{\mathrm{ijt}}!}, \quad Y_{\mathrm{ijt}} = 0, 1, \ldots$$
(4)

The conditional mean \({\mu }_{ijt}\) is given by:

$$\mu_{\mathrm{ijt}} = e^{\beta_{0} + \beta_{1} \ln P_{\mathrm{it}} + \beta_{2} \ln P_{\mathrm{jt}} + \beta_{3} \ln D_{\mathrm{ijt}}}$$
(5)
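Equations (4) and (5) can be made concrete with a short Python sketch. The coefficient values and inputs below are purely illustrative assumptions, not estimates from this study.

```python
import math


def conditional_mean(beta0, beta1, beta2, beta3, p_i, p_j, d_ij):
    """Conditional mean of Eq. (5) for one country pair and year."""
    return math.exp(beta0
                    + beta1 * math.log(p_i)
                    + beta2 * math.log(p_j)
                    + beta3 * math.log(d_ij))


def poisson_pmf(y, mu):
    """Poisson probability of observing y co-publications, as in Eq. (4)."""
    return math.exp(-mu) * mu ** y / math.factorial(y)


# Illustrative (made-up) coefficients and inputs.
mu = conditional_mean(0.0, 1.0, 1.0, -1.0, p_i=10.0, p_j=20.0, d_ij=4.0)
prob_zero = poisson_pmf(0, mu)
```

With unit elasticities on the two publication counts and an elasticity of −1 on distance, the conditional mean reduces to 10 × 20 / 4.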

This model may be inappropriate if there is conditional overdispersion (conditional variance higher than the conditional mean). However, overdispersion is only a critical issue when the goal is to determine the probability of a count event. Our goal is to determine the effects of the variables on the conditional mean, so overdispersion is irrelevant. Moreover, the Poisson estimator is very robust to any distributional misspecification, allows for any type of variance-mean relationship and serial correlation (one only needs to cluster the standard errors) (Cameron & Trivedi, 2005; Wooldridge, 2010).

Our dataset is characterised by an excess of zeros (72% of the total observations have pub_col = 0). This would not be a problem if all zeros were generated by the same process. In our case, however, zeros may result from more than one process. For example, research activities in a country could be very rare due to resource constraints (human capital, scientific infrastructures and financial), leading to outputs that might not include scientific publications indexed in the WoS. For pairs involving such a country, the number of co-publications is a certain zero.

Therefore, we used the zero-inflated version of the Poisson regression model. A logit model is estimated for the certain zeros, which predicts whether a given pair belongs to this class; a Poisson model then predicts the counts for those pairs that are not certain zeros. Finally, the two models are combined (Cameron & Trivedi, 2005).
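In the standard zero-inflated Poisson formulation, which we assume here for illustration, a zero occurs with probability π + (1 − π)e^(−μ) and a positive count y with probability (1 − π) times the Poisson probability of y, where π is the probability of a certain zero given by the logit part. A minimal Python sketch, with hypothetical names and values:

```python
import math


def logistic(z):
    """Logit link: probability of a certain zero for linear predictor z."""
    return 1.0 / (1.0 + math.exp(-z))


def zip_pmf(y, mu, pi):
    """Zero-inflated Poisson probability of y given mean mu and zero share pi."""
    poisson = math.exp(-mu) * mu ** y / math.factorial(y)
    if y == 0:
        return pi + (1.0 - pi) * poisson  # certain zeros plus Poisson zeros
    return (1.0 - pi) * poisson


# Illustrative values: half of the pairs are certain zeros.
pi = logistic(0.0)
prob_zero = zip_pmf(0, mu=2.0, pi=pi)
```

Setting π to zero recovers the ordinary Poisson model, which is why the two specifications can be compared directly.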

We also included time–period (year) dummies and cluster–robust standard errors at the country pair level.

We present both regression models in the Results section, together with the Vuong test to ascertain whether the zero-inflated version is more appropriate than the Poisson regression.
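As a rough sketch of the idea behind the Vuong test, the statistic can be computed from the per-observation log-likelihood contributions of the two fitted models. The version below is the uncorrected form with the variance computed over n observations; this is an assumption, since statistical packages may apply further corrections, and the input values are made up.

```python
import math


def vuong_statistic(loglik_zip, loglik_poisson):
    """Uncorrected Vuong statistic comparing two non-nested models.

    Inputs are per-observation log-likelihood contributions of each model.
    Large positive values favour the first (zero-inflated) model.
    """
    if len(loglik_zip) != len(loglik_poisson) or len(loglik_zip) < 2:
        raise ValueError("need two equally long sequences with n >= 2")
    m = [a - b for a, b in zip(loglik_zip, loglik_poisson)]
    n = len(m)
    mean = sum(m) / n
    variance = sum((x - mean) ** 2 for x in m) / n
    if variance == 0.0:
        raise ValueError("statistic undefined when all differences are equal")
    return math.sqrt(n) * mean / math.sqrt(variance)


# Made-up log-likelihood contributions for three observations.
v = vuong_statistic([-1.0, -1.0, -1.0], [-2.0, -3.0, -4.0])
```

Under the null hypothesis that both models fit equally well, the statistic is asymptotically standard normal, so values well above 1.96 would point towards the zero-inflated specification at the 5% level.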

Results

In the following figures, we present statistics on the number of documents published by African scientists (for the sake of simplicity, we use the term scientist for the authors of the documents, although a particular author is not necessarily a scientist in the strict sense of the word) and their contribution to the world output. In general, Africa's share of world output has increased both in the total number of documents and in the number of documents with at least one foreign scientist (Fig. 1): its share of all documents rose from 1.3% in 2000 to 3.1% in 2017, and its share of documents with at least one foreign scientist rose from 3.9% to 7.6% over the same period.

Fig. 1

Africa's world share of the total number of documents and of the documents with at least one foreign scientist. The percentages were determined using the documents indexed in the WoS and published between 2000 and 2017

The 10 African countries with the highest number of published documents between 2000 and 2017 (Fig. 2) produced about 88% of the scientific knowledge generated on the continent during the same period. South Africa stands out, with scientists from that country publishing more than 233 thousand documents and scientists from Egypt more than 154 thousand. In this top 10, we observe countries from different regions: northern (Egypt, Tunisia, Algeria and Morocco), western (Nigeria and Ghana), eastern (Kenya, Ethiopia and Uganda) and southern (South Africa). Previous studies are consistent with our findings, as the countries in each region with the highest number of publications are the same (e.g., Mouton & Blanckenberg, 2018; UNESCO, 2015).

Fig. 2

Number of documents of the 10 African countries with the highest number of documents indexed in the WoS and published between 2000 and 2017

Looking at the number of documents involving foreign scientists (who may be scientists from another African country or non-African countries), these countries account for 81% of the total number of documents from Africa with international collaboration. There are clear differences between countries on this dimension (Fig. 3). South Africa and Egypt are the countries with the highest number of documents involving foreign scientists, but in terms of their representativeness in the total number of documents, we observe the lowest values for these countries and Nigeria (40%, 42% and 33% respectively). At the other end of the scale, Uganda and Kenya (79% and 76% respectively) have the highest representativeness. When we look at all African countries, we find 29 (54%) producing 80% or more of their scientific output in collaboration with foreign scientists, and 34 (63%) producing 75% or more of their scientific output in this situation (see Appendix Table 6). It has been reported that African countries are heavily dependent on international funding to carry out research activities due to low investment by national governments in research activities (Beaudry et al., 2018). As a consequence, most African countries report a high presence of documents with the participation of foreign scientists in their scientific production.

Fig. 3

Number of documents with at least one foreign scientist for the 10 African countries with the highest number of documents indexed in the WoS between 2000 and 2017 and its representativeness in the total number of documents of these countries in the same period

Regarding the contribution of these countries to Africa's total scientific production, we see that South Africa and Egypt have the highest share of documents (30.8% and 20.4% respectively) (Fig. 4). In terms of the percentage of the total number of African papers with foreign scientists, these countries continue to lead, although the percentages are lower compared to the percentage of the total number of documents (26.7% and 18.4% respectively). It is also interesting to note that Kenya, Ethiopia, Ghana and Uganda contribute more to African documents (in percentage) with foreign scientists (orange line) than to total African documents (blue bar). Tunisia, Algeria and Morocco have more or less the same contribution in both cases. Finally, Nigeria contributes more than Algeria and Kenya to the total number of African documents, but the latter two have a higher share of African documents with foreign scientists. Similar patterns are also observed between Morocco and Kenya, and between Ethiopia, Ghana and Uganda. For the remaining countries, the share of African documents is less than 1% for 40 countries and less than 0.01% for 16 countries (see Appendix Table 6). For African documents with foreign scientists, the share is below 1% for 34 countries and below 0.01% for 7 countries (see Appendix Table 6).

Fig. 4

Representativeness of documents of the 10 African countries with the highest number of documents indexed in the WoS published between 2000 and 2017 in the total number of documents published by African scientists in the same period, and representativeness of documents with at least one international collaboration from the same countries in the total number of documents of the same type published by African scientists

As for the African countries’ main collaborators, we note that the top 5 collaborators are essentially non-African countries. Only for Nigeria, Kenya, Ghana, and Uganda do we find African countries in this top 5 list, and South Africa stands out as an African partner. In this top group, we find countries that are known for their high performance in the scientific arena (e.g. the USA and the UK), that have had a colonial relationship with African countries (e.g., France with Algeria and Tunisia) and that host relevant organisations for funding research in Africa (e.g., the Deutsche Forschungsgemeinschaft, the Federal Ministry of Education and Research and the German Academic Exchange Service, all from Germany (Kozma et al., 2018)) (see Fig. 5).

Fig. 5

The five main foreign partners of the 10 African countries with the most documents published between 2000 and 2017

Descriptive

The descriptive statistics reveal that, on average, African countries produced circa 6 publications in collaboration with other countries in 17 years. However, the distribution of the variable pub_col is highly right-skewed, and at least 50% of the observations have zero pub_col (Table 1). As for past collaborations, we found collaborations at time t − 1 for 27% of the cases (Table 2), suggesting that, despite the high presence of publications with at least one foreign scientist in the total number of publications of African countries, their IRC activity is concentrated around a few countries.

Table 1 Descriptive statistics for the pub_col, Pub, and distances
Table 2 Frequency of occurrence of the dummy variables

For the remaining variables, the high dispersion was already expected given the nature of our object of observation.

Concerning the dummy variables, we observed that 1.2% of the observations share a common border, 20.8% share a common language, 0.73% had a colonial link, and 13.3% share a common coloniser (Table 2). Most of the IRC occurred between countries not sharing a common border, language, coloniser, or colonial link (Table 3).

Table 3 The proportion of pub_col for different levels of the dummy variables

Finally, the independent variables are not strongly correlated (Appendix Table 5). The highest correlations (between 0.39 and 0.51) were observed between the variables representing the ICTs, economic and governance distances.

The influence of each distance

The variables that represent the internal scientific determinants have a positive and significant impact (p-value < 0.05) on pub_col (Table 4). The very similar values of \({Pub}_{\mathrm{i}}\) and \({Pub}_{\mathrm{j}}\) result from the fact that the dependent variable represents collaboration, which has no direction.

Table 4 Results using the Poisson regression models and its zero-inflated version regression

Despite the very similar results for both regression models, Vuong’s test leads us to conclude that the zero-inflated version of the Poisson regression is more appropriate.

The coefficients of the distance variables have the expected sign and are statistically significant (p-value < 0.05) for almost all variables (Table 4); the exceptions are the variables representing governance distance, excellence distance, and a common coloniser.

Thus, physical distance and not sharing a common border are barriers to IRC. The absence of, or limited access to, ICTs also emerges as an obstacle, corroborating previous findings (Muriithi et al., 2016; Tierney et al., 2013). Cultural distance imposes challenges on the collaboration process, at least when there are no colonial ties and no common language. If knowledge bases do not overlap to some degree, joint research activities will result in fewer pub_col than if they are relatively close. Past collaborations allow for increased collaboration within the same country pair. If the collaborators of the countries in each pair are very different (implying a low level of common collaborators), it is not possible to exploit the trust-based relationships created by previous collaborations.

As for governance distance, the coefficient is negative but not statistically significant. Although we are not able to show the reasons for this result, we can suggest possible explanations. Bilateral and multilateral agreements and programmes have been widely applied in the context of IRC (Boekholt et al., 2009). The formal nature of these instruments may prevent differences in governance from imposing major challenges on the collaboration process. These agreements are also seen as a means of bringing countries closer together through cooperation in science. Global societal challenges require the design and implementation of these instruments, as it is impossible for a single country to address these issues. In Africa, challenges such as climate change, biodiversity loss, and health issues are expected to inflict severe damage on the continent (IPBES, 2018; WMO, 2020), and therefore the participation of African scientists in tackling these challenges is deemed important (AU-EU, 2017).

Sharing a former coloniser has no impact on IRC. In the dataset, about 12% of all observations have no common coloniser but speak the same language, while 9% of all observations had a common coloniser and speak the same language. Although the percentage is similar in both scenarios, country pairs satisfying the first condition account for 39% of the total pub_col, while country pairs satisfying the second condition represent 4%. Thus, we suggest that cultural proximity (as measured by a common coloniser) is not a sufficient condition for joint research activities if the necessary resources are unavailable. Of all observations that share a common coloniser and language, bilateral collaborations between African countries account for 40%.

As expected, not all the distances have the same effect on IRC. Past collaboration and a common language seem to have the largest effect; having collaborated in the past increases pub_col by 1.4%. The same is true for speaking the same language. Colonial ties increase IRC by 1.1%. The ICTs distance has the smallest effect: a 1% increase decreases pub_col by 0.06%.

Thus, our results confirm all the hypotheses raised except H5 and H8.

Discussion and conclusion

Scientific knowledge is important for the socioeconomic development of a nation. Studies have found a positive relationship between scientific knowledge generated in African countries and their economic growth. However, scarce resources for R&D could retard the development of their national S&T systems. In an effort to develop their scientific systems, IRC may be an appropriate strategy. Therefore, in determining policies to promote IRC, it is imperative to identify its barriers. The literature has shown that geographical, economic, political/governance, cultural, intellectual and excellence distances hamper IRC in other regions. However, to date, no single study has addressed this subject in the African context. The question, therefore, is: is Africa different? If so, what are the implications for science and science policy?

This study contributes to the topic through an analysis of a dataset that includes the bilateral collaborations of African countries (with another African country or a non-African country). Using panel data for 54 African economies, we examined the effects of geographical, ICTs, economic, governance, cultural, intellectual, excellence and social distance on the cross-national collaborations of these countries. The results suggest that Africa is indeed different. We found that geographical and ICTs distances, lack of colonial ties and common language, large discrepancies in the knowledge base, the absence of past collaborations, and the dissimilarities concerning the collaborators belonging to each country's network significantly and negatively impact IRC. On the other hand, we found that economic distance promotes IRC, contradicting previous studies, and that governance and excellence distances and a common coloniser do not affect IRC.

As for scientific policies, it is difficult to have a successful recipe. Nonetheless, policies must be adapted to each environment.

Policies aimed at fostering IRC should take into account that physical distance is a barrier. A way of minimising this negative effect is to invest in ICTs infrastructures that allow continuous interaction among scientists and access to key resources (databanks, specialised equipment). The availability of ICTs cannot overcome all the limitations of physical distance (the sharing of tacit knowledge continues to be an issue). However, it helps to reduce the number of visits and consequently the time and financial resources required, which are so important in the context of African scientists.

Bilateral and multilateral collaborations with developed countries should continue, but always aiming to address scientific problems that hinder Africa's socioeconomic development. Policies to promote IRC should consider the interactions between the academic, government and industrial sectors, which are weak in most African countries. In the absence of policies to foster these interactions, each sector, particularly academia, will seek international research relationships. This can lead to asymmetric relationships, especially when dealing with countries having solid S&T systems. African scientists may tend to adopt the research agenda of international partners, which would further weaken the internal interactions of the different sectors and consequently perpetuate the socioeconomic underdevelopment of African countries. We, therefore, advocate for a balance between policies that promote and support IRC with the most developed economies and those that stimulate interactions between the different actors. Simply applying policies designed in countries with well-defined interactions among these sectors will not be successful in this case.

Cultural proximity is seen as positive in joint research. However, our results have shown that this may not be a sufficient condition. Policies to promote interactions among African countries should focus on building transnational infrastructures that are adequately resourced and that promote, in particular, research collaboration between scientists from African countries. Only with the appropriate resources is it possible to take advantage of the benefits of cultural proximity in research collaborations.

The scientific production of African countries is biased towards the Natural Sciences and Medical Sciences (Pouris & Ho, 2014). Since similar knowledge bases are essential to the process of collaboration, it seems important to focus more on policies aimed at promoting the development of the scientific domains with a weaker knowledge base, without neglecting the policies in scientific domains where African scientists can continue to contribute to the advancement of knowledge. In this way, a more balanced scientific spectrum can be achieved. Once a balanced knowledge base is obtained, collaborative policies should be differentiated by scientific domains to reflect the scientific knowledge needs of individual countries.

Since resources are scarce in Africa, policies to support IRC should not primarily aim at concentrating resources on a small group of researchers known for their excellent research to achieve excellence. Their S&T systems are at an early stage of development, so these countries need to build research capacity according to their priorities. Research utilisation (according to local expectations and needs) is more important than research excellence per se. The participation of African scientists in IRC will familiarise them with a quality-oriented research culture, which will be crucial for achieving research excellence in African science in the medium-long term and therefore in building solid S&T systems.