Scientific systems in Latin America: performance, networks, and collaborations with industry

In this paper, we use a combination of bibliometric, social network and econometric approaches to increase our knowledge of how research institutions interact with the private sector in Latin America (LA). We first study recent trends in scientific output and specialization. On average, LA countries have been reducing the gap with the world's leading regions. They have also tended to specialize in fields related to economic activities based on natural resources, such as Agricultural Sciences and Plant and Animal Sciences. However, collaborations with the private sector remain scarce. We build scientific networks composed of what we define as Research Departments (RDs). These RDs belong to universities, research institutes and government agencies. We model the intensity of collaboration of an RD with industry as a function of its size, previous performance, and its position in the LA and national scientific networks. Our results show that RDs with a higher diversity of research partners in their national scientific network work more intensively with industry. Additionally, collaborations with industry are influenced by previous interactions with the private sector.


Introduction
Understanding the main traits of scientific institutions that engage in collaborative work with industry is critical for improving policies aimed at increasing science-industry linkages. This is even more important in Latin American (LA) countries, where public support for science, technology, and innovation (STI) has increased significantly since the early 2000s, but the study of science-industry linkages has been relatively neglected. In this study, we show that scientific systems in LA countries have improved in performance while specializing in scientific fields that are related to the main economic activities of the region. Furthermore, by analyzing the characteristics of scientific organizations and their collaboration networks, we find that organizations that have, or that have access to, diversified sources of knowledge work more intensively with the private sector.
One of the main motivations behind the increasing relevance of policies that promote science-industry linkages in the LA region is the potential benefit in innovation and technological capacities in the private sector. Indeed, Crespi (2012) summarizes main results of impact evaluations conducted in LA finding that public policies that promote collaboration between universities and industry increase the level of investments in innovation and labor productivity in firms. Marotta et al. (2007) show that both in Chile and Colombia firms that collaborate with universities are more likely to introduce product innovations and to apply for patents. Despite these positive findings, the magnitude of STI policies is still too modest to pull LA countries towards knowledge-based economies. Evidence from innovation surveys shows that universities and research centers tend to be less relevant partners for technological innovation in LA firms in comparison to the non-LA OECD countries (OECD 2015).
Part of the reason behind the limited importance of scientific institutions in the typical LA country has been attributed to the high relevance that the exploitation of natural resources has in LA economies. Conventional views see natural resource-based industries as activities with slow technological progress, where innovation is mostly driven by the suppliers of machinery and equipment, and that have a reduced potential for producing knowledge spillovers to other sectors (Lall 2000; Pavitt 1984). Nonetheless, some scholars argue that certain specificities of the current context create a demand for local knowledge for the exploitation of natural resources, which opens a 'window of opportunity' for the development of local knowledge providers (Kaplan 2012; Marin et al. 2015; Urzúa 2011). Marin et al. (2015), for example, argue that the intensification of the challenges in the exploitation of natural resources, together with changes in the volume and requirements of global demand, would favor the development of a domestic knowledge-intensive industry built upon local scientific capacities. Following this rationale, economic growth would be grounded in the capabilities acquired by each country in its specific area of resource endowment. It would then advance along the new technological trajectories being opened by research in natural resource-related fields. In this study, we provide insights about scientific development around natural resources in LA.
After studying scientific systems at the country level, we focus our analysis on scientific organizations and their patterns of collaboration with the private sector, measured by co-publications. There is an extensive body of literature on university-industry collaborations, and some of these studies examine cross-country and disciplinary differences in the patterns of co-authored scientific publications between university and industry (Godin 1996; Hicks et al. 1996; Tijssen 2004). Nonetheless, not much is known about the characteristics of scientific institutions which favor research collaborations with industry. The lack of evidence is even more noticeable in the LA countries. The results of this study help to close this gap.
In what follows, we first update and discuss current literature related to our research question; then in section three, we describe our data sources and the methodology. Section four shows the evolution of scientific production process and major trends on the specialization of LA countries. Section five describes LA scientific networks in five selected disciplines: Agriculture, Engineering, Environmental, Geosciences, and Plant and Animal Science. Section six presents and analyses main econometric results, and lastly, we present conclusions in the final section.

Science-industry collaboration in Latin America
Science-industry collaboration in LA has largely been built from a top-down perspective as a result of S&T policies based on a supply-push focus (Crespi and Dutrénit 2014; Dutrénit and Arza 2010). Although LA universities differ across countries in their origins, a common feature is that they were initially oriented to undergraduate teaching. As research activities became increasingly common, postgraduate programs were gradually provided. Crespi and Dutrénit (2014) describe how several of the most important public research centers in LA were created during the period of supply-push science policies (1950s-1980s). These centers have focused on supporting sectors considered relevant by policy makers (for example, coffee in Costa Rica, aeronautics and oil in Brazil, oil in Mexico, nuclear technology in Argentina, and agriculture in most countries). During the same period, the private sector evolved in economic activities that remained fairly protected from international pressures (either naturally or through intervention) (Crespi and Dutrénit 2014). Consequently, the incentives to engage in technological updating and learning were lessened. These singularities perhaps led LA firms to disregard local scientific institutions when conducting their innovation activities (Crespi et al. 2010).
However, despite their relative scarcity, science-industry interactions have been key to successful historical experiences in some industries in LA. For instance, Arza and Vazquez (2012) highlight the importance of research by scientific institutions for agricultural technological upgrading in Argentina, while Casas et al. (2000) discuss the role of research institutions in successful experiences in biotechnology and other industries in Mexico. Along similar lines, Suzigan and Albuquerque (2011) argue for the importance of university research for the development of the aircraft, steel and agricultural industries in Brazil. Evidence from quantitative studies shows that manufacturing firms that collaborate with local universities increase their investments in R&D, are more likely to innovate and to apply for patents, and reach higher levels of labor productivity (Crespi 2012; Marotta et al. 2007).

Measuring science-industry knowledge transfer
There are a wide variety of channels through which tacit and codified knowledge is being transferred between universities and industry. Some mechanisms include the mobility of students, personnel exchange, informal exchanges of information, public conferences, consulting, collaborative and contract R&D projects, joint ventures, scientific publications and patents (Cohen et al. 2002;Gray et al. 2013;Link et al. 2007;Meyer-Krahmer and Schmoch 1998;Narin et al. 1997). According to Bekkers and Bodas Freitas (2008) the relative importance of these different channels in different contexts is explained, to a large degree, by the basic characteristics of the knowledge in question (tacitness, systemicness, expected breakthroughs), the disciplinary origin of the knowledge involved, and to a lesser degree the individual and organizational characteristics of those involved in the knowledge transfer process.
Due to this variety of channels and mechanisms, there are methodological challenges in measuring and assessing science-industry collaborative research. The impacts of science-industry collaborations are usually spread in space and time, can be numerous, and are almost always difficult to separate from other parts of organizational life (Bozeman 2000). This methodological challenge is compounded by problems of data availability and measurability. For this reason, to focus on research collaboration between research institutions and industry, we adopt co-authored publications as a measure of the occurrence and intensity of collaboration (Godin 1996; Tijssen 2012).

Co-publications as a measure of science-industry collaboration
The analysis of co-authorship has become one of the standard ways of measuring research collaborations between organizations (Lundberg et al. 2006). Co-authored publications indicate the achievement of access to an often-informal network, and can be viewed as successful scientific collaboration in themselves. They also indicate a diffusion of knowledge and skills. Moreover, co-authorship as an indicator is quantifiable and invariant, while the measurement is not invasive (Abramo et al. 2009).
However, it should be emphasized that joint publications are just one type of the different channels of knowledge transfer. Research financed by industry, co-patenting, or even research collaborations that do not involve scientific publications are not captured by academic databases. Hence, as the discussion of Grupp and Mogee (2004) reveals, relying only on this single indicator to assess knowledge transfer activities of institutions and countries may overestimate (underestimate) the performance of those which are naturally more (less) inclined towards activities that lead to publishable outcomes. Furthermore, some co-authored articles do not reflect real collaboration. A publication co-authored by two institutions could suggest a collaboration that has not taken place, for example, if an author has the two affiliations. Also, most scientific publications are about a specific topic or research question, and interdisciplinary research may be left out of the publication system (Porter and Rafols 2009). Therefore, co-authorship can never be more than a rather imperfect or partial indicator of research collaboration (Katz and Martin 1997;Laudel 2002). In the LA context, it has been argued that co-authored publications are one of the most important channels of knowledge transfer for researchers and firms (Dutrénit and Arza 2010). Therefore, for our study, we start from the assumption that companies need to perform research to absorb and appropriate codified scientific and technical knowledge (Aristei et al. 2016;Rosenberg 1990). Although "the traditional motivation of the technologist is not to publish, but to produce his artifact or process without disclosing material that may be helpful to his peers" (Price 1963), industrial researchers involved in scientific production activities act strategically. 
They publish in order to build their reputations, increase their visibility, reorient their R&D agenda, establish intellectual claims and legal rights, signal capabilities to attract potential partners, and remain effectively plugged into scientific networks where new ideas are emerging (Godin 1996; Lee 2000; Li et al. 2015; Tijssen 2004). Many of these papers are likely to be co-authored with researchers in the public sector. These researchers, on the other hand, have a different set of motives to collaborate with industrial researchers, namely to generate additional research funds, gain insights in the area of research, look for business opportunities, increase the output of commercialization activities and further the university's outreach mission (Belkhodja and Landry 2007; Bozeman and Gaughan 2007; D'Este and Patel 2007; Lee 2000; Wong and Singh 2013).
Consequently, science-industry co-authorships can constitute a strategic way of acting that gives the researchers involved valuable insights in comparison with peers who are not participating in such collaborations. In this study, we assume that scientific institutions are always available for co-authorships with industry researchers, i.e., these institutions are not actively selecting their potential private sector partners. On the other hand, we assume that researchers in industries prefer to collaborate with institutions that exhibit certain structural characteristics. These include a quality dimension and a measurement of the diversity of knowledge. The latter can be studied by analyzing collaboration network structures.

Network position as correlate of performance
Scholars of social networks have consistently shown a significant association between network position and performance. One line of research indicates that actors with a higher number of direct ties will have access to additional sources of knowledge, ideas, and resources, thereby enhancing performance (Ahuja 2000; Reagans and McEvily 2003). Other research emphasizes the benefit of brokerage. Actors brokering between otherwise disconnected actors are characterized by having a timing advantage, being in an advantageous position for identifying arbitrage opportunities, having higher chances of creating new knowledge or products, and being better able to capitalize on their existent capabilities (Burt 2004, 2005; Zaheer and Bell 2005). The benefits of both types of network positions have also been suggested (Reagans and McEvily 2003; Fleming et al. 2007). Despite these different perspectives, the consensus has been that network positions correlate significantly with actor performance in different areas.
If we consider scientific collaboration network studies, most of the previous work focuses on the individual/researcher level. Some studies highlight the importance of structural collaboration network positions as a driver of preferential attachments (Barabasi et al. 2002;Moody 2004;Abbasi et al. 2012). Others try to understand if the location of a researcher in a network can bring some advantages, for instance, a higher level of citations, better access to knowledge sources, awareness of potential projects or access to more funding (Abbasi et al. 2011;Ebadi and Schiffauerova 2015). However, to the best of our knowledge, there is no evidence yet on the impact of structural collaboration network positions on the level of collaboration with industry.
In this study, we use a mixed set of methodologies and metrics to analyze the LA scientific system, its interactions, and its proximity to industry. Taking into consideration the productive structure of the region, we focus on five natural resource-related fields: Agricultural Sciences, Engineering, Environmental Sciences, Geosciences, and Plant and Animal Sciences. Our contribution is to update trends and specialization patterns of scientific production in LA countries using bibliometric analysis and descriptive statistics. Furthermore, we assess to what extent the specialized knowledge diversity of scientific institutions, proxied by their position in the scientific networks, affects collaborations between science and industry.
Understanding the determinants of these collaborations will provide useful information for the design of policies aimed at fostering science-industry linkages.

Data collection
We used the InCites™ (2017) tool proposed by Thomson Reuters, which is a web-based research evaluation tool that facilitates national and institutional comparisons across long time periods using indicators of publication output, productivity, specialization and normalized citation impact. InCites™ provided output and citation metrics from the WoS™ (Web of Science™, Thomson Reuters), which in turn allowed us to access data and metrics from a dataset of 22 million WoS™ papers from 1981 to 2013. All articles and reviews from researchers with a LA affiliation, published between 2004 and 2013, were analyzed. The metrics for comparisons between countries are created based on address criteria, using the whole-counting method, that is, counts are not weighted by number of authors or addresses.
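As an illustration of the whole-counting method mentioned above, the following minimal Python sketch credits each country with one full publication for every paper that lists at least one address from that country. The function name and the toy data are hypothetical and are not taken from InCites™.

```python
from collections import Counter

def whole_count(papers):
    """Count publications per country using whole counting.

    Each paper is given as a list of country codes, one per author address.
    A country receives one full credit per paper, regardless of how many
    addresses (from that country or from others) the paper lists.
    """
    counts = Counter()
    for countries in papers:
        for country in set(countries):  # de-duplicate addresses within a paper
            counts[country] += 1
    return counts

# Hypothetical toy data: three papers and the countries of their addresses.
papers = [["BR", "BR", "CL"], ["BR"], ["CL", "MX", "US"]]
counts = whole_count(papers)
```

Because counts are not weighted by the number of addresses, the country totals can add up to more than the number of papers (six credits for three papers in this toy case).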
InCites™ classifies author addresses (affiliations) as "university," "research institute," "government," or "corporate." In our work, an industry collaborative publication is one that has at least one author with a "corporate" affiliation, and at least one author with an affiliation with a LA "university" or "research institute." It is important to keep in mind that not all single affiliations of all publications in InCites™ are unified as "university," "research institute" or "corporate." 1 There are corporate affiliations that have not been identified or unified yet; hence, they have not been classified as industrial publications. Multinational enterprises (MNEs) are more likely to have been identified and unified as "corporate." Therefore, publications listed as industry (co)publications are a lower boundary of the real private sector research output. We would expect that countries with a lower presence of MNEs have larger differences between the number of publications authored by industry captured by InCites™ and the real activity.
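The definition of an industry collaborative publication given above can be sketched as a simple rule. This is a minimal illustration, assuming each publication is available as a list of (organization type, country) pairs; the set `LA_COUNTRIES` is an illustrative subset and the function name is hypothetical.

```python
# Illustrative subset of LA country codes (an assumption, not the full list).
LA_COUNTRIES = {"AR", "BR", "CL", "CO", "MX", "PE"}
SCIENCE_TYPES = {"university", "research institute"}

def is_industry_copublication(affiliations):
    """Return True if the paper counts as an industry collaborative publication.

    affiliations: list of (org_type, country) tuples, where org_type is the
    unified label ("university", "research institute", "government",
    "corporate") or None when the address has not been unified.
    """
    has_corporate = any(org_type == "corporate" for org_type, _ in affiliations)
    has_la_science = any(
        org_type in SCIENCE_TYPES and country in LA_COUNTRIES
        for org_type, country in affiliations
    )
    return has_corporate and has_la_science
```

Addresses that have not been unified (type `None` here) never trigger the rule, which is why the resulting counts should be read as a lower bound.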
Another important caveat in our analysis is that LA's research output may be underestimated because its researchers often publish in journals that are not indexed in major citation databases, such as WoS™ or Elsevier's Scopus™.

Bibliometric analysis
In this section, we analyze the evolution of the science systems in LA, focusing mainly on their output, productivity, specialization, quality, and linkages with industry. In addition to publication output (number of articles and reviews) and research performance (publication output relative to GDP and population), we calculate the percentage of publications of each country that were co-authored with industry, and the share of total publication output co-authored with international institutions. We also compute standard specialization indexes to depict the relative specialization of each country in a given area (Balassa 1965). This also serves to assess the overall level of specialization of each country (Laursen 2000). Finally, to study the quality of the research output, we use two normalized measures of citation impact. These are values which evaluate the scientific influence or visibility of a set of publications in a given period. For the Quality Citation Index (Bornmann and Leydesdorff 2013), a country value of 1.2 indicates that the citation impact of papers published by researchers in this country is, on average, 20% above the worldwide average. For the Quality Top 10% Index, a country value of "10" indicates that 10% of the publications of that country are in the top 10% of the world, regardless of subject, year and document type (Pudovkin and Garfield 2009). Therefore, that country can be considered as performing at the same level as the world average. A value higher than "10" indicates a higher performance relative to the world average (see "Appendix" for more details).
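A Balassa-type specialization index can be sketched as the share of a country's papers in a field divided by the share of world papers in that field. The snippet below is a minimal illustration of that plain ratio; the exact variant and any normalization used for the indexes reported in this paper are not assumed here, and the function name and figures are hypothetical.

```python
def relative_specialization(country_field, country_total, world_field, world_total):
    """Balassa-type ratio: country share of a field over the world share of that field."""
    country_share = country_field / country_total
    world_share = world_field / world_total
    return country_share / world_share

# Hypothetical figures: 200 of a country's 1,000 papers fall in a field that
# accounts for 50,000 of 1,000,000 world papers.
rsi = relative_specialization(200, 1_000, 50_000, 1_000_000)
```

Under this form, a value above 1 means the field weighs more in the country's output than in world output, and a value below 1 means it weighs less.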

Social network analysis
In this section, we describe the structure and patterns of collaborations of the LA scientific network. We focus this part of our analysis on what we defined as the Research Department level (RD). 2 This unit of analysis is defined by an output measure. We assume that all publications from one institution, in a determined scientific field, were produced by a specific RD. For example, we treat all publications from one institution in two scientific fields as research output from two different RDs that belong to the same institution. In addition, we assume that the research performed in each area faces its specific conditions and it is embedded in a particular scientific network, independent from other scientific topics. Although these assumptions could be debatable, scientific research in each field demands high levels of specialization and knowledge, which makes it very costly to get involved in research in other disciplines (Jeffrey 2003). Hence, we expect that this definition may include some errors but not a consistent bias.
As mentioned above, we define institutions conducting research in more than one field as having different RDs operating separately in each one of them. To extract the relevant scientific networks, we define a threshold to select the most prolific RDs in LA. For each field studied, we select RDs with more than 50 publications in each of the two 5-year periods analyzed. Afterward, for each of these "elite" RDs, we gather all partners with five or more collaborations in the same field, in the same period. Thus, two RDs are linked if they have five or more co-authorships in the field and period. It is worth mentioning that collaboration partners are not necessarily part of the "elite" RDs group, given that they only need to satisfy the minimum of five co-publications with one "elite" RD. This group of collaboration partners also includes RDs that are not from LA institutions; however, we do not consider in our calculations those that are linked with only one LA RD.
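The selection rules above can be sketched in a few lines of Python. This is one possible reading of the procedure, not the authors' code: the data structures, function name, and toy figures are hypothetical, and links are kept only when at least one endpoint is an "elite" RD, since partners are gathered from the elite RDs outward.

```python
MIN_PUBS = 50    # an elite RD has more than 50 publications in each period
MIN_COPUBS = 5   # a link requires five or more co-publications

def build_network(pubs, copubs, la_rds):
    """Return (elite, edges) for one field and one period.

    pubs:   {rd: (publications in period 1, publications in period 2)}
    copubs: {(rd_a, rd_b): number of co-publications in the period}
    la_rds: set of RDs that belong to LA institutions
    """
    elite = {rd for rd, (p1, p2) in pubs.items()
             if rd in la_rds and p1 > MIN_PUBS and p2 > MIN_PUBS}
    edges = {frozenset((a, b)) for (a, b), n in copubs.items()
             if n >= MIN_COPUBS and (a in elite or b in elite)}

    def la_neighbours(rd):
        # LA RDs directly linked to rd in the current edge set.
        return {x for e in edges if rd in e for x in e if x != rd and x in la_rds}

    # Drop non-LA partners that are linked with only one LA RD.
    nodes = {rd for e in edges for rd in e}
    dropped = {rd for rd in nodes
               if rd not in la_rds and len(la_neighbours(rd)) < 2}
    edges = {e for e in edges if not (e & dropped)}
    return elite, edges

# Hypothetical toy data: A, B, C are LA RDs; X and Y are non-LA partners.
pubs = {"A": (60, 80), "B": (55, 51), "C": (10, 70), "X": (500, 600), "Y": (300, 300)}
copubs = {("A", "B"): 7, ("A", "C"): 5, ("B", "C"): 9, ("A", "X"): 6,
          ("B", "X"): 5, ("A", "Y"): 12, ("C", "X"): 20}
elite, edges = build_network(pubs, copubs, la_rds={"A", "B", "C"})
```

In this toy case, C enters the network as a partner without being elite, X is kept because it is linked with two LA RDs, and Y is dropped because it is linked with only one.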
We perform this analysis in two periods of 5 years each (2004-2008, and 2009-2013). Besides the graphical description of networks of both periods, we obtain information at the RD (node) level, such as centrality indicators (degree, betweenness, and closeness), and network features, namely the number of nodes, number of communities and average path length.
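Two of the node-level indicators named above, degree and closeness centrality, can be computed with a short breadth-first search. The sketch below uses only the Python standard library and Freeman-style normalizations; betweenness is omitted for brevity. Computing closeness over reachable nodes only is an assumption made here for disconnected networks, and the function names and toy edge list are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest path lengths (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def centralities(edges):
    """Return (degree, closeness) dictionaries for an undirected edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    n = len(adj)
    # Degree centrality: number of partners divided by the maximum possible (n - 1).
    degree = {v: len(adj[v]) / (n - 1) for v in adj}
    closeness = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        total = sum(dist.values())
        # Closeness: reachable nodes divided by the sum of distances to them.
        closeness[v] = (len(dist) - 1) / total if total else 0.0
    return degree, closeness

# Hypothetical toy network of four RDs.
toy_edges = [("A", "B"), ("A", "C"), ("A", "X"), ("B", "C"), ("B", "X")]
degree, closeness = centralities(toy_edges)
```

Here A and B are linked to every other node, so both reach the maximum degree and closeness, while C and X sit two steps away from each other.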

Econometric analysis
In this section, we set a model that allows for gathering new evidence of the characteristics of the RDs working more closely with the industry. We define the percentage of publications of RDs that are co-authored with the industry as the dependent variable, and we relate it to a set of RD features that could influence such collaborations: (1) knowledge production capacity; (2) research quality; (3) orientation towards industry; and (4) knowledge diversity.
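The dependent variable described above, together with the participation indicator used in the two-tiered model, can be derived from two counts per RD. This is a minimal sketch with hypothetical names and figures.

```python
def collaboration_outcomes(total_pubs, industry_copubs):
    """Return (w, y) for one RD in one period.

    w: 1 if the RD has at least one co-publication with industry, else 0.
    y: percentage of the RD's publications co-authored with industry (0 to 100).
    """
    if total_pubs <= 0:
        raise ValueError("an RD must have at least one publication")
    w = 1 if industry_copubs > 0 else 0
    y = 100.0 * industry_copubs / total_pubs
    return w, y

# Hypothetical RDs: 5 of 200 papers with industry, and 0 of 120.
active = collaboration_outcomes(200, 5)
inactive = collaboration_outcomes(120, 0)
```

An RD with no industry co-publications sits at the corner (w = 0, y = 0), which is the case the two-tiered specification is designed to handle.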
Co-authorships with industry are far from common in science. The occurrence of these events can be represented as a case of corner outcomes, with a corner at zero and a continuous distribution for strictly positive values (upper-censored at 100). Wooldridge (2002) suggests addressing these cases by implementing "hurdle" or "two-tiered" models. This allows explanatory variables to affect differently the participation decision, i.e., the co-authorship of at least one publication, and the intensity of these collaborations, measured as the percentage of the total publications of an RD that were produced jointly with firms. Therefore, we first follow the specification of the two-tiered model developed by Cragg (1971). In the "first tier" of the model, we estimate the probability of participation in co-publication with industry using a probit model. In the "second tier", a truncated normal model is used to estimate the intensity of the collaborations with industry. Formally, w is a dichotomous variable equal to 1 if the RD has at least one co-publication with industry and 0 otherwise, and y is the percentage of publications of the RD co-authored with the private sector. When w = 0, y also takes the value of 0; when w = 1, y > 0. Variables x1 and x2 are sets of characteristics of the RDs that affect the likelihood of co-publishing with industry and the intensity of these activities, respectively. Hence, γ captures the effects on participation and β those associated with the intensity of co-publication. This specification assumes conditional independence between the two tiers of the model. In this case, that means assuming that, after controlling for the observable characteristics of the RDs, there is no correlation between the decision to participate and the intensity of co-publications. We are aware that the latter assumption could be debatable. Therefore, we also use the approach developed by Heckman (1979) as a consistency check.
Although this model is aimed at addressing the selectivity problem that arises when an interval of the outcome variable is not observable, it is statistically very similar to Cragg's model, and its flexibility allows for correlation between the participation and intensity equations. However, a variable that affects the participation but not the intensity of collaborations with industry needs to be included to identify the model. As we mentioned above, we model the participation of the RDs in collaboration with industry and the intensity of co-publication as a direct function of the main characteristics of the RDs. The first independent variable is the total number of scientific publications during the period, which depicts the knowledge production capacity of the RDs. This variable is expected to have a positive effect on the relationships with the private sector, since the capability of a university to attract private enterprise collaboration is influenced by the size of the group of academic researchers and their output (Abramo et al. 2010). Furthermore, this variable is also a proxy for the size of the RD. Larger organizations may have more resources available to assign to relationships with the private sector.
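For reference, Cragg's two-tiered model with a probit first tier and a truncated normal second tier is usually written as the density below. This is the standard textbook form, given under the assumption that it matches the specification described above. The symbols σ (standard deviation of the second-tier error), Φ (standard normal distribution function) and the indicator function 1(·) are notation introduced here, with γ and β denoting the participation and intensity coefficients. The upper censoring at 100 is not represented in this form.

```latex
f(w, y \mid x_1, x_2) =
  \bigl[\, 1 - \Phi(x_1 \gamma) \,\bigr]^{\mathbf{1}(w = 0)}
  \left[\, \Phi(x_1 \gamma)\,
    \frac{(2\pi)^{-1/2}\, \sigma^{-1}
          \exp\!\bigl\{ -(y - x_2 \beta)^2 / (2\sigma^2) \bigr\}}
         {\Phi(x_2 \beta / \sigma)}
  \,\right]^{\mathbf{1}(w = 1)}
```

The first factor is the probit probability of not co-publishing with industry; the second combines the probability of participation with a normal density for y that is truncated at zero, which is what lets γ and β differ across the two tiers.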
We also include a measurement of the scientific quality of the research output of the RD, in the form of a citation impact index. In principle, we expect an ambiguous effect of quality on co-publications between science and industry. On the one hand, highly cited institutions enjoy reputational benefits that make them perceived as more desirable partners for research by the private sector, increasing the likelihood of this type of collaboration. On the other hand, institutions that produce highly cited publications may be mainly focused on academic research, leaving few resources available to create linkages with industry. The empirical evidence is also mixed. Some studies have shown a correlation between universities' citation impact and their intensity of collaboration with industry (Abramo et al. 2010; Balconi and Laboranti 2006; Giunta et al. 2016). However, further analysis, examining specifically the Italian situation, showed that enterprises do not necessarily choose partners with higher scientific influence (Abramo et al. 2009). The orientation of an RD towards working closely with the private sector will certainly affect the share of co-publications (Bozeman and Gaughan 2007; Giunta et al. 2016). We proxy this factor using a variable that measures the previous record of science-industry collaborations of the RD. By measuring prior partnership, we are also able to control for the pre-existent linkages with industry that could have been developed at the institutional or personal 3 (researchers) level.
Finally, we consider that the diversification of knowledge within each RD positively affects its closeness to the private sector. In particular, we assume that industrial research projects in which companies involve RDs are significantly more complex and uncertain than the common ones (Hall et al. 2003). Hence, RDs that possess or have access to diverse but complementary expertise, even within the same scientific field, are going to be working more intensively with the industry. Unfortunately, the level of disaggregation of the publication data by scientific field does not allow us to test this directly. Nevertheless, we make use of the social network features of each RD to proxy their internal knowledge diversification. Specifically, we assume that RDs that have more diversified internal knowledge sources are more likely to have a more varied set of research partners. 4 Accordingly, we include variables that provide information about the linkages of the RDs and its relevance in the scientific network. We rely on three commonly used measures of network centrality (Freeman 1978): degree, betweenness, and closeness (see "Appendix" for more details).
By including this type of variables in our estimation, we are also controlling for other mechanisms that are taking place in parallel. Namely, RDs that are relatively better connected in their scientific network could be given preference in work collaborations, since they have earlier access to sources of knowledge and ideas (Burt 2005). Higher centrality can also lower the cost of screening other RDs for future partnerships, help to diffuse the scientific challenges in which companies are interested and increase its scientific reputation thereby attracting top researchers (Godin 1996;Lee 2000;Li et al. 2015;Tijssen 2004). On the other hand, working with highly connected RDs can be risky for companies because it increases the potential damages of leakages of relevant information of the firms. Finally, we can also expect that geographical proximity plays a role in shaping these collaborations (Bozeman and Corley 2004;Giunta et al. 2016;Pinch et al. 2003). Hence, we include RDs information regarding their relevance in both LA and their national scientific networks.
Usually, quantitative studies assessing causality based on statistics and data from networks are subject to endogeneity biases. In our case, it would be in the causal direction of the relationship between the linkages of a RD within their scientific network and the intensity of collaborations with industry. We try to address this potential problem by using information from two separate periods of time. This enables us to analyze RDs characteristics and position in the network in one period and the collaborations with the private sector in the following period. Furthermore, from the management literature, we know that previous alliances tend to remain or to be repeated because routines decrease asymmetries of information among partners and facilitate the estimation of future returns of joint activities (Gulati 1995). At the same time, processes of path dependence induced by the influence of initial conditions on future developments may also occur here (Thune and Gulbrandsen 2014). The choice of the 5-year time span is a compromise between robustness of results and timeliness.
We control for differences in the intrinsic degree of proximity to industry of different scientific fields. We also include a set of country dummies to control for idiosyncratic characteristics and specific science-industry policies. Also, we include a dummy variable that controls characteristics of RDs that are part of universities, relative to other types of institutions. Finally, we allow errors to be correlated among RDs that belong to the same institution.

Science in LA: trends and specialization
LA's share of world publication output in WoS™ increased from 1.32% in 1981 to 5.03% in 2013. In 2013, all LA countries together accounted for 71,391 publications in WoS™. Brazil's share of publication output is particularly high when compared with other countries of the region (around 55% of LA output in 2013), reflecting differences in the size of the economies. According to our analysis, the share of world scientific output from Brazil increased at a constant rate from 1993 to 2006, when publications skyrocketed to the levels seen in Brazil in 2013. 5 Other countries that show higher average shares of scientific output than LA in the last decade are Mexico, Argentina, and Chile. Table 1 provides data adjusting scientific output by other characteristics of the countries. This allows for an assessment of the scientific "productivity" per billion USD and per million inhabitants.
LA countries are ranked in Table 2 by aggregate scientific production from 2004 to 2013. Although Brazil has the highest number of publications, it has the lower scientific impact. This may happen due to a significant percentage of articles being published in national journals that had recently been included in the databases (Collazo-Reyes 2013). Countries with smaller scientific systems tend to rely more intensively on international collaborations. The average LA country has 75% of its scientific outputs co-published with a foreign institution, while that figure goes down to 42% when considering the top 4 largest science systems in the region (Brazil, Mexico, Argentina, and Chile).
In general, although LA's scientific impact is growing, it remains relatively low when compared to the world average. Despite their low productivity and scientific output, Peru and Panama perform best in these terms, probably because more than 85% of their publications are co-authored with researchers outside their country (Van Raan 1998). Chile, by far the most productive country in the region, has also increased its research output and maintained a medium level of scientific impact. With regard to collaboration with industry, the percentages are relatively low, mainly in the countries that are less dependent on international collaboration or that have larger science systems, when compared to countries like the United States or Germany (higher than 2%). These results are in line with Tijssen (2012), who showed that LA and North Africa are the regions in the world with the lowest intensity of science-industry co-authorship.
Table 1 Research performance of LA: summary statistics (2004-2008 and 2009-2013). a Docs = Scientific publications. b RSI = Share of a country's papers in a given field, relative to the share of world papers in that field. c SII = Specialization Intensity Index. This measure provides a ratio to assess whether a country is "specialized" or "not specialized." It grows with the specialization intensity of a country.
Countries often try to invest strategically in research areas critical to their economic development. Creation of specific local knowledge may increase the innovation capacities of incumbents, but also promote the birth of start-ups or spin-offs. These trends run in parallel with others that do not necessarily operate in the same direction. Historical and cultural influences, strengths of scientific establishments, as well as incentives and government funding for scientific research play a relevant role in defining the revealed scientific specialization of a country.
The size of the scientific system also matters, since larger science systems have the capacity for more diversity and greater coverage of the full scope of sciences. In contrast, smaller systems may be limited in their ability to invest in specific domains. We explore the outcome of these trends through a specialization analysis based on the 22 Essential Science Indicators (ESI) areas. 6 Table 2 contains the five subject areas of highest specialization for the nine countries in LA with more than 1% of LA total scientific output over the 2009-2013 period. Table 2 also provides information on the aggregate specialization level (given by the SII index) for each of these nine countries.
Research specialization is quite similar across these LA countries. In aggregate terms, the top 5 areas with the largest output from LA, relative to the world, are Agricultural Sciences (15.7%), Plant and Animal Science (12.3%), Space Science (9.3%), Environment/Ecology (7.7%) and Microbiology (7.3%). The highest LA specializations are in Agricultural Sciences and Plant and Animal Sciences, which is in line with the high importance of agricultural, livestock and agro-industrial activities in the region.
The cases of Peru and Chile are interesting because they reveal high specialization 7 in subject areas different from the other countries of the sample. The specialization of Peru is related to issues in public health (prevention of HIV, tuberculosis, and lupus), in which it also has a high scientific impact (Van Noorden 2014). Chile's high specialization in Space Science is related to its excellent infrastructure of giant telescopes housed in the Atacama Desert. According to Catanzaro et al. (2014), funding for astrophysics increased from $2 million in 2006 to $6.8 million in 2010. Over the same period, the number of faculty positions almost doubled. This has led not only to an increase in the number of publications in this field but also to an increase in quality. In contrast, Economics and Business, Materials Science, Computer Science, Psychiatry/Psychology and, to some extent, Engineering seem to be neglected research disciplines across LA countries.
In summary, scientific activity has been growing in LA countries during the last decade, but not at a pace that allowed it to catch up with the rest of the world. Only four countries show productivity levels closer to the world averages. Co-publications with international institutions are frequent and highly relevant for the scientific impact (quality) of smaller scientific systems. On the other hand, collaborations with industry are scarce even when research specialization seems to be influenced by economic specialization.
6 The Essential Science Indicators schema (Thomson Reuters) comprises 22 subject areas in science and social sciences and is based on journal assignments. Arts and Humanities journals are not included. Each indexed journal (11,000+) is found in only one of the 22 subject areas and there is no overlap between categories.
7 If a country has a scientific output structure equal to the world, the value of the indicator will be zero. The size of SII is an indication of how strongly each country is specialized.
In what follows, we will focus on the study of five main scientific fields: Agricultural, Engineering, Environmental, Geosciences, and Plant and Animal Sciences. We choose Agricultural, Geosciences, and Plant and Animal Sciences as they are closely related to the natural resources-based economic activities in which LA countries are more intensive. We also include Engineering and Environmental Sciences because we assume that this type of knowledge needs to be consistently applied across the main economic activities of the countries analyzed.

Network analysis
The data requirements for extracting the scientific networks explained in Sect. 3.3 only allow us to include in our analysis RDs from the following LA countries: Argentina, Brazil, Chile, Colombia, Costa Rica, Cuba, Mexico, Panama, Peru, Uruguay, and Venezuela. Table 3 gives some network summary statistics from the five scientific fields that we are analyzing. Network graphs are available in the "Appendix" ("Network graphs of all scientific areas in 2004-2008 and 2009-2013" section).
The 55% growth of LA scientific production between 2004-2008 and 2009-2013 is roughly proportional to the increase in the number of RDs (nodes) in all subject areas (networks). In both periods, the average path length is less than 4, implying that knowledge that is created in one node has the potential to be diffused in few steps to the rest of the network.
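To make the average path length statistic concrete, the following is a minimal sketch that computes it by breadth-first search on an unweighted, undirected toy network. The adjacency list and node names are hypothetical, and averaging only over reachable pairs is an assumption of this sketch rather than a documented choice of the analysis.

```python
from collections import deque


def average_path_length(adjacency):
    """Mean shortest-path length (in links) over all reachable ordered pairs of nodes."""
    total_distance = 0
    pair_count = 0
    for source in adjacency:
        # Breadth-first search gives shortest distances in an unweighted graph.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        for target, d in distance.items():
            if target != source:
                total_distance += d
                pair_count += 1
    return total_distance / pair_count if pair_count else 0.0


# Hypothetical chain of four RDs: A - B - C - D
toy_network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
toy_apl = average_path_length(toy_network)
```

In this chain the six pairs are 1, 2, 3, 1, 2 and 1 links apart, so the average is 10/6, or roughly 1.67 steps.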
Interestingly, the change in the number of communities 8 does not follow a common trend. Engineering, Plant and Animal, and to some extent Environmental Sciences show a remarkable increase in the number of RDs in the LA network. However, there are limited changes in the number of communities, 9 suggesting that newcomers were rapidly attached to well-established groups of collaborators. On the other hand, Agricultural Sciences and Geosciences at least double the number of knowledge communities. This can be interpreted as evolving networks creating new niches of knowledge, either with new local actors or through increasing diversification of knowledge sources via new international collaborations. Nevertheless, it should be borne in mind that geographical proximity may also be playing a role in the creation and evolution of these research communities.
Short average path lengths together with an increasing number of communities are signals that a network is evolving towards a structure that facilitates both knowledge creation and knowledge diffusion. However, attention needs to be paid to the fact that in almost all scientific fields studied it is common to observe that two neighboring countries (in geographic terms) are only connected to each other through a RD that is based in a third country. Even when this situation gives potential brokerage power to the external RD, it is not clear what the impact is for the performance of LA scientific networks. Clearly, this is a topic that requires further research.
We also found that the increases in the number of RDs between both periods are not reflected in significant changes in the percentage of RDs collaborating with industry. Therefore, we can assume that the share of RDs collaborating with industry among the newcomers is roughly the same as the proportion of RDs connected to industry in the previous period. On the other hand, the average share of co-publications with industry per RD fell in all scientific fields, except for Engineering. This suggests that the new RDs that co-publish with industry are doing so less intensively than the average RD of the previous period.
As we mentioned before, for the econometric implementation we also estimate centrality measures for local/national networks. For each country, these networks are formed by all elite national RDs (same threshold defined before) and their research partners. Foreign institutions are also included in the network. However, those that have collaborations with only one local RD are considered peripheral and are subsequently dropped from the network. After application of these filters, we are left with data only from Argentina, Brazil, Chile, and Mexico. 10
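The filter on peripheral foreign institutions can be sketched as follows. This is only one reading of the rule above: a foreign node is kept when it collaborates with at least two distinct local RDs. The function name, the edge-list representation, and the toy node labels are assumptions made for illustration.

```python
def drop_peripheral_foreign(edges, local_rds):
    """Keep local RDs and foreign partners linked to at least two distinct local RDs.

    edges: list of (u, v) co-publication links; local_rds: set of national elite RDs.
    """
    local_partners = {}
    for u, v in edges:
        # Record, for every foreign node, the set of local RDs it co-publishes with.
        for foreign, other in ((u, v), (v, u)):
            if foreign not in local_rds and other in local_rds:
                local_partners.setdefault(foreign, set()).add(other)

    kept_nodes = set(local_rds) | {
        node for node, partners in local_partners.items() if len(partners) >= 2
    }
    # An edge survives only if both of its endpoints survive.
    return [(u, v) for u, v in edges if u in kept_nodes and v in kept_nodes]


# Hypothetical national network: L1 and L2 are local RDs, F1 and F2 are foreign.
toy_edges = [("L1", "L2"), ("L1", "F1"), ("L2", "F1"), ("L1", "F2")]
filtered_edges = drop_peripheral_foreign(toy_edges, {"L1", "L2"})
```

Here F2 is linked to a single local RD and is dropped, while F1 connects two local RDs and stays in the national network.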

Econometric analysis
In this section, we present the results of the estimation of the Cragg (1971) model described in Sect. 3.4, run using the user-developed craggit routine in the Stata software. 11 We pooled data from the LA scientific networks presented in Sect. 5. After applying the data requirements to the nodes gathered from these networks and dropping outliers on the outcome variable, 12 we end up with a database of 324 observations from four LA countries (Argentina, Brazil, Chile, and Mexico) in the five selected scientific topics.
10 We could get the national network of Venezuela, but data requirements for the econometric estimations left these observations out of the final dataset.
11 As a consistency check, we also estimated a two-step Heckman selection model, using the same software. Those results are available in the appendix.
12 We define as an outlier a RD for which the outcome variable is more than three standard deviations above/below the mean.
Table 4 summarizes descriptive statistics of the main variables used in the econometric analysis. Overall, 70% of the RDs in the sample did not have a single publication coauthored with industry in the period 2009-2013, a lower share than the average of 78% in the period 2004-2008. The average RD published approximately 179 papers between 2004 and 2008. In the same period, the indexes of quality of publications show that these publications tend to underperform in relation to the rest of the world, having 21% fewer citations (citation index of 0.79) than the average paper in the same field. Furthermore, only 6.1% of the publications of these RDs are in the top 10% of their field.
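The outlier rule of footnote 12 can be sketched in a few lines. This sketch assumes the population standard deviation (the text does not say whether the sample or population version was used), and the toy outcome values are hypothetical.

```python
from statistics import mean, pstdev


def drop_outliers(values, k=3.0):
    """Keep observations within k standard deviations of the mean."""
    mu = mean(values)
    sd = pstdev(values)  # population SD: an assumption of this sketch
    if sd == 0:
        return list(values)
    return [v for v in values if abs(v - mu) <= k * sd]


# Hypothetical outcome variable: percentage of co-publications with industry per RD.
toy_shares = [0.0] * 15 + [1.0, 2.0, 1.5, 0.5] + [40.0]
kept_shares = drop_outliers(toy_shares)
```

With these toy values the mean is 2.25 and the standard deviation is about 8.68, so the observation at 40.0 lies more than four standard deviations above the mean and is excluded.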
Brazil concentrates almost 50% of the RDs considered in this sample. Argentina, Chile, and Mexico account for the other half of the observations. Plant and Animal Sciences account for 27% of the RDs considered here. Agricultural, Engineering, and Environmental Sciences represent roughly 20% each, while Geosciences accounts for the remaining 14% of the cases. Most of the RDs in the sample are part of universities (86%), and the remaining 14% belong to research institutes or government agencies.
Table 5 shows findings regarding the determinants of participation in co-publication with industry and of its intensity. As expected, past collaborations with industry are revealed as a strong predictor of collaborations in the subsequent period. RDs with a higher percentage of co-publications with industry are more prone to keep engaging in these collaborations. Across the different specifications of the model, the sign and statistical significance of this effect remain. The positive impact of past collaborations in the intensity equation is not statistically significant in the craggit estimation. However, the Heckman model not only confirms the positive relation, but in this specification the coefficients are also significant.
Table 5 notes: 324 observations. Clustered errors at the institution level. Standard errors in parentheses. *Coefficient is statistically significant at the 10% level; **at the 5% level; ***at the 1% level; no asterisk means the coefficient is not statistically different from zero.
Interestingly, there is no clear relationship between academic quality and engagement in research with the private sector. The signs of the coefficients (positive for participation and negative for intensity) may suggest that higher academic quality favors participation in research with the industry, but for RDs with higher levels of citation impact, the collaboration intensity with industry is relatively smaller. One possible explanation for the latter is that RDs that produce highly cited research are mainly focused on academic research and not so much on generating linkages with industry.
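The participation and intensity equations discussed here correspond to the two tiers of a hurdle model. As a reminder, and only as a sketch (the exact specification is the one given in Sect. 3.4, which is not reproduced here), the truncated normal hurdle form usually associated with Cragg (1971) can be written as below. The symbols $x_{1i}$, $x_{2i}$, $\gamma$, $\beta$ and $\sigma$ are notation assumed for this illustration; $\Phi$ and $\phi$ are the standard normal distribution and density functions.

```latex
% Tier 1 (participation): probit for having any co-publication with industry
P(y_i > 0 \mid x_{1i}) = \Phi(x_{1i}\gamma)

% Tier 2 (intensity): normal density truncated at zero, defined for y_i > 0
f(y_i \mid y_i > 0, x_{2i}) =
  \frac{\phi\left((y_i - x_{2i}\beta)/\sigma\right)}{\sigma\,\Phi(x_{2i}\beta/\sigma)}

% Log-likelihood combining both tiers
\ln L = \sum_{i:\, y_i = 0} \ln\left[1 - \Phi(x_{1i}\gamma)\right]
      + \sum_{i:\, y_i > 0} \left\{ \ln \Phi(x_{1i}\gamma)
      + \ln \phi\left(\frac{y_i - x_{2i}\beta}{\sigma}\right)
      - \ln \sigma - \ln \Phi\left(\frac{x_{2i}\beta}{\sigma}\right) \right\}
```

Because the two tiers have separate coefficient vectors, the same covariate can have a positive sign for participation and a negative sign for intensity, which is the pattern described above for academic quality.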
Country dummies show that there are no significant differences at this level in the likelihood of RDs engaging in research collaboration with the private sector. However, the Brazilian RDs that do participate in industry collaborations are doing so more intensively than their counterparts in Argentina, Chile, and Mexico. Finally, Engineering is consistently the research field with the most collaboration with industry in both participation and intensity, a result that we expected given the applied orientation of engineering activities. The result in the intensity equation also holds for Geosciences, probably due to the importance of mining operations in the sample of countries included in our analysis.
The position of a RD in the LA scientific network does not seem to be related to its relationship with the private sector. Indeed, none of the network centrality measurements tested (degree, estimations 2 and 3; betweenness, estimations 4 and 5; and closeness, estimations 6 and 7) at the LA level show statistically significant coefficients. The lack of significance of these RD features in the LA-wide context contrasts with the results observed when we consider the node characteristics of the RDs in the national/local network. Indeed, our most important finding is that two of the centrality measurements of the national/local scientific networks (estimations 3 and 5) show positive and significant effects on the intensity of the collaboration with industry.
Although not relevant in the participation equation, RDs with higher values of local degree and betweenness engage more intensively in research activities with the private sector. These results suggest that the RDs that have, or have access to, a more diversified set of knowledge sources in their countries are more prone to engage intensively in research with industry. The mechanism behind this finding may be related to the fact that better-connected RDs can provide different strands of specialized knowledge that allow them to adequately tackle the type of challenges proposed by industry. At the same time, these RDs can provide benefits to firms by lowering the costs of screening other RDs for future partnerships, decreasing the risk of knowledge lock-in, attracting highly qualified researchers, and providing a more effective diffusion of the scientific challenges of the company.
Furthermore, nodes in brokerage positions (higher betweenness) are characterized by having a timing advantage. They are not only more likely to be first recipients of information from diverse groups but also occupy a privileged position from which they can assess the relevance of new information (Burt 2005). Therefore, in a competitive process in which timing is rewarded, a brokerage position within national borders may provide RDs with a crucial advantage for collaboration with industry. However, as previous research has also suggested (Liao and Phan 2016), since the local network measures are not significant in the participation equation, we should be careful when arguing that these two types of network positions will lead deterministically to more science-industry collaborations.
Larger RDs can cover a wider spectrum of scientific topics, and they also have more resources that could be used to establish relations with the private sector (e.g., TTOs), making them more prone to engage in collaboration with industry. Our results confirm this, showing that even after controlling for network centrality features, the size of the RD is revealed as a strong predictor of the likelihood of performing research with the private sector. On the other hand, the intensity of these collaborations decreases with the size of the RD. We are aware that the relation between size and centrality may raise a multicollinearity problem. The availability of more resources in larger RDs can also increase their centrality. However, we are confident that, theoretically, the two variables are not measuring the same characteristics of the RDs; that is, more co-publications do not necessarily imply more diversity in co-publication partners. Therefore, both size and centrality must be included in the estimations. Excluding either of them would give rise to a problem of omitted variables. Nevertheless, this potential problem needs to be considered when interpreting results.
In Table 6 (in the "Appendix") we present the results of the estimations of a two-step Heckman model for specifications 3, 5 and 7 of the model. Based on the results of the craggit estimations, we use the size of the RD as the exclusion variable, i.e., affecting the participation decision but not the intensity equation. Most of the results of the previous estimations hold. However, in this set of estimations, previous collaboration with industry increases not only the likelihood of participation in co-publication but also the intensity of these collaborations. An extra 1% of co-publications with industry in one period increases these activities by 0.35-0.45% in the next period. Despite this change, the degree and betweenness values of the RDs in their national scientific networks still have a positive effect on the intensity of collaborations with industry.

Conclusions
In this paper, we use a combination of bibliometric, social network and econometric techniques to increase the understanding of LA scientific systems and their relationship with the private sector. We studied recent trends in scientific output, the linkages that exist between RDs within and between LA countries, and RDs' collaboration activities with industry.
We found that the LA share of global scientific publications has increased at a higher rate since 1993, thus revealing a trend towards convergence with the world leading regions. This increase has been mainly driven by Brazil and has been most notable in subject areas such as Agricultural, and Plant and Animal Sciences. Moreover, when analyzing the relative scientific output normalized by GDP (Docs/GDP) and population (Docs/Pop), the results show that in the most recent years Chile, Uruguay, Argentina, and Brazil have levels of scientific productivity higher than the world average. Furthermore, specialization of scientific systems in LA tends to follow economic specialization, focusing on scientific fields related to natural resources. However, in the last decade, most LA countries have had an average industry collaboration percentage below 1%. This is a low number when compared to the rest of the world. There are differences between fields (Engineering and Geosciences show higher levels than other sciences) but in general, collaborations between science and industry, measured as co-publications, are scarce.
The growth of scientific production can also be appreciated by the increasing number of RDs embedded in the LA scientific networks. However, the structures of these networks are not evolving in the same way. We find preliminary evidence that suggests that LA Geosciences and Agricultural Sciences networks are evolving towards structures that facilitate both knowledge creation and diffusion. It is worth noting that collaborations between RDs of different LA countries remain low. In most of the fields studied, linkages between LA countries are scarce even when these countries tend to specialize in similar scientific fields. Understanding if this lack of integration between LA scientific institutions is harming potential gains of complementary knowledge is a matter of further research.
The main finding is that the RDs that have a more diverse set of knowledge sources within their scientific discipline are the ones that work more intensively with industry. Besides possessing different sources of complementary knowledge within the same discipline, which allow them to tackle private sector challenges more effectively, these RDs may be perceived by firms as having a higher reputation and stronger research capabilities. Furthermore, by being in brokerage positions, RDs are not only more likely to be early recipients of information from diverse groups but also occupy a privileged position from which they can assess the relevance of new information. This timing advantage may be a crucial element for collaboration with industry. Although these mechanisms are interesting, we cannot determine which of them dominates.
Complementing this analysis with qualitative approaches and primary data which consider other types of technology transfer activities and sources of funding for research would certainly improve the understanding of LA knowledge production, transfer and diffusion systems. Furthermore, focusing on the publication analysis of science-industry linkages at the level of technologies, rather than scientific fields, is a matter of further research.
where $P_{is}$ accounts for the number of publications in subject area s in country i, $P_i$ accounts for the total number of publications in that same country i, $P_s$ accounts for the total number of publications in subject area s worldwide, and finally $P$ accounts for the total number of publications in the world.
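Read together with the note to Table 1, which describes the RSI as the share of a country's papers in a given field relative to the share of world papers in that field, the four quantities defined here suggest the following ratio. This is a sketch of the untransformed ratio form only; it is an assumption that no further normalization is applied to it.

```latex
\mathrm{RSI}_{is} = \frac{P_{is} / P_i}{P_s / P}
```

Under this form, a value above 1 means that subject area s has a larger weight in country i than in the world as a whole.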
Specialization Intensity Index This measure provides a ratio whose numerator is the square of the difference between the specialization intensity of class s in country i and the specialization intensity of that class in the world, and whose denominator is the sum of the weighting of all subject areas in country i, with this ratio summed across all s subject areas. This Chi square of sectoral specialization is adapted from Laursen (2000) and provides a concentration measure that grows with the specialization intensity of a country.
Quality Citation Index This score calculates the mean citation rate of a country's set of publications in a specific subject area, period, and document type, divided by the mean citation rate of all publications in that subject area/period/document type.
Quality Top 10% Index This index shows the proportion of publications belonging to the top 10% most cited documents in a given subject category, year and publication type.
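The indicators described in this appendix can be illustrated with a small numerical sketch. It makes several assumptions that should be read as such: the RSI is taken as a plain ratio of shares, the Chi square measure uses the world share of each subject area in the denominator (the usual Chi square form, which may differ from the exact weighting used here), and all publication and citation counts are made up.

```python
def field_shares(counts):
    """Share of each subject area in a total publication count."""
    total = sum(counts.values())
    return {field: n / total for field, n in counts.items()}


def relative_specialization(country_counts, world_counts):
    """Country share of each field divided by the world share of that field."""
    country, world = field_shares(country_counts), field_shares(world_counts)
    return {f: country.get(f, 0.0) / world[f] for f in world}


def specialization_intensity(country_counts, world_counts):
    """Chi square distance between country and world field structures.

    Assumption: the world share is used as the denominator of each term.
    """
    country, world = field_shares(country_counts), field_shares(world_counts)
    return sum((country.get(f, 0.0) - world[f]) ** 2 / world[f] for f in world)


def citation_index(country_citations, world_mean_citations):
    """Mean citations of a country's papers relative to the world mean for the field."""
    return (sum(country_citations) / len(country_citations)) / world_mean_citations


# Hypothetical publication counts by subject area.
world_docs = {"Agricultural": 100, "Engineering": 300, "Other": 600}
country_docs = {"Agricultural": 30, "Engineering": 20, "Other": 50}

toy_rsi = relative_specialization(country_docs, world_docs)
toy_sii = specialization_intensity(country_docs, world_docs)
toy_citation_index = citation_index([4, 6, 2], world_mean_citations=5.0)
```

In this toy case the country publishes 30% of its papers in Agricultural Sciences against 10% for the world, so that field has a ratio of 3. A country whose field structure matches the world exactly has a Chi square value of zero, in line with the footnote on the SII.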

Centrality measures
Degree This measure of centrality accounts for the total number of links that a node has in a network. In the case of the networks that we are studying, it accounts for the total number of different research partners with whom each RD collaborates. RDs with a higher degree could be considered popular among their peers, enjoying benefits from reputation. Furthermore, they also hold what could be regarded as a more diversified set of research partners. However, maintaining links is usually a costly endeavor, so we would expect to find limits on the utility of acquiring new linkages. We use the normalized version of the indicator implemented by the igraph package of the R software. Formally:
$$C_D(i) = \frac{1}{n-1} \sum_{j \neq i} l(i,j)$$
where $l(i,j) = 1$ if there is an edge between i and j and $l(i,j) = 0$ otherwise, and n is the number of nodes of the network.
Betweenness This index accounts for the total number of shortest paths 13 in which a node is involved. Under the assumption that shortest paths are preferred in the diffusion of knowledge in a network, RDs with higher betweenness values may be connecting knowledge from two very distant RDs, broadening the scope of potential sources of information and allowing them to play the role of knowledge brokers. We use the normalized version of the indicator implemented by the igraph package of the R software. Formally:
$$C_B(i) = \frac{2}{(n-1)(n-2)} \sum_{j < k;\; j, k \neq i} \frac{g_{jk}(i)}{g_{jk}}$$
where n is the number of nodes of the network, $g_{jk}(i)$ is the number of shortest paths between j and k that pass through node i, and $g_{jk}$ is the total number of shortest paths between j and k.
Closeness This index is defined by the inverse of the average shortest path to all other nodes in the network. An RD with higher values of closeness would require less effort to reach any other source of information. At the same time, at least theoretically, it could access new knowledge more quickly than others. We use the normalized version of the indicator implemented by the igraph package of the R software. Formally:
$$C_C(i) = \frac{n-1}{\sum_{j \neq i} d(i,j)}$$
where n is the number of nodes of the network and $d(i,j)$ is the length of the shortest path between nodes i and j.
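The three normalized centrality measures can be illustrated with a self-contained sketch for an unweighted, undirected and connected network with at least three nodes. This is not the igraph implementation used in the analysis: it is a plain Python illustration (betweenness follows Brandes' accumulation scheme), and the star-shaped toy network is hypothetical.

```python
from collections import deque


def centralities(adjacency):
    """Normalized degree, betweenness and closeness for a connected undirected graph."""
    nodes = list(adjacency)
    n = len(nodes)  # assumed to be at least 3
    degree = {v: len(adjacency[v]) / (n - 1) for v in nodes}
    betweenness = {v: 0.0 for v in nodes}
    closeness = {}

    for source in nodes:
        # Breadth-first search recording distances, path counts and predecessors.
        distance = {source: 0}
        path_count = {v: 0 for v in nodes}
        path_count[source] = 1
        predecessors = {v: [] for v in nodes}
        visit_order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in adjacency[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)

        # Closeness: inverse of the average distance to the other nodes.
        closeness[source] = (n - 1) / sum(distance.values())

        # Accumulate each node's share of shortest paths starting at this source.
        dependency = {v: 0.0 for v in nodes}
        for w in reversed(visit_order):
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1 + dependency[w])
            if w != source:
                betweenness[w] += dependency[w]

    # Every unordered pair was counted from both ends, hence (n - 1)(n - 2)
    # rather than (n - 1)(n - 2) / 2 in the normalization.
    for v in nodes:
        betweenness[v] /= (n - 1) * (n - 2)
    return degree, betweenness, closeness


# Hypothetical star network: hub H collaborates with A, B and C.
star = {"H": ["A", "B", "C"], "A": ["H"], "B": ["H"], "C": ["H"]}
star_degree, star_betweenness, star_closeness = centralities(star)
```

In the star network the hub reaches the maximum value of 1 on all three measures, while each peripheral RD has a degree of 1/3, a betweenness of 0 and a closeness of 0.6, which shows how a broker position concentrates all shortest paths.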
Network graphs of all scientific areas in 2004-2008 and 2009-2013
13 The shortest path is the minimum distance, counted in links, between two nodes of a network.
14 Networks are visualized using the Fruchterman-Reingold algorithm. Circle nodes are LA RDs. Square nodes are non-LA RDs. Edge thickness represents the normalized number of co-publications.

Consistency check of econometric estimation
See Table 6. Coefficients reported are marginal effects. Bootstrapped clustered errors at the institution level (100 repetitions). *Coefficient is statistically significant at the 10% level; **at the 5% level; ***at the 1% level; no asterisk means the coefficient is not statistically different from zero.