Internationality at university level

University-level indicators of internationality have been compiled and presented. Eighteen universities from three countries (Hungary, Israel and Sweden) were selected as a sample. The “breadth” and “depth” of internationality were distinguished and characterized by bibliometric and webometric tools.


This paper is dedicated to the memory of Judit Bar-Ilan (1958-2019), an outstanding scholar and an inimitable friend and colleague.

Introduction
The term "international" prevails in the literature of scientometrics. A simple topic search in the Web of Science shows that about 20% of all papers with the keyword "bibliomet* or scientomet* or informet* or altmet*" also contain "international* or multinational* or cross-national*". This is about three times the number of scientometrics-related papers with the keyword "interdisciplin* or multidisciplin* or cross-disciplin*", another favored topic.
Yet attempts to measure internationality, whether within or outside the realm of scientometrics, are relatively sparse. This abstention certainly results from the lack of consensus about the meaning, let alone the method of measurement, of the concept of internationality. As one of the prominent researchers of the topic formulated: "The two central problems […] are: (1) Global under-representation as a result of the use of the literal definition of the word international in its minimal sense and, (2) Widespread usage of the term international without any evaluation of its degree with reference to appropriate measures or indices." (Buela-Casal et al. 2006). Buela-Casal and his coworkers (Buela-Casal et al. 2005) proposed a multicriterion "Internationality Index" to measure the internationality of journals. The method is scarcely generalizable for studying objects other than journals, and did not gain widespread use outside the Spanish psychology community.
A critical point in studying internationality is the distinction between its "breadth" and "depth", much as Zhao and Wei (2018) defined the breadth and depth of collaboration in general. Breadth, in the present case, roughly means the number of countries involved; depth is related to the extent of international links. As an extreme case, 90% of the publications of Andorra are published in international co-operation, that is, the extent (depth) of publication internationality is enormous, but practically all of these international papers have been published with Spain as co-operating partner, i.e., the breadth of internationality is, to say the least, moderate. On the other hand, international co-operation is involved in not more than 20% of US publications, but the range of co-operating partners is the widest possible.
In one of the first large-scale international co-authorship statistics, Frame and Carpenter (1979) used the most straightforward pure "depth" indicator: the proportion of internationally co-authored papers in the publication output of countries. In a follow-up study, Schubert and Braun (1980) confirmed the results of Frame and Carpenter, and also proposed a "Cooperation Index" taking into account the inherent correlation between the "foreign co-authorship ratio" and the total publication output of the countries. A pioneering attempt to map the co-authorship network of countries was also made. The complementary indicator of the "foreign co-authorship ratio", "domesticity", was later extended from co-authorship to reference and citation internationality as well (Glänzel and Schubert 2005).
In the original version of Buela-Casal's "Internationality Index", co-authorship internationality was taken into account through a "breadth" indicator (the number of countries from which the authors of the articles came). In a parallel paper (Perakakis et al. 2005) the use of the Gini Coefficient as a dispersion indicator was also considered.
Being a conservation biologist, Calver noticed the analogy between the problems of internationality and those of multispecies ecological populations (Calver et al. 2010). He and his coworkers proposed a set of indicators originating from ecology to characterize the 'richness', 'diversity' and 'evenness' of the country of origin of authors, editors, citations, etc. Recently, a similar system of indicators has been proposed to characterize, beyond internationality, interdisciplinarity as well (Calver et al. 2018).
In the present paper, internationality is considered at the level of universities. Statistics about international cooperation in individual universities are certainly included in various university reports and similar documents. We are not aware, however, that a targeted, comparative study on this topic has ever been published.

Distinctive features of studying internationality at university level
Since most of the existing scientometric studies of internationality concern journals, attention should be paid to the main similarities and differences between studying universities and journals.
1. Most of the universities have an inherent "nationality", while the majority of scientific journals are published by multinational publishers. Journals of national societies or, as a matter of fact, universities are rather the exception than the rule. The concept of "domesticity" has much more relevance for universities than for journals, whether co-authorship, references or citations are concerned.
2. There is no university counterpart of the editorial boards of journals. International governing bodies or supervisory committees may exist but their significance is usually negligible.
3. Universities usually have well-defined, unique internet domain names that make them eligible for webometric analysis. Journals' web pages are frequently hidden under the umbrella of the publisher's website.
Speaking about assessing universities, one cannot avoid encountering issues of university ranking. Indeed, except for the Shanghai (ARWU) ranking, all major university ranking systems [Times Higher Education (THE), Quacquarelli Symonds (QS), Leiden (CWTS)] include some kind of "internationality" indicator. These indicators are then, in a more or less transparent way, weighted into the final composite score. In this respect, it is very important to reiterate what Buela-Casal et al. (2006) emphasized in the context of journals: "It should be made clear that internationality per se is not to be equated with quality". Glänzel and Schubert (2001) found that while international co-authorship in chemistry, as a rule, results in publications with higher citation impact than purely domestic papers, some of the international co-operation links between countries are "hot", i.e., co-authored papers have higher impact than that of either of the two contributing countries, while others are "cool". This is even more true for universities. Whether and to what extent internationality is desirable in the activities of a university should be gauged by competent decision makers of the university. Well-targeted indicators may efficiently help them, but an uncompromising weighting scheme may cause more harm than benefit.

Universities under study
A total of 18 universities, six from each of three countries (Hungary, Israel and Sweden), were chosen for analysis. The universities were selected on the basis of their more-or-less unanimous top positions in the well-known university rankings (ARWU, CWTS, THE, QS). Table 1 lists the website addresses of the selected universities, as well as their WoS "Organization-Enhanced" labels.

Bibliometric indicators
Bibliometric indicators of the universities were determined on the basis of Clarivate's Web of Science (WoS) Core Collection database. Publications of the universities were retrieved by "Organization-Enhanced" search using their WoS label as search term (see Table 1). Items published in 2017-2018 and citations to them from the date of their publication up to the date of the analysis (September, 2019) were taken into consideration.

Publication and citation statistics
Basic publication and citation statistics were collected using the "Analyze Results" and "Create Citation Report" functions of the WoS and are given in Table 2.

"Depth" of internationality and related indicators
As mentioned above, the most straightforward indicator of the depth of internationality is the proportion of international publications and citations. A publication or a citation is considered international if any of its authors is from outside the home country of the university in question. A set of indicators characterizing the international publications and citations of the 18 universities are collected in Table 3. The last column (% Citation advantage of international publications) is the ratio of the mean citation rate per paper of the international publications (third data column of Table 3) to that of the university's overall value (fourth data column of Table 2).
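The two "depth" indicators described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Paper` record, the function name `depth_indicators` and the sample values are hypothetical and are not taken from Tables 2 or 3; the citation advantage is returned as a plain ratio, which may then be expressed as a percentage.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    countries: frozenset  # countries of all author affiliations (hypothetical field)
    citations: int        # citations received by the paper


def depth_indicators(papers, home):
    """Return (share of international papers, citation advantage ratio).

    A paper counts as international if any author country differs from `home`.
    Assumes at least one paper and at least one international paper.
    """
    international = [p for p in papers if any(c != home for c in p.countries)]
    share = len(international) / len(papers)
    mean_all = sum(p.citations for p in papers) / len(papers)
    mean_int = sum(p.citations for p in international) / len(international)
    return share, mean_int / mean_all


# Made-up records for an imaginary Hungarian university (not real data).
papers = [
    Paper(frozenset({"Hungary"}), 2),
    Paper(frozenset({"Hungary", "Austria"}), 6),
    Paper(frozenset({"Hungary"}), 0),
    Paper(frozenset({"Hungary", "Germany", "Sweden"}), 8),
]
share, advantage = depth_indicators(papers, "Hungary")
```

In this toy sample half of the papers are international, and their mean citation rate (7) is 1.75 times the overall mean (4).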

"Breadth" of internationality and related indicators
An apparently obvious measure of the breadth or scope of internationality is the number of countries involved. However simple it may sound, counting countries raises several theoretical and technical questions. As of 2019, the United Nations (UN) recognizes 195 sovereign countries. In addition, there are 61 dependent areas (dependencies) and 6 disputed territories. There is no general consensus on how many (and which) of them are to be taken into account in compiling statistics by countries. Web of Science apparently does not provide a full list of the countries covered, but as witnessed by the Country/Region option of the "Analyze Results" function, it uses a rather wide definition. In the present analysis, we tried to accommodate ourselves to the WoS practice. A notable exception is the United Kingdom, which is considered in this study a single country as opposed to the WoS practice of treating England, Scotland, Wales and Northern Ireland separately.
It has to be noted that the WoS Country/Region list contains a large number of erroneous entries. There are OCR errors (e.g., Sndia) and deficient or miscoded items (Sultanate, Tbilisi) among them; a number of items were assigned to United Kingdom, in spite of the above-mentioned general practice of the database. Most of these errors occurred only once or very few times, so only an insignificantly tiny fraction of the total items of the database was involved; nevertheless, the apparent number of countries increased significantly. Reliable country statistics can be made only after manually cleaning all occurring country codes; that was carefully done in our analysis.
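The effect of such cleaning on the country count can be illustrated with a small sketch. It is not the procedure actually used in the study, which was manual; the function names, the spelling variants in `UK_PARTS` and the correction of "Sndia" to "India" are assumptions made for the example only.

```python
# Labels merged into a single "United Kingdom" entry (spelling variants assumed).
UK_PARTS = {"england", "scotland", "wales", "north ireland", "northern ireland"}


def normalize_country(label, corrections):
    """Map a raw Country/Region label to a cleaned country name."""
    key = " ".join(label.split()).lower()  # collapse whitespace, ignore case
    if key in UK_PARTS:
        return "United Kingdom"
    # Manually compiled corrections take precedence over the raw label.
    return corrections.get(key, label.strip())


def count_countries(labels, corrections):
    """Number of distinct countries after cleaning."""
    return len({normalize_country(label, corrections) for label in labels})


# Hypothetical correction table; the mapping of "Sndia" is a guess for illustration.
corrections = {"sndia": "India"}
labels = ["ENGLAND", "Scotland", "Sndia", "India", "Sweden"]

raw_count = len(set(labels))
clean_count = count_countries(labels, corrections)
```

Five apparent "countries" in the raw labels reduce to three after cleaning, which shows how a handful of rare errors can inflate a breadth indicator based on simple counting.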
For a more refined characterization of the breadth of internationality, the Partnership Ability Index, PHI (Schubert 2012), can be used. PHI is a Hirsch-type indicator (Schubert and Schubert 2019) defined on the model of the h-index. In our case, a university has an international co-authorship Partnership Ability Index PHI if PHI is the greatest number of countries having at least PHI co-authored publications with the given university; a university has an international citedness Partnership Ability Index PHI if PHI is the greatest number of countries citing the given university in at least PHI publications.
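Since PHI is defined on the model of the h-index, it can be computed in the same way from the list of per-country counts. The following is a minimal sketch; the function name and the sample counts are hypothetical and do not come from Table 4.

```python
def partnership_ability_index(counts):
    """Largest k such that at least k countries have at least k joint items.

    `counts` holds, per partner country, the number of co-authored
    (or citing) publications.
    """
    ranked = sorted(counts, reverse=True)
    phi = 0
    for rank, value in enumerate(ranked, start=1):
        if value >= rank:
            phi = rank
        else:
            break  # counts are sorted, so no later rank can qualify
    return phi


# Made-up co-authorship counts by partner country.
coauthored = {"Germany": 12, "Austria": 7, "USA": 5, "Italy": 3, "Spain": 3, "Chile": 1}
phi = partnership_ability_index(coauthored.values())
```

Here three countries have at least three co-authored papers each, but there are not four countries with at least four, so PHI is 3. The same function applies to citedness when the counts are numbers of citing publications per country.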
Another aspect of the breadth of internationality is the average number of countries per paper. It should be noticed that in the case of co-authorship, the home country of the university is a contributing country by definition; therefore the indicator values concerning the effective partner countries are one less than the table values.
The indicators characterizing the breadth of internationality of the co-authorship and citedness patterns of the 18 universities are summarized in Table 4.

Country names behind the numbers
However telling indicators of internationality may be, more often than not the real substance is in the details, i.e., in the finer structure of international co-authorship and citation patterns. Table 6 in the "Appendix" contains, for each university under study, the top 20 countries contributing to the internationality of the institution in question either as collaborating partners or as sources of citations.
In the tables, next to the number of publications, a size-normalized "preference factor" is also indicated. These indicators were calculated according to Schubert and Glänzel (2006). The co-authorship preference factor is the quotient of the share of a country in all co-authorships of a university divided by the share of the same country in world total publications. Citation preference is defined in perfectly analogous way. A preference factor larger than 1 signifies that the country's contribution as collaborating partner (or as source of citations) is larger than it would be expected by its mere size.
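The definition of the preference factor translates directly into a quotient of two shares. The sketch below uses a hypothetical function name and invented counts; it is meant only to make the arithmetic explicit.

```python
def preference_factor(links_with_country, links_total,
                      world_papers_country, world_papers_total):
    """Share of a country among a university's links, relative to its world share.

    `links_*` are co-authorships (or citations) of the university;
    `world_papers_*` are publication counts in the world total.
    """
    share_in_university = links_with_country / links_total
    share_in_world = world_papers_country / world_papers_total
    return share_in_university / share_in_world


# Invented example: a country accounts for 50 of 1000 co-authorships (5%)
# while producing 20,000 of 2,000,000 world publications (1%).
pf = preference_factor(50, 1000, 20_000, 2_000_000)
```

In this example the factor is about 5, i.e., the country collaborates with the university roughly five times more than its size alone would suggest.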

The home country, contributing to all publications of its universities by definition, is omitted from the co-authorship lists. Its high contribution as citation source is not logically necessary, but in practice, "self-preference" is outstandingly high, as witnessed by the lists.

Webometric indicators
The term 'webometrics' was introduced into the literature by Almind and Ingwersen (1997). As they wrote: "While informetrics is the research into information in a broad sense and not only limited to scientific communication, the approach taken here will be called Webometrics, which covers research of all network-based communication using informetric or other quantitative measures". Despite the caveat, the term is used by the authors themselves and their followers mainly as a tool for the analysis of scientific information (Björneborn and Ingwersen 2001; Bar-Ilan 2008). In a wider context, the term 'web metrics' or 'web analysis' is used instead.
Webometrics has become an important actor in university evaluations since 2004, when the Madrid-based Cybermetrics Lab launched the www.webometrics.info website with regularly updated lists of indicators largely based on the web presence of the world's universities. The web-based indicators of Webometrics.info are those of 'presence', the size (number of pages) of the main web domain of the institution (based on Google), and 'visibility', the number of external links to the institution's webpages (based on the services Ahrefs and Majestic). No decomposition of visibility indicators by countries (sources of links) is given; therefore, the aspect of internationality is missing from this collection. In this study, a first attempt is made to use a webometric measure which, to our knowledge, has until now escaped the attention of scientometric analysts: the number and distribution of visitors of university websites. Certainly, visitor statistics are permanently tracked and recorded by the IT departments of universities, but no comparative analysis has been found in the literature.
The website Visitors Detective (www.visitorsdetective.com) was used as source. This Tel-Aviv based website is, in its own words, "an advanced website traffic estimator that offers an accurate report about the number of visitors to a website". Although no methodological details of its operation are published, the results provided seem to be stable and credible. For any website address, besides the overall current daily traffic, even the free version provides the traffic history for any period back to 1 year and, what is most important for the present study, the distribution of the traffic by source country. In spite of its obvious uncertainties and deficiencies, it seems to be an interesting experiment to use this tool for comparing the website traffic of the universities under study. Table 5 contains two basic statistics: the total number of daily visitors and the percentage share of non-domestic (international) visitors. The date of sampling was October 15, 2019.
For the same day, the country distribution of visitors is given in Table 7 (see "Appendix").

Discussion
The collection of data and indicators presented in Tables 1, 2, 3, 4, 5, 6, and 7 lends itself to analysis and interpretation from a myriad of aspects. The authors encourage the readers to find the perspective of their specific interest. Some examples will be given in the present section. Before considering some positive examples, an important warning should be stressed. However tempting it is, the mere possibility of ordering objects by some measurable feature is not sufficient ground to make, let alone publish, ranked lists, rankings. As we can read in an excellent recent book (Érdi 2020) on the topic: "We should keep in mind the lesson […]: ranking reflects a mixture of the reality and illusion of objectivity, and it is also subject to manipulation".
Of course, for well-defined, specific purposes comparisons and even rankings might be useful, but it must be kept in mind that no indicator conveys an inherent value judgement in itself. It is a matter of human decision whether a higher value of an indicator is favorable, unfavorable or neutral in a given respect. As mentioned earlier, only competent decision makers can decide what type and what extent of internationality is desirable at a university. Comparisons and rankings may help decisions only after the objectives have been clearly laid down.
Figure 1 plots the citation advantage of international publications against the percentage share of internationally co-authored publications (see data in Table 3), with the universities grouped by country to make country patterns visible. There is a statistically significant overall negative correlation between the two variables: the higher the percentage share of internationally co-authored publications, the smaller the advantage internationality means in citation rate. This is the consequence of the simple fact that, without exception, international publications receive higher citation rates than domestic ones: the larger their share, the closer the overall mean citation rate lies to that of the international publications. Sweden stands out in co-authorship internationality. Israel closely follows the overall trend. For the Hungarian universities (with the single exception of the BME), international co-authorship means an above-average driving power for reaching higher citation rates.
Figure 2 shows the relation between the co-authorship- and citation-based PHI and the h-index (see data in Tables 2 and 4). Both PHI indices are in significant positive correlation with the h-index, partly because of their size-dependence. Conspicuously, the points referring to Hungarian and Swedish universities are completely disjoint; those of Israel overlap with both.
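The negative relation between the share of internationally co-authored publications and their citation advantage follows from simple arithmetic, which the sketch below makes explicit. It assumes, as a simplification not made in the text, that international papers are cited r times as often as domestic ones and that r is the same for every university; the function name and the value r = 2 are illustrative only.

```python
def citation_advantage(p, r):
    """Mean citation rate of international papers relative to the overall mean.

    p: share of international papers (0 < p <= 1)
    r: mean citation rate of international papers divided by that of
       domestic papers (assumed > 1, and assumed equal for all universities)

    With domestic rate d and international rate r*d, the overall mean is
    p*r*d + (1 - p)*d, so the advantage is r / (p*r + 1 - p).
    """
    return r / (p * r + (1 - p))


shares = [0.3, 0.5, 0.7, 0.9]
advantages = [citation_advantage(p, r=2.0) for p in shares]
```

For any r larger than 1 the advantage decreases monotonically as the share p grows, and it tends to 1 (no advantage) when practically all papers are international.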

Internationality and citation rates
A more direct relation between internationality and citation rate is seen in Fig. 3. In accordance with the fact that internationality has a citation advantage for all universities under study, the average citation rate is positively correlated with the average number of collaborating countries per paper (see data in Tables 3 and 4). No country-specific behavior can be observed in this respect. The most probable reason for the separate position of the two Hungarian universities (ELTE and UniDeb) is their overly active participation in large mega-authored projects (particularly in CERN-based high energy particle physics research), which has a positive effect on both their international embeddedness and citation rate.

Preference factors
Apparently, the nature of co-authorship and citedness preference is substantially different. An institute has an active role in selecting its collaborating partners, while its role is largely passive in determining the source of citations to its publications. In fact, a non-negligible part of non-domestic ("international") citations comes from international collaborations of the university in question (actually, "institutional self-citations"), that is, the two kinds of international relations (co-authorship and citedness) are not completely independent.
Anyhow, the two kinds of preference factors are positively related, as seen in Fig. 4 (based on the data in Table 6). Obviously, besides the "self-citation" component, the preference is strongly influenced by other factors. This becomes clear if the "high end" of the preference plot is considered.
Countries with the highest preference to collaborate with Hungarian universities and to cite their publications are the Czech Republic and Austria, as well as Greece and Portugal. As to the first two, geography is a simple and evident explanation; the preference of the latter two appears to be the consequence of joint participation in the large "mega-authored" projects.
In the case of Sweden, the strongest preference is evinced by Norway, Denmark and Finland, doubtlessly for geocultural reasons. Countries with the highest preference toward Israel (at a much lower level than that found in the other two countries) are Switzerland and Austria. No straightforward explanation could be found for these connections, the nature of which might be revealed by a deeper, itemized analysis of co-authored and citing papers.
It is interesting to note that the relation between the citedness and co-authorship preference factors can be well approximated by a power function; actually, the best fit is very close to a square root law.
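A power-function fit of this kind is commonly obtained by linear regression on log-transformed values. The sketch below shows one such procedure; it is not necessarily the fitting method used for Fig. 4. It assumes that the citedness preference is modelled as a function of the co-authorship preference, and the data points are synthetic values constructed to follow an exact square root law, not values from Table 6.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b on the log-log scale; returns (a, b).

    All values must be positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx                 # exponent = slope of the log-log line
    a = math.exp(my - b * mx)     # prefactor from the intercept
    return a, b


# Synthetic illustration only: points lying exactly on y = sqrt(x).
coauthorship_pf = [1.0, 4.0, 9.0, 16.0]
citedness_pf = [math.sqrt(x) for x in coauthorship_pf]
a, b = fit_power_law(coauthorship_pf, citedness_pf)
```

For data following a square root law the fitted exponent b comes out close to 0.5; with real preference factors the scatter around the fitted line would also have to be examined.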

Disciplinary patterns
Universities, by definition, imply a kind of universality. Yet, some universities are more universal than others. The others are more or less oriented towards some specific areas of knowledge, such as physical and engineering sciences ("technical universities" in what follows) or medical and life sciences ("medical universities"). Among the selected universities of each of the three countries there is one "technical university" (BME, Technion and KTH) and one "medical university" (Semmelweis, Weizmann and Karolinska). One may wonder whether these specialized universities exhibit some specific pattern in their indicators of internationality.
The relation among three indicators, the average number of collaborating countries per publication, the percentage of international publications and the percentage citation advantage is shown in Fig. 5.
As witnessed by Fig. 5, disciplinary orientation does not seem to be directly related to the universities' internationality. The relatively low internationality of BME and the high internationality of the Weizmann Institute might be explicable by other factors.
Disciplinary differences in international collaboration and its citation impact certainly exist (see, e.g., Puuska et al. 2014), but how these differences are manifested at university level remains to be studied.

Validity of Glänzel's estimation for PHI
According to Glänzel's model (Glänzel 2006), PHI is in a simple relation with the number of papers, $n$, and the average number of countries per paper, $x$: $\mathrm{PHI} = c \cdot n^{1/3} \cdot x^{2/3}$, where $c$ is a positive constant of the order of 1 (see also Schubert 2012). The impressive agreement between estimated and observed values in Fig. 6 is a convincing proof that Glänzel's model is valid also in this case. Consequently, following the idea of Prathap (2010), a good estimation of PHI can be given from the number of papers and the average number of collaborating countries per paper without the detailed knowledge of the full distribution.
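The estimate can be evaluated directly from the two aggregate quantities. In the sketch below the function name is hypothetical, the constant is set to c = 1 purely as an assumption (the text states only that it is of the order of 1), and the input values are invented.

```python
def estimate_phi(n_papers, countries_per_paper, c=1.0):
    """Estimate PHI as c * n**(1/3) * x**(2/3).

    n_papers: number of papers (n)
    countries_per_paper: average number of countries per paper (x)
    c: model constant, assumed here to be 1
    """
    return c * n_papers ** (1 / 3) * countries_per_paper ** (2 / 3)


# Invented input: 1000 papers with two countries per paper on average.
estimate = estimate_phi(1000, 2.0)
```

With these inputs the estimate is close to 16. Because n enters with the exponent 1/3, a tenfold increase in the number of papers raises the estimate only by a factor of about 2.15.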

Webometric indicators
Webometric rankings of universities (as published at webometrics.info) show moderate similarity with bibliometrics-based rankings (Aguillo et al. 2010). A corresponding result was found when the basic bibliometric and webometric indicators of our sample, the number of publications and the number of visitors, were considered. As seen in Fig. 7, a moderately positive overall correlation can be found between the two indicators. Israeli universities seem to follow the overall trend quite closely, while Hungarian and Swedish data behave rather haphazardly.
When the indicators of internationality are considered, even this moderate correlation appears to vanish. The percentage share of internationally co-authored publications and the percentage share of international (non-domestic) visitors are completely uncorrelated (see Fig. 8). The three countries, nevertheless, exhibit some specific patterns: Swedish universities form a cluster of high international co-authorship and average international visitors; Israeli universities have a higher-than-average, Hungarian universities a lower-than-average share of international visitors, independently of co-authorship internationality.
The positive side of the apparent discrepancy between bibliometric and webometric indicators is that they can be interpreted as separate, even independent, dimensions of the impact of scientific research. The problem is that, while there are several decades of experience in interpreting bibliometric indicators (including cautionary warnings against traps and trickeries), very little is known about the motivations and driving forces of the phenomena reflected by webometric indicators, let alone their science policy significance.
As an example, one of the most striking features of the country distributions of the website visitors in Table 7 is the predominant presence of India and Pakistan among the top countries in practically all universities. Most probably this feature is related to the fact that most web design and search engine optimization services are located in these two countries, and their activity is reflected in this indicator. Whether this feature is desirable, undesirable or neutral should be decided by competent policy makers, and the use of this indicator in evaluation should be considered accordingly.
We close this discussion by reiterating that this section contains just a few of the countless possible analyses and interpretations, and the readers are encouraged to find the perspective of their specific interest. And the fundamental maxim remains: comparisons and rankings may help decisions only after the objectives have been clearly laid down.

Conclusions
1. Indicators of internationality used, e.g., for national comparisons or for journals can easily be adapted to the analysis of universities.
2. Internationally co-authored papers have a definite citation advantage over domestic ones.
3. The Partnership Ability Index is a useful measure of the "breadth" of internationality. Glänzel's model proved to be valid for this Hirsch-type indicator as well.
4. Co-authorship and citedness preference factors are correlated, and are determined by multiple influences, geographical closeness and joint participation in large international projects being the most decisive.
5. At university level (at least in the sample under study) disciplinary specialization does not seem to have an effect on internationality.
6. Visitor statistics are a possible alternative to webpage or backlink counts in constructing webometric indicators. Like other webometric indicators, they appear to be uncorrelated with bibliometric indicators, and, also like the others, they do not lend themselves to straightforward interpretation.