Introduction

The journal network and its role in collecting and communicating advances in science continue to be a source of debate and a challenge to understanding. The network illustrates Herbert Simon’s (1962) argument that systems are shaped in hierarchies in order to deal with complexity (Boyack et al. 2014). Journal structures provide order and improve the efficiency of the search for new information. Eugene Garfield enhanced this role by creating additional categories for the evaluation of journals (Bensman 2007).

Although citations are paper-specific (Waltman and van Eck 2012), Garfield (1972) constructed the Science Citation Index (SCI) and its derivatives (such as the Social Science Citation Index) at the journal level (Garfield 1971). By aggregating citations at that level, one obtains a systems view of the disciplines as they are linked to the subjects covered by the respective journals (Narin et al. 1972). The Institute for Scientific Information (ISI) developed a journal classification system—the so-called “Web-of-Science subject categories” (WC)—that is often used in scientometric evaluations. Three decades later, however, Pudovkin and Garfield (2002, p. 1113) stated that journals had been assigned to these categories by “subjective and heuristic methods” that did not sufficiently appreciate, or perhaps allow for the visibility of, the relatedness of journals across boundaries (Leydesdorff and Bornmann 2016). As boundaries are drawn to enhance efficiency, new developments, especially those that bring together disparate ideas in original ways (Uzzi et al. 2013), can be disadvantaged by a scheme that relies on incremental additions to the conventional subject categories (Rafols et al. 2012).

The categorization of research in terms of disciplines has often been commented upon in the history of science. For example, Bernal (1939, p. 78) noted that: “From their very nature there must be a certain amount of overlapping….” In 1972, the Organisation for Economic Co-operation and Development (OECD 1972) proposed a systematization of the distinctions between multi-, pluri-, inter-, and trans-disciplinarity as categories for research and higher education (Klein 2010; Stokols et al. 2003). “Multidisciplinary” is used for juxtaposing disciplinary/professional perspectives, which retain separate voices; “interdisciplinary” integrates disciplines; and “transdisciplinary” synthesizes disciplines into larger frameworks (Gibbons et al. 1994). We adopt these definitions in this study.

We return to a question that Leydesdorff and Rafols (2011) raised, but did not answer conclusively at the time, namely how to distinguish and, if possible, rank journals in terms of their “interdisciplinarity” in such a way as to identify where creative combinations are indicated. In that previous study we used the 8707 journals included in the Journal Citation Reports (JCR) 2008, and explored a number of measures of interdisciplinarity and diversity, also detailed in Wagner et al. (2011). In this study, we build on the statistical decomposition of the JCR 2015 data (11,365 journals) using VOSviewer (Leydesdorff et al. 2017). A statistical decomposition does not have to be semantically meaningful (Rafols and Leydesdorff 2009); the advantage of this approach is that the two problems—of decomposition and interdisciplinarity—can be separated analytically. Furthermore, we exploit advances made in recent years:

  1.

    At the time, we did not sufficiently distinguish between cosine-normalization of the data, more or less standard in the scientometric tradition (Ahlgren et al. 2003; Salton and McGill 1983), and the use of graph-analytical measures such as betweenness centrality that presume binary networks. The distance measures in the two topologies, however, are very different: graph-analytically, one can distinguish shortest paths in the network of relations; the vector space, in contrast, is spanned in terms of correlations—that is, including non-relations. Proximity can be expressed in this topology, for example, as a cosine value, and distance accordingly as (1-cosine); see the sketch following this list.

  2.

    Betweenness centrality—the relative number of times that a node is part of the shortest path (“geodesic”) between other nodes in a network—is an obvious candidate for the measurement of interdisciplinarity; it scored best for this purpose in the previous comparison. Leydesdorff and Rafols (2011, p. 93) noted that weighted betweenness could be further explored using the citation values of the links as the weights; but at that time, this concept was still under development (Brandes 2001; Newman 2004) and not yet implemented for larger-sized matrices. In the meantime, Brandes (2008) comprehensively discussed betweenness, and the measure for valued networks was implemented, for example, in the software package visone (available at http://visone.info).

  3.

    Diversity measures have been further developed into “true” diversity measures by Zhang et al. (2016); “true” diversity is scaled at the ratio level, so that one can consider percentage increases or decreases of diversity (Jost 2006; Rousseau et al. 2017, in preparation). Furthermore, Cassi et al. (2014) proposed that diversity can be decomposed into within-group and between-group diversity. Using a general approximation method for distributions, these authors have developed benchmarks for institutional interdisciplinarity (see also the further elaboration in Cassi et al. 2017; cf. Chiu and Chao 2014). However, we shall argue that one can decompose diversity in terms of the cell values (\(p_{i} p_{j} d_{ij}\)) because the measure is a summation. Differences among aggregates of these values can be tested for statistical significance using ANOVA with Bonferroni correction ex post (e.g., the Tukey test).

  4.

    The availability of virtually unlimited memory resources under 64-bit operating systems and the further development of software for network analysis (Gephi, ORA, Pajek, UCInet, visone, VOSviewer, etc.) enable us to address questions that were previously out of reach.
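To make the contrast in point 1 concrete, the following minimal sketch (in Python, using an invented toy citation matrix) computes the geodesic distance on the binarized network and the (1-cosine) distance in the vector space for the same pair of journals; the two topologies need not agree:

import numpy as np
import networkx as nx

# Toy asymmetrical journal-journal citation matrix (rows cite columns);
# invented numbers for illustration only.
C = np.array([
    [0, 8, 2, 0],
    [5, 0, 0, 1],
    [3, 0, 0, 0],
    [0, 2, 0, 0],
])

# Graph topology: binarize and take shortest-path (geodesic) lengths.
G = nx.from_numpy_array((C > 0).astype(int), create_using=nx.Graph)
geodesics = dict(nx.all_pairs_shortest_path_length(G))
print("geodesic d(2,3):", geodesics[2][3])  # 3 steps apart in the graph

# Vector-space topology: cosine over the citing (row) vectors; the
# correlation is shaped by non-relations (zeros) as well.
def cosine_distance(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return 1.0 if denom == 0 else 1.0 - np.dot(u, v) / denom

print("(1-cosine) d(2,3):", cosine_distance(C[2], C[3]))  # 1.0: orthogonal rows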

Interdisciplinarity

Interdisciplinarity has remained a fluid concept fulfilling various functions at the interfaces between political and scientific discourses (Wagner et al. 2011). Funding agencies and policy makers call for interdisciplinarity from a normative perspective, based upon their expectation that boundary-spanning produces creative outputs and can contribute to solving practical problems. For example, in 2015 Nature devoted a special issue to interdisciplinarity, stating in the editorial that scientists and social scientists “must work together … to solve grand challenges facing society—energy, water, climate, food, health.” On this occasion, van Noorden (2015) collected a number of indicators of interdisciplinarity showing a mixed, albeit optimistic, picture; interdisciplinary research, by some measures, has been on the rise since the 1980s. According to this study, interdisciplinary research has long-term impact (> 10 years) more frequently than disciplinary research (p. 306). Asian countries were shown to publish interdisciplinary papers more frequently than western countries (p. 307).

On the basis of a topic model, Nichols (2014, p. 747) concluded that 89% of the portfolio of the Directorate for Social, Behavioral, and Economic Sciences (SBE) of the US National Science Foundation is “comprised of IDR (interdisciplinary research)—with 55% of the portfolio identified as having external interdisciplinarity and 34% of the portfolio comprised of awards with internal interdisciplinarity. When dollar amounts are taken into account, 93% of this portfolio is comprised of IDR (…).” Although this result may be partly an effect of the methods used (Leydesdorff and Nerghes 2017), these impressive percentages show, in our opinion, the responsiveness of (social) scientists to calls for interdisciplinarity by funding agencies.

Is this commitment also reflected in the output of research? Do scientists relabel their research for the purpose of obtaining funds (Mutz et al. 2015)? On the output side, the journal literature can be considered as a selection environment at the global level. However, the journal literature has recently witnessed important changes in its orientation toward “interdisciplinarity.” Using a new business model, PLOS ONE was introduced in 2006 with the objective of covering research from all fields of science without disciplinary criteria. “PLOS ONE only verifies whether experiments and data analysis were conducted rigorously, and leaves it to the scientific community to ascertain importance, post publication, through debate and comment” (https://en.wikipedia.org/wiki/PLOS_ONE; MacCallum 2006, 2011). Although this model is multi-disciplinary, it creates room for the evolution of new standards at the edges of existing disciplines.

It remains difficult to define interdisciplinarity when disciplines cannot be demarcated clearly. Most if not all of science is a process of seeking diverse inputs in order to create innovative insights. Labels classifying the results of research as “chemistry” or “physics” are added afterwards. Yet, such classifications structure expectations, behavior, and action. A physicist hired in a medical faculty, for example, has to fulfill a different range of expectations and thus faces another range of options than his/her colleague in a physics department.

The journal literature is mostly structured in terms of specialties because its main function has been to control quality, particularly in the case of specialized contributions. The launch of a new journal and its incorporation into the quality control system of the relevant neighboring journals and databases (including the bibliometric ones) provide practicing scientists with new options. The emergence of a new specialty is often associated with the clustering of journals supporting new developments at the field level (e.g., Leydesdorff and Goldstone 2014; van den Besselaar and Leydesdorff 1996). However, the demarcation of disciplines in terms of journals has remained a major problem. As noted, one uses WCs in scientometrics as a proxy, but this generates error (Leydesdorff 2006).

Interdisciplinarity can also be considered as a variable; neither journals nor departments are mono-disciplinary. The system is operational and therefore in flux. But how could one measure the interdisciplinarity of a journal, a department, or even an individual scholar “while a storm is raging at sea” (Neurath 1932/1933)? Is physical chemistry more or less ‘interdisciplinary’ than biochemistry? Or are both parts of chemistry? Does it matter when a laboratory for biochemistry is relabeled as molecular biology, and thereafter classified as biology?

When interviewed, for example, physicists who witnessed the emergence of nanotechnology and nanoscience during the 1990s considered the nano domain as just another domain in physics, whereas materials scientists experienced this same development as revolutionary (Wagner et al. 2015). A new set of research questions became possible and new journals emerged at relevant interfaces, while existing journals changed their orientations; for example, in terms of what is admissible as a contribution. The materials scientists considered nanotechnology and nanoscience a new discipline, while the physicists did not. As Klein (2010) states: “It is two disciplines, one might say, divided by a common subject” (p. 79).

Unlike hierarchical classifications, a network representation of the relations among disciplines and specialties provides room for operational definitions of interdisciplinarity and measurement. Dense areas in the networks can overlap into areas that are less dense; new densities can emerge in the less dense areas because of recursive interactions; densities at interfaces can be approached from different angles, and then other characteristics may prevail in the perception and hence categorization.

In this study, these larger questions about the (inter)disciplinary dynamics of science are reduced to the seemingly trivial question of measuring the interdisciplinarity of scholarly journals in a specific year. Can we sharpen the instrument so that an operational definition and measurement of interdisciplinarity become feasible? Using the aggregated journal–journal citation network 2015 based on JCR data, we test two measures which have been suggested for measuring interdisciplinarity. By moving from the top level of “all of science” (11,000+ journals) to ten broad fields and then to lower levels of specialties, we hope to be able to say more about the quality of the instruments as well as about the problems of measuring interdisciplinarity. In other words, we entertain two research questions: one substantive, about measuring the interdisciplinarity of journals at different levels, and one methodological, about problems with this measurement.

The measurement instruments

The focus will be on two measures of interdisciplinarity: betweenness centrality and diversity.

Betweenness centrality

Betweenness centrality (Brandes 2001, 2008; Freeman 1978/1979) and its derivatives such as “structural holes” (Burt 2001) are readily available in software packages as measures for brokering roles between clusters. A high betweenness measure at the node level indicates that the node has a higher than average likelihood of being on the shortest path from one node to another. This position may enable the agent at the node to control the flow between other vertices (Brandes 2008, p. 137). Investigators with high betweenness are, for example, better positioned to relay (or withhold) information between research groups (Freeman 1977; Abbasi et al. 2012). They are advantaged in terms of search.

Algorithms for variants of betweenness centrality have notably been implemented in the software package visone. Among these variants is the possibility of using weighted networks (Freeman et al. 1991). The combination of betweenness centrality with the disparity notion in diversity studies—to be discussed below—is also the subject of ongoing research on Q- or Gefura measures (Flom et al. 2004; Rousseau and Zhang 2008; Guns and Rousseau 2015). However, this further extension is not studied here.

Diversity

Following a series of empirical studies by Alan Porter and his colleagues (Porter et al. 2006, 2007, 2008; Porter and Rafols 2009), on the one hand, and Stirling’s (2007) mathematical elaboration, on the other, Rafols and Meyer (2010) distinguished three aspects of interdisciplinarity: (1) variety, (2) balance, and (3) disparity. Variety can be measured, for example, as Shannon entropy. The participating disciplines in specific instances of interdisciplinarity can be assessed in terms of their balance: a balanced participation can be associated with interdisciplinarity, whereas an unbalanced one suggests a different relationship (Nijssen et al. 1998). In the extreme case, one discipline is enrolled by the other in a service relationship.

On the basis of animations of newly emerging journal structures, Leydesdorff and Schank (2008) showed that interdisciplinary developments occur often at specific interfaces between disciplines, but are initially presented as—and believed to be—interdisciplinary. From this perspective, interdisciplinarity can be associated with the idea of a pre-paradigmatic phase in the development of disciplines and specialties (van den Daele et al. 1979). New and initially interdisciplinary developments may crystallize into new disciplinary structures (van den Besselaar and Leydesdorff 1996) or they may dissipate as the core disciplines absorb the new concepts.

The measurement of “disparity” provides us with an ecological perspective: a collaboration between authors in biology and chemistry, for example, can be considered as less interdisciplinary in terms of disparity than one between authors in chemistry and anthropology. The cognitive distance between the latter two disciplines—a natural science and a social science, respectively—is much larger than that between two neighboring fields in the natural sciences (Boschma 2005). The disparity thus reflects a next-order structure in terms of ecological distances and niches among journal sets.

Disparity and variety can be combined in the noted measures of diversity (Rao 1982; Stirling 2007) as follows:

$$\Delta = \sum_{i \ne j} p_{i} p_{j} d_{ij}$$
(1)

In this formula, i and j represent different categories; \(p_i\) represents the relative frequency or probability of category i, and \(d_{ij}\) the distance between i and j. The distance, for example, can be the geodesic (that is, the shortest path) in a network or (1-cosine) in a vector space, using the cosine as a proximity measure (Ahlgren et al. 2003; Jaffe 1989; Salton and McGill 1983). The multiplication of the measures of distance and relative occupation has led to the characterization of this measure as “quadratic entropy” (e.g., Izsák and Papp 1995). Stirling (2007) suggests developing a further heuristic by weighting the two components; for example, by adding exponents. However, one then obtains a parameter space which is infinite (Ricotta and Szeidl 2006).
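As an illustration of Eq. 1, the following sketch computes Δ for an invented probability vector and distance matrix; the numbers are placeholders rather than JCR data:

import numpy as np

p = np.array([0.5, 0.3, 0.2])            # relative frequencies p_i
d = np.array([[0.0, 0.2, 0.9],            # distances d_ij, e.g., (1-cosine)
              [0.2, 0.0, 0.7],
              [0.9, 0.7, 0.0]])

# Eq. 1: sum of p_i * p_j * d_ij over all i != j; the diagonal
# contributes nothing because d_ii = 0.
delta = float(np.sum(np.outer(p, p) * d))
print(delta)  # 0.324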

The first part of Eq. 1 (that is, the measure of variety \(\sum_{i \ne j} p_{i} p_{j}\)) is also known as the Gini–Simpson diversity measure in biology or the Herfindahl–Hirschman index in economics (Leydesdorff 2015). Note that this term is measured at the level of a vector. Using a citation matrix, two different distance matrices can be constructed among the citing and cited vectors, respectively. In the citing dimension, Rao–Stirling diversity has been considered as a measure of integration in interdisciplinary research (Porter and Rafols 2009; Rousseau et al. 2017, in preparation; Wagner et al. 2011, p. 16). Variety and disparity are combined and integrated in a citing paper by the citing author(s). In the cited dimension, one measures diversity (in terms of variety and disparity) in the structures from which one cites. The structures operate as selection environments. Rousseau et al. (2017, in preparation) suggest that diversity in the cited dimension should be considered as diffusion: diffusion can be interdisciplinary to various extents.

Zhang et al. (2016) further developed Δ into \({}^{2}D^{3}\) as a “true” diversity measure; true diversity has the advantage that the measure is scaled so that a 20% higher value of \({}^{2}D^{3}\) indicates 20% more diversity (Jost 2006). Conveniently, the two measures relate monotonically as follows (Zhang et al. 2016, p. 1260, Eq. 6):

$${}^{2}D^{3} = 1/\left( {1 - \Delta } \right)$$
(2)

True diversity varies from one to infinity when Δ varies between zero and one. Note that these diversity measures do not include “balance” as the third element distinguished in the definition of interdisciplinarity by Rafols and Meyer (2010). One can envisage adding a third probability distribution (\(p_k\)) to Eq. 1 as a representation of the disciplinary contributions (Rafols 2014). Alternatively, “balance” can be operationalized using, for example, the Gini index.
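Continuing the sketch above, Eq. 2 converts Δ into true diversity; because \({}^{2}D^{3}\) is scaled at the ratio level, statements such as “20% more diverse” become meaningful:

def true_diversity(delta):
    # Eq. 2: increases from 1 toward infinity as Delta approaches 1.
    return 1.0 / (1.0 - delta)

print(true_diversity(0.324))  # approx. 1.48 for the toy example above
print(true_diversity(0.5))    # 2.0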

As noted, Cassi et al. (2014) developed a methodology for the decomposition of diversity into within-group and between-group diversity (see also the further elaboration in Cassi et al. 2017; cf. Chiu and Chao 2014). In our opinion, Eqs. 1 and 2 are valid for each subset, since the operation is a straightforward summation. Consequently, one can decompose diversity in terms of the cell values (\(p_{i} p_{j} d_{ij}\)). Differences among aggregated subsets can be tested using ANOVA with Bonferroni correction ex post (e.g., the Tukey test). For the exploration of this decomposition, we use the WoS Category Library and Information Science (86 journals in the JCR 2015) instead of the set of 62 journals categorized by Leydesdorff et al. (2017) into a single group on statistical grounds. The results are then easier to follow.
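The testing step can be sketched as follows, assuming that the cell values \(p_{i} p_{j} d_{ij}\) have already been collected per subgroup; the three arrays below are randomly generated stand-ins, and Tukey’s HSD (from statsmodels) serves as the post hoc test:

import numpy as np
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(0)
cells_a = rng.normal(0.004, 0.001, 200)  # cell values of subgroup A (invented)
cells_b = rng.normal(0.006, 0.001, 150)  # subgroup B
cells_c = rng.normal(0.004, 0.001, 180)  # subgroup C

# One-way ANOVA across the aggregates of cell values.
print(f_oneway(cells_a, cells_b, cells_c))

# Post hoc pairwise comparisons; Tukey's HSD controls the family-wise
# error rate ex post, analogously to a Bonferroni correction.
values = np.concatenate([cells_a, cells_b, cells_c])
labels = ["A"] * 200 + ["B"] * 150 + ["C"] * 180
print(pairwise_tukeyhsd(values, labels))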

Units of analysis

In addition to the various aspects of interdisciplinarity that can be distinguished, the choice of the system of reference will make a difference. Interdisciplinarity can be attributed to departments, journals, œuvres, emerging disciplines, etc. In science studies, it is customary to distinguish between the socially organized group level and the level of intellectually organized fields of science (Whitley 1984). The interdisciplinarity of a group (e.g., a department) can be important from the perspective of team science. The dynamics of interdisciplinarity at the field level are relatively autonomous (“self-organizing”; van den Daele and Weingart 1975).

For example, the interdisciplinary development of nanotechnology in the 1990s required contributions from chemistry (e.g., advanced ceramics), applied physics, and the materials sciences. A group in a chemistry faculty will be positioned for the challenge of participating in this new development differently from a group in physics. The group dynamics, in other words, can differ among groups and from the field dynamics. New fields of science may develop at the global level, whereas groups are localized. One can also consider the fields as the selection environments for groups or, more generally, for individual or institutional agency. Selection mechanisms can reflexively be anticipated.

Furthermore, one can attribute interdisciplinarity as a variable to units of analysis such as authors, groups, or texts at the nodes of networks, or to second-order units of analysis such as links. (Factor loadings, for example, are attributes of variables.) One can expect different dynamics at the first-order and second-order levels. Whereas interdisciplinarity can be a political or managerial objective in the case of first-order units (e.g., groups), interdisciplinarity attributed at the level of second-order units (e.g., fields) is largely beyond the control of decision makers or individual scientists. Second-order units can be rearranged and thus develop resilience against external steering.

Note that these distinctions are analytical: journals, for example, are organized in terms of their production process, but can be self-organizing in terms of their content to variable extents. The interdisciplinarity of a journal or a department is also determined by the sample and the level of granularity in the analysis. A journal, for example, may appear interdisciplinary in the context of a large set of journals; but when this set is decomposed, the interdisciplinarity may be lost since the borders are drawn differently. For example, important ties to other domains may be cut by decomposition. In sum, one unavoidably entertains a model when measuring “interdisciplinarity”; and by using this model, the concept is (re)constructed.

Methods

Data

We use the directed (asymmetrical and valued) 1-mode matrix among the 11,365 journals listed in the Science and Social Sciences Citation Index in 2015. Table 1 provides descriptive statistics of the largest component of 11,359 journals. (Six journals are not connected.)

Table 1 Network characteristics of the largest component of the matrix based on JCR 2015

In the first round of the decomposition, ten clusters were distinguished. These are listed in Table 2. At http://www.leydesdorff.net/jcr15/scope/index.htm the reader will find a hierarchical decomposition in terms of maps of science.

Table 2 Fields distinguished at the top level of JCR 2015

We pursue the analysis for the complete set (n = 11,359) and for the first cluster (n = 3274). Within this latter subset, 62 journals are classified as Library and Information Science (LIS) in the second round of decomposition. We use the LIS set as an example at the (next-lower) specialty level.

Statistics

As noted, we focus in this study on betweenness centrality and diversity as two main candidates for measuring interdisciplinarity in journal citation networks. The betweenness centrality (BC) of a vertex k is equal to the proportion of all the geodesics between pairs of vertices (\(g_{ij}\)) that include this vertex (\(g_{ijk}\); e.g., de Nooy et al. 2011, p. 151). The BC for a vertex k can formally be written as follows:

$$BC_{k} = \sum_{i} \sum_{j} \frac{g_{ijk}}{g_{ij}},\quad i \ne j \ne k$$
(3)

Freeman (1977) introduced several variants of this betweenness measure when proposing it. In their study of centrality in valued graphs, Freeman et al. (1991) further elaborated flow centrality, which includes all the independent paths contributing to BC in addition to the geodesics. In the meantime, a number of software programs for network analysis have adopted Brandes’ (2008) algorithm for valued graphs. We use Pajek and UCInet for non-valued graphs and visone for analyzing valued ones.
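As a minimal sketch of the two variants (not the Pajek/UCInet/visone workflows themselves), networkx also implements Brandes’ algorithm for binary and for valued graphs. Because the algorithm treats edge weights as distances, the citation frequencies of the toy graph (invented) are first inverted:

import networkx as nx

G = nx.DiGraph()
G.add_weighted_edges_from([
    ("A", "B", 50), ("B", "A", 30), ("B", "C", 2),
    ("C", "D", 40), ("A", "D", 1),
])

# Binary BC: the citation values are ignored.
bc_binary = nx.betweenness_centrality(G, weight=None)

# Valued BC: strong citation links should count as short distances.
for u, v, data in G.edges(data=True):
    data["distance"] = 1.0 / data["weight"]
bc_valued = nx.betweenness_centrality(G, weight="distance")

print(bc_binary)
print(bc_valued)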

While BC can be computed on an asymmetrical matrix, Rao–Stirling diversity and \({}^{2}D^{3}\) are evaluated along vectors in either the cited or citing direction of a citation matrix. Both the proportions and the distances have to be taken in the one direction or the other. One thus obtains two different—but most likely correlated—measures. The proportions are straightforward: relative to the sum of the references given by the journal (citing) or the citations received (cited). The distance measure, however, requires another parameter choice.

In line with the reasoning about BC, one could consider using geodesics as a measure of distance. However, the average geodesic in the network under study is 2.5 with a standard deviation of 0.6 (Table 1). In other words, the variation in the geodesic distances is small: most of them are 2 or 3. The choice of another distance measure—or equivalently (1-proximity)—provides us with a plethora of options. We chose (1-cosine) as the distance measure because Euclidean distances did not work in our previous project (Leydesdorff and Rafols 2011). The cosine has been used as a proximity measure in technology studies by Jaffe (1989); Ahlgren et al. (2003) suggested using the cosine (Salton and McGill 1983) as an alternative to the Pearson correlation in bibliometrics.

The computation of diversity is demanding because of the permutation of the i and j parameters along the vector in each case. A routine for generating diversity values on the basis of a Pajek file is provided in the Appendix. Note that \(p_i\) and \(p_j\) along a vector can both be larger than zero, while the cosine between the vectors i and j in the same direction may be zero. For example, if a journal n is cited by two marginal journals (i = 5 and j = 6 in Fig. 1), the co-occurrence in the vertical direction is larger than zero; but in the horizontal direction the cosine value can be zero and the distance therefore one. The cosine values between marginal journals may thus boost diversity as measured here. (Given the skew in scientometric distributions, one can expect relative marginality to prevail in any delineated domain.)
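The following sketch illustrates the citing-side computation under one possible set of choices: proportions along the rows of an invented citation matrix C (rows citing columns), and (1-cosine) distances between the row vectors. It is a simplified stand-in for the routine in the Appendix:

import numpy as np
from scipy.spatial.distance import pdist, squareform

C = np.array([
    [0, 12, 3, 0, 0, 0],
    [9,  0, 1, 0, 0, 0],
    [2,  2, 0, 5, 0, 0],
    [0,  0, 6, 0, 1, 1],
    [0,  0, 0, 2, 0, 0],
    [0,  0, 0, 1, 0, 0],
], dtype=float)

# (1-cosine) distances between the journals' citing profiles (rows).
d = squareform(pdist(C, metric="cosine"))

def citing_diversity(C, d, n):
    # Rao-Stirling diversity (Eq. 1) of journal n's references.
    p = C[n] / C[n].sum()                     # proportions of references
    return float(np.sum(np.outer(p, p) * d))  # d_ii = 0, so i = j drops out

for n in range(len(C)):
    print(n, round(citing_diversity(C, d, n), 3))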

Fig. 1 The computation of Rao–Stirling diversity

Results

The full set of 11,365 journals in JCR 2015

Table 3 lists the top-25 journals when ranked for betweenness, valued betweenness, and diversity measured as \({}^{2}D^{3}\) in both the citing and cited directions. Not surprisingly, PLOS ONE ranks highest on BC in both the binary and the valued case. The Pearson correlation between the two rankings across the file is larger than .99 (Table 4), but differences at the top of the list are sometimes considerable. PLOS ONE and, for example, Psychol Bull gain in score when BC is based on values, but Nature and Science lose.

Table 3 Top 25 journals in terms of various betweenness centrality and diversity measures
Table 4 Pearson and Spearman rank correlations of BC and \({}^{2}D^{3}\) among 11,359 journals (in the lower and upper triangles, respectively); all correlations are statistically significant at the level p < .01

Interestingly, Scientometrics ranks 11th on BC in the binary case, but only 15th using valued BC. Typically, this journal cites and is cited by journals in other fields incidentally and unsystematically, in addition to denser citation traffic in its own intellectual environment. Citations to and from Psychol Bull, in contrast, are more specific. Annu Rev Psychol and Psychol Rev show the same pattern as Psychol Bull of increasing BC when valued.

Most of the journals with high BC values are multi-disciplinary journals. In accordance with its definition, BC measures the extent to which the distance between otherwise potentially distant clusters is bridged. Note that some journals in the social sciences score high on BC, among which is Scientometrics. In our opinion, Scientometrics can be considered a specialist journal with a specific disciplinary orientation. As noted, however, its citing and cited patterns span different disciplines, because a variety of disciplines can be the subject of study and its indicators are used in other fields. In other words, BC does not teach us about the nature of the knowledge production process, but about patterns of integration and diffusion across disciplinary boundaries (Rousseau et al. 2017, in preparation).

Table 4 shows that BC and \({}^{2}D^{3}\) measure different things. Diversity in the citing direction is not correlated to BC. In the cited direction, the rank-order correlation is still substantial. This correlation can be explained as follows: the disparity factor (\(d_{ij}\)) indicates the distances that have to be bridged between different domains. The (multi-disciplinary) structure of science is reflected in both this distance and BC. However, variety [\(\sum_{i \ne j} p_{i} p_{j}\)]—as the second component of diversity—is based on a different principle. In the citing dimension, particularly, one may cite across disciplinary boundaries (“trans-disciplinarily”; Gibbons et al. 1994) and generate variety. This source of variation is also reflected in the cited dimension, since the cited patterns can be considered as the archive of a time-series of citing relations. Not incidentally, therefore, we find journals in the right-most column of Table 3 from the periphery, or with a specific national background, that may be problem- or sector-oriented (e.g., agriculture). Leydesdorff and Bihui (2005) found such a non-disciplinary orientation in the case of Chinese journals that are institutionally based.

The Journal of the Chinese Institute of Engineers (with \({}^{2}D^{3}\) = 20.13 at the top of this list), for example, was cited in 2015 in articles published in 56 journals, but it cites from 230 journals. It can therefore be considered a net importer of knowledge (Yan et al. 2013). Figure 2 shows this environment of 230 journals as a map based on aggregated citation relations. Using BC as the values for the nodes, Fig. 2a first shows the structure of the journals as a map. In Fig. 2b, diversity in the citing direction is used as the parameter for the node sizes. This brings engineering journals more to the fore than physics journals. The J Chin Inst Eng itself is not visible in Fig. 2a, but is most pronounced in Fig. 2b. Note that the journal is otherwise unremarkable: its 2-year Journal Impact Factor (JIF) is .246 and its 5-year JIF is .259.

Fig. 2 Citing patterns among the 230 journals cited by J Chin Inst Eng during 2015. Nodes are sized according to BC in (a) but according to diversity in the citing dimension in (b)

Journals in the social sciences

The largest subset of journals distinguished in the decomposition (Table 2) is a group of 3274 journals in the social sciences. We pursue the analysis for this subset in order to see whether the patterns found above can be considered general. In the next section, we zoom further into the subset of journals classified as LIS within this set.

Table 5 shows that Soc Sci Med is ranked highest in terms of BC in both the valued and the non-valued analysis. This journal was ranked in the third position in the full set—after PLOS ONE and PNAS. The rank of Scientometrics has now decreased from the 11th to the 21st position using non-valued BC, and from the 15th to the 31st position using valued BC. A large proportion of its betweenness lies in connecting social-science disciplines with the natural and medical sciences. These relations across disciplinary divides are cut by the decomposition (Table 6).

Table 5 Top 25 social-science journals (n = 3274) in terms of various betweenness centrality and diversity measures
Table 6 Pearson and Spearman rank correlations of BC and \({}^{2}D^{3}\) among 3274 journals (in the lower and upper triangles, respectively)

The pattern described above for the full set is also found in this subset. Two factors explain 78.6% of the variance in the four variables: one representing BC and the other diversity in the citing dimension (Table 7). The Pearson correlations between BC (binary and valued) and \({}^{2}D^{3}\) citing are .010 and .008, respectively. Note the negative sign of \({}^{2}D^{3}\) citing on factor 1 in Table 7. The two mechanisms thus stand orthogonal to each other.

Table 7 Varimax rotated factor solution for the four variables; n = 3264

The journals in the right-most column of Table 5 are recognizably trans-disciplinary or, in other words, reaching out across boundaries. On the cited side, the pronounced position of sociology journals is noteworthy. Major sociology journals such as Am J Sociol, Brit J Sociol, and Am Sociol Rev figure in this top list as they did in Table 3, but other sociology journals such as Soc Sci Inform also rank high on this list (#9), while the latter journal was ranked only at position 1363 in the total set.

In Fig. 3, we try to capture the differences visually. Figure 3a first provides a map of these 3274 journals. In Fig. 3b, the node sizes are proportional to the BC scores of the journals. One can see a shift to the applied side. For example, the Am Econ Rev comes to the foreground in the left-most cluster (pink) in Fig. 3a, while this most pronounced position is assumed by Appl Econ and World Dev in Fig. 3b. Similarly, J Pers Soc Psychol—the flagship journal of this field—is overshadowed by Psychol Bull in the top-right cluster (turquoise) of Fig. 3b. The latter journal is also read outside the specialty. The J Bus Ethics is most pronounced in terms of BC values among the business and management journals in the light-blue cluster at the top left.

Fig. 3 Comparison of the map for the social sciences (a) with one using BC (b) and one using \({}^{2}D^{3}\) values (c) for the sizes of the nodes (in VOSviewer)

In Fig. 3c, the node sizes are proportional to the diversity scores (citing). The picture teaches us that highly diverse journals are spread across the disciplines as variation. All disciplines have portfolios of journals of which some are more diverse than others.

Library and information sciences (LIS)

We first pursued the analysis using the 62 journals that were classified as LIS, but for reasons of presentation we use here the results of the analysis based on citations among the 86 journals classified as LIS in terms of the WC in the JCR 2015. Otherwise, the discussion about the differences between the two samples would lead us away from the objectives of this study (cf. Leydesdorff et al. 2017).

In Table 8, Scientometrics is the journal with the highest BC in both analyses. JASIST follows at only the 12th position, while one would expect the latter journal to be more integrative among the different subjects studied in LIS. In terms of knowledge integration indicated as diversity in the citing dimension, JASIST assumes the third position and Scientometrics trails in the 45th position. In the cited dimension, the diversity of Scientometrics is ranked 70th (among 86). Thus, the journal is cited in this environment much more specifically than in the larger context of all the journals included in the JCR, where it assumed the 339th and 6246th positions among 11,359 observations, respectively. In the latter case, the quantile values are 97.0 and 45.0, respectively, versus 47.7 and 7.0 in the smaller set of LIS journals (Table 10).

Table 8 Top 25 LIS journals in terms of various betweenness centrality and diversity measures
Table 9 Pearson and Spearman rank correlations of BC and \({}^{2}D^{3}\) among 86 LIS journals (in the lower and upper triangles, respectively)
Table 10 Scientometrics at three levels of aggregation

In this much smaller set, the diversity in the citing dimension is significantly correlated to BC (Table 9). In other words, citing behavior is more specific at the specialty level: the socio-cognitive structure of the field guides the variation. Table 10 shows the values of diversity for Scientometrics at the three levels of aggregation. Diversity is larger in the citing than in the cited dimension at the level of the full set. Restricting the set to the social sciences entails a loss of citations in the citing dimension more than in the cited one; as a consequence, diversity is larger in the cited than in the citing dimension at this level. Being at the edge of the LIS set, the journal cites other journals in this set more than it is cited by them.

In sum, diversity is dependent on the delineations of the sample in which it is measured.

Decomposition of the diversity

In a next step, we decompose the LIS set of 86 journals (Table 11). Three journals (Econtent, Restaurator, and Z Bibl Bibl) are not part of the largest component and are therefore not included in this decomposition. Using VOSviewer, six groups are distinguished, of which one contains only a single journal (Soc Sci Inform). Figure 4 shows this map. Mean diversity values with standard errors for the five remaining groups, decomposed as sub-matrices, are provided in Fig. 5.

Fig. 4 Clustering of the LIS set (n = 86) into five clusters using VOSviewer

Fig. 5 Average \({}^{2}D^{3}\) cited and citing for five subgroups of LIS journals (error bars with standard errors)

Based on the post hoc Tukey test, two homogeneous groups can be distinguished in the citing dimension: library science and bibliometrics with relatively high citing scores, on the one side, and the other three groups with lower scores, on the other. However, the distinction is not statistically significant. In the cited direction, the entire set is statistically homogeneous.

Table 11 Decomposition of the diversity in the LIS set

Within the subsets, however, the diversity scores are based on sub-matrices with corresponding cosine values. Table 12 provides the diversity values when all cell values are normalized in terms of the grand matrix. The difference between the total diversity and the sum of the within-group diversities is then by definition equal to the between-group diversity. Using the Tukey test with this design, between-group diversity is significantly different from diversity in all the subsets, with the exception of citing diversity in the subset of 19 journals labeled as “information science.” Both cited and citing, “information science” and “library science” are considered homogeneous with between-group diversity. The other three specialisms are considered significantly different.

Table 12 Between-group diversity in the LIS set

Note that the total diversity is generated in a matrix of 82 × 81 = 6642 cells, whereas the within-group diversity is generated in subsets which add up to 1648 cells (24.8%). In other words, diversity is concentrated in these groupings, since they generate in 24.8% of the cells (100 − 53.3 =) 46.7% and (100 − 58.1 =) 41.9% of the total diversity in the cited and citing directions, respectively.
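The bookkeeping behind this decomposition can be sketched as follows, assuming a full matrix of the cell values \(p_{i} p_{j} d_{ij}\) normalized on the grand matrix, and a cluster label for each journal; the numbers below are random placeholders:

import numpy as np

rng = np.random.default_rng(1)
n = 8
cells = rng.random((n, n)) * 0.001           # stand-ins for p_i * p_j * d_ij
np.fill_diagonal(cells, 0.0)                 # d_ii = 0
labels = np.array([0, 0, 0, 1, 1, 2, 2, 2])  # cluster per journal

total = cells.sum()
within = sum(cells[np.ix_(labels == g, labels == g)].sum()
             for g in np.unique(labels))
between = total - within  # by definition, since Eq. 1 is a summation
print(total, within, between)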

Diffusion and integration

Bibliographic coupling among diverse sources by a citing unit has been considered as integration (Wagner et al. 2011), whereas co-citation can conversely be considered as diffusion (Rousseau et al. 2017, in preparation). Using the concepts of integration and diffusion of knowledge for citing and cited diversity, respectively, one can directly draw diffusion and integration networks by extracting the k = 1 neighborhoods; for example, in Pajek. Figure 6a, b provide these networks for the journal Scientometrics in the LIS set (JCR 2015) as an example: 38 journals constitute the diffusion network (Fig. 6a) and 51 the knowledge integration network (Fig. 6b).
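In networkx, for example, the two k = 1 neighborhoods can be extracted as follows; the toy graph is invented, with edges pointing from the citing to the cited journal:

import networkx as nx

G = nx.DiGraph()  # an edge runs from the citing to the cited journal
G.add_edges_from([
    ("Scientometrics", "JASIST"), ("JASIST", "Scientometrics"),
    ("J Doc", "Scientometrics"), ("Scientometrics", "Res Policy"),
])

focus = "Scientometrics"

# Integration network: the focal journal plus the sources it cites.
integration = G.subgraph([focus] + list(G.successors(focus)))

# Diffusion network: the focal journal plus the journals citing it.
diffusion = G.subgraph([focus] + list(G.predecessors(focus)))

print(sorted(integration.nodes()))
print(sorted(diffusion.nodes()))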

Fig. 6 a 38 journals in the knowledge diffusion network of Scientometrics in the LIS set 2015; b 51 journals in the knowledge integration network of Scientometrics in the LIS set 2015

For example, Fig. 6b shows that the Journal of the American Society for Information Science and Technology plays a central role in the knowledge integration network. Articles in this journal are cited in Scientometrics; but only the Journal of the Association for Information Science and Technology—the current name of the same journal—is visible in the diffusion network (Fig. 6a). Note that this analysis was pursued within the LIS set of 86 journals. Other journals outside the LIS field (e.g., Research Policy) are also important in the citation environment of Scientometrics.

Conclusions and discussion

Using journals as units of analysis, we addressed the question of whether interdisciplinarity can be measured using betweenness centrality or diversity as indicators. We pursued these ideas in considerable detail. It seems to us, however, that the problem of measuring interdisciplinarity remains unsolved because of the fluidity of the term “interdisciplinarity”: the concept means different things in policy discourse and in science studies. From a scientometric perspective, interdisciplinarity is difficult to define when there is no operational definition of the disciplines. The latter, however, has remained an unsolved problem in bibliometrics.

Bibliometricians often use the WoS Subject Categories as a proxy for disciplines, but these categories are pragmatic (e.g., Pudovkin and Garfield 2002; Leydesdorff and Bornmann 2016; Rafols and Leydesdorff 2009). In this study, we build on the statistical decomposition of the JCR data using VOSviewer (Leydesdorff et al. 2017). The advantage of this approach is that the two problems—of decomposition and interdisciplinarity—are separated.

Our main conclusions are:

  • The analysis at different levels of aggregation teaches us that BC can be considered a measure of multi-disciplinarity more than of interdisciplinarity. Valued BC improves on binary BC because citation networks are valued; marginal links should not be considered equal to central ones.

  • Diversity in the citing dimension is very different from (and statistically independent of) BC: it can also indicate non- or trans-disciplinarity. In local and applied contexts, for example, the disciplinary origin of knowledge contributions may be irrelevant. In specialist contexts, however, citing diversity is coupled to the intellectual structures in the set(s) under study.

  • Diversity in the cited dimension may come closest to an understanding of interdisciplinarity as a trade-off between structural selection and stochastic variation.

  • Despite the absence of “balance”—the third element in Rafols and Meyer’s (2010) definition of interdisciplinarity—Rao-Stirling “diversity” is often used as an indicator of interdisciplinarity; but it remains only an indicator of diversity.

  • The bibliographic coupling by citation of diverse contributions in a citing article has been considered as knowledge integration (Wagner et al. 2011; Rousseau et al. 2017). Analogously, but with the opposite direction in the arrows, diversity in co-citation can be considered as diffusion across domains. Using an example, we have demonstrated how these concepts can be elaborated into integration and diffusion networks.

  • The sigma (Σ) in Eq. 1 makes it possible to distinguish between within-group and between-group diversity. In this respect, the diversity measure is as flexible as Shannon entropy measures (Theil 1972). Differences in diversity can be tested for statistical significance using Bonferroni correction ex post. Homogeneous and non-homogeneous (sub)sets can thus be distinguished.

In other words, the problems of measurement could be solved to the extent that a general routine for generating diversity scores from networks is provided (see the Appendix). However, the interpretation of diversity as interdisciplinarity remains the problem. Diversity is very sensitive to the delineation of the sample; but is this also the case for interdisciplinarity? Is interdisciplinarity an intrinsic characteristic, or can it only be defined (as more or less interdisciplinary) in relation to a distribution?

We focused on journals in this study, but our arguments are not journal-specific. Some units of analysis, such as universities, are almost by definition multi-disciplinary or non-disciplinary. Non-disciplinarity can also be called “trans-disciplinary” (Gibbons et al. 1994). However, the semantic proliferation of Greek and Latin prepositions—meta-disciplinary, epi-disciplinary, etc.—does not solve the problem of the operationalization of disciplinarity and then also of interdisciplinarity.

In summary, we conclude that multi-disciplinarity is a clear concept that can be operationalized. Knowledge integration and diffusion refer to diversity, but not necessarily to interdisciplinarity. Diversity can flexibly be measured, but the score is dependent on the system of reference. We submit that a conceptualization in terms of variation and selection may prove more fruitful. For example, one can easily understand that variation is generated when different sources are cited, but to consider this variation as interdisciplinary knowledge integration is at best metaphorical.

Given this state of the art, policy analysts seeking measures to assess interdisciplinarity can be advised to specify first the relevant contexts, such as journal sets, comparable departments, etc. Networks in these environments can be evaluated in terms of BC and diversity. The routine provided in the Appendix may serve for the latter purpose and network analysis programs can be used for measuring BC. (When the network can be measured at the interval scale, one is advised to use valued BC.) The arguments provided in this study may be helpful for the interpretation of the results; for example, by specifying methodological limitations.