Quantifying the higher-order influence of scientific publications

Franceschet, Massimo; Colavizza, Giovanni

doi:10.1007/s11192-020-03580-9

Open access
Published: 13 July 2020

Volume 125, pages 951–963, (2020)

Scientometrics


Abstract

Citation impact is commonly assessed using direct, first-order citation relations. We consider here instead the indirect influence of publications on new publications via citations. We present a novel method to quantify the higher-order citation influence of publications, considering both direct, or first-order, and indirect, or higher-order citations. In particular, we are interested in higher-order citation influence at the level of disciplines. We apply this method to the whole Web of Science data at the level of disciplines. We find that a significant amount of influence—42%—stems from higher-order citations. Furthermore, we show that higher-order citation influence is helpful to quantify and visualize citation flows among disciplines, and to assess their degree of interdisciplinarity.


Introduction

New knowledge builds on previous knowledge: this is a central tenet of science. A publication relies on previous publications and cites them to acknowledge this debt (Merton 1957). Although citations acknowledge direct influences, the extent of the influence of a publication can go beyond these first-order relations. The study of the influence of previous publications on new ones rests at the core of scientometrics. The visualization and quantification of such dependence has been termed “algorithmic historiography” by Garfield et al. (1964, 2003). A variety of tools have been developed for the purpose of facilitating such exploration (Chen 2006; van Eck and Waltman 2010, 2014; Marx et al. 2014; Thor et al. 2016). Furthermore, previous literature has investigated methods to trace the historical development of science using citations (Lucio-Arias and Leydesdorff 2008; Yi-Ning and Hsu 2016; Subelj et al. 2020) and text (Gerow et al. 2018; Jurgens et al. 2018; Soni et al. 2019). Our related goal here is to quantify citation influence, and thus give credit, beyond direct citations. In particular, we aim at understanding the interplay of first and higher-order influence across academic disciplines.

In this contribution we define higher-order citations as citation chains of arbitrary length among pairs of publications, and show how the higher-order citation matrix among disciplines can be computed in an iterative and efficient way. Our proposed method is related to the well-known PageRank algorithm (Brin and Page 1998; Franceschet 2011; Waltman and Yan 2014), but it is specifically focused on quantifying higher-order citation influence. We apply this novel definition to the Web of Science dataset between 2000 and 2016 inclusive (17,932,523 publications and 190,550,206 citations among them). We show that first-order (length 1) citations account for 58% of the whole higher-order citation flow, hence they miss a conspicuous part (42%) of citation information. Indeed, higher-order citations give a clearer picture of the relationships among disciplines (Klavans and Boyack 2009). Furthermore, we observe this added value by clustering disciplines into larger communities, finding disciplines that act as brokers among communities, and distinguishing between interdisciplinary and autarchic disciplines.

Methodology

Let $G = (V,E)$ be a citation network with n nodes V and m directed edges E. We assume the nodes represent publications. If publication i cites publication j, then $(i, j) \in E$. Normally, G is a Directed Acyclic Graph (DAG), because citations only go from more recent publications to older publications (see Note 1). A simple example is depicted in Fig. 1.

Let A be the adjacency matrix, so that $A_{ij} = 1$ whenever i cites j, that is $(i, j) \in E$, and $A_{ij} = 0$ otherwise. Let $d_i$ be the outdegree of node i, i.e., the number of publications referenced by publication i within the citation network G.

We then recursively define the dependence of publication i on publication j as the mean dependence of publications referenced by i on publication j:

$$\begin{aligned} P_{ij} = {\left\{ \begin{array}{ll} 1 &{}\quad \text {if } i = j,\\ 0 &{}\quad \text {if } i \ne j \text { and } d_i = 0,\\ \frac{1}{d_i} \sum _k A_{ik} P_{kj} &{}\quad \text {if } i \ne j \text { and } d_i > 0. \end{array}\right. } \end{aligned}$$

We say that $P_{ij}$ is the dependence of i on j; equivalently, it is the influence of j on i. Notice that the recursive equation always has a solution, since the recursion proceeds from each publication to the publications it cites, and the graph G is acyclic.
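The recursive definition can be sketched directly as a memoized function. This is a minimal illustration, not the authors' implementation. The edge list `REFS` is an assumption: it is a hypothetical 7-node DAG chosen to be consistent with the worked values reported for Fig. 1 in this section, since the figure itself is not reproduced here. The names `REFS` and `dependence` are illustrative.

```python
from fractions import Fraction
from functools import lru_cache

# Hypothetical 7-node DAG (publication -> publications it cites). It is inferred
# from the worked values in the text, not copied from the paper's figure.
REFS = {1: (2, 3, 4, 5), 2: (3, 6), 3: (6,), 4: (5, 7), 5: (7,), 6: (), 7: ()}


@lru_cache(maxsize=None)
def dependence(i, j):
    """P_ij: mean dependence on j of the publications referenced by i."""
    if i == j:
        return Fraction(1)
    refs = REFS[i]
    if not refs:  # d_i = 0: publication i cites nothing in the network
        return Fraction(0)
    return sum((dependence(k, j) for k in refs), Fraction(0)) / len(refs)


for j in sorted(REFS):
    print(f"P[1,{j}] = {dependence(1, j)}")
```

Exact rational arithmetic is used so that the results can be compared with hand-computed fractions. The recursion terminates because every call moves to a cited publication and the graph has no cycles.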

Let us label each edge (i, j) of the graph with the probability $1/d_i$ of going from i to j in a random walk on the graph. Given a path $\pi = k_1, k_2, \ldots , k_r$ on the graph, we define the likelihood of the path $\pi$ as

$$\begin{aligned} p(\pi ) = \prod _{i=1}^{r-1} \frac{1}{d_{k_i}}. \end{aligned}$$

The dependence $P_{ij}$, when $i \ne j$, is then the sum of likelihoods of all paths from i to j in the graph. In general:

The dependence $P_{ij}$ is large if there are numerous likely paths starting at i and ending in j.

For instance, with reference to the graph in Fig. 1, we have:

$$\begin{aligned} \begin{array}{l} P_{11} = 1 \\ P_{12} = P_{14} = \frac{1}{4} \\ P_{13} = P_{15} = \frac{1}{4} \frac{1}{2} + \frac{1}{4} \frac{1}{1} = \frac{3}{8} \\ P_{16} = P_{17} = \frac{1}{4} \frac{1}{2} + \frac{1}{4} \frac{1}{2} \frac{1}{1} + \frac{1}{4} \frac{1}{1} = \frac{1}{2} \end{array} \end{aligned}$$
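The path-sum reading of the dependence can be checked by enumerating paths explicitly. The sketch below assumes a hypothetical edge list `REFS` that is consistent with the values above (the figure is not reproduced here, so the exact edges are an assumption). The helper names `paths` and `likelihood` are illustrative.

```python
from fractions import Fraction

# Hypothetical 7-node DAG (publication -> publications it cites), inferred from
# the worked values in the text rather than taken from the paper's figure.
REFS = {1: (2, 3, 4, 5), 2: (3, 6), 3: (6,), 4: (5, 7), 5: (7,), 6: (), 7: ()}


def paths(i, j):
    """Yield every citation path from i to j as a tuple of nodes."""
    if i == j:
        yield (j,)
        return
    for k in REFS[i]:
        for tail in paths(k, j):
            yield (i,) + tail


def likelihood(path):
    """Product of 1/d over every node of the path except the last one."""
    p = Fraction(1)
    for node in path[:-1]:
        p /= len(REFS[node])
    return p


for path in paths(1, 6):
    print(path, likelihood(path))

total = sum(likelihood(path) for path in paths(1, 6))
print("P[1,6] =", total)
```

Enumerating paths is exponential in general, so this is only useful for checking small examples by hand.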

We can write this more compactly using matrix notation. Let D be a diagonal matrix such that $D_{ii} = \frac{1}{d_i}$ if $d_i > 0$ and $D_{ii} = 0$ otherwise. We can then write

$$\begin{aligned} P = DAP + I \end{aligned} \tag{1}$$

where I is the $n \times n$ identity matrix. We can solve for P and obtain

$$\begin{aligned} P = (I - DA)^{-1} \end{aligned}$$

Notice that, if we topologically sort the nodes in A (as done in Fig. 1), which is possible since G is a DAG, then both A and $I-DA$ are triangular matrices. In particular, the diagonal elements of $I-DA$ are equal to 1. Hence $\det (I-DA) = 1$, the matrix $I-DA$ is invertible and Eq. (1) has a solution, as noticed above. The inverse $P = (I - DA)^{-1}$ is also triangular.
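The triangular structure means that the system $(I - DA)P = I$ can be solved by back substitution, without a general matrix inversion. The following is a minimal sketch under two assumptions: the hypothetical edge list `REFS` (chosen to match the worked example, not taken from the figure), and node labels that are already in topological order, so that every publication cites only higher-numbered ones.

```python
from fractions import Fraction

# Hypothetical 7-node DAG (publication -> publications it cites); labels are
# assumed to be in topological order, so I - DA is upper triangular.
REFS = {1: (2, 3, 4, 5), 2: (3, 6), 3: (6,), 4: (5, 7), 5: (7,), 6: (), 7: ()}
n = len(REFS)

# DA: row i holds 1/d_i in the columns of the publications cited by i.
DA = [[Fraction(0)] * n for _ in range(n)]
for i, refs in REFS.items():
    for k in refs:
        DA[i - 1][k - 1] = Fraction(1, len(refs))

# Back substitution for (I - DA) P = I, last row first: row i of P equals the
# i-th unit vector plus the DA-weighted sum of rows k > i, already computed.
P = [[Fraction(0)] * n for _ in range(n)]
for i in reversed(range(n)):
    P[i][i] = Fraction(1)
    for k in range(i + 1, n):
        if DA[i][k]:
            for j in range(n):
                P[i][j] += DA[i][k] * P[k][j]

# Check the defining identity (I - DA) P = I entry by entry.
for i in range(n):
    for j in range(n):
        lhs = P[i][j] - sum(DA[i][k] * P[k][j] for k in range(n))
        assert lhs == (1 if i == j else 0)
```

The dense lists are used only to keep the example short; a real citation network would need a sparse representation.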

One can also iteratively compute P using the fact that:

$$\begin{aligned} P = \sum _{i=0}^{\infty } (DA)^i = \sum _{i=0}^{l} (DA)^i \end{aligned} \tag{2}$$

where $l \le n-1$ is the length of the longest path in the graph G and n is the number of nodes of G. The last equality holds because G is acyclic and thus $(DA)^i = 0$ for all $i > l$. We expect $l \ll n$. In particular, l is bounded by the number of distinct time instants in the dataset at its date granularity, because each citation must go strictly back in time. For instance, if the dataset covers 10 years and publication dates are given with a month granularity, then l is less than $12 \cdot 10 = 120$.
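The finite power series can be illustrated by accumulating powers of DA until one of them vanishes. This sketch again assumes the hypothetical edge list `REFS` used to mirror the worked example. The helper `matmul` is a plain dense product written for illustration only.

```python
from fractions import Fraction

# Hypothetical 7-node DAG (publication -> publications it cites).
REFS = {1: (2, 3, 4, 5), 2: (3, 6), 3: (6,), 4: (5, 7), 5: (7,), 6: (), 7: ()}
n = len(REFS)

DA = [[Fraction(0)] * n for _ in range(n)]
for i, refs in REFS.items():
    for k in refs:
        DA[i - 1][k - 1] = Fraction(1, len(refs))


def matmul(X, Y):
    """Plain triple-loop product of two n x n matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
P, term, l = identity, identity, 0
while True:
    term = matmul(DA, term)                 # next power of DA
    if not any(any(row) for row in term):   # zero matrix: no longer paths exist
        break
    P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    l += 1

print("longest path length l =", l)
print("P[1,6] =", P[0][5])
```

On this toy graph the loop stops after three non-zero powers, which matches the longest citation chain in the assumed edge list.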

Matrix $(DA)^i$ computes the dependence contribution of paths of length i in graph G. In particular, for $i = 1$, the matrix DA represents first-order citations, that is direct citations among publications. On the other hand, matrix $(DA)^i$ for $i > 1$, encodes higher-order citations, that is chains of citations of length i among publications.

Notice that $P_{ij} \ne 0$ if and only if there exists at least one path from i to j in graph G. Hence, matrix P has the same non-zero pattern as the adjacency matrix of the transitive closure of G. We thus expect P to be denser than A.

Discipline dependence

Instead of looking at the individual dependence of publication i on publication j, we are interested in disciplinary dependencies. In particular, we are interested in the dependence of a publication (or of a discipline) on a discipline.

Let us denote by $Q_{iv}$ the extent to which publication i belongs to discipline v; hence Q is an $n \times k$ matrix, where n is the number of publications and k is the number of disciplines. In the non-overlapping case, $Q_{iv} = 1$ if publication i belongs to discipline v and $Q_{iv} = 0$ otherwise. In the overlapping case, a publication can belong to multiple disciplines, thus $Q_{iv} > 0$ for possibly more than a single discipline v. In either case, we have $\sum _{v} Q_{iv} = 1$ and $Q_{iv} \ge 0$.

The dependence $R_{iv}$ of publication i on discipline v can then be defined as the sum of the dependencies of publication i on articles in v:

$$\begin{aligned} R_{iv} = \sum _j P_{ij} Q_{jv}, \end{aligned}$$

or, in matrix notation

$$\begin{aligned} R = P Q. \end{aligned}$$

Note that

$$\begin{aligned} \begin{array}{lcl} R &{} = &{} P Q \\ &{} = &{} (DAP + I) Q \\ &{} = &{} DAPQ + Q \\ &{} = &{} DAR + Q. \end{array} \end{aligned}$$

We can hence iteratively compute matrix R without materializing matrix P:

$$\begin{aligned} \left\{ \begin{array}{lcl} R^{(0)} &{} = &{} Q \\ R^{(i+1)} &{} = &{} DAR^{(i)} + Q \\ \end{array} \right. \end{aligned}$$

Notice that $R^{(i)} = \sum _{j=0}^{i} (DA)^j Q$ is the dependence contribution of citation paths of length up to i. Hence

$$\begin{aligned} R = \sum _{i=0}^{\infty } (DA)^i Q = \sum _{i=0}^{l} (DA)^i Q \end{aligned}$$

where l is the length of the longest path in the graph, so the iterative computation of R can stop after l steps. Although R can be as dense as P, it has size $n \times k$, which is more manageable than the $n \times n$ size of P, since we expect $k \ll n$.
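The iteration $R^{(i+1)} = DAR^{(i)} + Q$ can be sketched without ever building P. The example assumes the hypothetical edge list `REFS` and a hypothetical non-overlapping assignment `DISCIPLINE` of the seven publications to three disciplines. Both are illustrative choices, not data from the paper.

```python
from fractions import Fraction

# Hypothetical 7-node DAG (publication -> publications it cites).
REFS = {1: (2, 3, 4, 5), 2: (3, 6), 3: (6,), 4: (5, 7), 5: (7,), 6: (), 7: ()}
# Hypothetical non-overlapping partition: publication -> discipline column.
DISCIPLINE = {1: 0, 2: 1, 3: 1, 6: 1, 4: 2, 5: 2, 7: 2}
n, k = len(REFS), 3

Q = [[Fraction(int(DISCIPLINE[i] == v)) for v in range(k)]
     for i in range(1, n + 1)]


def step(R):
    """One update R <- DA R + Q, using citation lists instead of a dense DA."""
    new = [row[:] for row in Q]
    for i, refs in REFS.items():
        for c in refs:
            for v in range(k):
                new[i - 1][v] += R[c - 1][v] / len(refs)
    return new


R, steps = [row[:] for row in Q], 0
while True:
    nxt = step(R)
    if nxt == R:  # fixed point reached: R = DA R + Q
        break
    R, steps = nxt, steps + 1

print("updates needed:", steps)
print("R[1,*] =", [str(x) for x in R[0]])
```

Each update costs one pass over the citation list, and only the $n \times k$ matrix is kept in memory. Stopping at the first unchanged iterate is a convenience for the sketch; with a known l one can simply run l updates.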

As a particular case, the dependence $r_i$ of publication i on the whole network is $r_i = \sum _j P_{ij}$, that is, $r = Pe$, where e is the vector of all ones. We thus have that:

$$\begin{aligned} r = Pe = (DAP + I)e = DAPe + e = DAr + e. \end{aligned}$$

Recall that the PageRank of G, with damping factor $\alpha$ and exogenous vector $\beta$, is the vector x such that $x = \alpha DA x + \beta$ (Newman 2018). Hence, interestingly, the dependence vector r is also the PageRank of G with damping factor $\alpha = 1$ and exogenous vector $\beta = e$.
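The fixed-point equation $r = DAr + e$ can be iterated directly, as in the sketch below. The edge list `REFS` is the same hypothetical toy graph used to mirror the worked example. Convergence in finitely many steps relies on the graph being acyclic.

```python
from fractions import Fraction

# Hypothetical 7-node DAG (publication -> publications it cites).
REFS = {1: (2, 3, 4, 5), 2: (3, 6), 3: (6,), 4: (5, 7), 5: (7,), 6: (), 7: ()}

r = {i: Fraction(1) for i in REFS}  # start from the exogenous vector e
while True:
    nxt = {}
    for i, refs in REFS.items():
        # (DA r)_i is the mean of r over the publications cited by i.
        mean = sum(r[c] for c in refs) / len(refs) if refs else Fraction(0)
        nxt[i] = mean + 1
    if nxt == r:  # fixed point: r = DA r + e
        break
    r = nxt

for i in sorted(r):
    print(f"r[{i}] = {r[i]}")
```

Publications that cite nothing keep the value 1, and every other publication gets 1 plus the mean value of its references.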

One can also define the dependence $S_{uj}$ of discipline u on publication j as the sum of the dependence of publications in u on article j:

$$\begin{aligned} S_{uj} = \sum _i Q_{i u} P_{ij} , \end{aligned}$$

or, in matrix notation

$$\begin{aligned} S = Q^T P. \end{aligned}$$

Notice that since $P = (I - DA)^{-1}$, then $P(I-DA) = I$ and hence $P = PDA+I$. It follows that $S = SDA + Q^T$ and also S can be computed iteratively.

The dependence $F_{uv}$ of discipline u on discipline v is the sum of the dependence of papers in u on papers in v, that is:

$$\begin{aligned} F_{uv} = \sum _i Q_{iu} R_{iv} = \sum _i \sum _j Q_{iu} P_{ij} Q_{jv}, \end{aligned}$$

or, in matrix notation

$$\begin{aligned} F = Q^T R = Q^T P Q = S Q. \end{aligned}$$

We also define $F^{(i)} = Q^T R^{(i)}$, for $i \ge 0$, as the citation flow matrix for paths of length up to i. Notice that, for $i \ge 1$, $F^{(i)} - F^{(i-1)}$ is the citation flow matrix for paths of length equal to i.

Consider again the simple citation network depicted in Fig. 2, where nodes are partitioned into 3 disjoint disciplines. The light blue and green communities are closed worlds (autarchies), since they reference only within their own groups (their off-diagonal flows in matrix F are indeed 0). On the other hand, the red community is more interdisciplinary, since it references the other two groups outside its territory (its off-diagonal flow in matrix F is 2.25).
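The discipline-level flow matrix $F = Q^T P Q$ can be reproduced on a toy network as follows. Both the edge list `REFS` and the assignment of publications to the three colours in `DISCIPLINE` are assumptions: they were chosen so that the red group's off-diagonal flow equals the 2.25 quoted above, but the actual figure is not reproduced here.

```python
from fractions import Fraction
from functools import lru_cache

# Hypothetical 7-node DAG (publication -> publications it cites).
REFS = {1: (2, 3, 4, 5), 2: (3, 6), 3: (6,), 4: (5, 7), 5: (7,), 6: (), 7: ()}
# Hypothetical partition into three disjoint disciplines.
NAMES = ("red", "light blue", "green")
DISCIPLINE = {1: 0, 2: 1, 3: 1, 6: 1, 4: 2, 5: 2, 7: 2}


@lru_cache(maxsize=None)
def P(i, j):
    """Dependence of publication i on publication j (recursive definition)."""
    if i == j:
        return Fraction(1)
    refs = REFS[i]
    if not refs:
        return Fraction(0)
    return sum((P(c, j) for c in refs), Fraction(0)) / len(refs)


k = len(NAMES)
F = [[Fraction(0)] * k for _ in range(k)]
for i in REFS:
    for j in REFS:
        # With a 0/1 membership matrix Q, F_uv sums P_ij over i in u and j in v.
        F[DISCIPLINE[i]][DISCIPLINE[j]] += P(i, j)

for name, row in zip(NAMES, F):
    print(f"{name:>10}:", [str(x) for x in row])
```

Under these assumptions the light blue and green rows are zero off the diagonal, while the red row sends 9/8 to each of the other two groups.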

Case study

We applied our method to all publications from the CWTS in-house version of the Web of Science published between 2000 and 2016 inclusive. We consider a total of 17,932,523 publications and 190,550,206 citations among them, excluding 444,436 synchronous citations, which we discarded to guarantee that G is a DAG (see Note 2). The longest citation path in the dataset has length 29, which equals the maximum number of iterations needed for convergence. In what follows, we rely on the high-level aggregation of the journal-based classification of Web of Science, which represents 30 broad disciplines (see Table 2).

The contribution of higher-order citations

We start by assessing the contribution of first-order and higher-order citations to the citation flow among disciplines. Recall that partial flow matrix $F^{(i)}$ is the flow matrix for paths of length up to i, with total flow matrix $F = F^{(l)}$, where l is the length of the longest path in the citation graph. Let $M^{(i)} = F^{(i)} - F^{(i-1)}$ be the flow matrix for paths of length precisely i. The entry-wise matrix norm $||\cdot ||_1$ defined as $||M^{(i)}||_1 = \sum _{u,v} |M^{(i)}_{u,v}|$ is a measure of the total citation flow contained in matrix $M^{(i)}$. We also tested the Frobenius norm $||\cdot ||_2$ with similar outcomes.

We computed the norm of partial flow matrices $M^{(i)}$ relative to the norm of total flow matrix $F = F^{(l)}$, for $1 \le i \le l$. Results are shown in Fig. 3. First-order (direct) citations contribute 58% of the overall flow, hence higher-order citations contribute 42%, a significant share. In particular, the share of second-order (length 2) citations is 20%, that of third-order (length 3) citations is 12%, and that of fourth-order (length 4) citations is 6%. Longer citation paths account for about 4% of the flow. When we consider the top disciplines by flow contribution (Fig. 4), six of them account for 38% (out of 58%) of first-order flow, 13% (out of 20%) of second-order flow, 8% (out of 12%) of third-order flow, and 4% (out of 6%) of fourth-order flow, following a similar pattern to global contributions (see Note 3). We conclude that an important part of the dependence flow goes beyond direct citations and is worth investigating.
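The share of flow carried by paths of each length can be sketched on a toy network by computing $M^{(i)} = Q^T (DA)^i Q$ one length at a time. The edge list `REFS` and partition `DISCIPLINE` are hypothetical. Dividing by the total flow over lengths of at least 1, so that the shares sum to one, is also an assumption of this sketch: the text normalizes by the norm of F without spelling out how length-0 flow is handled.

```python
from fractions import Fraction

# Hypothetical 7-node DAG (publication -> publications it cites).
REFS = {1: (2, 3, 4, 5), 2: (3, 6), 3: (6,), 4: (5, 7), 5: (7,), 6: (), 7: ()}
# Hypothetical partition into three disjoint disciplines.
DISCIPLINE = {1: 0, 2: 1, 3: 1, 6: 1, 4: 2, 5: 2, 7: 2}
n, k = len(REFS), 3

Q = [[Fraction(int(DISCIPLINE[i] == v)) for v in range(k)]
     for i in range(1, n + 1)]


def apply_DA(T):
    """Return DA @ T for an n x k matrix T, using the citation lists."""
    out = [[Fraction(0)] * k for _ in range(n)]
    for i, refs in REFS.items():
        for c in refs:
            for v in range(k):
                out[i - 1][v] += T[c - 1][v] / len(refs)
    return out


norms, T = [], Q
while True:
    T = apply_DA(T)                                   # (DA)^i Q
    M = [[sum(Q[i][u] * T[i][v] for i in range(n)) for v in range(k)]
         for u in range(k)]                           # M^(i) = Q^T (DA)^i Q
    norm = sum(abs(x) for row in M for x in row)      # entry-wise 1-norm
    if norm == 0:
        break
    norms.append(norm)

# Assumption: shares are taken relative to the flow of all paths of length >= 1.
shares = [x / sum(norms) for x in norms]
for length, share in enumerate(shares, start=1):
    print(f"length {length}: {float(share):.1%}")
```

On the toy graph the shares decrease quickly with path length, which is the qualitative pattern described above. The numbers themselves say nothing about the Web of Science results.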

The citation flow network

The citation flow matrix is a full matrix and hence the corresponding flow network is a full graph. However, one might investigate the pairs of disciplines that have a higher than expected citation flow, and those that have a lower than expected citation flow.

Table 2 contains, for each discipline, the internal citation flow (self-flow), the outgoing and incoming citation flows and, moreover, the size of the discipline in number of articles. As expected, citation flows are strongly correlated with size of the discipline (Pearson correlation above 0.9).

To overcome the size-dependence issue, we normalize the flow matrix using the signed contribution to Pearson’s $\chi$-squared test. The normalized flow ${\hat{F}}_{i,j}$ between disciplines i and j is computed as:

$$\begin{aligned} {\hat{F}}_{i,j} = \frac{F_{i,j} - E_{i,j}}{\sqrt{E_{i,j}}} \end{aligned}$$

where

$$\begin{aligned} E_{i,j} = \frac{(\sum _k F_{i,k}) \cdot (\sum _k F_{k,j})}{\sum _{u,v} F_{u,v}} \end{aligned}$$

is the expected flow between i and j. The pairs of disciplines that significantly cite each other more than expected (above the 90th percentile) and less than expected (below the 10th percentile) are shown in Fig. 5. As for within-discipline citation flows (normalized by expected citations), Astronomy and Astrophysics, Mathematics, and Language and Linguistics lead the ranking, while Instruments and Instrumentation, Basic Medical Sciences and General and Industrial Engineering are at the bottom.
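The normalization can be sketched as a small function that computes the expected flow from row and column totals and returns the signed chi-squared contribution. The 3 by 3 matrix `F` is a hypothetical example, and the function name `normalised_flow` is illustrative. The sketch assumes every expected flow is strictly positive.

```python
import math

# Hypothetical flow matrix: F[u][v] is the flow from discipline u to discipline v.
F = [[1.0, 1.125, 1.125],
     [0.0, 5.5, 0.0],
     [0.0, 0.0, 5.5]]


def normalised_flow(F):
    """Return (F_hat, E): signed chi-squared contributions and expected flows."""
    k = len(F)
    row = [sum(F[u]) for u in range(k)]                       # outgoing totals
    col = [sum(F[u][v] for u in range(k)) for v in range(k)]  # incoming totals
    total = sum(row)
    E = [[row[u] * col[v] / total for v in range(k)] for u in range(k)]
    # Assumes E[u][v] > 0 everywhere, i.e. no discipline has an empty row or column.
    F_hat = [[(F[u][v] - E[u][v]) / math.sqrt(E[u][v]) for v in range(k)]
             for u in range(k)]
    return F_hat, E


F_hat, E = normalised_flow(F)
for row in F_hat:
    print([round(x, 3) for x in row])
```

Positive entries mark pairs with more flow than expected from the marginals, negative entries mark pairs with less.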

Furthermore, we consider the same network limited to positively weighted edges, thus with a higher than expected citation flow. We then apply the fast greedy clustering method to this network, as depicted in Fig. 6. Four macro areas emerge from this analysis, namely the life and medical sciences, science and engineering applied to the Earth and the environment, mathematical sciences and social and human sciences. If we do the same limiting ourselves to first-order citations (Fig. 7), the partition of disciplines into communities is less clear.

Our analyses suggest that some disciplines are more interdisciplinary (connecting different communities) and others more autarchic (mostly self-referencing), a topic we explore in the following section.

Interdisciplinarity and autarchy

In this section we match higher-order citation flows with measures of interdisciplinarity. We claim that:

A discipline is interdisciplinary when it is evenly cited from dissimilar disciplines.

This thesis immediately recalls the Rao quadratic entropy (Rao 1982), which has been previously used to measure interdisciplinarity (Porter and Rafols 2009; Rafols and Meyer 2010; Yegros-Yegros et al. 2015; Wang and Schneider 2019). The Rao quadratic entropy is one measure among others which have been studied in the literature (Mugabushaka et al. 2016). Let us consider a set of objects and a probability distribution p such that $p_i$ is the probability of object i. Suppose we also have information about the pairwise distance (dissimilarity) $d_{i,j}$ between any two objects i and j. Then a measure of heterogeneity among objects is the Rao quadratic entropy:

$$\begin{aligned} R(p, d) = \sum _{i,j} p_{i} \, p_{j} \, d_{i,j} \end{aligned}$$

There are two components in this definition of heterogeneity: (1) the evenness of the distribution p, (2) the distances d among objects. It holds that, in general:

R(p, d) is large when p evenly distributes its probability among dissimilar objects;
on the contrary, R(p, d) is small when p concentrates its probability on similar objects.

To apply Rao’s measure to the higher-order citation flow matrix F, we proceed as follows. For each discipline pair u and v, let

$$\begin{aligned} p_{u, v} = \frac{F_{u, v}}{\sum _i F_{i, v}}. \end{aligned}$$

Notice that $p_{u,v}$ is the relative share of citation flow from discipline u to discipline v compared to the total flow received by v. Notice, moreover, that $p_{*, v} = (p_{1, v}, p_{2, v}, \ldots , p_{k, v})$ is a probability distribution.

The similarity $s_{u,v}$ between two disciplines u and v is computed as the cosine of the angle between columns u and v, $F_{*,u}$ and $F_{*,v}$, of the flow matrix F:

$$\begin{aligned} s_{u,v} = \cos (F_{*,u}, F_{*,v}) = \frac{F_{*,u}^{T} F_{*,v}}{\Vert F_{*,u}\Vert \, \Vert F_{*,v}\Vert }. \end{aligned}$$

The cosine runs from 0 (no similarity) to 1 (maximum similarity), since all flows are non-negative. Hence, two disciplines are similar if they have a similar pattern of incoming citation flows. The distance $d_{u,v}$ between two disciplines u and v is then

$$\begin{aligned} d_{u,v} = 1 - s_{u,v}, \end{aligned}$$

so that two disciplines are distant if they are not similar.

Finally, for each discipline v, we apply the Rao quadratic entropy to the flow distribution $p_{*, v}$ and distance measure d among disciplines. This gives us a measure of interdisciplinarity for each discipline. The top and bottom 5 interdisciplinary disciplines are given in Table 1.
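The whole procedure (incoming-flow distribution, cosine distance between columns, Rao quadratic entropy) can be sketched in a few lines. The flow matrix `F` is a hypothetical 3 by 3 example, and the helper names `column`, `cosine` and `rao` are illustrative. The sketch assumes every column of F has a positive sum.

```python
import math

# Hypothetical flow matrix: F[u][v] is the flow from discipline u to discipline v.
F = [[10.0, 0.0, 3.0],
     [0.0, 10.0, 3.0],
     [0.0, 0.0, 4.0]]
k = len(F)


def column(v):
    """Incoming flow profile of discipline v (column v of F)."""
    return [F[u][v] for u in range(k)]


def cosine(x, y):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Distance between disciplines: one minus the cosine of their incoming profiles.
d = [[1.0 - cosine(column(u), column(w)) for w in range(k)] for u in range(k)]


def rao(v):
    """Rao quadratic entropy of the distribution of flow received by v."""
    col = column(v)
    total = sum(col)  # assumed positive
    p = [x / total for x in col]
    return sum(p[u] * p[w] * d[u][w] for u in range(k) for w in range(k))


scores = [rao(v) for v in range(k)]
for v, score in enumerate(scores):
    print(f"discipline {v}: Rao = {score:.3f}")
```

In this toy matrix the first two disciplines receive flow only from themselves and score zero, while the third receives flow from all three and gets a positive score.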

Table 1 Top 5 (top) and bottom 5 (bottom) disciplines by their interdisciplinarity


Notice how two interrelated disciplines like Statistical Sciences and Mathematics end up on quite different ranks: while Statistics is interdisciplinary, Mathematics is rather autarchic. Indeed, Mathematics receives 78% of higher-order citation flow from itself, and the rest from a small number of other fields, mainly Physics, Materials Science and Computer Science. On the other hand, the internal flow for Statistics is limited to 43%. Statistics receives instead a significant citation flow from many other disciplines, including Mathematics, Computer Sciences, Economics and Business, General and Industrial Engineering, Electrical Engineering and Telecommunication, Clinical Medicine. This suggests that higher-order citations should be considered when assessing the degree of interdisciplinarity or autarchy of a discipline.

Conclusion

A considerable amount of effort goes into quantifying and assessing citation influence and impact via direct citations. Here we proposed instead to quantify citation influence beyond direct citations by also using higher-order citations, that is, citation chains of arbitrary length among pairs of publications. We have presented a method, informed by PageRank, to quantify the higher-order citation influence of publications. The proposed method accounts for both direct, or first-order, and indirect, or higher-order citations. In particular, we assessed the method on the whole Web of Science corpus between 2000 and 2016 at the level of entire disciplines.

Our results show that first-order (length 1) citations account for 58% of the whole higher-order citation flow, while higher-order citations (length 2 and above) account for 42%: a significant share. The proposed method is size-dependent, yet easily normalized, and it can be used for a variety of applications. We investigated two here. By using higher-order citation flows, we were able to provide a high-level map of science clearly distinguishing among four macro-areas: life and medical sciences, Earth and environment sciences, mathematical sciences, social and human sciences. The same picture using only first-order information was found to be less clear-cut. Furthermore, we used the proposed method to rate disciplines according to their degree of interdisciplinarity using the Rao quadratic entropy. We are thus able to distinguish between autarchic disciplines, e.g., mathematics, and interdisciplinary ones, e.g., statistics. We suggest that accounting for higher-order citations is thus relevant and important, and might help with a variety of open scientometrics questions: performing clustering, measuring interdisciplinarity, assessing the impact of fundamental research, among others.

Notes

1. There are some exceptions, but these can be removed so as to ensure that G is a DAG.
2. A citation between two publications is discarded if the publication time (year and month) of the citing publication is the same as, or earlier than, the publication time of the cited publication.
3. In order: Clinical medicine, Physics and materials science, Chemistry and chemical engineering, Basic life sciences, Biomedical sciences, Biological sciences.

References

Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 30(1–7), 107–117.
Chen, C. (2006). CiteSpace II: Detecting and visualizing emerging trends and transient patterns in scientific literature. Journal of the American Society for Information Science and Technology, 57(3), 359–377.
Colavizza, G., Franceschet, M., Traag, V. A., & Waltman, L. (2019). Quantifying the long-term influence of scientific publications. In Proceedings of the 17th international conference on scientometrics & informetrics.
Franceschet, M. (2011). PageRank: Standing on the shoulders of giants. Communications of the ACM, 54(6), 92–101.
Garfield, E., Pudovkin, A. I., & Istomin, V. S. (2003). Why do we need algorithmic historiography? Journal of the American Society for Information Science and Technology, 54(5), 400–412.
Garfield, E., Sher, I. H., & Torpie, R. J. (1964). The use of citation data in writing the history of science (Vol. 49, No. 638, p. 1256). The Institute for Scientific Information, Technical Report, AF.
Gerow, A., Hu, Y., Boyd-Graber, J., Blei, D. M., & Evans, J. A. (2018). Measuring discursive influence across scholarship. Proceedings of the National Academy of Sciences, 115, 3308–3313.
Jurgens, D., Kumar, S., Hoover, R., McFarland, D., & Jurafsky, D. (2018). Measuring the evolution of a scientific field through citation frames. Transactions of the Association for Computational Linguistics, 6, 391–406.
Klavans, R., & Boyack, K. W. (2009). Toward a consensus map of science. Journal of the American Society for Information Science and Technology, 60(3), 455–476.
Lucio-Arias, D., & Leydesdorff, L. (2008). Main-path analysis and path-dependent transitions in HistCite TM-based historiograms. Journal of the American Society for Information Science and Technology, 59(12), 1948–1962.
Marx, W., Bornmann, L., Barth, A., & Leydesdorff, L. (2014). Detecting the historical roots of research fields by reference publication year spectroscopy (RPYS). Journal of the Association for Information Science and Technology, 65(4), 751–764.
Merton, R. K. (1957). Priorities in scientific discovery: A chapter in the sociology of science. American Sociological Review, 22(6), 635–659.
Mugabushaka, A.-M., Kyriakou, A., & Papazoglou, T. (2016). Bibliometric indicators of interdisciplinarity: The potential of the Leinster–Cobbold diversity indices to study disciplinary diversity. Scientometrics, 107(2), 593–607.
Newman, M. E. J. (2018). Networks: An introduction (2nd ed.). Oxford: Oxford University Press.
Porter, A. L., & Rafols, I. (2009). Is science becoming more interdisciplinary? Measuring and mapping six research fields over time. Scientometrics, 81(3), 719–745.
Rafols, I., & Meyer, M. (2010). Diversity and network coherence as indicators of interdisciplinarity: Case studies in bionanoscience. Scientometrics, 82(2), 263–287.
Rao, C. R. (1982). Diversity and dissimilarity coefficients: A unified approach. Theoretical Population Biology, 21, 24–43.
Soni, S., Lerman, K., & Eisenstein, J. (2019). Follow the leader: Documents on the leading edge of semantic change get more citations. arXiv:1909.04189 [physics]
Subelj, L., Waltman, L., Traag, V., & van Eck, N. J. (2020). Intermediacy of publications. Royal Society Open Science, 7(1), 190207.
Thor, A., Marx, W., Leydesdorff, L., & Bornmann, L. (2016). Introducing CitedReferencesExplorer (CRExplorer): A program for reference publication year spectroscopy with cited references standardization. Journal of Informetrics, 10(2), 503–515.
van Eck, N. J., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84(2), 523–538.
van Eck, N. J., & Waltman, L. (2014). CitNetExplorer: A new software tool for analyzing and visualizing citation networks. Journal of Informetrics, 8(4), 802–823.
Waltman, L., & Yan, E. (2014). PageRank-related methods for analyzing citation networks. In Y. Ding, R. Rousseau, & D. Wolfram (Eds.), Measuring scholarly impact (pp. 83–100). Berlin: Springer.
Wang, Q., & Schneider, J. W. (2019). Consistency and validity of interdisciplinarity measures. Quantitative Science Studies, 1, 239–263.
Yegros-Yegros, A., Rafols, I., & D’Este, P. (2015). Does interdisciplinary research lead to higher citation impact? The different effect of proximal and distal interdisciplinarity. PLOS ONE, 10(8), e0135095.
Yi-Ning, T., & Hsu, S.-L. (2016). Constructing conceptual trajectory maps to trace the development of research fields. Journal of the Association for Information Science and Technology, 67(8), 2016–2031.

Acknowledgements

This work stems from prior efforts in collaboration with Ludo Waltman and Vincent A. Traag (Colavizza et al. 2019), whom we thank for their contribution. We are grateful to the Centre for Science and Technology Studies (CWTS), Leiden University, for providing us access to their databases.

Author information

Authors and Affiliations

University of Udine, Via delle Scienze 206, 33100, Udine, Italy
Massimo Franceschet
University of Amsterdam, Postbus 94550, 1090 GN, Amsterdam, The Netherlands
Giovanni Colavizza


Corresponding author

Correspondence to Giovanni Colavizza.

Appendix

See Table 2.

Table 2 The Web of Science disciplines, with fields id, name of discipline, size, self citation flow, incoming citation flow and outgoing citation flow


Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Franceschet, M., Colavizza, G. Quantifying the higher-order influence of scientific publications. Scientometrics 125, 951–963 (2020). https://doi.org/10.1007/s11192-020-03580-9


Received: 30 November 2019
Published: 13 July 2020
Issue Date: November 2020
DOI: https://doi.org/10.1007/s11192-020-03580-9
