Ranking Researchers Through Collaboration Pattern Analysis

Cataldi, Mario; Di Caro, Luigi; Schifanella, Claudio

doi:10.1007/978-3-319-46131-1_11

Mario Cataldi^20,21,22,
Luigi Di Caro^20,21,22 &
Claudio Schifanella^20,21,22

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 9853))

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

2826 Accesses
2 Citations

Abstract

The academic world utterly relies on the concept of scientific collaboration. As in every collaborative network, however, the production of research articles follows hidden co-authoring principles as well as temporal dynamics which generate latent and complex collaboration patterns. In this paper, we present an online advanced tool for real-time rankings of computer scientists under these perspectives.

You have full access to this open access chapter, Download conference paper PDF

Quantifying scientific collaboration impact by exploiting collaboration-citation network

Article 30 June 2021

Predicting Research Collaboration Trends Based on the Similarity of Publications and Relationship of Scientists

An author is known by the context she keeps: significance of network motifs in scientific collaborations

Article 26 May 2015

Keywords

1 Introduction

Scientists are object of evaluation for funding allocation and career promotions. The discovery of leading scientists is an important task that simple statistics over long publication records may miss. In this demo, we present an online tool for analyzing researchers under a collaborative perspective by studying and ranking their capacity to maintain the same quality/quantity levels in different research environments.

Bibliometric indicators are increasingly used to evaluate scientific careers based on personal publication records. The simple number of papers published by an author rather than the received citations are still common ways to capture both the quantity and the impact of a scientist’s work. However, these measures represent only an evaluation of what is knowable from simple database searches. Still, these numbers actually make strong assumptions on the co-authorship of the research works in terms of how proportional the collaboration was among the co-authors. In a sense, scientists may look favorably good if working in a dynamic and active research environment. On the contrary, they may result unfairly below par due to modest research collaborators.

In literature, a number of related concepts have been presented, such as the undeserved co-authorship [4] and the scientific relevance [1]. Instead, our proposed application system is oriented to the study of what collaboration means. A research collaboration can be defined as a two-way process where individuals and/or organizations share learning, ideas and experiences to produce joint scientific outcomes. Collaborations are intrinsically necessary to the production of complete and groundbreaking research. In the light of this, one of the key aspect (and more demanded in recruitment scenarios) of a successful researcher is the development of a large and active network of collaborators that helps researchers bring new solutions within the research community.

The presented online tool is able to automatically compare scientists by deeply analyzing their local co-authorship networks and how they have been crucial in the production of research articles over time. Along the paper, we will present both the theoretical and algorithmic parts of the tool as well as the set of available features which can be freely tested on http://d-index.di.unito.it.

2 Background: Formalization of Scientific Collaborations

Based on the theoretical works proposed in [2, 3], for this demo application, we make use of a formalization of the co-authorship network that represents the environment in which a researcher has produced his/her scientific outcomes. Given two collaborating researchers (also called authors), $r_i$, $r_j$ and their common scientific network $N^t_{r_i,r_j}$, defined as the set of researchers who collaborated with them, the autonomy of their collaboration $a^t_{r_i, r_j}$ at time t is calculated as:

$$ a^t_{r_i, r_j} = \left\{ \begin{array}{l l} 0 &{} \quad \text {if}\ N^t_{r_i,r_j} = \emptyset \\ \frac{1}{\sum _{r_k \in N^t_{r_i,r_j}} \left( \sum _{x = 1}^{c(r_k, O^t_{r_i,r_j})} \frac{1}{x}\right) } &{} \quad \text {if}\ N^t_{r_i,r_j} \ne \emptyset \end{array} \right. $$

where the function $c(r_k, O^t_{r_i,r_j})$ returns the number of times a researcher $r_k$ co-authored a paper with both $r_i$ and $r_j$ at time t. The higher the autonomy the more independent the work of $r_i$ and $r_j$ is from their research environment. We then define the dependence value of $r_i$ on the collaboration with $r_j$ as $d^t_{r_i \rightarrow r_j}$ as

$$ d^t_{r_i \rightarrow r_j} = \frac{p^t_{r_i,r_j}}{p^t_{r_i}} \times \frac{a^t_{r_i,r_j,N^t_{r_i}} + a^t_{r_j,\lnot r_i,N^t_{r_i}} }{a^t_{r_i,r_j,N^t_{r_i}} + a^t_{r_j,\lnot r_i,N^t_{r_i}} + a^t_{r_i,\lnot r_j,N^t_{r_i}}}, $$

where p is a productivity score (number of published works) of a is the autonomy score. The dependence value $d^t_{r_j \rightarrow r_i}$ ranges from 0 to 1; in particular, $d^t_{r_i \rightarrow r_j} \approx 0$ indicates that the dependence of $r_i$ on $r_j$, at the time t, is negligible, while a $d^t_{r_i \rightarrow r_j} \approx 1$ highlights the contrary.

Thus, given the complete set of dependence values, for each year and relative to each co-author, we calculate the researcher’s dependence trajectory, by calculating the standard deviation, along the time, of each dependence value, for each co-author, from the optimal attended value of 0 (which would mean a dependence score of 0; i.e., the production of the considered researcher is independent from the collaboration with the considered co-author). In a sense, we aim at evaluating the overall independence of a researcher from the surrounding community. More formally, given a researcher $r_i$, we define his/her dependence trajectory $\overrightarrow{d_{r_i}} = \{sd^{t}_{r_i}, sd^{t+1}_{r_i},\cdots , sd^{t+n}_{r_i}\}$, where $sd^{t}_{r_i}$ is calculated as

$$ sd^t_{r_i} = \sqrt{\frac{\sum _{r_k \in N_{r_i}} (d^t_{r_i \rightarrow r_k})^2}{|N_{r_i}|}}. $$

In words, the system detects anomalies in the collaboration patterns with respect to the attended behavior. Researchers, in fact, are expected to increment their collaboration network over time becoming independent from their single collaborations.

We can use these values to properly compare, and rank researchers with similar characteristics. More in detail, we provide a radar chart that can rank the independence performance of a considered researcher with respect to those who have (i) similar career length, (ii) similar number of publications, (iii) similar number of co-authors. We also provide a comparison with the active researchers and the whole community. Finally, we will integrate a feature to compare researchers with respect to topics automatically extracted from publication titles.

3 Application and Demo Scenario

In this section, we present our application, available at http://d-index.di.unito.it, for analyzing, comparing and ranking scientific collaboration patterns of researchers. As data input, we considered the DBLP data set^{Footnote 1}, containing information about 1,717,211 authors and 3,268,812 scientific papers^{Footnote 2}.

As shown in Fig. 1, the proposed application initially allows to search for any author indexed by DBLP and to analyze his/her scientific profile and her/his collaboration history over time (through several features and visualizations). Then, the online demo provides the following analyses:

Collaborations over time. The user can analyze the evolution over time of each scientific collaboration for a searched researcher.
Collaboration Pattern Analysis. The system can visualize the above-mentioned dependence patterns through a curve metaphor, mapping the evolution of the dependence of a researcher on the support of each co-author along the career. With this chart, it is also possible to select/deselect additional co-authors to make further analyses and comparisons.
Temporal Analysis. The demo provides a dynamic visualization chart (“time-lapse”) which allows the user to focus on a specific time interval and/or a subset of co-authors.
Ranking. This tool also allows to compare and rank the overall independence of an author, along his/her whole career, with the whole research community (Fig. 2). This visualization permits to focus on how much the entire production of a researcher can be considered dependent on the interactions with her/his local community.

The presented demo can be used to analyze each researcher in the entire DBLP community by also considering similar profiles (with parameters such as number of papers, number of co-authors, and length of career).

Notes

1.
http://dblp.uni-trier.de/db.
2.
Information updated at May 2016.

References

Ausloos, M.: A scientometrics law about co-authors and their ranking: the co-author core. Scientometrics 95(3), 895–909 (2013)
Article Google Scholar
Caro, L.D., Cataldi, M., Schifanella, C.: The d-index: discovering dependences among scientific collaborators from their bibliographic data records. Scientometrics 93(3), 583–607 (2012)
Article Google Scholar
Di Caro, L., Cataldi, M., Lamolle, M., Schifanella, C.: It is not what but who you know: a time-sensitive collaboration impact measure of researchers in surrounding communities. In: Proceedings of the 24th International Conference on World Wide Web, WWW 2015 Companion, pp. 995–1000. ACM, New York (2015)
Google Scholar
Slone, R.M.: Coauthors’ contributions to major papers published in the ajr: frequency of undeserved coauthorship. AJR Am. J. Roentgenol. 167(3), 571–579 (1996)
Google Scholar

Download references

Author information

Authors and Affiliations

Université Paris 8, Paris, France
Mario Cataldi, Luigi Di Caro & Claudio Schifanella
University of Torino, Turin, Italy
Mario Cataldi, Luigi Di Caro & Claudio Schifanella
RAI Research Centre, Turin, Italy
Mario Cataldi, Luigi Di Caro & Claudio Schifanella

Authors

Mario Cataldi
View author publications
You can also search for this author in PubMed Google Scholar
Luigi Di Caro
View author publications
You can also search for this author in PubMed Google Scholar
Claudio Schifanella
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Mario Cataldi .

Editor information

Editors and Affiliations

Department of Computer Science, KU Leuven, Leuven, Belgium
Bettina Berendt
Deloitte GmbH, München, Germany
Björn Bringmann
Laboratoire Hubert Curien, Jean Monnet University, Saint-Etienne, France
Élisa Fromont
Allianz SE, Munich, Germany
Gemma Garriga
Max-Planck-Institute for Informatics, Saarbrücken, Germany
Pauli Miettinen
Aalto University School of Science, Espoo, Finland
Nikolaj Tatti
Siemens AG & Lud. Max. Univ. of Munich, Munich, Germany
Volker Tresp

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Cataldi, M., Di Caro, L., Schifanella, C. (2016). Ranking Researchers Through Collaboration Pattern Analysis. In: Berendt, B., et al. Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2016. Lecture Notes in Computer Science(), vol 9853. Springer, Cham. https://doi.org/10.1007/978-3-319-46131-1_11

Download citation

DOI: https://doi.org/10.1007/978-3-319-46131-1_11
Published: 03 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46130-4
Online ISBN: 978-3-319-46131-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Ranking Researchers Through Collaboration Pattern Analysis

Abstract

Similar content being viewed by others

Quantifying scientific collaboration impact by exploiting collaboration-citation network

Predicting Research Collaboration Trends Based on the Similarity of Publications and Relationship of Scientists

An author is known by the context she keeps: significance of network motifs in scientific collaborations

Keywords

1 Introduction

2 Background: Formalization of Scientific Collaborations

3 Application and Demo Scenario

Notes

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Navigation

Ranking Researchers Through Collaboration Pattern Analysis

Abstract

Similar content being viewed by others

Quantifying scientific collaboration impact by exploiting collaboration-citation network

Predicting Research Collaboration Trends Based on the Similarity of Publications and Relationship of Scientists

An author is known by the context she keeps: significance of network motifs in scientific collaborations

Keywords

1 Introduction

2 Background: Formalization of Scientific Collaborations

3 Application and Demo Scenario

Notes

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation