
The d-index: Discovering dependences among scientific collaborators from their bibliographic data records

Scientometrics

Abstract

The evaluation of a researcher’s work and its impact on the research community has been studied extensively in the literature through the definition of several measures, foremost among them the h-index and its variations. Although these measures are valuable tools for analyzing researchers’ outputs, they usually treat co-authorship as a proportional collaboration between the parties, overlooking the relationships among co-authors and their relative scientific influence. In this work, we propose the d-index, a novel measure that estimates the degree of dependence of authors on their research environment over their entire scientific publication history. We also present a web application that implements these ideas and provides a number of visualization tools for analyzing and comparing scientific dependences among all the scientists in the DBLP bibliographic database. Finally, relying on this web environment, we present case and user studies that highlight both the validity and the reliability of the proposed evaluation measure.



Notes

  1. http://www.informatik.uni-trier.de/ley/db/.

  2. http://scholar.google.com/.

  3. http://academic.research.microsoft.com/.

  4. http://citeseer.ist.psu.edu/.

  5. http://dblpvis.uni-trier.de/.

  6. http://arnetminer.org/.

  7. A bipartite graph is a graph with two distinct vertex sets such that all edges in the graph are between vertices of different vertex sets.

  8. Presented in the “Appendix” section and available at http://d-index.di.unito.it.

  9. http://www.informatik.uni-trier.de/ley/db/.

  10. The data set was last updated in March 2012. Please note that, within our system, we rely on the disambiguated author names provided by Tang et al. (2007). This name disambiguation system is based on a constraint-based probabilistic model and aims at finding, extracting, and fusing the semantics-based profiling information of a researcher from the Web.

  11. Information last updated in March 2012.

  12. Information last updated in March 2012.

  13. As Fig. 9d shows, within our data set, a scientist with a long career (for example, ∼50 years) has, on average, fewer co-authored works than many younger scientists. Analyzing the reason for this result is beyond the scope of this paper, but some hypotheses are possible. For example, it may suggest a significant increase in the pace of publication over the last decades. Alternatively, the coverage of DBLP may have improved significantly in recent years.

  14. Available at http://d-index.di.unito.it.

  15. Please note that, for the author name disambiguation process, we rely on the list of authors provided by Tang et al. (2010); thus, the system cannot currently distinguish among scientists who are not disambiguated within the considered data set.

References

  • Barabasi, A. L., Jeong, H., Neda, Z., Ravasz, E., Schubert, A., & Vicsek, T. (2002). Evolution of the social network of scientific collaborations. Physica A: Statistical Mechanics and its Applications, 311(3–4), 590–614.

  • Brandao, W. C., de Oliveira e Silva, A. B., & Parreiras, F. S. (2007). Social networks in information science: Behavioral evidences of the researchers and evolutional trends of the coauthor network. Inf Inf 12(Special Issue), Portuguese.

  • Chen, P., Xie, H., Maslov, S., & Redner, S. (2007). Finding scientific gems with Google’s PageRank algorithm. Journal of Informetrics, 1(1), 8–15.

  • Chubin, D. E. (1976). The conceptualization of scientific specialties. The Sociological Quarterly, 17(4), 448–476.

  • Crane, D. (1969). Social structure in a group of scientists: A test of the “invisible college” hypothesis. American Sociological Review, 3, 335–352.

  • de Beaver, D., & Rosen, R. (1979). Studies in scientific collaboration. Scientometrics, 1(2), 133–149.

  • Feeney, M., & Bernal, M. (2010). Women in STEM networks: Who seeks advice and support from women scientists? Scientometrics, 85(3), 767–790.

  • Fisher, R. (1918). The correlation between relatives on the supposition of mendelian inheritance. Transactions of the Royal Society of Edinburgh, 52(2), 399–433.

  • Girvan, M., & Newman, M. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences of the United States of America, 99(12), 7821.

  • Goodman, L., & Kruskal, W. (1954). Measures of association for cross classifications. Journal of the American Statistical Association, 49, 732–764.

  • Hou, H., Kretschmer, H., & Liu, Z. (2008). The structure of scientific collaboration networks in scientometrics. Scientometrics, 75(2), 189–202.

  • Hunt, R. (1991). Trying an authorship index. Nature, 352(6332), 187.

  • Imperial, J., & Rodríguez-Navarro, A. (2007). Usefulness of Hirsch’s h-index to evaluate scientific research in Spain. Scientometrics, 71(2), 271–282.

  • Katz, J. S., & Martin, B. R. (1997). What is research collaboration? Research Policy, 26, 1–18.

  • Kendall, M. (1938). A new measure of rank correlation. Biometrika, 30(1/2), 81–93.

  • Melin, G., & Persson, O. (1996). Studying research collaboration using co-authorships. Scientometrics, 36, 363–377.

  • Moody, J. (2004). The structure of a social science collaboration network: Disciplinary cohesion from 1963 to 1999. American Sociological Review, 69(2), 213–238.

  • Moon, S., You, J., Kwak, H., Kim, D., & Jeong, H. (2010). Understanding topological mesoscale features in community mining. In Communication systems and networks (COMSNETS), 2010 second international conference on (pp. 1–10). Bangalore, India: IEEE.

  • Newman, M. E. J. (2001). Scientific collaboration networks: I. Network construction and fundamental results. Physical Review E, 64, 016131.

  • Pelleg, D., & Moore, A. W. (2000). X-means: Extending k-means with efficient estimation of the number of clusters. In Proceedings of the seventeenth international conference on machine learning (pp. 727–734). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.

  • Pepe, A., & Rodriguez, M. (2010). An in-depth longitudinal analysis of mixing patterns in a small scientific collaboration network. Scientometrics, 85(3).

  • Radicchi, F., Fortunato, S., & Castellano, C. (2008). Universality of citation distributions: Toward an objective measure of scientific impact. Proceedings of the National Academy of Sciences, 105(45), 17268.

  • Radicchi, F., Markines, B., & Vespignani, A. (2009). Diffusion of scientific credits and the ranking of scientists. Physical Review E, 80(5), 056103.

  • Rodriguez, M., & Pepe, A. (2008). On the relationship between the structural and socioacademic communities of a coauthorship network. Journal of Informetrics, 2(3), 195–201.

  • Schmidt, R. (1987). A worksheet for authorship of scientific articles. Bulletin of the Ecological Society of America, 68(1), 8–10.

  • Shapin, S. (1981). Laboratory life. The social construction of scientific facts. Medical History, 25(3), 341–342.

  • Sidiropoulos, A., Katsaros, D., & Manolopoulos, Y. (2007). Generalized Hirsch h-index for disclosing latent facts in citation networks. Scientometrics, 72(2), 253–280.

  • Tang, J., Zhang, D., & Yao, L. (2007). Social network extraction of academic researchers. In Proceedings of the 2007 seventh IEEE international conference on data mining (pp. 292–301). Washington, DC, USA: IEEE Computer Society.

  • Tang, J., Yao, L., Zhang, D., & Zhang, J. (2010). A combination approach to web user profiling. ACM Transactions on Knowledge Discovery from Data (TKDD), 5(1), 2.

  • Torgerson, W.S. (1958). Theory and methods of scaling. Malabar, FL: R.E. Krieger Pub. Co.

  • Verhagen, J., Wallace, K., Collins, S., & Scott, T. (2003). QUAD system offers fair shares to all authors. Nature, 426(6967), 602.

  • Wagner, C. S., & Leydesdorff, L. (2005). Network structure, self-organization, and the growth of international collaboration in science. Research Policy, 34, 1608–1618.

  • Wang, C., Han, J., Jia, Y., Tang, J., Zhang, D., Yu, Y., & Guo, J. (2010). Mining advisor-advisee relationships from research publication networks. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 203–212). San Diego, CA, USA: ACM.

  • Yan, E., & Ding, Y. (2009). Applying centrality measures to impact analysis: A coauthorship network analysis. Journal of the American Society for Information Science and Technology, 60(10), 2107–2118.

Author information

Corresponding author

Correspondence to Luigi Di Caro.

Appendix: The web application: http://d-index.di.unito.it

In this section we present a web environment (see Note 14) for evaluating dependences among scientific researchers. The web application supports the following operations:

  • visualization of scientific profiles;

  • graphical analyses of the scientific dependences of the authors on all their co-authors;

  • graphical analyses of the mutual dependences of each researcher relative to his/her co-authors;

  • comparison among authors, based on their dependences on their co-authors (shared or not);

  • analysis of the dependences of an author, taking into account the local research communities in which he/she has been involved (understood as groups of co-authors who, together, have frequently collaborated with him/her);

  • comparisons with the entire community, focusing on the dependence coefficient, with respect to the total number of papers, total number of co-authors and length of career.

Figure 10 shows a set of screenshots taken from http://d-index.di.unito.it. Within this system, a user can search for authors and analyze their scientific profiles according to the proposed measures. In particular, given a researcher’s name, the system retrieves the matching authors (by querying the database with the entered search string) and lets the user disambiguate the resulting list of authors (see Note 15). The user can then analyze the scientific profile of the selected author (Fig. 10a) by studying his/her scientific career (visualized as a histogram of papers per year), the complete list of scientific outcomes (with standard information such as the title, the name of the conference/journal, and the year of publication) and his/her co-authors over the entire career.
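The career histogram mentioned above can be illustrated with a minimal Python sketch. It assumes each publication is available as a (title, venue, year) record; the records, the function names `papers_per_year` and `render_histogram`, and the text rendering are illustrative assumptions and do not describe how the web application is actually implemented.

```python
from collections import Counter

# Hypothetical publication records (title, venue, year); not real DBLP data.
papers = [
    ("Paper A", "Venue X", 2008),
    ("Paper B", "Venue Y", 2008),
    ("Paper C", "Venue X", 2010),
]


def papers_per_year(records):
    """Count papers per year, including years of the career with no papers."""
    counts = Counter(year for _, _, year in records)
    if not counts:
        return {}
    first, last = min(counts), max(counts)
    return {year: counts.get(year, 0) for year in range(first, last + 1)}


def render_histogram(per_year):
    """Render one text bar per year, one '#' per paper."""
    return "\n".join(f"{year} {'#' * n}" for year, n in per_year.items())


histogram = papers_per_year(papers)
print(render_histogram(histogram))
```

Filling the gap years with zero keeps the time axis continuous, so that periods of inactivity remain visible in the chart.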

Fig. 10
Different visualizations of the career of Dr. Christos Faloutsos taken from http://d-index.di.unito.it. These tools make it possible to visualize information about the author (a), his dependence curve (b), his mutual dependence curve (c), a comparison with another scientist (d), his Collaboration Map (e), and the relations between his dependence coefficient and his total number of papers (f), number of co-authors (g), and length of career (h), compared with the whole scientific community

Then, as shown in Fig. 10b, it is possible to analyze the 2-dimensional dependence curve of the selected scientist, which visualizes all his/her scientific dependences (plotted as explained in the “Dependence curve and comparison among authors” section). As already introduced, the curve graphically orders his/her co-authors by their d-index value; in other words, this chart lets the user easily see the authors on whom the considered author is most scientifically dependent. In addition, the user can analyze his/her scientific relationships with his/her co-authors by visualizing his/her Mutual Dependences Curve (Fig. 10c), which graphically separates the authors who depend on the considered researcher (in blue) from those he/she is dependent on (in red).
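The ordering and the blue/red separation described above can be sketched in Python as follows. The d-index formula is not reproduced here: the dependence values are made-up placeholders, and the rule used to split co-authors (comparing the two directions of dependence, with ties kept in a separate list) is an assumption made for illustration, not necessarily the rule used by the application.

```python
# Hypothetical dependence values for one author; the numbers are made up.
# Each entry maps a co-author to a pair:
# (author's dependence on co-author, co-author's dependence on author).
dependences = {
    "coauthor_1": (0.40, 0.10),
    "coauthor_2": (0.05, 0.30),
    "coauthor_3": (0.20, 0.20),
}


def dependence_curve(deps):
    """Co-authors ordered by the author's dependence on them, highest first."""
    points = [(name, on_them) for name, (on_them, _) in deps.items()]
    return sorted(points, key=lambda point: point[1], reverse=True)


def split_mutual(deps):
    """Split co-authors by the dominant direction of dependence (assumed rule)."""
    blue, red, balanced = [], [], []
    for name, (on_them, on_author) in sorted(deps.items()):
        if on_author > on_them:
            blue.append(name)      # co-author depends more on the author
        elif on_them > on_author:
            red.append(name)       # author depends more on the co-author
        else:
            balanced.append(name)  # equal in both directions
    return blue, red, balanced


curve = dependence_curve(dependences)
blue, red, balanced = split_mutual(dependences)
print(curve)
print(blue, red, balanced)
```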

Using the dependence curve, it is also possible to compare different authors in terms of their dependences (Fig. 10d), based on shared co-authors (if any exist) or on their highest dependence values (mapped through meta authors). The user can compare any number of authors within the same chart.

As shown in Fig. 10e, it is also possible to get an alternative view of the scientific dependences of the author by taking into account the communities (understood as groups of co-authors) in which he/she is involved. This is possible through the Collaboration Map (explained in the “Collaboration map” section). Using this chart, the co-authors can be organized not only by the dependence on them, but also by the local communities they form through their collaboration with the considered author. In this way it is possible to estimate the number of research groups in which the author is involved (visualized with different colors) and, for each group, the relative dependence of each of its members on the considered author.
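To make the notion of local communities concrete, the following Python sketch groups the co-authors of one author by linking two co-authors whenever they appear together on at least `min_shared` of the author’s papers, and then taking connected components. This is a simple stand-in chosen for illustration; the method actually used by the system is the one defined in the “Collaboration map” section, and the data, names, and threshold below are assumptions.

```python
from itertools import combinations

# Hypothetical data: paper id -> co-authors of the considered author on it.
papers = {
    "p1": {"ann", "bob"},
    "p2": {"ann", "bob", "cy"},
    "p3": {"dee", "eve"},
    "p4": {"dee", "eve"},
    "p5": {"fay"},
}


def coauthor_groups(papers, min_shared=1):
    """Group co-authors who share at least `min_shared` papers with each other."""
    shared = {}
    authors = set()
    for coauthors in papers.values():
        authors |= coauthors
        for a, b in combinations(sorted(coauthors), 2):
            shared[(a, b)] = shared.get((a, b), 0) + 1

    # Union-find over co-authors; each root identifies one group.
    parent = {a: a for a in authors}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), count in shared.items():
        if count >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for a in authors:
        groups.setdefault(find(a), set()).add(a)
    return sorted((sorted(group) for group in groups.values()),
                  key=lambda group: group[0])


print(coauthor_groups(papers))
print(coauthor_groups(papers, min_shared=2))
```

Raising `min_shared` keeps only the co-authors who collaborated together repeatedly in the same group, which matches the idea of groups that have "frequently" collaborated with the author.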

Finally, the application makes it possible to compare the researcher with the entire community by focusing on the dependence coefficient with respect to different parameters (total number of outcomes, total number of co-authors and length of the scientific career). Within these charts, all the researchers in DBLP are mapped as blue points, whereas the considered researcher is highlighted in red. Moreover, multiple authors can be mapped on the same chart and compared against the entire community. Some examples are shown in Fig. 10f, g, h.

Cite this article

Di Caro, L., Cataldi, M. & Schifanella, C. The d-index: Discovering dependences among scientific collaborators from their bibliographic data records. Scientometrics 93, 583–607 (2012). https://doi.org/10.1007/s11192-012-0762-1
