On the banks of Shodhganga: analysis of the academic genealogy graph of an Indian ETD repository

Scientometrics

Abstract

Academic genealogy graphs capture information about the lineage of researchers, encode how knowledge flows from advisors to proteges, and shed light on the birth and evolution of disciplines. In this paper, we study the academic genealogy graph/network (AGN) in Shodhganga, the Indian Electronic Theses and Dissertations (ETD) database. We have disambiguated the names of the researchers in Shodhganga and constructed the Shodhganga-AGN, which we have analyzed with topological metrics proposed in the literature on general graphs as well as in the literature on genealogy networks. These metrics identify the institutes and researchers that have played a significant role in the development of the Indian higher education system. The largest connected component of Shodhganga-AGN consists of 1356 researchers and 1437 advisor–advisee relationships. The component is dominated by researchers from science, who are affiliated primarily with three institutions. We have also studied subgraphs in the genealogy network to identify supervision patterns, and found that most of the subgraph instances connect researchers within a single institution or subject. Our study is thus a detailed analysis of the academic genealogy of researchers indexed in Shodhganga, and it captures the decades-old research ecosystem of India, as expressed through the formal advisor–advisee relationships in Indian universities.
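To make the notion of a largest connected component concrete, the following is a minimal sketch and not the authors' implementation. It assumes the genealogy is given as a list of (advisor, advisee) pairs and that "connected" means weakly connected, that is, edge direction is ignored. The function name `largest_weak_component` and the toy edge list are hypothetical.

```python
from collections import defaultdict, deque


def largest_weak_component(edges):
    """Return (nodes, edges) of the largest weakly connected component.

    edges: iterable of (advisor, advisee) pairs (assumed input format).
    """
    edges = list(edges)
    # Build an undirected adjacency map, since direction is ignored here.
    neighbours = defaultdict(set)
    for advisor, advisee in edges:
        neighbours[advisor].add(advisee)
        neighbours[advisee].add(advisor)

    seen = set()
    best = set()
    for start in list(neighbours):
        if start in seen:
            continue
        # Breadth-first search collects one component.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component

    kept = [(a, b) for a, b in edges if a in best and b in best]
    return best, kept


# Toy data: D has two advisors (B and C), so the component is not a tree.
toy_edges = [("A", "B"), ("A", "C"), ("C", "D"), ("B", "D"), ("X", "Y")]
nodes, kept = largest_weak_component(toy_edges)
print(len(nodes), len(kept))  # prints: 4 4
```

In the toy data the component has as many relationships as researchers, which is more than a tree would allow; in an acyclic genealogy this can only happen when some researcher has more than one advisor.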

Figs. 1–19 (images and captions not available in this extract)

Notes

  1. https://www.genealogy.math.ndsu.nodak.edu/.

  2. https://academictree.org/.

  3. https://aishe.gov.in/aishe/gotoAisheReports.

  4. https://www.ugc.ac.in/pdfnews/8604044Higher-Education-Brochure.pdf.

  5. The dropout rate is not known, but we assume it to be more than 2%.

  6. https://neo4j.com/.

  7. https://spacy.io.

  8. In Algorithm 2, the threshold value is dynamic.

  9. Structural variation refers to missing characters, longer or shorter versions of names, etc.

  10. https://github.com/Djasingh/Shodhganga-AGN.

  11. https://gephi.org/.

  12. https://www.scopus.com/authid/detail.uri?authorId=7005045135.

  13. https://academictree.org/.

  14. The May 2018 AFT data dump is used.

  15. We assume that each researcher has only one doctoral degree.

  16. The involvement of multiple advisors in advising a researcher in the AGN.

  17. Note: We have not identified the subgraphs based on their frequency of occurrence in the random graphs and Shodhganga-AGN.

  18. https://neo4j.com/.

  19. Note: Temporal information is not considered for detecting subgraphs.

  20. The advisor–advisee connections in Shodhganga-AGN that follow the predefined subgraph structures are referred to as instances of the subgraph structures (shown in Fig. 13a).

  21. https://www.nirfindia.org/2021/Ranking.html.

Author information

Corresponding author

Correspondence to Dhananjay Kumar.

Ethics declarations

Conflict of Interest

The authors did not receive support from any organization for the submitted work and have no competing financial interests to declare.

Appendix

Algorithm flowcharts

Fig. 20

Algorithm 2: In all the above cases, we impose the natural constraint that the thesis submission date of the advisor (who is potentially the same person as a candidate advisee record) must be earlier than the advisee's thesis submission date. When more than one candidate (advisee record) could be merged with the advisor, we further compare the candidates across the available attributes to find the best one.
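The date constraint described in the Algorithm 2 caption can be sketched as follows. This is a hedged illustration, not the published algorithm: the record fields, the function name `pick_advisor_record`, and the tie-breaking rule (counting matching institute and department) are assumptions made for the example, since the caption does not list which attributes are compared.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ThesisRecord:
    # Hypothetical record layout; the real dataset has more attributes.
    name: str
    institute: str
    department: str
    submitted: date


def pick_advisor_record(advisee: ThesisRecord,
                        candidates: List[ThesisRecord]) -> Optional[ThesisRecord]:
    """Choose the student record that most plausibly is the advisor.

    A candidate is eligible only if its own thesis was submitted
    before the advisee's thesis (the constraint from the caption).
    """
    eligible = [c for c in candidates if c.submitted < advisee.submitted]
    if not eligible:
        return None
    if len(eligible) == 1:
        return eligible[0]

    # Assumed tie-breaker: prefer the candidate sharing more attributes.
    def shared_attributes(c: ThesisRecord) -> int:
        return int(c.institute == advisee.institute) + int(c.department == advisee.department)

    return max(eligible, key=shared_attributes)


advisee = ThesisRecord("R. Mehta", "Univ A", "Physics", date(2015, 3, 1))
candidates = [
    ThesisRecord("S. Rao", "Univ B", "Physics", date(1998, 6, 1)),
    ThesisRecord("S. Rao", "Univ C", "Chemistry", date(2001, 1, 15)),
    ThesisRecord("S. Rao", "Univ A", "Physics", date(2018, 9, 30)),  # too late
]
best = pick_advisor_record(advisee, candidates)
print(best.submitted.year)  # prints: 1998
```

The third candidate matches the advisee's institute and department but is rejected first because its thesis postdates the advisee's, which shows why the date check runs before any attribute comparison.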

Fig. 21

Algorithm 3: Initially, students with the same name, department, and institute are assigned the same index, even if the records refer to different individuals. In that scenario, if advisor information (the advisor's thesis) is available, we use it to find the most likely student among all the students with the same name associated with that advisor. After that, we group the remaining students by thesis title and assign each group a different index. (Assumption: it is very unlikely that two students with the same name, in the same institute and department, have different thesis titles associated with the same advisor.)
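The title-based split in the Algorithm 3 caption can be illustrated with a short sketch. It covers only the final grouping step and leaves out the advisor-based matching. The dictionary keys, the normalisation (lower-casing and collapsing whitespace), and the function name `assign_student_indices` are assumptions for illustration.

```python
def normalise(text):
    """Lower-case and collapse whitespace (an assumed normalisation)."""
    return " ".join(text.lower().split())


def assign_student_indices(records):
    """Give one index per (name, institute, department, thesis title).

    records: list of dicts with keys 'name', 'institute',
    'department', and 'title' (hypothetical field names).
    Returns a list of indices parallel to records.
    """
    index_of = {}
    indices = []
    for rec in records:
        key = (
            normalise(rec["name"]),
            normalise(rec["institute"]),
            normalise(rec["department"]),
            normalise(rec["title"]),
        )
        # A new key means a new individual under this sketch's assumption.
        if key not in index_of:
            index_of[key] = len(index_of)
        indices.append(index_of[key])
    return indices


records = [
    {"name": "A. Sharma", "institute": "Univ A", "department": "Physics",
     "title": "Studies on thin films"},
    {"name": "A. Sharma", "institute": "Univ A", "department": "Physics",
     "title": "Optical fibre sensors"},
    {"name": "A. Sharma", "institute": "Univ A", "department": "Physics",
     "title": "Studies  on Thin Films"},  # duplicate with noisy formatting
]
indices = assign_student_indices(records)
print(indices)  # prints: [0, 1, 0]
```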

Fig. 22

Algorithm 4: Applicable when multiple advisors advised the student (multiple records exist, and the same student's name varies because of human error; see Appendix Table 2), when duplicate records exist, or when thesis titles were wrongly combined. The function merges students with similar names (with slight variation) that have the same thesis title. If the student names differ across the same thesis index (similarity value less than the threshold value), there is a chance that two different thesis indices were wrongly merged; in that case, we do not change the student indices.
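The merge rule in the Algorithm 4 caption can be sketched as below. The similarity measure (Python's `difflib.SequenceMatcher` ratio), the fixed threshold of 0.8, the tuple layout, and the function names are all assumptions chosen for illustration; the caption does not state which measure or threshold the authors use.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Similarity in [0, 1] between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def merge_same_thesis(records, threshold=0.8):
    """Map each student index to a merged index.

    records: list of (student_index, student_name, thesis_index)
    tuples (hypothetical layout). Students sharing a thesis index are
    merged only if all their names are similar to the first name in
    the group; otherwise their indices are left unchanged.
    """
    by_thesis = defaultdict(list)
    for student_idx, name, thesis_idx in records:
        by_thesis[thesis_idx].append((student_idx, name))

    mapping = {}
    for group in by_thesis.values():
        first_name = group[0][1]
        similar = all(
            name_similarity(first_name, name) >= threshold
            for _, name in group[1:]
        )
        if similar:
            # Collapse the group onto its smallest student index.
            target = min(idx for idx, _ in group)
            for idx, _ in group:
                mapping[idx] = target
        else:
            # Names differ too much: the thesis indices may have been
            # merged wrongly, so keep every student index as it is.
            for idx, _ in group:
                mapping[idx] = idx
    return mapping


records = [
    (0, "Dhananjay Kumar", 10),
    (1, "Dhananjay Kumr", 10),   # spelling variation of the same name
    (2, "Asha Rao", 11),
    (3, "Vikram Singh", 11),     # clearly a different person
]
mapping = merge_same_thesis(records)
print(mapping)  # prints: {0: 0, 1: 0, 2: 2, 3: 3}
```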

Sample of researcher records with a few attributes

Table 2 Sample of researcher records (Reference for complete dataset)

Top researchers in Shodhganga-AGN based on different genealogical metrics

Table 3 Top 5 researchers who have trained maximum advisees (fecundity value)
Table 4 Top 5 researchers with maximum fertile children
Table 5 Top 5 researchers with maximum lineage size (number of descendants)
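The three metrics named in Tables 3 to 5 can be computed from an advisor-to-advisee edge list as in the sketch below. The definitions follow the table captions where they give one (fecundity as the number of advisees, lineage size as the number of descendants). Treating a "fertile child" as an advisee who has at least one advisee of their own is an assumption of this example, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def build_children(edges):
    """Map each advisor to the set of their direct advisees."""
    children = defaultdict(set)
    for advisor, advisee in edges:
        children[advisor].add(advisee)
    return children


def fecundity(children, researcher):
    """Number of direct advisees."""
    return len(children.get(researcher, ()))


def fertile_children(children, researcher):
    """Number of advisees who have advisees themselves (assumed definition)."""
    return sum(1 for c in children.get(researcher, ()) if children.get(c))


def lineage_size(children, researcher):
    """Number of distinct descendants, each counted once."""
    seen = set()
    stack = list(children.get(researcher, ()))
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, ()))
    return len(seen)


# Toy genealogy: D is co-advised by B and C, and D advises E.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
children = build_children(edges)
print(fecundity(children, "A"),
      fertile_children(children, "A"),
      lineage_size(children, "A"))  # prints: 2 2 4
```

Tracking visited nodes matters because co-advised researchers are reachable along more than one path and would otherwise be counted twice.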

Institute abbreviation, NIRF ranking, and DDC subject code with subject name

Table 6 Top 5 researchers with maximum cousins
Table 7 Top 5 researchers with maximum lineage depth
Table 8 Top 5 researchers with maximum advisors
Table 9 Top 5 researchers based on \(h_m\)-index value
Table 10 Top 5 researchers based on \(g_m\)-index value
Table 11 DDC codes
Table 12 Institute abbreviation and NIRF ranking (2021)

Extracted subgraph structures from Shodhganga-AGN

Fig. 23

Extracted subgraph instances from Shodhganga-AGN

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kumar, D., Bhowmick, P.K., Dey, S. et al. On the banks of Shodhganga: analysis of the academic genealogy graph of an Indian ETD repository. Scientometrics 128, 3879–3914 (2023). https://doi.org/10.1007/s11192-023-04728-z

  • DOI: https://doi.org/10.1007/s11192-023-04728-z
