Volume 100, Issue 2, pp 407–437

Topic-based Pagerank: toward a topic-level scientific evaluation

  • Erjia Yan


Within the same research field, different subfields and topics may exhibit varied citation behaviors and scholarly communication patterns. For a more effective scientific evaluation at the topic level, this study proposes a topic-based PageRank approach. This approach aims to evaluate the scientific impact of research entities (e.g., papers, authors, journals, and institutions) at the topic level. The proposed topic-based PageRank, when applied to a data set of library and information science publications, effectively detected a variety of research topics and identified the authors, papers, and journals of the highest impact within each topic. Evaluation results show that, compared with the standard PageRank and a topic modeling technique, the proposed topic-based PageRank has the best performance on relevance and impact. Different perspectives on organizing scientific literature are also discussed, and this study recommends the mode of organization that integrates stable research domains and dynamic topics.


Scientific evaluation, Impact, PageRank, Topic models



Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2014

Authors and Affiliations

  1. College of Computing and Informatics, Drexel University, Philadelphia, USA
