Topic-based Pagerank: toward a topic-level scientific evaluation

Scientometrics

Abstract

Within a single research field, different subfields and topics may exhibit varied citation behaviors and scholarly communication patterns. To support more effective scientific evaluation at the topic level, this study proposes a topic-based PageRank approach that evaluates the scientific impact of research entities (e.g., papers, authors, journals, and institutions) topic by topic. Applied to a data set of library and information science publications, the proposed topic-based PageRank detected a variety of research topics and identified the highest-impact authors, papers, and journals within each topic. Evaluation results show that, compared with the standard PageRank and a topic modeling technique, the proposed topic-based PageRank performs best on relevance and impact. Different perspectives on organizing scientific literature are also discussed, and this study recommends a mode of organization that integrates stable research domains with dynamic topics.
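The abstract does not spell out how the topic-based scores are computed, so the following is only a minimal sketch of one common way to make PageRank topic-aware: a personalized (topic-sensitive) PageRank in which the teleport distribution is proportional to each paper's weight on a topic. It should not be read as the author's exact formulation. The function name `topic_pagerank`, the toy citation graph, the topic weights, and the damping factor of 0.85 are all illustrative assumptions.

```python
def topic_pagerank(citations, topic_weight, damping=0.85, iters=100, tol=1e-10):
    """Personalized PageRank over a citation graph (illustrative sketch).

    citations:    dict mapping a paper to the list of papers it cites.
    topic_weight: dict mapping every paper to a non-negative weight on one
                  topic (hypothetical input, e.g. from a topic model).
    """
    nodes = sorted(topic_weight)
    total = sum(topic_weight.values())
    # Teleport distribution: proportional to topic weight, sums to 1.
    teleport = {n: topic_weight[n] / total for n in nodes}
    # Keep only citations that point at known papers.
    out = {n: [m for m in citations.get(n, []) if m in teleport] for n in nodes}

    rank = dict(teleport)
    for _ in range(iters):
        # Rank held by papers that cite nothing is redistributed via teleport.
        dangling = sum(rank[n] for n in nodes if not out[n])
        new = {n: (1 - damping + damping * dangling) * teleport[n] for n in nodes}
        for n in nodes:
            if out[n]:
                share = damping * rank[n] / len(out[n])
                for m in out[n]:
                    new[m] += share
        delta = sum(abs(new[n] - rank[n]) for n in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Toy data (assumed, not from the paper): A cites B and C; B and D cite C.
citations = {"A": ["B", "C"], "B": ["C"], "C": [], "D": ["C"]}
weights_topic1 = {"A": 0.1, "B": 0.7, "C": 0.1, "D": 0.1}  # B is central to topic 1
weights_topic2 = {"A": 0.1, "B": 0.1, "C": 0.1, "D": 0.7}  # D is central to topic 2

ranks_topic1 = topic_pagerank(citations, weights_topic1)
ranks_topic2 = topic_pagerank(citations, weights_topic2)
for paper in sorted(citations):
    print(paper, round(ranks_topic1[paper], 3), round(ranks_topic2[paper], 3))
```

In this toy graph the same paper receives a different score under each topic, which is the behavior a topic-level evaluation is after: running the function once per topic yields a separate ranking for each.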

Notes

  1. http://www.pages.drexel.edu/~ey86/p/topic_pagerank/stoplist.txt.

Author information

Correspondence to Erjia Yan.

Appendix

See Tables 15, 16, 17 and 18.

Table 15 Top ranked core research entities for topics 1–5
Table 16 Top ranked core research entities for topics 6–10
Table 17 Top ranked core research entities for topics 11–15
Table 18 Top ranked core research entities for topics 16–20

Cite this article

Yan, E. Topic-based Pagerank: toward a topic-level scientific evaluation. Scientometrics 100, 407–437 (2014). https://doi.org/10.1007/s11192-014-1308-5
