Abstract
Within a single research field, subfields and topics can differ in their citation behaviors and scholarly communication patterns. To support more effective scientific evaluation at the topic level, this study proposes a topic-based PageRank approach that evaluates the scientific impact of research entities (e.g., papers, authors, journals, and institutions) topic by topic. Applied to a data set of library and information science publications, the proposed approach detected a variety of research topics and identified the highest-impact authors, papers, and journals for each topic. Evaluation results show that, compared with the standard PageRank and a topic modeling technique, the topic-based PageRank performs best on relevance and impact. Different perspectives on organizing scientific literature are also discussed, and the study recommends a mode of organization that integrates stable research domains with dynamic topics.
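The abstract does not spell out the formulation, so the following is only a minimal sketch of the general idea, assuming a personalized (topic-sensitive) PageRank in which random jumps land on papers in proportion to their weight in a given topic. The function name `topic_pagerank`, the toy citation graph, and the topic weights are all hypothetical illustrations, not the study's data or its exact method.

```python
def topic_pagerank(citations, topic_weight, damping=0.85, iters=100):
    """Personalized PageRank biased toward one topic (illustrative sketch).

    citations: dict mapping each paper to the list of papers it cites.
    topic_weight: dict mapping each paper to an assumed probability of
        belonging to the topic of interest (e.g., from a topic model).
    """
    nodes = sorted(set(citations) | {c for cited in citations.values() for c in cited})
    total = sum(topic_weight.get(n, 0.0) for n in nodes)
    # Teleport distribution: proportional to topic weight, uniform as fallback.
    if total > 0:
        teleport = {n: topic_weight.get(n, 0.0) / total for n in nodes}
    else:
        teleport = {n: 1.0 / len(nodes) for n in nodes}

    rank = dict(teleport)
    for _ in range(iters):
        new = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            cited = citations.get(n, [])
            if cited:
                # A citing paper splits its score evenly among the papers it cites.
                share = damping * rank[n] / len(cited)
                for c in cited:
                    new[c] += share
            else:
                # A paper that cites nothing returns its score via the teleport vector.
                for m in nodes:
                    new[m] += damping * rank[n] * teleport[m]
        rank = new
    return rank


# Hypothetical toy network: two disconnected citation pairs, A -> B and C -> D.
citations = {"A": ["B"], "B": [], "C": ["D"], "D": []}
ranks_topic1 = topic_pagerank(citations, {"A": 1.0})  # topic centered on A
ranks_topic2 = topic_pagerank(citations, {"C": 1.0})  # topic centered on C

for paper in sorted(ranks_topic1):
    print(paper, round(ranks_topic1[paper], 3), round(ranks_topic2[paper], 3))
```

In this toy case the same citation graph yields different rankings per topic: B scores above D under the first topic and below it under the second, which is the kind of topic-level separation a single global PageRank cannot express.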
Cite this article
Yan, E. Topic-based Pagerank: toward a topic-level scientific evaluation. Scientometrics 100, 407–437 (2014). https://doi.org/10.1007/s11192-014-1308-5
DOI: https://doi.org/10.1007/s11192-014-1308-5