Predicting the Future Impact of Academic Publications

  • Carolina Bento
  • Bruno Martins
  • Pável Calado
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8154)


Predicting the future impact of academic publications has many important applications. In this paper, we propose methods for predicting future article impact, leveraging digital libraries of academic publications that contain citation information. Using a set of successive past impact scores, obtained through graph-ranking algorithms such as PageRank, we study how publications evolve in terms of their yearly impact scores and learn regression models that predict either the future PageRank scores or the future number of downloads. Results obtained over a DBLP citation dataset, covering papers published up to 2011, show that the impact predictions are highly accurate for all experimental setups. The best results were obtained by a model based on regression trees, using features derived from PageRank scores, PageRank change rates, author PageRank scores, and term occurrence frequencies in the abstracts and titles of the publications, all computed over citation graphs from the three previous years.
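As a rough illustration of the feature construction described in the abstract, the sketch below computes PageRank over three yearly citation-graph snapshots and derives past scores and change rates for one paper. It is not the authors' implementation: the toy graph, the function names `pagerank` and `yearly_features`, the damping factor of 0.85, and the definition of change rate as relative year-over-year difference are all assumptions made for the example. The regression step itself (for instance, a regression tree fitted on these features) is left out.

```python
def pagerank(nodes, edges, damping=0.85, iterations=50):
    """Power-iteration PageRank; edges are (citing, cited) pairs."""
    n = len(nodes)
    out_links = {v: [] for v in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by papers that cite nothing is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n
                    for v in nodes}
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


def yearly_features(snapshots, paper):
    """Return past PageRank scores followed by their change rates.

    snapshots: list of (nodes, edges) tuples ordered by year.
    Change rate is assumed here to be the relative difference
    between consecutive yearly scores.
    """
    scores = [pagerank(nodes, edges).get(paper, 0.0)
              for nodes, edges in snapshots]
    rates = [(b - a) / a if a > 0 else 0.0
             for a, b in zip(scores, scores[1:])]
    return scores + rates


# Hypothetical citation graphs for three consecutive years.
snapshots = [
    (["A", "B"], [("B", "A")]),
    (["A", "B", "C"], [("B", "A"), ("C", "A")]),
    (["A", "B", "C", "D"],
     [("B", "A"), ("C", "A"), ("D", "A"), ("D", "C")]),
]

features = yearly_features(snapshots, "A")
final_rank = pagerank(*snapshots[-1])
print(features)
```

Each feature vector built this way could serve as one training row, with the following year's PageRank score (or download count) as the regression target.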


Regression Tree, Digital Library, Citation Network, Impact Score, Future Impact
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Carolina Bento (1)
  • Bruno Martins (1)
  • Pável Calado (1)
  1. INESC-ID, Instituto Superior Técnico, Porto Salvo, Portugal
