Volume 118, Issue 1, pp 109–125

Can we predict ESI highly cited publications?

  • Fenghua Wang
  • Ying Fan
  • An Zeng
  • Zengru Di


The highly cited papers defined by Clarivate Analytics’ Essential Science Indicators (ESI) have been widely used to measure the scientific performance of scientists, research institutions, universities and countries. However, researchers have seldom studied which factors make a paper more likely to become an ESI highly cited paper, and the prediction of ESI highly cited papers has received even less attention. Drawing on existing research on the factors that influence a paper’s citations, this study considers four classical factors: the scientific impact of the first author, the scientific impact of the potential leader, the scientific impact of the team, and the relevance of the authors’ existing papers. Following the definition of ESI highly cited papers, we develop a new measure of a paper’s scientific impact. First, we compute the statistical properties of the four factors on APS data and Nobel data to examine how the factors behave for ESI highly cited papers. Then, Spearman correlation and logistic regression are applied to explore the relationship between the four factors and papers’ scientific impact. Finally, we try to predict highly cited papers with NN algorithms that incorporate the four factors. The results show that the potential leader factor plays a more important role in the short term than in the long term, whereas the team factor is more important in the long term. Interestingly, the first author factor has no obvious effect on scientific impact among the top 1% of papers. The prediction results are better than random.
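The two basic ingredients named in the abstract, a top-percentile label for highly cited papers and a Spearman rank correlation between a factor and citation impact, can be illustrated with a minimal sketch. This is not the authors' code. The function names, the per-year grouping, and the tie handling are assumptions made for illustration, and field normalization, which the ESI definition also uses, is omitted.

```python
from collections import defaultdict
from statistics import mean


def top_fraction_labels(papers, fraction=0.01):
    """Flag papers in the top `fraction` by citations within their publication year.

    `papers` is an iterable of (paper_id, year, citations) tuples.
    Assumption: ties at the cutoff are broken by paper_id, and every year
    keeps at least one paper.
    """
    by_year = defaultdict(list)
    for paper_id, year, citations in papers:
        by_year[year].append((citations, paper_id))
    labels = {}
    for group in by_year.values():
        group.sort(reverse=True)  # most cited first
        n_top = max(1, int(len(group) * fraction))
        for rank, (_, paper_id) in enumerate(group):
            labels[paper_id] = rank < n_top
    return labels


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

With `fraction=0.01` the label roughly mirrors the top 1% cutoff mentioned in the abstract. `spearman` could then be applied to any one of the four factors against the citation counts of the same papers.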


ESI, Citation network, Scientific impact, Prediction



This work was supported by the National Natural Science Foundation of China (Grant Nos. 61603046 and 61374175) and the Natural Science Foundation of Beijing (Grant No. L160008).



Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2018

Authors and Affiliations

  1. School of Systems Science, Beijing Normal University, Beijing, People’s Republic of China
