Can we predict ESI highly cited publications?

Scientometrics

Abstract

Highly cited papers, as defined by Clarivate Analytics’ Essential Science Indicators (ESI), are widely used to measure the scientific performance of scientists, research institutions, universities and countries. However, few studies have examined which factors make a paper more likely to become an ESI highly cited paper, and the prediction of ESI highly cited papers has likewise received little attention. Drawing on existing research into the factors that influence a paper’s citations, we select four classical factors for this study: the scientific impact of the first author, the scientific impact of the potential leader, the scientific impact of the team, and the relevance of the authors’ existing papers. Following the definition of ESI highly cited papers, we develop a new measure of a paper’s scientific impact. First, we compute statistical properties of the four factors using APS data and Nobel data to examine how the factors behave for ESI highly cited papers. Then, Spearman correlation and logistic regression are applied to explore the relationship between the four factors and papers’ scientific impact. Finally, we try to predict highly cited papers with NN algorithms that incorporate the four factors. The results show that the potential leader factor plays a more important role in the short term than in the long term, whereas the team factor is more important in the long term. Interestingly, the first author factor has no obvious effect on scientific impact among the top 1% of papers. The prediction results are better than random.
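
The abstract does not spell out the new impact measure or the data layout, so the following is only a minimal sketch of two of the ideas it names: flagging the top 1% of papers by citations within a cohort (in the spirit of the ESI definition) and computing a Spearman rank correlation between a factor and citation counts. Grouping by publication year, the tuple layout of `papers`, and the function names `top_percent_flags`, `average_ranks` and `spearman` are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def top_percent_flags(papers, percent=1.0):
    """Flag papers whose citation count is in the top `percent` of their year.

    `papers` is a list of (paper_id, year, citations) tuples (hypothetical
    layout). Papers tied with the cutoff value are all flagged.
    """
    by_year = defaultdict(list)
    for paper_id, year, citations in papers:
        by_year[year].append((paper_id, citations))

    flags = {}
    for group in by_year.values():
        counts = sorted((c for _, c in group), reverse=True)
        # Size of the top slice for this year, never smaller than one paper.
        k = max(1, int(len(counts) * percent / 100))
        cutoff = counts[k - 1]
        for paper_id, citations in group:
            flags[paper_id] = citations >= cutoff
    return flags


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

In this sketch a factor value per paper (for example, a first-author impact score) would be passed as `x` and the paper's citation count as `y`; the boolean flags from `top_percent_flags` could serve as the binary target for a classifier such as logistic regression.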

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61603046 and 61374175) and the Natural Science Foundation of Beijing (Grant No. L160008).

Author information

Corresponding authors

Correspondence to Ying Fan or An Zeng.

About this article

Cite this article

Wang, F., Fan, Y., Zeng, A. et al. Can we predict ESI highly cited publications? Scientometrics 118, 109–125 (2019). https://doi.org/10.1007/s11192-018-2965-6

  • DOI: https://doi.org/10.1007/s11192-018-2965-6
