Research on the Prediction of Highly Cited Papers Based on PCA-BPNN

  • Conference paper
Modeling and Simulation of Social-Behavioral Phenomena in Creative Societies (MSBC 2022)

Abstract

As investment in scientific research has grown, the number of published papers has increased significantly, and evaluating the impact of papers has received extensive attention from scholars. Citation frequency is the most convenient and widely used index of a paper's academic influence. However, citation frequency can only reflect the real impact of a paper some time after it has been published. Therefore, to identify highly cited papers at an early stage of publication, this paper collects data on 1025 academic papers published in 2007 under the library and information science discipline in the Web of Science database, and extracts 24 predictive features from three aspects: papers, authors, and journals. On this basis, PCA-based feature screening is used to construct 7 principal component vectors. These are combined with a BP neural network to build the PCA-BPNN classification model for predicting highly cited papers, which is then compared with 5 other models. The results show that the PCA-BPNN model built in this paper has better prediction performance and provides an effective model for predicting paper influence.
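The pipeline described in the abstract (24 features reduced to 7 principal components, then fed to a back-propagation network) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the data are synthetic, the "top 10% = highly cited" labeling rule, the 800/225 split, the hidden-layer size, the learning rate, and all function names (`pca_fit`, `pca_transform`, `train_bpnn`, `predict`) are assumptions chosen only for illustration.

```python
import numpy as np

def pca_fit(X, n_components):
    """Standardize X and return (mean, std, components) for later projection."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # guard against constant features
    Z = (X - mean) / std
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, std, Vt[:n_components]

def pca_transform(X, mean, std, components):
    """Project standardized features onto the retained components."""
    return ((X - mean) / std) @ components.T

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bpnn(X, y, hidden=8, lr=0.5, epochs=500, seed=0):
    """One-hidden-layer network trained by plain batch back-propagation."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    y = y.reshape(-1, 1)
    n = len(X)
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)             # forward pass: hidden layer
        p = sigmoid(h @ W2 + b2)             # forward pass: output probability
        d2 = (p - y) / n                     # cross-entropy gradient at output
        d1 = (d2 @ W2.T) * h * (1.0 - h)     # error propagated back to hidden
        W2 -= lr * (h.T @ d2)
        b2 -= lr * d2.sum(axis=0)
        W1 -= lr * (X.T @ d1)
        b1 -= lr * d1.sum(axis=0)
    return W1, b1, W2, b2

def predict(X, params):
    W1, b1, W2, b2 = params
    p = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
    return (p >= 0.5).astype(int).ravel()

# Synthetic stand-in for the 1025 papers x 24 features (assumed, not real data).
rng = np.random.default_rng(42)
X = rng.normal(size=(1025, 24))
score = X[:, :3].sum(axis=1) + 0.5 * rng.normal(size=1025)
y = (score > np.quantile(score, 0.9)).astype(float)   # assumed labeling rule

X_train, X_test = X[:800], X[800:]
y_train, y_test = y[:800], y[800:]

# Fit PCA on the training split only, then reuse it for the test split.
mean, std, components = pca_fit(X_train, n_components=7)
Z_train = pca_transform(X_train, mean, std, components)
Z_test = pca_transform(X_test, mean, std, components)

params = train_bpnn(Z_train, y_train)
y_pred = predict(Z_test, params)
accuracy = float((y_pred == y_test).mean())
print("test accuracy on synthetic data:", accuracy)
```

Because the classes are imbalanced under the assumed labeling rule, accuracy alone says little here; a real evaluation would also report precision and recall for the highly cited class.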

Author information

Corresponding author

Correspondence to Tian Yu.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Yu, T., Duan, C. (2023). Research on the Prediction of Highly Cited Papers Based on PCA-BPNN. In: Agarwal, N., Kleiner, G.B., Sakalauskas, L. (eds) Modeling and Simulation of Social-Behavioral Phenomena in Creative Societies. MSBC 2022. Communications in Computer and Information Science, vol 1717. Springer, Cham. https://doi.org/10.1007/978-3-031-33728-4_12

  • DOI: https://doi.org/10.1007/978-3-031-33728-4_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-33727-7

  • Online ISBN: 978-3-031-33728-4

  • eBook Packages: Computer Science, Computer Science (R0)
