Trend-Based Citation Count Prediction for Research Articles

  • Cheng-Te Li
  • Yu-Jen Lin
  • Rui Yan
  • Mi-Yen Yeh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9077)


This paper aims to predict the future impact of any paper of interest, measured by its citation count. While existing studies on Citation Count Prediction (CCP) rely on features derived from a paper's content or publication information, we propose to leverage the paper's citation count trend and develop a Trend-based Citation Count Prediction (T-CCP) model. By observing how a paper's citation count fluctuates over time, we identify five typical citation trends: early burst, middle burst, late burst, multi bursts, and no bursts. T-CCP first performs Citation Trend Classification (CTC) to detect the citation trend of a paper, and then learns a separate predictive function for each trend to predict the citation count. We investigate two categories of features for CCP, CTC, and T-CCP: publication features (author, venue, expertise, social, and reinforcement features) and early citation behaviors (citation statistical and structural features). Experiments on the ArnetMiner citation dataset yield promising results: T-CCP outperforms CCP, and the proposed features are more effective than conventional ones.
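The two-stage idea (classify the citation trend first, then apply a trend-specific predictor) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the nearest-centroid classifier and the one-feature least-squares regressor are simple stand-ins chosen to keep the example dependency-free, the function names (`fit_tccp`, `predict_tccp`, and so on) are hypothetical, and the toy feature vectors and citation counts are invented for illustration only.

```python
from collections import defaultdict
from statistics import mean

# The five trend labels named in the abstract.
TRENDS = ("early burst", "middle burst", "late burst", "multi bursts", "no bursts")


def fit_centroids(features, trends):
    """Stage 1 (training): average feature vector per trend label."""
    grouped = defaultdict(list)
    for x, t in zip(features, trends):
        grouped[t].append(x)
    return {t: [mean(col) for col in zip(*rows)] for t, rows in grouped.items()}


def classify_trend(centroids, x):
    """Stage 1 (inference): pick the trend whose centroid is nearest to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda t: sq_dist(centroids[t]))


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept on one feature."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return slope, my - slope * mx


def fit_tccp(features, trends, counts, reg_index=0):
    """Train the trend classifier plus one regressor per observed trend."""
    centroids = fit_centroids(features, trends)
    per_trend = defaultdict(lambda: ([], []))
    for x, t, y in zip(features, trends, counts):
        per_trend[t][0].append(x[reg_index])
        per_trend[t][1].append(y)
    regressors = {t: fit_line(xs, ys) for t, (xs, ys) in per_trend.items()}
    return centroids, regressors, reg_index


def predict_tccp(model, x):
    """Stage 2: route x to the regressor of its predicted trend."""
    centroids, regressors, reg_index = model
    trend = classify_trend(centroids, x)
    slope, intercept = regressors[trend]
    return trend, slope * x[reg_index] + intercept


# Toy data (invented, not from the paper). Each feature vector is
# (citations in the first two years, a hypothetical author score).
features = [(10, 5), (12, 6), (14, 7), (1, 1), (2, 2), (3, 1)]
trends = ["early burst"] * 3 + ["no bursts"] * 3
counts = [50, 60, 70, 2, 4, 6]

model = fit_tccp(features, trends, counts)
trend, estimate = predict_tccp(model, (11, 6))
print(trend, estimate)
```

The point of the design is that each trend gets its own fitted function, so a paper classified as "early burst" is never scored with parameters learned from "no bursts" papers. A single global regressor, as in plain CCP, would have to fit both groups at once.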


Citation count · Citation link · Citation category · Citation graph





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Research Center for IT Innovation, Academia Sinica, Taipei, Taiwan
  2. Institute of Information Science, Academia Sinica, Taipei, Taiwan
  3. Natural Language Processing Department, Baidu Inc., Beijing, China
