Scientometrics, Volume 105, Issue 3, pp 1577–1604

Modelling citation networks

Abstract

The distribution of the number of academic publications against citation count, for papers published in the same year, is remarkably similar from year to year. We characterise the shape of each such distribution by a ‘width’, \(\sigma^2\), obtained by fitting a log-normal to it, and find the width to be approximately constant across publication years. This similarity is not surprising: after all, why would papers published in one year be cited more than those published in another? Nevertheless, we show that simple citation models fail to capture this behaviour. We then provide a simple three-parameter citation network model which can reproduce the correct width over time. We test the model on the citation network of papers from the hep-th section of arXiv. Our final model reproduces the observed width of the data when around 20% of the citations in the model are made to recently published papers anywhere in the network (‘global information’). The remaining 80% of citations are made using the references in these papers’ bibliographies (‘local searches’). We note that this split is consistent with other studies, though our motivation, reproducing the above distribution over time, is very different. Finally, we find that, in the citation network model, varying the number of papers referenced by a new publication is important, as it alters the model parameters fitted to the data. This is not addressed in current models and needs further work.
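The two ingredients named in the abstract, a log-normal ‘width’ and a mix of global and local citations, can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not specify the three parameters or the fitting procedure, so the function names, the parameters `n_refs`, `p_global` and `window`, and the use of the sample variance of ln(c) over cited papers as a stand-in for \(\sigma^2\) are all assumptions made for illustration only.

```python
import math
import random


def lognormal_width(citation_counts):
    """Sample variance of ln(c) over papers with c > 0.

    Assumption: a simple moment estimate standing in for the paper's
    log-normal fit, whose details the abstract does not give.
    """
    logs = [math.log(c) for c in citation_counts if c > 0]
    if len(logs) < 2:
        raise ValueError("need at least two papers with nonzero citations")
    mean = sum(logs) / len(logs)
    return sum((x - mean) ** 2 for x in logs) / (len(logs) - 1)


def grow_network(n_papers, n_refs=10, p_global=0.2, window=50, seed=0):
    """Toy growth model; returns references[i], the set of papers cited by i.

    All parameters are hypothetical: n_refs is the bibliography length,
    p_global the share of citations made by global search, and window the
    number of most recent papers counted as 'recently published'.
    """
    rng = random.Random(seed)
    references = [set()]  # the first paper has nothing to cite
    for new in range(1, n_papers):
        target = min(n_refs, new)  # cannot cite more papers than exist
        n_global = max(1, round(p_global * target))
        # Global step: cite uniformly among recently published papers.
        recent = range(max(0, new - window), new)
        cited = set(rng.sample(recent, min(n_global, len(recent))))
        # Local step: copy entries from the bibliographies of those papers.
        pool = sorted({r for c in cited for r in references[c]} - cited)
        rng.shuffle(pool)
        cited.update(pool[: target - len(cited)])
        references.append(cited)
    return references


def in_degrees(references):
    """Citation count received by each paper."""
    counts = [0] * len(references)
    for refs in references:
        for r in refs:
            counts[r] += 1
    return counts


if __name__ == "__main__":
    network = grow_network(2000)
    print("width estimate:", lognormal_width(in_degrees(network)))
```

Because every citation points to an earlier paper, the toy network is a directed acyclic graph, matching the keywords. Comparing the width estimate across groups of papers ‘published’ at different times, for different values of `p_global`, is one way to explore the kind of dependence the abstract describes.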

Keywords

Complex networks · Directed acyclic graphs · Bibliometrics · Citation networks

Mathematics Subject Classification

91D30 

Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2015

Authors and Affiliations

  1. School of Physics and Astronomy, Queen Mary University of London, London, UK
  2. Centre for Complexity Science and Physics Department, Imperial College London, London, UK
