Abstract

The distribution of the number of academic publications as a function of citation count, for papers published in the same year, is remarkably similar from year to year. We characterise the shape of such distributions by a ‘width’, \(\sigma ^2\), obtained by fitting a log-normal to each distribution, and find the width to be approximately constant across publication years. This similarity is not surprising: after all, why would papers published in one year be cited more than those published in another? Nevertheless, we show that simple citation models fail to capture this behaviour. We then provide a simple three-parameter citation network model which can reproduce the correct width over time. We use the citation network of papers from the hep-th section of arXiv to test our model. Our final model reproduces the observed ‘width’ of the data when around 20% of the citations in the model are made to recently published papers drawn from the entire network (‘global information’). The remaining 80% of citations are made using the references in these papers’ bibliographies (‘local searches’). We note that this is consistent with other studies, though our motivation, reproducing the above distribution over time, is very different. Finally, we find that, in the citation network model, varying the number of papers referenced by a new publication is important, as it alters the values of the model parameters fitted to the data. This is not addressed in current models and needs further work.
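As an illustration of the ‘width’ used above, the following minimal sketch estimates \(\sigma ^2\) as the variance of the logarithm of the citation counts for papers published in one year. It is a sketch under stated assumptions, not the fitting procedure used in this work: the function name `lognormal_width` is hypothetical, papers with zero citations are simply dropped, and the check uses synthetic data rather than the hep-th network.

```python
import math
import random


def lognormal_width(citation_counts):
    """Maximum-likelihood log-normal fit: return (mu, sigma2) of ln(c).

    Assumption: papers with zero citations are excluded, since ln(0)
    is undefined. The source may treat them differently.
    """
    logs = [math.log(c) for c in citation_counts if c > 0]
    if len(logs) < 2:
        raise ValueError("need at least two papers with positive citation counts")
    mu = sum(logs) / len(logs)
    # Population variance of ln(c); this is the 'width' sigma^2.
    sigma2 = sum((x - mu) ** 2 for x in logs) / len(logs)
    return mu, sigma2


# Synthetic check (not real citation data): draw from a log-normal with
# known parameters and confirm that the width is approximately recovered.
rng = random.Random(0)
sample = [rng.lognormvariate(1.0, 1.2) for _ in range(20000)]
mu_hat, sigma2_hat = lognormal_width(sample)
```

Applying such a function separately to the papers of each publication year gives one width per year, which is the quantity compared across years.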

Keywords

Complex networks · Directed acyclic graphs · Bibliometrics · Citation networks

Mathematics Subject Classification

91D30

Notes

Acknowledgments

We would like to thank James Gollings and James Clough for allowing us to use their transitive reduction code, from which we created our own declustering code. We would also like to thank Tamar Loach for sharing her results on related projects and M. V. Simkin for discussions about his work.

