Abstract
We test 16 bibliometric indicators with respect to their validity at the level of the individual researcher by estimating their power to predict later successful researchers. For a sample of astrophysics researchers who later co-authored highly cited papers, we compute the indicators from their publications before their first landmark paper and compare them with the distributions of these indicators over a random control group of young authors in astronomy and astrophysics. We find that field and citation-window normalisation substantially improves the predictive power of citation indicators. The sum of citation numbers normalised by expected citation numbers is the only indicator which shows differences between later stars and random authors that are significant at the 1 % level. Indicators of paper output are not very useful for predicting later stars. The famous h-index does not distinguish at all between later stars and the random control group.
Notes
14th International Society of Scientometrics and Informetrics Conference in Vienna, Austria, 15th to 20th July 2013 (Havemann and Larsen 2013).
We exclude candidates with many paper sets offered by Author Search.
cf. Henneken et al. (2011, p. 5).
cf. the Wikipedia article http://en.wikipedia.org/wiki/Mann-Whitney-Wilcoxon_test.
http://www.r-project.org (R-scripts for indicator calculation and sample data can be obtained from the first author of this paper).
It would be interesting—from a theoretical point of view—to determine the influence of each correction separately.
This is in accordance with the result obtained by Neufeld et al. (2013, cf. p. 9) when comparing successful with non-successful applicants of a funding programme for young researchers.
References
Ajiferuke, I., Burrell, Q., & Tague, J. (1988). Collaborative coefficient: A single measure of the degree of collaboration in research. Scientometrics, 14, 421–433.
Bornmann, L., Leydesdorff, L., & Wang, J. (2013). Which percentile-based approach should be preferred for calculating normalized citation impact values? An empirical comparison of five approaches including a newly developed citation-rank approach (P100). Journal of Informetrics, 7(4), 933–944. See also http://arxiv.org/abs/1306.4454
Costas, R., van Leeuwen, T. N., & Bordons, M. (2010). A bibliometric classificatory approach for the study and assessment of research performance at the individual level: The effects of age on productivity and impact. Journal of the American Society for Information Science and Technology, 61(8), 1564–1581.
Egghe, L. (2006). An improvement of the h-index: The g-index. ISSI Newsletter, 2(2), 8–9.
Egghe, L. (2008). Mathematical theory of the \(h\)- and \(g\)-index in case of fractional counting of authorship. Journal of the American Society for Information Science and Technology, 59(10), 1608–1616.
Havemann, F., & Larsen, B. (2013). Bibliometric indicators of young authors in astrophysics: Can later stars be predicted? In J. Gorraiz, E. Schiebel, C. Gumpenberger, M. Hörlesberger, & H. Moed (Eds.), Proceedings of ISSI 2013 Vienna (Vol. 2, pp. 1881–1883).
Henneken, E. A., Kurtz, M. J., & Accomazzi, A. (2011). The ADS in the information age: Impact on discovery. arXiv:1106.5644.
Hirsch, J. E. (2005). An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences, 102(46), 16569–16572. http://arxiv.org/abs/physics/0508025.
Hönekopp, J., & Khan, J. (2012). Future publication success in science is better predicted by traditional measures than by the \(h\) index. Scientometrics, 90(3), 843–853.
Hornbostel, S., Böhmer, S., Klingsporn, B., Neufeld, J., & von Ins, M. (2009). Funding of young scientist and scientific excellence. Scientometrics, 79(1), 171–190.
Kosmulski, M. (2012). Calibration against a reference set: A quantitative approach to assessment of the methods of assessment of scientific output. Journal of Informetrics, 6(3), 451–456.
Kreiman, G., & Maunsell, J. H. R. (2011). Nine criteria for a measure of scientific output. Frontiers in Computational Neuroscience, 5, article nr. 48 (6 pages).
Lehmann, S., Jackson, A. D., & Lautrup, B. E. (2006). Measures for measures. Nature, 444(7122), 1003–1004.
Lehmann, S., Jackson, A. D., & Lautrup, B. E. (2008). A quantitative analysis of indicators of scientific performance. Scientometrics, 76(2), 369–390.
Levene, M., Fenner, T., & Bar-Ilan, J. (2012). A bibliometric index based on the complete list of cited publications. Cybermetrics: International Journal of Scientometrics, Informetrics and Bibliometrics (16), 1–6. See also arXiv:1304.6945.
Lozano, G. A., Larivière, V., & Gingras, Y. (2012). The weakening relationship between the impact factor and papers’ citations in the digital age. Journal of the American Society for Information Science and Technology, 63(11), 2140–2145.
Lundberg, J. (2007). Lifting the crown—citation z-score. Journal of Informetrics, 1(2), 145–154.
Marchant, T. (2009). Score-based bibliometric rankings of authors. Journal of the American Society for Information Science and Technology, 60(6), 1132–1137.
Nederhof, A. J., & van Raan, A. F. J. (1987). Peer review and bibliometric indicators of scientific performance: A comparison of cum laude doctorates with ordinary doctorates in physics. Scientometrics, 11(5–6), 333–350.
Neufeld, J., Huber, N., & Wegner, A. (2013). Peer review-based selection decisions in individual research funding, applicants’ publication strategies and performance: The case of the ERC starting grants. Research Evaluation, 22(4), 237–247.
Opthof, T. (2011). Differences in citation frequency of clinical and basic science papers in cardiovascular research. Medical & Biological Engineering & Computing, 49(6), 613–621.
Opthof, T., & Leydesdorff, L. (2010). Caveats for the journal and field normalizations in the CWTS (“Leiden”) evaluations of research performance. Journal of Informetrics, 4(3), 423–430.
Pepe, A., & Kurtz, M. J. (2012). A measure of total research impact independent of time and discipline. PLoS One, 7(11), e46428.
Pudovkin, A., Kretschmer, H., Stegmann, J., & Garfield, E. (2012). Research evaluation. Part I: Productivity and citedness of a German medical research institution. Scientometrics, 93(1), 3–16.
Radicchi, F., & Castellano, C. (2011). Rescaling citations of publications in physics. Physical Review E, 83(4), 046116.
Radicchi, F., & Castellano, C. (2012). Testing the fairness of citation indicators for comparison across scientific domains: The case of fractional citation counts. Journal of Informetrics, 6(1), 121–130.
Radicchi, F., Fortunato, S., & Castellano, C. (2008). Universality of citation distributions: Toward an objective measure of scientific impact. Proceedings of the National Academy of Sciences, 105(45), 17268–17272.
Sachs, L., & Hedderich, J. (2006). Angewandte Statistik. Methodensammlung mit R (12th ed.). Berlin: Springer.
Schreiber, M. (2008a). A modification of the \(h\)-index: The \(h_m\)-index accounts for multi-authored manuscripts. Journal of Informetrics, 2(3), 211–216.
Schreiber, M. (2008b). The influence of self-citation corrections on Egghe’s \(g\) index. Scientometrics, 76(1), 187–200. arXiv:0707.4577.
Schreiber, M. (2008c). To share the fame in a fair way, \(h_{\rm m}\) modifies \(h\) for multi-authored manuscripts. New Journal of Physics, 10(4), 040201.
Schreiber, M. (2009). Fractionalized counting of publications for the \(g\)-index. Journal of the American Society for Information Science and Technology, 60(10), 2145–2150.
Schubert, A., & Braun, T. (1986). Relative indicators and relational charts for comparative assessment of publication output and citation impact. Scientometrics, 9(5), 281–291.
Seglen, P. O. (1997). Why the impact factor of journals should not be used for evaluating research. BMJ: British Medical Journal, 314(7079), 498–513.
van Eck, N. J., & Waltman, L. (2008). Generalizing the \(h\)- and \(g\)-indices. Journal of Informetrics, 2(4), 263–271.
Waltman, L., & van Eck, N. J. (2012). The inconsistency of the h-index. Journal of the American Society for Information Science and Technology, 63(2), 406–415.
Waltman, L., & van Eck, N. J. (2013). A systematic empirical comparison of different approaches for normalizing citation impact indicators. Journal of Informetrics, 7(4), 833–849.
Waltman, L., van Eck, N. J., van Leeuwen, T. N., Visser, M. S., & van Raan, A. F. J. (2011). Towards a new crown indicator: An empirical analysis. Scientometrics, 87(3), 467–481.
Zhou, P., & Leydesdorff, L. (2011). Fractional counting of citations in research evaluation: A cross- and interdisciplinary assessment of the Tsinghua University in Beijing. Journal of Informetrics, 5(3), 360–368.
Acknowledgments
We thank Jesper W. Schneider of Aarhus University for helpful discussions of an early draft and Paul Wouters at CWTS in Leiden for kindly providing citation data. We further thank two anonymous reviewers who helped us to sharpen our definition of later stars. The analysis was done for the purposes of the ACUMEN project, financed by the European Commission, cf. http://research-acumen.eu/.
Appendix
Most cited papers of later stars
In Table 3 the most cited papers of later stars are compared to their most cited papers published within the first 5 years in which they have papers in journals covered by the Web of Science (WoS). The citation numbers per year (cit/year) are determined with a citation window of 5 years for the most cited papers over all years. For the most cited papers of the first 5 years the citation window is restricted to these 5 years, because we assume an evaluation directly after this period.
All authors in Table 3 received more than twice as many citations per year for their most cited paper as for their most cited paper of their first 5 years. The author at rank 8 received more citations for the early most cited paper than the authors at ranks 2, 9, and 21 for their most cited papers in the whole period of analysis. Omitting these three later stars would result in a clearer picture, but we cannot omit them without risking a bias with respect to subfields, because differing citation numbers could be caused by different citation behaviours in different subfields (see footnote 4, p. 3). Our definition of later stars can therefore only be relative: they received substantially higher citation numbers for their most cited paper after their first 5 years than within these years.
Nonetheless, we repeated all calculations for the smaller sample of 24 later stars. This gives us an impression of how stable the results are under small changes of the sample. All p values are now higher, as is expected for a smaller sample generated by random reduction, but again the normalised number of citations \(c_{\rm norm}\) is the best indicator (\(p = 1.8\,\%\)) and the h-index the worst measure (\(p=50.8\,\%\)).
Descriptions of indicators
Productivity indicators
Number of papers: This elementary indicator of productivity belongs to a bygone era when co-authorship was the exception rather than the rule. It has the unwanted side effects of encouraging multiple publication of the same results and honorary authorships.
Fractional score: Each paper \(i\) is divided into \(a_i\) fractions, where \(a_i\) is its number of authors. These fractions are summed over the papers of the evaluated author. We use the simplest variant, where all fractions of a paper are equal: \(f = \sum _i 1/a_i.\) This indicator penalises honorary authorships and takes into account that larger teams can be more productive.
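The fractional score can be computed in a few lines. The authors provide R scripts (see the note above); the following is only an illustrative Python sketch, with the function name chosen here:

```python
def fractional_score(author_counts):
    """Fractional paper score f = sum_i 1/a_i, where a_i is the
    number of authors of paper i (illustrative helper, not the authors' code)."""
    return sum(1.0 / a for a in author_counts)

# An author with one solo paper, one 2-author paper and one 4-author paper:
print(fractional_score([1, 2, 4]))  # 1 + 0.5 + 0.25 = 1.75
```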
Total influence
All indicators of total influence tend to increase with the author’s number of papers. That is, they also indicate productivity.
Number of citations: Each citation of a paper indicates that it has influenced the citing author(s). The sum \(c=\sum _i c_i\) of the raw citation numbers \(c_i\) of an author’s papers is highly field dependent, and each \(c_i\) depends on the age of the paper at the time of evaluation. Highly cited papers surely have some quality, but less cited papers can also be of high quality.
Normalised numbers of citations: We normalise each paper’s number of citations \(c_i\) by an expected number of citations \(\hbox{E}(c_i)\) which takes into account the paper’s age and the citation behaviour in astrophysics during the first 5 (calendar) years in the paper’s lifetime (cf. section “Expected citation numbers” of Appendix). After normalising each paper’s citation number we sum the ratios of observed and expected citation numbers: \(c_{\rm norm} = \sum _i c_i/{\rm E}(c_i).\)
Some bibliometricians calculate not the sum of ratios but the ratio of sums (Schubert and Braun 1986): \(\sum _i c_i/\sum _i{\rm E}(c_i).\) This procedure is intended to evaluate the whole oeuvre of an author but has recently been criticised for not being “consistent” (Opthof and Leydesdorff 2010; Waltman et al. 2011) (footnote 10).
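The difference between the sum of ratios used here and the Schubert–Braun ratio of sums can be made concrete with a small sketch (illustrative Python; function names are our own):

```python
def sum_of_ratios(citations, expected):
    # c_norm: normalise each paper by its expected citations, then sum
    return sum(c / e for c, e in zip(citations, expected))

def ratio_of_sums(citations, expected):
    # Schubert-Braun variant: sum observed and expected first, then divide
    return sum(citations) / sum(expected)

# Two papers: one cited twice as often as expected, one at half expectation
cits, exp_cits = [10, 2], [5, 4]
print(sum_of_ratios(cits, exp_cits))  # 2.0 + 0.5 = 2.5
print(ratio_of_sums(cits, exp_cits))  # 12 / 9
```

The two indicators generally disagree: the sum of ratios rewards each paper's relative performance, while the ratio of sums lets highly cited papers dominate the normalisation.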
The j-index: The j-index is the sum of the square roots of the citation numbers of the author’s papers: \(j = \sum _i \sqrt{c_i}.\)
It was proposed by Levene et al. (2012) to downgrade the influence of highly cited papers in the sum of citation numbers.
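A minimal sketch of this sum of square roots (illustrative Python):

```python
from math import sqrt

def j_index(citations):
    # j = sum_i sqrt(c_i): square roots damp the weight of highly cited papers
    return sum(sqrt(c) for c in citations)

print(j_index([9, 4, 1]))  # 3 + 2 + 1 = 6.0
```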
Fractional citations: Analogously to the fractional score described above, we distribute the citations of each paper equally to its authors: \(\sum _i c_i/a_i.\)
Fractional normalised citations: The normalised numbers of citations can also be distributed among the authors involved (Radicchi and Castellano 2011): \(\sum _i c_i/(a_i\,{\rm E}(c_i)).\)
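Both fractional variants only change the summand of the citation sum. A sketch under the assumption that the expected citation numbers \({\rm E}(c_i)\) are already known:

```python
def fractional_citations(citations, authors):
    # sum_i c_i / a_i: each paper's citations shared equally among its authors
    return sum(c / a for c, a in zip(citations, authors))

def fractional_normalised_citations(citations, authors, expected):
    # sum_i c_i / (a_i * E(c_i)): additionally normalised by expectation
    return sum(c / (a * e) for c, a, e in zip(citations, authors, expected))

print(fractional_citations([10, 4], [2, 4]))                     # 5 + 1 = 6.0
print(fractional_normalised_citations([10, 4], [2, 4], [5, 1]))  # 1 + 1 = 2.0
```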
Typical influence
Mean citation number: The arithmetic mean \(c/n\) of the citations of an author’s \(n\) papers is the simplest indicator of influence which does not tend to increase with the author’s productivity.
Mean fractional citations: The arithmetic mean of the fractionally counted citations of an author’s \(n\) papers: \((1/n)\sum _i c_i/a_i.\)
Median of fractional citations: The median of fractionally counted citations of an author’s papers \({\rm median}(c_i/a_i)\) is considered because citation distributions are skewed.
Maximum of fractional citations: We wondered whether for a later star a large maximum of (fractional) citations \(\max (c_i/a_i)\) is more typical than a large value of any measure of central tendency of citation numbers (Lehmann et al. 2008, cf. p. 375).
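The four indicators of typical influence differ only in the summary statistic applied to the (fractional) citation counts. A compact illustrative sketch (dictionary keys chosen here):

```python
from statistics import mean, median

def typical_influence(citations, authors):
    """Mean of raw citations plus mean, median and maximum of
    fractionally counted citations (illustrative, not the authors' code)."""
    frac = [c / a for c, a in zip(citations, authors)]
    return {
        "mean": mean(citations),
        "mean_frac": mean(frac),
        "median_frac": median(frac),
        "max_frac": max(frac),
    }

print(typical_influence([10, 8, 6], [2, 4, 2]))
```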
Indices of h-type
Hirsch index: The h-index was introduced by Hirsch (2005) “to quantify an individual’s scientific research output.” It is defined as the maximum rank \(r\) in a rank list of an author’s papers, ordered by their citation numbers \(c_i\), which is less than or equal to the citation number \(c_r\) of the paper with rank \(r\): \(h = \max ( r \mid c_r \ge r).\) The h-index has been criticised for its arbitrariness (van Eck and Waltman 2008). It is arbitrary because in the definition Hirsch “assumes an equality between incommensurable quantities” (Lehmann et al. 2008, p. 377), namely a rank and a citation number. Hirsch himself stated that his index depends on field-specific citation and collaboration behaviour (Hirsch 2005, p. 16571).
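The definition translates directly into code (illustrative Python sketch):

```python
def h_index(citations):
    # h = max(r | c_r >= r) over papers ranked by citations, descending
    ranked = sorted(citations, reverse=True)
    h = 0
    for r, c in enumerate(ranked, start=1):
        if c >= r:
            h = r
        else:
            break
    return h

print(h_index([10, 8, 5, 4, 3]))  # 4: the paper at rank 5 has only 3 < 5 citations
```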
Egghe’s g-index: Egghe (2006) criticised the h-index for being insensitive to the citation frequency of an author’s highly cited papers. His g-index can be defined as the maximum rank \(r\) which is less than or equal to the mean citation number \((\sum _i^r c_i)/r\) of the papers up to rank \(r\) (Schreiber 2008b). This condition is equivalent to \(\sum _i^r c_i \ge r^2.\) That means, \(g\) can also be defined as \(g = \max ( r \mid \sum _i^r c_i \ge r^2 ).\)
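The equivalent condition \(\sum _i^r c_i \ge r^2\) gives a simple cumulative-sum implementation (illustrative sketch):

```python
def g_index(citations):
    # g = max(r | sum of top-r citations >= r^2)
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for r, c in enumerate(ranked, start=1):
        total += c
        if total >= r * r:
            g = r
    return g

print(g_index([10, 8, 5, 4, 3]))  # 5: cumulative sums 10,18,23,27,30 vs 1,4,9,16,25
```

Note that this sketch lets \(g\) run only up to the number of papers; variants that pad the list with uncited papers can yield larger values.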
Fractional indices of h-type
Schreiber’s \(h_{\rm m}\)-index: Fractional counting of papers or of citations could be applied to define an h-index which takes multi-authorship into account (Egghe 2008; Schreiber 2008c). Schreiber (2008a) argued that fractionally counted citations could remove highly cited papers from the h-core if they have a lot of authors. This led him to define the \(h_{\rm m}\)-index as the maximal effective rank \(r_{\rm eff}(r)= \sum _i^r 1/a_i\) which is less than or equal to the number of citations \(c_r\): \(h_{\rm m} = \max ( r_{\rm eff}(r) \mid c_r \ge r_{\rm eff}(r) ).\)
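A sketch of the effective-rank idea: papers stay ranked by raw citations, but the rank counter advances by \(1/a_i\) instead of 1 (illustrative Python):

```python
def hm_index(citations, authors):
    # h_m = maximal effective rank r_eff = sum of 1/a_i over papers ranked
    # by citations (descending) such that c_r >= r_eff
    papers = sorted(zip(citations, authors), reverse=True)
    r_eff, hm = 0.0, 0.0
    for c, a in papers:
        r_eff += 1.0 / a
        if c >= r_eff:
            hm = r_eff
        else:
            break
    return hm

print(hm_index([10, 8, 5], [2, 2, 4]))  # 0.5 + 0.5 + 0.25 = 1.25
```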
Egghe’s \(g_{\rm f}\)-index: Egghe (2008) proposed to define a fractional g-index \(g_{\rm f}\) where the citations are counted fractionally: \(g_{\rm f} = \max ( r \mid \sum _i^r c_i/a_i \ge r^2 ).\)
Schreiber’s \(g_{\rm m}\)-index: Schreiber (2009) proposed a fractional g-index \(g_{\rm m}\) where both papers and citations are counted fractionally: \(g_{\rm m} = \max ( r_{\rm eff}(r) \mid \sum _i^r c_i/a_i \ge r_{\rm eff}(r)^2 ).\)
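The two fractional g-variants can be sketched side by side. This is our reading of the definitions: papers are ranked by fractional citation counts, and \(g_{\rm m}\) additionally replaces the rank by the effective rank (illustrative Python):

```python
def gf_index(citations, authors):
    # g_f: citations counted fractionally, ranks counted whole
    frac = sorted((c / a for c, a in zip(citations, authors)), reverse=True)
    total, g = 0.0, 0
    for r, fc in enumerate(frac, start=1):
        total += fc
        if total >= r * r:
            g = r
    return g

def gm_index(citations, authors):
    # g_m: both citations and ranks counted fractionally
    papers = sorted(zip(citations, authors), key=lambda p: p[0] / p[1],
                    reverse=True)
    total, r_eff, g = 0.0, 0.0, 0.0
    for c, a in papers:
        r_eff += 1.0 / a
        total += c / a
        if total >= r_eff ** 2:
            g = r_eff
    return g
```

For single-author papers both variants reduce to the ordinary g-index, which is a useful sanity check.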
Expected citation numbers
Usually, for field normalisation expected citation numbers of papers are calculated as arithmetic means of citation numbers of all papers (of the same document type) published in all journals of the field in the same year. There are two main technical problems with this method: the rough delineation of fields and the skewness of citation distributions.
We do not evaluate single authors but only want to show the influence of field normalisation on distributions of citation indicators of authors. Therefore we can use a random sample of papers (for which we already have the citation data) instead of all papers in the field. This sample contains papers published in the years 1991–2009 by all 331 random authors of our initial control sample. We only consider the 2,342 papers with at most 20 authors. Figure 2 shows the average cumulated citation numbers in the publication year, 1 year later, 2 years later, etc. Due to the skewness of citation distributions these arithmetic means fluctuate. We therefore performed a linear regression for each of the five time series of citation numbers of papers (not of the averages), but restricted the analysis to the years 1995–2007 (the coloured part of the regression lines), where we have more than 100 papers in each year. The interpolated citation numbers obtained by linear regression are used as expected citation numbers \({\rm E}(c_i)\) of papers published in the corresponding years.
From these data we estimate a doubling of citation numbers in astrophysics in the two decades around the millennium. Calculating expected citation numbers as field averages is problematic because the arithmetic mean is not a good measure of the central tendency of skewed citation distributions. Lundberg (2007) therefore proposed to determine expected citation numbers as geometric means of citation numbers of papers in the field. Because papers can have zero citations, he adds 1 to each citation count to be able to calculate the geometric mean. This can be justified by regarding the publication of a paper as the first citation of the published results.
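The two ingredients discussed here, a linear trend fitted to citation counts by publication year and Lundberg's shifted geometric mean, can be sketched as follows (ordinary least squares written out; function names are our own):

```python
from math import exp, log

def linear_fit(years, counts):
    # ordinary least-squares line through (year, citation count) pairs;
    # the fitted value at a publication year serves as E(c_i)
    n = len(years)
    my, mc = sum(years) / n, sum(counts) / n
    slope = (sum((y - my) * (c - mc) for y, c in zip(years, counts))
             / sum((y - my) ** 2 for y in years))
    intercept = mc - slope * my
    return lambda year: intercept + slope * year

def geometric_mean_expected(counts):
    # Lundberg (2007): geometric mean of c_i + 1; the shift of 1 handles
    # uncited papers (publishing counts as the first "citation")
    return exp(sum(log(c + 1) for c in counts) / len(counts))
```

For example, `linear_fit` extrapolates a perfectly linear series exactly, and `geometric_mean_expected([0, 3])` returns 2.0, the geometric mean of 1 and 4.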
Havemann, F., Larsen, B. Bibliometric indicators of young authors in astrophysics: Can later stars be predicted? Scientometrics 102, 1413–1434 (2015). https://doi.org/10.1007/s11192-014-1476-3