The Substantive and Practical Significance of Citation Impact Differences Between Institutions: Guidelines for the Analysis of Percentiles Using Effect Sizes and Confidence Intervals

  • Richard Williams
  • Lutz Bornmann


In this chapter we address the statistical analysis of percentiles: How should the citation impact of institutions be compared? In educational and psychological testing, percentiles are already widely used as a standard for evaluating an individual’s test scores (on intelligence tests, for example) by comparing them with the scores of a calibrated sample. Percentiles, or percentile rank classes, are also well suited to bibliometrics, where they normalize the citations of publications by subject category and publication year. Unlike mean-based indicators (the relative citation rates), percentiles are scarcely affected by skewed distributions of citations. The percentile of a publication indicates the citation impact it has achieved in comparison to similar publications in the same subject category and publication year. Analyses of percentiles, however, have not always been presented in the most effective and meaningful way. New APA guidelines (American Psychological Association, Publication manual of the American Psychological Association (6th ed.). Washington, DC: American Psychological Association (APA), 2010) suggest less emphasis on significance tests and more emphasis on the substantive and practical significance of findings. Drawing on work by Cumming (Understanding the new statistics: effect sizes, confidence intervals, and meta-analysis. London: Routledge, 2012), we show how examining effect sizes (e.g., Cohen’s d statistic) and confidence intervals can lead to a clear understanding of citation impact differences.
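As a rough illustration of the two quantities named above, the following Python sketch computes Cohen's d (with a pooled sample standard deviation) and an approximate 95% confidence interval for the difference in mean percentiles between two institutions. It is not the chapter's own procedure: the function names `cohens_d` and `mean_diff_ci`, the normal-approximation critical value of 1.96, and the percentile values for the two institutions are all assumptions made up for this example.

```python
import math
from statistics import mean, stdev


def cohens_d(a, b):
    """Cohen's d: difference in means divided by the pooled sample SD."""
    n1, n2 = len(a), len(b)
    s1, s2 = stdev(a), stdev(b)
    pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(a) - mean(b)) / pooled


def mean_diff_ci(a, b, z=1.96):
    """Approximate 95% CI for the difference in means.

    Uses a normal approximation with separate variances; for small
    samples a t-based critical value would be more appropriate.
    """
    diff = mean(a) - mean(b)
    se = math.sqrt(stdev(a)**2 / len(a) + stdev(b)**2 / len(b))
    return diff - z * se, diff + z * se


# Hypothetical percentile values for the papers of two institutions
# (illustrative numbers only, not data from the chapter).
inst_a = [10, 20, 30, 40, 50]
inst_b = [20, 30, 40, 50, 60]

d = cohens_d(inst_a, inst_b)
low, high = mean_diff_ci(inst_a, inst_b)
print(f"Cohen's d: {d:.2f}")
print(f"95% CI for the mean difference: ({low:.1f}, {high:.1f})")
```

With these made-up inputs the mean difference is -10 percentile points, d is about -0.63, and the interval runs from about -29.6 to 9.6. Because the interval includes zero, the example also shows why an effect size should be read together with its uncertainty rather than on its own.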




  1. Acock, A. (2010). A gentle introduction to Stata (3rd ed.). College Station, TX: Stata Press.
  2. American Psychological Association. (2010). Publication manual of the American Psychological Association (6th ed.). Washington, DC: American Psychological Association (APA).
  3. Bornmann, L., & Leydesdorff, L. (2013). Statistical tests and research assessments: A comment on Schneider (2012). Journal of the American Society for Information Science and Technology, 64(6), 1306–1308. doi: 10.1002/asi.22860.
  4. Bornmann, L., Leydesdorff, L., & Mutz, R. (2013). The use of percentiles and percentile rank classes in the analysis of bibliometric data: opportunities and limits. Journal of Informetrics, 7(1), 158–165.
  5. Bornmann, L., & Mutz, R. (2013). The advantage of the use of samples in evaluative bibliometric studies. Journal of Informetrics, 7(1), 89–90. doi: 10.1016/j.joi.2012.08.002.
  6. Bornmann, L., & Williams, R. (2013). How to calculate the practical significance of citation impact differences? An empirical example from evaluative institutional bibliometrics using adjusted predictions and marginal effects. Journal of Informetrics, 7(2), 562–574. doi: 10.1016/j.joi.2013.02.005.
  7. Bornmann, L., de Moya Anegon, F., & Leydesdorff, L. (2012). The new Excellence Indicator in the World Report of the SCImago Institutions Rankings 2011. Journal of Informetrics, 6(2), 333–335. doi: 10.1016/j.joi.2011.11.006.
  8. Cameron, A. C., & Trivedi, P. K. (2010). Microeconometrics using Stata (Revised ed.). College Station, TX: Stata Press.
  9. Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.
  10. Cox, N. J. (2005). Calculating percentile ranks or plotting positions. Retrieved May 30, from
  11. Cumming, G. (2012). Understanding the new statistics: Effect sizes, confidence intervals, and meta-analysis. London: Routledge.
  12. Glänzel, W., Thijs, B., Schubert, A., & Debackere, K. (2009). Subfield-specific normalized relative indicators and a new generation of relational charts: methodological foundations illustrated on the assessment of institutional research performance. Scientometrics, 78(1), 165–188.
  13. Huber, C. (2013). Measures of effect size in Stata 13. The Stata Blog. Retrieved December 6, 2013, from
  14. Hyndman, R. J., & Fan, Y. N. (1996). Sample quantiles in statistical packages. American Statistician, 50(4), 361–365.
  15. International Committee of Medical Journal Editors. (2010). Uniform requirements for manuscripts submitted to biomedical journals: Writing and editing for biomedical publication. Journal of Pharmacology and Pharmacotherapeutics, 1(1), 42–58. Retrieved April 10, 2014 from
  16. Leydesdorff, L. (2012). Accounting for the uncertainty in the evaluation of percentile ranks. Journal of the American Society for Information Science and Technology, 63(11), 2349–2350.
  17. Leydesdorff, L., & Bornmann, L. (2011). Integrated impact indicators (I3) compared with impact factors (IFs): An alternative research design with policy implications. Journal of the American Society for Information Science and Technology, 62(11), 2133–2146.
  18. Leydesdorff, L., & Bornmann, L. (2012). Percentile ranks and the integrated impact indicator (I3). Journal of the American Society for Information Science and Technology, 63(9), 1901–1902. doi: 10.1002/asi.22641.
  19. Long, S., & Freese, J. (2006). Regression models for categorical dependent variables using Stata (2nd ed.). College Station, TX: Stata Press.
  20. Lundberg, J. (2007). Lifting the crown - citation z-score. Journal of Informetrics, 1(2), 145–154.
  21. Moed, H. F., De Bruin, R. E., & Van Leeuwen, T. N. (1995). New bibliometric tools for the assessment of national research performance - database description, overview of indicators and first applications. Scientometrics, 33(3), 381–422.
  22. Opthof, T., & Leydesdorff, L. (2010). Caveats for the journal and field normalizations in the CWTS (“Leiden”) evaluations of research performance. Journal of Informetrics, 4(3), 423–430.
  23. Pudovkin, A. I., & Garfield, E. (2009). Percentile rank and author superiority indexes for evaluating individual journal articles and the author’s overall citation performance. Paper presented at the Fifth International Conference on Webometrics, Informetrics & Scientometrics (WIS).
  24. Schneider, J. W. (2012). Testing university rankings statistically: Why this is not such a good idea after all. Some reflections on statistical power, effect sizes, random sampling and imaginary populations. In E. Archambault, Y. Gingras, & V. Lariviere (Eds.), The 17th International Conference on Science and Technology Indicators (pp. 719–732). Montreal, Canada: Repro-UQAM.
  25. Schreiber, M. (2012). Inconsistencies of recently proposed citation impact indicators and how to avoid them. Journal of the American Society for Information Science and Technology, 63(10), 2062–2073. doi: 10.1002/asi.22703.
  26. Schreiber, M. (2013). Uncertainties and ambiguities in percentiles and how to avoid them. Journal of the American Society for Information Science and Technology, 64(3), 640–643. doi: 10.1002/asi.22752.
  27. Schubert, A., & Braun, T. (1986). Relative indicators and relational charts for comparative assessment of publication output and citation impact. Scientometrics, 9(5–6), 281–291.
  28. StataCorp. (2013). Stata statistical software: Release 13. College Station, TX: Stata Corporation.
  29. Tressoldi, P. E., Giofre, D., Sella, F., & Cumming, G. (2013). High impact = high statistical standards? Not necessarily so. PLoS One, 8(2). doi: 10.1371/journal.pone.0056180.
  30. van Raan, A. F. J., van Leeuwen, T. N., Visser, M. S., van Eck, N. J., & Waltman, L. (2010). Rivals for the crown: Reply to Opthof and Leydesdorff. Journal of Informetrics, 4, 431–435.
  31. Waltman, L., Calero-Medina, C., Kosten, J., Noyons, E. C. M., Tijssen, R. J. W., van Eck, N. J., et al. (2012). The Leiden Ranking 2011/2012: Data collection, indicators, and interpretation. Journal of the American Society for Information Science and Technology, 63(12), 2419–2432.
  32. Waltman, L., & Schreiber, M. (2013). On the calculation of percentile-based bibliometric indicators. Journal of the American Society for Information Science and Technology, 64(2), 372–379.
  33. Williams, R. (2012). Using the margins command to estimate and interpret adjusted predictions and marginal effects. The Stata Journal, 12(2), 308–331.
  34. Zhou, P., & Zhong, Y. (2012). The citation-based indicator and combined impact indicator - new options for measuring impact. Journal of Informetrics, 6(4), 631–638. doi: 10.1016/j.joi.2012.05.004.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Sociology, University of Notre Dame, Notre Dame, USA
  2. Division for Science and Innovation Studies, Administrative Headquarters of the Max Planck Society, Munich, Germany
