Peer review versus bibliometrics: Which method better predicts the scholarly impact of publications?

Abstract

In this work, we ask which method, peer review or bibliometrics, better predicts the future overall scholarly impact of scientific publications. We measure the agreement between the peer-review evaluations of Web of Science indexed publications submitted to the first Italian research assessment exercise and the long-term citations of the same publications, and we do the same for an early citation-based indicator. The early citation indicator shows the stronger predictive power: it predicts late citations more reliably in all the disciplinary areas examined, and for any citation time window starting one year after publication.
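
As a rough illustration of this kind of agreement measurement, the sketch below compares two early assessments of the same publications against their long-term citation classes using linear-weighted Cohen's kappa (one standard agreement statistic for ordinal classifications; see Cohen 1960, 1968 and Lin 1989 in the references). The four merit classes and the synthetic scores are illustrative assumptions, not the paper's actual data or scoring scheme.

```python
# A minimal sketch (synthetic data, not the paper's dataset): compare how
# strongly two early assessments agree with long-term citation classes,
# using linear-weighted Cohen's kappa as the agreement statistic.
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(42)
n = 200

# Hypothetical ordinal merit classes (0 = lowest, 3 = highest).
late = rng.integers(0, 4, size=n)                          # long-term citation class
peer = np.clip(late + rng.integers(-2, 3, size=n), 0, 3)   # noisier early judgment
early = np.clip(late + rng.integers(-1, 2, size=n), 0, 3)  # less noisy early indicator

# Linear weights penalise larger class disagreements more heavily.
print("peer review vs late citations:    ",
      round(cohen_kappa_score(late, peer, weights="linear"), 2))
print("early citations vs late citations:",
      round(cohen_kappa_score(late, early, weights="linear"), 2))
```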

Notes

  1. In this work, we deal with the evaluation of research output encoded in written form and indexed in bibliographic databases (such as Scopus and Web of Science, WoS), i.e. after fellow specialists have assessed its suitability for publication.

  2. To account for time devoted to teaching activities, one professor is counted as 0.5 FTE.

  3. More precisely, Martin and Irvine (1983) argue that “it is necessary to distinguish between, not two, but three concepts—the ‘quality’, ‘importance’, and ‘impact’ of the research described in a paper”, whereby the “importance” of a publication refers to its potential influence on the advance of scientific knowledge.

  4. ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies. ISO 9000:2015 is accessible at https://www.iso.org/obp/ui/#iso:std:iso:9000:ed-4:v1:en, last accessed 12 June 2019.

  5. We take this opportunity to thank the CWTS for making its database accessible to us for research purposes.

  6. For reasons of space, we omit reporting the case of the BS_2Y score because, as expected, its results fall between those for the BS_1Y and BS_3Y scores.
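
The BS_kY scores mentioned in this note are based on citations observed within k years of publication. The sketch below only illustrates the windowing idea with raw counts; the function name and data are hypothetical, and the paper's actual scores presumably involve normalization rather than raw counts.

```python
# Illustrative sketch (hypothetical, not the paper's actual indicator):
# count the citations a publication receives within k years of publication.
def citations_within_window(pub_year: int, citing_years: list[int], k: int) -> int:
    """Number of citations accruing in the window (pub_year, pub_year + k]."""
    return sum(1 for y in citing_years if pub_year < y <= pub_year + k)

citing = [2002, 2002, 2003, 2004, 2007]  # hypothetical citing-paper years
for k in (1, 2, 3):
    print(f"{k}-year window count:", citations_within_window(2001, citing, k))
```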

References

  • Abramo, G. (2018). Revisiting the scientometric conceptualization of impact and its measurement. Journal of Informetrics, 12(3), 590–597.

  • Abramo, G., Cicero, T., & D’Angelo, C. A. (2011a). Assessing the varying level of impact measurement accuracy as a function of the citation window length. Journal of Informetrics, 5(4), 659–667.

  • Abramo, G., Cicero, T., & D’Angelo, C. A. (2012). The dispersion of research performance within and between universities as a potential indicator of the competitive intensity in higher education systems. Journal of Informetrics, 6(2), 155–168.

  • Abramo, G., Cicero, T., & D’Angelo, C. A. (2013a). National peer-review research assessment exercises for the hard sciences can be a complete waste of money: the Italian case. Scientometrics, 95(1), 311–324.

  • Abramo, G., & D’Angelo, C. A. (2016). Refrain from adopting the combination of citation and journal metrics to grade publications, as used in the Italian national research assessment exercise (VQR 2011-2014). Scientometrics, 109(3), 2053–2065.

  • Abramo, G., D’Angelo, C. A., & Di Costa, F. (2011b). National research assessment exercises: a comparison of peer review and bibliometrics rankings. Scientometrics, 89(3), 929–941.

  • Abramo, G., D’Angelo, C. A., & Felici, G. (2019). Predicting long-term publication impact through a combination of early citations and journal impact factor. Journal of Informetrics, 13(1), 32–49.

  • Abramo, G., D’Angelo, C. A., & Rosati, F. (2015). The determinants of academic career advancement: evidence from Italy. Science and Public Policy, 42(6), 761–774.

  • Abramo, G., D’Angelo, C. A., & Viel, F. (2013b). Selecting competent referees to assess research projects proposals: a study of referees’ registers. Research Evaluation, 22(1), 41–51.

  • Aksnes, D.W., Langfeldt, L., & Wouters, P. (2019). Citations, citation indicators, and research quality: An overview of basic concepts and theories. SAGE Open, January–March, 1–17.

  • Aksnes, D. W., & Taxt, R. E. (2004). Peer reviews and bibliometric indicators: A comparative study at a Norwegian university. Research Evaluation, 13(1), 33–41.

  • Alfò, M., Benedetto, S., Malgarini, M., & Scipione, S. (2017). On the use of bibliometric information for assessing articles quality: an analysis based on the third Italian research evaluation exercise. In 2017 STI conference, Paris.

  • Allen, L., Jones, C., Dolby, K., Lynn, D., & Walport, M. (2009). Looking for landmarks: The role of expert review and bibliometric analysis in evaluating scientific publication outputs. PLoS ONE, 4(6), e5910.

  • Ancaiani, A., Anfossi, A. F., Barbara, A., Benedetto, S., Blasi, B., Carletti, V., et al. (2015). Evaluating scientific research in Italy: The 2004–10 research evaluation exercise. Research Evaluation, 24(3), 242–255.

  • ANVUR. (2013). Valutazione della qualità della ricerca 2004–2010. Rapporto finale. http://www.anvur.it/rapporto/. Last Accessed 12 June 2019.

  • Baccini, A., Barabesi, L., & De Nicolao, G. (2018). The Holy Grail and the bad sampling: a test for the homogeneity of missing proportions for evaluating the agreement between peer review and bibliometrics in the Italian research assessment exercises. arXiv:1810.12430v1.

  • Baccini, A., & De Nicolao, G. (2016). Do they agree? Bibliometric evaluation versus informed peer review in the Italian research assessment exercise. Scientometrics, 108(3), 1651–1671.

  • Bertocchi, G., Gambardella, A., Jappelli, T., Nappi, C. A., & Peracchi, F. (2015). Bibliometric evaluation versus informed peer review: Evidence from Italy. Research Policy, 44(2), 451–466.

  • Bornmann, L. (2011). Scientific peer review. Annual Review of Information Science and Technology, 45, 199–245.

  • Bornmann, L., & Daniel, H.-D. (2005). Does the h-index for ranking of scientists really work? Scientometrics, 65(3), 391–392.

  • Bornmann, L., & Daniel, H.-D. (2008). What do citation counts measure? A review of studies on citing behavior. Journal of Documentation, 64(1), 45–80.

  • Bornmann, L., & Leydesdorff, L. (2013). The validation of (advanced) bibliometric indicators through peer assessments: A comparative study using data from InCites and F1000. Journal of Informetrics, 7(2), 286–291.

  • Cabezas-Clavijo, Á., Robinson-García, N., Escabias, M., & Jiménez-Contreras, E. (2013). Reviewers’ ratings and bibliometric indicators: Hand in hand when assessing over research proposals? PLoS ONE, 8(6), e68258.

  • Cetina, K. K. (1981). The manufacture of knowledge: An essay on the constructivist and contextual nature of science. New York: Pergamon Press.

  • Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46.

  • Cohen, J. (1968). Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, 70(4), 213–220.

  • Cole, J. R., & Cole, S. (1973). Social stratification in science. Chicago: University of Chicago Press.

  • Cuccurullo, F. (2006). La valutazione triennale della ricerca–VTR del CIVR. Analysis, 3(4), 5–7.

  • Fleiss, J. L., Levin, B., & Paik, M. C. (2003). Statistical methods for rates and proportions. Hoboken, NJ: Wiley.

  • Franceschet, M., & Costantini, A. (2011). The first Italian research assessment exercise: A bibliometric perspective. Journal of Informetrics, 5(2), 275–291.

  • Garfield, E. (1979). Citation indexing-its theory and application in science, technology, and humanities. New York, NY: Wiley.

  • Garfield, E. (1980). Premature discovery or delayed recognition: Why? Current Contents, 21, 5–10.

  • Glänzel, W. (2008). Seven myths in bibliometrics. About facts and fiction in quantitative science studies. In H. Kretschmer & F. Havemann (Eds.), Proceedings of WIS fourth international conference on webometrics, informetrics and scientometrics & ninth COLLNET meeting. Berlin: Institute for Library and Information Science.

  • Harnad, S. (2008). Validating research performance metrics against peer rankings. Ethics in Science and Environmental Politics, 8(1), 103–107.

  • Herrmannova, D., Patton, R., Knoth, P., & Stahl, C. (2018). Do citations and readership identify seminal publications? Scientometrics, 115(1), 239–262.

  • Horrobin, D. F. (1990). The philosophical basis of peer review and the suppression of innovation. Journal of the American Medical Association, 263(10), 1438–1441.

  • Ke, Q., Ferrara, E., Radicchi, F., & Flammini, A. (2015). Defining and identifying sleeping beauties in science. Proceedings of the National Academy of Sciences, 112(24), 7426–7431.

  • Kreiman, G., & Maunsell, J. H. R. (2011). Nine criteria for a measure of scientific output. Frontiers in Computational Neuroscience, 5(48), 11.

  • Kulczycki, E., Korzeń, M., & Korytkowski, P. (2017). Toward an excellence-based research funding system: Evidence from Poland. Journal of Informetrics, 11(1), 282–298.

  • Latour, B. (1987). Science in action: How to follow scientists and engineers through society. Cambridge, MA: Harvard University Press.

  • Leydesdorff, L., Bornmann, L., Comins, J. A., & Milojević, S. (2016). Citations: Indicators of quality? The impact fallacy. Frontiers in Research Metrics and Analytics, 1(1), 1–15.

  • Lin, L. I.-K. (1989). A concordance correlation coefficient to evaluate reproducibility. Biometrics, 45(1), 255–268.

  • Lin, L. I.-K. (2000). Erratum: A note on the concordance correlation coefficient. Biometrics, 56(1), 324–325.

  • Mahdi, S., D’Este, P., & Neely, A. (2008). Citation counts: Are they good predictors of RAE scores? Technical report, Advanced Institute of Management Research. https://doi.org/10.2139/ssrn.1154053.

  • Martin, B. R., & Irvine, J. (1983). Assessing basic research: Some partial indicators of scientific progress in radio astronomy. Research Policy, 12(2), 61–90.

  • McBride, G. B. (2005). A proposal for strength-of-agreement criteria for Lin’s concordance correlation coefficient. NIWA Client Report, HAM2005-062.

  • Meho, L. I., & Sonnenwald, D. H. (2000). Citation ranking versus peer evaluation of senior faculty research performance: A case study of Kurdish scholarship. Journal of the American Society for Information Science, 51(2), 123–138.

  • Merton, R. K. (1973). Priorities in scientific discovery. In R. K. Merton (Ed.), The sociology of science: Theoretical and empirical investigations (pp. 286–324). Chicago: University of Chicago Press.

  • Mingers, J., & Leydesdorff, L. (2015). A review of theory and practice in scientometrics. European Journal of Operational Research, 246(1), 1–19.

  • Moxam, H., & Anderson, J. (1992). Peer review. A view from the inside. Science and Technology Policy, 5(1), 7–15.

  • Mryglod, O., Kenna, R., Holovatch, Y., & Berche, B. (2015). Predicting results of the research excellence framework using departmental h-index: revisited. Scientometrics, 104(3), 1013–1017.

  • Oppenheim, C. (1997). The correlation between citation counts and the 1992 research assessment exercise ratings for British research in genetics, anatomy and archaeology. Journal of Documentation, 53(5), 477–487.

  • Norris, M., & Oppenheim, C. (2003). Citation counts and the research assessment exercise V: Archaeology and the 2001 RAE. Journal of Documentation, 59(6), 709–730.

  • Pendlebury, D. A. (2009). The use and misuse of journal metrics and other citation indicators. Archivum Immunologiae et Therapiae Experimentalis, 57(1), 1–11.

  • Pichappan, P., & Sarasvady, S. (2002). The other side of the coin: The intricacies of author self-citations. Scientometrics, 54(2), 285–290.

  • Pride, D., & Knoth, P. (2018). Peer review and citation data in predicting university rankings, a large-scale analysis. In International conference on theory and practice of digital libraries, TPDL 2018: Digital libraries for open knowledge, 195–207. https://doi.org/10.1007/978-3-030-00066-0_17. Last Accessed 12 June 2019.

  • Reale, E., Barbara, A., & Costantini, A. (2007). Peer review for the evaluation of academic research: Lessons from the Italian experience. Research Evaluation, 16(3), 216–228.

  • Reale, E., & Zinilli, A. (2017). Evaluation for the allocation of university research project funding: Can rules improve the peer review? Research Evaluation, 26(3), 190–198.

  • Rinia, E. J., van Leeuwen, T., van Vuren, H. G., & van Raan, A. F. J. (1998). Comparative analysis of a set of bibliometric indicators and central peer-review criteria: Evaluation of condensed matter physics in the Netherlands. Research Policy, 27(1), 95–107.

  • Sheskin, D. J. (2003). Handbook of parametric and nonparametric statistical procedures. London: Chapman & Hall.

  • Sugimoto, C. R., & Larivière, V. (2018). Measuring research. Oxford: Oxford University Press.

  • Taylor, J. (2011). The assessment of research quality in UK universities: Peer review or metrics? British Journal of Management, 22(2), 202–217.

  • Thomas, P. R., & Watkins, D. S. (1998). Institutional research rankings via bibliometric analysis and direct peer-review: A comparative case study with policy implications. Scientometrics, 41(3), 335–355.

  • Traag, V. A., & Waltman, L. (2019). Systematic analysis of agreement between metrics and peer review in the UK REF. Palgrave Communications, 5, 29.

  • van Raan, A. F. J. (2004). Sleeping beauties in science. Scientometrics, 59(3), 461–466.

  • van Raan, A. F. J. (2006). Comparison of the Hirsch-index with standard bibliometric indicators and with peer judgment for 147 chemistry research groups. Scientometrics, 67(3), 491–502.

  • Vieira, E. S., Cabral, J. A. S., & Gomes, J. A. N. F. (2014a). Definition of a model based on bibliometric indicators for assessing applicants to academic positions. Journal of the Association for Information Science and Technology, 65(3), 560–577.

  • Vieira, E. S., Cabral, J. A. S., & Gomes, J. A. N. F. (2014b). How good is a model based on bibliometric indicators in predicting the final decisions made by peers? Journal of Informetrics, 8(2), 390–405.

  • Vieira, E. S., & Gomes, J. A. N. F. (2018). The peer-review process: The most valued dimensions according to the researcher’s scientific career. Research Evaluation, 27(3), 246–261.

  • Wilsdon, J., Allen, L., Belfiore, E., Campbell, P., Curry, S., Hill, S., et al. (2015). The Metric Tide: Report of the independent review of the role of metrics in research assessment and management. Bristol: HEFCE.

Author information

Correspondence to Giovanni Abramo.

About this article

Cite this article

Abramo, G., D’Angelo, C. A., & Reale, E. Peer review versus bibliometrics: Which method better predicts the scholarly impact of publications? Scientometrics 121, 537–554 (2019). https://doi.org/10.1007/s11192-019-03184-y
