Refrain from adopting the combination of citation and journal metrics to grade publications, as used in the Italian national research assessment exercise (VQR 2011–2014)


Predicting the long-term impact of a scientific article is a challenging task, which bibliometricians address by resorting to a proxy whose reliability increases with the breadth of the citation window. In national research assessment exercises that use metrics, the citation window is necessarily short, but in some cases it is sufficient to advise the use of simple citations. For the Italian VQR 2011–2014, the choice was instead made to adopt a linear weighted combination of citations and journal metric percentiles, with weights differentiated by discipline and year. Given the strategic importance of the exercise, whose results inform the allocation of a significant share of resources for the national academic system, we examined whether the predictive power of the proposed indicator is stronger than that of the simple citation count. The results show the opposite, for all disciplines in the sciences and for citation windows above 2 years.
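The kind of comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the weights, the toy data, and all function names below are hypothetical, and Spearman rank correlation is assumed here as one plausible measure of predictive power. The actual VQR weights vary by discipline and year and are not given in this text.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def combined_score(citation_pct, journal_pct, w_citation, w_journal):
    """Linear weighted combination of the two percentiles (hypothetical weights)."""
    return w_citation * citation_pct + w_journal * journal_pct


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    early_citation_pct = [90, 75, 60, 40, 20, 10]   # short citation window
    journal_pct = [30, 95, 50, 85, 15, 70]          # journal metric percentile
    long_term_citations = [120, 60, 45, 20, 8, 3]   # benchmark of long-term impact

    combined = [
        combined_score(c, j, w_citation=0.5, w_journal=0.5)
        for c, j in zip(early_citation_pct, journal_pct)
    ]
    print("citations only:", spearman(early_citation_pct, long_term_citations))
    print("combined      :", spearman(combined, long_term_citations))
```

With real data, one would presumably repeat such a comparison for each discipline and each citation window, using the weights actually adopted for that discipline and year.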



  1. (last accessed 23/09/2016).

  2. At the time of publication, all co-authors but Giorgio Parisi were affiliated with ANVUR. Giorgio Parisi was president of the panel of experts in physics in the VQR 2004–2010.

  3. Gruppo di Esperti della Valutazione (Group of Experts in Evaluation).

  4. Specifically GEV 10 (Ancient history, philology, literature and art); GEV 11a (History, philosophy, pedagogy); GEV 12 (Law); and GEV 14 (Political and social sciences).

  5. The journal metrics proposed were: the 2- and 5-year Impact Factor (5YIF) and the Article Influence score (AI), for WoS indexed publications; the Impact per Publication (IPP) and the SCImago Journal Rank (SJR), for Scopus indexed publications.

  6. Note that the C–J space ultimately adopted under the 2011–2014 VQR differs in two respects from the one presented by Anfossi et al. (2016). First, the procedure applied five regions instead of four. Second, each GEV was requested to distinguish two separate “peer review” regions, one at the top-left corner (high citation percentile/low journal metric percentile) and the other at the bottom-right corner (low citation percentile/high journal metric percentile). The publications falling in these special regions were to be evaluated by informed peer review.

  7. As stated above, most 2014 publications are not evaluated by metrics.

  8. We have excluded GEV 8b (Architecture) and GEV 11b (Psychology) due to the limited share of products indexed in bibliometric databases out of the total research production of these professors in the period 2004–2006.

  9. For publications in multi-category journals, the percentile considered is the one referring to the most favorable subject category.

  10. For this analysis we eliminate the double counts of publications co-authored by professors pertaining to the same GEV. We do consider publications co-authored by professors of different GEVs, because each GEV adopts different thresholds.


  1. Abramo, G., Cicero, T., & D’Angelo, C. A. (2011a). The dangers of performance-based research funding in non-competitive higher education systems. Scientometrics, 87(3), 641–654.

  2. Abramo, G., Cicero, T., & D’Angelo, C. A. (2013). National peer-review research assessment exercises for the hard sciences can be a complete waste of money: The Italian case. Scientometrics, 95(1), 311–324.

  3. Abramo, G., & D’Angelo, C. A. (2015). The VQR, Italy’s second national research assessment: Methodological failures and ranking distortions. Journal of the Association for Information Science and Technology, 66(11), 2202–2214.

  4. Abramo, G., D’Angelo, C. A., & Caprasecca, A. (2009). Allocative efficiency in public research funding: Can bibliometrics help? Research Policy, 38(1), 206–215.

  5. Abramo, G., D’Angelo, C. A., & Di Costa, F. (2010). Citations versus journal impact factor as proxy of quality: Could the latter ever be preferable? Scientometrics, 84(3), 821–833.

  6. Abramo, G., D’Angelo, C. A., & Di Costa, F. (2011b). National research assessment exercises: The effects of changing the rules of the game during the game. Scientometrics, 88(1), 229–238.

  7. Abramo, G., D’Angelo, C. A., & Di Costa, F. (2014). Inefficiency in selecting products for submission to national research assessment exercises. Scientometrics, 98(3), 2069–2086.

  8. Anfossi, A., Ciolfi, A., Costa, F., Parisi, G., & Benedetto, S. (2016). Large-scale assessment of research outputs through a weighted combination of bibliometric indicators. Scientometrics, 107(2), 671–683.

  9. Baccini, A. (2016). Napoleon and the bibliometric evaluation of research: Considerations on university reform and the action of the national evaluation agency in Italy. [Napoléon et l’évaluation bibliométrique de la recherche: Considérations sur la réforme de l’université et sur l’action de l’agence nationale d’évaluation en Italie]. Canadian Journal of Information and Library Science, 40(1), 37–57.

  10. Baccini, A., & De Nicolao, G. (2016). Do they agree? Bibliometric evaluation versus informed peer review in the Italian research assessment exercise. Scientometrics, 108(3), 1651–1671.

  11. Butler, L. (2003a). Modifying publication practices in response to funding formulas. Research Evaluation, 12(1), 39–46.

  12. Butler, L. (2003b). Explaining Australia’s increased share of ISI publications—The effects of a funding formula based on publication counts. Research Policy, 32, 143–155.

  13. Franco, G. (2013). The scientific sector MED44 facing the task of assessing the quality of research (2004–2010) of the ANVUR (National Agency of Assessing University Research): Lights and shadows. [Il settore scientifico-disciplinare MED44 di fronte all’esercizio di valutazione della qualità della ricerca (VQR 2004–2010) dell’ANVUR: Luci e ombre]. La Medicina Del Lavoro, 104(6), 483–485.

  14. Levitt, J. M., & Thelwall, M. (2011). A combined bibliometric indicator to predict article impact. Information Processing and Management, 47(2), 300–308.

  15. Stern, D. I. (2014). High-ranked social science journal articles can be identified from early citation information. PLoS ONE, 9(11), 1–11.

Author information



Corresponding author

Correspondence to Giovanni Abramo.

Cite this article

Abramo, G., D’Angelo, C.A. Refrain from adopting the combination of citation and journal metrics to grade publications, as used in the Italian national research assessment exercise (VQR 2011–2014). Scientometrics 109, 2053–2065 (2016).

  • Research evaluation
  • Bibliometrics
  • Impact factor