Refrain from adopting the combination of citation and journal metrics to grade publications, as used in the Italian national research assessment exercise (VQR 2011–2014)


Predicting the long-term impact of a scientific article is a challenging task, which bibliometricians address by resorting to a proxy whose reliability increases with the breadth of the citation window. In national research assessment exercises that use metrics, the citation window is necessarily short, but in some cases it is sufficient to advise the use of simple citations. For the Italian VQR 2011–2014, the choice was instead made to adopt a linear weighted combination of citations and journal metric percentiles, with weights differentiated by discipline and year. Given the strategic importance of the exercise, whose results inform the allocation of a significant share of resources for the national academic system, we examined whether the predictive power of the proposed indicator is stronger than that of the simple citation count. The results show the opposite, for all disciplines in the sciences and for citation windows above 2 years.
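To make the comparison concrete, the following is a minimal sketch of a linear weighted combination of two percentile ranks. It is illustrative only: the function name `combined_score`, the convex form `w * C + (1 - w) * J`, the 0-100 scale with higher values meaning better, and the weight 0.5 are all assumptions made here, not the weights or procedure actually specified for the VQR.

```python
def combined_score(citation_pct: float, journal_pct: float, weight: float) -> float:
    """Hypothetical linear combination of two percentile ranks (0-100).

    weight is the share given to the citation percentile; the remainder
    goes to the journal metric percentile. Higher is assumed to be better.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * citation_pct + (1.0 - weight) * journal_pct


# Two invented publications: A is well cited but in a low-ranked journal,
# B is less cited but in a high-ranked journal.
paper_a = {"citation_pct": 80.0, "journal_pct": 20.0}
paper_b = {"citation_pct": 60.0, "journal_pct": 90.0}

w = 0.5  # assumed weight, chosen only for illustration
score_a = combined_score(paper_a["citation_pct"], paper_a["journal_pct"], w)
score_b = combined_score(paper_b["citation_pct"], paper_b["journal_pct"], w)

# Ranking by citations alone puts A ahead of B; the combination reverses it.
citations_prefer_a = paper_a["citation_pct"] > paper_b["citation_pct"]
combination_prefers_b = score_b > score_a
```

With these invented values, the two indicators rank the same pair of publications in opposite order, which is the kind of disagreement whose predictive consequences the study examines.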



    At the time of publication, all co-authors except Giorgio Parisi were affiliated with ANVUR. Giorgio Parisi was president of the panel of experts in physics in the VQR 2004–2010.

    Gruppo di Esperti della Valutazione (Group of Experts in Evaluation).

    Specifically GEV 10—Ancient history, philology, literature and art; GEV 11a—History, philosophy, pedagogy; GEV 12—Law; and GEV 14—Political and social sciences.

    The journal metrics proposed were: the 2- and 5-year Impact Factor (5YIF) and the Article Influence score (AI), for WoS indexed publications; the Impact per Publication (IPP) and the SCImago Journal Rank (SJR), for Scopus indexed publications.

    Note that the C–J space ultimately adopted under the 2011–2014 VQR differs in two respects from the one presented by Anfossi et al. (2016). First, the procedure applied five regions instead of four. Second, each GEV was requested to distinguish two separate “peer review” regions, one at the top-left corner (high citation percentile/low journal metric percentile) and the other at the bottom-right corner (low citation percentile/high journal metric percentile). The publications falling in these special regions were to be evaluated by informed peer review.

    As stated above, most 2014 publications are not evaluated by metrics.

    We have excluded GEV 8b (Architecture) and GEV 11b (Psychology) due to the limited share of products indexed in bibliometric databases out of the total research production of these professors in the period 2004–2006.

    For publications in multi-category journals the percentile considered is the one referring to the most favorable subject category.

    For this analysis we eliminate the double counts of publications co-authored by professors pertaining to the same GEV. We consider instead publications co-authored by professors of different GEVs, because each GEV adopts different thresholds.


