Plots for visualizing paper impact and journal impact of single researchers in a single graph

In research evaluation of single researchers, the assessment of paper and journal impact is of interest. High journal impact reflects the ability of researchers to convince strict reviewers, and high paper impact reflects the usefulness of papers for future research. In many bibliometric studies, metrics for journal and paper impact are separately presented. In this paper, we introduce two graph types, which combine both metrics in a single graph. The graphs can be used in research evaluation to visualize the performance of single researchers comprehensively.


Introduction
Publication success of single researchers can be assessed bibliometrically on the paper and the journal basis. Both perspectives provide different insights. Researchers submit their manuscripts to journals, and these journals can be high-impact or low-impact journals. Thus, it is a first success for researchers if they are able to get a high share of their manuscripts accepted in high-impact journals. These high-impact journals can be multi-disciplinary journals, like Science or Nature, or subject-specific journals, like Angewandte Chemie - International Edition. Since the impact of the journals depends on their reputation (besides other factors), the ability to publish papers in high-impact journals reflects one aspect of high performance. Several years after the publication of a manuscript in a journal (at least three years), the citation impact of the published manuscript can be determined. Citations are seen as a proxy of quality which reflects at least impact (but not, e.g., accuracy). Thus, high citation counts for a researcher's papers indicate his or her ability to publish influential research. This is a second success which can be reached by researchers.
In research evaluation of single researchers, the assessment of both events is interesting. High journal impact reflects the ability to convince strict reviewers (Bornmann, 2011), and high paper impact reflects the usefulness of papers for future research (Bornmann & Marx, 2014c). In many bibliometric studies, metrics for journal and paper impact are separately presented. In this paper, we introduce two graph types which combine both metrics in a single graph. The graph types are rooted in Bland-Altman plots, introduced by Bland and Altman (1986). Bland-Altman plots are a standard instrument in the validation of clinical measurement (Giavarina, 2015). These plots analyze the magnitude of differences between two metrics (here: journal and paper impact). The proposed graph types are able to show the share of a researcher's papers with higher paper impact than can be expected from the journal impact.
The proposals in this paper follow earlier proposals of visualizing bibliometric data of single researchers with beam plots (Bornmann & Marx, 2014a, 2014b). Beam plots show the performance of single researchers in view of publication output and citation impact.

Methods
Percentiles have been introduced in bibliometrics to overcome a certain problem with frequently used citation indicators: the indicators are based on the arithmetic average.
However, since, as a rule, the distribution of citations across publications is skewed, the arithmetic average should not be used with raw citation data. The citation percentile of a single paper is an impact value below which a certain share of papers falls (Bornmann, Leydesdorff, & Mutz, 2013). Given a set of papers and their citation counts, a citation percentile of 95 for a focal paper means, for example, that 95% of the papers in the set have citation counts lower than the focal paper. The formula by Hazen (1914), ((i − 0.5) / n * 100), is frequently used for the calculation of percentiles, whereby n is the number of papers in the set and i is the rank position of the papers in the set (concerning their citations). In bibliometrics, the set of papers is defined field- and time-specifically (e.g., by using Web of Science subject categories and publication years). The corresponding citation percentile for each paper is a field- and time-normalized impact value which can be used for cross-field comparisons.
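As an illustration, the Hazen formula can be applied to a set of citation counts as follows. This is a minimal sketch; the handling of tied citation counts via average ranks is our assumption, since the formula itself does not prescribe it:

```python
def hazen_percentiles(citations):
    """Hazen (1914) percentiles ((i - 0.5) / n * 100) for a list of
    citation counts, where i is the ascending rank of a paper and n is
    the number of papers in the set. Tied counts share their average
    rank (an assumption; the text does not specify tie handling)."""
    n = len(citations)
    # sort paper indices by citation count, ascending
    order = sorted(range(n), key=lambda k: citations[k])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # find the extent of the tie group starting at position i
        j = i
        while j + 1 < n and citations[order[j + 1]] == citations[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return [(r - 0.5) / n * 100 for r in ranks]
```

For ten papers with distinct citation counts, the most cited paper receives the percentile (10 − 0.5) / 10 * 100 = 95, matching the interpretation above.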
In this study, we used citation percentiles for the papers of a single anonymous researcher, which are based on Hazen's formula, as an example to demonstrate the plots. The researcher published 99 papers (articles and reviews) between 2004 and 2013. Citations were counted until the end of 2016 for the calculation of Hazen percentiles. For field classification, we used the Web of Science (WoS, provided by Clarivate Analytics, formerly the IP and Science business of Thomson Reuters) subject categories. In the case of multiple assignments of papers to subject categories, the average value of the percentile in each subject category was used to obtain a paper-based percentile value (Haunschild & Bornmann, 2016). We retrieved Hazen percentiles for the papers from our in-house database derived from the Science Citation Index Expanded (SCI-E), Social Sciences Citation Index (SSCI), and Arts and Humanities Citation Index (AHCI) provided by Clarivate Analytics.
Pudovkin and Garfield (2004) introduced a metric similar to citation percentiles on the level of journals. This metric also leads to field-normalized values on the level of journals. The so-called rank-normalized impact factor (rnIF) equals ((k − r_j + 1) / k * 100), where r_j is the descending rank of journal j in its subject category and k is the number of journals in the category. In contrast to the usual Journal Impact Factor (Garfield, 2006), the rnIF can be used for cross-field comparisons of journals. Our in-house database also provides the journal percentiles (rnIF) from the Journal Citation Reports (Clarivate Analytics). The corresponding journal percentiles for the papers under study were retrieved.
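A sketch of the rnIF calculation, again under a plain ranking assumption (how journals with identical impact factors are ranked is not specified in the text):

```python
def rank_normalized_if(jif_values):
    """Rank-normalized impact factor, rnIF = (k - r_j + 1) / k * 100,
    where r_j is the descending rank of journal j by Journal Impact
    Factor within its subject category and k is the number of journals
    in that category. Ties are broken by list order (an assumption)."""
    k = len(jif_values)
    # descending rank: the journal with the highest JIF gets r_j = 1
    order = sorted(range(k), key=lambda i: jif_values[i], reverse=True)
    rnif = [0.0] * k
    for pos, idx in enumerate(order):
        r_j = pos + 1
        rnif[idx] = (k - r_j + 1) / k * 100
    return rnif
```

The top-ranked journal in a category thus receives an rnIF of 100, and the bottom-ranked journal an rnIF of 100/k.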
In the following, we refer to citation percentiles on the level of papers as paper percentiles and to citation percentiles on the level of journals (rnIF) as journal percentiles. By using both percentiles for the publication set of a single researcher, we have similar field-normalized metrics available which can be used for the comparison of two events: the success of publishing in good journals and the success of receiving high citation counts.
We provide a Stata command (babiplot) and an R package (BibPlots) which produce the graphs proposed in this study. They can be found in the SSC Archive (in the case of Stata) and on CRAN (Comprehensive R Archive Network, in the case of R), respectively.

Results
Bornmann and Marx (2014a, 2014b) introduced beam plots which can be used to visualize the productivity and citation impact of single researchers. It is an advantage of the plots that they contain distributional information (the spread of papers across citation percentiles and publication years) and index information (the mean impact of the papers over all publication years and the mean impact within single publication years). Bornmann and Marx (2014a, 2014b) proposed to use beam plots with paper percentiles, but they can also be used with journal percentiles. Figure 1 presents beam plots of paper and journal percentiles for the example researcher. Although the ability of researchers to publish high-impact papers and papers in high-impact journals in each year can be seen in beam plots, the connection between the two is lost for years in which multiple papers with different paper impact were published. Figure 2 therefore plots paper percentiles against journal percentiles directly; the position of a data point relative to the bisecting line indicates that the corresponding paper has a higher paper impact than journal impact, or vice versa.
Each row (n_r) and column (n_c) as well as quadrant (n_q) of the scatter plot is labeled with the number and percentage of data points in the corresponding section, e.g., n_r1 = 89; 90% for 89 papers (90%) in row 1. The value of n_c1 corresponds to the number and proportion of papers belonging to the 50% most frequently cited papers in the corresponding subject categories and publication years. The red squares show the average value of all data points in each quadrant.
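The section labels can be reproduced from the two percentile lists. Which side of the cut at percentile 50 corresponds to row or column 1 in the figure is our assumption for illustration:

```python
def section_counts(paper_pct, journal_pct, cut=50):
    """Counts and rounded shares of papers in column 1 and row 1 of the
    scatter plot. We assume column 1 holds papers at or above the paper
    percentile cut (the 50% most cited papers) and row 1 holds papers
    published in journals at or above the journal percentile cut."""
    n = len(paper_pct)
    n_c1 = sum(1 for p in paper_pct if p >= cut)
    n_r1 = sum(1 for j in journal_pct if j >= cut)
    return {"n_c1": (n_c1, round(100 * n_c1 / n)),
            "n_r1": (n_r1, round(100 * n_r1 / n))}
```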
The results in Figure 2 demonstrate that the example researcher was able to publish most of the papers in better than average journals, with better than average impact later on. Figure 3 is rooted in Bland-Altman plots and is a scatter plot of the following two quantities: (1) the difference between paper and journal impact and (2) the average of paper and journal impact. In this plot, the bisecting line is no longer necessary to judge whether the paper percentile is higher than the journal percentile or vice versa. Additionally, Figure 3 features two dashed red lines which indicate (1) whether there is a general tendency of the researcher to publish in journals with higher impact or to publish papers with higher impact (see the y-line) and (2) whether the researcher is generally able or not to publish papers in good journals which receive high impact later on (see the x-line). Often, one is interested in the papers which belong to the 10% most frequently cited papers in the corresponding subject categories and publication years (Bornmann, de Moya Anegón, & Leydesdorff, 2012). This is visible in the scatter plot in Figure 2: the data points with a paper impact of 90 or higher are among the top 10% of their subject category and publication year. However, this information is lost in the Bland-Altman plot of Figure 3.
In order to restore this information in the plot, the black circles with a paper impact of 90 or higher (papers among the top 10%) are unfilled while the other circles are filled.
A benefit of the Bland-Altman plot lies in the analysis of the relationship between the differences of journal and paper impact and their average. Papers with extreme differences between journal and paper impact are clearly identifiable. The Bland-Altman plot in Figure 3 combines the need to see the distribution of the impact of the individual papers and the journals they are published in with aggregated statistical values over the quadrants, rows, and columns of the plot. Furthermore, the average of the differences between paper and journal impact (see the horizontal dashed line in Figure 3) shows whether there is a general tendency of the researcher to publish in better journals or papers with higher impact. The vertical dashed line in Figure 3 shows the overall average impact (paper and journal impact).
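The quantities behind the plot reduce to a few lines of code; this is a minimal sketch with variable names of our own choosing:

```python
def bland_altman_stats(paper_pct, journal_pct):
    """Per-paper Bland-Altman quantities for paired paper/journal
    percentiles: the difference (paper - journal) on the y-axis and
    the average of both on the x-axis, plus the two dashed reference
    lines (mean difference and mean average)."""
    diffs = [p - j for p, j in zip(paper_pct, journal_pct)]
    avgs = [(p + j) / 2 for p, j in zip(paper_pct, journal_pct)]
    # horizontal line: tendency toward higher paper or journal impact
    mean_diff = sum(diffs) / len(diffs)
    # vertical line: overall average impact (paper and journal)
    mean_avg = sum(avgs) / len(avgs)
    return diffs, avgs, mean_diff, mean_avg
```

A positive mean difference indicates that the researcher's papers tend to receive more citation impact than the publishing journals would suggest.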

Discussion
Researchers and decision makers in science policy are usually interested in receiving single numbers on the performance of single researchers (Leydesdorff, Wouters, & Bornmann, 2016). This explains the popularity of the h index proposed by Hirsch (2005) and the many variants of this index introduced in recent years (Bornmann, Mutz, Hug, & Daniel, 2011). However, since the reduction of performance to a single number leads to a loss of information, it is recommended in bibliometrics to use distributions instead of only single numbers. For example, Lariviere et al. (2016) propose a method for generating the citation distribution of journals, instead of the use of the Journal Impact Factor (JIF) in research evaluation. Basically, the JIF measures the mean citation impact of papers published in a journal. Similarly, the performance of single scientists should also be measured by focusing on distributions rather than only on single numbers. In this study, we have demonstrated the usefulness of scatter plots and Bland-Altman plots to combine paper and journal percentiles in a single graph. The availability of journal percentiles in the Journal Citation Reports (JCR, Clarivate Analytics, http://clarivate.com) makes it possible to contrast paper impact with journal impact, time- and field-normalized by using percentiles.
Bland-Altman plots are a standard instrument in the validation of clinical measurement (Giavarina, 2015). We think that the plots also allow interesting insights into the publication and citation profiles of single scientists and can be used for research evaluation purposes. The plots can be applied to inspect the number of papers in a set which more or less agree in a high or low paper and journal impact. This is the average dimension of the plot. The other dimension focuses on the differences between paper and journal impact. Are there many papers with large differences between both metrics? Is the researcher more able to publish in high-impact journals or papers with high impact? If we use the journal percentile as an expected value which can be contrasted with the paper percentile, a higher paper percentile demonstrates that the paper received more impact than could be expected on the basis of the publishing journal.
In the scatter and Bland-Altman plots which we present in this study, paper and journal impact is categorized into four quadrants. A high general impact of the papers in the set is indicated by high numbers and proportions of data points in quadrants 1 and 4 (i.e., high values of n_q1 and n_q4). High numbers and proportions of data points in quadrants 2 and 3 (high values of n_q2 and n_q3) indicate a low impact in general. The differences between n_q1 and n_q4 as well as n_q2 and n_q3 indicate whether the papers show a higher journal or higher paper impact. With the categorization of journal and citation impact into four quadrants, the plots follow approaches in bibliometrics, such as the Characteristics Scores and Scales (CSS) method (Glänzel, Debackere, & Thijs, 2016), which can also be used to assign citation impact to (four) impact groups.
Future research could study whether these variants are also useful in bibliometrics.
Figure 1. Beam plots of paper (left) and journal (right) percentiles of the example researcher. The individual percentiles (paper or journal percentiles) are shown using grey rhombi; the median over a publication year is displayed with red triangles. Furthermore, a red dashed line visualizes the median of the (paper or journal) percentiles over all years, and a grey line marks the value 50. A value of 50 designates the average impact of a publication or journal in a subject area and publication year.

Figure 2. Scatter plot of paper and journal percentiles

Figure 3. Bland-Altman plot of paper and journal percentiles, with papers among the top 10% shown as unfilled circles