Research performance metrics are used to inform decision making across the higher education sector. At an institutional level, the quantity, quality and impact of research outputs inform rankings, prestige and the allocation of performance-based funding (Geuna & Martin, 2003; Smith et al., 2013). In the United Kingdom (UK), such funding is considerable and increasing annually (Office for National Statistics, 2021). The funding landscape is similar in many other parts of the world (UNESCO Institute for Statistics, 2022). Within institutions, research performance metrics help inform a range of resourcing decisions, including employee selection, promotion, and time/task allocation. In this context, the assessment and benchmarking of research performance at the level of the individual academic has considerable value.

Widely used bibliometric measures of individual research performance include publication quantity, citation counts and h-index scores. Publication quantity is a simple measure of productivity, whilst citation counts are a proxy for impact or influence (Choudhri et al., 2015; Myers, 1970). The h-index (Hirsch, 2005) balances both quantity and influence, and is widely used as a proxy for researcher impact (Alonso et al., 2009). A final bibliometric measure of interest to the present study is the SCImago Journal Rank (SJR). This measure reflects an academic journal’s influence and, unlike the Impact Factor, takes both the quantity and influence of citing sources into account (Falagas et al., 2008). Academics who have consistently published in journals with higher SJRs can be assumed to have, on average, produced higher-quality work.
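To make the h-index concrete, it can be computed from a list of per-paper citation counts by ranking papers in descending order of citations and taking the largest rank h at which the h-th paper still has at least h citations. A minimal sketch in Python, using hypothetical citation counts:

```python
def h_index(citations):
    """Largest h such that h papers each have at least h citations (Hirsch, 2005)."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h

# Hypothetical academic with six papers
print(h_index([25, 8, 5, 4, 3, 0]))  # 4: four papers each have at least 4 citations
```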

Several studies have attempted to document the research performance of academic psychologists. For example, in the United States, Joy (2006) observed that academic psychologists employed at elite research universities had significantly higher publication rates than academics at less prestigious institutions. Additionally, these elite research universities were found to have a greater proportion of highly productive academics. Regardless of whether the employment of such eminent scholars at these elite universities resulted from a scholarship climate fostered within the institution or from the exercise of financial muscle, these findings nonetheless support the notion that research performance influences decisions about hiring and promotion in higher education.

More recently, Mazzucchelli et al. (2019) assessed the research performance of academic psychologists in Australia using three bibliometric indicators: lifetime publications, lifetime citations and h-index scores. They observed that academic psychologists in the ‘Group of Eight’ (Go8; comprising research-intensive Australian universities) significantly outperformed academic psychologists outside of the Go8 on all three indicators. Interestingly, the Go8’s advantage tended to decrease somewhat as academic level increased. For example, Go8 lecturers had published 50% more papers, had 96% more citations and had h-index scores that were 52% higher than their colleagues outside of the Go8. At the professorial level, these figures had fallen to 28%, 64% and 28% respectively. Predictably, Mazzucchelli et al. (2019) also found that senior academics performed better on each indicator of research performance than their more junior colleagues, with publication and citation counts at least doubling for each successive academic level (e.g., from lecturer to senior lecturer). Scores on the h-index also increased with seniority, albeit at a slightly slower pace. By comparing their findings with those of Malouff et al. (2010), Mazzucchelli et al. (2019) observed that, over the course of a decade, lifetime publication counts for Australian academic psychologists increased by a factor of between two and three. These findings were consistent with those of Haslam et al. (2017), whose analyses were limited to the Go8.

Most recently, Craig et al. (2021) calculated performance norms for Australian academic psychologists using the same sampling methods and bibliometric indicators employed by Mazzucchelli et al. (2019), as well as several additional metrics. The additional metrics were extracted from the SciVal platform and focused on recent performance (defined as the 4-year period from 2016 to 2019). They included publication counts, citation counts, citations per publication, Field Weighted Citation Impact (FWCI; the ratio of citations received to the average for similar publications) scores, and the percentage of publications in the top 10% by citations, in journals with CiteScores above the 90th percentile, with an international co-author, and with a corporate affiliation. Consistent with Mazzucchelli et al. (2019), Craig et al. (2021) found that recent SciVal publication counts, citation counts, citations per publication and FWCI scores tended to increase with seniority, although the differences between adjacent academic levels were not always statistically significant. Furthermore, they found that academic psychologists in the Go8 outperformed those outside of the Go8 on these four SciVal metrics, although not always significantly so. For the remaining SciVal metrics there were also trends in favour of seniority and the Go8, albeit with a number of reversals (e.g., associate professors tended to out-perform full professors) and with mostly non-significant pairwise comparisons. Finally, after controlling for career length and the other SciVal metrics, recent publications and citations predicted lifetime publications, citations, and h-index scores. Recent SciVal citations per publication and percentage of publications with an international co-author also predicted h-index scores. The remaining SciVal metrics were either non-significant or weak negative predictors of the lifetime performance measures.

The current study

The primary objective of the current study was to carry out a replication and extension of Mazzucchelli et al. (2019) within the context of higher education in the UK. Specifically, we sought to provide up-to-date research performance norms for academic psychologists in the UK, stratified by academic level and university mission group (see Table 1). For consistency with Mazzucchelli et al. (2019), the measures of research performance we used were extracted from the Scopus and SCImago databases. These were lifetime Scopus publications, lifetime Scopus citations, Scopus h-index scores, lifetime Scopus first author publications and average SJRs for the journals published in. Furthermore, we sought to compare the research performance of UK academic psychologists with that of their Australian counterparts. As in Australia, universities in the UK are organised into several mission groups and, like the Australian Go8, the Russell Group represents a self-selected group of elite, research-intensive universities. Such comparisons are valid and informative because both nations have implemented comprehensive, government-driven systems to evaluate and rank research performance at the institutional and department/subject levels: the Research Excellence Framework (REF) in the UK and Excellence in Research for Australia (ERA) in Australia (Martin-Sardesai & Guthrie, 2018). As these systems underlie substantial funding decisions, it is reasonable to assume that both UK and Australian institutions have similar research environments, with a similar focus on promoting research activity and impact (Auranen & Nieminen, 2010). However, Australian academic psychologists at each level of seniority tend to have been research active for longer than their UK equivalents (Craig et al., 2021). Up-to-date research performance norms will also facilitate departmental, institutional and national resourcing decisions, and are likely to be useful in the context of decision making around employee selection, promotion, performance management and time/task allocation. Finally, at an individual level, knowledge of contemporary, national research performance norms will facilitate career development planning and can be used to help strengthen the case for promotion and research funding.

Table 1 University groups in the United Kingdom

We hypothesised that lifetime publications (H1a), lifetime citations (H2a), h-index scores (H3a), lifetime first author publications (H4a) and average SJRs (H5a) for UK academics in schools/departments of psychology would follow a rank ordering of level E (professor or equivalent) > level D (associate professor or equivalent) > level C (senior lecturer or equivalent) > level B (lecturer or equivalent). Furthermore, we hypothesised that academics at Russell Group universities would outperform their counterparts at non-Russell Group universities on each measure of lifetime research performance (H1b to H5b). It should be noted that H1 to H3 replicated Mazzucchelli et al. (2019) in a UK context, whereas H4 and H5 were extensions. Beyond testing these pre-registered hypotheses (see https://osf.io/sypnf/), we also sought to understand the extent to which research career length impacted the relationship between Russell Group membership and each lifetime performance measure at each level of academic seniority. Finally, for the performance measures in H1 to H3, we anticipated few differences between UK and Australian academic psychologists.

Method

Design

This study utilised a 2 (university grouping: Russell Group vs non-Russell Group) × 4 (academic level: B, C, D or E) between-subjects factorial design. Russell Group membership was defined by whether or not a university was listed at https://russellgroup.ac.uk/about/our-universities/. Academic level B comprised lecturers, research fellows, assistant professors and associate senior lecturers; level C comprised senior lecturers, senior research fellows and associate principal lecturers; level D comprised associate professors, readers and principal lecturers; and level E comprised professors and chairs. The research performance of academics holding an appointment below the level of lecturer (e.g., teaching associate, research associate, assistant lecturer etc.) was beyond the scope of this study. Like the level A Australian academics described by Mazzucchelli et al. (2019) and Craig et al. (2021), UK academics working at a comparable level do not typically have an independent research profile. The dependent variables were lifetime Scopus publications, lifetime Scopus citations, Scopus h-index scores, lifetime Scopus first author publications and average SJR values for the journals that published the lifetime Scopus publications. For our additional exploratory analyses the covariate was research career length, defined as the number of years since first Scopus indexed publication. Prior to any data collection, this study was approved by the School of Psychological Science Research Ethics Committee at the University of Bristol (Approval Code: 102862).

Data sources

The six university groups described in Table 1 were identified using https://ukstudyoptions.com and https://thecompleteuniversityguide.co.uk. Where applicable, group membership was then determined and verified against each group’s website. Of the 162 universities identified across the six groups, eight held dual group membership and 31 did not offer an undergraduate psychology programme accredited by the British Psychological Society. These institutions were dropped from our list, leaving 123 universities. Stratified random sampling (using random number generation at https://random.org) was then used to sample 30% of universities from each group. This resulted in a final sample of 38 universities, of which eight belonged to the Russell Group, four to University Alliance, six to MillionPlus, three to GuildHE, four to the Cathedrals Group and 13 to the 1994/unaffiliated group. Eight universities had to be resampled, either because they did not have a dedicated psychology department webpage listing affiliated academic staff, or because psychology was part of a larger department (e.g., a school of social science) and there was no way of unambiguously differentiating academic psychologists from academics with other specialisations.
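The stratified sampling step is easy to reproduce. The sketch below draws 30% of universities from each mission group; the group lists and the fixed seed are hypothetical (the study used random.org), and rounding fractional quotas up is our assumption:

```python
import math
import random

# Hypothetical mission-group membership lists (the study sampled from 123 real
# universities across six groups)
groups = {
    "Russell Group": [f"RG university {i}" for i in range(1, 25)],
    "MillionPlus": [f"MP university {i}" for i in range(1, 20)],
    "University Alliance": [f"UA university {i}" for i in range(1, 15)],
}

random.seed(2020)  # fixed seed so the sketch is reproducible
sample = {
    group: random.sample(unis, k=math.ceil(0.30 * len(unis)))
    for group, unis in groups.items()
}

for group, unis in sample.items():
    print(f"{group}: {len(unis)} sampled")
```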

Based on pilot sampling, we estimated that this process would result in a list of around 1400 academic psychologists. A series of sensitivity power analyses using G*Power (Faul et al., 2007) indicated that N = 1400 would enable us to detect effects considerably smaller than those reported by Mazzucchelli et al. (2019).
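For readers without access to G*Power, an analogous sensitivity analysis can be run in Python with statsmodels. The alpha, power and group structure below are illustrative assumptions rather than the exact inputs the authors report:

```python
from statsmodels.stats.power import FTestAnovaPower

# Sensitivity analysis: given total N, alpha and desired power, solve for the
# smallest detectable effect size (Cohen's f). Treating the 2 x 4 design as
# eight cells is our simplifying assumption.
f = FTestAnovaPower().solve_power(effect_size=None, nobs=1400, alpha=0.01,
                                  power=0.95, k_groups=8)
print(f"Minimum detectable effect size (Cohen's f): {f:.3f}")
```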

A total of 1887 academic psychologists were initially identified on the psychology department websites of the sampled universities between May and June 2020. For consistency with Mazzucchelli et al. (2019), and also Craig et al. (2021), 199 academics holding appointments as an emeritus, honorary, adjunct or visiting professor, a teaching fellow, a research associate, a research assistant or a postdoctoral researcher were excluded from our sample. A further 120 academics were excluded because their job role could not be determined from their university website profile, their LinkedIn, ResearchGate or Google Scholar profile (if available), or the first two pages of results from a Google search of their name and university affiliation. Finally, 229 academics who did not appear to have a Scopus profile were also excluded, resulting in a final sample of 1339 academics. Table 2 shows the gender distribution and research career length of staff within each university group, split by academic level.

Table 2 Characteristics of the sample of UK academic psychologists, split by university group, academic level and gender (N = 1339)

Measures and procedure

Research performance was measured using several bibliometric indicators derived from Scopus. As noted by McNally (2010), Scopus has reliability advantages over PsycINFO and Web of Science. These include low instances of author duplication, organisation of authors by discipline, more accurate metrics and the availability of authors’ institutional affiliation histories. Additionally, the coverage of journals in Scopus is broader than that of Web of Science (Mongeon & Paul-Hus, 2015) and it has more extensive coverage of psychology than PsycINFO (García‐Pérez, 2010). Furthermore, the Scopus advanced search tools allowed us to limit each academic’s publication and citation records to those prior to 1 January 2020. Where necessary, the ‘Author’ and ‘Document’ search functions were used to resolve any ambiguities around authorship and affiliations, thus increasing the quality of our data.

The bibliometric indicators we used were lifetime Scopus publications, lifetime Scopus citations, Scopus h-index scores, lifetime Scopus first author publications and average SJR for the journals published in. Lifetime Scopus publications was defined as the total number of ‘article’ and ‘review’ type papers published prior to 1 January 2020. For consistency with Mazzucchelli et al. (2019), we did not include other types of publications (e.g., notes, book chapters, editorials, errata etc.), nor papers that were listed as ‘in press’. Lifetime Scopus citations was defined as the total number of citations accumulated prior to 1 January 2020 for article and review type papers. Scopus h-index scores reflect the number of Scopus indexed publications which have been cited at least the same number of times (Hirsch, 2005). Unlike the publication and citation count measures, we were unable to apply date or publication-type limits to the h-index calculations. Lifetime Scopus first author publications was defined as the sub-set of lifetime Scopus publications for which the academic was listed as the first author. Although strongly correlated with lifetime publications (r = .78 for the full sample and r = .56 to .94 when the sample was split by academic level and Russell Group membership), lifetime first author publications are worthy of separate consideration because (a) first authorship is given greatest weight when research performance is assessed for the purposes of promotion and the award of grant funding (Jones, 2013; Tscharntke et al., 2007); and (b) unlike lifetime publications (Mazzucchelli et al., 2019), lifetime first author publications do not appear to be increasing with time (Fanelli & Lariviere, 2016). This suggests that, although related, these two performance metrics are distinct and there is value in calculating benchmarks for both. Finally, average SJR was calculated as the (weighted) mean 2019 SJR for the set of journals in which the previously identified papers were published. Publications in journals without an SJR were assigned an SJR value of 0.
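To make the average-SJR definition concrete, the sketch below computes a weighted mean 2019 SJR for a hypothetical academic. We read ‘(weighted)’ as weighting each journal’s SJR by the number of the academic’s papers it published; this reading, and all names and values below, are illustrative assumptions:

```python
# Papers per journal for a hypothetical academic
papers_per_journal = {"Journal A": 12, "Journal B": 3, "Journal C": 5}

# 2019 SJR values from the SCImago download; Journal C has no SJR and so
# contributes 0, per the rule described above
sjr_2019 = {"Journal A": 2.41, "Journal B": 0.87}

total_papers = sum(papers_per_journal.values())
average_sjr = sum(
    n_papers * sjr_2019.get(journal, 0.0)
    for journal, n_papers in papers_per_journal.items()
) / total_papers
print(f"Average SJR: {average_sjr:.2f}")  # 1.58
```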

All data were collected from May to June 2020 from the Scopus and SCImago databases. In instances where the initial Scopus search did not return any results, or where the academic appeared to be affiliated with a different university, a search for the academic’s publication titles via the ‘Document’ search function was carried out. These publication titles were matched and verified against those listed on the academic’s university web profile. As noted previously, academics whose Scopus profiles could not be identified following these steps were excluded from our sample. SJRs were sourced from a spreadsheet available for download at https://www.scimagojr.com/. The Microsoft Excel VLOOKUP function was used to match SJRs with the publication details extracted from Scopus, and a sample of matches was manually verified to confirm accuracy.
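The VLOOKUP matching step has a direct equivalent in pandas. Column names here are hypothetical; the actual SCImago spreadsheet keys journals by title and ISSN:

```python
import pandas as pd

# Publication records extracted from Scopus (hypothetical)
pubs = pd.DataFrame({
    "academic_id": [101, 101, 102],
    "journal": ["Journal A", "Journal B", "Journal A"],
})

# SJR lookup table from the SCImago download (hypothetical)
sjr = pd.DataFrame({
    "journal": ["Journal A", "Journal B"],
    "sjr_2019": [2.41, 0.87],
})

# A left join is the pandas analogue of VLOOKUP; unmatched journals get 0
merged = pubs.merge(sjr, on="journal", how="left")
merged["sjr_2019"] = merged["sjr_2019"].fillna(0.0)
print(merged)
```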

Data preparation and analysis

Our anonymised data, with codes replacing the names of universities and individual academics, are open and available at https://osf.io/4dgfm/. They were consolidated using Microsoft Excel before being imported into IBM SPSS (Version 26) for analysis. A series of 2 × 4 between-subjects factorial analyses of variance (ANOVAs) were used to test our pre-registered hypotheses. These were followed up with comparisons of marginal means and simple effects/comparisons. The assumption of homogeneity of variance was violated for many of these analyses, due primarily to much larger performance variance amongst the professoriate relative to academics at other levels. Consequently, we also ran robust ANOVAs (with trimmed means) and robust t-tests using the WRS2 R package. Both sets of results led to the same interpretations, and the former are reported herein. To examine the extent to which research career length impacted the relationship between Russell Group membership and each of our performance measures at each level of academic seniority, a series of analyses of covariance (ANCOVAs) were used. These analyses were not pre-registered. Where possible, one-sample t-tests and independent samples t-tests were used to compare our findings with those of Mazzucchelli et al. (2019) and Craig et al. (2021) respectively. The former were pre-registered, whereas the latter were not. Throughout, an alpha level of .01 was used as the threshold for statistical significance. This alpha level was pre-registered as a reasonable compromise between Type I and Type II error control. It also helped mitigate the Type I error inflation that can result from assumption violations (Milligan et al., 1987). Finally, as measures of effect size, we used partial eta-squared (ηp2) for omnibus effects, and Cohen’s d for pairwise comparisons.
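As an illustration of the core analysis, the sketch below fits a 2 × 4 factorial ANOVA with Type III sums of squares and derives partial eta-squared, on simulated data. The study itself used SPSS and the WRS2 R package, so this is an analogue rather than the authors’ code:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Simulated stand-in for the real dataset
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "pubs": rng.poisson(30, size=400),
    "group": rng.choice(["RG", "non-RG"], size=400),
    "level": rng.choice(list("BCDE"), size=400),
})

# Sum-to-zero coding so that Type III sums of squares are meaningful
model = ols("pubs ~ C(group, Sum) * C(level, Sum)", data=df).fit()
aov = sm.stats.anova_lm(model, typ=3)

# Partial eta-squared for each effect: SS_effect / (SS_effect + SS_error)
ss_error = aov.loc["Residual", "sum_sq"]
aov["eta_p2"] = aov["sum_sq"] / (aov["sum_sq"] + ss_error)
print(aov)
```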

Results

Descriptive statistics

Table 2 describes the sample, split by university group and academic level. Overall, there were more females in the sample (56.8%); however, they tended to be concentrated at lower academic levels. Males comprised 60.7% of the professoriate. The mean number of years since first Scopus indexed publication for academics at levels B, C, D and E was 9.08, 12.42, 17.11 and 25.23 years respectively. For the sake of comparison, Craig et al. (2021), who collected Australian normative data at virtually the same time that we were collecting our UK data, reported means of 8.92, 16.01, 19.90 and 28.22 years for levels B, C, D and E respectively. These cross-national differences were statistically significant at levels C to E (ps < .01, d = 0.31 to 0.46), though not at level B (p = .766, d = −0.02). In the UK, at levels B, C and E there was a tendency for members of the Russell Group to have been research active longer than their equivalents outside of the Russell Group (d = 0.09 to 0.41); however, this difference was only statistically significant (p < .001) at level E (M = 27.35 years vs. 23.59 years). At level D, there was a small, non-significant difference in the opposite direction (d = −0.11, p = .393). Performance on the five bibliometric indicators, split by university group and academic level, is summarised in Table 3.

Table 3 Descriptive statistics for each indicator of research performance, split by university group and academic level (N = 1339)

Lifetime Scopus publications

The main effect of academic level on lifetime Scopus publications in a 2 × 4 factorial ANOVA was significant, F(3, 1331) = 220.35, p < .001, ηp2 = .33. Marginal means comparisons indicated significant differences between all levels except B and C (p = .329). Level E academics had published significantly more than level B, C and D academics (ps < .001), and level D academics had published significantly more than level C (p = .008) and B (p < .001) academics. This suggests broad support for H1a. The Publications section of Table 4 displays summary statistics for each academic level. The percentiles in this table capture the number of papers published by academics at the 10th, 25th, 50th, 75th and 90th percentiles within each level. For example, a level B academic performing at the 10th percentile had published two Scopus indexed article or review type papers. A level E academic performing at the 10th percentile had published 25 such papers. Comparisons with Mazzucchelli et al. (2019) and Craig et al. (2021), which are also reported in Table 4, indicated that Australian academics had published significantly more than UK academics at each level (ps < .001) except level B (p = .020 and .031 respectively).

Table 4 Research performance indicators by academic level

The main effect of university group was also significant, F(1, 1331) = 77.73, p < .001, ηp2 = .06. This indicated that, overall, academics at Russell Group universities had published more than academics at other universities, and that H1b was supported. However, there was also a significant interaction between academic level and university group, F(3, 1331) = 42.12, p < .001, ηp2 = .09. Simple comparisons indicated that even though academics at each level in the Russell Group had published more than equivalent academics at other universities, the difference was significant only at level E (p < .001, d = 0.84). At levels C and D, the differences were ‘medium’ sized, though non-significant (p = .099 and .038 respectively). This interaction is illustrated graphically in panel A of Fig. 1. The Publications section of Table 5 presents the relevant group means, mean differences and standardised differences between academic levels within each university group, as well as differences between the university groups within each academic level. As can be seen in Table 3, the mean number of publications for level E Russell Group academics was 2.0 to 4.1 times higher than the corresponding means for level E academics in the other university groups.

Fig. 1 Means and 95% CIs for each indicator of research performance, split by university group and academic level (N = 1339). B = Lecturer, Research Fellow, Assistant Professor or Associate Senior Lecturer; C = Senior Lecturer, Senior Research Fellow or Associate Principal Lecturer; D = Associate Professor, Reader or Principal Lecturer; E = Professor or Chair

Table 5 Comparisons between academic levels and university group on each indicator of research performance

Lifetime Scopus citations

The main effect of academic level on lifetime Scopus citations in a 2 × 4 factorial ANOVA was significant, F(3, 1329) = 95.95, p < .001, ηp2 = .18. The differences between levels B and C (p = .460), B and D (p = .010), and C and D (p = .211) were non-significant, although level D academics did have more than double the citations of their more junior colleagues. Level E academics had significantly more citations than all other levels (all p < .001). This suggests partial support for H2a. The Citations section of Table 4 displays summary statistics for each academic level, as well as comparisons between our means and those reported by Mazzucchelli et al. (2019) and Craig et al. (2021). When compared to the full sample of Australian academics in Mazzucchelli et al. (2019), the UK academics were cited significantly more often (p = .005), although the difference was small (d = 0.08). However, when each academic level was considered separately, only the level B comparison was statistically significant (p < .001). Again, the difference was small (d = 0.19). The magnitudes of the comparisons with Craig et al. (2021) were similar, although none reached the threshold for statistical significance.

The main effect of university group was also significant, F(1, 1329) = 56.49, p < .001, ηp2 = .04. This indicated that, overall, academics at Russell Group universities had higher citation counts than academics at other universities, and that H2b was supported. However, there was also a significant interaction between academic level and university group, F(3, 1329) = 41.46, p < .001, ηp2 = .09. Simple comparisons indicated that even though academics at each level in the Russell Group had higher citation counts than academics at other universities, the difference was only significant for level E academics (p < .001, d = 0.75). This interaction is illustrated graphically in panel B of Fig. 1. The Citations section of Table 5 presents the relevant group means, mean differences and standardised differences between academic levels within each university group, as well as differences between the university groups within each academic level. As can be seen in Table 3, citation counts for level E Russell Group academics were 3.5 to 12.1 times higher than the citation counts for level E academics in the other university groups.

Scopus h-index scores

The main effect of academic level on Scopus h-index scores in a 2 × 4 factorial ANOVA was significant, F(3, 1331) = 315.06, p < .001, ηp2 = .42. Marginal means comparisons indicated significant differences between all academic levels, except levels B and C (p = .137). That is, the h-index scores of level E academics were significantly higher than the h-index scores of level B, C and D academics (all p < .001), and the h-index scores of level D academics were significantly higher than the h-index scores of level B and C academics (both p < .001). This suggests broad support for H3a. The h-index section of Table 4 displays summary statistics for each academic level, as well as comparisons between our means and those reported by Mazzucchelli et al. (2019) and Craig et al. (2021). The pattern of differences here was more inconsistent than in previous analyses. The UK academics had higher h-index scores at level B (ps < .001) and lower h-index scores at levels C (ps < .001) and D (p = .005 and .003 respectively). At level E, there was no difference between the two nations (p = .089 and .734 respectively). The differences were small to medium in magnitude.

The main effect of university group was also significant, F(1, 1331) = 133.56, p < .001, ηp2 = .09. This indicated that, overall, academics at Russell Group universities had higher h-index scores than academics at other universities, and that H3b was supported. However, there was also a significant interaction between academic level and university group, F(3, 1331) = 45.14, p < .001, ηp2 = .09. Simple comparisons indicated that Russell Group academics at levels C, D and E had significantly higher h-index scores than their counterparts at other universities (p = .008 for level C and p < .001 for levels D and E). However, the difference at level B was non-significant (p = .042). The magnitude of these differences ranged from large (at level E) to small (at level B). This interaction is illustrated graphically in panel C of Fig. 1. The h-index section of Table 5 presents the relevant group means, mean differences and standardised differences between academic levels within each university group, as well as differences between the university groups within each academic level. As can be seen in Table 3, the h-index scores of level E academics at Russell Group universities were 1.7 to 3.1 times higher than the h-index scores of level E academics in the other university groups.

Lifetime Scopus first author publications

The main effect of academic level on lifetime Scopus first author publications in a 2 × 4 factorial ANOVA was statistically significant, F(3, 1331) = 145.93, p < .001, ηp2 = .25. Marginal means comparisons indicated significant differences between all levels except B and C (p = .941). Level E academics had significantly more first author publications than level B, C and D academics, and level D academics had significantly more first author publications than level C and B academics (all p < .001). This suggests broad support for H4a. The First author publications section of Table 4 displays summary statistics for each academic level.

The main effect of university group was also significant, F(1, 1331) = 30.10, p < .001, ηp2 = .02. This indicated that, overall, academics at Russell Group universities had published more first author papers than academics at other universities, and that H4b was supported. However, there was also a significant interaction between academic level and university group, F(3, 1331) = 16.35, p < .001, ηp2 = .04. Simple comparisons indicated that even though academics at each level in the Russell Group had published more first author papers than academics at other universities, the difference was only statistically significant for level E academics (p < .001, d = 0.69). This interaction is illustrated graphically in panel D of Fig. 1. The First author publications section of Table 5 presents the relevant group means, mean differences and standardised differences between academic levels within each university group, as well as differences between the university groups within each academic level. As can be seen in Table 3, the mean number of first author publications for level E Russell Group academics was 1.5 to 2.8 times higher than the corresponding means for level E academics in the other university groups.

Journal SJRs

The main effect of academic level on journal SJRs in a 2 × 4 factorial ANOVA was statistically significant, F(3, 1331) = 12.13, p < .001, ηp2 = .03. Marginal means comparisons indicated significant differences between all academic levels, except levels B and D (p = .724). Level E academics had significantly higher SJR averages than level B (p = .001), C (p < .001) and D (p = .002) academics, and level D academics had significantly higher SJR averages than level C academics (p = .001). Interestingly, while the average SJR for academics at level B was higher than for academics at levels C and D, the difference was significant only at level C (p < .001 for C and p = .724 for D). This suggests partial support for H5a. The Journal SJR section of Table 4 displays summary statistics for each academic level.

The main effect of university group was also significant, F(1, 1331) = 97.61, p < .001, ηp2 = .07. This indicated that, overall, academics at Russell Group universities tended to publish in journals with higher SJRs, and that H5b was supported. However, there was also a significant interaction between academic level and university group, F(3, 1331) = 6.06, p < .001, ηp2 = .01. Simple comparisons indicated that Russell Group academics at levels B, D and E tended to publish in journals with higher SJRs than academics from other universities (all p < .001). However, at level C, there was only a trend in favour of the Russell Group (p = .027). The difference between university groups was most pronounced at level E (d = 0.87), and level E Russell Group academics published in journals with SJRs that were 1.5 to 2.1 times higher than the corresponding mean SJRs for level E academics in the other university groups (see Table 3). This interaction is illustrated graphically in panel E of Fig. 1. The Journal SJR section of Table 5 presents the relevant group means, mean differences and standardised differences between academic levels within each university group, as well as differences between the university groups within each academic level.

Effects of research career length

In 16 of the 20 one-way ANCOVAs reported in Table 6, research career length was statistically significant (ps < .01, ηp2 = .03 to .47, median ηp2 = .20), indicating that it accounted for variance in performance at each academic level beyond the variance accounted for by group membership. However, when career length was controlled, there remained a consistent performance advantage in favour of the Russell Group. Furthermore, the differences between the effect sizes for the adjusted and unadjusted comparisons in Tables 5 and 6 respectively ranged from small (d = −0.20) to trivial (d = 0.05; median difference between ds = −0.05). The tendency for academic psychologists at Russell Group universities to have been research active somewhat longer than their equivalents outside of the Russell Group does not appear to account for the performance differences we observed.

Table 6 Comparisons between academic levels and university group on each indicator of research performance, adjusted for research career length

Discussion

In this study we replicated and extended Mazzucchelli et al. (2019) within the UK higher education context. Specifically, we calculated up-to-date lifetime research performance norms for academic psychologists in the UK, stratified by academic level and university mission group. Our indicators of research performance were lifetime publications, lifetime citations, h-index scores, lifetime first author publications and average SJRs for the journals published in. These indicators were extracted from the Scopus and SCImago databases. We also made comparisons between UK academic psychologists and their Australian equivalents, where it was possible to do so. Finally, we considered the impact of research career length on the performance differences observed between academic psychologists in and outside the Russell Group.

Whilst Mazzucchelli and colleagues (2019) observed a clear and statistically significant ordering of academic levels on their three indicators of lifetime research performance, our findings were more mixed. The professors (i.e., level E academics) in our sample clearly out-performed their more junior colleagues on lifetime publications, lifetime citations and h-index scores. Indeed, the professors had triple the publications, more than four times the citations and more than double the h-index scores of the level D (associate professor or equivalent) academics. Furthermore, the level D academics in our sample outperformed their level C (senior lecturer or equivalent) and level B (lecturer or equivalent) colleagues by a factor of between two and three, although these differences did not always reach our threshold for statistical significance (α = .01). However, the performance of the level C and B academics was more-or-less equivalent.

Somewhat similar results were observed for our additional performance indicators: lifetime first author publications and journal SJRs. Academics at level E significantly out-performed academics at all other levels on both. They had more than double the first author publications of level D academics, and average journal SJRs that were around 17% higher. Unexpectedly, while level D academics had significantly more first author publications than level C (79% higher) and B (64% higher) academics, the average journal SJRs for level B surpassed level C (33% higher) and were equivalent to level D.

In Australia, Haslam et al. (2017) observed a linear relationship between lifetime research performance and academic seniority in the Go8. Also in Australia, Mazzucchelli et al. (2019) observed a rough doubling of research performance with each level of academic seniority. For the same lifetime metrics as Mazzucchelli et al. (2019), Craig et al. (2021) reported similar findings. In the UK, by contrast, professors had a greater performance advantage over their more junior colleagues, whereas there were few performance differences between academics at levels B and C. These effects may be attributable, at least partly, to research career length (defined herein as the number of years since first Scopus publication). The average research career length of a level E academic in our sample was 25 years, versus 17, 12 and 9 years for academics at levels D, C and B respectively. What is notable here is the relatively small difference between the average research career lengths of level C and B academics, which may partially account for the absence of any real performance differences between these levels. By way of contrast, Craig et al. (2021) reported average research career lengths of 16 and 9 years for Australian academic psychologists at levels C and B respectively. It appears that UK academic psychologists are promoted (or ‘progressed’) to level C considerably sooner than their Australian counterparts. However, this alone cannot fully account for the lack of differentiation between levels B and C in the UK. Consequently, we propose that it also reflects level C promotion/progression criteria that place relatively greater emphasis on teaching and administrative performance than they do on research performance (Parker, 2008). It may also be the case that research activity slows, and administrative and teaching loads increase, in the aftermath of securing a ‘tenured’ academic position. Such an effect has been observed in the US and Canada (Duffy et al., 2011). Unlike in the US, most UK academic staff from level B up are employed on tenured (i.e., ‘continuing’ or ‘permanent’) contracts (Higher Education Statistics Agency, 2021). The differentiation between levels C, D and E in the UK may reflect level D and E promotion criteria that demand research excellence (Parker, 2008). In fact, it has been observed that, even in situations where teaching excellence was the primary determinant of promotion to levels D and E, a significant research profile remained crucial (Cashmore et al., 2013).

The UK and Australia have similar research environments (Auranen & Nieminen, 2010). Consequently, we expected to see similar levels of lifetime research performance in the UK to those observed by Mazzucchelli et al. (2019) and Craig et al. (2021) in Australia. At the discipline level, this appeared to be the case. The Australian academic psychologists had published 26–34% more papers, had 19–24% fewer citations and had h-index scores very similar to those of the UK academic psychologists. Although some of these differences were statistically significant, they would all be characterised as ‘small’ in magnitude (Cohen, 1988). Some larger differences emerged when making international comparisons within academic levels. For instance, level B UK academics had around double the citations of their Australian counterparts and had h-index scores that were over 50% higher. At levels C and D, the publication counts and h-index scores of the UK academics were significantly below those of their Australian counterparts. Specifically, the UK academics had published 41–51% fewer papers and had h-index scores that were 10–29% lower. These differences may be partially attributable to the shorter research career lengths of the UK academics at levels C (12 vs. 16 years) and D (17 vs. 20 years). At level E, there was very little difference between UK and Australian h-index scores, although the UK academics had published 26–30% fewer papers and had been cited 4–34% more often. Overall, UK and Australian academic psychologists demonstrated broadly comparable levels of research performance, although there appeared to be a slightly greater emphasis on impact relative to quantity in the UK. The factors underpinning this difference would be an interesting avenue for future research.

Another commonality between the UK and Australia was the influence of university mission group on lifetime research performance. In Australia, Mazzucchelli et al. (2019) observed that academics at Go8 universities had published 28–50% more papers, had been cited 64–96% more often, and had h-index scores that were 28–53% higher than academics at the same level outside of the Go8. We also found a main effect of mission group (Russell Group vs. non-Russell Group) on all five of our performance indicators. However, follow-up analyses indicated that these effects were driven largely by the very strong research performance of the Russell Group professoriate. This is clearest in Fig. 1. Russell Group professors had published 2.2 times more papers, had 4.2 times more citations, had h-index scores that were 1.9 times higher, were first author on 1.7 times more papers and published in journals with SJRs 1.6 times higher than professors outside the Russell Group. Although Russell Group academics at other levels consistently outperformed their equivalents outside the Russell Group, the differences were more modest and mainly non-significant. The four exceptions to this were on h-index scores at levels C (71% higher) and D (43% higher), and journal SJRs at levels B (27% higher) and D (51% higher). Although academics in the Russell Group tended to have been research active for somewhat longer than their equivalents outside the Russell Group, these research career length differences did not explain the performance differences we observed.

One explanation for these findings is the heterogeneity amongst higher education institutions in the UK (Harley et al., 2004). The Russell Group comprises mainly ‘old’, research-intensive universities with reputations based on the quality and quantity of their research outputs. Following their REF performance in 2014, the 20 English Russell Group members were allocated 67% of the Higher Education Funding Council for England’s (HEFCE) 2015/16 quality-related research funding. The remaining 33% was split between the other 111 eligible institutions (MillionPlus, 2016). In the period from 2002 to 2006, about two thirds of research outputs from UK universities had at least one author with a Russell Group affiliation. In the same time period, these authors produced 73% of UK outputs defined as ‘highly cited’ (Adams & Gurney, 2010). By contrast, many of the universities outside of the Russell Group are modern (e.g., former polytechnics that were granted university status in 1992), place a stronger emphasis on teaching quality (Harley et al., 2004) and receive relatively little research funding (MillionPlus, 2016). These differences in priorities and funding are reflected in hiring practices, promotion criteria and internal resourcing decisions (e.g., the proportion of staff time allocated to research related activity).

Furthermore, although the Russell Group can be considered reasonably homogeneous (but see Adams & Gurney, 2010; Boliver, 2015), the universities outside of the Russell Group are not. Indeed, our sample of non-Russell Group universities included some of the best- and worst-performing psychology departments in the UK. This is somewhat evident in Table 3, which shows large research performance differences between mission groups that, for the purposes of our primary analyses, we combined to form a single comparison group. The heterogeneity within this comparison group attenuated the differences between it and the Russell Group, at least at lower academic levels. As noted previously, amongst the professoriate there were consistently large performance differences between those in and outside of the Russell Group. Interestingly, this pattern is the opposite of that observed by Mazzucchelli et al. (2019), who found that differences between the performance levels of academics in and outside of the Go8 tended to decrease as seniority increased.

The relative homogeneity of the Russell Group is consistent with Joy (2006), who found that scholarly productivity levels did not differ amongst research-focused US universities with different levels of prestige. Since a strong research profile informs national and international rankings, and ultimately drives reputation (McNally, 2010; Sabharwal, 2013), Joy (2006) concluded that elite, research-intensive universities successfully ‘poach’ high research performers by capitalising on their financial resources and brand cachet. Conversely, the less prestigious universities appoint ‘rising stars’ and create an environment conducive to productivity. Drawing on this idea, it may be that academics with exceptional research performance move to Russell Group universities when opportunities arise. Although we did not examine the prospect that the Russell Group’s research strength is linked to its ability to ‘lure’ the best and brightest from elsewhere with professorial positions, Scopus does provide comprehensive lists of past institutional affiliations which could be used to test this idea in the future. However, these data would need to be combined with data from other sources to address questions about the career stages at which academics make the switch. It is also worth noting that there was a greater proportion of professors in our Russell Group sample than in our samples of academics from the other university mission groups (see Table 2). As in the Go8 in Australia (Craig et al., 2021; Mazzucchelli et al., 2019), psychology departments within the Russell Group are, relatively speaking, ‘top heavy’.

To our knowledge, this study is the first comprehensive and representative survey of the lifetime research performance of academic psychologists in the UK, stratified by academic level and university mission group. Previous efforts (Colman, Garner, & Jolly, 1992; Colman, Grant, & Henderson, 1992; Furnham & Bonnett, 1992; Rushton & Meltzer, 1981) have focused on limited indicators of performance derived from a limited set of journals over limited time periods. Furthermore, they did not consider the influence of seniority or university type on performance and, given our current understanding of how research performance in Australia has increased considerably over the last decade (Haslam et al., 2017; Mazzucchelli et al., 2019), are now thoroughly out of date. It is also, to our knowledge, the largest such survey undertaken anywhere to date. Our results can be used to facilitate departmental, institutional and national resourcing decisions, and may help to inform decision making around employee selection, promotion, performance management and time/task allocation. They are also likely to be of use to individual academics for career development planning, and can be used to help strengthen the case for promotion and research funding.

Our study is, however, not without limitations and caveats. First, our indicators of performance were sourced primarily from Scopus. Although Scopus has several advantages over other academic databases (see Aghaei Chadegani et al., 2013), it is not perfect. For example, during data collection we noticed that, on occasion, papers had been incorrectly attributed and affiliations were not always current. Furthermore, although it is considerably more comprehensive than other academic databases (e.g., PsycINFO and Web of Science), Scopus does not index all academic outputs. Moreover, within Scopus, we limited our searches to ‘article’ and ‘review’ type journal articles. Consequently, the figures presented herein almost certainly underestimate the true levels of research activity of academic psychologists in the UK. They do not include books, book chapters, material that was ‘in press’ at the time of data collection, papers published in journals that are not indexed by Scopus, reports and other ‘grey’ literature. It would be informative to repeat this study without any filters using Google Scholar, which is regarded as the most inclusive academic database (Martin-Martin et al., 2018). We suspect that this would result in considerably larger performance estimates, although the relative differences between academic levels and university mission groups would probably remain unchanged. However, as comprehensive as Google Scholar is, it must be remembered that it does not and cannot capture all research activity. Obvious omissions include grant applications, confidential reports, conference presentations and posters, invited talks, peer reviews and research supervision.

Second, for the sake of consistency with Mazzucchelli et al. (2019), we did not attempt to distinguish between staff working full- or part-time, nor between staff on research-focused, ‘traditional’ research-and-teaching or teaching-focused contracts. It seems obvious that these factors impact research performance. It is also plausible to suggest that, compared to teaching-intensive institutions outside the Russell Group, members of the Russell Group may employ a greater proportion of research-focused and traditional academics on full-time contracts. Russell Group universities may also set higher expectations for promotion, particularly in relation to research. If so, these differences may explain some (if not most) of the performance variation that we observed between universities in and outside of the Russell Group. For a truly ‘fair’ comparison between universities and university groups, future researchers should attempt to measure and control for these factors. Given its complex relationship with research performance (Huang et al., 2020), gender ought also to be controlled in any future research.

Third, although research performance as measured by raw publication and citation counts has increased in recent decades (Haslam et al., 2017; Mazzucchelli et al., 2019), so too has the average number of co-authors on each publication (Fanelli & Lariviere, 2016). When the number of co-authors is accounted for, publication counts appear stable over time (Fanelli & Lariviere, 2016). Furthermore, there are established differences in co-authorship practices between sub-fields of psychology (Henriksen, 2016). Future researchers ought to consider adjusting raw performance measures to account for co-authorship and sub-field practices. The impact this may have on the magnitude of differences between academic levels or university mission groups is currently unknown. Additionally, weightings can be given to publications based on authorship position. However, this would require assuming that contribution level and authorship position are inversely related, which is not necessarily the case (Sauermann & Haeussler, 2017). The hIa (h-index annual), which is the mean annual increase in h-index scores after correcting for co-authorship (Harzing et al., 2014), may also be useful here.
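As a sketch of how the hIa might be computed, assuming the Harzing et al. (2014) definition (an h-index calculated on citation counts divided by the number of co-authors, then divided by years since first publication):

```python
def hIa(papers, career_years):
    """papers: list of (citations, n_authors) tuples; career_years: years
    since first publication. Returns the annualised individual h-index."""
    normalised = sorted((c / a for c, a in papers), reverse=True)
    h_norm = 0
    for rank, cites in enumerate(normalised, start=1):
        if cites >= rank:
            h_norm = rank
        else:
            break
    return h_norm / career_years

# Hypothetical academic: five papers over a 10-year research career
papers = [(40, 4), (24, 2), (9, 3), (8, 1), (5, 5)]
print(hIa(papers, career_years=10))  # 0.3
```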

Fourth, research performance, as conceptualised in our study, and research impact are not synonymous. Although more difficult to operationalise, impact has played an increasingly prominent role in the evaluation of research at both the discipline and institutional level in recent years (see, for example, the REF in the UK and the ERA in Australia). Individual impact is also growing in importance (Smith et al., 2013). Future researchers should consider ways of incorporating impact (beyond h-index scores and citation counts) into comparisons between different groups of psychology academics.

Finally, we echo Duffy et al. (2011) in saying that the norms reported herein should be used sensitively and cautiously. Performance measures like publication and citation counts present a very partial picture of the value of an individual or department’s scholarly contribution to psychology. When making resourcing decisions, or decisions related to employee selection, promotion, performance management or time/task allocation, these measures must be considered alongside a range of other indicators of scholarly performance, including those linked to impact, teaching and learning, and community engagement. Research performance measures must also be considered in context and relative to opportunity: comparing the publication counts of academics without reference to their area of expertise, career length, employment fraction, or teaching and administrative loads is simplistic and unfair.

In conclusion, increasing competition for finite research time and funding has made the evaluation of research performance essential. This study is the first comprehensive and representative survey of the lifetime research performance of academic psychologists in the UK, stratified by academic level and university mission group. Broadly speaking, we found that lifetime research performance increased with seniority, although not in the predictable manner observed in Australia by Haslam et al. (2017) and Mazzucchelli et al. (2019). We also found that academics in the Russell Group outperformed those outside of the Russell Group on all our measures. These differences were largely due to the very strong research performance of the Russell Group professoriate, and could not be accounted for by the tendency for Russell Group academics to have been research active longer than their equivalents outside of the Russell Group. Overall, the research performance of UK academic psychologists was comparable to that of their Australian counterparts. Our results shed light on lifetime levels of research activity in UK psychology departments and will be useful for a range of benchmarking purposes. Our data are open and afford potential for addressing novel future research questions about the quantification of research performance within academic psychology in the UK, as well as future international comparisons. However, they must be used and interpreted in context, alongside a range of other measures of scholarly productivity and performance.