Sources of convergence and divergence in university research quality: evidence from the performance-based research funding system in New Zealand

The introduction of performance-based research funding systems (PBRFS) in many countries has generated new information on their impacts. Recent research has considered whether such systems generate convergence or divergence of research quality across universities and academic disciplines. However, little attention has been given to the processes determining research quality changes. This paper utilises anonymised longitudinal researcher data over 15 years of the New Zealand PBRFS to evaluate whether research quality changes are characterised by convergence or divergence, and the processes determining those dynamics. A unique feature is the use of longitudinal data to decompose changes in researcher quality into contributions arising from the entry, exit and quality transformations of retained researchers, and their impacts on convergence or divergence of research quality across universities and disciplines. The paper also identifies how researcher dynamics vary systematically between universities and disciplines, providing new insights into the effects of these systems.


Introduction
Performance-based research funding systems (PBRFS) are designed to improve research output, quality and impact, and to strengthen the accountability of universities for the use of public funds. They involve a range of incentives, created in particular by the metrics used to measure research quality, which are designed to encourage institutional and individual changes, although unintended consequences can also result. 1 In evaluating performance outcomes of a PBRFS, it is important to know whether the design features led to the initially stronger universities capturing an increasing share of research funds, or to a catch-up process whereby initially weaker universities raised their standards at a faster rate. This relates to what is referred to as β-convergence, defined as a systematic tendency for lower-quality universities to experience relatively higher growth rates compared with higher-quality universities. A further type of convergence, called σ-convergence, refers to changes in the overall dispersion of university research quality. 2 Despite the large literature on PBRFSs, very few systematic attempts have been made to assess their convergence properties. The first study to examine convergence formally in the context of a PBRFS is , who use information about the first two full rounds, in 2003 and 2012, of the New Zealand (NZ) PBRFS. 3 They found significant evidence for both β- and σ-convergence. Subsequently, Checchi et al. (2020) adopted the same technique to examine Italian universities, following the introduction of the Italian PBRFS (the Evaluation of Research Quality, VQR). They found convergence of research quality of Italian universities between the VQR rounds of 2004 and 2010, and between 2011 and 2014.
Following the same basic method, Abramo and D'Angelo (2022) also provide evidence of convergence of the research quality of Italian universities and also find that the process of convergence is affected predominantly by a stronger rate of improvement of the initially lower research-quality universities.
The present paper extends this earlier work in several ways. The first, and most important extension, is that the precise sources of convergence are examined in detail. This is achieved by investigating the nature of the transitions of researchers involved in achieving quality improvements. Essentially, quality changes arise from staff turnover and quality transformations of those remaining within the same university. This exploration of the sources of convergence involves the development of a new decomposition method that identifies the separate contributions of exits, entrants and quality transformation.
Second, unlike the earlier analysis of , the paper is able to make use of the third full round carried out in 2018. This extension allows investigation of whether changes in research quality during the second period (2012-2018) were systematically related to the previous responses from 2003 to 2012. Third, more detailed information is available about researchers in each round, including gender and full-time equivalent (FTE) status, which allows for further control variables to be included in the analysis.
Hence, this paper presents techniques that identify the contributions of turnover and quality transformation of researchers to the growth in research quality following the introduction of a PBRFS. Turnover arises from exits and entrants of researchers, defined as follows. Entrants to a university consist of researchers who either move from another NZ university, or enter the NZ university system from outside it. Exits are researchers who leave a university, either transferring to another NZ university or moving outside the NZ university system. When looking at NZ universities as a whole, the transfers among universities net out, and entrants and exits include only those moving to or from the NZ university system.
Each university is incentivised to manage the process of improvement in research quality, subject to constraints. 4 The nature of the incentives in the PBRFS design is likely to influence relative responses and therefore whether they generate convergence or divergence. 5 The benefits of the turnover of researchers of a particular quality vary by the type of university. In turn, the characteristics of a university's turnover, and quality transformation of incumbents, depend on its initial average research quality. These determine the process of convergence of research quality among all universities within the system.
The New Zealand scheme was designed to unbundle the research component of Government funding of New Zealand tertiary education organisations, and allocate the research component based on research performance rather than the number of students; see New Zealand Tertiary Education Commission (2019, p. 11). Three measures are used to allocate Government funding to support research at universities and other tertiary education organisations. The largest component, the focus of this paper, is Quality Evaluation. 6 The system substantially changed the incentives facing individuals, departments and universities. However, when the PBRFS was introduced, there was no explicit statement of whether an aim was to generate more concentration of higher-quality research.
'Growth rates of research quality and convergence' section begins by describing the research quality metric used by the NZ PBRFS. It then provides empirical results regarding β-convergence and σ-convergence, allowing for the influence of past changes. 'Contribution of exits, entrants and quality transformations to AQS changes' section reports the contributions of exits, entrants and quality transformation of researchers to research quality changes for universities and academic disciplines (subject areas), and 'β-Convergence properties of contributions to AQS growth' section examines these component effects on convergence. 'Conclusions' section concludes.

Growth rates of research quality and convergence
The New Zealand measure of research quality
The assessment of research quality in the NZ PBRFS is based on the performance of all eligible individual researchers within each university. 7 The assessment is based on peer review of a range of research outputs over the previous 6 years. Each researcher is assigned to one of four quality categories (QCs), A, B, C and R, where the highest category is A, and R indicates an absence of significant research outputs. These QCs are used to compute a quantitative performance score, referred to as an Average Quality Score (AQS) for each subject area and university, defined as follows. 8 Each individual, h, is given a cardinal score, G_h, depending on the QC: 10 for A; 6 for B; 2 for C; and 0 for R. The average quality score, AQS, is the employment-weighted arithmetic mean score. 9 Define the full-time equivalent (FTE) employment weight of person, h, as e_h ≤ 1, and let n denote the number of employees. The AQS for a university or discipline group is:

\[ AQS = \frac{\sum_{h=1}^{n} e_h G_h}{\sum_{h=1}^{n} e_h} \qquad (1) \]

The data used here include the QC assigned to every researcher who participated in any of the assessment rounds, in 2003, 2012 and 2018. 10 They also include an anonymous identifier, age, gender, research subject area, university of employment, and FTE status, including whether the researcher exited or entered the entire NZ university system between rounds or transferred to or from another NZ university.

4 On management responses to the introduction of a PBRFS, see Adams (2008) and Woelert and McKenzie (2018).
5 In the UK context, Barker (2007, p. 6) suggests that the funding weights disproportionately rewarded universities which achieved high quality scores, and therefore led to divergence. However, convergence properties for the UK scheme cannot be tested formally, as in the present paper, because the metrics involve only a small number of qualitative categories, and not all academics needed to be included in evaluations.
6 The other components are Research Degree Completions and External Research Income.
7 Those eligible include all research and teaching staff who are employed on the PBRF census date under an employment agreement with a duration of at least 1 year, and are employed throughout the contract on at least a 0.20 FTE basis. Although precise information is not available, this appears to account for around 90% of total non-administration staff.
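The definition of the AQS can be illustrated with a short computation. The following is a minimal sketch only: the input layout (a list of quality-category and FTE pairs), the function name and the example unit are hypothetical and are not drawn from the TEC data.

```python
# Cardinal scores attached to each quality category, as stated in the text.
SCORES = {"A": 10, "B": 6, "C": 2, "R": 0}


def average_quality_score(researchers):
    """Employment-weighted arithmetic mean score for one unit.

    `researchers` is a list of (quality_category, fte) pairs; this layout
    is an assumption made for the illustration.
    """
    total_fte = sum(fte for _, fte in researchers)
    if total_fte == 0:
        raise ValueError("unit has no FTE employment")
    weighted_score = sum(SCORES[qc] * fte for qc, fte in researchers)
    return weighted_score / total_fte


# Hypothetical unit: two full-time and two half-time researchers.
unit = [("A", 1.0), ("B", 1.0), ("C", 0.5), ("R", 0.5)]
unit_aqs = average_quality_score(unit)  # (10 + 6 + 1 + 0) / 3
```

Part-time researchers therefore affect the AQS in proportion to their FTE weight rather than as whole individuals.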

β-Convergence
A standard form of logarithmic regression specification between AQS changes and initial AQS to test for β-convergence from time t − 1 to t is:

\[ \frac{1}{\tau}\left(\log AQS_{ij,t} - \log AQS_{ij,t-1}\right) = \gamma + \beta \log AQS_{ij,t-1} + \varepsilon_{ij,t} \qquad (2) \]

where τ denotes the number of years between t − 1 and t. The left-hand side measures the annualised proportional AQS growth rate for university j, and discipline i. 11 This enables the systematic component of change to be separated from other influences, including university-specific fixed effects and random shocks. Where three time periods are available, (2) can be extended to: 12

\[ \frac{1}{\tau}\left(\log AQS_{ij,t} - \log AQS_{ij,t-1}\right) = \gamma + \beta_1 \log AQS_{ij,t-1} + \beta_2 \log AQS_{ij,t-2} + \varepsilon_{ij,t} \qquad (3) \]

In what follows, t, t − 1 and t − 2 refer to 2018, 2012 and 2003. The full convergence effect, β, is measured by β = (β_1 + β_2). Convergence tests can also be conditioned on a vector of additional variables, including the number of staff FTEs, the median age and the gender ratio of researchers by university and discipline.

8 The assessment and scoring methods are described in more detail and critically evaluated in Buckle and Creedy (2019b).
9 Two new categories C(NE) and R(NE) were introduced by the TEC for the 2012 and 2018 assessment rounds to signify if a researcher met the 'new and emerging' criteria. The score assigned by the TEC to R(NE) was 0 in 2012 and 2018, the same as in 2003. The score assigned by the TEC to C(NE) was 2 in 2012, the same as for C, and 4 in 2018 (New Zealand Tertiary Education Commission, 2019, p. 13). For the purposes of this paper the score assigned to C(NE) in 2018 is set equal to 2, to ensure consistency over time.
10 The anonymised data used in this study are not publicly available and were provided by the NZ Tertiary Education Commission (TEC) following a confidentiality agreement.
11 This form was used by  to examine changes from 2003 to 2012, but they did not need to annualise the growth rates because only one period was available. Academic subjects are collected into nine disciplines: medicine, engineering, core science, management, accounting finance & economics, humanities, agriculture, law and education; see  for details of the groups.
12 This specification is consistent with one having the lagged (logarithm) of the 2012 AQS, and the growth from 2003 to 2012 (that is, the difference in the logarithms of the AQSs), on the right-hand side. The interpretation is that growth from 2012 to 2018 depends on the AQS level in 2012 and the previous annualised growth rate experienced from 2003 to 2012.
Testing for possible differences across universities and disciplines in the growth (and convergence or divergence) of their AQSs involves testing for the inclusion or exclusion in regressions of shift dummy variables, where D_i = 1 for university i, and is zero otherwise. Slope dummies, S_i, equal to D_i logAQS_{i,t−1}, are also added. Similar dummies are defined for discipline groups. The parameters γ and β = (β_1 + β_2) in (3), along with their university-specific and discipline-specific equivalents, therefore respectively capture autonomous AQS growth and the rate of convergence (β < 0) or divergence (β > 0) of AQSs for each university and discipline. The parameter, β, may be interpreted as the 'full' rate of convergence over the whole period, including any university or discipline fixed effects.
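The β-convergence regression with two lagged levels can be sketched numerically as follows. This is an illustration under stated assumptions, not the estimation code used for the paper: the data are synthetic and noise-free, the function name is hypothetical, plain OLS via NumPy is assumed, and no dummy or control variables are included.

```python
import numpy as np


def beta_convergence(log_aqs_t, log_aqs_t1, log_aqs_t2, years):
    """OLS of annualised log AQS growth on the two lagged log levels.

    Returns (gamma, beta1, beta2, beta), where beta = beta1 + beta2 is the
    full convergence effect. Negative beta indicates convergence.
    """
    log_aqs_t = np.asarray(log_aqs_t, dtype=float)
    log_aqs_t1 = np.asarray(log_aqs_t1, dtype=float)
    log_aqs_t2 = np.asarray(log_aqs_t2, dtype=float)
    growth = (log_aqs_t - log_aqs_t1) / years  # annualised growth rate
    X = np.column_stack([np.ones(len(growth)), log_aqs_t1, log_aqs_t2])
    coef, *_ = np.linalg.lstsq(X, growth, rcond=None)
    gamma, beta1, beta2 = coef
    return gamma, beta1, beta2, beta1 + beta2


# Synthetic units with known parameters (NOT the paper's data).
rng = np.random.default_rng(0)
log_2003 = rng.uniform(0.2, 1.5, size=40)
log_2012 = rng.uniform(0.5, 1.9, size=40)
true_growth = 0.2 - 0.12 * log_2012 + 0.02 * log_2003
log_2018 = log_2012 + 6 * true_growth  # six years from 2012 to 2018

gamma, beta1, beta2, beta = beta_convergence(log_2018, log_2012, log_2003, years=6)
```

Because the synthetic growth rates contain no error term, the regression recovers the assumed parameters exactly (up to floating-point precision), with β_1 < 0 and a small positive β_2 moderating the full effect.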
The approach begins with all dummy variables included: 2 × 8 university shift and slope dummies and 2 × 9 discipline equivalents. These are progressively eliminated using the 'general-to-specific' (Gets) approach of Campos et al. (2005), Castle et al. (2011) and Doornik (2014), whereby the variable with the lowest t-ratio is omitted first. 13 The regression is then re-run in a sequential process, until the most parsimonious specification of the data-generating process is obtained. This effectively treats the null hypothesis as γ_i ≠ γ_j ≠ γ and β_i ≠ β_j ≠ β, against the alternative of common values of γ and β across universities and disciplines.
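The sequential elimination step can be sketched as a simple backward-elimination loop. This is a deliberately simplified illustration: it omits the congruence checks that form part of the full Gets procedure, assumes classical OLS standard errors, and uses synthetic data and hypothetical variable names.

```python
import numpy as np


def t_ratios(X, y):
    """OLS t-ratios using classical (homoskedastic) standard errors."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - k)
    return coef / np.sqrt(sigma2 * np.diag(xtx_inv))


def backward_eliminate(X, y, names, protected, t_crit=3.0):
    """Drop the candidate with the smallest |t|, re-run, and repeat until
    every remaining candidate has |t| > t_crit. Variables named in
    `protected` (here the constant and the convergence term) always stay."""
    keep = list(range(X.shape[1]))
    while True:
        t = t_ratios(X[:, keep], y)
        candidates = [(abs(t[pos]), col) for pos, col in enumerate(keep)
                      if names[col] not in protected]
        if not candidates:
            break
        smallest_t, col = min(candidates)
        if smallest_t > t_crit:
            break
        keep.remove(col)
    return [names[c] for c in keep]


# Synthetic example: d_b matters for growth, d_a does not (by construction).
rng = np.random.default_rng(1)
n = 120
log_aqs = rng.uniform(0.5, 1.9, n)
d_a = (np.arange(n) < 60).astype(float)
d_b = (np.arange(n) % 2 == 0).astype(float)
y = 0.2 - 0.1 * log_aqs + 0.05 * d_b + rng.normal(0, 0.005, n)
X = np.column_stack([np.ones(n), log_aqs, d_a, d_b])
names = ["const", "log_aqs", "d_a", "d_b"]
kept = backward_eliminate(X, y, names, protected={"const", "log_aqs"})
```

The relevant dummy is retained because its t-ratio is far above the critical value; an irrelevant dummy survives only if its t-ratio exceeds 3 by chance, which is the false-positive risk the stricter threshold is intended to limit.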
Following the Gets approach, the convergence variables, logAQS_2012 and logAQS_2003, always passed relevant t-tests, while dummy variables were progressively eliminated based on parameter t-tests. Table 1 shows t-ratios associated with the shift (D) and slope (S) dummy variables for each university and discipline added to regressions based on (3), and the order in which they are omitted. Left-hand columns report results where the initial regression includes all university and discipline dummies. As a check, right-hand columns report similar results but for initial regressions that include either all university dummies or all discipline dummies.
In both cases the table shows the t-ratio associated with a given dummy in the regression immediately prior to it being omitted. For example, in the left-hand columns of Table 1, the first variable eliminated was the shift dummy for Massey University (MU) with a t-value of 0.00. The regression was re-run omitting this variable. This led to the slope dummy for Medicine being identified as the lowest t-ratio and eliminated (t = 0.03). This process was repeated until only variables with |t| > 3 were retained. 14 The only dummy variable meriting retention in the regression after this process is the slope dummy variable for the Accounting, Finance and Economics (AFE) discipline group, with a t-ratio of − 3.48. 15

13 Campos et al. (2005, p. 2), for example, summarise the Gets approach as follows: '1. Ascertain that the general statistical model is congruent. 2. Eliminate a variable (or variables) that satisfies the selection (i.e., simplification) criteria. 3. Check that the simplified model remains congruent. 4. Continue steps 2 and 3 until none of the remaining variables can be eliminated'.
14 A critical t-ratio = 3 was chosen following Castle et al. (2011) and Castle and Hendry (2014) who argue that substantial pre-testing and variable selection from many possible models increases the risk of retaining irrelevant variables. For example, with a critical significance level for t-tests of c_α = 0.05, there is a 1 in 20 chance of an irrelevant variable being retained (with a threshold t > 2) in the model on average. However, for c_α = 0.01 (t_α ≈ 2.6) this becomes 1 in 100, and c_α = 0.001 (t_α ≈ 3.35) implies 1 in 1000. Hence values of t_α between 2.6 and 3.35 substantially reduce the risk of a false positive. Castle and Hendry (2014) recommend setting α = min(1/N, 1/T, 1%).
15 Table 3 also indicates that if a more conventional t > 2 was used to retain dummy variables, a shift dummy for Medicine (t = 2.41), a slope dummy for core Science (t = 2.49) and a slope dummy for Education (t = 2.57) would be retained.

The same final outcome is obtained using the approach in the right-hand columns which, for example, tests for university-specific parameters while maintaining common parameters across disciplines, and vice versa.
Details of the final regression are reported in Table 2. 16 These reveal a strong common initial rate of β-convergence across all universities and disciplines (except AFE) of β_1 = − 0.1167. 17 The initial AFE convergence rate, at β_1 = − 0.1321 (− 0.1167 − 0.0154), is quantitatively similar to, if statistically different from, the average rate of − 0.1167 across all disciplines. There are several possible reasons why the AFE discipline group could have a different estimated rate of convergence. This may reflect differences in international market conditions and rates of staff turnover for AFE researchers (see, for example, Boyle, 2008; Ehrenberg et al., 1991; Xu, 2008), a difference in attitudes and responses to the PBRFS incentives (Shin & Cummings, 2010), differences in the availability of contestable research funding for AFE researchers, or a difference in assessment standards adopted by the PBRFS assessment panel for the AFE group. In the absence of suitable data, these issues cannot be explored here. Table 2 suggests that the inclusion of logAQS_2003 in the specification is statistically justified and slightly reduces the longer-run annual convergence rate, with β_2 = 0.022 and hence β = − 0.0947 (t = − 10.994). The positive sign on logAQS_2003 reflects the fact that the degree of convergence in the first period is negatively associated with convergence in the second period, thereby modifying the full-period degree of convergence. This slightly slower rate of convergence when estimated over the whole 2003-2018 period is unsurprising given the rapid estimated convergence rates for 2003-2012 (reported in ), such that by 2018, average research quality scores were much closer across universities and disciplines than they had been in 2003 when the PBRFS was introduced. The value of β, while small in absolute terms, implies a high rate of convergence, in terms of differential average growth rates, conditional on the initial AQS.
These are clearly illustrated in the cross-plot of the 87 individual observations in Fig. 1, which shows the relationship between logAQS_2012 and actual AQS growth. The logAQS observations in 2012 can be seen to range from as low as 0.55 (AQS = 1.74) to highs up to 1.88 (AQS = 6.54). Similarly, differences across research units in the rate of growth of AQS over the 2012-2018 period are substantial, ranging from negative annual growth rates of − 0.06 to large positive growth of 0.12. The degree of β-convergence therefore reflects relatively rapid convergence, or 'catch-up'. Figure 1 reveals the negative relationship across individual research units, implying a strong β-convergence tendency. In addition, the highlighted observations (white dots) for average values (across disciplines) within each of the eight universities confirm a convergence tendency at the aggregated university level. Thus, for example, AUT and Lincoln University (LU) had the lowest initial AQS values and the fastest AQS growth, whilst the initially-leading university, VUW, had the slowest (near zero) AQS growth.
Based on the regression in Table 2   Nevertheless, the dominant convergence relationship between logAQS 2012 and expected AQS growth is clear in Fig. 2, despite the small (lagged) divergence tendency contributed by logAQS 2003 . A small additional source of variation observed in Fig. 2 arises from the slightly different convergence parameter applicable to AFE. Figure 3 compares actual and expected AQS growth at various points in the distribution from the 5th to the 95th percentile. This reveals a generally good fit across the distribution of AQS growth (2012-2018) observations, especially around the median. Unsurprisingly, expected values fit less well at the 5th and 95th percentiles, but nevertheless perform well, for example capturing around 67% of the large actual AQS growth rates at the 95th percentile (0.052/0.078), but under 30% at the 5th percentile.
The effects of control variables, including their potential impacts on convergence parameters, were tested by adding three variables to Eq. (3). These were: the initial number of staff FTEs in a university-discipline unit; its gender balance (female staff share); and staff median age. These variables were added individually and in combination, but none revealed any statistically significant effects, using conventional confidence intervals.

σ-Convergence
As mentioned above, the overall convergence or divergence in AQSs is measured by σ-convergence, indicating whether the distribution of AQSs across research units is becoming more or less dispersed. A value of β < 0 does not necessarily imply σ-convergence, which also depends on the error variance, σ²_ε, in (3). Table 3 reports the variances of logAQS across universities and disciplines separately and combined, and for each discipline within universities. This shows that there were substantial reductions in the overall dispersion of logAQS between 2003 and 2018 for all groups: for all universities and disciplines combined and for each discipline within universities. 18 For all universities combined, the variance is more than halved during 2012-2018, but for disciplines there is little change. For disciplines within universities, the decline of the variance for Massey disciplines is as large as for the earlier period, but for the other universities there is very little change, with half experiencing a small increase and the others a small decrease or no change.
Thus a strong common rate of β-convergence across all university and discipline units during 2012-2018 did not necessarily translate into an overall decline in dispersion across all units, with differences in σ-convergence observed across periods. Whereas substantial σ-convergence is observed for all groups during the initial PBRFS phase in 2003-2012, there is a mix of σ-convergence and σ-divergence during 2012-2018, with a relatively small reduction in variances after 2012. This partly arises from the relatively low variance of logAQS achieved by 2012.
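The σ-convergence comparison amounts to comparing the dispersion of logAQS across units at two dates. A minimal sketch follows; the AQS values are invented for illustration, and the use of the population variance (rather than the sample variance) is an assumption, since the text does not state which is reported in Table 3.

```python
import math
from statistics import pvariance


def log_variance(aqs_values):
    """Variance of log AQS across research units (population variance assumed)."""
    return pvariance([math.log(a) for a in aqs_values])


# Hypothetical AQSs for four units at the start and end of a period.
aqs_start = [1.5, 2.5, 4.0, 6.0]
aqs_end = [3.0, 3.8, 4.6, 6.2]

var_start = log_variance(aqs_start)
var_end = log_variance(aqs_end)
sigma_converged = var_end < var_start  # dispersion fell over the period
```

In this example the initially weakest units grow fastest and dispersion falls, but the two properties are distinct: with sufficiently large shocks, dispersion can rise even when β < 0.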

Contribution of exits, entrants and quality transformations to AQS changes
The previous section estimated the extent of research-quality convergence of universities and discipline groups, although it did not directly consider the precise sources of the AQS growth in terms of the turnover of staff and their quality transformation. This section examines these sources and their dynamics, revealing how the separate contributions of exits, entrants and quality transformations of incumbent researchers combine to generate overall convergence. The analysis uses a decomposition method to identify the contributions of these three components, for universities and discipline groups.

Determinants of research quality change
To identify the way in which the component flows combine to affect the AQS of a university or discipline group, first let n_0 and n_1 denote the number of individuals in a university at times 0 and 1 respectively, and assume for convenience that all are full-time employees. As above, G_i denotes the score attached to each person, i, depending on the Quality Category, QC. The four QCs have scores, g_k (for k = 1,…,4), of 10, 6, 2 and 0 (for A, B, C and R researchers respectively). Hence if person i belongs to category k, the score is G_i = g_k. Finally, let Q_0 and Q_1 denote AQSs at times 0 and 1. By definition, the initial AQS is:

\[ Q_0 = \frac{1}{n_0}\sum_{i=1}^{n_0} G_i \qquad (4) \]

Footnote 18 (continued) AQS in 2003 than other disciplines) is excluded the variance still declines but by much less than when Education is included.
Suppose X_k people exit from quality category, k, and E_k enter, belonging to quality category, k. Furthermore, define T_k as the net transformations of incumbent researchers into quality category, k. That is, T_k measures the transfers into k from all other categories, net of the transfers out of k into all other categories, so that ∑_{k=1}^{4} T_k = 0. Allowing for all entries, exits and transformations over the period, the proportional change in quality is thus:

\[ \frac{Q_1 - Q_0}{Q_0} = \frac{n_0 Q_0 + \sum_{k=1}^{4} g_k \left(E_k - X_k + T_k\right)}{Q_0 \left[ n_0 + \sum_{k=1}^{4} \left(E_k - X_k\right) \right]} - 1 \qquad (5) \]

The AQS change therefore depends in a complex way on the flows contained in Eq. (5). However, it is possible, as shown in subsection 'Decomposition of quality changes', to decompose the change into the three separate components. First, it is useful to consider whether differences in the flows among universities and disciplines are likely to give rise to convergence.
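The way the flows combine can be checked with a small numerical example. The sketch below assumes full-time employees throughout, as in the text; the function name, the argument layout (counts by category in the order A, B, C, R) and all numbers are hypothetical.

```python
G = [10, 6, 2, 0]  # scores g_k for categories A, B, C, R


def proportional_change(n0_by_cat, exits, entrants, net_transfers):
    """Return (Q0, Q1, proportional change) from category-level flows.

    net_transfers[k] is T_k: moves into k from other categories, net of
    moves out of k, so the entries must sum to zero.
    """
    if sum(net_transfers) != 0:
        raise ValueError("net transformations must sum to zero")
    n1_by_cat = [n - x + e + t for n, x, e, t
                 in zip(n0_by_cat, exits, entrants, net_transfers)]
    q0 = sum(g * n for g, n in zip(G, n0_by_cat)) / sum(n0_by_cat)
    q1 = sum(g * n for g, n in zip(G, n1_by_cat)) / sum(n1_by_cat)
    return q0, q1, (q1 - q0) / q0


# Hypothetical unit of 100 researchers at the start of the period.
q0, q1, change = proportional_change(
    n0_by_cat=[10, 30, 40, 20],
    exits=[1, 3, 8, 15],
    entrants=[2, 6, 12, 2],
    net_transfers=[3, 5, -4, -4],
)
```

Here the initial AQS is 360/100 = 3.6 and the final AQS is 448/95, so exits concentrated among R researchers and net upward transformations together raise the AQS by roughly 31%.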
Differential rates of exit, entry and quality transformation can contribute towards convergence for several reasons. For example, there is an upper limit to the extent to which exits can continue to contribute to improving a university or discipline AQS: as the number of low-quality researchers is reduced, the scope to improve an AQS from exits diminishes. Low-quality universities have relatively large numbers of R-rated researchers, encouraging a high exit rate of Rs. Higher-quality universities are able to retain their higher-quality researchers, and at the same time can more readily attract high-quality researchers from other universities in NZ as well as from outside the system. In addition, universities with high AQSs may be able to attract strong younger researchers, who enter as C or even R types, but are capable of progressing more rapidly to higher levels. The PBRFS scoring system implies that, for example, a university with an initial AQS that is below 2 can increase its average quality by employing more C-type researchers. But a university with an AQS above 2 reduces its average quality by employing more Cs: they will wish to be confident that those who do enter the university are capable of subsequently raising their score. The fact that the attractiveness of C-type researchers varies with the AQS means not only that there are differences among universities at any time, but their ability to raise their AQS further via recruitment practices is likely to change over time as their AQSs change.
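The threshold effect of hiring C-type researchers follows directly from the averaging arithmetic, as the following sketch shows. The function name and the staffing numbers are hypothetical, and full-time employment is assumed.

```python
def aqs_after_hiring(aqs, n, score, hires=1):
    """AQS of a unit of n full-time researchers after hiring `hires`
    additional researchers who each carry the given score."""
    return (aqs * n + score * hires) / (n + hires)


C_SCORE = 2  # score attached to a C-type researcher

low = aqs_after_hiring(aqs=1.5, n=50, score=C_SCORE)   # (75 + 2) / 51
high = aqs_after_hiring(aqs=4.0, n=50, score=C_SCORE)  # (200 + 2) / 51
```

A unit with an AQS below 2 gains from the hire, while a unit with an AQS above 2 loses, which is why the attractiveness of C-type entrants depends on a unit's current AQS.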
A similar argument could be made for the marginal effects of transformations (T), and there may also be diminishing benefits from investing in initiatives that improve the research environment and skills of incumbent researchers. For entrants, there is a diminishing effect from a university's budget constraint, as well as from constraints imposed by other university requirements (such as teaching and administration staffing requirements). For these reasons, the effects of exits, entrants and transformations on the rate of improvement of research quality of a university or discipline can be expected to display diminishing marginal productivity analogous to that evident in models of economic growth convergence. 19

Decomposition of quality changes
The required decomposition can be obtained by suitably modifying the relevant transition matrix, as follows. 20 Let the initial and final AQS for the specified group be denoted by Q1 and Q2. The aim is to decompose Q2-Q1 into components that measure the separate impact on the change in AQS of exits, entrants and quality transformations. An AQS can be calculated for the final period, using a counterfactual assumption of no entrants into any category, denoted Q3.
Furthermore, it is possible to obtain an alternative counterfactual AQS in the final period, by setting all exits and entrants to zero: this is denoted Q4. The counterfactual of no exits is applied by supposing that those recorded as exiting remain in the quality category in which they were placed in the initial period: that is, the diagonals of the flow matrix are augmented by the number of (FTE) exits. The difference, Q4-Q1, therefore reflects the effect of quality transformations made by those who remain in the system (since, in calculating Q4, those who actually exit are assumed to remain on their respective diagonal). The difference, Q3-Q4, reflects the separate effect of exits. Finally, the difference, Q2-Q3, measures the effect of entrants. These are combined to give:

\[ Q_2 - Q_1 = \left(Q_4 - Q_1\right) + \left(Q_3 - Q_4\right) + \left(Q_2 - Q_3\right) \qquad (6) \]

These contributions can be derived for the entire university system, and for any unit within the system. Figure 4, derived using the data in Table 7, illustrates the separate average annual contributions to the changes in AQS of exits, entrants and quality transformations for all universities combined, each university, and each discipline group. The contributions for each discipline within each university are shown in Fig. 5.
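The counterfactual construction can be sketched with a small flow matrix. This is an illustration under stated assumptions rather than the paper's implementation: the flow matrix, exits and entrants are invented, headcounts stand in for FTEs, and the function and variable names are hypothetical.

```python
G = [10, 6, 2, 0]  # scores for categories A, B, C, R


def aqs(counts):
    """Average quality score from counts by category."""
    return sum(g * c for g, c in zip(G, counts)) / sum(counts)


def decompose(flows, exits, entrants):
    """Split Q2 - Q1 into transformation, exit and entrant contributions.

    flows[i][j] : incumbents in category i initially and j finally.
    exits[i]    : leavers, by their initial category.
    entrants[j] : joiners, by their final category.
    """
    k = len(G)
    stayers_initial = [sum(flows[i]) for i in range(k)]
    stayers_final = [sum(flows[i][j] for i in range(k)) for j in range(k)]
    q1 = aqs([s + x for s, x in zip(stayers_initial, exits)])   # actual initial
    q2 = aqs([s + e for s, e in zip(stayers_final, entrants)])  # actual final
    q3 = aqs(stayers_final)                                     # no entrants
    # No exits or entrants: leavers are added back on the diagonal, i.e.
    # they keep their initial category in the final period.
    q4 = aqs([s + x for s, x in zip(stayers_final, exits)])
    return {"transformations": q4 - q1, "exits": q3 - q4,
            "entrants": q2 - q3, "total": q2 - q1}


flows = [[8, 1, 0, 0],
         [4, 20, 3, 0],
         [0, 10, 20, 2],
         [0, 0, 4, 1]]
parts = decompose(flows, exits=[1, 3, 8, 15], entrants=[2, 6, 12, 2])
```

In this example the three contributions sum exactly to the total change, with positive contributions from exits and transformations and a negative contribution from entrants, mirroring the qualitative pattern described for Fig. 4.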
It is evident from Fig. 4 that, for the university system as a whole and for each university and discipline, the net impact of exits is always positive, the net impact of entrants is always negative, and the net impact of quality transformations is always positive. This pattern prevails in both periods, although the sizes of the average annual contributions change. In the second period the positive average annual contribution from quality transformations is larger (
The characteristics for separate disciplines within each university, shown in Fig. 5, reveal greater diversity in the level of contributions and the changes between the two periods than is evident for the more aggregated university and discipline groups shown in Fig. 4. Although the dominant pattern of these researcher contributions is the same for universities and disciplines as a whole, there are several cases where exits and quality transformations have a negative effect on AQSs, and entrants have a positive impact. The data for Figs. 4 and 5 are provided in Table 7 in the Appendix.
A word of caution is necessary in interpreting the negative contribution of entrants observed above. This contribution, Q2-Q3, measures the difference between the final AQS and that which would result from having no entrants, and relates only to the AQS values at the end of a period. Hence a negative value of Q2-Q3 suggests that entrants on average, by the end of a period, are of lower quality than the incumbents at the end of a period. Of course, incumbents are typically subject to positive net transformations over the period. This does not necessarily mean that the entrants do not contribute positively to a higher overall AQS, as their contribution is positive so long as the average quality of entrants is higher than the overall start-of-period AQS.
Further details of the contributions are provided in Table 4, for all universities combined. This shows the beginning and end-of-period AQSs of various groups, and differences between the AQSs of the groups. The table shows how the different groups have contributed to the overall change in the AQS. The AQS of exits during 2003-2012, of 2.12, was substantially lower than the AQS of all researchers, of 2.88, in 2003. In 2003 the AQS of those who remained incumbent in the system was 3.68, which was 0.80 higher than the average of all researchers. They improved considerably to 5.18. Interestingly, the entrants during this period had an AQS in 2012 of 3.90. Although this was lower than the AQS of all researchers in 2012, of 4.55, it was higher than the AQS of all researchers in 2003. Therefore, the net effect of entrants was to contribute to an improvement in the score from 2003, of 1.02, but this contribution to the improvement in AQS was lower than that of incumbents, which was 2.30.
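The arithmetic behind these comparisons for 2003-2012 can be laid out explicitly. The figures below are those quoted in the text for all universities combined; the variable names are hypothetical, and the entrants' AQS of 3.90 is treated here as an end-of-period (2012) score, which is an assumption of this sketch.

```python
aqs_all_2003 = 2.88
aqs_all_2012 = 4.55
aqs_incumbents_2003 = 3.68
aqs_incumbents_2012 = 5.18
aqs_entrants_2012 = 3.90  # assumed to be measured at the end of the period

# Gaps relative to the start-of-period AQS of all researchers.
incumbent_initial_gap = round(aqs_incumbents_2003 - aqs_all_2003, 2)
entrant_gain = round(aqs_entrants_2012 - aqs_all_2003, 2)
incumbent_gain = round(aqs_incumbents_2012 - aqs_all_2003, 2)

# Entrants sit below the end-of-period average but above the starting one.
entrants_below_final_average = aqs_entrants_2012 < aqs_all_2012
entrants_above_initial_average = aqs_entrants_2012 > aqs_all_2003
```

This makes the distinction in the preceding paragraphs concrete: entrants lower the end-of-period AQS relative to a no-entrant counterfactual, yet still lie above the AQS at the start of the period.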
The second period differs in important ways from the first. The exits during this period had an AQS lower than that of all researchers in 2012. However, the difference was about half of the corresponding difference in the earlier period. Entrants, on the other hand, had a much lower score in 2018 than did entrants in the earlier period (by 2012) and was also lower than the AQS for all researchers at the start (2012) of this second period. Hence, the net effect of entrants during 2012-2018 was to reduce the AQS of all researchers in 2018, from what it would otherwise have been (that is, with no entrants). The AQS of entrants in 2018 was 0.97 lower than the score of all researchers in 2012, contrasting with an AQS of incumbents which was 1.31 higher than the average of all researchers at the start of the period. In addition, the AQS of entrants in 2012 was actually lower than the AQS in 2012 of those who exited the system during the second period.
The positive contributions of exits and quality transformations to the growth in AQSs are consistent with the new incentives created by the PBRFS. These encouraged universities to remove lower-quality researchers and to retain higher-quality and promising researchers considered more likely to transition to a higher QC over time. The positive net impact of quality transformations reflects decisions to attract and retain good researchers who improve over time. This net positive effect captures the mixture of those who improve and those who decline in quality while remaining within the institution. There were clearly diminishing gains from the positive net transformations of incumbents during the second period compared with the first.
The average annual contribution of exits, although positive on average in both periods, is smaller in the second period. The large gains achieved by the removal of a very high proportion of R researchers during the first period reduced the scope for similar AQS gains from this source during the second period. During this second period, the proportion of exits by higher-quality researchers increased substantially.
The contribution of entrants to growth in university AQSs clearly deteriorated during the second period compared with the first period: their AQS was lower than that of entrants in the first period, and lower than the AQS of all researchers at the start of the second period and of exits during that period. It clearly became increasingly difficult to recruit entrants above the average quality. 21

Fig. 4 Average annual contributions to AQS growth of exits, entrants and quality transformations. Note: Derived from Appendix Table 7

Fig. 5 Contributions of exits, entrants and quality transformations to changes in AQS across disciplines. Notes: Derived from Appendix Table 7

To decompose σ-convergence, Table 5 reports standard deviations of the separate average annual contributions of exits, entrants and quality transformations to changes in AQS. These standard deviations are similar in both periods for universities and disciplines overall. The changes in standard deviations across disciplines within universities (shown in the lower section of the table) are more diverse, as observed in section 'Growth rates of

Table 6 Regression results for AQS growth components, X, E and T. Column groups: Stage 1 results; Stage 2 results. Notes: This table shows the estimated regression coefficients; standard errors are in parentheses. ** (*) = significant at 1% (5%). † Adding Age 2012 is not statistically significant: t = −1.04 in (i); t = −1.57 in (iv); and t = 1.36 in (v). †† There are fewer observations for entrants due to nine observations where AQS fell between 2012 and 2018, such that the log difference is undefined.

β-Convergence properties of contributions to AQS growth
This section evaluates how the separate components of AQS growth (entry, exit and quality transformation) contribute to the β-convergence properties observed for AQS growth as a whole. That is, did initially lower-quality units catch up with higher-quality units largely via improvements to retained staff, or via the removal of low-quality staff and/or recruitment of higher-quality new staff? While results reported in section 'Contribution of exits, entrants and quality transformations to AQS changes' revealed negative effects on AQS growth from new entrants, entrants could nevertheless have contributed to cross-unit quality convergence.
For example, to help catch up with other units, some universities or disciplines may have used a strategy of quality improvement for retained staff, even if they had limited ability, or made little attempt, to use exits and entrants for this purpose, or vice versa. The previous section revealed substantial variation across universities and discipline groups in the contributions of these different components.
To examine the influences on convergence of exits (X), entrants (E) and quality transformations (T), a two-stage process was followed. First, regressions of Eq. (3) were estimated including control variables, with statistically insignificant controls progressively eliminated. This suggested at most one control variable, each unit's initial median age of staff, Age 2012, with a significant effect on AQS growth, but only for the E and T components. Results are reported in columns (i) to (iii) of Table 6. Second, the same general-to-specific process was followed, starting with all dummy variables and Age 2012 included, and eliminating insignificant variables in turn. This yielded the 'final' regressions reported in columns (iv) to (vi) of Table 6. 22

Allowing for the possibility of different rates of convergence across universities and disciplines, final results in columns (iv) to (vi) suggest that all three components of AQS growth contributed towards overall convergence, with β1 negative in all cases, and with additional negative effects from β2 for quality transformations, T. Long-run convergence parameters also confirm negative effects in all three cases. In the case of X and T these are robustly identified (absolute t-ratios exceed 3), while for E the estimate (−0.0646) is significantly negative at the 10% level. However, estimated magnitudes suggest that the largest effects on AQS growth are from E and T (around −0.065 to −0.068), with estimated convergence effects for X smaller at −0.028.
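The logic of a β-convergence regression can be sketched in its simplest form. The sketch below is not Eq. (3): it omits the control variables, the university and discipline dummies and the prior-period growth term, and it assumes a bivariate OLS of average annual log growth on the log of the initial score. The function name and all data are hypothetical. A negative slope indicates that units with lower initial scores grew faster.

```python
import math


def beta_convergence(initial, final, years):
    """Bivariate OLS of average annual log growth on log initial score.

    Returns (intercept, slope). A negative slope is consistent with
    beta-convergence. This is a simplified illustration, not Eq. (3).
    """
    x = [math.log(v) for v in initial]
    y = [(math.log(f) - math.log(i)) / years for i, f in zip(initial, final)]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Invented unit-level scores, for illustration only (not PBRFS data).
initial_scores = [2.0, 3.0, 4.0, 5.0, 6.0]
final_scores = [3.6, 4.5, 5.2, 6.0, 6.6]

intercept, slope = beta_convergence(initial_scores, final_scores, years=6)
print(f"estimated slope: {slope:.4f}")
```

Applying the same function separately to the growth contributions of exits, entrants and quality transformations would mirror, in a much reduced form, the component-by-component approach described above.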
Results in columns (iv) to (vi) of Table 6 also suggest few statistically significant deviations from a uniform contribution to the convergence rate across universities and disciplines. For exits, X, there is some evidence of a larger convergence rate contribution for AUT (which initially had the lowest AQS and the fastest growth in the first period), and a slower rate for Lincoln (LU). For T, only Massey University (MU) displays a significantly greater rate of convergence. Across disciplines there is some evidence for different convergence rates on E or T for education, agriculture, and AFE. 23

Overall, these results suggest that the largest components contributing towards β-convergence in AQS across universities and disciplines from 2012 to 2018 were the recruitment of entrants and the transformation of the quality scores of established and remaining researchers. Exiting staff from those universities and disciplines also facilitated convergence of average quality scores at the next PBRF audit, but made a smaller contribution: just under half of that of each of the other two components. Nevertheless, an important aspect of these results is that entrants can be seen to contribute consistently towards β-convergence yet display a mixed record of σ-convergence and σ-divergence, as shown in Table 5.
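That β-convergence need not imply σ-convergence can be seen in a small numerical sketch. The scores below are invented and are not PBRFS data, the function name is hypothetical, and dispersion is measured here by the sample standard deviation of score levels, which is an assumption about the measure rather than a description of Table 5. In both cases the initially weaker unit grows faster, but dispersion falls only in the first.

```python
import statistics


def dispersion_change(initial, final):
    """Change in the cross-unit standard deviation of scores.

    A negative value indicates sigma-convergence and a positive value
    sigma-divergence. Using the SD of levels is an assumption.
    """
    return statistics.stdev(final) - statistics.stdev(initial)


# Invented scores for two units, for illustration only.
# Case 1: the weaker unit grows faster and the gap narrows.
catch_up = dispersion_change([2.0, 4.0], [3.5, 4.5])

# Case 2: the weaker unit grows faster but overtakes by a wide margin,
# so the gap between the two units widens.
overshoot = dispersion_change([2.0, 4.0], [8.0, 5.0])

print(f"catch-up:  {catch_up:+.3f}")
print(f"overshoot: {overshoot:+.3f}")
```

The second case is deliberately extreme, but it shows why faster growth among lower-scoring units is necessary, not sufficient, for dispersion to fall.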

Conclusions
This paper has extended understanding of the convergence properties arising from the New Zealand PBRFS. In particular, it has developed a decomposition method which allows the separate contributions to the convergence process of staff turnover, in terms of exits and entrants, and quality transformations of incumbent staff, to be identified. The approach has also been extended to examine the question of whether changes between two assessment exercises depend on previous researcher quality changes. These contributions can clearly be related to the incentives, and constraints, within the system.
A strong degree of β-convergence was found for the period 2012-2018. Despite the considerable initial variation across universities in their AQSs, a common convergence process was found to operate across all universities, and all discipline groups within and across universities. Furthermore, the results also suggested that universities' growth in research quality from 2012 to 2018 was affected by growth in the prior period. A relatively high rate of improvement in the first period systematically dampened growth in the second period. There was also some evidence that the AQS for one discipline, Accounting, Finance and Economics, may have converged at a slightly faster rate than other disciplines. In addition, it was found that the convergence properties were not affected by differences in gender ratios, median age, and size of universities and their discipline composition.
Strong β-convergence is not necessarily associated with reduced dispersion or σ-convergence. However, the results confirm that a strong σ-convergence process is observed over 2003-2012. Across universities, this was maintained during 2012-2018, despite the fact that the dispersion of AQS levels across universities and disciplines was already much reduced by 2012. Across disciplines, σ-convergence was substantially reduced during 2012-2018, with σ-divergence in some cases.
When considering the contributions of researcher exits, entry and quality transformation, two outcomes are especially noteworthy. First, exits and quality transformations both contributed, as expected, to improvements in AQSs on average across universities and disciplines. New entrants (whether from outside the NZ system or cross-university transfers within it) raised AQS levels over the period 2003-2012, as they had an average quality in excess of the initial AQS, although their contribution to growth was less than that of incumbents. Over the period 2012-2018, entrants on average actually served to reduce AQS levels as their average quality was below the average in 2012. Substantial quality improvement over the first period, together with the difficulty of recruiting (and retaining) higher-quality researchers in the second period, seems to have contributed to significant 'decreasing returns' from the PBRFS.
Second, the overall decreasing returns combined with similar effects within universities and disciplines, such that all three components of change contributed to a process of β-convergence, rather than divergence, in AQS levels across universities and disciplines. The rate of AQS convergence was around −0.06, or 6%, per year for entrants and transformations, and somewhat slower, at around −0.03, for exits.
Further, these convergence rates were relatively uniform across universities and disciplines with only a few exceptions. These exceptions are: a slower rate of convergence via exits for Lincoln University; a faster rate of convergence via exits for AUT; and a faster rate of convergence via quality transformations for Massey University. Among disciplines, the exceptions include: faster convergence via quality transformations for AFE, and via entrants to Agriculture; but a slower rate of convergence via quality transformation in Education.
In the New Zealand case, the PBRFS funding formula was not designed explicitly to favour increased concentration of researchers, nor to favour low-quality units. Rather, funding allocation was both scale-neutral and neutral with respect to the location of individuals with different quality scores. This level playing field in financial PBRF incentives across research units may have encouraged initially low-rated universities and disciplines to aim for substantial improvement over 2003-2018, and avoided discouraging a process of knowledge transfer across universities. These aspects may have facilitated the convergence outcome observed here.
The results found for the NZ PBRFS do not necessarily carry over to assessment exercises in other countries. This is because the process depends on the particular incentive structure created by a PBRFS, and these differ among countries. The basic methods used to investigate whether there is β- and σ-convergence can be applied to other countries whose assessment process assigns some kind of quality score to universities and discipline groups, and do not require longitudinal information about individual researchers. However, the present analysis, identifying separate contributions to change, has been possible due to the New Zealand PBRFS's design, which requires all qualifying academics to be assessed in all rounds. Such data are not generally collected as a matter of course for other countries. Nevertheless, the same fundamental processes are involved in generating research quality improvements in any system. Hence, it would be useful for countries to collect, where possible, longitudinal information about individual researchers. Crucially, the incentives created by a PBRFS matter, and it would be useful for the evaluation process to be designed such that the information necessary to investigate responses to incentives is collected.

There were no observations for Lincoln University in the 2012 and 2018 assessment rounds for Law and Education discipline groups