1 Introduction

In recent years, the higher education sector has been the subject of formal quantitative research covering topics such as the estimation of rates of return in higher education, the academic labour market, institutional behaviour, and higher education as an industry. Approaches to higher education institutions (HEIs) have been changing and, apart from recognizing their obvious role in human capital and knowledge creation, critical analysis of their productivity and efficiency has started to gain importance. In particular, owing to changing demographic trends, competition for students, and tightening financial constraints on public entities, public HEIs are under constant pressure to improve their performance. On top of this, competition between universities has been growing steadily, and European HEIs are still struggling to catch up with American institutions.

An examination of the existing literature leads us to conclude that there are several gaps in analyses of higher education productivity that need to be filled. First, attention has thus far focused mainly on evaluating the productivity levels of universities (among others: Glass et al. 1995; Johnes 2006a, b; Bonaccorsi et al. 2007). However, such efficiency scores, apart from providing a tool for comparing productivity between units (and, hence, serving as yet another university ranking system), say nothing about changes in productivity across time (i.e. whether universities manage to improve their performance, stagnate or regress). Second, due to problems commonly associated with gathering comparable data for HEIs from multiple countries, such exercises (assessments of productivity changes over time) have usually been conducted with units from only one country (or, exceptionally, as in Agasisti and Johnes 2009 or Agasisti and Pérez-Esparrells 2010, two countries). In fact, Agasisti and Pérez-Esparrells (2010) state: ‘Future research can extend this study. For instance, a wider comparison among universities from different European countries could be useful for policy purposes’ (p. 102). From the policy perspective, a comparative cross-European analysis of HEI productivity is of major importance, especially in the light of the integration of European higher education systems under the Bologna process.

Finally, certain higher education studies assessing productivity changes over time (Flegg et al. 2004; Johnes 2008; Worthington and Lee 2008; Agasisti and Johnes 2009; Agasisti and Pérez-Esparrells 2010) have adopted techniques based on Malmquist indices that have not been statistically verified. In other words, these authors simply state that productivity (efficiency, technology—if the indices are decomposed) in selected HEIs has increased or decreased, but no formal tool has been applied to check whether the estimates are sensitive to random variations in the data. Traditional Malmquist methodology, based on estimations of distance measures made through data envelopment analysis (DEA), a non-stochastic procedure, does not provide any insight into the statistical significance of its results. There are, however, tools based on resampling (bootstrap) methods that allow us to correct this weakness.

Hence, the main limitations of the existing literature are that (i) little is known about productivity changes across universities from several countries analyzed within a common methodological framework, and (ii) methodological issues concerning the significance of the results obtained with Malmquist indices have not been appropriately addressed.

A very particular feature of our dataset is its panel dimension, which allows us to go beyond studies that only compare efficiency scores across units of higher education (usually from just one country). We have managed to gather comparable statistics concerning the inputs and outputs of 266 public HEIs from seven European countries (namely: Austria, Finland, Germany, Italy, Poland, the United Kingdom, and Switzerland) over the time period 2001–2005. Moreover, the bootstrap estimation procedure adopted (Simar and Wilson 1999) corrects the basic and possibly biased information given by Malmquist indices of productivity, providing us with confidence intervals for these indices and their components. As a result, we have a tool to verify whether changes in the productivity of European HEIs, as indicated by Malmquist indices, are significant in a statistical sense (i.e. whether the result indicates a real change in the productivity of a given HEI or is just an outcome of sampling noise). Thus, by focusing on the two limits described above and using an original and vast set of microdata on HEIs from several European countries in conjunction with a consistent bootstrap methodology, this study presents an important extension of the existing literature.

The remainder of the paper is organized as follows: we devote Sect. 2 to a presentation of the methodology applied in our analysis (in particular, describing ways of assessing the statistical significance of Malmquist indices) and a concise description of the studies most closely related to our research. In Sect. 3, we first present our data and then show the statistically significant results of a cross-European assessment of productivity, efficiency and technology changes in 266 HEIs. Conclusions follow.

2 Theoretical and empirical background

2.1 Changes in productivity across time—Malmquist indices and their statistical significance

Higher education institutions are not classical firms whose aim is profit maximization; public HEIs, in particular, are by definition non-profit organizations. Hence, we cannot assess their productivity by using the methods typically applied to the evaluation of companies producing goods or services and generating profit. Moreover, the functioning of HEIs is characterized by interplay between multiple inputs and outputs. Universities use such inputs as human resources (staff), students and financial resources and ‘produce’ at least two outputs, reflecting both their teaching and research missions. Consequently, analysis of HEI productivity dynamics must take these features into account. Tools based on DEA have proven very useful here, as they capture multiple inputs and outputs at the same time and allow a non-parametric treatment of efficiency frontiers. We focus on changes in the productivity of European public HEIs, where productivity is understood not in absolute terms, but as performance relative to the efficiency of technologies (represented by a frontier function). The aim is not to identify levels of productivity as previous studies have (e.g. Glass et al. 1995; Johnes 2006a, b), but to study the dynamics of productivity. Thus, below, we do not focus on the formal derivation of DEA relative productivity scores, but instead present the methods applied to assess changes in productivity in the higher education sector.

To measure productivity change between two periods of time, we adopt the output-based Malmquist index of productivity developed by Färe et al. (1992, 1994, 1997), itself drawn from the measurement of efficiency in Farrell (1957) and of productivity in Caves et al. (1982). The output-oriented model aims to maximize outputs while using no more than the observed input quantities. Hence, the question to be answered is: by how much can output quantities be proportionally augmented without changing input quantities? In the context of HEI efficiency, output-oriented models are usually preferred because the quantity and quality of inputs, such as student entrants, are assumed to be fixed exogenously: universities can hardly influence their number or characteristics, at least in the short term. We compute Malmquist indices based on DEA scores, allowing us to measure the total factor productivity (TFP) change of single HEIs between two data points:

$$ TFP_{i,(t,t+1)} = m_{i,(t,t+1)} = m_{i}(x_{t+1}, y_{t+1}; x_{t}, y_{t}) = \left[ \frac{d_{i}^{t}(x_{t+1}, y_{t+1})}{d_{i}^{t}(x_{t}, y_{t})} \times \frac{d_{i}^{t+1}(x_{t+1}, y_{t+1})}{d_{i}^{t+1}(x_{t}, y_{t})} \right]^{1/2}, $$
(1)

where \( i = 1, \ldots, N \) denotes the DMU (in our case, HEI) being evaluated, x refers to inputs and y to outputs, and m is the productivity of the most recent production point \( (x_{t+1}, y_{t+1}) \), using period t + 1 technology, relative to the earlier production point \( (x_{t}, y_{t}) \), using period t technology. Output distance functions are denoted as d. Given the output orientation, a value of \( m_{i,(t,t+1)} \) greater than one indicates positive TFP growth in HEI i from period t to period t + 1, while a value smaller than one indicates TFP decline. For example, \( m_{i,(2002,2003)} = 1.14 \) would signify an improvement in the TFP of HEI i between the years 2002 and 2003 of 14 %. If \( m_{i,(t,t+1)} \) equals unity, then no change in the TFP of HEI i was observed between the two data points.
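For concreteness, the output distance functions d entering Eq. (1) can be estimated by solving one linear program per HEI and period. The following sketch is our own illustration, not the authors' code: it assumes a constant-returns-to-scale, output-oriented technology and uses scipy's linear programming routine; the function name and array layout are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def output_distance(x_eval, y_eval, X_ref, Y_ref):
    """Shephard output distance function d(x, y) under CRS, estimated
    by DEA against the technology spanned by the reference sample.

    X_ref : (n_dmu, n_inputs) array of reference inputs
    Y_ref : (n_dmu, n_outputs) array of reference outputs
    Returns d = 1 / theta, where theta is the largest feasible
    proportional expansion of y_eval achievable with inputs x_eval.
    """
    n, k = X_ref.shape
    s = Y_ref.shape[1]
    # Decision variables: [theta, lambda_1, ..., lambda_n]; maximize theta.
    c = np.concatenate(([-1.0], np.zeros(n)))
    # Output constraints: theta * y_eval <= Y_ref' lambda (componentwise)
    A_out = np.hstack((y_eval.reshape(-1, 1), -Y_ref.T))
    # Input constraints:  X_ref' lambda <= x_eval        (componentwise)
    A_in = np.hstack((np.zeros((k, 1)), X_ref.T))
    res = linprog(c,
                  A_ub=np.vstack((A_out, A_in)),
                  b_ub=np.concatenate((np.zeros(s), x_eval)),
                  bounds=[(0, None)] * (n + 1),
                  method="highs")
    return 1.0 / (-res.fun)
```

For a production point inside the period's technology, the optimal expansion factor is at least 1, so d ≤ 1; cross-period evaluations (e.g. period t + 1 data against the period t frontier) may yield d > 1.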

In order to distinguish between two basic mechanisms provoking TFP growth, we adopt the Malmquist decomposition proposed by Färe et al. (1992):

$$ m_{i,(t,t+1)} = \underbrace{\frac{d_{i}^{t+1}(x_{t+1}, y_{t+1})}{d_{i}^{t}(x_{t}, y_{t})}}_{technical\; efficiency\; change\; (\varepsilon_{i})} \; \underbrace{\left[ \frac{d_{i}^{t}(x_{t+1}, y_{t+1})}{d_{i}^{t+1}(x_{t+1}, y_{t+1})} \times \frac{d_{i}^{t}(x_{t}, y_{t})}{d_{i}^{t+1}(x_{t}, y_{t})} \right]^{1/2}}_{technological\; change\; (\tau_{i})} $$
(2)

where technical efficiency change (ε) reflects changes in the relative efficiency of unit i (e.g. universities getting closer to or further away from the efficiency frontier), while technological change (τ) measures the shift in the production frontier itself and reflects effects that concern the higher education system as a whole. Values of \( \varepsilon_{i,(t,t+1)} \) greater (lower) than unity indicate improvements (decreases) in technical efficiency between t and t + 1. Similarly, values of \( \tau_{i,(t,t+1)} \) greater (lower) than unity indicate technological progress (regress) between t and t + 1. The value of m will equal 1 if the net effect of efficiency and frontier changes is null.
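Putting Eqs. (1) and (2) together, a short sketch (again our own hypothetical illustration, building on the output_distance helper above) shows how the four distance functions combine into m and its two components, with m = ε × τ holding by construction:

```python
import numpy as np

def malmquist(x0, y0, x1, y1, X_t, Y_t, X_t1, Y_t1):
    """Output-based Malmquist index between t and t+1 (Eq. 1) and its
    Fare et al. (1992) decomposition (Eq. 2), for one DMU with data
    (x0, y0) in period t and (x1, y1) in period t+1.  X_t/Y_t and
    X_t1/Y_t1 are the full samples defining the two frontiers."""
    d_t_0  = output_distance(x0, y0, X_t,  Y_t)    # d_i^t(x_t, y_t)
    d_t_1  = output_distance(x1, y1, X_t,  Y_t)    # d_i^t(x_{t+1}, y_{t+1})
    d_t1_0 = output_distance(x0, y0, X_t1, Y_t1)   # d_i^{t+1}(x_t, y_t)
    d_t1_1 = output_distance(x1, y1, X_t1, Y_t1)   # d_i^{t+1}(x_{t+1}, y_{t+1})

    m    = np.sqrt((d_t_1 / d_t_0) * (d_t1_1 / d_t1_0))   # Eq. (1)
    eff  = d_t1_1 / d_t_0                                 # epsilon, Eq. (2)
    tech = np.sqrt((d_t_1 / d_t1_1) * (d_t_0 / d_t1_0))   # tau, Eq. (2)
    # By construction m == eff * tech (up to floating-point error).
    return m, eff, tech
```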

The problem with the approach described above is that the frontier needed for the calculation of distance functions is estimated from the data, so the resulting changes in m may simply be the result of sampling noise. Hence, we adopt a particular way of measuring productivity changes: we follow a bootstrap procedure to obtain bias-corrected estimates of Malmquist indices (and their components, as in Eq. 2) and their confidence intervals (Simar and Wilson 1999). This procedure is based on bootstrap DEA analysis (relying on replication of the data-generating process) and allows us to: (i) verify whether correction for the bias in non-parametric distance function estimates (and thus in Malmquist index estimates) is desirable, and (ii) check whether the changes in productivity indicated by Malmquist indices and their components are statistically significant.

In line with Simar and Wilson (1999), we first compute a set of bootstrap estimates of the Malmquist index for each HEI i: \( \{ \hat{m}(b)_{i,(t,t+1)} \} \) for \( b = 1, \ldots, B \) (where B is the total number of replications performed with pseudosamples drawn from the ‘original’ dataset). Then, the bootstrap bias estimate for the ‘original’ (non-bootstrapped) estimator is calculated as:

$$ bias(\hat{m}_{i,(t,t+1)}) = \frac{1}{B}\sum\limits_{b=1}^{B} \hat{m}(b)_{i,(t,t+1)} - \hat{m}_{i,(t,t+1)}, $$
(3)

where the first component is the average value of the bootstrap estimates of the Malmquist index, \( Avg[\hat{m}(b)] = 1/B\sum\nolimits_{b = 1}^{B} {\hat{m}(b)_{i,(t,t + 1)} } \), and the second component, \( \hat{m}_{i,(t,t + 1)} \), denotes the ‘original’ (non-bootstrapped) estimate of the Malmquist index (as in Eq. 1). In the next step, for each HEI i, we compute a bias-corrected estimate of the Malmquist index, \( \hat{m}\_corr_{i,(t,t + 1)} \), defined as the difference between \( \hat{m}_{i,(t,t + 1)} \) and \( bias(\hat{m}_{i,(t,t + 1)}) \), which, using (3), can be expressed as:

$$ \hat{m}\_corr_{i,(t,t + 1)} = 2\hat{m}_{i,(t.t + 1)} - 1/B\sum\limits_{b = 1}^{B} {\hat{m}(b)_{i,(t,t + 1)} } $$
(4)

The choice between the ‘original’ estimate of the Malmquist index and its bias-corrected version is based on a comparison of the mean square errors (MSEs) of the two indices, as it is plausible that the latter may have a higher MSE (Efron and Tibshirani 1993).
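As an illustration of Eqs. (3) and (4), the bias correction reduces to simple arithmetic on the vector of bootstrap replicates; the threshold used below for the MSE trade-off is a common rule of thumb and an assumption on our part, not necessarily the paper's exact criterion:

```python
import numpy as np

def bias_corrected_malmquist(m_hat, m_boot):
    """Bootstrap bias correction for one Malmquist estimate.

    m_hat  : float -- 'original' DEA-based estimate (Eq. 1)
    m_boot : (B,) array -- bootstrap replicates m_hat(b)
    """
    bias = m_boot.mean() - m_hat              # Eq. (3)
    m_corr = 2.0 * m_hat - m_boot.mean()      # Eq. (4), i.e. m_hat - bias
    # Whether m_corr should replace m_hat is decided by an MSE
    # comparison; a common heuristic (cf. Efron and Tibshirani 1993)
    # keeps m_hat unless the squared bias clearly dominates the
    # bootstrap variance.
    prefer_corrected = bias**2 > m_boot.var(ddof=1) / 3.0  # rule of thumb
    return m_corr if prefer_corrected else m_hat
```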

Finally, in order to assess whether productivity change is meaningful in the statistical sense, the (1 − α) confidence interval is obtained with the bootstrapping procedure as:

$$ \hat{m}_{i,(t,t+1)} + l\_\hat{m}_{\alpha}(b) \le m_{i,(t,t+1)} \le \hat{m}_{i,(t,t+1)} + u\_\hat{m}_{\alpha}(b). $$
(5)

The estimates \( {l}\_{\hat{m}}_{\alpha} \) and \( {u}\_{\hat{m}}_{\alpha} \) define, respectively, the lower and upper bootstrap bounds of the confidence interval for the Malmquist index, and α (e.g. 10, 5 or 1 %) sets the size of the interval. Following Simar and Wilson (1999), the estimated Malmquist index is said to be significantly different from unity (and so the productivity change is statistically significant) if the interval defined in Eq. 5 does not include unity.
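A minimal sketch of the significance check implied by Eq. (5), assuming a percentile-style construction of the offsets \( {l}\_{\hat{m}}_{\alpha} \) and \( {u}\_{\hat{m}}_{\alpha} \) from the bootstrap deviations (Simar and Wilson's exact quantile conventions differ in detail):

```python
import numpy as np

def significant_change(m_hat, m_boot, alpha=0.05):
    """(1 - alpha) bootstrap confidence interval for the Malmquist
    index, in the additive form of Eq. (5), plus a significance flag:
    the change is significant if the interval excludes unity."""
    deviations = m_boot - m_hat
    l = np.quantile(deviations, alpha / 2.0)        # lower offset
    u = np.quantile(deviations, 1.0 - alpha / 2.0)  # upper offset
    lo, hi = m_hat + l, m_hat + u
    return not (lo <= 1.0 <= hi), (lo, hi)
```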

An analogous approach applies for all the components of the Malmquist index (ε and τ), so that we also obtain bias-corrected estimates of ε and τ: \( \hat{\varepsilon }\_corr_{i,(t,t + 1)} \) and \( \hat{\tau }\_corr_{i,(t,t + 1)} \), as well as confidence intervals for ε and τ, allowing us to verify their statistical significance.

2.2 Related empirical evidence in the context of higher education

So far, probably due to problems with obtaining multi-period micro-level data on the performance of single universities, few authors have applied Malmquist indices to HEIs, usually preferring to focus on institutions from one country. A multi-country setting demands the computation of an index that requires the same set of inputs and outputs for all HEIs from the sample and, additionally, the presence of the same units and variables across time; unbalanced panels with changing sets of HEIs or inputs/outputs are not allowed.

Flegg et al. (2004) apply the Malmquist approach to a sample of 45 British universities for the period 1980/1981–1992/1993. Their results show that in these years TFP increased by 51.5 %, but that most of this rise was caused by an outward shift of the efficiency frontier (technological change) and not by the movement of universities towards the frontier (efficiency change). Johnes (2008) derives Malmquist indices for 112 English HEIs over the period 1996/1997–2004/2005 and finds an average increase in TFP of around 1 % per year (decomposition shows that average annual technological change was approximately 6 %, but that efficiency decreased by 5 % per year). Worthington and Lee (2008) analyze 35 Australian universities (1998–2003) and find average productivity growth of circa 3 %, largely due to technological progress rather than technical efficiency change. All in all, the existing evidence, based on British and Australian experience, suggests a predominant role for technological change, rather than efficiency change, in driving overall TFP growth in HEIs.

It should be noted, however, that the HEIs from the countries analyzed so far were characterized by high levels of efficiency (high DEA efficiency scores) at the outset. There are only two published papers (that we are aware of) comparing changes in the productivity and efficiency of HEIs from more than one country: Agasisti and Johnes (2009) and Agasisti and Pérez-Esparrells (2010).

Agasisti and Johnes (2009) employ Malmquist indices to analyze 127 English and 57 Italian public universities over the short period 2002/2003–2004/2005. In line with the findings of the abovementioned authors, their results confirm that English HEIs did not realize gains in technical efficiency, but rather registered changes in productivity that were due to frontier shifts. On the contrary, Italian HEIs—typically less efficient at the outset than English ones—became more technically efficient with respect to the frontier. This is an important result, suggesting that HEIs from countries further away from the common ‘European’ higher education efficiency frontier can experience ‘catching-up’ effects, while those which are already highly efficient move the frontier itself up.

Agasisti and Pérez-Esparrells (2010) adopt a similar setting, computing (apart from DEA scores) Malmquist indices for 57 Italian and 46 Spanish public institutions, again for a relatively short time span (the academic years 2000/2001 and 2004/2005). They find that Italian universities experienced important improvements in productivity, mainly due to improvements in ‘technology’ (the authors argue that the change resulted from important reforms in the curriculum organization of the Italian system of higher education), while Spanish universities registered much lower improvements in overall productivity, driven by changes in efficiency.

However, despite the great advantages of cross-country evidence, none of these papers assess the statistical significance of their results. Consequently, we cannot exclude the possibility that their findings are distorted by sampling noise.

Bootstrapped DEA techniques have been used in economic analyses of productivity levels in many different sectors, including higher education (e.g. Johnes 2006a, b). By contrast, the application of bootstrapped Malmquist methods to the analysis of productivity change has in general been less frequent, and, to the best of our knowledge, no paper has used a consistent bootstrap methodology for the computation of Malmquist indices in the context of the higher education sector. Hence, the ‘original’ estimates of the distance functions and Malmquist indices of the universities analyzed so far have not been corrected for finite-sample bias, and their main remaining weakness is that their statistical significance is unknown. In this paper, we address these issues.

3 Empirical evidence on productivity changes in European HEIs

3.1 Data and panel composition

Our analysis draws on a university-level database containing information on the outputs and inputs of 266 public HEIs from a set of European Union (Austria, Finland, Germany, Italy, Poland and the UK) and non-EU (Switzerland) countries for which it was possible to gather comparable micro data. We draw on a balanced panel containing statistics on single European HEIs for the years 2001–2005. Even though the data come from numerous sources, particular attention has been given to ensuring the maximum level of comparability of the crucial variables across countries in accordance with the Frascati Manual (OECD 2002); for details, see the data appendix (Table 6).

Table 7 in the Appendix reports the number of HEIs from each country (due to space limits, a detailed list of all the universities covered by our study is not reported here, but it is available upon request). To the best of our knowledge, this is the most comprehensive balanced panel micro dataset on European HEIs from several countries that has been used for Malmquist analysis of productivity change. Moreover, advanced analysis of productivity trends in universities from the new EU member states has so far been lacking; along with universities from six western European countries, we therefore also included HEIs from Poland in our analysis.

Our dataset only contains public HEIs, because several statistics, crucially those concerning funding, are often not available for private HEIs. Additionally, we decided to concentrate only on the university sector: in countries with binary higher education systems, we excluded from our sample applied science institutes/schools (such as the German and Austrian Fachhochschulen and the applied science HEIs in Finland and Switzerland), which are only marginally involved in research. Moreover, we also excluded from our analysis special-purpose units specializing in one discipline only (e.g. medicine, arts, sports) and distance learning universities, as these were not considered comparable with ‘traditional’ universities. Finally, units whose publication records (used as a measure of one of the outputs) were scant, incomplete or identified via ambiguous affiliations were not taken into consideration.

The calculation of Malmquist indices required the estimation of distance functions. We first used a bootstrapped DEA method based on annual observations of 266 European HEIs, with two outputs produced from three inputs. Given the double mission of HEIs (teaching and research), as outputs we considered teaching output (measured in terms of graduates) and research output (quantified by means of bibliometric indicators and based on an analysis of publication records, as in, among others, Creamer 1999; Dundar and Lewis 1998). While comparison of the number of graduates (total, without distinguishing between various types of studies) across HEIs was quite straightforward, a challenge was posed by the necessary cross-country comparability of research outputs. Different countries adopt specific measures of research production (such as research funds, publication records, patents and applications). We therefore relied on the uniform bibliometric data from Thomson Reuters’ ISI Web of Science database (a part of the ISI Web of Knowledge), which lists publications from quality journals (with a positive impact factor) in the majority of scientific fields. We counted all publications (scientific articles, proceedings papers, meeting abstracts, reviews, letters, notes) published in a given year, with the requirement that at least one author declared an institutional affiliation with the HEI in question.

Concerning input measures, our dataset contained information on numbers of students, total academic staff and total real revenues. Revenues were converted from national currency units into Euro PPS (using exchange rates from Eurostat) to account for cross-country differences in price levels and in the purchasing power of the funds at HEIs’ disposal.

As for data sources (Table 8 in the Appendix), the availability and coverage of university-level data differed from country to country. The most comprehensive databases concerning HEIs exist in Finland, the UK and Italy, with freely-available online platforms giving access to a broad range of non-confidential statistics. For Swiss, Austrian and German HEIs, data was kindly provided by the staff of each country’s central statistical office. In the case of Poland, unfortunately, micro data on HEIs (even public ones) is practically unavailable for research purposes: there is no online platform containing such data, and only a few statistics are published in paper versions of publications issued by the Polish Ministry of Science and Higher Education (MNiSW) and the Polish Central Statistical Office (GUS); some of the data used were obtained through direct contact with the statistical offices possessing them.

Our benchmark Malmquist analysis is based on DEA performed with three inputs and two outputs, where DMUs are compared with respect to a common European frontier. As a robustness check, we consider alternative formulations of the DEA specification: a two-input, two-output version of the DEA model (without students as an input) and estimates based on the use of average values of inputs and outputs. Finally, to check for cross-country heterogeneity, we perform an additional analysis in which country-specific frontiers are estimated and productivity change is measured with respect to units from the same country.

3.2 Malmquist indices: results for European HEIs

3.2.1 Benchmark results

In the benchmark estimation, we considered productivity change with respect to a common frontier: all 266 HEIs were treated jointly, and the frontier was estimated using annual information on the whole sample of European universities. Consequently, changes in productivity were relative to the European efficiency frontier in public higher education (relative in the sense that they were computed with reference to the other universities in the group). Later on, we take into account cross-country specificity (see Sect. 3.3).

We first calculated ‘original’ (not bootstrapped) estimates of Malmquist indices (and their components). Then, we applied the bootstrap method described above (maintaining the assumption of constant returns to scale and output orientation), setting the number of bootstrap replications B = 2,000. We compared the MSEs of bias-corrected and ‘original’ (non-bootstrapped) estimates of Malmquist indices, finding that in the vast majority of cases, bias correction increased MSE (for details, see Table 9 in the Appendix). Simar and Wilson (1999) obtained analogous results. Consequently, and like the aforementioned authors, we do not report bias-corrected estimates, but rely on ‘original’ estimates of m (TFP), ε and τ that are based on decomposition (2): \( \hat{m} \), \( \hat{\varepsilon } \) and \( \hat{\tau } \). In Table 10 we show summary statistics of the variables used in the DEA model, while summary statistics of both the ‘original’ and bias-corrected estimates of the indices are reported in Table 11 in the Appendix, where it can be seen that the difference between the two is negligible (the coefficients of correlation between the ‘original’ and bias-corrected series are between 0.97 for τ and 0.99 for m). However, we do refer to the estimated bootstrap confidence intervals to assess whether changes in productivity, efficiency and technology are meaningful in a statistical sense. The full set of results for all HEIs is obtainable upon request; here, we present the key findings.
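Before turning to the results, a sketch of the per-HEI bookkeeping on synthetic data, reusing the helper functions sketched in Sect. 2.1 (all numbers, sample sizes and the naive resampling scheme below are stand-ins; the paper uses 266 HEIs, B = 2,000 and the Simar–Wilson smoothed bootstrap):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the real panel: 20 DMUs, 3 inputs, 2 outputs,
# one annual transition.
X_t  = rng.uniform(1.0, 10.0, (20, 3))
Y_t  = rng.uniform(1.0, 10.0, (20, 2))
X_t1 = X_t * 1.02
Y_t1 = Y_t * rng.uniform(1.0, 1.1, (20, 2))

B, n_sig_growth = 200, 0
for i in range(20):
    m_hat, _, _ = malmquist(X_t[i], Y_t[i], X_t1[i], Y_t1[i],
                            X_t, Y_t, X_t1, Y_t1)
    # Naive resampling of the reference sets -- a simplified stand-in
    # for the smoothed bootstrap of Simar and Wilson (1999).
    m_boot = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, 20, 20)
        m_boot[b], _, _ = malmquist(X_t[i], Y_t[i], X_t1[i], Y_t1[i],
                                    X_t[idx], Y_t[idx],
                                    X_t1[idx], Y_t1[idx])
    sig, _ = significant_change(m_hat, m_boot)
    n_sig_growth += int(sig and m_hat > 1.0)

print(f"share with significant TFP growth: {n_sig_growth / 20:.0%}")
```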

In Table 1, we compare all the results (N = 1,064) with the statistically significant ones (at a significance level of 5 %). In particular, we show the number of cases in our panel in which estimates of Malmquist indices were significantly different from unity, \( N(\hat{m}**) \), and their average value (\( \bar{\hat{m}}** \)), comparing them with the average value of all the indices (\( \bar{\hat{m}} \)). Finally, we report the number of cases with statistically significant increases in TFP, \( N(\hat{m}** > 1) \), and the percentage of cases in which statistically significant annual improvements in productivity were registered. The same exercise has been done with estimates of ε and τ.

Table 1 Benchmark results—trends in productivity (m), efficiency (ε) and technology (τ) in 266 European HEIs based on annual changes for the 2001–2005 period, CRS

The calculation of confidence intervals shows that, at a standard 5 % level of significance, 963 of the 1,064 annual estimates of TFP growth between the years 2001 and 2005 were statistically different from unity. Thus, statistically significant changes in productivity were registered in 90 % of the HEI-year observations in our sample. Taking into account only statistically significant estimates of m, between the years 2001 and 2005 the HEIs in our sample registered, on average, an increase in productivity of around 4.5 % annually (the average value of all Malmquist indices, significant and not, equals 4.1 %). Counting cases in which m was significant and greater than one, denoted in Table 2 as \( \% (\hat{m}** > 1) \), we can conclude that statistically significant annual improvements in overall productivity took place in 56 % of cases.

Table 2 Comparison of benchmark and alternative estimates of productivity (m), efficiency (ε) and technology (τ) change in 266 European HEIs (based on annual changes for the 2001–2005 period), CRS

Comparing the statistically significant estimates of ε and τ, the two basic components of m, average efficiency improved by 5.7 %, while technology shifted up by 4.6 %. If we considered all the estimates, these values would be lower (3.2 % and 1.2 %, respectively). Hence, accounting for statistical significance matters for the conclusions drawn. Looking at the number of cases with significant improvements in efficiency and technology, it is evident that efficiency change ε (a ‘catching-up effect’ towards or away from the frontier) was more common than technological change τ (a ‘frontier-shift effect’). From Table 1, it emerges that of the 1,064 observations analyzed for the period 2001–2005, efficiency change was significantly higher than unity in 32 % of cases (so approximately one-third of HEIs managed to catch up towards the European efficiency frontier), while significant technological improvement took place in only 17 % of HEIs.

3.2.2 Robustness checks and extensions of the basic model

In order to check the robustness of our findings, we ask whether the way the productivity frontier was defined in the DEA estimation matters to the conclusions drawn, so we consider alternative DEA model formulations with modified sets of inputs and outputs.

Firstly, we consider a DEA model with a restricted set of two inputs (total staff, total revenues) and two outputs (teaching output—graduates, and research output—publications). Such a formulation addresses the difficulty of modelling the students–graduates productivity relationship and corrects for any correlation between students and other inputs (such as teaching staff and funding).

Secondly, we perform a Malmquist analysis based on a DEA model with input and output data expressed as time averages. Such an exercise permits us to correct for random time variation in the data, as well as for a possible relationship between past inputs and present outputs. We consider a DEA model with three inputs and two outputs, as in the benchmark estimation, but based on 2-year moving averages of all inputs and outputs (t1 = 2001–2002, t2 = 2002–2003, t3 = 2003–2004, t4 = 2004–2005). We then obtain Malmquist indices based on this averaged data, which reflect productivity changes between the periods t1/t2, t2/t3, and t3/t4.
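A small sketch of this smoothing step, assuming a tidy panel with one row per HEI and year (the column names and toy numbers are hypothetical):

```python
import pandas as pd

# Toy panel: one row per (hei, year); the real data hold three inputs
# and two outputs for 2001-2005.
panel = pd.DataFrame({
    "hei":   ["A"] * 5 + ["B"] * 5,
    "year":  list(range(2001, 2006)) * 2,
    "staff": [100, 104, 103, 108, 110, 50, 51, 55, 54, 56],
})

# 2-year moving averages within each HEI: the first smoothed point is
# t1 = 2001-2002, the last t4 = 2004-2005.
smoothed = (panel.sort_values(["hei", "year"])
                 .set_index("year")
                 .groupby("hei")["staff"]
                 .rolling(window=2).mean()
                 .dropna())
```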

The results concerning TFP growth in European HEIs obtained with the alternative DEA formulations are very similar to the benchmark ones (we compare them in Table 2), and the correlations between the estimates obtained with the different models are fairly high. The estimated annual TFP change indicated by the Malmquist index deviates from the benchmark result (4 %) by at most approximately 0.6 percentage points.

Alternatively, as an extension of the basic analysis of annual changes in productivity, efficiency and technology, we apply the Malmquist analysis to only two periods: in this case the DEA model is estimated with 3-year averages (T1 = 2001–2003 and T2 = 2003–2005), so that the Malmquist index obtained can be interpreted as the average productivity change between T1 and T2 and fully corrects for time variation in the original annual data on inputs and outputs. The crucial results based on such averaged data are reported in Table 3 and can be compared with the evidence on annual changes reported in Table 1.

Table 3 Results—trends in productivity (m), efficiency (ε) and technology (τ) in 266 European HEIs, change between T1:2001–2003 and T2:2003–2005 (based on 3-year averages); CRS

On average, productivity in European HEIs rose by approximately 9 % between the initial period T1 and the final period T2 (\( \bar{\hat{m}} = 8.9\;\% \) and \( \bar{\hat{m}}** = 9.6\;\% \))—note that this result is in line with the evidence on annual change reported in Table 1 (where \( \bar{\hat{m}} = 4\;\% \); \( \bar{\hat{m}}** = 4.5\;\% \)), because the input and output values for T1 and T2 are in fact data averaged around 2002 and 2004. Consequently, the estimates of TFP growth obtained with 3-year averages should be approximately twice as large as those obtained with annual data, and this is indeed the case. The only difference is that, over the longer horizon, the proportion of HEIs registering statistically significant improvements in productivity is larger than in the case of annual changes (72 % versus 56 %, respectively).

3.3 Malmquist indices: accounting for cross-country heterogeneity

An important property of our dataset is its panel dimension, which allows us to check for country-specific trends in productivity, efficiency and technology change. In Table 4, we report the average (by country) values of m, ε and τ (all, and only those which are statistically significant) and the percentage of cases with statistically significant annual improvements in productivity, efficiency and technology. In most cases (with the exception of technology change in Poland), accounting for statistical significance only negligibly alters the average values of the estimated indices, so in the interpretation of the results we limit ourselves to the significant ones.

Table 4 Results by country (1): annual changes in productivity (m), efficiency (ε) and technology (τ)—mean values (all and statistically significant) and percentage of cases with statistically significant improvements; CRS
Table 5 Results by country (2)—changes in productivity (m), efficiency (ε) and technology (τ) based on 3-year averages (T1 = 2001–2003 and T2 = 2003–2005) and alternative frontier definitions (E—European frontier; C—country-specific frontier): mean values of all indices; CRS

The average statistically meaningful TFP change indicated by the Malmquist index ranges from 0.98 (a TFP decline of 2 % annually) in Austrian HEIs to 1.09 (TFP growth of 9 % annually) in Switzerland, where the average efficiency change was also the highest (rising by 19 % annually). Only Austrian HEIs registered a decline in average efficiency: by 4 % (\( \bar{\hat{\varepsilon}}** = 0.96 \)).

In all of the countries examined, \( \% (\hat{m}** > 1) > \% (\hat{\varepsilon }** > 1) > \% (\hat{\tau }** > 1) \), so that the percentage of cases with HEIs registering statistically significant improvements in TFP was larger than the percentage of cases with statistically significant efficiency growth, which, in turn, was higher than the percentage of cases with statistically significant positive changes in technology. For instance, 69 % of 216 annual observations on Italian HEIs (54 university units observed across four time periods) registered statistically significant TFP growth; 45 % were characterized by statistically significant improvements in efficiency, but only 9 % showed statistically significant improvements in technology. Among the seven European countries analyzed, Italy had the highest percentage of public HEIs with significant TFP growth and significant efficiency improvement.

The time dimension can also be important when analyzing the productivity changes of HEIs in European countries. We isolated only those HEIs with Malmquist indices statistically significantly different from unity (at the 5 % level)—that is, either higher (statistically significant productivity increases) or lower (statistically significant productivity decreases). For these, we calculated average TFP across the HEIs of each country in each time period. Figure 1 shows the average significant change in TFP in public European HEIs, by country and year (detailed data are reported in Table 12 in the Appendix). It turns out that if we take into account exclusively those universities that really (in a statistical sense) registered a change in productivity, German, Italian and Swiss HEIs on average performed better (registering constant rises in TFP) than HEIs in other countries.

Fig. 1 Average statistically significant changes in total factor productivity (m) of HEIs by country and year. Note: Results based on the three-input, two-output model; only values of \( \hat{m} \) statistically different from unity were taken into account in the calculation of averages; 5 % level of statistical significance. Source: Own elaboration

Due to space limits, we are not able to report all the Malmquist indices for every HEI and year analyzed. However, we counted the European universities that registered constant, statistically significant improvements in TFP (thus having Malmquist indices significantly larger than unity in all of the time periods). Of the 266 universities in our sample, this was the case for only 28 units: two HEIs from Finland, eight from Germany, fourteen from Italy, one from Poland, two from Switzerland and one from the UK.

Finally, in order to check whether the definition of the frontier matters for country-specific conclusions, we consider two alternative applications of the DEA model: the first based on the pooled sample (266 HEIs) and thus reflecting the general ‘European frontier’; the second based on separate DEA models for each country (where each HEI was evaluated with respect to units from the same country, e.g. comparing the performance of Italian HEIs with other Italian HEIs). This exercise could be done for five of the seven countries in our sample; in the cases of Austria and Switzerland, the number of decision-making units is not sufficient to estimate the frontier and assure a reasonable level of discrimination. A comparison between the two approaches to frontier definition can be particularly informative when comparing efficiency change and analyzing whether universities were getting closer to (or further away from) the overall ‘European’ efficiency frontier or their national frontiers (influenced by country-specific educational policies etc.). Table 5 summarizes the results by country, based on 3-year averaged data and corresponding to the two alternative definitions of the frontier (E—European frontier and C—country-specific frontier). The values reported correspond to mean TFP, efficiency and technology change in HEIs from single countries between the periods T1: 2001–2003 and T2: 2003–2005.
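In terms of implementation, the two settings differ only in the reference set passed to the DEA step; a hypothetical sketch, reusing the output_distance helper from Sect. 2.1 (the function name and array layout are our own assumptions):

```python
import numpy as np

def frontier_reference(X, Y, country, i, frontier="E"):
    """Pick the DEA reference set for HEI i: the pooled 'European'
    frontier ('E', all units) or the unit's own national frontier
    ('C').  X, Y, country are full-sample arrays."""
    if frontier == "C":
        mask = country == country[i]
        return X[mask], Y[mask]
    return X, Y  # 'E': all 266 HEIs define the frontier

# e.g. distance to the national frontier:
# X_ref, Y_ref = frontier_reference(X, Y, country, i, frontier="C")
# d = output_distance(X[i], Y[i], X_ref, Y_ref)
```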

It turns out that frontier definition is less important for the measurement of overall productivity change indicated by the Malmquist index (which remains the main interest of our analysis) than for its components. The correlation between the series of \( \bar{\hat{m}} \) obtained with the European frontier (E) and that using the country-specific frontier (C) equals 0.99, and their average values are very similar. Italian and German HEIs registered the biggest TFP change between T1 (2001–2003) and T2 (2003–2005) (17 and 11 %, respectively). However, the channels through which productivity changes materialize differ depending on the frontier formulation. This observation leads us to the issue of frontier definition in DEA/Malmquist studies performed on units from different countries.

As far as the common European frontier (E) is concerned, HEIs from all countries registered an average rise in productivity (\( \bar{\hat{m}} > 1 \)), which in the cases of German, Italian and Polish HEIs was mainly due to an increase in their relative efficiency (movement towards the European frontier—a catching-up effect), while in the case of Finnish and British universities productivity growth was achieved more through technological change.

Using the country-specific frontier model (C), universities from all countries again registered average improvements in their productivity. It is notable, however, that this time the rise in TFP was mostly due to shifts of the country-specific frontiers themselves (as indicated by \( \hat{\tau} \), the technology change estimate in the country-specific frontier setting). Only in the case of Polish HEIs do we obtain a value of \( \bar{\hat{\varepsilon}} \) greater than 1 in the country-specific approach, meaning that units from Poland were catching up not only with the European frontier but also with the national one. By contrast, German HEIs on average caught up with the European frontier (\( \bar{\hat{\varepsilon}} = 1.081 \) in the European frontier setting) but moved back from the German efficiency frontier (\( \bar{\hat{\varepsilon}} = 0.98 \) in the country-specific frontier setting). This means that the German higher education frontier was rising more quickly than the overall European one. Similar patterns emerge when analyzing the Italian case. Consequently, the choice of benchmark against which we assess the efficiency performance of universities makes a difference. This is an important result and is included in the guidelines for future research that we propose together with our conclusions.

4 Conclusions and suggestions for future research

Despite increasing pressure on public universities to constantly optimize results using limited resources, changes in university productivity have only been marginally analyzed, usually with respect to HEIs from just one or at most two countries. Cross-country multi-period analysis of productivity trends is demanding, as it requires the collection of micro data for the same units over multiple time periods. In the case of universities from several European countries, it has proven to be quite a challenging, albeit feasible, piece of research.

Our paper contributes to the existing literature by presenting productivity changes (along with efficiency and technology trends) in 266 public HEIs from seven European countries for the years 2001–2005 (analyzed mainly annually, but also in terms of time averages). Moreover, we have applied important methodological improvements, providing consistent estimates of Malmquist indices, along with their confidence intervals, based on a bootstrap method. Consequently, our conclusions are based on statistically significant results that do not suffer from sampling noise and, hence, are statistically robust.

These robust results indicate that, of the 1,064 annual estimates of TFP growth in the European HEIs analysed, 963 (90 %) were statistically different from unity (at a standard 5 % level of significance), so the majority of HEIs registered statistically significant changes in productivity. Between the years 2001 and 2005, HEIs in our sample registered an average increase in productivity of around 4.5 % per year, and efficiency change predominated over technology improvements. However, the methodology adopted permits us to state that only approximately half of the cases were characterized by statistically significant annual improvements in overall productivity. In the other cases, either the TFP of HEIs declined or their Malmquist indices were not significantly different from unity (neither improvement nor regress).

Our study has benefited from the advantage of being based on panel data, with information on the productivity performance of universities from several European countries. Consequently, we have thoroughly analyzed cross-country variation in productivity changes that are typical for universities from different systems of higher education. The average TFP index ranged from 0.98 (TFP decline of 2 %, annually) in Austria to 1.09 (TFP growth of 9 %, annually) in Switzerland. There is also much inter-country variation in the proportion of universities that registered statistically significant improvements in productivity. For instance, two-thirds of Italian HEIs registered statistically significant TFP improvements (the best score across the seven countries), while this was typical for less than half (46 %) of British universities.

With regard to the time dimension, we have been able to check which universities registered constant TFP growth in every annual transition between 2001 and 2005. On average, German, Italian and Swiss HEIs, whose TFP rose consistently, performed better than HEIs from the other countries. Looking at single university units, in our basic analysis evaluating HEIs vis-à-vis a common European frontier of productivity, we found that only 28 European universities (out of 266) registered statistically significant improvements in productivity in all of the years between 2001 and 2005.

We have extended our analysis by comparing the results obtained with alternative datasets, by changing the set of inputs and outputs in the DEA estimation and by employing alternative definitions of the productivity frontier (‘European’ and ‘country-specific’). Our basic finding of approximately 4 % annual productivity growth in European HEIs is robust to changes in the formulation of the DEA model used for the Malmquist index calculations. Frontier definition is not so important for the measurement of overall productivity change (the Malmquist index remains fairly stable), but it proves relevant when comparing efficiency and technology developments. A joint treatment of universities with respect to a common productivity frontier is appropriate if the researcher is interested, as we were, in comparing HEIs as units competing jointly within the European system of higher education. Assessing HEIs against other units from the same country tells us more about the movement of national frontiers of higher education. Consequently, through alternative frontier measurement we demonstrate that, depending on the research question formulated at the outset, the heterogeneity of higher education systems across countries may need to be taken into account.