1 Introduction

Universities, as a particular type of organization within the structure of the Higher Education System, have become significantly relevant in explaining some of the causes of the development achieved in different countries over the last years. The literature has focused on the efficiency levels of Higher Education Institutions (HEI), as well as on the main factors that might account for the calculated efficiency rates. In this regard, DEA has been one of the most widely used techniques to estimate efficiency and explain the influence of each input and output in obtaining the coefficient (Liu et al. 2013).

One of the main reasons that explain the use of a nonparametric methodology in the analysis of efficiency in higher education institutions is related to the possibility of working with multiple outputs and multiple inputs simultaneously, in conjunction with the parametric methodologies traditionally employed in the study of efficiency (see Emrouznejad et al. 2008; Aristovnik and Obadić 2014). Additionally, the characteristics of the university system naturally lead to rely on a model that considers more than one output, since, apart from providing with higher education degrees to graduates, universities play an important role in scientific production, not only through the integration of technology in the productive sector, but also through scientific output itself (papers, books, patents, etc.).

Efficiency measurement in higher education institutions in Argentina has also gained interest over the last few years, especially due to the major expansion of the university system.Footnote 1 This growth has been substantiated by two phenomena: the creation of new public universities (9 new National Universities in the nineties and 11 in the period 2002–2010) and, additionally, the budget increase allocated to education by the National State (a budget equivalent to 6% of Gross Domestic Product must be assigned to education according to the educational financing law since 2005).

In this regard, the public resources that governments allocate to universities have been a subject of debate both within universities and outside of them. This is especially interesting in the case of Argentina, due to the idiosyncratic organizational characteristics of the higher education system: institutional autonomy, budget autarky, unrestricted admission with tuition fees fully subsidized and poor quality evaluation. Thereby, this institutional autonomy allows universities to focus on teaching or research activities, centred their attention to a particular field (i.e. specialist universities vs. comprehensive universities), establish them own characteristics of professors, and choose organization structure (departmental or by schools) among others. These conditions could make easy to identify the most efficient resource management models within them (Johnes and Yu 2008). Efficiency studies in higher education institutions using nonparametric in the literature, however, are scarce for Latin America in general and for Argentina specifically. This situation is surprising, especially in view of the peculiarities of the university systems in the region (systems with a prevailing participation of the State in the funding mechanisms).

Our attention, then, will focus on determining which factors under the control of universities (such as quality of professors, type of contracts for their faculties, budget allocation, and infrastructure expenditure) can affect significantly the technical efficiency (following Farrell 1957) achieved. For this purpose, we obtain efficiency scores for each university for a panel database and then explain which characteristics of universities can affect more those efficiency levels. Furthermore analysing significance parameters estimation, fixed effects behaviour and scale efficiency (Worthington and Lee 2008), we are allowed to see relationships between them as well as geographic patterns.

This paper has been structured in four sections: firstly, the characteristics of the quantitative methods used in both stages are explained as well as the main bibliographic references. Secondly, the elaboration of the database and the calculation of the main variables are described. The third section presents the results of the empirical application to the case of Argentina and the last section provides an analysis of the results as well as the conclusions.

2 Literature review: DEA and efficiency in education

DEA has been applied in analysing efficiency of universities. There is an abundant literature that has studied the efficiency in public universities recently: in Italy and Spain (Agasisti and Pérez-Esparrells 2009); in university departments in Spain (Giménez García 2004), (Díez de Castro and Díez Martín 2005) and (Martín 2006); in Australian universities (Abbott and Doucouliagos 2003) and (Leitner et al. 2007); in higher education institutions in England (Johnes 2006a); across Polish universities (Nazarko and Šaparauskas 2014); in state universities in Greece (Katharaki and Katharakis 2010 and Kounetas et al. 2011) among others. Flegg et al. (2004) examine the technical efficiency of British universities in the period 1980/1981–1992/1993 using DEA for a panel, but they do not use a complementary second stage, instead they calculate efficiency scores for each year and focus on decomposition of that into pure technical efficiency, congestion efficiency and scale efficiency.

DEA has been applied specifically in measuring efficiency in the Argentinian university system by (Alberto 2005) and in the task of evaluating the technical efficiency of state-run universities in Argentina by (Coria 2011). In both of these works, some of the limitations of the model in selecting the input and output variables are acknowledged, since the different levels of the education system produce other types of social goods and positive external benefits that cannot be quantified or rigorously measured, but have a positive impact in social production.

Both in the Argentinian and in the international literature, a trend towards the use of nonparametric methods—such as DEA—can be observed, while the use of parametric estimation models, traditionally employed in the analysis of efficiency, is quite limited. The parametric approach, as the nonparametric one, assumes the same conditions and technology among production units, but in addition needs of assumptions concerning both random term and inefficiency component, which are not always known. Additionally, it requires strong distributional assumptions for the parameter estimation and it does not allow dealing with simultaneity of inputs and outputs, which is very frequent to find in the models that explain the university productive process.

Some authors (see Laureti et al. 2014) use a stochastic frontier approach to measure efficiency in Italians universities, using the Generalized Maximum Entropy (GME) method to estimate it. Although this methodological approach allows us to overcome the assumptions related to random effects distribution and inefficiency component. However, it only focuses on measuring efficiency of only one output of the university productive process (teaching).Footnote 2 Other recent attempts to overcome these limitations rely on the use of a Network-DEA model that represents the underlying production of higher education research, providing a deeper perspective of both output quality and quantity (see Lee and Worthington 2016). Nevertheless, in this case they embed the efficiency score into another DEA model in the second stage. Similarly (Ibáñez Martín et al. 2017) estimates a stochastic frontier to measure university departments efficiency, but defining this as the student performance, limiting again the study of efficiency to teaching output.Footnote 3

In this sense, it is important to show that although the contributions based on the analysis of the university outputs considered separately provide useful information to identify those variables explaining efficiency levels, they still are limited in terms of their capacity to fully explain differences between universities.

These reasons explain that literature on the assessment of efficiency in higher education institutions has progressively tended to use nonparametric techniques, mainly because of the advantages that this approach provides in terms of the assumptions required for estimation and multidimensionality of the university productive process. Even when some of the variables traditionally considered in DEA for university efficiency assessment can be either considered inputs or outputs, there is no unanimous consensus in the literature about some of them (Rosenmayer 2014). This issue is particularly relevant in the study of any university system, because the proportion of inputs that are used for the production of a particular quantity of outputs it is not always perfectly distinguishable.

Nevertheless, it should be noted that one of the main weaknesses of the nonparametric approach is associated with inexistent test of the parameters of production function obtained (Johnes 2006a) and, therefore, with the distortive effects that possible outliers could have. This particular issue, however, could be addressed with a two-stage estimation strategy that exploits the comparative advantages of both parametric and nonparametric approaches, applying the DEA procedure in the first stage just to measure efficiency scores on a flexible way and then modelling by a parametric function the variability on these scores on a second stage. The use of such a two-stage model has been recurrently employed to measure efficiency levels on other fields, but is not common in the analysis of efficiency in higher education systems.Footnote 4 To the best of our knowledge, the only application in this field is the work by Wolszczak-Derlacz and Parteka (2011), which applied it to explain efficiency on the universities of European Union countries.

In our work, we follow this approach, being one novelty the use of panel data estimators applied to a dataset of 30 Argentinean universities with annual data for the period 2004 to 2013. Different to the study of Wolszczak-Derlacz and Parteka (2011) that bases on a cross section of universities, the use of panel data sets allows for identifying individual heterogeneity of universities, which in turn produces more efficient estimates of the coefficients in the second stage (see Wooldridge 2010). The application of this two-stage strategy combined with panel data can help to identify the characteristics that explain differences in efficiency in public universities, which, in our point of view, provides new elements in studying the higher education system in Argentina and, thus, in improving resource allocation and management policies.

3 Methodology and data

The two-stage procedure applied on the paper consists on applying a parametric modelling on a second stage to the results of DEA on a first stage. In the particular case under study, the first stage of our analysis consists on quantifying technical efficiency levels of each National University on each year, by applying a DEA model oriented to outputs with constant returns along the period 2004–2013 on an annual basis. Secondly, the DEA efficiency score for each National University and each year will be regressed on a set of institutional factors, which might help to explain more thoroughly the determinants of efficiency in each university. The econometric techniques to estimate this parametric regression equation take advantage of the structure of panel of our dataset.

The use of this estimation strategy, which has not been frequently employed in the literature on efficiency, allows for exploiting the advantages of both approaches. The first stage by means of DEA devises a function that considers multiple inputs and outputs not directly related to each other. The parametric approach on the second stage helps to explain the effect of certain variables on the efficiency levels achieved by each university. On this regard, a two-stage model allows for a deeper level of analysis of the causes of efficiency, not only through the internal factors that DEA usually considers, but also through the inclusion of other factors affecting the university production system.

3.1 First stage: recovering efficiency scores by DEA

DEA (Charnes et al. 1978) is a nonparametric technique that builds an envelopment, also known as efficiency frontier or observed production frontier. Those Decision-Making Units (DMUs) that are not on the frontier will be considered as inefficient, and it is possible to evaluate their relative efficiency, i.e. to compare them with the—closest—DMUs of reference in terms of their technology. The key point is to define the empirical production frontier formed by the most productive units, building an efficiency perimeter by segments that envelops the units studied. This allows us to quantify the inefficiency of the observations in the sample as their distance to this frontier. In this way, the efficiency measurement of a unit by means of DEA implies that efficiency is measured in relative (and not absolute) terms, since it lies on the construction of the set of technologically feasible production possibilities and the estimation of the maximum feasible expansion of the product (output) of the DMU within the set of production possibilities. Thus, a DMU will be considered efficient as long as it is not possible to reduce the number of inputs without reducing, by at least one unit, the number of outputs. Likewise, a DMU will be considered efficient when it is not possible to increase the number of outputs without increasing, by at least one unit, the number of inputs.

Within the general framework of DEA, it is possible to distinguish between several specific models depending on their assumptions. These models can be classified into input or output oriented, or depending on the type of performance on scale that characterizes production technology we can distinguish between constant or variable returns to scale (Johnes 2006b). Although not unanimous, the approach commonly followed in the literature on evaluation of efficiency in universities is an output-oriented model.Footnote 5 Some references of output-oriented model can be mentioned: in German universities (Warning 2004), to Italian universities (Guccio et al. 2016), applied to Indian universities (Tyagi et al. 2009), to Turkish universities (Bayraktar et al. 2013), and for Mexican universities (Sagarra et al. 2017). The reason for this decision lies in the little—or nonexistent—flexibility of the inputs generally employed (teachers, budgetary resources, physical space, etc.), in addition to the fact that the management of the volume of such inputs is considered as an exogenous variable, as it depends on decisions in which DMUs have no influence.

The formalization of the output-oriented DEA model of Constant Returns to Scale (CRS) can be presented as follows:

$$\varPsi = \left\{ {\left( {x,y} \right) \in R_{ + }^{S + M} \left| {x\;\;{\text{can}}\;\;{\text{produce}}\;\;y} \right|} \right\}$$

where x represents a vector of M inputs and y represents a vector of S outputs for a set of \(n\) DMUs. The feasible production frontier is determined by Ψ, and all the DMUs included within the frontier are inefficient, while those on the Ψ frontier are considered efficient. The efficiency score (\(h_{j}\)) for \({\text{DMU}}_{j}\) can be represented by the following quotient:

$$h_{j} = \frac{{\mathop \sum \nolimits_{r = 1}^{S} u_{rj} y_{rj} }}{{\mathop \sum \nolimits_{i = 1}^{M} v_{ij} x_{ij} }};\;\;\;\;j = 1, \ldots ,n$$

where \(v_{ij}\) (\(i = 1,2, \ldots ,M\)) and \(u_{rj}\) (\(r = 1,2, \ldots ,S\)) are the weights or weighting factors of the inputs and outputs, respectively, to calculate the weighed sum of \(M\) inputs and \(S\) outputs for the DMUs. The weights for each \({\text{DMU}}_{j}\) can be determined by means of the following mathematical programming problem:

$$\begin{array}{l} h_{0}^{*} = \hbox{max} h_{0} \hfill \\ {\text{subject}}\;{\text{to}}{:} \hfill \\ h_{j} \le 1;\;\;\;\;\;\;j = 1,2, \ldots ,n \hfill \\ v_{ij} ,u_{rj} \ge 0;\;\;\;\;i = 1,2, \ldots ,M; \;\;\;\;\;\;r = 1,2, \ldots ,S \hfill \\ \end{array}$$

where \(h_{0}\) represents the quotient between the weighed amount of outputs and the weighed amount of inputs for the DMU considered (\({\text{DMU}}_{0}\)), which involves solving as many nonlinear programs as DMUs exist. By calculating this model for each unit and each temporal unit, we obtain the \(n\) DEA efficiency indexes (\(h_{jt}^{*}\)) associated with each DMU and year considered, where each one of them will be associated with (M + S) optimal weights. Accordingly, the bigger \(h_{jt}^{*}\), the better the performance of the DMU considered. However, this level will not be higher than the unit, due to the restriction imposed on the mathematical program.

3.2 Second stage: modelling efficiency scores

In the second step of the analysis, the efficiency scores previously calculated by means of DEA are taken as a dependent variable and regressed on a set of factors that could potentially affect it. The general equation to be estimated is

$$h_{jt}^{*} = \mathop \sum \limits_{j = 1}^{n} \alpha_{j} d_{j} + \mathop \sum \limits_{k = 1}^{K} \beta_{k} z_{kjt} + \mathop \sum \limits_{t = 1}^{T} \gamma_{t} {\text{d}}t_{t} + \varepsilon_{jt}$$

where \(z_{kjt}\) represents the institutionalFootnote 6 variable \(k\) that could potentially affect the \({\text{DMU}}_{j}\) efficiency levels for the period \(t\) and \(d_{j}\) and \({\text{d}}t_{t}\) denote individual and time dummy variables, respectively. Likewise, \(\varepsilon_{jt}\) represents the unobserved factors in the equation that affect the efficiency levels of each National University in a given period. The impact on efficiency of the institutional variables \(z_{k}\) is captured by the estimates of the \(\beta_{k}\) parameters, while the estimates of \(\gamma_{t}\) and \(\alpha_{j}\) quantify, respectively the time-effects and the time-invariant unobserved heterogeneity in efficiency between universities.

Our main point of interest are the marginal effects of the institutional variables \(z_{k}\), since they provide useful information for the design of policies in the universities pursuing improvements on the efficiency of these institutions. We acknowledge, however, that some idiosyncratic characteristics of the universities can also affect their efficiency levels and they are not directly contained in time-varying observable regressors—academic reputation or tradition that attracts better professors and students or regional characteristics are just two examples—. Not accounting for these effects can severely condition the estimates of the \(\beta_{k}\) parameters of interest (see Wooldridge 2010). The strategy followed to estimate Eq. (4) takes advantage of the structure of panel of the dataset and we apply a Fixed Effect (FE) estimator to effectively control for the university-specific heterogeneity. The use of an FE estimator lies on the assumption of correlation between the individual time-invariant effects \(\alpha_{j}\) and the regressors \(z_{k}\), which is a sensible assumption in the case under study. An alternative estimator would be the so-called Random Effects (RE) estimator, which assumes no correlation between the regressors and the individual effects. We applied the Hausman test (Hausman 1978) to distinguish between these two alternative estimators, being the RE option rejected under all the specifications considered.

3.3 Data for the first stage

The methodology previously sketched will be applied to the 30Footnote 7 Argentinean National Universities along the period 2004–2013. Yearly data of a set of variables of interest have been obtained from the Statistics Yearbooks of the University Policy Office, an institution that depends on the Argentinean Ministry of Education. The variables taken for estimating the efficiency scores by DEA in the first stage were classified as inputs or outputs of the universities. As indicators of output, and given that universities are expected to produce teaching and research, we have taken the annual number of graduated students as a measure of the teaching output and we have calculated an indicator of scientific production as the measure of research output. More specifically, the scientific production was defined as the total number of scientific papers, books and patents on which some author was affiliated to an Argentinean National University.Footnote 8

As inputs for the DEA in the first stage, we follow previous literature (see Johnes 2006a) and take the number of enrolled students, the budget of the university and the academic staff on each university (professors, teaching assistants and researchers). The number of students is directly observable in the Statistical Yearbooks of the University Policy Office, but the other input variables required for some additional computations.Footnote 9 The budget only contains the resources that central government transfers to each University, excluding those related to personnel expenditures since these are indirectly reflected on the academic staff variable.Footnote 10 The academic staff of each university was calculated as the number of equivalent Full-Time Assistant Professors, which is the status of a junior trained academic.Footnote 11 Table 1 shows the descriptive statistics of the variables used in the first stage for each year. In all cases, the data show great heterogeneity across universities.

Table 1 Descriptive statistics for National Universities (period 2004–2013)

4 Results

4.1 First stage: DEA 2004–20013

Table 2 shows the results of the DEA conducted in the first stage, reporting the annual efficiency scores \(h_{jt}^{*}\) obtained by applying the output-oriented DEA with Variable Returns to Scale (Banker et al. 1984) from 2004 to 2013.Footnote 12 The last three columns of Table 2 report the minimum, maximum and mean values, respectively, of \(h_{jt}^{*}\) across years for each university \(j = 1, \ldots ,30\). These figures reveal that nine universities (Buenos Aires, Formosa, Gral. San Martín, Gral. Sarmiento, La Plata, Lanus, Rosario, Sur and Villa María) were classified as efficient in all the years studied, which can be considered a substantial percentage of the total set. These results are somewhat consistent with those obtained in previous analysis for Argentina by Coria (2011) and Alberto (2005), where despite some minor differences in the least efficient ones, the universities categorized as efficient were the same. The three bottom rows in Table 2 present equivalent statistics for each year \(t = 2004, \ldots ,2013\) across universities, describing the general dynamics of the efficiency of the full set of National Universities along the years under study. Generally speaking, the results suggest a modest but steady positive trend.

Table 2 Efficiency Score for National Universities (period 2004–2013)

Figure 1 complements the information contained on Table 2 by means of a visual representation of the results. In this figure, we plot the mean score \(\bar{h}_{j \cdot }^{*}\) for each university \(j\) bounded by limits calculated as three times the standard error along the years. The horizontal axis displays the universities sorted on an increasing order of their mean scores, classifying the most efficient—on average—as those located in the right section of the figure, while those located in the left are the least efficient universities—on average—. Plotting the dispersion around the means \(\bar{h}_{j \cdot }^{*}\) provides useful information as well, since it allows the identification of those universities with higher volatility on their scores, being Catamarca, La Matanza and, particularly, La Rioja cases of universities that have low efficiency indicators and high variability. On the contrary, the high part of the distribution characterizes for presenting comparatively lower variability on the efficiency scores. The University Tres de Febrero is the only exception to this general pattern.

Fig. 1
figure 1

Efficiency scores (VRS) distribution by universities

4.2 Second stage: modelling differences in efficiency scores. Data and estimation strategy

In the second stage, each efficiency score \(h_{jt}^{*}\) reported in Table 2 was explained by estimating Eq. (4) by applying FE. Note that the variables taken as inputs in the DEA conducted in the first stage are not fully controlled by the universities, since, for example, their budgets are determined by the central administration and the number of enrolled students cannot be directly limited by the universities because of legal restrictions.

There are other variables, however, that the universities can manage with less constraints and that can potentially affect their efficiency. In particular, we were interested on quantifying the effect of the variables related to the composition of their faculties, which can be decided by the universities directly.Footnote 13 There are two main dimensions on which the universities can design the composition of their faculties: one is the proportion of high-ranked professors and other is the teaching workload taken by full-time faculty—used as a proxy to the separation of academic staff into teaching-only or teaching and research oriented (Worthington and Lee 2008)—. Note that low-ranked are much common than high-ranked professors in the Argentinean public universities, which can be affecting the productivity of these universities both in teaching and research. Moreover, the presence on the faculties of part-time instructors is relatively common in Argentinean universities as well, which could be hampering the efficiency on the functioning of the universities, since the professionals on these part-time positions have usually a non-academic job out of the university and do not normally have incentives to produce research output. Therefore, one of the objectives in the second stage was to quantify the effect on the efficiency scores of the number of high-ranked professors, which we express in relative terms to the assistants to prevent scale problems.

Academic staff was divided into two subcategories: Professors, which include high-ranked academics (Assistant, Associate and Full Professors), and lower ranked Teaching Assistants, which comprise initial level auxiliary instructors (Teaching Assistants and Practical Work instructors). Finally, the number of researchers was calculated distinguishing between senior and junior researchers. The former category includes researchers type I, II and III, while the second category includes researchers type IV and V in the Teacher Incentive System of the Argentinean Ministry of Science and Technology. To generate both groups of researchers (seniors and juniors), we assign different weights to each category and then we aggregate them.Footnote 14

Additionally, the second effect of interest in our analysis was the impact on efficiency of the proportion of full-time positions, expressed again as a ratio over the total of part-time faculty. These two variables were collected again from the Statistics Yearbooks of the University Policy Office for each one of the thirty universities on an annual basis from 2004 to 2013. The most basic specification of a model like (4) includes only these two variables plus the respective individual effects for each university.

Furthermore, more detailed specifications that included as control variables several indicators regarding the distribution of the budgets were estimated as well. More specifically, these additional control variables were defined as: the budget allocated to scientific production (Science Budget), the resources applied to special incentives to researchers (Research Incentives Budget), the capital investment (Infrastructure Budget), the current expenses (Operation Expenses) and the resources used in non-academic staff (Administrative Staff). All these variables were initially given in millions of Argentinean Pesos, but in our estimations they are expressed in relative terms to the number of students and in natural logs to make their interpretation easier. Table 3 presents the estimates of several specifications of Eq. (4).Footnote 15

Table 3 Results: FE estimates

The column labelled as “Model 1” in Table 3 reports the estimates obtained under the most basic specification without any additional control variable. Subsequent columns to the right, from “Model 2” to “Model 5”, include the previously mentioned controls gradually. Note that the estimation of Model 1 uses the total of 300 data points resulting in studying the thirty universities for the 10 years, but specifications with more control variables present smaller sample sizes because some of the indicators were not available for all the universities and years.

Results in Table 3 indicate that the two variables of interest, namely the ratios of high-ranked professors and the full-time positions, have a positive and significant effect on efficiency levels, while those associated with budget are not significantly affecting the efficiency under any specification.Footnote 16 One possible explanation for the lack of significant effects of such variables might lie on their reduced time variation: on the period under study the budget of the universities did not substantially change along time, being most of the variability on these indicators explained by between differences which are already captured by the individual fixed effects. The variable that measures the ratio between high-ranked versus low-ranked professors presents a positive modest, but still significant, coefficient under any of the specifications. The coefficient associated with the proportion of teaching workload taught by full-time versus part-time positions is much bigger in size and significantly different from zero at 1% in all the specifications. This result is not surprising and goes in the same line as documented in previous literature (Worthington and Lee 2008): those universities that organize their faculty relying on part-time positions pay a penalty in terms of their efficiency. The estimates suggest that doubling the ratio of teaching workload taught by their full-time faculty increases their efficiency scores on a range that varies between 5.43 and 6.84%.

The parametric approach followed in this second stage of our analysis makes also possible to quantify the part of the efficiency not explained by the within variation of the regressors but by the individual time-invariant characteristics of the universities. The part of the variation explained by the regressors \(z_{k}\) is reported in the rows with the respective \(R^{2}\) that explains the within variability, revealing that they explain between 18 and 25% depending on the specification estimated of Eq. (4). The proportion of the variability on the \(h_{jt}^{*}\) scores that, not being explained by the regressors \(z_{k}\), can be attributed to these individual characteristics of the universities is reported in the respective row labelled with the scalar \(\rho\) for every model estimated. This proportion is substantially high, being higher than 80% in all the cases and suggesting that these time-fixed effects play a relevant role on explaining differences on efficiency scores. These fixed effects can be recovered and their estimates corresponding to the most detailed specification of Eq. (4), which corresponds to the column “Model 5” in Table 3, are presented in Fig. 2.

Fig. 2
figure 2

Individual fixed effects by university. Reference: University of Buenos Aires. The dots represent the point estimates of the fixed effects and the horizontal bars the confidence intervals constructed at 95%

The vertical red line on zero is the reference on our analysis and arbitrarily sets the reference as the fixed effect estimated for the University of Buenos Aires—with a point estimate of 0.2329—. Not surprisingly, the universities with most negative fixed effects—at the left of this reference line—correspond to those with the smallest efficiency scores estimated in the first stage, as the cases of Jujuy, Salta or La Rioja. The universities with estimated fixed effects at the right of this line are those with individual effects larger than the reference, being those only the cases of the universities of General San Martin, General Sarmiento, La Plata, Mar de Plata, Sur and Villa Maria. Note that these are universities located in the highest part of the distribution with mean scores \(\bar{h}_{j \cdot }^{*}\) equal or very close to one. Note, however, that there are universities with mean scores \(\bar{h}_{j \cdot }^{*}\) equal to one (Formosa, Lanus and Rosario) that present fixed effects lower than the estimate corresponding to the reference. This result suggests that these universities manage to be efficient even when their idiosyncratic time-invariant characteristics put them in a comparatively worse situation than the case of the University of Buenos Aires.

5 Conclusions

The analysis of efficiency of higher education has become customary in recent years (see Agasisti et al. 2016; Sagarra et al. 2017; Wolszczak-Derlacz 2017). The flexible nature of the productive process of education systems, on which several outputs and inputs can be considered, favours the use of Data Envelopment Analysis (DEA) rather than more rigid approaches. Measuring efficiency by simply applying DEA, however, does not allow for identifying the drivers of this efficiency, which can be considered as a drawback of this approach.

This paper takes advantage of the flexibility inherent to DEA modelling to study the efficiency on universities, but applying also a second stage analysis based on a parametric approach that allows for identifying the most relevant factors that explain differences on productivity along time and between universities. More specifically, we analyse the set of thirty National Universities in Argentina, studying annual information from 2004 to 2013. On a first stage, we apply an output-oriented DEA-VRS that explains the teaching (number of graduate students) and research (number of scientific papers, books, patents, etc.) outputs on three inputs, namely the number of enrolled students, the annual budget and the academic staff. The results of this first stage show a positive but modest trend in the efficiency scores along the years. More interestingly, they also reveal a clear segregation between (i) a group of nine universities with top efficiency scores along the years (Buenos Aires, General San Martin, General Sarmiento, La Plata, Formosa, Lanus, Rosario, Sur and Villa Maria), (ii) a second group with four universities that present high efficiency scores (Mar del Plata, Quilmes, Cordoba and Lomas de Zamora), and (iii) a last group with the remaining seventeen universities on which the efficiency scores are low (being as low as 0.318 on mean in the case of Jujuy, for example).

Besides this initial evaluation of the efficiency on the sample under study, the analysis conducted in the second stage of our research allows for identifying the factors that explain the variability on the efficiency scores calculated on the first stage. We apply a Fixed Effect (FE) estimator that takes advantage of the structure of panel of our data set of universities to identify the part of this variation that can be attributed to time-invariant idiosyncratic characteristics of the universities—between variability—and the within variation that is explained on a set of regressors.

The use of panel data estimators to explain the results of a preliminary DEA model is a novelty, to the best of our knowledge, in the study of efficiency in higher education. In the case under study, the most important regressors were set as variables related to the composition of the faculties over which the universities have some control. In particular, we put our interest on two ratios that measure the proportion of high-ranked faculty over lower ranked positions and the part of the teaching hours taught by full-time positions—over those taught by part-time instructors. The results of our models under several specifications show that these two indicators contributed significantly—specially the second one—to explain variability on the efficiency. The time-invariant individual effects estimated indicated that there was a substantial part of between variation on the efficiency scores that can be explained by these idiosyncratic attributes.