1 Introduction

The Italian National Health Service (NHS), introduced in 1978, is a universal health care system providing comprehensive health insurance coverage and uniform health benefits to the whole population. It is inspired by the two fundamental pillars enshrined in Article 32 of the Italian Constitution: universality and equality. While the principle of universality has never been questioned, the principle of equality, especially if broken down into terms of equality of services in the local territories, has been repeatedly overlooked in recent years. The last decade has been characterized by the profound financial crisis, exacerbating the economic differences between North and South (see about this aspect Lagravinese (2015)) with inevitable effects also on the demand and supply of health. Indeed, inequalities have started to increase again with evident effects on available healthcare resources, on the quality of uneven healthcare services, and on the living conditions of citizens.

Since its introduction, and as in other European countries (see Costa-Font and Greer (2013)), the Italian NHS has undergone important reforms to decentralize health management and policy responsibilities to the sub-layers of government (Turati, 2013). As a result of a federal reform (Legislative Decree 56/2000), each region is in charge of organizing the health system, following the general guidelines defined by the central government that is responsible for setting the Essential Levels of Health Services (LEA, Livelli Essenziali di Assistenza).Footnote 1 However, the separation of financing from expenditure responsibilities in the provision of LEA, and before LEA in the provision of uniform levels of service, has provided a non-negligible incentive to the uncontrolled growth of Italian health expenditures and has historically contributed to creating bailing out expectations in regional behavior (Liberati, 2003), especially in a context of often inadequate regional health governance and accountability (Carinci et al., 2012).

The public policies concerning a reduction in beds, in medical and nursing staff, hospital mergers and acquisitions as well as lower investments in infrastructure are now subject to revaluation in many countries. Even the decentralized organization itself is questioned and, in many cases, possible scenarios of re-centralization of the health system are well thought-out (Mosca, 2007). In this respect, Italy is among the western countries that have significantly reduced healthcare spending by reshaping its hospital organization and decentralizing management at regional level. The effects of these policies are still controversial to this day, but what is undoubtedly an established fact is that the inequalities between the regional systems in these 20 years have not diminished. Furthermore, the recovery plans imposed by the central government in 2007 for the regions with high deficitsFootnote 2 have certainly favored the reduction of healthcare expenditures and have proved to be an effective mechanism for eliminating sub-national governments deficits (Bordignon et al., 2020). Nevertheless, there is still no clear consensus in the literature whether these plans, and in general the decentralization process, have also had an impact on increasing health inequalities between or within regions. Our work attempts to enrich this literature looking at how hospital efficiency (one of the pillar of health services restructuring by the regions) has changed over the last decade within and across regions.

The economic literature has so far produced a rich and consolidated series of works to study hospital efficiency. As far as Italy is concerned, most of them have investigated if the efficiency has changed either with the introduction of new accounting systems, such as the use of the diagnostic-related groups (Barbetta et al., 2007, Cavalieri et al., 2018), or due to the ownership of hospitals (Cellini et al., 2000, Colombi et al., 2017, Daidone and D’Amico, 2009). To the best of our knowledge, there are no studies that measure in a more holistic way how efficiency has changed over the years and whether the regional policies have actually reduced the differences in hospitals between and within regions. This paper attempts to fill this gap. In particular, the aim of the present study is threefold.

The first contribution of the paper regards the methodology used to measure the efficiency of hospitals. Indeed, health performances are usually averaged across several dimensions to obtain a composite indicator useful for ranking and comparisons. However, the obtained average score can hide contrasting attitudes and specialisations, being a crucial issue the definition of a proper set of weights to aggregate different subjects.Footnote 3 We apply the benefit of doubt (BoD) approach to determine a composite index that considers the multi-dimensionality of the hospital outcome. As far as we know, the BoD has received scarce attention so far in the health care sector. Nonetheless, in the absence of information about value judgments, as it is the case of hospitals, the BoD methodology can be an useful tool to account for the complexity of the health care system (Matos et al., 2021). The BoD aggregates linearly quantitative performance sub-indicators into a single composite one using the combination of weights that is the most convenient for the evaluated hospital (Cherchye et al., 2007). This is done by implicitly assigning less (more) weight to those sub-indicators or aspects of performance that the particular hospital is relatively weak (strong) when compared with all others in the sample (Karagiannis and Paschalidou, 2017).Footnote 4 The high flexibility in terms of exogenous assumptions for setting weights, makes the BoD an appropriate approach to account for the specializations of individual hospitals.

Secondly, the composite indicator generated by the BoD is then used as output in a metafrontier production function based on a stochastic frontier framework.Footnote 5 More specifically, we apply a two-step stochastic frontier approach recently developed by Huang et al. (2014), based on the idea that hospitals have access, potentially, to distinct production technologies in different time periods, being conditioned by several factors such as, for instance, environmental characteristics, regulation and the availability of resources. The analysis is carried out using hospital data over almost one decade: in 2007, the year before the great recession and when, with the introduction of the recovery plans, a number of regions had to substantially reconsider their offer on the territory and in 2016, the last year for which the data are available in detail for all Italian healthcare facilities. To account for the change in efficiency between the two years, the metafrontier has been used. To the best of our knowledge, this is the first work that applies this methodology to the Italian hospital context. The main findings show that the primary source of inefficiency comes from managerial inefficiency especially for hospitals located in southern regions.

The final contribution of the paper is an attempt to link the literature on the hospitals’ efficiency and inequality in the healthcare services at local level. The efficiency score distribution is employed to compute the Theil index (Theil, 1967), a perfect decomposable inequality index, to compare over time the inequality of the estimated efficiency between hospitals in the same region and between different regions. The results of this analysis allow us to make some more general considerations regarding the evolution of the healthcare services in Italy. The findings show a clear and persistent North-South gap in efficiency performances of hospitals along with an increase in the inequality in terms of efficiency between the areas of the country mostly determined by between region inequality. The empirical evidence provides helpful recommendations for policy makers regarding whether local level provision of hospital services need to be improved or whether the regional policies implemented in these years must be reconsidered in order to favor a convergence process.

The rest of the paper is organized as follows. The next section describes the methodology employed, Section 3 presents the data and provides some descriptive statistics, Section 4 discusses the results both in the metafrontier and inequality analyses and finally, Section 5 concludes and offers some policy implications.

2 Methodology

2.1 Efficiency of hospitals

We calculate the hospital’s relative efficiency at converting inputs into a production set while maximizing outputs. In the literature of productive efficiency analysis and frontier estimation, hospital efficiency has been estimated by two main alternative methods such as data envelopment analysis - DEA (Charnes et al., 1978, Farrell, 1957) and stochastic frontier analysis—SFA (Aigner et al., 1977, Meeusen and van den Broeck, 1977). The main advantage of the former method is the non-parametric treatment of the frontier that does not require the underlying production function to belong to any specific functional form.Footnote 6 Although its underlying assumptions provide a way to economize on data requirements, making efficiency analysis possible even when other methods of estimation would not be applicable (Patrizii and Resce, 2013), it ignores any stochastic noise in the data. The advantage of the SFA, instead, is in its stochastic treatment of these deviations.Footnote 7 However, the data used in these analyses may be relatively noisy, so that the deterministic approach (such as the DEA) seems to be inappropriate. Moreover, the units analyzed may apply different production technologies making also the parametric approach (such as the SFA) inappropriate. To incorporate the benefits of both approaches and to bridge the gap between SFA and DEA, several solutions have been proposed in the literature. For a recent and comprehensive view of the theoretical and empirical methods approaches to productivity and efficiency analysis, see Sickles and Zelenyuk (2019).

To comply with the characteristics of the hospital sector and to better understand the differences in average health costs and productivity among hospitals operating in the same environment, more standard techniques along with alternative approaches are used. An extension of the stochastic frontier is applied to estimate the determinants of persistent (long-run) and transient (short-run) inefficiencies (Colombi et al., 2017). With the aim of measuring the impact of environmental characteristics on hospitals’ technical efficiency, the conditional-efficiency approach (Cordero et al., 2015, Halkos and Tzeremes, 2011, Mastromarco et al., 2019, Varabyova et al., 2017) as well as a two-stage semiparametric bootstrap-based approach are applied (Cavalieri et al., 2018, Chowdhury and Zelenyuk, 2016). The importance of the spatial dependent heterogeneity in hospital technical efficiency is also underlined, through a spatial stochastic frontier, to separate spatial issues from the effects of geographical institutional factors (Auteri et al., 2019, Cavalieri et al., 2020). Stochastic input and output distance functions are also employed (Daidone and D’Amico, 2009, Jiang and Andrews, 2020) to model multiple input and output technologies. However, although extensive, the literature has not reached a clear consensus on which method should be adopted to measure the efficiency of healthcare provision institutions. For a comprehensive view, see Jacobs et al. (2009).

In our opinion, the stochastic frontier model is particularly well-suitable for the analysis of health facilities. It allows for stochastic errors, results cannot be severely affected by the presence of outliers, statistical inferences can be drawn, and the estimated parameters (which significance can be tested) can provide potentially useful information in terms of policy implications. Moreover, since operating in more advantageous (more disadvantageous) environments may increase (decrease) a hospital’s efficiency, such factors should be taken into account in the efficiency estimation. We recognize merits of the SFA in integrating the environmental factors in a one-step approach avoiding collinearity issues between the error term and the efficiency scores. Potential limits in handling a multi-output and multi-input environment are solved through the use of a single composite outcome (BoD approach).

2.2 The stochastic frontier analysis

To estimate hospital technical efficiency, we rely on a parametric approach, such as the SFA, through which a frontier is estimated on the relation between inputs and outputs. This can be, for example, a linear function, a quadratic function or a translog function. We use a more general functional form, that is the transcendental logarithmic or “translog” (see Barbetta et al. (2007), Colombi et al. (2017)). It may be preferred to the Cobb-Douglas form to overcome the latter restrictive elasticity of substitution and scale properties, and to allow for non-linear causalities. This method has become a common technique to assess production and inefficiencies in the production of good and services in several contexts, such as economics, health, education, and energy sectors. The methodology employed in this paper is developed by Huang et al. (2014). In a general canonical form, the stochastic frontier model is described by the following set of equations:

$${Y}_{it}={x}_{it}^{\prime}\beta +{\epsilon }_{it}$$
$${\epsilon }_{it}={\nu }_{it}-{u}_{it}$$
$${\nu }_{it} \sim N(0,{\sigma }_{\nu }^{2})$$
$${u}_{it} \sim {N}^{+}(0,{\sigma }_{u}^{2})$$

where Y denotes the output of the ith hospital at time t, xi is 1 × k vector of input of the ith hospital at time t, β is k × 1 vector of unknown parameters to be estimated, ϵit is the random error, with uit denoting the short-term inefficiency distributed by each unit as half normal, and νit is the stochastic component, distributed independently and identically as \(N(0,{\sigma }_{\nu }^{2})\).

The model described in the system of Eqs. (14) is estimated by means of maximum-likelihood methods.Footnote 8 The technical efficiency is calculated as the ratio of the observed output and the maximum feasible output on the production frontier. Formally:

$$T{E}_{it}=\frac{{Y}_{it}}{\hat{{Y}_{it}}}=\frac{{x}_{it}^{\prime}\beta +{\epsilon }_{it}}{{x}_{it}^{\prime}\beta +{\nu }_{it}}$$

where \({\hat{Y}}_{it}\) is the maximum feasible output that lies on the production frontier.

2.3 Metafrontier model

We follow the procedure proposed by Huang et al. (2014), apply a two-stage parametric approach and estimate a metafrontier production function based on a stochastic frontier framework. We calculate technical efficiency scores for hospitals in separate groups adopting different technologies as we expect hospitals in 2007 and in 2016 not generated from a single production frontier. This procedure allows us to assess both the group-specific frontiers (hospitals in 2007 and hospitals in 2016) and the metafrontier (for both periods).

As in Battese et al. (2004) and O’Donnell et al. (2008), the comparable technical efficiency measures can be decomposed in technical efficiency scores specific to the two groups and technology gap ratios. The main difference between the method proposed by Huang et al. (2014) is that the latter’s second-step estimation of the metafrontier is still based on the stochastic frontier framework, rather than on a mathematical programming technique. Indeed, the two-step method applied by Battese et al. (2004) and O’Donnell et al. (2008) combines the stochastic frontier regression (used in the first step) to estimate the frontier of the separate groups with mathematical programming technique (in the second step) to obtain the metafrontier.Footnote 9 The second-step estimation proposed by Huang et al. (2014) is still based on the stochastic frontier configuration, such that the metafrontier estimation is a stochastic metafrontier (SMF) regression method, with some important computational advantages. First, the usual statistical inferences can be performed without relying on simulations or bootstrap methods, as opposed to mathematical programming techniques.Footnote 10 Second, the metafrontier makes possible to directly estimate the technology gaps by treating them as a conventional one-sided error term. This strategy allows us to separate the random shocks from the technology gaps.

The procedure is based on two stages. In the first stage, the group-specific frontiers (i.e. hospitals in 2007 and 2016) are estimated by means of a stochastic frontier model as in the system of Eqs. (14) for each hospital. The estimated parameter β associated to the pooled stochastic frontier model changes to βj for each group j. In the second stage, a metafrontier is enveloped over the frontiers associated to each year (see for more details about the metafrontier optimization (O’Donnell et al., 2008)).

We calculate the technical efficiency (TE) associated to the metafrontier in order to compare the efficiency scores of the hospitals across different technology sets (frontiers). Formally:

$$T{E}_{it(j)}^{* }=\frac{{Y}_{it(j)}}{{Y}_{it}^{* }}=\frac{{x}_{it(j)}^{\prime}{\beta }_{j}+{\epsilon }_{it(j)}}{{x}_{it}^{\prime}{\beta }^{* }}$$

where \({Y}_{it}^{* }\) denotes the output on the metafrontier, β* defines the parameters associated to the metafrontier function for all groups j, while \(T{E}_{it(j)}^{* }\) represents the ratio of the observed output of hospital i in group j to the metafrontier output over time t.

The technology gap ratio ratio (TGR), described by the ratio of the output of the production function for hospital i over time t relative to the potential output of the metafrontier for a given set of input variables, is as follows:

$$TG{R}_{it(j)}=\frac{T{E}_{it(j)}^{* }}{T{E}_{it(j)}}$$

In particular, a technology gap exists due to the chosen technology depending on both economic and non-economic production environments. More specifically, TGR captures the difference between the productivity of the group and the metatechnology (i.e. the technology available to all hospitals) (see for more details the geographical representation associated to metafrontier in O’Donnell et al. (2008)).

Finally, the two previous terms can be also used to calculate the hospital technical efficiency with respect to the metafrontier production technology as opposed to the hospital’s technical efficiency with respect to the group-j production technology (MTE), as follows:

$$MT{E}_{it(j)}=TG{R}_{it(j)}\cdot T{E}_{it(j)}$$

2.4 Inequality in efficiencies scores over time

In order to investigate inequality in terms of efficiency scores, we use the results of MTE. The distribution in the two periods is then computed disentangling between and within regions inequality using the Theil index (Theil, 1967), which is a perfectly decomposable inequality index, as follows:

$$T{I}_{it}=\mathop{\sum}\limits_{it}{f}_{it}\left(\frac{{\tau }_{it}}{\mu }\right)log\left(\frac{{\tau }_{it}}{\mu }\right)$$

where fit is the group of hospital i at time t, τit is the efficiency index of the hospital i at time t and μ is the average of efficiency scores obtained from the full sample.

The index in Eq. (9) can be decomposed into a between and within group component as follows:

$$T{I}_{it}=\left[\mathop{\sum}\limits_{jt}{g}_{jt}\left(\frac{{\mu }_{jt}}{\mu }\right)log\left(\frac{{\mu }_{jt}}{\mu }\right)\right]+\mathop{\sum}\limits_{jt}T{I}_{jt}{g}_{jt}\left(\frac{{\mu }_{jt}}{\mu }\right)$$

where j refers to the sub-group, gjt is the share of group j and TIjt is the inequality in group j at time t. The between component of inequality is captured by the first term, i.e., the level of inequality if everyone within each group j had efficiency level μj at time t, the second term gives the within component of inequality (Cowell, 2000, Elbers et al., 2005).

3 Data

3.1 Inputs and outputs

The data were collected by the Ministry of Health. The dataset for 2016 can be consulted directly by downloading it from the government website whereas for the 2007 data, the Archive Internet Wayback Machine on the Ministry of Health website was used.

Data include information on different inputs and outputs usually considered in the studies on hospital efficiency (see, for instance, Barbetta et al. (2007), Cavalieri et al. (2018), Colombi et al. (2017), Jiang and Andrews (2020), Mastromarco et al. (2019)) with regard to 403 hospitals followed in two years (more specifically, we collect data on 29 accredited, 42 private and 332 public hospitals).

More specifically, five measures of inputs are included in the model. The first three measures reflect the number of personnel units: the number of physicians (Physicians), the number of nurses (Nurses) and the number of other personnel (Other). The fourth input refers to a measure of capital such as the number of available beds (Beds). The fifth input is the number of magnetic resonance imaging scans (MRI scans)Footnote 11 per one thousand population at province level as a medical input to acknowledge the growth and importance of healthcare technology.Footnote 12 With regard to the output side, three measures of outputs are included in the model: the number of discharged patients (Discharged Patients), the number of inpatient days (Inpatient Days), and the number of emergency room treatments (Emergency Room Treatments).

Weighting outputs according to case-mix has been acknowledged as vital, particularly when the sample consists of hospitals of different sizes, or university hospitals together with other acute hospitals, to minimize intra-hospital as well as inter-hospital differences and therefore account for the different cost of hospital services (Chowdhury and Zelenyuk, 2016, Chowdhury et al., 2014, Mastromarco et al., 2019). To take heed of the complexity and different characteristics of the health services, following Colombi et al. (2017), we weight the three outputs using a weighted annual output of hospital i in year t, using the diagnosis-related group weight.

3.2 Variability of (in)efficiency

It seems inappropriate to assume that efficiency will vary for each hospital in the same way. Institutions can react in very different ways to the contexts they operate in. To take into account the effects of factors that affect the performance of hospitals, we include a vector of exogenous variables in the variance of the inefficiency term (see Colombi et al. (2017)).

First of all, we reckon with the impact of ownership on hospital performances. In general, private ownership characterized by the presence of residual claimants should represent a powerful incentive to economic efficiency and cost reduction; on the contrary, public ownership and/or the absence of any claimant of residual earnings may induce shirking and could decrease effort, consequently reducing efficiency (Barbetta et al., 2007). We use a dummy variable taking the value of 1 in case the hospital has a private/accredited private health care (Private) ownership, and 0 otherwise.

Second, we also deal with the fact that, in some Italian regions, region-specific recovery plans, called “Piani di Rientro”, have been implemented since 2007 in order to recover from the budget deficit related to health care expenditures.Footnote 13 Although part of the literature agrees on the fact that being included in such plans helps to contained costs (Atella et al., 2019), the consequences in terms of health outcomes are less clear.Footnote 14 A dummy variable is therefore included: 1 if the hospital is located in a region included in the recovery plan (Recovery Plans), and 0 otherwise.

Finally, a macro-area dummy is also included to control for geographical area effects (North) taking the value of 1 in case the hospital is located in a region in the North of Italy, and 0 otherwise (Centre-South as reference group).

Table 1 contains the descriptive statistics of the variables used in the production set. Table 2 specifies the outputs, inputs and the exogenous factor combinations in the empirical models. Hospitals in the southern regions have, on average, a lower number of personnel (both physicians and nurses) and beds available than those in the northern regions. Hospitals located in northern regions also have, on average, a higher number of discharged patients and emergency room treatments, as well as a higher number of inpatient days, compared to the hospitals in central and southern regions. See Figs. 1 and 2 for a graphical representation of the inputs and outputs at regional level, respectively.

Table 1 Descriptive statistics by geographical areas
Table 2 Estimating hospitals’ efficiency—specification of outputs and inputs and exogenous factors
Fig. 1
figure 1

Descriptive statistics: Inputs (geographical variation). ae The five measures of inputs included in the model such as the number of available beds (Beds), physicians (Physicians), nurses (Nurses), other personnel (Other), and magnetic resonance imaging scans (MRI scans), respectively

Fig. 2
figure 2

Descriptive statistics: Outputs (geographical variation). ac The three measures of outputs included in the model such as the number of discharged patients (Discharged Patients), inpatient days (Inpatient Days), and emergency room treatments (Emergency Room Treatments), respectively

3.3 Composite index of efficiency: benefit of doubt approach

Hospitals exhibit a wide variation in the quality of the product and its characteristics being particularly relevant on the output sides. To control for these differences, we construct a composite index consisting of the three output variables described above. The main aim is to avoid common subjectivity on the weight selection by proposing an endogenous weighting mechanism. Therefore, variables in which hospitals have a comparative advantage are more heavily weighted than those in which hospitals have a lower comparative advantage, or even a comparative disadvantage. Since the strengths of hospitals differ, the weights on the performances should differ as well. For example, hospital X may perform relatively poorly with regard to the number of discharged patients. Therefore, in an endogenous weighting, the output ‘number of discharged patients’ will be assigned a lower weight. In contrast, outputs where the hospitals perform relatively well, for instance, the number of emergency room treatments, will obtain a higher weight. In a similar model the weights are observation-specific, diverging from the previous literature and practices. The idea corresponds to the BoD model, a concept that was first developed by Melyn and Moesen (1991).Footnote 15 Using BoD, each hospital gains its own weights that maximize (or minimize) the impact of the criteria in which the hospital performs relatively well (or poorly) compared to others.Footnote 16 The BoD scores are used as composite output to measure the efficiency of hospitals.

4 Results

This section is divided into two sub-sections. First, we examine the results of the hospitals’ metafrontier production function based on a stochastic frontier framework (§4.1). The results of the analysis of the inequality in hospital efficiency are then presented and discussed (§4.2).

4.1 The hospitals’ metafrontier production function

Following the approach suggested by Huang et al. (2014), we apply the likelihood ratio test for the null hypothesis that the production frontiers are the same for the two groups of hospitals (those operating in 2007 and 2016) and find that the null hypothesis is rejected supporting the idea that the sample hospitals are operating, over a distance of almost a decade, using heterogeneous technologies. Therefore, the existence of a potential production technology gap justifies the estimation of the metafrontier production function in hospital performance/efficiency.

Table 3 reports the hospitals’ stochastic frontier estimates. Columns 1 and 2 report the estimates of the hospital group-specific stochastic frontiers for both years 2007 and 2016, respectively. The ratio of the group-specific production frontier to the metafrontier is reported in Column 3. All input variables have a positive and statistically significant effect on the outcomes of the hospitals (BoD composite index of efficiency). Being located in a region which is included in the recovery plan is negatively correlated with the technical level of inefficiency. This evidence suggests that being included in such plans not only helps to contain costs, but also seems to have positive consequences on the efficiency of the hospitals (Bordignon et al., 2020, Di Novi et al., 2019). Private hospitals outperform public hospitals. Finally, operating in more economically developed areas (North area of the country) is associated, on average, with higher efficiency. Given the gap in economic development between northern and southern Italy, this result raises serious social issues under the equity profile. Indeed, all else equal, hospitals that operate in southern Italy are required to provide extra effort to produce the same level of output. To deal with multiple outputs in the stochastic frontier framework, we also use as robustness analysis a distance function approach, still assuming a translog functional form. The results, reported in Table 7 in the Appendix, corroborate our results implying that the functional form does not invalidate the findings of the analysis.Footnote 17

Table 3 The hospital group stochastic frontier estimates

Table 4 reports the summary statistics of various efficiency scores, at regional level, for the two groups of regions (in 2007 and 2016)Footnote 18 while Fig. 3 reports a graphical representation of the hospitals’ efficiency scores at regional level. All regions in 2016 (with the exception of Lazio, Puglia, Sardegna, Sicilia and Toscana) seem to be more technically efficient (average TE = 0.7839) with respect to their own peer group in 2007 (average TE = 0.7781). Hospitals operating in 2016 seem to be highly more efficient (average TGR = 0.9876) compared to those in 2007 (average TGR = 0.9630) in adopting the best available hospital-operating technology as measured in the technology gap ratio. The hospitals in 2007 are less technically efficient (average MTE = 0.7614) with respect to the same hospitals in 2016 (average MTE of 0.7838) as measured by the metafrontier technical efficiency. Overall, on average the TGR scores play a more important role in the determination of the ranking in MTE, suggesting that sources of inefficiency for hospitals operating in 2007 come especially from the technology used rather than from managerial inefficiency.

Table 4 The estimates of the hospitals’ metafrontier
Fig. 3
figure 3

Hospitals’ efficiency scores (geographical variation). ac The measures of hospitals’ efficiency scores such as Technical Efficiency, Technical Gap Ratio, and Metafrontier Technical Efficiency, respectively

An interesting heterogeneity is present when single regions are taken into account. Indeed, the TE scores play a more important role in the determination of the lowering (or not notable increasing) MTE ranking in 2016 with respect to 2007 for Lazio, Puglia, Sardegna, Sicilia and Toscana, suggesting that the primary source of inefficiency come from (lower) managerial efficiency rather than the technology undertaken. On the other hand, the TE scores play a more important role in the determination of the ranking in MTE for Emilia-Romagna, Friuli-Venezia Giulia, Liguria, Lombardia, Trentino (P.A. Bolzano and P.A. Trento), Piemonte and Veneto, suggesting that the primary source of efficiency in 2016 comes from (higher) managerial efficiency rather than the technology undertaken (that remain quite constant, although very high). Finally, although Basilicata, Calabria, Marche and Umbria in year 2016 outperform year 2007, the TGR scores play a more important role in determining the ranking in MTE, suggesting that for hospitals located in those regions the main source of efficiency come especially from the technology undertaken rather than managerial efficiency. The opposite is true, instead, for Molise for which the main source of efficiency for hospitals operating in 2016 seems to come from managerial efficiency rather than the technology used. Finally, both TE and TGR scores play an important role in determining the increasing ranking in MTE for Abruzzo and Campania, suggesting that sources of efficiency in 2016 are both (higher) managerial efficiency and the technology undertaken.

4.2 Rising Inequality in Hospital Efficiencies

To analyze the distribution over time of the ‘more efficient and ‘less efficient’ hospitals, we divide the rank distribution of the hospitals’ scores into 5 percentiles. The graphical representation of the rank distribution is represented in Fig. 4 (the blue, brown, green, orange and red lines illustrate, for each region and for years 2007 and 2016, the share of hospitals that falls in the 20th, 40th, 60th, 80th and 100th rank percentile) and shows a clear and persistent North-South gap in the efficiency performances of hospitals. Indeed, in all the northern regions (Piemonte, Valle d’Aosta, Liguria, Lombardia, Trentino-Alto Adige, Veneto, Friuli-Venezia Giulia, Emilia-Romagna) the hospitals’ efficiency scores reached the highest 61th–80th and 81th–100th percentiles. The gap between the central-northern and southern regions did not change substantially between 2007 and 2016. A clear pattern emerges: none of the southern hospitals reach the highest percentile both in 2007 and 2016. More specifically, in the southern regions and in the two periods analyzed, the distribution of efficiency never reaches the highest percentiles in Basilicata, Calabria and Sardegna.

Fig. 4
figure 4

Rank distribution of the hospitals’ efficiency scores at regional level for years 2007 and 2016

In line with the previous literature, the findings highlight that in the last decades no process of convergence between the southern regions and those in the north has been observed (Lagravinese et al., 2019). The neglect inefficiencies are still evident between the two macro areas (north-south) and in 2016 the best hospital in the south never reaches the results of the northern hospitals. This result may explain how health mobility between northern and southern regions has increased in recent years, posing serious problems of inequality due to the fact that only individuals with higher incomes can afford to be treated outside their own region (Balia et al., 2018, 2020).

The rank distribution can then be further decomposed to estimate the variability between and within regions according to Eq. (10). To this purpose, Table 5 reports the Theil index for hospital’ performances, with two components: the between regions (estimated within 20 regions) and the within regions. Table 5 shows that between 2007 and 2016, the total inequality of the country increased from 0.022 to 0.025, suggesting that the differences between areas of the country have remained almost unchanged. Much of this inequality is determined by between inequality (around 63.9% of the total). In both years 2007 and 2016, the between inequality increased from 0.014 to 0.015 and the within inequality slightly increased from 0.008 to 0.009.

Table 5 Theil index: total, between and within inequality

Additionally, we also calculate the Theil index for each region to further examine how the within inequality has changed in the single regions. The results, summarized in Table 6, show that in general, as was to be expected, inequality in the regions is not as high. This is explained by the fact that by now the regions have outlined the distribution of the service on the local territory where all the hospitals follow guidelines and homogeneous protocols in the regional territory. However, some differences can be seen. In the central-northern regions (net of the regions of Umbria and especially Lazio), it is generally observed that the differences in the same region are minimal with values very close to zero, while the differences are greater among the hospitals in the southern regions. Between 2007 and 2016, an increase in the within inequality occurred especially in Calabria and Sardegna.

Table 6 Theil Index Within inequality by regions

5 Concluding remarks and policy implications

5.1 Conclusion

This paper investigates the efficiency of Italian hospitals and how their performances have changed over the period 2007–2016. We apply the BoD approach to determine a composite index that considers the multi-dimensionality of the hospital outcome to be used as main output in a metafrontier production function based on a stochastic frontier framework. This procedure disentangles the group-specific frontiers and the metafrontier, further decomposing the efficiency scores of various groups of hospitals into technical efficiency scores and technology gaps.

The main findings show that almost all regions in 2016 (with some exceptions) seem to be more technically efficient than their own peer group in 2007, being, on average, also more efficient in adopting the best available hospital-operating technology. More heterogeneity is found when looking at the single regions. Indeed, managerial efficiency rather than the efficiency in adopting the best available technology seems to be more important (or not important) in determining the lowering (increasing) in the metafrontier technical efficiency ranking, especially for hospitals located in southern regions. Furthermore, with a closer look at the distribution of the efficiency scores across time, the empirical evidence shows a clear and persistent North-South gap in the efficiency performances of hospitals that did not change substantially between 2007 and 2016. Finally, when we further decomposed the rank distribution in order to estimate the variability between and within regions, the results show an increase in the inequality in terms of efficiency between the areas of the country mostly determined by between region inequality. An increase in within region inequality is also found, especially for the southern regions.

A possible shortcoming of our empirical strategy is worthy to be further discussed. We apply the BoD methodology using the typical construction of the meta-frontier which is estimated under the assumption that the metaset is convex. Violating this assumption could make the estimates biased (see Kerstens et al. (2019) for a discussion). The convexity assumption is usually justified using a time divisibility argument (Hackman, 2008, Shephard, 1970) according to which if production processes are time divisible, then a (hospital) manager could use, in some cases, an input to produce a specific output, and then use a second input to produce a further specific output the rest of the time. Our knowledge of hospital organization leads us to believe that this is not the case for the groups of hospitals operating in 2007 and 2016 suggesting that the technology-specific production possibilities sets may not be convex (i.e., possible differences in the methods that are available to transform inputs into outputs or in production environments).

Finally, given the data constraints and unavailability, more dimensions of inputs and outputs could not be used in the empirical analysis. However, besides labour and capital (represented by number of beds), the consumable (e.g., supplies in medical, drugs, and surgical, etc.) is also an important aspect to be accounted in as the inputs (see, for an example, Chowdhury and Zelenyuk (2016)). Moreover, the number of employees is used as the inputs for labour, even though the full-time equivalent staff is also widely accepted as an important variable in hospital efficiency analysis (see for comprehensive reviews Hollingsworth (2008), O’Neill et al. (2008), Worthington (2004)). These more thorough variables should be taken into account for empirical models and studies in the future.

5.2 Policy implications

Several implications can also be derived from our analysis.

The findings highlight the still unresolved social-economic dualism between the northern and southern regions of the country. Indeed, a persistent territorial divide in the regional health care in Italy has been seen to be, at least partially, the possible cause of a large heterogeneity of mortality rates among regions (Arcà et al., 2020, Lagravinese et al., 2019). We find that the gap between the central-northern and southern regions not only did not change substantially between 2007 and 2016, but also that none of the central-southern hospitals reach the highest percentile in both periods, confirming that in the last decades no process of convergence between the southern regions and those in the north has been observed.

Although, there is evidence of the general positive effect of decentralization on health performance in Italy (Atella et al., 2019, Bordignon et al., 2020, Di Novi et al., 2019), our study, along with some recent literature (Arcà et al., 2020, Depalo, 2019, Lagravinese et al., 2019), provides evidence that such improvements may not involve all the health dimensions and all the regions, and, more importantly, may not solve, even partially, the increasing gap between northern and southern regions. The decentralization process therefore does not seem to have helped bridging the gap between hospitals in the north and those in the south. Indeed, the differences have also (albeit slightly) worsened over the years. Certainly, further analyses are needed in the future to more carefully indicate whether the efficiency changes are the result of such policies in filling up the gap, given that also other social-economic factors changed in the study period (e.g., the demography changes in different regions). However, in the next few years, also in light of the critical issues on regional organization revealed by the recent pandemic due to the COVID-19 virus, national policies will have to be implemented to bridge the infrastructural and economic gap in the southern regions.