# On the estimation of spatial stochastic frontier models: an alternative skew-normal approach


## Abstract

This paper deals with an alternative approach to combine spatial dependence and stochastic frontier models using a large statistical literature on skew-normal distribution functions. I show how to combine a spatial dependence structure with a stochastic frontier model in a way that (1) is straightforward to estimate, (2) is able to combine spatial dependence and a technical efficiency term in a single error term, and (3) produces consistent estimates. With smaller sample sizes, estimation of the parameter governing technical efficiencies becomes imprecise. The consistency of parameter estimation is shown using simulations, and I provide an empirical application to estimate spatially correlated technical efficiencies within a European regional production function context.

## JEL Classification

R11 R15

## 1 Introduction

*within* countries than across countries. To illustrate this, Fig. 1 shows the dispersion of relative regional GDP per capita across European countries.

European income seems to be concentrated within large metropolitan areas (most notably—the capital cities) such as Paris, London, Luxembourg, Oslo, and Stockholm. Apart from this urban–rural divide, large differences are present as well between subnational regions. The most well-known example is the North–South division within Italy, but these differences can as well be noticed within almost all other countries in Europe. Most notable examples are Spain (North–South division as well), France (with a relatively poorer central area), the UK (where a North–South divide is, albeit less notable, visible as well), and Germany (with the former division between East Germany and West Germany). Thus, some regions within countries are significantly more successful than others—even when faced with similar national institutions.

What makes these regions economically successful? This is perhaps the most crucial and complex research question for both regional policy makers and regional scientists to answer. Policy makers would like to have policies able to steer regions to success, and regional scientists are especially interested in the (nature of the) determinants that drive this success. At the heart, this research question deals with the absolute and relative location advantages of regions.^{1}

A strongly related research question deals with the exact nature of economic performance and how to measure it. To do so, endowment levels should be taken into account. In the economics literature, this can be reflected by the use of regional production functions (see, e.g. Rodríguez-Pose and Crescenzi 2008; Basile et al. 2012). Given the size of production factors, such as labour and capital, regions should attain a certain production level, but they usually produce below it. The distance between the optimal and actual production level is usually measured by technical (in)efficiencies and modelled stochastically by a stochastic frontier approach.

There is already a sizeable literature dealing with benchmarking regions using regional technical efficiencies modelled by a stochastic frontier approach (see, amongst others, Driffield and Munday 2001; Brock 1999; Puig-Junoy 2001; Puig-Junoy and Pinilla 2008; Alvarez 2007; Otsuka 2017).^{2} This literature usually deals with the relative (sectoral) performance of regions, and this is the approach this paper takes as well. The production factors are then usually constituted of the aggregates of various forms of labour (high skilled and low skilled) and capital (both physical and human) within a region.

However, taking only local endowments into account boils down to an absolute location approach: it does not matter where the region is located with respect to its neighbours. Yet the relative location of the region matters as well, as regions are intrinsically connected to each other in networks formed by trade, knowledge spillovers, commuting, and migration (Thissen et al. 2016). It is crucial to control for this spatial dependence, as omitting it might lead to bias, at least in the estimation of technical efficiencies (Anselin 1988).

The literature that combines spatial dependence and stochastic production frontiers is, although relatively recent, already sizeable. Most studies employ a parametric approach, and the enumeration that follows is by no means exhaustive. One of the first parametric studies was Barrios and Ladado (2010), who use an iterative back-fitting algorithm to find consistent parameter estimates, although they do not allow for correlation between the technical efficiency and the spatial dependence structure. Pavlyuk (2010) uses a parametric approach as well, but does not report how he consistently estimates both the spatial dependence process and technical efficiencies. Fusco and Vidoli (2013) and Vidoli et al. (2016) separate the error term into a spatial lag structure and technical efficiencies, with an application to the Italian wine sector. Kinfu and Sawhney (2015) apply a spatial stochastic frontier analysis to maternal care in India. Glass et al. (2013) decompose productivity growth using a spatial autoregressive model, whereafter Glass et al. (2016) extend the analysis to a spatial panel setting. Finally, Jiang et al. (2017) apply a fixed effects stochastic frontier model to energy efficiency in Chinese provinces. In addition, there is a smaller literature that resorts to a Bayesian approach and simulation techniques, i.e. Schmidt et al. (2009), Areal et al. (2010), and Tsionas and Michaelides (2016).

A specific feature that applies to most of the studies above is that they model the spatial dependence and efficiency processes separately (see, e.g. Fusco and Vidoli 2013). Then, as I will argue below, the error term is by definition multivariate as it is a combination of a normal and truncated normal distribution, where one of them or even both are multivariate due to the involved spatial correlation structure, which makes estimation cumbersome.

In contrast, this study applies an alternative approach firmly rooted in the *statistical* literature. Using a relatively straightforward skew-normal distribution approach, I show how to combine a spatial error structure with a stochastic frontier model in a way that (1) is straightforward to estimate, (2) is able to combine spatial dependence and a frontier model in a single error term, and (3) produces consistent estimates. The latter is shown by a simulation study, where, although all parameters are estimated consistently, it is clear that the parameter measuring technical inefficiencies is estimated very inefficiently (i.e. with large standard errors) when the number of observations is small. The skew-normal distribution is not often applied in the econometric stochastic frontier literature, a notable exception being Chen et al. (2014), although they look at fixed effects panel models instead of spatial dependence models.

The remainder of this paper is structured as follows. The next section introduces the concept of regional technical efficiencies and discusses some measurement issues. Subsequently, it treats the modelling (and its associated estimation) of technical efficiencies in two ways: a mainly econometric and a more statistical one.^{3} The last subsection deals with the introduction of spatial dependence in stochastic production frontiers. Section 3 provides simulation results to indicate the performance of the proposed estimation methods within small and realistic samples, as usually encountered when benchmarking (European) regions. Section 4 provides an application of spatial stochastic frontier modelling and gives an estimation of the average technical efficiencies of European NUTS2 regions in the period 2000–2010. The last section concludes by indicating how, in the proposed framework, more complex spatial dependence structures can be incorporated in stochastic frontier models. Estimation of these models, however, requires complex multivariate likelihood or simulation techniques.

## 2 Estimating regional technical inefficiencies

Since the late 1970s, economists increasingly recognized that although having access to the same set of production factors, firms do not necessarily produce the same output—and that there was consequently a need to econometrically correct for that (see the seminal papers of Aigner et al. 1977; Meeusen and van den Broek 1977). To explain this variation in output, it was argued that firms do not deploy production factors with the same efficiency. For example, labour might be less productive because of firm-specific lack of monitoring, which opens the possibility for various forms of shirking on the work floor. Or firms do not have access to the same technology and have therefore different output levels.

This, however, creates a problem. If most firms do not produce according to profit maximization, but systematically lower than that, then traditional production function estimates are *biased*.^{4} Namely, not being able to optimize profits or costs leads to the fact that firms end up beneath an estimated ideal profit level. Consequently, in the literature associated with stochastic production functions, the error terms are usually composed error terms: the traditional error term reflecting noise and a new error term—being strictly positive—measuring a firm’s inefficiency.

Analogously to firms, regions with similar inputs do not necessarily attain the same production level either. Partly, this may be due to missing covariates (such as not being able to correctly measure human and social capital), but partly this may be caused by the fact that inputs are not always deployed as *efficiently* as possible, due to local or national institutions, social structures, etc.

To control for this, the regional science literature has borrowed from the firm-specific efficiency literature the concept of regional stochastic production frontier analysis. As Fig. 1 clearly shows, some regions are probably more efficient than others—even within the same country. And this should be reflected when benchmarking those regions.

*y* as the *given* maximum attainable production a region can get using the production factors \(x_1\) and \(x_2\), say capital and labour. Regions \(A_1, A_2, A_3, B_1, B_2\), and \(B_3\) are all producing inefficiently. With the production factors \(x_1\) and \(x_2\) that theoretically enable them to produce *y*, they produce on average \({\hat{y}}\). The distance between *y* and \({\hat{y}}\) is then a measure for the average efficiency. More precisely, average efficiency is defined as the ratio \({\hat{\mathbf{y}}}/{\mathbf {y}}\) where \({\mathbf {y}}\) is the length of the line between the origin and *y*. As a result, technical efficiencies must be smaller than 1.

Estimation of technical efficiencies may, however, be biased in the presence of spatial dependence or unobserved spatial heterogeneity amongst regions. Namely, when one assumes that only neighbouring regions benefit from other regions’ technological knowledge through the traditional Marshallian channels of shared customers and suppliers, shared labour pools or spillover mechanisms (or just through unobserved spatial heterogeneity), then straightforward estimation of technical efficiencies is biased. This can be seen in the right production isoquant depicted in Fig. 2. Assume that regions \(A_1, A_2\), and \(A_3\) belong to country *A* and regions \(B_1, B_2\), and \(B_3\) belong to country *B*. It is then quite conceivable that, because of spatial unobserved heterogeneity or spatial dependence, the technical efficiencies of the regions within country *A* and within country *B* are related. In Fig. 2, region \(B_3\) might not produce inefficiently at all *given* the fact that neighbouring regions in country *B* produce less efficiently. This works the other way around as well. Regions that are very central in a network and surrounded by very efficiently producing regions probably have, besides a strong economic structure, a very favourable *relative* location as well. In this context, efficiency can be related to advantages of the *absolute* location, while spatial dependence relates to the *relative* location. To control for the inefficiency in both production and the geographical location, this paper incorporates a spatial correlation structure in stochastic production frontiers.

To show how one can incorporate spatial dependence in regional cross-sectional stochastic production frontier models, I first concisely revisit the non-spatial stochastic frontier model in Sect. 2.1.^{5} Thereafter, in Sect. 2.2, an alternative and not commonly used estimation method is introduced.^{6} Finally, I show how one can readily incorporate spatial dependence in stochastic production frontier models in Sect. 2.3.

### 2.1 Stochastic production frontiers

*i* \((i \in \{1,\ldots ,N\})\) can be modelled in a cross section as a Cobb–Douglas production function, thus (using vector notation)^{7}:

*X* is the matrix of production factors, \(\beta\) the vector parameters of the Cobb–Douglas production function and TE denotes the so-called firm-specific technical efficiency. Thus, TE is a distance measure of the firm to the (maximum) production of the best production firm there is, *within* the sample of firms. As a consequence, TE must be smaller or equal to one for each firm. Aigner et al. (1977) and Meeusen and van den Broek (1977) already specified (1) by assuming that \(\hbox {TE} = \exp (-u)\), where *u* represents a stochastic variable. Assuming a logarithmic specification yields:

*u* being a stochastic variable as well, where \(u\sim N(0,\sigma ^2_u)\) and \(v \sim N(0,\sigma ^2_v)\) and with the explicit condition that \(u > 0\).

*u* and *v* are conveniently considered independent. This enables us to find the marginal density of \(\epsilon\), namely:

*u* and *v* and that *u* and *v* are intertwined by this conditional nature.^{8} Note further that an estimate of the technical efficiency can now be obtained by finding the distribution of \(f(u|\epsilon )\).

Obviously, estimation of this model with ordinary least squares regression creates a bias because of the simultaneous appearance of the two stochastic variables with one being truncated at zero. The traditional estimation procedure uses a likelihood procedure based on the density in Eq. (3). However, introducing a more complex error structure in specification (2) is rather cumbersome and not very intuitive. The next subsection proposes therefore an alternative specification and corresponding estimation procedure, which is more straightforward to adapt.
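For reference, the normal/half-normal composed error has a well-known closed-form marginal density. The display below is a sketch in the notation that is common in the stochastic frontier literature following Aigner et al. (1977); the symbols \(\sigma\) and \(\gamma\) are introduced here for illustration only, and the parametrization need not coincide with that of Eq. (3). It assumes \(\epsilon = v - u\), with *u* half-normal with scale \(\sigma _u\), *v* normal with standard deviation \(\sigma _v\), and *u* and *v* independent:

$$\begin{aligned} f(\epsilon ) = \frac{2}{\sigma }\,\phi \left( \frac{\epsilon }{\sigma }\right) \Phi \left( -\frac{\gamma \,\epsilon }{\sigma }\right) , \qquad \sigma ^2 = \sigma ^2_u + \sigma ^2_v, \qquad \gamma = \frac{\sigma _u}{\sigma _v}, \end{aligned}$$

where \(\phi\) and \(\Phi\) denote the standard normal density and distribution function. Under these assumptions, \(\epsilon /\sigma\) has a density of the form \(2\phi (z)\Phi (\alpha z)\) with \(\alpha = -\gamma < 0\), so the composed error of a production frontier is negatively skewed.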

### 2.2 A skew-normal approach

*N*(0, 1) variables, and \(\delta \in (-1,1)\). Here, the stochastic variable \(\epsilon\) is generated by means of convolution.

^{9} The parameter \(\alpha\) in density (6) is a skewness parameter and determines the shape of the density function.
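The convolution mentioned above can be illustrated with a short simulation. The sketch below assumes the usual construction from the statistical literature, in which a standard skew-normal variable is obtained as \(\delta |U_0| + \sqrt{1-\delta ^2}\,U_1\) with \(U_0\) and \(U_1\) independent *N*(0, 1) variables; the function name `draw_skew_normal`, the seed, and the sample size are illustrative choices and not taken from the paper.

```python
import numpy as np


def draw_skew_normal(delta, size, rng):
    """Draw standard skew-normal variates as delta*|U0| + sqrt(1 - delta^2)*U1."""
    if not -1.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between -1 and 1")
    u0 = rng.standard_normal(size)  # enters only through its absolute value
    u1 = rng.standard_normal(size)  # symmetric noise component
    return delta * np.abs(u0) + np.sqrt(1.0 - delta**2) * u1


rng = np.random.default_rng(12345)  # arbitrary seed, for reproducibility only
delta = -0.8
z = draw_skew_normal(delta, 200_000, rng)

# Known first two moments of the standard skew-normal distribution.
mean_theory = delta * np.sqrt(2.0 / np.pi)
var_theory = 1.0 - 2.0 * delta**2 / np.pi

print(f"mean: simulated {z.mean():.3f}, theoretical {mean_theory:.3f}")
print(f"var:  simulated {z.var():.3f}, theoretical {var_theory:.3f}")
```

With a negative \(\delta\), the half-normal component pulls the draws below zero on average, which is the role played by the inefficiency term in a production frontier.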

Density (6) is shown in Fig. 3 for some values of the parameter \(\alpha\). When \(\alpha\) is positive, the density is skewed to the right, and when it is negative, it is skewed to the left. When \(\alpha\) is zero, the density becomes a standard normal density function, and if \(\alpha \rightarrow \infty (-\infty )\), then the density converges to the half-normal density: \(2\phi (z)\) for \(z \ge 0 (\le 0)\).
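These limiting cases can be checked numerically. The following sketch assumes that density (6) is the standard skew-normal density \(2\phi (z)\Phi (\alpha z)\), which is the form used in the statistical literature; the helper `skew_normal_pdf` and the grid are illustrative, and `scipy.stats.skewnorm` is used only as an independent cross-check.

```python
import numpy as np
from scipy.stats import norm, skewnorm


def skew_normal_pdf(z, alpha):
    """Standard skew-normal density 2*phi(z)*Phi(alpha*z)."""
    z = np.asarray(z, dtype=float)
    return 2.0 * norm.pdf(z) * norm.cdf(alpha * z)


grid = np.linspace(-4.0, 4.0, 81)

# alpha = 0 reproduces the standard normal density.
same_as_normal = np.allclose(skew_normal_pdf(grid, 0.0), norm.pdf(grid))

# The hand-written formula agrees with SciPy's skew-normal implementation.
same_as_scipy = np.allclose(skew_normal_pdf(grid, -1.5), skewnorm.pdf(grid, -1.5))

# For a very large alpha the density approaches 2*phi(z) on z > 0
# and vanishes on z < 0 (points close to zero are left out of the check).
right = grid > 0.05
left = grid < -0.05
near_half_normal = np.allclose(
    skew_normal_pdf(grid[right], 200.0), 2.0 * norm.pdf(grid[right])
)
near_zero_left = bool(np.all(skew_normal_pdf(grid[left], 200.0) < 1e-12))

print(same_as_normal, same_as_scipy, near_half_normal, near_zero_left)
```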

*u* and *v* which is implicit in specification (2). Note that specification (2) only holds when \(\delta < 0\). I do not explicitly impose this condition on the model, but choose to leave this as an empirical test.

*i*th observation of the vector \(\epsilon\).

Skew-normal distributions are not much used in econometrics, but for this purpose, they will do very nicely.^{10} They allow us to use a single error term instead of a composite one, which has some benefits (such as clarity) when working with multivariate distributions. Moreover, the interpretation of the parameters seems as well more intuitive (using scale, location, and skewness parameters). A disadvantage of using skew-normal distributions is the need to use a re-parametrization of the parameters in order to estimate them properly.
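To make the single-error-term idea and the need for a re-parametrization concrete, the sketch below fits a linear frontier with a skew-normal error by maximum likelihood. It is a minimal illustration under stated assumptions and not the estimation code behind the results in this paper: the error is taken to be \(\sigma\) times a standard skew-normal variable with location zero, \(\sigma\) is estimated on the log scale, and \(\delta\) is estimated through \(\delta = \tanh (t)\) so that the optimizer works on unconstrained parameters. This simple transformation is not the centred parametrization discussed in the statistical literature. All function names, the seed, and the data-generating values are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def neg_loglik(theta, y, X):
    """Negative log-likelihood of y = X beta + eps, eps/sigma ~ skew-normal(alpha)."""
    k = X.shape[1]
    beta, log_sigma, t = theta[:k], theta[k], theta[k + 1]
    alpha = np.sinh(t)  # equals delta / sqrt(1 - delta^2) when delta = tanh(t)
    z = (y - X @ beta) / np.exp(log_sigma)
    loglik = np.log(2.0) - log_sigma + norm.logpdf(z) + norm.logcdf(alpha * z)
    return -np.sum(loglik)


def fit_skew_normal_frontier(y, X):
    """Maximize the likelihood over (beta, log sigma, atanh delta)."""
    k = X.shape[1]
    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta_ols
    # Start away from t = 0 (alpha = 0), where the likelihood is known
    # to have a stationary point.
    start = np.concatenate([beta_ols, [np.log(resid.std()), -0.5]])
    res = minimize(neg_loglik, start, args=(y, X), method="BFGS")
    return {
        "beta": res.x[:k],
        "sigma": float(np.exp(res.x[k])),
        "delta": float(np.tanh(res.x[k + 1])),
        "loglik": float(-res.fun),
    }


# Illustrative data; every number below is an assumption made for this sketch.
rng = np.random.default_rng(2024)
n, delta_true, sigma_true = 5000, -0.8, 0.3
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
eps = sigma_true * (
    delta_true * np.abs(rng.standard_normal(n))
    + np.sqrt(1.0 - delta_true**2) * rng.standard_normal(n)
)
y = X @ np.array([2.0, 0.35, 0.65]) + eps

fit = fit_skew_normal_frontier(y, X)
print(fit)
```

Working with \(\log \sigma\) and \(\tanh ^{-1}\delta\) keeps \(\sigma > 0\) and \(\delta \in (-1,1)\) without explicit constraints; the estimates are transformed back before they are reported.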

The next subsection introduces a spatial variant of this distribution function and applies it to both spatial lag and spatial error models.

### 2.3 Spatial dependence in stochastic production frontier models

^{11}, namely being a normal distribution with mean and variance equal to:

## 3 Simulation

*u* and *v* 1000 times, where I vary the number of observations (that is, the length of these vectors) as well (with lengths 250, 1000, and 10,000), and then estimate the model parameters. Thereafter, I calculate the mean and standard deviation of each parameter’s sampling distribution, which allows me to infer consistency and efficiency for each of the model parameters.
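The bookkeeping of such a Monte Carlo exercise can be sketched as follows. To keep the example short, ordinary least squares is used as a stand-in estimator, so this is not the skew-normal likelihood estimator evaluated in the tables. The design values (constant 2, capital coefficient 0.35, employment coefficient 0.65, \(\sigma = 0.3\)) are assumptions read off from the values the reported estimates appear to converge to, and the standard normal regressors, the seed, and the number of replications are hypothetical.

```python
import numpy as np


def simulate_frontier(n, delta, rng, beta=(2.0, 0.35, 0.65), sigma=0.3):
    """One sample from a log-linear frontier with a skew-normal error."""
    X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
    u = np.abs(rng.standard_normal(n))  # half-normal component
    v = rng.standard_normal(n)          # symmetric noise component
    eps = sigma * (delta * u + np.sqrt(1.0 - delta**2) * v)
    return X @ np.asarray(beta) + eps, X


def monte_carlo(n, delta, reps, seed=0):
    """Mean and standard deviation of the OLS estimates over replications."""
    rng = np.random.default_rng(seed)
    estimates = np.empty((reps, 3))
    for r in range(reps):
        y, X = simulate_frontier(n, delta, rng)
        estimates[r] = np.linalg.lstsq(X, y, rcond=None)[0]
    return estimates.mean(axis=0), estimates.std(axis=0, ddof=1)


results = {n: monte_carlo(n, delta=-0.8, reps=200) for n in (250, 1000)}
for n, (mean, sd) in results.items():
    print(n, np.round(mean, 3), np.round(sd, 3))

# With a skewed error, the OLS constant is centred on the true constant
# plus E[eps] = sigma * delta * sqrt(2 / pi), not on the true constant.
shifted_constant = 2.0 + 0.3 * (-0.8) * np.sqrt(2.0 / np.pi)
print(round(shifted_constant, 3))
```

In this stand-in, the slope estimates are centred on their design values while the constant absorbs the non-zero mean of the error, which illustrates why the frontier parameters have to be estimated with a likelihood that accounts for the skewness.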

Mean and standard deviation (between parentheses) of frontier model estimation results for various \(\delta\)’s and number of observations

| Obs. | Variable | Simulated \(\delta = -0.2\) | Simulated \(\delta = -0.5\) | Simulated \(\delta = -0.8\) |
|---|---|---|---|---|
| 250 | Constant | 2.05 (0.29) | 2.00 (0.28) | 1.98 (0.22) |
| | \(\ln\)(Capital) | 0.35 (0.03) | 0.35 (0.03) | 0.35 (0.02) |
| | \(\ln\)(Employment) | 0.65 (0.03) | 0.65 (0.03) | 0.65 (0.03) |
| | \({\hat{\sigma }}\) | 0.32 (0.04) | 0.31 (0.04) | 0.29 (0.03) |
| | \({\hat{\delta }}\) | −0.35 (0.32) | −0.41 (0.32) | −0.73 (0.21) |
| 1000 | Constant | 2.03 (0.16) | 1.98 (0.14) | 1.99 (0.11) |
| | \(\ln\)(Capital) | 0.35 (0.02) | 0.35 (0.02) | 0.35 (0.01) |
| | \(\ln\)(Employment) | 0.65 (0.02) | 0.65 (0.02) | 0.65 (0.01) |
| | \({\hat{\sigma }}\) | 0.31 (0.02) | 0.30 (0.02) | 0.30 (0.02) |
| | \({\hat{\delta }}\) | −0.31 (0.26) | −0.41 (0.25) | −0.78 (0.08) |
| 10,000 | Constant | 2.03 (0.06) | 1.99 (0.05) | 2.00 (0.04) |
| | \(\ln\)(Capital) | 0.35 (0.01) | 0.35 (0.00) | 0.35 (0.00) |
| | \(\ln\)(Employment) | 0.65 (0.01) | 0.65 (0.00) | 0.65 (0.00) |
| | \({\hat{\sigma }}\) | 0.31 (0.01) | 0.30 (0.01) | 0.30 (0.01) |
| | \({\hat{\delta }}\) | −0.29 (0.15) | −0.47 (0.11) | −0.79 (0.05) |

Table 1 gives the results of a simulation exercise with only a frontier model. Here, the number of observations (250, 1000, 10,000) is varied as well as the simulated value of \(\delta\) (\(-\,0.2\), \(-\,0.5\), \(-\,0.8\)). All parameters, except for \({\hat{\delta }}\), behave as expected and in line with theory: they converge to their true values as the sample size gets bigger. \({\hat{\delta }}\) also converges to its true value, but only for large sample sizes and for true \(\delta\)’s that are large in absolute value. Moreover, its standard deviation is much larger than that of the other parameters, making this parameter relatively imprecise to estimate.

*W*, from the empirical application (see Sect. 4).^{12} So, the model I now estimate is the same as model (15), but now I have in addition the following specification for the error term \(\mu\):

*u* and *v* 100 times.
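A draw from a spatially dependent error of this kind can be sketched as follows. The sketch assumes the standard spatial error process \(\mu = \lambda W\mu + \epsilon\), so that \(\mu = (I - \lambda W)^{-1}\epsilon\), with skew-normal innovations \(\epsilon\); whether this matches the exact specification used in the simulations cannot be confirmed from the text here. The ring-shaped weight matrix is a hypothetical stand-in for the empirical four-nearest-neighbour matrix, and the function names, seed, and parameter values are illustrative.

```python
import numpy as np


def ring_weights(n, k=4):
    """Row-standardized stand-in W: each unit's k nearest neighbours on a ring."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    for step in range(1, k // 2 + 1):
        W[idx, (idx + step) % n] = 1.0
        W[idx, (idx - step) % n] = 1.0
    return W / W.sum(axis=1, keepdims=True)


def draw_spatial_error(W, lam, delta, sigma, rng):
    """Solve (I - lam*W) mu = eps for mu, with skew-normal innovations eps."""
    n = W.shape[0]
    u = np.abs(rng.standard_normal(n))
    v = rng.standard_normal(n)
    eps = sigma * (delta * u + np.sqrt(1.0 - delta**2) * v)
    mu = np.linalg.solve(np.eye(n) - lam * W, eps)
    return mu, eps


rng = np.random.default_rng(7)  # arbitrary seed
W = ring_weights(256)           # 256 units, as in the empirical sample size
lam = 0.5
mu, eps = draw_spatial_error(W, lam=lam, delta=-0.8, sigma=0.3, rng=rng)

# The draw satisfies the defining relation mu = lam * W mu + eps.
print(np.allclose(mu - lam * W @ mu, eps))
```

Solving the linear system instead of explicitly inverting \(I - \lambda W\) is a common design choice; for a row-standardized *W* and \(|\lambda | < 1\) the system has a unique solution.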

Mean and standard deviation (between parentheses) of a frontier model estimation results with a spatial error structure for various \(\delta\)’s and \(\lambda\)’s

| Simulated \(\lambda\) | Variable | Simulated \(\delta = -0.2\) | Simulated \(\delta = -0.5\) | Simulated \(\delta = -0.8\) |
|---|---|---|---|---|
| \(\lambda = 0.2\) | Constant | 2.03 (0.33) | 1.96 (0.21) | 1.96 (0.27) |
| | \(\ln\)(Capital) | 0.35 (0.03) | 0.35 (0.03) | 0.35 (0.02) |
| | \(\ln\)(Employment) | 0.65 (0.03) | 0.65 (0.03) | 0.65 (0.02) |
| | \({\hat{\lambda }}\) | 0.17 (0.09) | 0.18 (0.09) | 0.18 (0.09) |
| | \({\hat{\sigma }}\) | 0.32 (0.04) | 0.30 (0.04) | 0.29 (0.04) |
| | \({\hat{\delta }}\) | −0.27 (0.33) | −0.33 (0.35) | −0.68 (0.30) |
| \(\lambda = 0.5\) | Constant | 2.07 (0.36) | 1.96 (0.35) | 1.95 (0.28) |
| | \(\ln\)(Capital) | 0.35 (0.03) | 0.35 (0.03) | 0.35 (0.02) |
| | \(\ln\)(Employment) | 0.65 (0.03) | 0.65 (0.03) | 0.65 (0.02) |
| | \({\hat{\lambda }}\) | 0.48 (0.07) | 0.48 (0.07) | 0.48 (0.07) |
| | \({\hat{\sigma }}\) | 0.32 (0.04) | 0.31 (0.04) | 0.30 (0.04) |
| | \({\hat{\delta }}\) | −0.31 (0.32) | −0.41 (0.32) | −0.73 (0.22) |
| \(\lambda = 0.8\) | Constant | 2.17 (0.59) | 1.89 (0.57) | 1.88 (0.44) |
| | \(\ln\)(Capital) | 0.35 (0.03) | 0.35 (0.03) | 0.35 (0.02) |
| | \(\ln\)(Employment) | 0.65 (0.03) | 0.65 (0.03) | 0.65 (0.02) |
| | \({\hat{\lambda }}\) | 0.78 (0.04) | 0.78 (0.04) | 0.79 (0.06) |
| | \({\hat{\sigma }}\) | 0.32 (0.04) | 0.31 (0.04) | 0.30 (0.04) |
| | \({\hat{\delta }}\) | −0.33 (0.32) | −0.42 (0.32) | −0.74 (0.20) |

Table 2 presents the simulation results of the corresponding frontier model with a spatial error structure. In line with Table 1, it is clear that with small sample sizes (in this case 256), the parameter measuring technical efficiencies (\(\alpha\)) is not very precisely estimated, whereas all other parameters are very close to their true values. When the true \(\delta\) is closer to one in absolute value or, to a lesser extent, when \(\lambda\) gets higher, estimation becomes slightly more efficient, but not by much.

## 4 Empirical application: the efficiency of European regions

In this section, I apply the concept of spatial stochastic frontiers to European NUTS-2 regional production functions. The next subsection first describes concisely the data, and the subsequent subsection gives the estimation results.

### 4.1 Data and specification

NUTS-2 (Nomenclature of Units for Territorial Statistics) is a geocode standard for referencing the subdivisions of European countries for statistical purposes, where the addition 2 stands for the geographical level of more or less provinces. I use two databases. For labour, I use the European regional database by Cambridge Econometrics: a database containing detailed sectoral information about the regional provision of labour (see Cambridge Econometrics 2015). For regional gross value added and capital, I adopt the supply and use tables as used previously in Thissen et al. (2016) and explained in detail in Thissen and Diodato (2013a, b). This allows us to deal with one of the prevailing data problems in this literature: the calculation of the capital stock. Typically, this is done with a perpetual inventory method. However, this could be problematic, since shocks in the capital stocks (e.g. by deaths or migrations of firms) do not manifest themselves in the short run. Because there is information on regional value added of capital (\(V^K\)) across regions, sectors and years (so \(V^K_{r,s,t} = r_{r,s,t} K_{r,s,t}\)), I can circumvent this problem by using data on sector-specific interest rates for capital and thus calculate the capital stock per region, year, and sector (\(K_{r,s,t}\)).^{13}
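The capital stock calculation amounts to inverting the identity \(V^K_{r,s,t} = r_{r,s,t} K_{r,s,t}\). A minimal sketch of that arithmetic is given below; the function name `implied_capital_stock` and the numbers are hypothetical and do not come from the data used in this paper.

```python
import numpy as np


def implied_capital_stock(capital_value_added, interest_rate):
    """Back out K from V^K = r * K, element by element."""
    v_k = np.asarray(capital_value_added, dtype=float)
    r = np.asarray(interest_rate, dtype=float)
    if np.any(r <= 0):
        raise ValueError("interest rates must be strictly positive")
    return v_k / r


# Hypothetical values for two region-sector-year cells.
v_k = np.array([120.0, 45.0])  # value added of capital
r = np.array([0.05, 0.03])     # interest rate applied to each cell
capital = implied_capital_stock(v_k, r)
print(capital)
```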

To avoid idiosyncratic shocks, the data used are the mean over the period 2000–2010, and the economic sectors that they comprise are: agriculture, energy and manufacturing, construction, distribution market services, and non-market services. The countries included in the estimation can be seen in Fig. 1 and are basically all EU25 countries except for Romania and Bulgaria. The total number of NUTS-2 regions in the dataset is 256. Its geographical distribution is shown in Fig. 1.

To define the spatial weight matrix *W*, I use a *k*-nearest neighbour algorithm with \(k = 4\), where the *k*-nearest neighbours get a weight of 1. The weights of all other neighbours are set at 0. Finally, I row-standardize *W*.
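The construction of such a weight matrix can be sketched as follows. The sketch assumes that neighbours are ranked by Euclidean distance between region coordinates (for instance centroids), which is an assumption, since the distance measure is not specified here. The function name `knn_weight_matrix` and the random coordinates are hypothetical.

```python
import numpy as np


def knn_weight_matrix(coords, k=4):
    """Row-standardized k-nearest-neighbour weights from point coordinates."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a region is never its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0  # k neighbours get weight 1
    return W / W.sum(axis=1, keepdims=True)               # row-standardize


rng = np.random.default_rng(1)
coords = rng.uniform(size=(256, 2))  # hypothetical centroids for 256 regions
W = knn_weight_matrix(coords, k=4)

print(W.shape, W.sum(axis=1)[:3])
```

After row-standardization each of the four neighbours receives a weight of 0.25. Note that a *k*-nearest neighbour matrix is in general not symmetric: region *a* can be among the nearest neighbours of region *b* without the reverse being true.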

*Y* is gross value added, *L* is the number of workers multiplied by the average hours worked per week, *K* is the amount of capital, *r* is the region, and \(\epsilon\) is an error term that can be distributed normally or skew normally.

The next subsection provides the results for various sectors and specifications of the production function of (17).

### 4.2 Results

Estimation results for energy and manufacturing

| | OLS | Sp. error | Sp. lag | Frontier | Fr. error | Fr. lag |
|---|---|---|---|---|---|---|
| Constant | 2.40*** | 3.16*** | 0.46 | 2.50*** | 4.03*** | 0.51 |
| | (0.16) | (0.15) | (0.26) | (0.19) | (0.25) | (1.08) |
| \(\ln\)(Capital) | 0.77*** | 0.58*** | 0.70*** | 0.76*** | 0.58*** | 0.70*** |
| | (0.03) | (0.03) | (0.03) | (0.04) | (0.03) | (0.03) |
| \(\ln\)(Employment) | 0.30*** | 0.44*** | 0.32*** | 0.30*** | 0.44*** | 0.32*** |
| | (0.03) | (0.03) | (0.03) | (0.04) | (0.03) | (0.03) |
| \(\hat{\sigma }\) | 0.33*** | 0.22*** | 0.29*** | 0.33*** | 0.29*** | 0.29* |
| | (0.01) | (0.01) | (0.01) | (0.02) | (0.02) | (0.14) |
| \(\hat{\lambda }\) | | 0.79*** | | | 0.78*** | |
| | | (0.09) | | | (0.09) | |
| \(\hat{\rho }\) | | | 0.25*** | | | 0.12*** |
| | | | (0.03) | | | (0.01) |
| \(\hat{\alpha }\) | | | | −1.75*** | −1.41*** | −0.17 |
| | | | | (0.52) | (0.37) | (4.41) |
| Log Lik. | −80.10 | 92.28 | 41.45 | 11.03 | 94.86 | 41.45 |

^{14}

Estimation results for all sectors for frontier model with spatial error

| | Agric. | E and M | Constr. | Distri. | Serv. | NM Serv. |
|---|---|---|---|---|---|---|
| Constant | 2.43*** | 4.03*** | 2.88 | 2.02* | 2.05*** | 2.70*** |
| | (0.57) | (0.25) | (2.79) | (1.01) | (0.55) | (0.78) |
| \(\ln\)(Capital) | 0.68*** | 0.58*** | 0.59*** | 0.66*** | 0.71*** | 0.58*** |
| | (0.03) | (0.03) | (0.04) | (0.03) | (0.03) | (0.03) |
| \(\ln\)(Employment) | 0.21*** | 0.44*** | 0.40*** | 0.33*** | 0.29*** | 0.43*** |
| | (0.03) | (0.03) | (0.04) | (0.03) | (0.03) | (0.04) |
| \(\hat{\sigma }\) | 0.26*** | 0.29*** | 0.22*** | 0.18*** | 0.17*** | 0.18*** |
| | (0.01) | (0.03) | (0.01) | (0.01) | (0.01) | (0.01) |
| \(\hat{\alpha }\) | −0.04 | −1.41*** | −0.00 | −0.01 | −0.01 | −0.02 |
| | (1.15) | (0.37) | (3.90) | (1.56) | (1.57) | (1.03) |
| \(\hat{\lambda }\) | 0.57*** | 0.78*** | 0.75*** | 0.78*** | 0.59*** | 0.81*** |
| | (0.08) | (0.09) | (0.08) | (0.08) | (0.08) | (0.08) |
| Log Lik. | 62.92 | 94.86 | 91.66 | 144.50 | 164.88 | 136.06 |

## 5 In conclusion

The main aim of this paper is to introduce spatial dependence in stochastic production frontier analysis. I do so by using a skew-normal distribution function approach, which I argue (1) is straightforward to use, (2) is able to separate spatial dependence and technical efficiencies, and (3) produces consistent estimates. These results can be interpreted using the discussion on relative and absolute geographic location. The size of endowments and thus maximum attainable production are caused by a mixture of absolute geographic location and historical path dependence. The same holds for regional efficiencies, as it can be argued that they are mainly caused by institutions and social structures. However, spatial dependence measures the location within the network and could thus be a measure for the relative location. Central regions simply perform better because they have better access to production inputs and technology. When comparing regions’ performance, it would be fairer to control for the region’s location within the network.

Obviously, there is more to this because the technical efficiencies themselves, rather than the error structure as a whole, may be spatially dependent. (For instance, there may be spillovers in the adoption of new technology that improve efficiency, or there may be spatially concentrated institutions, such as former guilds or unions, that prohibit the adoption of new technologies.) In any case, when looking at the efficiency of regions, taking into account spatial dependence, whether in the inefficiency part or not, strongly affects the estimates of technical efficiencies in the energy and manufacturing sector.

Unfortunately, the parameter which governs technical efficiencies (in this case the skew-normal parameter \(\alpha\) or the \(\delta\) parameter in the traditional literature) is volatile when the parameter itself is small or when the number of observations is small (which is typically the case in spatial econometrics applications), whether in a spatial setting or not. The simulation exercise shows that this does not affect the other parameters, but that one should be careful in drawing strong conclusions when applying (spatial) frontier analyses with a small number of observations, such as when analysing European regional performance.

For our empirical application, when looking at the energy and manufacturing sector in European regions, taking spatial dependence into account controls more or less for the core-periphery pattern in Europe. Thus, regions in the periphery do not produce that inefficiently only because of their economic structure, but partly as well because of their location and the related diminished access to knowledge and information. Obviously, the estimations are restrictive regarding the data and specification I use. Ideally, one would like to model larger regional datasets, to test the alternative skew-normal approach to spatial stochastic frontier models. A viable avenue for further research would be to use regional panel data instead of cross-sectional data.

## Footnotes

- 1.
This question is part of a larger debate about the drivers of economic success, both on national and regional levels. Some scientists favour the proposition that regions prosper because of historical events and the associated path dependencies (e.g. Landes 1998), while others emphasize absolute (e.g. Diamond 1998) and relative (e.g. Fujita et al. 1999) locational advantages. Obviously, these different drivers require different policy instruments—if any at all.

- 2.
- 3.
Actually, both modelling approaches date back to one common source, namely Weinstein (1964).

- 4.
A similar line of reasoning could be held for minimizing cost functions. The remainder of this section deals with production functions, but note that the same arguments hold for costs functions as well.

- 5.
- 6.
- 7.
For practical purposes, such a production function is in its simplest form denoted as $$\begin{aligned} Y = AL^\alpha K^{1-\alpha } \end{aligned}$$ where *L* stands for labour, *K* for capital, and *A* for the level of technology (also known as: labour augmenting technology).

- 8.
A different way of denoting this is to observe that we are interested in the probability \(\pi (v|u^{\prime }>0)\) where \(u^{\prime }\) is now a normally distributed variable.

- 9.
- 10.
Interestingly, the statistics literature mentions stochastic production frontier models as a possible application of the skew-normal distribution function. However, this has not yet fully permeated the econometrics literature (the exception being Chen et al. 2014), although there is a nice R package that is able to deal with various forms of skew-normal and skew-t distribution functions, see http://pbil.univ-lyon1.fr/library/sn/html/sn.html. For more information about the skew-normal distribution function, see http://azzalini.stat.unipd.it/SN.

- 11.
They actually do this for the multivariate setting where both *u* and *v* and their correlation are multivariately distributed. Our case is a special case of their result.

- 12.
I deliberately choose a realistic weight matrix, as these types and sizes typically occur in the literature. Artificial weight matrices are slightly cumbersome to make, and one typically needs to resort to rook or queen matrices, as has been done in the simulations in Anselin and Florax (1995). Moreover, first-order contiguity does not really make sense, as we have islands and thus disjointed dependencies, and fully specified distance matrices have disadvantages as well (see LeSage and Pace 2009). Four-nearest-neighbour matrices, as used here, are common, but obviously one can vary the type of *W* matrix in the simulation as well. For reasons of conciseness, I refrain from this option.

- 13.
Unfortunately but not surprisingly, there are only country-specific interest rates instead of region-specific ones.

- 14.
This is most likely due to the number of observations. Alternative spatial dependence structures, different starting values, and centred parameter transformations as suggested by Azzalini (2005) all yield similar results.

## Notes

### Acknowledgements

First of all, I would like to dedicate this paper to the memory of my late colleague and mentor Raymond Florax who started working on spatial stochastic frontier models already 10 years ago. This paper was prepared for the 17th workshop on Spatial Econometrics and Statistics in Dijon 2018. I would like to thank Ferdinand Paraguas, Paul Elhorst and Henri de Groot for useful comments on earlier versions of this paper.

## References

- Aigner DJ, Lovell CAK, Schmidt P (1977) Formulation and estimation of stochastic production frontier models. J Econ 6(1):21–37
- Alvarez A (2007) Decomposing regional productivity growth using an aggregate production frontier. Ann Reg Sci 41(2):431–441
- Anselin L (1988) Spatial econometrics: methods and models, vol 4. Springer, Berlin
- Anselin L, Florax RJ (1995) Small sample properties of tests for spatial dependence in regression models: some further results. In: Anselin L, Florax RJGM (eds) New directions in spatial econometrics. Springer, Berlin, pp 21–74
- Areal FJ, Balcombe K, Tiffin R (2010) Integrating spatial dependence into stochastic frontier analysis. Aust J Agric Resour Econ 56:521–541
- Arellano-Valle RB, Azzalini A (2006) On the unification of families of skew-normal distributions. Scand J Stat 33(3):561–574
- Arellano-Valle RB, Azzalini A (2008) The centred parametrization for the multivariate skew-normal distribution. J Multivar Anal 99(7):1362–1382
- Azzalini A (1985) A class of distributions which includes the normal ones. Scand J Stat 12(2):171–178
- Azzalini A (2005) The skew-normal distribution and related multivariate families. Scand J Stat 32:159–188
- Azzalini A, Capitanio A (1999) Statistical application of the multivariate skew-normal distribution. J R Stat Soc 61(3):579–602
- Azzalini A, DallaValle A (1996) The multivariate skew-normal distribution. Biometrika 83(4):715–726
- Barrios EB, Ladado RF (2010) Spatial stochastic frontier models. Technical report, Philippine Institute for Development Studies, Discussion paper series no. 2010-08
- Basile R, Capello R, Caragliu A (2012) Technological interdependence and regional growth in Europe: proximity and synergy in knowledge spillovers. Pap Reg Sci 91(4):697–722
- Brock GJ (1999) Exploring a regional technical efficiency frontier in the former USSR. Econ Plan 32(1):23–44
- Cambridge Econometrics (2015) European regional data. Technical report, database information can be retrieved from Cambridge Econometrics. https://www.camecon.com/european-regional-data/. Accessed 20 Aug 2017
- Chen YY, Schmidt P, Wang HJ (2014) Consistent estimation of the fixed effects stochastic frontier model. J Econom 181(2):65–76
- Diamond JM (1998) Guns, germs and steel: a short history of everybody for the last 13,000 years. Random House, New York
- Dominguez-Molina JA, González-Farias G, Ramos-Quiroga R (2003) Skew-normality in stochastic frontier analysis. Technical report, Comunicacion tecnica no. I-03-18
- Dominguez-Molina JA, González-Farias G, Ramos-Quiroga R, Gupta AK (2007) A matrix variate closed skew-normal distribution with applications to stochastic frontier analysis. Commun Stat Theory Methods 36(9):1691–1703
- Driffield N, Munday M (2001) Foreign manufacturing, regional agglomeration and technical efficiency in UK industries: a stochastic production frontier approach. Reg Stud 35(5):391–399
- Favero CA, Papi L (1995) Technical efficiency and scale efficiency in the Italian banking sector: a non-parametric approach. Appl Econ 27(4):385–395
- Fujita M, Krugman P, Venables AJ (1999) The spatial economy: cities, regions, and economic trade. MIT Press, Cambridge
- Fusco E, Vidoli F (2013) Spatial stochastic frontier models: controlling spatial global and local heterogeneity. Int Rev Appl Econ 27(5):679–694
- Glass A, Kenjegalieva K, Paez-Farrell J (2013) Productivity growth decomposition using a spatial autoregressive frontier model. Econ Lett 119(3):291–295
- Glass AJ, Kenjegalieva K, Sickles RC (2016) A spatial autoregressive stochastic frontier model for panel data with asymmetric efficiency spillovers. J Econom 190(2):289–300
- Jiang L, Folmer H, Ji M, Tang J (2017) Energy efficiency in the Chinese provinces: a fixed effects stochastic frontier spatial Durbin error panel analysis. Ann Reg Sci 58(2):301–319
- Kinfu Y, Sawhney M (2015) Inefficiency, heterogeneity and spillover effects in maternal care in India: a spatial stochastic frontier analysis. BMC Health Serv Res 15(1):118
- Kumbhakar SC, Knox Lovell CA (2000) Stochastic frontier analysis. Cambridge University Press, Cambridge
- Kumbhakar SC, Tsionas EG (2006) Estimation of stochastic frontier production functions with input-oriented technical efficiency. J Econom 133(1):71–96
- Landes DS (1998) The wealth and poverty of nations: why some are so rich and some so poor. W.W. Norton and Co., New York
- LeSage JP, Pace RK (2009) Introduction to spatial econometrics. Chapman & Hall, Boca Raton
- Meeusen W, van den Broek J (1977) Efficiency estimation from Cobb–Douglas production functions with composed error. Int Econ Rev 18(2):435–444
- Otsuka A (2017) Regional determinants of total factor productivity in Japan: stochastic frontier analysis. Ann Reg Sci 58(3):579–596
- Pavlyuk D (2010) Regional tourism competition in the Baltic states: a spatial stochastic frontier approach. MPRA Paper 25052, University Library of Munich, Germany
- Puig-Junoy J (2001) Technical inefficiency and public capital in US states: a stochastic frontier approach. J Reg Sci 41(1):75–96
- Puig-Junoy J, Pinilla J (2008) Why are some Spanish regions so much more efficient than others? Environ Plan C Gov Policy 26(6):1129–1142
- Resti A (1997) Evaluating the cost-efficiency of the Italian banking system: what can be learned from the joint application of parametric and non-parametric techniques. J Bank Finance 21(2):221–250
- Rodríguez-Pose A, Crescenzi R (2008) Research and development, spillovers, innovation systems, and the genesis of regional growth in Europe. Reg Stud 42(1):51–67
- Schmidt AM, Moreira ARB, Helfand SM, Fonseca TCO (2009) Spatial stochastic frontier models: accounting for unobserved local determinants of inefficiency. J Prod Anal 31:101–112
- Thissen M, Diodato D (2013a) Trade between European NUTS2 regions from 2000 to 2010. Technical report, The PBL Netherlands Environmental Assessment Agency, The Hague
- Thissen M, Diodato D (2013b) Trade between European NUTS2 regions in 2000. Technical report, The PBL Netherlands Environmental Assessment Agency, The Hague
- Thissen M, de Graaff T, van Oort F (2016) Competitive network positions in trade and structural economic growth: a geographically weighted regression analysis for European regions. Pap Reg Sci 95(1):159–180
- Tsionas EG, Michaelides PG (2016) A spatial stochastic frontier model with spillovers: evidence for Italian regions. Scott J Polit Econ 63(3):243–257
- Vidoli F, Cardillo C, Fusco E, Canello J (2016) Spatial nonstationarity in the stochastic frontier model: an application to the Italian wine industry. Reg Sci Urb Econ 61:153–164
- Wang HJ, Ho CW (2010) Estimating fixed-effect panel stochastic frontier models by model transformation. J Econom 157(2):286–296
- Wang WS, Schmidt P (2009) On the distribution of estimated technical efficiency in stochastic frontier models. J Econom 148(1):36–45
- Weinstein MA (1964) The sum of values from a normal and a truncated normal distribution. Technometrics 6(1):104–105

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.