## Abstract

In this paper we extend the work of Simar (J Product Anal 28:183–201, 2007) on introducing noise in nonparametric frontier models. We develop an approach that synthesizes the best features of the two main methods in the estimation of production efficiency. Specifically, our approach first allows for statistical noise, similar to stochastic frontier analysis (and in an even more flexible way), and second, it allows modelling multiple-input, multiple-output technologies without imposing parametric assumptions on the production relationship, similar to nonparametric methods such as Data Envelopment Analysis (DEA) and Free Disposal Hull (FDH). The methodology is based on the theory of local maximum likelihood estimation and extends recent works of Kumbhakar et al. (J Econom 137(1):1–27, 2007) and Park et al. (J Econom 146:185–198, 2008). Our method is suitable for modelling and estimating the marginal effects on the inefficiency level jointly with the marginal effects of inputs. The approach is robust to heteroskedastic cases and to various (unknown) distributions of statistical noise and inefficiency, despite assuming simple anchorage models. The method also improves DEA/FDH estimators by making them quite robust to statistical noise and especially to outliers, which were the main problems of the original DEA/FDH estimators. The procedure performs well in various simulated cases and is also illustrated with some real data sets. Even in the single-output case, our simulated examples show that our stochastic DEA/FDH improves on the Kumbhakar et al. (J Econom 137(1):1–27, 2007) method by making the resulting frontier smoother, monotonic and, if we wish, concave.

## Notes

As an anonymous referee pointed out, this fact should be clearly stated: in partial order frontier models, the DGP does not allow for noise. If *m*(*n*) → ∞ or α(*n*) → 1 when *n* → ∞, the partial frontier estimators will converge to the FDH frontier. But in finite samples, if *m* is finite and α < 1, they will not envelop outliers and extreme noisy data points. So, in a sense, partial order frontier models implicitly handle random noise in the estimation process. But of course, if *m*(*n*) → ∞ or α(*n*) → 1 as *n* → ∞, there is no guarantee that the resulting robust estimator will not envelop some noisy data points.

Any other parametric model could be chosen as anchorage for the stochastic part of the model: in our simulations below the chosen parametrization turned out to be very flexible, providing good results even if the anchorage model was not the true one.

A natural choice for the grid of points could be the original data (*K* = *n*). Indeed, the noise in the radial direction does not influence the procedure, since the interest is to have points on the estimated frontier. We could alternatively select a grid of *K* values in polar coordinates (ω_{k}, η_{k}, *x*_{k}), where the (η_{k}, *x*_{k}) would be randomly chosen in the range of the observed (η_{i}, *X*_{i}). The values of ω_{k} are rather arbitrary and do not play any role in the procedure, so they could also be randomly chosen in the range of the observed (noisy) ω_{i}.

In all the simulated examples, we also produced plots of the various nonparametric estimators of the efficiency against the true efficiency scores. They are not reproduced here to save space. As expected, they confirm the qualitative comments coming from the pictures displayed below for the estimation of the frontier levels.

We would like to thank Paul W. Wilson who provided us this data set.

In the case of the estimation of a cost function where ɛ = *v* + *u*, we would obtain:

$$ g_\varepsilon(\varepsilon|Z=z)= {\frac{2} {\sigma(z)}}\, \varphi\left({\frac{\varepsilon} {\sigma(z)}}\right) \Upphi\left(\varepsilon {\frac{\nu(z)} {\sigma(z)}}\right) $$

## References

Aigner DJ, Lovell CAK, Schmidt P (1977) Formulation and estimation of stochastic frontier models. J Econom 6:21–37

Atkinson SE, Primont D (2002) Stochastic estimation of firm technology, and productivity growth using shadow cost and distance function. J Econom 108:203–225

Cazals C, Florens JP, Simar L (2002) Nonparametric frontier estimation: a robust approach. J Econom 106:1–25

Charnes A, Cooper WW, Rhodes E (1978) Measuring the efficiency of decision making units. Eur J Oper Res 2(6):429–444

Daouia A, Simar L (2007) Nonparametric efficiency analysis: a multivariate conditional quantile approach. J Econom 140:375–400

Daraio C, Simar L (2007) Advanced robust and nonparametric methods in efficiency analysis: methodology and applications. Springer, New York

Debreu G (1951) The coefficient of resource utilization. Econometrica 19(3):273–292

Deprins D, Simar L, Tulkens H (1984) Measuring labor inefficiency in post offices. In: Marchand M, Pestieau P, Tulkens H (eds) The performance of public enterprises: concepts and measurements. Amsterdam, North-Holland, pp 243–267

Fan J, Gijbels I (1996) Local polynomial modelling and its applications. Chapman and Hall, London

Farrell MJ (1957) The measurement of productive efficiency. J R Stat Soc Ser A 120(3):253–281

Greene WH (1990) A gamma-distributed stochastic frontier model. J Econom 46:141–163

Gstach D (1998) Another approach to data envelopment analysis in noisy environments: DEA+. J Product Anal 9:161–176

Hall P, Simar L (2002) Estimating a changepoint, boundary or frontier in the presence of observation error. J Am Stat Assoc 97:523–534

Jondrow J, Lovell CAK, Materov IS, Schmidt P (1982) On the estimation of technical inefficiency in stochastic frontier production models. J Econom 19:233–238

Kneip A, Park BU, Simar L (1998) A note on the convergence of nonparametric DEA estimators for production efficiency scores. Econom Theory 14:783–793

Kneip A, Simar L (1996) A general framework for frontier estimation with panel data. J Product Anal 7:187–212

Kneip A, Simar L, Wilson PW (2008) Asymptotics and consistent bootstraps for DEA estimators in non-parametric frontier models. Econom Theory 24:1663–1697

Kumbhakar SC, Park BU, Simar L, Tsionas EG (2007) Nonparametric stochastic frontiers: a local likelihood approach. J Econom 137(1):1–27

Land KC, Lovell CAK, Thore S (1993) Chance-constrained data envelopment analysis. Manage Decis Econ 14(6):541–554

Meeusen W, van den Broeck J (1977) Efficiency estimation from Cobb-Douglas production functions with composed error. Int Econ Rev 18:435–444

Olesen OB, Petersen NC (1995) Chance-constrained efficiency evaluation. Manage Sci 41:442–457

Park B, Simar L, Weiner Ch (2000) The FDH estimator for productivity efficiency scores: asymptotic properties. Econom Theory 16:855–877

Park B, Simar L, Zelenyuk V (2008) Local likelihood estimation of truncated regression and its partial derivatives: theory and application. J Econom 146:185–198

Ritter C, Simar L (1997) Pitfalls of Normal-Gamma stochastic frontier models. J Product Anal 8:167–182

Simar L (2007) How to improve the performances of DEA/FDH estimators in the presence of noise. J Product Anal 28:183–201

Simar L, Wilson PW (2008) Statistical inference in nonparametric frontier models: recent developments and perspectives. In: Fried H, Lovell CAK, Schmidt S (eds) The measurement of productive efficiency, 2nd edn. Oxford University Press, Oxford

Simar L, Wilson PW (2010) Inference from cross-sectional, stochastic frontier models. Econom Rev 29(1):62–98

Stevenson RE (1980) Likelihood functions for generalized stochastic frontier estimation. J Econom 13:57–66

## Acknowledgments

The authors acknowledge support from the “Interuniversity Attraction Pole”, Phase VI (No. P6/03) from the Belgian Science Policy. L. Simar also acknowledges support from the Chair of Excellency “Pierre de Fermat”, Région Midi-Pyrénées, Toulouse, France.

## Appendix

### 1.1 Local polynomial estimation

This appendix summarizes the main aspects of local maximum likelihood estimation using local polynomial approximation for the functional parameters of a model. The presentation of these techniques in a multiple regression framework with high order polynomials can be notationally very complex. In order to simplify the notation, in this appendix, we will use a general notation, independent of the preceding one, but we will give the correspondence when necessary.

We consider a general nonparametric stochastic frontier model which can be viewed as a nonparametric regression of a univariate dependent variable *Y* (in Sect. 2, \(Y{\equiv\log}\omega\)) on a *d*-dimensional variable *Z* (in Sect. 2, we had \(Z\equiv(\eta,X)\) with *d* = *p* + *q* − 1), where locally (for *Z* = *z*), the error term is the convolution of a positive random variable *u* (inefficiency) with a known distribution and a real random variable *v*, also with a known distribution and with zero mean, the parameters of these distributions being unknown functions of *z*. In its most general version the model can be presented as follows.

We observe a set of i.i.d. random variables (*Z*_{i}, *Y*_{i}) for *i* = 1, ..., *n* with \(Z_i \in {\mathbb{R}}^d\) and \(Y_i \in {\mathbb{R}}\), where

$$ Y_i = f(Z_i) - u_i + v_i \qquad (\text{A.1}) $$

for some unknown function *f*, where, conditionally on *Z* = *z*, *u* and *v* are independent and ɛ = − *u* + *v* has a known conditional continuous distribution *G*(·, τ(*z*)), where τ is some *k*-valued unknown function (in our setup τ will be defined as the pair of functions \((\sigma^2_u,\sigma^2_v)^T\), but more general structure could be analyzed, see below). The unknown functions *f* and τ are called the functional parameters of the model.

For ease of notation, we first introduce the simple case where the convolution parameters are constant and then generalize to models allowing for heteroskedasticity. Then we will customize the likelihood function for the model used in our simulations and in our real data examples.

#### 1.1.1 Convolution with constant parameters

In this model, conditionally on *Z* = *z*, ɛ = *v* − *u* has a continuous distribution *G*(·, τ), where τ represents the unknown parameters of the convolution, which are assumed to be constant. In other words, the conditional density of *Y* given *Z* = *z* equals

$$ \phi(y, f(z), \tau) = g_\varepsilon(y - f(z), \tau), $$

where \(g_\varepsilon(\varepsilon, \tau)=\partial G(\varepsilon, \tau)/\partial\varepsilon\). Our main interest here is estimation of the function *f* and eventually its derivatives. We now describe the local polynomial estimation of *f* in a general setting of multivariate *Z*.

Define ℓ = log ϕ. Then, the conditional log-likelihood equals \(\sum_{i=1}^n \ell(Y_i, f(Z_i), \tau)\). Let *z* be a point at which one wants to estimate the values of the function *f* and its derivatives. A local conditional log-likelihood is obtained by replacing *f* in the conditional log-likelihood by its *m*th order polynomial approximation in a neighborhood of *z* and putting the weight *K*_{h}(*Z*_{i} − *z*) on each observation (*Z*_{i}, *Y*_{i}), where *K*_{h}(*u*) = *h*^{−d}*K*(*h*^{−1}*u*), *K* is a *d*-variate kernel function, typically a symmetric density function defined on \({\mathbb{R}}^d\), and *h* is a positive scalar, called the bandwidth. Precisely, it is given by

$$ L_n(\theta_0, \theta_1, \ldots, \theta_{r(m)-1}, \tau; z) = \sum_{i=1}^n \ell\big(Y_i,\ \theta_0 + \theta_1 (Z_{i1}-z_1) + \cdots + \theta_{r(m)-1} (Z_{id}-z_d)^m,\ \tau\big)\, K_h(Z_i - z) \qquad (\text{A.2}) $$

where *r*(*m*) − 1 is the total number of partial derivatives up to order *m*, i.e., \(r(m)=\sum_{j=0}^m \left(\begin{array}{c}j+d-1\\d-1\end{array}\right)\). Here and below, \(Z_i \equiv (Z_{i1}, \ldots, Z_{id})^T\) and \(z \equiv (z_1, \ldots, z_d)^T\).
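The count *r*(*m*) grows quickly with the dimension *d*, which is one reason for the parsimony advocated later in this appendix. The following minimal sketch evaluates the formula for *r*(*m*) given above; the function name `n_local_coefficients` is an illustrative choice, not notation from the paper.

```python
from math import comb

def n_local_coefficients(m: int, d: int) -> int:
    """r(m): number of coefficients (intercept included) of an m-th order
    polynomial in d variables, r(m) = sum_{j=0}^m C(j + d - 1, d - 1)."""
    return sum(comb(j + d - 1, d - 1) for j in range(m + 1))

# With d = 2 covariates: local constant, local linear, local quadratic fits.
print([n_local_coefficients(m, 2) for m in (0, 1, 2)])  # [1, 3, 6]
```

For a local quadratic fit the count equals 1 + *d* + *d*(*d* + 1)/2, i.e. the intercept, the gradient, and the distinct second-order terms.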

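The *m*th order local polynomial estimators of *f* and its derivatives at *z* are obtained by maximizing *L*_{n}(θ_{0}, θ_{1}, ..., θ_{r(m)−1}, τ; *z*). For example, \(\hat{f}(z) = \hat{\theta}_0(z)\) and the estimator of \(f^{\prime}(z) \equiv [\partial f(z)/\partial z_1, \ldots, \partial f(z)/ \partial z_d]^T\) is given by \(\hat{f}^{\prime}(z) = [\hat{\theta}_1(z), \ldots, \hat{\theta}_d(z)]^T\), where

$$ \big(\hat{\theta}_0(z), \ldots, \hat{\theta}_{r(m)-1}(z), \tilde{\tau}(z)\big) = \arg\max_{\theta_0, \ldots, \theta_{r(m)-1}, \tau} L_n(\theta_0, \theta_1, \ldots, \theta_{r(m)-1}, \tau; z) \qquad (\text{A.3}) $$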

The estimator \(\tilde{\tau}\) is obtained locally in the above local polynomial estimation procedure, and thus depends on *z*. This can be improved by maximizing the full likelihood with *f* being replaced by its estimator \(\hat{f}\), i.e., a better estimator is given by

$$ \hat{\tau} = \arg\max_{\tau} \sum_{i=1}^n \ell(Y_i, \hat{f}(Z_i), \tau) \qquad (\text{A.4}) $$

One may further update the estimators \(\hat{\theta}_j(z)\) by maximizing, with respect to θ_{0}, θ_{1}, ..., θ_{r(m)−1} only, the likelihood \(L_n(\theta_0,\theta_1,\ldots,\theta_{r(m)-1}, \hat{\tau}; z)\), where τ on the right hand side of (A.3) is replaced by \(\hat{\tau}\).
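To make the structure of the local likelihood concrete, here is a minimal sketch of the kernel-weighted log-likelihood for the local linear case (*m* = 1). It is an illustration under stated assumptions rather than the authors' implementation: the product Gaussian kernel, the function names, and the plain normal log-density used as a stand-in for log *g*_ɛ are all hypothetical choices. In practice the returned value would be maximized over (θ_{0}, θ_{1}, τ) with a numerical optimizer at each evaluation point *z*.

```python
import math

def gaussian_product_kernel(u, h):
    """K_h(u) = h^{-d} K(u / h), with K a product of standard normal densities
    (an assumed kernel; any symmetric d-variate density could be used)."""
    k = 1.0
    for uj in u:
        t = uj / h
        k *= math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    return k / h ** len(u)

def normal_logpdf(eps, var):
    """Placeholder log-density for eps; NOT the composed-error density."""
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * eps * eps / var

def local_linear_loglik(theta0, theta1, tau, Z, Y, z, h, log_density):
    """Sum over i of log g(Y_i - theta0 - theta1'(Z_i - z), tau) * K_h(Z_i - z)."""
    total = 0.0
    for Zi, Yi in zip(Z, Y):
        diff = [a - b for a, b in zip(Zi, z)]
        f_local = theta0 + sum(t * c for t, c in zip(theta1, diff))
        total += log_density(Yi - f_local, tau) * gaussian_product_kernel(diff, h)
    return total

# Tiny usage example with d = 1 and three observations.
Z = [[0.0], [0.5], [1.0]]
Y = [1.0, 1.4, 2.1]
value = local_linear_loglik(1.0, [1.0], 0.25, Z, Y, z=[0.5], h=0.5,
                            log_density=normal_logpdf)
```

Observations far from *z* receive a weight close to zero, so the fit at *z* is driven by nearby data only.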

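In the particular case where quadratic local approximation is used to estimate *f*, the likelihood (A.2) can be written as

$$ L_n(\theta_0, {\varvec \theta_1}, {\varvec \Uptheta}, \tau; z) = \sum_{i=1}^n \ell\Big(Y_i,\ \theta_0 + {\varvec \theta_1}^T (Z_i - z) + \frac{1} {2}\,(Z_i - z)^T {\varvec \Uptheta} (Z_i - z),\ \tau\Big)\, K_h(Z_i - z) \qquad (\text{A.5}) $$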

where \({\varvec \theta_1}=(\theta_1,\ldots,\theta_d)^T\) and \({\varvec \Uptheta}={\varvec \Uptheta}^T\) represents the *d*(*d* + 1)/2 last elements of \({\varvec \theta}=(\theta_0,\theta_1,\ldots,\theta_{r(2)-1})\) associated with the quadratic terms but written here and below in a quadratic form, for the sake of notational simplicity. Here, after maximizing this likelihood relative to the full set of parameters (θ_{0}, θ_{1}, ..., θ_{r(2)−1}, τ), we have directly \(\hat{f}(z) = \hat{\theta}_0(z)\) and \(\hat{f}^{\prime}(z)= \hat{{\varvec \theta}}_1(z)\).

*Remark A.1*

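For practical computation of the maximum in (A.3), in order to avoid maximization under the constraint \({\varvec \Uptheta}={\varvec \Uptheta}^T\), we should rather use the explicit full expression

$$ \frac{1} {2} \sum_{j=1}^d \Theta_{jj} (Z_{ij}-z_j)^2 + \sum_{j=1}^{d} \sum_{k=j+1}^{d} \Theta_{jk} (Z_{ij}-z_j)(Z_{ik}-z_k) $$

in place of the quadratic form \(\frac{1} {2}\,(Z_{i}-z_{})^T {{\varvec \Uptheta}} (Z_{i}-z_{})\) in (A.5), where \(\Theta_{jk}\) denotes the (*j*, *k*) element of \({\varvec \Uptheta}\).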
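The idea of the remark, optimizing over the *d*(*d* + 1)/2 distinct entries of the symmetric matrix rather than over a full matrix under a symmetry constraint, can be sketched as follows. The storage convention (upper triangle, row by row) and the function name are assumptions made for this illustration.

```python
def quadratic_term(free_params, x):
    """Return 0.5 * x' Theta x for a symmetric Theta stored as its upper
    triangle: free_params lists Theta_jk for j <= k, row by row, so it has
    d(d+1)/2 entries and no symmetry constraint is needed in the optimizer."""
    d = len(x)
    total, idx = 0.0, 0
    for j in range(d):
        for k in range(j, d):
            # Each off-diagonal entry appears twice in x' Theta x, so the
            # factor 1/2 cancels; diagonal entries keep it.
            weight = 0.5 if j == k else 1.0
            total += weight * free_params[idx] * x[j] * x[k]
            idx += 1
    return total

# Theta = [[2, 1], [1, 4]] and x = Z_i - z = (1, 2):
print(quadratic_term([2.0, 1.0, 4.0], [1.0, 2.0]))  # 11.0
```

The value agrees with the direct computation 0.5 × (2·1 + 2·1·1·2 + 4·4) = 11.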

#### 1.1.2 Convolution with functional parameters

In this subsection we discuss the case where the shape parameter τ is also a smooth function of *z*. We know from previous experience in a similar context (see Park et al. 2008) that fitting locally constant shape parameters when the true shape parameters are not constant provides poor estimates of *f* and *f*′. It is also shown there that choosing a lower order polynomial for the variance functional τ than for *f* could jeopardize the accuracy of the estimation of the latter.

In addition, if partial derivatives of the functional parameter τ(*z*) are of interest, a quadratic local polynomial approximation for τ(*z*) should be useful. As pointed out in Kumbhakar et al. (2007), the orders of the polynomials for the functionals *f* and τ do not have to be the same. They can be adapted to the object of interest for the researcher (derivative or not), keeping in mind that higher orders involve notational and computational complexities. So parsimony is also advisable in this framework. We suggest the following strategy: if partial derivatives of some functional parameter are of interest, we choose a quadratic local polynomial for this parameter; otherwise, by the principle of parsimony, we choose local linear approximations.

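The presentation below is for the general case of an \(m_1^{\rm th}\) order local polynomial for the function *f* and an \(m_2^{\rm th}\) order local polynomial for the (vector-valued) function τ. When the *k*-valued shape parameter is also a functional parameter, the conditional log-likelihood is now given by \(\sum_{i=1}^n \ell(Y_i, f(Z_i), \tau(Z_i))\), where ℓ(*y*, ν, ω) = log *g*_{ɛ}(*y* − ν, ω). Hence we obtain the following local conditional log-likelihood:

$$ L_n(\theta_0, \ldots, \theta_{r(m_1)-1}, \tau_0, \ldots, \tau_{r(m_2)-1}; z) = \sum_{i=1}^n \ell\big(Y_i,\ \theta_0 + \theta_1 (Z_{i1}-z_1) + \cdots + \theta_{r(m_1)-1} (Z_{id}-z_d)^{m_1},\ \tau_0 + \tau_1 (Z_{i1}-z_1) + \cdots + \tau_{r(m_2)-1} (Z_{id}-z_d)^{m_2}\big)\, K_h(Z_i - z) $$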

where each \(\tau_j \in {\mathbb{R}}^k\).

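The local polynomial estimators of *f*, τ and their derivatives at *z* are obtained by maximizing *L*_{n}(θ_{0}, θ_{1}, ..., θ_{r(m_1)−1}, τ_{0}, ..., τ_{r(m_2)−1}; *z*), i.e.,

$$ \big(\hat{\theta}_0(z), \ldots, \hat{\theta}_{r(m_1)-1}(z), \hat{\tau}_0(z), \ldots, \hat{\tau}_{r(m_2)-1}(z)\big) = \arg\max_{\theta_0, \ldots, \theta_{r(m_1)-1}, \tau_0, \ldots, \tau_{r(m_2)-1}} L_n(\theta_0, \ldots, \theta_{r(m_1)-1}, \tau_0, \ldots, \tau_{r(m_2)-1}; z) $$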

As above \(\hat{f}(z) = \hat{\theta}_0(z), \hat{f}^{\prime}(z) = [\hat{\theta}_1(z), \ldots, \hat{\theta}_d(z)]^T\) and for the vector τ we have the *k*-dimensional vector \(\hat{\tau}(z) = \hat{\tau}_0(z)\) and the (*d* × *k*) matrix \(\hat{\tau}^{\prime}(z)=\widehat{{\frac{\partial \tau^T} {\partial z}}} = [\hat{\tau}_1(z), \ldots, \hat{\tau}_d(z)]^T\).

We know that under regularity conditions and appropriate bandwidth selection, these are consistent estimators, asymptotically normally distributed with a bias converging to zero when *n* → ∞ (see Kumbhakar et al. 2007 for the theoretical details). To summarize, if we choose, as we did, a bandwidth matrix *H* = *hS*^{1/2}, it can be shown by direct application of Theorem 2.2 in Kumbhakar et al. (2007) that, for linear approximation of all the functionals, the optimal bandwidth that balances bias and variance is of order \(h \propto n^{-1/(d+4)}\): this corresponds to a rate of convergence of the nonparametric estimators of the functionals of the order *n*^{2/(d+4)}. In our setup here, this means *n*^{2/(p+q+3)}, which can be compared to the DEA and FDH rates of convergence (see footnote 5).
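As a quick numerical illustration of these orders of magnitude, the sketch below evaluates the bandwidth order and the convergence rate for given *n*, *p* and *q*, using *d* = *p* + *q* − 1 as in Sect. 2. The proportionality constant of the bandwidth is set to one here, which is an assumption of the example; in applications it has to be chosen from the data.

```python
def local_linear_orders(n: int, p: int, q: int):
    """Return (d, h, rate) with d = p + q - 1, h = n^{-1/(d+4)} (constant set
    to 1 for illustration) and rate = n^{2/(d+4)} = n^{2/(p+q+3)}."""
    d = p + q - 1
    h = n ** (-1.0 / (d + 4))
    rate = n ** (2.0 / (d + 4))
    return d, h, rate

# One input and one output (d = 1) versus three inputs and two outputs (d = 4).
for p, q in [(1, 1), (3, 2)]:
    d, h, rate = local_linear_orders(1000, p, q)
    print(f"p={p}, q={q}, d={d}: h ~ {h:.3f}, rate ~ {rate:.1f}")
```

The rate deteriorates as *p* + *q* increases, the usual curse of dimensionality of nonparametric estimators.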

#### 1.1.3 The customized model used in the applications

We first have to select the distribution for the convolution term. As is well known for local MLE procedures, this can be viewed as selecting an anchorage parametric model for the convolution term; the localized version will encompass this anchorage model, ensuring much more flexibility, as shown e.g. in Kumbhakar et al. (2007), in Park et al. (2008) and in some of our examples in Sect. 3.

We will focus here on the two-parameter (*k* = 2) distribution *G*(·, τ) obtained from the Normal/Half-Normal convolution case as proposed in Aigner et al. (1977). To be explicit, we choose a local normal \({{\mathcal{N}}}(0,\sigma^2_v(z))\) for the noise *v* and a local half-normal \({{\mathcal{N}}}^+(0,\sigma^2_u(z))\) for the inefficiency *u*. An alternative could be the Normal/Exponential convolution also analyzed by Aigner et al. (1977), but to save space we will not present this case. Although theoretically valid, we also avoid one-sided distributions for the inefficiency term *u* that would be characterized by two free parameters (scale and shape), such as the Gamma (Greene 1990) or the truncated normal (Stevenson 1980), due to the identification issues raised by Ritter and Simar (1997). The local version of our model proved to be flexible enough to handle many different situations.

So our nonparametric localized model can be written, as in (A.1),

$$ Y = f(Z) - u + v, $$

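where \(\big(u | Z=z \big) \sim |{{\mathcal{N}}}(0,\sigma_u^2(z))|\) and \(\big(v | Z=z\big) \sim {{\mathcal{N}}}(0,\sigma_v^2(z))\), *u* and *v* being independent conditionally on *Z*. The conditional probability density function of ɛ = *v* − *u* is given by (see footnote 6)

$$ g_\varepsilon(\varepsilon|Z=z)= {\frac{2} {\sqrt{\sigma_u^2(z)+\sigma_v^2(z)}}}\, \varphi\left({\frac{\varepsilon} {\sqrt{\sigma_u^2(z)+\sigma_v^2(z)}}}\right) \Upphi\left(-\varepsilon\, {\frac{\sqrt{\sigma_u^2(z)/\sigma_v^2(z)}} {\sqrt{\sigma_u^2(z)+\sigma_v^2(z)}}}\right) $$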

where φ(·) and \(\Upphi(\cdot)\) are the pdf and cdf of a standard normal variable. Note that we keep the parametrization in terms of the functionals σ^{2}_{u} and σ^{2}_{v} because we could be interested in estimating their derivatives, but any other standard parametrization could be used (e.g. in terms of λ = σ_{u}/σ_{v} and σ^{2} = σ^{2}_{u} + σ^{2}_{v}).

As suggested in Kumbhakar et al. (2007), in order to avoid nonnegativity restrictions on the variance functions in the local polynomial approximations, we instead choose the following coordinate system for the shape parameters: \(\tau(z)=(\widetilde \sigma^2_u(z),\widetilde\sigma^2_v(z))^T\) where \(\widetilde\sigma_u^2(z)=\log(\sigma_u^2(z))\) and \(\widetilde\sigma_v^2(z)=\log(\sigma_v^2(z))\).
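The following minimal sketch evaluates the log-density of ɛ = *v* − *u* for the Normal/Half-Normal convolution, taking the log-variances as arguments so that no positivity constraint is needed. It uses the standard closed form of this density from the stochastic frontier literature, (2/σ) φ(ɛ/σ) Φ(−ɛλ/σ) with σ² = σ²_u + σ²_v and λ = σ_u/σ_v; the function names are illustrative and the code is not taken from the paper.

```python
import math

def std_normal_cdf(x):
    """Standard normal cdf computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def composed_error_logpdf(eps, log_var_u, log_var_v):
    """Log-density of eps = v - u with v ~ N(0, var_v), u ~ |N(0, var_u)|,
    parametrized by log-variances (any real values are admissible)."""
    var_u, var_v = math.exp(log_var_u), math.exp(log_var_v)
    sigma = math.sqrt(var_u + var_v)
    lam = math.sqrt(var_u / var_v)
    t = eps / sigma
    log_phi = -0.5 * t * t - 0.5 * math.log(2.0 * math.pi)
    # Note: math.log fails if the cdf underflows to 0 (very large lam * t);
    # a production version would use a log-cdf routine for that tail.
    return math.log(2.0 / sigma) + log_phi + math.log(std_normal_cdf(-lam * t))

# The density is skewed to the left: negative residuals are more likely.
print(composed_error_logpdf(-0.5, 0.0, 0.0) > composed_error_logpdf(0.5, 0.0, 0.0))  # True
```

When σ²_u tends to zero the expression reduces to the normal log-density of *v*, which gives a simple sanity check for any implementation.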

The conditional pdf of *Y* given *Z* is thus given by:

After some analytical manipulation, it is found that

where the constants have been eliminated. The conditional local log-likelihood is

Introducing the local polynomial approximations for the functional parameters *f*, \(\widetilde\sigma^2_u\) and \(\widetilde\sigma^2_v\), would follow the general presentation of section “Convolution with constant parameters” above. But since we are interested in the partial derivative of \(\sigma^2_u\) only, we will choose the parsimonious model with local linear approximations for *f* and \(\widetilde\sigma^2_v\) and a quadratic approximation for \(\widetilde\sigma^2_u\), at the point *z*. The conditional local log-likelihood is finally given by

where for the ease of notation we used the shortcuts ψ for the following expressions:

Here the local parameters are the three scalars \((\theta_0,\sigma^2_{u0},\sigma^2_{v0})\) plus the three (*d* × 1) vectors \({\varvec \theta_1},{\varvec \sigma_{u1}^2}\) and \({\varvec \sigma_{v1}^2}\), and the (*d* × *d*) symmetric matrix \({\varvec \Upsigma_{u2}}\) containing *d*(*d* + 1)/2 free parameters. In total, we have 3 (*d* + 1) + *d* (*d* + 1)/2 free local parameters.
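The size of the local optimization problem can be checked with a few lines of code. This is only a bookkeeping sketch of the count stated above (three local linear functionals plus one symmetric quadratic block); the function name is hypothetical.

```python
def n_free_local_parameters(d: int) -> int:
    """3 intercepts + 3 gradient vectors of length d + d(d+1)/2 distinct
    entries of the symmetric quadratic block for the log-variance of u."""
    return 3 * (d + 1) + d * (d + 1) // 2

print([n_free_local_parameters(d) for d in (1, 2, 3)])  # [7, 12, 18]
```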

The local polynomial estimator of the model is given by \((\widehat{\theta}_0,\widehat{\sigma}^2_{u0},\widehat{\sigma}^2_{v0})\) where

These estimators are local in the sense that the solution of this maximization problem depends on the value of *z*. In particular \(\widehat f(z)=\widehat{\theta}_0(z)\) and \(\widehat {\widetilde\sigma^2}_u(z)=\widehat\sigma^2_{u0}(z)\) and \(\widehat {\widetilde\sigma^2}_v(z)=\widehat\sigma^2_{v0}(z)\), where the “tildes” are used here as a reminder that we are estimating the log of the variance functions. In addition, \(\widehat {{\frac{\partial} {\partial z}}\widetilde\sigma^2}_u(z)=\widehat{\varvec \sigma}^2_{u1}(z)\). For practical computations, see Remark A.1.

If a quadratic approximation were needed for *f* and \(\widetilde\sigma^2_v\) too, we should add in (A.8) a quadratic term for these two elements. This would involve two additional symmetric matrices \({\varvec \Uptheta}\) and \({\varvec \Upsigma}_v\) that would appear in quadratic forms as we did above in (A.8) with \({\varvec \Upsigma}_u\). Since each symmetric matrix contains *d* (*d* + 1)/2 free parameters, this would add *d* (*d* + 1) free parameters to the optimization (A.9). Again, practical organization of the optimization should follow Remark A.1 for the 3 symmetric matrices involved.

Estimates of \(\sigma^2_u(z), \sigma^2_v(z)\) and \({\frac{\partial} {\partial z}}{\sigma^2}_u(z)\) in the original units can be directly obtained from their logarithmic versions, \(\widehat {\sigma^2}_u(z)= \exp\left( \widehat {\widetilde\sigma^2}_u(z)\right)\) and \(\widehat {\sigma^2}_v(z)= \exp\left( \widehat {\widetilde\sigma^2}_v(z)\right)\). For the derivatives we have \( \widehat {{\frac{\partial} {\partial z}}\sigma^2}_u(z)= \widehat {\sigma^2}_u(z)\,\widehat {{\frac{\partial} {\partial z}}\widetilde\sigma^2}_u(z) \).
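The back-transformation described in this paragraph amounts to an exponential and one application of the chain rule. The sketch below illustrates it for a single evaluation point *z*; the argument names are illustrative, and the inputs stand for the estimated log-variances and the estimated gradient of the log-variance of *u*.

```python
import math

def back_transform(log_var_u, log_var_v, grad_log_var_u):
    """Map estimates of the log-variances (and of the gradient of the
    log-variance of u) at a point z back to the original units."""
    var_u = math.exp(log_var_u)
    var_v = math.exp(log_var_v)
    # Chain rule: d var_u / dz = var_u * d log(var_u) / dz, component-wise.
    grad_var_u = [var_u * g for g in grad_log_var_u]
    return var_u, var_v, grad_var_u

var_u, var_v, grad = back_transform(math.log(0.25), math.log(0.04), [0.8, -0.2])
```

With these inputs the variances are 0.25 and 0.04 up to floating-point rounding, and the gradient is scaled by the estimated variance of *u*.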

## About this article

### Cite this article

Simar, L., Zelenyuk, V. Stochastic FDH/DEA estimators for frontier analysis.
*J Prod Anal* **36**, 1–20 (2011). https://doi.org/10.1007/s11123-010-0170-6
