Abstract
In this paper we discuss goodness of fit tests for the distribution of technical inefficiency in stochastic frontier models. If we maintain the hypothesis that the assumed normal distribution for statistical noise is correct, the assumed distribution for technical inefficiency is testable. We show that a goodness of fit test can be based on the distribution of estimated technical efficiency, or equivalently on the distribution of the composed error term. We consider both the Pearson χ² test and the Kolmogorov–Smirnov test. We provide simulation results to show the extent to which the tests are reliable in finite samples.
References
Abadir KM, Magnus JR (2005) Matrix algebra (Econometric exercises, Volume 1). Cambridge University Press, Cambridge
Aigner DJ, Lovell CAK, Schmidt P (1977) Formulation and estimation of stochastic frontier production function models. J Econom 6:21–37
Bai J (2003) Testing parametric conditional distributions of dynamic models. Rev Econ Stat 85:531–549
Bera AK, Mallick NC (2002) Information matrix tests for the composed error frontier model. In: Balakrishnan N (ed) Advances on methodological and applied aspects of probability and statistics. Gordon and Breach Science Publishers, London
Chen Y-T, Wang H-J (2009) Centered-residuals-based moment estimator and test for stochastic frontier models. Unpublished manuscript, Academia Sinica
Coelli T (1995) Estimators and hypothesis tests for a stochastic frontier function. J Productivity Anal 6:247–265
Coelli T, Prasada Rao DS, O’Donnell CJ, Battese GE (2005) An introduction to efficiency and productivity analysis, 2nd edn. Springer, New York
Giné E, Zinn J (1990) Bootstrapping general empirical measures. Ann Probab 18:851–869
Greene WH (1980a) Maximum likelihood estimation of econometric frontier functions. J Econom 13:27–56
Greene WH (1980b) On the estimation of a flexible frontier production model. J Econom 13:101–115
Greene WH (1990) A gamma-distributed stochastic frontier model. J Econom 46:141–164
Greene WH (2008) Econometric analysis, 6th edn. Pearson Prentice Hall, Upper Saddle River
Hansen LP (1982) Large sample properties of generalized method of moments estimators. Econometrica 50:1029–1054
Heckman J (1984) The χ² goodness of fit for models estimated from microdata. Econometrica 52:1543–1548
Johnson NL, Kotz S (1970) Continuous univariate distributions–1. Houghton Mifflin, Boston
Jondrow J, Lovell CAK, Materov IS, Schmidt P (1982) On the estimation of technical efficiency in the stochastic frontier production function model. J Econom 19:233–238
Khmaladze EV (1981) Martingale approach to the theory of goodness of fit test. Theory Probab Appl 26:240–257
Khmaladze EV (1988) An innovation approach in goodness of fit tests in R^m. Ann Stat 16:1503–1516
Khmaladze EV (1993) Goodness of fit problem and scanning innovation martingales. Ann Stat 21:798–829
Kopp RJ, Mullahy J (1990) Moment-based estimation and testing of stochastic frontier models. J Econom 46:165–183
Lee L-F (1983) A test for distributional assumptions for the stochastic frontier functions. J Econom 22:245–267
Meeusen W, van den Broeck J (1977) Efficient estimation from Cobb-Douglas production functions with composed error. Int Econ Rev 18:435–444
Newey WK (1985) Maximum likelihood specification testing and conditional moment tests. Econometrica 53:1047–1070
Pitt MM, Lee LF (1981) The measurement and sources of technical inefficiency in the Indonesian weaving industry. J Dev Econ 9:43–64
Ruppert D, Carroll RJ (1980) Trimmed least squares estimation in the linear model. J Am Stat Assoc 75:828–838
Schmidt P, Lin T-F (1984) Simple tests for alternative specifications in stochastic frontier models. J Econom 24:349–361
Simar L, Wilson PW (2010) Inferences from cross-sectional stochastic frontier models. Econom Rev 29:62–98
Stevenson RE (1980) Likelihood functions for generalized stochastic frontier estimation. J Econom 13:57–66
Stute W, González Manteiga W, Presedo Quindimil M (1993) Bootstrap based goodness of fit tests. Metrika 40:243–256
Tallis GM (1983) Goodness of fit. In: Kotz S, Johnson NL (eds) Encyclopedia of statistical sciences, vol 3. Wiley, New York, pp 451–461
Tauchen G (1985) Diagnostic testing and evaluation of maximum likelihood models. J Econom 30:415–444
Waldman D (1982) A stationary point for the stochastic frontier likelihood. J Econom 18:275–279
Wang WS, Schmidt P (2009) On the distribution of estimated technical efficiency in stochastic frontier models. J Econom 148:36–45
White H (1982) Maximum likelihood estimation of misspecified models. Econometrica 50:1–16
Zellner A, Revankar N (1970) Generalized production functions. Rev Econ Stud 37:241–250
Appendices
Appendix A
In this Appendix we establish Eq. 8 of the text. We write \( \bar{g}(\theta_{0}) = P - \hat{P} \), where \( P \) is the (k − 1)-dimensional vector with jth element \( p_{j} = p_{j}(\theta_{0}) \) and \( \hat{P} \) is the (k − 1)-dimensional vector with jth element \( \hat{p}_{j} = O_{j}/n \). Also we write \( V(\theta_{0}) = \Pi - PP^{\prime} \), where \( \Pi \) is the diagonal matrix with jth diagonal element equal to \( p_{j} \). Now we use the fact (e.g. Abadir and Magnus (2005), p. 87) that
\[ (\Pi - PP^{\prime})^{-1} = \Pi^{-1} + \frac{\Pi^{-1} P P^{\prime} \Pi^{-1}}{1 - P^{\prime} \Pi^{-1} P}. \]
Therefore
\[ n\bar{g}(\theta_{0})^{\prime} V(\theta_{0})^{-1} \bar{g}(\theta_{0}) = n(\hat{P} - P)^{\prime} \Pi^{-1} (\hat{P} - P) + \frac{n\left[ (\hat{P} - P)^{\prime} \Pi^{-1} P \right]^{2}}{1 - P^{\prime} \Pi^{-1} P}. \tag{20} \]
The first term on the right hand side of (20) equals \( n\sum\nolimits_{j = 1}^{k - 1} (\hat{p}_{j} - p_{j})^{2}/p_{j} = \sum\nolimits_{j = 1}^{k - 1} (O_{j} - E_{j})^{2}/E_{j} \), where \( E_{j} = np_{j} \). For the second term, note that \( 1 - P^{\prime} \Pi^{-1} P = 1 - \sum\nolimits_{j = 1}^{k - 1} p_{j} = p_{k} \) and that \( (\hat{P} - P)^{\prime} \Pi^{-1} P = (\hat{P} - P)^{\prime} e_{k - 1} = [(1 - \hat{p}_{k}) - (1 - p_{k})] = (p_{k} - \hat{p}_{k}) \), where \( e_{k-1} \) is a vector of dimension (k − 1) with each element equal to one. Therefore \( n\bar{g}(\theta_{0})^{\prime} V(\theta_{0})^{-1} \bar{g}(\theta_{0}) = \sum\nolimits_{j = 1}^{k - 1} (O_{j} - E_{j})^{2}/E_{j} + n(p_{k} - \hat{p}_{k})^{2}/p_{k} = \sum\nolimits_{j = 1}^{k} (O_{j} - E_{j})^{2}/E_{j} \).
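The identity established in this Appendix can be checked numerically. The following is a minimal sketch, not the authors' code: the cell probabilities and observed counts are made-up illustrative values, the function name is hypothetical, and θ₀ is treated as known so that the \( p_{j} \) are fixed numbers.

```python
import numpy as np

def quadratic_form_and_pearson(p, counts):
    """Compare n * g' V^{-1} g (first k-1 cells) with the k-cell Pearson statistic.

    p      : all k cell probabilities (assumed known, summing to one)
    counts : all k observed cell counts O_j
    """
    p = np.asarray(p, dtype=float)
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()

    P = p[:-1]                       # (k-1)-vector of probabilities
    P_hat = counts[:-1] / n          # (k-1)-vector of sample proportions O_j / n
    g_bar = P - P_hat                # g-bar(theta_0) = P - P-hat
    V = np.diag(P) - np.outer(P, P)  # V(theta_0) = Pi - P P'

    # Quadratic form, using a linear solve instead of an explicit inverse.
    quad = n * g_bar @ np.linalg.solve(V, g_bar)

    # Pearson statistic over all k cells, with E_j = n * p_j.
    expected = n * p
    pearson = np.sum((counts - expected) ** 2 / expected)
    return quad, pearson

# Illustrative (hypothetical) inputs: k = 4 cells, n = 100 observations.
quad, pearson = quadratic_form_and_pearson([0.1, 0.2, 0.3, 0.4], [12, 18, 33, 37])
print(quad, pearson)
```

The two printed numbers should agree up to floating-point error, which is the content of the identity. The sketch says nothing about the case where θ is estimated.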
Appendix B
In this Appendix we discuss the goodness of fit test based on quantiles and its relationship to the Pearson test based on actual and expected cell counts. Suppose that we pick (k − 1) probabilities \( 0 < p_{1} < p_{2} < \cdots < p_{k-1} < 1 \). Let the corresponding population quantiles be \( m_{1}(\theta) < m_{2}(\theta) < \cdots < m_{k-1}(\theta) \), so that \( P(y \le m_{j}(\theta)) = p_{j} \), and let the sample quantiles be \( \hat{m}_{1} \le \hat{m}_{2} \le \cdots \le \hat{m}_{k-1} \). The test then depends on \( \hat{m} - m \), the vector whose jth element equals \( \hat{m}_{j} - m_{j}(\theta) \), and the test statistic equals \( n(\hat{m} - m(\hat{\theta}))^{\prime} W (\hat{m} - m(\hat{\theta})) \) with an appropriate choice of W.
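To see how this compares to the CMT test, we note that \( \sqrt{n}(\hat{m}_{j} - m_{j}(\theta)) \) is asymptotically normal, and so it must be expressible as an average (plus an asymptotically negligible term). This is the “influence function representation,” which is given by:
\[ \sqrt{n}(\hat{m}_{j} - m_{j}(\theta)) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \psi_{j}(y_{i}, \theta) + o_{p}(1), \]
where \( o_{p}(1) \) is an asymptotically negligible term (i.e., it converges in probability to zero), and where
\[ \psi_{j}(y, \theta) = -\frac{1(y \le m_{j}(\theta)) - p_{j}}{f(m_{j}(\theta))}, \]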
where f is the pdf of y. See, for example, Ruppert and Carroll (1980), p. 832. Therefore the test based on \( \hat{m} - m \) is equivalent in large samples to the CMT test based on the moment conditions \( E[1(y \le m_{j}(\theta)) - p_{j}] = 0,\ j = 1, 2, \ldots, k - 1 \). This is an overlapping set of cells. However, it is also equivalent to consider the non-overlapping cells \( A_{1} = \{ y \mid y \le m_{1}(\theta) \},\ A_{2} = \{ y \mid m_{1}(\theta) < y \le m_{2}(\theta) \} \), etc. The resulting test is the CMT test based on observed versus expected cell counts, as discussed in the text.
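The equivalence between overlapping and non-overlapping cells can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the authors' procedure: it takes y to be standard normal with θ known (so no correction for estimating θ is made), uses the variance of the indicator moments as the weighting matrix, and the function name, sample size and probabilities are all hypothetical choices.

```python
import numpy as np
from statistics import NormalDist

def overlapping_vs_cell_statistics(y, cutoffs, cum_probs):
    """Quadratic-form statistics from overlapping indicators and from disjoint cells."""
    y = np.asarray(y, dtype=float)
    m = np.asarray(cutoffs, dtype=float)
    p_cum = np.asarray(cum_probs, dtype=float)
    n = y.size

    # Overlapping moments: sample mean of 1(y <= m_j) minus p_j.
    g_cum = (y[:, None] <= m[None, :]).mean(axis=0) - p_cum
    # Covariance of the overlapping indicators: min(p_i, p_j) - p_i * p_j.
    V_cum = np.minimum.outer(p_cum, p_cum) - np.outer(p_cum, p_cum)
    stat_cum = n * g_cum @ np.linalg.solve(V_cum, g_cum)

    # Non-overlapping cells A_1, ..., A_{k-1}: successive differences.
    p_cell = np.diff(np.concatenate(([0.0], p_cum)))
    g_cell = np.diff(np.concatenate(([0.0], g_cum)))
    V_cell = np.diag(p_cell) - np.outer(p_cell, p_cell)
    stat_cell = n * g_cell @ np.linalg.solve(V_cell, g_cell)
    return stat_cum, stat_cell

# Hypothetical example: 500 standard normal draws, cut at the true quintiles.
rng = np.random.default_rng(0)
y = rng.standard_normal(500)
cum_probs = [0.2, 0.4, 0.6, 0.8]
cutoffs = [NormalDist().inv_cdf(p) for p in cum_probs]
stat_cum, stat_cell = overlapping_vs_cell_statistics(y, cutoffs, cum_probs)
print(stat_cum, stat_cell)
```

The two statistics coincide because the overlapping moments are a nonsingular linear transformation (cumulative sums) of the cell moments, and the quadratic form is invariant to such transformations when the weighting matrix is transformed accordingly.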
Appendix C
In this Appendix we derive analytically the variance matrix C used in the conditional moment test, for the case of a normal distribution. We wish to evaluate
Here s = s(y,θ) is the score function for the normal distribution, given by
and \( g = g(y, \theta) \) is the vector whose jth element equals \( 1(y \in A_{j}) - p_{j} \).
It is well known that \( C_{11} \) is the information matrix for the normal distribution, given by
Also \( C_{22} \) equals the matrix \( V(\theta) \) as defined in the discussion following Eq. 6 of the text.
This leaves the submatrix \( C_{12} \). It is of dimension 2 by (k − 1). We will evaluate in turn the (1,j) and (2,j) elements of this matrix. To do so we make the reasonable assumption that the cells are intervals, so that \( A_{j} = (a, b] \), where for notational simplicity we do not express the subscript “j” that should appear on a and b. Then element (1,j) of \( C_{12} \) equals
where φ is the standard normal density function. Here we have evaluated the conditional expectation \( E(y \mid a < y \le b) = \mu + \frac{\sigma}{p_{j}} \left[ \varphi\left( \frac{a - \mu}{\sigma} \right) - \varphi\left( \frac{b - \mu}{\sigma} \right) \right] \) as in Johnson and Kotz (1970), equation (79), p. 81.
Similarly element (2,j) of \( C_{12} \) equals
where \( E[y^{2} 1(y \in A_{j})] = p_{j}\, \mathrm{var}(y \mid y \in A_{j}) + p_{j} \left[ E(y \mid y \in A_{j}) \right]^{2} \). Furthermore, \( E[y^{2} 1(y \in A_{j})] = p_{j} \sigma^{2} \left\{ 1 - \frac{b\varphi(b) - a\varphi(a)}{\Phi(b) - \Phi(a)} - \left[ \frac{\varphi(b) - \varphi(a)}{\Phi(b) - \Phi(a)} \right]^{2} \right\} + p_{j} \left[ \mu - \sigma \frac{\varphi(b) - \varphi(a)}{\Phi(b) - \Phi(a)} \right]^{2} \), where Φ is the standard normal distribution function and, in this last expression only, a and b stand for the standardized endpoints (a − μ)/σ and (b − μ)/σ.
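The truncated-normal moments used in this Appendix can be checked against numerical integration. The following is a minimal sketch, not the authors' code: it assumes y ~ N(μ, σ²) and a single cell (a, b], and it assumes that the first element of the score is the derivative of the log density with respect to μ, namely (y − μ)/σ². All function names and parameter values are illustrative.

```python
from math import erf, exp, pi, sqrt

def phi(z):
    """Standard normal density."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def cell_moments(a, b, mu, sigma):
    """Closed forms for p_j, E[y 1(a<y<=b)] and E[y^2 1(a<y<=b)]."""
    alpha, beta = (a - mu) / sigma, (b - mu) / sigma   # standardized endpoints
    p = Phi(beta) - Phi(alpha)
    r = (phi(beta) - phi(alpha)) / p
    mean = mu - sigma * r                               # E(y | a < y <= b)
    var = sigma ** 2 * (1.0 - (beta * phi(beta) - alpha * phi(alpha)) / p - r ** 2)
    return p, p * mean, p * (var + mean ** 2)

def simpson(f, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

# Hypothetical parameter values and cell endpoints.
mu, sigma, a, b = 1.0, 2.0, 0.5, 3.0
density = lambda y: phi((y - mu) / sigma) / sigma

p_j, ey, ey2 = cell_moments(a, b, mu, sigma)
num_p = simpson(density, a, b)
num_ey = simpson(lambda y: y * density(y), a, b)
num_ey2 = simpson(lambda y: y * y * density(y), a, b)

# Element (1,j) of C_12 under the stated assumption on the score:
# E[(y - mu)/sigma^2 * (1(y in A_j) - p_j)] = E[(y - mu) 1(y in A_j)] / sigma^2.
c12_mu_closed = (phi((a - mu) / sigma) - phi((b - mu) / sigma)) / sigma
c12_mu_numeric = (num_ey - mu * num_p) / sigma ** 2
print(p_j, ey, ey2, c12_mu_closed, c12_mu_numeric)
```

The closed-form and numerically integrated quantities should agree to many decimal places. The second row of \( C_{12} \) is not computed here because it depends on whether the variance parameter is taken to be σ or σ².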
Cite this article
Wang, W.S., Amsler, C. & Schmidt, P. Goodness of fit tests in stochastic frontier models. J Prod Anal 35, 95–118 (2011). https://doi.org/10.1007/s11123-010-0188-9
DOI: https://doi.org/10.1007/s11123-010-0188-9