Abstract
Heckman’s (Ann Econ Soc Meas 15:475–492, 1976; Econometrica 47(1):153–161, 1979) sample selection model has been employed in many linear and nonlinear regression applications. It is well known that ignoring sample selectivity may bias the estimates. Although the stochastic frontier (SF) model with sample selection has been investigated by Greene (J Product Anal 34:15–24, 2010), this paper extends the model in several directions. First, we extend the distribution of the inefficiency term from the half-normal to the truncated-normal distribution. Second, we discuss maximum likelihood estimation of the SF model with sample selection as well as its most common incarnation, endogenous switching. Third, we suggest a simple framework for deriving the closed form of the likelihood function using the closed skew-normal (CSN) distribution. Fourth, we propose an estimator of the technical efficiency index of Battese and Coelli (Empir Econ 20(2):325–332, 1995) that incorporates the sample selection information. Finally, we demonstrate the approach using data from the Taiwanese hotel industry.
Notes
The analytic approximation of \(v - u\) is derived by Lai and Huang (2013).
See Lemmas B.1 and B.2 in Appendix 2.
I thank William Greene for this suggestion.
It is also known as the Herfindahl–Hirschman index; see Hirschman (1964).
The customers are classified as either individual or group customers.
For more details on the variable definitions, see Lai (2013).
LR = 2 × [(15.0517 + 79.1114) − 69.4959] = 49.3344 ∼ χ²(13).
H0: ρ = 0. The LR statistic is 2 × (15.05 − 12.94) = 4.22 ∼ χ²(1).
The estimation result of the normal–half normal SF model with sample selection is available upon request from the author. The LR statistic is 2 × (15.0517 − 5.0347) = 20.034 ∼ χ²(5).
References
Battese GE, Coelli TJ (1995) A model for technical inefficiency effects in a stochastic frontier production function for panel data. Empir Econ 20(2):325–332
Dominguez-Molina JA, González-Farías G, Ramos-Quiroga R (2004) Skew-normality in stochastic frontier analysis. In: Genton MG (ed) Skew-elliptical distributions and their applications: a journey beyond normality. Chapman and Hall/CRC, Boca Raton
González-Farías G, Dominguez-Molina JA, Gupta AK (2004) The closed skew normal distribution. In: Genton MG (ed) Skew-elliptical distributions and their applications: a journey beyond normality. Chapman and Hall/CRC, Boca Raton
Greene W (2010) A stochastic frontier model with correction for sample selection. J Prod Anal 34:15–24
Heckman JJ (1976) The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models. Ann Econ Soc Meas 15:475–492
Heckman JJ (1979) Sample selection bias as a specification error. Econometrica 47(1):153–161
Hirschman AO (1964) The paternity of an index. Am Econ Rev 54(5):761
Kumbhakar SC, Lovell CAK (2000) Stochastic frontier analysis. Cambridge University Press, Cambridge
Kumbhakar S, Tsionas M, Sipiläinen T (2009) Joint estimation of technology choice and technical efficiency: an application to organic and conventional dairy farming. J Prod Anal 31(3):151–162
Lai H-P (2013) Estimation of the threshold stochastic frontier model in the presence of an endogenous sample split variable. J Prod Anal 40(2):227–237
Lai H-P, Huang CJ (2013) Maximum likelihood estimation of seemingly unrelated stochastic frontier regressions. J Prod Anal 40(1):1–14
Lai H-P, Polachek S, Wang H-J (2012) Estimation of a stochastic frontier model with a sample selection problem. Working paper, Department of Economics, National Chung Cheng University
Maddala G (1986) Disequilibrium, self-selection, and switching models. In: Griliches Z, Intriligator MD (eds) Handbook of econometrics, vol 3, chap 28. Elsevier, Amsterdam, pp 1633–1688
Terza JV (2009) Parametric nonlinear regression with endogenous switching. Econom Rev 28(6):555–580
Tong YL (1990) The multivariate normal distribution. Springer-Verlag, New York
Acknowledgments
I thank William Greene and three anonymous referees for their helpful comments. Lai gratefully acknowledges the National Science Council of Taiwan (NSC-101-2410-H-194-017) for research support. The usual disclaimer applies.
Appendices
Appendix 1
We use the symbol “\(\oplus\)” to indicate the matrix direct sum operator: for any two matrices A and B, \(A \oplus B = \left( {\begin{array}{*{20}c} A & O \\ O & B \\ \end{array} } \right)\). Property 1 suggests that the joint distribution of the independent CSN random vectors is still a CSN distribution.
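For readers who wish to verify the notation numerically, the direct sum can be formed with `scipy.linalg.block_diag`; a minimal sketch with arbitrary illustrative matrices:

```python
import numpy as np
from scipy.linalg import block_diag

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0]])

# A "direct sum" B stacks the blocks on the diagonal and fills the rest with zeros
C = block_diag(A, B)
assert C.shape == (3, 3)
assert C[0, 2] == 0.0 and C[2, 0] == 0.0 and C[2, 2] == 5.0
```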
Property 1
(Proposition 2.4.1 of GDG (2004)) If \(y_{1} ,{ \ldots },y_{n}\) are independent random vectors with \(y_{i} \sim CSN_{{p_{i} ,q_{i} }} \left( {\pi_{i} ,\varSigma_{i} ,\varGamma_{i} ,\kappa_{i} ,\Delta_{i} } \right)\), then the joint distribution of \(y_{1} ,{ \ldots },y_{n}\) is
$$\left( {y_{1}^{\text{T}} ,{ \ldots },y_{n}^{\text{T}} } \right)^{\text{T}} \sim CSN_{{p^{*} ,q^{*} }} \left( {\pi^{*} ,\varSigma^{*} ,\varGamma^{*} ,\kappa^{*} ,\Delta^{*} } \right),$$
where \(p^{*} = \sum\nolimits_{i = 1}^{n} {p_{i} }\), \(q^{*} = \sum\nolimits_{i = 1}^{n} {q_{i} }\), \(\pi^{*} = \left( {\pi_{1}^{\text{T}} ,{ \ldots },\pi_{n}^{\text{T}} } \right)^{\text{T}}\), \(\varSigma^{*} = \mathop \oplus \limits_{i = 1}^{n} \, \varSigma_{i}\), \(\varGamma^{*} = \mathop \oplus \limits_{i = 1}^{n} \, \varGamma_{i}\), \(\kappa^{*} = \left( {\kappa_{1}^{\text{T}} ,{ \ldots },\kappa_{n}^{\text{T}} } \right)^{\text{T}}\) , and \(\Delta^{*} = \mathop \oplus \limits_{i = 1}^{n} \, \Delta_{i}\).
Property 2 suggests that a linear transformation of a multivariate CSN random vector is still CSN distributed.
Property 2
(Proposition 2.3.1 of GDG (2004)) Let \(y\sim CSN_{p,q} \left( {\pi ,\varSigma ,\varGamma ,\kappa ,\Delta } \right)\) and let A be an \(n \times p\) matrix of rank n, where \(n \le p\). Then
$$Ay\sim CSN_{n,q} \left( {\pi_{A} ,\varSigma_{A} ,\varGamma_{A} ,\kappa ,\Delta_{A} } \right),$$
where \(\pi_{A} = A\pi\), \(\varSigma_{A} = A\varSigma A^{\text{T}}\), \(\varGamma_{A} = \varGamma \varSigma A^{\text{T}} \varSigma_{A}^{ - 1}\) , and \(\Delta_{A} = \Delta + \varGamma \varSigma \varGamma^{\text{T}} - \varGamma \varSigma A^{\text{T}} \varSigma_{A}^{ - 1} A\varSigma \varGamma^{\text{T}}\).
The following Property 3 states the conditional distribution of the CSN.
Property 3
(Proposition 2.3.2 of GDG (2004)) Let \(y \sim CSN_{p,q} \left( {\pi ,\varSigma ,\varGamma ,\kappa ,\Delta } \right)\) be partitioned into two subvectors \(y_{1}\) and \(y_{2}\), where \(y^{\text{T}} = \left( {\begin{array}{*{20}c} {y_{1}^{\text{T}} } & {y_{2}^{\text{T}} } \\ \end{array} } \right)\), \(y_{1}\) is k-dimensional, \(1 \le k \le p\), and \(\pi\), \(\varSigma\), and \(\varGamma\) are partitioned conformably as \(\pi = \left( {\begin{array}{*{20}c} {\pi_{1} } \\ {\pi_{2} } \\ \end{array} } \right)\), \(\varSigma = \left( {\begin{array}{*{20}c} {\varSigma_{11} } & {\varSigma_{12} } \\ {\varSigma_{21} } & {\varSigma_{22} } \\ \end{array} } \right)\), \(\varGamma = \left( {\begin{array}{*{20}c} {\varGamma_{1} } & {\varGamma_{2} } \\ \end{array} } \right)\). Then the conditional distribution of \(y_{2}\) given \(y_{1} = y_{10}\) is
$$\left. {y_{2} } \right|\left\{ {y_{1} = y_{10} } \right\}\sim CSN_{p - k,q} \left( {\pi_{2} + \varSigma_{21} \varSigma_{11}^{ - 1} \left( {y_{10} - \pi_{1} } \right),\varSigma_{22 \cdot 1} ,\varGamma_{2} ,\kappa - \varGamma^{*} \left( {y_{10} - \pi_{1} } \right),\Delta } \right),$$
where \(\varGamma^{*} = \varGamma_{1} + \varGamma_{2} \varSigma_{21} \varSigma_{11}^{ - 1}\), \(\Delta^{*} = \Delta + \varGamma_{2} \varSigma_{22 \cdot 1} \varGamma_{2}^{\text{T}}\), and \(\varSigma_{22 \cdot 1} = \varSigma_{22} - \varSigma_{21} \varSigma_{11}^{ - 1} \varSigma_{12}\).
Appendix 2
Lemma B.1
Under assumption [A1], both \(v_{i} |\{ e_{i} > - w_{i}^{\text{T}} \gamma \}\) and \(v_{i} |\{ e_{i} \le - w_{i}^{\text{T}} \gamma \}\) follow CSN distributions. More specifically,
$$v_{i} |\{ e_{i} > - w_{i}^{\text{T}} \gamma \} \sim CSN_{1,1} \left( {0,\sigma_{v}^{2} ,\frac{\rho }{{\sigma_{v} }}, - w_{i}^{\text{T}} \gamma ,1 - \rho^{2} } \right),\quad v_{i} |\{ e_{i} \le - w_{i}^{\text{T}} \gamma \} \sim CSN_{1,1} \left( {0,\sigma_{v}^{2} , - \frac{\rho }{{\sigma_{v} }},w_{i}^{\text{T}} \gamma ,1 - \rho^{2} } \right).$$
Proof of Lemma B.1
Using the result of Theorem 2.1.1 of Tong (1990), we have \(e|v\sim N (\rho v/\sigma_{v} , { }1 - \rho^{2} )\) under assumption [A1]. Let r be a known constant and \(\varPhi_{z} ( \cdot )\) be the cdf of a standard normal random variable, then
(i)
$$\begin{aligned} f_{v|e} (v|e > r) & = \frac{{f_{v} (v)}}{{\Pr (e > r)}}\Pr (e > r|v) = \frac{{\phi_{1} (v;0,\sigma_{v}^{2} )}}{{\varPhi_{z} ( - r)}}\left[ {1 - \varPhi_{1} \left( {r;\frac{\rho }{{\sigma_{v} }}v,1 - \rho^{2} } \right)} \right] \\ & = \frac{{\phi_{1} (v;0,\sigma_{v}^{2} )}}{{\varPhi_{z} ( - r)}}\varPhi_{1} \left( {\frac{\rho }{{\sigma_{v} }}v;r,1 - \rho^{2} } \right),{\text{ which is }}CSN_{1,1} \left( {0,\sigma_{v}^{2} ,\frac{\rho }{{\sigma_{v} }},r,1 - \rho^{2} } \right). \\ \end{aligned}$$
(ii)
$$\begin{aligned} f_{v|e} (v|e \le r) & = \frac{{f_{v} (v)}}{{\Pr (e \le r)}}\Pr (e \le r|v) = \frac{{\phi_{1} (v;0,\sigma_{v}^{2} )}}{{\varPhi_{z} (r)}}\varPhi_{1} \left( {r;\frac{\rho }{{\sigma_{v} }}v,1 - \rho^{2} } \right) \\ & = \frac{{\phi_{1} (v;0,\sigma_{v}^{2} )}}{{\varPhi_{z} (r)}}\varPhi_{1} \left( { - \frac{\rho }{{\sigma_{v} }}v; - r,1 - \rho^{2} } \right),{\text{ which is }}CSN_{1,1} \left( {0,\sigma_{v}^{2} , - \frac{\rho }{{\sigma_{v} }}, - r,1 - \rho^{2} } \right). \\ \end{aligned}$$
Denote \(d = 1(e > r)\); then (i) and (ii) together imply that \(f_{v|e} (v|d)\) is \(CSN_{1,1} \left( {0,\sigma_{v}^{2} ,(2d - 1)\frac{\rho }{{\sigma_{v} }},(2d - 1)r,1 - \rho^{2} } \right)\). □
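Case (i) of the lemma can be checked by simulation. The sketch below (with arbitrary illustrative values of \(\sigma_v\), \(\rho\), and r, none taken from the paper) draws (v, e) from the assumed bivariate normal, retains v on the event {e > r}, and compares the retained sample mean with the mean implied by the closed-form \(CSN_{1,1}\) density:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

rng = np.random.default_rng(0)
sigma_v, rho, r = 0.8, 0.5, -0.3     # illustrative values, not from the paper

# Draw (v, e): v ~ N(0, sigma_v^2) and e | v ~ N(rho*v/sigma_v, 1 - rho^2)
v = rng.normal(0.0, sigma_v, 500_000)
e = rho * v / sigma_v + np.sqrt(1.0 - rho**2) * rng.standard_normal(v.size)
v_sel = v[e > r]                      # draws from v | {e > r}

# Closed-form CSN_{1,1}(0, sigma_v^2, rho/sigma_v, r, 1 - rho^2) density from (i)
def f(x):
    return (norm.pdf(x, scale=sigma_v) / norm.cdf(-r)
            * norm.cdf((rho * x / sigma_v - r) / np.sqrt(1.0 - rho**2)))

mass = quad(f, -10, 10)[0]                          # should be 1
mean_theory = quad(lambda x: x * f(x), -10, 10)[0]  # implied mean of v | {e > r}
assert abs(mass - 1.0) < 1e-6
assert abs(v_sel.mean() - mean_theory) < 0.01
```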
Lemma B.2
Under assumption [A2], \(u_{i} \sim CSN_{1,1} (\mu_{i} ,\sigma_{u}^{2} ,1, - \mu_{i} ,0)\).
Proof of Lemma B.2
By Lemma 13.6.1 of Dominguez-Molina, González-Farías and Ramos-Quiroga (2004), the moment generating function of the truncated normal random variable \(u_{i}\) is
$$M_{{u_{i} }} (t) = \frac{{\varPhi_{z} \left( {\mu_{i} /\sigma_{u} + \sigma_{u} t} \right)}}{{\varPhi_{z} \left( {\mu_{i} /\sigma_{u} } \right)}}\exp \left( {\mu_{i} t + \frac{1}{2}\sigma_{u}^{2} t^{2} } \right). \quad (30)$$
Comparing (30) with (6), it follows that \(u_{i}\) follows the \(CSN_{1,1} (\mu_{i} ,\sigma_{u}^{2} ,1, - \mu_{i} ,0)\) distribution. □
Under [A2], \(v_{i} |d_{i}\) and \(u_{i} |d_{i}\) are independent of each other. It then follows from Lemma B.1, Lemma B.2 and Property 1 that
$$\left( {v_{i} ,u_{i} } \right)^{\text{T}} |d_{i} \sim CSN_{2,2} \left( {\pi_{i}^{*} ,\varSigma^{*} ,\varGamma_{i}^{*} ,\kappa_{i}^{*} ,\Delta^{*} } \right), \quad (31)$$
where \(\pi_{i}^{*} = \left( {\begin{array}{*{20}c} 0 \\ {\mu_{i} } \\ \end{array} } \right),\) \(\varSigma^{*} = \left( {\begin{array}{*{20}c} {\sigma_{v}^{2} } & 0 \\ 0 & {\sigma_{u}^{2} } \\ \end{array} } \right),\) \(\varGamma_{i}^{*} = \left( {\begin{array}{*{20}c} {(2d_{i} - 1)\rho /\sigma_{v} } & 0 \\ 0 & 1 \\ \end{array} } \right),\) \(\kappa_{i}^{*} = \left( {\begin{array}{*{20}c} { - (2d_{i} - 1)w_{i}^{\text{T}} \gamma } \\ { - \mu_{i} } \\ \end{array} } \right),\) \(\Delta^{*} = \left( {\begin{array}{*{20}c} {1 - \rho^{2} } & 0 \\ 0 & 0 \\ \end{array} } \right).\) Since \(\varepsilon_{i} |d_{i}\) can be represented as a linear combination of \(v_{i} |d_{i}\) and \(u_{i} |d_{i}\), and Property 2 suggests that a linear combination of \(v_{i} |d_{i}\) and \(u_{i} |d_{i}\) is still CSN, we can easily obtain the conditional probability density function of the model in (3). The main result is summarized in Theorem 1.
Proof of Theorem 1
Let \(A = (1, - 1)\). Given the result in (31) that \(\left( {v_{i} ,u_{i} } \right)^{\text{T}} |d_{i} \sim CSN_{2,2} \left( {\pi_{i}^{*} ,\varSigma^{*} ,\varGamma_{i}^{*} ,\kappa_{i}^{*} ,\Delta^{*} } \right)\) and \(\varepsilon_{i} = A(v_{i} ,u_{i} )^{\text{T}}\), it follows from Property 2 that \(\varepsilon_{i} |d_{i} \sim CSN_{1,2} \left( {\pi_{A,i} ,\varSigma_{A} ,\varGamma_{A,i} ,\kappa_{A,i} ,\Delta_{A,i} } \right)\), where \(\pi_{A,i} = A\pi_{i}^{*} = (1, - 1)\left( {\begin{array}{*{20}c} 0 \\ {\mu_{i} } \\ \end{array} } \right) = - \mu_{i}\), \(\varSigma_{A} = A\varSigma^{*} A^{\text{T}} = \sigma_{v}^{2} + \sigma_{u}^{2}\), \(\varGamma_{A,i} = \varGamma_{i}^{*} \varSigma^{*} A^{\text{T}} \varSigma_{A}^{ - 1} = \frac{1}{{\sigma_{v}^{2} + \sigma_{u}^{2} }}\left( {\begin{array}{*{20}c} {\left( {2d_{i} - 1} \right)\rho \sigma_{v} } \\ { - \sigma_{u}^{2} } \\ \end{array} } \right)\), \(\kappa_{A,i} = \kappa_{i}^{*}\), and \(\Delta_{A,i} = \Delta^{*} + \varGamma_{i}^{*} \varSigma^{*} \varGamma_{i}^{{*{\text{T}}}} - \varGamma_{i}^{*} \varSigma^{*} A^{\text{T}} \varSigma_{A}^{ - 1} A\varSigma^{*} \varGamma_{i}^{{*{\text{T}}}} = \frac{1}{{\sigma_{v}^{2} + \sigma_{u}^{2} }}\left( {\begin{array}{*{20}c} {(1 - \rho^{2} )\sigma_{v}^{2} + \sigma_{u}^{2} } & {\left( {2d_{i} - 1} \right)\rho \sigma_{v} \sigma_{u}^{2} } \\ {\left( {2d_{i} - 1} \right)\rho \sigma_{v} \sigma_{u}^{2} } & {\sigma_{v}^{2} \sigma_{u}^{2} } \\ \end{array} } \right).\)
To obtain the corresponding pdf of \(CSN_{1,2} \left( {\pi_{A,i} ,\varSigma_{A} ,\varGamma_{A,i} ,\kappa_{A,i} ,\Delta_{A,i} } \right)\), as defined in (5), we note the following. (i) Let \(\varPi = \Delta_{A,i} + \varGamma_{A,i} \varSigma_{A} \varGamma_{A,i}^{\text{T}} = \left( {\begin{array}{*{20}c} 1 & 0 \\ 0 & {\sigma_{u}^{2} } \\ \end{array} } \right)\). Since \(\varPi\) is a diagonal matrix, it follows that \(C^{ - 1} = \varPhi_{2} \left( {0;\kappa_{A,i} ,\varPi } \right) = \varPhi_{1} \left( {0; - \left( {2d_{i} - 1} \right)w_{i}^{\text{T}} \gamma ,1} \right)\varPhi_{1} \left( {0; - \mu_{i} ,\sigma_{u}^{2} } \right) = \varPhi_{z} \left( {(2d_{i} - 1)w_{i}^{\text{T}} \gamma } \right)\varPhi_{z} \left( {\mu_{i} /\sigma_{u} } \right).\) (ii) \(\phi_{1} \left( {\varepsilon_{i} ; - \mu_{i} ,\sigma_{v}^{2} + \sigma_{u}^{2} } \right) = \frac{1}{{\sqrt {\sigma_{v}^{2} + \sigma_{u}^{2} } }}\phi_{z} \left( {\frac{{\varepsilon_{i} + \mu_{i} }}{{\sqrt {\sigma_{v}^{2} + \sigma_{u}^{2} } }}} \right).\) (iii) \(\varPhi_{2} \left( {\varGamma_{A,i} (\varepsilon_{i} + \mu_{i} );\kappa_{A,i} ,\Delta_{A,i} } \right) = \varPhi_{2} \left( {0;\kappa_{A,i} - \varGamma_{A,i} (\varepsilon_{i} + \mu_{i} ),\Delta_{A,i} } \right)\). Therefore, (i)–(iii) together imply (8). □
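As a numerical sanity check on steps (i)–(iii), the sketch below evaluates the resulting \(CSN_{1,2}\) density of \(\varepsilon_i |d_i\), with `scipy.stats.multivariate_normal.cdf` playing the role of \(\varPhi_2\), and verifies that it integrates to one. All parameter values are arbitrary illustrations, not estimates from the paper:

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Illustrative parameter values (not estimates from the paper)
sigma_v, sigma_u, rho, mu, wg, d = 0.8, 0.6, 0.4, 0.5, 0.2, 1
s2 = sigma_v**2 + sigma_u**2
sgn = 2 * d - 1

pi_A = -mu
Gamma_A = np.array([sgn * rho * sigma_v, -sigma_u**2]) / s2
kappa_A = np.array([-sgn * wg, -mu])
Delta_A = np.array([
    [(1 - rho**2) * sigma_v**2 + sigma_u**2, sgn * rho * sigma_v * sigma_u**2],
    [sgn * rho * sigma_v * sigma_u**2,       sigma_v**2 * sigma_u**2],
]) / s2

# Step (i): Pi = Delta_A + Gamma_A Sigma_A Gamma_A^T reduces to diag(1, sigma_u^2)
Pi = Delta_A + s2 * np.outer(Gamma_A, Gamma_A)
C = 1.0 / multivariate_normal.cdf(np.zeros(2), mean=kappa_A, cov=Pi)

def pdf(eps):
    """CSN_{1,2} density of eps | d assembled from steps (i)-(iii)."""
    return (C * norm.pdf(eps, loc=pi_A, scale=np.sqrt(s2))
            * multivariate_normal.cdf(Gamma_A * (eps - pi_A), mean=kappa_A, cov=Delta_A))

# The density should integrate to one (trapezoid rule on a wide grid)
xs = np.linspace(-8.0, 8.0, 801)
total = (xs[1] - xs[0]) * sum(pdf(x) for x in xs)
assert np.allclose(Pi, np.diag([1.0, sigma_u**2]))
assert abs(total - 1.0) < 1e-3
```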
Proof of Corollary 2
If \(\rho = 0\), then \(\Delta_{A,i} = \left( {\begin{array}{*{20}c} 1 & 0 \\ 0 & {\sigma_{v}^{2} \sigma_{u}^{2} /(\sigma_{v}^{2} + \sigma_{u}^{2} )} \\ \end{array} } \right)\) and \(\varPhi_{2} \left( {\varGamma_{A,i} (\varepsilon_{i} + \mu_{i} );\kappa_{A,i} ,\Delta_{A,i} } \right) = \varPhi_{z} \left( {(2d_{i} - 1)w_{i}^{\text{T}} \gamma } \right) \times \varPhi_{1} \left( { - \frac{{\sigma_{u}^{2} }}{{\sigma_{v}^{2} + \sigma_{u}^{2} }}(\varepsilon_{i} + \mu_{i} ); - \mu_{i} ,\frac{{\sigma_{v}^{2} \sigma_{u}^{2} }}{{\sigma_{v}^{2} + \sigma_{u}^{2} }}} \right)\), where \(\varPhi_{1} \left( { - \frac{{\sigma_{u}^{2} }}{{\sigma_{v}^{2} + \sigma_{u}^{2} }}(\varepsilon_{i} + \mu_{i} ); - \mu_{i} ,\frac{{\sigma_{v}^{2} \sigma_{u}^{2} }}{{\sigma_{v}^{2} + \sigma_{u}^{2} }}} \right) = \varPhi_{z} \left( { - \frac{{\sigma_{u} /\sigma_{v} }}{{\sqrt {\sigma_{v}^{2} + \sigma_{u}^{2} } }}\varepsilon_{i} + \frac{{\sigma_{v} /\sigma_{u} }}{{\sqrt {\sigma_{v}^{2} + \sigma_{u}^{2} } }}\mu_{i} } \right)\). Substituting the result into (8), we obtain (9). □
Proof of Theorem 3
Let \(B = \left( {\begin{array}{*{20}c} 1 & { - 1} \\ 0 & 1 \\ \end{array} } \right)\); then it follows from (5) and Property 2 that \(\left( {\varepsilon_{i} ,u_{i} } \right)^{\text{T}} |d_{i} = B\left( {v_{i} ,u_{i} } \right)^{\text{T}} |d_{i} \sim CSN_{2,2} (\pi_{B,i} ,\varSigma_{B} ,\varGamma_{B,i} ,\kappa_{B,i} ,\Delta_{B} )\), where \(\pi_{B,i} = \left( {\begin{array}{*{20}c} { - \mu_{i} } \\ {\mu_{i} } \\ \end{array} } \right)\), \(\varSigma_{B} = \left( {\begin{array}{*{20}c} {\sigma_{v}^{2} + \sigma_{u}^{2} } & { - \sigma_{u}^{2} } \\ { - \sigma_{u}^{2} } & {\sigma_{u}^{2} } \\ \end{array} } \right)\), \(\varGamma_{B,i} = \left( {\begin{array}{*{20}c} {(2d_{i} - 1)\frac{\rho }{{\sigma_{v} }}} & {(2d_{i} - 1)\frac{\rho }{{\sigma_{v} }}} \\ 0 & 1 \\ \end{array} } \right)\), \(\Delta_{B} = \left( {\begin{array}{*{20}c} {1 - \rho^{2} } & 0 \\ 0 & 0 \\ \end{array} } \right)\), and \(\kappa_{B,i} = \left( {\begin{array}{*{20}c} { - (2d_{i} - 1)w_{i}^{\text{T}} \gamma } \\ { - \mu_{i} } \\ \end{array} } \right)\). Then Property 3 suggests that \(\left. {u_{i} } \right|\left\{ {\left. {\varepsilon_{i} } \right|d_{i} } \right\}\sim CSN_{1,2} (\tilde{\pi }_{i} ,\sigma_{*}^{2} ,\tilde{\varGamma }_{i} ,\tilde{\kappa }_{i} ,\tilde{\Delta })\), where \(\tilde{\pi }_{i} = \frac{{\sigma_{v}^{2} \mu_{i} - \sigma_{u}^{2} \varepsilon_{i} }}{{\sigma_{v}^{2} + \sigma_{u}^{2} }}\), \(\sigma_{*}^{2} = \frac{{\sigma_{v}^{2} \sigma_{u}^{2} }}{{\sigma_{v}^{2} + \sigma_{u}^{2} }}\), \(\tilde{\varGamma }_{i} = \left( {\begin{array}{*{20}c} {(2d_{i} - 1)\frac{\rho }{{\sigma_{v} }}} \\ 1 \\ \end{array} } \right)\), \(\tilde{\kappa }_{i} = \kappa_{B,i} - \varGamma^{*} (\varepsilon_{i} + \mu_{i} ) = \left( {\begin{array}{*{20}c} { - (2d_{i} - 1)\left( {\frac{\rho }{{\sigma_{v} }}\frac{{\sigma_{v}^{2} (\varepsilon_{i} + \mu_{i} )}}{{\sigma_{v}^{2} + \sigma_{u}^{2} }} + w_{i}^{\text{T}} \gamma } \right)} \\ {\frac{{\sigma_{u}^{2} (\varepsilon_{i} + \mu_{i} )}}{{\sigma_{v}^{2} + \sigma_{u}^{2} }} - \mu_{i} } \\ \end{array} } \right)\), and \(\tilde{\Delta } = \Delta_{B} = \left( {\begin{array}{*{20}c} {1 - \rho^{2} } & 0 \\ 0 & 0 \\ \end{array} } \right)\). The corresponding moment generating function is \(E(e^{{tu_{i} }} |\{ \varepsilon_{i} |d_{i} \} ) = M_{{u|\{ \varepsilon |d\} }} (t) = \frac{{\varPhi_{2} (\ddot{\varGamma }_{i} t;\tilde{\kappa }_{i} ,\ddot{\Delta }_{i} )}}{{\varPhi_{2} (0;\tilde{\kappa }_{i} ,\ddot{\Delta }_{i} )}}e^{{t\tilde{\pi }_{i} + \frac{1}{2}t^{2} \sigma_{*}^{2} }}\), where \(\ddot{\varGamma }_{i} = \tilde{\varGamma }_{i} \sigma_{*}^{2}\) and \(\ddot{\Delta }_{i} = \tilde{\Delta } + \sigma_{*}^{2} \tilde{\varGamma }_{i} \tilde{\varGamma }_{i}^{\text{T}}\). Substituting \(t = - 1\) gives the estimator of TE, \(E(e^{{ - u_{i} }} |\{ \varepsilon_{i} |d_{i} \} )\). □
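The TE estimator can be checked for internal consistency: the moment generating function evaluated at t = −1 must agree with direct numerical integration of \(e^{-u}\) against the conditional density. The sketch below uses arbitrary illustrative parameter values (not estimates from the paper) and builds \(\tilde{\kappa}_i\) from the Property 3 formula \(\kappa_{B,i} - \varGamma^{*}(\varepsilon_i - \pi_{B,1})\):

```python
import numpy as np
from scipy.stats import norm, multivariate_normal
from scipy.integrate import quad

# Illustrative parameter values (not estimates from the paper)
sigma_v, sigma_u, rho, mu, wg, d, eps = 0.8, 0.6, 0.4, 0.5, 0.2, 1, -0.3
s2 = sigma_v**2 + sigma_u**2
sgn = 2 * d - 1

# Conditional CSN_{1,2} parameters of u | {eps | d} (Theorem 3 / Property 3)
pi_t = (sigma_v**2 * mu - sigma_u**2 * eps) / s2                 # tilde-pi
s_star2 = sigma_v**2 * sigma_u**2 / s2                           # sigma_*^2
Gamma_t = np.array([sgn * rho / sigma_v, 1.0])                   # tilde-Gamma
Gamma_c = np.array([sgn * rho * sigma_v / s2, -sigma_u**2 / s2]) # Gamma^* of Property 3
kappa_B = np.array([-sgn * wg, -mu])
kappa_t = kappa_B - Gamma_c * (eps + mu)                         # tilde-kappa
Delta_dd = (np.diag([1 - rho**2, 0.0])                           # tilde-Delta ...
            + s_star2 * np.outer(Gamma_t, Gamma_t))              # ... plus correction = ddot-Delta

# TE via the moment generating function at t = -1
denom = multivariate_normal.cdf(np.zeros(2), mean=kappa_t, cov=Delta_dd)
numer = multivariate_normal.cdf(-s_star2 * Gamma_t, mean=kappa_t, cov=Delta_dd)
te_mgf = numer / denom * np.exp(-pi_t + 0.5 * s_star2)

# TE via direct integration of exp(-u) against the conditional density;
# the singular second row of tilde-Delta reduces to the constraint u >= 0
def f(u):
    return (norm.pdf(u, loc=pi_t, scale=np.sqrt(s_star2))
            * norm.cdf((Gamma_t[0] * (u - pi_t) - kappa_t[0]) / np.sqrt(1 - rho**2)))

te_int = quad(lambda u: np.exp(-u) * f(u), 0.0, 20.0)[0] / denom
assert 0.0 < te_mgf < 1.0
assert abs(te_mgf - te_int) < 1e-3
```

Since u is nonnegative, the TE index must lie in (0, 1), which the first assertion confirms; the second confirms that the MGF route and the direct-integration route agree.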
Cite this article
Lai, H.-P. Maximum likelihood estimation of the stochastic frontier model with endogenous switching or sample selection. J Prod Anal 43, 105–117 (2015). https://doi.org/10.1007/s11123-014-0410-2
Keywords
- Stochastic frontier model
- Sample selection
- Endogenous switching
- Maximum likelihood estimation
- Closed skew-normal distribution