When does Heckman’s two-step procedure for censored data work and when does it not?

  • Regular Article
  • Statistical Papers

Abstract

Heckman’s two-step procedure (Heckit) for estimating the parameters in linear models from censored data is frequently used by econometricians, despite the fact that earlier studies cast doubt on the procedure. In this paper it is shown that estimates of the hazard h for approaching the censoring limit, which are used as an explanatory variable in the second step of the Heckit, can induce multicollinearity. The influence of the censoring proportion and sample size upon bias and variance in three types of random linear models is studied by simulations. From these results a simple relation is established that describes how absolute bias depends on the censoring proportion and the sample size. It is also shown that the Heckit may work with non-normal (Laplace) distributions, but it collapses if h deviates too much from that of the normal distribution. Data from a study of work resumption after sick-listing are used to demonstrate that the Heckit can be very risky.
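
The multicollinearity mechanism described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch under assumptions that are not taken from the paper: a single regressor, standard normal errors, censoring from below at zero, and a hazard term computed as the inverse Mills ratio φ(z)/Φ(z). It does not reproduce the paper's random linear models or its simulation design. All function and variable names (fit_probit, heckit_censored, corr_x_lam and so on) are hypothetical.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_pdf(z):
    # Standard normal density.
    return np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    # Standard normal distribution function via the error function.
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))

def fit_probit(X, d, iters=25):
    # Step 1: probit maximum likelihood by Fisher scoring.
    # Under the assumed model this estimates beta / sigma.
    g = np.zeros(X.shape[1])
    for _ in range(iters):
        z = X @ g
        P = np.clip(norm_cdf(z), 1e-10, 1.0 - 1e-10)
        p = norm_pdf(z)
        score = X.T @ (p * (d - P) / (P * (1.0 - P)))
        info = (X * (p**2 / (P * (1.0 - P)))[:, None]).T @ X
        g = g + np.linalg.solve(info, score)
    return g

def heckit_censored(x, y):
    # Assumption: y is censored from below at zero.
    X = np.column_stack([np.ones_like(x), x])
    d = (y > 0.0).astype(float)
    gamma = fit_probit(X, d)
    z = X @ gamma
    lam = norm_pdf(z) / norm_cdf(z)  # estimated hazard term
    keep = d == 1.0
    # Step 2: least squares on the uncensored cases, with the
    # estimated hazard added as an extra explanatory variable.
    W = np.column_stack([X[keep], lam[keep]])
    coef = np.linalg.lstsq(W, y[keep], rcond=None)[0]
    corr = np.corrcoef(x[keep], lam[keep])[0, 1]
    return gamma, coef, corr, keep

# Simulated data: y* = 0.5 + 1.0 * x + e, observed y = max(y*, 0).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
y = np.maximum(0.5 + 1.0 * x + rng.normal(size=n), 0.0)

gamma_hat, coef_hat, corr_x_lam, keep = heckit_censored(x, y)

# Naive least squares on the uncensored cases only, for comparison.
X_naive = np.column_stack([np.ones(int(keep.sum())), x[keep]])
naive_slope = np.linalg.lstsq(X_naive, y[keep], rcond=None)[0][1]

print("censoring proportion:", 1.0 - keep.mean())
print("probit estimate of beta/sigma:", gamma_hat)
print("Heckit second-step coefficients:", coef_hat)
print("naive slope on uncensored cases:", naive_slope)
print("corr(x, hazard term) among uncensored:", corr_x_lam)
```

In this setup the hazard term is a smooth, monotone function of the same index as the regressor, so the two are strongly correlated among the uncensored cases. That correlation is the source of the variance inflation in the second step. Changing the intercept (and hence the censoring proportion) or the sample size n gives a rough feel for how the second-step estimates behave.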

Author information

Corresponding author

Correspondence to Robert Jonsson.

About this article

Cite this article

Jonsson, R. When does Heckman’s two-step procedure for censored data work and when does it not? Stat Papers 53, 33–49 (2012). https://doi.org/10.1007/s00362-010-0306-9

  • DOI: https://doi.org/10.1007/s00362-010-0306-9
