Program Impact Estimation with Binary Outcome Variables: Monte Carlo Results for Alternative Estimators and Empirical Examples



A frequent challenge in program impact estimation, and in causal modeling more generally, is estimating the effect of a binary endogenous variable on a binary outcome of interest. We report results from Monte Carlo experiments designed to assess the performance of estimators frequently applied in this setting. Many rely on an instrumental variables identification strategy, and in those instances our central interest is the overidentified case. Even when identification is technically achieved by functional form, it is widely perceived that instruments generate more credible identification. Our focus is on widely used models available in the popular Stata statistical software package, but we also evaluate a semi-parametric instrumental variables random effects model not yet available in Stata. The quantities of interest in these experiments are the program impact estimates, test statistics assessing endogeneity, and overidentification tests. We consider performance under alternative behavioral circumstances by varying the distributional assumptions for unobservables, instrument strength, sample size, and impact magnitude. Some models perform somewhat disappointingly. Those that rely on joint normality for identification are not particularly robust to error misspecification, raising questions about whether they should be preferred to the semi-parametric estimator (regardless of comparative ease of estimation) or even to simple single equation models that ignore endogeneity. We provide examples of the methods using data from Bangladesh and Tanzania.
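To make the setting concrete, the following is a minimal Monte Carlo sketch of one such experiment. It is not the authors' design: the data-generating process (a binary treatment and binary outcome driven by jointly normal errors), the single continuous instrument `z` (so the model is just-identified, not overidentified), and every parameter value (`beta`, `gamma`, `rho`, `n`, `reps`) are assumptions chosen for illustration. It compares a naive difference in means, which ignores endogeneity, with a linear IV ratio estimator applied to a linear probability model. Note that with a binary outcome the linear IV estimand is a weighted average of individual effects and need not equal the average treatment effect exactly.

```python
import math
import numpy as np


def norm_cdf(x):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simulate_once(rng, n, beta, gamma, rho):
    """Draw one sample and return (naive estimate, linear IV estimate)."""
    z = rng.standard_normal(n)  # instrument (assumed continuous)
    u = rng.standard_normal(n)  # treatment-equation error
    # Outcome error correlated with u; rho controls the degree of endogeneity.
    e = rho * u + math.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    t = (gamma * z + u > 0).astype(float)  # binary endogenous treatment
    y = (beta * t + e > 0).astype(float)   # binary outcome
    # Naive estimator: difference in mean outcomes, ignoring endogeneity.
    naive = y[t == 1].mean() - y[t == 0].mean()
    # Linear IV with one instrument: cov(z, y) / cov(z, t).
    iv = np.cov(z, y)[0, 1] / np.cov(z, t)[0, 1]
    return naive, iv


def run_experiment(reps=200, n=5000, beta=0.5, gamma=1.0, rho=0.5, seed=0):
    """Average both estimators over replications and report bias against the true ATE."""
    rng = np.random.default_rng(seed)
    draws = np.array([simulate_once(rng, n, beta, gamma, rho) for _ in range(reps)])
    # With a zero intercept and a standard normal outcome error,
    # ATE = Phi(beta) - Phi(0).
    true_ate = norm_cdf(beta) - norm_cdf(0.0)
    naive_bias, iv_bias = draws.mean(axis=0) - true_ate
    return {"true_ate": true_ate, "naive_bias": naive_bias, "iv_bias": iv_bias}


results = run_experiment()
print(results)
```

Varying `rho`, `gamma`, and `n` in this sketch corresponds loosely to varying the error correlation, instrument strength, and sample size; replacing the normal draws with non-normal ones would be one way to probe sensitivity to error misspecification.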


Keywords: Error Correlation; Average Treatment Effect; Mean Absolute Deviation; Monte Carlo Experiment; Linear Probability Model



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Economics and the Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, USA
  2. Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, USA
