Festschrift in Honor of Peter Schmidt, pp 5-46

# Program Impact Estimation with Binary Outcome Variables: Monte Carlo Results for Alternative Estimators and Empirical Examples

## Abstract

A frequent challenge in program impact estimation, and in causal modeling more generally, is estimating the effect of a binary endogenous variable on a binary outcome of interest. We report results from Monte Carlo experiments designed to assess the performance of estimators frequently applied in this circumstance. Many rely on an instrumental variables identification strategy, and in those instances our central interest is the overidentified case. Even when identification is technically achieved by functional form, it is widely perceived that instruments generate more credible identification. Our focus is on widely used models available in the popular Stata statistical software package, but we also evaluate a semi-parametric instrumental variables random effects model not yet available in Stata. The quantities of interest in these experiments are the estimated program impact, test statistics assessing endogeneity, and overidentification tests. We consider performance under alternative behavioral circumstances by varying distributional assumptions for unobservables, instrument strength, sample size, and impact magnitude. Some models turn in a somewhat disappointing performance. Those that rely on joint normality for identification are not particularly robust to error misspecification, raising questions about whether they should be preferred to the semi-parametric estimator (regardless of comparative ease of estimation) or even to simple single equation models that ignore endogeneity. We provide examples of the methods using data from Bangladesh and Tanzania.
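To make the kind of experiment described above concrete, the following is a minimal Python sketch of one Monte Carlo cell. It is an illustration under stated assumptions, not the chapter's actual design: the data-generating process (two standard normal instruments, bivariate normal errors with correlation `rho`), the parameter values, and all function names are hypothetical. It compares a single-equation linear probability model that ignores endogeneity with an overidentified linear two-stage least squares estimator, and reports bias and mean absolute deviation relative to the true average treatment effect. Linear IV in a nonlinear model recovers a weighted local effect, so it need not match the average treatment effect exactly even in large samples.

```python
import math
import numpy as np


def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simulate(n, beta, rho, gamma, rng):
    """Draw one sample: binary treatment d, binary outcome y, two instruments z.

    Hypothetical design: d = 1[gamma*(z1+z2) + u > 0], y = 1[beta*d + e > 0],
    with (u, e) standard bivariate normal with correlation rho.
    """
    z = rng.standard_normal((n, 2))
    u = rng.standard_normal(n)
    e = rho * u + math.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    d = (gamma * z.sum(axis=1) + u > 0).astype(float)
    y = (beta * d + e > 0).astype(float)
    return y, d, z


def ols_slope(y, d):
    """Linear probability model of y on d, ignoring endogeneity."""
    X = np.column_stack([np.ones_like(d), d])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]


def tsls_slope(y, d, z):
    """Linear 2SLS: project d on (1, z1, z2), then regress y on the fitted values."""
    Z = np.column_stack([np.ones(len(d)), z])
    d_hat = Z @ np.linalg.lstsq(Z, d, rcond=None)[0]
    X_hat = np.column_stack([np.ones_like(d), d_hat])
    return np.linalg.lstsq(X_hat, y, rcond=None)[0][1]


def monte_carlo(reps=200, n=2000, beta=0.5, rho=0.5, gamma=1.0, seed=0):
    """Repeat the experiment and summarize bias and mean absolute deviation."""
    rng = np.random.default_rng(seed)
    # True average treatment effect implied by the assumed probit outcome equation.
    true_ate = normal_cdf(beta) - normal_cdf(0.0)
    ols = np.empty(reps)
    iv = np.empty(reps)
    for r in range(reps):
        y, d, z = simulate(n, beta, rho, gamma, rng)
        ols[r] = ols_slope(y, d)
        iv[r] = tsls_slope(y, d, z)
    return {
        "true_ate": true_ate,
        "ols_bias": ols.mean() - true_ate,
        "iv_bias": iv.mean() - true_ate,
        "ols_mad": np.mean(np.abs(ols - true_ate)),
        "iv_mad": np.mean(np.abs(iv - true_ate)),
    }


if __name__ == "__main__":
    print(monte_carlo())
```

Varying `gamma` (instrument strength), `n` (sample size), `beta` (impact magnitude), or replacing the normal draws with a non-normal error distribution would mimic the dimensions along which the abstract says the experiments vary.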

## Keywords

Error Correlation; Average Treatment Effect; Mean Absolute Deviation; Monte Carlo Experiment; Linear Probability Model