Handbook of Causal Analysis for Social Research, pp. 167–187

# Counterfactual Causal Analysis and Nonlinear Probability Models

## Abstract

Nonlinear probability models, such as logits and probits for binary dependent variables, the ordered logit and ordered probit for ordinal dependent variables, and the multinomial logit, together with log-linear models for contingency tables, have become widely used by social scientists in the past 30 years. In this chapter, we show that identifying and estimating causal effects with these models presents severe challenges, over and above those usually encountered in identifying causal effects in a linear setting. These challenges arise because the mean and variance are not separately identified in these models. We show their impact in experimental and observational studies, and we investigate the problems that arise when standard approaches to the causal analysis of nonexperimental data, such as propensity scores, instrumental variables, and control functions, are applied. Naive use of these approaches with nonlinear probability models will yield biased estimates of causal effects, though the estimates will be a lower bound of the true causal effect and will have the correct sign. We show that the technique of *Y*-standardization puts the parameters of nonlinear probability models on a scale that we can meaningfully interpret but cannot measure. Other techniques, such as average partial effects, can yield causal effects on the probability scale, but, in this case, the linear probability model provides a simple and effective alternative.
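The attenuation described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the chapter: the data-generating process, the parameter values (`beta`, `gamma`, `n`), and the helper `fit_logit` are all assumptions chosen for illustration. It simulates a randomized treatment under a latent-variable logit, then compares the treatment coefficient with and without a covariate that is independent of treatment.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
beta, gamma = 1.0, 2.0  # assumed latent-scale effects (illustrative values)

t = rng.integers(0, 2, size=n).astype(float)  # randomized binary treatment
x = rng.normal(size=n)                        # covariate, independent of t
y_star = beta * t + gamma * x + rng.logistic(size=n)  # latent outcome
y = (y_star > 0).astype(float)                # observed binary outcome


def fit_logit(X, y, iters=30):
    """Maximum-likelihood logit via Newton-Raphson (hypothetical helper)."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ b)))    # fitted probabilities
        w = p * (1.0 - p)                     # logit variance weights
        hessian = (X * w[:, None]).T @ X
        b = b + np.linalg.solve(hessian, X.T @ (y - p))
    return b


ones = np.ones(n)
# Treatment coefficient when the covariate is included in the model.
b_full = fit_logit(np.column_stack([ones, t, x]), y)[1]
# Treatment coefficient when the covariate is omitted.
b_reduced = fit_logit(np.column_stack([ones, t]), y)[1]
# Difference in outcome means: the treatment effect on the probability
# scale, which is what a linear probability model of y on t estimates.
ate_prob = y[t == 1].mean() - y[t == 0].mean()

print(f"logit coefficient, covariate included: {b_full:.3f}")
print(f"logit coefficient, covariate omitted:  {b_reduced:.3f}")
print(f"difference in means (probability scale): {ate_prob:.3f}")
```

Under these assumptions the coefficient from the model that omits `x` is pulled toward zero but keeps its sign, even though treatment is randomized, because omitting `x` inflates the residual variance that fixes the scale of the logit coefficients. The difference in means does not depend on that scale.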

## Keywords

Propensity Score; Causal Effect; Potential Outcome; Multinomial Logit Model; Latent Variable Model