Handbook of Causal Analysis for Social Research, pp. 403–424

# What You Can Learn from Wrong Causal Models

## Abstract

It is common for social science researchers to provide estimates of causal effects from regression models imposed on observational data. The many problems with such work are well documented and widely known. The usual response is to claim, with little real evidence, that the causal model is close enough to the “truth” that sufficiently accurate causal effects can be estimated. In this chapter, a more circumspect approach is taken. We assume that the causal model is a substantial distance from the truth and then consider what can be learned nevertheless. To that end, we distinguish between how nature generated the data, a “true” model representing how this was accomplished, and a working model that is imposed on the data. The working model will typically be “wrong.” Nevertheless, unbiased or asymptotically unbiased estimates from parametric, semiparametric, and nonparametric working models can often be obtained in concert with appropriate statistical tests and confidence intervals. However, the estimates are not of the regression parameters typically assumed. Estimates of causal effects are not provided. Correlation is not causation. Nor is partial correlation, even when dressed up as regression coefficients. However, we argue that insights about causal effects do not require estimates of causal effects. We also discuss what can be learned when our alternative approach is not persuasive.
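The claim that a "wrong" working model can still yield well-behaved estimates of something other than the assumed regression parameters can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not in the chapter: the data-generating curve `y = x**2 + noise`, the uniform predictor, the sample size, and the helper name `fit_working_model` are all hypothetical, and the heteroskedasticity-robust (HC0 "sandwich") standard errors are one plausible reading of "appropriate" inference, not necessarily the authors' choice. For this setup the population best linear approximation has slope 1 and intercept -1/6, and that approximation, not the true curve, is what least squares estimates.

```python
import numpy as np


def fit_working_model(x, y):
    """Fit the linear working model y ~ a + b*x by least squares.

    Returns the coefficients and HC0 sandwich standard errors, which do
    not rely on the working model being correctly specified.
    """
    X = np.column_stack([np.ones_like(x), x])      # design matrix [1, x]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # OLS coefficients
    resid = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * resid[:, None] ** 2)         # sum of e_i^2 * x_i x_i'
    cov = bread @ meat @ bread                     # sandwich covariance
    return beta, np.sqrt(np.diag(cov))


# Hypothetical "nature": a nonlinear response with a random predictor.
rng = np.random.default_rng(0)
n = 20_000
x = rng.uniform(0.0, 1.0, size=n)
y = x ** 2 + rng.normal(0.0, 0.1, size=n)

# Linear working model: wrong by construction, yet its target is well defined.
beta, se = fit_working_model(x, y)
print("intercept, slope:", beta)
print("sandwich SEs:   ", se)
```

The fitted slope describes the best linear summary of how the conditional mean of `y` varies with `x` in this population. It is neither a causal effect nor a parameter of the true curve, which is consistent with the abstract's point that such estimates remain interpretable only once their actual target is acknowledged.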

## Keywords

Census Tract, Causal Effect, Conditional Distribution, Causal Inference, Causal Model