Pretest Measures of the Study Outcome and the Elimination of Selection Bias: Evidence from Three Within Study Comparisons
This paper examines how pretest measures of a study outcome reduce selection bias in observational studies in education. The theoretical rationale for privileging pretests in bias control is that they are often highly correlated with the outcome and, in many contexts, also highly correlated with the selection process. To examine the pretest's role in bias reduction, we use data from two within-study comparisons and an especially strong quasi-experiment, each involving an educational intervention intended to improve achievement. In each study, the pretest measures are consistently highly correlated with post-intervention measures of themselves, but the studies vary in how strongly the pretest is correlated with the process of selection into treatment. Across the three datasets, each with two outcomes, there are three cases where this correlation is low and three where it is high. A single pretest wave reduces bias in all six instances examined and eliminates it in three of them; adding a second pretest wave eliminates bias in two more. However, the pattern of bias elimination does not follow the predicted pattern whereby more bias reduction ensues the more highly the pretest is correlated with selection. The findings show that bias is more complexly related to the pretest's correlation with selection than we hypothesized, and we seek to explain why.
Keywords: Within-study comparison · Propensity score matching · Randomized experiment · Causal inference
Compliance with Ethical Standards
This work was supported by the National Science Foundation Grant DRL-1228866.
Conflict of Interest
The authors declare that they have no conflict of interest.
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. For this type of study, formal consent is not required. This article does not contain any studies with animals performed by any of the authors.
This study involved only secondary analysis of de-identified data. For this type of study, formal consent is not required.