Pretest Measures of the Study Outcome and the Elimination of Selection Bias: Evidence from Three Within Study Comparisons

Abstract

This paper examines how pretest measures of a study outcome reduce selection bias in observational studies in education. The theoretical rationale for privileging pretests in bias control is that they are often highly correlated with the outcome and, in many contexts, also highly correlated with the selection process. To examine the pretest’s role in bias reduction, we use data from two within study comparisons and an especially strong quasi-experiment, each with an educational intervention that seeks to improve achievement. In each study, the pretest measures are consistently highly correlated with post-intervention measures of themselves, but the studies vary in the correlation between the pretest and the process of selection into treatment. Across the three datasets with two outcomes each, there are three cases where this correlation is low and three where it is high. A single pretest wave reduces bias in all six instances examined, and it eliminates bias in three of them. Adding a second pretest wave eliminates bias in two more instances. However, the pattern of bias elimination does not follow the prediction that more bias reduction ensues the more highly the pretest is correlated with selection. The findings show that bias is more complexly related to the pretest’s correlation with selection than we hypothesized, and we seek to explain why.
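
To fix ideas, the sketch below simulates the abstract’s core mechanism under stated assumptions; it is purely illustrative and uses none of the paper’s data or models. A latent trait drives selection into treatment, the pretest is a noisy measure of that trait and is correlated with the outcome, and conditioning on the pretest shrinks, but need not eliminate, the bias of the naive comparison.

```python
# Minimal sketch, assuming a single latent trait drives selection.
# Purely illustrative; none of this reproduces the paper's data or models.
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
true_effect = 0.0  # null treatment effect: any nonzero estimate is bias

trait = rng.normal(size=n)                          # latent driver of selection
pretest = trait + rng.normal(scale=0.5, size=n)     # noisy measure of the trait
treated = (trait + rng.normal(size=n) > 0).astype(float)  # selection on trait
posttest = (true_effect * treated + 0.8 * pretest
            + 0.6 * trait + rng.normal(size=n))

# Naive estimate: raw difference in means, biased by selection.
naive = posttest[treated == 1].mean() - posttest[treated == 0].mean()

# Pretest-adjusted estimate: OLS of posttest on treatment and pretest.
X = np.column_stack([np.ones(n), treated, pretest])
beta, *_ = np.linalg.lstsq(X, posttest, rcond=None)

print(f"naive difference in means: {naive:.3f}")    # far from 0
print(f"pretest-adjusted estimate: {beta[1]:.3f}")  # near 0, but not exactly 0:
# the pretest measures the trait with error, so some bias survives adjustment
```

Under these assumptions a single pretest wave removes most, but not all, of the selection bias; measurement error in the pretest is one route by which a mixed pattern of bias elimination, like the one reported here, can arise.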

Notes

  1. To ensure the congruence of effect estimates across models, only school-level covariates were included in the models presented here. However, inclusion of pretreatment student-level covariates in the outcome models did not substantively change the results.

  2. The estimates presented here rely on 1:1 nearest neighbor matching, but are robust to alternative specifications.
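
For readers unfamiliar with the estimator named in note 2, the following is a minimal sketch of 1:1 nearest-neighbor matching on an estimated propensity score. It assumes scikit-learn for the propensity model; the function name, the greedy with-replacement matching, and all variable names are illustrative choices, not the paper’s implementation.

```python
# Minimal sketch of 1:1 nearest-neighbor propensity-score matching.
# Assumes scikit-learn; names are hypothetical, not the paper's code.
import numpy as np
from sklearn.linear_model import LogisticRegression

def nn_match_att(X, treated, y):
    """Estimate the effect of treatment on the treated by greedy 1:1
    nearest-neighbor matching (with replacement) on a logistic-regression
    propensity score."""
    ps = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]
    t_idx = np.flatnonzero(treated == 1)
    c_idx = np.flatnonzero(treated == 0)
    # For each treated unit, take the control with the closest score.
    nearest = c_idx[np.abs(ps[t_idx][:, None] - ps[c_idx][None, :]).argmin(axis=1)]
    return float((y[t_idx] - y[nearest]).mean())
```

Alternative specifications in this spirit include matching without replacement, imposing a caliper on the score distance, or changing the propensity model.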

Author information

Corresponding author

Correspondence to Kelly Hallberg.

Ethics declarations

Funding

This work was supported by National Science Foundation grant DRL-1228866.

Conflict of Interest

The authors declare that they have no conflict of interest.

Ethical Approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. For this type of study, formal consent is not required. This article does not contain any studies with animals performed by any of the authors.

Informed Consent

This study only included de-identified, secondary data analysis. For this type of study, formal consent is not required.

About this article

Cite this article

Hallberg, K., Cook, T.D., Steiner, P.M. et al. Pretest Measures of the Study Outcome and the Elimination of Selection Bias: Evidence from Three Within Study Comparisons. Prev Sci 19, 274–283 (2018). https://doi.org/10.1007/s11121-016-0732-6
