
Using Modern Missing Data Methods with Auxiliary Variables to Mitigate the Effects of Attrition on Statistical Power

Missing Data

Part of the book series: Statistics for Social and Behavioral Sciences ((SSBS))

Abstract

Missing data in a field experiment may arise from a number of sources. Participants may skip over questions inadvertently or refuse to answer them; they may offer an illegible response; they may fail to complete a questionnaire; or they may be absent from an entire measurement session in a longitudinal study. The last is often called wave nonresponse. Many participants who are unavailable for one or more occasions of measurement are available at later occasions. We define attrition as a special case of wave nonresponse in which a participant drops out of a study after a certain time and is no longer available at any subsequent wave of data collection.


Notes

  1. This last step involved a simple trial-and-error process: try a particular sample size; if the SE was too large, increase the sample size and try again.

  2. The MGSEM procedure uses the covariance matrix as input. When the covariance matrix is analyzed in this manner, one may simply change the sample size indicated in the model being tested without changing the input covariance matrix. The more commonly used FIML approach cannot do this: it requires raw-data input, and with raw-data input the sample size is tied directly to the data being read (e.g., with N = 500, 500 cases are read from the raw data file). Thus, with FIML methods, changing the sample size changes the data being read, thereby producing changes in the results.

  3. One can safely ignore the “W_A_R_N_I_N_G” in the LISREL output that “LAMBDA-Y does not have full column rank”. It is a necessary byproduct of this analysis.

  4. We chose r_XY = .10 because the issue of N_EFF becomes most important with small effect sizes. This is closely related to the issue of determining statistical power with varying effect sizes. With large effect sizes, especially in field experiments, it is often possible to find significant effects, even with relatively small sample sizes. Sample size is often an issue only with smaller effect sizes. We address the issue of other values of r_XY later in this chapter.

     We arbitrarily chose r_XZ = .10. In our experience, this value tends to be rather similar to r_XY. We address the issue of different values of r_XZ later in this chapter.

  5. This is true unless the variables acting as auxiliary variables happen to be part of the analysis model. The only analysis that fits this requirement well is growth modeling. That is, even when there are missing values in the growth part of the model, the growth model can be estimated making use of partial data. Although the results of this analysis are not maximum likelihood, they do tend to be unbiased and efficient.

  6. For this demonstration, we will stay with the scenario in which N_TOT = 1,000, N_CC = 500, and %Z = 100% for both auxiliary variables.
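
The trial-and-error search described in the first note, and the reason it works with summary-statistic input as described in the second note, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' MGSEM/LISREL procedure: the large-sample standard error of a Pearson correlation, (1 - r^2) / sqrt(N - 1), stands in for the SE that the fitted model would report, and the function names `se_of_correlation` and `smallest_n_for_se` are hypothetical. Because the stand-in SE depends only on a fixed summary statistic and on N, the sample size can be varied without touching the input, which mirrors the covariance-matrix situation.

```python
import math


def se_of_correlation(r: float, n: int) -> float:
    """Approximate large-sample SE of a Pearson correlation.

    Assumption: this formula is a stand-in for the SE reported by the
    fitted model; it is not taken from the chapter.
    """
    return (1.0 - r * r) / math.sqrt(n - 1)


def smallest_n_for_se(r: float, target_se: float,
                      start_n: int = 10, step: int = 1,
                      max_n: int = 1_000_000) -> int:
    """Trial and error: try a sample size; if the SE is too large,
    increase the sample size and try again."""
    n = start_n
    while se_of_correlation(r, n) > target_se:
        n += step
        if n > max_n:
            raise ValueError("target SE not reached within max_n")
    return n


if __name__ == "__main__":
    # Hypothetical target: the SE obtained with 500 cases at r = .10.
    target = se_of_correlation(0.10, 500)
    print(smallest_n_for_se(0.10, target))
```

A step of 1 keeps the sketch simple; when each trial means refitting a model, a coarser step or a bisection over N would be a reasonable substitute.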



Copyright information

© 2012 Springer Science+Business Media New York

About this chapter

Cite this chapter

Graham, J.W., Collins, L.M. (2012). Using Modern Missing Data Methods with Auxiliary Variables to Mitigate the Effects of Attrition on Statistical Power. In: Missing Data. Statistics for Social and Behavioral Sciences. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4018-5_11
