Using Modern Missing Data Methods with Auxiliary Variables to Mitigate the Effects of Attrition on Statistical Power

Graham, John W.; Collins, Linda M.

doi:10.1007/978-1-4614-4018-5_11

Part of the book series: Statistics for Social and Behavioral Sciences (SSBS)

Abstract

Missing data in a field experiment may arise from a number of sources. Participants may skip over questions inadvertently or refuse to answer them; they may offer an illegible response; they may fail to complete a questionnaire; or they may be absent from an entire measurement session in a longitudinal study. The last is often called wave nonresponse. Many participants who are unavailable for one or more occasions of measurement are available at later occasions. We define attrition as a special case of wave nonresponse in which a participant drops out of a study after a certain time and is no longer available at any subsequent wave of data collection.

Notes

1.
This last step involved a simple trial and error process: Try a particular sample size; if the SE was too large, increase the sample size and try again.
2.
The MGSEM procedure makes use of the covariance matrix as input. When the covariance matrix is analyzed in this manner, one may simply change the sample size indicated in the model being tested without changing the input covariance matrix. The more commonly used FIML approach cannot do this. With that approach, raw data must be input, and with raw-data input the sample size is tied directly to the data being input (e.g., with N = 500, 500 cases are read from the raw data file). Thus with the FIML methods, changing the sample size changes the data being read, thereby producing changes in the results.
3.
One can safely ignore the “W_A_R_N_I_N_G” in the LISREL output that “LAMBDA-Y does not have full column rank”. It is a necessary byproduct of this analysis.
4.
We chose r_XY = .10 because the issue of N_EFF becomes most important with small effect sizes. This is closely related to the issue of determining statistical power with varying effect sizes. With large effect sizes, especially in field experiments, it is often possible to find significant effects, even with relatively small sample sizes. It is often the case that sample size is an issue only with smaller effect sizes. We address the issue of other values of r_XY later in this chapter.
We arbitrarily chose r_XZ = .10. In our experience, this value always tends to be rather similar to r_XY. We address the issue of different values of r_XZ later in this chapter.
5.
This is true unless the variables acting as auxiliary variables happen to be part of the analysis model. The only analysis that fits this requirement well is growth modeling. That is, even when there are missing values in the growth part of the model, the growth model can be estimated making use of partial data. Although the results of this analysis are not maximum likelihood, they do tend to be unbiased and efficient.
6.
For this demonstration, we will stay with the scenario in which N_TOT = 1,000, N_CC = 500, and %Z = 100% for both auxiliary variables.
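
The trial-and-error search described in note 1 can be sketched as a simple loop: try a sample size, check the SE, and increase the sample size if the SE is still too large. The sketch below is illustrative only. In the chapter the SE comes from rerunning the MGSEM model with a new sample size; here a textbook large-sample approximation for the SE of a correlation, (1 - r^2) / sqrt(N - 1), stands in for that model run. The function names, the target SE of .05, the starting N, and the step size are all assumptions made for the example.

```python
import math


def approx_se_of_r(r: float, n: int) -> float:
    """Large-sample approximation to the SE of a Pearson correlation.

    Stand-in (assumption) for the SE that the chapter obtains from an
    MGSEM run; any function of n that returns an SE could be used instead.
    """
    return (1.0 - r * r) / math.sqrt(n - 1)


def smallest_n_for_se(target_se: float, r: float,
                      start_n: int = 50, step: int = 10,
                      max_n: int = 1_000_000) -> int:
    """Increase n in fixed steps until the SE is no larger than target_se."""
    n = start_n
    while approx_se_of_r(r, n) > target_se:
        n += step  # SE still too large: try a bigger sample
        if n > max_n:
            raise ValueError("target SE not reached within max_n")
    return n


# Hypothetical inputs: r = .10 (the value used in note 4) and a target SE of .05.
step = 10
target_se = 0.05
n_needed = smallest_n_for_se(target_se=target_se, r=0.10, step=step)
print(n_needed, round(approx_se_of_r(0.10, n_needed), 4))
```

A fixed step keeps the sketch close to the manual procedure in the note. Because the SE shrinks steadily as N grows, a bisection search would reach the same answer in fewer tries when each try is an expensive model run.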

Author information

Authors and Affiliations

Department of Biobehavioral Health, The Pennsylvania State University, Health & Human Development Bldg. East, University Park, PA, USA
John W. Graham

About this chapter

Cite this chapter

Graham, J.W., Collins, L.M. (2012). Using Modern Missing Data Methods with Auxiliary Variables to Mitigate the Effects of Attrition on Statistical Power. In: Missing Data. Statistics for Social and Behavioral Sciences. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4018-5_11

DOI: https://doi.org/10.1007/978-1-4614-4018-5_11
Published: 10 May 2012
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-4017-8
Online ISBN: 978-1-4614-4018-5
eBook Packages: Mathematics and Statistics (R0)
