# Power calculation in multiply imputed data

## Abstract

Multiple imputation (MI) has proven to be an effective procedure for handling incomplete datasets. Compared with complete case analysis (CCA), MI is more efficient because it uses the information in incomplete cases, which CCA simply discards. A few simulation studies have shown that statistical power can be improved when MI is used. However, little is known about how much power can be gained. In this article, we build a general formula to calculate the statistical power when MI is used. Specific formulas are given for several different conditions. We demonstrate our findings through simulation studies and a data example.
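As a rough illustration of the quantities involved, the sketch below pools per-imputation results with Rubin's rules and then approximates the power of a two-sided test using a normal reference distribution. It is not the general formula developed in the article, which is not reproduced in this abstract. The function names `pool_rubin` and `approx_power` are hypothetical, and the normal approximation ignores the small-sample degrees-of-freedom adjustment that MI inference normally uses.

```python
from statistics import NormalDist, mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their variances with Rubin's rules.

    Returns the pooled estimate and the total variance T = W + (1 + 1/m) * B.
    """
    m = len(estimates)
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance W
    b = variance(estimates)      # between-imputation variance B (divisor m - 1)
    total_var = w + (1 + 1 / m) * b
    return q_bar, total_var


def approx_power(effect, total_var, alpha=0.05):
    """Approximate power of a two-sided z-test for a true effect.

    Assumption: the pooled estimate is treated as normal with variance
    total_var; the t reference distribution used in MI is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    ncp = effect / total_var ** 0.5   # standardized effect
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


if __name__ == "__main__":
    # Made-up illustrative inputs: three imputations of one coefficient.
    q_bar, total_var = pool_rubin([0.48, 0.52, 0.50], [0.04, 0.04, 0.04])
    print(q_bar, total_var, approx_power(q_bar, total_var))
```

Under this approximation, anything that shrinks the total variance, such as recovering information from incomplete cases, raises the approximate power for a fixed effect size.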

## Notes

### Acknowledgements

The data used in this manuscript came from a study funded by a grant (R01 MH077312) awarded to Dr. Golda Ginsburg by the National Institute of Mental Health. ClinicalTrials.gov identifier: NCT00847561.
