Statistical Modeling for the Heart Disease Diagnosis via Multiple Imputation
Missing data is a common challenge in the statistical analysis of clinical data. Datasets can become incomplete in several ways, such as mishandling of samples, low signal-to-noise ratio, measurement error, non-response to questions, or deletion of aberrant values. Missing data can cause severe problems in statistical analysis and lead to invalid conclusions. Multiple imputation is a useful strategy for handling missing data, and inference based on it is widely accepted as less biased and more valid. In this chapter, we apply multiple imputation to a publicly accessible heart disease dataset with a high missing rate and build a prediction model for heart disease diagnosis.
Keywords: Missing data, Multiple imputation, Heart disease dataset
The authors are grateful to the two reviewers for their helpful comments, which improved the manuscript significantly. The authors would like to thank Lisa Elon for invaluable advice and Dr. Eric Dammer for critical reading of the manuscript.