Evaluation Metrics for Deep Learning Imputation Models

AI for Disease Surveillance and Pandemic Intelligence (W3PHAI 2021)

Part of the book series: Studies in Computational Intelligence (SCI, volume 1013)

Abstract

There is growing interest in using deep learning to impute missing data in tabular datasets. A commonly used metric for evaluating deep learning-based imputation models is root mean square error (RMSE), which is a prediction evaluation metric. In this paper, we demonstrate the limitations of RMSE for evaluating deep learning-based imputation performance through a comparative analysis of RMSE and alternative metrics from the statistical literature, including qualitative, predictive accuracy, and statistical distance metrics. To minimize model and dataset biases, we use two deep learning imputation models (denoising autoencoders and generative adversarial nets) and a regression imputation model. We also use two tabular datasets, with increasing amounts of missing data, from different industry sectors: healthcare and finance. Our results show that, in contrast to the commonly used RMSE metric, the Jensen-Shannon distance, a statistical distance metric, best assessed the imputation models’ performance. The regression model also ranked higher than the deep learning models when evaluated using the Jensen-Shannon distance.
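To make the contrast between the two kinds of metric concrete, the following is a minimal sketch, not the chapter's experimental code. It assumes synthetic standard-normal data, a hypothetical 20-bin histogram on [-4, 4], and two toy imputers (mean imputation and random draws from the true distribution); none of these choices come from the chapter. It computes RMSE on paired values and the Jensen-Shannon distance between the binned distributions of true and imputed values.

```python
import math
import random


def rmse(truth, imputed):
    """Root mean square error between paired true and imputed values."""
    return math.sqrt(sum((t - i) ** 2 for t, i in zip(truth, imputed)) / len(truth))


def histogram(values, lo, hi, bins):
    """Normalized histogram over `bins` equal-width bins on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(max(int((v - lo) / width), 0), bins - 1)  # clamp outliers to edge bins
        counts[idx] += 1
    return [c / len(values) for c in counts]


def js_distance(p, q):
    """Jensen-Shannon distance (square root of the divergence), base 2, in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with x_i == 0 contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return math.sqrt(max(0.5 * kl(p, m) + 0.5 * kl(q, m), 0.0))


# Assumption: synthetic "held-out" values standing in for the masked entries.
rng = random.Random(0)
truth = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# Toy imputer 1: replace every missing value with the column mean.
mean_value = sum(truth) / len(truth)
mean_imputed = [mean_value] * len(truth)

# Toy imputer 2: draw each missing value from the true distribution.
sampled_imputed = [rng.gauss(0.0, 1.0) for _ in truth]

lo, hi, bins = -4.0, 4.0, 20  # hypothetical binning choice
p_truth = histogram(truth, lo, hi, bins)

rmse_mean = rmse(truth, mean_imputed)
rmse_sampled = rmse(truth, sampled_imputed)
jsd_mean = js_distance(p_truth, histogram(mean_imputed, lo, hi, bins))
jsd_sampled = js_distance(p_truth, histogram(sampled_imputed, lo, hi, bins))

print(f"mean imputation:    RMSE={rmse_mean:.3f}  JS distance={jsd_mean:.3f}")
print(f"sampled imputation: RMSE={rmse_sampled:.3f}  JS distance={jsd_sampled:.3f}")
```

In this toy setting, mean imputation obtains the lower RMSE while collapsing the imputed distribution into a single bin, so its Jensen-Shannon distance is the larger of the two. This illustrates why the two metrics can rank imputers differently; it says nothing about the specific models or datasets evaluated in the chapter. The distance is implemented by hand here to keep the sketch dependency-free; SciPy's `scipy.spatial.distance.jensenshannon` provides a comparable function.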

Acknowledgements

This work was supported by the Natural Sciences and Engineering Research Council of Canada and Southern Ontario Smart Computing Innovation Platform.

Author information

Corresponding author

Correspondence to Omar Boursalie.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Boursalie, O., Samavi, R., Doyle, T.E. (2022). Evaluation Metrics for Deep Learning Imputation Models. In: Shaban-Nejad, A., Michalowski, M., Bianco, S. (eds) AI for Disease Surveillance and Pandemic Intelligence. W3PHAI 2021. Studies in Computational Intelligence, vol 1013. Springer, Cham. https://doi.org/10.1007/978-3-030-93080-6_22
