Introduction to Missing Data Estimation

  • Collins Achepsah Leke
  • Tshilidzi Marwala
Part of the Studies in Big Data book series (SBD, volume 48)


This chapter describes the missing data problem in detail, along with the different missing data patterns and mechanisms. It then discusses the classical missing data techniques, followed by machine learning approaches to the missing data problem. Finally, machine learning optimization techniques are presented for missing data estimation tasks.


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Faculty of Engineering and Built Environment, University of Johannesburg, Auckland Park, South Africa
