
A selective neural network ensemble classification for incomplete data

  • Original Article

International Journal of Machine Learning and Cybernetics

Abstract

Neural network ensemble (NNE) is a simple and effective method for classifying incomplete data. However, as the number of missing values increases, the number of incomplete feature combinations (feature subsets) grows rapidly, which makes the NNE method very time-consuming; its accuracy also needs improvement. In this paper, we propose a selective neural network ensemble (SNNE) classification method for incomplete data. SNNE first obtains all available feature subsets of the incomplete dataset and then applies mutual information to measure the importance (relevance) degree of each subset. Next, an optimization process removes every feature subset that satisfies the following condition: it contains at least one other feature subset, and the difference between their importance degrees is smaller than a given threshold δ. Finally, the remaining feature subsets are used to train a group of neural networks, and the classification of a given sample is decided by weighted majority voting over all applicable components of the ensemble. Experimental results show that δ = 0.05 is a reasonable choice in our study: it improves the efficiency of the algorithm without loss of accuracy. Experiments also show that SNNE outperforms the NNE-based algorithms it is compared with, and that it greatly reduces running time on datasets with larger numbers of missing values.
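The subset-pruning step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: subset importance is approximated here by summing per-feature mutual information with the class label (a hypothetical proxy for the paper's mutual-information measure), features are assumed discrete, and the names `mutual_info` and `prune_subsets` are our own.

```python
import numpy as np

def mutual_info(x, y):
    """Empirical mutual information (in nats) between two discrete arrays."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            px, py = np.mean(x == xv), np.mean(y == yv)
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi

def subset_importance(X, y, subset):
    # Crude proxy: sum of per-feature mutual information over the subset.
    return sum(mutual_info(X[:, j], y) for j in subset)

def prune_subsets(subsets, importance, delta=0.05):
    """Drop a subset when it contains an already-kept subset whose
    importance differs from its own by less than delta."""
    kept = []
    for s in sorted(subsets, key=len):  # consider smaller subsets first
        if any(set(t) < set(s) and abs(importance[s] - importance[t]) < delta
               for t in kept):
            continue  # a contained subset is nearly as informative
        kept.append(s)
    return kept
```

With δ = 0.05 as in the paper, a superset whose importance gain over one of its contained subsets is below the threshold is discarded, so only one network per retained subset needs to be trained.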



Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61175046 and 61203290) and the Natural Science Foundation of Anhui Province (No. 1408085MF132).

Author information

Correspondence to Yuan-Ting Yan.


Cite this article

Yan, YT., Zhang, YP., Zhang, YW. et al. A selective neural network ensemble classification for incomplete data. Int. J. Mach. Learn. & Cyber. 8, 1513–1524 (2017). https://doi.org/10.1007/s13042-016-0524-0
