Bagging and Feature Selection for Classification with Incomplete Data

  • Cao Truong Tran
  • Mengjie Zhang
  • Peter Andreae
  • Bing Xue
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10199)


Missing values are an unavoidable issue in many real-world datasets. Dealing with missing values is an essential requirement in classification problems, because inadequate treatment of missing values often leads to large classification errors. Some classifiers can work directly with incomplete data, but they often produce large classification errors and generate complex models. Feature selection and bagging have been used successfully to improve classification, but they have mainly been applied to complete data. This paper proposes a combination of bagging and feature selection to improve classification with incomplete data. To achieve this, a wrapper-based feature selection method that can work directly with incomplete data is used to select suitable feature subsets for bagging. Experiments on eight incomplete datasets were designed to compare the proposed method with three other popular methods able to deal with incomplete data, using C4.5/REPTree as the classifiers and Particle Swarm Optimisation as the search technique in feature selection. Results show that the combination of bagging and feature selection not only achieves better classification accuracy than the other methods but also generates less complex models than the bagging method alone.
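The core idea in the abstract can be sketched in a few lines: build a bagging ensemble in which each base learner is trained on a bootstrap sample restricted to a feature subset chosen by a wrapper search evaluated directly on the incomplete data. The sketch below is illustrative only and makes two simplifying substitutions: a random-subset search stands in for the paper's PSO search, and a 1-nearest-neighbour rule that skips missing values stands in for C4.5/REPTree. All names and the toy dataset are hypothetical.

```python
import random
from collections import Counter

def distance(a, b, feats):
    # Mean squared distance over the selected features,
    # skipping any feature pair where a value is missing (None).
    ds = [(a[f] - b[f]) ** 2 for f in feats
          if a[f] is not None and b[f] is not None]
    return sum(ds) / len(ds) if ds else float("inf")

def one_nn_predict(train, feats, x):
    # 1-NN stand-in for a base classifier that tolerates missing values.
    return min(train, key=lambda row: distance(row[0], x, feats))[1]

def loo_accuracy(data, feats):
    # Leave-one-out accuracy: the wrapper's fitness measure,
    # computed directly on the incomplete data.
    hits = sum(one_nn_predict(data[:i] + data[i + 1:], feats, x) == y
               for i, (x, y) in enumerate(data))
    return hits / len(data)

def wrapper_select(data, n_features, iters=20, rng=None):
    # Wrapper feature selection: score candidate subsets by classifier
    # accuracy (a random search stands in for PSO here).
    rng = rng or random.Random(0)
    best = list(range(n_features))
    best_acc = loo_accuracy(data, best)
    for _ in range(iters):
        feats = [f for f in range(n_features) if rng.random() < 0.5] or [0]
        acc = loo_accuracy(data, feats)
        if acc > best_acc:
            best, best_acc = feats, acc
    return best

def bagging_with_fs(data, n_features, n_learners=5, seed=0):
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_learners):
        boot = [rng.choice(data) for _ in data]            # bootstrap sample
        feats = wrapper_select(boot, n_features, rng=rng)  # per-learner subset
        ensemble.append((boot, feats))
    return ensemble

def predict(ensemble, x):
    # Majority vote across the base learners.
    votes = Counter(one_nn_predict(boot, feats, x) for boot, feats in ensemble)
    return votes.most_common(1)[0][0]

# Toy incomplete dataset: (features, label); None marks a missing value.
data = [
    ([1.0, None, 0.2], "a"), ([0.9, 0.1, None], "a"), ([1.1, 0.0, 0.3], "a"),
    ([5.0, 4.2, None], "b"), ([None, 4.0, 4.1], "b"), ([5.2, 4.1, 4.0], "b"),
]
model = bagging_with_fs(data, n_features=3)
print(predict(model, [1.0, 0.1, 0.2]))
```

Because each learner sees both a different bootstrap sample and a different feature subset, the ensemble is more diverse than plain bagging, which is one plausible reading of why the combination in the paper yields smaller, more accurate models.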


Incomplete data · Ensemble · Feature selection · Classification · Particle swarm optimisation · C4.5 · REPTree



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Cao Truong Tran¹
  • Mengjie Zhang¹
  • Peter Andreae¹
  • Bing Xue¹
  1. School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand
