Multiple-Side Multiple-Learner for Incomplete Data Classification

  • Conference paper
Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9437)

Abstract

Selective classifiers can improve classification accuracy and algorithm efficiency by removing irrelevant attributes from the data. However, most of them handle only complete data, whereas real datasets are often incomplete for various reasons. Incomplete datasets also contain irrelevant attributes that degrade algorithm performance. Based on an analysis of the main classification methods for incomplete data, this paper proposes a Multiple-Side Multiple-Learner algorithm for incomplete data (MSML). MSML first obtains a feature subset of the original incomplete dataset based on the chi-square statistic. Then, according to the missing attribute values in the selected feature subset, MSML derives a group of data subsets. Each data subset is used to train a sub-classifier with the bagging algorithm. Finally, the results of the sub-classifiers are combined by weighted majority voting. Experimental results on incomplete UCI datasets show that MSML can effectively reduce the number of attributes and thus improve execution efficiency, while also improving classification accuracy and stability.
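The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes categorical attributes, uses `None` as a hypothetical missing-value marker, computes the chi-square statistic only over rows where the attribute is observed, and groups rows by their exact pattern of observed attributes, which is only one plausible reading of how the data subsets are derived. The function names and the voting weights are illustrative assumptions, and the bagging step that trains each sub-classifier is omitted.

```python
from collections import Counter, defaultdict

MISSING = None  # assumed marker for a missing attribute value


def chi_square(column, labels):
    """Chi-square statistic between one categorical attribute and the class,
    computed over the rows where the attribute is observed (an assumption)."""
    pairs = [(v, y) for v, y in zip(column, labels) if v is not MISSING]
    n = len(pairs)
    if n == 0:
        return 0.0
    value_counts = Counter(v for v, _ in pairs)
    class_counts = Counter(y for _, y in pairs)
    observed = Counter(pairs)
    stat = 0.0
    for v, nv in value_counts.items():
        for y, ny in class_counts.items():
            expected = nv * ny / n
            stat += (observed[(v, y)] - expected) ** 2 / expected
    return stat


def select_features(rows, labels, k):
    """Indices of the k attributes with the largest chi-square statistic."""
    n_attrs = len(rows[0])
    scores = [chi_square([r[j] for r in rows], labels) for j in range(n_attrs)]
    ranked = sorted(range(n_attrs), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def group_by_missing_pattern(rows, labels, selected):
    """Map each tuple of observed selected attributes to the rows sharing it.
    Each group would serve as the training set of one sub-classifier."""
    groups = defaultdict(list)
    for row, y in zip(rows, labels):
        present = tuple(j for j in selected if row[j] is not MISSING)
        groups[present].append((tuple(row[j] for j in present), y))
    return dict(groups)


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting sub-classifiers have the largest
    total weight; how the weights are chosen is left open here."""
    totals = defaultdict(float)
    for label, w in zip(predictions, weights):
        totals[label] += w
    return max(totals, key=totals.get)


# Toy data: three categorical attributes, the third partly missing.
rows = [("a", "x", None), ("a", "y", "p"), ("b", "x", "p"), ("b", "y", None)]
labels = [0, 0, 1, 1]
selected = select_features(rows, labels, k=1)
subsets = group_by_missing_pattern(rows, labels, [0, 2])
decision = weighted_majority_vote(["yes", "no", "no"], [0.9, 0.3, 0.4])
```

On the toy data, the first attribute determines the class exactly, so it ranks first under the chi-square criterion, and the rows split into two groups: those observed on attributes 0 and 2, and those observed on attribute 0 only. In a fuller version, each group would train a bagged sub-classifier whose prediction enters the weighted vote.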

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61175046 and 61203290).

Corresponding author

Correspondence to Yan-Ping Zhang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Yan, Yt., Zhang, YP., Du, XQ. (2015). Multiple-Side Multiple-Learner for Incomplete Data Classification. In: Yao, Y., Hu, Q., Yu, H., Grzymala-Busse, J.W. (eds) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. Lecture Notes in Computer Science, vol 9437. Springer, Cham. https://doi.org/10.1007/978-3-319-25783-9_29

  • DOI: https://doi.org/10.1007/978-3-319-25783-9_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25782-2

  • Online ISBN: 978-3-319-25783-9

  • eBook Packages: Computer Science (R0)
