Multiple-Side Multiple-Learner for Incomplete Data Classification

Yan, Yuan-ting; Zhang, Yan-Ping; Du, Xiu-Quan

doi:10.1007/978-3-319-25783-9_29


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9437)


Abstract

Selective classifiers can improve classification accuracy and algorithm efficiency by removing irrelevant attributes from the data. However, most of them deal only with complete data, whereas real datasets are often incomplete for various reasons. Incomplete datasets also contain irrelevant attributes, which have a negative effect on algorithm performance. Based on an analysis of the main classification methods for incomplete data, this paper proposes a Multiple-Side Multiple-Learner algorithm for incomplete data (MSML). MSML first obtains a feature subset of the original incomplete dataset based on the chi-square statistic. Then, according to the missing attribute values within the selected feature subset, MSML derives a group of data subsets. Each data subset is used to train a sub-classifier based on the bagging algorithm. Finally, the results of the different sub-classifiers are combined by weighted majority voting. Experimental results on incomplete UCI datasets show that MSML effectively reduces the number of attributes and thus improves execution efficiency, while also improving classification accuracy and algorithm stability.
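The stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how missing values are encoded, how data subsets are formed, or how voting weights are chosen. The sketch assumes that missing values are encoded as `None`, that attributes are categorical, that rows are grouped by which selected attributes are observed, and that voting weights are supplied by the caller. All function names are hypothetical, and the bagging step that trains each sub-classifier is omitted.

```python
from collections import Counter, defaultdict

MISSING = None  # assumption: missing attribute values are encoded as None


def chi_square(column, labels):
    """Chi-square statistic of one categorical attribute against the class,
    computed only on rows where the attribute is observed."""
    pairs = [(v, y) for v, y in zip(column, labels) if v is not MISSING]
    n = len(pairs)
    if n == 0:
        return 0.0
    value_counts = Counter(v for v, _ in pairs)
    class_counts = Counter(y for _, y in pairs)
    observed = Counter(pairs)
    stat = 0.0
    for v, nv in value_counts.items():
        for y, ny in class_counts.items():
            expected = nv * ny / n
            stat += (observed[(v, y)] - expected) ** 2 / expected
    return stat


def select_features(rows, labels, k):
    """Indices of the k attributes with the largest chi-square score."""
    n_attrs = len(rows[0])
    scores = [chi_square([r[j] for r in rows], labels) for j in range(n_attrs)]
    ranked = sorted(range(n_attrs), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def group_by_missing_pattern(rows, selected):
    """Map each pattern of observed selected attributes to its row indices.
    Grouping rows this way is an assumption made for illustration."""
    groups = defaultdict(list)
    for i, r in enumerate(rows):
        pattern = tuple(j for j in selected if r[j] is not MISSING)
        groups[pattern].append(i)
    return dict(groups)


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting sub-classifiers have the largest
    total weight (the weights themselves are hypothetical inputs)."""
    totals = defaultdict(float)
    for label, w in zip(predictions, weights):
        totals[label] += w
    return max(totals, key=totals.get)


# Toy data: four rows, three categorical attributes, two classes.
rows = [("a", "x", None), ("a", "y", "p"), ("b", "x", "p"), ("b", None, "q")]
labels = [0, 0, 1, 1]

selected = select_features(rows, labels, k=1)
print(selected)
print(group_by_missing_pattern(rows, [0, 1]))
print(weighted_majority_vote([0, 1, 1], [0.9, 0.3, 0.4]))
```

In a fuller version, each group returned by `group_by_missing_pattern` would be used to train a bagged sub-classifier on the attributes observed in that group, and the predictions of the sub-classifiers applicable to a test row would be passed to `weighted_majority_vote`.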



Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61175046 and 61203290).

Author information

Authors and Affiliations

Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, 230601, Anhui Province, China
Yuan-ting Yan, Yan-Ping Zhang & Xiu-Quan Du


Corresponding author

Correspondence to Yan-Ping Zhang.

Editor information

Editors and Affiliations

University of Regina, Regina, SK, Canada
Yiyu Yao
Tianjin University, Tianjin, China
Qinghua Hu
Chongqing University of Posts and Telecommunications, Chongqing, China
Hong Yu
University of Kansas, Lawrence, KS, USA
Jerzy W. Grzymala-Busse


About this paper

Cite this paper

Yan, Yt., Zhang, YP., Du, XQ. (2015). Multiple-Side Multiple-Learner for Incomplete Data Classification. In: Yao, Y., Hu, Q., Yu, H., Grzymala-Busse, J.W. (eds) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. Lecture Notes in Computer Science, vol 9437. Springer, Cham. https://doi.org/10.1007/978-3-319-25783-9_29


DOI: https://doi.org/10.1007/978-3-319-25783-9_29
Published: 08 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25782-2
Online ISBN: 978-3-319-25783-9
eBook Packages: Computer Science, Computer Science (R0)
