Selective Ensemble Approach for Classification of Datasets with Incomplete Values

Wang, Yan; Gao, Yi; Shen, Ruimin; Yang, Fan

doi:10.1007/978-3-642-25664-6_33

Part of the book series: Advances in Intelligent and Soft Computing (AINSC, volume 122)

Abstract

In many research settings, the data to be classified contain incomplete values, which degrade the learning performance of classifiers. Although various classification algorithms have been proposed, most of them cannot deal with incomplete data. This paper proposes a novel approach based on selective ensemble for classifying incomplete data. The method finds the local complete patterns, for which the feature values are complete, and trains multiple component learners for each local complete subset. It then combines the outputs of the classifiers. Unlike previous methods, it needs no assumption about the mechanism that produces the incomplete values. The proposed method is evaluated on three datasets from the UCI Machine Learning Repository. The experimental results show that its classification accuracy is superior to that of widely used imputation and deletion methods.
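The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a "local complete pattern" is the set of features observed in at least one training instance, that each learner is trained on every instance complete on that pattern, and that outputs are combined by an unweighted majority vote among the learners applicable to a test instance. The nearest-centroid base learner and the names `observed`, `CentroidLearner`, `train_ensemble`, and `predict` are hypothetical choices made for illustration. The paper's actual base learner, selection criterion, and combination rule are not stated here.

```python
from collections import Counter, defaultdict


def observed(row):
    """Indices of the features that are present (not None) in a row."""
    return frozenset(i for i, v in enumerate(row) if v is not None)


class CentroidLearner:
    """Nearest-centroid classifier on a fixed feature subset (assumed stand-in learner)."""

    def __init__(self, features):
        self.features = sorted(features)
        self.centroids = {}

    def fit(self, rows, labels):
        sums = defaultdict(lambda: [0.0] * len(self.features))
        counts = Counter(labels)
        for row, label in zip(rows, labels):
            for k, f in enumerate(self.features):
                sums[label][k] += row[f]
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in counts}
        return self

    def predict(self, row):
        def squared_distance(c):
            return sum((row[f] - m) ** 2
                       for f, m in zip(self.features, self.centroids[c]))
        return min(self.centroids, key=squared_distance)


def train_ensemble(rows, labels):
    """Train one learner per observed-feature pattern found in the training data."""
    ensemble = []
    for pattern in {observed(r) for r in rows}:
        if not pattern:
            continue  # a row with no observed features defines no usable subset
        # Local complete subset: every row that is complete on this pattern.
        subset = [(r, y) for r, y in zip(rows, labels) if pattern <= observed(r)]
        xs, ys = zip(*subset)
        ensemble.append(CentroidLearner(pattern).fit(xs, ys))
    return ensemble


def predict(ensemble, row):
    """Majority vote among learners whose features are all observed in the row."""
    available = observed(row)
    votes = Counter(m.predict(row) for m in ensemble
                    if set(m.features) <= available)
    return votes.most_common(1)[0][0] if votes else None


# Toy data (made up for illustration); None marks a missing value.
rows = [
    [1.0, 1.0, None], [1.2, None, 0.9], [0.9, 1.1, 1.0],
    [5.0, 5.0, None], [5.2, None, 4.9], [4.8, 5.1, 5.0],
]
labels = ["a", "a", "a", "b", "b", "b"]
ensemble = train_ensemble(rows, labels)
print(predict(ensemble, [1.1, None, 1.0]))
print(predict(ensemble, [5.1, 4.9, 5.0]))
```

In this sketch no value is imputed and no instance is deleted outright: an incomplete instance still contributes to every learner whose feature subset it fully covers. A test instance that matches no learner's feature subset gets no prediction (`None`), a case a real system would need to handle.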

Author information

Authors and Affiliations

School of Continuing Education, Shanghai Jiao Tong University, Shanghai, 200240, China
Yan Wang, Yi Gao & Ruimin Shen
Chongqing Academy of Science & Technology, Chongqing, 401123, China
Fan Yang

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, 200240, Shanghai, China
Yinglin Wang
School of Information Science and Technology, Southwest Jiaotong University, 610031, Chengdu, Sichuan Province, China
Tianrui Li

About this paper

Cite this paper

Wang, Y., Gao, Y., Shen, R., Yang, F. (2011). Selective Ensemble Approach for Classification of Datasets with Incomplete Values. In: Wang, Y., Li, T. (eds) Foundations of Intelligent Systems. Advances in Intelligent and Soft Computing, vol 122. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25664-6_33

DOI: https://doi.org/10.1007/978-3-642-25664-6_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25663-9
Online ISBN: 978-3-642-25664-6
eBook Packages: Engineering, Engineering (R0)
