
Robust Ensemble Classifier Combination Based on Noise Removal with One-Class SVM

  • Conference paper
Neural Information Processing (ICONIP 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9490)

Abstract

In machine learning, when the number of labeled input samples becomes very large, building a classification model is difficult because the input data set does not fit in memory during the training phase of the algorithm; it is therefore necessary to partition the data in order to handle the overall data set. Bagging- and boosting-based data partitioning methods have been widely used in data mining and pattern recognition, and both have shown great potential for improving classification performance. This study analyzes data set partitioning combined with noise removal and its impact on the performance of multiple-classifier models. We propose a noise-filtering preprocessing step at each data set partition to improve classifier performance, applying a Gini impurity approach to find the best split percentage for the noise filter ratio. Each filtered sub-data set is then used to train an individual ensemble model.
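
As a rough illustration of the pipeline the abstract describes, the sketch below uses scikit-learn's OneClassSVM as the per-partition noise filter and decision trees as the ensemble members. The partition count, the RBF kernel, the fixed nu filter ratio, and the majority-vote combiner are illustrative assumptions, not the configuration reported in the paper (which tunes the filter ratio via Gini impurity).

```python
# Illustrative sketch, not the paper's exact method: partition the training
# set, drop suspected noise in each partition with a one-class SVM, train
# one base learner per filtered partition, and combine them by majority vote.
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.tree import DecisionTreeClassifier

def train_ensemble(X, y, n_partitions=5, nu=0.1):
    # nu upper-bounds the fraction of samples treated as noise; the paper
    # searches for the best filter ratio with Gini impurity, whereas this
    # sketch simply fixes it.
    rng = np.random.default_rng(0)
    models = []
    for part in np.array_split(rng.permutation(len(X)), n_partitions):
        X_p, y_p = X[part], y[part]
        # OneClassSVM.predict returns +1 for inliers and -1 for outliers.
        keep = OneClassSVM(kernel="rbf", nu=nu).fit(X_p).predict(X_p) == 1
        models.append(DecisionTreeClassifier().fit(X_p[keep], y_p[keep]))
    return models

def predict_ensemble(models, X):
    # Majority vote over the members; assumes non-negative integer labels.
    votes = np.stack([m.predict(X).astype(int) for m in models])
    return np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, votes)
```

Swapping the fixed nu for a search over candidate filter ratios scored by Gini impurity would bring this sketch closer to the procedure the abstract outlines.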



Author information


Corresponding author

Correspondence to Ferhat Özgür Çatak.



Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Çatak, F.Ö. (2015). Robust Ensemble Classifier Combination Based on Noise Removal with One-Class SVM. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9490. Springer, Cham. https://doi.org/10.1007/978-3-319-26535-3_2

  • DOI: https://doi.org/10.1007/978-3-319-26535-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26534-6

  • Online ISBN: 978-3-319-26535-3

  • eBook Packages: Computer Science, Computer Science (R0)
