Abstract
Noise filtering is one technique for improving the accuracy of induced classifiers. A classifier's predictive performance suffers when the dataset used to induce it is noisy, so detecting and removing noise is important for increasing classification accuracy. This paper proposes a model for noise detection in datasets using k-means and support vector machine (SVM) techniques. The proposed model was tested on datasets from the University of California, Irvine (UCI) Machine Learning Repository. Experimental results show that the proposed model can improve data quality and increase classification accuracy.
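The abstract does not state how k-means and SVM are combined, so the following is only an illustrative sketch under assumed rules, not the authors' procedure. It assumes binary labels in {-1, +1}, treats an instance as suspect when its label disagrees with the majority label of its k-means cluster, and flags it as noise only if a linear SVM trained on the remaining instances also disagrees with its label. The function names, the farthest-first seeding, the Pegasos-style trainer without a bias term, and the toy data are all hypothetical choices made for the example.

```python
import random
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-first seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def train_linear_svm(points, labels, lam=0.01, steps=2000, seed=0):
    """Pegasos-style sub-gradient descent on the hinge loss.

    No bias term is learned, so this assumes roughly centred features.
    """
    rng = random.Random(seed)
    w = [0.0] * len(points[0])
    for t in range(1, steps + 1):
        i = rng.randrange(len(points))
        eta = 1.0 / (lam * t)
        margin = labels[i] * sum(wj * xj for wj, xj in zip(w, points[i]))
        w = [(1.0 - 1.0 / t) * wj for wj in w]  # shrink step, since eta * lam == 1 / t
        if margin < 1:
            w = [wj + eta * labels[i] * xj for wj, xj in zip(w, points[i])]
    return w


def predict(w, x):
    """Sign of the linear decision function, as a label in {-1, +1}."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


def detect_class_noise(points, labels, k=2):
    """Return indices flagged as class noise under the assumed two-stage rule."""
    clusters = kmeans(points, k)
    majority = {
        c: Counter(l for l, a in zip(labels, clusters) if a == c).most_common(1)[0][0]
        for c in set(clusters)
    }
    # Stage 1 (assumed): suspects disagree with their cluster's majority label.
    suspects = [i for i, (l, a) in enumerate(zip(labels, clusters)) if l != majority[a]]
    clean = [i for i in range(len(points)) if i not in suspects]
    # Stage 2 (assumed): an SVM trained on non-suspects must also disagree.
    w = train_linear_svm([points[i] for i in clean], [labels[i] for i in clean])
    return [i for i in suspects if predict(w, points[i]) != labels[i]]


# Toy data: two well-separated groups, with the labels at indices 2 and 9 flipped.
points = [(-2.0, -2.0), (-1.8, -2.2), (-2.3, -1.7), (-1.6, -1.9), (-2.1, -2.4), (-1.9, -1.6),
          (2.0, 2.0), (1.8, 2.2), (2.3, 1.7), (1.6, 1.9), (2.1, 2.4), (1.9, 1.6)]
labels = [-1, -1, 1, -1, -1, -1, 1, 1, 1, -1, 1, 1]

print(detect_class_noise(points, labels))  # [2, 9]
```

Requiring both stages to agree is a conservative choice made for this sketch: an instance is removed only when the unsupervised and the supervised view both contradict its label, which lowers the risk of discarding correctly labelled boundary cases.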
Acknowledgement
This work is supported by the Ministry of Education and Research Management Centre at the Universiti Teknologi Malaysia under the Research University Grant Scheme (Vote No. Q.J130000.2528.05H84).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Nematzadeh, Z., Ibrahim, R., Selamat, A. (2015). A Method for Class Noise Detection Based on K-means and SVM Algorithms. In: Fujita, H., Guizzi, G. (eds) Intelligent Software Methodologies, Tools and Techniques. SoMeT 2015. Communications in Computer and Information Science, vol 532. Springer, Cham. https://doi.org/10.1007/978-3-319-22689-7_23
DOI: https://doi.org/10.1007/978-3-319-22689-7_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22688-0
Online ISBN: 978-3-319-22689-7
eBook Packages: Computer Science (R0)