Abstract
We propose a new cluster-based sample reduction method that is unsupervised, geometric, and density-based. The original data is first divided into clusters, and each cluster is then divided into “portions”, each defined as the area between two concentric circles. Using the proposed geometry-based formulas, we calculate the membership value of each sample with respect to a specific portion. Samples are then selected from the original data according to their calculated membership values. We conduct various experiments on the NSL-KDD and KDDCup99 datasets.
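The pipeline described in the abstract can be illustrated with a minimal sketch for a single cluster. The abstract does not state the paper's membership formulas, so everything specific here is an assumption: the function names `ring_index` and `reduce_cluster`, the equal-width rings, the parameters `n_rings` and `keep_ratio`, and the placeholder membership value (each ring's share of the cluster's points, used as a simple density proxy) are all hypothetical and stand in for the authors' geometry-based formulas.

```python
import math
import random
from collections import defaultdict


def ring_index(dist, radius, n_rings):
    """Return the index of the equal-width concentric ring ("portion")
    that contains a point at distance `dist` from the cluster centre."""
    if radius == 0:
        return 0  # degenerate cluster: every point sits on the centre
    # The farthest point (dist == radius) is clamped into the last ring.
    return min(int(dist / radius * n_rings), n_rings - 1)


def reduce_cluster(points, n_rings=4, keep_ratio=0.5, seed=0):
    """Select a subset of one cluster and return the kept indices.

    ASSUMPTION: the membership value used here is a placeholder
    (ring population / cluster population), not the paper's formula.
    """
    rng = random.Random(seed)
    n = len(points)
    dim = len(points[0])

    # Cluster centre and each point's distance to it.
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]
    dists = [math.dist(p, centroid) for p in points]
    radius = max(dists)

    # Group point indices by the ring they fall into.
    rings = defaultdict(list)
    for i, d in enumerate(dists):
        rings[ring_index(d, radius, n_rings)].append(i)

    kept = []
    for r in sorted(rings):
        members = rings[r]
        membership = len(members) / n  # placeholder density proxy
        quota = round(keep_ratio * n * membership)
        quota = min(len(members), max(1, quota))  # keep 1..len(members)
        kept.extend(rng.sample(members, quota))
    return sorted(kept)


points = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (4, 0), (0, 4), (-4, 0)]
kept = reduce_cluster(points, n_rings=2, keep_ratio=0.5)
reduced = [points[i] for i in kept]
```

In this sketch, every non-empty ring keeps at least one sample, so sparse outer portions are not dropped entirely. Applying it to a full dataset would mean running a clustering step first and calling `reduce_cluster` once per cluster; the choice of clustering algorithm is likewise left open here.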
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Mohammadi, M., Raahemi, B., Akbari, A. (2014). A Clustering Density-Based Sample Reduction Method. In: Sokolova, M., van Beek, P. (eds) Advances in Artificial Intelligence. Canadian AI 2014. Lecture Notes in Computer Science, vol 8436. Springer, Cham. https://doi.org/10.1007/978-3-319-06483-3_32
DOI: https://doi.org/10.1007/978-3-319-06483-3_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06482-6
Online ISBN: 978-3-319-06483-3
eBook Packages: Computer Science (R0)