Enhanced KNNC Using Train Sample Clustering
In this paper, a new classification method based on the lazy k-Nearest Neighbor (kNN) classifier is proposed. The method leverages clustering to reduce the size of the kNN training set and thereby improve its performance in terms of time complexity. The new approach is called Modified Nearest Neighbor Classifier Based on Clustering (MNNCBC). Inspired by the traditional lazy kNN algorithm, the main idea is to classify a test instance according to the labels of its k nearest neighbors. In MNNCBC, the training set is first grouped into a small number of partitions. By obtaining several partitions through repeated runs of a simple clustering algorithm, MNNCBC extracts a large number of clusters from those partitions. A class label is then assigned to the center of each cluster produced in the previous step; the assignment is determined by majority vote over the class labels of the patterns in that cluster. MNNCBC iteratively inserts clusters into a pool of selected clusters, which serves as the training set of the final 1-NN classifier, as long as the accuracy of the 1-NN classifier over a set of patterns comprising the training set and the validation set improves. The selected set of the most accurate clusters is used as the training set of the proposed 1-NN classifier. The class label of a new test sample is then determined by the class label of the nearest cluster center. While the lazy kNN classifier is computationally expensive, MNNCBC reduces its computational cost by a factor of 1/k, so the MNNCBC classifier is about k times faster than the kNN classifier. MNNCBC is evaluated on several real datasets from the UCI repository. Empirical results show that MNNCBC achieves a considerable improvement over the kNN classifier in terms of both accuracy and time complexity.
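The pipeline described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: k-means is assumed as the "simple clustering algorithm", Euclidean distance is used throughout, and the function names (`kmeans`, `mnncbc_fit`, `nn_center_predict`) and the toy data are ours.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed stand-in for the paper's 'simple clustering')."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign each point to its nearest center, then recompute centers
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            pts = X[assign == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return centers, assign

def majority_label(labels):
    """Majority vote over the class labels of one cluster's patterns."""
    vals, counts = np.unique(labels, return_counts=True)
    return vals[counts.argmax()]

def nn_center_predict(centers, center_labels, X):
    """1-NN over cluster centers: label of the nearest selected center."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return center_labels[dists.argmin(axis=1)]

def mnncbc_fit(X_train, y_train, X_val, y_val, ks=(2, 3, 4), seed=0):
    """MNNCBC-style training sketch: several clustering runs -> labeled
    candidate centers -> greedy pool selection while accuracy improves."""
    candidates = []
    for i, k in enumerate(ks):                      # several runs of clustering
        centers, assign = kmeans(X_train, k, seed=seed + i)
        for j in range(k):
            mask = assign == j
            if mask.any():
                candidates.append((centers[j], majority_label(y_train[mask])))
    # greedily keep a candidate only if it improves 1-NN accuracy over
    # the patterns comprising the training and validation sets
    X_eval = np.vstack([X_train, X_val])
    y_eval = np.concatenate([y_train, y_val])
    selected, best_acc = [], -1.0
    for center, label in candidates:
        trial = selected + [(center, label)]
        C = np.array([t[0] for t in trial])
        L = np.array([t[1] for t in trial])
        acc = (nn_center_predict(C, L, X_eval) == y_eval).mean()
        if acc > best_acc:
            selected, best_acc = trial, acc
    C = np.array([t[0] for t in selected])
    L = np.array([t[1] for t in selected])
    return C, L

# Toy demo on two well-separated synthetic classes (hypothetical data)
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, size=(20, 2)),
               rng.normal(10.0, 0.5, size=(20, 2))])
y = np.array([0] * 20 + [1] * 20)
tr, va = np.r_[0:15, 20:35], np.r_[15:20, 35:40]
C, L = mnncbc_fit(X[tr], y[tr], X[va], y[va])
preds = nn_center_predict(C, L, X)
```

At prediction time only the selected cluster centers are compared against, which is how the training-set reduction translates into the speedup claimed over plain kNN.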
Keywords: Edited nearest neighbor classifier · kNN · Combinatorial classification