Abstract
Clustering is a popular unsupervised learning technique in data mining that divides data into groups of similar objects and is used in many application areas. k-Means is the most popular of the partition-based clustering algorithms for partitioning a dataset into meaningful patterns, but it suffers from several shortcomings. This paper addresses two of them: the number of centroids must be specified a priori, and the algorithm does not handle noise. The paper also presents an overview of cluster analysis and clustering algorithms, and of the preprocessing and normalization techniques used in the modified k-Means algorithm to improve its effectiveness and efficiency.
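As a rough illustration of the baseline that the paper sets out to modify, the following minimal Python sketch applies a normalization step and then plain k-Means. It is not the authors' modified algorithm: the function names, the choice of min-max scaling, the seeding from the first k points, and the sample data are all assumptions made for illustration. Note that k must still be passed in and that every point, noise included, is assigned to some cluster, which are the two shortcomings named above.

```python
from math import dist


def min_max_normalize(points):
    """Rescale every feature to [0, 1]; a constant feature maps to 0.0."""
    columns = list(zip(*points))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(p, lows, highs))
        for p in points
    ]


def k_means(points, k, max_iter=100):
    """Plain k-Means: k is supplied in advance and no point is treated as noise."""
    # Assumption for this sketch: seed the centroids with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical data: the second feature has a much larger range than the first,
# so normalization keeps it from dominating the Euclidean distance.
data = [(1.0, 100.0), (2.0, 120.0), (9.0, 900.0),
        (1.2, 105.0), (10.0, 950.0), (9.5, 920.0)]
normalized = min_max_normalize(data)
labels, centroids = k_means(normalized, k=2)
print(labels)
```

Seeding from the first k points keeps the sketch deterministic; practical implementations usually choose the initial centroids more carefully.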
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Patel, V.R., Mehta, R.G. (2011). Modified k-Means Clustering Algorithm. In: Das, V.V., Thankachan, N. (eds) Computational Intelligence and Information Technology. CIIT 2011. Communications in Computer and Information Science, vol 250. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25734-6_46
DOI: https://doi.org/10.1007/978-3-642-25734-6_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25733-9
Online ISBN: 978-3-642-25734-6
eBook Packages: Computer Science, Computer Science (R0)