Modified k-Means Clustering Algorithm

Patel, Vaishali R.; Mehta, Rupa G.

doi:10.1007/978-3-642-25734-6_46

Part of the book series: Communications in Computer and Information Science (CCIS, volume 250)

Included in the following conference series:

International Conference on Computational Intelligence and Information Technology

Abstract

Clustering is a popular unsupervised learning technique in data mining that divides data into groups of similar objects and is used in various application areas. k-Means is the most popular of the partition-based clustering algorithms for partitioning a dataset into meaningful patterns. However, k-Means suffers from some shortcomings. This paper addresses two of them: the number of centroids must be specified a priori, and the algorithm does not handle noise. This paper also presents an overview of cluster analysis, clustering algorithms, and the preprocessing and normalization techniques used in modified k-Means to improve the effectiveness and efficiency of the algorithm.
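
For orientation, the following is a minimal sketch of standard k-Means preceded by a normalization step. It is not the modified algorithm proposed in the paper, whose details are not given in the abstract. Min-max scaling is an assumed choice of normalization, and the function names `min_max_normalize` and `k_means` and the sample data are hypothetical. The sketch shows the first shortcoming the abstract names: `k` has to be supplied by the caller before clustering starts.

```python
import math
import random


def min_max_normalize(points):
    """Rescale each feature to [0, 1]; a constant feature maps to 0.0.

    Assumption: min-max scaling stands in for the unspecified
    normalization technique mentioned in the abstract.
    """
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    return [
        tuple(
            (p[d] - lows[d]) / (highs[d] - lows[d]) if highs[d] > lows[d] else 0.0
            for d in range(dims)
        )
        for p in points
    ]


def k_means(points, k, max_iter=100, seed=0):
    """Standard (Lloyd-style) k-Means; k must be chosen a priori."""
    rng = random.Random(seed)
    # Initial centroids are k distinct positions sampled from the data.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Hypothetical sample data: two well-separated groups of three points.
data = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 8.0), (8.5, 7.5), (9.0, 8.2)]
normalized = min_max_normalize(data)
labels, centroids = k_means(normalized, k=2)
```

Because every point is assigned to some centroid, an outlier would pull a centroid toward itself, which is the noise sensitivity the abstract refers to. Normalizing first keeps a feature with a large numeric range from dominating the Euclidean distance.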

Author information

Authors and Affiliations

Department of Computer Science and Engineering, SVMIT, Bharuch, Gujarat, India
Vaishali R. Patel
Department of Computer Engineering, SVNIT, Surat, Gujarat, India
Rupa G. Mehta

Editor information

Editors and Affiliations

The IDES, Ouderkerk aan de Amstel, Amsterdam, 1191 GT, Netherlands
Vinu V Das
College of Engineering, Trivandrum, Kerala, India
Nessy Thankachan

About this paper

Cite this paper

Patel, V.R., Mehta, R.G. (2011). Modified k-Means Clustering Algorithm. In: Das, V.V., Thankachan, N. (eds) Computational Intelligence and Information Technology. CIIT 2011. Communications in Computer and Information Science, vol 250. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25734-6_46

DOI: https://doi.org/10.1007/978-3-642-25734-6_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25733-9
Online ISBN: 978-3-642-25734-6
eBook Packages: Computer Science, Computer Science (R0)