Modified k-Means Clustering Algorithm

  • Conference paper
In: Computational Intelligence and Information Technology (CIIT 2011)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 250)

Abstract

Clustering is a popular unsupervised learning technique in data mining that divides data into groups of similar objects and is used in various application areas. Among partition-based clustering algorithms, k-Means is the most popular for partitioning a dataset into meaningful patterns. However, k-Means has some shortcomings. This paper addresses two of them: the number of centroids must be supplied a priori, and the algorithm does not handle noise. This paper also presents an overview of cluster analysis and clustering algorithms, and describes the preprocessing and normalization techniques used in the modified k-Means clustering algorithm to improve its effectiveness and efficiency.
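
For orientation, the following is a minimal sketch of standard k-Means preceded by a normalization step. It is not the modified algorithm proposed in the paper, whose details are not given in the abstract. The function names, the choice of min-max scaling as the normalization technique, and the toy data are all assumptions made for illustration. The sketch makes the first shortcoming visible (the caller must pass `k`), and it has no notion of noise, so every point is forced into some cluster.

```python
import random


def min_max_normalize(points):
    """Rescale every feature to [0, 1]; a constant feature maps to 0.0.

    Min-max scaling is an assumed choice; the abstract does not say which
    normalization technique the paper uses.
    """
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    return [
        tuple(
            (p[d] - lows[d]) / (highs[d] - lows[d]) if highs[d] > lows[d] else 0.0
            for d in range(dims)
        )
        for p in points
    ]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Standard (Lloyd-style) k-Means; k must be chosen by the caller."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k distinct data points
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Toy data (hypothetical): two features on very different scales.
data = [(1.0, 100.0), (1.2, 110.0), (0.9, 105.0),
        (8.0, 900.0), (8.3, 950.0), (7.9, 920.0)]
normalized = min_max_normalize(data)
labels, centroids = k_means(normalized, k=2)
print(labels)
```

Without the normalization step, the second feature would dominate the distance computation simply because its values are larger, which is the usual motivation for scaling before clustering.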

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Patel, V.R., Mehta, R.G. (2011). Modified k-Means Clustering Algorithm. In: Das, V.V., Thankachan, N. (eds) Computational Intelligence and Information Technology. CIIT 2011. Communications in Computer and Information Science, vol 250. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25734-6_46

  • DOI: https://doi.org/10.1007/978-3-642-25734-6_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25733-9

  • Online ISBN: 978-3-642-25734-6

  • eBook Packages: Computer Science; Computer Science (R0)
