Seed Point Selection Algorithm in Clustering of Image Data

  • Conference paper
Progress in Intelligent Computing Techniques: Theory, Practice, and Applications

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 719))

Abstract

Massive amounts of data are being collected in almost all sectors of life due to recent technological advancements. Data mining tools, including clustering, are often applied to huge data sets in order to extract hidden and previously unknown information that can be helpful in future decision-making. Clustering is an unsupervised technique that separates data points into homogeneous groups. The seed point, which serves as the core of a cluster, is an important feature of a clustering technique, and the performance of a seed-based clustering technique depends on the choice of initial cluster centers. Initial seed point selection is challenging because the seeds must lead to a good cluster partition while also converging rapidly. In the present research, we propose a seed point selection algorithm based on the maximization of Shannon’s entropy with a distance restriction criterion, and apply it to image data, using the RGB features of color images, as well as to 2D data. Our seed point selection algorithm converges in a minimum number of steps while forming better clusters. We have applied our algorithm to different image data sets as well as discrete data, and the results appear to be satisfactory. We have also compared it with other seed selection methods applied through the K-Means algorithm, in terms of the number of iterations and CPU time.
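
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what entropy-guided seed selection with a distance restriction might look like. Everything here is an assumption made for illustration: the function names `shannon_entropy` and `select_seeds`, the parameters `min_dist` and `bin_size`, the histogram-based probability estimate, and the greedy acceptance rule are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def shannon_entropy(probs):
    """Shannon entropy H = -sum(p * log2(p)), skipping zero probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def select_seeds(points, k, min_dist, bin_size=32):
    """Greedy seed selection (illustrative assumption, not the paper's method).

    points   : list of feature tuples, e.g. (R, G, B) values in 0..255
    k        : maximum number of seeds to return
    min_dist : minimum Euclidean distance allowed between any two seeds
    bin_size : hypothetical quantization step used to estimate probabilities
    """
    n = len(points)

    # Assumption: group points into coarse bins and use the relative
    # frequency of each bin as its probability.
    groups = defaultdict(list)
    for p in points:
        groups[tuple(int(c) // bin_size for c in p)].append(p)

    # Score each bin by its term -p * log2(p) in the Shannon entropy sum,
    # and represent it by the centroid of its member points.
    candidates = []
    for b, members in groups.items():
        prob = len(members) / n
        score = -prob * math.log2(prob)
        centroid = tuple(sum(axis) / len(members) for axis in zip(*members))
        candidates.append((score, b, centroid))

    # Largest entropy contribution first; the bin index breaks ties
    # so that the result is deterministic.
    candidates.sort(key=lambda c: (-c[0], c[1]))

    # Distance restriction: accept a candidate only if it lies at least
    # min_dist away from every seed accepted so far.
    seeds = []
    for _, _, centroid in candidates:
        if all(math.dist(centroid, s) >= min_dist for s in seeds):
            seeds.append(centroid)
        if len(seeds) == k:
            break
    return seeds


# Toy RGB data: two nearly identical dark colors, one bright, one mid-gray.
pixels = ([(10, 10, 10)] * 5 + [(12, 12, 12)] * 3
          + [(200, 200, 200)] * 4 + [(100, 100, 100)] * 2)
seeds = select_seeds(pixels, k=2, min_dist=50.0)
```

Under these assumptions, the returned seeds could be passed as initial centers to any K-Means implementation. If `min_dist` is set too large, fewer than `k` seeds are returned, which is one way a distance restriction can limit the number of clusters.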

Author information

Corresponding author

Correspondence to Kuntal Chowdhury.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chowdhury, K., Chaudhuri, D., Pal, A.K. (2018). Seed Point Selection Algorithm in Clustering of Image Data. In: Sa, P., Sahoo, M., Murugappan, M., Wu, Y., Majhi, B. (eds) Progress in Intelligent Computing Techniques: Theory, Practice, and Applications. Advances in Intelligent Systems and Computing, vol 719. Springer, Singapore. https://doi.org/10.1007/978-981-10-3376-6_13

  • DOI: https://doi.org/10.1007/978-981-10-3376-6_13

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-3375-9

  • Online ISBN: 978-981-10-3376-6

  • eBook Packages: Engineering, Engineering (R0)
