Seed Point Selection Algorithm in Clustering of Image Data

Chowdhury, Kuntal; Chaudhuri, Debasis; Pal, Arup Kumar

doi:10.1007/978-981-10-3376-6_13

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 719)

Abstract

Massive amounts of data are being collected in almost all sectors of life due to recent technological advancements. Data mining tools, including clustering, are often applied to huge data sets to extract hidden and previously unknown information that can help in future decision-making. Clustering is an unsupervised technique that separates data points into homogeneous groups. The seed point, also called the core of a cluster, is an important feature of a clustering technique, and the performance of a seed-based clustering technique depends on the choice of initial cluster centers. Selecting the initial seed points is challenging because the choice must yield a good cluster partition while converging rapidly. In the present research we propose a seed point selection algorithm based on the maximization of Shannon’s entropy with a distance restriction criterion, and apply it to image data, using the RGB features of color images, as well as to 2D data. Our seed point selection algorithm converges in a minimum number of steps while forming better clusters. We have applied our algorithm to different image data sets as well as discrete data, and the results appear to be satisfactory. We have also compared the results with other seed selection methods used with the K-Means algorithm, in terms of number of iterations and CPU time.
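
The abstract names the ingredients of the method (Shannon's entropy, a distance restriction, RGB features) but not its exact steps, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes that each distinct pixel value is scored by its entropy term -p log2 p, where p is its relative frequency, and that candidates are taken greedily in descending score while rejecting any candidate closer than a threshold to an already chosen seed. The function name entropy_seeds, the parameters k and min_dist, and the toy pixel data are all hypothetical.

```python
import math
from collections import Counter


def entropy_seeds(points, k, min_dist):
    """Pick up to k seeds greedily by per-value entropy contribution.

    ASSUMPTION: this scoring rule and the greedy distance check are
    illustrative choices; they are not taken from the chapter.
    """
    counts = Counter(points)  # frequency of each distinct point
    total = len(points)

    def score(p):
        prob = counts[p] / total
        return -prob * math.log2(prob)  # Shannon entropy term for this value

    # Highest entropy contribution first; ties broken by value for determinism.
    candidates = sorted(counts, key=lambda p: (-score(p), p))

    seeds = []
    for cand in candidates:
        # Distance restriction: reject candidates too close to an existing seed.
        if all(math.dist(cand, s) >= min_dist for s in seeds):
            seeds.append(cand)
        if len(seeds) == k:
            break
    return seeds


# Toy "image": RGB pixels as tuples (hypothetical data).
pixels = (
    [(250, 10, 10)] * 5
    + [(245, 12, 8)] * 4
    + [(10, 10, 240)] * 3
    + [(10, 250, 10)] * 2
)
seeds = entropy_seeds(pixels, k=3, min_dist=50.0)
print(seeds)
```

In this toy data the second most frequent color lies very close to the first, so the distance check skips it and the three seeds land in three visually distinct colors. Seeds chosen this way could then be passed as initial centers to a K-Means implementation in place of random initialization.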

Author information

Authors and Affiliations

Department of Information Technology, DIT University, Dehradun, 248001, India
Kuntal Chowdhury
DIC (DRDO), Panagarh Base, Muraripur, Bardhaman, 713149, West Bengal, India
Debasis Chaudhuri
Department of Computer Science & Engineering, Indian Institute of Technology (Indian School Of Mines), Dhanbad, 826004, India
Arup Kumar Pal

Corresponding author

Correspondence to Kuntal Chowdhury.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, National Institute of Technology, Rourkela, Odisha, India
Pankaj Kumar Sa
Department of Computer Science and Engineering, National Institute of Technology, Rourkela, Odisha, India
Manmath Narayan Sahoo
School of Mechatronic Engineering, Universiti Malaysia Perlis (UniMAP), Arau, Perlis, Malaysia
M. Murugappan
The University of Exeter, Exeter, Devon, United Kingdom
Yulei Wu
Department of Computer Science and Engineering, National Institute of Technology, Rourkela, Odisha, India
Banshidhar Majhi

About this paper

Cite this paper

Chowdhury, K., Chaudhuri, D., Pal, A.K. (2018). Seed Point Selection Algorithm in Clustering of Image Data. In: Sa, P., Sahoo, M., Murugappan, M., Wu, Y., Majhi, B. (eds) Progress in Intelligent Computing Techniques: Theory, Practice, and Applications. Advances in Intelligent Systems and Computing, vol 719. Springer, Singapore. https://doi.org/10.1007/978-981-10-3376-6_13

DOI: https://doi.org/10.1007/978-981-10-3376-6_13
Published: 05 August 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3375-9
Online ISBN: 978-981-10-3376-6
eBook Packages: Engineering, Engineering (R0)