A review: accuracy optimization in clustering ensembles using genetic algorithms

Artificial Intelligence Review

Abstract

Clustering ensembles have emerged as a prominent method for improving the robustness, stability, and accuracy of unsupervised classification solutions. A clustering ensemble combines multiple partitions generated by different clustering algorithms into a single clustering solution. Genetic algorithms are well suited to optimization problems, including clustering. To date, significant progress has been made toward finding a consensus clustering that yields better results than existing clusterings. This paper presents a survey of genetic algorithms designed for clustering ensembles. It begins with an introduction to clustering ensembles and clustering ensemble algorithms. It then describes a number of proposed genetic-guided clustering ensemble algorithms, in particular their genotypes, fitness functions, and genetic operations. Next, the clustering accuracies of the genetic-guided clustering ensemble algorithms are compared. The paper concludes that using genetic algorithms in clustering ensembles improves clustering accuracy, and it identifies open questions for future research.
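To make the three ingredients named in the abstract concrete (genotype, fitness function, genetic operations), the following is a minimal sketch of a genetic search for a consensus partition. It is not any of the surveyed algorithms. It assumes a label-based genotype (one cluster label per object), a fitness that rewards agreement with a co-association matrix built from the base partitions, binary tournament selection, uniform crossover, and random-reset mutation. All function names and parameter values are illustrative.

```python
import random


def coassociation(partitions):
    """co[i][j] = fraction of base partitions that put objects i and j together."""
    n = len(partitions[0])
    m = len(partitions)
    return [[sum(p[i] == p[j] for p in partitions) / m for j in range(n)]
            for i in range(n)]


def fitness(labels, co):
    """Pairwise agreement between a candidate partition and the ensemble.

    A pair placed together scores co[i][j]; a pair kept apart scores 1 - co[i][j].
    """
    n = len(labels)
    score = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            same = labels[i] == labels[j]
            score += co[i][j] if same else 1.0 - co[i][j]
    return score


def ga_consensus(partitions, k, pop_size=30, generations=60, p_mut=0.05, seed=0):
    """Search for a consensus partition with at most k clusters (illustrative GA)."""
    rng = random.Random(seed)
    n = len(partitions[0])
    co = coassociation(partitions)

    def score(ind):
        return fitness(ind, co)

    # Genotype: label-based encoding, one cluster label per object.
    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        new_pop = [best[:]]  # elitism: carry the best individual over unchanged
        while len(new_pop) < pop_size:
            # Selection: two binary tournaments pick the parents.
            a = max(rng.sample(pop, 2), key=score)
            b = max(rng.sample(pop, 2), key=score)
            # Crossover: each gene is taken from either parent with equal probability.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            # Mutation: reset a gene to a random label with probability p_mut.
            child = [rng.randrange(k) if rng.random() < p_mut else g for g in child]
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=score)
    return best, score(best)


if __name__ == "__main__":
    # Three base partitions of six objects; the label names differ between them.
    base = [
        [0, 0, 0, 1, 1, 1],
        [1, 1, 1, 0, 0, 0],
        [0, 0, 1, 1, 1, 1],
    ]
    labels, value = ga_consensus(base, k=2)
    print(labels, value)
```

Because the fitness depends only on which pairs of objects share a cluster, it is unaffected by how the base partitions name their clusters. The label-based genotype is the simplest choice but is redundant, since relabeling the clusters gives a different chromosome for the same partition, so uniform crossover can disrupt good groupings. That redundancy is one reason alternative encodings and operators are studied for grouping problems.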



Author information

Corresponding author

Correspondence to Reza Ghaemi.

About this article

Cite this article

Ghaemi, R., Sulaiman, N.b., Ibrahim, H. et al. A review: accuracy optimization in clustering ensembles using genetic algorithms. Artif Intell Rev 35, 287–318 (2011). https://doi.org/10.1007/s10462-010-9195-5

  • DOI: https://doi.org/10.1007/s10462-010-9195-5
