Optimizing the minimum spanning tree-based extracted clusters using evolution strategy

Abstract

Many approaches are available for extracting clusters. Some partition the data, while others extract hierarchical structures. Graphs provide a convenient representation of entities and the relationships among them. Clusters can be extracted from a graph-based structure using minimum spanning trees (MSTs). This work focuses on optimizing MST-based extracted clusters using an Evolution Strategy (ES). A graph may have multiple MSTs, so the clusters formed vary with the MST selected. This work uses a (1+1)-ES to obtain the optimal MST-based clustering. The Davies–Bouldin Index is used as the fitness function to evaluate the quality of the clusters formed by the ES population. The proposed approach is evaluated on eleven benchmark datasets: seven are microarray datasets and the remaining four are taken from the UCI machine learning repository. Both external and internal cluster validation indices are used to evaluate the results. The performance of the proposed approach is compared with two state-of-the-art MST-based clustering algorithms. The results indicate promising performance of the proposed approach in terms of both running time and cluster validity indices.
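The idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the encoding or the mutation operator, so everything below is an assumption made for illustration: a single MST is built with Prim's algorithm on the complete Euclidean graph, a candidate solution is the set of k-1 MST edges that are cut, mutation swaps one cut edge for another, and the Davies–Bouldin index (lower is better) serves as the fitness. The function names (`mst_edges`, `components`, `davies_bouldin`, `one_plus_one_es`) are hypothetical, and the sketch does not search over alternative MSTs as the paper may do.

```python
import math
import random


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns n-1 edges."""
    # best[v] = (distance from v to the tree, tree vertex achieving it)
    best = {v: (math.dist(points[0], points[v]), 0) for v in range(1, len(points))}
    edges = []
    while best:
        v = min(best, key=lambda x: best[x][0])
        _, u = best.pop(v)
        edges.append((u, v))
        for w in best:
            d = math.dist(points[v], points[w])
            if d < best[w][0]:
                best[w] = (d, v)
    return edges


def components(n, edges):
    """Connected components (lists of vertex indices) of a forest, via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())


def davies_bouldin(points, clusters):
    """Davies-Bouldin index for clusters given as lists of point indices."""
    dim = len(points[0])
    cents = [tuple(sum(points[i][d] for i in c) / len(c) for d in range(dim))
             for c in clusters]
    scatter = [sum(math.dist(points[i], cen) for i in c) / len(c)
               for c, cen in zip(clusters, cents)]
    total = 0.0
    for a in range(len(clusters)):
        worst = 0.0
        for b in range(len(clusters)):
            if a == b:
                continue
            sep = math.dist(cents[a], cents[b])
            ratio = math.inf if sep == 0 else (scatter[a] + scatter[b]) / sep
            worst = max(worst, ratio)
        total += worst
    return total / len(clusters)


def one_plus_one_es(points, k, generations=200, seed=0):
    """(1+1)-ES over sets of k-1 cut MST edges; assumes 2 <= k < len(points)."""
    rng = random.Random(seed)
    n = len(points)
    edges = mst_edges(points)

    def clustering(cut):
        kept = [e for idx, e in enumerate(edges) if idx not in cut]
        return components(n, kept)

    parent = set(rng.sample(range(len(edges)), k - 1))
    parent_fit = davies_bouldin(points, clustering(parent))
    for _ in range(generations):
        # Mutation (assumed operator): swap one cut edge for an uncut one.
        child = set(parent)
        child.remove(rng.choice(sorted(child)))
        child.add(rng.choice([i for i in range(len(edges)) if i not in parent]))
        child_fit = davies_bouldin(points, clustering(child))
        # (1+1) selection: the child replaces the parent if it is not worse.
        if child_fit <= parent_fit:
            parent, parent_fit = child, child_fit
    return clustering(parent), parent_fit
```

For example, `one_plus_one_es(points, k=2)` on a list of 2-D tuples returns the clusters as lists of point indices together with their Davies–Bouldin score. Cutting k-1 edges of a tree always leaves exactly k components, which is why the cut set is a convenient encoding here.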

Notes

  1. http://igraph.org/.

  2. http://archive.ics.uci.edu/ml/datasets.html.

Author information

Corresponding author

Correspondence to Zahid Halim.

Cite this article

Halim, Z., Uzma: Optimizing the minimum spanning tree-based extracted clusters using evolution strategy. Cluster Comput. 21, 377–391 (2018). https://doi.org/10.1007/s10586-017-0868-6

Keywords

  • Minimum spanning trees
  • Clustering
  • Graphs
  • Evolution strategy