Abstract
There are many approaches available for extracting clusters. A few are based on the partitioning of the data and others rely on extracting hierarchical structures. Graphs provide a convenient representation of entities having relationships. Clusters can be extracted from a graph-based structure using minimum spanning trees (MSTs). This work focuses on optimizing the MST-based extracted clusters using Evolution Strategy (ES). A graph may have multiple MSTs causing varying cluster formations based on different MST selection. This work uses (1+1)-ES to obtain the optimal MST-based clustering. The Davies–Bouldin Index is utilized as fitness function to evaluate the quality of the clusters formed by the ES population. The proposed approach is evaluated using eleven benchmark datasets. Seven of these are based on microarray and the rest are taken from the UCI machine learning repository. Both, external and internal cluster validation indices are used to evaluate the results. The performance of the proposed approach is compared with two state-of-the-art MST-based clustering algorithms. The results support promising performance of the proposed approach in terms of time and cluster validity indices.
Similar content being viewed by others
References
Datta, S., Datta, S.: Comparisons and validation of statistical clustering techniques for microarray gene expression data. Bioinformatics 19(4), 459–466 (2003)
Shen, H., Yang, J., Wang, S., Liu, X.: Attribute weighted mercer kernel based fuzzy clustering algorithm for general non-spherical datasets. Soft Comput. 10(11), 1061–1073 (2006)
Srinivasan, G.: A clustering algorithm for machine cell formation in group technology using minimum spanning trees. Int. J. Prod. Res. 32(9), 2149–2158 (1994)
Thawonmas, R., Ashida, T.: Evolution strategy for optimizing parameters in Ms Pac-Man controller ICE Pambush 3. In: IEEE Symposium on Computational Intelligence and Games, pp. 235–240 (2010)
Eberhart, R.C., Shi, Y.: Tracking and optimizing dynamic systems with particle swarms. In: IEEE Evolutionary Computation, pp. 94–100 (2001)
Wu, F., Mueller, L.A., Crouzillat, D., Pétiard, V., Tanksley, S.D.: Combining bioinformatics and phylogenetics to identify large sets of single-copy orthologous genes (COSII) for comparative, evolutionary and systematic studies: a test case in the euasterid plant clade. Genetics 174(3), 1407–1420 (2006)
Huang, A.: Similarity measures for text document clustering. In: Proceedings of the sixth new zealand computer science research student conference (NZCSRSC2008), Christchurch, pp. 49–56 (2008)
Zha, H., He, X., Ding, C., Simon, H., Gu, M.: Bipartite graph partitioning and data clustering. In: Proceedings of the tenth international conference on Information and knowledge management, pp. 25–32 (2001)
Grygorash, O., Zhou, Y., Jorgensen, Z.: Minimum spanning tree based clustering algorithms. In: Tools with Artificial Intelligence, pp. 73–81 (2006)
Halim, Z., Kalsoom, R., Baig, A.R.: Profiling drivers based on driver dependent vehicle driving features. Appl. Intell. 44(3), 645–664 (2016)
Hussain, S.F., Mushtaq, M., Halim, Z.: Multi-view document clustering via ensemble methods. J. Intell. Inf. Syst. 43(1), 81–99 (2014)
Abraham, A., Guo, H., Liu, H.: Swarm intelligence: foundations, perspectives and applications. In: Swarm Intelligent Systems, pp. 3–25 (2006)
Pirim, H., Ekşioğlu, B., Perkins, A.D.: Clustering high throughput biological data with B-MST, a minimum spanning tree based heuristic. Comput. Biol. Med. 62, 94–102 (2015)
Müller, A.C., Nowozin, S., Lampert, C.H.: Information theoretic clustering using minimum spanning trees. In: Joint DAGM (German Association for Pattern Recognition) and OAGM Symposium, pp. 205–215 (2012)
Zahn, C.T.: Graph theoretical methods for detecting and describing gestalt clusters. IEEE Trans. Comput. C–20(1), 68–86 (1971)
Xu, Y., Olman, V., Xu, D.: Clustering gene expression data using a graph-theriotic approach: an application of minimum spanning trees. Bioinformatics 18, 536–545 (2002)
Gonzalez, R.C., Wintz, P.: Digital Image Processing. Addison-Wesley, Reading, MA (1987)
Xu, Y., Olman, V., Uberbacher, E.C.: A segmentation algorithm for noisy images: design and evaluation. Pattern Recognit. Lett. 19, 1213–1224 (1998)
Zhong, C., Malinen, M., Miao, D., Fränti, P.: A fast minimum spanning tree algorithm based on K-means. Inf. Sci. 295, 1–17 (2015)
Zhou, R., Shu, L., Su, Y.: An adaptive minimum spanning tree test for detecting irregularly-shaped spatial clusters. Comput. Stat. Data Anal. 89, 134–146 (2015)
Zhou, Y., Grygorash, O., Hain, T.F.: Clustering with minimum spanning trees. Int. J. Artif. Intell. Tools 20(01), 139–177 (2011)
Wang, X., Wang, X.L., Chen, C., Wilkes, D.M.: Enhancing minimum spanning tree-based clustering by removing density-based outliers. Digit. Signal Process. 23(5), 1523–1538 (2013)
Jothi, R., Mohanty, S.K., Ojha, A.: Fast minimum spanning tree based clustering algorithms on local neighborhood graph. In: International Workshop on Graph-Based Representations in Pattern Recognition, pp. 292–301 (2015)
Tzortzis, G., Likas, A.: The MinMax k-Means clustering algorithm. Pattern Recognit. 47(7), 2505–2516 (2014)
Yu, M., Hillebrand, A., Tewarie, P., Meier, J., van Dijk, B., Van Mieghem, P., Stam, C.J.: Hierarchical clustering in minimum spanning trees. Chaos: an interdisciplinary. J. Nonlinear Sci. 25(2), 023107 (2015)
Huang, G., Dong, S., Ren, J.: A minimum spanning tree clustering algorithm based on density. Adv. Inf. Sci. Serv. Sci. 5(2), 44 (2013)
Zhong, C., Miao, D., Fränti, P.: Minimum spanning tree based split-and-merge: a hierarchical clustering method. Inf. Sci. 181(16), 3397–3410 (2011)
Abraham, A., Nedjah, N., Mourelle, L.: Evolutionary computation: from genetic algorithms to genetic programming. In: Genetic Systems Programming, pp. 1–20 (2006)
Halim, Z., Waqas, M., Hussain, S.F.: Clustering large probabilistic graphs using multi-population evolutionary algorithm. Inf. Sci. 317, 78–95 (2015)
Csardi, G., Nepusz, T.: The igraph software package for complex network research. InterJournal Complex Syst. 1695(5), 1–9 (2006)
Bandyopadhyay, S., Mukhopadhyay, A., Maulik, U.: An improved algorithm for clustering gene expression data. Bioinformatics 23(21), 2859–2865 (2007)
Rendón, E., Abundez, I., Arizmendi, A., Quiroz, E.: Internal versus external cluster validation indexes. Int. J. Comput. Commun. 5(1), 27–34 (2011)
Iwata, T., Lloyd, J.R., Ghahramani, Z.: Unsupervised many-to-many object matching for relational data. IEEE Trans. Pattern Anal. Mach. Intell. 38(3), 607–617 (2016)
Halim, Z., Muhammad, T.: Quantifying and optimizing visualization: an evolutionary computing-based approach. Inf. Sci. 385, 284–313 (2017)
Muhammad, T., Halim, Z.: Employing artificial neural networks for constructing metadata-based model to automatically select an appropriate data visualization technique. Appl. Soft Comput. 49, 365–384 (2016)
Leskovec, J., Mcauley, J.J.: Learning to discover social circles in ego networks. Adv. Neural Inf. Process. Syst. 25, 539–547 (2012)
Mcauley, J.J., Leskovec, J.: Discovering social circles in ego networks. ACM Trans. Knowl. Discov. Data 8(1), 4 (2014)
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Halim, Z., Uzma Optimizing the minimum spanning tree-based extracted clusters using evolution strategy. Cluster Comput 21, 377–391 (2018). https://doi.org/10.1007/s10586-017-0868-6
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10586-017-0868-6