Categorical Data Clustering Method Based on Improved Fruit Fly Optimization Algorithm
The K-modes algorithm is a common algorithm for clustering categorical data. It is simple in principle and easy to implement. However, it is sensitive to the choice of initial cluster centers and tends to converge to a local optimum. It also cannot determine the number of clusters automatically; the number must be set manually. These problems limit the application of the K-modes algorithm. This paper addresses both problems by proposing a K-modes clustering algorithm based on an improved fruit fly optimization algorithm (IFOA-K-modes). The IFOA-K-modes algorithm combines the K-modes algorithm with the fruit fly optimization algorithm (FOA) and uses the improved fruit fly optimization algorithm (IFOA) to optimize both the number of clusters and the cluster centers. Because the FOA has strong local search ability but weak global search ability, it is improved in three respects: the search mechanism, the coordinate system, and the dynamic regulation of the search radius. Finally, the IFOA-K-modes algorithm is verified experimentally. The results show that IFOA-K-modes can optimize the number of clusters and the cluster centers, and that clustering accuracy is improved.
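For orientation, the baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of plain K-modes, assuming the usual simple matching dissimilarity and per-attribute mode update; it is not the IFOA-K-modes method itself. The function names (`matching_distance`, `column_modes`, `k_modes`) are hypothetical. The caller must supply the initial centers, and therefore also the number of clusters, which are the two inputs the paper sets out to optimize.

```python
from collections import Counter


def matching_distance(a, b):
    """Simple matching dissimilarity: count of attributes where a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent value of each attribute; ties go to the value seen first."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(data, init_centers, max_iter=100):
    """Plain K-modes (assumed baseline). k is implied by len(init_centers)."""
    centers = [tuple(c) for c in init_centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each object goes to its nearest center.
        new_labels = [
            min(range(len(centers)),
                key=lambda j: matching_distance(row, centers[j]))
            for row in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not change
        labels = new_labels
        # Update step: each center becomes the attribute-wise mode of its members.
        for j in range(len(centers)):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = column_modes(members)
    # Total within-cluster dissimilarity, the quantity K-modes locally minimizes.
    cost = sum(matching_distance(row, centers[lab])
               for row, lab in zip(data, labels))
    return labels, centers, cost


if __name__ == "__main__":
    toy = [("a", "x"), ("a", "x"), ("a", "y"),
           ("b", "z"), ("b", "z"), ("c", "z")]
    print(k_modes(toy, init_centers=[("a", "x"), ("b", "z")]))
```

Because the loop only ever moves to a nearby assignment, the returned cost depends on `init_centers`; a different starting point can yield a different, possibly worse, partition.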
Keywords: Clustering, K-modes, Fruit fly optimization algorithm, Categorical data
This work is supported by the 2013 scientific research program of the Shaanxi Provincial Education Department (Grant no. 2013JK0175).