Categorical Data Clustering Method Based on Improved Fruit Fly Optimization Algorithm

  • Dong Li
  • Huifeng Xue
  • Wenyu Zhang
  • Yan Zhang
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 885)

Abstract

The K-modes algorithm is a widely used algorithm for clustering categorical data. It is simple in principle and easy to implement. However, it is sensitive to the choice of initial cluster centers and can therefore become trapped in a local optimum. It also cannot determine the number of clusters automatically; this number must be set manually. These two problems limit the application of the K-modes algorithm. This paper addresses them by proposing a K-modes clustering algorithm based on an improved fruit fly optimization algorithm (IFOA-K-modes). The IFOA-K-modes algorithm combines the K-modes algorithm with the fruit fly optimization algorithm (FOA) and uses the improved FOA (IFOA) to optimize both the number of clusters and the cluster centers. Because the FOA has strong local search ability but weak global search ability, it is improved in three respects: the search mechanism, the coordinate system, and the dynamic regulation of the search radius. Finally, the IFOA-K-modes algorithm is verified experimentally. The results show that IFOA-K-modes can optimize the number of clusters and the cluster centers, and that clustering accuracy is also improved.
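
For readers unfamiliar with the baseline, the following is a minimal sketch of plain K-modes, not of the IFOA-K-modes method proposed in the paper. It assumes the standard simple matching dissimilarity (the number of attributes on which two records differ) and a per-attribute mode update. The names `matching_distance`, `k_modes`, and `init_modes` are illustrative assumptions and do not come from the paper. Both the number of clusters `k` and the initial centers have to be supplied from outside, which are the two limitations the abstract describes.

```python
import random
from collections import Counter


def matching_distance(a, b):
    """Simple matching dissimilarity: count of attributes where a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(records, k, init_modes=None, max_iter=100, seed=0):
    """Plain K-modes on a list of equal-length tuples of categorical values.

    k must be chosen by the caller. If init_modes is None, the initial
    centers are sampled at random, so the result can depend on the seed.
    """
    if init_modes is None:
        rng = random.Random(seed)
        modes = rng.sample(records, k)
    else:
        modes = list(init_modes)

    labels = None
    for _ in range(max_iter):
        # Assignment step: each record goes to its nearest mode
        # (ties are resolved in favour of the lowest cluster index).
        new_labels = [
            min(range(k), key=lambda j: matching_distance(r, modes[j]))
            for r in records
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes are too
        labels = new_labels

        # Update step: the new mode takes the most frequent value
        # of every attribute among the cluster's members.
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mode
                modes[j] = tuple(
                    Counter(column).most_common(1)[0][0]
                    for column in zip(*members)
                )

    # Clustering cost: total dissimilarity of records to their assigned modes.
    cost = sum(matching_distance(r, modes[lab]) for r, lab in zip(records, labels))
    return labels, modes, cost
```

Calling `k_modes(records, k)` with different `seed` values, or with different `init_modes`, may yield different partitions and costs on the same data. This sensitivity, together with the fixed `k`, is what the proposed method aims to remove by searching over both quantities.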

Keywords

Clustering, K-modes, Fruit fly optimization algorithm, Categorical data

Notes

Acknowledgments

This work is supported by the 2013 scientific research program of the Shaanxi Provincial Education Department (Grant no. 2013JK0175).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. School of Automation, Northwestern Polytechnical University, Xi’an, People’s Republic of China
  2. School of Economics and Management, Xi’an University of Posts and Telecommunications, Xi’an, People’s Republic of China
