Clustering aims to represent large datasets by a smaller number of prototypes or clusters. It simplifies the modeling of data and thus plays a central role in knowledge discovery and data mining. Data mining tasks today require fast and accurate partitioning of huge datasets, which may come with a variety of attributes or features. This, in turn, imposes severe computational requirements on the relevant clustering techniques. A family of bio-inspired algorithms known as Swarm Intelligence (SI) has recently emerged that meets these requirements and has been applied successfully to a number of real-world clustering problems. This chapter explores the role of SI in clustering different kinds of datasets. It then describes a new SI technique for partitioning any dataset into an optimal number of groups in a single run of optimization. Computer simulations undertaken in this research are also provided to demonstrate the effectiveness of the proposed algorithm.

© 2008 Springer Science+Business Media, LLC

Abraham, A., Das, S., Roy, S. (2008). Swarm Intelligence Algorithms for Data Clustering. In: Maimon, O., Rokach, L. (eds) Soft Computing for Knowledge Discovery and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-69935-6_12

  • DOI: https://doi.org/10.1007/978-0-387-69935-6_12

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-69934-9

  • Online ISBN: 978-0-387-69935-6

  • eBook Packages: Computer Science, Computer Science (R0)
