Data Mining and Knowledge Discovery, Volume 26, Issue 1, pp 1–26

A single pass algorithm for clustering evolving data streams based on swarm intelligence

  • Agostino Forestiero
  • Clara Pizzuti
  • Giandomenico Spezzano
Article

DOI: 10.1007/s10618-011-0242-x

Cite this article as:
Forestiero, A., Pizzuti, C. & Spezzano, G. Data Min Knowl Disc (2013) 26: 1. doi:10.1007/s10618-011-0242-x

Abstract

Existing density-based data stream clustering algorithms use a two-phase scheme consisting of an online phase, in which raw data are processed to gather summary statistics, and an offline phase, which generates the clusters from the summary data. In this article we propose a data stream clustering method based on a multi-agent system that uses a decentralized, bottom-up, self-organizing strategy to group similar data points. Data points are associated with agents and deployed onto a 2D space, where they work simultaneously by applying a heuristic strategy based on a bio-inspired model known as the flocking model. Agents move in the space for a fixed time and, when they encounter other agents within a predefined visibility range, they can decide to form a flock if they are similar. Flocks can join to form swarms of similar groups. This strategy makes it possible to merge the two phases of density-based approaches and thus avoid the computationally demanding offline cluster computation, since a swarm represents a cluster. Experimental results show that the bio-inspired approach can obtain very good results on real and synthetic data sets.
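
The flock-forming rule described in the abstract (agents that meet within a visibility range join if they are similar) can be illustrated with a minimal sketch. This is not the authors' algorithm: it omits agent movement, the flocking steering rules, and any handling of the evolving stream, and it checks all agent pairs once on a static snapshot. The function name `form_flocks`, the use of Euclidean distance for both proximity and similarity, and the two threshold parameters are assumptions made only for illustration.

```python
import math
from itertools import combinations


def form_flocks(positions, features, visibility_range, similarity_threshold):
    """Group agents that are close on the 2D plane and similar in feature space.

    Hypothetical helper: returns one flock label per agent, where the label
    is the smallest agent index in that agent's flock.
    """
    n = len(positions)
    parent = list(range(n))  # union-find forest: every agent starts alone

    def find(i):
        # Follow parent links to the flock representative, halving the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in combinations(range(n), 2):
        # Assumption: Euclidean distance for both visibility and similarity.
        in_range = math.dist(positions[a], positions[b]) <= visibility_range
        similar = math.dist(features[a], features[b]) <= similarity_threshold
        if in_range and similar:
            ra, rb = find(a), find(b)
            if ra != rb:
                # Keep the smaller index as the representative of the merged flock.
                parent[max(ra, rb)] = min(ra, rb)

    return [find(i) for i in range(n)]


# Four agents: 0 and 1 are near each other and similar; 2 is far away;
# 3 is near agents 0 and 1 but carries a dissimilar data point.
positions = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (0.4, 0.3)]
features = [(1.0,), (1.1,), (1.0,), (9.0,)]
labels = form_flocks(positions, features, visibility_range=1.0, similarity_threshold=0.5)
print(labels)  # [0, 0, 2, 3]
```

Because flocks are merged transitively, chains of pairwise-similar neighbours end up in one group, which loosely mirrors the idea that flocks can join into larger swarms; how the published method actually merges flocks is described in the full article, not here.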

Keywords

Data streams · Density-based clustering · Bio-inspired flocking model

Copyright information

© The Author(s) 2011

Authors and Affiliations

  • Agostino Forestiero (1)
  • Clara Pizzuti (1)
  • Giandomenico Spezzano (1)
  1. National Research Council of Italy–CNR, Rende (CS), Italy