Resource-Aware Distributed Clustering of Drifting Sensor Data Streams

  • Marwan Hassani
  • Thomas Seidl
Part of the Communications in Computer and Information Science book series (CCIS, volume 293)

Abstract

Collecting data from sensor nodes is the ultimate goal of Wireless Sensor Networks. This is performed by transmitting the sensed measurements to a data collecting station. In sensor nodes, radio communication is the dominant consumer of the energy resources, which are usually limited. Summarizing the sensed data on the sensor nodes and sending only the summaries saves a considerable amount of energy. Clustering is an established data mining technique for grouping objects based on similarity. For sensor networks, k-center clustering aims at grouping sensor measurements into k groups, each containing similar measurements. In this paper we propose a novel resource-aware k-center clustering algorithm called SenClu. Our algorithm immediately detects new trends in the drifting sensor data stream and follows them. SenClu uses a lightweight decaying technique that gives lower influence to old data. As sensor data are usually noisy, our algorithm is also outlier-aware. In thorough experiments on drifting synthetic and real-world data sets, we show that SenClu outperforms two state-of-the-art algorithms by producing higher clustering quality and following trends in the stream, while consuming nearly the same amount of energy.
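
As an illustration of the k-center objective mentioned in the abstract, the following is a minimal sketch of the classic greedy farthest-first traversal for offline k-center clustering. It is not SenClu: it has no decay, no outlier handling, and no distribution across nodes. The function names and the sample readings are hypothetical and chosen only for this example.

```python
import math


def farthest_first_centers(points, k):
    """Pick up to k centers greedily: each new center is the point
    farthest from all centers chosen so far (offline k-center heuristic)."""
    if not points or k <= 0:
        return []
    centers = [points[0]]  # assumption: start from the first point
    # dist[i] holds the distance from points[i] to its nearest center.
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        i = max(range(len(points)), key=dist.__getitem__)
        if dist[i] == 0:
            break  # every remaining point coincides with a center
        centers.append(points[i])
        dist = [min(d, math.dist(p, points[i])) for d, p in zip(dist, points)]
    return centers


def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        for p in points
    ]


def clustering_radius(points, centers):
    """The k-center cost: largest distance from a point to its nearest center."""
    return max(min(math.dist(p, c) for c in centers) for p in points)


# Hypothetical one-dimensional temperature readings.
readings = [(20.1,), (20.4,), (35.0,), (35.2,), (50.3,)]
centers = farthest_first_centers(readings, 3)
labels = assign(readings, centers)
radius = clustering_radius(readings, centers)
```

In this setting a sensor node would transmit only the k centers, and possibly the radius, instead of every reading, which is the kind of summary the abstract refers to.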

Keywords

Sensor Network, Sensor Node, Wireless Sensor Network, Cluster Head, Input Stream
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Marwan Hassani (1)
  • Thomas Seidl (1)
  1. Data Management and Data Exploration Group, RWTH Aachen University, Germany
