Mining frequent items and itemsets from distributed data streams for emergency detection and management

Original Research

Abstract

Sensor networks are an important technology for large-scale monitoring that allows the collection of streaming environmental measurements in remote areas. Such data constitute a valuable source of information for better understanding natural phenomena. Moreover, in some cases data streams must be analyzed in real time to report trends, outlier values, or regularities as soon as possible, in order to prevent emergencies or disasters (e.g., landslides, fires). Real-time analysis of distributed data streams is a challenging task, since it requires scalable solutions to handle streams of data that are generated very rapidly by multiple sources. This paper presents the design and implementation of an architecture for the analysis of data streams in distributed environments. Experimental evaluation shows the efficiency and effectiveness of the approach.

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Albino Altomare (1)
  • Eugenio Cesario (1)
  • Domenico Talia (2)
  1. ICAR-CNR, Rende, Italy
  2. DIMES-UNICAL, Rende, Italy
