
Mining frequent items and itemsets from distributed data streams for emergency detection and management

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Sensor networks are an important technology for large-scale monitoring because they allow streaming environmental measurements to be collected in remote areas. Such data are a valuable source of information for better understanding natural phenomena. Moreover, in some cases streams of data must be analyzed in real time to provide information about trends, outlier values or regularities that must be signaled as soon as possible in order to prevent emergencies or disasters (e.g., landslides, fires). Real-time analysis of distributed data streams is a challenging task, however, since it requires scalable solutions to handle streams of data that are generated very rapidly by multiple sources. This paper presents the design and implementation of an architecture for the analysis of data streams in distributed environments. Experimental evaluation shows the efficiency and effectiveness of the approach.

Notes

  1. Miners may have the ability to store some transactions in their own memory. In this case, they only ask the Data Cacher those transactions that could not be stored locally.

  2. The fraction of CPU is tuned using the program “cpulimit”.
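
The behaviour described in note 1 can be illustrated with a minimal sketch. The class names `DataCacher` and `Miner`, their methods, the bounded `capacity`, and the integer transaction ids are all hypothetical and chosen only for illustration; the interfaces of the actual architecture are not shown in this text.

```python
from typing import Dict, Iterable, List, Tuple

Transaction = Tuple[str, ...]


class DataCacher:
    """Hypothetical stand-in for the Data Cacher: it holds every transaction by id."""

    def __init__(self, transactions: Dict[int, Transaction]) -> None:
        self._transactions = dict(transactions)
        self.requests = 0  # how many transactions miners had to ask for

    def fetch(self, ids: Iterable[int]) -> Dict[int, Transaction]:
        ids = list(ids)
        self.requests += len(ids)
        return {i: self._transactions[i] for i in ids}


class Miner:
    """Hypothetical miner with a bounded local store (an assumption of this sketch)."""

    def __init__(self, cacher: DataCacher, capacity: int) -> None:
        self._cacher = cacher
        self._capacity = capacity
        self._local: Dict[int, Transaction] = {}

    def observe(self, tid: int, transaction: Transaction) -> None:
        # Keep the transaction in local memory only while there is room.
        if len(self._local) < self._capacity:
            self._local[tid] = transaction

    def load(self, ids: List[int]) -> Dict[int, Transaction]:
        # Serve what is stored locally, then ask the Data Cacher
        # only for the transactions that could not be stored.
        found = {i: self._local[i] for i in ids if i in self._local}
        missing = [i for i in ids if i not in self._local]
        if missing:
            found.update(self._cacher.fetch(missing))
        return found


# Usage: the miner can hold two of the three transactions locally.
stream = {1: ("a", "b"), 2: ("b", "c"), 3: ("a", "c")}
cacher = DataCacher(stream)
miner = Miner(cacher, capacity=2)
for tid, transaction in stream.items():
    miner.observe(tid, transaction)

result = miner.load([1, 2, 3])
```

In this sketch only transaction 3 is requested from the Data Cacher, since transactions 1 and 2 fit in the miner's local store.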

Acknowledgments

This research work has been funded by the project INSYEME.

Author information

Corresponding author

Correspondence to Eugenio Cesario.

About this article

Cite this article

Altomare, A., Cesario, E. & Talia, D. Mining frequent items and itemsets from distributed data streams for emergency detection and management. J Ambient Intell Human Comput 8, 47–55 (2017). https://doi.org/10.1007/s12652-016-0344-9

  • DOI: https://doi.org/10.1007/s12652-016-0344-9
