Abstract
Sensor networks are an important technology for large-scale monitoring, allowing the collection of streaming environmental measurement data in remote areas. Such data constitute a valuable source of information that can be exploited to better understand natural phenomena. Moreover, in some cases data streams must be analyzed in real time to provide information about trends, outlier values, or regularities that must be signaled as soon as possible to prevent emergencies or disasters (e.g., landslides, fires). Real-time analysis of distributed data streams is a challenging task, since it requires scalable solutions to handle streams of data that are generated very rapidly by multiple sources. This paper presents the design and implementation of an architecture for the analysis of data streams in distributed environments. Experimental evaluation shows the efficiency and effectiveness of the approach.
Notes
Miners may have the ability to store some transactions in their own memory. In this case, they ask the Data Cacher only for those transactions that could not be stored locally.
The fraction of CPU is tuned using the program “cpulimit”.
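The first note (miners asking the Data Cacher only for transactions they could not keep locally) can be illustrated with a minimal sketch. The class names, method names, and the fixed-capacity policy below are assumptions made for illustration only; the source does not describe the actual interfaces of the miners or the Data Cacher.

```python
class DataCacher:
    """Hypothetical stand-in for the Data Cacher: serves transactions by id."""

    def __init__(self, transactions):
        self._transactions = dict(transactions)
        self.requests = 0  # number of transactions requested by miners

    def fetch(self, ids):
        self.requests += len(ids)
        return {tid: self._transactions[tid] for tid in ids}


class Miner:
    """Hypothetical miner with a bounded local store (capacity is an assumption)."""

    def __init__(self, cacher, capacity):
        self._cacher = cacher
        self._capacity = capacity
        self._local = {}

    def store_locally(self, tid, transaction):
        # Keep the transaction only if there is room left in local memory.
        if len(self._local) < self._capacity:
            self._local[tid] = transaction
            return True
        return False

    def load(self, ids):
        # Ask the Data Cacher only for transactions that are not held locally.
        missing = [tid for tid in ids if tid not in self._local]
        fetched = self._cacher.fetch(missing) if missing else {}
        return {
            tid: self._local[tid] if tid in self._local else fetched[tid]
            for tid in ids
        }


cacher = DataCacher({1: ["a", "b"], 2: ["b"], 3: ["a", "c"]})
miner = Miner(cacher, capacity=2)
for tid, items in [(1, ["a", "b"]), (2, ["b"]), (3, ["a", "c"])]:
    miner.store_locally(tid, items)
transactions = miner.load([1, 2, 3])
```

In this sketch only transaction 3 exceeds the miner's local capacity, so it is the only one requested from the Data Cacher.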
Acknowledgments
This research work has been funded by the project INSYEME.
Cite this article
Altomare, A., Cesario, E. & Talia, D. Mining frequent items and itemsets from distributed data streams for emergency detection and management. J Ambient Intell Human Comput 8, 47–55 (2017). https://doi.org/10.1007/s12652-016-0344-9
DOI: https://doi.org/10.1007/s12652-016-0344-9