Abstract
Applications such as sensor networks, Internet traffic analysis, location-based services, and health measurements must handle stream data that is unbounded, fast, high-volume, continuous, and often distributed. A synopsis, maintained as a list of partial summaries of item sets that are not known in advance, reduces memory usage enough to keep pace with such fast and large incoming data. Item sets of different sizes generally lead to different summaries, which matters especially for the Top-k operator, because it acts as a partial preprocessing step over the synopsis. We therefore propose a smooth synopsis that dynamically assigns a numeric interval to resolve the item set, so that a more accurate list of approximate answers can be maintained from partial Top-k processing. In particular, we propose an algorithm, called the SFI algorithm, that mines the most frequent items from specific stream sources in a more adaptive and faster way. Our experimental results demonstrate the accuracy and efficiency of these approximation techniques.
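The abstract does not give the details of the SFI algorithm or the smooth synopsis, so the following is only a minimal sketch of the general idea it builds on: a bounded-memory synopsis of partial counts from which an approximate Top-k list is read. It uses the well-known Space-Saving counting scheme as a stand-in, not the authors' method. The names `space_saving`, `top_k`, and `capacity` are hypothetical and chosen for illustration.

```python
def space_saving(stream, capacity):
    """Summarize a stream with at most `capacity` counters.

    Stand-in technique (Space-Saving), NOT the SFI algorithm from the paper.
    Returns a dict mapping item -> estimated count (never an underestimate).
    """
    counters = {}
    for item in stream:
        if item in counters:
            # Item is already tracked: just bump its counter.
            counters[item] += 1
        elif len(counters) < capacity:
            # Free slot available: start tracking the new item.
            counters[item] = 1
        else:
            # Synopsis is full: evict the item with the smallest count and
            # let the newcomer inherit that count plus one.
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return counters


def top_k(counters, k):
    """Read the k largest estimated counts from the synopsis."""
    # Sort by descending count, breaking ties by item for a stable result.
    return sorted(counters.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


if __name__ == "__main__":
    stream = ["a"] * 5 + ["b"] * 3 + ["c", "d", "a", "b"]
    synopsis = space_saving(stream, capacity=3)
    print(top_k(synopsis, 2))
```

Memory stays fixed at `capacity` counters regardless of stream length, at the cost of possibly overestimating the counts of rarely seen items. This is the kind of accuracy and memory trade-off that a synopsis-based Top-k approach has to manage.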
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, L., Qu, Z.Y., Zhou, T.H., Ryu, K.H. (2013). Using a Real-Time Top-k Algorithm to Mine the Most Frequent Items over Multiple Streams. In: Huang, DS., Bevilacqua, V., Figueroa, J.C., Premaratne, P. (eds) Intelligent Computing Theories. ICIC 2013. Lecture Notes in Computer Science, vol 7995. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39479-9_36
DOI: https://doi.org/10.1007/978-3-642-39479-9_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39478-2
Online ISBN: 978-3-642-39479-9
eBook Packages: Computer Science, Computer Science (R0)