Using a Real-Time Top-k Algorithm to Mine the Most Frequent Items over Multiple Streams

  • Conference paper
Intelligent Computing Theories (ICIC 2013)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7995))


Abstract

Applications such as sensor networks, internet traffic analysis, location-based services, and health measurement must often handle stream data that is unbounded, fast, large in volume, continuous, and sometimes distributed. A practical approach is to keep a synopsis, that is, a list of partial summaries of the unknown item sets, which reduces memory usage enough to keep pace with fast, high-volume incoming data. In general, item sets of different sizes lead to different summaries, especially for the Top-k operator, which acts as a partial preprocessing step over the synopsis. We therefore propose a smooth synopsis that dynamically assigns a numeric interval to resolve the item sets, so that the list of approximate answers produced by partial Top-k processing stays more accurate. In particular, we propose an algorithm, called the SFI algorithm, that mines the most frequent items in specific stream resources in a faster and more adaptive way. Finally, our experimental results demonstrate the accuracy and efficiency of our approximation techniques.
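
To make the idea of a bounded-memory synopsis concrete, here is a minimal sketch of approximate Top-k over several streams. It is not the SFI algorithm or the smooth synopsis, whose details the abstract does not give; it substitutes the well-known Misra-Gries summary as a stand-in. The function names, the `capacity` parameter, and the sample streams are all assumptions made for illustration.

```python
from collections import Counter
from typing import Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], capacity: int) -> Counter:
    """Summarize one stream with at most `capacity` counters (Misra-Gries)."""
    counters: Counter = Counter()
    for item in stream:
        if item in counters or len(counters) < capacity:
            counters[item] += 1
        else:
            # Synopsis is full: decrement every counter and drop the zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


def merge_synopses(synopses: Iterable[Counter], capacity: int) -> Counter:
    """Merge per-stream synopses, then shrink back to `capacity` counters."""
    merged: Counter = Counter()
    for synopsis in synopses:
        merged.update(synopsis)
    if len(merged) > capacity:
        # Subtract the (capacity+1)-th largest count; keep positive counters.
        cut = sorted(merged.values(), reverse=True)[capacity]
        merged = Counter({k: v - cut for k, v in merged.items() if v > cut})
    return merged


def top_k(synopsis: Counter, k: int) -> list:
    """Return the k items with the largest (approximate) counts."""
    return synopsis.most_common(k)


if __name__ == "__main__":
    # Hypothetical sample streams, for illustration only.
    stream_a = ["a"] * 5 + ["b"] * 3 + ["c"]
    stream_b = ["a"] * 4 + ["d"] * 2 + ["b"]
    summaries = [misra_gries(s, capacity=2) for s in (stream_a, stream_b)]
    print(top_k(merge_synopses(summaries, capacity=2), k=1))
```

The counts in such a summary are lower bounds on the true frequencies rather than exact values, which is the usual trade-off a synopsis makes in exchange for fixed memory.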




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, L., Qu, Z.Y., Zhou, T.H., Ryu, K.H. (2013). Using a Real-Time Top-k Algorithm to Mine the Most Frequent Items over Multiple Streams. In: Huang, DS., Bevilacqua, V., Figueroa, J.C., Premaratne, P. (eds) Intelligent Computing Theories. ICIC 2013. Lecture Notes in Computer Science, vol 7995. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39479-9_36

  • DOI: https://doi.org/10.1007/978-3-642-39479-9_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39478-2

  • Online ISBN: 978-3-642-39479-9

  • eBook Packages: Computer Science, Computer Science (R0)
