An Adaptive Algorithm for Finding Frequent Sets in Landmark Windows

  • Xuan Hong Dang
  • Kok-Leong Ong
  • Vincent Lee
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7520)

Abstract

We consider a CPU-constrained environment for finding approximate frequent sets in data streams under the landmark window model. Our algorithm detects overload situations, i.e., when the incoming load exceeds the CPU capacity, and sheds data from the stream to keep up. Shedding is done within a controlled error threshold by exploiting the Chernoff bound. Empirical evaluation of the algorithm confirms its feasibility.
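
To make the idea of shedding load within a controlled error threshold concrete, the following is a minimal sketch and not the authors' algorithm, whose details are not given in the abstract. It assumes uniform random sampling of transactions and the additive (Hoeffding) form of the Chernoff bound for a single itemset, P(|estimated support - true support| >= epsilon) <= 2 exp(-2 n epsilon^2). The function names and the parameters `epsilon`, `delta`, and `capacity` are hypothetical.

```python
import math
import random


def chernoff_sample_size(epsilon: float, delta: float) -> int:
    """Smallest n with 2 * exp(-2 * n * epsilon**2) <= delta.

    With n uniformly sampled transactions, the estimated support of one
    itemset is within epsilon of its true support with probability at
    least 1 - delta (assumption: additive Chernoff/Hoeffding bound).
    """
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def worst_case_error(sample_size: int, delta: float) -> float:
    """Error bound epsilon implied by keeping `sample_size` transactions."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * sample_size))


def shed_load(transactions, capacity, rng):
    """Keep at most `capacity` transactions, chosen uniformly at random.

    `capacity` stands in for the number of transactions the CPU can
    process in the current time unit (hypothetical parameter).
    """
    if len(transactions) <= capacity:
        return list(transactions)  # no overload, nothing is shed
    return rng.sample(transactions, capacity)


if __name__ == "__main__":
    rng = random.Random(0)
    batch = [frozenset(rng.sample(range(20), 3)) for _ in range(50_000)]
    needed = chernoff_sample_size(epsilon=0.01, delta=0.05)
    kept = shed_load(batch, capacity=needed, rng=rng)
    print(len(batch), "arrived;", len(kept), "kept; error bound",
          round(worst_case_error(len(kept), 0.05), 4))
```

In this sketch the error threshold determines how many transactions must survive shedding. If the CPU budget is smaller than that number, `worst_case_error` reports the looser bound that the budget actually supports.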

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Xuan Hong Dang (1)
  • Kok-Leong Ong (2)
  • Vincent Lee (3)
  1. Dept. of Computer Science, Aarhus University, Denmark
  2. School of IT, Deakin University, Australia
  3. Faculty of IT, Monash University, Australia
