Adaptive Load Shedding for Mining Frequent Patterns from Data Streams

  • Xuan Hong Dang
  • Wee-Keong Ng
  • Kok-Leong Ong
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4081)

Abstract

Most algorithms for discovering frequent patterns from data streams assume that the machinery is capable of managing all incoming transactions without any delay and without the need to drop transactions. However, this assumption is often impractical due to the inherent characteristics of data stream environments. Especially under high load conditions, there is often a shortage of system resources to process the incoming transactions. This causes unwanted latencies that, in turn, affect the applicability of the data mining models produced, which often have a small window of opportunity. We propose a load shedding algorithm to address this issue. The algorithm adaptively detects overload situations and drops transactions from data streams using a probabilistic model. We tested the algorithm on both synthetic and real-life datasets to verify its feasibility.
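
The general idea of probabilistic load shedding can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper, whose details are not given in this abstract; it assumes plain uniform random sampling, and the names `keep_probability`, `shed_and_count`, `capacity` and `max_size` are hypothetical. When a batch holds more transactions than the miner can process, each transaction is kept with probability p = capacity / arrival rate, and the observed itemset counts are divided by p to estimate the counts over the full batch.

```python
import random
from collections import Counter
from itertools import combinations


def keep_probability(arrival_rate, capacity):
    """Fraction of transactions to keep so the kept load fits the capacity."""
    if arrival_rate <= capacity:
        return 1.0  # no overload detected: keep everything
    return capacity / arrival_rate


def shed_and_count(batch, capacity, max_size=2, rng=None):
    """Sample one batch under overload and estimate itemset counts.

    batch    -- list of transactions (iterables of items) from one time unit
    capacity -- assumed number of transactions the miner can process per time unit
    max_size -- largest itemset size to count (kept small for illustration)
    Returns (p, estimates); estimates maps itemset tuples to estimated counts.
    """
    rng = rng or random.Random()
    p = keep_probability(len(batch), capacity)
    counts = Counter()
    for transaction in batch:
        if rng.random() >= p:
            continue  # this transaction is shed
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    # Dividing by p makes each estimate unbiased for the full batch.
    estimates = {itemset: count / p for itemset, count in counts.items()}
    return p, estimates
```

For example, `shed_and_count(batch, capacity=250)` on a batch of 1000 transactions keeps roughly a quarter of them and scales every observed count by 4. Uniform sampling is the simplest choice and treats all transactions alike; a scheme that weighs transactions by their content would need a different keep rule.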

Keywords

Data Stream, Frequent Pattern, Frequent Itemsets, Sample Batch, Mining Frequent Itemsets
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Xuan Hong Dang (1)
  • Wee-Keong Ng (1)
  • Kok-Leong Ong (2)

  1. School of Computer Engineering, Nanyang Technological University, Singapore
  2. School of Engineering & IT, Deakin University, Australia
