EStream: Online Mining of Frequent Sets with Precise Error Guarantee

  • Xuan Hong Dang
  • Wee-Keong Ng
  • Kok-Leong Ong
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4081)

Abstract

In data stream applications, a good approximation obtained in a timely manner is often better than an exact answer that is delayed beyond the window of opportunity. Of course, the quality of the approximation is as important as its timely delivery. Unfortunately, existing algorithms capable of online processing do not conform strictly to a precise error guarantee. Since both online processing and a precise error bound are essential, stream algorithms must meet both criteria. Yet this is not the case for mining frequent sets in data streams. We present EStream, a novel algorithm that allows online processing while producing results strictly within the error bound. Our theoretical and experimental results show that EStream is a better candidate for finding frequent sets in data streams when both constraints need to be satisfied.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Xuan Hong Dang (1)
  • Wee-Keong Ng (1)
  • Kok-Leong Ong (2)
  1. School of Computer Engineering, Nanyang Technological University, Singapore
  2. School of Engineering & IT, Deakin University, Australia
