An Efficient Algorithm for Frequent Itemset Mining on Data Streams

  • Xie Zhi-jun
  • Chen Hong
  • Cuiping Li
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4065)


This paper proposes a new approach to mining frequent itemsets efficiently on data streams. The memory-efficient and accurate one-pass algorithm divides all frequent itemsets into frequent equivalence classes and prunes every redundant itemset, keeping only those that represent the GLB (Greatest Lower Bound) and LUB (Least Upper Bound) of each frequent equivalence class; the number of GLBs and LUBs is much smaller than the number of frequent itemsets. To maintain these equivalence classes, the paper also proposes a compact data structure, the frequent itemset enumerate tree (FIET). A detailed experimental evaluation on synthetic and real datasets shows that the algorithm is very accurate in practice and requires significantly less memory than Jin and Agrawal’s one-pass algorithm.
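
The pruning idea in the abstract can be illustrated with a small batch sketch. It is not the paper's one-pass algorithm and does not build the FIET. It rests on two assumptions that the abstract does not spell out: two frequent itemsets are taken to be equivalent when exactly the same transactions contain them, and the bounds of a class are taken to be its minimal and maximal members under set inclusion. How these map onto the paper's GLB and LUB is likewise assumed. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration (illustration only, exponential in the item count).

    Maps each frequent itemset to the ids of the transactions containing it.
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            tids = frozenset(i for i, t in enumerate(transactions) if cand <= t)
            if len(tids) >= min_support:
                result[cand] = tids
    return result


def equivalence_classes(frequent):
    """Assumed equivalence: itemsets supported by exactly the same transactions."""
    classes = defaultdict(list)
    for itemset, tids in frequent.items():
        classes[tids].append(itemset)
    return classes


def class_bounds(members):
    """Return (minimal members, maximal members) of one class under set inclusion."""
    minimal = [m for m in members if not any(o < m for o in members)]
    maximal = [m for m in members if not any(m < o for o in members)]
    return minimal, maximal


def summarize(transactions, min_support):
    """Keep only the bounds of each class instead of every frequent itemset."""
    frequent = frequent_itemsets(transactions, min_support)
    summary = {}
    for tids, members in equivalence_classes(frequent).items():
        minimal, maximal = class_bounds(members)
        summary[tids] = {
            "support": len(tids),
            "lower": minimal,
            "upper": maximal,
            "size": len(members),
        }
    return frequent, summary


if __name__ == "__main__":
    data = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("d")]
    frequent, summary = summarize(data, min_support=2)
    kept = sum(len(c["lower"]) + len(c["upper"]) for c in summary.values())
    print(len(frequent), "frequent itemsets,", kept, "bound itemsets kept")
```

On the four toy transactions in the demo, with a minimum support of 2, the seven frequent itemsets fall into two classes, and five bound itemsets describe them. The saving grows with class size, which is the effect the abstract attributes to keeping only the bounds.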


Equivalence Class; Data Stream; Memory Requirement; Frequent Itemset; Support Level




  1. Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., Verkamo, A.I.: Fast discovery of association rules. In: Fayyad, U., et al. (eds.) Advances in Knowledge Discovery and Data Mining, pp. 307–328. AAAI Press, Menlo Park (1996)
  2. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proc. 1994 Int. Conf. Very Large Data Bases (VLDB 1994), Santiago, Chile, September 1994, pp. 487–499 (1994)
  3. Babcock, B., Babu, S., Datar, M., Motwani, R., Widom, J.: Models and Issues in Data Stream Systems. In: Proceedings of the 2002 ACM Symposium on Principles of Database Systems (PODS 2002). ACM Press, New York (2002)
  4. Borgelt, C.: Apriori implementation,
  5. Dobra, A., Gehrke, J., Garofalakis, M., Rastogi, R.: Processing complex aggregate queries over data streams. In: Proc. of the 2002 ACM SIGMOD Intl. Conf. on Management of Data (June 2002)
  6. Domingos, P., Hulten, G.: Mining high-speed data streams. In: Proceedings of the ACM Conference on Knowledge and Data Discovery (SIGKDD) (2000)
  7. Gehrke, J., Korn, F., Srivastava, D.: On computing correlated aggregates over continual data streams. In: Proc. of the 2001 ACM SIGMOD Intl. Conf. on Management of Data, pp. 13–24. ACM Press, New York (2001)
  8. Giannella, C., Han, J., Pei, J., Yan, X., Yu, P.S.: Mining Frequent Patterns in Data Streams at Multiple Time Granularities. In: Proceedings of the NSF Workshop on Next Generation Data Mining (November 2002)
  9. Gibbons, P.B., Tirthapura, S.: Estimating simple functions on the union of data streams. In: Proc. of the 2001 ACM Symp. on Parallel Algorithms and Architectures, pp. 281–291. ACM Press, New York (2001)
  10. Goethals, B.: Fp-tree implementation,
  11. Jin, R., Agrawal, G.: An algorithm for in-core frequent itemset mining on streaming data (submitted, 2004)
  12. Zaki, M.J., Hsiao, C.: Charm: An efficient algorithm for closed itemset mining. In: 2nd SIAM Int’l. Conf. on Data Mining (2002)
  13. Han, J., Pei, J., Yin, Y.: Mining Frequent Patterns without Candidate Generation. In: Proceedings of the ACM SIGMOD Conference on Management of Data (2000)
  14. Li, C., Cong, G., Tung, A.K.H., Wang, S.: Incremental Maintenance of Quotient Cube for Sum and Median. In: Proceedings of SIGKDD, Seattle, WA, USA, pp. 226–235 (August 2004)
  15. Manku, G.S., Motwani, R.: Approximate Frequency Counts Over Data Streams. In: Proceedings of Conference on Very Large DataBase (VLDB), pp. 346–357 (2002)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Xie Zhi-jun
    • 1
  • Chen Hong
    • 1
  • Cuiping Li
    • 1
  1. School of Information, RenMin University, BeiJing, P.R. China
