An Approximate Approach for Mining Recently Frequent Itemsets from Data Streams

  • Jia-Ling Koh
  • Shu-Ning Shin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4081)

Abstract

Data streams, unbounded sequences of data elements generated at a rapid rate, provide a dynamic environment for collecting data. The knowledge embedded in a data stream is likely to change quickly over time, so capturing the recent trend of the data is an important issue when mining frequent itemsets from data streams. Although the sliding window model offers a good solution to this problem, the traditional approach must maintain the complete occurrence information of every pattern within the sliding window. In this paper, to estimate the approximate supports of patterns within the current sliding window, two data structures are proposed that maintain the average time stamps and the frequency changing points of patterns, respectively. The experimental results show that our approach significantly reduces run-time memory usage. Moreover, the proposed FCP algorithm achieves high accuracy in its mining results and guarantees that no false dismissals occur.
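To make the memory cost of the traditional sliding window approach concrete, the following is a minimal sketch of an exact baseline. It is not the FCP algorithm or either data structure proposed in the paper. The class name `SlidingWindowCounter`, the `max_itemset_size` cap, and the choice to measure the window in transactions are all assumptions made for illustration. Note that the sketch has to keep every transaction in the window so that expired ones can be subtracted, which is the complete bookkeeping the abstract refers to.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowCounter:
    """Exact itemset supports over the last `window_size` transactions.

    Hypothetical baseline for illustration only; not the paper's method.
    """

    def __init__(self, window_size, max_itemset_size=2):
        self.window_size = window_size
        self.max_itemset_size = max_itemset_size  # assumed cap to bound enumeration
        self.window = deque()      # every transaction in the window is retained
        self.counts = Counter()    # itemset (sorted tuple) -> occurrence count

    def _itemsets(self, transaction):
        # Enumerate all itemsets of size 1..max_itemset_size in the transaction.
        items = sorted(set(transaction))
        for k in range(1, self.max_itemset_size + 1):
            yield from combinations(items, k)

    def add(self, transaction):
        # Count the new transaction, then expire the oldest one if needed.
        self.window.append(transaction)
        self.counts.update(self._itemsets(transaction))
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            for itemset in self._itemsets(expired):
                self.counts[itemset] -= 1
                if self.counts[itemset] <= 0:
                    del self.counts[itemset]

    def frequent(self, min_support):
        # Return itemsets whose relative support in the window meets the threshold.
        n = len(self.window)
        return {
            itemset: count / n
            for itemset, count in self.counts.items()
            if count / n >= min_support
        }


if __name__ == "__main__":
    counter = SlidingWindowCounter(window_size=3)
    for transaction in [["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"]]:
        counter.add(transaction)
    print(counter.frequent(min_support=0.6))
```

Memory here grows with both the window length and the number of distinct itemsets seen in it. An approximate scheme of the kind the abstract describes aims to avoid storing that full history.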

Keywords

Data Stream; Frequent Pattern; Memory Usage; Frequent Itemsets; Mining Result
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jia-Ling Koh (1)
  • Shu-Ning Shin (1)
  1. Department of Computer Science and Information Engineering, National Taiwan Normal University, Taipei, Taiwan, R.O.C.
