Journal of Intelligent Information Systems, Volume 28, Issue 1, pp 23–36

Towards a new approach for mining frequent itemsets on data stream

  • Chedy Raïssi
  • Pascal Poncelet
  • Maguelonne Teisseire
Article

Abstract

Mining frequent patterns on streaming data is a challenging new problem for the data mining community, since data arrives sequentially in the form of continuous, rapid streams. In this paper, we propose a new approach for mining itemsets. Our approach has the following advantages: an efficient representation of items and a novel data structure for maintaining frequent patterns, coupled with a fast pruning strategy. At any time, users can issue requests for frequent itemsets over an arbitrary time interval. Furthermore, our approach produces an approximate answer with an assurance that it will not bypass user-defined frequency and temporal thresholds. Finally, the proposed method is analyzed through a series of experiments on different datasets.

Keywords

Data streams, Frequent itemsets, Approximate answer

Copyright information

© Springer Science+Business Media, LLC 2006

Authors and Affiliations

  • Chedy Raïssi (1, 2)
  • Pascal Poncelet (1)
  • Maguelonne Teisseire (2)
  1. EMA/LGI2P, Parc Scientifique Georges Besse, Nîmes Cedex, France
  2. LIRMM UMR CNRS 5506, Montpellier Cedex 5, France
