Towards a new approach for mining frequent itemsets on data stream

Journal of Intelligent Information Systems

Abstract

Mining frequent patterns on streaming data is a new and challenging problem for the data mining community, since data arrives sequentially in the form of continuous, rapid streams. In this paper, we propose a new approach for mining itemsets. Our approach has the following advantages: an efficient representation of items and a novel data structure for maintaining frequent patterns, coupled with a fast pruning strategy. At any time, users can issue requests for frequent itemsets over an arbitrary time interval. Furthermore, our approach produces an approximate answer with an assurance that it will not bypass user-defined frequency and temporal thresholds. Finally, the proposed method is evaluated through a series of experiments on different datasets.
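
To make the idea of an approximate answer bounded by a user-defined frequency threshold concrete, the following is a minimal sketch of Lossy Counting (Manku and Motwani, 2002) applied to small itemsets. It is not the data structure or pruning strategy proposed in this paper, and it does not support queries over arbitrary time intervals. The class name `LossyItemsetCounter` and the parameters `epsilon`, `max_size`, and `min_support` are assumptions chosen for illustration.

```python
from itertools import combinations
from math import ceil


class LossyItemsetCounter:
    """Lossy Counting over itemsets of bounded size (illustrative sketch).

    This is a stand-in technique, not the method proposed in the paper.
    """

    def __init__(self, epsilon, max_size=2):
        self.epsilon = epsilon          # maximum relative undercount
        self.max_size = max_size        # largest itemset size tracked
        self.width = ceil(1 / epsilon)  # transactions per bucket
        self.n = 0                      # transactions seen so far
        self.entries = {}               # itemset -> (count, max_missed)

    def add(self, transaction):
        """Process one transaction (an iterable of items) from the stream."""
        self.n += 1
        bucket = ceil(self.n / self.width)  # id of the current bucket
        items = sorted(set(transaction))
        for size in range(1, self.max_size + 1):
            for itemset in combinations(items, size):
                # A new entry may have been missed in all earlier buckets.
                count, missed = self.entries.get(itemset, (0, bucket - 1))
                self.entries[itemset] = (count + 1, missed)
        if self.n % self.width == 0:
            # Bucket boundary: prune entries that cannot be frequent.
            self.entries = {
                s: (c, d) for s, (c, d) in self.entries.items() if c + d > bucket
            }

    def frequent(self, min_support):
        """Return itemsets whose stored count is at least (s - epsilon) * n."""
        threshold = (min_support - self.epsilon) * self.n
        return {s: c for s, (c, _) in self.entries.items() if c >= threshold}


counter = LossyItemsetCounter(epsilon=0.25, max_size=2)
for t in [["a", "b"], ["a", "b", "c"], ["a", "b"], ["a", "d"]]:
    counter.add(t)
result = counter.frequent(min_support=0.5)
```

Under these assumptions, every itemset whose true support is at least `min_support` is reported, and each stored count undercounts the true count by at most `epsilon` times the number of transactions seen. Enumerating all subsets of a transaction is exponential in its length, which is why the sketch caps the itemset size with `max_size`.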



Author information

Corresponding author

Correspondence to Chedy Raïssi.

About this article

Cite this article

Raïssi, C., Poncelet, P. & Teisseire, M. Towards a new approach for mining frequent itemsets on data stream. J Intell Inf Syst 28, 23–36 (2007). https://doi.org/10.1007/s10844-006-0002-3

  • DOI: https://doi.org/10.1007/s10844-006-0002-3
