Mining discriminative itemsets in data streams using the tilted-time window model

  • Regular Paper
  • Knowledge and Information Systems

Abstract

A discriminative itemset is an itemset that is frequent in the target data stream and occurs there with a much higher frequency than in the rest of the data streams in the dataset. Discriminative itemsets therefore describe the features that distinguish one data stream from the others. Mining them matters in streaming settings, where transactions arrive continuously, at high speed and in large volume. Compared with frequent itemset mining in a single data stream, discriminative itemset mining poses an additional challenge: the Apriori property does not apply, because a subset of a discriminative itemset is not necessarily discriminative. We propose an efficient and highly accurate method for mining discriminative itemsets in data streams using a tilted-time window model. The proposed single-pass H-DISSparse algorithm is built on several well-defined characteristics that aim to improve the approximate itemset frequencies held in the tilted-time window model. The data structures are adjusted dynamically at offline time intervals to reflect the discriminative itemset frequencies over different time periods in unsynchronized data streams. Empirical analysis shows that the proposed method is efficient in both time and space on fast-growing big data streams.
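The definition in the abstract can be illustrated with a small batch example. The sketch below is not the H-DISSparse algorithm and uses no tilted-time windows; it only checks the definition on two toy transaction lists. The function names, the `min_ratio` and `min_support` thresholds, and the choice to compare relative frequencies by their ratio are all assumptions made for illustration; the abstract does not specify them.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    if not transactions:
        return 0.0
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def is_discriminative(itemset, target, rest, min_ratio=2.0, min_support=0.2):
    """Hypothetical test: frequent in `target` and at least `min_ratio`
    times more frequent there than in `rest` (thresholds are assumed)."""
    s_target = support(itemset, target)
    if s_target < min_support:
        return False  # not frequent in the target stream
    s_rest = support(itemset, rest)
    if s_rest == 0.0:
        return True  # frequent in the target, absent elsewhere
    return s_target / s_rest >= min_ratio


# Toy data: 'a' and 'b' are equally common in both streams,
# but they co-occur far more often in the target stream.
target = [frozenset("ab")] * 3 + [frozenset("c")] * 2
rest = [frozenset("a")] * 2 + [frozenset("b")] * 2 + [frozenset("ab")]

for candidate in ({"a"}, {"b"}, {"a", "b"}):
    print(sorted(candidate), is_discriminative(candidate, target, rest))
```

With this toy data the pair {a, b} passes the test while neither {a} nor {b} does. That is why a miner cannot prune a candidate just because one of its subsets failed, which is the pruning that the Apriori property allows in ordinary frequent itemset mining.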

Author information

Corresponding author

Correspondence to Majid Seyfi.

About this article

Cite this article

Seyfi, M., Nayak, R., Xu, Y. et al. Mining discriminative itemsets in data streams using the tilted-time window model. Knowl Inf Syst 63, 1241–1270 (2021). https://doi.org/10.1007/s10115-021-01550-y

  • DOI: https://doi.org/10.1007/s10115-021-01550-y
