Adaptive Self-Sufficient Itemset Miner for Transactional Data Streams

  • Conference paper
PRICAI 2019: Trends in Artificial Intelligence (PRICAI 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11671)

Abstract

Most studies on pattern mining treat itemsets with a high frequency of occurrence as useful, with usefulness typically determined by the support of the itemsets. However, recent research has shown that we need to move beyond a pure “support-confidence” framework for pattern mining. Recently, there has been interest in finding statistically significant patterns, and one of the most popular types of such patterns is the self-sufficient itemset. These patterns have a frequency that is significantly different from the frequency of their subsets and supersets. One limitation of existing work on self-sufficient itemsets is that it does not consider concept drift and cannot be applied to data streams. Learning in an online environment requires efficient and effective mechanisms that handle non-static data and non-stationary data distributions. In this work, we concentrate on detecting self-sufficient itemsets in data streams. We present a comprehensive framework for mining self-sufficient itemsets from data streams, together with a drift detector. The framework supports mining self-sufficient itemsets in an online environment and can adapt to changes in the stream. Our experimental evaluations show that the framework mines self-sufficient itemsets faster in an online environment, with better precision and recall.
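As an illustration of the idea that an itemset's frequency can be compared against the frequencies of its subsets over a stream, the following is a minimal sketch and not the framework presented in the paper. It assumes a fixed-size sliding window and uses a simple leverage-style gap (observed support minus the support expected if the two parts of a binary partition were independent) in place of a proper statistical significance test. The names `SlidingWindow` and `min_leverage` are hypothetical and introduced only for this example.

```python
from collections import deque
from itertools import combinations


class SlidingWindow:
    """Keeps the most recent `size` transactions of a stream (assumed design)."""

    def __init__(self, size):
        # deque with maxlen silently drops the oldest transaction when full.
        self.transactions = deque(maxlen=size)

    def add(self, transaction):
        self.transactions.append(frozenset(transaction))

    def support(self, itemset):
        """Fraction of windowed transactions containing every item in `itemset`."""
        if not self.transactions:
            return 0.0
        items = frozenset(itemset)
        hits = sum(1 for t in self.transactions if items <= t)
        return hits / len(self.transactions)


def min_leverage(window, itemset):
    """Smallest gap, over all binary partitions of `itemset`, between the
    observed support and the support expected under independence of the
    two parts. A clearly positive value suggests the itemset occurs more
    often than any pair of its parts would explain. This is a heuristic
    stand-in, not a significance test."""
    items = sorted(itemset)
    if len(items) < 2:
        raise ValueError("itemset must contain at least two items")
    observed = window.support(items)
    first, rest = items[0], items[1:]
    gaps = []
    # Fix the first item on the left side so each partition is visited once;
    # r stops before len(rest) so the right side is never empty.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            left = {first, *extra}
            right = set(items) - left
            expected = window.support(left) * window.support(right)
            gaps.append(observed - expected)
    return min(gaps)


if __name__ == "__main__":
    window = SlidingWindow(size=4)
    for t in [{"a", "b"}, {"a", "b"}, {"c"}, {"c"}]:
        window.add(t)
    print(min_leverage(window, {"a", "b"}))  # a and b co-occur in this window

    # The stream changes: a and b now appear only separately.
    for t in [{"a"}, {"b"}, {"a"}, {"b"}]:
        window.add(t)
    print(min_leverage(window, {"a", "b"}))
```

In this sketch the window simply forgets old transactions at a fixed rate. An adaptive approach would instead use a drift detector to decide when the stored statistics no longer reflect the current distribution.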

Author information

Corresponding authors

Correspondence to Feiyang Tang or David Tse Jung Huang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Tang, F., Huang, D.T.J., Koh, Y.S., Fournier-Viger, P. (2019). Adaptive Self-Sufficient Itemset Miner for Transactional Data Streams. In: Nayak, A., Sharma, A. (eds) PRICAI 2019: Trends in Artificial Intelligence. PRICAI 2019. Lecture Notes in Computer Science, vol 11671. Springer, Cham. https://doi.org/10.1007/978-3-030-29911-8_32

  • DOI: https://doi.org/10.1007/978-3-030-29911-8_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29910-1

  • Online ISBN: 978-3-030-29911-8

  • eBook Packages: Computer Science, Computer Science (R0)
