A Relaxation-Based Approach for Mining Diverse Closed Patterns

  • Conference paper
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2020)

Abstract

In recent years, pattern mining has moved from a slow-moving repeated three-step process to a much more agile iterative/user-centric mining model. A vital ingredient of this framework is the ability to quickly present a set of diverse patterns to the user. In this paper, we use constraint programming (well-suited to user-centric mining due to its rich constraint language) to efficiently mine a diverse set of closed patterns. Diversity is controlled through a threshold on the Jaccard similarity of pattern occurrences. We show that the Jaccard measure has no monotonicity property, which prevents usual pruning techniques and makes classical pattern mining unworkable. This is why we propose anti-monotonic lower and upper bound relaxations, which allow effective pruning, with an efficient branching rule, boosting the whole search process. We show experimentally that our approach significantly reduces the number of patterns and is very efficient in terms of running times, particularly on dense data sets.
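To make the diversity criterion concrete, the sketch below computes the Jaccard similarity between the occurrence sets (covers) of two patterns and checks a candidate against previously found patterns. It is a minimal illustration, not the constraint programming model of the paper: the toy data set, the function names (`cover`, `jaccard`, `is_diverse`), the threshold name `j_max`, and the choice of a non-strict comparison are all assumptions made for this example.

```python
from typing import Dict, FrozenSet, Iterable

Itemset = FrozenSet[str]
Cover = FrozenSet[int]


def cover(pattern: Itemset, transactions: Dict[int, Itemset]) -> Cover:
    """Ids of the transactions that contain every item of `pattern`."""
    return frozenset(tid for tid, items in transactions.items() if pattern <= items)


def jaccard(cov_a: Cover, cov_b: Cover) -> float:
    """Jaccard similarity of two covers; taken to be 0.0 when both are empty."""
    union = cov_a | cov_b
    return len(cov_a & cov_b) / len(union) if union else 0.0


def is_diverse(candidate: Cover, history: Iterable[Cover], j_max: float) -> bool:
    """True if the candidate's similarity to every stored cover is at most j_max."""
    return all(jaccard(candidate, h) <= j_max for h in history)


# Hypothetical toy data set: transaction id -> set of items.
transactions = {
    1: frozenset("ab"),
    2: frozenset("abc"),
    3: frozenset("ac"),
    4: frozenset("bc"),
}

# Similarity to a fixed reference pattern {c} along the chain {a}, {a,c}, {a,b,c}.
reference = cover(frozenset("c"), transactions)
chain = [frozenset("a"), frozenset("ac"), frozenset("abc")]
similarities = [jaccard(cover(p, transactions), reference) for p in chain]
```

On this toy data the similarity to the reference pattern is 1/2 for {a}, rises to 2/3 for {a,c}, and falls to 1/3 for {a,b,c}. Extending a pattern can therefore either raise or lower its Jaccard similarity, which illustrates why the measure alone gives no monotone pruning rule and why bound relaxations are needed.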

Notes

  1. As opposed to the more rigid search in classical pattern mining algorithms, which often rely on exploiting the properties of a particular constraint.

  2. https://github.com/lobnury/ClosedDiversity

Author information

Corresponding author

Correspondence to Samir Loudni.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Hien, A. et al. (2021). A Relaxation-Based Approach for Mining Diverse Closed Patterns. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science, vol 12457. Springer, Cham. https://doi.org/10.1007/978-3-030-67658-2_3

  • DOI: https://doi.org/10.1007/978-3-030-67658-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67657-5

  • Online ISBN: 978-3-030-67658-2

  • eBook Packages: Computer Science (R0)
