Advertisement

Data Mining and Knowledge Discovery

, Volume 32, Issue 3, pp 604–650 | Cite as

Anytime discovery of a diverse set of patterns with Monte Carlo tree search

  • Guillaume Bosc
  • Jean-François Boulicaut
  • Chedy Raïssi
  • Mehdi Kaytoue
Article

Abstract

The discovery of patterns that accurately discriminate one class label from another remains a challenging data mining task. Subgroup discovery (SD) is one of the frameworks that enables to elicit such interesting patterns from labeled data. A question remains fairly open: How to select an accurate heuristic search technique when exhaustive enumeration of the pattern space is infeasible? Existing approaches make use of beam-search, sampling, and genetic algorithms for discovering a pattern set that is non-redundant and of high quality w.r.t. a pattern quality measure. We argue that such approaches produce pattern sets that lack of diversity: Only few patterns of high quality, and different enough, are discovered. Our main contribution is then to formally define pattern mining as a game and to solve it with Monte Carlo tree search (MCTS). It can be seen as an exhaustive search guided by random simulations which can be stopped early (limited budget) by virtue of its best-first search property. We show through a comprehensive set of experiments how MCTS enables the anytime discovery of a diverse pattern set of high quality. It outperforms other approaches when dealing with a large pattern search space and for different quality measures. Thanks to its genericity, our MCTS approach can be used for SD but also for many other pattern mining tasks.

Keywords

Supervised pattern mining Subgroup discovery Exceptional model mining Heuristic search Monte Carlo tree search Diversity 

Notes

Acknowledgements

The authors would like to thank the anonymous reviewers for their constructive and insightful comments. They also warmly thank Sandy Moens, Mario Boley, Tarcísio Lucas, Renato Vimiero, Albrecht Zimmermann, Marc Plantevit, Aimene Belfodil, Abdallah Saffidine, Dave Ritchie and especially Céline Robardet for discussions, advice or code sharing. This work has been partially supported by the European Union (GRAISearch, FP7-PEOPLE-2013-IAPP) and the Institut rhônalpin des systèmes complexes (IXXI).

References

  1. Abramson B (1990) Expected-outcome: a general model of static evaluation. IEEE Trans Pattern Anal Mach Intell 12(2):182–193.  https://doi.org/10.1109/34.44404 CrossRefGoogle Scholar
  2. Abudawood T, Flach PA (2009) Evaluation measures for multi-class subgroup discovery. In: Buntine WL, Grobelnik M, Mladenic D, Shawe-Taylor J (eds) In: Machine learning and knowledge discovery in databases, European Conference, ECML PKDD 2009, Bled, Slovenia, September 7–11, 2009, Proceedings, Part I, Springer, Lecture Notes in Computer Science, vol 5781, pp 35–50.  https://doi.org/10.1007/978-3-642-04180-8_20
  3. Atzmüller M, Lemmerich F (2009) Fast subgroup discovery for continuous target concepts. In: Rauch J, Ras ZW, Berka P, Elomaa T (eds) Foundations of intelligent systems, 18th international symposium, ISMIS 2009, Prague, Czech Republic, September 14–17, 2009. Proceedings, Springer, Lecture Notes in Computer Science, vol 5722, pp 35–44.  https://doi.org/10.1007/978-3-642-04125-9_7
  4. Atzmüller M, Puppe F (2006) Sd-map—a fast algorithm for exhaustive subgroup discovery. In: Fürnkranz J, Scheffer T, Spiliopoulou M (eds) Knowledge discovery in databases: PKDD 2006, 10th European conference on principles and practice of knowledge discovery in databases, Berlin, Germany, September 18–22, 2006, Proceedings, Springer, Lecture Notes in Computer Science, vol 4213, pp 6–17.  https://doi.org/10.1007/11871637_6
  5. Auer P, Cesa-Bianchi N, Fischer P (2002) Finite-time analysis of the multiarmed bandit problem. Mach Learn 47(2–3):235–256.  https://doi.org/10.1023/A:1013689704352 CrossRefMATHGoogle Scholar
  6. Belfodil A, Kuznetsov SO, Robardet C, Kaytoue M (2017) Mining convex polygon patterns with formal concept analysis. In: Sierra C (ed) Proceedings of the twenty-sixth international joint conference on artificial intelligence, IJCAI 2017, Melbourne, Australia, August 19–25, 2017, ijcai.org, pp 1425–1432.  https://doi.org/10.24963/ijcai.2017/197
  7. Bendimerad AA, Plantevit M, Robardet C (2016) Unsupervised exceptional attributed sub-graph mining in urban data. In: Bonchi F, Domingo-Ferrer J, Baeza-Yates RA, Zhou Z, Wu X (eds) IEEE 16th international conference on data mining, ICDM 2016, December 12–15, 2016, Barcelona, Spain, IEEE, pp 21–30.  https://doi.org/10.1109/ICDM.2016.0013
  8. Björnsson Y, Finnsson H (2009) Cadiaplayer: a simulation-based general game player. IEEE Trans Comput Intell AI Games 1(1):4–15.  https://doi.org/10.1109/TCIAIG.2009.2018702 CrossRefGoogle Scholar
  9. Boley M, Lucchese C, Paurat D, Gärtner T (2011) Direct local pattern sampling by efficient two-step random procedures. In: Apté C, Ghosh J, Smyth P (eds) Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, San Diego, CA, USA, August 21–24, 2011, ACM, pp 582–590.  https://doi.org/10.1145/2020408.2020500
  10. Bosc G, Golebiowski J, Bensafi M, Robardet C, Plantevit M, Boulicaut J, Kaytoue M (2016) Local subgroup discovery for eliciting and understanding new structure-odor relationships. In: Calders T, Ceci M, Malerba D (eds) Discovery science—19th international conference, DS 2016, Bari, Italy, October 19–21, 2016, Proceedings, Lecture Notes in Computer Science, vol 9956, pp 19–34.  https://doi.org/10.1007/978-3-319-46307-0_2
  11. Boulicaut J, Jeudy B (2010) Constraint-based data mining. In: Maimon O, Rokach L (eds) Data mining and knowledge discovery handbook, 2nd edn. Springer, Berlin, pp 339–354.  https://doi.org/10.1007/978-0-387-09823-4_17 Google Scholar
  12. Bringmann B, Zimmermann A (2009) One in a million: picking the right patterns. Knowl Inf Syst 18(1):61–81.  https://doi.org/10.1007/s10115-008-0136-4 CrossRefGoogle Scholar
  13. Browne C, Powley EJ, Whitehouse D, Lucas SM, Cowling PI, Rohlfshagen P, Tavener S, Liebana DP, Samothrakis S, Colton S (2012) A survey of monte carlo tree search methods. IEEE Trans Comput Intell AI Games 4(1):1–43.  https://doi.org/10.1109/TCIAIG.2012.2186810 CrossRefGoogle Scholar
  14. Carmona CJ, González P, del Jesús MJ, Herrera F (2010) NMEEF-SD: non-dominated multiobjective evolutionary algorithm for extracting fuzzy rules in subgroup discovery. IEEE Trans Fuzzy Syst 18(5):958–970.  https://doi.org/10.1109/TFUZZ.2010.2060200 CrossRefGoogle Scholar
  15. del Jesús MJ, González P, Herrera F, Mesonero M (2007) Evolutionary fuzzy rule induction process for subgroup discovery: a case study in marketing. IEEE Trans Fuzzy Syst 15(4):578–592.  https://doi.org/10.1109/TFUZZ.2006.890662 CrossRefGoogle Scholar
  16. Downar L, Duivesteijn W (2017) Exceptionally monotone models—the rank correlation model class for exceptional model mining. Knowl Inf Syst 51(2):369–394.  https://doi.org/10.1007/s10115-016-0979-z CrossRefGoogle Scholar
  17. Duivesteijn W, Knobbe AJ (2011) Exploiting false discoveries—statistical validation of patterns and quality measures in subgroup discovery. In: Cook DJ, Pei J, Wang W, Zaïane OR, Wu X (eds) 11th IEEE international conference on data mining, ICDM 2011, Vancouver, BC, Canada, December 11–14, 2011, IEEE Computer Society, pp 151–160.  https://doi.org/10.1109/ICDM.2011.65
  18. Duivesteijn W, Knobbe AJ, Feelders A, van Leeuwen M (2010) Subgroup discovery meets bayesian networks—an exceptional model mining approach. In: Webb GI, Liu B, Zhang C, Gunopulos D, Wu X (eds) ICDM 2010, The 10th ieee international conference on data mining, Sydney, Australia, 14–17 December 2010, IEEE Computer Society, pp 158–167.  https://doi.org/10.1109/ICDM.2010.53
  19. Duivesteijn W, Feelders A, Knobbe AJ (2016) Exceptional model mining–supervised descriptive local pattern mining with complex target concepts. Data Min Knowl Discov 30(1):47–98.  https://doi.org/10.1007/s10618-015-0403-4 MathSciNetCrossRefGoogle Scholar
  20. Egho E, Gay D, Boullé M, Voisine N, Clérot F (2015) A parameter-free approach for mining robust sequential classification rules. In: Aggarwal CC, Zhou Z, Tuzhilin A, Xiong H, Wu X (eds) 2015 IEEE international conference on data mining, ICDM 2015, Atlantic City, NJ, USA, November 14–17, 2015, IEEE Computer Society, pp 745–750.  https://doi.org/10.1109/ICDM.2015.87
  21. Egho E, Gay D, Boullé M, Voisine N, Clérot F (2017) A user parameter-free approach for mining robust sequential classification rules. Knowl Inf Syst 52(1):53–81.  https://doi.org/10.1007/s10115-016-1002-4 CrossRefGoogle Scholar
  22. Fürnkranz J, Gamberger D, Lavrac N (2012) Foundations of rule learning. Cognitive Technologies. Springer, Berlin.  https://doi.org/10.1007/978-3-540-75197-7 CrossRefMATHGoogle Scholar
  23. Gaudel R, Sebag M (2010) Feature selection as a one-player game. In: Fürnkranz J, Joachims T (eds) Proceedings of the 27th international conference on machine learning (ICML-10), June 21–24, 2010, Haifa, Israel, Omnipress, pp 359–366. http://www.icml2010.org/papers/247.pdf
  24. Gay D, Boullé M (2012) A bayesian approach for classification rule mining in quantitative databases. In: Flach PA, Bie TD, Cristianini N (eds) Machine learning and knowledge discovery in databases—European conference, ECML PKDD 2012, Bristol, UK, September 24–28, 2012. Proceedings, Part II, Springer, Lecture Notes in Computer Science, vol 7524, pp 243–259.  https://doi.org/10.1007/978-3-642-33486-3_16
  25. Gelly S, Silver D (2007) Combining online and offline knowledge in UCT. In: Ghahramani Z (ed) Machine learning, proceedings of the twenty-fourth international conference (ICML 2007), Corvallis, Oregon, USA, June 20–24, 2007, ACM, ACM International Conference Proceeding Series, vol 227, pp 273–280.  https://doi.org/10.1145/1273496.1273531
  26. Grosskreutz H, Rüping S, Wrobel S (2008) Tight optimistic estimates for fast subgroup discovery. In: Daelemans W, Goethals B, Morik K (eds) Machine learning and knowledge discovery in databases, European conference, ECML/PKDD 2008, Antwerp, Belgium, September 15–19, 2008, Proceedings, Part I, Springer, Lecture Notes in Computer Science, vol 5211, pp 440–456.  https://doi.org/10.1007/978-3-540-87479-9_47
  27. Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Chen W, Naughton JF, Bernstein PA (eds) Proceedings of the 2000 ACM SIGMOD international conference on management of data, May 16–18, 2000, Dallas, Texas, USA., ACM, pp 1–12.  https://doi.org/10.1145/342009.335372
  28. Han J, Pei J, Yin Y, Mao R (2004) Mining frequent patterns without candidate generation: a frequent-pattern tree approach. Data Min Knowl Discov 8(1):53–87.  https://doi.org/10.1023/B:DAMI.0000005258.31418.83 MathSciNetCrossRefGoogle Scholar
  29. Helmbold DP, Parker-Wood A (2009) All-moves-as-first heuristics in Monte-Carlo go. In: Arabnia HR, de la Fuente D, Olivas JA (eds) Proceedings of the 2009 international conference on artificial intelligence, ICAI 2009, July 13–16, 2009, Las Vegas Nevada, USA, 2 Volumes, CSREA Press, pp 605–610Google Scholar
  30. Holland JH (1975) Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. University Michigan Press, Ann ArborMATHGoogle Scholar
  31. Kavsek B, Lavrac N (2006) APRIORI-SD: adapting association rule learning to subgroup discovery. Appl Artif Intell 20(7):543–583.  https://doi.org/10.1080/08839510600779688 CrossRefGoogle Scholar
  32. Kaytoue M, Kuznetsov SO, Napoli A (2011) Revisiting numerical pattern mining with formal concept analysis. In: Walsh (2011), pp 1342–1347.  https://doi.org/10.5591/978-1-57735-516-8/IJCAI11-227
  33. Kaytoue M, Plantevit M, Zimmermann A, Bendimerad AA, Robardet C (2017) Exceptional contextual subgraph mining. Mach Learn 106(8):1171–1211.  https://doi.org/10.1007/s10994-016-5598-0 MathSciNetCrossRefMATHGoogle Scholar
  34. Klösgen W (1996) Explora: a multipattern and multistrategy discovery assistant. In: Fayyad UM, Piatetsky-Shapiro G, Smyth P, Uthurusamy R (eds) Advances in knowledge discovery and data mining. AAAI/MIT Press, Cambridge, pp 249–271Google Scholar
  35. Kocsis L, Szepesvári C (2006) Bandit based monte-carlo planning. In: Fürnkranz J, Scheffer T, Spiliopoulou M (eds) Machine learning: ECML 2006, 17th European conference on machine learning, Berlin, Germany, September 18–22, 2006, Proceedings, Springer, Lecture Notes in Computer Science, vol 4212, pp 282–293.  https://doi.org/10.1007/11871842_29
  36. Lavrac N, Flach PA, Zupan B (1999) Rule evaluation measures: A unifying view. In: Dzeroski S, Flach PA (eds) Inductive logic programming, 9th international workshop, ILP-99, Bled, Slovenia, June 24–27, 1999, Proceedings, Springer, Lecture Notes in Computer Science, vol 1634, pp 174–185.  https://doi.org/10.1007/3-540-48751-4_17
  37. Lavrac N, Cestnik B, Gamberger D, Flach PA (2004) Decision support through subgroup discovery: three case studies and the lessons learned. Mach Learn 57(1–2):115–143.  https://doi.org/10.1023/B:MACH.0000035474.48771.cd CrossRefMATHGoogle Scholar
  38. Leman D, Feelders A, Knobbe AJ (2008) Exceptional model mining. In: Daelemans W, Goethals B, Morik K (eds) Machine learning and knowledge discovery in databases, European conference, ECML/PKDD 2008, Antwerp, Belgium, September 15–19, 2008, Proceedings, Part II, Springer, Lecture Notes in Computer Science, vol 5212, pp 1–16.  https://doi.org/10.1007/978-3-540-87481-2_1
  39. Lemmerich F, Atzmueller M, Puppe F (2016) Fast exhaustive subgroup discovery with numerical target concepts. Data Min Knowl Discov 30(3):711–762.  https://doi.org/10.1007/s10618-015-0436-8 MathSciNetCrossRefGoogle Scholar
  40. Lowerre BT (1976) The harpy speech recognition system. Ph.D. Thesis, Carnegie-Mellon University, Pittsburgh, PA, Department of Computer ScienceGoogle Scholar
  41. Lucas T, Silva TCPB, Vimieiro R, Ludermir TB (2017) A new evolutionary algorithm for mining top-k discriminative patterns in high dimensional data. Appl Soft Comput 59:487–499.  https://doi.org/10.1016/j.asoc.2017.05.048 CrossRefGoogle Scholar
  42. Meeng M, Duivesteijn W, Knobbe AJ (2014) Rocsearch—an roc-guided search strategy for subgroup discovery. In: Zaki MJ, Obradovic Z, Tan P, Banerjee A, Kamath C, Parthasarathy S (eds) Proceedings of the 2014 SIAM international conference on data mining, Philadelphia, Pennsylvania, USA, April 24–26, 2014, SIAM, pp 704–712.  https://doi.org/10.1137/1.9781611973440.81
  43. Moens S, Boley M (2014) Instant exceptional model mining using weighted controlled pattern sampling. In: Blockeel H, van Leeuwen M, Vinciotti V (eds) Advances in intelligent data analysis XIII—13th international symposium, IDA 2014, Leuven, Belgium, October 30–November 1, 2014. Proceedings, Springer, Lecture Notes in Computer Science, vol 8819, pp 203–214.  https://doi.org/10.1007/978-3-319-12571-8_18
  44. Mueller M, Rosales R, Steck H, Krishnan S, Rao B, Kramer S (2009) Subgroup discovery for test selection: A novel approach and its application to breast cancer diagnosis. In: Adams NM, Robardet C, Siebes A, Boulicaut J (eds) Advances in intelligent data analysis VIII, 8th international symposium on intelligent data analysis, IDA 2009, Lyon, France, August 31–September 2, 2009. Proceedings, Springer, Lecture Notes in Computer Science, vol 5772, pp 119–130.  https://doi.org/10.1007/978-3-642-03915-7_11
  45. Novak PK, Lavrac N, Webb GI (2009) Supervised descriptive rule discovery: a unifying survey of contrast set, emerging pattern and subgroup mining. J Mach Learn Res 10:377–403.  https://doi.org/10.1145/1577069.1577083 MATHGoogle Scholar
  46. Pachón V, Vázquez JM, Domínguez JL, López MJM (2011) Multi-objective evolutionary approach for subgroup discovery. In: Corchado E, Kurzynski M, Wozniak M (eds) Hybrid artificial intelligent systems—6th international conference, HAIS 2011, Wroclaw, Poland, May 23–25, 2011, Proceedings, Part II, Springer, Lecture Notes in Computer Science, vol 6679, pp 271–278.  https://doi.org/10.1007/978-3-642-21222-2_33
  47. Rodríguez D, Ruiz R, Riquelme JC, Aguilar-Ruiz JS (2012) Searching for rules to detect defective modules: a subgroup discovery approach. Inf Sci 191:14–30.  https://doi.org/10.1016/j.ins.2011.01.039 CrossRefGoogle Scholar
  48. Russell SJ, Norvig P (2010) Artificial intelligence—a modern approach (3. internat. ed). Pearson Education. http://vig.pearsoned.com/store/product/1,1207,store-12521_isbn-0136042597,00.html
  49. Schadd MPD, Winands MHM, van den Herik HJ, Chaslot G, Uiterwijk JWHM (2008) Single-player monte-carlo tree search. In: van den Herik HJ, Xu X, Ma Z, Winands MHM (eds) Computers and games, 6th international conference, CG 2008, Beijing, China, September 29–October 1, 2008. Proceedings, Springer, Lecture Notes in Computer Science, vol 5131, pp 1–12.  https://doi.org/10.1007/978-3-540-87608-3_1
  50. Silver D, Huang A, Maddison CJ, Guez A, Sifre L, van den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M, Dieleman S, Grewe D, Nham J, Kalchbrenner N, Sutskever I, Lillicrap TP, Leach M, Kavukcuoglu K, Graepel T, Hassabis D (2016) Mastering the game of go with deep neural networks and tree search. Nature 529(7587):484–489CrossRefGoogle Scholar
  51. van Leeuwen M, Galbrun E (2015) Association discovery in two-view data. IEEE Trans Knowl Data Eng 27(12):3190–3202.  https://doi.org/10.1109/TKDE.2015.2453159 CrossRefGoogle Scholar
  52. van Leeuwen M, Knobbe AJ (2011) Non-redundant subgroup discovery in large and complex data. In: Gunopulos D, Hofmann T, Malerba D, Vazirgiannis M (eds) Machine learning and knowledge discovery in databases—European conference, ECML PKDD 2011, Athens, Greece, September 5–9, 2011, Proceedings, Part III, Springer, Lecture Notes in Computer Science, vol 6913, pp 459–474.  https://doi.org/10.1007/978-3-642-23808-6_30
  53. van Leeuwen M, Knobbe AJ (2012) Diverse subgroup set discovery. Data Min Knowl Discov 25(2):208–242.  https://doi.org/10.1007/s10618-012-0273-y MathSciNetCrossRefGoogle Scholar
  54. van Leeuwen M, Ukkonen A (2013) Discovering skylines of subgroup sets. In: Blockeel H, Kersting K, Nijssen S, Zelezný F (eds) Machine learning and knowledge discovery in databases—European conference, ECML PKDD 2013, Prague, Czech Republic, September 23–27, 2013, Proceedings, Part III, Springer, Lecture Notes in Computer Science, vol 8190, pp 272–287.  https://doi.org/10.1007/978-3-642-40994-3_18
  55. Walsh T (ed) (2011) IJCAI 2011, Proceedings of the 22nd international joint conference on artificial intelligence, Barcelona, Catalonia, Spain, July 16–22, 2011, IJCAI/AAAI. http://ijcai.org/proceedings/2011
  56. Wrobel S (1997) An algorithm for multi-relational discovery of subgroups. In: Komorowski HJ, Zytkow JM (eds) Principles of data mining and knowledge discovery, first European symposium, PKDD ’97, Trondheim, Norway, June 24–27, 1997, Proceedings, Springer, Lecture Notes in Computer Science, vol 1263, pp 78–87.  https://doi.org/10.1007/3-540-63223-9_108

Copyright information

© The Author(s) 2017

Authors and Affiliations

  1. 1.CNRS, INSA-Lyon, LIRIS, UMR5205Université de LyonLyonFrance
  2. 2.INRIA Nancy Grand EstVillers-lès-NancyFrance

Personalised recommendations