Machine Learning, Volume 57, Issue 1–2, pp 115–143

Decision Support Through Subgroup Discovery: Three Case Studies and the Lessons Learned

  • Nada Lavrač
  • Bojan Cestnik
  • Dragan Gamberger
  • Peter Flach


This paper presents ways to use subgroup discovery to generate actionable knowledge for decision support. Actionable knowledge is explicit symbolic knowledge, typically presented in the form of rules, that allows the decision maker to recognize important relations and to take appropriate action, such as targeting a direct marketing campaign or planning a population screening campaign aimed at detecting individuals at high risk of disease. Different subgroup discovery approaches are outlined, and their advantages over standard classification rule learning are discussed. Three case studies, one medical and two in marketing, are used to present the lessons learned in solving problems that require generating actionable knowledge for decision support.

Keywords: data mining, subgroup discovery, decision support, actionability, lessons learned



Copyright information

© Kluwer Academic Publishers 2004

Authors and Affiliations

  • Nada Lavrač (1, 2)
  • Bojan Cestnik (1)
  • Dragan Gamberger (3)
  • Peter Flach (4)

  1. Jožef Stefan Institute, Ljubljana, Slovenia
  2. Nova Gorica Polytechnic, Nova Gorica, Slovenia
  3. Rudjer Bošković Institute, Zagreb, Croatia
  4. University of Bristol, Bristol, UK
