Supervised Descriptive Rule Learning

Part of the book series: Cognitive Technologies (COGTECH)

Abstract

This chapter presents subgroup discovery (SD) and related supervised descriptive rule induction techniques, including contrast set mining (CSM) and emerging pattern mining (EPM), within a unifying framework named supervised descriptive rule learning. All of these techniques aim to discover patterns in the form of rules induced from labeled data. The chapter contributes to their understanding by presenting a unified terminology and by explaining the apparent differences between the learning tasks as variants of a single supervised descriptive rule learning task. It also shows that the various rule learning heuristics used in CSM, EPM, and SD algorithms all aim at optimizing a trade-off between rule coverage and precision.
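
To make the coverage/precision trade-off concrete, here is a minimal Python sketch using weighted relative accuracy (WRAcc), a heuristic commonly associated with subgroup discovery. The abstract does not name a specific heuristic, so the choice of WRAcc, the `RuleStats` helper, and the toy counts are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleStats:
    """Counts for one rule `Cond -> Class` on labeled data (hypothetical helper)."""
    covered_pos: int  # covered examples of the target class
    covered_neg: int  # covered examples of the other classes
    total_pos: int    # all examples of the target class
    total_neg: int    # all examples of the other classes


def coverage(s: RuleStats) -> float:
    """Fraction of all examples that the rule covers."""
    return (s.covered_pos + s.covered_neg) / (s.total_pos + s.total_neg)


def precision(s: RuleStats) -> float:
    """Fraction of covered examples that belong to the target class."""
    covered = s.covered_pos + s.covered_neg
    return s.covered_pos / covered if covered else 0.0


def wracc(s: RuleStats) -> float:
    """Weighted relative accuracy: coverage * (precision - class prior)."""
    prior = s.total_pos / (s.total_pos + s.total_neg)
    return coverage(s) * (precision(s) - prior)


# Toy counts (assumed): 50 target-class and 50 other examples overall.
general = RuleStats(covered_pos=40, covered_neg=20, total_pos=50, total_neg=50)
specific = RuleStats(covered_pos=5, covered_neg=0, total_pos=50, total_neg=50)

for name, stats in (("general", general), ("specific", specific)):
    print(name, round(coverage(stats), 3), round(precision(stats), 3),
          round(wracc(stats), 3))
```

In this toy setting the perfectly precise but very specific rule scores lower than the broader, less precise one, which is the kind of trade-off the abstract refers to.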

Notes

  1. Parts of this chapter are based on Kralj Novak, Lavrač, and Webb (2009).

  2. http://www.giwebb.com/

  3. Jumping emerging patterns are emerging patterns with support zero in one dataset and greater than zero in the other dataset.
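
The definition in note 3 can be sketched in Python as follows. This is an illustration under stated assumptions: datasets are modeled as lists of item sets, the helper names `support`, `growth_rate`, and `is_jumping` are hypothetical, and the growth-rate conventions for zero support follow the usual emerging-pattern formulation rather than anything stated on this page.

```python
import math
from typing import FrozenSet, List

Itemset = FrozenSet[str]


def support(pattern: Itemset, dataset: List[Itemset]) -> float:
    """Fraction of transactions in `dataset` containing every item of `pattern`."""
    if not dataset:
        return 0.0
    return sum(pattern <= t for t in dataset) / len(dataset)


def growth_rate(pattern: Itemset, d1: List[Itemset], d2: List[Itemset]) -> float:
    """Support ratio from d1 to d2 (assumed convention: 0/0 -> 0, x/0 -> inf)."""
    s1, s2 = support(pattern, d1), support(pattern, d2)
    if s1 == 0.0:
        return 0.0 if s2 == 0.0 else math.inf
    return s2 / s1


def is_jumping(pattern: Itemset, d1: List[Itemset], d2: List[Itemset]) -> bool:
    """True if support is zero in exactly one dataset and positive in the other."""
    s1, s2 = support(pattern, d1), support(pattern, d2)
    return (s1 == 0.0) != (s2 == 0.0)


# Toy datasets (assumed), one per class.
d1 = [frozenset({"a", "b"}), frozenset({"b", "c"})]
d2 = [frozenset({"a", "c"}), frozenset({"a", "b", "c"})]

print(is_jumping(frozenset({"a", "c"}), d1, d2))  # {a, c} never occurs in d1
print(is_jumping(frozenset({"b"}), d1, d2))       # {b} occurs in both datasets
```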

References

  • Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., & Verkamo, A. I. (1995). Fast discovery of association rules. In U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, & R. Uthurusamy (Eds.), Advances in knowledge discovery and data mining (pp. 307–328). Menlo Park, CA: AAAI.

  • Atzmüller, M., & Puppe, F. (2005). Semi-automatic visual subgroup mining using VIKAMINE. Journal of Universal Computer Science, 11(11), 1752–1765. Special Issue on Visual Data Mining.

  • Atzmüller, M., & Puppe, F. (2006). SD-Map – A fast algorithm for exhaustive subgroup discovery. In Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-06), Berlin, Germany (pp. 6–17). Berlin, Germany: Springer.

  • Atzmüller, M., Puppe, F., & Buscher, H.-P. (2005a). Exploiting background knowledge for knowledge-intensive subgroup discovery. In Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI-05), Edinburgh, UK (pp. 647–652). San Francisco: Morgan Kaufmann.

  • Atzmüller, M., Puppe, F., & Buscher, H.-P. (2005b). Profiling examiners using intelligent subgroup mining. In Proceedings of the 10th Workshop on Intelligent Data Analysis in Medicine and Pharmacology (IDAMAP-05) (pp. 46–51). Aberdeen: AIME.

  • Aumann, Y., & Lindell, Y. (1999). A statistical theory for quantitative association rules. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), San Diego, CA (pp. 261–270). New York: ACM.

  • Bay, S. D. (2000). Multivariate discretization of continuous variables for set mining. In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000), Boston (pp. 315–319). New York: ACM.

  • Bay, S. D., & Pazzani, M. J. (2001). Detecting group differences: Mining contrast sets. Data Mining and Knowledge Discovery, 5(3), 213–246.

  • Bayardo, R. J., Jr. (1998). Efficiently mining long patterns from databases. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data (SIGMOD-98), Seattle, WA (pp. 85–93). New York: ACM.

  • Boulesteix, A.-L., Tutz, G., & Strimmer, K. (2003). A CART-based approach to discover emerging patterns in microarray data. Bioinformatics, 19(18), 2465–2472.

  • Daly, O., & Taniar, D. (2005). Exception rules in data mining. In M. Khosrow-Pour (Ed.), Encyclopedia of information science and technology (Vol. II, pp. 1144–1148). Hershey, PA: Idea Group.

  • del Jesus, M. J., González, P., Herrera, F., & Mesonero, M. (2007). Evolutionary fuzzy rule induction process for subgroup discovery: A case study in marketing. IEEE Transactions on Fuzzy Systems, 15(4), 578–592.

  • Dong, G., & Li, J. (1999). Efficient mining of emerging patterns: Discovering trends and differences. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), San Diego, CA (pp. 43–52). New York: ACM.

  • Dong, G., Zhang, X., Wong, L., & Li, J. (1999). CAEP: Classification by aggregating emerging patterns. In Proceedings of the 2nd International Conference on Discovery Science (DS-99), Tokyo, Japan (pp. 30–42). Berlin, Germany/New York: Springer.

  • Fan, H., Fan, M., Ramamohanarao, K., & Liu, M. (2006). Further improving emerging pattern based classifiers via bagging. In Proceedings of the 10th Pacific-Asia conference on Knowledge Discovery and Data Mining (PAKDD-06), Singapore (pp. 91–96). Berlin, Germany/Heidelberg, Germany/New York: Springer.

  • Fan, H., & Ramamohanarao, K. (2003a). A Bayesian approach to use emerging patterns for classification. In Proceedings of the 14th Australasian Database Conference (ADC-03), Adelaide, SA (pp. 39–48). Darlinghurst, NSW: Australian Computer Society.

  • Fan, H., & Ramamohanarao, K. (2003b). Efficiently mining interesting emerging patterns. In Proceedings of the 4th International Conference on Web-Age Information Management (WAIM-03), Chengdu, China (pp. 189–201). Berlin, Germany/New York: Springer.

  • Friedman, J. H., & Fisher, N. I. (1999). Bump hunting in high-dimensional data. Statistics and Computing, 9(2), 123–143.

  • Gamberger, D., & Lavrač, N. (2002). Expert-guided subgroup discovery: Methodology and application. Journal of Artificial Intelligence Research, 17, 501–527.

  • Gamberger, D., Lavrač, N., & Wettschereck, D. (2002). Subgroup visualization: A method and application in population screening. In Proceedings of the 7th International Workshop on Intelligent Data Analysis in Medicine and Pharmacology (IDAMAP-02), Lyon, France (pp. 31–35). Lyon, France: ECAI.

  • Garriga, G. C., Kralj, P., & Lavrač, N. (2006). Closed sets for labeled data. In Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-06), Berlin, Germany (pp. 163–174). Berlin, Germany/New York: Springer.

  • Hilderman, R. J., & Peckham, T. (2005). A statistically sound alternative approach to mining contrast sets. In Proceedings of the 4th Australia Data Mining Conference (AusDM-05), Sydney, NSW (pp. 157–172).

  • Jenkole, J., Kralj, P., Lavrač, N., & Sluga, A. (2007). A data mining experiment on manufacturing shop floor data. In Proceedings of the 40th CIRP International Seminar on Manufacturing Systems. Liverpool, UK: University of Liverpool.

  • Kavšek, B., & Lavrač, N. (2006). Apriori-SD: Adapting association rule learning to subgroup discovery. Applied Artificial Intelligence, 20(7), 543–583.

  • Klösgen, W. (1996). Explora: A multipattern and multistrategy discovery assistant. In U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, & R. Uthurusamy (Eds.), Advances in knowledge discovery and data mining (pp. 249–271). Menlo Park, CA: AAAI. Chap. 10.

  • Klösgen, W., & May, M. (2002). Spatial subgroup mining integrated in an object-relational spatial database. In Proceedings of the 6th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-02) (pp. 275–286). Berlin, Germany/New York: Springer.

  • Klösgen, W., May, M., & Petch, J. (2003). Mining census data for spatial effects on mortality. Intelligent Data Analysis, 7(6), 521–540.

  • Kralj, P., Grubešič, A., Toplak, N., Gruden, K., Lavrač, N., & Garriga, G. C. (2006). Application of closed itemset mining for class labeled data in functional genomics. Informatica Medica Slovenica, 11(1), 40–45.

  • Kralj, P., Lavrač, N., Gamberger, D., & Krstačić, A. (2007a). Contrast set mining for distinguishing between similar diseases. In Proceedings of the 11th Conference on Artificial Intelligence in Medicine (AIME-07), Amsterdam (pp. 109–118). Berlin, Germany: Springer.

  • Kralj, P., Lavrač, N., Gamberger, D., & Krstačić, A. (2007b). Contrast set mining through subgroup discovery applied to brain ischaemia data. In Proceedings of the 11th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD-07), Nanjing, China (pp. 579–586). Berlin, Germany/New York: Springer.

  • Kralj, P., Lavrač, N., & Zupan, B. (2005). Subgroup visualization. In Proceedings of the 8th International Multiconference Information Society (IS-05), Ljubljana, Slovenia (pp. 228–231). Ljubljana, Slovenia: Institut Jožef Stefan.

  • Kralj Novak, P., Lavrač, N., & Webb, G. I. (2009). Supervised descriptive rule discovery: A unifying survey of contrast set, emerging pattern and subgroup mining. Journal of Machine Learning Research, 10, 377–403.

  • Lavrač, N., Cestnik, B., Gamberger, D., & Flach, P. A. (2004). Decision support through subgroup discovery: Three case studies and the lessons learned. Machine Learning, 57(1–2), 115–143. Special issue on Data Mining Lessons Learned.

  • Lavrač, N., Kavšek, B., Flach, P., & Todorovski, L. (2004). Subgroup discovery with CN2-SD. Journal of Machine Learning Research, 5, 153–188.

  • Lavrač, N., Kralj, P., Gamberger, D., & Krstačić, A. (2007). Supporting factors to improve the explanatory potential of contrast set mining: Analyzing brain ischaemia data. In Proceedings of the 11th Mediterranean Conference on Medical and Biological Engineering and Computing (MEDICON-07), Ljubljana, Slovenia (pp. 157–161). Berlin, Germany: Springer.

  • Li, J., Dong, G., & Ramamohanarao, K. (2000). Instance-based classification by emerging patterns. In Proceedings of the 14th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-2000), Lyon, France (pp. 191–200). Berlin, Germany/New York: Springer.

  • Li, J., Dong, G., & Ramamohanarao, K. (2001). Making use of the most expressive jumping emerging patterns for classification. Knowledge and Information Systems, 3(2), 1–29.

  • Li, J., Liu, H., Downing, J. R., Yeoh, A. E.-J., & Wong, L. (2003). Simple rules underlying gene expression profiles of more than six subtypes of acute lymphoblastic leukemia (ALL) patients. Bioinformatics, 19(1), 71–78.

  • Li, J., & Wong, L. (2002b). Identifying good diagnostic gene groups from gene expression profiles using the concept of emerging patterns. Bioinformatics, 18(10), 1406–1407.

  • Lin, J., & Keogh, E. (2006). Group SAX: Extending the notion of contrast sets to time series and multimedia data. In Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-06), Berlin, Germany (pp. 284–296). Berlin, Germany/New York: Springer.

  • Liu, B., Hsu, W., Han, H.-S., & Xia, Y. (2000). Mining changes for real-life applications. In Proceedings of the 2nd International Conference on Data Warehousing and Knowledge Discovery (DaWaK-2000), London (pp. 337–346). Berlin, Germany: Springer.

  • Liu, B., Hsu, W., & Ma, Y. (2001). Discovering the set of fundamental rule changes. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-01), San Francisco (pp. 335–340). New York: ACM.

  • May, M., & Ragia, L. (2002). Spatial subgroup discovery applied to the analysis of vegetation data. In Proceedings of the 4th International Conference on Practical Aspects of Knowledge Management (PAKM-2002), Vienna (pp. 49–61). Berlin, Germany/New York: Springer.

  • Simeon, M., & Hilderman, R. J. (2007). Exploratory quantitative contrast set mining: A discretization approach. In Proceedings of the 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI-07), Patras, Greece (Vol. 2, pp. 124–131). Los Alamitos, CA: IEEE.

  • Siu, K., Butler, S., Beveridge, T., Gillam, J., Hall, C., & Kaye, A., et al. (2005). Identifying markers of pathology in SAXS data of malignant tissues of the brain. Nuclear Instruments and Methods in Physics Research A, 548, 140–146.

  • Song, H. S., Kim, J. K., & Kim, S. H. (2001). Mining the change of customer behavior in an internet shopping mall. Expert Systems with Applications, 21(3), 157–168.

  • Soulet, A., Crémilleux, B., & Rioult, F. (2004). Condensed representation of emerging patterns. In Proceedings of the 8th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-04), Sydney, NSW (pp. 127–132). Berlin, Germany/New York: Springer.

  • Suzuki, E. (2006). Data mining methods for discovering interesting exceptions from an unsupervised table. Journal of Universal Computer Science, 12(6), 627–653.

  • Wang, K., Zhou, S., Fu, A. W.-C., & Yu, J. X. (2003). Mining changes of classification by correspondence tracing. In Proceedings of the 3rd SIAM International Conference on Data Mining (SDM-03) (pp. 95–106). Philadelphia: SIAM.

  • Webb, G. I. (1995). OPUS: An efficient admissible algorithm for unordered search. Journal of Artificial Intelligence Research, 5, 431–465.

  • Webb, G. I. (2001). Discovering associations with numeric variables. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-01), San Francisco (pp. 383–388). New York: ACM.

  • Webb, G. I. (2007). Discovering significant patterns. Machine Learning, 68(1), 1–33.

  • Webb, G. I., Butler, S. M., & Newlands, D. (2003). On detecting differences between groups. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-03), Washington, DC (pp. 256–265). New York: ACM.

  • Wettschereck, D. (2002). A KDDSE-independent PMML visualizer. In Proceedings of the 2nd Workshop on Integration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM-02) (pp. 150–155). Helsinki, Finland: Helsinki University.

  • Wong, T.-T., & Tseng, K.-L. (2005). Mining negative contrast sets from data with discrete attributes. Expert Systems with Applications, 29(2), 401–407.

  • Wrobel, S. (1997). An algorithm for multi-relational discovery of subgroups. In Proceedings of the 1st European Symposium on Principles of Data Mining and Knowledge Discovery (PKDD-97) (pp. 78–87). Berlin, Germany: Springer.

  • Wrobel, S. (2001). Inductive logic programming for knowledge discovery in databases. In S. Džeroski & N. Lavrač (Eds.), Relational data mining (pp. 74–101). Berlin, Germany/New York: Springer.

  • Železný, F., & Lavrač, N. (2006). Propositionalization-based relational subgroup discovery with RSD. Machine Learning, 62, 33–63.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Fürnkranz, J., Gamberger, D., Lavrač, N. (2012). Supervised Descriptive Rule Learning. In: Foundations of Rule Learning. Cognitive Technologies. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75197-7_11

  • DOI: https://doi.org/10.1007/978-3-540-75197-7_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75196-0

  • Online ISBN: 978-3-540-75197-7

  • eBook Packages: Computer Science, Computer Science (R0)
