Supervised Descriptive Rule Learning

Fürnkranz, Johannes; Gamberger, Dragan; Lavrač, Nada

doi:10.1007/978-3-540-75197-7_11

Chapter
First Online: 01 January 2012

Part of the book series: Cognitive Technologies (COGTECH)

Abstract

This chapter presents subgroup discovery (SD) and some related supervised descriptive rule induction techniques, including contrast set mining (CSM) and emerging pattern mining (EPM). These techniques are presented in a unifying framework named supervised descriptive rule learning. All of them aim to discover patterns in the form of rules induced from labeled data. The chapter contributes to the understanding of these techniques by presenting a unified terminology and by explaining the apparent differences between the learning tasks as variants of a single supervised descriptive rule learning task. It also shows that the various rule learning heuristics used in CSM, EPM, and SD algorithms all aim at optimizing a trade-off between rule coverage and precision.
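The coverage/precision trade-off mentioned in the abstract can be illustrated with a small sketch. The abstract does not name a specific heuristic; weighted relative accuracy (WRAcc), a measure commonly used in subgroup discovery, is chosen here as one example of such a trade-off. The function names and the confusion counts `p`, `n`, `P`, `N` (covered positives, covered negatives, all positives, all negatives) are assumptions made for illustration.

```python
def coverage(p, n, P, N):
    """Fraction of all examples that the rule covers."""
    return (p + n) / (P + N)


def precision(p, n):
    """Fraction of covered examples that belong to the target class."""
    return p / (p + n) if (p + n) > 0 else 0.0


def wracc(p, n, P, N):
    """Weighted relative accuracy: coverage times the gain in precision
    over the prior probability of the target class."""
    prior = P / (P + N)
    return coverage(p, n, P, N) * (precision(p, n) - prior)


# Toy data (hypothetical): 50 positive and 50 negative examples.
P, N = 50, 50
general_rule = wracc(p=40, n=10, P=P, N=N)   # broad, fairly precise
specific_rule = wracc(p=5, n=0, P=P, N=N)    # perfectly precise, tiny coverage
print(general_rule > specific_rule)
```

On this toy data the broad rule scores about 0.15 and the perfectly precise but narrow rule about 0.025, so a heuristic of this kind prefers the rule that balances both criteria rather than the one that maximizes precision alone.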

Notes

1. Parts of this chapter are based on Kralj Novak, Lavrač, and Webb (2009).
2. http://www.giwebb.com/
3. Jumping emerging patterns are emerging patterns with support zero in one dataset and greater than zero in the other dataset.
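
The definition in Note 3 can be checked mechanically. The following is a minimal sketch, assuming that a pattern is a set of items, that each dataset is a list of item sets, and that support is the fraction of item sets containing the pattern; the function names and the toy data are hypothetical.

```python
def support(pattern, dataset):
    """Fraction of transactions in the dataset that contain every item of the pattern."""
    if not dataset:
        return 0.0
    return sum(1 for transaction in dataset if pattern <= transaction) / len(dataset)


def is_jumping_emerging_pattern(pattern, dataset_1, dataset_2):
    """True if the pattern has zero support in one dataset and
    support greater than zero in the other."""
    s1 = support(pattern, dataset_1)
    s2 = support(pattern, dataset_2)
    return (s1 == 0 and s2 > 0) or (s2 == 0 and s1 > 0)


# Toy datasets (hypothetical), one per class.
d1 = [{"a", "b"}, {"a", "c"}]
d2 = [{"b", "c"}, {"c"}]

print(is_jumping_emerging_pattern(frozenset({"a"}), d1, d2))  # occurs only in d1
print(is_jumping_emerging_pattern(frozenset({"c"}), d1, d2))  # occurs in both
```

Here `{"a"}` appears only in the first dataset and so satisfies the definition, while `{"c"}` appears in both datasets and does not.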

References

Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., & Verkamo, A. I. (1995). Fast discovery of association rules. In U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, & R. Uthurusamy (Eds.), Advances in knowledge discovery and data mining (pp. 307–328). Menlo Park, CA: AAAI.
Atzmüller, M., & Puppe, F. (2005). Semi-automatic visual subgroup mining using VIKAMINE. Journal of Universal Computer Science, 11(11), 1752–1765. Special Issue on Visual Data Mining.
Atzmüller, M., & Puppe, F. (2006). SD-Map – A fast algorithm for exhaustive subgroup discovery. In Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-06), Berlin, Germany (pp. 6–17). Berlin, Germany: Springer.
Atzmüller, M., Puppe, F., & Buscher, H.-P. (2005a). Exploiting background knowledge for knowledge-intensive subgroup discovery. In Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI-05), Edinburgh, UK (pp. 647–652). San Francisco: Morgan Kaufmann.
Atzmüller, M., Puppe, F., & Buscher, H.-P. (2005b). Profiling examiners using intelligent subgroup mining. In Proceedings of the 10th Workshop on Intelligent Data Analysis in Medicine and Pharmacology (IDAMAP-05) (pp. 46–51). Aberdeen: AIME.
Aumann, Y., & Lindell, Y. (1999). A statistical theory for quantitative association rules. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), San Diego, CA (pp. 261–270). New York: ACM.
Bay, S. D. (2000). Multivariate discretization of continuous variables for set mining. In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000), Boston (pp. 315–319). New York: ACM.
Bay, S. D., & Pazzani, M. J. (2001). Detecting group differences: Mining contrast sets. Data Mining and Knowledge Discovery, 5(3), 213–246.
Bayardo, R. J., Jr. (1998). Efficiently mining long patterns from databases. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data (SIGMOD-98), Seattle, WA (pp. 85–93). New York: ACM.
Boulesteix, A.-L., Tutz, G., & Strimmer, K. (2003). A CART-based approach to discover emerging patterns in microarray data. Bioinformatics, 19(18), 2465–2472.
Daly, O., & Taniar, D. (2005). Exception rules in data mining. In M. Khosrow-Pour (Ed.), Encyclopedia of information science and technology (Vol. II, pp. 1144–1148). Hershey, PA: Idea Group.
del Jesus, M. J., González, P., Herrera, F., & Mesonero, M. (2007). Evolutionary fuzzy rule induction process for subgroup discovery: A case study in marketing. IEEE Transactions on Fuzzy Systems, 15(4), 578–592.
Dong, G., & Li, J. (1999). Efficient mining of emerging patterns: Discovering trends and differences. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), San Diego, CA (pp. 43–52). New York: ACM.
Dong, G., Zhang, X., Wong, L., & Li, J. (1999). CAEP: Classification by aggregating emerging patterns. In Proceedings of the 2nd International Conference on Discovery Science (DS-99), Tokyo, Japan (pp. 30–42). Berlin, Germany/New York: Springer.
Fan, H., Fan, M., Ramamohanarao, K., & Liu, M. (2006). Further improving emerging pattern based classifiers via bagging. In Proceedings of the 10th Pacific-Asia conference on Knowledge Discovery and Data Mining (PAKDD-06), Singapore (pp. 91–96). Berlin, Germany/Heidelberg, Germany/New York: Springer.
Fan, H., & Ramamohanarao, K. (2003a). A Bayesian approach to use emerging patterns for classification. In Proceedings of the 14th Australasian Database Conference (ADC-03), Adelaide, SA (pp. 39–48). Darlinghurst, NSW: Australian Computer Society.
Fan, H., & Ramamohanarao, K. (2003b). Efficiently mining interesting emerging patterns. In Proceedings of the 4th International Conference on Web-Age Information Management (WAIM-03), Chengdu, China (pp. 189–201). Berlin, Germany/New York: Springer.
Friedman, J. H., & Fisher, N. I. (1999). Bump hunting in high-dimensional data. Statistics and Computing, 9(2), 123–143.
Gamberger, D., & Lavrač, N. (2002). Expert-guided subgroup discovery: Methodology and application. Journal of Artificial Intelligence Research, 17, 501–527.
Gamberger, D., Lavrač, N., & Wettschereck, D. (2002). Subgroup visualization: A method and application in population screening. In Proceedings of the 7th International Workshop on Intelligent Data Analysis in Medicine and Pharmacology (IDAMAP-02), Lyon, France (pp. 31–35). Lyon, France: ECAI.
Garriga, G. C., Kralj, P., & Lavrač, N. (2006). Closed sets for labeled data. In Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-06), Berlin, Germany (pp. 163–174). Berlin, Germany/New York: Springer.
Hilderman, R. J., & Peckham, T. (2005). A statistically sound alternative approach to mining contrast sets. In Proceedings of the 4th Australia Data Mining Conference (AusDM-05), Sydney, NSW (pp. 157–172).
Jenkole, J., Kralj, P., Lavrač, N., & Sluga, A. (2007). A data mining experiment on manufacturing shop floor data. In Proceedings of the 40th CIRP International Seminar on Manufacturing Systems. Liverpool, UK: University of Liverpool.
Kavšek, B., & Lavrač, N. (2006). Apriori-SD: Adapting association rule learning to subgroup discovery. Applied Artificial Intelligence, 20(7), 543–583.
Klösgen, W. (1996). Explora: A multipattern and multistrategy discovery assistant. In U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, & R. Uthurusamy (Eds.), Advances in knowledge discovery and data mining (pp. 249–271). Menlo Park, CA: AAAI. Chap. 10.
Klösgen, W., & May, M. (2002). Spatial subgroup mining integrated in an object-relational spatial database. In Proceedings of the 6th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-02) (pp. 275–286). Berlin, Germany/New York: Springer.
Klösgen, W., May, M., & Petch, J. (2003). Mining census data for spatial effects on mortality. Intelligent Data Analysis, 7(6), 521–540.
Kralj, P., Grubešič, A., Toplak, N., Gruden, K., Lavrač, N., & Garriga, G. C. (2006). Application of closed itemset mining for class labeled data in functional genomics. Informatica Medica Slovenica, 11(1), 40–45.
Kralj, P., Lavrač, N., Gamberger, D., & Krstačić, A. (2007a). Contrast set mining for distinguishing between similar diseases. In Proceedings of the 11th Conference on Artificial Intelligence in Medicine (AIME-07), Amsterdam (pp. 109–118). Berlin, Germany: Springer.
Kralj, P., Lavrač, N., Gamberger, D., & Krstačić, A. (2007b). Contrast set mining through subgroup discovery applied to brain ischaemia data. In Proceedings of the 11th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD-07), Nanjing, China (pp. 579–586). Berlin, Germany/New York: Springer.
Kralj, P., Lavrač, N., & Zupan, B. (2005). Subgroup visualization. In Proceedings of the 8th International Multiconference Information Society (IS-05), Ljubljana, Slovenia (pp. 228–231). Ljubljana, Slovenia: Institut Jožef Stefan.
Kralj Novak, P., Lavrač, N., & Webb, G. I. (2009). Supervised descriptive rule discovery: A unifying survey of contrast set, emerging pattern and subgroup mining. Journal of Machine Learning Research, 10, 377–403.
Lavrač, N., Cestnik, B., Gamberger, D., & Flach, P. A. (2004). Decision support through subgroup discovery: Three case studies and the lessons learned. Machine Learning, 57(1–2), 115–143. Special issue on Data Mining Lessons Learned.
Lavrač, N., Kavšek, B., Flach, P., & Todorovski, L. (2004). Subgroup discovery with CN2-SD. Journal of Machine Learning Research, 5, 153–188.
Lavrač, N., Kralj, P., Gamberger, D., & Krstačić, A. (2007). Supporting factors to improve the explanatory potential of contrast set mining: Analyzing brain ischaemia data. In Proceedings of the 11th Mediterranean Conference on Medical and Biological Engineering and Computing (MEDICON-07), Ljubljana, Slovenia (pp. 157–161). Berlin, Germany: Springer.
Li, J., Dong, G., & Ramamohanarao, K. (2000). Instance-based classification by emerging patterns. In Proceedings of the 14th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-2000), Lyon, France (pp. 191–200). Berlin, Germany/New York: Springer.
Li, J., Dong, G., & Ramamohanarao, K. (2001). Making use of the most expressive jumping emerging patterns for classification. Knowledge and Information Systems, 3(2), 1–29.
Li, J., Liu, H., Downing, J. R., Yeoh, A. E.-J., & Wong, L. (2003). Simple rules underlying gene expression profiles of more than six subtypes of acute lymphoblastic leukemia (ALL) patients. Bioinformatics, 19(1), 71–78.
Li, J., & Wong, L. (2002b). Identifying good diagnostic gene groups from gene expression profiles using the concept of emerging patterns. Bioinformatics, 18(10), 1406–1407.
Lin, J., & Keogh, E. (2006). Group SAX: Extending the notion of contrast sets to time series and multimedia data. In Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-06), Berlin, Germany (pp. 284–296). Berlin, Germany/New York: Springer.
Liu, B., Hsu, W., Han, H.-S., & Xia, Y. (2000). Mining changes for real-life applications. In Proceedings of the 2nd International Conference on Data Warehousing and Knowledge Discovery (DaWaK-2000), London (pp. 337–346). Berlin, Germany: Springer.
Liu, B., Hsu, W., & Ma, Y. (2001). Discovering the set of fundamental rule changes. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-01), San Francisco (pp. 335–340). New York: ACM.
May, M., & Ragia, L. (2002). Spatial subgroup discovery applied to the analysis of vegetation data. In Proceedings of the 4th International Conference on Practical Aspects of Knowledge Management (PAKM-2002), Vienna (pp. 49–61). Berlin, Germany/New York: Springer.
Simeon, M., & Hilderman, R. J. (2007). Exploratory quantitative contrast set mining: A discretization approach. In Proceedings of the 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI-07), Patras, Greece (Vol. 2, pp. 124–131). Los Alamitos, CA: IEEE.
Siu, K., Butler, S., Beveridge, T., Gillam, J., Hall, C., Kaye, A., et al. (2005). Identifying markers of pathology in SAXS data of malignant tissues of the brain. Nuclear Instruments and Methods in Physics Research A, 548, 140–146.
Song, H. S., Kim, J. K., & Kim, S. H. (2001). Mining the change of customer behavior in an internet shopping mall. Expert Systems with Applications, 21(3), 157–168.
Soulet, A., Crémilleux, B., & Rioult, F. (2004). Condensed representation of emerging patterns. In Proceedings of the 8th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-04), Sydney, NSW (pp. 127–132). Berlin, Germany/New York: Springer.
Suzuki, E. (2006). Data mining methods for discovering interesting exceptions from an unsupervised table. Journal of Universal Computer Science, 12(6), 627–653.
Wang, K., Zhou, S., Fu, A. W.-C., & Yu, J. X. (2003). Mining changes of classification by correspondence tracing. In Proceedings of the 3rd SIAM International Conference on Data Mining (SDM-03) (pp. 95–106). Philadelphia: SIAM.
Webb, G. I. (1995). OPUS: An efficient admissible algorithm for unordered search. Journal of Artificial Intelligence Research, 5, 431–465.
Webb, G. I. (2001). Discovering associations with numeric variables. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-01), San Francisco (pp. 383–388). New York: ACM.
Webb, G. I. (2007). Discovering significant patterns. Machine Learning, 68(1), 1–33.
Webb, G. I., Butler, S. M., & Newlands, D. (2003). On detecting differences between groups. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-03), Washington, DC (pp. 256–265). New York: ACM.
Wettschereck, D. (2002). A KDDSE-independent PMML visualizer. In Proceedings of the 2nd Workshop on Integration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM-02) (pp. 150–155). Helsinki, Finland: Helsinki University.
Wong, T.-T., & Tseng, K.-L. (2005). Mining negative contrast sets from data with discrete attributes. Expert Systems with Applications, 29(2), 401–407.
Wrobel, S. (1997). An algorithm for multi-relational discovery of subgroups. In Proceedings of the 1st European Symposium on Principles of Data Mining and Knowledge Discovery (PKDD-97) (pp. 78–87). Berlin, Germany: Springer.
Wrobel, S. (2001). Inductive logic programming for knowledge discovery in databases. In S. Džeroski & N. Lavrač (Eds.), Relational data mining (pp. 74–101). Berlin, Germany/New York: Springer.
Železný, F., & Lavrač, N. (2006). Propositionalization-based relational subgroup discovery with RSD. Machine Learning, 62, 33–63.

Author information

Authors and Affiliations

FB Informatik, TU Darmstadt, Darmstadt, Germany
Johannes Fürnkranz
Rudjer Bošković Institute, Zagreb, Croatia
Dragan Gamberger
Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia
Nada Lavrač

About this chapter

Cite this chapter

Fürnkranz, J., Gamberger, D., Lavrač, N. (2012). Supervised Descriptive Rule Learning. In: Foundations of Rule Learning. Cognitive Technologies. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75197-7_11

DOI: https://doi.org/10.1007/978-3-540-75197-7_11
Published: 27 September 2012
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75196-0
Online ISBN: 978-3-540-75197-7
eBook Packages: Computer Science, Computer Science (R0)
