Subgroup Discovery with Linguistic Rules

  • María José del Jesus
  • Pedro González
  • Francisco Herrera

Abstract

Subgroup discovery is a form of supervised inductive learning: given a population of individuals and a property of those individuals that is of interest, the task is to find population subgroups whose distributional characteristics with respect to that property are the most unusual. Subgroup discovery algorithms aim to discover individual rules, which must be represented in explicit symbolic form and must be simple and understandable enough for potential users to recognize them as actionable.

A fuzzy approach to subgroup discovery, which uses linguistic variables with linguistic terms in descriptive fuzzy rules, yields knowledge expressed in a way similar to human reasoning. Linguistic rules are naturally suited to handling linguistic knowledge and to producing more interpretable and actionable solutions. This chapter analyzes the use of linguistic rules for modelling this problem and presents a genetic extraction model for learning rules of this kind.
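To make the notion of a linguistic rule concrete, the following is a minimal sketch, not the chapter's method. It assumes triangular membership functions for the linguistic terms, the minimum t-norm for conjunction, and a fuzzy analogue of coverage, confidence, and an unusualness score (coverage times the gain in confidence over the class prior). The variables `age` and `income`, the labels, and the toy data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triangle:
    """Triangular fuzzy set with feet a and c and peak b (assumed shape)."""
    a: float
    b: float
    c: float

    def __call__(self, x: float) -> float:
        if x == self.b:
            return 1.0
        if x <= self.a or x >= self.c:
            return 0.0
        if x < self.b:
            return (x - self.a) / (self.b - self.a)
        return (self.c - x) / (self.c - self.b)


# Hypothetical linguistic terms on the domain [0, 100].
LOW = Triangle(0, 0, 50)
HIGH = Triangle(50, 100, 100)


def antecedent_degree(example, rule):
    """Degree to which an example matches the rule (minimum t-norm)."""
    return min(label(example[var]) for var, label in rule.items())


def evaluate(rule, examples):
    """Return fuzzy coverage, confidence, and unusualness of a rule."""
    degrees = [antecedent_degree(x, rule) for x, _ in examples]
    covered = sum(degrees)
    covered_pos = sum(d for d, (_, y) in zip(degrees, examples) if y)
    n = len(examples)
    prior = sum(1 for _, y in examples if y) / n
    coverage = covered / n
    confidence = covered_pos / covered if covered > 0 else 0.0
    unusualness = coverage * (confidence - prior)
    return coverage, confidence, unusualness


# IF age is High AND income is Low THEN target (hypothetical rule).
rule = {"age": HIGH, "income": LOW}

# Toy data: (attribute values, has the property of interest).
examples = [
    ({"age": 75, "income": 25}, True),
    ({"age": 100, "income": 0}, True),
    ({"age": 50, "income": 50}, False),
    ({"age": 75, "income": 0}, False),
]

coverage, confidence, unusualness = evaluate(rule, examples)
```

On this toy data the rule has coverage 0.5, confidence 0.75, and unusualness 0.125: the subgroup it describes contains the property of interest more often than the population as a whole. A genetic extraction model would search over rules of this form, but that search is not sketched here.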

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

