Propositionalization Approaches to Relational Data Mining

  • Stefan Kramer
  • Nada Lavrač
  • Peter Flach
Chapter

Abstract

This chapter surveys methods that transform a relational representation of a learning problem into a propositional (feature-based, attribute-value) representation. This kind of representation change is known as propositionalization. With such an approach, feature construction can be decoupled from model construction. It has been shown that in many relational data mining applications this can be done without loss of predictive performance. After reviewing both general-purpose and domain-dependent propositionalization approaches from the literature, the chapter describes an extension to the Linus propositionalization method that overcomes the system’s earlier inability to deal with non-determinate local variables.
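
As a rough illustration of what such a representation change involves, the following minimal Python sketch flattens a small two-table relational dataset into one attribute-value row per example. It is not the Linus algorithm or any method from the chapter; the `trains` and `cars` data, the `propositionalize` function, and the chosen features (a count, existential tests, a maximum) are all hypothetical and assumed purely for illustration. The cars relation is one-to-many (a train may have several cars), so each feature summarizes the whole set of related cars.

```python
from collections import defaultdict

# Hypothetical relational data: the examples (trains with a class label)
# and a one-to-many relation linking each train to its cars.
trains = {"t1": "east", "t2": "west"}
cars = [
    # (train_id, shape, wheels)
    ("t1", "rectangle", 2),
    ("t1", "ellipse", 3),
    ("t2", "rectangle", 2),
]


def propositionalize(trains, cars):
    """Return one fixed-length feature row per train (illustrative only)."""
    # Group the related cars by the train they belong to.
    by_train = defaultdict(list)
    for train_id, shape, wheels in cars:
        by_train[train_id].append((shape, wheels))

    # The set of shapes seen in the data defines the existential features.
    shapes = sorted({shape for _, shape, _ in cars})

    rows = {}
    for train_id in trains:
        own = by_train[train_id]
        row = {"n_cars": len(own)}  # aggregate feature: number of cars
        for shape in shapes:
            # Existential feature: does some car of this train have the shape?
            row[f"has_{shape}_car"] = any(s == shape for s, _ in own)
        # Aggregate feature over a numeric attribute (0 if the train has no cars).
        row["max_wheels"] = max((w for _, w in own), default=0)
        rows[train_id] = row
    return rows


table = propositionalize(trains, cars)
for train_id, row in table.items():
    print(train_id, trains[train_id], row)
```

Once the rows are built, any propositional learner can be trained on them together with the class labels, which is the sense in which feature construction is decoupled from model construction.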

Keywords

  • Background Knowledge
  • Inductive Logic
  • Inductive Logic Programming
  • Feature Construction
  • Propositional Representation

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Stefan Kramer (1)
  • Nada Lavrač (2)
  • Peter Flach (3)
  1. Machine Learning and Natural Language Processing Lab, Institute for Computer Science, Albert-Ludwigs University Freiburg, Freiburg i. Br., Germany
  2. Jožef Stefan Institute, Ljubljana, Slovenia
  3. Department of Computer Science, University of Bristol, Bristol, UK