Propositionalization Approaches to Relational Data Mining

Relational Data Mining

Abstract

This chapter surveys methods that transform a relational representation of a learning problem into a propositional (feature-based, attribute-value) representation. This kind of representation change is known as propositionalization. With this approach, feature construction can be decoupled from model construction. It has been shown that in many relational data mining applications this can be done without loss of predictive performance. After reviewing both general-purpose and domain-dependent propositionalization approaches from the literature, the chapter describes an extension to the Linus propositionalization method that overcomes the system’s earlier inability to deal with non-determinate local variables.
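As a rough illustration of the representation change described above, the following Python sketch flattens a toy one-to-many relation into a single attribute-value table. It is not the Linus extension or any other method from the chapter. The train and car data, the function name `propositionalize`, and the feature names of the form `has_car_<attribute>_<value>` are all hypothetical and chosen only for this example. Each Boolean feature asks whether some car of a train satisfies a test, which is one simple way to handle a local variable (the car) that can have several bindings per example.

```python
from typing import Any, Dict, List

# Hypothetical toy relational data: each train owns a variable number of cars.
trains: List[Dict[str, Any]] = [
    {"train_id": "t1", "eastbound": True},
    {"train_id": "t2", "eastbound": False},
]
cars: List[Dict[str, Any]] = [
    {"train_id": "t1", "shape": "rectangle", "load": "circle"},
    {"train_id": "t1", "shape": "ellipse", "load": "triangle"},
    {"train_id": "t2", "shape": "rectangle", "load": "triangle"},
]


def propositionalize(trains: List[Dict[str, Any]],
                     cars: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Build one attribute-value row per train from the train/car relation."""
    # Every (attribute, value) pair seen in the cars table becomes a candidate test.
    tests = sorted({(attr, value)
                    for car in cars
                    for attr, value in car.items()
                    if attr != "train_id"})
    rows = []
    for train in trains:
        own_cars = [c for c in cars if c["train_id"] == train["train_id"]]
        row: Dict[str, Any] = {"train_id": train["train_id"],
                               "n_cars": len(own_cars)}
        for attr, value in tests:
            # Existential feature: true if at least one car of this train passes the test.
            row[f"has_car_{attr}_{value}"] = any(c[attr] == value for c in own_cars)
        row["eastbound"] = train["eastbound"]  # class label, copied unchanged
        rows.append(row)
    return rows


table = propositionalize(trains, cars)
for row in table:
    print(row)
```

The resulting rows have a fixed set of columns, so they could be passed to any propositional learner. This is the sense in which feature construction is decoupled from model construction.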

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kramer, S., Lavrač, N., Flach, P. (2001). Propositionalization Approaches to Relational Data Mining. In: Džeroski, S., Lavrač, N. (eds) Relational Data Mining. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-04599-2_11

  • DOI: https://doi.org/10.1007/978-3-662-04599-2_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-07604-6

  • Online ISBN: 978-3-662-04599-2

  • eBook Packages: Springer Book Archive
