Journal of Intelligent Information Systems, Volume 42, Issue 2, pp 255–281

A method for reduction of examples in relational learning

  • Ondřej Kuželka
  • Andrea Szabóová
  • Filip Železný

Abstract

Feature selection methods often improve the performance of attribute-value learning. We explore whether, in relational learning as well, examples in the form of clauses can be reduced in size to speed up learning without affecting the learned hypothesis. To this end, we introduce the notion of safe reduction: a safely reduced example cannot be distinguished from the original example under the given hypothesis language bias. Next, we consider the particular, rather permissive bias of bounded-treewidth clauses. We show that under this hypothesis bias, examples of arbitrary treewidth can be reduced efficiently. We evaluate our approach on four data sets with the popular system Aleph and the state-of-the-art relational learner nFOIL. On all four data sets, the reduction makes learning faster in the case of nFOIL, with an order-of-magnitude speedup on one of the data sets, and more accurate in the case of Aleph.
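To make the idea of reducing an example without changing what any hypothesis can see more concrete, the following is a minimal sketch of classic θ-reduction, assuming coverage is tested by θ-subsumption. It is not the bounded-treewidth procedure developed in the article; it only illustrates the simplest special case, where the reduced example is θ-equivalent to the original and is therefore indistinguishable from it for every hypothesis. The function names, the tuple encoding of literals, and the convention that variables start with an uppercase letter are all assumptions made for this illustration.

```python
from itertools import product


def is_var(term):
    # Assumed convention: variables start with an uppercase letter.
    return term[:1].isupper()


def subsumes(c, d):
    """Brute-force theta-subsumption: is there a substitution theta
    of the variables of c such that c*theta is a subset of d?
    Terms of d are treated as fixed symbols. Exponential in the
    number of variables, so suitable for tiny clauses only."""
    c = list(c)
    d = set(d)
    variables = sorted({t for lit in c for t in lit[1:] if is_var(t)})
    terms = sorted({t for lit in d for t in lit[1:]})
    for image in product(terms, repeat=len(variables)):
        theta = dict(zip(variables, image))
        if all((lit[0],) + tuple(theta.get(t, t) for t in lit[1:]) in d
               for lit in c):
            return True
    return False


def theta_reduce(clause):
    """Drop literals one at a time while the clause still
    theta-subsumes the smaller clause (so both stay equivalent)."""
    reduced = set(clause)
    changed = True
    while changed:
        changed = False
        for lit in sorted(reduced):
            smaller = reduced - {lit}
            if subsumes(reduced, smaller):
                reduced = smaller
                changed = True
                break
    return reduced


# Hypothetical example: a literal is a tuple (predicate, arg1, arg2, ...).
example = {("edge", "A", "B"), ("edge", "A", "C"), ("red", "B")}
reduced = theta_reduce(example)
print(sorted(reduced))  # [('edge', 'A', 'B'), ('red', 'B')]

# A hypothesis covers the reduced example exactly when it covers the original.
hypothesis = {("edge", "X", "Y"), ("red", "Y")}
print(subsumes(hypothesis, example), subsumes(hypothesis, reduced))  # True True
```

Here `edge(A, C)` is dropped because mapping `C` to `B` folds it onto `edge(A, B)`. The abstract's notion of safe reduction is relative to a hypothesis language bias, so it may permit removing more literals than this bias-independent sketch does.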

Keywords

Relational learning, Feature selection, Bounded treewidth


Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Ondřej Kuželka (1)
  • Andrea Szabóová (1)
  • Filip Železný (1)

  1. Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
