Reducing Examples in Relational Learning with Bounded-Treewidth Hypotheses

  • Ondřej Kuželka
  • Andrea Szabóová
  • Filip Železný
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7765)

Abstract

Feature selection methods often improve the performance of attribute-value learning. We explore whether, analogously, examples in relational learning, which take the form of clauses, can be reduced in size to speed up learning without affecting the learned hypothesis. To this end, we introduce the notion of safe reduction: a safely reduced example cannot be distinguished from the original example under the given hypothesis language bias. We then consider the particular, rather permissive bias of bounded-treewidth clauses. We show that under this bias, examples of arbitrary treewidth can be reduced efficiently. The bounded-treewidth bias can also be replaced by other assumptions, such as acyclicity, with similar benefits. We evaluate our approach on four data sets with the popular system Aleph and the state-of-the-art relational learner nFOIL. On all four data sets, our reduction makes learning with nFOIL faster, with an order-of-magnitude speed-up on one of them, and makes Aleph more accurate.
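To make the idea of safe reduction concrete, the following Python sketch (not part of the paper) illustrates the bias-free special case: a fact may be dropped from an example whenever the original example still maps homomorphically into the smaller one, so that no clause can distinguish the two under theta-subsumption. The brute-force homomorphism test used here is exponential; the paper's point is that under the bounded-treewidth bias a comparable reduction can be computed in polynomial time, which this sketch does not attempt. The function names `homomorphism_exists` and `reduce_example` are illustrative only.

```python
from itertools import product

def homomorphism_exists(src, dst):
    """Brute-force check for a homomorphism from the facts in `src` into the
    facts in `dst`: try every mapping of src's constants to dst's constants."""
    src_consts = sorted({a for fact in src for a in fact[1:]})
    dst_consts = sorted({a for fact in dst for a in fact[1:]})
    dst_set = set(dst)
    for image in product(dst_consts, repeat=len(src_consts)):
        theta = dict(zip(src_consts, image))
        if all((p, *[theta[a] for a in args]) in dst_set for (p, *args) in src):
            return True
    return False

def reduce_example(example):
    """Greedy literal removal: drop a fact if the current example still maps
    homomorphically into the smaller one; the two are then equivalent under
    theta-subsumption for every clause (no language bias assumed)."""
    reduced = list(example)
    changed = True
    while changed:
        changed = False
        for fact in list(reduced):
            candidate = [f for f in reduced if f != fact]
            if homomorphism_exists(reduced, candidate):
                reduced = candidate
                changed = True
    return reduced

if __name__ == "__main__":
    # Toy example with symmetric bond/2 facts: the path a-b-c folds onto a-b.
    example = [("bond", "a", "b"), ("bond", "b", "a"),
               ("bond", "b", "c"), ("bond", "c", "b")]
    print(reduce_example(example))
```

On the toy input, the path a-b-c collapses to the single bond between a and b, which no clause can tell apart from the original example under theta-subsumption.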



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Ondřej Kuželka (1)
  • Andrea Szabóová (1)
  • Filip Železný (1)
  1. Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
