Refining Aggregate Conditions in Relational Learning

  • Celine Vens
  • Jan Ramon
  • Hendrik Blockeel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4213)


In relational learning, predictions for an individual are based not only on its own properties but also on the properties of a set of related individuals. Many systems use aggregates to summarize this set. Features thus introduced compare the result of an aggregate function to a threshold. We consider the case where the set to be aggregated is generated by a complex query and present a framework for refining such complex aggregate conditions along three dimensions: the aggregate function, the query used to generate the set, and the threshold value. The proposed aggregate refinement operator allows a more efficient search through the hypothesis space and thus can be beneficial for many relational learners that use aggregates. As an example application, we have implemented the refinement operator in a relational decision tree induction system. Experimental results show a significant efficiency gain in comparison with the use of a less advanced refinement operator.
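To make the three refinement dimensions concrete, the following is a minimal illustrative sketch, not the authors' refinement operator. It models an aggregate condition as "aggregate(values selected by a query) >= threshold" and changes one dimension at a time. The class name `AggregateCondition`, the helper functions, and the transaction data are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]  # one related individual (hypothetical schema)


def count(values: List[float]) -> float:
    """Number of selected values; 0 for an empty set."""
    return float(len(values))


def safe_max(values: List[float]) -> float:
    """Maximum of the selected values; -inf for an empty set (an assumption)."""
    return max(values, default=float("-inf"))


@dataclass(frozen=True)
class AggregateCondition:
    """Hypothetical feature: aggregate(attribute of rows matching query) >= threshold."""
    aggregate: Callable[[List[float]], float]
    query: Callable[[Row], bool]
    attribute: str
    threshold: float

    def holds(self, related: Iterable[Row]) -> bool:
        values = [float(row[self.attribute]) for row in related if self.query(row)]
        return self.aggregate(values) >= self.threshold


def is_debit(row: Row) -> bool:
    return row["type"] == "debit"


# Made-up set of individuals related to one example.
transactions = [
    {"amount": 50, "type": "credit"},
    {"amount": 200, "type": "debit"},
    {"amount": 300, "type": "debit"},
]

# Start condition: at least 2 related transactions of any kind.
base = AggregateCondition(count, lambda row: True, "amount", 2)

# Dimension 1: refine the query that generates the set.
by_query = replace(base, query=is_debit)

# Dimension 2: refine the threshold.
by_threshold = replace(by_query, threshold=3)

# Dimension 3: change the aggregate function (with a matching threshold).
by_function = replace(by_query, aggregate=safe_max, threshold=250)

print(base.holds(transactions))          # True  (count = 3)
print(by_query.holds(transactions))      # True  (count = 2)
print(by_threshold.holds(transactions))  # False (count = 2 < 3)
print(by_function.holds(transactions))   # True  (max = 300)
```

In this sketch, restricting the query can only shrink the selected set, so a `count(...) >= threshold` condition can only become harder to satisfy; the same holds when the threshold is raised. This kind of monotonicity is what makes a structured search over such conditions possible, although how the paper exploits it is not shown here.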


Keywords: Generalize Average; Monotonicity Property; Start Condition; Aggregate Condition; Inductive Logic
These keywords were added by machine and not by the authors.



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Celine Vens (1)
  • Jan Ramon (1)
  • Hendrik Blockeel (1)
  1. Department of Computer Science, Katholieke Universiteit Leuven, Leuven, Belgium
