The Study of Dynamic Aggregation of Relational Attributes on Relational Data Mining

  • Rayner Alfred
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4632)


Most aggregation functions are limited to either categorical or numerical values, but not both. In this paper, we define three concepts of aggregation function and introduce a novel method to aggregate multiple instances that consist of both categorical and numerical values. We show how these concepts can be implemented using clustering techniques. In our experiments, we discretize continuous values before applying the aggregation function to relational datasets. The empirical results demonstrate that our transformation approach, which uses clustering techniques to aggregate multiple instances of an attribute's values, can compete with existing multi-relational techniques such as Progol and Tilde. In addition, we evaluate the effect of the number of discretization intervals on classification performance.
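The discretize-then-aggregate idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say which discretization scheme is used, so equal-width binning is assumed here, and the row layout `(individual_id, category, number)` and the function names `equal_width_bins` and `aggregate` are hypothetical. Each individual's multiple instances are summarized as counts of mixed "category|bin" patterns, which gives one vector per individual covering both the categorical and the numerical attribute.

```python
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in [0, n_bins - 1].

    Equal-width binning is an assumption made for this sketch.
    """
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / n_bins or 1.0
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def aggregate(rows, n_bins):
    """Summarize each individual's instances as a bag of patterns.

    rows: list of (individual_id, category, number) tuples
          (a hypothetical schema chosen for illustration).
    Returns {individual_id: Counter mapping 'category|bN' to its count}.
    """
    bins = equal_width_bins([number for _, _, number in rows], n_bins)
    bags = {}
    for (individual, category, _), b in zip(rows, bins):
        bags.setdefault(individual, Counter())[f"{category}|b{b}"] += 1
    return bags
```

The resulting per-individual count vectors could then be passed to any clustering algorithm, so that the cluster label acts as the aggregated value for that individual. Changing `n_bins` changes the pattern vocabulary, which is the kind of effect the abstract says is evaluated.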


Data Summarization Multiple Instance Aggregation Clustering Relational data Mining 





Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Rayner Alfred
    • 1
  1. Universiti Malaysia Sabah, School of Engineering and Information Technology, 88999, Kota Kinabalu, Sabah, Malaysia
