Data Summarization Approach to Relational Domain Learning Based on Frequent Pattern to Support the Development of Decision Making

  • Rayner Alfred
  • Dimitar Kazakov
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4093)


A new approach is needed to handle huge datasets stored in multiple tables in a very large database. Data mining and Knowledge Discovery in Databases (KDD) promise to play a crucial role in the way people interact with databases, especially decision support databases, where analysis and exploration operations are essential. In this paper, we review related work in Relational Data Mining and define the basic notions of data mining for decision support and the types of data aggregation used to categorize or summarize data. We then present a novel approach to relational domain learning that supports the development of decision making models by automating the construction of a hierarchical multi-attribute model for decision making. We describe how a relational dataset can naturally be handled to support the construction of such a model by using relational aggregation based on the distance between patterns. We present a prototype of “Dynamic Aggregation of Relational Attributes” (henceforth DARA) that is capable of supporting the construction of a hierarchical multi-attribute model for decision making. Experiments in a multi-relational domain show a higher percentage of correctly classified instances, and we illustrate a set of rules extracted from the relational domains to support decision making.
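The abstract does not spell out how aggregation based on pattern distance works, so the following is only a rough sketch of the general idea, not the authors' DARA implementation. It assumes that the rows linked to each target record in a one-to-many relation can be reduced to a bag of pattern labels, and that two records are compared with cosine distance over those bags. The data, the function names `summarize` and `cosine_distance`, and the choice of cosine distance are all hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical one-to-many data: each target record (an integer id) owns
# several rows in a non-target table, each row reduced to a pattern label.
rows = [
    (1, "loan=yes"), (1, "card=gold"), (1, "loan=yes"),
    (2, "loan=yes"), (2, "card=gold"),
    (3, "loan=no"), (3, "card=classic"),
]

def summarize(rows):
    """Collapse the rows of each target record into a pattern-frequency bag."""
    bags = {}
    for key, pattern in rows:
        bags.setdefault(key, Counter())[pattern] += 1
    return bags

def cosine_distance(a, b):
    """Return 1 - cosine similarity between two pattern-frequency bags."""
    dot = sum(a[p] * b[p] for p in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: an empty bag is maximally distant
    return 1.0 - dot / (norm_a * norm_b)

bags = summarize(rows)
d12 = cosine_distance(bags[1], bags[2])  # records sharing both patterns
d13 = cosine_distance(bags[1], bags[3])  # records sharing no pattern
```

Under these assumptions, records 1 and 2 come out much closer to each other than records 1 and 3, so a distance-based clustering step could place them in the same group and use the group label as a single summarized attribute of the target table.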


Inductive Logic Programming; Relational Domain; Relational Learning; Dynamic Aggregation; Multiple Table
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Rayner Alfred (1, 2)
  • Dimitar Kazakov (1)
  1. Computer Science Department, University of York, Heslington, United Kingdom
  2. School of Engineering and Information Technology, Universiti Malaysia Sabah, Kota Kinabalu, Malaysia (on study leave)
