Mining Model Trees: A Multi-relational Approach

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2835)

Abstract

In many data mining tools that support regression tasks, training data are stored in a single table containing both the target field (dependent variable) and the attributes (independent variables). Generally, only intra-tuple relationships between the attributes and the target field are found, while inter-tuple relationships are not considered and inter-table relationships, which link tuples of distinct tables, cannot be explored at all. Disregarding inter-table relationships can be a severe limitation in many real-world applications that involve the prediction of numerical values from data that are naturally organized in a relational model involving several tables (multi-relational model). In this paper, we present a new data mining algorithm, named Mr-SMOTI, which induces model trees from a multi-relational model. A model tree is a tree-structured prediction model whose leaves are associated with multiple linear regression models. The particular feature of Mr-SMOTI is that internal nodes of the induced model tree can be of two types: regression nodes, which add a variable to some multiple linear models according to a stepwise strategy, and split nodes, which perform tests on attributes or the join condition and eventually partition the training set. The induced model tree is a multi-relational pattern that can be represented by means of selection graphs, which can be translated into SQL or, equivalently, into first-order logic expressions.
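
To make the two node types concrete, the following Python fragment is a minimal sketch of how a tree with regression nodes and split nodes could be represented and used for prediction. It is not the Mr-SMOTI implementation: the class names (`Leaf`, `RegressionNode`, `SplitNode`), their fields, the `predict` function, and the simple additive way the regression terms are combined are all assumptions made for illustration. The sketch also works on a single flat row and leaves out the multi-relational part (tests on join conditions, selection graphs).

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    """Terminal node: no further term is added to the model."""


@dataclass
class RegressionNode:
    """Adds one linear term, intercept + coef * row[variable] (hypothetical form)."""
    variable: str
    coef: float
    intercept: float
    child: "Node"


@dataclass
class SplitNode:
    """Tests row[variable] <= threshold and routes the row to one branch."""
    variable: str
    threshold: float
    left: "Node"   # followed when the test is true
    right: "Node"  # followed when the test is false


Node = Union[Leaf, RegressionNode, SplitNode]


def predict(node: Node, row: Dict[str, float]) -> float:
    """Walk from the root to a leaf, summing the terms of the regression nodes met."""
    total = 0.0
    while not isinstance(node, Leaf):
        if isinstance(node, RegressionNode):
            total += node.intercept + node.coef * row[node.variable]
            node = node.child
        else:  # SplitNode: choose a branch, contribute nothing to the sum
            node = node.left if row[node.variable] <= node.threshold else node.right
    return total


# Illustrative tree with made-up coefficients: a regression node at the root
# acts on every example, while the one below the split acts on a subset only.
tree = RegressionNode("x1", coef=2.0, intercept=1.0, child=SplitNode(
    "x2", threshold=5.0,
    left=RegressionNode("x3", coef=0.5, intercept=0.0, child=Leaf()),
    right=Leaf(),
))

print(predict(tree, {"x1": 3.0, "x2": 4.0, "x3": 10.0}))  # 12.0
print(predict(tree, {"x1": 3.0, "x2": 6.0, "x3": 10.0}))  # 7.0
```

In this sketch, a variable introduced by a regression node near the root contributes to the models of all leaves below it, whereas one introduced after a split contributes only to the leaves of that branch.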

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Appice, A., Ceci, M., Malerba, D. (2003). Mining Model Trees: A Multi-relational Approach. In: Horváth, T., Yamamoto, A. (eds) Inductive Logic Programming. ILP 2003. Lecture Notes in Computer Science, vol 2835. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39917-9_3

  • DOI: https://doi.org/10.1007/978-3-540-39917-9_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20144-1

  • Online ISBN: 978-3-540-39917-9

  • eBook Packages: Springer Book Archive
