Abstract
In this chapter, we present a system that enhances the representational capabilities of decision and regression tree learning by extending it to first-order logic, i.e., relational representations as commonly used in Inductive Logic Programming. We describe an algorithm named Structural Classification and Regression Trees (S-Cart), which is capable of inducing first-order trees for both classification and regression problems, i.e., for the prediction of either discrete classes or numerical values. We arrive at this algorithm by a strategy called upgrading — we start from a propositional induction algorithm and turn it into a relational learner by devising suitable extensions of the representation language and the associated algorithms. In particular, we have upgraded Cart, the classical method for learning classification and regression trees, to handle relational examples and background knowledge. The system constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns either a discrete class or a numerical value to each leaf. In addition, we have extended the Cart methodology by adding linear regression models to the leaves of the trees; this does not have a counterpart in Cart, but was inspired by its approach to pruning. The regression variant of S-Cart is one of the few systems applicable to Relational Regression problems. Experiments in several real-world domains demonstrate that the approach is useful and competitive with existing methods, indicating that the advantage of relatively small and comprehensible models does not come at the expense of predictive accuracy.
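The idea described above can be sketched in a few lines of code. The following is an illustrative sketch only, not the authors' implementation: internal nodes test a literal (modelled here as a predicate over an example's ground facts) and leaves hold a one-variable linear model fitted by ordinary least squares, mirroring the model-tree extension of the Cart methodology. The facts, attribute names, and the `halogen` literal are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Example:
    facts: set          # ground facts, e.g. {"bond(c1, c2)"}
    attrs: dict         # numeric attributes, incl. the "target" value

@dataclass
class Leaf:
    slope: float
    intercept: float
    attr: str           # attribute used by the leaf's linear model
    def predict(self, ex: Example) -> float:
        return self.slope * ex.attrs[self.attr] + self.intercept

@dataclass
class Node:
    literal: Callable[[Example], bool]  # literal / conjunction test
    yes: object
    no: object
    def predict(self, ex: Example) -> float:
        return (self.yes if self.literal(ex) else self.no).predict(ex)

def fit_leaf(examples, attr):
    """Least-squares line target ~ attr over the examples in a leaf."""
    xs = [e.attrs[attr] for e in examples]
    ys = [e.attrs["target"] for e in examples]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return Leaf(slope, my - slope * mx, attr)

# Two partitions induced by the (hypothetical) literal halogen(M):
with_hal = [Example({"halogen(m)"}, {"logp": 1.0, "target": 3.0}),
            Example({"halogen(m)"}, {"logp": 2.0, "target": 5.0})]
without  = [Example(set(), {"logp": 1.0, "target": 10.0}),
            Example(set(), {"logp": 3.0, "target": 10.0})]

tree = Node(lambda ex: "halogen(m)" in ex.facts,
            yes=fit_leaf(with_hal, "logp"),
            no=fit_leaf(without, "logp"))

pred_yes = tree.predict(Example({"halogen(m)"}, {"logp": 3.0}))  # 2*3 + 1
pred_no  = tree.predict(Example(set(), {"logp": 3.0}))           # constant 10
```

In the actual system the literal tests are first-order queries evaluated against the background knowledge rather than simple set membership, and the split literal and leaf models are selected by search; this sketch only shows the resulting model structure.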
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
Cite this chapter
Kramer, S., Widmer, G. (2001). Inducing Classification and Regression Trees in First Order Logic. In: Džeroski, S., Lavrač, N. (eds) Relational Data Mining. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-04599-2_6
Print ISBN: 978-3-642-07604-6
Online ISBN: 978-3-662-04599-2