Abstract
Conventionally, decision trees are learned using a greedy approach, beginning at the root and moving toward the leaves. At each internal node, the feature that yields the best data split is chosen based on a metric like information gain. This process can be regarded as evaluating the quality of the best depth-one subtree. To address the shortsightedness of this method, one can generalize it to greater depths. Lookahead trees have demonstrated strong performance in situations with high feature interaction or low signal-to-noise ratios. They constitute a good trade-off between optimal decision trees and purely greedy decision trees. Currently, there are no readily available tools for constructing these lookahead trees, and their computational cost can be significantly higher than that of purely greedy ones. In this study, we introduce an efficient implementation of lookahead decision trees, specifically LGDT, by adapting a recently introduced algorithmic concept from the MurTree approach to find optimal decision trees of depth two. Additionally, we utilize an efficient reversible sparse bitset data structure to store the filtered examples while expanding the tree nodes in a depth-first-search manner. Experiments on state-of-the-art datasets demonstrate that our implementation offers remarkable computation-time performance.
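To make the lookahead idea in the abstract concrete, the sketch below illustrates the criterion on binary features: instead of scoring a candidate split by the impurity of its two children, it scores the split by the error of the best bounded-depth subtree each child admits, so that `lookahead_depth=1` recovers greedy induction and `lookahead_depth=2` corresponds to depth-two lookahead. This is a naive brute-force illustration, not the LGDT implementation described in the paper; the function names (`leaf_error`, `best_subtree_error`, `lookahead_split`) are ours, and LGDT instead relies on MurTree's specialized depth-two algorithm for efficiency.

```python
import numpy as np

def leaf_error(y):
    """Misclassifications made by predicting the majority class of y."""
    if len(y) == 0:
        return 0
    return len(y) - np.bincount(y).max()

def best_subtree_error(X, y, depth):
    """Error of the best subtree of at most `depth` splits, by brute force
    over binary features. depth=1 is the classic greedy criterion; depth=2
    corresponds to the depth-two lookahead discussed in the abstract."""
    best = leaf_error(y)                          # option: predict a leaf right away
    if depth == 0 or len(y) == 0:
        return best
    for f in range(X.shape[1]):
        left, right = X[:, f] == 0, X[:, f] == 1
        err = (best_subtree_error(X[left], y[left], depth - 1)
               + best_subtree_error(X[right], y[right], depth - 1))
        best = min(best, err)
    return best

def lookahead_split(X, y, lookahead_depth=2):
    """Pick the feature whose children admit the best subtrees of depth
    lookahead_depth - 1; lookahead_depth=1 recovers greedy induction."""
    scores = []
    for f in range(X.shape[1]):
        left, right = X[:, f] == 0, X[:, f] == 1
        scores.append(best_subtree_error(X[left], y[left], lookahead_depth - 1)
                      + best_subtree_error(X[right], y[right], lookahead_depth - 1))
    best_f = int(np.argmin(scores))
    return best_f, scores[best_f]
```

The abstract also mentions a reversible sparse bitset for maintaining the set of examples reaching a node during the depth-first expansion. The following is a minimal sketch of such a structure, assuming the general design of Demeulenaere et al. (64-bit words, a sparse index of the non-empty words, and a trail for undoing intersections on backtrack); the class and method names are illustrative, not those of the authors' code.

```python
class ReversibleSparseBitset:
    """Illustrative reversible sparse bitset: word values can be intersected
    while descending a branch and restored cheaply when backtracking."""

    WORD = 64

    def __init__(self, n_bits):
        n_words = (n_bits + self.WORD - 1) // self.WORD
        self.words = [(1 << self.WORD) - 1] * n_words
        if n_bits % self.WORD:                    # clear the unused high bits
            self.words[-1] = (1 << (n_bits % self.WORD)) - 1
        self.non_empty = list(range(n_words))     # active prefix = non-empty words
        self.size = n_words
        self.trail = []                           # (word index, previous value)

    def save(self):
        """Mark the current state before expanding a child node."""
        return (len(self.trail), self.size)

    def intersect(self, mask_words):
        """Keep only the examples whose bit is set in `mask_words`."""
        i = 0
        while i < self.size:
            w = self.non_empty[i]
            new = self.words[w] & mask_words[w]
            if new != self.words[w]:
                self.trail.append((w, self.words[w]))
                self.words[w] = new
            if new == 0:                          # swap empty word out of the prefix
                self.size -= 1
                self.non_empty[i], self.non_empty[self.size] = \
                    self.non_empty[self.size], self.non_empty[i]
            else:
                i += 1

    def restore(self, mark):
        """Backtrack: undo every change recorded after `mark`."""
        trail_len, size = mark
        while len(self.trail) > trail_len:
            w, old = self.trail.pop()
            self.words[w] = old
        self.size = size

    def count(self):
        """Number of examples currently in the node."""
        return sum(bin(self.words[self.non_empty[i]]).count("1")
                   for i in range(self.size))
```

A depth-first construction would then call `mark = bitset.save()` and `bitset.intersect(feature_mask)` before recursing into a child, and `bitset.restore(mark)` when returning to the parent.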
Notes
- 1. All the formulas in this section can easily be adapted to multi-class contexts.
References
Aghaei, S., Gómez, A., Vayanos, P.: Strong optimal classification trees. arXiv preprint arXiv:2103.15965 (2021)
Aglin, G., Nijssen, S., Schaus, P.: Learning optimal decision trees using caching branch-and-bound search. Proc. AAAI 34, 3146–3153 (2020)
Bertsimas, D., Dunn, J.: Optimal classification trees. Mach. Learn. 106, 1039–1082 (2017)
Boutilier, J., Michini, C., Zhou, Z.: Shattering inequalities for learning optimal decision trees. In: International Conference on Integration of Constraint Programming, Artificial Intelligence, and Operations Research, pp. 74–90 (2022)
Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and regression trees. Wadsworth Int. Group 37, 237–251 (1984)
Burdick, D., Calimlim, M., Gehrke, J.: MAFIA: a maximal frequent itemset algorithm for transactional databases. In: Proceedings of the 17th International Conference on Data Engineering, pp. 443–452 (2001)
Demeulenaere, J., et al.: Compact-table: efficiently filtering table constraints with reversible sparse bit-sets. In: Rueher, M. (ed.) CP 2016. LNCS, vol. 9892, pp. 207–223. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-44953-1_14
Demirović, E., et al.: MurTree: optimal decision trees via dynamic programming and search. J. Mach. Learn. Res. 23, 1–47 (2022)
Dolan, E., Moré, J.: Benchmarking optimization software with performance profiles. Math. Program. 91, 201–213 (2002)
Donick, D., Lera, S.: Uncovering feature interdependencies in high-noise environments with stepwise lookahead decision forests. Sci. Rep. 11, 9238 (2021)
Esmeir, S., Markovitch, S.: Lookahead-based algorithms for anytime induction of decision trees. In: Proceedings of the Twenty-First International Conference on Machine Learning, p. 33 (2004)
Holsheimer, M., Kersten, M., Mannila, H., Toivonen, H.: A perspective on databases and data mining. In: KDD, vol. 95, pp. 150–155 (1995)
Iba, W., Langley, P.: Induction of one-level decision trees. Mach. Learn. Proc. 1992, 233–240 (1992)
Lin, J., Zhong, C., Hu, D., Rudin, C., Seltzer, M.: Generalized and scalable optimal sparse decision trees. In: ICML, pp. 6150–6160 (2020)
Narodytska, N., Ignatiev, A., Pereira, F., Marques-Silva, J., Ras, I.: Learning optimal decision trees with SAT. In: IJCAI, pp. 1362–1368 (2018)
Nijssen, S., Fromont, E.: Mining optimal decision trees from itemset lattices. In: KDD, pp. 530–539 (2007)
Norton, S.: Generating better decision trees. In: IJCAI, vol. 89, pp. 800–805 (1989)
Quinlan, J.: C4.5: Programs for Machine Learning. Elsevier (2014)
Ragavan, H., Rendell, L.: Lookahead feature construction for learning hard concepts. In: ICML (1993)
Verhaeghe, H., Nijssen, S., Pesant, G., Quimper, C., Schaus, P.: Learning optimal decision trees using constraint programming. Constraints 25, 226–250 (2020)
Verwer, S., Zhang, Y.: Learning optimal classification trees using a binary linear program formulation. Proc. AAAI 33, 1625–1632 (2019)
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kiossou, H., Schaus, P., Nijssen, S., Aglin, G. (2024). Efficient Lookahead Decision Trees. In: Miliou, I., Piatkowski, N., Papapetrou, P. (eds) Advances in Intelligent Data Analysis XXII. IDA 2024. Lecture Notes in Computer Science, vol 14642. Springer, Cham. https://doi.org/10.1007/978-3-031-58553-1_11
DOI: https://doi.org/10.1007/978-3-031-58553-1_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-58555-5
Online ISBN: 978-3-031-58553-1
eBook Packages: Computer Science (R0)