Abstract
We consider the problem of constructing decision trees for entity identification from a given table. The input is a table containing information about a set of entities over a fixed set of attributes. The goal is to construct a decision tree that identifies each entity unambiguously by testing the attribute values such that the average number of tests is minimized. The previously best known approximation ratio for this problem was O(log² N). In this paper, we present a new greedy heuristic that yields an improved approximation ratio of O(log N).
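To make the objective concrete, the following minimal sketch builds a multiway decision tree over a small table and reports the average number of tests per entity. The table, the function names, and the splitting rule (choose the attribute whose largest branch is smallest) are assumptions made for illustration only; the abstract does not describe the paper's greedy heuristic, and this sketch should not be read as that algorithm.

```python
from collections import defaultdict

def split(table, entities, attr):
    """Group entity ids by their value on attribute index `attr`."""
    groups = defaultdict(list)
    for e in entities:
        groups[table[e][attr]].append(e)
    return groups

def build_tree(table, entities, attrs):
    """Return a leaf (an entity id) or a node (attr, {value: subtree})."""
    if len(entities) == 1:
        return entities[0]
    # Illustrative rule (an assumption, NOT the paper's heuristic):
    # test the attribute whose largest branch is smallest.
    best = min(
        attrs,
        key=lambda a: max(len(g) for g in split(table, entities, a).values()),
    )
    groups = split(table, entities, best)
    if len(groups) == 1:
        # No attribute separates these entities, so no tree can identify them.
        raise ValueError("entities cannot be distinguished by the attributes")
    return (best, {v: build_tree(table, g, attrs) for v, g in groups.items()})

def leaf_depths(tree, depth=0):
    """Map each entity id to the number of tests needed to reach it."""
    if not isinstance(tree, tuple):
        return {tree: depth}
    _, children = tree
    depths = {}
    for subtree in children.values():
        depths.update(leaf_depths(subtree, depth + 1))
    return depths

def average_tests(tree):
    """Average number of tests over all entities (uniform weights)."""
    depths = leaf_depths(tree)
    return sum(depths.values()) / len(depths)

# Hypothetical table: entity id -> (colour, size, shape).
table = {
    "e1": ("red", "small", "round"),
    "e2": ("red", "large", "round"),
    "e3": ("blue", "small", "square"),
    "e4": ("green", "small", "round"),
}
tree = build_tree(table, list(table), attrs=[0, 1, 2])
print(average_tests(tree))  # 1.5
```

Here the root tests colour, which isolates e3 and e4 after one test; e1 and e2 need a second test on size, giving an average of (2 + 2 + 1 + 1) / 4 = 1.5 tests.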
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chakaravarthy, V.T., Pandit, V., Roy, S., Sabharwal, Y. (2009). Approximating Decision Trees with Multiway Branches. In: Albers, S., Marchetti-Spaccamela, A., Matias, Y., Nikoletseas, S., Thomas, W. (eds) Automata, Languages and Programming. ICALP 2009. Lecture Notes in Computer Science, vol 5555. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02927-1_19
DOI: https://doi.org/10.1007/978-3-642-02927-1_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02926-4
Online ISBN: 978-3-642-02927-1
eBook Packages: Computer Science (R0)