Pattern Analysis & Applications, Volume 5, Issue 2, pp 102–112

A Data Complexity Analysis of Comparative Advantages of Decision Forest Constructors

  • Tin Kam Ho

Abstract:

Using a number of measures for characterising the complexity of classification problems, we studied the comparative advantages of two methods for constructing decision forests – bootstrapping and random subspaces. We investigated a collection of 392 two-class problems from the UCI depository, and observed that there are strong correlations between the classifier accuracies and measures of length of class boundaries, thickness of the class manifolds, and nonlinearities of decision boundaries. We found characteristics of both difficult and easy cases where combination methods are no better than single classifiers. Also, we observed that the bootstrapping method is better when the training samples are sparse, and the subspace method is better when the classes are compact and the boundaries are smooth.
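The two forest constructors compared above differ in what they randomise for each tree: bootstrapping (bagging) resamples the training rows, while the random subspace method selects a subset of the feature columns. The sketch below illustrates only that sampling step, not the paper's experimental setup. The function names, the subspace size `k`, and the toy dimensions are assumptions made for illustration.

```python
import random


def bootstrap_rows(n_rows, rng):
    """Bagging: draw n_rows training-row indices with replacement.

    Some rows repeat and others are left out, so each tree sees a
    different resampling of the same training set.
    """
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def random_subspace(n_features, k, rng):
    """Random subspace method: choose k distinct feature indices.

    Each tree is trained on all rows but only on these k features.
    The value of k is an illustrative assumption here.
    """
    return sorted(rng.sample(range(n_features), k))


# Toy dimensions (hypothetical): 10 training rows, 8 features.
rng = random.Random(0)
rows = bootstrap_rows(10, rng)
feats = random_subspace(8, 4, rng)
```

In a full decision forest, one tree would be trained per draw and the trees' outputs combined, for example by voting.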

Key words: Bagging; Classifier combination; Data complexity; Decision forest; Decision tree; Random subspace method 

Copyright information

© Springer-Verlag London Limited 2002

Authors and Affiliations

  • Tin Kam Ho (1)
  1. Bell Laboratories, Lucent Technologies, Murray Hill, NJ, USA
