A Data Complexity Analysis of Comparative Advantages of Decision Forest Constructors
Using a number of measures for characterising the complexity of classification problems, we studied the comparative advantages of two methods for constructing decision forests – bootstrapping and random subspaces. We investigated a collection of 392 two-class problems from the UCI repository, and observed strong correlations between classifier accuracy and measures of the length of the class boundary, the thickness of the class manifolds, and the nonlinearity of the decision boundary. We identified characteristics of both the difficult and the easy cases in which the combination methods are no better than single classifiers. We also observed that the bootstrapping method is better when the training samples are sparse, and the subspace method is better when the classes are compact and the boundaries are smooth.
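The two forest constructors compared in the study differ only in how each tree's training set is formed: bootstrapping resamples the training examples with replacement, while the random subspace method keeps all examples but projects them onto a random subset of features. The following is a minimal sketch of that difference (an illustrative implementation, not the authors' code; the function names and toy data are ours):

```python
import random

def bootstrap_sample(data, rng):
    """Bootstrapping (bagging): draw n examples with replacement,
    so each tree sees a perturbed version of the full training set."""
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]

def random_subspace(data, k, rng):
    """Random subspaces: keep every example, but restrict each tree
    to k randomly chosen feature dimensions."""
    d = len(data[0][0])
    feats = sorted(rng.sample(range(d), k))
    projected = [([x[j] for j in feats], y) for x, y in data]
    return projected, feats

rng = random.Random(0)
# Toy two-class data: 3 features per example, binary label.
data = [([float(i), float(i % 3), float(i % 5)], i % 2) for i in range(10)]

boot = bootstrap_sample(data, rng)
sub, feats = random_subspace(data, k=2, rng=rng)

print(len(boot))                  # same size as the original; duplicates likely
print(len(sub), len(sub[0][0]))  # all 10 examples kept, only 2 features each
```

In a forest, one such sample (or projection) would be drawn independently per tree, and the trees' votes combined; the study's point is that which perturbation works better depends on measurable properties of the data, such as boundary length and class-manifold thickness.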