, Volume 11, Issue 1-5, pp 193-225

The Racing Algorithm: Model Selection for Lazy Learners

Rent the article at a discount

Rent now

* Final gross prices may vary according to local VAT.

Get Access

Abstract

Given a set of models and some training data, we would like to find the model that best describes the data. Finding the model with the lowest generalization error is a computationally expensive process, especially if the number of testing points is high or if the number of models is large. Optimization techniques such as hill climbing or genetic algorithms are helpful but can end up with a model that is arbitrarily worse than the best one or cannot be used because there is no distance metric on the space of discrete models. In this paper we develop a technique called “racing” that tests the set of models in parallel, quickly discards those models that are clearly inferior and concentrates the computational effort on differentiating among the better models. Racing is especially suitable for selecting among lazy learners since training requires negligible expense, and incremental testing using leave-one-out cross validation is efficient. We use racing to select among various lazy learning algorithms and to find relevant features in applications ranging from robot juggling to lesion detection in MRI scans.