Artificial Intelligence Review, Volume 11, Issue 1, pp 227–253

Context-Sensitive Feature Selection for Lazy Learners

  • Pedro Domingos
Article

DOI: 10.1023/A:1006508722917

Cite this article as:
Domingos, P. Artificial Intelligence Review (1997) 11: 227. doi:10.1023/A:1006508722917

Abstract

High sensitivity to irrelevant features is arguably the main shortcoming of simple lazy learners. In response to it, many feature selection methods have been proposed, including forward sequential selection (FSS) and backward sequential selection (BSS). Although they often produce substantial improvements in accuracy, these methods select the same set of relevant features everywhere in the instance space, and thus represent only a partial solution to the problem. In general, some features will be relevant only in some parts of the space; deleting them may hurt accuracy in those parts, but selecting them will have the same effect in parts where they are irrelevant. This article introduces RC, a new feature selection algorithm that uses a clustering-like approach to select sets of locally relevant features (i.e., the features it selects may vary from one instance to another). Experiments in a large number of domains from the UCI repository show that RC almost always improves accuracy with respect to FSS and BSS, often with high significance. A study using artificial domains confirms the hypothesis that this difference in performance is due to RC's context sensitivity, and also suggests conditions where this sensitivity will and will not be an advantage. Another feature of RC is that it is faster than FSS and BSS, often by an order of magnitude or more.
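The forward sequential selection (FSS) baseline discussed above can be sketched as a greedy wrapper around a simple lazy learner. The sketch below is an illustrative assumption, not the paper's exact experimental setup: it scores candidate feature subsets by leave-one-out accuracy of a 1-nearest-neighbor classifier, starting from the empty set and adding whichever feature most improves accuracy until no addition helps.

```python
import numpy as np

def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of 1-NN restricted to the given feature subset."""
    if not features:
        return 0.0
    Xf = X[:, sorted(features)].astype(float)
    correct = 0
    for i in range(len(X)):
        # Squared Euclidean distance from instance i to every other instance
        d = np.sum((Xf - Xf[i]) ** 2, axis=1)
        d[i] = np.inf  # exclude the instance itself
        correct += int(y[int(np.argmin(d))] == y[i])
    return correct / len(X)

def forward_sequential_selection(X, y):
    """Greedy FSS: repeatedly add the single feature that most improves
    leave-one-out 1-NN accuracy; stop when no feature helps."""
    selected, best_acc = set(), 0.0
    improved = True
    while improved:
        improved = False
        for f in range(X.shape[1]):
            if f in selected:
                continue
            acc = loo_1nn_accuracy(X, y, selected | {f})
            if acc > best_acc:
                best_acc, best_f, improved = acc, f, True
        if improved:
            selected.add(best_f)
    return sorted(selected), best_acc
```

On a toy dataset where only the first feature predicts the class, FSS selects that feature and discards the noisy one. Note that the selected subset is fixed for the whole instance space, which is exactly the limitation RC addresses: backward sequential selection (BSS) works the same way but starts from the full feature set and greedily removes features.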

Keywords: lazy learning, feature selection, nearest neighbor, induction, machine learning

Copyright information

© Kluwer Academic Publishers 1997

Authors and Affiliations

  • Pedro Domingos
    1. Department of Information and Computer Science, University of California, Irvine, U.S.A.
