Context-Sensitive Feature Selection for Lazy Learners


Abstract

High sensitivity to irrelevant features is arguably the main shortcoming of simple lazy learners. In response to it, many feature selection methods have been proposed, including forward sequential selection (FSS) and backward sequential selection (BSS). Although they often produce substantial improvements in accuracy, these methods select the same set of relevant features everywhere in the instance space, and thus represent only a partial solution to the problem. In general, some features will be relevant only in some parts of the space; deleting them may hurt accuracy in those parts, but selecting them will have the same effect in parts where they are irrelevant. This article introduces RC, a new feature selection algorithm that uses a clustering-like approach to select sets of locally relevant features (i.e., the features it selects may vary from one instance to another). Experiments in a large number of domains from the UCI repository show that RC almost always improves accuracy with respect to FSS and BSS, often with high significance. A study using artificial domains confirms the hypothesis that this difference in performance is due to RC's context sensitivity, and also suggests conditions where this sensitivity will and will not be an advantage. Another feature of RC is that it is faster than FSS and BSS, often by an order of magnitude or more.
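
The abstract's contrast between global and locally relevant features is easy to see concretely. Below is a minimal, hypothetical sketch (in Python) of the FSS baseline it mentions: a greedy wrapper around a 1-NN lazy learner scored by leave-one-out accuracy, run on a toy domain in which feature relevance depends on context. The function names, the scoring choice, and the synthetic data are illustrative assumptions, not the paper's experimental code, and the RC algorithm itself is not reproduced here.

```python
# Minimal sketch: forward sequential selection (FSS) wrapped around a
# 1-NN lazy learner, scored by leave-one-out (LOO) accuracy.
# All names and the toy domain below are illustrative assumptions,
# not the paper's RC algorithm or its experimental code.
import random


def one_nn_predict(train, x, feats):
    """Classify x by its nearest neighbour, using only the features in feats."""
    best_label, best_dist = None, float("inf")
    for xi, yi in train:
        d = sum((xi[f] - x[f]) ** 2 for f in feats)
        if d < best_dist:
            best_label, best_dist = yi, d
    return best_label


def loo_accuracy(data, feats):
    """Leave-one-out accuracy of 1-NN restricted to feats."""
    if not feats:
        return 0.0
    hits = sum(
        one_nn_predict(data[:i] + data[i + 1:], x, feats) == y
        for i, (x, y) in enumerate(data)
    )
    return hits / len(data)


def forward_selection(data, n_feats):
    """Greedy FSS: add one feature at a time while LOO accuracy improves."""
    selected, best_acc = set(), 0.0
    improved = True
    while improved:
        improved, best_f = False, None
        for f in range(n_feats):
            if f in selected:
                continue
            acc = loo_accuracy(data, selected | {f})
            if acc > best_acc:
                best_acc, best_f, improved = acc, f, True
        if improved:
            selected.add(best_f)
    return selected, best_acc


# Toy domain (hypothetical): feature 0 acts as a context switch.
# Feature 1 determines the class only when x0 < 0.5, feature 2 only
# when x0 >= 0.5, and feature 3 is pure noise everywhere.
random.seed(0)
data = []
for _ in range(200):
    x = [random.random() for _ in range(4)]
    y = int(x[1] > 0.5) if x[0] < 0.5 else int(x[2] > 0.5)
    data.append((x, y))

feats, acc = forward_selection(data, n_feats=4)
print("globally selected features:", sorted(feats), "LOO accuracy: %.3f" % acc)
```

On a domain like this, each of x1 and x2 is relevant in only half of the instance space, so a single global subset must either include both (keeping each as noise in the region where it is irrelevant) or drop one (hurting accuracy where it matters). A context-sensitive selector such as the RC algorithm described above can instead select x1 for queries in one region and x2 in the other.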




Cite this article

Domingos, P. Context-Sensitive Feature Selection for Lazy Learners. Artificial Intelligence Review 11, 227–253 (1997). https://doi.org/10.1023/A:1006508722917
