Feature Selection and Sparse Learning

Chapter in: Machine Learning

Abstract

Watermelons can be described by many attributes, such as color, root, sound, texture, and surface, but an experienced person can judge ripeness from the root and sound alone. In other words, not all attributes are equally important for the learning task. In machine learning, attributes are also called features. Features that are useful for the current learning task are called relevant features, and those that are useless are called irrelevant features. The process of selecting a subset of relevant features from a given feature set is called feature selection.
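
To make the distinction between relevant and irrelevant features concrete, here is a minimal sketch (not from the chapter itself) of a filter-style selector in Python: each feature is scored by the absolute Pearson correlation between its values and the label, and only the k highest-scoring features are kept. The function name filter_select and the toy data are illustrative assumptions.

```python
import numpy as np

def filter_select(X, y, k):
    """Score each feature by |corr(feature, y)| and keep the top k.

    A filter-style criterion: features whose values co-vary strongly with
    the label are treated as relevant; the rest are dropped as irrelevant.
    """
    Xc = X - X.mean(axis=0)                 # center each feature column
    yc = y - y.mean()                       # center the label
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12
    scores = np.abs(Xc.T @ yc) / denom      # |Pearson correlation| per feature
    keep = np.sort(np.argsort(scores)[::-1][:k])  # indices of the k best features
    return keep, X[:, keep]

# Toy data: 100 samples, 5 features; only features 0 and 2 drive the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + 0.1 * rng.normal(size=100)
idx, X_sel = filter_select(X, y, k=2)
print(idx)  # expected: [0 2]
```

A filter criterion like this scores each feature independently of any learner, so it is cheap but can miss features that are relevant only in combination with others.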

Author information

Correspondence to Zhi-Hua Zhou.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Zhou, ZH. (2021). Feature Selection and Sparse Learning. In: Machine Learning. Springer, Singapore. https://doi.org/10.1007/978-981-15-1967-3_11

  • DOI: https://doi.org/10.1007/978-981-15-1967-3_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1966-6

  • Online ISBN: 978-981-15-1967-3

  • eBook Packages: Computer Science, Computer Science (R0)
