Feature Selection and Sparse Learning

Chapter in: Machine Learning

Abstract

Watermelons can be described by many attributes, such as color, root, sound, texture, and surface, but an experienced person can judge ripeness from the root and sound alone. In other words, not all attributes are equally important for the learning task. In machine learning, attributes are also called features. Features that are useful for the current learning task are called relevant features, and those that are useless are called irrelevant features. The process of selecting a subset of relevant features from a given feature set is called feature selection.
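
To make the distinction between relevant and irrelevant features concrete, here is a minimal sketch (not from the chapter itself) of a filter-style selector in Python: each feature is scored by the absolute Pearson correlation between its values and the label, and only the k highest-scoring features are kept. The function name filter_select and the toy data are illustrative assumptions.

```python
import numpy as np

def filter_select(X, y, k):
    """Score each feature by |corr(feature, y)| and keep the top k.

    A filter-style criterion: features whose values co-vary strongly with
    the label are treated as relevant; the rest are dropped as irrelevant.
    """
    Xc = X - X.mean(axis=0)                 # center each feature column
    yc = y - y.mean()                       # center the label
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12
    scores = np.abs(Xc.T @ yc) / denom      # |Pearson correlation| per feature
    keep = np.sort(np.argsort(scores)[::-1][:k])  # indices of the k best features
    return keep, X[:, keep]

# Toy data: 100 samples, 5 features; only features 0 and 2 drive the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + 0.1 * rng.normal(size=100)
idx, X_sel = filter_select(X, y, k=2)
print(idx)  # expected: [0 2]
```

A filter criterion like this scores each feature independently of any learner, so it is cheap but can miss features that are relevant only in combination with others.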

Author information

Correspondence to Zhi-Hua Zhou.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Zhou, ZH. (2021). Feature Selection and Sparse Learning. In: Machine Learning. Springer, Singapore. https://doi.org/10.1007/978-981-15-1967-3_11

  • DOI: https://doi.org/10.1007/978-981-15-1967-3_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1966-6

  • Online ISBN: 978-981-15-1967-3

  • eBook Packages: Computer Science, Computer Science (R0)
