
Feature selection with MCP\(^2\) regularization

Original Article · Neural Computing and Applications

Abstract

Feature selection, as a fundamental component of building robust models, plays an important role in many machine learning and data mining tasks. Recently, with the development of sparsity research, both theoretical and empirical studies have suggested that sparsity is an intrinsic property of real-world data, and sparsity regularization has been applied successfully to feature selection models. In view of the remarkable performance of non-convex regularization, in this paper we propose a novel non-convex yet Lipschitz-continuous sparsity regularization term, named MCP\(^2\), and apply it to feature selection. To solve the resulting non-convex model, we also develop a new algorithm within the framework of the ConCave–Convex Procedure (CCCP). Experimental results on benchmark datasets demonstrate the effectiveness of the proposed method.
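The full model and the exact definition of the MCP\(^2\) term are in the paywalled full text; what follows is a minimal sketch of the ingredients the abstract names, not the paper's actual formulation. The starting point is the classical minimax concave penalty (MCP) of Zhang (2010), which for a scalar \(t\) and parameters \(\lambda > 0\), \(\gamma > 1\) reads

\[
\rho_{\lambda,\gamma}(t) =
\begin{cases}
\lambda |t| - \dfrac{t^{2}}{2\gamma}, & |t| \le \gamma\lambda,\\[4pt]
\dfrac{\gamma\lambda^{2}}{2}, & |t| > \gamma\lambda.
\end{cases}
\]

This penalty is non-convex but Lipschitz continuous, and it decomposes as a difference of convex functions, \(\rho_{\lambda,\gamma}(t) = \lambda|t| - h_{\lambda,\gamma}(t)\), where \(h_{\lambda,\gamma}(t) = \lambda|t| - \rho_{\lambda,\gamma}(t)\) is convex and continuously differentiable. That is exactly the structure CCCP exploits: at each outer iteration the concave part is linearized at the current iterate, and the remaining convex \(\ell_1\)-regularized subproblem is solved. The sketch below applies this recipe to MCP-penalized least squares with a simple proximal-gradient inner solver; the function names and the least-squares loss are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def grad_h(w, lam, gamma):
    # Gradient of the convex part h in the DC split rho = lam*|t| - h(t);
    # h'(t) = t/gamma for |t| <= gamma*lam, and lam*sign(t) otherwise.
    return np.where(np.abs(w) <= gamma * lam, w / gamma, lam * np.sign(w))

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def cccp_mcp_ls(X, y, lam=0.1, gamma=3.0, outer=20, inner=200):
    # Hypothetical CCCP sketch for
    #   min_w 0.5*||Xw - y||^2 + sum_j rho_{lam,gamma}(w_j):
    # freeze the concave part at w^(k), then solve the convex
    # l1-regularized subproblem by proximal gradient (ISTA).
    w = np.zeros(X.shape[1])
    L = np.linalg.norm(X, 2) ** 2  # Lipschitz constant of the smooth part
    for _ in range(outer):
        c = grad_h(w, lam, gamma)  # linearization of the concave part at w^(k)
        for _ in range(inner):
            grad = X.T @ (X @ w - y) - c
            w = soft_threshold(w - grad / L, lam / L)
    return w
```

With exact inner solves, each outer step never increases the objective (the standard CCCP monotonicity argument), so the iterates approach a stationary point of the non-convex problem.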


Notes

  1. http://featureselection.asu.edu/datasets.php.

  2. http://www.kasrl.org/jaffe_info.html.


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grants 91546201, 71331005, 11671379 and 11331012.

Author information

Correspondence to Lingfeng Niu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


Cite this article

Shi, Y., Miao, J. & Niu, L. Feature selection with MCP\(^2\) regularization. Neural Comput & Applic 31, 6699–6709 (2019). https://doi.org/10.1007/s00521-018-3500-7

