Abstract
This paper proposes a supervised kernel-free quadratic surface regression method for feature selection (QSR-FS). The method fits a quadratic function to each class and incorporates it into a least squares loss function. An \(l_{2,1}\)-norm regularization term is introduced to obtain a sparse solution, and a feature weight vector is constructed from the coefficients of the quadratic functions across all classes to quantify the importance of each feature. An alternating iteration algorithm is designed to solve the resulting optimization problem. The computational complexity of the algorithm is analyzed, and the iterative formula is reformulated to further accelerate computation. In the experiments, feature selection and downstream classification tasks are performed on eight datasets from different domains, and the results are analyzed using relevant evaluation metrics. Furthermore, interpretability of the selected features and parameter sensitivity analyses are provided. The experimental results demonstrate the feasibility and effectiveness of our method.
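The core components described above (a least squares fit with \(l_{2,1}\)-norm regularization solved by alternating iteration, plus kernel-free quadratic feature surfaces) can be illustrated with a minimal sketch. The function names, the iterative-reweighting solver, and the per-feature score aggregation are illustrative assumptions for a simplified setting, not the authors' implementation:

```python
import numpy as np

def l21_feature_scores(X, Y, lam=1.0, n_iter=50, eps=1e-8):
    """Sketch of l2,1-regularized least squares for feature scoring.

    Approximately minimizes ||X W - Y||_F^2 + lam * ||W||_{2,1} by
    alternating between a closed-form solve for W and an update of a
    diagonal reweighting matrix D (the standard iterative-reweighting
    scheme for the l2,1 norm). The row norms of W serve as feature weights.
    """
    d = X.shape[1]
    D = np.eye(d)
    W = np.zeros((d, Y.shape[1]))
    for _ in range(n_iter):
        # Closed-form update of W given the current reweighting matrix D
        W = np.linalg.solve(X.T @ X + lam * D, X.T @ Y)
        # Update D from the row norms of W; eps guards against division by zero
        row_norms = np.sqrt((W ** 2).sum(axis=1)) + eps
        D = np.diag(1.0 / (2.0 * row_norms))
    return np.sqrt((W ** 2).sum(axis=1))  # feature weight vector

def quadratic_features(X):
    """Augment X with squares and pairwise products, i.e. the monomials of a
    kernel-free quadratic surface (the paper's per-class construction and its
    mapping of coefficients back to original features differ in detail)."""
    cols = [X] + [X[:, i:i + 1] * X[:, i:] for i in range(X.shape[1])]
    return np.hstack(cols)
```

In this simplified sketch, features would be ranked by `l21_feature_scores` applied to the (optionally quadratically augmented) data matrix; the row-wise \(l_{2,1}\) penalty drives entire rows of the coefficient matrix toward zero, which is what makes the row norms usable as feature importance weights.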
Data Availability
The Vehicle, Segmentation, Control, and WebKB wc datasets used in this study are available from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/index.php. The USPS, ORL, and UMIST datasets used in this study are available at https://paperswithcode.com/. The Corel 5k dataset used in this study is available at https://www.kaggle.com/.
Funding
This work is supported by the National Natural Science Foundation of China (No. 12061071).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical Statement
We affirm that this manuscript is wholly our own work. Except for the cited references, it does not include any previously published or written research results. There are no other authors on this document, and we take full legal responsibility for this declaration.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, C., Yang, Z., Ye, J. et al. Supervised Feature Selection via Quadratic Surface Regression with \(l_{2,1}\)-Norm Regularization. Ann. Data. Sci. 11, 647–675 (2024). https://doi.org/10.1007/s40745-024-00518-3