
Supervised Feature Selection via Quadratic Surface Regression with \(l_{2,1}\)-Norm Regularization

Annals of Data Science

Abstract

This paper proposes a supervised kernel-free quadratic surface regression method for feature selection (QSR-FS). The method finds a quadratic function for each class and incorporates it into a least squares loss function. An \(l_{2,1}\)-norm regularization term is introduced to obtain a sparse solution, and a feature weight vector is constructed from the coefficients of the quadratic functions of all classes to explain the importance of each feature. An alternating iteration algorithm is designed to solve the resulting optimization problem. The computational complexity of the algorithm is analyzed, and the iterative formula is reformulated to further accelerate computation. In the experiments, feature selection and downstream classification tasks are performed on eight datasets from different domains, and the results are analyzed using relevant evaluation indices. Feature selection interpretability and parameter sensitivity analyses are also provided. The experimental results demonstrate the feasibility and effectiveness of our method.
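
To make the pipeline in the abstract concrete: writing \(\Phi\) for an explicit quadratic feature map (each sample augmented with its pairwise products, so no kernel is needed) and \(Y\) for one-hot class targets, a model of the kind described minimizes \(\Vert \Phi W - Y\Vert_F^2 + \lambda \Vert W\Vert_{2,1}\), where \(\Vert W\Vert_{2,1} = \sum_i \Vert w^i\Vert_2\) sums the \(l_2\)-norms of the rows of \(W\) and therefore drives entire rows to zero. The sketch below is a minimal NumPy illustration of the standard alternating reweighted solver for such an objective (the diagonal-reweighting update popularized by Nie et al. 2010); the quadratic feature map, the one-hot targets, and the aggregation of coefficient magnitudes back onto the original features are illustrative assumptions, not the authors' exact QSR-FS formulation.

```python
import numpy as np

def quadratic_features(X):
    """Explicit quadratic map: each sample x becomes [x, upper triangle of x x^T].

    This mimics a kernel-free quadratic surface; the paper's exact
    per-class parameterization may differ.
    """
    n, d = X.shape
    iu = np.triu_indices(d)
    quad = np.stack([np.outer(x, x)[iu] for x in X])  # shape (n, d(d+1)/2)
    return np.hstack([X, quad]), iu

def qsr_fs_sketch(X, y, lam=1.0, n_iter=50, eps=1e-8):
    """Alternating solver for  min_W ||Phi W - Y||_F^2 + lam * ||W||_{2,1}.

    Each pass solves a ridge-like linear system for W, then refreshes the
    diagonal matrix D with D_ii = 1 / (2 ||w_i||_2), the usual reweighting
    trick for the l_{2,1} penalty. Returns a per-feature importance score.
    """
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    Phi, iu = quadratic_features(X)
    classes = np.unique(y)
    Y = (y[:, None] == classes[None, :]).astype(float)  # one-hot targets
    p = Phi.shape[1]
    D = np.eye(p)
    for _ in range(n_iter):
        # W-step: closed form given the current reweighting matrix D.
        W = np.linalg.solve(Phi.T @ Phi + lam * D, Phi.T @ Y)
        # D-step: weight each row by the inverse of its norm, so rows
        # shrinking toward zero are pushed harder toward exact zero.
        row_norms = np.maximum(np.linalg.norm(W, axis=1), eps)
        D = np.diag(1.0 / (2.0 * row_norms))
    # Fold coefficient magnitudes back onto the d original features: a
    # feature is credited for its linear term and every quadratic term
    # (diagonal or cross) in which it participates.
    d = X.shape[1]
    score = np.linalg.norm(W[:d], axis=1).copy()
    quad_norms = np.linalg.norm(W[d:], axis=1)
    for k, (i, j) in enumerate(zip(*iu)):
        score[i] += quad_norms[k]
        if i != j:
            score[j] += quad_norms[k]
    return score

# Usage: rank = np.argsort(qsr_fs_sketch(X, y, lam=0.5))[::-1]
# keeps the top-ranked features for a downstream classifier.
```

The cost of the W-step, one \(p \times p\) solve per iteration with \(p = d + d(d+1)/2\) quadratic monomials, is the kind of consideration the abstract refers to when it mentions computational complexity and the accelerated reformulation of the iterative formula.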


Data Availability

The Vehicle, Segmentation, Control, and WebKB wc datasets used in this study are available from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/index.php. The USPS, ORL, and UMIST datasets are available at https://paperswithcode.com/. The Corel 5k dataset is available at https://www.kaggle.com/.


Funding

This work is supported by the National Natural Science Foundation of China (No. 12061071).

Author information

Corresponding author

Correspondence to Zhixia Yang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical Statement

We affirm that this manuscript is wholly our own work. Apart from the cited sources, it does not include any previously published or written research results. There are no other authors of this document, and we accept full legal responsibility for this declaration.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

Appendix A presents key details about the datasets used. Figures 6, 7, 8 and 9 depict samples from four image datasets, and Table 7 lists basic information for all the datasets in this study.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, C., Yang, Z., Ye, J. et al. Supervised Feature Selection via Quadratic Surface Regression with \(l_{2,1}\)-Norm Regularization. Ann. Data. Sci. 11, 647–675 (2024). https://doi.org/10.1007/s40745-024-00518-3
