Abstract
With the explosion of unlabeled, high-dimensional data, unsupervised feature selection has become an essential step in many machine learning and data mining tasks. In recent years, many dictionary-learning-based models have been successfully developed for unsupervised feature selection. These models learn an over-complete dictionary to capture more information about the data distribution. However, over-complete dictionary learning introduces redundancy into the latent representations of the data. Moreover, if the data contain noise, dictionary learning also carries that noise into the latent representations. In this paper, we propose a novel unsupervised feature selection framework, named dictionary learning for unsupervised feature selection via dual sparse regression. In this model, dictionary learning is first embedded into a sparse regression to learn an over-complete dictionary with sparse representations of the data, in which the redundancy and noise are eliminated. The data are then projected onto these representations by a second sparse regression, which evaluates the significance of each feature. We also present an efficient algorithm to solve the resulting problem and theoretically analyze its convergence and its computational complexity, which is proportional to the data dimensionality. Finally, k-means clustering on the selected features across 9 benchmark datasets demonstrates the superiority of our approach in terms of effectiveness and efficiency.
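To make the second regression step more concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes that a representation matrix `A` is already available (here a hypothetical stand-in built from three informative features), that row sparsity is imposed with an l2,1-norm penalty, and that the problem is solved by iteratively reweighted least squares. The function names `l21_regression` and `rank_features` and the parameter `lam` are illustrative assumptions.

```python
import numpy as np


def l21_regression(X, A, lam=1.0, n_iter=50, eps=1e-8):
    """Approximately solve min_W ||X W - A||_F^2 + lam * ||W||_{2,1}.

    Uses iteratively reweighted least squares; this solver is an assumed
    stand-in, not the optimization scheme described in the paper.
    """
    d = X.shape[1]
    G = np.eye(d)  # diagonal reweighting matrix, refreshed every iteration
    XtX, XtA = X.T @ X, X.T @ A
    for _ in range(n_iter):
        # Closed-form update of W for the current weights
        W = np.linalg.solve(XtX + lam * G, XtA)
        # Rows with small norm receive a large penalty in the next round
        row_norms = np.sqrt((W ** 2).sum(axis=1))
        G = np.diag(1.0 / (2.0 * np.maximum(row_norms, eps)))
    return W


def rank_features(W):
    """Score each feature by the l2 norm of its row in W, highest first."""
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores), scores


rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))  # 200 samples, 10 features
# Hypothetical representations that depend only on features 0, 1 and 2
A = X[:, :3] @ rng.standard_normal((3, 5))

W = l21_regression(X, A, lam=1.0)
order, scores = rank_features(W)
top3 = sorted(order[:3].tolist())
print(top3)
```

In this toy setting the rows of `W` that belong to uninformative features are driven toward zero, so ranking features by row norm recovers the three features that generated `A`. In the proposed framework the representations would instead come from the dictionary learning stage.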
Notes
The source code was provided by the author Wei Zheng. We thank her for her generous help.
Acknowledgements
This research was supported by the National Natural Science Foundation of China (No. 62066027), the Natural Science Foundation of Jiangxi Province, China (No. 20212BAB212011), and the Postgraduate Innovation Foundation of Jiangxi Province (No. YC2022-s160).
Ethics declarations
Competing interests
All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
About this article
Cite this article
Wu, JS., Liu, JX., Wu, JY. et al. Dictionary learning for unsupervised feature selection via dual sparse regression. Appl Intell 53, 18840–18856 (2023). https://doi.org/10.1007/s10489-023-04480-0
DOI: https://doi.org/10.1007/s10489-023-04480-0