Abstract
A novel feature selection algorithm is designed for high-dimensional data classification. Relevant features are selected using a least-squares loss function with an \({\ell _{2,1}}\)-norm regularization term, so that the representation error between the features and the labels approaches its minimum using only the selected features. The joint objective function takes into account both the local and global structures of the data distribution through subspace learning, and an efficient optimization algorithm is proposed to solve it, so as to select the most representative and noise-resistant features and thereby improve classification performance. Experiments conducted on benchmark datasets show that the proposed approach is more effective and robust than existing feature selection algorithms.
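To make the sparse-regression component concrete, the following is a minimal sketch of feature selection with a least-squares loss and an \({\ell _{2,1}}\)-norm penalty, i.e. minimizing \(\Vert XW - Y\Vert _F^2 + \lambda \Vert W\Vert _{2,1}\). It is not the algorithm proposed in the paper: the subspace-learning terms that encode local and global structure are omitted, and the solver is a generic iteratively reweighted scheme chosen here for illustration. The function names, the parameter `lam`, and the data layout (rows of `X` are samples, `Y` holds label indicators or targets) are all assumptions.

```python
import numpy as np


def l21_norm(W):
    """Return the l2,1-norm of W: the sum of the Euclidean norms of its rows."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def l21_least_squares(X, Y, lam=1.0, n_iter=50, eps=1e-8):
    """Approximately minimise ||X W - Y||_F^2 + lam * ||W||_{2,1}.

    Illustrative iteratively reweighted solver (an assumption, not the
    paper's optimisation algorithm).
    X: (n_samples, n_features); Y: (n_samples, n_outputs).
    """
    n_features = X.shape[1]
    XtX = X.T @ X
    XtY = X.T @ Y
    d = np.ones(n_features)  # diagonal of the reweighting matrix D
    W = np.zeros((n_features, Y.shape[1]))
    for _ in range(n_iter):
        # With D fixed, the stationarity condition is (X^T X + lam * D) W = X^T Y.
        W = np.linalg.solve(XtX + lam * np.diag(d), XtY)
        # Update D_ii = 1 / (2 * ||w_i||_2); eps guards against division by zero.
        row_norms = np.linalg.norm(W, axis=1)
        d = 1.0 / (2.0 * np.maximum(row_norms, eps))
    return W


def rank_features(W):
    """Rank feature indices by the row norms of W, most important first."""
    return np.argsort(-np.linalg.norm(W, axis=1))
```

Because the penalty acts on whole rows of `W`, a feature is either kept or suppressed across all outputs at once, so the leading entries of `rank_features(W)` can be read as the selected features. A larger `lam` drives more rows toward zero.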
Acknowledgments
This work is supported in part by the China “1000-Plan” National Distinguished Professorship; the China 973 Program under Grant 2013CB329404; the Natural Science Foundation of China under Grants 61170131, 61450001, 61363009, 61263035 and 61573270; the China Postdoctoral Science Foundation under Grant 2015M570837; the Guangxi Natural Science Foundation under Grant 2015GXNSFCB139011; the funding of Guangxi “100-Plan”; the Guangxi Natural Science Foundation for Teams of Innovation and Research under Grant 2012GXNSFGA060004; the Guangxi “Bagui” Teams for Innovation and Research; and the Innovation Project of Guangxi Graduate Education under Grants YCSZ2015095 and YCSZ2015096.
Cite this article
Cheng, D., Zhang, S., Liu, X. et al. Feature selection by combining subspace learning with sparse representation. Multimedia Systems 23, 285–291 (2017). https://doi.org/10.1007/s00530-015-0487-0
DOI: https://doi.org/10.1007/s00530-015-0487-0