Abstract
In this paper we consider feature selection for face recognition using both labeled and unlabeled data. We introduce a weighted feature space in which the global separability between different classes is maximized and the local similarity of neighboring data points is preserved. By integrating the global and local structures, a general optimization framework is formulated. We propose a simple solution to this problem, avoiding the matrix eigen-decomposition procedure, which is often computationally expensive. Experimental results demonstrate the efficacy of our approach and confirm that utilizing labeled and unlabeled data together helps feature selection when only a small number of labeled samples is available.
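The abstract's combined objective can be sketched as a per-feature score: a Fisher-style separability term computed on the labeled samples, blended with a Laplacian-score-style smoothness term computed on labeled and unlabeled samples together. The sketch below is illustrative only; the function name, the 0/1 kNN graph, and the mixing weight `alpha` are our assumptions, not the paper's formulation (which builds a weighted feature space and solves a single optimization without eigen-decomposition).

```python
import numpy as np

def consistent_feature_scores(X, y, labeled_mask, k=5, alpha=0.5):
    """Score each feature by combining global class separability
    (labeled data only) with local similarity preservation (all data).
    Illustrative sketch: the Fisher-style ratio, the 0/1 kNN Laplacian
    score, and the blend weight `alpha` are assumptions."""
    n, d = X.shape

    # Global term: Fisher-like between/within ratio on labeled samples.
    Xl, yl = X[labeled_mask], y[labeled_mask]
    mu = Xl.mean(axis=0)
    between = np.zeros(d)
    within = np.zeros(d)
    for c in np.unique(yl):
        grp = Xl[yl == c]
        between += len(grp) * (grp.mean(axis=0) - mu) ** 2
        within += ((grp - grp.mean(axis=0)) ** 2).sum(axis=0)
    fisher = between / (within + 1e-12)          # larger = more separable

    # Local term: Laplacian-score-style smoothness on a kNN graph
    # built from labeled and unlabeled samples together.
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(D2, np.inf)
    W = np.zeros((n, n))
    nn = np.argsort(D2, axis=1)[:, :k]
    W[np.repeat(np.arange(n), k), nn.ravel()] = 1.0
    W = np.maximum(W, W.T)                       # symmetric 0/1 weights
    deg = W.sum(axis=1)
    L = np.diag(deg) - W                         # graph Laplacian
    Xc = X - X.mean(axis=0)
    num = np.einsum('nd,nm,md->d', Xc, L, Xc)    # f^T L f per feature
    den = (deg[:, None] * Xc ** 2).sum(axis=0)   # f^T D f per feature
    lap = num / (den + 1e-12)                    # smaller = smoother

    # Blend the two criteria after scaling each term to [0, 1].
    return alpha * fisher / (fisher.max() + 1e-12) \
        - (1 - alpha) * lap / (lap.max() + 1e-12)
```

Scaling each term before blending keeps `alpha` interpretable as a labeled/unlabeled trade-off; the paper instead integrates both structures into one objective whose solution avoids explicit eigen-decomposition.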
Acknowledgements
This work was supported by the Natural Science Foundation of Guangdong Province under Grant no. 2012B040305010.
Cite this article
Pan, F., Song, G., Gan, X. et al. Consistent feature selection and its application to face recognition. J Intell Inf Syst 43, 307–321 (2014). https://doi.org/10.1007/s10844-014-0324-5