A hybrid discriminant embedding with feature selection: application to image categorization

Applied Intelligence


Feature extraction has recently been the focus of much research owing to its usefulness in machine learning and pattern recognition. It aims to derive informative representations from the original set of features, and this can be done in various ways. We propose a hybrid linear feature extraction scheme for supervised multi-class classification problems. Inspired by the recent robust sparse LDA and inter-class sparsity frameworks, we propose a unifying criterion that retains the advantages of these two powerful linear discriminant methods. The resulting transformation thus encapsulates two different types of discrimination: inter-class sparsity and robust Linear Discriminant Analysis with feature selection. We introduce an iterative alternating minimization scheme to estimate the linear transform and the orthogonal matrix. The linear transform is efficiently updated via steepest descent. We also introduce two initialization schemes for the linear transform. The proposed framework is generic in the sense that it allows the combination and tuning of other linear discriminant embedding methods. In experiments on several datasets, including faces, objects and digits, the proposed method outperformed the competing methods in most cases.
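The alternating scheme mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the criterion, so the objective below is an assumed stand-in, not the paper's: a reconstruction term `||X - P Q^T X||_F^2` plus a ridge penalty `lam * ||Q||_F^2`, with `Q` playing the role of the linear transform and `P` the matrix with orthonormal columns. The names `objective`, `alternating_minimization`, `lam`, the random initialization and the backtracking line search are all illustrative choices. `Q` is updated by one steepest-descent step and `P` by a closed-form orthogonal Procrustes solution.

```python
import numpy as np

def objective(X, Q, P, lam):
    # Assumed toy criterion (not the paper's): reconstruction error + ridge penalty.
    R = X - P @ (Q.T @ X)
    return np.sum(R ** 2) + lam * np.sum(Q ** 2)

def alternating_minimization(X, d, lam=0.1, n_iters=30, seed=0):
    """X is (D, n) with samples as columns; returns Q (D, d), P (D, d), history."""
    rng = np.random.default_rng(seed)
    D = X.shape[0]
    # Random initialization (an assumption; the paper proposes its own two schemes).
    Q = 0.01 * rng.standard_normal((D, d))
    P = np.linalg.qr(rng.standard_normal((D, d)))[0]  # orthonormal columns
    history = [objective(X, Q, P, lam)]

    for _ in range(n_iters):
        # Q-step: one steepest-descent step with backtracking (Armijo) line search.
        R = X - P @ (Q.T @ X)
        grad = -2.0 * X @ R.T @ P + 2.0 * lam * Q
        f0, g2, t = history[-1], np.sum(grad ** 2), 1.0
        for _ in range(60):
            Q_new = Q - t * grad
            if objective(X, Q_new, P, lam) <= f0 - 0.5 * t * g2:
                Q = Q_new
                break
            t *= 0.5  # shrink the step until the objective decreases enough

        # P-step: with Q fixed, the minimizer over P^T P = I is the
        # orthogonal Procrustes solution from the thin SVD of X X^T Q.
        U, _, Vt = np.linalg.svd(X @ (X.T @ Q), full_matrices=False)
        P = U @ Vt
        history.append(objective(X, Q, P, lam))

    return Q, P, history
```

Because the line search only accepts steps that lower the objective and the `P` update is an exact minimizer for fixed `Q`, the recorded objective values should not increase from one iteration to the next, which is the basic property expected of an alternating minimization scheme.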


Author information

Corresponding author

Correspondence to F. Dornaika.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Khoder, A., Dornaika, F. A hybrid discriminant embedding with feature selection: application to image categorization. Appl Intell 51, 3142–3158 (2021).
