Abstract
Fisher discriminant analysis (FDA) is a widely used dimensionality reduction tool in pattern recognition. However, FDA cannot obtain an optimal subspace for classification without sufficient labeled samples, so semi-supervised discriminant analysis has attracted considerable attention in recent years. This paper proposes a semi-supervised method that uses the exponential-adjusted geometric distance, which modifies the exponential function and the scaling factor, as its measure of similarity. The distance satisfies the global and local consistency requirements, and the resulting similarity matrix is more consistent with the real data distribution, which improves dimensionality reduction performance. First, to handle nonlinearly separable data, a kernel function maps the original data into a high-dimensional feature space. Then, both labeled and unlabeled data in the feature space are used to capture the consistency assumption of the geometrical structure based on the exponential-adjusted geometric distance, and this assumption is incorporated into the objective function of local Fisher discriminant analysis as a regularization term. Finally, the optimal projection matrix is obtained by maximizing the objective function. Experiments on artificial datasets, UCI benchmark datasets, and high-dimensional recognition problems indicate that the proposed technique significantly improves discriminant performance compared with state-of-the-art dimensionality reduction techniques.
Data availability
The dataset used in this paper is publicly available.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (No. 62176050) and the National Innovation and Entrepreneurship Training Program for College Students (No. 202310225305). The authors are grateful to the anonymous reviewers for their valuable comments and suggestions, which were very helpful in improving the quality and presentation of this paper.
Author information
Contributions
Zhiyu Chen contributed to conceptualization, software, validation, and writing—original draft. Yuqi Sun contributed to software, validation, and writing—original draft. Dongliang Hu contributed to data curation and writing—original draft. Yangguang Bian contributed to validation and writing—original draft. Shensen Wang contributed to data curation and validation. Xiyuan Zhang contributed to the literature search. Xinmin Tao contributed to project administration and funding acquisition.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical approval
This is a literature review article and does not involve human subjects for data collection, so ethical approval is not required.
Appendix
The proposed distance measure simultaneously satisfies the following four properties. The proof of each property follows its statement.
(a) Reflexivity. \(D_{i,j}^{\rho } = 0\) if and only if \({{\varvec{x}}}_{i}={{\varvec{x}}}_{j}\).
Proof
Let \(d\left( {{\varvec{p}}_{k} ,{\varvec{p}}_{k + 1} } \right)\) be the Euclidean distance between any two adjacent data points on the minimum path between the vertices \({\varvec{x}}_{i}\) and \({\varvec{x}}_{j}\) on graph \({\varvec{G}}\). When \({\varvec{x}}_{i} = {\varvec{x}}_{j}\), the minimum path between \({\varvec{x}}_{i}\) and \({\varvec{x}}_{j}\) has length 0, so \(d\left( {{\varvec{p}}_{k} ,{\varvec{p}}_{k + 1} } \right) = 0,k = 1, \cdots ,\left| {\varvec{p}} \right|\), and therefore \(D_{i,j}^{\rho } = 0\).
(b) Nonnegativity. \(D_{i,j}^{\rho } \ge 0\).
Proof
(c) Symmetry. \(D_{i,j}^{\rho } = D_{j,i}^{\rho }\).
Proof
Since,
So,
(d) Triangle inequality.
Proof
Assume,
According to the definition,
should at least take
So the assumption does not hold, which proves \(D_{i,j}^{\rho } < D_{i,m}^{\rho } + D_{m,j}^{\rho }\).
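The four properties can also be checked numerically on a toy example. The Python sketch below is illustrative only. The definition of \(D_{i,j}^{\rho }\) is not reproduced in this appendix, so the sketch assumes a generic density-sensitive path distance: each edge costs \(\rho ^{d} - 1\) with \(\rho > 1\), where \(d\) is the Euclidean distance between its endpoints, and the distance between two points is the minimum total cost over all paths on the complete graph. This assumed edge cost may differ from the paper's exponential-adjusted form, and the function names `edge_cost`, `path_distance_matrix`, and `check_metric_properties` are hypothetical.

```python
import itertools
import math


def edge_cost(x, y, rho=2.0):
    """Assumed edge cost rho**d - 1, with d the Euclidean distance (not the paper's exact form)."""
    return rho ** math.dist(x, y) - 1.0


def path_distance_matrix(points, rho=2.0):
    """All-pairs minimum path cost on the complete graph, computed with Floyd-Warshall."""
    n = len(points)
    D = [[edge_cost(points[i], points[j], rho) for j in range(n)] for i in range(n)]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                # Relax the i -> j cost through the intermediate vertex m.
                if D[i][m] + D[m][j] < D[i][j]:
                    D[i][j] = D[i][m] + D[m][j]
    return D


def check_metric_properties(D, tol=1e-9):
    """Check the four properties on a distance matrix, up to a floating-point tolerance."""
    idx = range(len(D))
    return {
        "reflexivity": all(abs(D[i][i]) <= tol for i in idx),
        "nonnegativity": all(D[i][j] >= -tol for i in idx for j in idx),
        "symmetry": all(abs(D[i][j] - D[j][i]) <= tol for i in idx for j in idx),
        # Non-strict check: equality holds when vertex m lies on the minimum path.
        "triangle": all(
            D[i][j] <= D[i][m] + D[m][j] + tol
            for i, j, m in itertools.product(idx, repeat=3)
        ),
    }


# Toy data: three collinear points plus two points away from the line.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.5), (0.5, 2.0)]
D = path_distance_matrix(points)
results = check_metric_properties(D)
print(results)
```

Under the assumed edge cost with \(\rho = 2\), the direct hop from the first to the third point costs \(2^{2} - 1 = 3\), while the two-hop path through the second point costs \(1 + 1 = 2\). The path distance therefore prefers short hops through intermediate points, which is the behavior such distances are meant to capture.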
About this article
Cite this article
Chen, Z., Sun, Y., Hu, D. et al. Semi-supervised Kernel Fisher discriminant analysis based on exponential-adjusted geometric distance. Neural Comput & Applic (2024). https://doi.org/10.1007/s00521-024-09768-x
DOI: https://doi.org/10.1007/s00521-024-09768-x