A Price We Pay for Inexact Dimensionality Reduction
Biometric and biomedical pattern classification tasks typically involve high-dimensional data, so feature selection or feature extraction is necessary. The accuracy of both procedures depends on the size of the training data. We study, both analytically and by simulation, the increase in classification error caused by using sample-based K-class linear discriminant analysis for dimensionality reduction, and we derive an analytical expression for the expected classification error. We show theoretically that, as the sample size increases, the classification error of the (K-1)-dimensional data first decreases and then starts to increase. The maximum is reached when the size of the K class training sets, n, approaches the dimensionality, p. When n > p, the classification error decreases monotonically. We demonstrate this peaking effect on real-world biomedical and biometric data sets, and we show that regularisation of the within-class scatter can reduce or even eliminate it.
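The kind of experiment described above can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' simulation: the spherical Gaussian data model, the ridge-style regulariser `reg * I` added to the within-class scatter, the nearest-centroid classifier in the reduced space, and the names `lda_directions` and `holdout_error` are all hypothetical choices made for this example.

```python
import numpy as np


def lda_directions(X, y, reg=0.0):
    """Return a (p, K-1) projection matrix from sample-based K-class LDA."""
    classes = np.unique(y)
    p = X.shape[1]
    grand_mean = X.mean(axis=0)
    Sw = np.zeros((p, p))  # within-class scatter
    Sb = np.zeros((p, p))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        d = (mc - grand_mean)[:, None]
        Sb += len(Xc) * (d @ d.T)
    # Assumption: ridge-style regularisation of the within-class scatter.
    Sw_reg = Sw + reg * np.eye(p)
    # The pseudo-inverse keeps the computation defined when Sw is singular.
    M = np.linalg.pinv(Sw_reg) @ Sb
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(-eigvals.real)[: len(classes) - 1]
    return eigvecs[:, order].real


def holdout_error(n_per_class, p=20, K=3, reg=0.0, n_test=200, seed=0):
    """Hold-out error of a nearest-centroid rule in the (K-1)-dim LDA space."""
    rng = np.random.default_rng(seed)
    means = rng.normal(size=(K, p))  # hypothetical class means

    def sample(n):
        X = np.vstack([rng.normal(size=(n, p)) + means[k] for k in range(K)])
        y = np.repeat(np.arange(K), n)
        return X, y

    Xtr, ytr = sample(n_per_class)
    Xte, yte = sample(n_test)
    W = lda_directions(Xtr, ytr, reg)
    Ztr, Zte = Xtr @ W, Xte @ W
    centroids = np.vstack([Ztr[ytr == k].mean(axis=0) for k in range(K)])
    # Squared Euclidean distance from every test point to every centroid.
    dists = ((Zte[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((dists.argmin(axis=1) != yte).mean())


if __name__ == "__main__":
    # Sweep the training-set size with and without regularisation.
    for n in (3, 5, 7, 10, 20, 50, 200):
        print(n, holdout_error(n, reg=0.0), holdout_error(n, reg=1.0))
```

Averaging `holdout_error` over many seeds for each n, rather than using a single draw, would give a smoother error curve for comparing the unregularised and regularised cases.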
Keywords: dimensionality reduction, complexity, sample size, biometrics, linear discriminant analysis