A Price We Pay for Inexact Dimensionality Reduction

  • Sarunas Raudys
  • Vytautas Valaitis
  • Zidrina Pabarskaite
  • Gene Biziuleviciene
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9044)

Abstract

In biometric and biomedical pattern classification tasks one faces high-dimensional data, so feature selection or feature extraction is necessary. The accuracy of both procedures depends on the sample size. We consider, both analytically and by simulation, the increase in classification error caused by using sample-based K-class linear discriminant analysis for dimensionality reduction. Applying statistical analysis, we derive an analytical expression for the expected classification error. It is shown theoretically that, as the sample size grows, the classification error on the (K-1)-dimensional data decreases at first, but later starts to increase. The maximum is reached when the size n of the K class training sets approaches the dimensionality p; once n exceeds p, the classification error decreases monotonically. This peaking effect is demonstrated on real-world biomedical and biometric data sets. We show that regularisation of the within-class scatter matrix can reduce or even eliminate the peaking effect.
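For readers who want to see the phenomenon concretely, the sketch below is a minimal, illustrative simulation, not the authors' code: it draws K Gaussian classes in p dimensions, fits scikit-learn's LinearDiscriminantAnalysis on training sets of increasing per-class size n, and compares the test error of the plain sample-based rule with a variant whose within-class scatter is shrinkage-regularised, standing in for the regularisation discussed in the abstract. All numeric settings (p = 50, K = 3, the class means, the n grid) are arbitrary assumptions chosen for illustration.

```python
# Illustrative sketch (assumed setup, not the paper's experiments): probe how the
# error of sample-based LDA behaves as the per-class training size n sweeps past
# the dimensionality p, with and without shrinkage regularisation.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
p, K = 50, 3                                   # assumed dimensionality and number of classes
means = rng.normal(0.0, 0.25, size=(K, p))     # hypothetical class means, identity covariance

def sample(n_per_class):
    """Draw n_per_class Gaussian samples from each of the K classes."""
    X = np.vstack([m + rng.standard_normal((n_per_class, p)) for m in means])
    y = np.repeat(np.arange(K), n_per_class)
    return X, y

X_test, y_test = sample(2000)                  # large test set to estimate the error

for n in (15, 30, 50, 75, 150, 400):           # n sweeps through the region around p = 50
    X_tr, y_tr = sample(n)
    plain = LinearDiscriminantAnalysis(solver="svd")                      # sample-based LDA
    shrunk = LinearDiscriminantAnalysis(solver="eigen", shrinkage="auto") # regularised scatter
    errs = []
    for clf in (plain, shrunk):
        clf.fit(X_tr, y_tr)                    # uses the K-1 = 2 discriminant directions internally
        errs.append(1.0 - clf.score(X_test, y_test))
    print(f"n per class = {n:4d}: plain LDA error = {errs[0]:.3f}, "
          f"shrinkage-regularised = {errs[1]:.3f}")
```

The printed errors can be inspected for the rise-and-fall pattern around n ≈ p that the abstract describes; the exact values depend on the random seed and on how well the hypothetical class means are separated.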

Keywords

dimensionality reduction · complexity · sample size · biometrics · linear discriminant analysis

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Sarunas Raudys (1)
  • Vytautas Valaitis (1)
  • Zidrina Pabarskaite (1)
  • Gene Biziuleviciene (1, 2)

  1. Faculty of Mathematics and Informatics, Vilnius University, Lithuania
  2. State Research Institute, Centre for Innovative Medicine, Vilnius, Lithuania
