A First Approach to Face Dimensionality Reduction Through Denoising Autoencoders

  • Francisco J. Pulgar
  • Francisco Charte
  • Antonio J. Rivera
  • María J. del Jesus
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11314)


The problem of high dimensionality is a challenge in machine learning tasks. A high-dimensional space has a negative effect on the predictive performance of many methods, particularly classification algorithms. Several approaches have been proposed to mitigate the effects of this phenomenon; among them, models based on deep learning have emerged.

In this work, denoising autoencoders (DAEs) are used to reduce dimensionality. To verify their performance, experiments are carried out in which the improvement obtained with different types of classifiers is measured. The classification methods used are kNN, SVM, C4.5 and MLP. The tests for kNN and SVM show better predictive performance on all datasets, while the executions for C4.5 and MLP reflect improvements only in some cases. The execution time is lower in all tests. In addition, a comparison between DAEs and PCA, a classical dimensionality reduction method, is performed, with DAEs obtaining better results in most cases. The conclusions reached open up new lines of future work.
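The idea described above can be illustrated with a minimal sketch of a denoising autoencoder: the input is corrupted with Gaussian noise, a single hidden layer (with tied weights) is trained to reconstruct the clean input, and the hidden activations then serve as the reduced representation fed to a classifier. This is not the paper's implementation; all function names, the tied-weights choice, and the hyperparameters below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_dae(X, n_hidden, noise_std=0.1, lr=0.1, epochs=200):
    """Train a one-hidden-layer denoising autoencoder with tied weights.

    X: (n_samples, n_features) array scaled to [0, 1].
    Returns (W, b) so that sigmoid(X @ W + b) is the reduced representation.
    """
    n = X.shape[1]
    W = rng.normal(0.0, 0.1, (n, n_hidden))  # shared encoder/decoder weights
    b = np.zeros(n_hidden)                   # encoder bias
    c = np.zeros(n)                          # decoder bias
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(epochs):
        Xn = X + rng.normal(0.0, noise_std, X.shape)  # corrupt the input
        H = sigmoid(Xn @ W + b)                       # encode corrupted input
        R = sigmoid(H @ W.T + c)                      # decode with tied weights
        # Backpropagate mean squared reconstruction error against the CLEAN X
        dR = (R - X) * R * (1.0 - R) / len(X)
        dH = (dR @ W) * H * (1.0 - H)
        W -= lr * (Xn.T @ dH + dR.T @ H)  # encoder + decoder contributions
        b -= lr * dH.sum(axis=0)
        c -= lr * dR.sum(axis=0)
    return W, b

def encode(X, W, b):
    """Project data into the lower-dimensional hidden space."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

# Usage: reduce 20 features to 5, then hand Z to any classifier (kNN, SVM, ...)
X = rng.random((100, 20))
W, b = train_dae(X, n_hidden=5)
Z = encode(X, W, b)  # shape (100, 5)
```

Unlike PCA, which is restricted to linear projections, the sigmoid hidden layer lets the DAE capture nonlinear structure, and the injected noise forces the learned features to be robust to small input perturbations.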


Classification · Deep learning · Autoencoders · Denoising autoencoders · Dimensionality reduction · High dimensionality



The work of F. Pulgar was supported by the Spanish Ministry of Education under the FPU National Program (Ref. FPU16/00324). This work was partially supported by the Spanish Ministry of Science and Technology under project TIN2015-68454-R.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Francisco J. Pulgar (1)
  • Francisco Charte (1)
  • Antonio J. Rivera (1)
  • María J. del Jesus (1)

  1. Andalusian Research Institute on Data Science and Computational Intelligence (DaSCI), Computer Science Department, University of Jaén, Jaén, Spain
