Discriminative Representation Learning with Supervised Auto-encoder

  • Fang Du
  • Jiangshe Zhang
  • Nannan Ji
  • Junying Hu
  • Chunxia Zhang


Recent studies have shown auto-encoders to be powerful unsupervised learning methods that can extract useful features from input data or construct deep artificial neural networks. In such settings, the extracted features or the initialized networks capture only the structure of the data and contain no class information, which is a disadvantage in classification tasks. In this paper, we aim to leverage the class information of the input to learn an auto-encoder that is both reconstructive and discriminative. More specifically, we introduce a supervised auto-encoder that combines the reconstruction error and the classification error into a unified objective function and takes noisy concatenated data and labels as input. The noisy concatenated input is constructed so that one third of the samples contain only the original data with zeroed labels, one third contain only the labels with zeroed data, and the remaining third contain both the original data and the labels. We show that the representations learned by the proposed supervised auto-encoder are more discriminative and better suited to classification tasks. Experimental results demonstrate that our model outperforms many existing learning algorithms.
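The input construction and the combined objective described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names `build_noisy_input` and `unified_objective`, the one-hot label encoding, the squared-error and cross-entropy terms, and the weight `lam` are all assumptions made for illustration; the abstract only states that the input is split into thirds and that the reconstruction and classification errors are combined.

```python
import numpy as np

def build_noisy_input(X, Y, rng):
    """Concatenate data X (n, d) with one-hot labels Y (n, k), then zero the
    label part for one third of the rows and the data part for another third.
    The remaining rows keep both data and labels."""
    n, d = X.shape
    Z = np.concatenate([X, Y], axis=1).astype(float)
    perm = rng.permutation(n)          # random assignment of rows to thirds
    third = n // 3
    data_only = perm[:third]           # rows that keep data, labels zeroed
    label_only = perm[third:2 * third] # rows that keep labels, data zeroed
    Z[data_only, d:] = 0.0
    Z[label_only, :d] = 0.0
    return Z, data_only, label_only

def unified_objective(X, Y, X_hat, Y_prob, lam=1.0, eps=1e-12):
    """Assumed form of the unified objective: mean squared reconstruction
    error plus lam times the mean cross-entropy classification error."""
    recon = np.mean(np.sum((X - X_hat) ** 2, axis=1))
    clf = -np.mean(np.sum(Y * np.log(Y_prob + eps), axis=1))
    return recon + lam * clf

# Toy usage with hypothetical sizes: 9 samples, 4 features, 3 classes.
rng = np.random.default_rng(0)
X = rng.random((9, 4)) + 0.1                 # strictly positive features
Y = np.eye(3)[rng.integers(0, 3, size=9)]    # one-hot labels
Z, data_only, label_only = build_noisy_input(X, Y, rng)
```

In this sketch the clean `X` and `Y` would serve as reconstruction and classification targets while `Z` is fed to the encoder, in the spirit of a de-noising auto-encoder; how the paper weights the two error terms is not stated in the abstract.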


Supervised learning · Auto-encoder · De-noising



This work was supported by a grant from the National Key Basic Research Program of China (No. 2013CB329404) and two grants from the National Natural Science Foundation of China (Nos. 61572393 and 11671317).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Fang Du (1)
  • Jiangshe Zhang (1)
  • Nannan Ji (2)
  • Junying Hu (1)
  • Chunxia Zhang (1)

  1. School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an, China
  2. Department of Mathematics and Information Science, Chang’an University, Xi’an, China
