Abstract
In this work we propose an ℓp-norm data-fidelity constraint for training autoencoders. The Euclidean (ℓ2-norm) distance is usually used for this purpose; we generalize it to the ℓp-norm, where smaller values of p make the problem robust to outliers. The ensuing optimization problem is solved using the Augmented Lagrangian approach. The proposed ℓp-norm autoencoder has been tested on benchmark deep-learning datasets: MNIST, CIFAR-10 and SVHN. On all of these problems, the proposed robust autoencoder yields better results than the standard (ℓ2-norm) autoencoder and the deep belief network.
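As a minimal sketch of the idea behind the abstract (not the authors' exact formulation; the function and variable names here are illustrative assumptions), the ℓp-norm data-fidelity term and its robustness to outliers can be illustrated as follows: for p < 2, a single large residual contributes a smaller share of the total cost than under the squared-error (ℓ2) criterion.

```python
import numpy as np

# Hypothetical illustration of an lp-norm data-fidelity cost; the name
# lp_fidelity and the toy data are assumptions for demonstration only.
def lp_fidelity(x, x_hat, p=2.0):
    """Return ||x - x_hat||_p^p, the lp-norm data-fidelity cost."""
    return float(np.sum(np.abs(x - x_hat) ** p))

# Toy residuals: three small reconstruction errors and one outlier (|r| = 10).
x = np.array([1.0, 2.0, 3.0, 4.0])
x_hat = np.array([1.1, 2.1, 3.1, 14.0])  # last sample is an outlier

l2 = lp_fidelity(x, x_hat, p=2.0)  # squared-error cost: ~0.03 + 100
l1 = lp_fidelity(x, x_hat, p=1.0)  # absolute-deviation cost: ~0.3 + 10

# Share of the total cost contributed by the outlier residual:
share_l2 = 10.0 ** 2 / l2
share_l1 = 10.0 / l1
# share_l1 < share_l2: under p = 1 the outlier dominates the loss less
# than under p = 2, which is why smaller p gives robustness to outliers.
```

In training, this term replaces the usual Euclidean reconstruction error between the input and the decoder output; the paper solves the resulting non-smooth problem with an Augmented Lagrangian method rather than plain gradient descent.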
Copyright information
© 2016 Springer International Publishing AG
Cite this paper
Mehta, J., Gupta, K., Gogna, A., Majumdar, A., Anand, S. (2016). Stacked Robust Autoencoder for Classification. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9949. Springer, Cham. https://doi.org/10.1007/978-3-319-46675-0_66
Print ISBN: 978-3-319-46674-3
Online ISBN: 978-3-319-46675-0