Deep Kernelized Autoencoders

  • Michael Kampffmeyer
  • Sigurd Løkse
  • Filippo M. Bianchi
  • Robert Jenssen
  • Lorenzo Livi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10269)


In this paper we introduce the deep kernelized autoencoder, a neural network model that allows an explicit approximation of (i) the mapping from an input space to an arbitrary, user-specified kernel space and (ii) the back-projection from such a kernel space to input space. The proposed method is based on traditional autoencoders and is trained through a new unsupervised loss function. During training, we optimize both the reconstruction accuracy of input samples and the alignment between a kernel matrix given as prior and the inner products of the hidden representations computed by the autoencoder. Kernel alignment provides control over the hidden representation learned by the autoencoder. Experiments have been performed to evaluate both reconstruction and kernel alignment performance. Additionally, we applied our method to emulate kPCA on a denoising task obtaining promising results.
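The training objective described above, combining reconstruction accuracy with alignment to a prior kernel matrix, can be illustrated with a minimal NumPy sketch. The abstract does not give the formula, so the specific form used here is an assumption: a mean squared reconstruction error, a Frobenius distance between the normalized inner-product matrix of the hidden codes and the normalized prior kernel, and a convex weighting by a hypothetical parameter `lam`. The function names and the RBF prior in the usage lines are likewise illustrative.

```python
import numpy as np


def kernel_alignment_loss(codes, prior_kernel):
    """Distance between the code inner products and a prior kernel matrix.

    Assumed form: both matrices are scaled to unit Frobenius norm, so the
    loss ignores overall scale and only compares structure.
    """
    code_gram = codes @ codes.T                        # inner products of hidden codes
    code_gram = code_gram / np.linalg.norm(code_gram)  # Frobenius norm by default
    prior = prior_kernel / np.linalg.norm(prior_kernel)
    return np.linalg.norm(code_gram - prior)


def combined_loss(inputs, reconstructions, codes, prior_kernel, lam=0.5):
    """Weighted sum of reconstruction error and kernel alignment error.

    `lam` is a hypothetical trade-off parameter in [0, 1].
    """
    reconstruction = np.mean((inputs - reconstructions) ** 2)
    alignment = kernel_alignment_loss(codes, prior_kernel)
    return (1.0 - lam) * reconstruction + lam * alignment


# Usage with random stand-ins for the autoencoder's outputs.
rng = np.random.default_rng(0)
inputs = rng.normal(size=(8, 5))            # 8 samples, 5 features
codes = rng.normal(size=(8, 3))             # stand-in for encoder output
reconstructions = inputs + 0.1 * rng.normal(size=inputs.shape)

# Illustrative prior: an RBF kernel on the inputs (any kernel matrix could be used).
sq_dists = ((inputs[:, None, :] - inputs[None, :, :]) ** 2).sum(axis=-1)
prior_kernel = np.exp(-sq_dists / 2.0)

loss = combined_loss(inputs, reconstructions, codes, prior_kernel, lam=0.5)
```

In an actual model both terms would be minimized by gradient descent over the encoder and decoder weights. The sketch only evaluates the objective for fixed arrays.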


Autoencoders, Kernel methods, Deep learning, Representation learning



We gratefully acknowledge the support of NVIDIA Corporation with the donation of the GPU used for this research. This work was partially funded by the Norwegian Research Council FRIPRO grant no. 239844 on developing the Next Generation Learning Machines.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Machine Learning Group, UiT–The Arctic University of Norway, Tromsø, Norway
  2. Department of Computer Science, University of Exeter, Exeter, UK
