Semi-parametric training of autoencoders with Gaussian kernel smoothed topology learning neural networks

  • Zhiyang Xiang
  • Changshou Deng
  • Xueting Xiang
  • Mali Yu
  • Jing Xiong (corresponding author)
Original Article


Autoencoders are essential for training neural networks with multiple hidden layers. Parametric autoencoder training often requires the user to select the number of hidden neurons and the kernel type. This paper proposes a semi-parametric autoencoder training method based on self-organized learning and incremental learning. The cost function is constructed incrementally by nonparametric learning, and the model parameter is trained by parametric learning. First, a topology learning neural network, such as growing neural gas or a self-organizing incremental neural network, is trained to obtain a discrete representation of the training data. Second, the correlations between different dimensions are modeled as a joint distribution using the neural network representation and kernel smoothers. Finally, the loss function is defined as the regression prediction error obtained when each dimension in turn is treated as the response variable in density regression. The kernel parameter is selected by gradient descent, which minimizes the reconstruction error on a data subset. Because training is incremental, the proposed architecture is space-efficient during training, and it selects the number of hidden neurons automatically. Experiments are carried out on four UCI datasets and an image interpolation task. Results show that the proposed methods outperform perceptron-architecture autoencoders and the restricted Boltzmann machine in nonlinear feature learning.
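The three steps summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: `fit_prototypes` is a plain online competitive-learning stand-in with a fixed node count (growing neural gas and SOINN add and remove nodes, which is not reproduced here), the smoother is an ordinary Nadaraya-Watson estimator with one shared Gaussian bandwidth, and the gradient in `select_bandwidth` is approximated by finite differences rather than derived analytically. All function and parameter names are hypothetical.

```python
import numpy as np


def fit_prototypes(X, n_nodes=20, n_epochs=5, lr=0.05, seed=0):
    """Stand-in for a topology learning network (assumption, not GNG/SOINN):
    online competitive learning that moves the nearest node toward each sample."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    nodes = X[rng.choice(len(X), size=n_nodes, replace=False)].copy()
    for _ in range(n_epochs):
        for x in X[rng.permutation(len(X))]:
            winner = np.argmin(((nodes - x) ** 2).sum(axis=1))
            nodes[winner] += lr * (x - nodes[winner])
    return nodes


def reconstruct(X, nodes, bandwidth):
    """Predict every dimension from the remaining ones by Gaussian kernel
    smoothing (Nadaraya-Watson) over the prototype nodes."""
    X = np.asarray(X, dtype=float)
    n_dims = X.shape[1]
    X_hat = np.empty_like(X)
    for j in range(n_dims):
        others = np.arange(n_dims) != j          # predictors: all dims except j
        X_pred, nodes_pred = X[:, others], nodes[:, others]
        sq_dist = ((X_pred[:, None, :] - nodes_pred[None, :, :]) ** 2).sum(axis=2)
        # Subtract the row minimum before exponentiating for numerical stability.
        sq_dist -= sq_dist.min(axis=1, keepdims=True)
        weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
        weights /= weights.sum(axis=1, keepdims=True)
        X_hat[:, j] = weights @ nodes[:, j]      # weighted mean of node responses
    return X_hat


def reconstruction_error(X, nodes, bandwidth):
    """Mean squared error between the data and its kernel-smoothed reconstruction."""
    return float(np.mean((np.asarray(X, dtype=float) - reconstruct(X, nodes, bandwidth)) ** 2))


def select_bandwidth(X_subset, nodes, h0=1.0, lr=0.1, n_steps=50, eps=1e-3):
    """Gradient descent on log-bandwidth using a central finite-difference
    gradient of the reconstruction error on a data subset (illustrative only)."""
    log_h = np.log(h0)
    for _ in range(n_steps):
        up = reconstruction_error(X_subset, nodes, np.exp(log_h + eps))
        down = reconstruction_error(X_subset, nodes, np.exp(log_h - eps))
        log_h -= lr * (up - down) / (2.0 * eps)
    return float(np.exp(log_h))


if __name__ == "__main__":
    # Synthetic correlated data, used only to exercise the sketch.
    rng = np.random.default_rng(1)
    t = rng.uniform(-1.0, 1.0, size=200)
    X = np.column_stack([t, t ** 2, np.sin(3.0 * t)])
    nodes = fit_prototypes(X, n_nodes=15)
    h = select_bandwidth(X[:50], nodes)
    print("bandwidth:", h, "error:", reconstruction_error(X, nodes, h))
```

Optimizing the logarithm of the bandwidth keeps it positive without a constraint. Because each reconstructed value is a convex combination of node coordinates, the output always stays inside the range spanned by the nodes.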


Keywords: Autoencoder; Nonparametric learning; Kernel density estimation; Incremental learning



This work was supported in part by the Fundamental Research Program of Shenzhen (Project No. JCYJ20170413162458312) and the National Natural Science Foundation of China (No. 61562047).

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.



Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2018

Authors and Affiliations

  • Zhiyang Xiang (1, 2)
  • Changshou Deng (2)
  • Xueting Xiang (3)
  • Mali Yu (2)
  • Jing Xiong (1), corresponding author

  1. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
  2. School of Information Science and Technology, Jiujiang University, Jiujiang, China
  3. Haikou College of Economics, Haikou, China
