K-Means Clustering Optimizing Deep Stacked Sparse Autoencoder



Because deep models have large structures and long training times, their development cycle is prolonged, so how to speed up training is a problem worth studying. To accelerate training, this paper presents a K-means clustering optimized deep stacked sparse autoencoder (K-means sparse SAE). First, the input features are divided into K small subsets by K-means clustering. Each subset is then fed into its own autoencoder for training, and each of these autoencoders has fewer hidden-layer nodes than a traditional model. After training, the weights and biases of all the autoencoders are merged, and the next layer's input features are obtained by a feedforward pass. These steps are repeated up to the softmax layer, and then fine-tuning is carried out. On the MNIST-Rotation dataset, with a network of three hidden layers of 800 nodes each, the improved model achieves higher classification accuracy and shorter training time when K = 10. As K increases, the training time falls to almost the same as the fine-tuning time, but recognition ability declines. Compared with the recent stacked denoising sparse autoencoder, recognition accuracy improves by 1%, no noise factor needs to be selected, and training is significantly faster. The filters trained by the improved model are also used to train a convolutional autoencoder, which performs better than traditional models. We find that the pre-training stage does not need a large number of samples at once, and that training on small sample subsets in parallel reduces the probability of falling into a local minimum.
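The pre-training of a single layer described above can be sketched roughly as follows. This is only an illustrative NumPy sketch, not the authors' implementation: it assumes that K-means is applied to the training samples, that each small autoencoder receives an equal share of the layer's hidden nodes (for example 800 / K), and that "merging" means concatenating the encoder weights and biases along the hidden dimension. The function names (`kmeans`, `train_sparse_ae`, `kmeans_sparse_ae_layer`) and all hyperparameter values are hypothetical, and plain gradient descent stands in for whatever optimizer the paper actually uses.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per sample."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every sample to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def train_sparse_ae(X, n_hidden, rho=0.05, beta=3.0, lam=1e-4,
                    lr=0.5, epochs=200, seed=0):
    """One-hidden-layer sparse autoencoder (squared error + KL sparsity
    penalty + weight decay), trained by full-batch gradient descent.
    Returns only the encoder weights and biases."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)            # hidden activations
        Y = sigmoid(H @ W2 + b2)            # reconstruction
        rho_hat = np.clip(H.mean(axis=0), 1e-6, 1 - 1e-6)
        d_out = (Y - X) * Y * (1 - Y)       # error at output pre-activation
        kl_grad = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat))
        d_hid = (d_out @ W2.T + kl_grad) * H * (1 - H)
        W2 -= lr * (H.T @ d_out / n + lam * W2)
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_hid / n + lam * W1)
        b1 -= lr * d_hid.mean(axis=0)
    return W1, b1

def kmeans_sparse_ae_layer(X, k, n_hidden_total, epochs=200):
    """Pre-train one layer: cluster the samples, train one small
    autoencoder per cluster, then merge the encoders by concatenation
    (an assumed reading of the merge step)."""
    labels = kmeans(X, k)
    per_ae = n_hidden_total // k            # hidden nodes per small autoencoder
    weights, biases = [], []
    for j in range(k):
        subset = X[labels == j]
        if len(subset) == 0:                # guard against an empty cluster
            subset = X
        W, b = train_sparse_ae(subset, per_ae, epochs=epochs, seed=j)
        weights.append(W)
        biases.append(b)
    W = np.concatenate(weights, axis=1)     # shape (d, k * per_ae)
    b = np.concatenate(biases)
    features = sigmoid(X @ W + b)           # input features for the next layer
    return W, b, features
```

Calling `kmeans_sparse_ae_layer` once per hidden layer, each time on the `features` returned by the previous call, would give the stacked pre-training; the softmax layer and fine-tuning are left out of this sketch. Because the K small autoencoders are independent of each other, the loop over clusters is the part that could be run in parallel.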



  1. MNIST dataset: http://yann.lecun.com/exdb/mnist/

  2. MNIST-Rotation dataset: http://www.iro.umontreal.ca/%7elisa/twiki/bin/view.cgi/Public/MnistVariations

  3. The reduced STL-10 dataset: http://ufldl.stanford.edu/wiki/resources/

  4. M. Schmidt. minFunc: unconstrained differentiable multivariate optimization in MATLAB, 2005. http://www.cs.ubc.ca/%7eschmidtm/Software/minFunc.html




This work is supported by the National Key Technology Research and Development Program of China (No. 2011BAD21B0601).


Corresponding author

Correspondence to Shuhan Cheng.


Cite this article

Bi, Y., Wang, P., Guo, X. et al. K-Means Clustering Optimizing Deep Stacked Sparse Autoencoder. Sens Imaging 20, 6 (2019). https://doi.org/10.1007/s11220-019-0227-1



  • K-means clustering
  • Sparse autoencoder
  • Convolutional autoencoder
  • Training method