
SRS-DNN: a deep neural network with strengthening response sparsity

  • Chen Qiao
  • Bin Gao
  • Yan Shi
Original Article

Abstract

Inspired by the sparse mechanism of biological neural systems, this paper presents an approach that strengthens response sparsity in deep learning. First, an unsupervised sparse pre-training process is carried out, so that a sparse deep network begins to take shape. To prevent all connections of the network from being readjusted arbitrarily during the subsequent fine-tuning process, regularization terms that strengthen sparse responsiveness are added to the fine-tuning loss function. More importantly, unified and concise residual formulae for the network updates are deduced, which ensure that the backpropagation algorithm performs successfully. These residual formulae significantly improve existing sparse fine-tuning methods, such as the one used in Andrew Ng's sparse autoencoder. In this way, the sparse structure obtained during pre-training can be maintained, and sparse abstract features of the data can be extracted effectively. Numerical experiments show that, with this sparsity-strengthened learning method, the sparse deep neural network achieves the best classification performance among several classical classifiers; moreover, both its sparse learning ability and its time complexity are better than those of traditional deep learning methods.
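To make the idea of a response-sparsity term in the fine-tuning loss concrete, the snippet below is a minimal NumPy sketch of the KL-divergence sparsity penalty from Ng's sparse autoencoder notes (reference 22), which the paper cites as the baseline it improves upon. It is an illustration only: the function name kl_sparsity_penalty and the parameters rho and beta are assumed for this sketch and are not the authors' exact residual formulae.

# Minimal sketch of a KL-divergence response-sparsity penalty (after Ng's sparse
# autoencoder notes, ref. 22); names and parameter values are illustrative.
import numpy as np

def kl_sparsity_penalty(hidden_activations, rho=0.05, beta=3.0):
    """Push the mean activation of each hidden unit toward the target rho.

    hidden_activations: array of shape (n_samples, n_hidden) with values in (0, 1),
                        e.g. sigmoid outputs of one hidden layer.
    Returns (penalty_value, gradient w.r.t. the mean activations), so the gradient
    term can be folded into that layer's backpropagated residual during fine-tuning.
    """
    rho_hat = hidden_activations.mean(axis=0)        # average response per unit
    rho_hat = np.clip(rho_hat, 1e-8, 1 - 1e-8)       # numerical safety
    kl = rho * np.log(rho / rho_hat) + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))
    penalty = beta * kl.sum()
    grad = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat))
    return penalty, grad

# Example: 100 samples, 20 hidden units with sigmoid-like activations.
acts = 1.0 / (1.0 + np.exp(-np.random.randn(100, 20)))
value, grad = kl_sparsity_penalty(acts)
print(value, grad.shape)   # scalar penalty, per-unit gradient of shape (20,)

Adding such a penalty to the fine-tuning objective keeps hidden-unit responses sparse after pre-training; the paper's contribution is a unified residual formulation for this kind of term, which the sketch above does not reproduce.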

Keywords

Deep neural network · Strengthening response sparsity · Sparse backpropagation algorithm · Unified residual formulae

Acknowledgements

This research was funded by NSFC Nos. 11471006 and 11101327, the Fundamental Research Funds for the Central Universities (No. xjj2017126), the Science and Technology Project of Xi’an (No. 201809164CX5JC6) and the HPC Platform of Xi’an Jiaotong University.

Compliance with ethical standards

Conflict of interest

The authors declare that there are no financial or other relationships that might lead to a conflict of interest regarding the present article.

References

  1. Hinton GE, Osindero S, Teh YW (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554
  2. LeCun Y, Bengio Y, Hinton GE (2015) Deep learning. Nature 521(7553):436–444
  3. Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323(6088):533–536
  4. Olshausen BA, Field DJ (2004) Sparse coding of sensory inputs. Curr Opin Neurobiol 14(4):481–487
  5. Morris G, Nevet A, Bergman H (2003) Anatomical funneling, sparse connectivity and redundancy reduction in the neural networks of the basal ganglia. J Physiol Paris 97(4–6):581–589
  6. Ji N, Zhang J, Zhang C et al (2014) Enhancing performance of restricted Boltzmann machines via log-sum regularization. Knowl Based Syst 63:82–96
  7. Banino A, Barry C et al (2018) Vector-based navigation using grid-like representations in artificial agents. Nature. https://doi.org/10.1038/s41586-018-0102-6
  8. Zhang H, Wang S, Xu X et al (2018) Tree2Vector: learning a vectorial representation for tree-structured data. IEEE Trans Neural Netw Learn Syst 29:1–15
  9. Zhang H, Wang S, Zhao M et al (2018) Locality reconstruction models for book representation. IEEE Trans Knowl Data Eng 30:873–1886
  10. Barlow HB (1972) Single units and sensation: a neuron doctrine for perceptual psychology. Perception 38(4):795–798
  11. Nair V, Hinton GE (2009) 3D object recognition with deep belief nets. In: International conference on neural information processing systems, pp 1339–1347
  12. Lee H, Ekanadham C, Ng AY (2008) Sparse deep belief net model for visual area V2. Adv Neural Inf Process Syst 20:873–880
  13. Lee H, Grosse R, Ranganath R et al (2011) Unsupervised learning of hierarchical representations with convolutional deep belief networks. Commun ACM 54(10):95–103
  14. Ranzato MA, Poultney C, Chopra S, LeCun Y (2006) Efficient learning of sparse representations with an energy-based model. Adv Neural Inf Process Syst 19:1137–1144
  15. Thom M, Palm G (2013) Sparse activity and sparse connectivity in supervised learning. J Mach Learn Res 14(1):1091–1143
  16. Wan W, Mabu S, Shimada K et al (2009) Enhancing the generalization ability of neural networks through controlling the hidden layers. Appl Soft Comput 9(1):404–414
  17. Jones M, Poggio T (1995) Regularization theory and neural networks architectures. Neural Comput 7(2):219–269
  18. Williams PM (1995) Bayesian regularization and pruning using a Laplace prior. Neural Comput 7(1):117–143
  19. Weigend AS, Rumelhart DE, Huberman BA (1990) Generalization by weight elimination with application to forecasting. In: Advances in neural information processing systems, pp 875–882
  20. Nowlan SJ, Hinton GE (1992) Simplifying neural networks by soft weight-sharing. Neural Comput 4(4):473–493
  21. Zhang J, Ji N, Liu J et al (2015) Enhancing performance of the backpropagation algorithm via sparse response regularization. Neurocomputing 153:20–40
  22. Ng A (2011) Sparse autoencoder. CS294A lecture notes, Stanford University
  23. Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313(5786):504–507
  24. Bengio Y, Lamblin P, Popovici D, Larochelle H (2006) Greedy layer-wise training of deep networks. In: Advances in neural information processing systems 19, pp 153–160
  25. Hinton GE (2002) Training products of experts by minimizing contrastive divergence. Neural Comput 14:1771–1800
  26. Hinton GE (2010) A practical guide to training restricted Boltzmann machines. Momentum 9(1):599–619
  27. Fischer A, Igel C (2014) Training restricted Boltzmann machines: an introduction. Pattern Recognit 47(1):25–39
  28. Donoho DL (2006) Compressed sensing. IEEE Trans Inf Theory 52(4):1289–1306
  29. Xiao H, Rasul K, Vollgraf R (2017) Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv:1708.07747v1
  30. van der Maaten L, Hinton GE (2008) Visualizing data using t-SNE. J Mach Learn Res 9(11):2579–2605

Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an, People’s Republic of China
