Deep learning: an overview and main paradigms


In this paper, we examine the main paradigms for training multilayer neural networks, starting with the single-layer perceptron and ending with deep neural networks, which are regarded as a breakthrough in the field of intelligent data processing. We show that some common ideas about the capacity of multilayer neural networks are unfounded and justify the transition to deep neural networks. We discuss the principal learning models of deep neural networks, based on the restricted Boltzmann machine (RBM), an autoassociative approach, and a stochastic gradient method with a rectified linear unit (ReLU) activation function for the neural elements.




Author information



Corresponding author

Correspondence to V. A. Golovko.

About this article


Cite this article

Golovko, V.A. Deep learning: an overview and main paradigms. Opt. Mem. Neural Networks 26, 1–17 (2017).



  • deep neural networks
  • restricted Boltzmann machine
  • multilayer perceptron