Why Does Layer-by-Layer Pre-training Improve Deep Neural Networks Learning?

Handbook of Deep Learning Applications

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 136)

Abstract

Deep perceptron neural networks can implement a hierarchy of successive nonlinear transformations. However, training them with conventional learning methods such as error back-propagation faces serious obstacles owing to local minima. The layer-by-layer pre-training method has recently been proposed for training these networks and has shown considerable performance. In this method, the complex problem of training a deep neural network is broken down into simple sub-problems, each of which trains a corresponding single-hidden-layer neural network through the error back-propagation algorithm. This chapter discusses the theoretical principles of how this method effectively improves the training of deep neural networks, and proposes the maximum discrimination theory as a framework for analyzing training convergence in these networks. Two similar deep neural networks, one trained directly with the conventional error back-propagation algorithm and the other with the layer-by-layer pre-training method, are then compared in terms of how well their layers discriminate the inputs, and the results confirm the validity of the proposed framework.
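
To make the decomposition concrete, the following is a minimal Python/NumPy sketch of greedy layer-by-layer pre-training. It uses an autoencoder-style reconstruction objective as a stand-in for the chapter's own maximum-discrimination criterion; the function names and hyperparameters are illustrative assumptions, not the authors' implementation.

    # Minimal sketch of greedy layer-by-layer pre-training (hypothetical names;
    # a reconstruction objective stands in for the chapter's maximum-discrimination
    # criterion). Each layer is a single-hidden-layer sub-problem trained with
    # plain error back-propagation, as the abstract describes.
    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def pretrain_layer(X, n_hidden, lr=0.1, epochs=50):
        """Train one single-hidden-layer autoencoder on X via back-propagation;
        return the encoder weights and the hidden representation."""
        n_in = X.shape[1]
        W_enc = rng.normal(0.0, 0.1, (n_in, n_hidden))
        W_dec = rng.normal(0.0, 0.1, (n_hidden, n_in))
        for _ in range(epochs):
            H = sigmoid(X @ W_enc)          # encode
            X_hat = sigmoid(H @ W_dec)      # decode (reconstruction)
            # Back-propagate the squared reconstruction error through both layers.
            d_out = (X_hat - X) * X_hat * (1.0 - X_hat)
            d_hid = (d_out @ W_dec.T) * H * (1.0 - H)
            W_dec -= lr * H.T @ d_out / len(X)
            W_enc -= lr * X.T @ d_hid / len(X)
        return W_enc, sigmoid(X @ W_enc)

    def pretrain_stack(X, layer_sizes):
        """Pre-train each layer on the previous layer's output; the stacked
        weights would then initialize a deep network for end-to-end fine-tuning."""
        weights, H = [], X
        for n_hidden in layer_sizes:
            W, H = pretrain_layer(H, n_hidden)
            weights.append(W)
        return weights

    # Usage: pre-train a 64-32-16 encoder stack on random data.
    X = rng.random((200, 64))
    stack = pretrain_stack(X, [32, 16])

Each call to pretrain_layer solves one shallow, well-conditioned sub-problem, which is the decomposition the abstract refers to; the pre-trained weights serve as an initialization that conventional back-propagation then fine-tunes.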



Author information

Corresponding author

Correspondence to Seyyed Ali Seyyedsalehi.


Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Seyyedsalehi, S.Z., Seyyedsalehi, S.A. (2019). Why Does Layer-by-Layer Pre-training Improve Deep Neural Networks Learning? In: Balas, V., Roy, S., Sharma, D., Samui, P. (eds) Handbook of Deep Learning Applications. Smart Innovation, Systems and Technologies, vol 136. Springer, Cham. https://doi.org/10.1007/978-3-030-11479-4_13
