Abstract
Deep perceptron neural networks can implement a hierarchy of successive nonlinear transformations. However, training these networks with conventional learning methods, such as error back-propagation, faces serious obstacles owing to local minima. The layer-by-layer pre-training method has recently been proposed for training such networks and has shown considerable performance. In this method, the complex problem of training a deep neural network is broken down into simple sub-problems, in each of which a corresponding single-hidden-layer network is trained through the error back-propagation algorithm. In this chapter, the theoretical principles of how this method effectively improves the training of deep neural networks are discussed, and the maximum discrimination theory is proposed as a suitable framework for analyzing training convergence in these networks. Subsequently, the discrimination of inputs at different layers of two similar deep neural networks, one trained directly through the conventional error back-propagation algorithm and the other through the layer-by-layer pre-training method, is compared, and the results confirm the validity of the proposed framework.
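The decomposition described above, training one single-hidden-layer sub-network at a time and stacking the results, can be sketched in code. The following is a minimal illustration of one common variant, greedy autoencoder-based pre-training with plain gradient-descent back-propagation; it does not implement the maximum-discrimination criterion the chapter develops, and all function names and hyperparameters here are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pretrain_layer(X, n_hidden, epochs=200, lr=0.5):
    """Solve one simple sub-problem: train a single-hidden-layer
    autoencoder on X with back-propagation; return its encoder weights."""
    n_in = X.shape[1]
    W1 = rng.normal(0, 0.1, (n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, n_in))
    b2 = np.zeros(n_in)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)        # encode
        X_hat = sigmoid(H @ W2 + b2)    # decode (reconstruct the input)
        # back-propagate the mean squared reconstruction error
        d_out = (X_hat - X) * X_hat * (1 - X_hat)
        d_hid = (d_out @ W2.T) * H * (1 - H)
        W2 -= lr * H.T @ d_out / len(X)
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * X.T @ d_hid / len(X)
        b1 -= lr * d_hid.mean(axis=0)
    return W1, b1

def greedy_pretrain(X, layer_sizes):
    """Pre-train a deep stack layer by layer: each new layer is trained
    on the hidden representation produced by the previous layers."""
    params, H = [], X
    for n_hidden in layer_sizes:
        W, b = pretrain_layer(H, n_hidden)
        params.append((W, b))
        H = sigmoid(H @ W + b)  # frozen features feed the next sub-problem
    return params, H

X = rng.random((64, 8))                       # toy data in [0, 1)
params, H = greedy_pretrain(X, [6, 4, 2])
print([W.shape for W, _ in params])           # [(8, 6), (6, 4), (4, 2)]
```

After this pre-training pass, the stacked encoder weights would typically initialize the full deep network, which is then fine-tuned end-to-end with back-propagation from a starting point closer to a good solution than a random initialization.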
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Seyyedsalehi, S.Z., Seyyedsalehi, S.A. (2019). Why Does Layer-by-Layer Pre-training Improve Deep Neural Networks Learning?. In: Balas, V., Roy, S., Sharma, D., Samui, P. (eds) Handbook of Deep Learning Applications. Smart Innovation, Systems and Technologies, vol 136. Springer, Cham. https://doi.org/10.1007/978-3-030-11479-4_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11478-7
Online ISBN: 978-3-030-11479-4
eBook Packages: Intelligent Technologies and Robotics (R0)