Abstract
Traditionally, when training supervised classifiers with Backpropagation, the training dataset is a static representation of the learning environment. The error on this training set is propagated backwards through all the layers, and the gradient of the error with respect to the classifier's parameters is used to update them. However, this process stops once the parameters connecting the input layer to the next layer have been updated. We note that there is a residual error that could be propagated further backwards to the feature vector(s) in order to adapt the representation of the input features, and that using this residual error can lead to improved speed of convergence towards a generalised solution. We present a methodology for applying this new technique to Deep Learning methods, such as Deep Neural Networks and Convolutional Neural Networks.
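To make the idea concrete, the following is a minimal sketch, not the authors' implementation, of letting the error propagate one step further back, to the input features themselves. The network architecture, learning rates, and toy data are all assumptions made purely for illustration (shown here in PyTorch).

```python
import torch

torch.manual_seed(0)

# Toy data: 8 samples, 4 input features, 3 classes.
# The input features are made learnable so the residual error can adapt them.
x = torch.randn(8, 4, requires_grad=True)
y = torch.randint(0, 3, (8,))

model = torch.nn.Sequential(
    torch.nn.Linear(4, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 3),
)
loss_fn = torch.nn.CrossEntropyLoss()

weight_opt = torch.optim.SGD(model.parameters(), lr=0.1)   # standard weight updates
feature_opt = torch.optim.SGD([x], lr=0.01)                # assumed feature learning rate

for epoch in range(100):
    weight_opt.zero_grad()
    feature_opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()      # propagates the error all the way back to the inputs x
    weight_opt.step()    # usual parameter update
    feature_opt.step()   # additional update of the input feature representation
```

The only change from standard training is the second optimizer acting on the inputs: the gradient that would otherwise be discarded at the first layer is used to adjust the feature vectors as well.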
Notes
1. We use the notation i, j for Artificial Neural Networks, where i indicates the index of the weight in the previous layer of the network and j indicates the index of the weight in the current layer.
2. The epoch in which the validation set reaches the minimum error.
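As a purely illustrative reading of footnote 2, with a hypothetical validation-error history, the epoch of interest is simply the one with the lowest validation error:

```python
# Hypothetical validation error recorded after each training epoch.
val_errors = [0.42, 0.31, 0.27, 0.29, 0.33]

# Footnote 2: the epoch in which the validation set reaches the minimum error.
best_epoch = min(range(len(val_errors)), key=lambda e: val_errors[e])
print(best_epoch)  # -> 2 (zero-indexed)
```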
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Mosca, A., Magoulas, G.D. (2017). Learning Input Features Representations in Deep Learning. In: Angelov, P., Gegov, A., Jayne, C., Shen, Q. (eds) Advances in Computational Intelligence Systems. Advances in Intelligent Systems and Computing, vol 513. Springer, Cham. https://doi.org/10.1007/978-3-319-46562-3_28
DOI: https://doi.org/10.1007/978-3-319-46562-3_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46561-6
Online ISBN: 978-3-319-46562-3