Training Neural Networks by Optimizing Random Subspaces of the Weight Space

  • Conference paper
Artificial Intelligence and Soft Computing (ICAISC 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9692)

Abstract

This paper describes a new approach to feed-forward neural network learning based on a random choice of a set of neurons that are temporarily active during neural network weight adaptation. The weights of the remaining neurons are locked (frozen). In contrast to the “dropout” method introduced by Hinton et al. [15], the neurons (along with their connections) are not removed from the neural network during training; their weights are simply not modified, i.e. they stay constant. This means that in every epoch of training only a randomly chosen part of the neural network (a set of neurons and their connections) adapts. Freezing neurons suppresses overfitting and prevents a drastic increase of the weights during learning, since the overall structure of the neural network does not change. In many cases the approach of training only parts of the neural network (subspaces of the weight space) shortens training time. Experimental results for medium-sized neural networks used for regression modeling are also provided.
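
The mechanism described in the abstract can be sketched in a few lines of code. The following is a minimal illustrative sketch, not the author's implementation: the network architecture, the toy regression data, the subset size, and the use of plain SGD in PyTorch are all assumptions. In each epoch a random subset of hidden neurons is marked as active, and the gradients of all weights belonging to the remaining (frozen) neurons are zeroed before the optimizer step, so those weights stay constant while every neuron still participates in the forward pass.

```python
# Minimal sketch (assumptions: architecture, data, subset size, plain SGD).
# Each epoch, only a random subset of hidden neurons has its weights adapted;
# the weights of the other neurons are frozen by zeroing their gradients.
import torch
import torch.nn as nn

torch.manual_seed(0)

n_in, n_hidden, n_out = 10, 64, 1
model = nn.Sequential(
    nn.Linear(n_in, n_hidden),   # hidden layer
    nn.Tanh(),
    nn.Linear(n_hidden, n_out),  # output layer
)
loss_fn = nn.MSELoss()
opt = torch.optim.SGD(model.parameters(), lr=0.01)  # plain SGD: zeroed grads => weights unchanged

# Toy regression data (illustrative only).
X = torch.randn(256, n_in)
y = X.sum(dim=1, keepdim=True) + 0.1 * torch.randn(256, 1)

subset_size = 16  # number of hidden neurons adapted per epoch (an assumption)

for epoch in range(100):
    # Randomly choose the hidden neurons that are active for adaptation this epoch.
    active = torch.randperm(n_hidden)[:subset_size]
    mask = torch.zeros(n_hidden, dtype=torch.bool)
    mask[active] = True

    opt.zero_grad()
    loss = loss_fn(model(X), y)  # all neurons take part in the forward pass
    loss.backward()

    # Freeze the complement: zero the gradients of every weight attached to an
    # inactive hidden neuron, so the optimizer leaves those weights constant.
    hidden, output = model[0], model[2]
    hidden.weight.grad[~mask, :] = 0.0   # incoming weights of frozen neurons
    hidden.bias.grad[~mask] = 0.0        # biases of frozen neurons
    output.weight.grad[:, ~mask] = 0.0   # outgoing weights of frozen neurons
    # (the output bias is left trainable here; a simplification of the scheme)

    opt.step()
```

Note that, unlike dropout, no neuron is removed from the forward pass; only the parameter update is restricted to a random subspace of the weight space in each epoch.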

References

  1. Ba, J., Frey, B.: Adaptive dropout for training deep neural networks. In: Advances in Neural Information Processing Systems (NIPS), pp. 3084–3092 (2013)

  2. Bartlett, P.L.: The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network. IEEE Trans. Inf. Theory 44(2), 525–536 (1998)

  3. Baldi, P., Hornik, K.: Learning from examples without local minima. Neural Netw. 2(1), 53–58 (1989)

  4. Choromanska, A., Henaff, M., Mathieu, M., Arous, G.B., LeCun, Y.: The loss surfaces of multilayer networks. In: Proceedings of the Conference on AI and Statistics (2014). http://arxiv.org/abs/1412.0233

  5. Bengio, Y.: Learning deep architectures for AI. Found. Trends Mach. Learn. 2(1), 1–127 (2009)

  6. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)

  7. Cybenko, G.: Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2(4), 303–314 (1989)

  8. Dahl, G.E., Sainath, T.N., Hinton, G.E.: Improving deep neural networks for LVCSR using rectified linear units and dropout. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2013), Vancouver (2013)

  9. Dauphin, Y., Pascanu, R., Gulcehre, C., Cho, K.: Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. In: Advances in Neural Information Processing Systems, vol. 27, pp. 2933–2941 (2014)

  10. Fine, T.: Feedforward Neural Network Methodology. Statistics for Engineering and Information Science. Springer, New York (1999)

  11. Goodfellow, I.J., Warde-Farley, D., Mirza, M., Courville, A., Bengio, Y.: Maxout networks. In: Proceedings of the 30th International Conference on Machine Learning, pp. 1319–1327. ACM (2013)

  12. Haykin, S.: Neural Networks: A Comprehensive Foundation, 2nd edn. Prentice Hall PTR, Upper Saddle River (1998)

  13. Hertz, J., Krogh, A., Palmer, R.G.: Introduction to the Theory of Neural Computation. Addison-Wesley, Redwood City (1991)

  14. Hinton, G.E., Osindero, S., Teh, Y.: A fast learning algorithm for deep belief nets. Neural Comput. 18, 1527–1554 (2006)

  15. Hinton, G.E., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.R.: Improving neural networks by preventing co-adaptation of feature detectors (2012). http://arxiv.org/abs/1207.0580

  16. Hornik, K.: Approximation capabilities of multilayer feedforward networks. Neural Netw. 4, 251–257 (1991)

  17. Huang, G.-B., Zhu, Q.-Y., Siew, C.-K.: Extreme learning machine: theory and applications. Neurocomputing 70(1), 489–501 (2006)

  18. Kushner, H., Yin, G.: Stochastic Approximation and Recursive Algorithms and Applications, 2nd edn. Springer, New York (2003)

  19. Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, New York (2006)

  20. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521, 436–444 (2015)

  21. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014)

  22. Sutskever, I., Martens, J., Dahl, G., Hinton, G.: On the importance of initialization and momentum in deep learning. In: Proceedings of the 30th International Conference on Machine Learning, Atlanta, Georgia, USA (2013)

  23. Schmidhuber, J.: Deep learning in neural networks: an overview (2014). http://arxiv.org/abs/1404.7828

  24. Qiu, X., Zhang, L., Ren, Y., Suganthan, P., Amaratunga, G.: Ensemble deep learning for regression and time series forecasting. In: 2014 IEEE Symposium on Computational Intelligence in Ensemble Learning (CIEL), pp. 1–6 (2014). doi:10.1109/CIEL.2014.7015739

  25. Wang, S.I., Manning, C.D.: Fast dropout training. In: Proceedings of the 30th International Conference on Machine Learning, Atlanta, Georgia, USA (2013)

Acknowledgments

This research was supported by grant S50242 at the Faculty of Electronics, Wrocław University of Science and Technology.

Author information

Corresponding author

Correspondence to Ewa Skubalska-Rafajłowicz.


Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Skubalska-Rafajłowicz, E. (2016). Training Neural Networks by Optimizing Random Subspaces of the Weight Space. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L., Zurada, J. (eds) Artificial Intelligence and Soft Computing. ICAISC 2016. Lecture Notes in Computer Science, vol. 9692. Springer, Cham. https://doi.org/10.1007/978-3-319-39378-0_14

  • DOI: https://doi.org/10.1007/978-3-319-39378-0_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-39377-3

  • Online ISBN: 978-3-319-39378-0

  • eBook Packages: Computer Science, Computer Science (R0)
