Application of Reinforcement Learning to Stacked Autoencoder Deep Network Architecture Optimization

  • Roman Zajdel
  • Maciej Kusy
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10841)


In this work, a new algorithm for the structure optimization of a stacked autoencoder deep network (SADN) is introduced. It searches for the numbers of neurons in the first and second layers of the SADN using an approach based on reinforcement learning (RL). A Q(0)-learning agent is constructed which, according to the received reinforcement signal, picks appropriate numbers of neurons. The network, with its architecture adjusted by the proposed algorithm, is applied to the recognition task on the MNIST digit database. The classification quality is computed for the SADN to determine its performance. It is shown that, using the proposed algorithm, a semi-optimal configuration of the numbers of hidden neurons can be found much faster than by successively exploring the entire space of layer arrangements.


Keywords: Stacked autoencoder deep network · Reinforcement learning · Classification quality



The work was supported by Rzeszow University of Technology, Department of Electronics Fundamentals Grant for Statutory Activity (DS 2018).



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Faculty of Electrical and Computer Engineering, Rzeszow University of Technology, Rzeszow, Poland
