Deep Feature-Action Processing with Mixture of Updates

  • Conference paper
Neural Information Processing (ICONIP 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9492)

Abstract

This paper explores the possibility of combining an actor and a critic in one architecture and uses a mixture of updates to train them. It describes a model for robot navigation whose architecture resembles an actor-critic reinforcement learning architecture. The actor is set up as a layer followed by a second layer that deduces the value function. The effect is an outcome similar to that of a critic, combined with the actor in one network. The model can hence serve as the basis for a truly deep reinforcement learning architecture to be explored in future work. More importantly, this work explores the results of mixing conjugate gradient updates with gradient updates for this architecture. The reward signal is backpropagated from the critic to the actor through a conjugate gradient eligibility trace for the second layer combined with a gradient eligibility trace for the first layer. We show that this mixture of updates seems to work well for this model. The feature layer has been deeply trained by applying a simple PCA to the whole set of image histograms acquired during the first running episode. The model is also able to adapt autonomously to a reduced feature dimension. Initial experimental results on a real robot show that the agent achieved a good success rate in reaching a goal location.
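
The feature step mentioned in the abstract (PCA over the image histograms collected in the first episode) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the 32-bin grayscale histogram, the choice of 5 components, and the synthetic images are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def image_histogram(image, bins=32):
    """Normalized intensity histogram of one grayscale image (values 0..255).

    The bin count is an illustrative assumption, not taken from the paper.
    """
    counts, _ = np.histogram(image, bins=bins, range=(0, 256))
    return counts / counts.sum()

def fit_pca(histograms, n_components):
    """Fit PCA to the rows of `histograms`; return (mean, components)."""
    mean = histograms.mean(axis=0)
    centered = histograms - mean
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(histogram, mean, components):
    """Map one histogram to its reduced feature vector."""
    return components @ (histogram - mean)

# Synthetic stand-in for the images gathered during a first episode.
rng = np.random.default_rng(0)
episode_images = rng.integers(0, 256, size=(50, 24, 32))

histograms = np.stack([image_histogram(im) for im in episode_images])
mean, components = fit_pca(histograms, n_components=5)
features = project(histograms[0], mean, components)
```

Here `features` is the reduced vector that a downstream actor layer could consume. Changing `n_components` changes the feature dimension, which is one simple way to experiment with the reduced-dimension setting the abstract mentions.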

Author information

Corresponding author

Correspondence to Abdulrahman Altahhan.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Altahhan, A. (2015). Deep Feature-Action Processing with Mixture of Updates. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9492. Springer, Cham. https://doi.org/10.1007/978-3-319-26561-2_1

  • DOI: https://doi.org/10.1007/978-3-319-26561-2_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26560-5

  • Online ISBN: 978-3-319-26561-2

  • eBook Packages: Computer Science, Computer Science (R0)
