Deep Active Learning for Autonomous Navigation

Hussein, Ahmed; Gaber, Mohamed Medhat; Elyan, Eyad

doi:10.1007/978-3-319-44188-7_1

Ahmed Hussein¹²,
Mohamed Medhat Gaber¹² &
Eyad Elyan¹²

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 629))

Included in the following conference series:

International Conference on Engineering Applications of Neural Networks

2497 Accesses
6 Citations
2 Altmetric

Abstract

Imitation learning refers to an agent’s ability to mimic a desired behavior by learning from observations. A major challenge facing learning from demonstrations is to represent the demonstrations in a manner that is adequate for learning and efficient for real time decisions. Creating feature representations is especially challenging when extracted from high dimensional visual data. In this paper, we present a method for imitation learning from raw visual data. The proposed method is applied to a popular imitation learning domain that is relevant to a variety of real life applications; namely navigation. To create a training set, a teacher uses an optimal policy to perform a navigation task, and the actions taken are recorded along with visual footage from the first person perspective. Features are automatically extracted and used to learn a policy that mimics the teacher via a deep convolutional neural network. A trained agent can then predict an action to perform based on the scene it finds itself in. This method is generic, and the network is trained without knowledge of the task, targets or environment in which it is acting. Another common challenge in imitation learning is generalizing a policy over unseen situation in training data. To address this challenge, the learned policy is subsequently improved by employing active learning. While the agent is executing a task, it can query the teacher for the correct action to take in situations where it has low confidence. The active samples are added to the training set and used to update the initial policy. The proposed approach is demonstrated on 4 different tasks in a 3D simulated environment. The experiments show that an agent can effectively perform imitation learning from raw visual data for navigation tasks and that active learning can significantly improve the initial policy using a small number of samples. The simulated testbed facilitates reproduction of these results and comparison with other approaches.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Abbeel, P., Coates, A., Quigley, M., Ng, A.Y.: An application of reinforcement learning to aerobatic helicopter flight. Adv. Neural Inf. Process. Syst. 19, 1 (2007)
Google Scholar
Argall, B.D., Chernova, S., Veloso, M., Browning, B.: A survey of robot learning from demonstration. Robot. Auton. Syst. 57(5), 469–483 (2009)
Article Google Scholar
Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The arcade learning environment: an evaluation platform for general agents (2012). arXiv preprint arXiv:1207.4708
Bemelmans, R., Gelderblom, G.J., Jonker, P., De Witte, L.: Socially assistive robots in elderly care: a systematic review into effects and effectiveness. J. Am. Med. Direct. Assoc. 13(2), 114–120 (2012)
Article Google Scholar
Calinon, S., Billard, A.G.: What is the teachers role in robot programming by demonstration? Toward benchmarks for improved learning. Interact. Stud. 8(3), 441–464 (2007)
Article Google Scholar
Cardamone, L., Loiacono, D., Lanzi, P.L.: Learning drivers for torcs through imitation using supervised methods. In: 2009 IEEE Symposium on Computational Intelligence and Games, CIG 2009, pp. 148–155. IEEE (2009)
Google Scholar
Chernova, S., Veloso, M.: Confidence-based policy learning from demonstration using Gaussian mixture models. In: Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems, p. 233. ACM (2007)
Google Scholar
Ciresan, D., Meier, U., Schmidhuber, J.: Multi-column deep neural networks for image classification. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3642–3649. IEEE (2012)
Google Scholar
Clark, C., Storkey, A.: Training deep convolutional neural networks to play go. In: Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), pp. 1766–1774 (2015)
Google Scholar
Dixon, K.R., Khosla, P.K.: Learning by observation with mobile robots: a computational approach. In: Proceedings 2004 IEEE International Conference on Robotics and Automation, ICRA 2004, vol. 1, pp. 102–107. IEEE (2004)
Google Scholar
Gorman, B.: Imitation learning through games: theory, implementation and evaluation. Ph.D. thesis, Dublin City University (2009)
Google Scholar
Guo, X., Singh, S., Lee, H., Lewis, R.L., Wang, X.: Deep learning for real-time Atari game play using offline monte-carlo tree search planning. In: Proceedings of Advances in Neural Information Processing Systems, pp. 3338–3346 (2014)
Google Scholar
Ijspeert, A.J., Nakanishi, J., Schaal, S.: Learning rhythmic movements by demonstration using nonlinear oscillators. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2002), pp. 958–963 (2002). No. BIOROB-CONF-2002-003
Google Scholar
Judah, K., Fern, A., Dietterich, T.G.: Active imitation learning via reduction to IID active learning (2012). arXiv preprint arXiv:1210.4876
Karakovskiy, S., Togelius, J.: The mario AI benchmark and competitions. IEEE Trans. Comput. Intell. AI Games 4(1), 55–67 (2012)
Article Google Scholar
Kober, J., Bagnell, J.A., Peters, J.: Reinforcement learning in robotics: a survey. Int. J. Robot. Res. 32, 1238 (2013). 0278364913495721
Article Google Scholar
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Proceedings of Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Google Scholar
Levine, S., Finn, C., Darrell, T., Abbeel, P.: End-to-end training of deep visuomotor policies (2015). arXiv preprint arXiv:1504.00702
Mash-simulator (2014). https://github.com/idiap/mash-simulator
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., Riedmiller, M.: Playing Atari with deep reinforcement learning (2013). arXiv preprint arXiv:1312.5602
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
Article Google Scholar
Munoz, J., Gutierrez, G., Sanchis, A.: Controller for torcs created by imitation. In: 2009 IEEE Symposium on Computational Intelligence and Games, CIG 2009, pp. 271–278. IEEE (2009)
Google Scholar
Ng, A.Y., Coates, A., Diel, M., Ganapathi, V., Schulte, J., Tse, B., Berger, E., Liang, E.: Autonomous inverted helicopter flight via reinforcement learning. In: Ang Jr., M.H., Khatib, O. (eds.) Experimental Robotics IX. Springer Tracts in Advanced Robotics, vol. 21, pp. 363–372. Springer, Heidelberg (2006)
Chapter Google Scholar
Nicolescu, M.N., Mataric, M.J.: Natural methods for robot task learning: instructive demonstrations, generalization and practice. In: Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 241–248. ACM (2003)
Google Scholar
Noda, I., Matsubara, H., Hiraki, K., Frank, I.: Soccer server: a tool for research on multiagent systems. Appl. Artif. Intell. 12(2–3), 233–250 (1998)
Article Google Scholar
Ollis, M., Huang, W.H., Happold, M.: A Bayesian approach to imitation learning for robot navigation. In: 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2007, pp. 709–714. IEEE (2007)
Google Scholar
Ratliff, N., Bradley, D., Bagnell, J.A., Chestnutt, J.: Boosting structured prediction for imitation learning. In: Proceedings of Robotics Institute, p. 54 (2007)
Google Scholar
Ross, S., Bagnell, D.: Efficient reductions for imitation learning. In: International Conference on Artificial Intelligence and Statistics, pp. 661–668 (2010)
Google Scholar
Sammut, C., Hurst, S., Kedzier, D., Michie, D., et al.: Learning to fly. In: Proceedings of the Ninth International Workshop on Machine Learning, pp. 385–393 (1992)
Google Scholar
Saunders, J., Nehaniv, C.L., Dautenhahn, K.: Teaching robots by moulding behavior and scaffolding the environment. In: Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-robot Interaction, pp. 118–125. ACM (2006)
Google Scholar
Schaal, S.: Is imitation learning the route to humanoid robots? Trends Cogn. Sci. 3(6), 233–242 (1999)
Article Google Scholar
Silver, D., Bagnell, J., Stentz, A.: High performance outdoor navigation from overhead data using imitation learning. In: Proceedings of Robotics: Science and Systems IV, Zurich, Switzerland (2008)
Google Scholar
Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., et al.: Mastering the game of go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)
Article Google Scholar
Theano Development Team: Theano: a Python framework for fast computation of mathematical expressions. arXiv e-prints abs/1605.02688, May 2016. http://arxiv.org/abs/1605.02688
Togelius, J., De Nardi, R., Lucas, S.M.: Towards automatic personalised content creation for racing games. In: 2007 IEEE Symposium on Computational Intelligence and Games, CIG 2007, pp. 252–259. IEEE (2007)
Google Scholar
Vogt, D., Amor, H.B., Berger, E., Jung, B.: Learning two-person interaction models for responsive synthetic humanoids. J. Virtual Real. Broadcast. 11(1) (2014)
Google Scholar

Download references

Author information

Authors and Affiliations

School of Computing, Robert Gordon University, Garthdee Road, Aberdeen, AB10 7QB, UK
Ahmed Hussein, Mohamed Medhat Gaber & Eyad Elyan

Authors

Ahmed Hussein
View author publications
You can also search for this author in PubMed Google Scholar
Mohamed Medhat Gaber
View author publications
You can also search for this author in PubMed Google Scholar
Eyad Elyan
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Ahmed Hussein .

Editor information

Editors and Affiliations

Robert Gordon University, Aberdeen, United Kingdom
Chrisina Jayne
Lab of Forest Informatics (FiLAB), Democritus University of Thrace Lab of Forest Informatics (FiLAB), Orestiada, Greece
Lazaros Iliadis

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Hussein, A., Gaber, M.M., Elyan, E. (2016). Deep Active Learning for Autonomous Navigation. In: Jayne, C., Iliadis, L. (eds) Engineering Applications of Neural Networks. EANN 2016. Communications in Computer and Information Science, vol 629. Springer, Cham. https://doi.org/10.1007/978-3-319-44188-7_1

Download citation

DOI: https://doi.org/10.1007/978-3-319-44188-7_1
Published: 19 August 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44187-0
Online ISBN: 978-3-319-44188-7
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics