Abstract
The coordination of unmanned air and ground vehicles has been an active research area because of its significant advantages: unmanned aerial vehicles (UAVs) have a wide field of view, enabling them to effectively guide a swarm of unmanned ground vehicles (UGVs). Owing to recent advances in artificial intelligence (AI), autonomous agents are being used to design more robust air–ground coordination, reducing the workload on human operators and increasing the autonomy of unmanned air–ground systems. A guidance-and-control shepherding design allows a single learning agent to influence and manage a larger swarm of rule-based entities. In this chapter, we present a learning algorithm for a sky shepherd guiding rule-based AI-driven UGVs. We introduce the apprenticeship bootstrapping learning algorithm and apply it to the aerial shepherding task.
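The shepherding setting described above can be illustrated with a minimal sketch: rule-based ground agents are repelled by a single shepherd, and the shepherd steers toward a driving point placed behind the flock's centre of mass, opposite the goal. This is a simplified, hypothetical toy model for illustration only (all function names, radii, and update rules here are assumptions, not the chapter's actual algorithm or the Strömbom model's full heuristics).

```python
import numpy as np

def shepherd_step(sheep, shepherd, goal, r_repel=2.0, r_drive=1.5, dt=0.1):
    """One update of a toy shepherding model (illustrative only).

    Sheep within r_repel of the shepherd take a unit step directly away
    from it; the shepherd steps toward a driving point placed r_drive
    behind the flock centre, on the line away from the goal.
    """
    sheep = np.asarray(sheep, dtype=float)
    shepherd = np.asarray(shepherd, dtype=float)
    goal = np.asarray(goal, dtype=float)

    # Rule-based sheep dynamics: repulsion from the shepherd only.
    new_sheep = sheep.copy()
    for i, s in enumerate(sheep):
        offset = s - shepherd
        dist = np.linalg.norm(offset)
        if 0 < dist < r_repel:
            new_sheep[i] = s + dt * offset / dist

    # Shepherd dynamics: approach the driving point behind the flock,
    # so that repelled sheep are pushed toward the goal.
    centre = new_sheep.mean(axis=0)
    away = centre - goal
    norm = np.linalg.norm(away)
    drive_point = centre + r_drive * (away / norm if norm > 0 else 0.0)
    to_drive = drive_point - shepherd
    d = np.linalg.norm(to_drive)
    new_shepherd = shepherd + dt * (to_drive / d if d > 0 else 0.0)
    return new_sheep, new_shepherd
```

Iterating this step from a shepherd position beyond the flock moves the flock centre toward the goal; in the chapter's setting, the hand-coded shepherd policy sketched here is what the learning agent replaces.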
Copyright information
© 2021 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Nguyen, H., Garratt, M., Abbass, H.A. (2021). Apprenticeship Bootstrapping Reinforcement Learning for Sky Shepherding of a Ground Swarm in Gazebo. In: Abbass, H.A., Hunjet, R.A. (eds) Shepherding UxVs for Human-Swarm Teaming. Unmanned System Technologies. Springer, Cham. https://doi.org/10.1007/978-3-030-60898-9_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60897-2
Online ISBN: 978-3-030-60898-9
eBook Packages: Engineering (R0)