Abstract
The use of Unmanned Aerial Vehicles (UAVs) is gradually gaining momentum in commercial applications. These applications, however, often rely on a single UAV, which comes with constraints such as its range of capacity or the number of sensors it can carry. Using several UAVs as a swarm makes it possible to overcome these limitations. Many metaheuristics have been designed to optimise the behaviour of UAV swarms. Manually designing such algorithms can, however, be very time-consuming and error-prone, since swarming relies on an emergent behaviour that can be hard to predict from local interactions. As a solution, this work proposes to automate the design of UAV swarming behaviours by means of a Q-learning based hyper-heuristic. Experimental results demonstrate that efficient swarming heuristics can be obtained independently of the problem size, which allows fast training on small instances.
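To make the idea of a Q-learning based hyper-heuristic more concrete, the following is a minimal sketch of the standard tabular Q-learning update combined with epsilon-greedy selection among low-level heuristics. It is not the authors' implementation: the state labels, the two heuristic names in `ACTIONS`, the reward value, and the parameters `alpha`, `gamma` and `epsilon` are all assumptions chosen for illustration only.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def choose_action(q, state, actions, epsilon, rng):
    """Epsilon-greedy choice among candidate low-level heuristics."""
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore: pick a random heuristic
    return max(actions, key=lambda a: q[(state, a)])  # exploit: best known


# Hypothetical low-level heuristics and states (assumptions, not from the paper).
ACTIONS = ["explore", "follow_pheromone"]
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
rng = random.Random(0)

# One illustrative update with an assumed reward of 1.0.
value = q_update(q_table, "s0", "explore", reward=1.0,
                 next_state="s1", actions=ACTIONS)
greedy = choose_action(q_table, "s0", ACTIONS, epsilon=0.0, rng=rng)
```

In this sketch the table is keyed by `(state, action)` pairs, so it grows only with the states actually visited; how states, actions and rewards are defined for a UAV swarm is left open here and would have to follow the paper's own formulation.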
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Duflo, G., Danoy, G., Talbi, E.G., Bouvry, P. (2021). A Q-Learning Based Hyper-Heuristic for Generating Efficient UAV Swarming Behaviours. In: Nguyen, N.T., Chittayasothorn, S., Niyato, D., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2021. Lecture Notes in Computer Science, vol 12672. Springer, Cham. https://doi.org/10.1007/978-3-030-73280-6_61
DOI: https://doi.org/10.1007/978-3-030-73280-6_61
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73279-0
Online ISBN: 978-3-030-73280-6
eBook Packages: Computer Science, Computer Science (R0)