Distributed Multi-agent Target Search and Tracking With Gaussian Process and Reinforcement Learning

  • Regular Papers
  • Intelligent Control and Applications
International Journal of Control, Automation and Systems


Deploying multiple robots for target search and tracking has many practical applications, yet planning over unknown or partially known targets remains difficult. With recent advances in deep learning, intelligent control techniques such as reinforcement learning have enabled agents to learn autonomously from interactions with the environment with little to no prior knowledge. Such methods can address the exploration-exploitation tradeoff of planning over unknown targets in a data-driven manner, streamlining the decision-making pipeline through end-to-end training. In this paper, we propose a multi-agent reinforcement learning (MARL) technique with target map building based on a distributed Gaussian process (GP). We leverage the distributed GP to encode belief over the target locations in a scalable manner and incorporate it into a MARL framework with centralized training and decentralized execution to plan efficiently over unknown targets. We evaluate the performance and transferability of the trained policy in simulation and demonstrate the method on a swarm of micro unmanned aerial vehicles in hardware experiments.
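To make the idea of a GP-encoded belief over target locations concrete, the following is a minimal sketch of standard (centralized, single-agent) GP regression over a 2-D workspace. It is not the distributed GP or the MARL pipeline of the paper; the squared-exponential kernel, the hyperparameters, the function names, and the sample measurements are all assumptions chosen for illustration.

```python
import math

def rbf(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two 2-D points (assumed kernel)."""
    d2 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return signal_var * math.exp(-0.5 * d2 / length_scale ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L x = b by forward substitution."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def solve_upper_from_lower(L, b):
    """Solve L^T x = b by back substitution."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / L[i][i]
    return x

def gp_posterior(X, y, query, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at one query point."""
    n = len(X)
    K = [[rbf(X[i], X[j]) + (noise_var if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_from_lower(L, solve_lower(L, y))  # (K + noise*I)^{-1} y
    k_star = [rbf(x, query) for x in X]
    v = solve_lower(L, k_star)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(query, query) - sum(vi * vi for vi in v)
    return mean, var

# Hypothetical target-signal measurements at three visited locations.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
y = [1.0, 0.2, 0.1]

mean_near, var_near = gp_posterior(X, y, (0.1, 0.0))  # close to visited locations
mean_far, var_far = gp_posterior(X, y, (5.0, 5.0))    # unexplored region
```

In this sketch the posterior variance stays near the prior in unexplored regions and shrinks near measurements, which is the quantity an exploration-exploitation planner would typically trade off against the posterior mean.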

Author information

Corresponding author

Correspondence to H. Jin Kim.

Ethics declarations

The authors declare that there is no competing financial interest or personal relationship that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s Note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by the Agency for Defense Development under Contract UD190026RD.

Jigang Kim received his B.S. degree in mechanical and aerospace engineering from Seoul National University in 2018. He is currently pursuing an integrated M.S./Ph.D. degree at the Department of Mechanical and Aerospace Engineering, Seoul National University. His research interests include robot learning, machine learning, and reinforcement learning.

Dohyun Jang received his B.S. degree in electrical engineering from Korea University in 2017, and his M.S. and Ph.D. degrees in mechanical and aerospace engineering from Seoul National University, Seoul, in 2019 and 2022, respectively. His research interests include distributed systems, networked systems, machine learning, and robotics.

H. Jin Kim received her B.S. degree from Korea Advanced Institute of Science and Technology (KAIST) in 1995, and her M.S. and Ph.D. degrees in mechanical engineering from University of California, Berkeley (UC Berkeley), in 1999 and 2001, respectively. In September 2004, she joined the Department of Mechanical and Aerospace Engineering at Seoul National University as an Assistant Professor, where she is currently a Professor. Her research interests include autonomous robotics and robot vision.

About this article

Cite this article

Kim, J., Jang, D. & Kim, H.J. Distributed Multi-agent Target Search and Tracking With Gaussian Process and Reinforcement Learning. Int. J. Control Autom. Syst. 21, 3057–3067 (2023).
