Indoor Pursuit-Evasion with Hybrid Hierarchical Partially Observable Markov Decision Processes for Multi-robot Systems
In this paper, we examine a pursuit-evasion problem in which multiple pursuers may search for a single evader in indoor environments. Partially Observable Markov Decision Processes (POMDPs) provide a framework for modeling the uncertainty arising from the unknown location of the evader. However, this approach is intractable even with a single pursuer and a single evader. We therefore propose a Hybrid Hierarchical POMDP structure for improved scalability and efficiency. The structure consists of (i) base MDPs for the cases where the evader is visible to the pursuers, (ii) abstract POMDPs for the evader states that are not directly observable, and (iii) transition states that bridge the base MDPs and the abstract POMDPs. By abstracting the structure of the environment, this hybrid approach significantly reduces the number of states expanded in the policy tree. Experimental results show that our method expands only 5% of the nodes generated by a standard POMDP solution.
Keywords: Pursuit and evasion; Multi-robot systems; Markov decision processes
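The split between base MDPs and abstract POMDPs described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the region graph `ADJACENT`, the uniform evader motion model in `predict`, the perfect within-region sensing in `update_unseen`, and the function `select_level`, which stands in for the transition states by simply switching levels on visibility.

```python
from typing import Dict, Optional, Set

# Hypothetical abstraction of an indoor map into regions (assumed, not from the paper).
ADJACENT: Dict[str, list] = {
    "room_a": ["hall"],
    "room_b": ["hall"],
    "hall": ["room_a", "room_b"],
}


def predict(belief: Dict[str, float]) -> Dict[str, float]:
    """Propagate the belief one step: the evader stays put or moves to an
    adjacent region with equal probability (assumed motion model)."""
    new_belief = {region: 0.0 for region in belief}
    for region, prob in belief.items():
        options = [region] + ADJACENT[region]
        for nxt in options:
            new_belief[nxt] += prob / len(options)
    return new_belief


def update_unseen(belief: Dict[str, float], searched: Set[str]) -> Dict[str, float]:
    """Condition the belief on NOT seeing the evader in the searched regions
    (assumes a pursuer inside a region always detects an evader there)."""
    posterior = {r: (0.0 if r in searched else p) for r, p in belief.items()}
    total = sum(posterior.values())
    if total == 0.0:
        raise ValueError("observation is inconsistent with the belief")
    return {r: p / total for r, p in posterior.items()}


def select_level(evader_seen_in: Optional[str]) -> str:
    """Pick the planning level: a fully observable base MDP when the evader
    is visible, otherwise the abstract POMDP over regions."""
    return "base_mdp" if evader_seen_in is not None else "abstract_pomdp"


if __name__ == "__main__":
    belief = {region: 1.0 / len(ADJACENT) for region in ADJACENT}
    belief = update_unseen(predict(belief), searched={"hall"})
    print(select_level(None), belief)
```

In this sketch the belief is kept over a handful of regions rather than over every grid cell, which is the kind of abstraction the abstract credits for the reduction in expanded states; the actual state, action, and observation definitions used by the authors are not given in this excerpt.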