In multi-agent navigation, agents need to move towards their goal locations while avoiding collisions with other agents and obstacles, often without communication. Existing methods compute motions that are locally optimal but do not account for the aggregated motions of all agents, producing inefficient global behavior, especially when agents move in a crowded space. In this work, we develop a method that allows agents to dynamically adapt their behavior to their local conditions. We formulate the multi-agent navigation problem as an action-selection problem and propose an approach, ALAN, that allows agents to compute time-efficient and collision-free motions. ALAN is highly scalable because each agent makes its own decisions on how to move, using a set of velocities optimized for a variety of navigation tasks. Experimental results show that agents using ALAN generally reach their destinations faster than agents using ORCA, a state-of-the-art collision avoidance framework, or two other navigation models.
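The action-selection formulation can be illustrated with a generic multi-armed-bandit sketch. Everything below is an assumption for illustration only: the action set, the epsilon-greedy rule, and the toy reward are placeholders, not ALAN's actual velocity set, selection strategy, or reward function.

```python
import random

# Hypothetical action set: a few candidate velocities in a goal-aligned
# frame (x = toward the goal). Names and values are illustrative only.
ACTIONS = {
    "goal":     (1.0, 0.0),   # full speed toward the goal
    "left":     (0.7, 0.7),   # veer left
    "right":    (0.7, -0.7),  # veer right
    "slow":     (0.3, 0.0),   # slow down
    "backward": (-0.5, 0.0),  # back up to relieve congestion
}

class EpsilonGreedyBandit:
    """Minimal epsilon-greedy selector over a fixed action set — a
    generic multi-armed-bandit baseline, not ALAN's selection rule."""

    def __init__(self, actions, epsilon=0.1):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.value = {a: 0.0 for a in self.actions}  # running mean reward
        self.count = {a: 0 for a in self.actions}

    def select(self):
        if random.random() < self.epsilon:
            return random.choice(self.actions)         # explore
        return max(self.actions, key=self.value.get)   # exploit

    def update(self, action, reward):
        self.count[action] += 1
        self.value[action] += (reward - self.value[action]) / self.count[action]

random.seed(7)
bandit = EpsilonGreedyBandit(ACTIONS)
for _ in range(1000):
    a = bandit.select()
    vx, _ = ACTIONS[a]
    # Toy "crowded corridor": pushing straight at full speed collides,
    # so reward is goal progress (vx) minus a collision penalty.
    collides = vx > 0.9
    r = vx - (2.0 if collides else 0.0)
    bandit.update(a, r)

best = max(ACTIONS, key=bandit.value.get)
print(best)  # the learner settles on veering rather than pushing straight
```

In this toy crowded setting the agent learns that veering around congestion yields more goal progress than colliding head-on, which mirrors the paper's motivation for adapting the choice of motion to local conditions.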
Videos highlighting our work can be found at http://motion.cs.umn.edu/r/ActionSelection.
This work was partially funded by the University of Minnesota Informatics Institute, the CONICYT PFCHA/DOCTORADO BECAS CHILE/2009 - 72100243 and the NSF through grants #CHS-1526693, #CNS-1544887, #IIS-1748541 and #IIP-1439728.
This is one of several papers published in Autonomous Robots comprising the “Special Issue on Distributed Robotics: From Fundamentals to Applications”.
Godoy, J., Chen, T., Guy, S.J. et al. ALAN: adaptive learning for multi-agent navigation. Auton Robot 42, 1543–1562 (2018). https://doi.org/10.1007/s10514-018-9719-4
Keywords:
- Multi-agent navigation
- Online learning
- Action selection
- Multi-agent coordination