Bayes–Nash: Bayesian inference for Nash equilibrium selection in human-robot parallel play

Bansal, Shray; Xu, Jin; Howard, Ayanna; Isbell, Charles

doi:10.1007/s10514-021-10023-8

Published: 05 November 2021

Autonomous Robots, Volume 46, pages 217–230 (2022)

Shray Bansal (ORCID: orcid.org/0000-0002-0199-0806)¹, Jin Xu¹, Ayanna Howard² & Charles Isbell¹


Abstract

We consider shared-workspace scenarios in which humans and robots act to achieve independent goals, termed parallel play. We model these scenarios as general-sum games and construct a framework that uses the Nash equilibrium solution concept to account for the interactive effect of both agents while planning. We find multiple Pareto-optimal equilibria in these tasks. We hypothesize that people act by choosing an equilibrium based on social norms and their personalities. To enable coordination, we infer the equilibrium online using a probabilistic model that includes these two factors and use it to select the robot’s action. We apply our approach to a close-proximity pick-and-place task involving a robot and a simulated human with three potential behaviors: defensive, selfish, and norm-following. We show that using a Bayesian approach to infer the equilibrium enables the robot to complete the task with less than half the number of collisions, while also reducing the task execution time, compared to the best baseline. We also performed a study with human participants interacting either with other humans or with different robot agents, and observed that our proposed approach performs similarly to human-human parallel play interactions.
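
The idea of inferring which equilibrium the human is playing can be illustrated with a toy example. The sketch below is not the authors' implementation: the 2x2 payoff matrices, the action names, and the observation likelihoods are all hypothetical values chosen for illustration, and it omits the social-norm and personality factors that the paper's probabilistic model includes. It enumerates the pure-strategy Nash equilibria of a small general-sum game, applies Bayes' rule to a belief over those equilibria, and picks the robot's action from the most probable one.

```python
from itertools import product

# Hypothetical 2x2 general-sum game. Actions: 0 = "go first", 1 = "wait".
# Rows index the human's action, columns index the robot's action.
# Both going first means a collision (-10 each); both waiting wastes time (0 each).
PAYOFF_HUMAN = [[-10, 2], [1, 0]]
PAYOFF_ROBOT = [[-10, 1], [2, 0]]


def pure_nash_equilibria(payoff_h, payoff_r):
    """Return every pure-strategy Nash equilibrium (human_action, robot_action)."""
    n_h, n_r = len(payoff_h), len(payoff_h[0])
    equilibria = []
    for i, j in product(range(n_h), range(n_r)):
        # Human cannot gain by deviating while the robot keeps playing j.
        human_ok = all(payoff_h[i][j] >= payoff_h[k][j] for k in range(n_h))
        # Robot cannot gain by deviating while the human keeps playing i.
        robot_ok = all(payoff_r[i][j] >= payoff_r[i][k] for k in range(n_r))
        if human_ok and robot_ok:
            equilibria.append((i, j))
    return equilibria


def bayes_update(belief, likelihoods):
    """Posterior over equilibria: prior times likelihood, then normalize."""
    unnormalized = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


equilibria = pure_nash_equilibria(PAYOFF_HUMAN, PAYOFF_ROBOT)

# Uniform prior over the candidate equilibria (an assumption of this sketch).
belief = [1.0 / len(equilibria)] * len(equilibria)

# Assumed observation model: the human is seen advancing toward the object.
# That observation is likely (0.8) if the human's equilibrium action is
# "go first" and unlikely (0.2) if it is "wait".
likelihood_advancing = [0.8 if h_action == 0 else 0.2 for h_action, _ in equilibria]

belief = bayes_update(belief, likelihood_advancing)

# The robot plays its part of the most probable equilibrium.
best = max(range(len(equilibria)), key=lambda idx: belief[idx])
robot_action = equilibria[best][1]

print("equilibria:", equilibria)
print("posterior:", belief)
print("robot action:", robot_action)
```

With these made-up payoffs the game has two equilibria, one in which each agent goes first, so the payoffs alone do not tell the robot what to do. The observation is what breaks the tie, which is the coordination problem the abstract describes.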


Notes

This analysis assumes a parameterization of the RRT algorithm such that it completes in a reasonable amount of time.


Author information

Authors and Affiliations

Georgia Institute of Technology, Atlanta, GA, USA
Shray Bansal, Jin Xu & Charles Isbell
Ohio State University, Columbus, OH, USA
Ayanna Howard


Corresponding author

Correspondence to Shray Bansal.


About this article

Cite this article

Bansal, S., Xu, J., Howard, A. et al. Bayes–Nash: Bayesian inference for Nash equilibrium selection in human-robot parallel play. Auton Robot 46, 217–230 (2022). https://doi.org/10.1007/s10514-021-10023-8


Received: 01 February 2021
Accepted: 27 September 2021
Published: 05 November 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s10514-021-10023-8
