Abstract
As intelligent systems gain autonomy and capability, it becomes vital to ensure that their objectives match those of their human users; this is known as the value-alignment problem. In robotics, value alignment is key to the design of collaborative robots that can integrate into human workflows, successfully inferring and adapting to their users’ objectives as they go. We argue that a meaningful solution to value alignment must combine multi-agent decision theory with rich mathematical models of human cognition, enabling robots to tap into people’s natural collaborative capabilities. We present a solution to the cooperative inverse reinforcement learning (CIRL) dynamic game based on well-established cognitive models of decision making and theory of mind. The solution captures a key reciprocity relation: the human will not plan her actions in isolation, but rather reason pedagogically about how the robot might learn from them; the robot, in turn, can anticipate this and interpret the human’s actions pragmatically. To our knowledge, this work constitutes the first formal analysis of value alignment grounded in empirically validated cognitive models.
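The reciprocity relation described above can be illustrated with a toy, one-shot sketch. This is not the chapter's CIRL solution: it assumes a finite objective set, a single human action, a softmax (noisily rational) choice rule, and a hypothetical `gain` weight on how much the human values the robot's correct belief. All function and variable names are illustrative. The human's policy and the robot's Bayesian interpretation are iterated so that each accounts for the other.

```python
import math

def softmax(scores, beta):
    """Noisily rational choice: P(a) proportional to exp(beta * score[a])."""
    m = max(scores)  # subtract the max for numerical stability
    w = [math.exp(beta * (s - m)) for s in scores]
    z = sum(w)
    return [x / z for x in w]

def posterior(policy, prior):
    """Robot's Bayesian update: post[a][theta] = P(theta | a)."""
    n_theta, n_a = len(policy), len(policy[0])
    post = []
    for a in range(n_a):
        joint = [prior[t] * policy[t][a] for t in range(n_theta)]
        z = sum(joint)
        post.append([j / z for j in joint])
    return post

def pragmatic_pedagogic(U, prior, beta=5.0, gain=1.0, iters=50):
    """Iterate human policy and robot inference so each models the other.

    U[theta][a] is the task utility of action a under objective theta.
    `gain` (an assumption of this sketch) weights the robot's posterior
    probability of the true objective in the human's action value.
    """
    n_theta, n_a = len(U), len(U[0])
    # Start from a human who plans in isolation (ignores the robot's learning).
    policy = [softmax(U[t], beta) for t in range(n_theta)]
    for _ in range(iters):
        post = posterior(policy, prior)  # pragmatic robot
        policy = [  # pedagogic human
            softmax([U[t][a] + gain * post[a][t] for a in range(n_a)], beta)
            for t in range(n_theta)
        ]
    return policy, posterior(policy, prior)

# Action 0 is best under both objectives, so it reveals nothing;
# actions 1 and 2 are slightly worse but each fits only one objective.
U = [[1.0, 0.9, 0.0],
     [1.0, 0.0, 0.9]]
prior = [0.5, 0.5]

isolated = [softmax(row, 5.0) for row in U]
policy, belief = pragmatic_pedagogic(U, prior)
print("isolated human, theta=0:", isolated[0])
print("pedagogic human, theta=0:", policy[0])
print("robot belief after action 1:", belief[1])
```

In this toy setting the isolated human favors the uninformative action 0, whereas the pedagogic human shifts toward the action that disambiguates her objective, and the robot reads that action accordingly.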
Notes
1. Note that the theoretical formulation is easily extended to arbitrary measurable sets; we limit our analysis to finite state and objective sets for computational tractability and clarity of exposition.
2. We assume for simplicity that the optimum is unique or a well-defined disambiguation rule exists.
3. Note that this does not imply certainty equivalence, nor do we assume separation of estimation and control: R is fully reasoning about how its actions and those of H may affect its future beliefs.
Acknowledgements
This work is supported by ONR under the Embedded Humans MURI (N00014-13-1-0341), by AFOSR under Implicit Communication (16RT0676), and by the Center for Human-Compatible AI.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Fisac, J.F. et al. (2020). Pragmatic-Pedagogic Value Alignment. In: Amato, N., Hager, G., Thomas, S., Torres-Torriti, M. (eds) Robotics Research. Springer Proceedings in Advanced Robotics, vol 10. Springer, Cham. https://doi.org/10.1007/978-3-030-28619-4_7
DOI: https://doi.org/10.1007/978-3-030-28619-4_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28618-7
Online ISBN: 978-3-030-28619-4
eBook Packages: Intelligent Technologies and Robotics (R0)