Pragmatic-Pedagogic Value Alignment

Fisac, Jaime F.; Gates, Monica A.; Hamrick, Jessica B.; Liu, Chang; Hadfield-Menell, Dylan; Palaniappan, Malayandi; Malik, Dhruv; Sastry, S. Shankar; Griffiths, Thomas L.; Dragan, Anca D.

doi:10.1007/978-3-030-28619-4_7

Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 10)

Abstract

As intelligent systems gain autonomy and capability, it becomes vital to ensure that their objectives match those of their human users; this is known as the value-alignment problem. In robotics, value alignment is key to the design of collaborative robots that can integrate into human workflows, successfully inferring and adapting to their users’ objectives as they go. We argue that a meaningful solution to value alignment must combine multi-agent decision theory with rich mathematical models of human cognition, enabling robots to tap into people’s natural collaborative capabilities. We present a solution to the cooperative inverse reinforcement learning (CIRL) dynamic game based on well-established cognitive models of decision making and theory of mind. The solution captures a key reciprocity relation: the human will not plan her actions in isolation, but rather reason pedagogically about how the robot might learn from them; the robot, in turn, can anticipate this and interpret the human’s actions pragmatically. To our knowledge, this work constitutes the first formal analysis of value alignment grounded in empirically validated cognitive models.
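The reciprocity described in the abstract can be illustrated with a toy, single-step sketch. This is not the chapter's CIRL solution: it assumes a finite set of objectives and actions, a softmax (noisily rational) human, and a hypothetical utility that adds a bonus, scaled by an assumed parameter `weight`, for the probability the robot would assign to the true objective after seeing the action. The function names, `beta`, `weight`, and the numbers in `q_values` are all made up for illustration.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def literal_policy(q_values, beta):
    """Softmax human who ignores what the robot will infer."""
    return [normalize([math.exp(beta * q) for q in row]) for row in q_values]


def pragmatic_posterior(prior, human_policy, action):
    """Robot's Bayes update given the policy it attributes to the human."""
    return normalize(
        [human_policy[t][action] * prior[t] for t in range(len(prior))]
    )


def pedagogic_policy(q_values, prior, beta, weight, iterations=50):
    """Alternate human and robot models for a fixed number of rounds.

    Hypothetical utility: task value plus `weight` times the probability
    the robot would assign to the true objective after seeing the action.
    """
    n_actions = len(q_values[0])
    policy = literal_policy(q_values, beta)
    for _ in range(iterations):
        # Robot side: interpret each possible action under the current policy.
        posteriors = [
            pragmatic_posterior(prior, policy, a) for a in range(n_actions)
        ]
        # Human side: re-plan, anticipating the robot's interpretation.
        policy = [
            normalize([
                math.exp(beta * (row[a] + weight * posteriors[a][t]))
                for a in range(n_actions)
            ])
            for t, row in enumerate(q_values)
        ]
    return policy


# Toy problem (made-up numbers): rows are objectives, columns are actions.
# Action 1 is equally good under both objectives, so it is uninformative.
q_values = [[1.0, 1.0, 0.0],
            [0.0, 1.0, 1.0]]
prior = [0.5, 0.5]

literal = literal_policy(q_values, beta=2.0)
pedagogic = pedagogic_policy(q_values, prior, beta=2.0, weight=1.0)

print("literal   P(a | objective 0):", [round(p, 3) for p in literal[0]])
print("pedagogic P(a | objective 0):", [round(p, 3) for p in pedagogic[0]])
print("robot belief after action 0:",
      [round(p, 3) for p in pragmatic_posterior(prior, pedagogic, 0)])
```

In this toy setting the pedagogic human moves probability away from the ambiguous action toward the one that distinguishes her objective, and a robot that models her this way ends up more confident after observing that action than a robot that assumes a literal human.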

Notes

1. Note that the theoretical formulation is easily extended to arbitrary measurable sets; we limit our analysis to finite state and objective sets for computational tractability and clarity of exposition.
2. We assume for simplicity that the optimum is unique or a well-defined disambiguation rule exists.
3. Note that this does not imply certainty equivalence, nor do we assume separation of estimation and control: R is fully reasoning about how its actions and those of H may affect its future beliefs.

Acknowledgements

This work is supported by ONR under the Embedded Humans MURI (N00014-13-1-0341), by AFOSR under Implicit Communication (16RT0676), and by the Center for Human-Compatible AI.

Author information

Authors and Affiliations

University of California, Berkeley, CA, USA
Jaime F. Fisac, Monica A. Gates, Jessica B. Hamrick, Chang Liu, Dylan Hadfield-Menell, Malayandi Palaniappan, Dhruv Malik, S. Shankar Sastry, Thomas L. Griffiths & Anca D. Dragan

Corresponding author

Correspondence to Jaime F. Fisac.

Editor information

Editors and Affiliations

Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Nancy M. Amato
Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
Greg Hager
Department of Computer Science and Engineering, Texas A&M University, College Station, TX, USA
Shawna Thomas
Department of Electrical Engineering, Pontificia Universidad Católica de Chile, Santiago, Chile
Miguel Torres-Torriti

About this paper

Cite this paper

Fisac, J.F. et al. (2020). Pragmatic-Pedagogic Value Alignment. In: Amato, N., Hager, G., Thomas, S., Torres-Torriti, M. (eds) Robotics Research. Springer Proceedings in Advanced Robotics, vol 10. Springer, Cham. https://doi.org/10.1007/978-3-030-28619-4_7

DOI: https://doi.org/10.1007/978-3-030-28619-4_7
Published: 28 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28618-7
Online ISBN: 978-3-030-28619-4
eBook Packages: Intelligent Technologies and Robotics (R0)
