Abstract
Autonomous agents (AAs) will increasingly interact with us in our daily lives. While we want the benefits that AAs bring, it is essential that their behavior is aligned with our values and norms. Hence, an AA needs to estimate the values and norms of the humans it interacts with, which is not straightforward when it can only observe their behavior. This paper analyses to what extent an AA is able to estimate the values and norms of a simulated human agent (SHA) based on its actions in the ultimatum game. We present two methods to reduce ambiguity in profiling SHAs: one based on search-space exploration and one based on counterfactual analysis. We found that both methods increase the confidence in the estimated human values and norms, but differ in their applicability: the latter is more efficient when the number of interactions with the agent must be minimized. These insights are useful for improving the alignment of AAs with human values and norms.
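The profiling idea in the abstract, narrowing down which hidden preferences are consistent with observed behavior by choosing informative interactions, can be sketched in miniature. The snippet below is an illustrative toy, not the authors' implementation: a deterministic responder accepts any offer at or above a hidden threshold, and a bisection strategy stands in for targeted, counterfactual-style queries that shrink the set of thresholds consistent with the responses.

```python
# Toy sketch (illustrative names and numbers, not the paper's model):
# estimate a simulated responder's hidden acceptance threshold in the
# ultimatum game by picking the most informative offer each round.

def make_responder(threshold):
    """Deterministic responder: accepts any offer >= its hidden threshold."""
    return lambda offer: offer >= threshold

def estimate_threshold(responder, lo=0, hi=100):
    """Shrink the interval of thresholds consistent with the observed replies."""
    while hi - lo > 1:
        offer = (lo + hi) // 2       # most informative next query
        if responder(offer):
            hi = offer               # threshold is at most this offer
        else:
            lo = offer               # threshold is above this offer
    return hi                        # smallest accepted offer

responder = make_responder(threshold=37)
print(estimate_threshold(responder))  # -> 37, after ~log2(100) queries
```

Each chosen offer halves the remaining ambiguity, which is why targeted queries need far fewer interactions than passively observing arbitrary rounds.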
Notes
1. For ease of presentation, we chose to present P with no monetary unit. Empirical work [21] shows that the effect of the pie size is relatively small.
2. The norm drawn from the normal distribution is not used as input for the norm in subsequent rounds (i.e., the agent does not memorize it).
3. To be exact, the proposed demand should be read as what the proposer considers a ‘normal’ threshold: had it considered a higher threshold normal, it would have demanded less; had it considered a lower threshold normal, it would have demanded more.
4. The standard deviation in the demand based solely on values, \(\sigma _{vd}\), was added to ensure agents vary in which values they find important (\(di_a\)). The \(\sigma _{vd}\) for humans is postulated rather than extracted from empirical data.
5. Given the deterministic model of the SHA, one might expect the RMSE to tend to zero. This is not the case because \([d, valueDemand, normDemand] \in \mathbb {Z}\), and therefore a rounding operator is used.
6.
7. In our case \(OR_a\): given that the preferences for values and norms are constant, the demand is defined according to (4).
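Note 5's point, that a deterministic SHA still produces a nonzero RMSE once real-valued demands are rounded to integers, can be illustrated with a toy computation. The demand values below are made up for illustration and are not taken from the paper's experiments.

```python
import math

# Illustrative only: a deterministic model predicts real-valued demands,
# but the SHA's demands are integers, so predictions are rounded before
# comparison. The residual rounding error keeps the RMSE above zero.
true_demands = [41.3, 55.8, 47.5, 62.1]       # made-up real-valued predictions
rounded = [round(d) for d in true_demands]    # integer demands after rounding

rmse = math.sqrt(
    sum((t - r) ** 2 for t, r in zip(true_demands, rounded)) / len(true_demands)
)
print(rmse)  # small but strictly positive, despite a deterministic model
```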
References
Abbeel, P., Ng, A.Y.: Apprenticeship learning via inverse reinforcement learning. In: Proceedings of the Twenty-First International Conference on Machine Learning (ICML). ACM (2004)
Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., Mané, D.: Concrete problems in AI safety. arXiv preprint arXiv:1606.06565 (2016)
Cooper, D.J., Dutcher, E.G.: The dynamics of responder behavior in ultimatum games: a meta-study. Exp. Econ. 14(4), 519–546 (2011)
Cranefield, S., Winikoff, M., Dignum, V., Dignum, F.: No pizza for you: value-based plan selection in BDI agents. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI), pp. 178–184 (2017)
Crawford, S.E.S., Ostrom, E.: A grammar of institutions. Am. Polit. Sci. Rev. 89(3), 582–600 (1995)
Dechesne, F., Di Tosto, G., Dignum, V., Dignum, F.: No smoking here: values, norms and culture in multi-agent systems. Artif. Intell. Law 21(1), 79–107 (2013)
Del Missier, F., Mäntylä, T., Hansson, P., Bruine de Bruin, W., Parker, A.M., Nilsson, L.G.: The multifold relationship between memory and decision making: an individual-differences study. J. Exp. Psychol.: Learn. Mem. Cogn. 39(5), 1344 (2013)
Dignum, V.: Responsible Artificial Intelligence: How to Develop and Use AI in a Responsible Way. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30371-6
Fehr, E., Fischbacher, U.: The nature of human altruism. Nature 425(6960), 785–791 (2003)
Fishbein, M., Ajzen, I.: Predicting and Changing Behavior: The Reasoned Action Approach. Taylor & Francis Ltd, Milton Park (2011)
Güth, W., Schmittberger, R., Schwarze, B.: An experimental analysis of ultimatum bargaining. J. Econ. Behav. Organ. 3(4), 367–388 (1982)
Hadfield-Menell, D., Milli, S., Abbeel, P., Russell, S.J., Dragan, A.: Inverse reward design. In: Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS), pp. 6765–6774 (2017)
Irving, G., Askell, A.: AI safety needs social scientists. Distill 4(2), e14 (2019)
Levine, S., Popovic, Z., Koltun, V.: Nonlinear inverse reinforcement learning with Gaussian processes. In: Advances in Neural Information Processing Systems (NIPS), pp. 19–27 (2011)
Malle, B.F.: How the Mind Explains Behavior: Folk Explanations, Meaning, and Social Interaction. MIT Press, Cambridge (2006)
Mercuur, R., Dignum, V., Jonker, C.M., et al.: The value of values and norms in social simulation. J. Artif. Soc. Soc. Simul. 22(1), 1–9 (2019)
Miller, T.: Explanation in artificial intelligence: insights from the social sciences. Artif. Intell. 267, 1–38 (2018)
Mindermann, S., Armstrong, S.: Occam’s razor is insufficient to infer the preferences of irrational agents. In: Conference on Neural Information Processing Systems (NIPS), pp. 5598–5609 (2018)
Nielsen, T.D., Jensen, F.V.: Learning a decision maker’s utility function from (possibly) inconsistent behavior. Artif. Intell. 160(1–2), 53–78 (2004)
Nouri, E., Georgila, K., Traum, D.: Culture-specific models of negotiation for virtual characters: multi-attribute decision-making based on culture-specific values. AI Soc. 32(1), 51–63 (2014). https://doi.org/10.1007/s00146-014-0570-7
Oosterbeek, H., Sloof, R., Van De Kuilen, G.: Cultural differences in ultimatum game experiments: evidence from a meta-analysis. SSRN Electron. J. 8(1), 171–188 (2001)
Pearl, J.: The seven tools of causal inference, with reflections on machine learning. Commun. ACM 62(3), 54–60 (2019)
Van de Poel, I., et al.: Ethics, Technology, and Engineering: An Introduction. Wiley, Hoboken (2011)
Roese, N.J.: Counterfactual thinking. Psychol. Bull. 121(1), 133 (1997)
Roth, A.E., Erev, I.: Learning in extensive-form games: experimental data and simple dynamic models in the intermediate term. Games Econ. Behav. 8(1), 164–212 (1995)
Schwartz, S.H.: An overview of the Schwartz theory of basic values. Online Read. Psychol. Culture 2, 1–20 (2012)
Soares, N., Fallenstein, B.: Agent foundations for aligning machine intelligence with human interests: a technical research agenda. In: Callaghan, V., Miller, J., Yampolskiy, R., Armstrong, S. (eds.) The Technological Singularity. TFC, pp. 103–125. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54033-6_5
Acknowledgements
This work was supported by the AiTech initiative of the Delft University of Technology.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Siebert, L.C., Mercuur, R., Dignum, V., van den Hoven, J., Jonker, C. (2021). Improving Confidence in the Estimation of Values and Norms. In: Aler Tubella, A., Cranefield, S., Frantz, C., Meneguzzi, F., Vasconcelos, W. (eds) Coordination, Organizations, Institutions, Norms, and Ethics for Governance of Multi-Agent Systems XIII (COIN 2017, COINE 2020). Lecture Notes in Computer Science, vol. 12298. Springer, Cham. https://doi.org/10.1007/978-3-030-72376-7_6
Print ISBN: 978-3-030-72375-0
Online ISBN: 978-3-030-72376-7