
Improving Confidence in the Estimation of Values and Norms

  • Conference paper
Coordination, Organizations, Institutions, Norms, and Ethics for Governance of Multi-Agent Systems XIII (COIN 2017, COINE 2020)

Abstract

Autonomous agents (AA) will increasingly be interacting with us in our daily lives. While we want the benefits attached to AAs, it is essential that their behavior is aligned with our values and norms. Hence, an AA will need to estimate the values and norms of the humans it interacts with, which is not a straightforward task when solely observing an agent’s behavior. This paper analyses to what extent an AA is able to estimate the values and norms of a simulated human agent (SHA) based on its actions in the ultimatum game. We present two methods to reduce ambiguity in profiling the SHAs: one based on search space exploration and another based on counterfactual analysis. We found that both methods are able to increase the confidence in estimating human values and norms, but differ in their applicability, the latter being more efficient when the number of interactions with the agent is to be minimized. These insights are useful to improve the alignment of AAs with human values and norms.
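
To make the profiling task concrete, the sketch below illustrates the search-space-exploration idea under simplifying assumptions: candidate value/norm profiles are enumerated, and only those whose predicted ultimatum-game demands match the observed demands are retained; the size of the remaining set indicates how ambiguous, and hence how confident, the estimate is. The decision rule predicted_demand, the parameters value_weight and norm_threshold, and the pie size of 100 are hypothetical stand-ins and do not reproduce the SHA model defined in the paper.

    # Illustrative sketch only: the decision rule, parameter names, and grid below
    # are hypothetical stand-ins for the SHA model in the paper, not its actual Eq. (4).
    from itertools import product

    PIE = 100  # amount to split in the ultimatum game (presented without a monetary unit)

    def predicted_demand(value_weight: float, norm_threshold: int) -> int:
        """Toy deterministic decision rule: blend a value-driven demand with a
        norm-driven demand and round to an integer (demands are integers)."""
        value_demand = value_weight * PIE       # what the agent's values alone would demand
        norm_demand = PIE - norm_threshold      # leave the responder what is considered 'normal'
        return round(0.5 * value_demand + 0.5 * norm_demand)

    def consistent_profiles(observed_demands):
        """Search-space exploration: keep every candidate (value_weight, norm_threshold)
        profile whose predictions match all observed demands; the larger this set,
        the more ambiguous (less confident) the estimate."""
        candidates = product([w / 10 for w in range(11)], range(0, PIE + 1, 5))
        return [(w, t) for w, t in candidates
                if all(predicted_demand(w, t) == d for d in observed_demands)]

    profiles = consistent_profiles(observed_demands=[60, 60, 60])
    print(f"{len(profiles)} candidate profiles remain consistent with the observations")
    # A counterfactual-style query would instead pick the next offer that best splits
    # the remaining candidate set, so fewer interactions are needed to disambiguate.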


Notes

  1. For ease of presentation, we chose to present P with no monetary unit. Empirical work [21] shows that the effect of the pie size is relatively small.

  2. The norm that is drawn from the normal distribution is not used as input for the norm in subsequent rounds (i.e., the agent does not memorize it).

  3. To be exact, the proposed demand should be read as what the proposer considers a 'normal' threshold: if it considered a higher threshold to be normal, it would have demanded less; if it considered a lower threshold to be normal, it would have demanded more.

  4. The standard deviation in the demand based solely on values, \(\sigma_{vd}\), was added to ensure that agents vary in which values they find important (\(di_a\)). The \(\sigma_{vd}\) for humans is postulated rather than extracted from empirical data.

  5. Given the deterministic model of the SHA, it might be expected that the RMSE should tend to zero. This is not the case because \(d, valueDemand, normDemand \in \mathbb{Z}\), and a rounding operator is therefore used (see the sketch after these notes).

  6. The x-axis in Fig. 3 relates to the number of rounds used during the initial estimation process, described in Sect. 4.1. The additional number of interactions performed by each method to improve the confidence in the estimations is not included in the x-axis but is presented in Fig. 3(c).

  7. In our case \(OR_a\): given that the preferences for values and norms are constant, the demand is defined according to (4).
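
As a toy illustration of the rounding effect mentioned in Note 5 (the decision rule, parameter name, and pie size below are assumptions made for illustration, not the paper's specification): because demands are integers, distinct continuous parameter values map to the same observed demand, so the recovered parameter carries a quantization error and the RMSE has a strictly positive floor.

    # Toy illustration of Note 5 (hypothetical model, not the paper's): integer demands
    # quantize the underlying continuous parameter, so the RMSE cannot reach zero.
    import math
    import random

    PIE = 100
    random.seed(0)

    def demand(weight: float) -> int:
        """Deterministic toy rule: demands live in the integers, hence the rounding."""
        return round(weight * PIE)

    true_weights = [random.uniform(0.3, 0.7) for _ in range(1000)]
    # Invert each observed integer demand back into a weight estimate.
    estimated = [demand(w) / PIE for w in true_weights]

    rmse = math.sqrt(sum((w - e) ** 2 for w, e in zip(true_weights, estimated)) / len(true_weights))
    print(f"RMSE floor caused by rounding: {rmse:.4f}")  # about 0.003 (= 1/(PIE*sqrt(12))), not zero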

References

  1. Abbeel, P., Ng, A.Y.: Apprenticeship learning via inverse reinforcement learning. In: Proceedings of the Twenty-First International Conference on Machine Learning (ICML). ACM (2004)


  2. Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., Mané, D.: Concrete problems in AI safety. arXiv preprint arXiv:1606.06565 (2016)

  3. Cooper, D.J., Dutcher, E.G.: The dynamics of responder behavior in ultimatum games: a meta-study. Exp. Econ. 14(4), 519–546 (2011)


  4. Cranefield, S., Winikoff, M., Dignum, V., Dignum, F.: No pizza for you: value-based plan selection in BDI agents. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI), pp. 178–184 (2017)


  5. Crawford, S.E.S., Ostrom, E.: A grammar of institutions. Am. Polit. Sci. Rev. 89(3), 582–600 (1995)

  6. Dechesne, F., Di Tosto, G., Dignum, V., Dignum, F.: No smoking here: values, norms and culture in multi-agent systems. Artif. Intell. Law 21(1), 79–107 (2013)


  7. Del Missier, F., Mäntylä, T., Hansson, P., Bruine de Bruin, W., Parker, A.M., Nilsson, L.G.: The multifold relationship between memory and decision making: an individual-differences study. J. Exp. Psychol.: Learn. Mem. Cogn. 39(5), 1344 (2013)


  8. Dignum, V.: Responsible Artificial Intelligence: How to Develop and Use AI in a Responsible Way. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30371-6


  9. Fehr, E., Fischbacher, U.: The nature of human altruism. Nature 425(6960), 785–791 (2003)


  10. Fishbein, M., Ajzen, I.: Predicting and Changing Behavior: The Reasoned Action Approach. Taylor & Francis Ltd, Milton Park (2011)


  11. Güth, W., Schmittberger, R., Schwarze, B.: An experimental analysis of ultimatum bargaining. J. Econ. Behav. Organ. 3(4), 367–388 (1982)


  12. Hadfield-Menell, D., Milli, S., Abbeel, P., Russell, S.J., Dragan, A.: Inverse reward design. In: Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS), pp. 6765–6774 (2017)

  13. Irving, G., Askell, A.: AI safety needs social scientists. Distill 4(2), e14 (2019)

  14. Levine, S., Popovic, Z., Koltun, V.: Nonlinear inverse reinforcement learning with Gaussian processes. In: Advances in Neural Information Processing Systems (NIPS), pp. 19–27 (2011)

  15. Malle, B.F.: How the Mind Explains Behavior: Folk Explanations, Meaning, and Social Interaction. MIT Press, Cambridge (2006)


  16. Mercuur, R., Dignum, V., Jonker, C.M., et al.: The value of values and norms in social simulation. J. Artif. Soc. Soc. Simul. 22(1), 1–9 (2019)


  17. Miller, T.: Explanation in artificial intelligence: insights from the social sciences. Artif. Intell. 267, 1–38 (2018)


  18. Mindermann, S., Armstrong, S.: Occam’s razor is insufficient to infer the preferences of irrational agents. In: Conference on Neural Information Processing Systems (NIPS), pp. 5598–5609 (2018)


  19. Nielsen, T.D., Jensen, F.V.: Learning a decision maker’s utility function from (possibly) inconsistent behavior. Artif. Intell. 160(1–2), 53–78 (2004)


  20. Nouri, E., Georgila, K., Traum, D.: Culture-specific models of negotiation for virtual characters: multi-attribute decision-making based on culture-specific values. AI Soc. 32(1), 51–63 (2014). https://doi.org/10.1007/s00146-014-0570-7


  21. Oosterbeek, H., Sloof, R., Van De Kuilen, G.: Cultural differences in ultimatum game experiments: evidence from a meta-analysis. Exp. Econ. 7(2), 171–188 (2004)

  22. Pearl, J.: The seven tools of causal inference, with reflections on machine learning. Commun. ACM 62(3), 54–60 (2019)


  23. Van de Poel, I., et al.: Ethics, Technology, and Engineering: An Introduction. Wiley, Hoboken (2011)


  24. Roese, N.J.: Counterfactual thinking. Psychol. Bull. 121(1), 133 (1997)


  25. Roth, A.E., Erev, I.: Learning in extensive-form games: experimental data and simple dynamic models in the intermediate term. Games Econ. Behav. 8(1), 164–212 (1995)


  26. Schwartz, S.H.: An overview of the Schwartz theory of basic values. Online Read. Psychol. Culture 2, 1–20 (2012)


  27. Soares, N., Fallenstein, B.: Agent foundations for aligning machine intelligence with human interests: a technical research agenda. In: Callaghan, V., Miller, J., Yampolskiy, R., Armstrong, S. (eds.) The Technological Singularity. TFC, pp. 103–125. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54033-6_5



Acknowledgements

This work was supported by the AiTech initiative of the Delft University of Technology.

Author information

Corresponding author

Correspondence to Luciano Cavalcante Siebert.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Siebert, L.C., Mercuur, R., Dignum, V., van den Hoven, J., Jonker, C. (2021). Improving Confidence in the Estimation of Values and Norms. In: Aler Tubella, A., Cranefield, S., Frantz, C., Meneguzzi, F., Vasconcelos, W. (eds.) Coordination, Organizations, Institutions, Norms, and Ethics for Governance of Multi-Agent Systems XIII (COIN 2017, COINE 2020). Lecture Notes in Computer Science, vol. 12298. Springer, Cham. https://doi.org/10.1007/978-3-030-72376-7_6


  • DOI: https://doi.org/10.1007/978-3-030-72376-7_6


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-72375-0

  • Online ISBN: 978-3-030-72376-7

  • eBook Packages: Computer Science, Computer Science (R0)
