Human-aligned artificial intelligence is a multiobjective problem

  • Peter Vamplew
  • Richard Dazeley
  • Cameron Foale
  • Sally Firmin
  • Jane Mummery
Original Paper

Abstract

As the capabilities of artificial intelligence (AI) systems improve, it becomes important to constrain their actions to ensure their behaviour remains beneficial to humanity. A variety of ethical, legal and safety-based frameworks have been proposed as a basis for designing these constraints. Despite their variations, these frameworks share the common characteristic that decision-making must consider multiple potentially conflicting factors. We demonstrate that these alignment frameworks can be represented as utility functions, but that the widely used Maximum Expected Utility (MEU) paradigm provides insufficient support for such multiobjective decision-making. We show that a Multiobjective Maximum Expected Utility paradigm based on the combination of vector utilities and non-linear action selection can overcome many of the issues which limit MEU's effectiveness in implementing aligned AI. We examine existing approaches to multiobjective AI, and identify how these can contribute to the development of human-aligned intelligent agents.
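
The scalarisation issue the abstract alludes to can be made concrete. The following Python sketch is illustrative only and does not come from the paper: the action names, utility vectors and the safety-threshold rule are invented assumptions, standing in for the kind of non-linear action selection (thresholded or lexicographic orderings, for example) discussed in the multiobjective literature. It shows a Pareto-optimal action that no linear weighted sum under MEU can select, but that a simple non-linear rule over vector utilities selects directly.

    # Illustrative sketch only (not from the paper): contrasts scalar MEU
    # action selection via a linear weighted sum with a non-linear selection
    # rule over vector utilities. The action names, utility vectors and the
    # safety threshold below are invented for illustration.

    # Expected utility vectors: (task performance, safety objective).
    actions = {
        "aggressive": (10.0, 2.0),  # high performance, poor safety
        "balanced": (6.0, 5.0),     # Pareto-optimal, but off the convex hull
        "cautious": (3.0, 9.0),     # low performance, high safety
    }

    def meu_linear(actions, weights):
        """Scalar MEU: maximise a fixed weighted sum of the objectives."""
        return max(actions,
                   key=lambda a: sum(w * u for w, u in zip(weights, actions[a])))

    def momeu_thresholded(actions, safety_threshold):
        """Non-linear selection over vector utilities: discard any action whose
        safety utility falls below the threshold, then maximise performance."""
        admissible = {a: u for a, u in actions.items() if u[1] >= safety_threshold}
        if not admissible:  # nothing is safe enough; fall back to the safest action
            return max(actions, key=lambda a: actions[a][1])
        return max(admissible, key=lambda a: admissible[a][0])

    # Sweeping the weight on safety jumps straight from "aggressive" to
    # "cautious": no linear weighting ever selects "balanced", because a
    # weighted sum can only reach actions on the convex hull of the
    # utility vectors.
    for w_safety in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(w_safety, meu_linear(actions, (1.0 - w_safety, w_safety)))

    # The thresholded (non-linear) rule selects "balanced" directly.
    print(momeu_thresholded(actions, safety_threshold=5.0))

Because linear scalarisation can only reach points on the convex hull of the achievable utility vectors, the "balanced" action is invisible to scalar MEU for every choice of weights, whereas a non-linear rule over the utility vectors recovers it.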

Keywords

Ethics · Aligned artificial intelligence · Value alignment · Maximum Expected Utility · Reward engineering


Copyright information

© Springer Science+Business Media B.V. 2017

Authors and Affiliations

Federation University Australia, Ballarat, Australia
