Minds and Machines, Volume 22, Issue 2, pp 71–85

The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents


Abstract

This paper discusses the relation between intelligence and motivation in artificial agents, developing and briefly arguing for two theses. The first, the orthogonality thesis, holds (with some caveats) that intelligence and final goals (purposes) are orthogonal axes along which possible artificial intellects can freely vary—more or less any level of intelligence could be combined with more or less any final goal. The second, the instrumental convergence thesis, holds that as long as they possess a sufficient level of intelligence, agents having any of a wide range of final goals will pursue similar intermediary goals because they have instrumental reasons to do so. In combination, the two theses help us understand the possible range of behavior of superintelligent agents, and they point to some potential dangers in building such an agent.
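To make the orthogonality claim more concrete, the following toy sketch (illustrative only, not drawn from the paper; the names plan, maximize_position, and reach_seventeen are hypothetical) pairs a brute-force planner with arbitrary utility functions. The utility function stands in for the agent's final goal and the search depth crudely proxies its level of intelligence; neither parameter constrains the other, which is the sense in which the two axes are orthogonal.

from itertools import product
from typing import Callable, List

Action = int   # move left (-1), stay (0), or move right (+1)
State = int    # position on an integer line

def plan(start: State,
         utility: Callable[[State], float],
         depth: int) -> List[Action]:
    """Exhaustively search all action sequences of length `depth` and
    return one whose end state the given utility ranks highest.
    `depth` is a crude stand-in for capability; `utility` encodes the
    final goal. The two parameters can be combined freely."""
    best_plan: List[Action] = []
    best_value = float("-inf")
    for seq in product((-1, 0, 1), repeat=depth):
        end_state = start + sum(seq)
        value = utility(end_state)
        if value > best_value:
            best_plan, best_value = list(seq), value
    return best_plan

# Two unrelated final goals...
maximize_position = lambda s: float(s)          # push position as high as possible
reach_seventeen = lambda s: -abs(s - 17.0)      # get as close to 17 as possible

# ...each combinable with a weaker or a stronger planner.
for goal_name, goal in [("maximize position", maximize_position),
                        ("reach 17", reach_seventeen)]:
    for depth in (2, 6):
        print(goal_name, "| depth", depth, "->", plan(0, goal, depth))

Running the sketch shows planners of different strength pursuing either goal equally readily; the instrumental convergence thesis then concerns the intermediate states and resources that sufficiently capable agents tend to seek regardless of which goal is plugged in.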

Keywords

Superintelligence · Artificial intelligence · AI · Goal · Instrumental reason · Intelligent agent

Copyright information

© Springer Science+Business Media B.V. 2012

Authors and Affiliations

Nick Bostrom, Faculty of Philosophy, Oxford Martin School, Future of Humanity Institute, Oxford University, Oxford, UK
