The emergence of an intelligent machine that surpasses human intelligence across a wide set of skills has been proposed as a possible existential catastrophe (i.e., an event comparable in severity to human extinction). Among those concerned about existential risk related to artificial intelligence (AI), it is common to assume that such an AI would not only be very intelligent but also a general agent (i.e., an agent capable of action in many different contexts). This article explores the characteristics of machine agency and what it would mean for a machine to become a general agent. In particular, it does so by articulating some important differences between belief and desire in the context of machine agency. One such difference is that while an agent can acquire new beliefs by itself through learning, new desires must either be derived from preexisting desires or be instilled by an external influence, such as a human programmer or natural selection. We argue that to become a general agent, a machine needs productive desires, that is, desires that can direct behavior across multiple contexts. However, productive desires cannot be derived from non-productive desires alone. Thus, even though general agency in AI could in principle be created by human agents, it cannot be spontaneously produced by a non-general AI agent through an endogenous process (i.e., self-improvement). In conclusion, we argue that a common AI scenario, in which general agency suddenly emerges in a non-general AI agent such as DeepMind's superintelligent board game AI AlphaZero, is not plausible.
“Instead of allowing agent-like purposive behavior to emerge spontaneously and haphazardly from the implementation of powerful search processes (including processes searching for internal workplans and processes directly searching for solutions meeting some user specified criterion), it may be better to create agents on purpose.” (Bostrom 2014, p 155).
This model also seems to be widely accepted among those concerned about superintelligent AI; see Häggström (2018).
As Kaj Sotala and others have argued, there are multiple trajectories to superintelligent AI; this article has explored only one of them.
Note that we are not exploring the important but distinct notion of moral agency. For an excellent discussion of moral agency in machines, see Gunkel and Bryson (2014).
We should also note that the notion of agency under consideration is more minimal than that proposed by Floridi and Sanders (2004). In other words, this discussion is unrelated to ongoing debates about whether AI can become a moral agent or a moral patient, or can be morally responsible.
A set is more diverse the greater the expected dissimilarity between a randomly sampled object in that set and the most similar object in that set (Gustafsson 2010).
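One way to make this precise (a sketch in our own notation, not Gustafsson's formulation; it assumes a dissimilarity measure d on the objects and a finite set S with at least two elements) is to define diversity as the expected dissimilarity to the nearest other object under uniform sampling:

\[
\mathrm{Div}(S) \;=\; \frac{1}{|S|} \sum_{x \in S} \; \min_{y \in S \setminus \{x\}} d(x, y).
\]

On this reading, a set counts as more diverse the farther its members lie, on average, from their closest neighbours within the set.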
We are grateful to Linda Linsefors for this objection.
Amodei D, Olah C, Steinhardt J, Christiano P, Schulman J, Mané D (2016) Concrete problems in AI safety. arXiv preprint. https://arxiv.org/abs/1606.06565
Barandiaran XE, Di Paolo EA, Rohde M (2009) Defining agency: individuality, normativity, asymmetry, and spatio-temporality in action. Adapt Behav 17:367–386. https://doi.org/10.1177/1059712309343819
Block N (1980) Functionalism. In: Block N (ed) Readings in philosophy of psychology, vol 1. Harvard University Press, Cambridge, MA
Bostrom N (2014) Superintelligence: paths, dangers, strategies. Oxford University Press, Oxford
Drexler E (2019) Reframing superintelligence: comprehensive AI services as general intelligence. Technical Report 2019-1. Future of Humanity Institute, University of Oxford
Floridi L, Sanders JW (2004) On the morality of artificial agents. Minds Mach 14(3):349–379. https://doi.org/10.1023/B:MIND.0000035461.63578.9d
Gunkel DJ, Bryson J (2014) Introduction to the special issue on machine morality: the machine as moral agent and patient. Philos Technol 27(1):5–8. https://doi.org/10.1007/s13347-014-0151-1
Gustafsson JE (2010) Freedom of choice and expected compromise. Soc Choice Welf 35(1):65–79. https://doi.org/10.1007/s00355-009-0430-4
Häggström O (2016) Here be dragons: science, technology and the future of humanity. Oxford University Press, Oxford
Häggström O (2018) Challenges to the Omohundro-Bostrom framework for AI motivations. Foresight 21(1):153–166. https://doi.org/10.1108/FS-04-2018-0039
Hanson R (2016) The age of Em: work, love and life when robots rule the earth, 1st edn. Oxford University Press, Oxford
Jebari K, Lundborg J (2018) The intelligence explosion revisited. Foresight 21(1):167–174. https://doi.org/10.1108/FS-04-2018-0042
Legg S, Hutter M (2007) Universal intelligence: a definition of machine intelligence. Minds Mach 17(4):391–444. https://doi.org/10.1007/s11023-007-9079-x
Lewis DK (1972) Psychophysical and theoretical identifications. Australas J Philos 50(3):249–258. https://doi.org/10.1080/00048407212341301
List C, Pettit P (2011) Group agency: the possibility, design, and status of corporate agents. Oxford University Press, Oxford
McNaughton D (1988) Moral vision: an introduction to ethics. Blackwell, Oxford
Petersson B (2000) Belief & desire: the standard model of intentional action: critique and defence. Department of Philosophy, Lund University, Lund
Putnam H (1967) The nature of mental states. In: Capitan WH, Merrill DD (eds) Art, mind, and religion. University of Pittsburgh Press, Pittsburgh, pp 37–48
Russell S, Dewey D, Tegmark M (2015) Research priorities for robust and beneficial artificial intelligence. AI Mag 36(4):105–114. https://doi.org/10.1609/aimag.v36i4.2577
Russell SJ, Norvig P (2016) Artificial intelligence: a modern approach. Pearson Education Limited, Harlow
Smith M (1987) The Humean theory of motivation. Mind 96(381):36–61
Soares N, Fallenstein B (2017) Agent foundations for aligning machine intelligence with human interests: a technical research agenda. In: Callaghan V, Miller J, Yampolskiy R, Armstrong S (eds) The technological singularity: managing the journey, The frontiers collection. Springer, Berlin, pp 103–125. https://doi.org/10.1007/978-3-662-54033-6_5
Tegmark M (2017) Life 3.0: being human in the age of artificial intelligence. Knopf, New York
Turing AM (1950) Computing machinery and intelligence. Mind 59(236):433–460. https://doi.org/10.1093/mind/LIX.236.433
Yampolskiy RV (2015) Artificial superintelligence: a futuristic approach. Chapman and Hall/CRC, Boca Raton
Yudkowsky E (2013) Intelligence explosion microeconomics. Technical Report 2013–1. Machine Intelligence Research Institute
No external funding for this work was obtained.
Conflict of interest
No conflicts of interest to declare.
Cite this article
Jebari, K., Lundborg, J. Artificial superintelligence and its limits: why AlphaZero cannot become a general agent. AI & Soc 36, 807–815 (2021). https://doi.org/10.1007/s00146-020-01070-3
Keywords
- Artificial general intelligence
- The belief/desire model
- Intentional action
- Existential risk