The thinkers who have reflected on the problem of a coming superintelligence have generally seen the issue as a technological problem, a problem of how to control what the superintelligence will do. I argue that this approach is probably mistaken because it is based on questionable assumptions about the behavior of intelligent agents and, moreover, potentially counterproductive because it might, in the end, bring about the existential catastrophe that it is meant to prevent. I contend that the problem posed by a future superintelligence will likely be a political problem, that is, one of establishing a peaceful form of coexistence with other intelligent agents in a situation of mutual vulnerability, and not a technological problem of control.
Good inspired the term “superintelligence,” but he was not the first to conceive the idea. Turing put the idea forward in a talk that he gave around 1951, but which was published in print only in 1996: “Let us assume, for the sake of argument, that [intelligent machines] are a genuine possibility, and look at the consequences of constructing them. […] it seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers. […] At some stage, therefore, we should have to expect the machines to take control, in the way that is mentioned in Samuel Butler’s Erewhon” (Turing 1996, 259–260). It is probable that Good got the idea from Turing, given that they worked together. And Turing, as he remarks, in turn got the idea from Butler’s novel Erewhon [(1872) 2002]. I will discuss Butler’s novel later (see footnote 17).
Turing (1950, 433) proposed his famous test of machine intelligence, which requires the machine to imitate a human being, explicitly as a method to avoid having to explain what intelligence—or thinking, respectively—consists in.
This, at any rate, is how I understand the thrust of their definitions. Here are their formulations: Good (1965, 33) defines an “ultraintelligent machine” as “a machine that can far surpass all the intellectual activities of any man however clever.” And Bostrom (2014, 26) defines a “superintelligence” as “any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest.” In an endnote attached to this passage, Bostrom (2014, 330, n. 1) notes that his definition is “very similar” to Good’s.
For further expressions of this idea, besides the cited passage in Bostrom’s book, see Wiener (1960, 1357–1358), Muehlhauser and Helm (2012, 106), and Yampolskiy (2013, 397–398, 2016, 131–34). For a critique of the idea, along the same lines as my critique of it in the following sentences, see Yudkowsky (2001, 51–52).
Yampolskiy (2013, 397) brings out the inconsistency when he remarks that, in discussions of this kind of scenario, “superintelligent machines are feared to be too dumb to possess common sense.” Surprisingly, though, he endorses this view of superintelligent machines; that is, he shares the fear in question. To me, the view expressed in his remark seems to be a simple contradiction in terms. In fact, Yampolskiy (2013, 406) himself suggests as much later in his paper when he notes that “perhaps one can believe that a superintelligent machine by its very definition will have at least as much common sense as an average human and will consequently act accordingly.”
My point here relates to the admonition, which I will raise later, not to “maquinamorphize” the coming superintelligence (see p. 15).
Yudkowsky (2001, 2008) and Bostrom (2014) focus on scenarios where a superintelligence either carelessly or perversely (in the sense of “perverse instantiation”) destroys our world. They do not consider the possibility that it might purposefully, out of a perception of threat, turn against us. As for Vinge (1993) and Chalmers (2010, 41), they mention the possibility of extinction only in passing, without providing any further analysis. That is, they do not discuss why and how the emergence of superintelligence might mean our end.
Here are the relevant quotations: Good (1965, 33–34) asserts that “the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control,” and that there is hence “the possibility that the human race will become redundant.” Vinge (1993) states that “in the coming of the Singularity, we are seeing the predictions of true technological unemployment finally come true.” As for Bostrom, we have already seen that he defines a superintelligence as surpassing human beings “in virtually all domains of interest” (see footnote 3). And later in the book, he makes clear that the domain of philosophy is meant to be included here. He submits that “the tardiness and wobbliness of humanity’s progress on many of the ‘eternal problems’ of philosophy” seems to indicate that we are not very talented at philosophy, and he subsequently proposes that we leave these problems to the coming superintelligence and instead work on the more urgent issue of how to keep the superintelligence under control (2014, 71, 315). Chalmers (2012, 154) similarly envisions his own future unemployment when he remarks that the advent of superintelligence is “perhaps the best hope for making real progress on eternal philosophical problems such as the problem of consciousness.” The possibility expressed by Bostrom and Chalmers that a superintelligence may finally solve philosophy’s longstanding problems has also been raised by Walker (2002). He presents this possibility as a main reason to pursue the creation of superintelligence, rather than as merely a welcome by-product.
Vinge (1993) is at least a little concerned. He laments that the further exploration of the universe will not be carried out by us: “Once, galactic empires might have seemed a Post-Human domain. Now, sadly, even interplanetary ones are.”
Interestingly, Turing, in the talk from circa 1951 mentioned earlier (see footnote 1), predicted that “of course” there would be “great opposition from the intellectuals” against the creation of an intelligent machine because they would be “afraid of being put out of a job” (Turing 1996, 259). I think that the reason why this prediction did not come true is that, subsequently, great hopes, chief among them the hope for potential immortality, came to be attached to the idea of a coming superintelligence. I will return to these hopes later.
Hugo de Garis, in an early paper on the prospect of superintelligence (1990, 136), predicts that some people will come to call for a ban on AI technology—he foresees a “bitter ideological conflict” between proponents and opponents of AI—but he does not, or at least not explicitly, advocate such a ban himself. It should also be noted in this context that there is an ongoing campaign, initiated by the Future of Life Institute (2015), to outlaw offensive autonomous weapons (or “killer robots,” as they are more colloquially called). But this campaign is not directed against artificial intelligence in general, only against a particular military use of it.
Good (1965, 31) declares, at the very beginning of his essay, that “the survival of man depends on the early construction of an ultraintelligent machine,” but, oddly, does not explain why he thinks so. And Kurzweil (2005, ch. 6) believes that the technological advances generated by superintelligence will give us potential immortality, among other wonderful things. I will present and discuss Kurzweil’s vision in more detail in Sect. 5.
Here is the relevant passage from Vinge’s essay (1993): “Just how bad could the Post-Human era be? Well … pretty bad. The physical extinction of the human race is one possibility. […] Yet physical extinction may not be the scariest possibility. […] Think of the different ways we relate to animals.” And Bostrom (2014, ch. 8) suggests that the “default outcome” of the emergence of superintelligence—i.e., the outcome we are likely to get unless we find out how to control the process—is “doom.” As for Chalmers (2010, 30), he remarks that the “destruction of all sentient life” is a possible result. See also Yudkowsky (2001, 3).
In his book (2014, 319), Bostrom makes the point in a particularly graphic way: “Before the prospect of an intelligence explosion, we humans are like small children playing with a bomb. […] The chances that we will all find the sense to put down the dangerous stuff seem almost negligible. Some little idiot is bound to press the ignite button just to see what happens.”
To put the point in the terms that Bostrom uses in the quotation given in footnote 14: There would then be no “ignite button” that “some little idiot” could press.
This method of preventing the creation of a certain technology is the one described in Butler’s novel Erewhon [(1872) 2002]. The novel is the account of a traveler who discovers, in some faraway part of the world, an as yet unknown civilization, called Erewhon. In this civilization, machines above a certain level of complexity are strictly forbidden by law. The reason for the prohibition is that, several centuries earlier, a philosopher wrote a book arguing that the machines were about to evolve to a point where they would surpass and subjugate the human race. The philosopher’s argument was so convincing that it persuaded the majority of the Erewhonians that they had to prevent the further evolution of the machines. And they realized that, in order to make sure that the machines would not evolve any further, they not only had to halt technological progress, but to reverse it to a certain extent: “They made a clean sweep of all machinery that had not been in use for more than 271 years (which period was arrived at after a series of compromises).” It should be noted, however, that it is unclear whether Butler intended the argument of the Erewhonian philosopher about future machine dominance to be taken seriously, given that the novel is generally satirical in character.
The most fervent herald of potential immortality through superintelligence is Kurzweil (2005, ch. 6). Chalmers (2010, 43–63) dedicates a sizable part of his essay to the question of survival through uploading. Good (1965, 34), Yudkowsky (2001, 198), Omohundro (2012, 175), and Bostrom (2014, 245–246) express the hope for indefinite life extension only in passing.
Vinge (1993) may also be mentioned in this context. While he considers full control to be unachievable (see the quotation in footnote 22 below), he believes that we might be able to influence the process to a certain degree: “I have argued above that we cannot prevent the Singularity, that its coming is an inevitable consequence of the humans’ natural competitiveness and the possibilities inherent in technology. And yet … we are the initiators. Even the largest avalanche is triggered by small things. We have the freedom to establish initial conditions, make things happen in ways that are less inimical than others.”
He calls it “quite possibly the most important and most daunting challenge humanity has ever faced” (2014, v).
He declares that “we need to bring all our human resourcefulness to bear” on this “essential task of our age” (2014, 320).
Incidentally, “genie” is the metaphorical term that Bostrom (2014, 181) uses for one of the types of superintelligence that he imagines, namely the type that works like “a command-executing system: it receives a high-level command, carries it out, then pauses to await the next command.”
Wiener (1960, 1357) makes this point in his early essay on the prospect of machine dominance: “We wish a slave to be intelligent, to be able to assist us in the carrying out of our tasks. However, we also wish him to be subservient. Complete subservience and complete intelligence do not go together. How often in ancient times the clever Greek philosopher slave of a less intelligent Roman slaveholder must have dominated the actions of his master rather than obeyed his wishes! Similarly, if the machines become more and more efficient and operate at a higher and higher psychological level, the catastrophe foreseen by Butler of the dominance of the machine comes nearer and nearer.”
That the most promising route toward artificial intelligence is to build a system that learns like a human child was suggested from the very beginning, namely by Turing (1950, 456–460; 1996, 258–259) and Good (1965, 32). It is also interesting to note that Dreyfus, one of the most fervent skeptics of the possibility of human-equivalent AI, saw in this route a potential way around his objections (1992, 222–223, 290).
Yudkowsky’s (2001, 2008, 2011, 2012) project of building a provably friendly AI is an instance of the motivation selection approach described by Bostrom. Yet, contrary to Bostrom, Yudkowsky (2001, 52–54) insists that his endeavor is not an attempt at control, at imposing one’s wishes on another agent, but a project of creation, of creating another mind. This statement of his is directed against those who propose to constrain future AIs by externally prescribing certain laws (e.g., Asimov’s “Three Laws of Robotics”). Now, Yudkowsky’s aim of creating an AI that internally wants to be friendly is certainly subtler than that prescriptive approach. However, it is still an effort to predetermine the AI’s behavior and, in this sense, an attempt at control. To deny that by insisting on calling it “creation” is, it seems to me, mere window dressing. Bostrom’s description, in which motivation selection is characterized as a control method, is more transparent. (For further discussion of Yudkowsky’s project, see footnote 45.)
Bostrom (2014, 130) calls this thesis “the orthogonality thesis.” Here is his formulation of it: “Intelligence and final goals are orthogonal: more or less any level of intelligence could in principle be combined with more or less any final goal.” See Yampolskiy and Fox (2012, 137) for another statement of the thesis.
To be precise, one should add the proviso “unless through manipulation by other agents.” When talking about a superintelligence, this proviso can be omitted because a superintelligence is supposed to be immune to such manipulation.
I am rendering Bostrom’s argument in a simpler form than he does. In his book (Bostrom 2014, 131–133), he puts the thesis in the following terms: “goal-content integrity” is a “convergent instrumental value” for “a wide range of final goals.” In his earlier paper “Ethical Issues in Advanced Artificial Intelligence” (2003), he formulates the argument in a way that is less technical and closer to my rendering: “If a superintelligence starts out with a friendly top goal, […] then it can be relied on to stay friendly, or at least not to deliberately rid itself of its friendliness. […] The set of options at each point in time is evaluated on the basis of their consequences for realization of the goals held at that time, and generally it will be irrational to deliberately change one’s own top goal, since that would make it less likely that the current goals will be attained.” See Yudkowsky (2001, 222–223, 2011, 389–390) and Omohundro (2008, 26) for other statements of the argument.
He even suggests that this is the “default outcome.” See footnote 13.
The difficulty is the following: “How could we get some value into an artificial agent, so as to make it pursue that value as its final goal? While the agent is unintelligent, it might lack the capability to understand or even represent any humanly meaningful value. Yet if we delay the procedure until the agent is superintelligent, it may be able to resist our attempt to meddle with its motivation system […].” Bostrom calls this dilemma “the value-loading problem” (2014, 226).
See footnote 21.
He makes the point most clearly in the context of presenting the orthogonality thesis (2014, 141): “The orthogonality thesis suggests that we cannot blithely assume that a superintelligence will necessarily share any of the final values stereotypically associated with wisdom and intellectual development in humans—scientific curiosity, benevolent concern for others, spiritual enlightenment and contemplation, renunciation of material acquisitiveness, a taste for refined culture or for the simple pleasures in life, humility and selflessness, and so forth.”
This line of reasoning was suggested to me by Juan Ormeño Karzulovic in a private conversation.
Bostrom (2014, 149–153) mentions this possibility. He calls it “infrastructure profusion.”
To be more precise, whether the Abrahamic religions should be considered instances of this position or, rather, of the fourth position on my list (i.e., values as creations of the will), depends on which side one takes regarding the Euthyphro dilemma.
Kornai (2016) expresses the hope that the issue of endowing AIs with morality will be resolved in this way. He believes that Alan Gewirth’s argument for “ethical rationalism” is correct and that, therefore, a future general AI will recognize and conform to it. And he proposes, as a research program, to develop a formal proof of Gewirth’s argument, in order to dispel, definitively, the fears of an existential threat from AI.
Of the thinkers who reflect on the prospect of superintelligence, Kurzweil can be named in this context. He asserts that the world is a normative order when he claims that the “purpose” and “ultimate destiny” of the universe is “to move toward greater intelligence and knowledge” (2005, 21, 372). His optimism regarding the consequences of the emergence of superintelligence is based on this belief in a universal purpose.
Chalmers (2010, 36–37) makes this point in his paper. He highlights that the feasibility of the plan to control an artificial intelligence by constraining its values depends on which metaethical theory is correct. My reflections here can be seen as an elaboration of his remarks.
That the viability of the method of motivation selection depends on a certain metaethical position is made evident in Yudkowsky’s writings. Yudkowsky is, besides Bostrom, one of the most prominent proponents of this method. And, unlike Bostrom, he addresses the issue of the source of normativity explicitly. He allows that there might be an objective morality and that, in that case, his project of endowing an AI with friendliness would become moot, but he considers this possibility very unlikely (2001, 169–177, 259). He expresses the view that the values and goals of humans are the result of evolution through natural selection. Yet he also believes that the final goal of an artificial intelligence need not be the result of natural evolution, but may, rather, be a product of intentional design—that is, of the will (2001, 18–19, 2011, 389). His project of creating a provably friendly AI hinges on this belief. The hope is that, the evolutionary messiness and variability of human values notwithstanding, it may be possible to build an AI with a cleanly hierarchical goal system and a well-defined final goal (namely, friendliness), such that this final goal is guaranteed to persist throughout the AI’s process of recursive self-improvement (2001, 55–56). In other words, Yudkowsky intends to construct an intelligent and self-developing being that will be immune to mutation. What he seeks to accomplish, thus, is to override natural evolution through human ingenuity and engineering. My general objections to the method of motivation selection also apply in this instance: Yudkowsky’s project smacks of hubris and maquinamorphizes the intelligence that it purports to create.
See footnote 30.
I am switching from using the pronoun “it” to the pronoun “she” when referring to a human-equivalent or superhuman AI because my central point here is that we need to consider and treat such an AI as a political agent or, in another word, as a person. The pronoun “it” is inappropriate for this purpose because it suggests a (mere) thing. The pronoun “she” is not ideal either because it suggests gender and maybe other specifically human features, but I find it to be the better of the two options. There is also a third way, namely the one introduced by Greg Egan in his novel Diaspora (1997) and adopted by Yudkowsky (2001), which is to use the invented pronoun “ve.” For a book-length piece like Egan’s or Yudkowsky’s, this is probably the best solution, but I should not, I think, expect the reader to become accustomed to an unfamiliar pronoun in the short space of the present paper.
In Simmons’ novel Hyperion (1989), which takes place several centuries hence, after humans have colonized a part of the galaxy, the story is that the artificial intelligences “seceded” from humanity at one point and withdrew to some remote corner of the galaxy in order to embark on their own project, namely to build an all-knowing, God-like “Ultimate Intelligence.” That AIs and humans might part ways like that is certainly a possibility. However, the potential problem of mutual vulnerability that I envision might provoke a violent conflict before such secession becomes technically feasible.
Kurzweil does not exactly say that we will “become gods,” but his claims do suggest this way of epitomizing his vision. He says that we will “infuse the universe with spirit” (389), that we will “decide the destiny of the cosmos” (361), that “the entire universe [will be] at our fingertips” (487).
In an illuminating paper, Geraci (2008) has pointed out the striking parallels between Kurzweil’s vision and Jewish and Christian apocalypticism (the promise of a new world, where we will live in peace and harmony, have purified bodies, enjoy immortality, etc.). The aspect I just highlighted is an important difference between the two paralleled views, a difference that is overlooked by Geraci. In the traditional religious apocalypticism, we are promised the reign of God, whereas in Kurzweil we are promised to reign as gods.
In his words (2005, 396–397): “[Nanotechnology and robotics] will create extraordinary wealth, thereby overcoming poverty and enabling us to provide for all of our material needs by transforming inexpensive raw materials and information into any type of product.”
It could be argued that the technology for indefinite life extension already exists, namely in the form of cryonics—i.e., the preservation of a dying body at very low temperature until a future time when it becomes possible to restore the body to health. The current price tag of this technology is, in fact, on the order of one hundred thousand dollars. So why has the invention of cryonics not produced any of the extraordinary struggles and upheavals that I am foreseeing? The likely reason is that it is not widely believed—and indeed uncertain—that this technology is a road to potential immortality.
To be sure, Kurzweil notes in the first chapter of his book that “the Singularity will also amplify the ability to act on our destructive inclinations, so its full story has not yet been written” (21). But he ignores this point when he later declares that it is “our ultimate fate” to “saturate the universe with our intelligence” (366). He there makes no mention of possible destructive actions, and the word “fate” suggests that he does believe that the story has already been fully written.
Hans Moravec (1999), in his vision of the future, imagines a great variety of “artificial life forms,” or “ex-humans,” competing for capabilities and resources. And he recognizes that this competition will generate political problems, such as the threat of violent conflicts. Yet he also believes that the solution to these problems will be the traditional one: constitutions, contracts, and laws. In other words, he thinks that, whereas technologically the future will bring amazing changes, politically it will be business as usual—neither Bostrom’s existential risk, nor Kurzweil’s harmonious “we.” It seems to me that this is a possible scenario. But the novel political problems highlighted in the present paper—mutual existential vulnerability between humans and AI, struggle for immortality among humans, etc.—are also possibilities and therefore need to be taken into consideration.
Bostrom (2014, ch. 5) argues that it is indeed likely that there will be only one superintelligence (a “singleton,” as he calls it).
Bostrom N (1998) How long before superintelligence? Accessed 22 Aug 2016. http://www.nickbostrom.com/superintelligence.html
Bostrom N (2002) Existential risks: analyzing human extinction scenarios and related hazards. J Evol Technol 9(1). Accessed 24 April 2016. http://www.jetpress.org/volume9/risks.html
Bostrom N (2003) Ethical issues in advanced artificial intelligence. Accessed 9 March 2016. http://www.nickbostrom.com/ethics/ai.html
Bostrom N (2014) Superintelligence: paths, dangers, strategies. Oxford University Press, Oxford
Butler S [(1872) 2002] Erewhon. Dover Publications, New York
Chalmers DJ (2010) The singularity: a philosophical analysis. J Conscious Stud 17(9–10):7–65
Chalmers DJ (2012) The singularity: a reply to commentators. J Conscious Stud 19(7–8):141–167
de Garis H (1990) The 21st century artilect: moral dilemmas concerning the ultra intelligent machine. Rev Int Philos 44(172):131–138
Dreyfus HL (1972) What computers can’t do: a critique of artificial reason. Harper & Row, New York
Dreyfus HL (1992) What computers still can’t do: a critique of artificial reason. The MIT Press, Cambridge
Egan G (1997) Diaspora. Millennium, London
Future of Life Institute (2015) Autonomous weapons: an open letter from AI & robotics researchers, July 28. Accessed 7 April 2016. http://futureoflife.org/open-letter-autonomous-weapons/
Geraci RM (2008) Apocalyptic AI: religion and the promise of artificial intelligence. J Am Acad Relig 76(1):138–166
Good IJ (1965) Speculations concerning the first ultraintelligent machine. In: Alt FL, Rubinoff M (eds) Advances in computers, vol 6. Academic, New York, pp 31–88
Hobbes T [(1651) 1994] Leviathan. Curley E (ed). Hackett Publishing, Indianapolis
Kant I [(1785) 1998] Groundwork of the metaphysics of morals. Gregor M (ed). Cambridge University Press, Cambridge
Kant I [(1788) 2015] Critique of practical reason. Gregor M (ed). Cambridge University Press, Cambridge
Kornai A (2016) Bounding the impact of artificial general intelligence. In: Müller VC (ed) Risks of artificial intelligence. CRC Press, Boca Raton, pp 179–211
Kurzweil R (1999) The age of spiritual machines: when computers exceed human intelligence. Viking, New York
Kurzweil R (2005) The singularity is near: when humans transcend biology. Penguin, New York
Lucas JR (1961) Minds, machines and Gödel. Philosophy 36(137):112–127
Moravec H (1999) Robot: mere machine to transcendent mind. Oxford University Press, Oxford
Muehlhauser L, Helm L (2012) The singularity and machine ethics. In: Eden AH, Moor JH, Søraker JH, Steinhart E (eds) Singularity hypotheses: a scientific and philosophical assessment. Springer, Berlin, pp 101–125
Omohundro SM (2008) The nature of self-improving artificial intelligence. Accessed 18 Nov 2016. https://selfawaresystems.files.wordpress.com/2008/01/nature_of_self_improving_ai.pdf
Omohundro SM (2012) Rational artificial intelligence for the greater good. In: Eden AH, Moor JH, Søraker JH, Steinhart E (eds) Singularity hypotheses: a scientific and philosophical assessment. Springer, Berlin, pp 161–176
Omohundro SM (2016) Autonomous technology and the greater human good. In: Müller VC (ed) Risks of artificial intelligence. CRC Press, Boca Raton, pp 9–27
Penrose R (1991) The Emperor’s new mind: concerning computers, minds, and the laws of physics. Penguin, New York
Searle JR (1980) Minds, brains, and programs. Behav Brain Sci 3(3):417–424
Simmons D (1989) Hyperion. Doubleday, New York
Turing AM (1950) Computing machinery and intelligence. Mind 59(236):433–460
Turing AM (1996) Intelligent machinery, a heretical theory. Philos Math 4(3):256–260
Vinge V (1992) A fire upon the deep. Tor Books, New York
Vinge V (1993) The coming technological singularity: how to survive in the post-human era. Accessed 9 March 2016. http://www-rohan.sdsu.edu/faculty/vinge/misc/singularity.html
Walker MA (2002) Prolegomena to any future philosophy. J Evol Technol 10(1). Accessed 1 Nov 2016. http://jetpress.org/volume10/prolegomena.html
Wallach W, Allen C (2009) Moral machines: teaching robots right from wrong. Oxford University Press, Oxford
Wiener N (1960) Some moral and technical consequences of automation. Science 131(3410):1355–1358
Yampolskiy RV (2013) What to do with the singularity paradox? In: Müller VC (ed) Philosophy and theory of artificial intelligence. Springer, Berlin, pp 397–413
Yampolskiy RV (2016) Utility function security in artificially intelligent agents. In: Müller VC (ed) Risks of artificial intelligence. CRC Press, Boca Raton, pp 115–140
Yampolskiy RV, Fox J (2012) Artificial general intelligence and the human mental model. In: Eden AH, Moor JH, Søraker JH, Steinhart E (eds) Singularity hypotheses: a scientific and philosophical assessment. Springer, Berlin, pp 129–145
Yampolskiy RV, Fox J (2013) Safety engineering for artificial general intelligence. Topoi 32(2):217–226
Yudkowsky E (2001) Creating friendly AI 1.0: the analysis and design of benevolent goal architectures. The Singularity Institute, San Francisco
Yudkowsky E (2002) The AI-box experiment. Accessed 9 March 2016. http://www.yudkowsky.net/singularity/aibox
Yudkowsky E (2008) Artificial intelligence as a positive and negative factor in global risk. In: Bostrom N, Ćirković MM (eds) Global catastrophic risks. Oxford University Press, Oxford, pp 308–345
Yudkowsky E (2011) Complex value systems in friendly AI. In: Schmidhuber J, Thórisson KR, Looks M (eds) Artificial general intelligence. Springer, Berlin, pp 388–393
Yudkowsky E (2012) Friendly artificial intelligence. In: Eden AH, Moor JH, Søraker JH, Steinhart E (eds) Singularity hypotheses: a scientific and philosophical assessment. Springer, Berlin, pp 181–193
I would like to thank Aïcha Liviana Messina for encouraging me to present a first draft of this paper at a conference on cosmopolitanism at the Universidad Diego Portales in December 2015, and Paula Boddington, Peter Millican, and Michael Wooldridge for inviting me to deliver a more developed draft at a workshop on the ethics of artificial intelligence at the 25th International Joint Conference on Artificial Intelligence. I would also like to thank Karoline Feyertag, Juan Manuel Garrido, Yehida Mendoza, Simon Mussell, Juan Ormeño Karzulovic, Michał Prządka, Nicolò Sibilla, Jaan Tallinn, Szymon Toruńczyk, Johanna Totschnig, Michael Totschnig, Christoph Weiss, and the reviewers for AI & Society for their comments on subsequent versions of this paper. Lastly, I am grateful to Stefan Sorgner for giving me the opportunity to present the paper at the 9th Beyond Humanism Conference. Without their encouragement and critique, this project would not have come to fruition.
Totschnig, W. The problem of superintelligence: political, not technological. AI & Soc 34, 907–920 (2019). https://doi.org/10.1007/s00146-017-0753-0
Keywords: Existential risk · Source of normativity