
Two arguments against human-friendly AI


The past few decades have seen a substantial increase in attention to the ethical implications of artificial intelligence. Among the issues raised is the existential risk that some believe could arise from the development of artificial general intelligence (AGI), an as-yet hypothetical form of AI able to perform all the same intellectual feats as humans. This has led to extensive research into how humans can avoid losing control of an AI that is at least as intelligent as the best of us. This ‘control problem’ has given rise to research into the development of ‘friendly AI’, a highly competent AGI that will benefit, or at the very least not be hostile toward, humans. Though my question is focused upon AI, ethics, and issues surrounding the value of friendliness, I want to question the pursuit of human-friendly AI (hereafter FAI). In other words, we might ask whether worries regarding harm to humans are sufficient reason to develop FAI rather than impartially ethical AGI, that is, an AGI designed to take the interests of all moral patients (both human and non-human) into consideration. I argue that, given that we are capable of developing AGI, it ought to be developed with impartial, species-neutral values rather than those prioritizing friendliness to humans above all else.



  1. See, for example, Yudkowsky [27].

  2. See, for example, Tarleton [22], Allen et al. [1], Anderson and Anderson [2], and Wallach et al. [26].

  3. See, for example, Omohundro [16], Bostrom [4], ch. 12; Taylor et al. [23], Soares [21], and Russell [18].

  4. See Armstrong et al. [3] and Bostrom [4], pp. 177–181.

  5. As an example of a company aiming at the latter, see

  6. While ‘intelligence’ is notoriously difficult to define, Russell [18], p. 9 claims that something is intelligent “to the extent that their actions can be expected to achieve their objectives”. According to Tegmark [24], p. 50, intelligence is the “ability to accomplish complex goals”. And Yudkowsky [25]: intelligence is “an evolutionary advantage” that “enables us to model, predict, and manipulate regularities in reality”.

  7. Central to explaining AGI’s move to ASI is ‘recursive self-improvement’ described in Omohundro [14].

  8. This is consistent with Yudkowsky [12], p. 2, according to which: “The term ‘Friendly AI’ refers to the production of human-benefiting, non-human-harming actions in Artificial Intelligence systems that have advanced to the point of making real-world plans in pursuit of goals”.

  9. With ‘considers the interests’ I’m anthropomorphizing for simplicity. I expect it to be a matter of controversy whether AGI of any sort can consider the interests of anything whatsoever.

  10. See Regan [17], chapter 5 for a discussion of the notions of ‘moral patient’ and ‘moral agent’.

  11. For opinions regarding when AGI will be attained see Bostrom [4], pp. 23–24 and Müller and Bostrom [13].

  12. See, for example, Bostrom [4], Kurzweil [11], Yudkowsky [7], Chalmers [5], Vinge [25], Good [9]. There are differing views on the timelines involved in the move from AGI to ASI. For a discussion of the differences between ‘hard’ and ‘soft takeoffs’ see, for example, Bostrom [4] chapter 4 (especially pp. 75–80), Yudkowsky [25], Yudkowsky [30], and Tegmark [24], pp. 150–157.

  13. IAI may favor particular species if species-neutral values dictate favoring some species over others. For example, it may be the case that while all animals are worthy of moral consideration, some species are worthy of a greater level of consideration than others.

  14. Of course, another possibility is that AGI develops hostile values in which case issues of human and non-human interests are likely moot.

  15. Of course, it should be noted that while IAI may not be consistent with FAI, it is at least possible that IAI will be consistent with FAI. I take it that we are not in a position to know which is more likely with any degree of certainty.

  16. The term ‘speciesism’, coined by Ryder [19], is meant to express a bias toward the interests of one’s own species and against those of other species.

  17. By ‘moral patient’ I mean anything which is sentient or conscious and can be harmed or benefitted. A moral patient is anything toward which moral agents (i.e., those entities that bear moral responsibilities) can have responsibilities for its own sake. For present purposes, I will take the capacity to suffer as a reasonable sufficient (and possibly necessary) condition for being a moral patient.

  18. By ‘possible’ here I don’t intend a distant, modal sense according to which there exists some possible world in which the relevant beings exist. I mean that, in this world, such beings could very well actually exist in the future given that we don’t exterminate the preceding species or beings.

  19. Even if the goals, as specified, are consistent with human interests, ASI might take unintended paths toward the accomplishing of these goals, or it may develop subgoals (or, instrumental goals) that are ultimately inconsistent with human interests. For the latter issue, see Omohundro [14, 15] and Bostrom [4], ch. 7.

  20. I acknowledge that there is a debate to be had regarding what is ‘in the interest’ of a species. Nonetheless, I do not see the plausibility of my thesis turning on the choices one might make here.

  21. In terms of FAI based upon values we believe to be consistent with human interests, the main problem involves the widely discussed ‘unintended consequences’. The worry stems from our inability to foresee the possible ways in which AGI might pursue the goals we provide it with. Granting that it will become significantly more intelligent than the brightest humans, it’s unlikely that we’ll be capable of discerning the full range of possible paths cognitively available to AGI for pursuing whatever goal we provide it. In light of this, something as powerful as AGI might produce especially catastrophic scenarios (see, for example, Bostrom [4] ch. 8 and Omohundro [15]).

    As for FAI based upon what are, in fact, human-centric values, an initial problem arises when we consider that what we believe is in our interest and what is actually in our interest might be quite distinct. If so, how could we possibly go about developing such an AI? It seems that any hopeful approach to such an FAI would require our discovering the correct theory of human wellbeing, whatever that might happen to be. Nonetheless, for the purposes of this paper I want to grant that we are, in fact, capable of developing such an objectively human-friendly AI.

  22. By ‘a set of impartial, species-neutral moral facts’ I mean simply that, given the assumption that the interests of all moral patients are valuable, there is a set of moral facts that follow. Basically, there is a set of facts that determine rightness and wrongness in any possible situation given the moral value of all moral patients, where this is understood in a non-speciesist (i.e., based upon morally relevant features rather than species-membership) way.

  23. I thank an anonymous reviewer for this point.

  24. Muehlhauser and Bostrom [12], p. 43.

  25. Yudkowsky [29], p. 388.

  26. Singer [20].

  27. Singer [20], p. 6.

  28. DeGrazia [7], p. 36.

  29. Singer [20], p. 8.

  30. See Singer [20], p. 20.

  31. DeGrazia [7], pp. 35–36.

  32. The arguments in the remainder of the paper will clearly still follow for proponents of the ‘equal consideration approach’. In fact, my conclusions may still follow on an even weaker anti-speciesist view according to which we ought to treat species as morally equal to humans (or of even greater moral worth than humans) if such beings evolve from current species (see Sect. 4 below).

  33. See, for example, De Waal [8].

  34. In addition, it’s also likely that there will be many cases in which, despite non-human interests receiving no consideration, such interests will remain consistent with human interests. I happily admit this. The point I’m making is that there will be cases where non-human interests will not be consistent with human interests and therefore will be disregarded by FAI.

  35. See, for example, Bostrom [4], Yudkowsky [31], Omohundro [14, 15], Häggström [10], and Russell [18].

  36. This might be accomplished by harvesting and altering their genetic information then producing the new ‘versions’ via in vitro fertilization. This is outlandish, of course, but no more so than the scenarios suggested by many AI researchers regarding existential threats to humanity via unintended consequences.

  37. See Omohundro [15] for a discussion of ‘basic AI drives’. Of these, the most relevant to the current point is ‘resource acquisition’. ‘Efficiency’ is another relevant subgoal, as AGI/ASI will become more efficient with regard to pursuing its goals as well as its use of resources.

  38. It’s also important to recall that there’s every reason to believe that IAI, as well as FAI, will develop the basic AI drives presented in Omohundro [15].

  39. I remind the reader that by ‘possible’ beings here I intend those that could very well actually exist in the future given that we don’t exterminate the relevant preceding beings and not some logically distant, modal sense of beings.

  40. In addition, given that such species could develop from currently existing species, it is not a major leap to accept that we ought to develop AGI with them in mind as well, even if one rejects that currently existing species are now worthy of consideration.

  41. Darwin [6], pp. 34–35.

  42. See, for example,, and

  43. I would suggest that this is analogous to cases in which, when presented with a moral dilemma, children should defer to suitable adults to make decisions that will have morally relevant consequences.

  44. In fact, it seems that beyond all of the foregoing, a sufficiently competent and powerful ASI could well fit the environment of the earth, as well as the universe beyond, to the most morally superior of possible biological beings. If it turns out that the optimal moral scenario is one in which the highest of possible moral beings exists and has its interests maximized, then we ought to develop IAI to bring about just this scenario, regardless of whether we are included in such a scenario. On the other hand, if we’re supposed to, morally speaking, develop that which will most benefit humans, then we are left not only scrambling to do so, but also hoping that there are no smarter beings somewhere in the universe working on the analogous project.

  45. I thank an anonymous reviewer for this point as well.

  46. Unfortunately, there is precedent in past human behavior for this attitude. For example, I expect that, with the benefit of hindsight, many believe that nuclear weapons ought not to have been created. The same can be said for the development of substances and practices employed in processes that continue to contribute to climate change. Nonetheless, global dismantling of nuclear weapons and moving away from practices that proliferate greenhouse gases remain far-off hopes.

    If this is correct, then I would suggest not only that the foregoing provides support for the preferability of species-neutral AGI but that the scope of interests to be considered by AGI ought to be given far more attention than it currently receives.


  1. Allen, C., Smit, I., Wallach, W.: Artificial morality: top-down, bottom-up, and hybrid approaches. Ethics Inf. Technol. 7, 149–155 (2006)

  2. Anderson, M., Anderson, S.: Machine ethics: creating an ethical intelligent agent. AI Mag. 28(4), 15–26 (2007)


  3. Armstrong, S., Sandberg, A., Bostrom, N.: Thinking inside the box: controlling and using an oracle AI. Mind. Mach. 22, 299–324 (2011)


  4. Bostrom, N.: Superintelligence. Oxford University Press, Oxford (2014)


  5. Chalmers, D.: The singularity: a philosophical analysis. J. Conscious. Stud. 17(9–10), 7–65 (2010)


  6. Darwin, C.: The Descent of Man, and Selection in Relation to Sex. John Murray, London (1871)


  7. DeGrazia, D.: Animal Rights: A Very Short Introduction. Oxford University Press, New York, NY (2002)


  8. De Waal, F.: Chimpanzee Politics. Johns Hopkins University Press, Baltimore, MD (1998)


  9. Good, I.J.: Speculations concerning the first ultraintelligent machine. In: Franz, L., Rubinoff, M. (eds.) Advances in Computers, vol. 6, pp. 31–88. Academic Press, New York (1965)

  10. Häggström, O.: Challenges to the Omohundro–Bostrom framework for AI motivations. Foresight 21(1), 153–166 (2019)


  11. Kurzweil, R.: The Singularity is Near: When Humans Transcend Biology. Penguin Books, New York (2005)

  12. Muehlhauser, L., Bostrom, N.: Why We Need Friendly AI. Think 36, 13(Spring) (2014)

  13. Müller, V., Bostrom, N.: Future progress in artificial intelligence: a survey of expert opinion. In: Fundamental Issues of Artificial Intelligence, 2016-06-08, pp. 555–572 (2016)

  14. Omohundro, S.: The nature of self-improving artificial intelligence (2007)

  15. Omohundro, S.: The basic AI drives. In: Wang, P., Goertzel, B., Franklin, S. (eds.) Artificial General Intelligence 2008: Proceedings of the First AGI Conference. IOS, Amsterdam, pp. 483–492 (2008)

  16. Omohundro, S.: Autonomous technology and the greater human good. J. Exp. Theor. Artif. Intellig. 26(3), 303–315 (2014)

  17. Regan, T.: The Case for Animal Rights. University of California Press, California (2004)


  18. Russell, S.: Human Compatible: Artificial Intelligence and the Problem of Control. Viking, New York (2019)


  19. Ryder, R.: (2010)

  20. Singer, P.: Animal Liberation. HarperCollins, New York, NY (2002)


  21. Soares, N.: The value learning problem. In: Ethics for Artificial Intelligence Workshop at 25th International Joint Conference on Artificial Intelligence (IJCAI-2016), New York, NY, USA, 9–15 July 2016 (2016)

  22. Tarleton, N.: Coherent Extrapolated Volition: A Meta-Level Approach to Machine Ethics. The Singularity Institute, San Francisco, CA (2010)


  23. Taylor, J., Yudkowsky, E., LaVictoire, P., Critch, A.: Alignment for Advanced Machine Learning Systems. Machine Intelligence Research Institute, July 27, 2016 (2016)

  24. Tegmark, M.: Life 3.0: Being Human in the Age of Artificial Intelligence. Alfred A. Knopf, New York, NY (2017)


  25. Vinge, V.: The coming technological singularity: how to survive in the post-human era. Whole Earth Rev. 77 (1993)

  26. Wallach, W., Allen, C., Smit, I.: Machine morality: bottom-up and top-down approaches for modelling human moral faculties. Ethics Artif. Agents 22(4), 565–582 (2008)

  27. Yudkowsky, E.: Creating Friendly AI 1.0: The Analysis and Design of Benevolent Goal Architectures. The Singularity Institute, San Francisco, CA, June 15 (2001)

  28. Yudkowsky, E.: Artificial intelligence as a positive and negative factor in global risk. In: Bostrom, N., Cirkovic, M. (eds.) Global Catastrophic Risks. Oxford University Press, Oxford, pp. 308–345 (2008)

  29. Yudkowsky, E.: Complex value systems in friendly AI. In: Schmidhuber, J., Thórisson, K.R., Looks, M. (eds.) Artificial General Intelligence: 4th International Conference. AGI 2011, LNAI 6830, pp. 388–393 (2011)

  30. Yudkowsky, E.: Intelligence Explosion Microeconomics. Technical Report 2013-1. Machine Intelligence Research Institute, Berkeley, CA. Last modified September 13, 2013 (2013)

  31. Yudkowsky, E.: There’s No Fire Alarm for Artificial General Intelligence (2017)


Author information



Corresponding author

Correspondence to Ken Daley.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.



About this article


Cite this article

Daley, K. Two arguments against human-friendly AI. AI Ethics 1, 435–444 (2021).
