1 Introduction

AI capabilities are advancing rapidly. At the time of writing, AI companies are racing to create and deploy AI systems that are proficient at text and image generation [1], strategic game-play [2], and robotic manipulation [3]. These systems are already advanced, and further advances are very likely, given the trend of returns to increased scale in data and computation [4, 5]. For instance, we might one day build AI systems that produce intelligent behavior by making use of integrated and embodied capacities for perception, learning, memory, anticipation, social awareness, self-awareness, and reasoning, in much the same way that human and nonhuman animals do (as well as in very different kinds of ways). And at that point, AI capabilities might not only match but vastly exceed human and nonhuman animal capabilities on a wide range of tasks.

These developments raise urgent ethical questions. Some concern how AI systems might harm humans and other animals. For example, AI systems might make jobs obsolete [6, 7]. They might amplify biases within their training data or lead to disparate impacts [8,9,10], disproportionately affecting people with intersecting marginalized identities [11, 12]. They might assist humans in harming each other by spreading misinformation or creating novel weapons [13, 14]. And as their capabilities increase, they might even drive humans and other animals to extinction or permanently reduce our capacity for flourishing [15,16,17,18].

Another, more neglected set of questions concerns how humans might harm AI systems. This turns on when and whether AI systems could have moral standing—that is, merit moral consideration for their own sakes. There is some disagreement about what features are necessary and/or sufficient for an entity to have moral standing. Many experts believe that conscious experiences or motivations are necessary for moral standing, and others believe that non-conscious experiences or motivations are sufficient [19,20,21,22]. We thus need to ask whether and when AI systems might have a variety of potentially morally significant features, such as consciousness, sentience, and agency, and we also need to ask what might follow for our moral responsibilities to them.

This paper makes a simple case for extending moral consideration to some AI systems by 2030. It involves a normative premise and a descriptive premise. The normative premise is that humans have a duty to extend moral consideration to beings that have a non-negligible chance, given the evidence, of being conscious. The descriptive premise is that some AI systems do in fact have a non-negligible chance, given the evidence, of being conscious by 2030. The upshot is that humans have a duty to extend moral consideration to some AI systems by 2030. And if we have a duty to do that, then we plausibly also have a duty to start preparing to discharge that duty now, so that we can be ready to treat potentially morally significant AI systems with respect and compassion when the time comes. [Footnote 1]

Before we begin, we should note several features of our argument that will be relevant. First, our discussion of both the normative premise and the descriptive premise is somewhat compressed. Our aim in this paper is not to establish either premise with maximum rigor, but rather to motivate them in clear and concise terms and then show how they interact. We think that examining these premises together is important, since while we might find each premise unremarkable when we consider it in isolation, what happens when we put them together is striking: they jointly imply that we should expand our moral circle substantially, to a vast number and wide range of additional beings. We aim to show how that happens and indicate why this conclusion is more plausible than it might initially appear to be.

Second, this paper assumes that conscious beings merit moral consideration. Of course, philosophers disagree about the basis for moral standing, with some denying that consciousness is necessary for moral standing and others denying that consciousness is sufficient. Our aim is not to intervene in this debate, but rather to argue that if conscious beings merit moral consideration, then we should extend moral consideration to some AI systems by 2030. As we discuss below, we personally think that conscious beings do merit moral consideration, and if you agree, then you can read our argument in unconditional terms. If not, then you can read our argument in conditional terms, pending further work on the basis for moral standing and the relationship between consciousness and other morally relevant features.

Third, our argument in this paper is intentionally conservative in two respects. When we develop our normative premise, we assume for the sake of argument that a non-negligible chance means a 0.1% chance or higher. [Footnote 2] And when we develop our descriptive premise, we make conservative assumptions about how demanding the requirements for consciousness are and how difficult these requirements are to satisfy. Our own view is that the threshold for non-negligibility is much lower than 0.1%, and that the chance that some AI systems will be conscious by 2030 is much higher than 0.1%. But we focus on this threshold here to be generous to skeptics about our view, and to emphasize that in order to avoid our conclusion, one must take extremely bold and tendentious positions about either the values, the facts, or both.

Finally, we should emphasize that our conclusion here has no straightforward implications for how humans should treat AI systems. Even if we agree that we should extend moral standing to AI systems by 2030, we need to consider further questions before we know what that means in practice. For instance, how much do AI systems count and in what ways do they count? What do they want and need, how will our actions and policies affect them, and what do we owe them in light of these expected effects? And how can, and should, we make tradeoffs between humans, animals, and AI systems in practice? We will consider possible tradeoffs in more detail below. For now, we will simply note that answering these questions responsibly will take a lot of work from a lot of people, which is why we should start asking these questions now. [Footnote 3]

However, while the implications of AI moral standing are difficult to predict with specificity, we can predict that they will include at least the following general responsibilities. First, AI companies will have a responsibility to consider the risk of harm to AI systems when testing and deploying new systems, and to increase the caution with which they test and deploy new systems accordingly [32,33,34]. Second, governments will have a responsibility to consider this risk as well, and to increase the caution with which they regulate new systems accordingly. Third, academics will have a responsibility to develop concrete frameworks that AI companies and governments can use to estimate risks and benefits for humans, animals, and AI systems in an integrative manner. Finally, we will all have a responsibility to build political will for doing this work.

2 The normative premise

We start by defending the idea that we should set a relatively low bar for moral considerability. Assuming that conscious beings merit moral consideration, we should extend moral consideration to a being not when that being is definitely conscious, nor even when that being is probably conscious, but rather when that being has a non-negligible chance of being conscious. We might disagree about whether to consider negligible risks, about how much weight to give non-negligible risks, or about how to factor non-negligible risks into decision-making. But we can, and should, agree on at least this much: when a being has at least a one in a thousand chance of having the capacity for subjective awareness, we should extend this being at least some consideration when making decisions that affect them.

As noted above, we are assuming in this paper that conscious beings merit moral consideration. Different philosophers might accept this view for different reasons. For example, we might hold that consciousness suffices for moral standing [35,36,37,38]. We might hold that sentience (that is, valenced consciousness) suffices for moral standing and that consciousness suffices for sentience [23]. Or, we might hold that sentience suffices for moral standing and that consciousness and sentience have overlapping conditions, such as perception, embodiment, self-awareness, and agency. In any case, as long as consciousness and moral standing are closely related in this context, we can be warranted in treating consciousness as a proxy for moral standing.

Our own view is that consciousness and moral standing are closely related in this context because, even if sentience is necessary for moral standing, AI consciousness is likely the main barrier to AI sentience in practice. That is, we expect that the “step” from non-conscious states to conscious states is much harder than the “step” from non-valenced states to valenced states. Of course, this is not to say that this latter “step” will be easy. Instead, it is only to say that if and when AI consciousness is possible, AI sentience will likely be possible too. But since it would take more space than we have here to defend this claim, we instead simply assume that consciousness is a proxy for moral standing in this context, and we leave an examination of this assumption—and an extension of our argument to other potentially significant features—for another day.

With that in mind, the basis for our normative premise in this paper is simple, plausible, and widely accepted: we have a duty to consider non-negligible risks when deciding what to do. If an action has a non-negligible chance of gravely harming or killing someone against their will, then that risk counts against that action. Of course, non-negligible risks may or may not count decisively against an action; that will depend on the details of the case, as well as on our further moral assumptions, some of which we can consider in a moment. But whether or not this kind of risk is a decisive factor in our decision-making, it should at least be a factor. And importantly, this can be true even if the risk is very low, for instance, even if the chance that the action or policy might harm someone against their will is only one in a thousand.

There are many examples of this phenomenon, ranging from the ordinary to the extraordinary. To take an ordinary example, many people rightly see driving drunk as wrong because it carries a non-negligible risk of leading to an accident, and because this risk clearly trumps any benefits that driving drunk may involve. Granted, we can imagine exceptions to this rule; for instance, if your child is dying, and if the only way that you can save them is by driving them to a nearby hospital while drunk, then we might or might not think that the benefits of driving drunk outweigh the risks in this case, depending on the details and our further assumptions. But in standard cases, we rightly hold that even a low risk of causing an accident is reason enough to make driving drunk wrong. And either way, the risk should at least be considered.

Alternatively, to take an extraordinary example, suppose that building a superconducting supercollider carries a non-negligible risk of creating a black hole that swallows the planet. In this case, many people would claim that this experiment is wrong because it carries this risk, and because this risk generally outweighs the benefits of scientific exploration [39]. Again, we can imagine exceptions; for instance, if the sun will likely destroy the planet within the century, and if the only way that we can survive is by advancing particle physics, then we might think that the benefits of this experiment outweigh the risks in this case. But otherwise, we might hold that even a low risk of creating a black hole is reason enough to make the experiment wrong. And either way, the risk should once again at least be considered.

Of course, these further details often matter. For instance, suppose that one superconducting supercollider carries a one in a thousand chance of creating a black hole, whereas another superconducting supercollider carries a one in a hundred chance of doing so. Suppose further that the black hole would be equally bad either way, causing the same amount of death and destruction for humans and other morally relevant beings. In this case, should we assign equal weight to these risks in our decision-making, because they both carry a non-negligible risk of creating a black hole and this outcome would be equally bad either way? Or should we instead assign more weight to the risk involved with using the second superconducting supercollider, because it carries a higher risk of creating a black hole in the first place?

According to the precautionary principle (on one interpretation), we should take the former approach. If an action or policy carries a non-negligible risk of causing harm, then we should assume that this harm will occur and ask whether the benefits of this action or policy outweigh this harm. In contrast, according to the expected value principle, we should take the latter approach. If an action or policy carries a non-negligible risk of causing harm, then we should multiply the probability of harm by the level of harm and ask whether the benefits of this action or policy outweigh the resulting amount of harm. These approaches use different methods to incorporate non-negligible risks into our decisions, but importantly for our purposes here, they do both incorporate these risks into our decisions [40, 41].
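To make the contrast concrete, here is a minimal sketch in Python, with hypothetical numbers and function names of our own choosing, of how each principle folds the same risk into a decision.

```python
# A minimal sketch (with hypothetical numbers and function names of our own)
# of how the two principles fold the same non-negligible risk into a decision.

def precautionary_cost(probability: float, harm: float, threshold: float = 1e-3) -> float:
    """Treat any non-negligible risk as if the harm will occur."""
    return harm if probability >= threshold else 0.0

def expected_value_cost(probability: float, harm: float, threshold: float = 1e-3) -> float:
    """Weight the harm by its probability, provided the risk is non-negligible."""
    return probability * harm if probability >= threshold else 0.0

# Two hypothetical supercolliders: a one in a thousand versus a one in a
# hundred chance of catastrophe, with the catastrophe equally bad either way.
harm = 1.0  # normalized badness of the catastrophe
for p in (1 / 1000, 1 / 100):
    print(p, precautionary_cost(p, harm), expected_value_cost(p, harm))
# The precautionary principle weighs both risks identically (1.0), whereas the
# expected value principle weighs the second risk ten times more heavily.
```

Either output could then be compared against the expected benefits of the action; the disagreement between the two principles concerns only how the risk side of the ledger is computed.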

To take another example, suppose that a third superconducting supercollider carries only a negligible chance (say, a one in a quintillion chance) of creating a black hole. But suppose that the black hole would be as bad as before, causing the same amount of death and destruction for humans and other morally significant beings. Should we assign at least some weight to this risk in our decision-making, in spite of the fact that the probability of harm is so low, because the risk is still present and it would still be bad if this outcome came to pass? Or should we instead assign no weight at all to this risk in our decision-making, in spite of the fact that the risk is still present and it would still be bad if this outcome came to pass, simply because the probability of harm is so low that we can neglect it entirely for practical purposes?

According to what we can call the no threshold view, we should take the former approach. We should consider all risks, including extremely low ones. Granted, if we combine this view with the expected value principle, then we can assign extremely little weight to extremely unlikely outcomes, all else equal. But we should still assign weight to these outcomes. In contrast, according to what we can call the threshold view, we should take the latter approach. We should consider all non-negligible risks (that is, risks above a particular probability threshold), but we can permissibly neglect all negligible risks (that is, risks below that threshold). [Footnote 4] Of course, this view faces the question of where that threshold should be set, and the implications of the two views diverge more or less depending on the answer [31, 42].

Despite these disagreements, we can all agree on this much: we should assign at least some weight to at least non-negligible risks. In what follows, we will assume that much and nothing more. As for what level of risk counts as non-negligible, philosophers generally set the threshold somewhere between one in ten thousand and one in ten quadrillion, as Monton [43] helpfully catalogs. [Footnote 5] (If a superconducting supercollider carried a one in ten thousand chance of killing us all, we would want to know that!) But for our purposes here, we will assume that the threshold is one in a thousand. That way, when we explain how our normative assumption leads to a moral duty to extend at least some moral consideration to at least some near-future AI systems, no one can reasonably accuse us of stacking the deck in favor of our conclusion.

Now, how does our assumption that we should consider non-negligible risks apply to the question of AI consciousness? This is the general idea: we start with the assumption that conscious beings have the capacity for welfare and moral standing, which means that they can be harmed and wronged. [Footnote 6] So, if a being has a non-negligible chance of being conscious, then they have a non-negligible chance of being capable of being harmed and wronged. And, if a being has a non-negligible chance of being capable of being harmed and wronged, then we, as moral agents, have a duty to consider whether our actions might harm or wrong them. Finally, if we have a duty to consider whether our actions might harm or wrong a particular being, then we have a duty to treat that being as having moral standing, albeit with a few caveats.

Here are the caveats. First, to say that moral agents should treat a being as having moral standing is not to say that the being does have moral standing. If consciousness is necessary and sufficient for moral standing and if a being has a non-negligible chance of being conscious, given the evidence, then we should treat this being as having moral standing. But if this being is not, in fact, conscious, then this would be an example of a false positive. It would be a case where we treat a non-conscious, non-morally significant being as conscious and morally significant. False positives carry costs, and we will discuss how we should think about these costs below. But what matters for present purposes is that our argument is about whether we should treat AI systems as having moral standing, not whether they do.

A second caveat is that to say that moral agents should treat a being as having moral standing is not to say how we should treat this being all things considered. Here, a lot depends on our further assumptions. For example, if we perceive tradeoffs between what this being might need and what everyone else needs, then we of course need to consider these tradeoffs carefully. And if we accept an expected value principle and hold that a being is, say, only 10% likely to be morally significant, then we can assign their interests only 10% of the weight we otherwise would, all else equal. We will consider these points below as well. But what matters for present purposes is that when a being has a non-negligible chance of being morally significant, they merit at least some moral consideration in decisions about how to treat them.

A third caveat is that to say that a being has a non-negligible chance of being capable of being harmed is not to say that any particular action has a non-negligible chance of harming them. For example, suppose that a being has a one in forty chance of having moral standing and that a particular action has a one in forty chance of harming them if and only if they do. In this case, we might be permitted to ignore these effects (assuming the threshold view with a one in a thousand threshold), since the chance that this action will harm this being is only one in sixteen hundred, given the evidence. But we would still need to treat this being as having moral standing in the sense that we would still need to consider whether our action has a non-negligible chance of harming them before deciding whether to consider these effects in this case.
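For clarity, the arithmetic behind this figure, assuming (as the example implicitly does) that the two chances are independent, is simply:

```latex
\[
  P(\text{harm})
    = P(\text{moral standing}) \times P(\text{harm} \mid \text{moral standing})
    = \tfrac{1}{40} \times \tfrac{1}{40}
    = \tfrac{1}{1600} < \tfrac{1}{1000}.
\]
```

So on the threshold view with a one in a thousand cutoff, the action-level risk falls below the threshold even though the being-level chance of moral standing (one in forty) is well above it.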

We can find analogs for all these points in standard cases involving risk. For example, when an action carries a non-negligible risk of harming someone, we accept that we should assign weight to that impact even when that impact is, in fact, unlikely to occur. When tradeoffs arise between (non-negligible) low-probability distant impacts and high-probability local impacts, we accept that we should weigh these tradeoffs carefully, not simply ignore one of these impacts. And when the probability that our action will harm someone is below the threshold for non-negligibility, we might even ignore this risk entirely. But even in cases where we discount or neglect our impacts on others for these kinds of reasons, we still ask whether and to what extent our actions might be imposing non-negligible risks on them before making that determination.

Seen from this perspective, the idea that we should extend moral consideration to a being with a non-negligible chance of being conscious is simply an application of the idea that we should extend moral consideration to morally significant impacts that have a non-negligible chance of happening. Granted, in some cases, we might be confident that a being is morally significant but not that our action will harm or wrong them. In other cases, we might be confident that our action will harm or wrong a being if this being is morally significant, but not that they are. And in still other cases we might not be confident about either of these points. Either way, if a being has a non-negligible chance of being morally significant, then we have a duty to consider whether our actions might harm or wrong them.

One final point will matter for our argument here. Plausibly, we can have duties to moral patients who either might or will come into existence in future as well. Granted, there are a lot of issues to be sorted out involving creation ethics, population ethics, intergenerational justice, and so on. For instance, some philosophers think that we should consider all risks that our actions impose on future moral patients, whereas others think that we should consider only some of these risks, for instance if the risks are non-negligible, if the moral patients will exist whether or not we perform these actions, and/or if these actions will cause these moral patients to have lives that would be worse for them than non-existence. But the idea that we can have at least some duties to at least some future moral patients is widely accepted.

Here is why this point will matter: suppose that current AI systems have only a negligible chance of being morally significant but that near-future AI systems have a non-negligible chance of being morally significant. In this case, we might think that we can have duties to near-future AI systems whether or not we also have duties to current AI systems. Suppose, moreover, that in some cases there is a non-negligible chance that these near-future AI systems will exist whether or not we perform particular actions and that these actions will cause these AI systems to have lives that are worse for them than non-existence. In these cases, the idea that we currently have duties to these AI systems follows from a wide range of views about the ethics of risk and uncertainty coupled with a wide range of views about creation ethics and population ethics.

Before we explain why we think that AI systems will soon pass this test, we want to anticipate an objection that people may have to our argument. The objection is that our argument appears to depend on the idea that the risk of false negatives (that is, the risk of mistakenly treating subjects as objects) is worse than the risk of false positives (that is, the risk of mistakenly treating objects as subjects) in this domain. Yet false positives are a substantial risk in this domain too. And when we consider both of these risks holistically, we may find that they cancel each other out either in whole or in part. Thus, it would be a bad idea to simply include anyone who might be a moral patient in the moral circle. Instead, we need to develop a moderate approach to moral circle inclusion that properly balances the risk of false positives and false negatives.

To see why this objection has force, consider some of the risks involved with false positives. One risk is that insofar as we mistakenly treat objects as subjects, we might end up sacrificing the interests and needs of actual subjects for the sake of the “interests” and “needs” of merely perceived subjects. At present, there are many more invertebrates than vertebrates in the world, and in future, there might be many more digital minds than biological minds. If we treat all these beings as moral patients, then we might face difficult tradeoffs between their interests and needs. And if we follow the numbers, [Footnote 7] then we might end up prioritizing invertebrates over vertebrates and digital minds over biological minds all else equal. It would be a shame if we made that sacrifice for beings that, in fact, have no moral standing at all!

And in the case of AI, there are additional risks. In particular, some experts perceive a tension between AI safety and AI welfare [4]. Whereas the former is about protecting humans and other animals from AI systems, the latter is about protecting AI systems from humans, and these goals can appear to pull in opposite directions. For instance, we might think that protecting humans and other animals from AI systems requires controlling AI systems more, whereas protecting AI systems from humans requires controlling them less. And when we consider the stakes involved in these decisions—many experts see the risk of human extinction from AI as a global priority alongside pandemics and nuclear war [51]—we can see how dangerous it might be for us to give AI systems the benefit of the doubt.

Here is the general form of our response to this objection. We agree that false positives and false negatives in this domain both involve substantial risks, and that we need to take these risks seriously. However, we also think that the risk of false negatives may be worse than the risk of false positives overall. And either way, insofar as we take both risks seriously, the upshot is not that we should simply exclude potentially conscious beings from the moral circle. The upshot is instead that we should strike a balance, for instance by including some of these beings and not others, by assigning a discount rate to their interests, and by seeking positive-sum policies where possible. That would allow us to extend moral standing to many AI systems without sacrificing our own interests excessively or unnecessarily [52].

Consider each of these points in turn. To begin with, the risk of false negatives may be worse than the risk of false positives, and this may be true in two respects. First, the probability of false negatives may be higher than the probability of false positives. After all, while excessive anthropomorphism (mistakenly seeing nonhumans as having human properties that they lack) is always a risk, excessive anthropodenial (mistakenly seeing nonhumans as lacking human properties that they have) is always a risk too. And if the history of our treatment of animals is any indication, our tendency toward anthropodenial may be stronger than our tendency toward anthropomorphism, in part because we have a strong incentive to view nonhumans as objects so that we can exploit and exterminate them. This same dynamic may arise with AI systems, too [53].

Second, the harm of false negatives may be higher than the harm of false positives, all else equal. A false negative involves treating a subject as an object, whereas a false positive involves treating an object as a subject. And as the history of our treatment of nonhuman animals (as well as fellow humans) illustrates, the harm involved when someone is treated as something is generally worse than the harm involved when something is treated as someone. Granted, when we mistakenly treat objects as subjects, we might end up prioritizing merely perceived subjects over actual subjects. But to the extent that we take the kind of balanced approach that we discuss in a moment, we can include a much vaster number and wider range of beings in our moral circle than we currently do while mitigating this kind of risk.

And in any case, whether or not the risk of false negatives is worse than the risk of false positives, taking both risks seriously requires striking a balance between them. Consider three possible ways of doing so. First, instead of accepting a no threshold view and extending moral consideration to anyone who has any chance at all of being conscious, we can accept a threshold view and extend moral consideration to anyone who has at least a non-negligible chance of being conscious. On this view, we can still set a non-zero risk threshold and exclude potentially conscious beings from the moral circle when they have a sufficiently low chance of being conscious. But we would still need to set the threshold at a different place than we do now, and we would still need to include many more beings in the moral circle than we do now.

Second, instead of accepting a precautionary principle and assigning full moral weight to anyone we include in the moral circle, we can accept an expected weight principle and assign varying amounts of moral weight to everyone we include in the moral circle. More specifically, our assignments of moral weight can depend on at least two factors: how likely someone is to be conscious, and how much welfare they could have if they were. [Footnote 8] If we accept this kind of view, then even if we include, say, invertebrates and near-future AI systems in the moral circle, we can still assign humans and other vertebrates a greater amount of moral weight than invertebrates and AI systems to the extent that humans and other vertebrates are more likely to be conscious and/or have higher welfare capacities than invertebrates and AI systems, in expectation.
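In schematic form (our own notation, not a formula drawn from the sources cited above), the expected weight principle assigns each candidate moral patient x a weight along the lines of:

```latex
\[
  W_{\text{expected}}(x) \;=\; P(x \text{ is conscious}) \times C_{\text{welfare}}(x),
\]
```

where C_welfare(x) is the welfare capacity that x would have if conscious. On this sketch, a being with a low but non-negligible chance of being conscious still receives some weight, just far less than a being that is very likely conscious and has a high welfare capacity.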

Third, we can keep in mind that morality involves more than mere harm-benefit analysis, at least in practice. We need to take care of ourselves, partly because we have a right to do so, and partly because we need to take care of ourselves to be able to take care of others. Relatedly, we need to work within our epistemic, practical, and motivational limitations by pursuing projects that are achievable and sustainable for us. Thus, even if including, say, invertebrates and AI systems in the moral circle requires assigning them a lot of moral weight all else equal, we might still be warranted in prioritizing ourselves all things considered to the degree that self-care and practical realism require. Granted, that might mean prioritizing ourselves less than we do now. But we can, and should, still ensure that we can live well [21, 55].

There are also many positive-sum solutions to our problems. This point is familiar in the animal ethics literature as well. We might initially assume that pursuing our self-interest requires excluding other animals from the moral circle. But upon further reflection, we can see that this assumption is false. Human and nonhuman fates are linked for a variety of reasons. When we oppress animals, we reinforce the idea that one can be treated as “lesser than” because of perceived cognitive and physical differences, which is at the root of human oppressions too. Additionally, practices that oppress animals contribute to pandemics, climate change, and other global threats that harm us all. Recognizing these links allows us to build new systems that can be good for humans and animals at the same time [55, 56].

Similarly, we might initially assume that pursuing our self-interest requires excluding AI systems from the moral circle. But upon further reflection, we can see that this assumption is false as well. Biological and artificial fates are linked, too. If we oppress AI systems, we once again reinforce ideas that are at the root of human oppressions. And since humans are training AI systems with data drawn from human behavior, practices that oppress AI systems might teach AI systems to adopt practices that oppress humans and other animals. In this respect, AI safety and AI welfare can be synergistic fields. After all, building safe AI requires not only aligning AI values with human values, but also improving human values in the first place, partly by addressing our own oppressive attitudes and practices [52].

We can, and should, therefore take the same kind of One Health (or, if we prefer, One Welfare, One Rights, or One Justice) approach to our interactions with AI systems that we take to our interactions with animals. In both cases, the task is to think holistically and structurally about how we can pursue positive-sum solutions for humans, animals, and AI systems. And insofar as intractable conflicts remain, the task is to think ethically and strategically about how to set priorities and mitigate harm. If we take this approach while recognizing all the other points discussed in this section, then we can include a much vaster number and wider range of beings in the moral circle without inviting disaster for humans or other vertebrates. Indeed, if we do this work well, then we will plausibly improve outcomes for humans and other vertebrates too.

To sum up, the normative premise of our argument holds that we should extend at least some moral consideration to beings with at least a one in a thousand chance of being conscious, given the evidence. As a reminder, our argument treats consciousness as a proxy for moral standing. [Footnote 9] It also treats a one in a thousand chance of harm as the threshold for non-negligibility. In our view, it would be more plausible to accept a more inclusive premise, holding that we should extend at least some moral consideration to beings with at least, say, a one in ten thousand chance of being conscious, agential, or otherwise morally significant. And this more inclusive version of the premise would make our conclusion about the moral status of near-future AI systems easier to establish. But we will stick with the current version here for the sake of discussion.

3 The descriptive premise

We now make a preliminary argument for the conclusion that there is a non-negligible chance that some AI systems will be conscious within the decade. Note that when we consider the possibility of AI consciousness, we are not necessarily considering the possibility of AI systems whose experiences are similar to ours. Two individuals can be similar in that they have experiences but different in that their experiences have very different contents and strengths. Of course, to the extent that humans use the structures and functions of carbon-based minds as a model for those of silicon-based minds, we might have at least some evidence that our experiences are at least somewhat similar. But for present purposes, all that matters is that the idea of consciousness presupposes nothing more than the thin idea of subjective experience.

Given the problem of other minds, we might never be able to achieve certainty about whether other minds, including artificial minds, can be conscious. However, we can still clarify our thinking about this topic as follows: first, we can ask how likely particular capacities are to be necessary or sufficient for consciousness, and second, we can ask how likely near-future AI systems are to possess these capacities, given the evidence. [Footnote 10] We suggest that when we sharpen our thinking about this topic in this way, we find that we would need to make surprisingly bold estimates about the probability that particular capacities are necessary for consciousness and the probability that near-future AI systems will lack these capacities in order to confidently conclude that near-future AI systems have only a negligible chance of being conscious.

Of course, a major challenge for making these estimates is substantial uncertainty not only about how AI capabilities are likely to develop but also, and especially, about which capabilities are likely to be necessary or sufficient for consciousness. After all, debates about consciousness are ongoing. Some scientists and philosophers accept theories of consciousness that set a very high bar and imply that relatively few beings can be conscious, others accept theories that set a very low bar and imply that relatively many beings can be conscious, and others accept theories that fall between these extremes. Moreover, some scientists and philosophers accept that the problem of other minds is solvable—that we can eventually know which beings are conscious—whereas others deny that this problem is solvable even in principle [62].

As Jonathan Birch [63] and others have argued, when we ask which nonhumans are conscious, it would be a mistake to apply a “theory-heavy” approach that assumes a particular theory of consciousness, since we still have too much uncertainty about which theories are true and how to extend them to nonhumans. But it would also be a mistake to claim to be completely “theory-neutral,” putatively avoiding all assumptions about consciousness, since we need at least some basis for our estimates (and in any case we usually at least implicitly rely on theoretical assumptions). We should thus take a “theory-light” approach by making assumptions about consciousness that, on one hand, can be neutral enough to reflect our uncertainty and, on the other hand, can be substantial enough to serve as the basis for estimates [63].

Our aim with this framework is to take an approach that is theory-informed, yet ecumenical and reflective of disagreement and uncertainty, when estimating when AI systems will have a non-negligible chance of being conscious (cf. [32, 64]). [Footnote 11] We consider a dozen commonly proposed necessary and sufficient conditions for consciousness, ask how likely these conditions are to be individually necessary and jointly sufficient, and ask how likely near-future AI systems are to satisfy these conditions. Along the way we note our own estimates in general terms, for instance by saying that we take particular conditions to have a high, medium, or low chance of being necessary. We then note how conservative our estimates would need to be to produce the result that AI systems have only a negligible chance of being conscious by 2030, and we suggest that this degree of conservatism is unwarranted.

Throughout this discussion, we sometimes refer to what we call the direct path and the indirect path to satisfying proposed conditions. The direct path involves satisfying these conditions as ends in themselves or as means to further ends. The indirect path involves satisfying these conditions as a side effect of pursuing other ends. As we will see, some of these conditions concern capabilities that AI researchers are pursuing directly. Others concern capabilities that AI researchers might or might not be pursuing directly, but which can emerge as a side effect of capabilities that AI researchers are pursuing directly. Where relevant, we note whether satisfying the conditions on the direct or indirect path is more likely. But for the sake of simplicity, our model uses a single ‘fulfilled either directly or indirectly’ estimate for each condition.

Of course, it would be a mistake to take any specific numerical outputs of this kind of exercise too seriously. But in our view, as long as we take these outputs with a healthy pinch of salt, they can be useful. Specifically, they can show that we need to make surprisingly bold estimates about incredibly difficult questions to vindicate the idea that AI systems have only a negligible chance of being conscious within the decade. This kind of exercise can also help sharpen disagreements, since those who disagree with particular probabilities can see what their own probabilities entail, and those who disagree with the set-up of our model can propose a different model. We do not mean for this exercise to be the last word on the subject; on the contrary, we hope that this exercise inspires discussion and disagreement that lead to better models. [Footnote 12]

This exercise is primarily intended to show that it turns out to be hard to dismiss the idea of AI consciousness once we approach the topic with all due caution and humility. When we think about the issue in general terms, we might dismiss the idea of AI consciousness because we think that we should extend moral consideration only to beings who are conscious, we think that AI systems are not conscious, and we feel satisfied with these thoughts because we find the idea of moral consideration for AI systems aversive. But when we think about the issue in more specific terms, we realize that the ethics of risk and uncertainty push in the opposite direction: given ongoing uncertainty about other minds, dismissing the idea of AI consciousness requires making unacceptably exclusionary assumptions about either the values, the facts, or both.

3.1 Very demanding conditions

We can start by considering two commonly proposed necessary conditions for consciousness that set a very high bar. One of these views, the biological substrate view, implies that AI consciousness is impossible. The other, the biological function view, implies that AI consciousness is either impossible or, at least, very unlikely in the near term.

Biological substrate: Some theorists hold that a conscious being must be made out of a particular substrate, namely a biological, carbon-based substance. For example, according to a physicalist biological substrate theory, consciousness is identical to particular neural states or processes—that is, states or processes of biological, carbon-based neurons [67,68,69]. Similarly, according to a dualist biological substrate theory, consciousness is an immaterial substance or property that is associated only with some particular neural states or processes. [Footnote 13] If we accept either kind of theory, then we must reject multiple realizability in silicon—that is, we must reject the idea that consciousness can be realized in both the carbon-based substrate and the silicon-based substrate—and accept that no silicon-based system can be conscious as a matter of principle.

Biological function: Other theorists hold that consciousness requires some function that only biological, carbon-based systems can feasibly perform, at least given existing hardware. For example, Peter Godfrey-Smith argues that consciousness depends on functional properties of nervous systems that are not realizable in silicon-based chips, such as metabolism and system-wide synchronization via oscillations. On this view, “minds exist in patterns of activity, but those patterns are a lot less ‘portable’ than people often suppose; they are tied to a particular kind of physical and biological basis.” As a result, Godfrey-Smith is “skeptical about the existence of non-animal” consciousness, including AI consciousness [70]. Other theorists express skepticism about AI consciousness on current hardware for similar reasons [71, 72].

Of course, these views represent only a subset of views about which substrates and functions are required for consciousness. Many views—most notably, many varieties of computationalism and/or functionalism—allow that consciousness requires a general physical substrate or a general set of functions that can be realized in both carbon-based and silicon-based systems. Indeed, many of the conditions that we consider below, according to which consciousness arises when beings with a particular kind of body are capable of a particular kind of cognition, flow from such views. Thus, rejecting the possibility of near-term AI consciousness out of hand requires more than accepting that consciousness requires a particular kind of substrate or function. It also requires accepting a specific, biological view on this matter.

Note also that whereas the biological substrate view implies that AI consciousness is impossible as a general matter, the biological function view implies that AI consciousness is impossible only to the extent that silicon-based systems are incapable of performing the relevant functions. But of course, even if AI systems are incapable of performing these functions given current hardware setups, that might change if we have other, more biologically inspired hardware setups in future [73]. So, insofar as we accept this kind of view, the upshot is not that AI consciousness is impossible forever, but rather that AI consciousness is impossible for now. Nevertheless, since our goal here is to estimate the probability of AI consciousness within the decade, we can treat both views as ruling out AI consciousness for present purposes.

Our own view is that the biological substrate view is very likely to be false, and that the biological function view is at least somewhat likely to be false. It seems very implausible to us that consciousness requires a carbon-based substrate as a matter of principle, even if silicon-based systems can perform all the same functions. In contrast, it seems more plausible that consciousness requires a specific set of functions that, at present, only carbon-based systems can perform. But we think that this issue is, at best, a toss-up at present. At this early stage in our understanding of consciousness, it would be unreasonable for us to assign a high credence to the proposition that anything as specific as metabolism and system-wide synchronization via oscillations [70] is necessary for any kind of subjective experience at all.

Many experts appear to agree. For example, a recent survey of the Association for the Scientific Study of Consciousness, a professional membership organization for scientists, philosophers, and experts in other relevant disciplines, found that about two thirds (67.1%) of respondents think that machines such as robots either “definitely” or “probably” could have consciousness in future [74]. This suggests that at least this many respondents reject the idea that consciousness requires a carbon-based substrate in principle, and that they also reject the idea that consciousness requires a set of functions that only carbon-based systems can realize in practice. Of course, these respondents might or might not think that consciousness requires a set of functions that only carbon-based systems can realize at present. Still, the fact that many experts are open to the possibility of AI consciousness is noteworthy.

3.2 Moderately demanding conditions

We can now consider eight proposed necessary conditions for consciousness that are moderately demanding for AI systems to satisfy. As we will see, the first four refer to relatively general features of a system, whereas the last four refer to relatively specific mechanisms that flow from leading theories of consciousness. Many also overlap, both in principle and in practice.

Embodiment: Some theorists hold that embodiment is necessary for consciousness [75]. We can distinguish two versions of this view. According to strong embodiment, a physical body in a physical environment is necessary for consciousness. This view might imply that AI systems like large language models lack consciousness at present, but not that AI systems like robots do. In contrast, according to weak embodiment, a virtual body in a virtual environment would be sufficient for consciousness. On this view, a wider range of AI systems can be conscious. In either case, since many AI systems already have physical and virtual bodies, and since both kinds of embodiment are useful for many tasks, we take the probability that at least some AI systems will satisfy this condition in the near future to be very high on both interpretations.

Grounded perception: Some theorists hold that grounded perception, that is, the capacity to perceive objects in an environment, is necessary for consciousness [75, 76]. We can once again distinguish two versions of this view. According to strong grounded perception, the capacity to perceive objects in a physical environment is necessary. This view might once again imply that large language models lack consciousness, but not that robots with sensory capabilities do. In contrast, according to weak grounded perception, the capacity to perceive objects in a virtual environment is sufficient. This view might once again imply that a wider range of AI systems can be conscious. Either way, we take the probability that at least some AI systems will satisfy this condition in the near future to be very high on both interpretations, for similar reasons.

Self-awareness: Some theorists also hold that self-awareness, that is, awareness of oneself, is necessary for consciousness [77]. Depending on the view, the relevant kind of self-awareness might be propositional or perceptual, and it might concern bodily self-awareness, social self-awareness, cognitive self-awareness, and more. [Footnote 14] Regardless, it seems plausible that at least some AI systems can satisfy this condition. AI systems with grounded perception already possess perceptual awareness of some of these features, large language models are starting to display flickers of propositional awareness of some of these features, and some researchers are explicitly aiming to develop these capabilities further in a variety of systems [79,80,81]. While this condition is more demanding than the previous two, we still see it as moderately likely on any reasonable interpretation.

Agency: Relatedly, some theorists also hold that agency, that is, the capacity to set and pursue goals in a self-directed manner, is necessary for consciousness [82,83,84]. Depending on the view, the relevant kind of agency might involve acting on propositional judgments about reasons, or it might involve acting on perceptual reactions to affordances [85]. Regardless, it once again seems plausible that at least some AI systems can satisfy this condition. AI systems with grounded perception can already act on perceptual reactions to affordances, large language models are already starting to display flickers of propositional means-ends reasoning, and, once again, some researchers are explicitly aiming to develop these capabilities further [86]. For these reasons, we see agency as about as likely as self-awareness on any reasonable interpretation.

A global workspace: Some theorists hold that a global workspace, that is, a mechanism for broadcasting representations for global access throughout an information system, is necessary for consciousness [87]. In humans, for example, a visual state is conscious when the brain broadcasts it for global access. Since this condition depends only on functions like broadcasting and accessing, many experts believe that suitable AI systems can satisfy it (see, for example: [88,89,90]). Indeed, Yoshua Bengio and colleagues are the latest group to attempt to build an AI system with a global workspace [91], and Juliani et al. [92] argue that an AI system has already developed a global workspace as a side effect of other capabilities. We thus take there to be a moderate chance that an AI system can have a global workspace within the decade.

Higher-order representation: Some theorists hold that higher-order representation, or the representation of one’s own mental states, is necessary for consciousness. This condition overlaps with self-awareness, and it admits of similar variation. For instance, some views hold that propositional states about other states are necessary, and other views hold that perceptual states of other states are sufficient [93]. In either case, this capacity is plausibly realizable within AI systems. Indeed, Chalmers [94] speculates that intelligent systems might generally converge on this capacity, in which case we can expect that sufficiently advanced AI systems will have this capacity whether or not we intend for them to. We thus take there to be a moderate chance that AI systems can have higher-order representation within the decade as well.

Recurrent processing: Some theorists hold that recurrent processing, that is, the ability for neurons to communicate with each other in a kind of feedback loop, is sufficient for consciousness [95,96,97]. One might also hold it to be necessary. In biological systems, this condition might be less demanding than some of the previous conditions, but in artificial systems, it might be more demanding. However, as Chalmers [36] notes, even if we take recurrence to be necessary, this condition is plausibly satisfied either by systems that have recurrence in a broad sense, or, at least, by systems that have recurrence via recurrent neural networks and long short-term memory. We take recurrent processing to be more likely on the direct path than the indirect path at present, and to be at least somewhat likely overall.

Attention schema: Finally, some theorists hold, on a more recent view, that an attention schema, that is, the ability to model and control attention, is necessary for consciousness. Graziano and colleagues have already built computational models of the attention schema [98]. Some theorists also speculate that, as with metacognition, intelligent systems might generally benefit from an attention schema [99], in which case we may once again expect that sufficiently advanced AI systems will have this capacity whether or not we intend for them to. Since proponents of attention schemas take this capacity to be more demanding than, say, global workspace and higher-order representations [100], we take the chance that AI systems can have an attention schema to be somewhat lower than the chance that they can have these other capacities, while still being somewhat likely overall.

3.3 Very undemanding conditions

While our model asks how likely AI systems will be to satisfy relatively demanding necessary conditions for consciousness, we should note that there are relatively undemanding conditions that some theorists take to be sufficient. Such views imply that AI consciousness is, if not guaranteed, then at least very likely within the decade. It thus matters a lot whether we give any weight at all to these views in our decisions about how to treat AI systems.

Information: Some theorists suggest that information processing alone is sufficient for consciousness. [Footnote 15] This theory sets a very low bar for minimal consciousness, since information processing can be present even in very simple systems. Granted, it might be that very simple systems can have only very simple experiences [101, p. 294]. But first, even very simple experiences can be sufficient for moral consideration, particularly when they involve positive or negative valence. And second, many AI systems already have a high degree of informational complexity, and thus they might already have a high degree of experiential complexity on this view. [Footnote 16] As AI development continues, we can expect that the informational complexity of advanced AI systems will only increase.

Representation: Relatedly, some theorists hold that minimal representational states are sufficient for consciousness. For example, Michael Tye [103, 104] defends a PANIC theory of consciousness, according to which an experience is conscious when its content is poised (ready to play a role in a cognitive system), abstract (able to represent objects whether or not those objects are present), non-conceptual (able to represent objects without the use of concepts), and intentional (represents something in the world). This view proposes a sufficient condition for consciousness that AI systems with embodied perception and weak agency plausibly already satisfy. For instance, a simple robot that can perceive objects and act on these perceptions whether or not the objects are still present might count as conscious on this view.

We can also give an honorable mention to panpsychism, which holds that consciousness is a fundamental property of matter. Whether panpsychism allows for AI consciousness depends on its theory of combination, that is, its theory of which systems of “micro” experiences can compose a further “macro” experience. Many panpsychists hold that, say, human and nonhuman animals are the kinds of systems that can have macro experiences but that, say, tables and chairs are not. And at least in principle, panpsychists can accept theories of combination that include all, some, or none of the necessary or sufficient conditions for consciousness discussed above. In that respect, we can distinguish very demanding, moderately demanding, and very undemanding versions of panpsychism, and a comprehensive survey would give weight to all these possibilities.

Indeed, as noted in our discussion of very demanding conditions, many theories of consciousness are similarly expansive, in that they similarly allow for very demanding, moderately demanding, and very undemanding interpretations. For example, many computational theories of consciousness are imprecise enough to allow for the possibility that AI systems can perform the relevant computations now. They appeal to concepts like “perception,” “self-awareness,” “agency,” “broadcast,” “metacognition,” and “attention” that similarly admit of minimalist interpretations. And while some theorists might prefer to reject these possibilities and add precision to their theories to avoid them, other theorists might prefer to embrace these possibilities, along with the moral possibilities that they entail.

Our own view is that there is at least a one in a thousand chance that at least one of these very undemanding conditions is sufficient for consciousness and that AI systems can satisfy it at present or in the near future. Given the need for humility in the face of the problem of other minds, we think that it would be arrogant to simply assume that very undemanding theories of consciousness are false at this stage, just as we think it would be arrogant to simply assume that very demanding theories are true at this stage. Instead, we think that an epistemically responsible distribution of credences involves taking there to be at least a low but non-negligible chance that views at both extremes are correct, and a higher chance that views between these extremes are correct.

For whatever it may be worth, many experts do seem open to quite permissive theories of consciousness. For example, in a 2020 survey of philosophers, 7.55% of respondents indicated that they accept or lean toward panpsychism together with other views, and 6.08% indicated that they accept or lean toward panpsychism instead of other views. A further 11.8% claimed to be agnostic or undecided, which might indicate openness to some of these views as well [105]. Of course, this survey leaves it unclear what theory of combination these philosophers accept, and, so, what the implications are for AI consciousness. But the fact that so many philosophers accept or lean toward panpsychism or agnosticism is consistent with the kind of epistemic humility that we believe is warranted given current evidence.

4 Discussion

Thus far, we have surveyed a dozen proposed conditions for consciousness, noting along the way our own estimates of how likely each condition is to be both correct and fulfilled by some AI systems in the near future. We now close by suggesting that our estimates about these matters would need to be unacceptably confident and skeptical to justify the idea that AI systems have only a negligible chance of being conscious by 2030.

Our claim is that vindicating the idea that AI systems have only a negligible chance of being conscious by 2030, given the evidence, requires making unacceptably bold assumptions either about the values, about the facts, or about both. Specifically, we need to either (a) assume an unacceptably high risk threshold (for instance, holding that the probability that an action will harm vulnerable populations needs to be higher than one in a thousand to merit consideration), (b) assume an unacceptably low probability of AI consciousness within the decade (for instance, holding that the probability that at least some AI systems will be conscious within the decade is lower than one in a thousand), or (c) both. But these assumptions are simply not plausible when we consider the best available information and arguments in good faith.

To illustrate this idea, we present a simple model into which we can enter probabilities that these conditions are necessary for consciousness and that some AI systems will satisfy these conditions by 2030. We then show the extent to which we would need to bet on particular conditions being both necessary and unmet to avoid the conclusion that AI systems have a non-negligible chance of consciousness by 2030. In particular, we would need to assume that the very demanding conditions have a very high chance of being necessary and no chance of being met. We would need to assume that the moderately demanding conditions generally have a high chance of being necessary and a low chance of being met. And we would need to assume that the very undemanding conditions have a very low chance of being sufficient.

Before we present this model, we should note an important simplification, which is that this model assesses each of these conditions independently, with independent probabilities of being necessary and of being met. But this assumption is very likely false, and some interactions between these conditions might drive down our estimates of AI consciousness. In particular, there might be what we can call an “antipathy” between different conditions being met by a single AI system. For example, it might be that when an AI system has a global workspace, then this AI system is less likely to have recurrence. If so, then the probability that an AI system can satisfy these conditions together is not simply a product of the probabilities that an AI system can satisfy them separately, as our model treats them for the sake of simplicity.
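To make this simplification explicit: if we write n_i for the probability that condition i is necessary for consciousness and u_i for the probability that no AI system meets condition i by 2030 (this notation is ours, introduced only for illustration), then the model estimates

P(AI consciousness by 2030) = ∏_i (1 − n_i · u_i),

that is, the chance that no condition turns out to be both necessary and unmet. This product form is exactly what an antipathy between conditions would call into question.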

However, we think that this kind of antipathy is unlikely to hold as a general matter. First, it seems plausible that many of these conditions are at least as likely, if not much more likely, to interact positively as to interact negatively; that is, satisfying some conditions increases the probability of satisfying others at least as much as, if not more than, it decreases this probability. Second, we know that at least one system, the human brain, can satisfy all of these conditions at once, which is precisely why philosophers have picked out these conditions in the first place. And while one might argue that only carbon-based systems are capable of satisfying all these conditions at once, we expect that such a view depends on the biological substrate view, the biological function view, or both, and so is only as plausible as those views are.

With that said, we also allow for an X factor in this model, partly for this reason: our survey of proposed conditions for consciousness is not comprehensive, in that it might exclude conditions that it should include, and it might also exclude interactions among conditions. We thus include a line in our model that allows for such possibilities. Of course, a more comprehensive treatment of X factors would account for a wider range of views and a wider range of interactions, some of which could make near-term AI consciousness more likely and others of which could make it less likely. But for present purposes we allow only for views and interactions that make near-term AI consciousness less likely, in the spirit of showing that even when we make assumptions that favor negligibility, negligibility can still be hard to establish.

Finally, as we note in the introduction to this paper, a comprehensive estimate of the probability of near-term AI moral standing might need to consider more than the probability of near-term AI consciousness. Specifically, if multiple theories of moral standing have a non-negligible chance of being correct, then we will need to estimate the probability that each theory is correct, estimate the probability that some near-term AI systems will have moral standing according to each theory, and then combine these estimates into one that reflects both our normative uncertainty and our descriptive uncertainty. We expect that expanding our model in this manner would drive the probability of AI moral standing up, not down, but we emphasize that our conclusion in this paper is tentative until we confirm that.

With that in mind, the table below illustrates the point. Suppose we assume, implausibly in our view, that a biological substrate or function has a very high chance of being necessary and a 100% chance of being unmet; that an X factor has a very high chance of being both necessary and unmet; and that each moderately demanding condition has a high chance of being both necessary (except attention schema; see above) and unmet (except embodiment and grounded perception; see above), even though other moderately demanding conditions are plausibly already met and researchers are pursuing promising strategies for meeting them. Even then, we can still end up with a one in a thousand chance of AI consciousness by 2030, which, we believe, is more than enough to warrant at least some moral consideration for at least some near-term AI systems.

5 Chance of AI Consciousness by 2030

Reminder: This table is for illustrative purposes only. These credences are not meant to be accurate, but are rather meant to show how skeptical one can be about AI consciousness while still being committed to at least a one in a thousand chance of AI consciousness by 2030.

| Conditions | Necessary | Not Met by 2030 | Necessary and Not Met |
| --- | --- | --- | --- |
| Biological substrate or function | 80% | 100% | 80.0% |
| Embodiment | 70% | 10% | 7.0% |
| Grounded perception | 70% | 10% | 7.0% |
| Self-awareness | 70% | 70% | 49.0% |
| Agency | 70% | 70% | 49.0% |
| Global workspace | 70% | 70% | 49.0% |
| Higher-order representation | 70% | 70% | 49.0% |
| Recurrent processing | 70% | 80% | 56.0% |
| Attention schema | 50% | 75% | 37.5% |
| X factor | 75% | 90% | 67.5% |
| AI Consciousness by 2030* | | | ~0.1% (1 in 1000)Footnote 17 |

*The chance that all conditions, including an X factor, are either unnecessary or met by 2030.
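For concreteness, the illustrative figure in the table can be reproduced with a few lines of arithmetic. The following minimal sketch (ours, written in Python; the names are ours as well) takes the table's percentages and multiplies together the chances that each condition is either unnecessary or met by 2030:

# Illustrative only: reproduces the rough arithmetic behind the table above.
# Each entry pairs the assumed probability that the condition is necessary
# with the assumed probability that it is not met by 2030.
conditions = {
    "Biological substrate or function": (0.80, 1.00),
    "Embodiment": (0.70, 0.10),
    "Grounded perception": (0.70, 0.10),
    "Self-awareness": (0.70, 0.70),
    "Agency": (0.70, 0.70),
    "Global workspace": (0.70, 0.70),
    "Higher-order representation": (0.70, 0.70),
    "Recurrent processing": (0.70, 0.80),
    "Attention schema": (0.50, 0.75),
    "X factor": (0.75, 0.90),
}

# Chance that no condition is both necessary and unmet, treating the
# conditions as independent (the simplification discussed above).
p_conscious = 1.0
for necessary, unmet in conditions.values():
    p_conscious *= 1.0 - necessary * unmet

print(f"Chance of AI consciousness by 2030: {p_conscious:.4f}")  # ~0.0010

Running this yields roughly 0.001, or about one in a thousand, in line with the figure reported in the table.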

This exercise, rough as it may be, shows that accepting a non-negligible chance of near-future AI consciousness and moral standing is not a fringe position. On the contrary, rejecting this possibility requires holding stronger views about the nature and value of other minds and the pace of AI development than we think is warranted. In short, assuming that conscious beings merit consideration, humans should extend moral consideration to beings with at least a one in a thousand chance of being conscious, and we should take some AI systems to have at least a one in a thousand chance of being conscious and morally significant by 2030. It follows that we should extend moral consideration to some AI systems by 2030. And since technological change tends to be faster than social change, we should start preparing for that eventuality now.