Imagine that you are driving when you suddenly realize that the fuel-tank light is on. What makes you, a complex assembly of neurons, aware of the light? And what makes the car, a sophisticated piece of electronics and engineering, unaware of it? What would it take for the car to be endowed with a consciousness similar to our own? Are those questions scientifically tractable?

Alan Turing and John von Neumann, the founders of the modern science of computation, entertained the possibility that machines would ultimately mimic all of the brain’s abilities, including consciousness. Recent advances in artificial intelligence (AI) have revived this goal. Refinements in machine learning, inspired by neurobiology, have led to artificial neural networks that approach or, occasionally, surpass humans (Silver et al. 2016; Lake et al. 2017). Although those networks do not mimic the biophysical properties of actual brains, their design benefitted from several neurobiological insights, including non-linear input-output functions, layers with converging projections, and modifiable synaptic weights. Advances in computer hardware and training algorithms now allow such networks to operate on complex problems (e.g., machine translation) with success rates previously thought to be the privilege of real brains. Are they on the verge of consciousness?

We argue that the answer is negative: the computations implemented by current deep-learning networks correspond mostly to nonconscious operations in the human brain. However, much like artificial neural networks took their inspiration from neurobiology, artificial consciousness may progress by investigating the architectures that allow the human brain to generate consciousness, then transferring those insights into computer algorithms. Our aim is to foster such progress by reviewing aspects of the cognitive neuroscience of consciousness that may be pertinent for machines.

Multiple Meanings of Consciousness

The word “consciousness,” like many pre-scientific terms, is used in widely different senses. In a medical context, it is often used in an intransitive sense (as in “the patient was no longer conscious”), in the context of assessing vigilance and wakefulness. Elucidating the brain mechanisms of vigilance is an essential scientific goal with major consequences for our understanding of sleep, anesthesia, coma, or vegetative state. For lack of space, we do not deal with this aspect here, however, because its computational impact seems minimal: obviously, a machine must be properly turned on for its computations to unfold normally.

We suggest that it is useful to distinguish two other essential dimensions of conscious computation. We label them using the terms global availability (C1) and self-monitoring (C2).

  • C1: Global availability. This corresponds to the transitive meaning of consciousness (as in “The driver is conscious of the light”). It refers to the relationship between a cognitive system and a specific object of thought, such as a mental representation of “the light.” This object appears to be selected for further processing, including verbal and nonverbal report. Information which is conscious in this sense becomes globally available to the organism: we can recall it, act upon it, speak about it, etc. This sense is synonymous with “having the information in mind”: among the vast repertoire of thoughts that can become conscious at a given time, only that which is globally available constitutes the content of C1-consciousness.

  • C2: Self-monitoring. Another meaning of consciousness is reflexive. It refers to a self-referential relationship in which the cognitive system is able to monitor its own processing and obtain information about itself. Human beings know a lot about themselves, including such diverse information as the layout and position of their body, whether they know or perceive something, or whether they just made an error. This sense of consciousness corresponds to what is commonly called introspection, or what psychologists call “meta-cognition”—the ability to conceive and make use of internal representations of one’s own knowledge and abilities.

We propose that C1 and C2 constitute orthogonal dimensions of conscious computations. This is not to say that C1 and C2 do not involve overlapping physical substrates; in fact, as we review below, in the human brain, both depend on prefrontal cortex. But we argue that, empirically and conceptually, the two may come apart, as there can be C1 without C2, for instance when reportable processing is not accompanied by accurate metacognition, or C2 without C1, for instance when a self-monitoring operation unfolds without being consciously reportable. As such, it is advantageous to consider these computations separately before we consider their synergy. Furthermore, many computations involve neither C1 nor C2 and are therefore properly called “unconscious” (or C0 for short). It was Turing’s original insight that even sophisticated information processing can be realized by a mindless automaton. Cognitive neuroscience confirms that complex computations such as face or speech recognition, chess-game evaluation, sentence parsing, and meaning extraction occur unconsciously in the human brain, i.e., under conditions that yield neither global reportability nor self-monitoring (Table 1). The brain appears to operate, in part, as a juxtaposition of specialized processors or “modules” that operate nonconsciously and, we argue, correspond tightly to the operation of current feedforward deep-learning networks.

Table 1 Examples of computations pertaining to information-processing levels C0, C1, and C2 in the human brain

We now review the experimental evidence for how human and animal brains handle C0-, C1-, and C2-level computations—before returning to machines and how they could benefit from this understanding of brain architecture.

Unconscious Processing (C0): Where Most of Our Intelligence Lies

Probing Unconscious Computations

“We cannot be conscious of what we are not conscious of” (Jaynes 1976). This truism has deep consequences. Because we are blind to our unconscious processes, we tend to underestimate their role in our mental life. However, cognitive neuroscientists have developed various means of presenting images or sounds without inducing any conscious experience (Fig. 1), and then used behavioral and brain-imaging methods to probe their processing depth.

Fig. 1

Examples of paradigms probing unconscious processing (C0). (Top) Subliminal view-invariant face recognition (Kouider et al. 2009). On each trial, a prime face is briefly presented (50 ms), surrounded by masks that make it invisible, followed by a visible target face (500 ms). Although subjective perception is identical across conditions, processing is facilitated whenever the two faces represent the same person, in the same or a different view. At the behavioral level, this view-invariant unconscious priming is reflected by reduced reaction time in recognizing the target face. At the neural level, it is reflected by reduced cortical response to the target face (i.e., repetition suppression) in the Fusiform Face Area of human inferotemporal cortex. (Bottom) Subliminal accumulation of evidence during interocular suppression (Vlassova et al. 2014). Presentation of salient moving dots in one eye prevents the conscious perception of paler moving dots in the opposite eye. Despite their invisibility, the gray dots facilitate performance when they move in the same direction as a subsequent dot-display, an effect proportional to their amount of motion coherence. This facilitation only affects a first-order task (judging the direction of motion), not a second-order metacognitive judgment (rating the confidence in the first response). A computational model of evidence accumulation proposes that subliminal motion information gets added to conscious information, thus biasing and shortening the decision.

The phenomenon of priming illustrates the remarkable depth of unconscious processing. A highly visible target stimulus, such as the written word “four,” is processed more efficiently when preceded by a related prime stimulus, such as the Arabic digit “4,” even when subjects do not notice the presence of the prime and cannot reliably report its identity. Subliminal digits, words, faces, or objects can be invariantly recognized and influence motor, semantic, and decision levels of processing (Table 1). Neuroimaging methods reveal that the vast majority of brain areas can be activated nonconsciously.

Unconscious View-Invariance and Meaning Extraction in the Human Brain

Many of the difficult perceptual computations, such as invariant face recognition or speaker-invariant speech recognition, that were recently addressed by AI, correspond to nonconscious computations in the human brain (Dupoux et al. 2008; Kouider and Dehaene 2007; Qiao et al. 2010). For instance, processing someone’s face is facilitated when it is preceded by the subliminal presentation of a totally different view of the same person, indicating unconscious invariant recognition (Fig. 1). Subliminal priming generalizes across visual-auditory modalities (Faivre et al. 2014; Kouider and Dehaene 2009), revealing that cross-modal computations that remain challenging for AI software (e.g., extraction of semantic vectors, speech-to-text) also involve unconscious mechanisms. Even the semantic meaning of sensory input can be processed without awareness by the human brain. Compared to related words (e.g., animal-dog), semantic violations (e.g., furniture-dog) generate a brain response as late as 400 ms after stimulus onset in temporal-lobe language networks, even if one of the two words cannot be consciously detected (Luck et al. 1996; van Gaal et al. 2014).

Unconscious Control and Decision-Making

Unconscious processes can reach even deeper levels of the cortical hierarchy. For instance, subliminal primes can influence prefrontal mechanisms of cognitive control involved in the selection of a task (Lau and Passingham 2007) or the inhibition of a motor response (van Gaal et al. 2010). Neural mechanisms of decision-making involve accumulating sensory evidence that affects the probability of the various choices, until a threshold is attained. This accumulation of probabilistic knowledge continues to happen even with subliminal stimuli (de Lange et al. 2011; Vorberg et al. 2003; Dehaene et al. 1998a; Vlassova et al. 2014). Bayesian inference and evidence accumulation, which are cornerstone computations for AI (Lake et al. 2017), are basic unconscious mechanisms for humans.
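The accumulation-to-threshold mechanism described above can be illustrated with a minimal random-walk sketch. This is an illustration only, not the model used in the cited studies: the function names, the drift and threshold values, and the choice to represent subliminal evidence as a head start (`start`) for the accumulator are all assumptions made for the example.

```python
import random


def accumulate_to_bound(drift, threshold=10.0, noise_sd=1.0, start=0.0,
                        max_steps=10_000, rng=None):
    """Sum noisy evidence samples until the total crosses +threshold or -threshold.

    Returns (choice, n_steps), where choice is +1, -1, or 0 if no bound was reached.
    """
    rng = rng or random.Random()
    total = start
    for step in range(1, max_steps + 1):
        # Each sample is the mean evidence (drift) plus Gaussian sensory noise.
        total += drift + rng.gauss(0.0, noise_sd)
        if total >= threshold:
            return +1, step
        if total <= -threshold:
            return -1, step
    return 0, max_steps


def mean_decision_time(start, n_trials=2000, seed=0):
    """Average number of samples needed to decide, for a given starting offset.

    Assumption: `start` stands in for evidence already gathered from a
    subliminal prime before the visible stimulus appears.
    """
    rng = random.Random(seed)
    times = [accumulate_to_bound(drift=0.2, start=start, rng=rng)[1]
             for _ in range(n_trials)]
    return sum(times) / len(times)
```

Under these assumptions, comparing `mean_decision_time(start=3.0)` with `mean_decision_time(start=0.0)` shows the qualitative effect: evidence gathered before the visible stimulus biases the accumulator toward one bound and shortens the decision.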

Unconscious Learning

Reinforcement learning algorithms, which capture how humans and animals shape their future actions based on the history of past rewards, have excelled in attaining supra-human AI performance in several applications, such as playing Go (Silver et al. 2016). Remarkably, in humans, such learning appears to proceed even when the cues, reward, or motivation signals are presented below the consciousness threshold (Pessiglione et al. 2008, 2007).

In summary, complex unconscious computations and inferences routinely occur in parallel within various brain areas. Many of these C0 computations have now been captured by AI, particularly using feedforward convolutional neural networks (CNNs). We now consider what additional computations are required for conscious processing.

Consciousness in the First Sense (C1): Global Availability of Relevant Information

The Need for Integration and Coordination

The organization of the brain into computationally specialized subsystems is efficient, but this architecture also raises a specific computational problem: the organism as a whole cannot stick to a diversity of probabilistic interpretations—it must act, and therefore cut through the multiple possibilities and decide in favor of a single course of action. Integrating all of the available evidence to converge towards a single decision is a computational requirement which, we contend, must be faced by any animal or autonomous AI system, and corresponds to our first functional definition of consciousness: global availability (C1).

For instance, elephants, when thirsty, manage to determine the location of the nearest water hole and move straight to it, from a distance of 5 to 50 km (Polansky et al. 2015). Such decision-making requires a sophisticated architecture for (1) efficiently pooling over all available sources of information, including multisensory and memory cues; (2) considering the available options and selecting the best one based on this large information pool; (3) sticking to this choice over time; and (4) coordinating all internal and external processes towards the achievement of that goal. Primitive organisms, such as bacteria, may achieve such decisions solely through an unconscious competition of uncoordinated sensorimotor systems. This solution, however, fails as soon as it becomes necessary to bridge over temporal delays and to inhibit short-term tendencies in favor of longer-term winning strategies. Coherent, thoughtful planning requires a specific C1 architecture.

Consciousness as Access to an Internal Global Workspace

We hypothesize that consciousness in the first sense (C1) evolved as an information-processing architecture that addresses this information-pooling problem (Baars 1988; Dehaene et al. 1998b; Dennett 2001; Dehaene and Naccache 2001). In this view, the architecture of C1 evolved to break the modularity and parallelism of unconscious computations. On top of a deep hierarchy of specialized modules, a “global neuronal workspace,” with limited capacity, evolved to select a piece of information, hold it over time, and share it across modules. We call “conscious” whichever representation, at a given time, wins the competition for access to this mental arena and gets selected for global sharing and decision-making. Consciousness is therefore manifested by the temporary dominance of a thought or train of thoughts over mental processes, such that it can guide a broad variety of behaviors. These behaviors include not only physical actions, but also mental ones such as committing information to episodic memory or routing it to other processors.
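The select, hold, and share cycle described above can be sketched as a small data structure. This is a toy illustration rather than an implementation of the global neuronal workspace model: the `Module` and `GlobalWorkspace` classes, the scalar `salience` score used to settle the competition, and the capacity of exactly one item are assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """A specialized processor that can receive broadcast content."""
    name: str
    inbox: list = field(default_factory=list)

    def receive(self, content):
        self.inbox.append(content)


class GlobalWorkspace:
    """Capacity-one store: a single winning representation is held and broadcast."""

    def __init__(self, modules):
        self.modules = modules
        self.content = None  # the currently held representation

    def step(self, proposals):
        """Run one competition.

        proposals maps a module name to a (content, salience) pair.
        With no proposals, the current content is simply held over time.
        """
        if not proposals:
            return self.content
        # Select: the most salient proposal wins access to the workspace.
        winner = max(proposals, key=lambda name: proposals[name][1])
        self.content = proposals[winner][0]
        # Share: the winning content is broadcast to every module.
        for module in self.modules:
            module.receive(self.content)
        return self.content
```

For instance, if a `fuel` module proposes `("low fuel", 0.9)` while a `vision` module proposes `("road clear", 0.3)`, a call to `step` stores "low fuel" and delivers it to every module, including those that did not produce it.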

Relation Between Consciousness and Attention

William James described attention as “the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought” (James 1890). This definition is close to what we mean by consciousness in the first sense (C1): the selection of a single piece of information for entry into the global workspace. There is, however, a clear-cut distinction between this final step, which corresponds to conscious access, and the previous stages of attentional selection, which can operate unconsciously. Many experiments have established the existence of dedicated mechanisms of attention orienting and shown that, like any other processors, they can operate nonconsciously: (1) in the top-down direction, attention can be oriented towards an object, amplify its processing, and yet fail to bring it to consciousness (Naccache et al. 2002); (2) in the bottom-up direction, attention can be attracted by a flash even if this stimulus ultimately remains unconscious (Kentridge et al. 1999). What we call attention is a hierarchical system of sieves that operate unconsciously. Such unconscious systems compute with probability distributions, but only a single sample, drawn from this probabilistic distribution, becomes conscious at a given time (Asplund et al. 2014; Vul et al. 2009). We may become aware of several alternative interpretations, but only by sampling their unconscious distributions over time (Moreno-Bote et al. 2011; Vul et al. 2008).
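The idea that unconscious systems hold a full probability distribution while only one sample at a time becomes conscious can be made concrete with a short sketch. The function names and the example distribution are hypothetical and serve only to illustrate sampling; they are not taken from the cited studies.

```python
import random
from collections import Counter


def conscious_sample(posterior, rng):
    """Draw the single interpretation that gains access at one moment.

    posterior maps each candidate interpretation to its probability.
    """
    interpretations = list(posterior)
    weights = [posterior[i] for i in interpretations]
    return rng.choices(interpretations, weights=weights, k=1)[0]


def sample_over_time(posterior, n_moments, seed=0):
    """Count which interpretation was sampled at each of n_moments successive moments."""
    rng = random.Random(seed)
    return Counter(conscious_sample(posterior, rng) for _ in range(n_moments))
```

With a hypothetical ambiguous figure such as `{"duck": 0.7, "rabbit": 0.3}`, each call to `conscious_sample` returns exactly one interpretation, whereas the counts returned by `sample_over_time` approximate the underlying distribution as the number of moments grows.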

Evidence for All-Or-None Selection in a Capacity-Limited System

The primate brain comprises a conscious bottleneck and can only consciously access a single item at a time (see Table 1). For instance, rivalling pictures or ambiguous words are perceived in an all-or-none manner: at any given time, we subjectively perceive only a single interpretation out of many possible ones (even though the others continue to be processed unconsciously (Panagiotaropoulos et al. 2012; Logothetis 1998)). The serial operation of consciousness is attested by phenomena such as the attentional blink and the psychological refractory period, whereby conscious access to a first item A prevents or delays the perception of a second competing item B (Luck et al. 1996; Asplund et al. 2014; Vul et al. 2008; Sergent et al. 2005; Marti et al. 2012, 2015). Such interference with the perception of B is triggered by the mere conscious perception of A, even if no task is performed (Nieuwenstein et al. 2009). Thus, C1-consciousness is causally responsible for a serial information-processing bottleneck.

Evidence for Integration and Broadcasting

Brain-imaging in humans and neuronal recordings in monkeys indicate that the conscious bottleneck is implemented by a network of neurons which is distributed through the cortex, but with a stronger emphasis on high-level associative areas. Table 1 lists some of the publications that have evidenced an all-or-none “ignition” of this network during conscious perception, using a variety of brain-imaging techniques. Single-cell recordings indicate that each specific conscious percept, such as a person’s face, is encoded by the all-or-none firing of a subset of neurons in high-level temporal and prefrontal cortices, while others remain silent (Fig. 2) (Panagiotaropoulos et al. 2012; Logothetis 1998; Kreiman et al. 2002; Quiroga et al. 2008).

Fig. 2

Global availability: consciousness in the first sense (C1): Conscious subjective percepts are encoded by the sudden firing of stimulus-specific neural populations distributed in interconnected, high-level cortical areas, including lateral prefrontal cortex, anterior temporal cortex, and hippocampus. (Top) During binocular flash suppression, the flashing of a picture to one eye suppresses the conscious perception of a second picture presented to the other eye. As a result, the same physical stimulus can lead to distinct subjective percepts. This example illustrates a prefrontal neuron sensitive to faces and unresponsive to checkers, whose firing shoots up in tight association with the sudden onset of subjective face perception (Panagiotaropoulos et al. 2012). (Bottom) During masking, a flashed image, if brief enough and followed by a longer “mask,” can remain subjectively invisible. Shown is a neuron in the entorhinal cortex firing selectively to the concept of “World Trade Center.” Rasters in red indicate trials where the subject reported recognizing the picture (blue = no recognition). Under masking, when the picture is presented for only 33 ms there is little or no neural activity—but once presentation time is longer than the perceptual threshold (66 ms or larger), the neuron fires substantially only on recognized trials. Overall, even for identical objective input (same duration), spiking activity is higher and more stable for recognized trials (Quiroga et al. 2008)

Stability as a Feature of Consciousness

Direct contrasts between seen and unseen pictures or words confirm that such ignition occurs only for the conscious percept. As explained earlier, nonconscious stimuli may reach into deep cortical networks and influence higher levels of processing and even central executive functions, but these effects tend to be small, variable, and short-lived (although nonconscious information decays at a slower rate than initially expected (King et al. 2016; Trübutschek et al. 2017)). By contrast, the stable, reproducible representation of high-quality information by a distributed activity pattern in higher cortical areas is a feature of conscious processing (Table 1). Such transient “meta-stability” seems to be necessary for the nervous system to integrate information from a variety of modules and then broadcast it back to them, thereby achieving flexible cross-module routing.

C1 Consciousness in Human and Nonhuman Animals

C1 consciousness is an elementary property which is present in human infants (Kouider et al. 2013) as well as in animals. Nonhuman primates exhibit similar visual illusions (Panagiotaropoulos et al. 2012; Logothetis 1998), attentional blink (Maloney et al. 2013), and central capacity limits (Watanabe and Funahashi 2014) as human subjects. Prefrontal cortex appears to act as a central information sharing device and serial bottleneck in both human and nonhuman primates (Watanabe and Funahashi 2014). The considerable expansion of prefrontal cortex in the human lineage may have resulted in a greater capacity for multimodal convergence and integration (Elston 2003; Neubert et al. 2014; Wang et al. 2015). Furthermore, humans possess additional circuits in inferior prefrontal cortex for verbally formulating and reporting information to others. The capacity to report information through language is universally considered as one of the clearest signs of conscious perception, because once information has reached this level of representation in humans, it is necessarily available for sharing across mental modules, and therefore conscious in the C1 sense. Thus, while language is not required for conscious perception and processing, the emergence of language circuits in humans may have resulted in a considerable increase in the speed, ease, and flexibility of C1-level information sharing.

Consciousness in the Second Sense (C2): Self-Monitoring

While C1-consciousness reflects the capacity to access external, objective information, consciousness in the second sense (C2) is characterized by the ability to reflexively represent oneself (Cleeremans et al. 2007; Cleeremans 2014; Dunlosky and Metcalfe 2008; Clark and Karmiloff-Smith 1993). A substantial amount of research in cognitive neuroscience and psychology has addressed self-monitoring under the term “metacognition,” roughly defined as cognition about cognition or knowing about knowing. Below, we review the mechanisms by which the primate brain monitors itself, while stressing their implications for building self-reflective machines.

A Probabilistic Sense of Confidence

When taking a decision, humans feel more or less confident about their choice. Confidence can be defined as a sense of the probability that a decision or computation is correct (Meyniel et al. 2015). Almost anytime the brain perceives or decides, it also estimates its degree of confidence. Learning is also accompanied by a quantitative sense of confidence: humans evaluate how much trust they have in what they have learned, and use it to weigh past knowledge versus present evidence (Meyniel and Dehaene 2017). Confidence can be assessed nonverbally, either retrospectively, by measuring whether humans persist in their initial choice, or prospectively, by allowing them to opt out from a task without even attempting it. Both measures have been used in nonhuman animals to show that they too possess metacognitive abilities (Smith 2009). By contrast, most current neural networks lack them: although they can learn, they generally lack meta-knowledge of the reliability and limits of what has been learned. A notable exception is biologically constrained models that rely on Bayesian mechanisms to simulate the integration of multiple probabilistic cues in neural circuits (Ma et al. 2006). These models have been fruitful in describing how neural populations may automatically compute the probability that a given process is performed successfully. Although these implementations remain rare and have not addressed the same range of computational problems as traditional AI, they offer a promising avenue for incorporating uncertainty monitoring in deep-learning networks.
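As an illustration of how confidence can be read out from probabilistic cue integration, the following sketch uses the textbook precision-weighted rule for combining independent Gaussian cues. It is not the neural population-code model of Ma et al. (2006). The function names, the binary "positive versus negative" decision, and the `min_confidence` opt-out threshold are assumptions made for the example.

```python
import math


def combine_gaussian_cues(cues):
    """Combine independent Gaussian cues given as (mean, variance) pairs.

    Returns the posterior (mean, variance) under a flat prior: each cue is
    weighted by its precision (the inverse of its variance).
    """
    precisions = [1.0 / var for _, var in cues]
    total_precision = sum(precisions)
    mean = sum(m * p for (m, _), p in zip(cues, precisions)) / total_precision
    return mean, 1.0 / total_precision


def confidence_positive(mean, variance):
    """Probability that the latent quantity is greater than zero."""
    return 0.5 * (1.0 + math.erf(mean / math.sqrt(2.0 * variance)))


def decide_or_opt_out(mean, variance, min_confidence=0.75):
    """Prospective use of confidence: decline the task when confidence is low."""
    p_pos = confidence_positive(mean, variance)
    confidence = max(p_pos, 1.0 - p_pos)
    if confidence < min_confidence:
        return "opt_out", confidence
    return ("positive" if p_pos >= 0.5 else "negative"), confidence
```

In this sketch, two equally reliable cues that agree yield a posterior with half the variance of either cue, and therefore a higher confidence than either cue alone. The opt-out rule mirrors the prospective measure described above, in which an agent may decline a task without attempting it.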

Explicit Confidence in Prefrontal Cortex

According to Bayesian accounts, each local cortical circuit may represent and combine probability distributions in order to estimate processing uncertainty (Ma et al. 2006). However, additional neural circuits may be required in order to explicitly extract and manipulate confidence signals. MRI studies in humans and physiological recordings in primates and even in rats have specifically linked such confidence processing to the prefrontal cortex (Fleming et al. 2010; Miyamoto et al. 2017; Kepecs et al. 2008). Inactivation of prefrontal cortex can induce a specific deficit in second-order (i.e., metacognitive) judgments while sparing performance on the first-order task (Miyamoto et al. 2017; Rounis et al. 2010). Thus, circuits in prefrontal cortex may have evolved to monitor the performance of other brain processes.

Error Detection: Reflecting on One’s Own Mistakes

Error detection provides a particularly clear example of self-monitoring: just after responding, we sometimes realize that we made an error and change our mind. Error detection is reflected by two components of EEG activity, the error-related negativity (ERN) and the positivity upon error (Pe), which emerge in cingulate and medial prefrontal cortex just after a wrong response, but before any feedback is received. How can the brain make a mistake and detect it? One possibility is that the accumulation of sensory evidence continues after a decision is made, and an error is inferred whenever this further evidence points in the opposite direction (Resulaj et al. 2009). A second possibility, more compatible with the remarkable speed of error detection, is that two parallel circuits, a low-level sensory-motor circuit and a higher-level intention circuit, operate on the same sensory data and signal an error whenever their conclusions diverge (Charles et al. 2014, 2013).
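The second possibility, in which two parallel routes read the same data and an error is flagged when they disagree, can be sketched as follows. This is a deliberately simplified illustration and not the model of Charles et al.: the function names are hypothetical, and the only difference assumed between the two routes is an explicit `motor_noise` term that corrupts the fast sensory-motor route.

```python
def fast_route(evidence, motor_noise=0.0):
    """Low-level sensory-motor circuit: a quick response that noise can corrupt."""
    return 1 if sum(evidence) + motor_noise >= 0 else -1


def intention_route(evidence):
    """Higher-level circuit: computes the response that was intended."""
    return 1 if sum(evidence) >= 0 else -1


def monitor(evidence, motor_noise=0.0):
    """Compare the two routes and flag an error when their conclusions diverge.

    No external feedback is used: the error signal is purely internal.
    """
    executed = fast_route(evidence, motor_noise)
    intended = intention_route(evidence)
    return {"executed": executed,
            "intended": intended,
            "error_detected": executed != intended}
```

Because the comparison needs no additional evidence and no feedback, an error signal is available as soon as both routes have produced an output, which is the property that makes this account compatible with fast error detection.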

Meta-Memory

Humans don’t just know things about the world: they also know that they know, or that they don’t know. A familiar example is having a word “on the tip of the tongue.” The term “meta-memory” was coined to capture the fact that humans report feelings of knowing, confidence, and doubts about their memories. Meta-memory is thought to involve a second-order system that monitors internal signals (e.g., the strength and quality of a memory trace) to regulate behavior. Meta-memory is associated with prefrontal structures whose pharmacological inactivation leads to a metacognitive impairment while sparing memory performance itself (Miyamoto et al. 2017). Meta-memory is crucial to human learning and education, by allowing learners to develop strategies such as increasing the amount of study or adapting the time allocated to memory encoding and rehearsal (Dunlosky and Metcalfe 2008).

Reality Monitoring

In addition to monitoring the quality of sensory and memory representations, the human brain must also distinguish self-generated versus externally driven representations. Indeed, we can perceive things, but also conjure them from imagination or memory. Hallucinations in schizophrenia have been linked to a failure to distinguish whether sensory activity is generated by oneself or by the external world (Frith 1992). Neuroimaging studies have linked this kind of reality monitoring to the anterior prefrontal cortex (Simons et al. 2017). In nonhuman primates, neurons in the prefrontal cortex distinguish between normal visual perception and active maintenance of the same visual content in memory (Mendoza-Halliday and Martinez-Trujillo 2017).

Foundations of C2 Consciousness in Infants

Self-monitoring is such a basic ability that it is already present during infancy (Fig. 3). The ERN, indicating error monitoring, was observed when one-year-old infants made a wrong choice in a perceptual decision task (Goupil and Kouider 2016). Similarly, after 1-½-year-old infants pointed to one of two boxes in order to obtain a hidden toy, they waited longer for an upcoming reward (e.g., a toy) when their initial choice was correct than when it was wrong, suggesting that they monitored the likelihood that their decision was right (Kepecs et al. 2008; Goupil and Kouider 2016). Moreover, when given the opportunity to ask (nonverbally) their parents for help instead of pointing, they chose this opt-out option specifically on trials where they were likely to be wrong, revealing a prospective estimate of their own uncertainty (Goupil et al. 2016). The fact that infants can communicate their own uncertainty to other agents further suggests that they consciously experience metacognitive information. Thus, infants are already equipped with the ability to monitor their own mental states. Facing a world where everything remains to be learned, C2 mechanisms allow them to actively orient towards domains that they know they don’t know—a mechanism that we call “curiosity.”

Fig. 3

Self-monitoring: consciousness in the second sense (C2): Self-monitoring (also called “meta-cognition”), the capacity to reflect on one’s own mental state, is available early during infancy. (Top) One-and-a-half-year-old infants, after deciding to point to the location of a hidden toy, exhibit two types of evidence for self-monitoring of their decision: (1) they persist longer in searching for the hidden object within the selected box when their initial choice was correct than when it was incorrect. (2) When given the opportunity to ask for help, they use this option selectively to reduce the probability of making an error. (Bottom) One-year-old infants were presented with either a meaningless pattern or a face that was either visible or invisible (depending on its duration) and then decided to gaze left or right in anticipation of face reappearance. As for manual search, post-decision persistence in waiting at the same gaze location increased for correct compared to incorrect initial decisions. Moreover, EEG signals revealed the presence of the error-related negativity over fronto-central electrodes when infants made an incorrect choice. These markers of metacognition were elicited by visible but not by invisible stimuli, as also shown in adults (Charles et al. 2013).

Dissociations Between C1 and C2

According to our analysis, C1 and C2 are largely orthogonal and complementary dimensions of what we call consciousness. On one side of this double dissociation, self-monitoring can exist for unreportable stimuli (C2 without C1). Automatic typing provides a good example: subjects slow down after a typing mistake, even when they fail to consciously notice the error (Logan and Crump 2010). Similarly, at the neural level, an ERN can occur for subjectively undetected errors (Nieuwenhuis et al. 2001). On the other side of this dissociation, consciously reportable contents sometimes fail to be accompanied by an adequate sense of confidence (C1 without C2). For instance, when we retrieve a memory, it pops into consciousness (C1) but sometimes without any accurate evaluation of its confidence (C2), leading to false memories. As noted by Marvin Minsky, “what we call consciousness [in the C1 sense] is a very imperfect summary in one part of the brain of what the rest is doing.” The imperfection arises in part from the fact that the conscious global workspace reduces complex parallel sensory streams of probabilistic computation to a single conscious sample (Asplund et al. 2014; Vul et al. 2009; Moreno-Bote et al. 2011). Thus, probabilistic information is often lost on the way, and subjects feel over-confident in the accuracy of their perception.

Synergies Between C1 and C2 Consciousness

Because C1 and C2 are orthogonal, their joint possession may have synergistic benefits to organisms. In one direction, bringing probabilistic metacognitive information (C2) into the global workspace (C1) allows it to be held over time, integrated into explicit long-term reflection, and shared with others. Social information sharing improves decisions: by sharing their confidence signals, two persons achieve a better performance in collective decision-making than either person alone (Bahrami et al. 2010). In the converse direction, the possession of an explicit repertoire of one’s own abilities (C2) improves the efficiency with which C1 information is processed. During mental arithmetic, children can perform a C2-level evaluation of their available competences (e.g., counting, adding, multiplying, memory retrieval…) and use this information to evaluate how to best face a given arithmetic problem (Siegler 1988). This functionality requires a single “common currency” for confidence across different modules, which humans appear to possess (de Gardelle and Mamassian 2014).

Endowing Machines with C1 and C2

How could machines be endowed with C1 and C2 computations? Let us return to the car light example. In current machines, the “low gas” light is a prototypical example of an unconscious modular signal (C0). When the light flashes, all other processors in the machine remain uninformed and unchanged: fuel continues to be injected in the carburetor, the car passes gas stations without stopping (although they might be present on the GPS map), etc. Current cars or cell phones are mere collections of specialized modules that are largely “unaware” of each other. Endowing this machine with global information availability (C1) would allow these modules to share information and collaborate to address the impending problem (much like humans do when they become aware of the light, or elephants of thirst).

While AI has met considerable success in solving specific problems, implementing multiple processes in a single system and flexibly coordinating them remain difficult problems. In the 1960s, computational architectures called “blackboard systems” were specifically designed to post information and make it available to other modules in a flexible and interpretable manner, similar in flavor to a global workspace (Baars 1988). A recent architecture called Pathnet uses a genetic algorithm to learn which path through its many specialized neural networks is most suited to a given task (Fernando et al. 2017). This architecture exhibits robust, flexible performance and generalization across tasks, and may constitute a first step towards primate-like conscious flexibility.
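The idea of posting information where every module can read it can be sketched in a few lines of Python. This is a toy publish-subscribe structure in the spirit of a blackboard, not an implementation of any published blackboard system or of the global workspace itself. The class `GlobalWorkspace`, the topic name `fuel_low`, and the two car modules are hypothetical names introduced for illustration.

```python
from collections import defaultdict


class GlobalWorkspace:
    """Toy shared store: any posted content is sent to all interested modules."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of handlers
        self.contents = {}                     # latest value posted per topic

    def subscribe(self, topic, handler):
        """Register a module's handler for a given topic."""
        self._subscribers[topic].append(handler)

    def broadcast(self, topic, value):
        """Store the value and pass it to every subscribed module.

        Returns the list of module responses, in subscription order.
        """
        self.contents[topic] = value
        return [handler(value) for handler in self._subscribers[topic]]


# Hypothetical specialized modules of the car.
def navigation_module(fuel_low):
    return "reroute to nearest gas station" if fuel_low else "keep current route"


def engine_module(fuel_low):
    return "switch to economy mode" if fuel_low else "normal mode"


workspace = GlobalWorkspace()
workspace.subscribe("fuel_low", navigation_module)
workspace.subscribe("fuel_low", engine_module)

# The fuel sensor posts its signal; every subscribed module reacts to it.
actions = workspace.broadcast("fuel_low", True)
print(actions)
```

In this sketch the fuel signal is no longer confined to one module: the navigation and engine modules both receive it and can adjust their behavior. Real flexibility would also require deciding which content gains access to the shared store, which the sketch leaves out.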

To make optimal use of the information provided by the fuel-gauge light, it would also be useful for the car to possess a database of its own capacities and limits. Such self-monitoring (C2) would include an integrated image of itself, including its current location, fuel consumption, etc., as well as its internal databases (e.g., “knowing” that it possesses a GPS map that can locate gas stations). A self-monitoring machine would keep a list of its subprograms, compute estimates of their probabilities of succeeding at various tasks, and constantly update them (e.g., noticing if a part fails).
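A minimal sketch of such a list of subprograms with running success estimates is given below, in Python. It assumes that each subprogram's outcomes can be scored as success or failure, and it estimates the success probability as the mean of a Beta posterior with a uniform prior. The class `SelfModel` and the subprogram names are hypothetical and serve only as an illustration.

```python
class SelfModel:
    """Toy registry of subprograms with running estimates of their reliability."""

    def __init__(self):
        # name -> [successes + 1, failures + 1], i.e. Beta(1, 1) prior counts
        self._counts = {}

    def register(self, name):
        """Add a subprogram, starting from a uniform prior."""
        self._counts[name] = [1, 1]

    def record(self, name, success):
        """Update the estimate after observing one outcome."""
        self._counts[name][0 if success else 1] += 1

    def p_success(self, name):
        """Posterior mean probability that the subprogram succeeds."""
        successes, failures = self._counts[name]
        return successes / (successes + failures)

    def best(self):
        """Name of the subprogram currently estimated as most reliable."""
        return max(self._counts, key=self.p_success)


car = SelfModel()
car.register("gps_lookup")
car.register("dead_reckoning")

for outcome in (True, True, True, False):
    car.record("gps_lookup", outcome)
for outcome in (True, False):
    car.record("dead_reckoning", outcome)

print(car.best(), round(car.p_success("gps_lookup"), 3))

# A part fails: repeated GPS failures lower its estimated reliability.
for _ in range(3):
    car.record("gps_lookup", False)
print(car.best(), round(car.p_success("gps_lookup"), 3))
```

The point of the design is that the estimates are updated after every outcome, so a failing component is noticed through its falling estimate and the system can switch to an alternative.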

Most present-day machine-learning systems are devoid of any self-monitoring: they compute (C0) without representing the extent and limits of their knowledge or the fact that others may have a viewpoint different from their own. There are a few exceptions: Bayesian networks (Ma et al. 2006) or programs (Tenenbaum et al. 2011) compute with probability distributions and therefore keep track of how likely they are to be correct. Even when the primary computation is performed by a classical CNN, and is therefore opaque to introspection, it is possible to train a second, hierarchically higher neural network to predict the first one’s performance (Cleeremans et al. 2007). This approach, whereby a system re-describes itself, has been claimed to lead to “the emergence of internal models that are metacognitive in nature and (…) make it possible for an agent to develop a (limited, implicit, practical) understanding of itself” (Cleeremans 2014). Pathnet (Fernando et al. 2017) uses a related architecture to track which internal configurations are most successful at a given task and use this knowledge to guide subsequent processing. Robots have also been programmed to monitor their learning progress, and use it to orient resources towards the problems that maximize information gain, thus implementing a form of curiosity (Gottlieb et al. 2013).
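The idea of a second-order system trained to predict a first-order system's performance can be illustrated with a deliberately small Python sketch. Everything in it is a stand-in: the first-order "network" is a noisy threshold unit, and the second-order monitor is a single logistic unit that reads the magnitude of the first unit's internal activation. The function names, noise level, and learning rate are hypothetical and are not drawn from the cited work.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_monitor(n_samples=5000, noise=0.5, lr=0.1, seed=0):
    """Train a logistic monitor to predict whether the first-order unit is correct.

    Returns the monitor's weight and bias.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(n_samples):
        # First-order system: classify the sign of x from a noisy activation h.
        x = rng.uniform(-1.0, 1.0)
        h = x + rng.gauss(0.0, noise)
        correct = 1.0 if (h > 0) == (x > 0) else 0.0
        # Second-order system: predict correctness from |h| alone.
        p = sigmoid(w * abs(h) + b)
        # One stochastic gradient step on the logistic (cross-entropy) loss.
        w += lr * (correct - p) * abs(h)
        b += lr * (correct - p)
    return w, b


def predicted_accuracy(h, w, b):
    """Monitor's estimate that the first-order decision based on h is correct."""
    return sigmoid(w * abs(h) + b)


w, b = train_monitor()
print(f"weak activation:   {predicted_accuracy(0.05, w, b):.2f}")
print(f"strong activation: {predicted_accuracy(1.5, w, b):.2f}")
```

The monitor never sees the correct answer at test time. It only learns that decisions based on weak activations tend to be wrong more often, and so it assigns them lower confidence than decisions based on strong activations.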

An important element of C2 that has received relatively little attention is reality monitoring. Bayesian approaches to AI (Lake et al. 2017; Tenenbaum et al. 2011) have recognized the usefulness of learning generative models that can be jointly used for actual perception (present), prospective planning (future), and retrospective analysis (past). In humans, the same sensory areas are involved in both perception and imagination. As such, some mechanisms are needed to tell apart self-generated versus externally triggered activity. A powerful method for training generative models, called adversarial learning (Goodfellow et al. 2014), involves having a secondary network “compete” against a generative network, to critically evaluate the authenticity of self-generated representations. When such reality monitoring (C2) is coupled with C1 mechanisms, the resulting machine may more closely mimic human consciousness in terms of affording global access to perceptual representations while having an immediate sense that their content is a genuine reflection of the current state of the world.

Concluding Remarks

Our stance is based on a simple hypothesis: what we call “consciousness” results from specific types of information-processing computations, physically realized by the hardware of the brain. It differs from other theories in being resolutely computational—we surmise that mere information-theoretic quantities (Tononi et al. 2016) do not suffice to define consciousness unless one also considers the nature and depth of the information being processed.

We contend that a machine endowed with C1 and C2 would behave as if it were conscious: for instance, it would know that it is seeing something, would express confidence in it, would report it to others, could suffer hallucinations when its monitoring mechanisms break down, and may even experience the same perceptual illusions as humans. Still, such a purely functional definition of consciousness may leave some readers unsatisfied. Are we “over-intellectualizing” consciousness, by assuming that some high-level cognitive functions are necessarily tied to consciousness? Are we leaving aside the experiential component (“what it is like” to be conscious)? Does subjective experience escape a computational definition?

While those philosophical questions lie beyond the scope of the present paper, we close by noting that, empirically, in humans, the loss of C1 and C2 computations co-varies with a loss of subjective experience. For example, damage to the primary visual cortex may lead to a neurological condition called “blindsight,” in which patients report being blind in the affected visual field. Remarkably, those patients can localize visual stimuli in their blind field, but they cannot report them (C1), nor can they effectively assess their likelihood of success (C2); they believe that they are merely “guessing.” In this example at least, subjective experience appears to cohere with possession of C1 and C2. Although centuries of philosophical dualism have led us to consider consciousness as irreducible to physical interactions, the empirical evidence is compatible with the possibility that consciousness arises from nothing more than specific computations.