1 Introduction

Since the beginning of modernity, people have tried to create artificial creatures and humanoid automata. Jacques de Vaucanson’s flute player, his fluttering, quacking and drinking duck, or the „scribe” created by Pierre Jaquet-Droz in 1774 are well-known examples of the fascination that such lifelike products aroused in their contemporaries. Current robots have left their early predecessors far behind and are about to become common partners of humans as „social robots” – as care robots for the elderly, playmates for children, household helpers, or conversation partners for the lonely.

The problems we run into when interacting with androids are illustrated by „Sophia”, a humanoid robot from the company Hanson Robotics (Parviainen & Coeckelbergh, 2021). Sophia has human-like facial expressions, displays about 60 different emotional signals, has a reasonably modulated tone of voice and makes eye contact with people she encounters. She (or „it”? „She” is typically reserved for people, but let’s accept anthropomorphism for the moment) answers relatively complex questions, can recognize people and jokes about the English weather on a London talk show.Footnote 1 Even if it’s just a bluff, her effect is astounding. Sophia is already on the edge of the „uncanny valley” (Mori et al., 2012), as it is called in robotics, the threshold where an android’s human resemblance creates in us a feeling of both uncanniness and fascination.

In a different way than in robotics, this threshold is crossed in „Her", a science fiction film by Spike Jonze from 2013: Theodore, a shy but empathetic man, falls in love with a computer program named Samantha, who has no corporeality except for an erotic voice (spoken by Scarlett Johansson). However, as a „learning system", she seems to increasingly develop human sensations and empathy. The more Theodore feels understood by Samantha and finally falls in love with her, the more indifferent he becomes to the question of whether she is a real person or just a simulation – the delightful relationship is enough, and he loses his critical distance.

It seems timely that we account for our interactions with agents that simulate subjectivity and aliveness. After all, artificial systems such as Alexa or Siri and robots like Asimo (Sakagami et al., 2002), iCub (Gaudiello et al., 2016) or Sophia are designed to interact with us as convincingly as possible. The humanoid robot Pepper is able to analyze the facial expressions, gestures, and tone of voice of its human counterparts in order to calculate their emotional state (Pandey & Gelin, 2018).Footnote 2 Similarly, so-called „empathic chatbots" are said to exhibit emotional intelligence in order to help people with mental health problems (Devaram, 2020). Moreover, it is now frequently claimed that we can „understand" robots (Hegel et al., 2009; Ziemke, 2020), rightly attribute to them not only „states of mind" but also „desires, knowledge, beliefs, emotions, perceptions" (Hellström & Bensch, 2018), empathize with them (Schmetkamp, 2020), and accept them as partners (Breazeal et al., 2004). Robots are thus regarded as „intentional agents" whose „beliefs and desires" should be appropriately understood in order to interact with them (Thellman & Ziemke, 2020; Thellman, 2021). Conversely, robots should „understand others' actions, intentions, and emotions and show emotions themselves" (Brinck & Balkenius, 2020, 54), so that there could be „joint intention" (Breazeal et al., 2004) and even „mutual recognition" between humans and robots (Brinck & Balkenius, 2020). This development raises a number of interrelated questions:

(1) Is it really possible to understand AI systems or robots in the proper sense of the word, i.e., to regard them as agents with beliefs, intentions, and desires? And can there be mutual empathy or „shared goals and shared intentions" between a human and a robot (Herrmann & Melhuish, 2010)?

(2) If this assumption proves to be incorrect at present, could there be a stage in the future development of AI systems at which we should actually attribute some kind of subjectivity, and thus quasi-personal status, to them?

(3) How will our attitudes toward AI systems change as we increasingly interact with them? Will the distinction between simulated and real encounters become increasingly blurred?

These questions will be explored in the following, with a focus on the question of a possible understanding of AI systems and robots. To this end, an initial conceptual clarification is needed. (a) „Understanding" here means not just „understanding how something works" – this functional meaning is obviously not what is meant in the above-mentioned contexts of artificial agents. (b) In what follows, understanding also means more than grasping the semantic meaning of words or other signs, as when one speaks of „understanding a text", for example. Of course, we can „understand" Alexa or Sophia in the sense that we can take their programmed output as information. What is meant in the following is communicative understanding in the proper sense, namely understanding the utterances of another as an expression of his or her intentions, beliefs and feelings – in short: understanding not something, but someone. The question, then, is whether this concept of understanding can also be applied to artificial systems, so that there can be communication with them in the proper sense.Footnote 3

I will proceed in several steps. First, I will describe the conditions for mutual understanding on the empathic (2.1) and on the semantic level (2.2) and show in each case how talk of „understanding” current artificial systems represents a category mistake. According to my thesis, the basic condition for understanding turns out to be the sharing of a common form of life: sociality presupposes conviviality. I then show, by reference to an enactive concept of living beings as autopoietic systems, that artificial systems are unable in principle to fulfill this fundamental condition of understanding (3). In the final section, I examine the possible consequences of a creeping dissolution of the categorial distinctions between genuine sociality and „we-intentionality” on the one hand, and simulated or feigned sociality on the other (4).

2 The preconditions for communicative understanding

Let us first examine in more detail whether we can speak of communicative understanding vis-à-vis an AI system or a robot. We can distinguish two forms of such understanding:

(a) empathic understanding, i.e., understanding the other's emotional expression, such as his or her joy or sadness;

(b) semantic understanding, i.e., understanding his or her verbal utterances.

In both respects, interaction with an artificial system can give the impression or illusion of understanding someone. Let us consider each of them separately.

2.1 Empathic understanding

Social understanding is primarily based on grasping the other's feelings and intentions through intercorporeal empathy (Zahavi, 2015; Fuchs, 2017). It is directed at the emotional expression of others, manifested in their facial and gestural movements, be it in face-to-face encounters or also in watching people in movies or on television. However, this primary empathy is by no means limited to living beings. It can also be directed towards inanimate objects, if they seem – e.g. by their movements – to show expressive or intentional behavior. One example is Heider and Simmel's (1944) famous experiment on simple geometric shapes such as circles or triangles moving around each other, which led people to interpret them in terms of intentional and emotional behavior. Similarly, a robotic lawnmower, „searching" in vain for a charging station for its dying battery, can easily elicit sympathy. Numerous studies have shown that people treat robots or avatars as if they were living beings endowed with mental states, and cite intentions or desires rather than causes as explanations for their actions (Duffy, 2003; Waytz et al., 2010; Özdem et al., 2017; Harth, 2017).

At the same time, this anthropomorphism is usually accompanied by an „as-if-consciousness", i.e., by the implicit knowledge that what is involved is only an apparent intentionality (Fuchs, 2014). We take the „intentional stance" (Dennett, 1987) even towards non-living agents, but without necessarily believing that they actually have genuine intentionality (Thellman et al., 2017).Footnote 4 This as-if-consciousness, however, dwindles with the increasing lifelikeness of the objects. We easily perceive human-like voices in particular as an expression of an „inside".Footnote 5 Something that listens and responds to us like Siri or Alexa, or that advises us and performs services for us, we readily perceive as alive and animated. And when Sophia says in a tender voice, „That makes me happy," it takes some active distancing to realize that there is no one there to feel happy, that it is indeed not an utterance at all. In other words, we should not be deceived by the involuntary empathy to which we tend when objects are sufficiently expressive and life-like; it certainly does not correspond to a real sharing of feelings.

The increasingly perfected simulation of subjectivity and communication thus requires that we reject the pretense of an utterance and take Sophia's talk for what it actually is: hollow words, like those of a parrot. Otherwise, we abandon ourselves to appearances and, like Theodore in Her, simply give up the „as-if", the distinction between simulation and reality – in a move that Lombard & Ditton (1997) have termed the „willingness to suspend disbelief". In that case, the impression of an utterance is no longer rejected but passes over into the illusion of empathy, of an understanding of feelings.

Of the positions that see here not an illusory but a justified empathy, I pick out only one. By comparing robots to fictional characters, Schmetkamp (2020) has argued that we can indeed empathize with robots „… by either inferring, feeling, interacting, or imagining how they perceive and move in their world" – just as we imagine how a character in a novel or movie perceives, acts, and feels. In this way, we might also „attribute something like a perspectival experience to robots" (Schmetkamp, 2020: 881). Now, there is undoubtedly empathy with fictional persons, such as Anna Karenina, even if we remain aware that they are not real (Fuchs, 2014). However, if they were real, then our empathy would have an actual counterpart, precisely in the experience of these persons; they would be people like ourselves. In the case of humanoid robots, however, it is the other way around: they are quite real, but our empathy with them is only an unjustified anthropomorphism, since it does not correspond to any subjective experience.

Thus, either Schmetkamp's argument again boils down to the indisputable fact that humans easily attribute intentions and feelings to robots (for which no reference to fictional characters is needed), or she wrongly transfers the case of fictional characters to humanoid robots, as if they had something in common with Anna Karenina's feelings, with which we could sympathize. The latter seems more likely, because Schmetkamp also ascribes a perceptual experience to robots („a robot literally (e.g. visually) perceives the world in a certain way", p. 890). This, however, is a category mistake: robots can simulate perception (as the robotic lawnmower does), but they cannot actually perceive, because they do not have subjectivity. Thus, our empathy with them remains without an adequate object.

2.2 Semantic understanding

We can also understand utterances in a semantic sense, provided that it is a matter of linguistic communication. This is not necessarily tied to bodily presence, but can also be transmitted as a letter, email or chat. In such cases we still understand the utterance as utterance, i.e., we read it as an expression of the other’s intentions, not just as factual information as in a newspaper. But in such communication the possibility of simulation and thus of feigning subjectivity is naturally increased. It is already possible that the friendly online partner or the empathetic online therapist is in fact just a chatbot. Let us assume that the simulation of intentional utterances is so successful that we can no longer recognize it as such and have the compelling impression of a real „counterpart”. From this point on, would the attribution of intentionality and thus of subjectivity be justified?

This is the situation underlying Alan Turing's well-known test: a group of test subjects were to communicate in writing with a human and with a computer without having any visual or auditory contact with either (Turing, 1950). If the test subjects were subsequently unable to distinguish between human and computer, then, according to Turing, nothing would prevent us from recognizing the latter as a „thinking machine." Critics have rightly pointed out that the Turing test defines thinking and its intentional expression in purely behavioristic terms, namely as the output of a computational system, be it the brain or the computer. Yet to the objection that thinking presupposes subjectivity or consciousness, Turing would reply that we can as little be sure of other humans actually thinking as we can be of machines:

According to the most extreme form of this view the only way by which one could be sure that a machine thinks is to be the machine and to feel oneself thinking. One could then describe these feelings to the world, but of course no one would be justified in taking any notice. Likewise, according to this view the only way to know that a man thinks is to be that particular man. It is in fact the solipsist point of view. (Turing 1950: 446)

Subjectivity and indeed consciousness as such are, according to Turing, inaccessible and therefore unverifiable. Mere verbal output is sufficient for the attribution of „thinking” – embodied interaction is excluded by the scenario from the outset.

Now, the Turing test has not yet been passed by any AI system. The Loebner Prize, established in 1991 to reward the first machine to do so, has never had to be awarded. It is not complex logical questions that AI systems fail at, but rather questions that require common sense and contextual understanding (Moor, 2001), such as: „Where is Peter's nose when Peter is in New York? What does the letter M look like when you turn it upside down? Does my budgie have ancestors who were alive in 1750? How many grains of sand do you call a heap?" Supposedly intelligent systems fail here, especially when it comes to understanding metaphors, irony, or sarcasm. They know only unambiguous individual elements, 0 or 1 – for everything that is ambiguous, enigmatic, vague, or merely atmospheric, they lack any sense. The relationship between foreground and background, object and context, which helps us make sense of such questions, does not exist for them, nor does the shared background of commonsensical knowledge (Dreyfus, 1992; Fuchs, 2021).

But let us assume that future machine learning systems will be able to pass the Turing test – with sufficient training based on myriads of situations, context understanding might eventually be simulated. This is even more likely when the systems are implemented in robots that interact with their environment. Such a system with abilities that equal or even surpass those of the human mind is referred to in the research as „strong AI.” So once a future Alexa can carry on any conversation, remember past situations and refer to itself – would we then also have to attribute subjectivity to it and concede that it can „understand” us in a genuine sense?

Searle countered Turing's argument with his well-known thought experiment of the „Chinese room" (Searle, 1980). A man who does not understand a word of Chinese is locked in a room containing only a manual with all the rules for answering Chinese questions. The man receives incomprehensible Chinese characters from a Chinese speaker through a slit in the room („input") but, with the help of the manual, is able to find appropriate answers, which he then passes to the outside („output"). However, as Searle argues, even if the Chinese speaker outside does not notice the deception, one could certainly not claim that the man in the room understands Chinese. Searle's „Chinese Room" is, of course, an illustration of a computer which functions completely adequately and yet lacks the decisive prerequisite for understanding, namely intentional (and for that matter, phenomenal) consciousness. Consequently, human understanding cannot be reduced to functional algorithms: even „strong AI", should it be possible at all, would only simulate understanding.
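The structure of the scenario can be pictured with a minimal sketch (a hypothetical illustration only, not a reconstruction of Searle's argument): a program that maps Chinese input strings to Chinese output strings via a bare lookup table. The phrases and the fallback reply are invented for illustration; the point is that the mapping is purely syntactic.

```python
# A minimal, hypothetical "Chinese Room" in code: the rule book is a plain
# lookup table mapping incoming character strings to outgoing ones. The
# mapping is purely syntactic; nothing in the program grasps what the
# symbols mean. (The example phrases are invented for illustration.)

RULE_BOOK = {
    "你好吗?": "我很好，谢谢。",          # "How are you?" -> "I am fine, thanks."
    "今天天气怎么样?": "今天天气很好。",   # "How is the weather today?" -> "The weather is nice today."
}

def chinese_room(incoming: str) -> str:
    """Return the answer prescribed by the rule book, or a stock reply."""
    return RULE_BOOK.get(incoming, "请再说一遍。")  # "Please say that again."

if __name__ == "__main__":
    print(chinese_room("你好吗?"))  # a fitting answer, zero comprehension
```

However fitting the answers may appear to the questioner outside, nothing in such a program attaches meaning to the symbols it shuffles – which is precisely Searle's point.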

Dennett and others have objected to Searle that while understanding or comprehension cannot be attributed to the person in the room, it can well be attributed to the system as a whole, provided it is equipped with sufficiently complex programs:

The competence is in the software (…) The central processing unit in your laptop doesn’t know anything about chess, but when it is running a chess program, it can beat you at chess, and so forth […] The way to reproduce human competence and hence comprehension (eventually) is to stack virtual machines on top of virtual machines on top of virtual machines – the power is in the system, not in the underlying hardware […] comprehension is an effect created (bubbling up) from a host of competences piled on competences (Dennett, 2013, 325).

However, the idea that AI could eventually reach the level of human, i.e. conscious, intelligence simply by increasing the complexity of the software is no more than an assumption. It is often justified by the principle of recursivity, i.e. the feedback of a system's state into its further processes. But this principle is already realized in a thermostat, and no one would argue that a refrigerator, for example, can „feel" too warm and „decide" to lower the temperature. A drone likewise has all the homing systems and feedback mechanisms that allow it to continuously self-adjust its trajectory, but we are unlikely to attribute to it an understanding of its search process or a sense of success when it reaches its target. Whatever properties of a system or whatever relations to its environment are fed into its information processing, there is nothing to suggest that this could at some point produce qualitative experience or understanding. Dennett does not even try to make this plausible, but simply defines comprehension in functionalist terms as the result of competences, i.e. of the appropriate performance of a system (e.g. the chess computer) – exactly in the sense of Turing. „Piling up" these competences does not change this.Footnote 6
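What such recursivity amounts to can be shown in a few lines (a minimal sketch; the temperatures, thresholds and the crude environment model are invented for illustration): the system's own state is measured and fed back into its next step, and nothing more.

```python
# A minimal feedback ("recursive") controller in the spirit of a thermostat:
# the system's own state is measured and fed back into its next action.
# All values are arbitrary; the point is that such feedback is plain
# state-dependent computation, not feeling or deciding in any experiential sense.

TARGET_TEMP = 7.0   # target temperature in degrees Celsius
TOLERANCE = 0.5

def control_step(current_temp: float, cooling_on: bool) -> bool:
    """Feed the measured state back into the next control decision."""
    if current_temp > TARGET_TEMP + TOLERANCE:
        return True     # switch cooling on
    if current_temp < TARGET_TEMP - TOLERANCE:
        return False    # switch cooling off
    return cooling_on   # otherwise keep the previous state

def simulate(steps: int = 10) -> None:
    temp, cooling = 10.0, False
    for _ in range(steps):
        cooling = control_step(temp, cooling)
        temp += -0.8 if cooling else 0.3   # crude model of the environment
        print(f"temp={temp:4.1f}  cooling={cooling}")

if __name__ == "__main__":
    simulate()
```

However many such loops are stacked on top of one another, the result remains a chain of state updates; nothing in the loop feels warm or cold.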

But what do we actually mean when we talk about someone understanding another’s verbal utterance? Obviously not merely that he is able to give a suitable answer to it (even if this is normally a sufficient indication). In other words, it is not enough to link the verbal symbols to a fact represented in one’s mind, so that this link becomes the trigger for further chains of symbols and an appropriate linguistic output. All this could also be reproduced by the algorithms of a program, or by Dennett’s „virtual machines”. Understanding means instead the embedding of the heard words in a context of what is known or pre-understood, so that a feeling of recognition, congruence and familiarity arises.

So, for example, when I hear my friend's request, „give me the hammer, please!", I need to match it with my prior understanding of a hammer, at the same time grasp my friend's intention, and finally have the bodily knowledge of how to grasp and hand over a hammer. This familiarity with the words and with the meaning of the situation is a necessary part of understanding the request. It manifests itself in an implicit feeling of „I understood," which then prompts me to take the appropriate action. Thus, a feeling of familiarity and congruence is the characteristic of understanding – the appropriate response or reaction is merely its consequence. Semantic understanding, too, is therefore by no means a purely functional or cognitive process, but also an affective one; it again presupposes a feeling and thus an experiencing subject. This is where a functionalist description that eliminates subjective, qualitative experience and reduces understanding to a suitable input-output relation fails.

This is even more true if we consider the entire situation of communication: understanding means not only grasping the meaning of another’s utterance, but also being aware that he addressed me with his utterance, i.e., that he intended an understanding. His communicative intention is a necessary part of the utterance that I understand (Grice, 1957). The fact that I thus understand not only the other’s words but also the other himself as an intentional subject ultimately enables the shared intentionality or „we-intentionality“ of understanding. It implies both (a) that I perceive my interlocutor as an intentional agent like myself, and (b) that he in turn has an awareness of me as an intentional agent. This is the reciprocal relation of the second-person perspective: each partner in the interaction experiences himself or herself as the other’s ‘you’, as the addressee of his communicative intention: „[T]he unique feature of relating to you as you is that you also have a second-person perspective on me, that is, you take me as your you” (Zahavi, 2015: 93). This, in turn, is the basis for a sense of „we” that connects us with the other person, a feeling of mutual understanding.Footnote 7

Thus, in order to understand Alexa or Samantha in the communicative sense, we would have to attribute to them not only an actual understanding of our words in the sense given above, but also a second-person perspective, namely an awareness of us as understanding subjects, along with a communicative intention, i.e. the will to convey something to us with their utterances. Even in a perfect simulation of communication, one which would let an AI system pass the Turing test, this would be lacking; there could be no question of a mutual understanding, let alone „mutual recognition” (Brinck & Balkenius, 2020).

3 Why robots can’t experience

I have described the conditions for communicative understanding in the empathic and semantic sense – conditions that are clearly not fulfilled by current AI-based systems:

(1) The involuntary empathy we feel towards artificial agents is based merely on our tendency to anthropomorphism.

(2) The mere transmission of information between such agents and a human does not mean understanding someone. In other words, it implies no more understanding than reading an instruction manual, even if it proceeds via verbal interactions.

Now one could argue that the future development of humanoid robots will at some point cross the threshold beyond which we should ascribe subjectivity to them. Increasingly perfect simulation – as shown in „Her" – can already give rise to doubts as to whether we are not dealing with subjects after all, with the possibility of mutual understanding in the proper sense.

I reject this possibility for the following reasons. (1) Our everyday mutual understanding is not only based on the attribution of intentional states, but more fundamentally on a common form of life: sociality presupposes conviviality. (2) AI systems and robots do not belong to this shared form of life, since they do not have a vital and thus phenomenal embodiment. (3) The approximation of robots to living beings (in the sense of so-called ‘Artificial Life’) fails because living beings represent autopoietic systems with a developmental history, which are not accessible to biological engineering.

3.1 Conviviality as the basis of social understanding

Turing already argued that we have no reason to deny subjective states such as beliefs and desires to an AI system, provided its performance is equivalent to that of humans. The insistence on human subjectivity, he held, would be based only on our own experience, not on that of others, and would therefore amount to „the solipsist point of view" (see above, 2.2). However, our assumption that other humans (as well as other higher animals) are conscious is by no means based on solipsism and inference. Subjectivity is not something we first suspect in others and then attribute to them if there are sufficient signs for it, as the Theory of Mind assumes (Gallagher, 2001). Rather, we perceive others from the outset as embodied participants in a common form of life, in which we do not merely infer selfhood from signs but always already presuppose it.Footnote 8 This intercorporeal perception is bound up with our common aliveness, embodiment and life history. We share with others the existential facts of being born and growing, the need for air, food and warmth, waking and sleeping, and, last but not least, mortality; and this is the common background against which we also interpret all their verbal utterances. Whatever does not belong to this form of life – i.e. artifacts such as computers or robots – is not subject to the implicit assumption of subjectivity; mere similarities of performance are not sufficient for its attribution.

So if future AI systems or robots are one day able to pass the Turing test, it is not their cognitive performance that should make us believe in a conscious being. Rather, our everyday sharing of emotions and intentions with others presupposes a sharing of life. Whatever can feel hunger, thirst, pleasure or pain, joy or suffering, so that we can empathize with these states, must be of our kind in the broadest sense, that is, a living being belonging to our species or to another species whose expressions of emotion and striving are sufficiently similar to ours. Whatever thinks and considers must also have an awareness of its thinking, and thus again be a self-sensing, living being. And whatever speaks to us must be able to give expression to an inner experience, so that a „we-intentionality" emerges. In short, the perception of others as conscious beings is based on the presupposition of a common form of life that enables us to share our experience – on our „conviviality".Footnote 9

Candidates for an attribution of subjectivity must therefore be of our kind: embodied, moving spontaneously and purposefully, expressive and alive. Could humanoid robots or androids fulfill this requirement? As yet, neither AI systems nor robots convincingly convey the impression of aliveness. However, the implicit presupposition of conviviality might change as we increasingly interact with AI systems. We might be persuaded that while we are not dealing with bodily beings whose life form we share, we are dealing with sentient and experiencing systems of a different kind. Empathy would then decouple from conviviality without succumbing to mere anthropomorphism. Is there, then, the prospect of some form of AI that could make the ontological claim to possess subjectivity and experience such that we can actually understand it empathically? Might future humanoid robots not only simulate life but actually come alive, so that we would rightly transfer our empathy to them without being subject to an illusion?

3.2 Robot functionalism versus vital embodiment

That robots are increasingly capable of simulating certain life functions, sensorimotor functions in particular, is undeniable. Operational mobility and interaction with the environment enable advanced robots to achieve forms of feedback and adaptation that go beyond the capabilities of stationary learning systems. Integrated self-models allow today's robots to localize themselves in space, register the results of their behavior in the environment and modify their own programs accordingly.Footnote 10 This suggests what Sharkey & Ziemke (2001) have termed „robot functionalism": the view that a robot with bodily structures and interaction patterns similar to those of human beings could develop intrinsic intentionality or even self-awareness.
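Computationally, such a self-model can be pictured roughly as follows (a hypothetical, minimal sketch; the state variable and the correction rule are invented for illustration): the robot keeps an internal estimate of its own position, updates it from sensed outcomes, and derives its next motor command from the discrepancy with its goal.

```python
# A rough, hypothetical sketch of a robot "self-model": an internal estimate
# of the robot's own position is compared against sensed outcomes and used
# to correct subsequent behavior. Variables and the simple correction rule
# are invented for illustration only.

from dataclasses import dataclass

@dataclass
class SelfModel:
    estimated_x: float = 0.0   # where the robot "thinks" it is

    def update(self, sensed_x: float) -> None:
        # fold the sensed result of behavior back into the self-model
        self.estimated_x = sensed_x

def next_step(model: SelfModel, goal_x: float, sensed_x: float) -> float:
    """Choose the next motor command from the gap between goal and self-model."""
    model.update(sensed_x)                # register the result in the environment
    error = goal_x - model.estimated_x
    return max(min(error, 1.0), -1.0)     # bounded correction toward the goal

if __name__ == "__main__":
    m = SelfModel()
    print(next_step(m, goal_x=3.0, sensed_x=0.4))   # e.g. 1.0: move toward the goal
```

The next paragraph asks whether this additional feedback loop over an internal state estimate amounts to anything like self-awareness.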

But the self-modeling of a robot is not, as is often assumed, a kind of self-awareness. The additional feedback loop, which comes about through an internally generated self-model, does not entail conscious self-reference; for this, the robot would have to perceive its self-model and recognize it as itself, as with a mirror image. This means, however, that it would already have to have a basal, pre-reflective self-consciousness, which for its part could not be generated by self-modeling – otherwise one would end up in an infinite regress.Footnote 11 Neither sensorimotor embodiment nor self-modeling is therefore sufficient for subjectivity. Instead, what is crucial is vital embodiment, which, from an enactivist perspective, is the basis of primary self-awareness and thus of the continuity of life and mind (Jonas, 1966; Thompson, 2007; Fuchs, 2018, 2020).

Conscious experience, from this point of view, is neither a model of the world nor a model of the self located inside the brain,Footnote 12 but primarily an activity of the whole organism in which its current homeostasis manifests itself. The emergence of experiencing is tied to the requirement of living beings to maintain themselves in a precarious equilibrium in exchange with their environment, which is made possible by metabolism (Jonas, 1966). Deviations from homeostasis must be registered and responded to by appropriate adaptive behavior toward the environment if the living being is not to perish (Di Paolo, 2009; Di Paolo, 2018). In higher animals, this happens by feeling values that integrally reflect the state of homeostasis in its ups and downs. „The source of feeling is life on the wire, balancing its act between flourishing and death” (Damasio, 2018, 20). Thus, the maintenance of homeostasis, i.e., the internal milieu and with it the viability of the organism, is the primary function of consciousness; this manifests itself in the phenomena of drive, hunger, thirst, displeasure, or satisfaction and pleasure. Consciousness, therefore, does not arise first in the cortex, but results from ongoing vital regulatory processes involving the whole organism, which are already integrated in the brainstem and midbrain centers (Panksepp, 1998; Damasio, 2010; Fuchs, 2018). In this way, a bodily-affective self-experience emerges, namely the feeling of life with its various states of pleasure and displeasure, which, as basic subjectivity, underlies all higher mental functions. One can also express it as follows: all experiencing is a form of life; without life there is no subjectivity (Fuchs, 2018: 78, 94).Footnote 13

In the same way, the emotions are also tied to the constant interaction of brain and body. Moods and feelings always involve the entire organism: brain, autonomic nervous system, heart, circulation, respiration, intestines, muscles, facial expressions, gestures, and posture. Every emotional experience is inseparably linked to changes in this body landscape (Fuchs & Koch, 2014).Footnote 14 An AI system, however, does not have a biological body and thus cannot have feelings. And of course, every cognition, perception and action is also mediated by the living body, realized through the interactions of brain, organism, and environment – through functional circuits in which our senses and limbs as well as things and other people are involved (Chiel & Beer, 1997; Sharkey & Ziemke, 2001).

The brain is capable of integrating all these organismic functions – but only within a continuous resonant loop, or a „functional fusion" of brain and body (Damasio, 2010, 273).Footnote 15 It is not a control center that receives information and issues commands, but part of the functional whole of body and environment. All these living processes and integrating functions are of a biological and biochemical nature and therefore cannot be replicated even by highly complex computers or AI-based robots. Robotic sensors, actuators and digital self-models represent only a „mechanistic embodiment" (Sharkey & Ziemke, 2001) that is superficially similar to the human body and its functions. Without a biological body in metabolic exchange with the environment, the prerequisite for basal self-awareness, and thus also for higher-order consciousness, is missing.

3.3 Autopoietic versus Artificial Life

Now, in robotics we are not only dealing with the simulation of expressions of life but increasingly also with the mimicking of adaptation, learning and development of the kind that characterizes the ontogeny and life course of higher organisms. Robots equipped with artificial neural networks are able to „learn" from interactions with their environment, for example by reinforcement learning or evolutionary adaptation techniques (generation of new behavioral variants, selection and implementation of successful variants). Their behaviors are no longer determined solely by pre-programmed rules but by a „memory" of their interactions. Thus, one also speaks of „evolutionary robotics" or „Artificial Life" (Ziemke & Sharkey, 2001; Kim & Cho, 2006; Bongard, 2013). Are we now dealing with the transition to technically generated living beings, to which we would have to ascribe something like self-preservation, self-development, and purposefulness, at least in principle?
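Before turning to that question, it may help to see concretely what the „generation of new behavioral variants, selection and implementation of successful variants" amounts to. It follows the familiar variation-and-selection scheme of evolutionary algorithms; the following minimal sketch (the toy fitness function and all parameters are invented for illustration) shows how little this kind of „learning" involves beyond scored trial and error.

```python
# A bare-bones evolutionary loop of the kind used in evolutionary robotics:
# mutate behavioral parameters, keep the variants that score best on an
# externally defined fitness measure. The toy fitness function and all
# parameters are invented for illustration.

import random

def fitness(params: list[float]) -> float:
    # toy stand-in for "task performance in the environment"
    return -sum((p - 0.7) ** 2 for p in params)

def mutate(params: list[float], sigma: float = 0.1) -> list[float]:
    return [p + random.gauss(0.0, sigma) for p in params]

def evolve(generations: int = 50, pop_size: int = 20) -> list[float]:
    population = [[random.random() for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        # selection: keep the better-scoring half ...
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        # ... variation: refill the population with mutated copies
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in range(pop_size - len(survivors))]
    return max(population, key=fitness)

if __name__ == "__main__":
    print(evolve())   # parameters drift toward the toy optimum (0.7, 0.7, 0.7)
```

The parameters drift toward whatever the externally defined fitness measure rewards; the system itself has no stake in the outcome.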

The reasons for the fundamental distinction between living beings and machines have been pointed out repeatedly (von Uexküll, 1973, 1982; Maturana & Varela, 1980; Zlatev, 2003; Sharkey & Ziemke, 2001), and I will only mention the most important arguments here. The central difference is undoubtedly the autopoietic organization of living beings, which implies a special, reciprocal relationship between parts and whole (Varela, 1997). The organism as a whole makes possible the existence of the parts, cells and organs of which it is itself composed. It produces and reproduces the parts, which in turn, through their interaction, enable the persistence of the organism. Self-preservation therefore means self-reproduction: the living system separates itself from the environment by a semi-permeable membrane, which at the same time enables the metabolic exchange that the system requires for constant self-transformation, down to its smallest parts. The living being thus exhibits a fluid, dynamic process form: it continuously incorporates and assimilates new matter, i.e. subjects it to its own form and purpose.

In contrast to the autopoiesis of organisms, robots are allopoietic machines: they do not manufacture themselves, but are designed as an external assembly of inanimate, rigid individual elements (Maturana & Varela, 1980). As von Uexküll (1982) put it, they are built centripetally (the parts are first produced, then combined according to the designers' blueprint), whereas the construction of an animal is centrifugal, „from the inside out." Living beings develop from simple cells by self-differentiation and growth, in continuous metabolism, so that all parts form an indivisible unit (Sharkey & Ziemke, 2001; Ziemke, 2016). Artificial systems, on the other hand, may be able to incorporate available materials into their structures, but they do not assimilate and transform them, because they have no metabolism – they only need to recharge their batteries from time to time. Likewise, their adaptation or „learning" processes relate only to their functional program, not to their structure and shape. Since artifacts do not undergo autonomous growth and development processes, they cannot die either, but only become defective (Fuchs, 2021).

Thus, the term Artificial Life ultimately proves to be a misnomer. There is no artificial life, because life is per se not something produced but autopoietic, self-effected and self-developing. Artificial life could therefore at best be life induced by humans: namely by providing all the conditions that must be fulfilled for life to spontaneously emerge and organize itself. But that would not be the production of living things themselves. Even „artificial life” would have to organize itself, develop by itself, and would thus no longer be artificial.Footnote 16

Aliveness is also the prerequisite for feeling and sensing, which we presuppose in every empathic understanding of others, because it is through feelings that the living being attributes meaning to its environment for the sake of its homeostatic self-preservation. This meaningfulness manifests itself in the values – the attractive or aversive qualities – which the feeling animal discovers in the environment (Zlatev, 2003). Meaningfulness or sense-making is thus originally tied to relevance for self-preservation, that is, to the living individuation of an autopoietic system.Footnote 17 An artificial system, on the other hand, has no inherent concern for its self-preservation; it does not care about anything, and so it cannot feel anything, neither pleasure nor suffering: „… the precariousness that grounds the concern inherent in living existence has no counterpart in a computer simulation whose entities are purely logical and hence essentially immortal" (Froese & Taguchi, 2019: 3).Footnote 18

Finally, aliveness is also the basis for the development of differentiated human emotions such as shame, pride, guilt, compassion, etc., that are directed toward more complex, particularly social situations and their values (Barrett, 2005; Klimecki, 2015; Vaish, 2018). These emotions, while no longer aiming at mere survival, nevertheless stem from the biological and psychological history of the individual. Lived, embodied experiences are the basis for a person's emotional life. Moreover, socialization in early childhood also provides the implicit knowledge of intercorporeality as well as the shared background or commonsensical knowledge that AI systems lack (Caminada, 2014; see above, 2.2). The history of robots is quite different: human designers have installed the functional states that underlie their behavior (Hofmann, 2018), and the adaptations they might undergo as „learning" systems are not based on any lived experience they might consciously remember.

Even if their programs are embodied in a weak sense, i.e. can perform sensorimotor interactions with the environment, robots lack the vital embodiment that characterizes living beings. And even if their programs can adapt to interactions and environments by means of artificial neural networks, they remain allopoietic machines that do not sustain themselves or evolve by themselves through metabolism and growth. Thus, they also lack the prerequisites for the experience of values and meaningfulness. No matter how perfectly they may simulate feeling, perceiving and thinking in the future – if we believe that we can understand them empathically, we are laboring under an illusion. There can be no „shared sense-making" with robots, because this presupposes shared living, or conviviality.

4 The perils of simulation

Even if there can be no AI endowed with subjectivity, sensation or intentionality, and if the simulation of life functions, however perfect, cannot generate consciousness – the advances in simulation technology will not fail to have an effect. The anthropomorphism inherent in our perception and thinking tempts us all too readily to attribute human intentions, actions, and even feelings to our machines. This „digital animism” is already beginning to spread today – either because the categorical difference between subjectivity and its simulation is no longer understood, or because it increasingly appears unimportant. The more frequent and varied the interactions with artificial agents become, the more likely it is that implicit attribution of intentions will emerge (Papagni & Koeszegi, 2021). The as-if-consciousness usually associated with anthropomorphism toward inanimate objects then gives way to illusory understanding. That AI systems supposedly already „think,” „know,” „plan,” „predict,” or „decide” paves the way for boundary dissolutions, of which Hans Jonas already warned:

There is a strong and, it seems, almost irresistible tendency in the human mind to interpret human functions in terms of artifacts that take their place, and artifacts in terms of the replaced human functions. […] The use of an intentionally ambiguous and metaphorical terminology facilitates this transfer back and forth between the artifact and its maker. (Jonas, 1966: 110)

Such a dissolution of the categorical differences between subjectivity and its simulation could have far-reaching consequences. Engaging with artificial systems will then increasingly take the place of human relational experiences. If a cuddly robot called „Smart Toy Monkey" is supposed to serve as a friend to small children and thereby promote „social-emotional development";Footnote 19 if friendly nursing robots replace the human care of dementia patients and supposedly listen to their stories (Maalouf et al., 2018); or if patients are prescribed programmed online psychotherapies that save them having to see a therapist (Stoll et al., 2020) – then machines become fake subjects or „relationship artefacts", as Turkle (2011) has put it. They cheat people out of real communication.

Sharkey and Sharkey have argued „…that a deception can be said to have occurred in robotics if the appearance and the way that a robot is programmed to behave, creates, for example, the illusion that a robot is sentient, emotional, and caring or that it understands you or loves you” (Sharkey & Sharkey, 2021: 311). It should therefore be one of the basic ethical requirements for AI systems that they identify themselves as such and do not deceive people who are dealing with them in good faith. Nor should they use emotional language such as „I care”, „I like you”, „I’m sad”, etc. This is particularly true in the areas of child rearing and care of the elderly, where those affected are not yet or no longer able to make the distinction between original and simulation (Epley et al., 2007).

As one example, consider the possible consequences in the field of psychotherapy, where this distinction is certainly important for those affected. Here, mental health apps, virtual psychotherapists and chatbot therapies are increasingly taking the place of trained mental health professionals. Well over 10,000 mental health apps are already available for download on the market (Cabibihan et al., 2013). Particularly relevant for psychotherapy are „conversational chatbots" that conduct a speech-based dialog with humans via an interactive interface. They can imitate a therapeutic conversational style, simulate empathy, and thus create an interaction that sometimes cannot be distinguished from real interventions, even by experts (Fitzpatrick et al., 2017; Inkster et al., 2018; Bendig et al., 2019).
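How little machinery a seemingly empathic conversational style requires can be indicated with a minimal pattern-matching responder (a hypothetical sketch; all patterns and canned replies are invented, and real systems are of course far more elaborate): it simply mirrors the user's own wording inside therapeutic-sounding templates.

```python
# A minimal, hypothetical pattern-matching "empathic" chatbot: it mirrors the
# user's wording inside canned therapeutic phrases. All patterns and replies
# are invented for illustration; the point is that an empathic-sounding style
# requires no understanding at all.

import re
import random

RULES = [
    (r"i feel (.+)", ["Why do you feel {0}?",
                      "It sounds hard to feel {0}. Tell me more."]),
    (r"i am (.+)",   ["How long have you been {0}?",
                      "Do you often think of yourself as {0}?"]),
    (r".*",          ["I see. Please go on.",
                      "That sounds important to you."]),
]

def reply(user_input: str) -> str:
    text = user_input.lower().strip().rstrip(".!?")
    for pattern, templates in RULES:
        match = re.fullmatch(pattern, text)
        if match:
            return random.choice(templates).format(*match.groups())
    return "Please go on."

if __name__ == "__main__":
    print(reply("I feel lonely these days."))
    # e.g. "Why do you feel lonely these days?"
```

The reply sounds attentive, yet nothing in the program registers loneliness as anything but a captured substring – which is why Weizenbaum's ELIZA, discussed below, already produced the same effect decades ago.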

One might assume that users of virtual psychotherapies who are educated about the nature of the intervention maintain an „as-if" consciousness that avoids any illusion of being understood. However, this assumption is premature: users tend to endow technical systems with human-like characteristics very quickly. This is called the „Eliza effect", after the computer program with which Joseph Weizenbaum simulated a therapist as early as the 1960s (Weizenbaum, 1966; Cristea & Sucalǎ, 2013). The Eliza effect was confirmed in a recent study with the conversational agent Woebot, which supports patients in coping with bereavement or depression (Fitzpatrick et al., 2017). Based on learning networks, Woebot provides seemingly understanding responses, empathic affirmations and encouragements that are deceptively similar to a real interaction. The study showed that users (n = 36,070) established personal bonds with Woebot that were similar to those in face-to-face cognitive-behavioral therapies (Darcy et al., 2021). Though they were informed that Woebot was not a real person, patients endorsed phrases such as the following as frequently as with regard to real therapists:

I believe Woebot likes me. – Woebot and I respect each other. – I feel that Woebot appreciates me. – I feel Woebot cares about me even when I do things that it does not approve of. (Darcy et al., 2021)

It becomes apparent that susceptibility to „digital animism" and readiness to abandon the „as-if" are high among Woebot's users. Their emotional distress and neediness can reinforce the general tendency toward anthropomorphism.Footnote 20

The application of AI systems in psychiatry and psychotherapy is often justified with the prospect that they could help reach underserved populations in need of mental health services and promote patients' self-management skills (Blease et al., 2020). The evidence for perceived social support through chatbots is so far inconclusive, but many users seem to appreciate the availability and anonymity of such contacts (Wezel et al., 2020). Yet it is obvious that these systems also blur the boundaries between reality, simulation, and fiction, with potentially problematic consequences. For example, the omission of face-to-face interaction in online communication generally favors the projection of feelings onto the virtual counterpart (Fuchs, 2014). Thus, there is a risk of transferring emotions, expectations, and (often unfavorable) relationship patterns to the chatbot (Fiske et al., 2019). Unlike the relationship with a real therapist, however, there is no person on the other side of this transference. The projections cannot be perceived by a counterpart, mirrored, and resolved in a professional way.

A fortiori, the complex work of hermeneutic understanding cannot be done by an AI apparatus. No machine can see through the patient's behavior in its contrasts between speech and action or in its latent conflicts, recognize the meaning of symptoms against the background of the patient's life situation, and draw conclusions from this. The dialogue with the robot remains on the surface; it can be momentarily pleasant and supportive, but never insightful in the psychotherapeutic sense. Ultimately, the patient remains alone with himself; his need for a trusting relationship, as reflected in the statements quoted above, remains unfulfilled, because such a relationship is only feigned by the speech apparatus. He may feel understood, but there is no one who understands him.

5 Conclusion

Advances in simulations make it necessary to clarify the categorical differences between human and artificial intelligence, as well as between living beings and artificial systems. In this paper, I have explored whether we can meaningfully talk about communicating with, feeling empathy for, and understanding AI systems or robots. The result is clear: notions of communication, understanding, and empathy necessarily demand a counterpart endowed with subjectivity, an embodied person with whom we are connected in conviviality. The involuntary anthropomorphism that arises in our perception of AI systems should not deceive us, for it is typical of life-like and expressive objects that we know for certain do not possess subjectivity. Advances in simulation make it increasingly difficult for us to shake off the illusion of a subjective counterpart when dealing with AI; but that is no reason to abandon the distinction between subjectivity and its simulation as such. Rather, it is a reason to strive for a precise use of terms that avoids category errors whenever possible.

I have therefore examined the concept of understanding more closely and shown why it cannot be applied to our interaction with artificial systems and robots. In the empathic sense, we can only understand what has sensations and feelings – and robots have no feelings. Likewise, in the semantic sense, we can only understand what wants to communicate with us and in turn understands us, that is, what is able to enter into a shared or „we-intentionality”. Understanding thus requires not only a transfer of information, or a suitable linking of symbols into syntax, but also an actual experience of meaningfulness and an intertwining of intentions – understanding someone, not just something. As I have shown further, this in turn presupposes belonging to a common form of life, or conviviality.

Against the assumption that future AI systems or robots could actually develop a kind of subjectivity, consciousness, or aliveness beyond their increasingly perfect simulation, I have outlined an embodied and enactive view of mind and life. Subjectivity, according to this view, is not a mere product of information processing in the brain, but is tied to the selfhood of an autopoietic organism that maintains itself in demarcation and exchange with the environment. Vital embodiment is the primary basis of experience, presupposing the biological processes of homeostasis, metabolism, growth, and cell differentiation, among others. This basis cannot be replaced by an allopoietic machine, however complexly its programs and feedback loops are designed. This also applies to sensorimotor robots, which can model their own state and feed it into their programs, or adapt their behavior through artificial neural networks, but lack the vital embodiment required for subjectivity.

Despite these categorical differences, it can be assumed that the human tendency toward anthropomorphism will be difficult to curb in view of the increasing lifelikeness of AI agents. It is likely to produce a „digital animism" that increasingly blurs the distinction between subjectivity and its simulation. I have illustrated the associated dangers using the example of virtual psychotherapies. The dangers lie above all in the tendency toward projective empathy (Fuchs, 2014), i.e., the transfer of feelings, expectations, and hopes onto quasi-subjects with whom there can be no real conviviality or we-intentionality. In this way, such systems suggest a trusting relationship and understanding, with the risk that their users miss out on beneficial human interactions.

How can these tendencies be countered? First of all, it seems necessary to reject the imprecise use of language that blurs the categorical and ontological differences between subjectivity and simulation, the animate and the inanimate, the artificially produced and the naturally developing. This implies the preferential use of terms such as „simulated intentionality", „seemingly expressive behavior" or „simulated social interactions" for artificial systems (Papagni & Koeszegi, 2021). Second, AI systems should be required to remain transparent as such, i.e., they must not systematically deceive humans about their simulation of subjectivity or aliveness. Otherwise, they create a pseudo-community that cheats subjects out of real interaction. Third, there is a need for a new awareness of what embodied interactions and empathic relationships mean to us as social beings. Valuing and nurturing these relationships, rather than increasingly replacing them with virtual quasi-encounters, is likely to become particularly important in an increasingly digitalized lifeworld.