1 Introduction

Advances in artificial intelligence (AI) are leading to fundamental changes in how people communicate. One of the most profound shifts is that social AI, or communicative AI, enables us to communicate with machines and technologies as communicators (so-called human-machine communication, or HMC), instead of technology merely acting as a medium in human-human communication (e.g., Gunkel 2020; Guzman and Lewis 2020).

This new type of communication does not neatly fit into previous theoretical frameworks developed within communication science that focus on human-human communication or computer-mediated communication. Guzman and Lewis (2020) stated that this brings about theoretical challenges, and formulated an HMC research agenda in which they identified three key aspects of communicative AI technologies: 1) the functional dimensions through which people make sense of these current-day AI-enabled communicators, 2) the relational dynamics that unfold between humans and these technologies, and 3) the metaphysical implications that arise from the blurring ontological boundaries regarding humans, machines, and communication. They also put forth that the Computers as/are Social Actors (CASA) paradigm (e.g., Nass and Moon 2000) and the media equation (Reeves and Nass 1996) have been very influential in scholarly thinking about the interactions between humans and machines (see also Fortunati and Edwards 2021; Gambino et al. 2020).

The three concepts that the present study focuses on—source orientation, anthropomorphism, and social presence—all originate in, or are related to, this influential CASA framework, and are central to the three strands of research that Guzman and Lewis (2020) identified. The three concepts are crucial for understanding users’ entity perceptions when interacting with AI-enabled communicators such as chatbots, and will continue to play a pivotal role in present-day research on human-chatbot communication. However, there are also problems regarding the three conceptualizations and the related measurements, among other reasons because the concepts are not always clearly distinguishable from each other and the measures sometimes overlap. The present qualitative interview study analyzes users’ perceptions of their interactions with chatbots through the lenses of source orientation, anthropomorphism, and social presence, in order to unravel how these three concepts—each in its unique way—can help us understand human-chatbot communication.

This analysis first sets out to make a conceptual contribution. The analysis shows how each of the three concepts is helpful in understanding users’ experiences with chatbot communication, and how the concepts can be meaningfully distinguished from each other. In the discussion section, it is specifically argued that future chatbot research should also take source orientation into consideration—in addition to anthropomorphism and social presence, which have gained much more attention in HMC research thus far. Second, the study delves into how users respond to the measurements that are typically used to tap into the three concepts, thus providing necessary background information for chatbot researchers who include these measurements in their chatbot effect studies. The discussion section encourages chatbot researchers not to keep employing the widely-used measures uncritically, but to—as a new HMC research community—question and refine them.

The focus on customer service chatbots is justified because these are among the most relevant HMC examples that individuals are currently confronted with in their everyday lives (Araujo et al. 2020). Chatbots are software-based systems that are intended to interact with humans using natural language (e.g., Feine et al. 2019), and they are increasingly used in customer service. Customers type their queries in a dialogue screen, or click on buttons, and receive answers in natural language (van der Goot and Pilgrim 2020). The relative importance of customer service in daily life (Følstad and Skjuve 2019), and the rapid push toward chatbots from the side of organizations (Feine et al. 2019), make customer service chatbots one of the most ubiquitous AI-enabled communicators.

2 Literature review

For each of the three concepts, the current section traces the origins of the concept, makes clear how it relates to the CASA paradigm, outlines how the concept plays a role in recent HMC and chatbot research, and introduces research questions for the interview study.

2.1 Source orientation

Reeves and Nass devoted a chapter to this concept in their foundational book on the media equation (1996) and wrote that people orient themselves toward the source that is the most proximate, e.g., the news anchor instead of the persons who wrote the news item, and the computer itself instead of a more distant programmer (see also Solomon and Wash 2014). In a journal article related to this chapter, Sundar and Nass (2000) reported findings of two typical CASA experiments in which they told participants they were working with a computer, a programmer, or a networker (i.e., another type of human interactant). Because they found differences in social responses to the computer versus the human conditions, they concluded that normally people do not orient themselves to an unseen programmer or imagined person in another room, but to the computer itself as an independent source of information.

This earlier work focused on human-computer interactions, but by now the concept “source orientation” has found its way into research on communicative AI and HMC. Guzman (2019) defined source orientation as “who or what people think they are communicating with” (p. 344) and conducted a qualitative interview study in which she examined source orientations when people use voice-based virtual assistants in mobile phones such as Siri. In doing so, her method distinguished itself from the earlier CASA experiments, since she explicitly asked users about their conceptualizations, whereas Reeves and Nass (1996) emphasized that automatic processes are at play that cannot be detected by self-reports. In the interviews, Guzman found that users indeed oriented themselves toward the technology itself, instead of thinking they were interacting with a human. But in this specific case there were two distinct source orientations: users felt they were communicating with the mobile device (i.e., they heard the voice of the machine) or with an assistant separate from the mobile device (i.e., they heard the voice in the machine).

However, a crucial and distinctive characteristic of text-based chatbots is that users may mistakenly think they are engaging in a one-on-one conversation with a human being. This is a key element in the famous Turing test, which plays a hugely influential role in the field of AI (Gunkel 2020, p. 31) and takes center stage in the annual Loebner Prize competition (Christian 2011). In this test and its annual competition, the question is whether chatbots can be made so natural that the judges, or the audience, are fooled into believing they are communicating with a human being. The fact that some people may think they are engaged in a one-on-one conversation with a human being stands in stark contrast to the fundamental notion of the CASA paradigm that people orient themselves to the technology itself, while knowing that the entity does not warrant human treatment or attribution (i.e., “ethopoeia”, Nass and Moon 2000, p. 94). In the case of chatbot communication, users may not know this.

To delve into this further, the present interview study aims to find out whether people—when using a customer service chatbot—think they are communicating with a human being or not. And if not, what is the source they think they are communicating with? And what do they base their source perceptions on? In short, the study sets out to answer the following question:

RQ1:

Who or what do people think they are communicating with when using customer service chatbots?

2.2 Anthropomorphism

A widely-used definition of anthropomorphism reads: “attributing humanlike properties, characteristics, or mental states to real or imagined nonhuman agents and objects” (Epley et al. 2007, p. 865). These nonhuman agents can be anything that acts with apparent independence, for instance nonhuman animals, gods, and electronic devices. Epley et al. state that this phenomenon is surprisingly common; it is a human tendency. But this tendency is not invariant: it depends on dispositional, situational, developmental, and cultural influences.

Along these lines, in CASA research, anthropomorphism has been defined as “the assignment of human traits and characteristics to computers” (Nass and Moon 2000, p. 82). In this article, which is one of the fundamental works within the paradigm, Nass and Moon presented a series of experiments to show that individuals indeed mindlessly apply social rules and expectations to computers. The studies showed that people overuse human social categories such as gender and ethnicity, and that they engage in social behavior such as politeness and reciprocity. Importantly, Nass and Moon (2000) stated very explicitly that anthropomorphism could not explain these results. They wrote that anthropomorphism involves the thoughtful, sincere belief that the object has human characteristics (p. 93). Subsequently they concluded that this process does not apply to social interactions with computers, because users are very well aware that the computer is not a person and does not warrant human treatment or attribution.

In a direct response to this article, Kim and Sundar (2012) challenged the idea that anthropomorphism is a thoughtful and conscious belief. They noted that anthropomorphism can occur mindlessly as well as mindfully. In their recent extension of the CASA paradigm, called the Media Are Social Actors (MASA) paradigm, Lombard and Xu (2021) also made a plea for considering mindless and mindful anthropomorphism as two major complementary mechanisms that help to understand people’s social responses to technology. Mindless anthropomorphism refers to intuitive and spontaneous social responses (Lombard and Xu 2021) or the unconscious ascription of human traits such as thoughtfulness or politeness to an artificial entity (Schuetzler et al. 2020). In contrast, mindful anthropomorphism is the deliberate and thoughtful attribution of human characteristics (Lombard and Xu 2021), and these thoughtful assessments of how humanlike an agent is are akin to the famous Turing test (Schuetzler et al. 2020). Depending on the social cues present in the technology, individual factors, and contextual factors, either mindless or mindful anthropomorphism may be activated.

Kim and Sundar (2012) used two measures to assess whether the tendency to treat computers as human beings is conscious (mindful) or non-conscious (mindless). Mindful anthropomorphism was assessed by asking participants directly whether they perceived the website as humanlike/machinelike, natural/unnatural, or lifelike/artificial (Powers and Kiesler 2006). Mindless anthropomorphism, on the other hand, was measured by asking participants how well the adjectives likeable, sociable, friendly, and personal described the website. Kim and Sundar stated that their study provided evidence for mindless anthropomorphism, but not for mindful anthropomorphism.

These two measures of anthropomorphism found their way into recent experimental work on the effects of chatbot communication (e.g., Araujo 2018; Ischen et al. 2020; Zarouali et al. 2020), in which they were used as mediators. This limited number of studies, assessing the effects of different aspects of chatbot communication, has not yet produced an unequivocal picture of the roles that mindful and mindless anthropomorphism play in chatbot communication.

In line with Kim and Sundar’s (2012) findings, Zarouali et al. (2020) found a differential effect on mindful versus mindless anthropomorphism: mindless anthropomorphism was higher for a (news) chatbot compared to a website, whereas mindful anthropomorphism was lower. This differential pattern was not apparent in Araujo’s (2018) study, in which a chatbot with anthropomorphic cues (i.e., humanlike language or a name), compared with a chatbot without such cues, led to higher levels of both mindless and mindful anthropomorphism. The study by Ischen et al. (2020) did not find a differential effect either: mindful anthropomorphism and mindless anthropomorphism (and social presence, see below) were actually highly correlated. Therefore, they only included mindful anthropomorphism as a mediator, and found that the chatbot (compared with an interactive website) did not score higher on mindful anthropomorphism.

In this type of chatbot research, other operationalizations of mindful and mindless anthropomorphism also occur. For instance, Schuetzler et al. (2020) operationalized mindful anthropomorphism as “perceived humanness—the degree to which a person believes that a conversational agent might be human” (p. 880), and asked the following question: “My chat partner was: ‘definitely computer,’ ‘probably computer,’ ‘not sure, but guess computer,’ ‘not sure, but guess human,’ ‘probably human,’ and ‘definitely human’”. As such, the answers resembled a Turing test (e.g., Proudfoot 2011), and in the current literature review this would be considered a measure of source orientation. Mindless anthropomorphism was operationalized as the level of engagement a user feels with the conversational agent (p. 880), which was measured by asking participants to rate the chat partner’s skill, politeness, engagement, responsiveness, thoughtfulness, and friendliness. The study found that chatbots with increased conversational skills led to higher scores on both mindful and mindless anthropomorphism.
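Since the discussion below encourages researchers to scrutinize these measures, the following sketch may help to make the differences between the operationalizations concrete. It is a minimal, hypothetical illustration (the item names, scale ranges, and example responses are assumptions for demonstration, not the original instruments): the semantic-differential items for mindful anthropomorphism and the adjective ratings for mindless anthropomorphism are typically averaged into composite scores, whereas the Turing-style humanness question yields a single ordinal judgment about the source.

```python
from statistics import mean

# One participant's hypothetical responses (item names, scale ranges, and
# values are illustrative assumptions, not the original instruments).
responses = {
    # Mindful anthropomorphism: semantic-differential items, here scored from
    # 1 (machinelike/unnatural/artificial) to 7 (humanlike/natural/lifelike)
    # (cf. Powers and Kiesler 2006).
    "mindful": {"machinelike_humanlike": 3, "unnatural_natural": 4, "artificial_lifelike": 3},
    # Mindless anthropomorphism: how well each adjective describes the agent,
    # here on a 1-7 scale (cf. Kim and Sundar 2012).
    "mindless": {"likeable": 6, "sociable": 5, "friendly": 6, "personal": 4},
    # Perceived humanness: a single Turing-style item (cf. Schuetzler et al. 2020),
    # coded 1 ("definitely computer") to 6 ("definitely human").
    "perceived_humanness": 2,
}

def composite(items):
    """Average the item scores into one scale score."""
    return mean(items.values())

print("Mindful anthropomorphism composite:", round(composite(responses["mindful"]), 2))  # 3.33
print("Mindless anthropomorphism composite:", composite(responses["mindless"]))          # 5.25
print("Perceived humanness (single ordinal item):", responses["perceived_humanness"])    # 2
```

Presented this way, it also becomes visible why the single humanness item is read here as a measure of source orientation rather than of anthropomorphism: it asks for a categorical judgment about who or what the communication partner is, not for the attribution of human characteristics.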

The current study closely looks into anthropomorphism in chatbot communication from several angles. First, in the seminal CASA literature (e.g., Nass and Moon 2000), social responses such as using human categories like gender and ethnicity and engaging in social behavior such as politeness and reciprocity were very important. The difference between the computers of those days and current chatbots is that many chatbots are purposefully imbued with anthropomorphic cues such as gender and ethnicity. The present interviews give insight into the ways in which these cues find their way into how participants talk about the chatbots, by answering the following question:

RQ2a:

In what ways do anthropomorphic cues such as gender become apparent in how users talk about customer service chatbots?

Second, in contrast to the computers in the original CASA studies, chatbots seemingly engage in one-on-one conversations and are purposefully imbued with social and anthropomorphic cues (e.g., Feine et al. 2019; Go and Sundar 2019; Lombard and Xu 2021). The present study aimed to explore how users perceive these cues. As such, it bore a resemblance to Go and Sundar’s (2019) study on the effects of anthropomorphic visual cues in chatbots, in which the manipulation check was called “perceived anthropomorphism” and included questions on whether the agent’s profile picture looked human, realistic, and like a cartoon. The current study asked:

RQ2b:

How do people evaluate anthropomorphic cues implemented in chatbots?

Third, the interviews set out to explore participants’ thought processes when they provided scores on the widely-used scales of mindful (Powers and Kiesler 2006) and mindless anthropomorphism (Kim and Sundar 2012), thus acquiring more insight into the ways in which these measures actually tap into the concepts they intend to measure. Therefore, the questions are:

RQ2c:

How do people respond to the measurement of mindful anthropomorphism?

RQ2d:

How do people respond to the measurement of mindless anthropomorphism?

2.3 Social presence

Several extensive bodies of literature focus on social presence and work with widely-used definitions. A succinct definition reads: “the sense of being together with another” (Biocca et al. 2003, p. 456). Biocca et al. noted that this other can be either a human or an artificial intelligence. Another widely-cited definition, formulated in an article in the journal Communication Theory, reads: “A psychological state in which virtual (para-authentic or artificial) social actors are experienced as actual social actors in either sensory or nonsensory ways” (Lee 2004, p. 45). Lee added that social presence occurs when technology users do not notice the para-authenticity of mediated humans and/or the artificiality of simulated nonhuman social actors. Another stream of work, focusing on computer-mediated communication and e‑commerce, defined it as “the extent to which a medium allows a user to experience others as being psychologically present” (Gefen and Straub 2003, p. 11, 2004), thus focusing on the characteristics of the medium.

To avoid misunderstandings in light of these other definitions, Lombard and Xu (2021), in their recent work on the MASA paradigm, used the term “medium-as-social-actor presence”. This referred to the idea that when a medium itself presents social cues, individuals perceive it (with or without conscious awareness) not as a medium but as an independent social entity (p. 30). According to Lombard and Xu, the behavioral social responses typically studied in the CASA experiments (assigning gender, being polite, etc.) can be seen as a reflection and indication of users’ medium-as-social-actor presence experiences.

Some of the above-mentioned experiments studying the effects of chatbots (e.g., Araujo 2018; Go and Sundar 2019; Ischen et al. 2020; Schuetzler et al. 2020) looked into both anthropomorphism and social presence. In line with the different definitions of social presence, several measures have been used. Araujo (2018) and Ischen et al. (2020) used an adaptation of the scale in Lee et al. (2006), which was an operationalization of Lee’s (2004) definition of social presence. The items in Ischen et al. (2020) were: “I felt as if it was an intelligent being”, “I felt as if it was a social being”, “I felt as if it was communicating with me”, “I paid attention to it”, “I felt involved with it”, “I felt as if I was alone (reversed)”, and “I felt as if the chatbot was responding to me”. Araujo (2018) did not find an effect on social presence. As said, Ischen et al. (2020) found a high correlation between the two types of anthropomorphism and social presence, and thus proceeded with only mindful anthropomorphism as mediator. The high correlation is not surprising, since the wording of some items for mindless anthropomorphism (“I perceived the chatbot as sociable”) and social presence (“I felt as if it was a social being”) is rather similar.

Go and Sundar (2019) measured social presence with five items adopted from the work of Gefen and Straub (2003), stating that these tapped into the psychological sense of others without co-presence. They did not report the exact items. They found that chatbots with higher message interactivity led to higher social presence, and that social presence mediated the effect of message interactivity on several outcome variables. However, anthropomorphic visual cues and identity cues did not have an effect on social presence.

And, to conclude, Schuetzler et al. (2020) asked about the extent to which users experienced sociability, warmth, personalness, and sensitivity in the interaction. Chatbots with more conversational skills led to more social presence, which in turn led to higher scores on mindful anthropomorphism (i.e., thinking the chatbot is a human) and mindless anthropomorphism (i.e., experiencing the chat partner as polite, engaging, etc.).

Overall, this overview shows that there is overlap between the items that have been used to measure mindful and mindless anthropomorphism and social presence, and that the relationship between anthropomorphism and social presence has been hypothesized and tested in both directions. The present interview study distinguishes between anthropomorphism and social presence with the help of the following working definitions: anthropomorphism is focused on the assignment of human characteristics, whereas social presence is focused on the feeling of being together with a non-human entity. To get a grasp on whether and how people experienced feelings of being together when using a chatbot, the study answers the following question:

RQ3:

In what ways is social presence apparent in how people talk about their experiences with customer service chatbots?

3 Method

3.1 Procedure

The data collection in this interview study was a collaboration between the author of the present paper and an ISO-certified research agency (Footnote 1). The author approached the agency with her research idea and research questions. The author and two researchers of the agency developed the interview guide together. The agency coordinated the selection of the interviewees, using ISO-certified respondent recruitment agencies. The interviewees received monetary compensation for their participation, in line with the normal procedures of the agency. Interviewees were not allowed to have participated in qualitative research in the preceding six months. The selection criterion was that the interviewee should have experience with contacting companies, and a stratified sampling procedure was used taking gender, age, educational level, and household composition into account.

The interviews were conducted by one of the two agency researchers, while the other two researchers were present in the observation room to check whether additional questions had to be asked. The interviews took place on three days in November 2019, at two research locations (in a larger and a smaller city). Each interview lasted one hour and was conducted in Dutch. Participants signed a consent form prior to the interview, and the study was approved by the Ethics Review Board of the author’s university. All interviews were video recorded (as is common practice in the agency), and transcribed verbatim.

3.2 Sample

The sample consisted of 24 interviewees and was purposefully varied in terms of gender (male n = 12; female n = 12), age (18–25 years n = 5; 26–35 years n = 5; 36–45 years n = 5; 46–65 years n = 5; 65–78 years n = 4), educational level (low n = 8; middle n = 8; high n = 8) and household composition.

3.3 Chatbots

We selected nine customer service chatbots that together showed variation on two dimensions: first, chatbots with humanlike versus robotlike characteristics, and second, chatbots for profit versus non-profit organizations (see Table 1). Humanlike characteristics (i.e., anthropomorphic cues) included names (Wout, Fleur, Iris, Nina), visual cues (a picture of a human being or an icon of a cartoonish figure), and a chat environment. Robotlike characteristics included an identity cue/disclosure in the introduction stating that it was a chatbot/digital assistant/virtual assistant, non-human icons (for instance the “initials” of the company name), and the use of buttons instead of free typing. All chatbots were available on the companies’ websites. For each chatbot, we prepared a scenario with a question that the chatbot was able to answer, and one that the chatbot was not able to answer. For example, a scenario for bol.com was: “You ordered a package but it did not arrive yet. You want to know where it is”. The interviewee could interact with the chatbot until they told the interviewer they were ready. The researchers ensured that each interviewee interacted with two different types of chatbots, namely a humanlike chatbot and a robotlike one. Each chatbot was used by around six interviewees.

Table 1 Companies to which the Customer Service Chatbots in the Interview Study belonged

3.4 Interview Guide

The interview guide was slightly adjusted between the three days, and consisted of an introduction and four topics. In the introduction, the interviewer provided information about the agency, the collaboration with the university, and about how the interview would work. She also asked the interviewee to briefly introduce him/herself. Importantly, throughout the interview, the interviewer did not proactively mention the terms chatbot, virtual agent, etc. in any way.

The first topic asked about previous experiences with customer service and communication with companies. After this first topic, the interviewee was asked to interact with one of the preselected chatbots (without mentioning that it was a chatbot). The second topic—i.e., the one used in the current paper—pertained to their experiences during this specific chatbot conversation. At the start, the questions were open-ended and did not ask about specific characteristics of the conversation. The interviewer openly asked “what did you just do?”, “what happened here?”, “how did it go?”, and “how do you feel about this?” The next subtopics focused specifically on the sensitizing concepts.

For source orientation, the interviewer probed into what the interviewee thought he/she was communicating with, by asking “what is behind this?”, “how would you describe it?”, “what happens on the other side?”, and also “what do you base this on?”

For anthropomorphism, the interviewer asked about responses to the anthropomorphic cues, for instance the icon, the picture of a person, or the name. Regarding an icon, the interviewer would ask: “What is this?”, “How does this come across to you?” For mindful anthropomorphism, the interviewer showed a scale on paper ranging from robotlike to humanlike (which is one of the items in Kim and Sundar 2012), and asked the interviewee to put a cross where he/she would situate the chat. Subsequently the interviewer probed into what led to this score. For mindless anthropomorphism, the interviewer asked the interviewee to rate the chats as friendly, social, personal, and intelligent (Kim and Sundar 2012), on a scale from 1 to 10, and probed into what led to these scores.

For social presence, we used a blob tree (see blobtree.com) that can be used to explore feelings. The picture depicted around 50 blobs (purposefully not human characters) situated on and around a tree, for example two blobs sitting next to each other on a branch, two standing hand in hand, one climbing up a ladder or falling from the tree. The interviewer asked which blob(s) best represented the interviewee’s experience with the chatbot, and why the interviewee chose this/these blob(s). The interviewer also asked to what extent the interviewee experienced the online activity as communication and as a conversation. After these questions, the interviewee was invited to use a second preselected chatbot, and subsequently the interviewer asked the questions of topic 2 for this second interaction.

The third and last topic asked company-related questions, such as “what does this type of communication tell you about the company?” (Obviously without calling it communication, if the interviewee did not see it as communication).

3.5 Analysis

For the current paper, the analysis started with a thorough literature review to understand the three sensitizing concepts (Blumer 1954): source orientation, anthropomorphism, and social presence. To be as clear as possible on the working definitions, the author discussed these during several research meetings in her department. The interview guide was developed with these three sensitizing concepts in mind. After the interviews and transcription, all interview transcripts were uploaded into the computer program Atlas.ti. In Atlas.ti, the author conducted open coding, a procedure commonly used as the first step in the grounded theory approach (Braun and Clarke 2013; Charmaz 2014). This step was explicitly guided by the sensitizing concepts, but was open in the sense that the author read the interviews closely, line by line, and added detailed codes to find as much nuance and input as possible per concept. There was no preconceived idea about what the findings per concept would be. During this process, she made notes per concept. After going through all interviews, she used the codes and the notes to write the current result section.

3.6 Techniques to optimize credibility and transparency

Several techniques were applied to optimize the credibility of the findings and the transparency of the research process. An important characteristic of the current analysis is theory triangulation (Braun and Clarke 2013). The author purposefully analyzed the interviews from three different conceptual angles, to gain fine-grained insights into users’ experiences and to be able to compare the concepts with each other. Researcher triangulation (Braun and Clarke 2013) played a role when conducting the interviews, and peer debriefing took place when the author received feedback on the findings during several meetings in her department, from colleagues who read the draft paper, and at an international conference. To enable readers to evaluate the conclusions, the result section aims to provide a thick description (i.e., a rich, detailed and complex account, Braun and Clarke 2013) of the findings per concept. Saturation, the point at which new data no longer generate substantially new ideas (Braun and Clarke 2013), was reached in the sense that conducting new interviews after these 24 interviews (in the same context) would not result in entirely new findings per concept.

4 Results

4.1 Source orientation

This section discusses what the interviews reveal about “source orientation”. In other words, the section answers the question “who or what do people think they are communicating with when using customer service chatbots?” (RQ1).

Source: human being.

Two interviewees thought they were communicating with a human being. During the interview, interviewee 9 (female, 78 years) used the chatbot of the police, named Wout, in which citizens can report noise pollution. When asked who or what answered her questions, she said: “Police officer Wout”. Later in the interview, she thought that maybe it was not a police officer, but an administrative officer. Interviewee 22 (female, 67 years) used a chatbot of company Vitens, and assumed she had been communicating with an employee of this company.

Source: something automated.

All other interviewees (sometimes after some doubt, see below) concluded that they were receiving answers from something automated. These interviewees were clearly not unified in the terms they used to explain what they had been using. Within this variation, we can first distinguish interviewees who referred to some type of conversational agent (although they never used that term) or non-human entity. They used the terms chatbot (or chatbox or chatboy), chatrobot, digital assistant, virtual agent, virtual assistant, or robot. Second, among the interviewees who thought that they had been using something automated, some alluded to the “software”: they mentioned algorithms, a digital program, answers to key words, something preprogrammed, standard answers, or a system. Third, some interviewees hinted at the hardware behind the chatbot, namely a computer, a computer with algorithms, a machine, or a server.

For all respondents who mentioned something automated, the interviewer probed into their understanding of how this worked. The most frequent answer was that it was based on key words. As interviewee 20 (female, 74 years) said: “It is automatic answers. You type something and they select a few words, and then you automatically receive this answer.” Interviewees who identified a conversational agent or non-human entity would say something like “A robot. An automated system that has an answer to every question” (interviewee 3, male, 19 years). Some interviewees mentioned algorithms and key words in one breath, for instance interviewee 1 (male, 64 years): “it is algorithms, that select some words in my question, and then throw out an answer based on that”. And some interviewees used more specific “software-related” jargon. For example, interviewee 13 (female, 26 years) said: “I think it is an RPA package.”

The interviewer also asked them what they based these conclusions on. The most-often mentioned cue that led them to the conclusion that there was something automated on the other side was that the answers came very fast. “The icon of Nina maybe gives the wrong impression, but no, because of the speed I am convinced it is not a real Nina” (interviewee 1, male, 64 years). Other important cues were related to the content of the answers: these were standard answers, the question was not answered, “he” does not understand it, there was a repetition of sentences, or the answers were not humanlike. “I expect that it is a robot because it is fast and because it does not exactly answer my question” (interviewee 20, female, 74 years). Thus, in several cases, a wrong answer was a cue for understanding that it was not a human being on the other side. And the last cue was that some introductions of chatbots explicitly stated “Hello, I am Billy, the chatbot of Bol.com” or “Hello, I am the digital assistant of Waternet”. However, several interviewees indicated that they only paid attention to this cue in response to the interviewer’s questions.

Source: doubt.

During the interviews, all interviewees concluded that they had been receiving answers from either a human being or something automated. However, it is relevant to note that some interviewees expressed doubt regarding their conclusions. The two women who thought they were chatting with a human being started doubting this during their interviews. After probing, interviewee 9 (female, 78 years)—who thought it was a police officer—said “or you mean it is something automated? I know what you mean. That is not possible right?” This seemed to be an interview effect, and it seems most plausible that in real life she would have kept thinking it was a human being.

The interviewees who thought it was something automated were sometimes not sure, until they reached their conclusion. As interviewee 14 (male, 25 years) said: “When it started, it said ‘that is a pity’, so then I thought ‘oh it is probably a person’, but then I thought ‘no, this can never be a person’”. In addition, some interviewees were quite hesitant in their formulations, for instance “it is a virtual agent right?” (interviewee 2, male, 30 years). It showed how they used several cues to come to a conclusion about the communication source.

4.2 Anthropomorphism

This section discusses what the interviews displayed about anthropomorphism in human-chatbot communication, analyzed from four different angles (RQ2a–d).

Social responses in talking about the chatbot.

When talking about the chatbot, interviewees—who concluded there was something automated on the other side—used formulations that could be used for human beings as well. Specifically, anthropomorphic cues such as the name and gender of the chatbot found their way into interviewees’ formulations. In these formulations, interviewees also assigned some agency to the automated other. A few examples: “But Wout is still learning [formulation used in the introduction of the chatbot] […] Then Wout can solve it right?” (interviewee 2, male, 30 years); “The robot does understand what I mean […] He did not really help me” (interviewee 3, male, 19 years); “Fleur practiced that” (interviewee 11, male, 28 years). In a few interviews, we explicitly asked about these formulations. For instance in interview 1 (male, 64 years):

Interviewer: But you do talk about “Nina”.
Interviewee: Yes, but Nina is a computer for me. Computers can have a name too.

Evaluations of anthropomorphic cues.

Later in the interviews, the interviewer pointed at the anthropomorphic cues (particularly the name, cartoonish icon, or picture of a human being) and asked what the interviewee thought these things were, and what their opinion about this was. The main finding is that interviewees recognized that these cues are included to give the impression that one is communicating with a human being. In other words, they saw them as a simulation of human touch. This is nicely explained by interviewee 3 (male, 19 years):

“That is just a picture to make it look a bit more humanlike I think. I think that they try to make it as humanlike as possible, so that people also have a good feeling when talking to a robot. The introduction states that it is a chatbot, but probably they do this to give you the feeling that you are being helped by the customer service.”

Subsequently, we asked them about their opinions on this simulation of human touch. At one end of the spectrum, there were interviewees who straightforwardly liked the humanlike features or anthropomorphic cues. For instance, interviewee 6 (female, 52 years) said: “A nice detail is that you see that she is typing; I think that is funny. It is a nod to humanity, that I like more than that picture of that woman”.

The middle position was taken by interviewees who put forth that for them personally the anthropomorphic cues did not matter, but that they may matter for other people (i.e., the so-called third-person effect). Rather similarly, some recognized that the cues did not matter for them consciously, but that they might have an effect on them subconsciously. Interviewee 3 (male, 19 years) said about the icon:

“I feel like the icon is not really necessary. I do not think something else when I see that icon. It does not really do anything for me. But maybe it does something subconsciously, that it gives me a better feeling. But I actually did not even notice it.”

At the other end of the spectrum, interviewees experienced the anthropomorphic cues as counterproductive and even misleading and unfair. They were outright negative about this simulation of human touch. Interviewee 21 (male, 21 years):

“That picture [of a human being]: they use a natural person, while there is of course no natural person behind this. Also with the name: it makes the whole thing more personal, and I expect there is a real person behind it, but there is not. It is all already programmed. They give the wrong impression I think. Because I expect a person. Maybe fraud is too big a word. But they do deceive me.”

Mindless anthropomorphism.

We asked the interviewees to what extent they rated the chats as friendly, social, personal, and intelligent (on a scale from 1 to 10), and probed into what led to these scores. There were differences between the four items in how the interviewees responded to them.

For “friendly”, the answers were most uniform, in the sense that interviewees typically scored the extent to which they thought the language style was friendly. Interviewee 8 (male, 47 years) gave an 8: “Cheerful, friendly. When I look at the questions [of the chatbot], I think it is very nice. They are doing a good job”. When interviewees assessed language style here, it was possible that they gave a high score for “friendly”, but at the same time gave a low score for “personal”. For example, interviewee 24 (male, 38 years) gave a 9 for “friendly”, and a 1 for “personal”. Some interviewees doubted whether you can answer such a question about a machine, which implies that they saw friendliness as a human characteristic. As interviewee 15 (male, 19 years) said: “Hmm, difficult. It is still a computer for me. But in terms of friendliness, what should I say, in terms of what you can expect from a computer, I think an 8”.

For “personal”, the answers were sometimes quite complex, because interviewees tried to come up with an answer, using different dimensions. Both for “personal” and for “social”, the provided scores ranged from 0 to 9. A rather complex answer (interviewee 2, male, 30 years):

“Personal? Not really, no. When you see ‘Hello I am Wout, I am still learning, I do not always understand you immediately’, then it is not really personal. What do I actually talk to, you know what I mean? Nothing actually. That is what it is, a formal thing with an informal appearance. That is the feeling I get.”

Other dimensions that interviewees used were for instance whether the chatbot used their personal data like their name, and whether the chatbot answered their own specific questions. For instance, interviewee 10 (female, 66 years) gave an 8, because “he clearly says what needs to be done and gives a good answer”.

For “social”, the answers were also complex and difficult to categorize. Some interviewees thought the label “social” simply did not apply; apparently they saw being social as a characteristic exclusive to humans. Interviewee 17 (female, 21 years): “I can not call this social, it is just a bot”. At the other end, some interviewees did give rather high scores for social. For instance, interviewee 10 (female, 66 years) gave an 8: “it gives an answer, it is nice, it is empathetic because it uses ‘je’ [Dutch informal pronoun]. It is handy, clear, friendly”.

For “intelligent”, some interviewees evaluated whether the chatbot had answered their query sufficiently and what the chatbot was able to do. For instance, interviewee 21 (male, 21 years) said: “They come up with good questions so… They did well with that. You can not type full sentences, but I do not think that is necessary”. Some focused on artificial intelligence instead of intelligence as a human characteristic. Interviewee 13 (female, 26 years) gave a 4, with the following reasoning:

“It is just standard answers, standard questions, it is just looking for a word in its database and then provides an answer. It’s not much more than the if-then function in Excel. I think they can not do much more at the moment. They can only do more when they have progressed with artificial intelligence.”

Mindful anthropomorphism.

Interviewees indicated on a scale how they rated the interaction from robotlike to humanlike, and the interviewer probed into what led to these scores. Essentially, the responses of the interviewees—who thought they chatted with something automated—signaled that they combined several dimensions, particularly their source orientation with the simulation of human touch. They typically said that they used a chatbot/robot/system/computer etc., but that it was made to appear humanlike, and they therefore gave a score somewhere in the middle of the scale (and not completely robotlike or completely humanlike). As interviewee 15 (male, 19 years) said:

“Yeah, Waternet is just a digital assistance system. Nothing more. But because it does remember your questions and really gives an answer, yeah, you go a little bit to the humanlike side. That is what we did with computers, we made them humanlike.”

4.3 Social presence

Based on the interviews, this section discusses the ways in which social presence is apparent in how people talk about their experiences with customer service chatbots (RQ3). The analysis focused on the feeling of being together, and on communication with a non-human entity.

Together.

Some interviewees used the word “together” when talking about the chatbot interactions. This was most clearly the case when interviewees were asked to point out which blobs in the blob tree (see the method section) best characterized their experiences. As an example, interviewee 1 (male, 64 years)—who was critical about the humanlike tone of voice (“I do not fall into that trap”)—chose two blobs with the following argumentation: “We want something together, but we do not know whether we need to go up or down the rope. […] He does give me the feeling that we want something together. But we do not really find the solution [to the customer query]”. And interviewee 6 (female, 62 years) said: “It doesn’t matter to me that much whether it is somebody or something, we are already so integrated that we do it together, so it does not matter”.

Communicating with an entity other than a human being.

The interviews show that some interviewees indeed saw this chatbot use as communication, with an entity other than a human being. Interviewee 2 (male, 30 years) saw the interaction with the chatbot more like filling out a form, but at the same time noted that “it is a feeling that you have, that he or she does seem to respond to what you are saying”. And interviewee 15 (male, 19 years): “It is Iris [name of the chatbot], that does not really listen of course. It is just a system. But she does give you the idea that you are being heard”. Interviewee 16 (female, 36 years) about the feeling that it gives: “It does give the idea as if I have spoken with … not with a person, but just with a robot”.

In contrast, some interviewees clearly stated that they did not see this as communication or a conversation: if it is not with a human being, it is not communication. As interviewee 20 (female, 74 years) said in response to the question about it being a conversation:

“No, not at all. For me, a conversation is between humans, not through a robot. That is why I hate all those robots in nursing homes. […] Human to human is being together, heart to heart, warmth. […] This is not communication; this is purely automated.”

5 Discussion

This qualitative interview study analyzed users’ perceptions of their interactions with chatbots through the lenses of source orientation, anthropomorphism, and social presence, in order to unravel how these three concepts—each in its unique way—can help us understand human-chatbot communication. The discussion focuses on how HMC scholars can proceed with these concepts and their measurements in future chatbot research.

5.1 Theoretical implications

The literature review and the analysis of the interviews showed that—when defined clearly and distinctly—each concept offers a unique angle from which to understand users’ entity perceptions. The distinctive conceptualizations are shown in Table 2. More specifically, a plea is made to also pay attention to source orientation in future chatbot research, since the inclusion of this concept contributes to a more thorough understanding of anthropomorphism and social presence in users’ experiences.

Table 2 Conceptualizations of Source Orientation, Anthropomorphism, and Social Presence

Source orientation.

As the term “source orientation” implies, it can be defined literally as the source that users orient themselves toward when interacting with technology (Reeves and Nass 1996; Solomon and Wash 2014; Sundar and Nass 2000). In the classical CASA and media equation experiments, scholars treated source orientation as a constant: users orient themselves toward the computer itself instead of toward a human being such as a programmer (Reeves and Nass 1996; Sundar and Nass 2000). Reeves and Nass (p. 189) did acknowledge that there can be several “layers” of sources, even many within a computer itself, but they emphasized that users orient to the most proximate source.

In contrast, the present interviews showed that some users actually thought they were engaged in a one-on-one conversation with a human being (see also for instance Schuetzler et al. 2020), whereas there was variation in the automated sources that the other users mentioned: they thought they interacted with a conversational agent (using a variety of terms), something software-related (such as algorithms), or something hardware-related (such as a server). Future research needs to further disentangle which sources users orient themselves toward in chatbot interactions. The source orientation model developed by Solomon and Wash (2014) seems like a particularly fruitful entry point. This model identifies several “layers” that users can orient themselves toward (application, computer, other users, programmers, organizations, etc.), and distinguishes three conditions: awareness (the user is aware of these sources), attention (the user pays attention to these sources), and engagement (the user is engaged with these sources).

Gaining insight into which “layers” users orient themselves toward will also significantly enhance our understanding of anthropomorphism and social presence. One important point is that when users think they are engaged in a one-on-one conversation with another human, they are not engaging in anthropomorphism (i.e., assigning human characteristics to nonhuman agents), nor experiencing social presence (when conceptualized as the sense of being together with an artificial actor). More broadly speaking, knowing which “layers” users orient toward will enhance our understanding of the aspects that they are anthropomorphizing (i.e., to what aspects they are assigning human characteristics) and what exactly makes them experience feelings of being together. Gaining these insights is relevant for all three key aspects of communicative AI that Guzman and Lewis (2020) identified in their HMC research agenda, because it helps to better understand how users make sense of what they are communicating with (aspect 1), what they are building relationships with (aspect 2), and how they experience ontological boundaries between themselves and machines (aspect 3). Finally, it is relevant to note that several AI regulations and guidelines state that users have the right to know who or what they are interacting with (California Legislative Information 2018; European Commission 2022). The fact that users may not know they are interacting with technology instead of with a human being raises ethical issues. This makes it pivotal to further investigate users’ source orientations when interacting with chatbots.

Anthropomorphism.

Anthropomorphism has received a lot of attention in HMC and chatbot research, and will continue to do so. This is because chatbots are intended to mimic human-human conversations and are often imbued with anthropomorphic design features, while, on the side of users, anthropomorphizing machines is a widespread tendency (e.g., Proudfoot 2011). The definition of anthropomorphism seems uncontested and reads: “attributing humanlike properties, characteristics, or mental states to real or imagined nonhuman agents and objects” (Epley et al. 2007, p. 865).

Within this broader definition, the current study reveals two conceptual issues that stand out. The first question is what precisely counts as “humanlike properties, characteristics, or mental states”. The current interviews showed that the human characteristics that are used in the scale for mindless anthropomorphism are not necessarily nor exclusively human characteristics. For example, for “friendly” participants typically evaluated language style, whereas for “personal” some of them evaluated whether personal data had been used. To move forward with this issue, the recent conceptualization of anthropomorphism by Kühne and Peter (2022) will prove to be helpful because—using Wellman’s Theory of Mind framework—they narrow the definition to the cognition of attributing human mental capacities. Thus anthropomorphism entails the attribution of thinking, feeling, perceiving, desiring, and choosing to a robot. In this conceptualization, the attribution of personality and moral value to a robot is not seen as part of anthropomorphism, but as a more downstream cognition. Future research should explore how and to what degree users assign these human mental capacities to chatbots.

Second, the issue of mindless versus mindful anthropomorphism needs to be addressed. Lombard and Xu (2021) emphasized that mindless and mindful anthropomorphism are two major complementary mechanisms that help to understand users’ social responses to technology. The recent conceptualization of Kühne and Peter (2022) focuses on mindful or explicit anthropomorphism. Future chatbot studies should explicate whether they are investigating mindless or mindful anthropomorphism, or the relation between the two.

Social presence.

Although Guzman and Lewis (2020) did not explicitly mention social presence in their research agenda, this concept is related to key aspects of communicative AI technologies and has often been included in research on chatbots. However—compared to source orientation and anthropomorphism—the definition of social presence is more contested, with several widely-used definitions from different bodies of literature (e.g., Biocca et al. 2003; Lee 2004; Lombard and Xu 2021). Distilled from these definitions, the current interview study used the following working definition: “the feeling of being together with a non-human entity”. This definition is helpful in making a distinction between the concepts, because it clearly focuses on feelings, whereas anthropomorphism refers to a cognition. Since thoughts and feelings are situated in distinct areas of the brain (e.g., Taylor 2021, p. 12), it is important to investigate both of them separately. For future research, one could consider defining social presence simply as “the feeling of being together with another” (see Table 2), as in Biocca et al.’s (2003) definition, where the other can be either a human or an artificial agent. Source orientation can consequently help to understand which “layers” of the AI-enabled agent users feel together with.

In the current study, some interviewees felt “together” with the chatbot, and experienced their interactions with chatbots as a form of communication with an entity other than a human. However, some interviewees clearly rejected the notion that this was a form of communication. It is vital to consider that although we name our research field human-machine communication (Guzman and Lewis 2020)—and although this paper has human-chatbot communication in the title—this does not mean that users necessarily experience it as such. Future research should consider the difference between the use of machines, interactions with machines, and communication with machines, and examine under which conditions people experience chatbots as independent social actors that they feel together with and communicate with.

5.2 Methodological implications

The original CASA and media equation experiments related manifest manipulations to behavioral responses. In some of these, source orientation was manipulated by telling people that they were working with a computer, a programmer, or a networker (Sundar and Nass 2000). In stark contrast to this earlier work, recent chatbot experiments have included mindless and mindful anthropomorphism and social presence as mediators, and a qualitative study explored users’ source orientations when interacting with voice assistants (Guzman 2019). The present literature review and interview study showed several issues with the currently-used measures of anthropomorphism and social presence: some of the items overlap, there are issues with what human characteristics actually are (as discussed in the previous section), and it is problematic to label these measures as mindless and mindful.

The current consensus that both mindless and mindful processes—particularly regarding anthropomorphism—can be at work (e.g., Kühne and Peter 2022; Lombard and Xu 2021; Złotowski et al. 2018) calls for empirical research that uses a variety of methods better able to examine these processes and their outcomes. To this end, mixed-method projects are needed, particularly projects combining in-depth interviews with experiments, to compare the outcomes of behavioral measures (i.e., the social responses that users exhibit toward chatbots), open-ended questions that provide in-depth insight into user experiences, and explicit and implicit measures (e.g., Vandeberg et al. 2015, 2016). Relatedly, Lombard and Xu call for combining, for instance, fMRI and EEG with interviews.

Source orientation.

In future experimental research, source orientation can be manipulated by letting users know explicitly which source they are supposed to orient themselves toward. CASA studies in which the computer was labeled as either a computer, a programmer, or another type of human interactant (Sundar and Nass 2000) bear a resemblance to recent chatbot disclosure studies in which chatbots are explicitly labeled as chatbots or not (De Cicco and Palumbo 2020; Luo et al. 2019; Mozafari et al. 2021). These studies can be extended with other conditions in which the interlocutor is introduced as a human being (in parallel to Sundar and Nass’s study) or as an algorithm, computer, server, etc. (using sources that interviewees in the current study put forth).

In addition, future research needs to explore the several “layers” in users’ source orientations (Solomon and Wash 2014). Asking open-ended questions (either in a qualitative interview study, or integrated in experiments and surveys) such as “what just happened?” will reveal the different wordings and understandings that users have regarding artificial agents. Subsequently, these wordings can function as input for explicit measures that can be used in experiments. These measures can be inspired by Schuetzler et al.’s (2020) measure (although they called it mindful anthropomorphism), and can be considered a version of the famous Turing test (Christian 2011; Proudfoot 2011).

Anthropomorphism.

As said, the challenge for future research is to gain insight into both mindless and mindful anthropomorphism (e.g., Lombard and Xu 2021; Złotowski et al. 2018), and Kühne and Peter (2022) suggest that a measure of explicit anthropomorphism needs to be complemented with measures of implicit anthropomorphism, for instance indirect self-report measures and neurological measures. The currently-used measures of mindless and mindful anthropomorphism are both explicit measures asking for participants’ reflections, and as such the measure of “mindless anthropomorphism” does not tap into mindless processes. However, this measure is a rather direct operationalization of “the attribution of human characteristics to nonhuman agents”. As mentioned above, the current interviews showed that the characteristics in this scale are not necessarily nor exclusively human characteristics. A way to move forward will be to translate Kühne and Peter’s (2022) more precise definition (i.e., the attribution of thinking, feeling, perceiving, desiring, and choosing) into an explicit measure of mindful anthropomorphism. In parallel, researchers should work on implicit measures tapping into mindless anthropomorphism. As an example, Heyselaar and Bosse (2020) worked on the development of a theory of mind task to measure how much agency users attribute to a chatbot.

For the measure of mindful anthropomorphism, the interviewees weighed their source orientations against the anthropomorphic cues that simulated human touch (e.g., “it is a machine, but it does act nice, so I will give a score in the middle”). This finding suggests that this measure may not be necessary when new explicit measures of source orientation and mindful anthropomorphism are used.

Social presence.

To distinguish the measure of social presence from the measures of source orientation and anthropomorphism, it is vital that it focuses exclusively on feelings. As in the current interview study, open-ended questions, techniques such as the blob tree (blobtree.com), and think-aloud methods can reveal users’ feelings of being together (or not). The issue with the current explicit measures (e.g., Lee et al. 2006) is that some items overlap with the items for anthropomorphism, for instance when they include the words “social” and “intelligent”. Interview studies can give input for new items that tap into feelings of togetherness more precisely. And, as said for the other concepts, researchers should work on developing implicit measures for these feelings (e.g., Heyselaar and Bosse 2020; Vandeberg et al. 2015, 2016).

5.3 Limitations and future research

One limitation of the current design is that a selection of (nine) chatbots was used with a variation in anthropomorphic or social cues, with each interviewee interacting with two of these chatbots. This was purposefully done to gather a wide array of users’ experiences and perceptions. However, with this design it is not possible to assess which cues lead to which source orientations, forms of anthropomorphism, and feelings of social presence. Relatedly, it is not possible to systematically examine relations between the three concepts. Quantitative research is called for to systematically disentangle which combinations of cues lead to which perceptions (e.g., Lombard and Xu 2021; Rapp et al. 2021), and how specific types of source orientation (e.g., human versus chatbot, or different understandings of the automated other) are related to processes of anthropomorphism and to feelings of being together.

Second, the current literature review and interview study focused on chatbots specifically. However, the theoretical and methodological implications seem relevant for other types of AI-enabled communicators such as social robots and voice assistants as well. Future projects that do not solely focus on chatbots but instead focus on the comparison between several types of interlocutors will greatly enhance our understanding of source orientation, anthropomorphism and social presence in human-machine communication.