1 Introduction

Service robots will soon become part of our daily lives where they perform tasks like cleaning, transportation, assistance, and guidance [1,2,3]. A service robot is built to be of use to humans and to fulfill a certain purpose. It has to perform its tasks correctly and reliably to be trusted and accepted [4]. So far, the dissemination of robots in our daily lives has focused on the domestic environment [5,6,7], but soon service robots will be employed in shared spaces [8] like train stations, malls, and museums, which are crowded places with spontaneous and unexpected human-robot encounters [9,10,11,12]. Interaction with lay users in public already gained research interest two decades ago [10, 11, 13, 14] and has recently received increased attention again [2, 15,16,17], as commercial robots can now operate autonomously in public spaces due to technological progress. The public context poses a particular challenge, as interactions with unsuspecting, vulnerable users occur, and service robots moving autonomously might be perceived as unpredictable and hence trigger discomfort and distrust [4, 18].

To achieve efficient, safe, trustworthy, and acceptable human-robot interaction, a potential solution could be to equip service robots with interaction strategies, for instance, ones that ensure transparency of the robot’s actions [19, 20]. These interaction strategies can range from audio or visual signals to natural language dialogues to an intricate set of actions. Essential for these strategies is their appropriate application considering user, robot, and context. Here, understanding user experience (UX) is core to designing acceptable and trustworthy robot interaction strategies [1, 21], as it captures users’ thoughts and feelings when interacting with a robot [19].

In particular, the intuitive reactions of lay users to autonomous service robots can provide insights into which interactions are already effortless and which create misunderstandings [9, 10, 20, 22]. The initial user reaction is vital because a failed first interaction (e.g., fear, conflicts) with a resulting negative UX can lead to rejection of the robot and mistrust [23,24,25].

For investigating UX in public HRI, a design methodology has been proposed by Tonkin and colleagues [19]. It comprises eleven recursive steps and includes user-centered methods like observation and interviews as well as the HRI-specific design of robot interaction behavior. By applying this methodology, the UX with a service robot at an airport terminal could be improved [19].

The present study focuses on the first three steps of the HRI design cycle for public spaces by Tonkin and colleagues [19]: define the challenge, observe, and form insights. A field study at a train station was conducted (see Fig. 1) that combined results from observation, interviews, and questionnaires in a multi-method approach similar to [19, 26, 27], including case studies as proposed in [28].

Findings regarding observed conflicts and critical situations are presented. Challenges that arose during co-existence with a mechanoid, autonomous cleaning robot are highlighted, and design considerations and recommendations are provided, accompanied by literature suggestions for further reading on the specific topics. This way, the paper may serve as a starting point for researchers investigating HRI in public. The recommendations are listed beneath each result, numbered consecutively and marked with an ’R’ for recommendation. Additionally, aspects for future research are highlighted to inspire a research agenda in the field of public HRI.

Fig. 1 Cleaning robot driving in the hallway of the train station where the study took place. Depiction of the front and back of the robot

2 Theoretical Background and Related Work

In the following, the theory on user acceptance of and trust in autonomous robots in public spaces is elaborated, and research gaps in previous work are highlighted. Human-robot conflicts that might occur in shared environments are described to illustrate underlying processes that need improvement (e.g., robot bullying).

2.1 Existence Acceptance and Trust in Autonomous Service Robots in Public

2.1.1 Existence Acceptance

Co-existence describes a form of HRI which is indirect as humans and robots share the same space but not the same goal. The human mostly has the role of a bystander with the primary aim of avoidance rather than interaction [29]. Hence, the application of service robots in public is different from other application areas such as the domestic or industrial context concerning duration, frequency (short one-time interactions) and type of interaction (co-existence vs. cooperation), ownership (user vs. passerby), the voluntariness of use (no active choice vs. intention to use), human authority, user expertise and pre-knowledge of the robot’s purpose and abilities [1, 15].

Due to these particularities of robot application in public, previous models of technology acceptance cannot be directly applied. Traditional acceptance models such as the Automation Acceptance Model (AAM) [25] were not developed for indirect interactions such as co-existence in public. Therefore, Abrams and colleagues [15] described the so-called ’existence acceptance’, which does not assume an intention of the passerby to use or interact with the robot: it is the ’[...] passive approval of the presence of an autonomous robot’ by passersby who ’might not want or need to use it, and still accept its existence’ ([15], p. 274).

The Existence Acceptance model sheds some light on potentially meaningful variables for understanding the acceptance of a robot in public. To further understand how the relevant set of variables relates to accepting or rejecting robots in public spaces, the current field study explored passersby’s reactions (behavior, thoughts, and emotions), as well as trust in and acceptance of an autonomous robot in public.

Table 1 Overview of Similar Studies in Public Spaces

2.1.2 Passersby Reactions in Public

Previous work investigating passersby’s reactions to robots in public provides support for the concept of Existence Acceptance: the main reaction of passersby to co-existence with robots in public was to ignore or briefly acknowledge the robot’s existence. In the following, the results of field studies similar to the presented study (for more similar studies, see Table 1) are described and discussed in more detail to give an overview of existing findings on user reactions to autonomous robots in public.

One study at a train station in Japan in 2007 examined the reactions of passersby in a way similar to the presented approach [9]. Two humanoid robots that could give travel information were placed at the side of an aisle, and the reactions of passersby were video recorded. A total of 5900 individuals were observed over eight days, and 163 passersby were interviewed. The most frequently shown behaviors were ignoring (50%) and noticing (35%). Less frequent were stopping and listening to information, touching, changing direction, talking about the robot, and taking photos. While most passersby did not look at the robots, some were very interested, stopped to look, and took photos [9].

Weiss and colleagues described a field study in 2008 with an autonomous robot asking passersby for directions in a crowded city center in Germany [21]. The observed behavior showed that whether the passersby stopped to provide the robot with directions depended on various factors such as time pressure, being in a group, or being local. In contrast, carrying an object or being accompanied by a pet was not influential. Men and children were more willing to interact with the robot [21].

In 2012, Weiss and colleagues [26] repeated the field study setup with a more advanced version of the robot IURO and found results similar to the previous studies. The robot was rather often ignored by the passersby; apart from that, the most frequent behaviors were laughing, pointing at the robot, and interacting with the researcher. Speech was the preferred feedback modality for the humanoid robot for 70% of the questionnaire respondents.

A recent study in 2020 with the humanoid Pepper at a train station in Sweden revealed that people are still unaccustomed to interacting with robots in public spaces [16]. In particular, speech interaction with Pepper was rarely observed: participants preferred to use the touch screen on the robot’s chest instead of talking to it. Participants were also asked about their concerns, which mostly related to the absence of a robot operator (e.g., the robot’s autonomy) and data security (e.g., hacking) [16].

To summarize, the majority of field studies in HRI have reported various observed user reactions to humanoid robots in public, which remained relatively stable over the years, but the reasons for the individual variations have not been sufficiently identified.

Therefore, the presented study extends the scope of previous studies as follows: 1) Instead of humanoid social robots (see Table 1) designed, amongst others, for entertainment/information, a functional, mechanoid service robot designed for a specific, non-social task was used. 2) In contrast to previously used semi-autonomous prototypes [9, 21], the investigated robot is commercially available, fully autonomous, and designed for long-term use. Therefore, the robot could be deployed without a visible operator/researcher accompanying it, thereby reducing possible experimenter effects. 3) Instead of direct social interactions, in which participants and the investigated robot were engaged in a shared task [9, 21, 26, 27], a situation of mere co-existence of robot and passersby was investigated in this study. The interaction was as close to realistic co-existence as possible, as participants had no instruction to interact with the robot (e.g., guidance or assistance, dialogues [26, 27]). Hence, in line with the non-social tasks of public robots (e.g., cleaning), social interactions between the passersby and the robot were not focal. 4) A combination of case studies and group comparisons is used to shed light on individual reactions and to explain individual differences (e.g., in trust and acceptance).

Based on this, the following research question arises: RQ1: What passerby behavior can be observed during co-existence with a fully-autonomous, mechanoid cleaning robot in public?

2.1.3 User Expectations and Concerns

To create safe, trustworthy, and acceptable HRI, evoking a positive initial user reaction should be a central goal of robot interaction strategy design.

UX in HRI focuses on thoughts and feelings that arise when interacting with a robot and is, amongst others, assessed by observational studies and interviews [19, 32]. User expectations, together with knowledge (or assumptions) about the technology and its functions and abilities, are organized in user mental models. Mental models of a system (like a robot) shape how we think about and react to it. They can be updated by experience and are sometimes incorrect [33,34,35]. User mental models in HRI have been studied, amongst others, for human-robot awareness [36] and to understand user attitudes [32] and behavior towards humanoid robots [33, 37, 38]. Consequently, understanding user mental models is vital for understanding UX in different use contexts and for explaining the variety of user behavior that can be seen in HRI [39]. However, user mental models and inner states during the initial HRI in public have not yet been identified. This merits further investigation of passersby’s thoughts and concerns about autonomous robots in public.

RQ2: What thoughts and concerns will occur when naive passersby are confronted with a fully-autonomous, mechanoid cleaning robot in public for the first time?

2.1.4 Emotional Reactions

As Abrams and colleagues [15] state in their paper on Existence Acceptance, passersby will still perceive, evaluate, and show emotional reactions such as joy, surprise, or fear of the robot, although no interaction might take place [15]. However, to the authors’ knowledge, the emotional reactions of passersby to an autonomous service robot in public have not yet been thoroughly investigated in field studies. Although Hayashi and colleagues [9] assessed enjoyment of the robot at the Japanese train station and found average values of five on a 7-point scale, they did not indicate which questionnaire was used. Therefore, it is not clear whether they assessed an emotional response to the robot (e.g., feeling pleased or satisfied) or a hedonic factor of acceptance [40, 41]. Hence, the presented study investigates emotional reactions more thoroughly by interviewing passersby about their feelings during the interaction with the robot.

RQ3: Which emotional reactions resulting from the co-existence with a fully-autonomous, mechanoid cleaning robot in public are reported?

2.1.5 Trust in Autonomous Robots in Public

Trust in the robot has been found to predict Existence Acceptance in the online study conducted by [15]. However, the initial trust of passersby in autonomous robots in public has not yet been investigated in real-world scenarios. Since interaction with service robots is not yet common in public spaces and robot capabilities are unclear, a lack of acceptance and trust might prevent the successful integration of service robots into our society, as it is likely to result in anxiety [42, 43], perceived threat, and complaints to authorities about the robot application [1, 44]. Accordingly, users’ levels of trust in robots constitute an important basis for familiarization and the initial interaction with public service robots [42, 43]. In recent work on trust in automation, early trust levels were shown to considerably affect further trust development in later interactions with robots [43]. Taken together, user expectations and trust are essential psychological concepts for understanding human behavior concerning unfamiliar, automated technology like service robots. Hence, in the presented study, the initial trust and acceptance of passersby towards an autonomous cleaning robot in public was investigated by administering respective questionnaires.

RQ4: Do passersby trust and accept a fully-autonomous, mechanoid cleaning robot during their initial encounter in public?

2.2 Human-Robot Conflicts in Public

2.2.1 Transparency of Robot’s Actions

Autonomous robots can be viewed as unpredictable, resulting in feelings of unease and distrust [18]. Resulting from this lack of predictability are conflicts in trajectory planning (e.g., collisions), which are a common issue in HRI, as previous studies of human-robot co-existence have shown [45, 46]. A long-known important factor for safe, acceptable, and trustworthy HRI is the transparency of an autonomous system concerning its plans and actions [47,48,49,50,51,52]. Here, the bystander role of the passerby in public is particularly challenging for HRI interaction design [53], as the information necessary to make the bystander comfortable with the robot is difficult to deduce: a balance has to be found between necessary and redundant information. General solutions to foster transparency are, amongst others, operator training and human-robot interface design (e.g., feedback) [51]. For instance, interface design for co-existence with mobile robots in industrial contexts includes auditory warning sounds and trajectory projections [18, 45, 54].

However, some of the solutions discussed for other settings are not applicable in the public context (e.g., training). Until now, it has not been determined which interaction designs might be transferable (e.g., too many interaction displays might be considered annoying by passersby). Hence, in the presented study, the status quo of initial HRI with a slow-driving and object-avoiding robot was observed. Therefore, a central research question of this study was whether the robot’s safety features were already sufficient for smooth co-existence in public and, if not, what additional information was desired by the passersby.

RQ5: What interaction conflicts will occur when naive passersby encounter a fully-autonomous, mechanoid cleaning robot without an HMI?

RQ6: What are desired interaction strategies for a fully-autonomous, mechanoid cleaning robot in public?

2.2.2 Human-Robot Power Asymmetry

Apart from conflicts on the trajectory level, conflicts can also develop on the basis of the social roles of the human (master vs. guest) and the robot (tool, servant, organizational representative, companion), which differ in public use contexts from domestic or industrial contexts. In public, social role confusion (operator vs. guest in the public space) might occur and lead to conflicts [53]: for instance, a public cleaning robot does not need to accept orders from every passerby, which contradicts the human’s wish to be superior to a robot [55]. Two aspects are particularly relevant here: robot bullying and the negotiation of task priorities between service robot and passerby.

2.2.3 Robot Bullying

A common phenomenon observed when autonomous robots are deployed unsupervised in public is robot bullying or robot abuse [56, 57]. It constitutes physical or psychological aggression against the robot, for instance, blocking the robot’s path, kicking it, or insulting it (for an overview of possible mistreatment, see [57]). Robot abuse is especially often observed when children interact with a robot in public unobserved by their parents [58, 59]. It has been shown that robots that are not humanlike and do not seem to have a mind of their own (’mindless’) are more often the subject of abuse [60]. Various counter strategies have been pursued by robot designers, such as having the robot shut down completely or drive away from the bullies [61, 62]. However, none of these strategies allows the robot to continue its task execution [62]. Therefore, other interaction strategies are needed for the robot to be effective and to avoid bullying. This was investigated in the present study by observing bullying instances and asking passersby how bullying could be prevented. As the robot in the presented study was mechanoid and did not appear to be intelligent (it even has handles on the back for remote control), it might be subject to robot bullying.

RQ7: Will robot bullying be observed when passersby are confronted with a heavy, fully-autonomous cleaning robot?

2.2.4 Human-Robot Task Prioritisation Conflicts

In public, a negotiation of task priorities between service robots and passersby may be necessary, as it happens between human cleaning personnel and passersby (who is allowed to perform a task first if both tasks cannot be done simultaneously?). A human cleaner would assert themselves and ask the passerby to step aside. It is conceivable that service robots, to operate effectively, would also benefit from being assertive at times (e.g., insisting that the passerby steps aside) [11, 63, 64]. Due to the human-robot power asymmetry (e.g., the desire to always be superior to a robot) [55], the same behavior from a service robot (e.g., asking a passerby to step aside or to remove luggage from the ground) may lead to negative reactions [63, 64]. Objectively, the robot, as a representative of the cleaning company, should be allowed to act accordingly, as the passerby is only a guest at the location. But will this be considered acceptable for a robot? In the public context, assertive robot behavior has been proposed for tour guide robots whose way was blocked [11, 30]. Hence, in this user study, participants were asked whether they would accept an assertive cleaning robot and under which circumstances this might be the case.

RQ8: Could passersby imagine accepting an assertive public cleaning robot?

3 Method

3.1 Study Setting and Setup

To investigate HRI in public spaces, a train station provides the benefits of natural interaction from a heterogeneous sample (age, education, experience) in a crowded place where problematic interactions become apparent quickly [9, 16]. Hence, the field study took place at a train station in a medium-sized town (>120,000 inhabitants). The study was conducted for two weeks in November 2018 during work hours from 9 am to 5 pm. The observation of the passersby took place in a side corridor of the station, which led from the main hall to the platforms and the car park (see Fig. 2a). The observation area consisted of a straight aisle (approx. 4 m x 10 m) with lockers on one side and a shop window on the other. Within the hallway, the robot drove a predefined path and wet-wiped the floor. A robot supervisor was present during the study to be able to intervene in an emergency. Two cameras recorded the HRI from two points of view. The camera positions were indicated, and displays placed at the beginning and end of the observation area informed passersby that video recordings were being made for a scientific study (see Fig. 2b).

Fig. 2 Overview of study setup, setting, and sample sizes. a Depiction of study setup at the train station with the camera and interviewer locations. Dotted lines indicate the robot trajectory. A to C indicate the points of view for the images in 2b. b Images from the study setting. c Sample sizes for observation, interview, and questionnaire data

3.2 Autonomous Cleaning Robot

The autonomous cleaning robot CR700 (ADLATUS Robotics) has the size (98 x 75.5 x 100 cm) and looks of an ordinary manually-operated cleaning machine with two handles and an operating display at the back (see Fig. 1). It weighs up to 300 kg and drives at a velocity of 0.3 m/s to 0.8 m/s. It can drive and clean autonomously and avoid obstacles but cannot yet differentiate between inanimate and animate objects (humans). When the robot has detected an obstacle, it either passes it or, depending on the trajectory options, stops in front of it at a distance of 10 cm. The robot makes cleaning noises similar to existing cleaning machines. The robot was fully autonomous during the field trial. A supervisor was on stand-by in case of emergency and was hidden from the passersby.
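The obstacle-handling behavior described above can be summarized as a simple decision rule. The following is a minimal, purely illustrative sketch of that logic, not the vendor's implementation; all function and constant names are assumptions, and only the 10 cm stop distance and the 0.3-0.8 m/s velocity range are taken from the description above.

```python
# Illustrative sketch of the obstacle-handling behavior described above.
# Not the vendor's implementation; names and structure are assumed.

STOP_DISTANCE_M = 0.10            # robot stops 10 cm in front of an obstacle
MIN_SPEED, MAX_SPEED = 0.3, 0.8   # m/s, velocity range of the CR700

def plan_step(obstacle_ahead: bool, distance_m: float, free_path_exists: bool) -> dict:
    """Return a simplified driving command for one control step."""
    if not obstacle_ahead:
        return {"action": "drive", "speed": MAX_SPEED}
    if free_path_exists:
        # an alternative trajectory around the obstacle is available
        return {"action": "pass_obstacle", "speed": MIN_SPEED}
    if distance_m > STOP_DISTANCE_M:
        # approach slowly until the stop distance is reached
        return {"action": "drive", "speed": MIN_SPEED}
    return {"action": "stop", "speed": 0.0}
```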

3.3 Procedure

The study consisted of four parts: observation of passersby, recruitment, semi-structured interview, and questionnaire. First, the passersby were observed during their natural interaction with the robot and, if they consented to participation, interviewed afterward about their thoughts and feelings. After the interview, the questionnaire about participants’ perception of the robot regarding acceptance, trust, and predictability was administered. Ideally, this led to the acquisition of three data types per participant: observed behavior, participants’ self-reports, and standardized metrics.

Observation. The interviewer waited around the corner in the interview area and monitored the observation area via a camera live stream on a tablet. This way, the natural interaction of the passersby with the robot was not disturbed. The robot supervisor also behaved discreetly and did not draw attention to himself.

Recruitment. The interviewers randomly recruited passersby from the observation area, regardless of whether an interaction with the robot had occurred or not. If the passerby agreed to the interview, the interviewer led him/her around the corner to the interview area. Due to recruitment in the field, the following exclusion criteria were defined: minors were not recruited, nor were passersby who seemed unfit to participate (e.g., intoxication, mental illness) or who were non-native speakers. If two people (e.g., couples, friends) were recruited simultaneously, they were interviewed separately. Groups were not recruited as the focus of the study was on individual reactions.

Interview. The interview lasted about 10 minutes, was semi-structured, and consisted of 15 questions about how the participants perceived the robot and if and how they wanted the robot to communicate with them (see Table 2). The interview was audio recorded.

Questionnaire. If participants still had time after the interview, they were also asked to fill in a questionnaire about their perception of the robot (see Sect. 3.4.3). Participants gave their informed consent to participation and data processing before the interview started. Participants received a small gift (i.e., a university pen and sweets) for participation.

3.4 Sample Sizes

3.4.1 Video Data Sample

Video data could be obtained from N = 344 passersby (58% male). As the age of the observed sample could not be directly assessed, two researchers categorized the passersby into one of five age groups (child, young adult, 18-40 years old, 41-60 years old, and over 60 years old). The majority was aged 41 to 60 years (47%), followed by 18 to 40 years (31%). One-fifth of the observed passersby were over 60 years old.

3.4.2 Interview Sample

Overall, due to the crowded and hectic environment of a train station, only a small percentage of observed passersby (see Fig. 2c) could be recruited (e.g., refusal to participate due to time pressure). The final sample size for the interview data was n = 54 (55% female). The age ranged from 18 to 76 years (\(M = 46.49, SD = 19.07\)). Participation was voluntary and could be terminated by the subject at any time. The subjects were informed about the process of the study, as well as their rights regarding data processing. All subjects signed agreement forms for the use of their video and audio recordings.

Originally, 67 passersby could be recruited for the interview (see Fig. 2c). However, two people had to be excluded from the interview data: for one, the interview revealed that German was not their mother tongue and misunderstandings were apparent (n = 1); the other was not capable of study participation (n = 1). For eleven subjects, the audio material was not usable (e.g., background noise, incoherence, disturbances, data loss).

3.4.3 Questionnaire Sample

Questionnaire data could be obtained from 32 of the interviewed participants (see Fig. 2c). The 32 participants (41% female, two did not answer) were on average 43 years old (range: 18-72). Concerning education, half of the participants (n = 16) had a university degree, 19% had a secondary school degree (the German ’Realschule’), and 16% had a high school degree (the German Abitur or A-levels).

3.5 Semi-Structured Interview

The interview (see Table 2) consisted of 15 questions divided into five topics: 1) the participant’s self-reported reaction to the robot, 2) the participant’s perceptions of the robot’s functions, 3) the participant’s preferences for communication with the robot, 4) the participant’s concerns about critical situations with the robot, and 5) the participant’s acceptance of assertive robot behavior. The interview was semi-structured, hence the interviewer was free to investigate some topics in more detail if the respondent was willing to provide more information. Due to the field setting, not all participants had time for all interview questions, so interviewers sometimes had to reduce the number of questions asked. Accordingly, the number of respondents per question varies and is indicated in Table 2. The interview questions Q1 - Q13 were inspired by user-centered design approaches [65] to inquire about thoughts, feelings, concerns, and potential design ideas. Interview questions Q14 and Q15 address user preferences regarding assertive robots as a potential solution to human-robot task prioritization conflicts.

Table 2 Semi-structured interview

After the interview, a study part followed that assessed the preferences of 32 interviewed participants regarding possibilities for trajectory projections for an autonomous robot in public. Participants could voice their design ideas and then had to choose from three presented designs. The results of this study part are reported elsewhere [66].

3.6 Post-Interview Questionnaire

The questionnaire assessed participants’ trust in automation (seven items on a 7-point Likert scale from the German version LETRAS-G [67] of the trust in automation scale from [68], Cronbach’s \(\alpha = .91\)), acceptance of the robot (six items on a 5-step semantic differential of the Van der Laan Scale [69], Cronbach’s \(\alpha = .88\)), as well as how predictable they perceived the robot to be (four self-developed items on a 7-point Likert scale, Cronbach’s \(\alpha = .88\)): ’The robot communicated with me in a comprehensible way; I understood what the robot was going to do next; The robot’s actions were clear to me at all times; The robot was inscrutable for me.’ (recoded). The predictability items were developed based on the subscale ’Perceived Understandability’ of [70] but adapted to the co-existing nature of the autonomous cleaning robot in public space. Items like ’I understand how the system will assist me with decisions I have to make.’ were not applicable as the robot does not assist the human and no task is performed together.
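As a sketch of how the scale scores described above can be computed, the snippet below recodes the negatively worded predictability item and calculates the scale mean and Cronbach's alpha. It is a minimal illustration with assumed column names and toy data, not the analysis script of the study.

```python
# Minimal sketch of the questionnaire scoring described above
# (column names and example data are assumed for illustration).
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a DataFrame with one column per item."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Four predictability items on a 7-point scale; item 4 is negatively
# worded ('The robot was inscrutable for me.') and therefore recoded.
df = pd.DataFrame({
    "pred_1": [6, 5, 2, 7], "pred_2": [5, 6, 3, 6],
    "pred_3": [6, 6, 2, 7], "pred_4": [2, 3, 6, 1],  # raw, reverse-keyed
})
df["pred_4"] = 8 - df["pred_4"]          # recode: 8 - x on a 1-7 scale
df["predictability"] = df[["pred_1", "pred_2", "pred_3", "pred_4"]].mean(axis=1)
print(cronbach_alpha(df[["pred_1", "pred_2", "pred_3", "pred_4"]]))
```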

3.7 Data Analysis

3.7.1 Video Data

Video data comprised 110 minutes recorded by two cameras. A total of eleven 10-minute videos were used. Behavior coding of the video data was performed by two coders in an iterative process and comprised 13 different behavior categories (see Table 3 and Fig. 3), which were derived from the literature [9] and adapted to our data by adding categories such as ’evading’ or ’observing’ to capture all observable behaviors of the passersby. Hayashi and colleagues [9] also identified ’staying’, ’talking about robots’, and ’watching with child’ as behavioral categories. These three codes did not apply to our data and were not coded. For each passerby, all behaviors that were shown were coded except for facial expressions, which had to be excluded from the analysis due to German data protection regulations.

3.7.2 Interview Data

The recorded interviews were transcribed focusing on standard orthography, meaning that the transcription of the spoken language corresponds to the norms of written language (for example, ’Nope’ becomes ’No’). Dialect and linguistic features such as intonation or throat clearing, as well as sounds such as ’um’ and ’hm’, were not transcribed. In contrast, errors, repetitions of words, and sentence breaks were not corrected but transcribed verbatim. Longer pauses in speech were indicated in the transcribed text.

In an iterative process, the transcribed text was then categorized. The analysis of the qualitative data followed the procedure of qualitative content analysis [71,72,73], which is a data-driven, inductive approach.

Two independent raters were involved in the iterative process of coding the data. First, a basic category system was developed, which featured the main topics of the interview questions such as thoughts (e.g., robot functioning, concerns about robot application), feelings (positive, neutral, negative), and design recommendations. Then both raters, independently of each other, differentiated the broad codes into smaller categories (e.g., concerns regarding malfunctions) based on the data. Example codes can be seen in the Appendix, Table 9. The categories comprised up to four levels and contained a detailed representation of the interviews. In a third step, the coding scheme was presented to the authors and associated researchers for discussion. Potential to summarize (e.g., the design recommendations) or differentiate (e.g., reported feelings and concerns) was identified, and the two independent raters refined the codes afterward until a balance between aggregation and detail was reached, so that a comprehensive yet meaningful picture of subjects’ thoughts, fears, and desires concerning the cleaning robot could be created.

The results of the interview coding per interview question are reported below. The number of the interview question to which the results refer is given in parentheses (e.g., Q1) and refers to Table 2. Results are accompanied by example participant quotes. For each quote, the gender and age of the respective participant are indicated in parentheses.

3.8 Questionnaires

Questionnaire data were checked for outliers, but no participant had to be excluded. Items were recoded when necessary, and means were calculated based on the respective questionnaire manuals. An extreme group comparison was performed to investigate whether passersby who perceived the robot as less predictable also trusted and accepted it differently than passersby who thought the robot was predictable in its actions. Following recommendations by [74, 75], the lower and upper quartiles of the average predictability ratings were used to create a ’low’ and a ’high’ group. Eight participants were in the low group and nine in the high group. The average trust and acceptance scores were then compared between both groups using a non-parametric Mann-Whitney U-test due to the small sample size.
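A minimal sketch of this extreme-group procedure is given below, assuming a data frame with one row per questionnaire participant and assumed column names; it is illustrative only, not the original analysis code.

```python
# Sketch of the extreme-group comparison described above
# ('predictability', 'trust', and 'acceptance' are assumed scale-mean columns).
import pandas as pd
from scipy.stats import mannwhitneyu

def extreme_group_test(df: pd.DataFrame, split_col: str, outcome_col: str):
    """Compare an outcome between lower- and upper-quartile groups of split_col."""
    q1, q3 = df[split_col].quantile([0.25, 0.75])
    low = df.loc[df[split_col] <= q1, outcome_col]
    high = df.loc[df[split_col] >= q3, outcome_col]
    # non-parametric test due to the small group sizes
    return mannwhitneyu(low, high, alternative="two-sided")

# Usage (hypothetical data frame 'questionnaire'):
# stat, p = extreme_group_test(questionnaire, "predictability", "trust")
```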

Fig. 3 Screenshots as examples for behavior codes

Table 3 Behavioral coding system applied to video data for categorizing user reactions

4 Results

4.1 Initial Behavior, Thoughts and Lack of Transparency

4.1.1 Initial Behavioral Reaction to the Robot

Concerning the initial reactions of passersby to the autonomous cleaning robot in public, it was found that 63% of the 344 observed passersby noticed the robot on the first encounter, while about one-third ignored it. The absolute frequencies of all shown passersby behaviors (initial and subsequent) can be found in the Appendix, Table 10. Overall, a quarter of the passersby evaded the robot, and ten percent stopped to watch. Less frequent passerby behaviors were, for instance, observing (5.5%) and pointing (3.4%). Two instances of touching were observed: a male passerby (see Fig. 3) and a child (see Fig. 10a). Three passersby took footage of the robot (for a statement of one of those passersby, see Appendix, Table 11). Notably, seven passersby actually blocked the robot’s way and hindered it from driving on (for examples, see Appendix, Table 11, ’testing behavior’). This is discussed later in Sect. 4.3. No differences between age groups were found for the most frequently shown user reactions such as noticing or ignoring.

4.1.2 Initial Thoughts about the Robot’s Function and the Experienced Interaction

The majority of participants (96%) understood what the function of the robot was (Q1), and 80% were not surprised that the robot was autonomous (Q2, Q3). However, some participants were concerned by the robot’s autonomy and desired a human operator (see Sect. 4.2.1). For general thoughts (Q5), see Table 4.

Concerning participants’ thoughts about the interaction with the robot (Q4), half of the interviewees (51%) did not believe that the robot had perceived them during the interaction, for instance, because the distance was too large to tell (n = 8) or the robot did not adapt its trajectory (n = 6) (see Fig. 4); four were unsure. Participants who thought they were perceived by the robot (41%) determined this by the evasive action of the robot or by inferring that it must have sensors for object detection. Interestingly, these two subgroups also appeared in the questionnaire data and differed in their trust and acceptance of the robot (see next section and Fig. 7). Hence, there seems to have been some uncertainty among passersby regarding the robot’s recognition of humans, which needs to be addressed.

R.1:

An autonomous robot in public should make its current and future actions (e.g., whether it has registered a passerby in the vicinity, its planned trajectory, its presence, collision warnings) transparent and predictable (e.g., via projections or lights [18, 45, 66, 76]).

Fig. 4 Example of misunderstandings in observed HRI in public: evasive actions of passersby

Table 4 Example Quotes for Participants’ Categorized General Thoughts (Q5)

4.1.3 Questionnaire: Acceptance, Trust, and Predictability

The questionnaire assessed trust in and acceptance of the robot, as well as the robot’s perceived predictability, from 32 participants. As can be seen in Fig. 7a,b, participants’ trust (\(M = 5.15; SD = 1.34\), range: 2-7) and acceptance (\(M = 3.88; SD = 0.82\), range: 2-5) were rather high. The predictability of the robot’s actions was also rated as rather good (\(M = 4.24; SD = 1.70\), Median = 4.5, range: 1-7), but eight participants had average scores lower than three (lower quartile), indicating that they did not perceive the robot’s actions as predictable.

To investigate whether these eight participants also trusted and accepted the robot less than participants who perceived the robot as very predictable, an extreme group comparison (lower vs. upper quartile) was performed. A significant difference for the trust (\(Z = -2.75, p < .01\)) and acceptance ratings (\(Z = -2.80, p < .01\)) was found, indicating that the passersby who perceived the robot as unpredictable also trusted and accepted it less.

Fig. 5 Angry reactions because the passerby had to evade the robot

4.2 Feelings, Concerns and Critical Situations

4.2.1 Initial Feelings During the Interaction

The initial feelings (Q6) were threefold (see Fig. 6). Whereas one-third (n = 15) felt normal and at ease during the interaction with the robot, another third (n = 14) felt positively astonished by the robot’s deployment and task execution and were curious about it. A smaller proportion (n = 9) reported feeling uncomfortable or anxious (for examples, see Figs. 4 and 5). The rest did not report any feelings (14%). See Table 5 for quotes.

When explicitly asked whether they had felt safe in relation to the robot (Q7), the majority reported having felt safe (74%), six participants felt insecure, and two people were ambivalent. One participant who felt insecure stated that she did not yet trust autonomous machines enough to have felt safe in the interaction with the robot (’Because you don’t yet have the confidence in machines that run independently because you still think there are some aspects that don’t quite work yet and then it drives towards you. But it was not like that.’, female, 26 years). Another participant who felt insecure would have preferred a human operator (’Not really. So if there had been a person operating it, I would have preferred that.’, male, 64 years).

To find user groups with the same emotional reaction to the robot, it was investigated which participants provided the same answers to Q6 (feelings), Q7 (feelings of safety), and Q8 (concerns).

Three participants gave the same answers: they experienced negative emotions (anxious or discomforted), felt unsafe, and had concerns about collisions. All three were female (age: 19, 23, 65) and had had direct interactions with the robot (either at the lockers or because it drove frontally towards them). The 65-year-old reported feeling insecure at first, but when she stood in front of the robot to test whether it would stop, she was reassured when it did. Screenshots from her interactions and statements can be found in the Appendix, Table 11.

R.2:

User-centred navigation could render an autonomous robot in public more predictable and acceptable: e.g., a visible reduction of speed when an obstacle is detected, avoiding abrupt turns [46, 77], and keeping a socially accepted distance to passersby (for examples of how this can be implemented, see [78, 79]). In doing so, the increased space needs of vulnerable passersby should be considered (e.g., elderly, passersby with disabilities [76, 80], children [81, 82]). Future work is needed on social robot navigation for vulnerable groups.
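To illustrate the speed-reduction idea in R.2, the sketch below scales the robot's speed with the distance to a detected person and stops at an assumed social distance. All thresholds and names are assumptions for illustration; they are not taken from the study or from the CR700.

```python
# Illustrative sketch of the speed-scaling idea in R.2: visibly slow down
# when approaching a detected person and keep a social distance.
# All values are assumed for illustration, not from the study.

SOCIAL_DISTANCE_M = 1.2   # assumed comfort distance to passersby
SLOWDOWN_RANGE_M = 3.0    # start decelerating within this range
MIN_SPEED, MAX_SPEED = 0.3, 0.8  # m/s

def target_speed(distance_to_person_m: float) -> float:
    """Scale speed linearly between the social distance and the slowdown range."""
    if distance_to_person_m <= SOCIAL_DISTANCE_M:
        return 0.0                      # stop and wait, do not intrude
    if distance_to_person_m >= SLOWDOWN_RANGE_M:
        return MAX_SPEED
    frac = (distance_to_person_m - SOCIAL_DISTANCE_M) / (SLOWDOWN_RANGE_M - SOCIAL_DISTANCE_M)
    return MIN_SPEED + frac * (MAX_SPEED - MIN_SPEED)
```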

Table 5 Example Quotes for Participants’ Categorized Feelings (Q6)

Fig. 6 Interview data. Pie chart showing participants’ initial emotional reactions to the autonomous service robot in public, n = 46

Fig. 7 Questionnaire data. Boxplots of participants’ a trust and predictability ratings assessed with a 7-point Likert scale and b acceptance ratings assessed on a 5-step semantic differential. c Bar charts depict results of the extreme group comparison via lower and upper quartile (low vs. high robot predictability scores) for trust and acceptance, *\(p < .05\), **\(p < .01\), n = 32

4.2.2 User Concerns and Suspected Critical Situations

The majority of the passersby (68%) did not voice concerns (Q8), though most could not give a reason for being unconcerned. Those who did give a reason said they had no worries because a) they felt safe (n = 3) due to the robot’s low velocity and sensors or b) they trusted the automation (n = 1, ’I’m a train driver by profession, I had to work with a lot of machines, so I trust the technology. Also, if no one trusted the technology, someone would run alongside [the robot] to press the emergency stop [...].’, female, 28 years).

The passersby that had concerns were mostly worried about colliding with the robot (n = 11) because they felt that the planned robot trajectory was unpredictable (’Is it right on my heels or where exactly is it going to drive next?’, female, \(>60\) age group).

The other concerns were job loss due to continuing automation (n = 2) (’Well, nobody likes to do the work that the robots do, but people used to do it who were paid for it. That’s why I am ambivalent.’ [...], female, 65 years) and robot malfunctions and programming errors (n = 1) (’As a technician, you always think the worst case: programming errors are not excluded.’, male, 63 years).

R.3:

It could be beneficial to inform the public beforehand (e.g., via information campaigns about the deployment of an autonomous robot in a certain area and about whether or not this leads to job loss) [83].

In three cases, there was also a change in concerns before and after the interaction with the robot. Two participants said that at first they had concerns regarding collisions, but when they passed the robot without any issues, they were reassured. For one person who observed a near-collision, it was vice versa (’Not really, because it moves very slowly. But I just observed a woman walking past and almost colliding with the machine. That’s why I think you have to be more careful as a human being, because the machine doesn’t look out for you, so to speak.’, female, 21 years).

Concerning critical situations (Q9), participants named concerns regarding interactions with vulnerable users (see Fig. 8) like kids ([...] ’small kids which it might not sense that well’, female, 25 years), older adults, and people with disabilities (’I imagine that for older adults, it might seem threatening and for visually impaired people’ [...], female, 33 years), as well as interactions with pets (’Very fast-moving objects, like dogs for example’, female, 25 years). The robot touching objects like luggage, strollers, or walking canes was mentioned as critical as well. Some people also feared sensor failures and malfunctions.

4.2.3 Observed Critical Situations

Contrary to the reported user concerns, nothing critical (e.g., danger of harm, misunderstandings) happened in 98% of the observed interactions, but two non-harmful collisions occurred, and several startled reactions of passersby were observed due to the robot unexpectedly turning on the spot (for examples, see Fig. 9a,b).

Collisions. Both collisions involved train station staff members and were human-caused. The first was a staff member who ran around the corner into the stationary robot, which he had not seen, and complained with an annoyed gesture (see Fig. 9c). In the second collision, an employee pushed a transport cart loaded with boxes (which blocked her view) into the robot because she did not see (or hear) the robot approaching. The robot could not swerve to the right because there was a passerby; to the left was the wall. A faster braking reaction by the robot would not have prevented the collision either, as the approaching human did not swerve at all. Both collisions show that human-caused collisions can sometimes be hard for the robot to avoid. For these cases, the robot needs to be equipped with interaction strategies that inform passersby about its presence and warn them if a collision is imminent. To prevent such collisions with staff in public places, staff should be informed of the robot’s deployment beforehand.

R.4:

Staff should be informed beforehand that autonomous robots will be in use in their workspace to prevent human-caused collisions [84, 85].

Evasive Actions of Passersby. The robot behavior that produced the most critical situations with passersby was the unexpected turn at the end of the cleaning area, when passersby had to swerve suddenly to avoid a collision. When the robot turned at the end of the aisle, it left the passersby insufficient space to walk past, so they had to change their path (see Fig. 4). Swerving can be particularly troublesome for passersby with walking disabilities, as Fig. 4a shows: a person with a walking disability had to quickly avoid the robot when it turned at the end of the cleaning area and blocked her path. When passersby had to swerve around the robot, this was sometimes accompanied by annoyed reactions and respective hand gestures (see Fig. 5).

Fig. 8 Screenshots with examples for co-existence with vulnerable passersby, top: passersby in wheelchairs, middle: passersby with walking canes, bottom: parents with strollers

4.2.4 Behavioral Reactions of Vulnerable Passersby

In line with the user concerns in the interviews, interactions of vulnerable passersby with the robot warrant special attention (examples can be seen in Fig. 8).

Elderly and Passersby With Walking Disabilities. Passersby over 60 years of age and passersby needing a walking aid were observed to intentionally adapt their path (e.g., walking in a circle around the robot) to keep more distance from the robot than others (for examples, see Fig. 8b). Wheelchair users did not react notably differently from other passersby (see Fig. 8a).

Children. Child-robot interactions were also observed. Younger children naturally showed more playful behavior, such as running back and forth and running alongside the robot. On one occasion with a pair of brothers (see Fig. 10), the elder brother tried to prevent the younger brother from approaching the robot too closely by tugging him away. Later, the younger brother returned and touched the robot playfully at the front and then stood in front of the robot and waited. The robot had stopped at the time and was not moving. The mother was not visible on the camera during the 2-minute interaction of the children.

Fig. 9 Screenshots with examples for HRI issues in public

Fig. 10 Screenshots with an example of child-robot interaction

R.5:

Parental supervision cannot be assumed at all times when children interact with an autonomous robot in public, so malfunctions of the robot (e.g., failure to detect the child) should be conveyed in a way that younger children, who are potentially unable to read, can understand (e.g., non-verbally via sounds, colors, and movement). Future work is needed regarding child-friendly communication modalities in public.

Parents Pushing Strollers. Some anxious reactions of parents pushing strollers were observed. One pair of parents pushing a stroller was visibly concerned about the robot’s trajectory and hesitated to pass it (Fig. 9d). When the robot started moving, the woman swerved around it while the man shielded the stroller from the robot with his body. This couple had already seen the robot in a prior passing, during which they had had no interaction with it (see Fig. 8c, left picture). Another pair of parents pushing a stroller experienced the previously inactive robot unexpectedly starting to approach the stroller, which made them speed up their walking pace to evade it (see Fig. 8c, center picture). However, in most cases, passersby pushing strollers were not concerned about the robot (for an example, see Fig. 8c, right picture).

4.2.5 Preferences for Communication with an Autonomous Robot in Public

Although the lack of an interaction concept for the robot led to observed misunderstandings, the majority (62%) of the 49 respondents did not want to communicate with a robot in public (Q11). Most did not provide a reason; those who did rejected communication with a robot/machine in general (n = 4). It was also stated that the robot was too loud for communication (n = 2) and that it would be sufficient for the robot to clean and drive collision-free (n = 2). Three participants said that the operator should be addressed instead of the passerby.

One-third of the affirmative participants desired communication of the planned robot trajectory and of whether passersby had been registered as obstacles. More detailed results regarding design ideas for robot trajectory projection can be found in [66]. When asked how the robot could warn inattentive users in public (Q12), 71% of the participants preferred acoustic warnings, for instance, like on a lorry (n = 3) or a siren (n = 2). Thirty-five percent of the participants desired visual signals like a warning light.

R.6:

Speech does not seem to be preferred as a communication medium for a mechanoid service robot in public. Non-verbal strategies such as projections or sounds might be more acceptable [66].

4.3 Human-Robot Conflicts

4.3.1 Observed Use Cases for Robot Assertiveness

As a special case of HRI, robot assertiveness was investigated as a potential conflict resolution strategy (e.g., for conflicts with groups). For instance, human-robot task prioritization conflicts can occur in public when a person tries to put his/her luggage into a locker while the robot plans to clean the same spot (see Fig. 11). If no conflict occurs, as in Fig. 11a, there is no need for the robot to be assertive as it has enough space to pass the person at the locker.

In contrast, situations were observed in which the robot’s path was blocked by a passerby or a group, either intentionally or through inattention (see Fig. 12b,c). The first group consisted of five people who had stopped, blocked the robot’s path, and observed its reaction (see Fig. 12b). The second group, consisting of four people, blocked the robot’s path through inattention (see Fig. 12c). If the group makes way for the robot, as in Fig. 12a, it slowly continues driving; but if there is not enough space to pass, the robot currently stops and waits, which makes its cleaning process inefficient.

R.7:

The robot needs to be equipped with multi-user interaction strategies, especially if groups block its way. These could be bystander interventions [61] or assertive robot interaction strategies (see next recommendation).

4.3.2 Participants’ Preferences Regarding Assertive Robot Behavior

A potential solution to the aforementioned conflicts could be that the robot asserts itself and, for instance, poses a request to the passerby if necessary. To explore if and how passersby might accept such robot behavior, 45 participants were asked whether a robot should assert itself if humans stand in its way (Q14).

Table 6 Example Quotes for Decision Regarding Robot Assertiveness (Q14)

The majority of participants (73%) rejected an assertive robot, seven (15.5%) accepted it, and two were ambivalent (see Fig. 13). Three said explicitly that this was a difficult question that they could not answer (for example quotes, see Table 6). The most frequently named reason for rejection was the principle that a human always has priority (n = 12) and that humans should be superior to the robot (n = 3). Five participants said the robot should just stop, two said it should swerve, and two were concerned about vulnerable passersby getting hurt.

The seven participants who accepted robot assertiveness were asked how the robot could assert itself (Q15). Acoustic signals were mentioned by all seven, with four participants preferring sounds like a horn or humming and one person suggesting a gradually rising warning signal (see Table 7). Two participants preferred speech for the robot to assert itself, such as a command (’Step aside!’) or a humorous announcement (see Table 7). Additionally, one ambivalent person (male, 39 years) stated that it might be worth looking at how professional cleaners assert themselves and investigating how this could be transferred to a robot (see Table 7).

R.8:

Acceptably designed assertive robot interaction strategies could allow an autonomous robot to function in public even when impediments to task performance occur (e.g., passersby block its way). These strategies could be verbal (e.g., requests [63, 64, 86, 87]) or non-verbal (e.g., sounds, movements [88, 89]) based on robot type and capabilities.

Fig. 11 Example use cases for robot assertiveness: a the robot has enough space to swerve, b inattentive passersby at the locker block the robot’s path, c inattentive passersby do not leave enough space for the robot to pass

Fig. 12 Examples for observed group interactions: a the group lets the robot pass through, b the group is blocking the robot’s path, c an inattentive group blocks the robot’s path

Fig. 13 Interview data. Answers to the question: ’Should the robot assert itself when humans stand in its way?’, n = 45

Table 7 Example Quotes for Design Ideas for Robot Assertiveness (Q15)

4.3.3 Participants’ Ideas for the Prevention of Robot Bullying

Regarding robot bullying, seven incidents were observed that could be termed ’testing behavior’ rather than real bullying aimed at damaging the robot. Passersby actively blocked the way of the robot (see Fig. 14) or suddenly jumped in front of it to see how it would react; some also cornered it. Although the severity of the observed behavior was not high in the presented study, it could become critical if bolder actions by the passersby resulted from it (e.g., damaging the robot), if such behavior lasted for several minutes, or if it was exhibited by a group of people. This might render an autonomous cleaning robot in public rather inefficient.

Fig. 14 Screenshot of passerby’s testing behavior

Consequently, twenty-nine participants were asked how it could be prevented that the robot is deliberately damaged or ’teased’. Suggestions concerned improving the material (n = 5) (’You might have to make it very sturdy, like all things, such as vending machines for bus tickets’ [...], male, 64 years) and acoustic signals to protect against destruction (n = 4), such as a voice (’A voice that then scolds people who approach.’, female, 50 years) or a warning signal (’Maybe warning signals [...] if you touch it.’, female, 25 years). One participant mentioned putting a warning sign saying ’video surveillance’ on the robot.

In contrast, seven respondents (24%) assumed that no protection against destruction was possible (’You can’t build in such a security nowadays in my opinion, because if they want to destroy it, they do it’, female, 42 years). One respondent stated that deliberate destruction might decrease as soon as people got used to autonomous robots in public (’If there are enough of those robots driving around, it will cease’, male, 55 years).

R.9:

It is to be expected that passersby will test the ability of an autonomous robot to stop if an object is in its way. If the testing behavior gets violent (i.e., bullying), countermeasures should be in place (e.g., video surveillance, physical modifications to the robot [60], interaction strategies such as emotional responses [90], or inducing bystander interventions [61]).

4.4 Case Studies

Case studies to illustrate the above-mentioned results are described in the Appendix, in Table 11. The observed reactions were sorted by criticality (made visible in the table by color coding), ranging from positive events such as helpful and interested reactions to annoyance, anxiety, and skepticism.

Concerning positive interactions, an interesting instance of helpful behavior by a male passerby (case 1) occurred when he lifted his luggage from the ground at the robot’s approach and stepped back so the robot could clean. Apart from this extraordinary pro-social behavior, other positive reactions occurred, such as interested passersby stopping to film the robot (case 2) and sometimes making curious inquiries about it during the interview (case 3).

Similar to interested passersby, instances of testing behavior marked events that might be harmless initially but could become problematic. Although the observed reactions mostly consisted of testing whether the robot would stop if the person extended his/her legs or arms in front of it (case 4) or stepped in its way (case 5 & 6), it could be considered robot bullying if such behavior lasted for several minutes or was exhibited by a group of people. This might render an autonomous cleaning robot in public rather inefficient.

Annoyed reactions of passersby were also observed; both resulted from the robot violating personal proximity preferences: in one case, the robot was turning at the end of the aisle and the passerby did not have enough space to pass (case 7); in the other, it came too close for comfort while the passerby was stowing luggage into a locker (case 8).

Concerns about the inefficiency of the cleaning robot were voiced by a female passerby (case 9) who was skeptical whether the robot would perform its job properly and worried that automation of that kind would lead to job losses. Additionally, she was worried about where the robot would drive next and whether it would collide with her. These concerns were also repeated by other interviewed participants (see Interview data, Subsection 4.2.2).

4.5 Summary of Results

  1. The majority of passersby noticed the robot.

  2. About one-third of passersby ignored the robot on the first encounter.

  3. One-quarter of passersby evaded the robot.

  4. Age groups did not differ in their initial reaction towards the autonomous cleaning robot.

  5. Acceptance of and trust in the robot were quite high among the participants who filled in the questionnaire.

  6. Initial emotional reactions to the robot were mostly neutral (feeling normal/well), but amused and discomforted reactions were also reported.

  7. The majority of passersby were not surprised that the cleaning robot was autonomous.

  8. Most concerns regarded safety and collision avoidance.

  9. Critical situations were rare (2% of observed cases).

  10. Lack of transparency about the robot’s status and planned trajectory (e.g., turning at the end of the aisle) led to misconceptions and critical situations.

  11. One-fifth of participants who filled in the questionnaire perceived the robot’s actions as unpredictable. These participants also trusted and accepted the robot less.

  12. The majority did not want to communicate with a mechanoid public service robot.

  13. Visual signals were preferred for conveying the robot’s planned trajectory.

  14. Assertive robot behavior was rejected by the majority of respondents, but a small proportion was in favor and named auditory and visual signals as design possibilities.

  15. Robot bullying did not occur with the heavy, mechanoid robot; only testing behavior was observed.

5 Discussion

The presented study explored the initial behavioral, cognitive, and emotional reactions of passersby to an autonomous cleaning robot in public. In general, the results showed that the co-existence between passersby and a public cleaning robot in a hectic environment worked quite well. Most passersby felt safe and normal in the robot’s presence, and trust and acceptance were high. However, interaction issues (e.g., one-quarter of observed passersby actively evaded the robot) and concerned or anxious user reactions were also observed. Case studies were presented in which individual behavior ranging from interest to anxiety could be observed. Together with the observations of minor incidents such as startled reactions or the two non-harmful collisions, the case studies serve to derive recommendations for the future design of robot interaction strategies in public.

Regarding RQ1, the observed reactions in public were partly comparable to other observational studies but also produced new findings concerning passersby’s reported thoughts and emotions. Results were comparable regarding the generally low interest of passersby in the robot [9, 21] and occurrences of testing behavior such as blocking the robot’s path [57]. Although two-thirds of the station visitors looked at the robot, one-third of passersby ignored it despite the novelty of having an autonomous cleaning machine operate in a crowded environment, which might point toward the notion of existence acceptance [15] in public. This is also supported by the high trust and acceptance ratings from the questionnaires (see discussion of RQ4).

However, the moving robot in our study attracted more attention than the two stationary robots at the Japanese train station (noticing occurred in 61% of cases compared to 35%) [9]. At the same time, user reactions like touching or pointing were observed comparably rarely by [9] (three times per day across eight days of observation) and in our study. Hayashi and colleagues [9] attributed the lack of interest to the hectic train station setting with travelers under time pressure. Additionally, the low interest of passersby might be due to the mere co-existence with a cleaning robot, which holds no primary interaction goal for the passersby [15] if there is no trajectory conflict.

Age effects on passersby’s initial reactions were also investigated. For frequently observed reactions such as noticing or ignoring, no differences occurred. However, less frequent behavior such as touching or blocking the way was not observed in all age groups. For instance, older adults were never observed touching the robot but rather kept their distance. This might be explained by the fears of collision and falling/tripping reported in this age group. In a previous study with a telepresence robot for home use by older adults, safety requirements for a mobile robot’s movement in the vicinity of older adults have been described to prevent scared reactions [91].

In contrast, children were more likely to touch and run alongside the robot as an expression of playfulness, which has also been shown in previous studies [57,58,59]. As the robot was heavy and taller than most children, it is remarkable how closely the children approached it. At the same time, this is a warning that robotic system failures need to be communicated in a way that allows passing children to understand when they should not approach a malfunctioning robot.

Regarding vulnerable passersby, and contrary to reported passersby concerns, nothing critical happened. While passersby with walking disabilities actively kept more distance from the robot, wheelchair users did not react anxiously to the robot but simply drove past it.

The presented findings show the need for the user-centered design of interaction strategies for autonomous robots in public, considering the needs of different age groups and vulnerable passersby. To understand the experience the passersby had with the autonomous cleaning robot, reported thoughts and emotional reactions are discussed in the following.

Concerning reported emotional reactions (RQ3), most passersby felt safe and normal in the presence of the robot. However, participants who were not sure whether the robot had registered them showed more concerns and negative emotions (see below). Regarding participants’ thoughts and concerns (RQ2), participants reported having been reassured by the robot’s object avoidance and low velocity, as has been shown in a field study with an airport guidance robot [27].

Initial trust in and acceptance of the autonomous cleaning robot were also high (RQ4), but lower trust and acceptance ratings were found for participants who perceived the robot as unpredictable (see discussion of interaction conflicts). In contrast, there was also a group of passersby who were not confident that the robot had registered them during the interaction or could avoid objects. These participants tended to voice concerns about collisions and report negative emotions. We also observed a change in concerns before and after the interaction for two participants who initially had concerns about colliding with the robot but were reassured once they had interacted with it. This is in line with [30], who also observed more positive attitudes towards robots after the interaction.

Most passersby who felt normal and safe in the robot’s vicinity reasoned that the robot would not be deployed if it were not safe, or that it had sensors and would avoid objects. This is a common lay belief, also found in other studies on user mental models and assumptions about robot capabilities [33, 92, 93]. Such beliefs about the robot’s functions might sometimes lead to overtrust [94], as the occurrences of collisions and evasive behavior show, even though the robot in the present study was equipped with sensors and collision avoidance systems. Therefore, it is important to foster a calibrated level of trust [67, 95] in autonomous service robots in public by exploring the reasons for critical incidents and developing countermeasures such as acceptable and trustworthy robot interaction strategies.

As the present study used a continually moving robot that shared the same space with passersby, interaction conflicts (RQ5) such as active avoidance behavior of passersby, collisions, and startled reactions could be observed. Two collisions with train station staff members happened for which the robot was not responsible. In the first case, the employee ran into the stationary robot; in the second case, the employee pushed a cart into the robot because his/her view was obstructed. As in the well-known dilemma of the ’trolley problem’ [96, 97], the robot could not swerve to the right because there was a passerby, and to the left was the wall.

These incidents show that human errors are difficult to prevent through the robot’s safety features alone and that the robot could benefit from additional interaction strategies. For instance, the robot could first announce its presence and then warn the unaware passerby when a collision is imminent and cannot be prevented by braking or swerving. Such strategies already work for mobile robots to prevent human-robot collisions in industrial applications [98]. As only collisions with employees occurred in the present study, one solution could be to inform the employees about the use of the robot beforehand.
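To make this layered announce-then-warn idea concrete, a minimal Python sketch is given below. It is purely illustrative: the distance threshold, the time-to-collision estimate, and the signal names are assumptions and do not describe the robot or software used in the presented study.

```python
# Illustrative sketch of a layered presence/warning strategy for an autonomous
# service robot. All thresholds and signal names are assumptions, not the
# system observed in this field study.

from dataclasses import dataclass
from enum import Enum, auto


class Signal(Enum):
    NONE = auto()
    ANNOUNCE_PRESENCE = auto()   # e.g., soft ambient sound or floor projection
    WARN_COLLISION = auto()      # e.g., louder tone directed at the passerby


@dataclass
class PersonTrack:
    distance_m: float            # current distance to the person
    time_to_collision_s: float   # hypothetical estimate from relative velocity


def choose_signal(track: PersonTrack,
                  can_brake_or_swerve: bool,
                  announce_radius_m: float = 4.0,
                  warn_ttc_s: float = 2.0) -> Signal:
    """Announce presence early; escalate to a warning only if a collision is
    imminent and cannot be resolved by braking or swerving."""
    if track.time_to_collision_s < warn_ttc_s and not can_brake_or_swerve:
        return Signal.WARN_COLLISION
    if track.distance_m < announce_radius_m:
        return Signal.ANNOUNCE_PRESENCE
    return Signal.NONE


if __name__ == "__main__":
    # An unaware passerby pushing a cart, as in the observed staff collision
    print(choose_signal(PersonTrack(distance_m=1.2, time_to_collision_s=1.1),
                        can_brake_or_swerve=False))
```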

As the cleaning robot was not equipped with an HMI, the unpredictability of the robot’s actions led to several problems with HRI:

  1. the robot’s turn at the end of the aisle was unexpected

  2. half of the participants were not sure if the robot had registered them

  3. one-third of participants were anxious in the robot’s vicinity

  4. most reported concerns were about collisions

  5. participants who perceived the robot as unpredictable trusted and accepted the robot less

Consequently, based on the theory of transparent system design [12, 99, 100], the robot’s planned actions, status, abilities, and limits need to be communicated clearly [99, 100] to prevent disappointment and calibrate trust accordingly [50, 67, 101] when service robots operate autonomously in public.

However, as most passersby felt safe and normal during the encounter with the robot and a majority of them indicated a preference not to communicate with a robot in public, an acceptable level of information conveyance for robotic interaction strategies (RQ6) needs to be considered. On the one hand, the robot should act predictably and indicate its planned actions; on the other hand, it should not present unnecessary information in order not to annoy uninvolved passersby.

Finding this balance is a challenging task for future work, especially as the majority in the presented study did not desire to communicate with a service robot in public. Those who did wanted the robot to convey its planned trajectory as well as whether or not the passerby had been registered by the robot. Ideas for visual solutions known from the industrial application context (e.g., projections on the floor [18, 54]) can be found in [66]. However, a caveat with visual signals in public space is that they are not accessible to visually impaired people, for whom auditory interfaces are more suitable [76, 102]. Therefore, the resulting interaction strategy for an autonomous cleaning robot should be multi-modal to suit various passersby’s needs.
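A minimal sketch of such a multi-modal announcement is given below; the channel names and the pairing of a floor projection with a short non-verbal auditory cue are assumptions made for illustration, not the interface of the robot in this study.

```python
# Illustrative sketch of a multi-modal announcement of the robot's planned
# trajectory and "you have been registered" status. Channel names are
# hypothetical placeholders, not an existing robot API.

from typing import List, Tuple

Waypoint = Tuple[float, float]  # (x, y) in the robot's map frame


def announce_plan(waypoints: List[Waypoint], person_registered: bool) -> List[str]:
    """Return the signals to emit: a visual floor projection for sighted
    passersby plus a brief non-verbal auditory cue so that visually impaired
    passersby are not excluded."""
    signals = [f"project_path_on_floor({waypoints})"]
    if person_registered:
        signals.append("project_acknowledgement_icon()")  # e.g., light pattern
        signals.append("play_short_chime()")              # redundant auditory cue
    return signals


if __name__ == "__main__":
    for signal in announce_plan([(0.0, 0.0), (2.0, 0.0), (2.0, 1.5)],
                                person_registered=True):
        print(signal)
```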

Unlike previous studies in public [16, 26], participants’ answers indicated a desire for auditory signals but no preference for verbal interaction. This might be due to the non-anthropomorphic robot design, which might not raise passersby’s expectations of humanlike communication [103], its non-social task [29], and the public setting [15].

A preference for non-verbal interaction due to the mechanoid robot appearance has also been described in [104]. Cowan and colleagues [105] noted that the use of a human voice might contribute to user expectations of uniquely human abilities that are not met by the robot’s current capabilities. Hence, when designing an HMI for an autonomous cleaning robot in a public space, multi-modality might be preferable, and the practicality of the solution, also with regard to robot bullying and vandalism, needs to be considered.

Concerning robot bullying (RQ7), sharing the space with an autonomous robot led seven passersby to test the robot’s abilities, thereby hindering it from “doing its job.” Although behavior aimed at damaging the robot was not observed, testing behavior aimed at judging the robot’s abilities (e.g., object detection) was registered. The observed reactions mostly involved testing whether the robot would stop when the person extended his/her legs or arms in front of it or stepped in its way.

The observed testing behavior has also been reported by Salvini and colleagues [57] for a small, unsupervised interactive service robot at a fair in 2010. Other studies also report more abusive behavior shown by children when interacting unsupervised with a robot [58, 59, 106], which was not observed in the present study. This might be due to the train station setting, where children were always accompanied by adults, in contrast to other studies at malls, for instance, where children are sometimes left unattended. Although the cleaning robot in the presented study could be considered ’mindless’ compared to humanoid robots [60], it was also rather large and heavy compared to most of the smaller and lighter robots used in previous studies. The robot’s physical presence might have prevented serious abuse. However, future studies comparing rates of robot bullying in public across different robot types are needed to draw this conclusion.

The passersby were also asked how robot bullying could be prevented. About a quarter thought it was impossible to protect the robot. The other respondents imagined warning or annoying sounds, as well as video surveillance, to be useful. Future studies should investigate whether these strategies effectively prevent, or at least stop, abusive behavior towards robots. It should also be investigated whether nighttime increases the risk of vandalism, as it does with other objects in public (e.g., vending machines, park benches). In this context, group interactions should also be targeted, based on the observed interactions and because robot bullying becomes more likely with an audience [59, 61].

Similar to robot bullying, which renders a robot ineffective, situations were observed in public where passersby hindered the robot in its task execution either intentionally (as observed in some groups) or through inattention (e.g., a person putting luggage into a locker or operating a smartphone).

Thus it was explored whether and under which conditions passersby might accept an assertive robot (RQ8). The majority of participants did not desire an assertive robot, as they wanted the human to be superior to the robot. The desired submissive role for a service robot and the human-robot power asymmetry have also been found previously in surveys [107,108,109]. Three respondents explicitly said that they could not provide an answer to such a difficult question, which merits further research into perceived boundaries for robot actions such as assertive behavior.

The participants who were ready to accept assertive robots suggested speech (humorous or as a command), acoustics (horn or siren), and visual signals for the implementation of robot assertiveness. The suggested modalities were similar to those in a previous study that investigated the assertive strategies of a tour guide robot based on the principle of escalation [11]. The robot applied different strategies depending on how long the passerby blocked its way: 1) it first explained that it needed space; if the person did not step aside, 2) it played a horn sound accompanied by a sad face; 3) finally, it showed an angry face and said ’You are in my way!’. The authors reported that not everyone stepped aside, although the robot’s intention was understandable. Nevertheless, the assertive robot could navigate more quickly through the crowd than the non-assertive robot [11]. Unfortunately, the authors did not provide proportions of compliance and non-compliance.
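A time-based escalation of this kind can be summarized in a few lines of code. The following sketch only paraphrases the three stages reported for the tour guide robot in [11]; the timing thresholds are illustrative assumptions and do not reproduce that study's implementation.

```python
# Sketch of a time-based escalation scheme for assertive robot behavior,
# loosely following the three stages reported in [11]. Thresholds are
# illustrative assumptions.

def assertive_response(blocked_for_s: float) -> str:
    """Map how long the robot's path has been blocked to an escalating response."""
    if blocked_for_s < 5.0:
        return "speech: 'Excuse me, I need some space to continue.'"
    if blocked_for_s < 10.0:
        return "sound: horn + display: sad face"
    return "display: angry face + speech: 'You are in my way!'"


if __name__ == "__main__":
    for seconds_blocked in (2.0, 7.0, 12.0):
        print(seconds_blocked, "->", assertive_response(seconds_blocked))
```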

In summary, although the majority of respondents in the present study rejected an assertive robot, this topic merits further investigation as a balance has to be found between what users desire (especially if they have no experience yet with assertive robots) and the effectiveness of a public service robot [11, 63, 88]. The proposed acceptable, assertive robot strategies provide inspiration for future investigations regarding their potential to make public service robots navigate more effectively in our social environments.

Table 8 Future Work Based on Design Considerations

5.1 Strengths and Limitations

The presented insights are valuable because they stem from unbiased real-life interactions with an autonomous, mechanoid service robot in public. As the passersby did not expect to interact with a robot, they can be considered naïve to the interaction and showed more naturalistic behavior (e.g., testing behavior) than would have been observable in a controlled lab experiment (e.g., due to experimenter effects).

Additionally, as a wide variety of passersby from different age groups, educational levels, and economic backgrounds was observed, the sample can be considered as heterogeneous as the examined cultural group.

Finally, by combining the different research methods of observation, interviews, and questionnaires, objective and subjective views on initial HRI could be gained to create a more holistic picture of UX with a mechanoid, autonomous robot in public, highlight opportunities for improving HRI, and indicate future work.

Although field studies have high external validity, limitations to internal validity need to be considered. Due to the hectic environment, not all participants could be asked all questions, which reduced the amount of data that could be collected. This was also true for the questionnaire data, as only half of the interviewed participants had additional time to fill in the questionnaire. Also, the HRI was not the same for all passersby (e.g., time of day, groups, other passersby present and reacting to the robot), as it would have been in a controlled lab environment. However, when passersby encounter autonomous robots for the first time in the future, they will also have different initial experiences with them, especially if the first contact takes place in public. In this respect, this study can contribute to a positive initial experience by making recommendations on how to improve the initial HRI and avoid the observed issues.

Apart from field study limitations, methodological limitations have to be considered as well. As extreme group analysis can suffer from variance restriction, existing recommendations [110] were followed to circumvent this issue, and non-parametric, rank-based tests were performed instead of parametric tests based on variance. Regarding the interview data coding, no inter-rater reliability could be calculated, as both coders developed the coding schemes together in an iterative process instead of coding separately. This was intentional, as one common solution was favored over high coding congruence, which might be necessary when creating stimulus material (e.g., learning material). Regarding participants’ rejection of assertive robot behavior, it is conceivable that most respondents imagined overly drastic assertive robot actions (no examples were given of how the robot could assert itself, as this was itself a topic of investigation). This was also mirrored by two respondents raising the issue that an assertive public robot could harm vulnerable passersby. Future studies need to ensure that participants understand that causing harm is not intended with the introduction of assertive robots.
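For readers unfamiliar with rank-based alternatives to variance-based tests, a minimal example is sketched below. The group labels and rating values are invented solely for illustration and are not the study's data; the sketch only shows the type of test used, not the actual analysis.

```python
# Minimal illustration of a rank-based (non-parametric) comparison of two
# extreme groups, e.g., trust ratings of participants who did vs. did not
# perceive the robot as unpredictable. The values below are invented for
# illustration and are not the study's data.

from scipy.stats import mannwhitneyu

trust_predictable   = [4, 5, 4, 5, 3, 4, 5]   # hypothetical Likert-type ratings
trust_unpredictable = [2, 3, 3, 2, 4, 3]

# The Mann-Whitney U test compares rank sums and does not rely on the
# variance assumptions of parametric tests.
statistic, p_value = mannwhitneyu(trust_predictable, trust_unpredictable,
                                  alternative="two-sided")
print(f"U = {statistic}, p = {p_value:.3f}")
```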

5.2 Design Considerations and Future Work

The presented study aimed to highlight design challenges that arise when naive passersby interact with an autonomous service robot in public. Several challenges were identified and recommendations, also based on literature, were provided.

The recommendations in the present study are comparable to those by Weiss and colleagues [26] and Mubin and colleagues [3]. The main difference lies in the design of the investigated robot, as mentioned in Related Work, Section 2.1.2: while proactive, verbal interactions were previously recommended for the humanoid robot [3, 26], non-verbal interaction was favored for the mechanoid robot, with communication desired only when necessary. Hence, the appearance of the robot needs to be considered when implementing interaction strategies in public.

For non-social service tasks, a mechanoid appearance might suffice, as it does not raise expectations about capabilities (e.g., speech) [103] that might complicate public HRI (e.g., multilingualism, surrounding noise, people with hearing impairments), while at the same time it seems to be rather accepted in the public ecosystem.

The study identified a number of concepts that can inspire future work (see Table 8) and define a research agenda in the area of co-existence and human-robot interaction in public to increase passersby’s existence acceptance of an autonomous service robot.

The main points of the research agenda are an inclusive design of robot 1) navigation and 2) communication strategies to cater to the needs of vulnerable passersby (e.g., elderly people, children, individuals with disabilities), and 3) the need for acceptable and effective conflict resolution strategies to avoid an inefficient robot (e.g., due to testing behavior or blocked paths).

The results from this study also informed a longer list of design requirements covering the acceptable and trustworthy design of public robots, as well as organizational requirements for a safe, acceptable, and trustworthy HRI in public [111].

Table 9 Example of Coding Scheme
Table 10 Observed User Reactions: Per Age Group and Total Count

6 Conclusion

The presented field study at a German train station explored the initial behavioral, cognitive, and emotional reactions of passersby to a mechanoid, autonomous cleaning robot in public. It combined observation of natural behavior with user interviews and questionnaires about trust and acceptance. In general, the results showed that the co-existence of passersby and a public mechanoid cleaning robot in a hectic environment already works quite well, as most passersby felt safe in the presence of the robot. Trust and acceptance of the robot were high, except among passersby who perceived the robot’s actions as unpredictable. In two percent of observations, minor incidents like collisions occurred that might have been prevented had the robot communicated its planned trajectory. For this, the interviewed passersby desired interaction strategies that convey only the most necessary information, so that the robot is predictable but not annoying in a shared space. The results of this study provide valuable insights into natural HRI to inspire the informed design of acceptable and trustworthy interaction strategies for autonomous service robots in public.