1 Introduction

Autonomous mobile robots are expected to provide various services safely and securely in human-crowded spaces such as shopping malls, station concourses, and airport lobbies [1, 2]. Such robots must be capable of behaving collaboratively with humans, considering their own movement purpose (e.g., human guidance and object transport), human movement, and the surrounding environment. Human–robot collaboration can be roughly divided into two types [3, 4]. One is a passive avoidance strategy (PAS), which thoroughly avoids interfering with any human movement by stopping and detouring, based on the principle of ‘prioritizing humans.’ The PAS has been applied to most conventional autonomous mobile robots through typical path-planning methods, including the dynamic window approach (DWA) [5] and rapidly-exploring random trees (RRT) [6]. However, this conventional approach, in which the robot only searches for a collision-free path, does not allow it to coordinate with humans even when it tries to pass through a narrow corridor. This causes the freezing problem [7,8,9,10]. The other is an active inducement strategy (AIS), in which the robot moves collaboratively by conveying its intent to the human using gestures, voice, and physical touch [3], as humans do. The AIS, which enables the robot to perform proximal navigation, could provide more robust and efficient navigation even in human-crowded spaces. Thus, the AIS can be a powerful strategy for solving the freezing problem [3].

In our previous studies on AIS, we have analyzed various scenes where mobile robots interact with humans (i.e., situations where the relative distance is short) and proposed a movement control scheme by active inducement [11, 12]. We have then proposed an interactive navigation framework with situation-adaptive multimodal inducement, considering robot efficiency and human psychology [3]. Experiments revealed that the proposed framework enabled the robot to move more efficiently and naturally by using the inducement methods, e.g., path indication, voice interaction, and notifying touch, depending on situations, e.g., space attributes, available width, and the number of humans.

1.1 Intent Conveyance and Robustness to Misrecognition

On the other hand, we found that the navigation system should be improved regarding the effectiveness of intent conveyance and the robustness against misrecognition. We here explain the importance of those improvements by illustrating a situation in human–human communication.

First is the effectiveness of intent conveyance. Before \(A\) starts to communicate with \(B\), \(A\) observes \(B\)’s state to identify feasible and effective ways of communication. If \(B\) is aware of \(A\), \(A\) can immediately convey what \(A\) wants to tell. However, if \(B\) is unaware of \(A\), \(A\) needs to use inducible communication, such as voice interaction when \(B\) is reading a book, or physical touch when \(B\) is listening to music at full volume, as shown in Fig. 1a. Physical touch can be used only when \(A\) and \(B\) are close enough to touch, as shown in Fig. 1b. \(A\) then communicates with \(B\) in the selected way. After the initial communication, \(A\) observes \(B\)’s reaction to check whether \(A\)’s intent was successfully conveyed to \(B\). If there is a large gap between \(B\)’s actual and expected behavioral change, \(A\) understands that \(B\) did not receive \(A\)’s intent. In that case, \(A\) needs to determine the next action, such as increasing the strength of the communication, changing the modality, or giving up communication, based on the series of relationships between \(A\)’s way of communication and \(B\)’s reaction, as shown in Fig. 1c. \(A\) then executes the selected way of communication. In sum, humans run a looped process of observing the other person’s state, determining a suitable way of communication, and executing it.

Fig. 1

Inducement suitable for human states, including (a) awareness of the human toward the robot (voice for notice or just go), (b) relative position between human and robot (voice or touch), and (c) reaction of the human after the last inducement from the robot (largely avoid or slightly avoid)

Second is the robustness against misrecognition. \(A\) and \(B\) will interfere if they continue along their current paths. If \(A\) estimates that \(B\) will avoid to \(B\)’s right, \(A\) will avoid to \(A\)’s right. If the estimation is correct, they pass each other safely. However, we (humans) sometimes fail in this estimation, which may lead to a near collision. Nevertheless, humans can quickly recognize the behavioral change of the person they are facing, adequately replan the following action, and immediately execute it. In other words, humans ‘can’ fail in the estimation. In summary, humans have an error-tolerant navigation system based on an effective feedback loop and a rapid recovery action.

1.2 States to be Recognized

The capabilities of current robots for predictive inference and rapid motion control are quite inferior to those of humans. Moreover, human behaviors are inherently difficult to predict, even for humans. Some studies have tried to build predictive models of human motion. In [13], a proactive social motion model was proposed based on the socio-spatiotemporal characteristics of humans and human groups. In [14], socially acceptable navigation was proposed using the social force model as a heuristic approach. In [15], the authors investigated how to recognize a pedestrian’s navigation intent and predict the pedestrian’s motion, and how a robot dynamically adapts its navigation policy when facing unexpected human movements. These studies produced interesting outcomes but indicated that predictive models have limitations in precision and in early prediction. From the above analysis, we found that the navigation system should have three functions: (i) recognition of the human’s intent-receivable state, (ii) recognition of the human’s intent-received state, and (iii) determination of a suitable inducement considering misrecognition.

  • To recognize intent-receivable state (proactive state). The current navigation system decides the inducement depending on a passing width. However, the inducement would not reach humans if they were unaware of the robot. Thus, the system needs to decide the inducement based on the proactive state of the human, i.e., whether the human is in a state where the inducement can be received and which modality can be received.

  • To recognize intent-received state (postactive state). The current navigation system changes the inducement depending on the position change of the human. However, the inducement is repeatedly executed if the position change is repeated. Thus, the system needs to decide the inducement based on the postactive state of the human, i.e., whether the human received the intent from the robot and to what degree it was received.

  • To determine suitable inducement considering misrecognition. As stated above, a new navigation system will determine the inducement based on the proactive and postactive states of the human. However, the system cannot perfectly recognize those human states. Thus, the system needs to decide the inducement based on the gap between the expected and actual behavior changes by inducement (what the intent of the human is).

1.3 Error-Tolerant Navigation (ETN)

In summary, the robot must have an error-tolerant navigation (ETN) method that can robustly deal with estimation errors and failed conveyance. We need to develop a new framework that tolerates misrecognition and recovers from it.

Robot navigation methods based on human-state estimation have been proposed [16, 17]. However, most conventional ones are based on the PAS [18] and focus on modeling precise human behaviors, with the philosophy that ‘robot’s errors are unacceptable’ [13, 19, 20]. They did not focus on human-intent estimation based on active inducement from a robot that is allowed to make mistakes, and there are no comprehensive frameworks for error-tolerant navigation. We thus develop an ETN that can recognize ‘whether a human is in a state where an inducement from others can be accepted’ and ‘to what degree the intention was conveyed.’ The ETN could contribute to a new form of human–robot interaction that accepts errors, whereas conventional ones do not permit mistakes. Some errors are unavoidable, so it is important to always consider the possibility of providing wrong outputs. The ETN thus recognizes errors by observing the human reaction and recovers from them by changing the robot’s behavior. The way of recognizing errors should itself be as ‘error-tolerant’ as possible, so the ETN estimates whether an error happened by comparing the actual avoidance distance with the expected avoidance distance. Even if the robot makes a small mistake in the prediction process, it can prevent a fatal mistake by recognizing the small mistake and recovering from it through an error-correction loop that includes active inducement.

In this study, we target passing scenarios in which a single person walks along an office corridor. The rest of the paper is organized as follows: Sect. 2 explains requirements, Sect. 3 details the development of ETN, Sect. 4 explains experimental conditions, and Sect. 5 describes experimental results and discussion. Finally, Sect. 6 summarizes this study.

2 Related and Required Works

In this section, we analyze conventional studies and clarify the required work.

2.1 Related Works

2.1.1 Intent Estimation and Conveyance

For human–robot interaction, intention estimation is quite important [19, 21]. For conveyance interface design, intent communication methods using light signals and indicators [22], projection [12, 23], and bi-directional intent communication via augmented reality [24] were proposed. A method to convey the navigation intent of a mobile robot to humans by adopting the semantics of a car’s turn indicators was also proposed [25]. An external vehicle interface called the automated vehicle interaction principle (AVIP), which communicates vehicles’ mode and intent to pedestrians, was proposed [26]. A robotic wheelchair presents its future trajectory with light projection according to a goal intended by the passenger [27]. However, those methods require special devices such as a projector, unlike communication between humans, and do not focus on cases where miscommunication occurs.

For human intent estimation in navigation, [28] proposed a method of inferring the navigation intent of humans based on pre-computed motion probability grids, and [29] proposed a framework for inferring and planning concerning the movement intention of goal-oriented agents in an interactive multi-agent setup. For automated vehicles, [30] estimates the driver’s lane-change intent by a Bayesian network-based model in combination with a Gaussian mixture. Ref. [31] proposed a user action model using Gaussian process regression to encapsulate the probabilistic and nonlinear relationships among user action, state of the environment, and user intention. Ref. [32] proposed a system to recognize the human’s hand motion intent and plan a motion to enable the robot to communicate its intent using legible and predictable motion. Ref. [33] investigated an interactive intention-predicting method using bimodal information for a public service robot. Ref. [34] proposed a method to infer human intents denoted by the goal locations of reaching motions using a neural network-based approximate EM algorithm with online model learning. Ref. [35] presented a planning framework that combines implicit (robot motion) and explicit (visual/audio/haptic feedback) communication during robot navigation.

In summary, conventional methods address the human intent estimation and conveyance by focusing on the best solution and situation, such as determining handover timing [36], but they do not consider cases of failing communication as a systematic framework.

2.1.2 Error-Tolerant System (Absolute or Relative)

The ETN can be regarded as a feedback control system, in which the way of achieving the purpose, i.e., making a passable width for passing through a corridor, differs with time-series events and situations. In addition, the system provides a human (plant) with an inducement (operating variable), and the ratio of conveyance changes dynamically with the situation. Thus, it is difficult for the system to estimate the amount of movement change of the human (control variable). Error-tolerant systems are often studied in the field of adaptive control, such as fault-tolerant control for uncertain nonlinear systems with unknown dead-zone and unmodeled dynamics [37, 38]. There, error tolerance is pursued through feedforward and feedback strategies; specifically, an observer can output approximate values by handling unknown (unmodeled) disturbances. Note that such control systems are not supposed to provide wrong outputs, such as mistaking positive for negative, while the ETN can notionally allow such an inverse output.

One idea to solve these problems is to make precise modeling and compensate for disturbance by introducing the latest control technologies. For example, in [39], social momentum, a planning framework for legible robot motion generation, was developed due to the benefits of intent-expressive motion. In [40], a formalism was developed to mathematically define and distinguish the predictability and legibility of motion. However, this approach is not suitable due to unavoidable errors by the limitation of precision in modeling and performance of the controller.

Thus, relative improvement methods such as external force measurement [41] and macroscopic control parameter adjustment [42] would be effective. Consequently, we need to focus on an error (disturbance)-tolerant navigation system.

2.2 Required Work for ETN

As a first step for the purpose, we make a framework based on fundamental functions, focusing on an error-recovering system. From the analysis of the human–human interaction focusing on intent communication and error recovery (compensation) mechanism, as stated in Subsect. 1.1, we could derive a suitable relationship between a situation and inducement method, so we adopt a model-based approach in this preliminary study. To achieve the above requirements, the ETN needs to have the following four functions.

  • Interference-possibility judgment The robot first judges whether the paths of the robot and human interfere, based on environmental information (e.g., walls and obstacles) and human information (e.g., position and velocity), as shown in Fig. 2a. If the robot recognizes interference (\(IP\) = 1), it judges the current state as a situation where either or both of the robot and human must change the path (behavior).

  • Proactive observation (human-awareness judgment) To determine a suitable inducement in \(IP\) = 1, the robot uses visual awareness of the human toward the robot (\(HA\)). Estimating if the human truly recognizes the robot is quite difficult, so the robot estimates it from the face direction of the human, as shown in Fig. 2b. The intent-receivable state can be judged by human position and \(HA\) since the humans need to attend to the robot.

  • Inducement selector The robot then provides the human with inducements during \(IP\) = 1. It selects an inducement method as the initial one, according to the states of the robot, human, and environments as well as \(HA\), as shown in Fig. 2c. The robot then evaluates the human reaction as inducement achievement \(IA\) (postactive state). If \(IP\) is still 1, the robot determines the next inducement based on \(IA\) and the inducement method that the robot provided the last time, as shown in Fig. 2e.

  • Postactive observation (inducement-achievement judgment) After the robot provided the inducement, the robot estimates if the intent was successfully conveyed to the human (\(IA\)) by checking the human reaction, as shown in Fig. 2d. If there is a large gap between the actual and expected reactions, \(IA\) is not achieved (\(IA\) = 0). In \(IA\) = 0, the robot selects subsequent (different) inducement based on \(IA\).

Fig. 2

Diagram of error-tolerant navigation system based on human-state estimation for human–robot collaborative movement, including (a) interference possibility, (b) human awareness, (c) initial inducement selector, (d) inducement achievement, and (e) subsequent inducement selector. Even if hesitation and avoidance repetition occur, the small correction loop reduces large errors

2.3 Preparation

As explained in [3], we refer to a robot’s action to encourage a human to change their cognitive (e.g., awareness of robot), physical (e.g., standing position), and psychological (e.g., comfort) states as ‘inducement.’ The inducement has various types of modalities, such as body movement (appeal to the visual sense), speech (appeal to the auditory sense), and touch/contact (appeal to the haptic/kinesthetic sense). The inducement also has different strengths, such as weak notification of robot intention by body movement as well as strong notification of robot intention by physical interaction. These inducements should be selected depending on the environment, human, and robot, as follows.

  • Path indication The robot modifies its path to convey its intent, i.e., assertion in the same way or compromise in a different way. This inducement is natural, weak, and low-intervene (physical and psychological).

  • Voice interaction The robot uses its voice to notify humans of its presence. This inducement is natural, specific, and medium-intervene (cognitive and psychological).

  • Physical notifying touch The robot uses a weak touch to provide notification and induce the human to give way to the robot. This inducement is high-intervene (cognitive and physical).

As a basic rule, a low-intervene inducement must be selected in the first trial, and if it fails, an inducement one step higher in intervention must be selected. When the interference is not resolved even after the robot provides physical touch (the highest intervention in this study), the robot selects ‘detour to a different route.’ Until it arrives at the goal, the navigation system runs the control loop, including \(IP\), \(HA\), and \(IA\), and adaptively selects its own behaviors.
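The escalation rule above can be illustrated with a short Python sketch. The enum members and the `next_inducement` helper are hypothetical names introduced here for illustration; they are not identifiers from the original system, and the situation-specific selectors in Sect. 3.4 refine this generic ladder.

```python
from enum import IntEnum


class Inducement(IntEnum):
    """Robot actions ordered by degree of intervention (illustrative names)."""
    PATH_INDICATION = 1    # low intervention
    VOICE_INTERACTION = 2  # medium intervention
    PHYSICAL_TOUCH = 3     # high intervention (highest in this study)
    DETOUR = 4             # fallback: give up passing and take another route


def next_inducement(last):
    """Return the action to try after `last` failed (None means first trial)."""
    if last is None:
        return Inducement.PATH_INDICATION
    if last >= Inducement.PHYSICAL_TOUCH:
        # Nothing stronger than touch is available, so the robot detours.
        return Inducement.DETOUR
    return Inducement(last + 1)
```

Calling `next_inducement` repeatedly, starting from `None`, walks the ladder from path indication up to detour.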

3 Error-Tolerant Navigation Framework

The navigation system selects inducement methods based on the interference possibility (\(IP\)), human awareness (\(HA\)), and inducement achievement (\(IA\)).

3.1 Interference-Possibility Judgment

This function outputs a binary value (\(IP\) = 0 or 1) to judge the need for inducement. Accurate detection of human attributes is essential for robot navigation. There are various methods for human detection. One way is to detect leg-like shapes from the laser data [43]. There are also many human trajectory estimation methods, including those using LSTM [44]. These methods incur a computational time delay, and precise estimation is basically difficult, as stated in Sect. 1. Thus, we use a simple pedestrian model that assumes the current velocity is maintained (we will update the model in the future).

3.1.1 Human and Human Velocity Estimation

Fusing the data from both the camera and laser range finder (LRF) generally yields more precise results [45, 46]. First, the LRF scans the environment, and the system extracts values that suggest human-like objects. In our experimental setting, we regard clusters of 5–15 successive points as humans, fit ellipses to these point clouds, and derive their long and short axes. The human posture is estimated from the angle of those axes. The system identifies the front and back of the human by using a function of KINECT v2. To calculate the velocity, we apply the least squares method to position data for about 0.25 s (8 data points). Figure 3 shows the coordinate system. We denote the position (x, y), angle, and velocity of the human and robot as \(\left({x}_{H},{ y}_{H},{ \theta }_{H}, { v}_{H}\right)\) and \(({x}_{R},{ y}_{R}, {\theta }_{R}, { v}_{R})\), respectively. The longitudinal distance between the human and robot along the corridor (\(x\) direction) is denoted by \({l}_{HR}\), as shown in Fig. 3a. These parameters are calculated every sampling period (30 ms).
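As a rough illustration of the velocity estimate, the following Python sketch fits a straight line to the last eight position samples by ordinary least squares and takes the slope as the velocity. The function names and the `(t, x, y)` sample layout are assumptions made for this example; the original implementation is not described at this level of detail.

```python
import math


def ls_slope(times, values):
    """Least-squares slope of `values` against `times` (velocity on one axis)."""
    n = len(times)
    if n < 2:
        raise ValueError("at least two samples are required")
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    if den == 0.0:
        raise ValueError("time stamps must not all be identical")
    return num / den


def estimate_velocity(samples):
    """Return (vx, vy, speed) from samples given as (t, x, y) tuples."""
    times = [s[0] for s in samples]
    vx = ls_slope(times, [s[1] for s in samples])
    vy = ls_slope(times, [s[2] for s in samples])
    return vx, vy, math.hypot(vx, vy)


# Eight samples at a 30 ms period (about 0.25 s), as in the text.
track = [(0.03 * i, 1.0 + 1.2 * 0.03 * i, 0.5) for i in range(8)]
vx, vy, speed = estimate_velocity(track)
```

Fitting over a short window rather than differencing two consecutive positions smooths out sensor noise at the cost of a small delay.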

Fig. 3

Interference possibility (IP), which is calculated based on the relative distance when the robot and human are at the passing point. (a) Current state, (b) future state at passing point, and (c) definition of IP

3.1.2 Interference Judgment

A time \({S}_{t}\) when the robot and human reach the passing point \({x}_{p}\) is calculated based on the current velocity vector of both the human and robot (Fig. 3b), and it is given by

$$ S_{t} = l_{HR} /\left| {v_{H} \sin \theta_{H} - v_{R} \sin \theta_{R} } \right|. $$
(1)

Then the lateral distance between the geometric centers of the human and robot (\(y\) direction) at time \({S}_{t}\), denoted by \({D}_{HR}\), is calculated by

$$ D_{HR} = \left| {(x_{H} - x_{R} ) + S_{t} \left( {v_{H} \cos \theta_{H} - v_{R} \cos \theta_{R} } \right)} \right|. $$
(2)

Finally, the system checks whether \({D}_{HR}\) is larger than the threshold distance \({D}_{L}\) to judge the possibility of interference. Comfortable passing between humans requires a marginal distance (called personal space). According to [47], humans perceive a possibility of interference when passing through a passage narrower than 1.3 times the shoulder width. To ensure safe and comfortable passing, we apply this marginal distance as \({D}_{L}\) [3]. We denote the shoulder widths of the human and robot as \({SL}_{H}\) and \({SL}_{R}\), respectively, and \({D}_{L}\) is given by

$$ D_{L} = \left[ {\left( {SL_{H} + SL_{R} } \right) \times 1.3} \right]/2. $$
(3)

\({SL}_{R}\) was set to 0.6 m according to the robot used in this study (Fig. 7), and \({SL}_{H}\) was set to 0.4 m as the mean shoulder width of humans [47]. Substituting these values into (3) gives a constant \({D}_{L}\) of 0.65 m. \(IP\) is then calculated by

$$ IP = \left\{ {\begin{array}{*{20}l} {1\;\left( {{\text{interference}}} \right)} \hfill & {{\text{if}}\;D_{HR} \le D_{L} } \hfill \\ {0\;\left( {{\text{no}}\;{\text{interference}}} \right)} \hfill & {{\text{if}}\;D_{HR} > D_{L} } \hfill \\ \end{array} } \right. $$
(4)
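Equations (1)–(4) can be combined into a few lines of Python, sketched below. The function and parameter names are illustrative assumptions. The current lateral offset between the two centers is passed in directly as `lateral_offset`, the angle convention follows (1) and (2) as printed (sine for the along-corridor component, cosine for the lateral one), and returning \(IP\) = 0 when there is no closing speed is an assumption of this sketch, since that case is not discussed in the text.

```python
import math


def threshold_distance(sl_h, sl_r, margin=1.3):
    """Eq. (3): threshold lateral distance D_L from the two shoulder widths."""
    return (sl_h + sl_r) * margin / 2.0


def interference_possibility(l_hr, lateral_offset, v_h, theta_h, v_r, theta_r, d_l):
    """Return (IP, D_HR) following Eqs. (1), (2), and (4).

    Angles are in radians. D_HR is None when the passing point is never reached.
    """
    closing_speed = abs(v_h * math.sin(theta_h) - v_r * math.sin(theta_r))
    if closing_speed < 1e-6:
        # Assumption of this sketch: no closing motion means no interference.
        return 0, None
    s_t = l_hr / closing_speed  # Eq. (1): time to reach the passing point
    lateral_rate = v_h * math.cos(theta_h) - v_r * math.cos(theta_r)
    d_hr = abs(lateral_offset + s_t * lateral_rate)  # Eq. (2)
    return (1 if d_hr <= d_l else 0), d_hr  # Eq. (4)


# Head-on encounter: both move at 1 m/s along the corridor, 4 m apart.
d_l = threshold_distance(sl_h=0.4, sl_r=0.6)
ip_close, _ = interference_possibility(4.0, 0.3, 1.0, -math.pi / 2, 1.0, math.pi / 2, d_l)
ip_far, _ = interference_possibility(4.0, 1.0, 1.0, -math.pi / 2, 1.0, math.pi / 2, d_l)
```

In this example the first call, with a small lateral offset, flags interference, while the second, with a larger offset, does not.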

3.2 Proactive State Observation (Human Awareness)

This function outputs ternary outputs (\(HA\) = 0, 0.5, or 1), as visual awareness of the human toward the robot.

3.2.1 Concept

It is essentially difficult to precisely estimate “if the human is aware of the robot.” We thus focus on a necessary condition, that is, the human gaze on the robot. This is the strongest available condition but, strictly speaking, it is not sufficient: even if the human gazes at the robot, we cannot judge from outside whether the human recognizes the robot. Some studies estimate drivers’ situational awareness by using close-up images of drivers’ faces in time series [48, 49]. However, for autonomous mobile robots, gaze detection using cameras installed on the robot is quite difficult due to the small target size, robot oscillation, lighting conditions, occlusion, and so on. Thus, we simply used the head direction of the human in this study.

3.2.2 Human Awareness Judgment

In the horizontal plane, the face and gaze directions are strongly related, and the gaze direction is within + 20\(^\circ \) of the head direction [50]. Thus, we assume that the head direction corresponds to the gaze direction. Here, the field of vision is classified by the effective field (\({\theta }_{E}\) = \(\pm \) 15\(^\circ \)) and marginal visual field (\({\theta }_{M}\) = \(\pm \) 100\(^\circ \)) of the human. The head direction is denoted by \({\theta }_{HEAD}\) and the relative angle of face is denoted by \({\theta }_{HR}\), as shown in Fig. 4a and b. The relative human position is measured by LRF, and its face direction is measured by KINECT v2 (30 fps). Moreover, humans must be within a sensing range \({D}_{SHR}\). As shown in Fig. 4c, \(HA\) is given by

$$ HA = \left\{ {\begin{array}{*{20}l} {1\;\left( {{\text{High}}} \right)} \hfill & {{\text{if}}\;\left| {\theta_{HEAD} - \theta_{HR} } \right| \le \theta_{E} /2} \hfill \\ {0.5\;\left( {{\text{Low}}} \right)} \hfill & {{\text{if}}\;\theta_{E} /2 < \left| {\theta_{HEAD} - \theta_{HR} } \right| \le \theta_{M} /2} \hfill \\ {0\;\left( {{\text{None}}} \right)} \hfill & {{\text{if}}\;\left| {\theta_{HEAD} - \theta_{HR} } \right| > \theta_{M} /2} \hfill \\ \end{array} } \right. $$
(5)
Fig. 4

Human awareness (HA), which is calculated based on the relative angle between the robot and the face direction of the human. (a) Effective field of view, (b) face direction, and (c) definition of HA

If the robot is in the effective (marginal) visual field, \(HA\) is high (low). If the robot is outside the marginal visual field, \(HA\) is none, in accordance with (5). For robust estimation, the system outputs \(HA\) = 1 or \(HA\) = 0.5 only if the corresponding state lasts longer than 0.25 s within a time window of 0.375 s (determined from pre-experiments). Recognition in the effective visual field has higher resolution, so it is suited to comprehending the environment in detail. When \(HA\) = 1, vision-based inducement such as path change therefore becomes effective. On the other hand, if the robot is recognized only in the marginal visual field (\(HA\) = 0.5), vision-based inducement is not suitable.
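The awareness judgment and its time-window filtering can be sketched in Python as follows. Several points are assumptions of this sketch rather than details given in the text: the thresholds \({\theta }_{E}/2\) and \({\theta }_{M}/2\) in (5) are taken to be the half-angles 15° and 100°, angle differences are wrapped to ±180°, samples arrive at 30 fps, and a sample with \(HA\) = 1 also counts toward the duration required for \(HA\) = 0.5. All names are hypothetical.

```python
from collections import deque


def instantaneous_ha(theta_head_deg, theta_hr_deg,
                     half_effective_deg=15.0, half_marginal_deg=100.0):
    """Eq. (5) for a single frame, with angles in degrees."""
    # Wrap the difference into [-180, 180) before taking the magnitude.
    diff = abs((theta_head_deg - theta_hr_deg + 180.0) % 360.0 - 180.0)
    if diff <= half_effective_deg:
        return 1.0
    if diff <= half_marginal_deg:
        return 0.5
    return 0.0


class AwarenessFilter:
    """Report HA only after it has held for more than 0.25 s within 0.375 s."""

    def __init__(self, window_s=0.375, min_duration_s=0.25, fps=30.0):
        self.samples = deque(maxlen=max(1, round(window_s * fps)))
        # "Longer than" the minimum duration, hence the strict +1.
        self.min_count = int(min_duration_s * fps) + 1

    def update(self, ha_sample):
        self.samples.append(ha_sample)
        high = sum(1 for s in self.samples if s >= 1.0)
        aware = sum(1 for s in self.samples if s >= 0.5)
        if high >= self.min_count:
            return 1.0
        if aware >= self.min_count:
            return 0.5
        return 0.0


ha_filter = AwarenessFilter()
outputs = [ha_filter.update(instantaneous_ha(5.0, 0.0)) for _ in range(8)]
```

With these settings the window holds 11 frames and at least 8 of them must agree, so a brief glance toward the robot does not immediately raise \(HA\).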

3.3 Postactive State Observation (Inducement Achievement)

This function outputs ternary values (\(IA\) = 0, 0.5, or 1) by using an avoidance distance of the human to judge the necessity of the next inducement, as shown in Fig. 5.

Fig. 5

Inducement achievement (IA), which is calculated based on the expected and actual avoidance distances of the human and the actual avoidance distance of the robot. (a)–(e) show the dynamic situation when \(\left( {R_{H} ,R_{R} } \right)\) = (0.5, 0.5), and (f) shows the static situation. In (b), the human moves in the expected direction; in (c), the human does not move in the expected direction; and in (d) and (e), the human moves in the intermediate direction

3.3.1 Expected Avoidance Distance and Ratio

The purpose of inducement is to increase the distance between the robot and human, which is \({D}_{HR}\) before inducement, beyond \({D}_{L}\) given by (3), i.e., to reach \(IP\) = 0. As shown in Fig. 5a, the expected avoidance distances of the human \({A}_{HE}\) and robot \({A}_{RE}\) satisfy

$$ A_{HE} + A_{RE} = D_{L} - D_{HR} $$
(6)

We then determine \({A}_{HE}\) and \({A}_{RE}\) by introducing the avoidance ratio of human \({R}_{H}\) and robot \({R}_{R}\), so that their summation is 1. We can arbitrarily define the combination. (\({R}_{H}\), \({R}_{R}\)) = (0, 1) means that the robot avoids all and (\({R}_{H}\), \({R}_{R}\)) = (0.5, 0.5) means that the robot and human avoid half each. In this study, we adopted two basic combinations: mutual avoidance (\({R}_{H}\), \({R}_{R}\)) = (\(\alpha \), 1\(-\alpha \)), where 0 \(<\alpha <\) 1, and full robot avoidance (\({R}_{H}\), \({R}_{R}\)) = (0, 1). In mutual avoidance, \({A}_{HE}\) = \(\alpha \times \left({D}_{L}-{D}_{HR}\right)\) and \({A}_{RE}\) = \((1-\alpha )\times \left({D}_{L}-{D}_{HR}\right)\). In full avoidance, \({A}_{HE}\) = 0 and \({A}_{RE}\) = \({D}_{L}-{D}_{HR}\).
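The split of the required avoidance distance in (6) between human and robot can be written as a small Python helper, sketched here. The function name is hypothetical, `alpha = 0` is used to represent full robot avoidance, and returning zero for both parties when \({D}_{HR}\) already exceeds \({D}_{L}\) is an assumption of this sketch.

```python
def expected_avoidance(d_hr, d_l, alpha):
    """Return (A_HE, A_RE) so that A_HE + A_RE = D_L - D_HR, as in Eq. (6).

    alpha is the human's avoidance ratio R_H; the robot's ratio is 1 - alpha.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    gap = d_l - d_hr
    if gap <= 0.0:
        # Assumption: no avoidance is needed if the paths do not interfere.
        return 0.0, 0.0
    return alpha * gap, (1.0 - alpha) * gap


# Mutual avoidance with equal shares, then full robot avoidance.
a_he_mutual, a_re_mutual = expected_avoidance(d_hr=0.25, d_l=0.65, alpha=0.5)
a_he_full, a_re_full = expected_avoidance(d_hr=0.25, d_l=0.65, alpha=0.0)
```

The numerical values of `d_hr` and `d_l` above are illustrative only.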

3.3.2 Actual Avoidance Distance

The robot will avoid the expected distance \({A}_{RE}\), so the actual avoidance distance of the robot \({A}_{RA}\) is definitely equal to \({A}_{RE}\). Thus, the system needs to check the actual avoidance distance of human \({A}_{HA}\). We denote the distance between the robot and human after inducement, which includes reactive movement of the human, as \({{D}_{HR}}^{^{\prime}}\). We can calculate \({{D}_{HR}}^{^{\prime}}\) by using (2). As shown in Fig. 5b, \({A}_{HA}\) is thus given by

$$ A_{HA} = D_{HR}^{^{\prime}} - \left( {D_{HR} + A_{RA} } \right), $$
(7)

where \({A}_{RA}\), \({A}_{HA}\), and \({A}_{HR}\) are defined as positive in the direction that makes \({{D}_{HR}}^{^{\prime}}\) larger.

3.3.3 Inducement Achievement

The system then compares \({A}_{HA}\) with \({A}_{HE}\). When the human moved in the expected direction and the avoidance distance was large enough (\({A}_{HA}\ge {A}_{HE}\)), the robot can immediately judge that the inducement succeeded, and \(IA\) becomes high (\(IA\) = 1), as shown in Fig. 5b. When the human moved in the direction opposite to the expected one (\({A}_{HA}<0\)), the robot can immediately judge that the inducement failed, and \(IA\) becomes none (\(IA\) = 0), as shown in Fig. 5c. Furthermore, when the human moved in the expected direction but the avoidance distance was not enough (\(0\le {A}_{HA}<{A}_{HE}\)), the system suspends the judgment of \(IA\) for a while, since the reaction may be delayed depending on the individual. In this situation, it is unclear whether the human will continue along the current path or will avoid by the expected distance. The waiting time is difficult to determine theoretically, so we adopt the maximum time for which the robot can wait while still passing naturally and safely. In this study, we set the natural avoidance angle \({\theta }_{n}\), i.e., an avoidance angle that is not too steep and is therefore safe, to 30\(^\circ \) based on exploratory experiments. We denote \({D}_{HR}\) when the robot avoids in the direction of \({\theta }_{n}\) from the current direction as \({D}_{HRL}\). If \({D}_{HRL}\) is larger than \({D}_{L}\), the robot temporarily outputs that \(IA\) is low (\(IA\) = 0.5), as shown in Fig. 5d. If \({D}_{HRL}\) is \({D}_{L}\) or smaller, the robot immediately judges that the inducement failed, and \(IA\) becomes none (\(IA\) = 0), as shown in Fig. 5e. In sum, \(IA\) is given by

$$ \left\{ {\begin{array}{*{20}l} {IA = 1\; \left( {{\text{High}}} \right) } \hfill & {\left( {A_{HA} \ge A_{HE} } \right)} \hfill \\ {IA = 0.5\; \left( {{\text{Low}}} \right)} \hfill & {\left( {0 \le A_{HA} < A_{HE} } \right)\,\& \, \left( {D_{HRL} > D_{L} } \right)} \hfill \\ {IA = 0\; \left( {{\text{None}}} \right) } \hfill & {\left( {0 \le A_{HA} < A_{HE} } \right)\, \&\, \left( {D_{HRL} \le D_{L} } \right)} \hfill \\ {IA = 0\; \left( {{\text{None}}} \right) } \hfill & {\left( {A_{HA} < 0} \right)} \hfill \\ \end{array} } \right.. $$
(8)

Figure 5f shows a static situation where a human stands in a corridor and \(HA\) = 0. After the initial inducement using voice interaction (and then physical touch), if the human moves so that the passing width for the robot becomes large enough, the robot immediately judges that the inducement succeeded, and \(IA\) becomes high (\(IA\) = 1); otherwise, \(IA\) = 0.
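For the dynamic situation, Eqs. (7) and (8) translate directly into the following Python sketch. The function names are illustrative assumptions; the inputs correspond to \({{D}_{HR}}^{^{\prime}}\), \({D}_{HR}\), \({A}_{RA}\), \({A}_{HE}\), \({D}_{HRL}\), and \({D}_{L}\) as defined above.

```python
def actual_human_avoidance(d_hr_after, d_hr_before, a_ra):
    """Eq. (7): human's actual avoidance distance A_HA."""
    return d_hr_after - (d_hr_before + a_ra)


def inducement_achievement(a_ha, a_he, d_hrl, d_l):
    """Eq. (8): return IA as 1.0 (high), 0.5 (low, pending), or 0.0 (none)."""
    if a_ha >= a_he:
        return 1.0  # human avoided at least as much as expected
    if a_ha < 0.0:
        return 0.0  # human moved against the expected direction
    # Partial avoidance: keep waiting only while a natural avoidance by the
    # robot would still clear the threshold distance.
    return 0.5 if d_hrl > d_l else 0.0
```

Note that the first branch also covers full robot avoidance, where \({A}_{HE}\) = 0 and any non-negative \({A}_{HA}\) counts as success.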

3.4 Initial and Subsequent Inducement Selector

The feature of ETN is to allow the robot to provide inducements several times. At the initial inducement, the robot should select mutual interaction using weaker inducement for minimum social disruption. If the initial inducement fails, it adopts a stronger inducement as a subsequent inducement by reference to the failed initial inducement. We here explain the basic rules for inducement selection for two situations where the human walks (dynamic) and stands (static) since suitable inducements are different, as shown in Fig. 6.

Fig. 6

Inducement selector, which is determined from IP, HA, and IA. (a)–(d) show the dynamic situation where the human walks, and (e)–(h) show the static situation where the human stands. In dynamic situations, the robot first uses path indication (mutual avoidance) and then selects no inducement, pending, or path indication (full avoidance). In static situations, the robot first uses voice interaction and then selects no inducement, physical touch, or detour

3.4.1 Dynamic Situation

  • Step 1: voice notification If a human who obstructs the path of the robot (\(IP\) = 1) is not aware of the robot (\(HA\) = 0), the robot kindly notifies the human of the existence of the robot by using voice interaction “I am coming through.”

  • Step 2: path indication (mutual avoidance) If the human still obstructs the robot (\(IP\) = 1) but is aware of the robot (\(HA\) = 1), the robot provides inducement so that both the human and robot cooperatively give way. Thus, the robot selects a mutual-avoidance path (\({R}_{H}\), \({R}_{R}\)) = (\(\alpha \), 1\(-\alpha \)), as shown in Fig. 6a. If the human changes the path so that the human will not interfere with the robot (\(IP\) = 0), the robot assumes that the inducement was successfully conveyed to the human and does not provide any inducements, as shown in Fig. 6b.

  • Step 3: path indication (full robot avoidance) If the human still obstructs the robot (\(IP\) = 1) and is still not aware of the robot (\(HA\) = 0) after Step 1, or if \(IA\) = 0 after Step 2, the robot selects a full-avoidance path, (\({R}_{H}\), \({R}_{R}\)) = (0, 1), and heads to the widest passing space to safely avoid a collision, as shown in Fig. 6d. Alternatively, if \(IA\) = 0.5 after Step 2, the robot suspends further inducement while keeping a safe distance, as explained in Sect. 3.3.3. If the human does not change the path and the robot crosses the safety distance, the robot executes full avoidance, as shown in Fig. 6c.
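The three dynamic-situation steps above can be read as a small decision function. The following Python sketch is only an illustration under stated assumptions: the names `Action` and `select_dynamic`, the flags `voice_given` and `inside_safety_distance`, and the default \(\alpha\) = 0.5 are hypothetical and are not taken from the ETN implementation.

```python
from enum import Enum
from typing import Optional, Tuple


class Action(Enum):
    NONE = "no inducement"
    VOICE = "voice notification"
    MUTUAL = "path indication (mutual avoidance)"
    PENDING = "pending"
    FULL = "path indication (full avoidance)"


def select_dynamic(
    ip: int,
    ha: int,
    ia: Optional[float],
    voice_given: bool,
    inside_safety_distance: bool,
    alpha: float = 0.5,  # hypothetical default for the human share R_H
) -> Tuple[Action, Optional[Tuple[float, float]]]:
    """Return the next action and, for path indications, (R_H, R_R).

    ia is None until a path indication (Step 2) has been evaluated.
    """
    if ip == 0:
        # No interference: nothing to convey.
        return Action.NONE, None
    if ia is None:
        if ha == 1:
            # Step 2: human is aware, so share the avoidance.
            return Action.MUTUAL, (alpha, 1.0 - alpha)
        if not voice_given:
            # Step 1: human is unaware, so notify by voice first.
            return Action.VOICE, None
        # Step 3: still unaware after the voice notification.
        return Action.FULL, (0.0, 1.0)
    if ia == 0.5 and not inside_safety_distance:
        # Step 3: ambiguous reaction, wait while the distance is still safe.
        return Action.PENDING, None
    # Step 3: IA = 0, or the safety distance was crossed while pending.
    # Any other IA value with IP = 1 also falls back to full avoidance here.
    return Action.FULL, (0.0, 1.0)
```

For example, `select_dynamic(1, 1, None, False, False, alpha=2/3)` selects mutual avoidance with the ratio used later in the experiments, and `select_dynamic(1, 1, 0.5, False, True)` escalates from pending to full avoidance.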

3.4.2 Static Situation

  • Step 1: voice notification If a human who obstructs the path of the robot (\(IP\) = 1) is not aware of the robot (\(HA\) = 0), the robot kindly notifies the human of the existence of the robot by using voice interaction “Let me pass, please,” as shown in Fig. 6e. If the human changes the path so that the human will not interfere with the robot (\(IP\) = 0), the robot assumes that the inducement was successfully conveyed to the human and does not provide any inducements (Fig. 6f).

  • Step 2: physical touch If the human still obstructs the robot (\(IP\) = 1) and is still not aware of the robot (\(HA\) = 0), the robot subsequently uses physical touch to strongly convey its intent (Fig. 6g). In our previous work [51], humans moved closely along the direction of the touching force, and the back and shoulder were suitable touching points. The robot stands in a position from which its arm can make a proper and safe touch and provides an inducing touch on the human’s back or upper arm, with a maximum touching force of 50 N [51].

  • Step 3: detour If the human still obstructs the robot (\(IP\) = 1) even though the robot has provided the strongest inducement, the robot gives up passing and searches for and selects a detour route (Fig. 6h).
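The static-situation steps above form an escalation ladder (voice, then touch, then detour). The sketch below illustrates that ladder together with the width-based success check described for the static case. It is a minimal sketch, not the authors’ implementation: the function names, the `required_width_m` parameter, and the example widths are assumptions introduced for illustration.

```python
from typing import List

# Escalation order taken from Steps 1-3 of the static situation.
STATIC_LADDER = ("voice interaction", "physical touch", "detour")


def inducement_achieved(passing_width_m: float, required_width_m: float) -> int:
    """IA = 1 if the gap is wide enough for the robot to pass, else 0."""
    return 1 if passing_width_m >= required_width_m else 0


def select_static(ip: int, failed_inducements: int) -> str:
    """Pick the next action given how many inducements have already failed."""
    if ip == 0:
        return "no inducement"
    index = min(failed_inducements, len(STATIC_LADDER) - 1)
    return STATIC_LADDER[index]


def simulate(widths_m: List[float], required_width_m: float) -> List[str]:
    """Replay the selector over the gap width observed before each decision."""
    history = []
    failures = 0
    for width in widths_m:
        ip = 0 if inducement_achieved(width, required_width_m) else 1
        action = select_static(ip, failures)
        history.append(action)
        if action in ("no inducement", "detour"):
            break  # either the way is clear or the robot gives up passing
        failures += 1
    return history
```

With a hypothetical required width of 0.9 m, a human who never moves from a 0.6 m gap yields voice interaction, physical touch, and then detour, whereas a human who opens the gap to 1.0 m after the voice interaction ends the sequence with no further inducement.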

4 Experimental Condition

We explain the experimental design, including the path change and physical touch scenarios, the human subjects, and the autonomous mobile robot, which has omnidirectional wheels and two 6-DOF manipulators with torque sensors, as shown in Fig. 7.

Fig. 7 Autonomous mobile robot specification

4.1 Objective and Design of Experiments

We assumed an ordinary office corridor and set its width to 3 m. As representative scenarios, we conducted a path change scenario and a physical touch scenario. Figure 8 shows the initial state of the human and robot. We used a pre-built map that included only walls for the experiments. The objective of the experiments is to evaluate whether the ETN can recover from errors and to investigate how the robot with the ETN interacts with humans, so we prepared fundamental and advanced experiments for each of the dynamic and static scenarios, as listed in Table 1.

Fig. 8 a Dynamic and b static experimental conditions

Table 1 Experimental patterns and order

The fundamental experiment aims to intentionally produce situations where the robot fails to convey its intent by asking the subjects to follow predefined behaviors. From this experiment, we evaluate whether the robot can adequately recover from the error. The advanced experiment aims to produce unconstrained passing situations by asking the subjects to behave voluntarily. From this experiment, we evaluate how smoothly the robot and human interact. For all the experiments, we briefed the subjects that they would interact with the robot but did not tell them how the robot would act and react to human behavior.

4.2 Experimental Conditions

4.2.1 Path Change (Dynamic Situation)

We asked the subjects to walk at a constant speed of around 0.35 m/s (they practiced several times), the same as the robot’s movement speed, to keep the conditions consistent across the experiments. The robot and human stood in the center of the corridor, facing their movement destinations, at a distance of 5.5 m from each other in the depth direction (Fig. 8a). This study evaluates the error-recovering behaviors of both the robot and humans. To easily trigger misconveyance of the robot’s intent, we set the mutual avoidance ratio to (\({R}_{H}\), \({R}_{R}\)) = (2/3 (≈ 0.67), 1/3 (≈ 0.33)).
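As a worked illustration of the mutual avoidance ratio, the sketch below splits a lateral avoidance distance between the human and the robot. It assumes that (\({R}_{H}\), \({R}_{R}\)) divides the total distance linearly, which is an interpretation made here for illustration; the function name and the 1.2 m total are hypothetical values, not experimental settings.

```python
from typing import Tuple


def split_avoidance(total_m: float, r_h: float) -> Tuple[float, float]:
    """Return (expected human share, expected robot share) in meters.

    r_h is the human ratio R_H; the robot ratio is R_R = 1 - R_H.
    """
    if not 0.0 <= r_h <= 1.0:
        raise ValueError("R_H must lie in [0, 1]")
    return r_h * total_m, (1.0 - r_h) * total_m


# Mutual avoidance with (R_H, R_R) = (2/3, 1/3): the human is expected
# to cover twice the lateral distance that the robot covers.
human_m, robot_m = split_avoidance(1.2, 2 / 3)

# Full robot avoidance with (R_H, R_R) = (0, 1): the robot covers everything.
full_human_m, full_robot_m = split_avoidance(1.2, 0.0)
```

Under this reading, a ratio that asks more of the human makes a mismatch between expected and actual human avoidance more likely, which is consistent with the stated aim of triggering misconveyance easily.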

For the fundamental experiment, we set four conditions with different combinations of (\(HA\), \(IA\)) = (1, without), (1, with), (0, without), and (0, with). For \(HA\) = 1, we asked the subject to walk while facing the robot. After observing that the robot had moved, the subject moved in the same direction as the robot, to intentionally produce a situation where the robot failed to convey its intent to the human. After this, the subjects could change the path or stop if needed (e.g., when a collision with the robot seemed possible). For \(HA\) = 0, we asked the subject to walk while not facing the robot (looking at the calendar on the wall). The subjects recognized the robot’s existence but did not know where the robot was or how fast it was moving. The robot provided the subjects with voice interaction to notify them of its existence. The procedure after this was the same as for \(HA\) = 1. With IA function, the robot changes its behavior depending on the proactive and postactive states, but without IA function, the robot continues the initial inducement. For the advanced experiment, we prepared two conditions: (HA, IA) = (1, with) and (0, with). The initial settings were the same as in the fundamental experiment, but we asked the subjects to move completely freely. We expect that the ETN (= with IA) will recognize the difference between actual and expected human behaviors and immediately change the robot inducement to achieve smooth passing.

4.2.2 Physical Touch (Static Situation)

The subject stood 0.6 m away from the wall and 3 m away from the robot and waited to see what would happen next, as shown in Fig. 8b. The subjects were given the task of holding luggage together with the experimenter. The subjects faced away from the robot (HA = 0) and did not know what would happen, i.e., that the robot would pass through. The robot interacted with the subjects by using voice and physical interaction.

For the fundamental experiment, we prepared two conditions: (HA, IA) = (0, without) and (0, with). We asked the subjects not to move, regardless of any inducements they received, because they were holding delicate luggage; this intentionally produces a situation where the robot fails to convey its intent to the human. The robot without IA function continues to touch the human, but the robot with IA function understands the human’s intent and detours. For the advanced experiment, we prepared one condition: (HA, IA) = (0, with). The initial settings were the same as in the fundamental experiment, but we asked the subjects to respond freely to the robot inducement. As in the dynamic situation, we expect that the ETN will recognize an error from the human reaction and re-select a suitable inducement different from the initial inducement.

4.3 Evaluation Method

Ten subjects participated (nine male and one female; age: 21–28, mean: 22.2, standard deviation: 2.1). In this study, we adopted a within-subject design to enable the subjects to compare the navigation systems with and without the ETN (IA function). The subjects only knew that they would interact with the robot, not how they would interact with it. In light of the experimental contents, we fixed the orders between dynamic/static, fundamental/advanced, and HA = 0/1. Thus, we randomized only the order of with and without IA function for each condition, to mitigate order effects. The experimental order is listed in Table 1.

We recorded the human and robot positions as time series using a motion capture system (OptiTrack Prime 13). As the quantitative evaluation, we evaluated the trajectories and a path-efficiency index called ‘hesitation’ [52]. The hesitation captures the change of the velocity vector during movement and increases drastically when a rapid velocity change occurs. Denoting the velocity vector 0.35 s earlier as \({\varvec{V}}_{{{\varvec{n}} - 1}}\), that at time \(n\) as \({\varvec{V}}_{{\varvec{n}}}\), and the avoidance start time and passing time as \(T_{S}\) and \(T_{P}\), respectively, the hesitation \(H\) is given by

$$ H = \int\limits_{{T_{S} }}^{{T_{P} }} {\frac{{\left| {{\varvec{V}}_{{\varvec{n}}} - {\varvec{V}}_{{{\varvec{n}} - 1}} } \right|}}{{V_{n - 1} }}dt} . $$
(9)
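Equation (9) can be approximated numerically from sampled velocities. The sketch below is one possible discretization, not the authors’ code: it assumes velocities sampled at a fixed interval `dt`, interprets the denominator \(V_{n - 1}\) as the speed (magnitude) of the lagged velocity vector, uses a rectangle-rule sum for the integral, and skips samples whose lagged speed is nearly zero. The function name, the `eps` guard, and the default arguments are hypothetical.

```python
import math
from typing import List, Optional, Tuple


def hesitation(
    velocities: List[Tuple[float, float]],
    dt: float,
    lag: float = 0.35,
    t_start: float = 0.0,
    t_end: Optional[float] = None,
    eps: float = 1e-6,
) -> float:
    """Rectangle-rule approximation of H over [t_start, t_end].

    velocities[n] is the (vx, vy) sample at time n * dt.
    """
    k = max(1, round(lag / dt))  # lag expressed in samples
    if t_end is None:
        t_end = (len(velocities) - 1) * dt
    total = 0.0
    for n in range(k, len(velocities)):
        t = n * dt
        if t < t_start or t > t_end:
            continue
        vx, vy = velocities[n]
        px, py = velocities[n - k]
        prev_speed = math.hypot(px, py)
        if prev_speed < eps:
            continue  # ratio undefined when the lagged speed is ~0
        total += math.hypot(vx - px, vy - py) / prev_speed * dt
    return total
```

A constant velocity gives zero hesitation, while an abrupt 90° turn at constant speed contributes \(\sqrt{2}\) per unit time for as long as the lagged sample still precedes the turn, so sharper and more frequent direction changes increase the index.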

Moreover, we evaluate whether the ETN improves human psychology compared with the conventional method. Specifically, the ETN is required to reduce the negative feelings caused by hesitation and repetitive avoidance. Thus, after each trial, we administered seven-point ordinal-scale questionnaires on [(–) unnatural, (+) no unnatural] to evaluate the impression of the robot behavior, and on [(–) discomfort, (+) no discomfort] and [(–) fear, (+) no fear] to analyze the subject’s negative feelings. The subjects marked a printed questionnaire form while referring to their past responses for comparison.

5 Experimental Results

We analyze the experimental results in terms of both quantitative and qualitative aspects.

5.1 Evaluation: Path Change

From the fundamental experiments, we confirmed that the ETN framework worked adequately, as shown in Fig. 9. This result is one of the most important contributions in this paper.

Fig. 9 Human and robot behaviors in dynamic passing scenario (HA = 0). In a, without IA function, the robot kept avoiding in the same direction, leading to unsmooth human movement. In contrast, in b, with IA function, the robot smoothly changed its path to the opposite side, leading to smooth passing of both human and robot

5.1.1 Trajectory of Human and Robot

Figure 9 shows an example of the movement of the human and robot for HA = 0 with/without IA function. Figure 10 shows the actual trajectories of the robot and human (the same data as Fig. 9). As the figures show, without IA function, the robot kept avoiding in the same direction independently of the human reaction, leading to an unsmooth movement change by the human. In contrast, with IA function, the robot could observe that the human moved in the same direction as the robot. Thus, the robot smoothly changed its path to the opposite side, since keeping the same inducement had a higher possibility of collision. Figure 10 also shows that the closest distance between the robot and human (around 5 s) was shorter without IA function than with IA function. This indicates that, with IA function, they passed each other safely and collaboratively, as we expected.

Fig. 10 Trajectories of human and robot in dynamic passing scenario (HA = 0), corresponding to Fig. 9. In a, they come as close as around 0.65 m at 5 s; in b, on the other hand, the closest distance between them is 1.15 m at 5 s, which means they could pass smoothly
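The closest robot–human distance discussed with Fig. 10 can be extracted from time-synchronized trajectories. The sketch below is a minimal illustration under stated assumptions: both trajectories are sampled at the same time stamps, positions are 2-D coordinates in meters, and the function name and sample data are hypothetical rather than recorded values.

```python
import math
from typing import List, Sequence, Tuple


def closest_approach(
    times: Sequence[float],
    robot_xy: List[Tuple[float, float]],
    human_xy: List[Tuple[float, float]],
) -> Tuple[float, float]:
    """Return (minimum distance, time at which it occurs)."""
    if not times or not (len(times) == len(robot_xy) == len(human_xy)):
        raise ValueError("trajectories must be non-empty and equally long")
    # Compare (distance, time) pairs; the smallest distance wins.
    return min(
        (math.dist(r, h), t) for t, r, h in zip(times, robot_xy, human_xy)
    )


# Hypothetical head-on passing with a 0.65 m lateral offset.
times = [0.0, 1.0, 2.0]
robot = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
human = [(2.0, 0.65), (1.0, 0.65), (0.0, 0.65)]
min_dist, t_min = closest_approach(times, robot, human)
```

Comparing this minimum distance between the with-IA and without-IA trials gives a simple safety margin measure alongside the hesitation index.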

We then analyzed in detail how the robot recovered from errors in the four conditions. Figure 11 shows the actual trajectories of the robot and human with the avoidance timing and order, and Fig. 12 shows the actual and expected avoidance distances of the robot and human (\(A_{RA}\), \(A_{RE}\), \(A_{HA}\), \(A_{HE}\)). Figures 11a and 12a show the result without IA for HA = 1. The robot first changed its path to the right as mutual avoidance at 1.0 s, and the subject intentionally changed the path to the left at 1.1 s. The robot without IA function kept the initial avoidance distance until the end, as shown in Fig. 12a. When the robot–human distance reached around 0.8 m (at 4.8 s), the subject predicted that they would collide soon, so only the subject made a large avoidance maneuver. Figures 11b and 12b show the result with IA for HA = 1. The robot first provided path indication to the left as mutual avoidance at 0.2 s, and the subject also changed the path to the right at 1.2 s. The robot could recognize the error thanks to IA function and provided full avoidance to the right at 1.4 s. As Fig. 12b shows, the robot recognized that \(A_{HA}\) (actual) did not reach \(A_{HE}\) (expected) in 0.2–1.2 s, so it changed \(A_{RE}\) (expected), and \(A_{RA}\) (actual) then reached \(A_{RE}\) in 1.4–2.2 s. Finally, they could pass by each other smoothly while avoiding at a distance of 2 m.

Fig. 11 Trajectories of robot (circle) and human (triangle) with 0.5 s time step for four conditions, including timing and order of avoidance (inducement) of robot and human for fundamental experiment. a and c Robot did not change its behavior after the first avoidance, and then only the human avoided unsmoothly. b and d Robot immediately adjusted its path direction according to the human’s path change

Fig. 12 Actual and expected avoidance distance for robot and human in dynamic situation. a and c show that robot did not respond even when human did not avoid by the expected distance. In b and d, robot first moved by the distance to be avoided, but human did not react as robot expected, so robot adaptively changed its avoidance direction and distance to achieve a safe passing distance

Figures 11c and 12c show the result without IA for HA = 0. The robot first selected full avoidance to the right with voice interaction at 0.9 s. The robot without IA function kept going without changing its behavior even though the subject was coming in the same direction. As in Fig. 11a, the subject finally avoided the robot to the opposite side at 3.9 s, as shown in Fig. 12c. This would evoke negative feelings in the subject, and the subject indeed reported fear (− 2), as explained in a later subsection. Figures 11d and 12d show the result with IA for HA = 0. The robot first provided full avoidance to the right at 0.7 s, and it judged \(IP\) = 0. However, the subject changed the path to the left at 1.8 s, causing the interference again (\(IP\) = 1). The robot then re-provided path indication to the left (opposite direction) as mutual avoidance at 2.6 s, and they finally passed each other successfully. As Fig. 12d shows, the human slightly changed the actual avoidance distance \(A_{HA}\) in 2.0–2.5 s, even though \(IP\) = 0. We confirmed that the robot could seamlessly respond to several path changes by the human.

5.1.2 Hesitation (Movement Efficiency of Human)

In Sect. 4.3, we defined the hesitation to evaluate the movement efficiency. This index integrates the variation of the velocity vector during the avoidance movement and treats it as a loss of movement efficiency. Figure 13 shows the movement efficiency of the subjects. As the figure shows, the loss of movement efficiency with IA function was smaller than that without IA function, regardless of human awareness (HA). Student’s \(t\)-test revealed a significant difference between with and without IA for HA = 1 (\(p\) < 0.05, \(t\)(9) = 3.117) and a marginally significant difference for HA = 0 (\(p\) < 0.1, \(t\)(9) = 2.131). A smaller hesitation indicates that the rapid and proactive behavioral changes of the robot, thanks to IA function, enabled the humans to avoid more smoothly without any sudden path changes. These results can be explained by the trajectories and avoidance distances analyzed in Sect. 5.1.1.
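To make the reported statistics easier to check, the sketch below computes a paired \(t\) statistic and compares it with two-tailed critical values for 9 degrees of freedom. It assumes a paired (within-subject) two-tailed test, which is consistent with \(t\)(9) for ten subjects but is not stated explicitly in the text; the function names are hypothetical, and the critical values 2.262 (\(p\) = 0.05) and 1.833 (\(p\) = 0.10) are standard table values for 9 degrees of freedom.

```python
import math
import statistics
from typing import Sequence, Tuple

# Two-tailed critical values of Student's t for df = 9.
T_CRIT_DF9 = {0.05: 2.262, 0.10: 1.833}


def paired_t(without_ia: Sequence[float], with_ia: Sequence[float]) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for paired samples."""
    if len(without_ia) != len(with_ia) or len(without_ia) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(without_ia, with_ia)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_d / (sd_d / math.sqrt(n)), n - 1


def classify_df9(t: float) -> str:
    """Label a t(9) value against the two-tailed critical values."""
    if abs(t) > T_CRIT_DF9[0.05]:
        return "significant (p < 0.05)"
    if abs(t) > T_CRIT_DF9[0.10]:
        return "marginally significant (p < 0.1)"
    return "not significant"
```

Applied to the values reported above, `classify_df9(3.117)` falls in the significant range and `classify_df9(2.131)` in the marginally significant range, matching the labels given for HA = 1 and HA = 0.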

Fig. 13 Human’s movement efficiency (hesitation) for dynamic experiment without/with IA function for a HA = 1 and b HA = 0

5.1.3 Questionnaires

We conducted a psychological evaluation using questionnaires to compare how the behavior of the robot with and without IA affects human feelings. We asked about three items: unnatural, discomfort, and fear. Figure 14 shows the questionnaire scores. As the figures show, the negative impressions on each index were smaller with IA function than without it, regardless of human awareness (HA). For HA = 1, Student’s \(t\)-test revealed significant differences between with and without IA function in unnatural (\(p\) < 0.001, \(t\)(9) = 5.250), discomfort (\(p\) < 0.005, \(t\)(9) = 3.737), and fear (\(p\) < 0.001, \(t\)(9) = 5.000). For HA = 0, it likewise revealed significant differences in unnatural (\(p\) < 0.005, \(t\)(9) = 3.473), discomfort (\(p\) < 0.001, \(t\)(9) = 4.974), and fear (\(p\) < 0.001, \(t\)(9) = 4.974). We found from the results that IA function could make human psychology more moderate and acceptable. This result indicates that robot behaviors that recover from mistakes caused by misrecognition would be acceptable.

Fig. 14 Subjective evaluation in unnatural, discomfort, and fear for dynamic experiment without/with IA function for a HA = 1 and b HA = 0

5.1.4 Discussion (Advanced Experiment)

Here, we analyzed the results of the advanced experiments to discuss how the robot and human interacted. Figure 15 shows the actual trajectories of the robot and human with the avoidance timing and order for smooth and unsmooth avoidance, and Fig. 16 shows the actual and expected avoidance distance of the human and robot (\(A_{RA}\), \(A_{RE}\), \(A_{HA}\), \(A_{HE}\)) for unsmooth avoidances.

Fig. 15 Trajectories of robot (circle) and human (triangle) with 0.5 s time step for four conditions, including timing and order of avoidance (inducement) of robot and human for advanced experiment. In (a) and (c), robot changed path at 0.7 s (and 1.4 s) and then human changed path at 1.5 s (and 2.0 s), indicating a successful intention conveyance. In (b), robot changed path three times. In (d), robot changed path two times and their distance was too close at second avoidance

Fig. 16 Actual and expected avoidance distance for robot and human (unsmooth passing situation). In (a), human avoided robot in the same direction where robot was avoiding, leading to repetitious avoidance. In (b), robot largely avoided since it could not precisely estimate human intention. However, robot and human could pass safely in both cases

Figure 15a shows smooth avoidance for HA = 1. The robot initially provided mutual avoidance at 0.7 s. Then, the subject recognized the inducement from the robot and changed the path to the left. This example shows well that the robot’s intent could be adequately conveyed to the human. Figure 15b shows unsmooth avoidance for HA = 1, where the robot provided inducements (path changes) three times. Both the robot and human first changed paths at almost the same time (around 1.0–1.1 s), in directions that would lead to a collision. The robot thus selected full avoidance in the opposite direction at 2.9 s. Unfortunately, right after the robot changed the path, the human also changed the path to the same direction at 3.2 s. The robot rapidly re-changed the path to the opposite direction at 3.8 s and finally passed the human successfully. From Fig. 16a, we can guess that the three path changes occurred because the human took a long time (about 2.7 s) to adequately understand the robot’s path indication and/or the robot took about 2 s to react to the human’s path change. In the future, we will consider ways to make robot inducements more understandable to humans (repetitious avoidance sometimes happens even among humans). The emphasis should be on the fact that the robot could change its path three times in a short period (within 4 s), indicating that it has sufficient responsiveness in decision-making and behavior change. These are important factors in navigating robots in environments shared with humans.

Figure 15c shows smooth avoidance for HA = 0. Initially, the subject seemed unsure which way to avoid, but the robot first provided a path indication to the right at 1.4 s. Responding to it, the subject started to avoid to the opposite side of the robot about 0.6 s later. This is an example where the path indication initiated by the robot worked properly. Figure 15d shows unsmooth avoidance for HA = 0. The robot provided a path indication to the right at 0.5 s, but the human also slightly changed the path to the same side as the robot, so the robot largely changed the path to avoid to the left at 2.5 s. The distance between the robot and human when they passed was quite close. As shown in Fig. 16b, the human showed (slight) signs of avoiding to the left side at 1.2 s, but the robot changed the avoidance direction at 2.5 s, a latency of about 1 s. To further increase the responsiveness, the system needs to predict the actual avoidance distance based on the time variation of the human trajectory. It is worth noting that the robot could still avoid the subject in this situation, which suggests that it can safely avoid humans who approach without considering others, e.g., humans walking while on the phone, and that the robot is highly adaptable.

Table 2 lists the number of path changes until the robot and subjects successfully passed through for HA = 1 and 0, respectively. We found from the table that the number of inducements for HA = 1 is higher than that for HA = 0. For HA = 0, the robot selected full avoidance at the first inducement due to HA = 0, so the success rate of intent conveyance was higher than mutual avoidance (\(R_{H}\), \(R_{R}\)) = (2/3, 1/3). However, it is inappropriate for the robot to always perform full avoidance due to the robot’s movement efficiency. In some cases, e.g., Fig. 15c, even when the robot performed full avoidance, the human performed additional avoidance. Thus, the ETN that enables the robot to appropriately adjust the avoidance distance depending on the situation will be useful.

Table 2 Number of path indications until successful avoidance

5.2 Evaluation: Physical Touch

From the experiments, we confirmed that the proposed ETN framework worked adequately, as shown in Fig. 17. This is also one of the most important contributions in this paper.

Fig. 17 Human and robot behaviors in static scenario (HA = 0). In (a) with IA function, robot first provided physical touch and recognized that human did not move, then moved backward to detour. In (b) without IA function, robot repeatedly provided physical touch because the robot could not estimate the human intent

5.2.1 Human and Robot Behaviors

Figure 17 shows the movement of the robot and human in HA = 0, with/without IA function. Figure 17a shows an example of a detour with IA function. The robot first physically touched the subject, measured the distance between the human and wall after the initial inducement, and judged that the robot could not pass through, that is, IA = 0. The robot then moved backward to select another way. Figure 17b shows an example without IA function. The robot repeated the physical touch since it could not estimate the human intent from the situational difference between before and after inducement. We found that the proposed ETN could select robot behaviors suitable for the situations.

5.2.2 Questionnaires

Figure 18 shows the questionnaire scores. As the figure shows, the scores of each index with IA function were better than those without IA function. Student’s \(t\)-test revealed significant differences between with and without IA function in unnatural (\(p\) < 0.05, \(t\)(9) = 3.146), discomfort (\(p\) < 0.05, \(t\)(9) = 2.676), and fear (\(p\) < 0.05, \(t\)(9) = 2.648). These results also confirm that IA function could make human psychology more moderate and acceptable.

Fig. 18 Subjective evaluation in unnatural, discomfort, and fear for static experiment without/with IA function

5.2.3 Discussion (Advanced Experiment)

In the advanced experiment, the robot could pass through for all ten subjects, as shown in Fig. 19. The procedure by which the robot passed through was as follows. First, the robot moved to the right rear of the subject and asked, “Let me pass, please.” If the subject did not respond, the robot provided a physical touch as a notice. If the subject moved far enough to allow the robot to pass through, the robot said, “Thank you,” and moved forward and went through. The result shows that all the subjects let the robot pass through and that no detour was observed, indicating that the robot could make way by active inducement.

Fig. 19 Human and robot behaviors in static scenario (HA = 0) for advanced experiment. Robot first used voice interaction and human did not move. Then, with IA function, robot used physical touch and successfully passed

Table 3 lists the timing when the subjects gave way to the robot. From the table, we found that six out of ten subjects gave way to the robot in response to voice interaction. These six subjects said, “The robot used voice when it was close to me, so I could guess that the robot was trying to pass through the gap between me and the wall,” “When the robot spoke to me, I avoided it because I could guess where the robot wanted to pass,” and “When I turned around at the voice, the robot was there. I felt that the robot talked to me, so I gave way to the robot.” Note that the authors translated these comments from Japanese to English. The common point is that the timing of voice interaction and the distance between the robot and subject are important to accurately convey the navigational intent of the robot to humans. In this experiment, the robot moved diagonally behind the subject on the right and used voice interaction. The subjects could understand that the robot spoke to them and wanted to pass on their right. Thus, the subjects finally moved to the left. We found from the discussion that the timing of inducement and the standing position with respect to the subject are important when the robot uses voice interaction.

Table 3 Timing to give a way (1st: voice, 2nd: contact)

Then, we analyzed the remaining four subjects, who moved in response to physical touch. They responded, “I did not know how much I should move from voice interaction alone, but from the physical touch, I could tell the direction and magnitude to move,” “I did not know whom the robot was talking to since I was looking ahead. However, from the physical touch, I could understand that the robot wanted me to move out of the way,” and “When the robot was behind me, there were many possible paths for it to take, but I moved since the physical touch from the robot confirmed that it wanted to pass on my right side.” These comments indicate that voice interaction loosely conveys the intent to an unspecified number of people, while physical touch can convey a detailed intent, including direction and magnitude, to a specific person. We confirmed that physical contact was an inducement that could strongly convey specific information to a specific person.

Finally, we discuss the acceptability of active inducement from the robot under different cultural norms. The experiments were performed in Japan, and all the subjects were Japanese. In our experiments, we did not find any subjects who did not accept the robot inducement, but active inducement might make people from different cultural backgrounds uncomfortable. We will investigate the conditions for using physical touch and the number of repetitive inducements and then develop a parameter-tuning scheme suitable for the applied domains.

6 Conclusion

In this study, we proposed error-tolerant navigation (ETN) with a process to actively estimate the human intent through iterative interaction with the robot. As a preliminary study, we focused on ‘the intent conveyance from robot to human’ and ‘its achievement.’ The ETN estimated interference possibility (IP) to determine the necessity of inducement, human awareness (HA) to select an inducement method, and inducement achievement (IA) to judge the need for a further action. If the ETN estimated interference, the robot provided inducements according to HA, such as path indication when HA was high or voice and physical interaction when HA was low. Each inducement corresponds to an expected behavior change in the human. IA was calculated from the difference between the expected and actual changes. If the change was not observed within the specified time after the inducement, inducements with a stronger intent conveyance were executed. When no achievement was observed even after the strongest inducement, the robot selected another route. The results of the experiments indicated that the proposed ETN could achieve smoother movement of humans and reduce their psychological burden compared with a conventional navigation system (without ETN), and that it has sufficient responsiveness in decision-making and behavior change. These are important factors in navigating robots in environments shared with humans, so the ETN could contribute to a new form of human–robot interaction that accepts errors. We implemented and evaluated the ETN in limited scenarios, but the proposed error-correction loop would be a common and essential feature for human-aware interactive navigation in arbitrary scenarios. Even if the robot makes a small mistake in a prediction process, it can prevent a fatal mistake by recognizing the small mistake and recovering from it thanks to the error-correction loop.

On the other hand, this study has limitations that must be addressed in the future to make the ETN work well in real environmental settings. Each function of the ETN framework should be improved as follows. For IP judgment, a probabilistic model to estimate interference is required. For HA judgment, a precise and generalized model to estimate awareness by using time series and semantic environmental information should be developed after consolidating a practical methodology to recognize the ground truth of whether a human is truly aware of a robot. We will expand human awareness to the visual, auditory, and haptic senses. For IA judgment, we need to develop a learning system that memorizes the relationship between the situation and the robot’s experience. Moreover, we will apply the ETN framework to more dynamic and complex environments, e.g., station concourses where humans walk fast or crowded shopping malls, to generalize it. We will also assess human acceptability of the ETN framework and how it changes over time through long-term experiments.