1 Introduction

1.1 Research Background and Motivation

One expected application of mobile robots is the task of guiding users in environments with high densities of pedestrians such as railway stations, shopping malls, and stadiums [1]. Due to improvements in human recognition technologies [2] and simultaneous localization and mapping (SLAM) [3], mapping and navigating environments, while recognizing and avoiding humans, has become feasible for mobile robots. In the field of mobile robot navigation, various approaches that employ sensors placed within the environment have been implemented, but this study achieves navigation by recognizing humans with sensors installed on the robot itself, to reduce the operating costs incurred when the robots are introduced to new environments. When operating in environments where humans are active, robots must avoid causing feelings of discomfort or obstruction in humans, in addition to avoiding collisions with them. Furthermore, pedestrians are not simply moving obstacles, but they also have the unique characteristics of altering their movement in response to the robots, making collision avoidance more challenging.

Meanwhile, to promote the development of such navigation involving human interaction, a common testbed is important. Since human behavior is affected by various factors, such as density, destination, and obstacles other than the robot, scenarios of human behavior must be designed carefully according to the objectives of each experiment.

1.2 Related Research

When navigating in environments with active humans, route planning and speed control are necessary for avoiding collisions with humans. Considering pedestrians as moving obstacles, Berg et al. implemented multi-agent navigation through a speed control method called reciprocal velocity obstacle (RVO) [4]. While effective, RVO does not explicitly address interactions between pedestrians and robots, so robots exhibited abnormal proximity to the surrounding humans. By introducing a Kalman filter to RVO, Kim et al. enabled uncertainty to be handled, thus solving the problem of abnormal proximity [5]. However, introducing uncertainty into this navigation method triggered the Freezing Robot problem (FRP) [6], where the robot becomes unable to move in high-density, crowded environments, due to high uncertainty of collision with pedestrians.

Trautman et al. proposed to solve the FRP under a scenario of cooperative collision avoidance between the robot and humans. In general, however, we cannot guarantee that all pedestrians will always cooperate with robots. Conversely, if pedestrians are completely adversarial to robots, the FRP can never be solved. This suggests that some explicit communication, such as asking pedestrians to make room for the robot, would be effective in resolving the problem. Nevertheless, we believe that focusing on navigation among non-cooperative pedestrians will contribute to building navigation strategies for densely crowded environments, because improved performance in such non-cooperative settings reduces the need for robots to ask pedestrians to make room for them.

From the perspective of robot-human interaction in navigation, Mehta proposed a method [7] that extends Multi-Policy Decision Making [8]. Decision policies, such as Go-Solo, Follow, and Stop, are designed in advance, and by switching between them according to the state of the surrounding humans, the robot can select behaviors such as Stop, rather than simple obstacle avoidance (Go-Solo), that might otherwise cause the FRP. A geometric prediction of human behavior is used when each policy is evaluated, but it is unclear whether humans conform completely to this model. If robots could predict interactions based on the behaviors that humans actually exhibit in response to robots, they could navigate in harmony with the real world. Studies on predicting human interactions exist [9], but they focus on one-to-one encounters between humans and robots. In contrast, our work attempts to predict interactions when multiple pedestrians are densely crowded together. Furthermore, local observations of individual positions can become inaccurate in environments where humans move in crowds, because of occlusion and the density of human legs and bodies. In this case, methods that enable robots to account for overlapping humans when planning motions, such as 'avoid approaching humans' or 'follow humans walking ahead in the same direction', can be more effective.

Additionally, Abbeel reported using inverse reinforcement learning to teach robots human behavior [10], which enabled robot navigation in complex environments, such as within crowd flows, to be taught via human instruction. Furthermore, Kretzschmar et al. proposed learning a socially compliant behavior model from pedestrian behavior or tele-operated robot behavior via inverse reinforcement learning [11]. Based on probabilistic modeling, their model could capture both discrete navigation decisions and the continuous (relatively small) variance of human trajectories. However, inverse reinforcement learning is vulnerable to bias and other undesirable effects, as it relies on the specific traits of the human instructor or of the environments where data are collected. Our approach, by contrast, focuses only on local interactions between humans and robots, even within the same data-driven framework. Therefore, the behavioral model can be constructed in a local scope that is less influenced by the environment.

Chen et al. used deep reinforcement learning [12] to teach robots human social norms [13]. These norms, such as "when walking on the right, pass people on the left," are implicit rules for humans in crowds; they were taught to the robot by rewarding the corresponding passing movements. The authors were thereby able to achieve navigation that did not cause discomfort in the surrounding humans. In environments where humans are present at high densities, however, the robot should move with the flow of pedestrians going in the same direction rather than follow social norms to avoid collisions with every individual. Teaching social norms is therefore difficult to apply to high-density, crowded environments. Furthermore, this method faces challenges in planning effective robot motion with global optimality in mind.

Finally, in most related works, experimental evaluations have unfortunately been conducted under varying crowd conditions, which directly affects the evaluation of navigation ability in high-density crowded environments and makes it difficult to objectively verify effectiveness via reproduction. Thus, another important open issue in robotic navigation in crowded environments is that a common, reproducible, and controlled test has not yet been proposed.

1.3 Research Objective

We propose here a navigation method wherein the robot navigates while making predictions, by learning human interaction in high-density, crowded environments. We propose a method for extracting multiple paths, based on the distribution of nearby humans, that expresses multiple navigation strategies, both avoiding and following, in circumstances where the robot is surrounded by pedestrians. Based on the extracted paths and an acquired interaction model, the robot trajectory resulting from each path selection can be predicted while considering human interactions. Using the acquired interaction model, the robot can predict excessive human interference, which manifests as low-speed robot movement caused by that interference. By combining localized interaction prediction with a global movement planner, the optimal path can be chosen from multiple options. We tested the proposed navigation method in a real-world environment with many pedestrians to validate its effectiveness in a controlled and reproducible way.

This study provides the following contributions to the field of mobile robot navigation in crowded environments. First, we present tests of mobile robot navigation in a continuous flow of humans with an explicitly controlled density. These conditions and settings make the evaluation of mobile robot navigation in highly crowded environments reproducible and comparable. Second, the robot is able not only to avoid humans in one-on-one situations but also to avoid or follow a crowd of pedestrians through the localized path extraction method. In a previous study [14], navigation was achieved in an environment crowded with many people, but a ceiling-mounted camera had to be used to measure the humans. It should be noted that our study uses only sensors embedded on the robot itself, realized by expressing the path from the viewpoint of the robot based on its local sensing. Third, this work provides a framework that enables the collection of data about human interactions that can be used for prediction. This enables predictions that incorporate human behavior in actual environments, making this framework distinct from motion planning approaches such as pedestrian optimal reciprocal collision avoidance (PORCA) [15], which addresses human intention and uncertainty as a partially observable Markov decision process; our framework can instead be positioned as an experience-based method. Finally, this method considers optimization in combination with global path planning. This is advantageous in comparison with methods based on information from control by a human operator, such as inverse reinforcement learning.

2 Problem Settings

For this study, we assume a crowded environment where humans tend to generate flows due to high density. To keep the focus on human interactions, no obstacles were placed in the region around the start and target positions. The mobile robot navigates toward its destination using only its onboard sensor information and is not supposed to communicate explicitly with pedestrians (e.g., it does not ask humans to make room by gesture or utterance). The primary goal of the robot is to reach a defined target position quickly, beginning from a fixed initial position, while avoiding collisions with humans. The secondary goal is to avoid causing psychologically oppressive feelings in the pedestrians it traverses against.

For implementation and validation of the mobile robot navigation, we rely on a simulator to collect the data for building a predictor, and on experiments with real human subjects to validate the proposed navigation. Details of the simulator are described in the appendix, and the experimental conditions for validation are described in Sect. 4.

3 Navigation Method Based on Path Extraction and Selection

Figures 1 and 2 provide an overview of the proposed navigation method. Figure 1 outlines the offline data collection process on the simulator, used for configuring the predictor that ultimately predicts human interactions. Path candidates are extracted from the 2D LiDAR point cloud via a leg detection module. One of the paths is randomly selected for a certain duration of navigation, and the robot trajectory is recorded while that path remains selected. The resulting data are used to configure a human/robot trajectory predictor. The predicted trajectory reflects the interaction between the robot and its nearby pedestrians.

Fig. 1

Robot control architecture for data collection. Collected human leg positions and robot trajectories are used to construct the robot trajectory predictor

Fig. 2

Robot control architecture for online navigation. The constructed human trajectory predictor is used for path selection

At the online navigation stage, as shown in Fig. 2, the predictor forecasts the trajectories and arrival positions, several seconds into the future, given that the extracted candidate path is selected. Integrating the predicted results and the global path planning results ensures that the most efficient path is selected, as determined by the shortest time of arrival to the target position.

3.1 Extracting Path Candidates Based on Local Observations

In this study, two types of path extractors, each expressing one of two strategies, are configured to extract robot paths. One is the Avoidance strategy, in which the robot avoids humans and aims for the goal; the other is the Follow strategy, in which the robot follows humans. To extract path candidates, a one-dimensional potential function is defined over the angle \(\phi \) [rad] in a polar coordinate system with the robot at the origin. This function is a simplification of the well-known artificial potential field method, from the 2D plane to the one-dimensional (1D) space of angles. It expresses the difficulty of moving forward in each possible driving direction in the robot's immediate surroundings, where a negative value indicates that movement in that direction is very easy (Fig. 3).

Fig. 3

Parameter definitions for potential \(P^\text {follow}(\phi )\) (left) and an example of its components (right). Local minima are used as path candidates

Let i denote the index of a human recognized by the robot. Equation (1) shows the per-human component of the potential function, \(P^\text {follow}(\phi )\), used for extracting paths that follow humans walking in the same direction as the robot. Here, in the robot coordinate system, the polar angle is \(\phi \), the direction of human i is \(\theta _i\), and the direction of the goal is \(\theta _g\). These angles are 0 deg directly in front of the robot, and the right side is defined as positive. \(A_i^\text {follow}\) and \(\sigma _\text {follow}\) are defined below, in (2) and (3), respectively.

$$\begin{aligned} p_{i}^\mathrm{follow}(\phi )&=A_{i}^\mathrm{follow}\exp \left\{ -\tfrac{(\phi -\theta _{i}-0.1\theta _{g})^2}{2{\sigma _\text {follow}}^2}\right\} \end{aligned}$$
$$\begin{aligned} A_{i}^\mathrm{follow}&={\left\{ \begin{array}{ll} (^{\text {R}}v_{i}^\mathrm{follow}+0.1)\exp \left\{ -(r_i-1)\right\} &{}\mathrm{if} \ r_i\ge 1 \\ (^{\text {R}}v_{i}^\mathrm{follow}+0.1)&{}\mathrm{otherwise} \end{array}\right. } \end{aligned}$$
$$\begin{aligned} \sigma _\text {follow}&={\left\{ \begin{array}{ll} \exp (0.1r_{i}-2)&{}\mathrm{if} \ r_i\ge 6.1\\ 0.25&{}\mathrm{otherwise} \end{array}\right. } \end{aligned}$$

This potential function is configured using Gaussian functions and designed so that local minima appear in the directions where humans are moving away from the robot. The distance to detected human i is defined as \(r_i\), the component of the robot velocity in the direction toward human i is \({}^\text {R}v_i^\text {follow}\), and \(A_i^\text {follow}\) is the coefficient that determines the potential magnitude.

\(A^\text {follow}\) and \(\sigma _\text {follow}\) express that a human moving away from the robot at a faster speed (large \({}^\text {R}v^\text {follow}\)) has a greater influence, and that humans farther away (large \(r_i\)) have less influence. The amplitude of the Gaussian function is expressed by \(A_i\), and humans farther away have correspondingly smaller impacts. Additionally, the coefficient \(\sigma _\text {follow}\) determines the angular width of the potential; a narrower width limits the range over which a human exerts an effect when the components are superimposed in (4). This coefficient is therefore set so that the potential width narrows with decreasing \(r_i\). Within a certain range, \(\sigma _\text {follow}\) was held constant at 0.25 to prevent the width from becoming too narrow. The threshold value of 6.1 in (3) was chosen so that the two cases take almost equal values at \(r_i = 6.1\), preventing a sudden change in \(\sigma _\text {follow}\). The potential \(P^\text {follow}\), which is used to extract paths, is defined as the sum of the potentials \(p_i^\text {follow}\) over all detected humans i.

$$\begin{aligned} P^\mathrm{follow}(\phi )&=\sum _{i=1}^{N}p_{i}^\mathrm{follow}(\phi ) \end{aligned}$$

When following humans, the potential takes local minima in the directions \(\phi \) of humans moving away from the robot, and the robot can then follow those humans.
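The follow-potential computation above can be sketched as follows. The function names, the discretized minimum search, and the sign convention of \({}^\text {R}v_i^\text {follow}\) (taken here as negative for a human moving away from the robot, so that receding humans produce negative amplitudes and hence local minima) are our assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def p_follow(phi, theta_i, theta_g, r_i, v_i):
    """Per-human follow potential (Eqs. 1-3); angles in rad, r_i in m.
    v_i is taken as the approach rate of human i (negative when the
    human moves away) -- an assumed sign convention so that receding
    humans create negative amplitudes, i.e., local minima."""
    # Amplitude (Eq. 2): discounted exponentially beyond r_i = 1 m.
    A = (v_i + 0.1) * (np.exp(-(r_i - 1.0)) if r_i >= 1.0 else 1.0)
    # Width (Eq. 3): clamped at 0.25 below the continuity point r_i = 6.1 m.
    sigma = np.exp(0.1 * r_i - 2.0) if r_i >= 6.1 else 0.25
    return A * np.exp(-(phi - theta_i - 0.1 * theta_g) ** 2 / (2.0 * sigma ** 2))

def follow_candidates(humans, theta_g, phi_grid):
    """Superpose Eq. 4 over all humans (theta_i, r_i, v_i) and return the
    interior local minima of P^follow as candidate following directions."""
    P = sum(p_follow(phi_grid, th, theta_g, r, v) for th, r, v in humans)
    idx = [k for k in range(1, len(phi_grid) - 1)
           if P[k] < P[k - 1] and P[k] < P[k + 1]]
    return phi_grid[idx], P
```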

Figure 4 shows an example of paths extracted using \(P^\text {follow}\). The left panel shows a distribution of surrounding humans, and the right panel shows the corresponding \(P^\text {follow}(\phi )\). The lines on the human dots represent velocity. In the left panel, there are eight humans in total: two in front of the robot (slightly to the right) and two to its front left. Two directions are extracted, one along each group, indicating that paths for following groups of humans have been obtained.

Fig. 4

Example of following directions. There are two groups of humans walking in similar directions to the robot and two directions are extracted to follow them. The two directions in the left panel correspond to two local minima in the potential in the right panel

Two types of avoidance path candidates are also extracted. One is the path passing between humans, based on \(P^\text {avoid}(\phi )\), and the other is the path going around the outside of a group of humans at the edge of the robot’s field of vision. Regarding the first candidates, \(P^\text {avoid}(\phi )\), \(A^\text {avoid}\), and \(\sigma _\text {avoid}\) are defined as

$$\begin{aligned} p_{i}^\mathrm{avoid}(\phi )&=A_{i}^\mathrm{avoid}\exp \left\{ -\tfrac{(\phi -\theta _{i})^2}{2{\sigma _\text {avoid}}^2}\right\} \end{aligned}$$
$$\begin{aligned} A_{i}^\mathrm{avoid}&={\left\{ \begin{array}{ll} \exp \left\{ -(r_i-1)\right\} &{}\mathrm{if} \ r_i\ge 1\\ 1&{}\mathrm{otherwise}\\ \end{array}\right. } \end{aligned}$$
$$\begin{aligned} \sigma _\text {avoid}&={\left\{ \begin{array}{ll} \exp (0.1r_{i}-2)&{}\mathrm{if} \ r_i\ge 2.9\\ 0.18&{}\mathrm{otherwise}.\\ \end{array}\right. } \end{aligned}$$

The local minima of the calculated 1D potential yield paths that pass between people. The 1D potential used to extract these paths is obtained as the sum of the per-human potentials of (5) over all detected humans i. Further, when \(r_i\) decreases below 2.9 m, \(\sigma _\text {avoid}\) assumes a constant value of 0.18 to prevent it from becoming too small, which could cause the extraction of a path too close to a human. As in (3), the threshold value \(r_i=2.9\) was chosen to maintain the continuity of \(\sigma _\text {avoid}\) between the two cases.

$$\begin{aligned} P^\mathrm{avoid} (\phi )&=\sum _{i=1}^{N}p_{i}^\mathrm{avoid}(\phi )+p_\mathrm{g}(\phi ) \end{aligned}$$

The first term in this equation is the potential function with local minima in the gaps between humans. The second term, \(p_g(\phi )\), defined in (9), is a potential function that takes its minimum in the goal direction.

$$\begin{aligned} p_\mathrm{g}(\phi )&=-B\exp \left\{ -\tfrac{(\phi -\theta _\mathrm{g})^2}{\beta }\right\} \end{aligned}$$

Here, the parameters B and \(\beta >0\) determine the shape of the potential function.
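A minimal sketch of the avoidance potential of (5)-(9); the function names and the values of B and \(\beta \) are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def p_avoid(phi, theta_i, r_i):
    """Per-human avoidance potential (Eqs. 5-7)."""
    A = np.exp(-(r_i - 1.0)) if r_i >= 1.0 else 1.0          # Eq. 6
    sigma = np.exp(0.1 * r_i - 2.0) if r_i >= 2.9 else 0.18  # Eq. 7
    return A * np.exp(-(phi - theta_i) ** 2 / (2.0 * sigma ** 2))

def p_goal(phi, theta_g, B=0.5, beta=0.5):
    """Goal-attraction term (Eq. 9); B and beta are illustrative values."""
    return -B * np.exp(-(phi - theta_g) ** 2 / beta)

def avoid_candidates(humans, theta_g, phi_grid):
    """Eq. 8: superpose the human potentials and the goal term, then take
    interior local minima as gap-passing direction candidates."""
    P = sum(p_avoid(phi_grid, th, r) for th, r in humans) \
        + p_goal(phi_grid, theta_g)
    idx = [k for k in range(1, len(phi_grid) - 1)
           if P[k] < P[k - 1] and P[k] < P[k + 1]]
    return phi_grid[idx], P
```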

Regarding path candidates that go around the outside of humans: in many cases, the potential decreases monotonically toward both side limits of the 2D LiDAR field of view, so no minimum can be defined there. Thus, to extract a path that goes around the outside of a group of humans, the potential on the outer side of the field of view, beyond the largest nearby peak, is approximated by the following equation, and a path displaced from that peak by a constant angular interval is extracted.

$$\begin{aligned} \hat{p}(\phi )=A_\text {round}\exp \left( -\tfrac{\phi ^2}{\sigma _\text {round}^2}\right) \end{aligned}$$

Here, \(A_\text {round}\) is set to the value of the largest nearby potential peak, and \(\sigma _\text {round}\) is estimated through the least-squares method. Based on this estimate, the directions displaced by \(\sigma _\text {round}\) from the peak direction, toward the edges of the field of view, become the path candidates \(\phi _\text {round}^L\) and \(\phi _\text {round}^R\).
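The round-path extraction can be sketched as a linearized least-squares fit of \(\sigma _\text {round}\) in (10); the linearization, the clamping constant, and the function names below are our assumptions:

```python
import numpy as np

def fit_sigma(phis, vals, phi_peak, A_round):
    """Least-squares estimate of sigma_round in Eq. 10, linearised as
    log(vals / A_round) = -(phi - phi_peak)^2 / sigma_round^2."""
    x = (phis - phi_peak) ** 2
    y = np.log(np.maximum(vals, 1e-12) / A_round)  # clamp avoids log(0)
    m = np.sum(x * y) / np.sum(x * x)  # slope of a line through the origin
    return np.sqrt(-1.0 / m)

def round_candidates(phi_grid, P):
    """Fit the outer slopes of the largest potential peak and place one
    candidate sigma_round outside the peak on each side."""
    k = int(np.argmax(P))
    A_round = P[k]
    sig_l = fit_sigma(phi_grid[:k], P[:k], phi_grid[k], A_round)
    sig_r = fit_sigma(phi_grid[k + 1:], P[k + 1:], phi_grid[k], A_round)
    return phi_grid[k] - sig_l, phi_grid[k] + sig_r  # phi_round^L, phi_round^R
```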

The left panel of Fig. 5 shows an example distribution of humans, and the corresponding potential \(P^\text {avoid}(\phi )\) is shown in the right panel. By minimizing \(P^\text {avoid}(\phi )\), a path between the human groups can be extracted, but no minimum can be detected on either side. Instead, directions of travel that go around the outside of the groups are extracted as the two paths \(\phi _\text {round}^L\) and \(\phi _\text {round}^R\), as can be readily observed in Fig. 5.

Fig. 5

Example of obstacle avoidance and round directions. Three candidate directions are extracted, the center of which points to the gap between two groups of humans. The left and right directions go around the human groups

3.2 Robot Velocity Control

In addition to determining the direction of travel, as described in the previous section, we use the optimal velocity (OV) model [16] to ensure that the robot moves at velocities appropriate for its distance from humans. This model is typically used for vehicle control and describes both steady flow and congestion well, not only for traffic systems but also for pedestrian flows (e.g., [17]). Thus, the OV model is used to control the robot velocity so that the robot can smoothly join a flow of humans in the case of the Follow strategy.

Given d, the distance from the robot to the closest human, the target velocity \(V_\text {des}(d)\) is determined as

$$\begin{aligned} V_\text {des}(d)&= \frac{V_{\max }}{1+\tanh (\beta c)} \{ \tanh (\beta d - c) + \tanh (\beta c) \},\; d>0.5 \end{aligned}$$
$$\begin{aligned} V_\mathrm{pb}(d)&= 2 d V_\text {des}({d}),\; 0\le d\le 0.5 \end{aligned}$$

Figure 6 shows \(V_\text {des}(d)\) and \(V_\text {pb}(d)\). At 0.5 m or below, marked by the dotted line, \(V_\text {pb}\) causes the robot to decelerate rapidly to avoid a collision.

Fig. 6

Desired velocity profile of OV model \(V_\text {des}(d)\)
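The velocity law of (11)-(12) can be sketched as follows; the values of \(V_{\max }\), \(\beta \), and c are illustrative assumptions, not the paper's settings.

```python
import math

V_MAX = 1.0         # maximum robot speed [m/s] (Sect. 4.1)
BETA, C = 2.0, 1.0  # OV-model shape parameters (illustrative values)

def desired_velocity(d):
    """Target speed for headway d [m]: Eq. 11 for d > 0.5 m, scaled down
    linearly by Eq. 12 at 0.5 m or below to brake hard."""
    v = V_MAX / (1.0 + math.tanh(BETA * C)) \
        * (math.tanh(BETA * d - C) + math.tanh(BETA * C))
    if d <= 0.5:
        v = 2.0 * d * v  # Eq. 12: rapid deceleration near contact
    return max(v, 0.0)
```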

Furthermore, we consider the ±30 deg direction of the global path from the robot viewpoint, which is the direction leading straight to the goal if there are no obstacles except for pedestrians. To avoid wasting motion when there are no obstacles near the robot, if the sensor value indicates that no obstacles exist within 5 m, the robot advances at 1.0 m/s in the direction of this global path.

3.3 Data Collection and Predictor Configuration

On the simulator, data are collected on the interactions between the robot and humans when they travel in proximity along paths determined by the extraction method described above. The trajectory predictor is then configured from these data (Table 1). As described in Sect. 3.1, the path candidates are classified into two groups: "Avoid," which passes through the gaps between and around humans, and "Follow," which follows humans. Consequently, two corresponding types of trajectory predictors are created using multi-layered neural networks. The predicted interactions, shown in Fig. 7, are expressed as displacements from the 'straight-ahead' route, i.e., the trajectory the robot would follow when moving straight at a steady velocity in the direction of the path selected at a given point in time.

Fig. 7

Definition of trajectory of robot

Table 1 Inputs for robot trajectory predictors

The definitions of these inputs are shown in Fig. 8. Displacement from the straight-ahead route is expressed as \(\varDelta {\varvec{x}}_n\in \mathbb {R}^2, n=1,\cdots ,N\), where n denotes a discrete time with time step \(\varDelta t\), and N denotes the maximum discrete time of the prediction. Thus, the output of each function approximator is a 2N-dimensional vector. Table 2 shows the common settings of the avoiding/following motion predictor neural networks. To collect a variety of moving directions, the random selection from multiple candidate directions was controlled so that directions with fewer samples collected in the past were selected more frequently. Ten percent of the training samples were used for testing. During training, the loss functions on both the training data and the test data converged to comparable values, which indicated the absence of overfitting.

Table 2 Settings of motion predictors
Fig. 8

Inputs for robot trajectory prediction

Among the inputs for each predictor in Fig. 8, the positions and velocities of the humans on either side of the travel direction serve as inputs for Avoid. On each side, a human who is close both to the robot's direction of travel and to the robot itself is selected. For Follow, the positions and velocities of the humans to be followed, as well as the positions of the humans on either side, are input. In both cases, the human on each side is randomly selected from the Pareto-optimal set with respect to the two indicators, distance and direction. If no human is present on the left or right, the distance and angle are each set to 1000.
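The sentinel handling described above might be sketched as follows; the feature layout, the function names, and the zero speed for a missing neighbour are assumptions for illustration:

```python
SENTINEL = 1000.0  # value assigned to distance and angle when no
                   # human is present on that side

def side_features(human):
    """(distance [m], angle [rad], speed [m/s]) of one side neighbour;
    sentinel-filled when the side is empty (zero speed is an assumption)."""
    if human is None:
        return [SENTINEL, SENTINEL, 0.0]
    r, theta, v = human
    return [r, theta, v]

def avoid_input(left, right):
    """Flat input vector of the Avoid predictor: position and velocity of
    the selected humans on either side of the travel direction (Fig. 8)."""
    return side_features(left) + side_features(right)
```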

3.4 Online Navigation Method

During online navigation, the robot extracts candidate paths using the extraction method described in Sect. 3.1. After the path selection described below has been conducted, drive control, based on velocity control, is conducted as described in Sect. 3.2.

Figure 9 shows the predicted trajectory when the ith path candidate is selected. From the final predicted position, \(\hat{\varvec{x}}^{(i)}_N\), given by the predictor described in Sect. 3.3, the global path planner calculates the path to the goal position using the A* algorithm. The resulting path length is \(L^{(i)}\).

Fig. 9

Predicted local trajectory and global path

When the candidate path i is selected, the total time cost, \(C_\text {total}^{(i)}\), to reach the goal is defined by

$$\begin{aligned} C_\text {total}^{(i)}&=C_\text {local}^{(i)} +C_\text {global}^{(i)}\nonumber \\&= \left( \frac{\varDelta \theta ^{(i)}}{\omega _\text {robot}}+N \varDelta t \right) + \frac{L^{(i)}}{v_\text {robot}}. \end{aligned}$$

Here, \(C_\text {local}^{(i)}\) is the time cost required to move N steps along the selected path; it corresponds to the parenthesized term on the second line of (13). The first term inside the parentheses is the time cost of turning the robot from its current heading toward path i: the deflection angle \(\varDelta \theta ^{(i)}\) divided by the rotational velocity \(\omega _\text {robot}\) [rad/s], which is kept constant. The second term inside the parentheses is the time cost of the N steps; since the observation period \(\varDelta t\) is constant, this value is common to all candidate paths. The term after the parentheses, \(C_\text {global}^{(i)}\), expresses the time cost to reach the goal position after the localized path selection, and is obtained by dividing the route length \(L^{(i)}\), obtained from the global planner, by the robot velocity \(v_\text {robot}\). To simplify the global cost estimation, we assume that the robot moves to the goal at a constant velocity. A selection is then made among the candidate paths using the following equation, minimizing the time cost described above.

$$\begin{aligned} i^*=\mathop {\hbox {argmin}}_{i} C_\text {total}^{(i)} \end{aligned}$$

This path selection is performed at \(\varDelta t\) [s] intervals. During this time, the robot follows the path candidates described in Sect. 3.1, which are refreshed with each observational frame, and implements the velocity control described in Sect. 3.2.
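Equations (13) and (14) amount to the selection rule sketched below. The constant turning rate, travel speed, and step settings are assumptions loosely based on the limits and cycle times stated in Sect. 4; they are not the paper's exact values.

```python
import math

OMEGA_ROBOT = math.radians(90)  # assumed constant turning rate [rad/s]
V_ROBOT = 1.0                   # assumed constant travel speed [m/s]
N, DT = 5, 0.6                  # prediction steps and period [s] (Sect. 4.2)

def total_cost(dtheta, length):
    """Eq. 13: turning time + N prediction steps + global path time."""
    return (abs(dtheta) / OMEGA_ROBOT + N * DT) + length / V_ROBOT

def select_path(candidates):
    """Eq. 14: index of the candidate (dtheta_i, L_i) with minimum cost."""
    return min(range(len(candidates)), key=lambda i: total_cost(*candidates[i]))
```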

4 Experiment

We confirmed, both in the simulator and the actual mobile robot, that the proposed method enabled the robot to arrive at its destination, without collisions, in an environment with a human flow of approximately 0.2 humans/m\(^2\). We then validated these robot-human interactions with an increased density of 0.5 humans/m\(^2\), which corresponds to an environment full of moving crowds, such as a crowded railway station. This presents a significant challenge for navigation methods simply designed to avoid collisions with single humans.

Our proposed method was constructed from data collected on the simulator and evaluated with an actual mobile robot and real pedestrians. There are two primary reasons for collecting data in this way:

  • Human behavior on a ROS-Stage-based simulator cannot be said to adequately represent true human behavior. Specifically, it is difficult to bridge the differences from actual movements, including avoidance behaviors when near other humans or a robot, and large repulsion-based movements. Thus, we do not believe that a detailed verification of robot-human interference is meaningful in this simulation.

  • It can, however, be acknowledged that, within the range where humans do not come too close to each other or to robots, the avoidance movements in the simulator have a certain degree of realistic spontaneity. Therefore, we believe that there is value in verifying a learner that has learned behaviors before humans and robot approach point-blank range.

4.1 Robot Condition

The robot model used in the simulator was assumed to be an omnidirectional mobile robot with a diameter of 0.45 m and a height of 1 m; these values were 0.56 and 0.9 m for the actual experimental robot. The robot and simulation settings were as follows: walking speed of 0–1.0 m/s, turning angular velocity of −90 to 90 deg/s, achievable acceleration of \(-1\) to 1 m/s\(^2\), and angular acceleration of −90 to 90 deg/s\(^2\). A two-dimensional (2D) LiDAR sensor with a view angle of −90 to 90 deg, a measurable distance of 0.1–30 m, and a resolution of 0.25 deg was embedded in the robot. Within the sensor measurement range, human positions were measured, and differentiation was used to estimate their speeds. If humans overlapped, as is the case in robot navigation in a densely crowded environment, only the human in front was visible. The simulator was based on ROS Stage [18], and its details are described in the appendix.

4.2 Design of Reproducible Environmental Condition

Figure 10 shows the experimental environment. Humans move in the direction of the arrow, within the range indicated by the 2 m wide and 8 m long gray rectangle. There are eight humans inside this range, arranged so that their density is always 0.5 humans/m\(^2\). Six points are defined on the two sides of the pedestrian flow, denoted A1–A3 and B1–B3, and the robot goes back and forth between each A point and the three B points. Crossing the flow once and arriving at the destination on the opposite side counts as one trial; in this way, each method is tested 18 times. We requested the following of the subjects while they were walking: to maintain the regulated density, enter the area as soon as another person leaves; when within the designated area, walk at a natural velocity; choose where to walk inside the area at random, preferably not along the middle line; do not alter behavior when approaching the robot, and keep this response consistent; and do not learn or predict the robot behavior over repetitions.

Fig. 10

Experimental environment settings

When returning to the inflow position after leaving the experimental area, subjects were to return to their original position through the side opposite to the goal position of the robot at that point in time to avoid approaching, and subsequently affecting, the robot after it passed through the area. Subjects were also not provided with a description of the evaluated navigation methods or the experiment intention.

The robot detected and measured pedestrian legs with the ROS leg detector [19] applied to the 2D LiDAR point cloud. Human detectors, which detect and measure human positions on top of leg detectors, are available, but they were not used in this work because they produce frequent false negatives. The leg detector recognizes humans by sensing each leg and transmits this information to the path extractor. Even when humans walk at a steady velocity, their legs do not necessarily move at that same steady velocity, so this information was filtered and smoothed over time. After smoothing, the estimated position and velocity values were obtained at 25 ms cycles. Candidate paths were then extracted at 30 ms cycles, and the robot wheel velocity control ran at 5 ms cycles.
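The paper does not specify the smoothing filter; one simple possibility, shown purely as an assumed sketch, is a finite-difference velocity estimate with exponential smoothing:

```python
class VelocityEstimator:
    """Finite-difference leg velocity with exponential smoothing to damp
    gait-induced oscillation; the filter form and alpha are assumptions,
    as the paper does not specify them."""
    def __init__(self, dt=0.025, alpha=0.2):  # 25 ms update cycle
        self.dt, self.alpha = dt, alpha
        self.prev = None
        self.v = (0.0, 0.0)

    def update(self, pos):
        """Feed one (x, y) leg position; returns the smoothed velocity."""
        if self.prev is not None:
            raw = ((pos[0] - self.prev[0]) / self.dt,
                   (pos[1] - self.prev[1]) / self.dt)
            # exponential moving average of the raw finite difference
            self.v = tuple(self.alpha * r + (1.0 - self.alpha) * s
                           for r, s in zip(raw, self.v))
        self.prev = pos
        return self.v
```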

The process for selecting the route with minimal cost, based on (14), was implemented in 330 ms cycles. The predictor time steps were N = 5. In the simulation, the path selection cycle was \(\varDelta t= \) 600 ms, so 3 s of movement was predicted.
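The selection rule can be summarized as a minimum over candidate paths of (time spent reaching the predicted pose) plus (global time-to-goal from that pose). The sketch below is a hypothetical rendering of this loop; `predict_pose` and `global_time_cost` are placeholder names for the local predictor and the global planner's cost query, which the paper defines in (14) but are not reproduced here.

```python
def select_path(candidates, predict_pose, global_time_cost):
    """Hypothetical sketch: pick the candidate path with minimal total
    time cost, combining the local prediction horizon with the global
    A*-based time-to-goal from the predicted pose."""
    N, dt = 5, 0.6                  # predictor steps x cycle = 5 x 600 ms
    horizon = N * dt                # = 3 s of predicted movement
    best, best_cost = None, float("inf")
    for path in candidates:
        pose = predict_pose(path, horizon)        # local 3 s prediction
        cost = horizon + global_time_cost(pose)   # spent + remaining time
        if cost < best_cost:
            best, best_cost = path, cost
    return best, best_cost
```

With N = 5 and \(\varDelta t\) = 600 ms, the horizon evaluates to the 3 s of predicted movement stated above.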

4.3 Comparison Methods

We refer to our proposed method as navigation with local human-interaction prediction (NLHP). Table 3 shows three alternative methods, used for comparison.

Table 3 Robot movement models of the comparison methods

All three of these methods utilize the A\(^*\) method, which is widely used for motion planning. A\(^*_\text {diff}\) offers the most promising performance, whereas A\(^*_\text {omni}\) and A\(^*_\text {omni35}\) were included to investigate the overall performance impact of their varied motion characteristics. A description of the three methods is provided below.

  • A\(^*_\text {diff}\): With the robot radius set as 0.5 m, a route to the goal position was planned. A velocity command was assigned, assuming that the robot runs along the planned trajectory by its differential two-wheel mechanism.

  • A\(^*_\text {omni}\): The same radius and route planning as in A\(^*_\text {diff}\). The velocity command was assigned while gradually shifting toward the tangential direction of the generated route, assuming an omnidirectional movement mechanism. Consequently, A\(^*_\text {omni}\) behaves markedly differently from A\(^*_\text {diff}\) when A\(^*\) provides a path whose direction is perpendicular to the current robot posture: A\(^*_\text {diff}\) makes the robot first stop and turn toward the perpendicular direction (along the path), whereas A\(^*_\text {omni}\) keeps moving along the route in its current posture and turns gradually toward the perpendicular direction.

  • A\(^*_\text {omni35}\): The route was planned with the radius set as 0.35 m, and the velocity command was assigned as in A\(^*_\text {omni}\).
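The behavioral contrast between the differential-drive and omnidirectional command policies described above can be sketched as follows. The gains (`k_turn`, `blend`) and the 30 deg turn-in-place threshold are illustrative assumptions, not the values used in the experiment.

```python
import math

def diff_drive_cmd(heading, path_dir, v_max=1.0, k_turn=2.0):
    """Differential two-wheel sketch: stop and turn toward the path
    direction before translating (no sideways motion is possible).
    Returns (forward velocity, angular velocity)."""
    err = math.atan2(math.sin(path_dir - heading),
                     math.cos(path_dir - heading))
    if abs(err) > math.radians(30):
        return 0.0, k_turn * err      # rotate in place first
    return v_max, k_turn * err        # then drive forward while steering

def omni_cmd(heading, path_dir, v=1.0, blend=0.2):
    """Omnidirectional sketch: keep translating while the movement
    direction shifts gradually toward the path tangent.
    Returns (vx, vy) in the world frame."""
    err = math.atan2(math.sin(path_dir - heading),
                     math.cos(path_dir - heading))
    move_dir = heading + blend * err  # gradual shift per control cycle
    return v * math.cos(move_dir), v * math.sin(move_dir)
```

For a path perpendicular to the current heading, `diff_drive_cmd` outputs zero translation (turn in place), while `omni_cmd` keeps moving and reorients over successive cycles, matching the contrast drawn between A\(^*_\text {diff}\) and A\(^*_\text {omni}\) above.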

Under A\(^*_\text {omni}\) and A\(^*_\text {omni35}\), frequent collisions occurred with obstacles outside the robot's 180 deg frontal field of view. In contrast, under the NLHP method, the robot always moved while facing the direction of the selected path, minimizing the risk of colliding with humans right beside or in front of it. Because it likewise reduces this collision risk, A\(^*_\text {diff}\) was used as the primary comparison method. Notably, A\(^*_\text {diff}\) is a direction control method that, owing to its motion model based on a differential two-wheel mechanism, cannot translate sideways or move diagonally backwards. We evaluated A\(^*_\text {omni}\) and A\(^*_\text {omni35}\) as examples of prioritizing time efficiency over safety in high-density environments.

4.4 Evaluation Index

We used the following indices to evaluate and compare the various navigation methods.

  • Duration: the time required to arrive at the goal position [s]

  • Human flow efficiency: number of humans who passed a specific point (entrance of the gray region in this measurement) per time [humans/s]

  • Collisions: the number of collisions or near-collisions, identified based on pedestrians’ emergency avoidance behaviors

  • Disturbances: the number of times the flow of pedestrians was disturbed, identified based on pedestrians’ changing their movement direction to avoid the robot

  • Blame: an index of the psychological pressure imposed on pedestrians encountered by the robot

Because the robot could only measure humans in a 180 deg frontal range, collisions and flow disturbances were identified and measured visually, by us, based on videos taken during the experiment.

Blame at a given time t is quantified by the following equation, which extends the evaluation method of [7] by adding the human movement direction, using information measured and considered by the robot.

$$\begin{aligned} B(t)&=\max _{i} \beta _{i}\frac{2}{1+{e}^{\alpha _{i}}}\nonumber \\ \alpha _{i}&=\Vert \varDelta t{\varvec{v}}_\mathrm{{robot}}-{\varvec{x}}_{ {i}}\Vert \nonumber \\ \beta _{i}&=\exp \left\{ -\frac{(\phi _{i} - \theta _{i})^2}{2\sigma _{\text {blame}}^2} \right\} \end{aligned}$$

Here, i ranges over humans within a 1.5 m radius of the robot center, and the blame is evaluated for the human exhibiting the highest blame level. \(\alpha _i\) represents the distance between the human position, \(\varvec{x}_i\), and the robot position after \(\varDelta t\) [s]; closer distances yield smaller denominators and subsequently larger quotients. \(\beta _i\) is defined so that smaller differences between the direction of human motion, \(\theta _i\), and the direction of the robot as seen by the human, \(\phi _i\) [rad], result in larger \(\beta _i\) values. The parameter \(\sigma _{\text {blame}}=0.5\) sets the range of influence in the form of a Gaussian function. Thus, blame is an indicator that increases when the robot approaches humans or is positioned in the direction of human movement. The blame at each time instant is aggregated as blame per time, the average value over the time a human spends within the 1.5 m radius of the robot.
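As a minimal sketch, the blame equation can be computed directly from its definition. The `(x, y, theta, phi)` data layout for each human, with positions expressed in the robot frame, is an assumption for illustration; the formula itself follows the equation above.

```python
import math

def blame(humans, v_robot, dt=0.6, sigma=0.5, radius=1.5):
    """Compute B(t) = max_i beta_i * 2 / (1 + exp(alpha_i)) over humans
    within `radius` of the robot center.

    humans: list of (x, y, theta, phi) in the robot frame, where (x, y)
    is the human position x_i, theta the human movement direction, and
    phi the direction of the robot as seen by the human [rad].
    (This data layout is an assumption for the sketch.)"""
    b = 0.0
    dx, dy = dt * v_robot[0], dt * v_robot[1]   # robot displacement after dt
    for (x, y, theta, phi) in humans:
        if math.hypot(x, y) > radius:
            continue                            # only humans within 1.5 m
        alpha = math.hypot(dx - x, dy - y)      # ||dt*v_robot - x_i||
        beta = math.exp(-((phi - theta) ** 2) / (2.0 * sigma ** 2))
        b = max(b, beta * 2.0 / (1.0 + math.exp(alpha)))
    return b
```

For a stationary robot and a human 0.5 m ahead moving straight toward it (\(\phi_i = \theta_i\)), this gives \(B = 2/(1+e^{0.5}) \approx 0.76\); humans outside the 1.5 m radius contribute nothing.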

4.5 Experimental Results

Figure 11 shows the time required for our NLHP method and the comparison methods to reach the goal position. The error bars indicate 95% confidence intervals, and ** indicates a statistically significant difference at the \(p<0.01\) level. We observe that A\(^*_\text {omni35}\), which did not hesitate to pass through small gaps among humans, reached the goal in the shortest time, with NLHP second. Specifically, the NLHP method provided a 43% time reduction compared with A\(^*_\text {diff}\), the primary comparison method.

Fig. 11

Duration for each of the evaluated methods

Figure 12 shows the efficiency of human flow, which was approximately 1 human/s in all cases. Since the robot affects humans' behavior only during a limited period in each trial, the differences between human flows were small; nevertheless, there were statistically significant differences between the proposed method and the comparison methods.

Fig. 12

Comparison of human flow efficiency during navigation

Collisions and disturbances are shown in Figs. 13 and 14, respectively. While the NLHP method and A\(^*_\text {diff}\) exhibit similar numbers of collisions, A\(^*_\text {omni}\) and A\(^*_\text {omni35}\) produced large numbers of both. This is primarily attributed to interference with humans outside the robot's range of view. Regarding disturbances, A\(^*_\text {omni35}\) achieved the fewest, owing to its aggressive, quick strategy, which kept interactions with pedestrians very short; given its poor results on the other indices, however, this result should not be overvalued. The comparison of blame in Fig. 15 shows that the proposed method performed best in reducing the psychological influence on humans, though statistical significance was not obtained because of the large variation of blame values across trials.

Fig. 13

Number of collisions produced by the evaluated methods

Fig. 14

Number of disturbances produced by the evaluated methods

Fig. 15

Blame per time for each of the evaluated methods

All robot behaviors recorded on video can be accessed from the authors' home page [20]. As examples, snapshots of navigations with starting point B1 and destination A3 are shown in Figs. 16, 17, 18 and 19 (resolution was lowered for privacy). The shots in each sequence were taken at 3 s intervals. First, it can be observed that the density of pedestrians in this setting is quite high and challenging when humans walk at their natural speed without special consideration for the robot. For A\(^*_\text {omni}\) and A\(^*_\text {omni35}\), the robot took a path close to the direct diagonal line toward the destination. However, this path sometimes disregarded nearby humans, and the robot collided with a pedestrian in the third frame of Fig. 19. With the proposed method, NLHP, the robot drifted downstream of the pedestrian flow while crossing it; this behavior is caused by the mechanism of extracting the human-following direction. In the case of A\(^*_\text {diff}\) in Fig. 17, the robot always tried to stay safe, but it was too slow and still could not sufficiently avoid 'blamed' situations.

Fig. 16

Navigation with NLHP (18.7 s)

Fig. 17

Navigation with A\(^*_\text {diff}\) (32.1 s)

Fig. 18

Navigation with A\(^*_\text {omni}\) (14.9 s)

Fig. 19

Navigation with A\(^*_\text {omni35}\) (12.2 s)

The comparisons of the five indices across the methods are summarized in Table 4. It was confirmed that NLHP, the proposed method, performed better than A\(^*_\text {diff}\), the primary comparison method. Conversely, as seen in the duration and disturbance comparisons, A\(^*_\text {omni35}\), the aggressive and quick strategy, performed better on those indices. This suggests there is room for the proposed method to improve its performance while maintaining its low-interference property with respect to humans.

Even though NLHP showed lower blame per time than the other methods on average, there were no statistically significant differences due to large deviations. This may be partly caused by fluctuations in the measurement of human velocity, because the blame defined in (15) includes the estimated direction of human motion. Therefore, decreasing the noise in human motion estimation is expected to increase the reliability of the blame measurement.

4.6 Discussion

The path extraction described in Sect. 3.1 does not utilize human velocity information when setting the robot driving direction because we were unable to measure human velocities with sufficient accuracy in our current experimental system. Additionally, we did not observe significant improvements for methods that did incorporate this information. If the leg and human detectors used in this experiment were improved, and human velocity could be estimated more accurately, we anticipate that using this information would improve the candidate path extractions and subsequently the navigation performance.

Furthermore, in this experimental system, LiDAR provided the robot with a 180 deg frontal range of observation, so contact with humans at the rear and sides could not be avoided. The robot was able to stop reliably when humans were detected at point-blank range, so extending this field of view to avoid all collisions will make the robot and our NLHP method viable for real-world operation.

Though the basic behavior of the robot with the proposed method was sufficient in the experiment, there were many cases where the robot could have acted more appropriately toward humans. The parameters of the potential function defined in Sect. 3.1 were mainly tuned by hand, based on the robot's behavior in simulation. It would be desirable to tune these parameters automatically, through both simulations and real-robot experiments, to further improve navigation performance; optimization techniques (e.g., [21, 22]) can be applied for this automatic tuning.
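Such automatic tuning can be sketched as black-box optimization over the simulation: the loop below uses random search for simplicity, with `simulate` and `bounds` as hypothetical placeholders for a simulation run returning a scalar cost and for the parameter ranges. More sample-efficient optimizers (e.g., those of [21, 22]) would slot into the same interface.

```python
import random

def tune_parameters(simulate, bounds, n_trials=50, seed=0):
    """Hypothetical sketch: random-search tuning of potential-function
    parameters. simulate(params) runs the navigation simulation and
    returns a scalar cost (e.g., duration plus interference penalties);
    bounds maps each parameter name to a (low, high) range."""
    rng = random.Random(seed)
    best_params, best_cost = None, float("inf")
    for _ in range(n_trials):
        # sample one candidate parameter set uniformly within bounds
        params = {k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}
        cost = simulate(params)
        if cost < best_cost:
            best_params, best_cost = params, cost
    return best_params, best_cost
```

The same loop could alternate between simulated and real-robot rollouts, as suggested above, by swapping the `simulate` callback.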

5 Conclusion and Future Challenges

In this study, we proposed a method that combines a local planner for route prediction, based on embedded sensor information, with a global planner that calculates the shortest distance to the goal, based on map information, to achieve time-efficient robot navigation in high-density crowded environments. First, candidate paths were extracted from the embedded sensor, and the positions that would be reached after approximately 3 s were predicted for each of the path candidates. We used a predictor to forecast the deceleration and the impact of interactions with humans to the left and right. By combining the time cost of reaching the predicted position with the time cost of reaching the goal, obtained from the global planner, we built a route selection system based on global time efficiency. The proposed method was implemented and verified in an environment with a high crowd density of 0.5 humans/m\(^2\) in a reproducible and controlled way, which allows different methods to be compared under a common condition. These experiments demonstrated that navigation could not only be achieved in this environment, but could be done in a shorter time and with less interference, while maintaining a similar collision risk, compared with established, standard A\(^*\)-based obstacle avoidance methods. The high-density environment employed in our experiment reproduced its high density more consistently than those of related studies. While such conditions might not occur with this frequency in real-world situations, the environment provides a common problem setting for evaluating navigation methods in crowded environments.

Table 4 Summary of comparisons between the proposed and comparison methods

In the future, the prediction accuracy should be improved. Specifically, the human behavior reproduced in the simulator must be made more similar to true human behavior. We expect that, by changing the input parameters from those representing individual humans to those representing groups, we can enable prediction of human flows rather than just individual trajectories, thus improving the prediction accuracy. More precise prediction will also allow prediction over a longer time span. Though we did not focus on the behavior of individual pedestrians in the current implementation, this improvement would enhance local interactions between the robot and a pedestrian, such as cooperative collision avoidance [6]. Additionally, an option to stop (wait), which is not implemented in the current method, could further improve performance.

The proposed NLHP method offers the unique advantage of using data collected for learning. This must be extended to data collected in the real world [23] and verified. For example, highly accurate human psychological models that account for individual differences [24] could be introduced, while continuing to confirm their significance for robot navigation. Finally, while A\(^*\)-based methods are a reasonable standard as a basic comparison, there are other approaches for reciprocal collision avoidance [15, 25]; further comparisons against such navigation methods will be conducted in future work.