Robot Navigation Based on Prediction of Human Interaction and Its Reproducible Evaluation in a Densely Crowded Environment

To achieve robot navigation in crowded environments with high densities of moving people, it is insufficient to simply treat humans as moving obstacles and avoid collisions with them; the impact of an approaching robot on human movements must be considered as well. Moreover, the various navigation methods in the literature have been tested in their own environments, which makes them difficult to compare with one another. Thus, we propose an autonomous robot navigation method for densely crowded environments based on data-driven predictions of robot-human interactions, together with a reproducible experimental test under controlled conditions. Based on localized positional relationships with humans, the method extracts multiple alternative paths, each implementing either following or avoidance, and selects an optimal path based on time efficiency. The candidate paths are evaluated using neural networks that predict the robot's position after a given amount of time has elapsed. These predicted positions are then used to calculate the time required to reach the target position so that the optimal path can be determined. We trained the predictor using simulated data and conducted experiments with an actual mobile robot in an environment where humans were walking around. Using the proposed method, collisions were avoided more effectively than with conventional navigation methods, and navigation was achieved with good time efficiency, resulting in an overall reduction in interference with humans. Thus, the proposed method enables effective navigation in a densely crowded environment, while collecting human-interaction experience for further improvement of its performance in the future.


Research Background and Motivation
One expected application of mobile robots is the task of guiding users in environments with high densities of pedestrians, such as railway stations, shopping malls, and stadiums [1]. Owing to improvements in human recognition technologies [2] and simultaneous localization and mapping (SLAM) [3], mapping and navigating environments while recognizing and avoiding humans has become feasible for mobile robots. In the field of mobile robot navigation, various approaches that employ sensors placed within the environment have been implemented, but this study achieves navigation by recognizing humans with sensors installed on the robot itself, to reduce the operating costs incurred when robots are introduced to new environments. When operating in environments where humans are active, robots must avoid causing feelings of discomfort or obstruction in humans, in addition to avoiding collisions with them. Furthermore, pedestrians are not simply moving obstacles; they also have the unique characteristic of altering their movement in response to robots, making collision avoidance more challenging.

(Correspondence: Yuichi Kobayashi, kobayashi.yuichi@shizuoka.ac.jp. Affiliations: 1 Department of Mechanical Engineering, Shizuoka University, Hamamatsu, Japan; 2 Hewlett-Packard Japan, Ltd., Tokyo, Japan; 3 Persol Research & Development, Nagoya, Japan; 4 Sky Ocean Technology Co. Ltd., Jeju-si, South Korea; 5 Toyota Motor Corporation, Toyota, Japan.)
Meanwhile, to promote the development of such navigation amid human interactions, a common testbed is important. Since human behavior is affected by various factors, such as density, destination, and obstacles other than the robot, scenarios of human behavior must be designed carefully according to the objectives of each experiment.

Related Research
When navigating in environments where humans are active, route planning and speed control are necessary to avoid collisions with humans. Treating pedestrians as moving obstacles, Berg et al. implemented multi-agent navigation through a speed-control method called the reciprocal velocity obstacle (RVO) [4]. While effective, RVO does not explicitly address interactions between pedestrians and robots, so robots exhibited abnormal proximity to surrounding humans. By introducing a Kalman filter into RVO, Kim et al. enabled uncertainty to be handled, thus solving the problem of abnormal proximity [5]. However, introducing uncertainty into this navigation method triggered the freezing robot problem (FRP) [6], where the robot becomes unable to move in high-density crowded environments owing to the high uncertainty of collisions with pedestrians.
Trautman et al. proposed solving the FRP under a cooperative collision-avoidance scenario between a robot and humans. Generally speaking, however, we cannot guarantee that all pedestrians will always cooperate with robots. Conversely, if pedestrians are completely adversarial toward robots, the FRP can never be solved. This suggests that explicit communication, such as asking pedestrians to make room for the robot, would be effective in resolving the problem. Nevertheless, we believe that focusing on navigation among non-cooperative pedestrians will contribute to building a navigation strategy for densely crowded environments, because improved performance in such non-cooperative settings reduces the need for robots to ask pedestrians to make room for them.
From the perspective of considering robot-human interaction in navigation, Mehta proposed a method [7] that extends multi-policy decision making [8]. Here, decision policies, such as Go-Solo, Follow, and Stop, are designed in advance. By switching between these policies according to the state of the surrounding humans, the robot can select behaviors such as Stop, rather than simple obstacle avoidance (Go-Solo) that may cause the FRP. In this case, a geometric prediction of human behavior is used when each policy is evaluated, but it is unclear whether humans conform completely to this model. If robots could predict interactions based on the behaviors that humans actually exhibit in response to robots, they could navigate in harmony with the real world. Studies on predicting human interactions exist [9], but they focus on one-to-one encounters between humans and robots. In contrast, our work attempts to predict interactions when multiple pedestrians are crowded densely together. Furthermore, local observations of individual positions can become inaccurate in environments where humans move in crowds, because of occlusion and the density of human legs and bodies. In this case, methods that enable robots to account for overlapping humans when planning motions, such as 'avoid approaching humans' or 'follow humans walking ahead in the same direction', can be more effective.
Additionally, Abbeel reported using inverse reinforcement learning to teach robots human behavior [10], which enabled robot navigation in complex environments, such as within crowd flows, to be taught via human instruction. Furthermore, Kretzschmar et al. proposed learning a socially compliant behavior model from pedestrian behavior or tele-operated robot behavior via inverse reinforcement learning [11]. Based on probabilistic modeling, their model could capture both discrete navigation decisions and the continuous (relatively small) variance of human trajectories. However, the inverse reinforcement learning approach is vulnerable to bias and other undesirable effects, as it relies on the specific traits of the human instructor or the environments where data are collected. Our approach, in contrast, focuses only on local interactions between humans and robots, even within the same data-driven paradigm. Therefore, the behavioral model can be expected to be constructed in a local scope that is less influenced by the environment.
Chen et al. used deep reinforcement learning [12] to teach robots human social norms [13]. These norms, such as "when walking on the right, pass people on the left," are implicit rules for humans in crowds; they were taught to the robot by rewarding the corresponding movements when passing each other. The authors were thereby able to achieve navigation that did not cause discomfort in the surrounding humans. In environments where humans are present at high densities, however, robots should move with the flow of pedestrians going in the same direction rather than follow social norms to avoid collisions with every individual. Therefore, teaching social norms is difficult to apply to high-density crowded environments. Furthermore, this method faces challenges in meeting planning requirements for effective robot motion that consider global optimality.
Finally, in most related works, experimental evaluations have unfortunately been conducted under varying crowd conditions, which directly affects the evaluation of navigation ability in high-density crowded environments and makes it difficult to objectively verify effectiveness via reproduction. Thus, another important open issue in robotic navigation in crowded environments is that a common, reproducible, and controlled test has not yet been proposed.

Research Objective
We propose a navigation method wherein robots navigate while making predictions, by learning human interaction in high-density crowded environments. Specifically, we propose a method for extracting multiple paths based on the distribution of nearby humans; the paths express multiple navigation strategies, both avoiding and following, for circumstances in which the robot is surrounded by pedestrians. By combining the extracted paths with an acquired interaction model, the robot trajectory resulting from each path selection can be predicted while accounting for human interactions.
Using the acquired interaction model, the robot can predict excessive human interference, which manifests as low-speed robot movement caused by the interference. By combining the localized interaction prediction with a global movement planner, the optimal path can be chosen from multiple options. We tested the proposed navigation method in a real-world environment with many pedestrians to validate its effectiveness in a controlled and reproducible way.
This study provides the following contributions to the field of mobile robot navigation in crowded environments. First, we present tests of mobile robot navigation in a continuous flow of humans with an explicitly controlled density. These conditions and settings make the evaluation of mobile robot navigation in highly crowded environments reproducible and comparable. Second, the robot is able not only to avoid humans in one-on-one situations, but also to avoid or follow a crowd of pedestrians through the localized path-extraction method. In a previous study [14], navigation was achieved in an environment crowded with many people, but a ceiling-mounted camera had to be used to measure the humans. In contrast, our study uses only sensors mounted on the robot itself, realized by expressing the path from the viewpoint of the robot based on its local sensing. Third, this work provides a framework that enables the collection of human-interaction data that can be used for predictions. This enables predictions that incorporate human behavior in actual environments, making the framework distinct from motion-planning approaches such as pedestrian optimal reciprocal collision avoidance (PORCA) [15], which addresses human intention and uncertainty as a partially observable Markov decision process; our framework can instead be positioned as an experience-based method. Finally, the proposed method considers optimization in combination with global path planning, which is advantageous in comparison with methods based on control information from a human operator, such as inverse reinforcement learning.

Problem Settings
For this study, we assume a crowded environment where humans tend to generate some flows due to a high density.
To maintain the focus on human interactions, no obstacles were placed in the region around the start and target positions. The mobile robot navigates toward its destination using only its onboard sensor information and is not supposed to communicate explicitly with pedestrians (e.g., it does not ask humans to make room for it by gesture or utterance). The primary goal of the robot is to reach a defined target position quickly, starting from a fixed initial position, while avoiding collisions with humans. The secondary goal is to avoid causing psychologically oppressive feelings in pedestrians when traversing against their flow. For the implementation and validation of the mobile robot navigation, we rely on a simulator to collect data for building the predictor, and on experiments with real human subjects to validate the proposed navigation. Details of the simulator are described in the Appendix, and the experimental conditions for validation are described in Sect. 4.

Navigation Method Based on Path Extraction and Selection
Figures 1 and 2 provide an overview of the proposed navigation method. Figure 1 outlines the offline data-collection process on the simulator, used to configure the predictor that ultimately predicts human interactions. Path candidates are extracted from the 2D LiDAR point cloud via a leg-detection module. One of the paths is randomly selected and followed for a certain duration, and the robot trajectory is recorded while the path remains selected. The resulting data are used to configure a human/robot trajectory predictor, whose predicted trajectories reflect the interaction between the robot and its nearby pedestrians.
At the online navigation stage, as shown in Fig. 2, the predictor forecasts the trajectories and arrival positions several seconds into the future, given that an extracted candidate path is selected. Integrating the predicted results with the global path-planning cost, the robot selects the path that minimizes the total time to reach the goal.

Extracting Path Candidates Based on Local Observations
In this study, two types of path extractors, each expressing one of two strategies, are configured to extract the robot paths.
One is the Avoidance strategy, in which the robot avoids humans while aiming for the goal; the other is the Follow strategy, in which the robot follows humans. To extract path candidates, a one-dimensional potential function is defined over the angle φ [rad] in a polar coordinate system with the robot at the origin. This function is a simplification of the well-known artificial potential field method, defined over a 2D plane, to a one-dimensional (1D) angle space. It expresses the difficulty of moving forward in each possible driving direction in the robot's immediate surroundings, where a negative value indicates that movement in that direction is very easy (Fig. 3).
Let i denote the index of a human recognized by the robot. Equation (1) shows the calculation of the potential function P_follow(φ) for extracting paths that follow humans walking in the same direction as the robot. Here, in the robot coordinate system, the polar-coordinate angle is φ, the direction of human i is θ_i, and the direction of the goal is θ_g. These angles are 0 deg directly in front of the robot, with the right side defined as positive. A_i^follow and σ_follow are defined in (2) and (3), respectively.
This potential function is configured using a Gaussian function and is designed so that local minima occur in the directions where humans are moving away from the robot. The distance to detected human i is denoted r_i, the receding speed of human i relative to the robot is denoted Rv_i^follow, and A_i is the coefficient that determines the magnitude of the potential.
A_i^follow and σ_follow are designed to express that a human moving away from the robot at a faster speed (large Rv_i^follow) exerts a greater influence, and that humans farther away (large r_i) exert less influence. The amplitude of the Gaussian function is expressed by A_i, so humans that are farther away have correspondingly smaller impacts. The coefficient σ_follow determines the horizontal width of the potential; a narrower width means a narrower range of effect when the potentials are superimposed in (4). This coefficient is therefore set so that a human's potential width narrows with decreasing r_i. Below a certain range, σ_follow is held constant at 0.25 to prevent the width from becoming too narrow. The threshold value of 6.1 in (3) was chosen so that the two cases take almost equal values at r_i = 6.1, preventing a sudden change in σ_follow. The potential P_follow used to extract paths is defined as the sum of the individual potentials p_i^follow over all detected humans i.
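The exact forms of (1)-(4) are not reproduced in this excerpt, but the Gaussian-sum structure they describe can be sketched as follows. The amplitude rule (receding speed divided by distance) and the width rule (floored at 0.25 below r_i = 6.1, growing linearly beyond it) are assumptions consistent with the text, not the paper's exact equations.

```python
import math

def p_follow_i(phi, theta_i, r_i, v_away_i, sigma_floor=0.25, r_th=6.1):
    """Sketch of one human's contribution to the Follow potential.

    phi      : candidate heading [rad] in the robot frame
    theta_i  : direction of human i [rad]
    r_i      : distance to human i [m]
    v_away_i : receding speed of human i relative to the robot [m/s]
    """
    # Amplitude: larger when the human moves away faster, smaller when far.
    A_i = -v_away_i / r_i            # negative -> a local minimum at theta_i
    # Width: held at the floor for close humans, continuous at r_i = r_th.
    sigma = sigma_floor if r_i < r_th else sigma_floor * r_i / r_th
    return A_i * math.exp(-(phi - theta_i) ** 2 / (2.0 * sigma ** 2))

def P_follow(phi, humans):
    """Sum of per-human potentials; local minima mark follow-path candidates."""
    return sum(p_follow_i(phi, th, r, v) for (th, r, v) in humans)
```

A receding human directly ahead thus produces a negative well at φ = 0, which the path extractor picks up as a Follow candidate.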
When following humans, P_follow(φ) attains local minima in the directions φ of humans moving away from the robot, and the robot can then follow those humans. Figure 4 shows an example of paths extracted using P_follow: the left panel shows the distribution of surrounding humans, and the right panel shows the corresponding P_follow(φ). The line on each human dot represents velocity. In the left panel, there are eight humans in total, with two in front of the robot (slightly to the right) and two to its front left. The extracted directions point toward these groups, indicating that paths for following groups of humans have been extracted.
Two types of avoidance path candidates are also extracted. One is the path passing between humans, based on P_avoid(φ); the other is the path going around the outside of a group of humans at the edge of the robot's field of vision. For the first type, P_avoid(φ), A_i^avoid, and σ_avoid are defined analogously to (1)-(3). The minima of the calculated 1D potential yield paths that pass between people. The 1D potential used to extract these paths is obtained from the sum of the potentials of the detected humans i, as shown in (5). When r_i decreases below 2.9 m, σ_avoid assumes a constant value of 0.18 to prevent it from becoming too small, which could cause the extraction of a path too close to a human. As in the case of (3), the threshold value r_i = 2.9 was chosen to maintain the continuity of σ_avoid between the two cases.
The first term in this equation is the potential function whose minima lie in the gaps between humans. The second term, p_g(φ), defined in (9), is a potential function with its minimum in the goal direction.
Here, the parameters B and β > 0 determine the shape of this goal potential.
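The two-term structure described above (repulsive bumps at human directions plus an attractive well toward the goal) can be sketched as follows. The positive Gaussian bumps, the 1/r_i amplitude, and the exponential form of the goal term p_g are assumptions for illustration; the paper's (5)-(9) may differ in detail.

```python
import math

def p_avoid_i(phi, theta_i, r_i, sigma_floor=0.18, r_th=2.9):
    """One human's repulsive bump: a positive peak at the human's direction."""
    A_i = 1.0 / r_i                       # closer humans repel more strongly
    sigma = sigma_floor if r_i < r_th else sigma_floor * r_i / r_th
    return A_i * math.exp(-(phi - theta_i) ** 2 / (2.0 * sigma ** 2))

def p_goal(phi, theta_g, B=0.5, beta=1.0):
    """Assumed goal term: a broad negative well at the goal heading theta_g.
    B scales the depth and beta the width, mirroring the roles of B and beta."""
    return -B * math.exp(-beta * (phi - theta_g) ** 2)

def P_avoid(phi, humans, theta_g):
    """Minima of this sum lie in gaps between humans, biased toward the goal."""
    return sum(p_avoid_i(phi, th, r) for (th, r) in humans) + p_goal(phi, theta_g)
```

With two humans flanking the goal direction, the minimum of P_avoid falls in the gap between them, which is exactly the between-humans path candidate.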
For path candidates going around the outside of a group of humans, the potential in many cases decreases monotonically toward both edges of the 2D LiDAR field of view, so no minimum can be defined there. Thus, to extract a path that goes around the outside of the group, the potential on the outer side of the field of view, beyond the nearest maximum, is approximated by the following equation, from which a path displaced from that maximum by a fixed angular offset is extracted.
Here, A_round is set to the value of the nearest potential maximum, and σ_round is estimated by the least-squares method. Based on this estimate, the directions displaced by σ_round from the maximum direction, toward each edge of the field of view, become the path candidates φ_L^round and φ_R^round. The left panel of Fig. 5 shows an example distribution of humans, and the corresponding potential P_avoid(φ) is shown in the right panel. Minimizing P_avoid(φ) extracts a path between the human groups, but no minimum can be detected on either side. Instead, directions of travel that go around the outside of the group are extracted; the two such paths, φ_L^round and φ_R^round, can be readily observed in Fig. 5.
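The least-squares estimate of σ_round can be sketched as follows, assuming the potential near the peak is modeled by a Gaussian of amplitude A_round; taking logarithms makes the fit linear in the squared angular offset. This specific linearization is our assumption about how the fit might be carried out, not the paper's stated procedure.

```python
import math

def estimate_sigma_round(phi_peak, A_round, samples):
    """Least-squares fit of a Gaussian width to potential samples.

    Model: P(phi) ~ A_round * exp(-(phi - phi_peak)^2 / (2 sigma^2)),
    so ln(P / A_round) = (1 / sigma^2) * (-(phi - phi_peak)^2 / 2),
    i.e. linear through the origin with slope 1 / sigma^2.
    samples: list of (phi_k, P_k) taken outside the peak, 0 < P_k < A_round.
    """
    xs = [-((phi - phi_peak) ** 2) / 2.0 for phi, P in samples]
    ys = [math.log(P / A_round) for phi, P in samples]
    slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return math.sqrt(1.0 / slope)

def phi_round(phi_peak, sigma_round, side):
    """Candidate heading one fitted width outside the peak direction.
    side = +1 toward the right field-of-view edge, -1 toward the left."""
    return phi_peak + side * sigma_round
```

On samples drawn from an exact Gaussian the fit recovers the true width, and the two candidates are then placed symmetrically about the peak.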

Robot Velocity Control
In addition to determining the direction of travel, as described in the previous section, we used the optimal velocity (OV) model [16] to ensure that the robot moves at velocities appropriate to its distance from humans. This model is typically used for vehicle control and describes both steady flow and congestion well, not only for traffic systems but also for pedestrian flows (e.g., [17]). The OV model is therefore used to control the robot velocity so that the robot can smoothly join a human flow under the Follow strategy. Given d, the distance from the robot to the closest human, the target velocity V_des(d) is determined as follows.
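A common OV-model form of V_des(d) is the shifted-tanh function below: zero at zero headway, rising around a characteristic distance, and saturating at a maximum speed. The parameter values (v_max, d_stop, a) are placeholders for illustration, not the values used in the paper.

```python
import math

def v_des(d, v_max=1.0, d_stop=0.6, a=2.0):
    """OV-style target speed from headway distance d [m] (sketch).

    Zero when the gap is zero, rises around d_stop, saturates at v_max.
    The normalization makes v_des(0) = 0 and v_des(inf) -> v_max exactly.
    """
    v = v_max * (math.tanh(a * (d - d_stop)) + math.tanh(a * d_stop)) \
        / (1.0 + math.tanh(a * d_stop))
    return max(0.0, v)   # clamp against negative d from sensor noise
```

Driving the wheel speed toward v_des(d) at each control cycle lets the robot settle into the speed of the pedestrian it follows instead of repeatedly closing in and braking.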

Data Collection and Predictor Configuration
On the simulator, data are collected on the interactions between the robot and humans as they travel in proximity, with the robot following paths determined by the extraction method described above. The trajectory predictor is then configured from these data (Table 1). As described in Sect. 3.1, the path candidates are classified into two groups: "Avoid," which passes through the gaps between and around humans, and "Follow," which follows humans. Consequently, two corresponding types of trajectory predictors are created using multi-layered neural networks. The predicted interactions, shown in Fig. 7, are expressed as displacements from the 'straight-ahead' route, i.e., the trajectory the robot would follow when moving straight at a steady velocity in the direction of the path selected at that point in time (Table 1).
The definitions of these input elements are shown in Fig. 8. Displacement from the straight-ahead route is expressed as Δx_n ∈ R², n = 1, ..., N, where n denotes a discrete time with time step Δt, and N denotes the maximum discrete time of the prediction. Thus, the output of each function approximator is a 2N-dimensional vector. Table 2 shows the common settings of the avoiding/following motion-predictor neural networks. To collect a variety of moving directions, the random selection from multiple candidate directions was controlled so that directions with fewer previously collected samples were selected more frequently. Ten percent of the training samples were used for testing. During training, the loss functions on both the training data and the test data converged to similar values, indicating the absence of overfitting. Among the inputs to each predictor in Fig. 8, the positions and velocities of the humans on either side of the travel direction serve as inputs for Avoid; of the humans on each side, those closest to the robot's direction of travel and to the robot itself are selected. For Follow, the position and velocity of the human being followed, as well as the positions of the humans on either side, are input. In both cases, one human on each side is randomly selected from the Pareto-optimal set with respect to the two indicators of distance and direction. If no human is present on the left or right, the distance and angle are each set to 1000.
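The assembly of the predictor's input and output vectors can be sketched as follows. The feature ordering, the velocity placeholders for an absent side, and the helper names are assumptions for illustration; only the sentinel value 1000 and the 2N-dimensional displacement output come from the text.

```python
SENTINEL = 1000.0   # used when no human is present on a side (distance, angle)

def predictor_input(left, right, followed=None):
    """Assemble the feature vector fed to a trajectory predictor (sketch).

    left / right : (distance [m], angle [rad], vx, vy) of the selected human
                   on each side of the travel direction, or None if absent.
    followed     : same tuple for the followed human (Follow predictor only).
    """
    def side(h):
        # Assumption: absent-side velocities are zero-filled alongside
        # the 1000-valued distance and angle stated in the paper.
        return list(h) if h is not None else [SENTINEL, SENTINEL, 0.0, 0.0]
    x = side(left) + side(right)
    if followed is not None:          # Follow predictor takes one extra human
        x += list(followed)
    return x

def unpack_displacements(y):
    """Split the 2N-dimensional network output into N 2D displacements
    (dx_n, dy_n) from the constant-velocity straight-ahead route."""
    return [(y[2 * n], y[2 * n + 1]) for n in range(len(y) // 2)]
```

For N = 5, the network output is a 10-dimensional vector that unpacks into five future 2D offsets along the prediction horizon.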

Online Navigation Method
During online navigation, the robot extracts candidate paths using the extraction method described in Sect. 3.1. After the path selection described below has been conducted, drive control based on velocity control is performed as described in Sect. 3.2. Figure 9 shows the predicted trajectory when the ith path candidate is selected. Based on the predicted Nth position, the total time cost C_total^(i) to reach the goal when candidate path i is selected is defined as

C_total^(i) = ( θ^(i) / ω_robot + N Δt ) + L^(i) / v_robot,    (13)

where C_local^(i), the time cost required to move N steps along the selected path, corresponds to the parenthesized term. The first item inside the parentheses is the time cost of turning the robot from its current heading: the deflection angle θ^(i) toward path i is divided by the rotational velocity ω_robot [rad/s], which is kept constant. The second item inside the parentheses is the time cost of the N steps; since the observation period Δt is constant, this value is common to all candidate paths. The term after the parentheses expresses the time cost to reach the goal position after the localized path has been traversed, obtained by dividing the route length L^(i), given by the global planner, by the robot velocity v_robot. To simplify the global cost estimation, we assume that the robot moves to the goal at a constant velocity. The path is selected from among the candidates by minimizing this time cost:

i* = argmin_i C_total^(i).    (14)
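The time-cost selection described in this section can be sketched directly. The default values of ω_robot and v_robot are placeholders; the cost terms (turning time, N prediction steps, global-route time at constant speed) follow the text.

```python
def total_cost(theta_i, L_i, N, dt, omega_robot=1.0, v_robot=1.0):
    """Total time cost for candidate path i, per the three described terms.

    theta_i : deflection angle toward path i [rad]
    L_i     : global-planner route length from the predicted Nth position
              to the goal [m]
    """
    c_local = abs(theta_i) / omega_robot + N * dt   # turn + N prediction steps
    return c_local + L_i / v_robot                  # + constant-speed global cost

def select_path(candidates, N=5, dt=0.6):
    """Return the index of the candidate that minimizes the total time cost.
    candidates: list of (theta_i, L_i) pairs."""
    return min(range(len(candidates)),
               key=lambda i: total_cost(candidates[i][0], candidates[i][1], N, dt))
```

A candidate that requires a large turn can still win if the predicted arrival position shortens the remaining global route enough, which is the trade-off the selection is meant to capture.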
This path selection is performed at intervals of Δt [s]. In the meantime, the robot follows the candidate path described in Sect. 3.1, which is refreshed with each observation frame, and applies the velocity control described in Sect. 3.2.

Experiment
We confirmed, both in the simulator and on the actual mobile robot, that the proposed method enabled the robot to arrive at its destination without collisions in an environment with a human flow of approximately 0.2 humans/m². We then validated the robot-human interactions at an increased density of 0.5 humans/m², which corresponds to an environment full of moving crowds, such as a crowded railway station. This presents a significant challenge for navigation methods designed simply to avoid collisions with individual humans.
The proposed method was trained on data collected in the simulator and evaluated with an actual mobile robot and real pedestrians. There are two primary reasons for collecting data in this way:

- It cannot be said that human behavior in a ROS-Stage-based simulator adequately represents true human behavior. Specifically, it is difficult to bridge the differences from actual movements, including avoidance behaviors when near other humans or a robot and large repulsion-based movements. Thus, we do not believe that a detailed verification of robot-human interference is meaningful in this simulation.
- Within the range where humans do not get too close to each other or to robots, however, the avoidance movements in the simulator can be acknowledged to have a certain degree of realistic spontaneity. Therefore, we believe there is value in verifying a learner that has learned such behaviors before humans and the robot approach point-blank range.

Robot Condition
The robot model used in the simulator was assumed to be an omnidirectional mobile robot with a diameter of 0. Within the sensor measurement range, human positions were measured, and differentiation was used to estimate their velocities. If humans overlapped, as is typical for robot navigation in a densely crowded environment, only the human in front was observed. The simulator was based on ROS Stage [18], and its details are described in the Appendix. Figure 10 shows the experimental environment. Humans move in the direction of the arrow, within the 2 m wide and 8 m long gray rectangle. Eight humans are kept inside this range, so that their density is always 0.5 humans/m². Six points are defined, three on each side of the pedestrian flow, denoted A1-A3 and B1-B3, and the robot travels back and forth between the A points and the B points. Crossing the flow once and arriving at the destination on the opposite side counts as one trial; in this way, each method is tested 18 times. We asked the subjects to observe the following rules while walking: maintain the regulated density by entering the area as soon as another person leaves; walk at a natural velocity within the designated area; choose where to walk inside the area at random, preferably not along the middle line; do not alter behavior when approaching the robot, and keep this response consistent; and do not learn or predict the robot's behavior over repeated trials. When returning to the inflow position after leaving the experimental area, subjects were to pass along the side opposite the robot's current goal position, to avoid approaching, and thereby affecting, the robot after it passed through the area. Subjects were also not given a description of the evaluated navigation methods or the intention of the experiment.

Design of Reproducible Environmental Condition
The robot detected and measured pedestrian legs using the ROS leg detector [19] with 2D LiDAR point-cloud information. Human detectors, which detect and measure human positions based on leg detections, are currently available, but they were not used in this work because they produce frequent false negatives. The leg detector recognizes humans by sensing each leg and transmits the information to the path extractor. Even when humans walk at a steady velocity, their legs do not necessarily move at a steady velocity, so this information was filtered and smoothed over time. After smoothing, the estimated position and velocity values were obtained in 25 ms cycles. Accordingly, candidate paths were extracted in 30 ms cycles, and the robot wheel-velocity control ran in 5 ms cycles.
The process of selecting the route with minimal cost, based on (14), ran in 330 ms cycles. The number of predictor time steps was N = 5. In the simulation, the path-selection cycle was Δt = 600 ms, so 3 s of movement was predicted.

Comparison Methods
We refer to the proposed method as navigation with local human-interaction prediction (NLHP). Table 3 lists the three alternative methods used for comparison.
All three comparison methods use the A* algorithm, which is widely used for motion planning. A*diff offers the most promising performance, whereas A*omni and A*omni35 were included to investigate the overall performance impact of their different motion characteristics. The three methods are described below.
- A*diff: With the robot radius set to 0.5 m, a route to the goal position was planned. The velocity command was assigned assuming that the robot follows the planned trajectory with a differential two-wheel mechanism.
- A*omni: The same radius and route planning as A*diff. The velocity command was assigned while gradually shifting toward the tangential direction of the generated route, assuming an omnidirectional movement mechanism. A*omni therefore behaves saliently differently from A*diff when A* produces a path perpendicular to the current robot posture: A*diff makes the robot first stop and turn toward the perpendicular direction (along the path), whereas A*omni keeps moving along the route in its current posture and turns gradually toward the perpendicular direction.
- A*omni35: The route was planned with the radius set to 0.35 m, and the velocity command was assigned as in A*omni.
Under A*omni and A*omni35, frequent collisions occurred with obstacles outside the robot's 180 deg frontal field of view. In contrast, under the NLHP method, the robot always moves while facing the direction of the selected path, minimizing the risk of colliding with humans directly beside or in front of it. Because it likewise reduces the risk of collision, A*diff was used as the primary comparison method. Notably, A*diff is a direction-control method that, owing to its motion model based on a differential two-wheel mechanism, does not move sideways or diagonally backwards. We evaluated A*omni and A*omni35 as examples of prioritizing time efficiency over safety in high-density environments.

Evaluation Index
We used the following indices to evaluate and compare the navigation methods. Because the robot could only measure humans in a 180 deg frontal range, collisions and flow disturbances were identified and counted visually by the authors from videos taken during the experiment. Blame at a given time t is quantified by the following equation, in which the human movement direction is added to the evaluation method of [7], based on information measured and considered by the robot.
Here, i indexes the humans within a 1.5 m radius of the robot center, and the blame is evaluated for the human exhibiting the highest blame level. α_i depends on the difference between the human position x_i and the robot position after Δt [s]; closer distances yield smaller denominators and hence larger quotients. β_i is set so that a smaller difference between the direction of human motion, θ_i, and the direction of the robot as seen by the human, φ_i [rad], results in a larger β_i value. The parameter σ_blame = 0.5 sets the range of influence in the form of a Gaussian function. Thus, blame is an indicator that increases when the robot approaches humans or is positioned in the direction of human movement. The blame at each time instant is aggregated as blame per time, the average value over the time a human spends within the 1.5 m radius of the robot.

Figure 11 shows the time required for the NLHP method and the comparison methods to reach the goal position. The error bars indicate 95% confidence intervals, and ** indicates a statistically significant difference at p < 0.01. A*omni35, which did not hesitate to pass through small gaps among humans, reached the goal in the shortest time, with NLHP next. Specifically, the NLHP method provided a 43% time reduction compared with A*diff, the primary comparison method.

Figure 12 shows the efficiency of human flow, which was approximately 1 human/s in all cases. Since the robot affects human behavior only during a limited period in each trial, the differences between human flows were small. Nevertheless, there were statistically significant differences between the proposed method and the comparison methods.
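The blame indicator defined in this section can be sketched as follows. The inverse-distance form of α_i, the product α_i·β_i as the combination rule, and the angle wrapping are assumptions consistent with the description rather than the paper's exact equation.

```python
import math

def blame(robot_next, humans, sigma_blame=0.5, radius=1.5):
    """Blame at one instant (sketch of the described indicator).

    robot_next : robot position after dt, as (x, y)
    humans     : list of (x, y, theta_i), theta_i being the human's heading.
    Only humans within `radius` of the robot are scored; the maximum
    per-human value is reported, matching 'highest blame level'.
    """
    worst = 0.0
    for (hx, hy, theta_i) in humans:
        dx, dy = robot_next[0] - hx, robot_next[1] - hy
        dist = math.hypot(dx, dy)
        if dist > radius:
            continue                      # outside the 1.5 m evaluation radius
        alpha = 1.0 / max(dist, 1e-6)     # closer robot -> larger blame
        phi_i = math.atan2(dy, dx)        # robot direction as seen by the human
        # Wrap the heading difference into (-pi, pi] before the Gaussian.
        diff = math.atan2(math.sin(theta_i - phi_i), math.cos(theta_i - phi_i))
        beta = math.exp(-diff ** 2 / (2.0 * sigma_blame ** 2))
        worst = max(worst, alpha * beta)
    return worst
```

A robot standing directly in a human's path thus scores much higher than one at the same distance off to the side, which is the behavior the indicator is designed to penalize.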

Experimental Results
The numbers of collisions and disturbances are shown in Figs. 13 and 14, respectively. While the NLHP method and A*diff exhibited similar numbers of collisions, A*omni and A*omni35 produced large numbers of both. This is primarily attributed to interference with humans outside the robot's field of view. Regarding the number of disturbances, A*omni35 achieved the smallest number. This was due to its aggressive, quick strategy, which kept interactions with pedestrians very short; considering the overall balance across the other indices, however, this result should not be overestimated. The comparison of blame shown in Fig. 15 indicates that the proposed method achieved the best performance in reducing the psychological influence on humans, although statistical significance was not obtained because of the large variation of blame values across trials.
All robot behaviors recorded in videos can be accessed from the author's homepage [20]. As examples, snapshots of navigations with starting point B1 and destination A3 are shown. First, it can be observed that the density of pedestrians in this setting is quite high and challenging when humans walk at their natural speed without specific consideration for the robot. In the cases of A*omni and A*omni35, the robot took a path close to the direct diagonal line toward the destination. However, this path sometimes disregards human movement, and the robot collided with a pedestrian at the third frame in Fig. 19. With the proposed method, NLHP, the robot moved toward the downstream side of the pedestrian flow while crossing it. This behavior is caused by the mechanism for extracting the human-following direction. In the case of A*diff in Fig. 17, the robot always tried to stay safe, but it was too slow and still could not sufficiently avoid 'blamed' situations.
The comparisons of the five indices across the compared methods are summarized in Table 4. It was confirmed that NLHP, the proposed method, outperformed A*diff, the primary comparison method. In contrast, as seen in the duration and number-of-disturbances comparisons, A*omni35, the aggressive and quick strategy, performed better. This suggests that there is room for the proposed method to improve its performance while maintaining its low interference with humans.
Although NLHP showed lower blame per time than the other methods on average, no statistically significant differences were found because of the large deviations. This may be partly caused by fluctuations in the measurement of human velocity, because the blame defined in (15) includes the estimated direction of human motion. Therefore, decreasing the noise in human motion estimation is expected to increase the reliability of the blame measurement.

Discussion
The path extraction described in Sect. 3.1 does not utilize human velocity information when setting the robot driving direction because we were unable to measure human velocities with sufficient accuracy in our current experimental system. Additionally, we did not observe significant improvements for methods that did incorporate this information. If the leg and human detectors used in this experiment were improved, and human velocity could be estimated more accurately, we anticipate that using this information would improve the candidate path extractions and subsequently the navigation performance.
Furthermore, in this experimental system, the LiDAR gave the robot a 180 deg frontal range of observation, so contact with humans at the rear and sides could not be avoided. The robot was able to stop reliably when humans were detected at point-blank range, so extending this field of view to avoid all collisions would make the robot and our NLHP method viable for real-world operation.
Although the basic behavior of the robot with the proposed method was sufficient in the experiments, there were many cases where the robot could have acted more appropriately toward humans. The parameters of the potential function defined in Sect. 3.1 were mainly tuned by hand, based on the robot's behavior in simulation. However, these parameters should ideally be tuned automatically through both simulations and real-robot experiments to further improve navigation performance. Optimization techniques (e.g., [21,22]) can be applied for this automatic tuning.

Conclusion and Future Challenges
In this study, we proposed a method that combines a local planner, which predicts routes based on onboard sensor information, with a global planner, which calculates the shortest distance to the goal based on map information, to achieve time-efficient robot navigation in high-density crowded environments. First, candidate paths were extracted from the onboard sensor data, and the position that would be reached after approximately 3 s was predicted for each path candidate. We used a predictor to forecast the deceleration and the impacts of interactions with humans to the left and right. By combining the time cost of reaching the predicted position with the time cost of reaching the goal, obtained from the global planner, we built a route selection system based on global time efficiency. The proposed method was implemented and verified in an environment with a high crowd density of 0.5 humans/m² in a reproducible and controlled way, which allows different methods to be compared under a common condition. These experiments demonstrated that navigation could not only be achieved in this environment, but could be completed in a shorter time and with less interference, while maintaining a similar collision risk, compared with established A*-based obstacle avoidance methods. The high-density environment employed in our experiment reproduced its high density more consistently than those of other related studies. While such conditions might not occur with this frequency in real-world situations, they provide a common problem setting for evaluating navigation methods in crowded environments.
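The route-selection principle summarized above, a local predicted-position cost plus a global remaining cost, can be sketched as follows. The `predictor` and `global_planner` interfaces are assumptions for illustration; they stand in for the paper's learned position predictor and map-based shortest-path planner.

```python
import numpy as np

def select_path(candidates, predictor, global_planner, horizon=3.0):
    """Choose the candidate path minimizing total estimated time to the goal:
    the ~3 s local horizon plus the global planner's remaining travel time from
    the predicted position. `predictor(path)` and `global_planner(position)`
    are assumed interfaces, not the paper's actual implementation."""
    best_path, best_cost = None, np.inf
    for path in candidates:
        predicted_pos = predictor(path)                 # position reached after ~3 s amid the crowd
        cost = horizon + global_planner(predicted_pos)  # local time + remaining time to goal
        if cost < best_cost:
            best_path, best_cost = path, cost
    return best_path
```

A path that ends closer to the goal (in travel time) after the prediction horizon wins, even if it follows pedestrians rather than avoiding them directly.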
In the future, the prediction accuracy should be improved. Specifically, the human behavior reproduced in the simulator must become more similar to true human behavior. We expect that, by changing the input parameters from those representing individual humans to those representing groups, we can enable the prediction of human flows rather than just individual trajectories, thus improving the prediction accuracy. More precise prediction will also allow prediction over a longer time span. Although we did not focus on the behavior of individual pedestrians in the current implementation, this improvement will benefit local interactions between the robot and a pedestrian, such as cooperative collision avoidance [6]. Additionally, an option to stop (wait), which is not implemented in the current method, could further improve the performance. The proposed NLHP method offers the unique advantage of learning from collected data. This must be extended to data collected in the real world [23] and verified. For example, highly accurate human psychological models that account for individual differences [24] could be introduced while continuing to confirm their significance for robot navigation. Finally, while A*-based methods are a reasonable standard for basic comparison, there are other approaches for reciprocal collision avoidance [15,25]. Further comparisons against such navigation methods will be conducted in our future work.
[Fig. 20: (a) A pedestrian's recognition range. (b) Caution zone and danger zone for switching behavior. (c) Pedestrians' flows generated in the simulator by locating gates. Each pedestrian has two feet in the simulator, but the center of each pedestrian is measured as its position by other pedestrians or the robot.]

Appendix: Pedestrian Behavior Model in Simulator
In the simulator, each pedestrian is assumed to have a virtual distance sensor that detects distances to other pedestrians and walls in its neighborhood. The detection range of this sensor is 8 m in the front (−90 deg to 90 deg around the pedestrian's forward direction) and 5 m in all other directions (see Fig. 20a).
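A minimal sketch of this sensor footprint, in the pedestrian's own frame with x pointing forward; the handling of points exactly at ±90 deg is an assumption.

```python
import math

def in_sensor_range(dx, dy):
    """True if a point at (dx, dy) in the pedestrian's frame (x forward) is
    detectable by the virtual distance sensor: 8 m range in the frontal
    half-plane (-90 to 90 deg), 5 m in all other directions."""
    distance = math.hypot(dx, dy)
    frontal = dx >= 0.0  # within -90..90 deg of the forward direction (boundary inclusion assumed)
    return distance <= (8.0 if frontal else 5.0)
```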
A pedestrian's translational velocity v(t) [m/s] and rotational angular velocity ω(t) [rad/s] are determined by

v(t) = min(v_max, v(t − Δt) + aΔt), (16)
ω(t) = αθ_s + βθ_p + γθ_g, (17)

where Δt denotes the control cycle. v_max is an individual parameter assigned inherently to each pedestrian as a uniformly random value within the range [0.8, 1.5] m/s, and the acceleration a is set to 1 m/s². θ_s, θ_p, and θ_g denote the angles from the pedestrian's forward direction to the center of open space, to the target direction, and to a neighboring pedestrian that has the same target, respectively. The last feedback term, γθ_g, makes pedestrians aiming at the same target behave as a group by sharing the same moving direction. The gain parameters α, β, and γ corresponding to these angles are set as indicated in Table 5. There are two zones around a pedestrian that switch its emergent behavior, named the caution zone and the danger zone, defined by radii of 1 m and 0.5 m, respectively (see Fig. 20b). When there is an obstacle in the caution zone, the velocities are reduced, with v_min set to 0.2 m/s; however, when there is open space in front of the pedestrian even though an obstacle is in the caution zone, the velocity is decided as in the free-space case. If there is an obstacle in the danger zone, the velocities are decided by

v(t) = 0, (21)
ω(t) = sign(θ_s) · random(0 ∼ 1), (22)

where sign() denotes the sign function and random() denotes a uniform random value generator within the indicated range. To guide pedestrians along defined routes, gates are located in the environment. Each pedestrian has an assigned sequence of gates to pass. It searches for the next target gate and, when the gate is found, passes through it, guided by the feedback term θ_p in (17) and (19). After passing the target gate, it searches for the next gate. Pedestrians' flow is generated by locating gates sequentially, as shown in the example in Fig. 20c.
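The per-step update above can be sketched as follows. The gain values and the caution-zone deceleration form are placeholders (Table 5 and the caution-zone equations are not reproduced in this text), so this is an illustrative sketch rather than the simulator's exact model.

```python
import numpy as np

def pedestrian_step(v_prev, theta_s, theta_p, theta_g, zone, v_max,
                    dt=0.1, a=1.0, v_min=0.2,
                    alpha=0.4, beta=0.5, gamma=0.1):  # gains are placeholders, not Table 5 values
    """One control-cycle update of a simulated pedestrian's velocities.
    zone: "free", "caution" (obstacle within 1 m), or "danger" (within 0.5 m)."""
    if zone == "danger":   # Eqs. (21)-(22): stop and rotate randomly toward open space
        return 0.0, np.sign(theta_s) * np.random.uniform(0.0, 1.0)
    if zone == "caution":  # assumed form: decelerate toward v_min
        v = max(v_min, v_prev - a * dt)
    else:                  # Eq. (16): accelerate up to the individual v_max
        v = min(v_max, v_prev + a * dt)
    omega = alpha * theta_s + beta * theta_p + gamma * theta_g  # Eq. (17)
    return v, omega
```

Note that (16) caps the speed from above with min(); the open-space, target, and group terms in (17) are simply summed with their respective gains.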
In the sample collection process described in Sect. 3.3, the gates described above are located so that pedestrians generate a chain of flow. The initial positions and destinations of the robot are placed randomly so that it crosses the flow iteratively, as shown in Fig. 21. Once the robot reaches its destination,