Introduction

The widespread diffusion of service robots for diverse applications is making autonomous robots more and more pervasive in our lives1. In the near future, autonomous robots will likely coexist with us and share our living spaces. Application scenarios will be characterized by populated and dynamic environments, where autonomous navigation has to ensure not only the physical safety of human subjects, but also a high degree of social acceptability2. State-of-the-art trajectory planners mostly aim at ensuring the former requisite3,4,5, while seldom tackling the social acceptability issue. Most contemporary autonomous navigation algorithms model humans as inanimate dynamic obstacles rather than social entities interacting with each other through complex and strategized patterns6. The oversimplification of human behavioral traits in the design of navigation algorithms may have severe consequences, such as the emergence of the well-known “freezing robot problem”7.

The freezing robot problem occurs when the environment exceeds a certain level of complexity and the robot is no longer able to manage it because of some deficiencies in the navigation algorithm, for example in the prediction of the human motion model. This context could lead to conditions in which the robot considers all paths unsafe, so it freezes its motion (or makes unnecessary maneuvers) to avoid collision with humans7.

Socially-aware navigation is gaining momentum as a fundamental requirement for the design of social robots, which must adhere to social conventions and provide a friendly and comfortable interaction with humans. Socially-aware navigation combines perception, dynamical system theory, social conventions, human motion modeling, and psychology. Trajectories generated in this context should be predictable, adaptable, easily understandable, and acceptable by humans8. Toward improving trust, comfort, and social acceptance, robots should explicitly consider humans as intelligent agents who interact with and may influence the motion of others9. Recent efforts in socially-aware navigation model humans as static entities10 or as agents driven by very simplistic motion models11. Such simplistic assumptions can hardly cope with the complexity of human behavior and interaction, yielding trajectories that are far from predictable, smooth, and, in turn, acceptable for humans. Models based on learning theory, on the other hand, promise better results12, provided that a large training data set involving human subjects is available, which is not always the case.

Figure 1

Graphical abstract of the procedure. (a) Construction of the game-theoretical model for human motion; (b) creation of the game-theoretical trajectory planner based on the model previously designed, creation of the virtual environment, and evaluation of the performance parameters; (c) creation of the videos with pedestrians and the robot controlled by our game-theoretical trajectory planner; (d) survey questionnaire, data collection and analysis.

Here, we present a socially-aware robot navigation strategy that accurately models human behavior using game theory (see Fig. 1 for a graphical abstract of the procedure). Game theory offers substantial benefits compared to alternative modeling methods, such as reactive strategies13,14,15 and learning schemes16,17,18,19. With respect to the former, game theory is able to perform motion prediction and anticipation of the behavior of other humans, typical of human decision making in social contexts20. Compared to the latter, it overcomes their distinctive lack of explainability, generalization, and the need for large training data sets. Game theory has successfully found applications in robot motion planning, as in Zhang et al.21, where a non-cooperative, zero-sum game is used to coordinate the motion of multiple robots, avoiding obstacles while executing a set of prioritized tasks. Gabler et al.22 propose a game-theoretical framework in which humans and robots collaborate in an industrial assembly scenario; Dragan et al.23 and Nikolaidis et al.24 model the interaction between human and robot as a two-player game and point out how different game assumptions and approximations lead to different robot behaviors.

Our approach uses non-cooperative game theory25 to model the navigation behavior of multiple humans in populated environments, positing that conditions of safe navigation, adherence to social norms, and psychological comfort correspond to a Nash equilibrium in the proposed game-theoretical model. Differently from the previously cited works, our model contemplates more than two players—a feature that is essential to model populated environments. The human motion model informs the design of a robotic trajectory planner, whereby the robot tends to mimic human behavior during motion and interaction in a populated environment.

In this study, we leverage the concept of anthropomorphism, i.e. the intrinsic tendency of humans to attribute intentions and consciousness to non-human entities26. Due to such an attribution, designing robotic trajectories that share some features with human trajectories would reinforce anthropomorphism, enhancing the acceptability by humans27.

Our work marks an important milestone in the field of social robotics. It provides an efficient, socially-aware motion planning framework that encapsulates realistic features of human crowds, remarkably enhancing the social acceptance of the planned trajectories. Namely, we incorporate the human's personal space (i.e. the region around the human in which others cannot intrude without causing discomfort)28, the recognition of human groups29, the sequential decision-making typical of human beings30, and a natural human-obstacle interaction31—features that are often missing in many approaches, including those based on game theory9.

The methodology proposed in this paper is generally applicable to any class of mobile robots. To avoid confounds related to the choice of specific hardware setups and to focus on the assessment of human perception of the robot motion, validation is executed in virtualized environments, where the humanly-populated scene is extrapolated from surveillance videos. Three different experimental conditions are considered: the first involves only human subjects; the second adds a virtualized mobile robot, programmed through the state-of-the-art Enhanced Vector Field Histogram (VFH32) algorithm, moving through the population; the third replaces the VFH algorithm with our game-theoretical approach.

Across the three experimental conditions, we perform a twofold validation of our approach: first, we evaluate performance parameters typical of path planning (path length ratio, path regularity, and distance to the closest pedestrian); then, we analyze the results of a survey questionnaire to directly assess social acceptability by human subjects. To this aim, we administered a variant of the Turing test to a pool of 691 volunteers, who evaluated the human likeness of three sets of videos corresponding to the three experimental conditions explained above. To conceal the appearance of the agents, we masked humans and robots by replacing them with arrows, so that the volunteers could not distinguish between them.

Evidence from our experimental campaign reveals that trajectories generated by our game-theoretical approach exhibit performance parameters that are efficient and closer to those achieved by human subjects than those executed by VFH. Moreover, the outcome of the survey questionnaire highlights the superior acceptability of game-theoretical-generated trajectories with respect to those generated through VFH.

Methods

Figure 1 schematizes the proposed procedure for the realization and validation of our game-theoretical framework for the social acceptability of robotic trajectories. The methodology can be subdivided into four main logical phases, corresponding to the panels in the figure. First, a game-theoretical model of pedestrian motion is devised and its parameters are tuned on the basis of the analysis of human motion videos (panel (a)). Second, a robotic trajectory planner informed by the game-theoretical pedestrian model is realized. The robot is deployed and operated in a virtual humanly-populated environment, where humans execute real trajectories extracted from videos. In this phase, three important performance parameters in robotic trajectory planning (path length ratio, path regularity, and distance to the closest pedestrian) are evaluated and compared with the state-of-the-art VFH algorithm (panel (b)). Third, the virtual environments containing humans and the robot are processed and prepared to be administered for the validation survey questionnaire (panel (c)). Finally, the survey questionnaire is administered and the results are collected and analyzed (panel (d)). In the following, the main components that constitute our methodology are illustrated in detail.

Game-theoretical model

Assumptions

The assumptions supporting our game-theoretical model for human motion are listed in what follows. To improve readability, here and henceforth we will refer to human subjects as agents. This term will be also used for the robot when no distinction between the two categories is required.

All pedestrians are rational agents with common knowledge moving in a 2D populated dynamic environment.

Rational behavior entails that agents only aim to reach their own individual motion goal (i.e. the location to which the agents wish to go). In mathematical terms, this translates into a minimization of an individual cost (equivalently, a maximization of an individual benefit), such as their overall path length33 or energy consumption34. Practically, agents continuously update their navigation behavior while walking in populated environments, based on the observation and possible prediction of the motion of surrounding agents.

The possession of common knowledge by agents in our game-theoretical model implies that all agents have the same knowledge about what actions can be performed to reach their final goal and how other pedestrians behave while walking.

Such an assumption is reasonable when dealing with models of human traits, as individuals commonly learn these skills by experience during everyday life9.

We consider a populated dynamic environment, possibly busy, but not crowded, such as typical streets occupied by pedestrians walking on sidewalks, or populated indoor spaces, such as hotel halls9. We suppose that the environment contains static obstacles, which have to be avoided by agents in a natural manner. Our approach is based on a microscopic modeling strategy, whereby each individual is mapped onto a single software agent, which mimics the individual’s decisions and interactions.

Game description

The proposed model for pedestrian motion is a non-cooperative, static, perfect information, finite, and general-sum game with many players (or agents).

In our model, each agent aims at reaching its own goal individually, but the minimization of its individual cost function does not exclude the possibility of collaborating with other agents, should this also help attain individual goals35.

Our model recognizes as groups those pedestrians that move close to each other while keeping a similar direction of motion. These groups of agents are considered as single players, whereby members of the group share a common strategy and a common motion pattern. This last assumption practically entails that the robot, when avoiding human groups, treats them as compact groups of people that cannot be split to better attain its own navigation goal.

The game is static in the sense that agents move and take decisions simultaneously. It is based on perfect information, that is, each agent knows the current and the previous actions of all agents, e.g. via direct observations.

The game is also finite, i.e., the game has a finite number N of agents belonging to the agent set \(\mathcal {N}\), where each agent \({i \in \mathcal {N}}\) can choose among a finite number of available actions, defined by the action set \(\Theta\), which is supposed to be common to all agents. In particular, we indicate with \(\theta _{i} (t) \in \Theta\) the action executed by agent i at the discrete time t. In our application, the execution of action \(\theta _i(t)\) corresponds to a motion of agent i in the 2D plane at constant velocity v and constant heading \(\theta _i (t)\) over the whole discrete time step \(\Delta t\). We assume that agents have a bounded visibility angle and that the actions in \(\Theta\) uniformly partition such an angle. We denote by \(p_i \in \mathbb {R}^2\) the position of agent i in the 2D environment, with respect to a fixed orthogonal reference frame.

Moreover, the proposed model is a general-sum game, i.e., the sum of all gains and losses of the utility functions over all agents is not necessarily equal to zero.

Similar to20, we postulate that, in such a navigation task, agents tend to reach a Nash equilibrium—the condition in which no agent has an incentive to unilaterally change its own action (or strategy) if the other agents do not change theirs. In other words, a Nash equilibrium occurs when each agent achieves its best response, i.e., its minimum individual cost, given the actions of the other agents. In general, however, existence and uniqueness of a Nash equilibrium are not guaranteed in our setup, and its analytical characterization is almost always impossible to obtain, thus making numerical approaches for an approximate computation necessary. Here, the Nash equilibrium is approximately computed via the sequential best response approach36.

Let us explain the idea of the sequential best response for two agents, A and B: agent A observes the motion of agent B and then solves an optimization problem to determine its own trajectory, given the latest observation of agent B. Afterwards, a check is performed, verifying whether the strategies of both agents are the same as those computed in the previous iteration; in such a case, the game has reached a Nash equilibrium. Otherwise, agent B computes its optimal strategy, given the latest observed strategy of agent A. The procedure is applied iteratively, until the equilibrium condition is met. The same strategy extends identically to N agents.
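The sequential best response iteration can be sketched in code. The following is a minimal toy, not the paper's planner: it assumes absolute headings drawn from a seven-element set, a one-step look-ahead, unit speed, and an illustrative cost that mixes goal distance with a large personal-space penalty.

```python
import math

# Toy sequential best response: each agent in turn picks the heading that
# minimizes its own cost, given the others' latest choices, until no agent
# changes its action (a fixed point, i.e. an approximate Nash equilibrium).
HEADINGS = [-math.pi/2, -math.pi/3, -math.pi/6, 0.0, math.pi/6, math.pi/3, math.pi/2]

def step(pos, heading, v=1.0):
    # Constant-speed, constant-heading kinematic update
    return (pos[0] + v * math.cos(heading), pos[1] + v * math.sin(heading))

def toy_cost(i, headings, positions, goals, beta=0.5):
    p = step(positions[i], headings[i])
    cost = math.dist(p, goals[i])                 # goal-oriented term
    for j, q0 in enumerate(positions):
        if j != i and math.dist(p, step(q0, headings[j])) < beta:
            cost += 1e6                           # personal-space violation
    return cost

def sequential_best_response(positions, goals, max_iters=50):
    headings = [0.0] * len(positions)
    for _ in range(max_iters):
        changed = False
        for i in range(len(positions)):
            best = min(HEADINGS,
                       key=lambda h: toy_cost(i, headings[:i] + [h] + headings[i+1:],
                                              positions, goals))
            if best != headings[i]:
                headings[i] = best
                changed = True
        if not changed:   # no agent deviates: equilibrium condition met
            break
    return headings
```

For two agents walking side by side toward parallel goals, the loop immediately settles on straight headings; with conflicting projected paths, the penalty term forces mutually compatible deviations.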

Our modeling procedure assumes that all the agents in the planar space play the game mentioned above. After the model has been identified, we will use it to control a single, synthetic agent to navigate through the populated environment. Such an agent is called robot player.

Optimization problem

The sequential best response approach in our game-theoretical model for human motion in a populated environment requires the solution of a set of interdependent optimization problems, one for each agent moving in the environment. The goal of the optimization problem for each agent i is to find the best sequence of actions, \(\varvec{\theta} ^{*}_{\varvec{i}} = (\theta _{i}(t), \theta _i(t+\Delta t), \theta _i(t+2 \Delta t), \ldots , \theta _i(t+T \Delta t))\), over a finite prediction horizon \(T \Delta t\), given the actions of the other agents. Without loss of generality and to improve readability, here and henceforth we assume a unitary discrete-time step, i.e., \(\Delta t = 1\).

All agents seek the Nash equilibrium by applying the sequential best response strategy, each solving its own optimization problem on the basis of the observed behavior of the rest of the population. We define the optimization problem for each agent \(i \in \mathcal {N}\) as

$$\begin{aligned} \varvec{\theta} ^{*}_{\varvec{i}} = \underset{\varvec{\theta _{i}}}{\arg \min }&\quad J(\varvec{\theta _{i}}) \end{aligned}$$
(1a)
$$\begin{aligned} \text {s.t.}&\quad \left\| p_i(t, \theta _{i}(t))-p_j(t) \right\| _{2} \ge \beta \quad \forall \, t,\ \forall \, j \in \mathcal {N},\ j \ne i \end{aligned}$$
(1b)
$$\begin{aligned}&\quad p_i(t, \theta _{i}(t)) \notin {\mathcal {O}}_{\text{obs}} \quad \forall \, t \end{aligned}$$
(1c)

with

$$\begin{aligned} p_i(t, \theta _{i}(t)) = p_i(t-1, \theta _{i}(t-1))+ \Delta p (\theta _i(t), v). \end{aligned}$$
(2)

The cost function \(J(\varvec{\theta _{i}})\) in (1a) is defined as

$$\begin{aligned} J(\varvec{\theta _{i}}) = \Phi _{\text{goal}}(\varvec{\theta _{i}}) + \Phi _{\text{smooth}}(\varvec{\theta _{i}}) + \Phi _{\text{obs}}(\varvec{\theta _{i}}), \end{aligned}$$
(3)

where the three summands are defined as follows:

(i) The term \(\Phi _{\text{goal}}(\varvec{\theta _{i}})\) tends to reduce the overall path length for each agent i and, hence, models the goal-oriented attitude of the agent:

$$\begin{aligned} \begin{aligned} \Phi _{\text{goal}}(\varvec{\theta _{i}})=\sum _{t=1}^{T} \gamma (t) \Vert p_i(t, \theta _{i}(t))-p_i^{*} \Vert, \end{aligned} \end{aligned}$$
(4)

with \(\gamma (t)\) being a time-variant weight factor; \(p_i(t, \theta _i(t))\) is the estimated position of agent i at time t, considering a constant speed modulus v and the heading control action \(\theta _i(t)\) applied at time t, computed using the kinematic update Eq. (2); and \(p_i^{*}\) is the estimate of agent i’s goal over the time horizon T. In the absence of an explicit definition of a pedestrian’s goal, we assume that, within the horizon \([t, t+T]\), the goal of agent i lies on a straight line starting in \(p_i(t)\) and oriented along the observed agent heading at time t. Under these assumptions, the practical meaning of the time horizon T is the estimate of the time interval within which a pedestrian sets up and maintains their walking goal.

(ii) The term \(\Phi _{\text{smooth}}(\varvec{\theta _{i}})\) penalizes excessive rotations, thus promoting smooth trajectories. In fact, during navigation, humans tend to avoid too many changes of orientation to minimize their energy consumption34:

$$\begin{aligned} \begin{aligned} \Phi _{\text{smooth}}(\varvec{\theta _{i}})= \sum _{t=1}^{T} ( 1- \gamma (t) ) | \theta _i(t) - \theta _i(t-1) |, \end{aligned} \end{aligned}$$
(5)

where \(\theta _i(t)\), \(\theta _i(t-1)\) are the orientations of the agent at time t and \((t-1)\), respectively. We observe that the term \(\Phi _{\text{smooth}}(\varvec{\theta _{i}})\) is weighted in a complementary fashion to \(\Phi _{\text{goal}}(\varvec{\theta _{i}})\), to satisfy the assumption (further detailed in the Implementation section) on their relative importance as the agent approaches its target.

(iii) The term \(\Phi _{\text{obs}}(\varvec{\theta _{i}})\) tends to optimize the natural interaction with static objects. In fact, humans tend not to walk too close to static obstacles, unless it is necessary. For this reason, we model this behavior as a soft constraint:

$$\begin{aligned} \begin{aligned} \Phi _{\text{obs}}(\varvec{\theta _{i}})= \sum _{t=1}^{T} \frac{\rho }{\Vert p_i(t, \theta _{i}(t))- p_{\text {obs}} \Vert }, \end{aligned} \end{aligned}$$
(6)

where \(\rho\) is a weighting factor and the denominator in (6) is the distance between the agent position \(p_i(t, \theta _{i}(t))\) and the closest static obstacle \(p_{\text {obs}}\) at time t. The exact procedure to compute \(p_{\text {obs}}\) will be explained later. Practically, (6) penalizes small distances between an agent and static obstacles.

The inequality in (1b) is a hard constraint imposing the avoidance of other agents, assuming a circular region around each agent as its personal space28 to be avoided. In this way, agent i is required to maintain at least a minimum distance \(\beta\) from other agents in the observed scenario. Constraint (1c) models the avoidance of static obstacles by imposing that the position \(p_i(t,\theta _{i}(t))\) lies outside the obstacle space \(\mathcal {O}_{\text{obs}}\), defined as a subset, possibly disconnected, of the 2D planar space, occupied by obstacles, where the motion of agents is forbidden.

Equation (2) formalizes the kinematic update of the position of agent i at time t, subject to a heading command \(\theta _i(t)\), at a constant velocity v.
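The cost terms of Eqs. (3)–(6) and the kinematic update of Eq. (2) can be sketched as follows. The \(\gamma(t)\) schedule matches the values given in the Implementation section, while the obstacle weight and the speed modulus are illustrative assumptions.

```python
import math

# Sketch of J(theta_i) = Phi_goal + Phi_smooth + Phi_obs, Eqs. (3)-(6).
GAMMA = [0.6, 0.7, 0.8, 1.0]   # gamma(t) over the horizon T = 4 (Implementation)
RHO = 0.5                      # obstacle weighting factor rho (assumed value)
V = 1.0                        # constant speed modulus v (assumed value)

def rollout(p0, theta_seq):
    """Positions produced by the kinematic update of Eq. (2)."""
    pts, p = [], p0
    for th in theta_seq:
        p = (p[0] + V * math.cos(th), p[1] + V * math.sin(th))
        pts.append(p)
    return pts

def cost(p0, theta_prev, theta_seq, goal, p_obs):
    pts = rollout(p0, theta_seq)
    # Eq. (4): weighted distance to the estimated goal
    phi_goal = sum(g * math.dist(p, goal) for g, p in zip(GAMMA, pts))
    # Eq. (5): complementary-weighted heading changes
    thetas = [theta_prev] + list(theta_seq)
    phi_smooth = sum((1 - g) * abs(thetas[k + 1] - thetas[k])
                     for k, g in enumerate(GAMMA))
    # Eq. (6): soft penalty on proximity to the closest static obstacle
    phi_obs = sum(RHO / math.dist(p, p_obs) for p in pts)
    return phi_goal + phi_smooth + phi_obs
```

With the obstacle far away, a straight heading toward the goal yields a lower cost than a needless turn, reflecting both the goal-oriented and the smoothness terms.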

Validation

The proposed game-theoretical human motion model is validated by conducting a qualitative comparison between generated trajectories and human ones, observed in open-source surveillance videos37,38. These surveillance videos, used to validate the proposed model, show a typical urban scenario in which multiple agents walk interacting with each other and avoiding static obstacles. Figure 2 illustrates randomly selected frames of such surveillance videos in two different scenarios. Specifically, Fig. 2 compares real trajectories executed by humans (Fig. 2a,c) with the estimated trajectories generated for all agents by the proposed model solving our game-theoretical problem (Fig. 2b,d).

Figure 2

Validation of our human motion model based on the game-theoretical approach with open-source surveillance videos38. (a,c) Real human trajectories; (b,d) trajectories output by the game-theoretical model.

We observe that, in both the illustrated scenarios, our game-theoretical approach generates collision-free trajectories (Fig. 2b,d) that are smooth and resemble those executed by their human counterparts. However, we note that the trajectories generated by our algorithm exhibit a sharper reaction than humans in the vicinity of surrounding agents. This is evident when comparing Fig. 2a,b, focusing on the interaction between the green trajectory and the blue one. A comparable circumstance can be observed in Fig. 2c,d, with reference to the yellow trajectory. This phenomenon is most likely caused by the discrete action set associated with each agent. Notably, in our implementation an agent can choose one out of seven possible headings inside its own visibility zone, resulting in a resolution of \({\pi }/{6}~\text{rad}\), in an attempt to minimize the corresponding cost function. Human subjects, on the other hand, can select their heading from a continuum.

A further cause of discrepancy between human and game-theoretical trajectories resides in the kinematic update of the agent position in Eq. (2)—a linear update with constant heading and velocity over the whole sampling step—and in the estimation of the human target, assumed to be constant over an interval of duration T, whereas it is actually unknown and subject to the inherently stochastic nature of human behavior.

Algorithm

The game-theoretical model of pedestrian motion described above is used to inform a robotic trajectory planner for autonomous robots moving in populated environments.

[Algorithm 1 and Algorithm 2: pseudocode figures]

The main steps executed by the proposed trajectory planner are described in Algorithm 1. First, the robot position (\(p_{\text{robot}}\)) is initialized using the function \(\text{InitializeRobotPosition}\). Then, the algorithm executes an iterative procedure that stops when the robot reaches its target position (\(p_{\text{goal}}\)). Here, we will refer to both humans and the robot with the term “agent”. Each iteration performs five main steps: recognition of groups of humans (\(\text{GroupRecognition}\)), first estimation of trajectories for all agents (\(\text{FirstEstimation}\)), collision checking between agents and with obstacles (\(\text{CheckCollision}\)), computation of the agent trajectory (\(\text{ComputeSolution}\)), and update of the robot position using the computed trajectory (\(\text{UpdateRobotPosition}\)). This iterative procedure predicts the agents’ motion and generates the optimal robot trajectory over the fixed time horizon T, by applying the strategy detailed below. After such an optimal trajectory for the robot is computed, only the action corresponding to the first time step is actually applied to the robot and the process is repeated until the robot reaches its goal.

In the following, each step of Algorithm 1 is detailed:

  • \(\text{GroupRecognition}\). The algorithm performs the group recognition of agents, considering the observed orientation of each agent and the distances between them. In fact, a group typically moves while maintaining a common orientation and keeping inter-agent distances shorter than the typical personal space of a single agent. Upon recognition, groups are considered as unique entities and treated as single agents in the subsequent phases.

  • \(\text{FirstEstimation}\). A preliminary estimation of all agents’ trajectories (i.e., \(\varvec{\theta}\)) is performed, projecting hypothetical rectilinear trajectories over the interval T.

  • \(\text{CheckCollision}\). Given the trajectories of all agents (\(\varvec\theta\)), previously estimated by \(\text{FirstEstimation}\), the \(\text{CheckCollision}\) function detects the possible occurrence of collisions of an agent i with obstacles and with other agents, activating the flag variables \(C_{\text{obs}}\) and \(C_{\text{agents}}\), respectively. In particular, we consider a collision to occur each time the individual personal space of an agent is violated.

  • \(\text{ComputeSolution}\). Considering the estimated trajectories (\(\varvec\theta\)), and the flags \(C_{\text{obs}}\) and \(C_{\text{agents}}\), Algorithm 2 computes a solution of the motion planning problem for an agent i selecting one of the possible following cases:

    (i) if a collision with other agents is envisaged, two alternative solutions are evaluated, and the one with the lowest cost according to Eq. (3) is selected. The first solution (\(\varvec{\theta _i}^\text{gt}\)) is computed using the strategy defined in Algorithm 3, where trajectories are generated seeking a Nash equilibrium of the game presented in the Game description section. The second solution is computed through the \(\text{Decelerate}\) function, which evaluates the opportunity to decelerate—a typical human behavioral trait in navigation—to avoid the collision with other agents. In particular, after identifying the discrete time step t at which a collision between agent i and other agents is envisaged to occur, the cost associated with sixteen different deceleration patterns is evaluated using the cost function (3), provided that the constraints in Eqs. (1b) and (1c) are satisfied;

    (ii) if an agent is envisaged to collide with a static obstacle (\(C_{\text{obs}}\)), the agent solves its individual optimization problem described above (without playing the game and, hence, without seeking the Nash equilibrium);

    (iii) if no collision with agents or static obstacles is envisaged, trajectories are maintained as straight lines, keeping the current heading and a constant velocity, practically implementing what was already computed in the \(\text{FirstEstimation}\) procedure.

  • \(\text{UpdateRobotPosition}\). Considering the computed trajectory of the robot, the action corresponding to the first time step is executed and the robot position is updated using Eq. (2).
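The first three steps of the planner loop can be sketched as follows. The greedy clustering rule, the thresholds, and the constant speed are illustrative assumptions, not the paper's exact implementations.

```python
import math

V = 1.0  # constant speed modulus (assumed)

def group_recognition(positions, headings, d_max=1.0, a_max=math.pi/8):
    """Greedy sketch of GroupRecognition: agents closer than d_max to a
    group seed and with a similar heading are merged into one group,
    later treated as a single player. Thresholds are assumptions."""
    groups = []
    for i, (p, h) in enumerate(zip(positions, headings)):
        for g in groups:
            j = g[0]  # compare against the group seed
            if math.dist(p, positions[j]) < d_max and abs(h - headings[j]) < a_max:
                g.append(i)
                break
        else:
            groups.append([i])
    return groups

def first_estimation(positions, headings, T=4):
    """FirstEstimation: rectilinear projection of every agent over T steps."""
    return [[(p[0] + V * t * math.cos(h), p[1] + V * t * math.sin(h))
             for t in range(1, T + 1)]
            for p, h in zip(positions, headings)]

def check_collision(trajs, beta=0.5):
    """CheckCollision: True if any pair of projected trajectories brings
    two agents closer than the personal-space radius beta."""
    for i in range(len(trajs)):
        for j in range(i + 1, len(trajs)):
            if any(math.dist(a, b) < beta for a, b in zip(trajs[i], trajs[j])):
                return True
    return False
```

Two nearby agents with aligned headings are merged into one group, while head-on rectilinear projections trigger the collision flag and, in turn, the \(\text{ComputeSolution}\) step.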

[Algorithm 3: pseudocode figure]

Implementation

The algorithm presented above has been implemented in Matlab and the main implementation features are discussed in what follows.

The discrete time step has been set to \(\Delta t = 1.2~\text{s}\). The time horizon for optimization has been set to \(T = 4\), that is, 4.8 s. Please note that in the main paper we assumed a unitary discrete time step to enhance readability.

As previously stated, each agent can execute actions taken from an action set \(\Theta\) of finite size. Specifically, in our implementation, each agent has seven possible actions for \(\theta _i(t)\), which represent possible relative headings to follow within the agent visibility zone. Namely, \(\theta _i(t)\) is updated as \(\theta _i(t) = \theta _i(t-1) + u(t-1)\), where \(u(t-1)\) takes values in the finite set \(\Theta =\{ -{\pi }/{2}, -{\pi }/{3}, -{\pi }/{6}, 0, {\pi }/{6}, {\pi }/{3}, {\pi }/{2} \}~\text{rad}\). We remark that we limited the cardinality of \(\Theta\) to seven, pursuing a trade-off between a satisfactory performance and a reasonable computational complexity of the algorithm.
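The heading update and the search space it induces can be sketched as follows; this is a direct transcription of the update rule above, with the exhaustive-enumeration helper added only to illustrate the complexity trade-off.

```python
import math
from itertools import product

# Heading update theta_i(t) = theta_i(t-1) + u(t-1), with u drawn from
# the seven-element increment set quoted in the text.
U_SET = [-math.pi/2, -math.pi/3, -math.pi/6, 0.0, math.pi/6, math.pi/3, math.pi/2]

def next_headings(theta_prev):
    """All headings reachable in one time step from theta_prev."""
    return [theta_prev + u for u in U_SET]

def candidate_sequences(T=4):
    """All action sequences over the horizon: |Theta|**T candidates, which
    motivates keeping the cardinality of the action set small."""
    return list(product(U_SET, repeat=T))
```

With seven actions and \(T = 4\), an exhaustive search already faces \(7^4 = 2401\) candidate sequences per agent at every replanning step, which illustrates why the cardinality of \(\Theta\) was limited.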

In Eq. (1b), the \(\beta\) parameter is set considering the Hall convention28 that posits the existence of a personal space of circular shape that ensures comfort conditions for human navigation. The value of \(\beta\) has been estimated through the analysis of the open-source surveillance videos37,38.

In Eq. (3), the term \(\Phi _{\text{obs}}(\varvec{\theta _{i}})\) can be neglected if the first estimation of the agent trajectory does not intersect any static obstacle. Otherwise, \(\Phi _{\text{obs}}(\varvec{\theta _{i}})\) in Eq. (6) is computed with reference to the closest obstacle, toward which the agent is projected to collide. Then, the point of such an obstacle closest to the agent position is computed (\(p_{\text {obs}}\)). To reduce the computational load, obstacles are mapped into a discrete spatial map overlapping with the 2D environment. The map consists of a rectangular matrix of \(576 \times 720\) cells, each marked as being occupied by an obstacle or free. Each cell covers approximately a square of \(1.8 \times 1.8\) cm.
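An occupancy lookup on such a grid can be sketched as follows. The world-to-cell mapping, the map origin, and the treatment of out-of-map queries are assumptions for illustration only.

```python
# Sketch of the discrete obstacle map: a 576 x 720 boolean matrix in which
# each cell covers roughly 1.8 cm x 1.8 cm. The world-to-cell mapping and
# the out-of-map convention below are assumptions.
CELL = 0.018          # cell edge in metres (~1.8 cm)
ROWS, COLS = 576, 720

def to_cell(p):
    """World position (x, y) in metres -> (row, col) grid index."""
    return int(p[1] / CELL), int(p[0] / CELL)

def is_occupied(grid, p):
    """Occupancy test used when evaluating constraint (1c)."""
    row, col = to_cell(p)
    if 0 <= row < ROWS and 0 <= col < COLS:
        return grid[row][col]
    return True  # positions outside the map are treated as forbidden
```

Discretizing the obstacle space turns the membership test \(p_i \notin \mathcal{O}_{\text{obs}}\) into a constant-time array lookup, which is where the computational saving comes from.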

The weight \(\gamma (t)\) in Eqs. (4) and (5) is a time-varying term used to balance the relative importance of the terms \(\Phi _{\text{goal}}(\varvec{\theta }_i)\) and \(\Phi _{\text{smooth}}(\varvec{\theta }_i)\) over the optimization horizon T. This choice emerges from the analysis of the available surveillance videos, where we observed that the minimization of the distance to the goal typically prevails over the smoothness requirement as the agent gets closer to its goal, and vice versa. Considering \(T=4\) time steps, we chose the following sequence for \(\gamma (t)\), starting from a generic time instant \(t^*\): \(\gamma (t^*)=0.6\), \(\gamma (t^*+1)=0.7\), \(\gamma (t^*+2)=0.8\), \(\gamma (t^*+3)=1.0\).

Trajectories generation for performance parameters

We designed trajectories for a preliminary assessment through performance parameters evaluated in three experimental conditions, which differ in the algorithm governing the motion of a selected agent (i.e., either a robot or a human being): in the humans only (HO) condition, all the agents were human beings moving in a real environment; in the humans and GT (GT) condition, one of the agents was controlled by our game-theoretical algorithm, while the other agents were human beings; and in the humans and VFH (VFH) condition, one of the agents was controlled by the VFH algorithm32, while the others were human beings. Each experimental condition comprises seven different experiments (i.e. seven different trajectories), differing in the start and goal chosen for the selected agent, the number of human subjects involved in the interaction, and their motion patterns.

The virtualized environment is constructed by processing movies collected from surveillance cameras of populated environments37, obtaining a 2D arena where virtual agents reproduce the human motion captured in the movies. In the HO condition, the performance parameters are evaluated in the original arena, with reference to a randomly selected human being. In the GT and VFH conditions, a virtual agent is introduced in the arena and commanded to use the given trajectory planner (GT or VFH) to navigate through the existing virtual agents, corresponding to human beings.

Survey questionnaire, a-priori power analysis

Survey questionnaire

The proposed methodology is validated using a variation of the Turing test39, which evaluates whether the robot behavior, controlled by the game-theoretical method, is comparable to or indistinguishable from human navigation patterns.

The variation of the Turing test consists of an online survey questionnaire composed of three main parts: (i) in the first part, the participants underwent a training phase to familiarize themselves with the working environment (see Fig. 3a,b); (ii) in the second part, the participants watched 21 videos reproducing the seven experiments for each of the three experimental conditions, where both the background and the agents are concealed—blue arrows over a gray background (Fig. 3c illustrates a frame of a single experiment); (iii) in the third and final part, the participants watched the same 21 videos (in a different random order), where they were additionally asked to focus on a circled arrow (Fig. 3d illustrates an example of a frame of a single experiment). The participants were unaware that the circled arrow targeted a random human agent in the HO experimental condition and the robotic agent in the GT and VFH experimental conditions. We remark that the seven experiments used for the survey questionnaire are identical to those used to evaluate the performance parameters computed in the previous section.

Figure 3
figure 3

Overview of the survey questionnaire interface. (a) Training part with open-source surveillance video38; (b) Training part, intermediate scenario; (c) Second part of the survey questionnaire, i.e. recognition of the motion of a “weird” arrow in the videos, if any; (d) Third part of the survey questionnaire, i.e. focusing on the circled arrow.

The execution of each part entails answering specific questions. In the first part of the survey questionnaire, the participants were required to provide their gender, age, and level of experience in the robotics field on a Likert scale40 from 1 (no experience) to 5 (expert).

During the training part, the participant is guided from a typical urban scenario of Fig. 3a to the particular scenario used in the other parts of the test illustrated in Fig. 3c. The intermediate scenario of Fig. 3b is designed to gradually guide the participant to the final set-up.

In the testing scenario of Fig. 3c, agents (pedestrians and robot) have been replaced with arrows and the urban environment has been removed to prevent the participant from focusing on the scenario, rather than on the movement of agents.

In the second part, the participant watches 21 videos in random order (about 15 seconds each), spanning the three experimental conditions: 7 videos show an environment with only pedestrians (HO); another 7 show a scenario with pedestrians and a robot controlled by a state-of-the-art algorithm (the Enhanced Vector Field Histogram32) (VFH); and the remaining 7 show a scenario with pedestrians and a robot controlled by the proposed algorithm (GT). In both the GT and VFH conditions, robot trajectories are re-planned with a frequency of 2 Hz.

To assess the level of social acceptance of our game-theoretical trajectories, in the second part (following habituation), we asked the participants to state whether they perceived any “weirdness” in the motion observed in the videos, and then to indicate which arrow, if any, they perceived as “weird”, as shown in Fig. 3c.

In the third part, participants were requested to determine whether the circled arrow is a human or not. Then, participants were asked to rate the naturalness of the motion of the circled arrow on a Likert scale40 defined in a range from 1 (completely unnatural) to 5 (completely natural).

All videos used in the survey questionnaire are generated from an open-access dataset37.

The test takes about 20 minutes to complete. The test has three rules: (i) the participant cannot pause the videos; (ii) the participant can watch each video only once; (iii) the participant should complete the test without interruptions or distractions.

A-priori power analysis

As a preliminary step, we conducted an a-priori power analysis to estimate the number of participants required to provide acceptable and significant statistical results41. To this aim, we used the free software G*Power42. First, we framed our analysis as a non-parametric study, since non-parametric statistical tests impose no constraints or prerequisites on the data distributions43. Then, we assumed that the data collected after the a-priori study would be analyzed via the non-parametric Kruskal–Wallis test, because our independent variables pertain to more than two independent groups (HO, GT, and VFH) and our dependent variables (the ratings of weirdness of motion, human-likeness, and naturalness of movement) are ordinal.

Following41, we computed the total sample size considering the ANOVA test44, i.e., the parametric counterpart of the Kruskal–Wallis test, and then multiplied the result by the corrective factor given by the asymptotic relative efficiency (ARE), obtaining the equivalent sample size for the non-parametric Kruskal–Wallis test. The a-priori analysis for our non-parametric test yielded approximately 152 volunteers, considering an alpha level equal to \(5\%\), a study power of \(80\%\), and three groups, corresponding to the three experimental conditions. We recruited participants using the institutional email of Politecnico di Torino, distributing the online survey questionnaire to students and university staff. Ultimately, we collected 691 responses, largely exceeding the required sample size of 152.
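Conceptually, this two-step computation can be sketched as follows. The effect size (\(f=0.25\), a conventional medium effect) is our assumption for illustration, since the value entered in G*Power is not reported, and `scipy` stands in for G*Power here:

```python
import math
from scipy.stats import f as f_dist, ncf

def anova_total_n(effect_f, alpha, power, k_groups):
    """Smallest total sample size reaching the target power in a one-way ANOVA."""
    n = k_groups + 1
    while True:
        df1, df2 = k_groups - 1, n - k_groups
        f_crit = f_dist.ppf(1.0 - alpha, df1, df2)
        lam = effect_f ** 2 * n  # noncentrality parameter (G*Power convention)
        if 1.0 - ncf.cdf(f_crit, df1, df2, lam) >= power:
            return n
        n += 1

# hypothetical medium effect size f = 0.25 (the value used in the study is not reported)
n_anova = anova_total_n(effect_f=0.25, alpha=0.05, power=0.80, k_groups=3)
# correct for the Kruskal-Wallis test via the asymptotic relative efficiency
# (ARE = 3/pi under normality), i.e. inflate the ANOVA sample size by pi/3
n_kw = math.ceil(n_anova * math.pi / 3)
```

The loop searches for the smallest total sample size whose achieved ANOVA power meets the target; the ARE correction then inflates it for the less efficient non-parametric test.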

Statistical analysis

Experimental data (both the generated robotic trajectories and the responses to the survey questionnaire) were preliminarily assessed for normality of distribution and homoscedasticity of variance (Levene’s test). These analyses revealed that the data violated the assumptions of parametric statistics. Thus, we adopted a non-parametric test, i.e., the Kruskal–Wallis test45.

First, the quality of the trajectories generated by the two algorithms and observed in the HO condition was evaluated. We addressed whether they differed in terms of variability of path length ratio, path regularity, and distance to the closest pedestrian through Levene’s test46. The significance level was set at \(p<0.05\)45 for all statistical tests performed in this study, and pairwise post-hoc comparisons were conducted, adopting a Bonferroni correction, when appropriate. Following these preliminary analyses, trajectory data were analysed through the non-parametric Kruskal–Wallis test.

Survey questionnaire data were also analysed through the Kruskal–Wallis test followed by Bonferroni post-hoc analyses47. These analyses were aimed at assessing whether participants exhibited a differential appraisal of the different trajectories in terms of weirdness, human-likeness, and naturalness. This statistical approach was adopted for all the questions in the second and third parts of the survey questionnaire, except for the second question of the second part, in which participants were asked to indicate the perceived “weird” arrow, if any. We posit that more weirdness should be perceived in agents driven by algorithms than in agents associated with human beings. For this reason, the answers given for the HO scenario were not considered, since all arrows corresponded to human beings and an indication of weirdness would be meaningless for our research question. As a consequence, in this specific instance, only two experimental conditions had to be compared (GT and VFH) and, to this aim, we used the Mann-Whitney test48,49.
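A minimal sketch of this pipeline, using `scipy` on synthetic placeholder scores (the actual data and statistical software are not reproduced here):

```python
import numpy as np
from scipy.stats import levene, kruskal, mannwhitneyu

rng = np.random.default_rng(0)
# hypothetical per-experiment scores for the three conditions (7 experiments each)
ho = rng.normal(0.90, 0.02, 7)
gt = rng.normal(0.88, 0.02, 7)
vfh = rng.normal(0.80, 0.06, 7)

# homogeneity of variances across conditions (Levene's test)
W, p_lev = levene(ho, gt, vfh)

# omnibus non-parametric comparison (Kruskal-Wallis)
H, p_kw = kruskal(ho, gt, vfh)

# pairwise post-hoc tests with a Bonferroni correction (3 comparisons)
pairs = {("HO", "GT"): (ho, gt), ("HO", "VFH"): (ho, vfh), ("GT", "VFH"): (gt, vfh)}
posthoc = {k: min(1.0, 3 * mannwhitneyu(a, b).pvalue) for k, (a, b) in pairs.items()}
```

The Bonferroni correction simply multiplies each pairwise p-value by the number of comparisons, capping the result at 1.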

Results

In this section, the results of the analysis conducted on the trajectories of the 21 experiments (seven experiments for each of the three experimental conditions) are presented. Then, the results of the survey questionnaire are illustrated and commented.

Analysis of performance parameters

Three widely adopted parameters, deemed as important for socially navigating robots, were evaluated across the three experimental conditions: the Path Length Ratio (PLR), the Path Regularity (PR), and the Closest Pedestrian Distance (CPD)50.

The PLR is defined as the ratio between the length of the line-of-sight path between the initial and final point of a path and the actual path length between the same two points50. A higher path length ratio is usually preferred, since it indicates that an agent minimizes the length of the path to reach its goal. We computed the PLR for each experiment and we illustrate its average values across the three experimental conditions in Fig. 4a. The results in Fig. 4a suggest that the HO scenario was characterized by the highest average PLR, followed by GT and VFH.
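A minimal sketch of the PLR computation over a sequence of 2D positions (the naming is ours, not from the cited implementation):

```python
import numpy as np

def path_length_ratio(traj):
    """PLR: line-of-sight start-goal distance divided by the actual path length."""
    traj = np.asarray(traj, dtype=float)                      # (T, 2) positions
    line_of_sight = np.linalg.norm(traj[-1] - traj[0])
    actual = np.linalg.norm(np.diff(traj, axis=0), axis=1).sum()
    return line_of_sight / actual

# a straight path gives PLR = 1; any detour lowers the ratio
plr_straight = path_length_ratio([(0, 0), (1, 0), (2, 0)])   # 1.0
plr_detour = path_length_ratio([(0, 0), (1, 1), (2, 0)])     # < 1.0
```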

Figure 4
figure 4

Mean value and standard deviation for each experimental condition considering each performance parameter. The three experimental conditions are: HO—video with humans only; GT—video with humans and a robot driven by our game-theoretical trajectory planner; VFH—video with humans and a robot driven by the Enhanced Vector Field Histogram algorithm. The performance parameters are: (a) PLR (Path Length Ratio)—the average ratio of the line-of-sight distance between start and goal to the length of the real agent’s trajectory; (b) PR (Path Regularity)—the average regularity of the agent’s trajectory; (c) CPD (Closest Pedestrian Distance)—the average distance between the agent and the closest pedestrian.

The PR quantifies to what extent a path is similar to a straight line50. Upon normalization, \(PR=1\) corresponds to a straight path from start to goal. Values of PR closer to one are preferable, since they are indicative of a smoother motion, without excessive changes of direction. In Fig. 4b, the average PR for each experimental condition is illustrated, where the highest average value pertains to HO, followed by GT and VFH. These results appear in line with the tenet that humans tend to minimize their energy, thus avoiding sudden changes of orientation, and with the design principle of the VFH algorithm, which avoids obstacles only when the agent is close to them32, entailing swift changes of orientation to get away from them.
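The exact formula from ref.50 is not reproduced here; the following sketch implements an illustrative proxy consistent with the description, scoring a straight path exactly 1 and penalizing heading changes:

```python
import numpy as np

def path_regularity(traj):
    # illustrative proxy: 1 minus the mean absolute heading change,
    # normalised by pi, so a straight path scores exactly 1
    traj = np.asarray(traj, dtype=float)
    seg = np.diff(traj, axis=0)                      # displacement vectors
    headings = np.arctan2(seg[:, 1], seg[:, 0])
    # wrap heading differences to [0, pi] before averaging
    turns = np.abs(np.angle(np.exp(1j * np.diff(headings))))
    return 1.0 if turns.size == 0 else 1.0 - turns.mean() / np.pi

pr_straight = path_regularity([(0, 0), (1, 0), (2, 0)])        # 1.0
pr_zigzag = path_regularity([(0, 0), (1, 0), (1, 1), (2, 1)])  # 0.5
```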

The CPD is defined as the distance from the closest pedestrian, normalized with respect to the maximum length measurable during the experiments, i.e., the diagonal of the experimental arena. Also for this parameter, values closer to one are desirable, as they imply a tendency to stay clear of humans when following planned trajectories. Average values of CPD in the three experimental conditions are illustrated in Fig. 4c, where the highest average value is related to GT, followed by HO and VFH. The low ranking of VFH is presumably due to the purely reactive design of the VFH algorithm. We posit that the intermediate ranking of HO with respect to CPD is due to the ability of humans to evaluate situations on a case-by-case basis.
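A minimal sketch of the CPD computation, assuming synchronized 2D position samples for the agent and the pedestrians (the array shapes are our convention):

```python
import numpy as np

def closest_pedestrian_distance(agent_traj, ped_trajs, arena_diagonal):
    """Average distance to the nearest pedestrian over the trajectory,
    normalised by the arena diagonal (longest measurable distance)."""
    agent = np.asarray(agent_traj, dtype=float)          # (T, 2)
    peds = np.asarray(ped_trajs, dtype=float)            # (P, T, 2)
    dists = np.linalg.norm(peds - agent[None], axis=2)   # (P, T) agent-ped distances
    return dists.min(axis=0).mean() / arena_diagonal

agent = [(0, 0), (0, 0)]
peds = [[(1, 0), (1, 0)],        # stays at distance 1
        [(3, 0), (3, 0)]]        # stays at distance 3
cpd = closest_pedestrian_distance(agent, peds, arena_diagonal=10.0)  # 0.1
```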

While the rankings described above are suggestive of superior performance parameters attained by GT over VFH, the verification of the statistical significance of these comparisons is in order.

To preliminarily evaluate the quality of the trajectories generated by the two algorithms and the HO, we first addressed whether they differed in terms of inter-experiment variability of the three performance parameters through the Levene’s test46.

Hypothesis 1

(\(H_{0}\)) The variance of the three performance parameters (PLR, PR, CPD) is statistically indistinguishable when computed over the three experimental conditions (HO, VFH, GT).

We evaluated the extent to which each algorithm generated trajectories that were similar to one another. Our analysis revealed a significant differential variability with respect to PR (\(F_{2,18}=3.75\), \(p = 0.043\)). Thus, we performed a post-hoc analysis that revealed much more variability in the VFH videos compared to HO and GT (VFH vs. HO: \(p = 0.038\); VFH vs. GT: \(p = 0.040\); HO vs. GT: \(p = 0.97\)); similarly, albeit not statistically significant, we observed a trend toward increased variability in the VFH experiments with respect to PLR (\(F_{2,18}=3.22\), \(p = 0.064\)). Finally, the inter-experiment variability within each experimental condition was indistinguishable concerning CPD (\(F_{2,18}=2.31\), \(p = 0.130\)). These results indicate that, albeit indistinguishable in absolute values, the experimental conditions differed in reproducibility and predictability: in terms of PR and PLR, HO and GT were much more consistent than the VFH scenario.

With Levene’s test described above, we have shown not only that the variances of the three experimental conditions in PLR and CPD are equal, but also that, for these two parameters, the assumptions for executing the Kruskal–Wallis test are satisfied. Thus, in line with this consideration, the null hypothesis for the Kruskal–Wallis test is defined as follows:

Hypothesis 2

(\(H_{0}\)) The two performance parameters (PLR, CPD) computed over the three experimental conditions (HO, VFH, GT) are statistically indistinguishable across experimental conditions.

To this aim, the Kruskal–Wallis analysis51 was executed on the two performance parameters, revealing the absence of statistically significant differences (\(\chi ^{2}=2.5\), \(p=0.286\) for PLR; \(\chi ^{2}=0.36\), \(p=0.834\) for CPD). Such observations are likely related to the consideration of only seven experiments for each experimental condition, with differing degrees of variability, and thus characterized by a limited statistical power.

For completeness, we executed an a-posteriori power analysis, which confirmed that, with only seven experiments per group, the statistical power is 6%, i.e., very limited.

Survey questionnaire

We collected 691 responses to the survey questionnaire; participants were mostly in their thirties, with very little experience in robotics (Table 1). The gender composition was slightly unbalanced toward men. The age range of our sample is from 18 to 78 years old.

Table 1 Demographic characteristics and experience with robotics on a scale from 1 (minimum experience) to 5 (maximum experience) collected during the first part of the test.

A power analysis41 indicated that adequate statistical power was guaranteed with 152 participants. Since the number of participants largely exceeded the required sample size, we opted for a bootstrapping approach52, in which we randomly sampled 152 observations from the complete pool of responses and iterated this process 100 times. Adopting this procedure, we kept the sample size at the appropriate number (thus reducing the odds of obtaining biologically irrelevant findings53) and increased the generalizability of our findings by testing their robustness against repeated observations.
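The subsampling procedure can be sketched as follows; the ratings are synthetic placeholders, and sampling without replacement mirrors the description of drawing 152 observations from the pool of 691 (classical bootstrapping would sample with replacement):

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(1)
# hypothetical ordinal ratings (1-5) from all 691 respondents, per condition
responses = {c: rng.integers(1, 6, size=691) for c in ("HO", "GT", "VFH")}

n_iter, sample_size = 100, 152
pvals = []
for _ in range(n_iter):
    # draw the same 152 respondents across conditions (each rated all videos)
    idx = rng.choice(691, size=sample_size, replace=False)
    groups = [responses[c][idx] for c in ("HO", "GT", "VFH")]
    pvals.append(kruskal(*groups).pvalue)
```

Repeating the omnibus test over 100 subsamples checks that the conclusions are robust to which 152 respondents happen to be drawn.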

Experimental outcomes were analyzed with the Kruskal–Wallis test to statistically reject the \(H_{0}\) hypothesis and understand if there exist differences among experimental conditions.

Our null hypothesis posits:

Hypothesis 3

(\(H_{0}\)) All experimental conditions (HO, VFH, GT) are perceived by participants as indistinguishable.

In the analysis of the results of the second part, in accordance with our expectations, the VFH condition was characterized by the highest level of weirdness compared to the HO and GT conditions, which were in turn indistinguishable from one another (Kruskal–Wallis test: \(\chi ^{2}=107\pm 13.5\) and \(p<10^{-17}\) for all bootstrapping iterations; post-hoc analysis: \(p<10^{-10}\) for HO-VFH and \(p<10^{-14}\) for GT-VFH for all bootstrapping iterations; \(p>0.05\) for GT-HO in 88 bootstrapping iterations out of 100, with the remaining iterations yielding \(p>0.01\)).

Figure 5
figure 5

Summary of the post-hoc Kruskal–Wallis test of the survey questionnaire. The mean rank of each group is plotted for each part of the test with the corresponding standard deviation considering all bootstrapping iterations. (a) Second part of the test in which the attention of the participant is not focused on one arrow in particular. WM: weirdness motion. (b), (c) Third part of the test in which the participant is focused on the circled arrow. HLCA: human-likeness of the circled arrow; NCA: naturalness of the circled arrow. HO: video with humans only; GT: video with humans and a robot driven by our game-theoretical trajectory planner; VFH: video with humans and a robot driven by the Enhanced Vector Field Histogram algorithm. The blue asterisk highlights the statistical difference from HO, while the red diamond highlights the statistical difference from GT.

Figure 5a illustrates the mean rank (in light of the bootstrapping procedure) in “weirdness” of motion (WM) along with its standard deviation.

Figure 6
figure 6

Results of the second part of the survey questionnaire, considering the participants who perceived a “weird” motion in the three experimental conditions and recognized the correct arrow (the robot) in the populated environment (GT rec., VFH rec.). HO: video with humans only; GT: video with humans and a robot driven by our game-theoretical trajectory planner; VFH: video with humans and a robot driven by the Enhanced Vector Field Histogram algorithm.

Notably, GT and HO are indistinguishable from one another, while VFH is significantly different from both. Specifically, while VFH was considered “weird” in the majority of instances (61%), GT was considered “weird” slightly less often than HO videos (33% and 37%, respectively) (see Fig. 6).

We then asked the participants who detected weirdness in the videos to indicate which of the arrows exhibited such weirdness. We posit that more weirdness should be perceived in agents driven by algorithms than in agents associated with human beings. Our experiments indicated that the agent judged as weird was actually associated with a robot in only 16% of the GT videos, while this proportion drastically increased to 47% in VFH (see Fig. 6, patterned bars). This finding, combined with the Mann-Whitney test (\(U=4\times 10^{3} \pm 489\), \(p<10^{-20}\) for all bootstrapping iterations), supports the view that the trajectories generated by GT are perceived as much more natural than those generated by VFH. Additionally, it suggests that the motion of the robot controlled by GT is perceived as more human-like than the one generated by VFH.

In the third part, we further delved into the subjective rating of the three motion patterns by asking participants to focus on the motion of a circled target agent and evaluate whether such motion corresponds to a human or not (human likeness), along with its degree of naturalness on a Likert scale from one (minimum naturalness) to five (maximum naturalness). When focusing on the qualitative measurements of the human likeness, we observed that VFH-related arrows were considered much less human-like (41.11%) than both GT (64.59%) and HO (80.31%). Thus, as illustrated in Fig. 5b, VFH is judged as the least human-like ( \(\chi ^{2}=142.55\pm 15.12\), \(p<10^{-22}\) for all Kruskal–Wallis bootstrapping iterations; post-hoc analysis: \(p<10^{-5}\) VFH-GT, \(p<10^{-30}\) VFH-HO) which is consistent with the previous part of the test, where VFH is perceived as generating the “weirdest” motion. Additionally, GT-related arrows were considered significantly less human-like compared to HO (\(p<10^{-4}\) post-hoc analysis GT-HO).

Figures 5c and 7 illustrate the results related to the naturalness of the circled arrow. Figure 7 shows the result about the average naturalness of motion of the circled arrow on a Likert scale from 1 (minimum naturalness) to 5 (maximum naturalness), computed over the 100 iterations of the bootstrapping procedure.

In accordance with our expectations, HO exhibits the highest mean degree of naturalness (4) with a standard deviation of 0.04, closely followed by GT (3.5) with a standard deviation of 0.04, whereas a larger gap separates VFH (2.6) with a standard deviation of 0.05.

Importantly, although significantly different from HO, GT values exceeded three. This may indirectly suggest that, while HO videos were deemed natural, GT videos may also have been regarded as human-like. Yet, this proposition remains speculative, since the intermediate value (three) was not marked with the anchor "natural". Therefore, future studies are needed to precisely detail the individual appraisal of the naturalness of the GT trajectory.

Figure 7
figure 7

Results of the third part of the survey questionnaire. The participant assigns a degree of naturalness on a Likert scale from 1 (minimum naturalness) to 5 (maximum naturalness) considering the circled arrow in the three experimental conditions. The red points in the figure show the average naturalness for each rating of the Likert scale considering 100 iterations of the bootstrapping procedure, and the error bars represent the standard deviation. HO: the circled arrow is human; GT: the circled arrow is a robot driven by our game-theoretical trajectory planner; VFH: the circled arrow is driven by the Enhanced Vector Field Histogram algorithm.

Discussion

The main goal of our study was to design a navigation system for autonomous robots moving through populated environments, characterized by a high degree of acceptability by humans. Specifically, in light of the increasing use of autonomous robots in real life, we tested whether a navigation system designed through the principles of game theory would generate indistinguishable trajectories from those walked by human beings. To this aim, we first leveraged game theory to develop a model capable of predicting the intention of motion of humans in populated environments and then, based on this model, we devised a trajectory planning algorithm for a mobile robot. Finally, to assess the social acceptance of the generated robotic trajectories, we conducted a survey questionnaire on a statistically robust group of volunteers using a variation of the Turing test.

For completeness and toward even more robust outcomes, before analyzing the results collected from the survey questionnaire, we also analyzed the geometrical features of the trajectories generated in the three experimental conditions (HO, GT, and VFH), selecting three performance parameters from the state of the art (PLR, PR, CPD). The ranking obtained through this analysis (HO, GT, VFH) is consistent with the results obtained through the Turing test, except for the closest pedestrian distance (CPD), for which the trajectories generated by our planner (GT) exhibit higher values than those measured in environments populated by humans only (HO). We hypothesize that this exception is due to the fact that our model guarantees, by design, a minimum safe distance to pedestrians to prevent collisions and to ensure, in any case, a comfortable action space. Humans on the walk, on the other hand, are more flexible in this respect and evaluate circumstances on a case-by-case basis. While the outcome of the Turing test is consistent with the analysis of the performance parameters of the trajectories, the statistical analysis (Kruskal–Wallis) executed on the latter shows that this finding is not statistically significant. To explain this, we point out that the statistical analysis was conducted on only seven experiments per group, with differing degrees of variability, and thus characterized by a limited statistical power.

Moreover, to preliminarily evaluate the quality of the trajectories, we conducted a systematic analysis (Levene’s test) to assess the degree of variability of the different scenarios. In other terms, we evaluated the extent to which each algorithm generated videos that were similar to one another. This analysis revealed that there exists a significant variability with respect to the path regularity (PR), whereby the videos with the robot controlled by the VFH are the most variable, compared to the HO and GT experimental conditions. This finding suggests that the VFH algorithm is less predictable (i.e., it provides less regular results) than both our algorithm and a real pedestrian.

The variant of the Turing test comprises a first part that functions as a training phase. The second part comprises two consecutive phases. The first phase is devoted to comparing the social acceptability of trajectories generated by either our game-theoretical algorithm (GT) or a state-of-the-art algorithm (VFH) against a reference experimental condition, i.e., a complex social environment populated by humans only (HO). To this aim, participants were asked to state whether they perceived weirdness in the HO, GT, or VFH experimental conditions. The statistical test confirms that the perceived weirdness of trajectories in which only human subjects are involved is statistically indistinguishable from that of trajectories where the GT-controlled robot and human subjects coexist. Conversely, the videos in which the trajectories are generated by the VFH algorithm are perceived with a remarkably higher degree of overall weirdness compared with either the HO or GT scenarios.

In the second phase of the second part of the test, participants were asked to indicate which is the perceived “weird” arrow, if any. In this regard, we observed that the trajectories generated by the VFH algorithm were more frequently recognized as “weird” than those generated by our GT algorithm.

In the third part of the test, participants were requested to focus on a circled arrow (a human in the HO experimental condition, a robot in the GT and VFH ones), to evaluate whether or not the motion of the circled arrow corresponded to human recordings, and then to rate its degree of naturalness. We observed that, while the arrow in the VFH scenario was perceived as not human-like, the arrow controlled through GT was considered human-like, albeit not as human-like as the one rated in the HO experimental condition. We believe that this result is related to the fact that, in this part of the test, participants were asked to focus on one arrow only, thus being biased toward detecting an artificial behavior. The same ranking between the three experimental conditions (HO, GT, and VFH) resulted from the analysis of the naturalness of motion of the circled arrow. Indeed, HO has the highest degree of naturalness, closely followed by our GT trajectory planner, and then by the VFH planner.

We can conclude that, if participants were not guided to focus on a particular arrow, they would hardly perceive any difference between a real human and a robot controlled through our game-theoretical framework; therefore, the generated trajectory is a good candidate for social acceptance. This implies that our trajectory planning algorithm would help program robots to blend well into populated environments and, hence, to be perceived as more friendly, collaborative, and non-hostile.

Our findings are consistent with other studies in the literature9, where a different game-theoretical planner is perceived almost as human-like as human recordings. However, in9, the authors created a human-like motion planner for mobile robots while maintaining a simplified framework that does not comprise, for example, human groups, obstacle avoidance performed by humans, and the human desire to keep a safe personal space28. Moreover, their tests only comprise simplified scenarios: a first test with either only humans or only robots; and a second test in which the participant, through virtual reality, interacts with an agent that can move as a human or as a robot. In our study, we went one step further both in modeling (including the personal space, group recognition, and the human-obstacle interaction) and in the design of the variation of the Turing test (considering a real case scenario in which a robot moves in a human-populated environment). Nevertheless, it is hard to make extensive comparisons with other approaches, as the literature on variants of the Turing test for assessing the social acceptability of a robot agent is scant.

Notably, the literature reports three main methods to evaluate the human-likeness and the social acceptance of robot navigation: (i) definition of social rules or performance parameters and, then, assessment of the adherence of the robot motion planner to these principles54,55,56; (ii) comparison between simulated trajectories and observed pedestrian behavior57; (iii) survey questionnaire based on a variation of the Turing test9,58. The main limitation of the first two methods is that they do not consider how humans perceive the robot. However, these methods can be applied to evaluate, as a preliminary test, some features of the generated trajectories. Indeed, our analysis of the performance parameters of the generated trajectories falls within the first methodology, whereas the second methodology has been used as a validation criterion for our game-theoretical model of pedestrian motion.

Hence, toward our aims, we deemed the Turing test an effective means to study the human-likeness and the social acceptability of the generated trajectories.

Unlike Kretzschmar’s58 and Turnwald’s9 tests, where volunteers watched videos in which all agents moved either in an artificial way or as real pedestrians, our survey questionnaire completely changes such a perspective. In fact, our test videos reproduce a true use-case scenario of the algorithm (an environment populated by people with a single robot moving within it), where the real nature of the agents is masked and made uniform to eliminate any participant bias. Moreover, unlike Kretzschmar’s test58, where the Turing test was administered to only 10 participants, we performed an a-priori power analysis to infer the correct sample size for obtaining statistically significant results. Since the size of the collected data largely exceeded the outcome of the power analysis, we carried out a 100-iteration bootstrap, always obtaining consistent results across iterations, which highlights the robustness of our results and further corroborates our hypothesis.

When interpreting the results of our study, we should also acknowledge the limitations of the model and of the test design. Regarding the former, our model does not take into account the uncertainties that arise from the interaction with the external world. Importantly, the stochasticity of human behavior is not explicitly modeled, although it is implicitly accounted for through the tuning of model parameters identified from real trajectory data extracted from surveillance cameras. A range of simplifying assumptions was necessary to handle the computational complexity of the algorithm. The main one resides in the discrete nature of our model, whereby each agent can choose among a fixed number of motion directions—an indispensable trade-off between predictive accuracy and computational effort. Moreover, the designed human motion model has been devised to operate with a limited number of pedestrians: its computational complexity may become difficult to manage if the number of agents grows beyond a dozen. The pedestrian model used in this study only considers people’s goal-directed and collision-avoiding behaviors, while ignoring other social activities that humans may perform in a pedestrian urban scenario, such as waiting for a bus or wandering without a clear direction. Thus, any pedestrian behavior that is not contemplated by our model breaks the assumptions under which our system works. In addition, our method does not allow customization of trajectories: for example, the trajectory predicted for an elderly person may coincide with that predicted for a child.

The main limitation of the test design is the choice of the navigation algorithm used for comparison (VFH). Ideally, more than one algorithm should have been selected in order to mitigate algorithm-induced biases. However, since the execution of the Turing test already took the average participant about 20 minutes, we preferred to limit our comparison to a single state-of-the-art algorithm, in order to avoid lengthening the experiment for each participant, mitigate attention biases, and, in the end, achieve robust results.

Our work can be extended along several directions. To manage and predict the motion of big crowds, mean-field games could be adopted59. We remark, however, that crowded and populated scenarios are different under many aspects, and the deployment of a robot in the two scenarios would cover totally different application fields.

The lack of customization in the trajectories inferred by our model can be mitigated by combining our approach with learning strategies as in60, encompassing varied behaviors across the experimental scenario. In fact, adding variability to the pedestrian model might allow for a more accurate prediction of human motion patterns and should allow the robot to better adapt to the needs of the human with whom it is interacting. For example, if a robot recognizes a person who has difficulty walking, it should be able to predict their movement and possibly reduce its speed. Moreover, it would be interesting to assess the quality of our generated trajectories considering not only social acceptability but also the comfort11 of participants, for instance by creating a real environment shared by humans and a robot.

Statement about methods

All methods were carried out in accordance with relevant guidelines and regulations described in the methods section.

Ethical approval and informed consent

The experimental protocol regulating the administration of the Turing test to human subjects, the evaluation of the results, and the data management plan was approved by the ethical committee of the Istituto Superiore di Sanità (Italian Institute of Health) with approval code AOO-ISS 10/07/2020 - 0024079, Class: PRE BIO CE 01.00. Each participant also provided informed consent, after the explanation of the nature and possible consequences of the study.