Coordination-free Multi-robot Path Planning for Congestion Reduction Using Topological Reasoning

We consider the problem of multi-robot path planning in a complex, cluttered environment with the aim of reducing overall congestion in the environment, while avoiding any inter-robot communication or coordination. Such limitations may exist due to lack of communication or due to privacy restrictions (for example, autonomous vehicles may not want to share their locations or intents with other vehicles or even to a central server). The key insight that allows us to solve this problem is to stochastically distribute the robots across different routes in the environment by assigning them paths in different topologically distinct classes, so as to lower congestion and the overall travel time for all robots in the environment. We outline the computation of topologically distinct paths in a spatio-temporal configuration space and propose methods for the stochastic assignment of paths to the robots. A fast replanning algorithm and a potential field based controller allow robots to avoid collision with nearby agents while following the assigned path. Our simulation and experiment results show a significant advantage over shortest path following under such a coordination-free setup.


Introduction 1.Motivation and Problem Description
Autonomous vehicles are expected to travel in urban environments in the future for increased safety and overall efficiency.They could be of different car brands running different navigation & communication systems that do not share their route-choosing processes or travel data, either due to lack of communication or due to privacy restrictions.To avoid traffic congestion, such as those caused by non-cooperating human drivers nowadays, independent autonomous vehicles need to have a method for distributing traffic in the environment without communication.Motivated by this real-world scenario, in this paper we consider the problem of path planning for a large number of privacy-aware robots in a complex, cluttered indoor or urban environment with uncertainties (other unpredictable agents such as pedestrians), where the robots need to be well-distributed throughout the environment and avoid congestion in any region, but are not allowed to communicate or share their location data or intents with other robots.This is relevant to avoiding congestion in distributed vehicle routing problems when a vehicle's location/intent cannot be shared either due to lack of communication or due to privacy restrictions.We address the fundamental question of path planning under such circumstances without any interrobot coordination, while trying to minimize the overall congestion in the environment.We assume that each robot knows the map of the environment and can localize itself in it, but do not know the location or intent of the other robots.Furthermore, a robot can detect other agents in its immediate neighborhood (for example using on-board cameras or laser range sensors) so as to be able to avoid immediate collisions with them, although they cannot broadcast or communicate any information with each other.The key insight that allows us to solve this problem is to make each robot stochastically choose from a set of different routes in the environment which correspond to paths in different topologically distinct classes (Figure 1), so as to lower congestion and the overall travel time for all robots in the environment.

Literature Review
MAPF: Multi-agent path finding (MAPF) is a well-studied problem with a variety of practical applications.Given a set of robots and their start locations, the objective of MAPF is to find a set of paths that lead them to their corresponding goals without collision, while minimize the sum of travel time of each agent.Early studies [1] focused on computing valid collision-free solutions, while recent method [2] and its many variants have strived to compute optimal solutions.They all focus on conflict resolution among the agents' path choices so that an optimal or near-optimal solution can be achieved.Furthermore, existing algorithms [3,4] are designed to deal with agents that are well-distributed across the environment and not have similar start/goal locations.For example, in [3], in all presented results, the initial robot positions are chosen to be distributed uniformly throughout the environment.For a large group of robots with similar start and goal points, the computation involved in existing MAPF methods will increase drastically due to increased collision avoidance computations, and the collision/congestion avoidance still happens at a local level.
MAPF in dynamic, cluttered and uncertain environments is also well-studied in robotics [5][6][7][8].Such dynamic environments exist in presence of pedestrians and other robots in busy indoor or urban environments.Robots employed in such environments need to arrive at designated target locations while avoiding both static and dynamic obstacles such as pedestrians, which is usually unknown to the path planner at the first place.The dynamic nature of the environment in all these existing work, however (see the review paper [7] for example), is assumed to arises from completely unpredictable agents (both human pedestrians and other robots), and hence these work focus primarily on improving the short-term prediction of the behavior of such agents without consideration for long-term congestion reduction.We, on the other hand, consider the problem where the robots in the environment use the same stochastic algorithm that, even without inter-agent coordination or communication, results in overall reduction of congestion in the environment in the long-term.
Along similar lines, some studies [9,10] have focused more on pedestrian's trajectory prediction or social compliance with humans that facilitate obstacle avoidance in a more human-friendly way, but is still at a local level and over short time horizons.Over longer time periods, when multi-robot groups run on a large complicated map cluttered by dynamic obstacles, this does not help proactively avoid robot congestion in the long run if the robots' routes are not well-distributed across the environment.In this paper we focus on developing algorithms for individual robots that, even without inter-robot communication or coordination, tries to reduce overall congestion in the environment by stochastically distributing the robots over different routes.
While the presence of uncertain agents such as pedestrians in such planning problems have been considered [11][12][13][14], the problem of reducing robot traffic congestion by taking advantage of the structure of the environment without explicit inter-robot coordination remains open.
Congestion Avoidance in MAPF: Both with and without uncertainties, existing literature uses inter-robot coordination for global planning to avoid congestion [15][16][17][18], or use information stored in the environment (represented as a network) as a means of indirect coordination between the agents [19][20][21].A recent work on congestion-aware policy synthesis [22] is notable, and tries to achieve a balance between congestion reduction and minimization of detours by designing single-robot automata.However, even in this approach, significant centralized inter-robot coordination (using a shared probabilistic reservation table) is necessary.This assumption of inter-robot coordination is fundamentally different from the premise of our current work, where we assume that there exists no inter-agent communication or coordination, and robots do not share their plan or intent.Furthermore, [22] does not consider out-of-system agents (unpredictable agents such as pedestrians), which we do in our current work.
Other Related Multi-robot Coordination Problems: Most multi-robot persistent patrolling/surveillance methods use some centralized coordination [23][24][25][26].In a interrobot communication denied situation, local methods (e.g., repulsive force or velocity obstacle [27]) or other decentralized framework (using topological braids [28] or rotations [29]) can only help avoid collision but not avoid congestion proactively.In those methods, without inter-robot coordination, congestion could readily happen.Action strategy planning for robots minimizing expected cost, while a well-researched area (and often addressed using reinforcement learning or game-theoretic methods [30,31]), mostly focuses on local actions, do not take global topology of the environment into account, rely on shared information, or do not scale with the number of robots (for example, [31] presents results with only three robots).

Solution Overview and Paper Outline
The technical problem considered in this paper can be summarized as follows: Problem Statement: Given a discrete graph representation of an environment, how can a large number of privacy-aware robots with local sensing plan their respective paths in the graph without inter-robot communication or centralized coordination (i.e., without sharing their location or intents) so as to minimize the overall congestion in the environment.If all robots choose the same/similar paths (e.g., shortest paths), certain regions of the environment will inevitably be traversed more.This issue is even more aggravated when the robots or groups of robots have similar start and goal locations.Instead, a robot stochastically chooses between different topologically distinct classes of solutions with an aim of reducing overall congestion in the environment and to altruistically minimize overall travel time for any robot.We leverage the topological path planning methods introduced in the author's prior works [32][33][34][35][36][37] for computing the topologically distinct paths in the environment.

Contributions:
The main new contributions of the paper are: 1. Formulating an optimization problem for coordination/communication-free computation of probability values using which a robot stochastically chooses a path out of the available topologically distinct choices.2. Develop efficient computation methods for solving the said optimization problem by employing simplifications.3. Design the controller used by each non-holonomic robot to follow their chosen path while locally avoiding collisions with other agents in its immediate neighborhood.This requires the development of a fast re-planner that takes the topological class into account.4. Validate the proposed method in simulations and real robot experiments.
Paper Outline: In Section 2 we provide background on the computation of paths in topologically distinct classes using discrete search in Z 2 -homology augmented graphs.The main contributions of the paper appear in Section 3, where we describe the coordination-free computation of the probability values using which a robot stochastically chooses a path out of the available topologically distinct choices.This includes estimation of traffic density in the environment (Section 3.2), which is used in the evaluation of the cost of path choices (Section 3.4), which in turn is used in the computation of the probability values through an optimization process (Section 3.3).We also provide multiple appromixations and simplifications to the optimization problem for fast and efficient computation (Section 3.5).A robot stochastically chooses a path using the computed probability values.We call a chosen path the reference path of the robot, and once chosen, a robot commits to it.Section 4 describes the controller used by each non-holonomic robot to follow the reference path while locally avoiding collisions with other agents in its immediate neighborhood.This includes a prediction of the future position of agents in the immediate neighborhood (Section 4.1) and fast re-planning of path to avoid collision (Section 4.2).Section 5 provides simulation results and results from real-robot experiments.
The overall algorithm that each robot follows in computing and executing their respective paths is shown in the pipeline diagram of Figure 2. The relevnat section numbers containing the details of each of the algorithmic components are also shown in the figure . 2 Preliminaries -Topological Planning and Rationalized Discretization in Spatio-Temporal Domain In this section we provide brief background on topological path planning that allows the computation of shortest path in different topological classes using a graph searchbased approach.For further details on topological path planning the reader can refer to the author's prior work [32][33][34][35].The type of topological classes that we consider in particular is the Z 2 homology class [36,37], and we describe the path planning in spatio-temporal domain in order to account for dynamic agents during replanning.
2.1 Background: Z 2 Homology and H 2 -signature Two trajectories connecting the same start and goal points on a planar domain are said to be in the same homology class if the closed loop formed by the two trajectories forms the oriented boundary of a two-dimensional obstacle-free region [32].The  homology class of a loop can be quantified by winding numbers around the connected components of obstacles.In order to prevent the separate counting of the homology classes that loop around an obstacle multiple times, one can compute the homology in the "mod 2" coefficient [36,37].Doing so identifies all the even winding numbers to 0 and all the odd winding numbers to 1.This prevents the creation of separate homology classes for loops that wind around obstacles multiple times (Figure 3(a)) and we refer to this homology as Z 2 -homology.
In order to quantitatively identify and represent Z 2 -homology class of trajectories in a planar domain, C = R 2 − O (where O ⊂ R 2 is the obstacle set), we construct a homology invariant called H 2 -signature, which is a function on the space of curves in C that uniquely identifies a curve's Z 2 -homology class.This computation and the associated construction (see [36,38] for more details) can be summarized as follows: In an environment with o connected components of obstacles we place a representative point, ζ i , on the i th connected component and construct non-intersecting rays, {r i } i=1,2••• ,o , emanating from the representative points.The H-signature of a curve, τ , is then given by a vector of integers, where h i is the winding number around the i th obstacle and can be computed by the number of times the curve intersects the ray emanating from ζ i (in counting the number of intersections, crossing from left to right is considered positive and that from right to left is considered negative).Subsequently, the H 2 -signature of τ is defined as , in which the i th element assumes value in Z 2 = Z/2Z = {0, 1} and gives the parity of the number of times the curve intersects the ray emanating from ζ i .This computation is further illustrated in Figure 3(a).

Topological Planning in H 2 -Augmented Graph In Spatio-Temporal Domain
The configuration space of each of the robots in our case is a discrete representation of the spatio-temporal domain.For a single robot, it is represented by the graph, G = (V, E), so that a vertex v ∈ V is represented by v = (x, y, t), and edges connect neighboring vertices.In order to keep track of the Z 2 homology invariants, we define an H 2 -augmented graph (see [32,36,38] for detailed construction), G H2 = (V H2 , E H2 ), based on the graph G, such that a vertex in it is represented as (v, H) ∈ V H2 , which contains the additional information of the H 2 -signature, H, of the trajectory leading from a start vertex v s ∈ V up to the vertex v.An edge connecting vertex , and H ′ is the sum of H and the H 2 -signature of the edge connecting v to v ′ .In the spatio-temporal setup the rays in Figure 3(a) emanating from obstacles are extruded in the temporal direction to construct half-planes (Figure 3(b)) and the intersections of trajectories are counted with these half-planes for computing the H 2signature. Thus in this case the H 2 -signature computation for a path in the X-Y-T space requires counting the number of intersections of the path with each of those half-planes.
Formally, given the graph, G and a start vertex v s ∈ V, the vertex set, edge set and the cost function of the H 2 -augmented graph can be described as: 1. (v, 0) ∈ V H2 is the start vertex, and the vertex set is given by: for some trajectory and "+" is vector addition.

The cost associated with an edge,
The cost function is described in more details in Section 2.4.Searching in this H 2 -augmented graph from the start vertex (v s , 0) using A* search [39], paths to vertices of the form (v g , * ) give paths in different Z 2 homology classes connecting v s and v g (Figure 3(c)).The vertices are generated on-the-fly and as required during the execution of A* search on the graph.The paths obtained in the different homology classes are in ascending order of the cost.

Rationalized Discretization of the Spatio-Temporal Configuration Space for Constructing G
In order to construct the configuration graph, G, we discretize the spatio-temporal domain into a grid and place a vertex in every discrete cell in the grid.The time axis is discretized uniformly with the time layers separated by δt (Figure 4(b)).In the initial planning for the reference path we assume a constant robot speed, V max , which is difficult to achieve with a uniform spatial discretization of δr = V max δt along X and Y directions (since diagonal edges of an uniform square grid discreitization are longer than the edges parallel to the coordinate axes).Instead, we discretize the spatial directions (both X and Y) using intervals of δr ′ = δr 4 and establish edges connecting a vertex (x, y, t) ∈ V with vertices of the form (x ± 3δr ′ , y ± 3δr ′ , t + δt), (x ± 4δr ′ , y, t + δt) and (x, y ± 4δr ′ , t + δt) (see Figure 4(a)).In doing so, the spatial length of edges parallel to X or Y axes are δr, while the spatial length of the diagonal edges are √ 2 3 4 δr = 1.06 • • • δr ≈ δr (Figure 4(a)), thus allowing the implementation of almost-isotropic (direction-independent) velocity of V max along the edges.We call this the "Rationalized Discretization".

Cost Function and Heuristic Function for A* Search for Computing Paths in Different Topological Classes
For the topological planning of the paths in different Z 2 -homology classes for a particular robot, the cost function accounts for the traversal time for the edges.The cost of an edge connecting vertices (x, y, t, h) and time to reach a goal vertex of the form (x g , y g , * , * ) using any Z 2 homology class, which is used as the heuristic function for A* search in G H2 .An A* search in this H 2 -augmented graph is used to find paths to vertices of the form (x g , y g , * , * ), which returns paths in increasing order of cost, and the cost of a path, π, is denoted by C B (π), which is referred to as the base travel cost.

Stochastic Topological Path Assignment for Coordination-free Multi-robot System
In this section we describe the algorithm that a robot uses for computing the probability values with which it stochastically assigns itself to one of the topologically distinct paths that it has computed.The main consideration in designing the algorithm is that the robots are not allowed to communicate or coordinate among themselves in either the computation of the probability values or in choosing their own path assignments.A robot does not share its location or its intent/choice with other robots.In the next section we start with a brief description of the problem setup and introduce some terminologies.

The Environment and a Robot's Mental Model of the Environment
Figure 5: The mental model of a robot computing its own path (a planning robot): In absence of the knowledge of the positions of other robots or pedestrians, a robot reasons about its stochastic choice from the available topological classes of paths using a team of virtual robots with similar start/goal (the "contending robots").All other agents (robots or pedestrians -referred to as "distant agents") are accounted by an estimated traffic density map, ρ.
We consider a planar indoor or urban environment with multiple robots navigating from one location to another within the environment while trying to avoid global traffic congestion.While the robots are rational agents (their actions are determined by an algorithm that altruistically tries to reduce global traffic congestion), there also exist non-rational agents (pedestrians) that do not attempt to reduce congestion in their path planning.Robots need to maintain a minimum safe distance from other robots as well as pedestrians.Because of that, if a passage or route in the environment becomes too crowded, some may have to slow down or wait for the congestion to reduce.Due to lack of inter-robot coordination, a lack of knowledge of the current global traffic state/distribution in the environment, and the unpredictable nature of the pedestrians, it is virtually impossible to predict such congestion ahead of time.As a consequence, the overall travel time of all robots could increase significantly.The key insight in addressing this problem is to distribute the robots across different routes in the environment so as to lower the probability of congestion.Assigning the robots paths in different topological (Z 2 homology) classes in the environment can help achieving that.

Types of Agents and the "Planning Robot"
In the environment we assume that there are two types of agents -i.non-rational agents, also referred to as pedestrians, that always choose the shortest path without consideration for global congestion reduction, and ii.rational agents, also referred to as robots, that stochastically chooses one of the multiple topologically distinct paths available to it with an aim of reducing global congestion.Without coordination between the robots, all computation of the paths and stochastic path selection for a particular robot happen onboard the robot itself in a decentralized manner, without communication with other robots.In the following sections, we describe the computations made by a particular robot (i.e., from the perspective of that individual robot), referred to as "the planning robot".It is to be noted that all robots are planning robots in their own rights, and the same algorithm is used by each robot for its individual computation.

Decoupling the Problem to Reason About Global Agent Distribution and Local Path Selection
A planning robot needs to not only reason about other robots that may have similar start and goal locations as itself, but also the robots that may have different start/goal locations as well as the pedestrians.Without knowing the location, intent, or choices of other agents in the environment, a planning robot decouples this complex problem into two parts: i. Estimation of global traffic density, ρ, in the environment based on a prior belief of agent trajectories (Section 3.2), and, i. Probabilistic choice of topologically distinct paths for avoiding congestion (Sections 3.3-3.5).A planning robot does not know the location of other agents (pedestrians or other robots) in the environment.In order to reason about the available topological classes connecting its own start and goal locations, it considers a group of virtual robots with similar start & goal locations.These robots are referred to as contending robots, and having similar start/goal locations have the same topological classes of paths as the planning robot and thus enables reasoning about the stochastic choice of a class (Figure 5).Other agents (referred to as distant agents) in the environment are represented by an estimated traffic density map, which is used in evaluating the different topological classes of paths when making the stochastic choice.

Estimation of Traffic Density in an Environment
This density estimation not only accounts for pedestrians, but also potentially accounts for robots that have start & goal locations that are widely different from the planning robot.We refer to such agents as distant agents.This density is used in formulating the optimization problem for computing the probability values associated with the different topologically distinct path options available to the planning robot.
The distant agents' traffic density, ρ : V → R + , is described by a real number associated with each cell (pixel) in the discrete representation of the environment.While ρ can be computed from historic traffic data, in absence of such data it can also be estimated from the structure of the environment.In order to do that, traffic is randomly generated with thousands of shortest paths connecting random starts to goals.
Then each path is thickened by Minkowski summing it with the pedestrians' diskshaped footprint, and for each discrete cell in the map the number of such thickened paths that pass through it is counted.This distribution is then normalized to obtain the density function, ρ.Some examples of traffic density maps computed this way are shown in Figure 6.This traffic density is an estimation only, related to the map's topology and geometry, and needs to be computed only once for a given environment.

Probabilistic Choice of a Topological Class
In order to avoid congestion along the possible routes that the planning robot can choose to reach its goal, the robot needs to reason about the choices made by other robots that have similar start and goal locations as itself.To that end the robot can choose from multiple topological classes of paths representing distinct routes leading to its goal.However, without coordination between the robots, a probabilistic approach is taken in which the planning robot reasons about an estimated n robots, including itself, with similar start and goal (refereed to as contending robots) and chooses a topological class stochastically based on probability values computed to minimize the travel time for all the contending robots.The planning robot starts by computing m topologically distinct paths connecting its start to its goal location (Section 2).The paths are referred by by their number/index from the set S = {1, 2, . . ., m}.The planning robot needs to choose one out of these m classes stochastically.Suppose P j is the probability with which the robot chooses the j-th path.In order to compute the path choice probabilities, {P j } j∈S , the robot reasons about the time of travel if all the contending robots choose the paths from S according to the probabilities {P j } j∈S (the contending robots being rational agents will use the same probability computation method themselves, thus arriving at the same probability values).

Path Choice Probability Computation -The Complete Formulation
The value of n, while unknown, can be estimated based on the a priori knowledge of agent density in the environment or through local sensing.However, the planning robot uses the value n only for the computation of its own path choice probability values {P j } j∈S , while in reality there is no real coordination between the planing robot and the other robots.
We denote the set of contending robots as R = {1, 2, • • • , n}.Suppose the i-th contending robot's choice of path is σ i ∈ S, for i ∈ R. We define a joint path choice made by the n contending robots to be σ = (σ 1 , σ 2 , . . ., σ n ) ∈ S n .Given a joint path choice σ ∈ S n , suppose C(σ) is the estimated travel time cost of the entire group of contending robots (which is determined by the geometry of the m paths, the prior estimated traffic density, ρ, along those paths, and the number of contending robots in each of those paths due to the joint path choice, σ -the computation of the cost C(σ) is described in details in Section 3.4).Since the robots make their individual choices independently, the probability of making the joint path choice σ is n i=1 P σi .We thus formulate the following optimization problem for minimizing the expected cost: where the summation in the objective function is over all possible joint path choices, and hence involves m n terms.
Proof The equality constraint is clearly affine and the inequality constraints are linear.Define the objective function f ({P j } j∈S ) = σ∈S n C(σ) n i=1 Pσ i , which is homogeneous of degree n in the probabilities.Thus, for two sets of probability values, {P j } j∈S and {P (2) j } j∈S , and with 0 ≤ λ ≤ 1, Hence the objective function is convex.□ Even though this optimization problem is convex, the number of terms in the objective function grows exponentially (or factorially, upon some simplification) with n.Hence, for all practical purposes, a direct solution to this optimization problem is not feasible since the evaluation of the objective function takes a lot of time.Hence in Section 3.5 we will formulate two simplified optimization problems that are computationally more amenable.In the proximity penalty computation, the overlap of two paths is determined by drawing a corridor around the path, π j , then measuring the part of the path π j ′ that lies within that corridor.

Travel Time Cost Computation
In this section, we describe the computation of the travel time cost function C : S n → R + that estimates the travel time of the team of contending robots for a joint path choice, σ.We first observe that the individual travel time of the i-th contending robot will not only depend on its own chosen path, σ i , but also the choices made by the other contending robots since that will determine the level of congestion along the different parts of the path.Define the set of contending robots that choose the path j ∈ S as R j = {i | σ i = j} ⊆ R and the number of those contending robots as N j (σ) = |R j (σ)|.Due to uniformity between the contending robots, if two contending robots, i 1 and i 2 , choose the same path (say, j = σ i1 = σ i2 ), then their estimated individual travel times will be the same.Thus, for a given joint path choice σ, we define the estimated travel time cost for the j-th path as D j (σ).We define two possible types of travel time cost functions for use in the objective function of (1): 1. Average Travel Time Cost: using which would try to minimize the average of the travel times of all the contending robots.2. Maximum Travel Time Cost: C max (σ) = max j∈S D j (σ), using would try to minimize the maximum out of the travel times of all the contending robots.
The estimated travel time cost for the j-th path, D j (σ), for a given joint path choice σ, not only depends on the number of contending robots assigned to the path in the j-th class, but also the number of contending robots assigned to the other paths in S, since those paths can potentially have geometric overlaps with the j-th path (Figure 7).As described earlier, we use A* search in the H 2 -augmented graph (Section 2.2) to compute distinct paths in the m different topological classes connecting the start and the goal location of the planning robot.Let's refer to these paths as {π j } j∈S .The travel cost for the j-th path is then computed as where, i. C B (π j ) is the base travel cost as computed by the search in the H 2 -augmented graph (i.e.estimated time taken to follow the path without consideration for any other agent in the environment).

Coordination-free Planning for Congestion Reduction
ii.The second term computes the additional cost due to the a priori belief of traffic density, ρ, modeled to be proportional1 to the net estimated traffic density along the path, C T (π j ) = s∈πj ρ(s)2 -referred to as the traffic-weighted travel cost, and scaled by the estimated number of distant agents, Q, in the environment.The proportionality constant, a, is determined experimentally (described in the next paragraph).iii.The last term computes the proximity penalty or overlap cost between pairs of paths due to multiple contending robots from different (or the same) paths creating congestion along the regions of π j where there is a geometric overlap with π j ′ (Figure 7).This includes self-overlap cost (when j ′ = j) due to the multiplicity of contending robots following the j-th path.The cost is proportional 1 to the number of additional contending robots, N j ′ (σ), in the overlapping path and the amount of overlap, C P (π j , π j ′ ).The amount of overlap itself consists of two parts: where the first part is the time cost of the part of the path π j ′ that overlaps with π j as is computed by the search in the H 2 -augmnted graph (with the overlap being determined by the proximity between the points on the two paths -Figure 7), and C O,T (π j , π j ′ ) is the net estimated traffic density on the overlapping parts of π j ′ .
The values of a and b are determined experimentally by running multiple simulations in a simple single-passage map (Figure 8) with a varying number of distant agents (we choose pedestrians only) and a varying number of contending robots.The width and the length of the passage in this map are chosen to be similar to those in the maps used in experiments and simulations.For more details on the implementation of the simulations, refer to Section 4.

Simplified Formulations
The optimization problem in (1) is referred to as the complete model and has O(m n ) terms in the objective function, making it extremely computationally expensive to solve with a large number of robots and available topological classes.We thus propose couple of approximations to simplify the optimization problem.These approximations rely on the fact that the number of contending robots, n, that the planning robot assumes is an estimate and is purely for the purpose of computing its own path choice probabilities.In reality, there is no coordination or communication between nearby robots.We thus use extreme values of n to simplify the models.

Two-robot Model
In this model we assume that the number of contending robots the planning robot considers is n = 2, so that the objective function in (1) becomes quadratic, which can be solved efficiently using a quadratic program.However, in computing the travel time cost for the j-th path, D j (σ), with σ ∈ S 2 , we can still account for a number n other than 2 by simply replacing each of the 2 robots with n 2 robots when computing  the overlap costs.This is effectively done in (2) by scaling & redefining the robot counting function as

Ensemble Model
In this model the planning robot assumes a large number of contending robots so that the number of robots in the j-th class is approximately nP j .Considering the problem of minimizing the maximum travel time cost (i.e., the maximum out of the travel times of all the contending robots) This allows us to reformulate the optimization problem as s.t.
where K j (P 1 , P 2 , . . ., P m ) is the estimated travel time cost for a robot assigned to the j-th path if the number of robots following the l-th path is nP l for all l ∈ S. The expression of K j is derived naturally from the definition of D j in (2): Note that K j (P 1 , P 2 , . . ., P m ) is affine in {P j } l∈S , and hence the objective function in (3) being max of affine functions, is convex [40].As a consequence, the optimization problem in (3) can be solved using efficient numerical methods and does not have an exponentially large number of terms as was the case in (1).It is worth noting that there is no meaningful analogous ensemble model for minimization of the average travel time cost since the average of the affine functions, {K j } j∈S , would result in an affine objective function in (3), which would result in a linear program, the solutions to which is always trivial with one of the probabilities in {P j } j∈S being equal to 1 and rest 0.

Path Choice Probability Values
Given a planning robot's start and goal location in an environment, the path choice probabilities, {P j } j∈S , depend on the choice of the model (complete model, 2-robot model or ensemble model) as well as the number of contending agents, n.For the ensemble model, with the large n assumption, it is clear from ( 3) and ( 4) that the probability values are independent of the choice of n.A comparison of the probability values computed using the different models and different n is shown in Table 1.The similarity among the values computed using the complete model and the 2-robot model is apparent, while the values from the ensemble model get closer to the 2-robot model as the value of n increases.This allows us to choose a simplified model for fast computation of the probabilities in experiments and simulations with a large number of contending robots.

Execution of Chosen Path with Local Collision Avoidance
In this section we describe the algorithm and controller used by a robot for following the stochastically chosen path in a topological class while avoiding immediate collisions based on local sensing.The overall algorithm for each robot3 is as follows: A robot stochastically chooses a reference path from the paths {π j } j∈S that it computed using A* search in the H 2augmented graph according to its own computed probability distribution {P j } j∈S .Once a robot chooses its own reference path, it commits to that path, since without inter-robot coordination and without live global traffic updates, there is not new information to warrant a full-blown replanning of reference path.Each robot then starts executing its reference path while performing a fast replanning (described in Section 4.2) at regular intervals of time with an appropriately chosen heuristic function in order to follow the reference path while avoiding collision with pedestrians and other robots.The fast replanning is performed using A* search in the H 2 -augmented graph, G H2 , and does not compute the path choice probabilities, but simply avoids high pedestrian/robot density regions as estimated in the immediate future in the spatiotemporal domain using sensing of the immediate vicinity.Feedback linearization and a potential-based approach allows control of the robot while avoiding collision.
The following subsections give more details on prediction of agent density in the immediate spatiotemporal neighborhood (Section 4.1), the fast replanning algorithm (Section 4. 2), and a potential-based local collision avoidance for the non-holonomic robot model (Section 4.3).

Spatio-temporal Representation of Other Agents' Near-future Occupancy Probability Distribution
For each agent in its immediate vicinity, a robot performs a short-term prediction of the agent's occupancy probability distribution as a density function in its spatiotemporal configuration space.Given the instantaneous position and velocity of a nearby agent (estimated using sensors onboard the robot), the robot employs a simple prediction-only Markov localization approach [41,42] (a discrete analog of Kalman filter) that uses a motion model to predict the probability of occupancy distribution (in Coordination-free Planning for Congestion Reduction a uniform discrete representation of the spatio-temporal domain) of the agent for the next m max timesteps (with each times-step of length δt, and a spatial discretization of δr ′ -the same discretization used in construction of G -Section 2.3).These occupancy probability values from the different nearby agents are aggregated (pointwise maximum) to construct the probability of occupancy map, P(x, y, t), that assigns a value to every discrete cell (Figure 9(a)).In practice, the probability computations are done on-the-fly during the graph search and only for t between the current time and m max time-steps into the future.

Fast Replanning -Heuristic Function and Cost Function
In order to perform a fast re-planning to avoid collision with other nearby agents, while ensuring that a robot stays committed to its reference path, we design a reference-pathbased heuristic function for guiding an A* search on the H 2 -augmented graph G H2 for quickly computing a path in the same homology class as the reference path.The heuristic function for the fast re-planner for a vertex (v, t, h) ∈ V H2 is computed as follows (Figure 9(b)): We compute the closest vertex p * = argmin p∈π ref ∥x − p∥ on the reference path, π ref (described as a sequence of points on the planar domain), and return the Euclidean distance between v and p * (referred to as part I of the heuristic function), and add to it the cost (travel time) from p * to the goal on the reference path (referred to as part II of the heuristic function) which was pre-computed as part of the reference path search in H 2 -augmented graph.More formally, the heuristic function evaluated at (x, t, h) ∈ V H2 is described as , where α ≤ 1 is a constant to tune the inadmisibility of the heuristic function, with lower value of α allowing greater deviation of the re-planned path from the reference path.
The cost function for fast replanning not only tries to minimize travel time, but also accounts for the computed nearby agent probability of occupancy, P. In particular, the cost of an edge, e ∈ E, connecting two points in the spatio-temporal domain is described by C G (e) = e 1+ι 1+ι−P(x,y,t) dt, where the integration is a line integration on the segment representing the edge and is performed numerically using linear interpolation of P along the uniformly discretized segment, and ι = 0.001 is a small positive constant used for numerical stability.Note that if the probability of occupancy is close to 1 at some point, that point will have very high cost and will hence be avoided.

Non-holonomic Robot Control for Trajectory Tracking and Potential-based Collision Avoidance
Trajectory tracking: Each robot tracks its computed trajectory (the reference trajectory at the beginning, and the fast-replanned trajectory subsequently).In the experiments, each non-holonomic differential drive robot controls a lookahead point [43] by computing the corresponding linear and angular velocities.At a given time step, a target point p l = (x l , y l ) is extracted from the time parametrized trajectory output by the replanning algorithm.This set point is tracked using a feedback linearization controller designed as follows: Given the current robot position p c = (x c , y c ), orientation θ c and a constant lookahead distance which is a point at a distance d f ahead of the robot.This enables the lookahead point track the target point p l .c s is a constant scalar gain.Corresponding linear and angular velocities for the robot is thus computed as u = v x cos θ c + v y sin θ c and ω = (v y cos θ c − v x sin θ c )/d f .For the differential drive robot, linear and angular velocities can be mapped to left and right wheel efforts (or velocities) as v L = u − ωl/2 and v R = u + ωl/2 where l is the separation between the two wheels of the robot.
Local collision avoidance with other agents: Depending on the robot heading, a fan-shaped collision cone is generated in front of the robot with radius r c and angle α c .For every other agent, i, detected inside the collision cone with position p agnt,i = (x agnt,i , y agnt,i ), a repulsion velocity to slow down the robot, that is inversely proportional to the distance between them, is computed as v rep,i = −c a (p agnt,i − p c )/∥p agnt,i − p c ∥ 2 , where c a is a positive constant.The resulting Cartesian velocity of the robot is computed as v avoid = v + i v agnt,i .This emulates the behavior of a vehicle that tries to avoid other agents ahead of it, but not behind it.
Local collision avoidance with obstacles: For avoiding robot-environment collision (including environment boundaries and obstacles) we use a velocity cancellation policy as follows: The closest point on an obstacle or environment boundary to the robot is denoted as p env = (x env , y env ).Then the vector from the robot to the environment is defined as v env = p env − p c .An obstacle repulsion component of the velocity is activated only if the obstacle is sufficiently close, and the robot has a velocity component towards the obstacle, and thus the final velocity of the robot is computed as follows: where, c e and d e are positive constants.

Results & Discussions
We run the simulations on several maps: "cage 1" (Figure 3(c)), "cage 2" (Figure 6(a)), "lehigh" (Figure 6(b)), "o2" (Figure 6(c)), and "group" (Figure 6(d)).Three types of comparisons are made: i.We first compare our proposed topological planning algorithm (a robot stochastically choosing paths from available topological classes) with a shortest-path algorithm (each robot, without any inter-robot coordination, chooses the shortest path to goal).In this comparison all distant agents are modeled as pedestrians and their trajectories are randomly generated.ii.Then, we demonstrate the effectiveness of a. the traffic density map, and, b. the proposed computation of path choice probabilities by comparing the performance of our algorithm with versions that either does use an uniform traffic density map or uniformly path choice probabilities over the topological classes.iii.Finally, we apply our proposed method to a setup with multiple groups of robots that start from different locations and have different goal locations, and compare the performance of our method with the performance of the shortestpath algorithm.It is worth noting that the fundamental premise of lack of inter-agent coordination or communication makes our work extremely unique.We assume that there exists no inter-agent communication or coordination, and robots do not share their plan or intent with other agents or with any central server.No other prior work, to our knowledge, assumes complete lack of communication or coordination (for example, in [22] there exists communication and coordination between the robots in construction of a shared PRT).Hence a fair comparison of our method with such alternatives in literature is not possible.
i.In the topological-versus-shortest-path comparison, the robots are allowed to choose one out of up to m = 6 classes.In each environment we vary the number of robots, n, and the number of pedestrians, Q, and note the average travel time of the robots and the maximum travel time (the time taken by the last robot to reach its goal).We also measure the average time spent on collision avoidance per robot.A simulation with the same initial conditions is performed using each of the proposed topological algorithm and the shortest path algorithm.Tables 2 and 3 shows a performance comparison.Each of the percentage numbers is the ratio of travel time (average or maximum) between the simulations using the topological algorithm and that using the shortest path algorithm.For collisions we show the difference between the time spent avoiding collisions using the topological algorithm and that using the shortest path algorithm 4 In computing the path assignment probability values for the topological algorithm we can use either the maximum travel time cost, C max , or the average travel time cost, C avg , and choose one out of the two simplified formulations -2-robot model (solved using QP library 'qpOASES') or the ensemble model (solved using NLP library 'NLopt').This is indicated in the first column of the tables.
As evident from the results, as the number of robots and pedestrians increase (i.e., the potential of congestion increases), our proposed topological algorithm significantly outperforms the shortest path algorithm in all aspects.It is also worth noting that in the larger lehigh map, using the travel time costs C max , the advantage is higher in the maximum travel time than the average travel time.ii.a.To verify that the traffic density map (as described in Section 3.2) in our proposed algorithm makes a difference in performance, we used a uniform traffic density map to run 10 simulations for comparison (while keeping the rest of the algorithm the same).The results in Table 4 suggests that the algorithm without an appropriately computed traffic density map underperforms.The role of the traffic density map in predicting the traffic for computation of the path assignment probabilities is statistically significant.
ii. b.To demonstrate the effectiveness of our proposed path assignment probability computation, we compare the performance of our ensemble model for probability computation with uniform path assignment probabilities.In particular, in map o2 (Figure 6(c)), the ensemble model minimizing the max.travel time cost gives path assignment probabilities of 0.34, 0.25, 0.26 and 0.14 for the 4 topological classes in the environment, which is sufficiently different from uniform probability of 0.25 for all the classes.This makes this environment a prime candidate for comparison of our method with the uniform path assignment probability method.Table 5 shows that the path assignment probabilities computed using the proposed algorithm leads to an improved performance with a variety of robot and pedestrian setups, when compared against the uniform path assignment probability method.
iii.Although most of our simulations focus on single-group scenarios, we have applied our algorithm to a multi-group case as well.In map "group" (Figure 6(d)), three groups of robots starting of from different locations try to reach their respective goals cross the map.As suggested by Table 6, our method, with 10 classes for robots to choose from performs better than the shortest-path algorithm in a statistically significant manner.
Real Robot Experiments: Real-robot experiments were run only on the cage 2 map with 9 robots and 9 pedestrians ( Fig 10).The results from each of the 10 runs are summarized in Table 7 and demonstrate similar advantages as seen in simulations.
Complete video of the simulation can be found in the multimedia attachment.It is to be noted that the robots used in the experiments are omni-directional and are is allowed to stop and/or move back in order to avoid immediate collision with other robots or pedestrians that it can sense in its immediate neighborhood.Since the robots follow paths computed by A* planer on a discrete grid representation of the environment, it needs to follow a piece-wise path that is restricted to the graph.Discussions, Limitations and Future Directions: As demonstrated in the simulations and the experiments, our inter-agent coordination-free method distributing the agents across different routes outperforms other coordination-free methods with respect to the overall travel time.The advantage is particularly amplified when there are a large number of robots in the environment.Compared to other MAPF methods, our method does not require real-time traffic information of both in-and out-of-system agents.The probability computation time using one of the simplified models for each robot is constant irrespective of the number of agents in the environment, and hence the computation complexity per robot does not increase with the number of agents, making our algorithm suitable for an environment with a large number of agents.Despite the demonstrated effectiveness of the proposed method, we recognize several limitations of the current method, which warrant further future investigations: i We use a priory traffic density estimation in the cost function for computing the reference paths for each robot.Currently this traffic density is computed synthetically purely based on the structure/map of the environment.In a real urban environment, historic traffic date can provide more accurate traffic density.In future we plan to test the proposed method with real traffic data collected from the department of transportation for constructing the traffic density map.ii It is assumed that each robot has a priori knowledge of the environment (a map) and also knows its own location (using a global localization system such as GPS).
Without one or both of these information, each robot will also need to simultaneously create a map of the environment and/or localize itself in the environment.This will require each robot to use a SLAM (Simultaneous Localization and Mapping) [44] module on top of our coordination-free planning algorithm, which we will do in the future.iii We have used A* search algorithm in a discrete graph representation of the environment for computing the cost-minimizing paths restricted to the graph.This results in the individual robots following paths that are piecewise linear, but may have sudden turns because of the discrete nature of the graph.In future we will use any-angle planning algorithms such Theta* [45] or S* [46] to generate smoother paths for the individual robots to follow.

4 Figure 1 :
Figure 1: Key idea: Make each robot stochastically choose from a set of topologically distinct paths in the environment.

Figure 2 :
Figure 2: Workflow of the proposed coordinationfree planning.Each robot follows this sequence of algorithms for computing and executing its own path without any inter-robot coordination or communication.

Figure 4 :
Figure 4: Successors (cyan) of a vertex (x, y, t) ∈ V (in red) using the rational discretization.Note how, although the successors with different spatial coordinates do not exactly fall on a circle of radius Vmaxδt, they do approximate the circle reasonably well due to the rationlized discretization.

Figure 6 :
Figure 6: Maps used for simulations and experiments.Traffic density maps estimated without a priori knowledge of any historic traffic data are shown using shades of cyan -darker cyan indicates potentially busier traffic regions.

Figure 7 :
Figure 7:  In the proximity penalty computation, the overlap of two paths is determined by drawing a corridor around the path, π j , then measuring the part of the path π j ′ that lies within that corridor.
(a) map: single_passage Number of contending robots Number of distant agents (b) computation of parameter 'b'.(c) computation of parameter 'a'.

Figure 8 :
Figure 8: Experimental estimation of the proportionality constants a and b in (2).The curves in different colors in (b) and (c) are with different ratios of the robot's safety distance to the passage's width as that is different for different environments.(a) The single-passage map used to experimentally compute the proportionality constants a and b, with the estimated traffic density, ρ colored in cyan.(b) Max travel time increases with the number of contending robots, with the fixed number of pedestrians.The slope of this curve computes the constant b.(c) Max travel time increases with the number of pedestrians, with the number of robots fixed at 10.The slope of this curve computes the constant a.

Figure 9 :
Figure 9: Fast replanning: (a) Illustration of computation of a nearby agent's probability in X-Y-T configuration space with δ t = 0.25s.The lighter cyan, the less likely it is occupied by the agent.The red ring shows the region (including safety radius) occupied by the agent at the time of prediction.(b) Heuristic function for fast re-planning returns the estimated time of travel for the parts I and II.

Figure 10 :
Figure 10: Real robot experiment in map "cage 2".Virtual (augmented reality) pedestrians are used in the experiment (see multimedia attachment).

Table 1 :
A comparison of the probability distribution over classes by different models on Map "cage 1" without non-rational agents.The paths of classes are shown in Figure3(c).
* All methods use a = 0.1625, b = 0.04548, and Q = 0. ** Not computable due to the memory overflow in computation of 3 20 terms in the objective function of the complete model.

Table 2 :
Performance of our proposed topological algorithm as compared to a shortest path algorithm in simulations for map "cage 1", using two time-cost & assignment probability computation models.Darker cyan indicates bigger advantage of the topological algorithm over the shortest-path one, while darker red indicates cases where it underperformed.Each cell shows the the average over 10 simulation runs with different initial conditions.See multimedia attachment for sample simulation runs.
* All methods use a = 0.1625, b = 0.04548.% values: time taken in topological algorithm time taken in shortest path algorithm × 100% Collision numbers: (colliding duration per robot in topological algorithm) -(colliding duration per robot in shortest-path algorithm)

Table 3 :
Performance comparisons in the other two maps.

Table 4 :
Performance comparisons between simulations using traffic density map, ρ, and not using it (i.e., effectively uniform traffic density by setting traffic density term 0) on Map "cage 1", both with 6 classes, 10 robots, 10 pedestrians, and using the 2-robot model minimizing average travel time cost.Each cell shows the the average over 10 simulation runs with different initial conditions.

Table 5 :
Performance comparisons between simulations using path assignment probabilities computed using the ensemble model minimizing max.travel time cost, and using an uniform path assignment probability over 4 topological classes on map "o2" (Figure6(c)).Each cell shows the the average over 10 simulation runs with different initial conditions.

Table 6 :
Performance comparisons for multi-group scenarios (3 groups, 5 robots in each group) between the topological method and the shortest-path method in map "group" (Figure6(d)).The topological method uses ensemble model minimizing max.time time cost with 10 classes.Each cell shows the the average over 10 simulation runs with different initial conditions.

Table 7 :
Statistic from individual real-robot experiments on map "cage 2".All runs are with 9 robots and 9 virtual pedestrians, using the 2-robot model minimizing average travel time cost, Cavg.