Dependable Navigation for Multiple Autonomous Robots with Petri Nets Based Congestion Control and Dynamic Obstacle Avoidance

In this paper, a novel path planning algorithm for multiple robots using congestion analysis and control is presented. The algorithm ensures a safe path planning solution by avoiding collisions among robots as well as among robots and humans. For each robot, alternative paths to the goal are realised. By analysing the travelling time of robots on different paths using Petri Nets, the optimal configuration of paths is selected. The prime objective is to avoid congestion when routing many robots into a narrow area. The movements of robots are controlled at every intersection by organising a one-by-one passing of the robots. Controls are available for the robots which are able to communicate and share information with each other. To avoid collision with humans and other moving objects (i.e. robots), a dipole field integrated with a dynamic window approach is developed. By considering the velocity and direction of the dynamic obstacles as sources of a virtual magnetic dipole moment, the dipole-dipole interaction between different moving objects will generate repulsive forces proportional to the velocity to prevent collisions. The whole system is presented on the widely used platform Robot Operating System (ROS) so that its implementation is extendable to real robots. Analysis and experiments are demonstrated with extensive simulations to evaluate the effectiveness of the proposed approach.


Introduction
Path planning is one of the key components of a robotic control system, with the function to drive a robot from a starting to a goal position without crossing through obstacles, and do this in an optimal manner. Although the path planning problem has been addressed since the 1950s [1,2], it has not been solved thoroughly until today [3]. Over the years many solutions have been restricted to static environments or single robot cases. In general, the path is optimised with respect to a cost function, which is primarily based on the total length of the path, and in some cases other Lan Anh Trinh lananh.trinh@mdu.se Mikael Ekström mikael.ekstrom@mdu.se Baran Cürüklü baran.curuklu@mdu.se 1 School of Innovation, Design and Engineer, Malardalen University, Västerås, Sweden factors, such as the complexity of the terrain, areas of interest, and other environmental factors. Recently, new challenges have emerged for path planning of fully autonomous robots within the Industry 4.0 context [4], where a mobile industrial robot is no longer separated from humans in space. More specifically, the robot shares the workshop floor with workers, and it interacts with them as well as other robots.
The navigation is crucial in such problem domains, since these scenarios demand collision avoidance capability even if there are only means to predict the exact movement of moving obstacles to a certain degree. Thus, when a path planning algorithm is designed, industrial standards should be considered. Dependability is important in industrial system engineering with its focus on a system's (i) availability, (ii) reliability, (iii) safety, and other properties such as (iv) maintainability, (v) integrity, etc.
In this paper, a dependable path planning algorithm is evaluated with respect to the first of those properties, i.e the points (i)-(iii). Reliability is assessed by the correctness of the control system to drive a robot to the goal. Availability is reflected by continuity of the system within a time limit, not having the dead-lock situation which happens when the robot takes a very long time to reach a target, or even not able to do so. Lastly, safety is evaluated by the obstacle avoidance abilities of the robot to prevent collisions with static and moving objects.
In general, a path planning algorithm is separated into global and local path planning. The former with the target to find a way from a start position to a goal having no intersection with any static obstacles in the map, while the latter will help the robot to avoid collisions with moving obstacles. So far, numerous global and local path planning algorithms have been developed with the ambition to improve dependability of navigation systems. Efforts have been made with the global path planning to find a feasible and effective route for a robot [5][6][7]. In many cases, a map of configuration space is represented as a graph and a graph-based searching algorithm like A*, is used to find a collision-free path [2,8]. The Theta* algorithm [9] adds line-of-sight checks into A* so that the path to the goal becomes more straight, thus contains fewer turns. Dynamic A* adapts to the changes of a static map in real-time, allowing the path planning to be more robust in unknown environments. Other common approaches explore the map by sampling, such as probabilistic road map (PRM) [10], with the goal of building the map by one-time sampling or rapidly-exploring random trees (RRTs) [11] to generate samples incrementally from the initial position until reaching the goal.
For multiple robots, the multi-agent path finding (MAPF) algorithms have been developed to search for optimal paths to drive the robots to their goals without conflicts. The conflicts happen when several robots occupy the same place at an instance of time. In general the MAPF problem is nondeterministic polynomial (NP) hard [12] where the searching space grows exponentially with respect to the number of robots. A*-based search [13,14] was one the of early solutions to perform a search with heuristics in higher dimensional space with multi robots. Conflict-based search (CBS) [15] can be used to find an optimal solution by a repeated process. First, CBS initialises a root node with all paths as the shortest paths of robots to the goal without concerning other robots. Continuously CBS expands the tree search starting from the root node by adding child nodes with conflicting constraints to avoid collisions. The constraints about potential conflicts are used to refine the paths at each node. Enhanced CBS [16] solved the MAPF problem within a suboptimal bound. Cooperative A* (CA*) [17], also a suboptimal and incomplete approach, prioritises the robot to assign the path by checking the number of collisions of the shortest path as compared to the planned paths. Priority-based search (PBS) [18] is the combination of CBS and CA* to improve the priority order of CA*. In overall, there has been a lack of a MAPF solver dealing with complex constraints like allowing several robots to meet at an intersection or one robot to wait for others to pass through the intersection. This paper investigates an effective tool to analyse and find optimal settings for global paths of multiple robots on those situations.
Once the global path to the goal has been generated, obstacle avoidance with local path planning ensures the safety for both robots and humans by preventing collisions between them. In velocity-obstacle (VO) approaches [19], a velocity obstacle is defined as a forbidden zone in the velocity space of the robot where a velocity vector in that zone could result in a collision with moving obstacles. The reciprocal velocity obstacle (RVO) [20,21] implements VO in a distributed manner where each robot has responsibility to avoid collisions with the rest by choosing a velocity outside VOs. However, after escaping from the collision, the robot tends to come back to its preferred velocity, possibly leading into a reciprocal dance. To address this problem, the optimal reciprocal collision-avoidance (ORCA) [22] casts the solution with a linear programming optimisation. Simultaneously, the hybrid reciprocal velocity obstacle (HRVO) [23] seeks for the velocity of the robot more towards one side of the half plane of VOs. Nonetheless, VO-based methods work with the assumption that the velocity of a robot is not changed over a time short interval. Therefore, these kind of approaches are more suitable to prevent potential collisions within a local range. Alternatively, a learning-based approach with reinforcement learning, has been investigated using Markov decision process to learn from history to predict the next movement of the robot with respect to observation [24]. Yet, a long training process needs to be done to collect sufficient amount of data to perform planning in both local and global scale. Conventional, however still widely applicable for local path planing, are potential field and dynamic window methods [25]. Potential field is defined as the combination of a repulsive field to push robots away from obstacles and attractive fields to drive a robot to a goal. Meanwhile, dynamic window approach (DWA) [26] generates a set of possible future trajectories of a robot with respect to the current dynamic configuration and evaluates these trajectories to find the optimal solution. As aforementioned, the manner global and local path planning systems allowing navigation of several robots into one narrow area might make the robots to block each other, i.e, lead to dead-or live-lock situations. To address the problem of congestion when routing multiple robots into a crowded place, a group of the robots should proactively define effective routes to reduce conflicts with each other. The problem presented here has some similar key features with traffic control in urban areas. The most common method used to ensure safe transportation is to have traffic signals to control the vehicles and pedestrians through cross ways. A modern monitoring system nowadays can collect information about traffic jams and send notification to drivers to avoid moving toward a crowded area. These approaches lead to the introduction of a congestion control for path planning that is presented in this work to achieve dependability in a navigation system for autonomous robots in a dynamic environment.
Previously, with respect to solving conflicts and sharing of resources, it has been shown that Petri net (PN) provides an effective solution to such problems for an autonomous system [27]. The uses of PN to design autonomous robots are demonstrated by Yasuda et al. [28], Iocchi et al. [29]. This is further explored with the use of other extension with ROS on real robots by Fabre et al. [30] and also with fault tolerance by Miyagi et al. [31], Lusier et al. [32]. Although there have been previous works of using PN on path planning [33], they have mainly focused on the discrete system of robots changing states from one node to another in a static graph, where a node is a particular place or area representing a conflict resource to be resolved with PN. Such a static configuration is highly dependent on the precise estimation of travelling time of robots among the nodes. For an example of a highly accurate navigation system, an automated guided vehicle (AGV) using marked lines on a floor is able to exactly control the moves of robot to follow the planned path. The traffic control for AGV, like zone control, has been well investigated in literature [34,35] to synchronise multi AGVs to enter an area safely. This paper deals with a broader scenario when the robot must be able to recognise and work with humans and/or other multiple robots, in an unstructured and unknown environment.
With regard to PN as an effective tool in dependable autonomous control to resolve conflicting problems, this paper presents a new approach of using it in generating the paths of multiple robots to reduce congestion by avoiding routing many robots into the same place. There are three main contributions of the paper. First, the paper shows how to model the path finding problem for multiple robots with PN and how to analyse alternative paths of each robot to find the optimal configurations with less traffic conflicts along the paths. The optimisation on PN allows to solve a MAPF problem with complex constraints such as stopping robots at intersection to wait for other robots. Second, the paper proposes the way of monitoring moving tokens in the PN to control the traffic of robots at every path crossing to reduce conflicts. The first contribution aims at finding the optimal global path for robots, whilst the second one realises the PN control to synchronise the moving of the robots on the assigned paths. Since the PN is mainly designed to control robot movements, the collisions between robots and humans are addressed by DWA. However, the use of DWA is not effective if it only prevents collisions in a passive manner, meaning that a robot doesn't react until the moving obstacles have moved close enough and then considers them as normal static obstacles. This type of behaviour needs to be avoided. In the third contribution, this paper shows the implementation of the dipole field algorithm on top of the DWA algorithm to present a new method for obstacle avoidance, which is based on the information about directions and speeds of the moving objects. Moving obstacles are described by magnetic dipoles with their moments aligned with the direction of velocities. As a consequence, opposite dipoles will generate repulsive forces to turn the robot so that they do not collide with the obstacles. The capability of PN control with dipole field on DWA allows the whole proposed path planning system to handle both static and dynamic obstacles for the robots' paths. This paper has been extended from initial studies presented in [36]. In [36], the method to construct a PN is to divide the working space into non-overlapping regions with a grid layout where each cell of the grid corresponds to a PN control place. However, a robot usually crosses many grid cells along its trajectories and some of them never play a role in controlling congestion as they are not the junctions of two or more robots' trajectories. This leads to unnecessary defined spaces and transitions, resulting in a sparse PN to control robot movements. In this paper, the following developments have been made to further improve the prior work: In overall, the whole system is evaluated with extensive evaluations of different scenarios. The system is also integrated in the navigation stacks of ROS, one of the most widely used robotic frameworks. The rest of the paper is organised as follows. Through Section 2, the problem of this work is stated, following by the proposed path planning algorithm. Thereafter, in Section 3 the simulation results to evaluate the proposed system are presented. Finally, contributions of the paper and future directions are summarised and concluded in Section 4.

Problem Statement
The basic annotations and preliminaries for the equations to be used are defined as follows.
Robots This work copes with a team of N identical robotic agents A = {a 1 , a 2 , ..., a N } which are simply presented by a unit-circle model with the identical radius r. The robots are freely moving on 2-D plane M = R 2 . The location of each robot i at time t is denoted by x i (t) ∈ R 2 , with corresponding velocityẋ i (t). The area occupied by a robot is denoted by C i (x i (t)) ⊂ R 2 , a circle with the centre at position x i (t) and radius r.

Map and Static Obstacles
The working environment is presented with a simple binary grid map where a set of static obstacles (white pixels) are described by O ⊂ R 2 and robots are freely moving within the black regions of the map M \ O. Let O r be the subset of O dilated by the robot's radius r. It is obvious that if there is no intersection of the moving trajectory of the robot trajectory with O r , then the motion of the robots are collision free with respect to the static obstacles.

Humans and Moving Obstacles
There are M human subjects present in the working environments together with the robots. The information about current locations, planned paths, velocity, accelerations, etc. are shared among the robots, however, they are unable to know the exact trajectories of humans. The information about the dynamics of the humans, e.g. In this paper, all humans will be presented by same radius. Each robot i will then see a set of moving obstacles/humans as Global Paths Each robot is assigned a task to move from a starting point to a goal. Using a global guidance system, a number of alternative paths are planned. These global paths are updated frequently to adapt to the changes in robot trajectories. Let a set of global paths for a robot i be where the path consists of connected line segments. Different paths that are leading into the same goal will simply extend the area G i , possibly defined by a circle with a predefined radius.
The problem to be addressed in the proposed work has the aim that all robots should be able to complete their navigation tasks without colliding with any static and moving obstacles in the working environment. For all times from t 0 i to t 1 i for a robot i to travel along its trajectory, the collision constraints are expressed by, which is correspondent to no intersection of the trajectory with static obstacles, other robots and moving obstacles in its environment. The requirement to reach the goal is given by: Routing several robots into narrow areas could lead to congestion, that may lead to a deadlock situation or prolong the time to complete the navigation task. This is due to the reason that robots need to go around to avoid collisions with each other. The main problem to be solved in this work is to tackle the issues defined by Eqs. 1 and 2 with the use of PN to synchronise the movements of robots to find less not only collision-but also congestion-free paths to reach their goals correctly.
The solution starts with the construction of multiple paths for each robot (Section 2.2). A path, which is presented by a set of line segments connecting from a starting point to a goal, is found by the Theta* algorithm. Continuously, each robot broadcasts its paths to all robots within the working space. By checking the intersections of the planned paths, PNs are constructed to analyse the movements of the robots when they enter the intersection regions (Section 2.3). In each PN, a single robot is correspondent to one token of the network. An optimisation algorithm is performed to choose the path for each robot to minimise the conflict/congestion and the travelling time to complete the task (Section 2.4). Once the optimal path is found, the robot then follows the path and synchronises with other robots to avoid collisions (Section 2.6). Collisions with other obstacles that are not able to communicate their planned trajectories with the robots, e.g. humans, are addressed by using DWA combined with a dipole field (Section 2.7). In the case that one of the robots is getting stuck, the global path planning is reactivated to generate a new path from current position to the goal, and PN planning is updated with a new moving control plan. The overall architecture of the system is described in Fig. 1.

Multiple Global Paths
A global path planning algorithm needs information regarding the environment e.g. as a scanned map beforehand to generate a path from a starting point to a desired goal. The global path planning mainly applies for a static environment. This means that in order to compute the global path to a goal for the robot, the presence of other robots and humans, i.e. dynamic objects, is not taken into account. Theta* [9] has realised an any-angle path algorithm, which is able to find an optimal path in any direction, by adding a line-of-sight detection function to each search iteration. Unlike A* or Dijkstra, the path found by Theta* is a connection of line-of-sight nodes so that the generated path is smoother, more realistic, and has fewer turns. With regard to those advantages of Theta*, it has been used as the main global path planning in this paper. To generate multiple paths, Theta* is utilised to sequentially find a new path after inserting a set of found paths into an obstacle map. This way, the next found path does not overlap with the previous ones ( Fig. 2). A backup path is critical if the robot needs to pass through a narrow area with the possibility of facing another robot so that it has to find an alternative way. On an empty space, the multiple paths created by this approach can be close to each other. Therefore, to reduce the complexity of the control PN, multiple paths are only added if separation among them are big enough. The algorithm to find multiple paths from a starting point to a destination is summarised in Algorithm 1.

Petri Net Construction
The basic definitions of a PN are given as follows.
Dilate ξ i by the radius of the robot and add it into O r ; Ξ := Ξ ∪ ξ i ,; end end end elements of P and T respectively, and W ⊆ (P × T )∪ (T × P ) is a set of arcs connecting from a place to transition and vice versa.
In a PN, an input place of a transition is defined as the place from which an arc goes into the transition. Meanwhile, the contrasting one, i.e, the output place, connects to an arc going out from the transition. Both the input and output arcs of the transition are added by non-negative integer weights. With regard to a set of output weights O and a set of input weights I , PN is described as a set of five tuples G = P , T , W, O, I . Places in a PN may consist of a number of marks named tokens. Let O be a two dimensional matrix of weights O(p i , t j ) from the place p i to the transition t j . I is similarly defined by the weight I (t j , p i ) from the transition t j to the place p i . It is noted that 1 ≤ j ≤ |T |, where |T | is the total number of transition. Thus, a change of the marking vector of a transition from M to M is given by a finite sequence of transitions It is said that the marking M is reached by the marking M by firing t. In the vector form, this transition marking is presented as where I p = I(·, p) is the column vector p-th of the matrix The pipeline to formulate the control PN from multiple global paths includes four main steps: First, a set of intersection regions are found by checking all crossings of the global paths; Second, the list of intersection candidates are reduced by merging close intersections; Continuously, the control places and the places before and after an intersection are located; Finally, flowing all the links between the places, the PN is created.
First of all, to find the area of intersection, the thickness of the path is dilated by the radius of the robot. For a pair of paths, there are one or more intersections formed along the paths. Due to the dilation of the path, an intersection is not presented as a point, but a polygon. Consequently, small polygons are filtered out based on their perimeters. The centre of a polygon is estimated by averaging all middle point of the segments passing through the polygon. PN is applied to avoid the congestion by granting permission to just one robot to pass through an intersection at an instance of time. The trajectory crosses that happen inside a narrow area may even lead to an unexpected situation of making robots getting locked and not able to escape from the congestion area. Therefore, other control places are created to handle such an area to limit the number of robots allowed to pass through it. For simplicity, those types of areas are manually given in this work by providing a list of polygons to cover the restricted areas.
Meanwhile, it is also inefficient to have control places which are very close to each other. To address this issue, based on the positions of the intersection, the places are grouped by hierarchical agglomerative clustering [37] until the distances among every places are greater than a predefined threshold. It is also important to avoid creating places nearby the sources, goals, and the restricted areas. The combination of merging the paths and remaining only the important control places helps to create a compact PN to control the movements of robots. In summary, a set of intersections are estimated by Algorithm 2.
In the next step, each place of the PN corresponds to a region of intersection and is assigned the location at the centre of that region. An example of creating a control place is described in Fig. 4 with three control places numbered 6, 7, and 8, respectively. The three green circles present the starting positions while the three blue circles are the destinations. Three control places are marked with red circles. Control place 8 lies at the middle of the intersected area of two paths. Meanwhile, the control place 7 is formed by grouping the intersection from three paths. The restricted area 6 is defined to be inside a corridor to allow only one robot to pass through at a time. Although there is an intersection close to the restricted area 6, no control place is created as it stays on the entrance (or the exit) of the corridor. The two places p  but their centres stay at the crossing points between the path ξ i and I j i . Some special cases are described in Fig. 6. In the first situation (Fig. 6.A) where the separation of two intersections is less than the total distance D Finally, the PN model is constructed by Algorithm 3 to synchronise the movements of multiple robots. It is noted that a cross is considered as a space resource allocated to a single robot each time. In this algorithm, a place of PN is added according to a cross and a transition represents a physical connection between two adjacent crosses on a path. An example to add a PN place to control the movements  of two robots through a cross is described in Fig. 7. The places assigned for the moving paths of Robot 1 and Robot 2 are named by P 1 {index} and P 2 {index} respectively (the places P 1 2 and P 1 3 for Robot 1, P 2 2 and P 3 3 for Robot 2). The control place C is made to manage two robots to pass through the intersection.

Estimation of Optimal Configuration
When a path has many crossings, it means that the robot's travelling time is prolonged. Therefore, the PN construction is improved by first setting multiple paths for each robot to its goal and then choosing the optimal configuration of PN to minimise the travelling time of robots and to avoid deadlock situation. In Algorithm 3, T = [τ 1 , τ 2 , ..., τ |T | ] T is the approximation of the delayed travelling time over a transition. Assume that all robots are configured to have the same speed, the travelling time is therefore proportional to, or for simplicity is equal to, the travelling distance from one transition into another. With regards to T , the firing rule is modified to consider the delayed time in the transition firing: After a transition t i is enabled, it will fire after τ i time units. Those extensions actually introduce timed PN J Intell Robot Syst (2022) 104: 69 Let M j , j = 1, 2., .., n denote a sequence of markings and the firing count vector ς j that makes the moves of one marking to the next, M j = M j −1 + C · ς j , j = 1, 2, ..., n. The total of the corresponding travelling times of all robots after firing the sequence ς j , j = 1, 2, ..., n is estimated by n j =1 f (ς j ), where f (ς j ) is the delayed time of the whole PN after ς j is performed. For simplicity, assume that each robot is setup with the same number, k, of alternative paths to reach the goal. Let z i,p be a set of N × k binary variables defined according to how each robot choose its path, The constraint that only one path is assigned to a robot is expressed by, k p=1 z i,p = 1.
Therefore, ς j (z i,p ) is a firing sequence decided by the selection of z i,p . The optimal problem to find best assignments of paths to robots is stated as follows, Solving the optimal problem in Eq. 8 provides a solution for the optimal path for each robot to construct the control PN. Thanks to PN simulation tools [38], different combinations of assigning the k−th paths to the robots are examined to find the best configurations of z i,p to minimise n j =1 f (ς j ). After this, the PN is fixed with the optimal path selection (the paths which are not selected are disabled) and the use of PN control in path planning is applied in the controlling stage.

Computational Complexity and Solution at a Large Scale
The computational complexity of the approach is the time needed to evaluate all combinations of assigning different paths to a robot. Assume that there is a maximum of K paths for each robot and N is the total number of robots, the evaluation of all samples runs in O(K N ). As the computational complexity grows exponentially with respect to N, the algorithm will take a long time to find the optimal solution with a large number number of robots. Therefore, in order to apply the solution in a large scale, in this paper, the optimisation with a subgroup is proposed. The whole working space is divided into separated zones where the movements of robots inside each zone are synchronised by a PN. A robot travels through a sequence of zones until reaching its goal. As shown in the example of Fig. 8, Robot 1 (r1) has to pass through Zone 1, Zone 2, and Zone 3 to approach its goal. To join a zone, robots wait at the boundary of the zone until the controlled PN of the zone is free. Therefore, Robot 1 and Robot 2 (r2) must stop until the PN at Zone 2 completes controlling Robot 3 (r3) and Robot 4 (r4). Let D be the maximum number of robots allowed to

Movement Control on Optimal Paths
The whole system is organised in a centralised manner where one master ROS node is used to create the PN model for each robot and share information with all other robots via communication channels. Once the above PN model is built, the realisation of the path planning becomes the step-by-step moving of a robot from one place to another along the generated trajectory until the robot reaches its goal. The movement of each robot is controlled by firing the enabled transitions and following a sequence of places visited by the token assigned to that robot. Each robot is linked to one token given Algorithm 3. The movement controlling algorithm for a team of robots is implemented by the Algorithm 4. The PN planning mainly deals with the congestion of robots by preventing several robots to be navigated into a narrow place at the same time. The prerequisite requirement for this control with PN is that the global planner paths are shared by the robots. However, this is not applied for humans and other moving obstacles unless they explicitly share their moving trajectories. Thus, to avoid these objects, the local path planning with DWA and dipole field is applied.

Dipole Field and DWA for Moving Obstacle Avoidance
The dipole field has been introduced by the authors for obstacle avoidance with moving human or robotic agents [39]. In this method, robots and obstacles (named agents in general) are modelled by small magnets with their dipole moment vectors aligned with the moving directions of the robots and obstacles. In a consequence, two agents moving toward each other or having the same poles facing each other are pushed far away by magnetic forces. The magnetic field B of the dipole moment vector m generated by an agent is given by where d is the distance vector, d = d ,d = d/ d , in which ||.|| denotes the norm of the vector, and ρ is a constant. The magnetic moment m is designed to be aligned with the moving directions of the agent and its magnitude is proportional to the speed of the agent. An agent with the where the gradient ∇ presents the change of potential m j ·B k generated per unit distance, and ρ is a constant. Dynamic window approach (DWA) was introduced by Fox et al. [40] and later developed in ROS [26] to use multiple constraints of velocity limits, of acceleration limits, and of following the predefined global path into the local path planning. The local searching space is reduced to dynamic windows in a three-step progress (Fig. 9). Firstly, DWA considers the robot's trajectories as circular trajectories or curvatures determined by a set of translation and rotation velocities (v i =ẋ i (t), ω i ). Secondly, only admissible pairs of (v i , ω i ) corresponding to their trajectories are considered if the robot is able to move forward without colliding with obstacles. Finally, the dynamic window limits the admissible velocities to those that the robot can safely reach to the goal in a short time with optimised accelerations.
Only the pair of (v * i , ω * i ) is selected if the objective function reaches to maximum, where F (v i , ω i ) = f s q s Q s (v i , ω i ) , Q s (v i , ω i ) and q s are the cost function and the corresponding weight, and f (·) is an optional function used to smoothen the weighted sum of the above components. In the implementation of DWA on the ROS navigation stack 1 , the objective function given by Eq. 12 is used to score the admissible velocities, With regards to the grid-based presentation of the cost map, D(v i , ω i ) is the total cost of the grids passed through by the trajectory (v i , ω i ). Meanwhile, H (v i , ω i ) is a goal heading which is the distance from the endpoint of the trajectory to the goal, P (v i , ω i ) is the distance from the endpoint of the trajectory to the global path, α, β and γ weight each term of the equation. The trajectory with the minimised cost F (v * i , ω * i ) in Eq. 12 is the one with the highest possibility of avoiding static obstacles, going toward the goal, and being close to the planned path. Additionally, the dipole field has been introduced in this paper to take into account the moving directions and velocity of the obstacles to avoid the collisions with dynamic obstacles.
To integrate the dipole field into DWA, it is noted that the potential changes reflect the repulsive forces that prevent agents to collide with each other. Therefore, the new terms Ω j (v i , ω i ) are added into Eq. 12 to weight the trajectories based on the possibility of collisions or to minimise the total dipole foces applied on the robots. The dipole field term is removed if Ω j (v i , ω j ) does not generate repulsive forces (two objects do not move toward each other). The pair (v i , ω i ) and its trajectory are also removed if they will cause the collisions with another robot or moving objects with regards to the velocity constraints introduced by Fiorini and Shiller [19]. It means for every neighbouring robots j = i around robot i,

Simulation and Evaluation
The whole system is implemented in the Robot Operating System (ROS) platform [41] with version Kinetic Kame in Ubuntu 16.04 on a laptop with Intel i7 CPU of 8 cores and a clock frequency of 2.90 GHz. The Gazebo simulator is used to provide realistic simulation of robots working in indoor environments. The simulated robotic agents are Husqvarna research platform (HRP) 2 mowers with an extra depth sensor, and a LIDAR laser scanner. The centralised control is performed by a master ROS node and robots in the working space share their locations and global path information through ROS messages. The PN planning module is programmed in Python with the support of the SNAKES library [38]. Meanwhile, the global path planning with the Theta* algorithm and the obstacle avoidance with dipole field on DWA are implemented in C++. In all experiments, it is assumed that all robots travel with the same speeds. Therefore, instead of using the time to analyse PN, the length of the travelling path is used as the delayed time at each transition in all simulated experiments.

Simulation Scenarios
Two simple scenarios are simulated in this section to describe the working principle of the system.

Scenario 1
In the first scenario, the two robots are placed at opposite sides of a corridor where one of them moves from top to bottom and the other robot travels in the opposite 2 https://github.com/HusqvarnaResearch/hrp direction (Fig. 10). The size of the map is set to 10 × 10m. The global path planning returns two alternative paths for each robot where it has one goal location. The shortest path  The conventional PN to visualise the generated PN model in Fig. 11 goes through the corridor and the other goes around the sides, outside the corridor. For both two robots described in Fig. 10.B, Path 1 is the shortest path going through the corridor while Path 2 goes by the side of the map. The green and blue circles indicate the starting and ending positions of the robots while the red one is the intersection point if both robots choose the shortest paths. Thus, if both robots choose the shortest path, they will meet each other at the centre of the corridor. A restricted area lying at the middle of the corridor is defined to allow only one robot to pass through. A timed PN is created for the whole system to analyse the best configurations for all robots (with its visualisation given by the SNAKE library in Fig. 11 and by the conventional PN definition in Fig. 12). The delayed times at transitions are given in Table 1, where the delays at transition t 1 and t 5 are 6.45 and 6.51 meters, corresponding to the travelling distance of robots through the corridor. Different combinations of pairs of paths for the robots are composed to find the best routes for the robots. If both Robot 1 and Robot 2 select to follow Path 2, the firing sequence of transitions is {t 3 → t 7 }, the total travelling time of two robots is 24.95. Similarly, if Robot 1 chooses the Path 2 and Robot 2 chooses Path 1, the corresponding Table 1 The time delay at all transitions in the PN model. The time delay is equivalent to the travel distance from one place to another, so that the time unit is not necessary used in experiments With the best routes selected by analysing the timed PN, the congestion does not happen in this scenario (Fig. 13).

Scenario 2
In the second scenario (Fig. 14), Robot 2 moves as previously while Robot 1 moves from left to right. In this scenario, the shortest path of one robot crosses that of the other at the centre. The size of the map is also 10×10m, and each robot is able to find two different paths from starting to ending points. Similar to the previous experiment, the green and blue circles mark the starting and ending location of the robots, the red circles mark the control places. For Robot 1, starting and ending points are marked by green circle nr. 0 and blue circle nr.
which is in case that both of the robots choose to follow Path 1, the travelling cost is 20.20. If the robots select Path 2 which is indicated by the sequence 19 → t 9 }, the total travelling time is 29.73. Finally, 46.65 is for the sequence {t 10 → t 11 → t 5 → t 12 → t 6 → t 13 → t 14 → t 7 → t 8 → t 9 } when Robot 1 chooses Path 2 and the other follows Path 1. With regard to those analyses, both robots select the path with the same index 1 to follow. In this case, the PN control (Figs. 15 and 16) is applied to synchronise the movements of two robots passing through the cross (Fig. 17).

Evaluation
In this section, the performance of the proposed method is verified through the evaluation of all the proposed scenarios.

Multiple Robots Controlled by a Single Petri Net
An extensive evaluation is performed with four robots that are placed at different sides of the map (Fig. 18). The size of the map is set to 20 × 20m. Similar to the previous examples, each robot is able to plan two different routes to reach its goal. PN is used to analyse the best routes and to synchronise the movements of robots. The experiment is repeated 100 times by randomly defining different goal locations. For each run the starting points are kept the same. One restricted area is established inside the corridor.
The results of three example runs are depicted in Fig. 19. The first row describes the situation of the cross present inside a corridor. Without PN control, the two robots follow the shortest paths leading to the collision inside the corridor even though the two robots manage to keep on moving to reach their goals (Fig. 19(A 1 )). The result with PN control is depicted in the Fig. 19(A 2 ) showing that no collisions occur as one robot goes outside the corridor. In the second row, there are figures expressing the case in which the cross is on an empty space. Without the PN control, a collision happens ( Fig. 19(B 1 )). The PN control helps to make the moving trajectories of both robots smooth ( Fig. 19(B 2 )). A  comprehensive example of all paths included in a single PN model is given in the third row. There is no collision in two cases of not using (Fig. 19(C 1 )) or using the PN control ( Fig. 19(C 2 )). However, in the latter, with the designed path of less intersections, a robot with the red global path is able to keep a smoother trajectory to the goal. In Fig. 20 the histogram of all occasions of of when the distance between two robots is less than 1.0 meters in     [26]. The mean of those distances is 0.76m for the PN control with a standard deviation (STD) of 0.14m. Similarly, the mean and STD are 0.73m and 0.14m for the DWA-baseline algorithm. Using PN to synchronise the movements of robots, the occasions when the distance is less than 0.7 meters are reduced significantly. Note that collisions risk to happen when the distance between two robots is in the range [0.5, 0.7] meters, dependent on whether they are moving toward each other in an opposite direction or they glide over each other. The summary of the number of collisions and dead locks are given in Table 3. The deadlocks happen when the two robots have a strong hit that makes them stop or a robot is locked inside a corridor. There are several cases where a goal lies on the designed global path of another robot. Once a robot finishes its path and ends at the goal, it consequently stops its PN control, leading to a collision and/or even a dead lock. In these cases, two dead locks are recorded although PN control is used.

Multiple Robots Controlled by Multiple Petri Nets
The optimisation by subgroups proposed in Section 2.5 shows the way to search for optimal global paths with a large number of robots in a working space. An experiment is designed in this section to evaluate the effectiveness of the proposed approach. The map with the size of 40 × 40m with several obstacles and narrow corridors is used. Sixteen robots are distributed equally on each side of the map. The starting positions of the robots are fixed while their goals are uniformly assigned with the separation of 3 meters. The minimum distance between a starting point and its corresponding goal is 5 meters. As depicted in Fig. 21, the whole working space is divided into 9 non-overlapping zones where each zone is controlled separately by one PN controller. Zone 4 takes into account only restricted areas (narrow corridors in the map) to create a PN. The maximum number of 8 robots is allowed to simultaneously enter a zone while at most three global paths are assigned for a source-destination pair. The experiment is repeated 10 times so that a total number of 160 path planning tasks are evaluated. The proposed algorithm is compared against the baseline DWA [26] and VO-based DWA path planning [25]. Table 4 summarises the success rate of completing the navigation tasks of the three methods. The baseline DWA mainly fails due to the collisions of robots in both empty spaces and narrow areas. The VO-based DWA has a deadlock problem when two robots meet each other in a narrow corridor. Meanwhile, the failures of the PN-based control usually happens with a single robot when it has a sharp turn and collides with a wall. In overall, the PN-based control is superior to the other two methods on its reliability to complete the path planning tasks in this experiment. The processing time (averaged over 50 trials) of running a PN to find the optimal configuration of global paths is 8.7 seconds. Note that the optimisation is only required once a group of robots enters a new zone.

Robots and Humans Sharing Working Space
In this experiment, two humans acting in the simulated working space, which contains three robots, are added. The  Fig. 22 The working space with two humans and three robots simulated in Gazebo human actor plugin of Gazebo simulator is customised to allow control the moving trajectory of a human subject by repeated patterns (Fig. 22). The blob-based detection using the laser scanner developed by the SPENCER project [42] is applied to track the trajectories and velocities of moving human subjects and obstacles in the working space. The same size of the map as earlier experiments of 20x20m is used. The experiments were repeated 5 times. For each trial, the static obstacles, start and goal positions of both robots and humans are randomised. The minimum distances between human and robot are recorded for evaluation purposes. The overall result is summarised in Table 5. With the combination with the dipole field on top of PN path planning, there are no collisions between robots and humans recorded.

Conclusions and Discussion
This paper has introduced a novel path planning algorithm for multiple robots using PN in combination with obstacle avoidance with dipole field. To avoid the congestion of routing several robots into an intersection area, PN is utilised for path optimisation and then movement control of robots. Particularly, the path planning system establishes alternative routes to the goal for each robot. Due to the effectiveness PN demonstrates in analysing the travelling time from the starting position to the goal point, the optimal paths are assigned to the robots to increase the chances of a robot to reach its goal without facing dead-lock situations inside narrow areas. In the second stage when the global paths are configured with optimal selection, the PN model is applied to control the movement of robots to ensure that a robot is allowed to pass through a cross one by one. The experimental results with Gazebo simulator have revealed that the PN control is able to synchronise the movements of multiple robots passing through the intersection, which helps to both avoid dead-lock and shorten the travelling paths of the robots. Meanwhile, the dipole-field implemented with DWA is able to advance the local path planning with an ability to avoid moving humans and other robots as well as uncontrolled obstacles in the shared workspace, as is also shown in simulations. Thus, these results show that the presented algorithm can improve the dependability aspect of a navigation system, based on path planning, for multiple robots system. Some limitations of the proposed method are listed to be addressed in the future. First, in the current configuration one token is assigned at the control PN place of a narrow area, which allows only a single robot to pass through. This is to minimise the possibility of a deadlock happening inside the area. If the control area is big enough, it is more optimal to allow several robots to enter it at the same time. Therefore, the correlation between the size of the region and the number of a token at the control place should also be considered as a factor to be optimised with the modelled PN. Second, the proposed system is mainly implemented in a centralised manner. Thus, a shift into a distributed algorithm is considered so that the system is not so dependent on a single central server to operate. Third, recent searches [43] have shown the new way to address multi-agent path planning with uncertainty using Markov decision process and congestion probabilities. Studies about extending this approach into the PN control model presented in this paper may help to find a faster way to solve the path selection optimisation. Lastly, the evaluation of the proposed approach is currently performed only as a Gazebo simulator. It is plausible to assume that an implementation in real robots will help to evaluate the proposed algorithm.
Author Contributions All authors, including Lan Anh Trinh, Mikael Ekström and Baran Cürüklü, contributed to the writing of the manuscript. Lan Anh Trinh implemented the algorithms and performed the testing experiments.
Funding Open access funding provided by Mälardalen University. The research leading to the presented results has been undertaken within the research profile DPAC -Dependable Platform for Autonomous Systems and Control, funded by the Swedish Knowledge Foundation.
Availability of data and materials The data are available upon request.

Code Availability
The source codes are available upon request.

Conflict of Interests
The authors declare that the research was conducted in the absence of any commercial or financial relationships construed as a potential conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons. org/licenses/by/4.0/.