Multi-Robot Patrolling with Sensing Idleness and Data Delay Objectives

Multi-robot patrolling represents a fundamental problem for many monitoring and surveillance applications and has gained significant interest in recent years. In patrolling, mobile robots repeatedly travel through an environment, capture sensor data at certain sensing locations and deliver this data to the base station in a way that maximizes the chances of detection. Robots move on tours, exchange data when they meet robots on neighboring tours, and so eventually deliver data to the base station. In this paper we jointly consider two important optimization criteria of multi-robot patrolling: (i) idleness, i.e. the time between consecutive visits of sensing locations, and (ii) delay, i.e. the time between capturing data at the sensing location and its arrival at the base station. We systematically investigate the effect of the robots' moving directions along their tours and the selection of meeting points for data exchange. We prove that the problem of determining the movement directions and meeting points such that the data delay is minimized is NP-hard. For this purpose, we define a structure called tour graph, which models the neighborhood of the tours defined by potential meeting points. We propose two heuristics that are based on a shortest-path search in the tour graph. We provide a simulation study which shows that the cooperative approach can outperform an uncooperative approach where every robot delivers the captured data individually to the base station. Additionally, the experiments show that the computationally more expensive heuristic performs slightly better on average than the less expensive heuristic in the considered scenarios.


Introduction
The interest in using mobile robot teams for surveillance and monitoring of environments over longer periods of time has emerged with the advances in the fields of robotics, computation and communication. Examples of applications include disaster response [7], [35], [17], wildfire monitoring [12], security tasks [20], environmental monitoring [31], and exploration and mapping [24], [32]. The mobility of the robots extends their sensor coverage and allows areas to be monitored that cannot be covered efficiently with static sensors, which also have to be deployed. The drawback is that not all areas of the environment can be monitored at the same time. In certain scenarios it is not only important that certain locations of interest (referred to as sensing locations) are visited repeatedly, but also that the data captured by the sensors of the robots is transmitted to a base station in due time. This allows human mission operators to quickly assess a situation, or the collected data to be promptly processed for another purpose. We assume that the mobile robots and the base station are equipped with wireless transceivers to exchange data as well as sufficient memory to store the data. This enables the data to travel to the base station via multiple robots in a store-and-forward fashion. Two optimization criteria are essential to this multi-robot problem: idleness and delay. The first describes the time between consecutive visits at a sensing location, and the second describes the time between the capturing of data at a sensing location and its arrival at the base station. To optimize or constrain these criteria, coordinating the movement of the robots is necessary.
Monitoring of an environment over long time periods is related to the patrolling problem, where mobile robots continuously travel and sense the environment. While idleness is a common optimization criterion for the patrolling problem [21], [6], [19], [30], explicitly minimizing or constraining delay has received much less attention in the literature. In contrast to most of the existing work, we focus on cooperative data transportation, which eliminates the need for every robot to make detours to the base station to deliver the data. This can improve idleness and allows robots to operate in environments where traveling to the base station is not possible for every robot (e.g. due to obstacles).
Depending on the representation of the environment, determining the optimal solution to a patrolling problem can be computationally demanding. Determining the optimal tours for minimal idleness on graphs, for example, is related to the traveling salesperson problem (TSP) [11] and the k-TSP [10], which are both NP-complete. To decouple the complexity of path planning from planning the coordinated data transport to the base station, we assume that closed tours for each robot are given. Scheduling robots on given tours considering some idleness criterion is a recurring problem in the literature, e.g. [28], [27], [36], [5], and [16].
We consider the following patrolling scenarios. A set of closed tours, which can have different lengths, one for each robot, is given, and the robots are only allowed to move along these tours in a certain direction. Two robots can exchange the data captured on their tours when they are at certain positions on their tours (so-called meeting points), with the aim of transporting the data to the base station lying on the tour of a particular robot (Figure 1; see also Figure 15 for an example of a corridor environment to be patrolled). The goal is to limit the maximum idleness to the lowest possible value determined by the tours and to minimize the delay over all sensing locations. This problem involves answering the following questions: (i) which robots should meet, (ii) in what direction should the robots move on their tours, and (iii) if there is more than one possible meeting point between tours, at which one should the robots meet. Additionally, a schedule has to be determined which describes where the robots should wait for each other (in case one robot meets with more than one other robot on its tour). We show that all three questions are NP-hard and propose heuristics for solving this problem. The first question is related to selecting a tour tree from a tour graph (explained in Section 4) and is therefore termed minimum delay tree (MDT). The other two questions are related to extensions of MDT and are termed MDT with directions (MDTD) and MDTD with meeting points (MDTDM).

Figure 1: Example of a multi-robot patrolling scenario at three time instances (from left to right). Positions and directions of the robots are indicated by triangles, the base station is depicted as a filled square. There is a dashed line between robots if they exchange data. The robots move along fixed tours (depicted as circles) and exchange data with robots on neighboring tours.
The contributions of this work can be summarized as follows: We formulate the MDT problem and its extensions and show that they are NP-hard. We describe an algorithm that efficiently constructs a solution that has the best possible idleness with the given tours. In case only the directions of the tour traversals have to be chosen (everything else is fixed), we describe a procedure that efficiently constructs a solution that also minimizes the delay. We propose two heuristics for MDTD which select the tour tree and the directions from a tour graph. Finally, we evaluate and compare the heuristics in experimental simulations.
The article is organized as follows: In Section 2 we review the existing literature. In Section 3 we introduce some notation and formulate the idleness and delay criteria. In Section 4 the MDT problem and its extensions are described, and the heuristics for MDTD are presented in Section 5. In Section 6 we describe the algorithm for the online execution once a solution for MDTD has been obtained. Section 7 describes the simulation results and Section 8 concludes the article.

Related work
The multi-robot patrolling problem can be divided into the problem of determining paths in the environment and the problem of controlling and coordinating the robot movement along these paths. In [28] algorithms for the calculation of minimum idleness partitions for robots on a given chain, tree or cyclic graph are presented. In [27] a tour in the environment containing sensing locations with different priorities is calculated, and a control law that coordinates the robots on that tour is developed with the aim of minimizing the weighted idleness. Coordinated patrolling accounting for leaving and joining robots on a linear perimeter with dynamic length is considered in [18]. Similarly, in [2] robots travel along their partition on a linear perimeter and use local coordination with their neighbors to react to changes in perimeter length, number of robots and travel speed. In [36] a velocity controller for robots following individual tours is developed. The goal is to limit uncertainty, which grows in the environment at different rates. The problem of finding tours that meet idleness constraints of sensing locations is considered in [8], and periodicity properties of these tours are investigated. In [26] the long-term goal of minimizing the idleness is converted into a short-horizon control law that selects the next sensing location that should be visited by a robot. In [5] tour planning, dispatching robots on tours and controlling the speed to meet the revisit constraints of points of interest in a wireless sensor network setting with data mules are considered.
Maintaining connectivity is a prevalent requirement for multi-robot task planning [25], [13], [14], [29], [9]. Persistent surveillance considering energy constraints and enforcing persistent multi-hop connectivity to a base station are considered in [33] and [34]. The problem of minimizing the coverage time of an area with recurrent connectivity demands is presented in [3]. In [16] robots travel back and forth along predefined paths between rendezvous points. A distributed controller determines the meeting times such that recurrent connectivity is guaranteed.
A mixed integer linear program (MILP) formulation and heuristics for the problem of finding a patrolling path for each robot with the goal of minimizing the delay are presented in [4]. Each robot follows a path containing sensing locations and intermediate detours to communication sites where the data can be transmitted to the base station. A MILP formulation and a heuristic for a similar problem with task revisit constraints are presented in [23]. Patrolling considering the propagation of information among the robots is considered in [1]. A decentralized algorithm maintains a grid-shaped partition of the area where each robot travels a circular path within its subarea. Each robot exchanges data on the border of its subarea with each robot of the neighboring subareas, which minimizes the propagation time of information in this grid-shaped partition. Table 1 summarizes the most relevant references for this work.

Problem Formulation
We assume that the tours are closed and can have different lengths, all points on a tour are sensing locations, and the tours contain predefined meeting points for the exchange of data with robots on neighboring tours. Robots can exchange collected data if they are at the meeting point that connects their tours at the same time. Since tours can have different lengths, it might be necessary for a robot to wait at a meeting point to meet its neighbor. All robots move with the same unit speed in a particular direction (either clockwise or counterclockwise), but every robot can stop at any point on its tour for an arbitrary amount of time.

Table 1: Most relevant references for this work. Column headers: Reference, Persist., Min. Del., Connect. The marks of each row are listed in their original order.
[23]: X, X, S
[4]: X, S
[9]: X, D
[13]: o, D
[16], [15], [1], [28], [18], [2]: X, D
[33], [34]: X

More precisely, given is a set $V = \{1, \dots, n\}$ of $n$ tours for $n$ robots. There is a one-to-one mapping between tours and robots, and we will use the same variable to identify a robot as well as the tour which it traverses. Every robot $v \in V$ moves along a tour in a particular direction $d_v$ and with unit speed, which is the same for all robots. With each tour a real number $l_v > 0$ is associated, which is the minimum time a robot needs to traverse the tour completely if there are no intermediate stops. Each point on a tour $v$ is from a set denoted $P(v)$ and has a coordinate in a local one-dimensional coordinate system which is determined by an origin on the tour and the direction of the tour. We assume that a subset of the points on a tour are sensing locations, and denote this set by $P_S(v)$, with $P_S := \bigcup_{v \in V} P_S(v)$. The vertices which contain sensing locations are denoted by $V_S$. The position of a meeting point between tours $w$ and $v$ is specified along the local coordinate system as $p^{meet}_w(v)$ on tour $w$, and as $p^{meet}_v(w)$ on tour $v$. The position of a robot $v$ at a certain instant $t$ on its tour is denoted by $p_v(t)$. Vertex $v_0$ identifies the tour which has a connection to the base station at point $p^{BS}_{v_0}$. The tours are the vertices in a tour graph $G = (V, E, v_0, l_v, \mathrm{time}_v, l^d_v)$, with an edge between two tours in $E$ if they are connected by a meeting point. Strictly speaking, a meeting point identifies two different points, one on each of the two tours it connects. The function $\mathrm{time}_v(p, q, d)$ gives the minimum time for a robot to travel from meeting point $p$ to $q$ on tour $v$ in clockwise or counterclockwise direction $d$ (i.e. the distance between $p$ and $q$ under the unit speed assumption without intermediate stops). The function $l^d_v(p, d) : P(v) \times \{cw, ccw\} \to \mathbb{R}_{\geq 0}$ returns the minimum time after which a sensing location can be reached from $p$ when moving in clockwise or counterclockwise direction. This function allows the delay to be calculated after a robot starts its tour from a particular point $p \in P(v)$. In the following, we use the short notation $G = (V, E)$ for the tour graph. The tour graph is connected, such that the data collected by the robots can reach a base station, which is connected to a particular tour.
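The travel-time function on a closed tour can be illustrated with a minimal Python sketch. The class name `Tour`, the constants `CW` and `CCW`, and the convention that the local coordinate increases in clockwise direction and wraps around at the tour length are assumptions made for this illustration; they are not taken from the paper.

```python
from dataclasses import dataclass

CW, CCW = "cw", "ccw"  # hypothetical direction labels


@dataclass(frozen=True)
class Tour:
    """A closed tour whose full traversal takes `length` time units (unit speed)."""
    length: float

    def time(self, p: float, q: float, d: str) -> float:
        """Travel time from coordinate p to coordinate q in direction d.

        Assumption: coordinates increase in clockwise direction and wrap
        around at `length`; no intermediate stops are made.
        """
        if d == CW:
            return (q - p) % self.length
        if d == CCW:
            return (p - q) % self.length
        raise ValueError(f"unknown direction: {d!r}")


tour = Tour(length=10.0)
cw_time = tour.time(2.0, 5.0, CW)    # short way round
ccw_time = tour.time(2.0, 5.0, CCW)  # long way round
```

Under this convention the two directions between distinct points always add up to the tour length, which is why the choice of direction matters for the delay.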
To visit a sensing location $x \in P_S(r)$ at time $t$, a robot $r$ must be at the position of the sensing location at time $t$, i.e. $p_r(t) = x$. A robot $r_1$ visiting a sensing location $x \in P_S(r_1)$ at time $t$ captures and stores the observation data associated with the tuple $(x, t)$ in its local memory. The data is forwarded by robot $r_1$ to another robot $r_2$ at time $t' \geq t$ if $p_{r_1}(t') = x_1 \in P(r_1)$, $p_{r_2}(t') = x_2 \in P(r_2)$, $x_1 = p^{meet}_{r_1}(r_2)$ and $x_2 = p^{meet}_{r_2}(r_1)$. Robot $r_2$ stores the data associated with the tuple $(x, t)$ and forwards it to any other robot it can communicate with at times $t'' \geq t'$. Finally, all data arrives at the base station $x_0 \in X$. We assume that a sensing location $x \in P_S$ is not considered to be visited when the robot stops at $x$, but when it starts again from $x$. In this way it is possible to decrease the delay by deferring the data generation to the latest possible time.
A patrolling strategy $\pi$ is a mapping from instants of time to points in $P_S(r)$ for every robot $r$ and describes when points should be visited by the robots. Two values are associated with a point $x \in P_S$: instantaneous idleness and instantaneous delay. The first describes the time a point remains unvisited, and the second describes the time between the capturing of observation data and its earliest arrival at the base station. The definition of the idleness criterion is adopted from [19] and is extended by the delay criterion.
Definition 1 (Instantaneous idleness, instantaneous worst idleness, worst idleness criterion [19]). If the robots follow a strategy $\pi$, the instantaneous idleness $I^\pi_t(x) \in \mathbb{R}_{\geq 0}$ at time $t$ of point $x \in P_S$ is the elapsed duration since the last visit of $x$ by any robot. By convention, at initial time, $I^\pi_0(x) = 0$ for any strategy $\pi$ and any $x \in P_S$. The worst idleness criterion $WI^\pi$ is defined as […] where $WI^\pi_t := \max_{x \in P_S} I^\pi_t(x)$ is the instantaneous worst idleness.
Note that the definition of the instantaneous idleness covers the situation in which a robot waits at a certain location: the instantaneous idleness of that location stays zero as long as the robot is at that position.
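The instantaneous idleness can be sketched in a few lines of Python. The representation of visits as `(arrive, depart)` intervals and the function names are assumptions of this sketch; the behavior follows the text: idleness starts at zero at time 0, stays zero while a robot is at the point, and otherwise equals the time since the robot last left.

```python
def instantaneous_idleness(t, stays):
    """Idleness at time t of one point.

    `stays` is a list of (arrive, depart) intervals during which some robot
    is at the point (hypothetical representation, arrive <= depart).
    """
    last_departure = 0.0  # convention: idleness is 0 at the initial time
    for arrive, depart in stays:
        if arrive <= t <= depart:
            return 0.0  # a robot is currently at the point
        if depart <= t:
            last_departure = max(last_departure, depart)
    return t - last_departure


def instantaneous_worst_idleness(t, stays_by_point):
    """Maximum instantaneous idleness over all points at time t."""
    return max(instantaneous_idleness(t, s) for s in stays_by_point.values())


stays_by_point = {"a": [(2.0, 3.0)], "b": [(4.0, 4.0)]}
worst_at_5 = instantaneous_worst_idleness(5.0, stays_by_point)
```

At time 5, point "a" was last left at time 3 and point "b" at time 4, so the instantaneous worst idleness is determined by "a".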
Definition 2 (Instantaneous visit delay, instantaneous delay, instantaneous worst delay, worst delay criterion). If the robots follow a strategy $\pi$, the instantaneous visit delay $D^\pi_t(x, t', t'')$ at time $t$ of point $x \in P_S$ is the elapsed duration since a visit of any robot at point $x$ that happened at time $t'$ before the data arrives at the base station at time $t''$: […] The instantaneous delay $D^\pi_t(x)$ of a point $x \in P_S$ is defined as […], where $T^V_x$ is the set of points in time at which a visit at $x$ happens, and $T^R_{(x,t')}$ is the set of points in time at which the data associated with the tuple $(x, t')$ arrives at the base station. The worst delay criterion $WD^\pi$ is defined as […] where $WD^\pi_t := \max_{x \in P_S} D^\pi_t(x)$ is the instantaneous worst delay.

With this notation we define the MDT problem (and, similarly, its extensions) as the optimization problem […] where $\Pi_{MDT}$ is the set of all solutions for the MDT instance, and $l_r$ is the minimal time for robot $r$ to completely traverse its tour. We will show that there exists a feasible solution for every instance of the MDT problem.
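As a rough illustration of the delay notion, the following Python sketch computes the time between the capture of one observation and its earliest arrival at the base station. This is a simplified, per-observation view and not a reconstruction of the instantaneous definitions above; the function names and the list-based inputs are assumptions of this sketch.

```python
def delivery_delay(capture_time, arrival_times):
    """Time between capturing an observation and its earliest arrival.

    `arrival_times` holds the times at which copies of the observation
    reach the base station (hypothetical representation). Returns None
    if the observation has not arrived yet.
    """
    candidates = [a for a in arrival_times if a >= capture_time]
    if not candidates:
        return None
    return min(candidates) - capture_time


def worst_delivery_delay(observations):
    """Largest delivery delay over (capture_time, arrival_times) pairs."""
    delays = [delivery_delay(t, arrivals) for t, arrivals in observations]
    if any(d is None for d in delays):
        return None  # at least one observation is still undelivered
    return max(delays, default=0.0)


worst = worst_delivery_delay([(3.0, [7.5, 10.0]), (1.0, [9.0])])
```

Only the earliest arrival counts, so redundant copies that reach the base station later do not increase the delay.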

Scheduling of robots on tours
In this section we discuss the problem of coordinating robots on predefined tours such that the idleness is bounded to the lowest possible value and the delay is minimized. Coordination comprises selecting the data exchange points where robots should meet and defining a travel direction for each robot on its tour. Selecting meeting points and directions determines the route of the captured data from a sensing location to the base station and has an effect on the delay. We will introduce the basic structure which describes the problem, the tour graph. We will consider different variations of this problem: (i) selecting the directions when there is a minimal number of meeting points given, i.e. the tour graph is a tour tree, (ii) selecting a minimal number of meeting points when the directions are given, i.e. selecting a tour tree in a tour graph, (iii) selecting a minimal number of meeting points as well as directions, and (iv) selecting unique meeting points between tours, i.e. selecting a tree in a tour multi-graph. We will show that the latter three problems are NP-hard.

Figure 2a shows an example of a tour graph which is a tree in this particular case (with $|V| = n = 7$). If, as in this example, the tour graph is a tree and the directions are given, the path of data from an origin to the base station can be easily reconstructed. Assume that robot 1 and robot 3 have just met and robot 3 has sent its collected data to robot 1. After that, robot 1 meets robot 2, and robot 2 continues collecting data as it moves along its tour to the meeting point with robot 1 again. Here, after some time, it meets robot 1 again (possibly it has to wait for robot 1) and sends the new data to robot 1, which travels along its tour to the meeting point with robot 3. Robot 3 receives the data and moves to the meeting point with robot 5. At the meeting point with robot 5, it sends its own data and the data received from robot 1 (and robot 4) to robot 5. Finally, robot 5 sends its own data and all data it received from robots 6 and 3 to the base station.
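The reconstruction of a data path in a tour tree can be sketched as a walk along parent pointers. The parent map below encodes only the hand-overs mentioned in the walkthrough (2 to 1, 1 to 3, 4 to 3, 3 to 5, 6 to 5, with tour 5 holding the base station); robot 7 is omitted because the text does not say where it is attached. The function name and the dictionary representation are assumptions of this sketch.

```python
def path_to_base(parent, start, root):
    """Sequence of tours that data from `start` passes on its way to `root`.

    `parent[v]` is the tour to which robot v hands over its data.
    """
    path = [start]
    while path[-1] != root:
        if len(path) > len(parent) + 1:
            raise ValueError("parent map contains a cycle")
        path.append(parent[path[-1]])
    return path


# Hand-overs taken from the walkthrough; tour 5 is connected to the base station.
parent = {2: 1, 1: 3, 4: 3, 3: 5, 6: 5}
route_of_robot_2 = path_to_base(parent, start=2, root=5)
```

Because the tour graph is a tree, this path is unique, which is what makes the delay of each sensing location well defined once the directions are fixed.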

Selecting directions
We consider the situation where the tour graph forms a tree $T = (V, E)$ (the problem of selecting a tour tree in a tour graph is described in subsequent subsections), i.e. there is a minimal number of meeting points such that the tour graph is connected and the data from each sensing location can travel to the base station. We will first define a structure called schedule, which contains the information for coordinating the robots on their tours. This information contains the start position of a robot on its tour, the direction of traversal, the positions where a robot should stop, and how long it should wait at a particular position.
In a schedule a robot starts and stops at […] The requirement that every robot meets its neighbors defined by the edges of the tree $T$ ensures that all collected data reach the base station (at least if the schedule is repeated). We define a repeated schedule as an infinite-horizon patrolling strategy $\pi^+$ that can be constructed from a schedule:

Definition 4 (Repeated schedule). A repeated schedule $\pi^+$ is a repetition of a schedule defined by […] with […]

The "spacing" between two repetitions of a schedule is defined by a robot $v$ and $\gamma$, the time between the schedules of robot $v$. Basically, inequality (7) states that each robot has to finish its tour before it can start again in the following repetition of the schedule. Obviously, the worst idleness $WI^{\pi^+} \geq L := \max_{v \in V}\{l_v\}$, the length of the longest tour traversed by a robot.
Figure 2b shows two repetitions of a schedule of the tour tree in Figure 2a. This repeated schedule can be defined by $(\pi, 1, 0)$, for example. Robots 1 and 3 start at their meeting point (described by the undirected edge $[1, 3]$ in the tour graph) at the same time, and robot 1 moves without intermediate stops, whereas robot 3 has to wait for robot 1 when it has finished its tour. As robot 1 moves on its tour, it meets robot 2, which is waiting for robot 1 at the meeting point $[1, 2]$. Robot 2 starts to move, finishes its tour, and waits for robot 1 to meet again at the meeting point. Note that the minimum worst idleness schedule in Figure 2b does not necessarily minimize the worst delay. For example, when robot 2 has finished its tour, it has to wait for robot 1 to transmit the captured data to it, which imposes a delay on the data of robot 2 (robot 2 could have postponed its start such that no new data is captured for a certain amount of time after it transmitted its data to robot 1).

Proposition 5. A repeated schedule $\pi^+ = (\pi, v, 0)$, with $v := \arg\max_{v \in V}\{l_v\}$, can be constructed from any schedule $\pi$ with no intermediate waiting times, i.e. […]
Proof. Since every robot follows its tour without intermediate stops and meets all neighbors, after a finite time $\max_{v \in V}\{l_v + \mathrm{wait}_v(p^{start}_v)\}$ all robots have returned to their starting positions. A repeated schedule must fulfill the inequality […] Since the difference between the start times at the starting positions between $v$ and any $w \in V$ in each repetition of the schedule $\pi$ is $\Delta_{wv}$, and the difference between consecutive start times of $v$ is $L$, also $WI^{\pi^+} = L$.
Restricting the waiting times to the meeting point of $v$ with its parent, i.e. to point $p^{start}_v$, has no negative impact on the delay, since waiting at any other position on the tour cannot decrease the delay when the tour tree is given. Moreover, the data generation is deferred if $p^{start}_v \in P_S(v)$ due to the assumption described in Section 3. With a given tour tree, selecting the directions has an impact on the worst delay $WD$. Compared to the schedule with counterclockwise directions in the lower part of Figure 3, the schedule with clockwise directions in the upper part results in a lower delay.
Algorithm 1 determines the schedule with directions of a given tour tree $T = (V, A)$. To identify the direction of an edge, the arc set $A$ is used, where each edge from $E$ is directed towards the root node $v_0$, which contains the base station. In Line 1 the recursive procedure $\mathrm{rec}(v, u)$ is called. This function returns the maximum delay for a branch of a tree originating at tour $v$, including the path of the data on tour $v$ to its parent $u$ (the parent of $v$ is the unique node $u$ in an edge $(v, u)$). If $v$ is a leaf, the direction is chosen that leads to the smaller delay when robot $v$ starts at $p^{start}_v$. For tour $v$ the procedure tests which direction results in a smaller delay on tour $v$ given the maximum delays of the branches (Lines 20 and 21). The function $\mathrm{time}_v(p, q, d)$ returns the time it takes to travel from point $p$ to point $q$ on a tour $v$ given the traversal direction $d$. Additionally, the procedure calculates the differences in the starting times and stores them in the variables $\Delta_{vw}$ (Line 26). These values are used to determine the starting times of the robots (Line 10).
The starting point $p^{start}_w$ of a robot $w$ is set to the meeting point $p^{meet}_w(v)$ with its successor $v$ in the traversal order (Line 9). The only point where $w$ has to wait is the meeting point with $v$, and the waiting time is the sum of the waiting time of the successor $v$ and the previously calculated value $\Delta_{vw}$ (Line 10). This produces a schedule where $w$ waits for its successor $v$, starts moving as soon as it has met $v$, and follows the whole tour without intermediate stops. Finally, the wait times are shifted to be positive.

Algorithm 1 (only fragments of the listing survive)
Input: Tour tree $T = (V, A)$, base station vertex $v_0$, position of the base station (on tour $v_0$) $p^{BS}_{v_0}$, meeting positions
[…]
for $w \in V$ with $(w, v) \in A$ do
[…]
for $(w, v) \in A$ do
19: $M_w \leftarrow \mathrm{rec}(w, v)$
20: $d_v \leftarrow cw$
[…]
for $(w, v) \in A$ do
27: […]

Proposition 6. Algorithm 1 produces a schedule $\pi$, from which a repeated schedule $\pi^+ = (\pi, v, 0)$, with $v := \arg\max_{v \in V}\{l_v\}$, can be constructed. The worst idleness $WI^{\pi^+} = L$. Furthermore, $\pi^+$ minimizes the worst delay.
Proof. In the loop in Line 6 the start times are chosen such that every robot $v$ meets its neighbors $w$, $\forall (w, v) \in A$, without intermediate stops. The worst idleness of $L$ follows from Proposition 5. Now we show that the algorithm produces a schedule with minimum worst delay. Note that because of the chosen starting times, all captured data on a tour of a robot travels to the base station within the same schedule (assuming no repetition). When the minimum worst-case delays $M_w$ to node $v$ towards the base station in a call of $\mathrm{rec}(v, u)$ are known (which is certainly true if $w$ is a leaf), then $\max\{\min_{d \in \{cw, ccw\}} [\ldots]\}$ is also the minimum worst-case delay of all data, including the data captured by $v$, until the meeting position of $v$ with its parent towards the base station.
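The recursive idea behind the direction selection can be sketched in Python under strong simplifying assumptions. This is not the paper's Algorithm 1: start times, waiting times and the values $\Delta_{vw}$ are ignored, every point is assumed to be a sensing location (so the data captured on a tour needs at most one full traversal to reach the exit point), and coordinates are assumed to increase clockwise. The names `select_directions`, `children`, `length` and `meet` are hypothetical.

```python
def select_directions(children, length, meet, root, base_pos):
    """Choose a direction per tour in a tour tree, bottom-up.

    children[v]: child tours of v; length[v]: traversal time of tour v;
    meet[v][w]: coordinate on tour v of the meeting point with tour w.
    Returns (worst delay at the base station, {tour: "cw" | "ccw"}).
    """
    directions = {}

    def travel(v, p, q, d):
        # Assumption: coordinates increase clockwise and wrap at length[v].
        return (q - p) % length[v] if d == "cw" else (p - q) % length[v]

    def rec(v, exit_pos):
        # Worst delay of each child branch and where it enters tour v.
        branches = [(rec(w, meet[w][v]), meet[v][w]) for w in children.get(v, [])]
        best = None
        for d in ("cw", "ccw"):
            # Own data needs at most a full traversal; branch data still has
            # to be carried from its entry point to the exit point.
            worst = max([length[v]] + [m + travel(v, p, exit_pos, d) for m, p in branches])
            if best is None or worst < best[0]:
                best = (worst, d)
        directions[v] = best[1]
        return best[0]

    return rec(root, base_pos), directions


# Hypothetical instance: tour 0 holds the base station at coordinate 0,
# tour 1 is a leaf that meets tour 0 at coordinate 2 (on tour 0).
delay, dirs = select_directions(
    children={0: [1]},
    length={0: 10.0, 1: 4.0},
    meet={0: {1: 2.0}, 1: {0: 0.0}},
    root=0,
    base_pos=0.0,
)
```

In this instance, carrying the data of tour 1 counterclockwise from coordinate 2 to the base station takes 2 time units instead of 8, so the sketch picks the counterclockwise direction for tour 0.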

Selecting a tree in the tour graph
Now we consider a tour graph $G = (V, E)$ instead of a tour tree. To show that the problem of determining a tree with a minimum delay schedule in a tour graph is NP-hard, we will formulate it as a decision problem d-MDT and reduce the NP-complete problem 3SAT to it. We will assume that the directions of the tours are given and formulate d-MDT as follows. Given a tour graph with directions and the distances between the meeting points, and a bound $B$, the question is: is there a tree in the tour graph that admits a schedule with a worst-case delay of at most $B$? The optimization problem MDT cannot be easier than the decision problem, since a solution of the optimization problem also gives an answer to the decision problem.
The construction of a d-MDT instance from an arbitrary 3SAT instance is shown by means of the example $\{c_1 = \{x_1, x_2, x_3\}, c_2 = \{x_1, x_2, x_4\}, c_3 = \{x_2, x_3, x_4\}\}$ in Figure 4. In the reduction a vertex appears for each variable and each clause, and a meeting point connects a variable $x_i$ with a clause $c_j$ if the variable appears in the clause. The position of the meeting point on $x_i$ depends on whether the variable is complemented or not complemented in the clause. The basic idea is that for each clause $c_j$ an edge $(c_j, x_i)$ has to be selected such that the data from each $c_j$ can pass some $x_i$ with a low additional delay. This selection has the interpretation that the variable $x_i$ makes the clause evaluate to True. Since the result has to be a tree, a low additional delay for all clauses results in a satisfying assignment of the 3SAT instance. The details are described in the proof of the following proposition:

Proposition 7. d-MDT is NP-hard.

Proof. Given an instance of 3SAT with variables $W = \{x_1, \dots, x_a\}$ and clauses $C = \{c_1, \dots, c_b\}$, a tour graph with the following $n = a + b + 3$ vertices is constructed:
• one vertex for every clause $c_i$
• one vertex for every variable $x_j$
• two vertices $x$ and $\bar{x}$
• a vertex $t$
The directions are set arbitrarily and can be the same for all the tours, $P(v) = P_S(v)$ for all $v$, and the following meeting points between the tours are introduced:
• On every $c_i$ there is a meeting point with every variable $x_j$ which appears as a literal in $c_i$. The distance between the meeting points on $c_i$ is 2/3.
• On every $x_i$ there are two meeting points with $x$ and $\bar{x}$ with distance 1 between them on each side of the tour. The meeting points with the clauses $c_j$ in which the variable $x_i$ appears are grouped such that the distance to the meeting point with $x$ is 0 and to the meeting point with $\bar{x}$ is 1 if the variable appears as $x_i$ in $c_j$, and vice versa if the variable appears as $\bar{x}_i$ in $c_j$.
• On each of $x$ and $\bar{x}$ there is a meeting point with every variable $x_i$ with distance 0 between them except for two meeting points, each with distance 1 to the meeting point with $t$ (such that the distance between them is 2 on the other side of the tour).
• On $t$ there are meeting points with $x$ and $\bar{x}$ with distance 0 between them on one side of the tour and distance 1 between each of them and the meeting point with the base station on the other side of the tour (such that the distance between them is 2 on the other side).
The bound $B$ is set to 4. Given a satisfying assignment for the variables $x_i$, the parent in the tree for a variable $x_i$ is $x$ if the variable is True in the assignment, or $\bar{x}$ if the variable is False. The parent of a clause $c_j$ can be any $x_i$ that appears as a satisfying literal in the clause. In this way the worst-case delay, which is caused by the tours $c_j$, is 4 (including the length of the tours). Note that the distances do not have to be 0 and 1, but sufficiently small and large, respectively. Based on these distances, the bound $B$ has to be set accordingly.
Next, we have to show that a tree with a worst-case delay of 4 also determines a satisfying assignment for the 3SAT instance. We will do this by showing that the tree has to have a certain structure. First, both edges from $t$ to $x$ and $\bar{x}$ have to be in the tree. Otherwise, if e.g. $(t, x)$ is not chosen, data from $x$ to $\bar{x}$ has to pass some tour $x_j$, which leads to a delay of 5 = 2 (length of tour $x$) + 1 (on $x_j$) + 1 (on $\bar{x}$) + 1 (on $t$). Second, exactly one edge from any $x_j$ to either $x$ or $\bar{x}$ has to be in the tree. Choosing both edges results in a cycle containing $x_j$, $x$, $\bar{x}$, and $t$. If none of these edges is in the tree, the data has to travel along a path from $x_j$ to some $c_i$ to some $x_k$ and then to either $x$ or $\bar{x}$, which causes a delay of at least 4 + 2/3 = 2 (length of tour $x_j$) + 2/3 (on $c_i$) + 0 (on $x_k$) + 1 (on $x$ or $\bar{x}$) + 1 (on $t$). Finally, for every $c_i$ exactly one edge to some $x_j$ has to be in the tree. Because of the arrangement of the meeting points on the tours $x_j$, choosing the edges for tours $c_i$ and the edges between $x_j$ and $x$ or $\bar{x}$ that admit a worst-case delay of 4 is equivalent to finding a satisfying assignment for the 3SAT instance.

Figure 4: The circles represent the tours (which do not touch for better readability), the connections between the tours are depicted with the named meeting points, and the direction is ccw for all tours.
Consider the example in Figure 4 again with the assignment $x_1 = x_2 = x_4 = $ True and $x_3 = $ False. The parent of $x_1$, $x_2$, and $x_4$ is $x$, and the parent of $x_3$ is $\bar{x}$. The parent of $c_1$ can be either $x_1$ or $x_2$.
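The combinatorial part of the reduction (vertices, meeting-point edges, and the tree induced by an assignment) can be sketched in Python. Distances on the tours are not modeled. Literals are encoded as signed integers (`+i` for $x_i$, `-i` for its complement), and the clause signs in the usage example are hypothetical, since the complement marks of the paper's example are not visible in this text. All function and vertex names are assumptions of this sketch.

```python
def build_d_mdt_instance(num_vars, clauses):
    """Vertices and meeting-point edges of the tour graph for a 3SAT instance.

    Each edge is (tour_a, tour_b, near); for clause edges, `near` names the
    special tour ("x" or "x_bar") whose meeting point is at distance 0.
    """
    clause_tours = [f"c{j}" for j in range(1, len(clauses) + 1)]
    variable_tours = [f"x{i}" for i in range(1, num_vars + 1)]
    vertices = clause_tours + variable_tours + ["x", "x_bar", "t"]
    edges = []
    for j, clause in enumerate(clauses, start=1):
        for literal in clause:
            near = "x" if literal > 0 else "x_bar"
            edges.append((f"c{j}", f"x{abs(literal)}", near))
    for tour in variable_tours:
        edges.append((tour, "x", None))
        edges.append((tour, "x_bar", None))
    edges.append(("x", "t", None))
    edges.append(("x_bar", "t", None))
    return vertices, edges


def parents_from_assignment(assignment, clauses):
    """Tree (as parent map, rooted at t) induced by a satisfying assignment."""
    parent = {"x": "t", "x_bar": "t"}
    for i, value in assignment.items():
        parent[f"x{i}"] = "x" if value else "x_bar"
    for j, clause in enumerate(clauses, start=1):
        satisfying = [l for l in clause if assignment[abs(l)] == (l > 0)]
        if not satisfying:
            raise ValueError(f"clause c{j} is not satisfied")
        parent[f"c{j}"] = f"x{abs(satisfying[0])}"
    return parent


# Hypothetical signs; the assignment is the one used in the text.
clauses = [(1, 2, 3), (-1, 2, 4), (-2, -3, 4)]
vertices, edges = build_d_mdt_instance(4, clauses)
parent = parents_from_assignment({1: True, 2: True, 3: False, 4: True}, clauses)
```

The parent map has one entry per vertex except the root $t$, which confirms that the selected edges form a spanning tree of the constructed tour graph.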

Selecting directions and meeting points
The problem minimum delay tree with directions (MDTD) is similar to MDT with the additional problem of finding the directions. We will show that the decision version d-MDTD is also NP-hard and present a heuristic algorithm for the problem.
Proposition 8. d-MDTD is NP-hard.

Proof. The proof is similar to the proof of Proposition 7. In addition to selecting a tree, the directions for traversing the tours have to be determined as well. The difference in the reduction is the arrangement of the meeting points on the tours for the variables $x_j$. The distance on the tour between the meeting point with $c_i$ and the meeting point with $x$ is 0 if the variable $x_j$ does not appear as complement in clause $c_i$, and the distance between the meeting point with $c_i$ and the meeting point with $\bar{x}$ is 0 if the variable appears as complement. Figure 5 shows the construction of the reduction for the same example as in Section 4.2.

Figure 5: Example of a reduction from the 3SAT instance to d-MDTD (same example as in Figure 4).
The directions of the tours, except for the tours corresponding to variables, can be set arbitrarily. If in an assignment a variable $x_j = $ True, then the direction of the corresponding tour is counterclockwise, and clockwise otherwise. Therefore, a satisfying assignment admits a tree with a worst-case delay of 4.
Again, to show that a tree with worst case delay of 4 also determines a satisfying assignment for the 3SAT instance, we will show that the tree has to have a certain structure.First, both edges from t to x and x have to be in the tree.Otherwise, if e.g.(t, x) is not chosen, the shortest possible path for data from x to x has to pass some tour x j and some c i and some x k which leads to a delay of 4 + 2/3 = 2 (length of tour x) + 0 (on x j ) + 2/3 (on c i ) + 0 (on x k ) + 1 (on x) + 1 (on t).Second, exactly one edge from any x j to either x or x has to be in the tree.Choosing both edges results in a cycle containing x j , x, x, and t.If none of these edges are in the tree, the data has to travel along a path from x j to some c i to some x k and then to either x or x which causes a delay of at least 4 + 2/3 = 2 (length of tour x j ) + 2/3 (on c i ) + 0 (on x k ) + 1 (on x or x) + 1 (on t).Finally, for every c i exactly one edge to some x j has to be in the tree.To limit the delay for data from tours c i to 4, the direction for the tours x j have to be chosen accordingly.This is only possible if the 3SAT instance has a satisfying assignment.

Selecting unique meeting points
In case there is more than one potential meeting point between two tours, a unique set of meeting points has to be selected to obtain a tour graph (without multiple edges between two vertices). The decision problem d-MDTDM (minimum delay tree with directions and meeting points) is also NP-hard.
Proof. The proof is based on a similar idea as the proofs of Proposition 7 and Proposition 8. Figure 6 shows the reduction of the 3SAT instance of the example in Figure 4. An assignment of 3SAT selects the meeting point between $x_i$ and $\bar{x}_i$: if $x_i = True$, the upper meeting point is selected (the directions of $x_i$ and $\bar{x}_i$ are counterclockwise); if $x_i = False$, the lower meeting point is selected (the directions of $x_i$ and $\bar{x}_i$ are clockwise). A satisfying assignment of 3SAT results in a delay of 17/2. Conversely, selecting a meeting point between the tours $x_i$ and $\bar{x}_i$ and their directions such that the delay is 17/2 also determines a satisfying assignment for the 3SAT instance.

Approximation
We have shown that MDTD is already NP-hard when $P(v) = P_S(v)$ for all $v \in V$ and all tours have the same length. The formal definition allows tours containing (arbitrarily large) segments without sensing locations, which can be used to derive the result that the problem cannot be approximated within a constant factor unless P = NP. Figure 7 shows a direction gadget that is inserted between $x_i$ and $c_j$ if there is an edge in the tour graph (see Figure 5). This gadget allows the data to pass in one direction within a delay of 2 but causes a delay of at least $\Gamma$ in the other direction, and should prevent the data from $c_j$ from traveling along a path on tours $c_j$, $x_i$, $c_k$. Additionally, the segments on $x_i$ which have length 1 in Figure 5 get length $\Gamma$ and do not contain sensing locations. Then, as before, a $WI$ of 6 also gives a solution of the 3SAT instance. Now for every $\alpha$, $\Gamma$ is chosen large enough, e.g. $\Gamma = 7\alpha$, and an $\alpha$-approximation also results in a solution for the 3SAT instance.
A straightforward approximation for the case $P(v) = P_S(v)$ is a breadth-first traversal of the tour graph to determine a tour tree which is the union of the shortest paths from each vertex to the base station vertex. Since $L = \max_{v \in V}\{l_v\}$ is a lower bound for the optimal worst delay $WD_{OPT}$, the worst delay $WD_{SP}$ of a breadth-first traversal starting from the base station tour cannot be worse than $depth_{SP}(G) \cdot WD_{OPT}$, where $depth_{SP}(G) := \max_{v \in V}\{dist_G(v, v_0)\}$ is the maximum length of all shortest paths in the (unweighted) tour graph from the tours to the base station tour. For the example in Figure 2, $depth_{SP}(G) = 3$, which is the length of the path from tour 2 to tour 5.

Heuristics for MDTD
We present two heuristics for MDTD that select a tree in a tour graph and the directions for the tours. The first algorithm (MDTD-SP) determines a tree from the union of the shortest paths from all vertices to the base station vertex. The rationale behind this idea is to minimize the longest path in the tour graph in terms of the number of tours the generated data passes. This is shown in Algorithm 2.
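The tree selection step of MDTD-SP can be illustrated with a plain breadth-first search. The following is a minimal sketch, not the authors' Algorithm 2: it only builds the shortest-path tree and computes $depth_{SP}(G)$, and it does not choose tour directions. The function name `shortest_path_tree` and the example graph are hypothetical.

```python
from collections import deque


def shortest_path_tree(adjacency, base):
    """Breadth-first traversal of an unweighted tour graph.

    adjacency: dict mapping each tour to a list of neighboring tours.
    base: the base station tour (root of the tree).
    Returns (parent, depth), where parent[v] is the next tour on a
    shortest path from v towards base and depth[v] is the number of
    edges on that path.
    """
    parent = {base: None}
    depth = {base: 0}
    queue = deque([base])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in parent:  # first visit gives a shortest path
                parent[w] = u
                depth[w] = depth[u] + 1
                queue.append(w)
    return parent, depth


# Hypothetical tour graph (not one of the paper's figures); tour 0 is the base station tour.
adjacency = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3],
    3: [1, 2, 4],
    4: [3],
}
parent, depth = shortest_path_tree(adjacency, base=0)
depth_sp = max(depth.values())  # depth_SP(G) of the example graph
print(parent)
print(depth_sp)
```

In this example `depth_sp` is 3, so the bound from the approximation argument would read $WD_{SP} \le 3 \cdot WD_{OPT}$ for this graph.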

[Algorithm 2: MDTD-SP. Input: tour graph $G = (V, E, v_0, l_v, time_v, l^d_v)$.]
The second algorithm (MDTD-CG) is shown in Algorithm 3 and requires a converted graph $G' = (V', E', W)$ with edge lengths $W$ which is constructed from a tour graph $G = (V, E)$. The vertices $V'$ of the converted graph are the meeting points, i.e., if there is an edge $[k, l] \in E$, then there is a vertex $v_{kl} \in V'$. The lengths $W$ of the edges $E'$ between vertices in $V'$ are the lengths of the segments of the tours in $V$. An example of a tour graph and its converted graph is shown in Figure 8. The idea behind this algorithm is to minimize the longest path that data actually travels on its way to the base station.
The algorithm determines the shortest path from every vertex $v_{kl} \in V'$ (representing a meeting point between tours $k$ and $l$) to the base station $v_{0x}$. The function $dist_G(s, d)$ returns the length of the shortest path from vertex $s$ to vertex $d$ in a weighted graph $G$. This path represents a path for the data in the original tour graph for both tours $k$ and $l$ (in $path_k$ and $path_l$) and is stored together with the length of the path in $G'$ (in $len_k$ and $len_l$) if it is shorter than the shortest paths that have already been found for tours $k$ and $l$ (see the loop starting at Line 3). After this, the shortest paths in $G'$ for every tour $v \in V$ have been found. Note that the largest sum of the shortest path plus the tour length, $\max_{v \in V}(len_v + l_v)$, is a lower bound on the worst delay $WD$.
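The shortest-path step and the resulting lower bound can be sketched as follows. This is an illustrative sketch under stated assumptions, not Algorithm 3 itself: the converted graph is given as a nested dictionary of edge lengths, the vertex names (`"base"`, `"m12"`, `"m23"`), the mapping `meeting_tours` and all numbers are hypothetical, and only the lengths $len_v$ are tracked (not the paths).

```python
import heapq


def shortest_lengths(graph, source):
    """Dijkstra's algorithm on an undirected weighted graph.

    graph: dict mapping a vertex to a dict {neighbor: edge length}.
    Returns a dict with the shortest path length from source to every
    reachable vertex (equal to the length towards source, since the
    graph is undirected).
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for w, length in graph[u].items():
            nd = d + length
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist


# Hypothetical converted graph: the base station lies on tour 1.
converted = {
    "base": {"m12": 2.0},
    "m12": {"base": 2.0, "m23": 3.0},
    "m23": {"m12": 3.0},
}
meeting_tours = {"m12": (1, 2), "m23": (2, 3)}  # meeting point -> its two tours
tour_length = {1: 6.0, 2: 8.0, 3: 5.0}          # l_v for every tour v

dist = shortest_lengths(converted, "base")

# For each tour keep the shortest length found over all of its meeting points (len_v).
length_to_base = {}
for vertex, tours in meeting_tours.items():
    for tour in tours:
        length_to_base[tour] = min(length_to_base.get(tour, float("inf")), dist[vertex])

# Lower bound on the worst delay: max over tours of len_v + l_v.
lower_bound = max(length_to_base[v] + tour_length[v] for v in tour_length)
print(length_to_base, lower_bound)
```

Here tours 2 and 3 both yield $len_v + l_v = 10$, which is the lower bound for this toy instance.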
Next, the branches of the tree $T$ are added to $A$ (loop in Line 12). This is shown in Figure 9 by means of the example of Figure 8. Assume the longest path from any tour in $V$ starts at vertex 27 (Figure 9a). This path determines a path (2, 7, 3, 6, 5) in $G$ (which is added to the tree $T$) and the directions $d_7 = d_6 = cw$, $d_3 = d_5 = ccw$ (Figure 9c). In the next step the path starting at 14 is considered (Figure 9b). This path would result in the path (1, 4, 3, 5) in $G$. Since 3 is already part of the tree, only the branch (1, 4, 3) is added to the tree (Figure 9d), and $d_4 = ccw$. All tours are now part of the tree $T$ and the algorithm stops. The directions of the leaves 1 and 2 are set according to the rule for leaves in Algorithm 1.
Basically, in Line 17 the algorithm checks if a tour has been left and adds the appropriate arc to the arc set $U$. If a tour is already in the tree $T$, the path loop exits (Line 16), and the algorithm continues with the next tour in $V$. The direction of the tour $m$ which has been left depends on the order of the meeting point $v_{rs}$ and its successor on $path_i$ on the tour $m$ (Line 23).
Proposition 10. Let $WD_{SP}$ and $WD_{CG}$ be the worst delay of a tree determined by MDTD-SP and MDTD-CG, respectively. Then, for every $\alpha > 0$ there are instances of tour graphs such that $WD_{SP}/WD_{CG} > \alpha$.

Proof. Consider a tour graph with a chain of large tours of length $\Gamma$ where each tour is connected with an arm of small tours to $v_0$ (see Figure 10). Each tour on an arm has length $\epsilon$ and each arm has at least $k$ tours. The meeting points on the chain of the large tours are on opposite sides of the tours, i.e. time_{v_i}(p^{meet}_{v_i} … Then MDTD-SP will result in a tree where all large tours are in a chain. If $\Gamma/\epsilon$ is large enough, MDTD-CG will create a tree where each large tour is connected with its arm to $v_0$. If $k > 2\alpha$, then $WD_{SP}/WD_{CG} > \alpha$.

Online execution
Once the tree (Section 5), the directions and the schedule (Subsection 4.1) have been determined, the robots have to execute this schedule. If the robots need to be deployed in the environment, the schedule determined by Algorithm 1 has to be reached by the robots from an initial state. We assume that each robot navigates along some path in the environment to the meeting point with its parent at the beginning of the mission and reaches this position after some time.
The algorithm for the online execution is described in the following subsection.

Figure 9: Two steps of the tree generation of Algorithm 3. First, the path in $G'$ starting from 27 is considered (bold edges in Figure 9a), which results in the path (2, 7, 3, 6, 5) in $G$ and the directions $d_7 = d_6 = cw$, $d_3 = d_5 = ccw$ (bold lines in Figure 9c). Next, the path starting at 14 is considered (Figure 9b), which results in the path (1, 4, 3, 5) in $G$. Since 3 is already in the tree, only the branch (1, 4, 3) with $d_4 = ccw$ is added (Figure 9c, the dashed line indicates the discarded part of the path). After this step the tree contains all tours.

[Algorithm 3: MDTD-CG. Input: tour graph $G = (V, E)$, converted graph $G' = (V', E', W)$ with edge lengths $W(e)$ for all $e \in E'$, base station $v_{0x}$. Surviving steps: for each $v_{kl}$ on path $path_i$ to $v_{0x}$ do; $v_{rs} \leftarrow v_{kl}$.]

State machine
The algorithm running on every robot $v$ is shown in Algorithm 4. It resembles a state machine where the variable state can take one of the states {INIT, AT_WAIT, MOVING}. The robot is in state INIT as long as it is moving from the initial position to the meeting point with its parent on its tour ($p^{start}_v$). In state AT_WAIT it is waiting for its parent at $p^{start}_v$, and in state MOVING it is moving along its tour. The input to the algorithm is the schedule (in particular $\Delta_{uv}$ determined by Algorithm 1) and the outputs are the commands Move and Stop for the motion actuators. The state of the state machine is initially INIT. We will show that under this assumption there is an infinite sequence of state transitions INIT, AT_WAIT, MOVING, AT_WAIT, . . . and that the schedule will converge to the optimal schedule after a finite time. A state transition A, B means that the variable state changes from state = A to state = B. Because of the assumption that every robot $v$ will reach $p^{start}_v$, a transition from state = INIT to state = AT_WAIT always happens for every robot.
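The transitions between the three states can be sketched as a small transition function. This is a simplified illustration and not Algorithm 4: the waiting time derived from $\Delta_{uv}$ and the waiting for children during MOVING are omitted, and the names `State`, `step` and the three boolean inputs are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    INIT = auto()     # moving from the initial position to p_start
    AT_WAIT = auto()  # waiting at p_start for the parent
    MOVING = auto()   # traversing the tour


def step(state, at_start, parent_at_meeting, tour_completed):
    """Return (next_state, command) for one control step.

    at_start: the robot has reached p_start (only relevant in INIT).
    parent_at_meeting: the parent is at the common meeting point.
    tour_completed: the robot is back at p_start after a traversal.
    Simplification: no waiting time after the meeting and no waiting
    for children while MOVING.
    """
    if state is State.INIT:
        if at_start:
            return State.AT_WAIT, "Stop"
        return State.INIT, "Move"
    if state is State.AT_WAIT:
        if parent_at_meeting:
            return State.MOVING, "Move"
        return State.AT_WAIT, "Stop"
    # state is State.MOVING
    if tour_completed:
        return State.AT_WAIT, "Stop"
    return State.MOVING, "Move"


# Example run: reach p_start, meet the parent, complete one traversal.
state = State.INIT
trace = []
for inputs in [(True, False, False), (False, True, False), (False, False, True)]:
    state, command = step(state, *inputs)
    trace.append((state.name, command))
print(trace)
```

The example run produces the sequence AT_WAIT, MOVING, AT_WAIT, which is the beginning of the repeating pattern discussed above.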
Proposition 11. A robot never has to wait for an infinite time. This implies an infinite sequence of state transitions AT_WAIT, MOVING, . . . for each robot.
Proof. A robot would have to wait infinitely long only if the condition $p_u(t) = p^{meet}_u(v)$ (waiting for the parent) in state AT_WAIT never holds or if the condition $p_w(t) = p^{meet}_w(v)$ (waiting for a child) in state MOVING never holds. Since the meeting points define a tour tree $T = (V, A)$, it is sufficient to show that no robot has to wait for its parent infinitely long. After a robot has met its parent, it waits for a finite time (line 19) and then traverses its tour. The proof is by induction on the number of robots in the tree. In the base case only the robot $v$ which has the base station on its tour is in the tree, and the condition $p_0(t) = p^{meet}_0(v)$ always holds (0 is the base station and the parent of $v$). In the inductive step a tour $v$ is added to the tree. Since its parent $u$ does not have to wait for its own parent infinitely long and starts its tour after a finite waiting time, $v$ also meets $u$ at $p^{start}_v$ (when $p_u(t) = p^{meet}_u(v)$).
Proposition 12. After a finite number of state transitions for each robot from MOVING to AT_WAIT, the schedule has converged to the schedule determined by Algorithm 1, i.e. $\Delta t$ stays 0.
Proof. Consider a tour $v$ with the largest distance from the base station tour in the tour tree which has only leaves as children. After $v$ has met its parent, it starts traversing the tour and possibly has to wait for children to reach the meeting point. After the first traversal of the tour (state transition from MOVING to AT_WAIT), all children have started their tours and have had enough time to finish them and to reach the meeting position with $v$ on the second traversal of $v$. Therefore $\Delta t$ will be 0 for $v$ after the second traversal. The same holds for the parent $u$ of $v$ after an additional state transition of $u$ from MOVING to AT_WAIT. This argument can be repeated until the base station tour is reached.
Figure 11 shows the startup phase and execution of the state machine for the given example tour tree.After the schedule has emerged, robot 5

Experimental evaluation
In this section we describe the results from simulation experiments with the aim to assess the performance of the heuristics (MDTD-CG/SP) in terms of worst idleness $WI$ and worst delay $WD$ in different situations (number of robots). To assess the effect of robots cooperating for the data transportation, we compare MDTD-CG/SP with an approach where the data is not transported via other robots to the base station but directly by the robot which captures the data (this approach is denoted as the single-hop approach). Additionally, a breadth-first traversal of the tour graph has been implemented where the directions of the tours are determined by Algorithm 1 from the resulting tree (MDTD-SP).
The environment is modeled as a rectangular grid of cells of unit size, and time is discretized into time steps. A robot can move from one cell of the grid to one of the 8 neighboring cells or stay at the same cell within one time step. The communication range $R_{com}$ (measured in number of cells) determines which cells are within communication range. The base station is in the cell at the lower left corner.
A genetic algorithm implementation is used to determine a tour through all sensing locations and the base station. To obtain the individual tours for the robots, the tour is split with k-SPLITOUR [10].
For MDTD, meeting points from a set of potential meeting points between each pair of tours have to be selected. In Section 4.2 we consider selecting the edges (corresponding to meeting points) in the tour graph such that the resulting graph is a tree, whereas here we are concerned with the selection of one of possibly multiple edges between two vertices in the tour graph (cf. Section 4.4). For a certain $R_{com}$, a potential meeting point between two tours is a pair of cells on the two tours within communication range. For the selection of meeting points, tours are traversed in a breadth-first order starting at the tour which is connected to the base station. The tours are added in the traversal order to a converted graph (the converted graph is described in Section 5), where the vertices are the meeting points selected so far. For every potential meeting point of $v$ with a neighboring tour $v'$, the shortest path to the base station on the converted graph is calculated, and the meeting point with the shortest path is selected as the meeting point between $v$ and $v'$. This heuristic tries to shift meeting points as close as possible to the base station in the converted graph.
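Enumerating the potential meeting points between two tours on the grid can be sketched as follows. The text does not state which distance metric defines the communication range; the sketch assumes the Chebyshev distance, which matches the 8-neighborhood movement model (neighboring cells are within range for $R_{com} = 1$). The function names and the two example tours are hypothetical.

```python
def chebyshev(a, b):
    """Chebyshev distance between two grid cells given as (x, y)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def potential_meeting_points(tour_a, tour_b, r_com):
    """All pairs (cell on tour_a, cell on tour_b) within communication range.

    Assumption: communication range is measured with the Chebyshev
    distance in number of cells.
    """
    return [
        (p, q)
        for p in tour_a
        for q in tour_b
        if chebyshev(p, q) <= r_com
    ]


# Two hypothetical square tours on the grid.
tour_a = [(0, 0), (1, 0), (1, 1), (0, 1)]
tour_b = [(2, 1), (3, 1), (3, 2), (2, 2)]
candidates = potential_meeting_points(tour_a, tour_b, r_com=1)
print(candidates)
```

Each candidate pair would then be evaluated by its shortest path to the base station in the converted graph, and the pair with the shortest path would be kept.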
First, we compare the performance of MDTD-CG/SP with a single-hop algorithm similar to the one in [4], where robots make detours to communication sites to transmit the data. There, an increasing number of detours is inserted in a pre-calculated tour for each UAV until the total travel distance exceeds a certain travel budget. The heuristic of [4] tries to minimize the average delay and is not well suited for minimizing the worst delay. Here, an increasing number of detours to the base station, which are evenly spread along a robot's tour (using k-SPLITOUR), is inserted until a certain total tour length of a robot is exceeded. This bound for the tour length is set to the maximum of the worst idleness resulting from MDTD-CG/SP and the maximum tour length (including the base station) for each robot, such that every robot is able to transmit the data from its tour to the base station at least once. The results for $WI$ and $WD$ for different numbers of robots on a grid with an area of 20 × 60 sensing locations are shown in Figure 12 and Figure 13, respectively. Due to the stochastic nature of the genetic algorithm, the experiment is repeated 10 times for each $n$, and the standard deviation is also shown in the figures.
Figure 13 also shows the worst delay $WD$ of the optimal solutions for the MDT instances, generated with the state-of-the-art IP solver Gurobi (an MDT instance is defined by the tours and the meeting points, see Appendix A for the MILP formulation). Note that the $WI$ is the same for MDTD-CG/SP and MDTD (opt).
From Figure 12 and Figure 13 it can be seen that, on the one hand, MDTD-CG/SP/opt can outperform the single-hop approach in terms of $WI$. The value for $WI$ is the same for all three algorithms since all use the same tours. In the single-hop approach a robot from a more distant subarea has a long path to the base station, which causes a large $WI$. On the other hand, the single-hop approach can outperform MDTD-CG/SP/opt in terms of $WD$ because the data travels the shortest possible path to the base station, which can be seen as a lower bound for $WD$ with given tours.
The average computation times for different numbers of robots are shown in Table 2. For MDTD-CG/SP a single core and for MDTD (opt) all 8 logical cores of a machine with an Intel Core-i7 6700K and 32 GB of RAM were used. The instances for MDTD (opt) are the same as for Figure 12 and Figure 13. The instances (tour graphs) for MDTD-CG/SP have been randomly generated (10 instances for each $n$), with an edge probability of 0.25 between tours and randomly sampled meeting point distances on a tour.
Figure 14 shows the sum of the traveled distances for MDTD-CG/SP and the single-hop approach. The horizon for the delay calculation is the $WI$ achieved by MDTD-CG/SP. The reason is that for MDTD-CG/SP every sensing location gets visited once within this horizon. Since every robot is constantly moving in the single-hop approach, the sum of the traveled distances is higher than for MDTD-CG/SP, where robots might have to wait at meeting positions.
There are situations for which the single-hop approach performs arbitrarily badly in terms of $WI$, e.g., the delay is unbounded if it is not possible for each robot to travel to the base station due to obstacles. Figure 15 shows a scenario (20x40 cells) with predefined tours where the worst idleness and delay for MDTD-CG are 31 and 65, respectively, and for the single-hop approach 93 and 51, respectively (all tours have approximately the same length of 30 cells). The large worst idleness of the single-hop approach compared to MDTD-CG is obvious, since the robots which traverse the rightmost tours have long paths to the base station, whereas $WD$ is only slightly larger for MDTD-CG, since the data follows a path that is close to the shortest one to the base station.

Conclusion
Multi-robot patrolling is an important application of multi-robot systems, and in certain situations it is not only important that sensing locations get visited repeatedly but also that the data reaches a base station on time for further processing or for an assessment by mission operators. This is typically required in disaster response scenarios where the mission operators need an up-to-date view of the situation. We presented a multi-robot patrolling problem with cooperative data transport to the base station where robots move on predefined tours, which eliminates the need for every robot to return to the base station for data delivery. MDT represents the problem of minimizing the data delay, which turns out to be NP-hard despite its simple definition, which is decoupled from path planning. Explicitly minimizing delay for patrolling with cooperative multi-robot data transport has not been investigated so far to the best of our knowledge. We presented heuristics and an algorithm for online execution and evaluated the performance in simulation experiments. The comparison of MDT with an uncooperative approach (every robot individually transports the data to the base station) on predefined tours shows that the cooperative approach can outperform the uncooperative approach in terms of WI and the traveled distances. The reason is that the robots that hand over the data to other robots (in this way the data finally reaches the base station) can continue patrolling their tours, while in the uncooperative approach robots are forced to leave their tour to move to the base station.
The problem relies on TSP tours (and subtours derived from the TSP tours) through all sensing locations. In our work they are generated with traditional algorithms that try to minimize the length of the tour (and minimize the maximum length of the subtours). An open issue is the generation of such tours that support the joint minimization of idleness and delay. Other open questions are whether there are approximation algorithms with guaranteed bounds and whether some instance classes (e.g. planar graphs) can be solved optimally in polynomial time.

MILP formulation of MDTD
The mixed integer linear programming (MILP) model of MDTD is based on a multi-commodity flow formulation for trees on a graph $G = (V, A)$ with $n + 1$ vertices $V$ (including a vertex 0 for a virtual base station tour) and arc set $A$ [22]. The base station is the source of a commodity flow $f^c_e$ for each vertex (constraint (8)). A flow of commodity $c$ represents the path of the data from robot $c$ towards the base station (though the flow originates at the base station in this formulation). For each vertex the sum of incoming flows is equal to the sum of outgoing flows for each commodity not dedicated to that vertex (constraint (9)), and each vertex $c$ consumes the commodity of type $c$ (constraint (10)). There can only be a flow on an edge if this edge is selected in the tree (constraint (11)), and the sum of the edges must be $n$ (constraint (12)).
$$f^c_e \le x_e \quad \forall e \in A,\ \forall c \in V \setminus \{0\} \qquad (11)$$
$$\sum_{e \in A} x_e = n \qquad (12)$$
$$x_e \in \{0, 1\}, \quad f^c_e \ge 0 \quad \forall e \in A,\ \forall c \in V \setminus \{0\}$$
The data which robot $j$ receives at the meeting point between $i$ and $j$ and forwards at the meeting point between $j$ and $k$ has to travel the distance $l^{j,ccw}_{ik}$ or $l^{j,cw}_{ik}$ on tour $j$, depending on the direction in which robot $j$ traverses its tour. Therefore, two flow variables $f^c_{ij}$ and $f^c_{jk}$ are involved in the cost calculation in constraint (17) for data originating from $c$ and traversing the tour $j$. The separation of the flows in this formulation allows the definition of a min-max objective. For each commodity $c$, $z_c$ models the delay of data originating at robot $c$, and the objective is to minimize $z$. The decision variables $u^{ccw}_j$ and $u^{cw}_j$ determine the direction in which robot $j$ traverses its tour.
$$\ldots \sum_{(i,j),(j,k) \in A} \left( f^c_{ij} f^c_{jk} u^{ccw}_j\, l^{j,ccw}_{ik} + f^c_{ij} f^c_{jk} u^{cw}_j\, l^{j,cw}_{ik} \right) \quad \forall c \in V \setminus \{0\} \qquad (17)$$
The products, e.g. $f^c_{ij} f^c_{jk} u^{ccw}_j$ (and likewise $f^c_{ij} f^c_{jk} u^{cw}_j$), can be linearized with an additional variable $f^{c,ccw}_{ijk}$ and corresponding linear constraints.
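A common way to linearize a product of three 0/1 variables is shown in the following sketch. It is a textbook technique and may differ from the constraints used by the authors. It assumes that all three factors take values in {0, 1} (the flow variables are declared as non-negative reals in the model, so this is an assumption about their values in a tree solution). The names `a`, `b`, `c`, `y` and `linearization_feasible` are hypothetical: `y` stands for the additional variable and `a`, `b`, `c` for the three factors.

```python
from itertools import product


def linearization_feasible(a, b, c, y):
    """Linear constraints that force y = a * b * c for binary a, b, c.

    y <= a, y <= b, y <= c   (y is 0 as soon as one factor is 0)
    y >= a + b + c - 2       (y is 1 if all factors are 1)
    y >= 0
    """
    return y <= a and y <= b and y <= c and y >= a + b + c - 2 and y >= 0


# Brute-force check over all binary assignments: the only feasible
# value of y must be the product of the three factors.
exact = all(
    [y for y in (0, 1) if linearization_feasible(a, b, c, y)] == [a * b * c]
    for a, b, c in product((0, 1), repeat=3)
)
print(exact)
```

The check confirms that, under the binary assumption, the four inequalities admit exactly one value of `y` for every assignment, namely the product.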

Figure 2: (a) Example of a tour graph which is a tree with $|V| = n = 7$ and $P(v) = P_S(v)$ for all $v \in V$. The tours are depicted as circles, labeled by numbers, and furnished with arrows that show the movement direction of the robots. The solid small circles are the meeting points and the small solid rectangle is the base station. The straight lines indicate the edges of the tour graph. (b) A repeated schedule constructed from the graph in (a). A horizontal line denotes movement of a robot and the spacing between two horizontal lines indicates that a robot does not move. A vertical, curved arrow indicates that two robots meet and exchange data. The directions of the arrows indicate the paths the captured data travels towards the base station. The two solid rectangles show the position of the base station on tour 5.
The function $wait_v : P(v) \to \mathbb{R}_{\ge 0}$ defines the waiting times for points on tour $v$ (the waiting positions are the positions where the function returns values > 0). $d_v \in \{cw, ccw\}$ are the traversal directions (clockwise and counterclockwise). Additionally, $p_v(t) = p^{start}_v$ for all $t < 0$ and $t > \tau_v$. $wait_v(p^{start}_v)$ is the initial waiting time (possibly 0) starting at time 0. Every robot meets its neighbor

Figure 3: Tour tree with $P(v) = P_S(v)$ for all $v \in V$ where all robots move in clockwise (a) or in counterclockwise (c) directions. The worst delay for clockwise directions is smaller (b) than for counterclockwise directions (d).

Figure 4: Example of a reduction from the 3SAT instance $\{c_1 = \{x_1, x_2, x_3\}, c_2 = \{x_1, x_2, x_4\}, c_3 = \{x_2, x_3, x_4\}\}$ to d-MDT. The circles represent the tours (which do not touch for better readability), the connections between the tours are depicted with the named meeting points, and the direction is ccw for all tours.

Figure 6: Example of a reduction from the 3SAT instance (same example as in Figure 4) to the problem of selecting unique meeting points between tours, d-MDTDM.

Figure 8: Example of a tour graph $G = (V, E)$ (a) and the converted graph $G' = (V', E', W)$ (b). The meeting points of the original graph are the vertices $V'$ in the new graph, and the lengths $w$ of the edges in $E'$ between the vertices in $V'$ are the minimum travel times of the segments of the original tours. If there are two edges between two meeting points, the longer one is discarded.

Algorithm 3, line 15: if $m$ is already in $T$ then

Figure 10: Example of a tour graph (the size of a circle indicates the length $l_v$).

Figure 11: Example of startup phase and execution of the state machine. (a) Tour tree with tours of equal length and cw direction for all robots. (b) Position $p_v(t)$ over $t$ for robots 1 to 5, where each bottom line indicates $p^{start}_v$. The small numbers on the vertical axis indicate the position of the meeting points on tour $v$. A small dot indicates when robot $v$ reaches $p^{start}_v$, i.e., the state transition from INIT to AT_WAIT.

Figure 12: Worst idleness $WI$ for MDTD-CG/SP/opt and Single-hop-detour with varying number of robots $n$; neighboring cells are within communication range ($R_{com} = 1$).

Figure 13: Worst delay $WD$ for MDTD-CG/SP/opt and of the optimal solution (opt) for varying number of robots $n$, $R_{com} = 1$.

Figure 14: Sum of the traveled distances (number of steps in the grid) for MDTD-CG/SP and Single-hop-detour with varying number of robots $n$. The time horizon within which the distances have been calculated is the $WI$ achieved by MDTD-CG/SP.

Figure 15: A scenario of size 20x40 cells for comparison of MDTD-CG with a single-hop approach.The bold lines show the border and obstacles of the environment (which prohibit movement and communication), the rectangles with rounded edges show the tours for the robots and the dashed lines show possible meeting points for MDTD-CG.

Table 1: Summary of related work. An 'X' in the column Persist. indicates that the approach generates a solution for an infinite time horizon. An 'X' in Min. Del. indicates that the work explicitly considers minimization of the delay and an 'o' that the delay is considered as a constraint. The meanings of the letters in the Connect. column are: 'D' for delay tolerant (store-and-forward), 'S' for single hop (robots deliver data directly to the base station), 'R' for recurrent connectivity (all robots meet after a certain interval), and 'F' for full (persistent connectivity among all robots and the base station).

Table 2: Average computation times (sec) for different numbers of robots $n$ for MDTD-CG/SP/opt.
8.2. List of symbols
$G = (V, E)$ : (tour) graph with vertex set $V$ and edge set $E$
$G = (V, A)$ : (tour) graph with vertex set $V$ and arc set $A$
$T = (V, E)$ : (tour) tree with vertex set $V$ and edge set $E$
$[v, w] \in E$ : (undirected) edge between $v$ and $w$
$(v, w) \in A$ : (directed) arc from $v$ to $w$
$G' = (V', E', W)$ : converted graph of tour graph $G$
$v_{kl}$ : vertex of converted tour graph
$v_0$ or 0 : base station
$l_v$ : minimum traversal time (without stops) of tour $v$
$L$ : $\max_{v \in V}\{l_v\}$
$d_v$ : direction robot $v$ traverses its tour (cw or ccw)
$p_r(t)$ : position of robot $r$ at time $t$
$time_v(p, q, d)$ : minimum travel time on tour $v$ from point $p$ to point $q$
$dist_G(s, d)$ : length of shortest path between vertices $s$ and $d$ in (weighted) graph $G$