Anytime and Efficient Multi-agent Coordination for Disaster Response

The Coalition Formation with Spatial and Temporal constraints Problem (CFSTP) is a multi-agent task allocation problem where the tasks are spatially distributed, with deadlines and workloads, and the number of agents is typically much smaller than the number of tasks. To maximise the number of completed tasks, the agents may have to schedule coalitions. The state-of-the-art CFSTP solver, the Coalition Formation with Look-Ahead (CFLA) algorithm, has two main limitations. First, its time complexity is exponential in the number of agents. Second, as we show, its look-ahead technique is not effective in real-world scenarios, such as open multi-agent systems, where new tasks can appear at any time. In this work, we study its design and define a variant, called Coalition Formation with Improved Look-Ahead (CFLA2), which achieves better performance. Since we cannot eliminate the limitations of CFLA in CFLA2, we also develop a novel algorithm to solve the CFSTP, the first to be simultaneously anytime, efficient and with convergence guarantee, called Cluster-based Task Scheduling (CTS). In tests where the look-ahead technique is highly effective, CTS completes up to 30% (resp. 10%) more tasks than CFLA (resp. CFLA2) while being up to 4 orders of magnitude faster. We also propose S-CTS, a simplified but parallel variant of CTS with even lower time complexity. Using scenarios generated by the RoboCup Rescue Simulation, we show that S-CTS performs at most 10% worse than high-performance algorithms such as Binary Max-Sum and DSA, but is up to 2 orders of magnitude faster. Our results affirm CTS as the new state-of-the-art algorithm to solve the CFSTP.


Introduction
Disasters, man-made and natural, can cause severe loss of life, damage to infrastructure and cascading failures in energy systems [1]. In the aftermath of a disaster, first responders have to be deployed to meet the needs of the community. They are responsible for complex tasks such as first aid and infrastructure restoration, which they must perform during periods of high stress and in environments with strict time constraints [5]. During these operations, it is fundamental to act as fast as possible since any delay can lead to further tragedy and destruction.
We focus on a class of disaster response problems that has been characterised by Ramchurn et al. [29] as Coalition Formation with Spatial and Temporal constraints Problem (CFSTP). We use the definitions of coalition and coalition formation given in [14,29,32]. Hence, a coalition is a flat and task-oriented organisation of agents, short-lived and disbanded when no longer needed, while coalition formation is a consequence of the emergent behaviour of the system [23]. In the CFSTP, the agents (e.g., ambulances or fire brigades) have to decide which tasks they are going to execute (e.g., save victims or extinguish fires). The decision is influenced by how tasks are located in the disaster area, how much time is needed to reach them, how much work they require (e.g., how large a fire is) and their deadlines (e.g., estimated time left before victims perish). Given these conditions, and considering that there could be many more tasks than agents, it is necessary that agents cooperate with each other by forming, disbanding and reforming coalitions over time [32]. Coalitions enable agents to complete tasks more efficiently than working individually. Moreover, some tasks may have constraints that could not be satisfied by single agents. For instance, a fire is extinguished faster when multiple fire brigades work on it together. Hence, the objective of the CFSTP is to schedule the right coalitions (e.g., the fastest ambulances and fire trucks with the largest water tanks) to the right tasks (e.g., sites with the most victims and the strongest fires) to ensure that as many tasks as possible are completed.
Our interest is in algorithms that are anytime (i.e., which can return partial solutions if they are interrupted before completion), have theoretical properties and can solve the CFSTP efficiently (i.e., approximation algorithms [25]). The reason is that anytime and approximate solutions are fundamental in real-world domains, where it is necessary to have theoretical guarantees, but it may be computationally infeasible or economically undesirable to produce an optimal solution [42]. In particular, as we said above, the faster the disaster response, the lower the losses incurred. We also assume that the agents are situated in an open system [13], that is, at any time, agents can join or leave and new tasks can appear.
To date, the most effective way of solving the CFSTP is to reduce it to a Distributed Constraint Optimisation Problem (DCOP) [10] and solve it with the Max-Sum algorithm [9]. The variants relevant to our scope are Fast Max-Sum (FMS) [29] and Binary Max-Sum (BinaryMS) [27]. FMS has a time complexity which is exponential in the number of agents [29, Section 6.1] but it can find optimal solutions when the problem is represented by an acyclic factor graph [10]. On the other hand, BinaryMS can only find approximate solutions, but its time complexity is polynomial in the number of agents. Nonetheless, since both use binary decision variables, they require a pre-processing phase with exponential time to solve CFSTP instances with n-ary decision variables. Multi-agent approaches that solve problems similar to the CFSTP make use of social insects [8], automated negotiation [11,12,39] and evolutionary computation [41], but without considering the anytime property. In the iTax taxonomy of Korsah et al. [20], the CFSTP is defined as a Cross-schedule Dependent Single-Task Multi-Robot Time-extended Assignment (XD [ST-MR-TA]) problem [20]. To date, the approaches proposed to solve XD [ST-MR-TA] problems utilise linear programming [2,18,19], automated negotiation [21] and memetic algorithms [22]. However, either they do not produce anytime solutions [21,22], or they do not have theoretical properties [2], or they are based on a simpler model [18,19].
Against this background, we focus on the state-of-the-art algorithm to solve the CFSTP, namely the Coalition Formation with Look-Ahead (CFLA) algorithm [30]. Our rationale is that CFLA is anytime and, although its computational time is exponential in the worst case, due to its design [30, Section 6] and to the performance of current computers, a well-engineered implementation can find a solution to problems with dozens of agents and hundreds of tasks in minutes, when it is not necessary to terminate early. Specifically, we advance the state of the art in the following ways:
• We define CFLA2, a novel variant of CFLA that minimises limitations and improves performance.
• Since we cannot eliminate the limitations of CFLA in CFLA2, we design CTS, the first CFSTP solver to be simultaneously anytime, efficient and with convergence guarantee. In tests where the look-ahead technique is highly effective, CTS is up to 4 orders of magnitude faster than CFLA and CFLA2.
• Finally, we propose a simplified, parallel and more efficient variant of CTS, called S-CTS. In problems generated with the RoboCup Rescue Simulation [16], we show that S-CTS can compete with high-performance DCOP algorithms, while being up to 2 orders of magnitude faster.
The rest of the paper is organised as follows. In "Problem Formulation", we give our CFSTP model. "Coalition Formation with Improved Look-Ahead" details CFLA2 and "Cluster-Based Task Scheduling" presents the CTS algorithm. Next, we give comparison tests between CFLA, CFLA2 and CTS, then we show the performance of S-CTS in the RoboCup Rescue Simulation and finally conclude.

Problem Formulation
We present below a refined constraint optimisation model of the CFSTP [30]. More precisely, we extend the definition of coalition value, define the constraints with fewer and simpler equations, and introduce the concept of solution degree.

Basic Definitions
Let $V = \{v_1, \ldots, v_m\}$ be a set of $m$ tasks and $A = \{a_1, \ldots, a_n\}$ be a set of $n$ agents.¹ Let $L$ be the finite set of all possible task and agent locations. Hence, more than one agent or task can be at the same location. Time is denoted by $t \in \mathbb{N}$, starting at $t = 0$, and agents travel or execute tasks with a base time unit of 1. The time units needed by an agent to travel from its current location to a new task location are given by a function with signature $A \times L \times L \to \mathbb{N}$. Unlike [30], we put $A$ in the domain of this function to characterise agents with different speeds.² Task locations do not change over time, while agent locations can change as agents travel. Each task $v$ has a workload $w_v \in \mathbb{R}^+$, or the amount of work required to complete $v$, and a deadline $d_v \in \mathbb{N}$, or the time until which agents can work on $v$. Our notion of work will be clear in "Coalition Values". Hence, workloads are positive and some tasks may have a deadline of zero. In other words, a problem may have tasks that cannot be completed in time, independently of the algorithm chosen to solve it. We denote the location of agent $a$ at time $t$ by $l^t_a \in L$ and the latest deadline by $d_{\max} = \max_{v \in V} d_v$. For each agent $a$ and task $v$, we also track the times at which $a$ starts and finishes working on $v$.

Coalition Allocations
Agents are cooperative [38] and can work together to complete a task. A subset of agents $C \subseteq A$ is called a coalition. At time $t$, the rationale for allocating coalition $C$ to task $v$ is that $C$ completes $v$ in the fewest time units. An agent allocation is denoted by $(a \to v)_t$ and represents the fact that agent $a$ works on task $v$ at time $t$. The set $T$ contains all possible agent allocations. A coalition allocation is denoted by $(C \to v)_t$ and represents the fact that coalition $C$ works on task $v$ at time $t$. Given a set of agent allocations $T' \subseteq T$ and a time $t' \le d_{\max}$, the set of coalition allocations corresponding to $T'$ over the time period $[0, t']$ is defined by Eq. 1.

Coalition Values
Each coalition allocation has a coalition value, given by the function³ $u : P(A) \times V \to \mathbb{R}_{\ge 0}$, where $P(A)$ is the power set of $A$ and $\mathbb{R}_{\ge 0}$ is the set of non-negative real numbers. Unlike [30], we put $V$ in the domain of $u$ to characterise the fact that the same coalition may execute different tasks with different performances. Hence, given a coalition allocation $(C \to v)_t$, the value $u(C, v)$ expresses the amount of work that coalition $C$ does on task $v$ at each time $t$. The workload $w_v$ decreases linearly over time, depending only on $u(C, v)$. Moreover, $u(C, v)$ is independent of time and the effect of $C$ working on $v$.
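The linear decrease of a workload can be illustrated with a minimal sketch. The helper names `time_units_to_complete` and `remaining_workload`, and the parameter `u_cv` standing for $u(C, v)$, are illustrative assumptions and do not come from the model above. The sketch also assumes that the coalition stays fixed while it works.

```python
import math


def time_units_to_complete(workload: float, u_cv: float) -> int:
    """Time units a fixed coalition needs to finish a task.

    `workload` plays the role of w_v and `u_cv` the role of u(C, v).
    Both names are illustrative. The workload drops by u_cv per time unit.
    """
    if workload <= 0:
        raise ValueError("workloads are positive in the model")
    if u_cv <= 0:
        raise ValueError("a coalition with value 0 never completes the task")
    return math.ceil(workload / u_cv)


def remaining_workload(workload: float, u_cv: float, elapsed: int) -> float:
    """Workload left after `elapsed` time units, never below zero."""
    return max(0.0, workload - u_cv * elapsed)
```

For instance, a workload of 10 served by a coalition of value 3 needs 4 time units, and 4 units of work remain after the first 2 time units.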

Constraints
There are three constraint types: structural, temporal and spatial. Structural constraints require that each task can be allocated to only one coalition at a time. For each task $v$, this is characterised by defining a set that contains only coalition allocations to $v$, is maximal with respect to inclusion and never allocates $v$ to two different coalitions at the same time. Temporal constraints require that each task $v$ can be completed only within its deadline $d_v$. This is characterised by a function with values in $\{0, 1\}$, defined over tasks and sets of coalition allocations (Eq. 5). Equation 5 utilises the set defined for $v$ above to consider only coalition allocations that satisfy the structural constraints.
Spatial constraints require that an agent does not start working on a task before reaching it (Eqs. 6 and 7). Equation 7 also implies that two tasks cannot be allocated to the same agent at the same time. In other words, coalitions that exist at different locations at the same time are disjoint [30, Section 2].
Legal agent allocations and feasible coalition allocations, both used in the following sections, are defined with respect to these constraints: the function of Eq. 5 must evaluate to 1 for the task concerned, and the agent allocations must satisfy Eqs. 6 and 7. [...] When a task $v$ is allocated to a coalition $C$, each agent $a \in C$ starts working on $v$ as soon as it reaches the location of $v$, without waiting for the remaining agents. In other words, there are no synchronisation constraints [24].
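The absence of synchronisation constraints can be sketched as follows. This is a minimal illustration under stated assumptions, not the formulation of Eqs. 5 to 7: the function name `completion_time`, the dictionary `arrivals` and the single-task value function `u` are hypothetical, an agent arriving at time $t$ is assumed to work during the unit $[t, t + 1)$, and no work is assumed to happen from the deadline onwards.

```python
from typing import Callable, Dict, FrozenSet, Optional


def completion_time(
    arrivals: Dict[str, int],
    workload: float,
    deadline: int,
    u: Callable[[FrozenSet[str]], float],
) -> Optional[int]:
    """Return the time at which the task is completed, or None if the
    deadline passes first.

    `arrivals` maps each agent of the coalition to its arrival time at
    the task, and `u` gives the work done per time unit by the agents
    that are currently present (the task is fixed, so u takes only the
    coalition as argument).
    """
    if not arrivals:
        return None
    remaining = workload
    for t in range(min(arrivals.values()), deadline):
        # No synchronisation: whoever has already arrived works now.
        present = frozenset(a for a, when in arrivals.items() if when <= t)
        remaining -= u(present)
        if remaining <= 0:
            return t + 1
    return None
```

With `u` equal to the coalition size, two agents arriving at times 0 and 2 complete a workload of 5 at time 4, because the first agent works alone for two time units before the second joins.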

Objective Function
The objective of the CFSTP is to find a feasible set of coalition allocations that maximises the number of completed tasks (Eq. 8). To solve Eq. 8, an exhaustive search may need to verify all the possible agent allocations until $d_{\max}$. Consequently, the time complexity of finding an optimal solution to the CFSTP [...] Hence, the argument of the maxima in Eq. 8 is a solution with the highest degree.
Ramchurn et al. [30] proved that the CFSTP is NP-hard [25] and a generalisation of the Team Orienteering Problem [4], which is a generalisation of the Travelling Salesman Problem [37]. As we said in "Introduction", CFLA is the state-of-the-art CFSTP solver. In the next section, we show how to improve it.

Coalition Formation with Improved Look-Ahead
We now present the Coalition Formation with improved Look-Ahead ( CFLA2 ), an extension of the CFLA algorithm [30]. More precisely, its look-ahead technique ("Phase 3: Defining the Degree of Each Task") has two modifications that, as we shall see in "Comparison Tests", enhance the overall performance.
The concept of CFLA2 is the same as CFLA, but for completeness, we briefly report it in "The Concept of CFLA2 ". After that, we detail the procedures that compose CFLA2 , explaining how they differ from the ones of CFLA. Finally, we list the limitations that CFLA2 continues to keep from CFLA, which are the rationale for our new algorithm in "Cluster-Based Task Scheduling".
Both CFLA and CFLA2 use the same four phases, but Ramchurn et al. [30,Section 6] describe them in three algorithms. To improve the readability, we describe them in four algorithms ("Phase 1: Defining the Legal Agent Allocations" to "Phase 4: Overall Procedure of CFLA2 ").

The Concept of CFLA2
CFLA2 is a centralised, anytime and greedy algorithm that solves Eq. 8 by maximising the working time of the agents and minimising the time required by coalitions to complete tasks. It is divided into four phases:
1. Defining the legal agent allocations ("Phase 1: Defining the Legal Agent Allocations").
2. For each task $v$, choosing the best coalition $C$ ("Phase 2: Selecting the Best Coalition for Each Task").
3. For each task $v$, doing a 1-step look-ahead ("Phase 3: Defining the Degree of Each Task") to define its degree, or the number of tasks that can be completed after the completion of $v$.
4. At each time $t \in [0, d_{\max}]$, allocating a task not yet completed and with the highest degree ("Phase 4: Overall Procedure of CFLA2").

Phase 1: Defining the Legal Agent Allocations
At time $t$, Algorithm 1 determines which free agents⁵ ($A^t_{\text{free}}$) can reach which uncompleted tasks ($V_{\text{unc}}$) before their deadlines. The resulting set of legal agent allocations is denoted by $L_t$. This phase is identical in CFLA.
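A minimal sketch of this phase is given below. It is not Algorithm 1 itself: the function name `legal_agent_allocations`, the dictionary layouts and the `travel_time` callback are illustrative assumptions, and "before their deadlines" is read here as a strict inequality on the arrival time.

```python
from typing import Callable, Dict, Hashable, Set, Tuple

Location = Hashable


def legal_agent_allocations(
    t: int,
    free_agents: Dict[str, Location],              # agent -> current location
    uncompleted: Dict[str, Tuple[Location, int]],  # task -> (location, deadline)
    travel_time: Callable[[str, Location, Location], int],
) -> Set[Tuple[str, str]]:
    """Pairs (agent, task) such that the free agent reaches the task
    strictly before the task deadline when leaving at time t."""
    legal = set()
    for agent, agent_loc in free_agents.items():
        for task, (task_loc, deadline) in uncompleted.items():
            arrival = t + travel_time(agent, agent_loc, task_loc)
            if arrival < deadline:
                legal.add((agent, task))
    return legal
```

The two nested loops visit every pair of free agent and uncompleted task once, which matches a cost proportional to $|A| \cdot |V|$ when the travel time is computed in constant time.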

Phase 2: Selecting the Best Coalition for Each Task
Given a task $v$ and a set of legal agent allocations $L_t$ (computed by Algorithm 1), Algorithm 2 returns the Earliest-Completion-First (ECF)⁶ coalition $C^*_v$ that can be allocated to $v$ [30, Section 6.2]. More precisely, the algorithm minimises both the size of $C^*_v$ and the time at which it completes $v$. This is achieved by iterating from the smallest to the largest possible coalition size (line 5) and iterating through all possible coalitions of each size (line 6). When the procedure finds a coalition $C$ that can complete $v$ within its deadline $d_v$ (line 7), then $|C|$ is the minimum size of the coalitions that can complete $v$. Hence, $C^*_v$ is identified among the coalitions that have size $|C|$ (lines 8-11). The summation at lines 7-8 is the workload done by the coalition allocations defined from the asynchronous arrivals ("Constraints") of the agents of $C$ (line 6) to the location of $v$.
Unlike the original formulation [30, Algorithm 2], Algorithm 2 clarifies that the minimum coalition size has to be determined by iterating through the subsets of the combinations⁷ of $A^t_v$, which is the set of free agents that at time $t$ can reach $v$ within $d_v$.
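The search described above can be sketched as follows. This is an illustration under assumptions rather than a transcription of Algorithm 2: the function name `ecf_coalition`, the `arrivals` dictionary (arrival time of each candidate agent at the task) and the single-task value function `u` are hypothetical, and work is assumed to happen only in time units strictly before the deadline.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Optional, Tuple


def ecf_coalition(
    arrivals: Dict[str, int],
    workload: float,
    deadline: int,
    u: Callable[[FrozenSet[str]], float],
) -> Optional[Tuple[FrozenSet[str], int]]:
    """Smallest coalition that completes the task in time and, among
    coalitions of that size, the one with the earliest completion."""

    def finish(coalition) -> Optional[int]:
        remaining = workload
        start = min(arrivals[a] for a in coalition)
        for t in range(start, deadline):
            # Agents work as soon as they arrive (asynchronous arrivals).
            present = frozenset(a for a in coalition if arrivals[a] <= t)
            remaining -= u(present)
            if remaining <= 0:
                return t + 1
        return None

    agents = sorted(arrivals)
    for size in range(1, len(agents) + 1):       # smallest size first
        best = None
        for combo in combinations(agents, size):  # all coalitions of this size
            done = finish(combo)
            if done is not None and (best is None or done < best[1]):
                best = (frozenset(combo), done)
        if best is not None:
            return best  # minimum size reached, earliest completion kept
    return None
```

Enumerating all coalitions of each size is what makes this search exponential in the number of candidate agents in the worst case.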

Phase 3: Defining the Degree of Each Task
Given a task $v$, Algorithm 3 performs a brute-force search to define its degree ("The Concept of CFLA2"). At line 8, with a procedure similar to line 5 in Algorithm 2, it checks how many tasks can be completed after the completion of $v$. Hence, Algorithm 3 assigns a score to each coalition allocation selected by Phase 2 ("Phase 2: Selecting the Best Coalition for Each Task") for each currently uncompleted task. These scores are then used by Phase 4 ("Phase 4: Overall Procedure of CFLA2") to choose the next task to execute. Algorithm 3 differs from the original [30, Algorithm 3] in two points. First, it only considers uncompleted tasks that have a deadline greater than or equal to $d_v$ (line 4). This prevents counting tasks that can be completed before the completion of $v$. As defined in "The Concept of CFLA2", the degree of $v$ represents the number of tasks that can be completed only after the completion of $v$. Second, at line 11, for each such task $v_2$, the degree of $v$ is not just incremented by 1, but also by 1 minus the rescaled workload of $v_2$, that is, the rescaling⁸ of $w_{v_2}$ to the range $[w_{\min}, w_{\max}]$, with $w_{\min}$ and $w_{\max}$ being, respectively, the minimum and maximum task workloads. Hence, the degree of $v$ is also a measure of how much total workload remains after the completion of $v$. Maximising the degree (line 12 of Algorithm 4) leads to the remaining tasks with the smallest workloads, which increases the probability of completing more of them.
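The second modification can be illustrated with a short sketch. The exact rescaling formula is not reproduced in this text, so the code assumes a standard min-max normalisation that maps $w_{\min}$ to 0 and $w_{\max}$ to 1. The function names `rescale` and `degree` are hypothetical.

```python
from typing import Iterable


def rescale(w: float, w_min: float, w_max: float) -> float:
    """Min-max normalisation of a workload (an assumed formula)."""
    if w_max == w_min:
        return 0.0  # all workloads are equal, so no task is penalised
    return (w - w_min) / (w_max - w_min)


def degree(completable_workloads: Iterable[float], w_min: float, w_max: float) -> float:
    """Degree of a task v, given the workloads of the tasks that can
    still be completed after v. Each such task adds 1 plus a bonus
    that is larger when its workload is smaller."""
    return sum(1.0 + (1.0 - rescale(w, w_min, w_max)) for w in completable_workloads)
```

Under this assumption, two follow-up tasks with the minimum workload give a degree of 4, while two follow-up tasks with the maximum workload give a degree of 2, so ties on the number of completable tasks are broken in favour of lighter remaining work.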

Phase 4: Overall Procedure of CFLA2
Algorithm 4 shows the overall procedure. The repeat-until loop runs until all tasks are completed, or until the latest deadline is expired (line 22). At each time t, the set of legal agent allocations is updated (line 8) and a task allocation is defined (lines 9-18). If it is not possible to allocate other tasks, the algorithm stops early (line 19).

Analysis and Discussion
Algorithm 1 iterates through all free agents and uncompleted tasks. Assuming that line 4 requires constant time, its time complexity is $O(|A| \cdot |V|)$.
Algorithm 2 iterates (line 5) from coalition size 1 to [...] Therefore, despite having a lower complexity than an optimal CFSTP solver ("Objective Function"), CFLA2 has a run-time that increases quadratically with the number of tasks and exponentially with the number of agents, which makes it unsuitable for systems with limited computational power or for real-time applications. Other limitations are as follows.
1. It can allocate at most one task per time unit [30, Section 7]. More formally, at each time unit, the best-case guarantee of CFLA2 is to find a partial solution with degree $k = 1$.
2. In general, greedily allocating a task with the highest degree now does not ensure that all uncompleted tasks will be allocated in the future. This is particularly relevant in an open system, where there is no certainty of having further uncompleted tasks ("Introduction").
3. The more the tasks can be grouped by degree, the more the look-ahead technique becomes a costly random choice. In other words, at time $t$, if some tasks $V' \subseteq V$ all have the maximum degree, then Algorithm 4 selects $v^*$ randomly from $V'$. Hence, the larger $V'$ is, the less relevant Algorithm 3 becomes.
4. In Algorithm 4, all tasks have the same weight. That is, tasks with earlier deadlines may not be allocated before tasks with later deadlines. This is independent of the order in which the uncompleted tasks are processed (line 9) since the computation of the maximum degree (line 12) would not be affected.
To overcome the limitations of CFLA2 , in the next section we present CTS, a CFSTP solver that is anytime, efficient and with convergence guarantee, both in closed and open systems.

Cluster-Based Task Scheduling
The Cluster-based Task Scheduling (CTS) is an anytime and greedy algorithm that operates at the agent level, rather than at the coalition level. It is divided into the following two phases.
1. For each agent $a$, defining a task $v$ such that $v$ is the closest to $a$ and $d_v$ is minimal.
2. For each task $v$, defining the coalition of agents to which $v$ has to be allocated.
Algorithm 5 is used in Phase 1, while Algorithm 6 enacts the two phases. We describe them, respectively, in "Selecting the Best Task for Each Agent" and "Overall Procedure of CTS".

Selecting the Best Task for Each Agent
Given a time $t$ and an agent $a$, Algorithm 5 returns the uncompleted task $v$ that is allocable, the most urgent and closest to $a$. By allocable we mean that $a$ can reach $v$ before deadline $d_v$, while most urgent means that $v$ has the earliest deadline. The algorithm prioritises unallocated tasks, that is, it first tries to find a task to which no agents are travelling, and on which no agents are working ($v^t_a[0]$). Otherwise, it returns an already allocated but still uncompleted task such that $a$ can reach it and contribute to its completion ($v^t_a[1]$). This ensures that an agent becomes free only when no other tasks are allocable and uncompleted.
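The selection rule can be sketched in a single pass over the uncompleted tasks. The sketch is an assumption-laden illustration, not Algorithm 5: the `Task` record, the function name `best_task` and the `travel_time` callback are hypothetical, urgency is assumed to take priority over proximity, and "before deadline" is read as a strict inequality.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable, Optional


@dataclass(frozen=True)
class Task:
    name: str
    location: Hashable
    deadline: int
    allocated: bool  # True if some agent is travelling to it or working on it


def best_task(
    t: int,
    agent: str,
    agent_loc: Hashable,
    uncompleted: Iterable[Task],
    travel_time: Callable[[str, Hashable, Hashable], int],
) -> Optional[Task]:
    """Most urgent and closest allocable task, preferring unallocated ones."""
    best = [None, None]      # [0]: unallocated candidate, [1]: allocated candidate
    best_key = [None, None]
    for v in uncompleted:
        trip = travel_time(agent, agent_loc, v.location)
        if t + trip >= v.deadline:
            continue  # not allocable: the agent cannot arrive before the deadline
        slot = 1 if v.allocated else 0
        key = (v.deadline, trip)  # earliest deadline first, then shortest trip
        if best_key[slot] is None or key < best_key[slot]:
            best[slot], best_key[slot] = v, key
    return best[0] if best[0] is not None else best[1]
```

Keeping one candidate per slot means that an allocated task is returned only when no unallocated task is allocable, which mirrors the priority described above.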

Overall Procedure of CTS
The overall procedure is described in Algorithm 6. The repeat-until loop is the same as in CFLA2, to preserve the anytime property. Phases 1 and 2 are represented, respectively, by the loops at lines 5 and 16.
Phase 1 loops through all agents. Here, an agent $a$ may either be free or reaching a task location. In the first case (line 6), if an uncompleted task $v$ can be allocated to $a$ (lines 7 and 8), then $v$ is flagged as allocable (line 9) and $a$ is added to the set of agents $A^t_v$ to which $v$ could be allocated at time $t$ (line 11). In the second case (line 12), $a$ is travelling to a task $v$, hence its location is updated (line 13) and, if it has reached $v$, it is set to working on $v$ (line 14).
Phase 2 visits each uncompleted task $v$. If $v$ is allocable (line 18), then it is allocated to the smallest coalition of agents in $A^t_v$ (defined in Phase 1) that can complete it (lines 19-32). In particular, lines 24-27 accumulate the amount of workload $w_v$ done by all the coalitions formed after the arrival at $v$ of the first $i - 1$ agents in the sequence defined at line 19. After that, if agents are working on $v$ (line 33), its workload $w_v$ is decreased accordingly (line 34). If $w_v$ drops to zero or below, then $v$ is completed (lines 35-37). The algorithm stops (line 39) when all the tasks have been completed, or the latest deadline has expired, or no other tasks are allocable and uncompleted ("Selecting the Best Task for Each Agent").
The spatial constraints (Eqs. 6 and 7) are satisfied by executing Algorithm 5 only on free agents (line 6), while the temporal constraints (Eq. 5) are satisfied by allocating a task v to a coalition C only when C has the minimum size and can complete v within the deadline d v .
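The coalition choice of Phase 2 can be sketched without enumerating subsets. The code below is an illustration under assumptions, not lines 19-32 of Algorithm 6: it assumes that candidate agents are considered in order of arrival at the task, that work happens only in time units strictly before the deadline, and that `u` gives the work done per time unit by the agents present. The name `smallest_prefix_coalition` is hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Optional


def smallest_prefix_coalition(
    arrivals: Dict[str, int],
    workload: float,
    deadline: int,
    u: Callable[[FrozenSet[str]], float],
) -> Optional[List[str]]:
    """Shortest prefix of the agents, sorted by arrival time, that
    completes the task before the deadline."""
    order = sorted(arrivals, key=lambda a: (arrivals[a], a))  # earliest arrival first
    for i in range(1, len(order) + 1):
        coalition = order[:i]
        remaining = workload
        for t in range(arrivals[coalition[0]], deadline):
            # Each agent contributes from its own arrival time onwards.
            present = frozenset(a for a in coalition if arrivals[a] <= t)
            remaining -= u(present)
            if remaining <= 0:
                return coalition
    return None
```

Only prefixes of the sorted sequence are tried, so at most as many candidate coalitions as agents are evaluated, in contrast with the subset enumeration used by a look-ahead approach.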

Analysis and Discussion
The approach of CTS transforms the CFSTP from a 1-$k$ task allocation into a series of 1-1 task allocations. In other words, instead of allocating each task to a coalition of $k$ agents, coalitions are formed by clustering (i.e., grouping) agents based on the closest and most urgent tasks. This is an eligibility criterion: unlike CFLA2, CTS exploits the distances between agents and tasks and the speeds of agents to reduce the time needed to define coalition allocations. Algorithm [...]
The following theorem is based on the definitions given in "Constraints".

Theorem 1 CTS is guaranteed to find feasible coalition allocations.
Proof We prove by induction on time $t$. At $t = 0$, Phase 1 of Algorithm 6 selects a task $v$ for each agent $a$ such that $v$ is allocable, the most urgent and closest to $a$ ("Selecting the Best Task for Each Agent"). This implies that the agent allocation $(a \to v)_0$ is legal ("Constraints"). Then, Phase 2 ("Overall Procedure of CTS") allocates $v$ to $a$ only if there exists a coalition $C$ such that $|C|$ is minimum, $(C \to v)_0$ is feasible ("Constraints") and $a \in C$.
At $t > 0$, for each agent $a$, there are two possible cases: a task $v$ has been allocated to $a$ at time $t' < t$, or $a$ is free (i.e., idle). In the first case, $a$ is either reaching or working on $v$ (lines 12-15 in Algorithm 6), hence $(a \to v)_t$ is legal and $(C \to v)_t$ is feasible, where $a \in C$. In the second case, $a$ is either at its initial location or at the location of a task on which it finished working at time $t' < t$. Thus, as in the base case, if there exist a coalition $C$ and a task $v$ such that $|C|$ is minimum, $(C \to v)_t$ is feasible and $a \in C$, then $v$ is allocated to $a$. ◻
As shown in the two previous sections, Algorithm 5 iterates exactly once over a finite set of uncompleted tasks, while the repeat-until loop of Algorithm 6 is executed at most $d_{\max}$ times. Hence, a corollary to Theorem 1 is that CTS converges to a partial solution if one exists.
The counterexample given by Limitation 2 in "Analysis and Discussion" does not allow us to prove the convergence of CFLA and CFLA2 in general settings. Since no current algorithm that solves the CFSTP is simultaneously anytime, efficient and with convergence guarantee ("Introduction"), CTS is the first of its kind.
$d_{\max} \cdot (|V| + |A| \log |A|)$ (11)

Comparison Tests
We implemented CFLA, CFLA2 and CTS in Java, 9 and replicated the experimental setup of [30] because we wanted to evaluate how well CFLA2 and CTS perform in settings where the look-ahead technique is highly effective. For each test configuration, we solved 100 random CFSTP instances and plotted the average and standard deviation of: percentage of completed tasks; agent travel time ("Basic Definitions"); task completion time, or the time at which a task has no workload left; problem completion time, or the time at which no other tasks can be allocated.

Setup
Let $U(l, u)$ and $U_I(l, u)$ be, respectively, a uniform real distribution and a uniform integer distribution with lower bound $l$ and upper bound $u$. Our parameters are defined as follows. [...]
Unlike [30], we set the maximum number of agents to 40 instead of 20, because in this setup it allows all tasks to be completed in some instances. We did not perform a comparison on larger instances because of the run-time of CFLA and CFLA2: on commodity hardware, CTS takes seconds to solve instances with thousands of agents and tasks, while CFLA and CFLA2 take days. Consequently, the purpose of this section is to highlight the performance of CTS using CFLA and CFLA2 as a baseline. We aim to verify the scalability of CTS in a future investigation.

Results
In terms of completed tasks (Fig. 1a), the best performing algorithm for instances with up to 18 agents is CFLA2 , while the best performing algorithm for instances with at least 20 agents is CTS. CFLA is outperformed by CFLA2 in all instances except those with 2 agents, and by CTS in instances with at least 10 agents. The reason why the performance of CFLA and CFLA2 does not improve significantly starting from instances with 20 agents is that the more agents (with random initial locations) there are, the more the tasks are likely to be grouped by degree. 10 CFLA2 has a trend similar to that of CFLA because it has the same limitations, but it performs better due to its improved look-ahead technique. CTS is not the best in all instances because its average task completion time is the highest (see the discussion on Fig. 1c below). This implies that the fewer the agents, the more the tasks may expire before they can be allocated. In our setup, 10 (resp. 20) is the number of agents starting from which this behaviour is contained enough to allow CTS to outperform CFLA (resp. CFLA2).
Regarding agent travel times (Fig. 1b), it can be seen that CTS is up to three times more efficient than CFLA and CFLA2. This is due to Algorithm 5, which allocates tasks to agents also based on their proximity. CFLA2 has lower agent travel times than CFLA for the following reason. The degree computation in CFLA2 also considers how much total workload would be left ("Phase 3: Defining the Degree of Each Task"). Higher degrees correspond to lower workloads, and tasks with lower workloads are completed first. Thus, fewer tasks are grouped by degree and more are likely to be completed. This means that the average distance between task locations in a CFLA2 solution may be lower than that of a CFLA solution. The agent travel times increase with the number of agents for all algorithms. This behaviour is also reported, but not explained, by Ramchurn et al. [30]. To explain it, let us consider a toy problem with one agent $a_1$ and one task $v$. If we introduce a new agent $a_2$ whose travel time from its initial location $l^0_{a_2}$ to $l_v$ is greater than the travel time of $a_1$ from $l^0_{a_1}$ to $l_v$, then the average travel time increases. In our setup, this happens because the initial agent locations are random.
In general, task completion times (Fig. 1c) decrease because the more agents there are, the faster the tasks are completed. The completion time of task $v$ is related to the size of the coalition $C$ to which $v$ is allocated: the higher the completion time, the smaller the size of $C$, hence the longer the working time of the agents in $C$. Task completion times are inversely related to agent travel times. Since CTS has the smallest agent travel times and allocates tasks to the smallest coalitions, it consequently has the highest task completion times. Therefore, in CTS, agents work for the longest time, and the number of tasks attempted at any one time is the largest.
The problem completion times (Fig. 1d) are in line with the task completion times (Fig. 1c) since the faster the tasks are completed, the less time is needed to solve the problem. The reason why the times of CFLA and CFLA2 do not decrease significantly from 20 agents up is linked to their performance (see the discussion on Fig. 1a above). On the other hand, the fact that the times of CTS decrease more consistently than those of CFLA and CFLA2 indicates that CTS is the most efficient asymptotically. In other words, CTS is likely to solve large problems in fewer time units than CFLA and CFLA2.
In terms of computational times, CTS is significantly faster than CFLA and CFLA2. For example, in instances with 40 agents and 300 tasks, on average¹¹ CTS is 45106% ± [2625, 32019] (resp. 27160% ± [1615, 20980]) faster than CFLA (resp. CFLA2). The run-time improvement of CFLA2 over CFLA is due to line 4 of Algorithm 3, thanks to which the look-ahead technique processes fewer tasks.

Tests with the RoboCup Rescue Simulation
In this section, we benchmark a variant of CTS ("Cluster-Based Task Scheduling") against high-performance DCOP solvers with the RoboCup Rescue Simulation (RCRS), one of the most important projects promoting multi-agent research on disaster response [16]. By reproducing the aftermath of an earthquake in a city, the RCRS allows verifying coordination approaches that could be enacted by first responders in such situations [15,31]. We conducted the tests with our fork of RMASBench¹² [17], a benchmark platform based on the RCRS. We chose it because it allows comparisons against ready-to-use implementations of BinaryMS ("Introduction") and the Distributed Stochastic Algorithm (DSA) [40]. We use them as a baseline because:
• Max-Sum and its variants are widely used and can obtain partial solutions with very high degrees ("Introduction"). In particular, BinaryMS can produce a solution within the time limit enforced by the RCRS¹³ and with the same quality as FMS [27].
• Since numerous empirical evaluations have proven its efficacy in many different domains, DSA is a touchstone for testing DCOP and RCRS algorithms [10].
"Simplified CTS" describes how CTS can be adapted for use in the latest RCRS version. 14 The following two sections report our setup and results, respectively.

Simplified CTS
In the current RCRS version, deadlines and workloads are not accessible to agents. Thus, we cannot implement CTS, since we can neither verify the spatial constraints in Phase 1 nor implement Phase 2 ("Overall Procedure of CTS"). However, RMASBench makes it possible to obtain the utility of a task, which is a quantitative measure of the current importance of that task. Consequently, we implemented a modified Phase 1 in which each agent independently chooses to work on the closest task with the highest utility. We call this variant Simplified CTS (S-CTS).
The time complexity of S-CTS is O(d_max ⋅ |V|), since the agents do not coordinate with each other and make their choices in parallel. Although this simplification may seem like a major handicap, we show below that S-CTS offers a reasonable trade-off between performance and complexity.
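The selection rule of S-CTS can be sketched as follows. This is a minimal illustration, not the RMASBench implementation. It assumes that "the closest task with the highest utility" means ranking tasks by utility first and breaking ties by distance, and that distance is Euclidean (a real RCRS agent would more likely use distances on the road network). The names `Task`, `choose_task` and `s_cts_step` are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Task:
    name: str
    x: float
    y: float
    utility: float  # current importance of the task, as reported by the benchmark


def choose_task(agent_xy, tasks):
    """Return the task a single agent picks on its own, or None if there is none.

    Assumption: prefer the highest utility, break ties by Euclidean distance.
    """
    if not tasks:
        return None
    ax, ay = agent_xy
    return min(tasks, key=lambda t: (-t.utility, hypot(t.x - ax, t.y - ay)))


def s_cts_step(agents, tasks):
    """One decision step: every agent chooses independently, without messages."""
    return {name: choose_task(xy, tasks) for name, xy in agents.items()}


tasks = [
    Task("fire_a", 0.0, 0.0, 5.0),
    Task("fire_b", 10.0, 0.0, 5.0),
    Task("fire_c", 1.0, 1.0, 2.0),
]
agents = {"brigade_1": (1.0, 0.0), "brigade_2": (9.0, 0.0)}
allocation = s_cts_step(agents, tasks)
print({agent: task.name for agent, task in allocation.items()})
```

Each agent scans the task list once and exchanges no messages, so the per-agent choices could run in parallel. The dictionary comprehension in `s_cts_step` is sequential only for the sake of a short example.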

Setup
All tests are based on the Paris map, one of the most used in the RoboCup competition. We kept the default setup [27, Section 6.1] because, according to the authors, it maximises the performance of both BinaryMS and DSA.
In RMASBench, there are police patrols and fire brigades. A police patrol can unblock roads, while a fire brigade can extinguish fires. Having 2 agent types allows studying inter-team coordination aspects. Since this is not in our scope, we did not consider road blockades. As a result, our problems are easier and our baseline is more competitive. Figure 2 gives an example.
The RCRS is based on scenarios [31]. A scenario is a class of problems, whose main parameter is the number of agents. In RMASBench, there are 5 scenarios, respectively, with 15, 21, 27, 33 and 40 fire brigades. Other settings are as follows.
• The agents are homogeneous, that is, they all have the same speed and water tank size.
• There are 3 ignition points and each scenario is replicated 30 times. At each execution, a pseudo-random number generator influences the way the fires spread from ignition points to nearby buildings.
• To get a non-trivial number of fires, the agents are added 25 seconds after the start. That is, u(C, v) = |C|.
• Deadlines and workloads are randomly generated by the RCRS.
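The setup above can be enumerated as a run plan, together with the mean and standard deviation that are reported for each scenario and algorithm. The sketch below is only an illustration: the names `run_plan` and `summarise` are hypothetical, and using the replicate index to tell executions apart is an assumption, not the way RMASBench configures its pseudo-random number generator.

```python
from statistics import mean, stdev

FIRE_BRIGADES = (15, 21, 27, 33, 40)       # one scenario per team size
REPLICATES = 30                            # executions per scenario
ALGORITHMS = ("S-CTS", "BinaryMS", "DSA")  # solvers compared in the tests


def run_plan():
    """List every (algorithm, number of fire brigades, replicate index) run."""
    return [
        (algorithm, brigades, replicate)
        for algorithm in ALGORITHMS
        for brigades in FIRE_BRIGADES
        for replicate in range(REPLICATES)
    ]


def summarise(samples):
    """Average and sample standard deviation of one metric over the replicates."""
    return mean(samples), stdev(samples)


plan = run_plan()
print(len(plan))                  # runs per metric over all algorithms and scenarios
print(summarise([1.0, 2.0, 3.0]))
```

With 5 scenarios, 30 replicates and 3 algorithms, the plan contains 450 runs, that is, 150 per algorithm.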

Fig. 2
Detail of an example problem on the Paris map. The red dots are fire brigades and the blue lines are their water jets. The colour of the buildings reflects their status: grey means no damage; yellow to red means on fire; blue to purple means that the fire has been extinguished, and black means that the building is burnt. The darker the colour, the greater the damage. On the centre-right is a fire station, to which the fire brigades return to refill.
For each scenario and algorithm, we plot the average and standard deviation of:

Results
The more the agents communicate with each other, the better they coordinate. In turn, this leads to lower completion times and fewer burned buildings. Because there is no exchange of messages in S-CTS, while BinaryMS has the highest communication overhead, they are, respectively, the worst and the best performing in Fig. 2a, b. Nevertheless, the lack of communication does not result in a drastic drop in performance. In Fig. 2c, in the worst-case scenario (i.e., 21 agents), S-CTS scores on average about 10% (resp. 5%) less than BinaryMS (resp. DSA). This is not trivial, given that S-CTS is a simplification and that the scenarios used are
Regarding the average CPU time (Fig. 2d), S-CTS is up to 2 orders of magnitude (resp. 1) faster than BinaryMS (resp. DSA). This is because BinaryMS has a pre-processing phase that requires exponential time ("Introduction"), while DSA, despite having a time complexity similar to that of S-CTS [10, Table 4], also has a message-passing phase.
In Fig. 2a-c, the trends converge to zero because the more agents there are, the less relevant the solver becomes. In other words, the greater the number of agents, the higher the quality of solutions. We can deduce that more agent communication yields a higher score, but also a higher CPU time. However, as we have seen, the performance difference between communication and no communication is not necessarily significant.

Conclusions
In this paper, we proposed two novel algorithms to solve the CFSTP. The first is CFLA2, an improved version of CFLA, and the second is CTS, which is the first to be simultaneously anytime, efficient and with a convergence guarantee. CFLA2 can replace CFLA in offline settings or for small problems, while CTS provides a baseline for benchmarks with dynamic and large problems. Moreover, we showed that a simplified but parallel variant of CTS is enough to compete with high-performance solvers (i.e., BinaryMS and DSA) in the RCRS. Because it significantly outperforms CFLA and is more widely applicable than CFLA2, we can consider CTS to be the new state-of-the-art CFSTP solver. Due to its features ("Analysis and Discussion"), CTS can also be used in contexts that are not necessarily real-time, but can still be captured by the CFSTP model, such as multi-robot area coverage or exploration of environments that are dangerous for humans [30, Section 8].
The limitation of CTS is that it cannot define the quality of its approximation ("Analysis and Discussion"). Moreover, the fact that it maximises the agent working times ("Comparison Tests") implies that some agents may take longer to complete some tasks and therefore may not work on others. Thus, if an optimal solution exists, CTS in general cannot guarantee that it will find it. The CFSTP model also has some limitations, including:
• The tasks all have the same weight and have no order. This does not capture scenarios such as search and rescue missions, where some tasks may have higher priority or must be completed before others [20,24,28].
• The task workloads are assumed static, when in reality they might be dynamic (e.g., fires that grow in intensity).
• Agent communication is perfect and without costs (i.e., free comm environment [26]). Instead, real-world communication channels may fail or have operational constraints, such as low bandwidth or limited network topology (e.g., sparse robot swarms [34]).
• Each agent knows its subproblem a priori (i.e., total knowledge or deterministic environment behaviour [10, Section 3]). In real-world domains, task states are partially or not known a priori, thus the agents must balance the exploration of the environment and the exploitation of the acquired information [35,36].
Consequently, future work aims at:
1. Extending CTS to give quality guarantees on the solutions found, and testing its scalability in dynamic benchmarks.
2. Extending the CFSTP model to eliminate the aforementioned limitations and capture more disaster response scenarios.
3. Designing an anytime, optimal and distributed algorithm to solve both the CFSTP and our extension.
4. Investigating efficient inter-team coordination in the RCRS [27]. Specifically, focusing on problems with fires, road blockades and victims trapped under the rubble.