Multi-robot task allocation problem with multiple nonlinear criteria using branch and bound and genetic algorithms

The paper proposes the formulation of a single-task robot (ST), single-robot task (SR), time-extended assignment (TA), multi-robot task allocation (MRTA) problem with multiple, nonlinear criteria using discrete variables that drastically reduce the computation burden. Obtaining an allocation is addressed by a Branch and Bound (B&B) algorithm in low scale problems and by a genetic algorithm (GA) specifically developed for the proposed formulation in larger scale problems. The GA crossover and mutation strategies design ensure that the descendant allocations of each generation will maintain a certain level of feasibility, reducing greatly the range of possible descendants, and accelerating their convergence to a sub-optimal allocation. The proposed MRTA algorithms are simulated and analyzed in the context of a thermosolar power plant, for which the spatially distributed Direct Normal Irradiance (DNI) is estimated using a heterogeneous fleet composed of both aerial and ground unmanned vehicles. Three optimization criteria are simultaneously considered: distance traveled, time required to complete the task and energetic feasibility. Even though this paper uses a thermosolar power plant as a case study, the proposed algorithms can be applied to any MRTA problem that uses a multi-criteria and nonlinear cost function in an equivalent way. The performance and response of the proposed algorithms are compared for four different scenarios. The results show that the B&B algorithm can find the global optimal solution in a reasonable time for a case with four robots and six tasks. For larger problems, the genetic algorithm approaches the global optimal solution in much less computation time. Moreover, the trade-off between computation time and accuracy can be easily carried out by tuning the parameters of the genetic algorithm according to the available computational power.


Introduction
In recent decades, there have been great advances in the field of robotics with several works with humanoid robots [1], robotic arms [2] and in mobile robots or unmanned vehicles. Particularly, these unmanned vehicles have been used sible way to optimize one or more performance indexes, such as energy consumption or total time spent among others. According to the taxonomy proposed in [16], the MRTA problem can be classified as: -Single task robot (ST) or multi-task robot (MT) problem, depending on the number of tasks each robot can perform at a time: just one or more than one. -Single-robot task (SR) or multi-robot task (MR) problem, depending on the number of robots needed to perform a task: only one or more than one. -Instantaneous assignment (IA) or time-extended assignment (TA), depending whether future allocations are taken into account or not. Within MRTA solution approaches, two clear types can be distinguished: centralized and decentralized. In many cases, the use of a centralized approach in which we focus in this paper makes it possible to improve the performance substantially but in return it has the negative counterpart of increasing the computational load. The performance also depends on the algorithm used. The most commonly used are either: • Market-based approaches such as auction algorithms [17,18] are inspired in economy and provide a simple way of allocating resources but imply explicit communication among the agents, which sometimes may cause the communication cost be too high. The main advantage of these approaches is the scalability and robustness they exhibit, although optimality is never ensured since any agent of the system has complete information about it. Auctions have been widely used throughout human history in order to allocate resources and in the case of robots, which are not programmed to be selfish or to lie, the auction-based algorithm [18][19][20] is appropriate. • Optimization-based algorithms [21] focus on finding the optimal solution of a problem mathematically, given a set of constraints (usually a NP-Hard problem with the corresponding computational burden) like in the Optimal Assignment Problem (OAP) [22]. These approaches often depend on a central agent and as a consequence, are highly susceptible to attacks or communication failures. Within these algorithms, there are two main subtypes: -Deterministic, such as gradient and Hessian-based methods [23,24]. -Stochastic, such as meta-heuristic algorithms [25] like the Ant Colony Algorithm [26], Cuckoo Search [27][28][29], Particle Swarm Optimization (PSO) [30][31][32] or genetic algorithm (GA) [33] among others.

MRTA in the context of thermosolar plants
Thermosolar plants cover large extensions of land situated in areas of high solar irradiance, in which mirrors are used to concentrate solar thermal energy to generate electric power. The most common type of thermosolar plant nowadays is the Concentrated Solar Plant with Parabolic Trough Collector (CSP PTC). This type of plant uses parabolic mirrors to focus the solar energy toward some piping situated in the focal point of the parabola. Thermal oil is used as a heat transfer fluid (HTF) circulating through these pipes and absorbing radiation. Then, the HTF is transported to a power generation plant through a manifold that collects the hot oil from the collectors. Interested readers can find more information about these plants in [34].
CSP PTC plants are usually operated by controlling the total flow that goes through the manifolds, generally with the aim of maintaining the oil temperature close to a given setpoint. However, this control has an important drawback whenever the irradiance is not uniform in the whole plant. A localized decrease on the irradiance, produced by a cloud, might trigger a decrease in the flow, but since the decrease would be localized in a certain area of the plant there would be other collectors where this decrease in flow might induce an increase in the oil temperature. Increases in the oil temperature above the maximum admissible can be harmful for the process and even dangerous to the plant equipment and therefore, in some cases, collectors must be defocused, as a security measure, although that means throwing away energy.
The problem of defocusing has been addressed from different approaches by proposing the use of predictive control strategies as model predictive control (MPC), [35,36]. In other works, as [37], controlling the flow in each collector by using valves is proposed. For these types of controllers, having an estimation of the distribution of the irradiance throughout the plant is needed.
MRS are frequently used in mobile robot sensor networks (RSN) [38], a particular case of wireless sensor networks (WSN) [39], where assembling a plant-wide network with sensors is considered an extremely expensive option and an alternative solution is sought by using robots that transport the sensors to different places in the plant. This is the case of thermosolar plants, where on the one hand, direct normal irradiance (DNI) sensors, pyrheliometers, are quite expensive and, on the other hand, there is a familiar and structured environment where all the possible paths for the robots can be pre-determined. Therefore, in this scenario, unmanned vehicles equipped with DNI sensors would be sent to different areas of the plant to take measurements [40]. Since a robot cannot take measurements at two different places at the same time, although going to a spot first and then to another is possible, we are dealing with a ST-TA problem according to the classification previously introduced. If we also take into account that we only need one vehicle to take a measurement and that vehicles can be of different types, we can complete the classification of the problem as a heterogeneous, ST , SR,TA, MRTA one. Another work where a MRS has been proposed in the solar power plants context is [41], where mobile robots are proposed for the inspection and maintenance of the plants.
In this paper, a genetic algorithm and a branch and bound algorithm are applied to a specific, heterogeneous, ST-SR-TA MRTA problem, particularized in a case study of a RSN that collects data all along a thermosolar power plant. For this purpose, two types of robots with different possible paths and speeds are proposed: UGVs and UAVs.

Contributions and outline
The main contribution of this work is the MRTA problem formulation of an MRS in the context of inspection and measurement tasks in a solar thermal plant. The problem has been formulated using a cost function with multiple objectives designed taking into account that our MRS is of the ST-SR-TA type as well as the fact that there may be tasks that are more urgent than others or that it may be more desirable to use some robots instead of others, e.g., it is better to use ground vehicles whenever possible since that way, the battery of aerial vehicles will be preserved in case some urgent missions arise.
This formulation includes the use of discrete variables that code the information of the allocations and allows to greatly increase the speed of obtaining a solution since the number of variables involved in coding an allocation falls drastically. Finally, both a B&B algorithm and a genetic algorithm have been specially designed for this formulation and a comparison of the two approaches has been made. The genetic algorithm designed makes use of a set of solutions associated with the problem that are used by the algorithm as initial population, which speeds up its performance and ensures a good solution for the allocation.
This paper is organized as follows. The problem statement and the mathematical formulation are presented in Sect. 2. In Sect. 3, the proposed algorithms to approach the problem are presented. In Sect. 4, the problem is particularized for a case of study based on a thermosolar power plant and a fleet of robots composed of UAVs and UGVs. The results of proposing several scenarios for the optimization of the problem are analyzed in Sect. 5. Finally, the conclusions and future works are drawn in Sect. 6.

Problem statement
In this section, the MRTA problem considered is formulated in the most generic possible way, assuming that there is a fleet of robots receiving orders from an upper layer, which also decides which tasks are to be done. It should be taken into account that, although in this paper we apply this formulation in the context of CSP PTC, an equivalent formulation consisting of using a MRS as a RSN can be efficiently used for other MRTA problems.
The MRTA problem considered is expressed mathematically for a set of N robots R = {r 1 , r 2 , . . . , r N } and a set of M tasks S = {s 1 , s 2 , . . . , s M }. According also to the taxonomy mentioned in Sect. 1, the problem can be defined as follows: • Heterogeneous, as there can be different types of robots.
• ST, as each robot can only fulfill one task at a time.
• SR, as only one robot is needed to fulfill one task.
• TA, since obtaining the optimal allocation for all tasks requires having into account that: -A robot can fulfill a new task once it has completed the previous task. -Each task requires a certain amount of time to be completed. -The best allocation may require that one or more robots stay still while other robots perform more than one tasks.

Discrete variables
A set of discrete variables u i (n) has been considered to model the problem mathematically. These variables, u i (n), represent the nth allocated task of robot i, i.e., n provides information on the order in which a robot is performing the allocated tasks. Thus, u i (n) ∈ S ∪ {0}, since robot i is either performing a task as its nth mission or not doing anything (u i (n) = 0). Note that if this variable takes value 0 for a robot i at any n k , u i (n k ) = 0, it makes no sense that any u i (n k+l ) = 0 with l > 0. As n must be as large as the maximum number of tasks that a single robot can perform and there are no limitations on this respect. The most extreme case is that one single robot performs all the tasks and so, it is considered that n ∈ [1, . . . , M] and as a consequence, there are N × M variables. An example of these variables can be seen in Table 1 One of the main advantages of using discrete variables is that the constraints associated with the SR problem (a Table 1 In the allocation presented, robot 1 does task 1 first and, once, it is finished disregarding the time its takes to do so, it performs task 3 while robot 2 performs task 2

Robot
First robot Second robot Note that since u 1 (3) = 0, robot 1 is not doing anything after fulfilling task 3 and the same happens with robot 2 after completing task 2 robot cannot be assigned to more than one task at a time) are satisfied automatically with no need of soft or hard constraints. A further advantage is that the number of discrete variables used is smaller than when formulating the problem with binary constraints, therefore making the computational cost lower and making addressing nonlinear cost functions easier.

Cost function
In order to carry out the optimization, the proposed cost function must take into account multiple criteria: -Time employed to complete each task, i.e., the amount of time elapsed between the allocation and the completion of a task. However, the approach proposed allows changing these criteria just by changing the cost function.
In this work, we have also considered that: -Each task may have a different weight as not all the tasks are equally urgent. Moreover, it may be more desirable to use a certain robot than another, which makes the distance traveled by robots have different weights as well. -It is also necessary to take into account that allocations must be feasible, i.e., an allocation cannot use more energy than the energy available. Notice as well that a robot must not only have enough energy to fulfill its allocated tasks but also to go to a charge station after completing them.
A cost function based on different criteria such as the delay on performing the tasks or the distance traveled by robots can be formulated as follows: where U = [u 1 (1), u 1 (2), .., u 1 (M), u 2 (1), . . . , u N (M)], δ j corresponds to the priority given to a certain task j, t j (U ) is the time that it takes to complete task j in a given allocation, λ i corresponds to the penalty of using robot i and d i (U ) corresponds to the distance traveled by robot i. Note that these parameters may vary depending on the problem and they depend on the application (there may be applications in which performing the tasks as fast as possible is desired, applications where limiting the movement of the robots is the best option, or mixed applications where we can have both, robots mobility limited differently depending on the robot and urgent and minor tasks). Function γ (U ) is used to model the following soft constraints: -Function γ 1 (U ) is a function which takes value 0 when the allocation is feasible from an energetic point of view, i.e., when there is enough battery for the allocation to be fulfilled and to reach the battery stations from the final points. In case the allocation is not feasible, it takes the same value as the number of robots with not enough energy to fulfill the allocation. This way, the penalization is higher when the allocation implies more robots out of energy. -Function γ 2 (U ) is used to ensure that all the tasks are performed and that each task is allocated only once. Thus, it models the ST formulation of the problem. It takes value 0 when no task is repeated and every task is performed and otherwise, it takes a value that equals the number of tasks allocated to more than one robot plus the number of tasks not allocated to any robot.
The weights α 1 and α 2 are much larger than the rest of the weights of the cost function since it is better to obtain an allocation that complies with the constraints than one that does not.
Note that the constraints can be modeled as soft constraints since non-compliance is not critical. The fact that an allocation makes a robot perform a task that has already been performed by a different robot is not desirable, but there is no physical problem with it. On the other hand, the fact that a robot does not have enough energy to carry out the tasks allocated to it will only imply that some of those tasks will not be performed in time, since the robot will have to recharge its battery. Anyway, in the case that an unfeasible allocation has to be avoided, the value of α 1 and α 2 can be increased. Finally, it should be noted that in the case that there is no feasible solution, posing the constraints as soft constraints will make the algorithms return those allocations that are closer to the feasible zone.
The problem stated in this paper (1), and the algorithms proposed in Sect. 3, do not depend on how functions t j , d i Table 2 Note that U 1 and U 2 represent the same assignment under this formulation since both represent that robot 1 performs first task 1 and then task 3 and robot 2 performs task 2

Robot
First robot Second robot and γ are obtained, i.e., these can be either linear or nonlinear functions. In fact, other performance indexes, such as energy consumption or carbon emissions, could be added to the current ones, or even replace them, changing neither the problem formulation nor the proposed algorithm. However, it is important to remark that the faster these functions are computed, the larger the problem solved optimally could be. The specific functions used in this paper are detailed in Sect. 4.3.

Proposed control algorithms
The considered MRTA problem (1) entails a discrete optimization with N · M variables, which can take M +1 discrete values from 0 to M. As a consequence, computation times will increase exponentially with the size of the problem making impossible to solve the problem exhaustively, even for relatively small problems. More specifically, the number of feasible discrete solutions can be computed with: M+1 (2) where it can be seen that with 3 robots and 3 tasks the number of feasible solutions is 6561, with 5 robots and 5 tasks, the number of feasible solutions goes up to 244,140,625, with 8 robots and 8 tasks, the number of feasible solutions reaches 1.8014 · 10 16 . However, the number of solutions to be evaluated could be further bounded according to the soft constraint α 2 introduced in the previous section, i.e., focusing on solutions that perform all tasks once. Also, those solutions that represent the same assignments as one already considered (produced due to changes in the zeroes in U as can bee seen in Table 2) can be removed from the count. To this end, it is possible to make use of combinatorial analysis to find a formula for the exact number of plausible (potentially optimal) solutions.
where P M = M! is the number of permutations without repetition of S, PR The number of solutions obtained for some problems with from 1 to 7 tasks and from 1 to 7 robots and the increasing trends with both the number of tasks and the number of robots can be better observed in Fig. 1.
As a consequence, in order to carry out the optimization, this work considers two different approaches: for small-sized problems, the optimal solution can be found by using a Branch and Bound (B&B) algorithm (see Sect. 3.2). However, for medium-and large-scale problems finding the optimal solution using B&B also becomes unfeasible. In this case, the use of a genetic algorithm (GA) is proposed (see Sect. 3.3).
Both algorithms (B&B and GA) make use of a set of initial solutions that provide an upper bound on the value of the optimal cost function. The computation of these initial solutions is detailed in the following Sect. 3.1.

Initial solutions
This section presents the set of suboptimal initial solutions that has been used in this work for the application of the B&B and GA algorithms. More specifically, the set of initial solutions for the problem presented in Sect. 2, is composed of the following subsets: 1. The N initial solutions resulting from solving the shortest path problem for every robot i: where and ϕ(U i ) = 0 is the set of constraints, ensuring that all the tasks are carried out once. Notice that this problem is the well-known traveling salesman problem (TSP). TSP is a NP-Hard problem that can be solved in an exact way using Held-Karp algorithm [42] in a time O(n 2 2 n ). However, the nearest neighbor (NN) algorithm [43], a heuristic greedy algorithm consisting of choosing the nearest non-visited node in each step, is used to generate feasible solutions in a very reasonable time. 2. The N initial solutions resulting from solving the shortest time problem for every robot i. The distances considered in these problems are the ones resulting from adding the estimated traveling times between robots and tasks and the operation times required in the corresponding tasks: where t * i is the working time of each robot considering both the time needed to reach a task and the time needed to perform the task (note that t and ϕ(U i ) = 0 is the set of constraints, ensuring that all the tasks are carried out once. This problem is addressed in a similar way to the shortest path problem. However, since the criteria are different, the solution of this problem is both feasible and different from the problem mentioned in (4). 3. The initial solution resulting from solving the assignment problem using only distances: is the set of constraints, ensuring that all the tasks are carried out (if M <= N ) or that all the robots are assigned to one task (if M > N ). In case there are more tasks than robots (M > N ), the problem is iterated by considering the allocation in the subsequent nth orders of the robots. Each iteration takes into account the previous allocation of the robots, the remaining tasks and the distance among them, and it is solved using Kuhn-Munkres algorithm [44] in a time O(n 3 ). 4. The initial solution resulting from solving the assignment problem using only times: Note that this problem can be hard to solve directly since the time in which tasks are performed will depend on the assignment in the previous iteration. Thus, in this set of initial solutions, we address the problem similarly to the distance assignment problem but considering the time of reaching the task and performing it instead of the distance.
All the considered initial solutions can be fast and easily computed by using well-known algorithms. It has to be pointed out that the used set of initial solutions is independent of the subsequently applied control algorithms (B&B and GA). Therefore, the set used can be adapted for each problem. The only necessary requirement is that the initial solutions have to be computed fast enough to allow subsequent computations within the B&B or GA algorithm. In order to have a good performance, it is also important that the initial solutions correspond to the different criteria of the cost function. For example, if a cost function minimizes time and distance simultaneously the algorithms should be fed with initial solutions reflecting an opposing behavior (i.e., minimizing only time or space).

Branch and bound algorithm
For small sized problems, the optimal solution for the optimization problem presented in Sect. 2 can be computed by using the Branch and Bound (B&B) algorithm [45]. This algorithm, used for discrete and combinatorial optimization, consists of developing the tree of possible solutions sequentially, discarding the partial solutions that have a cost function higher than a set boundary, until the optimal solution is found. B&B has been used in some works to address the MRTA problem, as in [46], where it is used in combination with a Monte Carlo search tree [47]. However, to the best of our knowledge, it has always been applied to the MRTA problem using binary variables for the task assignments while in this work it is applied using discrete variables (which allows to substantially decrease the number of leaves on the search tree). Initial allocation Tree with 2 robots and 2 tasks. The "branch" is done by adding a single task to the task queue of each robot (in every possible way). The entire set of possible allocations for the firstly allocated tasks (composed of 6 leaves in this case) is generated More specifically, in the present work, the use of the B&B algorithm is proposed for solving small-scale [less than 10 5 solutions according to (3)] MRTA problems formulated as in (1). The algorithm generates all the possible feasible combinations for each nth allocated task removing the ones that show a cost function higher than the current upper bound for the optimal cost function. For the remaining task allocations, the corresponding possible combinations for the following (n + 1)th allocated tasks are generated comparing their new cost function with an upper bound. The algorithm continues until all the possible optimal task allocations (for any order) have been explored.
More specifically, the algorithm applied in this paper is composed of the following steps: 1. The best (lowest) cost function obtained within the set of previously computed initial solutions (see Sect. 3.1) is taken as the initial estimation of the B&B threshold (J * ). 2. Then, all the possible task allocations for the firstly allocated tasks for each robot are generated, i.e., the new "branch" is done by adding a single task to the task queue of each robot (in every possible way). A simple example with 2 robots and 2 tasks can be seen in Fig. 2. 3. The partial cost function of the generated allocations is computed. This function corresponds to the distances and times of the allocated tasks taking into account that many tasks can be unallocated to any robot at this step. The partial cost function used is similar to (1) but redefining γ 2 (U ) in order to eliminate the penalization for unallocated tasks: In contrast with the original cost function defined in (1), is not used since, in this case, also partial assignments which do not fulfill all the tasks are evaluated.  Fig. 2 assuming that the partial cost function of the first and fourth leaves are higher than J * (U ) and are therefore discarded

Fig. 4
Cost function updates for the tree shown in Fig. 3. It is assumed that the cost function of the allocation on the fifth leaf is lower than J * (U ) and, as a consequence, the value of J * (U ) is updated. Subsequently, the sixth leaf (with a cost function higher than the updated value of J * (U )) is removed accordingly Fig. 5 Allocation tree resulting from considering the task assigned in the second place for the search tree in Fig. 4 Therefore, this partial cost function J * (U ) will be lower than an original cost function J (U ) if some tasks are not allocated or equal to the original cost function if all the tasks are allocated (J * (U ) ≤ J (U )). 4. The task allocations with a cost function higher than the B&B threshold are discarded as it can be seen in Fig. 3. 5. If any allocation is complete (i.e., all the tasks are allocated) and the corresponding cost function is lower than the current value of the B&B threshold, this value is updated (also discarding the allocations lower than the new cost function) as it can be seen in Fig. 4. 6. The allocations which have not been discarded are used to generate a new set of possible allocations including the nth allocated task for those solutions which have still tasks to allocate, as it can be seen in Fig. 5. 7. The algorithm ends if the remaining leaves cannot be increased any more (i.e., if all the tasks are allocated). Otherwise, it returns to step 3 (Fig. 6).

Fig. 6
Cost function update for the tree shown in Fig. 5. It is assumed that the cost function of the complete allocation on the first leaf is lower than the current value of J * (U ) (the one corresponding to the last leaf) and, as a consequence, the second and third leaves (with a cost function higher than the updated value of J * (U )) are also removed. Since only one candidate leaf remains (U = [0, 1, 0, 2]) and this leaf cannot be further increased, the remaining allocation is the optimal one

Genetic algorithm
The previously presented B&B algorithm computationally explodes for medium and large-sized problems. As a result, another faster but suboptimal algorithm has to be used for these cases. The genetic algorithm (GA) is a meta-heuristic algorithm inspired by the process of natural selection and evolution which relies on crossover, mutations and selection [48]. The GA has been applied to similar MRTA problems in works such as [49], where it is used in combination with intra-path constraints in order to solve the allocation of a heterogeneous fleet of robots in a disaster scenario; [50], where a multi-objective cost function including both energy and time criteria is used; [51], where a multi-objective cost function is also used in a multi-robot task context including time constraints and discrete variables are used to model the problem; and [33,52], where similar structured environments with discrete positions of the robots and discrete variables are used.
In this work, we have designed a GA including the use of specially developed crossover and mutation strategies that generate ordered solutions for the next generations. These strategies generate new allocations that do not introduce unfeasibility from the viewpoint of tasks (allocations that do not fulfill all the tasks or that fulfill a task more than once cannot be fathered by allocations that do) and thus, reduce drastically the number of possible allocations and the computation load. With the same criterion, it has been taken into account that any allocation for a robot u i including intermediate zeroes is equivalent to the same allocation removing these zeroes and adding them at the end. The crossover and mutation methods presented in this section are novel to the best of our knowledge.
An example of the individuals or chromosomes which are used in this paper can be seen in Table 3, where the nth allocated tasks for each robot R i are concatenated. Note that each individual or chromosome corresponds to a full candidate solution to the MRTA problem (i.e., all the tasks assigned to all of the robots).
Notice that, even though the crossover and mutation strategies developed do not generate unfeasible allocations from the viewpoint of tasks, they can still produce unfeasible allocations from an energetic point of view.
The GA applied for the considered MRTA problem is composed of the following steps: 1. The set of feasible initial solutions presented in Sect. 2 is used to generate the initial population (i.e., the first individuals) of the GA. In the case that the population size is larger than the number of initial solutions previously computed, the rest of the population is filled with randomly generated potentially optimal solutions. To do this, we permute the set S randomly and then go through all the tasks, selecting a random robot from R and allocating the task to the robot. This way, the resulting individual will not activate γ 2 in (1). 2. The cost function is computed for the entire set of individuals of the current generation. The fittest individuals are selected as the elite population, which survive automatically into the next generation. 3. The algorithm generates a new set of individuals, which are distributed as follows: (a) The elite population fraction, G E , defines the percentage of the new population that is considered to be elite individuals. (b) The Crossover fraction, G C , defines the percentage of the non-elite new population which is composed of individuals generated by crossover. An example can be seen in Fig. 7. Each crossover strategy is composed of the following steps: i. Two elite individuals are randomly selected. ii. One task is randomly selected. iii. The selected task is removed for both individuals. iv. The task allocations are shifted in order to correct the individual obtained in the previous step. Table 3 Gen example Robot First robot · · · ith robot · · · N th robot Gen example 1 3 · · · 5 · · · 0 · · · 2 0 · · · 0 · · · 0 · · · 4 0 · · · 0 · · · 0 Order

3.(b)
.v (Robot 1, 1st order) (Robot 2, 2nd order)  .v) v. The selected task is reassigned to each individual but using the allocation considered by the other individual (i.e., for the first individual, the selected task is now allocated for the position originally given by the second individual). vi. The task allocations are shifted again in order to correct the individual obtained in the previous step. (c) G M1 is the fraction of individuals generated by mutations, which are generated by using the first kind of mutation proposed. An example can be seen in Fig. 8. Each mutation for the first strategy is composed of the following steps: i. One elite individual is randomly selected. ii. One task is randomly selected. iii. The selected task is removed. iv. The task allocation is shifted in order to correct the individual obtained in the previous step. v. A new robot and position is randomly selected for the selected task. vi. The selected task is included in the randomly selected position. vii. The task allocation is shifted again in order to correct the individual obtained in the previous step. (d) G M2 is the fraction of individuals generated by mutations, which are generated using the second kind of mutation proposed. An example can be seen in Fig. 9. Note that G M2 = 1 − G M1 by definition, since there are only 2 types of mutation. Each mutation for the second strategy is composed of the following steps: i. One elite individual is randomly selected. ii. Two tasks are randomly selected. iii. The selected tasks are exchanged.   Example 1 Let us take a case in which we have a population size of 100 individuals. To create the next generation, we will copy the 10 best individuals from the current generation. Then, considering G C = 0.8, 80% of the 90 remaining individuals, i.e., 72 individuals, must be created by crossover.

3.(d).iii 3.(d).iii
To do this, we take random selections of elite individuals and do crossovers between them until we reach the 72 individuals we need. The remaining individuals 100 − 10 − 72 = 18 are created by mutations. Assuming G M1 = 0.5 and G M2 = 0.5 half of them, 9, are going to be generated by selecting a random elite individual and applying mutation type 1 and the other half by doing the same but applying mutation type 2.
Note that, since we start from a potentially optimal initial population (the initial solutions plus the potentially optimal random solutions) and since the proposed genetic algorithm always maintains this property between generations, the middle zeroes in the gen do not affect the performance and speed of this approach as the algorithm will not produce not potentially optimal solutions.

Case study
In this section, the models used to simulate the robots are described. Likewise, the solar power plant layout considered is detailed. Finally, the problem formulated in Sect. 2 is particularized taking into account both the robot models and the plant layout.

Robot fleet
Since the fleet is comprised by both aerial and ground unmanned robots, there are different characteristics of speed and energy consumption or recharging rates for each type of robot. Likewise, different security energy levels have been considered for each type of robot. The parameters considered for UAVs and UGVs can be seen in Table 4. Note that V mean is the mean speed of the robots in the XY plane and V Zmean is the landing/taking off speed. A different number of UGVs and UAVs is considered for the simulations in order to analyze the time required to solve problems of different sizes.  Note that discharge and charge rates refer to operating and charging time, respectively corresponding to the steam generation, turbine and generator (see Fig. 10).

Solar plant layout
In the case of aerial robots, it is also assumed that they are neither allowed to fly over the collectors for identical reasons, so the only allowed paths are those which do not go through the small vertical rectangular obstacles which represent the collectors.
In the case of ground robots, manifolds are considered insurmountable and to overcome them they must exclusively go through the authorized crossing-points, which can be considered as bottlenecks for moving from one part of the plant to another. This area has been meshed into a grid of 434 spots where measurements can be taken. This grid has been created by placing the measurement spots in the intersection of 35 columns and 13 rows. Columns have been placed in the space between collectors, while rows have been placed taking into account that there must be rows in the horizontal passing places and that the space between rows must be similar to the space between columns.
In Fig. 10 and in the rest of figures used in this paper, the red circles represent the measurement spots, the red squares represent the charge stations and the different forbidden areas are represented by: -Black rectangles in the case they affect all the robots.
This area is where the offices, parking lots, deposits, thermal energy storage (TES), and energy generation zone (including the steam generation, the turbine, and the electric generator) are located. -Green rectangles in the case they affect only UAVs. This rectangles represent the solar collectors and are comprised of parabolic mirrors. -Blue rectangles in the case they affect only UGVs. This rectangles represent the manifolds that distribute the HTF to the collectors and back to the steam generator.
Notice that, even though UAVs cannot fly over the collectors, their potential to move through the plant is bigger than in the case of UGVs, which need to go through a crossing pass that can act as a bottleneck to go from one sector to another. On the other hand, UGVs have a higher durability than UAVs, as they have a longer-lasting battery and consume less power.

Cost function
In this paper, the distances between each robot and each task and the distance between each task and the other tasks have been pre-calculated. This consideration can be done since the plant is a structured environment and the starting positions of the robots, the positions of the tasks and the possible paths are known. Thus, we only need to pre-compute these distances for the actual positions of the robots and the tasks at the moment of solving the MRTA problem.
Making use of these distances and the speed and discharging rate of the robots, it is possible to pre-calculate all the times and energy needed by a robot to go from its starting position to a task, or from one task to another. Besides, an initial delay time, t o i , is considered for each robot since a robot can be considered for an allocation even if it is currently charging of finishing a previous task. These initial delay times can be estimated knowing the speed, the state of the robot and the parameters of the allocated task (in case the robot is finishing a previous task); or the charging rate and the energy level (in case the robot is charging). These parameters can be consulted in Table 4.
Once these data are obtained and there is a proposed U , the process of obtaining the distance traveled by each robot, d i (U ), is as simple as adding the distances that each robot i must do due to its allocated tasks, i.e., the distance from its starting position to the task allocated as its first task, the distance from its first allocated task to the second allocated task, etc. This can be mathematically expressed by: whered ik (U ) is the distance traveled by robot i as a result of the kth allocated task, D RT ab contains the distance between robot a starting position and task b, and D TT ab are the distances between task a and task b.
Similarly, the process of checking if the allocation is feasible from an energetic point of view, γ 1 (U ) = 1, is simple as well, since the initial energy of the robots, E o i , and the different needed energies have been pre-calculated. This can be mathematically expressed by: where E i (U ) is the estimated energy consumed by robot i after finishing all its allocated tasks,Ê ik (U ) is the estimated energy consumed by robot i as a result of the kth allocated task, E RT ab contains the estimated energy that robot a needs to move from its starting position to task b, E TT abc contains the estimated energy that robot c needs to move between task a and task b, and E s is the security energy level, i.e., the minimum energy needed to reach a charging station.
However, the process of obtaining the delay time of each task, t j (U ), is a little bit different, not only is the time needed to reach the position of the task relevant but also the accumulated time of the robot which is performing task j. The operating time of each robot is defined and set to the initial delay calculated previously, i.e., t i (U ) = t o i ∀i. Once the initial values are set, we must go through each robot queue of allocated tasks updating t i (U ) by adding the necessary time to reach the following task and the operation time of this task, t op j . Each time a task j is considered the delay time of task j is set to the current value of the operating time of robot i, i.e., t j (U ) = t i (U * j ). This can be mathematically expressed by: where R j represents the robot to which task j has been allocated, N j is the number of tasks allocated to robot R j before task j,t R j k (u R j ) is the traveling time of robot R j as a result of the k th allocated task; T RT ab contains the time that robot a needs to get from its starting position to task b, and T TT abc contains the time that robot c needs to go from task a to task b.
Notice that, even though the amount of energy needed to fulfill the task has been neglected, adding it would not modify substantially the proposed approach. Fig. 11, we can see an example with 2 robots (an UGV and an UAV) and 3 tasks. The allocation consists of robot 2 performing sequentially tasks 2 and 3, and robot 1 performing task 3, i.e., U * = 3 0 0 1 2 0 . In this case, distance traveled by robot 2 is d 1 (U * ) = D RT 1,3 and distance traveled by robot 2 is d 2 (U * ) = D RT 2,1 + D TT 1,2 . Analogously, the corresponding consumed energies are E 1 (U * ) = E RT   of the tasks is allocated more than once (γ 2 (U * ) = 0) and assuming

Scenarios
To test both, the GA and B&B approaches, a set of different sized scenarios are simulated: These scenarios are determined by the amount of robots and their types, locations, starting energy, delay on being prepared in the starting point, and penalty for using each one; and the amount of tasks, their locations and the operating time that it takes to perform them. The parameters that define these scenarios can be seen in Table 5 for robots and in Table 6 for Tasks. Note that in Table 5, G stands for ground robots and A for Aerial robots.

GA tuning
The GA has been tuned with the parameters shown in Table 7. These parameters have been chosen heuristically. Given the heuristic nature of the GA, it has been run 10 times in each of the scenarios described in the previous subsection.  -I  S-II  S-III  S-IV  S-I  S-II  S-III  S-IV  S-I  S-II  S-III  S-

Results
This section shows and discusses the simulation results on the application of the proposed algorithms for the proposed scenarios.

Scenario I
In this case, the best allocation obtained from solving Scenario I using the GA, which is also the most frequently obtained allocation among the GA repetitions, is the optimal allocation, obtained with the B&B algorithm and that can be seen in Fig. 12. The optimal allocation implies that robot 1 and 2 stay still, robot 3 performs task 1 and task 2 sequentially, and robot 4 performs the rest of the tasks in the following order: 3, 5, 6 and 4. The best initial solution obtained for Scenario I, which corresponds to the assignment problem for distances [see Eq. (6)], is shown in Fig. 13. As it can be easily seen in the figure, there are some similarities and differences with the achieved allocation after applying GA. The most important difference is that, after applying the GA, the UGVs are not allocated to any task and the tasks that were previously performed by UGVs are now distributed to the UAVs.
Moreover, in those cases in which the optimal allocation cannot be achieved by combining the initial solutions, the GA makes use of the mutation strategy to jump out of the local minima of the cost function.

Scenario II
As for Scenario I, the allocation obtained from solving Scenario II by using the B&B algorithm (i.e., the optimal allocation) coincides with the best allocation obtained using the GA. The allocation is shown in Fig. 12.
However, in contrast to the previous scenario, the most repeated allocation among the repetitions is not the optimal one, reaching a value of cost function close but slightly larger to the optimal one (Fig. 14).

Scenario III
The B&B algorithm cannot be applied to Scenario III because of the combinatorial explosion due to the size of the problem. As a consequence, it is not possible to know if the best allocation provided by the GA is the optimal one. The best allocation obtained from solving Scenario III by using the GA is shown in Fig. 15.
As it can be seen in Fig. 15, in this scenario robots are concentrated in a sector of the plant while tasks are located in another sector. As a consequence, only two aerial robots are assigned to the entire set of tasks, implying that the obtained solution highly differs from the ones obtained for the set of initial solutions.

Scenario IV
As in the previous scenario, the optimal solution cannot be found due to the size of the problem. The best allocation obtained from solving Scenario III by using the GA is shown in Fig. 16).
In this allocation, most tasks are performed by UAVs, while UGVs remain as backup robots performing only those tasks that are near their initial locations (14 and 4). Note that in the allocation obtained the route followed by robot 4 is quite strange, leaving task 9 for the end and attending tasks 2 and 5 before. However, it must be taken into account that the allocation obtained is not the optimal one (since the GA is a metaheuristic algorithm) and that, in this case, this allocation makes sense since there are other parameters to be considered in addition to the distance. As we can seen in Table 6,  15 Allocation with Genetic algorithm in Scenario III task 9 has low importance (δ 9 = 1) and the time required to fulfill it is long (t op 9 = 107 s). Thus, performing it first would delay significantly other tasks allocated to robot 4.

Performance comparison
In Table 8, all the values of J can be compared. In the first column, there are the values of the best solutions among the initial solutions. In the second and third column, there are the best J values and the mean J values among all the repetitions, of the solutions obtained using the GA. In the last column, there are the values of the solutions obtained using B&B and therefore the optimal values for each scenario.
It can be seen that, in those scenarios where the optimal value using B&B could be obtained, this value is achieved by the GA most times, getting very near when it is not. It can also be checked that, even though B&B ensures an optimal  Table 8 Cost function for the best initial solution (J ini ), best (J GA best ) and mean (J GA mean ) cost functions obtained using the GA and optimal cost function obtained using B&B (J opt ) Notice that the values of J for the GA correspond to the minimum and the mean value of applying the GA 10 times, so even though the minimum value coincides with the optimal value, this does not mean that the GA always reaches it (since the mean value is over the optimal one) solution, it is only computationally feasible in small-sized problems.
As an example of the performance of the GA, the evolution of the best J values and the mean J values among the repetitions in a single run of the GA can be observed in Fig. 17.

Computational costs
The different computational costs using an Intel(R) Core(TM) i7-8700 CPU with 3.20 GHz processor can be compared in Fig. 18 for the mean times required for the GA. As it can be seen, the time required for solving scenario III is 2.27 s. Given that the mean time required for solving the cost function on this scenario is 3.2 · 10 −4 s in the time it takes to solve the scenario using the GA, the cost function could be solved The computation time required for solving the set of initial solutions is negligible since it is below 10 −2 s for the four scenarios considered. In the B&B algorithm, the lower the bound is, the faster the algorithm is, and so, the computation time varies with the bound obtained from the set of initial solutions. However, only the lower bound has been considered with a computation time of 5.8 s for the first scenario and 206,040 s for the second one, i.e., 2.38 days. As previously stated, solving the third and fourth scenarios using B&B is not practicable due to the exponential increase in the computation times. It is also important to remark that, even though in small scenarios with few robots and tasks the B&B algorithm takes a similar time to the genetic algorithm and reaches an optimum solution, the computational cost increases exponentially with the size of the problem as it corresponds to a NP-Hard problem.

Monte Carlo studio
To validate the efficiency of our method, we have generated 20 different random scenarios for each problem size from 1 to 8 robots and from 4 to 8 tasks, i.e., 640 scenarios. Each scenario has been solved 50 times in the case of the GA from where we are taking the mean and the minimum value of J among the iterations. The J values obtained with both methods, GA and B&B, are shown in Fig. 19, where it can be seen that in many cases the optimal allocation is among the initial solutions. Also, it can be seen that in a few cases the mean GA is nearer the initial solutions than the optimal allocation, as it corresponds to a metaheuristical method.
Notice that B&B was only computed for cases where the number of solutions would not exceed 10 5 and that the GA tuning parameters, including the population size, were maintained constant and equal to those in Table 7.
The mean improvement of the GA over the initial solutions is 9.48% when considering the mean value among the iterations and it ascends to 12.61% when considering the minimum value among the iterations. Also, we can define optimality by how close the obtained J is to the one obtained in the B&B (in the cases in which it has been computed) in reference to the one of the initial solutions (disregarding all the cases in which the optimal solution is among the initial ones). Under this definition, the mean optimality considering the mean value among the iterations is 93.17%, while considering the minimum value among the iterations it increases up to 93.81%.

Conclusions
In this work, the MRTA problem has been formulated in the context of a thermosolar power plant where a MRS formed by various types of robots (ground and aerial) is in charge of collecting the DNI data. Under this approach, we assume a list of the tasks to be performed along with the time it takes to complete them and an associated priority or urgency. Besides, we also prioritize the use of a certain type of robot over the other or what is the same, to minimize the weighted distance traveled by the robots. Thus, we proposed a multi-criteria cost function that evaluates allocations taking into account the time needed to perform each task and the distance traveled by robots. However, the algorithms proposed in this paper do not vary when considering other criteria.
In this cost function, the constraints associated with energetic feasibility and to the completion of the entire set of tasks only once have been included as soft constraints. In order to solve this problem, discrete variables that code the allocation in a compact way have been used, which, combined with the B&B algorithm or the GA designed for the formulation in this work, allow to accelerate the computation of a feasible (if possible) and optimal (or quasi-optimal) allocation.
The B&B algorithm has proved to be able to obtain the optimal allocation. Moreover, it can be computed faster than the genetic algorithm if the size of the problem does not exceed a certain limit. However, as expected, the calculation time in this method increases with the size of the problem addressed, getting to a point where the B&B algorithm cannot be computed in a reasonable time. For medium-and largescale problems, the proposed GA algorithm allows to obtain a suboptimal solution in a shorter period of time.
The proposed algorithm has been tested considering a fleet composed of UGVs and UAVs and a 63 ha (1180 × 530 m), 30 MW thermosolar plant with a known structured layout, in 4 different sized scenarios (varying the number of robots of each type, the number of tasks and the starting parameters).
The results obtained show that the optimal solution can be computed using a B&B algorithm in a short time (less than 10 s) for problems with between 4 and 5 robots and 5 or 6 tasks, i.e., problems with around 10 6 possible allocations according to Eq. (2). For larger sizes, the results show that the GA provides a good solution that is quite close to the optimal one (in Scenario II the GA achieves an allocation with cost 5454.36 while the optimal allocation cost is 5182.70).
The simulation results also show that the proposed set of initial solutions used as initial population improves the speed of the GA method at the same time that it provides a floor for it, and it also allows to obtain an initial bound of the cost function to be used by the B&B algorithm. The observed increase in the computation speed becomes more remarkable as the problem size increases. In the figure, the different random scenarios have been ordered by the value of J obtained from the initial solutions (note that, due to the different parameters considered in the problem, there can be scenarios with similar sizes but very different values of J and vice versa). We can see the mean J value and the minimum J value obtained among the 50 GA iterations, respectively, in dark blue and in green. As expected, the mean value is contained between the minimum and the best J obtained by solving the initial solutions (black). The J obtained using B&B is represented by red dots and is always the minimum value (color figure online) Future research lines for this approach might include improving the GA developed, making use of a wider part of the genetic pool to create the descendant individuals and trying different strategies as Roulette Wheel, testing it with real robots in the context of a distributed estimation of the radiation, the use of clusterings to split large groups of robots and tasks into smaller, affordable ones which can be solved by using the B&B algorithm and the testing of other metaheuristic algorithms such as the Cuckoo Search with this formulation.