# Online decentralized information gathering with spatial–temporal constraints

## Authors

- First Online:

- Received:
- Accepted:

DOI: 10.1007/s10514-013-9369-5

- Cite this article as:
- Gan, S.K., Fitch, R. & Sukkarieh, S. Auton Robot (2014) 37: 1. doi:10.1007/s10514-013-9369-5

- 4 Citations
- 549 Views

## Abstract

We are interested in coordinating a team of autonomous mobile sensor agents in performing a cooperative information gathering task while satisfying mission-critical spatial–temporal constraints. In particular, we present a novel set of constraint formulations that address inter-agent collisions, collisions with static obstacles, network connectivity maintenance, and temporal-coverage in a resource-efficient manner. These constraints are considered in the context of the target search problem, where the team plans trajectories that maximize the probability of target detection. We model constraints continuously along the agents’ trajectories and integrate these constraint models into decentralized team planning using a computationally efficient solution method based on the Lagrangian formulation and decentralized optimization. We validate our approach in simulation with five UAVs performing search, and through hardware experiments with four indoor mobile robots. Our results demonstrate team planning with spatial–temporal constraints that preserves the performance of unconstrained information gathering and is feasible to implement with reasonable computational and communication resources.

### Keywords

Team planningDecentralized optimizationInformation gatheringSpatial–temporal constraintsCollision avoidanceNetwork connectivity Temporal-coverage## 1 Introduction

Coordinated teams of mobile sensor agents are useful in performing information gathering tasks. Example application areas include atmospheric mapping (Kovacina et al. 2002), sampling of dynamic fields (Antonelli et al. 2012), area coverage (Renzaglia et al. 2012), target searching (Hollinger et al. 2009) and target tracking (Frew et al. 2008). In these tasks, sensor mobility provides flexibility and adaptability in dynamically changing environments, potentially yielding a higher rate of information gain. However, teams of agents are often required to move in close proximity to each other in a shared workspace, which may contain obstacles. Agents may also need to communicate with each other for decentralized data fusion and decentralized decision making. Therefore, it is important to develop coordinated team motion planning strategies that satisfy general mission-critical constraints such as collision between agents, collision between agents and obstacles and network communication integrity. In this paper, we address the problem of multi-agent coordination for information gathering under such constraints.

Agents involved in a cooperative mission often operate in a common workspace. Self collisions must be taken into account during cooperative planning such that all planned reference trajectories are feasible. Mitigating the risk of collision is especially crucial for airborne systems, where collision is likely to result in catastrophic loss of the platform and possible damage to the environment. If team planning does not explicitly consider safety constraints, the optimized team decisions may not be executed as planned. An additional layer of real-time reactive approaches may be frequently activated. This may result in wasted computational power and communication effort during the team planning process, and may also induce uncertainty in the overall system behavior.

Additional constraints often appear in real-world scenarios, and neglecting these constraints can limit the applicability of multi-agent coordination strategies. Decentralized data fusion and decentralized decision making may require the team to maintain a certain communication topology. A planned trajectory that disregards team network integrity could potentially result in disconnected communication links that limit further information sharing and decentralized planning. There may exist further spatial constraints such as obstacles and prohibited regions of entry on the field, or temporal-coverage constraints where agents must pass by a ground station or a user to report their observations at arbitrary time intervals.

A common approach to planning for information gathering tasks in continuous domains is to formulate the problem as the optimization of an information-theoretic objective function over a finite future action horizon. Integrating constraints into this optimization problem is challenging for several reasons. Collision and network connectivity constraints are continuous along the agent’s action horizon. Constraints may also vary with time due to moving obstacles. Furthermore, constraints may not be fixed during planning. For example, when considering agent-agent collisions, constraints are coupled during optimization. Finding a feasible trajectory for one agent depends on the trajectories chosen by other agents in the team.

In this paper, we present a novel set of continuous spatial–temporal constraints for efficient online decision making in decentralized information gathering problems. We show how these continuous constraints can be converted to discrete form while preserving continuity guarantees, and derive analytical gradients. Constraints are integrated into a unified optimization framework using an augmented Lagrangian formulation and solved using asynchronous decentralized gradient-descent. Planning is then performed using a multi-step receding horizon approach. Because the constraints are represented discretely and have closed-form gradients, decision making can be performed efficiently and in an online manner while ensuring continuous constraint satisfaction along the agents’ trajectories.

Using this formulation, we first consider the case of inter-agent collision constraints. We present our constraint derivation and online team planning algorithm along with complexity analysis. We demonstrate the algorithm in the context of coordinated search, where agents seek to maximize the probability of target detection, and present simulation results from 200 randomized trials with a team of five UAVs where we compare our algorithm to the search-only and collision-avoidance-only cases. These simulation results complement the complexity analysis by showing empirically that our algorithm finds feasible solutions within a reasonable time horizon, and also demonstrate that our algorithm preserves the information gathering performance of the unconstrained case while minimizing the chance of collisions.

To further validate computational and communication efficiency, we demonstrate our algorithm using a team of four non-holonomic mobile robots in an indoor environment. The robots are equipped with on-board processing and communication capabilities. Because our work concentrates on optimizing motion, sensors are simulated using a probabilistic sensor model. Each robot travels at near-constant velocity and performs continuous planning online. No centralized processing is involved. These experimental results verify that our algorithm is feasible with respect to realistic computational and communication resources available on modest robot platforms.

We then consider three additional constraints for coordinated information gathering: (1) network integrity constraints, (2) stationary obstacle collision constraints, and (3) temporal-coverage constraints where timed objectives cannot be met within the planning horizon. These cases cover important practical scenarios that can arise in information gathering tasks. Network integrity constraints limit the maximum inter-agent distance, which is useful in modeling limited communication range. Stationary collision constraints force agents to avoid known (stationary) obstacles. Temporal-coverage constraints allow agents to plan long-term patrolling behavior that extends far beyond the agents’ action horizon. Again, we derive explicit gradient models for each case and present simulation results in the context of coordinated search with a team of simulated UAVs.

A preliminary version of this work appears in (Gan et al. 2012). Here the work is extended to include experimental results with real robots, the three additional constraint cases, and associated experimental results in simulation.

The paper is organized as follows. Related work is discussed in Sect. 2. The team information gathering problem is introduced in Sect. 3. Inter-agent collision constraints are presented in Sect. 4 and integrated into online team planning in Sect. 5, with experimental results presented in Sect. 6. Additional spatial–temporal constraints are presented in Sect. 7, and Sect. 8 concludes the paper. Details of gradient derivations are provided in three appendices.

## 2 Related work

Research that studies decentralized multi-agent team negotiation for information gathering has focused primarily on coordination strategies in the absence of spatial–temporal constraints. Representative work includes environmental monitoring (Casbeer et al. 2006), area exploration and mapping (Grocholsky et al. 2003), perimeter surveillance (Kingston et al. 2008), target search (Tisdale et al. 2009), target surveillance and tracking (Tang and Ozguner 2005) and search and rescue (Furukawa et al. 2006).

The problem of considering continuous spatial–temporal constraints during decentralized trajectory planning for information gathering has not been comprehensively studied in the literature. In airborne systems for example, a team of UAVs are usually assigned to operate in non-conflicting altitude layers to avoid self collisions. This assumption is commonly found in both simulations (Ahmadzadeh et al. 2006; Leung et al. 2006) and real-life demonstrations (Cole et al. 2008; How and King 2004). This technique, however, is not suitable for missions that require fixed-altitude operations, such as a mapping scenario, and certainly not available to ground-based robot teams.

Another common approach is to decouple the constraints from the decision-making process. In this approach, team planning is performed at a mission level without considering constraints. The planned trajectories, which may be non-feasible, are modified during closed-loop execution by reactive systems in separate lower-level controllers (Ayanian and Kumar 2010). Strategies for generating collision-free maneuvers include potential field methods (Tanner and Christodoulakis 2007; Zavlanos and Pappas 2007), geometric approaches (Bretl 2012), navigation functions (Dimarogonas et al. 2006), reachable sets (Gillula et al. 2011; Wu and How 2012) and barrier certificates (Barry et al. 2012). For network integrity, graph Laplacian methods are commonly used for maintaining network connectivity (Zavlanos and Pappas 2008; Sabattini et al. 2012). Reactive systems can often guarantee constraint satisfaction, especially under the assumption of holonomic robot kinematics and are important for online systems. However, reactive strategies usually deviate the robots from their nominal plan (Lapierre and Zapata 2012), either temporally, spatially, or combinations of both, which induces uncertainty in the overall system behavior. Our intention is to explicitly incorporate spatial–temporal constraints while simultaneously optimizing the team trajectory over some finite action horizon to achieve a common mission objective.

Approaches that do incorporate constraints during planning typically apply to holonomic agents. A decentralized gradient controller is derived in (Julian et al. 2012) for a team of holonomic agents surfing the maximum information gradient while implicitly satisfying network connectivity. We focus on more challenging scenarios involving non-holonomic agents with a minimum speed requirement, such as fixed-wing UAVs in a time-critical search and rescue mission. Another common approach is the assumption of discretized workspace. Static environmental constraints such as obstacles can often be easily represented in a discrete world and incorporated into planning (Bhattacharya et al. 2010; Durham et al. 2012). In contrast, we address the constrained team planning problem in a continuous workspace. This is important for many information gathering tasks since a discrete workspace potentially limits the ability to gather critical information.

Inter-agent constraints are commonly addressed with the approach of sequential planning (Schouwenaars et al. 2004; Hollinger and Singh 2010; Desaraju and How 2012). At any moment, only one agent plans its trajectory while fixing the trajectories of others. With this sequential planning approach, coupled constraints are usually easier to satisfy at the expense of longer planning time as the team size increases. Asynchronous algorithms, on the other hand, do not rely on an ordered sequence. Each agent is able to perform continuous iterations based on the latest communicated knowledge from others, fully utilizing the time allocated for team planning (Mathews et al. 2009).

Work that does incorporate constraints while maximizing some team objective falls under the category of decentralized constraint optimization. The classical approach, which we adopt, transforms a constrained problem into an unconstrained problem using an augmented Lagrangian formulation and then solves using a gradient-based algorithm (Bertsekas 1996). The work of (Hoffmann and Tomlin 2010) demonstrates the use of decentralized optimization in solving for a one-step action that maximizes information gain and satisfies inter-agent safety constraints. Instead of planning for a one-step action, a less myopic approach performs planning for a finite horizon trajectory (Chung and Burdick 2007). Trajectory planning can also be seen in the work of multi-agent team formation and navigation problems, where agents are constrained to remain in some formation or constrained to prevent colliding into each other while traversing some pre-planned team reference trajectory (Raffard et al. 2004; Kuwata and How 2006; Inalhan 2004). Constraints, however, are applied at discretized states along the action horizon without the consideration of the full continuous trajectory.

## 3 A cooperative team problem

In this section, we introduce the motion model and multi-step look-ahead action horizon setup for a single agent. We then formulate the cooperative team planning problem as an optimization problem over this action horizon.

### 3.1 Motion model and action vector

At every decision-making instant \(k\), the agent plans for its local open loop action vector \(v^k= \left[ {u^k,u^{k+ 1}, \ldots , u^{k+N-1}} \right] ^{T}\)\(\in \mathcal {V}\) where \(N\) is the number of look-ahead actions and \(u^k\in \mathcal {U}\subset \mathbb {R}\) is an action variable. Actions are held constant within a segment. The time interval between an action segment \(u^k\) is defined as \(\Delta t^k= t^{k+1} - t^k\). The total action horizon is thus \(H= \sum \limits _{l =0}^{N-1} {\Delta t^{k+ l}}\).

### 3.2 Team objective function

## 4 Inter-agent safety constraints

*subject to*.

*mc*stands for

*maximum curvature*. The distance lower bound can be obtained from geometry:

## 5 Decentralized decision making

This section details our approach in solving the constrained team problem in its decentralized form. First, we discuss the required information to be exchanged between agents to achieve decentralized optimization. We then solve the constrained optimization problem in augmented Lagrangian form using decentralized gradient descent. Finally, we detail the process of implementing team decision making online.

### 5.1 Decentralization

*impact*(Mathews et al. 2009). Impact space is a pre-defined space common to every team member. An impact space in its fully decentralized form has the potential to abstract the decision variables, motion model and sensor model of an agent. This eliminates the need for each agent to store potentially different models of every other agent, resulting in a scalable system.

### 5.2 Augmented Lagrangian method

### 5.3 Local gradient-based refinement

- 1.
Objective function \(J\) is bounded from below;

- 2.
Constraint function \(G\) is continuous and differentiable;

- 3.Multipliers \(\lambda _i\) is a bounded sequence. A common method in updating \(\lambda \) is:$$\begin{aligned} \lambda _{i} \leftarrow \lambda _i + c_i G_i ; \end{aligned}$$(24)
- 4.
Penalty parameter \(c_i\) is a series of positive increasing values for each iteration loop.

Note that our algorithm does not guarantee solution existence nor global convergence. There are many possible situations such as deadlocks and saturations of UAV turn rate where feasible solutions do not exist in the local action subspace. However, when a feasible team solution is found, given that all agents perfectly execute their plan, our algorithm guarantees that the entire team is continuously collision-free along the planned action horizon (as shown in Sect. 4).

### 5.4 Explicit gradient model

### 5.5 Forward planning architecture

### 5.6 Sampling-based initialization

### 5.7 Analysis

This section describes the communication and computation complexity of the decentralized constrained optimization algorithm. The variables involved are the size of the team \(Q\) and the length of the lookahead action vector \(N\). We provide complexity analysis for a single iteration of the optimization algorithm. Multiple iterations are required for convergence, but in practice (for minimum velocity platforms) we typically bound the number of iterations by a constant. Therefore this analysis represents the work required for agents to choose their next action.

In a fully connected network, each agent has to send messages to every other agent in the team. Thus the communication complexity for each agent is \(\mathcal {O}\bigl (Q\bigr )\) and the overall team has communication complexity \(\mathcal {O}\bigl (Q^2\bigr )\).

The gradient of the local search objective function \(\frac{\partial J_i^k}{\partial v_i^k}\) is a chained partial derivative of the action vector which has computational complexity \(\mathcal {O}\bigl (N^2\bigr )\). On the other hand, the gradient of the local constraints \(\frac{\partial \Delta J_i^k}{\partial v_i^k}\) requires computation of the constraints with respect to each agent which has computational complexity \(\mathcal {O}\bigl (N Q\bigr )\). The total computational complexity of gradient evaluation is thus \(\mathcal {O}\bigl (N^2 + N Q\bigr )\).

## 6 Experiments with inter-agent safety constraints

### 6.1 Simulation experiments

#### 6.1.1 Problem setup

Each UAV has an action vector of 10 s which is uniformly discretized into five segments. Each segment is thus 2 s in duration and holds a piecewise constant turn rate action limited to the range between \(\pm 30\,^{\circ }\)/s. The system employs a multi-channel broadcast communication network where UAVs are able to freely communicate with each other without interference. A communication frequency of 10 Hz is used with a 0.1 s message delay. Optimization frequency is 10 Hz and each UAV replans every 6 s, resulting in 60 optimization iterations per UAV per planning horizon.

Minimizing \(J\) equivalently maximizes the rate of cumulative probability of target detection \(P_D\), which increases the probability of finding the target in a shorter time frame (Wong et al. 2005).

#### 6.1.2 Comparison of PS, PCA and SCA

Three mission objectives are to be compared: (1) Pure Search (PS), (2) Pure Collision Avoidance (PCA) and (3) Search under Collision Avoidance (SCA). This comparison is important as it reveals the advantages and effectiveness of our algorithm.

Without considering inter-UAV collision constraints, PS has resulted in several potential hazards during the mission. An instance is shown in Fig. 5b where UAVs are clustered in one region and colliding. Figure 6b shows that in nearly 36 % of the mission there exists at least one pair of UAVs below the safety distance threshold, which is unacceptable for a real-life mission. The SCA algorithm, however, is able to maintain the superior information gathering rate of PS and at the same time adopt the properties of PCA by successfully satisfying the inter-UAV safety requirement. Figure 6b shows for both PCA and SCA, the minimum distance \(d\) is kept above the mission specified safety distance \(2R\) of 15 m.

In this particular case, SCA even outperformed PS in achieving the 99 % probability threshold earlier. Some UAVs are forced to turn back earlier due to collision constraints while they are about to cluster around the top right corner, as in Fig. 5l. This has caused the UAVs to explore high information regions earlier. However, this behavior is a side effect of our algorithm for this particular information gathering scenario and it shall not be the main focus of our work. The main advantage is its ability in satisfying inter-agent constraints while maximizing information.

#### 6.1.3 Tightly constrained workspace

#### 6.1.4 Monte Carlo simulation

In this example, a Monte Carlo simulation is performed for PS, PCA and SCA to further validate their information gathering and constraint handling capabilities. The experimental setting remains the same as in Sect. 6.1.2. For each case, simulations are performed for 200 randomly generated initial target belief distributions. Each randomized initial belief is a mixture of \(N_G\) Gaussians, where \(N_G\) is a randomized value between \(1\) and \(10\). Each Gaussian is a two-dimensional normal distribution with mean \(\mu \) and standard deviation \(\sigma \). Variable \(\mu \) is generated from a uniform distribution within the search workspace while \(\sigma \) is a positive random value less than a quarter of the workspace diameter.

### 6.2 Mobile robots experiment

We have further validated the SCA algorithm using four iRobot-Create indoor mobile robots. The main purpose of the experiment is to verify the online capabilities of the algorithm on systems with limited computation and communication resources.

#### 6.2.1 Experimental setup

Experiments are undertaken in a \(5m\times 5m\) indoor environment. Each robot is equipped with an acoustic sensor module, a ZigBee communication module and an on-board ARM-architecture processing module running at 72Mhz (Lal and Fitch 2009). Robot localization is achieved by measuring the acoustic waves generated at 2 Hz frequency from the three acoustic speakers located around the field. This information is filtered to produce a 2D planar position measurement of the on-board acoustic microphone. This noisy global position measurement, together with odometry measurement, is then incorporated into an Extended Kalman Filter (EKF) that estimates the mean and covariance of the robot’s 2D position vector.

The robots operate at a speed of 0.05 m/s. The action horizon is 40 s discretized into \(5\) uniform segments. Optimization frequency is 2 Hz and replanning occurs every 24 s, resulting in 48 iterations per planning horizon. Communication between robots operates on a point-to-point basis and information is exchanged at 1 Hz with noisy delay between 0.1 and 2 s. The sensor model used here is the same as in the simulation experiments with parameters \(P_{d_{max}}=0.85\), \(\sigma =2\) and \(d_{max}=0.75 m\). A pure-pursuit guidance controller is used for reference path following. The safety radius of each robot is set to 0.4 m taking into account its physical size and its expected EKF uncertainty bound.

#### 6.2.2 Experimental results

Figure 12c shows the team optimization process. At the beginning of each planning time slot, the initial action vector of each robot is obtained through RRT planning without the knowledge of others’ initial action vectors. This hardware experiment has a higher safety region to workspace area ratio compared to the SCA simulation in Sect. 6.1, resulting in a more constrained workspace. Because of this, constraints are more likely to be violated at the beginning of the planning stages compared to the SCA simulation results in Fig. 7. These constraint violations are subsequently minimized during the team optimization process. Furthermore, the objective function may increase while joint actions are optimized towards a feasible subspace. The objective function resumes in a decent direction once all constraints are feasible. It can be seen that no constraints are violated at the end of each planning time slot, resulting in continuous collision-free team trajectories.

## 7 Additional constraints

This section now extends the approach in modeling inter-agent safety constraints to the modeling of three additional constraint cases: (1) network integrity constraints, (2) stationary safety region, and (3) temporal-coverage. Capturing these important spatial–temporal constraints highlights the strength and adaptability of the approach. The constraints in the first two cases are modeled based on worst-case analysis. The approaches are similar to the case of inter-agent safety constraints. For temporal-coverage, we analytically model this constraint using forward reachable set analysis. Simulation results are subsequently presented for each constraint case.

### 7.1 Network integrity constraints

One of the most critical components in multi-agent systems is communication, especially for decentralized systems. Agents in a team are required to share information for decentralized data fusion and decentralized decision making. The network has to be connected in order for an agent to pass messages to other agents in the same network.

*greater*than the communication range. We approach this by defining the inverted normalized distance induced from the normalized distance

#### 7.1.1 Simulation: network integrity

This section demonstrates the behavior of Search with Communication Constraints (SCC). The search area is 300 m x 300 m, where the prior target belief is a mixture of Gaussians. Communication range is 100 m. The team network topology is a chain of connected UAVs. The nature of the initial belief is intended to separate the team heading towards two information rich areas. The chain can be easily disconnected without the presence of communication constraints.

### 7.2 Stationary safety regions

#### 7.2.1 Simulation: a cluttered environment

### 7.3 Temporal-coverage

There are many scenarios where agents would have to pass through a certain area in the operational field at a specific time. In remote sensing and reconnaissance scenarios, agents may want to send cumulative sensor information onto ground stations for analysis and data processing at fixed periods. This will require the agents to account for temporal-coverage requirements while they are planning their future actions. In a receding horizon framework, one way of accounting for some temporal-coverage requirement at a future time \(T_s\) is to plan for an action horizon \(H\) long enough that includes \(T_s\). A longer horizon usually increases the size of the problem (size of action vector increases if we hold action time interval constant) which will be more difficult to solve. We propose a novel approach that integrates temporal-coverage requirements into our multi-step look-ahead planning framework without the need to vary the horizon length.

### 7.4 Simulation: ground station temporal-coverage

In this final scenario, we demonstrate the ability of our algorithm to account for temporal-coverage constraints. A team of \(Q=5\) UAVs with id \(i \in [1,2,\ldots ,Q]\) are assigned to a decentralized search operation. A ground station monitoring the operation is located at the center of the map with a coverage radius of \(R_s = 50\) m. UAVs within this coverage region are assumed to be able to report collected observations to the ground station. In addition to the search mission objective previously defined in Eq. (28), each UAV is required to pass-by the ground station at \(H_s = 40\) s intervals in a predefined sequential ordering. This requirement can be modeled as a temporal-coverage constraint and integrated into decentralized decision making. Each UAV-i is initially assigned the value \(T_s^i = i \times H_s\) and this parameter is incremented by \(Q \times H_s\) after its groundstation revisit.

## 8 Discussion and future work

We have presented a novel set of spatial–temporal constraints for decentralized multi-agent information gathering tasks. We showed how these constraints can be efficiently integrated into an online constrained optimization framework and demonstrated the behavior of our algorithms through simulated examples of multi-UAV trajectory planning for the target search problem. To validate computational and communication efficiency for practical applications, we performed experiments with a team of indoor mobile robots with modest onboard computation and communication resources.

In this work, we demonstrated coordination in the context of probabilistic search, which is an important information gathering problem with many practical applications. However, our constraint models can also be directly applied to other information gathering objectives such as mapping and tracking.

Our current constraint models are restricted to 2D constant velocity motions and constant curvature segments. Important avenues for future work include extending the collision constraint model to 3D space and considering variable velocity and curvature along a segment. These extended models could then be applied to many 3D trajectory planning problems, such as 3D collision-free path planning. In particular, ellipsoidal safety regions are important to consider because localization uncertainty is usually represented as a Gaussian ellipsoid. Many missions and approaches are able to predict the future localization uncertainty of a platform along its planned trajectory, and the generalized collision constraint model could then take advantage of this localization uncertainty information in planning. With this model, it may be possible to represent safety constraints as soft constraints, in which the optimization problem can be formulated as the minimization of some form of an arbitrary risk reward function. It is also useful to generalize the constraint model to handle different forms of safety regions and other indoor obstacles such as straight wall edges.

The current form of network connectivity constraint is based on a constant network topology. This constant topology highly restricts the flexibility for the team since the network is rigidly connected. Extending the ability of this model to handle dynamic network topologies will have great impact on the team planning flexibility, potentially yielding better information gathering performance. For the case of temporal-coverage constraints, the current model is derived based on the absence of other spatial–temporal constraints that could potentially affect the shape of the agent’s reachable set. Extensions are necessary to generalize the model for arbitrary-shaped reachable sets. Furthermore, we would like to consider dynamic combinations of constraint models to yield more sophisticated coordination. For example, a network-connected team could decide to split and then regroup at a particular region at a specific time, sacrificing temporal network discontinuity in gaining a better team reward. When the team splits, network connectivity constraints could be replaced by a temporal-coverage constraint, and then later re-activated once the team meets.