Trajectory Planning and Optimization for Minimizing Uncertainty in Persistent Monitoring Applications

This paper considers persistent monitoring of environmental phenomena using unmanned aerial vehicles (UAVs). The objective is to generate periodic dynamically feasible UAV trajectories that minimize the estimation uncertainty at a set of points of interest in the environment. We develop an optimization algorithm that iterates between determining the observation periods for a set of ordered points of interest and optimizing a continuous UAV trajectory to meet the required observation periods and UAV dynamics constraints. The interest-point visitation order is determined using a Traveling Salesman Problem (TSP), followed by a greedy optimization algorithm to determine the number of observations that minimizes the maximum steady-state eigenvalue of a Kalman filter estimator. Given the interest-point observation periods and visitation order, a minimum-jerk trajectory is generated from a bi-level optimization, formulated as a convex quadratically constrained quadratic program. The resulting B-spline trajectory is guaranteed to be feasible, meeting the observation duration, maximum velocity and acceleration, region enter and exit constraints. The feasible trajectories outperform existing methods by achieving comparable observability at up to 47% higher travel speeds, resulting in lower maximum estimation uncertainty.


Introduction
Persistent sensing and data collection is important for tracking environmental phenomena such as atmospheric pollution [1,2] and wildfire ignition at the wildland-urban interface [3,4]. Early detection of dangerous conditions is a critical factor in mitigating damages to ecological systems and human infrastructure [5]. Persistent monitoring methods have utilized satellite data, human-piloted aircraft, and networks of stationary sensors, optimizing their placement for a variety of resource constraints [6][7][8][9]. Satellite data is often at too low resolution (e.g. pixel widths of 1 km for MODIS and 375 m for VIIRS) or too infrequent to detect rapidly emerging trends [10][11][12][13]. Regular flights of human-piloted vehicles is useful for ongoing tracking efforts in localized regions but is too expensive for large-scale monitoring. Dense sensor deployments can monitor the environment effectively but as the size of the region grows, sensor network deployment and maintenance can quickly become cost prohibitive. Stationary sensing networks also suffer from information redundancy since many environmental phenomena are highly temporally and spatially correlated.
Recent persistent monitoring research has focused on developing motion planning and control techniques for mobile sensing robots, such as unmanned aerial vehicles (UAVs), that can monitor large environments and deliver high-frequency, high-resolution data more effectively [14][15][16][17][18][19][20][21][22]. Technological advancements in sensing, computation, and communication enable UAVs to supply critical real-time data for a variety of applications, depending on the equipped visual, thermal, chemical, or depth sensors.
The goal of persistent monitoring is to minimize estimation uncertainty by planning an optimal trajectory for the sensing platforms. Probabilistic filtering techniques are employed to integrate the sensed information over time into an estimate the signal of interest. Kalman filtering is commonly used [22,23], including generalizations such as extended and unscented Kalman filtering [24], particle filtering [25], and variational inference [26].
Researchers in persistent monitoring have used a variety of objective functions for quantifying the effectiveness of an informative trajectory. We chose to minimize the maximum eigenvalue of the steady-state covariance matrix, which intuitively represents a bound on the uncertainty of the model at any POI and is commonly referred to as the spectral radius or E-optimality [27]. Other potential objective functions include the trace of the covariance matrix, which is the sum of the uncertainty, or the determinant of the covariance matrix, which is the volume of uncertainty [28]. By optimizing for the steady-state uncertainty, the developed controller may perform suboptimally for initial cycles of a periodic traversal of a trajectory, but it can be shown that the uncertainty evolution converges at an exponential rate from any initial covariance to a cyclic pattern of values (Sec IV.A in [29]).
Our previous work [22] proposes an initial solution for a UAV flying along a prescribed sensing trajectory. The objective was to plan a velocity profile along the path that minimizes the maximum eigenvalue of the estimation covariance matrix of a Kalman filter. In contrast to other works, which only optimize to minimize the revisitation frequency [30], the method balances the number of consecutive measurements at a point with the frequency of visitation. While the generated trajectory obeyed maximum velocity constraints of the UAV, the sensing trajectory was not continuous, and resulted in paths that were infeasible to follow (Fig. 1).

Contributions
We present a planner for feasible trajectories for persistent monitoring with UAVs. Modern approaches [14,22,23,31] ignore the dynamics of monitoring platforms, resulting in violations of full observability assumptions during simulations with controllers that track a trajectory. In contrast, our proposed feasible Trajectory Optimization for Persistent monitoring (f-TOP) plans a trajectory, adhering to a UAV's dynamic constraints while minimizing estimation uncertainty, with the following contributions: 1. Bi-level optimization that plans a minimum-jerk, three times continuously differentiable trajectory that minimizes steady-state estimation uncertainty, 2. Greedy Knockdown Algorithm to determine the optimal number of observations of each POI to minimize worst-case bound on estimation uncertainty for periodic sampling, 3. Second-order cone program (SOCP) for calculating B-spline trajectory coefficients that enforce constraints of vehicle dynamics.
The remaining paper is organized as follows. Section 2 outlines related work relevant to trajectory planning for persistent monitoring. Section 3 describes the problem formulation. Section 4 provides background information necessary for the formulation of our optimization. Section 5 describes our optimization approach for planning feasible trajectories. Section 6 compares our proposed method against baseline and existing works. The paper concludes with Section 7.

Related Work
Our contribution is at an intersection of several related research areas in persistent monitoring. Inspired by the periodic sensing results from the stationary sensing literature [8,9], we extend Kalman filter convergence results to persistent monitoring scenarios. We focus on planning dynamically feasible trajectories for robotic sensing platforms that better meet observability assumptions, which are commonly assumed but often unachieveable in existing works without significant reduction in robot performance. In this section, we provide a brief overview of relevant literature in sensor scheduling, trajectory planning for persistent monitoring, and feasible trajectories. Our proposed f-TOP algorithm attempts to plan trajectories that adhere to the dynamic constraints of the system, resulting in trajectories that can be tracked and remaining within the sensing distance of a POI. The First-Order and Direct Planner generate infeasible trajectories with discontinuities in the velocity profile

Sensor Scheduling
In sensor scheduling, a group of stationary sensors attempts to measure and approximate a phenomenon in a resourceefficient manner by selecting a subset of sensors to sample at each time step. Initial work calculates bounds on the estimation uncertainty by forming sensor scheduling as a convex problem problem and creating heuristic and open-loop policies [6,32]. Zhao et al. [29] advance the sensor scheduling problem for infinite time horizons, proving that for linear Gaussian processes and sensing functions the steady-state solution is independent of initial estimation covariance and the optimal estimation can be approximated arbitrarily close by a finite, periodic schedule. Greedy scheduling algorithms work well in many cases, approaching optimal results at reduced computational complexity. Jawaid and Smith [7] leverage submodularity of the estimation error to show that greedy approaches can approach optimal results, and additional efforts prove greedy algorithms can be optimal for specific cases of uncorrelated noise [8,9].
The primary difference between persistent monitoring with robotic agents and sensor scheduling is that there is a non-negligible switching cost between targets for robotic sensing platforms, which must physically move to measure new regions. Despite the difference, we can leverage and extend sensor scheduling results [9,29] by modeling our POIs as uncorrelated and proposing a finite, periodic schedule with an additional period where no observations occur.

Trajectory Planning for Persistent Monitoring
Instead of relying on fixed-position sensors for estimation, an alternate cost-effective approach for sensing across a large area is UAV-mounted sensors [33]. The simplest formulation of persistent monitoring a set of discrete POIs is the patrol problem [34], where every POI must be visited with the objective of minimizing time between revisits. Minimizing time between revisits results in a near-optimal strategy when all POIs have equivalent uncertainty in state transitions, but are sub-optimal when the POIs have different uncertainty characteristics.
For POIs with varying rates of uncertainty, persistent monitoring trajectories can be planned that optimize one of several different metrics of uncertainty, such as maximum eigenvalue, log determinant, or trace of an estimate covariance matrix [28]. Decomposing trajectory generation into separate path planning and velocity control problems allows for developing optimal solutions for constrained problems with reduced complexity [35]. The first approaches for planning persistent monitoring trajectories optimize wait times, visit orders, and velocity along a given 1-dimensional path.
For a given 1-dimensional path, the persistent monitoring problem can be simplified into calculating a required length of observation time at each POI prior to travelling to the next. The length of observation time is modelled based on different accumulation and clearing models of uncertainty. Smith et al. [14] represent uncertainty at POIs is modelled as a linear, continuous accumulation model with uncertainty reduction directly proportional to the time spent measuring a point, planning a periodic velocity as a composition of basis functions that minimized the maximum accumulation value. Similar works expand the concepts to different uncertainty accumulation models that clear in linear proportion to the distance of nearby robotic platforms [15] and for symmetric, non-linear accumulation models [16]. Ostertag et al. [22] plans a velocity along a prescribed trajectory by determining the optimal dwell times using a greedy application of an Kalman filter at an infinite-time horizon. Researchers have planned trajectories in two-dimensional space using the concept of infinitesimal perturbation analysis, which approximates the gradient of the cost function but with limited guarantees of optimality [18,20].
Related to persistent monitoring is the field of informative trajectory planning where a robotic agent optimizes for information gain while travelling through an environment but does not necessarily measure POIs periodically or with full observability. A common formulation is the orienteering problem [36], whereby an agent travels along a graph and collects rewards for visiting locations, attempting to maximize the reward with a limited resource spent on each edge taken. For problems formatted on continuous planes, optimal solutions are NP-hard, but recent work in sampling-based path planning [19,37,38] and finite-horizon, search-based path planning [39] show promising results for exploration problems, which are similar but not immediately applicable to persistent monitoring.

Trajectory Representation for UAVs
When dealing with real UAV systems, trajectory continuity is important to consider since trajectories with abrupt changes in velocity or acceleration, such as are common in simple waypoint paths, will result in high tracking error when a UAV is operating at high velocities. Continuous trajectories are more complicated to solve, so researchers have simplified the problem into graph searches using a discrete number of fixed motion primitives [40,41]. While these approaches are useful for planning real-time actions due to the limited number of movement patterns, they are almost always sub-optimal.
To handle the complexity of developing continuous trajectories, researchers have formulated the trajectory optimization as finding coefficients for a set of basis functions. By limiting to searching for the set of coefficients, the solution space is sufficiently limited to formulate the problem in a tractable manner. Objective functions commonly minimize the snap (4th derivative) [40,[42][43][44][45] or jerk (3rd derivative) [46,47] of the trajectory. Quadrotors are differentially flat [48], meaning that for a selected set of trajectory parameters, typically position in three dimensions and yaw, the control inputs can be precisely calculated from the first three derivatives of the trajectory.
Initial work in developing trajectories for quadrotors uses piece-wise polynomial basis functions. The polynomial basis functions have issues with numerical stability and are difficult to fit to non-linear constraints on maximum velocity and acceleration. Mellinger et al. [42] formulate the optimization as a quadratic program optimizing for coefficients of piecewise two-time continuously differentiable polynomial basis curves, solving for an arbitrary time range and scaling to meet any required constraints. The constraints are linear, however, and checked only at discrete time instances, which results in a large number of linear constraints and no guarantee of meeting maximum velocity or acceleration constraints. Mueller et al. [40] resolves the numerical stability of polynomial basis functions by for end point conditions instead of coefficients, proving a simple mapping between end point conditions and the coefficients. Richter et al. [44] improves the speed to solve the optimization by solving an unconstrained quadratic optimization followed by a check on constraints. Polynomial bases are numerically ill-conditioned for complex paths, and as the order of the polynomial increases to meet pathing demands, properties such as curvature and the extrema of derivatives must be found using numerical root-finding methods.
To solve the shortcomings of polynomial basis functions, researchers began exploring Bezier curves and B-splines. This class of curves are defined by a set of control points, are numerically stable, and have beneficial properties of convexity and linear formulation of derivatives. The last property enables simple calculation for a bound on the maximum magnitude of lower order derivatives. B-splines have a limited support for any given point in time with the support transition in accordance to a set of coefficients called knots. The limited support enables further numerical stability and complex maneuvers. Ding et al. [49] leverage the properties of B-splines to efficiently solve for a path using a kinodynamic search and elastic optimization to fine tune control points. Zhou et al. [50] utilize a similar method, but model the path as a non-uniform B-spline with an iterative time adjustment method, adjusting the knot spacing to achieve more aggressive trajectories. Due to the convex hull property of B-splines, both methods place constraints on the control points to avoid collisions.
Due to the advantages of numerical stability, continuity, and complex pathing while meeting dynamic constraints able to meet dynamic constraints of a robotic platform, we plan our trajectories by developing optimizations for the control points of B-splines. Unlike [50], our proposed trajectory optimization minimizes the magnitude of jerk, which enables simple computation for a 4th-order B-spline.

Problem Formulation
Given a set of points of interest (POIs) in the environment, we focus on planning a dynamically feasible trajectory for a quadrotor robot to estimate a time-dependent value of interest at these points with minimum uncertainty. Prior to stating the problem formally, we define the environment model, quadrotor dynamics, and the sensing model. The paper follows the convention of denoting matrices by capitalized bold symbols A and vectors by lowercase bold symbols a.

Environment Model
Consider a set of N POIs spread throughout an environment ] ⊺ of an environmental phenomenon at the N POIs over discrete time steps t k , t k+ 1 , …. Time is discretized with sampling frequency f s and associated period f −1 s such that t k+1 = t k + f −1 s ∀ k ∈ ℤ ≥0 . To simplify notation, variables that depend on time are denoted with a subscript k where appropriate, e.g. A k = A(t k ). We assume the values at each location are independent and time varying according to the following discretized model: where m k are independent increments drawn from a Gaussian distribution with mean value change μ k and a diagonal covariance Wf −1 s .

Quadrotor Model
A quadrotor equipped with a sensor is attempting to estimate the phenomenon at each POI. The state of the quadrotor is defined by its orientation R ∈ SO(3), world-frame position ∈ ℝ 3 , velocity ̇ ∈ ℝ 3 , acceleration ̈ ∈ ℝ 3 , and jerk ⃛ ∈ ℝ 3 . The control inputs of the quadrotor are the bodyframe thrust f ∈ ℝ ≥0 and rotational velocity ∈ ℝ 3 , which are directly related to the rotational velocity of the four rotors [42]. The quadrotor state evolves with the following dynamics: (1) where m is the mass of the quadrotor, g is the acceleration due to gravity, e 3 is the unit vector for z-axis in the world frame, and ⋅ ∶ ℝ 3 ↦ (3) denotes the hat map, which maps 3-D axis-angle vectors to a skew-symmetric matrix.
The quadrotor system is differentially flat with respect to its position s and yaw angle ψ states [48]. This means that, given a desired time-parametrized flat trajectory s d , ψ d , we can recover the control inputs f and ω that will track the desired trajectory: where R d and Ṙ d are the desired rotation and rotational velocity. The desired orientation R d is obtained from rotation about yaw ψ d and desired acceleration a d according to [42]. The desired rotation and acceleration are provided by the planned trajectory that is generated in Section 5, which the UAV is attempting to track.
The quadrotor has bounds on its velocity and acceleration due to physical constraints. The total thrust that can be generated by the motors is limited, resulting in a maximum acceleration bound. To account for the lack of air drag in the simplified dynamics model, we also introduce a maximum velocity constraint. Hence, all intended quadrotor trajectories should lie in the admissible set: where C 3 (ℝ, ℝ 3 ) denotes the space of three-times continuously differentiable functions ℝ ↦ ℝ 3 , v max is the maximum velocity magnitude, and a max is the maximum acceleration magnitude.

Sensing Model
A sensor is mounted on the quadrotor that can sense a POI q i if it is within a given closed, convex region B i , which represents the union of all UAV positions such that q i is in the sensor field of view. An example trajectory and observation region are shown in Fig. 2. The sensor captures observations with a sampling frequency of f s and each measurement is corrupted by additive noise: where H k is a binary observation matrix and η is additive, Gaussian observation noise with a positive semidefinite covariance matrix V. The matrix H k models the sensor field of view by assigning 1 to visible POIs and 0, otherwise. Equation 8 models a sensor that provides direct measurements of the features x k at the POIs. For example, a downward-facing visible-light or thermal camera used for vegetation imaging or temperature estimation can be modeled by Eq. 8.
The true value of the environmental phenomenon is continuous in time but due to the discrete-time observations, x k is estimated using a discrete-time Kalman filter. The filter maintains a mean estimate z k and a covariance matrix Σ k , whose values are predicted using the environment model in Eq. 2 and updated using the sensor measurements and the sensing model in Eq. 8. As shown in Fig. 3, the predict stage incorporates the Gaussian noise of the environment process to obtain a predicted estimate z − k and associated uncertainty − k , and the update stage applies the information gained from an observation y k to obtain an improved estimate z k with uncertainty Σ k . The estimated value is important for data collection and is introduced for completeness, but our trajectory optimization is only concerned with minimizing the covariance of the estimation. Despite the discrete updates of the Kalman filter, the precise values of the covariance matrix can be calculated at any point t using the following: Each observation reduces the estimation uncertainty of a POI by a monotonically decreasing amount. This property of diminishing returns of the Kalman filter has been exploited to optimize sensor selection and scheduling in previous works [8,9,51,52]. As we describe later, the number of consecutive observations of a POI d i will be an important optimization target due to the inherent trade off between reducing the uncertainty of individual POIs and reducing total cycle time, impacting the uncertainty of all POIs as described in Section 4.1.

Problem Statement
Our objective is to plan a trajectory for a sensing quadrotor that results in a minimum estimation uncertainty across all POIs. While researchers have used a variety of optimization goals [28], we choose to minimize the maximum eigenvalue of the steady-state covariance matrix, which bounds the estimation uncertainty at any POI and is commonly referred to as E-optimality [27].
When considering persistent sensing over an infinite time horizon, optimal trajectories have been proven to be approximately periodic [29]. Hence, we restrict our focus to the class of periodic trajectories, which maintain full observability of the POIs at every loop. An infinite time horizon is selected for the optimization goal because any initial covariance will converge to an infinite-horizon steady-state condition exponentially fast for a periodic trajectory [29], removing the influence of initial conditions. Additionally, we obtain a solution that works for any initial position along the trajectory, which is equivalent to starting the trajectory at any point within a sensing period. While suboptimal in some instances, removing the reliance on specific timing of sampling allows for a lower cycle time and guarantees a computable upper bound for the maximum estimation uncertainty.

Problem 1 (Minimizing Steady-State Uncertainty subject to Trajectory Feasibility Constraints)
For a quadrotor with dynamics in Eqs. 3-6 and POIs Q , find a periodic, timeparameterized trajectory s with loop period T to minimize the maximum eigenvalue ∞ max of the steady-state Kalman filter covariance matrix Σ that evolves as defined in Eq. 11 at an infinite-time horizon for any initial sampling position: where S is the admissible set of trajectories defined in Eq. 7 and ∞ max is defined as: where max is the maximum eigenvalue of Σ.
Our approach to generating a feasible minimum steadystate uncertainty trajectory considers three subproblems: 1. Determine POI visitation order, 2. Calculate optimal number of consecutive observations, 3. Generate a dynamically feasible trajectory.
As outlined in Fig. 4, the POI visitation order is calculated once for a given set of POIs q. Then, the Greedy Knockdown Algorithm calculates the optimal observation periods d that minimize the maximum estimation uncertainty. The Feasible Trajectory Optimizer for Persistent Monitoring (f-TOP) calculates minimum-jerk trajectories that meet the observation periods d and dynamic constraints of the system. If the feasible trajectory timing does not match the optimal observation . 3 The Riccati update procedure for a discrete Kalman filter consists of two stages: a Predict Stage that updates the mean and covariance of the estimate based on the known system model and an Update Stage that refines the prediction according to the measured value and observation noise

Optimal Observation Lengths
Our goal is to minimize the estimation uncertainty of a set of POIs by determining the optimal observation length for each POI. This presents an interesting trade off; the more time spent observing any single POI, the more uncertain we become about the current value of the other POIs.
In this section, we outline how to calculate the maximum eigenvalue of the Kalman filter estimation covariance at steady state and describe the trade-off between minimizing loop time and decreasing the uncertainty for a single POI. Then, we describe an approach to calculate the optimal observation lengths for a set of POIs. First, we determine an optimal visitation order for the POIs, which is equivalent to finding the shortest cyclic path through the POIs. For calculating the optimal observation lengths for each POI, we implement the Greedy Knockdown Algorithm (GKA) from our previous work [22] with updated stopping criteria. The output of GKA is an optimal continuous observation length for each POI, which is used as a constraint when calculating feasible paths in Section 5.

Steady-State Kalman Filter Bounds
For a discrete Kalman filter, a periodic sequence of observations will result in the estimation covariance to converge exponentially fast [29] to a periodic steady-state covariance Σ(t) where for a loop with cycle time T, Σ(t) = Σ(t + T) as t → ∞ . We define the steady-state value for the i-th eigenvalue immediately prior and after observation d as (d) i and with the maximum value at steady-state i = (1) i for simplicity. For POIs with uncorrelated values, the eigenvalues are ordered such that λ i corresponds to the maximum uncertainty of POI q i . The maximum eigenvalue of the steady-state covariance matrix is guaranteed to be bounded if every POI is measured at least once per cycle.

Lemma 1 (Bounded Uncertainty Guarantee 9)
If all POIs Q are observed at least once during a cycle of duration T, the maximum eigenvalue of the steady-state Kalman filter covariance Σ will be bounded as lim During each measurement of POI q i , d i ∈ ℤ ≥1 consecutive observations are made where each observation is binary as described in Eq. 10. Additional observations result in a reduction in uncertainty such that (d+1) i , but the uncertainty reduction decreases for each observation added i . An example of the evolution of the eigenvalues over time at steady state can be seen in Fig. 2.
For uncorrelated POIs at steady state, the reduction in uncertainty from the d i consecutive observations is followed by a linear increase in uncertainty in proportion to W. The lack of correlation means that each eigenvalue evolves independently with the same cycle period T. For the purpose of computing the maximum eigenvalue of the steady-state covariance, the observations of each POI can be shifted in time to align, such that the first observation of each POI occurs at the same time instance. The rearrangement of observation times can be captured by a new condensed observation matrix ̃ k where k is the number of discrete observations. Let E i be the elementary matrix with 1 at its i-th diagonal entry and 0 everywhere else, then ̃ k is defined as: 1 is the identity matrix, indicating all POIs are observed at least once, and the last non-zero observation matrix ̃ d contains 1s for the POIs that had the highest number of observations. With all observations condensed, the maximum uncertainty for each POI λ i with the modified time alignment will occur immediately prior to the application of ̃ 1 , which is described more formally in Eq. 17. For a periodic cycle where T mod f −1 s = 0 , the uncertainty can be calculated precisely. Under these conditions, each observation per loop will be taken at the exact location as the previous loop. Aperiodic cycles where T mod f −1 s ≠ 0 can be converted to periodic by raising T to the nearest f −1 s multiple. Using ̃ k , the worst-case interobservation time is given as: where ⌈⋅⌉ is the ceiling function. For high sampling frequencies, the change in cycle time will result in minimal change in overall error, but for lower sampling frequencies, which may occur with slow-responding air quality sensors or certain image processing applications, this rounding can impact the overall performance of the system.
The worst-case steady-state uncertainties ̃ resulting from using ̃ k and T forms the following system of equations: where ̃ 1 contains the maximum uncertainty for all POIs immediately prior to the first observation, such that [̃ 1 ] i,i = i . For smaller values of d , the equations can be quickly solved directly, but for larger values of d , the problem can be solved efficiently using structure-preserving algorithms [53].

Visitation Order
The cycle time T can be split into two components: T i , the time spent observing q i , and T i→i+1 , the travel time between successive observation regions B i and B i+ 1 . In this section, we present a solution for minimizing the interobservation travel time by formulating the problem as a standard TSP.
Modern solutions to the TSP involve minimizing the total cost to visit all points and finding a Hamiltonian tour that only visits each POI once. For our application, the cost between POIs is the time required to travel between them. A state-of-the-art exact TSP solver is Concorde [54], which leverages Lin-Kernighan heuristics, branch-and-bound methods, and other upper-and lower-bound methods to converge to solutions quickly for real-world problems but still with exponential complexity for worst-case situations.
We restrict our trajectory to the class of periodic trajectories that is periodic for position and all higher order derivatives, such that: Due to being periodic, the trajectory finishes a loop with travel from the last q N to the first POI q 1 .
For a given visitation order, the travel time between POIs can be fixed by selecting entrance and exit points for the observation region B i for each POI, which separates the problem into parallel subproblems of generating a trajectory The fixed entrance and exit positions for each observation region enable the trajectory s to be decomposed into trajectories that are within an observation region s i and interpoint trajectories that are between observation regions i→j , such that: The interpoint trajectories are fully defined as the linear interpolation between out * i and in * j with ‖̇ ‖ = v max and ‖̈ ‖ = 0 . Continuity is maintained between the trajectory segments by constraining the trajectory within an observation region to begin at the entrance position and end at the exit position with 0 acceleration and velocities aligned with the incoming and outgoing trajectories. By constraining the trajectory at each entrance and exit position, trajectories within each observation region can be calculated independent of each other by the optimization outlined in Section 5.2.
Although we generate a visitation order by solving a TSP, the interpoint travel times and entrance and exit positions can be calculated for any arbitrary visitation order of POIs. However, the visitation order has a direct impact on the (20) in * observation periods calculated in Section 4.3 and the trajectories generated in Section 5, so any change to the visitation order would require re-executing the iterative trajectory generation portion of f-TOP.

Greedy Knockdown Algorithm
Prior to generating the trajectory s i through each observation region B i , the length of time spent observing each POI must be determined. Since we constrict the trajectory to visit each POI once per cycle for one or more consecutive observations, the goal is to determine the number of consecutive observations that will minimize the worst-case steady-state uncertainty. To accomplish this, we extend the Greedy Knockdown Algorithm from our previous work [22] by updating the method with an improved stopping criterion and more efficient computation. We chose to implement a greedy algorithms because they have been shown to be near-optimal for approximate supermodular metrics [52], such as the mean-squared error and spectral norm of the estimation covariance matrix, and optimal when state and sensor noise are uncorrelated [9].
Our proposed solution GKA, shown in Algorithm 1, determines the optimal number of observations of each POI d i from the interpoint travel times T i→j , the uncertainty in the environmental model W, and the covariance of the noise in the observation model V. The initial run of GKA leverages the assumption that each POI is observed at least once and sets d i = 1 ∀ i. Following runs of GKA calculate the number of initial observations from the minimum traversal time of each subregion, an output of the proposed f-TOP algorithm in Section 5.2.
GKA is a greedy algorithm that proceeds in iterations denoted by a (Algorithm 1). The algorithm may progress for several iterations beyond the optimum, so the iteration that contains the optimal number of iterations and the final iteration may be different.
For each iteration, a bound on the maximum steadystate uncertainty for each point is calculated by forming ̃ (Algorithm 1 Line 4) and computing the steady-state uncertainty (Algorithm 1 Line 10) using the techniques outlined in Section 4.1. The bound is the result of solving the nested Kalman filter formulation in Eq. 17 with the special observation matrix ̃ . An observation is added to the POI with the largest estimation uncertainty as denoted by the i-th diagonal element in ̃ (Algorithm 1 Lines 6-8). The algorithm continues until termination conditions are met that guarantee the minimum upperbound on the maximum eigenvalue has been found. The last step is to check all iterations and return the optimal number of observations of each POI d i associated with the minimum upperbound.
Each iteration of GKA adds an observation to the POI with the current maximum estimation uncertainty, and this addition has competing effects. Firstly, the additional sequential observation reduces the estimation uncertainty of the POI being observed. As the number of consecutive observations of a single POI increases, the uncertainty of that POI reduces exponentially and asymptotically to the value that a stationary sensor with the sample sampling characteristics could achieve if left at the POI. Secondly, each observation increases the overall cycle time by f −1 s (i.e. T a+1 = T a + f −1 s ), resulting in a linear increase in uncertainty of all other POIs. The GKA stopping criterion in Algorithm 1 Line 9 identifies the point when additional observations of a POI can no longer improve the maximum estimation uncertainty (Fig. 5).
Since the reduction in uncertainty decreases for each subsequently added observation, after some number of observations, the reduction from adding an observation will be less than the uncertainty increase when an observation is taken at another point. Once this critical point is reached, GKA is stopped after the maximum uncertainty for any two following iterations is higher than the maximum uncertainty at the latch point (Algorithm 1 Line 9). Algorithm 1 uses the array notation ̃[ a, ∶] to denote the eigenvalues associated with iteration a of the algorithm generated with the modified observation matrix ̃ . For simplicity in describing the progression of the algorithm and stopping criterion, it is useful to express the eigenvalues for a specific iteration a and consecutive observation d i in a single term ̃( d i ,a) i . Formally, the stopping criterion is stated as follows. The result of GKA is the number of observations required for each POI to minimize the upperbound on the steady-state uncertainty. Each of the d i observations requires a period of f −1 s , which serves as constraints for the length of the observation period. In the next section, the trajectory optimization generates a smooth trajectory that ensures that the required observation period while adhering to the dynamic constraints of the UAV.

Minimum Uncertainty Trajectories
From the observation lengths calculated by GKA (i.e. d i f −1 s ) and physical constraints of the platform, we generate a feasible trajectory that meets these constraints while minimizing the average jerk. As shown in Eq. 22, the whole trajectory can be decomposed into interpoint trajectories and trajectories within an observation region s i . The interpoint trajectories are fully defined by a linear interpolation between consecutive observation regions with maximum velocity and no acceleration. In this section, we present a method for generating the remaining trajectories within the observation region using B-spline trajectories to ensure adherence to constraints on the dynamics of the system, the required observation length, and trajectory continuity at each transition between trajectories within and between observation regions. First, we present a brief background on B-spline trajectories and then we present our proposed f-TOP algorithm for generating feasible trajectories within each observation region.

B-Spline Trajectory
To generate C 3 continuous trajectories that meet constraints on velocity and acceleration, we leverage a B-spline representation. B-spline curves have an attractive property that the curve and its derivatives are contained within the convex hull of their control points and offer controllable levels of continuity and numerical stability due to their limited, timevarying support.
B-splines are piecewise polynomials defined for each subregion i by a set of control points C i , k-th order basis functions N k (t), and knots τ j . The k-th order basis functions can be calculated using the Cox-de Boor algorithm as follows: where N k (t) j is the j-th basis function of N k (t).
Each basis function provides limited support over the interval τ j ≤ t < τ j+k . At any given point along the curve, a B-spline function will have a maximum support of k + 1 different basis functions, transitioning between basis functions at each knot. We focus on developing trajectories with uniform B-splines, which have a uniform spacing between all knots, such that Δτ = τ j + 1 − τ j . At the transition points defined by the knots τ k , the curve has C k− 1 continuity. The derivative of the B-spline bases results in a linear combination of lower order bases, which can be succinctly represented as: where C i is a matrix of the control points, A (a) is a Toeplitz matrix that captures the relationship between C and the control points of the a-th derivative C (a) , and N k − 1 (t) is vector of the value of each basis function at time t. For the derivation of Eq. 27 see Appendix A.2.
Due to the fact that the basis functions sum to unity ∑ j N j,k (t) = 1 , ∀t, the value of a function defined by B-splines will remain within the convex hull created by the control points of the support. As shown in Eq. 27, the derivative In the case where a robotic platform has the same maximum allowable velocity and acceleration for the entirety of its trajectory, the constraints are guaranteed to be met if the magnitude of all control points c are less than the limiting value (i.e. v max or a max as defined in S).

Trajectory Optimization with f-TOP
Trajectories that are optimized to be informative must take the dynamic constraints of the sensing platform into account. If a trajectory cannot be followed and observation locations are different from planned, the resulting uncertainty in an estimate can be arbitrarily bad. Our proposed Feasible Trajectory Optimization for Persistent Monitoring (f-TOP) takes the point order and required number of observations for each POI and generates a feasible trajectory that guarantees the correct number of observations are taken within each sensing region while meeting dynamic constraints of the UAV.
The proposed f-TOP optimization has a multi-objective formulation, primarily to minimize the maximum steadystate uncertainty as formulated in Eq. 13 and secondarily to provide the smoothest possible path as shown in Eq. 28a. By applying the results from Section 4.2 and the output of GKA from Section 4.3, minimizing the maximum steady-state uncertainty can be simplified to minimizing the total travel time subject to constraints on entrance and exit conditions in Eq. 28c and the minimum number of required observations in Eq. 28e for each observation region.
The secondary objective of a smooth path is formulated as a minimization of the L 2 -norm of the jerk of the trajectory with a feasibility constraint that the trajectory remains within the valid state space S as defined in Eq. 7. By restricting the trajectory s to S in Eq. 28b, the maximum velocity and acceleration are bounded by the limits of the platform.
As smoothness is a secondary objective, the coefficient can be set arbitrarily small so that it does not affect the primary goal. The full multi-objective formulation is as follows: where the trajectory is segmented as described in Eq. 22. Note that since GKA assumes that the UAV remains within the observation region of a POI for a continuous period of time, the additional constraint of s i ∈ B i is applied.
The optimization and constraints from Eq. 28a can be reformulated using B-splines into a quadratically constrained quadratic program (QCQP). The primary objective is simplified by removing the interpoint travel times T i→j , which are constant due to the entrance and exit constraints, and the time within each observation region is updated with the number of control points and a fixed, uniform knot spacing such that T i = M i Δτ i . The secondary objective of minimum jerk is converted into a linear relationship of control points using Eq. 27, which form a convex hull of achievable values for the trajectory derivatives. For the special case of a 4thorder B-spline with k = 4, the objective can be simplified into a quadratic form (see Appendix A.3) as shown below: s.t. Eqs. (28b) and (28c) where i ∈ ℝ M i ×3 is the matrix of all control points and c i,j is a vector of the j-th control point for the i-th observation region, A (3) is a third-order Toeplitz matrix capturing the relationship between control points of B-spline derivatives as in Eq. 27, the first and second order control points are computed as ..,N} is the POI index, and j ∈{1,...,M i } is the interior control point index. Note that M i only refers to the control points within the observation region. When implementing, k − 1 additional control points must be added outside of the observation region to define the entrance and exit conditions in Eq. 28c where s i is computed from control points and B-spline basis functions as described in Eq. 24.
The objective of the optimization problem in Eq. 29a has a linear and a quadratic component that defines the mean-square of the jerk with quadratic constraints on the magnitude of position, velocity, and acceleration. This form of QCQP can be efficiently solved by modern convex optimization solvers with interior-point methods [55]. While the problem is solvable, the constraints of the problem are restrictive with many values of Δτ resulting in infeasible conditions. The feasibility of the constraints is closely linked to the time that the robotic platform remains within the observation region M i Δτ, the size of the observation region B i , and the dynamic constraints. Larger M i values result in more flexibility in the planner in terms of the number of control points, and so long as the robotic platform can remain within the sensing region of a point, which is determined by v max and a max , and has a sufficient number of control points, then it can meet the enter and exit constraints of the region of interest.
To better optimize the search for a feasible domain, we introduce a bi-level formulation of the optimization problem with slack variables that guide the selection of appropriate M i and Δτ values. The outer problem controls the number of control points and knot spacing while the inner problem minimizes the slack of the constraints, which is added to provide direct information for how close the solver is to a feasible solution. The inner problem refits the constraints from Eq. 29a with slack variables for the velocity constraint π v , acceleration constraint π a , and overall slack π. The outer problem can adjust M i and Δτ i with gradient descent methods to better approximate good values for subsequent iterations. The inner problem is defined for a given observation region B i as a second-order cone program (SOCP) as follows: s.t. Eqs. (28b), (28c), and (29b) ij , ̇ (y) ij , and ̇ (z) ij represent the first, second, and third entry of ̇ ij , respectively, π i is an upperbound on the slack, π is an added weight to the acceleration, and the slacked constraints are formulated as a second-order cone. An n-dimensional quadratic cone is defined as: n and the normalized velocity and acceleration constraints are transformed from Eq. 29b to the following form: where π v and π a are the slack for the velocity and acceleration constraints, respectively. While it may look difficult to solve a SOCP with NM i cone constraints, each region of interest is independent due to the entrance and exit constraints. Each independent problem results in 2M i fourthorder cones and a positional constraint to remain within the convex optimization region, which is easily solvable by modern standards. The final bi-level optimization consists of a higher level controller that sets the knot spacing and number of control points that are guided to a feasible region and a set of independent lower-level problems that depend on knot spacing and number of control points f(M i ,Δτ i ) that can be solved in parallel as follows: where the inner problem f(Δτ i ,M i ) is defined in Eq. 30a. In the inner problem objective statement in Eq. 30a, the reason for an added focus on minimizing the maximum acceleration is twofold. First, the minimum normalized slack on velocity (π v ) is 1 since the entrance and exit conditions for each observation region have the UAV travelling at v max . In order to further guide good solutions after the slack in velocity has been removed, an additional focus needs to be added to the acceleration slack. Secondly, the maximum acceleration along a trajectory has a large impact on the ability of a UAV to track a trajectory since maximum acceleration is directly linked to the maximum force produced by the motors, which is a limiting factor on performance. The further the acceleration is from the upper limit, the better a UAV can recover from tracking errors.
It is important to note that the minimum sensing time d i f −1 s may differ from the amount of time it takes to traverse the region of interest Δτ i M i . When they differ (i.e. Δτ i ≠Δτ i M i ), the GKA from Section 4.3 is rerun with the new minimum traversal times to determine if a more optimal series of observations can be obtained. The iteration continues until convergence, which is achieved when the Greedy Knockdown Algorithm maintains its number of observations for two consecutive cycles.

Simulation Setup
We evaluate the f-TOP algorithm against three baseline algorithms in a simulated scenario requiring monitoring a set of randomly distributed points over a 50 acre region (450-by-450 m). We assume the environmental model from Eq. 2 is zero mean and distribute the covariance growth W in accordance to a Chi-squared distribution. We vary the number of points in the region between 10 and 80, which can represent points that an expert selects as useful for inputting into an environmental model in the air quality example or points that have high probability of event detection from thermal or RGB camera sensors in forest fire detection scenarios.
Each POI can be measured within a 10 m circular observation region that does not overlap with the observation regions of other POIs. The circular region approximates the performance of a UAV that is mounted with a camera on a gimbal system. Despite aggressive maneuvers, the camera can remain pointed at the POI. We use a UAV model based on a commercially available Phantom 4 quadrotor, which has a mass of 1.216 kg, maximum velocity of 20 m/s, maximum acceleration of 19.2 m/s 2 , and battery life of 20 min. The maximum acceleration is the maximum magnitude in the X-Y plane, which was calculated using the maximum lift capability and required lift to maintain altitude against the acceleration due to gravity. During simulation, limits were placed on each motor individually. Aggressive maneuvers at the acceleration limit can result in tracking instability when using a realistic controller.
The test configurations include the maximum velocity and acceleration and more conservative configurations in 10% steps for maximum velocities v max ∈ {10, 12, 14, 16, 18, 20} m/s and two maximum acceleration configurations at 90% and 100% with a max ∈ {17.3, 19.2} . Each test configuration was run for 10 randomly drawn configurations for the expected battery lifetime of the platform (20 min), which resulted in a variable number of average loops, dependent on maximum velocity and number of POIs.

Comparison Algorithms
We compared f-TOP to three baseline approaches: a naive constant max velocity speed controller (Constant), a velocity controller obtained using the algorithm proposed by [14] with uncertainty decay estimates equal to a first-order approximation of a Kalman filter (Linear), and a velocity controller that leverages the GKA but without feasibility constraints, similar to [22] (GKA 2019).
The Constant method calculates the visitation order by solving a TSP as described in Section 4.2 and then traverses the POIs at maximum velocity. The method assumes linear travel between successive points and does not take observations or dynamic constraints into consideration. These assumptions lead to trajectories with instantaneous velocity changes at each POI, which results in tracking errors due to the dynamic limits of the UAV, and potentially unobserved points at higher maximum velocities or for POIs with small observation regions.
The Linear method is based on the uncertainty models used in [14], which include a linear increase in uncertainty when a POI is not being observed and a linear decrease when it is. Since the steady state uncertainty of the Kalman filter is achieved when, for each loop, the reduction in uncertainty equals the increase in uncertainty for each cycle of the loop, a first-order approximation of the reduction rate of the discrete Kalman filter is the product of the loop decrease and the sampling rate, i.e., TWf s where W is the uncertainty growth from Eq. 2 and T is the loop time at steady state. A UAV trajectory is obtained using the velocity controller proposed by Smith et al. [14]. While Linear is an improvement over Constant, it still lacks a guarantee that all points of interest will be observed.
The GKA 2019 method calculates a POI visitation order by solving a TSP as described in Section 4.2, and optimizes the velocity profile over the resulting path using [22]. The method attempts to meet the required observation periods by reducing velocity within the regions of interest. The method plans a trajectory that minimizes an upper bound on the steady-state error, but like the Constant and Linear methods, GKA 2019 generates trajectories with instantaneous velocity changes.

Results
All results for the Constant, Linear, GKA 2019, and f-TOP trajectory planners were captured for the same POI placement and uncertainty growth rates. The minimum upper bound on the steady-state error was approximated by the maximum eigenvalue of the covariance matrix of the estimation over the last loop of each test. All uncertainty results were normalized to the bound on max uncertainty of the f-TOP algorithm.
Linear, GKA 2019, and f-TOP guarantee that each POI will be observed at least once by remaining within the observation region for each POI longer than the sampling period. The Constant planner lacks this guarantee of observability because it travels at maximum velocity. Due to the cyclic nature of the plans, not observing a POI results in an unstable condition whereby the uncertainty for the unobserved POI will grow unboundedly.
Despite that the trajectories are optimized to visit all POIs, when the plans are followed by a non-linear, geometric controller [56] that obeys the UAV dynamics, the actual coverage can differ from the planned coverage. The Constant, Linear, and GKA 2019 plan trajectories without incorporating acceleration limits, resulting in trajectories that cannot be followed. When POIs were sufficiently far from each other, the UAV controller converged to the planned trajectories during the straight flight segments between observation regions. As the number of POIs increased, resulting in a dense configuration that required high agility, the Constant, Linear, and GKA 2019 trajectories regularly resulted in missing observations of POIs. Examples of trajectories and differences between planned and actual coverage can be seen in Fig. 6, and a quantitative comparison between the planned and simulated trajectories is provided as the Max X-Y Error in Table 1.
Paths planned by f-TOP at the performance limits (i.e. v max = 20 m/s and a max = 19.2 m/s 2 ) also suffered from tracking errors. This is due to a mismatch between maximum magnitude constraints on the dynamics with which the trajectories are constructed and the actual drone model, which includes limits on the maximum force for each motor. Depending on the maneuver, the required correction to follow the desired trajectory can max out the force ability of an individual motor. For paths further within the performance limits, the proposed f-TOP algorithm generates paths that can be followed more reliably.
The actualized coverage, which is the percentage of POIs that received at least one observation per loop, can be observed in Fig. 7 and Table 1. As the maximum allowable velocity increases, the coverage of the POIs drops. For the plans that do not obey maximum acceleration constraints (i.e. Constant, Linear, GKA 2019), the UAV can only achieve 50% of the maximum velocity before reaching unstable conditions. The plans by f-TOP incorporate maximum acceleration constraints, resulting in plans that are feasible. Several variants of plans were generated for maximum acceleration a max values of 90% and 100% of the maximum listed accelerations.
Despite the lower loop time of the the naive (Constant) and baseline (Linear and GKA 2019) methods, the maximum uncertainty, Max Covar as seen in Table 1, is substantially worse than the result of f-TOP due to the actualized coverage of the planned trajectories. By incorporating maximum acceleration constraints and planning feasible paths, the robotic platforms are able travel 47% faster while maintaining the same coverage. The higher achievable speed results in lower average loop times for a given coverage level and a significant improvement in maximum uncertainty of the measurement at each POI. Note that the Max Covar is normalized to the minimum eigenvalue from any method for a given trial, such that a value of 0.0 means that the method had the minimum uncertainty for a single trial.
The best performing method was the f-TOP algorithm with acceleration limited to 90% of the maximum value of the platform. Despite the slightly improved loop times of f-TOP with higher accelerations, the trajectories planned to the limit of the system resulted in errors due to the imprecise tracking of the controller, which updated the trajectory plan online at 100 Hz, causing compounding error when operating at the system limited in complex maneuvers. By limiting the maximum acceleration and velocity to 90% of the maximum value of the platform, the controller was better able to control the system and achieve the required observations in a minimum time. Subplots (a) -(d) focus on regions that highlight the differences in the methods and how the performance of the proposed f-TOP method enables more observations Fig. 7 Comparison of methods showing the impact of not planning with higher-order dynamic constraints. Increasing maximum velocity constraint results in naive planners creating infeasible trajectories that result in missing observations. As a region increases in numbers of POIs, the maneuvers to measure all points become more aggressive, resulting in naive methods requiring significantly lower speeds to achieve similar average observability. Results generated are average of 10 random trials for each condition Table 1 Statistics for each simulated method for 10 randomized distributions of 80 POIs with random uncertainty growth rates. A graphical summary of important results can be seen in Fig. 7 (right). Max Covar is the maximum eigenvalue of estimation covariance nor-malized to the smallest maximum eigenvalue from any algorithm for the same trial. Max X-Y Error is the maximum distance between the planned and simulated trajectory for any time point in the given trial. Coverage is the percentage of POIs with at least one observation

Conclusion
Persistent monitoring using mobile robotic platforms offers substantial benefits over stationary sensors and lowresolution satellite imagery. When planning trajectories for UAVs, it is critical to optimize for observation periods and to constrain the trajectory to meet the dynamics constraints of the sensing platform to ensure a feasible trajectory. The f-TOP method generates optimal observation periods using the Greedy Knockdown Algorithm and forms continuous, feasible trajectories by optimizing the control points and time knots of a B-spline in a novel bi-level formulation. The resulting trajectories have guarantees on their position, velocity, and acceleration, meeting provided constraints. Simulations for randomized POI distributions show that the trajectories planned using the f-TOP framework outperform related methods, achieving the same level of observability at 47% higher travel speeds, resulting in significant performance gains. Future work will focus on applying the f-TOP algorithm in physical experiments. For persistent monitoring in the real world, the generated trajectory must involve rendezvous with a recharging station at regular intervals, placing additional requirements for the planned trajectories. Online replanning would be needed to deal with dynamic obstacles or changing objectives. Lastly, the f-TOP algorithm could be extended to include environmental models that capture measurement correlation.

A.1 GKA Stopping Criterion (Section 4.3)
For uncorrelated POIs, each additional observation results in a decrease in the eigenvalue of estimation uncertainty for a POI, such that where ρ(⋅) is the reduction in uncertainty due to an added observation, a known monotonic function.
When a POI meets the stopping condition i , which states that when the stopping condition is met, the uncertainty λ i can only increase after the set of observations at q i and at any other POI. In GKA, an observation is added to the POI with the highest uncertainty. When any two or more POI meet the stopping condition, then adding an observation to any POI that meets the stopping criterion will result in an overall increase in uncertainty.

A.2 B-Spline Derivatives (Section 5.1)
The derivative for a B-spline is defined as: For uniform spacing Δτ = τ j + 1 − τ j , the derivative simplifies as: where A (1) captures the interaction between basis functions. With A (a) capturing the linear interaction between basis functions for derivative a, further derivatives take the form:

A.3 B-Spline Objective Reformulation (Section 5.2)
The primary objective from from Eq. 28a can be simplified by noting that T i→i+1 is constant and will not affect the minimization goal and that T i = M i Δτ i . The secondary objective, the average jerk power, can be replaced by the linear combination of control points and bases in Eq. 27. The result is an updated cost function as follows: The secondary objective can be further modified for the specific case of minimizing the square of jerk magnitude for k = 4. For a 4rd-order representation of position, the 3rd derivative results in a basis function of order 1, which is either 1 for τ i ≤ t < τ i + 1 and 0 otherwise as shown in Eq. 25. Since the limited domain of the base does not overlap with (35) any other base, the outer product of the 0th-order basis function results in I and the integral results in ΔτI, where I is the identity matrix as seen below: Since is an arbitrarily small scale factor, we incorporate 1/M i and 1/Δτ 5 into , resulting in the reformulated objective: