MP-RRT#: a Model Predictive Sampling-based Motion Planning Algorithm for Unmanned Aircraft Systems

This paper introduces a kinodynamic motion planning algorithm for Unmanned Aircraft Systems (UAS), called MP-RRT#. MP-RRT# joins the potentialities of RRT# with a strategy based on Model Predictive Control to efficiently solve motion planning problems under differential constraints. Similar to other RRT-based algorithms, MP-RRT# explores the map constructing an asymptotically optimal graph. In each iteration the graph is extended with a new vertex in the reference state of the UAS. Then, a forward simulation is performed using a Model Predictive Control strategy to evaluate the motion between two adjacent vertices, and a trajectory in the state space is computed. As a result, the MP-RRT# algorithm eventually generates a feasible trajectory for the UAS satisfying dynamic constraints. Simulation results obtained with a simulated drone controlled with the PX4 autopilot corroborate the validity of the MP-RRT# approach.


Introduction
The use of Unmanned Aircraft Systems (UAS) has increased progressively across a wide range of applications such as remote sensing, search and rescue, security and surveillance, precision agriculture, infrastructure inspection and urban planning, to name a few [35]. Such extensive use of UAS triggered the rapid growth of various related research topics, of which autonomous flight has been driving great interest [3]. The great availability of powerful and efficient computational capabilities has enabled the extensive use of optimal control strategies and machine learning to address the problem of autonomy. However, autonomous flight remains a very challenging task [39], whose solution extends across diverse aspects: perception, localization and mapping, and motion planning and control [36].
Notably, the motion planning problem is ubiquitous in most of autonomous robotics applications, beside UAS. In simple words, motion planning is defined as the computation of the control input needed to drive a vehicle from an initial state to a target state satisfying the vehicle kinematic and dynamic constraints, while avoiding obstacles and other forbidden zones [21]. Typically, the motion planning problem is subdivided into two subproblems: path planning and path tracking. Much effort has been devoted to deal with both tasks. The authors in [14] adopted a potential field approach for planning, followed by a multi-constrained Model Predictive Control (MPC) strategy for tracking. The authors in [7] proposed a two-stage approach where path planning is computed by leveraging the Rapidly-exploring Random Tree (RRT) algorithm, associated with a Linear Quadratic Regulator (LQR) controller for the tracking of the resulting reference trajectory. Similarly, in [26] the planned reference trajectory provided by the RRT algorithm was post-optimized using MPC to determine a feasible trajectory plan. However, none of the mentioned two-stage approaches guarantee the dynamic feasibility of the computed path.
Alternatively, another classical approach for path planning breaks the problem into two phases: a continuous collision-free path is generated in the first phase, while in the second phase a low-cost trajectory is computed along the previously generated path, respecting dynamic constraints. However, this approach may lead to infeasible or inefficient trajectories, due to the possible incompatibility of the cost function used by the optimization engine in the first and the second phase. For example, minimizing the Euclidean distance in the first phase may result in a continuous collision-free path that is incompatible with the dynamic constraints of the second phase. This becomes more evident when the motion planning is required to avoid obstacles in a time-varying environment. Hence, a more suitable approach would be to consider both the kinematic and dynamic constraints simultaneously, while constructing the trajectory during the planning phase. To address this issue, kinodynamic motion planning algorithms can be employed in order to satisfy at the same time both the kinematic and the dynamic constraints of the motion planning model [8]. A considerable number of kinodynamic strategies makes use of various primitive curves to define a path, such as Bezier curves [22], harmonic potential fields [27], or using learning approaches [25].
Recently, incremental sampling-based planners have been successfully employed to solve the kinodynamic motion planning problem as a two-point boundary value problem in the dynamic state space of the robot system. Sampling-based planners, such as Rapidly-exploring Random Tree (RRT) and the Probabilistic Roadmap (PRM), are employed even in high dimensional spaces to solve motion planning problems [23]. Generally, sampling-based planners search for the motion planning solution in one of the following spaces: the state space X , representing the possible states of the vehicle; the control space U , representing the possible control input configurations applied to drive the vehicle; and the reference space Y, representing the possible desired vehicle states. A sampling-based kinodynamic algorithm was proposed for the first time by LaValle and Kuffner in [24], in which an RRT algorithm samples the control input of the vehicle and, then, predicts its corresponding motion in order to construct a tree of trajectories in the state space. As a consequence, the computed trajectory satisfies by design the constraints imposed by the vehicle dynamics and can be easily executed by the vehicle. Several samplingbased planners have been developed since then by extending and improving such approach [10,13]. The authors in [17] proposed the RRT * algorithm, one of the most widely used sampling-based techniques. RRT * constitutes an improvement of the original RRT algorithm toward the attainment of a near optimal solution. Moreover, the RRT * algorithm has been further improved for specialized purposes such as realtime path planning [31], anytime planning [18], multi-agent planning [6], and others [32]. RRT * has improved the path quality of the original RRT through employing two new major mechanisms: rewiring and best neighbor search. RRT * incrementally adds new connections to the existing tree whenever a new sample is generated. Such a rewiring procedure gives RRT * the chance to gradually improve its path-cost, asymptotically approaching the lowest-cost path as the number of iterations increases. However, this rewiring is performed on a local basis, preventing a global propagation of the changes in the graph and a consequent optimization at the global level. An improvement is constituted by RRT # , which generates a guaranteed asymptotically optimal graph that always contains the lowest-cost path [2].
Another common concern in sampling-based approaches is related to the random sampling of the control space, instead of the random sampling of the reference space of the robot [20]. While sampling in the reference space always generates feasible trajectories, sampling in the control space may often result in the selection of inputs that can lead to infeasible trajectories, due to the presence of dynamic constraints. This typically yields longer execution times and an inefficient management of the algorithm. To address this problem, the authors in [20] proposed the closed-loop RRT (CL-RRT), in which the samples are drawn from the reference space instead of the control space. The sampled reference is then used to compute a trajectory using the closed-loop model of the robot. A similar approach is also used in [1] with the RRT # algorithm. Similarly, in [16] a kinodynamic RRT * is realized using Dubins curves as a primitive curve defining the reference path between the new sample and the existing tree, whereas an LQR controller is used in [33] to compute the cost of tracking the reference path.
This paper presents a kinodynamic motion planning algorithm called MP-RRT # , where RRT # [2] is enhanced by the use of Model Predictive Control (MPC) [5] to compute the cost between vertices evaluated in the rewiring of the new sample. The proposed sampling-based planner avoids the local rewiring limitation of RRT * by employing RRT # [2], and draws its samples from the reference space instead of the control space, as proposed in [20]. Then, given an input sample of the closed-loop system, i.e., a reference r in the reference space Y, the proposed algorithm uses MPC [5] to perform a forward simulation obtaining a state trajectory and the optimal control input to track the sampled reference.
The proposed MP-RRT # concurrently constructs two graphs: (i) a graph G Y ∈ Y to explore the reference space Y, and (ii) a graph G X ∈ X in the state space X to evaluate the UAS motion between vertices of G Y using the MPC strategy. Practically, G X consists of a graph of trajectories in the state space of the UAS.
The proposed strategy is more efficient than other kinodynamic RRT-based approaches [10,13,24], since it guarantees the feasibility of the trajectory for each sampled reference state passed to the MPC strategy. Obstacle avoidance is encapsulated in the approach by labeling trajectories containing obstacles as unfeasible and consequently discarding them, rather than frequently invoking a costly trajectory repairing due to the presence of obstacles. Hence, the proposed strategy requires fewer samples to construct an exploration tree and, consequently, less effort to compute an optimal trajectory. Additionally, the use of the MPC strategy guarantees that the computed trajectory satisfies input constraints. Anyway, the feasibility of the resulting trajectory is further examined later by the algorithm in a verification step to ensure collision avoidance. Importantly, sampling in the reference space Y rather than the state space X is generally more efficient, especially for vehicles with complex dynamics where the reference space has smaller dimension than the state space. For instance, here the UAS state space had a dimension of 8 variables, whereas the reference space had a dimension of 3 variables only. Figure 1 illustrates the proposed strategy to generate a graph of feasible trajectories using the MP-RRT # algorithm. Given a sample drawn from the reference space, MP-RRT # incrementally extends the graph G Y ∈ Y by adding a new vertex and an edge. Then, following a primitive curve (i.e., a Dubins curve in this work) the reference path connecting the new vertex to the existing graph is defined, and MPC is used to construct the corresponding trajectory in the graph G X ∈ X , respecting the vehicle dynamics and control input On the other hand, the graph G X consists of trajectories (in magenta) obtained through evaluating the edges of G Y using the MPC strategy. An edge of G Y is labeled as invalid if its corresponding trajectory in G X crosses an obstacle constraints. Note that obstacle avoidance is implemented for the computed trajectory rather than the edges, i.e., even if the edge crosses an obstacle, it will not be discarded unless its corresponding trajectory enters the obstacle space.
Here, the proposed MP-RRT # strategy is specifically used to solve the motion planning problem of a multicopter to find a feasible trajectory for the UAS satisfying dynamic constraints. The novelty of the presented work resides in the use of the MPC in the graph construction of a RRTbased algorithm, which introduces evident benefits due to its ability of generating a forward path that is compatible with the vehicle constraints. In this way, the execution of feasible trajectories with maneuvers that are compatible with the vehicle constraints is guaranteed.
The rest of the paper is organized as follows. Section 2 defines the optimal motion planning problem. The proposed algorithm is described in Section 3, with a detailed description of the pseudocode, the UAS model and the MPC strategy adopted. Section 4 describes the experimental results, while our conclusions are drawn in Section 5.

Problem Formulation
This section defines the motion planning problem studied in this paper. First, the UAS dynamic model is described aṡ where x(t) ∈ R n x is the system state with dimension n x , and u(t) ∈ R n u is the control input with dimension n u . Both states and control inputs should respect specific constraints. Specifically, the vehicle state must belong to the free state space X free = X \ X obs , in order to navigate through an obstacle-free trajectory. This constraint is expressed by where X is the state space and X obs is the space occupied by obstacles. Moreover, the control input is constrained as where U is the space of the admissible inputs (roll, pitch and thrust commands) in order to meet the mission specifications.
Given the initial state of the UAS x 0 = x(0) at time t = 0 and the target state defined by the goal region X goal ⊂ R n x , the aim of the motion planning problem is to compute an optimal state trajectory x * : [0, t f ] ∈ X free and an optimal control input sequence u * : [0, t f ] ∈ U over a finite time horizon from 0 to t f able to drive the vehicle from the initial state x(0) = x 0 to a final state within the goal region x(t f ) ∈ X goal . x * and u * are computed minimizing a cost function Cost(·), while satisfying the constraints imposed by Eqs. 2 and 3. Hence, the optimal motion is the solution of the following problem Here, we will solve such a motion planning problem using the proposed MP-RRT # algorithm. In particular, the proposed algorithm solves the motion planning problem in Eq. 4 by constructing a graph of optimal trajectories G X in the state space and, then, by selecting the best complete trajectory in the graph from the start to the goal state. Specifically, optimal trajectories of the graph G X are computed using a Model Predictive Control strategy that takes into account the UAS dynamic model in Eq. 1, satisfying the constraints imposed by Eqs. 2 and 3.

The MP-RRT # Strategy
This section introduces the proposed MP-RRT # algorithm (Model Predictive Rapidly-exploring Random Tree "sharp"), which enhances the RRT # algorithm [2] using a MPC strategy to compute a near-optimal trajectory for UAS respecting the dynamic and kinematic constraints, while avoiding obstacles.
Similar to other kinodynamic RRT-based algorithms, our MP-RRT # algorithm explores the search space by constructing an incremental graph rooted from the start. Specifically, the MP-RRT # generates two graphs simultaneously: The former graph, i.e., G Y , consists of vertices and edges in the reference space Y. It is constructed incrementally by sampling vertices and growing the graph to uniformly explore the reference space. This is equivalent to any other graph generated to other algorithms of the RRT family. On the other hand, the latter graph, G X , consists of a graph of trajectories computed through MPC. G X is built concurrently with G Y and, practically, it is used to evaluate the motion between vertices of G Y , generating a graph of feasible trajectories in the state space. Figure 1 shows a simple example of the proposed strategy, where the graph G Y in blue is constructed in the reference state, while the corresponding graph G X in the state space is colored in magenta.
As can be observed in Fig. 1, collisions with obstacles are accounted for by graph G X . If a trajectory in G X enters the obstacle space, the corresponding edge in G Y will not be included in the resulting graph. On the contrary, as can be observed in Fig. 1, even if an edge in G Y crosses an obstacle, it will not be discarded if its corresponding trajectory in G X does not collide with obstacles. This choice is motivated by the fact that, in general, the reference state has generally a lower dimension than the dimension of the vehicle state. As a consequence, it is more efficient to generate a graph in the reference space than in the state space.

Algorithm
The proposed algorithm is based on the the RRT # algorithm proposed in [2]. The RRT # is a special variant of the Rapidly-exploring Random Graph (RRG) that ensures a globally optimal graph in the search space.
The main pseudocode of MP-RRT # is defined in Algorithm 1. The inputs of the algorithm are the initial state x 0 , the goal region X goal , the reference space Y and the state space X in which the motion planning searches for a feasible solution.
First, both graphs G Y and G X are initialized (lines 2 to 4). In particular, the initial vertex in the reference state is defined using the initial state x 0 (line 3). In fact, we assume that the reference space Y is a subset of the state space X and, as a consequence, an element r ∈ Y can be derived from a state x ∈ X . Then, the iterative procedure of the construction of the graph starts and continues until a certain number N of vertices are sampled and added to the graph (lines 5 to 8). Specifically, a vertex r rand is randomly sampled (line 6) and both graphs G X and G Y are extended by adding the new vertex (line 7). The Replan() function propagates this update on the graphs (line 8). Both the Extend() and Replan() functions are detailed in Algorithms 2 and 4, respectively. Finally, the branch T X connecting the initial and the target states is extracted from the graph G X (line 9) and returned as the solution of the algorithm.
The Extend procedure is a crucial element for the proposed approach; it is responsible for the expansion of both graphs by adding a new vertex, after which the cost of the state trajectory is computed using MPC. This procedure is detailed in Algorithm 2. Initially, the new vertex r is connected to the nearest vertex r nearest in the graph G Y (line 2). Then, the Nearest() function finds the vertex with the minimum Euclidean distance from r. Hence, the states x nearest and x are defined from r nearest and r, respectively (lines 3 and 4). The ComputeTrajectory() function (line 5) uses MPC to compute the optimal state trajectory x moving from x nearest to x. Then, if the computed trajectory is valid, i.e., it does not collide with obstacles, the cost-to-come of vertex r, denoted by g(r) is computed by adding the cost at the previous vertex to the cost of the trajectory x, denoted by c(x) (line 7). In line 8, all the neighbor vertices of r are added to the neighbor set N and, then, the vertex r is included in the neighbor set of its neighbors (lines 8 to 10).
The Near() function selects the M-nearest vertices as defined in [17]. Specifically, the number M of neighbors evaluated is defined as where d is the dimension of the reference space Y, and the notation |V | defines the cardinality of the set of vertices, i.e. the number of vertices in the graph G Y . According to [17], Eq. 5 ensures the asymptotic optimality of the algorithm. The FindParent() function searches for the neighbor vertex of r that provides the minimum cost-to-come g() including the vertex r to the graph G Y and, similarly, the corresponding state x to the graph G X (line 11). Then, the vertex r is included in the priority queue q (line 12) used in the Replan() to propagate any updated cost in the graph G Y .
The FindParent() procedure is detailed in Algorithm 3. For each near vertex of r, the state trajectory from x and x near is computed to select the best parent vertex (from lines 2 to 9). Then, the selected r near is defined as parent of r (line 8) and, similarly, x near is defined as parent of x (line 9).
The priority queue has a crucial role in the RRT # algorithm [2], since it is a queue of vertices that is evaluated in the Replan() procedure to propagate any update on the graph. Vertices of the queue are ordered based on their cost f (r) from the highest to the lowest. Specifically, the cost f (r) is the estimated cost to reach the goal passing through the vertex r, inspired by the well-known cost function defined in the A * algorithm [12] f (r) = g(r) +ĥ(r).
Function g(r) represents the cost-to-come at the vertex r, i.e. the cost of moving between the start vertex r 0 and r, with g(r 0 ) = 0. Functionĥ(r) is the estimated cost-to-go to reach the goal state, withĥ(r goal ) = 0.
In particular, the Replan() procedure is detailed in Algorithm 4. This procedure is based on an iterative loop that updates only promising vertices (lines 2 to 15), i.e., vertices that can improve the current solution in the graph.

Algorithm 2
The Extend procedure.
where r * goal is the vertex in the goal region with the minimum cost-to-come. Notably, the heuristic costĥ(r) used to compute f (r) must be admissible, i.e., it should not overestimate the cost-to-go, discarding vertices that would lead to the optimal solution. The evaluation of promising vertices is essential to avoid the propagation toward vertices that cannot improve the current solution, speeding up the algorithm. The first element of the queue is selected (line 3) and removed from q (line 5). Then, the procedure verifies if the current vertex can improve the cost-to-come of its neighbors (lines 6 to 15) as a new parent vertex. This is verified by computing the cost-to-come of the resulting state trajectory of moving from x to x nbh . Similar to Algorithm 3, line 10 checks if the neighbor vertex r nbh ∈ N is a promising vertex and, in line 11, if r can be the new parent vertex of r nbh . If this condition occurs, the vertex r nbh is included in q to be evaluated in the Replan() procedure. In particular, r nbh is defined in the reference space, while x nbh is the corresponding state in the state space.

UAS Model
Here, we adopt a multicopter, with a dynamic motion model linearized around its hovering condition [15], in which small variations of the attitude angle are assumed and the vehicle heading is aligned with the x-axis of the multicopter inertial frame. Hence, the linear model of the UAS is defined aṡ where A c is the state matrix in continuous time B c is the input matrix in continuous time and where a x , a y and a z are the drag coefficients, g is the gravity acceleration, τ φ is the roll time constant, τ θ is the pitch time constant, k φ is the roll gain and k θ is the pitch gain. The state vector is is the position vector of the UAS in the three-dimensional space, v is the velocity vector, W φ and W θ are the roll and pitch angles in the inertial frame W. The input vector is where W φ d and W θ d are the roll and pitch control commands in the inertial frame and T is the thrust control command. Since we will use a controller implemented in the discrete time, the UAS model is discretized as where T s is the sampling time.

Model Predictive Control
In order to construct the graph G X ∈ X incrementally, the MP-RRT # algorithm uses MPC to compute the optimal state trajectory between every newly added vertex and its adjacent vertices and, then, to evaluate the cost of such a trajectory.
Based on the UAS model previously defined, in this work we implement a Linear Model Predictive Control inspired by [15].
Specifically, the MPC searches for an optimal trajectory by optimizing the cost function where H p is the prediction horizon. The input vector is  Hence, the following convex optimization problem is solved subject to The optimization problem of Eq. 14 requires a reference trajectory x ref . In this work, the reference trajectory is defined using Dubins curves [9]. Compared to other primitive curves used to find the shortest path between two  Fig. 3 The roll and pitch control inputs computed by MPC to follow the trajectory of Fig. 2 configurations, and for our purposes of building the graph iteratively from curve segments extending the existing path with a new vertex, Dubins curves are a suitable solution to achieve flyable paths, since they are forward-only curves, whereas other curves like Reeds-Shepp require backwardmotion [23]. The Dubins curvature radius generally affects motion planning, whereby different curvature radii yield different planned paths. As a fundamental requisite, the curvature radius should reflect the minimum curvature that the vehicle can execute, compatible with kinodynamic constraints. Here, we are considering a multicopter that, theoretically, has zero curvature radius (i.e., it can rotate around its axis in place). However, since we are assuming a constant nonzero cruise velocity in the motion planning, a lower bound for the curvature radius needs to be set. Dubins curves refer to the shortest path between two poses in the two-dimensional space considering a constant radius curvature. This solution fits perfectly with our work, since the algorithm is implemented in the two-dimensional space. However, Dubins curves are also extended to the three-dimensional space [29] and with a variable radius curvature [11] being able to be used also in more complex scenarios.
Given the two-dimensional pose of the aircraft p x p y p β and assuming a constant speed, the differential equations of Dubins curves arė where u c is normalized in the range between −1 and 1 with respect to the maximum curvature radius of the aircraft. The shortest path between two poses can be expressed as a combination of no more than three motion primitives [9].
The optimization problem of Eq. 14 is solved following the reference trajectory. Then, in accordance with the MPC philosophy, only the first control input is applied and the optimization is solved iteratively. Figure 2 shows an example of reference trajectory generated using Dubins curves and followed through MPC. Figure 3 illustrates the roll and pitch control commands computed to follow the trajectory.

Implementation
The proposed strategy is implemented in C++ using the Robot Operating System (ROS) [34] framework and the Open Motion Planning Library (OMPL) [38], which provides many state-of-the-art sampling-based algorithms and many additional functionalities to facilitate the development of new algorithms.
The MP-RRT # algorithm is implemented considering a two-dimensional space, i.e., flying at a fixed altitude. Hence, the selected configuration space is the Special Fig. 4 The construction of the exploration tree using the MP-RRT # algorithm. The start and target positions are in green and in red, respectively. The graph G Y in the reference space is colored in blue, while the computed path obtained from the graph G X in the state space is colored in magenta. In (a), the graph consists of 10 vertices rooted from the start pose finding an initial solution in the map with a cost (i.e., the path length) of 66.44 m. In (b), the graph with 20 vertices, in which the solution is improved with a cost of 45.09 m. In (c), the graph consists of 60 vertices, but the solution is not improved. In (d), the graph has 100 vertices obtaining a solution with cost 38.70 m Euclidean Group SE (2) in which each admissible configuration is a pose in the two-dimensional space free to translate and rotate. Each reference sampled by the algorithm in the reference space Y consists therefore of three parameters, i.e., two defining the position of the UAS and one defining its orientation, corresponding to the flight direction. Each time the MP-RRT # algorithm evaluates the motion between two states, a reference trajectory is computed using Dubins curves and the MPC computes the optimal state trajectory and control input to track it.
The motion-cost of the trajectory is computed considering the path length of the resulting trajectory with x i ∈ x, and M is the size of the trajectory. On the other hand, the cost-to-goĥ(r) is computed as the length of the Dubins curve between the vertex r and the goal region X ref .
The optimization problem of the MPC is solved using CVXGEN [28], a tool for code generation for convex optimization. CVXGEN can be used to generate fast custom code for small, QP-representable convex optimization problems. The mathematical problem is translated into a high speed solver that is twelve-to thousand-times faster than other popular optimizers [28]. Hence, the linear model of the UAS and the Linear MPC problem of Eqs. 13 and 14 are included and solved with CVXGEN.
Experimental tests are performed considering the multicopter Asctec Firefly and using the parameters listed in Table 1.
The MP-RRT # is executed considering a maximum cruise velocity of 2.5 m/s and the reference trajectory is computed with Dubins curves with a curvature radius of 2 m. The admissible control input is defined through the following constraints: The MPC is manually tuned by setting matrices Q x and R through trial-and-error to attain a satisfactory behavior in tracking the reference trajectory. Specifically, Q x , and R are set as    Fig. 6 The average tracking error for 20 trajectories running the same scenario of Fig. 4 Moreover, Q final is computed by iteratively solving the Algebraic Riccati Equation [4]. Figure 2 illustrates an example of reference trajectory computed with Dubins curves and connecting two vertices. The trajectory is followed by the MPC that reaches the target vertex computing the roll and pitch control commands plotted in Fig. 3.

Simulation Results
The proposed MP-RRT # algorithm is tested in different scenarios to evaluate its behavior in computing UAS trajectories. Figure 4 shows the evolution of the graph during the exploration of the reference space (i.e., the map). Specifically, in Fig. 4(a), the algorithm computes a graph with 10 vertices finding an initial solution that is far from the optimal one. In Fig. 4(b), the graph consists of 20 vertices, improving the solution path. On the contrary, the solution is not improved in Fig. 4(c), with a graph with  Fig. 4(d), a better solution is found with a graph with 100 vertices. The previously described test highlights the ability of the proposed algorithm to explore the map and to compute a feasible trajectory for the UAS. The quality of the computed trajectory increases with the number of vertices in the graph, converging toward the optimal solution. In order to demonstrate the above mentioned pattern, we performed 50 tests using the same scenario of Fig. 4. The average cost of the resulting solution path against the number of iterations of the MP-RRT # algorithm is shown in Fig. 5.
Considering the same scenario of Fig. 4, we evaluate the ability of the MPC in tracking the reference trajectory defined using Dubins curves. Table 2 reports the average trajectory tracking error in 20 tests. The average tracking error is the average Euclidean distance between the setpoints of the reference trajectory defined using Dubins curves and their corresponding states in the actual state trajectory computed using the MPC and satisfying dynamic constraints. The average tracking error along the whole path was found to be reasonably small, being always smaller than 0.05 m along trajectories with a length ranging between 43 and 51 m. Figure 6 illustrates the average tracking error for each of the 20 tests.
Other tests in more complex maps are shown in Figs. 7 and 8. In particular, Fig. 7 shows an interesting scenario, in which Figs. 7(a) and (b) present the target in similar positions but with opposite directions. As a consequence, the algorithm computes different solutions in order to reach the target with the desired flight direction. Similarly, in Fig. 9(a), the MP-RRT # algorithm explores a map with a graph of 100 vertices, computing a solution. The trajectory computed in Fig. 9(a) is also executed in a realistic simulation environment, using Gazebo and SITL frameworks. Gazebo is an open-source multi-robot simulator fully compatible with ROS [19] able to simulate robots, sensors, and rigid body dynamics. SITL (Software In The Loop) [37] is a software to execute an autopilot on a computer, without using a specific and dedicated hardware. In this work, the simulation uses the PX4 autopilot [30], an open-source flight control software for drones and other autonomous vehicles.
In particular, the state trajectory computed with MP-RRT # is uploaded on the PX4 autopilot and, then, executed as shown in Fig. 9(b). Although the environment of Fig. 9(b) does not correspond to the map of Fig. 9(a), the executed trajectory in Fig. 9(b) is the same generated in Fig. 9(b).

Conclusions
In this paper, we have introduced a novel kinodynamic sampling-based motion planning algorithm called MP-RRT # , which enhances the RRT # using a Model Predictive Control strategy to compute an optimal trajectory for UAS.
Similar to RRT # , the proposed algorithm explores the map constructing an asymptotically optimal graph. Specifically, two graphs are concurrently constructed: G Y and G X . First, the graph G Y explores the reference space of the UAS. Then, the MPC strategy is used to iteratively evaluate the feasibility of each newly added vertex and to compute the cost of its corresponding edge constructing a graph G X of feasible trajectories in the state space. The resulting trajectory computed by the proposed MP-RRT # algorithm is a near-optimal trajectory that respects both the kinematic and dynamic constraints of the UAS.
The proposed MP-RRT # algorithm differs from other kinodynamic RRT-based algorithms in sampling the input reference of the closed loop system instead of directly sampling the control input. This gives rise to considerable advantages, especially when dealing with vehicles with complex dynamics where the reference space dimension is considerably smaller than the control space and state space of the vehicle.
The simulation results obtained from the implementation of the proposed MP-RRT # algorithm demonstrate good trajectory quality even for complex maps. Moreover, the computed trajectory is executable by a UAS equipped with a professional autopilot.
Although the proposed algorithm is tested in a simplified scenario, i.e., in a two-dimensional space using a linearized model of the UAS, the proposed MP-RRT # algorithm can be extended to more complex scenarios by increasing the complexity of the algorithm. Moreover, although the work presented here focuses on UAS, the proposed motion planning strategy has a general validity, and can be easily adapted to other kinds of robots, such as ground robots, autonomous cars, and underwater vehicles.
Future works will extend the current algorithm to solve a three-dimensional motion planning problem. The proposed MP-RRT # strategy will be adapted for real-time motion planning problems like the one described in [1,20]. In addition to that, experimental tests will be conducted on a physical robotic platform to evaluate the performance under realistic conditions. Author Contributions SP conceived the research and designed a first version of the algorithm, AO collaborated to the development of the algorithm, implemented the Model Predictive Control Strategy, performed the simulations, and evaluated the results. SP and AO drafted a first version of the manuscript. AR supervised the research and produced a revised version of the manuscript. All the authors finally revised and agreed on the final version of the manuscript.
Funding Open access funding provided by Politecnico di Torino within the CRUI-CARE Agreement. This work is partially supported by Compagnia di San Paolo and by an Amazon Research Award granted to Dr. A. Rizzo.
Data Availability Not applicable, as no real data have been used to realize this work.

Code Availability
The code will be made available upon request.

Declarations
Ethics approval Not applicable. (This study does not involve human participants, their data or biological material).

Consent to participate and for publication
The manuscript is approved by all authors for publication.

Conflict of Interests
The authors declare neither conflict of interest, nor competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons. org/licenses/by/4.0/.