
1 Introduction

Over the past decade, advancements in technology, coupled with their decreasing costs, have led to the widespread integration of autonomous systems into our daily lives [1]. These systems have had a significant impact on various fields, including technology, science, and society, making them valuable tools for a range of applications, from military and commercial uses to hobbies. Currently, autonomous systems are primarily utilized in known and closed environments, such as industrial production plants, where the system behavior can be assured by detailed models of the system and its environment. However, there is a growing demand to employ autonomous systems, particularly autonomous vehicles, in heterogeneous and unknown environments, with applications such as search and rescue robots [12], environmental monitoring drones [3], and self-driving cars on highways [5], among others. This trend has sparked extensive research in multiple domains, demonstrating that autonomous vehicles can carry out complex missions without human intervention. In this doctoral dissertation, to achieve autonomous navigation, we try to answer the following question:

How can I reach a particular location without colliding with the surrounding environment?

Trying to answer this question, we can identify the following requirements:

Safety: Vehicle navigation approaches have to guarantee classical requirements, such as the design of a dynamically feasible trajectory that satisfies actuator and state variable limits while, at the same time, satisfying the constraints arising from the perception of the environment. To guarantee obstacle-free motion, the controller is also required to ensure persistent constraint satisfaction despite the different sources of uncertainty that can lead to a failure or crash of the system.

Exploitation: The main task we aim to solve is reaching a given location. This objective must be achieved while guaranteeing the above-mentioned safety requirement. These objectives, however, are often conflicting and can lead to performance degradation during navigation and to an overly conservative behaviour of the vehicle if the trade-off between exploitation and safety is not considered.

Exploration: During this phase, it is crucial to account for obstacles and other agents in the environment that may hinder direct access to a desired location. To ensure successful navigation, it becomes necessary to gather and store information about the surroundings. This information is then utilized to devise a strategy that prevents the system from becoming trapped in a local minimum, even when the desired location is ultimately reachable.

This doctoral thesis aims to design multiple predictive control architectures that address the trade-off between the identified requirements for autonomous vehicle navigation. This summary brief is structured as follows: Sect. 2 formalizes the constrained navigation problem. In Sect. 3, the design of MPC schemes that can satisfy safety requirements under time-varying constraints, which shift with the system state, is presented. Then, Sect. 4 introduces a novel MPC formulation named multi-trajectory MPC, which allows for better exploitation of the current information about the environment. In Sect. 5, a high-level receding horizon strategy for environment exploration is presented, which utilizes a graph-based map of the environment constructed online. Section 6 showcases real-world applications and simulations, demonstrating the effectiveness of the proposed approaches in addressing various problems within realistic scenarios. Finally, Sect. 7 concludes this brief with some final remarks and outlines possible future research directions.

2 Constrained Autonomous Navigation Problem

As mentioned in Sect. 1, for autonomous vehicle navigation, ensuring safe and efficient path planning is of paramount importance. To solve this problem, this doctoral thesis proposes Model Predictive Control (MPC) strategies to drive the system to the final target. MPC is an optimization-based control strategy that combines real-time feedback with predictive models to calculate and optimize the trajectory of the vehicle [11]. By leveraging mathematical models and optimization algorithms, MPC empowers autonomous vehicles to make intelligent decisions, navigate complex environments, and avoid obstacles effectively. At its core, MPC operates by continuously predicting the future behavior of the vehicle and optimizing its trajectory based on a defined objective and set of constraints. This predictive nature enables the vehicle to anticipate potential obstacles and navigate complex scenarios with agility. By considering both current and future states of the vehicle, MPC provides a reliable approach to path planning, facilitating smoother and safer autonomous driving experiences.

The predictive modeling aspect of MPC involves creating a mathematical representation of the vehicle’s dynamics and its interaction with the surrounding environment. This model captures essential parameters such as the vehicle’s position, velocity, acceleration, and other relevant variables. The accuracy and fidelity of the predictive model play a crucial role in the effectiveness of MPC, as they directly influence the quality of trajectory predictions and subsequent decision-making. To manage the complexity of the model and achieve a faster implementation, MPC can be employed in a hierarchical manner. By dividing the control problem into multiple levels, each addressing different time scales and aspects of the autonomous vehicle’s behavior, MPC allows for efficient decision-making and real-time responsiveness. In this doctoral thesis, we are interested in the high and intermediate levels, where we design a high-level plan (Sect. 5) and use a trajectory planner to refine the plan and generate a feasible trajectory considering the vehicle’s dynamic constraints and environmental factors (Sects. 3 and 4). Here, MPC can be employed to optimize the trajectory based on real-time sensor data and to accurately predict the vehicle’s future behavior, accounting for factors like vehicle dynamics and environmental conditions.

To this end, we consider an autonomous vehicle described by a generic nonlinear discrete-time time-invariant model with a state vector \(\boldsymbol{x}(k)\in \mathbb {R}^{n_x}\) that encompasses essential vehicle dynamics, such as position and velocity, and an input vector \(\boldsymbol{u}(k) \in \mathbb {R}^{n_u}\). The system is characterized by a state dynamics function \(f:\mathbb {R}^{n_x}\times \mathbb {R}^{n_u}\rightarrow \mathbb {R}^{n_x}\). The system is subject to time-invariant input and state constraints that impose physical limitations or boundaries on these variables, given by:

$$\begin{aligned} \boldsymbol{u}(k) \in \mathcal {U}, \quad \boldsymbol{x}(k) \in \bar{\mathcal {X}} \end{aligned}$$
(1)

Here, \(\mathcal {U}\subset \mathbb {R}^{n_u}\) and \(\bar{\mathcal {X}}\subset \mathbb {R}^{n_x}\) represent non-empty closed convex sets. The system is also subject to a non-empty time-varying convex set of constraints \(\mathcal {X}(k)\subset \mathbb {R}^{n_x}\) arising from the perception of the environment:

$$\begin{aligned} \boldsymbol{x}(k) \in \mathcal {X}(k). \end{aligned}$$
(2)

We are now in a position to formally define the problem we aim to solve. Given a state reference \(\bar{\boldsymbol{r}}\in \mathbb {R}^{n_x}\) and the system state \(\boldsymbol{x}(k)\) at each time step, our objective is to design a state feedback control law \(\boldsymbol{u}(k)=\kappa (\boldsymbol{x}(k),\bar{\boldsymbol{r}},\mathcal {X}(k))\) that drives the system’s state as close as possible to the reference \(\bar{\boldsymbol{r}}\), while satisfying the constraints (1) and (2) for all \(k\ge 0\).
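
To make the receding horizon mechanism concrete, the following minimal sketch illustrates one MPC step for this problem under simplifying assumptions: a linearized model \(\boldsymbol{x}(k+1)=A\boldsymbol{x}(k)+B\boldsymbol{u}(k)\) and polytopic descriptions of the sets in (1) and (2). All function names, matrices, weights, and the horizon length are illustrative placeholders, not quantities taken from the thesis.

```python
import cvxpy as cp

def mpc_step(A, B, Hu, hu, Hx, hx, Hk, hk, x0, r, N=10):
    """One receding-horizon step: solve the FHOCP and return the first input."""
    nx, nu = A.shape[0], B.shape[1]
    x = cp.Variable((nx, N + 1))
    u = cp.Variable((nu, N))
    cost = 0
    constr = [x[:, 0] == x0]
    for t in range(N):
        cost += cp.sum_squares(x[:, t + 1] - r) + 1e-2 * cp.sum_squares(u[:, t])
        constr += [x[:, t + 1] == A @ x[:, t] + B @ u[:, t],  # linearized model
                   Hu @ u[:, t] <= hu,                        # input set U, cf. (1)
                   Hx @ x[:, t + 1] <= hx,                    # time-invariant state set, cf. (1)
                   Hk @ x[:, t + 1] <= hk]                    # perception-based set X(k), cf. (2)
    cp.Problem(cp.Minimize(cost), constr).solve()
    return u[:, 0].value

```

Applying only the first optimized input and re-solving at the next time step yields the state feedback law \(\kappa(\boldsymbol{x}(k),\bar{\boldsymbol{r}},\mathcal {X}(k))\) in receding horizon form.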

3 Environment Aware MPC for Autonomous Navigation

When considering autonomous navigation problems, it is crucial for the system to perceive its surroundings and avoid collisions with the environment or other agents. This perception can be translated into a set of system state constraints that may change over time. An example of this is a vehicle that is equipped with an exteroceptive sensor capable of sensing its surrounding unknown environment. At each time step, sensor measurements can be used to derive a safe set, which evolves during navigation. While theoretical guarantees such as recursive feasibility and stability have been studied extensively for MPC schemes with time-invariant constraints, the problem of time-varying state constraints has received relatively less attention in the literature, despite its broad impact in various applications. The work presented in [10] is one of the few papers that addresses this problem. The authors analyze two interesting cases: modeled constraint variation and constraints with bounded changes. Ensuring persistent constraint satisfaction becomes crucial to meet the safety requirement, given the time-varying nature of the constraints. Recursive feasibility is a crucial property of MPC as it guarantees the ability to find a feasible solution at each sampling time. It ensures that the optimization problem can be successfully solved, resulting in a trajectory that satisfies all constraints while optimizing the defined objectives. When time-varying constraints are considered in MPC, the challenge lies in maintaining feasibility throughout different time steps despite the potential evolution of the constraints, which could be influenced by environmental or sensor factors. In this section, our focus is on achieving persistent constraint satisfaction under two specific types of time-varying constraints that arise when an autonomous vehicle is equipped with an on-board local sensor for perceiving the surroundings.

Fig. 1

Example showing the loss of recursive feasibility for a position-shifting state constraint. On the left, the traditional implementation without recursive feasibility guarantees; on the right, the proposed implementation. We consider the trajectory of the blue agent, while the red agent can be seen as a fixed obstacle. The blue line with ‘*’ represents the predicted state trajectory at time k. Due to the presence of a terminal state constraint, the tail of this trajectory represents a candidate trajectory at time \(k+1\). The light green polytope represents the state constraint at time k, while the light blue polytope represents the state constraint shifted to the first predicted state. In both examples, an obstacle avoidance constraint is imposed, and the dashed circles around the trajectories represent the size of the agents

3.1 State Shifting Constraint

When the vehicle is equipped with an on-board sensor or a communication device, the information provided by the sensor can be interpreted as a set of constraints centered at the vehicle’s position. In [14], we considered a constant polytopic set centered at the vehicle position to describe a communication set around the agent. Despite the constant nature of the constraint set, the shifting feature can cause a loss of feasibility, as depicted in Fig. 1 (left). To guarantee recursive feasibility, we proposed an implementation that ensures, by construction, that the tail of the predicted trajectory lies within the safe sets generated by shifting the set along the trajectory, as shown in Fig. 1 (right). The second type of shifting constraint considered in this doctoral thesis is a time-varying state constraint represented by a sequence of time-varying and unpredictable state constraints that shift with the vehicle state. Unlike the previous case, the shape of the constraint changes at each time step. Due to the unpredictable and time-varying nature of the constraints, it is no longer possible to guarantee recursive feasibility in this case; however, under suitable assumptions, persistent constraint satisfaction can still be ensured. In [13], these constraint sets were used to represent a sequence of unpredictable obstacle-free regions determined by a drone’s sensor during navigation in an unknown environment. Under the assumption of a static environment, it can be proven that persistent constraint satisfaction is guaranteed if the predicted trajectory remains within one of the constraint sets encountered up to the current time step.
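
As a rough illustration of the idea from [14], the following sketch checks whether a candidate trajectory (the tail of the previous prediction) remains inside a constant polytopic set when that set is re-centred at each predicted position along the trajectory. The function name, the polytope parametrization \(\{\boldsymbol{x}: H(\boldsymbol{x}-\boldsymbol{c})\le h\}\), and the use of position coordinates only are illustrative assumptions, not the exact formulation of the thesis.

```python
import numpy as np

def tail_in_shifted_sets(H, h, traj_pos):
    """Return True if, for every predicted position c along the trajectory, all
    subsequent positions of the tail lie inside the polytope {x : H (x - c) <= h}
    re-centred at c, i.e. the candidate trajectory stays feasible as the set shifts."""
    traj_pos = np.asarray(traj_pos)
    for i, c in enumerate(traj_pos):
        tail = traj_pos[i:]                     # points that must remain in the set centred at c
        if np.any(tail @ H.T - c @ H.T > h):    # violates H (p - c) <= h for some tail point p
            return False
    return True
```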

Fig. 2

Example illustrating the predicted state evolution with the multi-trajectory approach (left) versus a single-trajectory one (right). The green line with ‘\(\circ \)’ represents the safe trajectory; the red line with ‘\(\diamondsuit \)’ is the exploitation trajectory; the black dash-dotted line with ‘\(\Diamond \)’ is the trajectory of the single-trajectory MPC. Red ‘\(\star \)’: target. The time-varying constraints are represented with colored polytopes

4 Multi-trajectory MPC

As shown in Sect. 3 and illustrated in Fig. 1, ensuring recursive feasibility with time-varying constraints requires striking a delicate balance between safety and system performance optimization. To manage this trade-off, we propose the multi-trajectory MPC (mt-MPC) formulation. This approach predicts two trajectories: a “safe” trajectory that remains within the safe set and reaches a safe state, and an “exploiting” trajectory that is allowed to violate the current constraints if they are overly conservative. The two trajectories share a common control action at the current step and diverge subsequently in the prediction. To illustrate this concept, consider the two-dimensional example shown in Fig. 2, where the right side demonstrates a single-trajectory approach and the left side presents a multi-trajectory approach. The trajectories are calculated by solving a Finite Horizon Optimal Control Problem (FHOCP) with the goal of reaching a target beyond the safe set. The single-trajectory MPC constrains the entire predicted trajectory to lie within the safe set, minimizing the average deviation from the target. However, it neglects the possibility of an improved safe set in the future. In contrast, the multi-trajectory strategy plans a significantly better (yet currently infeasible) exploiting trajectory while maintaining a backup safe trajectory in case the constraint set does not expand towards the target. By comparing the positions reached by the two approaches at the first predicted time step, when implemented in a receding horizon fashion, the potential advantage of the multi-trajectory approach becomes evident.

Thus, the resulting FHOCP is divided into two predictions: the exploitation trajectory, which is considered in the cost function to drive the system towards the desired reference, and the safe trajectory, which satisfies the time-varying state constraints and ensures reaching a steady state within the current set of state constraints. The problem is generally a nonlinear program (NLP), and under regularity assumptions, a numerical solver can compute a (possibly local) minimum. To reduce the complexity, a linear model of the system and a quadratic cost function can be considered, resulting in a quadratic programming (QP) problem, as shown in [13]. If feasible, the QP problem can be efficiently solved for a global minimizer. While the ideal formulation guarantees the recursive feasibility of the approach by applying one of the methods presented in Sect. 3 to the safe trajectory, it may not ensure the convergence of the MPC scheme. To guarantee convergence, we propose a modified version of the FHOCP inspired by the works presented in [8, 9]. This modified version includes a convergence constraint that enforces a decreasing cost associated with the safe trajectory over time, along with an offset cost that drives the terminal state towards the final reference.
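
The following sketch shows, under the same simplifying assumptions as before (linear model, polytopic set \(\mathcal {X}(k)\), placeholder names and weights), how the two predictions of the mt-MPC can share the first control move in a QP-style formulation; input constraints, the time-invariant state set, and the convergence constraint of [8, 9] are omitted for brevity.

```python
import cvxpy as cp

def mt_mpc_step(A, B, Hk, hk, x0, r, N=10):
    """One mt-MPC step: safe and exploiting trajectories sharing the first input."""
    nx, nu = A.shape[0], B.shape[1]
    xs, us = cp.Variable((nx, N + 1)), cp.Variable((nu, N))   # safe trajectory
    xe, ue = cp.Variable((nx, N + 1)), cp.Variable((nu, N))   # exploiting trajectory
    u_ss = cp.Variable(nu)                                    # terminal equilibrium input
    constr = [xs[:, 0] == x0, xe[:, 0] == x0,
              us[:, 0] == ue[:, 0]]                           # common first control action
    cost = 0
    for t in range(N):
        constr += [xs[:, t + 1] == A @ xs[:, t] + B @ us[:, t],
                   xe[:, t + 1] == A @ xe[:, t] + B @ ue[:, t],
                   Hk @ xs[:, t + 1] <= hk]                   # only the safe trajectory obeys X(k)
        cost += cp.sum_squares(xe[:, t + 1] - r)              # cost drives the exploiting trajectory
    constr += [xs[:, N] == A @ xs[:, N] + B @ u_ss]           # safe trajectory ends at a steady state
    cp.Problem(cp.Minimize(cost), constr).solve()
    return us[:, 0].value                                     # shared first input applied to the plant
```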

5 Navigation Around Obstacles

The possibility that the system gets stuck in front of an obstacle that obstructs the path between the current vehicle position and the target is a common challenge in optimization-based autonomous navigation. To tackle this issue, we propose a mapping and exploration strategy called G-BEAM (Graph-Based Exploration and Mapping). The G-BEAM strategy leverages the under-approximation of the free space derived from exteroceptive sensor measurements to construct a graph representation of the environment. This graph is subsequently used to compute an optimal path for either exploring the environment or reaching a predetermined target. Before delving into the details of the approach, let us provide a description of the employed sensor and how we can effectively utilize the sensor measurements.

5.1 Exteroceptive Sensor and Convex Under-Approximation of the Free Space

We assume that, at each time step, we have access to the vehicle’s position and readings from an exteroceptive sensor, which is oriented parallel to the horizontal plane and capable of detecting the surrounding environment. The vehicle’s position can be measured using a Global Positioning System (GPS) sensor, while the exteroceptive sensor can be a Light Detection and Ranging (LiDAR) sensor and/or stereo cameras. The exteroceptive sensor provides a point cloud representation of the environment surrounding the vehicle at each time step. We assume that the sensor measurements are evenly spaced in all directions on a unit sphere centered around the vehicle’s position. These sensor measurements yield a non-convex region representing the obstacle-free area around the vehicle. However, for the sake of simplicity and efficiency, we aim to obtain a convex polytopic under-approximation of this obstacle-free region. In the literature, there have been numerous studies on the inner approximation of non-convex regions using convex sets. Various optimization-based approaches have been proposed (e.g., [2, 6]), where the positions of known obstacles are utilized to obtain the largest convex set within the free area. However, these methods typically require a priori knowledge of the environment, such as a set of convex polytopes or a map, and do not guarantee that the current vehicle position is included in the obtained set. In contrast, we propose an algorithm that relies solely on local sensor measurements and does not require a preexisting map of the environment. Given a set of sensor measurements, we begin by constructing the smallest convex polytope, defined by a predetermined number of vertices, centered at the vehicle’s position and contained within the measurements. We then iteratively expand these vertices until they reach the sensor’s maximum detection range or until any further expansion would include a sensor measurement in the polytope.
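
A simplified two-dimensional sketch of this vertex-expansion idea is given below: vertices are grown along fixed directions from the vehicle position, and a vertex stops when it reaches the maximum range or when a further expansion would either break convexity or place a sensor measurement inside the polygon. The initial radius, step size, fixed expansion directions, and acceptance test are illustrative choices, not the exact algorithm developed in the thesis.

```python
import numpy as np

def cross2d(a, b):
    """z-component of the 2D cross product, with broadcasting."""
    return a[..., 0] * b[..., 1] - a[..., 1] * b[..., 0]

def is_convex_ccw(V):
    """True if the ordered vertices V (n_vert x 2) form a convex, CCW polygon."""
    e = np.roll(V, -1, axis=0) - V
    return bool(np.all(cross2d(e, np.roll(e, -1, axis=0)) > 0))

def contains_any(V, P):
    """True if any point of P (m x 2) lies inside the convex CCW polygon V."""
    e = np.roll(V, -1, axis=0) - V
    inside = np.ones(len(P), dtype=bool)
    for v, ev in zip(V, e):
        inside &= cross2d(ev, P - v) >= 0      # inside = left of every edge
    return bool(inside.any())

def free_space_polygon(scan_xy, n_vert=8, r0=0.2, r_max=5.0, step=0.05):
    """Grow a convex polygon around the origin (vehicle position) that
    under-approximates the obstacle-free region; scan_xy (m x 2) are the
    sensor returns in the vehicle frame, assumed outside the initial polygon."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_vert, endpoint=False)
    radii = np.full(n_vert, r0)
    active = np.ones(n_vert, dtype=bool)       # vertices that can still expand
    while active.any():
        for i in np.flatnonzero(active):
            trial = radii.copy()
            trial[i] = min(trial[i] + step, r_max)
            V = np.c_[trial * np.cos(angles), trial * np.sin(angles)]
            if is_convex_ccw(V) and not contains_any(V, scan_xy):
                radii = trial                  # accept the expansion
                active[i] = radii[i] < r_max   # stop once the max range is reached
            else:
                active[i] = False              # expansion blocked by a measurement
    return np.c_[radii * np.cos(angles), radii * np.sin(angles)]
```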

5.2 Graph-Based Exploration and Mapping

The proposed solution entails the addition of a further hierarchical layer above the MPC controller proposed in the previous section. The choice of this structure is motivated by the difference in complexity and speed requirements of the involved algorithms. The main idea is to build a reachability graph that describes the environment, which can be used seamlessly for both navigation and exploration tasks. Nodes and collision-free arcs are extracted from the convex polytope and used to update the graph, together with an exploration gain representing the expected amount of information gained by reaching each node. Then, a receding horizon navigation logic selects the next target node to be provided to the lower-level controller. See Fig. 3 (left) for an example of the obtained graph. The navigation approach in G-BEAM is an event-based receding horizon strategy with similarities to MPC: each time the current reference node is reached, the graph is updated and a new temporary target is computed by planning a path in the graph culminating at a target node. Thus, comparing the approach with an MPC scheme, we exploit the graph-based model of the environment to find the path on the graph that minimizes a user-defined cost function. Indeed, the target node is computed by maximizing a reward function (minimizing the negative reward function) that trades off between environment exploration and reaching the external target \(\bar{\boldsymbol{r}}\) (exploitation), if provided.
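
A hedged sketch of this target-node selection is shown below, using networkx. The node attributes 'gain' and 'pos', the trade-off weights, and the additive reward structure are illustrative assumptions rather than the exact G-BEAM reward function.

```python
import numpy as np
import networkx as nx

def select_target_node(G, current_node, target_pos=None, alpha=1.0, beta=1.0):
    """Pick the reachable node maximizing exploration gain minus travel cost,
    optionally penalizing the distance to an external target (exploitation)."""
    # path cost from the current node to every reachable node in the graph
    dist = nx.single_source_dijkstra_path_length(G, current_node, weight="weight")
    best_node, best_reward = current_node, -np.inf
    for n, d in dist.items():
        reward = G.nodes[n]["gain"] - alpha * d              # exploration vs. travel cost
        if target_pos is not None:                           # exploitation term, if a target is given
            reward -= beta * np.linalg.norm(np.asarray(G.nodes[n]["pos"]) - target_pos)
        if reward > best_reward:
            best_node, best_reward = n, reward
    path = nx.shortest_path(G, current_node, best_node, weight="weight")
    return best_node, path
```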

6 Applications to Autonomous Navigation Problems

We evaluated the performance of the proposed approaches on real-world and simulated problems.

Fig. 3

On the left, the reachability graph obtained with G-BEAM without providing an external target. In the middle, the DJI drone used for the experiments. On the right, the experimental test of the mt-MPC approach: closed-loop trajectory obtained with mt-MPC (blue line with ‘\(\Diamond \)’); safe set \(\mathcal {X}(k)\) at different time steps k in red

The first practical application considers the out-of-the-lab navigation of a drone prototype built with off-the-shelf components (see [13] for further details). The drone, shown in Fig. 3 (middle), is equipped with a 2-dimensional LiDAR (Light Detection and Ranging) sensor able to detect the surrounding environment and is controlled at a lower level by a commercial flight controller. The higher-level control strategies presented in Sects. 3, 4 and 5 are implemented on a low-cost onboard computer and adopted to safely navigate and explore the environment. The experiments showed that the approach effectively allows the vehicle to reach a target or explore the environment while avoiding a priori unknown obstacles, and that the approaches can be efficiently implemented in real time on low-cost hardware.

The proposed framework has also been applied to multi-agent problems. In [4], it has been exploited to navigate a System of TEthered Multicopters (STEM) in a partially known environment. The system consists of multiple aircraft interconnected by power supply tethers (see [7] for further details on the system) and equipped with various exteroceptive sensors for detecting the surroundings. Initially, an optimal mission planner is developed to determine the position references for the system’s configuration in the nominal environment. Subsequently, a reactive path planner utilizes the sensor readings and the time-varying state constraints derived from the sensor measurements, following the presented framework. This path planner ensures the system navigates to the desired configuration, avoiding collisions with obstacles, even if they differ from the nominal ones. In [14], the multi-trajectory formulation has been extended to consider multi-agent systems with time-varying network topology. Every agent is equipped with a communication device, enabling bidirectional communication with other agents when their communication sets overlap. The communication sets depend on the system’s position, resulting in a time-varying communication topology. Re-configuring the controller in control system networks, with agents joining or leaving the network, remains an open challenge. To address this, the multi-trajectory MPC scheme is utilized to guarantee automatic plug-and-play operations that are never denied.

7 Conclusions

This brief provided an overview of a framework utilizing MPC for the safe navigation of autonomous vehicles towards a target reference. The theoretical results presented demonstrate that MPC serves as a suitable tool to accomplish this task, striking a balance between exploiting the vehicle’s capabilities and ensuring constraint satisfaction for enhanced safety. The control strategies discussed in Sects. 3, 4 and 5 have demonstrated their effectiveness in real-world experiments. Future research directions include expanding the multi-trajectory MPC approach to diverse scenarios, such as reconfigurable systems, and incorporating additional learning components into the problem. Furthermore, exploring its application in economic contexts could be intriguing, where unpredictable time-varying constraints can be viewed as evolving resources. In conclusion, this work highlighted the significance of studying time-varying constraints and showed that MPC is a promising approach for the constrained navigation of autonomous vehicles, with great potential for advancing the future of transportation.