1 Introduction

Autonomous driving of road vehicles promises to release passengers from paying attention to traffic, to enable car-sharing concepts relying on automated vehicles, and to enhance traffic flow by better coordination [34]. An anticipated additional advantage—and a required property at the same time—is the reduction of the number of accidents, injuries, and fatalities per driven distance. To see that this property indeed is achieved, the process of determining driving decisions for automated cars needs to continuously evaluate if an encountered scenario bears the risk of safety-critical evolutions, and to choose only driving options for which the motion remains safe as likely as possible. With respect to designing safe motion of single autonomous vehicles, intense research efforts in the past years have led to considerable insight into how to accomplish the main tasks of environment perception, vehicle localization, and control, see [2, 36, 40] among many others. Recent tremendous progress in inter-vehicle communication [48] paves the path, however, to employ techniques of coordination and cooperation to further improve safe autonomous operation also for groups of vehicles. The exchange of driving plans among neighbored cars (or the distribution of jointly computed plans to these cars) can obviously reduce the uncertainty about the actions of other traffic participants, and thus can contribute to safety. This book chapter investigates how hierarchical concepts of cooperative motion planning for groups of autonomous vehicles can ensure driving decisions that are consistent with respect to safe interaction.

1.1 Relevant Work

Due to the complexity of the tasks to timely identify a current traffic situation, of computing a safe driving decision, and possibly to communicate with and align to the behavior of connected vehicles, the use of modular and hierarchic approaches has been investigated in various forms, see e.g. [5, 6, 10, 41, 50]. While such schemes are often expected to lead to quicker reactions, to more flexibility and suitability for maintenance (such as easier update of modules) [38], they also bear to the challenge of ensuring consistency between different decision units: For the information-flow from a top layer of a hierarchy, which typically determines a qualitative behavior (such as lane following, turning, emergeny braking, etc.), to the bottom-most layer, which takes care of the vehicle actuation, it must be ensured that decisions are not contradictive.

For the subtask of path planning of autonomous vehicles, a large set of different approaches has been proposed in the past, as reported in the survey papers [3, 21, 22]. One class of techniques is based on gridding of the state space of a vehicle and searching a path along a set of grid points, e.g., by path-velocity decomposition [27], by RRT*-algorithms [26, 30], or by Monte-Carlo trees [28]. The complexity of these approaches, however, grows exponentially with the dimension of the state space, as obtained for larger sets of vehicles. A second class of techniques is that of learning-based approaches comprising supervised learning based on data from human driving experience, e.g. [45, 47], and reinforcement learning [31]. While these methods do not require structural insight into traffic situations and the computational effort required online is relatively low, the data set required for learning offline is very large, and means to always guarantee safety are not known yet. A different class of methods, which is relevant for the approach to be proposed in this paper, uses elements for structuring driving behavior into maneuvers, motion primitives, or homotopies. The common idea is to group behaviors which satisfy a notion of similarity or symmetry (such as invariance to translation or rotation), see e.g. [18, 19]. Maneuvers together with principles of optimal control can be used for motion planning [19, 33, 44], as well as for verifying safety of vehicle motion [25]. To satisfy conditions under which maneuvers or motion primitives can be concatenated to longer driving plans, and the consideration of obstacles are challenging for these techniques. In addition, these concepts are so far not used for sets of cooperative autonomous vehicles. A fourth class of relevant techniques for the work to be proposed in this chapter are approaches based on mixed-integer programming. The underlying optimization problems combine logic conditions modeled by integer variables with continuous variables to represent vehicle motion, see e.g. [29, 35, 37, 39, 43]. While the subclasses of mixed-integer linear or quadratic programming ensure that globally optimal solutions can be found, the computational effort is typically an issue if used in online optimization.

For the subtask of tracking a reference on a lower layer of a control hierarchy, different variants of model-predictive control (MPC) have been considered in the past, since this class of techniques is suitable to consider constraints of the inputs and states (such as the adherence to the admissible regions). On the one hand, approaches of nonlinear MPC have been considered for this purpose [16, 17, 23], but it is questionable whether the nonlinear optimization problem can be solved with the frequency required for low-level tracking control. On the other hand, linearized-based variants of MPC [10, 24, 32, 50] need to account robustly for linearization errors, typically leading to conservativeness, thus implicitly reducing the space from which the reference signals can be chosen. Similar reasoning applies to methods relying on linear-parameter-varying models, see e.g. [1, 4]. For these reasons, the use of tracking techniques based on nonlinear dynamics, with consideration of constraints, but without embedding online optimization seems preferable. The techniques reported in [20, 49] for tracking by stabilizing feedback control motivate the concept used in this paper for the same purpose, but those techniques require extensions to account for the constraints imposed by the second layer.

1.2 Contribution

In order to timely adapt the driving behavior of sets of cooperative vehicles to changing situations, this book chapter proposes a hierarchic approach using three layers of decision: The first (and upper) layer structures the setting into cooperative groups, the second layer computes driving plans which are guaranteed to exclude collision while leading to the goals of the involved vehicles, and the third layer uses the plans as reference signals for the online control tasks. One particular focus with respect to the second layer is the division into an offline and an online part for computational efficiency. The offline part determines and stores admissible driving regions as well as selected optimized trajectories for the vehicles grouped together for a specific maneuver. The online part comprises only the relatively easy steps of interpolating between the pre-computed optimal trajectories. A second focus of this chapter is on the third layer, on which a feedback control task is solved for a more detailed representation of nonlinear vehicle dynamics such that following the plan obtained from the second layer (as reference signal) is achieved. At the same time, it is ensured that the admissible driving region is not left. Thus, a main benefit of the approach to be presented is that consistency of the decisions at the interfaces between the first and second layer, and between the second and third layer respectively, is guaranteed.

Fig. 1
A flow diagram has 3 layers. Group coordination leads to maneuver planning via scenario or maneuver, leading to trajectory control via reference trajectories which then leads back to maneuver planning via acknowledgment, and to group coordination via acknowledgement.

Hierarchical structure of the decision-making procedure and additional information from the environment

2 A Hierarchical Approach to Decision Making

This section first provides an overview of the proposed hierarchic procedure of decision making for cooperative autonomous driving, while details on the techniques assigned to the three layers will be described in the subsequent sections. According to Fig. 1, the upper-most layer is termed group coordination and aims at identifying the traffic scenario in a particular road area. The underlying (and required) information is the road topology in the respective area, the set of traffic participants and obstacles in this area, and the routes of the vehicles (leading essentially to the point at which each vehicle intends to leave the area). The sources of information are the route planners of the vehicles, the communication units of the vehicles (and possibly of road-side stations), and the on-board vision and perception systems. The mechanisms of the latter are not in the scope of this paper, but the assumption is that the onboard sensorics together with algorithms of object identification deliver the complete set of objects in the environment, together with predictions of the motion of dynamic objects. The available information is used to identify the current scenario and to select a maneuver from a maneuver library, which is computed a-priori in an offline phase. The selection of the maneuver comprises the formation of cooperative groups, i.e. the set of vehicles is partitioned into subsets which cooperate in executing a maneuver jointly, see Sect. 2. (Non-cooperative traffic participants are assigned to separate groups.)

For any of such cooperative groups, the maneuver selection is passed to layer 2, termed maneuver planning. For each maneuver, the set of possible behaviors for each car of the respective group is computed offline based on reachable sets of hybrid automata. These automata (to be defined in Sect. 4) combine the driving plans of the vehicles assigned to a cooperative group, and they ensure that any vehicle exclusively occupies a certain region of the road, thus excluding collision. The computation of reachable sets is based on simplified linear dynamics and considers the goal set for the chosen maneuver and constraints arising from obstacles and the road topology. Optimized sample trajectories within these reachable sets are computed offline, and are stored in a maneuver library for the respective situation. In the online execution, an interpolation between the optimized sample trajectories is computed for each vehicle and its exact position as observed by the sensors. These interpolated trajectories determine a reference forwarded to the third layer, and they remain contained in the set of admissible regions. An acknowledgement is sent back to the first layer to either report success of maneuver planning, or to initiate the selection of a different maneuver, if the one considered before was not found to be executable in the online procedure.

On the third layer, each vehicle controller locally aims at controlling the vehicle position and speed to the reference obtained from the second layer. As opposed to the linearized dynamics used on layer 2, this feedback control task is based on a higher-dimensional nonlinear model of the vehicle dynamics, which involves the quantities obtained from the on-board sensors, the actuated quantities, as well as possible disturbances (such as wind gusts). It is important for this layer that the local tracking error between the planned reference and the actual vehicle path is hold in suitable bounds. For the trajectory tracking, a tailored method is proposed which uses a nonlinear model of the tracking error represented in the Frenet frame. The tracking errors of the position, their speed of change, and of the yaw angle is controlled by state feedback control. Bounds on the maximum deviation arising for bounded uncertainties of wind and tire forces can be computed, and thus can be compared with the driving corridors used in maneuver planning on layer 2. If an inconsistency occurs, the acknowledgement signal from layer 3 to 2 needs to report this and triggers modification of the reference trajectory.

The computations on the three layers are iteratively updated with appropriate frequency: In the nominal case, formed groups on layer 1 would be expected to exist for a range of several seconds until a maneuver is terminated, but possible occurrence of new situations (including emergencies like suddenly occurring obstacles) requires to react within a few milliseconds. If layer 2 follows a selected maneuver over several seconds, incoming new measurements about the exact vehicle positions or communicated information allow to update the computation of updated reference trajectories—an update of every of 10–50 milliseconds can be deemed reasonable for implementations of the hierarchy, while emergency situations again require to switch much faster to, e.g., a mode of hard braking. On layer 3, the update of the feedback control action would be executed nominally in the range of milliseconds.

3 Group Coordination

The task of group coordination is explained based on the example scenario shown in Fig. 2. Assume that a set of vehicles is present in a defined section of the road network, here the area of a T-intersection. A decision unit performing the functions of the upper two layers of the control hierarchy is assigned to this section. Physically, this unit could either be embedded into the infrastructure of the intersection, or one of the car controllers could temporarily assume this role. Let all vehicles be equipped with devices for wireless communication, such that the intended routes, the current positions and speed, and the driving plans (as outcome of the planning on layer 2) can be exchanged among the vehicles and with the decision unit. If the five vehicles follow the intended directions as indicated by the solid arrows, the assignment to two cooperative groups is straightforward, namely the red-colored and the two gray-colored vehicles form a group \(G_1\), while the white and the green vehicle are assigned to \(G_2\). Obviously, a simple lane following maneuver needs to be planned for \(G_2\), whereas it has to be decided for \(G_1\) at which position the car intending the right turn merges into the convoi of the two others. Assuming that the green car intended a left turn (indicated by the dashed arrow), the vehicle needed to be assigned to \(G_1\).

The situation becomes more complicated if non-cooperative traffic participants are present: If the red car were non-cooperative, the decision unit would assign it to an additional group \(G_3\). It had to be distinguished if this car communicates its planned trajectory, or not. If it did, the trajectory would be considered as disturbance for the maneuver planning of the gray cars in \(G_1\). If it did not, the complete set of possible behaviors of the red car would have to be considered (or estimated based on the sensor data), leading to more conservative maneuver planning for the gray car turning right. More details on considering non-cooperative traffic participants can be found in [8, 42], where vulnerable road users (cyclists or pedestrians) are addressed.

Fig. 2
A road network diagram illustrates a scenario where vehicles communicate with a decision unit to form cooperative groups based on their driving intentions. It depicts 3 groups, labeled G 1, G 2, and G 3, with a decision unit.

Example of forming cooperative groups: The green car is assigned to \(G_2\) or \(G_1\) depending on the driving intention. In case of a non-cooperative red car, it forms another group \(G_3\). All vehicles communicate with a decision unit

Note that for some scenarios the assignment of vehicles to groups is not related to a fixed section of a road, but to a moving section, as e.g. if a set of interacting vehicles on a highway perform an over-taking maneuver.

While the approach presented in this paper determines cooperative maneuvers for each group by the procedure described in the next section, the following alternative approach from [9, 41] should be briefly mentioned: There, the different vehicles in a road section compute a proposal for collision-free trajectories by MPC using linearized dynamics (as an alternative to layer 2). These trajectories are then negotiated by an auction-like bidding procedure, i.e. the vehicle controllers determine which proposed trajectories are agreeable for all vehicles of a group (as an alternative solution on layer 1).

Fig. 3
A diagram illustrates the overtaking procedure. The car at position P 1 passes through position P 2 and reaches position P 3, in a curved pattern. Below, there are circular insets, labeled q 1, q 2, and q 3, with mathematical equations.

Partitioning of the overtaking procedure of the red car into three phases and construction of a hybrid automaton modeling the procedures

4 Maneuver Planning

4.1 Planning Based on Hybrid Models and Controllable Sets

The maneuver planning for a cooperative group is based on joint modeling of the vehicle dynamics by hybrid automata. This choice is motivated by the observation that most maneuvers can be understood as a sequence of qualitatively different phases, as shown for the example of an over-taking scenario in Fig. 3, adopted from [11]. The process of the red car overtaking the gray one can be separated into the phase \(P_1\) of accelerating and approaching the gray car, the phase \(P_2\) of changing lane and passing the gray car, and the phase \(P_3\) of changing back to the right lane and possibly decelerating. The maneuver is straightforwardly cast into a hybrid automaton, as shown in the bottom part of Fig. 3: Each phase \(P_i\), \(i\in \{1,2,3\}\) is modeled by a discrete state \(q_i\), and transitions \(\theta _{12}\), \(\theta _{23}\) represent the instantaneous change of the discrete state if an associated transition condition \(g(\theta _{12},x_k)\), or \(g(\theta _{23},x_k)\) holds true. These conditions depend on the continuous-valued state vector \(x_k\) at a discrete point of time indicated by k. The state vector comprises the position (in cartesian coordinates) and the speed of all vehicles in the cooperative group. Based on \(x_k\), the condition \(g(\theta _{12},x_k)\) formulates, e.g., the conjunction of the facts that the red car has approached the gray one up to a lower distance threshold, that the speed of the red car is sufficiently larger that than the speed of the gray car, and that the distance to the white car is sufficiently large for a given speed of the white car. Conditions of similar type can be formulated to characterize the admissible values of \(x_k\) for each of the discrete states \(q_i\), then referred to as invariants \(x_k\in inv(q_i)\). They play an important role in separating safe from unsafe driving behavior, i.e., any \(x_k\) modeling a dangerously close distance between two vehicles is excluded from the invariants. Likewise, obstacles and regions outside of the drivable road space are not included in \(inv(q_i)\). The evolution of the state is modeled by discrete-time difference equations \(x_{k+1}=f_i(x_k,u_k,w_k)\), which depend on the vector of control inputs \(u_k\) (available to actively change \(x_k\) by accelerating, braking, and steering) and on a possible vector of disturbances \(w_k\)—for both vectors, static bounds \(u_k\in U\) and \(w_k\in W\) are assumed to be known. Formally, a hybrid automaton \(HA=(T,Q,q_0,q_T,inv,U,W,X_0,X_T,\Theta ,g,f)\) is introduced for a maneuver, containing the set \(T\subset \mathbb {N}\) of discrete points of time, the set \(Q\subset \mathbb {N}\) of discrete states, discrete initial and target states \(q_0\in Q\) and \(q_T\in Q\), the assignment inv of invariants to discrete states, the bounded sets \(U\subset \mathbb {R}^m\) and \(W\subset \mathbb {R}^r\) of inputs and disturbances, the sets \(X_0\subset \mathbb {R}^n\) and \(X_T\subset \mathbb {R}^n\) of possible continuous initial states and target states, the transition conditions g, and the discrete-time state transfer functions \(f_i\) (collected in f). See [7, 12] for more details on the semantics of the model.

Given this model, a maneuver is defined as a tuple \(\mathcal {M}=(\mathcal {G},\mathcal {N},HA,h)\) of the group \(\mathcal {G}\) of cooperating cars, the set \(\mathcal {N}\) of non-cooperating vehicles, the hybrid automaton, and a planning horizon h (as a maximum number of time points to complete the maneuver). Such a tuple is modeled offline as a template for a class of scenarios of the same pattern, as for the example of overtaking procedures on a single lane road involving three vehicles. A tuple \(\mathcal {M}\) is modeled offline for any class of scenarios the autonomous vehicles are expected to get in, and \(\mathcal {M}\) is stored in a maneuver library.

The advantage of using maneuvers of this type for planning is that set-based offline computations allow (under some assumptions) to represent the set of feasible and collision-free maneuver instances. For this purpose, let all \(f_i\) be affine mappings of its arguments and all continuous sets in HA be chosen polytopic. Then, j-step robust controllable sets can be computed for HA as a sequence of polytopic subsets of the invariants:

$$\begin{aligned} \mathcal {K_j}=(K_0,\ldots ,K_j) \end{aligned}$$

for which an input trajectory \((u_0,\ldots ,u_j)\) exists to definitely transfer the state \(x_0\in X_0\) in at most j steps into the target \(X_T\), despite of the presence of the disturbances \(w_k\in W\). These sets are instrumental for the guarantee of finding a winning control strategy for an arbitrary initialization \(x_0\in X_0\) measured in online operation, i.e. to determine \((u_0,\ldots ,u_j)\) for:

$$\begin{aligned} (x_0\in K_0,\ldots ,x_j\in X_T\subseteq K_j) \end{aligned}$$
(1)

(compare to [12, 46]). Such a trajectory ensures for \(h\le j\) that a maneuver \(\mathcal {M}\) can be successfully completed, and it implies that the vehicles are coordinated in safe interaction.

Of course, the choice of \((u_0,\ldots ,u_j)\) is not unique. In order to determine an optimal choice with respect to a cost functional, such as minimizing the control effort and the distance to a target point of the maneuver, a minimization problem:

$$\begin{aligned} &\min _{(u_0,\ldots ,u_h)} \sum _{k=1}^{h} \Vert C\cdot (x_k-x_T)\Vert _2^2+\Vert D\cdot u_k\Vert _2^2 \end{aligned}$$
(2)

(with weighting matrices C, D) could be solved (e.g. for worst-case values of the \(w_k\)). This minimization, however, is subject not only to (1), but also to the constraints arising from the dynamics of HA. Its encoding with respect to the assignment of \(f_i\) to \(inv(q_i)\) and the transitions involves to use binary variables and several linear inequalities, leading to an optimization of type mixed-integer quadratic programming. The computation times found to be required to solve such problems are often too large compared to the update frequency targeted for layer 2. In addition, even smaller times are required to decide whether a selected maneuver leads to a feasible solution (in order to report back to layer 1 that an alternative maneuver is needed). Thus, the proposal is to use a combination of offline computation of optimal strategies and quick online interpolation [13]: First, polytopic inner approximations \(\hat{\mathcal {K}}_k\subset \mathcal {K}_k\) of the control invariant sets are determined. The objective is to obtain good coverage of the \(\mathcal {K}_k\), to consider the worst-case disturbances (if present), and to use only a relatively small number of facets of the polytopes \(\hat{\mathcal {K}}_k\). A procedure for this step is described in [7], and it considers the transition dynamics and the invariants of HA. Note that, as the result of this procedure, still a control trajectory leading to \(X_T\) is guaranteed to exist for any initialization to a state in the \(\hat{\mathcal {K}}_k\).

Fig. 4
A diagram depicts an optimization projection using arrows. The projections are labeled K 0 hat, K 1 hat, K 2, and K 3 hat. It illustrates the path obtained from the stepwise linear interpolation, representing the interpolated state path, and the corresponding inputs determine the control strategy.

Gray-colored arrows represent the optimal projections \(\varphi \) of the vertices of the polytopes \(\hat{\mathcal {K}}_k\) (which are inscribed to the corresponding subset of \(\mathcal {K}_k\)). The path shown in red is obtained online from stepwise linear interpolation \(\tilde{\varphi }\) between the optimal vertex proejctions

Secondly and still carried out offline, the vertices of the polytopes \(\hat{\mathcal {K}}_k\) are optimally projected forward onto \(\hat{\mathcal {K}}_{k+1}\), \(k\in \{0,\ldots ,h-1\}\) by solving an optimization problem similar to (2). This procedure is exemplary sketched in Fig. 4 by dashed gray-colored arrows, and is denoted by \(\varphi \). The triples of vertex \(v_i\), optimal input \(u^*_i\) and the optimal projected state \(x^*_i=\varphi (v_i,u^*_i)\) are stored with the maneuver for online use.

In the online execution, the following is accomplished: First, it is checked whether the currently measured state, denoted by \(x_0\), is contained in any of the sets \(\hat{\mathcal {K}_k}\) of the selected maneuver \(\mathcal {M}\). If not, the maneuver is not guaranteed to be executed safely, and an alternative maneuver has to be determined. If \(x_0\in \hat{\mathcal {K}_k}\) for any k, the barycentric coordinates of \(x_0\) in \(\hat{\mathcal {K}_k}\) (with respect to its vertices \(v_i\)) are computed. The input \(u_0\) is then determined as the interpolation of the inputs \(u^*_i\) (as optimized offline for the vertices \(v_i\)) by use of the same barycentric coordinates. In Fig. 4, the outcome is shown by a red dashed line for the interpolated state path, and is denoted by \(\tilde{\varphi }\). The corresponding sequence of interpolated inputs along this path determines the control strategy which transfers the state \(x_k\) eventually into the goal set \(X_T\) of the maneuver. The interpolation requires only relatively small computational effort.

4.2 Illustration for an Overtaking Maneuver

For illustration of the procedure, consider again the overtaking maneuver from Fig. 3, see also [7, 13]. Let the longitudinal position of the three vehicles be denoted by \(p_x^{(i)}\) and the lateral position by \(p_y^{(i)}\), where \(i=1\) refers to the red car, \(i=2\) to the gray car, and \(i=3\) to the white car approaching from opposite direction. To simplify the model, the relative longitudinal positions \(p_{r}^{(2)}=p_x^{(2)}-p_x^{(1)}\) and \(p_{r}^{(3)}=p_x^{(3)}-p_x^{(1)}\) are introduced, and the lateral positions of the gray and white car are assumed to be constant (thus their lateral speeds equal zero). The reduced state vector:

$$\begin{aligned} x&=\begin{pmatrix} p_{r}^{(2)}&p_{r}^{(3)}&p_y^{(1)}&v_x^{(1)}&v_x^{(2)}&v_x^{(3)}&v_y^{(1)} \end{pmatrix} \end{aligned}$$

is thus defined on a 7-dimensional space. The inputs are chosen identical to the accelerations in longitudinal and lateral direction (\(u_x^{(i)}:=\dot{v}_x^{(i)}\), \(u_y^{(i)}:=\dot{v}_y^{(i)}\)), leading to a 4-dimensional input vector:

$$\begin{aligned} u&=\begin{pmatrix} u_{x}^{(1)}&u_{x}^{(2)}&u_{x}^{(3)}&u_{y}^{(1)} \end{pmatrix}. \end{aligned}$$

A simple double integrator dynamics is used for longitudinal and lateral motion, leading to a linear model \(\dot{x}(t)=A\cdot x(t)+B\cdot u(t)\) with only entries 0 and 1 in the matrices A and B. While very simple, such a model (with subsequent discretization of time) is very frequently used in path planning—and well justifiable for the hierarchic approach, since it only serves the purpose of computing a reference trajectory for a control problem with more detailed dynamics on layer 3.

The model is complemented by constraints for the state and input vector, by a target set formulating the ranges of the vehicle positions for completing the maneuver, by a nominal longitudinal speed (as \(x_T\)), and by weighting matrices C, D of the cost functional (here chosen to identity matrices). The hybrid model HA is obtained (according to Fig 3) by adding the discrete states, the invariants (which consider a safe minimal distance between the vehicles in both coordinates), and the transitions including the conditions \(g(\theta )\) (corresponding to reaching the boundaries of the invariants); see [7] for a full parametrization of these components.

Fig. 5
21 diagrams, arranged in 7 columns, present the series of controllable sets for an overtaking maneuver. They illustrate the exact controllable sets and the sets providing polytopic inner approximations.

Two-dimensional projections of the controllable sets obtained for the overtaking maneuver; the white sets represent the exact controllable sets, while the gray sets establish polytopic inner approximations; from [7]

Fig. 6
2 graphs. P y versus P x illustrates an ascending to descending trend for the red vehicle, and 2 horizontal trends for gray and white vehicles. V x versus time in seconds plots nearly horizontal trends for red and gray vehicles, and an ascending trend for the white vehicle.

Results for the overtaking procedure: left part—lateral over longitudinal position (crossed: red vehicle, boxed—gray vehicle, circled—white vehicle), h denotes the end of the planning horizon; right part—longitudinal speed over time (solid—red car, dashed—gray car, dotted—white car)

Based on this hybrid model (which does not include disturbances), the controllable sets are computed over \(j=5\) steps, and are shown in Fig. 5. After optimizing the trajectories originating from the vertices of the controllable sets, the interpolation procedure is applied. Figure 6 shows example trajectories of the positions and longitudinal speeds for a chosen initialization. The trajectories demonstrate that the maneuver is successfully completed without collision, and they indicate cooperation in the sense that the gray vehicle lowers its speed and the white vehicle keeps a low speed until the red vehicle has passed. The references [7, 13] provide insights into the computation times for the proposed scheme of online interpolation between offline optimized trajectories compared to online trajectory optimization within the controllable sets: For a large number of test instances for this example (with varying initialization) it was found that the interpolation approach required an average time of 0.25 milliseconds, while the results for the online optimization were obtained in average after 12.1 milliseconds. The first value can be deemed sufficiently small for execution within the hierarchic approach.

For a second example describing the cooperative maneuver of vehicles merging into highway traffic at an on-ramp, the interested reader is referred to [15].

5 Trajectory Control

The technique described in the preceding section provides a feasible path for any vehicle involved in a scenario with respect to the simplified (piecewise) linear dynamics used on layer 2. The control strategies obtained there for realizing the paths are, however, not immediately useful for controlling the autonomous vehicles for the following reasons: (a) vehicle motion comprises nonlinear effects which need to be considered for low-level vehicle control, (b) the state and input vectors used for layer 2 do not contain the full set of controlled and actuated variables typically employed for vehicle control. In consequence, it is not ensured that the vehicle would indeed follow the plan computed on layer 2—thus, the control hierarchy includes the additional layer 3 for local low-level vehicle control. This layer employs more accurate models, and the paths computed on layer 2 serve as reference signals for the local vehicle controllers. Two questions immediately arise from this choice: First, a more accurate and nonlinear model lets one expect higher computational effort. Reference tracking by nonlinear MPC, as in [10], may not be feasible if the time required for the online solution of the optimization problem is not compatible to high execution frequencies for low-level control. Hence, the approach proposed below uses a state-feedback controller with very low computational demands. Secondly, the use of different models for plan generation and tracking control may compromise consistency between the two decision levels—this point is addressed below. (See also [14] for a discussion on the relation between linear integrator models for planning and the nonlinear motion of vehicles on curved roads.)

To prepare the control design, it is convenient to transform the setting for a given reference trajectory into so-called Frenet-coordinates, which specify positions relative to their projection onto the reference path. This means that a point on the path is described by a path coordinate s, and an offset that is measured in normal direction to the reference.

The standard bicycle model is used here as starting point for vehicle modeling, with a state vector \(\chi \) containing the positions \(p_x\) and \(p_y\) (in cartesian coordinates), the vehicle orientation \(\psi \), and the longitudinal and lateral speeds \(v_X, v_Y\), and the yaw rate \(\omega \). The input vector \(\mu \) is defined to contain longitudinal tire slip \(s_X\) and the steering angle \(\delta \). With a rotation matrix \(R(\psi )\) and external accelerations \(a:=(a_X,a_Y,a_{\psi })^T\) to account for effects like wind and tire forces, the model is formulated to:

$$\begin{aligned} \begin{pmatrix} \dot{p}_x \\ \dot{p}_y \\ \dot{\psi } \\ \dot{v}_X \\ \dot{v}_Y \\ \dot{\omega } \end{pmatrix} = \begin{pmatrix} R(\psi ) \begin{pmatrix} v_X \\ v_Y \end{pmatrix} \\ \omega \\ \begin{pmatrix} a_X \\ a_Y \\ a_\psi \end{pmatrix} + \begin{pmatrix} v_X \omega \\ - v_Y \omega \\ 0 \end{pmatrix} \end{pmatrix}. \end{aligned}$$
(3)

It is assumed that the accelerations \(a:=\bar{a}+\Delta a\) can be written as the sum of a nominal part \(\bar{a}\) and a bounded offset \(\Delta a\), \(||\Delta a||\le \Delta a_{max}\). As detailed in [7], the dependency of (3) on \(s_X\) is obtained from an appropriate tire model leading to the function \(\bar{a}(\delta ,s_X,s_Y,v_Y,\omega )\).

By defining the error state vector (with tangential and normal part to the reference indicated by indices t and n):

$$\begin{aligned} e := \begin{pmatrix} e_t & e_n & \dot{e}_t & \dot{e}_n & e_\psi & e_\omega \end{pmatrix}^T = \begin{pmatrix} e_{pos}^T & \dot{e}_{pos}^T & e_{yaw}^T \end{pmatrix}^T, \end{aligned}$$

the error dynamics can be derived to:

$$\begin{aligned} \frac{d}{dt} \begin{pmatrix} e_{pos}\\ \dot{e}_{pos} \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \dot{\theta }^2 & \ddot{\theta } & 0 & 2\dot{\theta }\\ -\ddot{\theta } & \dot{\theta }^2 & -2\dot{\theta } & 0 \end{pmatrix} \begin{pmatrix} e_{pos}\\ \dot{e}_{pos} \end{pmatrix} - \begin{pmatrix} 0 \\ 0\\ \ddot{s} \\ \dot{\theta }\dot{s} \end{pmatrix} + \begin{pmatrix} \begin{array}{cc} 0 & 0\\ 0 & 0 \end{array} \\ R(e_{\psi }) \end{pmatrix} \begin{pmatrix} \bar{a}_X \\ \bar{a}_Y \end{pmatrix} + \begin{pmatrix} \begin{array}{cc} 0 & 0\\ 0 & 0 \end{array} \\ R(e_{\psi }) \end{pmatrix} \begin{pmatrix} \Delta a_X \\ \Delta a_Y \end{pmatrix} \end{aligned}$$
(4)

and:

$$\begin{aligned} \frac{d}{dt} \begin{pmatrix} e_{\psi }\\ \dot{e}_\omega \end{pmatrix} = \begin{pmatrix} 0 & 1\\ 0 & 0 \end{pmatrix} \begin{pmatrix} e_{\psi }\\ \dot{e}_\omega \end{pmatrix} + \begin{pmatrix} 0\\ 1 \end{pmatrix} (\bar{a}_{\psi } + \Delta a_{\psi } - \ddot{\theta }). \end{aligned}$$
(5)

In here, \(\theta \) describes the orientation angle between the longitudinal direction (indicated by X, tangential to the reference signal) and the cartesian coordinate x.

For given \(v_X\), \(v_Y\), and \(\omega \), the nominal acceleration \(\bar{a}\) is controlled by appropriate choice of \(\delta \) and \(s_X\) (while \(\Delta a\) is a disturbance). By defining a virtual input vector \(\tilde{\mu }\), \(\bar{a}\) can be computed:

$$\begin{aligned} \tilde{\mu } : &= \begin{pmatrix} \dot{\theta }^2 & \ddot{\theta } & 0 & 2\dot{\theta }\\ -\ddot{\theta } & \dot{\theta }^2 & -2\dot{\theta } & 0 \end{pmatrix} \begin{pmatrix} e_{pos}\\ \dot{e}_{pos} \end{pmatrix} - \begin{pmatrix} \ddot{s} \\ \dot{\theta }\dot{s} \end{pmatrix} + R(e_\psi ) \begin{pmatrix} \bar{a}_X \\ \bar{a}_Y \end{pmatrix} \end{aligned}$$
(6)
$$\begin{aligned} \Leftrightarrow \begin{pmatrix} \bar{a}_X \\ \bar{a}_Y \end{pmatrix} &= R(e_\psi )^T \left( \tilde{\mu } + \begin{pmatrix} \ddot{s} \\ \dot{\theta }\dot{s} \end{pmatrix} - \begin{pmatrix} \dot{\theta }^2 & \ddot{\theta } & 0 & 2\dot{\theta }\\ -\ddot{\theta } & \dot{\theta }^2 & -2\dot{\theta } & 0 \end{pmatrix} \begin{pmatrix} e_{pos}\\ \dot{e}_{pos} \end{pmatrix} \right) \end{aligned}$$
(7)

To reduce the tracking error by feedback compensation, define a feedback law:

$$\begin{aligned} \tilde{\mu } = -K \begin{pmatrix} e_{pos}\\ \dot{e}_{pos} \end{pmatrix} \end{aligned}$$
(8)

with controller matrix K, by which the closed-loop position and speed error are obtained to:

$$\begin{aligned} \frac{d}{dt} \begin{pmatrix} e_{pos}\\ \dot{e}_{pos} \end{pmatrix} = f_{pos}(e,\Delta a) &:= \left( \begin{array}{c} \begin{array}{cc} 0_{2\times 2} & I_{2\times 2} \end{array} \\ -K \end{array} \right) \begin{pmatrix} e_{pos}\\ \dot{e}_{pos} \end{pmatrix} + \begin{pmatrix} \begin{array}{cc} 0 & 0 \\ 0 & 0 \end{array} \\ R(e_{\psi }) \end{pmatrix} \begin{pmatrix} \Delta a_X \\ \Delta a_Y \end{pmatrix}. \end{aligned}$$
(9)

The transformation of a given \(\bar{a}\) into \(\delta \) and \(s_X\) is detailed in [7], as well is the derivation of an expression for the yaw error dynamics:

$$\begin{aligned} \frac{d}{dt} \begin{pmatrix} e_\psi \\ e_\omega \end{pmatrix} = f_{yaw}(e,\bar{x},\Delta a). \end{aligned}$$

For several variables contained in the nonlinear model, physical constraints need to be satisfied in order to establish safe driving. Only those reference trajectories from layer 2 which satisfy these constraints, can be characterized as realizable. Furthermore, deviations from the reference signal (in terms of the introduced tracking errors) must be located inside of those constraints. Thus, for given bounds \(\Delta a_{max}\) of the uncertainties, admissible ranges for the tracking errors can be computed in order to satisfy the constraints. While out of scope of this book chapter, the reader is referred to the approach described in [7] for synthesizing the matrix K in the feedback control law (8). This law keeps the tracking error (for \(e_{pos}\) and \(\dot{e}_{pos}\)) within the admissible ranges (for single reference values). The synthesis is based on solving a semi-definite program constrained by linear matrix inequalities. The analysis of the tracking error of the yaw dynamics is more intricate due to its nonlinear and time-varying nature, but boundedness of this error can be shown, too, under appropriate assumptions [7].

6 Conclusions

This book chapter has proposed the concept of a hierarchic decision architecture to enable cooperative driving of a set of autonomous vehicles (even in presence of non-cooperative traffic participants). While the implementation and testing of this proposal in practice (i.e. on an autonomous vehicle) is still matter of future investigations, the following principal advantages are highlighted:

  • The decomposition of the overall problem into three layers for separating the tasks of group determination, planning of joint maneuvers, and local vehicle control makes the problem tractable, even for the challenging timing of real-world traffic scenarios. The additional partitioning of the maneuver planning on layer 2 into an offline and an online part further increases the computational efficiency.

  • With the objective to provide guarantees on safety of the cooperative driving strategies, particular emphasis has been set on embedding constraints for excluding collision, and on reasoning about consistency of the decisions across the layers. Along this line, mechanisms for checking admissibility of the information received from the super-ordinated layers have been proposed.

  • The feedback control scheme on layer three is not only computationally very efficient, but it allows to derive conditions for which constraint compliance is obtained. Upper bounds on the tracking errors (for bounded disturbances) allow to construct reference trajectories on layer 2 which have sufficient safety margins for excluding collision.

Aspects of future research include the construction of an as complete as possible maneuver library for layer 2. If a scenario is encountered for which no compliant maneuver has been defined, stopping one or more vehicles is the only and undesired choice. (Of course, this lack of completeness applies for all existing approaches, including those relying on learning from massive data sets.) Systematic classification of scenarios and structured modeling of atoms of maneuvers (and concatenation hereof) may help to run into this situation very rarely. With respect to the third layer, the inclusion of measurement uncertainties constitutes a valuable future extension.