
1 Introduction

While a short-term step can improve the tracking accuracy and stability of a motion control system, a long-term prediction/control horizon is necessary to make a reasonable guess about the future and to shape the translational motion and speed profile of the vehicle [4]. However, limited computing power creates a dilemma between the two. Moreover, the coupled control of states evolving over different time scales makes controller tuning difficult across driving conditions.

This paper derives a necessary condition for the path tracking problem. As a result, the translational motion can be decoupled from the rotational motion, and states evolving over different time scales can be regulated by different control loops. Thus, the computational load can be reduced and the control horizon extended without sacrificing tracking accuracy. To further mitigate the computational burden, the path integral control framework is incorporated into model predictive control, where the control sequence is continually optimized via parallel sampling [6]. Moreover, a predictive output regulator is proposed to address the underactuated nature of lateral tracking. Simulation results show that the proposed method effectively reduces the computational complexity of the nonlinear optimization and improves steering smoothness without sacrificing tracking accuracy.

2 Formulation of Path Tracking Problem

We formulate path tracking as a Point-to-Points (P2Ps) problem. The original P2Ps formulation solves path tracking in a single loop, which makes bandwidth allocation and controller tuning difficult. In this paper, the coupled P2Ps problem is decoupled into a simplified P2Ps problem by deriving a necessary condition, as shown in Fig. 1.

Fig. 1. Formulation of the P2Ps tracking problem.

In the coupled single-loop design, path convergence requires that the following equations hold:

$$\begin{aligned} \lim _{t \rightarrow \infty } \Vert x(t) - x_{\text {ref}}(t)\Vert \rightarrow 0, \end{aligned}$$
(1a)
$$\begin{aligned} \lim _{t \rightarrow \infty } \Vert y(t) - y_{\text {ref}}(t)\Vert \rightarrow 0, \end{aligned}$$
(1b)
$$\begin{aligned} \lim _{t \rightarrow \infty } \Vert \theta (t) - \theta _{\text {ref}}(t)\Vert \rightarrow 0, \end{aligned}$$
(1c)
$$\begin{aligned} \lim _{t \rightarrow \infty } \Vert v(t) - v_{\text {ref}}(t)\Vert \rightarrow 0. \end{aligned}$$
(1d)

In the necessity derivation, we prove that the last two equations, i.e., Eqs. (1c) and (1d), are necessary conditions for the first two. The single-loop control scheme can then be transformed into a cascade control scheme.

3 Decoupling Control Scheme

In this section, we propose a decoupling control scheme with a translational regulator and an attitude regulator, as shown in Fig. 2. Because any rotation maneuver required to point the heading in the right direction, or to reach the yaw rate desired by the translational control, can be achieved quickly, the attitude regulator is given a higher control bandwidth than the translational regulator.

Fig. 2. Decoupling control scheme.

3.1 Translation Regulator

For a mobile vehicle with nonholonomic constraints, control actions affect not only the immediate result but also the next situation and, through it, all subsequent results; we call this the delayed effect of the actions. With a guidance law generating the transient profile, the dynamics controller can output more reasonable manipulation signals and provide smoother steering.

Control systems with a long control horizon and a short sampling step bear a significant computational burden. Owing to the decoupled design, the cascade control scheme allows different sampling steps and control horizons for the slow translational regulator and the fast attitude regulator. Although a linear tracking law in the translational regulator reduces the computational burden [1], it requires an explicit mechanism to handle large deviations from the reference path [5]. We instead formulate a nonlinear program (NLP) for the translational regulator as follows,

$$\begin{aligned} \text{ minimize } \quad \phi (\textbf{x}_T) + \int _{t}^{T}(q(\textbf{x}_{t}) + \frac{1}{2} \textbf{u}_{t}^T\textbf{R} \textbf{u}_{t})dt, \end{aligned}$$
(2a)
$$\begin{aligned} \text{ subject } \text{ to }\quad \dot{\textbf{x}} = \textbf{f}(\textbf{x}_{t}, \textbf{u}_{t}, t). \end{aligned}$$
(2b)
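
To make the structure of Eq. (2) concrete, the following Python sketch evaluates the finite-horizon cost of one candidate control sequence by rolling out an assumed discrete-time unicycle model; the dynamics, cost weights, and inputs are illustrative placeholders rather than the exact model used in this work.

```python
import numpy as np

def rollout_cost(x0, U, dt=0.1, R=None, q_weight=1.0):
    """Evaluate the finite-horizon cost of Eq. (2) for one control sequence.

    x0 : initial state [x, y, theta, v]
    U  : (T, 2) control sequence [acceleration, yaw rate] (assumed inputs)
    """
    R = np.eye(2) * 0.1 if R is None else R
    x = np.array(x0, dtype=float)
    cost = 0.0
    for u in U:
        # running cost: state-dependent term q(x) (here, distance to the
        # origin as a stand-in for path deviation) plus the control penalty
        cost += dt * (q_weight * (x[0] ** 2 + x[1] ** 2) + 0.5 * u @ R @ u)
        # assumed unicycle dynamics f(x, u); not the paper's exact model
        x = x + dt * np.array([x[3] * np.cos(x[2]),
                               x[3] * np.sin(x[2]),
                               u[1],
                               u[0]])
    return cost + x[0] ** 2 + x[1] ** 2  # terminal cost phi(x_T)
```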

Assume the dynamics in Eq. (2b) can be transformed into a system that is affine in the control input. Define the value function of the system states,

$$\begin{aligned} V(\textbf{x}_t) = \min _{\textbf{u}} \mathbb {E}_{\mathbb {Q}}[\phi (\textbf{x}_T) + \int _t^T(q(\textbf{x}_t) + \frac{1}{2}\textbf{u}(\textbf{x}_t)^T\textbf{R}\textbf{u}(\textbf{x}_t))dt]. \end{aligned}$$
(3)

The expectation of the second-order Taylor expansion of \( V(\textbf{x}_t) \) with respect to the state variable is as follows,

$$\begin{aligned} \mathbb {E}_{\mathbb {Q}}[V(\textbf{x}_k+\delta )] = V(\textbf{x}_k) + \varDelta (\textbf{f}^T+\textbf{u}^T\textbf{G}^T)V_x(\textbf{x}_k). \end{aligned}$$
(4)

For a sufficiently small sampling step \( \varDelta \), the Bellman principle can be used to approximate the value function with a recursive equation,

$$\begin{aligned} V(\textbf{x}_k, k) = \min _{\textbf{u}}\{ \varDelta (q(\textbf{x}_k) + \frac{1}{2}\textbf{u}^T\textbf{Ru}) + \mathbb {E}_{\mathbb {Q}}[V(\textbf{x}_k+\varDelta \textbf{f} + \varDelta \textbf{Gu}, k+1)] \}. \end{aligned}$$
(5)
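
For intuition, the recursion in Eq. (5) could in principle be evaluated directly by backward value iteration over a discretized state and control grid, as in the deterministic 1-D toy sketch below (the expectation is omitted and the dynamics are a placeholder); the exponential growth of such a grid with the state dimension is exactly the difficulty that motivates the path integral formulation that follows.

```python
import numpy as np

# Toy backward value iteration for Eq. (5) on a 1-D state grid.
dt, T = 0.1, 20
xs = np.linspace(-2.0, 2.0, 201)          # discretized state
us = np.linspace(-1.0, 1.0, 21)           # discretized control
R = 1.0
V = xs ** 2                               # terminal cost phi(x_T)
for _ in range(T):
    V_new = np.empty_like(V)
    for i, x in enumerate(xs):
        # one-step cost plus the value at the successor state (Eq. (5))
        x_next = x + dt * (-x + us)       # assumed control-affine dynamics f + G u
        stage = dt * (x ** 2 + 0.5 * R * us ** 2)
        V_new[i] = np.min(stage + np.interp(x_next, xs, V))
    V = V_new
```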

Substituting Eq. (4) into Eq. (5), the optimal control input can be derived under unconstrained conditions. Then, taking the limit with respect to time, we can derive the Hamilton-Jacobi-Bellman (HJB) equation as follows,

$$\begin{aligned} V_t(\textbf{x}_t,t) = q(\textbf{x}_t,t) + \textbf{f}(\textbf{x}_t,t)^TV_x - \frac{1}{2}V_x^T\textbf{GR}^{-1}\textbf{G}^TV_x. \end{aligned}$$
(6)

Normally, it is difficult to solve this backward PDE due to the curse of dimensionality. The path integral control scheme provides an elegant way to derive the optimal control distribution based on the Feynman-Kac lemma [3]. It allows the solution of the PDE to be represented as an expectation of a stochastic functional,

$$\begin{aligned} V(\textbf{x}_t, t) = - \lambda \log (\mathbb {E}_{\mathbb {P}}[\exp (-\frac{1}{\lambda }S(\tau ))]), \end{aligned}$$
(7)
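
In practice, the expectation in Eq. (7) is approximated with sampled trajectory costs; a minimal Monte Carlo sketch, with a log-sum-exp shift for numerical stability, could look as follows.

```python
import numpy as np

def value_estimate(S, lam):
    """Monte Carlo estimate of Eq. (7): V = -lambda * log E[exp(-S / lambda)].

    S   : array of path costs S(tau) from sampled trajectories
    lam : temperature lambda
    """
    S = np.asarray(S, dtype=float)
    S_min = S.min()                       # shift to avoid underflow in exp()
    return S_min - lam * np.log(np.mean(np.exp(-(S - S_min) / lam)))
```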

where \( S(\tau ) \) is the state-dependent cost function. Supposing that the probability density function \( q^*(\textbf{u}|U, \sigma ) \) of the optimal control distribution \( \mathbb {Q}^* \) is known, the optimal control input at each sampling step t can be generated by sampling from \( \mathbb {Q}^* \),

$$\begin{aligned} \textbf{u}_t^* = \mathbb {E}_{\mathbb {Q}^*}[\hat{\textbf{u}}_t] \quad \forall t \in \{0,1,\ldots ,T-1\}. \end{aligned}$$
(8)

However, the optimal control distribution cannot be provided explicitly, because precise environment models cannot be established for autonomous vehicles, so we cannot sample from it directly. Importance sampling (IS) can be used to sample from another, known distribution \( \mathbb {Q}_{\hat{U}, \sigma } \), and the Monte Carlo (MC) method can then provide an unbiased estimate of the optimal control sequence,

$$\begin{aligned} \begin{aligned} \mathbb {E}_{\mathbb {Q}^*}[\hat{\textbf{u}}_t] &= \int q^*(\textbf{u}|U, \sigma )\hat{\textbf{u}}_td\hat{\textbf{u}} \\ &= \int w(\textbf{u})q(\textbf{u}|\hat{U},\sigma )\hat{\textbf{u}}_td\hat{\textbf{u}} \\ &= \mathbb {E}_{\mathbb {Q}_{\hat{U},\sigma }}[w(\textbf{u})\hat{\textbf{u}}_t], \end{aligned} \end{aligned}$$
(9)

where \( w(\textbf{u}) = \frac{q^*(\textbf{u}|U, \sigma )}{q(\textbf{u}|\hat{U}, \sigma )} \) is the IS weight.
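
A minimal sketch of the estimator in Eq. (9), assuming K control sequences have been sampled from the proposal \( \mathbb {Q}_{\hat{U},\sigma } \) and their weights \( w(\textbf{u}) \) have been evaluated, is given below; the self-normalized form shown here is the variant commonly used in practice.

```python
import numpy as np

def is_estimate(u_samples, weights):
    """Importance-sampling estimate of the optimal control sequence, cf. Eq. (9).

    u_samples : (K, T, m) control sequences drawn from the proposal distribution
    weights   : (K,) importance weights w(u) of the sampled sequences
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                        # self-normalization of the weights
    return np.einsum('k,ktm->tm', w, u_samples)
```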

The optimal control distribution is given as follows [6],

$$\begin{aligned} q^*(\textbf{u}) = \frac{1}{\eta } \exp (-\frac{1}{\lambda }S(\textbf{u}))p(\textbf{u}), \quad \eta = \int \exp (-\frac{1}{\lambda }S(\textbf{u}))p(\textbf{u})d\textbf{u}, \end{aligned}$$
(10)

where \( p(\textbf{u}) = q(\textbf{u}|\tilde{U}, \sigma ) \) is the base distribution. Substituting Eq. (10) into the IS weight and discarding the terms that do not affect the optimization, we can estimate the IS weight as follows,

$$\begin{aligned} \begin{aligned} w(\textbf{u}) &= \frac{\exp (-\frac{1}{\lambda }S(\textbf{u}) - \sum _{t = 0}^{T-1}(\mathbf {\hat{u}}_t - \mathbf {\tilde{u}}_t)^T \sigma ^{-1}\textbf{v}_t)}{\int \exp (\sum _{t=0}^{T-1}((\mathbf {\hat{u}}_t - \mathbf {\tilde{u}}_t)^T \sigma ^{-1}\textbf{v}_t))q(\textbf{u}|\hat{U},\sigma )\exp (-\frac{1}{\lambda }S(\textbf{u}))} \\ &= \frac{\exp (-\frac{1}{\lambda }S(\textbf{u}) - \sum _{t = 0}^{T-1}(\mathbf {\hat{u}}_t - \mathbf {\tilde{u}}_t)^T \sigma ^{-1}\textbf{v}_t)}{\int q(\textbf{u}|\hat{U},\sigma )\exp (-\frac{1}{\lambda }S(\textbf{u}) - \sum _{t = 0}^{T-1}(\mathbf {\hat{u}}_t - \mathbf {\tilde{u}}_t)^T \sigma ^{-1}\textbf{v}_t)} \end{aligned} \end{aligned}$$
(11)
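
As a rough illustration of how Eq. (11) is typically evaluated in MPPI-style implementations, the sketch below computes the per-rollout exponent of the numerator and lets the denominator become a normalization over the K rollouts; the array shapes and the assumption that the injected noise \( \textbf{v}_t \) is stored during sampling are ours, not prescribed by the paper.

```python
import numpy as np

def mppi_weights(S, u_hat, u_tilde, v, sigma_inv, lam):
    """Importance weights of Eq. (11) over K sampled rollouts.

    S         : (K,) state-dependent path costs S(u)
    u_hat     : (K, T, m) sampled control sequences
    u_tilde   : (T, m) mean of the base distribution
    v         : (K, T, m) injected noise of each rollout (assumed available)
    sigma_inv : (m, m) inverse of the sampling covariance
    lam       : temperature lambda
    """
    # exponent of the numerator in Eq. (11) for each rollout
    correction = np.einsum('ktm,mn,ktn->k', u_hat - u_tilde, sigma_inv, v)
    cost = S / lam + correction
    cost -= cost.min()                     # shift for numerical stability
    w = np.exp(-cost)
    return w / w.sum()                     # denominator becomes a sum over rollouts

# The weighted control update then follows Eq. (9):
# u_star = np.einsum('k,ktm->tm', mppi_weights(...), u_hat)
```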

Computational cost evaluations of MPPI with the NLP formulation and of single-loop linear MPC are provided in Table 1.

Table 1. Computational Cost Evaluation

3.2 Attitude Regulator

In the decoupled cascade control scheme, the master controller, i.e., the translational regulator, provides the set point for the slave controller (the attitude regulator). The attitude regulator then manipulates the steering mechanism to guide the vehicle to the desired yaw rate and sideslip angle. Methods that do not consider the sideslip angle are prone to non-zero steady-state yaw error and poor performance when driving through tight-radius curves [2]; in these methods, the tangent direction of the reference path is commonly chosen as the desired heading angle of the attitude controller. We take the sideslip angle into account by designing an MPC-based output regulator and generating a transient profile for the output regulator to improve the transient response and driving comfort. The finite-time optimal control problem is transformed into a standard quadratic program (QP).
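
A minimal sketch of this condensation step, assuming a discrete linear prediction model \( x_{k+1} = Ax_k + Bu_k \), \( y_k = Cx_k \) (an illustrative placeholder, not the paper's vehicle model), stacks the predicted outputs and reduces the finite-horizon output-regulation problem to a standard QP in the input sequence.

```python
import numpy as np

def condense_qp(A, B, C, Q, R, x0, y_ref):
    """Condense a finite-horizon output-regulation MPC into a standard QP.

    Stacking Y = Phi x0 + Gamma U and minimizing
    sum ||y_k - y_ref_k||_Q^2 + ||u_k||_R^2 gives a cost of the form
    U^T H U / 2 + f^T U (up to a constant factor and offset).
    """
    N = len(y_ref)                           # prediction horizon
    n, m = B.shape
    p = C.shape[0]
    Phi = np.zeros((N * p, n))
    Gamma = np.zeros((N * p, N * m))
    for k in range(N):
        Phi[k*p:(k+1)*p] = C @ np.linalg.matrix_power(A, k + 1)
        for j in range(k + 1):
            Gamma[k*p:(k+1)*p, j*m:(j+1)*m] = (
                C @ np.linalg.matrix_power(A, k - j) @ B)
    Qbar = np.kron(np.eye(N), Q)
    Rbar = np.kron(np.eye(N), R)
    e = Phi @ x0 - np.concatenate(y_ref)     # predicted output error for U = 0
    H = Gamma.T @ Qbar @ Gamma + Rbar
    f = Gamma.T @ Qbar @ e
    return H, f

# Without input constraints the minimizer is U = -solve(H, f); with steering
# limits, H and f are handed to a QP solver instead.
```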

4 Results

A J-shaped path consisting of a 70 m straight line and an arc with a radius of 47.8 m is used as the reference path. The vehicle is controlled at a speed of 36 km/h. Figure 3 shows the simulation results of the proposed decoupled ORMPC (D-ORMPC), compared with error-based MPC (E-MPC) and its variant with a preview term of path curvature (PMPC).

Further comparisons of MPPI with the NLP are presented in Fig. 4. In all four driving scenarios, the two controllers share the same control parameters, which demonstrates that the proposed scheme achieves good parameter adaptability. The results also show that the tracking performance of MPPI deteriorates in tight-radius curves, mainly because it does not sample the action space sufficiently and its estimate of the state value is biased.

Fig. 3. Simulation results in the J-shaped path. (a) Steering rate. (b) Lateral jerk. (c) Lateral tracking error. The steering rate and lateral jerk results show that our method achieves smoother steering without sacrificing tracking accuracy.

Fig. 4. (a) Transient profile generated by the NLP in the sinusoidal path at one sampling interval. (b) Transient profile generated by MPPI. (c) Solving time and tracking error comparisons in different driving scenarios.