Neural Network Inverse Optimal Control of Ground Vehicles

Cespi, Riccardo; Di Gennaro, Stefano; Castillo-Toledo, Bernardino; Romero-Aragon, Jorge Carlos; Ramírez-Mendoza, Ricardo Ambrocio

doi:10.1007/s11063-023-11327-9

Open access
Published: 20 June 2023

Volume 55, pages 10287–10313, (2023)

Neural Processing Letters

Riccardo Cespi¹,
Stefano Di Gennaro²^na1,
Bernardino Castillo-Toledo³^na1,
Jorge Carlos Romero-Aragon⁴^na1 &
…
Ricardo Ambrocio Ramírez-Mendoza¹^na1


Abstract

This paper presents an active controller for ground vehicle stability. The controller forces the vehicle to track a desired reference, ensuring safe driving conditions when adhesion is lost during hazardous maneuvers. To this aim, a nonlinear discrete-time inverse optimal controller based on neural network identification is designed, using a recurrent high order neural network (RHONN) trained by an Extended Kalman Filter. The RHONN ensures stability of the identification error, while the controller ensures stability of the tracking errors. Moreover, a discrete-time reduced order state observer reconstructs the lateral vehicle dynamics, which are not usually available. For the control problem, the references for the lateral velocity and yaw rate are given by a dynamic system mimicking an ideal vehicle with non-decreasing tire lateral characteristics. The proposed approach avoids identifying Pacejka’s lateral tire parameters, which simplifies the determination of the control input. Moreover, an optimal control is proposed to optimize the actuator effort and power, which are usually bounded. Control gains are determined using “nature-inspired” optimization algorithms such as particle swarm optimization. Test maneuvers, performed with the full vehicle simulator CarSim^®, are used to assess the correctness, quality and performance of the observer, the neural identifier and the inverse optimal controller. Robustness of the reduced order discrete-time state observer is also discussed for different sample times. Finally, a fair comparison between optimal and non-optimal control schemes is presented, highlighting the numerical results obtained in simulation.


1 Introduction

In recent years, driving safety has been improved through the use of active actuators. Most applications rely on steer-by-wire systems, such as active front steering (AFS), to assist the driver in complex and hazardous maneuvers. In general, AFS improves vehicle agility, maneuverability and stability, as explained, for instance, in [1,2,3,4,5]. Driving safety can also be largely improved using the rear torque vectoring (RTV) technique; recent contributions on RTV control can be found in [6,7,8,9,10]. Combined AFS and RTV actions can be applied to ensure driving stability [11,12,13]. Along the same lines, this paper deals with the use of both AFS and RTV for the active control of the vehicle.

A major criticism of the plant-based control strategies used in [11,12,13] is that, when calculating the explicit expression of the control law that stabilizes the vehicle attitude, Pacejka’s parameters defining the lateral tire forces must be known at all times. This strong assumption is made, for instance, in [14, 15], where the authors assume the lateral tire forces to be known. However, such intrinsic parameters are subject to decay and deterioration, so their estimation remains an arduous task.

In the proposed control approach, a discrete-time reduced order observer reconstructs the vehicle lateral dynamics, which are not usually available, and a RHONN identifies the observed vehicle dynamics [16,17,18]. The synaptic weights provide neural adaptation, avoiding the need to know Pacejka’s tire parameters.

The RHONN weights are trained by an extended Kalman filter (EKF), yielding a model of the vehicle that is used to design an inverse optimal controller. The main advantage of this strategy is that, in the RHONN-based model, the AFS input appears linearly in the dynamics, and not implicitly in the tire characteristic [19]. This allows the AFS input to be calculated without inverting the tire model, which is not an obvious task since the tire model depends on experimental parameters and on the vehicle vertical dynamics. Furthermore, the RTV control law does not require the explicit expression of the lateral front and rear tire forces, which are usually not available. The originality and novelty of the proposed control method lie in the fact that the AFS, considered as a control input, is calculated without knowing Pacejka’s tire parameters. Moreover, since the RTV has a limited actuation, bounded by the vehicle speed and the yaw inertia of the vehicle, an optimal approach that minimizes the control effort is considered. These two aspects represent the main contributions of this paper. Another notable aspect is that the controller is determined here using the inverse optimal control technique [17, 18]. In the classical optimal control setting, a meaningful cost functional is given a priori and is then used to calculate the control law by solving a Hamilton–Jacobi–Bellman (HJB) equation, which is, in general, a difficult task. The inverse optimal control technique overcomes this problem by choosing a candidate Lyapunov function a priori, which is then used to calculate the control law and a meaningful cost functional [17, 18]. This scheme is proposed here to control the vehicle lateral and yaw dynamics in the case of drifting and adhesion loss, which are commonly considered dangerous situations.
A further advantage of this control technique is that it minimizes the actuator effort. It is worth noting that controllers with EKF identification have been used in real-time applications [20,21,22,23,24]. The availability of high-performance digital devices makes the implementation possible also in vehicles, which nowadays have enough computational power to perform all the required calculations efficiently. From a computational point of view, there exists another technique to solve the HJB equation for optimal control, explained in [25, 26], where the authors use adaptive dynamic programming (ADP) to calculate such a solution. A detailed comparison with that technique is not provided in this paper and is left to future work.

In the literature, the combined use of plant-based observers and neural network controllers is quite common, as for instance in [27], where the oxygen excess ratio in a polymer electrolyte membrane fuel cell is controlled. In [28, 29], a hybrid adaptive learning neural network control and a discrete-time adaptive neural network control are used to improve steer-by-wire systems, which are usually negatively affected by the friction torque and the self-aligning torque.

To validate the proposed controller, the CarSim^® platform is used to realistically mimic a vehicle performing a challenging ATI 90/90 steer maneuver. This platform was chosen because the software predicts the real vehicle response well, as validated by extensive experimental tests conducted by automotive companies such as Ford Motor Company and Chrysler, among others.

The paper is organized as follows. Section 2 introduces some preliminaries about neural networks, RHONN identification and inverse optimal control, whereas in Sect. 3 the proposed method is applied to a ground vehicle. In Sect. 4, quality and performance of the proposed controller are shown via simulations in CarSim^®. Some comments conclude the paper.

2 Recalls on Nonlinear Neural Network RHONN Identification and Discrete-Time Inverse Optimal Control for Trajectory Tracking

Given a generic multi-input multi-output (MIMO) discrete-time nonlinear system of the form

$$\begin{aligned} x_{k+1}=F(x_k,u_k) \end{aligned}$$

(1)

with $k\in \mathbb {N}=\{0,1,2\ldots \}$, $x_k=(x_{1,k}, \ldots , x_{n,k})^T\in \mathbb {R}^n$, $u_k=(u_{1,k}, \ldots , u_{m,k})^T\in \mathbb {R}^m$, and $F:\mathbb {R}^n\times \mathbb {R}^m\rightarrow \mathbb {R}^n$ an analytic vector field such that $F(0,0)=0$ [16], it can be approximated by a discrete RHONN [16,17,18]

$$\begin{aligned} x_{i,k+1}=w_{i,k}^T z_i(x_k,u_k) \qquad i=1,2,\ldots ,n. \end{aligned}$$

(2)

This result is particularly useful in some cases, e.g. when the parameters of the original system are not fully known. In (2), $x_{i,k}$ represents the state of the $i^{th}$ neuron, $w_{i,k}=(w_{i,1,k}, \ldots , w_{i,\ell _i,k})^T$, $i=1,\ldots ,n$, are the adjustable synaptic weights of the neural network, and $\ell _i$ is the number of high order connections. For $\ell _i$ sufficiently large, (2) approximates the system to be identified to any degree of accuracy. The $\ell _i$-dimensional vector $z_i$ is of the form

$$\begin{aligned} z_i(x_k,u_k)= \left( \begin{matrix}z_{i,1} \\ z_{i,2} \\ \vdots \\ z_{i,\ell _i}\end{matrix}\right) = \left( \begin{matrix}\prod \limits _{j\in I_1} \gamma ^{d_{i_j}(1)}_{i_j,k}\\ \prod \limits _{j\in I_2} \gamma ^{d_{i_j}(2)}_{i_j,k}\\ \vdots \\ \prod \limits _{j\in I_{\ell _i}} \gamma ^{d_{i_j}(\ell _i)}_{i_j,k}\end{matrix}\right) , \end{aligned}$$

(3)

for $i=1,2,\ldots ,n$ where $\{I_1, I_2, \ldots ,I_{\ell _i}\}$ is a collection of $\ell _i$ non-ordered subsets of $\{1,2,\ldots ,n+m\}$, and $d_{i_j}(1),\ldots ,d_{i_j}(\ell _i)$ are non-negative integers. Finally,

$$\begin{aligned} \gamma _{i,k}=\left( \begin{matrix}\gamma _{i,1,k} \\ \vdots \\ \gamma _{i,n,k} \\ \gamma _{i,n+1,k}\\ \vdots \\ \gamma _{i,n+m,k}\end{matrix}\right) =\left( \begin{matrix}s(x_{1,k}) \\ \vdots \\ s(x_{n,k}) \\ u_{1,k}\\ \vdots \\ u_{m,k}\end{matrix}\right) ,\quad i=1,\ldots ,n. \end{aligned}$$

(4)

with $\gamma _{i,j,k}$ either external inputs or states of neurons passed through a sigmoid function. The functions $s(x_{i,k})$, $i=1,\ldots ,n$, are typically sigmoidal monotone-increasing and differentiable functions, called activation functions, having the form

$$\begin{aligned} s(x_{i,k})=\frac{\alpha _i}{1+e^{-\beta _i x_{i,k}}}-\tau _i, \qquad i=1,\ldots ,n \end{aligned}$$

(5)

where $\alpha _i,\beta _i>0$ and $\tau _i\ge 0$ are constants. Commonly used sigmoid activation functions are the logistic function, obtained for $\alpha _i=\beta _i=1$, $\tau _i=0$, and the hyperbolic tangent function, obtained for $\alpha _i=\beta _i=2$, $\tau _i=1$.
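The two special cases of (5) can be checked numerically. The following is a minimal Python sketch; the function names `s` and `logistic` are illustrative choices and do not come from the paper.

```python
import math

def s(x, alpha=2.0, beta=2.0, tau=1.0):
    """Activation function (5): alpha / (1 + exp(-beta * x)) - tau."""
    return alpha / (1.0 + math.exp(-beta * x)) - tau

def logistic(x):
    # Special case alpha = beta = 1, tau = 0.
    return s(x, alpha=1.0, beta=1.0, tau=0.0)

# With alpha = beta = 2 and tau = 1, (5) coincides with tanh.
for x in (-1.5, 0.0, 0.3, 2.0):
    assert abs(s(x) - math.tanh(x)) < 1e-12
```

The identity holds because $2/(1+e^{-2x})-1=(1-e^{-2x})/(1+e^{-2x})=\tanh (x)$.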

In this paper we consider the particular case in which (2) is described by the discrete-time RHONN depicted in Fig. 1, and described by the following equation [30]:

$$\begin{aligned} x_{i,k+1}=w_{i,k}^T z_i(x_k)+w_i^{\circ T}u_k, \quad i=1,2,\ldots ,n \end{aligned}$$

(6)

where $w_i^{\circ T}$ is a constant synaptic weights vector, and the functions $\gamma _{i,k}$ are in the particular form

$$\begin{aligned} \gamma _{i,k}=\left( \begin{matrix}\gamma _{i,1,k} \\ \vdots \\ \gamma _{i,n,k}\end{matrix}\right) =\left( \begin{matrix}s(x_{1,k}) \\ \vdots \\ s(x_{n,k})\end{matrix}\right) . \end{aligned}$$

(7)

This last choice simplifies the calculation of the control signal needed to guarantee the closed loop performance.
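To make the structure of (3), (6) and (7) concrete, the sketch below builds the high-order vector $z_i$ for one neuron and evaluates one prediction step. It is only an illustration under stated assumptions: the activation is taken as $\tanh$, and the index sets, exponents, weights and input values are hypothetical numbers, not those used in the paper.

```python
import math

def high_order_terms(x, index_sets, exponents):
    """Build z_i as in (3), with gamma = s(x) as in (7) and s = tanh.

    index_sets[p] lists the state indices in I_p; exponents[p][q] is the
    non-negative integer power applied to the q-th index of I_p.
    """
    gamma = [math.tanh(v) for v in x]
    z = []
    for idx, powers in zip(index_sets, exponents):
        term = 1.0
        for j, d in zip(idx, powers):
            term *= gamma[j] ** d
        z.append(term)
    return z

def rhonn_step(w, w_in, z, u):
    """One neuron of (6): x_{i,k+1} = w^T z_i(x_k) + w_in^T u_k."""
    return sum(a * b for a, b in zip(w, z)) + sum(a * b for a, b in zip(w_in, u))

# Hypothetical example: 2 states, 1 input, two high-order terms.
x_k = [0.4, -0.2]
z_1 = high_order_terms(x_k, index_sets=[[0], [0, 1]], exponents=[[1], [1, 2]])
x1_next = rhonn_step(w=[0.9, 0.1], w_in=[0.05], z=z_1, u=[1.0])
```

Because the input enters through the separate term `w_in`, the prediction is affine in `u`, which is the property exploited later for the control design.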

Let us now denote by $w_i^*$, $w_i^{\circ *}$, $i=1,\ldots ,n$, the constant (unknown) weights minimizing, on a fixed compact set, the norm of the identification error between (6) and the system to be identified [17]. Therefore, considering the approximation errors

$$\begin{aligned} \epsilon _{i,k}=\Big (w_{i,k}-w_i^*\Big )^T z_i(x_k) + \Big (w_i^\circ -w_i^{\circ *}\Big )^T u_k, \end{aligned}$$

(8)

for $i=1,\ldots ,n$ one rewrites (6) as

$$\begin{aligned} x_{i,k+1}=w_i^{*T} z_i(x_k)+w_i^{\circ * T} u_k+\epsilon _{i,k}, \qquad i=1,\ldots ,n. \end{aligned}$$

(9)

For (9) one can consider a RHONN identifier

$$\begin{aligned} {\hat{x}}_{i,k+1}={\hat{w}}_{i,k}^T z_i({\hat{x}}_k)+ w_i^{\circ * T}u_k, \quad i=1,2,\ldots ,n. \end{aligned}$$

(10)

with ${\hat{x}}_k$ the estimate of $x_k$, ${\hat{w}}_{i,k}$ the estimate of $w_i^*$. Furthermore, in (10) it is assumed that the value of $w_i^{\circ *}$ can be estimated off-line. This can be done for a large class of systems in affine form since $w_{i,k}^\circ $ is constant. The RHONN weight estimation error is

$$\begin{aligned} {{\tilde{w}}}_{i,k}=w_i^*-{\hat{w}}_{i,k},\qquad i=1,\ldots ,n,\ \forall \>k\in \mathbb {N} \end{aligned}$$

(11)

and its dynamics are

$$\begin{aligned} {\tilde{w}}_{i,k+1}-{\tilde{w}}_{i,k}={\hat{w}}_{i,k}-{\hat{w}}_{i,k+1}, \qquad i=1,\ldots ,n,\ \forall \>k\in \mathbb {N} \end{aligned}$$

(12)

since $w_i^*$ is constant.

2.1 The EKF Training Algorithm

For the on-line learning process of the RHONN weights of (10), one can use a modified version of the well-known EKF algorithm [31, 32], in which the weights become the states to be estimated. The main objective of the EKF is to find the optimal values for the weight vector ${\hat{w}}_{i,k}^T$ such that the identification errors

$$\begin{aligned} e_{i,k}=x_{i,k}-{\hat{x}}_{i,k}, \qquad i=1,\dots , n \end{aligned}$$

(13)

are minimized. The EKF solution to the training problem is [31, 32]

$$\begin{aligned} {\hat{w}}_{i,k+1}={\hat{w}}_{i,k}+\eta _{i,k} K_{i,k} e_{i,k} \qquad i=1,\dots , n \end{aligned}$$

(14)

where

$$\begin{aligned} K_{i,k}=P_{i,k} H_{i,k} M_{i,k}\in \mathbb {R}^{~\ell _i\times m} \end{aligned}$$

(15)

is the Kalman gain vector, $i=1,\dots ,n$, and $\eta _{i,k}\in [0,1]$ is the learning rate. Here $P_{i,k}\in \mathbb {R}^{\ell _i\times \ell _i}$ is the covariance matrix associated with the prediction error, defined as

$$\begin{aligned} P_{i,k+1}=P_{i,k} -K_{i,k} H_{i,k} ^TP_{i,k} +Q_{i,k} \end{aligned}$$

(16)

for $i=1,\dots , n$, with $Q_{i,k}\in \mathbb {R}^{\ell _i\times \ell _i}$ the covariance matrix associated with the state noise. Moreover, the global scaling matrix $M_{i,k}$ is given by

$$\begin{aligned} M_{i,k}=\Big (R_{i,k} +H_{i,k} ^TP_{i,k} H_{i,k}\Big )^{-1} \end{aligned}$$

(17)

for $i=1,\dots , n$, where $R_{i,k}\in \mathbb {R}$, and $H_{i,k}\in \mathbb {R}^{~\ell _i\times m}$ is a matrix for which each entry

$$\begin{aligned} h_{i,j,k}=\bigg (\frac{\partial {{\hat{x}}_{i,k}}}{\partial {\hat{w}}_{i,j,k}}\bigg )_{{\hat{w}}_{i,k}={\hat{w}}_{i,k+1}}, \qquad i=1,\ldots ,n, \quad j=1,\ldots ,\ell _i, \end{aligned}$$

(18)

is the derivative of one neural network output ${\hat{x}}_{i,k}$ with respect to one neural network weight ${\hat{w}}_{i,j}$. Note that $H_{i,k}$, $K_{i,k}$, and $P_{i,k}$ are bounded [33]. The dynamics of (11) can be expressed as

$$\begin{aligned} {\tilde{w}}_{i,k+1}={\tilde{w}}_{i,k}-\eta _{i,k} K_{i,k}e_{i,k} \end{aligned}$$

(19)

On the other hand, the dynamics of (13) are

$$\begin{aligned} e_{i,k+1}={\tilde{w}}_{i,k}^T z_i({\hat{x}}_k)+\epsilon _{i,k}. \end{aligned}$$

(20)

For the error dynamics (20) we will introduce a stability property, given in the following definition.

Definition 2.1

The solutions of a system $x_{k+1}=\phi (x_k)$ are Semi-Globally Uniformly Ultimately Bounded (SGUUB) if for any compact set $\varOmega $ and any initial condition $x_{k_0}\in \varOmega $, there exist an $\epsilon >0$ and a number $N(\epsilon ,x_{k_0})$ such that $\Vert x_k\Vert <\epsilon $, $\forall \>k\ge k_0+N$.$\diamond $

It is worth noting that, whereas it will be proven that $\tilde{w}_{i,k}$ and $e_{i,k}$ are stable, the approximation error $\epsilon $ cannot be given a priori, since it depends on the accuracy of the neural model chosen by the control designer. In this sense, a heuristic trial-and-error procedure is repeated until $\epsilon $ is acceptable.
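The EKF training step (14)–(17) can be sketched for a single neuron as follows. This is a simplified illustration, not the authors' implementation: it assumes a scalar output per neuron, so that $H_{i,k}$ is a vector and $M_{i,k}$ a scalar, and it uses the fact that (10) is linear in the weights, so that $H_{i,k}$ equals the regressor $z_i({\hat{x}}_k)$. The values of `eta`, `R` and `Q_diag` are hypothetical tuning choices.

```python
def ekf_update(w, P, z, e, eta=0.9, R=1.0, Q_diag=1e-3):
    """One EKF training step for a single neuron with scalar output.

    w: weight estimates (length l); P: l x l covariance (list of lists);
    z: regressor z_i(x_hat_k), used as H because (10) is linear in the
    weights; e: identification error (13).
    eta, R and Q_diag are illustrative values, not taken from the paper.
    """
    n = len(w)
    PH = [sum(P[r][c] * z[c] for c in range(n)) for r in range(n)]   # P H
    M = 1.0 / (R + sum(z[r] * PH[r] for r in range(n)))              # (17)
    K = [PH[r] * M for r in range(n)]                                # (15)
    w_new = [w[r] + eta * K[r] * e for r in range(n)]                # (14)
    HtP = [sum(z[r] * P[r][c] for r in range(n)) for c in range(n)]  # H^T P
    P_new = [[P[r][c] - K[r] * HtP[c] + (Q_diag if r == c else 0.0)
              for c in range(n)] for r in range(n)]                  # (16)
    return w_new, P_new

def predict(w, z):
    return sum(a * b for a, b in zip(w, z))

# Toy usage with hypothetical numbers: one update shrinks the error
# for the regressor that has just been processed.
w_true = [0.5, -0.3]
w_hat = [0.0, 0.0]
P = [[10.0, 0.0], [0.0, 10.0]]
z = [0.8, -0.4]
e_before = predict(w_true, z) - predict(w_hat, z)
w_hat, P = ekf_update(w_hat, P, z, e_before)
e_after = predict(w_true, z) - predict(w_hat, z)
```

In this scalar-output setting the error for the same regressor is scaled by $1-\eta \,z^TPz/(R+z^TPz)$, a factor between 0 and 1 whenever $P>0$ and $z\ne 0$.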

2.2 Discrete-Time Inverse Optimal Control for Trajectory Tracking

The analysis of the inverse optimal control for trajectory tracking will be performed for input-affine systems

$$\begin{aligned} {\hat{x}}_{k+1}=f({\hat{x}}_k)+g({\hat{x}}_k)u_k \end{aligned}$$

(21)

with the following associated cost functional

$$\begin{aligned} \mathcal{J}(\xi _k)=\sum _{j=k}^{\infty } \Big (l(\xi _j)+u_j^T R(\xi _j)u_j\Big ) \end{aligned}$$

(22)

where $\xi _k={\hat{x}}_k-x_{k,\textrm{ref}}$ is the tracking error between the neural network state vector ${\hat{x}}_k$ and the desired trajectory $x_{k,\textrm{ref}}$. Furthermore, $l(\xi _j):\mathbb {R}^n\rightarrow \mathbb {R}^+$ is a positive semidefinite function, and $R(\xi _j):\mathbb {R}^n\rightarrow \mathbb {R}^{m\times m}$ is a real, symmetric, positive definite weighting matrix. For the sake of simplicity, in this work the elements of $R(\xi _j)$ will be taken constant, namely $R(\xi _j)=R$ [18]. The cost functional (22) can be rewritten as

$$\begin{aligned} \mathcal{J}(\xi _k)=l(\xi _k)+u_k^T R u_k+\sum _{j=k+1}^{\infty } \Big (l(\xi _j)+u_j^TRu_j\Big )=l(\xi _k)+u_k^T R u_k+\mathcal{J}(\xi _{k+1}) \end{aligned}$$

(23)

where, without loss of generality, one requires that $\mathcal{J}(\xi _0)=0$.

The existence of a control $u_k$ ensuring that $\mathcal{J}(\xi _k)$ in (22) is finite can be stated in terms of the existence of a Lyapunov function $V(\xi _k)$. In fact, following [34,35,36,37,38] and Bellman’s optimality principle [39, 40], one looks for a Lyapunov function $V(\xi _k)$ that, denoting by $u_k^*$ the optimal control which minimizes $V(\xi _k)$, and with $V^*(\xi _k)=V(\xi _k)\mid _{u_k^*}$, satisfies:

$$\begin{aligned} V^*(\xi _k)=l(\xi _k)+u_k^{*T} R u_k^*+V^*(\xi _{k+1}). \end{aligned}$$

(24)

Here $u_k^*$ can be determined by setting to zero the gradient of the right-hand side of (24) with respect to $u_k$, evaluated at $u_k=u_k^*$ [41,42,43]

$$\begin{aligned} 0= & {} 2Ru_k^*+\bigg (\frac{\partial V^*(\xi _{k+1})}{\partial u_k}\bigg )^T=2 R u_k^*+\bigg (\frac{\partial \xi _{k+1}}{\partial u_k}\bigg )^T \bigg (\frac{\partial V^*(\xi _{k+1})}{\partial \xi _{k+1}}\bigg )^T\nonumber \\= & {} 2 R u_k^*+g^T({\hat{x}}_k)\bigg (\frac{\partial V^*(\xi _{k+1})}{\partial \xi _{k+1}}\bigg )^T \end{aligned}$$

(25)

so obtaining the controller that globally stabilizes the tracking error $\xi _k$ and minimizes the cost functional (22)

$$\begin{aligned} u_k^*=-\frac{1}{2}R^{-1}g^T({\hat{x}}_k)\bigg (\frac{\partial V^*(\xi _{k+1})}{\partial \xi _{k+1}}\bigg )^T \end{aligned}$$

(26)

with the condition $V(\xi _0)=0$. These considerations justify the following definition.

Definition 2.2

(Inverse Optimal Control for Trajectory Tracking). The control (26) is a global inverse optimal controller for trajectory tracking if:

(i) It guarantees global asymptotic stability of the tracking error $\xi _k={\hat{x}}_k-x_{k,\textrm{ref}}$;
(ii) $V(\xi _k)$ is a radially unbounded positive definite function such that
$$\begin{aligned} {\overline{V}}=V^*(\xi _{k+1})-V^*(\xi _k)+u^{*T}_k R u_k^*\le 0. \diamond \end{aligned}$$

When one selects $l(\xi _k)=-{\overline{V}}$, then $V(\xi _k)$ is a solution of the HJB equation:

$$\begin{aligned} l(\xi _k)&+V(\xi _{k+1})-V(\xi _k)+\frac{1}{4}\frac{\partial V^T(\xi _{k+1})}{\partial \xi _{k+1}}g({\hat{x}}_k)R^{-1}g^T({\hat{x}}_k)\frac{\partial V(\xi _{k+1})}{\partial \xi _{k+1}}=0. \end{aligned}$$

(27)

To calculate the inverse optimal control, let us consider a candidate Lyapunov function of the form

$$\begin{aligned} V(\xi _k)=\frac{1}{2}\xi ^T_kP\xi _k, \qquad P=P^T>0. \end{aligned}$$

(28)

If (i) and (ii) are satisfied, the control law (26) is

$$\begin{aligned} u_k^*= & {} -\frac{1}{2}R^{-1}g^T({\hat{x}}_k)P\xi _{k+1}=-\frac{1}{2}(R+P_2)^{-1}P_{1,k} \end{aligned}$$

(29)

with $P_{1,k}=g^T({\hat{x}}_k)P(f({\hat{x}}_k)-x_{k+1,\textrm{ref}})$ and $P_2=\frac{1}{2}g^T({\hat{x}}_k)Pg({\hat{x}}_k)$. Since $P$ and $R$ are symmetric positive definite matrices, $P_2$ is positive semidefinite and $R+P_2$ is positive definite, so the inverse in (29) exists.
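The second equality in (29) can be verified in a few steps. The derivation below assumes that the tracking error at step $k+1$ follows from (21) as $\xi _{k+1}=f({\hat{x}}_k)+g({\hat{x}}_k)u_k^*-x_{k+1,\textrm{ref}}$. Substituting this expression into the first equality of (29) and collecting the terms in $u_k^*$ gives

$$\begin{aligned} u_k^*&=-\frac{1}{2}R^{-1}g^T({\hat{x}}_k)P\Big (f({\hat{x}}_k)-x_{k+1,\textrm{ref}}\Big )-\frac{1}{2}R^{-1}g^T({\hat{x}}_k)Pg({\hat{x}}_k)u_k^* \\ \Big (R+P_2\Big )u_k^*&=-\frac{1}{2}P_{1,k} \\ u_k^*&=-\frac{1}{2}(R+P_2)^{-1}P_{1,k}, \end{aligned}$$

where the second line is obtained by multiplying the first by $R$ and using the definitions of $P_{1,k}$ and $P_2$.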

It is worth noting that the inverse optimal control method is derived on the same basis as optimal control theory, with the difference that the HJB equation is not solved first: the optimal cost function is replaced by a candidate Lyapunov function, known a priori, from which the control law is then calculated.
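The control law (29) can be sketched in code as follows. This is a minimal pure-Python illustration under stated assumptions: the helper functions `matmul`, `transpose` and `solve`, as well as all numerical values, are hypothetical and serve only to show the computation; they are not taken from the paper.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[r]) + [b[r]] for r in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def inverse_optimal_control(f_x, g_x, x_ref_next, P, R):
    """Control law (29): u* = -1/2 (R + P_2)^{-1} P_1.

    f_x: f(x_hat_k) (length n); g_x: g(x_hat_k) (n x m);
    x_ref_next: x_{k+1,ref}; P (n x n), R (m x m) symmetric positive definite.
    """
    m = len(R)
    gT_P = matmul(transpose(g_x), P)                              # g^T P
    d = [[fi - ri] for fi, ri in zip(f_x, x_ref_next)]            # f - x_ref
    P1 = [row[0] for row in matmul(gT_P, d)]                      # P_{1,k}
    P2 = [[0.5 * v for v in row] for row in matmul(gT_P, g_x)]    # P_2
    A = [[R[i][j] + P2[i][j] for j in range(m)] for i in range(m)]
    return [-0.5 * ui for ui in solve(A, P1)]

# Hypothetical numbers, for illustration only.
f_x = [0.2, -0.1]
g_x = [[1.0, 0.0], [0.5, 2.0]]
x_ref_next = [0.0, 0.05]
P = [[2.0, 0.5], [0.5, 1.0]]
R = [[1.0, 0.0], [0.0, 0.1]]
u = inverse_optimal_control(f_x, g_x, x_ref_next, P, R)

# Consistency check: u must also satisfy R u + 1/2 g^T P xi_{k+1} = 0.
xi_next = [f_x[i] + sum(g_x[i][j] * u[j] for j in range(2)) - x_ref_next[i]
           for i in range(2)]
gT_P_xi = matmul(matmul(transpose(g_x), P), [[v] for v in xi_next])
residual = [sum(R[i][j] * u[j] for j in range(2)) + 0.5 * gT_P_xi[i][0]
            for i in range(2)]
```

The residual check confirms numerically that the closed-form expression and the implicit expression in (29) describe the same control.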

3 The Discrete-Time Inverse Optimal Control for Trajectory Tracking of Ground Vehicles

In this section we apply the previous results of neural identification and inverse optimal control to a ground vehicle represented by CarSim^®. The control scheme is described in Fig. 2, in which the steering wheel angle $\delta _{d,k}$, the longitudinal and lateral accelerations $a_{x,k}$, $a_{y,k}$, the longitudinal velocity $v_{x,k}$ and the yaw rate $\omega _{z,k}$ are measured from CarSim^®.

A discrete-time reduced-order state observer estimates the lateral vehicle velocity ${\tilde{v}}_{y,k}$, which is, in general, not known. A neural identifier then provides an input-affine model, avoiding the hard task of inverting the lateral tire characteristic when deriving the control laws. The synaptic weights $w_{i,j,k}$ are adjusted on-line by the Extended Kalman Filter so as to minimize the identification errors ${\hat{e}}_{i,k}$. Finally, the inverse optimal controller, based on the neural model ${\hat{v}}_{x,k}, {\hat{v}}_{y,k}, {\hat{\omega }}_{z,k}$ and on the references $v_{y,k,\textrm{ref}}, \omega _{z,k,\textrm{ref}}$ and their increments, provides the AFS input $\delta _{c,k}$ and the RTV input $M_{z,k}$, which are the CarSim^® control inputs. In this work, an observer, which estimates the unknown vehicle dynamics, is combined with a neural identifier, which yields an input-affine model in which the role of Pacejka’s parameters is taken by the adaptive synaptic weights. This strategy ensures global exponential stability of the estimation error, given by the observer, and practical stability of the identification error, guaranteed by the identifier. It is the neural identifier, which adapts the synaptic weights under the Kalman learning rules, that makes it possible to avoid using Pacejka’s parameters of the lateral tire forces. The identifier works under the hypothesis that all the dynamics to be identified are known. To this aim, a discrete-time observer is presented to reconstruct the unavailable vehicle lateral dynamics.

CarSim^® realistically mimics the vehicle dynamics. The model contains many dynamics, which describe the complex behavior of the vehicle. However, for vehicles with a low center of gravity, the essential dynamics describing the vehicle attitude are given by the longitudinal and lateral velocities and the yaw rate. This is well described by the so-called single-track vehicle model shown in Fig. 2, which is very often used to design active controllers for ground vehicles [44,45,46].

The interested reader can find in [2] a discrete-time version of such a model, obtained by means of a variational integrator (known as symplectic Euler), and representing the discrete-time version of the single-track model. Even if this model ensures better performance for (relatively) high sampling periods, a more popular model is the Euler approximation of the single-track model:

$$\begin{aligned} x_{k+1}= & {} \left( \begin{matrix}v_{x,k+1} \\ v_{y,k+1} \\ \omega _{z,k+1}\end{matrix}\right) =\left( \begin{matrix}v_{x,k}+T\bigg ( v_{y,k}\omega _{z,k} +\dfrac{\mu _x}{m} \Big (F_{x,f}(\lambda _{f,k})+F_{x,r}(\lambda _{r,k})\Big )\bigg )\\ v_{y,k}+T\bigg (-v_{x,k}\omega _{z,k} +\dfrac{\mu _y}{m} \Big (F_{y,f}(\alpha _{f,k})+F_{y,r}(\alpha _{r,k})\Big )\bigg )\\ \omega _{z,k}+T\bigg (\dfrac{\mu _y}{J_z}\Big (F_{y,f}(\alpha _{f,k}) l_f-F_{y,r}(\alpha _{r,k})l_r\Big )+\dfrac{M_{z,k}}{J_z}\bigg )\end{matrix}\right) \end{aligned}$$

(30)

where T is the sampling period, $v_{x,k}$, $v_{y,k}$, $\omega _{z,k}$ are the vehicle longitudinal, lateral, and yaw velocities, and $F_{y,f}$, $F_{y,r}$ are the lateral forces which depend on the tire slip angles $\alpha _{f,k}=\delta _{d,k}+\delta _{c,k}-(v_{y,k}+l_f\omega _{z,k})/v_{x,k}$, $\alpha _{r,k}=-(v_{y,k}-l_r\omega _{z,k})/v_{x,k}$, where $\delta _{d,k}$ is the driver steering wheel angle, and $\delta _{c,k}$ is the AFS input.

Furthermore, $M_{z,k}$ is the RTV input, and $F_{x,f}$, $F_{x,r}$ are the longitudinal forces, depending on the front/rear tire slips $\lambda _{f,k}=1-\omega _{w,f,k}R_w/v_{x,k}$, $\lambda _{r,k}=1-\omega _{w,r,k}R_w/v_{x,k}$, where $\omega _{w,f,k}$, $\omega _{w,r,k}$ are the front/rear wheel angular velocities, and $R_w$ is the wheel radius. Finally, m and $J_z$ are the vehicle mass and moment of inertia, $l_f$, $l_r$ are the distances of the front and rear axles from the center of gravity, and $\mu _x$, $\mu _y$ are the longitudinal and lateral tire-road friction coefficients.
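One Euler step of the single-track model (30) can be sketched as follows. The tire force functions are passed in as callables because their expressions are not fixed at this point of the text; the linear placeholder forces, the parameter container `params` and all numerical values are hypothetical and chosen only for illustration.

```python
def single_track_step(x, delta_d, delta_c, M_z, lam_f, lam_r, p, F_x, F_y, T=0.001):
    """One Euler step of (30), with x = (v_x, v_y, omega_z).

    p: dict with keys m, J_z, l_f, l_r, mu_x, mu_y (hypothetical container);
    F_x, F_y: dicts of callables {'f': ..., 'r': ...} for the tire forces.
    """
    v_x, v_y, w_z = x
    # Front and rear slip angles as defined after (30).
    alpha_f = delta_d + delta_c - (v_y + p["l_f"] * w_z) / v_x
    alpha_r = -(v_y - p["l_r"] * w_z) / v_x
    Fyf, Fyr = F_y["f"](alpha_f), F_y["r"](alpha_r)
    Fxf, Fxr = F_x["f"](lam_f), F_x["r"](lam_r)
    v_x_next = v_x + T * (v_y * w_z + p["mu_x"] / p["m"] * (Fxf + Fxr))
    v_y_next = v_y + T * (-v_x * w_z + p["mu_y"] / p["m"] * (Fyf + Fyr))
    w_z_next = w_z + T * (p["mu_y"] / p["J_z"] * (Fyf * p["l_f"] - Fyr * p["l_r"])
                          + M_z / p["J_z"])
    return (v_x_next, v_y_next, w_z_next)

# Hypothetical parameters and linear placeholder tire forces (not from the paper).
params = {"m": 1500.0, "J_z": 2000.0, "l_f": 1.2, "l_r": 1.4, "mu_x": 1.0, "mu_y": 1.0}
F_y = {"f": lambda a: 80000.0 * a, "r": lambda a: 90000.0 * a}
F_x = {"f": lambda lam: 0.0, "r": lambda lam: 0.0}
x_next = single_track_step((20.0, 0.0, 0.0), delta_d=0.02, delta_c=0.0, M_z=0.0,
                           lam_f=0.0, lam_r=0.0, p=params, F_x=F_x, F_y=F_y, T=0.001)
```

Note that the AFS input `delta_c` enters only through the slip angle `alpha_f`, that is, inside the tire force function. This is the implicit dependence that the neural model is meant to replace with a linear one.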

3.1 The Control Problem

As already noted, the use of the AFS and the RTV allows us to track given references for the lateral velocity $v_{y,k,\textrm{ref}}$ and the yaw rate $\omega _{z,k,\textrm{ref}}$. Then, the control problem can be defined as follows: given bounded references $v_{y,k,\textrm{ref}}$ and $\omega _{z,k,\textrm{ref}}$, with bounded increments, determine a controller $u_k=\alpha _k({\hat{x}}_k,x_{k,\textrm{ref}})$ such that the tracking errors $e_{v_{y,k}}=v_{y,k}-v_{y,k,\textrm{ref}}$, $e_{\omega _{z,k}}=\omega _{z,k}-\omega _{z,k,\textrm{ref}}$ satisfy

$$\begin{aligned} \lim \limits _{k\rightarrow \infty }e_{v_{y,k}}=0, \qquad \lim \limits _{k\rightarrow \infty }e_{\omega _{z,k}}=0. \end{aligned}$$

Moreover, when applying control strategies for vehicle stability, not all the state measurements are available from the vehicle. Therefore, to avoid an extensive use of sensors, we present a discrete-time reduced-order state observer for the reconstruction of the vehicle lateral velocity ${\tilde{v}}_{y,k}$.

Making reference to the control scheme in Fig. 2, the tracking errors $e_{v_{y,k}}$, $e_{\omega _{z,k}}$ can then be bounded as follows:

$$\begin{aligned} \begin{aligned} \Vert e_{v_{y,k}}\Vert&\le \Vert v_{y,k}-{\tilde{v}}_{y,k}\Vert +\Vert {\tilde{v}}_{y,k}-{\hat{v}}_{y,k}\Vert +\Vert {\hat{v}}_{y,k}-v_{y,k,\textrm{ref}}\Vert \\ \Vert e_{\omega _{z,k}}\Vert&\le \Vert \omega _{z,k}-{\hat{\omega }}_{z,k}\Vert +\Vert \hat{\omega }_{z,k}-\omega _{z,k,\textrm{ref}}\Vert . \end{aligned} \end{aligned}$$

(31)

Thus, the trajectory tracking problem of the desired trajectories can be split into three requirements:

$$\begin{aligned}&1.\quad \lim \limits _{k\rightarrow \infty }\Vert v_{x,k}-\tilde{v}_{x,k}\Vert =0, \lim \limits _{k\rightarrow \infty }\Vert v_{y,k}-\tilde{v}_{y,k}\Vert =0\nonumber \\&2.\quad \lim \limits _{k\rightarrow \infty }\Vert \tilde{v}_{x,k}-{\hat{v}}_{x,k}\Vert \le \varepsilon _{e_1}, \lim \limits _{k\rightarrow \infty }\Vert {\tilde{v}}_{y,k}-{\hat{v}}_{y,k}\Vert \le \varepsilon _{e_2},\nonumber \\&\qquad \quad \lim \limits _{k\rightarrow \infty }\Vert \omega _{z,k}-\hat{\omega }_{z,k}\Vert \le \varepsilon _{e_3}\nonumber \\&3.\quad \lim \limits _{k\rightarrow \infty }\Vert {\hat{v}}_{y,k}-v_{y,k,\textrm{ref}}\Vert =0, \lim \limits _{k\rightarrow \infty }\Vert \hat{\omega }_{z,k}-\omega _{z,k,\textrm{ref}}\Vert =0 \end{aligned}$$

(32)

with $\varepsilon _{e_1}, \varepsilon _{e_2}, \varepsilon _{e_3}>0$ fixed bounds for the norms of the identification errors. The asymptotic stability of the estimation error stated in the first condition is ensured by the reduced-order state observer presented in Sect. 3.2. The practical stability of the identification error required by the second condition is guaranteed by the RHONN identifier introduced in Sect. 3.4, whereas the reference tracking required by the third condition is achieved by the discrete-time controller discussed in Sect. 3.5, developed with the inverse optimal control technique. Finally, Sect. 3.3 shows how to generate safe references for the vehicle attitude.

3.2 Discrete-Time Reduced-Order State Observer

From the mathematical model (30), in order to estimate the lateral velocity $v_{y,k}$, we present the following reduced order state observer:

$$\begin{aligned} {\tilde{v}}_{x,k+1}= & {} {\tilde{v}}_{x,k}+T(\tilde{v}_{y,k}\omega _{z,k} +a_{x,k})+k_{o,1}(v_{x,k}-{\tilde{v}}_{x,k})\nonumber \\ {\tilde{v}}_{y,k+1}= & {} {\tilde{v}}_{y,k}+T(-{\tilde{v}}_{x,k}\omega _{z,k} +a_{y,k})+k_{o,2}(v_{x,k}-{\tilde{v}}_{x,k}) \end{aligned}$$

(33)

where $a_{x,k}, a_{y,k}$ are the vehicle longitudinal and lateral accelerations supposed to be known.
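One step of the observer (33) can be sketched as follows. In this illustration the gains are simply passed in as arguments; the function name, the gain values and the measurements used in the example are hypothetical.

```python
def observer_step(v_x_tilde, v_y_tilde, v_x_meas, omega_z, a_x, a_y, k_o1, k_o2, T):
    """One step of the reduced-order observer (33).

    Both corrections are driven by the measurable error v_x - v_x_tilde.
    """
    innov = v_x_meas - v_x_tilde
    v_x_next = v_x_tilde + T * (v_y_tilde * omega_z + a_x) + k_o1 * innov
    v_y_next = v_y_tilde + T * (-v_x_tilde * omega_z + a_y) + k_o2 * innov
    return v_x_next, v_y_next

# Hypothetical values: straight driving, 1 m/s initial error on v_x.
est = observer_step(v_x_tilde=19.0, v_y_tilde=0.0, v_x_meas=20.0, omega_z=0.0,
                    a_x=0.0, a_y=0.0, k_o1=0.5, k_o2=0.1, T=0.01)
```

The lateral velocity is never measured directly: its estimate is corrected only through the longitudinal velocity error, which is why the choice of the gains matters for convergence.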

For the observer (33), let us state the following:

Theorem 3.1

The reduced order state observer (33) with $k_{o,1}$ and $k_{o,2}$ such that:

$$\begin{aligned} \left( \begin{array}{cc}k_{o,1} \\ k_{o,2}\end{array}\right) =\left( \begin{array}{cc}\frac{-b\pm \sqrt{b^2-4ac}}{2a}\\ \frac{k_{o,1}(\kappa \mathcal{S}_{\omega _{z,k}}-2T\omega _{z,k})+\kappa T^2\omega _{z,k}^2\mathcal{S}_{\omega _{z,k}}}{d}\end{array}\right) \end{aligned}$$

(34)

with:

$$\begin{aligned} a= & {} \frac{(\kappa \mathcal{S}_{\omega _{z,k}}-2T\omega _{z,k})^2}{d^2}+\frac{2\kappa T \mid \omega _{z,k}\mid -\kappa ^2}{d}+1\nonumber \\ b= & {} \frac{2T^2\omega _{z,k}^2\kappa ^2-4\kappa T^3\mid \omega _{z,k}\mid ^3}{d^2}+\frac{\kappa ^2-\kappa ^2T^2\omega _{z,k}^2-4T^2\omega _{z,k}^2}{d}\nonumber \\{} & {} \qquad -\kappa T\mid \omega _{z,k}\mid -2\nonumber \\ c= & {} \frac{\kappa ^2T^4\omega _{z,k}^4}{d^2}+\frac{2\kappa T^3\mid \omega _{z,k}\mid ^3+\kappa ^2T^2\omega _{z,k}^2}{d}+T^2\omega _{z,k}^2\nonumber \\{} & {} \qquad +\kappa T \mid \omega _{z,k}\mid +\rho _1\nonumber \\ d= & {} 2-\kappa T\omega _{z,k} \end{aligned}$$

(35)

for $\rho _1$, $\rho _2>0$ such that the discriminant in (34) is greater than or equal to zero, and with $\kappa =\frac{T\omega _{z,k}^2}{\mid \omega _{z,k}\mid }-\rho _2$ for $\mid \kappa \mid <2$, ensures the asymptotic stability of the origin for the estimation errors:

$$\begin{aligned} {\tilde{e}}_{v_x,k}=v_{x,k}-{\tilde{v}}_{x,k} \qquad \tilde{e}_{v_y,k}=v_{y,k}-{\tilde{v}}_{y,k} \end{aligned}$$

$\diamond $

The proof is given in “Appendix A”.

3.3 The Reference Signals

The references $v_{y,k,\textrm{ref}}$, $\omega _{z,k,\textrm{ref}}$ represent what the driver expects from the vehicle performance. Concerning $v_{x,k}$, in this paper one assumes that the slips $\lambda _{f,k}$, $\lambda _{r,k}$ are set to zero, and therefore no longitudinal acceleration/deceleration is imposed. Various expressions can be found in the literature as reference generators. In particular, we consider (without loss of generality) the references given in [11, 12, 47] as the behavior of an “ideal” or “reference” vehicle. This ideal vehicle is not controlled by the AFS and/or the RTV, and receives as input only the driver’s steering signal

$$\begin{aligned} \left( \begin{matrix}v_{y,k+1,\textrm{ref}} \\ \omega _{z,k+1,\textrm{ref}}\end{matrix}\right) =\left( \begin{matrix}v_{y,k,\textrm{ref}}-T v_{x,k}\omega _{z,k,\textrm{ref}}+\\ T\dfrac{\mu _{y,\textrm{ref}}}{m_{\textrm{ref}}} \Big (F_{y,f,\textrm{ref}}(\alpha _{f,\textrm{ref}})+F_{y,r,\textrm{ref}}(\alpha _{r,\textrm{ref}})\Big )\\ \omega _{z,k,\textrm{ref}}+T\dfrac{\mu _{y,\textrm{ref}}}{J_{z,\textrm{ref}}}\Big (F_{y,f,\textrm{ref}}(\alpha _{f,\textrm{ref}}) l_f-F_{y,r,\textrm{ref}}(\alpha _{r,\textrm{ref}})l_r\Big )\end{matrix}\right) \end{aligned}$$

(36)

The reference lateral forces $F_{y,f,\textrm{ref}}$, $F_{y,r,\textrm{ref}}$ depend on the reference slip angles

$$\begin{aligned} \alpha _{f,\textrm{ref}}= & {} \delta _{d,k}-\frac{v_{y,k,\textrm{ref}}+l_f\omega _{z,k,\textrm{ref}}}{v_{x,k}}, \nonumber \\ \alpha _{r,\textrm{ref}}= & {} -\frac{v_{y,k,\textrm{ref}}-l_r\omega _{z,k,\textrm{ref}}}{v_{x,k}} \end{aligned}$$

(37)

and appear multiplied by the reference lateral tire-road friction coefficient $\mu _{y,\textrm{ref}}$. These forces should be chosen in such a way that they impose the desired behaviour on the vehicle. Various expressions can be used for these forces [48, 49]; for the sake of simplicity, we use Pacejka’s Magic Formula [19]:

$$\begin{aligned} F_{y,j,\textrm{ref}}=D_{y,j,\textrm{ref}}\sin (C_{y,j,\textrm{ref}}\arctan (B_{y,j,\textrm{ref}}\alpha _{j,k,\textrm{ref}})),\quad j=f,r \end{aligned}$$

(38)

where $D_{y,j,\textrm{ref}}$, $C_{y,j,\textrm{ref}}$ and $B_{y,j,\textrm{ref}}$ are constants chosen by the designer. In particular, $F_{y,r,\textrm{ref}}$ is chosen to be non-decreasing in the slip angle $\alpha _{r,\textrm{ref}}$. This ensures that the ‘reference vehicle’ cannot generate tail-spins.
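As an illustration, one step of the reference generator (36)–(38) can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the function names, the parameter dictionary `p` and its keys are hypothetical, and any numeric values passed to it are placeholders rather than the values used in the paper. One sufficient choice for a non-decreasing force (an observation added here, not a condition stated in the paper) is $B, D>0$ and $0<C\le 1$, since the argument of the sine then stays within $[-\pi /2,\pi /2]$.

```python
import math


def magic_formula(alpha, B, C, D):
    """Lateral force of the form (38): D * sin(C * arctan(B * alpha))."""
    return D * math.sin(C * math.atan(B * alpha))


def reference_step(vy_ref, wz_ref, vx, delta_d, T, p):
    """One discrete-time step of the reference model (36).

    p is a hypothetical dict with keys lf, lr, m, J, mu (reference vehicle)
    and Bf, Cf, Df, Br, Cr, Dr (designer-chosen Magic Formula constants).
    """
    # Reference slip angles (37)
    alpha_f = delta_d - (vy_ref + p["lf"] * wz_ref) / vx
    alpha_r = -(vy_ref - p["lr"] * wz_ref) / vx
    # Reference lateral forces (38)
    F_f = magic_formula(alpha_f, p["Bf"], p["Cf"], p["Df"])
    F_r = magic_formula(alpha_r, p["Br"], p["Cr"], p["Dr"])
    # Euler-type update of lateral velocity and yaw rate (36)
    vy_next = vy_ref - T * vx * wz_ref + T * p["mu"] / p["m"] * (F_f + F_r)
    wz_next = wz_ref + T * p["mu"] / p["J"] * (F_f * p["lf"] - F_r * p["lr"])
    return vy_next, wz_next
```

Calling `reference_step` repeatedly with the measured $v_{x,k}$ and the driver steering angle $\delta _{d,k}$ produces the sequences $v_{y,k,\textrm{ref}}$, $\omega _{z,k,\textrm{ref}}$ to be tracked.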

3.4 The CarSim^® Neural Identification and the Inverse Optimal Control for References Tracking

The RHONN identifier (10) takes measurements from CarSim^® only for the yaw rate $\omega _{z,k}$, whereas the identification of the longitudinal and lateral velocities uses the reconstructions ${\tilde{v}}_{x,k}$, ${\tilde{v}}_{y,k}$ given by the observer (33). The proposed neural model is the following:

$$\begin{aligned} {\hat{v}}_{x,k+1}= & {} {\hat{w}}_{11}\tanh ({\hat{v}}_{x,k})+{\hat{w}}_{12}\tanh (a_{x,k})\nonumber \\ {\hat{v}}_{y,k+1}= & {} {\hat{w}}_{21}\tanh ({\hat{v}}_{x,k})\tanh ({\hat{\omega }}_{z,k})+{\hat{w}}_{22}\tanh (a_{y,k})+w_{23}^{\circ }\delta _{c,k}\nonumber \\ \hat{\omega }_{z,k+1}= & {} {\hat{w}}_{31}\tanh (\delta _{d,k}) +{\hat{w}}_{32}\tanh (a_{y,k})+{\hat{w}}_{33}\tanh ({\hat{\beta }}_k)+{\hat{w}}_{34}\tanh (a_{x,k})\nonumber \\{} & {} -w_{35}^{\circ }\delta _{c,k}+w_{36}^{\circ }M_{z,k} \end{aligned}$$

(39)

where ${\hat{v}}_{x,k}$, ${\hat{v}}_{y,k}$, ${\hat{\omega }}_{z,k}$ are the neural identifications for ${\tilde{v}}_{x,k}$, ${\tilde{v}}_{y,k}$, $\omega _{z,k}$, and where ${\hat{\beta }}_k$ represents the vehicle slip angle, calculated as ${\hat{\beta }}_k=\arctan ({\hat{v}}_{y,k}/{\hat{v}}_{x,k})$. Moreover, $w_{23}^{\circ }$, $w_{35}^{\circ }$ and $w_{36}^{\circ }$ are constants tuned by the control designer.

In (39), the AFS and RTV inputs $\delta _{c,k}$, $M_{z,k}$ appear. It is interesting to note that $\delta _{c,k}$ appears linearly in the model, and not implicitly in the lateral front force as in the single-track discrete-time models.

It is important to recall that the neural model able to identify the process is not unique. Model (39) has shown good characteristics with respect to CarSim^® measurements including noise and perturbations, and good performance when tracking the reference signals.
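To make the structure of (39) concrete, a one-step prediction of the neural model can be sketched as below. This is only an illustrative sketch: the function name, the dictionary keys and the argument layout are assumptions introduced here, and the EKF update of the adaptive weights is not included.

```python
import math


def rhonn_step(x_hat, w, w_fixed, u, meas):
    """One-step prediction of the neural model (39).

    x_hat   : (vx_hat, vy_hat, wz_hat) at step k
    w       : dict of adaptive weights w11, w12, w21, w22, w31..w34
    w_fixed : dict of designer-tuned constants w23, w35, w36
    u       : (delta_c, Mz), the AFS and RTV inputs
    meas    : dict with ax, ay (accelerations) and delta_d (driver steering)
    All container layouts are hypothetical.
    """
    vx, vy, wz = x_hat
    delta_c, Mz = u
    ax, ay, delta_d = meas["ax"], meas["ay"], meas["delta_d"]
    beta = math.atan(vy / vx)  # slip angle estimate, requires vx != 0

    vx_next = w["w11"] * math.tanh(vx) + w["w12"] * math.tanh(ax)
    vy_next = (w["w21"] * math.tanh(vx) * math.tanh(wz)
               + w["w22"] * math.tanh(ay)
               + w_fixed["w23"] * delta_c)
    wz_next = (w["w31"] * math.tanh(delta_d)
               + w["w32"] * math.tanh(ay)
               + w["w33"] * math.tanh(beta)
               + w["w34"] * math.tanh(ax)
               - w_fixed["w35"] * delta_c
               + w_fixed["w36"] * Mz)
    return vx_next, vy_next, wz_next
```

The control inputs enter only through the last terms of the second and third equations, which is what makes the model affine in $\delta _{c,k}$ and $M_{z,k}$.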

The stability of the identification error system:

$$\begin{aligned} {\hat{e}}_{v_{x,k}}= & {} {\hat{v}}_{x,k}-v_{x,k};\quad {\hat{e}}_{v_{y,k}}={\hat{v}}_{y,k}-{\tilde{v}}_{y,k};\quad {\hat{e}}_{\omega _{z,k}}={\hat{\omega }}_{z,k}-\omega _{z,k}; \end{aligned}$$

(40)

as well as the stability of the synaptic weights errors:

$$\begin{aligned} {\tilde{w}}_{1,k}= & {} \left( \begin{matrix}w_{11}^*-{\hat{w}}_{11,k}\\ w_{12}^*-{\hat{w}}_{12,k}\end{matrix}\right) \quad \tilde{w}_{2,k}=\left( \begin{matrix}w_{21}^*-{\hat{w}}_{21,k}\\ w_{22}^*-{\hat{w}}_{22,k}\end{matrix}\right) \quad {\tilde{w}}_{3,k}=\left( \begin{matrix} w_{31}^*-{\hat{w}}_{31,k}\\ w_{32}^*-{\hat{w}}_{32,k}\\ w_{33}^*-{\hat{w}}_{33,k}\\ w_{34}^*-{\hat{w}}_{34,k}\end{matrix}\right) \end{aligned}$$

(41)

are discussed in the following theorem:

Theorem 3.2

The RHONN identifier (39), trained by the EKF algorithm (13)–(18) to identify the lateral vehicle velocity ${\tilde{v}}_{y,k}$ from the reduced order observer (33) and to identify the vehicle longitudinal velocity $v_{x,k}$ and yaw rate $\omega _{z,k}$ from CarSim^®, ensures that the identification errors (40) are SGUUB, and that the weight estimation errors (41) remain bounded. $\diamond $

The proof is given in “Appendix B”.

3.5 The Inverse Optimal Control Law

It is now possible to introduce the inverse optimal control law in order to force the ground vehicle to follow the desired references.

As commented before, the control inputs used for this task are the active front steering $\delta _{c,k}$ (AFS) and the rear torque vectoring $M_{z,k}$ (RTV), for the tracking of the lateral velocity reference $v_{y,k,\textrm{ref}}$ and the yaw rate reference $\omega _{z,k,\textrm{ref}}$. No control strategy is presented here for the longitudinal velocity $v_{x,k}$, since the latter is a bounded signal, as explained in [11, 12].

Based on the structure given in (29), the control law is expressed in the matrix form as follows:

$$\begin{aligned} u^*_k=\left( \begin{matrix}\delta _{c,k}\\ M_{z,k}\end{matrix}\right) =-\frac{1}{2}(R+P_2)^{-1}P_{1,k} \end{aligned}$$

(42)

being:

$$\begin{aligned} P_{1,k}= & {} g^T P(f({\hat{x}}_k)-x_{k+1,\textrm{ref}});\quad P_2=\frac{1}{2}g^TPg; \end{aligned}$$

(43)

and:

$$\begin{aligned} x_{k+1,\textrm{ref}}= & {} \left( \begin{matrix}v_{y,k+1,\textrm{ref}} \\ \omega _{z,k+1,\textrm{ref}}\end{matrix}\right) ;\quad x_{k,\textrm{ref}}=\left( \begin{matrix}v_{y,k,\textrm{ref}} \\ \omega _{z,k,\textrm{ref}}\end{matrix}\right) ; \nonumber \\ {\hat{x}}_{k+1}= & {} \left( \begin{matrix}{\hat{v}}_{y,k+1} \\ {\hat{\omega }}_{z,k+1}\end{matrix}\right) ;\quad {\hat{x}}_{k}=\left( \begin{matrix}{\hat{v}}_{y,k} \\ {\hat{\omega }}_{z,k}\end{matrix}\right) ; \end{aligned}$$

(44)

$$\begin{aligned} f({\hat{x}}_k)= & {} \left( \begin{array}{cc} &{}{\hat{w}}_{21}\tanh ({\hat{v}}_{x,k})\tanh ({\hat{\omega }}_{z,k})+{\hat{w}}_{22}\tanh (a_{y,k})\\ &{}{\hat{w}}_{31}\tanh (\delta _{d,k}) +{\hat{w}}_{32}\tanh (a_{y,k})\\ &{}+{\hat{w}}_{33}\tanh ({\hat{\beta }}_k)+{\hat{w}}_{34}\tanh (a_{x,k})-w_{35}^{\circ }\delta _{c,k}\end{array}\right) ; \end{aligned}$$

(45)

$$\begin{aligned} R= & {} \left( \begin{matrix}r_{11} &{} 0\\ 0 &{} r_{22}\end{matrix}\right) ; P=\left( \begin{matrix}p_{11} &{} p_{12}\\ p_{21} &{} p_{22}\end{matrix}\right) ; g=\left( \begin{matrix}w_{23}^{\circ } &{} 0\\ w_{35}^{\circ } &{} w_{36}^{\circ }\end{matrix}\right) . \end{aligned}$$

(46)

Notice that, with respect to (29), $g({\hat{x}}_k)=g$ is here taken to be constant, which ensures controllability of the system.
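Numerically, the control law (42)–(43) reduces to one small linear solve per sample. The following NumPy sketch shows the computation under the assumption that the caller supplies the drift term $f({\hat{x}}_k)$, the next reference $x_{k+1,\textrm{ref}}$ and the matrices $g$, $P$, $R$ as arrays; the function name and argument order are hypothetical.

```python
import numpy as np


def inverse_optimal_control(f_hat, x_ref_next, g, P, R):
    """Control law (42): u = -1/2 (R + P2)^{-1} P1.

    f_hat      : drift term f(x_hat_k), shape (2,)
    x_ref_next : reference x_{k+1,ref}, shape (2,)
    g, P, R    : 2x2 arrays as in (46), supplied by the caller
    """
    P1 = g.T @ P @ (f_hat - x_ref_next)   # P_{1,k} in (43)
    P2 = 0.5 * g.T @ P @ g                # P_2 in (43)
    # Solve (R + P2) u = -1/2 P1 instead of forming the inverse explicitly
    return -0.5 * np.linalg.solve(R + P2, P1)
```

The returned vector stacks $\delta _{c,k}$ and $M_{z,k}$ in the order used in (42). Using `np.linalg.solve` rather than an explicit inverse is a numerical convenience of this sketch, not something prescribed by the paper.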

Now, along the same lines as Theorem 4.7 of [18], we can state the following theorem:

Theorem 3.3

Let $x_{k,\textrm{ref}}$ be a bounded reference with bounded increments $x_{k+1,\textrm{ref}}$. If there exists a matrix $P=P^T>0$ such that:

$$\begin{aligned}{} & {} \frac{1}{2}P_{3,k}+\frac{1}{2}x^T_{k+1,\textrm{ref}}Px_{k+1,\textrm{ref}}-\frac{1}{2}{\hat{x}}^T_kP{\hat{x}}_k-\frac{1}{2}x^T_{k,\textrm{ref}}Px_{k,\textrm{ref}}-\frac{1}{4}P_{1,k}^T(R+P_2)^{-1}P_{1,k}\le \nonumber \\{} & {} -\frac{1}{2}\Vert P\Vert ~\Vert f({\hat{x}}_k)\Vert ^2-\frac{1}{2}\Vert P\Vert ~\Vert x_{k+1,\textrm{ref}}\Vert ^2-\frac{1}{2}\Vert P\Vert ~\Vert {\hat{x}}_k\Vert ^2-\frac{1}{2}\Vert P\Vert ~\Vert x_{k,\textrm{ref}}\Vert ^2 \end{aligned}$$

(47)

where:

$$\begin{aligned} P_{1,k}= & {} g^TP\big (f({\hat{x}}_k)-x_{k+1,\textrm{ref}}\big ); \quad P_2=\frac{1}{2}g^TPg; \quad P_{3,k}=f^T({\hat{x}}_k)Pf({\hat{x}}_k); \end{aligned}$$

(48)

then the control law (42), based on the neural identifier (39), ensures global asymptotic convergence to zero of the tracking error $\xi _k ={\hat{x}}_k -x_{k,\textrm{ref}}.$ Moreover, this control law is inverse optimal, i.e. it minimizes the cost functional $\mathcal{J}(\xi _k)=V(\xi _k)$ given by (28), with $l(\xi _k)=-V^*(\xi _{k+1})+V^*(\xi _k)-u^{*T}_k R u_k^*$.$\diamond $

It is worth stressing that, in general, there are no analytical conditions that allow one to know a priori whether (47) is feasible. However, it is possible to proceed using heuristic methods, such as the nature-inspired optimization process known as Particle Swarm Optimization (PSO) [50, 51] used in this work, to find a positive definite symmetric matrix P verifying (47). Moreover, the PSO algorithm also allows better performance in terms of tracking error, since it compares the results obtained for all the tested P matrices satisfying (47) and returns the one with the smallest error.

4 Simulation Results

To emphasize the control performance and to better test the controller, in this paper the plant is not a mathematical model but the CarSim^® extended model simulator, which reproduces very closely the physical behavior of a ground vehicle.

The behavior of the proposed nonlinear inverse optimal controller is shown for an interesting case, in which the vehicle performs an ATI 90-90 maneuver. The ATI 90-90 maneuver is described in the standard ISO/TS 16949. The vehicle starts with an initial speed of 27.8 m/s (about 100 km/h) and moves with open-loop throttle, with the throttle valve released and without braking.

The driver steering wheel angle $\delta ^{s,w}_{d,k}$, related by a ratio of 16:1 with the steering angle $\delta _{d,k}$, is shown in Fig. 3a in which a superimposed random noise is also considered.

Table 1 Mean error indicators for sample time variations in the observer (33)


A further source of difficulty is taken into account, considering an abrupt change of the tire-road friction coefficient, where $\mu _{d} =0.9$ and $\mu _w=0.5$ correspond to dry and wet surfaces (see Fig. 3b). Figure 3c, d show the longitudinal and lateral accelerations, respectively.

Figure 4 shows the importance of being able to rely on an active controller for vehicle stability improvements. The vehicle is shown in red when the controller is disabled (open-loop system) and in yellow when the controller is enabled (closed-loop system). Note that the controlled vehicle remains in safer driving conditions, while the uncontrolled vehicle exhibits strong drifting due to adhesion loss.

In order to test the quality and robustness of the observer (33), we varied the initial conditions, selecting $v_{x,k,0}=28$ [m/s] and $v_{y,k,0}=0.005$ [m/s]. Results are shown in Fig. 5, where the discrete gains $k_{o,1}, k_{o,2}$ and the stability parameter $\kappa $, which ensure the convergence of the observed dynamics to the real vehicle model, are presented.

Performance and stability of the discrete-time state observer depend on the sample time. For this reason, sample time variations are introduced and three basic indicators, known as the integral square error (ISE), the integral time square error (ITSE) and the integral of absolute error (IAE), are discussed, as presented in Table 1. They are calculated as follows:

$$\begin{aligned} \text {ISE}=\sum _{k=1}^N e(k)^2;~~~\text {ITSE}=\sum _{k=1}^N ke(k)^2;~~~ \text {IAE}=\sum _{k=1}^N \mid e(k)\mid . \end{aligned}$$

(49)
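The three indicators in (49) can be computed directly from a sampled error sequence. The sketch below assumes the error samples $e(1),\dots ,e(N)$ are available as a one-dimensional array; the function name is hypothetical.

```python
import numpy as np


def error_indices(e):
    """Return (ISE, ITSE, IAE) as defined in (49) for samples e(1..N)."""
    e = np.asarray(e, dtype=float)
    k = np.arange(1, e.size + 1)      # sample index k = 1, ..., N
    ise = np.sum(e ** 2)              # sum of squared errors
    itse = np.sum(k * e ** 2)         # squared errors weighted by the index
    iae = np.sum(np.abs(e))           # sum of absolute errors
    return ise, itse, iae
```

For example, the sequence `[1.0, -2.0, 3.0]` gives ISE = 14, ITSE = 36 and IAE = 6. Because ITSE weights each sample by its index, it penalizes errors that persist late in the maneuver more heavily than ISE does.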

Notice that for $T=0.0001$ and $T=0.001$ the observer is robust with respect to sample time variation, while the discriminant in Eq. (34) remains greater than zero.

Quality and performance of both the identifier and the observer are shown in Fig. 6. In particular, Fig. 6a–c show the identification errors of the longitudinal velocity ${\hat{e}}_{v_{x,k}}$, lateral velocity ${\hat{e}}_{v_{y,k}}$ and yaw rate ${\hat{e}}_{\omega _{z,k}}$, respectively, whereas Fig. 6d, e present the observation errors of the longitudinal (${\tilde{e}}_{v_{x,k}}$) and lateral (${\tilde{e}}_{v_{y,k}}$) velocities.

Figure 7 compares the vehicle behavior in the open-loop and closed-loop systems. Notice how, in Fig. 7a, b (open-loop system), the lateral velocity $v_{y,k}$ and the yaw rate $\omega _{z,k}$ do not track the safer references $v_{y,k,\textrm{ref}}$, $\omega _{z,k,\textrm{ref}}$. Instead, in Fig. 7c, d (closed-loop system), the references $v_{y,k,\textrm{ref}}$ and $\omega _{z,k,\textrm{ref}}$ are tracked as expected.

Moreover, the longitudinal velocity $v_{x,k}$, in closed-loop system, and the control efforts in terms of Active Front Steering ($\delta _{c,k}$) and Rear Torque Vectoring ($M_{z,k}$) can be appreciated in Fig. 8.

Finally, Fig. 9 presents the synaptic neural weights ${\hat{w}}_{11,k}$, ..., ${\hat{w}}_{34,k}$ during the online adaptation in the neural identifier (39).

Parameters used in the observer (33), neural identifier (39), and inverse optimal controller (42), are listed in Table 2.

Table 2 Parameters used in the control scheme


To test the quality and performance of the inverse optimal controller, the authors propose a fair comparison between optimal and non-optimal methods, in order to verify the advantages of using such a control approach. The comparison is said to be ‘fair’ because the neural identifier (39) is used in both cases, so that only the contribution of the controllers is highlighted.

The non-optimal control law can be designed as follows. For the tracking errors $e_{v_y,k}={\hat{v}}_{y,k}-v_{y,k,\textrm{ref}}$ and $e_{\omega _z,k}={\hat{\omega }}_{z,k}-\omega _{z,k,\textrm{ref}}$, it is possible to choose a Lyapunov candidate function of the form $V_k=e_{v_y,k}^2+e_{\omega _z,k}^2$ and to impose the condition $\varDelta V_k=-k_1 e_{v_y,k}^2-k_2 e_{\omega _z,k}^2$, which makes the Lyapunov increments negative definite. Solving for the control laws, one gets:

$$\begin{aligned} {\overline{u}}_{k,1}= & {} -(k_1-\frac{1}{2})({\hat{v}}_{y,k}-v_{y,k,\textrm{ref}})+v_{y,k+1,\textrm{ref}}-{\hat{v}}_{y,k}\nonumber \\{} & {} -{\hat{w}}_{31}\tanh (\delta _{d,k})-{\hat{w}}_{32}\tanh (a_{y,k})-{\hat{w}}_{33}\tanh ({\hat{\beta }}_k)-{\hat{w}}_{34}\tanh (a_{x,k})\nonumber \\ {\overline{u}}_{k,2}= & {} -(k_2-\frac{1}{2})(\hat{\omega }_{z,k}-\omega _{z,k,\textrm{ref}})+\omega _{z,k+1,\textrm{ref}}-\hat{\omega }_{z,k}\nonumber \\{} & {} -{\hat{w}}_{21}\tanh ({\hat{v}}_{x,k})\tanh (\hat{\omega }_{z,k})-{\hat{w}}_{22}\tanh (a_{y,k})-c_1 {\overline{u}}_{k,1} \end{aligned}$$

(50)

where ${\overline{u}}_{k,1}$ represents the AFS and ${\overline{u}}_{k,2}$ the RTV.

Results obtained applying the non-optimal control law (50) are shown in Fig. 10.

Validation of the optimal controller is also made numerically in terms of the power consumption of the actuators for both optimal and non-optimal control techniques. Notice that both methods track the references well, as shown in Fig. 10a–d. However, the inverse optimal control presented in this work provides better tracking performance while demanding less power. The numerical results are presented in Table 3.

Finally, the matrix P in Theorem 3.3, with $P>0$ and $P=P^T$, has been calculated using a nature-inspired optimization process, particle swarm optimization (PSO), in order to find the optimal value of P, as explained in [50, 51]. The logic behind this algorithm is shown in Fig. 11. The first step of the computation is the parameter initialization: number of variables to be optimized, number of swarm particles, number of iterations, and cognitive and social weighting factors. The second step consists in generating random numeric values to be tested in simulation. The random values are selected so as to verify the constraint conditions which, for this application, are given by the positive definiteness of the P matrix in (46). Next, the random values are tested in simulation and the obtained performances are written to output files. Results are then analyzed in terms of a statistical criterion: for this application, the mean square error of the concatenation of the tracking errors for the lateral velocity and the yaw rate ($e_k$) is considered. If the error obtained during the current iteration is smaller than in previous iterations, the optimal combination is updated. Otherwise, the algorithm executes a new iteration by modifying the random values, until the prescribed number of iterations is reached.

As a reference, the PSO results obtained during the last execution are listed in Table 4. It is worth noting that the minimum reached by the algorithm is not necessarily a global minimum.
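The search procedure described above can be sketched as a standard PSO over the three free entries $(p_{11}, p_{12}, p_{22})$ of a symmetric P, with positive definiteness enforced as a constraint. This is a hedged illustration only: the function names, the swarm hyper-parameters and the search bounds are assumptions, and `toy_cost` is a hypothetical stand-in for the simulation-based mean square tracking error, which in the paper requires running CarSim^®.

```python
import numpy as np


def to_matrix(z):
    """Build the symmetric matrix P from z = (p11, p12, p22)."""
    p11, p12, p22 = z
    return np.array([[p11, p12], [p12, p22]])


def is_positive_definite(P):
    return bool(np.all(np.linalg.eigvalsh(P) > 0))


def toy_cost(P):
    """Hypothetical stand-in for the simulated mean square tracking error."""
    target = np.array([[2.0, 0.5], [0.5, 1.0]])
    return float(np.sum((P - target) ** 2))


def sample_feasible(rng, lo, hi):
    """Draw random entries until the resulting P is positive definite."""
    while True:
        z = rng.uniform(lo, hi, size=3)
        if is_positive_definite(to_matrix(z)):
            return z


def pso_search(cost, n_particles=20, n_iter=50, lo=-5.0, hi=5.0,
               inertia=0.7, c_cog=1.5, c_soc=1.5, seed=0):
    rng = np.random.default_rng(seed)

    def penalized(z):
        # Infeasible candidates (P not positive definite) are never accepted
        P = to_matrix(z)
        return cost(P) if is_positive_definite(P) else np.inf

    pos = np.array([sample_feasible(rng, lo, hi) for _ in range(n_particles)])
    vel = np.zeros_like(pos)
    best_pos = pos.copy()
    best_val = np.array([penalized(z) for z in pos])
    g_idx = int(np.argmin(best_val))
    g_pos, g_val = best_pos[g_idx].copy(), float(best_val[g_idx])

    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Velocity update with cognitive and social weighting factors
        vel = (inertia * vel + c_cog * r1 * (best_pos - pos)
               + c_soc * r2 * (g_pos - pos))
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([penalized(z) for z in pos])
        improved = vals < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]
        g_idx = int(np.argmin(best_val))
        if best_val[g_idx] < g_val:
            g_pos, g_val = best_pos[g_idx].copy(), float(best_val[g_idx])
    return to_matrix(g_pos), g_val
```

A call such as `P_best, err = pso_search(toy_cost)` returns a symmetric positive definite matrix together with its cost. As noted above, nothing guarantees that the returned point is a global minimum.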

5 Conclusions

This paper proposes a nonlinear discrete-time inverse optimal controller for the active control of a ground vehicle, based on a RHONN identifier trained by the extended Kalman filter, in which the vehicle lateral velocity is reconstructed by a discrete-time reduced order state observer, to ensure safe driving conditions in the case of adhesion loss and hazardous maneuvers. According to test maneuvers simulated in CarSim^®, the proposed approach shows a proper identification of the vehicle dynamics in terms of identification errors, and a good performance of the controller in terms of reference tracking errors, even in the presence of parameter uncertainties, measurement noise and unmodelled dynamics. A nature-inspired optimization algorithm known as particle swarm optimization (PSO) is used for the control gain settings. The main contributions of this work concern control aspects. In fact, with this approach one can avoid the hard task of inverting Pacejka’s tire lateral equation, and asymptotic stability of the tracking errors can be ensured even without knowing the Pacejka coefficients, which are difficult to estimate and are also time-varying. Furthermore, a fair numerical comparison between the proposed control scheme and a non-optimal strategy shows that the former provides the same control performance in terms of tracking errors, while minimizing the control power consumption.

Table 3 Comparison between optimal and non-optimal control efforts in terms of power consumption


Table 4 Obtained results of the PSO algorithm in the last execution


Future work involves electric power-trains, in order to generate torque vectoring based on electric motor torques and angular velocities as well as active differential systems. In the case of vehicles with a high center of gravity, the introduction of the roll dynamics in both the observer and the neural identifier can improve the control performance as well. Finally, several comparisons with adaptive dynamic programming (ADP), to solve the HJB equations, will be studied.

Data Availability

The authors confirm that the data supporting the findings of this study are available within the article [and/or] its supplementary materials.

References

Bianchi D, Borri A, Burgio G, Di Benedetto MD, Di Gennaro S (2010) Adaptive integrated vehicle control using active front steering and rear torque vectoring, special issue on: autonomous and semi-autonomous control for safe driving of ground vehicles. Int J Veh Auton Syst 8:85–105
Navarrete Guzmán A, Di Gennaro S, Rivera Domínguez J, Acosta Lua C, Loukianov AG, Castillo-Toledo B (2016) Enhanced discrete-time modeling via variational integrators and digital controller design for ground vehicles. IEEE Trans Ind Electron 63(10):6375–6384
Borri A, Bianchi D, Di Benedetto MD, Di Gennaro S (2017) Optimal workload actuator balancing and dynamic reference generation in active vehicle control. J Frankl Inst 354:1722–1740
Etienne L, Acosta Lua C, Di Gennaro S, Barbot JP (2020) A super-twisting controller for active control of ground vehicles with lateral tire-road friction estimation and CarSim validation. Int J Control Autom Syst 18:1177–1189
Abbas C, Reine T, Moustapha D, Ali H, Ali C (2020) A comparison between a centralised multilayer LPV $H_\infty $ and a decentralised multilayer sliding mode control architectures for vehicle’s global chassis control. Int J Control. https://doi.org/10.1080/00207179.2020.1791360
Zheng J, Fu M, Lu R, Xie S (2018) Design, identification, and control of a linear dual-stage actuation positioning system. J Frankl Inst. https://doi.org/10.1016/j.jfranklin.2018.05.029
Perozzi G, Efimov D, Biannic JM, Planckaert L (2018) Trajectory tracking for a quadrotor under wind perturbations: sliding mode control with state-dependent gains. J Frankl Inst. https://doi.org/10.1016/j.jfranklin.2018.04.042
Ataei M, Khajepour A, Jeon S (2018) A novel reconfigurable integrated vehicle stability control with omni actuation systems. IEEE Trans Veh Technol 67(4):2945–2957
Liu W, Khajepour A, He H, Wang H, Huang Y (2018) Integrated torque vectoring control for a three-axle electric bus based on holistic cornering control method. IEEE Trans Veh Technol 67(4):2921–2933
Vignati M, Sabbioni E (2021) A cooperative control strategy for yaw rate and sideslip angle control combining torque vectoring with rear wheel steering. Veh Syst Dyn. https://doi.org/10.1080/00423114.2020.1869273
Acosta-Lua C, Castillo-Toledo B, Cespi R, Di Gennaro S (2015) An integrated active nonlinear controller for wheeled vehicles. J Frankl Inst 352:4890–4910
Acosta-Lua C, Castillo-Toledo B, Cespi R, Di Gennaro S (2016) Nonlinear observer-based active control of ground vehicles with non negligible roll dynamics. Int J Control Autom Syst 14:743–752
Bianchi D, Borri A, Burgio G, Di Benedetto MD, Di Gennaro S (2009) Adaptive integrated vehicle control using active front steering and rear torque vectoring. In: Joint 48th IEEE conference on decision and control and 28th Chinese control conference, pp 3557–3562
Ghosh J, Tonoli A, Amati N (2017) Improvement of lap-time of a rear wheel drive electric racing vehicle by a novel motor torque control strategy. In: SAE technical paper. https://doi.org/10.4271/2017-01-0509
Ghosh J, Tonoli A, Amati N (2018) A deep learning based virtual sensor for vehicle sideslip angle estimation: experimental results. In: SAE technical paper. https://doi.org/10.4271/2018-01-1089
Rovithakis GA, Christodoulou MA (2000) Adaptive control with recurrent high-order neural networks. Springer, Berlin
Sanchez EN, Alanis AY, Loukianov AG (2008) Discrete-time high order neural control. Springer, Berlin
Sanchez EN, Ornelas F (2013) Discrete-time inverse optimal control for nonlinear systems. CRC Press, Boca Raton
Pacejka HB (2005) Tire and vehicle dynamics. Elsevier, Amsterdam
Quintero-Manríquez E, Sanchez EN, Antonio-Toledo ME, Muñoz F (2021) Neural control of an induction motor with regenerative braking as electric vehicle architecture. Eng Appl Artif Intell 104:104275
Djilali L, Sanchez EN, Belkheiri M (2018) Real-time implementation of sliding-mode field-oriented control for a DFIG-based wind turbine. Int Trans Electr Energy Syst 28(5):e2539. https://doi.org/10.1002/etep.2539
Djilali L, Sanchez EN, Belkheiri M (2019) Real-time neural sliding mode field oriented control for a DFIG based wind turbine under balanced and unbalanced grid conditions. IET Renew Power Gener. https://doi.org/10.1049/iet-rpg.2018.5002
Muñoz F, Sanchez EN, Xia Y, Deng S (2017) Real-time neural inverse optimal control for indoor air temperature and humidity in a direct expansion (DX) air conditioning (A/C) system. Int J Refrig 79:196–206. https://doi.org/10.1016/j.ijrefrig.2017.04.01
Quintal G, Sanchez EN, Alanis AY, Arana-Daniel NG (2015) Real-time FPGA decentralized inverse optimal neural control for a Shrimp robot. In: 10th System of systems engineering conference (SoSE). https://doi.org/10.1109/sysose.2015.7151922
Mu C, Wang K, Qiu T (2022) Dynamic event-triggering neural learning control for partially unknown nonlinear systems. IEEE Trans Cybern 52(4):2200–2213. https://doi.org/10.1109/TCYB.2020.3004493
Mu C, Wang K, Ni Z (2022) Adaptive learning and sampled-control for nonlinear game systems using dynamic event-triggering strategy. IEEE Trans Neural Netw Learn Syst 33(9):4437–4450. https://doi.org/10.1109/TNNLS.2021.3057438
Wang Y, Wang Y, Xu J, Chai T (2021) Observer-based discrete adaptive neural network control for automotive PEMFC air-feed subsystem. IEEE Trans Veh Technol 70(4):3149–3163. https://doi.org/10.1109/TVT.2021.3064604
Wang Y, Wang Y, Tie M (2021) Hybrid adaptive learning neural network control for steer-by-wire systems via sigmoid tracking differentiator and disturbance observer. Eng Appl Artif Intell. https://doi.org/10.1016/j.engappai.2021.104393
Wang Y, Wang Y (2021) Discrete-time adaptive neural network control for steer-by-wire systems with disturbance observer. Expert Syst Appl. https://doi.org/10.1016/j.eswa.2021.115395
Sanchez EN, Alanis AY (2006) Redes neuronales: conceptos fundamentales y aplicaciones a control automático. Automática Robótica. Pearson Educación, London
Rubio J, Yu W (2006) Nonlinear system identification with recurrent neural networks and dead-zone Kalman filter algorithm. Neurocomputing 70:2460–2466
Haykin S (2002) Kalman filtering and neural networks. Wiley, New York
Song Y, Grizzle JW (1995) The extended Kalman filter as local asymptotic observer for discrete-time nonlinear systems. J Math Syst Estim Control 5:59–78
Sepulchre R, Jankovic M, Kokotovic PV (1997) Constructive nonlinear control. Springer, Berlin
Lewis FL, Syrmos VL (1995) Optimal control, 2nd edn. Wiley, New York
Basar T, Olsder GJ (1995) Dynamic noncooperative game theory, 2nd edn. Academic Press, New York
Alanis AY, Sanchez EN, Loukianov AG (2010) Real-time discrete nonlinear identification via recurrent high order neural networks. Comput Sist 14(1):63–72
Al-Tamimi A, Lewis FL, Abu-Khalaf M (2008) Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof. IEEE Trans Syst Man Cybern-Part B 38:943–949
Bellman RE (1957) Dynamic programming. Princeton University Press, Princeton
Bellman RE, Dreyfus SE (1967) Programarea dinamicã aplicatã, ser. Translated from the English. Editura Tehnicã, Bucharest
Ornelas F, Loukianov AG, Sanchez EN, Bayro-Corrochano EJ (2008) Planar robot robust decentralized neural control. San Antonio, Texas, USA. In: Proceedings of the IEEE multiconference on systems and control (MSC 2008)
Ricalde LJ (2005) Inverse optimal adaptive recurrent neural control with Constrained Inputs. Ph.D. Thesis, p 43
Ornelas F, Loukianov AG, Sánchez EN, Bayro-Corrochano EJ (2010) Decentralized neural identification and control for uncertain nonlinear systems: application to planar robot. J Frankl Inst 347(6):1015–1034
Guiggiani M (2007) Dinamica del veicolo. CittaStudi
Heydinger GJ, Garrott WR, Chrstos JP, Guenther DA (1990) A methodology for validating vehicle dynamics simulations. SAE J Autom Eng. Paper 900128
Wong J (1993) Theory of ground vehicles. Wiley, New York
Acosta Lúa C, Di Gennaro S (2017) Nonlinear adaptive tracking for ground vehicles in the presence of lateral wind disturbances and parameter variations. J Frankl Inst 354:2742–2768
Li J, Zhang Y, Yi J (2012) A hybrid physical-dynamic tire/road friction model. J Dyn Syst Meas Control. https://doi.org/10.1115/1.4006887
Canudas-de-Wit C, Tsiotras P, Velenis E, Basset M, Gissinger G (2003) Dynamic friction models for road/tire longitudinal interaction. Veh Syst Dyn. https://doi.org/10.1076/vesd.39.3.189.14152
Kennedy J, Eberhart R (1995) Particle swarm optimization. In: Proceedings of ICNN’95—international conference on neural Networks, Perth, WA, Australia, vol 4, pp 1942–1948
Parsopoulos KE, Vrahatis MN (2010) Particle swarm optimization and intelligence: advances and applications. Information Science Reference, Hershey


Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Author information

Stefano Di Gennaro, Bernardino Castillo-Toledo, Jorge Carlos Romero-Aragon and Ricardo Ambrocio Ramírez-Mendoza have contributed equally to this work.

Authors and Affiliations

Escuela de Ingeniería y Ciencias, Tecnologico de Monterrey, Av. Eugenio Garza Sada 2501, 64849, Monterrey, Nuevo León, Mexico
Riccardo Cespi & Ricardo Ambrocio Ramírez-Mendoza
Department of Information Engineering, Computer Science and Mathematics, University of L’Aquila, Via Vetoio, Loc. Coppito, 67100, L’Aquila, Abruzzo, Italy
Stefano Di Gennaro
Electric Engineering, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (Cinvestav), Unidad Guadalajara, Av. del Bosque 1145, Col. El Bajío, 45019, Zapopan, Jalisco, Mexico
Bernardino Castillo-Toledo
Intel Labs Mexico, Intel Corporation, Av. del Bosque 1001, Col. El Bajío, 45019, Zapopan, Jalisco, Mexico
Jorge Carlos Romero-Aragon

Authors

Riccardo Cespi
Stefano Di Gennaro
Bernardino Castillo-Toledo
Jorge Carlos Romero-Aragon
Ricardo Ambrocio Ramírez-Mendoza

Contributions

All authors contributed equally in preparing this paper.

Corresponding author

Correspondence to Riccardo Cespi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical Approval

There are no human subjects in this article and informed consent is not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A. Proof of Theorem 3.1

For the estimation errors:

$$\begin{aligned} {\tilde{e}}_{v_x,k}=v_{x,k}-{\tilde{v}}_{x,k} \qquad \tilde{e}_{v_y,k}=v_{y,k}-{\tilde{v}}_{y,k} \end{aligned}$$

and their increments:

$$\begin{aligned} {\tilde{e}}_{v_x,k+1}= & {} {\tilde{e}}_{v_x,k}+T\omega _{z,k}\tilde{e}_{v_y,k}-k_{o,1}{\tilde{e}}_{v_x,k}\nonumber \\ {\tilde{e}}_{v_y,k+1}= & {} \tilde{e}_{v_y,k}-T\omega _{z,k}{\tilde{e}}_{v_x,k}-k_{o,2}{\tilde{e}}_{v_x,k} \end{aligned}$$

(51)

one can consider the following Lyapunov candidate function:

$$\begin{aligned} V_{o,k}=({\tilde{e}}_{v_x,k}^2+{\tilde{e}}_{v_y,k}^2)-\kappa \mathcal{S}_{\omega _{z,k}}{\tilde{e}}_{v_x,k}{\tilde{e}}_{v_y,k} \end{aligned}$$

(52)

where $\mathcal{S}_{\omega _{z,k}}$ is the classical sign function

$$\begin{aligned} \mathcal{S}_{\omega _{z,k}}=\textrm{sign}(\omega _{z,k})= {\left\{ \begin{array}{ll} 1 &{} \hbox {if} \;\omega _{z,k}>0\\ 0 &{} \hbox {if} \;\omega _{z,k}=0\\ -1 &{} \hbox {if} \;\omega _{z,k}<0. \end{array}\right. } \end{aligned}$$

To ensure that $V_{o,k}$ in (52) is positive definite, and hence a valid Lyapunov candidate function, one imposes $\frac{\kappa ^2}{4}<1$.
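This condition can be checked by writing (52) as a quadratic form; the short verification below uses only the definitions already given:

$$\begin{aligned} V_{o,k}=\left( \begin{matrix}{\tilde{e}}_{v_x,k}&{\tilde{e}}_{v_y,k}\end{matrix}\right) \left( \begin{matrix}1 & -\frac{\kappa }{2}\mathcal{S}_{\omega _{z,k}}\\ -\frac{\kappa }{2}\mathcal{S}_{\omega _{z,k}} & 1\end{matrix}\right) \left( \begin{matrix}{\tilde{e}}_{v_x,k}\\ {\tilde{e}}_{v_y,k}\end{matrix}\right) \end{aligned}$$

The matrix is positive definite if and only if its leading principal minors are positive, namely $1>0$ and $1-\frac{\kappa ^2}{4}\mathcal{S}_{\omega _{z,k}}^2>0$. Since $\mathcal{S}_{\omega _{z,k}}^2\le 1$, the requirement $\frac{\kappa ^2}{4}<1$, that is $\mid \kappa \mid <2$, is sufficient for every value of $\omega _{z,k}$.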

The variation of the Lyapunov candidate function is defined as:

$$\begin{aligned} \varDelta V_{o,k}= & {} V_{o,k+1}-V_{o,k}=\big (\tilde{e}_{v_x,k}+T\omega _{z,k}{\tilde{e}}_{v_y,k}-k_{o,1}\tilde{e}_{v_x,k}\big )^2-{\tilde{e}}_{v_x,k}^2\nonumber \\{} & {} +\big (\tilde{e}_{v_y,k}-T\omega _{z,k}{\tilde{e}}_{v_x,k}-k_{o,2}{\tilde{e}}_{v_x,k} \big )^2-{\tilde{e}}_{v_y,k}^2+\kappa \mathcal{S}_{\omega _{z,k}}\tilde{e}_{v_x,k}{\tilde{e}}_{v_y,k}\nonumber \\{} & {} -\kappa \mathcal{S}_{\omega _{z,k}}\big ({\tilde{e}}_{v_x,k}+T\omega _{z,k}\tilde{e}_{v_y,k}-k_{o,1}{\tilde{e}}_{v_x,k}\big )\big (\tilde{e}_{v_y,k}-T\omega _{z,k}{\tilde{e}}_{v_x,k}-k_{o,2}{\tilde{e}}_{v_x,k}\big ) \end{aligned}$$

(53)

obtaining:

$$\begin{aligned} \varDelta V_{o,k}= & {} \bigg ( k_{o,1}^2-2 k_{o,1}+ T^2\omega _{z,k}^2+ k_{o,2}^2+2 T\omega _{z,k}k_{o,2}+\kappa \mathcal{S}_{\omega _{z,k}}k_{o,2}\nonumber \\{} & {} +\kappa T \mid \omega _{z,k}\mid -\kappa T \mid \omega _{z,k}\mid k_{o,1}-\kappa \mathcal{S}_{\omega _{z,k}}k_{o,1}k_{o,2}\bigg )\tilde{e}_{v_x,k}^2\nonumber \\{} & {} +\bigg (T^2\omega _{z,k}^2-\kappa T \mid \omega _{z,k}\mid \bigg ){\tilde{e}}_{v_y,k}^2\nonumber \\{} & {} +\bigg (-2 T\omega _{z,k}k_{o,1}-2 k_{o,2}+\kappa \mathcal{S}_{\omega _{z,k}}k_{o,1}\nonumber \\{} & {} +\kappa T^2\omega _{z,k}^2\mathcal{S}_{\omega _{z,k}}+\kappa T k_{o,2}\mid \omega _{z,k}\mid \bigg ){\tilde{e}}_{v_x,k}{\tilde{e}}_{v_y,k} \end{aligned}$$

(54)

The cross term between the longitudinal and lateral velocity errors is eliminated through the choice of $k_{o,2}$, whereas the coefficient of the squared lateral velocity error is made negative by imposing $\kappa =\frac{T \omega _{z,k}^2}{\mid \omega _{z,k}\mid }-\rho _2$ for $\rho _2>0$.
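A closed-form expansion such as (54) can be cross-checked numerically by stepping the error dynamics (51) once and evaluating (52) before and after the step. Because the dynamics are linear in the errors and $V_{o,k}$ is quadratic, $\varDelta V_{o,k}$ is itself a quadratic form, and its three coefficients can be recovered by probing it at three points. The sketch below does this; the function names and the probing strategy are illustrative assumptions, not part of the paper.

```python
def sign(w):
    """Classical sign function, returning 1, 0 or -1."""
    return (w > 0) - (w < 0)


def step_errors(ex, ey, T, omega_z, k1, k2):
    """One step of the estimation error dynamics (51)."""
    ex_next = ex + T * omega_z * ey - k1 * ex
    ey_next = ey - T * omega_z * ex - k2 * ex
    return ex_next, ey_next


def V(ex, ey, kappa, omega_z):
    """Lyapunov candidate function (52)."""
    return ex**2 + ey**2 - kappa * sign(omega_z) * ex * ey


def delta_V(ex, ey, T, omega_z, k1, k2, kappa):
    """Increment V_{o,k+1} - V_{o,k}, evaluated directly as in (53)."""
    ex_n, ey_n = step_errors(ex, ey, T, omega_z, k1, k2)
    return V(ex_n, ey_n, kappa, omega_z) - V(ex, ey, kappa, omega_z)


def quadratic_coefficients(T, omega_z, k1, k2, kappa):
    """Return (c_xx, c_yy, c_xy) such that
    Delta V = c_xx*ex^2 + c_yy*ey^2 + c_xy*ex*ey.
    """
    c_xx = delta_V(1.0, 0.0, T, omega_z, k1, k2, kappa)
    c_yy = delta_V(0.0, 1.0, T, omega_z, k1, k2, kappa)
    c_xy = delta_V(1.0, 1.0, T, omega_z, k1, k2, kappa) - c_xx - c_yy
    return c_xx, c_yy, c_xy
```

Comparing the returned coefficients with the bracketed expressions of a candidate closed form, for a few arbitrary values of the sample time, yaw rate and gains, is an inexpensive way to detect sign or index slips in the algebra.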

Selecting the observer gains $k_{o,1}$ and $k_{o,2}$ such that:

$$\begin{aligned} \left( \begin{array}{cc}k_{o,1} \\ k_{o,2}\end{array}\right) =\left( \begin{array}{cc}\frac{-b\pm \sqrt{b^2-4ac}}{2a}\\ \frac{k_{o,1}(\kappa \mathcal{S}_{\omega _{z,k}}-2T\omega _{z,k})+\kappa T^2\omega _{z,k}^2\mathcal{S}_{\omega _{z,k}}}{d}\end{array}\right) \end{aligned}$$

(55)

with:

$$\begin{aligned} a= & {} \frac{(\kappa \mathcal{S}_{\omega _{z,k}}-2T\omega _{z,k})^2}{d^2}+\frac{2\kappa T \mid \omega _{z,k}\mid -\kappa ^2}{d}+1\nonumber \\ b= & {} \frac{2T^2\omega _{z,k}^2\kappa ^2-4\kappa T^3\mid \omega _{z,k}\mid ^3}{d^2}+\frac{\kappa ^2-\kappa ^2T^2\omega _{z,k}^2-4T^2\omega _{z,k}^2}{d}-\kappa T\mid \omega _{z,k}\mid -2\nonumber \\ c= & {} \frac{\kappa ^2T^4\omega _{z,k}^4}{d^2}+\frac{2\kappa T^3\mid \omega _{z,k}\mid ^3+\kappa ^2T^2\omega _{z,k}^2}{d}+T^2\omega _{z,k}^2+\kappa T \mid \omega _{z,k}\mid +\rho _1\nonumber \\ d= & {} 2-\kappa T\mid \omega _{z,k}\mid \end{aligned}$$

(56)

one obtains:

$$\begin{aligned} \varDelta V_{o,k}= & {} -\rho _1{\tilde{e}}_{v_x,k}^2-\rho _2\tilde{e}_{v_y,k}^2 \end{aligned}$$

(57)

ensuring the asymptotic stability of the origin of the estimation error.

The behavior of the observer gains $k_{o,1}, k_{o,2}$, the parameter $\kappa $ in (52) and the sign of the discriminant in (34) are discussed in Sect. 4.
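The gain selection in (55) and (56) can be evaluated numerically as in the following minimal sketch. Several choices here are assumptions made for illustration only: the function names, the `root` argument selecting the sign in front of the square root in (55), the restriction to a nonzero yaw rate, and the use of $d=2-\kappa T\mid \omega _{z,k}\mid $ so that the cross term cancels for either sign of the yaw rate. The helper `delta_V` evaluates the increment directly from (51) and (52), so that the computed gains can be checked without relying on any closed-form expansion.

```python
import math


def sign(w):
    """Classical sign function, returning 1, 0 or -1."""
    return (w > 0) - (w < 0)


def observer_gains(kappa, T, omega_z, rho1, root=-1):
    """Evaluate the gains (55) with the coefficients (56).

    Assumption: d = 2 - kappa*T*|omega_z|, and omega_z != 0.
    `root` is +1 or -1 and selects one of the two roots in (55).
    """
    if omega_z == 0:
        raise ValueError("this sketch evaluates (55)-(56) for omega_z != 0")
    s = sign(omega_z)
    w = T * omega_z
    aw = abs(w)
    d = 2.0 - kappa * aw
    a = (kappa * s - 2 * w) ** 2 / d**2 + (2 * kappa * aw - kappa**2) / d + 1
    b = ((2 * w**2 * kappa**2 - 4 * kappa * aw**3) / d**2
         + (kappa**2 - kappa**2 * w**2 - 4 * w**2) / d
         - kappa * aw - 2)
    c = (kappa**2 * w**4 / d**2
         + (2 * kappa * aw**3 + kappa**2 * w**2) / d
         + w**2 + kappa * aw + rho1)
    disc = b**2 - 4 * a * c
    if disc < 0:
        raise ValueError("negative discriminant: no real gain k_o1")
    k1 = (-b + root * math.sqrt(disc)) / (2 * a)
    k2 = (k1 * (kappa * s - 2 * w) + kappa * w**2 * s) / d
    return k1, k2


def delta_V(ex, ey, T, omega_z, k1, k2, kappa):
    """V_{o,k+1} - V_{o,k}, computed directly from (51) and (52)."""
    s = sign(omega_z)
    ex_n = ex + T * omega_z * ey - k1 * ex
    ey_n = ey - T * omega_z * ex - k2 * ex
    v_next = ex_n**2 + ey_n**2 - kappa * s * ex_n * ey_n
    v_now = ex**2 + ey**2 - kappa * s * ex * ey
    return v_next - v_now
```

With gains obtained this way, `delta_V(1, 0, ...)` returns the coefficient of the squared longitudinal error and `delta_V(1, 1, ...) - delta_V(1, 0, ...) - delta_V(0, 1, ...)` returns the cross-term coefficient, which gives a direct way to inspect how the choice of $\kappa $, $\rho _1$ and the sample time affects the result.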

1.2 Appendix B. Proof of Theorem 2

Let us consider the following Lyapunov candidate function:

$$\begin{aligned} V_{i,k}={\hat{e}}_{x_{i,k}}^T {\hat{e}}_{x_{i,k}}+\tilde{w}_{i,k}^TP_{i,k}{\tilde{w}}_{i,k} \end{aligned}$$

(58)

The first increment is given by the following:

$$\begin{aligned} \varDelta V_{i,k}= & {} V_{i,k+1}-V_{i,k}\nonumber \\= & {} \tilde{w}_{i,k+1}^T P_{i,k+1} {\tilde{w}}_{i,k+1}+{\hat{e}}_{x_{i,k+1}}^2-\tilde{w}_{i,k}^T P_{i,k} {\tilde{w}}_{i,k}-{\hat{e}}_{x_{i,k}}^2\nonumber \\= & {} \Big ({\tilde{w}}_{i,k}-\eta _i K_{i,k}{\hat{e}}_{x_{i,k}}\Big )^T \Big (P_{i,k}-U_{i,k}\Big )\Big ({\tilde{w}}_{i,k}-\eta _i K_{i,k}{\hat{e}}_{x_{i,k}}\Big )\nonumber \\{} & {} \quad +\Big ({\tilde{w}}_{i,k}^T z_{i,k}+\epsilon _{i,k}\Big )^2-{\tilde{w}}_{i,k}^T P_{i,k} \tilde{w}_{i,k}-{\hat{e}}_{x_{i,k}}^2 \end{aligned}$$

(59)

where $U_{i,k}=K_{i,k}H_{i,k}^T P_{i,k}+Q_{i,k}$; then (59) can be expressed as:

$$\begin{aligned} \varDelta V_{i,k}= & {} {\tilde{w}}_{i,k}^T P_{i,k}\tilde{w}_{i,k}-\eta _i {\hat{e}}_{x_{i,k}}K_{i,k}^T P_{i,k}\tilde{w}_{i,k}-{\tilde{w}}_{i,k}^T U_{i,k}{\tilde{w}}_{i,k}\nonumber \\{} & {} \quad +\eta _i {\hat{e}}_{x_{i,k}}K_{i,k}^T U_{i,k}{\tilde{w}}_{i,k}-\eta _i {\hat{e}}_{x_{i,k}}{\tilde{w}}_{i,k}^T P_{i,k}K_{i,k}+\eta _i^2 {\hat{e}}_{x_{i,k}}^2 K_{i,k}^T P_{i,k}K_{i,k}\nonumber \\{} & {} \quad +\eta _i{\hat{e}}_{x_{i,k}}{\tilde{w}}_{i,k}^T U_{i,k}K_{i,k}-\eta _i^2 {\hat{e}}_{x_{i,k}}^2K_{i,k}^T U_{i,k}K_{i,k}+({\tilde{w}}_{i,k}^T z_{i,k})^2\nonumber \\{} & {} \quad +2\epsilon _{i,k}{\tilde{w}}_{i,k}^T z_{i,k}+\epsilon _{i,k}^2-{\tilde{w}}_{i,k}^T P_{i,k}{\tilde{w}}_{i,k}-{\hat{e}}_{x_{i,k}}^2 \end{aligned}$$

(60)

Using the inequalities:

$$\begin{aligned} X^TX+Y^TY\ge & {} 2X^TY\nonumber \\ X^TX+Y^TY\ge & {} -2X^TY\nonumber \\ -\lambda _{\min }(P)\Vert X\Vert ^2\ge & {} -X^T PX\ge -\lambda _{\max }(P)\Vert X\Vert ^2 \end{aligned}$$

(61)

which hold for all $X\in \mathbb {R}^n, Y\in \mathbb {R}^n, P\in \mathbb {R}^{n\times n}$, with $P=P^T>0$, (60) can be bounded as follows:

$$\begin{aligned} \varDelta V_{i,k}\le & {} -{\tilde{w}}_{i,k}^T U_{i,k}\tilde{w}_{i,k}-\eta _i^2 {\hat{e}}_{x_{i,k}}^2K_{i,k}^T U_{i,k}K_{i,k}+\tilde{w}_{i,k}^T {\tilde{w}}_{i,k}+{\hat{e}}_{x_{i,k}}^2\nonumber \\{} & {} \qquad +\eta _i^2 {\hat{e}}_{x_{i,k}}^2K_{i,k}^T P_{i,k} P_{i,k}^T K_{i,k}+\eta _i^2\tilde{w}_{i,k}^T U_{i,k}K_{i,k}K_{i,k}^TU_{i,k}^T{\tilde{w}}_{i,k}\nonumber \\{} & {} \qquad +\eta _i^2 {\hat{e}}_{x_{i,k}}^2K_{i,k}^T P_{i,k}K_{i,k}+2(\tilde{w}_{i,k}^T z_{i,k})^2+2\epsilon _{i,k}^2-{\hat{e}}_{x_{i,k}}^2 \end{aligned}$$

(62)

Then,

$$\begin{aligned} \varDelta V_{i,k}\le & {} -\Vert \tilde{w}_{i,k}\Vert ^2\lambda _{\min }(U_{i,k})-\eta _i^2 {\hat{e}}_{x_{i,k}}^2\Vert K_{i,k}\Vert ^2 \lambda _{\min }(U_{i,k})+\Vert \tilde{w}_{i,k}\Vert ^2\nonumber \\{} & {} \qquad +\eta _i^2 {\hat{e}}_{x_{i,k}}^2\Vert K_{i,k}\Vert ^2 \lambda _{\max }^2(P_{i,k})+\eta _i^2\Vert \tilde{w}_{i,k}\Vert ^2\lambda _{\max }^2(U_{i,k})\Vert K_{i,k}\Vert ^2\nonumber \\{} & {} \qquad +\eta _i^2 {\hat{e}}_{x_{i,k}}^2\Vert K_{i,k}\Vert ^2\lambda _{\max }(P_{i,k})+2\Vert \tilde{w}_{i,k}\Vert ^2\Vert z_{i,k}\Vert ^2+2\epsilon _{i,k}^2 \end{aligned}$$

(63)

Let us now define:

$$\begin{aligned} E_{i,k}= & {} \lambda _{\min }(U_{i,k})-\eta _i^2\lambda _{\max }^2(U_{i,k})\Vert K_{i,k}\Vert ^2-2\Vert z_{i,k}\Vert ^2-1\nonumber \\ F_{i,k}= & {} \eta _i^2\Vert K_{i,k}\Vert ^2\lambda _{\min }(U_{i,k})-\eta _i^2\Vert K_{i,k}\Vert ^2\lambda _{\max }^2(P_{i,k})-\eta _i^2\Vert K_{i,k}\Vert ^2\lambda _{\max }(P_{i,k}) \end{aligned}$$

(64)

and selecting $\eta _i,Q_{i,k},R_{i,k}$ such that $E_{i,k},F_{i,k}>0, \forall k$, one gets:

$$\begin{aligned} \varDelta V_{i,k}\le & {} -\Vert {\tilde{w}}_{i,k}\Vert ^2E_{i,k}-{\hat{e}}_{x_{i,k}}^2F_{i,k}+2\epsilon _{i,k}^2. \end{aligned}$$

(65)

Hence, $\varDelta V_{i,k}<0$ whenever either of the following conditions holds:

$$\begin{aligned} \Vert \tilde{w}_{i,k}\Vert>\kappa _1= & {} \dfrac{\sqrt{2}\mid \epsilon _{i,k}\mid }{\sqrt{E_{i,k}}}\nonumber \\ \mid {\hat{e}}_{x_{i,k}}\mid >\kappa _2= & {} \dfrac{\sqrt{2}\mid \epsilon _{i,k}\mid }{\sqrt{F_{i,k}}} \end{aligned}$$

(66)

Therefore, according to Theorem 3.2, the solutions of (19) and (20) are SGUUB.
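To make the role of $E_{i,k}$ and $F_{i,k}$ concrete, the following minimal sketch evaluates (64) and the resulting bounds (66) from scalar quantities. It assumes that the extreme eigenvalues of $U_{i,k}$ and $P_{i,k}$ and the norms of $K_{i,k}$ and $z_{i,k}$ have already been computed at the current step; the function and argument names are illustrative and are not taken from the paper.

```python
import math


def identification_bounds(lam_min_U, lam_max_U, lam_max_P,
                          norm_K, norm_z, eta, eps):
    """Evaluate E, F in (64) and the bounds kappa_1, kappa_2 in (66).

    All arguments are scalars: eigenvalue extremes of U and P,
    Euclidean norms of K and z, the design parameter eta and the
    approximation error eps at the current step.
    """
    E = lam_min_U - eta**2 * lam_max_U**2 * norm_K**2 - 2 * norm_z**2 - 1
    F = eta**2 * norm_K**2 * (lam_min_U - lam_max_P**2 - lam_max_P)
    if E <= 0 or F <= 0:
        # (65)-(66) require E > 0 and F > 0.
        raise ValueError("E and F must be positive for (65)-(66) to apply")
    kappa1 = math.sqrt(2) * abs(eps) / math.sqrt(E)
    kappa2 = math.sqrt(2) * abs(eps) / math.sqrt(F)
    return E, F, kappa1, kappa2
```

Evaluating this function along a simulation run is one way to check whether a candidate choice of $\eta _i$, $Q_{i,k}$ and $R_{i,k}$ keeps $E_{i,k}$ and $F_{i,k}$ positive at every step, and how the ultimate bounds scale with the approximation error.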

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Cespi, R., Di Gennaro, S., Castillo-Toledo, B. et al. Neural Network Inverse Optimal Control of Ground Vehicles. Neural Process Lett 55, 10287–10313 (2023). https://doi.org/10.1007/s11063-023-11327-9

Accepted: 05 June 2023
Published: 20 June 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s11063-023-11327-9
