1 Introduction

In recent years, driving safety has been improved through the use of active actuators. Most applications rely on steer-by-wire systems, such as active front steering (AFS), to assist the driver in complex and hazardous maneuvers. In general, AFS improves vehicle agility, maneuverability and stability, as explained, for instance, in [1,2,3,4,5]. Driving safety can also be largely improved by the rear torque vectoring (RTV) technique; recent contributions on RTV control can be found in [6,7,8,9,10]. Combined AFS and RTV actions can be applied to ensure driving stability [11,12,13]. Along the same line, this paper deals with the use of both AFS and RTV for the active control of the vehicle.

A major criticism of the plant-based control strategies used in [11,12,13] is that, in order to compute the explicit expression of the control law that stabilizes the vehicle attitude, the Pacejka’s parameters defining the lateral tire forces must be known at all times. This strong assumption is made, for instance, in [14, 15], where the authors assume the lateral tire forces to be known. However, such intrinsic parameters are subject to decay and deterioration, so their estimation remains an arduous task.

In the proposed control approach, a discrete-time reduced-order observer reconstructs the vehicle lateral dynamics, which are not usually available, and a recurrent high-order neural network (RHONN) identifies the observed vehicle dynamics [16,17,18]. The synaptic weights provide neural adaptation, so that knowledge of the Pacejka’s tire parameters is not needed.

The RHONN weights are updated by a training algorithm based on an extended Kalman filter (EKF), which yields a model of the vehicle that is then used to design an inverse optimal controller. The main advantage of this strategy is that, in the RHONN-based model, the AFS input appears linearly in the dynamics, and not implicitly in the tire characteristic [19]. This allows the AFS input to be calculated without inverting the tire model, which is not an obvious task since the tire model depends on experimental parameters and on the vehicle vertical dynamics. Furthermore, the RTV control law does not use the explicit expression of the lateral front and rear tire forces, which are usually not available. The originality and novelty of the proposed control method lie in the fact that the AFS, considered as a control input, is calculated without knowing the Pacejka’s tire parameters. Moreover, since the RTV has a limited actuation, bounded by the vehicle speed and yaw inertia, an optimal approach that minimizes the control effort is considered. These two aspects represent the main contributions of this paper. Another notable aspect is that the controller is determined here using the inverse optimal control technique [17, 18]. In the classical optimal control setting, a meaningful cost functional is given a priori and is then used to calculate the control law by solving a Hamilton–Jacobi–Bellman (HJB) equation, which is, in general, a difficult task. The inverse optimal control technique overcomes this problem by choosing a priori a candidate Lyapunov function, which is then used to calculate the control law and a meaningful cost functional [17, 18]. This scheme is proposed here to control the vehicle lateral and yaw dynamics in the case of drifting and adhesion loss, which are commonly considered dangerous situations.
A further advantage of this control technique is that it minimizes the actuator effort. It is worth noting that controllers with EKF identification have been used in real-time applications [20,21,22,23,24]. The availability of high-performance digital devices makes the implementation possible also in vehicles, which nowadays have enough computational power to perform all the required calculations efficiently. From a computational point of view, another technique exists to solve the HJB equation for optimal control, as explained in [25, 26], where the authors use adaptive dynamic programming (ADP) to calculate such a solution. However, a detailed comparison is not provided in this paper and is left to future work.

In the literature, the combined use of plant-based observers and neural network controllers is quite common; see, for instance, [27], where the oxygen excess ratio in a polymer electrolyte membrane fuel cell is controlled. In [28, 29], a hybrid adaptive learning neural network control and a discrete-time adaptive neural network control are used to improve steer-by-wire systems, which are usually negatively affected by the friction torque and the self-aligning torque.

To validate the proposed controller, the CarSim® platform is used to realistically mimic a vehicle performing a challenging ATI 90/90 steer maneuver. This platform was chosen because it predicts the real vehicle response well, as validated by extensive experimental tests conducted by automotive companies such as Ford Motor Company and Chrysler, among others.

The paper is organized as follows. Section 2 introduces some preliminaries about neural networks, RHONN identification and inverse optimal control, whereas in Sect. 3 the proposed method is applied to a ground vehicle. In Sect. 4, quality and performance of the proposed controller are shown via simulations in CarSim®. Some comments conclude the paper.

2 Recalls on Nonlinear Neural Network RHONN Identification and Discrete-Time Inverse Optimal Control for Trajectory Tracking

Given a generic multi-input multi-output (MIMO) discrete-time nonlinear system of the form

$$\begin{aligned} x_{k+1}=F(x_k,u_k) \end{aligned}$$
(1)

with \(k\in \mathbb {N}=\{0,1,2\ldots \}\), \(x_k=(x_{1,k}, \ldots , x_{n,k})^T\in \mathbb {R}^n\), \(u_k=(u_{1,k}, \ldots , u_{m,k})^T\in \mathbb {R}^m\), and \(F:\mathbb {R}^n\times \mathbb {R}^m\rightarrow \mathbb {R}^n\) an analytic vector field such that \(F(0,0)=0\) [16], it can be approximated by a discrete RHONN [16,17,18]

$$\begin{aligned} x_{i,k+1}=w_{i,k}^T z_i(x_k,u_k) \qquad i=1,2,\ldots ,n. \end{aligned}$$
(2)

This result is particularly useful in some cases, e.g. when the parameters of the original system are not fully known. In (2), \(x_{i,k}\) represents the state of the \(i^{th}\) neuron, \(w_{i,k}=(w_{i,1,k}, \ldots , w_{i,\ell _i,k})^T\), \(i=1,\ldots ,n\), are the adjustable synaptic weights of the neural network, and \(\ell _i\) is the number of high order connections. For \(\ell _i\) sufficiently large, (2) approximates the system to be identified to any degree of accuracy. The \(\ell _i\)-dimensional vector \(z_i\) is of the form

$$\begin{aligned} z_i(x_k,u_k)= \left( \begin{matrix}z_{i,1} \\ z_{i,2} \\ \vdots \\ z_{i,\ell _i}\end{matrix}\right) = \left( \begin{matrix}\prod \limits _{j\in I_1} \gamma ^{d_{i_j}(1)}_{i_j,k}\\ \prod \limits _{j\in I_2} \gamma ^{d_{i_j}(2)}_{i_j,k}\\ \vdots \\ \prod \limits _{j\in I_{\ell _i}} \gamma ^{d_{i_j}(\ell _i)}_{i_j,k}\end{matrix}\right) , \end{aligned}$$
(3)

for \(i=1,2,\ldots ,n\) where \(\{I_1, I_2, \ldots ,I_{\ell _i}\}\) is a collection of \(\ell _i\) non-ordered subsets of \(\{1,2,\ldots ,n+m\}\), and \(d_{i_j}(1),\ldots ,d_{i_j}(\ell _i)\) are non-negative integers. Finally,

$$\begin{aligned} \gamma _{i,k}=\left( \begin{matrix}\gamma _{i,1,k} \\ \vdots \\ \gamma _{i,n,k} \\ \gamma _{i,n+1,k}\\ \vdots \\ \gamma _{i,n+m,k}\end{matrix}\right) =\left( \begin{matrix}s(x_{1,k}) \\ \vdots \\ s(x_{n,k}) \\ u_{1,k}\\ \vdots \\ u_{m,k}\end{matrix}\right) ,\quad i=1,\ldots ,n. \end{aligned}$$
(4)

with \(\gamma _{i,j,k}\) either external inputs or states of neurons passed through a sigmoid function. The functions \(s(x_{i,k})\), \(i=1,\ldots ,n\), are typically sigmoidal monotone-increasing and differentiable functions, called activation functions, having the form

$$\begin{aligned} s(x_{i,k})=\frac{\alpha _i}{1+e^{-\beta _i x_{i,k}}}-\tau _i, \qquad i=1,\ldots ,n \end{aligned}$$
(5)

where \(\alpha _i,\beta _i>0\) and \(\tau _i\ge 0\) are constants. Sigmoid activation functions commonly used in applications are the logistic function, obtained for \(\alpha _i=\beta _i=1\), \(\tau _i=0\), and the hyperbolic tangent function, obtained for \(\alpha _i=\beta _i=2\), \(\tau _i=1\).
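As a quick numerical check of the two special cases of (5), the following minimal sketch evaluates the activation function in plain Python. The function name `sigmoid` and the sample points are illustrative choices, not part of the original formulation.

```python
import math

def sigmoid(x, alpha=1.0, beta=1.0, tau=0.0):
    """Activation function (5): alpha / (1 + exp(-beta * x)) - tau."""
    return alpha / (1.0 + math.exp(-beta * x)) - tau

# alpha = beta = 1, tau = 0 gives the logistic function, equal to 1/2 at x = 0.
logistic_at_zero = sigmoid(0.0)

# alpha = beta = 2, tau = 1 reproduces tanh(x); measure the largest gap
# over a few illustrative sample points.
tanh_gap = max(abs(sigmoid(x, 2.0, 2.0, 1.0) - math.tanh(x))
               for x in (-2.0, -0.5, 0.0, 0.3, 1.7))
```

The second case follows from the identity \(2/(1+e^{-2x})-1=\tanh (x)\), so `tanh_gap` is at the level of floating-point round-off.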

Fig. 1 RHONN architecture

In this paper we consider the particular case in which (2) is the discrete-time RHONN depicted in Fig. 1 and described by the following equation [30]:

$$\begin{aligned} x_{i,k+1}=w_{i,k}^T z_i(x_k)+w_i^{\circ T}u_k, \quad i=1,2,\ldots ,n \end{aligned}$$
(6)

where \(w_i^{\circ T}\) is a constant synaptic weights vector, and the functions \(\gamma _{i,k}\) are in the particular form

$$\begin{aligned} \gamma _{i,k}=\left( \begin{matrix}\gamma _{i,1,k} \\ \vdots \\ \gamma _{i,n,k}\end{matrix}\right) =\left( \begin{matrix}s(x_{1,k}) \\ \vdots \\ s(x_{n,k})\end{matrix}\right) . \end{aligned}$$
(7)

This last choice simplifies the calculation of the control signal needed to guarantee the closed loop performance.
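The structure of (6), with the high-order terms (3) built from the activations (7), can be illustrated by the following minimal sketch. It assumes \(s=\tanh \); the function name `rhonn_step`, the encoding of the index sets as lists of `(j, d)` pairs, and the numerical values are hypothetical choices made only for this illustration.

```python
import math

def rhonn_step(x, u, w, w_fixed, index_sets):
    """One step of model (6) for every neuron i.

    x                : list of n states
    u                : list of m inputs
    w[i]             : adjustable weights of neuron i (length l_i)
    w_fixed[i]       : constant input weights of neuron i (length m)
    index_sets[i][p] : list of (j, d) pairs; entry p of z_i is prod_j s(x_j)**d
    """
    s = [math.tanh(xj) for xj in x]  # activations (7), assuming s = tanh
    x_next = []
    for i in range(len(x)):
        # High-order terms (3): products of powers of the activations.
        z_i = [math.prod(s[j] ** d for j, d in term) for term in index_sets[i]]
        state_part = sum(wp * zp for wp, zp in zip(w[i], z_i))
        input_part = sum(wc * uc for wc, uc in zip(w_fixed[i], u))  # linear in u
        x_next.append(state_part + input_part)
    return x_next

# Two neurons, one input (illustrative values only).
example_next = rhonn_step(
    x=[0.0, 0.0], u=[1.0],
    w=[[1.0], [2.0]], w_fixed=[[0.5], [-0.5]],
    index_sets=[[[(0, 1)]], [[(0, 1), (1, 1)]]],
)
```

Because the input enters only through the constant weights, the control signal can be isolated without inverting any nonlinearity, which is the simplification mentioned above.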

Let us now denote by \(w_i^*\), \(w_i^{\circ *}\), \(i=1,\ldots ,n\), the constant (unknown) weights minimizing, on a fixed compact set, the norm of the identification error between (6) and the system to be identified [17]. Therefore, considering the approximation errors

$$\begin{aligned} \epsilon _{i,k}=\Big (w_{i,k}-w_i^*\Big )^T z_i(x_k) + \Big (w_i^\circ -w_i^{\circ *}\Big )^T u_k, \end{aligned}$$
(8)

for \(i=1,\ldots ,n\) one rewrites (6) as

$$\begin{aligned} x_{i,k+1}=w_i^{*T} z_i(x_k)+w_i^{\circ * T} u_k+\epsilon _{i,k}, \qquad i=1,\ldots ,n. \end{aligned}$$
(9)

For (9) one can consider a RHONN identifier

$$\begin{aligned} {\hat{x}}_{i,k+1}={\hat{w}}_{i,k}^T z_i({\hat{x}}_k)+ w_i^{\circ * T}u_k, \quad i=1,2,\ldots ,n. \end{aligned}$$
(10)

with \({\hat{x}}_k\) the estimate of \(x_k\) and \({\hat{w}}_{i,k}\) the estimate of \(w_i^*\). Furthermore, in (10) it is assumed that the value of \(w_i^{\circ *}\) can be estimated off-line. This can be done for a large class of systems in affine form since \(w_i^\circ \) is constant. The RHONN weight estimation error is

$$\begin{aligned} {{\tilde{w}}}_{i,k}=w_i^*-{\hat{w}}_{i,k},\qquad i=1,\ldots ,n,\ \forall \>k\in \mathbb {N} \end{aligned}$$
(11)

and its dynamics are

$$\begin{aligned} {\tilde{w}}_{i,k+1}-{\tilde{w}}_{i,k}={\hat{w}}_{i,k}-{\hat{w}}_{i,k+1}, \qquad i=1,\ldots ,n,\ \forall \>k\in \mathbb {N} \end{aligned}$$
(12)

since \(w_i^*\) is constant.

2.1 The EKF Training Algorithm

For the on-line learning process of the RHONN weights of (10), one can use a modified version of the well-known EKF algorithm [31, 32], in which the weights become the states to be estimated. The main objective of the EKF is to find the optimal values for the weight vector \({\hat{w}}_{i,k}^T\) such that the identification errors

$$\begin{aligned} e_{i,k}=x_{i,k}-{\hat{x}}_{i,k}, \qquad i=1,\dots , n \end{aligned}$$
(13)

are minimized. The EKF solution to the training problem is [31, 32]

$$\begin{aligned} {\hat{w}}_{i,k+1}={\hat{w}}_{i,k}+\eta _{i,k} K_{i,k} e_{i,k} \qquad i=1,\dots , n \end{aligned}$$
(14)

where

$$\begin{aligned} K_{i,k}=P_{i,k} H_{i,k} M_{i,k}\in \mathbb {R}^{~\ell _i\times m} \end{aligned}$$
(15)

is the Kalman gain vector, \(i=1,\dots ,n\), and \(\eta _{i,k}\in [0,1]\) is the learning rate. Here \(P_{i,k}\in \mathbb {R}^{\ell _i\times \ell _i}\) is the covariance matrix associated with the prediction error, defined as

$$\begin{aligned} P_{i,k+1}=P_{i,k} -K_{i,k} H_{i,k} ^TP_{i,k} +Q_{i,k} \end{aligned}$$
(16)

for \(i=1,\dots , n\), with \(Q_{i,k}\in \mathbb {R}^{\ell _i\times \ell _i}\) the state noise associated covariance matrix. Moreover, the global scaling matrix \(M_{i,k}\) is given by

$$\begin{aligned} M_{i,k}=\Big (R_{i,k} +H_{i,k} ^TP_{i,k} H_{i,k}\Big )^{-1} \end{aligned}$$
(17)

for \(i=1,\dots , n\), where \(R_{i,k}\in \mathbb {R}\) is the covariance associated with the measurement noise, and \(H_{i,k}\in \mathbb {R}^{~\ell _i\times m}\) is a matrix for which each entry

$$\begin{aligned} h_{i,j,k}=\bigg (\frac{\partial {\hat{x}}_{i,k}}{\partial {\hat{w}}_{i,j,k}}\bigg )_{{\hat{w}}_{i,k}={\hat{w}}_{i,k+1}}, \qquad i=1,\ldots ,n, \quad j=1,\ldots ,\ell _i, \end{aligned}$$
(18)

is the derivative of the neural network output \({\hat{x}}_{i,k}\) with respect to the neural network weight \({\hat{w}}_{i,j}\). Note that \(H_{i,k}\), \(K_{i,k}\), and \(P_{i,k}\) are bounded [33]. The dynamics of (11) can be expressed as

$$\begin{aligned} {\tilde{w}}_{i,k+1}={\tilde{w}}_{i,k}-\eta _{i,k} K_{i,k}e_{i,k} \end{aligned}$$
(19)

On the other hand, the dynamics of (13) are

$$\begin{aligned} e_{i,k+1}={\tilde{w}}_{i,k}^T z_i({\hat{x}}_k)+\epsilon _{i,k}. \end{aligned}$$
(20)

For the error dynamics (20) we will introduce a stability property, given in the following definition.

Definition 2.1

The solutions of a system \(x_{k+1}=\phi (x_k)\) are Semi-Globally Uniformly Ultimately Bounded (SGUUB) if, for any compact set \(\varOmega \) and all initial conditions \(x_{k_0}\in \varOmega \), there exist an \(\epsilon >0\) and a number \(N(\epsilon ,x_{k_0})\) such that \(\Vert x_k\Vert <\epsilon \), \(\forall \>k\ge k_0+N\).\(\diamond \)

It is worth noting that, whereas it will be proven that \(\tilde{w}_{i,k}\) and \(e_{i,k}\) are stable, the approximation error \(\epsilon \) cannot be given a priori, since it depends on the accuracy of the neural model chosen by the control designer. In this sense, a heuristic method, based on a trial-and-error approach, is repeated until \(\epsilon \) is acceptable.
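The EKF training step (14)–(17) can be sketched as follows for a single neuron with a scalar output, so that \(H_{i,k}\) is a vector and \(R_{i,k}\), \(M_{i,k}\) are scalars. This is only an illustration under stated assumptions: the function name `ekf_weight_update`, the nested-list matrix representation, and the numerical values are hypothetical, and the gradient `h` of (18) is supplied by the caller (for a model that is linear in the weights, such as (10), it would be the regressor vector \(z_i\)).

```python
def ekf_weight_update(w, P, h, e, eta, R, Q):
    """One EKF training step (14)-(17) for a single neuron with scalar output.

    w   : weight estimate (list, length l)
    P   : covariance matrix (l x l nested list)
    h   : gradient vector H of (18) (list, length l)
    e   : scalar identification error (13)
    eta : learning rate in [0, 1]
    R   : scalar noise term appearing in (17)
    Q   : state-noise covariance (l x l nested list)
    """
    l = len(w)
    Ph = [sum(P[r][c] * h[c] for c in range(l)) for r in range(l)]    # P H
    M = 1.0 / (R + sum(h[r] * Ph[r] for r in range(l)))              # (17)
    K = [Ph[r] * M for r in range(l)]                                # (15)
    w_next = [w[r] + eta * K[r] * e for r in range(l)]               # (14)
    hTP = [sum(h[r] * P[r][c] for r in range(l)) for c in range(l)]  # H^T P
    P_next = [[P[r][c] - K[r] * hTP[c] + Q[r][c] for c in range(l)]
              for r in range(l)]                                     # (16)
    return w_next, P_next, K

# Illustrative call: two weights, identity covariance, no state noise.
w1, P1, K1 = ekf_weight_update(
    w=[0.0, 0.0], P=[[1.0, 0.0], [0.0, 1.0]], h=[1.0, 0.0],
    e=1.0, eta=1.0, R=1.0, Q=[[0.0, 0.0], [0.0, 0.0]],
)
```

In this example only the weight excited by the regressor is corrected, and the corresponding covariance entry shrinks, while the other weight and its covariance are left unchanged.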

2.2 Discrete-Time Inverse Optimal Control for Trajectory Tracking

The analysis of the inverse optimal control for trajectory tracking will be performed for input-affine systems

$$\begin{aligned} {\hat{x}}_{k+1}=f({\hat{x}}_k)+g({\hat{x}}_k)u_k \end{aligned}$$
(21)

with the following associated cost functional

$$\begin{aligned} \mathcal{J}(\xi _k)=\sum _{j=k}^{\infty } \Big (l(\xi _j)+u_j^T R(\xi _j)u_j\Big ) \end{aligned}$$
(22)

where \(\xi _k={\hat{x}}_k-x_{k,\textrm{ref}}\) is the tracking error between the neural network state vector \({\hat{x}}_k\) and the desired trajectory \(x_{k,\textrm{ref}}\). Furthermore, \(l(\xi _j):\mathbb {R}^n\rightarrow \mathbb {R}^+\) is a positive semidefinite function, and \(R(\xi _j):\mathbb {R}^n\rightarrow \mathbb {R}^{m\times m}\) is a real, symmetric, positive definite weighting matrix. For the sake of simplicity, in this work the elements of \(R(\xi _j)\) will be taken constant, namely \(R(\xi _j)=R\) [18]. The cost functional (22) can be rewritten as

$$\begin{aligned} \mathcal{J}(\xi _k)=l(\xi _k)+u_k^T R u_k+\sum _{j=k+1}^{\infty } \Big (l(\xi _j)+u_j^TRu_j\Big )=l(\xi _k)+u_k^T R u_k+\mathcal{J}(\xi _{k+1}) \end{aligned}$$
(23)

where, without loss of generality, one requires that \(\mathcal{J}(\xi _0)=0\).

The existence of a control \(u_k\) ensuring that \(\mathcal{J}(\xi _k)\) in (22) is finite can be stated in terms of the existence of a Lyapunov function \(V(\xi _k)\). In fact, following [34,35,36,37,38] and Bellman’s optimality principle [39, 40], one looks for a Lyapunov function \(V(\xi _k)\) that, denoting by \(u_k^*\) the optimal control which minimizes \(V(\xi _k)\), and with \(V^*(\xi _k)=V(\xi _k)\mid _{u_k^*}\), satisfies:

$$\begin{aligned} V^*(\xi _k)=l(\xi _k)+u_k^{*T} R u_k^*+V^*(\xi _{k+1}). \end{aligned}$$
(24)

Here \(u_k^*\) can be determined by setting to zero the gradient of the right-hand side of (24) with respect to \(u_k\), evaluated at \(u_k=u_k^*\) [41,42,43]

$$\begin{aligned} 0= & {} 2Ru_k^*+\bigg (\frac{\partial V^*(\xi _{k+1})}{\partial u_k}\bigg )^T=2 R u_k^*+\bigg (\frac{\partial \xi _{k+1}}{\partial u_k}\bigg )^T \bigg (\frac{\partial V^*(\xi _{k+1})}{\partial \xi _{k+1}}\bigg )^T\nonumber \\= & {} 2 R u_k^*+g^T({\hat{x}}_k)\bigg (\frac{\partial V^*(\xi _{k+1})}{\partial \xi _{k+1}}\bigg )^T \end{aligned}$$
(25)

so obtaining the controller that globally stabilizes the tracking error \(\xi _k\) and minimizes the cost function (22)

$$\begin{aligned} u_k^*=-\frac{1}{2}R^{-1}g^T({\hat{x}}_k)\bigg (\frac{\partial V^*(\xi _{k+1})}{\partial \xi _{k+1}}\bigg )^T \end{aligned}$$
(26)

with the condition \(V(\xi _0)=0\). These considerations justify the following definition.

Definition 2.2

(Inverse Optimal Control for Trajectory Tracking). The control (26) is a global inverse optimal controller for trajectory tracking if:

  (i) It guarantees global asymptotic stability of the tracking error \(\xi _k={\hat{x}}_k-x_{k,\textrm{ref}}\);

  (ii) \(V(\xi _k)\) is a radially unbounded positive definite function such that

    $$\begin{aligned} {\overline{V}}=V^*(\xi _{k+1})-V^*(\xi _k)+u^{*T}_k R u_k^*\le 0. \diamond \end{aligned}$$

When one selects \(l(\xi _k)=-{\overline{V}}\), then \(V(\xi _k)\) is a solution of the HJB equation:

$$\begin{aligned} l(\xi _k)&+V(\xi _{k+1})-V(\xi _k)+\frac{1}{4}\frac{\partial V^T(\xi _{k+1})}{\partial \xi _{k+1}}g({\hat{x}}_k)R^{-1}g^T({\hat{x}}_k)\frac{\partial V(\xi _{k+1})}{\partial \xi _{k+1}}=0. \end{aligned}$$
(27)

To calculate the inverse optimal control, let us consider a candidate Lyapunov function of the form

$$\begin{aligned} V(\xi _k)=\frac{1}{2}\xi ^T_kP\xi _k, \qquad P=P^T>0. \end{aligned}$$
(28)

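If (i) and (ii) are satisfied, substituting \(\xi _{k+1}=f({\hat{x}}_k)+g({\hat{x}}_k)u_k^*-x_{k+1,\textrm{ref}}\) and the gradient of (28) into (26), and solving for \(u_k^*\), the control law becomes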

$$\begin{aligned} u_k^*= & {} -\frac{1}{2}R^{-1}g^T({\hat{x}}_k)P\xi _{k+1}=-\frac{1}{2}(R+P_2)^{-1}P_{1,k} \end{aligned}$$
(29)

with \(P_{1,k}=g^T({\hat{x}}_k)P(f({\hat{x}}_k)-x_{k+1,\textrm{ref}})\) and \(P_2=\frac{1}{2}g^T({\hat{x}}_k)Pg({\hat{x}}_k)\). Since P and R are symmetric positive definite matrices, \(P_2\) is positive semidefinite and \(R+P_2\) is positive definite. Thus, the existence of the inverse is ensured.
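The equivalence of the two expressions in (29) can be checked numerically with the following minimal sketch, restricted for brevity to a single input (\(m=1\)), so that R, \(P_{1,k}\) and \(P_2\) are scalars. The function name `inverse_optimal_control` and all numerical values are hypothetical and serve only as an illustration.

```python
def inverse_optimal_control(f, g, x_ref_next, P, R):
    """Control law (29) for a single input (m = 1).

    f, g       : f(x_hat_k) and g(x_hat_k) as lists of length n
    x_ref_next : reference x_{k+1,ref} (list of length n)
    P          : symmetric positive definite n x n matrix (nested lists)
    R          : positive scalar weight
    """
    n = len(f)
    Pg = [sum(P[r][c] * g[c] for c in range(n)) for r in range(n)]  # P g
    # g^T P (f - x_ref), using the symmetry of P.
    P1 = sum(Pg[r] * (f[r] - x_ref_next[r]) for r in range(n))
    P2 = 0.5 * sum(g[r] * Pg[r] for r in range(n))                  # (1/2) g^T P g
    return -0.5 * P1 / (R + P2)

# Illustrative data (not taken from the vehicle model).
f_k, g_k, ref = [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]
P_mat, R_w = [[2.0, 1.0], [1.0, 2.0]], 1.0
u_star = inverse_optimal_control(f_k, g_k, ref, P_mat, R_w)

# Check the implicit form u* = -(1/2) R^{-1} g^T P xi_{k+1}.
xi_next = [f_k[r] + g_k[r] * u_star - ref[r] for r in range(2)]
P_xi = [sum(P_mat[r][c] * xi_next[c] for c in range(2)) for r in range(2)]
residual = u_star + 0.5 * sum(g_k[r] * P_xi[r] for r in range(2)) / R_w
```

The residual vanishes up to round-off, which confirms that the explicit form is the solution of the implicit one; the multi-input case replaces the scalar division by the inverse of the matrix \(R+P_2\).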

It is worth noting that the inverse optimal control method is derived on the same basis as optimal control theory, with the difference that the HJB equation is not solved first: the optimal cost function is replaced by a candidate Lyapunov function, known a priori, from which the control law is then calculated.

3 The Discrete-Time Inverse Optimal Control for Trajectory Tracking of Ground Vehicles

In this section we apply the previous results of neural identification and inverse optimal control to a ground vehicle represented by CarSim®. The control scheme is described in Fig. 2, in which the steering wheel angle \(\delta _{d,k}\), the longitudinal and lateral accelerations \(a_{x,k}\), \(a_{y,k}\), the longitudinal velocity \(v_{x,k}\) and the yaw rate \(\omega _{z,k}\) are measured from CarSim®.

Fig. 2 Control scheme and bicycle model

A discrete-time reduced-order state observer estimates the lateral vehicle velocity \({\tilde{v}}_{y,k}\), which is, in general, not known. A neural identifier then provides an input-affine model, avoiding the hard task of inverting the lateral tire characteristic when deriving the control laws. The synaptic weights \(w_{i,j,k}\) are adjusted on-line by the extended Kalman filter while minimizing the identification errors \({\hat{e}}_{i,k}\). Finally, the inverse optimal controller, based on the neural model \({\hat{v}}_{x,k}, {\hat{v}}_{y,k}, {\hat{\omega }}_{z,k}\) and on the references \(v_{y,k,\textrm{ref}}, \omega _{z,k,\textrm{ref}}\) and their increments, provides the AFS input \(\delta _{c,k}\) and the RTV input \(M_{z,k}\), which represent the CarSim® control inputs. In this work, an observer, which estimates the unknown vehicle dynamics, is combined with a neural identifier, which provides an input-affine model where the Pacejka’s parameters are adapted by the neural synaptic weights. This strategy ensures global exponential stability of the estimation error, given by the observer, and practical stability, guaranteed by the identifier. It is worth noting that the component that makes it possible to avoid using the Pacejka’s parameters of the lateral tire forces is the neural identifier, which adapts the synaptic weights under the Kalman learning rules. The identifier works under the hypothesis that all the dynamics to be identified are known. To this aim, a discrete-time observer is presented to reconstruct the unavailable vehicle lateral dynamics.

CarSim® realistically mimics the vehicle dynamics. Its model contains many dynamics, which describe the complex behavior of the vehicle. However, for vehicles with a low center of gravity, the essential dynamics describing the vehicle attitude are given by the longitudinal and lateral velocities and the yaw rate. This is well described by the so-called single-track vehicle model shown in Fig. 2, which is very often used to design active controllers for ground vehicles [44,45,46].

The interested reader can find in [2] a discrete-time version of the single-track model obtained by means of a variational integrator (known as symplectic Euler). Even if that model ensures better performance for (relatively) high sampling periods, a more popular model is the Euler approximation of the single-track model:

$$\begin{aligned} x_{k+1}= & {} \left( \begin{matrix}v_{x,k+1} \\ v_{y,k+1} \\ \omega _{z,k+1}\end{matrix}\right) =\left( \begin{matrix}v_{x,k}+T\bigg ( v_{y,k}\omega _{z,k} +\dfrac{\mu _x}{m} \Big (F_{x,f}(\lambda _{f,k})+F_{x,r}(\lambda _{r,k})\Big )\bigg )\\ v_{y,k}+T\bigg (-v_{x,k}\omega _{z,k} +\dfrac{\mu _y}{m} \Big (F_{y,f}(\alpha _{f,k})+F_{y,r}(\alpha _{r,k})\Big )\bigg )\\ \omega _{z,k}+T\bigg (\dfrac{\mu _y}{J_z}\Big (F_{y,f}(\alpha _{f,k}) l_f-F_{y,r}(\alpha _{r,k})l_r\Big )+\dfrac{M_{z,k}}{J_z}\bigg )\end{matrix}\right) \end{aligned}$$
(30)

where T is the sampling period, \(v_{x,k}\), \(v_{y,k}\), \(\omega _{z,k}\) are the vehicle longitudinal, lateral, and yaw velocities, and \(F_{y,f}\), \(F_{y,r}\) are the lateral forces which depend on the tire slip angles \(\alpha _{f,k}=\delta _{d,k}+\delta _{c,k}-(v_{y,k}+l_f\omega _{z,k})/v_{x,k}\), \(\alpha _{r,k}=-(v_{y,k}-l_r\omega _{z,k})/v_{x,k}\), where \(\delta _{d,k}\) is the driver steering wheel angle, and \(\delta _{c,k}\) is the AFS input.

Furthermore, \(M_{z,k}\) is the RTV input, and \(F_{x,f}\), \(F_{x,r}\) are the longitudinal forces, depending on the front/rear tire slips \(\lambda _{f,k}=1-\omega _{w,f,k}R_w/v_{x,k}\), \(\lambda _{r,k}=1-\omega _{w,r,k}R_w/v_{x,k}\), where \(\omega _{w,f,k}\), \(\omega _{w,r,k}\) are the front/rear wheel angular velocities, and \(R_w\) is the wheel radius. Finally, m and \(J_z\) are the vehicle mass and yaw moment of inertia, \(l_f\), \(l_r\) are the distances of the front and rear axles from the vehicle center of gravity, and \(\mu _x\), \(\mu _y\) are the longitudinal and lateral tire-road friction coefficients.
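One Euler step of the single-track model (30) can be sketched as below. The lateral tire characteristics are passed as callables, and the sum of the longitudinal forces is passed directly as a number; the function name `single_track_step`, the linear tire `linear_tire` with its stiffness, and all parameter values are hypothetical placeholders chosen only to make the sketch runnable.

```python
def single_track_step(v_x, v_y, w_z, delta_d, delta_c, M_z, F_x_sum,
                      F_yf, F_yr, T, m, J_z, l_f, l_r, mu_x=1.0, mu_y=1.0):
    """One step of the Euler single-track model (30).

    F_x_sum    : F_xf + F_xr, supplied directly for simplicity
    F_yf, F_yr : callables mapping a slip angle to a lateral force
    """
    # Front and rear tire slip angles as defined after (30).
    alpha_f = delta_d + delta_c - (v_y + l_f * w_z) / v_x
    alpha_r = -(v_y - l_r * w_z) / v_x
    Fyf, Fyr = F_yf(alpha_f), F_yr(alpha_r)
    v_x_next = v_x + T * (v_y * w_z + mu_x / m * F_x_sum)
    v_y_next = v_y + T * (-v_x * w_z + mu_y / m * (Fyf + Fyr))
    w_z_next = w_z + T * (mu_y / J_z * (Fyf * l_f - Fyr * l_r) + M_z / J_z)
    return v_x_next, v_y_next, w_z_next

def linear_tire(alpha):
    """Hypothetical linear tire: force proportional to the slip angle."""
    return 80000.0 * alpha

params = dict(T=0.01, m=1500.0, J_z=2000.0, l_f=1.2, l_r=1.4)

# Straight driving with no inputs: the state does not change.
straight = single_track_step(20.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
                             linear_tire, linear_tire, **params)

# A pure yaw moment M_z = J_z changes only the yaw rate, by T * 1 rad/s^2.
yawed = single_track_step(20.0, 0.0, 0.0, 0.0, 0.0, 2000.0, 0.0,
                          linear_tire, linear_tire, **params)
```

Note that the AFS input `delta_c` enters only through `alpha_f`, that is, inside the front tire characteristic, which is why a plant-based controller would have to invert that characteristic.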

3.1 The Control Problem

As already mentioned, the use of the AFS and the RTV allows us to track given references for the lateral velocity \(v_{y,k,\textrm{ref}}\) and the yaw rate \(\omega _{z,k,\textrm{ref}}\). Then, the control problem can be defined as follows: given bounded references \(v_{y,k,\textrm{ref}}\) and \(\omega _{z,k,\textrm{ref}}\), with bounded increments, determine a controller \(u_k=\alpha _k({\hat{x}}_k,x_{k,\textrm{ref}})\) such that the tracking errors \(e_{v_{y,k}}=v_{y,k}-v_{y,k,\textrm{ref}}\), \(e_{\omega _{z,k}}=\omega _{z,k}-\omega _{z,k,\textrm{ref}}\) satisfy

$$\begin{aligned} \lim \limits _{k\rightarrow \infty }e_{v_{y,k}}=0, \qquad \lim \limits _{k\rightarrow \infty }e_{\omega _{z,k}}=0. \end{aligned}$$

Moreover, when applying control strategies for vehicle stability, not all the state measurements are available from the vehicle, so that, in order to avoid an extensive use of sensors, we present a discrete-time reduced-order state observer for the reconstruction of the vehicle lateral velocity \({\tilde{v}}_{y,k}\).

Making reference to the control scheme in Fig. 2, the tracking errors \(e_{v_{y,k}}\), \(e_{\omega _{z,k}}\) can then be bounded as follows:

$$\begin{aligned} \begin{aligned} \Vert e_{v_{y,k}}\Vert&\le \Vert v_{y,k}-{\tilde{v}}_{y,k}\Vert +\Vert {\tilde{v}}_{y,k}-{\hat{v}}_{y,k}\Vert +\Vert {\hat{v}}_{y,k}-v_{y,k,\textrm{ref}}\Vert \\ \Vert e_{\omega _{z,k}}\Vert&\le \Vert \omega _{z,k}-{\hat{\omega }}_{z,k}\Vert +\Vert \hat{\omega }_{z,k}-\omega _{z,k,\textrm{ref}}\Vert . \end{aligned} \end{aligned}$$
(31)

Thus, the trajectory tracking problem of the desired trajectories can be split into three requirements:

$$\begin{aligned}&1.\quad \lim \limits _{k\rightarrow \infty }\Vert v_{x,k}-\tilde{v}_{x,k}\Vert =0, \lim \limits _{k\rightarrow \infty }\Vert v_{y,k}-\tilde{v}_{y,k}\Vert =0\nonumber \\&2.\quad \lim \limits _{k\rightarrow \infty }\Vert \tilde{v}_{x,k}-{\hat{v}}_{x,k}\Vert \le \varepsilon _{e_1}, \lim \limits _{k\rightarrow \infty }\Vert {\tilde{v}}_{y,k}-{\hat{v}}_{y,k}\Vert \le \varepsilon _{e_2},\nonumber \\&\qquad \quad \lim \limits _{k\rightarrow \infty }\Vert \omega _{z,k}-\hat{\omega }_{z,k}\Vert \le \varepsilon _{e_3}\nonumber \\&3.\quad \lim \limits _{k\rightarrow \infty }\Vert {\hat{v}}_{y,k}-v_{y,k,\textrm{ref}}\Vert =0, \lim \limits _{k\rightarrow \infty }\Vert \hat{\omega }_{z,k}-\omega _{z,k,\textrm{ref}}\Vert =0 \end{aligned}$$
(32)

with \(\varepsilon _{e_1}, \varepsilon _{e_2}, \varepsilon _{e_3}>0\) fixed bounds for the norm of the identification errors. The asymptotic stability of the estimation error stated in the first condition is ensured by the use of a reduced-order state observer presented in Sect. 3.2. The practical stability of the identification error required by the second condition is guaranteed by use of a RHONN identifier introduced in Sect. 3.4, whereas the reference tracking stability required by the third condition is satisfied by the use of a discrete-time controller discussed in Sect. 3.5, developed with the inverse optimal control technique. Finally, Sect. 3.3 shows how to generate safe references for the vehicle attitude.

3.2 Discrete-Time Reduced-Order State Observer

From the mathematical model (30), in order to estimate the lateral velocity \(v_{y,k}\), we present the following reduced order state observer:

$$\begin{aligned} {\tilde{v}}_{x,k+1}= & {} {\tilde{v}}_{x,k}+T(\tilde{v}_{y,k}\omega _{z,k} +a_{x,k})+k_{o,1}(v_{x,k}-{\tilde{v}}_{x,k})\nonumber \\ {\tilde{v}}_{y,k+1}= & {} {\tilde{v}}_{y,k}+T(-{\tilde{v}}_{x,k}\omega _{z,k} +a_{y,k})+k_{o,2}(v_{x,k}-{\tilde{v}}_{x,k}) \end{aligned}$$
(33)

where \(a_{x,k}, a_{y,k}\) are the vehicle longitudinal and lateral accelerations, assumed to be known.
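A single update of the observer (33) can be sketched as follows. The gains are passed in as plain numbers; the function name `observer_step` and the numerical values, including the gains, are hypothetical and are not the gains prescribed for stability.

```python
def observer_step(vx_est, vy_est, v_x, w_z, a_x, a_y, T, k_o1, k_o2):
    """One step of the reduced-order observer (33).

    vx_est, vy_est : current estimates of the longitudinal/lateral velocity
    v_x, w_z       : measured longitudinal velocity and yaw rate
    a_x, a_y       : measured longitudinal and lateral accelerations
    """
    innovation = v_x - vx_est  # only v_x is measured, so it drives both corrections
    vx_next = vx_est + T * (vy_est * w_z + a_x) + k_o1 * innovation
    vy_next = vy_est + T * (-vx_est * w_z + a_y) + k_o2 * innovation
    return vx_next, vy_next

# Illustrative call with placeholder gains and zero yaw rate and accelerations.
est = observer_step(vx_est=10.0, vy_est=0.0, v_x=12.0, w_z=0.0,
                    a_x=0.0, a_y=0.0, T=0.01, k_o1=0.5, k_o2=0.25)
```

Both estimates are corrected by the same measured quantity, the longitudinal velocity error, since the lateral velocity itself is not available.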

For the observer (33), let us state the following:

Theorem 3.1

The reduced order state observer (33) with \(k_{o,1}\) and \(k_{o,2}\) such that:

$$\begin{aligned} \left( \begin{array}{cc}k_{o,1} \\ k_{o,2}\end{array}\right) =\left( \begin{array}{cc}\frac{-b\pm \sqrt{b^2-4ac}}{2a}\\ \frac{k_{o,1}(\kappa \mathcal{S}_{\omega _{z,k}}-2T\omega _{z,k})+\kappa T^2\omega _{z,k}^2\mathcal{S}_{\omega _{z,k}}}{d}\end{array}\right) \end{aligned}$$
(34)

with:

$$\begin{aligned} a= & {} \frac{(\kappa \mathcal{S}_{\omega _{z,k}}-2T\omega _{z,k})^2}{d^2}+\frac{2\kappa T \mid \omega _{z,k}\mid -\kappa ^2}{d}+1\nonumber \\ b= & {} \frac{2T^2\omega _{z,k}^2\kappa ^2-4\kappa T^3\mid \omega _{z,k}\mid ^3}{d^2}+\frac{\kappa ^2-\kappa ^2T^2\omega _{z,k}^2-4T^2\omega _{z,k}^2}{d}\nonumber \\{} & {} \qquad -\kappa T\mid \omega _{z,k}\mid -2\nonumber \\ c= & {} \frac{\kappa ^2T^4\omega _{z,k}^4}{d^2}+\frac{2\kappa T^3\mid \omega _{z,k}\mid ^3+\kappa ^2T^2\omega _{z,k}^2}{d}+T^2\omega _{z,k}^2\nonumber \\{} & {} \qquad +\kappa T \mid \omega _{z,k}\mid +\rho _1\nonumber \\ d= & {} 2-\kappa T\omega _{z,k} \end{aligned}$$
(35)

for \(\rho _1\), \(\rho _2>0\) such that the discriminant in (34) is greater than or equal to zero, and with \(\kappa =\frac{T\omega _{z,k}^2}{\mid \omega _{z,k}\mid }-\rho _2\) such that \(\mid \kappa \mid <2\), ensures the asymptotic stability of the origin for the estimation errors:

$$\begin{aligned} {\tilde{e}}_{v_x,k}=v_{x,k}-{\tilde{v}}_{x,k} \qquad \tilde{e}_{v_y,k}=v_{y,k}-{\tilde{v}}_{y,k} \end{aligned}$$

\(\diamond \)

The proof is given in “Appendix A”.

3.3 The Reference Signals

The references \(v_{y,k,\textrm{ref}}\), \(\omega _{z,k,\textrm{ref}}\) represent what the driver expects from the vehicle performance. Concerning \(v_{x,k}\), in this paper one assumes that the slips \(\lambda _{f,k}\), \(\lambda _{r,k}\) are set to zero, and therefore no longitudinal acceleration/deceleration is imposed. Various expressions can be found in the literature as reference generators. In particular, we consider (without loss of generality) the references given in [11, 12, 47] as the behavior of an “ideal” or “reference” vehicle. This ideal vehicle is not controlled by the AFS and/or the RTV, and receives as input only the driver’s steering signal

$$\begin{aligned} \left( \begin{matrix}v_{y,k+1,\textrm{ref}} \\ \omega _{z,k+1,\textrm{ref}}\end{matrix}\right) =\left( \begin{matrix}v_{y,k,\textrm{ref}}-T v_{x,k}\omega _{z,k,\textrm{ref}}+\\ T\dfrac{\mu _{y,\textrm{ref}}}{m_{\textrm{ref}}} \Big (F_{y,f,\textrm{ref}}(\alpha _{f,\textrm{ref}})+F_{y,r,\textrm{ref}}(\alpha _{r,\textrm{ref}})\Big )\\ \omega _{z,k,\textrm{ref}}+T\dfrac{\mu _{y,\textrm{ref}}}{J_{z,\textrm{ref}}}\Big (F_{y,f,\textrm{ref}}(\alpha _{f,\textrm{ref}}) l_f-F_{y,r,\textrm{ref}}(\alpha _{r,\textrm{ref}})l_r\Big )\end{matrix}\right) \end{aligned}$$
(36)

The reference lateral forces \(F_{y,f,\textrm{ref}}\), \(F_{y,r,\textrm{ref}}\) depend on the reference slip angles

$$\begin{aligned} \alpha _{f,\textrm{ref}}= & {} \delta _{d,k}-\frac{v_{y,k,\textrm{ref}}+l_f\omega _{z,k,\textrm{ref}}}{v_{x,k}}, \nonumber \\ \alpha _{r,\textrm{ref}}= & {} -\frac{v_{y,k,\textrm{ref}}-l_r\omega _{z,k,\textrm{ref}}}{v_{x,k}} \end{aligned}$$
(37)

and appear multiplied by the reference lateral tire-road friction coefficient \(\mu _{y,\textrm{ref}}\). These forces should be chosen in such a way that they impose the desired behavior on the vehicle. Various expressions can be used for these forces [48, 49]; for the sake of simplicity we will use the Pacejka’s Magic Formula [19]:

$$\begin{aligned} F_{y,j,\textrm{ref}}=D_{y,j,\textrm{ref}}\sin (C_{y,j,\textrm{ref}}\arctan (B_{y,j,\textrm{ref}}\alpha _{j,\textrm{ref}})),\quad j=f,r \end{aligned}$$
(38)

where \(D_{y,j,\textrm{ref}}\), \(C_{y,j,\textrm{ref}}\) and \(B_{y,j,\textrm{ref}}\) are constant values chosen by the designer. In particular, \(F_{y,r,\textrm{ref}}\) is chosen to be non-decreasing in the slip angle \(\alpha _{r,\textrm{ref}}\). This ensures that the ‘reference vehicle’ cannot generate tail-spins.
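The reference force (38) and the monotonicity requirement can be illustrated with the following minimal sketch. The function name `reference_lateral_force` and the values of B, C and D are hypothetical; they are not the values used in the paper.

```python
import math

def reference_lateral_force(alpha, B, C, D):
    """Simplified Magic Formula (38): D * sin(C * arctan(B * alpha))."""
    return D * math.sin(C * math.atan(B * alpha))

# Sample the force on a grid of slip angles (radians) with C = 1.
alphas = [i * 0.01 for i in range(-50, 51)]
forces = [reference_lateral_force(a, B=10.0, C=1.0, D=4000.0) for a in alphas]

# With B, D > 0 and C <= 1 the argument of the sine stays within
# (-pi/2, pi/2), so the force never decreases with the slip angle.
is_non_decreasing = all(f1 <= f2 for f1, f2 in zip(forces, forces[1:]))
```

For \(C>1\) the argument of the sine can exceed \(\pi /2\), and the force then decreases beyond its peak, which is the behavior the reference rear force is designed to avoid.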

3.4 The CarSim® Neural Identification and the Inverse Optimal Control for References Tracking

The RHONN identifier (10) takes measurements from CarSim® only for the yaw rate \(\omega _{z,k}\), whereas the identification of the longitudinal and lateral velocities uses the reconstructions \({\tilde{v}}_{x,k}\), \({\tilde{v}}_{y,k}\) given by the observer (33). The proposed neural model is the following:

$$\begin{aligned} {\hat{v}}_{x,k+1}= & {} {\hat{w}}_{11}\tanh ({\hat{v}}_{x,k})+{\hat{w}}_{12}\tanh (a_{x,k})\nonumber \\ {\hat{v}}_{y,k+1}= & {} {\hat{w}}_{21}\tanh ({\hat{v}}_{x,k})\tanh ({\hat{\omega }}_{z,k})+{\hat{w}}_{22}\tanh (a_{y,k})+w_{23}^{\circ }\delta _{c,k}\nonumber \\ \hat{\omega }_{z,k+1}= & {} {\hat{w}}_{31}\tanh (\delta _{d,k}) +{\hat{w}}_{32}\tanh (a_{y,k})+{\hat{w}}_{33}\tanh ({\hat{\beta }}_k)+{\hat{w}}_{34}\tanh (a_{x,k})\nonumber \\{} & {} -w_{35}^{\circ }\delta _{c,k}+w_{36}^{\circ }M_{z,k} \end{aligned}$$
(39)

where \({\hat{v}}_{x,k}\), \({\hat{v}}_{y,k}\), \({\hat{\omega }}_{z,k}\) are the neural identifications of \({\tilde{v}}_{x,k}\), \({\tilde{v}}_{y,k}\), \(\omega _{z,k}\), and where \({\hat{\beta }}_k\) represents the vehicle slip angle, calculated as \({\hat{\beta }}_k=\arctan ({\hat{v}}_{y,k}/{\hat{v}}_{x,k})\). Moreover, \(w_{23}^{\circ }, w_{35}^{\circ }\) and \(w_{36}^{\circ }\) are constants tuned by the control designer.

In (39), the AFS and RTV inputs \(\delta _{c,k}\), \(M_{z,k}\) appear. It is interesting to note that \(\delta _{c,k}\) appears linearly in the model, and not implicitly in the lateral front force as in the single-track discrete-time models.
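A minimal sketch of one prediction step of (39) may help to see this linearity. The function name, the dictionary layout and all numerical values below are illustrative assumptions, not the authors' implementation or tuning.

```python
import math

def rhonn_step(state, inputs, w, w_fixed):
    """One-step prediction of the neural model (39).

    state   : (v_x_hat, v_y_hat, omega_z_hat) at step k
    inputs  : dict with a_x, a_y, delta_d, delta_c, M_z at step k
    w       : dict of adapted weights w11 ... w34
    w_fixed : dict of designer-tuned constants w23, w35, w36
    """
    v_x, v_y, omega_z = state
    tanh = math.tanh
    beta = math.atan(v_y / v_x)  # slip angle estimate, assumes v_x > 0
    v_x_next = w["w11"] * tanh(v_x) + w["w12"] * tanh(inputs["a_x"])
    v_y_next = (w["w21"] * tanh(v_x) * tanh(omega_z)
                + w["w22"] * tanh(inputs["a_y"])
                + w_fixed["w23"] * inputs["delta_c"])
    omega_z_next = (w["w31"] * tanh(inputs["delta_d"])
                    + w["w32"] * tanh(inputs["a_y"])
                    + w["w33"] * tanh(beta)
                    + w["w34"] * tanh(inputs["a_x"])
                    - w_fixed["w35"] * inputs["delta_c"]
                    + w_fixed["w36"] * inputs["M_z"])
    return v_x_next, v_y_next, omega_z_next

# Hypothetical values, for illustration only.
weights = {"w11": 27.0, "w12": 0.5, "w21": 0.3, "w22": 0.2,
           "w31": 0.4, "w32": 0.1, "w33": 0.2, "w34": 0.05}
fixed = {"w23": 0.8, "w35": 0.6, "w36": 0.01}
base_inputs = {"a_x": -0.02, "a_y": 0.3, "delta_d": 0.05,
               "delta_c": 0.0, "M_z": 0.0}
print(rhonn_step((27.8, 0.1, 0.05), base_inputs, weights, fixed))
```

Because \(\delta _{c,k}\) enters only through the products with \(w_{23}^{\circ }\) and \(w_{35}^{\circ }\), changing it shifts the predicted lateral velocity and yaw rate proportionally, with no tire model to invert.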

It is worth recalling that the neural model able to identify the process is not unique. Model (39) has shown good characteristics with respect to CarSim® measurements including noise and perturbations, and it has shown good performance when tracking the reference signals.

The stability of the identification error system:

$$\begin{aligned} {\hat{e}}_{v_{x,k}}= & {} {\hat{v}}_{x,k}-v_{x,k};\quad {\hat{e}}_{v_{y,k}}={\hat{v}}_{y,k}-{\tilde{v}}_{y,k};\quad {\hat{e}}_{\omega _{z,k}}={\hat{\omega }}_{z,k}-\omega _{z,k}; \end{aligned}$$
(40)

as well as the stability of the synaptic weights errors:

$$\begin{aligned} {\tilde{w}}_{1,k}= & {} \left( \begin{matrix}w_{11}^*-{\hat{w}}_{11,k}\\ w_{12}^*-{\hat{w}}_{12,k}\end{matrix}\right) \quad \tilde{w}_{2,k}=\left( \begin{matrix}w_{21}^*-{\hat{w}}_{21,k}\\ w_{22}^*-{\hat{w}}_{22,k}\end{matrix}\right) \quad {\tilde{w}}_{3,k}=\left( \begin{matrix} w_{31}^*-{\hat{w}}_{31,k}\\ w_{32}^*-{\hat{w}}_{32,k}\\ w_{33}^*-{\hat{w}}_{33,k}\\ w_{34}^*-{\hat{w}}_{34,k}\end{matrix}\right) \end{aligned}$$
(41)

are discussed in the following theorem:

Theorem 3.2

The RHONN identifier (39), trained by the EKF algorithm (13)–(18) to identify the lateral vehicle velocity \({\tilde{v}}_{y,k}\) from the reduced order observer (33) and to identify the vehicle longitudinal velocity \(v_{x,k}\) and yaw rate \(\omega _{z,k}\) from CarSim®, ensures that the identification errors (40) are SGUUB, and that the weight estimation errors (41) remain bounded. \(\diamond \)

The proof is given in “Appendix B”.

3.5 The Inverse Optimal Control Law

It is now possible to introduce the inverse optimal control law in order to force the ground vehicle to follow the desired references.

As commented before, the control inputs used for this task are the active front steering \(\delta _{c,k}\) (AFS) and the rear torque vectoring \(M_{z,k}\) (RTV), for the tracking of the lateral velocity \(v_{y,k,\textrm{ref}}\) and yaw rate \(\omega _{z,k,\textrm{ref}}\) references. No control strategy is presented here for the longitudinal velocity \(v_{x,k}\), since it is a bounded signal, as explained in [11, 12].

Based on the structure given in (29), the control law is expressed in the matrix form as follows:

$$\begin{aligned} u^*_k=\left( \begin{matrix}\delta _{c,k}\\ M_{z,k}\end{matrix}\right) =-\frac{1}{2}(R+P_2)^{-1}P_{1,k} \end{aligned}$$
(42)

being:

$$\begin{aligned} P_{1,k}= & {} g^T P(f({\hat{x}}_k)-x_{k+1,\textrm{ref}});\quad P_2=\frac{1}{2}g^TPg; \end{aligned}$$
(43)

and:

$$\begin{aligned} x_{k+1,\textrm{ref}}= & {} \left( \begin{matrix}v_{y,k+1,\textrm{ref}} \\ \omega _{z,k+1,\textrm{ref}}\end{matrix}\right) ;\quad x_{k,\textrm{ref}}=\left( \begin{matrix}v_{y,k,\textrm{ref}} \\ \omega _{z,k,\textrm{ref}}\end{matrix}\right) ; \nonumber \\ {\hat{x}}_{k+1}= & {} \left( \begin{matrix}{\hat{v}}_{y,k+1} \\ {\hat{\omega }}_{z,k+1}\end{matrix}\right) ;\quad {\hat{x}}_{k}=\left( \begin{matrix}{\hat{v}}_{y,k} \\ {\hat{\omega }}_{z,k}\end{matrix}\right) ; \end{aligned}$$
(44)
$$\begin{aligned} f({\hat{x}}_k)= & {} \left( \begin{matrix} {\hat{w}}_{21}\tanh ({\hat{v}}_{x,k})\tanh ({\hat{\omega }}_{z,k})+{\hat{w}}_{22}\tanh (a_{y,k})\\ {\hat{w}}_{31}\tanh (\delta _{d,k}) +{\hat{w}}_{32}\tanh (a_{y,k})+{\hat{w}}_{33}\tanh ({\hat{\beta }}_k)+{\hat{w}}_{34}\tanh (a_{x,k})-w_{35}^{\circ }\delta _{c,k}\end{matrix}\right) ; \end{aligned}$$
(45)
$$\begin{aligned} R= & {} \left( \begin{matrix}r_{11} &{} 0\\ 0 &{} r_{22}\end{matrix}\right) ; P=\left( \begin{matrix}p_{11} &{} p_{12}\\ p_{21} &{} p_{22}\end{matrix}\right) ; g=\left( \begin{matrix}w_{23}^{\circ } &{} 0\\ w_{35}^{\circ } &{} w_{36}^{\circ }\end{matrix}\right) . \end{aligned}$$
(46)

Notice that, with respect to (29), \(g({\hat{x}}_k)=g\) is here taken constant, which ensures controllability of the system.
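The evaluation of (42) and (43) reduces to a few \(2\times 2\) matrix operations. The following sketch, written in plain Python with hypothetical numerical values for \(P\), \(R\), \(g\), the drift vector and the reference, is only meant to illustrate the computation; it assumes dynamics of the form \({\hat{x}}_{k+1}=f({\hat{x}}_k)+g\,u_k\) and is not the implementation used for the simulations.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def inv2(A):
    """Inverse of a 2x2 matrix (assumed non-singular)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def inverse_optimal_control(f_x, x_ref_next, g, P, R):
    """u* = -1/2 (R + P2)^{-1} P1, with P1 and P2 as in (43)."""
    gT_P = matmul(transpose(g), P)
    err = [[f_x[0] - x_ref_next[0]], [f_x[1] - x_ref_next[1]]]
    P1 = matmul(gT_P, err)
    P2 = [[0.5 * v for v in row] for row in matmul(gT_P, g)]
    S = [[R[i][j] + P2[i][j] for j in range(2)] for i in range(2)]
    u = matmul(inv2(S), P1)
    return [-0.5 * u[0][0], -0.5 * u[1][0]]  # [delta_c, M_z]

# Hypothetical values, for illustration only.
g = [[0.5, 0.0], [0.2, 1.5]]
P = [[2.0, 0.3], [0.3, 1.0]]
R = [[0.1, 0.0], [0.0, 0.05]]
f_x = [1.0, -0.4]        # drift vector at step k
x_ref_next = [0.2, 0.1]  # reference at step k+1
print(inverse_optimal_control(f_x, x_ref_next, g, P, R))
```

In this sketch, setting \(R=0\) with an invertible \(g\) gives \(u^*_k=-g^{-1}(f({\hat{x}}_k)-x_{k+1,\textrm{ref}})\), so the predicted state reaches the reference in one step; a positive \(R\) trades part of this tracking accuracy for a smaller control effort.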

Now, along the same lines as Theorem 4.7 of [18], we can state the following theorem:

Theorem 3.3

Let \(x_{k,\textrm{ref}}\) be a bounded reference with bounded increments \(x_{k+1,\textrm{ref}}\). If there exists a matrix \(P=P^T>0\) such that:

$$\begin{aligned}{} & {} \frac{1}{2}P_{3,k}+\frac{1}{2}x^T_{k+1,\textrm{ref}}Px_{k+1,\textrm{ref}}-\frac{1}{2}{\hat{x}}^T_kP{\hat{x}}_k-\frac{1}{2}x^T_{k,\textrm{ref}}Px_{k,\textrm{ref}}-\frac{1}{4}P_{1,k}^T(R+P_2)^{-1}P_{1,k}\le \nonumber \\{} & {} -\frac{1}{2}\Vert P\Vert ~\Vert f({\hat{x}}_k)\Vert ^2-\frac{1}{2}\Vert P\Vert ~\Vert x_{k+1,\textrm{ref}}\Vert ^2-\frac{1}{2}\Vert P\Vert ~\Vert {\hat{x}}_k\Vert ^2-\frac{1}{2}\Vert P\Vert ~\Vert x_{k,\textrm{ref}}\Vert ^2 \end{aligned}$$
(47)

where:

$$\begin{aligned} P_{1,k}= & {} g^TP\big (f({\hat{x}}_k)-x_{k+1,\textrm{ref}}\big ); \quad P_2=\frac{1}{2}g^TPg; \quad P_{3,k}=f^T({\hat{x}}_k)Pf({\hat{x}}_k); \end{aligned}$$
(48)

then the control law (42), based on the neural identifier (39), ensures global asymptotic convergence to zero of the tracking error \(\xi _k ={\hat{x}}_k -x_{k,\textrm{ref}}.\) Moreover, this control law is inverse optimal, i.e. it minimizes the cost functional \(\mathcal{J}(\xi _k)=V(\xi _k)\) given by (28), with \(l(\xi _k)=-V^*(\xi _{k+1})+V^*(\xi _k)-u^{*T}_k R u_k^*\).\(\diamond \)

Fig. 3 CarSim® maneuver: a Steering wheel angle \(\delta ^{s,w}_{d,k}\), (deg vs. s). b Lateral tire-road friction coefficient \(\mu _y\), (– vs. s). c Longitudinal acceleration \(a_{x,k}\), (g’s vs. s). d Lateral acceleration \(a_{y,k}\), (g’s vs. s)

It is worth stressing that, in general, there are no analytical conditions that allow knowing a priori whether (47) is feasible. However, it is possible to proceed using heuristic methods, such as the nature-inspired optimization process known as Particle Swarm Optimization (PSO) [50, 51] used in this work, to find a positive definite symmetric matrix P verifying (47). Moreover, the use of the PSO algorithm also allows reaching better performance in terms of tracking error, since it compares the results obtained for the tested P matrices satisfying (47) and returns the one giving the smallest error.

Fig. 4 CarSim® maneuver: Open loop vehicle (Red), closed loop vehicle (Yellow)

Fig. 5 Reduced-order state-observer gains: a \(k_{o,1}\) (– vs. s). b \(k_{o,2}\) (– vs. s). c \(\kappa \) (– vs. s)

4 Simulation Results

To better test the controller and emphasize its performance, in this paper the CarSim® extended model simulator, which reproduces very closely the physical behavior of a ground vehicle, is used instead of a mathematical model of the plant.

The behavior of the proposed nonlinear inverse optimal controller is shown for an interesting case, in which the vehicle performs an ATI 90-90 maneuver. The ATI 90-90 maneuver is described in the standard ISO/TS 16949. The vehicle has an initial speed of 27.8 m/s (about 100 km/h) and then moves in open loop, with the throttle valve released and without braking.

The driver steering wheel angle \(\delta ^{s,w}_{d,k}\), related by a ratio of 16:1 with the steering angle \(\delta _{d,k}\), is shown in Fig. 3a in which a superimposed random noise is also considered.

Table 1 Mean error indicators for sample time variations in the observer (33)

A further source of difficulty is taken into account by considering an abrupt change of the tire-road friction coefficient, where \(\mu _{d} =0.9\) and \(\mu _w=0.5\) correspond to dry and wet surfaces, respectively (see Fig. 3b). Figure 3c, d show the longitudinal and lateral accelerations, respectively.

Figure 4 shows the importance of being able to rely on an active controller for vehicle stability improvements. The vehicle is shown in red when the controller is disabled (open loop system) and in yellow when the controller is enabled (closed loop system). Note that the controlled vehicle remains in a safer driving condition, while the uncontrolled vehicle exhibits strong drifting due to adhesion loss.

Fig. 6 Identification and observation errors: a \({\hat{e}}_{x,k}=v_{x,k}-\hat{v}_{x,k}\), (– vs. s). b \({\hat{e}}_{y,k}=v_{y,k}-\hat{v}_{y,k}\), (– vs. s). c \({\hat{e}}_{\omega _{z,k}}=\omega _{z,k}-\hat{\omega }_{z,k}\), (– vs. s). d \({\tilde{e}}_{v_x,k}=v_{x,k}-{\tilde{v}}_{x,k}\), (– vs. s). e \(\tilde{e}_{v_y,k}=v_{y,k}-{\tilde{v}}_{y,k}\), (– vs. s)

Fig. 7 a Open loop system: vehicle lateral velocity \(v_{y,k}\) (solid), reference \(v_{y,k,\textrm{ref}}\) (dash-dot), identifications \({\hat{v}}_{y,k}\) (dashed), (m/s vs. s). b Open loop system: vehicle yaw rate \(\omega _{z,k}\) (solid), reference \(\omega _{z,k,\textrm{ref}}\) (dash-dot), identifications \(\hat{\omega }_{z,k}\) (dashed), (rad/s vs. s). c Closed loop system: vehicle lateral velocity \(v_{y,k}\) (solid), reference \(v_{y,k,\textrm{ref}}\) (dash-dot), identifications \({\hat{v}}_{y,k}\) (dashed), (m/s vs. s). d Closed loop system: vehicle yaw rate \(\omega _{z,k}\) (solid bold), reference \(\omega _{z,k,\textrm{ref}}\) (dash-dot), identifications \({\hat{\omega }}_{z,k}\) (dashed), (rad/s vs. s)

In order to test the quality and robustness of the observer (33), the initial conditions were varied by selecting \(v_{x,k,0}=28\) [m/s] and \(v_{y,k,0}=0.005\) [m/s]. Results are shown in Fig. 5, where the discrete gains \(k_{o,1}, k_{o,2}\) and the stability parameter \(\kappa \), which ensures the convergence of the observed dynamics to the real vehicle model, are presented.

Fig. 8 a Longitudinal velocity \(v_{x,k}\) (solid), identification \(\hat{v}_{x,k}\) (dashed) and observed \(\tilde{v}_{x,k}\) (gray), (m/s vs. s). b AFS \(\delta _{c,k}\) (solid bold), (rad vs. s). c RTV \(M_{z,k}\) (solid bold), (kNm vs. s)

Fig. 9 Synaptic weights of the Neural Identifier in (39), (– vs. s)

Performance and stability of the discrete-time state-observer depend on the sample time size. For this reason, sample time variations are introduced and three basic indicators, namely the integral square error (ISE), the integral time square error (ITSE) and the integral of absolute error (IAE), are reported in Table 1. They are calculated as follows:

$$\begin{aligned} \text {ISE}=\sum _{k=1}^N e(k)^2;~~~\text {ITSE}=\sum _{k=1}^N ke(k)^2;~~~ \text {IAE}=\sum _{k=1}^N \mid e(k)\mid . \end{aligned}$$
(49)
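The three indicators in (49) are plain sums over the error samples. A minimal sketch, with a hypothetical function name and a short made-up error sequence used only as a check, is:

```python
def error_indices(errors):
    """Return (ISE, ITSE, IAE) as defined in (49); sample index k starts at 1."""
    ise = sum(e * e for e in errors)
    itse = sum(k * e * e for k, e in enumerate(errors, start=1))
    iae = sum(abs(e) for e in errors)
    return ise, itse, iae

# Made-up error samples, for illustration only.
print(error_indices([1.0, -2.0, 3.0]))  # (14.0, 36.0, 6.0)
```

Note that, as written in (49), ITSE weights each squared error by the sample index \(k\) rather than by the elapsed time, so values obtained with different sample times are not directly on the same scale.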

Notice that for \(T=0.0001\) and \(T=0.001\) the observer is robust with respect to sample time variation, while the discriminant in Eq. (34) remains greater than zero.

Quality and performance of both the identifier and the observer are shown in Fig. 6. In particular, Fig. 6a–c show the identification errors of the longitudinal velocity \({\hat{e}}_{v_{x,k}}\), lateral velocity \({\hat{e}}_{v_{y,k}}\) and yaw rate \({\hat{e}}_{\omega _{z,k}}\), respectively, whereas Fig. 6d, e present the observation errors of the longitudinal (\({\tilde{e}}_{v_{x,k}}\)) and lateral (\({\tilde{e}}_{v_{y,k}}\)) velocities.

Figure 7 compares the vehicle behavior in the open-loop and closed-loop systems. Notice how, in Fig. 7a, b, the case of the open-loop system, the lateral velocity \(v_{y,k}\) and the yaw rate \(\omega _{z,k}\) do not track the safer references \(v_{y,k,\textrm{ref}}\), \(\omega _{z,k,\textrm{ref}}\). Instead, in Fig. 7c, d, the case of the closed-loop system, the references \(v_{y,k,\textrm{ref}}\) and \(\omega _{z,k,\textrm{ref}}\) are tracked as expected.

Moreover, the longitudinal velocity \(v_{x,k}\), in closed-loop system, and the control efforts in terms of Active Front Steering (\(\delta _{c,k}\)) and Rear Torque Vectoring (\(M_{z,k}\)) can be appreciated in Fig. 8.

Finally, Fig. 9 presents the synaptic neural weights \({\hat{w}}_{11,k}\), ..., \({\hat{w}}_{34,k}\) during the online adaptation in the neural identifier (39).

Parameters used in the observer (33), neural identifier (39), and inverse optimal controller (42) are listed in Table 2.

Table 2 Parameters used in the control scheme

To test the quality and performance of the inverse optimal controller, the authors propose a fair comparison between optimal and non-optimal methods, in order to verify the advantages of using such a control approach. The comparison is said to be ‘fair’ because the neural identifier in (39) is used in both cases, so that only the contribution of the controllers is highlighted.

The non-optimal control law can be designed as follows. For the tracking errors \(e_{v_y,k}={\hat{v}}_{y,k}-v_{y,k,\textrm{ref}}\) and \(e_{\omega _z,k}={\hat{\omega }}_{z,k}-\omega _{z,k,\textrm{ref}}\), it is possible to choose a Lyapunov candidate function of the form \(V_k=e_{v_y,k}^2+e_{\omega _z,k}^2\) and to impose the condition \(\varDelta V_k=-k_1 e_{v_y,k}^2-k_2 e_{\omega _z,k}^2\), which ensures that the Lyapunov increments are negative definite. Solving for the control laws, one gets:

$$\begin{aligned} {\overline{u}}_{k,1}= & {} -(k_1-\frac{1}{2})({\hat{v}}_{y,k}-v_{y,k,\textrm{ref}})+v_{y,k+1,\textrm{ref}}-{\hat{v}}_{y,k}\nonumber \\{} & {} -{\hat{w}}_{31}\tanh (\delta _{d,k})-{\hat{w}}_{32}\tanh (a_{y,k})-{\hat{w}}_{33}\tanh ({\hat{\beta }}_k)-{\hat{w}}_{34}\tanh (a_{x,k})\nonumber \\ {\overline{u}}_{k,2}= & {} -(k_2-\frac{1}{2})(\hat{\omega }_{z,k}-\omega _{z,k,\textrm{ref}})+\omega _{z,k+1,\textrm{ref}}-\hat{\omega }_{z,k}\nonumber \\{} & {} -{\hat{w}}_{21}\tanh ({\hat{v}}_{x,k})\tanh (\hat{\omega }_{z,k})-{\hat{w}}_{22}\tanh (a_{y,k})-c_1 {\overline{u}}_{k,1} \end{aligned}$$
(50)

where \({\overline{u}}_{k,1}\) represents the AFS and \({\overline{u}}_{k,2}\) the RTV.

Results obtained applying the non-optimal control law (50) are shown in Fig. 10.

Fig. 10 Non-optimal control in (50): a Closed loop system: vehicle lateral velocity \(v_{y,k}\) (solid), reference \(v_{y,k,\textrm{ref}}\) (dash-dot), identifications \({\hat{v}}_{y,k}\) (dashed), [m/s vs. s]. b Closed loop system: vehicle yaw rate \(\omega _{z,k}\) (solid bold), reference \(\omega _{z,k,\textrm{ref}}\) (dash-dot), identifications \(\hat{\omega }_{z,k}\) (dashed), [rad/s vs. s]. c AFS \(\delta _{c,k}\) (solid bold), [rad vs. s]. d RTV \(M_{z,k}\) (solid bold), [kNm vs. s]

The optimal controller is also validated numerically in terms of the power consumption of the actuators, for both the optimal and the non-optimal control techniques. Notice that the non-optimal method also tracks the references well, as shown in Fig. 10a–d. However, the Inverse Optimal Control presented in this work provides better tracking performance while demanding less power. The numerical results are presented in Table 3.

Finally, the P matrix in Theorem 3.3, with \(P>0\) and \(P=P^T\), has been calculated by means of a nature-inspired optimization process, named particle swarm optimization (PSO), as explained in [50, 51]. The logic behind this algorithm is shown in Fig. 11. The first step of the computation is the parameter initialization: number of variables to be optimized, number of swarm particles, number of iterations, and cognitive and social weighting factors. The second step consists in generating random numeric values to be tested in simulation. The random values are selected so as to satisfy the constraint which, for this application, is the positive definiteness of the P matrix of the Lyapunov candidate function, given in (46). Next, the random values are tested in simulation and the obtained performances are written to output files. The results are then analyzed in terms of a statistical criterion: for this application, the mean square error of the concatenation of the tracking errors for the lateral velocity and yaw rate (\(e_k\)) is considered. If the error obtained during the current iteration is smaller than in the previous iterations, the optimal combination is updated. Otherwise, the algorithm executes a new iteration by modifying the random values, until the prescribed number of iterations is reached.

As a reference, the PSO results obtained during the last execution are listed in Table 4. It is worth noting that the stationary point reached by the algorithm does not necessarily represent a global minimum.
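The search loop described above can be sketched as follows. This is a generic, simplified PSO written for illustration and not the implementation used in this work: the cost function `toy_cost` is a hypothetical stand-in for the mean square tracking error returned by a CarSim® run, and the swarm size, iteration count, bounds and weighting factors are assumed values. Candidates that violate the positive definiteness constraint are simply rejected.

```python
import random

def is_positive_definite(p11, p12, p22):
    """Sylvester's criterion for the symmetric matrix [[p11, p12], [p12, p22]]."""
    return p11 > 0.0 and p11 * p22 - p12 * p12 > 0.0

def pso_search(cost, n_particles=20, n_iters=50, bounds=(-5.0, 5.0),
               inertia=0.7, c_cog=1.5, c_soc=1.5, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds

    def sample_feasible():
        # Draw random entries until the constraint P > 0 holds.
        while True:
            p = [rng.uniform(lo, hi) for _ in range(3)]
            if is_positive_definite(*p):
                return p

    pos = [sample_feasible() for _ in range(n_particles)]
    vel = [[0.0] * 3 for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(3):
                vel[i][d] = (inertia * vel[i][d]
                             + c_cog * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c_soc * rng.random() * (g_pos[d] - pos[i][d]))
            cand = [pos[i][d] + vel[i][d] for d in range(3)]
            if not is_positive_definite(*cand):
                continue  # infeasible move: keep the previous position
            pos[i] = cand
            c = cost(cand)
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = cand[:], c
                if c < g_cost:
                    g_pos, g_cost = cand[:], c
    return g_pos, g_cost

def toy_cost(p):
    """Hypothetical stand-in for the simulated mean square tracking error."""
    return (p[0] - 2.0) ** 2 + (p[1] - 0.5) ** 2 + (p[2] - 1.0) ** 2

best_P, best_error = pso_search(toy_cost)
print(best_P, best_error)
```

In an actual application, each call to the cost function would correspond to one closed-loop simulation with the candidate P, which is why the numbers of particles and iterations directly determine the computational burden.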

5 Conclusions

This paper proposes a nonlinear discrete-time inverse optimal controller for the active control of a ground vehicle, to ensure safe driving conditions in the case of adhesion loss and hazardous maneuvers. The controller is based on a RHONN identifier trained by the extended Kalman filter, and the vehicle lateral velocity is reconstructed by a discrete-time reduced order state observer. According to test maneuvers simulated in CarSim®, the proposed approach shows a proper identification of the vehicle dynamics in terms of identification errors, and a good performance of the controller in terms of reference tracking errors, even in the presence of parameter uncertainties, measurement noise and unmodelled dynamics. A nature-inspired optimization algorithm known as particle swarm optimization (PSO) is used for the control gain settings. The main contributions of this work concern control aspects. In fact, with this approach, one can avoid the hard task of inverting Pacejka’s lateral tire equation, and asymptotic stability of the tracking errors can be ensured even without knowing Pacejka’s coefficients, which are time-varying and difficult to estimate. Furthermore, a fair numerical comparison between the proposed control scheme and a non-optimal strategy shows that the former approach provides the same control performance in terms of tracking errors, while minimizing the control power consumption.

Table 3 Comparison between optimal and non-optimal control efforts in terms of power consumption

Fig. 11 PSO algorithm scheme

Table 4 Obtained results of the PSO algorithm in the last execution

Future work will involve electric power-trains, in order to generate torque vectoring based on electric motor torques and angular velocities, as well as active differential systems. In the case of vehicles with a high center of gravity, the introduction of the roll dynamics in both the observer and the neural identifier can also improve the control performance. Finally, comparisons with adaptive dynamic programming (ADP) for solving the HJB equations will be studied.