Neural Network Inverse Optimal Control of Ground Vehicles

In this paper an active controller for ground vehicle stability is presented. The objective of this controller is to force the vehicle to track a desired reference, ensuring safe driving conditions in the case of adhesion loss during hazardous maneuvers. To this aim, a nonlinear discrete-time inverse optimal controller based on neural network identification is designed, using a recurrent high-order neural network (RHONN) trained by an extended Kalman filter. The RHONN ensures stability of the identification error, while the controller ensures stability of the tracking errors. Moreover, a discrete-time reduced-order state observer is utilized to reconstruct the lateral vehicle dynamics, which are not usually measured. For the control problem, the references of the lateral velocity and yaw rate are given by a dynamic system mimicking an ideal vehicle with non-decreasing lateral tire characteristics. The proposed approach avoids the identification of the Pacejka lateral tire parameters, thus simplifying the determination of the control input. Moreover, an optimal control is proposed to optimize the actuator effort and power, which are usually bounded. Control gains are determined using "nature-inspired" optimization algorithms such as particle swarm optimization. Test maneuvers, performed through the full vehicle simulator CarSim®, have been used to assess the correctness, quality and performance of the observer, the neural identifier and the inverse optimal controller. Robustness of the reduced-order discrete-time state observer is also discussed for different sample times. Finally, a fair comparison between optimal and non-optimal control schemes is presented, highlighting the numerical results obtained in simulation.


Introduction
In recent years, driving safety has been improved by means of active actuators. Most applications utilize steer-by-wire systems, such as active front steering (AFS), in order to assist the driver in complex and hazardous maneuvers. In general, vehicle agility, maneuverability and stability are improved by means of AFS, as explained, for instance, in [1][2][3][4][5]. Driving safety can also be largely improved using the rear torque vectoring (RTV) technique. Recent contributions on rear torque vectoring control can be found in [6][7][8][9][10]. Combined AFS and RTV actions can be applied to ensure driving stability [11][12][13]. In the same line of ideas, this paper deals with the use of both AFS and RTV for the active control of the vehicle.
A major criticism of the plant-based control strategies utilized in [11][12][13] is that, when calculating the explicit expression of the control law that stabilizes the vehicle attitude, the Pacejka parameters defining the lateral tire forces must be known at all times. This strong assumption is made, for instance, in [14,15], where the authors assume the lateral tire forces to be known. However, such intrinsic parameters are subject to wear and deterioration, so their estimation remains an arduous task.
Furthermore, in the proposed control approach, a discrete-time reduced-order observer reconstructs the vehicle lateral dynamics, which are not usually measured, and a RHONN is used to identify the observed vehicle dynamics [16][17][18]; the synaptic weights provide neural adaptation, avoiding the need for knowledge of the Pacejka tire parameters.
The training algorithm for the RHONN weight updating is carried out by an extended Kalman filter (EKF), obtaining a model of the vehicle that is then used to design an inverse optimal controller. The main advantage of this strategy is that, in the RHONN-based model, the AFS input appears linearly in the dynamics, and not implicitly in the tire characteristic [19]. This allows calculating the AFS input without inverting the tire model, which is not an obvious task since the tire model depends on experimental parameters and on the vehicle vertical dynamics. Furthermore, the RTV control law does not require the explicit expression of the front and rear lateral tire forces, which are usually not available. The originality and novelty of the proposed control method lie in the fact that the AFS, considered as a control input, is calculated without knowing the Pacejka tire parameters. Moreover, since the RTV has limited actuation, bounded by the vehicle speed and the yaw inertia of the vehicle, an optimal approach minimizing the control efforts is considered. These two aspects represent the main contributions of this paper. Another notable aspect is that the controller is determined using the inverse optimal control technique [17,18]. In the classical optimal control setting, a meaningful cost functional is given a priori and then used to calculate the control law by solving a Hamilton-Jacobi-Bellman (HJB) equation, which is, in general, a difficult task. The inverse optimal control technique can be used to overcome this problem by choosing an a priori candidate Lyapunov function, which is then used to calculate the control law and a meaningful cost functional [17,18]. This scheme is here proposed to control the vehicle lateral and yaw dynamics in the case of drifting and adhesion loss, which are commonly considered dangerous situations. A further advantage of this control technique is that it minimizes the
actuator effort. It is worth noting that controllers with EKF identification have been used in real-time applications [20][21][22][23][24]. The availability of high-performance digital devices makes the implementation possible also in the case of vehicles, which nowadays have enough computational power to guarantee that all the required calculations are carried out efficiently. From a computational point of view, there exists another technique to solve the HJB equation for the optimal control, as explained in [25,26], where the authors utilize adaptive dynamic programming (ADP) to calculate the solution. Detailed comparisons with ADP, however, are left for future work.
In the literature, the combined use of plant-based observers and neural network controllers is quite common, as for instance in [27], where the oxygen excess ratio in a polymer electrolyte membrane fuel cell is controlled. In [28,29], a hybrid adaptive learning neural network control and a discrete-time adaptive neural network control are used to improve steer-by-wire systems, which are usually negatively affected by friction and self-aligning torques.
To validate the proposed controller, the CarSim® platform is used to realistically reproduce a vehicle performing a challenging ATI 90/90 steer maneuver. This platform was chosen because the software accurately predicts the real vehicle response, as validated through extensive experimental tests conducted by automotive companies such as Ford Motor Company and Chrysler, among others.
The paper is organized as follows. Section 2 introduces some preliminaries about neural networks, RHONN identification and inverse optimal control, whereas in Sect. 3 the proposed method is applied to a ground vehicle. In Sect. 4, quality and performance of the proposed controller are shown via simulations in CarSim®. Some comments conclude the paper.

Recalls on Nonlinear Neural Network RHONN Identification and Discrete-Time Inverse Optimal Control for Trajectory Tracking
Given a generic multi-input multi-output (MIMO) discrete-time nonlinear system of the form

x_{k+1} = F(x_k, u_k),    (1)

with F : R^n × R^m → R^n an analytic vector field such that F(0, 0) = 0 [16], it can be approximated by a discrete-time RHONN [16][17][18]

x_{i,k+1} = w_{i,k}^T z_i(x_k, u_k),  i = 1, ..., n.    (2)

This result is particularly useful in some cases, e.g. when the parameters of the original system are not fully known. In (2), x_{i,k} represents the state of the i-th neuron, w_{i,k} = (w_{i,1,k}, ..., w_{i,L_i,k})^T, i = 1, ..., n, are the adjustable synaptic weights of the neural network, and L_i is the number of high-order connections. For L_i sufficiently large, (2) approximates the system to be identified to any degree of accuracy. The L_i-dimensional vector z_i collects products of terms γ_{i,j,k}, which are either external inputs or states of neurons passed through a sigmoid function. The functions s(x_{i,k}), i = 1, ..., n, are typically sigmoidal, monotonically increasing and differentiable functions, called activation functions, of the form

s(x) = α_i / (1 + e^{-β_i x}) - τ_i,

where α_i, β_i > 0 and τ_i ≥ 0 are constants. Sigmoid activation functions commonly used in applications are the logistic functions, obtained for α_i = β_i = 1, τ_i = 0, and the hyperbolic tangent functions, obtained for α_i = β_i = 2, τ_i = 1. In this paper we consider the particular case in which (2) is described by the discrete-time RHONN depicted in Fig. 1, described by the following equation [30]:

x_{i,k+1} = w_{i,k}^T z_i(x_k) + w_i^{•T} γ_{i,k},    (6)

where w_i^{•T} is a constant synaptic weight vector, and the functions γ_{i,k} collect the external inputs, which therefore enter the model linearly. This last choice simplifies the calculation of the control signal needed to guarantee the closed-loop performance.
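As a minimal illustration, the forward pass of one RHONN neuron with a high-order regressor can be sketched as follows (a sketch only: the regressor terms and weight values are illustrative, not the paper's):

```python
import numpy as np

def sigmoid(x, alpha=1.0, beta=1.0, tau=0.0):
    """Sigmoid activation s(x) = alpha / (1 + exp(-beta*x)) - tau.
    alpha = beta = 1, tau = 0 gives the logistic function;
    alpha = beta = 2, tau = 1 recovers the hyperbolic tangent."""
    return alpha / (1.0 + np.exp(-beta * x)) - tau

def rhonn_step(w, z):
    """One-step prediction of the i-th neuron: x_{i,k+1} = w_i^T z_i."""
    return float(np.dot(w, z))

# Hypothetical high-order regressor: sigmoided states, a second-order
# product term, and an external input entering linearly.
x = np.array([0.5, -0.2])      # current neuron states
u = 0.1                        # external input
z = np.array([sigmoid(x[0]),
              sigmoid(x[1]),
              sigmoid(x[0]) * sigmoid(x[1]),   # high-order connection
              u])
w = np.array([0.3, -0.1, 0.2, 0.5])            # illustrative weights
x_next = rhonn_step(w, z)
```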
Let us now denote by w_i^*, w_i^{•*}, i = 1, ..., n, the constant (unknown) weights minimizing, on a fixed compact set, the norm of the identification error between (6) and the system to be identified [17]. Therefore, considering the approximation errors ε_{z_i}, i = 1, ..., n, one rewrites (6) as

x_{i,k+1} = w_i^{*T} z_i(x_k) + w_i^{•*T} γ_{i,k} + ε_{z_i}.    (9)

For (9) one can consider a RHONN identifier

x̂_{i,k+1} = ŵ_{i,k}^T z_i(x̂_k) + w_i^{•T} γ_{i,k},    (10)

with x̂_k the estimate of x_k and ŵ_{i,k} the estimate of w_i^*. Furthermore, in (10) it is assumed that the value of w_i^{•*} can be estimated off-line; this can be done for a large class of systems in affine form. The weight estimation error is w̃_{i,k} = w_i^* - ŵ_{i,k}, and its dynamics are w̃_{i,k+1} = w̃_{i,k} - (ŵ_{i,k+1} - ŵ_{i,k}), since w_i^* is constant.

The EKF Training Algorithm
For the on-line learning of the RHONN weights in (10), one can use a modified version of the well-known EKF algorithm [31,32], in which the weights become the states to be estimated. The main objective of the EKF is to find the optimal values of the weight vectors ŵ_{i,k} such that the identification errors are minimized. The EKF solution to the training problem is [31,32]

ŵ_{i,k+1} = ŵ_{i,k} + η_{i,k} K_{i,k} e_{i,k},

where

K_{i,k} = P_{i,k} H_{i,k} M_{i,k}

is the Kalman gain vector, i = 1, ..., n, and η_{i,k} ∈ [0, 1] is the learning rate. Here P_{i,k} ∈ R^{L_i×L_i} is the prediction-error covariance matrix, updated as

P_{i,k+1} = P_{i,k} - K_{i,k} H_{i,k}^T P_{i,k} + Q_{i,k},

for i = 1, ..., n, with Q_{i,k} ∈ R^{L_i×L_i} the state-noise covariance matrix. Moreover, the global scaling matrix M_{i,k} is given by

M_{i,k} = [R_{i,k} + H_{i,k}^T P_{i,k} H_{i,k}]^{-1},

for i = 1, ..., n, where R_{i,k} ∈ R, and H_{i,k} ∈ R^{L_i×m} is a matrix whose entries are the derivatives of the neural network outputs x̂_{i,k} with respect to the weights ŵ_{i,j}. Note that H_{i,k}, K_{i,k}, and P_{i,k} are bounded [33]. For the resulting weight and identification error dynamics we will introduce a stability property, given in the following definition.
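The update equations above can be sketched for a single neuron as follows (a minimal sketch; the variable names and the scalar-output simplification are ours, not the paper's):

```python
import numpy as np

def ekf_weight_update(w, P, H, e, eta=0.5, Q=None, R=1.0):
    """One EKF training step for one neuron's weight vector.
    w : (L,) current weight estimate
    P : (L,L) prediction-error covariance
    H : (L,) derivatives of the neuron output w.r.t. the weights
    e : scalar identification error
    eta : learning rate in [0, 1]; Q, R : noise covariances."""
    if Q is None:
        Q = 1e-4 * np.eye(len(w))
    M = 1.0 / (R + H @ P @ H)           # scalar case of the scaling matrix
    K = P @ H * M                       # Kalman gain vector
    w_new = w + eta * K * e             # weight update
    P_new = P - np.outer(K, H) @ P + Q  # covariance update
    return w_new, P_new
```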

Definition 2.1
The solutions of a system x_{k+1} = φ(x_k) are Semi-Globally Uniformly Ultimately Bounded (SGUUB) if, for any compact set Ω and all initial conditions x_{k_0} ∈ Ω, there exist an ε > 0 and a number N(ε, x_{k_0}) such that ‖x_k‖ < ε for all k ≥ k_0 + N. It is worth noting that, whereas it will be proven that w̃_{i,k} and e_{i,k} are stable, a bound on the approximation error cannot be given a priori, since it depends on the accuracy of the neural model chosen by the control designer. In this sense, a heuristic trial-and-error procedure is repeated until the results are acceptable.

Discrete-Time Inverse Optimal Control for Trajectory Tracking
The analysis of the inverse optimal control for trajectory tracking will be performed for input-affine systems, with the associated cost functional

J(ξ_k) = Σ_{j=k}^{∞} ( l(ξ_j) + u_j^T R(ξ_j) u_j ),    (22)

where ξ_k = x̂_k - x_{k,ref} is the tracking error between the neural network state vector x̂_k and the desired trajectory x_{k,ref}. Furthermore, l(ξ_j) : R^n → R^+ is a positive semidefinite function, and R(ξ_j) : R^n → R^{m×m} is a real, symmetric, positive definite weighting matrix.
For the sake of simplicity, in this work the elements of R(ξ_j) will be taken constant, namely R(ξ_j) = R [18]. The cost functional (22) can be rewritten as

J(ξ_k) = l(ξ_k) + u_k^T R u_k + J(ξ_{k+1}),    (23)

where, without loss of generality, one requires that J(ξ_0) = 0. The existence of a control u_k ensuring that J(ξ_k) in (22) is finite can be given in terms of the existence of a Lyapunov function V(ξ_k). In fact, following [34][35][36][37][38] and Bellman's optimality principle [39,40], one looks for a Lyapunov function V(ξ_k) satisfying

V(ξ_k) = min_{u_k} [ l(ξ_k) + u_k^T R u_k + V(ξ_{k+1}) ],    (24)

denoting by u_k^* the optimal control which minimizes the right-hand side. Here u_k^* can be determined by setting to zero the gradient of the right-hand side of (24) with respect to u_k,

u_k^* = -(1/2) R^{-1} g^T ∂V(ξ_{k+1})/∂ξ_{k+1},    (26)

so obtaining the controller that globally stabilizes the tracking error ξ_k and minimizes the cost functional (22) with the condition V(ξ_0) = 0. These considerations justify the following definition.

Definition 2.2 (Inverse Optimal Control for Trajectory Tracking
). The control (26) is a global inverse optimal controller for trajectory tracking if: (i) it guarantees global asymptotic stability of the tracking error ξ_k = 0; (ii) it minimizes the cost functional (22). To calculate the inverse optimal control, let us consider a candidate Lyapunov function of the form

V(ξ_k) = (1/2) ξ_k^T P ξ_k,  P = P^T > 0.

If (i) and (ii) are satisfied, the control law (26) becomes

u_k^* = -(1/2) ( R + (1/2) g^T P g )^{-1} g^T P ( f(x̂_k) - x_{k+1,ref} ).

It is worth pointing out that P and R are positive definite and symmetric matrices; thus, the existence of the inverse in this expression is ensured.
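As an illustration, the resulting tracking law can be sketched numerically as follows (a minimal sketch assuming constant g and R and the quadratic candidate V(ξ) = ½ ξ^T P ξ, following the form used in [18]; shapes and values are illustrative):

```python
import numpy as np

def inverse_optimal_control(f_xk, x_ref_next, g, P, R):
    """Inverse optimal tracking law (sketch):
    u* = -1/2 (R + 1/2 g^T P g)^{-1} g^T P (f(x_k) - x_{k+1,ref}),
    derived from the candidate Lyapunov function V(xi) = 1/2 xi^T P xi.
    f_xk : (n,) drift f(x_k); x_ref_next : (n,) next reference state;
    g : (n,m) constant input matrix; P, R : positive definite matrices."""
    A = R + 0.5 * g.T @ P @ g          # symmetric positive definite
    return -0.5 * np.linalg.solve(A, g.T @ P @ (f_xk - x_ref_next))
```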
It is worth noting that the inverse optimal control method is a control strategy derived on the same basis as optimal control theory, with the difference that the HJB equation is not solved first: the optimal cost function is replaced by a candidate Lyapunov function, known a priori, and the control law is then calculated.


Application to a Ground Vehicle

In this section we apply the previous results on neural identification and inverse optimal control to a ground vehicle represented by CarSim®. The control scheme is described in Fig. 2, in which the steering wheel angle δ_{d,k}, the longitudinal and lateral accelerations a_{x,k}, a_{y,k}, the longitudinal velocity v_{x,k} and the yaw rate ω_{z,k} are measured from CarSim®. A discrete-time reduced-order state observer estimates the lateral vehicle velocity ṽ_{y,k}, which is in general not measured. A neural identifier then provides an input-affine model, avoiding the hard task of inverting the lateral tire characteristic when deriving the control laws. The synaptic weights w_{i,j,k} are adjusted on-line by the extended Kalman filter, minimizing the identification errors ê_{i,k}. Finally, the inverse optimal controller, based on the neural model v̂_{x,k}, v̂_{y,k}, ω̂_{z,k} and on the references v_{y,k,ref}, ω_{z,k,ref} with their increments, provides the AFS δ_{c,k} and the RTV M_{z,k}, which are the CarSim® control inputs. In this work the authors consider the combined use of an observer, to estimate the unmeasured vehicle dynamics, and a neural identifier, to obtain an input-affine model in which the Pacejka parameters are adapted through the neural synaptic weights. This strategy ensures global exponential stability of the estimation error, given by the observer, and practical stability, guaranteed by the identifier. It is worth noting that the element that makes it possible to avoid the Pacejka tire parameters of the lateral tire forces is the neural identifier, which adapts the synaptic weights under the Kalman learning rules. The latter works under the hypothesis that all the dynamics to be identified are available. To this aim, a discrete-time observer is presented to reconstruct the unavailable vehicle lateral dynamics.
CarSim® realistically mimics the vehicle dynamics. The model contains many dynamic effects describing the complex behavior of the vehicle. However, for vehicles with a low center of gravity, the essential dynamics describing the vehicle attitude are the longitudinal and lateral velocities and the yaw rate. These are well described by the so-called single-track vehicle model shown in Fig. 2, very often used to design active controllers for ground vehicles [44][45][46].
The interested reader can find in [2] a discrete-time version of such a model, obtained by means of a variational integrator (the symplectic Euler method), representing the discrete-time version of the single-track model. Even if this model ensures better performance for (relatively) high sampling periods, a more popular model is the Euler approximation of the single-track model:

v_{x,k+1} = v_{x,k} + T ( v_{y,k} ω_{z,k} + (F_{x,f} cos δ_k - F_{y,f} sin δ_k + F_{x,r}) / m )
v_{y,k+1} = v_{y,k} + T ( -v_{x,k} ω_{z,k} + (F_{y,f} cos δ_k + F_{x,f} sin δ_k + F_{y,r}) / m )    (30)
ω_{z,k+1} = ω_{z,k} + T ( l_f (F_{y,f} cos δ_k + F_{x,f} sin δ_k) - l_r F_{y,r} + M_{z,k} ) / J_z

where T is the sampling period, v_{x,k}, v_{y,k}, ω_{z,k} are the vehicle longitudinal, lateral, and yaw velocities, δ_k = δ_{d,k} + δ_{c,k} is the overall front steering angle, and F_{y,f}, F_{y,r} are the lateral forces, which depend on the tire slip angles

α_{f,k} = δ_k - arctan( (v_{y,k} + l_f ω_{z,k}) / v_{x,k} ),  α_{r,k} = -arctan( (v_{y,k} - l_r ω_{z,k}) / v_{x,k} ),

where δ_{d,k} is the driver steering angle and δ_{c,k} is the AFS input. Furthermore, M_{z,k} is the RTV input, and F_{x,f}, F_{x,r} are the longitudinal forces, depending on the front/rear tire slips λ_{f,k}, λ_{r,k}, where ω_{w,f,k}, ω_{w,r,k} are the front/rear wheel angular velocities and R_w is the wheel radius. Finally, m, J_z are the vehicle mass and yaw moment of inertia, l_f, l_r are the front and rear axle distances from the center of gravity, and μ_x, μ_y are the longitudinal and lateral tire-road friction coefficients.
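As an illustration, one Euler step of the single-track model above can be sketched as follows (the parameter values m, J_z, l_f, l_r and T are illustrative placeholders, not the CarSim vehicle's):

```python
import numpy as np

def single_track_euler_step(vx, vy, wz, Fyf, Fyr, Fxf, Fxr, Mz, delta,
                            T=0.001, m=1500.0, Jz=2500.0, lf=1.2, lr=1.4):
    """One Euler step of the discrete single-track model (30).
    vx, vy, wz : longitudinal/lateral/yaw velocities at step k
    Fyf, Fyr, Fxf, Fxr : lateral and longitudinal tire forces
    Mz : rear torque vectoring moment; delta : overall steering angle."""
    vx_next = vx + T * (vy * wz + (Fxf * np.cos(delta) - Fyf * np.sin(delta) + Fxr) / m)
    vy_next = vy + T * (-vx * wz + (Fyf * np.cos(delta) + Fxf * np.sin(delta) + Fyr) / m)
    wz_next = wz + T * (lf * (Fyf * np.cos(delta) + Fxf * np.sin(delta)) - lr * Fyr + Mz) / Jz
    return vx_next, vy_next, wz_next
```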

The Control Problem
As already commented, the use of the AFS and the RTV allows us to track given references for the lateral velocity v_{y,k,ref} and the yaw rate ω_{z,k,ref}. The control problem can then be defined as follows: given bounded references v_{y,k,ref} and ω_{z,k,ref}, with bounded increments, determine a controller (δ_{c,k}, M_{z,k}) ensuring bounded tracking errors. Moreover, when applying control strategies for vehicle stability, not all the state measurements are available from the vehicle; hence, in order to avoid an extensive use of sensors, we present a discrete-time reduced-order state observer for the reconstruction of the vehicle lateral velocity ṽ_{y,k}.
Making reference to the control scheme in Fig. 2, the tracking errors e_{v_y,k}, e_{ω_z,k} can be bounded by the triangle inequality, e.g. |e_{v_y,k}| ≤ |v_{y,k} - ṽ_{y,k}| + |ṽ_{y,k} - v̂_{y,k}| + |v̂_{y,k} - v_{y,k,ref}|. Thus, the trajectory tracking problem of the desired trajectories can be split into three requirements: (i) |v_{y,k} - ṽ_{y,k}| ≤ ε_{e_1}; (ii) |ṽ_{y,k} - v̂_{y,k}| ≤ ε_{e_2}; (iii) |v̂_{y,k} - v_{y,k,ref}| ≤ ε_{e_3}; with ε_{e_1}, ε_{e_2}, ε_{e_3} > 0 fixed bounds for the norms of the corresponding errors. The asymptotic stability of the estimation error stated in the first condition is ensured by the use of the reduced-order state observer presented in Sect. 3.2. The practical stability of the identification error required by the second condition is guaranteed by the RHONN identifier introduced in Sect. 3.4, whereas the reference tracking stability required by the third condition is satisfied by the discrete-time controller discussed in Sect. 3.5, developed with the inverse optimal control technique. Finally, Sect. 3.3 shows how to generate safe references for the vehicle attitude.

Discrete-Time reduced-order state observer
From the mathematical model (30), in order to estimate the lateral velocity v_{y,k}, we present the following reduced-order state observer:

ṽ_{x,k+1} = ṽ_{x,k} + T ( a_{x,k} + ω_{z,k} ṽ_{y,k} ) + k_{o,1} ( v_{x,k} - ṽ_{x,k} ),
ṽ_{y,k+1} = ṽ_{y,k} + T ( a_{y,k} - ω_{z,k} ṽ_{x,k} ) + k_{o,2} ( v_{x,k} - ṽ_{x,k} ),    (33)

where a_{x,k}, a_{y,k} are the vehicle longitudinal and lateral accelerations, supposed to be measured.
For the observer (33), let us state the following: Theorem 3.1 The reduced-order state observer (33), with gains k_{o,1} and k_{o,2} chosen according to (34), for ρ_1, ρ_2 > 0 such that the discriminant in (34) is greater than or equal to zero, ensures the asymptotic stability to the origin of the estimation errors ẽ_{v_x,k} = v_{x,k} - ṽ_{x,k} and ẽ_{v_y,k} = v_{y,k} - ṽ_{y,k}. The proof is given in "Appendix A".
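A minimal numerical sketch of such a reduced-order observer follows (our reconstruction from the kinematic relations v̇_x = a_x + ω_z v_y and v̇_y = a_y - ω_z v_x; only v_x is assumed measured, and the gain values below are illustrative, not those prescribed by (34)):

```python
def observer_step(vx_hat, vy_hat, vx_meas, ax, ay, wz, ko1, ko2, T=0.01):
    """One step of the reduced-order observer: both estimates are corrected
    through the measurable longitudinal-velocity error v_x - vx_hat."""
    e = vx_meas - vx_hat
    vx_new = vx_hat + T * (ax + wz * vy_hat) + ko1 * e
    vy_new = vy_hat + T * (ay - wz * vx_hat) + ko2 * e
    return vx_new, vy_new
```

Simulating a constant-yaw-rate motion with zero body accelerations, the estimation errors contract toward zero for suitable gains.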

The Reference Signals
The references v_{y,k,ref}, ω_{z,k,ref} represent what the driver expects from the vehicle. Concerning v_{x,k}, in this paper one assumes that the slips λ_{f,k}, λ_{r,k} are set to zero, so that no longitudinal acceleration/deceleration is imposed. Various expressions can be found in the literature as reference generators. In particular, we consider (without loss of generality) the references given in [11,12,47], describing the behavior of an "ideal" or "reference" vehicle. This ideal vehicle is not controlled by the AFS and/or the RTV, and receives as input only the driver's steering signal. The reference lateral forces F_{y,f,ref}, F_{y,r,ref} depend on the reference slip angles and are multiplied by the reference lateral tire-road friction coefficient μ_{y,ref}. These forces should be chosen so as to impose the desired behavior on the vehicle. Various expressions can be used for these forces [48,49]; for the sake of simplicity, we use Pacejka's Magic Formula [19]:

F_{y,j,ref} = D_{y,j,ref} sin( C_{y,j,ref} arctan( B_{y,j,ref} α_{j,ref} ) ),  j ∈ {f, r},

where D_{y,j,ref}, C_{y,j,ref} and B_{y,j,ref} are fixed constant values chosen by the designer. In particular, F_{y,r,ref} is chosen to be non-decreasing with the slip angle α_{r,ref}. This ensures that the "reference vehicle" cannot generate tail-spins.
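A minimal sketch of the reference lateral force generator based on the Magic Formula follows (B, C, D play the role of B_{y,j,ref}, C_{y,j,ref}, D_{y,j,ref}; the values in the test are illustrative):

```python
import numpy as np

def pacejka_lateral_ref(alpha, B, C, D):
    """Reference lateral force F = D * sin(C * arctan(B * alpha)).
    Since C * arctan(B * alpha) stays in (-C*pi/2, C*pi/2), choosing
    C <= 1 keeps the argument of sin in its monotone range, so the
    characteristic is non-decreasing in alpha, as required for the
    rear reference force."""
    return D * np.sin(C * np.arctan(B * alpha))
```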

The CarSim ® Neural Identification and the Inverse Optimal Control for References Tracking
The RHONN identifier (10) takes measurements from CarSim® only for the yaw rate ω_{z,k}, whereas the identification of the longitudinal and lateral velocities uses the reconstructions ṽ_{x,k}, ṽ_{y,k} given by the observer (33). The proposed neural model is given in (39), where v̂_{x,k}, v̂_{y,k}, ω̂_{z,k} are the neural identifications of ṽ_{x,k}, ṽ_{y,k}, ω_{z,k}, and β̂_k represents the vehicle slip angle, calculated as β̂_k = arctan( v̂_{y,k} / v̂_{x,k} ). Moreover, w•_{23}, w•_{35} and w•_{36} are constants tuned by the control designer.
In (39), the AFS and RTV inputs δ_{c,k}, M_{z,k} appear. It is interesting to note that δ_{c,k} appears linearly in the model, and not implicitly in the front lateral force as in the single-track discrete-time models.
It is important to note that the neural model able to identify the process is not unique. Model (39) has shown good robustness with respect to CarSim® measurements, including noise and perturbations, and good performance when tracking the reference signals.
The stability of the identification errors (40), as well as the stability of the synaptic weight errors (41), is discussed in the following theorem: the RHONN identifier (39), trained by the EKF algorithm (13)-(18) to identify the lateral vehicle velocity ṽ_{y,k} from the reduced-order observer (33) and to identify the vehicle longitudinal velocity v_{x,k} and yaw rate ω_{z,k} from CarSim®, ensures that the identification errors (40) are SGUUB and that the weight estimation errors (41) remain bounded. The proof is given in "Appendix B".

The Inverse Optimal Control Law
It is now possible to introduce the inverse optimal control law forcing the ground vehicle to follow the desired references. As commented before, the control inputs used for this task are the active front steering δ_{c,k} (AFS) and the rear torque vectoring M_{z,k} (RTV), for the tracking of the lateral velocity v_{y,k,ref} and yaw rate ω_{z,k,ref} references. No control strategy is presented for the longitudinal velocity v_{x,k}, since the latter is a bounded signal, as explained in [11,12].
Based on the structure given in (29), the control law is expressed in matrix form as in (42). Notice that from (29) it is here considered g(x̂_k) = g constant, ensuring controllability of the system. Now, along the same lines as Theorem 4.7 of [18], we can state the following: if there exists a matrix P = P^T > 0 satisfying condition (47), then the control law (42), based on the neural identifier (39), ensures global asymptotic convergence to zero of the tracking error. Moreover, this control law is inverse optimal, i.e. it minimizes the cost functional J(ξ_k) = V(ξ_k) given by (28). It is worth stressing that, in general, there are no analytical conditions allowing one to know a priori whether (47) is feasible. However, it is possible to proceed using heuristic methods, such as the nature-inspired optimization process known as particle swarm optimization (PSO) [50,51] used in this work, to find a positive definite symmetric matrix P verifying (47). Moreover, the use of the PSO algorithm also allows reaching better performance in terms of tracking error optimization, since it compares the results obtained for all matrices P satisfying (47) and returns the best minimization.

Simulation Results
To emphasize the control performance and better test the controller, instead of a mathematical model of the plant, the CarSim® extended model simulator is used, which is able to reproduce very closely the physical behavior of a ground vehicle.
The behavior of the proposed nonlinear inverse optimal controller is shown for an interesting case, in which the vehicle performs an ATI 90-90 maneuver, described in the standard ISO/TS 16949. The vehicle moves with an open-loop throttle valve. The driver steering wheel angle δ_{sw,d,k}, related by a 16:1 ratio to the steering angle δ_{d,k}, is shown in Fig. 3a, in which superimposed random noise is also considered.
A further source of difficulty is taken into account by considering an abrupt change of the tire-road friction coefficient, where μ_d = 0.9 and μ_w = 0.5 correspond to dry and wet surfaces (see Fig. 3b). Figure 3c, d show the longitudinal and lateral accelerations, respectively.
Figure 4 shows the importance of being able to rely on an active controller for vehicle stability improvement. In red, the vehicle is shown with the controller disabled (open-loop system); in yellow, with the controller enabled (closed-loop system). Note that the controlled vehicle maintains a safer driving condition, while the uncontrolled vehicle exhibits strong drifting due to adhesion loss. In order to test the quality and robustness of the observer (33), a variation of the initial conditions was used, selecting v_{x,k,0} = 28 m/s and v_{y,k,0} = 0.005 m/s. Results are listed in Fig. 5, where the discrete gains k_{o,1}, k_{o,2} and the stability parameter κ, ensuring the convergence of the observed dynamics to the real vehicle model, are presented.
Performance and stability of the discrete-time state observer depend on the sample time. To this aim, sample time variations are introduced and three basic indicators, the integral square error (ISE), the integral time square error (ITSE) and the integral absolute error (IAE), are discussed, as presented in Table 1, and calculated in discrete time as

ISE = Σ_k e_k² T,  ITSE = Σ_k (kT) e_k² T,  IAE = Σ_k |e_k| T.

Notice that for T = 0.0001 and T = 0.001 the observer is robust with respect to the sample time variation, while the discriminant in Eq. (34) remains greater than zero. Quality and performance of both the identifier and the observer are discussed in Fig. 6. In particular, Fig. 6a-c show the identification errors of the longitudinal velocity ê_{v_x,k}, lateral velocity ê_{v_y,k} and yaw rate ê_{ω_z,k}, respectively, whereas in Fig. 6d, e the observation errors of the longitudinal (ẽ_{v_x,k}) and lateral (ẽ_{v_y,k}) velocities are presented. Moreover, the longitudinal velocity v_{x,k} in the closed-loop system, and the control efforts in terms of active front steering (δ_{c,k}) and rear torque vectoring (M_{z,k}), can be appreciated in Fig. 8.
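The three indices can be computed from a sampled error sequence as follows (our discretization of the standard continuous-time definitions, with sampling period T):

```python
import numpy as np

def error_indices(e, T):
    """Discrete approximations of the performance indices:
    ISE  = sum_k e_k^2 * T
    ITSE = sum_k (k*T) * e_k^2 * T
    IAE  = sum_k |e_k| * T"""
    e = np.asarray(e, dtype=float)
    k = np.arange(len(e))
    ise = np.sum(e ** 2) * T
    itse = np.sum(k * T * e ** 2) * T
    iae = np.sum(np.abs(e)) * T
    return ise, itse, iae
```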
To test the quality and performance of the inverse optimal controller, a fair comparison between optimal and non-optimal methods is proposed, in order to verify the advantages of this control approach. The comparison is said to be 'fair' because the neural identifier (39) is utilized in both cases, highlighting exclusively the contribution of the controllers.
The non-optimal control law is designed as in (50), acting on the tracking errors e_{v_y,k}, e_{ω_z,k}, where u_{k,1} represents the AFS and u_{k,2} the RTV.
Results obtained applying the non-optimal control law (50) are shown in Fig. 10. Validation of the optimal controller is also carried out numerically in terms of the power consumption of the actuators for both the optimal and non-optimal control techniques. Notice that both methods provide good reference tracking, as shown in Fig. 10a-d. However, the inverse optimal control presented in this work provides better tracking performance while demanding less power. The obtained numerical results are presented in Table 3.
Finally, the P matrix in Theorem 3, with P > 0 and P = P^T, has been calculated making use of the nature-inspired optimization process named particle swarm optimization (PSO), in order to find its optimal value, as explained in [50,51]. The logic behind this algorithm is shown in Fig. 11. The first step of the computation is the parameter initialization: number of variables to be optimized, number of swarm particles, number of iterations, and cognitive and social weighting factors. The second step consists in generating random numeric values to be tested in simulation. The random values are selected verifying the constraint condition that, for this application, is given by the positive definiteness of the P matrix in the Lyapunov candidate function (46). Next, the random values are tested in simulation and the obtained performances are written to output files. Results are then analyzed in terms of statistical criteria: for this application, the mean square error of the concatenation of the tracking errors for the lateral velocity and yaw rate (e_k) is considered. If the performance obtained during the current iteration is better than in previous iterations, the optimal combination is updated; otherwise, the algorithm executes a new iteration by modifying the random values, until the number of iterations is reached. As a reference, the results of the PSO obtained during the last execution are listed in Table 4. It is worth noting that the minimum stationary point reached by the algorithm may not be a global one.
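A minimal sketch of such a PSO loop for tuning P follows (our illustration of the procedure in Fig. 11, not the authors' implementation; the Cholesky-like parametrization P = L L^T enforces the positive definiteness constraint by construction, and all hyperparameter values are illustrative):

```python
import numpy as np

def pso_find_P(cost, n=2, particles=20, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimization over symmetric positive
    definite matrices. Each particle encodes the entries of a lower
    triangular L; P = L L^T with positive diagonal is then P > 0."""
    rng = np.random.default_rng(seed)
    dim = n * (n + 1) // 2
    x = rng.uniform(0.1, 2.0, (particles, dim))   # particle positions
    v = np.zeros_like(x)                          # particle velocities

    def to_P(theta):
        L = np.zeros((n, n))
        L[np.tril_indices(n)] = theta
        L[np.diag_indices(n)] = np.abs(L[np.diag_indices(n)]) + 1e-6
        return L @ L.T

    pbest = x.copy()
    pbest_cost = np.array([cost(to_P(t)) for t in x])
    g = pbest[np.argmin(pbest_cost)].copy()       # global best
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        c = np.array([cost(to_P(t)) for t in x])
        better = c < pbest_cost
        pbest[better], pbest_cost[better] = x[better], c[better]
        g = pbest[np.argmin(pbest_cost)].copy()
    return to_P(g)
```

In the paper's setting the cost would be the simulated mean square tracking error; any scalar cost over P can be plugged in.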

Conclusions
This paper proposes a nonlinear discrete-time inverse optimal controller based on a RHONN identifier trained by the extended Kalman filter, in which the vehicle lateral velocity is reconstructed by a discrete-time reduced-order state observer, for the active control of a ground vehicle, to ensure safe driving conditions in the case of adhesion loss and hazardous maneuvers. According to the test maneuvers simulated in CarSim®, the proposed approach shows a proper identification of the vehicle dynamics in terms of identification errors, and a good performance of the controller in terms of reference tracking errors, even in the presence of parameter uncertainties, measurement noise and unmodelled dynamics. A nature-inspired optimization algorithm known as particle swarm optimization (PSO) is utilized for the control gain settings. The main contributions of this work concern control aspects: with this approach, one can avoid the hard task of inverting Pacejka's lateral tire equation, and asymptotic stability of the tracking errors can be ensured without knowing the Pacejka coefficients, which are difficult to estimate besides being time-varying. Furthermore, a fair numerical comparison between the proposed control scheme and a non-optimal strategy shows that the former provides the same control performance in terms of tracking errors, while minimizing the control power consumption. Future works involve electric power-trains, in order to generate torque vectoring based on electric motor torques and angular velocities, as well as active differential systems.


Appendix A. Proof of Theorem 1

The product between the errors of the longitudinal and lateral velocities is eliminated utilizing k_{o,2}, whereas the sign of the squared error of the lateral velocity is ensured to be negative by imposing κ as in (52); one then obtains ΔV_{o,k} < 0, ensuring the asymptotic stability of the origin of the estimation error. The behavior of the observer gains k_{o,1}, k_{o,2}, the parameter κ in (52) and the sign of the discriminant in (34) are discussed in Sect. 4.

Appendix B. Proof of Theorem 2
Let us consider the following Lyapunov candidate function:

V_{i,k} = w̃_{i,k}^T P_{i,k} w̃_{i,k} + ê²_{x_i,k},

whose first increment is ΔV_{i,k} = V_{i,k+1} - V_{i,k}. Using the inequality

-X^T Y - Y^T X ≤ X^T Λ X + Y^T Λ^{-1} Y,

considered valid for all X, Y ∈ R^n and Λ ∈ R^{n×n} with Λ = Λ^T > 0, the increment (60) can be bounded as

ΔV_{i,k} ≤ -w̃_{i,k}^T E_{i,k} w̃_{i,k} - F_{i,k} ê²_{x_i,k} + ...

Selecting η_i, Q_{i,k}, R_{i,k} such that E_{i,k}, F_{i,k} > 0, ∀k, one gets ΔV_{i,k} < 0 whenever the errors w̃_{i,k}, ê_{x_i,k} lie outside a bounded residual set, which establishes the SGUUB property.

Fig. 2
Fig. 2 Control scheme and bicycle model

Figure 7 compares the vehicle behavior in the open-loop and closed-loop systems. Notice how, in Fig. 7a, b (the open-loop case), the lateral velocity v_{y,k} and the yaw rate ω_{z,k} do not track the safer references v_{y,k,ref}, ω_{z,k,ref}. Instead, in Fig. 7c, d (the closed-loop case), the tracking of the references v_{y,k,ref} and ω_{z,k,ref} performs as expected.

Table 2
Parameters used in the control scheme

Table 4
Obtained results of the PSO algorithm in the last execution

In the case of vehicles with a high center of gravity, the introduction of the roll dynamics in both the observer and the neural identifier can improve the control performance as well. Finally, several comparisons with adaptive dynamic programming (ADP), to solve the HJB equations, will be studied.

The variation of the Lyapunov candidate function of Appendix A is defined as:

ΔV_{o,k} = V_{o,k+1} - V_{o,k}
= [ẽ_{v_x,k} + T ω_{z,k} ẽ_{v_y,k} - k_{o,1} ẽ_{v_x,k}]² - ẽ²_{v_x,k} + [ẽ_{v_y,k} - T ω_{z,k} ẽ_{v_x,k} - k_{o,2} ẽ_{v_x,k}]² - ẽ²_{v_y,k} + κ S_{ω_z,k} ẽ_{v_x,k} ẽ_{v_y,k} - κ S_{ω_z,k} [ẽ_{v_x,k} + T ω_{z,k} ẽ_{v_y,k} - k_{o,1} ẽ_{v_x,k}] [ẽ_{v_y,k} - T ω_{z,k} ẽ_{v_x,k} - k_{o,2} ẽ_{v_x,k}]    (53)

obtaining:

ΔV_{o,k} = [ k²_{o,1} - 2k_{o,1} + T² ω²_{z,k} + k²_{o,2} + 2T ω_{z,k} k_{o,2} + κ S_{ω_z,k} k_{o,2} + κ T |ω_{z,k}| + κ T |ω_{z,k}| k_{o,2} - κ sign(ω_{z,k}) k_{o,1} k_{o,2} ] ẽ²_{v_x,k} + ...

In Appendix B, the increment of the Lyapunov candidate function is

ΔV_{i,k} = w̃^T_{i,k+1} P_{i,k+1} w̃_{i,k+1} + ê²_{x_i,k+1} - w̃^T_{i,k} P_{i,k} w̃_{i,k} - ê²_{x_i,k}
= [ w̃_{i,k} - η_{i,k} K_{i,k} e_{x_i,k} ]^T [ P_{i,k} - U_{i,k} ] [ w̃_{i,k} - η_{i,k} K_{i,k} e_{x_i,k} ] + ...,

with U_{i,k} = K_{i,k} H^T_{i,k} P_{i,k} + Q_{i,k}; then (59) can be expressed as:

ΔV_{i,k} = w̃^T_{i,k} P_{i,k} w̃_{i,k} - η_i ê_{x_i,k} K^T_{i,k} P_{i,k} w̃_{i,k} - w̃^T_{i,k} U_{i,k} w̃_{i,k} + η_i ê_{x_i,k} K^T_{i,k} U_{i,k} w̃_{i,k} - η_i ê_{x_i,k} w̃^T_{i,k} P_{i,k} K_{i,k} + η²_i ...