1 Introduction

Realistic mathematical driver steering control models are useful tools for developing driver assistance systems such as stability control and lane-keeping assistance. Driver models allow the vehicle design space to be explored safely and at low cost.

State of the art driver models typically use optimisation-based predictive control algorithms. These require a prediction model, which in a human context is known as the internal model, and represents the human understanding of the vehicle dynamics. The internal model is often assumed to be an accurate deterministic representation of the vehicle dynamics. However, it seems likely that in many cases a driver has an inaccurate and uncertain understanding of the vehicle dynamics, particularly in unfamiliar nonlinear regimes of operation.

Recently, a new approach has been taken by the authors, using Model Predictive Control (MPC) with a Gaussian Process (GP) providing a data-driven internal prediction model of the vehicle dynamics [4]. The motivation for using a GP is that it represents more closely the experience-based learning process and prediction uncertainty of the human driver. This approach was inspired by the ‘predictive processing’ hypothesis which has gained traction in several academic communities [2].

An example of a partially learnt GP model of nonlinear lateral-yaw vehicle dynamics is shown in Fig. 1, predicting the mean and variance of the vehicle lateral velocity state at the next time step given current vehicle states (here all zero) and steering angle input (on x-axis). The learning behaviour of the driver model is apparent in Fig. 2, which compares the RMS path error with RMS handwheel velocity (akin to control effort) over the course of twelve successive avoidance manoeuvres (elk or moose test).
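The behaviour shown in Fig. 1 can be illustrated with a minimal GP regression sketch. The kernel, training data and vehicle response below are illustrative stand-ins (not the model or data of [4]): next-step lateral velocity is predicted from steering angle alone, with the other states held at zero, yielding a mean prediction and a 95% confidence interval.

```python
import numpy as np

# Minimal GP regression sketch (hypothetical data and kernel parameters):
# predict mean and variance of the next-step lateral velocity from the
# current steering angle, with all other states held at zero, as in Fig. 1.

def rbf(a, b, length=0.1, var=1.0):
    # Squared-exponential kernel on steering-angle inputs
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

rng = np.random.default_rng(0)
delta_train = rng.uniform(-0.3, 0.3, 12)        # observed steering angles (rad)
v_next = 2.0 * np.tanh(5.0 * delta_train)       # stand-in nonlinear lateral response
v_next += 0.05 * rng.standard_normal(12)        # additive memory data noise

noise_var = 0.05 ** 2
K = rbf(delta_train, delta_train) + noise_var * np.eye(12)
delta_test = np.linspace(-0.4, 0.4, 9)
Ks = rbf(delta_test, delta_train)

alpha = np.linalg.solve(K, v_next)
mean = Ks @ alpha                                # GP mean prediction
cov = rbf(delta_test, delta_test) - Ks @ np.linalg.solve(K, Ks.T)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))  # predictive uncertainty

# 95% confidence interval as shaded in Fig. 1
lower, upper = mean - 1.96 * std, mean + 1.96 * std
```

The predictive variance grows away from the training data, which is what the variance penalty in the MPC cost function later exploits.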

Fig. 1.

Part-learnt GP model of nonlinear lateral-yaw vehicle dynamics [4]. Solid blue line is the mean prediction, light blue shaded area is the 95% confidence interval.

Fig. 2.

RMS path error against RMS handwheel velocity for a simulated driver performing twelve successive avoidance manoeuvres. The smallest circle is the first manoeuvre and the largest is the twelfth. The dashed line is a diagonal regression [4].

It is reported in [4] that the simulated learning behaviour seen in Fig. 2 is similar to that measured in some of the human test subjects that performed the manoeuvres in a real vehicle on a test track. The behaviour can be described as ‘cautious’, where control activity increases and path error decreases with successive manoeuvres. Some other drivers in the experiment displayed ‘adventurous’ behaviour, characterised by control activity and path error decreasing with successive manoeuvres.

The MPC+GP driver model architecture described in [4] is shown in Fig. 3, with additional elements introduced in the present work shown in orange and grey to be described in the next sections. The model includes additive memory data noise but no sensory measurement noise. The GP internal model generates predictions. Control actions are optimised to minimise a cost function that penalises predicted lateral and yaw deviations from a previewed target path, together with other penalties on GP internal model prediction variance and the first and second derivatives of steering angle with respect to time. Cautious and adventurous steering behaviours are generated by adjusting the penalty on GP internal model prediction variance [4].
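The cost function described above can be sketched as follows. The weights, horizon length and time step are illustrative assumptions, not the values used in [4]; the sketch only shows how the penalty terms combine.

```python
import numpy as np

# Hedged sketch of the MPC cost described above (weights are illustrative):
# penalties on previewed lateral and yaw deviations from the target path,
# on GP internal model prediction variance, and on the first and second
# time derivatives of the steering angle.

def mpc_cost(lat_err, yaw_err, gp_var, delta, dt=0.05,
             q_lat=1.0, q_yaw=0.5, q_var=0.2, q_d1=0.1, q_d2=0.05):
    d1 = np.diff(delta) / dt             # handwheel velocity
    d2 = np.diff(delta, n=2) / dt ** 2   # handwheel acceleration
    return (q_lat * np.sum(lat_err ** 2)
            + q_yaw * np.sum(yaw_err ** 2)
            + q_var * np.sum(gp_var)     # penalise uncertain predictions
            + q_d1 * np.sum(d1 ** 2)
            + q_d2 * np.sum(d2 ** 2))
```

Raising `q_var` makes the optimiser avoid regions where the GP internal model is uncertain, generating the 'cautious' behaviour; lowering it permits 'adventurous' steering.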

The aim of the work in this paper is to improve the MPC+GP driver steering model by adding sensory noise and state estimation. Based on experimental observations in the field of computational neuroscience, it is believed that human sensory measurement noise is signal-dependent, where the noise magnitude is proportional to the signal being measured [7]. In human control, there is evidence that state estimation is performed by the brain in a probabilistically optimal, Bayesian manner [3, 5], using internal model predictions to improve the accuracy of the state determination and mitigate the effects of sensory noise. In the present paper the existing MPC+GP driver model is extended to include realistic sensorimotor noise and a state estimator that uses the GP for the state prediction step. The performance of the model is then investigated.

Fig. 3.

Driver model architecture from [4] with new elements shown in orange and elements used solely for analysis shown in grey.

2 Noise Model and State Estimation

Signal-dependent noise sources are added to represent measurement noise and control (or process) noise, as shown in orange in Fig. 3. The noise is modelled as shown in Fig. 4, with constant signal to noise ratio (SNR) and a noise floor to represent sensory perception threshold.

With reference to the variables in Fig. 3, measurement variance is diagonal, with \({(\Sigma _m)}_{k,k} = \text {max}({(\boldsymbol{z}_{i,j})}_{k}^{2} {(\boldsymbol{s}_m)}_{k}^{-1}, \, {(\boldsymbol{n}_m)}_{k})\), and control variance, \(\boldsymbol{\sigma }_c = \text {max}({\delta }_{i}^{2} {s}_c^{-1}, \, n_c)\). Here \(\boldsymbol{z}_{i,j}\) is the state vector at the \(i^{\text {th}}\) timestep of the \(j^{\text {th}}\) manoeuvre, \(\boldsymbol{s}_m\) is the measurement SNR for each state dimension, \(\boldsymbol{n}_m\) is the measurement noise floor, \(\delta _{i}\) is the planned control action for the \(i^{\text {th}}\) timestep, and \(s_c\) and \(n_c\) are the control SNR and noise floor.
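A minimal sketch of this signal-dependent noise model: the per-channel variance is the signal power divided by the SNR, clipped below by a noise floor representing the sensory perception threshold. The SNR of 1.6 is taken from the text; the state values and floor are illustrative.

```python
import numpy as np

# Sketch of the signal-dependent measurement noise of Fig. 4:
# variance = max(signal_power / SNR, noise_floor).

def measurement_variance(z, snr=1.6, floor=1e-3):
    # Per-channel variance at constant SNR with a perception-threshold floor
    return np.maximum(z ** 2 / snr, floor)

z = np.array([0.0, 0.5, 2.0])    # illustrative state channel values
var = measurement_variance(z)
noisy = z + np.sqrt(var) * np.random.default_rng(1).standard_normal(3)
```

Small signals are dominated by the floor (perception threshold), while large signals carry noise proportional to their power, keeping the SNR constant.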

Fig. 4.

Measurement variance as a function of signal power for the approximation used [6].

Physiologically plausible values for SNRs and noise floors can be ascertained from the literature on sensory thresholds and just noticeable differences (JNDs), for example [3, 6, 7]. SNRs are typically in the region of unity; a value of 1.6 is used for the results in the present paper.

The State Estimation block (Fig. 3) is implemented as a Kalman Filter. The GP Internal Model predicts the next state of the vehicle given the current state estimate and control input. The Kalman Filter then combines this prediction, the noisy measurement and the variances of each to update the state estimate probabilistically. This state estimate is used as the believed current state from which future states are predicted when optimising the control action. The equations for the Kalman Filter prediction and update are implemented as follows, based on [1]:

$$\begin{aligned} \text {Prediction Step:} \quad & \hat{\boldsymbol{x}}_{k|k-1} = \text {GP Prediction}\, (\hat{\boldsymbol{x}}_{k-1|k-1} , u_{k-1}, P_{k-1|k-1}) \\ & P_{k|k-1} = \text {GP Variance}\, (\hat{\boldsymbol{x}}_{k-1|k-1} , u_{k-1}, P_{k-1|k-1}) \\ \text {Update Step:} \quad & K_k = P_{k|k-1} H^T (H P_{k|k-1} H^T + R_k)^{-1} \\ & \hat{\boldsymbol{x}}_{k|k} = \hat{\boldsymbol{x}}_{k|k-1} + K_k (\boldsymbol{y}_k - H \hat{\boldsymbol{x}}_{k|k-1}) \\ & P_{k|k} = (I - K_k H) P_{k|k-1} \end{aligned}$$

where \(\hat{\boldsymbol{x}}_{k|k-1}\) and \(P_{k|k-1}\) are the GP model predicted current state mean and covariance, \(\hat{\boldsymbol{x}}_{k-1|k-1}\) and \(P_{k-1|k-1}\) are the previous state estimate mean and covariance, and \(u_{k-1}\) is the previous control input. \(H\) is the observation matrix, \(R_k\) is the measurement noise covariance and \(\boldsymbol{y}_k\) is the noisy measurement.
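The update step above can be illustrated for a single scalar state with full observation (\(H = 1\)). Here `gp_mean` and `gp_var` stand in for the GP internal model's one-step prediction; the numerical values are purely illustrative.

```python
# Scalar Kalman update sketch (H = 1): the gain weights the measurement
# by the relative confidence of the GP prediction and the sensory data.

def kf_update(gp_mean, gp_var, y, meas_var):
    K = gp_var / (gp_var + meas_var)        # Kalman gain
    x_post = gp_mean + K * (y - gp_mean)    # corrected state estimate
    P_post = (1.0 - K) * gp_var             # reduced posterior variance
    return x_post, P_post

# A confident GP prediction (small variance) keeps the estimate close to
# the prediction despite a noisy measurement:
x, P = kf_update(gp_mean=1.0, gp_var=0.1, y=2.0, meas_var=0.9)
# x = 1.1, P = 0.09
```

As the GP learns and its predictive variance shrinks, the gain falls and the estimator leans increasingly on the internal model rather than the noisy senses.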

The Kalman filtered state estimates are not used in the memory dataset for training the GP internal model. This is because the Kalman filtering biases the data towards the errors in the learnt model, so using the filtered data significantly reduces the learning rate. Instead, the GP internal model is trained on the raw measured data, and the Kalman Filter is used only for state estimation during the control optimisation process. As the GP internal model improves and the state estimate becomes more accurate, better predictions of future states can be made, leading to better control performance.

3 Simulation Results and Discussion

In this section simulations were performed with the control noise in Fig. 3 set to zero. The measurement noise was set to give an SNR of 1.6. Simulations were initially performed with the Deterministic Internal Model block shown in grey in Fig. 3 switched in and data noise set to zero. This enabled the effect of GP learning on state estimation to be investigated separately from the GP’s effect on control optimisation. The GP was initialised with fifty data points randomly distributed within the vehicle’s operating envelope. The driver steering model was then run to perform twelve successive elk avoidance manoeuvres, with the memory dataset and the GP updated after each of the twelve manoeuvres. For the purpose of determining statistically reliable results, the sequence of twelve manoeuvres was repeated one hundred times, each time with different uncorrelated noise signals.

The performance of the state estimator in each of the twelve manoeuvres was quantified by calculating the ratio of the variance of the true state to the variance of the state estimation error. This ratio is also denoted an SNR. However, to avoid the calculated SNR being dependent on the performance of the controller in each manoeuvre, the vehicle states during each manoeuvre were not used in calculating the estimator's SNR. Instead, a set of 2000 independent randomly generated starting states with measurement noise of SNR = 1.6 was used to evaluate the performance of the state estimator, using the GP internal model from each of the twelve manoeuvres. The SNR of the estimator was also calculated for multiple timesteps beyond the starting state.
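The estimator SNR metric can be sketched as follows on synthetic data (these are not the results of [4]): the prediction error variance is an assumed stand-in for a partly learnt GP, and the estimate is the variance-weighted fusion of prediction and measurement.

```python
import numpy as np

# Sketch of the estimator SNR metric: variance of the true state divided by
# variance of the estimation error, over a batch of random starting states.

rng = np.random.default_rng(2)
x_true = rng.standard_normal(2000)             # 2000 random starting states
meas_var = 1.0 / 1.6                           # measurement SNR = 1.6
pred_var = 0.09                                # assumed GP prediction error variance

meas = x_true + np.sqrt(meas_var) * rng.standard_normal(2000)
pred = x_true + np.sqrt(pred_var) * rng.standard_normal(2000)

K = pred_var / (pred_var + meas_var)           # scalar Kalman gain
est = pred + K * (meas - pred)                 # fused state estimate

snr_meas = np.var(x_true) / np.var(meas - x_true)   # close to 1.6
snr_est = np.var(x_true) / np.var(est - x_true)     # exceeds the raw measurement SNR
```

The fused estimate's SNR exceeds the raw measurement SNR of 1.6, mirroring the improvement reported for the full estimator in Fig. 5.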

It can be seen in Fig. 5 that the SNR of the estimate of lateral velocity improves over the course of the twelve manoeuvres as the GP internal model learns the vehicle dynamics. The SNR reduces as more timesteps are advanced, converging to an SNR of around 3 for the first manoeuvre and 10 for the final manoeuvre, which is significantly improved from the measurement SNR of 1.6 and confirms the effectiveness of the estimator.

Fig. 5.

Plot of the median SNR of state estimates to true states against manoeuvre number. The SNR is calculated over one hundred repeats of twelve elk test manoeuvres on a test set of 2000 randomly generated starting states. Different lines are plotted for additional timesteps advanced from the starting states.

Fig. 6.

Conflict plot showing RMS path error against RMS handwheel velocity for the MPC+GP driver model with sensory noise and state estimation.

Another simulation was run with the prediction model switch (Fig. 3) set to use the GP internal model for control optimisation as well as for state estimation. Figure 6 shows the RMS path error against RMS handwheel velocity over twelve elk avoidance manoeuvres. The control performance and learning behaviour of the new model are similar to that of the earlier model seen in Fig. 2, and therefore similar to the measured behaviour of human test subjects reported in [4].

The work reported in this paper contributes increased realism of driver models for use in virtual vehicle development. Work is planned to add visual and vestibular sensory dynamics to the driver steering model, and to validate further the model against experiments with human test subjects.

4 Conclusion

  • There is a need for realistic mathematical driver steering control models that represent human learning behaviour.

  • Recent work by the authors has combined MPC with a GP internal model.

  • The present work adds signal-dependent sensorimotor noise and state estimation to the MPC+GP driver model.

  • Simulation results show that the state estimator with a GP internal model exhibits the expected improvement in estimation accuracy with successive manoeuvres.

  • The results also show that control performance and learning behaviour are similar to measured human behaviour reported recently.

  • Further work is planned to extend the model to include visual and vestibular sensory dynamics, and to perform more experimental validation.